From charlesreid1

Intro

Single dimension: solving

Multidimensional: solving

Solving nonlinear equations is hard - solving multidimensional nonlinear equations is really hard. Even though they look the same, the second equation is much, much harder than the first.

Nonlinear sets of equations may have no real solutions at all

Implicit function theorem - generically, solutions will be distinct, pointlike, and separated from one another. However, you may have a non-generic (degenerate) case where the solutions are a continuous family.

In one dimension, you can bracket a solution and narrow in on it. In multiple dimensions, you don't have an analogous approach - you can't know it's a solution until you've found the solution.

Root-finding proceeds by iteration (in one or in multiple dimensions). It is usually possible to determine the rate of converge of an algorithm in advance.

Good behavior of an algorithm depends crucially on a good first guess, especially with multivariate problems.

Hamming's motto: "the purpose of computing is insight, not numbers". Use this to guide your first guess: use insight into the problem, not random/arbitrary guesses!

Before you start, try to see what your function looks like with a graph.

Here are starting points for solving nonlinear equations:

  • Brent's algorithm - method of choice for finding bracketed root of general 1D function, if you can't find the derivative
  • Newton-Raphson algorithm - method for finding bracketed root of 1D function (with bookkeeping) if you know the first derivative
  • Halley's method - finding bracketed root of 1D function if you know the second derivative
  • Roots of polynomials are special case - use Laguerre's method (and keep in mind that polynomials can potentially be ill-conditioned)
  • For multivariate problems, Newton-Raphson method is the only elementary method, and its performance depends on a good initial guess

Bracketing, Bisection

Using the Intermediate Value Theorem - if a function is continuous, and it passes through f(a) and then through f(b) for a < b, there must be a value c such that f(a) <= f(c) <= f(b)