Root-finding algorithm

From Wikipedia, the free encyclopedia

Jump to: navigation, search

A root-finding algorithm is a numerical method, or algorithm, for finding a value x such that f(x) = 0, for a given function f. Such an x is called a root of the function f.

This article is concerned with finding real or complex roots, approximated as floating point numbers. Finding integer roots or exact algebraic roots are separate problems, whose algorithms have little in common with those discussed here. (See: Diophantine equation as for integer roots)

Finding a root of f(x)g(x) = 0 is the same as solving the equation f(x) = g(x). Here, x is called the unknown in the equation. Conversely, any equation can take the canonical form f(x) = 0, so equation solving is the same thing as computing (or finding) a root of a function.

Numerical root-finding methods use iteration, producing a sequence of numbers that hopefully converge towards a limit (the so called "fixed point") which is a root. The first values of this series are initial guesses. The method computes subsequent values based on the old ones and the function f.

The behaviour of root-finding algorithms is studied in numerical analysis. Algorithms perform best when they take advantage of known characteristics of the given function. Thus an algorithm to find isolated real roots of a low-degree polynomial in one variable may bear little resemblance to an algorithm for complex roots of a "black-box" function which is not even known to be differentiable. Questions include ability to separate close roots, robustness in achieving reliable answers despite inevitable numerical errors, and rate of convergence.

Contents

[edit] Specific algorithms

The simplest root-finding algorithm is the bisection method. It works when f is a continuous function and it requires previous knowledge of two initial guesses, a and b, such that f(a) and f(b) have opposite signs. Although it is reliable, it converges slowly, gaining one bit of accuracy with each iteration.

Newton's method assumes the function f to have a continuous derivative. Newton's method may not converge if started too far away from a root. However, when it does converge, it is faster than the bisection method. Convergence is usually quadratic, so the number of bits of accuracy doubles with each iteration. Newton's method is also important because it readily generalizes to higher-dimensional problems. Newton-like methods with higher order of convergence are the Householder's methods. The first one after Newton's method is Halley's method with cubic order of convergence.

Replacing the derivative in Newton's method with a finite difference, we get the secant method. This method does not require the computation (nor the existence) of a derivative, but the price is slower convergence (the order is approximately 1.6).

The false position method, also called the regula falsi method, is like the secant method. However, instead of retaining the last two points, it makes sure to keep one point on either side of the root. The false position method is faster than the bisection method and more robust than the secant method.

The secant method also arises if one approximates the unknown function f by linear interpolation. When quadratic interpolation is used instead, one arrives at Müller's method. It converges faster than the secant method. A particular feature of this method is that the iterates xn may become complex.

This can be avoided by interpolating the inverse of f, resulting in the inverse quadratic interpolation method. Again, convergence is asymptotically faster than the secant method, but inverse quadratic interpolation often behaves poorly when the iterates are not close to the root.

Finally, Brent's method is a combination of the bisection method, the secant method and inverse quadratic interpolation. At every iteration, Brent's method decides which method out of these three is likely to do best, and proceeds by doing a step according to that method. This gives a robust and fast method, which therefore enjoys considerable popularity.

[edit] Finding roots of polynomials

Much attention has been given to the special case that the function f is a polynomial; there exist root-finding algorithms exploiting the polynomial nature of f. For a univariate polynomial of degree less than five, we have closed form solutions such as the quadratic formula. However, even this degree-two solution should be used with care to ensure numerical stability, the degree-four solution is unwieldy and troublesome, and higher-degree polynomials have no such general solution.

For real roots, Sturm's theorem provides a guide to locating and separating roots. This plus interval arithmetic combined with Newton's method yields a robust algorithm, but other choices are more common.

One possibility is to form the companion matrix of the polynomial. Since the eigenvalues of this matrix coincide with the roots of the polynomial, one can use any eigenvalue algorithm to find the roots of the polynomial. For instance the classical Bernoulli's method to find the root greater in modulus, if it exists, turns out to be the power method applied to the companion matrix.

Laguerre's method is rather complicated, but it converges quickly. It exhibits cubic convergence for simple roots, dominating the quadratic convergence displayed by Newton's method. The Jenkins–Traub method is another complicated method which converges faster than Newton's method.

Bairstow's method uses Newton's method to find quadratic factors of a polynomial with real coefficients. It can determine both real and complex roots of a real polynomial using only real arithmetic.

The simple Durand-Kerner and the slightly more complicated Aberth method simultaneously finds all the roots using only simple complex number arithmetic.

The splitting circle method is useful for finding the roots of polynomials of high degree to arbitrary precision; it has almost optimal complexity in this setting. Another method with this style is the Dandelin-Gräffe method (actually due to Lobachevsky) which factors the polynomial.

Wilkinson's polynomial illustrates that high precision may be necessary when computing the roots of a polynomial.

[edit] Finding multiple roots

If p(x) is a polynomial with a multiple root at r, then finding the value of r can be difficult (inefficient or impossible) for many of the standard root-finding algorithms. Fortunately, there is a technique especially for this case, provided that p is given explicitly as a polynomial in one variable with exact coefficients.

[edit] Algorithm

  1. First, we need to determine whether p(x) has a multiple root. If p(x) has a multiple root at r, then its derivative p′(x) will also have a root at r (one fewer than p(x) has there). So we calculate the greatest common divisor of the polynomials p(x) and p′(x); adjust the leading coefficient to be one and call it g(x). (See Sturm's theorem.) If g(x) = 1, then p(x) has no multiple roots and we can safely use those other root-finding algorithms which work best when there are no multiple roots, and then exit.
  2. Now suppose that there is a multiple root. Notice that g(x) will have a root of the same multiplicity at r that p′(x) has and the degree of the polynomial g(x) will generally be much less than that of p(x). Recursively call this routine, i.e. go back to step #1 above, using g(x) in place of p(x). Now suppose that we have found the roots of g(x), i.e. we have factored it.
  3. Since r has been found, we can factor (xr) out of p(x) repeatedly until we have removed all of the roots at r. Repeat this for any other multiple roots until there are no more multiple roots. Then the quotient, i.e. the remaining part of p(x), can be factored in the usual way with one of the other root-finding algorithms. Exit.

[edit] Example

Suppose p(x) = x3+x2−5x+3 is the function whose roots we want to find. We calculate p′(x) = 3x2+2x−5. Now divide p′(x) into p(x) to get p(x) = p′(x)·((1/3)x+(1/9))+((−32/9)x+(32/9)). Divide the remainder by −32/9 to get x−1 which is monic. Divide x−1 into p′(x) to get p′(x) = (x−1)·(3x+5)+0. Since the remainder is zero, g(x) = x−1. So the multiple root of p(x) is r = 1. Dividing p(x) by (x−1)2, we get p(x) = (x+3)(x−1)2 so the other root is −3, a single root.

[edit] Algebraic geometry

The performance of standard polynomial root-finding algorithms degrades in the presence of multiple roots. Using ideas from algebraic geometry, Zhonggang Zeng has published a more sophisticated approach, with a MATLAB package implementation, that identifies and handles multiplicity structures considerably better. A different algebraic approach including symbolic computations has been pursued by Andrew Sommese and colleagues, available as a preprint (PDF).

[edit] Direct algorithm for multiple root elimination

There is a direct method of eliminating multiple (or repeated) roots from polynomials with exact coefficients (integers, rational numbers, Gaussian integers or rational complex numbers).

Suppose a is a root of P(x) with multiplicity m>0. Then a will be a root of the formal derivative P’(x) with multiplicity m-1. (If m=1, then a will be a "root of P’(x) with multiplicity 0". That is, if it is a distinct (non-multiple) root of the polynomial, then it won't be a root of the polynomial's derivative.) However, P’(x) may have additional roots that are not roots of P(x). (For example, if P(x)=(x-1)3(x-3)3, then P’(x)=6(x-1)2(x-2)(x-3)2. So 2 is a root of P’(x) here, but not of P(x).)

Define G(x) to be the greatest common divisor of P(x) and P’(x).

Finally, G(x) divides P(x) exactly, so form the quotient Q(x)=P(x)/G(x).

Now, a is a root of P(x) with multiplicity m>0 if and only if a is a root of Q(x) with multiplicity 1. That is, Q(x) has exactly the roots of P(x), but has no multiple roots.

As P(x) is a polynomial with exact coefficients, then if the algorithm is executed using exact arithmetic, Q(x) will also be a polynomial with exact coefficients.

Q(x) may also be simpler than P(x): degree(Q(x)) ≤ degree(P(x)). Whether or not degree(P(x))≤4, if degree(Q(x))≤4 then the roots may be found algebraically.

It is then possible to determine the multiplicities of those roots in P(x) algebraically.

If degree(Q(x))>4, standard (iterative) root-finding algorithms may be used, and should perform well in the absence of multiple roots.

[edit] See also

[edit] External links

Personal tools