\" groff -te -ms 6_Numerical | lpr .fp 4 C .EQ delim $$ .EN .bp .EH '\fIScientific Computation''Numerical Analysis\fR %' .OH '\fIScientific Computation''Numerical Analysis\fR %' .sp 5 .ps 14 .ce .B 6. Numerical Programming .R .NL .sp 3 .SH Introduction .PP A previous chapter presented programs which are capable of manipulating mathematical statements symbolically to obtain analytic results. Mathematical symbolic programming languages are user-friendly and the results error-free. However there are limitations. Analytic mathematical descriptions of the real world unfortunately do not abound. .PP In fact it is quite easy to think up problems for which there is no analytic solution. We understand an analytic, or "closed" solution here to mean an expression involving algebraic and transcendental terms. Consider solving polynomial expressions in one variable. Solving first degree polynomial equations in one variable is trivial; the solutions to second-degree polynomial equations can involve irrational (square root) and complex numbers; cubic and biquadratic equation solutions are algorithmetic. It has been known since the 17th century that no analytic functions can describe the solution to a polynomial expression having terms with powers greater than 5. (Even the fifth-degree polynomial solution is not usually mentioned since it involves elliptic functions.) Eigenvalue problems involving more than four solutions (e.g. 4 $times$ 4 matrices) therefore do not lend themselves to analytic solution since they effectively involve characteristic polynomials of degree equal to the size of the problem. .PP Another example is the value of common functions. The value of sin(x) for an arbitrary value of x, for example, is not trivial to obtain. Tables, pocket calculators and computer math library functions are deceptive. Also recall the definition of the exponential function in terms of an infinite series. .PP \fBNumerical analysis\fR may be defined as the study of numerical algorithms, that is, algorithms which manipulate numbers, particularly reals. Since many practical mathematical problems do not admit analytic solutions (solutions in terms of simple functions), the only hope is approximate solutions in terms of numerical values. Numerical methods range from estimations of pi to a large numbers of decimal places to solving systems of intractable differential equations. .PP In this chapter we will discuss numerical solutions to linear and non-linear algebraic systems, and integration of differential functions and systems. In the discussion of algorithms it was possible to present some rather efficient solutions to closed problems, but here the algorithms become more intricate. Numerical analysis is a field that is relatively new on the one hand, yet highly developed on the other. It is sufficiently complex to provide fertile ground for numerous commercial and research efforts. Here we can only afford an elementary introduction to some of the useful concepts with illustrative algorithms and implementations, to the point that a user can begin to appreciate the various production packages that might be encountered in scientific applications. For more sophisticated (and realistic) algorithms and their implementations we must refer the interested investigator to the literature. .NH Linear Algebraic Equations .PP Linear algebraic equations contain only constants and variables; i.e. powers of the variables to degree no greater than unity. 
A single linear equation in a single variable can be manipulated into the form $ax~=~b$, where all the factors of the variable have been collected together and the constant terms have likewise been collected together via arithmetic equivalence operations, i.e. operations which preserve the integrity of the relationship between the variable and the constants:
addition, subtraction, multiplication or division of both sides of the equation by the same value (variable or constant), together with the equivalence properties of expressions (association, commutation and distribution under addition and multiplication).
A final manipulation of division by the coefficient of x produces the \fBsolution\fR to the equation, $x~=~b/a$.
.PP
Now consider a "system" (set) of linear algebraic equations, each containing possibly several variables and constants.
Each variable appears in the equations to first power only.
The solution to the system consists of a set of equations relating each variable separately to a value such that all the values of the variables simultaneously satisfy the original set of equations.
A system of linear algebraic equations can be expressed efficiently in matrix notation:
.EQ (6.1)
bold A bold x~=~bold b,
.EN
where $bold x$ is a column array (vector) of n unknown variables, $bold A$ is a square $n times n$ matrix of coefficient parameters, and $bold b$ is a column array of n "target" values.
Note that the variables are in the same order in each equation of the system.
.PP
Formally, the set of $bold x$ values which simultaneously satisfies the system of equations (i.e. the \fIsolution\fR to the linear system) can be expressed as
.EQ (6.2)
bold x~=~bold A sup -1 bold b,
.EN
where $bold A sup -1$ is the matrix "inverse" to $bold A$:
.EQ (6.3)
bold A sup -1 bold A~=~bold I,
.EN
provided the system is \fInon-homogeneous\fR ($bold b~!=~bold 0$) and $bold A$ is \fInonsingular\fR (simpler systems are considered below).
.PP
How should we proceed to solve a realistic problem involving several equations, possibly with irrational coefficients?
One possibility would be to perform repeated substitutions based on solving each equation for one of the unknowns in terms of all the others.
Given an initial "guess" solution vector, $bold x sub 0$, a new set of $bold x$ values can be obtained from
.EQ (6.4)
bold x sub roman i~=~1 over bold a sub roman ii [ bold b sub roman i~-~sum from {~~~roman j != roman i} bold a sub roman ij bold x sub roman j ]
.EN
and repeating the iteration until no further changes occur in $bold x$, i.e., until $bold x$ reaches a \fBfixed point\fR:
.EQ (6.5)
bold x~=~bold f(x),
.EN
where $bold f(x)$ is the right-hand side of Eqn. (6.4).
However, convergence may be slow or nonexistent, depending on the initial guess.
This approach becomes more useful when there is no obvious alternative, as with the non-linear systems discussed below.
(A minimal C sketch of this iteration, for an assumed 2 $times$ 2 system, is given just before Example 6.1.)
.PP
A direct approach uses a method due to Carl Friedrich Gauss, called \fBGaussian elimination\fR.
It is based on the elementary row operations of linear algebraic systems which preserve equivalent solutions: interchange of the positions of two equations in the system, and addition of a constant multiple of one equation to another.
\fIEquivalence\fR is used here to mean that the transformed system has the same solution as the original system.
We will illustrate the procedure with a simple example.
This is useful in the development of a general algorithm as well.
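.PP
First, however, the iterated substitution scheme of Eqs. (6.4) and (6.5) may be sketched in a few lines of C.
The 2 $times$ 2 system, initial guess, tolerance and iteration limit below are illustrative assumptions only; the iteration converges here because the assumed matrix is diagonally dominant, a well-known sufficient condition for convergence.
.sp
.KS
.ps -2
.vs -2
.nf
\f4
.eo
/* Iterated substitution (Eq. 6.4) for a linear system Ax = b.
   A minimal sketch with an assumed 2 x 2 diagonally dominant system. */

#include <stdio.h>
#include <math.h>

#define N 2
#define TOLERANCE 1.e-12
#define MAX_ITERATIONS 100

main()
{
    /* Assumed example system:  4x1 + x2 = 9,  x1 + 3x2 = 7 */
    static double A[N][N] = { {4., 1.}, {1., 3.} };
    static double b[N] = { 9., 7. };
    double x[N] = { 0., 0. };           /* initial guess x0 = 0 */
    double x_new[N], change, sum;
    int i, j, iteration = 0;

    do {
        change = 0;
        for (i = 0; i < N; ++i) {       /* Eq. (6.4): isolate x[i] */
            sum = b[i];
            for (j = 0; j < N; ++j)
                if (j != i)
                    sum -= A[i][j]*x[j];
            x_new[i] = sum/A[i][i];
            if (fabs(x_new[i] - x[i]) > change)
                change = fabs(x_new[i] - x[i]);
        }
        for (i = 0; i < N; ++i)         /* accept the new iterate */
            x[i] = x_new[i];
    } while (++iteration < MAX_ITERATIONS && change > TOLERANCE);

    printf ("After %d iterations: x1 = %g, x2 = %g\n", iteration, x[0], x[1]);
}
.ec
.R
.fi
.ps +2
.vs +2
.KE
.sp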
.RS
\fBExample 6.1\fR Solve the following system of three linear equations in three unknowns:
.EQ (6.6a)
-x sub 1~+~x sub 2~-~4x sub 3~mark =~0
.EN
.EQ (6.6b)
2x sub 1~+~2x sub 2~lineup =~1
.EN
.EQ (6.6c)
3x sub 1~+~3x sub 2~+~2x sub 3~lineup =~1/2
.EN
The elementary row operations can be employed to eliminate $x sub 1$ from two of the three equations, say the second and the third, by adding 2 times the first to the second and adding 3 times the first to the third, respectively.
The resulting (equivalent) system is:
.EQ (6.7a)
-x sub 1~+~x sub 2~-~4x sub 3~mark =~0
.EN
.EQ (6.7b)
4x sub 2~-~8x sub 3~lineup =~1
.EN
.EQ (6.7c)
6x sub 2~-~10x sub 3~lineup =~1/2
.EN
Note that the coefficient of $x sub 3$ in the second equation is no longer zero, but that doesn't matter to the general process.
Now, eliminate $x sub 2$ from the third equation by subtracting $6 over 4$ times the second from the third.
The final system is:
.EQ (6.8a)
-x sub 1~+~x sub 2~-~4x sub 3~mark =~0
.EN
.EQ (6.8b)
4x sub 2~-~8x sub 3~lineup =~1
.EN
.EQ (6.8c)
2x sub 3~lineup =~-1
.EN
At this point, the last equation contains a single variable ($x sub 3$), for which a solution is easy to obtain:
.EQ (6.9a)
x sub 3~=~-1 over 2
.EN
This value can be substituted into the second equation, converting it into another equation in a single variable ($x sub 2$), from which the value of $x sub 2$ may be derived:
.EQ (6.9b)
x sub 2~=~{1~+~8x sub 3} over 4~=~-3 over 4
.EN
Finally, knowledge of the values of $x sub 3$ and $x sub 2$ yields a value for $x sub 1$ from the first equation:
.EQ (6.9c)
x sub 1~=~x sub 2~-~4x sub 3~=~-3 over 4~-~4( -1 over 2 )~=~5 over 4
.EN
.RE
This process can be generalized into an algorithm with two procedures:
.sp
.nr LL -0.5i
.KS
.B1
.sp
.ce
\fBGaussian Algorithm\fR
.sp .5
.LP
Purpose: To solve n linear equations in n variables.
.IP
.LP
Procedure:
.RS
.IP 1. 3
\fIForward elimination\fR.
Repeat the following process for each equation: eliminate one variable from each of the remaining equations by subtraction of the appropriate multiple of the equation from the remaining equations.
.IP 2. 3
\fIBackward substitution\fR.
Beginning at the last equation, solve for its remaining variable.
Successively substitute the solution values into the preceding equations and solve for their remaining variables.
.RE
.sp
.B2
.KE
.nr LL +0.5i
.LP
Forward elimination is also called \fItriangularization\fR, as a triangular matrix results.
.PP
Note that the same process which eliminates the lower triangle of elements of $bold A$ can be used to eliminate the upper triangle elements as well, leaving a \fIdiagonal\fR matrix equivalent to $bold A$.
Since it is a trivial matter to invert a diagonal matrix, the solutions are easy to obtain from Eq. (6.2).
In symbols, the original system $bold A bold x~=~bold b$ is replaced by the equivalent diagonalized system $bold D bold x~=~bold c$ with the solution $bold x~=~bold D sup -1 bold c$.
This process, called Gauss-Jordan elimination plus scaling, produces an alternative algorithm to forward elimination and backward substitution, of comparable efficiency (both are order($N sup 3$), though Gauss-Jordan requires somewhat more arithmetic) but somewhat easier to implement.
.sp
.nr LL -0.5i
.KS
.B1
.sp
.ce
\fBGauss-Jordan Algorithm\fR
.sp .5
.LP
Purpose: To solve n linear equations in n variables.
.IP
.LP
Procedure:
.RS
.IP 1. 3
\fIGauss-Jordan elimination\fR.
Repeat the following process for each equation: eliminate one variable from each of the other equations by subtraction of the appropriate multiple of the equation from the remaining equations.
.IP 2. 3
Solve each equation in a single variable resulting from step 1 by dividing the resulting constant by the resulting coefficient of the variable.
.RE
.sp
.B2
.KE
.nr LL +0.5i
.PP
The Gauss-Jordan Algorithm may be expressed in matrix form by recognizing that the following row-equivalence operation between $row sub j$ and $row sub i$ will eliminate the i\fIth\fR element from the j\fIth\fR row, i.e. reduce $a sub ji$ to zero:
.EQ (6.10)
bold row sub j~-~left ( {a sub ji} over {a sub ii} right )~bold row sub i ~~~~~~~~~~j~!=~i~=~1,~...~n
.EN
This may be implemented with the assignment statements
.EQ (6.11a)
a sub jk~<-~a sub jk~-~{a sub ji} over {a sub ii} a sub ik , ~~~k~=~1~..~n ~~~~~~~~~~j~!=~i~=~1,~...~n
.EN
.EQ (6.11b)
b sub j~<-~b sub j~-~{a sub ji} over {a sub ii} b sub i ~~~~~~~~~~j~!=~i~=~1,~...~n
.EN
The diagonalized system is then inverted by
.EQ (6.11c)
x sub i~<-~{b sub i} over {a sub ii} ~~~~~~~~~~i~=~1,~...~n
.EN
.PP
(EXAMPLE HERE)
.PP
To be successful, division by zero or even by relatively small numbers must be avoided.
This can be effected by checking for singular and ill-conditioned behavior.
Disregarding this complication, the following C program implements the Gauss-Jordan matrix algorithm for linear systems in a straightforward way.
.sp
.KS
.ps -2
.vs -2
.nf
\f4
.eo
#include <stdio.h>

#define MAX_SIZE 100    /* arrays are indexed 1..n, so n must be less than MAX_SIZE */

main()  /* Gauss-Jordan solution of linear equation system */
{
    int i, j, n;
    double A[MAX_SIZE][MAX_SIZE], b[MAX_SIZE], x[MAX_SIZE];

/* Input matrix representation of linear system */
    printf ("\n\t\tSolve linear system Ax = b for x.\n\n");
    printf ("Enter number of x variables: ");
    scanf ("%d", &n);
    printf ("Enter %d x %d elements of A:\n", n, n);
    for (i = 1; i <= n; ++i)
        for (j = 1; j <= n; ++j)
            scanf ("%lf", &A[i][j]);
    printf ("Enter %d elements of b:\n", n);
    for (i = 1; i <= n; ++i)
        scanf ("%lf", &b[i]);

/* Solve matrix representation of linear system */
    Diagonalize (n, A, b);  /* In-place Gauss-Jordan diagonalization of A */
    Scale (n, A, b, x);     /* Solutions by inversion of diagonal matrix A */

/* Output solution vector of linear system */
    printf ("\nSolution vector: \n");
    for (i = 1; i <= n; ++i)
        printf ("%.14f\n", x[i]);
}

Diagonalize (n, A, b)   /* In-place Gauss-Jordan diagonalization of A */
int n;
double A[][MAX_SIZE], b[];
{
    int i, j, k;
    double r;

    for (i = 1; i <= n; ++i)
        for (j = 1; j <= n; ++j)
            if (j != i) {
                r = A[j][i]/A[i][i];    /* assumes A[i][i] != 0 */
                for (k = 1; k <= n; ++k)
                    A[j][k] = A[j][k] - r*A[i][k];
                b[j] = b[j] - r*b[i];
            }
}

Scale (n, A, b, x)      /* Solutions by inversion of diagonal matrix A */
int n;
double A[][MAX_SIZE], b[], x[];
{
    int i;

    for (i = 1; i <= n; ++i)
        x[i] = b[i]/A[i][i];
}
.ec
.R
.fi
.ps +2
.vs +2
.KE
.sp
.NH
Matrix Eigenvalue Equations
.PP
Consider the following equation:
.EQ (6.12)
bold A bold x~=~lambda bold x
.EN
where $bold x$ is a column array (vector) of n unknown variables, $bold A$ is a square $n times n$ matrix of coefficient parameters, and $lambda$ is a constant, called an \fBeigenvalue\fR.
Geometrically, $bold A$ transforms the vector $bold x$ by stretching it without rotation.
This equation is similar to the matrix form of a system of linear algebraic equations, except that the target vector has become a multiple of the solution vector itself.
A connection between the two can be made by transforming the eigenvalue equation to the form $( bold A~-~lambda bold I ) bold x~=~bold 0$, which is equivalent to a linear system $bold A prime bold x~=~bold b$ where $bold A prime$ annihilates $bold x$ to produce a $bold 0$ target vector, called a \fBhomogeneous\fR system.
If $bold A prime$ has an inverse, then the only solution is the trivial solution: $bold A prime sup -1 bold A prime bold x~=~bold x~=~bold 0$.
Thus for non-trivial solutions $bold A prime$ does not have an inverse and is said to be \fBsingular\fR.
Since $bold A prime sup -1~=~{roman adj~bold A prime} over {det~bold A prime}$, it may be inferred that $det bold A prime~=~0$, or
.EQ (6.13)
left | bold A~-~lambda bold I right |~=~0
.EN
This determinant, called the \fBsecular determinant\fR, is associated with a polynomial equation of degree n in $lambda$, called the \fBcharacteristic equation\fR.
There are thus \fIn\fR eigenvalues $lambda sub i$ of an $n times n$ matrix.
Corresponding to each eigenvalue there is a solution vector $bold x sub i$, called an \fBeigenvector\fR.
Collecting all the eigenvalues into a diagonal matrix $bold LAMBDA$ and the corresponding eigenvectors into a matrix $bold X$ of column eigenvectors, the solution to the matrix eigenvalue problem can be stated in the single equation
.EQ (6.14)
bold A bold X~=~bold X bold LAMBDA
.EN
For non-singular $bold X$, $bold X sup -1 bold A bold X~=~bold LAMBDA$, and we say that the matrix $bold A$ is \fIdiagonalized\fR by the matrix of its eigenvectors.
.PP
We shall concentrate here on \fBHermitian\fR matrices (matrices which have the property of being equal to their complex-conjugate transpose).
Since it can be shown that such systems produce real (as distinct from complex) eigenvalues, they are of obvious importance to scientific models having real measurable values.
The special case of Hermitian matrices with real components is the set of real symmetric matrices, which are particularly convenient to implement in languages which do not support complex arithmetic (although the extension to complex matrices is straightforward).
.PP
An example will illustrate the concepts of eigenvalue problems.
Consider the quadratic equation
.EQ (6.15)
7x sub 1 sup 2~+~13x sub 2 sup 2~-~6 sqrt 3 ~x sub 1 x sub 2~=~64.
.EN
Graphically, this equation represents an ellipse centered at the origin of a two-dimensional Cartesian coordinate system, with semi-major axis of length 4 and semi-minor axis of length 2, rotated 30\(de from the coordinate axes $x sub 1$ and $x sub 2$.
The quadratic equation can be expressed in matrix "quadratic" form $bold x sup T bold A bold x~=~64$, with the coefficient matrix
.EQ (6.16)
bold A~=~left ( matrix { ccol { {7} above {-3 sqrt 3} } ccol { {-3 sqrt 3} above {13} } } right )
.EN
The eigenvalues are found by solving the secular equation:
.EQ (6.17)
det ( bold A~-~lambda bold I )~=~det left ( matrix { ccol { {7~-~lambda} above {-3 sqrt 3} } ccol { {-3 sqrt 3} above {13~-~lambda} } } right )~=~0
.EN
This determinant evaluates to
.EQ (6.18)
(7~-~lambda ) (13~-~lambda )~-27~=~lambda sup 2~-~20 lambda~+~64~=~0
.EN
with solutions
.EQ (6.19)
lambda sub 1~=~4,~~~~lambda sub 2~=~16.
.EN
The eigenvectors corresponding to these eigenvalues are found by substitution into the original eigenvalue equation:
.EQ (6.20)
( bold A~-~lambda sub i bold I ) bold x sub i~=~0,~~~i~=~1,2
.EN
or
.EQ (6.21a)
(7~-~lambda sub i ) x sub 1i~-~3 sqrt 3 ~x sub 2i~=~0
.EN
.EQ (6.21b)
-~3 sqrt 3 ~x sub 1i~+~(13~-~lambda sub i ) x sub 2i~=~0
.EN
Substituting the values of $lambda sub i$, $i~=~1,2$, into these equations yields
.EQ (6.22a)
x sub 11~=~sqrt 3 ~x sub 21
.EN
and
.EQ (6.22b)
x sub 12~=~{-1} over {sqrt 3} ~x sub 22
.EN
An additional condition is needed to obtain unique values of the eigenvectors.
It may be taken to be the \fInormalization\fR condition:
.EQ (6.23)
x sub 1i sup 2~+~x sub 2i sup 2~=~1
.EN
Eqns. (6.22) and (6.23) yield unique values (up to sign) for $bold x sub 1$ and $bold x sub 2$, which may be collected into the matrix solution
.EQ (6.24)
bold X~=~left ( matrix { ccol { {{sqrt 3} over 2} above {1 over 2} } ccol { {-1 over 2} above {{sqrt 3} over 2} } } right )
.EN
It may be seen that this matrix is orthogonal, $bold X sup -1~=~bold X sup T$, and that it diagonalizes the original matrix
.EQ (6.25)
bold X sup -1 bold A bold X~=~bold LAMBDA~=~left ( matrix { ccol { 4 above 0 } ccol { 0 above 16 } } right )
.EN
To return to the geometric interpretation, consider the effect of the orthogonal transformation $bold X$ on the ellipse of Eq. (6.15).
Let $bold x~=~bold X bold x prime$, which means carry out the transformation
.EQ (6.26a)
x sub 1~=~{sqrt 3} over 2 x prime sub 1~-~1 over 2 x prime sub 2
.EN
.EQ (6.26b)
x sub 2~=~1 over 2 x prime sub 1~+~{sqrt 3} over 2 x prime sub 2 .
.EN
Applying this transformation to Eq. (6.15) (and simplifying) yields
.EQ (6.27)
{x prime sub 1 sup 2} over 16~+~{x prime sub 2 sup 2} over 4~=~1
.EN
which is the equation of the ellipse in the "standard" form $bold x prime sup T bold A prime bold x prime~=~constant$ in the new axis system.
Thus we see that the orthogonal transformation rotates geometric objects in space, preserving lengths (normalization) and angles (orthogonality).
In matrix notation we may summarize the transformation process as
.EQ (6.28)
bold x sup T bold A bold x~=~{( bold X bold x prime )} sup T bold A bold X bold x prime~=~bold x prime sup T bold X sup T bold A bold X bold x prime~=~bold x prime sup T bold A prime bold x prime
.EN
The form $bold A prime ~=~ bold X sup T bold A bold X$ is called an \fBorthogonal transformation\fR (a unitary transformation for unitary $bold X$).
.PP
The general orthogonal rotation matrix which rotates vectors in the plane counterclockwise through an angle $theta$ is
.EQ (6.29)
bold R ( theta )~=~left ( matrix { ccol { {cos ( theta )} above { sin ( theta )} } ccol { {- sin ( theta )} above { cos ( theta )} } } right )
.EN
The application of such an orthogonal transformation to the i\fIth\fR and j\fIth\fR rows and columns of a symmetric $n times n$ matrix can be made to reduce the (i,j) off-diagonal element to zero.
The rotation matrix which does this is constructed by inserting the elements of $bold R$ in Eq. (6.29) into the (i,i), (i,j), (j,i) and (j,j) elements of an n-by-n unit matrix, so that the rotation takes place in the ij plane of n-space (i.e. $bold R sub 2~->~bold R sub n$ with $1~->~i$ and $2~->~j$).
This suggests that successive rotations could diagonalize a symmetric (square) matrix.
However, each rotation modifies other elements of the matrix as well, so an iterative procedure is required, which is repeated until all off-diagonal elements are acceptably close to zero.
The \fBJacobi Method\fR is an algorithm for successive orthogonal transformations on a symmetric matrix leading to diagonalization, and hence to the eigensolutions.
The condition that the ij\fIth\fR off-diagonal element be reduced to zero at any given step is
.EQ (6.30)
a sub ij prime~=~a sub ji prime~=~a sub ij ( cos sup 2 theta~-~sin sup 2 theta ) ~+~(a sub jj~-~a sub ii ) sin theta cos theta~=~0
.EN
which leads to
.EQ (6.31)
theta~=~{1 over 2}tan sup -1 ({2a sub ij} over {a sub ii~-~a sub jj})~~~i~!=~j
.EN
Given values of $theta$, the largest off-diagonal elements (at "pivot" points) are successively rotated to zero until none are greater than some preselected convergence threshold.
Simultaneously, the product of the rotation matrices converges to the eigenvector matrix.
.sp
.nr LL -0.5i
.KS
.B1
.sp
.ce
\fBJacobi Symmetric Matrix Diagonalization Algorithm\fR
.sp .5
.LP
Purpose: Given a real symmetric matrix, find the eigenvalues and eigenvectors.
.IP
.LP
Procedure:
.RS
.IP 1. 3
Locate the largest off-diagonal element.
.IP 2. 3
Rotate the largest off-diagonal element to zero.
.IP 3. 3
Repeat steps 1 and 2 until the largest off-diagonal element is below a given threshold, accumulating the eigenvectors as the product of the rotation matrices.
.RE
.sp
.B2
.KE
.nr LL +0.5i
.PP
The implementation below is limited in that for convenience it employs static arrays of limited size; this restriction could be removed by employing dynamically allocated arrays.
.sp
.KS
.ps -2
.vs -2
.nf
\f4
.eo
#include <stdio.h>
#include <math.h>

#define MAX_SIZE 100

main()  /* Jacobi solution of linear matrix eigenvalue equation */
{
    int i, j, n;
    static double A[MAX_SIZE][MAX_SIZE], X[MAX_SIZE][MAX_SIZE], tolerance;

/* Input matrix representation of eigenvalue system */
    printf ("\n\t\tSolve linear eigenvalue system Ax = lx for L and X.\n\n");
    printf ("Enter number of x variables: ");
    scanf ("%d", &n);
    printf ("Enter the %d*(%d+1)/2 LOWER triangle elements of A (including the diagonal):\n", n, n);
    for (i = 1; i <= n; ++i)
        for (j = 1; j <= i; ++j) {
            scanf ("%lf", &A[i][j]);
            A[j][i] = A[i][j];
        }
    printf ("\nEnter convergence tolerance (largest off-diagonal value):\n");
    scanf ("%lf", &tolerance);
    printf ("\n Initial A: \n");
    Print_matrix (n, A);

/* Initialize eigenvector matrix */
    Initialize_X (n, X);
    printf ("\n Initial X: \n");
    Print_matrix (n, X);

/* Solve matrix representation of eigenvalue system */
    Diagonalize (n, A, X, tolerance);   /* Jacobi diagonalization of A */

/* Output eigensolutions of linear system */
    printf ("\n Diagonalized A: \n");
    Print_matrix (n, A);
    printf ("\n Eigenvectors X: \n");
    Print_matrix (n, X);
}

Print_matrix (n, A)
int n;
double A[MAX_SIZE][MAX_SIZE];
{
    int i, j;

    for (i = 1; i <= n; ++i) {
        for (j = 1; j <= n; ++j)
            printf ("%.14f ", A[i][j]);
        printf ("\n");
    }
}

Initialize_X (n, X)     /* Initialize eigenvector matrix to the unit matrix */
int n;
double X[][MAX_SIZE];
{
    int i, j;

    for (i = 1; i <= n; ++i)
        for (j = 1; j <= n; ++j)
            if (i == j)
                X[i][j] = 1;
            else
                X[i][j] = 0;
}

#define MAX_ITERATIONS 1000

Diagonalize (n, A, X, tolerance)    /* Jacobi diagonalization of A */
int n;
double A[][MAX_SIZE], X[][MAX_SIZE], tolerance;
{
    int i, j, n_iterations = 0;
    double Max_off_diag();

/* Iterate until the maximum off-diagonal element is less than the tolerance */
    while (++n_iterations < MAX_ITERATIONS && Max_off_diag (n, A, &i, &j) > tolerance)
        Rotate (n, i, j, A, X);
}

double Max_off_diag (n, A, max_i, max_j)    /* Find the largest off-diagonal element of A */
int n, *max_i, *max_j;
double A[][MAX_SIZE];
{
    int i, j;
    double max_value = 0;

    for (i = 1; i <= n; ++i)
        for (j = i+1; j <= n; ++j)
            if (fabs(A[i][j]) > max_value) {
                max_value = fabs(A[i][j]);
                *max_i = i;
                *max_j = j;
            }
    return (max_value);
}

Rotate (n, i, j, A, X)  /* Rotate in the ij plane to zero A[i][j] */
int n, i, j;
double A[][MAX_SIZE], X[][MAX_SIZE];
{
    int k;
    double theta, S, C, Xki, Aii;

    theta = .5*atan(2*A[i][j]/(A[i][i] - A[j][j]));     /* Eq. (6.31); assumes A[i][i] != A[j][j] */
    S = sin(theta);
    C = cos(theta);
    for (k = 1; k <= n; ++k) {
        Xki = X[k][i];
        X[k][i] = Xki*C + X[k][j]*S;
        X[k][j] = - Xki*S + X[k][j]*C;
        if (k != i && k != j) {
            A[k][i] = A[i][k]*C + A[j][k]*S;
            A[k][j] = - A[i][k]*S + A[j][k]*C;
            A[j][k] = A[k][j];
            A[i][k] = A[k][i];
        }
    }
    Aii = A[i][i];
    A[i][i] = Aii*C*C + A[j][j]*S*S + 2.*A[i][j]*S*C;
    A[j][j] = Aii*S*S + A[j][j]*C*C - 2.*A[i][j]*S*C;
    A[i][j] = A[j][i] = 0;
}
.ec
.R
.fi
.ps +2
.vs +2
.KE
.sp
.NH
Non-linear Systems
.PP
Where only small differences are involved, linear models are useful approximations.
The "real" world, however, is non-linear.
.SH
Roots of Equations
.PP
The values of the independent variables of a function at which the value of the function is zero are called \fBroots\fR of the equation.
Roots are related to solutions and therefore are important quantities.
We consider first roots of arbitrary real functions of one variable, and then extend the discussion to multivariable equations.
.PP
Since general functions do not lend themselves to analytic solution, numerical techniques are indicated.
Specifically we will consider iteration algorithms where new guesses to solution values at the i\fIth\fR step, $x sub i$, are obtained from previous guesses, $x sub i-1$, starting with an \fIinitial guess\fR, $x sub 0$.
\fBConvergence\fR means that the current iterate is acceptably close to the previous iterate, i.e. $DELTA x~<=~epsilon$.
Due to round-off error accumulation, it would not be correct to assume that all \fIfuture\fR iterates at convergence would remain constant.
.SH
Iterated Substitution
.PP
Quadratic functional dependence is the next most complicated algebraic behavior after the linear (proportional) relationship.
The simplest non-trivial ($x~!=~0$) quadratic equation is
.EQ (6.32)
x sup 2~=~a
.EN
which has the formal solution
.EQ (6.33)
x~=~+- sqrt a
.EN
Eq. (6.33) is deceptively simple looking, as we shall see.
To provide an exact numerical value of the solution for the simplest non-trivial ($a~!=~1$) integer, namely $a~=~2$, is in fact impossible!
The symbol $sqrt 2$ \fIrepresents\fR the solution, and it can be manipulated according to the rules of algebra, but the \fInumerical\fR value of $sqrt 2$ is an "irrational" (non-repeating, non-terminating) number.
So our task is to provide an algorithm for computing a decimal fraction approximation (a finite string of digits) to a non-integer number.
.PP
One way to proceed is through \fIiterated (repeated) substitution\fR.
This technique is applicable when an equation can be cast into the form
.EQ (6.34)
x~=~f(x)
.EN
that is, when one of the terms in x can be \fIisolated\fR from the remaining terms in x.
The procedure, then, is to start with an initial "guess" value for x, and substitute it into the right-hand side of Eq. (6.34) to produce a \fInew\fR "guess" for x.
If the new value is sufficiently close to ("agrees with") the original value, a solution (value that "satisfies" the equation) has been found.
If the new value differs sufficiently from the original value, the process can be repeated (substitute the current value of x into $f(x)$ to produce a new value of x, and so on).
This process can be represented in terms of a \fIsequence\fR of x values:
.EQ (6.35)
x sub {n+1}~=~f(x sub {n}),~~n~=~0,1,2,...
.EN
Sufficiency of convergence is a relative term that can have an arbitrary upper bound, but is limited at the lower bound by the accuracy of a finite-word-length computing system.
.PP
Isolating x in Eq. (6.32) to produce an equation of the form of Eq. (6.34) gives an iteration function for finding the square root of a number:
.EQ (6.36)
x sub {n+1}~=~a over {x sub n},~~n~=~0,1,2,~...
.EN
Let's apply this algorithm to find an approximation (all that's possible) to the square root of 2, starting with an initial guess of $x sub 0~=~1$.
The sequence that results from Eq. (6.36) is
.EQ (6.37)
x~=~1,~2,~1,~2,~1,~2, ...
.EN
This is obviously not converging to a constant value.
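.PP
The behavior of Eq. (6.36) is easy to reproduce by machine.
The following minimal C sketch generates the oscillating sequence of Eq. (6.37); the number of iterates printed is an illustrative assumption only.
.sp
.KS
.ps -2
.vs -2
.nf
\f4
.eo
/* Iterated substitution for the square root of a, Eq. (6.36): x[n+1] = a/x[n].
   A minimal sketch; prints the first few iterates only. */

#include <stdio.h>

main()
{
    double a = 2, x = 1;        /* a = 2, initial guess x0 = 1 */
    int n;

    for (n = 0; n < 8; ++n) {
        printf ("x%d = %g\n", n, x);    /* print the current iterate */
        x = a/x;                        /* Eq. (6.36) */
    }
}
.ec
.R
.fi
.ps +2
.vs +2
.KE
.sp
.LP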
Perhaps the initial guess was unlucky.
Starting with $x sub 0~=~1/2$ produces
.EQ (6.38)
x~=~1 over 2 ,~4,~1 over 2 ,~4,~1 over 2 ,~4,~...
.EN
A pattern of behavior is emerging which is clarified by performing two iterations of the general iteration equation, Eq. (6.36):
.EQ (6.39)
x sub {n+1}~mark =~a over {x sub n}
.EN
.EQ
x sub {n+2}~lineup =~a over {x sub {n + 1}}~=~x sub n
.EN
Thus there is no hope of convergence of this oscillatory behavior for any initial guess other than the solution itself.
.PP
It appears we have reached an impasse, as there is no other way to factor Eq. (6.32) to try another form for $f(x)$.
However, we can use a trick to improve the situation.
We can change the form of the iteration function $f(x)$ by generating an additional term which doesn't change the desired result.
Instead of iterating the form given by Eq. (6.36), let's add and subtract the same amount on the right-hand side of Eq. (6.36), say something linear in x.
The form is now
.EQ (6.40)
x =~a over x ~+~px~-~px
.EN
This equation certainly has the same formal solution as the original equation, but now there are more choices for which x to isolate from the remaining terms in x.
If we choose to isolate the last term in x, we have
.EQ (6.41)
x sub n~=~a over {x sub n}~+~px sub n~-~px sub {n+1},~~roman or
.EN
.EQ
x sub {n+1}~=~{1 over p}[(p~-~1)x sub n~+~a over {x sub n}]
.EN
If p is assigned the value 2, the iteration relation becomes
.EQ (6.42)
x sub {n+1}~=~x sub n over 2~+~a over {2x sub n}
.EN
This time, starting with $x sub 0~=~1$, we obtain the sequence
.EQ (6.43)
x~=~1,~3 over 2~=~1.5,~17 over 12~=~1.416...,~577 over 408~=~1.414215...
.EN
and we are converging to the right value this time, with each of the iterates providing an improved \fIrational fraction approximation\fR to the square root of two.
.SH
Functional Composition
.PP
An iterated functional relationship can be expressed in the form
.EQ (6.44)
x sub 0~->~x sub 1~=~f(x sub 0 )~->~x sub 2~=~f(x sub 1 )~->~...~,~~roman or
.EN
.EQ
x sub 0~->~f(x sub 0 )~->~f(f(x sub 0 ))~->~...~->~f sup (n) (x sub 0 )
.EN
where the second form represents \fBiterated functions\fR, or \fBfunctional composition\fR, with the shorthand symbol for the n\fIth\fR iterate as a superscript.
.PP
The sequence of x values is called an "orbit", a term from Henri Poincar\('e's investigation, near the turn of the twentieth century, into iterated functions related to the paths (orbits) of the planets in the solar system.
Poincar\('e was interested in the question, "Is the solar system \fIstable\fR?", that is, will it continue in its same course indefinitely, or, over time, would a different motion result?
This appeared to be a straightforward application of classical mechanics to a dynamical system of known behavior.
In a sense Poincar\('e was doing nothing new, since Newton had announced a mathematical description of the solar system in terms of differential equations in his \fIPrincipia Mathematica\fR.
Of course the system was somewhat more complicated than Newton's apple/earth system, or his earth/sun system, or his comet/solar-system system, for that matter.
Each of these latter systems ideally involves only \fItwo bodies\fR, for which analytical solutions to Newton's equations of motion are readily available.
However there are no known analytic solutions to the general \fIn-body problem\fR, where n is any number greater than two.
So Poincar\('e's problem is a natural for numerical analysis.
However, realizing the difficulty of exploring the long-term behavior of such a system with the tools available to him at the time, he sought general results rather than specifics.
What he discovered is that, generally speaking, \fIthe stability of the solar system is unpredictable\fR!
Along the way he discovered that there are just three things that can eventually happen to any dynamical system, and these results apply to the iterated mathematical systems we are interested in as well.
The orbit can eventually converge to a single value, called a \fBfixed point\fR, or it can oscillate between a set of n points, called a \fBperiodic n-cycle\fR, or it can \fInever return to any previous point\fR, called \fBchaos\fR.
Note that a chaotic orbit may be considered an infinite-cycle periodic orbit.
These results apply to isolated bodies (acting under external influence), to coupled particle systems, and indeed to perfectly general systems of any type, including the solar system and the universe itself (will it expand forever or will it oscillate, or?).
.PP
The most disturbing consequence of Poincar\('e's investigations was that for most systems, \fIeven if a completely accurate calculation could be made, there is no hope of predicting the system's course far into the future\fR.
Since no real system can be totally isolated from its surroundings, no orbit can remain totally constant, and \fIall orbits must eventually become chaotic\fR.
This is because the slightest perturbation generates a slightly different orbit which eventually diverges from every nearby orbit by an unpredictable amount.
.PP
Poincar\('e's discoveries were so difficult to accept that his work lay dormant for the better part of this century, until scientists began exploring complex dynamical systems using computers.
It is now realized that numbers themselves exhibit the Poincar\('e properties, with integers analogous to constant orbits, repeating decimals to periodic orbits, and irrationals to chaotic orbits.
For example, random number research seeks algorithms generating irrational numbers.
Of course the assumption is that the dynamical systems have an unlimited number of states they can visit.
Finite-state systems like fixed word-length computers cannot be expected to model real systems perfectly.
In a philosophical sense it may be debated whether the universe must have a finite number of possible states, however large that may be.
In a practical sense, sufficiently large words can accommodate a sufficient variety of states to be useful in describing the behavior of real systems.
.PP
Equilibrium, oscillation and chaos are seen in systems which are described by non-linear models.
Hence even the simplest non-linear problems, such as the algorithm for estimating the square root of two given by Eq. (6.36), can have surprising behavior.
.SH
Graphical Interpretation
.PP
Here are some other "simple" systems, which can be explored with a pocket calculator, illustrating various behaviors:
.KS
.sp
.ce
\fBTable 6.1 Dynamical Behavior of Simple Systems.\fR
.vs +2
.TS
center tab(|);
l l l.
f(x)|orbit|behavior
_
$x$|$x~->~x~->~x~->~...$|fixed, 1-cycle
$-x$|$x~->~-x~->~+x~->~...$|periodic, 2-cycle
$e sup x$|$x~->~e sup x~->~e sup {e sup x}~->~...~inf$|divergent
$sin(x)$|$x~->~sin(x)~->~sin(sin(x))~->~...~0$|convergent
$cos(x)$|$x~->~cos(x)~->~cos(cos(x))~->~...~0.739085 ...~rad$|convergent
$3.839 x(1~-x)$|$x~->~3.839 x(1~-x)~...~.149888 ...,~.489172 ...,~.959299 ...,~.149888 ...$|periodic, 3-cycle
$4x(1~-x)$|$x~->~4x(1~-x)~...$|chaotic
_
.TE
.vs -2
.KE
Poincar\('e used \fIgraphical analysis\fR to explore the global behavior of dynamical systems, and it proves useful here in understanding the behavior of iterated mathematical systems.
One type of graphical analysis, sometimes called a "web" analysis, observes that each new iterate can easily be identified on a graph of $f(x)$ \fIvs\fR x by including the diagonal line $f(x)~=~x$ on the graph.
The next iterate is located by connecting the current value of $f(x)$ horizontally to the diagonal, producing $x~=~f(x)$, thence returning vertically to the curve $f(x)$ for the next iterate.
Study of a few examples, such as $f(x)~=~ax$ for various values of a (positive and negative), illustrates the general rule in terms of the magnitude of the derivative of $f(x)$:
.KS
.sp
.ce
\fBTable 6.2 Criteria for Dynamical Behavior.\fR
.TS
center tab(@);
l l.
$| f prime (x) |$@behavior
_
= 0@critical (superstable)
< 1@attracting (stable)
= 1@neutral (metastable)
> 1@repelling (unstable)
.TE
.KE
.PP
Fixed points occur where $f(x)$ crosses the diagonal line.
.LP
(FIGURES HERE)
.LP
Consider the more general quadratic $f(x;c)~=~x sup 2~+~c$.
Possible fixed points, $x *$, can be found by solving
.EQ (6.45)
x *~=~f(x * )~=~x * sup 2~+~c
.EN
for which
.EQ (6.46)
x *~=~{1 +- sqrt {1~-~4c}} over 2
.EN
Evaluating the derivative at the fixed point:
.EQ (6.47)
f prime (x * )~=~2 x *~=~1~+- sqrt {1~-~4c}
.EN
we find that neutrality ($f prime (x * )~=~1$) occurs for $c~=~1/4$, $x *~=~1/2$.
Values of c greater than 1/4 have no fixed point, while those for c less than 1/4 have two fixed points ("bifurcation" occurs at $x *~=~1/2,~c~=~1/4$).
.SH
Newton's Root Finding Method
.PP
Repeated substitution has limitations, as illustrated above, not the least of which is that it may not be possible to isolate x in a given equation.
A more general scheme for obtaining roots to equations, invented by Isaac Newton, generalizes the root condition (the point where the function crosses the x-axis) to
.EQ (6.48)
f(x)~=~0
.EN
of which the iterated substitution equation, Eq. (6.34), is a special case.
The strategy is to use the \fIslope\fR of the function, or some estimate of the slope, to determine the next guess (iterate) approaching the root.
Isolating x from a finite-difference estimate of the slope of a function
.EQ (6.49)
f prime (x sub n )~approx~{DELTA f(x sub n )} over {DELTA x sub n }~=~{f(x sub {n+1})~-~f(x sub n )} over {x sub {n+1}~-~x sub n}
.EN
gives the next guess
.EQ (6.50)
x sub {n+1}~=~x sub n~+~DELTA x sub n~=~x sub n~+~{f(x sub {n+1})~-~f(x sub n )} over {f prime (x sub n )}
.EN
Assuming the $x sub {n+1}$ iterate approximates the root, $f(x sub {n+1})~approx~0$, the final form for the Newton iteration method is
.EQ (6.51)
x sub {n+1}~=~x sub n~+~DELTA x sub n~=~x sub n~-~{f(x sub n )} over {f prime (x sub n )}
.EN
.PP
There are a number of alternative ways to estimate the slope; Newton's Method proper obtains the slope analytically.
It is therefore restricted to functional forms for which analytic derivatives can be extracted.
Convergence can be measured in terms of an estimate of the relative error, ${DELTA x sub n} over {x sub n}$.
.PP
Returning to the example of evaluating roots of numbers, Newton's Method becomes
.EQ (6.52)
f(x)~=~x sup n~-~a~=~0
.EN
.EQ (6.53)
x sub {k+1}~=~x sub k~+~DELTA x sub k~=~x sub k~-~{f(x sub k )} over {f prime (x sub k )}~=~{n~-~1} over n x sub k~+~a over {n x sub k sup {n-1}}
.EN
(the iterates are indexed by k here to avoid confusion with the order n of the root).
For the square root ($n~=~2$), this evaluates to ${x sub k} over 2~+~a over {2x sub k}$, which happens to be the same (converging) expression as the extended substitution algorithm, Eq. (6.42).
.PP
A C program which implements Newton's Method for finding the n\fIth\fR root of a real illustrates the power of defining macro functions.
.sp
.KS
.ps -2
.vs -2
.nf
\f4
.eo
#include <stdio.h>
#include <math.h>
#include <float.h>

double a;       /* External (global) variables for n-th root of a */
int n, IT;      /* Gives up when iteration count IT > IMAX iterations */

#define IMAX 100

/* Newton's Method for root of f(x) = 0: x[k+1] = x[k] + DX(x[k]); DX(x) = -f(x)/f'(x) */
#define REL_ERR pow(10.,(double) -DBL_DIG)
#define DX(x) (a/pow(x,(double) n) - 1)*x/n     /* f(x) = x^n - a */
#define root(x) for (IT = 0; ++IT <= IMAX && fabs(DX(x)) > fabs(x)*REL_ERR; x += DX(x))

main()  /* Driver to test function root() for finding n-th root of a */
{
    double r;

    printf ("\nEnter positive a and positive integer n for n-th root of a (or EOF):\n");
    while (scanf ("%lf%d", &a, &n) == 2 && a > 0 && n > 0) {
        r = 1+(a-1)/n;      /* Initial estimate of root r */
        root(r);
        printf ("After %d iterations, %g^(1/%d) = %.18g, DX = %1.e.\n",
            IT-1, a, n, r, fabs(DX(r)));
    }
}
.ec
.R
.fi
.ps +2
.vs +2
.KE
.sp
.PP
Note the use of the float.h header's machine- and implementation-dependent parameter DBL_DIG to automate the estimate of the relative error in terms of the maximum number of significant digits.
Newton's Method is implemented in two macro function statements, one defining the correction DX (the negative of the function divided by its derivative) and the second implementing the iteration loop.
Example runs are:
.KS
.ps -2
.vs -2
.nf
\f4
Enter positive a and positive integer n for n-th root of a (or EOF):
2 2
After 4 iterations, 2^(1/2) = 1.4142135623730951, DX = 2e-17.
1000000 6
After 58 iterations, 1e+06^(1/6) = 10, DX = 0e+00.
6 1000000
After 8 iterations, 6^(1/1000000) = 1.0000017917610744, DX = 3e-18.
1e-38 1000000
After 91 iterations, 1e-38^(1/1000000) = 0.99991250559433013, DX = 6e-18.
1e38 1000000
After 100 iterations, 1e+38^(1/1000000) = 9.9995874358680239e+31, DX = 4e+25.
.R
.fi
.ps +2
.vs +2
.KE
.PP
Note that the value of the millionth root of 6 is not exactly unity, which is to be expected.
The millionth roots of very small numbers are computed correctly, but the millionth root of a very large number does not converge after 100 iterations, the maximum allowed for this program.
.NH
Numerical Integration
.PP
Historically, mathematical integration goes back to property surveys in antiquity.
Both the ancient Egyptians and Chinese developed methods for estimating the area of unusual two-dimensional shapes.
All numerical methods of estimating integrals effectively discover the function to be integrated by evaluating it at various values of the independent argument.
We will discuss two numerical methods for estimating integrals of arbitrary real functions of one variable, one systematic and one which uses random numbers.
.SH
Polygonal Integration
.PP
Geometrically, integration refers to determining a higher-dimensional quantity from a lower-dimensional description, i.e. an area from a line or a volume from an area.
The dimensionality of the integral refers to the number of independent variables to be integrated.
Where a functional relation describes the boundaries of the area or volume, the techniques of analytical calculus may be employed to determine an analytical value for the integral.
However, not all functional forms have analytic integrals, so numerical approximations are required.
Integration, whether analytical or numerical, involves, in essence, a summation process; hence the integral sign, a stylized elongated letter S standing for summation:
.EQ (6.54)
int from a to b f( bold x ) d bold x~approx~sum from i to n f( bold x sub i ) DELTA bold x sub i
.EN
The "limits" of integration, denoted here by a and b, describe the range of independent variables over which the area is to be determined.
.PP
Numerical integration is visualized graphically as the accumulated area between the function to be integrated and the axis of the independent variable of the function.
A large class of numerical integration techniques employ known analytical functions to approximate the function to be integrated.
The analytical functions chosen have known analytical integral expressions from which the approximations can be obtained.
Although the function to be integrated may only be approximated by a given analytic function, the analytic function may be assumed to match the function to be integrated in small regions, with the total obtained by summation.
.PP
The simplest analytic function with which to approximate an arbitrary function is the linear function.
This generates piecewise linear approximations to the integral.
For one-dimensional integrals the area under a small portion of the curve can be approximated by a rectangle of width $w sub i$ equal to the distance along the independent variable to be integrated, and height $h sub i$ equal to the value of the function to be integrated at some value of the independent variable along the width of the rectangle, within the boundaries of integration.
Taking the height at the midpoint of each subinterval (an estimate of the average height of the function there) gives:
.EQ (6.55)
w sub i~=~DELTA x sub i~=~b sub i~-~a sub i
.EN
.EQ
h sub i~=~f(x sub i )~approx~f({a sub i~+~b sub i} over 2 )
.EN
.EQ
int from a to b f(x) dx~approx~sum from i to n w sub i h sub i ~=~sum from i to n (b sub i~-~a sub i ) f( {a sub i~+~b sub i} over 2 )
.EN
where n is the number of subintervals of the interval over which the integral is taken.
For the common case of equal intervals, the integral expression simplifies to the following integration formula, or "\fBrule\fR":
.EQ (6.56)
int from a to b f(x) dx~approx~sum from i to n w f(a~+~iw~-~w over 2 )
.EN
As n increases, the approximation to the integral should improve, unless accumulation of round-off errors corrupts the calculation.
The C program below implements Eq. (6.56) to estimate $int from 0 to 1 e sup x dx~=~e~-~1~approx~1.71828182845904523536...$.
.sp
.KS
.ps -2
.vs -2
.nf
\f4
.eo
/* Numerical One-dimensional Rectangular Integration */

#include <stdio.h>

main()
{
    int i, n;
    double a, b, w, area = 0, f();

    printf("\nEnter the range of independent variable values (a,b): ");
    scanf("%lf%lf", &a, &b);
    printf("\nEnter the number of points to be used: ");
    scanf ("%d", &n);
    w = (b - a)/n;                      /* constant width */
    for (i = 1; i <= n; ++i)            /* sum product of widths times */
        area += w*f(a + i*w - w/2);     /* height at midpoint */
    printf ("\nRectangle estimate of integral = %.15f\n", area);
}

double f(x)     /* the function to be integrated */
double x;
{
    double exp();

    return exp(x);      /* the exponential function */
}
.ec
.R
.fi
.ps +2
.vs +2
.KE
.sp
.PP
Table 6.3 shows the convergence of the value of the integral for an increasing number of "integration points" (function values) and correspondingly decreasing intervals between points.\***
.FS
In a sense, numerical integration "discerns" the shape of the function as well as its integral by evaluating the function at different points.
.FE
It is seen that the accuracy improves by about two significant figures for each order of magnitude increase in function evaluations.
.KS
.sp
.ce
Table 6.3 Rectangle Estimate of $int from 0 to 1 e sup x dx~=~e~-~1~approx~1.71828182845904523536 ...$.
.TS
center tab(|);
l l l
l l l.
.B
Number of|Rectangle|Error
function evaluations|Estimate|
.R
_
1|1.64|$1~times~10 sup -2$
10|1.7176|$7~times~10 sup -4$
$10 sup 2$|1.71827|$1~times~10 sup -5$
$10 sup 3$|1.71828175|$7~times~10 sup -8$
$10 sup 4$|1.7182818277|$3~times~10 sup -9$
$10 sup 5$|1.718281828451|$8~times~10 sup -12$
$10 sup 6$|1.7182818284589|$3~times~10 sup -13$
.TE
.KE
.SH
Monte Carlo Integration
.PP
Monte Carlo techniques use (pseudo) random numbers, which may be thought of as related to certain games of chance as played at the casinos of Monte Carlo.
Monte Carlo integration begins by enclosing the function to be integrated in a box, and estimates the area under the function as the area of the box multiplied by the fraction of the points in the box which lie between the function and the axis of the independent variable.
Obviously all points cannot be investigated, so points within the range of the box are chosen at random.
As the number of points increases, the approximation to the value of the integral improves.
.PP
Monte Carlo integration has the disadvantage that it is statistical in nature, hence error bounds cannot be determined analytically as for the systematic integration rules (but may be estimated from probability theory).
However, it has the decided advantage that it is easily extended to high-dimensional integrals of functions of several variables.
The simplicity of the extended algorithm is somewhat deceptive, however, in that convergence is slow: the statistical error decreases only as the inverse square root of the number of points.
(It is the systematic rules built by repeating one-dimensional formulas in each dimension whose function evaluations must increase exponentially with dimensionality to maintain accuracy.)
The rate of convergence is illustrated with a one-dimensional example.
.sp
.KS
.ps -2
.vs -2
.nf
\f4
.eo
/* Numerical One-dimensional Monte Carlo Integration */

#include <stdio.h>
#include <limits.h>

main()
{
    int i, n, hits = 0;
    long random();
    double a, b, h, w, x, y, f();

    printf("\nEnter the range of independent variable values (a,b): ");
    scanf("%lf%lf", &a, &b);
    w = b - a;
    printf("\nEnter height of rectangle (h): ");
    scanf("%lf", &h);
    printf("\nEnter the number of random points to be used: ");
    scanf ("%d", &n);
    for (i = 1; i <= n; ++i) {
        x = a + (random()/(double) LONG_MAX)*w;     /* random point in the w by h box; */
        y = (random()/(double) LONG_MAX)*h;         /* assumes random() ranges over 0..LONG_MAX */
        if (y < f(x))
            ++hits;                                 /* point lies under the curve */
    }
    printf ("\nMonte Carlo estimate of integral = %f\n",
        w*h*(double) hits/(double) n);              /* box area times fraction of hits */
}

double f(x)     /* the function to be integrated */
double x;
{
    double exp();

    return exp(x);      /* the exponential function */
}
.ec
.R
.fi
.ps +2
.vs +2
.KE
.sp
Table 6.4 shows the convergence with an increasing number of points.
It is seen that the accuracy improves by roughly one significant figure for every two orders of magnitude increase in function evaluations, consistent with the inverse-square-root dependence of the statistical error on the number of points.
.KS
.sp
.ce
Table 6.4 Monte Carlo Estimate of $int from 0 to 1 e sup x dx~=~e~-~1~approx~1.71828182845904523536 ...$.
.TS
center tab(|);
l l l
l l l
l n n.
.B
Number of|Monte Carlo|Error
function evaluations|Estimate|
.R
_
1|3.0000|1.2818
10|1.8000|0.0817
$10 sup 2$|1.5300|0.1883
$10 sup 3$|1.6620|0.0562
$10 sup 4$|1.7148|0.0035
$10 sup 5$|1.7212|0.0029
$10 sup 6$|1.7189|0.0006
.TE
.KE
.NH
Differential Equations
.PP
.SH
Iterated Equations
.PP
ORBITS; NON-LINEAR DYNAMICS HERE
.SH
References
.IP 1.
\fINumerical Analysis: A Practical Approach\fR, M. J. Maron, Macmillan Publishing Company, 2nd ed., 1987.
.OH ''''
.EH ''''