Determinant
From Wikipedia, the free encyclopedia
In algebra, the determinant is a function that assigns a scalar, det(A), to every n×n square matrix A. The fundamental geometric meaning of the determinant is as a scale factor for measure (length, area, volume) when A is regarded as a linear transformation. Determinants are important both in calculus, where they enter the substitution rule for several variables, and in multilinear algebra.
For a fixed nonnegative integer n, there is a unique determinant function for the n×n matrices over any commutative ring R. In particular, this function exists when R is the field of real or complex numbers.
Definition
The determinant is a number associated to any square matrix, that is, a rectangular array of numbers with equally many rows and columns. The definition will first be given for low-dimensional cases: the determinant of a 1-by-1 matrix A is the single entry of that matrix, det(A) = A1,1. The cases of 2-by-2 and 3-by-3 matrices are expounded below. These definitions are then subsumed in a general definition, valid for all n-by-n matrices.[nb 1]
The determinant of a matrix A is denoted det(A) or |A|. It is also common to denote the determinant by writing the array of matrix entries between elongated vertical bars instead of parentheses.
2-by-2 matrices
The 2×2 matrix

- A = [a b; c d]

(written row by row, with semicolons separating the rows) has determinant

- det(A) = ad − bc.
When the entries are real numbers, the determinant is the oriented area of the parallelogram with vertices at (0,0), (a,b), (a + c, b + d), and (c,d). The oriented area is the same as the usual area, except that it is negative when the vertices are listed in clockwise order.
The assumption here is that the linear transformation is applied to row vectors, i.e. as the vector-matrix product xA, where x is a row vector. The parallelogram above is obtained by multiplying each of the row vectors (0,0), (1,0), (1,1) and (0,1), the vertices of the unit square, by A in turn. With the more common matrix-vector product Ax on column vectors, the parallelogram instead has vertices at (0,0), (a,c), (a + b, c + d) and (b,d) (note that xA = (ATxT)T).
Thus when the determinant is equal to one, the matrix represents an equi-areal (area-preserving) mapping.
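A minimal Python sketch of this computation (illustrative only; the helper name det2 is our own):

```python
# The 2-by-2 determinant ad - bc equals the oriented area of the
# parallelogram spanned by the row vectors (a, b) and (c, d).
def det2(a, b, c, d):
    """Determinant of the 2x2 matrix [[a, b], [c, d]]."""
    return a * d - b * c

# The rows (3, 1) and (1, 2) span a parallelogram of oriented area 5.
print(det2(3, 1, 1, 2))   # 5
# Swapping the rows lists the vertices clockwise and flips the sign.
print(det2(1, 2, 3, 1))   # -5
```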
3-by-3 matrices
The determinant of the 3×3 matrix

- A = [a b c; d e f; g h i]

is given by

- det(A) = aei + bfg + cdh − ceg − bdi − afh.
The rule of Sarrus is a mnemonic for this formula: when the first two columns of the matrix are copied and written beside it, the determinant is the sum of the products along the three north-west to south-east diagonals of matrix elements, minus the sum of the products along the three south-west to north-east diagonals.
This mnemonic does not carry over into higher dimensions.
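A small Python sketch of the rule (illustrative only; sarrus_det is our own name):

```python
def sarrus_det(m):
    """Rule of Sarrus: sum the three NW-SE diagonal products of the 3x3
    matrix m (given as nested lists), minus the three SW-NE ones."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return (a*e*i + b*f*g + c*d*h) - (c*e*g + a*f*h + b*d*i)

print(sarrus_det([[1, 2, 3], [4, 5, 6], [7, 8, 10]]))  # -3
```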
n-by-n matrices
The determinant of a matrix of arbitrary size can be defined by the Leibniz formula (as explained in the next paragraph) or the Laplace formula (as explained at the end of the Properties section).
The Leibniz formula for the determinant of an n-by-n matrix A is

- det(A) = Σσ∈Sn sgn(σ) · A1,σ(1) · A2,σ(2) · ... · An,σ(n).
Here the sum is computed over all permutations σ of the numbers {1, 2, ..., n}. A permutation is a function that reorders this set of integers. For example, for n = 3, the original sequence 1, 2, 3 might be reordered to 2, 3, 1 or to 3, 2, 1. It is a basic fact of combinatorics that there are n! = 1 · 2 · 3 · ... · n (n factorial) such permutations. The set of all such permutations is denoted Sn. To each permutation σ one attaches its signature sgn(σ), which is +1 for even and −1 for odd permutations. Evenness or oddness can be defined as follows: the permutation is even (odd) if the new sequence can be obtained by an even (respectively, odd) number of switches of adjacent numbers. For example, starting from 1, 2, 3 and switching once, one gets 1, 3, 2; switching once more yields 3, 1, 2; and finally, after a total of three (an odd number of) switches, one gets 3, 2, 1. Therefore this permutation is odd. The permutation 2, 3, 1 is even (1, 2, 3 → 2, 1, 3 → 2, 3, 1, two switches).
In any of the n! summands, the term A1,σ(1) · A2,σ(2) · ... · An,σ(n) is shorthand for the product of the indicated matrix entries Ai,σ(i), where i ranges from 1 to n.
For small matrices, one gets back the formulae given in the previous sections.
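The Leibniz formula translates directly into code. The following Python sketch (deliberately naive, O(n·n!) time, with names of our own choosing) computes the signature by counting inversions:

```python
from itertools import permutations

def signature(sigma):
    """+1 for an even permutation, -1 for an odd one (counting inversions)."""
    inversions = sum(1 for i in range(len(sigma))
                       for j in range(i + 1, len(sigma)) if sigma[i] > sigma[j])
    return -1 if inversions % 2 else 1

def leibniz_det(a):
    """Leibniz formula for a square matrix given as a list of rows."""
    n = len(a)
    total = 0
    for sigma in permutations(range(n)):
        prod = 1
        for i in range(n):
            prod *= a[i][sigma[i]]
        total += signature(sigma) * prod
    return total

print(leibniz_det([[1, 2], [3, 4]]))  # -2, matching ad - bc
```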
The formal extension to arbitrary dimensions was made by Levi-Civita, using a pseudo-tensor symbol (Levi-Civita symbol). An alternative, but equivalent definition of the determinant can be obtained by using the following theorem:
- Let Mn(K) denote the set of all n×n matrices over the field K. There exists exactly one function
- F : Mn(K) → K
- with the two properties:
- F is alternating multilinear with regard to the columns;
- F(I) = 1.
One can then define the determinant as the unique function with the above properties.
Applications
Determinants are used to characterize invertible matrices (a matrix is invertible if and only if its determinant is non-zero), and to describe explicitly the solution of a system of linear equations Ax = b with Cramer's rule:

- xi = det(Ai) / det(A),

where Ai is the matrix formed by replacing the i-th column of A with the right-hand-side column vector b.
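A Python sketch of Cramer's rule, assuming numpy (cramer_solve is our own name; practical solvers use elimination instead, since this costs one determinant per unknown):

```python
import numpy as np

def cramer_solve(A, b):
    """Solve A x = b by Cramer's rule: x_i = det(A_i) / det(A)."""
    d = np.linalg.det(A)
    x = np.empty(len(b))
    for i in range(len(b)):
        Ai = A.copy()
        Ai[:, i] = b          # replace the i-th column by the right-hand side
        x[i] = np.linalg.det(Ai) / d
    return x

A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])
print(cramer_solve(A, b))     # [0.8 1.4], agreeing with np.linalg.solve(A, b)
```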
Determinants can also be used to find the eigenvalues of a matrix A, as the roots of the characteristic polynomial

- p(λ) = det(λI − A),

where I is the identity matrix of the same dimension as A.
One often thinks of the determinant as assigning a number to every sequence of n vectors in Rn, by using the square matrix whose columns are the given vectors. With this understanding, the sign of the determinant of a basis can be used to define the notion of orientation in Euclidean spaces. The determinant of a set of vectors is positive if the vectors form a right-handed coordinate system, and negative if left-handed.
Determinants are used to calculate volumes in vector calculus: the absolute value of the determinant of n real n-dimensional vectors is equal to the volume of the parallelepiped spanned by those vectors. As a consequence, if the linear map f : Rn → Rn is represented by the matrix A, and S is any measurable subset of Rn, then the volume of f(S) is given by |det(A)| × volume(S). More generally, if the linear map f : Rn → Rm is represented by the m-by-n matrix A, and S is any measurable subset of Rn, then the m-dimensional volume of f(S) is given by √(det(ATA)) × volume(S). By calculating the volume of the tetrahedron bounded by four points, determinants can be used to identify skew lines.
The volume of any tetrahedron, given its vertices a, b, c, and d, is (1/6)·|det(a − b, b − c, c − d)|, or any other combination of pairs of vertices that form a simply connected graph.
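A numeric illustration in Python, assuming numpy:

```python
import numpy as np

# |det| of the matrix whose columns are the spanning vectors
# gives the volume of the parallelepiped.
u, v, w = np.array([1.0, 0, 0]), np.array([1.0, 2, 0]), np.array([0.0, 1, 3])
print(abs(np.linalg.det(np.column_stack([u, v, w]))))   # 6.0

# Tetrahedron with vertices a, b, c, d: (1/6) |det(a - b, b - c, c - d)|.
a = np.array([0.0, 0, 0]); b = np.array([1.0, 0, 0])
c = np.array([0.0, 1, 0]); d = np.array([0.0, 0, 1])
print(abs(np.linalg.det(np.column_stack([a - b, b - c, c - d]))) / 6)  # 1/6
```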
Properties characterizing the determinant
In addition to the Leibniz formula above, there is another way of calculating determinants of matrices. It is based on the following properties of determinants:
- If A is a triangular matrix, i.e. Ai,j = 0 whenever i > j or, alternatively, whenever i < j, then the determinant equals the product of the diagonal entries:
- det(A) = A1,1 · A2,2 · ... · An,n.
- If B results from A by interchanging two rows or columns, then det(B) = −det(A).
- If B results from A by multiplying one row or column with a number c, then det(B) = c · det(A).
- If B results from A by adding a multiple of one row to another row, or a multiple of one column to another column, then det(B) = det(A).
These four properties can be used to compute the determinant of any matrix, using Gaussian elimination. This is an algorithm that transforms any given matrix into a triangular matrix, using only the operations in the last three items. Since the effect of these operations on the determinant can be traced, the determinant of the original matrix is known once Gaussian elimination is performed.
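A Python sketch of this approach (our own illustrative code): reduce to upper triangular form with partial pivoting, track how row swaps flip the sign, then multiply the diagonal.

```python
def gauss_det(a):
    """Determinant via Gaussian elimination of a matrix given as nested lists."""
    a = [row[:] for row in a]         # work on a copy
    n, sign, det = len(a), 1, 1.0
    for k in range(n):
        # Pivot: bring the row with the largest entry in column k into place.
        p = max(range(k, n), key=lambda r: abs(a[r][k]))
        if a[p][k] == 0:
            return 0.0                # rank < n, so the determinant is zero
        if p != k:
            a[k], a[p] = a[p], a[k]
            sign = -sign              # a row swap negates the determinant
        for r in range(k + 1, n):     # eliminate below the pivot
            factor = a[r][k] / a[k][k]
            for c in range(k, n):
                a[r][c] -= factor * a[k][c]
        det *= a[k][k]
    return sign * det

print(gauss_det([[-2.0, 2, -3], [-1, 1, 3], [2, 0, -1]]))  # 18.0
```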
It is also possible to expand a determinant along a row or column using Laplace's formula, which is efficient for relatively small matrices. To expand along row i, say, we write

- det(A) = Σj=1..n Ai,j · Ci,j,

where the Ci,j are the cofactors, i.e. Ci,j is (−1)^(i+j) times the minor Mi,j, the determinant of the matrix that results from A by removing the i-th row and the j-th column, and n is the number of rows of A.
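A recursive Python sketch of Laplace expansion along the first row (exponential time, so only sensible for small matrices; laplace_det is our own name):

```python
def laplace_det(a):
    """Cofactor expansion along row 0 of a matrix given as nested lists."""
    n = len(a)
    if n == 1:
        return a[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j+1:] for row in a[1:]]  # drop row 0, column j
        total += a[0][j] * (-1) ** j * laplace_det(minor)
    return total

print(laplace_det([[-2, 2, -3], [-1, 1, 3], [2, 0, -1]]))  # 18
```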
Example
The following compares various ways of calculating the determinant of the three-by-three matrix

- A = [−2 2 −3; −1 1 3; 2 0 −1].

Rule of Sarrus:
- det(A) = (−2)·1·(−1) + 2·3·2 + (−3)·(−1)·0 − (−3)·1·2 − (−2)·3·0 − 2·(−1)·(−1) = 2 + 12 + 0 + 6 − 0 − 2 = 18.

Laplace expansion, along the second column (for ease of calculation, it is best to choose a row or column with many zeros):
- det(A) = −2 · det([−1 3; 2 −1]) + 1 · det([−2 −3; 2 −1]) − 0 = −2 · (1 − 6) + (2 + 6) = 10 + 8 = 18.

Gauss algorithm:
- B is obtained from A by adding −1/2 × the first row to the second, so that det(A) = det(B).
- C is obtained from B by adding the first row to the third, so that det(C) = det(B).
- D is obtained from C by exchanging the second and third rows, so that det(D) = −det(C); here D = [−2 2 −3; 0 2 −4; 0 0 4.5].
- The determinant of the (upper) triangular matrix D is the product of its entries on the main diagonal: (−2) · 2 · 4.5 = −18. Therefore det(A) = −det(D) = +18.
Further properties
The determinant of the identity matrix is one: det(I) = 1.
The determinant of an n-by-n matrix with rank less than n is zero.
In the sequel, let A and B be any n-by-n matrices.
The determinant is a multiplicative map in the sense that the determinant of a matrix product equals the product of the determinants of the factors:
- det(AB) = det(A)det(B).
This property implies the properties characterizing the determinant above, since any transformation as above can be realized by multiplying the given matrix with an elementary matrix, whose determinant is known. This product formula is generalized by the Cauchy-Binet formula to products of non-square matrices.
A matrix over a commutative ring R is invertible if and only if its determinant is a unit in R. In particular, if A is a matrix over a field such as the real or complex numbers, then A is invertible if and only if det(A) is not zero. In this case
- det(A−1) = det(A)−1.
An important implication of the previous properties is the following: if A and B are similar, i.e., if there exists an invertible matrix X such that A = X−1BX, then
- det(A) = det(X)−1det(BX) = det(X)−1det(B)det(X) = det(B)det(X)−1det(X) = det(B).
This means that the determinant is a similarity invariant. Because of this, the determinant of some linear transformation T : V → V for some finite dimensional vector space V is independent of the basis for V. The relationship is one-way, however: there exist matrices which have the same determinant but are not similar.
Multiplying a matrix by a number r affects the determinant as follows:
- det(rA) = r^n · det(A).
Expressed differently: the vectors v1,...,vn in Rn form a basis if and only if det(v1,...,vn) is non-zero.
A matrix and its transpose have the same determinant:
- det(AT) = det(A).
The determinants of a complex matrix and of its conjugate transpose are conjugate:
- det(A∗) = det(A)∗.
This generalizes the previous property since the conjugate transpose is identical to the transpose for a real matrix.
If A is a square n-by-n matrix with real or complex entries and if λ1, ..., λn are the (complex) eigenvalues of A listed according to their algebraic multiplicities, then

- det(A) = λ1 · λ2 · ... · λn.
This follows from the fact that A is always similar to its Jordan normal form, an upper triangular matrix with the eigenvalues on the main diagonal.
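A quick numerical check in Python, assuming numpy:

```python
import numpy as np

A = np.array([[0.0, -1.0], [1.0, 0.0]])   # rotation by 90 degrees
eigenvalues = np.linalg.eigvals(A)         # 1j and -1j
print(np.prod(eigenvalues).real)           # 1.0, the product of the eigenvalues
print(np.linalg.det(A))                    # 1.0
```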
Sylvester's determinant theorem
Sylvester's determinant theorem states that for any m-by-n matrices A and B, det(Idm + ABT) = det (Idn + BT·A). For the case of column vectors a and b, i.e., m×1-matrices, this reads
- det(Idm + a·bT) = 1 + bT·a.
More generally, for any invertible m-by-m matrix X,
- det(X + abT) = det (X) (1 + bT·X−1·a).[1]
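A numerical sanity check of the rank-one case in Python, assuming numpy:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((4, 1))
b = rng.standard_normal((4, 1))
lhs = np.linalg.det(np.eye(4) + a @ b.T)   # det(Id + a b^T)
rhs = 1 + (b.T @ a).item()                 # 1 + b^T a
print(np.isclose(lhs, rhs))                # True
```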
Block matrices
Suppose A, B, C, and D are n×n-, n×m-, m×n-, and m×m-matrices, respectively. Then

- det([A 0; C D]) = det(A) · det(D) = det([A B; 0 D]).

This can be seen from the Leibniz formula or by induction on n. When A is invertible, employing the identity

- [A B; C D] = [A 0; C I] · [I A−1B; 0 D − CA−1B]

leads to

- det([A B; C D]) = det(A) · det(D − CA−1B).
A similar identity with det(D) factored out can be derived analogously.[2]
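A numerical check of the last factorization in Python, assuming numpy and an invertible upper-left block A:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3)) + 3 * np.eye(3)  # kept well away from singular
B = rng.standard_normal((3, 2))
C = rng.standard_normal((2, 3))
D = rng.standard_normal((2, 2))

M = np.block([[A, B], [C, D]])
schur = D - C @ np.linalg.inv(A) @ B             # the Schur complement of A
print(np.isclose(np.linalg.det(M),
                 np.linalg.det(A) * np.linalg.det(schur)))  # True
```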
Relationship to trace
From this connection between the determinant and the eigenvalues, one can derive a connection between the trace function, the matrix exponential function, and the determinant:
- det(exp(A)) = exp(tr(A)).
Under the substitution A ↦ log(A), the matrix logarithm of A, the above equation yields det(A) = exp(tr(log(A))), which is closely related to the Fredholm determinant. Similarly, tr(A) = log(det(exp(A))).
For small n, these read: det(A) = tr(A) for n = 1; det(A) = 1/2 (tr(A)^2 − tr(A^2)) for n = 2; and det(A) = 1/6 (tr(A)^3 − 3 tr(A) tr(A^2) + 2 tr(A^3)) for n = 3. They are closely related to Newton's identities.
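Numerical checks in Python, assuming numpy and scipy:

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[1.0, 2.0], [0.5, -1.0]])
# det(exp(A)) = exp(tr(A)); here tr(A) = 0, so both sides equal 1.
print(np.isclose(np.linalg.det(expm(A)), np.exp(np.trace(A))))       # True
# The n = 2 formula: det(A) = (tr(A)^2 - tr(A^2)) / 2.
print(np.isclose(np.linalg.det(A),
                 0.5 * (np.trace(A)**2 - np.trace(A @ A))))          # True
```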
Derivative
By definition, e.g. using the Leibniz formula, the determinant of a real (or, analogously, complex) square matrix is a polynomial function from Rn×n to R. As such it is everywhere differentiable. Its derivative can be expressed using Jacobi's formula:
- d det(A) = tr(adj(A) dA)
where adj(A) denotes the adjugate of A. In particular, if A is invertible, we have
- d det(A) = det(A) tr(A−1 dA).
Expressed in terms of the entries of A, these read

- ∂det(A) / ∂Ai,j = adj(A)j,i = det(A) · (A−1)j,i.
Yet another equivalent formulation is
- det(A + εX) − det(A) = tr(adj(A)X) ε + O(ε^2) = det(A) tr(A−1X) ε + O(ε^2),
using big O notation. The special case where A = Id, the identity matrix, yields
- det(Id + εX) = 1 + tr(X) ε + O(ε^2).
This identity is used in describing the tangent space of certain matrix Lie groups.
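A finite-difference check of Jacobi's formula in Python, assuming numpy; the adjugate is obtained here as det(A)·A−1, which is valid because this A is invertible:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3))
X = rng.standard_normal((3, 3))
eps = 1e-6

adjA = np.linalg.det(A) * np.linalg.inv(A)   # adjugate of an invertible A
numeric = (np.linalg.det(A + eps * X) - np.linalg.det(A)) / eps
exact = np.trace(adjA @ X)                   # Jacobi: d det(A) = tr(adj(A) dA)
print(np.isclose(numeric, exact, rtol=1e-4, atol=1e-6))   # True
```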
A useful property in the case of 3 × 3 matrices is the following: if A is written as A = (a b c), where a, b and c are 3-dimensional column vectors, then the gradient of the determinant with respect to one of the three vectors is the cross product of the other two:

- ∇a det(A) = b × c, ∇b det(A) = c × a, ∇c det(A) = a × b.
Abstract formulation
An n × n square matrix A may be thought of as the coordinate representation of a linear transformation of an n-dimensional vector space V. Given any linear transformation
- A: V → V,
we can define the determinant of A as the determinant of any matrix representation of A. This is a well-defined notion (i.e. independent of a choice of basis) since the determinant is invariant under similarity transformations.
The determinant of a linear transformation can be formulated in a coordinate-free manner, as follows: let V be an n-dimensional vector space (with n finite). The n-th exterior power ΛnV is a one-dimensional vector space whose elements are written v1 ∧ v2 ∧ ... ∧ vn, where each vi is a vector in V and the wedge product ∧ is antisymmetric (i.e., u ∧ v = − v ∧ u, so that in particular u ∧ u = 0). Any linear transformation A : V → V induces a linear transformation of ΛnV by
- v1 ∧ v2 ∧ ... ∧ vn ↦ Av1 ∧ Av2 ∧ ... ∧ Avn.
Since ΛnV is one-dimensional this operation is just multiplication by some scalar that depends on A. This scalar is called the determinant of A. That is, det(A) is defined by the equation
- Av1 ∧ Av2 ∧ ... ∧ Avn = det(A) · v1 ∧ v2 ∧ ... ∧ vn
To see that this definition agrees with the coordinate-dependent definition given above, one can use the characterization of the determinant given above: for example switching two columns changes the parity of the determinant; likewise, permuting the vectors in the exterior product v1 ∧ v2 ∧ ... ∧ vn to v2 ∧ v1 ∧ v3 ∧ ... ∧ vn, say, also alters the parity. Minors of a matrix can also be cast in this setting, by considering lower exterior products ΛkV with k < dim V.
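As a small worked illustration (for n = 2): if A maps e1 ↦ a·e1 + c·e2 and e2 ↦ b·e1 + d·e2, then

- Ae1 ∧ Ae2 = (a·e1 + c·e2) ∧ (b·e1 + d·e2) = ad (e1 ∧ e2) + cb (e2 ∧ e1) = (ad − bc) (e1 ∧ e2),

since e1 ∧ e1 = e2 ∧ e2 = 0 and e2 ∧ e1 = −e1 ∧ e2; this recovers the 2-by-2 formula from above.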
Algorithmic implementation
- The naive method of implementing an algorithm to compute the determinant is to use Laplace's formula for expansion by cofactors. This approach is extremely inefficient in general, however, as it is of order n! (n factorial) for an n×n matrix M.
- An improvement to order n^3 can be achieved by using LU decomposition to write M = LU for triangular matrices L and U. Now, det M = det LU = det L · det U, and since L and U are triangular, the determinant of each is simply the product of its diagonal elements. Alternatively, one can perform the Cholesky decomposition if applicable, or the QR decomposition, and find the determinant in a similar fashion.
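A Python sketch using scipy's LU decomposition (scipy.linalg.lu returns P, L, U with A = P L U; L has unit diagonal):

```python
import numpy as np
from scipy.linalg import lu

A = np.array([[-2.0, 2, -3], [-1, 1, 3], [2, 0, -1]])
P, L, U = lu(A)
# det(P) is +1 or -1 depending on the row permutation.
det = np.linalg.det(P) * np.prod(np.diag(L)) * np.prod(np.diag(U))
print(det)   # 18.0, up to rounding
```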
- Since the definition of the determinant does not need divisions, a question arises: do fast algorithms exist that do not need divisions? This is especially interesting for matrices over rings. Indeed, algorithms with run-time proportional to n^4 exist. An algorithm of Mahajan and Vinay, and of Berkowitz, is based on closed ordered walks ("clows" for short). It computes more products than the determinant definition requires, but some of these products cancel, and the sum of these products can be computed more efficiently. The final algorithm looks very much like an iterated product of triangular matrices.
- What is not often discussed is the so-called "bit complexity" of the problem, i.e. how many bits of accuracy are needed to store intermediate values. For example, using Gaussian elimination, one can reduce the matrix to upper triangular form, then multiply the main diagonal to get the determinant (this is essentially a special case of the LU decomposition as above), but a quick calculation will show that the bit size of intermediate values could potentially become exponential[citation needed]. One could discuss when it is appropriate to round intermediate values, but an elegant way of calculating the determinant uses the Bareiss algorithm, an exact-division method based on Sylvester's identity, to give a run time of order n^3 and bit complexity roughly the bit size of the original entries in the matrix times n (a sketch follows below).
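A Python sketch of the Bareiss fraction-free elimination for integer matrices (our own illustrative code): every division below is exact, so all intermediate values stay integers.

```python
def bareiss_det(a):
    """Determinant of an integer matrix (nested lists) by Bareiss elimination."""
    a = [row[:] for row in a]
    n, sign, prev = len(a), 1, 1
    for k in range(n - 1):
        if a[k][k] == 0:              # pivot: find a nonzero entry below
            swap = next((r for r in range(k + 1, n) if a[r][k] != 0), None)
            if swap is None:
                return 0
            a[k], a[swap] = a[swap], a[k]
            sign = -sign
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                # Exact by Sylvester's identity: prev always divides evenly.
                a[i][j] = (a[i][j] * a[k][k] - a[i][k] * a[k][j]) // prev
            a[i][k] = 0
        prev = a[k][k]
    return sign * a[n - 1][n - 1]

print(bareiss_det([[-2, 2, -3], [-1, 1, 3], [2, 0, -1]]))   # 18
```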
History
Historically, determinants were considered without reference to matrices: originally, a determinant was defined as a property of a system of linear equations. The determinant "determines" whether the system has a unique solution (which occurs precisely if the determinant is non-zero). In this sense, determinants were first used in the Chinese mathematics textbook The Nine Chapters on the Mathematical Art (九章算術, Chinese scholars, around the 3rd century BC). In Europe, two-by-two determinants were considered by Cardano at the end of the 16th century and larger ones by Leibniz and, in Japan, by Seki about 100 years later.[3][4]
In Japan, determinants were introduced to study the elimination of variables in systems of higher-order algebraic equations, and were used to give a shorthand representation of the resultant. After the first work by Seki in 1683, Laplace's formula was given by two independent groups of scholars: Tanaka and Iseki (算法発揮, Sampo-Hakki, published in 1690), and Seki, Takebe, and Takebe (大成算経, Taisei-Sankei, written at least before 1710). However, doubts have been raised about how far they recognized the determinant as an independent object.
In Europe, Cramer (1750) added to the theory, treating the subject in relation to sets of equations. The recurrent law was first announced by Bézout (1764).
It was Vandermonde (1771) who first recognized determinants as independent functions.[3] Laplace (1772) [5][6] gave the general method of expanding a determinant in terms of its complementary minors: Vandermonde had already given a special case. Immediately following, Lagrange (1773) treated determinants of the second and third order. Lagrange was the first to apply determinants to questions of elimination theory; he proved many special cases of general identities.
Gauss (1801) made the next advance. Like Lagrange, he made much use of determinants in the theory of numbers. He introduced the word determinants (Laplace had used resultant), though not in the present signification, but rather as applied to the discriminant of a quantic. Gauss also arrived at the notion of reciprocal (inverse) determinants, and came very near the multiplication theorem.
The next contributor of importance is Binet (1811, 1812), who formally stated the theorem relating to the product of two matrices of m columns and n rows, which for the special case of m = n reduces to the multiplication theorem. On the same day (November 30, 1812) that Binet presented his paper to the Academy, Cauchy also presented one on the subject. (See Cauchy-Binet formula.) In this he used the word determinant in its present sense,[7][8] summarized and simplified what was then known on the subject, improved the notation, and gave the multiplication theorem with a proof more satisfactory than Binet's.[3][9] With him begins the theory in its generality.
The next important figure was Jacobi[4] (from 1827). He early used the functional determinant which Sylvester later called the Jacobian, and in his memoirs in Crelle for 1841 he specially treats this subject, as well as the class of alternating functions which Sylvester has called alternants. About the time of Jacobi's last memoirs, Sylvester (1839) and Cayley began their work.[10][11]
The study of special forms of determinants has been the natural result of the completion of the general theory. Axisymmetric determinants have been studied by Lebesgue, Hesse, and Sylvester; persymmetric determinants by Sylvester and Hankel; circulants by Catalan, Spottiswoode, Glaisher, and Scott; skew determinants and Pfaffians, in connection with the theory of orthogonal transformation, by Cayley; continuants by Sylvester; Wronskians (so called by Muir) by Christoffel and Frobenius; compound determinants by Sylvester, Reiss, and Picquet; Jacobians and Hessians by Sylvester; and symmetric gauche determinants by Trudi. Of the text-books on the subject Spottiswoode's was the first. In America, Hanus (1886), Weld (1893), and Muir/Metzler (1933) published treatises.
Notes
- ^ Some authors also define determinants of 0-by-0 matrices. There is only one 0-by-0 matrix and its determinant is one. See de Boor 1990.
- ^ Proofs can be found at http://web.archive.org/web/20080113084601/http://www.ee.ic.ac.uk/hp/staff/www/matrix/proof003.html
- ^ These identities were taken from http://www.ee.ic.ac.uk/hp/staff/www/matrix/proof003.html
- ^ a b c Campbell, H: "Linear Algebra With Applications", pages 111-112. Appleton Century Crofts, 1971
- ^ a b Eves, H: "An Introduction to the History of Mathematics", pages 405, 493–494, Saunders College Publishing, 1990.
- ^ Expansion of determinants in terms of minors: Laplace, Pierre-Simon (de), "Recherches sur le calcul intégral et sur le système du monde," Histoire de l'Académie Royale des Sciences (Paris), seconde partie, pages 267-376 (1772).
- ^ Muir, Sir Thomas, The Theory of Determinants in the Historical Order of Development [London, England: Macmillan and Co., Ltd., 1906].
- ^ The first use of the word "determinant" in the modern sense appeared in: Cauchy, Augustin-Louis, "Mémoire sur les fonctions qui ne peuvent obtenir que deux valeurs égales et des signes contraires par suite des transpositions opérées entre les variables qu'elles renferment," which was first read at the Institut de France in Paris on November 30, 1812, and which was subsequently published in the Journal de l'École Polytechnique, Cahier 17, Tome 10, pages 29-112 (1815).
- ^ Origins of mathematical terms: http://jeff560.tripod.com/d.html
- ^ History of matrices and determinants: http://www-history.mcs.st-and.ac.uk/history/HistTopics/Matrices_and_determinants.html
- ^ The first use of vertical lines to denote a determinant appeared in: Cayley, Arthur "On a theorem in the geometry of position," Cambridge Mathematical Journal, vol. 2, pages 267-271 (1841).
- ^ History of matrix notation: http://jeff560.tripod.com/matrices.html
References
- de Boor, Carl (1990), "An empty exercise", ACM SIGNUM Newsletter 25 (2): 3–7, http://ftp.cs.wisc.edu/Approx/empty.pdf.
External links
- MIT Linear Algebra Lecture on Determinants
- Linear Systems chapter from "Fundamental Problems of Algorithmic Algebra": Chee Yap's chapter on linear systems, describing implementation aspects of determinant computation.
- Mahajan, Meena and V. Vinay, “Determinant: Combinatorics, Algorithms, and Complexity”, Chicago Journal of Theoretical Computer Science, v. 1997 article 5 (1997).
- Online Matrix Calculator.
- Linear algebra: determinants. Compute determinants of matrices up to order 6 using Laplace expansion along a row or column of your choice.
- Matrices and Linear Algebra on the Earliest Uses Pages