Difference between revisions of "Matrix Operations"
(13 intermediate revisions by the same user not shown) | |||
Line 83: | Line 83: | ||
===Matrix multiplication=== | ===Matrix multiplication=== | ||
[[File:MatrixMultiplication.png|thumb|300px|Schematic depiction of the matrix product '''AB''' of two matrices '''A''' and '''B'''.]] | [[File:MatrixMultiplication.png|thumb|300px|Schematic depiction of the matrix product '''AB''' of two matrices '''A''' and '''B'''.]] | ||
− | ''Multiplication'' of two matrices is defined if and only if the number of columns of the left matrix is the same as the number of rows of the right matrix. If '''A''' is an ''m''-by-''n'' matrix and '''B''' is an ''n''-by-''p'' matrix, then their ''matrix product'' '''AB''' is the ''m''-by-''p'' matrix whose entries are given by | + | ''Multiplication'' of two matrices is defined if and only if the number of columns of the left matrix is the same as the number of rows of the right matrix. If '''A''' is an ''m''-by-''n'' matrix and '''B''' is an ''n''-by-''p'' matrix, then their ''matrix product'' '''AB''' is the ''m''-by-''p'' matrix whose entries are given by the dot product of the corresponding row of '''A''' and the corresponding column of '''B''': |
:<span id="matrix_product"><math>[\mathbf{AB}]_{i,j} = a_{i,1}b_{1,j} + a_{i,2}b_{2,j} + \cdots + a_{i,n}b_{n,j} = \sum_{r=1}^n a_{i,r}b_{r,j},</math></span> | :<span id="matrix_product"><math>[\mathbf{AB}]_{i,j} = a_{i,1}b_{1,j} + a_{i,2}b_{2,j} + \cdots + a_{i,n}b_{n,j} = \sum_{r=1}^n a_{i,r}b_{r,j},</math></span> | ||
− | where 1 ≤ ''i'' ≤ ''m'' and 1 ≤ ''j'' ≤ ''p''. For example, the underlined entry 2340 in the product is calculated as | + | where 1 ≤ ''i'' ≤ ''m'' and 1 ≤ ''j'' ≤ ''p''. For example, the underlined entry 2340 in the product is calculated as (2 × 1000) + (3 × 100) + (4 × 10) = 2340: |
:<math> | :<math> | ||
\begin{align} | \begin{align} | ||
Line 165: | Line 165: | ||
A '''principal submatrix''' is a square submatrix obtained by removing certain rows and columns. The definition varies from author to author. According to some authors, a principal submatrix is a submatrix in which the set of row indices that remain is the same as the set of column indices that remain. Other authors define a principal submatrix as one in which the first ''k'' rows and columns, for some number ''k'', are the ones that remain; this type of submatrix has also been called a '''leading principal submatrix'''. | A '''principal submatrix''' is a square submatrix obtained by removing certain rows and columns. The definition varies from author to author. According to some authors, a principal submatrix is a submatrix in which the set of row indices that remain is the same as the set of column indices that remain. Other authors define a principal submatrix as one in which the first ''k'' rows and columns, for some number ''k'', are the ones that remain; this type of submatrix has also been called a '''leading principal submatrix'''. | ||
+ | |||
+ | ==Linear equations== | ||
+ | Matrices can be used to compactly write and work with multiple linear equations, that is, systems of linear equations. For example, if '''A''' is an ''m''-by-''n'' matrix, '''x''' designates a column vector (that is, ''n''×1-matrix) of ''n'' variables ''x''<sub>1</sub>, ''x''<sub>2</sub>, ..., ''x''<sub>''n''</sub>, and '''b''' is an ''m''×1-column vector, then the matrix equation | ||
+ | :<math>\mathbf{Ax} = \mathbf{b}</math> | ||
+ | |||
+ | is equivalent to the system of linear equations | ||
+ | :<math>\begin{align} | ||
+ | a_{1,1}x_1 + a_{1,2}x_2 + &\cdots + a_{1,n}x_n = b_1 \\ | ||
+ | &\ \ \vdots \\ | ||
+ | a_{m,1}x_1 + a_{m,2}x_2 + &\cdots + a_{m,n}x_n = b_m | ||
+ | \end{align}</math> | ||
+ | |||
+ | Using matrices, this can be solved more compactly than would be possible by writing out all the equations separately. If ''n'' = ''m'' and the equations are independent, then this can be done by writing | ||
+ | :<math>\mathbf{x} = \mathbf{A}^{-1} \mathbf{b}</math> | ||
+ | |||
+ | where '''A'''<sup>−1</sup> is the inverse matrix of '''A'''. If '''A''' has no inverse, solutions—if any—can be found using its generalized inverse. | ||
+ | |||
+ | ==Linear transformations== | ||
+ | [[File:Area parallellogram as determinant.svg|thumb|right|The vectors represented by a 2-by-2 matrix correspond to the sides of a unit square transformed into a parallelogram.]] | ||
+ | Matrices and matrix multiplication reveal their essential features when related to ''linear transformations'', also known as ''linear maps''. <span id="linear_maps">A real ''m''-by-''n'' matrix '''A''' gives rise to a linear transformation '''R'''<sup>''n''</sup> → '''R'''<sup>''m''</sup> mapping each vector '''x''' in '''R'''<sup>''n''</sup> to the (matrix) product '''Ax''', which is a vector in '''R'''<sup>''m''</sup>. Conversely, each linear transformation ''f'': '''R'''<sup>''n''</sup> → '''R'''<sup>''m''</sup> arises from a unique ''m''-by-''n'' matrix '''A''': explicitly, the (''i'', ''j'')-entry of '''A''' is the ''i''<sup>th</sup> coordinate of ''f''('''e'''<sub>''j''</sub>), where '''e'''<sub>''j''</sub> = (0,...,0,1,0,...,0) is the unit vector with 1 in the ''j''<sup>th</sup> position and 0 elsewhere.</span> The matrix '''A''' is said to represent the linear map ''f'', and '''A''' is called the ''transformation matrix'' of ''f''. | ||
+ | |||
+ | For example, the 2×2 matrix | ||
+ | :<math>\mathbf{A} = \begin{bmatrix} a & c\\b & d \end{bmatrix}</math> | ||
+ | |||
+ | can be viewed as the transform of the unit square into a parallelogram with vertices at (0, 0), (''a'', ''b''), (''a'' + ''c'', ''b'' + ''d''), and (''c'', ''d''). The parallelogram pictured at the right is obtained by multiplying '''A''' with each of the column vectors <math>\begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \end{bmatrix}</math>, and <math>\begin{bmatrix}0 \\ 1\end{bmatrix}</math> in turn. These vectors define the vertices of the unit square. | ||
+ | |||
+ | The following table shows several 2×2 real matrices with the associated linear maps of '''R'''<sup>2</sup>. The blue original is mapped to the green grid and shapes. The origin (0,0) is marked with a black point. | ||
+ | {| class="wikitable" style="text-align:center; margin:1em auto 1em auto;" | ||
+ | |- | ||
+ | | Horizontal shear<br>with ''m'' = 1.25. | ||
+ | | Reflection through the vertical axis | ||
+ | | Squeeze mapping<br>with ''r'' = 3/2 | ||
+ | | Scaling<br>by a factor of 3/2 | ||
+ | |<span id="rotation_matrix">Rotation<br>by <math> \pi</math>/6 = 30°</span> | ||
+ | |- | ||
+ | | <math>\begin{bmatrix} | ||
+ | 1 & 1.25 \\ | ||
+ | 0 & 1 | ||
+ | \end{bmatrix}</math> | ||
+ | | <math>\begin{bmatrix} | ||
+ | -1 & 0 \\ | ||
+ | 0 & 1 | ||
+ | \end{bmatrix}</math> | ||
+ | | <math>\begin{bmatrix} | ||
+ | \frac{3}{2} & 0 \\ | ||
+ | 0 & \frac{2}{3} | ||
+ | \end{bmatrix}</math> | ||
+ | |<math>\begin{bmatrix} | ||
+ | \frac{3}{2} & 0 \\ | ||
+ | 0 & \frac{3}{2} | ||
+ | \end{bmatrix}</math> | ||
+ | |<math>\begin{bmatrix} | ||
+ | \cos\left(\frac{\pi}{6}\right) & -\sin\left(\frac{\pi}{6}\right) \\ | ||
+ | \sin\left(\frac{\pi}{6}\right) & \cos\left(\frac{\pi}{6}\right) | ||
+ | \end{bmatrix}</math> | ||
+ | |- | ||
+ | | width="20%" | [[File:VerticalShear m=1.25.svg|175px]] | ||
+ | | width="20%" | [[File:Flip map.svg|150px]] | ||
+ | | width="20%" | [[File:Squeeze r=1.5.svg|150px]] | ||
+ | | width="20%" | [[File:Scaling by 1.5.svg|125px]] | ||
+ | | width="20%" | [[File:Rotation by pi over 6.svg|125px]] | ||
+ | |} | ||
+ | |||
+ | Under the 1-to-1 correspondence between matrices and linear maps, matrix multiplication corresponds to composition of maps: if a ''k''-by-''m'' matrix '''B''' represents another linear map ''g'': '''R'''<sup>''m''</sup> → '''R'''<sup>''k''</sup>, then the composition ''g'' ∘ ''f'' is represented by '''BA''' since | ||
+ | :(''g'' ∘ ''f'')('''x''') = ''g''(''f''('''x''')) = ''g''('''Ax''') = '''B'''('''Ax''') = ('''BA''')'''x'''. | ||
+ | |||
+ | The last equality follows from the above-mentioned associativity of matrix multiplication. | ||
+ | |||
+ | The rank of a matrix '''A''' is the maximum number of linearly independent row vectors of the matrix, which is the same as the maximum number of linearly independent column vectors. Equivalently it is the dimension of the image of the linear map represented by '''A'''. The rank–nullity theorem states that the dimension of the kernel of a matrix plus the rank equals the number of columns of the matrix. | ||
+ | |||
+ | ==Square matrix== | ||
+ | A square matrix is a matrix with the same number of rows and columns. An ''n''-by-''n'' matrix is known as a square matrix of order ''n.'' Any two square matrices of the same order can be added and multiplied. | ||
+ | The entries ''a''<sub>''ii''</sub> form the main diagonal of a square matrix. They lie on the imaginary line that runs from the top left corner to the bottom right corner of the matrix. | ||
+ | |||
+ | ===Main types=== | ||
+ | :{| class="wikitable" style="float:right; margin:0ex 0ex 2ex 2ex;" | ||
+ | |- | ||
+ | ! Name !! Example with ''n'' = 3 | ||
+ | |- | ||
+ | | Diagonal matrix || style="text-align:center;" | <math> | ||
+ | \begin{bmatrix} | ||
+ | a_{11} & 0 & 0 \\ | ||
+ | 0 & a_{22} & 0 \\ | ||
+ | 0 & 0 & a_{33} \\ | ||
+ | \end{bmatrix} | ||
+ | </math> | ||
+ | |- | ||
+ | | Lower triangular matrix || style="text-align:center;" | <math> | ||
+ | \begin{bmatrix} | ||
+ | a_{11} & 0 & 0 \\ | ||
+ | a_{21} & a_{22} & 0 \\ | ||
+ | a_{31} & a_{32} & a_{33} \\ | ||
+ | \end{bmatrix} | ||
+ | </math> | ||
+ | |- | ||
+ | | Upper triangular matrix || style="text-align:center;" | <math> | ||
+ | \begin{bmatrix} | ||
+ | a_{11} & a_{12} & a_{13} \\ | ||
+ | 0 & a_{22} & a_{23} \\ | ||
+ | 0 & 0 & a_{33} \\ | ||
+ | \end{bmatrix} | ||
+ | </math> | ||
+ | |} | ||
+ | |||
+ | ====Diagonal and triangular matrix==== | ||
+ | If all entries of '''A''' below the main diagonal are zero, '''A''' is called an ''upper triangular matrix''. Similarly, if all entries of '''A''' above the main diagonal are zero, '''A''' is called a ''lower triangular matrix''. If all entries outside the main diagonal are zero, '''A''' is called a ''diagonal matrix''. | ||
+ | |||
+ | ====Identity matrix==== | ||
+ | The ''identity matrix'' '''I'''<sub>''n''</sub> of size ''n'' is the ''n''-by-''n'' matrix in which all the elements on the main diagonal are equal to 1 and all other elements are equal to 0, for example, | ||
+ | :<math> | ||
+ | \mathbf{I}_1 = \begin{bmatrix} 1 \end{bmatrix}, | ||
+ | \ \mathbf{I}_2 = \begin{bmatrix} | ||
+ | 1 & 0 \\ | ||
+ | 0 & 1 | ||
+ | \end{bmatrix}, | ||
+ | \ \ldots , | ||
+ | \ \mathbf{I}_n = \begin{bmatrix} | ||
+ | 1 & 0 & \cdots & 0 \\ | ||
+ | 0 & 1 & \cdots & 0 \\ | ||
+ | \vdots & \vdots & \ddots & \vdots \\ | ||
+ | 0 & 0 & \cdots & 1 | ||
+ | \end{bmatrix} | ||
+ | </math> | ||
+ | It is a square matrix of order ''n'', and also a special kind of diagonal matrix. It is called an identity matrix because multiplication with it leaves a matrix unchanged: | ||
+ | :'''AI'''<sub>''n''</sub> = '''I'''<sub>''m''</sub>'''A''' = '''A''' for any ''m''-by-''n'' matrix '''A'''. | ||
+ | |||
+ | A nonzero scalar multiple of an identity matrix is called a ''scalar'' matrix. If the matrix entries come from a field, the scalar matrices form a group, under matrix multiplication, that is isomorphic to the multiplicative group of nonzero elements of the field. | ||
+ | |||
+ | ====Symmetric or skew-symmetric matrix==== | ||
+ | A square matrix '''A''' that is equal to its transpose, that is, '''A''' = '''A'''<sup>T</sup>, is a symmetric matrix. If instead, '''A''' is equal to the negative of its transpose, that is, '''A''' = −'''A'''<sup>T</sup>, then '''A''' is a skew-symmetric matrix. In complex matrices, symmetry is often replaced by the concept of Hermitian matrices, which satisfy '''A'''<sup>∗</sup> = '''A''', where the star or asterisk denotes the conjugate transpose of the matrix, that is, the transpose of the complex conjugate of '''A'''. | ||
+ | |||
+ | By the spectral theorem, real symmetric matrices and complex Hermitian matrices have an eigenbasis; that is, every vector is expressible as a linear combination of eigenvectors. In both cases, all eigenvalues are real. This theorem can be generalized to infinite-dimensional situations related to matrices with infinitely many rows and columns. | ||
+ | |||
+ | ====Invertible matrix and its inverse==== | ||
+ | A square matrix '''A''' is called ''invertible'' or ''non-singular'' if there exists a matrix '''B''' such that | ||
+ | :'''AB''' = '''BA''' = '''I'''<sub>''n''</sub> , | ||
+ | where '''I'''<sub>''n''</sub> is the ''n''×''n'' identity matrix with 1s on the main diagonal and 0s elsewhere. If '''B''' exists, it is unique and is called the ''inverse matrix'' of '''A''', denoted '''A'''<sup>−1</sup>. | ||
+ | |||
+ | ====Definite matrix==== | ||
+ | {| class="wikitable" style="float:right; text-align:center; margin:0ex 0ex 2ex 2ex;" | ||
+ | |- | ||
+ | ! Positive definite matrix !! Indefinite matrix | ||
+ | |- | ||
+ | | <math>\begin{bmatrix} | ||
+ | \frac{1}{4} & 0 \\ | ||
+ | 0 & 1 \\ | ||
+ | \end{bmatrix}</math> | ||
+ | | <math>\begin{bmatrix} | ||
+ | \frac{1}{4} & 0 \\ | ||
+ | 0 & -\frac{1}{4} | ||
+ | \end{bmatrix}</math> | ||
+ | |- | ||
+ | | ''Q''(''x'', ''y'') = <math> \tfrac{1}{4}</math> ''x''<sup>2</sup> + ''y''<sup>2</sup> | ||
+ | | ''Q''(''x'', ''y'') = <math> \tfrac{1}{4}</math> ''x''<sup>2</sup> - <math> \tfrac {1}{4}</math> ''y''<sup>2</sup> | ||
+ | |- | ||
+ | | [[File:Ellipse in coordinate system with semi-axes labelled.svg|150px]] <br>Points such that ''Q''(''x'',''y'')=1 <br> (Ellipse). | ||
+ | | [[File:Hyperbola2 SVG.svg|150px]] <br> Points such that ''Q''(''x'',''y'')=1 <br> (Hyperbola). | ||
+ | |} | ||
+ | A symmetric real matrix {{math|'''A'''}} is called ''positive-definite'' if the associated quadratic form | ||
+ | :<span id="quadratic forms">''f''('''x''') = '''x'''<sup>T</sup>'''A '''x'''</span> | ||
+ | has a positive value for every nonzero vector '''x''' in '''R'''<sup>''n''</sup>. If ''f''('''x''') only yields negative values then {{math|'''A'''}} is ''negative-definite''; if {{math|''f''}} does produce both negative and positive values then {{math|'''A'''}} is ''indefinite''. If the quadratic form {{math|''f''}} yields only non-negative values (positive or zero), the symmetric matrix is called ''positive-semidefinite'' (or if only non-positive values, then negative-semidefinite); hence the matrix is indefinite precisely when it is neither positive-semidefinite nor negative-semidefinite. | ||
+ | |||
+ | A symmetric matrix is positive-definite if and only if all its eigenvalues are positive, that is, the matrix is positive-semidefinite and it is invertible. The table at the right shows two possibilities for 2-by-2 matrices. | ||
+ | |||
+ | Allowing as input two different vectors instead yields the bilinear form associated to {{math|'''A'''}}: | ||
+ | :''B''<sub>'''A'''</sub> ('''x''', '''y''') = '''x'''<sup>T</sup>'''Ay'''. | ||
+ | |||
+ | In the case of complex matrices, the same terminology and result apply, with ''symmetric matrix'', ''quadratic form'', ''bilinear form'', and ''transpose'' '''x'''<sup>T</sup> replaced respectively by Hermitian matrix, Hermitian form, sesquilinear form, and conjugate transpose '''x'''<sup>H</sup>. | ||
+ | |||
+ | ====Orthogonal matrix==== | ||
+ | An ''orthogonal matrix'' is a square matrix with real entries whose columns and rows are orthogonal unit vectors (that is, orthonormal vectors). Equivalently, a matrix '''A''' is orthogonal if its transpose is equal to its inverse: | ||
+ | :<math>\mathbf{A}^\mathrm{T}=\mathbf{A}^{-1}, \,</math> | ||
+ | which entails | ||
+ | :<math>\mathbf{A}^\mathrm{T} \mathbf{A} = \mathbf{A} \mathbf{A}^\mathrm{T} = \mathbf{I}_n,</math> | ||
+ | where '''I'''<sub>''n''</sub> is the identity matrix of size ''n''. | ||
+ | |||
+ | An orthogonal matrix '''A''' is necessarily invertible (with inverse '''A'''<sup>−1</sup> = '''A'''<sup>T</sup>), unitary ('''A'''<sup>−1</sup> = '''A'''*), and normal ('''A'''*'''A''' = '''AA'''*). The determinant of any orthogonal matrix is either {{math|+1}} or {{math|−1}}. A ''special orthogonal matrix'' is an orthogonal matrix with determinant +1. As a linear transformation, every orthogonal matrix with determinant {{math|+1}} is a pure rotation without reflection, i.e., the transformation preserves the orientation of the transformed structure, while every orthogonal matrix with determinant {{math|−1}} reverses the orientation, i.e., is a composition of a pure reflection and a (possibly null) rotation. The identity matrices have determinant {{math|1}}, and are pure rotations by an angle zero. | ||
+ | |||
+ | The complex analogue of an orthogonal matrix is a unitary matrix. | ||
+ | |||
+ | ===Main operations=== | ||
+ | |||
+ | ====Trace==== | ||
+ | The trace, tr('''A'''), of a square matrix '''A''' is the sum of its diagonal entries. While matrix multiplication is not commutative as mentioned above, the trace of the product of two matrices is independent of the order of the factors: | ||
+ | : tr('''AB''') = tr('''BA'''). | ||
+ | This is immediate from the definition of matrix multiplication: | ||
+ | :<math>\operatorname{tr}(\mathbf{AB}) = \sum_{i=1}^m \sum_{j=1}^n a_{ij} b_{ji} = \operatorname{tr}(\mathbf{BA}).</math> | ||
+ | It follows that the trace of the product of more than two matrices is invariant under cyclic permutations of the matrices; however, this does not in general hold for arbitrary permutations (for example, tr('''ABC''') ≠ tr('''BAC''') in general). Also, the trace of a matrix is equal to that of its transpose, that is, | ||
+ | :tr('''A''') = tr('''A'''<sup>T</sup>). | ||
+ | |||
+ | ====Determinant==== | ||
+ | [[File:Determinant example.svg|thumb|300px|right|A linear transformation on '''R'''<sup>2</sup> given by the indicated matrix. The determinant of this matrix is −1, as the area of the green parallelogram at the right is 1, but the map reverses the orientation, since it turns the counterclockwise orientation of the vectors to a clockwise one.]] | ||
+ | |||
+ | The ''determinant'' of a square matrix '''A''' (denoted det('''A''') or |'''A'''|) is a number encoding certain properties of the matrix. A matrix is invertible if and only if its determinant is nonzero. Its absolute value equals the area (in '''R'''<sup>2</sup>) or volume (in '''R'''<sup>3</sup>) of the image of the unit square (or cube), while its sign corresponds to the orientation of the corresponding linear map: the determinant is positive if and only if the orientation is preserved. | ||
+ | |||
+ | The determinant of 2-by-2 matrices is given by | ||
+ | :<math>\det \begin{bmatrix}a&b\\c&d\end{bmatrix} = ad-bc.</math> | ||
+ | The determinant of 3-by-3 matrices involves 6 terms (rule of Sarrus). The more lengthy Leibniz formula generalises these two formulae to all dimensions. | ||
+ | |||
+ | The determinant of a product of square matrices equals the product of their determinants: | ||
+ | :det('''AB''') = det('''A''') · det('''B'''). | ||
+ | Adding a multiple of any row to another row, or a multiple of any column to another column does not change the determinant. Interchanging two rows or two columns affects the determinant by multiplying it by −1. Using these operations, any matrix can be transformed to a lower (or upper) triangular matrix, and for such matrices, the determinant equals the product of the entries on the main diagonal; this provides a method to calculate the determinant of any matrix. Finally, the Laplace expansion expresses the determinant in terms of minors, that is, determinants of smaller matrices. This expansion can be used for a recursive definition of determinants (taking as starting case the determinant of a 1-by-1 matrix, which is its unique entry, or even the determinant of a 0-by-0 matrix, which is 1), that can be seen to be equivalent to the Leibniz formula. Determinants can be used to solve linear systems using Cramer's rule, where the division of the determinants of two related square matrices equates to the value of each of the system's variables. | ||
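The elimination method described above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function name `det` and the use of exact `Fraction` arithmetic are assumptions made here. It reduces the matrix to upper triangular form using only row additions (which leave the determinant unchanged) and row swaps (each of which flips the sign), then multiplies the diagonal entries.

```python
from fractions import Fraction


def det(A):
    """Determinant of a square matrix (list of rows) by row reduction."""
    n = len(A)
    M = [[Fraction(x) for x in row] for row in A]
    sign = 1
    result = Fraction(1)  # the determinant of a 0-by-0 matrix is 1
    for c in range(n):
        # Look for a nonzero pivot in column c, at or below the diagonal.
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)  # no pivot: the matrix is singular
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            sign = -sign  # interchanging two rows multiplies det by -1
        for r in range(c + 1, n):
            # Adding a multiple of one row to another leaves det unchanged.
            f = M[r][c] / M[c][c]
            M[r] = [x - f * y for x, y in zip(M[r], M[c])]
        result *= M[c][c]  # product of the diagonal of the triangular form
    return sign * result


print(det([[1, 2], [3, 4]]))  # -2, matching ad - bc = 1*4 - 2*3
```

Exact rationals avoid the rounding issues that a floating-point version would have to handle with pivoting strategies.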
+ | |||
+ | ====Eigenvalues and eigenvectors==== | ||
+ | A number λ and a non-zero vector '''v''' satisfying | ||
+ | :<math>\mathbf{A}\mathbf{v} = \lambda \mathbf{v}</math> | ||
+ | are called an ''eigenvalue'' and an ''eigenvector'' of '''A''', respectively. The number λ is an eigenvalue of an ''n''×''n''-matrix '''A''' if and only if '''A'''−λ'''I'''<sub>''n''</sub> is not invertible, which is equivalent to | ||
+ | :<math>\det(\mathbf{A}-\lambda \mathbf{I}) = 0.</math> | ||
+ | The polynomial ''p''<sub>'''A'''</sub> in an indeterminate ''X'' given by evaluation of the determinant det(''X'''''I'''<sub>''n''</sub>−'''A''') is called the characteristic polynomial of '''A'''. It is a monic polynomial of degree ''n''. Therefore the polynomial equation ''p''<sub>'''A'''</sub>(λ) = 0 has at most ''n'' different solutions, that is, eigenvalues of the matrix. They may be complex even if the entries of '''A''' are real. According to the Cayley–Hamilton theorem, ''p''<sub>'''A'''</sub>('''A''') = '''0''', that is, the result of substituting the matrix itself into its own characteristic polynomial yields the zero matrix. | ||
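For a 2-by-2 matrix the characteristic polynomial can be written out by hand as λ² − tr('''A''')λ + det('''A'''), so its roots follow from the quadratic formula. The sketch below relies on that standard 2-by-2 identity, which is an assumption not spelled out in the text above; the function names are illustrative.

```python
import cmath


def eigenvalues_2x2(A):
    """Roots of X^2 - tr(A) X + det(A) for a 2-by-2 matrix A."""
    (a, b), (c, d) = A
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)  # complex sqrt: roots may be complex
    return (tr + disc) / 2, (tr - disc) / 2


def cayley_hamilton_residual(A):
    """Entries of A^2 - tr(A) A + det(A) I, which should all be zero."""
    (a, b), (c, d) = A
    tr, det = a + d, a * d - b * c
    A2 = [[a * a + b * c, a * b + b * d], [c * a + d * c, c * b + d * d]]
    I = [[1, 0], [0, 1]]
    return [[A2[i][j] - tr * A[i][j] + det * I[i][j] for j in range(2)]
            for i in range(2)]


print(eigenvalues_2x2([[2, 0], [0, 3]]))   # real eigenvalues 3 and 2
print(eigenvalues_2x2([[0, -1], [1, 0]]))  # a real rotation with eigenvalues ±i
```

The second call illustrates the remark that eigenvalues may be complex even when all entries of the matrix are real.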
== Resources == | == Resources == | ||
Line 170: | Line 379: | ||
* [https://mathresearch.utsa.edu/wikiFiles/MAT1053/Matrices_and_Matrix_Operations/MAT1053_M10.1Matrices_and_Matrix_Operations.pdf Guided Notes] | * [https://mathresearch.utsa.edu/wikiFiles/MAT1053/Matrices_and_Matrix_Operations/MAT1053_M10.1Matrices_and_Matrix_Operations.pdf Guided Notes] | ||
* [https://www.youtube.com/watch?v=0iDPZolrGpE Matrix Addition]. Produced by TA Catherine Sporer, UTSA | * [https://www.youtube.com/watch?v=0iDPZolrGpE Matrix Addition]. Produced by TA Catherine Sporer, UTSA | ||
+ | |||
+ | == Licensing == | ||
+ | Content obtained and/or adapted from: | ||
+ | * [https://en.wikipedia.org/wiki/Matrix_(mathematics) Matrix (mathematics), Wikipedia] under a CC BY-SA license |
Latest revision as of 16:05, 10 January 2022
Basic operations
A number of basic operations can be applied to matrices: matrix addition, scalar multiplication, transposition, matrix multiplication, row operations, and taking a submatrix.
Addition, scalar multiplication, and transposition
Operation | Definition |
---|---|
Addition | The sum A + B of two m-by-n matrices A and B is calculated entrywise: <math>(\mathbf{A} + \mathbf{B})_{i,j} = \mathbf{A}_{i,j} + \mathbf{B}_{i,j}</math>, where 1 ≤ i ≤ m and 1 ≤ j ≤ n. |
Scalar multiplication | The product cA of a number c (also called a scalar in the parlance of abstract algebra) and a matrix A is computed by multiplying every entry of A by c: <math>(c\mathbf{A})_{i,j} = c \cdot \mathbf{A}_{i,j}</math>. This operation is called scalar multiplication, but its result is not named "scalar product" to avoid confusion, since "scalar product" is sometimes used as a synonym for "inner product". |
Transposition | The transpose of an m-by-n matrix A is the n-by-m matrix Aᵀ (also denoted Aᵗʳ or ᵗA) formed by turning rows into columns and vice versa: <math>(\mathbf{A}^\mathrm{T})_{i,j} = \mathbf{A}_{j,i}</math>. |
Familiar properties of numbers extend to these operations of matrices: for example, addition is commutative, that is, the matrix sum does not depend on the order of the summands: A + B = B + A. The transpose is compatible with addition and scalar multiplication, as expressed by (cA)ᵀ = c(Aᵀ) and (A + B)ᵀ = Aᵀ + Bᵀ. Finally, (Aᵀ)ᵀ = A.
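The three operations and the properties just listed can be checked directly. The following is a minimal sketch that stores a matrix as a list of rows; the function names and the sample matrices are illustrative assumptions, not part of the original text.

```python
def add(A, B):
    """Entrywise sum of two matrices of equal size, stored as lists of rows."""
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("matrices must have the same size")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def scale(c, A):
    """Scalar multiple cA: every entry of A multiplied by c."""
    return [[c * a for a in row] for row in A]


def transpose(A):
    """Transpose: the rows of A become the columns of the result."""
    return [list(col) for col in zip(*A)]


A = [[1, 3, 1], [1, 0, 0]]
B = [[0, 0, 5], [7, 5, 0]]

print(add(A, B))     # [[1, 3, 6], [8, 5, 0]]
print(transpose(A))  # [[1, 1], [3, 0], [1, 0]]

# The properties stated above, checked on these sample matrices.
assert add(A, B) == add(B, A)
assert transpose(scale(2, A)) == scale(2, transpose(A))
assert transpose(add(A, B)) == add(transpose(A), transpose(B))
assert transpose(transpose(A)) == A
```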
Matrix multiplication
Multiplication of two matrices is defined if and only if the number of columns of the left matrix is the same as the number of rows of the right matrix. If A is an m-by-n matrix and B is an n-by-p matrix, then their matrix product AB is the m-by-p matrix whose entries are given by the dot product of the corresponding row of A and the corresponding column of B:
<math>[\mathbf{AB}]_{i,j} = a_{i,1}b_{1,j} + a_{i,2}b_{2,j} + \cdots + a_{i,n}b_{n,j} = \sum_{r=1}^n a_{i,r}b_{r,j},</math>
where 1 ≤ i ≤ m and 1 ≤ j ≤ p. For example, an entry 2340 of a product arises from the row (2, 3, 4) of the left matrix and the column (1000, 100, 10) of the right matrix as (2 × 1000) + (3 × 100) + (4 × 10) = 2340.
Matrix multiplication satisfies the rules (AB)C = A(BC) (associativity), and (A + B)C = AC + BC as well as C(A + B) = CA + CB (right and left distributivity, respectively), whenever the size of the matrices is such that the various products are defined. The product AB may be defined without BA being defined, namely if A and B are m-by-n and n-by-k matrices, respectively, and m ≠ k. Even if both products are defined, they generally need not be equal, that is:
- AB ≠ BA.
In other words, matrix multiplication is not commutative, in marked contrast to (rational, real, or complex) numbers, whose product is independent of the order of the factors. An example of two matrices not commuting with each other is:
<math>\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 0 & 3 \end{bmatrix},</math>
whereas
<math>\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 3 & 4 \\ 0 & 0 \end{bmatrix}.</math>
Besides the ordinary matrix multiplication just described, other less frequently used operations on matrices that can be considered forms of multiplication also exist, such as the Hadamard product and the Kronecker product. They arise in solving matrix equations such as the Sylvester equation.
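The entry formula for the ordinary matrix product translates almost literally into code. The sketch below is illustrative only: the function name `matmul` is an assumption, and the sample matrices are chosen here so that one entry reproduces the 2340 computation mentioned above and so that the 2-by-2 pair fails to commute.

```python
def matmul(A, B):
    """Product of an m-by-n matrix A and an n-by-p matrix B (lists of rows)."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("columns of A must match rows of B")
    p = len(B[0])
    # Entry (i, j) is the dot product of row i of A with column j of B.
    return [[sum(A[i][r] * B[r][j] for r in range(n)) for j in range(p)]
            for i in range(len(A))]


A = [[2, 3, 4], [1, 0, 0]]          # 2-by-3
B = [[0, 1000], [1, 100], [0, 10]]  # 3-by-2
print(matmul(A, B))                 # [[3, 2340], [0, 1000]]

P = [[1, 2], [3, 4]]
Q = [[0, 1], [0, 0]]
print(matmul(P, Q))  # [[0, 1], [0, 3]]
print(matmul(Q, P))  # [[3, 4], [0, 0]]

# Associativity holds whenever the sizes are compatible.
assert matmul(matmul(A, B), P) == matmul(A, matmul(B, P))
```

Because the entries are integers, the associativity check is exact; with floating-point entries the two sides could differ by rounding error.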
Row operations
There are three types of row operations:
- row addition, that is, adding a row to another;
- row multiplication, that is, multiplying all entries of a row by a non-zero constant;
- row switching, that is, interchanging two rows of a matrix.
These operations are used in several ways, including solving linear equations and finding matrix inverses.
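The three row operations can be sketched as small functions that return a modified copy of the matrix. The names are illustrative, and the optional `factor` argument of `row_add` (adding a multiple of a row rather than the row itself) is an assumption that goes slightly beyond the wording above; its default of 1 matches the text.

```python
def row_add(M, src, dst, factor=1):
    """Return a copy of M with factor * (row src) added to row dst."""
    R = [row[:] for row in M]
    R[dst] = [d + factor * s for d, s in zip(R[dst], R[src])]
    return R


def row_multiply(M, i, c):
    """Return a copy of M with every entry of row i multiplied by c != 0."""
    if c == 0:
        raise ValueError("the constant must be non-zero")
    R = [row[:] for row in M]
    R[i] = [c * x for x in R[i]]
    return R


def row_switch(M, i, j):
    """Return a copy of M with rows i and j interchanged."""
    R = [row[:] for row in M]
    R[i], R[j] = R[j], R[i]
    return R


M = [[1, 2], [3, 4]]
print(row_add(M, 0, 1))       # [[1, 2], [4, 6]]
print(row_multiply(M, 0, 2))  # [[2, 4], [3, 4]]
print(row_switch(M, 0, 1))    # [[3, 4], [1, 2]]
```

Rows are indexed from 0 here, as is usual in Python.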
Submatrix
A submatrix of a matrix is obtained by deleting any collection of rows and/or columns. For example, from the following 3-by-4 matrix, we can construct a 2-by-3 submatrix by removing row 3 and column 2:
The minors and cofactors of a matrix are found by computing the determinant of certain submatrices.
A principal submatrix is a square submatrix obtained by removing certain rows and columns. The definition varies from author to author. According to some authors, a principal submatrix is a submatrix in which the set of row indices that remain is the same as the set of column indices that remain. Other authors define a principal submatrix as one in which the first k rows and columns, for some number k, are the ones that remain; this type of submatrix has also been called a leading principal submatrix.
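Deleting rows and columns is straightforward to express in code. In this sketch the function names and the sample 3-by-4 matrix are illustrative assumptions; indices are 0-based, so "row 3 and column 2" in the text become index 2 and index 1.

```python
def submatrix(M, drop_rows=(), drop_cols=()):
    """Copy of M without the rows and columns whose 0-based indices are given."""
    return [[x for j, x in enumerate(row) if j not in drop_cols]
            for i, row in enumerate(M) if i not in drop_rows]


def leading_principal_submatrix(M, k):
    """The first k rows and first k columns of M."""
    return [row[:k] for row in M[:k]]


M = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12]]

# Remove row 3 and column 2 (1-based), leaving a 2-by-3 submatrix.
print(submatrix(M, drop_rows={2}, drop_cols={1}))  # [[1, 3, 4], [5, 7, 8]]
print(leading_principal_submatrix(M, 2))           # [[1, 2], [5, 6]]
```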
Linear equations
Matrices can be used to compactly write and work with multiple linear equations, that is, systems of linear equations. For example, if A is an m-by-n matrix, x designates a column vector (that is, an n×1 matrix) of n variables x₁, x₂, ..., xₙ, and b is an m×1 column vector, then the matrix equation
<math>\mathbf{Ax} = \mathbf{b}</math>
is equivalent to the system of linear equations
<math>\begin{align}
a_{1,1}x_1 + a_{1,2}x_2 + &\cdots + a_{1,n}x_n = b_1 \\
&\ \ \vdots \\
a_{m,1}x_1 + a_{m,2}x_2 + &\cdots + a_{m,n}x_n = b_m
\end{align}</math>
Using matrices, this can be solved more compactly than would be possible by writing out all the equations separately. If n = m and the equations are independent, then this can be done by writing
<math>\mathbf{x} = \mathbf{A}^{-1} \mathbf{b}</math>
where A⁻¹ is the inverse matrix of A. If A has no inverse, solutions, if any, can be found using its generalized inverse.
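In practice a square system is usually solved by elimination rather than by forming the inverse explicitly. The sketch below swaps in Gauss-Jordan elimination on the augmented matrix [A | b], built from the row operations described earlier; the function name, the exact `Fraction` arithmetic, and the sample system are assumptions made for illustration.

```python
from fractions import Fraction


def solve(A, b):
    """Solve Ax = b for a square invertible A by Gauss-Jordan elimination."""
    n = len(A)
    # Augmented matrix [A | b] with exact rational entries.
    M = [[Fraction(x) for x in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    for col in range(n):
        # Row switching: bring a row with a nonzero pivot into position.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular; no unique solution")
        M[col], M[pivot] = M[pivot], M[col]
        # Row multiplication: scale the pivot row so the pivot becomes 1.
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        # Row addition: clear the pivot column in every other row.
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [row[n] for row in M]


# 2x + y = 3 and x + 3y = 5 have the solution x = 4/5, y = 7/5.
print(solve([[2, 1], [1, 3]], [3, 5]))
```

When the equations are not independent, the function raises `ValueError` instead of returning a solution.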
Linear transformations
Matrices and matrix multiplication reveal their essential features when related to linear transformations, also known as linear maps. A real m-by-n matrix A gives rise to a linear transformation Rⁿ → Rᵐ mapping each vector x in Rⁿ to the (matrix) product Ax, which is a vector in Rᵐ. Conversely, each linear transformation f: Rⁿ → Rᵐ arises from a unique m-by-n matrix A: explicitly, the (i, j)-entry of A is the i-th coordinate of f(eⱼ), where eⱼ = (0,...,0,1,0,...,0) is the unit vector with 1 in the j-th position and 0 elsewhere. The matrix A is said to represent the linear map f, and A is called the transformation matrix of f.
For example, the 2×2 matrix
<math>\mathbf{A} = \begin{bmatrix} a & c\\b & d \end{bmatrix}</math>
can be viewed as the transform of the unit square into a parallelogram with vertices at (0, 0), (a, b), (a + c, b + d), and (c, d). The parallelogram pictured at the right is obtained by multiplying A with each of the column vectors <math>\begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \end{bmatrix}</math>, and <math>\begin{bmatrix}0 \\ 1\end{bmatrix}</math> in turn. These vectors define the vertices of the unit square.
The following table shows several 2×2 real matrices with the associated linear maps of R2. The blue original is mapped to the green grid and shapes. The origin (0,0) is marked with a black point.
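Transformation | Matrix |
---|---|
Horizontal shear with m = 1.25 | <math>\begin{bmatrix} 1 & 1.25 \\ 0 & 1 \end{bmatrix}</math> |
Reflection through the vertical axis | <math>\begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}</math> |
Squeeze mapping with r = 3/2 | <math>\begin{bmatrix} \frac{3}{2} & 0 \\ 0 & \frac{2}{3} \end{bmatrix}</math> |
Scaling by a factor of 3/2 | <math>\begin{bmatrix} \frac{3}{2} & 0 \\ 0 & \frac{3}{2} \end{bmatrix}</math> |
Rotation by π/6 = 30° | <math>\begin{bmatrix} \cos\left(\frac{\pi}{6}\right) & -\sin\left(\frac{\pi}{6}\right) \\ \sin\left(\frac{\pi}{6}\right) & \cos\left(\frac{\pi}{6}\right) \end{bmatrix}</math> |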
Under the 1-to-1 correspondence between matrices and linear maps, matrix multiplication corresponds to composition of maps: if a k-by-m matrix B represents another linear map g: Rᵐ → Rᵏ, then the composition g ∘ f is represented by BA since
- (g ∘ f)(x) = g(f(x)) = g(Ax) = B(Ax) = (BA)x.
The last equality follows from the above-mentioned associativity of matrix multiplication.
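The identity B(Ax) = (BA)x can be checked numerically on a small case. The matrices and the vector below are arbitrary illustrative values, and the helper names are assumptions.

```python
def matvec(M, x):
    """Matrix-vector product Mx, with M a list of rows and x a list."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def matmul(A, B):
    """Matrix product AB; assumes the inner dimensions already agree."""
    return [[sum(A[i][r] * B[r][j] for r in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


A = [[1, 2], [0, 1], [3, 0]]  # represents f: R^2 -> R^3
B = [[1, 0, 2], [0, 1, 1]]    # represents g: R^3 -> R^2
x = [4, 5]

print(matvec(B, matvec(A, x)))  # g(f(x)) = [38, 17]
print(matvec(matmul(B, A), x))  # (BA)x   = [38, 17]
```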
The rank of a matrix A is the maximum number of linearly independent row vectors of the matrix, which is the same as the maximum number of linearly independent column vectors. Equivalently it is the dimension of the image of the linear map represented by A. The rank–nullity theorem states that the dimension of the kernel of a matrix plus the rank equals the number of columns of the matrix.
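One way to compute the rank is to row-reduce the matrix and count the pivots. The sketch below does this with exact rational arithmetic; the function name and the sample matrix are illustrative assumptions.

```python
from fractions import Fraction


def rank(A):
    """Rank of a matrix (list of rows): the number of pivots after elimination."""
    M = [[Fraction(x) for x in row] for row in A]
    rows, cols = len(M), len(M[0])
    r = 0  # number of pivots found so far
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, rows):
            f = M[i][c] / M[r][c]
            M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        r += 1
        if r == rows:
            break
    return r


A = [[1, 2, 3],
     [2, 4, 6],   # twice the first row, so it adds nothing to the rank
     [1, 0, 1]]
At = [list(col) for col in zip(*A)]

print(rank(A), rank(At))    # row rank and column rank agree: 2 2
print(len(A[0]) - rank(A))  # dimension of the kernel by rank-nullity: 1
```

For this matrix the kernel is spanned by (1, 1, −1), which is consistent with a nullity of 3 − 2 = 1.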
Square matrix
A square matrix is a matrix with the same number of rows and columns. An n-by-n matrix is known as a square matrix of order n. Any two square matrices of the same order can be added and multiplied. The entries aᵢᵢ form the main diagonal of a square matrix. They lie on the imaginary line that runs from the top left corner to the bottom right corner of the matrix.
Main types
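Name | Example with n = 3 |
---|---|
Diagonal matrix | <math>\begin{bmatrix} a_{11} & 0 & 0 \\ 0 & a_{22} & 0 \\ 0 & 0 & a_{33} \end{bmatrix}</math> |
Lower triangular matrix | <math>\begin{bmatrix} a_{11} & 0 & 0 \\ a_{21} & a_{22} & 0 \\ a_{31} & a_{32} & a_{33} \end{bmatrix}</math> |
Upper triangular matrix | <math>\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ 0 & a_{22} & a_{23} \\ 0 & 0 & a_{33} \end{bmatrix}</math> |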
Diagonal and triangular matrix
If all entries of A below the main diagonal are zero, A is called an upper triangular matrix. Similarly, if all entries of A above the main diagonal are zero, A is called a lower triangular matrix. If all entries outside the main diagonal are zero, A is called a diagonal matrix.
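These three definitions amount to checking which entries vanish. A minimal sketch, with illustrative function names and 0-based indices:

```python
def is_upper_triangular(A):
    """True if every entry below the main diagonal (row > column) is zero."""
    return all(A[i][j] == 0 for i in range(len(A)) for j in range(i))


def is_lower_triangular(A):
    """True if every entry above the main diagonal (row < column) is zero."""
    n = len(A)
    return all(A[i][j] == 0 for i in range(n) for j in range(i + 1, n))


def is_diagonal(A):
    """True if every entry outside the main diagonal is zero."""
    return is_upper_triangular(A) and is_lower_triangular(A)


U = [[1, 2, 3], [0, 4, 5], [0, 0, 6]]
L = [[1, 0, 0], [2, 3, 0], [4, 5, 6]]
D = [[1, 0, 0], [0, 2, 0], [0, 0, 3]]

print(is_upper_triangular(U), is_lower_triangular(U))  # True False
print(is_diagonal(D))                                  # True
```

A diagonal matrix is exactly one that is both upper and lower triangular, which is how `is_diagonal` is written.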
Identity matrix
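The identity matrix Iₙ of size n is the n-by-n matrix in which all the elements on the main diagonal are equal to 1 and all other elements are equal to 0, for example,
<math>
\mathbf{I}_1 = \begin{bmatrix} 1 \end{bmatrix},
\ \mathbf{I}_2 = \begin{bmatrix}
1 & 0 \\
0 & 1
\end{bmatrix},
\ \ldots ,
\ \mathbf{I}_n = \begin{bmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{bmatrix}
</math>
It is a square matrix of order n, and also a special kind of diagonal matrix. It is called an identity matrix because multiplication with it leaves a matrix unchanged:
- AIₙ = IₘA = A for any m-by-n matrix A.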
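The defining property can be verified on a non-square example, where the identity on the left and the one on the right have different sizes. The helper names and the sample matrix are illustrative assumptions.

```python
def identity(n):
    """The n-by-n identity matrix as a list of rows."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def matmul(A, B):
    """Matrix product AB; assumes the inner dimensions already agree."""
    return [[sum(A[i][r] * B[r][j] for r in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


A = [[1, 2, 3], [4, 5, 6]]  # a 2-by-3 matrix, so m = 2 and n = 3

print(matmul(A, identity(3)) == A)  # True
print(matmul(identity(2), A) == A)  # True
```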
A nonzero scalar multiple of an identity matrix is called a scalar matrix. If the matrix entries come from a field, the scalar matrices form a group, under matrix multiplication, that is isomorphic to the multiplicative group of nonzero elements of the field.
Symmetric or skew-symmetric matrix
A square matrix A that is equal to its transpose, that is, A = Aᵀ, is a symmetric matrix. If instead A is equal to the negative of its transpose, that is, A = −Aᵀ, then A is a skew-symmetric matrix. In complex matrices, symmetry is often replaced by the concept of Hermitian matrices, which satisfy A* = A, where the star or asterisk denotes the conjugate transpose of the matrix, that is, the transpose of the complex conjugate of A.
By the spectral theorem, real symmetric matrices and complex Hermitian matrices have an eigenbasis; that is, every vector is expressible as a linear combination of eigenvectors. In both cases, all eigenvalues are real. This theorem can be generalized to infinite-dimensional situations related to matrices with infinitely many rows and columns.
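The three definitions in this section (symmetric, skew-symmetric, Hermitian) can each be tested by comparing a matrix with its transpose or conjugate transpose. The function names and sample matrices below are illustrative assumptions.

```python
def transpose(A):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def conjugate_transpose(A):
    """Transpose of the entrywise complex conjugate of A."""
    return [[x.conjugate() for x in col] for col in zip(*A)]


def is_symmetric(A):
    return A == transpose(A)


def is_skew_symmetric(A):
    return A == [[-x for x in row] for row in transpose(A)]


def is_hermitian(A):
    return A == conjugate_transpose(A)


S = [[1, 2], [2, 3]]
K = [[0, 2], [-2, 0]]
H = [[2, 1 + 1j], [1 - 1j, 3]]

print(is_symmetric(S), is_skew_symmetric(K), is_hermitian(H))  # True True True
print(is_symmetric(H))  # False: H is Hermitian but not symmetric
```

Note that a skew-symmetric matrix necessarily has zeros on its main diagonal, as in `K`.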
Invertible matrix and its inverse
A square matrix A is called invertible or non-singular if there exists a matrix B such that
- AB = BA = Iₙ,
where Iₙ is the n×n identity matrix with 1s on the main diagonal and 0s elsewhere. If B exists, it is unique and is called the inverse matrix of A, denoted A⁻¹.
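For a 2-by-2 matrix with integer entries the inverse can be written down in closed form and then checked against the definition AB = BA = I. The closed form used here (swap the diagonal entries, negate the off-diagonal ones, divide by ad − bc) is the standard one but is not stated in the text above, so treat it as an assumption of this sketch, as are the function names.

```python
from fractions import Fraction


def inverse_2x2(A):
    """Inverse of a 2-by-2 integer matrix [[a, b], [c, d]], if it exists."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular, so it has no inverse")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]


def matmul(A, B):
    """Matrix product AB; assumes the inner dimensions already agree."""
    return [[sum(A[i][r] * B[r][j] for r in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


A = [[4, 7], [2, 6]]
B = inverse_2x2(A)
I2 = [[1, 0], [0, 1]]

print(matmul(A, B) == I2 and matmul(B, A) == I2)  # True
```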
Definite matrix
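 | Positive definite matrix | Indefinite matrix |
---|---|---|
Matrix | <math>\begin{bmatrix} \frac{1}{4} & 0 \\ 0 & 1 \end{bmatrix}</math> | <math>\begin{bmatrix} \frac{1}{4} & 0 \\ 0 & -\frac{1}{4} \end{bmatrix}</math> |
Quadratic form | Q(x, y) = ¼x² + y² | Q(x, y) = ¼x² − ¼y² |
Points such that Q(x, y) = 1 | Ellipse | Hyperbola |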
A symmetric real matrix A is called positive-definite if the associated quadratic form
- f(x) = xᵀAx
has a positive value for every nonzero vector x in Rⁿ. If f(x) yields only negative values for nonzero x, then A is negative-definite; if f produces both negative and positive values, then A is indefinite. If the quadratic form f yields only non-negative values (positive or zero), the symmetric matrix is called positive-semidefinite (or, if only non-positive values, negative-semidefinite); hence the matrix is indefinite precisely when it is neither positive-semidefinite nor negative-semidefinite.
A symmetric matrix is positive-definite if and only if all its eigenvalues are positive, that is, the matrix is positive-semidefinite and invertible. The table above shows two possibilities for 2-by-2 matrices.
Allowing as input two different vectors instead yields the bilinear form associated to A:
- B_A(x, y) = xᵀAy.
In the case of complex matrices, the same terminology and result apply, with symmetric matrix, quadratic form, bilinear form, and transpose xᵀ replaced respectively by Hermitian matrix, Hermitian form, sesquilinear form, and conjugate transpose xᴴ.
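The eigenvalue criterion for definiteness can be illustrated for real symmetric 2-by-2 matrices. The sketch below assumes the closed-form eigenvalues of a symmetric 2-by-2 matrix (a standard formula not given in this article); the function names are hypothetical.

```python
import math


def eigenvalues_sym_2x2(a, b, d):
    """Eigenvalues (smaller first) of the symmetric matrix [[a, b], [b, d]].

    Closed form assumed here: mean of the diagonal plus or minus
    sqrt(((a - d) / 2)**2 + b**2), which is always real.
    """
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + b ** 2)
    return mean - radius, mean + radius


def classify(a, b, d):
    """Classify [[a, b], [b, d]] by the signs of its eigenvalues.

    Exact comparisons with zero are used for simplicity; with general
    floating-point input a tolerance would be more appropriate.
    """
    lo, hi = eigenvalues_sym_2x2(a, b, d)
    if lo > 0:
        return "positive-definite"
    if hi < 0:
        return "negative-definite"
    if lo < 0 < hi:
        return "indefinite"
    if lo == 0:
        return "positive-semidefinite"
    return "negative-semidefinite"


# The two quadratic forms from the table: x^2 + y^2 and x^2 - y^2.
print(classify(1, 0, 1))
print(classify(1, 0, -1))
```

The zero matrix is both positive- and negative-semidefinite; this sketch reports it as positive-semidefinite.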
Orthogonal matrix
An orthogonal matrix is a square matrix with real entries whose columns and rows are orthogonal unit vectors (that is, orthonormal vectors). Equivalently, a matrix A is orthogonal if its transpose is equal to its inverse:
- Aᵀ = A⁻¹,
which entails
- AᵀA = AAᵀ = I_n,
where I_n is the identity matrix of size n.
An orthogonal matrix A is necessarily invertible (with inverse A⁻¹ = Aᵀ), unitary (A⁻¹ = A*), and normal (A*A = AA*). The determinant of any orthogonal matrix is either +1 or −1. A special orthogonal matrix is an orthogonal matrix with determinant +1. As a linear transformation, every orthogonal matrix with determinant +1 is a pure rotation without reflection, that is, the transformation preserves the orientation of the transformed structure, while every orthogonal matrix with determinant −1 reverses the orientation, that is, it is a composition of a pure reflection and a (possibly null) rotation. The identity matrices have determinant 1 and are pure rotations by an angle zero.
The complex analogue of an orthogonal matrix is a unitary matrix.
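The defining property of an orthogonal matrix (transpose times matrix equals the identity) and the ±1 determinant can be checked on small examples. This is an illustrative sketch only: the example matrices and the helper names are assumptions chosen so that exact rational arithmetic applies.

```python
from fractions import Fraction


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][r] * b[r][j] for r in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def is_orthogonal(m):
    """Check whether the transpose of the square matrix m times m is the identity."""
    n = len(m)
    identity = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    return matmul(transpose(m), m) == identity


def det_2x2(m):
    """Determinant of a 2-by-2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


# A rotation built from the 3-4-5 right triangle (cos = 3/5, sin = 4/5).
rotation = [[Fraction(3, 5), Fraction(-4, 5)],
            [Fraction(4, 5), Fraction(3, 5)]]
# Swapping the two coordinates is a reflection across the line y = x.
reflection = [[0, 1], [1, 0]]

print(is_orthogonal(rotation), det_2x2(rotation))      # determinant +1
print(is_orthogonal(reflection), det_2x2(reflection))  # determinant -1
```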
Main operations
Trace
The trace, tr(A), of a square matrix A is the sum of its diagonal entries. While matrix multiplication is not commutative as mentioned above, the trace of the product of two matrices is independent of the order of the factors:
- tr(AB) = tr(BA).
This is immediate from the definition of matrix multiplication:
- tr(AB) = Σ_i Σ_j a_{i,j} b_{j,i} = tr(BA).
It follows that the trace of the product of more than two matrices is invariant under cyclic permutations of the matrices. This does not hold for arbitrary permutations, however: in general, tr(ABC) ≠ tr(BAC). Also, the trace of a matrix is equal to that of its transpose, that is,
- tr(A) = tr(Aᵀ).
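The trace identities in this section can be verified numerically. The following sketch uses three small integer matrices chosen purely for illustration; `trace` and `matmul` are hypothetical helper names.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][r] * b[r][j] for r in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def trace(m):
    """Sum of the diagonal entries of a square matrix."""
    return sum(m[i][i] for i in range(len(m)))


A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[1, 0], [0, 2]]

# Order of two factors does not matter for the trace.
print(trace(matmul(A, B)), trace(matmul(B, A)))

# Cyclic permutations of three factors give the same trace ...
abc = trace(matmul(matmul(A, B), C))
bca = trace(matmul(matmul(B, C), A))
cab = trace(matmul(matmul(C, A), B))
# ... but swapping only the first two factors generally does not.
bac = trace(matmul(matmul(B, A), C))
print(abc, bca, cab, bac)
```

For these particular matrices the three cyclic products share one trace, while tr(BAC) differs from it.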
Determinant
The determinant of a square matrix A (denoted det(A) or |A|) is a number encoding certain properties of the matrix. A matrix is invertible if and only if its determinant is nonzero. Its absolute value equals the area (in R²) or volume (in R³) of the image of the unit square (or cube), while its sign corresponds to the orientation of the corresponding linear map: the determinant is positive if and only if the orientation is preserved.
The determinant of 2-by-2 matrices is given by
- det [[a, b], [c, d]] = ad − bc,
where (a, b) and (c, d) are the rows of the matrix.
The determinant of 3-by-3 matrices involves 6 terms (rule of Sarrus). The more lengthy Leibniz formula generalises these two formulae to all dimensions.
The determinant of a product of square matrices equals the product of their determinants:
- det(AB) = det(A) · det(B).
Adding a multiple of any row to another row, or a multiple of any column to another column, does not change the determinant. Interchanging two rows or two columns affects the determinant by multiplying it by −1. Using these operations, any matrix can be transformed to a lower (or upper) triangular matrix, and for such matrices, the determinant equals the product of the entries on the main diagonal; this provides a method to calculate the determinant of any matrix. Finally, the Laplace expansion expresses the determinant in terms of minors, that is, determinants of smaller matrices. This expansion can be used for a recursive definition of determinants (taking as starting case the determinant of a 1-by-1 matrix, which is its unique entry, or even the determinant of a 0-by-0 matrix, which is 1), that can be seen to be equivalent to the Leibniz formula. Determinants can be used to solve linear systems using Cramer's rule, where the value of each of the system's variables is the quotient of the determinants of two related square matrices.
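The row-operation method described above (reduce to upper triangular form, track sign changes from row swaps, then multiply the diagonal) can be sketched as follows. This is one possible implementation under the assumption of exact rational entries; the function name `determinant` is illustrative.

```python
from fractions import Fraction


def determinant(matrix):
    """Determinant of a square matrix via reduction to upper triangular form.

    Adding a multiple of one row to another leaves the determinant
    unchanged; each row interchange multiplies it by -1.
    """
    n = len(matrix)
    rows = [[Fraction(x) for x in row] for row in matrix]
    sign = 1
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # a zero column below the diagonal: singular
        if pivot != col:
            rows[col], rows[pivot] = rows[pivot], rows[col]
            sign = -sign
        # Eliminate the entries below the pivot.
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    # The determinant of a triangular matrix is the product of its diagonal.
    result = Fraction(sign)
    for i in range(n):
        result *= rows[i][i]
    return result


print(determinant([[1, 2], [3, 4]]))
print(determinant([[2, 0, 1], [1, 3, 2], [1, 1, 4]]))
```

With this formulation the empty (0-by-0) matrix yields 1, consistent with the convention mentioned in the text.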
Eigenvalues and eigenvectors
A number λ and a non-zero vector v satisfying
- Av = λv
are called an eigenvalue and an eigenvector of A, respectively. The number λ is an eigenvalue of an n×n matrix A if and only if A − λI_n is not invertible, which is equivalent to
- det(A − λI_n) = 0.
The polynomial p_A in an indeterminate X given by evaluation of the determinant det(XI_n − A) is called the characteristic polynomial of A. It is a monic polynomial of degree n. Therefore the polynomial equation p_A(λ) = 0 has at most n different solutions, that is, eigenvalues of the matrix. They may be complex even if the entries of A are real. According to the Cayley–Hamilton theorem, p_A(A) = 0, that is, the result of substituting the matrix itself into its own characteristic polynomial yields the zero matrix.
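For a 2-by-2 matrix the characteristic polynomial works out to X² − tr(A)·X + det(A), which follows from expanding det(XI − A) but is not written out in this article. The sketch below, with hypothetical helper names, uses that expansion to find the eigenvalues and to check the Cayley–Hamilton theorem on two examples.

```python
import cmath


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][r] * b[r][j] for r in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def char_poly_2x2(m):
    """Coefficients (1, c1, c0) of the monic polynomial X^2 + c1*X + c0."""
    (a, b), (c, d) = m
    return 1, -(a + d), a * d - b * c


def eigenvalues_2x2(m):
    """Roots of the characteristic polynomial (possibly complex)."""
    _, c1, c0 = char_poly_2x2(m)
    root = cmath.sqrt(c1 * c1 - 4 * c0)
    return (-c1 - root) / 2, (-c1 + root) / 2


def cayley_hamilton_residual(m):
    """Evaluate the characteristic polynomial at the matrix m itself."""
    _, c1, c0 = char_poly_2x2(m)
    m2 = matmul(m, m)
    return [[m2[i][j] + c1 * m[i][j] + (c0 if i == j else 0)
             for j in range(2)]
            for i in range(2)]


symmetric = [[2, 1], [1, 2]]        # real eigenvalues
quarter_turn = [[0, -1], [1, 0]]    # real entries, complex eigenvalues

print(eigenvalues_2x2(symmetric))
print(eigenvalues_2x2(quarter_turn))
print(cayley_hamilton_residual(symmetric))
print(cayley_hamilton_residual(quarter_turn))
```

The second example shows eigenvalues that are complex although all entries are real; in both cases the residual is the zero matrix.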
Resources
- Matrices and Matrix Operations, Book Chapter
- Guided Notes
- Matrix Addition. Produced by TA Catherine Sporer, UTSA
Licensing
Content obtained and/or adapted from:
- Matrix (mathematics), Wikipedia under a CC BY-SA license