Matrix Operations

From Department of Mathematics at UTSA
==Linear equations==

Matrices can be used to compactly write and work with multiple linear equations, that is, systems of linear equations. For example, if '''A''' is an ''m''-by-''n'' matrix, '''x''' designates a column vector (that is, ''n''×1-matrix) of ''n'' variables ''x''<sub>1</sub>, ''x''<sub>2</sub>, ..., ''x''<sub>''n''</sub>, and '''b''' is an ''m''×1-column vector, then the matrix equation
 
:<math>\mathbf{Ax} = \mathbf{b}</math>
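Concretely, a small system can be written and solved in this matrix form. The sketch below uses NumPy, and the particular coefficients are an illustration of our own, not taken from the text:

```python
import numpy as np

# the hypothetical system  x1 + 2 x2 = 5,  3 x1 + 4 x2 = 6  written as Ax = b
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
b = np.array([5.0, 6.0])

x = np.linalg.solve(A, b)        # x = (-4, 4.5)

# substituting the solution back recovers b
assert np.allclose(A @ x, b)
```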
  
 
==Linear transformations==
 
[[File:Area parallellogram as determinant.svg|thumb|right|The vectors represented by a 2-by-2 matrix correspond to the sides of a unit square transformed into a parallelogram.]]
Matrices and matrix multiplication reveal their essential features when related to ''linear transformations'', also known as ''linear maps''. <span id="linear_maps">A real ''m''-by-''n'' matrix '''A''' gives rise to a linear transformation '''R'''<sup>''n''</sup> → '''R'''<sup>''m''</sup> mapping each vector '''x''' in '''R'''<sup>''n''</sup> to the (matrix) product '''Ax''', which is a vector in '''R'''<sup>''m''</sup>. Conversely, each linear transformation ''f'': '''R'''<sup>''n''</sup> → '''R'''<sup>''m''</sup> arises from a unique ''m''-by-''n'' matrix '''A''': explicitly, the (''i'', ''j'')-entry of '''A''' is the ''i''<sup>th</sup> coordinate of ''f''('''e'''<sub>''j''</sub>), where '''e'''<sub>''j''</sub> = (0,...,0,1,0,...,0) is the unit vector with 1 in the ''j''<sup>th</sup> position and 0 elsewhere.</span> The matrix '''A''' is said to represent the linear map ''f'', and '''A''' is called the ''transformation matrix'' of ''f''.
  
 
For example, the 2×2 matrix
 
:<math>\mathbf{A} = \begin{bmatrix} a & c\\b & d \end{bmatrix}</math>
  
can be viewed as the transform of the unit square into a parallelogram with vertices at (0, 0), (''a'', ''b''), (''a'' + ''c'', ''b'' + ''d''), and (''c'', ''d''). The parallelogram pictured at the right is obtained by multiplying '''A''' with each of the column vectors <math>\begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \end{bmatrix}</math>, and <math>\begin{bmatrix}0 \\ 1\end{bmatrix}</math> in turn. These vectors define the vertices of the unit square.
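This mapping is easy to check numerically. The sketch below (NumPy, with hypothetical entries ''a'' = 2, ''b'' = 1, ''c'' = 1, ''d'' = 3) multiplies '''A''' by each corner of the unit square:

```python
import numpy as np

# hypothetical entries for A = [[a, c], [b, d]]: a=2, b=1, c=1, d=3
A = np.array([[2, 1],
              [1, 3]])

# the unit square's corners, taken as column vectors
square = [np.array([0, 0]), np.array([1, 0]),
          np.array([1, 1]), np.array([0, 1])]

# multiplying A by each corner yields the parallelogram's vertices:
# (0, 0), (a, b) = (2, 1), (a+c, b+d) = (3, 4), (c, d) = (1, 3)
parallelogram = [A @ v for v in square]
```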
  
The following table shows several 2×2 real matrices with the associated linear maps of '''R'''<sup>2</sup>. The blue original is mapped to the green grid and shapes. The origin (0,0) is marked with a black point.
 
{| class="wikitable" style="text-align:center; margin:1em auto 1em auto;"
 
|-
| Horizontal shear<br>with ''m'' = 1.25.
| Reflection through the vertical axis
| Squeeze mapping<br>with ''r'' = 3/2
| Scaling<br>by a factor of 3/2
|<span id="rotation_matrix">Rotation<br>by <math> \pi</math>/6 = 30°</span>
 
|-
 
| <math>\begin{bmatrix}
 
|}
  
Under the 1-to-1 correspondence between matrices and linear maps, matrix multiplication corresponds to composition of maps: if a ''k''-by-''m'' matrix '''B''' represents another linear map ''g'': '''R'''<sup>''m''</sup> → '''R'''<sup>''k''</sup>, then the composition ''g'' ∘ ''f'' is represented by '''BA''' since
 
:(''g'' ∘ ''f'')('''x''') = ''g''(''f''('''x''')) = ''g''('''Ax''') = '''B'''('''Ax''') = ('''BA''')'''x'''.
  
 
The last equality follows from the above-mentioned associativity of matrix multiplication.
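This correspondence can be checked numerically; a minimal NumPy sketch with randomly chosen matrices of compatible sizes:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 2))   # represents f: R^2 -> R^3
B = rng.standard_normal((4, 3))   # represents g: R^3 -> R^4
x = rng.standard_normal(2)

# g(f(x)) = B(Ax) coincides with (BA)x, so BA represents the composition
assert np.allclose(B @ (A @ x), (B @ A) @ x)
```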
  
The rank of a matrix '''A''' is the maximum number of linearly independent row vectors of the matrix, which is the same as the maximum number of linearly independent column vectors. Equivalently it is the dimension of the image of the linear map represented by '''A'''. The rank–nullity theorem states that the dimension of the kernel of a matrix plus the rank equals the number of columns of the matrix.
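The rank and the rank–nullity theorem can be illustrated with NumPy (the example matrix is our own):

```python
import numpy as np

# the third row is the sum of the first two, so only two rows
# (and equivalently two columns) are linearly independent
A = np.array([[1, 2, 0, 1],
              [0, 1, 1, 0],
              [1, 3, 1, 1]])

rank = np.linalg.matrix_rank(A)   # 2

# rank–nullity: rank + dim(kernel) = number of columns
nullity = A.shape[1] - rank       # 4 - 2 = 2
```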
  
 
==Square matrix==
A square matrix is a matrix with the same number of rows and columns. An ''n''-by-''n'' matrix is known as a square matrix of order ''n.'' Any two square matrices of the same order can be added and multiplied.
The entries ''a''<sub>''ii''</sub> form the main diagonal of a square matrix. They lie on the imaginary line that runs from the top left corner to the bottom right corner of the matrix.
  
 
===Main types===
 
! Name !! Example with ''n'' = 3
 
|-
| Diagonal matrix || style="text-align:center;" | <math>
 
\begin{bmatrix}
 
a_{11} & 0      & 0 \\
 
</math>
 
|-
| Lower triangular matrix || style="text-align:center;" | <math>
 
\begin{bmatrix}
 
a_{11} &      0 & 0 \\
 
</math>
 
|-
| Upper triangular matrix || style="text-align:center;" | <math>
 
\begin{bmatrix}
 
a_{11} & a_{12} & a_{13} \\
  
 
====Diagonal and triangular matrix====
If all entries of '''A''' below the main diagonal are zero, '''A''' is called an ''upper triangular matrix''. Similarly, if all entries of '''A''' above the main diagonal are zero, '''A''' is called a ''lower triangular matrix''. If all entries outside the main diagonal are zero, '''A''' is called a diagonal matrix.
  
 
====Identity matrix====
The ''identity matrix'' '''I'''<sub>''n''</sub> of size ''n'' is the ''n''-by-''n'' matrix in which all the elements on the main diagonal are equal to 1 and all other elements are equal to 0, for example,
 
:<math>
 
\mathbf{I}_1 = \begin{bmatrix} 1 \end{bmatrix},
 
\end{bmatrix}
 
</math>
It is a square matrix of order ''n'', and also a special kind of diagonal matrix. It is called an identity matrix because multiplication with it leaves a matrix unchanged:
:'''AI'''<sub>''n''</sub> = '''I'''<sub>''m''</sub>'''A''' = '''A''' for any ''m''-by-''n'' matrix '''A'''.
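A quick numerical illustration of this property (NumPy; the example matrix is arbitrary):

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])   # an arbitrary 2-by-3 matrix

I2 = np.eye(2)              # I_m with m = 2
I3 = np.eye(3)              # I_n with n = 3

# multiplication by the identity leaves A unchanged: A I_n = I_m A = A
assert np.allclose(A @ I3, A)
assert np.allclose(I2 @ A, A)
```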
  
 
A nonzero scalar multiple of an identity matrix is called a ''scalar'' matrix. If the matrix entries come from a field, the scalar matrices form a group, under matrix multiplication, that is isomorphic to the multiplicative group of nonzero elements of the field.
  
 
====Symmetric or skew-symmetric matrix====
A square matrix '''A''' that is equal to its transpose, that is, '''A''' = '''A'''<sup>T</sup>, is a symmetric matrix. If instead '''A''' is equal to the negative of its transpose, that is, '''A''' = −'''A'''<sup>T</sup>, then '''A''' is a skew-symmetric matrix. In complex matrices, symmetry is often replaced by the concept of Hermitian matrices, which satisfy '''A'''<sup>∗</sup> = '''A''', where the star or asterisk denotes the conjugate transpose of the matrix, that is, the transpose of the complex conjugate of '''A'''.
  
By the spectral theorem, real symmetric matrices and complex Hermitian matrices have an eigenbasis; that is, every vector is expressible as a linear combination of eigenvectors. In both cases, all eigenvalues are real. This theorem can be generalized to infinite-dimensional situations related to matrices with infinitely many rows and columns.
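For real symmetric matrices this can be observed numerically (a NumPy sketch with an arbitrary symmetric matrix):

```python
import numpy as np

S = np.array([[2.0, 1.0],
              [1.0, 3.0]])   # an arbitrary real symmetric matrix

# eigh is NumPy's routine for symmetric/Hermitian matrices;
# it returns real eigenvalues and orthonormal eigenvectors (the columns)
eigenvalues, eigenvectors = np.linalg.eigh(S)

# all eigenvalues are real, and S v = lambda v for each eigenvector column v
assert np.all(np.isreal(eigenvalues))
assert np.allclose(S @ eigenvectors, eigenvectors * eigenvalues)

# the eigenvectors form an orthonormal eigenbasis of R^2
assert np.allclose(eigenvectors.T @ eigenvectors, np.eye(2))
```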
  
 
====Invertible matrix and its inverse====
A square matrix '''A''' is called ''invertible'' or ''non-singular'' if there exists a matrix '''B''' such that
:'''AB''' = '''BA''' = '''I'''<sub>''n''</sub>,
where '''I'''<sub>''n''</sub> is the ''n''×''n'' identity matrix with 1s on the main diagonal and 0s elsewhere. If '''B''' exists, it is unique and is called the ''inverse matrix'' of '''A''', denoted '''A'''<sup>−1</sup>.
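A numerical illustration (NumPy; the example matrix, chosen with determinant 1, is our own):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [5.0, 3.0]])   # determinant 2*3 - 1*5 = 1, so invertible

B = np.linalg.inv(A)         # the inverse A^{-1} = [[3, -1], [-5, 2]]

# AB = BA = I_n characterizes the inverse
assert np.allclose(A @ B, np.eye(2))
assert np.allclose(B @ A, np.eye(2))
```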
  
 
====Definite matrix====
 
{| class="wikitable" style="float:right; text-align:center; margin:0ex 0ex 2ex 2ex;"
 
|-
! Positive definite matrix !! Indefinite matrix
 
|-
 
| <math>\begin{bmatrix}
 
\end{bmatrix}</math>
 
|-
| ''Q''(''x'', ''y'') = <math> \tfrac{1}{4}</math> ''x''<sup>2</sup> + ''y''<sup>2</sup>
| ''Q''(''x'', ''y'') = <math> \tfrac{1}{4}</math> ''x''<sup>2</sup> − <math> \tfrac{1}{4}</math> ''y''<sup>2</sup>
 
|-
| [[File:Ellipse in coordinate system with semi-axes labelled.svg|150px]] <br>Points such that ''Q''(''x'',''y'')=1 <br> (Ellipse).
| [[File:Hyperbola2 SVG.svg|150px]] <br> Points such that ''Q''(''x'',''y'')=1 <br> (Hyperbola).
 
|}
A symmetric real matrix {{math|'''A'''}} is called ''positive-definite'' if the associated quadratic form
:<span id="quadratic forms">''f''('''x''') = '''x'''<sup>T</sup>'''Ax'''</span>
has a positive value for every nonzero vector '''x''' in '''R'''<sup>''n''</sup>. If ''f''('''x''') only yields negative values then {{math|'''A'''}} is ''negative-definite''; if {{math|''f''}} does produce both negative and positive values then {{math|'''A'''}} is ''indefinite''. If the quadratic form {{math|''f''}} yields only non-negative values (positive or zero), the symmetric matrix is called ''positive-semidefinite'' (or if only non-positive values, then negative-semidefinite); hence the matrix is indefinite precisely when it is neither positive-semidefinite nor negative-semidefinite.
  
 
A symmetric matrix is positive-definite if and only if all its eigenvalues are positive, that is, the matrix is positive-semidefinite and it is invertible. The table at the right shows two possibilities for 2-by-2 matrices.
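Both characterizations can be checked on the two matrices from the table (a NumPy sketch; the random sampling only spot-checks positivity, it is not a proof):

```python
import numpy as np

pos_def = np.array([[0.25, 0.0],    # Q(x, y) = x^2/4 + y^2
                    [0.0,  1.0]])
indef   = np.array([[0.25, 0.0],    # Q(x, y) = x^2/4 - y^2/4
                    [0.0, -0.25]])

def quadratic_form(A, v):
    return v @ A @ v                # x^T A x

# spot-check positivity of Q for the positive-definite matrix
rng = np.random.default_rng(1)
for v in rng.standard_normal((100, 2)):
    assert quadratic_form(pos_def, v) > 0

# equivalent test: positive-definite iff all eigenvalues are positive
assert np.all(np.linalg.eigvalsh(pos_def) > 0)

# the indefinite matrix has eigenvalues of both signs
ev = np.linalg.eigvalsh(indef)
assert ev.min() < 0 < ev.max()
```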
  
Allowing as input two different vectors instead yields the bilinear form associated to {{math|'''A'''}}:
:''B''<sub>'''A'''</sub> ('''x''', '''y''') = '''x'''<sup>T</sup>'''Ay'''.
  
In the case of complex matrices, the same terminology and result apply, with ''symmetric matrix'', ''quadratic form'', ''bilinear form'', and ''transpose'' '''x'''<sup>T</sup> replaced respectively by  Hermitian matrix, Hermitian form, sesquilinear form, and conjugate transpose '''x'''<sup>H</sup>.
  
 
====Orthogonal matrix====
An ''orthogonal matrix'' is a square matrix with real entries whose columns and rows are orthogonal unit vectors (that is, orthonormal vectors). Equivalently, a matrix '''A''' is orthogonal if its transpose is equal to its inverse:
 
:<math>\mathbf{A}^\mathrm{T}=\mathbf{A}^{-1}, \,</math>
 
which entails
 
:<math>\mathbf{A}^\mathrm{T} \mathbf{A} = \mathbf{A} \mathbf{A}^\mathrm{T} = \mathbf{I}_n,</math>
where '''I'''<sub>''n''</sub> is the identity matrix of size ''n''.
  
An orthogonal matrix '''A''' is necessarily invertible (with inverse '''A'''<sup>−1</sup> = '''A'''<sup>T</sup>), unitary ('''A'''<sup>−1</sup> = '''A'''*), and normal ('''A'''*'''A''' = '''AA'''*). The determinant of any orthogonal matrix is either +1 or −1. A ''special orthogonal matrix'' is an orthogonal matrix with determinant +1. As a linear transformation, every orthogonal matrix with determinant +1 is a pure rotation without reflection, i.e., the transformation preserves the orientation of the transformed structure, while every orthogonal matrix with determinant −1 reverses the orientation, i.e., is a composition of a pure reflection and a (possibly null) rotation. The identity matrices have determinant 1, and are pure rotations by an angle zero.
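These properties can be verified for a rotation and a reflection (a NumPy sketch):

```python
import numpy as np

theta = np.pi / 6                       # rotation by 30°
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# orthogonality: A^T A = A A^T = I, i.e. the transpose is the inverse
assert np.allclose(A.T @ A, np.eye(2))
assert np.allclose(A @ A.T, np.eye(2))

# a rotation is special orthogonal: det = +1
assert np.isclose(np.linalg.det(A), 1.0)

# reflection through the vertical axis: orthogonal with det = -1
R = np.array([[-1.0, 0.0],
              [ 0.0, 1.0]])
assert np.allclose(R.T @ R, np.eye(2))
assert np.isclose(np.linalg.det(R), -1.0)
```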
  
The complex analogue of an orthogonal matrix is a unitary matrix.
  
 
===Main operations===
 
:<math>\operatorname{tr}(\mathbf{AB}) = \sum_{i=1}^m \sum_{j=1}^n a_{ij} b_{ji} = \operatorname{tr}(\mathbf{BA}).</math>
 
It follows that the trace of the product of more than two matrices is independent of cyclic permutations of the matrices, however this does not in general apply for arbitrary permutations (for example, tr('''ABC''') ≠ tr('''BAC'''), in general). Also, the trace of a matrix is equal to that of its transpose, that is,
:tr('''A''') = tr('''A'''<sup>T</sup>).
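The trace identities can be spot-checked numerically (NumPy, random matrices):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((2, 3))
B = rng.standard_normal((3, 2))

# tr(AB) = tr(BA), even though AB is 2×2 and BA is 3×3
assert np.isclose(np.trace(A @ B), np.trace(B @ A))

M, N, C = (rng.standard_normal((3, 3)) for _ in range(3))
# cyclic permutations leave the trace unchanged...
assert np.isclose(np.trace(M @ N @ C), np.trace(N @ C @ M))
# ...but a non-cyclic permutation generally changes it
assert not np.isclose(np.trace(M @ N @ C), np.trace(N @ M @ C))

# the trace equals that of the transpose
assert np.isclose(np.trace(M), np.trace(M.T))
```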
  
 
====Determinant====
[[File:Determinant example.svg|thumb|300px|right|A linear transformation on '''R'''<sup>2</sup> given by the indicated matrix. The determinant of this matrix is −1, as the area of the green parallelogram at the right is 1, but the map reverses the orientation, since it turns the counterclockwise orientation of the vectors to a clockwise one.]]
  
The ''determinant'' of a square matrix '''A''' (denoted det('''A''') or |'''A'''|) is a number encoding certain properties of the matrix. A matrix is invertible if and only if its determinant is nonzero. Its absolute value equals the area (in '''R'''<sup>2</sup>) or volume (in '''R'''<sup>3</sup>) of the image of the unit square (or cube), while its sign corresponds to the orientation of the corresponding linear map: the determinant is positive if and only if the orientation is preserved.
  
 
The determinant of 2-by-2 matrices is given by
 
The determinant of a product of square matrices equals the product of their determinants:
 
:det('''AB''') = det('''A''') · det('''B''').
Adding a multiple of any row to another row, or a multiple of any column to another column does not change the determinant. Interchanging two rows or two columns affects the determinant by multiplying it by −1. Using these operations, any matrix can be transformed to a lower (or upper) triangular matrix, and for such matrices, the determinant equals the product of the entries on the main diagonal; this provides a method to calculate the determinant of any matrix. Finally, the Laplace expansion expresses the determinant in terms of minors, that is, determinants of smaller matrices. This expansion can be used for a recursive definition of determinants (taking as starting case the determinant of a 1-by-1 matrix, which is its unique entry, or even the determinant of a 0-by-0 matrix, which is 1), that can be seen to be equivalent to the Leibniz formula. Determinants can be used to solve linear systems using Cramer's rule, where the division of the determinants of two related square matrices equates to the value of each of the system's variables.
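The determinant rules above can be illustrated numerically (NumPy; the example matrices are our own):

```python
import numpy as np

A = np.array([[3.0, 7.0],
              [1.0, -4.0]])
B = np.array([[2.0, 0.0],
              [1.0, 1.0]])

# 2-by-2 determinant: ad - bc = 3*(-4) - 7*1 = -19
assert np.isclose(np.linalg.det(A), -19.0)

# multiplicativity: det(AB) = det(A) det(B)
assert np.isclose(np.linalg.det(A @ B),
                  np.linalg.det(A) * np.linalg.det(B))

# adding a multiple of one row to another leaves the determinant unchanged
A2 = A.copy()
A2[1] += 5 * A2[0]
assert np.isclose(np.linalg.det(A2), np.linalg.det(A))

# interchanging the two rows multiplies the determinant by -1
assert np.isclose(np.linalg.det(A[::-1]), -np.linalg.det(A))
```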
  
 
====Eigenvalues and eigenvectors====
 
A number λ and a non-zero vector '''v''' satisfying
 
:<math>\mathbf{Av} = \lambda \mathbf{v}</math>
are called an ''eigenvalue'' and an ''eigenvector'' of '''A''', respectively. The number λ is an eigenvalue of an ''n''×''n''-matrix '''A''' if and only if '''A'''−λ'''I'''<sub>''n''</sub> is not invertible, which is equivalent to
 
:<math>\det(\mathbf{A}-\lambda \mathbf{I}) = 0.</math>
The polynomial ''p''<sub>'''A'''</sub> in an indeterminate ''X'' given by evaluation of the determinant det(''X'''''I'''<sub>''n''</sub>−'''A''') is called the characteristic polynomial of '''A'''. It is a monic polynomial of degree ''n''. Therefore the polynomial equation ''p''<sub>'''A'''</sub>(λ) = 0 has at most ''n'' different solutions, that is, eigenvalues of the matrix. They may be complex even if the entries of '''A''' are real. According to the Cayley–Hamilton theorem, ''p''<sub>'''A'''</sub>('''A''') = '''0''', that is, the result of substituting the matrix itself into its own characteristic polynomial yields the zero matrix.
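A numerical illustration of eigenpairs and the Cayley–Hamilton theorem for a 2×2 example (NumPy; for a 2×2 matrix the characteristic polynomial is ''X''<sup>2</sup> − tr('''A''')''X'' + det('''A''')):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigenvalues, eigenvectors = np.linalg.eig(A)  # eigenvalues 1 and 3

# each eigenpair satisfies A v = lambda v
for lam, v in zip(eigenvalues, eigenvectors.T):
    assert np.allclose(A @ v, lam * v)

# Cayley–Hamilton for n = 2: p_A(A) = A^2 - tr(A) A + det(A) I = 0
tr, det = np.trace(A), np.linalg.det(A)
assert np.allclose(A @ A - tr * A + det * np.eye(2), 0.0)
```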
  
 
== Resources ==

Latest revision as of 16:05, 10 January 2022

==Basic operations==

There are a number of basic operations that can be applied to modify matrices, called matrix addition, scalar multiplication, transposition, matrix multiplication, row operations, and submatrix.

===Addition, scalar multiplication, and transposition===

{| class="wikitable"
|+ Operations performed on matrices
! Operation !! Definition
|-
| Addition || The sum '''A''' + '''B''' of two ''m''-by-''n'' matrices '''A''' and '''B''' is calculated entrywise:
:('''A''' + '''B''')<sub>''i'',''j''</sub> = '''A'''<sub>''i'',''j''</sub> + '''B'''<sub>''i'',''j''</sub>, where 1 ≤ ''i'' ≤ ''m'' and 1 ≤ ''j'' ≤ ''n''.
|-
| Scalar multiplication || The product ''c'''''A''' of a number ''c'' (also called a scalar in the parlance of abstract algebra) and a matrix '''A''' is computed by multiplying every entry of '''A''' by ''c'':
:(''c'''''A''')<sub>''i'',''j''</sub> = ''c'' · '''A'''<sub>''i'',''j''</sub>.
This operation is called scalar multiplication, but its result is not named "scalar product" to avoid confusion, since "scalar product" is sometimes used as a synonym for "inner product".
|-
| Transposition || The transpose of an ''m''-by-''n'' matrix '''A''' is the ''n''-by-''m'' matrix '''A'''<sup>T</sup> (also denoted '''A'''<sup>tr</sup> or <sup>t</sup>'''A''') formed by turning rows into columns and vice versa:
:('''A'''<sup>T</sup>)<sub>''i'',''j''</sub> = '''A'''<sub>''j'',''i''</sub>.
|}

Familiar properties of numbers extend to these operations of matrices: for example, addition is commutative, that is, the matrix sum does not depend on the order of the summands: '''A''' + '''B''' = '''B''' + '''A'''. The transpose is compatible with addition and scalar multiplication, as expressed by (''c'''''A''')<sup>T</sup> = ''c''('''A'''<sup>T</sup>) and ('''A''' + '''B''')<sup>T</sup> = '''A'''<sup>T</sup> + '''B'''<sup>T</sup>. Finally, ('''A'''<sup>T</sup>)<sup>T</sup> = '''A'''.
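These properties are easy to confirm numerically (a NumPy sketch with arbitrary matrices):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
c = 3

# addition is commutative: A + B = B + A
assert np.array_equal(A + B, B + A)

# the transpose is compatible with addition and scalar multiplication
assert np.array_equal((c * A).T, c * (A.T))
assert np.array_equal((A + B).T, A.T + B.T)

# transposing twice returns the original matrix: (A^T)^T = A
assert np.array_equal(A.T.T, A)
```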

Matrix multiplication

Schematic depiction of the matrix product AB of two matrices A and B.

Multiplication of two matrices is defined if and only if the number of columns of the left matrix is the same as the number of rows of the right matrix. If A is an m-by-n matrix and B is an n-by-p matrix, then their matrix product AB is the m-by-p matrix whose entries are given by the dot product of the corresponding row of A and the corresponding column of B:

where 1 ≤ i ≤ m and 1 ≤ j ≤ p. For example, the underlined entry 2340 in the product is calculated as (2 × 1000) + (3 × 100) + (4 × 10) = 2340:

Matrix multiplication satisfies the rules (AB)C = A(BC) (associativity), and (A + B)C = AC + BC as well as C(A + B) = CA + CB (left and right distributivity), whenever the size of the matrices is such that the various products are defined. The product AB may be defined without BA being defined, namely if A and B are m-by-n and n-by-k matrices, respectively, and m ≠ k. Even if both products are defined, they generally need not be equal, that is:

AB ≠ BA,

In other words, matrix multiplication is not commutative, in marked contrast to (rational, real, or complex) numbers, whose product is independent of the order of the factors. An example of two matrices not commuting with each other is:

whereas

Besides the ordinary matrix multiplication just described, other less frequently used operations on matrices that can be considered forms of multiplication also exist, such as the Hadamard product and the Kronecker product. They arise in solving matrix equations such as the Sylvester equation.
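A small numerical sketch (assuming NumPy; the matrices are illustrative, not from the text) showing non-commutativity alongside the Hadamard and Kronecker products mentioned above:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [0, 0]])

AB = A @ B            # ordinary matrix product
BA = B @ A

not_commutative = not np.array_equal(AB, BA)   # AB and BA differ

# Other "multiplications" mentioned above:
hadamard = A * B              # entrywise (Hadamard) product
kron = np.kron(A, B)          # Kronecker product, here 4-by-4
```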

Row operations

There are three types of row operations:

  1. row addition, that is, adding a row to another;
  2. row multiplication, that is, multiplying all entries of a row by a non-zero constant;
  3. row switching, that is, interchanging two rows of a matrix.

These operations are used in several ways, including solving linear equations and finding matrix inverses.
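The three row operations can be sketched with NumPy array assignments (an illustrative example of my own, applied in the order used when eliminating below a pivot):

```python
import numpy as np

M = np.array([[2.0, 1.0], [4.0, 5.0]])

# 1. Row addition: add a multiple of row 0 to row 1 (eliminating below the pivot)
M[1] = M[1] - 2.0 * M[0]      # row 1 becomes [0, 3]

# 2. Row multiplication: scale a row by a non-zero constant
M[0] = 0.5 * M[0]             # row 0 becomes [1, 0.5]

# 3. Row switching: interchange two rows via fancy indexing
M[[0, 1]] = M[[1, 0]]
```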

Submatrix

A submatrix of a matrix is obtained by deleting any collection of rows and/or columns. For example, from the following 3-by-4 matrix, we can construct a 2-by-3 submatrix by removing row 3 and column 2:

The minors and cofactors of a matrix are found by computing the determinant of certain submatrices.

A principal submatrix is a square submatrix obtained by removing certain rows and columns. The definition varies from author to author. According to some authors, a principal submatrix is a submatrix in which the set of row indices that remain is the same as the set of column indices that remain. Other authors define a principal submatrix as one in which the first k rows and columns, for some number k, are the ones that remain; this type of submatrix has also been called a leading principal submatrix.
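A sketch of both kinds of submatrix extraction, assuming NumPy; the 3-by-4 example matrix is filled with the values 1 through 12 for illustration:

```python
import numpy as np

A = np.arange(1, 13).reshape(3, 4)   # rows [1..4], [5..8], [9..12]

# Delete row 3 and column 2 (1-based, as in the text) to get a 2-by-3 submatrix
sub = np.delete(np.delete(A, 2, axis=0), 1, axis=1)

# A principal submatrix (first convention): the remaining row indices
# equal the remaining column indices
keep = [0, 2]
principal = A[:3, :3][np.ix_(keep, keep)]
```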

Linear equations

Matrices can be used to compactly write and work with multiple linear equations, that is, systems of linear equations. For example, if A is an m-by-n matrix, x designates a column vector (that is, n×1-matrix) of n variables x1, x2, ..., xn, and b is an m×1-column vector, then the matrix equation

is equivalent to the system of linear equations

Using matrices, this can be solved more compactly than would be possible by writing out all the equations separately. If n = m and the equations are independent, then this can be done by writing

where A−1 is the inverse matrix of A. If A has no inverse, solutions—if any—can be found using its generalized inverse.
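A minimal sketch of solving Ax = b, assuming NumPy; the system here is an invented example:

```python
import numpy as np

# Solve Ax = b for a square, invertible A (n = m, independent equations)
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])

x = np.linalg.solve(A, b)            # preferred over forming inv(A) explicitly
residual_ok = np.allclose(A @ x, b)

# For non-square or singular A, a generalized (pseudo-)inverse
# yields a least-squares solution; for invertible A it agrees with x
x_pinv = np.linalg.pinv(A) @ b
```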

Linear transformations

The vectors represented by a 2-by-2 matrix correspond to the sides of a unit square transformed into a parallelogram.

Matrices and matrix multiplication reveal their essential features when related to linear transformations, also known as linear maps. A real m-by-n matrix A gives rise to a linear transformation Rn → Rm mapping each vector x in Rn to the (matrix) product Ax, which is a vector in Rm. Conversely, each linear transformation f: Rn → Rm arises from a unique m-by-n matrix A: explicitly, the (i, j)-entry of A is the ith coordinate of f(ej), where ej = (0, ..., 0, 1, 0, ..., 0) is the unit vector with 1 in the jth position and 0 elsewhere. The matrix A is said to represent the linear map f, and A is called the transformation matrix of f.

For example, the 2×2 matrix

can be viewed as the transform of the unit square into a parallelogram with vertices at (0, 0), (a, b), (a + c, b + d), and (c, d). The parallelogram pictured at the right is obtained by multiplying A with each of the column vectors (0, 0)T, (1, 0)T, (1, 1)T, and (0, 1)T in turn. These vectors define the vertices of the unit square.
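A sketch of this unit-square picture, assuming NumPy; the shear matrix chosen here is an illustrative stand-in for the generic entries a, b, c, d:

```python
import numpy as np

a, b, c, d = 1.0, 0.0, 1.0, 1.0       # a horizontal shear, as an example
A = np.array([[a, c],
              [b, d]])                # columns are the images of (1,0) and (0,1)

# Corners of the unit square as column vectors
square = np.array([[0.0, 1.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0, 1.0]])

# Images are the parallelogram vertices (0,0), (a,b), (a+c,b+d), (c,d)
parallelogram = A @ square
```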

The following table shows several 2×2 real matrices with the associated linear maps of R2. The blue original is mapped to the green grid and shapes. The origin (0,0) is marked with a black point.

Horizontal shear with m = 1.25; reflection through the vertical axis; squeeze mapping with r = 3/2; scaling by a factor of 3/2; rotation by π/6 = 30°.

Under the 1-to-1 correspondence between matrices and linear maps, matrix multiplication corresponds to composition of maps: if a k-by-m matrix B represents another linear map g: Rm → Rk, then the composition g ∘ f is represented by BA since

(g ∘ f)(x) = g(f(x)) = g(Ax) = B(Ax) = (BA)x.

The last equality follows from the above-mentioned associativity of matrix multiplication.
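This correspondence can be verified numerically (a NumPy sketch with illustrative matrices):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])          # represents f: R^2 -> R^2
B = np.array([[0, 1], [1, 1], [2, 0]])  # represents g: R^2 -> R^3

x = np.array([5, 6])

# Composing the maps pointwise agrees with applying the product BA
composed_pointwise = B @ (A @ x)
composed_matrix = (B @ A) @ x
```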

The rank of a matrix A is the maximum number of linearly independent row vectors of the matrix, which is the same as the maximum number of linearly independent column vectors. Equivalently it is the dimension of the image of the linear map represented by A. The rank–nullity theorem states that the dimension of the kernel of a matrix plus the rank equals the number of columns of the matrix.
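The rank–nullity relation can be checked with NumPy on a small example of my own:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [2, 4, 6],
              [1, 0, 1]])   # row 2 is twice row 1, so only two rows are independent

rank = np.linalg.matrix_rank(A)
nullity = A.shape[1] - rank   # rank-nullity: rank + dim(kernel) = number of columns
```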

Square matrix

A square matrix is a matrix with the same number of rows and columns. An n-by-n matrix is known as a square matrix of order n. Any two square matrices of the same order can be added and multiplied. The entries aii form the main diagonal of a square matrix. They lie on the imaginary line that runs from the top left corner to the bottom right corner of the matrix.

Main types

The main types include the diagonal matrix, the lower triangular matrix, and the upper triangular matrix.

Diagonal and triangular matrix

If all entries of A below the main diagonal are zero, A is called an upper triangular matrix. Similarly if all entries of A above the main diagonal are zero, A is called a lower triangular matrix. If all entries outside the main diagonal are zero, A is called a diagonal matrix.

Identity matrix

The identity matrix In of size n is the n-by-n matrix in which all the elements on the main diagonal are equal to 1 and all other elements are equal to 0, for example,

It is a square matrix of order n, and also a special kind of diagonal matrix. It is called an identity matrix because multiplication with it leaves a matrix unchanged:

AIn = ImA = A for any m-by-n matrix A.

A nonzero scalar multiple of an identity matrix is called a scalar matrix. If the matrix entries come from a field, the scalar matrices form a group, under matrix multiplication, that is isomorphic to the multiplicative group of nonzero elements of the field.
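A brief NumPy sketch of the identity property and a scalar matrix (illustrative sizes):

```python
import numpy as np

I3 = np.eye(3)                     # the identity matrix I_3
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])    # an arbitrary 2-by-3 matrix

# A I_n = I_m A = A for an m-by-n matrix A
identity_property = np.allclose(A @ I3, A) and np.allclose(np.eye(2) @ A, A)

scalar_matrix = 5.0 * I3           # a scalar matrix: 5 on the diagonal, 0 elsewhere
```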

Symmetric or skew-symmetric matrix

A square matrix A that is equal to its transpose, that is, A = AT, is a symmetric matrix. If instead, A is equal to the negative of its transpose, that is, A = −AT, then A is a skew-symmetric matrix. In complex matrices, symmetry is often replaced by the concept of Hermitian matrices, which satisfy A = A*, where the star or asterisk denotes the conjugate transpose of the matrix, that is, the transpose of the complex conjugate of A.

By the spectral theorem, real symmetric matrices and complex Hermitian matrices have an eigenbasis; that is, every vector is expressible as a linear combination of eigenvectors. In both cases, all eigenvalues are real. This theorem can be generalized to infinite-dimensional situations related to matrices with infinitely many rows and columns.
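A sketch of the spectral theorem's finite-dimensional statement for a small real symmetric matrix, assuming NumPy:

```python
import numpy as np

S = np.array([[2.0, 1.0], [1.0, 2.0]])   # real symmetric

# eigh is specialized for symmetric/Hermitian matrices and returns real eigenvalues
eigenvalues, eigenvectors = np.linalg.eigh(S)

all_real = np.isrealobj(eigenvalues)
# The columns of `eigenvectors` form an orthonormal eigenbasis
orthonormal = np.allclose(eigenvectors.T @ eigenvectors, np.eye(2))
```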

Invertible matrix and its inverse

A square matrix A is called invertible or non-singular if there exists a matrix B such that

AB = BA = In ,

where In is the n×n identity matrix with 1s on the main diagonal and 0s elsewhere. If B exists, it is unique and is called the inverse matrix of A, denoted A−1.
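Numerically, the defining property of the inverse can be checked as follows (a NumPy sketch with an illustrative matrix):

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])   # det = -2, so A is invertible

A_inv = np.linalg.inv(A)

# AB = BA = I_n with B = A^-1
left = np.allclose(A_inv @ A, np.eye(2))
right = np.allclose(A @ A_inv, np.eye(2))
```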

Definite matrix

Positive definite matrix: Q(x, y) = x2 + y2; the points with Q(x, y) = 1 form an ellipse.
Indefinite matrix: Q(x, y) = x2 − y2; the points with Q(x, y) = 1 form a hyperbola.

A symmetric real matrix A is called positive-definite if the associated quadratic form

f(x) = xTA x

has a positive value for every nonzero vector x in Rn. If f(x) only yields negative values, then A is negative-definite; if f produces both negative and positive values, then A is indefinite. If the quadratic form f yields only non-negative values (positive or zero), the symmetric matrix is called positive-semidefinite (or, if only non-positive values, negative-semidefinite); hence the matrix is indefinite precisely when it is neither positive-semidefinite nor negative-semidefinite.

A symmetric matrix is positive-definite if and only if all its eigenvalues are positive, that is, the matrix is positive-semidefinite and it is invertible. The table at the right shows two possibilities for 2-by-2 matrices.
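The eigenvalue criterion suggests a simple numerical test; the helper function below is my own illustration, assuming NumPy:

```python
import numpy as np

def is_positive_definite(A: np.ndarray) -> bool:
    """A symmetric A is positive-definite iff all its eigenvalues are positive."""
    return bool(np.all(np.linalg.eigvalsh(A) > 0))

P = np.array([[1.0, 0.0], [0.0, 1.0]])    # Q(x, y) = x^2 + y^2, positive-definite
J = np.array([[1.0, 0.0], [0.0, -1.0]])   # Q(x, y) = x^2 - y^2, indefinite
```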

Allowing as input two different vectors instead yields the bilinear form associated to A:

BA (x, y) = xTAy.

In the case of complex matrices, the same terminology and result apply, with symmetric matrix, quadratic form, bilinear form, and transpose xT replaced respectively by Hermitian matrix, Hermitian form, sesquilinear form, and conjugate transpose xH.

Orthogonal matrix

An orthogonal matrix is a square matrix with real entries whose columns and rows are orthogonal unit vectors (that is, orthonormal vectors). Equivalently, a matrix A is orthogonal if its transpose is equal to its inverse:

AT = A−1,

which entails

ATA = AAT = In,

where In is the identity matrix of size n.

An orthogonal matrix A is necessarily invertible (with inverse A−1 = AT), unitary (A−1 = A*), and normal (A*A = AA*). The determinant of any orthogonal matrix is either +1 or −1. A special orthogonal matrix is an orthogonal matrix with determinant +1. As a linear transformation, every orthogonal matrix with determinant +1 is a pure rotation without reflection, i.e., the transformation preserves the orientation of the transformed structure, while every orthogonal matrix with determinant −1 reverses the orientation, i.e., is a composition of a pure reflection and a (possibly null) rotation. The identity matrices have determinant 1, and are pure rotations by an angle zero.

The complex analogue of an orthogonal matrix is a unitary matrix.
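A rotation matrix gives a concrete check of these properties (a NumPy sketch; the angle is arbitrary):

```python
import numpy as np

theta = np.pi / 6   # a rotation by 30 degrees
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

is_orthogonal = np.allclose(Q.T @ Q, np.eye(2))   # Q^T Q = I, i.e. Q^T = Q^-1
det_Q = np.linalg.det(Q)                          # +1 for a pure rotation
```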

Main operations

Trace

The trace tr(A) of a square matrix A is the sum of its diagonal entries. While matrix multiplication is not commutative as mentioned above, the trace of the product of two matrices is independent of the order of the factors:

tr(AB) = tr(BA).

This is immediate from the definition of matrix multiplication:

tr(AB) = Σi,j AijBji = tr(BA).

It follows that the trace of the product of more than two matrices is independent of cyclic permutations of the matrices; however, this does not in general apply to arbitrary permutations (for example, tr(ABC) ≠ tr(BAC), in general). Also, the trace of a matrix is equal to that of its transpose, that is,

tr(A) = tr(AT).
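Both trace identities can be verified on small integer matrices (a NumPy sketch with illustrative values):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
C = np.array([[1, 0], [2, 1]])

cyclic = np.trace(A @ B) == np.trace(B @ A)        # tr(AB) = tr(BA)
transpose_invariant = np.trace(A) == np.trace(A.T)  # tr(A) = tr(A^T)

# Cyclic permutations preserve the trace of longer products
tr_abc = np.trace(A @ B @ C)
tr_cab = np.trace(C @ A @ B)   # a cyclic shift of ABC
```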

Determinant

A linear transformation on R2 given by the indicated matrix. The determinant of this matrix is −1, as the area of the green parallelogram at the right is 1, but the map reverses the orientation, since it turns the counterclockwise orientation of the vectors to a clockwise one.

The determinant of a square matrix A (denoted det(A) or |A|) is a number encoding certain properties of the matrix. A matrix is invertible if and only if its determinant is nonzero. Its absolute value equals the area (in R2) or volume (in R3) of the image of the unit square (or cube), while its sign corresponds to the orientation of the corresponding linear map: the determinant is positive if and only if the orientation is preserved.

The determinant of the 2-by-2 matrix with rows (a, b) and (c, d) is given by

ad − bc.

The determinant of 3-by-3 matrices involves 6 terms (rule of Sarrus). The lengthier Leibniz formula generalises these two formulae to all dimensions.

The determinant of a product of square matrices equals the product of their determinants:

det(AB) = det(A) · det(B).

Adding a multiple of any row to another row, or a multiple of any column to another column, does not change the determinant. Interchanging two rows or two columns affects the determinant by multiplying it by −1. Using these operations, any matrix can be transformed to a lower (or upper) triangular matrix, and for such matrices, the determinant equals the product of the entries on the main diagonal; this provides a method to calculate the determinant of any matrix.

Finally, the Laplace expansion expresses the determinant in terms of minors, that is, determinants of smaller matrices. This expansion can be used for a recursive definition of determinants (taking as the starting case the determinant of a 1-by-1 matrix, which is its unique entry, or even the determinant of a 0-by-0 matrix, which is 1), which can be seen to be equivalent to the Leibniz formula.

Determinants can be used to solve linear systems using Cramer's rule, where the division of the determinants of two related square matrices equates to the value of each of the system's variables.
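Several of these determinant properties can be checked numerically (a NumPy sketch; the matrices are illustrative):

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[2.0, 0.0], [1.0, 1.0]])

# det of the 2-by-2 matrix with rows (a, b) and (c, d) is ad - bc
det_A = np.linalg.det(A)          # 1*4 - 2*3 = -2

# det(AB) = det(A) * det(B)
multiplicative = np.isclose(np.linalg.det(A @ B), det_A * np.linalg.det(B))

# Adding a multiple of one row to another leaves the determinant unchanged
A2 = A.copy()
A2[1] = A2[1] + 5 * A2[0]
row_op_invariant = np.isclose(np.linalg.det(A2), det_A)
```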

Eigenvalues and eigenvectors

A number λ and a non-zero vector v satisfying

Av = λv

are called an eigenvalue and an eigenvector of A, respectively. The number λ is an eigenvalue of an n×n-matrix A if and only if A − λIn is not invertible, which is equivalent to

det(A − λIn) = 0.

The polynomial pA in an indeterminate X given by evaluation of the determinant det(XIn − A) is called the characteristic polynomial of A. It is a monic polynomial of degree n. Therefore the polynomial equation pA(λ) = 0 has at most n different solutions, that is, eigenvalues of the matrix. They may be complex even if the entries of A are real. According to the Cayley–Hamilton theorem, pA(A) = 0, that is, the result of substituting the matrix itself into its own characteristic polynomial yields the zero matrix.
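The Cayley–Hamilton theorem can be verified numerically; the sketch below (assuming NumPy, with an illustrative matrix) builds pA(A) from the characteristic-polynomial coefficients returned by numpy.poly:

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])

# Characteristic polynomial coefficients, highest degree first: X^2 - 5X - 2
coeffs = np.poly(A)

# Cayley-Hamilton: substituting A into its own characteristic
# polynomial yields the zero matrix
n = A.shape[0]
p_of_A = sum(c * np.linalg.matrix_power(A, n - k) for k, c in enumerate(coeffs))
cayley_hamilton = np.allclose(p_of_A, np.zeros((n, n)))
```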

Resources

Licensing

Content obtained and/or adapted from: