==Basic operations==

There are a number of basic operations that can be applied to modify matrices, called ''matrix addition'', ''scalar multiplication'', ''transposition'', ''matrix multiplication'', ''row operations'', and taking a ''submatrix''.

===Addition, scalar multiplication, and transposition===

{| class="wikitable"
|+Operations performed on matrices
|-
!scope="col"| Operation
!scope="col"| Definition
!scope="col"| Example
|-
!scope="row"| Addition
| The ''sum'' '''A''' + '''B''' of two ''m''-by-''n'' matrices '''A''' and '''B''' is calculated entrywise:
:('''A''' + '''B''')<sub>''i'',''j''</sub> = '''A'''<sub>''i'',''j''</sub> + '''B'''<sub>''i'',''j''</sub>, where 1 ≤ ''i'' ≤ ''m'' and 1 ≤ ''j'' ≤ ''n''.
|
<math>
\begin{bmatrix}
1 & 3 & 1 \\
1 & 0 & 0
\end{bmatrix}
+
\begin{bmatrix}
0 & 0 & 5 \\
7 & 5 & 0
\end{bmatrix}
=
\begin{bmatrix}
1+0 & 3+0 & 1+5 \\
1+7 & 0+5 & 0+0
\end{bmatrix}
=
\begin{bmatrix}
1 & 3 & 6 \\
8 & 5 & 0
\end{bmatrix}
</math>
|-
!scope="row"| Scalar multiplication
| The product ''c'''''A''' of a number ''c'' (also called a scalar in the parlance of abstract algebra) and a matrix '''A''' is computed by multiplying every entry of '''A''' by ''c'':
:(''c'''''A''')<sub>''i'',''j''</sub> = ''c'' · '''A'''<sub>''i'',''j''</sub>.
This operation is called ''scalar multiplication'', but its result is not named "scalar product" to avoid confusion, since "scalar product" is sometimes used as a synonym for "inner product".
| <math>2 \cdot
\begin{bmatrix}
1 & 8 & -3 \\
4 & -2 & 5
\end{bmatrix}
=
\begin{bmatrix}
2 \cdot 1 & 2\cdot 8 & 2\cdot (-3) \\
2\cdot 4 & 2\cdot (-2) & 2\cdot 5
\end{bmatrix}
=
\begin{bmatrix}
2 & 16 & -6 \\
8 & -4 & 10
\end{bmatrix}
</math>
|-
!scope="row"| Transposition
| The ''transpose'' of an ''m''-by-''n'' matrix '''A''' is the ''n''-by-''m'' matrix '''A'''<sup>T</sup> (also denoted '''A'''<sup>tr</sup> or <sup>t</sup>'''A''') formed by turning rows into columns and vice versa:
:('''A'''<sup>T</sup>)<sub>''i'',''j''</sub> = '''A'''<sub>''j'',''i''</sub>.
| <math>
\begin{bmatrix}
1 & 2 & 3 \\
0 & -6 & 7
\end{bmatrix}^\mathrm{T} =
\begin{bmatrix}
1 & 0 \\
2 & -6 \\
3 & 7
\end{bmatrix}
</math>
|}

Familiar properties of numbers extend to these operations on matrices: for example, addition is commutative, that is, the matrix sum does not depend on the order of the summands: '''A''' + '''B''' = '''B''' + '''A'''.
The transpose is compatible with addition and scalar multiplication, as expressed by (''c'''''A''')<sup>T</sup> = ''c''('''A'''<sup>T</sup>) and ('''A''' + '''B''')<sup>T</sup> = '''A'''<sup>T</sup> + '''B'''<sup>T</sup>. Finally, ('''A'''<sup>T</sup>)<sup>T</sup> = '''A'''.
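
These identities are easy to check numerically. The following short sketch uses the NumPy library (an illustrative tool choice, not part of the article) and the example matrices from the table above:
<syntaxhighlight lang="python">
import numpy as np

A = np.array([[1, 3, 1],
              [1, 0, 0]])
B = np.array([[0, 0, 5],
              [7, 5, 0]])

# Addition is entrywise and commutative: A + B = B + A.
assert np.array_equal(A + B, B + A)
assert np.array_equal(A + B, np.array([[1, 3, 6], [8, 5, 0]]))

# Transposition is compatible with addition and scalar multiplication.
c = 2
assert np.array_equal((c * A).T, c * (A.T))   # (cA)^T = c(A^T)
assert np.array_equal((A + B).T, A.T + B.T)   # (A + B)^T = A^T + B^T
assert np.array_equal(A.T.T, A)               # (A^T)^T = A
</syntaxhighlight>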
+ | |||
+ | ===Matrix multiplication=== | ||
+ | [[File:MatrixMultiplication.png|thumb|300px|Schematic depiction of the matrix product '''AB''' of two matrices '''A''' and '''B'''.]] | ||
+ | ''Multiplication'' of two matrices is defined if and only if the number of columns of the left matrix is the same as the number of rows of the right matrix. If '''A''' is an ''m''-by-''n'' matrix and '''B''' is an ''n''-by-''p'' matrix, then their ''matrix product'' '''AB''' is the ''m''-by-''p'' matrix whose entries are given by dot product of the corresponding row of '''A''' and the corresponding column of '''B''': | ||
+ | :<span id="matrix_product"><math>[\mathbf{AB}]_{i,j} = a_{i,1}b_{1,j} + a_{i,2}b_{2,j} + \cdots + a_{i,n}b_{n,j} = \sum_{r=1}^n a_{i,r}b_{r,j},</math></span> | ||
+ | |||
+ | where 1 ≤ ''i'' ≤ ''m'' and 1 ≤ ''j'' ≤ ''p''. For example, the underlined entry 2340 in the product is calculated as (2 × 1000) + (3 × 100) + (4 × 10) = 2340: | ||
+ | :<math> | ||
+ | \begin{align} | ||
+ | \begin{bmatrix} | ||
+ | \underline{2} & \underline 3 & \underline 4 \\ | ||
+ | 1 & 0 & 0 \\ | ||
+ | \end{bmatrix} | ||
+ | |||
+ | \begin{bmatrix} | ||
+ | 0 & \underline{1000} \\ | ||
+ | 1 & \underline{100} \\ | ||
+ | 0 & \underline{10} \\ | ||
+ | \end{bmatrix} | ||
+ | &= | ||
+ | \begin{bmatrix} | ||
+ | 3 & \underline{2340} \\ | ||
+ | 0 & 1000 \\ | ||
+ | \end{bmatrix}. | ||
+ | \end{align} | ||
+ | </math> | ||
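
The same product can be reproduced numerically; a NumPy sketch (illustrative tool choice) that recovers the underlined entry:
<syntaxhighlight lang="python">
import numpy as np

A = np.array([[2, 3, 4],
              [1, 0, 0]])
B = np.array([[0, 1000],
              [1,  100],
              [0,   10]])

AB = A @ B        # matrix product, a 2-by-2 matrix
print(AB)         # [[   3 2340]
                  #  [   0 1000]]
assert AB[0, 1] == 2*1000 + 3*100 + 4*10   # the underlined entry 2340
</syntaxhighlight>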
+ | |||
+ | Matrix multiplication satisfies the rules ('''AB''')'''C''' = '''A'''('''BC''') (associativity), and ('''A''' + '''B''')'''C''' = '''AC''' + '''BC''' as well as '''C'''('''A''' + '''B''') = '''CA''' + '''CB''' (left and right distributivity), whenever the size of the matrices is such that the various products are defined. The product '''AB''' may be defined without '''BA''' being defined, namely if '''A''' and '''B''' are ''m''-by-''n'' and ''n''-by-''k'' matrices, respectively, and ''m'' ≠ ''k''. Even if both products are defined, they generally need not be equal, that is: | ||
+ | :'''AB''' ≠ '''BA''', | ||
+ | |||
+ | In other words, <span id="non_commutative">matrix multiplication is not commutative,</span> in marked contrast to (rational, real, or complex) numbers, whose product is independent of the order of the factors. An example of two matrices not commuting with each other is: | ||
+ | :<math>\begin{bmatrix} | ||
+ | 1 & 2\\ | ||
+ | 3 & 4\\ | ||
+ | \end{bmatrix} | ||
+ | |||
+ | \begin{bmatrix} | ||
+ | 0 & 1\\ | ||
+ | 0 & 0\\ | ||
+ | \end{bmatrix}= | ||
+ | \begin{bmatrix} | ||
+ | 0 & 1\\ | ||
+ | 0 & 3\\ | ||
+ | \end{bmatrix}, | ||
+ | </math> | ||
+ | whereas | ||
+ | :<math>\begin{bmatrix} | ||
+ | 0 & 1\\ | ||
+ | 0 & 0\\ | ||
+ | \end{bmatrix} | ||
+ | |||
+ | \begin{bmatrix} | ||
+ | 1 & 2\\ | ||
+ | 3 & 4\\ | ||
+ | \end{bmatrix}= | ||
+ | \begin{bmatrix} | ||
+ | 3 & 4\\ | ||
+ | 0 & 0\\ | ||
+ | \end{bmatrix}. | ||
+ | </math> | ||
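
A numerical sketch of the same non-commuting pair (again using NumPy purely for illustration):
<syntaxhighlight lang="python">
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [0, 0]])

print(A @ B)   # [[0 1]
               #  [0 3]]
print(B @ A)   # [[3 4]
               #  [0 0]]
assert not np.array_equal(A @ B, B @ A)   # matrix multiplication is not commutative
</syntaxhighlight>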
+ | |||
+ | Besides the ordinary matrix multiplication just described, other less frequently used operations on matrices that can be considered forms of multiplication also exist, such as the Hadamard product and the Kronecker product. They arise in solving matrix equations such as the Sylvester equation. | ||
+ | |||
+ | ===Row operations=== | ||
+ | There are three types of row operations: | ||
+ | # row addition, that is adding a row to another. | ||
+ | # row multiplication, that is multiplying all entries of a row by a non-zero constant; | ||
+ | # row switching, that is interchanging two rows of a matrix; | ||
+ | These operations are used in several ways, including solving linear equations and finding matrix inverses. | ||
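
A sketch of how the three operations look in code (NumPy arrays are used purely for illustration; the particular rows and multipliers are our own example):
<syntaxhighlight lang="python">
import numpy as np

M = np.array([[1.0, 2.0],
              [3.0, 4.0]])

M[1] += 2 * M[0]        # row addition: add a multiple of row 1 to row 2
M[0] *= 0.5             # row multiplication: scale row 1 by a non-zero constant
M[[0, 1]] = M[[1, 0]]   # row switching: interchange rows 1 and 2

print(M)
</syntaxhighlight>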
+ | |||
+ | === Submatrix=== | ||
+ | A '''submatrix''' of a matrix is obtained by deleting any collection of rows and/or columns. For example, from the following 3-by-4 matrix, we can construct a 2-by-3 submatrix by removing row 3 and column 2: | ||
+ | :<math>\mathbf{A}=\begin{bmatrix} | ||
+ | 1 & \color{red}{2} & 3 & 4 \\ | ||
+ | 5 & \color{red}{6} & 7 & 8 \\ | ||
+ | \color{red}{9} & \color{red}{10} & \color{red}{11} & \color{red}{12} | ||
+ | \end{bmatrix} \rightarrow \begin{bmatrix} | ||
+ | 1 & 3 & 4 \\ | ||
+ | 5 & 7 & 8 | ||
+ | \end{bmatrix}. | ||
+ | </math> | ||
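
The same submatrix can be extracted programmatically, for instance with NumPy's <code>delete</code> (the tool choice is illustrative):
<syntaxhighlight lang="python">
import numpy as np

A = np.array([[1,  2,  3,  4],
              [5,  6,  7,  8],
              [9, 10, 11, 12]])

sub = np.delete(A, 2, axis=0)    # remove row 3 (index 2)
sub = np.delete(sub, 1, axis=1)  # remove column 2 (index 1)
print(sub)                       # [[1 3 4]
                                 #  [5 7 8]]
</syntaxhighlight>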
+ | |||
+ | The minors and cofactors of a matrix are found by computing the determinant of certain submatrices. | ||
+ | |||
+ | A '''principal submatrix''' is a square submatrix obtained by removing certain rows and columns. The definition varies from author to author. According to some authors, a principal submatrix is a submatrix in which the set of row indices that remain is the same as the set of column indices that remain. Other authors define a principal submatrix as one in which the first ''k'' rows and columns, for some number ''k'', are the ones that remain; this type of submatrix has also been called a '''leading principal submatrix'''. | ||
+ | |||
+ | ==Linear equations== | ||
+ | Matrices can be used to compactly write and work with multiple linear equations, that is, systems of linear equations. For example, if '''A''' is an ''m''-by-''n'' matrix, '''x''' designates a column vector (that is, ''n''×1-matrix) of ''n'' variables ''x''<sub>1</sub>, ''x''<sub>2</sub>, ..., ''x''<sub>''n''</sub>, and '''b''' is an ''m''×1-column vector, then the matrix equation | ||
+ | :<math>\mathbf{Ax} = \mathbf{b}</math> | ||
+ | |||
+ | is equivalent to the system of linear equations | ||
+ | :<math>\begin{align} | ||
+ | a_{1,1}x_1 + a_{1,2}x_2 + &\cdots + a_{1,n}x_n = b_1 \\ | ||
+ | &\ \ \vdots \\ | ||
+ | a_{m,1}x_1 + a_{m,2}x_2 + &\cdots + a_{m,n}x_n = b_m | ||
+ | \end{align}</math> | ||
+ | |||
+ | Using matrices, this can be solved more compactly than would be possible by writing out all the equations separately. If ''n'' = ''m'' and the equations are independent, then this can be done by writing | ||
+ | :<math>\mathbf{x} = \mathbf{A}^{-1} \mathbf{b}</math> | ||
+ | |||
+ | where '''A'''<sup>−1</sup> is the inverse matrix of '''A'''. If '''A''' has no inverse, solutions—if any—can be found using its generalized inverse. | ||
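
For a concrete instance, the following sketch solves a small system '''Ax''' = '''b''' numerically (NumPy is an assumed, illustrative tool, and the matrix and right-hand side are our own example; in practice one calls a solver rather than forming '''A'''<sup>−1</sup> explicitly):
<syntaxhighlight lang="python">
import numpy as np

A = np.array([[1, 2],
              [3, 5]])     # n = m = 2, invertible (determinant -1)
b = np.array([1, 2])

x = np.linalg.solve(A, b)             # solves Ax = b; here x = [-1, 1]
assert np.allclose(A @ x, b)

x_via_inverse = np.linalg.inv(A) @ b  # the textbook formula x = A^{-1} b
assert np.allclose(x, x_via_inverse)

# If A has no inverse, np.linalg.pinv(A) gives a generalized (pseudo-)inverse.
</syntaxhighlight>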
+ | |||
+ | ==Linear transformations== | ||
+ | [[File:Area parallellogram as determinant.svg|thumb|right|The vectors represented by a 2-by-2 matrix correspond to the sides of a unit square transformed into a parallelogram.]] | ||
+ | Matrices and matrix multiplication reveal their essential features when related to ''linear transformations'', also known as ''linear maps''. <span id="linear_maps">A real ''m''-by-''n'' matrix '''A''' gives rise to a linear transformation '''R'''<sup>''n''</sup> → '''R'''<sup>''m''</sup> mapping each vector '''x''' in '''R'''<sup>''n''</sup> to the (matrix) product '''Ax''', which is a vector in '''R'''<sup>''m''</sup>. Conversely, each linear transformation ''f'': '''R'''<sup>''n''</sup> → '''R'''<sup>''m''</sup> arises from a unique ''m''-by-''n'' matrix '''A''': explicitly, the (''i'', ''j'')-entry of '''A''' is the ''i''<sup>th</sup> coordinate of ''f''('''e'''<sub>''j''</sub>), where '''e'''<sub>''j''</sub> = (0,...,0,1,0,...,0) is the unit vector with 1 in the ''j''<sup>th</sup> position and 0 elsewhere.</span> The matrix '''A''' is said to represent the linear map ''f'', and '''A''' is called the ''transformation matrix'' of ''f''. | ||
+ | |||
+ | For example, the 2×2 matrix | ||
+ | :<math>\mathbf{A} = \begin{bmatrix} a & c\\b & d \end{bmatrix}</math> | ||
+ | |||
+ | can be viewed as the transform of the unit square into a parallelogram with vertices at (0, 0), (''a'', ''b''), (''a'' + ''c'', ''b'' + ''d''), and (''c'', ''d''). The parallelogram pictured at the right is obtained by multiplying '''A''' with each of the column vectors <math>\begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \end{bmatrix}</math>, and <math>\begin{bmatrix}0 \\ 1\end{bmatrix}</math> in turn. These vectors define the vertices of the unit square. | ||
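
A short numerical sketch of this picture (NumPy used for illustration; the entries ''a'' = 2, ''b'' = 1, ''c'' = 1, ''d'' = 3 are our own example values):
<syntaxhighlight lang="python">
import numpy as np

a, b, c, d = 2, 1, 1, 3
A = np.array([[a, c],
              [b, d]])

# Corners of the unit square as column vectors: (0,0), (1,0), (1,1), (0,1).
corners = np.array([[0, 1, 1, 0],
                    [0, 0, 1, 1]])

print(A @ corners)   # columns are (0,0), (a,b), (a+c,b+d), (c,d):
                     # [[0 2 3 1]
                     #  [0 1 4 3]]
</syntaxhighlight>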
+ | |||
+ | The following table shows several 2×2 real matrices with the associated linear maps of '''R'''<sup>2</sup>. The blue original is mapped to the green grid and shapes. The origin (0,0) is marked with a black point. | ||
+ | {| class="wikitable" style="text-align:center; margin:1em auto 1em auto;" | ||
+ | |- | ||
+ | | Horizontal shear<br>with ''m'' = 1.25. | ||
+ | | Reflection through the vertical axis | ||
+ | | Squeeze mapping<br>with ''r'' = 3/2 | ||
+ | | Scaling<br>by a factor of 3/2 | ||
+ | |<span id="rotation_matrix">Rotation<br>by <math> \pi</math>/6 = 30°</span> | ||
+ | |- | ||
+ | | <math>\begin{bmatrix} | ||
+ | 1 & 1.25 \\ | ||
+ | 0 & 1 | ||
+ | \end{bmatrix}</math> | ||
+ | | <math>\begin{bmatrix} | ||
+ | -1 & 0 \\ | ||
+ | 0 & 1 | ||
+ | \end{bmatrix}</math> | ||
+ | | <math>\begin{bmatrix} | ||
+ | \frac{3}{2} & 0 \\ | ||
+ | 0 & \frac{2}{3} | ||
+ | \end{bmatrix}</math> | ||
+ | |<math>\begin{bmatrix} | ||
+ | \frac{3}{2} & 0 \\ | ||
+ | 0 & \frac{3}{2} | ||
+ | \end{bmatrix}</math> | ||
+ | |<math>\begin{bmatrix} | ||
+ | \cos\left(\frac{\pi}{6}\right) & -\sin\left(\frac{\pi}{6}\right) \\ | ||
+ | \sin\left(\frac{\pi}{6}\right) & \cos\left(\frac{\pi}{6}\right) | ||
+ | \end{bmatrix}</math> | ||
+ | |- | ||
+ | | width="20%" | [[File:VerticalShear m=1.25.svg|175px]] | ||
+ | | width="20%" | [[File:Flip map.svg|150px]] | ||
+ | | width="20%" | [[File:Squeeze r=1.5.svg|150px]] | ||
+ | | width="20%" | [[File:Scaling by 1.5.svg|125px]] | ||
+ | | width="20%" | [[File:Rotation by pi over 6.svg|125px]] | ||
+ | |} | ||
+ | |||
+ | Under the 1-to-1 correspondence between matrices and linear maps, matrix multiplication corresponds to composition of maps: if a ''k''-by-''m'' matrix '''B''' represents another linear map ''g'': '''R'''<sup>''m''</sup> → '''R'''<sup>''k''</sup>, then the composition ''g'' ∘ ''f'' is represented by '''BA''' since | ||
+ | :(''g'' ∘ ''f'')('''x''') = ''g''(''f''('''x''')) = ''g''('''Ax''') = '''B'''('''Ax''') = ('''BA''')'''x'''. | ||
+ | |||
+ | The last equality follows from the above-mentioned associativity of matrix multiplication. | ||
+ | |||
+ | The rank of a matrix '''A''' is the maximum number of linearly independent row vectors of the matrix, which is the same as the maximum number of linearly independent column vectors. Equivalently it is the dimension of the image of the linear map represented by '''A'''. The rank–nullity theorem states that the dimension of the kernel of a matrix plus the rank equals the number of columns of the matrix. | ||
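
For example, the rank and the rank–nullity theorem can be checked numerically; this NumPy sketch (illustrative tool choice) reuses the 3-by-4 matrix from the submatrix section:
<syntaxhighlight lang="python">
import numpy as np

A = np.array([[1,  2,  3,  4],
              [5,  6,  7,  8],
              [9, 10, 11, 12]])

rank = np.linalg.matrix_rank(A)      # 2: only two rows are linearly independent
nullity = A.shape[1] - rank          # dimension of the kernel
assert rank + nullity == A.shape[1]  # rank-nullity: rank + nullity = number of columns
print(rank, nullity)                 # 2 2
</syntaxhighlight>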
+ | |||
+ | ==Square matrix== | ||
+ | A square matrix is a matrix with the same number of rows and columns. An ''n''-by-''n'' matrix is known as a square matrix of order ''n.'' Any two square matrices of the same order can be added and multiplied. | ||
+ | The entries ''a''<sub>''ii''</sub> form the main diagonal of a square matrix. They lie on the imaginary line that runs from the top left corner to the bottom right corner of the matrix. | ||
+ | |||
+ | ===Main types=== | ||
+ | :{| class="wikitable" style="float:right; margin:0ex 0ex 2ex 2ex;" | ||
+ | |- | ||
+ | ! Name !! Example with ''n'' = 3 | ||
+ | |- | ||
+ | | Diagonal matrix || style="text-align:center;" | <math> | ||
+ | \begin{bmatrix} | ||
+ | a_{11} & 0 & 0 \\ | ||
+ | 0 & a_{22} & 0 \\ | ||
+ | 0 & 0 & a_{33} \\ | ||
+ | \end{bmatrix} | ||
+ | </math> | ||
+ | |- | ||
+ | | Lower triangular matrix || style="text-align:center;" | <math> | ||
+ | \begin{bmatrix} | ||
+ | a_{11} & 0 & 0 \\ | ||
+ | a_{21} & a_{22} & 0 \\ | ||
+ | a_{31} & a_{32} & a_{33} \\ | ||
+ | \end{bmatrix} | ||
+ | </math> | ||
+ | |- | ||
+ | | Upper triangular matrix || style="text-align:center;" | <math> | ||
+ | \begin{bmatrix} | ||
+ | a_{11} & a_{12} & a_{13} \\ | ||
+ | 0 & a_{22} & a_{23} \\ | ||
+ | 0 & 0 & a_{33} \\ | ||
+ | \end{bmatrix} | ||
+ | </math> | ||
+ | |} | ||
+ | |||
+ | ====Diagonal and triangular matrix==== | ||
+ | If all entries of '''A''' below the main diagonal are zero, '''A''' is called an ''upper triangular matrix''. Similarly if all entries of ''A'' above the main diagonal are zero, '''A''' is called a ''lower triangular matrix''. If all entries outside the main diagonal are zero, '''A''' is called a diagonal matrix. | ||
+ | |||
+ | ====Identity matrix==== | ||
+ | The ''identity matrix'' '''I'''<sub>''n''</sub> of size ''n'' is the ''n''-by-''n'' matrix in which all the elements on the main diagonal are equal to 1 and all other elements are equal to 0, for example, | ||
+ | :<math> | ||
+ | \mathbf{I}_1 = \begin{bmatrix} 1 \end{bmatrix}, | ||
+ | \ \mathbf{I}_2 = \begin{bmatrix} | ||
+ | 1 & 0 \\ | ||
+ | 0 & 1 | ||
+ | \end{bmatrix}, | ||
+ | \ \ldots , | ||
+ | \ \mathbf{I}_n = \begin{bmatrix} | ||
+ | 1 & 0 & \cdots & 0 \\ | ||
+ | 0 & 1 & \cdots & 0 \\ | ||
+ | \vdots & \vdots & \ddots & \vdots \\ | ||
+ | 0 & 0 & \cdots & 1 | ||
+ | \end{bmatrix} | ||
+ | </math> | ||
+ | It is a square matrix of order ''n'', and also a special kind of diagonal matrix. It is called an identity matrix because multiplication with it leaves a matrix unchanged: | ||
+ | :'''AI'''<sub>''n''</sub> = '''I'''<sub>''m''</sub>'''A''' = '''A''' for any ''m''-by-''n'' matrix '''A'''. | ||
+ | |||
+ | A nonzero scalar multiple of an identity matrix is called a ''scalar'' matrix. If the matrix entries come from a field, the scalar matrices form a group, under matrix multiplication, that is isomorphic to the multiplicative group of nonzero elements of the field. | ||
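
A quick numerical check of these statements (NumPy sketch, illustrative only):
<syntaxhighlight lang="python">
import numpy as np

A = np.array([[1, 3, 1],
              [1, 0, 0]])        # an m-by-n matrix with m = 2, n = 3

I2 = np.eye(2, dtype=int)        # the identity matrix I_m
I3 = np.eye(3, dtype=int)        # the identity matrix I_n

assert np.array_equal(A @ I3, A) # A I_n = A
assert np.array_equal(I2 @ A, A) # I_m A = A

S = 5 * np.eye(2)                # a scalar matrix: 5 on the diagonal, 0 elsewhere
</syntaxhighlight>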
+ | |||
+ | ====Symmetric or skew-symmetric matrix==== | ||
+ | A square matrix '''A''' that is equal to its transpose, that is, '''A''' = '''A'''<sup>T</sup>, is a symmetric matrix. If instead, '''A''' is equal to the negative of its transpose, that is, '''A''' = −'''A'''<sup>T</sup>, then '''A''' is a skew-symmetric matrix. In complex matrices, symmetry is often replaced by the concept of Hermitian matrices, which satisfy '''A'''<sup>∗</sup> = '''A''', where the star or asterisk denotes the conjugate transpose of the matrix, that is, the transpose of the complex conjugate of '''A'''. | ||
+ | |||
+ | By the spectral theorem, real symmetric matrices and complex Hermitian matrices have an eigenbasis; that is, every vector is expressible as a linear combination of eigenvectors. In both cases, all eigenvalues are real. This theorem can be generalized to infinite-dimensional situations related to matrices with infinitely many rows and columns. | ||
+ | |||
+ | ====Invertible matrix and its inverse==== | ||
+ | A square matrix '''A''' is called ''invertible'' or ''non-singular'' if there exists a matrix '''B''' such that | ||
+ | :'''AB''' = '''BA''' = '''I'''<sub>''n''</sub> , | ||
+ | where '''I'''<sub>''n''</sub> is the ''n''×''n'' identity matrix with 1s on the main diagonal and 0s elsewhere. If '''B''' exists, it is unique and is called the ''inverse matrix'' of '''A''', denoted '''A'''<sup>−1</sup>. | ||
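
Numerically, the inverse can be computed and checked as follows (NumPy sketch; the matrix is our own example):
<syntaxhighlight lang="python">
import numpy as np

A = np.array([[1., 2.],
              [3., 5.]])

B = np.linalg.inv(A)                  # the inverse matrix A^{-1} = [[-5, 2], [3, -1]]
assert np.allclose(A @ B, np.eye(2))  # AB = I_2
assert np.allclose(B @ A, np.eye(2))  # BA = I_2
</syntaxhighlight>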
+ | |||
+ | ====Definite matrix==== | ||
+ | {| class="wikitable" style="float:right; text-align:center; margin:0ex 0ex 2ex 2ex;" | ||
+ | |- | ||
+ | ! Positive definite matrix !! Indefinite matrix | ||
+ | |- | ||
+ | | <math>\begin{bmatrix} | ||
+ | \frac{1}{4} & 0 \\ | ||
+ | 0 & 1 \\ | ||
+ | \end{bmatrix}</math> | ||
+ | | <math>\begin{bmatrix} | ||
+ | \frac{1}{4} & 0 \\ | ||
+ | 0 & -\frac{1}{4} | ||
+ | \end{bmatrix}</math> | ||
+ | |- | ||
+ | | ''Q''(''x'', ''y'') = <math> \tfrac{1}{4}</math> ''x''<sup>2</sup> + ''y''<sup>2</sup> | ||
+ | | ''Q''(''x'', ''y'') = <math> \tfrac{1}{4}</math> ''x''<sup>2</sup> - <math> \tfrac {1}{4}</math> ''y''<sup>2</sup> | ||
+ | |- | ||
+ | | [[File:Ellipse in coordinate system with semi-axes labelled.svg|150px]] <br>Points such that ''Q''(''x'',''y'')=1 <br> (Ellipse). | ||
+ | | [[File:Hyperbola2 SVG.svg|150px]] <br> Points such that ''Q''(''x'',''y'')=1 <br> (Hyperbola). | ||
+ | |} | ||
+ | A symmetric real matrix {{math|'''A'''}} is called ''positive-definite'' if the associated quadratic form | ||
+ | :<span id="quadratic forms">''f''('''x''') = '''x'''<sup>T</sup>'''A '''x'''</span> | ||
+ | has a positive value for every nonzero vector '''x''' in '''R'''<sup>''n''</sup>. If ''f''('''x''') only yields negative values then {{math|'''A'''}} is ''negative-definite''; if {{math|''f''}} does produce both negative and positive values then {{math|'''A'''}} is ''indefinite''. If the quadratic form {{math|''f''}} yields only non-negative values (positive or zero), the symmetric matrix is called ''positive-semidefinite'' (or if only non-positive values, then negative-semidefinite); hence the matrix is indefinite precisely when it is neither positive-semidefinite nor negative-semidefinite. | ||
+ | |||
+ | A symmetric matrix is positive-definite if and only if all its eigenvalues are positive, that is, the matrix is positive-semidefinite and it is invertible. The table at the right shows two possibilities for 2-by-2 matrices. | ||
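
The eigenvalue criterion gives a simple numerical test; the sketch below (NumPy, illustrative; the helper function is our own) classifies the two matrices from the table:
<syntaxhighlight lang="python">
import numpy as np

P = np.array([[0.25, 0.0],
              [0.0,  1.0]])     # positive definite
Q = np.array([[0.25, 0.0],
              [0.0, -0.25]])    # indefinite

def classify(M):
    """Classify a symmetric matrix by the signs of its eigenvalues."""
    vals = np.linalg.eigvalsh(M)      # real eigenvalues of a symmetric matrix
    if np.all(vals > 0):
        return "positive definite"
    if np.all(vals < 0):
        return "negative definite"
    return "indefinite or semidefinite"

print(classify(P))   # positive definite
print(classify(Q))   # indefinite or semidefinite
</syntaxhighlight>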
+ | |||
+ | Allowing as input two different vectors instead yields the bilinear form associated to {{math|'''A'''}}: | ||
+ | :''B''<sub>'''A'''</sub> ('''x''', '''y''') = '''x'''<sup>T</sup>'''Ay'''. | ||
+ | |||
+ | In the case of complex matrices, the same terminology and result apply, with ''symmetric matrix'', ''quadratic form'', ''bilinear form'', and ''transpose'' '''x'''<sup>T</sup> replaced respectively by Hermitian matrix, Hermitian form, sesquilinear form, and conjugate transpose '''x'''<sup>H</sup>. | ||
+ | |||
+ | ====Orthogonal matrix==== | ||
+ | An ''orthogonal matrix'' is a square matrix with real entries whose columns and rows are orthogonal unit vectors (that is, orthonormal vectors). Equivalently, a matrix '''A''' is orthogonal if its transpose is equal to its inverse: | ||
+ | :<math>\mathbf{A}^\mathrm{T}=\mathbf{A}^{-1}, \,</math> | ||
+ | which entails | ||
+ | :<math>\mathbf{A}^\mathrm{T} \mathbf{A} = \mathbf{A} \mathbf{A}^\mathrm{T} = \mathbf{I}_n,</math> | ||
+ | where '''I'''<sub>''n''</sub> is the identity matrix of size ''n''. | ||
+ | |||
+ | An orthogonal matrix '''A''' is necessarily invertible (with inverse '''A'''<sup>-1</sup> = '''A'''<sup>T</sup>), unitary ('''A'''<sup>-1</sup> = '''A'''*), and normal '''A'''*'''A''' = '''AA'''*). The determinant of any orthogonal matrix is either {{math|+1}} or {{math|−1}}. A ''special orthogonal matrix'' is an orthogonal matrix with determinant +1. As a linear transformation, every orthogonal matrix with determinant {{math|+1}} is a pure rotation without reflection, i.e., the transformation preserves the orientation of the transformed structure, while every orthogonal matrix with determinant {{math|-1}} reverses the orientation, i.e., is a composition of a pure reflection and a (possibly null) rotation. The identity matrices have determinant {{math|1}}, and are pure rotations by an angle zero. | ||
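
The rotation matrix from the table of linear transformations gives a concrete check of these properties (NumPy sketch, illustrative only):
<syntaxhighlight lang="python">
import numpy as np

theta = np.pi / 6
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # rotation by 30 degrees

assert np.allclose(A.T @ A, np.eye(2))            # A^T A = I_2
assert np.allclose(A @ A.T, np.eye(2))            # A A^T = I_2
assert np.allclose(np.linalg.inv(A), A.T)         # A^{-1} = A^T
assert np.isclose(np.linalg.det(A), 1.0)          # special orthogonal: a pure rotation
</syntaxhighlight>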
+ | |||
+ | The complex analogue of an orthogonal matrix is a unitary matrix. | ||
+ | |||
+ | ===Main operations=== | ||
+ | |||
+ | ====Trace==== | ||
+ | The trace, tr('''A''') of a square matrix '''A''' is the sum of its diagonal entries. While matrix multiplication is not commutative as mentioned above, the trace of the product of two matrices is independent of the order of the factors: | ||
+ | : tr('''AB''') = tr('''BA'''). | ||
+ | This is immediate from the definition of matrix multiplication: | ||
+ | :<math>\operatorname{tr}(\mathbf{AB}) = \sum_{i=1}^m \sum_{j=1}^n a_{ij} b_{ji} = \operatorname{tr}(\mathbf{BA}).</math> | ||
+ | It follows that the trace of the product of more than two matrices is independent of cyclic permutations of the matrices, however this does not in general apply for arbitrary permutations (for example, tr('''ABC''') ≠ tr('''BAC'''), in general). Also, the trace of a matrix is equal to that of its transpose, that is, | ||
+ | :tr('''A''') = tr('''A'''<sup>T</sup>). | ||
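
These trace identities can be verified directly (NumPy sketch; the particular matrices are our own example):
<syntaxhighlight lang="python">
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])        # 2-by-3
B = np.array([[ 7,  8],
              [ 9, 10],
              [11, 12]])         # 3-by-2

# AB is 2-by-2 and BA is 3-by-3, yet their traces agree (both equal 212 here).
assert np.trace(A @ B) == np.trace(B @ A)

C = np.array([[1, 2],
              [3, 4]])
assert np.trace(C) == np.trace(C.T)   # tr(A) = tr(A^T)
</syntaxhighlight>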
+ | |||
+ | ====Determinant==== | ||
+ | [[File:Determinant example.svg|thumb|300px|right|A linear transformation on '''R'''<sup>2</sup> given by the indicated matrix. The determinant of this matrix is −1, as the area of the green parallelogram at the right is 1, but the map reverses the orientation, since it turns the counterclockwise orientation of the vectors to a clockwise one.]] | ||
+ | |||
+ | The ''determinant'' of a square matrix '''A''' (denoted det('''A''') or |'''A'''|) is a number encoding certain properties of the matrix. A matrix is invertible if and only if its determinant is nonzero. Its absolute value equals the area (in '''R'''<sup>2</sup>) or volume (in '''R'''<sup>3</sup>) of the image of the unit square (or cube), while its sign corresponds to the orientation of the corresponding linear map: the determinant is positive if and only if the orientation is preserved. | ||
+ | |||
+ | The determinant of 2-by-2 matrices is given by | ||
+ | :<math>\det \begin{bmatrix}a&b\\c&d\end{bmatrix} = ad-bc.</math> | ||
+ | The determinant of 3-by-3 matrices involves 6 terms (rule of Sarrus). The more lengthy Leibniz formula generalises these two formulae to all dimensions. | ||
+ | |||
+ | The determinant of a product of square matrices equals the product of their determinants: | ||
+ | :det('''AB''') = det('''A''') · det('''B'''). | ||
+ | Adding a multiple of any row to another row, or a multiple of any column to another column does not change the determinant. Interchanging two rows or two columns affects the determinant by multiplying it by −1. Using these operations, any matrix can be transformed to a lower (or upper) triangular matrix, and for such matrices, the determinant equals the product of the entries on the main diagonal; this provides a method to calculate the determinant of any matrix. Finally, the Laplace expansion expresses the determinant in terms of minors, that is, determinants of smaller matrices. This expansion can be used for a recursive definition of determinants (taking as starting case the determinant of a 1-by-1 matrix, which is its unique entry, or even the determinant of a 0-by-0 matrix, which is 1), that can be seen to be equivalent to the Leibniz formula. Determinants can be used to solve linear systems using Cramer's rule, where the division of the determinants of two related square matrices equates to the value of each of the system's variables. | ||
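
A few of these determinant facts, checked numerically (NumPy sketch; the matrices are our own examples):
<syntaxhighlight lang="python">
import numpy as np

A = np.array([[1., 2.],
              [3., 4.]])
B = np.array([[2., 0.],
              [1., 3.]])

assert np.isclose(np.linalg.det(A), 1*4 - 2*3)   # 2-by-2 formula ad - bc = -2
assert np.isclose(np.linalg.det(A @ B),
                  np.linalg.det(A) * np.linalg.det(B))   # det(AB) = det(A) det(B)

# Adding a multiple of one row to another leaves the determinant unchanged,
# while swapping two rows multiplies it by -1.
A2 = A.copy()
A2[1] += 10 * A2[0]
assert np.isclose(np.linalg.det(A2), np.linalg.det(A))
A3 = A[[1, 0]]                                   # rows swapped
assert np.isclose(np.linalg.det(A3), -np.linalg.det(A))
</syntaxhighlight>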
+ | |||
+ | ====Eigenvalues and eigenvectors==== | ||
+ | A number λ and a non-zero vector '''v''' satisfying | ||
+ | :<math>Av = \lambda v</math> | ||
+ | are called an ''eigenvalue'' and an ''eigenvector'' of '''A''', respectively. The number λ is an eigenvalue of an ''n''×''n''-matrix '''A''' if and only if '''A'''−λ'''I'''<sub>''n''</sub> is not invertible, which is equivalent to | ||
+ | :<math>\det(\mathbf{A}-\lambda \mathbf{I}) = 0.</math> | ||
+ | The polynomial ''p''<sub>'''A'''</sub> in an indeterminate ''X'' given by evaluation of the determinant det(''X'''''I'''<sub>''n''</sub>−'''A''') is called the characteristic polynomial of '''A'''. It is a monic polynomial of degree ''n''. Therefore the polynomial equation ''p''<sub>'''A'''</sub>(λ) = 0 has at most ''n'' different solutions, that is, eigenvalues of the matrix. They may be complex even if the entries of '''A''' are real. According to the Cayley–Hamilton theorem, ''p''<sub>'''A'''</sub>('''A''') = '''0''', that is, the result of substituting the matrix itself into its own characteristic polynomial yields the zero matrix. | ||
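
For a concrete example, the eigenpairs of a small symmetric matrix and the Cayley–Hamilton identity can be checked numerically (NumPy sketch; the matrix is our own example):
<syntaxhighlight lang="python">
import numpy as np

A = np.array([[2., 1.],
              [1., 2.]])

eigenvalues, eigenvectors = np.linalg.eig(A)   # eigenvalues 1 and 3 (in some order)
for k in range(2):
    lam = eigenvalues[k]
    v = eigenvectors[:, k]                     # eigenvectors are the columns
    assert np.allclose(A @ v, lam * v)         # A v = lambda v

# Characteristic polynomial: p_A(X) = X^2 - 4X + 3; Cayley-Hamilton: p_A(A) = 0.
assert np.allclose(A @ A - 4*A + 3*np.eye(2), np.zeros((2, 2)))
</syntaxhighlight>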
+ | |||
+ | == Resources == | ||
* [https://mathresearch.utsa.edu/wikiFiles/MAT1053/Matrices_and_Matrix_Operations/MAT1053_M10.1Matrices_and_Matrix_Operations.pdf Matrices and Matrix Operations], Book Chapter | * [https://mathresearch.utsa.edu/wikiFiles/MAT1053/Matrices_and_Matrix_Operations/MAT1053_M10.1Matrices_and_Matrix_Operations.pdf Matrices and Matrix Operations], Book Chapter | ||
* [https://mathresearch.utsa.edu/wikiFiles/MAT1053/Matrices_and_Matrix_Operations/MAT1053_M10.1Matrices_and_Matrix_Operations.pdf Guided Notes] | * [https://mathresearch.utsa.edu/wikiFiles/MAT1053/Matrices_and_Matrix_Operations/MAT1053_M10.1Matrices_and_Matrix_Operations.pdf Guided Notes] | ||
* [https://www.youtube.com/watch?v=0iDPZolrGpE Matrix Addition]. Produced by TA Catherine Sporer, UTSA | * [https://www.youtube.com/watch?v=0iDPZolrGpE Matrix Addition]. Produced by TA Catherine Sporer, UTSA | ||
+ | |||
+ | == Licensing == | ||
+ | Content obtained and/or adapted from: | ||
+ | * [https://en.wikipedia.org/wiki/Matrix_(mathematics) Matrix (mathematics), Wikipedia] under a CC BY-SA license |