Understanding Matrices in R Programming

R Programming
data types
matrices
Author

M. Fatih Tüzen

Published

November 20, 2023

Modified

November 20, 2023

Introduction

Image source: https://www.vectorstock.com/royalty-free-vector/green-matrix-numbers-cyberspace-with-vector-24906241

Matrices are an essential data structure in R programming that allow for the manipulation and analysis of data in a two-dimensional format. Understanding how to create them, manipulate them, and perform linear algebra on them is crucial for handling complex data effectively. They provide a convenient way to store and work with data that can be represented as rows and columns. In this post, we will delve into the basics of creating, manipulating, and operating on matrices in R. In particular, we discuss basic algebraic operations such as matrix multiplication, transposition, and finding eigenvalues, as well as data wrangling operations such as matrix subsetting and column- and row-wise aggregation.

Creating Matrices in R

Matrices can be created and analyzed in several ways in R. The most direct way is to build the matrix yourself, which can itself be done in a few different ways.

The matrix(a, nrow = b, ncol = c) command in R creates a matrix with b rows and c columns in which every element is a. A matrix can also be created manually from a vector built with the c() command.

# Creating a matrix including only 1's that are 2 by 3
matrix(1, nrow = 2, ncol = 3)
     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    1    1    1

If you want to create the following matrix:

\[ A=\begin{bmatrix} 1&2&3\\ 3&6&8\\ 7&8&4 \end{bmatrix} \]

you would do it like this:

A <- matrix(c(1, 2, 3, 3, 6, 8, 7, 8, 4), nrow = 3, byrow = TRUE)
A
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    3    6    8
[3,]    7    8    4

It converted an atomic vector of length nine to a matrix with three rows. The number of columns was determined automatically (ncol=3 could have been passed to get the same result). The option byrow = TRUE means that the rows of the matrix will be filled first. By default, the elements of the input vector are read column by column.

matrix(c(1, 2, 3, 3, 6, 8, 7, 8, 4), nrow = 3)
     [,1] [,2] [,3]
[1,]    1    3    7
[2,]    2    6    8
[3,]    3    8    4

Matrices can also be created by combining multiple vectors: rbind stacks vectors on top of one another as rows, while cbind binds them side by side as columns.

Caution

Here it is important to make sure that the vectors have the same length.
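What happens if the lengths differ? As a minimal sketch (not part of the original example), R recycles the shorter vector and issues a warning when the longer length is not a multiple of the shorter one:

short <- c(1, 2)
long  <- c(10, 20, 30, 40, 50)
cbind(long, short)  # short is recycled to length 5; R warns because 5 is not a multiple of 2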

v1 <- c(3,4,6,8,5)
v2 <- c(4,8,4,7,1)
v3 <- c(2,2,5,4,6)
v4 <- c(4,7,5,2,5)
m1 <- cbind(v1, v2, v3, v4)
print(m1)
     v1 v2 v3 v4
[1,]  3  4  2  4
[2,]  4  8  2  7
[3,]  6  4  5  5
[4,]  8  7  4  2
[5,]  5  1  6  5
dim(m1)
[1] 5 4

In this example, 4 vectors of 5 elements each are combined side by side with cbind. This results in a 5x4 matrix, which we call m1.

m2 <- rbind(v1, v2, v3, v4)
print(m2)
   [,1] [,2] [,3] [,4] [,5]
v1    3    4    6    8    5
v2    4    8    4    7    1
v3    2    2    5    4    6
v4    4    7    5    2    5
dim(m2)
[1] 4 5

In this example, the 4 vectors are stacked one below the other with rbind, which yields a 4x5 matrix that we call m2. We used the dim function to check the dimensions of the matrices.

Accessing and Modifying Elements

Accessing and modifying elements in a matrix is straightforward. Use the row and column indices to access specific elements and assign new values to modify elements.

# Accessing the element in the second row and third column
m1[2, 3] 
v3 
 2 
# Modifying the element at the specified position
m1[2, 3] <- 10  
print(m1)
     v1 v2 v3 v4
[1,]  3  4  2  4
[2,]  4  8 10  7
[3,]  6  4  5  5
[4,]  8  7  4  2
[5,]  5  1  6  5

Also, the rows and columns of a matrix can be named by using the colnames and rownames functions.

# Naming columns with the first 4 letters
colnames(m1) <- LETTERS[1:4] 
m1 
     A B  C D
[1,] 3 4  2 4
[2,] 4 8 10 7
[3,] 6 4  5 5
[4,] 8 7  4 2
[5,] 5 1  6 5
# Naming rows with the last 5 letters
rownames(m1) <- tail(LETTERS,5) 
m1
  A B  C D
V 3 4  2 4
W 4 8 10 7
X 6 4  5 5
Y 8 7  4 2
Z 5 1  6 5

Mathematical Operations

Vectorised functions such as round, sqrt, abs, log, exp, etc., operate on each matrix element.

A <- matrix(c(1:6) * 0.15,nrow = 2)
A
     [,1] [,2] [,3]
[1,] 0.15 0.45 0.75
[2,] 0.30 0.60 0.90
sqrt(A) # gets square root of every element in A
          [,1]      [,2]      [,3]
[1,] 0.3872983 0.6708204 0.8660254
[2,] 0.5477226 0.7745967 0.9486833
round(A, 1) # rounds every element in A
     [,1] [,2] [,3]
[1,]  0.1  0.4  0.8
[2,]  0.3  0.6  0.9

Mathematical operations such as addition, subtraction, and multiplication with * can be performed on two or more matrices with the same dimensions. These operations are performed elementwise.

A <- matrix(1:4,nrow=2)
B <- matrix(5:8,nrow=2)
print(A)
     [,1] [,2]
[1,]    1    3
[2,]    2    4
print(B)
     [,1] [,2]
[1,]    5    7
[2,]    6    8
A + B  # elementwise addition
     [,1] [,2]
[1,]    6   10
[2,]    8   12
A * B  # elementwise multiplication
     [,1] [,2]
[1,]    5   21
[2,]   12   32

These operations simply add or multiply the corresponding elements of the two given matrices. We can also apply matrix-scalar operations; in the next example we square every element of A.

A^2 # the 2nd power of the A
     [,1] [,2]
[1,]    1    9
[2,]    4   16
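A few more matrix-scalar operations, as a quick sketch using the same A:

A + 10  # adds 10 to every element
A / 2   # divides every element by 2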

Aggregating rows and columns

When we call an aggregation function on a matrix, it will reduce all elements to a single number.

mean(A) # get arithmetic mean of A
[1] 2.5
min(A) # minimum of A
[1] 1

We can also calculate sum or mean of each row/columns by using rowMeans, rowSums, colMeans and colSums.

rowSums(A) # sum of rows
[1] 4 6
rowMeans(A) # mean of rows
[1] 2 3
colSums(A) # sum of columns
[1] 3 7
colMeans(A) # mean of columns
[1] 1.5 3.5
Tip

R provides the apply() function to apply a function to each row or column of a matrix. The arguments of apply() are the matrix, the margin (1 for rows, 2 for columns), and the function to be applied. The apply function can be used to summarise individual rows or columns of a matrix, so any aggregation function can be called through apply.

  • apply(A, 1, f) applies a given function f on each row of a matrix A (over the first axis),

  • apply(A, 2, f) applies f on each column of A (over the second axis).

# Applying functions to matrices
row_sums <- apply(A, 1, sum)  # Applying sum function to each row (margin = 1)
print(row_sums)
[1] 4 6
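Similarly, we can aggregate over columns by setting the margin to 2. As a small sketch with a different aggregation function:

col_max <- apply(A, 2, max)  # Applying max function to each column (margin = 2)
print(col_max)               # for the 2x2 matrix A above this gives 2 and 4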

Matrix Operations

Transpose of Matrix

The transpose of a matrix, mathematically denoted by \(A^T\), is available via the t() function.

A
     [,1] [,2]
[1,]    1    3
[2,]    2    4
t(A) # transpose of A
     [,1] [,2]
[1,]    1    2
[2,]    3    4

Matrix Multiplication

When multiplying two matrices A and B, the number of columns in matrix A must be equal to the number of rows in matrix B. If A is of size m x n and B is of size n x p, then their product AB will be of size m x p. The individual elements of the resulting matrix are calculated by taking dot products of rows from matrix A and columns from matrix B.

In R, * performs elementwise multiplication. For what we call the (algebraic) matrix multiplication, we use the %*% operator. It can only be applied to two matrices of compatible sizes: the number of columns of the left operand must match the number of rows of the right operand.

A <- matrix(c(1, 3, 5 ,3, 4, 9), nrow = 2) # create 2 by 3 matrix
B <- matrix(c(6, 2, 4 ,7, 8, 4), nrow = 3) # create 3 by 2 matrix
print(A)
     [,1] [,2] [,3]
[1,]    1    5    4
[2,]    3    3    9
print(B)
     [,1] [,2]
[1,]    6    7
[2,]    2    8
[3,]    4    4
A %*% B # we get 2 by 2 matrix
     [,1] [,2]
[1,]   32   63
[2,]   60   81
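To tie this back to the dot-product definition, here is a quick sketch that recomputes the element in row 1, column 1 of the product by hand:

sum(A[1, ] * B[, 1])  # dot product of row 1 of A with column 1 of B; equals 32, i.e. (A %*% B)[1, 1]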

Determinant of Matrix

The determinant of a square matrix is a scalar value that represents some important properties of the matrix. In R programming, the det() function is used to calculate the determinant of a square matrix.

Understanding Determinant:

  1. Square Matrices: The determinant is a property specific to square matrices, meaning the number of rows must equal the number of columns.

  2. Geometric Interpretation: For a 2x2 matrix \(\begin{bmatrix} a & b \\ c & d \end{bmatrix}\), the determinant \(ad-bc\) represents the scaling factor of the area spanned by the vectors formed by the columns of the matrix. For higher-dimensional matrices, the determinant has a similar geometric interpretation related to volume and scaling in higher dimensions.

  3. Invertibility: A matrix is invertible (has an inverse) if and only if its determinant is non-zero. If the determinant is zero, the matrix is singular and does not have an inverse.

In R, the det() function computes the determinant of a square matrix.

# Create a square matrix
A <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2)

# Compute the determinant of the matrix
det(A)
[1] -2
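As a quick sketch, the 2x2 formula \(ad - bc\) can be checked against det() directly:

A[1, 1] * A[2, 2] - A[1, 2] * A[2, 1]  # 1*4 - 3*2 = -2, matching det(A)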
Important

It’s essential to note a few considerations:

  • Numerical Stability: Computing determinants of large matrices or matrices close to singularity (having a determinant close to zero) can lead to numerical instability due to rounding errors.

  • Complexity: The computational complexity of determinant calculation increases rapidly with matrix size, especially for algorithms like cofactor expansion or LU decomposition used internally.

  • Use in Linear Algebra: Determinants play a vital role in linear algebra, being used in solving systems of linear equations, calculating inverses of matrices, and understanding transformations and eigenvalues.

  • Singular Matrices: If the determinant of a square matrix is zero, it signifies that the matrix is singular and not invertible.

Here’s an example that checks the determinant and its relation to matrix invertibility:

# Check the determinant and invertibility of a matrix
det_A <- det(A)

if (det_A != 0) {
  print("Matrix is invertible.")
} else {
  print("Matrix is singular, not invertible.")
}
[1] "Matrix is invertible."

Understanding determinants is crucial in various mathematical applications, especially in linear algebra and systems of equations, as they provide valuable information about the properties of matrices and their behavior in transformations and computations.

Inverse of Matrix

The solve() function is used to compute the inverse of a square matrix. The inverse of a matrix \(A\) is denoted as \(A^{-1}\) and has the property that when multiplied by the original matrix \(A\), it yields the identity matrix \(I\).

print(A)
     [,1] [,2]
[1,]    1    3
[2,]    2    4
# Compute the inverse of the matrix
solve(A)
     [,1] [,2]
[1,]   -2  1.5
[2,]    1 -0.5
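As a quick sketch, we can verify the defining property \(AA^{-1}=I\) by multiplying A with its computed inverse:

round(A %*% solve(A))  # gives the 2x2 identity matrix (rounded to absorb floating-point error)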
Important

It’s important to note a few things about matrix inversion:

  1. Square Matrices: The matrix must be square (i.e., the number of rows equals the number of columns) to have an inverse. Inverting a non-square matrix is not possible.

  2. Determinant Non-Zero: The matrix must have a non-zero determinant for its inverse to exist. If the determinant is zero, the matrix is singular, and its inverse cannot be computed.

  3. Errors and Numerical Stability: Inverting matrices can be sensitive to numerical precision and errors, especially for matrices that are close to singular or ill-conditioned. Rounding errors can affect the accuracy of the computed inverse.

In practice, it’s essential to check the properties of the matrix, such as its determinant, before attempting to compute its inverse, especially when dealing with real-world data, as numerical issues can lead to unreliable results.

Here’s an example that checks the determinant before computing the inverse:

# Check the determinant before inverting the matrix
det_A <- det(A)

if (det_A != 0) {
  inverse_matrix_A <- solve(A)
  print(inverse_matrix_A)
} else {
  print("Matrix is singular, inverse does not exist.")
}
     [,1] [,2]
[1,]   -2  1.5
[2,]    1 -0.5

Understanding matrix inversion is crucial in various fields like machine learning, optimization, and solving systems of linear equations, as it allows for the transformation of equations or operations involving matrices to simplify computations. However, always ensure that the matrix you’re working with satisfies the conditions for invertibility to avoid computational errors.

Eigenvalues and Eigenvectors

In R programming, eigenvalues and eigenvectors are fundamental concepts often computed using the eigen() function. These are important in various fields, including linear algebra, data analysis, signal processing, and machine learning.

Eigenvalues: They are scalar values that represent how a linear transformation (represented by a square matrix) behaves along its eigenvectors. For a square matrix A, an eigenvalue (\(\lambda\)) and its corresponding eigenvector (\(v\)) satisfy the equation \(Av=\lambda v\). It essentially means that when the matrix A operates on the eigenvector \(v\), the resulting vector is a scaled version of the original eigenvector \(v\), scaled by the eigenvalue \(\lambda\).

In R, you can compute eigenvalues using the eigen() function.

# Create a sample matrix
A <- matrix(c(4, 2, 1, -1), nrow = 2, byrow = TRUE)

# Compute eigenvalues and eigenvectors
eig <- eigen(A)

# Access eigenvalues
eigenvalues <- eig$values
print(eigenvalues)
[1]  4.372281 -1.372281

Eigenvectors: They are non-zero vectors that are transformed only by a scalar factor when a linear transformation (represented by a matrix) is applied. Each eigenvalue has an associated eigenvector. Eigenvectors are important because they describe the directions along which the transformation represented by the matrix has a simple behavior, often stretching or compressing without changing direction.

In R, after computing the eigenvalues using eigen(), you can access the corresponding eigenvectors using:

# Access eigenvectors
eigenvectors <- eig$vectors
print(eigenvectors)
          [,1]       [,2]
[1,] 0.9831134 -0.3488887
[2,] 0.1829974  0.9371642
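As a sketch, we can verify the defining relation \(Av=\lambda v\) for the first eigenpair returned by eigen():

v1      <- eig$vectors[, 1]  # first eigenvector
lambda1 <- eig$values[1]     # corresponding eigenvalue
A %*% v1                     # left-hand side
lambda1 * v1                 # right-hand side: the same vector, confirming A v1 = lambda1 v1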

These eigenvalues and eigenvectors play a significant role in various applications, including principal component analysis (PCA), diagonalization of matrices, solving systems of differential equations, and more. They provide crucial insights into the behavior and characteristics of linear transformations represented by matrices.
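For instance, PCA can be carried out by applying eigen() to a covariance matrix. The following is a minimal sketch with a small, made-up data matrix X (hypothetical values, not taken from the post):

# Hypothetical 5 x 2 data matrix
X <- cbind(c(2.5, 0.5, 2.2, 1.9, 3.1),
           c(2.4, 0.7, 2.9, 2.2, 3.0))

pca <- eigen(cov(X))  # eigendecomposition of the covariance matrix
pca$values            # variances explained by each principal component
pca$vectors           # columns are the principal directions (loadings)

# Project the centred data onto the principal directions to obtain the scores
scores <- scale(X, center = TRUE, scale = FALSE) %*% pca$vectors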

Conclusion

Matrices are indeed useful and statisticians are used to working with them. However, in my daily work I try to use matrices as needed and prefer an approach based on data frames, because working with data frames makes it easier to use R’s advanced functional programming language capabilities. I plan to publish a post on data frames in the future, and in the conclusion of this post I would like to discuss the advantages and disadvantages of both matrices and data frames.

In R programming, matrices and data frames serve different purposes, each with its own set of advantages and limitations.

Matrices:

Pros:

  1. Efficient for Numeric Operations: Matrices are optimized for numerical computations. If you’re working primarily with numeric data and need to perform matrix algebra, calculations tend to be faster with matrices compared to data frames.

  2. Homogeneous Data: Matrices are homogeneous, meaning they store elements of the same data type (numeric, character, logical, etc.) throughout. This consistency simplifies some computations and analyses.

  3. Mathematical Operations: Matrices are designed for linear algebra operations. Functions like matrix multiplication, transposition, and eigenvalue/eigenvector calculations are native to matrices in R.

Cons:

  1. Lack of Flexibility: Matrices are restrictive when it comes to handling heterogeneous data or combining different data types within the same structure. They can only hold a single data type.

  2. Row and Column Names: Matrices support only simple row and column name vectors (as set with rownames and colnames above); they lack the richer, variable-oriented structure of data frames, which can make data representation and interpretation less convenient.

Data Frames:

Pros:

  1. Heterogeneous Data: Data frames can store different types of data (numeric, character, factor, etc.) within the same structure. This flexibility allows for handling diverse datasets efficiently.

  2. Row and Column Names: Data frames support row and column names, making it easier to reference specific rows or columns and improving data readability.

  3. Data Manipulation and Analysis: R’s data manipulation libraries (e.g., dplyr, tidyr) are optimized for data frames. They offer a wide range of functions and operations tailored for efficient data manipulation, summarization, and analysis.

Cons:

  1. Performance: Compared to matrices, data frames might have slower performance for numerical computations involving large datasets due to their heterogeneous nature and additional data structure overhead.

  2. Overhead for Numeric Operations: While data frames are versatile for handling different types of data, when it comes to pure numeric computations or linear algebra operations, they might be less efficient than matrices.

In summary, the choice between matrices and data frames in R depends on the nature of the data and the intended operations. If you’re working mainly with numeric data and require linear algebra operations, matrices might be more efficient. On the other hand, if you’re dealing with heterogeneous data and need more flexibility in data manipulation and analysis, data frames are a better choice. Often, data frames are preferred for general-purpose data handling and analysis due to their versatility, despite potential performance trade-offs for specific numerical operations. By understanding how matrices are created, manipulated, and operated on, and how they feed into advanced techniques like PCA, you can handle complex data structures and perform sophisticated computations with ease.