**Introduction**

In R programming, the apply functions (`apply()`, `sapply()`, and `lapply()`) and the `map()` function from the purrr package are powerful tools for data manipulation and analysis. In this comprehensive guide, we will delve into the syntax, usage, and examples of each function, including the usage of built-in functions and additional arguments, as well as performance benchmarking.

## Understanding apply() Function

The `apply()` function in R is used to apply a specified function to the rows or columns of an array. Its syntax is as follows:

`apply(X, MARGIN, FUN, ...)`

- `X`: The input data, typically an array or matrix.
- `MARGIN`: A numeric vector indicating which margins the function is applied over. Use `1` for rows, `2` for columns.
- `FUN`: The function to apply.
- `...`: Additional arguments to be passed to the function.

Let’s calculate the mean of each row in a matrix using `apply()`:

```
# Build a 3 x 3 matrix, filled column by column
matrix_data <- matrix(1:9, nrow = 3)
# MARGIN = 1 applies mean() to each row
row_means <- apply(matrix_data, 1, mean)
print(row_means)
```

`[1] 4 5 6`

This example computes the mean of each row in the matrix.

Let’s calculate the standard deviation of each column in a matrix and specify additional arguments (`na.rm = TRUE`) using `apply()`:

```
# MARGIN = 2 applies sd() to each column; na.rm = TRUE is forwarded to sd()
column_stdev <- apply(matrix_data, 2, sd, na.rm = TRUE)
print(column_stdev)
```

`[1] 1 1 1`
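The matrix above contains no missing values, so `na.rm = TRUE` has no visible effect there. As a minimal sketch, assuming a small hypothetical matrix `na_matrix` with one `NA` (not part of the original example), the following shows how the extra argument is forwarded to `FUN` and changes the result:

```r
# Hypothetical 3 x 2 matrix with one missing value in the first column
na_matrix <- matrix(c(1, 2, NA, 4, 5, 6), nrow = 3)

# Without na.rm, any column containing NA yields NA
means_with_na <- apply(na_matrix, 2, mean)

# na.rm = TRUE is passed through `...` to mean(), which drops the NA
means_without_na <- apply(na_matrix, 2, mean, na.rm = TRUE)

print(means_with_na)     # [1] NA  5
print(means_without_na)  # [1] 1.5 5.0
```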

## Understanding sapply() Function

The `sapply()` function is a wrapper around `lapply()` that simplifies the result to a vector or matrix where possible. Its syntax is similar to `lapply()`:

`sapply(X, FUN, ...)`

- `X`: The input data, typically a list.
- `FUN`: The function to apply.
- `...`: Additional arguments to be passed to the function.

Let’s calculate the sum of each element in a list using `sapply()`:

```
num_list <- list(a = 1:3, b = 4:6, c = 7:9)
# sum() returns one value per element, so the result is a named vector
sum_results <- sapply(num_list, sum)
print(sum_results)
```

```
 a  b  c
 6 15 24
```

This example computes the sum of each element in the list.

Let’s convert each element in a list to uppercase using `sapply()` and the `toupper()` function:

```
text_list <- list("hello", "world", "R", "programming")
uppercase_text <- sapply(text_list, toupper)
print(uppercase_text)
```

`[1] "HELLO" "WORLD" "R" "PROGRAMMING"`

Here, `sapply()` applies the `toupper()` function to each element in the list, converting them to uppercase.
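Both examples above return a vector because the function produces a single value per element. As a hedged sketch of the other cases, the snippet below uses `range()` (two values per element) to get a matrix, and a hypothetical ragged input to show `sapply()` falling back to a list when the results cannot be simplified:

```r
num_list <- list(a = 1:3, b = 4:6, c = 7:9)

# range() returns two values per element, so sapply() simplifies to a 2 x 3 matrix
range_matrix <- sapply(num_list, range)
print(range_matrix)

# Hypothetical ragged input: the results have different lengths, so a list is returned
ragged_result <- sapply(list(1:2, 1:3), function(x) x * 2)
print(class(ragged_result))  # [1] "list"
```

Because the return type depends on the data, it is worth checking the shape of the result before relying on it in later steps.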

## Understanding lapply() Function

The `lapply()` function applies a function to each element of a list and returns a list. Its syntax is as follows:

`lapply(X, FUN, ...)`

- `X`: The input data, typically a list.
- `FUN`: The function to apply.
- `...`: Additional arguments to be passed to the function.

Let’s apply a custom function to each element of a list using `lapply()`:

```
num_list <- list(a = 1:3, b = 4:6, c = 7:9)
# Double the sum of each vector
custom_function <- function(x) sum(x) * 2
result_list <- lapply(num_list, custom_function)
print(result_list)
```

```
$a
[1] 12
$b
[1] 30
$c
[1] 48
```

In this example, `lapply()` applies the custom function to each element in the list.

Let’s extract the vowels from each element in a list of words using `lapply()` and a custom function:

```
word_list <- list("apple", "banana", "orange", "grape")
# Split each word into characters, then keep only the vowels
vowel_list <- lapply(word_list, function(word) grep("[aeiou]", strsplit(word, "")[[1]], value = TRUE))
print(vowel_list)
```

```
[[1]]
[1] "a" "e"
[[2]]
[1] "a" "a" "a"
[[3]]
[1] "o" "a" "e"
[[4]]
[1] "a" "e"
```

Here, `lapply()` applies the custom function to each element in the list, extracting vowels from words.
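The `...` argument in the syntax above is not used by either `lapply()` example. As a small sketch, assuming a hypothetical list `measurements` that contains missing values, extra arguments are forwarded to the function on every call:

```r
# Hypothetical list whose elements contain missing values
measurements <- list(first = c(1, NA, 3), second = c(4, 5, NA))

# na.rm = TRUE is forwarded to mean() for every element
clean_means <- lapply(measurements, mean, na.rm = TRUE)
print(clean_means)

# unlist() flattens the result when a plain named vector is more convenient
print(unlist(clean_means))
```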

## Understanding map() Function

The `map()` function from the purrr package is similar to `lapply()` but offers a more consistent syntax and returns a list. Its syntax is as follows:

`map(.x, .f, ...)`

- `.x`: The input data, typically a list.
- `.f`: The function to apply.
- `...`: Additional arguments to be passed to the function.

Let’s apply a lambda function to each element of a list using `map()`:

```
library(purrr)
num_list <- list(a = 1:3, b = 4:6, c = 7:9)
# ~ .x^2 is purrr's formula shorthand for function(x) x^2
mapped_results <- map(num_list, ~ .x^2)
print(mapped_results)
```

```
$a
[1] 1 4 9
$b
[1] 16 25 36
$c
[1] 49 64 81
```

In this example, `map()` applies the lambda function, which squares its input, to each element in the list.

Let’s calculate the lengths of strings in a list using `map()` and the `nchar()` function:

```
text_list <- list("hello", "world", "R", "programming")
string_lengths <- map(text_list, nchar)
print(string_lengths)
```

```
[[1]]
[1] 5
[[2]]
[1] 5
[[3]]
[1] 1
[[4]]
[1] 11
```

Here, `map()` applies the `nchar()` function to each element in the list, calculating the length of each string.
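To make the similarity to `lapply()` concrete, here is a minimal sketch (assuming purrr is installed) showing that the formula shorthand, an ordinary anonymous function, and `lapply()` produce the same list for this input:

```r
library(purrr)

num_list <- list(a = 1:3, b = 4:6, c = 7:9)

# Three ways of squaring every element
with_formula  <- map(num_list, ~ .x^2)
with_function <- map(num_list, function(x) x^2)
with_lapply   <- lapply(num_list, function(x) x^2)

print(identical(with_formula, with_function))  # [1] TRUE
print(identical(with_formula, with_lapply))    # [1] TRUE
```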

## Understanding map() Function Variants

In addition to the `map()` function, the purrr package provides several variants that are specialized for different types of output: `map_lgl()`, `map_int()`, `map_dbl()`, and `map_chr()`. These variants are particularly useful when you expect the output to be of a specific data type, such as logical, integer, double, or character.

- `map_lgl()`: This variant is used when the output of the function is expected to be a logical vector.
- `map_int()`: Use this variant when the output of the function is expected to be an integer vector.
- `map_dbl()`: This variant is used when the output of the function is expected to be a double vector.
- `map_chr()`: Use this variant when the output of the function is expected to be a character vector.

These variants provide stricter type constraints compared to the generic `map()` function, which can be useful for ensuring the consistency of the output type across iterations. They are particularly handy when working with functions that have predictable output types.

```
library(purrr)
# Define a list of vectors
num_list <- list(a = 1:3, b = 4:6, c = 7:9)

# Use map_lgl() to check if all elements in each vector are even
even_check <- map_lgl(num_list, function(x) all(x %% 2 == 0))
print(even_check)
```

```
    a     b     c
FALSE FALSE FALSE
```

```
# Use map_int() to compute the sum of each vector
vector_sums <- map_int(num_list, sum)
print(vector_sums)
```

```
 a  b  c
 6 15 24
```

```
# Use map_dbl() to compute the mean of each vector
vector_means <- map_dbl(num_list, mean)
print(vector_means)
```

```
a b c
2 5 8
```

```
# Use map_chr() to collapse each vector into a single comma-separated string
vector_strings <- map_chr(num_list, toString)
print(vector_strings)
```

```
        a         b         c
"1, 2, 3" "4, 5, 6" "7, 8, 9"
```

By using these specialized variants, you can ensure that the output of your mapping operation adheres to your specific data type requirements, leading to cleaner and more predictable code.
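The stricter behaviour is easiest to see when a function does not fit the requested shape. The sketch below is an illustration rather than part of the original examples: `range()` returns two values per element, so `map_int()` cannot build an integer vector and signals an error, which `tryCatch()` turns into a message. The exact error text depends on the installed purrr version, so it is not reproduced here.

```r
library(purrr)

num_list <- list(a = 1:3, b = 4:6, c = 7:9)

# range() returns two values per element; map_int() requires exactly one
outcome <- tryCatch(
  map_int(num_list, range),
  error = function(e) "map_int() raised an error"
)
print(outcome)

# map() has no such constraint and simply returns a list
print(map(num_list, range))
```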

**Performance Comparison**

To compare the performance of these functions, it’s important to note that the execution time may vary depending on the hardware specifications of your computer, the size of the dataset, and the complexity of the operations performed. While one function may perform better in one scenario, it may not be the case in another. Therefore, it’s recommended to benchmark the functions in your specific use case.

Let’s benchmark summing the values of a 100 x 100 matrix using the different functions:

```
library(microbenchmark)
library(purrr)

# Create a 100 x 100 matrix
matrix_data <- matrix(rnorm(10000), nrow = 100)

# apply() sums each of the 100 columns; sapply(), lapply(), and map_dbl()
# treat the matrix as 10,000 separate elements and call sum() on each one
benchmark_results <- microbenchmark(
  apply_sum = apply(matrix_data, 2, sum),
  sapply_sum = sapply(matrix_data, sum),
  lapply_sum = lapply(matrix_data, sum),
  map_sum = map_dbl(as.list(matrix_data), sum), # convert the matrix to a list of its elements
  times = 100
)
print(benchmark_results)
```

```
Unit: microseconds
expr min lq mean median uq max neval
apply_sum 98.1 122.95 143.123 135.60 153.35 277.8 100
sapply_sum 2326.7 2429.75 2941.094 2514.85 2852.55 11218.3 100
lapply_sum 2150.6 2247.55 2860.614 2364.90 2930.80 6556.0 100
map_sum 5063.5 5342.45 6009.474 5738.35 6788.35 8139.7 100
```

`apply_sum` shows the fastest processing time among the alternatives. Note that the comparison is not like-for-like: `apply()` calls `sum()` once per column (100 calls), whereas `sapply()`, `lapply()`, and `map_dbl()` call it once per matrix element (10,000 calls), which likely accounts for much of the gap. When evaluating these results, it’s crucial to consider factors beyond processing time, such as usability and functionality, to select the most suitable function for your specific needs.

Overall, the choice of function depends on factors such as speed, ease of use, and compatibility with the data structure. It’s essential to benchmark different alternatives in your specific use case to determine the most suitable function for your needs.
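When a matrix is passed directly to `sapply()` or `lapply()`, each of its elements is visited separately rather than each column. A like-for-like comparison would therefore have every approach compute the same 100 column sums. The following is one possible sketch using base R only. The helper name `col_index` and the use of `colSums()` as a reference are assumptions for illustration, and no timings are shown because they depend on your machine. The timing step runs only if the microbenchmark package is installed.

```r
set.seed(42)
matrix_data <- matrix(rnorm(10000), nrow = 100)

# Hypothetical helper: one index per column
col_index <- seq_len(ncol(matrix_data))

# Four ways of computing the same 100 column sums
apply_sums   <- apply(matrix_data, 2, sum)
sapply_sums  <- sapply(col_index, function(j) sum(matrix_data[, j]))
lapply_sums  <- unlist(lapply(col_index, function(j) sum(matrix_data[, j])))
builtin_sums <- colSums(matrix_data)

# Confirm that all approaches agree before timing them
stopifnot(
  isTRUE(all.equal(apply_sums, sapply_sums)),
  isTRUE(all.equal(apply_sums, lapply_sums)),
  isTRUE(all.equal(apply_sums, builtin_sums))
)

# Time the equivalent computations if microbenchmark is available
if (requireNamespace("microbenchmark", quietly = TRUE)) {
  print(microbenchmark::microbenchmark(
    apply_sum   = apply(matrix_data, 2, sum),
    sapply_sum  = sapply(col_index, function(j) sum(matrix_data[, j])),
    lapply_sum  = lapply(col_index, function(j) sum(matrix_data[, j])),
    builtin_sum = colSums(matrix_data),
    times = 100
  ))
}
```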

**Conclusion**

The apply functions (`apply()`, `sapply()`, and `lapply()`) and the `map()` function from the purrr package are powerful tools for data manipulation and analysis in R. Each function has its unique features and strengths, making them suitable for various tasks.

- The `apply()` function is versatile and operates on matrices, allowing for row-wise or column-wise operations. However, its performance may vary depending on the size of the dataset and the nature of the computation.
- The `sapply()` and `lapply()` functions are convenient for working with lists and provide more optimized implementations compared to `apply()`. They offer flexibility and ease of use, making them suitable for a wide range of tasks.
- The `map()` function offers a more consistent syntax compared to `lapply()` and provides additional variants (`map_lgl()`, `map_int()`, `map_dbl()`, `map_chr()`) for handling specific data types. While it may exhibit slower performance in some cases, its functionality and ease of use make it a valuable tool for functional programming in R.

When choosing the most suitable function for your task, it’s essential to consider factors beyond just performance. Usability, compatibility with data structures, and the nature of the computation should also be taken into account. Additionally, the performance of these functions may vary depending on the hardware specifications of your computer, the size of the dataset, and the complexity of the operations performed. Therefore, it’s recommended to benchmark the functions in your specific use case and evaluate them based on multiple criteria to make an informed decision.

By mastering these functions and understanding their nuances, you can streamline your data analysis workflows and tackle a wide range of analytical tasks with confidence in R.