Arrays in R Programming

Arrays are similar to matrices but can have more than two dimensions, and all of their elements must have the same mode (data type).
They’re created with the array function, which has the following form:

myarray <- array(vector, dimensions, dimnames)

where vector contains the data for the array, dimensions is a numeric vector giving the maximal index for each dimension, and dimnames is an optional list of dimension labels.
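
As a minimal sketch of the three arguments (the names a and b are only illustrative and do not come from the listings in this post), the snippet below builds a small array without labels and checks its dimensions. It also shows what the single-mode rule means in practice: mixing a number with a string turns every element into a character string.

```r
# vector = 1:8, dimensions = c(2, 2, 2); dimnames is left out, so it stays NULL
a <- array(1:8, dim = c(2, 2, 2))
dim(a)        # 2 2 2
length(a)     # 8, the product of the dimensions
dimnames(a)   # NULL

# Single mode: the number 1 is coerced to the string "1"
b <- array(c(1, "x"), dim = c(1, 2, 1))
mode(b)       # "character"
```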

**Example 1 :**

The following listing creates a three-dimensional (2 × 3 × 4) array of numbers with 2 rows, 3 columns, and 4 layers (the third index).

    > dim1 <- c("A1", "A2")
    > dim2 <- c("B1", "B2", "B3")
    > dim3 <- c("C1", "C2", "C3", "C4")
    > z <- array(1:24, c(2, 3, 4), dimnames=list(dim1, dim2, dim3))
    > z
    , , C1

       B1 B2 B3
    A1  1  3  5
    A2  2  4  6

    , , C2

       B1 B2 B3
    A1  7  9 11
    A2  8 10 12

    , , C3

       B1 B2 B3
    A1 13 15 17
    A2 14 16 18

    , , C4

       B1 B2 B3
    A1 19 21 23
    A2 20 22 24

As you can see, arrays are a natural extension of matrices. They can be useful in programming new statistical methods. Like matrices, they must be a single mode.

Elements are identified the same way as in matrices, with one subscript per dimension.

For example, z[1,2,3] is 15: the element in row 1, column 2, and layer 3 of the array z.
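
The sketch below recreates z so that it runs by itself, then selects the same element by position and by dimension name. It also shows where the element sits in the underlying vector: R stores arrays column-first, so with 2 rows and 3 columns, element [1,2,3] is at position 1 + (2-1)*2 + (3-1)*2*3 = 15. Because z was filled with 1:24, the position and the value happen to coincide.

```r
dim1 <- c("A1", "A2")
dim2 <- c("B1", "B2", "B3")
dim3 <- c("C1", "C2", "C3", "C4")
z <- array(1:24, c(2, 3, 4), dimnames = list(dim1, dim2, dim3))

z[1, 2, 3]            # 15, selected by position
z["A1", "B2", "C3"]   # 15, selected by dimension names
z[15]                 # 15, selected by position in the underlying vector

# An empty subscript keeps the whole dimension: layer C3 as a 2 x 3 matrix
layer3 <- z[, , 3]
dim(layer3)           # 2 3
```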

**Example 2 :**

A matrix is a two-dimensional data structure, for instance with one row per person and one column per variable. But suppose we also have data taken at different times, one data point per person per variable per time. Time then becomes the third dimension, in addition to rows and columns.

In R, such data sets are called arrays. As a simple example, consider students and test scores. Say each test consists of two parts, so we record two scores for a student for each test. Now suppose that we have two tests, and to keep the example small, assume we have only three students. Here’s the data for the first test:

    > firsttest
         [,1] [,2]
    [1,]   46   30
    [2,]   21   25
    [3,]   50   48

Student 1 had scores of 46 and 30 on the first test, student 2 scored 21 and 25, and so on. Here are the scores for the same students on the second test:

    > secondtest
         [,1] [,2]
    [1,]   46   43
    [2,]   41   35
    [3,]   50   49

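We now put both tests into one data structure, tests, with one layer per test. In layer 1, there will be three rows for the three students’ scores on the first test, with two columns per row for the two parts of the test; layer 2 holds the second test in the same layout. We use R’s array function to create the data structure as follows: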

> tests <- array(data=c(firsttest,secondtest),dim=c(3,2,2))

In the argument dim=c(3,2,2), we specify 3 rows, 2 columns, and 2 layers (one per test).
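
The call works because c() flattens each matrix column by column, and array() fills its result in the same column-first order, so the first six values land in layer 1 and the next six in layer 2. Here is a sketch that rebuilds the two matrices from the scores shown above. The matrix() calls are an assumption about how firsttest and secondtest were created, since the post only shows their printed values.

```r
# Assumed construction of the two score matrices (values listed column by column)
firsttest  <- matrix(c(46, 21, 50, 30, 25, 48), nrow = 3)
secondtest <- matrix(c(46, 41, 50, 43, 35, 49), nrow = 3)

# c() yields a plain vector of 12 scores: first test, then second test
scores <- c(firsttest, secondtest)
length(scores)                      # 12

tests <- array(data = scores, dim = c(3, 2, 2))

# Each layer reproduces the matrix it came from
all(tests[, , 1] == firsttest)      # TRUE
all(tests[, , 2] == secondtest)     # TRUE
```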

Each element of tests now has three subscripts, rather than two as in the matrix case. The first subscript corresponds to the first element of the dim vector, the second subscript corresponds to the second element, and so on.

For instance, the score on the second part of test 1 for student 3 is retrieved as follows:

    > tests[3,2,1]
    [1] 48
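
Leaving a subscript empty selects everything along that dimension, the same way it does for matrices. As a small sketch (it rebuilds tests from the scores above so that it runs by itself, and the name student3 is only illustrative), this pulls out all of student 3's scores at once:

```r
firsttest  <- matrix(c(46, 21, 50, 30, 25, 48), nrow = 3)
secondtest <- matrix(c(46, 41, 50, 43, 35, 49), nrow = 3)
tests <- array(data = c(firsttest, secondtest), dim = c(3, 2, 2))

dim(tests)                 # 3 2 2: students, parts, tests

# Fix the first subscript, keep the other two dimensions
student3 <- tests[3, , ]   # 2 x 2 matrix: rows are parts, columns are tests
student3[2, 1]             # 48, the same value as tests[3, 2, 1]
```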

R’s print function for arrays displays the data layer by layer:

    > tests
    , , 1

         [,1] [,2]
    [1,]   46   30
    [2,]   21   25
    [3,]   50   48

    , , 2

         [,1] [,2]
    [1,]   46   43
    [2,]   41   35
    [3,]   50   49
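
Once the scores sit in one array, a function such as apply can summarize across any dimension. The sketch below is an addition to the example, not part of the original listing: it rebuilds tests from the data shown above and sums the two parts to get a total per student per test. The name totals is chosen only for illustration.

```r
firsttest  <- matrix(c(46, 21, 50, 30, 25, 48), nrow = 3)
secondtest <- matrix(c(46, 41, 50, 43, 35, 49), nrow = 3)
tests <- array(data = c(firsttest, secondtest), dim = c(3, 2, 2))

# Keep dimensions 1 (student) and 3 (test); sum over dimension 2 (part)
totals <- apply(tests, c(1, 3), sum)
dim(totals)    # 3 2: one row per student, one column per test
totals[3, 1]   # 98: student 3 scored 50 + 48 on the first test
```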

Thanks, TAMATAM
