Friday, 26 January 2018

How to Index, Filter and Resize a Matrix in R Programming

Matrix Indexing, Filtering and Resizing the Matrix in R Programming
We can perform various operations on matrices, such as indexing (to access specific elements of a matrix), filtering, and resizing.

1) Matrix Indexing:
> z<-matrix(c(1:4,1,1,0,0,1,0,1,0),nrow=4)
> z
[,1] [,2] [,3]
[1,]    1    1    1
[2,]    2    1    0
[3,]    3    0    1
[4,]    4    0    0

We can access the submatrix of z made up of columns 2 and 3 of all rows by giving the column positions and leaving the row index empty, as follows:

> z[,2:3]

[,1] [,2]
[1,]   1   1
[2,]   1   0
[3,]   0   1
[4,]   0   0

In a similar manner, we can access rows 2 and 3 of all columns by giving the row positions and leaving the column index empty, as follows:
> z[2:3,]
[,1] [,2] [,3]
[1,]    2    1    0
[2,]    3    0    1

In the same way, we can extract the single element at row 2, column 2 by giving both the row and the column index, as follows:

> z[2,2]
[1] 1
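
As a further sketch, the row and column indices do not have to be contiguous ranges; any vector of positions works. The matrix z is the same one used above, while the names sub and elems are chosen only for this illustration.

```r
# Same matrix as above
z <- matrix(c(1:4, 1, 1, 0, 0, 1, 0, 1, 0), nrow = 4)

# Rows 1 and 3, columns 1 and 3 (non-adjacent positions)
sub <- z[c(1, 3), c(1, 3)]
print(sub)

# Elements of column 2 in rows 1 and 4, returned as a plain vector
elems <- z[c(1, 4), 2]
print(elems)
```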

Assigning values to a submatrix from another matrix:
> y<-matrix(1:6,nrow=3)
> y
[,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6

Example 1:
Here, we replace rows 1 and 3 of y with the values of another matrix.
> y[c(1,3),] <- matrix(c(1,1,8,12),nrow=2)
> y
[,1] [,2]
[1,]    1    8
[2,]    2    5
[3,]    1   12

Example 2:
In the following example, we assign the values of a matrix y to part of an empty matrix x.
> x <- matrix(nrow=3,ncol=3)
> x
[,1] [,2] [,3]
[1,]   NA   NA   NA
[2,]   NA   NA   NA
[3,]   NA   NA   NA

> y <- matrix(c(4,5,2,3),nrow=2)
> y
[,1] [,2]
[1,]   4   2
[2,]   5   3

Here, we assign the values of y to rows 2 and 3, columns 2 and 3, of x.
> x[2:3,2:3] <- y
> x
[,1] [,2] [,3]
[1,]   NA   NA   NA
[2,]   NA    4    2
[3,]   NA    5    3
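
A minimal sketch of the same assignment, extended with one assumption-free variation: a single value assigned to a whole row is recycled across that row. The matrices x and y are rebuilt here so the block runs on its own.

```r
x <- matrix(nrow = 3, ncol = 3)        # 3-by-3 matrix filled with NA
y <- matrix(c(4, 5, 2, 3), nrow = 2)

x[2:3, 2:3] <- y                       # the 2-by-2 block of x receives y
x[1, ] <- 0                            # a single value is recycled across row 1
print(x)
```

The remaining NA cells in column 1 show that only the addressed positions are changed.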

Excluding specific rows or columns (negative subscripts) of a matrix:
A negative subscript leaves the given rows or columns out of the result only; the original matrix is not changed.
> z<-matrix(c(1:4,1,1,0,0,1,0,1,0),nrow=4,byrow=TRUE)
> z
[,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    1    1
[3,]    0    0    1
[4,]    0    1    0

Excluding row 2 from matrix z
> z[-2,]
[,1] [,2] [,3]
[1,]    1    2    3
[2,]    0    0    1
[3,]    0    1    0

Excluding column 3 from matrix z
> z[,-3]
[,1] [,2]
[1,]    1    2
[2,]    4    1
[3,]    0    0
[4,]    0    1

Excluding both row 2 and column 2 from matrix z
> z[-2,-2]
[,1] [,2]
[1,]    1    3
[2,]    0    1
[3,]    0    0
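
The point that exclusion does not modify the original can be checked with a short sketch. It also shows that several rows can be excluded at once by negating a vector; the name tail_rows is used only for this illustration.

```r
z <- matrix(c(1:4, 1, 1, 0, 0, 1, 0, 1, 0), nrow = 4, byrow = TRUE)

# Leave out rows 1 and 2 in one step
tail_rows <- z[-c(1, 2), ]
print(tail_rows)

# z itself still has all four rows and three columns
print(dim(z))
```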

2) Filtering on Matrices :
> m<-matrix ( 1:12,nrow=4)
> m
[,1] [,2] [,3]
[1,]    1    5    9
[2,]    2    6   10
[3,]    3    7   11
[4,]    4    8   12

Now we filter the matrix m, keeping only the rows whose value in column 2 is >= 6.
> m[m[,2] >= 6,]
[,1] [,2] [,3]
[1,]    2    6   10
[2,]    3    7   11
[3,]    4    8   12

Now we filter the matrix m on two conditions at once: column 2 must be >= 6 and column 3 must be > 10.
> m[m[,2] >= 6 & m[,3] > 10,]
[,1] [,2] [,3]
[1,]    3    7   11
[2,]    4    8   12

Note:
Filtering returns the complete matching rows, including the columns that were not used in the condition.
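
To see why this works, it can help to look at the intermediate logical vector. The sketch below splits the filter into steps; the names keep, rows and filtered are chosen only for this illustration.

```r
m <- matrix(1:12, nrow = 4)

keep <- m[, 2] >= 6      # one TRUE/FALSE value per row
print(keep)

rows <- which(keep)      # positions of the rows that pass the test
print(rows)

filtered <- m[keep, ]    # same result as m[m[, 2] >= 6, ]
print(filtered)
```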

apply() Function on Matrices :
The R function apply() lets you apply an arbitrary function to any dimension of a matrix, array, or data frame.
The format for the apply() function is apply(x, MARGIN, FUN, ...) where x is the data object, MARGIN is the dimension index, FUN is any R function that you specify, and ... are any parameters you want to pass to FUN. In a matrix or data frame, MARGIN=1 indicates rows and MARGIN=2 indicates columns.
Example:

> mydata <- matrix(rnorm(30), nrow=6)
> mydata
         [,1]   [,2]    [,3]   [,4]   [,5]
[1,]  0.71298  1.368 -0.8320 -1.234 -0.790
[2,] -0.15096 -1.149 -1.0001 -0.725  0.506
[3,] -1.77770  0.519 -0.6675  0.721 -1.350
[4,] -0.00132 -0.308  0.9117 -1.391  1.558
[5,] -0.00543  0.378 -0.0906 -1.485 -0.350
[6,] -0.52178 -0.539 -1.7347  2.050  1.569

Calculating the means of the 6 rows
> apply(mydata, 1, mean)
[1] -0.155 -0.504 -0.511 0.154 -0.310 0.165

Calculating the means of the 5 columns
> apply(mydata, 2, mean)
[1] -0.2907 0.0449 -0.5688 -0.3442 0.1906

Calculating the trimmed column means (in this case, means based on the middle 60% of the data, with the bottom 20% and top 20% of the values discarded)
> apply(mydata, 2, mean, trim=0.2)
[1] -0.1699 0.0127 -0.6475 -0.6575 0.2312
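
FUN does not have to be a built-in function. The sketch below passes a small user-defined helper to apply(); the helper spread and the seed value are assumptions made for this illustration, and because rnorm() draws random numbers the printed values will differ from the ones shown above.

```r
set.seed(1)                              # arbitrary seed so repeated runs match
mydata <- matrix(rnorm(30), nrow = 6)

# Hypothetical helper: width of the range (max minus min) of a vector
spread <- function(v) max(v) - min(v)

col_spread <- apply(mydata, 2, spread)   # one value per column
print(col_spread)

# rowMeans() and colMeans() are built-in shortcuts for the mean case
print(all.equal(apply(mydata, 1, mean), rowMeans(mydata)))
```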

3) Adding and Deleting Matrix Rows and Columns :
Technically, matrices are of fixed length and dimensions, so we cannot add or delete rows or columns. However, matrices can be reassigned, and thus we can achieve the same effect as if we had directly done additions or deletions.

First, recall how we reassign vectors to change their size.
> x <- c(12,5,13,16,8)
> x
[1] 12 5 13 16 8

# append 20 to the vector x
> x <- c(x,20)
> x
[1] 12 5 13 16 8 20

# insert 20 between the third and fourth elements of x
> x <- c(x[1:3],20,x[4:6])
> x
[1] 12 5 13 20 16 8 20

# delete elements 2 through 4
> x <- x[-2:-4]
> x
[1] 12 16 8 20

Notes:
In the first case, x is originally of length 5, which we extend to 6 via concatenation and then reassignment. We didn’t literally change the length of x but instead created a new vector from x and then assigned x to that new vector.
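
Base R also provides append(), which wraps the same idea. The sketch below repeats the first two steps with it and keeps each result under a new name (x2 and x3, chosen for this illustration) to show that the original x is left as it was.

```r
x <- c(12, 5, 13, 16, 8)

# append() returns a new vector, just like the c() calls above
x2 <- append(x, 20)              # add 20 at the end
x3 <- append(x2, 20, after = 3)  # insert 20 after the third element
print(x3)

print(length(x))                 # x itself was never modified
```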

Changing the Size of a Matrix :
The rbind() (row bind) and cbind() (column bind) functions let you change the size of a matrix by adding rows or columns to it.
Example:
> x<-rep(1,4)
> x
[1] 1 1 1 1

> z<-matrix(c(1:4,1,1,0,0,1,0,1,0),nrow=4)
> z
[,1] [,2] [,3]
[1,]    1    1    1
[2,]    2    1    0
[3,]    3    0    1
[4,]    4    0    0

> cbind(x,z)
     x
[1,] 1 1 1 1
[2,] 1 2 1 0
[3,] 1 3 0 1
[4,] 1 4 0 0

Here, cbind() creates a new matrix by combining a column of 1s with the columns of z.
We could also have relied on recycling, as follows:
> cbind(1,z)
[,1] [,2] [,3] [,4]
[1,]    1    1    1    1
[2,]    1    2    1    0
[3,]    1    3    0    1
[4,]    1    4    0    0

Here, the value 1 was recycled into a vector of four 1s. You can also use the rbind() and cbind() functions as a quick way to create small matrices:

> q <- cbind(c(1,2),c(3,4))
> q
[,1] [,2]
[1,]   1   3
[2,]   2   4
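
The examples above use cbind(); a comparable sketch for rbind() follows. The names z_more and q2 and the row of 9s are assumptions made for this illustration.

```r
z <- matrix(c(1:4, 1, 1, 0, 0, 1, 0, 1, 0), nrow = 4)

# Add a row at the bottom; the vector supplies one value per column
z_more <- rbind(z, c(9, 9, 9))
print(dim(z_more))

# Build a small matrix row by row instead of column by column
q2 <- rbind(c(1, 2), c(3, 4))
print(q2)
```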

You can delete rows or columns by reassignment as follows:

> m <- matrix(1:6,nrow=3)
> m
[,1] [,2]
[1,]  1   4
[2,]  2   5
[3,]  3   6

> m <- m[c(1,3),]

> m

[,1] [,2]
[1,]  1   4
[2,]  3  6
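
The same deletion can be written with a negative subscript, which names the row to drop rather than the rows to keep. This is a minimal sketch; m_rows and m_cols are names chosen for this illustration, and drop=FALSE (explained in the next section) keeps the one-column result a matrix.

```r
m <- matrix(1:6, nrow = 3)

m_rows <- m[-2, ]                  # drop row 2, same as m[c(1, 3), ]
m_cols <- m[, -1, drop = FALSE]    # drop column 1, keep the matrix shape
print(m_rows)
print(dim(m_cols))
```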

4) Distinction between Vector and Matrix :
As we know, a matrix is just a vector with two additional attributes: the number of rows and the number of columns. Here, we’ll take a closer look at the vector nature of matrices.
Example:

> z <- matrix(1:8,nrow=4)
> z
[,1] [,2]
[1,]   1   5
[2,]   2   6
[3,]   3   7
[4,]  4    8

> class(z)
[1] "matrix"

> attributes(z)
$dim
[1] 4 2

> dim(z)
[1] 4 2

The numbers of rows and columns are obtainable individually via the nrow() and ncol() functions:
> nrow(z)
[1] 4
> ncol(z)
[1] 2
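
A short sketch of what "vector nature" means in practice: vector functions and single-index access work on a matrix as well. The name z_plus is chosen only for this illustration.

```r
z <- matrix(1:8, nrow = 4)

print(length(z))     # total number of elements, as for any vector
print(z[6])          # a single index counts down the columns: same cell as z[2, 2]
print(is.matrix(z))  # a check that does not depend on what class() prints

z_plus <- z + 10     # vector arithmetic applies element by element
print(z_plus)
```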

Avoiding Unintended Dimension Reduction:
Now we will see a key difference between a matrix and a vector.
> z <- matrix(1:8,nrow=4)
> z
[,1] [,2]
[1,]    1    5
[2,]    2    6
[3,]    3    7
[4,]    4    8

> r <- z[2,]
> r
[1] 2 6

Here, r is displayed in vector format, not matrix format. In other words, r is a vector of length 2, rather than a 1-by-2 matrix. We can confirm this in a couple of ways:
> attributes(z)
$dim
[1] 4 2

> attributes(r)
NULL

> dim(r)
NULL

> str(z)
int [1:4, 1:2] 1 2 3 4 5 6 7 8

> str(r)
int [1:2] 2 6

Note:
Here, R tells us that z has row and column dimensions, while r does not. Similarly, str() tells us that z has indices ranging over 1:4 and 1:2, for rows and columns, while r’s indices simply range over 1:2. This confirms that r is a vector, not a matrix.

In R there is a way to suppress this dimension reduction: the drop argument.
> r <- z[2,, drop=FALSE]
> r
[,1] [,2]
[1,]  2   6

> attributes(r)
$dim
[1] 1 2

> dim(r)
[1] 1 2

Now r is a 1-by-2 matrix, not a two-element vector. For these reasons, you may find it useful to routinely include the drop=FALSE argument in all your matrix code.
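
The same reduction happens when a single column is extracted. The sketch below contrasts the two forms; col_vec and col_mat are names chosen only for this illustration.

```r
z <- matrix(1:8, nrow = 4)

col_vec <- z[, 1]                  # dimension dropped: plain vector
col_mat <- z[, 1, drop = FALSE]    # dimension kept: 4-by-1 matrix

print(is.matrix(col_vec))
print(dim(col_mat))
print(nrow(col_vec))               # NULL, because a plain vector has no dim
```

Code that later calls nrow() or ncol() on the result is one place where the difference matters, since both return NULL for a plain vector.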

If you have a vector that you wish to be treated as a matrix, you can use the as.matrix() function, as follows:

> u <- 1:3
> u
[1] 1 2 3
> v <- as.matrix(u)

> attributes(u)
NULL

> attributes(v)
$dim
[1] 3 1
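
As the output shows, as.matrix() turns a vector into a one-column matrix. If a one-row matrix is wanted instead, one option is to call matrix() with nrow=1, as in this sketch (the name w is an assumption made for the illustration).

```r
u <- c(1, 2, 3)

v <- as.matrix(u)          # 3-by-1 column matrix
w <- matrix(u, nrow = 1)   # 1-by-3 row matrix
print(dim(v))
print(dim(w))

print(t(v))                # transposing the column also gives a 1-by-3 matrix
```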

5) Naming Matrix Rows and Columns
The natural way to refer to rows and columns in a matrix is via row and column numbers. However, you can also give names to these entities.
Example:
> z <- matrix(1:4,nrow=2)
> z
[,1] [,2]
[1,]    1    3
[2,]    2    4

> colnames(z)

NULL

> colnames(z) <- c("a","b")
> z
a b
[1,] 1 3
[2,] 2 4

> colnames(z)
[1] "a" "b"

> z[,"a"]
[1] 1 2

Note:
As you see here, these names can then be used to reference specific columns. The function rownames() works similarly for naming matrix rows. Naming rows and columns is usually less important when writing R code for general applications, but it can be useful when analyzing a specific data set.
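
A minimal sketch of rownames() used together with colnames(); the labels "r1" and "r2" are hypothetical and chosen only for this illustration.

```r
z <- matrix(1:4, nrow = 2)
colnames(z) <- c("a", "b")
rownames(z) <- c("r1", "r2")   # hypothetical row labels

print(z)
print(z["r2", "b"])            # look up one cell by its row and column name
print(dimnames(z))             # both sets of names, returned as a list
```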

--------------------------------------------------------------------------------------------------------
Thanks, TAMATAM ; Business Intelligence & Analytics Professional
--------------------------------------------------------------------------------------------------------