Sunday, January 14, 2018

How to Filter Indices or Elements from a Vector in R Programming

Filtering Indices from a Vector in R Programming
Filtering is one of the most common operations in R, as statistical analyses often focus on data that satisfies conditions of interest. R's filtering feature reflects the functional-language nature of R and allows us to extract a vector’s elements that satisfy certain conditions.
Filtering Indices :
Let's see an example that extracts from z all elements whose squares are greater than 8 and then assigns that subvector to w.
> z <- c(5,2,-3,8)
> w <- z[z*z > 8]
> w
[1] 5 -3 8

Let's look at how this works, step by step:
> z <- c(5,2,-3,8)
> z
[1] 5 2 -3 8

Evaluation of the expression z*z > 8 gives us a vector of Boolean values!
> z*z > 8
[1] TRUE FALSE TRUE TRUE

Notes :
First, in the expression z*z > 8, note that everything is a vector or vector
operator:
• Since z is a vector, that means z*z will also be a vector (of the same length
as z).
• Due to recycling, the number 8 (or vector of length 1) becomes the vector
(8,8,8,8) here.


• The operator >, like +, is actually a function.
> ">"(2,1)
[1] TRUE
> ">"(2,5)
[1] FALSE
> ">"(z*z,8)
[1] TRUE FALSE TRUE TRUE
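
The notes above can be checked directly. This is a minimal sketch that reuses z from the example; the names recycled, explicit, and called are introduced here only for illustration, and the vector c(8, 8, 8, 8) is written out by hand to mimic what recycling does.

```r
z <- c(5, 2, -3, 8)

# Recycling: comparing against the single value 8 gives the same result
# as comparing against a vector of four 8s written out by hand.
recycled <- z * z > 8
explicit <- z * z > c(8, 8, 8, 8)
identical(recycled, explicit)

# Calling the operator > as an ordinary function gives the same vector.
called <- ">"(z * z, 8)
identical(recycled, called)
```

Both calls to identical() should return TRUE, since all three expressions describe the same element-by-element comparison.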

Boolean values are used to call out the desired elements of z:
> z[c(TRUE,FALSE,TRUE,TRUE)]
[1] 5 -3 8

The following example will place things into even sharper focus. Here, we will again define our extraction condition in terms of z, but then we will use the results to extract from another vector, y, instead of extracting from z:
> z <- c(5,2,-3,8)
> j <- z*z > 8
> j
[1] TRUE FALSE TRUE TRUE

> y <- c(1,2,30,5)
> y[j]
[1] 1 30 5

Or, more compactly, we could write the following:
> z <- c(5,2,-3,8)
> y <- c(1,2,30,5)
> y[z*z > 8]
[1] 1 30 5

Note:
Here we are using one vector, z, to determine the indices used to filter another vector, y. In contrast, our earlier example used z to filter itself.
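
The same idea is common with parallel vectors, where one vector holds labels and another holds measurements. The sketch below uses made-up data: student, score, and the cutoff of 65 are assumptions chosen only for illustration.

```r
# Hypothetical data: student and score are parallel vectors of equal length.
student <- c("Ana", "Ben", "Cy", "Dee")
score <- c(72, 48, 91, 65)

# The condition is built from score, but it is used to index student.
passed <- student[score >= 65]
passed
```

This works only because the two vectors line up position by position; if they had different lengths, the logical index would be recycled or would run past the end of the vector, and the result would be misleading.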

Here’s another example, this one involving assignment. Say we have a vector x in which we wish to replace all elements larger than 3 with 0.
> x <- c(1,3,8,2,20)
> x[x > 3] <- 0
> x
[1] 1 3 0 2 0
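
A small variation on this pattern is capping values at a limit instead of zeroing them. This is only a sketch; the name capped and the choice to work on a copy are assumptions made for illustration.

```r
x <- c(1, 3, 8, 2, 20)

# Work on a copy so the original vector stays available for comparison.
capped <- x

# The single value 3 is recycled over every selected position.
capped[capped > 3] <- 3
capped
```

Here the elements 8 and 20 are replaced by 3, while the remaining elements are left untouched.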

Filtering with the subset() Function :
Filtering can also be done with the subset() function. When applied to vectors, the difference between using this function and ordinary filtering lies in the manner in which NA values are handled.
> x <- c(6,1:3,NA,12)
> x
[1] 6 1 2 3 NA 12

> x[x > 5]
[1] 6 NA 12

> subset(x,x > 5)
[1] 6 12

Note:
When we did ordinary filtering with x[x > 5], R basically said, “Well, x[5] is unknown, so it’s also unknown whether it is greater than 5.” But you may not want NAs in your results. When you wish to exclude NA values, using subset() saves you the trouble of removing the NA values yourself.
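
For comparison, here is one way to remove the NA values yourself with ordinary filtering. It is a sketch rather than the only approach; the names keep, by_hand, and by_subset are introduced here for illustration.

```r
x <- c(6, 1:3, NA, 12)

# Combine the condition with !is.na(x) so the NA position becomes FALSE
# instead of NA, and is therefore dropped by the indexing step.
keep <- !is.na(x) & x > 5
by_hand <- x[keep]

# subset() gives the same elements without the extra is.na() test.
by_subset <- subset(x, x > 5)
identical(by_hand, by_subset)
```

Both approaches should give the vector 6 12 for this input; subset() simply hides the NA check.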

The Selection Function which() :
Filtering consists of extracting the elements of a vector z that satisfy a certain condition. In some cases, though, we may just want to find the positions within z at which the condition holds. We can do this using which(), as follows:
> z <- c(5,2,-3,8)
> which(z*z > 8)
[1] 1 3 4

The result says that elements 1, 3, and 4 of z have squares greater than 8.
Here, the expression z*z > 8 is evaluated to (TRUE,FALSE,TRUE,TRUE). The which() function then simply reports which elements of the latter expression are TRUE.
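
The positions returned by which() can themselves be used as an index. The sketch below reuses z and the NA-containing vector x from the earlier examples; the names pos and na_pos are assumptions introduced for illustration.

```r
z <- c(5, 2, -3, 8)

# which() returns integer positions, not TRUE/FALSE values.
pos <- which(z * z > 8)

# Indexing by those positions gives the same elements as z[z * z > 8].
z[pos]

# length(pos) counts how many elements satisfy the condition.
length(pos)

# which() reports only TRUE positions, so an NA in the condition is skipped.
x <- c(6, 1:3, NA, 12)
na_pos <- which(x > 5)
na_pos
```

Because which() skips NA, indexing with x[which(x > 5)] is another way to avoid NA values in the filtered result.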

--------------------------------------------------------------------------------------------------------
Thanks, TAMATAM ; Business Intelligence & Analytics Professional
--------------------------------------------------------------------------------------------------------
