
Vector selection: the good times (2)

How about analyzing your midweek results?

To select multiple elements from a vector, add square brackets at the end of it and indicate between the brackets which elements should be selected.
For example, to select the first and the fifth day of the week, put the vector c(1, 5) between the square brackets. The code below selects the first and fifth element of poker_vector:
poker_vector[c(1, 5)]

Assign the poker results of Tuesday, Wednesday and Thursday to the variable poker_midweek.
# Poker and roulette winnings from Monday to Friday:
poker_vector <- c(140, -50, 20, -120, 240)
roulette_vector <- c(-24, -50, 100, -350, 10)
days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
names(poker_vector) <- days_vector
names(roulette_vector) <- days_vector

# Define a new variable based on a selection
poker_midweek <- poker_vector[c(2,3,4)]
Vector selection: the good times (3)


Selecting multiple elements of poker_vector with c(2, 3, 4) is not very convenient. Many statisticians are lazy people by nature, so they created an easier way to do this: c(2, 3, 4) can be abbreviated to 2:4, which generates a vector with all natural numbers from 2 up to 4.

So, another way to find the mid-week results is poker_vector[2:4].
Notice how the vector 2:4 is placed between the square brackets to select elements 2 up to 4. (Written this way, the sequence counts upward.)
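
As a quick check of the colon shorthand, the sketch below compares both spellings on the poker_vector from the exercises. The names by_c and by_colon are made up here for illustration only.

```r
poker_vector <- c(140, -50, 20, -120, 240)
names(poker_vector) <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")

# 2:4 expands to the positions 2, 3 and 4
2:4

# Both spellings pick the same three elements (Tuesday to Thursday)
by_c     <- poker_vector[c(2, 3, 4)]
by_colon <- poker_vector[2:4]
identical(by_c, by_colon)  # TRUE
```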

Assign to roulette_selection_vector the roulette results from Tuesday up to Friday; make use of : if it makes things easier for you.

# Poker and roulette winnings from Monday to Friday:
poker_vector <- c(140, -50, 20, -120, 240)
roulette_vector <- c(-24, -50, 100, -350, 10)
days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
names(poker_vector) <- days_vector
names(roulette_vector) <- days_vector

# Define a new variable based on a selection
roulette_selection_vector <- roulette_vector[2:5]
Vector selection: the good times (4)


Another way to tackle the previous exercise is by using the names of the vector elements (Monday, Tuesday, ...) instead of their numeric positions. For example,

poker_vector["Monday"]

will select the first element of poker_vector since "Monday" is the name of that first element.

Just like you did in the previous exercise with numerics, you can also use the element names to select multiple elements, by placing a character vector of names between the square brackets.
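
A minimal sketch of selection by name, using the same poker_vector as the exercises. The choice of Monday and Friday is only an illustration:

```r
poker_vector <- c(140, -50, 20, -120, 240)
names(poker_vector) <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")

# One name selects one element
poker_vector["Monday"]

# A character vector of names selects several elements at once
poker_vector[c("Monday", "Friday")]
```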

  • Select the first three elements in poker_vector by using their names: "Monday", "Tuesday" and "Wednesday". Assign the result of the selection to poker_start.
  • Calculate the average of the values in poker_start with the mean() function. Simply print out the result so you can inspect it.
# Poker and roulette winnings from Monday to Friday:
poker_vector <- c(140, -50, 20, -120, 240)
roulette_vector <- c(-24, -50, 100, -350, 10)
days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
names(poker_vector) <- days_vector
names(roulette_vector) <- days_vector

# Select poker results for Monday, Tuesday and Wednesday
poker_start <- poker_vector[c("Monday","Tuesday","Wednesday")]

# Calculate the average of the elements in poker_start with the built-in mean() function
mean(poker_start)

Selection by comparison - Step 1


By making use of comparison operators, we can approach the previous question in a more proactive way.

The (logical) comparison operators known to R are:

  • < for less than
  • > for greater than
  • <= for less than or equal to
  • >= for greater than or equal to
  • == for equal to each other
  • != for not equal to each other

As seen in the previous chapter, stating 6 > 5 returns TRUE. The nice thing about R is that you can use these comparison operators also on vectors. For example:

> c(4, 5, 6) > 5
(As I understand it, the first > on the line above is R's input prompt, not a greater-than sign.)
This command tests for every element of the vector if the condition stated by the comparison operator is TRUE or FALSE.
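
To make the element-by-element behaviour concrete, here is a small sketch. The variable big_loss and the -50 threshold are assumptions made up for illustration, not part of the exercise.

```r
# Each element is compared with 5 in turn
c(4, 5, 6) > 5   # FALSE FALSE TRUE

poker_vector <- c(140, -50, 20, -120, 240)
names(poker_vector) <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")

# The other operators work the same way; the result keeps the day names
big_loss <- poker_vector <= -50
big_loss
```

The result is a logical vector with one TRUE or FALSE per day, here TRUE for Tuesday and Thursday.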

  • Check which elements in poker_vector are positive (i.e. > 0) and assign this to selection_vector.
  • Print out selection_vector so you can inspect it. The printout tells you whether you won (TRUE) or lost (FALSE) any money for each day.

# Poker and roulette winnings from Monday to Friday:
poker_vector <- c(140, -50, 20, -120, 240)
roulette_vector <- c(-24, -50, 100, -350, 10)
days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
names(poker_vector) <- days_vector
names(roulette_vector) <- days_vector

# Which days did you make money on poker?
selection_vector <- poker_vector > 0

# Print out selection_vector
selection_vector

Selection by comparison - Step 2


Working with comparisons will make your data analytical life easier. Instead of selecting a subset of days to investigate yourself (like before), you can simply ask R to return only those days where you realized a positive return for poker.

In the previous exercises you used selection_vector <- poker_vector > 0 to find the days on which you had a positive poker return. Now, you would like to know not only the days on which you won, but also how much you won on those days.

You can select the desired elements by putting selection_vector between the square brackets that follow poker_vector:

poker_vector[selection_vector]

R knows what to do when you pass a logical vector in square brackets: it will only select the elements that correspond to TRUE in selection_vector.
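
The sketch below shows what the logical vector does inside the brackets. The hand-written TRUE/FALSE vector and the names by_comparison and by_hand are only there for illustration.

```r
poker_vector <- c(140, -50, 20, -120, 240)
names(poker_vector) <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")

# TRUE on Monday, Wednesday and Friday
selection_vector <- poker_vector > 0

# Only the elements in the TRUE positions are kept
by_comparison <- poker_vector[selection_vector]

# Writing the logical vector out by hand gives the same selection
by_hand <- poker_vector[c(TRUE, FALSE, TRUE, FALSE, TRUE)]
identical(by_comparison, by_hand)  # TRUE
```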
Use selection_vector in square brackets to assign the amounts that you won on the profitable days to the variable poker_winning_days.

# Poker and roulette winnings from Monday to Friday:
poker_vector <- c(140, -50, 20, -120, 240)
roulette_vector <- c(-24, -50, 100, -350, 10)
days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
names(poker_vector) <- days_vector
names(roulette_vector) <- days_vector

# Which days did you make money on poker?
selection_vector <- poker_vector > 0
# Flag the profitable days, i.e. the elements of poker_vector that are greater than 0
# Select from poker_vector these days


poker_winning_days <- poker_vector[selection_vector]

Advanced selection

Just like you did for poker, you also want to know those days where you realized a positive return for roulette.
  • Create the variable selection_vector, this time to see on which days you made a profit with roulette.
  • Assign the amounts that you made on the days that you ended positively for roulette to the variable roulette_winning_days. This vector thus contains the positive winnings of roulette_vector.

# Poker and roulette winnings from Monday to Friday:
poker_vector <- c(140, -50, 20, -120, 240)
roulette_vector <- c(-24, -50, 100, -350, 10)
days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
names(poker_vector) <- days_vector
names(roulette_vector) <- days_vector

# Which days did you make money on roulette?
selection_vector <- roulette_vector > 0

# Select from roulette_vector these days
roulette_winning_days <- roulette_vector[selection_vector]
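
As a side note, the two steps can be folded into one by placing the comparison directly inside the brackets. This is only a sketch of an equivalent shorthand; the exercise itself asks for the intermediate selection_vector.

```r
roulette_vector <- c(-24, -50, 100, -350, 10)
names(roulette_vector) <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")

# The comparison is evaluated first, then used to pick the elements
roulette_winning_days <- roulette_vector[roulette_vector > 0]
roulette_winning_days
```

With the winnings above, this keeps Wednesday (100) and Friday (10).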


