R Language 1: Vectors
Study notes on the R language part of the 生信技能树 course
He who exhausts himself over everything will, in the end, accomplish nothing.
------- Frederick the Great
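01 R and RStudio
Create a new R project (in RStudio) instead of calling setwd(); this makes it easier to manage the working directory (the default location where files are saved).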
02 Data types and vectors
Data types: numeric, character, logical, and factor
Logical: TRUE (T), FALSE (F), NA (missing value: it exists but is unknown)
NULL: does not exist
Function for checking the data type: class()
The is family of functions tests for a data type and returns TRUE or FALSE
is.numeric()
is.logical()
is.character()
The as family of functions converts between data types
as.numeric()
as.logical()
as.character()
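As a minimal sketch of how class(), the is family, and the as family fit together (the values below are made up for illustration):

```r
# Illustrative values only; chr_vec and num_vec are hypothetical names
chr_vec <- c("1", "2", "3")
class(chr_vec)          # "character"
is.character(chr_vec)   # TRUE
is.numeric(chr_vec)     # FALSE

num_vec <- as.numeric(chr_vec)  # convert character to numeric
class(num_vec)          # "numeric"
is.numeric(num_vec)     # TRUE

lgl_vec <- as.logical(c(1, 0))  # 0 becomes FALSE, other numbers become TRUE
chr_true <- as.character(TRUE)  # "TRUE"
```

An is function only reports the type; an as function returns a converted copy and leaves the original object unchanged.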
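Data structures: vector (a sequence of values, duplicates allowed), data frame (roughly a table; each column holds only one data type, and a single column on its own is a vector), matrix, and list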
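2.1 Creating vectors
(1) Use c() to combine values one by one
(2) Use the colon ":" for consecutive integers
(3) Use rep() for repeated values, seq() for regular sequences, and rnorm() for random numbers (drawn from a normal distribution)
(4) Combine these to produce more complex vectors
e.g.: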
paste0(rep("gene",times=3),1:3)
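The four approaches above can be sketched as follows. The variable names, the values, and the seed are assumptions chosen for illustration.

```r
v1 <- c(2, 5, 6, 2, 9)                 # (1) combine values one by one
v2 <- 1:5                              # (2) consecutive integers 1 2 3 4 5
v3 <- rep("gene", times = 3)           # (3) "gene" "gene" "gene"
v4 <- seq(from = 3, to = 21, by = 3)   # (3) 3 6 9 12 15 18 21
set.seed(1)                            # hypothetical seed, only for reproducibility
v5 <- rnorm(n = 3)                     # (3) three draws from the standard normal
v6 <- paste0(v3, 1:3)                  # (4) "gene1" "gene2" "gene3"
```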
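Question: what are the differences and similarities between paste() and paste0()?
paste() and paste0() concatenate character vectors. paste() lets you set the separator through sep, which defaults to a single space; paste0() always joins with an empty string and has no sep argument. The collapse argument then joins the elements of the resulting character vector into one string. In short, paste0() is the simplified version of paste(), equivalent to paste(..., sep = "").
Usage: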
paste (..., sep = " ", collapse = NULL, recycle0 = FALSE)
paste0(..., collapse = NULL, recycle0 = FALSE)
Examples:
## When passing a single vector, paste0 and paste work like as.character.
> paste(1:12)
[1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10" "11" "12"
> paste0(1:12) # same
[1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10" "11" "12"
## If you pass several vectors to paste0, they are concatenated in a
## vectorized way.
> (nth <- paste0(1:12, c("st", "nd", "rd", rep("th", 9))))
[1] "1st" "2nd" "3rd" "4th" "5th" "6th" "7th" "8th" "9th" "10th"
[11] "11th" "12th"
## paste works the same, but separates each input with a space.
## Notice that the recycling rules make every input as long as the longest input.
> paste(month.abb, "is the", nth, "month of the year.")
[1] "Jan is the 1st month of the year."
[2] "Feb is the 2nd month of the year."
[3] "Mar is the 3rd month of the year."
[4] "Apr is the 4th month of the year."
[5] "May is the 5th month of the year."
[6] "Jun is the 6th month of the year."
[7] "Jul is the 7th month of the year."
[8] "Aug is the 8th month of the year."
[9] "Sep is the 9th month of the year."
[10] "Oct is the 10th month of the year."
[11] "Nov is the 11th month of the year."
[12] "Dec is the 12th month of the year."
> month.abb
[1] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"
> nth
[1] "1st" "2nd" "3rd" "4th" "5th" "6th" "7th" "8th" "9th" "10th"
[11] "11th" "12th"
> paste(month.abb, letters)
[1] "Jan a" "Feb b" "Mar c" "Apr d" "May e" "Jun f" "Jul g" "Aug h" "Sep i"
[10] "Oct j" "Nov k" "Dec l" "Jan m" "Feb n" "Mar o" "Apr p" "May q" "Jun r"
[19] "Jul s" "Aug t" "Sep u" "Oct v" "Nov w" "Dec x" "Jan y" "Feb z"
## You can change the separator by passing a sep argument
## which can be multiple characters.
> paste(month.abb, "is the", nth, "month of the year.", sep = "_*_")
[1] "Jan_*_is the_*_1st_*_month of the year."
[2] "Feb_*_is the_*_2nd_*_month of the year."
[3] "Mar_*_is the_*_3rd_*_month of the year."
[4] "Apr_*_is the_*_4th_*_month of the year."
[5] "May_*_is the_*_5th_*_month of the year."
[6] "Jun_*_is the_*_6th_*_month of the year."
[7] "Jul_*_is the_*_7th_*_month of the year."
[8] "Aug_*_is the_*_8th_*_month of the year."
[9] "Sep_*_is the_*_9th_*_month of the year."
[10] "Oct_*_is the_*_10th_*_month of the year."
[11] "Nov_*_is the_*_11th_*_month of the year."
[12] "Dec_*_is the_*_12th_*_month of the year."
## To collapse the output into a single string, pass a collapse argument.
> paste0(nth, collapse = ", ")
[1] "1st, 2nd, 3rd, 4th, 5th, 6th, 7th, 8th, 9th, 10th, 11th, 12th"
## For inputs of length 1, use the sep argument rather than collapse
> paste("1st", "2nd", "3rd", collapse = ", ") # probably not what you wanted
[1] "1st 2nd 3rd"
> paste("1st", "2nd", "3rd", sep = ", ")
[1] "1st, 2nd, 3rd"
## You can combine the sep and collapse arguments together.
> paste(month.abb, nth, sep = ": ", collapse = "; ")
[1] "Jan: 1st; Feb: 2nd; Mar: 3rd; Apr: 4th; May: 5th; Jun: 6th; Jul: 7th; Aug: 8th; Sep: 9th; Oct: 10th; Nov: 11th; Dec: 12th"
Coercion priority when different types are mixed in one vector:
logical ----> character; logical ----> numeric ----> character
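A small sketch of this coercion order, using made-up values: when c() receives mixed types, every element is converted to the most general type present.

```r
m1 <- c(TRUE, 1)        # logical + numeric   -> numeric:   1 1
m2 <- c(1, "a")         # numeric + character -> character: "1" "a"
m3 <- c(TRUE, "a")      # logical + character -> character: "TRUE" "a"
m4 <- c(TRUE, 2, "a")   # all three           -> character: "TRUE" "2" "a"

class(m1)   # "numeric"
class(m4)   # "character"
```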
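2.2 Operations on a single vector
(1) Assign it to a variable name
x <- c(1,3,5)
(2) Simple math
log(x)
sqrt(x)
(3) Test a condition to produce a logical vector
x > 3
x == 3
(4) Basic statistics
unique(x) # remove duplicates, keeping the first occurrence of each value
duplicated(x) # logical vector: FALSE the first time a value appears, TRUE for every later repeat
table(x) # count how many times each value occurs
sort(x) # sort, ascending by default; for descending order add decreasing = TRUE, or reverse the ascending result with rev(sort(x))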
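Because x <- c(1,3,5) has no repeated values, the statistics functions are easier to see on a vector that does. The vector z below is a hypothetical example, not part of the original notes.

```r
z <- c(3, 1, 3, 5, 1)            # hypothetical vector with repeats

u  <- unique(z)                  # 3 1 5
d  <- duplicated(z)              # FALSE FALSE TRUE FALSE TRUE
u2 <- z[!duplicated(z)]          # same result as unique(z)
tb <- table(z)                   # counts: 1 -> 2, 3 -> 2, 5 -> 1
s1 <- sort(z)                    # 1 1 3 3 5
s2 <- sort(z, decreasing = TRUE) # 5 3 3 1 1
s3 <- rev(sort(z))               # also 5 3 3 1 1
```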
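2.3 Operations on two vectors
(1) Logical comparison, producing a logical vector of the same length
x %in% y # is each element of x present in y?
(2) Math
(3) Concatenation
paste(x, y, sep=":")
(4) Intersection, union, and difference
intersect(x,y) # intersection
union(x,y) # union
setdiff(x,y) # elements in x but not in y
setdiff(y,x) # elements in y but not in x
When the two vectors differ in length, the shorter one is recycled (repeated from its start) to match the longer one before the calculation.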
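The two-vector operations and the recycling rule can be sketched with two hypothetical vectors a and b (names and values are assumptions for illustration):

```r
a <- c(1, 3, 5, 1)
b <- c(3, 2, 5, 6)

eq  <- a == b                 # FALSE FALSE TRUE FALSE (element by element)
inb <- a %in% b               # FALSE TRUE TRUE FALSE
ab  <- a + b                  # 4 5 10 7
pp  <- paste(a, b, sep = ":") # "1:3" "3:2" "5:5" "1:6"
i   <- intersect(a, b)        # 3 5
un  <- union(a, b)            # 1 3 5 2 6
d1  <- setdiff(a, b)          # 1
d2  <- setdiff(b, a)          # 2 6

# Recycling: c(10, 20) is reused as 10 20 10 20 to match the length of a
rc <- a + c(10, 20)           # 11 23 15 21
```

Note that == compares position by position, while %in% asks whether each element of the left vector appears anywhere in the right one.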
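2.4 Subsetting vectors
Use []: (1) with a logical vector, the values at TRUE positions are kept and those at FALSE positions are dropped; (2) with positions, the values at the given indices are returned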
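Both ways of subsetting can be sketched on a hypothetical vector v (values chosen only for illustration):

```r
v <- c(8, 2, 9, 4, 7)

# (1) Logical subsetting: keep the positions where the condition is TRUE
big  <- v[v > 5]                  # 8 9 7
some <- v[v %in% c(2, 4, 10)]     # 2 4

# (2) Positional subsetting
second   <- v[2]                  # 2
ends     <- v[c(1, 5)]            # 8 7
middle   <- v[2:4]                # 2 9 4
no_first <- v[-1]                 # 2 9 4 7 (a negative index drops that position)
```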
2.5 How do you modify one or more elements of a vector?
Subset, then assign
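A minimal sketch of "subset, then assign", again on a hypothetical vector:

```r
v <- c(8, 2, 9, 4, 7)

v[2] <- 20        # replace by position:  8 20 9 4 7
v[v > 8] <- 0     # replace by condition: 8 0 0 4 7
```

The assignment changes v in place, so it is worth checking the result before reusing the vector.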
2.6 Simple vector plots
Omitted.
How do you reorder elements?
Change the order of the indices used when subsetting the vector
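For instance, reordering by index might look like this. The vectors are hypothetical, and order() is brought in here as one convenient way to produce an index vector; it is not mentioned in the original notes.

```r
w <- c("A", "B", "C", "D", "E")
w_rev  <- w[c(5, 4, 3, 2, 1)]   # "E" "D" "C" "B" "A"
w_pick <- w[c(2, 1, 3, 5, 4)]   # "B" "A" "C" "E" "D"

scores <- c(70, 90, 80)
idx <- order(scores)            # 1 3 2: positions that would sort scores
sorted <- scores[idx]           # 70 80 90
```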
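Reordering by matching with match(): in y[match(x,y)], the vector outside the brackets is the second argument of match(), and y is reordered to follow the order of x
x <- c("A","B","C","D","E")
y <- c("B","D","E","A","C")
match(x,y) # with x as the template, return the position in y of each element of x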
#[1] 4 1 5 2 3
y[match(x,y)]
#[1] "A" "B" "C" "D" "E"
x[match(y,x)]
#[1] "B" "D" "E" "A" "C"
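A common use of this pattern is reordering values that travel with y so that they line up with x. The vals vector below is a hypothetical example, not part of the original notes.

```r
x <- c("A", "B", "C", "D", "E")
y <- c("B", "D", "E", "A", "C")
vals <- c(10, 20, 30, 40, 50)   # hypothetical values paired with y, element by element

idx <- match(x, y)              # 4 1 5 2 3
y_sorted    <- y[idx]           # "A" "B" "C" "D" "E", now in the order of x
vals_sorted <- vals[idx]        # 40 10 50 20 30, reordered the same way
```

If some element of x were missing from y, match() would return NA at that position, so it can be worth checking the index vector for NA before subsetting.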