R Language 1: Vectors

Study notes on the R language section from 生信技能树

Those who exhaust themselves over everything will accomplish nothing in the end.
------- Frederick the Great

01 R and RStudio

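In RStudio, create an R project instead of calling setwd(); it makes the working directory (the default location where files are saved) easier to manage.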

02 Data types and vectors

Data types: numeric, character, logical, and factor
Logical: TRUE (T), FALSE (F), and NA (a missing value: it exists but is unknown)
NULL: does not exist
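The difference between NA and NULL shows up as soon as they are placed in a vector. A minimal sketch (the names x_na and x_null are illustrative only):

```r
x_na <- c(1, NA, 3)       # NA occupies a position
length(x_na)              # 3
is.na(x_na)               # FALSE TRUE FALSE

x_null <- c(1, NULL, 3)   # NULL contributes nothing
length(x_null)            # 2
is.null(NULL)             # TRUE

class(NA)                 # "logical"
```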

Function for checking the data type: class()
The is family of functions tests for a data type and returns TRUE or FALSE
is.numeric()
is.logical()
is.character()

The as family of functions converts between data types
as.numeric()
as.logical()
as.character()
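A short sketch of class(), the is family, and the as family on three illustrative values (the names num, chr, and lgl are assumptions made for this example):

```r
num <- 3.14
chr <- "3.14"
lgl <- TRUE

class(num)           # "numeric"
class(chr)           # "character"
class(lgl)           # "logical"

is.numeric(chr)      # FALSE: it is text, even though it looks like a number
is.character(chr)    # TRUE

as.numeric(chr)      # 3.14
as.character(lgl)    # "TRUE"
as.numeric(lgl)      # 1
as.logical("abc")    # NA: the text cannot be read as a logical value
```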

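Data structures: vector (a sequence of values; duplicates are allowed), data frame (roughly a table; each column holds only one data type, and a single column on its own is a vector), matrix, and list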

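2.1 Creating vectors

(1) Combine values one by one with c()
(2) Use a colon ":" for consecutive integers
(3) Use rep() for repeated values, seq() for regular sequences, and rnorm() for random numbers
(4) Combine these to build more complex vectors
e.g.: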

paste0(rep("gene",times=3),1:3)
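The four approaches can be sketched side by side. The values are arbitrary, and set.seed() is added here only so the random draw is repeatable; it is not part of the original notes.

```r
c(2, 5, 6, 2, 9)                  # (1) combine values one by one
1:5                               # (2) 1 2 3 4 5
rep("x", times = 3)               # (3) "x" "x" "x"
seq(from = 3, to = 21, by = 3)    # (3) 3 6 9 12 15 18 21

set.seed(1)                       # assumption: fixed seed for repeatability
r <- rnorm(n = 3)                 # (3) three draws from a standard normal

genes <- paste0(rep("gene", times = 3), 1:3)   # (4) combination
genes                             # "gene1" "gene2" "gene3"
```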

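Question: how do paste() and paste0() differ, and how are they related?
paste and paste0 both concatenate character vectors. paste accepts a sep argument, which defaults to a single space; paste0 always joins with an empty string and has no sep argument. The collapse argument then joins the elements of the resulting vector into one string, using the collapse string as the separator. In short, paste0(...) is equivalent to paste(..., sep = "").

Usage: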

paste (..., sep = " ", collapse = NULL, recycle0 = FALSE)
paste0(...,            collapse = NULL, recycle0 = FALSE)

Examples:

## When passing a single vector, paste0 and paste work like as.character.
> paste(1:12)
 [1] "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10" "11" "12"
> paste0(1:12)    # same
 [1] "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10" "11" "12"

## If you pass several vectors to paste0, they are concatenated in a
## vectorized way.
> (nth <- paste0(1:12, c("st", "nd", "rd", rep("th", 9))))
 [1] "1st"  "2nd"  "3rd"  "4th"  "5th"  "6th"  "7th"  "8th"  "9th"  "10th"
[11] "11th" "12th"

## paste works the same, but separates each input with a space.
## Notice that the recycling rules make every input as long as the longest input.
> paste(month.abb, "is the", nth, "month of the year.")
 [1] "Jan is the 1st month of the year." 
 [2] "Feb is the 2nd month of the year." 
 [3] "Mar is the 3rd month of the year." 
 [4] "Apr is the 4th month of the year." 
 [5] "May is the 5th month of the year." 
 [6] "Jun is the 6th month of the year." 
 [7] "Jul is the 7th month of the year." 
 [8] "Aug is the 8th month of the year." 
 [9] "Sep is the 9th month of the year." 
[10] "Oct is the 10th month of the year."
[11] "Nov is the 11th month of the year."
[12] "Dec is the 12th month of the year."
> month.abb
 [1] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"
> nth
 [1] "1st"  "2nd"  "3rd"  "4th"  "5th"  "6th"  "7th"  "8th"  "9th"  "10th"
[11] "11th" "12th"
> paste(month.abb, letters)
 [1] "Jan a" "Feb b" "Mar c" "Apr d" "May e" "Jun f" "Jul g" "Aug h" "Sep i"
[10] "Oct j" "Nov k" "Dec l" "Jan m" "Feb n" "Mar o" "Apr p" "May q" "Jun r"
[19] "Jul s" "Aug t" "Sep u" "Oct v" "Nov w" "Dec x" "Jan y" "Feb z"

## You can change the separator by passing a sep argument
## which can be multiple characters.
> paste(month.abb, "is the", nth, "month of the year.", sep = "_*_")
 [1] "Jan_*_is the_*_1st_*_month of the year." 
 [2] "Feb_*_is the_*_2nd_*_month of the year." 
 [3] "Mar_*_is the_*_3rd_*_month of the year." 
 [4] "Apr_*_is the_*_4th_*_month of the year." 
 [5] "May_*_is the_*_5th_*_month of the year." 
 [6] "Jun_*_is the_*_6th_*_month of the year." 
 [7] "Jul_*_is the_*_7th_*_month of the year." 
 [8] "Aug_*_is the_*_8th_*_month of the year." 
 [9] "Sep_*_is the_*_9th_*_month of the year." 
[10] "Oct_*_is the_*_10th_*_month of the year."
[11] "Nov_*_is the_*_11th_*_month of the year."
[12] "Dec_*_is the_*_12th_*_month of the year."

## To collapse the output into a single string, pass a collapse argument.
> paste0(nth, collapse = ", ")
[1] "1st, 2nd, 3rd, 4th, 5th, 6th, 7th, 8th, 9th, 10th, 11th, 12th"

## For inputs of length 1, use the sep argument rather than collapse
> paste("1st", "2nd", "3rd", collapse = ", ")    # probably not what you wanted
[1] "1st 2nd 3rd"
> paste("1st", "2nd", "3rd", sep = ", ")
[1] "1st, 2nd, 3rd"

## You can combine the sep and collapse arguments together.
> paste(month.abb, nth, sep = ": ", collapse = "; ")
[1] "Jan: 1st; Feb: 2nd; Mar: 3rd; Apr: 4th; May: 5th; Jun: 6th; Jul: 7th; Aug: 8th; Sep: 9th; Oct: 10th; Nov: 11th; Dec: 12th"

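When values of different types are combined in one vector, they are coerced in this order of priority:
logical ----> character; logical ----> numeric ----> character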
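The coercion order can be checked by mixing types inside c(). A minimal sketch (the name mixed is illustrative):

```r
c(TRUE, 2)                  # 1 2          logical becomes numeric
c(TRUE, "a")                # "TRUE" "a"   logical becomes character
c(1, "a")                   # "1" "a"      numeric becomes character

mixed <- c(TRUE, 2, "a")    # all three types together
mixed                       # "TRUE" "2" "a"
class(mixed)                # "character"
```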

2.2 Operations on a single vector

(1) Assign it to a variable name

x <- c(1,3,5)

(2) Simple arithmetic

log(x)
sqrt(x)

(3) Test a condition to produce a logical vector

x > 3
x == 3

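(4) Basic statistics

unique(x)   # remove duplicates, keeping the first occurrence of each value
duplicated(x)    # logical vector: FALSE for the first occurrence of a value, TRUE for every later repeat
table(x)    # count how many times each value occurs
sort(x)     # sort, ascending by default; for descending order add decreasing = TRUE, or reverse the ascending result with rev(sort(x))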
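Because x <- c(1,3,5) has no repeated values, these functions are easier to see on a vector that does. A sketch with a hypothetical vector v:

```r
v <- c(3, 1, 3, 5, 1)         # hypothetical vector with repeats

unique(v)                     # 3 1 5
duplicated(v)                 # FALSE FALSE TRUE FALSE TRUE
v[!duplicated(v)]             # 3 1 5, the same result as unique(v)
table(v)                      # 1 occurs 2 times, 3 occurs 2 times, 5 occurs once
sort(v)                       # 1 1 3 3 5
sort(v, decreasing = TRUE)    # 5 3 3 1 1
rev(sort(v))                  # 5 3 3 1 1
```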

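2.3 Operations on two vectors

(1) Logical comparison, producing a logical vector of the same length

x %in% y      # for each element of x, is it present in y?

(2) Arithmetic
(3) Concatenation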

paste(x, y, sep=":")

(4) Intersection, union, and difference

intersect(x,y)   # intersection
union(x,y)       # union
setdiff(x,y)     # elements in x but not in y
setdiff(y,x)     # elements in y but not in x

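When two vectors of different lengths are used in an element-wise operation, the shorter one is recycled to the length of the longer one before the calculation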
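The two-vector operations and the recycling rule can be sketched together. The vectors a and b are hypothetical and deliberately differ in length:

```r
a <- c(1, 3, 5, 7)
b <- c(3, 2)

a %in% b                  # FALSE TRUE FALSE FALSE   membership test
a == b                    # FALSE FALSE FALSE FALSE  position-by-position, b recycled to 3 2 3 2
a + b                     # 4 5 8 9
paste(a, b, sep = ":")    # "1:3" "3:2" "5:3" "7:2"

intersect(a, b)           # 3
union(a, b)               # 1 3 5 7 2
setdiff(a, b)             # 1 5 7
setdiff(b, a)             # 2
```

Note that %in% and == answer different questions: 3 is a member of b, yet no position of a equals the recycled b. If the longer length is not a multiple of the shorter one, arithmetic and comparison still recycle but R issues a warning.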

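2.4 Filtering a vector (subsetting)

[]: (1) given a logical vector, keep the values at TRUE positions and drop those at FALSE positions; (2) given positions, take the elements at those indices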
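Both ways of using [] can be sketched on a hypothetical vector s:

```r
s <- c(8, 2, 6, 9, 4)

# (1) logical vector inside the brackets
s > 5                 # TRUE FALSE TRUE TRUE FALSE
s[s > 5]              # 8 6 9
s[s %in% c(2, 4)]     # 2 4

# (2) positions inside the brackets
s[2]                  # 2
s[c(1, 5)]            # 8 4
s[2:4]                # 2 6 9
```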

2.5 How to modify one or more elements of a vector

Subset, then assign
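A minimal sketch of subset-then-assign, using a hypothetical vector m:

```r
m <- c(8, 2, 6, 9, 4)

m[2] <- 20        # by position: m is now 8 20 6 9 4
m[m > 8] <- 0     # by condition: 20 and 9 are replaced
m                 # 8 0 6 0 4
```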

2.6 Simple plotting with vectors

How do you change the order of the elements?
Change the indices used to subset the vector
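Reordering by index can be sketched as follows; the vector z is hypothetical:

```r
z <- c("A", "B", "C", "D", "E")

z[c(2, 4, 5, 1, 3)]    # "B" "D" "E" "A" "C"   any order you choose
z[length(z):1]         # "E" "D" "C" "B" "A"   reversed, same as rev(z)
```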

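Matching and reordering with match: whichever vector sits outside the brackets goes second in match(); here y is reordered to follow the order of x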

x <- c("A","B","C","D","E")
y <- c("B","D","E","A","C")
match(x,y)    # x is the template: for each element of x, return the index of its match in y
#[1] 4 1 5 2 3

y[match(x,y)]  
#[1] "A" "B" "C" "D" "E"
x[match(y,x)]
#[1] "B" "D" "E" "A" "C"
posted @ 2021-01-07 00:13  stdforml