R Language 1: Vectors

Study notes for the R section of 生信技能树

He who agonizes over every matter will in the end accomplish nothing.
------- Frederick the Great

01 R and RStudio

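In RStudio, create a new R project instead of calling setwd(); this makes it easier to manage the working directory (the default location where files are saved).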

02 Data types and vectors

Data types: numeric, character, logical, and factor
Logical: TRUE (T), FALSE (F), NA (a missing value: it exists but is unknown)
NULL: does not exist
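
To make the difference between NA and NULL concrete, here is a small sketch (the vectors v and w are made up for illustration). NA still occupies a slot in a vector, while NULL disappears when combined.

```r
# NA: the value exists but is unknown, so it still takes up a position.
v <- c(1, NA, 3)
length(v)               # 3
is.na(v)                # FALSE TRUE FALSE
mean(v)                 # NA, because one value is unknown
mean(v, na.rm = TRUE)   # 2, after dropping the NA

# NULL: nothing is there, so c() silently drops it.
w <- c(1, NULL, 3)
length(w)               # 2
is.null(NULL)           # TRUE
```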

Function for checking the data type: class()
The is.* family of functions tests for a given data type and returns TRUE or FALSE
is.numeric()
is.logical()
is.character()
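
A quick sketch of class() and the is.* functions on a few literal values (the values themselves are arbitrary examples):

```r
class(3.14)                  # "numeric"
class("gene1")               # "character"
class(TRUE)                  # "logical"
class(factor(c("a", "b")))   # "factor"

is.numeric(3.14)             # TRUE
is.character(3.14)           # FALSE
is.logical(NA)               # TRUE: a bare NA is of logical type
```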

The as.* family of functions converts between data types
as.numeric()
as.logical()
as.character()
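
The conversions can be tried on small inputs, as in the sketch below. The inputs are arbitrary; the point is that values which cannot be converted become NA.

```r
as.numeric("3.14")                  # 3.14
as.numeric(TRUE)                    # 1
as.character(1:3)                   # "1" "2" "3"
as.logical(c("TRUE", "F", "yes"))   # TRUE FALSE NA ("yes" is not recognized)
as.logical(0:2)                     # FALSE TRUE TRUE (any non-zero number is TRUE)

# Text that cannot be parsed as a number becomes NA; R also raises a warning,
# which is suppressed here only to keep the output clean.
bad <- suppressWarnings(as.numeric("abc"))
is.na(bad)                          # TRUE
```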

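Data structures: vector (a sequence of values; duplicates are allowed), data frame (roughly a table; each column holds only one data type, and a single column taken on its own is a vector), matrix, and list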
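
As a rough illustration, the four structures can be built like this. All object names and values (v, m, df, l, the gene labels) are invented for the example.

```r
v  <- c(2, 2, 5)                          # vector: one type, duplicates allowed
m  <- matrix(1:6, nrow = 2)               # matrix: two-dimensional, one type overall
df <- data.frame(gene = c("g1", "g2", "g3"),
                 expr = c(1.5, 2.0, 0.3)) # data frame: one type per column
l  <- list(v, m, df)                      # list: can hold objects of any kind

dim(m)               # 2 3
is.vector(df$expr)   # TRUE: a single column is a vector
length(l)            # 3
```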

2.1 Creating vectors

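(1) Combine values one by one with c()
(2) Use a colon ":" for consecutive integers
(3) Use rep() for repeated values, seq() for regular sequences, and rnorm() for random numbers
(4) Combine these to produce more complex vectors
e.g.: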

paste0(rep("gene",times=3),1:3)
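
The four approaches listed above can be sketched side by side. The specific numbers are arbitrary, and set.seed() is only there so that the random draw is reproducible.

```r
c(2, 5, 8)                        # 2 5 8
1:5                               # 1 2 3 4 5
rep("gene", times = 3)            # "gene" "gene" "gene"
rep(c("a", "b"), each = 2)        # "a" "a" "b" "b"
seq(from = 3, to = 21, by = 3)    # 3 6 9 12 15 18 21

set.seed(1)                       # fix the random number generator state
r <- rnorm(n = 3)                 # three draws from a standard normal distribution

genes <- paste0(rep("gene", times = 3), 1:3)
genes                             # "gene1" "gene2" "gene3"
```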

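Question: how do paste() and paste0() differ, and how are they related?
Both paste() and paste0() concatenate character vectors. paste() lets you set the separator through sep, which defaults to a single space; paste0() always joins with the empty string and has no sep argument. The collapse argument then joins the elements of the resulting vector (already joined by sep) into a single string. In short, paste0(...) is shorthand for paste(..., sep = "").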

Usage:

paste (..., sep = " ", collapse = NULL, recycle0 = FALSE)
paste0(...,            collapse = NULL, recycle0 = FALSE)

Examples:

## When passing a single vector, paste0 and paste work like as.character.
> paste(1:12)
 [1] "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10" "11" "12"
> paste0(1:12)    # same
 [1] "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10" "11" "12"

## If you pass several vectors to paste0, they are concatenated in a
## vectorized way.
> (nth <- paste0(1:12, c("st", "nd", "rd", rep("th", 9))))
 [1] "1st"  "2nd"  "3rd"  "4th"  "5th"  "6th"  "7th"  "8th"  "9th"  "10th"
[11] "11th" "12th"

## paste works the same, but separates each input with a space.
## Notice that the recycling rules make every input as long as the longest input.
> paste(month.abb, "is the", nth, "month of the year.")
 [1] "Jan is the 1st month of the year." 
 [2] "Feb is the 2nd month of the year." 
 [3] "Mar is the 3rd month of the year." 
 [4] "Apr is the 4th month of the year." 
 [5] "May is the 5th month of the year." 
 [6] "Jun is the 6th month of the year." 
 [7] "Jul is the 7th month of the year." 
 [8] "Aug is the 8th month of the year." 
 [9] "Sep is the 9th month of the year." 
[10] "Oct is the 10th month of the year."
[11] "Nov is the 11th month of the year."
[12] "Dec is the 12th month of the year."
> month.abb
 [1] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"
> nth
 [1] "1st"  "2nd"  "3rd"  "4th"  "5th"  "6th"  "7th"  "8th"  "9th"  "10th"
[11] "11th" "12th"
> paste(month.abb, letters)
 [1] "Jan a" "Feb b" "Mar c" "Apr d" "May e" "Jun f" "Jul g" "Aug h" "Sep i"
[10] "Oct j" "Nov k" "Dec l" "Jan m" "Feb n" "Mar o" "Apr p" "May q" "Jun r"
[19] "Jul s" "Aug t" "Sep u" "Oct v" "Nov w" "Dec x" "Jan y" "Feb z"

## You can change the separator by passing a sep argument
## which can be multiple characters.
> paste(month.abb, "is the", nth, "month of the year.", sep = "_*_")
 [1] "Jan_*_is the_*_1st_*_month of the year." 
 [2] "Feb_*_is the_*_2nd_*_month of the year." 
 [3] "Mar_*_is the_*_3rd_*_month of the year." 
 [4] "Apr_*_is the_*_4th_*_month of the year." 
 [5] "May_*_is the_*_5th_*_month of the year." 
 [6] "Jun_*_is the_*_6th_*_month of the year." 
 [7] "Jul_*_is the_*_7th_*_month of the year." 
 [8] "Aug_*_is the_*_8th_*_month of the year." 
 [9] "Sep_*_is the_*_9th_*_month of the year." 
[10] "Oct_*_is the_*_10th_*_month of the year."
[11] "Nov_*_is the_*_11th_*_month of the year."
[12] "Dec_*_is the_*_12th_*_month of the year."

## To collapse the output into a single string, pass a collapse argument.
> paste0(nth, collapse = ", ")
[1] "1st, 2nd, 3rd, 4th, 5th, 6th, 7th, 8th, 9th, 10th, 11th, 12th"

## For inputs of length 1, use the sep argument rather than collapse
> paste("1st", "2nd", "3rd", collapse = ", ")    # probably not what you wanted
[1] "1st 2nd 3rd"
> paste("1st", "2nd", "3rd", sep = ", ")
[1] "1st, 2nd, 3rd"

## You can combine the sep and collapse arguments together.
> paste(month.abb, nth, sep = ": ", collapse = "; ")
[1] "Jan: 1st; Feb: 2nd; Mar: 3rd; Apr: 4th; May: 5th; Jun: 6th; Jul: 7th; Aug: 8th; Sep: 9th; Oct: 10th; Nov: 11th; Dec: 12th"

Order of precedence for type coercion:
logical ----> character; logical ----> numeric ----> character
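
This ordering shows up whenever c() is given mixed types: the result takes the most general type present. A short sketch with arbitrary values:

```r
class(c(TRUE, 2))           # "numeric": TRUE is converted to 1
class(c(TRUE, "a"))         # "character": TRUE is converted to "TRUE"
class(c(1, "a"))            # "character": 1 is converted to "1"
mixed <- c(TRUE, 2, "a")
mixed                       # "TRUE" "2" "a"

# The logical-to-numeric step is what makes counting TRUE values work.
sum(c(TRUE, FALSE, TRUE))   # 2
```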

2.2 Operations on a single vector

(1) Assign it to a variable name

x <- c(1,3,5)

(2) Simple math

log(x)
sqrt(x)

(3) Test a condition to produce a logical vector

x > 3
x == 3

(4) Basic statistics

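unique(x)        # remove duplicates, keeping the first occurrence of each value
duplicated(x)    # logical vector: FALSE for the first occurrence of a value, TRUE for each later repeat
table(x)         # count how many times each value occurs
sort(x)          # sort in ascending order by default; for descending order add decreasing = TRUE, or reverse the ascending result with rev(sort(x))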
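
Because x <- c(1,3,5) has no repeated values, these functions are easier to see on a vector that does. The sketch below uses a different, made-up vector named x2.

```r
x2 <- c(3, 1, 3, 5, 1)

x2 > 3                        # FALSE FALSE FALSE TRUE FALSE
unique(x2)                    # 3 1 5
duplicated(x2)                # FALSE FALSE TRUE FALSE TRUE
x2[!duplicated(x2)]           # 3 1 5, the same result as unique(x2)

counts <- table(x2)           # value 1 occurs twice, 3 twice, 5 once
names(counts)                 # "1" "3" "5"

sort(x2)                      # 1 1 3 3 5
sort(x2, decreasing = TRUE)   # 5 3 3 1 1
rev(sort(x2))                 # 5 3 3 1 1
```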

2.3 Operations on two vectors

(1) Logical comparison, producing a logical vector of the same length

x %in% y    # for each element of x, test whether it occurs in y

(2) Math
(3) Concatenation

paste(x, y, sep=":")

(4) Intersection, union, and set difference

intersect(x,y)   # intersection
union(x,y)       # union
setdiff(x,y)     # elements in x but not in y
setdiff(y,x)     # elements in y but not in x

When two vectors differ in length, the shorter one is recycled to match the length of the longer one before the calculation
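
The two-vector operations in this section, including recycling, can be checked on a small pair of vectors. The vectors a and b below are invented for the example.

```r
a <- c(1, 3, 5, 1)
b <- c(3, 2, 5, 6)

a %in% b                 # FALSE TRUE TRUE FALSE: is each element of a found anywhere in b?
a == b                   # FALSE FALSE TRUE FALSE: compared position by position
a + b                    # 4 5 10 7
paste(a, b, sep = ":")   # "1:3" "3:2" "5:5" "1:6"

intersect(a, b)          # 3 5
union(a, b)              # 1 3 5 2 6
setdiff(a, b)            # 1
setdiff(b, a)            # 2 6

# Recycling: c(10, 20) is reused as 10, 20, 10, 20 to match the length of a.
a + c(10, 20)            # 11 23 15 21
```

Note that %in% and == answer different questions: %in% ignores position, while == compares elements at the same position.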

2.4 Filtering vectors (subsetting)

[]: (1) keeps the values at positions where the index is TRUE and drops those where it is FALSE; (2) selects by position
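
Both ways of indexing can be sketched on a small vector. The vector s and its values are arbitrary.

```r
s <- c(8, 9, 10, 11, 12)

# (1) Logical indexing: keep positions where the condition is TRUE.
s[s > 9]               # 10 11 12
s[s %in% c(9, 13)]     # 9

# (2) Positional indexing.
s[2]                   # 9
s[c(1, 5)]             # 8 12
s[2:4]                 # 9 10 11
s[-1]                  # 9 10 11 12: a negative index drops that position
```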

2.5 How do you modify one or more elements of a vector?

Subset + assignment
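
A minimal sketch of subset plus assignment, using a made-up vector s2. The change only persists because the assignment targets s2 itself.

```r
s2 <- c(8, 9, 10, 11, 12)

s2[4] <- 40        # change one element by position
s2                 # 8 9 10 40 12

s2[s2 > 10] <- 0   # change every element that meets a condition
s2                 # 8 9 10 0 0
```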

2.6 Simple plots from vectors

Omitted.

How do you change the order of the elements?
Just change the order of the indices used to subset the vector
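
For instance, listing the positions in a new order returns the elements in that order. The vector v2 and the scores vector are invented for this sketch.

```r
v2 <- c("A", "B", "C", "D", "E")

v2[c(5, 1, 2, 3, 4)]   # "E" "A" "B" "C" "D": last element moved to the front
v2[5:1]                # "E" "D" "C" "B" "A": reversed

# order() returns the indices that would sort another vector,
# so it can be used to reorder v2 by some associated score.
scores <- c(3, 1, 2, 5, 4)
order(scores)          # 2 3 1 5 4
v2[order(scores)]      # "B" "C" "A" "E" "D"
```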

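Matching and reordering vectors with match(): the vector that sits outside the brackets is also the second argument of match(); below, y is reordered to follow the order of x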

x <- c("A","B","C","D","E")
y <- c("B","D","E","A","C")
match(x,y)    # with x as the template, returns the index in y of each element of x
#[1] 4 1 5 2 3

y[match(x,y)]  
#[1] "A" "B" "C" "D" "E"
x[match(y,x)]
#[1] "B" "D" "E" "A" "C"
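
A typical use of this pattern is aligning one vector to the order of another through shared IDs. The sketch below is hypothetical: the sample IDs and group labels are made up, and group is assumed to be aligned with meta_ids.

```r
samples  <- c("s1", "s2", "s3")             # the order we want
meta_ids <- c("s3", "s1", "s2")             # the order the annotation came in
group    <- c("tumor", "normal", "tumor")   # aligned with meta_ids

idx <- match(samples, meta_ids)             # position of each sample in meta_ids
idx                                         # 2 3 1
group_sorted <- group[idx]                  # group labels reordered to follow samples
group_sorted                                # "normal" "tumor" "tumor"
meta_ids[idx]                               # "s1" "s2" "s3", now identical to samples
```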

posted @ 2021-01-07 00:13 stdforml
