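ggplot2: Elegant Graphics for Data Analysis (Chinese title: 《ggplot2:数据分析与图形艺术》) is mostly introduced online in its second edition, which already has a Chinese translation. Hadley Wickham and his co-authors have since updated the book to a third edition. The relevant links and the main changes are listed below:
- Online version of the book: http://ggplot2-book.org/
- Package documentation: https://ggplot2.tidyverse.org/index.html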
- The chapters on data analysis and modelling have been removed. You can find updated versions of this material in R for Data Science at https://r4ds.had.co.nz.
- The toolbox chapter has been split into multiple chapters.
- New chapter on arranging multiple plots on the page.
- The positioning chapter has been split into facetting and coordinate systems.
- New FAQ chapter that covers some of the most commonly seen problems in the wild.
- The scales and guides chapters have been radically reconfigured into one chapter each on position, colour, and other scales and guides, and then one chapter that focusses on the underlying theory.
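Part two starts from the grammar of graphics. The key idea is stacking layers, then adjusting the details of the visualization and polishing its appearance.
Part three covers data analysis and how to combine it with visualization. This material overlaps with Hadley Wickham's other book, R for Data Science, so it can be skipped.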
    install.packages("ggplot2")

    library('ggplot2')
    data(package = 'ggplot2')  # list the datasets bundled with ggplot2

    Data sets in package ‘ggplot2’:

    diamonds          Prices of over 50,000 round cut diamonds
    economics         US economic time series
    economics_long    US economic time series
    faithfuld         2d density estimate of Old Faithful data
    luv_colours       'colors()' in Luv space
    midwest           Midwest demographics
    mpg               Fuel economy data from 1999 to 2008 for 38 popular models of cars
    msleep            An updated and expanded version of the mammals sleep dataset
    presidential      Terms of 11 presidents from Eisenhower to Obama
    seals             Vector field of seal movements
    txhousing         Housing sales in TX
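The same listing can also be pulled out programmatically rather than read off the screen. The sketch below assumes ggplot2 is installed and relies on `data()` returning an object whose `results` matrix has `Item` and `Title` columns; the variable names `listing` and `datasets` are only illustrative.

```r
library(ggplot2)

# data() returns a "packageIQR" object; its `results` matrix has one row
# per dataset, with columns Package, LibPath, Item and Title.
listing <- data(package = "ggplot2")$results

# Keep only the dataset names and their one-line titles.
datasets <- data.frame(
  name  = listing[, "Item"],
  title = listing[, "Title"],
  stringsAsFactors = FALSE
)

print(datasets$name)
```

Having the names in a data frame makes it easy to search or filter them, for example with `grepl()` on the `title` column.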
    > head(mpg)  # look at the first few rows of the dataset
    # A tibble: 6 x 11
      manufacturer model displ  year   cyl trans    drv     cty   hwy fl    class
      <chr>        <chr> <dbl> <int> <int> <chr>    <chr> <int> <int> <chr> <chr>
    1 audi         a4      1.8  1999     4 auto(l5) f        18    29 p     comp~
    2 audi         a4      1.8  1999     4 manual(~ f        21    29 p     comp~
    3 audi         a4      2    2008     4 manual(~ f        20    31 p     comp~
    4 audi         a4      2    2008     4 auto(av) f        21    30 p     comp~
    5 audi         a4      2.8  1999     6 auto(l5) f        16    26 p     comp~
    6 audi         a4      2.8  1999     6 manual(~ f        18    26 p     c
The full dataset has 234 rows and 11 columns:

    manufacturer  manufacturer name, e.g. audi, jeep
    model         model name, e.g. a4, a6
    displ         engine displacement, in litres
    year          year of manufacture
    cyl           number of cylinders
    trans         type of transmission (manual or automatic)
    drv           type of drive train: f = front-wheel drive, r = rear wheel drive, 4 = 4wd
    cty           city miles per gallon
    hwy           highway miles per gallon
    fl            fuel type
    class         "type" of car, e.g. suv, compact
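The dimensions and column types described above can be checked directly in R. This is a minimal sketch, assuming ggplot2 is installed; `dims`, `col_types` and `drv_counts` are illustrative names.

```r
library(ggplot2)

# Number of rows and columns in mpg.
dims <- dim(mpg)

# First class of each column, e.g. "character", "numeric" or "integer".
col_types <- vapply(mpg, function(col) class(col)[1], character(1))

# How many cars fall into each drive-train category (f, r, 4).
drv_counts <- table(mpg$drv)

print(dims)
print(col_types)
print(drv_counts)
```

Checking the types up front is useful later on, because ggplot2 treats character columns such as `drv` as discrete and numeric columns such as `displ` as continuous.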
The three essential components of a ggplot2 plot
- Data (data)
- Aesthetic mappings (aes)
- Geometric objects (geom)
    library('ggplot2')
    ggplot(mpg, aes(x = displ, y = hwy)) +
      geom_point()

    # ggplot() combines the data and the aes; layers are added with +
    ggplot(mpg,                        # data
           aes(x = displ, y = hwy)) +  # aesthetic mappings (aes): which variables go on the x and y axes
      geom_point()                     # geometric object (geom), here a scatter plot
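The way `+` builds a plot up layer by layer can be sketched as follows. The example assumes ggplot2 is installed; the object names are illustrative, and `geom_smooth()` is used only as a convenient second layer, not something the text above prescribes.

```r
library(ggplot2)

# Data and aesthetic mappings only: a valid plot object with no layers,
# so printing it would draw empty axes.
base <- ggplot(mpg, aes(x = displ, y = hwy))

# `+` returns a new plot object with the layer appended; `base` is unchanged.
scatter <- base + geom_point()

# A second `+` stacks another layer on top of the points.
scatter_smooth <- scatter + geom_smooth(method = "lm", formula = y ~ x)

# Count the layers held by each plot object.
layer_counts <- c(
  base           = length(base$layers),
  scatter        = length(scatter$layers),
  scatter_smooth = length(scatter_smooth$layers)
)
print(layer_counts)
```

Printing `scatter_smooth` at the console draws the points with a fitted line on top. Keeping `base` in a variable makes it easy to try different geoms against the same data and mappings.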
    # x and y are the first two arguments of aes(), so their names can be omitted
    ggplot(mpg, aes(displ, hwy)) +
      geom_point()
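To confirm that the positional form is just shorthand for the named one, the two plots can be compared through the data their point layers would draw. This is a small sketch, assuming ggplot2 is installed; `explicit`, `shorthand` and `same` are illustrative names.

```r
library(ggplot2)

explicit  <- ggplot(mpg, aes(x = displ, y = hwy)) + geom_point()
shorthand <- ggplot(mpg, aes(displ, hwy)) + geom_point()

# layer_data() returns the data a layer will actually draw,
# after the aesthetic mappings have been applied.
same <- isTRUE(all.equal(layer_data(explicit), layer_data(shorthand)))
print(same)
```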
Setting aesthetics with aes (Colour, size, shape and other aesthetic attributes)

    library('ggplot2')
    # colour the points by car class
    ggplot(mpg, aes(displ, hwy, colour = class)) +
      geom_point()

    library('ggplot2')
    # vary the point shape by drive train
    ggplot(mpg, aes(displ, hwy, shape = drv)) +
      geom_point()

    library('ggplot2')
    # scale the point size by number of cylinders
    ggplot(mpg, aes(displ, hwy, size = cyl)) +
      geom_point()
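    # Inside aes(): "blue" is treated as a data value and mapped through the
    # colour scale, so the points are not blue and a legend appears
    ggplot(mpg, aes(displ, hwy)) + geom_point(aes(colour = "blue"))

    # Outside aes(): the colour is set directly, so the points are drawn in blue
    ggplot(mpg, aes(displ, hwy)) + geom_point(colour = "blue")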
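The difference between mapping a value inside `aes()` and setting it outside can be checked without looking at the plot, by inspecting the colours each layer would actually use. This is a minimal sketch, assuming ggplot2 is installed; the variable names are illustrative.

```r
library(ggplot2)

mapped <- ggplot(mpg, aes(displ, hwy)) + geom_point(aes(colour = "blue"))
fixed  <- ggplot(mpg, aes(displ, hwy)) + geom_point(colour = "blue")

# layer_data() exposes the final colour assigned to every point.
mapped_colours <- unique(layer_data(mapped)$colour)
fixed_colours  <- unique(layer_data(fixed)$colour)

print(mapped_colours)  # a colour picked by the default scale, not "blue"
print(fixed_colours)   # "blue"
```

As a rule of thumb, use `aes()` when the appearance should depend on a variable in the data, and a plain argument to the geom when it should be constant.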
