gg_sankey
gg_sankey
PeRl
好像有好久没有更新了,一直想自己用ggplot2
实现一下sankey
图,就着手做了一下最简单的.
一般的sankey图长这样,左边一列,右边一列,中间的条带是左右两个状态之间的转变.
那么,首先我们就需要构建左右两边的bar
,在每个柱的中间标注上所占的比例:
library(ggplot2)
color_list <- c("#f38181", "#fce38a", "#61c0bf", "#95e1d3")
bar_data <- data.frame(
x = c(1, 1, 1, 11, 11,11,11),
type = c("a", "b", "c", "a", "b", "c","d"),
y = c(0.2, 0.3, 0.5, 0.1, 0.5, 0.2, 0.2)
)
text_data_create <- function(bar_data){
x = bar_data$x
text = bar_data$y
y = apply(
matrix(names(table(x)), ncol = 1),
1,
function(x_group){
index = which(x == as.numeric(x_group))
start = cumsum(text[index])
end = c(0, start[1:(length(start)-1)])
return((1-(start + end)/2))
}
)
text_data = data.frame(
x = x,
y = unlist(y),
text = text
)
}
bar_p <- ggplot(data = bar_data) +
geom_bar(position = "fill", stat = "identity", aes(fill = type, x,y), colour = "white", width = 0.8) +
geom_text(data = text_data_create(bar_data), aes(x, y, label = text)) +
scale_fill_manual(values = color_list)
结果如图:
接下去就是中间引流线的构建,简单来说其实就是确定上线和下线,为了美观,我用 \(X^{3}\)给线加上弧度:
river_data_create <- function(start_y_upper, end_y_upper, start_y_lower, end_y_lower){
x = seq((1 + 0.8/2), (11 - 0.8/2), length = 10000)
mean_y_upper = (start_y_upper + end_y_upper)/2
y_upper = (start_y_upper - mean_y_upper)/(4.6^3)*(-x + 6)^3 + mean_y_upper
mean_y_lower = (start_y_lower + end_y_lower)/2
y_lower = (start_y_lower - mean_y_lower)/(4.6^3)*(-x + 6)^3 + mean_y_lower
river_data = data.frame(
x,
y_upper,
y_lower
)
text_data = data.frame(
x = 6,
y = (start_y_upper + end_y_lower) / 2,
text = as.character(start_y_upper - start_y_lower)
)
return(list(line = river_data, text = text_data))
}
这样就完成了计算导流线的点坐标,之后就利用 geom_ribbon
往图层上添加即可.
river_data <- river_data_create(1,0.9,0.9, 0.8)
sankey_p <- bar_p +
geom_ribbon(data = river_data$line, aes(x, ymin = y_lower, ymax = y_upper), fill = color_list[1], colour = "white", alpha = 0.2) +
geom_text(data = river_data$text, aes(x,y,label = text))
river_data <- river_data_create(0.5,0.6, 0.4, 0.5)
sanky_p <- sanky_p +
geom_ribbon(data = river_data$line, aes(x, ymin = y_lower, ymax = y_upper), fill = color_list[3], colour = "white", alpha = 0.2) +
geom_text(data = river_data$text, aes(x,y,label = text))
river_data <- river_data_create(0.2,0.2, 0, 0)
sanky_p <- sanky_p +
geom_ribbon(data = river_data$line, aes(x, ymin = y_lower, ymax = y_upper), fill = color_list[3], colour = "white", alpha = 0.2) +
geom_text(data = river_data$text, aes(x,y,label = text))
最后就是对theme
的调整,把一些没用的线去掉:
虽然现在已经有很多包可以实现 sankey
图的绘画, 比如 riverplot, 但是实现一次还是挺有意思的.
最后,祝您
身体健康.