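When we use the group_by function from the dplyr package, we pass the name(s) of the column(s) to group on, which are categorical in nature. To group by column index instead of by name, use the group_by_at function, which accepts column indices as its argument.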
Consider the following data frame:
    x1 <- sample(LETTERS[1:4], 20, replace = TRUE)
    x2 <- rpois(20, 2)
    df1 <- data.frame(x1, x2)
    df1

Output

       x1 x2
    1   D  4
    2   D  5
    3   B  2
    4   D  3
    5   C  1
    6   C  3
    7   D  1
    8   D  3
    9   B  3
    10  B  2
    11  C  0
    12  C  1
    13  A  2
    14  B  2
    15  B  2
    16  C  4
    17  D  2
    18  A  0
    19  D  0
    20  B  2
Load the dplyr package and group by column index instead of column name:
    library(dplyr)
    df1 %>% group_by_at(1) %>% summarise(n = n())

Output

    `summarise()` ungrouping output (override with `.groups` argument)
    # A tibble: 4 x 2
      x1        n
      <chr> <int>
    1 A         2
    2 B         6
    3 C         5
    4 D         7
Consider a second data frame:

    y1 <- sample(c("Male", "Female"), 20, replace = TRUE)
    y2 <- sample(21:50, 20)
    df2 <- data.frame(y1, y2)
    df2

Output

           y1 y2
    1  Female 29
    2    Male 43
    3  Female 34
    4    Male 49
    5    Male 28
    6  Female 23
    7  Female 27
    8  Female 31
    9  Female 36
    10 Female 41
    11   Male 25
    12 Female 24
    13   Male 30
    14 Female 22
    15 Female 37
    16   Male 42
    17 Female 47
    18   Male 35
    19 Female 32
    20 Female 21
Summarise y1 using the column index instead of the column name:
    df2 %>% group_by_at(1) %>% summarise(n = n())

Output

    `summarise()` ungrouping output (override with `.groups` argument)
    # A tibble: 2 x 2
      y1         n
      <chr>  <int>
    1 Female    13
    2 Male       7
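Since dplyr 1.0.0, group_by_at is marked as superseded, and the same positional grouping can be written with across() inside group_by. The sketch below compares the two forms. The data frame df and its columns g and v are hypothetical values made up for illustration, not the random samples used above, and `.groups = "drop"` is added only to silence the ungrouping message.

```r
library(dplyr)

# Hypothetical illustration data (fixed values, so results are reproducible)
df <- data.frame(
  g = c("A", "B", "A", "C", "B", "A"),
  v = c(1, 2, 3, 4, 5, 6)
)

# Superseded form: group by the first column using its position
by_at <- df %>%
  group_by_at(1) %>%
  summarise(n = n(), .groups = "drop")

# Current form: select the first column by position inside across()
by_across <- df %>%
  group_by(across(1)) %>%
  summarise(n = n(), .groups = "drop")

# Both forms should give the same count for each level of column 1
identical(by_at$n, by_across$n)
```

Passing a vector of positions, for example across(c(1, 2)), groups by several columns at once in the same way.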