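To find the number of unique values in multiple categorical columns based on one categorical column, we can use the following steps:
First of all, create a data frame.
Then use the summarise_each function together with the n_distinct function to count the unique values in each of the other columns for every level of the grouping column.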
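As a quick illustration of the counting step, the sketch below shows what n_distinct returns for a single vector. The vector `values` is a made-up example, not data from this article, and the base R expression is included only as a cross-check.

```r
library(dplyr)

# Hypothetical example vector with repeated entries
values <- c("a", "b", "a", "c", "b")

# n_distinct() counts how many different values occur in the vector
count_dplyr <- n_distinct(values)

# Base R equivalent, shown as a cross-check
count_base <- length(unique(values))

count_dplyr  # 3
count_base   # 3
```

Applied inside a grouped summary, the same count is computed separately for every group.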
Let's create a data frame as shown below:
x <- sample(c("First", "Second", "Third", "Fourth", "Fifth",
              "Sixth", "Seventh", "Eighth", "Nineth", "Tenth"),
            25, replace = TRUE)
C1 <- sample(LETTERS[1:4], 25, replace = TRUE)
C2 <- sample(letters[1:4], 25, replace = TRUE)
df <- data.frame(x, C1, C2)
df
On executing, the above script generates the following output (this output will vary on your system due to randomization):
         x C1 C2
1  Seventh  B  a
2    Third  C  c
3   Nineth  A  a
4    Third  D  c
5  Seventh  D  d
6   Fourth  A  c
7  Seventh  B  a
8    Third  D  a
9  Seventh  D  c
10   First  A  a
11  Eighth  D  d
12   Tenth  C  b
13   Fifth  A  c
14  Second  A  c
15  Fourth  B  d
16  Nineth  C  b
17   Fifth  D  a
18   First  A  a
19   Tenth  B  a
20  Nineth  A  b
21   Third  B  b
22   Tenth  A  a
23   Fifth  A  a
24   Sixth  D  b
25   First  A  c
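Because sample draws at random, the rows differ from run to run. If a repeatable data frame is wanted, one option is to fix the random seed before sampling. The sketch below wraps the same construction in a helper; the name `make_df` and the seed value 123 are assumptions chosen only for illustration.

```r
# Hypothetical helper: rebuilds the example data frame from a fixed seed
make_df <- function(seed = 123) {
  set.seed(seed)  # arbitrary seed; any fixed integer makes the draws repeatable
  x <- sample(c("First", "Second", "Third", "Fourth", "Fifth",
                "Sixth", "Seventh", "Eighth", "Nineth", "Tenth"),
              25, replace = TRUE)
  C1 <- sample(LETTERS[1:4], 25, replace = TRUE)
  C2 <- sample(letters[1:4], 25, replace = TRUE)
  data.frame(x, C1, C2)
}

df1 <- make_df()
df2 <- make_df()

# Both calls start from the same seed, so they yield the same rows
same_rows <- identical(df1, df2)
same_rows
```

The actual rows still depend on the R version and seed used, so they will not necessarily match the output printed above.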
Use the n_distinct function and the summarise_each function of the dplyr package to find the number of unique values in C1 and C2 for each value of x:
x <- sample(c("First", "Second", "Third", "Fourth", "Fifth",
              "Sixth", "Seventh", "Eighth", "Nineth", "Tenth"),
            25, replace = TRUE)
C1 <- sample(LETTERS[1:4], 25, replace = TRUE)
C2 <- sample(letters[1:4], 25, replace = TRUE)
df <- data.frame(x, C1, C2)
library(dplyr)
df %>% group_by(x) %>% summarise_each(funs(n_distinct(.)))
# A tibble: 10 x 3
   x          C1    C2
   <chr>   <int> <int>
 1 Eighth      1     1
 2 Fifth       2     2
 3 First       1     2
 4 Fourth      2     2
 5 Nineth      2     2
 6 Second      1     1
 7 Seventh     2     3
 8 Sixth       1     1
 9 Tenth       3     2
10 Third       3     3
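Recent dplyr releases mark both summarise_each and funs as deprecated, so the call above may emit warnings. A minimal sketch of the same grouped count using summarise with across is given below, assuming dplyr 1.0.0 or later is installed. The small data frame `small_df` is a made-up example, kept deterministic so the counts can be checked by hand.

```r
library(dplyr)

# Hypothetical, hand-written data so the result is easy to verify
small_df <- data.frame(
  x  = c("First", "First", "First", "Second", "Second", "Third"),
  C1 = c("A", "A", "B", "C", "C", "D"),
  C2 = c("a", "b", "c", "a", "a", "b")
)

# Group by x, then count the distinct values of C1 and C2 within each group
result <- small_df %>%
  group_by(x) %>%
  summarise(across(c(C1, C2), n_distinct))

result
# First  has C1 in {A, B} and C2 in {a, b, c}  -> 2 and 3
# Second has C1 in {C}    and C2 in {a}        -> 1 and 1
# Third  has C1 in {D}    and C2 in {b}        -> 1 and 1
```

Listing the columns explicitly in across keeps the summary limited to C1 and C2 even if further columns are added to the data frame later.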