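In data analysis, we often need to know what percentage of its group's total each value represents. This shows which values account for a large share of a group and which account for a small one. The percentages can also be plotted as a pie chart, which gives readers a clearer view of the data. With the mutate function from the dplyr package, adding a new column that holds each value's percentage of its group total is straightforward, as the examples here show.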
> Group<-rep(1:2,each=5)
> Frequency<-sample(1:100,10)
> df1<-data.frame(Group,Frequency)
> df1
Output
   Group Frequency
1      1        67
2      1        58
3      1        54
4      1        13
5      1        23
6      2        91
7      2         3
8      2        95
9      2        38
10     2        48
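Because `sample()` draws at random, running the code above will generally give values different from the ones printed. A minimal sketch of one way to make the draw repeatable, assuming a fixed seed suits your purpose (the seed value 123 is an arbitrary choice for illustration and is not part of the original example):

```r
# Fix the random number generator state so sample() is repeatable.
# The seed 123 is arbitrary and only used for illustration.
set.seed(123)
Group <- rep(1:2, each = 5)
Frequency <- sample(1:100, 10)
df1 <- data.frame(Group, Frequency)

# Re-seeding and drawing again reproduces the same values.
set.seed(123)
same_draw <- identical(Frequency, sample(1:100, 10))
same_draw
```

With a seed in place, the percentages computed later are the same on every run, which makes the output easier to check.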
> library(dplyr)
Finding each value's percentage of its group total:
> df1%>%group_by(Group)%>%mutate(Percentage=paste0(round(Frequency/sum(Frequency)*100,2),"%"))
Output
# A tibble: 10 x 3
# Groups:   Group [2]
   Group Frequency Percentage
   <int>     <int> <chr>
 1     1        67 31.16%
 2     1        58 26.98%
 3     1        54 25.12%
 4     1        13 6.05%
 5     1        23 10.7%
 6     2        91 33.09%
 7     2         3 1.09%
 8     2        95 34.55%
 9     2        38 13.82%
10     2        48 17.45%
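Note that `paste0()` turns Percentage into a character column (`<chr>`), so it can no longer be sorted or summed numerically, and trailing zeros are dropped (10.7% rather than 10.70%). A sketch of one possible alternative, assuming you want a numeric column plus a separate display label; the column name `Label` and the use of `sprintf()` are illustrative choices, not part of the original example:

```r
library(dplyr)

# Data copied from the first example above.
df1 <- data.frame(
  Group = rep(1:2, each = 5),
  Frequency = c(67, 58, 54, 13, 23, 91, 3, 95, 38, 48)
)

result <- df1 %>%
  group_by(Group) %>%
  # Keep the share of the group total as an unrounded number.
  mutate(Percentage = Frequency / sum(Frequency) * 100) %>%
  # Build a display label with exactly two decimals (hypothetical column name).
  mutate(Label = sprintf("%.2f%%", Percentage)) %>%
  ungroup()

result
```

Keeping the numeric column unrounded means the percentages in each group still add up to 100, while the label column is used only for display.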
> Gender<-rep(c("Male","Female"),each=5)
> Salary<-sample(25000:50000,10)
> df2<-data.frame(Gender,Salary)
> df2
Output
   Gender Salary
1    Male  41734
2    Male  39035
3    Male  36161
4    Male  33437
5    Male  45123
6  Female  44492
7  Female  48456
8  Female  31569
9  Female  35110
10 Female  43630
> df2%>%group_by(Gender)%>%mutate(Percentage=paste0(round(Salary/sum(Salary)*100,2),"%"))
Output
# A tibble: 10 x 3
# Groups:   Gender [2]
   Gender Salary Percentage
   <fct>   <int> <chr>
 1 Male    41734 21.35%
 2 Male    39035 19.97%
 3 Male    36161 18.5%
 4 Male    33437 17.1%
 5 Male    45123 23.08%
 6 Female  44492 21.89%
 7 Female  48456 23.84%
 8 Female  31569 15.53%
 9 Female  35110 17.27%
10 Female  43630 21.47%
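Group percentages like these can be drawn as a pie chart. A rough sketch using the base R `pie()` function on the Male salaries only; the chart title and the label format are illustrative assumptions:

```r
# Male salaries copied from the example above.
Salary <- c(41734, 39035, 36161, 33437, 45123)

# Share of each salary in the group total, as a number.
Share <- Salary / sum(Salary) * 100

# Label each slice with its percentage, formatted to two decimals.
slice_labels <- sprintf("%.2f%%", Share)

# pie() is part of base R graphics; the title is an illustrative choice.
pie(Share, labels = slice_labels, main = "Male: share of total salary")
```

The same call could be repeated for the Female salaries to compare the two groups side by side.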
> Grade<-rep(c("A","B","C","D","E"),each=2)
> Number_of_Years_in_Job<-sample(1:5,10,replace=TRUE)
> df3<-data.frame(Grade,Number_of_Years_in_Job)
> df3
Output
   Grade Number_of_Years_in_Job
1      A                      4
2      A                      5
3      B                      4
4      B                      4
5      C                      1
6      C                      4
7      D                      1
8      D                      1
9      E                      3
10     E                      1
> df3%>%group_by(Grade)%>%mutate(Percentage=paste0(round(Number_of_Years_in_Job/sum(Number_of_Years_in_Job)*100,2),"%"))
Output
# A tibble: 10 x 3
# Groups:   Grade [5]
   Grade Number_of_Years_in_Job Percentage
   <fct>                  <int> <chr>
 1 A                          4 44.44%
 2 A                          5 55.56%
 3 B                          4 50%
 4 B                          4 50%
 5 C                          1 20%
 6 C                          4 80%
 7 D                          1 50%
 8 D                          1 50%
 9 E                          3 75%
10 E                          1 25%
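For comparison, the same per-group percentage can be computed without dplyr. A minimal sketch using the base R function `ave()`, offered as an alternative and not as the method the examples above rely on; the intermediate variable `share` is a hypothetical name:

```r
# Data copied from the third example above.
df3 <- data.frame(
  Grade = rep(c("A", "B", "C", "D", "E"), each = 2),
  Number_of_Years_in_Job = c(4, 5, 4, 4, 1, 4, 1, 1, 3, 1)
)

# ave() applies FUN within each Grade and returns a vector
# aligned with the original rows.
share <- ave(df3$Number_of_Years_in_Job, df3$Grade,
             FUN = function(v) v / sum(v) * 100)

# Same formatting as the dplyr version: round, then append "%".
df3$Percentage <- paste0(round(share, 2), "%")
df3
```

This produces the same Percentage strings as the grouped `mutate()` call, but the result is a plain data frame instead of a grouped tibble.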