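Finding the total of an integer column based on two different character columns amounts to building a two-way table, laid out like a contingency table, whose cells hold sums instead of counts. The with and tapply functions can do this. For example, if a data frame df contains two categorical columns named Gender and Ethnicity and an integer column named Package, the table can be created as:
with(df, tapply(Package, list(Gender, Ethnicity), sum))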
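To make the pattern concrete, here is a minimal sketch on a tiny data frame. Only the column names Gender, Ethnicity, and Package come from the description above; the values and the category labels "A" and "B" are invented for illustration.

```r
# Hypothetical data, invented purely for illustration
df <- data.frame(
  Gender    = c("Male", "Female", "Male", "Female", "Male"),
  Ethnicity = c("A", "A", "B", "B", "A"),
  Package   = c(10L, 20L, 30L, 40L, 50L)
)

# Rows come from the first grouping vector (Gender),
# columns from the second (Ethnicity); each cell is a sum of Package
totals <- with(df, tapply(Package, list(Gender, Ethnicity), sum))
totals
```

The result is an ordinary matrix, so a single cell can be read with totals["Male", "A"].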
Consider the following data frame:
set.seed(777)
Class <- sample(c("First", "Second", "Third"), 20, replace = TRUE)
Group <- sample(c("GP1", "GP2", "GP3", "GP4"), 20, replace = TRUE)
Rate <- sample(0:10, 20, replace = TRUE)
df1 <- data.frame(Class, Group, Rate)
df1
Output
    Class Group Rate
1   First   GP1    7
2  Second   GP2    1
3  Second   GP4    1
4  Second   GP4    0
5   Third   GP2   10
6  Second   GP2    8
7   First   GP1    7
8   First   GP4    4
9  Second   GP1    4
10  Third   GP3    8
11 Second   GP2    8
12  First   GP2    4
13  Third   GP2    6
14  Third   GP4    4
15  Third   GP4    5
16 Second   GP1    2
17 Second   GP1    9
18 Second   GP3    2
19 Second   GP3    1
20  Third   GP4   10
Checking the structure confirms that Class and Group are character columns and Rate is an integer column:
str(df1)
'data.frame': 20 obs. of 3 variables:
 $ Class: chr "First" "Second" "Second" "Second" ...
 $ Group: chr "GP1" "GP2" "GP4" "GP4" ...
 $ Rate : int 7 1 1 0 10 8 7 4 4 8 ...
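Finding the sum of Rate by Class and Group:
with(df1, tapply(Rate, list(Class, Group), sum))
Output
       GP1 GP2 GP3 GP4
First   14   4  NA   4
Second  15  17   3   1
Third   NA  16   8  19
The NA cells mark combinations that do not occur in the data: no row has Class First with Group GP3, and none has Class Third with Group GP1.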
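If zeros are preferred to NA for combinations that never occur, two options are sketched below. The first relies on the default argument of tapply, which as far as I know requires R 3.4.0 or later; the second swaps in xtabs, a different base R function that sums the left-hand side of its formula. The data frame is re-entered by hand from the printed output above so that the result does not depend on the random number generator.

```r
# df1 typed in from the printed output, not regenerated with sample()
df1 <- data.frame(
  Class = c("First", "Second", "Second", "Second", "Third", "Second", "First",
            "First", "Second", "Third", "Second", "First", "Third", "Third",
            "Third", "Second", "Second", "Second", "Second", "Third"),
  Group = c("GP1", "GP2", "GP4", "GP4", "GP2", "GP2", "GP1", "GP4", "GP1",
            "GP3", "GP2", "GP2", "GP2", "GP4", "GP4", "GP1", "GP1", "GP3",
            "GP3", "GP4"),
  Rate  = c(7L, 1L, 1L, 0L, 10L, 8L, 7L, 4L, 4L, 8L,
            8L, 4L, 6L, 4L, 5L, 2L, 9L, 2L, 1L, 10L)
)

# Option 1: fill cells that have no rows with 0 instead of NA
sums_zero <- with(df1, tapply(Rate, list(Class, Group), sum, default = 0))

# Option 2: xtabs sums Rate within each Class/Group cell; empty cells are 0
sums_xtabs <- xtabs(Rate ~ Class + Group, data = df1)

sums_zero
sums_xtabs
```

Note that a zero produced this way is indistinguishable from a genuine total of zero, so keeping NA can be the safer choice when that difference matters.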
Let us look at another example:
Gender <- sample(c("Male", "Female"), 20, replace = TRUE)
Centering <- sample(c("Yes", "No"), 20, replace = TRUE)
Percentage <- sample(1:100, 20)
df2 <- data.frame(Gender, Centering, Percentage)
df2
Output
   Gender Centering Percentage
1    Male        No         28
2    Male        No         89
3  Female       Yes         38
4    Male        No         78
5    Male       Yes         19
6  Female        No         46
7  Female       Yes         94
8    Male        No          4
9    Male       Yes         92
10   Male        No         90
11   Male       Yes         66
12 Female        No         57
13 Female        No         74
14 Female        No         48
15 Female       Yes         20
16   Male       Yes         51
17   Male        No         82
18   Male        No          7
19   Male        No         53
20   Male        No         55
Finding the total of Percentage by Gender and Centering:
with(df2, tapply(Percentage, list(Gender, Centering), sum))
Output
        No Yes
Female 225 152
Male   486 228
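When the totals are needed as a data frame with one row per combination rather than as a matrix, aggregate is one alternative. This is a sketch of a different base R function, not the approach used above, and df2 is re-entered by hand from the printed output because the original code sets no seed for this example.

```r
# df2 typed in from the printed output above
df2 <- data.frame(
  Gender = c("Male", "Male", "Female", "Male", "Male", "Female", "Female",
             "Male", "Male", "Male", "Male", "Female", "Female", "Female",
             "Female", "Male", "Male", "Male", "Male", "Male"),
  Centering = c("No", "No", "Yes", "No", "Yes", "No", "Yes", "No", "Yes", "No",
                "Yes", "No", "No", "No", "Yes", "Yes", "No", "No", "No", "No"),
  Percentage = c(28L, 89L, 38L, 78L, 19L, 46L, 94L, 4L, 92L, 90L,
                 66L, 57L, 74L, 48L, 20L, 51L, 82L, 7L, 53L, 55L)
)

# One row per Gender/Centering combination, with the summed Percentage
long_totals <- aggregate(Percentage ~ Gender + Centering, data = df2, FUN = sum)
long_totals
```

Unlike tapply, this form lists only the combinations that are present in the data, so no NA cells appear.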