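A data frame is usually split so that its parts can be compared with one another. The split is driven by a condition, and that condition can be based on the values stored in the rows of a column. For example, if we have a data frame df in which one column holds categorical data, we can split it by category with the subset function, as the following examples show.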
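Consider the following data frame (Ratings is drawn with sample(), so the values will differ from run to run):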
> Country <- rep(c("India", "China", "Russia", "Sudan"), 5)
> Ratings <- sample(1:5, 20, replace = TRUE)
> df1 <- data.frame(Country, Ratings)
> df1
Output
   Country Ratings
1    India       1
2    China       2
3   Russia       5
4    Sudan       3
5    India       5
6    China       5
7   Russia       5
8    Sudan       5
9    India       2
10   China       1
11  Russia       5
12   Sudan       4
13   India       3
14   China       1
15  Russia       1
16   Sudan       2
17   India       3
18   China       4
19  Russia       5
20   Sudan       2
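Because Ratings comes from sample(), the numbers printed above cannot be reproduced exactly. A minimal sketch, assuming repeatable output is wanted, is to call set.seed() before sampling. The seed value 42 and the names df1_seeded and Ratings_again are arbitrary choices for illustration and are not part of the original example.

```r
# Fix the random number generator state so sample() is repeatable.
# The seed 42 is an arbitrary, hypothetical choice.
set.seed(42)
Country <- rep(c("India", "China", "Russia", "Sudan"), 5)
Ratings <- sample(1:5, 20, replace = TRUE)
df1_seeded <- data.frame(Country, Ratings)

# Re-seeding and drawing again yields the same ratings.
set.seed(42)
Ratings_again <- sample(1:5, 20, replace = TRUE)
identical(Ratings, Ratings_again)
# [1] TRUE
```

The exact ratings produced by a given seed can still depend on the R version, so the seed only guarantees repeatability within one installation.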
Splitting df1 into groups by country (India and China together, then Russia, then Sudan):
> C1 <- subset(df1, Country %in% c("India", "China"))
> C1
Output
   Country Ratings
1    India       1
2    China       2
5    India       5
6    China       5
9    India       2
10   China       1
13   India       3
14   China       1
17   India       3
18   China       4
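The command above uses %in% rather than == to match two countries at once. The difference matters: == recycles the right-hand vector and compares element by element, while %in% tests each value for membership in the set. The sketch below illustrates this on a small made-up vector x (hypothetical, not taken from df1).

```r
# A small hypothetical vector, not taken from df1.
x <- c("China", "India", "India", "China")

# == recycles c("India", "China") and compares position by position.
eq <- x == c("India", "China")
eq
# [1] FALSE FALSE  TRUE  TRUE

# %in% asks, for each element of x, whether it occurs anywhere in the set.
isin <- x %in% c("India", "China")
isin
# [1] TRUE TRUE TRUE TRUE
```

For a single category, as in the Russia and Sudan subsets, Country == "Russia" and Country %in% c("Russia") select the same rows.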
> C2 <- subset(df1, Country %in% c("Russia"))
> C2
Output
   Country Ratings
3   Russia       5
7   Russia       5
11  Russia       5
15  Russia       1
19  Russia       5
> C3 <- subset(df1, Country %in% c("Sudan"))
> C3
Output
   Country Ratings
4    Sudan       3
8    Sudan       5
12   Sudan       4
16   Sudan       2
20   Sudan       2
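When every category should become its own piece, base R's split() function is an alternative to writing one subset() call per group. This is a different technique from the one used in this article, shown only as a sketch; note that it gives India and China separate data frames instead of grouping them. The data frame is rebuilt here with the ratings printed earlier so that the result is deterministic, and the name by_country is an illustrative choice.

```r
# Rebuild df1 with the ratings printed earlier, so the result is deterministic.
Country <- rep(c("India", "China", "Russia", "Sudan"), 5)
Ratings <- c(1, 2, 5, 3, 5, 5, 5, 5, 2, 1, 5, 4, 3, 1, 1, 2, 3, 4, 5, 2)
df1 <- data.frame(Country, Ratings)

# split() returns a named list with one data frame per distinct Country.
by_country <- split(df1, df1$Country)
names(by_country)

# Each piece is an ordinary data frame.
by_country$Russia

# Compare the groups, for example by their mean rating.
mean_rating <- sapply(by_country, function(part) mean(part$Ratings))
mean_rating
```

With these ratings the means come out as 2.6 for China, 2.8 for India, 4.2 for Russia, and 3.2 for Sudan.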
Consider a second data frame:
> Season <- sample(c("Summer", "Spring", "Winter"), 20, replace = TRUE)
> Rain <- sample(c("Yes", "No"), 20, replace = TRUE)
> df2 <- data.frame(Season, Rain)
> df2
Output
   Season Rain
1  Spring  Yes
2  Winter  Yes
3  Spring  Yes
4  Spring   No
5  Winter   No
6  Summer   No
7  Summer   No
8  Winter  Yes
9  Winter  Yes
10 Winter  Yes
11 Summer  Yes
12 Summer  Yes
13 Summer  Yes
14 Summer   No
15 Winter   No
16 Spring   No
17 Summer  Yes
18 Spring  Yes
19 Winter   No
20 Winter   No
Splitting df2 into separate data frames for winter, summer, and spring:
> S1 <- subset(df2, Season %in% c("Winter"))
> S1
Output
   Season Rain
2  Winter  Yes
5  Winter   No
8  Winter  Yes
9  Winter  Yes
10 Winter  Yes
15 Winter   No
19 Winter   No
20 Winter   No
> S2 <- subset(df2, Season %in% c("Summer"))
> S2
Output
   Season Rain
6  Summer   No
7  Summer   No
11 Summer  Yes
12 Summer  Yes
13 Summer  Yes
14 Summer   No
17 Summer  Yes
> S3 <- subset(df2, Season %in% c("Spring"))
> S3
Output
   Season Rain
1  Spring  Yes
3  Spring  Yes
4  Spring   No
16 Spring   No
18 Spring  Yes
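Since the purpose of splitting is to compare the parts, the three subset() calls can also be generated in a loop and summarised side by side. The following is a minimal sketch, assuming df2 holds exactly the values printed earlier; the names seasons, parts, rows_per_season, and rain_share are illustrative and not part of the original example.

```r
# Rebuild df2 with the values printed earlier, so the result is deterministic.
Season <- c("Spring", "Winter", "Spring", "Spring", "Winter", "Summer", "Summer",
            "Winter", "Winter", "Winter", "Summer", "Summer", "Summer", "Summer",
            "Winter", "Spring", "Summer", "Spring", "Winter", "Winter")
Rain <- c("Yes", "Yes", "Yes", "No", "No", "No", "No", "Yes", "Yes", "Yes",
          "Yes", "Yes", "Yes", "No", "No", "No", "Yes", "Yes", "No", "No")
df2 <- data.frame(Season, Rain)

# One subset() call per season, collected in a named list.
seasons <- c("Winter", "Summer", "Spring")
parts <- lapply(seasons, function(s) subset(df2, Season == s))
names(parts) <- seasons

# Compare the parts: number of rows and share of rainy observations.
rows_per_season <- sapply(parts, nrow)
rain_share <- sapply(parts, function(part) mean(part$Rain == "Yes"))
rows_per_season
rain_share
```

Here parts$Winter, parts$Summer, and parts$Spring hold the same rows as S1, S2, and S3 (8, 7, and 5 rows respectively).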