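To extract, for each group in an R data.table object, the row that holds the maximum of another column, we can use the which.max function together with a grouping column. That is, if we have a categorical (grouping) column and a numeric column, the group-wise maximum is the largest value of the numeric column within each level of the grouping column, and we can extract the matching rows based on these two columns. The examples below show how this works.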
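As a quick illustration of the building block used here, note that which.max returns the position of the largest element, not the value itself. The vector `v` below is made up for illustration and is not part of the original examples.

```r
# Illustrative vector (hypothetical values, not from the examples in this article)
v <- c(3, 8, 5)

# which.max() gives the index of the largest element
pos <- which.max(v)

# Indexing with that position recovers the maximum value itself
max_value <- v[pos]

print(pos)
print(max_value)
```

Because the result is a position, it can be used to pick out a whole row, which is what the grouped examples rely on.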
Load the data.table package and create a data.table object:
> library(data.table)
> x1<-sample(c("A","B","C"),20,replace=TRUE)
> x2<-rpois(20,5)
> x3<-rpois(20,2)
> DT1<-data.table(x1,x2,x3)
> DT1
Output
    x1 x2 x3
 1:  B  3  2
 2:  C  6  0
 3:  B  4  1
 4:  C  8  3
 5:  A  3  6
 6:  B  5  3
 7:  C  4  1
 8:  B  4  0
 9:  B  5  2
10:  C  6  1
11:  A  4  1
12:  C  5  0
13:  B  2  3
14:  A  5  0
15:  C  8  6
16:  A  5  2
17:  B  4  2
18:  A  3  3
19:  C 10  2
20:  C  3  2
Extract the group-wise maximum rows from DT1:
> DT1[,.SD[which.max(x2)],by=x1]
Output
   x1 x2 x3
1:  B  5  3
2:  C 10  2
3:  A  5  0
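Note that which.max returns only the first position at which the maximum occurs. In DT1, groups A and B each contain two rows with x2 equal to 5, and only the first of them appears in the result. If all tied rows are wanted, one option is to compare against max() instead. The following is a minimal sketch on a small hand-made table; the names `DT`, `g`, `v`, and `w` and their values are assumptions chosen for illustration.

```r
library(data.table)

# Small illustrative table (hypothetical data) with a tie in group "A"
DT <- data.table(g = c("A", "A", "A", "B", "B"),
                 v = c(5, 3, 5, 2, 7),
                 w = c(10, 20, 30, 40, 50))

# which.max() keeps only the first row that attains the group maximum
first_max <- DT[, .SD[which.max(v)], by = g]

# Comparing with max() keeps every row tied at the group maximum
all_max <- DT[, .SD[v == max(v)], by = g]

print(first_max)
print(all_max)
```

Here `first_max` has one row per group, while `all_max` has two rows for group A because both of its rows with v = 5 are retained.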
Create another data.table object:
> y1<-sample(c("Male","Female"),20,replace=TRUE)
> y2<-rnorm(20)
> y3<-rnorm(20)
> DT2<-data.table(y1,y2,y3)
> DT2
Output
        y1          y2         y3
 1: Female  0.09094138 -0.4011408
 2:   Male -0.51845798  0.9946824
 3:   Male  0.73189425  0.2013690
 4:   Male  0.58616939  0.6290771
 5:   Male  2.53714401 -0.9434801
 6: Female -0.98726606 -0.9564542
 7:   Male  1.28230337  0.2018570
 8: Female -0.60125038  1.0522084
 9: Female  1.06912678 -0.3825166
10: Female  0.99567103 -0.1200035
11:   Male  0.66163046 -0.3596741
12:   Male -0.62465260  2.2215039
13:   Male  2.09315525  1.4402211
14:   Male -1.18256083  0.3528192
15:   Male -0.36751044  0.4837127
16:   Male -0.23044236 -0.8761699
17:   Male -0.84228258 -0.5922790
18: Female  0.80129337  1.5403199
19:   Male  0.76037129 -0.4590728
20: Female  0.17482961  0.3189389
Extract the group-wise maximum rows from DT2:
> DT2[,.SD[which.max(y3)],by=y1]
Output
       y1         y2       y3
1: Female  0.8012934 1.540320
2:   Male -0.6246526 2.221504
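The .SD form used in these examples is easy to read, but it may become slow when a table has many groups. A commonly used alternative is to collect the row numbers with .I and then subset the table once. The sketch below assumes a small hand-made table; the names `DT`, `g`, `v`, and `w` are illustrative and do not come from the examples above.

```r
library(data.table)

# Small illustrative table (hypothetical data)
DT <- data.table(g = c("A", "B", "A", "B", "A"),
                 v = c(1, 9, 4, 2, 3),
                 w = c(11, 12, 13, 14, 15))

# Approach from the examples: subset .SD inside each group
via_sd <- DT[, .SD[which.max(v)], by = g]

# Alternative: .I holds the row numbers in DT, so this collects
# one row number per group (stored in column V1) ...
idx <- DT[, .I[which.max(v)], by = g]$V1

# ... and then subsets the original table a single time
via_i <- DT[idx]

print(via_sd)
print(via_i)
```

Both calls select the same rows here. One difference to keep in mind is that the .I form keeps the original column order of the table, whereas the .SD form always places the grouping column first.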