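Data frames are usually merged on columns, because column names are the natural identifiers in a data set, but two data frames can also be merged on their row names. Compared with merging on columns, merging on row names tends to leave a messier result, for example NA values wherever a row name exists in only one of the data frames. This is done with the merge function by passing 'row.names' to its by argument.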
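Before turning to random data, a minimal sketch with fixed values may make the mechanism easier to see. The data frames a and b and their columns are hypothetical and used only for illustration; merge and its by argument are the base R function described above.

```r
# Two small hypothetical data frames with default row names "1", "2", "3".
a <- data.frame(x = c(10, 20, 30))
b <- data.frame(y = c("p", "q", "r"))

# by = "row.names" matches rows on their row names; the key comes back
# as a leading column called Row.names.
m <- merge(a, b, by = "row.names")
print(m)

# by = 0 is documented in ?merge as another way to request the row names.
m0 <- merge(a, b, by = 0)
print(names(m0))
```

Both calls should give a three-row result with the columns Row.names, x and y.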
Consider the following data frames:
df1 <- data.frame(x1 = rnorm(10), x2 = rpois(10, 5))
df1
Output
            x1 x2
1  -0.47030794  0
2   0.86338465  8
3  -2.05770293  5
4   1.95479596  9
5  -0.06913421  5
6   0.64897263  5
7  -1.79859382  8
8   0.31247699  6
9  -0.36808285  7
10 -0.79578938  3
df2 <- data.frame(x1 = rnorm(10, 1.5))
df2
Output
            x1
1  -0.01317184
2   0.01687606
3  -0.71685289
4   1.75961121
5   2.49024285
6   2.92183374
7   0.10276216
8   1.39703966
9   1.41001339
10  0.98221783
df3 <- data.frame(x1 = rnorm(5, 0.5), x2 = runif(5, 2, 3), x3 = runif(5, 2, 5))
df3
Output
          x1       x2       x3
1 -0.4926244 2.697937 3.961118
2  1.9863263 2.861944 3.659564
3 -0.2266537 2.383499 3.208741
4  0.5966503 2.511485 3.230795
5  0.7148641 2.362419 4.582841
Merging df1 with df2, df1 with df3, and df2 with df3:
# all = TRUE keeps rows from both data frames even when a row name has no match
df_1_2 <- merge(df1, df2, by = 'row.names', all = TRUE)
df_1_2
Output
   Row.names        x1.x x2        x1.y
1          1 -0.47030794  0 -0.01317184
2         10 -0.79578938  3  0.98221783
3          2  0.86338465  8  0.01687606
4          3 -2.05770293  5 -0.71685289
5          4  1.95479596  9  1.75961121
6          5 -0.06913421  5  2.49024285
7          6  0.64897263  5  2.92183374
8          7 -1.79859382  8  0.10276216
9          8  0.31247699  6  1.39703966
10         9 -0.36808285  7  1.41001339
df_1_3 <- merge(df1, df3, by = 'row.names', all = TRUE)
df_1_3
Output
   Row.names        x1.x x2.x       x1.y     x2.y       x3
1          1 -0.47030794    0 -0.4926244 2.697937 3.961118
2         10 -0.79578938    3         NA       NA       NA
3          2  0.86338465    8  1.9863263 2.861944 3.659564
4          3 -2.05770293    5 -0.2266537 2.383499 3.208741
5          4  1.95479596    9  0.5966503 2.511485 3.230795
6          5 -0.06913421    5  0.7148641 2.362419 4.582841
7          6  0.64897263    5         NA       NA       NA
8          7 -1.79859382    8         NA       NA       NA
9          8  0.31247699    6         NA       NA       NA
10         9 -0.36808285    7         NA       NA       NA
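The NA entries appear because all = TRUE keeps row names that exist in only one data frame, and the .x and .y endings are the default suffixes merge adds to column names shared by both inputs. The sketch below shows one possible variation, assuming that only matching row names are wanted: it leaves all at its default of FALSE and sets the suffixes argument explicitly. The names long_df, short_df and the seed are hypothetical and serve only to make the example reproducible.

```r
set.seed(42)  # hypothetical seed so the sketch is reproducible

# A 10-row and a 5-row data frame that share the column names x1 and x2.
long_df  <- data.frame(x1 = rnorm(10), x2 = rpois(10, 5))
short_df <- data.frame(x1 = rnorm(5, 0.5), x2 = runif(5, 2, 3))

# With the default all = FALSE only row names present in both inputs survive,
# so no NA padding is introduced. suffixes replaces the default .x / .y.
inner <- merge(long_df, short_df, by = "row.names",
               suffixes = c(".long", ".short"))
print(names(inner))
print(nrow(inner))
```

Whether to drop or keep the unmatched rows depends on the analysis; the all = TRUE form used in this article keeps every row and marks the gaps with NA.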
df_2_3 <- merge(df2, df3, by = 'row.names', all = TRUE)
df_2_3
Output
   Row.names        x1.x       x1.y       x2       x3
1          1 -0.01317184 -0.4926244 2.697937 3.961118
2         10  0.98221783         NA       NA       NA
3          2  0.01687606  1.9863263 2.861944 3.659564
4          3 -0.71685289 -0.2266537 2.383499 3.208741
5          4  1.75961121  0.5966503 2.511485 3.230795
6          5  2.49024285  0.7148641 2.362419 4.582841
7          6  2.92183374         NA       NA       NA
8          7  0.10276216         NA       NA       NA
9          8  1.39703966         NA       NA       NA
10         9  1.41001339         NA       NA       NA
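In each merged result the Row.names column runs 1, 10, 2, 3, and so on, because the row names are compared as text rather than as numbers. If the original numeric order matters, one possible clean-up is sketched below, assuming the row names are the default integer-like labels. The data frames left_df and right_df, the helper variable key and the seed are hypothetical.

```r
set.seed(1)  # hypothetical seed so the sketch is reproducible

left_df  <- data.frame(x1 = rnorm(10))
right_df <- data.frame(x2 = runif(5))

merged <- merge(left_df, right_df, by = "row.names", all = TRUE)

# Row.names holds character keys, so convert them to integers before sorting.
key <- as.integer(as.character(merged$Row.names))
ordered <- merged[order(key), ]

# Optionally move the key back into the row names and drop the helper column.
rownames(ordered) <- as.character(ordered$Row.names)
ordered$Row.names <- NULL
print(ordered)
```

This conversion only makes sense for numeric-looking row names; with descriptive row names the text ordering produced by merge is usually what is wanted.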