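To find the correlation matrix of a data frame, we can pass the data frame to the cor function. This is less straightforward when the data frame contains missing values: by default, every coefficient between a pair of columns in which either column has an NA comes back as NA. In that case we can call cor with use="complete.obs", so that rows containing a missing value are dropped before the correlation coefficients are computed.
Consider the following data frame: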
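As a minimal sketch of the difference, the snippet below builds a small made-up data frame (the name `toy` and its values are illustrative assumptions, not part of the examples that follow) and calls cor once with the default settings and once with use="complete.obs".

```r
# Hypothetical toy data, not taken from the article's examples
toy <- data.frame(a = c(1, 2, 3, 4, NA),
                  b = c(2, 4, 6, 9, 10),
                  c = c(5, NA, 3, 2, 1))

# Default (use = "everything"): off-diagonal entries that involve a
# column with missing values are NA
default_cor <- cor(toy)

# use = "complete.obs": rows 2 and 5 contain an NA and are dropped,
# so every coefficient is computed from rows 1, 3 and 4
complete_cor <- cor(toy, use = "complete.obs")

print(default_cor)
print(complete_cor)
```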
> x1<-sample(c(NA,24,5,7,8),20,replace=TRUE)
> x2<-sample(c(NA,2,3,1,4,7),20,replace=TRUE)
> x3<-sample(c(NA,512,520,530),20,replace=TRUE)
> df1<-data.frame(x1,x2,x3)
> df1
Output
   x1 x2  x3
1  NA  3 512
2   8  7 512
3   5  2 520
4  NA  1  NA
5  NA  2 512
6  NA  4  NA
7   5 NA 530
8  NA NA 530
9  24  3  NA
10 NA  1 512
11  5  2 530
12 NA  7 520
13  5  1  NA
14  8  3 530
15  7  1  NA
16  7  4 530
17  7  3 512
18  5  2 530
19  7  3 530
20 NA  1 512
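Because sample draws at random, rerunning the commands above will generally produce a different data frame than the one printed here. One way to get repeatable data is to call set.seed before sampling. The sketch below wraps the same sample calls in a helper; the helper name `make_df` and the seed value 42 are assumptions chosen for illustration, and the resulting values will not match the output shown above.

```r
# Hypothetical helper: regenerates the example data from a given seed
make_df <- function(seed) {
  set.seed(seed)  # fixes the random number stream for the sample() calls
  x1 <- sample(c(NA, 24, 5, 7, 8), 20, replace = TRUE)
  x2 <- sample(c(NA, 2, 3, 1, 4, 7), 20, replace = TRUE)
  x3 <- sample(c(NA, 512, 520, 530), 20, replace = TRUE)
  data.frame(x1, x2, x3)
}

first  <- make_df(42)  # 42 is an arbitrary seed
second <- make_df(42)

# The same seed gives the same data frame on both calls
print(identical(first, second))
```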
Finding the correlation matrix of df1:
> cor(df1,use="complete.obs",method="pearson")
Output
           x1         x2         x3
x1  1.0000000  0.7190925 -0.2756960
x2  0.7190925  1.0000000 -0.5200868
x3 -0.2756960 -0.5200868  1.0000000
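Since use="complete.obs" removes every row that contains an NA, the same matrix should result from calling cor on the data frame after na.omit. The sketch below checks this on df1, re-entered by hand from the printed output so that the check is repeatable; the variable names `n_complete`, `via_use` and `via_naomit` are illustrative.

```r
# df1 typed in from the output printed above (the original was drawn at random)
df1 <- data.frame(
  x1 = c(NA, 8, 5, NA, NA, NA, 5, NA, 24, NA, 5, NA, 5, 8, 7, 7, 7, 5, 7, NA),
  x2 = c(3, 7, 2, 1, 2, 4, NA, NA, 3, 1, 2, 7, 1, 3, 1, 4, 3, 2, 3, 1),
  x3 = c(512, 512, 520, NA, 512, NA, 530, 530, NA, 512,
         530, 520, NA, 530, NA, 530, 512, 530, 530, 512)
)

# Rows with no NA in any column: the only rows "complete.obs" uses
n_complete <- sum(complete.cases(df1))
print(n_complete)

via_use    <- cor(df1, use = "complete.obs", method = "pearson")
via_naomit <- cor(na.omit(df1), method = "pearson")

# Both routes should give the same correlation matrix
print(all.equal(via_use, via_naomit))
```

Only 8 of the 20 rows of df1 are complete, so each coefficient in the matrix above rests on 8 observations.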
Consider another example:
> y1<-sample(c(NA,rnorm(5,5,1)),20,replace=TRUE)
> y2<-sample(c(NA,rnorm(5,2,1)),20,replace=TRUE)
> y3<-sample(c(NA,rnorm(10,10,1)),20,replace=TRUE)
> y4<-sample(c(NA,rnorm(10,5,2.5)),20,replace=TRUE)
> df2<-data.frame(y1,y2,y3,y4)
> df2
Output
         y1       y2        y3        y4
1        NA 2.955947        NA 2.8623715
2        NA 3.087940  9.099791 4.5996351
3        NA 3.087940  9.589898 5.6097088
4  3.500343 1.150117 10.985979        NA
5  4.831364 3.087940 10.107124        NA
6  7.041597 1.840461  9.416738 2.8601661
7        NA 2.212388 10.453622 5.0717510
8  4.831364 3.087940 10.928925 6.3030777
9  7.041597       NA  9.099791 5.2709332
10 4.831364 2.212388        NA 2.6219274
11 4.831364 2.212388 10.928925 6.3030777
12 3.500343       NA  8.779948 6.3030777
13 4.772150 1.840461  9.589898 5.2709332
14 7.041597 2.955947 10.453622 5.5989568
15       NA 2.955947  9.827149 5.5989568
16 7.041597 1.840461  9.099791 5.5989568
17 3.500343 2.212388  8.779948 4.5996351
18 4.772150 2.212388 10.985979        NA
19       NA 2.955947 10.453622 0.3151969
20 4.772150 1.150117  9.099791 6.3030777
Finding the correlation matrix of df2:
> cor(df2,use="complete.obs",method="pearson")
Output
            y1         y2         y3         y4
y1  1.00000000 0.07343574 0.06408734 -0.3103069
y2  0.07343574 1.00000000 0.70344970  0.1674528
y3  0.06408734 0.70344970 1.00000000  0.4544444
y4 -0.31030689 0.16745277 0.45444435  1.0000000
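With use="complete.obs", a row is discarded for every coefficient as soon as any one of its columns is missing, which can throw away a lot of data when NAs are spread across columns. An alternative that the examples above do not use is use="pairwise.complete.obs", where each coefficient is computed from the rows that are complete for that particular pair of columns. The sketch below compares the two on made-up data; `toy` and the other variable names are assumptions for illustration. Note that a pairwise matrix mixes coefficients based on different subsets of rows, so it is not guaranteed to be positive semi-definite.

```r
# Hypothetical toy data, not taken from the examples above
toy <- data.frame(u = c(1, 2, 3, 4, 5, NA),
                  v = c(2, 1, 4, 3, NA, 6),
                  w = c(NA, 3, 1, 2, 5, 4))

# Listwise deletion: one shared set of rows (those with no NA at all)
listwise_cor <- cor(toy, use = "complete.obs")
n_listwise <- sum(complete.cases(toy))

# Pairwise deletion: each coefficient uses the rows complete for that pair
pairwise_cor <- cor(toy, use = "pairwise.complete.obs")

# Rows available to each pair: 1 marks a present value, 0 a missing one
present <- ifelse(is.na(as.matrix(toy)), 0, 1)
n_pairwise <- crossprod(present)

print(n_listwise)
print(n_pairwise)
print(listwise_cor)
print(pairwise_cor)
```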