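Data frames and matrices in R often contain missing values, and this causes problems when you compute their correlation matrix: by default, cor returns NA for every pair of columns in which a value is missing. Almost everyone who does data analysis runs into this, but it can be handled by passing the data through na.omit before computing the correlation matrix with cor. See the examples below.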
Consider the following data frame:
> x1<-sample(c(1:5,NA),500,replace=TRUE)
> x2<-sample(c(rnorm(50,2,5),NA),500,replace=TRUE)
> x3<-sample(c(rpois(50,2),NA),500,replace=TRUE)
> x4<-sample(c(runif(50,2,10),NA),500,replace=TRUE)
> df<-data.frame(x1,x2,x3,x4)
> head(df,20)
Output
   x1         x2 x3       x4
1   2  2.6347839  4 2.577690
2   3  0.3082031  1 6.250998
3   1  0.3082031  3 7.786711
4   1  2.6347839  0 3.449600
5  NA  2.5107175  1 7.269619
6   4  2.4450443  4 6.250998
7  NA  1.1747742  2 3.053929
8  NA  2.4450443  3 5.860071
9   5  6.6736496  4 7.979433
10 NA  2.4450443  2 6.250998
11 NA  1.1747742  5       NA
12  2 11.1483587  1 9.498951
13  4  2.1400502 NA 9.299100
14  2 -0.8043954  3 2.883222
15  1  1.5054120  0 2.765324
16  1  0.1283554  2 7.918015
17  3  3.0337960  3 5.588130
18  1  4.5603861  2 7.979433
19  3  4.4976830  4 8.434829
20  1  9.4147186  2 3.053929
> tail(df,20)
Output
    x1         x2 x3       x4
481  2 -1.9780830  4 9.299100
482  3  2.0495769  1 9.639262
483  3 -4.5421502  2 3.374645
484 NA  2.1400502  3       NA
485  2 -4.0551622  2 5.999863
486  4  5.8547691  2 3.593138
487 NA         NA  2 9.549274
488  3  3.9160824  1 3.053929
489  1 11.1483587  5 7.786711
490  3 -2.7581511  2 9.433952
491 NA  4.8002434  1 5.824331
492  2  4.8002434  2 8.434829
493  2  1.9706702  2 3.053929
494 NA  2.5099287  2 7.979433
495  4  1.9706702  1 7.929130
496  2  4.5919890  2 9.973436
497  4  2.5099287  4 7.269619
498  4  0.3082031  3 3.053929
499  1  5.4593713  2 9.973436
500 NA -1.9780830  4 3.219703
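Before dropping rows, it can help to check how much data na.omit would remove. The following is a minimal sketch that rebuilds a data frame the same way as above and counts the missing values; the seed and the names na_per_column and complete_rows are assumptions added for illustration, so the values will not match the output shown above.

```r
# Assumption: an arbitrary seed, only to make this sketch reproducible.
# The original example sets no seed, so its numbers differ.
set.seed(1)
x1 <- sample(c(1:5, NA), 500, replace = TRUE)
x2 <- sample(c(rnorm(50, 2, 5), NA), 500, replace = TRUE)
x3 <- sample(c(rpois(50, 2), NA), 500, replace = TRUE)
x4 <- sample(c(runif(50, 2, 10), NA), 500, replace = TRUE)
df <- data.frame(x1, x2, x3, x4)

# Number of missing values in each column
na_per_column <- colSums(is.na(df))

# Number of rows with no NA in any column; these are the rows na.omit keeps
complete_rows <- sum(complete.cases(df))

print(na_per_column)
print(complete_rows)
```

If complete_rows is much smaller than nrow(df), the correlation matrix is based on only a fraction of the data.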
> cor(na.omit(df))
Output
             x1          x2          x3         x4
x1  1.000000000 0.009571313 -0.06363564 0.03276244
x2  0.009571313 1.000000000  0.08123065 0.03330818
x3 -0.063635640 0.081230649  1.00000000 0.03503841
x4  0.032762439 0.033308181  0.03503841 1.00000000
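To see why na.omit is needed, the sketch below compares three calls on a small hand-made data frame: cor with its defaults, cor on the na.omit result, and cor with use = "complete.obs", which also restricts the computation to complete rows. The data frame toy and its values are hypothetical and chosen only for illustration.

```r
# Hypothetical toy data: rows 3, 4 and 6 each contain one NA
toy <- data.frame(
  x1 = c(1, 2, NA, 4, 5, 6, 7, 8),
  x2 = c(2, 1, 4, NA, 6, 5, 8, 7),
  x3 = c(8, 6, 7, 5, 3, NA, 2, 1)
)

# Default behaviour: any pair of columns involving an NA yields NA
default_cor <- cor(toy)

# Drop incomplete rows first, as in the example above
omit_cor <- cor(na.omit(toy))

# Ask cor itself to use only complete rows
complete_cor <- cor(toy, use = "complete.obs")

print(default_cor)
print(omit_cor)
print(all.equal(omit_cor, complete_cor, check.attributes = FALSE))
```

Both approaches drop every row that has at least one NA, so choosing between them is mostly a matter of style.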
Now let's look at an example with matrix data:
> M<-matrix(sample(c(rpois(10,2),NA),36,replace=TRUE),nrow=6)
> M
Output
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    2    2    2    2   NA    3
[2,]    3    2    4    1    4    3
[3,]    3   NA    1    1    1   NA
[4,]    3   NA    3    2    2    1
[5,]    1    4    3    2    2    2
[6,]    1    2    1    3    1    1
> cor(na.omit(M))
Output
           [,1]       [,2]       [,3]       [,4]       [,5]       [,6]
[1,]  1.0000000 -0.5000000  0.7559289 -0.8660254  0.9449112  0.8660254
[2,] -0.5000000  1.0000000  0.1889822  0.0000000 -0.1889822  0.0000000
[3,]  0.7559289  0.1889822  1.0000000 -0.9819805  0.9285714  0.9819805
[4,] -0.8660254  0.0000000 -0.9819805  1.0000000 -0.9819805 -1.0000000
[5,]  0.9449112 -0.1889822  0.9285714 -0.9819805  1.0000000  0.9819805
[6,]  0.8660254  0.0000000  0.9819805 -1.0000000  0.9819805  1.0000000
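In the matrix printed above, only rows 2, 5 and 6 are free of NA, so cor(na.omit(M)) is computed from just three observations, which helps explain the correlations of exactly -1 and the values close to 1. The sketch below re-enters that matrix by hand so it is reproducible, confirms the number of complete rows, and shows use = "pairwise.complete.obs" as one possible alternative that keeps more data per column pair. The name pairwise_cor is an assumption for illustration, and pairwise deletion has its own trade-off: the result is not guaranteed to be a valid (positive semi-definite) correlation matrix.

```r
# The matrix from the output above, entered row by row
M <- matrix(c(
  2,  2, 2, 2, NA,  3,
  3,  2, 4, 1,  4,  3,
  3, NA, 1, 1,  1, NA,
  3, NA, 3, 2,  2,  1,
  1,  4, 3, 2,  2,  2,
  1,  2, 1, 3,  1,  1
), nrow = 6, byrow = TRUE)

# Which rows survive na.omit?
print(complete.cases(M))   # FALSE TRUE FALSE FALSE TRUE TRUE
print(nrow(na.omit(M)))    # 3

# Listwise deletion, as in the example above
listwise_cor <- cor(na.omit(M))

# Pairwise deletion: each pair of columns uses all rows where both are present
pairwise_cor <- cor(M, use = "pairwise.complete.obs")
print(pairwise_cor)
```

With so few complete rows, the listwise result is best read as an illustration of the mechanics rather than a reliable estimate.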