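A matrix can hold values of only one type, so when a data frame whose string columns are stored as factors is converted to a numeric matrix, every factor level is replaced by a number. That number is the position of the level in the sorted set of levels (alphabetical by default), not simply a function of its first character: in a column with levels "A" through "J", "A" becomes 1, "B" becomes 2, and so on. To convert such a data frame, we flatten it with unlist(), coerce the result with as.numeric(), and reshape it with matrix().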
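To make the numbering rule concrete, here is a minimal sketch on a hypothetical vector (`fruit` and its values are illustrative, not part of the example data). Two strings that share the same first letter still receive different codes, because the codes follow the sorted order of the levels:

```r
# Hypothetical example vector; the names are illustrative only
fruit <- factor(c("banana", "apple", "avocado"))

# Levels are stored in sorted order
lv <- levels(fruit)

# Each value's code is the position of its level in that sorted order
codes <- as.integer(fruit)

print(lv)
print(codes)
```

Here "apple" and "avocado" both start with "a" but are coded 1 and 2, while "banana" is coded 3.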
Consider the following data frame:
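x1 <- 1:10
x2 <- 10:1
x3 <- letters[1:10]
x4 <- LETTERS[1:10]
x5 <- letters[10:1]
x6 <- LETTERS[10:1]
x7 <- rnorm(10)
x8 <- rnorm(10, 0.2)
x9 <- rnorm(10, 0.5)
x10 <- rnorm(10, 1)
# stringsAsFactors = TRUE is required in R >= 4.0.0, where strings are
# no longer converted to factors by default
df <- data.frame(x1, x2, x3, x4, x5, x6, x7, x8, x9, x10, stringsAsFactors = TRUE)
str(df)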
Output
'data.frame': 10 obs. of 10 variables:
 $ x1 : int 1 2 3 4 5 6 7 8 9 10
 $ x2 : int 10 9 8 7 6 5 4 3 2 1
 $ x3 : Factor w/ 10 levels "a","b","c","d",..: 1 2 3 4 5 6 7 8 9 10
 $ x4 : Factor w/ 10 levels "A","B","C","D",..: 1 2 3 4 5 6 7 8 9 10
 $ x5 : Factor w/ 10 levels "a","b","c","d",..: 10 9 8 7 6 5 4 3 2 1
 $ x6 : Factor w/ 10 levels "A","B","C","D",..: 10 9 8 7 6 5 4 3 2 1
 $ x7 : num 0.526 -0.795 1.428 -1.467 -0.237 ...
 $ x8 : num 0.0362 0.9085 -0.068 -1.2639 0.9444 ...
 $ x9 : num 1.395 0.779 1.508 -1.573 1.69 ...
 $ x10: num 1.482 1.758 -1.319 0.54 -0.105 ...

df

   x1 x2 x3 x4 x5 x6         x7          x8         x9        x10
1   1 10  a  A  j  J  0.5264481  0.03624433  1.3949372  1.4824588
2   2  9  b  B  i  I -0.7948444  0.90852210  0.7791520  1.7582138
3   3  8  c  C  h  H  1.4277555 -0.06798055  1.5078658 -1.3193274
4   4  7  d  D  g  G -1.4668197 -1.26392176 -1.5731065  0.5404952
5   5  6  e  E  f  F -0.2366834  0.94443582  1.6898534 -0.1053837
6   6  5  f  F  e  E -0.1933380 -1.21039018 -0.2243742  1.4029283
7   7  4  g  G  d  D -0.8497547  0.66706761  0.6679838  1.5689349
8   8  3  h  H  c  C  0.0584655  0.08067989  1.4203352  0.2939167
9   9  2  i  I  b  B -0.8176704  0.66723896 -1.1716048  0.7099094
10 10  1  j  J  a  A -2.0503078  0.69813556  0.9484691 -0.4838781
Converting the data frame df to a matrix:
matrix(as.numeric(unlist(df)),nrow=nrow(df))
Output
      [,1] [,2] [,3] [,4] [,5] [,6]       [,7]        [,8]       [,9]
 [1,]    1   10    1    1   10   10  0.5264481  0.03624433  1.3949372
 [2,]    2    9    2    2    9    9 -0.7948444  0.90852210  0.7791520
 [3,]    3    8    3    3    8    8  1.4277555 -0.06798055  1.5078658
 [4,]    4    7    4    4    7    7 -1.4668197 -1.26392176 -1.5731065
 [5,]    5    6    5    5    6    6 -0.2366834  0.94443582  1.6898534
 [6,]    6    5    6    6    5    5 -0.1933380 -1.21039018 -0.2243742
 [7,]    7    4    7    7    4    4 -0.8497547  0.66706761  0.6679838
 [8,]    8    3    8    8    3    3  0.0584655  0.08067989  1.4203352
 [9,]    9    2    9    9    2    2 -0.8176704  0.66723896 -1.1716048
[10,]   10    1   10   10    1    1 -2.0503078  0.69813556  0.9484691
           [,10]
 [1,]  1.4824588
 [2,]  1.7582138
 [3,] -1.3193274
 [4,]  0.5404952
 [5,] -0.1053837
 [6,]  1.4029283
 [7,]  1.5689349
 [8,]  0.2939167
 [9,]  0.7099094
[10,] -0.4838781
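Base R also provides data.matrix(), which performs this kind of conversion in a single call and keeps the column names. The following is a minimal sketch on a small hypothetical data frame (`small`, `id`, `grp` and `val` are illustrative names, not part of the example above); it compares the result with the unlist() approach:

```r
# Hypothetical data frame; names and values are illustrative only
small <- data.frame(
  id  = 1:3,
  grp = c("b", "a", "c"),
  val = c(0.5, 1.5, 2.5),
  stringsAsFactors = TRUE   # store grp as a factor
)

# Approach used in this article: flatten, coerce, reshape
m1 <- matrix(as.numeric(unlist(small)), nrow = nrow(small))

# Built-in alternative: factors become their integer codes
m2 <- data.matrix(small)

print(m2)
print(all(m1 == m2))   # the two matrices hold the same values
```

The main practical difference in this sketch is that m2 carries the column names id, grp and val, whereas m1 has none.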
Let's look at another example:
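y1 <- c("Age", "Sex", "Salary", "Education", "Ethnicity")
y2 <- 1:5
y3 <- c(24, 15, 48, 72, 29)
# stringsAsFactors = TRUE is required in R >= 4.0.0 so that y1 becomes a factor
df_y <- data.frame(y1, y2, y3, stringsAsFactors = TRUE)
df_y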
Output
         y1 y2 y3
1       Age  1 24
2       Sex  2 15
3    Salary  3 48
4 Education  4 72
5 Ethnicity  5 29
matrix(as.numeric(unlist(df_y)),nrow=5)
Output
     [,1] [,2] [,3]
[1,]    1    1   24
[2,]    5    2   15
[3,]    4    3   48
[4,]    2    4   72
[5,]    3    5   29
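The first column of this matrix may look arbitrary, but it follows the sorted order of the labels: Age, Education, Ethnicity, Salary, Sex. As a rough sketch of how the codes could be mapped back to their labels (the variable names `codes` and `lookup` are introduced here for illustration only):

```r
# Same labels as y1 above, stored explicitly as a factor
y1 <- factor(c("Age", "Sex", "Salary", "Education", "Ethnicity"))

# Integer codes, as they appear in column 1 of the matrix
codes <- as.integer(y1)

# The sorted levels act as a lookup table from code to label
lookup <- levels(y1)
decoded <- lookup[codes]

print(codes)
print(decoded)
```

Keeping the levels alongside the matrix is one way to avoid losing the meaning of the codes after conversion.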