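When we compute a statistical summary of an R data frame with summary(), we get only the minimum, first quartile, median, mean, third quartile, and maximum. Descriptive analysis often calls for other useful measures as well, such as variance, standard deviation, skewness, and kurtosis. For these, we can use the basicStats function of the fBasics package.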
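To make the gap concrete, here is a minimal base R sketch. It assumes the built-in mtcars data set and uses the mpg column purely as an illustration; the variable names are not from the original text.

```r
# Base R only; mtcars ships with R, and mpg is just an example column.
data(mtcars)

# summary() reports only min, quartiles, median, mean, and max.
six_numbers <- summary(mtcars$mpg)
print(six_numbers)

# Variance and standard deviation need separate calls.
mpg_var <- var(mtcars$mpg)
mpg_sd  <- sd(mtcars$mpg)
print(c(Variance = mpg_var, Stdev = mpg_sd))
```

Collecting every such measure by hand for every column quickly becomes tedious, which is the motivation for a single call that returns all of them.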
Load the fBasics package:
library(fBasics)
Consider the mtcars data set in base R:
data(mtcars)
head(mtcars, 20)
Output
                     mpg cyl  disp  hp drat    wt  qsec vs am gear carb
Mazda RX4           21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag       21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
Datsun 710          22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive      21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout   18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
Valiant             18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
Duster 360          14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
Merc 240D           24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
Merc 230            22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
Merc 280            19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
Merc 280C           17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4
Merc 450SE          16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3
Merc 450SL          17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3
Merc 450SLC         15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3
Cadillac Fleetwood  10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4
Lincoln Continental 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4
Chrysler Imperial   14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4
Fiat 128            32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
Honda Civic         30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
Toyota Corolla      33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
Find the statistical summary of the mtcars data set:
basicStats(mtcars)
                   mpg        cyl         disp          hp       drat
nobs         32.000000  32.000000    32.000000   32.000000  32.000000
NAs           0.000000   0.000000     0.000000    0.000000   0.000000
Minimum      10.400000   4.000000    71.100000   52.000000   2.760000
Maximum      33.900000   8.000000   472.000000  335.000000   4.930000
1. Quartile  15.425000   4.000000   120.825000   96.500000   3.080000
3. Quartile  22.800000   8.000000   326.000000  180.000000   3.920000
Mean         20.090625   6.187500   230.721875  146.687500   3.596563
Median       19.200000   6.000000   196.300000  123.000000   3.695000
Sum         642.900000 198.000000  7383.100000 4694.000000 115.090000
SE Mean       1.065424   0.315709    21.909473   12.120317   0.094519
LCL Mean     17.917679   5.543607   186.037211  121.967950   3.403790
UCL Mean     22.263571   6.831393   275.406539  171.407050   3.789335
Variance     36.324103   3.189516 15360.799829 4700.866935   0.285881
Stdev         6.026948   1.785922   123.938694   68.562868   0.534679
Skewness      0.610655  -0.174612     0.381657    0.726024   0.265904
Kurtosis     -0.372766  -1.762120    -1.207212   -0.135551  -0.714701
                    wt       qsec        vs        am       gear      carb
nobs         32.000000  32.000000 32.000000 32.000000  32.000000 32.000000
NAs           0.000000   0.000000  0.000000  0.000000   0.000000  0.000000
Minimum       1.513000  14.500000  0.000000  0.000000   3.000000  1.000000
Maximum       5.424000  22.900000  1.000000  1.000000   5.000000  8.000000
1. Quartile   2.581250  16.892500  0.000000  0.000000   3.000000  2.000000
3. Quartile   3.610000  18.900000  1.000000  1.000000   4.000000  4.000000
Mean          3.217250  17.848750  0.437500  0.406250   3.687500  2.812500
Median        3.325000  17.710000  0.000000  0.000000   4.000000  2.000000
Sum         102.952000 571.160000 14.000000 13.000000 118.000000 90.000000
SE Mean       0.172968   0.315890  0.089098  0.088210   0.130427  0.285530
LCL Mean      2.864478  17.204488  0.255783  0.226345   3.421493  2.230158
UCL Mean      3.570022  18.493012  0.619217  0.586155   3.953507  3.394842
Variance      0.957379   3.193166  0.254032  0.248992   0.544355  2.608871
Stdev         0.978457   1.786943  0.504016  0.498991   0.737804  1.615200
Skewness      0.423146   0.369045  0.240258  0.364016   0.528854  1.050874
Kurtosis     -0.022711   0.335114 -2.001938 -1.924741  -1.069751  1.257043
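The result of basicStats is a data frame whose row names are the statistic labels, so individual measures can be picked out by name. The following is a minimal sketch, assuming the fBasics package is installed; the object names mt_stats, wanted, and subset_stats and the choice of rows and columns are illustrative only.

```r
library(fBasics)  # assumes fBasics is installed

mt_stats <- basicStats(mtcars)

# Keep only selected statistics (rows) and variables (columns).
wanted <- c("Mean", "Stdev", "Skewness", "Kurtosis")
subset_stats <- mt_stats[wanted, c("mpg", "hp", "wt")]
print(subset_stats)

# A single value is addressed by row name and column name.
mpg_skew <- mt_stats["Skewness", "mpg"]
print(mpg_skew)
```

Indexing by name rather than by position keeps the code readable and does not depend on the order in which the statistics are listed.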
Let's look at two more examples, using the trees and pressure data sets in R.
Example with the trees data:
data(trees)
head(trees, 20)
Output
   Girth Height Volume
1    8.3     70   10.3
2    8.6     65   10.3
3    8.8     63   10.2
4   10.5     72   16.4
5   10.7     81   18.8
6   10.8     83   19.7
7   11.0     66   15.6
8   11.0     75   18.2
9   11.1     80   22.6
10  11.2     75   19.9
11  11.3     79   24.2
12  11.4     76   21.0
13  11.4     76   21.4
14  11.7     69   21.3
15  12.0     75   19.1
16  12.9     74   22.2
17  12.9     85   33.8
18  13.3     86   27.4
19  13.7     71   25.7
20  13.8     64   24.9
basicStats(trees)
                 Girth      Height     Volume
nobs         31.000000   31.000000  31.000000
NAs           0.000000    0.000000   0.000000
Minimum       8.300000   63.000000  10.200000
Maximum      20.600000   87.000000  77.000000
1. Quartile  11.050000   72.000000  19.400000
3. Quartile  15.250000   80.000000  37.300000
Mean         13.248387   76.000000  30.170968
Median       12.900000   76.000000  24.200000
Sum         410.700000 2356.000000 935.300000
SE Mean       0.563626    1.144411   2.952324
LCL Mean     12.097309   73.662800  24.141517
UCL Mean     14.399466   78.337200  36.200418
Variance      9.847914   40.600000 270.202796
Stdev         3.138139    6.371813  16.437846
Skewness      0.501056   -0.356877   1.013274
Kurtosis     -0.710941   -0.723368   0.246039
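The SE Mean, LCL Mean, and UCL Mean rows can be cross-checked in base R. The sketch below assumes that the limits are a two-sided 95% confidence interval for the mean based on the t distribution; that level is inferred from the printed numbers for the Height column rather than stated in the text above, and the variable names are illustrative.

```r
# Base R only; trees ships with R.
data(trees)
x <- trees$Height
n <- length(x)

se_mean <- sd(x) / sqrt(n)             # standard error of the mean
t_crit  <- qt(0.975, df = n - 1)       # two-sided 95% critical value (assumed level)
lcl     <- mean(x) - t_crit * se_mean  # lower confidence limit
ucl     <- mean(x) + t_crit * se_mean  # upper confidence limit

print(round(c("SE Mean" = se_mean, "LCL Mean" = lcl, "UCL Mean" = ucl), 6))
```

Under that assumption the three values agree with the Height column of the basicStats(trees) output to the printed precision.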
Example with the pressure data:
data(pressure)
head(pressure, 20)
Output
   temperature pressure
1            0   0.0002
2           20   0.0012
3           40   0.0060
4           60   0.0300
5           80   0.0900
6          100   0.2700
7          120   0.7500
8          140   1.8500
9          160   4.2000
10         180   8.8000
11         200  17.3000
12         220  32.1000
13         240  57.0000
14         260  96.0000
15         280 157.0000
16         300 247.0000
17         320 376.0000
18         340 558.0000
19         360 806.0000
basicStats(pressure)
             temperature     pressure
nobs           19.000000    19.000000
NAs             0.000000     0.000000
Minimum         0.000000     0.000200
Maximum       360.000000   806.000000
1. Quartile    90.000000     0.180000
3. Quartile   270.000000   126.500000
Mean          180.000000   124.336705
Median        180.000000     8.800000
Sum          3420.000000  2362.397400
SE Mean        25.819889    51.531945
LCL Mean      125.754426    16.072107
UCL Mean      234.245574   232.601304
Variance    12666.666667 50455.285428
Stdev         112.546287   224.622540
Skewness        0.000000     1.835588
Kurtosis       -1.390471     2.334429
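The Skewness and Kurtosis rows appear consistent with moment-based estimates that divide by the sample standard deviation, with 3 subtracted from the kurtosis (excess kurtosis). This is an assumption inferred from the printed temperature values, not a statement of the package's documented method. The helper names moment_skewness and excess_kurtosis below are hypothetical and are not part of fBasics.

```r
# Hypothetical helpers (not part of fBasics); base R only.
# Third and fourth central moments, scaled by the sample standard deviation.
moment_skewness <- function(x) mean((x - mean(x))^3) / sd(x)^3
excess_kurtosis <- function(x) mean((x - mean(x))^4) / sd(x)^4 - 3

data(pressure)
shape <- rbind(
  Skewness = sapply(pressure, moment_skewness),
  Kurtosis = sapply(pressure, excess_kurtosis)
)
print(round(shape, 6))
```

A skewness of zero for temperature reflects its evenly spaced, symmetric values, while the strongly positive skewness of pressure reflects its long right tail.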