How to create a regression model in R with interactions between all pairwise combinations of variables?

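The simplest way to create a regression model with interactions is to join the variables with the * operator, but this also creates every higher-order combination (three-way interactions and above). To include only the interactions between pairs of variables, wrap the sum of the variables in parentheses and raise it to the power 2 with the ^ operator, as the following examples show.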
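To see the difference between the two operators, one can inspect the terms R derives from each formula without fitting anything. The following is a minimal sketch using base R's terms(); the object names full and pairwise are chosen here only for illustration.

```r
# Expand each formula into its term labels; no data is needed for this step.
full     <- attr(terms(y ~ x1 * x2 * x3), "term.labels")
pairwise <- attr(terms(y ~ (x1 + x2 + x3)^2), "term.labels")

print(full)      # main effects, two-way terms, and the three-way term
print(pairwise)  # main effects and two-way terms only

# The only term that the * formula adds is the three-way interaction.
setdiff(full, pairwise)
```

With three variables the two formulas differ by a single term; with more variables the * form grows much faster, because it also adds every interaction of order three and above.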

Example 1

x1 <- rnorm(10)
x2 <- rnorm(10, 1, 0.2)
x3 <- rnorm(10, 1, 0.04)
y <- rnorm(10, 5, 1)
M1 <- lm(y ~ (x1 + x2 + x3)^2)
summary(M1)

Output

Call:
lm(formula = y ~ (x1 + x2 + x3)^2)

Residuals:
       1        2        3        4        5        6        7        8
 0.47052 -0.39362  0.37762 -0.80668  0.41637 -0.04845  0.00832  0.27097
       9       10
 0.14218 -0.43722

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   0.2893   172.6567   0.002    0.999
x1           28.5300    25.1856   1.133    0.340
x2            7.9753   191.0616   0.042    0.969
x3            3.3123   168.1906   0.020    0.986
x1:x2         1.2607    16.6937   0.076    0.945
x1:x3       -28.3810    19.4585  -1.459    0.241
x2:x3        -6.2240   186.3458  -0.033    0.975

Residual standard error: 0.7372 on 3 degrees of freedom
Multiple R-squared: 0.7996, Adjusted R-squared: 0.3989
F-statistic: 1.995 on 6 and 3 DF, p-value: 0.3048
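Example 1 draws its data without fixing a seed, so rerunning it gives numbers different from those shown. It also estimates 7 coefficients from only 10 observations, which leaves the 3 residual degrees of freedom reported above and likely contributes to the very large standard errors. A minimal sketch of a repeatable version follows; the seed value 123 is an arbitrary assumption and is not part of the original example.

```r
set.seed(123)  # arbitrary seed (assumption), used only to make the run repeatable
x1 <- rnorm(10)
x2 <- rnorm(10, 1, 0.2)
x3 <- rnorm(10, 1, 0.04)
y  <- rnorm(10, 5, 1)
M1 <- lm(y ~ (x1 + x2 + x3)^2)

length(coef(M1))  # number of estimated coefficients, intercept included
df.residual(M1)   # observations minus estimated coefficients
```

Comparing these two numbers before reading the coefficient table is a quick way to spot a model that has too many terms for the available data.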

Example 2

a1 <- rpois(500, 5)
a2 <- rpois(500, 8)
a3 <- rpois(500, 10)
a4 <- rpois(500, 2)
a5 <- rpois(500, 12)
a6 <- rpois(500, 15)
a7 <- rpois(500, 9)
y <- rpois(500, 1)
M2 <- lm(y ~ (a1 + a2 + a3 + a4 + a5 + a6 + a7)^2)
summary(M2)

Output

Call:
lm(formula = y ~ (a1 + a2 + a3 + a4 + a5 + a6 + a7)^2)

Residuals:
    Min      1Q  Median      3Q     Max
-1.4849 -0.8804 -0.0342  0.6623  4.2336

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.1225469  1.8336636  -0.067  0.94674
a1           0.4629300  0.1548978   2.989  0.00295 **
a2          -0.0330453  0.1246535  -0.265  0.79105
a3           0.0442927  0.1191984   0.372  0.71037
a4          -0.0661164  0.2644226  -0.250  0.80266
a5           0.0657267  0.1035211   0.635  0.52579
a6          -0.0434769  0.0832513  -0.522  0.60175
a7          -0.0132370  0.1187218  -0.111  0.91127
a1:a2       -0.0055441  0.0072067  -0.769  0.44210
a1:a3       -0.0095850  0.0062517  -1.533  0.12590
a1:a4       -0.0197856  0.0156935  -1.261  0.20802
a1:a5       -0.0063698  0.0055879  -1.140  0.25489
a1:a6       -0.0119008  0.0057317  -2.076  0.03841 *
a1:a7       -0.0009957  0.0069639  -0.143  0.88637
a2:a3       -0.0005469  0.0048617  -0.112  0.91049
a2:a4       -0.0096056  0.0119358  -0.805  0.42136
a2:a5       -0.0040884  0.0048707  -0.839  0.40167
a2:a6        0.0059163  0.0045048   1.313  0.18971
a2:a7        0.0023896  0.0052308   0.457  0.64800
a3:a4       -0.0003036  0.0096746  -0.031  0.97498
a3:a5       -0.0070901  0.0045312  -1.565  0.11832
a3:a6        0.0049534  0.0039970   1.239  0.21586
a3:a7        0.0013881  0.0050959   0.272  0.78543
a4:a5        0.0138932  0.0095724   1.451  0.14734
a4:a6        0.0053824  0.0088454   0.608  0.54315
a4:a7        0.0020738  0.0107736   0.192  0.84745
a5:a6        0.0019474  0.0036433   0.535  0.59324
a5:a7        0.0019719  0.0048370   0.408  0.68370
a6:a7       -0.0031881  0.0041510  -0.768  0.44285
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.017 on 471 degrees of freedom
Multiple R-squared: 0.04549, Adjusted R-squared: -0.01126
F-statistic: 0.8016 on 28 and 471 DF, p-value: 0.7563
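With p predictors, the ^2 formula produces p main effects plus choose(p, 2) two-way interactions. For the seven predictors of Example 2 that is 7 + 21 = 28 terms, which matches the 28 numerator degrees of freedom of the F-statistic above. A small sketch to check the count, both by arithmetic and from the formula itself (the names p, n_terms and f are illustrative):

```r
p <- 7
n_terms <- p + choose(p, 2)  # main effects + pairwise interactions
n_terms

# The same count, read off the expanded formula
f <- y ~ (a1 + a2 + a3 + a4 + a5 + a6 + a7)^2
length(attr(terms(f), "term.labels"))
```

Because the number of interaction terms grows quadratically with p, it is worth checking this count against the sample size before fitting.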

Example 3

z1 <- runif(100, 1, 2)
z2 <- runif(100, 1, 4)
z3 <- runif(100, 1, 5)
z4 <- runif(100, 2, 5)
z5 <- runif(100, 2, 10)
y <- runif(100, 1, 10)
M3 <- lm(y ~ (z1 + z2 + z3 + z4 + z5)^2)
summary(M3)

Output

Call:
lm(formula = y ~ (z1 + z2 + z3 + z4 + z5)^2)

Residuals:
    Min      1Q  Median      3Q     Max
-5.4732 -2.0570  0.0582  2.1667  5.3376

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -2.03476   14.52311  -0.140   0.8889
z1           3.14344    6.80702   0.462   0.6454
z2           3.85518    3.05398   1.262   0.2103
z3          -1.88782    2.16124  -0.873   0.3849
z4           2.75794    3.11048   0.887   0.3778
z5          -0.70359    1.05400  -0.668   0.5063
z1:z2       -2.09623    1.24757  -1.680   0.0966 .
z1:z3        0.17328    0.97128   0.178   0.8588
z1:z4        0.53514    1.26533   0.423   0.6734
z1:z5        0.02687    0.43087   0.062   0.9504
z2:z3        0.15894    0.34335   0.463   0.6446
z2:z4       -0.72427    0.43987  -1.647   0.1034
z2:z5        0.22560    0.16570   1.362   0.1770
z3:z4       -0.16602    0.33847  -0.491   0.6251
z3:z5        0.30484    0.12536   2.432   0.0171 *
z4:z5       -0.19887    0.17768  -1.119   0.2662
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.792 on 84 degrees of freedom
Multiple R-squared: 0.1587, Adjusted R-squared: 0.008411
F-statistic: 1.056 on 15 and 84 DF, p-value: 0.4091
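When the predictors are stored in a data frame, typing every variable name becomes tedious. In an R formula the dot stands for all columns other than the response, so .^2 is a compact way to request the same pairwise model. The sketch below assumes a hypothetical data frame df with three predictors and an arbitrary seed; neither appears in the examples above.

```r
set.seed(1)  # arbitrary seed (assumption), only so the sketch is repeatable
df <- data.frame(
  z1 = runif(100, 1, 2),
  z2 = runif(100, 1, 4),
  z3 = runif(100, 1, 5),
  y  = runif(100, 1, 10)
)

# "." expands to every column of df except the response y
M_dot      <- lm(y ~ .^2, data = df)
M_explicit <- lm(y ~ (z1 + z2 + z3)^2, data = df)

names(coef(M_dot))                         # intercept, main effects, two-way terms
all.equal(coef(M_dot), coef(M_explicit))   # both formulas describe the same model
```

The explicit form is still the better choice when only some columns of the data frame should enter the model.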