Mult R      .868909798
RSquare     .755004238
Adj RSqu    .732731896
SE         1.865938991

ANOVA table
              df        SS        MS
Regress      2.000   236.052   118.026
Residual    22.000    76.598     3.482

F value        Sig F
33.89873583    .00000019

--------------Variables in the Equation----------------
                  B         SE(B)         Beta       B/SE(B)
x1         .025864420   .005128734   .575766004   5.043041590
x2         .176466602   .065050050   .309719365   2.712781978
Constant -1.095799831   .701872837   .000000000  -1.561251230
------ END MATRIX -----
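The standardized ridge regression equation of y on x1 and x2 is

$\hat{y} = 0.57576x_1 + 0.30972x_2$

and the unstandardized ridge regression equation is

$\hat{y} = -1.09580 + 0.02586x_1 + 0.17647x_2$

(7) A researcher wants to regress y on three predictors: total loan balance, cumulative loans receivable for the year, and number of loan projects. Is this feasible, and if so, how should it be done?

Ridge regression on x1, x2, x3 gives the following results: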
R-SQUARE AND BETA COEFFICIENTS FOR ESTIMATED VALUES OF K

   K       RSQ        X1        X2        X3
.00000   .75964   .698331   .295891  -.065536
.05000   .75667   .591044   .303379   .020096
.10000   .75175   .526452   .303409   .068986
.15000   .74671   .482505   .300383   .099950
.20000   .74181   .450130   .295992   .120819
.25000   .73705   .424913   .290990   .135446
.30000   .73238   .404450   .285747   .145954
.35000   .72775   .387319   .280455   .153611
.40000   .72314   .372627   .275214   .159221
.45000   .71851   .359783   .270080   .163320
.50000   .71388   .348378   .265083   .166279
.55000   .70921   .338123   .260236   .168360
.60000   .70452   .328807   .255545   .169756
.65000   .69981   .320268   .251009   .170613
.70000   .69507   .312385   .246627   .171039
.75000   .69031   .305063   .242394   .171119
.80000   .68554   .298225   .238304   .170920
.85000   .68076   .291811   .234352   .170494
.90000   .67597   .285769   .230532   .169883
.95000   .67118   .280059   .226837   .169121
1.0000   .66639   .274646   .223262   .168236

[Figure: ridge trace plot]
Run MATRIX procedure:

****** Ridge Regression with k = 0.4 ******

Mult R      .850373821
RSquare     .723135635
Adj RSqu    .683583583
SE         2.030268037

ANOVA table
              df        SS        MS
Regress      3.000   226.089    75.363
Residual    21.000    86.562     4.122

F value        Sig F
18.28313822    .00000456

--------------Variables in the Equation----------------
                  B         SE(B)         Beta        B/SE(B)
x1         .016739073   .003359156   .372627316   4.983118685
x2         .156806656   .047550034   .275213878   3.297719120
x3         .067110931   .032703990   .159221005   2.052071673
Constant  -.819486727   .754456246   .000000000  -1.0861951666
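The ridge regression equation is $\hat{y} = -0.819 + 0.0167x_1 + 0.157x_2 + 0.0671x_3$, and the regression coefficients have a reasonable interpretation. In the table, B/SE(B) is an approximate t value: $t_1 = 4.983$ and $t_2 = 3.298$, so x1 and x2 are both significant, and $t_3 = 2.052$ indicates that x3 is also fairly significant. Ridge regression of y on x1, x2, x3 is therefore feasible.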
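The approximate t values quoted above are simply the ratio of each coefficient to its standard error. A minimal sketch that recomputes them, and the F value, from the numbers printed in the k = 0.4 output (the variable and function names are illustrative, not part of the SPSS macro):

```python
# Coefficients and standard errors copied from the k = 0.4 ridge output.
coefficients = {
    "x1": (0.016739073, 0.003359156),
    "x2": (0.156806656, 0.047550034),
    "x3": (0.067110931, 0.032703990),
}


def approximate_t(b, se):
    """Return B / SE(B), the approximate t ratio shown in the output."""
    return b / se


t_ratios = {name: approximate_t(b, se) for name, (b, se) in coefficients.items()}

# F statistic recomputed from the ANOVA mean squares (Regress MS / Residual MS).
f_value = 75.363 / 4.122

for name, t in t_ratios.items():
    print(f"{name}: approximate t = {t:.3f}")
print(f"F = {f_value:.3f}")
```

Because the mean squares are rounded to three decimals in the output, the recomputed F agrees with the printed value only to about that precision.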
Applied Regression Analysis: Solutions to Exercises

Chapter 2  Simple Linear Regression

2.14 Solution. Excel output:
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.944911
R Square             0.892857
Adjusted R Square    0.857143
Standard Error       0.597614
Observations         5

ANOVA
              df        SS          MS         F     Significance F
Regression     1     8.928571    8.928571     25       0.015392
Residual       3     1.071429    0.357143
Total          4    10

               Coefficients   Standard Error    t Stat     P-value    Lower 95%   Upper 95%   Lower 95.0%   Upper 95.0%
Intercept        -0.21429        0.6962        -0.30779    0.778371    -2.4299     2.001332     -2.4299      2.001332
X Variable 1      0.178571       0.035714       5          0.015392     0.064913   0.29223       0.064913    0.29223

RESIDUAL OUTPUT
Observation   Predicted Y    Residual
1              1.571429      -0.57143
2              1.571429       0.428571
3              3.357143      -0.35714
4              3.357143       0.642857
5              5.142857      -0.14286

SPSS output:
(1) The scatter plot is:
[Figure: scatter plot of y against x]
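(2) x and y are roughly linearly related.

(3) Let the regression equation be $\hat{y} = \hat\beta_0 + \hat\beta_1 x$. Then

$$\hat\beta_1 = \frac{\sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y}}{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2} = 7, \qquad \hat\beta_0 = \bar{y} - \hat\beta_1\bar{x} = 20 - 7 \times 3 = -1.$$

The regression equation is therefore $\hat{y} = -1 + 7x$.

(4)
$$\hat\sigma^2 = \frac{1}{n-2}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2 = \frac{1}{n-2}\sum_{i=1}^{n}\bigl(y_i - (\hat\beta_0 + \hat\beta_1 x_i)\bigr)^2$$
$$= \frac{1}{3}\Bigl[(10-(-1+7\times1))^2 + (10-(-1+7\times2))^2 + (20-(-1+7\times3))^2 + (20-(-1+7\times4))^2 + (40-(-1+7\times5))^2\Bigr]$$
$$= \frac{1}{3}\,[16 + 9 + 0 + 49 + 36] = 110/3,$$
$$\hat\sigma = \frac{1}{3}\sqrt{330} \approx 6.1.$$

(5) Since $\hat\beta_1 \sim N\!\left(\beta_1, \frac{\sigma^2}{L_{xx}}\right)$, the statistic

$$t = \frac{\hat\beta_1 - \beta_1}{\sqrt{\hat\sigma^2 / L_{xx}}} = \frac{(\hat\beta_1 - \beta_1)\sqrt{L_{xx}}}{\hat\sigma}$$

follows a t distribution with n-2 degrees of freedom. Hence

$$P\!\left(\left|\frac{(\hat\beta_1 - \beta_1)\sqrt{L_{xx}}}{\hat\sigma}\right| < t_{\alpha/2}(n-2)\right) = 1 - \alpha,$$

that is,

$$P\!\left(\hat\beta_1 - t_{\alpha/2}\frac{\hat\sigma}{\sqrt{L_{xx}}} < \beta_1 < \hat\beta_1 + t_{\alpha/2}\frac{\hat\sigma}{\sqrt{L_{xx}}}\right) = 1 - \alpha.$$

The 95% confidence interval for $\beta_1$ is $\left(7 - 2.353 \times \tfrac{1}{3}\sqrt{33},\ 7 + 2.353 \times \tfrac{1}{3}\sqrt{33}\right)$, that is, (2.49, 11.5).

Similarly, $\hat\beta_0 \sim N\!\left(\beta_0, \left(\frac{1}{n} + \frac{\bar{x}^2}{L_{xx}}\right)\sigma^2\right)$, so

$$t = \frac{\hat\beta_0 - \beta_0}{\hat\sigma\sqrt{\frac{1}{n} + \frac{\bar{x}^2}{L_{xx}}}}$$

follows a t distribution with n-2 degrees of freedom. Hence

$$P\!\left(\left|\frac{\hat\beta_0 - \beta_0}{\hat\sigma\sqrt{\frac{1}{n} + \frac{\bar{x}^2}{L_{xx}}}}\right| < t_{\alpha/2}(n-2)\right) = 1 - \alpha,$$

that is,

$$P\!\left(\hat\beta_0 - \hat\sigma\sqrt{\tfrac{1}{n} + \tfrac{\bar{x}^2}{L_{xx}}}\; t_{\alpha/2} < \beta_0 < \hat\beta_0 + \hat\sigma\sqrt{\tfrac{1}{n} + \tfrac{\bar{x}^2}{L_{xx}}}\; t_{\alpha/2}\right) = 1 - \alpha.$$

The 95% confidence interval for $\beta_0$ is (-7.77, 5.77).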
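The estimates in parts (3) to (5) can be checked numerically. The sketch below assumes the five observations listed in the residual table of part (10) (x = 1..5, y = 10, 10, 20, 20, 40) and reuses the critical value 2.353 exactly as quoted in the solution; that value is the upper 5% point of the t distribution with 3 degrees of freedom. Variable names are illustrative.

```python
import math

# Data as listed in the residual table of part (10).
x = [1, 2, 3, 4, 5]
y = [10, 10, 20, 20, 40]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# L_xx and L_xy in the "sum of products minus n * means" form used in part (3).
l_xx = sum(xi * xi for xi in x) - n * x_bar ** 2
l_xy = sum(xi * yi for xi, yi in zip(x, y)) - n * x_bar * y_bar

beta1 = l_xy / l_xx
beta0 = y_bar - beta1 * x_bar

# Residual variance estimate with n - 2 degrees of freedom, as in part (4).
residuals = [yi - (beta0 + beta1 * xi) for xi, yi in zip(x, y)]
sigma2 = sum(e * e for e in residuals) / (n - 2)
sigma = math.sqrt(sigma2)

# Interval for beta1 using the critical value quoted in the solution.
t_crit = 2.353
half_width = t_crit * sigma / math.sqrt(l_xx)
ci_beta1 = (beta1 - half_width, beta1 + half_width)

print(f"beta1 = {beta1}, beta0 = {beta0}")
print(f"sigma^2 = {sigma2:.3f}, sigma = {sigma:.3f}")
print(f"interval for beta1: ({ci_beta1[0]:.2f}, {ci_beta1[1]:.2f})")
```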
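(6) The coefficient of determination of x and y is

$$r^2 = \frac{\sum_{i=1}^{n}(\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{n}(y_i - \bar{y})^2} = 490/600 = 0.817.$$

(7)
ANOVA
x
                                          Sum of Squares   df   Mean Square      F      Sig.
Between Groups   (Combined)                    9.000         2      4.500       9.000   .100
                 Linear Term   Weighted        8.167         1      8.167      16.333   .056
                               Deviation        .833         1       .833       1.667   .326
Within Groups                                  1.000         2       .500
Total                                         10.000         4

Since $F > F_\alpha(1,3)$, we reject $H_0$: the regression equation is significant, and x and y have a significant linear relationship.

(8)
$$t = \frac{\hat\beta_1}{\sqrt{\hat\sigma^2 / L_{xx}}} = \frac{\hat\beta_1\sqrt{L_{xx}}}{\hat\sigma}, \qquad \text{where } \hat\sigma^2 = \frac{1}{n-2}\sum_{i=1}^{n} e_i^2 = \frac{1}{n-2}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2,$$

$$t = \frac{7 \times \sqrt{10}}{\frac{1}{3}\sqrt{330}} = \frac{21}{\sqrt{33}} = 3.66.$$

$t_{\alpha/2} = 2.353$, so $t = 3.66 > t_{\alpha/2}$.

We therefore reject the null hypothesis $H_0: \beta_1 = 0$ and conclude that $\beta_1$ is significantly different from 0; the simple linear regression of y on x holds.

(9) The correlation coefficient is

$$r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2 \sum_{i=1}^{n}(y_i - \bar{y})^2}} = \frac{L_{xy}}{\sqrt{L_{xx}L_{yy}}} = \frac{70}{\sqrt{10 \times 600}} = \frac{7}{\sqrt{60}} = 0.904.$$

r is smaller than the tabulated critical value at $\alpha = 1\%$ but larger than the tabulated value at $\alpha = 5\%$, so x and y have a significant linear relationship.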
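The three summary quantities in parts (6), (8) and (9) are closely related: for simple linear regression, r squared equals the coefficient of determination, and the t statistic depends only on r and n. A small sketch, again assuming the data x = 1..5 and y = 10, 10, 20, 20, 40 from the residual table, with illustrative names:

```python
import math

x = [1, 2, 3, 4, 5]
y = [10, 10, 20, 20, 40]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

l_xx = sum((xi - x_bar) ** 2 for xi in x)
l_yy = sum((yi - y_bar) ** 2 for yi in y)
l_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

beta1 = l_xy / l_xx
beta0 = y_bar - beta1 * x_bar

# Residual sum of squares and residual standard deviation (n - 2 df).
sse = sum((yi - beta0 - beta1 * xi) ** 2 for xi, yi in zip(x, y))
sigma = math.sqrt(sse / (n - 2))

r2 = (l_yy - sse) / l_yy                   # coefficient of determination, part (6)
t_stat = beta1 * math.sqrt(l_xx) / sigma   # t statistic for H0: beta1 = 0, part (8)
r = l_xy / math.sqrt(l_xx * l_yy)          # correlation coefficient, part (9)

print(f"r^2 = {r2:.3f}, t = {t_stat:.2f}, r = {r:.3f}")
```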
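(10)
No.    x     y     ŷ     e
1      1    10     6     4
2      2    10    13    -3
3      3    20    20     0
4      4    20    27    -7
5      5    40    34     6

The residual plot is:
[Figure: residual plot]

The plot shows the residuals fluctuating randomly around e = 0, so the basic assumptions of the model are satisfied.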
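(11) When advertising expenditure is $x_0 = 4.2$ (in units of 10,000 yuan), the predicted sales revenue is $\hat{y}_0 = 28.4$ (10,000 yuan). The 95% confidence interval is approximately $\hat{y} \pm 2\hat\sigma$, that is, (17.1, 39.7).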
2.15 Solution:
(1) The scatter plot is:
[Figure: scatter plot]
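$\hat{y} = -1.022 + 0.04x_1 + 0.148x_2 + 0.015x_3 - 0.029x_4$

However, the estimated coefficients are not reasonable. The confidence intervals for $\beta_0, \beta_2, \beta_3, \beta_4$ are [-2.654, 0.61], [-0.016, 0.312], [-0.159, 0.188] and [-0.061, 0.002] respectively; each of them contains 0, which is unreasonable. In addition, the multiple correlation coefficient is R = 0.893 and $R^2 = 0.798$, so judging by the coefficient of determination the regression equation is not highly significant. Finally, the P values computed for the $\beta_j$ show that only $P_1 < 0.05$ passes the significance test. For all these reasons the estimated coefficients are considered unreasonable.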
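The confidence intervals quoted above can be reproduced from the coefficient table as B ± t × Std. Error. The sketch below assumes 20 residual degrees of freedom (as in the ANOVA table for this model) and uses 2.086 for the two-sided 5% critical value of the t distribution with 20 degrees of freedom; the names are illustrative.

```python
# (B, Std. Error) pairs copied from the SPSS coefficient table.
estimates = {
    "(Constant)": (-1.022, 0.782),
    "x1": (0.040, 0.010),
    "x2": (0.148, 0.079),
    "x3": (0.015, 0.083),
    "x4": (-0.029, 0.015),
}

T_CRIT = 2.086  # two-sided 5% critical value of t with 20 degrees of freedom


def confidence_interval(b, se, t_crit=T_CRIT):
    """Return the interval B +/- t_crit * SE(B)."""
    return (b - t_crit * se, b + t_crit * se)


intervals = {name: confidence_interval(b, se) for name, (b, se) in estimates.items()}
contains_zero = [name for name, (lo, hi) in intervals.items() if lo < 0 < hi]

for name, (lo, hi) in intervals.items():
    print(f"{name:>10}: [{lo:.3f}, {hi:.3f}]")
print("intervals containing 0:", contains_zero)
```

Small differences in the last digit relative to the SPSS table come from the three-decimal rounding of B and the standard errors.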
(3) Analyze the collinearity of the regression model.

1. Variance inflation factor method

The table below shows that $VIF_j < 10$ (j = 1, 2, 3, 4), so there is no severe multicollinearity among the predictors $x_j$. However, $VIF_j > 1$, so the $x_j$ still exhibit some multicollinearity.

Coefficients(a)
              Unstandardized       Standardized                     95% Confidence          Collinearity
              Coefficients         Coefficients                     Interval for B          Statistics
Model         B       Std. Error   Beta       t        Sig.    Lower Bound  Upper Bound   Tolerance   VIF
1 (Constant)  -1.022    .782                  -1.306   .206     -2.654        .610
  x1           .040     .010       .891        3.837   .001      .018         .062         .188      5.331
  x2           .148     .079       .260        1.879   .075     -.016         .312         .529      1.890
  x3           .015     .083       .034         .175   .863     -.159         .188         .261      3.835
  x4          -.029     .015      -.325       -1.937   .067     -.061         .002         .360      2.781
a. Dependent Variable: y
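For reference, the variance inflation factors are the diagonal elements of the inverse of the predictors' correlation matrix. The sketch below applies that identity to the pairwise correlations among x1 to x4 reported in the correlation table for this exercise. Because those correlations are rounded to three decimals, the result only approximates the SPSS values. The helper `invert` is an illustrative pure-Python routine, not an SPSS or library function.

```python
def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


# Correlations among x1..x4, copied (rounded) from the correlation table.
R_XX = [
    [1.000, 0.679, 0.848, 0.780],
    [0.679, 1.000, 0.586, 0.472],
    [0.848, 0.586, 1.000, 0.747],
    [0.780, 0.472, 0.747, 1.000],
]

inverse = invert(R_XX)
vif = [inverse[j][j] for j in range(4)]  # VIF_j is the j-th diagonal element

for j, value in enumerate(vif, start=1):
    print(f"VIF{j} = {value:.3f}")
```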
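2. Eigenvalue method

The eigenvalues and condition indices computed by SPSS are shown in the table below.

Collinearity Diagnostics(a)
                                      Condition    Variance Proportions
Model  Dimension   Eigenvalue         Index        (Constant)    x1     x2     x3     x4
1      1             4.538             1.000          .01        .00    .01    .00    .00
       2              .203             4.733          .68        .03    .02    .01    .09
       3              .157             5.378          .16        .00    .66    .01    .13
       4              .066             8.287          .00        .09    .20    .36    .72
       5              .036            11.215          .15        .87    .12    .63    .05
a. Dependent Variable: y

The usual rule of thumb is: when $0 < k < 10$ there is no multicollinearity among the predictors; when $10 \le k < 100$ there is fairly strong multicollinearity; when $k \ge 100$ there is severe multicollinearity. The largest condition index, $k_5 = 11.215$, is only slightly above 10, which indicates some multicollinearity among the predictors. This agrees with the result of the variance inflation factor method.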
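The condition indices in the table are the square roots of the ratio of the largest eigenvalue to each eigenvalue. A minimal sketch using the rounded eigenvalues printed above, together with the rule of thumb stated in the text (the function name `classify` and its labels are illustrative):

```python
import math

# Eigenvalues copied from the collinearity diagnostics table (rounded).
eigenvalues = [4.538, 0.203, 0.157, 0.066, 0.036]

largest = max(eigenvalues)
condition_indices = [math.sqrt(largest / ev) for ev in eigenvalues]


def classify(k):
    """Apply the rule of thumb from the text to a condition index k."""
    if k < 10:
        return "no multicollinearity"
    if k < 100:
        return "fairly strong multicollinearity"
    return "severe multicollinearity"


for dim, k in enumerate(condition_indices, start=1):
    print(f"dimension {dim}: condition index = {k:.3f}")
print("largest index:", classify(max(condition_indices)))
```

The recomputed indices differ from the SPSS column in the second or third decimal place because the eigenvalues are rounded to three decimals.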
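(4) Select variables by backward elimination and by stepwise regression. Are the coefficients of the resulting regression equations reasonable, and is collinearity still present?

1. Results of variable selection by backward elimination:

Coefficients(a)
                Unstandardized Coefficients   Standardized Coefficients
Model           B          Std. Error         Beta                        t        Sig.
1 (Constant)    -1.022       .782                                         -1.306   .206
  x1             .040        .010              .891                        3.837   .001
  x2             .148        .079              .260                        1.879   .075
  x3             .015        .083              .034                         .175   .863
  x4            -.029        .015             -.325                       -1.937   .067
2 (Constant)    -.972        .711                                         -1.366   .186
  x1             .041        .009              .914                        4.814   .000
  x2             .149        .077              .261                        1.938   .066
  x4            -.029        .014             -.317                       -2.006   .058
a. Dependent Variable: y

Checking the variables chosen by backward elimination again for multicollinearity, with both the variance inflation factor method and the eigenvalue method, shows that the multicollinearity has improved. However, the coefficient $\beta_4$ of x4 is negative, which is unreasonable and indicates that collinearity is still present.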
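2. Variable selection by stepwise regression

Coefficients(a)
                Unstandardized Coefficients   Standardized Coefficients
Model           B          Std. Error         Beta                        t        Sig.
1 (Constant)    -.830        .723                                         -1.147   .263
  x1             .038        .005              .844                        7.534   .000
2 (Constant)    -.443        .697                                          -.636   .531
  x1             .050        .007             1.120                        6.732   .000
  x4            -.032        .015             -.355                       -2.133   .044
a. Dependent Variable: y

Checking the variables chosen by stepwise regression again for multicollinearity, with both the variance inflation factor method and the eigenvalue method, shows that the multicollinearity has improved. However, the coefficient $\beta_4$ of x4 is negative, which is unreasonable and indicates that collinearity is still present.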
(5) Fit a ridge regression of non-performing loans on the four predictors.

Ridge regression was carried out in SPSS; the main results are as follows.

R-SQUARE AND BETA COEFFICIENTS FOR ESTIMATED VALUES OF K

   K       RSQ        X1        X2        X3        X4
.00000   .79760   .891313   .259817   .034471  -.324924
.05000   .79088   .713636   .286611   .096624  -.233765
.10000   .78005   .609886   .295901   .126776  -.174056
.15000   .76940   .541193   .297596   .143378  -.131389
.20000   .75958   .491935   .295607   .153193  -.099233
.25000   .75062   .454603   .291740   .159210  -.074110
.30000   .74237   .425131   .286912   .162925  -.053962
.35000   .73472   .401123   .281619   .165160  -.037482
.40000   .72755   .381077   .276141   .166401  -.023792
.45000   .72077   .364000   .270641   .166949  -.012279
.50000   .71433   .349209   .265211   .167001  -.002497
.55000   .70816   .336222   .259906   .166692   .005882
.60000   .70223   .324683   .254757   .166113   .013112
.65000   .69649   .314330   .249777   .165331   .019387
.70000   .69093   .304959   .244973   .164397   .024860
.75000   .68552   .296414   .240345   .163346   .029654
.80000   .68024   .288571   .235891   .162207   .033870
.85000   .67508   .281331   .231605   .161000   .037587
.90000   .67003   .274614   .227480   .159743   .040874
.95000   .66508   .268353   .223510   .158448   .043787
1.0000   .66022   .262494   .219687   .157127   .046373

[Figure: ridge trace plot]
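The standardized coefficients in the table above appear consistent with the usual ridge formula on the correlation scale, $\hat\beta(k) = (R_{xx} + kI)^{-1} r_{xy}$. The sketch below is an illustration under that assumption, not the SPSS macro itself. It uses the rounded correlations from the correlation table for this exercise, so the values agree with the table only to about two decimals. The helper `solve` is an illustrative pure-Python Gaussian elimination.

```python
def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


# Correlations among x1..x4 and between each predictor and y (rounded).
R_XX = [
    [1.000, 0.679, 0.848, 0.780],
    [0.679, 1.000, 0.586, 0.472],
    [0.848, 0.586, 1.000, 0.747],
    [0.780, 0.472, 0.747, 1.000],
]
R_XY = [0.844, 0.732, 0.700, 0.519]


def ridge_beta(k):
    """Standardized ridge coefficients (R_xx + k I)^(-1) r_xy for ridge parameter k."""
    size = len(R_XY)
    shifted = [[R_XX[i][j] + (k if i == j else 0.0) for j in range(size)]
               for i in range(size)]
    return solve(shifted, R_XY)


# A coarse ridge trace: one row of coefficients per value of k.
trace = {k: ridge_beta(k) for k in (0.0, 0.2, 0.4, 0.6, 0.8, 1.0)}
for k, beta in trace.items():
    print(f"k = {k:.1f}: " + "  ".join(f"{b: .3f}" for b in beta))
```

Plotting each column of `trace` against k gives the ridge trace; the sign change of the x4 coefficient as k grows is the instability discussed in the text.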
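The principles for selecting variables by ridge regression are:
1. Predictors whose standardized ridge coefficients are fairly stable and very small in absolute value can be dropped.
2. When k is small, a standardized ridge coefficient may not be small in absolute value yet be unstable, tending rapidly to 0 as k increases. Predictors whose ridge coefficients are unstable and oscillate toward zero in this way can also be dropped.
3. Predictors whose standardized ridge coefficients are very unstable should be dropped.

According to the ridge regression results above, the ridge coefficient of x3 is fairly stable and very small in absolute value, while the coefficient $\hat\beta_4(k)$ of x4 is not small in absolute value but is unstable and tends to 0 as k increases. By the principles above, x3 and x4 are dropped and ridge regression is carried out on the remaining two predictors. The step of the ridge parameter is changed to 0.02 and its range is reduced to 0.2. Running the program gives the results in (6).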
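(6) Carry out ridge regression again on the regression equation obtained after dropping variables in step (4).

R-SQUARE AND BETA COEFFICIENTS FOR ESTIMATED VALUES OF K

   K       RSQ        X1        X2
.00000   .75844   .643550   .294682
.02000   .75827   .627803   .299382
.04000   .75780   .613317   .303080
.06000   .75708   .599909   .305947
.08000   .75614   .587431   .308123
.10000   .75500   .575766   .309719
.12000   .75371   .564814   .310826
.14000   .75226   .554492   .311519
.16000   .75069   .544734   .311858
.18000   .74899   .535479   .311896
.20000   .74719   .526679   .311675

[Figure: ridge trace plot]

The table shows that after dropping x3 and x4 the ridge coefficients vary much less, and the ridge trace shows them essentially stable for ridge parameters between 0.1 and 0.2. Taking k = 0.1 and running ridge regression again gives the following output.

Run MATRIX procedure:

****** Ridge Regression with k = 0.1 ******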
ANOVA(c)
Model            Sum of Squares    df    Mean Square      F        Sig.
1  Regression      1.830E7          5    3660971.683     9.346    .002(a)
   Residual        3917298.522     10     391729.852
   Total           2.222E7         15
2  Regression      1.830E7          4    4575257.669    12.835    .000(b)
   Residual        3921126.262     11     356466.024
   Total           2.222E7         15
a. Predictors: (Constant), x6, x3, x2, x4, x5.
b. Predictors: (Constant), x6, x3, x2, x4.
c. Dependent Variable: y

Coefficients(a)
                Unstandardized Coefficients   Standardized Coefficients
Model           B            Std. Error       Beta                        t        Sig.
1 (Constant)    5922.827     2504.315                                      2.365   .040
  x2               4.864        2.507          .677                        1.940   .081
  x3               2.374         .842          .782                        2.818   .018
  x4            -817.901      187.279        -1.156                       -4.367   .001
  x5              14.539      147.078          .050                         .099   .923
  x6            -846.867      291.634         -.899                       -2.904   .016
2 (Constant)    6007.320     2245.481                                      2.675   .022
  x2               5.068        1.360          .706                        3.727   .003
  x3               2.308         .486          .760                        4.750   .001
  x4            -824.261      167.776        -1.165                       -4.913   .000
  x6            -862.699      232.489         -.916                       -3.711   .003
a. Dependent Variable: y

$\hat{y} = 6007.320 + 5.068x_2 + 2.308x_3 - 824.261x_4 - 862.699x_6$
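Dropping x5 between model 1 and model 2 can be checked with a partial F test built from the two residual sums of squares in the ANOVA table. For a single dropped variable, the partial F equals the square of that variable's t statistic in the larger model. A minimal sketch with illustrative names:

```python
import math

# Residual sums of squares and degrees of freedom from the ANOVA table.
sse_full, df_full = 3917298.522, 10        # model 1: x2, x3, x4, x5, x6
sse_reduced, df_reduced = 3921126.262, 11  # model 2: x5 removed

# Partial F: extra residual sum of squares per dropped parameter,
# divided by the residual mean square of the full model.
extra_ss = sse_reduced - sse_full
partial_f = (extra_ss / (df_reduced - df_full)) / (sse_full / df_full)

# With one variable dropped, sqrt(F) is |t| for that variable in model 1.
implied_t = math.sqrt(partial_f)

print(f"partial F = {partial_f:.5f}, implied |t| = {implied_t:.3f}")
```

The implied |t| of about 0.099 matches the t value reported for x5 in model 1, which is why x5 is the variable removed.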
(3) Stepwise regression

Model Summary
Model    R         R Square   Adjusted R Square   Std. Error of the Estimate
1        .498(a)   .248       .194                1092.83206
2        .697(b)   .485       .406                 937.95038
3        .811(c)   .657       .572                 796.60909
a. Predictors: (Constant), x3.
b. Predictors: (Constant), x3, x5.
c. Predictors: (Constant), x3, x5, x4.

ANOVA(d)
Model            Sum of Squares    df    Mean Square      F       Sig.
1  Regression      5502210.090      1    5502210.090     4.607   .050(a)
   Residual        1.672E7         14    1194281.918
   Total           2.222E7         15
2  Regression      1.079E7          2    5392697.554     6.130   .013(b)
   Residual        1.144E7         13     879750.910
   Total           2.222E7         15
3  Regression      1.461E7          3    4869041.506     7.673   .004(c)
   Residual        7615032.418     12     634586.035
   Total           2.222E7         15
a. Predictors: (Constant), x3.
b. Predictors: (Constant), x3, x5.
c. Predictors: (Constant), x3, x5, x4.
d. Dependent Variable: y

Coefficients(a)
                Unstandardized Coefficients   Standardized Coefficients
Model           B            Std. Error       Beta                        t        Sig.
1 (Constant)    5161.259     1142.744                                      4.517   .000
  x3               1.511         .704          .498                        2.146   .050
2 (Constant)     472.298     2150.138                                       .220   .830
  x3               3.188         .913         1.050                        3.492   .004
  x5             212.325       86.643          .737                        2.451   .029
3 (Constant)    1412.807     1865.912                                       .757   .464
  x3               3.440         .782         1.133                        4.398   .001
  x5             348.729       92.220         1.210                        3.782   .003
  x4            -415.136      169.163         -.587                       -2.454   .030
a. Dependent Variable: y

$\hat{y} = 1412.807 + 3.440x_3 + 348.729x_5 - 415.136x_4$
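The adjusted R Square column in the model summary follows from R Square, the sample size and the number of predictors. The sketch below assumes n = 16 observations (total degrees of freedom 15 in the ANOVA table) and uses the rounded R Square values, so the last digit can differ from the SPSS output. The function name is illustrative.

```python
def adjusted_r_squared(r_squared, n, p):
    """Adjusted R^2 for a model with p predictors fitted to n observations."""
    return 1.0 - (1.0 - r_squared) * (n - 1) / (n - p - 1)


N_OBS = 16  # total df of 15 in the ANOVA table implies 16 observations

# (R Square, number of predictors) for the three stepwise models.
models = {1: (0.248, 1), 2: (0.485, 2), 3: (0.657, 3)}

adjusted = {m: adjusted_r_squared(r2, N_OBS, p) for m, (r2, p) in models.items()}

for m, value in adjusted.items():
    print(f"model {m}: adjusted R^2 = {value:.3f}")
```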
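(4) The two methods give different models. Backward elimination removes x5 and keeps x6, x3, x2 and x4 in the final model, whereas stepwise regression introduces x3, x5 and x4. This shows that the two methods rank the importance of the predictors differently, which is related to the correlations among the predictors. By comparison, backward elimination starts from the full model, so every variable has a chance to show its effect, and its result is more convincing.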
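Chapter 6  Multicollinearity and Its Treatment

6.6 For the fiscal revenue data, analyze the multicollinearity of the data and drop variables on the basis of the multicollinearity. Compare the result with the variable selection obtained by stepwise regression.

Answer: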
First, the variance inflation factor method is used in SPSS to diagnose multicollinearity in the fiscal revenue data. The results are as follows.

Coefficients(a)
                Unstandardized Coefficients   Standardized Coefficients                     Collinearity Statistics
Model           B            Std. Error       Beta                        t        Sig.     Tolerance    VIF
1 (Constant)    1348.338     2211.463                                      .610    .552
  x1              -.641         .167         -1.125                       -3.840   .002      .003        319.484
  x2              -.317         .204         -1.306                       -1.551   .143      .000       2636.564
  x3              -.413         .548          -.270                        -.752   .464      .002        479.288
  x4              -.002         .024          -.007                        -.087   .932      .037         27.177
  x5               .671         .128          3.706                        5.241   .000      .001       1860.726
  x6              -.008         .008          -.020                        -.928   .369      .574          1.743
a. Dependent Variable: y

The output shows that the variance inflation factors of x2 and x5 are very large, $VIF_2 = 2636$ and $VIF_5 = 1860$, far above 10. The fiscal revenue data therefore exhibit severe multicollinearity.
Next, the eigenvalue method is used in SPSS to diagnose multicollinearity in the fiscal revenue data. The results are as follows.

Collinearity Diagnostics(a)
                                      Condition    Variance Proportions
Model  Dimension   Eigenvalue         Index        (Constant)   x1     x2     x3     x4     x5     x6
1      1            6.127               1.000         .00       .00    .00    .00    .00    .00    .00
       2             .857               2.673         .00       .00    .00    .00    .00    .00    .00
       3             .011              23.954         .01       .00    .00    .00    .00    .00    .81
       4             .004              38.000         .01       .16    .00    .07    .00    .00    .00
       5             .001              98.485         .02       .11    .08    .78    .02    .07    .03
       6             .000             119.124         .11       .55    .04    .01    .13    .20    .09
       7            7.352E-5          288.677         .85       .18    .88    .14    .85    .72    .07
a. Dependent Variable: y

The largest condition index is $k_7 = 288.677$, which indicates severe multicollinearity among the predictors. This agrees with the result of the variance inflation factor method.
First drop x2, which has the largest variance inflation factor, and refit the regression. The results below show that severe multicollinearity is still present among the predictors.

Coefficients(a)
                Unstandardized Coefficients   Standardized Coefficients                     Collinearity Statistics
Model           B            Std. Error       Beta                        t        Sig.     Tolerance    VIF
1 (Constant)    -1252.832    1.508E3                                       -.831   .419
  x1               -.735        .163         -1.291                       -4.524   .000      .004        276.969
  x3               -.923        .459          -.604                       -2.012   .063      .003        306.617
  x4                .026        .017           .093                        1.591   .132      .086         11.605
  x5                .510        .078          2.815                        6.527   .000      .002        632.896
  x6               -.011        .008          -.028                       -1.274   .222      .608          1.645
a. Dependent Variable: y
Next drop x5, which now has the largest variance inflation factor, and refit the regression. The results below show that severe multicollinearity is still present among the predictors.

Coefficients(a)
                Unstandardized Coefficients   Standardized Coefficients                     Collinearity Statistics
Model           B            Std. Error       Beta                        t        Sig.     Tolerance    VIF
1 (Constant)    -2715.046    2829.351                                      -.960   .352
  x1               -.047        .235          -.083                        -.202   .843      .006        160.513
  x3               1.463        .526           .957                        2.781   .013      .009        111.949
  x4                .036        .031           .128                        1.160   .263      .087         11.507
  x6                .003        .015           .008                         .206   .839      .649          1.540
a. Dependent Variable: y
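Next drop x1, which now has the largest variance inflation factor, and refit the regression. The results below show that the multicollinearity problem has now been eliminated. However, the P value of x6 is 0.801, so x6 does not contribute significantly to the regression equation for fiscal revenue.

Coefficients(a)
                Unstandardized Coefficients   Standardized Coefficients                     Collinearity Statistics
Model           B            Std. Error       Beta                        t        Sig.     Tolerance    VIF
1 (Constant)    -2296.322    1.870E3                                      -1.228   .236
  x3               1.359        .097           .889                       14.036   .000      .249        4.018
  x4                .031        .019           .111                        1.649   .117      .222        4.509
  x6                .004        .014           .010                         .256   .801      .673        1.485
a. Dependent Variable: y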
Drop the non-significant x6 and keep only x3 and x4 in the regression.

Coefficients(a)
                Unstandardized Coefficients   Standardized Coefficients                     Collinearity Statistics
Model           B            Std. Error       Beta                        t        Sig.     Tolerance    VIF
1 (Constant)    -2306.802    1820.091                                     -1.267   .221
  x3               1.359        .094           .889                       14.415   .000      .249        4.018
  x4                .033        .018           .116                        1.886   .076      .249        4.018
a. Dependent Variable: y

The regression equation is $\hat{y} = -2306.8 + 1.359x_3 + 0.033x_4$. However, the P value of x4 is 0.076 > 0.05, so x4 is only weakly significant for y.
The variable selection obtained by stepwise regression is shown below. Stepwise regression keeps x5, x1 and x2, which are exactly the three variables dropped by the variance inflation factor method. Dropping variables on the basis of collinearity can therefore give results quite different from ordinary stepwise regression, which selects variables by the significance of their t values.

Coefficients(a)
                Unstandardized Coefficients   Standardized Coefficients                     Collinearity Statistics
Model           B            Std. Error       Beta                        t        Sig.     Tolerance    VIF
1 (Constant)     710.370       90.891                                      7.816   .000
  x5                .180         .004          .994                       40.736   .000     1.000          1.000
2 (Constant)    1011.913      136.899                                      7.392   .000
  x5                .311         .049         1.718                        6.374   .000      .006        162.146
  x1               -.414         .154         -.726                       -2.694   .015      .006        162.146
3 (Constant)     874.600      106.866                                      8.184   .000
  x5                .637         .089         3.516                        7.143   .000      .001        989.833
  x1               -.611         .124        -1.073                       -4.936   .000      .005        192.871
  x2               -.353         .088        -1.454                       -3.994   .001      .002        541.459
a. Dependent Variable: y
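7.7 A large commercial bank has many branches. In recent years its lending has grown steadily, but the share of non-performing loans has also risen considerably. To understand what causes non-performing loans, the bank wants to analyze its business data quantitatively and find ways to control them. The table below gives the 2002 business data for the bank's 25 branches.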
Initial data:

Branch no.     y       x1      x2     x3      x4
 1            0.9     67.3     6.8     5     51.9
 2            1.1    111.3    19.8    16     90.9
 3            4.8    173       7.7    17     73.7
 4            3.2     80.8     7.2    10     14.5
 5            7.8    199.7    16.5    19     63.2
 6            2.7     16.2     2.2     1      2.2
 7            1.6    107.4    10.7    17     20.2
 8           12.5    185.4    27.1    18     43.8
 9            1       96.1     1.7    10     55.9
10            2.6     72.8     9.1    14     64.3
11            0.3     64.2     2.1    11     42.7
12            4      132.2    11.2    23     76.7
13            0.8     58.6     6      14     22.8
14            3.5    174.6    12.7    26    117.1
15           10.2    263.5    15.6    34    146.7
16            3       79.3     8.9    15     29.9
17            0.2     14.8     0.6     2     42.1
18            0.4     73.5     5.9    11     25.3
19            1       24.7     5       4     13.4
20            6.8    139.4     7.2    28     64.3
21           11.6    368.2    16.8    32    163.9
22            1.6     95.7     3.8    10     44.5
23            1.2    109.6    10.3    14     67.9
24            7.2    196.2    15.8    16     39.7
25            3.2    102.2    12      10     97.1
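(1) Compute the simple correlation coefficients between y and the other four variables.

Correlations
                              y       x1      x2      x3      x4
Pearson Correlation    y    1.000    .844    .732    .700    .519
                       x1    .844   1.000    .679    .848    .780
                       x2    .732    .679   1.000    .586    .472
                       x3    .700    .848    .586   1.000    .747
                       x4    .519    .780    .472    .747   1.000
Sig. (1-tailed)        y      .      .000    .000    .000    .004
                       x1    .000     .      .000    .000    .000
                       x2    .000    .000     .      .001    .009
                       x3    .000    .000    .001     .      .000
                       x4    .004    .000    .009    .000     .
N                      y      25      25      25      25      25
                       x1     25      25      25      25      25
                       x2     25      25      25      25      25
                       x3     25      25      25      25      25
                       x4     25      25      25      25      25

The correlation matrix shows that the correlations of y with x1, x2 and x3 are all 0.7 or above, so the chosen predictors are correlated with y to some degree, although the correlations are not especially strong.

(2) Fit a linear regression of non-performing loans y on the four variables. Are the estimated regression coefficients reasonable?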
Model Summary(b)
Model    R         R Square   Adjusted R Square   Std. Error of the Estimate   Durbin-Watson
1        .893(a)   .798       .757                1.7788                       2.626
a. Predictors: (Constant), x4, x2, x3, x1
b. Dependent Variable: y

ANOVA(b)
Model            Sum of Squares    df    Mean Square      F        Sig.
1  Regression      249.371          4      62.343       19.704    .000(a)
   Residual         63.279         20       3.164
   Total           312.650         24
a. Predictors: (Constant), x4, x2, x3, x1
b. Dependent Variable: y

Coefficients(a)
                Unstandardized Coefficients   Standardized Coefficients                     95% Confidence Interval for B
Model           B          Std. Error         Beta                        t        Sig.     Lower Bound    Upper Bound
1 (Constant)    -1.022       .782                                         -1.306   .206      -2.654          .610
  x1             .040        .010              .891                        3.837   .001       .018           .062
  x2             .148        .079              .260                        1.879   .075      -.016           .312
  x3             .015        .083              .034                         .175   .863      -.159           .188
  x4            -.029        .015             -.325                       -1.937   .067      -.061           .002
a. Dependent Variable: y

The regression equation is $\hat{y} = -1.022 + 0.040x_1 + 0.148x_2 + 0.015x_3 - 0.029x_4$.