SAS output produced by myreview2.sas

*************************************************************
(1) CH06PR05.DAT : 6.5, 6.6, 6.8, 7.3, 7.12, 7.24, 7.30, 10.9

***6.5(a)
Plots come directly from myreview2.sas. Correlation:

Pearson Correlation Coefficients, N = 16

            Y         X1        X2
Y       1.00000   0.89239   0.39458
X1      0.89239   1.00000   0.00000
X2      0.39458   0.00000   1.00000

***6.5(b)(c)(d)

                    Parameter Estimates

                 Parameter    Standard
Variable    DF    Estimate       Error   t Value   Pr > |t|
Intercept    1    37.65000     2.99610     12.57     <.0001
X1           1     4.42500     0.30112     14.70     <.0001
X2           1     4.37500     0.67332      6.50     <.0001

*** Yhat = 37.65 + 4.425*X1 + 4.375*X2
*** Interpretation of b1: holding X2 fixed, a 1-unit increase in X1
    increases the estimated mean of Y by 4.425 units.
*** Plots directly from myreview2.sas: the residual plots look acceptable,
    and the residuals appear approximately normal.

***6.6(a)(b)
Please also read my512.review2.sol.pdf.

                      Analysis of Variance

                          Sum of          Mean
Source            DF     Squares        Square   F Value   Pr > F
Model              2  1872.70000     936.35000    129.08   <.0001
Error             13    94.30000       7.25385
Corrected Total   15  1967.00000

Root MSE          2.69330    R-Square   0.9521
Dependent Mean   81.75000    Adj R-Sq   0.9447
Coeff Var         3.29455

***6.6(c)
For a 99% family confidence level with 2 intervals, the Bonferroni
correction gives each interval confidence level 1-(1-0.99)/2 = 0.995.
Hence the critical value is the 1-(1-0.995)/2 = 0.9975 quantile of a
t distribution with n-(# of predictors)-1 = 16-2-1 = 13 degrees of freedom.
tinv(0.9975,13) = 3.372.

Based on partial SAS output from 6.5:

X1           1     4.42500     0.30112     14.70     <.0001
X2           1     4.37500     0.67332      6.50     <.0001

b1 = 4.425, SE(b1) = 0.301, b2 = 4.375, SE(b2) = 0.673
The Bonferroni confidence intervals are:
b1: 4.425 +/- 3.372*0.301 = (3.410, 5.440)
b2: 4.375 +/- 3.372*0.673 = (2.106, 6.644)

***6.8

       Dependent   Predicted      Std Error
Obs     Variable       Value   Mean Predict      99% CL Mean         99% CL Predict     Residual
 17            .     77.2750         1.1267   73.8811   80.6689   68.4808   86.0692
---
Yhat = 77.275
99% confidence interval for the mean response: (73.8811, 80.6689)
99% prediction interval: (68.4808, 86.0692)

***7.3
From Type I SS:

                 Parameter    Standard
Variable    DF    Estimate       Error   t Value   Pr > |t|    Type I SS
Intercept    1    37.65000     2.99610     12.57     <.0001       106929
X1           1     4.42500     0.30112     14.70     <.0001   1566.45000
X2           1     4.37500     0.67332      6.50     <.0001    306.25000

***7.3(a)
SSM(X1) = 1566.45, SSM(X2|X1) = 306.25, SSE(X1 X2) = 94.30, SST(X1 X2) = 1967

***7.3(b)
This is equivalent to the t-test:

X2           1     4.37500     0.67332      6.50     <.0001

If we use the F-test, then H0: beta2 = 0; Ha: beta2 not 0.
F = (SSM(X2|X1)/1) / (SSE(X1 X2)/(16-2-1)) = 306.25/(94.30/13) = 42.219
(up to rounding, this is t^2 = 6.50^2).
p-value = Pr(F(1,13) > 42.219) < 0.0001 < alpha = 0.05. Reject H0.
Given X1 in the model, we still need X2.

***7.12
R^2_Y1 = corr(Y,X1)^2 = 0.892^2 = 0.796
  This equals the R^2 from 'model Y = X1'.
R^2_Y2 = corr(Y,X2)^2 = 0.395^2 = 0.156
  This equals the R^2 from 'model Y = X2'.
R^2_12 = corr(X1,X2)^2 = 0^2 = 0
  This equals the R^2 from 'model X1 = X2'.
Current 'model Y = X1 X2': R^2 = 0.9521

From 'pcorr1' and 'pcorr2', partial SAS output:

                     Squared
                     Partial
Variable    DF  Corr Type II
Intercept    1             .
X1           1       0.94322
X2           1       0.76457

R^2_Y1|2 = 0.94322
R^2_Y2|1 = 0.76457

***7.24
Because X1 and X2 are uncorrelated, corr(X1,X2) = 0, the extra sum of
squares and the direct sum of squares are the same:
SSM(X1) = SSM(X1|X2) = 1566.45. This is an ideal scenario.

***7.30
corr(residual(Y=X2), residual(X1=X2)) = 0.97119
Squared partial correlation R^2_Y1|2 = 0.94322, and sqrt(0.94322) = 0.97119.

***10.9
The dataset is fine. There are no outliers and no apparent influential points.

*************************************************************
(2) CH08PR24.DAT : 8.24

***8.24

Test sameline Results for Dependent Variable Y

                       Mean
Source          DF     Square    F Value   Pr > F
Numerator        2  283.07298      18.68   <.0001
Denominator     60   15.15174

The pdf solution file my512.review2.sol.pdf contains a typo here.
From the output above, SSM(X2,X1*X2|X1) = 2*283.07298 = 566.15.
F = (566.15/2)/(909.105/60) = 18.68.
p-value = Pr(F(2,60) > 18.68) < 0.0001 < alpha = 0.05.
Reject H0: beta2 = beta12 = 0. The two regression lines differ in
intercept, in slope, or in both.

*************************************************************
(3) CH09PR10.DAT : 9.10, 9.11, 9.18

***9.10
From proc univariate, the data look acceptable. Some variables show minor
non-normality, but no transformation is needed.

Pearson Correlation Coefficients, N = 25

            Y         X1        X2        X3        X4
Y       1.00000   0.51441   0.49701   0.89706   0.86939
X1      0.51441   1.00000   0.10227   0.18077   0.32666
X2      0.49701   0.10227   1.00000   0.51904   0.39671
X3      0.89706   0.18077   0.51904   1.00000   0.78204
X4      0.86939   0.32666   0.39671   0.78204   1.00000

Create the scatter plot matrix from the menu bar.
corr(X3,X4) = 0.78204, which may cause a multicollinearity problem.

                    Parameter Estimates

                 Parameter    Standard
Variable    DF    Estimate       Error   t Value   Pr > |t|
Intercept    1  -124.38182     9.94106    -12.51     <.0001
X1           1     0.29573     0.04397      6.73     <.0001
X2           1     0.04829     0.05662      0.85     0.4038
X3           1     1.30601     0.16409      7.96     <.0001
X4           1     0.51982     0.13194      3.94     0.0008

We may need to drop some variables.

***9.11
Cp and AIC will help. (X1 X3 X4) and (X1 X2 X3 X4) are two good models.
Both have Cp <= (# of predictors + 1) and small AIC values.
Since X2 has p-value 0.4038 > 0.05, we will use (X1 X3 X4).

***9.18
For this dataset, forward, backward, and stepwise selection, as well as
the information criteria, all point to one model: (X1 X3 X4).
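
The hand calculations in 6.6, 6.6(c), 7.3(b), 7.12, 7.30 and 8.24 can be re-derived from the sums of squares printed in the SAS output. The following is a minimal Python sketch of that check. It is not part of myreview2.sas, and all variable and function names are illustrative assumptions. Every number is copied from the output above. The critical value 3.372 is taken as given from tinv(0.9975,13) and is not recomputed. The partial R^2 lines rely on corr(X1,X2) = 0, as noted in 7.24.

```python
import math

# Values copied from the SAS output for CH06PR05.DAT (n = 16, two predictors).
n, p = 16, 2
ssm_x1 = 1566.45           # Type I SS for X1
ssm_x2_given_x1 = 306.25   # Type I SS for X2 given X1
sse = 94.30                # SSE(X1 X2)
sst = 1967.00              # corrected total SS
df_error = n - p - 1       # 13 error degrees of freedom

# 6.6: overall F statistic and R-Square.
mse = sse / df_error
ssm = ssm_x1 + ssm_x2_given_x1
f_model = (ssm / p) / mse      # SAS reports 129.08
r_square = ssm / sst           # SAS reports 0.9521

# 6.6(c): Bonferroni intervals, using the critical value quoted in the notes.
T_CRIT = 3.372  # assumed from tinv(0.9975, 13); not recomputed here


def bonferroni_ci(estimate, std_error, crit=T_CRIT):
    """Return (lower, upper) for estimate +/- crit * std_error."""
    half_width = crit * std_error
    return (estimate - half_width, estimate + half_width)


ci_b1 = bonferroni_ci(4.425, 0.301)
ci_b2 = bonferroni_ci(4.375, 0.673)

# 7.3(b): partial F test for X2 given X1 (1 numerator degree of freedom).
f_partial = (ssm_x2_given_x1 / 1) / mse   # notes report 42.219

# 7.12: squared partial correlations. Because corr(X1, X2) = 0,
# SSM(X1|X2) = SSM(X1) and SSE(X2) = SST - SSM(X2|X1), and likewise for X2.
r2_y1_given_2 = ssm_x1 / (sst - ssm_x2_given_x1)   # SAS reports 0.94322
r2_y2_given_1 = ssm_x2_given_x1 / (sst - ssm_x1)   # SAS reports 0.76457

# 7.30: partial correlation is the square root of the squared value.
r_y1_given_2 = math.sqrt(r2_y1_given_2)            # notes report 0.97119

# 8.24: "sameline" F statistic from the two mean squares in the test output.
f_sameline = 283.07298 / 15.15174                  # SAS reports 18.68

if __name__ == "__main__":
    print(f"6.6    F = {f_model:.2f}, R-Square = {r_square:.4f}")
    print(f"6.6(c) b1: ({ci_b1[0]:.3f}, {ci_b1[1]:.3f})")
    print(f"6.6(c) b2: ({ci_b2[0]:.3f}, {ci_b2[1]:.3f})")
    print(f"7.3(b) F = {f_partial:.3f}")
    print(f"7.12   R^2_Y1|2 = {r2_y1_given_2:.5f}, R^2_Y2|1 = {r2_y2_given_1:.5f}")
    print(f"7.30   r_Y1|2 = {r_y1_given_2:.5f}")
    print(f"8.24   F = {f_sameline:.2f}")
```

Working from the sums of squares rather than from the rounded t values keeps the check independent of rounding in the printed table. For example, the partial F in 7.3(b) comes out slightly below 6.50^2 = 42.25 only because 6.50 is a rounded t value.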