######## Solution ######## problem 1 ######### (a) The random effect model is Y_{ij}=mu+beta_i+alpha_j+epsilon_{ij}, i=A,B,C,D,E, j=g1,g2,g3,g4,g5,g6,g7 where mu and beta_i(for method) are fixed effect, alpha_j(for girder) is random effect, and epsilon_{ij} is error term. The estimates are mu=0.9389, beta_A=0, beta_B=-0.0200, beta_c=-0.0451, beta_D=-0.04329, beta_D=-0.0623. The estimate of std of grider 0.0410 std of error 0.0442 (i) The hypothesis is H_0: beta_A=...=beta_E=0 versus H_A: not all equal The F-value for methed is 2.0821 with df 4 and 24. The p-value is 0.1147. The LR p-value for LR test is 0.0799. Bootstrap p-value is 0.114 (your answer should be close to this number). Thus, H_0 is accepted. There is no significant different among those levels. (It is not necessary to use bootstrap method to compute the p-value). (ii) The test for girder effect is H_0: sigma_g=0 (sigma_g is the std of girder) versus H_A: sigma_g>0. The p-value for likelihood-ratio statistic is 0.0015 (ML method) and 0.0033(REML meth0d). Bootstrap p-value is 0.002 (must be included). Thus, H_0 is reject. Thus, there is a significant girder effect. (b) The p-values based on the F-table for the model with the main fxied effects of method and girder are 0.1146 for method and 0.0013 for girder. Thus, method effect is not significant, but girder is. (c) The two methods are consistent. ########## Problem 2 ########## (a) Omitted. (b) The p-value of treat is 0.0449. It is significant. The p-value of block is 0.2145. It is insignificant. (c) The first treatment has the largest value (16.25). The second is the next (10.00). The third is the lowest (-26.25) (d) The method did not converge. It provides a negative F-value, which is signficant different from the previous one. (e) It is also different, which is still caused by convergence problem. (f) I use the likelihood ratio test. The p-value is 0.061. ########## # Problem 3 ########## (a) Omitted. (b) manufact has 2 levels. machine has 6 levels, speed has 2 levels. manufact level A only appears in machine m1, m2, and m3. manufact level B only appears in machine m4, m5, and m6. They cannot be both estimated. (c) I fit the model. If the same machine were test at the same speed, it should be the residual SD, which was 9.215. If different machines were tested at the same speed, then it should be the combination of machine and residual. The SD is sqrt(49.92+72.68+84.92)=14.45. (d) I used the bootstrap method for the p-value. It was 0.005, which means that the interaction effect is significant. (e) Variability between machine is SD=7.065. (f) In the nested model, variable between machine SD=12.19 and variability between manufact is SD=6.755. ########## Output ########## Problem 1 ######### > library(nls) Warning message: package 'nls' has been merged into 'stats' > library(lattice) > library(nlme) > > girder <- read.table("c:\\tonglinzhang\\stat526\\dataset\\Girder.dat",h=T) > girder girder method strength 1 g1 A 0.878 2 g2 A 0.930 3 g3 A 0.916 4 g4 A 0.963 5 g5 A 0.929 6 g6 A 0.990 7 g7 A 0.966 8 g1 B 0.976 9 g2 B 0.958 10 g3 B 0.907 11 g4 B 0.872 12 g5 B 0.950 13 g6 B 0.925 14 g7 B 0.844 15 g1 C 0.907 16 g2 C 0.918 17 g3 C 0.861 18 g4 C 0.786 19 g5 C 0.962 20 g6 C 0.969 21 g7 C 0.867 22 g1 D 0.932 23 g2 D 0.910 24 g3 D 0.870 25 g4 D 0.838 26 g5 D 0.973 27 g6 D 0.918 28 g7 D 0.828 29 g1 E 0.949 30 g2 E 0.916 31 g3 E 0.821 32 g4 E 0.737 33 g5 E 0.975 34 g6 E 0.927 35 g7 E 0.811 > > ## > # (a) > ## > > mod.reml <- lme(strength~method,random=~1|girder,data=girder) > summary(mod.reml) Linear mixed-effects model fit by REML Data: girder AIC BIC logLik -68.29413 -58.48575 41.14706 Random effects: Formula: ~1 | girder (Intercept) Residual StdDev: 0.04101475 0.04417325 Fixed effects: strength ~ method Value Std.Error DF t-value p-value (Intercept) 0.9388571 0.02278310 24 41.20850 0.0000 methodB -0.0200000 0.02361160 24 -0.84704 0.4053 methodC -0.0431429 0.02361160 24 -1.82719 0.0801 methodD -0.0432857 0.02361160 24 -1.83324 0.0792 methodE -0.0622857 0.02361160 24 -2.63793 0.0144 Correlation: (Intr) methdB methdC methdD methodB -0.518 methodC -0.518 0.500 methodD -0.518 0.500 0.500 methodE -0.518 0.500 0.500 0.500 Standardized Within-Group Residuals: Min Q1 Med Q3 Max -1.94844438 -0.60009384 0.03591486 0.49822467 1.75774161 Number of Observations: 35 Number of Groups: 7 > > # F-method > > amod <- lme(strength~method,random=~1|girder,data=girder,method="ML") > nmod <- lme(strength~1,random=~1|girder,data=girder,method="ML") > anova(amod,nmod) Model df AIC BIC logLik Test L.Ratio p-value amod 1 7 -98.75623 -87.86880 56.37812 nmod 2 3 -98.41527 -93.74922 52.20763 1 vs 2 8.340967 0.0799 > > # likelihood method > > 2*(logLik(amod)-logLik(nmod)) [1] 8.340967 attr(,"nall") [1] 35 attr(,"nobs") [1] 35 attr(,"df") [1] 7 attr(,"class") [1] "logLik" > > # bootstrap p-value > > lrstat <- numeric(1000) > sigmaa <- 0.03641 > sigma <- 0.04746 > mu <- 0.9051 > > random <- rep(rnorm(7,0,sigmaa),5) > error <- rnorm(35,0,sigma) > rstrength <- mu+random+error > rstrength [1] 0.9544562 0.8464705 1.0155808 0.9106024 0.9563436 0.9655796 0.7917269 0.8607714 0.8238111 0.9941654 0.8533801 0.8751194 [13] 0.9210014 0.8625383 0.8496021 0.8548478 1.0807228 0.9077487 0.8148547 0.9488718 0.9047379 0.9215406 0.7225758 0.9304280 [25] 0.8508462 0.9787897 0.9277308 0.8660172 0.8693693 0.7420250 0.9785317 0.8406518 0.8979515 1.0158236 0.8934263 > > for(i in 1:1000){ + random <- rep(rnorm(7,0,sigmaa),5) + error <- rnorm(35,0,sigma) + rstrength <- mu+random+error + amodr <- lme(rstrength~method,random=~1|girder,data=girder,method="ML") + nmodr <- lme(rstrength~1,random=~1|girder,data=girder,method="ML") + lrstat[i] <- 2*(logLik(amodr)-logLik(nmodr)) + } > > mean(lrstat>8.3401) [1] 0.114 > ### test for random effect > > options(contrasts=c("contr.sum", "contr.poly")) > > amod <- lme(strength~method,random=~1|girder,data=girder) > nmod <- lm(strength~method,data=girder) > anova(amod,nmod) Model df AIC BIC logLik Test L.Ratio p-value amod 1 7 -65.07525 -55.26687 39.53763 nmod 2 6 -58.44215 -50.03496 35.22107 1 vs 2 8.633107 0.0033 > > # likelihood method > 2*(logLik(amod)-logLik(nmod)) [1] -23.60902 attr(,"nall") [1] 35 attr(,"nobs") [1] 30 attr(,"df") [1] 7 attr(,"class") [1] "logLik" > 1-pchisq(2*(logLik(amod)-logLik(nmod)),1) [1] 1 attr(,"nall") [1] 35 attr(,"nobs") [1] 30 attr(,"df") [1] 7 attr(,"class") [1] "logLik" > > ## > amod <- lme(strength~method,random=~1|girder,data=girder,method="ML") > nmod <- lm(strength~method,data=girder) > anova(amod,nmod) Model df AIC BIC logLik Test L.Ratio p-value amod 1 7 -98.75623 -87.86880 56.37812 nmod 2 6 -90.68428 -81.35219 51.34214 1 vs 2 10.07196 0.0015 > > # likelihood method > 2*(logLik(amod)-logLik(nmod)) [1] 10.07196 attr(,"nall") [1] 35 attr(,"nobs") [1] 35 attr(,"df") [1] 7 attr(,"class") [1] "logLik" > 1-pchisq(2*(logLik(amod)-logLik(nmod)),1) [1] 0.001505429 attr(,"nall") [1] 35 attr(,"nobs") [1] 35 attr(,"df") [1] 7 attr(,"class") [1] "logLik" > ### > > mu <- nmod$coeff[1] > mua <- c(nmod$coeff[-1],-sum(nmod$coeff[-1])) > mua method1 method2 method3 method4 0.033742857 0.013742857 -0.009400000 -0.009542857 -0.028542857 > sigma <- 0.06028 > > fixed <- rep(mua,rep(7,5)) > error <- rnorm(35,0,sigma) > rstrength <- mu+fixed+error > > lrstat <- rep(0,1000) > for(i in 1:1000){ + fixed <- rep(mua,rep(7,5)) + error <- rnorm(35,0,sigma) + rstrength <- mu+fixed+error + amodr <- lme(rstrength~method,random=~1|girder,data=girder,method="ML") + nmodr <- lm(rstrength~method,data=girder) + lrstat[i] <- 2*(logLik(amodr)-logLik(nmodr)) + } > mean(lrstat>10.07196) [1] 0.002 ####### Part b) ######## > g <- lm(strength~method+girder,data=girder) > anova(g) Analysis of Variance Table Response: strength Df Sum Sq Mean Sq F value Pr(>F) method 4 0.016251 0.004063 2.0821 0.114657 girder 6 0.062174 0.010362 5.3105 0.001313 ** Residuals 24 0.046831 0.001951 --- Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1 ################ # Problem 2. ################ > > eggprod <- read.table("u:\\stat526\\dataset\\ascdata\\eggprod.txt",h=T) > eggprod treat block eggs 1 O 1 330 2 O 2 288 3 O 3 295 4 O 4 313 5 E 1 372 6 E 2 340 7 E 3 343 8 E 4 341 9 F 1 359 10 F 2 337 11 F 3 373 12 F 4 302 > > # (a) Omitted > > # (b) > > lmod <- aov(eggs~treat+factor(block),data=eggprod) > summary(lmod) Df Sum Sq Mean Sq F value Pr(>F) treat 2 4212 2106.2 5.444 0.0449 * factor(block) 3 2330 776.8 2.008 0.2145 Residuals 6 2322 386.9 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 > > # (c) > > library(lattice) > library(nlme) > > options(contrasts=c("contr.sum", "contr.sum")) > g <- lme(eggs~treat,random=~1|factor(block),data=eggprod,method="ML") > summary(g) Linear mixed-effects model fit by maximum likelihood Data: eggprod AIC BIC logLik 114.8885 117.313 -52.44424 Random effects: Formula: ~1 | factor(block) (Intercept) Residual StdDev: 9.872099 17.03489 Fixed effects: eggs ~ treat Value Std.Error DF t-value p-value (Intercept) 332.75 8.045444 6 41.35881 0.0000 treat1 16.25 8.030324 6 2.02358 0.0895 treat2 10.00 8.030324 6 1.24528 0.2595 Correlation: (Intr) treat1 treat1 0.0 treat2 0.0 -0.5 Standardized Within-Group Residuals: Min Q1 Med Q3 Max -1.97722972 -0.54794104 -0.03285658 0.74126662 1.65055481 Number of Observations: 12 Number of Groups: 4 > > # (d) > > library(pbkrtest) > amod <- lmer(eggs~treat+(1|block),data=eggprod,REML=FALSE) > nmod <- lmer(eggs~1+(1|block),data=eggprod,REML=FALSE) > result <- KRmodcomp(amod,nmod) Warning message: In deviance.merMod(object, ...) : deviance() is deprecated for REML fits; use REMLcrit for the REML criterion or deviance(.,REML=FALSE) for deviance calculated at the REML fit > result F-test with Kenward-Roger approximation; computing time: 0.04 sec. large : eggs ~ treat + (1 | block) small : eggs ~ 1 + (1 | block) stat ndf ddf F.scaling p.value Ftest -5.2377 2.0000 1.9210 1 1 > result$stats$Fstat [1] -5.237667 > > # (e) > > run <- 1000 > lstat <- rep(0,1000) > for(i in 1:1000){ + y <- unlist(simulate(nmod,use.u=TRUE)) + mod.n <- lmer(y~1+(1|block),data=eggprod,REML=FALSE) + mod.a <- lmer(y~treat+(1|block),data=eggprod,REML=FALSE) + result.sim <- KRmodcomp(mod.a,mod.n) + lstat[i] <- result.sim$stats$Fstat + } There were 50 or more warnings (use warnings() to see the first 50) > mean(lstat>result$stats$Fstat) [1] 0.954 > > # (f) > > g.a <- lme(eggs~treat,random=~1|factor(block),data=eggprod,method="ML") > g.n <- lme(eggs~1,random=~1|factor(block),data=eggprod,method="ML") > 2*(logLik(g.a) - logLik(g.n)) 'log Lik.' 8.424536 (df=5) > > summary(g.n) Linear mixed-effects model fit by maximum likelihood Data: eggprod AIC BIC logLik 119.313 120.7677 -56.65651 Random effects: Formula: ~1 | factor(block) (Intercept) Residual StdDev: 0.001605146 27.17881 Fixed effects: eggs ~ 1 Value Std.Error DF t-value p-value (Intercept) 332.75 8.194718 8 40.60542 0 Standardized Within-Group Residuals: Min Q1 Med Q3 Max -1.6465035 -0.8278509 0.2115619 0.5243056 1.4809333 Number of Observations: 12 Number of Groups: 4 > > mu <- rep(332.75,12) > sigmaran <- 0.001605 > sigma <- 27.17 > > lrstat <- rep(0,1000) > for(i in 1:1000){ + random <- rep(rnorm(4,0,sigmaran),3) + error <- rnorm(12,0,sigma) + y <- mu+random+error + gnull <- lme(y~1,random=~1|block,data=eggprod,method="ML") + galt <- lme(y~treat,random=~1|block,data=eggprod,method="ML") + lrstat[i] <- 2*(logLik(galt) - logLik(gnull)) + } > mean(lrstat>8.42) [1] 0.061 > ########### > # Problem 3. > ########### > > lawn <- read.table("u:\\stat526\\dataset\\ascdata\\lawn.txt",h=T) > lawn manufact machine speed time 1 A m1 L 211 2 A m1 H 278 3 B m4 L 205 4 B m4 H 247 5 A m1 L 230 6 A m1 H 278 7 B m4 L 217 8 B m4 H 251 9 A m2 L 184 10 A m2 H 249 11 B m5 L 169 12 B m5 H 239 13 A m2 L 188 14 A m2 H 272 15 B m5 L 168 16 B m5 H 252 17 A m3 L 216 18 A m3 H 275 19 B m6 L 200 20 B m6 H 261 21 A m3 L 232 22 A m3 H 271 23 B m6 L 187 24 B m6 H 242 > > # (a) Omitted > > # (b) > g <- lm(time~manufact+machine+speed,data=lawn) > summary(g) Call: lm(formula = time ~ manufact + machine + speed, data = lawn) Residuals: Min 1Q Median 3Q Max -12.50 -8.50 -3.00 7.50 19.25 Coefficients: (1 not defined because of singularities) Estimate Std. Error t value Pr(>|t|) (Intercept) 230.083 2.284 100.758 < 2e-16 *** manufact1 23.083 5.106 4.521 0.000302 *** machine1 -3.917 6.459 -0.606 0.552255 machine2 -29.917 6.459 -4.632 0.000238 *** machine3 -4.667 6.459 -0.723 0.479790 machine4 23.000 7.910 2.908 0.009804 ** machine5 NA NA NA NA speed1 29.500 2.284 12.919 3.23e-10 *** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Residual standard error: 11.19 on 17 degrees of freedom Multiple R-squared: 0.9251, Adjusted R-squared: 0.8986 F-statistic: 34.97 on 6 and 17 DF, p-value: 1.183e-08 > > # (c) > amod <- lmer(time~manufact+speed+(1|manufact:machine)+(1|speed:machine),data=lawn,REML=FALSE) > summary(amod) Linear mixed model fit by maximum likelihood ['lmerMod'] Formula: time ~ manufact + speed + (1 | manufact:machine) + (1 | speed:machine) Data: lawn AIC BIC logLik deviance df.resid 201.7 208.8 -94.9 189.7 18 Scaled residuals: Min 1Q Median 3Q Max -1.1925 -0.6324 -0.2089 0.8129 1.3719 Random effects: Groups Name Variance Std.Dev. speed:machine (Intercept) 49.92 7.065 manufact:machine (Intercept) 72.68 8.525 Residual 84.92 9.215 Number of obs: 24, groups: speed:machine, 12; manufact:machine, 6 Fixed effects: Estimate Std. Error t value (Intercept) 230.083 4.451 51.69 manufact1 10.250 4.451 2.30 speed1 29.500 2.775 10.63 Correlation of Fixed Effects: (Intr) mnfct1 manufact1 0.000 speed1 0.000 0.000 > > # (d) > nmod <- lm(time~manufact+speed,data=lawn) > 2*(logLik(amod) - logLik(nmod)) 'log Lik.' 6.444842 (df=6) > lrstat <- rep(0,1000) > for(i in 1:1000){ + y <- unlist(simulate(nmod,use.u=TRUE)) + mod.a <- lmer(y~manufact+speed+(1|manufact:machine)+(1|speed:machine),data=lawn,REML=FALSE) + mod.n <- lm(y~manufact+speed,data=lawn) + lrstat[i] <- 2*(logLik(mod.a) - logLik(mod.n)) + } > > mean(lrstat>2*(logLik(amod) - logLik(nmod))) [1] 0.005 > > # (f) > g <- lmer(time~speed+(1|manufact)+(1|manufact:machine),data=lawn,REML=FALSE) > summary(g) Linear mixed model fit by maximum likelihood ['lmerMod'] Formula: time ~ speed + (1 | manufact) + (1 | manufact:machine) Data: lawn AIC BIC logLik deviance df.resid 204.6 210.5 -97.3 194.6 19 Scaled residuals: Min 1Q Median 3Q Max -1.15784 -0.70450 -0.06307 0.64212 1.59863 Random effects: Groups Name Variance Std.Dev. manufact:machine (Intercept) 148.75 12.196 manufact (Intercept) 45.63 6.755 Residual 118.19 10.872 Number of obs: 24, groups: manufact:machine, 6; manufact, 2 Fixed effects: Estimate Std. Error t value (Intercept) 230.083 7.248 31.75 speed1 29.500 2.219 13.29 Correlation of Fixed Effects: (Intr) speed1 0.000