Solution

1. (a) The Pearson statistic is X2 = 6.880708 with p-value 0.1423282 based on df = 4; the likelihood-ratio (goodness-of-fit) statistic is G2 = 7.277707 with p-value 0.1219205, also based on df = 4. Thus independence is not rejected by either test: there is no evidence of a relationship between the response and the predictor.

(b) The p-value of the slope in the logit model is 0.011. The odds ratio for the logit model when the predictor increases by one level is the same at every level, 1.476470. For this part, we conclude that the response and the predictor are not independent.

(c) The p-value of the Wald statistic for the slope in the logit model is 0.011. There is an inconsistency between parts (a) and (b). When this happens, I believe the results in (b) and the Wald method, because the chi-square distribution is only an approximate distribution and the approximation in (b) is better than in (a).

2. For the logit model, log(pi/(1-pi)) = beta_0 + beta_1*x, and pi = 1/2 gives beta_0 + beta_1*x = 0; thus x = -beta_0/beta_1.

For the probit model, Phi^{-1}(pi) = beta_0 + beta_1*x, and since Phi^{-1}(0.5) = 0, pi = 1/2 again gives beta_0 + beta_1*x = 0; thus x = -beta_0/beta_1.

For the complementary log-log model, log(-log(1-pi)) = beta_0 + beta_1*x, and pi = 1/2 gives beta_0 + beta_1*x = log(log 2) = -0.3665; thus x = -(0.3665 + beta_0)/beta_1.

3. For the logit model,
d pi(x)/dx = beta * e^{alpha+beta*x}/(1+e^{alpha+beta*x})^2 = beta * pi(x) * [1 - pi(x)],
which is maximized at pi(x) = 0.5. Thus the absolute value of the slope attains its maximum at pi(x) = 0.5.

For the probit model, d pi(x)/dx = beta * phi(alpha + beta*x), where phi is the standard normal density. Since phi attains its maximum at 0, the absolute value of the slope is maximized when alpha + beta*x = 0, i.e., when pi(x) = 0.5.

For the complementary log-log model, d pi(x)/dx = beta * e^{alpha+beta*x} * exp(-e^{alpha+beta*x}). Let y = e^{alpha+beta*x}, so that d pi(x)/dx = beta * y * e^{-y}. Let f(y) = log(y) - y; then f'(y) = 1/y - 1 and f''(y) = -1/y^2 < 0, so f (and hence y*e^{-y}) attains its maximum at y = 1, i.e., at alpha + beta*x = 0. Therefore d pi(x)/dx attains its absolute maximum at pi(x) = 1 - exp(-1).
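The conclusions in Problems 2 and 3 can also be checked numerically in R. Below is a minimal sketch; the values alpha = -2 and beta = 0.5 are arbitrary illustration values (not estimates from any data in this assignment), and the helper functions dlogit, dprobit, and dcloglog are defined only for this check.

# Problem 2: x at which pi(x) = 1/2 under each link (alpha, beta arbitrary)
alpha <- -2; beta <- 0.5
x.logit   <- -alpha/beta                  # logit:  -beta_0/beta_1
x.probit  <- -alpha/beta                  # probit: -beta_0/beta_1
x.cloglog <- (log(log(2)) - alpha)/beta   # cloglog: log(log 2) = -0.3665
plogis(alpha + beta*x.logit)              # 0.5
pnorm(alpha + beta*x.probit)              # 0.5
1 - exp(-exp(alpha + beta*x.cloglog))     # 0.5

# Problem 3: locate the maximizer of d pi(x)/dx for each link numerically
dlogit   <- function(x) beta*dlogis(alpha + beta*x)
dprobit  <- function(x) beta*dnorm(alpha + beta*x)
dcloglog <- function(x) beta*exp(alpha + beta*x)*exp(-exp(alpha + beta*x))
xmax <- sapply(list(dlogit, dprobit, dcloglog),
               function(f) optimize(f, c(-20, 20), maximum = TRUE)$maximum)
plogis(alpha + beta*xmax[1])        # 0.5
pnorm(alpha + beta*xmax[2])         # 0.5
1 - exp(-exp(alpha + beta*xmax[3])) # 1 - exp(-1) = 0.632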
################## R Command and Output ##################

Problem 1:

part a)

> high <- c(1,13,16,15,7)                      # Input the first column
> low <- c(11,53,42,27,11)                     # Input the second column
> infiltration <- cbind(high,low)              # combine them as the response
> change <- 1:5                                # input the predictor
> infiltration
     high low
[1,]    1  11
[2,]   13  53
[3,]   16  42
[4,]   15  27
[5,]    7  11
> change
[1] 1 2 3 4 5
> totalcolumn <- apply(infiltration,2,sum)     # compute the column totals
> totalrow <- apply(infiltration,1,sum)        # compute the row totals
> column <- rbind(totalcolumn,totalcolumn,totalcolumn,totalcolumn,totalcolumn)
                                               # duplicate so each position holds its column total
> row <- cbind(totalrow,totalrow)              # duplicate so each position holds its row total
> estimate <- column*row/(sum(infiltration))   # estimated frequencies under independence
> estimate
                 high       low
totalcolumn  3.183673  8.816327
totalcolumn 17.510204 48.489796
totalcolumn 15.387755 42.612245
totalcolumn 11.142857 30.857143
totalcolumn  4.775510 13.224490
> X2 <- sum((infiltration-estimate)^2/estimate)          # compute the Pearson statistic
> G2 <- 2*sum(infiltration*log(infiltration/estimate))   # compute G^2
> X2
[1] 6.880708
> G2
[1] 7.277707
> 1-pchisq(X2,4)   # compute the p-value based on df=(I-1)(J-1)=4*1=4
[1] 0.1423282
> 1-pchisq(G2,4)
[1] 0.1219205

part b)

# logit model
> g <- glm(infiltration~change,family=binomial)
> summary(g)

Call:
glm(formula = infiltration ~ change, family = binomial)

Deviance Residuals:
[1] -0.60664  0.06131  0.23573  0.17732 -0.40927

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -2.2037     0.5084  -4.334 1.46e-05 ***
change        0.3897     0.1532   2.543    0.011 *
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 7.27771  on 4  degrees of freedom
Residual deviance: 0.62628  on 3  degrees of freedom
AIC: 22.475

> odds.logistic <- exp(g$coeff[2])   # odds ratio for a one-level increase in change
> odds.g <- g$fitted[2:5]*(1-g$fitted[1:4])/(g$fitted[1:4]*(1-g$fitted[2:5]))
> odds.logistic
  change
1.476470
> odds.g
[1] 1.476470 1.476470 1.476470 1.476470

# Compare the predicted values by plot
> x <- seq(-2,8,0.01)
> p.logit <- g$coeff[1]+g$coeff[2]*x
> x <- seq(-4,15,0.01)
> p.logit <- exp(g$coeff[1]+g$coeff[2]*x)/(1+exp(g$coeff[1]+g$coeff[2]*x))
> p.probit <- pnorm(gp$coeff[1]+gp$coeff[2]*x)   # gp is the probit fit (fitting call not shown; see the sketch below)
> matplot(x,cbind(p.logit/p.probit,(1-p.logit)/(1-p.probit)),col=c(1,1),type="l")
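The plotting commands above use a probit fit gp whose fitting call does not appear in the transcript. A minimal sketch of how it could be obtained, together with a built-in cross-check of the Pearson test from part (a), is given below; the object name gp, the link = "probit" choice, and the use of chisq.test are assumptions added here, not part of the original output.

# Sketch: probit fit presumably used for p.probit above (name gp is assumed)
gp <- glm(infiltration ~ change, family = binomial(link = "probit"))
summary(gp)

# Built-in cross-check of the Pearson test in part (a)
chisq.test(infiltration)   # X-squared = 6.8807, df = 4, p-value = 0.1423

chisq.test reproduces the hand-computed X2 and p-value; it may also warn that the chi-squared approximation could be inaccurate because two expected counts (3.18 and 4.78) are below 5.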