
LANCASTER UNIVERSITY 2007 EXAMINATIONS

PART II (Third or Fourth Year)

MATHEMATICS & STATISTICS

Math 352 Generalised Linear Models 90 minutes

You should answer ALL Section A questions and ONE Section B question.

In Section A there are questions worth a total of 50 marks, but the maximum mark that you can gain there is capped at 40.

SECTION A

A1. Let Z ∼ Bino(m,µ), where 0 < µ < 1 and m > 0 is a fixed known integer. Define Y = Z/m so that Y ∼ Binoprop(m,µ).

(a) State the values of Z (the support) which have non-zero probability. [2]

(b) Find the expectation and variance of Y. [4]

(c) Write down the probability mass function of Y . [2]

(d) Explain why, in the standard notation of GLMs, Z is not a GLM but Y is. [2]

A2. Define the logistic and logit functions, and show that they are inverses of each other. [8]
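A numerical sanity check can accompany the algebraic argument asked for in A2. The sketch below is not a proof, and it assumes the usual definitions logistic(η) = 1/(1 + exp(−η)) and logit(µ) = log(µ/(1 − µ)); the function names and test points are illustrative choices, not part of the question.

```python
import math

def logistic(eta):
    """Assumed definition: maps the real line to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

def logit(mu):
    """Assumed definition: maps (0, 1) to the real line."""
    return math.log(mu / (1.0 - mu))

# Composing the two functions in either order should return the input,
# up to floating-point rounding.
for eta in (-3.0, -0.5, 0.0, 1.2, 4.0):
    assert math.isclose(logit(logistic(eta)), eta, abs_tol=1e-9)

for mu in (0.05, 0.3, 0.5, 0.9):
    assert math.isclose(logistic(logit(mu)), mu, abs_tol=1e-12)
```

Agreement at a handful of points only illustrates the inverse relationship; the marks are for the algebra.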

A3. Define the residual deviance of a GLM in terms of the log likelihood, expressed as a function of the moment parameters, µi, and the observations, yi, i = 1, 2, . . . , n. [4]

State what the deviance measures. [2]

A4. Show that the exponential family (EF) generated from the exponential pdf q(y) = 2 exp(−2y), y > 0, and Y may be written as

f(y|θ) = 2 exp(−2y) exp{θy + log(1 − θ/2)}, where θ < 2. [6]

Identify the mean of this pdf in terms of the canonical parameter θ. [4]

Find the maximum likelihood estimate of θ based on a single observation y from this pdf. [6]

please turn over



A5. An experiment consists of 6 units to which the following treatment combinations are applied.

Unit  A   B
1     A2  B2
2     A1  B2
3     A1  B2
4     A2  B1
5     A3  B1
6     A3  B1

(a) Write out the matrix that corresponds to this design in terms of the indicator vectors for these factor levels. [4]

(b) Define the factor A in terms of these indicator vectors. [2]

(c) Write out the design matrix (the X matrix) for the model in which the linear predictor η ∈ A + B. [2]

(d) Modify this matrix for the model in which the linear predictor η ∈ A + B + A.B. [2]
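The construction in A5 can be sketched in code. The example below builds one 0/1 indicator vector per factor level from the treatment table and assembles a design matrix for η ∈ A + B. The corner-point parametrisation (A1 and B1 as baseline levels, plus an intercept column) is an assumption made here for illustration; the question does not fix a parametrisation, and all variable names are hypothetical.

```python
# Treatment combinations for units 1..6, copied from the table in A5.
A = ["A2", "A1", "A1", "A2", "A3", "A3"]
B = ["B2", "B2", "B2", "B1", "B1", "B1"]

def indicator(levels, level):
    """0/1 indicator vector: 1 where the unit received `level`."""
    return [1 if v == level else 0 for v in levels]

def columns_to_rows(cols):
    """Turn a list of column vectors into a list of matrix rows."""
    return [list(row) for row in zip(*cols)]

# One indicator vector per factor level.
ind = {lev: indicator(A, lev) for lev in ("A1", "A2", "A3")}
ind.update({lev: indicator(B, lev) for lev in ("B1", "B2")})

# 6 x 5 matrix of all indicator vectors, columns A1 A2 A3 B1 B2.
full = columns_to_rows([ind[k] for k in ("A1", "A2", "A3", "B1", "B2")])

# ASSUMPTION: corner-point constraints with A1 and B1 as baselines,
# giving columns: intercept, A2, A3, B2.
one = [1] * len(A)
X_main = columns_to_rows([one, ind["A2"], ind["A3"], ind["B2"]])

for row in X_main:
    print(row)
```

Other parametrisations (for example sum-to-zero constraints) give a different X matrix spanning the same linear space A + B.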


SECTION B

B1. The exponential pdf with mean parameter µ > 0 is

f(y) = µ⁻¹ exp(−yµ⁻¹) for y > 0.

A one-dimensional covariate x is associated with the observation y through the unspecified link function g, where g(µ) = η, the linear predictor, and η = βx. Consider estimating the regression parameter β from a single observation.

(a) Write down the log-likelihood function for β and, by using the chain rule, find the score function and the observed information for β. [8]

(b) Show that the Fisher information for β is (1/µ²)(∂µ/∂β)². [6]

(c) Three possible candidates for the link function are the identity, log and reciprocal links given by

reciprocal: 1/µ = βx; log: log(µ) = βx; identity: µ = βx.

Explain how these three links lead to different interpretations of the parameter β by computing dµ/dx. [6]

(d) Comment on the relative merits of these three links. [6]

(e) Suppose n = 4 independent observations are made on this GLM at the x points 1, 1, 2, 2, resulting in y values of 1, 2, 3, 4 respectively.

Using the identity link, these data lead to β̂ = 1 and fitted values of 1.2, 1.8, 3.1, 4.2.

Find the Fisher information numerically. [4]



B2. There are three notations for GLMs: generic, index and vector notation.

(a) Explain these different notations when describing the response variable. [6]

(b) The diagram of a GLM is

[Diagram not recoverable from the source: an unlabelled graph of nodes joined by undirected edges.]

Add labels to the diagram to represent the generic concepts that go to define a GLM. [4]

(c) Give brief definitions of these generic concepts. [8]

(d) Add directions to the edges of this diagram to help explain the interrelationships between the generic concepts of a GLM, and label the edges with GLM functions where appropriate. [4]

(e) Describe these interrelationships. [8]

end of exam
