




























































































Study with the several resources on Docsity
Earn points by helping other students or get them with a premium plan
Prepare for your exams
Study with the several resources on Docsity
Earn points to download
Earn points by helping other students or get them with a premium plan
Ch 8 Material Type: Notes; Professor: Davidian; Class: Applied Longitudianal Data Analysis; Subject: Statistics; University: North Carolina State University; Term: Unknown 1989;
Typology: Study notes
1 / 101
This page cannot be seen from the preview
Don't miss anything!





























































































We have seen that the classical methods of univariate and multivariate repeated measures analysis of variance may be thought of as being based on a statistical model for a data vector from the ith individual, i = 1,... , m. So far, we have written this model in different ways. Following convention, we wrote the model as Y ′ i = a′ iM + ≤′ i,
where M is the (q × n) matrix
M =
μ 11 · · · μ 1 n ... ... ... μq 1 · · · μqn
,
and the individual means μj are for theth group at the jth time.
We could equally well write this model as
Y (^) i = μ` + ≤i
for unit i coming from the th population, = 1,... , q. Regardless of how we write the model, we note that it represents Y (^) i as having two components:
, μj for the `th group at the jth time, are further represented in terms of an overall mean and deviations asμj = μ + τ + γj + (τ γ)`j
along with constraints ∑q=1 τ = 0, etc in order to give a unique representation. As noted in the last chapter, this representation (i) Requires that the length of each data vector Y (^) i be the same, n. (ii) Does not explicitly incorporate the actual times of measurement or other information.
We also noted in Chapter 5 that this model could be written in an alternative way. Specifically, we defined β as the column vector containing all of μ, τ, γj , (τ γ)j stacked and Xi to be a matrix of 0’s and 1’s with n rows that “picks” off the appropriate elements of β for each element of Y (^) i. We wrote the model in the alternative form Y (^) i = Xiβ + ≤i, (8.1)
where again ≤i is the “overall deviation” vector with var(≤i) = Σ. Note that both the univariate and multivariate ANOVA models could be written in this way; what would distinguish them would again be the assumption on Σ. This model, along with the usual constraints, has the flavor of a “regression” model for the ith unit.
Regardless of how we write the model, it says that, for a unit in group `,
Yij = μ + τ+ γj + (τ γ)j + ≤ij , (8.2)
so that E(Yij ) is taken to have this specific form.
As we will now discuss, a representation like (8.1) offers a convenient framework for thinking about more general model for longitudinal data. In this chapter, we will discuss such a model, writing it in the form (8.1). We will see that we will be able to address several of the issues raised in the last chapter:
The sample means suggest that the true means μj at each time point may very well fall on a straight line.
This observation suggests that we may be able to refine our view about the means. Rather than thinking of the mean vector as simply as set of n unrelated means μj , we might think of these means as satisfying μj = β 0 + β 1 tj ;
that is, the means fall on the line with intercept β 0 and slope β 1.
This suggests replacing (8.2) by Yij = β 0 + β 1 tj + ≤ij. (8.3)
Model (8.3) says that, at the jth time tj , Yij values we might see have mean β 0 + β 1 tj and vary about it according to the overall deviations ≤ij.
and hence τ would be the same for all units in that model, and the mean depends on time through γj and (τ γ)`j .)To summarize, if we really believe that the mean trajectory follows a straight line, model (8.3) seems more appropriate, because it exploits this assumption.
MATRIX REPRESENTATION: The model (8.3) may be written in matrix form. With Y (^) i as usual the (n × 1) data vector, defining
1 t 1 1 t 2 ... ... 1 tn
, β =
β^0 β 1
,
we can write the model as Y (^) i = Xβ + ≤i. (8.4)
This has the form of model (8.1). Because all units are seen at the same n times, the matrix X is the same for all units.
COVARIANCE MATRIX: The above development offers an alternative way to represent mean response. To complete the model, we need to also make an assumption about the covariance matrix of the random vector ≤i. For example, as in the classical models, we could assume that this matrix is the same for all data vectors, i.e. var(≤i) = Σ,
for some matrix Σ. Momentarily, we will address the issue of specification of Σ more carefully; for now, as we consider the situation of only a single population, it is natural to take this matrix to be the same for all units.
MULTIVARIATE NORMALITY: Suppose we further assume that the responses Yij are normally dis- tributed at each time point, so that the Y (^) i are multivariate normal. Thus, we may summarize the model as Y (^) i ∼ Nn(Xβ, Σ),
where X and β are as above.
The modeling strategy for the mean above may be generalized. We consider several possibilities:
It is straightforward to see that this is a slick way of noting that if i is a girl or boy, respectively, we are defining
Xi =
1 t 1 0 0 ... ... ... ... 1 tn 0 0
,^ Xi^ =
0 0 1 t 1 ... ... ... ... 0 0 1 tn
,
respectively.
With these definitions, it is a simple matrix exercise to verify that Xiβ yields the (n × 1) vector whose elements are β 0 ,G + β 1 ,Gtj or β 0 ,B + β 1 ,B tj , depending on whether i is a boy or girl. We may thus write the model succinctly as Y (^) i = Xiβ + ≤i,
where β and Xi are defined in (8.8) and (8.9), respectively.
MULTIVARIATE NORMALITY: With the additional assumption of normality, each Y (^) i under this model is n-variate normal with mean Xiβ, where Xi depends on group membership. With some additional assumption about the covariance matrix, e.g. var(≤i) = Σ for all i, we have
Y (^) i ∼ Nn(Xiβ, Σ).
IMBALANCE: It is possible to be even more general. For definiteness, we consider two examples.
ULTRAFILTRATION DATA FOR LOW FLUX DIALYZERS: These data are given in Vonesh and Chinchilli (1997, section 6.6). Low flux dialyzers are used to treat patients with end stage renal disease to remove excess fluid and waste from their blood. In low flux hemodialysis, the ultrafiltration rate (ml/hr) at which fluid is removed is thought to follow a straight line relationship with the transmembrane pressure (mmHg) applied across the dialyzer membrane. A study was conducted to compare the average ultrafiltration rate (the response) of such dialyzers across three dialysis centers where they are used on patients. A total of m = 41 dialyzers (units) were involved. The experiment involved recording the ultrafiltration rate at several transmembrane pressures for each dialyzer.
Figure 2 shows individual dialyzer profiles for the dialyzers in each center. A notable feature of the figure is that the 4 pressures (“time” here) at which each dialyzer was observed are not necessarily the same. Thus, the ith dialyzer has its own set of times tij , j = 1,... , n = 4. Hence, we cannot calculate sample means, because each dialyzer is seen at potentially different pressures. However, if we envision taking means in each panel of the figure across all time points, it seems reasonable that the means would very likely fall approximately on a straight line.
Figure 2: Dialyzer profiles (ultrafiltration rate vs. transmembrane pressure) for 41 dialyzers in 3 centers
tranmembrane pressure (mmHg)
ultrafiltration rate (ml/hr) 100 200 300 400 500
500
1000
1500
2000
Center 1
tranmembrane pressure (mmHg)
ultrafiltration rate (ml/hr) 100 200 300 400 500
500
1000
1500
2000
- - Center 2
tranmembrane pressure (mmHg)
ultrafiltration rate (ml/hr) 100 200 300 400 500
500
1000
1500
2000
Center 3
PSfrag replacements
μ σ^21 σ^22 ρ 12 = 0. 0 ρ 12 = 0. 8
y 1 y 2
With the modeling strategy we have adopted, this does not really pose any additional difficulty. From the figure, a reasonable model for the ith dialyzer is
Yij = β 1 + β 2 tij + ≤ij , dialyzer i in center 1 Yij = β 3 + β 4 tij + ≤ij , dialyzer i in center 2 Yij = β 5 + β 6 tij + ≤ij , dialyzer i in center 3 (8.10)
Here, β 1 , β 3 , β 5 are the intercepts and β 2 , β 4 , β 6 are the slopes for the means (straight lines) for each center.
Figure 3: Hæmatocrit trajectories for hip replacement patients. The left hand panels are individual profiles by gender; the right hand panels show a fitted quadratic model for the mean superimposed.
weeks
haematocrit
0.0 0.5 1.0 1.5 2.0 2.5 3.
20
30
40
(^50) •
Males, individual trajectories
weeks
haematocrit
0.0 0.5 1.0 1.5 2.0 2.5 3.
20
30
40
(^50) •
Males, mean at age = 65.52 superimposed
weeks
haematocrit
0.0 0.5 1.0 1.5 2.0 2.5 3.
20
30
40
50
- Females, individual trajectories
weeks
haematocrit
0.0 0.5 1.0 1.5 2.0 2.5 3.
20
30
40
50
- Females, mean at age 66.07 superimposed
PSfrag replacements
μ σ^21 σ^22 ρ 12 = 0. 0 ρ 12 = 0. 8
y 1 y 2
It may be seen from the figure that a number of both male and female patients are missing the mea- surement at week 2; in fact, there is one female missing the pre-replacement measurement and week 2. The reason for this is not given by Crowder and Hand; however, because it is so systematic, happening only at this occasion and for about half of the male and half of the female patients, it suggests that the reason has nothing to do with the patients’ health or recovery from the replacement. Perhaps the centrifuge used to obtain hæmatocrit values went on the blink that week before all patients’ values could be obtained! We will assume that the reason for these missing observations has nothing to do with the thing of primary interest, gender; this seems reasonable in light of the pattern of missingness for week 2.
Thus, we have a situation where the data vectors Y (^) i are of possibly different lengths for different units. In particular, we now have that Y (^) i is (ni × 1), where ni is the number of observations on unit i. Thus, the total number of observations from all units is
N = ∑^ m i=
ni.
To determine an appropriate parsimonious representation for the mean of a data vector for each group, we could calculate the sample means at each time point for males and females. We must be a bit careful, however; because of the missingness, the sample means at different times will be of different quality.
Nonetheless, it seems clear from the figure that a model that says the means fall on a straight line for either gender would be inappropriate. For almost all patients, the pre-replacement reading is high; then, following replacement, the hæmatocrit goes down and then slowly rebounds over the next 3 weeks. This suggests that the relationship of the means with time might look more like a quadratic function of time. These observations suggest the following model:
Yij = β 1 + β 2 tij + β 3 t^2 ij + ≤ij , males Yij = β 4 + β 5 tij + β 6 t^2 ij + ≤ij , females. (8.11)
In (8.11), we have allowed for the possibility that the times for each i are not the same, writing tij. For this data set, the times that are potentially available for each individual are the same; however, as we saw in the dialyzer example above, this need not be the case.
To write the model in matrix form, define
β = (β 1 ,... , β 6 )′.
Clearly, the matrix Xi for a given unit will depend on the times of observation for that unit and will have number of rows ni, each row corresponding to one of the ni elements of Yij. For example, for a male with ni observations, we have
Xi =
1 ti 1 t^2 i 1 0 0 0 ... ... ... ... ... ... 1 tini t^2 ini 0 0 0
We may thus summarize the model as
Y (^) i = Xiβ + ≤i, (ni × 1),
where Xi is the (ni × 6) matrix defined appropriately for individual i.
HIP REPLACEMENT, CONTINUED: In the hip replacement study, the age of each participant was also recorded, and in fact an objective of the investigators was not only to understand differences in hæmatocrit response across genders but also to elucidate whether the age of the patient has an effect on response. It turns out that the sample mean age for males was 65.52 years and that for females was 66.07 years. From Figure 3, the patterns look pretty similar for both genders; of course, there is no easy way of discerning from the plot whether age affects the response.
To illustrate inclusion of the age covariate, consider the following modified model, where ai is the age of the ith patient:
Yij = β 1 + β 2 tij + β 3 t^2 ij + β 7 ai + ≤ij , males Yij = β 4 + β 5 tij + β 6 t^2 ij + β 7 ai + ≤ij , females. (8.12)
Model (8.12) says that, regardless of whether a person is male or female, the mean hæmatocrit response at any time increases by β 7 for every year increase in age (keep in mind that β 7 could be negative). One can envision fancier models where this also depends on gender. It is straightforward to write this in matrix notation as before; with β = (β 1 ,... , β 7 )′,
we can define appropriate Xi matrices, i.e. for a male of age ai
Xi =
1 ti 1 t^2 i 1 0 0 0 ai ... ... ... ... ... ... 1 tini t^2 ini 0 0 0 ai
PARAMETERIZATION: It is possible to represent models like those above in different ways. For definiteness, consider the dialyzer example. We wrote the model in (8.10) as
Yij = β 1 + β 2 tij + ≤ij , dialyzer i in center 1 Yij = β 3 + β 4 tij + ≤ij , dialyzer i in center 2 Yij = β 5 + β 6 tij + ≤ij , dialyzer i in center 3
It is sometimes more convenient, although entirely equivalent, to write the model in an alternative parameterization. As we have discussed, a question of interest is often to compare the rate of change of the mean response over time (pressure here) among groups. In this situation, we would like to compare the three slopes β 2 , β 4 , and β 6.
Define δi 1 = 1 unit i from center 1; = 0 o.w. δi 2 = 1 unit i from center 2; = 0 o.w.
Then write the model as
Yij = β 1 + β 2 δi 1 + β 3 δi 2 + β 4 tij + β 5 δi 1 tij + β 6 δi 2 tij + ≤ij (8.13)
There are still 6 parameters overall, but the ones in (8.13) have an entirely different interpretation from those in the first model.
It is straightforward to observe by simply plugging in the values of δi 1 and δi 2 for each center that the following is true:
Center Intercept Slope 1 β 1 + β 2 β 4 + β 5 2 β 1 + β 3 β 4 + β 6 3 β 1 β 4
Note that β 2 and β 3 have the interpretation of the difference in intercept between Centers 1 and 3 and Centers 2 and 3, respectively, and β 1 is the intercept for Center 3. Similarly, β 5 and β 6 have the interpretation of the difference in slope between Centers 1 and 3 and Centers 2 and 3, respectively, and β 1 is the slope for Center 3. This parameterization allows us to estimate, as we will talk about shortly, the differences of interest directly. This same type of parameterization is used in ordinary linear regression for similar reasons.
This type of parameterization is the default used by SAS PROC GLM and PROC MIXED, which we will use to implement the analyses we will discuss shortly. The different parameterizations yield equivalent models; the only thing that differs is the interpretation of the parameters.
ALTERNATIVE MODELS: We now recall the other models. Actually, there is nothing stopping us from allowing var(≤i) to be different for different groups; e.g., in the dental study, allow different covariance matrices for each gender. We discuss this further below.
Σ = var(≤i) =
σ^2 ρσ^2 0 · · · 0 ρσ^2 σ^2 ρσ^2 · · · 0 ... ... ... ... ... 0 0 · · · ρσ^2 σ^2
This model would make sense even if the times are not equally-spaced in time (as they are, for example, in the dental study: 8, 10, 12, 14). It is possible to extend this to a two-dependent or higher dependent model or to heterogeneous variances over time, as discussed in Chapter 4. SAS PROC MIXED uses the designation type = toep(2) (for “Toeplitz” with 2 diagonal bands) to refer to this assumption with the same variance at all times. With groups, we could believe the one-dependent assumption holds for each group, but allow the possibility that the variance σ^2 and correlation ρ are different in each group. The same holds true for the rest of the models we consider.
Σ = var(≤i) = σ^2
1 ρ ρ^2 · · · ρn−^1 ρ 1 ρ · · · ρn−^2 ... ... ... ... ... ρn−^1 ρn−^2 · · · ρ 1
SAS PROC MIXED uses the designation type = ar(1) to refer to this assumption.
djk = |tij − tik|, j, k = 1,... , n,
then the model is
Σ = var(≤i) = σ^2
1 ρd^12 · · · ρd^1 n ... ... ... ... ρdn^1 ρdn^2 · · · 1
SAS PROC MIXED allows this type of model to be implemented in more than one way, e.g with the type = sp(pow)(.) designation.
We will consider examples of fitting these structures to several of our examples in section 8.8. The SAS PROC MIXED documentation, as well as the books by Diggle, Heagerty, Liang, and Zeger (2002) and Vonesh and Chinchilli (1997), discuss other assumptions.
DECIDING AMONG COVARIANCE STRUCTURES: In the balanced case, one may use the tech- niques discussed in Chapter 4 to gain informal insight into the structure of var(≤i). Inspection of sample covariance matrices, scatterplot matrices, autocorrelation functions, and lag plots can aid the analyst in identifying possible reasonable models.
These methods can be modified to take into account the fact that one believes that the mean vectors follow smooth trajectories over time, such as a straight line. For instance, instead of using the sample means for “centering” in these approaches, one might estimate β somehow; e.g. by least squares treating all the individual responses from all units as if they were independent (even though we know they are probably not). Least squares is clearly not the best way to estimate β (recall our discussion in Chapter 3); however, this estimator may be “good enough” to provide reasonable estimates of the means at each time tj that take advantage of our willingness to believe they follow a smooth trajectory, so might be preferred to using sample means at each j on this account. In particular, if
μj = β 0 + β 1 tj ,
say, for a single group, we would estimate μj by β̂ 0 + β̂ 1 tj and use this in place of the sample mean.
A complete discussion of graphical and other techniques along these lines may be found in Diggle, Heagerty, Liang, and Zeger (2002).
Σi =
σ^2 ρσ^2 ρσ^2 σ^2 0 0 σ^2
.
Σi = σ^2
1 ρ ρ^3 ρ 1 ρ^2 ρ^3 ρ^2
These examples illustrate the main point – if all observations were intended to be taken at the same times, but some are not available, the covariance matrix must be carefully constructed according to the particular time pattern for each unit, using the convention of the assumed covariance model.
Now consider the situation of the ultrafiltration data. Here, the actual times of observation are different for each unit. Consider again the above models.
Clearly, with different times for different units, modeling covariance is more challenging! In fact, it is even hard to investigate the issue informally, because the information from each unit is different. In the next two chapters of the course, we will talk about another approach to modeling longitudinal data that obviates the need to think quite so hard about all of this!
INDEPENDENCE ASSUMPTION: An alternative to all of the above, in both cases of balanced and unbalanced data, is the assumption that observations within a unit are uncorrelated, which, with the assumption of multivariate normality implies that they are independent. That is, if we believe that all observations have constant variance var(Yij ) = σ^2 , take
Σi = var(≤i) = σ^2 Ini.
SUMMARY: The important message is that, by thinking about the situation at hand, it is possible to specify models for covariance that represent the main features in terms of a few parameters. Thus, just as we model the systematic component in terms of a regression parameter β, we may model the random component.