































































A 10-year longitudinal study on flashbulb and event memories of the September 11 attacks. The study examines the consistency and accuracy of these memories over an extended period, focusing on factors such as media attention, emotional intensity, and personal loss. Key findings suggest that media attention and ensuing conversation predict event memory accuracy, while flashbulb memory consistency is not affected by any of the examined factors.
Running head: Ten-Year Follow-up
A ten-year follow-up of a study of memory for the attack of September 11, 2001:
Flashbulb memories and memories for flashbulb events
William Hirst, New School for Social Research
Elizabeth A. Phelps, New York University
Robert Meksin, New School for Social Research
Chandan J. Vaidya, Georgetown University
Marcia K. Johnson, Yale University
Karen J. Mitchell, West Chester University
Randy L. Buckner, Harvard University
Andrew E. Budson, Boston University School of Medicine
John D. E. Gabrieli, Massachusetts Institute of Technology
Cindy Lustig, University of Michigan
Mara Mather, University of Southern California
Kevin N. Ochsner, Columbia University
Daniel Schacter, Harvard University
Jon S. Simons, Cambridge University
Keith B. Lyle, University of Louisville
Alexandru F. Cuc, Nova Southeastern University
Andreas Olsson, Karolinska Institutet
Word Count: 13,
Authors’ Note
Please address correspondence to: William Hirst, Department of Psychology, New
School for Social Research, 80 Fifth Avenue, New York, NY 10011.
We thank the many student coders without whom this research project would have
been impossible, as well as Brett Sedgewick, who assisted with the supervision of this
project at its earliest stages.
Support from the James S. McDonnell Foundation, and the National Institute of
Mental Health, grant R01-MH0066972, is gratefully acknowledged.
Abstract
Within a week of the attack of September 11, 2001, a consortium of researchers from across the
United States distributed a survey asking about the circumstances in which respondents learned
of the attack (their flashbulb memories) and the facts about the attack itself (their event
memories). Follow-up surveys were distributed 11, 35, and 119 months after the attack. The
study, therefore, examines retention of flashbulb memories and event memories at a substantially
longer retention interval than any previous study employing a test-retest methodology, allowing
for the study of such memories over the long-term. There was rapid forgetting of both flashbulb
and event memories within the first year, but the forgetting curves leveled off after that, not
significantly changing even after a 10-year delay. Despite the initial rapid forgetting, confidence
remained high throughout the 10-year period. Five putative factors affecting flashbulb memory
consistency and event memory accuracy were examined: (1) attention to media, (2) the amount of
discussion, (3) residency, (4) personal loss and/or inconvenience, and (5) emotional intensity.
After ten years, none of these factors predicted flashbulb memory consistency; media attention
and ensuing conversation predicted event memory accuracy. Inconsistent flashbulb memories
were more likely to be repeated rather than corrected over the ten-year period; inaccurate event
memories, on the other hand, were more likely to be corrected. The findings suggest that even
traumatic memories and those implicated in a community’s collective identity may be
inconsistent over time and these inconsistencies can persist without the corrective force of external
influences.
Keywords: Long-term memory, flashbulb memories, event memories, September 11,
autobiographical memories, collective memories
community. There are many reasons for this interest: Researchers treat them as examples of
long-term emotional memories and as special cases of the more general class of traumatic
memories, for instance (e.g., Berntsen & Rubin, 2006). In addition, flashbulb memories are a
distinctive form of autobiographical memory, a topic of great interest to psychologists (Rubin,
2005). Unlike most autobiographical memories, they are built around public events. In the case
of flashbulb memories, the public event appears to have personal meaning and consequences for
the individual rememberer. That is, the memory refers to events for which the personal and the
public intersect. As a result, flashbulb memories are of concern to scholars studying both
individual and social identity (Neisser, 1982).
Flashbulb Memories and Forgetting
A central question for those interested in flashbulb memories is how to best characterize
their long-term retention. When Brown and Kulik (1977) first examined this issue, they initially
simply asked participants, many years after the event occurred, whether they remembered the
circumstances in which they had learned of the event. Brown and Kulik noted, however, that
“the division [between possessing a flashbulb memory and not possessing one] [is] not so
absolute, and, more importantly, within the account there [is] much to interest us” (p. 79). As a
result, they moved beyond their dichotomous measure and also examined specific features of
participants’ reports of reception events, the “who, what, when, and where” of the reception
event. They then investigated how many of these canonical features were or were not contained
in a reported flashbulb memory. For Brown and Kulik, errors of omission (that is, those
instances in which people claimed not to remember a specific canonical feature) provided insight
into “variations as well as constancies …in the content of the reports” (p. 79).
For many researchers, this focus on errors of omission captures only one aspect of what it
means to forget (e.g., see Johnson & Raye, 1981; Winograd & Neisser, 1992). In most tests of
memory, people are said to fail to remember previously studied material not only if they state
that they do not remember, but also if they remember the material inaccurately. That is,
forgetting involves not just errors of omission, but also errors of commission. Brown and
Kulik’s methods focused on errors of omission, and did not assess errors of commission.
To study both errors of omission and errors of commission, psychologists have often
employed a test-retest methodology. Participants are asked as soon as possible after a
consequential, often emotionally charged public event for their memory for the circumstances in
which they learned of the event and then are given follow-up assessments several months or
more afterwards. The follow-up recollections are then compared with the first recollection,
resulting in a consistency score between the first and subsequent assessments. A “forgetting”
curve can then be plotted on the basis of these consistency scores.
Of course, this method does not permit a researcher to assess absolute accuracy, in that
the researcher still does not have access to what actually unfolded during the reception event (see
Curci & Conway, 2012, for a discussion of this concern, and related issues). Moreover, it creates
a situation in which participants might confuse their memory for the reception event with their
memory of their first report, though the number of retests does not appear to affect the level of
consistency (Kvavilashvili, Mirani, Schlagman, Foley, & Kornbrot, 2009). There is also some
concern that the first report is an unreliable index if it is not collected within a day or two after
the event, though in most studies collection within the first week or so appears to be adequate
(Coluccia, Bianco, & Brandimonte, 2006; Julian, Bohannon, & Aue, 2009; Kvavilashvili et al.,
2009; Lee & Brown, 2003; but see Schmidt, 2004; Winningham, Hyman, & Dinnel, 2000).
Hirst, Phelps, Buckner, Budson, Cuc et al., 2009; Kvavilashvili, Mirani, Schlagman, Foley, &
Kornbrot, 2009; Luminet & Curci, 2008; Luminet, Curci, Marsh, Wessel, Constantin, Genocz, et
al., 2004; Paradis, Solomon, Florer, & Thompson, 2004; Pezdek, 2003; Qin, Mitchell, Johnson,
Krystal, Southwick et al., 2003; Schmidt, 2004; Shapiro, 2006; Sharot, Martorella, Delgado, &
Phelps, 2007; Smith, Bibi, & Sheard, 2003; Talarico & Rubin, 2003; Tekcan, Berivan, Gülgöz &
Er, 2003; Weaver & Krug, 2004).
Several conclusions have emerged from this flurry of research. First, consequential and
emotionally charged public events indeed can lead to long-lasting memories for the reception
events. For the retention intervals studied to date, Brown and Kulik’s (1977) characterization
that “nobody forgets” is correct in that, for the studied public events, few people fail to report
that they simply cannot remember the circumstances in which they learned of the event. Second,
these memories can contain both errors of omission and commission. For some researchers,
then, flashbulb memories may be exceptional, in that people continue to report a recollection of
the reception event even after a substantial delay, but they are unexceptional in that they are
replete with errors of omission and commission, just like “ordinary” autobiographical memories
(e.g., Talarico & Rubin, 2003). For us, the mere existence of selective errors of omission and
commission does not disqualify a memory as being classified as “flashbulb,” even if it makes it
“ordinary” (see Curci & Conway, 2012, for a discussion of this point). Like Brown and Kulik
(1977), we also feel that it is necessary to explore the “variations as well as constancies …in the
content of the reports” (p. 79). Consequently, among other things, we will be interested here in
the consistency between initial and subsequent recollections.
There is one exceptional feature of flashbulb memories, besides their long-term retention,
that is widely accepted and deserves mention. Unlike many “ordinary” autobiographical
memories, people are extremely confident in the accuracy of even their inconsistent flashbulb
memories, even after several years have passed (see, for instance, Neisser & Harsch, 1992;
Neisser, Winograd, Bergman, Shreider, Palmer, & Weldon, 1996). For example, a day after the
9/11 attack, Talarico and Rubin (2003) asked participants to record both the circumstances in
which they learned of the attack, as well as another “important” autobiographical event that had
occurred in the last week. In follow-up testing one, six, and 32 weeks after the attack, there were
similar rates of forgetting for both types of memories. On the other hand, whereas confidence in
ordinary autobiographical memories declined over time as the consistency scores declined,
confidence in flashbulb memories remained high, despite the similar rates of decline in
consistency. This finding is important in that it suggests that individuals are not simply “filling
in” missing details about the reception event with guesses as they recollect. Rather, when
responding with an inconsistent memory, they truly believe that they are “remembering” it
accurately (i.e., they are making reality/source monitoring errors, Johnson, Hashtroudi &
Lindsay, 1993).
Factors Affecting Flashbulb Memory Consistency
As more psychologists utilized the test-retest methodology, questions about what makes a
flashbulb memory consistent or inconsistent over the long-term naturally arose. If flashbulb
memories are like “ordinary autobiographical memories,” in that they evidence errors of
omission and commission, then psychologists hypothesized that many of the factors affecting the
retention of “ordinary” autobiographical memories would also affect flashbulb memories, for
example, the emotional intensity of the event, its importance or consequentiality, the degree to
which it is rehearsed, its distinctiveness, the level of surprise associated with it (e.g., Conway,
1995; Conway, Anderson, Larsen, Donneley, McDaniel, McClelland, & Rawles, 1994;
cross-sectional methodology, Bahrick and his colleagues studied forgetting curves for a wide
variety of material for intervals stretching up to 50 years or more (Bahrick, 1983; 1984; Bahrick,
Bahrick, & Whittlinger, 1975; see also Conway, Cohen, & Stanhope, 1991; Squire & Slater,
1975; Stanhope, Cohen, & Conway, 1993). For instance, they asked college alumni ranging in
age from their 20’s to their 70’s to recollect the Spanish vocabulary they learned while in college
or the names of the streets of their college town.
As they pertain to the interests in this investigation, the results of these studies indicate
that (1) there is substantial forgetting of both autobiographical and semantic memories in the first
few years, both in terms of reported recollection and in terms of accuracy, (2) the forgetting rate
asymptotes thereafter, conforming to a power function, (3) there is consistency between what is
recollected at one time, even if incorrect, and what is recollected subsequently, and (4) if a
memory is retained for a certain period of time, it is unlikely to show further forgetting. Bahrick
and his colleagues suggested that memories go into a “permastore” after approximately six years
(Bahrick, 1984). The conclusions, however, are at best limited to “ordinary” autobiographical
memories, or semantic memories. Whether the “permastore” concept is applicable to flashbulb
memories is an open question.
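The shape of a power-function forgetting curve, with steep early loss followed by a near-flat tail, can be illustrated with a short sketch. The parameter values and time points below are hypothetical and chosen only to show the form of the curve; they are not estimates from Bahrick's data or from the present study.

```python
def power_retention(t_months: float, a: float, b: float) -> float:
    """Retention under a power function, a * t**(-b), defined for t > 0."""
    if t_months <= 0:
        raise ValueError("t_months must be positive")
    return a * t_months ** (-b)


# Hypothetical parameters, used only to show the shape of the curve.
A, B = 1.0, 0.2

times = [1, 12, 36, 120]  # months since the event (illustrative values)
retained = [power_retention(t, A, B) for t in times]

# Average loss per month within each successive interval.
loss_per_month = [
    (retained[i] - retained[i + 1]) / (times[i + 1] - times[i])
    for i in range(len(times) - 1)
]

for t, r in zip(times, retained):
    print(f"{t:>4} months: {r:.2f}")
```

Under these assumed values, the average monthly loss shrinks from one interval to the next, which is the sense in which the curve levels off.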
The Present 9/11 Study
Can what has been observed in studies of flashbulb memories with retention intervals of a
few months to three years be extended to longer retention intervals, such as ten years? And can
one extend the studies of long-term memory of “ordinary” autobiographical event and semantic
memories to long-term flashbulb memories and their associated event memories? With
questions such as these in mind, within a week of the attack of September 11, 2001, the authors
of this paper distributed a survey at locations around the United States and asked about the
circumstances in which participants learned of the attack, facts about the attack (e.g., how many
planes were involved), characteristics of the memories participants held (e.g., confidence
ratings), and the way participants reacted to the news, among others. We then followed up with
similar questions 11 months, 35 months, and finally 119 months after the attack. (We always
tested participants one month before an anniversary of the attack.)
Several years ago, we offered a preliminary three-year report (Hirst et al., 2009). The
findings fell into three general categories. First, we examined the consistency of flashbulb
memories and their associated level of confidence, as well as the accuracy of event memories.
We observed a decline in the consistency of the reported flashbulb and event memories over the
three-year period, even as confidence ratings remained high. Most of the forgetting occurred in
the first year. The dissociation between consistency and confidence ratings is consistent with
several studies, including Talarico and Rubin’s (2003) 9/11 study. The finding that the level of
consistency seemed to flatten out after a year is consistent with Kvavilashvili et al. (2009), who
found, again for 9/11, similar levels of consistency between their two-year retest and the final
three-year retest. Our results differed from Schmolck, Buffalo, and Squire’s (2000) study of
flashbulb memories for the O.J. Simpson verdict. These researchers showed little forgetting after
15 months, but more dramatic forgetting after 32 months, suggesting that memory distortions
built up over time. They attributed this build-up largely to source monitoring errors.
The second class of results in our previous report focused on predictions of consistency,
examining: (1) residency at the time of the attack, (2) the level of emotional intensity, (3) the
personal loss or inconvenience, (4) the amount of media attended to immediately after the attack
and in the ensuing period, and (5) the amount of discussion with others about the attack (referred
to as ensuing conversation). None of these were correlated with flashbulb memory consistency.
as short as three years, but may be observed after a ten-year delay. (3) Although forgetting
seemed to be slowing in Hirst et al., there was not an additional time interval to assess whether
memories had become resistant to forgetting, as Bahrick’s (1984) work suggests they eventually
should. (4) Declines in confidence that are not detected after a short retention interval may be
detected after a longer retention interval. (5) The factors determining the shape of the forgetting
curves for the first three years may differ from those associated with forgetting over longer
retention intervals. (6) Inconsistencies may be repeated once, but may not continue to be
repeated.
With these considerations in mind, we distributed a survey similar to the one we used at
Year 3 to previous respondents. We also collected an additional “new” sample of individuals
who had never participated in the project.
Method
Because we have described the method as it applied up to Year 3 in detail elsewhere
(Hirst et al., 2009), we focus here on information relevant to our ten-year follow-up.
Participants, Recruitment, and Procedure
For the first three surveys, we recruited participants between September 17, 2001 and
September 21, 2001 (Survey 1), August 5 and August 20, 2002 (Survey 2), and August 9 and
August 20, 2004 (Survey 3). Recruitment for Survey 4 took place between August 1, 2011 and
August 15, 2011. The original recruitment was done in seven locations: Boston and Cambridge,
MA; New Haven, CT; New York, NY; Washington, DC; St. Louis, MO; Palo Alto, CA; Santa
Cruz, CA. As a result, follow-up participants in subsequent surveys came from the same areas,
though in many instances they no longer lived in these areas. The one exception is Boston,
which used a slightly different procedure when recruiting participants on Survey 1 and hence did
not figure in the follow-up surveys reported here. Over the first three surveys, we had a total of
3,246 participants. For Survey 4, we tried to contact all 3,246 participants, either through postal
mail or email. When an email or a postal envelope was returned, we searched through the web to
find additional means of reaching a respondent, using, in the main, Facebook, Google+, and
LinkedIn. Although we could not associate contact information with a particular survey, codes
that participants generated allowed us to connect the responses of a given individual across the
surveys they filled out. Some participants claimed that they had previously participated, but they
supplied an incorrect ID. Attempts were made to find a match by examining handwriting,
demographic information, and so on. The participants in the No Match category reflect those for
whom a match could not be found at any point in the project. In the end, 202 participants took
all four surveys. These participants will be our main focus of interest.
As shown in Table 1, 52% of those who had completed Surveys 1, 2, and 3 returned
Survey 4, a good return rate for studies of this kind (Baruch, 1999). Of the 2,117 participants
who returned Survey 1, 10% participated at every stage of the ten-year
project. This return rate is reasonable, given the length of the project, the difficulty in keeping
track of people over such a long time period, the extensive nature of the survey, and the fact that
we did not compensate participants for their efforts. Although it involved participants across the
United States, our sampling should not be viewed as representative of the American public.
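The overall retention figure follows from a simple proportion. The sketch below uses only the two counts stated in the text (2,117 Survey 1 respondents and 202 participants who completed all four surveys); the function name is illustrative and not part of the study's materials.

```python
def return_rate(completed: int, eligible: int) -> float:
    """Proportion of eligible participants who completed a later stage."""
    if eligible <= 0:
        raise ValueError("eligible must be positive")
    return completed / eligible


# Counts reported in the text.
survey1_respondents = 2117
four_survey_completers = 202

rate = return_rate(four_survey_completers, survey1_respondents)
print(f"{rate:.1%}")  # about 9.5%, which the text reports as 10%
```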
We undertook several dropout analyses. For the features of age, gender, residence at the
time of the attack, student membership at the time of the attack, political affiliation,
race/ethnicity, and religion, we found no differences between those who took all four surveys
and (1) those who only took Survey 1 and (2) those participants who took Surveys 1 through 3,
but not Survey 4 (ps > .20). The one dimension on which we did find a dropout effect concerned
Demographic information can be found in Table 2. Although the match was not perfect,
the New sample resembled the Four-Survey sample to a substantial degree. Analyses of how
responses to the questions in the survey differed across these demographic categories are beyond
the scope of the present report.
Surveys
The content of Survey 4 was similar to the content of Surveys 2 and 3, with small
changes made to reflect the time Survey 4 was taking place. For example, Survey 2 asked
questions that began “In the last year,….”; Survey 3 asked “In the last three years,…”; Survey
4’s questions began, “In the last seven years, ….” Inasmuch as Survey 4 was administered one
and a half months before the 10th anniversary of the attack, for some questions, such as those
involving media attention, we also began several questions with “In the last few months, ….”
Inasmuch as our findings did not differ for the “seven-year” questions and the “last few months”
questions, we do not report the results for the “last few months” questions here. Each survey
was approximately 17 pages long (when printed) and took about 45 minutes to complete. Copies
can be found at http://911memory.nyu.edu. Although the surveys explored a variety of topics,
we focus here on those relevant to the formation and retention of flashbulb and event memories.
As Table 3 indicates, questions fell into three categories: (1) specific memories for the
circumstances in which one learned of the attack, focusing here on six canonical features; (2)
specific memories of the event itself, focusing on five different “facts” (e.g., number of planes,
location of President Bush at the time of the attack); and (3) ratings related to factors affecting
performance. We examined five potential factors, as in Hirst et al. (2009). With respect to the
questions about the canonical features of flashbulb memories, we followed each probe with one
of two follow-up questions. Half of the potential participants were sent a questionnaire assessing
their confidence in their response [with the question “How confident (on a 1 – 5 scale) is your
recollection?”]. The other half received a questionnaire assessing how well they thought that
they could remember the information in the future [“In 10 years, how well (on a 1 – 5 scale) do
you think you will remember this?”]. Ninety-two of the 202 participants in our Four-Survey
sample returned the confidence version; 110, the forecasting version. We also asked questions
pertinent to what kinds of relevant media participants had been exposed to over the ten-year
period (e.g., did you see the film Fahrenheit 9/11, the film United 93?). In what follows, we
present data from all 202 participants, except when discussing confidence. For these discussions,
we confine ourselves to the 92 participants who received the confidence questions.
Coding
A detailed coding manual was developed for Survey 1 and subsequently adapted for
Surveys 2 – 4 (see Hirst et al., 2009, for details; the manuals can be found at
http://911memory.nyu.edu). There were two general coding schemes. For some questions, there
might be multiple ways to respond, but only a single response was required. For instance, when
asked “How did you first learn of the attack?” participants could respond, for example, with
“TV,” “Radio” or “Telephone,” but only one of these devices played a role when they first
learned of the attack. The coding scheme listed alternatives [specifically, TV, Radio, E-
mail/Instant message, Phone call (including Voice messages), Visual sighting, Word of mouth,
Sounds/screams, Sirens, Other, Not stated]. Using a number system, the coding indicated which
of these options best fit the participant’s response. The second coding scheme allowed for
multiple responses. For instance, more than one city was the target of the attack. As a result,
coders recorded all the responses a participant gave to the question “In the vicinity of which
cities did the airplanes end up?” To assess interrater reliability, we randomly selected 10% of
errors of commission? And given our interest in long-term retention, was the level of forgetting,
in this sense, the same, greater, or lesser for Survey 4 as it was for Survey 3?
Consistency. Inasmuch as all the flashbulb memory questions required a single response,
a response on Surveys 2 through 4 was considered consistent with the response offered on
Survey 1 if the coding matched. For example, for the question “How did you first learn about
the attack?”, if a participant stated that she learned from the radio on Survey 1 and from TV on
Survey 4, her Survey 4 response was scored as a “0,” that is, inconsistent. If the Survey 4
response had been “radio,” it would have been scored as a “1,” that is, consistent. The total
consistency score was the average of the six questions with which we probed the canonical
features of flashbulb memories, producing a value from 0 to 1 (again, see Table 3 for the
canonical features). Our scoring method differs slightly from others, such as Neisser and Harsch
(1992), who used a three-point (0 – 2) rather than a two-point (0 or 1) classification scheme. For
Neisser and Harsch, responses were either absent or inconsistent (0), or consistent with different
degrees of specificity (1 or 2). We collapsed their two consistency scores into one single value.
Our measure, then, is more likely to emphasize the consistency of a response than would
Neisser-Harsch, in that the latter method could elicit a lower score, relative to the entire scale,
than ours would if a consistent response was not very specific. In terms of the pattern of results,
as opposed to absolute values, Hirst et al. (2009) found no difference between the Neisser-
Harsch scoring and the present one for the Three-Survey sample. An examination of 10% of the
Four-Survey sample produced equivalent similarities. Figure 1a contains the results across the
four surveys for those who completed all four surveys, using our coding scheme.
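The scoring rule described above can be sketched as follows. This is a minimal illustration, not the project's coding software: the feature names and response codes are hypothetical placeholders (the actual six canonical features are listed in Table 3), and treating a missing follow-up response as inconsistent is an assumption made here for simplicity.

```python
def consistency_score(baseline: dict[str, str], followup: dict[str, str]) -> float:
    """Average of per-feature 0/1 matches between baseline and follow-up codes."""
    if not baseline:
        raise ValueError("baseline must contain at least one coded feature")
    matches = [
        # 1 = consistent with Survey 1, 0 = inconsistent.
        # Assumption: a feature missing from the follow-up counts as 0.
        1 if followup.get(feature) == code else 0
        for feature, code in baseline.items()
    ]
    return sum(matches) / len(matches)


# Hypothetical coded responses for one participant (placeholder features).
survey1 = {
    "how_learned": "Radio",
    "place": "Home",
    "activity": "Eating breakfast",
    "informant": "Roommate",
    "emotion": "Shock",
    "aftermath": "Called family",
}
survey4 = dict(survey1, how_learned="TV", emotion="Fear")

score = consistency_score(survey1, survey4)
print(round(score, 2))  # 4 of the 6 features match
```

Because each feature contributes either 0 or 1, the score falls between 0 and 1, as in the text.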
Given the .63 measure of consistency observed for Survey 2 in Figure 1a (that is, on
average, 2.22 of the 6 recalled canonical items in Survey 2 were inconsistent with what was
reported on Survey 1), one can reasonably state that the flashbulb memories of our participants
after one year did not reflect to a notable degree what they reported in the first week. This
forgetting slowed in the next two years, with a decline of just .07 in the consistency score
between Survey 2 and Survey 3. The level of forgetting stabilized between the third and tenth
year, with a non-significant improvement of .03 in consistency scores between Survey 3 and
Survey 4. We conducted a repeated-measures ANOVA with Time as a within-subject factor and
the total consistency score as the dependent measure. There was a main effect for Time, F(2, …),
η² = .08. In follow-up analyses, we found a significant difference
between the total consistency scores for responses on Survey 2 and those reported on Survey 3,
t(201) = 4.07, p < .001, d = .33, CI[.03,.09], but not between Survey 3 and Survey 4, t(201) =
1.92, p = .06, d = .16, CI[-.07,.00]. Interestingly, the consistency scores on Survey 2 appear to
be comparable to those reported by Kvavilashvili et al. (2009) and Talarico and Rubin (2003),
though exact comparison is difficult because the scoring procedures and probes differ across the
three studies. Kvavilashvili et al., for instance, used Neisser and Harsch’s (1992) weighted
average score, which could range from 0 to 7. After three years, participants scored, as read
from the graph they present, on average, approximately 5.00, or, if we used the top score as a
denominator, a consistency score of approximately .71. One difference between Neisser and
Harsch’s scoring and ours is that we treat emotional response as equally important as the other
canonical factors, but as shown by Hirst et al. (2009) and Koppel, Winkel, and Hirst (2014),
consistency scores for emotional response can be quite low. If we exclude emotional responses
from the current calculations, we obtain a consistency score of .65 (SD = .24) for Survey 4. It is
unlikely that the stabilization between Surveys 3 and 4 is the result of a floor effect.
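Two conversions used in this paragraph can be checked with a few lines of arithmetic: turning a mean consistency score into an expected number of inconsistent features, and rescaling a raw score by the top of its scale. The function names are illustrative; the inputs are the values quoted in the text.

```python
N_FEATURES = 6  # canonical features probed per survey


def inconsistent_items(score: float, n_features: int = N_FEATURES) -> float:
    """Average number of features inconsistent with Survey 1."""
    return n_features * (1 - score)


def rescale(raw: float, max_score: float) -> float:
    """Express a raw score as a proportion of the scale maximum."""
    return raw / max_score


print(round(inconsistent_items(0.63), 2))  # 2.22 of 6 items
print(round(rescale(5.0, 7.0), 2))         # 0.71 on a 0-to-7 scale
```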
People are more likely to forget, rather than remember, most of the events of their lives,