Lecture Notes on Simpson's Paradox | MATH 243, Assignments of Probability and Statistics

Material Type: Assignment; Class: + Dis >4; Subject: Mathematics; University: University of Oregon; Term: Unknown 1989;

Uploaded on 07/29/2009

koofers-user-1qj
koofers-user-1qj 🇺🇸

10 documents

1 / 3

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
MATH 243, LECTURE 7
1. Simpson’s paradox
We will not be covering most of the material from chapter 6. But it is useful to be aware of Simpson’s
paradox.
Fact 1 (Simpson’s paradox). It is possible for one individual to outperform another in every category
measured yet to not perform as well in the aggregate.
Example 2. Let us look at flight delays for two of our local carriers, Alaska Airlines and America West
Airlines, the former of which has a hub in Seattle, the latter in Phoenix. At their hubs we have the following
data:
       Alaska on time   Alaska delayed   AW on time   AW delayed
PHX               221               12         4840          415
SEA              1841              305          201           61
Calculate the on-time percentage of each airline at each airport. Calculate the on-time percentage over
both airports. Explain what you see.
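
One way to carry out the calculation requested in Example 2 is sketched below in Python. The counts are copied from the table; the dictionary layout and the helper names `on_time_pct` and `summarize` are choices made for this illustration and are not part of the original notes.

```python
# (on-time, delayed) flight counts copied from the table in Example 2.
flights = {
    "Alaska": {"PHX": (221, 12), "SEA": (1841, 305)},
    "America West": {"PHX": (4840, 415), "SEA": (201, 61)},
}


def on_time_pct(on_time, delayed):
    """Percentage of flights that were on time."""
    return 100.0 * on_time / (on_time + delayed)


def summarize(airports):
    """Return per-airport percentages and the percentage over all airports."""
    per_airport = {name: on_time_pct(ot, d) for name, (ot, d) in airports.items()}
    total_on_time = sum(ot for ot, _ in airports.values())
    total_delayed = sum(d for _, d in airports.values())
    return per_airport, on_time_pct(total_on_time, total_delayed)


results = {airline: summarize(airports) for airline, airports in flights.items()}

for airline, (per_airport, overall) in results.items():
    for airport, pct in per_airport.items():
        print(f"{airline:13s} {airport}: {pct:5.1f}% on time")
    print(f"{airline:13s} both airports: {overall:5.1f}% on time")
```

With these counts, Alaska has the higher on-time percentage at each airport (about 94.8% against 92.1% at PHX, and 85.8% against 76.7% at SEA) but the lower percentage over both airports (about 86.7% against 91.4%). Most Alaska flights in the table are at SEA, where both airlines do worse, while most America West flights are at PHX.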
It is simple to see how Simpson’s paradox works if we look at a simple enough example. Suppose Dick
and Jane both take MA 243 and (somehow!) negotiate different weightings to compute their
final grades. Dick has HW count 10% and the final exam 90%, and Jane has HW count 90% and the
final exam count 10%. They get A and A-, respectively, on their HW, and C and C- on their final exams,
respectively. So Dick has scored better on both. But he ends up with a C+ and Jane with a B+.
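
The arithmetic behind the Dick and Jane example can be checked with a short Python sketch. The notes give only letter grades, so the 4.0-scale grade points below (A = 4.0, A- = 3.7, C = 2.0, C- = 1.7) are an assumption made for illustration, as is the helper name `final_score`.

```python
# Assumed 4.0-scale grade points; the notes only give letter grades.
GRADE_POINTS = {"A": 4.0, "A-": 3.7, "C": 2.0, "C-": 1.7}


def final_score(hw_grade, exam_grade, hw_weight):
    """Weighted average of homework and final-exam grade points."""
    exam_weight = 1.0 - hw_weight
    return hw_weight * GRADE_POINTS[hw_grade] + exam_weight * GRADE_POINTS[exam_grade]


dick = final_score("A", "C", hw_weight=0.10)    # HW 10%, final exam 90%
jane = final_score("A-", "C-", hw_weight=0.90)  # HW 90%, final exam 10%

print(f"Dick: {dick:.2f}  Jane: {jane:.2f}")
```

Under these assumed grade points Dick's score is 2.2, close to a C+, and Jane's is 3.5, in the B+ to A- range. Dick is ahead on each component, yet Jane is ahead overall because her weight sits on the component where both scored well.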
Some useful terminology, if thinking in terms of percentages: there are two ways to take an average,
a weighted average, which depends on the sample sizes, and a “straight” average of percentages (which
really is not so straight). The weighted average is the one that gives the true overall percentage, but it
is susceptible to Simpson’s paradox. A “straight” average is not susceptible to the paradox, but the final
answer depends on the categories into which the data have been broken down.
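
The two kinds of average can be compared on the flight data from Example 2. The following Python sketch is one possible illustration; the function names `straight_average` and `weighted_average` are chosen here and do not come from the notes.

```python
# (on-time, delayed) counts per airport, copied from Example 2.
alaska = [(221, 12), (1841, 305)]
america_west = [(4840, 415), (201, 61)]


def straight_average(groups):
    """Unweighted mean of the per-group on-time percentages."""
    pcts = [100.0 * ot / (ot + d) for ot, d in groups]
    return sum(pcts) / len(pcts)


def weighted_average(groups):
    """Mean of the per-group percentages, weighted by group size.

    This equals total on-time flights divided by total flights.
    """
    sizes = [ot + d for ot, d in groups]
    pcts = [100.0 * ot / (ot + d) for ot, d in groups]
    return sum(p * n for p, n in zip(pcts, sizes)) / sum(sizes)


for name, groups in [("Alaska", alaska), ("America West", america_west)]:
    print(f"{name:13s} straight: {straight_average(groups):5.1f}%  "
          f"weighted: {weighted_average(groups):5.1f}%")
```

On this data the straight average keeps the per-airport ordering (Alaska ahead), while the weighted average reverses it. Regrouping the same flights into different categories would generally change the straight average but not the weighted one.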
2. Looking at how data is produced
So far we have taken data as given and analyzed it.
For single variables, we have found the mean, median, and quartiles, and seen the standard deviation. If the variable
is normally distributed, we can find answers to more detailed questions about percentiles.
For many variables, we have taken them two at a time and compared them through scatterplots. Using
the value r and the regression line, we have looked for positive and negative correlations.
But so far we have not questioned how good our data is. That is an important question since data
analysis, like any analysis, obeys the maxim “garbage in, garbage out.” We now start to talk about
generating reliable data.
3. Sampling
Suppose you have a question you wish to answer about a large population. For example,
  • What percent of Americans think George Bush is doing a good job?
  • What percentage of Americans know that the Earth orbits the sun in a year?
  • What percentage of Oregonians are obese?
  • What percent of people with headaches are helped by aspirin?

Gathering data to answer these questions can be problematic. First of all, for questions such as the last, the answers can vary in meaning from person to person. But in all of these cases, there are simply too many people to find out the answer for each one. We will see that carefully taking data from a subset, called “sampling,” is the best we can do and can sometimes answer these questions well.
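
The idea that a carefully chosen subset can stand in for the whole population can be illustrated with a small simulation. Everything in this Python sketch is synthetic: the population size, the 30% "yes" rate, and the sample size of 1000 are made-up values chosen only for illustration, and drawing a simple random sample is just one assumed reading of "carefully."

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

# Synthetic population: True means "yes". The 30% rate is a made-up value.
TRUE_PROPORTION = 0.30
population = [rng.random() < TRUE_PROPORTION for _ in range(100_000)]

# Simple random sample: every individual is equally likely to be chosen.
sample = rng.sample(population, 1000)

population_pct = 100.0 * sum(population) / len(population)
sample_pct = 100.0 * sum(sample) / len(sample)
print(f"population: {population_pct:.1f}%  sample of 1000: {sample_pct:.1f}%")
```

The sample percentage typically lands within a few points of the population percentage even though only 1% of the individuals were asked. This holds because the sample was drawn at random; a sample chosen in a biased way carries no such assurance.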

3.1. Gathering data. There are two basic methods of gathering data:

Definition 3. In an observational study one observes individuals and measures variables of those individuals.

If the study is to lead to conclusions about the overall population, there are two things that must be considered:

  • Does the sample of individuals reflect the overall population?
  • Is the measure of the interesting variables accurate?

Example 4. Phoning 1000 randomly chosen residential phone numbers during the workday, one asks for the values of two variables: age and how many hours of TV the subject watches per day. If the phone is not answered, one calls the next number. What are possible problems?

Understanding what problems may arise in collecting data is an art best learned in the discipline or setting in which you are working.

Definition 5. In an experimental study one treats a group of individuals in a particular way with the goal of discovering the effect of that treatment.

Experimental studies can be fraught with difficulties. If this study is to lead to conclusions about the efficacy of a treatment one must

  • Make certain that there is a group of untreated individuals (a control group) with which to compare the treated individuals.
  • Make sure that there isn’t a lurking variable distinguishing between treated and untreated individuals.

Example 6. The “placebo effect” and other bias in medical studies, and the need for “double-blind” protocols.

We will discuss this further later. Summarizing our discussions of data collection: To get good conclusions, one must be careful about how one is gathering data and what might be biasing the sample.

Example 7. A research firm phones 100 clients of acupuncturists to ask if their treatment has improved their health. What can one learn from this experiment about the efficacy of acupuncture? (Efficacy vs. satisfaction)

If you really want to measure acupuncture’s efficacy, you need to take a group of people and randomly assign half of them to get treated by acupuncture, and half to be untreated (as a control group if you want to compare acupuncture to no treatment) or treated by western medicine (as a control group if you want to compare acupuncture to western medicine). This study would be even better if the patients didn’t know whether they were being treated via acupuncture or in the control group.
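
The random assignment described above might be carried out as in the following Python sketch. The subject labels, the group size of 20, and the function name `assign_groups` are hypothetical and serve only to illustrate the procedure.

```python
import random


def assign_groups(subjects, seed=None):
    """Randomly split subjects into treatment and control groups of equal size."""
    rng = random.Random(seed)
    shuffled = list(subjects)  # copy so the caller's list is left unchanged
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical subject labels, for illustration only.
subjects = [f"subject-{i:02d}" for i in range(1, 21)]
treatment, control = assign_groups(subjects, seed=1)

print("treatment:", sorted(treatment))
print("control:  ", sorted(control))
```

Because chance alone decides who lands in which group, no characteristic of the subjects (age, health, enthusiasm for acupuncture) can systematically separate the treated from the untreated.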

Example 8. A local Eugene TV station asks callers to phone in during the news to say if they favor the Whole Foods development deal. What can one learn from this poll?