Prepare for your exams
Get points
Guidelines and tips
Sell on Docsity
Docsity AI

Prepare for your exams

Study with the several resources on Docsity

Earn points to download

Earn points by helping other students or get them with a premium plan

Guidelines and tips

Sell on Docsity

Docsity AI

Prepare for your exams

Study with the several resources on Docsity

Find documents

Prepare for your exams with the study notes shared by other students like you on Docsity

Search for your university

Find the specific documents for your university's exams

Docsity AINEW

Summarize your documents, ask them questions, convert them into quizzes and concept maps

Explore questions

Clear up your doubts by reading the answers to questions asked by your fellow students

Earn points to download

Earn points by helping other students or get them with a premium plan

Share documents

20 Points

For each uploaded document

Answer questions

5 Points

For each given answer (max 1 per day)

All the ways to get free points

Get points immediately

Choose a premium plan with all the points you need

Study Opportunities

Choose your next study program

Get in touch with the best universities in the world. Search through thousands of universities and official partners

Community

Ask the community

Ask the community for help and clear up your study doubts

Free resources

Our save-the-student-ebooks!

Download our free guides on studying techniques, anxiety management strategies, and thesis advice from Docsity tutors

Contingency Tables - Lecture Notes | STAT 102, Study notes of Statistics

University of Pennsylvania (UPenn)Statistics

Material Type: Notes; Class: INTRO BUSINESS STAT; Subject: Statistics; University: University of Pennsylvania; Term: Unknown 1998;

Typology: Study notes

Pre 2010

Uploaded on 03/28/2010

koofers-user-83o 🇺🇸

10 documents

1 / 23

This page cannot be seen from the preview

Don't miss anything!

Lecture 23 Contingency Tables

STAT 102

 Stat 102 has concentrated on analysis of explanatory and

predictive models with one dependent variable (Y) and one or

more predictive variables or factors (often denoted by X).

 Data Structures examined include situations described as:

Discover Study notes of Statistics University of Pennsylvania (UPenn)

Partial preview of the text

Download Contingency Tables - Lecture Notes | STAT 102 and more Study notes Statistics in PDF only on Docsity!

Lecture 23 Contingency Tables

STAT 102

 Stat 102 has concentrated on analysis of explanatory and predictive models with one dependent variable ( Y ) and one or more predictive variables or factors (often denoted by X ).  Data Structures examined include situations described as:

 Ordinary linear regression: Y continuous ; One continuous x -variable.  Multiple linear regression: Y continuous ; Several continuous x -variables  Polynomial regression: Like multiple regression, but with some polynomial terms  ANOVA: Y continuous ; One or more qualitative X-factors  Regression and ANOVA combined:  Logistic Regression Y qualitative ; One or more continuous X-factors (We also included a qualitative X-factor (sex) in one example)

We haven’t yet talked about ….

Example Wine and Music: Do shoppers buy more wine when appropriate music is played? Data: A supermarket in N. Ireland played no background music, French music, or Italian music at random times of the day. Here is the number of purchases for different combination of type of music and type of wine: 2-way table of results

Type of Wine By Type of Music Count (^) French Italian None Total French 39 30 30 99 Italian 1 19 11^31 Other 35 35 43 113 Total 75 84 84 243

Taken from Moore and McCabe, Introduction to the Practice of Statistics, 5th^ edition. Original source, Ryan, Northrup-Clewes, Knox and Thurnham (1998). The effect of in-store music on consumer choice of wine. Proc. Nutrition Soc. 57. p1069.

Test of Independence Null Hypothesis: In words -- H 0 : Row effects are independent of Column effects Symbolically – The (population) probabilities for the row effects (Wine) are

denoted by^1 2 3

def for ,

P BuyingFrenchWine Similar

The probabilities for the column effects (Music) are 1 2 3

, are similarly defined

 P FrenchMusic

H 0 is that the probability of a count in the i , j cell is the product, for example Pr (^)  Frenc h& None (^)   Pr count in cell 1 , (^3)     1 3  P 13. Ha is that H 0 is not true.

Computations (cont) We can put this together to get estimates for what the cell counts should be if H 0 is true. The expected (under H 0 ) cell counts would be Expected ij  Eij  N (^)  P ˆ ij. Pearson’s Chi-Square statistic for testing H 0 is

2 ij ij ij (^) ij

N E

E

χ^2  .

Under H 0 it has a Chi-Square distribution with

df   I  1  J  1 

In our example, df  2  2  4. Your book has tables of the Chi-Square distribution, or use JMP. We REJECT when χ^2  C , with C as found in the table.

Computations (summary)

 Get the table. It has I rows and J columns and entries Nij.

 Calculate Eij , the Expected (under H 0 ) cell counts. Looking back

we see they have the formula j ij E Ni^ j N Ni N N

N

  

    

 

 ^.

 Calculate

2 2 Observed Expected Expected

ij ij ij ij ij (^) ij ij

N E

E

χ^2    .

 Use df   I  1  J  1 . Get the critical value, C , from the Chi-

Square table with this df. Reject whenever χ^2  C.

Analysis in JMP (cont) JMP also gives the value of (^) χ^2 , and the resulting P-value: Test ChiSquare Prob>ChiSq Likelihood Ratio 21.875 0. Pearson 18.279 0. If you calculated by hand you would get this from the table on p 9 as 18.279  2.3337  7.6724  .0004  .5209 ....

( You can ignore the Likelihood Ratio statistic. It belongs to a different statistical method. It comes from a different set of formulas than Pearson’s Chi-Square. The P-values from the two tests are nearly always very similar. )

The result of this calculation is that we Reject H 0. (We do so at any reasonable level of significance since P-value = .0011). This should be followed by a summary of the nature of the effects we have detected.

Nature of the Effects We have Rejected H 0. Therefore we believe choice of wine and nature of music are related to each other. What is the relation? There are formal ways to investigate this [see the Appendix to this lecture] , but for Stat 102 we recommend looking at the numbers and use common-sense. You can look at the cell counts (as on p6), and/or at the cell Chi- Square values (as on p9) and/or at row or column % (as below). Used carefully, any of these looks should yield similar conclusions. Here is the table of counts and column %:

“When French music is playing customers seem very much less likely to purchase Italian wine than they are to purchase Italian wine when Italian music is playing.”

Also (from the last two columns ) “There doesn’t seem to be much difference in distribution of wine choice between times when Italian music is playing and when no music is playing. (There’s a slightly greater % of people choosing Italian wine when Italian music is playing, but it’s not a large difference.)”

Second Example Income and Job Satisfaction About 50 years ago, a classic sociological study looked at the relationship between a worker’s salary and their job satisfaction. One two-way table from this study looked at the results from a survey of worker’s income and a questionnaire judging their job satisfaction. Annual job INCOME was divided into 4 categories <$6,000, $6,000 - $15,000, $15,000 - $25,000, >$25,000. Their SATISFACTION was graded into four categories 1 = very dissatisfied, 2 = somewhat dissatisfied, 3 = somewhat satisfied, 4 = very satisfied. [By accident, both examples in this lecture have square contingency tables. But that’s not necessary; rectangular tables are also common].

Data table and results of the analysis follow:

An Aside The Mosaic Plot in JMP provides a very nice visualization for this data: Mosaic Plot

Satisfaction

(1) 0-6 (2) 6- (3) 15-25(4) 25+ Income

You can clearly see how the % of very dissatisfied workers goes down as earnings go up, and the % of very satisfied workers increases with income.

Pearson Chi-Square Test of Independence

Test of H 0 : Satisfaction is independent of Income

df = 33 = 9.

Results from JMP:

Test ChiSquare Prob>ChiSq Likelihood Ratio 12.037 0. Pearson 11.989 0. The P-value is 0.2140. We cannot reject the null hypothesis.

[We have to regretfully conclude that numbers like those in the table might well have occurred by chance. BUT note that not only do we have numbers like these, but they occur in a very obvious pattern. Our test isn’t looking for any special pattern. Might there be a better test to use on this data? See the Appendix .]

GLM (log-link) Wine and Music Data

Whole Model Test Model - LogLikelihood L-R ChiSquare DF Prob>ChiSq Difference 38.99 77.98 8 <. Full 21. Reduced 60. Effect Tests Source DF L-R ChiSquare Prob>ChiSq Wine 2 68.33 <. Music 2 11.31 0. Music*Wine 4 21.88 0.

These tables show that various things we can see in the table are statistically significant. In particular, the interaction is significant.

It is this interaction term that describes the dependence of Wine and Music. The significance of this term is another confirmation of the result of our earlier Pearson Chi-Square test.

Parameter Estimates This analysis yields a parameter estimates table that one can use to estimate the values of relevant quantities.

For example one can estimate a quantity like E  Log (^)  N (^) 12 N  2 ,

and hence also E N  (^) 12 N  2 . This corresponds to the proportion of

Italian Music listeners who purchase French wine.

Parameter Estimates Term Estimate Std Error L-R ChiSquare Prob>ChiSq Intercept 2.96 0.13 51.04 <. Wine[French] 0.52 0.14 20.25 <. Wine[Italian] - 1.18 0.24 67.05 <. Music[French] - 0.56 0.24 10.64 0. Music[Italian] 0.34 0.14 7.05 0. Music[French]Wine[French] 0.73 0.25 15.21 <. Music[French]Wine[Italian] - 1.22 0.46 16.53 <. Music[Italian]Wine[French] - 0.42 0.17 7.82 0. Music[Italian]Wine[Italian] 0.83 0.26 15.21 <.

Contingency Tables - Lecture Notes | STAT 102, Study notes of Statistics

Related documents

Partial preview of the text

Download Contingency Tables - Lecture Notes | STAT 102 and more Study notes Statistics in PDF only on Docsity!

Lecture 23 Contingency Tables

We haven’t yet talked about ….

 P FrenchMusic

N E

E

χ^2  .

df   I  1  J  1 

N

N

N

 ^.

N E

E

χ^2    .

 Use df   I  1  J  1 . Get the critical value, C , from the Chi-