Jessica Wang
BMGT230B, 0207
Chapter 4: Displaying and Describing Categorical Data
• Explanatory variable: the “cause”, or predictor, available variable
• Response variable: the “effect”, or predicted, interesting variable
• Frequency table: displays the counts for each category and records totals and
category names
• Relative frequency table: displays the percentage of values in each category
rather than the counts, along with the category names and the total percentage
• Area principle: the area occupied by a part of the graph should correspond to the
magnitude of the value it represents
• Bar chart: displays the distribution of a categorical variable, showing the counts
for each category next to each other for easy comparison
• Relative frequency bar chart: displays the percentages instead
• Pie chart: shows the whole group. Sizes of the slices are proportional to the
fraction of the whole
• Categorical Data Condition: the data are counts or percentages of individuals in
categories
• Simpson’s paradox: a phenomenon that arises when averages or percentages are
taken across different groups, and these group averages appear to contradict the
overall averages. An association or comparison that holds within each of several
groups can reverse direction when the data are combined (aggregated) to form a
single group
• Contingency Table: a two-way table that shows how the individuals are
distributed along each variable depending on, or contingent on, the value of the
other variable
o Marginal distribution: the distribution of one variable alone, given by
the totals in the margins of a contingency table
o Cell: the intersection of a row and column of the table
o Conditional distribution: shows the distribution of one variable for just
those cases that satisfy a condition on another
• When the distribution of one variable is the same for all categories of another, the
variables are independent
• Segmented bar chart: treats each bar as the whole and divides it proportionally
into segments corresponding to the percentage in each group
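
Worked example (not from the chapter): the contingency table, marginal distribution, conditional distribution, and Simpson’s paradox definitions above can be checked with a small Python sketch. The counts are invented for illustration only, and the names (counts, success_rate, within, combined) are hypothetical. Treatment A has the higher success rate within each group, yet treatment B looks better once the groups are aggregated.

```python
# Hypothetical counts, invented for illustration only.
# counts[group][treatment] = (successes, failures)
counts = {
    "easy": {"A": (9, 1), "B": (80, 20)},
    "hard": {"A": (30, 70), "B": (2, 8)},
}


def success_rate(successes, failures):
    """Conditional relative frequency of success within one row of a table."""
    return successes / (successes + failures)


# Conditional distributions: success rate for each treatment within each group.
within = {
    group: {t: success_rate(*cells) for t, cells in table.items()}
    for group, table in counts.items()
}

# Aggregate the groups into one contingency table (treatment x outcome).
combined = {}
for table in counts.values():
    for t, (s, f) in table.items():
        prev_s, prev_f = combined.get(t, (0, 0))
        combined[t] = (prev_s + s, prev_f + f)

# Marginal distribution of treatment: the row totals of the combined table.
marginal_treatment = {t: s + f for t, (s, f) in combined.items()}

# Success rate for each treatment after aggregation.
overall = {t: success_rate(*cells) for t, cells in combined.items()}

a_wins_in_every_group = all(r["A"] > r["B"] for r in within.values())
b_wins_overall = overall["B"] > overall["A"]

print("Within groups:", within)
print("Combined table:", combined)
print("Marginal totals:", marginal_treatment)
print("Overall:", overall)
print("Simpson's paradox present:", a_wins_in_every_group and b_wins_overall)
```

The reversal happens because the group sizes are very unequal: most of A’s cases are in the hard group and most of B’s are in the easy group, so the aggregated percentages mostly reflect which group each treatment was used on.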