Descriptive Statistics and Box-and-Whisker Plot Analysis of Gross Values for Film Data, Exercises of Statistics

Descriptive statistics and graphical analysis of the gross values for an entire film population and a random sample using histograms, box-and-whisker plots, and calculations of central tendency and variation. The analysis reveals the distribution's central tendency, variation, and potential outliers.

Typology: Exercises

2020/2021

Uploaded on 05/23/2022

bulrush1007



IA-1

(c) Descriptive statistics, histogram and box-and-whisker plot for “Gross”

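Descriptive statistics for entire population

| Variable | N | N* | Mean | SE Mean | StDev | Minimum | Q1 | Median | Q3 | Maximum |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Gross | 331 | 0 | 31671589 | 2159782 | 39293782 | 1065431 | 7487370 | 18552460 | 37771017 | 306137970 |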

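Descriptive statistics for random sample

| Variable | N | N* | Mean | SE Mean | StDev | Minimum | Q1 | Median | Q3 | Maximum |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Gross | 30 | 0 | 25748158 | 4346332 | 23805840 | 2410162 | 8448098 | 19158752 | 33442303 | 105410874 |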
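As a quick consistency check on the descriptive statistics above, the SE Mean column should equal StDev divided by the square root of N. The following is a minimal Python sketch that uses only the reported figures; the names `reported`, `standard_error` and `se_check` are illustrative and do not come from the original output.

```python
import math

# Values copied from the descriptive statistics reported above.
reported = {
    "population": {"n": 331, "stdev": 39293782, "se_mean": 2159782},
    "sample": {"n": 30, "stdev": 23805840, "se_mean": 4346332},
}


def standard_error(stdev, n):
    """Standard error of the mean: StDev / sqrt(N)."""
    return stdev / math.sqrt(n)


# Relative difference between the recomputed and the reported SE Mean.
se_check = {}
for name, row in reported.items():
    computed = standard_error(row["stdev"], row["n"])
    se_check[name] = abs(computed - row["se_mean"]) / row["se_mean"]
    print(f"{name}: computed {computed:.0f}, reported {row['se_mean']}")
```

Both relative differences should be tiny, since the reported StDev values are themselves rounded to whole numbers.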

Box-and-whisker plot for entire population

Box-and-whisker plot for random sample

Histogram for entire population
*The graph is based on a range of 0 to 320000000 and a class width of 10000000.

Histogram for random sample
*The graph is based on a range of 0 to 120000000 and a class width of 10000000.
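The class boundaries implied by the two histogram notes can be reproduced from the stated range and class width alone. Below is a minimal Python sketch; `class_edges` and `class_of` are hypothetical helper names, and the only data values used are the two reported maxima, since the individual film values are not given here.

```python
CLASS_WIDTH = 10_000_000


def class_edges(lower, upper, width):
    """Boundaries of equal-width histogram classes from lower to upper inclusive."""
    return list(range(lower, upper + 1, width))


def class_of(value, width):
    """Return the (lower, upper) boundaries of the class containing value."""
    lower = (value // width) * width
    return lower, lower + width


population_edges = class_edges(0, 320_000_000, CLASS_WIDTH)
sample_edges = class_edges(0, 120_000_000, CLASS_WIDTH)

# The number of classes is one less than the number of boundaries.
population_classes = len(population_edges) - 1
sample_classes = len(sample_edges) - 1
print("classes:", population_classes, sample_classes)

# Classes containing the reported maxima of each dataset.
print("population maximum class:", class_of(306137970, CLASS_WIDTH))
print("sample maximum class:", class_of(105410874, CLASS_WIDTH))
```

The classes found for the two maxima are the right-most occupied bars of the respective histograms.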

The four basic features and comments:

  1. Judged from the histograms, the central tendency of the entire population is at 0-10000000. The central tendency of the random sample is also at 0-10000000, although the number of cases in this bin is only slightly higher than in the following two bins (10000000-20000000 and 20000000-30000000). For both the entire population and the random sample, more films have a gross value in the range 0-10000000 than in any other class. This is the modal class; the medians (18552460 and 19158752) lie in the next class up.
  2. Ignoring the points on the far right, the values for the entire population spread from 0 to 200000000, and those for the random sample from 0 to 80000000. The random sample has less variation than the entire population, which agrees with its smaller standard deviation (23805840 versus 39293782).
  3. Although they contain different numbers of data points, the histograms of the entire population and the random sample have roughly similar shapes. Both have more cases with lower gross values and fewer cases with higher gross values, and both have some outliers on the far right. Both are asymmetrical. The right-skewed distribution is further confirmed by the descriptive statistics, in which both datasets have a much larger mean than median, and by the shape of the box-and-whisker plots (longer upper whisker).
  4. For the entire population, there seem to be two possible outliers, at 240000000-250000000 and 300000000-310000000. The box-and-whisker plot later reveals that these outliers are “Independence_Day” and “Twister”. For the random sample, there is one possible outlier at 100000000-110000000. The box-and-whisker plot later reveals that this outlier is “First_Wives_Club_The”, whose gross value is 105410874. Taking the scale of the box-and-whisker plots into account, the interquartile range (Q3 - Q1) of the entire population (37771017 - 7487370 = 30283647) is larger than that of the random sample (33442303 - 8448098 = 24994205). This agrees with the previous conclusion that the random sample has less variation than the entire population.
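The skewness and interquartile-range comparisons in the list above can be recomputed directly from the reported summary statistics. The sketch below is a minimal Python illustration: the `Summary` class and its method names are hypothetical, and the 1.5 × IQR fence rule is an assumption, since the source does not state which rule its box-and-whisker plots use to flag outliers.

```python
from dataclasses import dataclass


@dataclass
class Summary:
    """Summary statistics for "Gross", copied from the reported output."""
    mean: float
    minimum: float
    q1: float
    median: float
    q3: float
    maximum: float

    @property
    def iqr(self):
        """Interquartile range, Q3 - Q1."""
        return self.q3 - self.q1

    def fences(self, k=1.5):
        """Lower and upper fences under the k * IQR rule (k = 1.5 is an assumption)."""
        return self.q1 - k * self.iqr, self.q3 + k * self.iqr


population = Summary(mean=31671589, minimum=1065431, q1=7487370,
                     median=18552460, q3=37771017, maximum=306137970)
sample = Summary(mean=25748158, minimum=2410162, q1=8448098,
                 median=19158752, q3=33442303, maximum=105410874)

for name, s in (("population", population), ("sample", sample)):
    low, high = s.fences()
    print(name, "IQR:", s.iqr)
    # A mean well above the median is consistent with right skew.
    print(name, "mean - median:", s.mean - s.median)
    print(name, "maximum beyond upper fence:", s.maximum > high)
    print(name, "minimum beyond lower fence:", s.minimum < low)
```

Under this assumed rule, both maxima lie above their upper fences and neither minimum lies below its lower fence, which is consistent with outliers appearing only on the right. The rule may flag more population values than the two films named above, so the fences are a rough cross-check and not a reproduction of the original plots.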