Data - Introduction to Statistics - Solved Exam, Exams of Statistics

This is the Solved Exam of Introduction to Statistics which includes Indicate, Data, Random Sample, Mean, Median, Measure of Center, Mean or Median, Describing, Standard Deviation, Central Limit Theorem etc. Key important points are: Data, Collected, Random Sample, Heights, Average, Median, Mode, Midrange, Measure Represents, Compute

Typology: Exams

2012/2013

Uploaded on 02/26/2013

SOLUTIONS
MAT 167: Statistics
Test I: Chapters 1-3
Instructor: Anthony Tanbakuchi
Fall 2007
Name:
Computer / Seat Number:
No books, notes, or friends. You may use the attached equation sheet, R, and a
calculator. No other materials. If you choose to use R, copy and paste your work
into a word document labeling the question number it corresponds to. When you
are done with the test print out the document. Be sure to save often on a memory
stick just in case. Using any other program or having any other documents open
on the computer will constitute cheating.
You have until the end of class to finish the exam; manage your time wisely.
If something is unclear quietly come up and ask me.
If the question is legitimate I will inform the whole class.
Express all final answers to 3 significant digits. Probabilities should be given as a
decimal number unless a percent is requested. Circle final answers; ambiguous or
multiple answers will not be accepted. Show steps where appropriate.
The exam consists of 13 questions for a total of 40 points on 14 pages.
This Exam is being given under the guidelines of our institution’s
Code of Academic Ethics. You are expected to respect those guidelines.
Points Earned: out of 40 total points
Exam Score:


MAT 167: Statistics, Test I: Chapters 1-3 SOLUTIONS p. 1 of 14

1. Given the following data collected from a random sample of individuals' heights in cm:

170 185 155 168 162 164 280

(a) (3 points) If you were asked to calculate the “average” height for this data, what measure of center (mean, median, mode, midrange) would you use? Give a clear reason why you chose this measure in terms of the data and what the measure represents.

Solution: Since this data set contains an outlier (280 cm), a measure of center that is resistant to outliers gives a better representation of the “average” height. Recall that the term “average” is non-specific, and all the measures of center are different types of averages. The most resistant measure of center would be the mode, but it is used primarily for categorical data. Therefore, the median is the correct choice for quantitative continuous data.

(b) (2 points) Compute the measure of center you recommended above for the data.

Solution: Depending on your answer above, you should have gotten one of the following:

Median: 168
Mean: 183
Mode: No mode for this data.
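
The values above can be checked with a short R sketch. It uses the seven heights printed in the solution to part (c); the variable names are illustrative only. It shows how the single large value pulls the mean and midrange upward while the median stays near the bulk of the data.

```r
# Heights in cm, as printed in the solution to part (c)
x <- c(170, 185, 155, 168, 162, 164, 280)

center.mean     <- mean(x)                # pulled upward by the outlier
center.median   <- median(x)              # resistant to the outlier
center.midrange <- (min(x) + max(x)) / 2  # depends only on the two extremes

# A mode exists only if at least one value repeats
has.mode <- any(table(x) > 1)

signif(c(mean = center.mean, median = center.median,
         midrange = center.midrange), 3)
has.mode
```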

(c) (2 points) Using the Range Rule of Thumb, estimate the standard deviation for the data.

Solution:

> x
[1] 170 185 155 168 162 164 280
> s.approx = (max(x) - min(x)) / 4
> signif(s.approx, 3)
[1] 31.2

(d) (2 points) Compute the standard deviation for the data.

Solution: You should compute the sample standard deviation:

> s = sd(x)
> signif(s, 3)
[1] 43.6
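
As a cross-check, the sample standard deviation can also be computed directly from its definition (equation 13 on the attached equation sheet). This is only a sketch with illustrative variable names; it should agree with sd(x).

```r
# Heights in cm from question 1
x <- c(170, 185, 155, 168, 162, 164, 280)
n <- length(x)

# Sample standard deviation: squared deviations from the sample mean,
# summed and divided by n - 1, then square-rooted
s.manual <- sqrt(sum((x - mean(x))^2) / (n - 1))

signif(s.manual, 3)
all.equal(s.manual, sd(x))
```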

Instructor: Anthony Tanbakuchi Points earned: / 9 points


5. (1 point) If the mean, median, and mode for a data set are all the same, what can you conclude about the data?

Solution: If all three measures of center are the same, the data is symmetrical with no outliers.

6. (2 points) Find ${}_{10}C_{7}$.

Solution: Using the combinations equation with n = 10 and r = 7:

> factorial(10) / (factorial(10 - 7) * factorial(7))
[1] 120
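
The same count can be obtained with choose(n, k), which the attached equation sheet lists for combinations. A brief sketch comparing the two routes (variable names are illustrative):

```r
n <- 10
r <- 7

# Combinations from the factorial formula
by.formula <- factorial(n) / (factorial(n - r) * factorial(r))

# Combinations from the built-in function
by.choose <- choose(n, r)

by.formula
by.choose
```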


7. The following is a modified box plot of final exam scores for two different sections of a statistics class taking the same exam.

[Figure: modified box plots for section.1 and section.2; axis label: Final Exam Score]

(a) (1 point) Which section had the highest score on the exam? Give an approximate value for the highest score.

Solution: Section 2, highest score was approximately 98.

(b) (1 point) Which section had the higher median score? Give the approximate median score for that section.

Solution: Section 1 had the higher median score of approximately 75.

(c) (1 point) A useful measure of variation is the inter-quartile range, which is calculated as IQR = Q3 − Q1. Find the approximate IQR for section 2.

Solution: For section 2 the approximate values for Q1 and Q3 are:

> Q1 = 65
> Q3 = 80
> IQR = Q3 - Q1
> IQR
[1] 15

(d) (1 point) What percent of data lies within the IQR?

Solution: 50% of the data lies between the first and third quartiles, since the quartiles break the data into four equal parts.

8. (2 points) What is the probability of randomly selecting three people all born on the same day of the year? (Ignore leap years).
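
One common way to set this up is sketched below, under the assumption of 365 equally likely birthdays and independent selections: the first person may be born on any day, and each of the other two must match that day. The variable names are illustrative.

```r
days <- 365

# First person: any day (probability 1).
# Second and third person: each must match the first (probability 1/days).
p.same <- 1 * (1 / days) * (1 / days)

signif(p.same, 3)
```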


11. (3 points) If a class consists of 12 freshmen and 8 sophomores, find the probability of randomly selecting three students in the following order: a sophomore then a sophomore then a freshman without replacement.

Solution:

P(sophomore and sophomore and freshman): these are dependent events, so we need to carefully count the number available and the total number for each trial.

> p = (8/20) * (7/19) * (12/18)
> signif(p, 3)
[1] 0.0982

12. (2 points) A cruise ship has 1000 people on it. Of the 1000 people, 25 are crew members. Find the probability of randomly selecting 10 people without replacement such that none of them are crew members.

Solution: Since n/N ≤ 0.05, we can approximate the probability by treating the selections as independent events rather than calculating all 10 conditional probabilities.

P(10 non crew members) ≈ P(non crew member)^10

> n = 10
> N = 1000
> n/N
[1] 0.01
> p.non.crew = 1 - 25/N
> p = p.non.crew^n
> signif(p, 3)
[1] 0.776

If you choose to do it without the approximation (more work), then you would have:

> p = prod(975:966 / 1000:991)
> signif(p, 3)
[1] 0.775

As you can see, the two methods agree closely because the approximation is sufficiently accurate for n/N ≤ 0.05.
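
The exact calculation is a hypergeometric probability, so it can also be checked with R's dhyper(). This is a sketch rather than part of the original solution, and the variable names are illustrative; dhyper(x, m, n, k) gives the probability of drawing x items of the first kind when k items are drawn from m of the first kind and n of the second.

```r
people <- 1000
crew   <- 25
drawn  <- 10

# Probability that 0 of the 10 people drawn are crew members
p.exact <- dhyper(0, crew, people - crew, drawn)

# Same value from the step-by-step product used in the solution
p.product <- prod(975:966 / 1000:991)

signif(c(dhyper = p.exact, product = p.product), 3)
```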

13. Given the following frequency table summarizing data from a study:

age.years frequency


(a) (1 point) Construct a relative frequency table.

Solution:

age.years relative.frequency.percent

(b) (1 point) What is the probability of randomly selecting someone from the study who is in the age range 10-29?

Solution:

> p = (8 + 12) / (5 + 8 + 12 + 2)
> signif(p, 3)
[1] 0.741
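
A sketch of how the relative frequencies and this probability could be computed together. It assumes the four class frequencies are 5, 8, 12 and 2 (the values that appear in the solution above) and that the second and third classes cover ages 10-29; the class labels themselves are not reproduced here.

```r
# Class frequencies, taken from the solution to part (b)
freq <- c(5, 8, 12, 2)

# Relative frequency of each class, in percent
rel.freq.percent <- freq / sum(freq) * 100
signif(rel.freq.percent, 3)

# Assumption: classes 2 and 3 together cover the age range 10-29
p <- sum(freq[2:3]) / sum(freq)
signif(p, 3)
```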


A nurse measured the blood pressure of each person who visited her clinic. Following is a relative-frequency histogram for the systolic blood pressure readings for those people aged between 25 and 40. Use the histogram to answer the question. The blood pressure readings were given to the nearest whole number.

  1. Approximately what percentage of the people aged 25 - 40 had a systolic blood pressure reading between 110 and 139 inclusive? A) 89% B) 75% C) 39% D) 59%

Determine which score corresponds to the higher relative position.

  1. Which score has the better relative position: a score of 51 on a test for which x̄ = 48 and s = 5, a score of 5.8 on a test for which x̄ = 4.8 and s = 0.7, or a score of 434.3 on a test for which x̄ = 383 and s = 57? A) A score of 434.3 B) A score of 5.8 C) A score of 51
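
Relative position is usually compared with z-scores (equation 15 on the equation sheet). A minimal sketch, assuming that is the intended comparison; the helper z.score and the labels are illustrative names.

```r
# z-score: number of standard deviations a value lies from the mean
z.score <- function(x, xbar, s) (x - xbar) / s

scores <- c(score.51    = z.score(51, 48, 5),
            score.5.8   = z.score(5.8, 4.8, 0.7),
            score.434.3 = z.score(434.3, 383, 57))

signif(scores, 3)

# The largest z-score has the best relative position
names(which.max(scores))
```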

Find the mode(s) for the given sample data.

  1. 82, 39, 32, 39, 29, 82 A) 39 B) 82 C) 82, 39 D) 50.
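
Base R has no built-in function for the statistical mode, so one way to find it is to tabulate the values and keep those with the highest count. This is a sketch with illustrative variable names.

```r
x <- c(82, 39, 32, 39, 29, 82)

# Count how often each distinct value occurs
counts <- table(x)

# Keep every value whose count equals the maximum count
modes <- as.numeric(names(counts)[as.vector(counts) == max(counts)])
modes
```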

Find the original data from the stem - and - leaf plot.

Stem | Leaves
52 | 2 6 7
53 | 2 4 8
54 | 1 7

A) 522, 526, 527, 532, 534, 538, 541, 547
B) 54, 58, 59, 55, 57, 61, 55, 61
C) 52267, 53248, 5417
D) 522, 526, 537, 532, 534, 538, 541, 557

Find the range for the given data.

  1. To get the best deal on a microwave oven, Jeremy called six appliance stores and asked the cost of a specific model. The prices he was quoted are listed below: $120 $536 $227 $618 $422 $258. Compute the range. A) $120 B) $31 C) $498 D) $536

Solve the problem.

  1. If the standard deviation of a set of data is zero, what can you conclude about the set of values? A) The sum of the values is zero. B) All values are equal to zero. C) The sum of the deviations from the mean is zero. D) All values are identical.

Use the empirical rule to solve the problem.

  1. At one college, GPA's are normally distributed with a mean of 2.8 and a standard deviation of 0.6. What percentage of students at the college have a GPA between 2.2 and 3.4? A) 95.44% B) 99.74% C) 68.26% D) 84.13%
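
The interval from 2.2 to 3.4 is the mean plus or minus one standard deviation, which the empirical rule puts at about 68% of the data. A sketch that checks this against the normal CDF using pnorm() from the equation sheet (variable names are illustrative):

```r
mu    <- 2.8
sigma <- 0.6

# Bounds expressed in standard deviations from the mean
z.low  <- (2.2 - mu) / sigma
z.high <- (3.4 - mu) / sigma

# Area under the normal curve between the two bounds
p <- pnorm(3.4, mean = mu, sd = sigma) - pnorm(2.2, mean = mu, sd = sigma)

c(z.low = z.low, z.high = z.high)
signif(p * 100, 3)  # percent
```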

Answer the question, considering an event to be "unusual" if its probability is less than or equal to 0.05.

  1. Assume that a study of 300 randomly selected school bus routes showed that 281 arrived on time. Is it "unusual" for a school bus to arrive late? A) Yes B) No
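
A sketch of the check, assuming the relative frequency from the study is used as the estimate of the probability of a late arrival (variable names are illustrative):

```r
routes  <- 300
on.time <- 281

# Estimated probability that a bus arrives late
p.late <- (routes - on.time) / routes

# "Unusual" as defined in the question: probability <= 0.05
unusual <- p.late <= 0.05

signif(p.late, 3)
unusual
```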

Answer the question.

  1. Which of the following cannot be a probability?

A)

B)
C)
D)

Estimate the probability of the event.

  1. In a certain class of students, there are 12 boys from Wilmette, 6 girls from Kenilworth, 9 girls from Wilmette, 7 boys from Glencoe, 5 boys from Kenilworth and 3 girls from Glencoe. If the teacher calls upon a student to answer a question, what is the probability that the student will be from Kenilworth? A) 0.333 B) 0.143 C) 0.262 D) 0.
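
A sketch of the count, assuming every student is equally likely to be called on; the vector names are illustrative.

```r
# Students per town (boys + girls), from the question
students <- c(Wilmette = 12 + 9, Kenilworth = 6 + 5, Glencoe = 7 + 3)

# Probability = students from Kenilworth / all students
p.kenilworth <- students[["Kenilworth"]] / sum(students)

signif(p.kenilworth, 3)
```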

Find the indicated probability.

  1. A bag contains 4 red marbles, 3 blue marbles, and 5 green marbles. If a marble is randomly selected from the bag, what is the probability that it is blue?

A)

B)
C)
D)

  1. A sample of 100 wood and 100 graphite tennis rackets are taken from the warehouse. If 14 wood and 13 graphite are defective and one racket is randomly selected from the sample, find the probability that the racket is wood or defective. A) 0. B) 0. C) 0. D) There is insufficient information to answer the question.
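
The racket question is an application of the addition rule, P(A or B) = P(A) + P(B) − P(A and B). A sketch under the assumption that each of the 200 rackets is equally likely to be chosen (variable names are illustrative):

```r
total          <- 100 + 100  # wood + graphite
wood           <- 100
defective      <- 14 + 13    # defective wood + defective graphite
wood.defective <- 14         # counted in both groups above

# Addition rule: subtract the overlap so it is not counted twice
p <- wood / total + defective / total - wood.defective / total
p
```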

Answer Key

Testname: EXAM1MULTIPLECHOICE

1) B
2) B
3) A
4) C
5) B
6) B
7) B
8) C
9) A
10) C
11) D
12) C
13) B
14) C
15) C
16) B
17) A
18) B
19) C
20) A

Introductory Statistics Quick Reference & R Commands
by Anthony Tanbakuchi. Version 1.
http://www.tanbakuchi.com
ANTHONY@TANBAKUCHI·COM

Get R at: http://www.r-project.org
More R help & examples at: http://tanbakuchi.com/Resources/R_Statistics/RBasics.html
R commands: bold text

1 Misc R
To make a vector / store data: x=c(x1, x2, ...)
Get help on function: ?functionName
Get column of data from table: tableName$columnName
List all variables: ls()
Delete all variables: rm(list=ls())

$\sqrt{x}$ = sqrt(x) (1)
$x^n$ = x^n (2)
$n$ = length(x) (3)
$T$ = table(x) (4)

2 Descriptive Statistics

2.1 NUMERICAL
Let x=c(x1, x2, x3, ...)

$\text{total} = \sum_{i=1}^{n} x_i$ = sum(x) (5)
min = min(x) (6)
max = max(x) (7)
six number summary: summary(x) (8)
$\mu = \frac{\sum x_i}{N}$ = mean(x) (9)
$\bar{x} = \frac{\sum x_i}{n}$ = mean(x) (10)
$\tilde{x} = P_{50}$ = median(x) (11)
$\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}}$ (12)
$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$ = sd(x) (13)
$CV = \frac{\sigma}{\mu} = \frac{s}{\bar{x}}$ (14)

2.2 RELATIVE STANDING
$z = \frac{x - \mu}{\sigma} = \frac{x - \bar{x}}{s}$ (15)
Percentiles: $P_k = x_i$ (sorted x), where $k = \frac{i - 0.5}{n} \cdot 100\%$ (16)
To find $x_i$ given $P_k$, $i$ is:
1: $L = \frac{k}{100\%} \cdot n$ (17)
2: if $L$ is an integer: $i = L + 0.5$; otherwise $i = L$ and round up.

2.3 VISUAL
All plots have optional arguments:
main="" sets title
xlab="", ylab="" sets x/y-axis label
type="p" for point plot
type="l" for line plot
type="b" for both points and lines
Ex: plot(x, y, type="b", main="My Plot")
Histogram: hist(x)
Stem & leaf: stem(x)
Box plot: boxplot(x)
Barplot: plot(T) (where T=table(x))
Scatter plot: plot(x,y) (where x, y are ordered vectors)
Time series plot: plot(t,y) (where t, y are ordered vectors)
Graph function: curve(expr, xmin, xmax) plot expr involving x

2.4 ASSESSING NORMALITY
Q-Q plot: qqnorm(x); qqline(x)

3 Probability
Number of successes x with n possible outcomes. (Don’t double count!)

$P(A) = \frac{x_A}{n}$ (18)
$P(\bar{A}) = 1 - P(A)$ (19)
$P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$ (20)
$P(A \text{ or } B) = P(A) + P(B)$ if A, B mutually exclusive (21)
$P(A \text{ and } B) = P(A) \cdot P(B|A)$ (22)
$P(A \text{ and } B) = P(A) \cdot P(B)$ if A, B independent (23)
$n! = n(n-1)(n-2) \cdots 2 \cdot 1$ = factorial(n) (24)
${}_nP_k = \frac{n!}{(n-k)!}$ Perm. no elements alike (25)
$= \frac{n!}{n_1! \, n_2! \cdots n_k!}$ Perm. $n_1$ alike, ... (26)
${}_nC_k = \frac{n!}{(n-k)! \, k!}$ = choose(n,k) (27)

4 Random Variables

4.1 DISCRETE DISTRIBUTIONS
$P(x_i)$: probability distribution (28)
$E = \mu = \sum x_i \cdot P(x_i)$ (29)
$\sigma = \sqrt{\sum (x_i - \mu)^2 \cdot P(x_i)}$ (30)

4.2 CONTINUOUS DISTRIBUTIONS
CDF $F(x)$ gives area to the left of $x$; $F^{-1}(p)$ expects $p$ is area to the left.
$f(x)$: probability density (31)
$E = \mu = \int_{-\infty}^{\infty} x \cdot f(x) \, dx$ (32)
$\sigma = \sqrt{\int_{-\infty}^{\infty} (x - \mu)^2 \cdot f(x) \, dx}$ (33)
$F(x)$: cumulative prob. density (CDF) (34)
$F^{-1}(x)$: inv. cumulative prob. density (35)
$F(x') = \int_{-\infty}^{x'} f(x) \, dx$ (36)
$p = P(x < x') = F(x')$ (37)
$x' = F^{-1}(p)$ (38)
$p = P(x > a) = 1 - F(a)$ (39)
$p = P(a < x < b) = F(b) - F(a)$ (40)

4.3 SAMPLING DISTRIBUTIONS
$\mu_{\bar{x}} = \mu$, $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$ (41)
$\mu_{\hat{p}} = p$, $\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}}$ (42)

4.4 BINOMIAL DISTRIBUTION
$\mu = n \cdot p$ (43)
$\sigma = \sqrt{n \cdot p \cdot q}$ (44)
$P(x) = {}_nC_x \, p^x q^{(n-x)}$ = dbinom(x, n, p) (45)
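
A short sketch showing that equation (45) and dbinom() agree. The values n = 10, p = 0.3 and x = 3 are hypothetical, chosen only for illustration.

```r
# Hypothetical example values (not from the exam)
n <- 10
p <- 0.3
q <- 1 - p
x <- 3

# Binomial probability from the formula and from the built-in function
by.formula <- choose(n, x) * p^x * q^(n - x)
by.dbinom  <- dbinom(x, n, p)

# Mean and standard deviation from equations (43) and (44)
mu    <- n * p
sigma <- sqrt(n * p * q)

signif(c(formula = by.formula, dbinom = by.dbinom, mu = mu, sigma = sigma), 3)
```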

4.5 POISSON DISTRIBUTION
$P(x) = \frac{\mu^x \cdot e^{-\mu}}{x!}$ = dpois(x, μ) (46)

4.6 NORMAL DISTRIBUTION
$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \cdot e^{-\frac{1}{2} \frac{(x - \mu)^2}{\sigma^2}}$ (47)
$p = P(z < z') = F(z')$ = pnorm(z') (48)
$z' = F^{-1}(p)$ = qnorm(p) (49)
$p = P(x < x') = F(x')$ = pnorm(x', mean=μ, sd=σ) (50)
$x' = F^{-1}(p)$ = qnorm(p, mean=μ, sd=σ) (51)
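
A brief sketch of how pnorm() and qnorm() invert each other, as equations (48) to (51) describe. The mean of 100, standard deviation of 15 and the cut-off values are hypothetical numbers chosen for illustration.

```r
# Standard normal: area to the left of z' = 1.96, then back again
p.z    <- pnorm(1.96)
z.back <- qnorm(p.z)

# Hypothetical non-standard normal: mean 100, sd 15
mu    <- 100
sigma <- 15
p.x    <- pnorm(115, mean = mu, sd = sigma)  # area to the left of x' = 115
x.back <- qnorm(p.x, mean = mu, sd = sigma)  # recovers x'

signif(c(p.z = p.z, z.back = z.back, p.x = p.x, x.back = x.back), 3)
```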