A set of questions related to statistical analysis. It includes questions on computing the mean, median, mode, standard deviation, Q1, Q3, Min, & Max for given sample data, probability calculations, confidence intervals, hypothesis testing, and more. The questions are based on real-life scenarios such as sales calls, customer ratings, gasoline demand, paint defects, and car dealership profits. Answers and explanations are given for each question.
Typology: Exams
following number of sales calls completed last
month.
a. Compute the mean, median, mode, & standard deviation, Q1, Q3, Min, & Max for the above sample data on number of sales calls per month.
b. In the context of this situation, interpret the Median, Q1, & Q3. (Points
a.
Ans: Mean, median, mode, & standard deviation, Q1, Q3, Min, & Max for the above sample data on number of sales calls per month
Mean 91.
Median 91
Mode 93
Standard Deviation
1st quartile 82
3rd quartile 100
Minimum 72
Maximum 119
b. The median of 91 means that when the sales call counts are arranged in ascending order, the value of 91 calls falls in the middle: 8 data points lie above the median & 8 data points lie below it.
Q1, the first quartile, is 82 calls. It means that 25% of the sales call data points lie below this value.
Q3, the third quartile, is 100 calls. It means that 75% of the sales call data points lie below this value.
2. (TCO B) Cedar Home Furnishings has
collected data on their customers in
terms of whether they reside in an
urban location or a suburban location,
as well as rating the customers as
either “good,” “borderline,” or “poor.”
The data is below.
             Urban   Suburban   Total
Good           60       168      228
Borderline
Poor           24        40       64
Total
If you choose a customer at random, then find the
probability that the customer
a. is considered “borderline.”
Ans: P(customer is considered “borderline”) = (400 - 228 - 64)/400 = 108/400 = 0.27, using the grand total of 400 that appears in part b.
b. is considered “good” & resides in an
urban location.
Ans: P(is considered “good” & resides in an urban location) = 60/400 = 3/20 = 0.15
c. is suburban, given that customer is considered “poor.” (Points : 18)
Ans: P(is suburban, given that customer is considered “poor”) = 40/64 = 5/8
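The joint and conditional probabilities in parts b and c can be checked with a short sketch. It uses only the counts that survive in the table, and it assumes the grand total is 400, the denominator used in the answer to part b. The variable names are illustrative.

```python
from fractions import Fraction

# Counts from the customer table; grand_total = 400 is an assumption
# taken from the denominator used in the answer to part b.
good_urban = 60
poor_suburban = 40
poor_total = 64
grand_total = 400

# Joint probability: "good" AND urban, out of all customers.
p_good_and_urban = Fraction(good_urban, grand_total)

# Conditional probability: suburban GIVEN "poor". Conditioning shrinks
# the sample space to the 64 "poor" customers.
p_suburban_given_poor = Fraction(poor_suburban, poor_total)

print(p_good_and_urban, float(p_good_and_urban))
print(p_suburban_given_poor, float(p_suburban_given_poor))
```

Exact fractions are used so the reduced forms can be compared directly with the hand calculation.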
at Rodale Emporium pay for their purchases
using credit cards. In a sample of 20
customers, find the probability that
a. exactly 14 customers will pay for
their purchases using credit cards.
Ans:
This is a binomial distribution with p = 0.70, so q = 1 - p = 0.30, & n = 20.
Probability distribution:
P(exactly 14 customers will pay for their purchases using credit cards) = 20C14 * (0.70)^14 * (0.30)^6 ≈ 0.1916
b. at least 10 customers will pay for
their purchases using credit cards.
Ans :
c. at most 12 customers will pay for
their purchases using credit cards.
(Points : 18)
Ans:
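All three binomial probabilities asked for in this question can be computed directly from the probability mass function. The sketch below assumes n = 20 (the stated sample size) and p = 0.70 (from the answer to part a). The helper `binom_pmf` is a hypothetical name, not part of any library.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 20, 0.70  # sample of 20 customers, 70% pay by credit card

# a. exactly 14 customers
p_exactly_14 = binom_pmf(14, n, p)

# b. at least 10 customers: sum the pmf over k = 10..20
p_at_least_10 = sum(binom_pmf(k, n, p) for k in range(10, n + 1))

# c. at most 12 customers: sum the pmf over k = 0..12
p_at_most_12 = sum(binom_pmf(k, n, p) for k in range(0, 13))

print(round(p_exactly_14, 4), round(p_at_least_10, 4), round(p_at_most_12, 4))
```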
service station is normally distributed with a
mean of 27,009 gallons per day & a standard
deviation of 4,530 gallons per day.
a. Find the probability that the demand
for gasoline exceeds 22,000 gallons for a
given day.
Ans:
z-score = (22000 - 27009)/4530 ≈ -1.106
From the Standard Normal cumulative proportions, P(Z < -1.106) ≈ 0.134
Probability that the demand for gasoline exceeds 22,000 gallons for a given day = 1 - 0.134 ≈ 0.866
b. Find the probability that the demand
for gasoline falls between 20,000 &
23,000 gallons for a given day.
Ans:
z-score for 20,000 gallons = (20000 - 27009)/4530 ≈ -1.547
From the Standard Normal cumulative proportions, P(Z < -1.547) ≈ 0.061
z-score for 23,000 gallons = (23000 - 27009)/4530 ≈ -0.885
From the Standard Normal cumulative proportions, P(Z < -0.885) ≈ 0.188
Probability that the demand for gasoline falls between 20,000 & 23,000 gallons for a given day = 0.188 - 0.061 ≈ 0.127
c. How many gallons of gasoline should
be on hand at the beginning of each
day so that we can meet the demand
90% of the time (i.e., the station stands
a 10% chance of running out of
gasoline for that day)? (Points : 18)
Ans: For the demand to be met 90% of the time, the cumulative probability must be 0.90.
Z-score for a cumulative probability of 0.90 = 1.28
Gallons of gasoline that should be on hand = 1.28 * 4530 + 27009 = 32807.4 gallons, or approximately 32,808 gallons
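The three normal-distribution calculations for this question can be sketched with Python's standard-library `statistics.NormalDist`, using the stated mean of 27,009 gallons and standard deviation of 4,530 gallons. The variable names are illustrative.

```python
from statistics import NormalDist

# Daily gasoline demand, as described in the question.
demand = NormalDist(mu=27009, sigma=4530)

# a. P(demand > 22,000): complement of the cumulative probability.
p_exceeds_22000 = 1 - demand.cdf(22000)

# b. P(20,000 < demand < 23,000): difference of two cumulative probabilities.
p_between = demand.cdf(23000) - demand.cdf(20000)

# c. Stock level that covers demand 90% of the time: the 90th percentile.
stock_for_90pct = demand.inv_cdf(0.90)

print(round(p_exceeds_22000, 4), round(p_between, 4), round(stock_for_90pct, 1))
```

Because `inv_cdf` uses the unrounded z-value (about 1.2816) rather than the table value 1.28, the stock level it returns is a few gallons higher than the hand calculation.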
company has been asked to develop a fairly
accurate estimate of the mean refueling &
baggage handling time at a foreign airport. A
random sample of 36 refueling & baggage
handling times yields the following results.
Sample Size = 36
Sample Mean = 24.2 minutes
Sample Standard Deviation = 4.2 minutes
a. Compute the 90% confidence
interval for the population mean
refueling & baggage time.
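Ans:
A 90% confidence interval leaves 5% in each tail, so
Z (Upper – 95%) = 1.645
Z (Lower – 5%) = -1.645
90% confidence interval lower limit = 24.2 - 1.645 * 4.2/sqrt(36) = 23.049 minutes
90% confidence interval upper limit = 24.2 + 1.645 * 4.2/sqrt(36) = 25.351 minutes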
b. Interpret this interval.
Ans: We can be 90% confident that the population mean refueling & baggage handling time at the foreign airport lies between 23.049 minutes & 25.351 minutes.
c. How many refueling & baggage
handling times should be sampled so
that we may construct a 90%
confidence interval with a sampling
error of .5 minutes for the population
mean refueling & baggage time?
(Points : 18)
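Ans:
The required sample size is n = (z * s / E)^2, where z = 1.645 for 90% confidence, s = 4.2 minutes, & E = 0.5 minutes.
Nos. of refueling & baggage handling times that should be sampled so that we may construct a 90% confidence interval = (1.645 * 4.2/0.5)^2 = 190.94, rounded up to 191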
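A minimal sketch of the confidence interval and sample-size calculations for this question follows. It assumes the large-sample z critical value is used and that the sample standard deviation of 4.2 minutes stands in for the population value. The variable names are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist

n, sample_mean, sample_sd = 36, 24.2, 4.2
confidence = 0.90

# Two-sided critical value: 5% in each tail for 90% confidence.
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# a. Confidence interval: mean +/- z * s / sqrt(n)
margin = z * sample_sd / sqrt(n)
ci_lower, ci_upper = sample_mean - margin, sample_mean + margin

# c. Sample size for a sampling error of 0.5 minutes: n = (z * s / E)^2,
#    rounded up because a fraction of an observation cannot be sampled.
target_error = 0.5
required_n = ceil((z * sample_sd / target_error) ** 2)

print(round(ci_lower, 3), round(ci_upper, 3), required_n)
```

Leaving z out of the sample-size formula would understate the required sample, since (4.2/0.5)^2 only corresponds to z = 1.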
toothpaste claims that a high
percentage of dentists recommend the use of
their toothpaste. A random sample of 400
dentists results in 310 recommending their
toothpaste.
a. Compute the 99% confidence
interval for the population proportion
of dentists who recommend the use
of this toothpaste.
Ans: Sample proportion p = 310/400 = 0.775
Sample Size = 400
99% confidence interval for the population proportion of dentists = 0.775 ± 2.576 * sqrt(0.775 * 0.225/400) = 0.775 ± 0.054
99% lower population proportion of dentists = 0.721 = 72.1%
99% upper population proportion of dentists = 0.829 = 82.9%
b. Interpret this confidence interval.
Ans:
We can be 99% confident that between 72.1% & 82.9% of the population of dentists would recommend the use of this toothpaste.
c. How large a sample size will need to
be selected if we wish to have a 99%
confidence interval that is accurate to
within 3%? (Points : 18)
Ans: Sample size = (2.576)^2 * 0.775 * 0.225/(0.03)^2 = 1285.7, rounded up to 1286
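The proportion interval and the sample size can be reproduced with a short sketch. It assumes the normal approximation to the sampling distribution of the proportion and uses the sample proportion 310/400 as the planning value for part c. The variable names are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist

recommending, n = 310, 400
p_hat = recommending / n  # sample proportion

# Two-sided critical value for 99% confidence (0.5% in each tail).
z = NormalDist().inv_cdf(0.995)

# a. Confidence interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)
margin = z * sqrt(p_hat * (1 - p_hat) / n)
ci_lower, ci_upper = p_hat - margin, p_hat + margin

# c. Sample size for a margin of 3 percentage points,
#    using p_hat as the planning value.
target_margin = 0.03
required_n = ceil(z**2 * p_hat * (1 - p_hat) / target_margin**2)

print(round(ci_lower, 3), round(ci_upper, 3), required_n)
```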
improvement team believes that its recently
implemented defect reduction program has
reduced the proportion of paint defects. Prior to
the implementation of the program, the
proportion of paint defects was .03 & had been
stationary for the past 6 months. Ford selects a
random sample of 2,000 cars built after the
implementation of the defect reduction program.
There were 45 cars with paint defects in that
sample. Does the sample data provide evidence
to conclude that the proportion of paint defects is
now less than .03 (with α = .01)? Use the
hypothesis testing procedure outlined below.
a. Formulate the null &
alternative hypotheses.
Ans:
Null hypothesis: H0: proportion of paint defects after the implementation = .03
Alternative hypothesis: H1: proportion of paint defects after the implementation is now less than .03
b. State the level
of significance.
Ans:
Level of significance is α = 0.01, which corresponds to a 99% confidence level.
c. Find the critical value (or
values), & clearly show the
rejection & nonrejection regions.
Ans:
This is a lower-tailed test, so the critical value is z = -2.326 (the 1st percentile of the standard normal distribution).
Expressed as a sample proportion, the critical value is .03 - 2.326 * sqrt(.03 * .97/2000) = 0.0211
So, a sample in which less than about 2.1% of cars (42 cars or fewer) have paint defects falls into the rejection region.
A sample in which more than about 2.1% of cars (43 cars or more) have paint defects falls into the non-rejection region.
d. Compute the
test statistic.
Ans:
New defect % = 45/2000 = 0.0225
Z = (0.0225 - .03)/sqrt(.03 * .97/2000) = -0.0075/0.003814 ≈ -1.97
e. Decide whether you can
reject Ho & accept Ha or not.
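Ans:
New defect % = 45/2000 = 0.0225
Since the new defect rate of 2.25% is higher than the lower limit of 2.1% (equivalently, z ≈ -1.97 is greater than the critical value of -2.326), the sample falls in the non-rejection region. We fail to reject the null hypothesis (H0) & cannot accept the alternative hypothesis H1 that the proportion of paint defects after the implementation is now less than .03.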
f. Explain & interpret your conclusion
in part e. What does this mean?
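Ans:
It means that at the .01 level of significance, the sample does not provide sufficient evidence that the proportion of paint defects after the implementation of the defect reduction program is now less than .03. The observed drop from 3% to 2.25% could plausibly be due to sampling variation, so the Ford Motor Company quality improvement team cannot yet claim that the defect reduction program has worked.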
g. Determine the observed p-value for
the hypothesis test & interpret this
value. What does this mean?
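Ans:
New defect % = 45/2000 = 0.0225
Observed p-value = P(Z < -1.97) ≈ 0.025. This means that if the true proportion of paint defects were still .03, there would be about a 2.5% chance of observing a sample proportion of 2.25% or lower. Since 0.025 is greater than α = .01, the result is not significant at the .01 level.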
h. Does the sample data provide evidence to conclude that the proportion of paint defects is now less than .03 (with α = .01)? (Points : 24)
Ans:
The sample data does not provide evidence to conclude that the proportion of paint defects is now less than .03 (with α = .01).
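The whole paint-defect test can be sketched as a lower-tailed one-proportion z-test. It assumes the normal approximation and uses the null proportion .03 in the standard error. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.03               # proportion of defects before the program (H0)
defects, n = 45, 2000   # sample taken after the program
alpha = 0.01

p_hat = defects / n

# Standard error under H0, so the null proportion is used.
se = sqrt(p0 * (1 - p0) / n)
z_stat = (p_hat - p0) / se

# Lower-tailed test: reject H0 only when z_stat falls below the
# alpha-quantile of the standard normal distribution.
z_crit = NormalDist().inv_cdf(alpha)
p_value = NormalDist().cdf(z_stat)
reject_h0 = z_stat < z_crit

print(round(z_stat, 3), round(z_crit, 3), round(p_value, 4), reject_h0)
```

Comparing `p_value` with `alpha` and comparing `z_stat` with `z_crit` are equivalent decision rules; both leave H0 standing for these data.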
dealership must average more than 4.5% profit
on sales of new cars. A random sample of 81 cars
gives the following result.
Sample Size = 81
Sample Mean = 4.97%
Sample Standard Deviation = 1.8%
Does the sample data provide evidence to conclude that the dealership averages more than 4.5% profit on sales of new cars (using α = .10)? Use the hypothesis testing procedure outlined below.
a. Formulate the null &
alternative hypotheses.
Ans:
Null hypothesis: H0: dealership average profit on sales of new cars = 4.5%
Alternative hypothesis: H1: dealership average profit on sales of new cars is more than 4.5%
b. State the level of significance.
Ans:
Level of significance is α = 0.10, which corresponds to a 90% confidence level.
c. Find the critical value (or
values), & clearly show the
rejection & nonrejection regions.
Ans:
Standard error = 1.8/sqrt(81) = 0.2
Critical Value = 4.5 + 1.28 * 0.2 = 4.756
So, a sample mean of more than 4.756% profit on sales of new cars falls in the rejection region.
A sample mean of 4.756% or less falls in the non-rejection region.
d. Compute the test statistic.
Ans:
Z = (4.97 - 4.5)/0.2 = 2.35, where 0.2 = 1.8/sqrt(81) is the standard error
e. Decide whether you can
reject Ho & accept Ha or not.
Ans:
Since the sample mean of 4.97% lies above the critical value of 4.756% (equivalently, z = 2.35 exceeds 1.28), we reject H0 & accept Ha.
f. Explain & interpret your conclusion in part e. What does this mean?
Ans:
It means that at the .10 level of significance, the sample provides sufficient evidence that the dealership averages more than 4.5% profit on sales of new cars.
g. Determine the observed p-value
for the hypothesis test & interpret
this value. What does this mean?
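Ans:
P-value = P(Z > 2.35) ≈ 0.0094. This means that if the true average profit were 4.5%, there would be less than a 1% chance of observing a sample mean of 4.97% or higher in a random sample of 81 cars. Since 0.0094 is less than α = .10, the result is significant.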
h. Does the sample data provide evidence to conclude that the dealership averages more than 4.5% profit on sales of new cars (using α = .10)? (Points : 24)
Ans:
Yes, the sample data does provide evidence to conclude that the dealership averages more than 4.5% profit on sales of new cars (using α = .10).
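The dealership test can be sketched as an upper-tailed z-test for a mean. It assumes a sample mean of 4.97% (the figure quoted in part e) and treats the sample standard deviation of 1.8% as known, which is reasonable for n = 81. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 4.5                          # hypothesized mean profit (%) under H0
sample_mean, sample_sd, n = 4.97, 1.8, 81
alpha = 0.10

se = sample_sd / sqrt(n)           # standard error of the mean
z_stat = (sample_mean - mu0) / se

# Upper-tailed test: reject H0 when z_stat exceeds the (1 - alpha) quantile.
z_crit = NormalDist().inv_cdf(1 - alpha)
critical_mean = mu0 + z_crit * se  # same cut-off on the profit scale
p_value = 1 - NormalDist().cdf(z_stat)
reject_h0 = z_stat > z_crit

print(round(z_stat, 2), round(critical_mean, 3), round(p_value, 4), reject_h0)
```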
Week 8 : Final Exam -
Final Exam Page 2
who specializes in selling farmland in a large
western state. Because Bill advises many of his
clients about pricing their land, he
is interested in developing a pricing formula of
some type. He feels he could increase his
business significantly if he could accurately
determine the value of a farmer’s land. A
geologist tells Bill that the soil & rock
characteristics in most of the area that Bill sells
do not vary much. Thus the price of land should
depend greatly on acreage. Bill selects a sample
of 30 plots recently sold. The data is found below
(in Minitab), where X=Acreage & Y=Price
($1,000s).
Correlations: PRICE, ACREAGE
Pearson correlation of PRICE & ACREAGE = 0.997
P-Value = 0.000
Regression Analysis: PRICE versus ACREAGE
The regression equation is
PRICE = 2.26 + 2.89 ACREAGE
Predictor   Coef    SE Coef   T    P
Constant    2.257   2.231     1.
S = 7.21461   R-Sq = 99.4%   R-Sq(adj) = 99.3%
Analysis of Variance
Source           DF   SS       MS   F   P
Regression
Residual Error   28   1457     52
Total            29   231215
Predicted Values for New Observations
New Obs   Fit   SE Fit   95% CI   95% PI
XX denotes a point that is an extreme outlier in the predictors.
Values of Predictors for New Observations
New Obs
a. Analyze the above output to determine the regression equation.
Ans:
The regression equation is
PRICE ($1,000s) = 2.26 + 2.89 ACREAGE
b. Find & interpret in the
context of this problem.
Ans:
The price of land depends on & varies with the acreage. The price can be predicted by multiplying the acreage by 2.89 & then adding the constant 2.26, with the result expressed in $1,000s (thousands of dollars). In other words, each additional acre raises the predicted price by about $2,890.
c. Find & interpret the coefficient of determination (r-squared).
Ans:
The coefficient of determination, r^2, gives the proportion of the variance (fluctuation) in the price of the land (in $1,000s) that can be predicted from the acreage. It measures how well the regression line (PRICE ($1,000s) = 2.26 + 2.89 ACREAGE) represents the data. In this case, the R-Sq value is 99.4%, which means that 99.4% of the variation in the price of the land is explained by the variation in the acreage of the land.
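R-Sq can be cross-checked against the Analysis of Variance table. The sketch below assumes the run-together ANOVA entries read as Residual Error DF = 28, SS = 1457 and Total DF = 29, SS = 231215. That reading is an interpretation of the extracted output, and the variable names are illustrative.

```python
from math import sqrt

# Assumed reading of the ANOVA table (see the note above).
ss_error = 1457      # residual (unexplained) sum of squares
ss_total = 231215    # total sum of squares
n = 30               # plots in the sample

# Coefficient of determination: share of total variation explained.
r_squared = 1 - ss_error / ss_total

# The slope (2.89) is positive, so r is the positive square root.
r = sqrt(r_squared)

# Standard error of the estimate, S, with n - 2 degrees of freedom.
s = sqrt(ss_error / (n - 2))

print(round(r_squared, 3), round(r, 3), round(s, 2))
```

The results agree with the reported R-Sq = 99.4%, correlation 0.997, and S of about 7.21, which supports this reading of the table.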
d. Find & interpret coefficient of
correlation.
Ans:
The coefficient of correlation, r, measures the strength & the direction of a linear relationship between two variables. In this case, the r value is 0.997, so there is a strong positive linear relationship between the acreage of the land & the price of the land (in $1,000s).
e. Does the data provide significant
evidence ( 𝛼 = .05) that the acreage can
be used to predict the price? Test the
utility of this model using a two-tailed
test. Find the observed p- value &
interpret.
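Ans:
The null hypothesis is that the slope of the regression line is zero, i.e. acreage has no linear relationship with price. At the 5% level of significance, the observed p-value is 0.000, which is less than 0.05, so we reject the null hypothesis. The data provides significant evidence (at 𝛼 = .05) that the acreage can be used to predict the price.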
f. Find the 95% confidence interval
for mean price of plots of farmland
that are 50 acres. Interpret this
interval.
Ans:
The 95% confidence interval for the mean price of plots of farmland that are 50 acres is (144.05, 149.66). So we can be 95% confident that the mean price of all 50-acre plots lies between $144,050 & $149,660.
g. Find the 95% prediction interval for
the price of a single plot of farmland
that is 50 acres. Interpret this interval.
Ans:
The 95% prediction interval for the price of a single plot of farmland that is 50 acres is (131.82, 161.90). So we can be 95% confident that the price of an individual 50-acre plot lies between $131,820 & $161,900.
h. What can we say about the price for a
plot of farmland that is 250 acres?
(Points : 48)
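Ans:
Price for a plot of farmland that is 250 acres = PRICE ($1,000s) = 2.26 + 2.89 ACREAGE = 2.26 + 2.89 * 250 = 724.76 (in $1,000s), i.e. about $724,760. The Minitab note “XX denotes a point that is an extreme outlier in the predictors” suggests that 250 acres may lie well outside the sampled acreages, in which case this figure is an extrapolation & should be treated with caution.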
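The point predictions from the fitted line can be sketched as a small function. It uses the rounded coefficients 2.26 and 2.89 from the regression equation, so results may differ slightly from Minitab's unrounded fit. The name `predict_price` is hypothetical.

```python
def predict_price(acreage):
    """Predicted price in $1,000s from the rounded regression equation."""
    return 2.26 + 2.89 * acreage

price_50 = predict_price(50)
price_250 = predict_price(250)

# The 50-acre prediction should fall inside the reported 95% confidence
# interval for the mean price of 50-acre plots, (144.05, 149.66).
inside_ci = 144.05 < price_50 < 149.66

print(round(price_50, 2), round(price_250, 2), inside_ci)
```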
1. (TCO E) An insurance firm wishes to
study the relationship between driving
experience (X1, in years), number of
driving violations in the past three years
(X2), & current monthly auto insurance
premium (Y). A sample of 12 insured
drivers is selected at random. The data
is given below (in MINITAB):
Y    X1   X2   Predict X1   Predict X2
74    5    2        8            1
38   14    0
50    6    1
63   10    3
97    4    6
55    8    2
57   11    3
43   16    1
99    3    5
Regression Analysis: Y versus X1, X2
The regression equation is
Predictor   Coef   SE Coef   T   P
Constant
S = 6.07296   R-Sq = 93.1%   R-Sq(adj) =
Analysis of Variance
Source           DF   SS       MS     F   P
Regression
Residual Error    9   331.9    36.9
Total            11   4822.3
Predicted Values for New Observations
New Obs   Fit   SE Fit   95% CI   95% PI
Values of Predictors for New Observations
New Obs
Correlations: Y, X1, X2
Cell Contents: Pearson correlation
               P-Value
a. Analyze the above output to determine
the multiple regression equation.
Ans:
The multiple regression equation is
b. Find & interpret the multiple index of
determination (R-Sq).
Ans:
R-Sq = 93.1%
So, 93.1% of the variation in the current monthly auto insurance premium (Y) is explained by driving experience (X1, in years) & the number of driving violations in the past three years (X2).
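The R-Sq figure, along with the adjusted R-Sq and overall F statistic that go with it, can be recovered from the Analysis of Variance rows that survive in the output (Residual Error: DF 9, SS 331.9; Total: DF 11, SS 4822.3). This is a sketch under that reading, with illustrative variable names.

```python
from math import sqrt

ss_error, df_error = 331.9, 9     # Residual Error row
ss_total, df_total = 4822.3, 11   # Total row
k = df_total - df_error           # number of predictors (X1, X2) = 2

# Multiple coefficient of determination.
r_squared = 1 - ss_error / ss_total

# Adjusted R-Sq penalizes for the number of predictors.
adj_r_squared = 1 - (ss_error / df_error) / (ss_total / df_total)

# Standard error of the estimate and the overall F statistic.
ms_error = ss_error / df_error
s = sqrt(ms_error)
f_stat = ((ss_total - ss_error) / k) / ms_error

print(round(r_squared, 3), round(adj_r_squared, 3), round(s, 3), round(f_stat, 2))
```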
c. Perform the t-tests on & on (use a two-tailed test with 𝛼 = .05). Interpret your results.
Ans:
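With 12 observations & two predictors there are 12 - 3 = 9 degrees of freedom, so the two-tailed critical value at 𝛼 = .05 is t = 2.262 (the value 1.96 applies only to large samples). A t-stat greater than 2.262 in absolute value, with a p-value less than 0.05, indicates that the independent variable is a significant predictor of the dependent variable within & beyond the sample. The larger the absolute t-stat, the stronger the evidence that the independent variable is related to the dependent variable.
A t-stat of less than 2.262 in absolute value with a p-value greater than 0.05 indicates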
that the independent variable is NOT a significant