Sampling Bias and Estimation: Practice Problems and Solutions - Prof. James Davenport, Assignments of Statistics

Solutions to practice problems related to sampling bias and estimation. The problems cover topics such as biased methods of sampling, simple random sampling, and process improvement. Students can use this document to understand the concepts of sampling and estimation, and to learn how to identify and avoid sampling bias.

Typology: Assignments

Pre 2010

Uploaded on 02/10/2009

Practice Problems # 03 – Solutions
1. You want to measure the height of students and estimate the mean height of this
population of students. Three biased methods of sampling are given below. Explain the
problem with each method of sampling.
a. Measure the heights of 50 students found in the gym during basketball season.
There are at least a couple of things wrong with this sampling method. First, the
method itself excludes those students who do not frequent the gym. Second, students
who are likely to play basketball, in general, are taller than the average student. This
sampling method produces a biased estimator with estimates that would tend to be
larger than the true average height of all students.
b. Measure the heights of all engineering majors.
Again, this method gives a biased estimate. There are more male engineering majors than
female, and males, in general, tend to be taller than females. This estimate would tend to
be larger than the true mean height.
c. Measure the heights of the students selected by choosing the first name on each
page of the university phone book.
This sampling technique is called systematic sampling or could be called cluster
sampling (each page could be considered a cluster). The main issue here is with the
sampling frame. The university phone book would include individuals that are not
students; hence non-students may be included in the sample. The use of the phone
book also neglects those without phones or with unlisted numbers. The bias is
produced because the target population does not match the sampled population.
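
The direction of the bias in part (a) can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the population of heights, the cubic selection weights standing in for "students found in the gym", and the function names are all hypothetical and are not part of the original problem.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 student heights in cm (made-up parameters).
population = [random.gauss(170, 10) for _ in range(10_000)]
true_mean = statistics.mean(population)

# Assumed selection weights for the gym: taller students are far more likely
# to be found there. The cubic form is arbitrary, chosen only for illustration.
gym_weights = [max(h - 140, 1) ** 3 for h in population]

def gym_sample(n):
    """Convenience sample drawn with the height-dependent weights above.

    random.choices draws with replacement, which is close enough for
    n = 50 out of 10,000.
    """
    return random.choices(population, weights=gym_weights, k=n)

def simple_random_sample(n):
    """Every student has the same chance of being selected."""
    return random.sample(population, n)

# Repeat each method 200 times and average the resulting estimates.
gym_means = [statistics.mean(gym_sample(50)) for _ in range(200)]
srs_means = [statistics.mean(simple_random_sample(50)) for _ in range(200)]

print(f"true mean:            {true_mean:.1f}")
print(f"gym sampling average: {statistics.mean(gym_means):.1f}")
print(f"SRS average:          {statistics.mean(srs_means):.1f}")
```

Under these assumptions the gym estimates sit well above the true mean no matter how often the sampling is repeated, while the simple random sample estimates center on it. Taking a larger sample from the gym would not remove the bias.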
2. True or False:
a. A simple random sample is guaranteed to reflect exactly the population from
which it was drawn.
This is FALSE. Which individuals end up in a simple random sample is left to chance, so
no sample is guaranteed to reflect the population exactly.
b. A simple random sample may or may not differ from the population from
which it was taken, but is free from any systematic tendency to favor one part of
the population over another.
This is TRUE. Under simple random sampling, every member of the population has the same
chance of being selected, so on average no part of the population is favored over another.
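
Both statements can be checked numerically. The sketch below uses a hypothetical population of 1,000 students, 400 of whom live on campus (an invented attribute, used only for illustration), and draws many simple random samples of 25.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 1 = lives on campus, 0 = does not.
population = [1] * 400 + [0] * 600
SAMPLE_SIZE = 25
EXPECTED_COUNT = 10  # 40% of 25, the count that would mirror the population

def sample_count():
    """Number of on-campus students in one simple random sample."""
    return sum(random.sample(population, SAMPLE_SIZE))

counts = [sample_count() for _ in range(2_000)]
shares = [c / SAMPLE_SIZE for c in counts]

exact_matches = sum(1 for c in counts if c == EXPECTED_COUNT)
print(f"first five sample shares:  {shares[:5]}")
print(f"samples matching 40%:      {exact_matches} of {len(counts)}")
print(f"average of sample shares:  {statistics.mean(shares):.3f}")
```

Individual samples land above and below 40%, which is why (a) is false, while the average over many samples stays close to 40%, which is the sense in which (b) is true.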
3. A certain process for manufacturing plastic bottles has been in use for a long time, and it is known that 10% of the bottles it produces are defective. A new process that is supposed to reduce the proportion of defectives is being tested. In a simple random sample of 100 bottles produced by the new process, 10 were found to be defective.

a. One of the engineers suggests that the test proves that the new process is no better than the old process, since the proportion of defectives is the same. Is this conclusion justified? Explain. Hint – Be careful to distinguish between the sample proportion and the population proportion.

The issue here is what is meant by “prove”. To prove that the new process is the same as the old process, we would need complete knowledge of both POPULATION PROPORTIONS, say p_old and p_new. The 10 defectives in 100 give only a sample proportion of 0.10, which is an estimate of p_new and would vary from one sample to the next. Since we will never have more than empirical evidence in the form of sample estimates, we can never conclude that we have “proven” anything.
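
To see why a sample proportion of 0.10 does not pin down p_new, one can compute how likely it is to observe 10 or more defectives in 100 bottles for a few candidate values of p_new. The candidate values 0.07 and 0.05 below are hypothetical, chosen only for illustration, and the sketch assumes the number of defectives follows a binomial distribution.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) when X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def prob_at_least(k, n, p):
    """P(X >= k) when X ~ Binomial(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))

n = 100
for p_new in (0.10, 0.07, 0.05):  # 0.07 and 0.05 are hypothetical improvements
    chance = prob_at_least(10, n, p_new)
    print(f"p_new = {p_new:.2f}: P(10 or more defectives) = {chance:.3f}")
```

If a genuinely better process can still produce 10 defectives in 100 with non-negligible probability, the observed sample cannot establish that the two processes are the same.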

b. Assume that there had been only 9 defective bottles in the sample of 100 instead of 10. Does this prove the new process is better than the old? Explain.

The answer is no, for the same reasons discussed above. There is also the question of whether 9 out of 100 versus 10 out of 100 is “different enough” to conclude that a difference exists between p_old and p_new.
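
Whether 9 out of 100 is “different enough” can be explored by simulating the old process itself. The sketch below assumes each bottle is defective independently with probability 0.10 and counts how often a sample of 100 shows 9 or fewer defectives. The number of simulated samples and the helper name are arbitrary choices for illustration.

```python
import random

random.seed(2)  # fixed seed so the sketch is repeatable

P_OLD = 0.10      # known defective rate of the old process
N_BOTTLES = 100   # sample size used in the problem
N_SIMS = 10_000   # number of simulated samples (arbitrary)

def defect_count(p, n):
    """Simulate n bottles, each defective with probability p."""
    return sum(1 for _ in range(n) if random.random() < p)

counts = [defect_count(P_OLD, N_BOTTLES) for _ in range(N_SIMS)]
share_nine_or_fewer = sum(1 for c in counts if c <= 9) / N_SIMS

print(f"share of old-process samples with 9 or fewer defectives: "
      f"{share_nine_or_fewer:.3f}")
```

In this simulation a large share of samples from the unchanged old process show 9 or fewer defectives, so observing 9 is weak evidence of improvement.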

4. a. Frequency Distribution of Complaint

Complaint   Count   Cumulative Count   Percent   Cumulative Percent   Graph of Percent
B               7                  7     11.67                11.67   ||||
C               3                 10      5.00                16.67   ||
F               9                 19     15.00                31.67   ||||||
J              10                 29     16.67                48.33   ||||||
M               4                 33      6.67                55.00   ||
N               6                 39     10.00                65.00   ||||
O              21                 60     35.00               100.00   ||||||||||||||

Technically, you should not construct a “histogram” for this data, since it is nominal scale data. Presented here are a bar chart and a Pareto chart.
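
The columns of the frequency distribution follow directly from the counts. As a sketch, the code below recomputes the cumulative count, percent, and cumulative percent from the seven complaint counts. The function name `frequency_table` is made up for this illustration.

```python
from itertools import accumulate

# Complaint counts taken from the frequency distribution above.
counts = {"B": 7, "C": 3, "F": 9, "J": 10, "M": 4, "N": 6, "O": 21}

def frequency_table(counts):
    """Return rows of (label, count, cumulative count, percent, cumulative percent)."""
    total = sum(counts.values())
    cumulative = accumulate(counts.values())
    rows = []
    for (label, count), cum in zip(counts.items(), cumulative):
        rows.append((
            label,
            count,
            cum,
            round(100 * count / total, 2),
            round(100 * cum / total, 2),
        ))
    return rows

table = frequency_table(counts)
for label, count, cum, pct, cum_pct in table:
    print(f"{label:<9} {count:>5} {cum:>10} {pct:>8.2f} {cum_pct:>11.2f}")
```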

[Figure: Pareto Chart for Product Nonconformity. Horizontal axis: Problem Type (Incorrect Component, Missing Component, Failed Component, Insufficient Solder, Excess Solder). Vertical axis: Frequency. Cumulative percent labels: 36, 58, 79, 91, 100.]

Pareto Numeric Report

Label                 Freq   Cumulative Freq   Percent   Cumulative Percent
Failed Component       126               126     21.43                21.43
Incorrect Component    210               336     35.71                57.14
Insufficient Solder     67               403     11.39                68.54
Excess Solder           54               457      9.18                77.72
Missing Component      131               588     22.28               100.00
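
A Pareto chart orders the categories from most to least frequent and tracks the running percentage. As a sketch, the code below sorts the five frequencies from the numeric report and recomputes the cumulative percentages, rounded to whole numbers like the labels on the chart (36, 58, 79, 91, 100). The function name `pareto_rows` is made up for this illustration.

```python
# Frequencies taken from the Pareto Numeric Report.
freqs = {
    "Failed Component": 126,
    "Incorrect Component": 210,
    "Insufficient Solder": 67,
    "Excess Solder": 54,
    "Missing Component": 131,
}

def pareto_rows(freqs):
    """Return rows of (label, freq, percent, cumulative percent), largest first."""
    total = sum(freqs.values())
    ordered = sorted(freqs.items(), key=lambda item: item[1], reverse=True)
    rows = []
    running = 0
    for label, freq in ordered:
        running += freq
        rows.append((
            label,
            freq,
            round(100 * freq / total, 2),
            round(100 * running / total),
        ))
    return rows

rows = pareto_rows(freqs)
for label, freq, pct, cum_pct in rows:
    print(f"{label:<20} {freq:>4} {pct:>6.2f} {cum_pct:>4}")
```

Note that the numeric report lists the categories in a different order than the chart, so its cumulative columns differ from the chart's labels even though the underlying frequencies are the same.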