






A series of exercises and solutions on introductory statistics concepts. It covers topics such as population, sample, parameters, statistics, and data analysis. The exercises are designed to help students understand and apply key statistical concepts in real-world scenarios, with examples and explanations that guide students through the problem-solving process.
Typology: Exams
An automobile manufacturer operates car dealerships that sell its cars. In order to assess the service that customers get at the dealerships, a team from the Customer Relationship Department was assembled. The team was sent to 20 randomly selected dealerships. Each dealership was visited once by the team for an entire working day. During a visit, the team interviewed all the customers who arrived at that dealership. One of the questions that each customer was asked was: "Do you currently possess a car made by our company?" The answers were recorded as "Yes", "No", or "Refuse to Answer", depending on the customer's response. Select the correct answer in each of the following 4 questions:

The population in this survey is: - Answer: B) Customers that arrive at dealerships of the company. The target of the survey is to assess the service that customers get at the dealerships.
The sample in this survey is: - Answer: B) Customers that arrive at the selected dealerships during the day of visit. All the customers that arrive at dealerships are the population. The subset of those that arrive at the 20 selected dealerships is the sample.

A parameter that may be targeted by this survey is: - Answer: A) The percentage of customers that possess a car made by the company. "The percentage of customers that possess a car made by the company" is the only answer that describes a characteristic of the population. The last answer describes a subset of the population, not a characteristic: the percentage of the subset within the entire population is a parameter, the subset itself is not.

A statistic that may be used to summarize the outcome of the survey is: - Answer: B) The percentage, among customers that were interviewed, of those that do not possess a car made by the company. "The percentage, among customers that were interviewed, of those that do not possess a car made by the company" is the only answer that describes a characteristic of the sample. The third answer describes the sample, not a characteristic of it: the percentage of the subset within the sample is a statistic, the sample itself is not.

The Customer Service Center of a large bank receives calls from customers. The number of incoming calls between 8:00 AM and 8:10 AM was recorded on consecutive days. The numbers of incoming calls during the working days of the month of September were: 8, 10, 6, 7, 12, 7, 8, 7, 8, 6, 9, 4, 6, 14, 10, 10, 10, 10, 6, 10, 12, 7

The numbers of incoming calls during the working days of the month of February were:
The difference between the averages is a characteristic of the data, that is, a statistic. A hint that the correct answer is a statistic and not a parameter is the fact that one can compute the value of the average from the data.

In which of the months does the number of incoming calls between 8:00 AM and 8:10 AM tend to be smaller? - Answer: B) February. The distribution in the month of February tends to concentrate in the range between 2 and 7. The distribution in the month of September tends to concentrate in the range between 6 and
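As a rough check on the September distribution described above, the September counts listed in the exercise can be tabulated in R. This is only a sketch: the names `sept`, `sept.freq`, `sept.mode` and `sept.mode.count` are illustrative and do not come from the original solution. The February counts are not visible in this text, so they are not included.

```r
# September call counts, copied from the exercise text.
sept <- c(8, 10, 6, 7, 12, 7, 8, 7, 8, 6, 9, 4, 6, 14,
          10, 10, 10, 10, 6, 10, 12, 7)

# Frequency of each distinct value; these are the bar heights of a bar plot.
sept.freq <- table(sept)
print(sept.freq)

# Value under the tallest bar, and the height of that bar.
sept.mode <- as.numeric(names(sept.freq)[which.max(sept.freq)])
sept.mode.count <- max(sept.freq)
print(c(value = sept.mode, days = sept.mode.count))
```

Calling `plot(sept.freq)` on the same table should draw the corresponding bar plot.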
The height of the fourth bar from the left of the bar plot for the month of February represents the fact that: - Answer: A) In 4 of the days of the month of February there were 5 incoming calls. The fourth bar from the left of the bar plot for the month of February is placed above the value "5" and its height is 4 units.

The location of the highest bar of the bar plot for the month of September represents the fact that: - Answer: D) 10 incoming calls came during 6 days of the month of September. The highest bar of the bar plot for the month of September is placed over the value "10" and its height is 6 units.

The next two questions refer to the following relative frequency table on hurricanes that have made direct hits on the U.S. between 1851 and 2004. Hurricanes are given a strength category rating based on the minimum wind speed generated by the storm. (http://www.nhc.noaa.gov/gifs/table5.gif)
Frequency of Hurricane Direct Hits

Category   # Direct Hits   Relative Freq.   Cum. Relative Freq.
1                          0.3993           0.3993
2          72
3                          0.2601
4                          0.0659
5                          0.0110           1.0000

What is the relative frequency of direct hits that were category 2 hurricanes? - Answer: A) 0.2637. The total of all relative frequencies is 1.0000. Denote by p the relative frequency of category 2 hurricanes. Observe that
0.3993 + p + 0.2601 + 0.0659 + 0.0110 = 1.0000
Consequently,
p = 1 - (0.3993 + 0.2601 + 0.0659 + 0.0110) = 1 - 0.7363 = 0.2637

Sixty adults with gum disease were asked the number of times per week they used to floss before their diagnoses. The (incomplete) results are shown below:

Flossing Frequency for Adults with Gum Disease

# per Week   Freq.   Relative Freq.   Cum. Relative Freq.
0            27      0.4500
1            18
3                                     0.9333
6            3
7            1       0.

Fill in the blanks in the table and answer the next two questions: (The numerical answer that you provide should be of the form "0.XXXX", with 4 digits to the right of the decimal point.)

The relative frequency of adults that flossed 6 times per week is: - Answer: 0.0500. It is given that the total number of adults with gum disease is 60. There are 3 such adults that flossed 6 times per week. Therefore, the relative frequency is 3/60 = 0.0500.

The relative frequency of adults that flossed at least once a week is: - Answer: 0.5500. The relative frequency of adults that do not floss at all is 0.4500. All other adults floss at least once a week. Their relative frequency is 1.0000 - 0.4500 = 0.5500.
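The arithmetic in the two relative-frequency exercises above can be reproduced with a few lines of R. This is a minimal sketch: the names `known`, `p2`, `n`, `rel.6` and `rel.at.least.1` are illustrative, and the only inputs are the figures quoted in the text.

```r
# Relative frequencies given in the hurricane table (categories 1, 3, 4, 5).
known <- c(0.3993, 0.2601, 0.0659, 0.0110)

# All relative frequencies sum to 1, so category 2 is the complement.
p2 <- 1 - sum(known)
print(round(p2, 4))

# Flossing survey: 60 adults in total.
n <- 60
rel.6 <- 3 / n                 # 3 adults flossed 6 times per week
rel.at.least.1 <- 1 - 0.4500   # everyone except those who never flossed
print(c(rel.6, rel.at.least.1))
```

Note that the sum of the known frequencies is subtracted as a whole, which is why the parentheses in 1 - (0.3993 + 0.2601 + 0.0659 + 0.0110) matter.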
After saving the file "ex2.csv" in the working directory one can use the code

> ex2 <- read.csv("ex2.csv")

in order to read the file into a data frame by the name "ex2". Writing the content of the object to the screen will produce:

> ex2
       id    sex age      bmi systolic diastolic  group
1 3695908 FEMALE  34 28.78903 112.5887  64.84949 NORMAL
2 5778095 FEMALE  33 18.      122.9261  78.71555 NORMAL
3 5138370   MALE  32 27.66339 128.3985  86.57248 NORMAL
...

The type of the variable "id" is: - Answer: B) factor. All the values are numbers. Technically, R treats this variable as a numeric sequence. However, one would typically not use this variable for statistical inference. Usually, it serves as a key, a unique identifier, in data set management.

The type of the variable "sex" is: - Answer: B) factor. Values are non-numeric: "MALE" and "FEMALE".

The type of the variable "diastolic" is: - Answer: A) numeric. All values are numeric.

A study was done to determine the age, type of activity, number of times per week and the duration (amount of time) of resident use of a local park in San Jose. The first house in the neighborhood around the park was selected randomly, and then the residents of every 8th house in the neighborhood around the park were interviewed. Answer the following 2 questions: 1.) "Age" is what type of data? 2.) "Type of activity" is what type of data? - Answer: 1.) B) quantitative/numeric
Age is a numeric measurement. 2.) A) qualitative/factor. This variable takes categorical values.

Identify the type of data that would be used to describe a response for each of the items below: 1.) Number of tickets sold to a concert. 2.) Favorite baseball team. 3.) Number of students enrolled at Evergreen Valley College. 4.) Distance to the closest movie theater. 5.) Number of competing computer spreadsheet software packages. - Answer:
1.) quantitative/numeric
2.) qualitative/factor
3.) quantitative/numeric
4.) quantitative/numeric
5.) quantitative/numeric

In Figure A you will find box plots for three sets of data. In Figure B are the histograms for the same sets of data, but in a different order. Match each box plot with its corresponding histogram. - Answer: Boxplot 1 = histogram B, Boxplot 2 = histogram A, Boxplot 3 = histogram C

The next 3 questions deal with the following data: 11.9, 11.0, 12.4, 16.9, 16.3, 13.3, 9.1, 17.0, 11.0, 9.3, 25.3, 17.4, 17.

The median is equal to: - Answer: 13.
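For the numeric-versus-factor questions earlier in this section, the type of each variable in a data frame can be inspected with `class`. The sketch below does not read the real "ex2.csv"; it builds a hypothetical stand-in from the three rows printed in the text, with the truncated bmi value set to `NA`. It also assumes `stringsAsFactors = TRUE`, because from R 4.0.0 onward `read.csv` and `data.frame` keep strings as character vectors by default rather than converting them to factors.

```r
# Hypothetical stand-in for "ex2.csv": the three rows shown in the text.
# The second bmi value is truncated in the source, so it is left as NA.
ex2 <- data.frame(
  id        = c(3695908, 5778095, 5138370),
  sex       = c("FEMALE", "FEMALE", "MALE"),
  age       = c(34, 33, 32),
  bmi       = c(28.78903, NA, 27.66339),
  systolic  = c(112.5887, 122.9261, 128.3985),
  diastolic = c(64.84949, 78.71555, 86.57248),
  group     = c("NORMAL", "NORMAL", "NORMAL"),
  stringsAsFactors = TRUE  # assumption: convert strings to factors, as the text describes
)

# Class of every column: "numeric" for measurements, "factor" for categories.
col.types <- sapply(ex2, class)
print(col.types)
```

In this sketch R reports "id" as numeric, in line with the remark that R technically treats it as a numeric sequence even though it serves as an identifier.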
The sample mean (x̄) is: - Answer: 2. Run the code:

> x.val <- c(1,2,3,4,5,6)
> freq <- c(4,7,3,3,2,2)
> rel.freq <- freq/sum(freq)
> x.bar <- sum(x.val*rel.freq)
> x.bar
[1] 2.

The sample standard deviation (s) is: - Answer: 1. Run the code:

> x.val <- c(1,2,3,4,5,6)
> freq <- c(4,7,3,3,2,2)
> rel.freq <- freq/sum(freq)
> x.bar <- sum(x.val*rel.freq)
> var.x <- sum((x.val-x.bar)^2*freq)/(sum(freq)-1)
> sqrt(var.x)
[1]
The first quartile (Q1) is: - Answer: 2. Run the code:

> x.val <- c(1,2,3,4,5,6)
> freq <- c(4,7,3,3,2,2)
> rel.freq <- freq/sum(freq)
> data.frame(x.val,cumsum(rel.freq))
  x.val cumsum.rel.freq.
1     1        0.1904762
2     2        0.5238095
3     3        0.6666667
4     4        0.8095238
5     5        0.9047619
6     6        1.0000000

Observe that more than 25% of the distribution has accumulated at value "2" but less than that at value "1".
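As a cross-check on the frequency-table calculations above, the table can be expanded into raw observations and passed to R's built-in `mean` and `sd`. This is a sketch of an alternative route, not the method used in the solutions; the names `x`, `s`, `cum.rel` and `q1` are illustrative.

```r
x.val <- c(1, 2, 3, 4, 5, 6)
freq  <- c(4, 7, 3, 3, 2, 2)

# Expand the frequency table into the 21 raw observations.
x <- rep(x.val, times = freq)

x.bar <- mean(x)  # equals sum(x.val * freq) / sum(freq)
s     <- sd(x)    # sample standard deviation, divisor n - 1

# First quartile: smallest value whose cumulative relative
# frequency reaches 25%.
cum.rel <- cumsum(freq) / sum(freq)
q1 <- x.val[which(cum.rel >= 0.25)[1]]

print(c(mean = x.bar, sd = s, Q1 = q1))
```

Both routes should give the same mean and standard deviation, since `sd` divides by the number of observations minus one, exactly as `sum(freq) - 1` does in the weighted formula.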