

A homework assignment for a data analysis course in spring 2011. Students analyze the "Old Faithful" geyser data set, which has two columns: duration (the length of the latest eruption) and waiting (the interval between eruptions). The tasks include linear regression, plotting residuals, variance estimation, weighted least squares regression, nonparametric kernel regression, and estimating the conditional density of waiting given duration. The goal is to understand the relationship between duration and waiting time and to evaluate whether the data are compatible with homoskedastic noise.


The data set geyser in the library MASS contains a series of consecutive observations on the “Old Faithful” geyser at Yellowstone National Park, famed for the approximate regularity of its eruptions. There are two columns: duration, the length of the latest eruption of the geyser (in minutes), and waiting, the interval from the end of one eruption to the start of the next (also in minutes). Begin by obtaining the library (if you don’t have it already) and loading the data set. You should be able to reproduce this:
summary(geyser)
    waiting          duration
 Min.   : 43.00   Min.   :0.
 1st Qu.: 59.00   1st Qu.:2.
 Median : 76.00   Median :4.
 Mean   : 72.31   Mean   :3.
 3rd Qu.: 83.00   3rd Qu.:4.
 Max.   :108.00   Max.   :5.
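One way to obtain the library, load the data set, and produce the summary is sketched below. It assumes a standard R installation with access to a CRAN mirror. The install step is skipped when MASS is already present, which is usually the case because MASS ships with most R distributions.

```r
# Install MASS only if it is not already available on this machine.
if (!requireNamespace("MASS", quietly = TRUE)) {
  install.packages("MASS")
}

library(MASS)    # attach the package that provides the geyser data set
data(geyser)     # load geyser into the workspace as a data frame

summary(geyser)  # per-column minimum, quartiles, mean, and maximum
```

Using requireNamespace() for the check avoids reinstalling a package that is already installed. Calling str(geyser) afterwards is a quick way to confirm that both columns, waiting and duration, are numeric.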