

Data Analysis Smoothing Methods In Regression, Exercises - Engineering - Prof. Cosma Shalizi, Advanced Data Analysis, Random Walk
Typology: Exercises


In this assignment, you will use the same data set of values for the S&P 500 stock index that was used in the lecture notes for 1 February. You will need to download SPhistory.short.csv from the class website. Problems 2 and 3 are about estimating the first percentile of the return distribution, Q(0.01), under various assumptions. The returns will be larger than this 99% of the time, so Q(0.01) gives an idea of how bad the bad performance will be, which is useful for planning. Note that a calendar year contains about 250 trading days, and so should average two or three days when returns are even worse than Q(0.01).

Include code for all problems as an appendix. Clearly indicate which block of code is for which problem. Comment your code when at all possible.
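The "two or three days" figure is just the tail probability times the number of trading days. The short Python sketch below spells out that arithmetic and shows one way to read off an empirical first percentile. It is only an illustration: the names `first_percentile` and `toy_returns` are hypothetical, the toy values are made up and are not the S&P 500 data, and the course's own examples may use a different language and a different quantile convention.

```python
import statistics

TRADING_DAYS_PER_YEAR = 250  # approximate figure quoted in the assignment
TAIL_PROB = 0.01             # Q(0.01) is the first percentile

# Expected number of days per year with returns below Q(0.01): about 2.5
expected_bad_days = TRADING_DAYS_PER_YEAR * TAIL_PROB


def first_percentile(values):
    """Empirical first percentile: the first of the 99 percentile cut points."""
    return statistics.quantiles(values, n=100)[0]


# Made-up, evenly spaced "returns" (NOT the real data), purely for illustration.
toy_returns = [i / 1000 for i in range(-500, 500)]
q01 = first_percentile(toy_returns)

# Fraction of toy values that fall below the estimated first percentile.
frac_below = sum(v < q01 for v in toy_returns) / len(toy_returns)
print(expected_bad_days, q01, frac_below)
```

By construction, roughly 1% of the values fall below the estimated Q(0.01), which is the property the assignment relies on.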
Min.       1st Qu.    Median    Mean       3rd Qu.   Max.
-0.094700  -0.006440  0.000467  -0.000064  0.006310  0.
Hint: Read the notes for 1 February.
(a) (5 points) Find the mean and standard deviation of the best-fitting Gaussian, and the Q(0.01) it implies.
(b) (5 points) Write a function which simulates a data set of the same size as the real data, using the independent Gaussian model you fit in (2a), and returns a list, with components named mean and sd, containing the parameter values estimated from the simulation output.
(c) (5 points) Write a function which takes as its argument a list with components named mean and sd, and returns the first percentile of the corresponding Gaussian distribution. Check that it works by verifying that when run with mean 5 and sd 2, it returns 0.347. Hint: Look at the examples in the notes of parametric bootstrapping.
(d) (10 points) Using the code you wrote in (2b) and (2c), find a 95% confidence interval for your estimate of Q(0.01) from (2a). Hint: Look at the examples in the notes of parametric bootstrapping.
(e) (5 points) What is the first percentile of the data? Is it within the confidence interval you found in (2d)?
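The workflow in parts (a) through (e) is: fit a Gaussian, simulate from the fit and re-estimate, map parameters to a first percentile, then turn the simulated estimates into an interval. The Python sketch below illustrates it under several assumptions, and is not the course's own code. All function names are hypothetical, a dict stands in for the list with components mean and sd, the data are synthetic Gaussian draws rather than the contents of SPhistory.short.csv, and the interval is a basic (pivotal) bootstrap interval, which is one common choice and may differ from the construction in the notes.

```python
import random
import statistics
from statistics import NormalDist


def fit_gaussian(x):
    """Estimate Gaussian parameters; stdev uses the n-1 denominator."""
    return {"mean": statistics.fmean(x), "sd": statistics.stdev(x)}


def simulate_and_refit(params, n, rng):
    """Draw n independent Gaussian values from params, then re-estimate."""
    sim = [rng.gauss(params["mean"], params["sd"]) for _ in range(n)]
    return fit_gaussian(sim)


def gaussian_first_percentile(params):
    """First percentile Q(0.01) of the Gaussian described by params."""
    return NormalDist(params["mean"], params["sd"]).inv_cdf(0.01)


def pivotal_ci_95(params, n, rng, n_boot=500):
    """Basic (pivotal) 95% parametric-bootstrap interval for Q(0.01)."""
    q_hat = gaussian_first_percentile(params)
    sims = [gaussian_first_percentile(simulate_and_refit(params, n, rng))
            for _ in range(n_boot)]
    cuts = statistics.quantiles(sims, n=40)  # cut points at 2.5%, 5%, ..., 97.5%
    lo, hi = cuts[0], cuts[-1]
    # Pivotal interval: reflect the simulated quantiles around the estimate.
    return (2 * q_hat - hi, 2 * q_hat - lo)


# Sanity check quoted in the assignment: mean 5, sd 2 gives 0.347.
check = round(gaussian_first_percentile({"mean": 5, "sd": 2}), 3)

# Synthetic stand-in data (assumption: NOT the S&P 500 returns).
rng = random.Random(42)
returns = [rng.gauss(0.0, 0.01) for _ in range(1000)]

fit = fit_gaussian(returns)
q_hat = gaussian_first_percentile(fit)
lower, upper = pivotal_ci_95(fit, len(returns), rng)
empirical_q = statistics.quantiles(returns, n=100)[0]  # first percentile of the data
print(check, q_hat, (lower, upper), empirical_q)
```

On real returns, whether the empirical first percentile falls inside the Gaussian interval is exactly what part (e) asks about. With the synthetic Gaussian data used here, the comparison says nothing about the S&P 500.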