

This document contains information regarding a project for a camp
Typology: Assignments


Throughout the summer, you have learned data science and its applications on various datasets. Now, use this knowledge to analyze a real-world dataset of your choice.
The idea of this project is that you will use a dataset you are passionate about. It can be anything: a dataset from another class (e.g., any data you received as an Excel file), a dataset you found online, or a dataset you gather yourself. Some ideas include but are not limited to:
The dataset:
- Must be non-trivial, meaning it has at least 200 data points. That is, the number of rows multiplied by the number of columns must be greater than or equal to 200.
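As a quick sanity check on the size requirement, the rule above can be sketched in a few lines. The function name `meets_size_requirement` and its `minimum` parameter are hypothetical, chosen only for illustration; the function takes a `(rows, columns)` pair such as the one pandas exposes as `df.shape`.

```python
def meets_size_requirement(shape, minimum=200):
    """Return True if rows * columns is at least `minimum` (hypothetical helper)."""
    n_rows, n_cols = shape
    return n_rows * n_cols >= minimum


# 50 rows by 4 columns gives exactly 200 data points, so it passes.
print(meets_size_requirement((50, 4)))   # True
# 30 rows by 5 columns gives only 150 data points, so it fails.
print(meets_size_requirement((30, 5)))   # False
```

Once your data is loaded into a DataFrame, calling `meets_size_requirement(df.shape)` should tell you whether the dataset is large enough.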
Project Report
This is a Jupyter notebook (.ipynb) file similar to what we have worked with all summer long. It must have five sections, and each section should be clearly labeled with a markdown chunk. A guide to markdown syntax is linked on Canvas.
The five sections:
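from google.colab import drive  # only available inside Google Colab
import pandas as pd
drive.mount('/content/drive')  # asks for authorization, then mounts your Google Drive
df = pd.read_csv('/content/drive/My Drive/file.csv')  # file.csv is a placeholder for your own file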
Each section should contain several sentences AND several lines of code (with the exception of section three, where the code is one line to create plots).
You DO NOT need to explain any data science methods in your report. Assume that we know data science but not your specific dataset. Explain the results of your code and why you chose specific methods, but not what each line of code does.
Presentation
Your presentation will be short: 5-7 minutes. You will be presenting to the entire class. Create a slideshow for the presentation. DO NOT just put your report on screen and read it to the class. The slideshow should consist mainly of graphs and pictures generated by your report, it should use short bullet points that describe the graphs rather than full paragraphs, and it should contain ZERO code.
The presentation should contain three key things:
One Slide
This should be one slide containing only the most important information from your report and your presentation. It will be used at the gallery walk, so you should know everything on the slide without having to look at it. There are examples posted to Canvas.
The one slide should contain your 3-4 best plots, and each plot should provide unique information.