COMP5310: Principles of Data Science
W1: Introduction
Presented by Dr Ali Anaissi, School of Information Technologies
Curriculum at a glance
Whirlwind tour of:
- Data Exploration
- Data Engineering
- Data Mining & Machine Learning
- Making Decisions from Data
Focus on key activities of a data scientist
Questions and suggestions
We are very excited to be teaching this for the fourth year
Thank you for joining us!
Please feel free to:
- Ask questions (we should know the answer or someone who does)
- Share thoughts and suggestions on how we can improve
Questions about the MDS degree program or enrolments?
- Keiko Narushima (MDS admin officer), SIT Building, room 2E-229
- phone: 02 8627 0872, email: [email protected]
UNIT ARRANGEMENTS
Introducing Team
Lecturer: Dr Ali Anaissi
Unit Coordinator: Dr Ali Anaissi, SIT Building J12, Level 2, [email protected]
Tutors:
- Seid Miad Zandavi
- Ragav Chalapathy
- Omid Tavallaie
- Claudio Cifuentes
- Reza Behi
- Dai Xiang
- Heming Ni
- Norman Yan
Resources
You will need a computer for exercises
- Please bring a laptop, else use a machine here
Google Sheets for spreadsheet exercises [week 2]
- Please create a Google account if you don’t already have one!
Jupyter Hub accounts for Python/SQL exercises
- We will provide account details in week 3
- We also recommend installing Anaconda and a PostgreSQL database on your own PC
Learn Python and SQL with Grok
- Exercises will use Python from week 3
- We provide self-guided Python learning through Grok
- Please complete (sooner is better, week 5 at latest)
https://groklearning.com/course/usyd-comp5310-2019-s1/
(login with your University of Sydney email)
Find everything on Canvas
- The web site for this unit is on Canvas
- Use it to access contacts, schedule, readings, slides, etc
- Participate in Q&A with instructors and classmates
https://canvas.sydney.edu.au
Assessment
- 10%: Participation
- 13%: Project stage 1
- 20%: Project stage 2
- 7%: Project stage 3
- 50%: Final exam
Participation
Objective: Ensure everybody is keeping up.
Requirements:
- Submit code at end of each exercise
- Complete Grok exercises (not marked)
Output: Code/spreadsheets from exercises
Marking: 10% of overall mark
Project stage 2 and 3: Experiment, Quantify, Report
Objective: Define an experimental framework and complete analysis/visualisation, data mining, machine learning, etc.
Activities:
- Define experimental framework
- Perform analysis or build tool
- Describe evaluation and conclusions
Output:
- 4-page report describing framework, analysis and conclusions (plus code)
- Presentation (2-3/3-4 mins)
Marking: 27% of overall mark
- 20% report and code
- 7% presentation
Final exam
Objective: Assess understanding of unit material, ability to frame data problems scientifically, and critical thinking about claims made based on data.
Activities:
- Answer questions about lecture materials
- Practical exercises and SQL queries
- Describe an approach to answering a question with data
- Critique a claim made based on data
Format:
- Written examination
- Must get 40% on exam to pass unit, per SIT policy
Marking: 50% of overall mark. The final mark is capped: it cannot exceed the exam mark by more than 10 marks.
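The assessment weights and the exam rules can be combined in a short calculation. The following is only a sketch of how the final mark might be worked out. It assumes every component is scored as a percentage (0-100), that "exam mark" means the exam percentage, and that the cap is applied to the weighted total. The function and variable names are hypothetical, and the unit's actual marking procedure may differ.

```python
# Weights from the Assessment slide (they sum to 100).
WEIGHTS = {
    "participation": 10,
    "project_stage_1": 13,
    "project_stage_2": 20,
    "project_stage_3": 7,
    "final_exam": 50,
}

EXAM_HURDLE = 40      # minimum exam percentage needed to pass the unit
EXAM_CAP_MARGIN = 10  # final mark may exceed the exam mark by at most this


def final_mark(scores):
    """Return (final mark, exam hurdle met) for percentage scores per component.

    Assumption: the cap is applied after the weighted total is computed.
    """
    weighted = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100
    exam = scores["final_exam"]
    capped = min(weighted, exam + EXAM_CAP_MARGIN)
    return capped, exam >= EXAM_HURDLE


# Hypothetical student: strong coursework, modest exam result.
example = {
    "participation": 100,
    "project_stage_1": 90,
    "project_stage_2": 90,
    "project_stage_3": 90,
    "final_exam": 50,
}
print(final_mark(example))
```

In this made-up example the weighted total is 71, but under the assumed reading of the cap the final mark would be limited to 60 (exam mark of 50 plus 10).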
LATENESS AND PLAGIARISM
Recipe for success
- Attend scheduled classes except for illness, emergency, etc
- Plan 6-9 hours per week for preparation, practice, project, etc
- Participate in classes and forums with respect and humility
- Submit assessments on time
- Let us know if you have any concerns, e.g., if you are falling behind