COMP5310: Principles of Data Science - University of Sydney, Lecture notes of Data Analysis & Statistical Methods

[Week 1] Data Science Introduction - What is Data Science?

Typology: Lecture notes

2018/2019

Uploaded on 04/20/2019

The University of Sydney
COMP5310: Principles of Data Science
W1: Introduction
Presented by Dr Ali Anaissi, School of Information Technologies

Curriculum at a glance

Whirlwind tour of:

  • Data Exploration
  • Data Engineering
  • Data Mining & Machine Learning
  • Making Decisions from Data

Focus on key activities of a data scientist

Questions and suggestions

We are very excited to be teaching this for the fourth year

Thank you for joining us!

Please feel free to:

  • Ask questions (we should know the answer, or know someone who does)
  • Share thoughts and suggestions on how we can improve

Questions about the MDS degree program or enrolments?

  • Keiko Narushima (MDS admin officer), SIT Building, room 2E-229
  • Phone: 02 8627 0872; email: [email protected]

UNIT ARRANGEMENTS

Introducing Team

Lecturer: Dr Ali Anaissi
Unit Coordinator: Dr Ali Anaissi, SIT Building J12, Level 2, [email protected]
Tutors:

  • Seid Miad Zandavi
  • Ragav Chalapathy
  • Omid Tavallaie
  • Claudio Cifuentes
  • Reza Behi
  • Dai Xiang
  • Heming Ni
  • Norman Yan

Resources

You will need a computer for exercises

  • Please bring a laptop; otherwise, use one of the machines here

Google Sheets for spreadsheet exercises [week 2]

  • Please create a Google account if you don’t already have one!

Jupyter Hub accounts for Python/SQL exercises

  • We will provide account details in week 3
  • We also recommend installing Anaconda and the PostgreSQL database on your own PC

Learn Python and SQL with Grok

  • Exercises will use Python from week 3
  • We provide self-guided Python learning through Grok
  • Please complete (sooner is better, week 5 at latest)

https://groklearning.com/course/usyd-comp5310-2019-s1/

(login with your University of Sydney email)

Find everything on Canvas

  • The web site for this unit is on Canvas
  • Use it to access contacts, schedule, readings, slides, etc.
  • Participate in Q&A with instructors and classmates

https://canvas.sydney.edu.au

Assessment

  • 10%: Participation
  • 13%: Project stage 1
  • 20%: Project stage 2
  • 7%: Project stage 3
  • 50%: Final exam

Participation

Objective: Ensure everybody is keeping up.
Requirements:

  • Submit code at the end of each exercise
  • Complete Grok exercises (not marked)

Output: Code/spreadsheets from exercises
Marking: 10% of overall mark

Project stage 2 and 3: Experiment, Quantify, Report

Objective: Define an experimental framework and complete analysis/visualisation, data mining, machine learning, etc.
Activities:

  • Define experimental framework
  • Perform analysis or build tool
  • Describe evaluation and conclusions

Output:

  • 4-page report describing framework, analysis and conclusions (plus code)
  • Presentation (2-3/3-4 mins)

Marking: 27% of overall mark

  • 20% report and code
  • 7% presentation

Final exam

Objective: Assess understanding of unit material, ability to frame data problems scientifically, and critical thinking about claims made based on data.
Activities:

  • Answer questions about lecture materials
  • Practical exercises and SQL queries
  • Describe an approach to answering a question with data
  • Critique a claim made based on data

Format:

  • Written examination
  • Must get 40% on exam to pass unit, per SIT policy

Marking: 50% of overall mark. The final mark is capped: it cannot exceed the exam mark by more than 10 marks.
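The assessment weights and the exam cap can be combined into a small calculation. The following Python sketch is an illustration only, not an official marking formula: it assumes every component is scored as a percentage from 0 to 100, that the cap is applied to the weighted total as "exam mark + 10", and that the 40% requirement is reported as a separate flag (the slides do not say what mark is recorded when that requirement is not met). The names `WEIGHTS`, `EXAM_CAP_MARGIN`, `EXAM_HURDLE` and `final_mark` are hypothetical.

```python
# Assessment weights as listed on the Assessment slide (fractions of the overall mark).
WEIGHTS = {
    "participation": 0.10,
    "project_stage_1": 0.13,
    "project_stage_2": 0.20,
    "project_stage_3": 0.07,
    "final_exam": 0.50,
}

EXAM_CAP_MARGIN = 10.0  # final mark cannot exceed the exam mark by more than this
EXAM_HURDLE = 40.0      # minimum exam percentage needed to pass the unit


def final_mark(scores):
    """Return (capped final mark, whether the exam hurdle is met).

    `scores` maps each component name in WEIGHTS to a percentage (0-100).
    Assumption: the cap is min(weighted total, exam mark + 10).
    """
    weighted_total = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
    exam = scores["final_exam"]
    capped = min(weighted_total, exam + EXAM_CAP_MARGIN)
    return round(capped, 2), exam >= EXAM_HURDLE


if __name__ == "__main__":
    example = {
        "participation": 90,
        "project_stage_1": 90,
        "project_stage_2": 90,
        "project_stage_3": 90,
        "final_exam": 50,
    }
    # Weighted total is 70, but the cap limits it to 50 + 10 = 60.
    print(final_mark(example))  # (60.0, True)
```

Under these assumptions, strong coursework cannot lift the final mark more than 10 marks above the exam result, which is why the exam carries more influence than its 50% weight alone suggests.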

LATENESS AND PLAGIARISM

Recipe for success

  • Attend scheduled classes, except in cases of illness, emergency, etc.
  • Plan 6-9 hours per week for preparation, practice, project work, etc.
  • Participate in classes and forums with respect and humility
  • Submit assessments on time
  • Let us know if you have any concerns, e.g., if you are falling behind