Data Management for Data Science
DATA 514
Lecture 1: Introduction, Data Models
Gradiance token on the whiteboard: please write it down
Class Goals
- The world is drowning in data!
- Needed: data scientists to help manage this data
  - Help domain scientists achieve new discoveries
  - Help companies provide better services
  - Help governments become more efficient
- Welcome to 514
- Existing tools PLUS data management principles
About Me
- Postdoctoral Researcher at UW since 2016
- PhD from Carleton University, Canada, Ottawa
- Born in Iran
- Research Interests: Decision Making System, Causal Inference from Big Data, Database Repair and Approximate Query Processing
Course Format
- Lectures Tuesdays, 5pm-7:50pm
- Sections: Tuesdays, 8-8:50pm
- Content: exercises, tutorials, questions
- Locations: here!
- 6 homework assignments
- 7 web quizzes
- Midterm and final
Textbook
Main textbook, available at the bookstore:
- Database Systems: The Complete Book, Hector Garcia-Molina, Jeffrey Ullman, Jennifer Widom. Second edition.
Most important: COME TO CLASS! ASK QUESTIONS!
Other Texts
Available at the Engineering Library (some on reserve):
- Database Management Systems, Ramakrishnan
- Fundamentals of Database Systems, Elmasri, Navathe
- Foundations of Databases, Abiteboul, Hull, Vianu
- Data on the Web, Abiteboul, Buneman, Suciu
Six Homework Assignments
H1: SQLite
H2: Basic SQL with SQLite
H3: Advanced SQL with SQL Server
H4: Conceptual Design
H5: JSON
H6: SQL in Java (JDBC)
Check the calendar for due dates. Submit via GitLab!
About the Assignments
- Homework assignments will take time, but most of that time should be spent learning
- Do them on your own
- Very practical assignments
- Put everything on your resume!!!
- SQL, SQLite, SQL Server, SQL Azure, JDBC, JSON, … Cloud!
Six Web Quizzes
- http://newgradiance.com/
- Create account, provide token
- Class token: on the whiteboard, write it down
- No late days: each quiz closes at its 11:00 deadline
- Provides explanations for wrong answers
- Short tests, take many times, best score counts
Exams
- Midterm and Final
- See course calendar for dates and times
- May bring 1 letter-size, double-sided piece of paper with notes
- Closed book. No computers, phones, watches, etc.!
- Check course website for dates
- Location: in class
Now onto the real stuff…
Outline of Today’s Lecture
- Overview of database management systems
- Course content
- Data Models
- SQL
What is a DBMS?
- A big program written by someone else that allows us to efficiently manage a large database and allows it to persist over long periods of time
Give examples of DBMSs
- Oracle, IBM DB2, Microsoft SQL Server, Vertica
- Open source: MySQL (Sun/Oracle), PostgreSQL, AsterixDB
- Open source library: SQLite
We will focus on relational DBMSs for most of the quarter
Database Management System
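The "persist over long periods of time" part of the definition can be illustrated with SQLite, the library DBMS listed above. The following is a minimal sketch using Python's standard `sqlite3` module; the table name `note`, the function `write_then_reopen`, and the file name `demo.db` are made up for illustration and are not part of the course material.

```python
import os
import sqlite3
import tempfile


def write_then_reopen(db_path):
    """Store one row, close the connection, then read it back on a new connection."""
    # First "session": create a table and insert a row.
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS note (id INTEGER PRIMARY KEY, body TEXT)")
    conn.execute("INSERT INTO note (body) VALUES (?)", ("hello, DATA 514",))
    conn.commit()  # make the change durable in the database file
    conn.close()

    # Second "session": a fresh connection still sees the data,
    # because the DBMS persisted it in the file at db_path.
    conn = sqlite3.connect(db_path)
    rows = conn.execute("SELECT body FROM note").fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    # Hypothetical scratch location so the demo leaves nothing behind.
    with tempfile.TemporaryDirectory() as tmp:
        print(write_then_reopen(os.path.join(tmp, "demo.db")))
```

The program never decides how bytes are laid out on disk; it only states what to store and what to retrieve, and the DBMS handles the rest.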
An Example: Online Bookseller
What data do we need?
What capabilities on the data do we need?
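One possible answer to the two questions above, sketched under assumptions: the data might include books, customers, and purchases, and the capabilities might include inserting records and asking questions such as "how much has this customer spent?". The table names, columns, sample rows, and the helpers `build_demo_db` and `total_spent` below are all hypothetical, chosen only to make the idea concrete with Python's standard `sqlite3` module.

```python
import sqlite3

# Hypothetical schema for an online bookseller: what data do we need?
SCHEMA = """
CREATE TABLE book (
    isbn  TEXT PRIMARY KEY,
    title TEXT NOT NULL,
    price REAL NOT NULL
);
CREATE TABLE customer (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE purchase (
    customer_id INTEGER NOT NULL REFERENCES customer(id),
    isbn        TEXT    NOT NULL REFERENCES book(isbn),
    quantity    INTEGER NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database and load a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO book VALUES ('isbn-001', 'Example Book A', 40.0)")
    conn.execute("INSERT INTO book VALUES ('isbn-002', 'Example Book B', 25.0)")
    conn.execute("INSERT INTO customer VALUES (1, 'Alice')")
    conn.execute("INSERT INTO purchase VALUES (1, 'isbn-001', 2)")
    conn.execute("INSERT INTO purchase VALUES (1, 'isbn-002', 1)")
    conn.commit()
    return conn


def total_spent(conn, customer_id):
    """One capability we might want: total amount a customer has spent."""
    row = conn.execute(
        "SELECT SUM(b.price * p.quantity) "
        "FROM purchase p JOIN book b ON b.isbn = p.isbn "
        "WHERE p.customer_id = ?",
        (customer_id,),
    ).fetchone()
    # SUM over zero rows yields NULL (None in Python); treat that as 0.0.
    return row[0] or 0.0


if __name__ == "__main__":
    conn = build_demo_db()
    print(total_spent(conn, 1))
    conn.close()
```

A real bookseller would need more than this (inventory, concurrent orders, recovery after crashes), which is the kind of capability a DBMS is meant to provide.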