Prepare for your exams
Get points
Guidelines and tips
Sell on Docsity
Docsity AI

Prepare for your exams

Study with the several resources on Docsity

Earn points to download

Earn points by helping other students or get them with a premium plan

Guidelines and tips

Sell on Docsity

Docsity AI

Prepare for your exams

Study with the several resources on Docsity

Find documents

Prepare for your exams with the study notes shared by other students like you on Docsity

Search for your university

Find the specific documents for your university's exams

Docsity AINEW

Summarize your documents, ask them questions, convert them into quizzes and concept maps

Explore questions

Clear up your doubts by reading the answers to questions asked by your fellow students

Earn points to download

Earn points by helping other students or get them with a premium plan

Share documents

20 Points

For each uploaded document

Answer questions

5 Points

For each given answer (max 1 per day)

All the ways to get free points

Get points immediately

Choose a premium plan with all the points you need

Study Opportunities

Choose your next study program

Get in touch with the best universities in the world. Search through thousands of universities and official partners

Community

Ask the community

Ask the community for help and clear up your study doubts

Free resources

Our save-the-student-ebooks!

Download our free guides on studying techniques, anxiety management strategies, and thesis advice from Docsity tutors

Berkeley Databases Prelim Notes, Study notes of Database Management Systems (DBMS)

University of California - Berkeley Database Management Systems (DBMS)

Study notes for the Berkeley DB prelim. It covers topics such as the architecture of a database system, relational query processors, storage management, transactions, and query processing. The notes also include comprehension questions and are based on original papers and notes from Michael Whittaker. The document could be useful as study notes or a summary for a university student preparing for a database exam.

Typology: Study notes

2021/2022

Uploaded on 05/11/2023

gaurishaknar 🇺🇸

3.4

(8)

232 documents

1 / 220

This page cannot be seen from the preview

Don't miss anything!

Berkeley Databases Prelim Notes

Shreya Shankar

Summer 2022

This document served as my study guide for the Berkeley DB prelim (syllabus here).

Much of the text is copy-pasted from the original papers. Some of the text is copy-pasted

from Michael Whittaker’s notes and self-contained paper summaries, both of which I am

very grateful for. A miniscule fraction of the text is copy-pasted from old CS262A and

CS286 notes, scattered across the internet.

My contribution is mainly the comprehension questions, which are spread throughout the

document. Although I spent several months1studying, I did not get to every subsection

in each paper.

1I would not have been able to do this without the support of my partner, who made me chai or coffee

almost every day.

Discover Study notes of Database Management Systems (DBMS) University of California - Berkeley

Partial preview of the text

Download Berkeley Databases Prelim Notes and more Study notes Database Management Systems (DBMS) in PDF only on Docsity!

Berkeley Databases Prelim Notes

Shreya Shankar

Summer 2022

This document served as my study guide for the Berkeley DB prelim (syllabus here). Much of the text is copy-pasted from the original papers. Some of the text is copy-pasted from Michael Whittaker’s notes and self-contained paper summaries, both of which I am very grateful for. A miniscule fraction of the text is copy-pasted from old CS262A and CS286 notes, scattered across the internet. My contribution is mainly the comprehension questions, which are spread throughout the document. Although I spent several months^1 studying, I did not get to every subsection in each paper.

(^1) I would not have been able to do this without the support of my partner, who made me chai or coffee almost every day.

1 Basics and Foundational Systems
- 1.1 Architecture of a Database System [33]
  - 1.1.1 Process Models
  - 1.1.2 Parallel Architecture
  - 1.1.3 Relational Query Processor
  - 1.1.4 Storage Management
  - 1.1.5 Transactions: Concurrency Control and Recovery
  - 1.1.6 Shared Components
- 1.2 The Five Minute Rule Twenty Years Later [25]
- 1.3 A History and Evaluation of System R [13]
  - 1.3.1 Phase Zero
  - 1.3.2 Phase One
  - 1.3.3 Phase Two
- 1.4 The POSTGRES Next-Generation Database System [49]
  - 1.4.1 The POSTGRES Data Model and Query Language
  - 1.4.2 The Rules System
  - 1.4.3 The Storage System
- 1.5 The Gamma Database Machine Project [19]
  - 1.5.1 Software Architecture of Gamma
  - 1.5.2 Query Processing
  - 1.5.3 Transaction and Failure Management
  - 1.5.4 Lessons Learned
2 Query Processing
2.1 Access Path Selection in a Relational Database Management System [47]
- 2.1.1 Processing of a SQL Statement
- 2.1.2 Research Storage System
- 2.1.3 Costs for Single Relation Access Paths
- 2.1.4 Access Path Selection for Joins
- 2.1.5 Nested Queries
2.2 Query Evaluation Techniques for Large Databases [24]
- 2.2.1 Architecture of Query Execution Engines
- 2.2.2 Sorting
- 2.2.3 Hashing
- 2.2.4 Disk Access
- 2.2.5 Aggregation and Duplicate Removal
- 2.2.6 Binary Matching Operators
- 2.2.7 Universal Quantification
- 2.2.8 Duality of Sort and Hash
- 2.2.9 Execution of Complex Query Plans
- 2.2.10 Mechanisms for Parallel Query Execution
- 2.2.11 Parallel Algorithms
- 2.2.12 Nonstandard Query Processing Algorithms
2.3 The Volcano Optimizer Generator: Extensibility and Efficient Search [26]
- 2.3.1 Optimizer Generator Input and Optimizer Operation
- 2.3.2 The Search Engine
- 2.3.3 Comparison with the EXODUS Optimizer Generator
2.4 Eddies: Continuously adaptive query processing [6]
- 2.4.1 Ripple Joins
  - 2.4.2 Rivers and Eddies
  - 2.4.3 Routing Tuples in Eddies
  - 2.4.4 Responding to Dynamic Fluctuations
  - 2.4.5 Advantages and Disadvantages of Eddies
- 2.5 Worst-Case Optimal Join Algorithms [42]
- 2.6 Datalog and Recursive Query Processing—Sections 1-3 and 6 [28]
  - 2.6.1 Baby’s First Example
  - 2.6.2 Language and Semantics
3 Transactions
- 3.1 Transaction Basics [45]
  - 3.1.1 Serializability
  - 3.1.2 Lock-based Concurrency Control
  - 3.1.3 Isolation Levels
  - 3.1.4 2PL, Serializability, and Recoverability
  - 3.1.5 Lock Conversions and Deadlocks
  - 3.1.6 Concurrency Control in B+ Trees
  - 3.1.7 Multiple-Granularity Locking
  - 3.1.8 Optimistic Concurrency Control
  - 3.1.9 Timestamp-Based Concurrency Control
  - 3.1.10 Multiversion Concurrency Control
  - ing and Partial Rollbacks Using Write-Ahead Logging [41] 3.2 ARIES: A Transaction Recovery Method Supporting Fine-Granularity Lock-
  - 3.2.1 The Log
  - 3.2.2 Other Recovery-Related Structures
  - 3.2.3 Checkpointing
  - 3.2.4 Executing the Algorithm
  - 3.2.5 Interaction with Concurrency Control
  - 3.2.6 System R Recovery Algorithm
  - 3.2.7 Parallelism
- 3.3 Granularity of locks and degrees of consistency in a shared data base [27]
  - 3.3.1 Granularity of Locks
  - 3.3.2 Degrees of Consistency
  - 3.3.3 Dependencies Among Transactions
  - 3.3.4 Backups and Recovery
- 3.4 Concurrency Control in Distributed Database Systems [8]
  - 3.4.1 Decomposition of the Concurrency Control Problem
  - 3.4.2 Synchronization techniques based on 2PL
  - 3.4.3 Synchronization techniques based on T/O
  - 3.4.4 Integrated Concurrency Control Methods
  - tions [3] 3.5 Concurrency Control Performance Modeling: Alternatives and Implica-
  - 3.5.1 Performance Model
  - 3.5.2 Performance Metrics
  - 3.5.3 Experiments
4 Indexing
- 4.1 Efficient Locking for Concurrent Operations on B-Trees [37]
  - 4.1.1 Storage Model
  - 4.1.2 Problem with Concurrent B+ trees
  - 4.1.3 B-link Trees
  - 4.1.4 Search
  - 4.1.5 Insertion
  - 4.1.6 Deletion
- 4.2 Improved Query Performance with Variant Indexes [43]
  - 4.2.1 Indexing Definitions
  - 4.2.2 Comparing Index types for Aggregate Evaluation
    - tables) 4.2.3 Evaluating OLAP-style Queries (join a fact table with multiple dim
  - gles [7] 4.3 The R*-tree: an Efficient and Robust Access Method for Points and Rectan-
  - 4.3.1 R-tree Variants
- 4.4 The Log-Structured Merge-Tree (LSM-Tree) [44]
  - 4.4.1 The two component LSM-tree algorithm
  - 4.4.2 The multi-component LSM-tree
  - 4.4.3 Concurrency Control in the LSM-tree
  - 4.4.4 Recovery in the LSM-tree
5 DBMS Architectures Revisited
- 5.1 C-Store: A Column-Oriented DBMS [50]
  - 5.1.1 Data Model
  - 5.1.2 RS
  - 5.1.3 WS
  - 5.1.4 Updates and Transactions
  - 5.1.5 Recovery
  - 5.1.6 Tuple Mover
  - 5.1.7 Query Optimizer
- 5.2 Hekaton: SQL Server’s Memory-Optimized OLTP Engine [20]
  - 5.2.1 Design Considerations
  - 5.2.2 Storage and Indexing
  - 5.2.3 Programmability and Query Processing
  - 5.2.4 Transaction Management
  - 5.2.5 Durability and Recovery
  - 5.2.6 Garbage Collection
- 5.3 Calvin: Fast Distributed Transactions for Partitioned Database Systems [52]
  - 5.3.1 Deterministic Database Systems
  - 5.3.2 System Architecture
  - 5.3.3 Calvin with disk-based storage and Checkpointing
- 5.4 Spanner: Google’s Globally-Distributed Database [15]
  - 5.4.1 Implementation
  - 5.4.2 Data Model
  - 5.4.3 Concurrency
  - 5.4.4 Differences between Calvin and Spanner
- 5.5 Building Efficient Query Engines in a High-Level Language [36]
  - 5.5.1 Example
  - 5.5.2 General Execution Workflow
  - 5.5.3 Optimizations
6 Distributed Data, Weak Isolation, Relaxed Consistency - tem [40] 6.1 Transaction Management in the R* Distributed Database Management Sys- - 6.1.1 Properties of an Ideal Atomic Commit Protocol - 6.1.2 The Two-Phase Commit Protocol - 6.1.3 Hierarchical 2PC
- 6.1.4 The Presumed Commit Protocol
- 6.1.5 Deadlock
6.2 Generalized Isolation Level Definitions [1]
- 6.2.1 Previous Work
- 6.2.2 Restrictiveness of Preventative Approach
- 6.2.3 Database Model and Transaction Histories
- 6.2.4 Conflicts and Serialization Graphs
- 6.2.5 New Generalized Isolation Specifications
- 6.2.6 Mixing Isolation Levels
- age System [51] 6.3 Managing Update Conflicts in Bayou, a Weakly Connected Replicated Stor-
- 6.3.1 System Model
- 6.3.2 Conflict Detection and Resolution
- 6.3.3 Replica Consistency
- 6.3.4 Write Stability and Commitment
- 6.3.5 Storage System Implementation Issues
- 6.3.6 Access Control
6.4 Dynamo: Amazon’s Highly Available Key-value Store [17]
- 6.4.1 System Interface
- 6.4.2 Partitioning Algorithm
- 6.4.3 Replication
- 6.4.4 Data Versioning
- 6.4.5 Execution of get () and put () operations
- 6.4.6 Handling Failures
- 6.4.7 Membership and Failure Detection
  - 6.4.8 Experiences and Lessons Learned
- 6.5 CAP Twelve Years Later: How the ”Rules” Have Changed [9]
  - 6.5.1 CAP-Latency Connection
  - 6.5.2 Managing Partitions
  - 6.5.3 Which operations should proceed?
  - 6.5.4 Partition Recovery
- 6.6 Consistency Analysis in Bloom: a CALM and Collected Approach [4]
  - 6.6.1 BUD: Bloom Under Development
7 Parallel Dataflow - cessing [18] 7.1 Parallel Database Systems: The Future of High Performance Database Pro- - 7.1.1 Trend to Shared Nothing DBs - 7.1.2 A Parallel Dataflow Approach to SQL Software - 7.1.3 Parallelism within Relational Operators - 7.1.4 State of the Art - 7.1.5 Future Directions
- 7.2 Encapsulation of Parallelism in the Volcano Query Processing System [23]
  - 7.2.1 Bracket Model of Parallelism
  - 7.2.2 Volcano System Design
  - 7.2.3 Operator Model of Parallelization
  - 7.2.4 Horizontal Parallelism
  - 7.2.5 Variants
- 7.3 MapReduce: simplified data processing on large clusters [16]
  - 7.3.1 Implementation
  - 7.3.2 Data Structures
  - 7.3.3 Refinements
- 7.4 TAG: A tiny aggregation service for ad-hoc sensor networks [39]
  - 7.4.1 Motes and Ad-Hoc Networks
  - 7.4.2 Query Model and Environment
  - 7.4.3 In Network Aggregate
  - 7.4.4 Optimizations
  - 7.4.5 Loss
  - Cluster Computing [53] 7.5 Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory
  - 7.5.1 Representing RDDs
  - 7.5.2 Memory Management
8 The Web and Databases
- 8.1 Combining Systems and Databases: A Search Engine Retrospective [10]
  - 8.1.1 SE Overview
  - 8.1.2 Logical Query Plan
  - 8.1.3 Query Implementation
  - 8.1.4 Implementation on a cluster
  - 8.1.5 Updates
  - 8.1.6 Fault Tolerance
  - 8.1.7 Other Topics
- 8.2 The Anatomy of a Large-Scale Hypertextual Web Search Engine [11]
  - 8.2.1 System Features
  - 8.2.2 Anatomy
  - 8.2.3 Crawling
  - 8.2.4 Indexing
  - 8.2.5 Search
- 8.3 WebTables: Exploring the Power of Tables on the Web [12]
  - 8.3.1 Relations
  - 8.3.2 Relation Search
  - 8.3.3 Indexing
  - 8.3.4 ACSDb Applications
9 Materialized Views, Cubes and Aggregation
- 9.1 Materialized Views [14]
- 9.2 On the Computation of Multidimensional Aggregates [29]
  - 9.2.1 Optimizations Possible
  - 9.2.2 Multiple Independent Group-By Queries (Independent Method)
  - 9.2.3 Hierarchy of Group-By Queries (Parent Method)
  - 9.2.4 Overlap Method
  - 9.2.5 Some Important Issues
- 9.3 Implementing Data Cubes Efficiently [30]
  - 9.3.1 The Lattice Framework
  - 9.3.2 The Cost Model
  - 9.3.3 Optimizing Data-Cube Lattices
  - 9.3.4 The Hypercube Lattice
- 9.4 Informix Under Control: Online Query Processing [32]
  - 9.4.1 Application scenarios and performance requirements
  - 9.4.2 Randomized Data Access and Physical Database Design
  - 9.4.3 Preferential data delivery: Online reordering
  - Very Large Data [2] 9.5 BlinkDB: Queries with Bounded Errors and Bounded Response Times on
  - 9.5.1 Background
  - 9.5.2 System Overview
  - 9.5.3 Sample Creation
  - 9.5.4 BlinkDB Runtime
10 Special-case Data Models: Streams, Semistructured, Graphs - Execution [5] 10.1 The CQL Continuous Query Language: Semantic Foundations and Query - 10.1.1 Introduction to Running Example - 10.1.2 Streams and Relations - 10.1.3 Abstract Semantics - 10.1.4 Linear Road in CQL - 10.1.5 Time Management - 10.1.6 Implementation - tured Databases [21] 10.2 Dataguides: Enabling Query Formulation and Optimization in Semistruc- - 10.2.1 DataGuides - 10.2.2 Existence of Multiple DataGuides
- 10.3 PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs [22]
  - 10.3.1 Related Work
  - 10.3.2 PowerGraph Abstraction
  - 10.3.3 Other Details
  - 10.3.4 Distributed Graph Placement
11 Data Integration, Provenance and Transformation
- 11.1 Schema Mapping as Query Discovery [34]
  - 11.1.1 Value Correspondences
  - 11.1.2 Constructing Schema Mappings
  - 11.1.3 Query Discovery Algorithm
  - 11.1.4 Provenance in Databases: Why, How, and Where (Cheney et al.)
  - 11.1.5 Why-provenance
  - 11.1.6 How-Provenance
  - 11.1.7 Where-Provenance
  - 11.1.8 Eager vs Lazy
- 11.2 Wrangler: Interactive Visual Specification of Data Transformation Scripts [35]
  - 11.2.1 Wrangler Interface Design
  - 11.2.2 Wrangler Inference Engine
12 Systems Support for ML
- 12.1 The MADlib Analytics Library [31]
  - 12.1.1 Macro Programming
  - 12.1.2 Micro-Programming
  - Descent [46] 12.2 HOGWILD!: A Lock-Free Approach to Parallelizing Stochastic Gradient
  - 12.2.1 Sparse Separable Cost Functions
  - 12.2.2 The HOGWILD! Algorithm
  - 12.2.3 Fast Rates for Lock-Free Parallelism
- 12.3 Scaling Distributed Machine Learning with the Parameter Server [38]
  - 12.3.1 Architecture
  - 12.3.2 Implementation

1 Basics and Foundational Systems

In this section, we discuss the following papers:

Architecture of a Database System
The Five Minute Rule Twenty Years Later
A History and Evaluation of System R
The POSTGRES Next-Generation Database System
The Gamma Database Machine Project

1.1 Architecture of a Database System [33]

Abstract Database Management Systems (DBMSs) are a ubiquitous and critical component of mod- ern computing, and the result of decades of research and development in both academia and industry. Historically, DBMSs were among the earliest multi-user server systems to be developed, and thus pioneered many systems design techniques for scalability and reliability now in use in many other contexts. While many of the algorithms and abstrac- tions used by a DBMS are textbook material, there has been relatively sparse coverage in the literature of the systems design issues that make a DBMS work. This paper presents an architectural discussion of DBMS design principles, including process models, paral- lel architecture, storage system design, transaction system implementation, query proces- sor and optimizer architectures, and typical shared components and utilities. Successful commercial and open-source systems are used as points of reference, particularly when multiple alternative designs have been adopted by different groups.

Consider an application where a gate agent at an airport requests the list of passengers for a flight. The following is the life of a query, in a nutshell (see Figure 1):

The client establishes a connection with the Client Communications Manager of a DBMS. In some cases, this is established directly via the ODBC or JDBC connectivity protocol (also known as a two-tier solution). In other cases, the client may commu- nicate with a “middle-tier server” (a web server, transaction processing monitor, or the like), which in turn uses a protocol to proxy the communication between the client and the DBMS. This is also known as a three-tier solution.

Upon receiving the client’s first SQL command, the DBMS assigns a ”thread of com- putation” to the command via its Process Manager. The most important decision that the DBMS needs to make at this stage in the query regards admission control: whether the system should begin processing the query immediately, or defer exe- cution until a time when enough system resources are available to devote to this query.
The query executes by invoking the code in the Relational Query Processor. This set of modules checks that the user has permissions, compiles a query plan, and gives it to the plan executor. The leaves of the query plan are “access methods” for the storage layer.
For each operator in the query plan, operators make calls to fetch data from the DBMS’ Transactional Storage Manager, which manages all data access (read) and manipulation (create, update, delete) calls. The storage system includes algorithms and data structures (e.g., tables, indexes) and a buffer management module that decides when and what data to transfer between disk and memory buffers. Locks are also acquired and managed in this module.
The stack of activities up till this point is unwinded, and state is freed by each of the modules.

1.1.1 Process Models

There are a variety of process models. Different decisions make sense these days depend- ing on your platform. This discussion exposes historical mismatches/distrust across OS and DBMS groups/vendors. Here are some definitions we’ll use:

OS process: combines an operating system (OS) program execution unit (a thread of control) with an address space private to the process. Included in the state main- tained for a process are OS resource handles and the security context. This single unit of program execution is scheduled by the OS kernel and each process has its own unique address space.
OS thread: OS program execution unit without additional private OS context and without a private address space. Each OS thread has full access to the memory of other threads executing within the same multithreaded OS Process. Thread execu- tion is scheduled by the operating system kernel scheduler and these threads are often called “kernel threads” or k-threads.
Lightweight thread package: application-level construct that supports multiple threads within a single OS process. Unlike OS threads scheduled by the OS, lightweight threads are scheduled by an application-level thread scheduler. The difference be- tween a lightweight thread and a kernel thread is that a lightweight thread is sched- uled in user-space without kernel scheduler involvement or knowledge.

DBMS worker: the thread of execution in the DBMS that does work on behalf of a DBMS Client. A 1:1 mapping exists between a DBMS worker and a DBMS Client: the DBMS worker handles all SQL requests from a single DBMS Client.

Lightweight threads have advantages and disadvantages. An advantage is that there is no need for OS kernel mode switch to schedule the next thread. A disadvantage is that blocking I/O requests will block all the other threads in the process, so lightweight thread packages must only issue non-blocking I/O requests. This makes for a more difficult programming model. In the context of working on a single processor, single machine system, the DBMS has 3 possible process models, ranging from simplest to most complex: (1) process per DBMS worker, (2) thread per DBMS worker, and (3) process pool.

Process per DBMS worker. DBMS workers are mapped directly onto OS processes. The OS scheduler manages the timesharing of DBMS workers and the DBMS programmer can rely on OS protection facilities to isolate standard bugs like memory overruns. The scaling issues arise because a process has more state than a thread and consequently consumes more memory. A process switch requires switching security context, memory manager state, file and network handle tables, and other process context. Thread switches don’t require this. This model is supported by IBM DB2, PostgreSQL, and Oracle.

Thread per DBMS worker. A single multithreaded process hosts all the DBMS worker activity. A dispatcher thread listens for new DBMS client connections, and each connec- tion is allocated a new thread. Challenges are: locking / making sure threads access shared memory correctly, handling race conditions, and porting across different OSes. This model scales much better to a large number of concurrent connections though and is used by IBM DB2, Microsoft SQL Server, MySQL, Informix, and Sybase. IBM DB2 and MySQL use OS thread per DBMS worker.

Process pool. This model is a variant of process per DBMS worker. Rather than allocating a full process per DBMS worker, they are hosted by a pool of processes. A central process holds all DBMS client connections and, as each SQL request comes in from a client, the request is given to one of the processes in the process pool. SQL Server uses this model for 99% of use cases. In the thread per DBMS worker model, data sharing is easy with all threads run in the same address space. In other models, shared memory is used for shared data structures and state. Additionally, we need ways to move data from the server back to the client. The two major types are disk I/O buffers and client communication buffers.

Disk I/O buffers. All persistent database data is staged through the DBMS buffer pool. With thread per DBMS worker, the buffer pool is simply a heap-resident data structure available to all threads in the shared DBMS address space. In the other two models, the

stores only a portion of the data—tables are spread over multiple systems in the cluster using horizontal data partitioning to allow each processor to execute independently of the others. During query execution, the query optimizer chooses how to horizontally re-partition tables and intermediate results. The DB administrator, as a result, has a big burden of partitioning tables appropriately to get good performance. Another challenge is that explicit cross-processor coordination must take place to handle transaction com- pletion, provide load balancing, and support certain maintenance tasks. For example, the processors must exchange explicit control messages for issues like distributed deadlock detection and two-phase commit. Finally, what do you do for partial failure? One op- tion is to bring down all nodes if one node fails. Another option is Informix’s ”data skip” approach—it just skips the data in the failed node. Another option is to employ some data redundancy approach. In chained declustering, when a node does fail, the system load is distributed fairly evenly over the remaining nodes: the n − 1 remaining nodes each do n/(n − 1) of the original work, and this form of linear degradation in performance continues as nodes fail. Shared-nothing is most commonly used today, especially in data warehouses.

Shared-disk system. All processors can access the disks with about the same perfor- mance, but are unable to access each other’s RAM. Oracle and IBM DB2 use this. One potential advantage of shared-disk over shared-nothing systems is their lower cost of ad- ministration: DBAs of shared-disk systems do not have to consider partitioning tables across machines in order to achieve parallelism. However, in massive databases, this is a moot point as partitioning is still required. Another advantage is that a single node’s failure is not terribly complicated to handle. Failure typically happens when the data is corrupt—it can still be redundantly stored, for example. A disadvantage of this model is that explicit coordination of data sharing across the machines is needed. It needs a cache-coherency protocol for managing the distributed buffer pools.

Non-Uniform Memory Access (NUMA) systems. Each system in the cluster can access its own local memory quickly, whereas remote memory access across the high-speed clus- ter interconnect is somewhat delayed. NUMA hardware architectures are an interesting middle ground between shared-nothing and shared-memory systems. They are much easier to program than shared-nothing clusters, and also scale to more processors than shared-memory systems by avoiding shared points of contention such as shared-memory buses. They haven’t been too successful, except in shared memory multi-processors. When the ratio of nearmemory to far-memory access times rises above the 1.5:1 to 2: range, the DBMS needs to employ optimizations to avoid serious memory access bottle- necks. One potential problem that arises from implementing thread per DBMS worker using DBMS threads becomes immediately apparent when using multiprocessor hardware. When mapping DBMS threads to multiple OS processes, decisions need to be made about how many OS processes to employ, how to allocate the DBMS threads to OS threads, and how to distribute across multiple OS processes. A good rule of thumb is to have one process

per physical processor.

1.1.3 Relational Query Processor

A relational query processor takes a declarative SQL statement, validates it, optimizes it into a procedural dataflow execution plan, and (subject to admission control) executes that dataflow program on behalf of a client program. The client program then fetches (“pulls”) the result tuples, typically one at a time or in small batches. In general, rela- tional query processing can be viewed as a single-user, single-threaded task. Concurrency control is managed transparently by lower layers of the system. In this section we focus on the common-case SQL commands: Data Manipulation Lan- guage (DML) statements including SELECT, INSERT, UPDATE, and DELETE. Data Defi- nition Language (DDL) statements such as CREATE TABLE and CREATE INDEX are typ- ically not processed by the query optimizer.

Query Parsing: the main tasks for the SQL Parser are to (1) check that the query is correctly specified, (2) resolve names and references, (3) convert the query into the internal format used by the optimizer, and (4) verify that the user is authorized to execute the query.. It canonicalizes table names into a fully qualified name of the form server.database.schema.table (four-part name). After canonicalizing the table names, the query processor then invokes the catalog manager to check that the table is registered in the system catalog and make sure the attribute references are correct. It is possible to constraint-check constant expressions during compilation as well; however, deferring this to execution time is quite common.
Query rewrite: simplifies and normalizes the query without changing its semantics. It can rely only on the query and on metadata in the catalog, and cannot access data in the tables. The rewriter’s main responsibilities are: view expansion, constant arithmetic evaluation, logical rewriting of predicates (including adding transitive predicates), semantic optimization (e.g., redundant join elimination), and subquery flattening/other heuristic rewrites.
Query optimizer: transforms an internal query representation into an efficient query plan for executing the query. The original System R prototype compiled query plans into machine code, whereas the early INGRES prototype generated an interpretable query plan. There are multiple components to query optimization: plan space (e.g., only consider left-deep plans, where the right-hand input to a join must be a base table and postpone Cartesian joins), selectivity estimation (e.g. histograms), search algorithms (e.g., top-down cascades), parallelization (e.g., two phase optimization: optimize then distribute), and auto-tuning.

Berkeley Databases Prelim Notes, Study notes of Database Management Systems (DBMS)

Related documents

Partial preview of the text

Download Berkeley Databases Prelim Notes and more Study notes Database Management Systems (DBMS) in PDF only on Docsity!

Berkeley Databases Prelim Notes

Shreya Shankar

Table of Contents

1.1 Architecture of a Database System [33]