






Distributed Databases, Distributed DBMS Architectures, Storing Data, Distributed Catalog Management, Distributed Queries, Distributed Joins, Semijoin, Bloomjoin, Distributed Query Optimization, Updating Distributed Data, Synchronous Replication, Asynchronous Replication, Peer-to-peer Replication, Data Warehousing, Distributed Locking, Distributed Recovery, Two-phase Commit, Blocking, Database Management Systems, Raghu Ramakrishnan, Lecture Slides, Computer Science, University of Wisconsin, Unite
Typology: Slides







Database Management Systems, 2nd Edition. R. Ramakrishnan and Johannes Gehrke
Chapter 22, Part B
• Data is stored at several sites, each managed by a DBMS that can run independently.
• Distributed Data Independence: Users should not have to know where data is located (extends Physical and Logical Data Independence principles).
• Distributed Transaction Atomicity: Users should be able to write Xacts accessing multiple sites just like local Xacts.
• Users have to be aware of where data is located, i.e., Distributed Data Independence and Distributed Transaction Atomicity are not supported.
• These properties are hard to support efficiently.
• For globally distributed sites, these properties may not even be desirable due to administrative overheads of making location of data transparent.
• Homogeneous: Every site runs same type of DBMS.
• Heterogeneous: Different sites run different DBMSs (different RDBMSs or even non-relational DBMSs).
[Figure: several DBMSs (DBMS1, DBMS2, ...) connected through a Gateway]
• Client-Server: Client ships query to single site. All query processing at server.
• Collaborating-Server: Query can span multiple sites.
[Figure: CLIENTs sending a QUERY to one of several SERVERs (Client-Server); a QUERY spanning several SERVERs (Collaborating-Server)]
• Fragmentation
[Figure: tuples (by TID) of a relation R split into fragments such as R1, stored at SITE A and SITE B]
• At London, project Sailors onto join columns and ship this to Paris.
• At Paris, join Sailors projection with Reserves.
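The two semijoin steps above can be sketched in Python as follows. This is a minimal illustration, not the slides' own code: relations are modeled as lists of dicts, and the column name `sid`, the sample rows, and all function names are assumptions. The last function (completing the join back at London) is the usual final step of a semijoin and is not part of the text preserved here.

```python
def project_join_column(sailors, col="sid"):
    """At London: project Sailors onto the join column (duplicates removed)."""
    return {row[col] for row in sailors}


def reduce_reserves(reserves, shipped_keys, col="sid"):
    """At Paris: join the shipped projection with Reserves.

    Only Reserves tuples that match some Sailors tuple survive.
    """
    return [row for row in reserves if row[col] in shipped_keys]


def join_at_london(sailors, reduced_reserves, col="sid"):
    """Assumed final step: join Sailors with the reduced Reserves at London."""
    by_key = {}
    for r in reduced_reserves:
        by_key.setdefault(r[col], []).append(r)
    return [{**s, **r} for s in sailors for r in by_key.get(s[col], [])]


# Hypothetical sample data.
sailors = [{"sid": 1, "sname": "Ann"}, {"sid": 2, "sname": "Bob"}]
reserves = [{"sid": 1, "bid": 101}, {"sid": 3, "bid": 102}, {"sid": 1, "bid": 103}]

keys = project_join_column(sailors)        # shipped London -> Paris
reduced = reduce_reserves(reserves, keys)  # shipped Paris -> London
result = join_at_london(sailors, reduced)
```

The point of the technique is that only the join-column projection and the reduced Reserves cross the network, which pays off when many Reserves tuples have no matching sailor.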
• At London, compute a bit-vector of some size k:
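The rest of this slide is missing from the text, so the sketch below follows the usual description of a Bloomjoin rather than the slide itself: London hashes each join value into the range 0..k-1 and sets that bit, ships the bit-vector to Paris, and Paris keeps only the Reserves tuples whose join value hashes to a set bit. The column name `sid`, the sample rows, the use of Python's built-in `hash`, and the function names are all assumptions.

```python
def build_bit_vector(sailors, k, col="sid"):
    """At London: hash each join value into 0..k-1 and set that bit."""
    bits = [0] * k
    for row in sailors:
        bits[hash(row[col]) % k] = 1
    return bits


def filter_reserves(reserves, bits, col="sid"):
    """At Paris: keep Reserves tuples whose join value hashes to a set bit."""
    k = len(bits)
    return [row for row in reserves if bits[hash(row[col]) % k]]


# Hypothetical sample data with small integer keys (hash(n) == n for these).
sailors = [{"sid": 1}, {"sid": 2}]
reserves = [{"sid": 1, "bid": 101}, {"sid": 3, "bid": 102}, {"sid": 5, "bid": 103}]

bits = build_bit_vector(sailors, k=4)    # shipped London -> Paris
shipped = filter_reserves(reserves, bits)  # shipped Paris -> London

# Back at London the real join discards false positives such as sid 5,
# which collides with sid 1 when k = 4.
sailor_ids = {s["sid"] for s in sailors}
matches = [r for r in shipped if r["sid"] in sailor_ids]
```

Compared with shipping a projection of Sailors, the bit-vector is cheaper to send, at the price of letting some non-matching Reserves tuples through when hash values collide.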
• Cost-based approach; consider all plans, pick cheapest; similar to centralized optimization.
• Synchronous Replication: All copies of a modified relation (fragment) must be updated before the modifying Xact commits.
• Voting: Xact must write a majority of copies to modify an object; must read enough copies to be sure of seeing at least one most recent copy.
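A minimal sketch of the voting rule, under some assumptions that are not on the slide: each copy carries a version number, `n_copies`, `write_quorum`, and `read_quorum` are hypothetical parameter names, and "enough copies" is taken to mean that the read and write quorums together exceed the number of copies, so every read set overlaps every write set.

```python
def quorums_are_valid(n_copies, write_quorum, read_quorum):
    """Writes need a majority; reads must overlap every possible write set."""
    majority = write_quorum > n_copies / 2
    overlap = read_quorum + write_quorum > n_copies
    return majority and overlap


def quorum_write(copies, targets, value):
    """Write `value` to the target copies with a fresh, higher version."""
    new_version = max(copies[i][0] for i in targets) + 1
    for i in targets:
        copies[i] = (new_version, value)


def quorum_read(copies, targets):
    """Read the target copies and return the value with the highest version."""
    return max((copies[i] for i in targets), key=lambda c: c[0])[1]


# Hypothetical setup: 10 copies, write 7, read 4.
copies = [(0, "old")] * 10
quorum_write(copies, range(7), "new")        # updates copies 0..6
latest = quorum_read(copies, range(6, 10))   # reads copies 6..9
```

Here the read set contains three stale copies and one current copy (copy 6), and the version number is what lets the reader pick the current one.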
• Before an update Xact can commit, it must obtain locks on all modified copies.
• Log-Based Capture: The log (kept for recovery) is used to generate a Change Data Table (CDT).
• The Apply process at the secondary site periodically obtains (a snapshot or) changes to the CDT table from the primary site, and updates the copy.
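The Capture and Apply steps can be pictured with the following sketch. It is a simplification under stated assumptions: log records are dicts with hypothetical fields `lsn`, `key`, `value`, and a `committed` flag that is already final when Capture runs, and the replica is a plain dict. Real systems must also handle changes from Xacts that later abort.

```python
def capture(log, cdt):
    """Log-based capture: append committed changes from the log to the CDT."""
    last = cdt[-1]["lsn"] if cdt else 0
    for rec in log:
        if rec["lsn"] > last and rec["committed"]:
            cdt.append({"lsn": rec["lsn"], "key": rec["key"], "value": rec["value"]})


def apply_changes(cdt, replica, applied_lsn):
    """Apply: pull CDT entries newer than applied_lsn and update the copy.

    Returns the new high-water mark to pass to the next periodic run.
    """
    for change in cdt:
        if change["lsn"] > applied_lsn:
            replica[change["key"]] = change["value"]
            applied_lsn = change["lsn"]
    return applied_lsn


# Hypothetical recovery log at the primary site.
log = [
    {"lsn": 1, "key": "a", "value": 10, "committed": True},
    {"lsn": 2, "key": "b", "value": 20, "committed": False},
    {"lsn": 3, "key": "a", "value": 11, "committed": True},
]

cdt = []
capture(log, cdt)                           # runs at the primary
replica = {}
applied = apply_changes(cdt, replica, 0)    # runs periodically at the secondary
```

Between Apply runs the secondary copy can lag behind the primary, which is the defining trade-off of asynchronous replication.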
• A hot trend: Building giant “warehouses” of data from many sites.
• How do we manage locks for objects across many sites?
• Each site maintains a local waits-for graph.
• A global deadlock might exist even if the local graphs contain no cycles:
[Figure: local waits-for graphs over T1 and T2 at SITE A and SITE B, and the combined GLOBAL graph]
• Three solutions: Centralized (send all local graphs to one site); Hierarchical (organize sites into a hierarchy and send local graphs to parent in the hierarchy); Timeout (abort Xact if it waits too long).
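The centralized solution can be sketched as: collect every site's local waits-for graph at one site, take their union, and look for a cycle. The graph encoding (a dict mapping each waiting Xact to the set of Xacts it waits for), the edge directions in the sample data, and the function names are assumptions made for illustration.

```python
def merge_graphs(local_graphs):
    """Union the local waits-for graphs shipped to the central site."""
    merged = {}
    for graph in local_graphs:
        for waiter, holders in graph.items():
            merged.setdefault(waiter, set()).update(holders)
    return merged


def has_cycle(graph):
    """Depth-first search; reaching a node still on the DFS path is a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {}

    def visit(node):
        color[node] = GREY
        for nxt in graph.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:
                return True
            if state == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n) for n in list(graph))


# Assumed example: T1 waits for T2 at site A, T2 waits for T1 at site B.
site_a = {"T1": {"T2"}}
site_b = {"T2": {"T1"}}
global_graph = merge_graphs([site_a, site_b])

local_deadlock = has_cycle(site_a) or has_cycle(site_b)
global_deadlock = has_cycle(global_graph)
```

Neither local graph has a cycle, but their union does, which is why purely local detection is not sufficient.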
• Two new issues:
• If coordinator for Xact T fails, subordinates who have voted yes cannot decide whether to commit or abort T until coordinator recovers.
• If a remote site does not respond during the commit protocol for Xact T, either because the site failed or the link failed:
• Ack msgs used to let coordinator know when it can “forget” an Xact; until it receives all acks, it must keep T in the Xact Table.
• If coordinator fails after sending prepare msgs but before writing commit/abort log recs, when it comes back up it aborts the Xact.
• If a subtransaction does no updates, its commit or abort status is irrelevant.
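The role of acks and of the commit/abort log record can be seen in the following sketch of a two-phase commit round. It is a toy model under loud assumptions: messages are plain method calls, logs are in-memory lists, crashes and timeouts are not modeled, and the class names and the `vote_yes` and `responsive` flags are hypothetical. The slides describing the protocol itself are not in the text preserved here, so the sequence follows the standard description of 2PC.

```python
class Subordinate:
    def __init__(self, name, vote_yes=True, responsive=True):
        self.name = name
        self.vote_yes = vote_yes
        self.responsive = responsive
        self.log = []

    def prepare(self, xact):
        # Log a prepare (or abort) record, then vote yes or no.
        self.log.append(("prepare" if self.vote_yes else "abort", xact))
        return self.vote_yes

    def decide(self, xact, decision):
        # Log the coordinator's decision; an unresponsive site sends no ack.
        self.log.append((decision, xact))
        return "ack" if self.responsive else None


class Coordinator:
    def __init__(self):
        self.log = []
        self.xact_table = set()

    def run(self, xact, subordinates):
        self.xact_table.add(xact)
        # Phase 1: send prepare and collect votes.
        votes = [s.prepare(xact) for s in subordinates]
        decision = "commit" if all(votes) else "abort"
        self.log.append((decision, xact))  # the decision is now durable
        # Phase 2: inform the yes-voters (no-voters have already aborted).
        yes_voters = [s for s, v in zip(subordinates, votes) if v]
        acks = [s.decide(xact, decision) for s in yes_voters]
        # Only after all acks may the coordinator "forget" the Xact.
        if all(a == "ack" for a in acks):
            self.log.append(("end", xact))
            self.xact_table.discard(xact)
        return decision


coord = Coordinator()
d1 = coord.run("T1", [Subordinate("A"), Subordinate("B")])
d2 = coord.run("T2", [Subordinate("A"), Subordinate("B", vote_yes=False)])
d3 = coord.run("T3", [Subordinate("A"), Subordinate("B", responsive=False)])
```

In the third run the decision is commit, but T3 stays in the coordinator's Xact Table because one ack never arrives; a fuller model would keep resending the decision until it does.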
• When coordinator aborts T, it undoes T and removes it from the Xact Table immediately.
• Parallel DBMSs designed for scalable performance. Relational operators very well-suited for parallel execution.