Distributed Databases - Advanced Database System - Lecture Slides, Slides of Database Management Systems (DBMS)

Damodaram Sanjivayya National Law University Database Management Systems (DBMS)

Some concepts of Advanced Database System are Types Supported, Simple Data Model, Concurrency Control Two, Continuously Adaptive, Cost-Based Optimization, Data Access From Disks, Data Warehousing. The main points of this lecture are: Distributed Databases, Distributed Data Independence, Distributed Transaction Atomicity, Logical Data Independence Principles, Data is Located, Accessing Multiple, Homogeneous, Types of Distributed Databases, Heterogeneous, Different Sites.

Typology: Slides

2012/2013

Uploaded on 04/27/2013

dhanapati 🇮🇳



Distributed Databases

Introduction

Data is stored at several sites, each managed by an independent DBMS.
Distributed Data Independence: Users should not have to know where data is located (extends Physical and Logical Data Independence principles).
Distributed Transaction Atomicity: Users should be able to write transactions (Xacts) accessing multiple sites just like local Xacts.

Distributed DBMS Architectures

Client-Server
- Client ships query to a single site. All query processing is done at the server.
- Thin vs. fat clients.
- Set-oriented communication, client-side caching.

Collaborating-Server
- Query can span multiple sites.

[Figure: Client-Server (CLIENTs send a QUERY to one SERVER) and Collaborating-Server (a QUERY is handled by several SERVERs).]

Storing Data

Fragmentation
- Horizontal: Usually disjoint.
- Vertical: Lossless-join; tids.
Replication
- Gives increased availability.
- Faster query evaluation.
- Synchronous vs. Asynchronous.
  - Vary in how current copies are.

[Figure: relation R with four tuples identified by TID, with fragments of R stored at SITE A and SITE B.]
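The two fragmentation styles can be illustrated with a small sketch. The tuple layout `(tid, sid, sname, rating, age)`, the sample rows, and all function names below are assumptions made for illustration; they do not come from the slides.

```python
# Hypothetical Sailors tuples: (tid, sid, sname, rating, age).
SAILORS = [
    (1, 22, "dustin", 7, 45.0),
    (2, 31, "lubber", 8, 55.5),
    (3, 58, "rusty", 3, 35.0),
    (4, 71, "zorba", 4, 16.0),
]


def horizontal_fragments(rows, predicate):
    """Split rows into two disjoint fragments using a selection predicate."""
    matching = [r for r in rows if predicate(r)]
    rest = [r for r in rows if not predicate(r)]
    return matching, rest


def vertical_fragments(rows):
    """Split the columns into two fragments, keeping tid in both."""
    left = [(tid, sid, rating) for tid, sid, _sname, rating, _age in rows]
    right = [(tid, sname, age) for tid, _sid, sname, _rating, age in rows]
    return left, right


def reconstruct(left, right):
    """Rebuild the original rows by joining the vertical fragments on tid."""
    right_by_tid = {tid: (sname, age) for tid, sname, age in right}
    return [
        (tid, sid, right_by_tid[tid][0], rating, right_by_tid[tid][1])
        for tid, sid, rating in left
    ]
```

Because every vertical fragment carries the tid, the join on tid recovers exactly the original tuples, which is what "lossless-join" refers to.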

Distributed Queries

Query: SELECT AVG(S.age) FROM Sailors S WHERE S.rating > 3 AND S.rating < 7

Horizontally Fragmented: Tuples with rating < 5 at Shanghai, >= 5 at Tokyo.
- Compute SUM(age), COUNT(age) at both sites.
- If WHERE contained just S.rating > 6, just one site.
Vertically Fragmented: sid and rating at Shanghai, sname and age at Tokyo, tid at both.
- Must reconstruct relation by join on tid, then evaluate query.
Replicated: Sailors copies at both sites.
- Choice of site based on local costs, shipping costs.
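For the horizontally fragmented case, the AVG query above cannot be answered by averaging the two local averages; each site returns SUM(age) and COUNT(age) and the query site combines them. A minimal sketch, assuming each fragment is a list of hypothetical `(rating, age)` pairs and using made-up function names:

```python
def local_sum_count(fragment, low=3, high=7):
    """Evaluate SUM(age), COUNT(age) at one site for low < rating < high."""
    ages = [age for rating, age in fragment if low < rating < high]
    return sum(ages), len(ages)


def distributed_avg(fragments):
    """Combine the per-site partial aggregates into the global AVG(age)."""
    partials = [local_sum_count(fragment) for fragment in fragments]
    total = sum(s for s, _count in partials)
    count = sum(c for _sum, c in partials)
    return total / count if count else None


# Hypothetical fragments: rating < 5 at Shanghai, rating >= 5 at Tokyo.
shanghai = [(1, 30.0), (4, 20.0), (3, 99.0)]
tokyo = [(5, 40.0), (6, 60.0), (9, 50.0)]
print(distributed_avg([shanghai, tokyo]))  # 40.0
```

In this sample the local averages are 20.0 and 50.0, so averaging them would give 35.0 rather than the correct 40.0; shipping the sum and the count avoids that error.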

Distributed Joins

Fetch as Needed, page-oriented nested loops (Page NL), Sailors as outer:
- Cost: 500 D + 500 * 1000 (D + S)
- D is the cost to read/write a page; S is the cost to ship a page.
- If the query was not submitted at London, must add the cost of shipping the result to the query site.
- Can also do index nested loops (INL) at London, fetching matching Reserves tuples to London as needed.

Sailors (500 pages) is stored at LONDON; Reserves (1000 pages) is stored at PARIS.
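The fetch-as-needed cost formula can be evaluated directly. This is a sketch of the arithmetic only; the function name and the sample values chosen for D and S are assumptions, not figures from the slides.

```python
def fetch_as_needed_page_nl_cost(outer_pages, inner_pages, d, s):
    """Cost of page-oriented nested loops when the inner relation is remote.

    d is the cost to read/write one page, s is the cost to ship one page.
    """
    scan_outer = outer_pages * d
    # Every inner page is read remotely and shipped once per outer page.
    fetch_inner = outer_pages * inner_pages * (d + s)
    return scan_outer + fetch_inner


# Sailors (500 pages) as outer at London, Reserves (1000 pages) at Paris.
# Assumed unit costs: d = 1, s = 10.
print(fetch_as_needed_page_nl_cost(500, 1000, d=1, s=10))  # 5500500
```

The second term dominates, so the plan's cost is driven almost entirely by how many times Reserves pages cross the network.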

Semi-join

Idea: Trade off the cost of computing and shipping a projection against the cost of shipping the full relation.
Note: Especially useful if there is a selection on the full relation (that can be exploited via an index) and the answer is desired back at the initial site.

Semi-join

At London, project Sailors onto join columns and ship this to Paris.
At Paris, join Sailors projection with Reserves.
- Result is called reduction of Reserves wrt Sailors.
Ship reduction of Reserves to London.
At London, join Sailors with reduction of Reserves.
Idea: Useful if there is a selection on Sailors (reduce size), and answer desired at London.
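The semi-join steps above can be sketched in a few lines. The row layouts `(sid, sname)` for Sailors and `(sid, bid)` for Reserves, the function name, and the sample rows are illustrative assumptions; "shipping" is simulated by passing values between steps in one process.

```python
def semijoin_at_london(sailors, reserves):
    """Join Sailors (at London) with Reserves (at Paris) via a semi-join."""
    # London: project Sailors onto the join column and ship it to Paris.
    shipped_sids = {sid for sid, _sname in sailors}
    # Paris: join the projection with Reserves. The result is the
    # reduction of Reserves with respect to Sailors.
    reduction = [(sid, bid) for sid, bid in reserves if sid in shipped_sids]
    # Paris ships the reduction to London, where it is joined with Sailors.
    sname_by_sid = dict(sailors)  # assumes sid is unique in Sailors
    result = [(sid, sname_by_sid[sid], bid) for sid, bid in reduction]
    return reduction, result


sailors = [(22, "dustin"), (58, "rusty")]
reserves = [(22, 101), (31, 102), (22, 103), (95, 104)]
reduction, result = semijoin_at_london(sailors, reserves)
print(reduction)  # [(22, 101), (22, 103)]
print(result)     # [(22, 'dustin', 101), (22, 'dustin', 103)]
```

Only two of the four Reserves tuples are shipped back, which is the saving the technique aims for when few Reserves tuples have a matching sailor.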


Distributed Query Optimization

Cost-based approach; consider all plans, pick cheapest; similar to centralized optimization.
- Difference 1: Consider communication costs.
- Difference 2: Respect local site autonomy.
- Difference 3: New distributed join methods.
Query site constructs global plan, with suggested local plans describing processing at each site.
- If a site can improve the suggested local plan, it is free to do so.

Issues of Updating Distributed Data, Replication, Locking, Recovery, and Distributed Transactions

Distributed Locking

How do we manage locks across many sites?
- Centralized: One site does all locking.
  - Vulnerable to single site failure.
- Primary Copy: All locking for an object is done at the primary copy site for this object.
  - Reading requires access to the locking site as well as the site where the object is stored.
- Fully Distributed: Locking for a copy is done at the site where the copy is stored.
  - Locks at all sites while writing an object.
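One way to compare the three schemes is to ask which sites must be contacted for a lock. The sketch below is illustrative only: the function name, parameters, and site names are hypothetical, and the choice to lock a single copy for reads under the fully distributed scheme is an assumption (the slides only state that writes lock at all sites).

```python
def sites_to_lock(scheme, operation, copy_sites, primary=None, central=None):
    """Return the set of sites that must grant a lock on one object."""
    if scheme == "centralized":
        return {central}  # one site does all locking
    if scheme == "primary_copy":
        return {primary}  # locking done at the object's primary copy site
    if scheme == "fully_distributed":
        if operation == "write":
            return set(copy_sites)  # lock every copy while writing
        return {copy_sites[0]}  # assumption: a read locks one copy
    raise ValueError(f"unknown scheme: {scheme}")


copies = ["tokyo", "shanghai", "paris"]
print(sites_to_lock("primary_copy", "read", copies, primary="shanghai"))
print(sorted(sites_to_lock("fully_distributed", "write", copies)))
```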

Distributed Deadlock Detection

Each site maintains local waits-for graph.
A global deadlock might exist even if the local graphs contain no cycles.

[Figure: waits-for graphs over T1 and T2 at SITE A, at SITE B, and GLOBAL.]

Three solutions:
- Centralized (send all local graphs to one site);
- Hierarchical (organize sites into a hierarchy and send local graphs to parent in the hierarchy);
- Timeout (abort Xact if it waits too long).
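The centralized solution can be sketched as: take the union of the local waits-for graphs at one site, then look for a cycle. The graph representation (a dict mapping a waiting transaction to the set of transactions it waits for), the function names, and the edge directions in the sample graphs are assumptions made for illustration.

```python
def merge_graphs(local_graphs):
    """Union the edges of all local waits-for graphs into a global graph."""
    merged = {}
    for graph in local_graphs:
        for waiter, holders in graph.items():
            merged.setdefault(waiter, set()).update(holders)
    return merged


def has_cycle(graph):
    """Depth-first search; reaching a node still on the stack means a cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {}

    def visit(node):
        color[node] = GRAY
        for successor in graph.get(node, ()):
            state = color.get(successor, WHITE)
            if state == GRAY:
                return True
            if state == WHITE and visit(successor):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n) for n in list(graph))


site_a = {"T1": {"T2"}}  # at site A, T1 waits for T2
site_b = {"T2": {"T1"}}  # at site B, T2 waits for T1
print(has_cycle(site_a), has_cycle(site_b))      # False False
print(has_cycle(merge_graphs([site_a, site_b])))  # True
```

Neither local graph has a cycle, yet the merged graph does, which is the situation the slide describes.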

Two-Phase Commit (2PC)

Two rounds of communication:
- first, voting;
- then, termination.
- Both initiated by coordinator.
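The two rounds can be sketched as a coordinator loop. This is a simplified illustration, not a full protocol: the `Participant` class and its `prepare`/`finish` methods are hypothetical names, messages are plain method calls, and logging, timeouts, and failure recovery are left out.

```python
class Participant:
    """Hypothetical subordinate site that votes and then applies the decision."""

    def __init__(self, name, can_commit):
        self.name = name
        self.can_commit = can_commit
        self.outcome = None

    def prepare(self):
        # Voting round: answer True for "yes", False for "no".
        return self.can_commit

    def finish(self, decision):
        # Termination round: apply the coordinator's decision.
        self.outcome = decision


def two_phase_commit(participants):
    """Coordinator: collect votes, then tell every participant the outcome."""
    votes = [p.prepare() for p in participants]      # round 1: voting
    decision = "commit" if all(votes) else "abort"   # unanimous yes required
    for p in participants:                           # round 2: termination
        p.finish(decision)
    return decision


sites = [Participant("london", True), Participant("paris", False)]
print(two_phase_commit(sites))  # abort
```

A single "no" vote forces every site to abort, so all sites end with the same outcome.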

Summary

Parallel DBMSs designed for scalable performance. Relational operators very well-suited for parallel execution.
- Pipeline and partitioned parallelism.
Distributed DBMSs offer site autonomy and distributed administration.
Distributed DBMSs must revisit storage and catalog techniques, concurrency control, and recovery issues.