DataBase Development and Implementation Lec11 - Physical DB Design, Study notes of Database Management Systems (DBMS)

Detail Summery about Physical DB Design, DBA role in physical realization of a database system, Database Administration, Other factors influence efficient DB structures, DBA role/ Operational Maintenance, Transformations: logical to physical.

Typology: Study notes

2010/2011

Uploaded on 09/08/2011

rossi46
rossi46 🇬🇧

4.5

(10)

313 documents

1 / 6

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
DBDI 30/05/2007
Lecture_11 / PhysicalDB 1
DBDI / Lecture 11
Physical DB Design
Dr. Ala Al-Zobaidie
The slides are based on the textbook Fundamentals of Database Systems by Elmasri & Navathe
30/05/2007 DBDI / PhysicalDBs 2
Objectives /1
DBA role in physical realization of a
database system
Understand & Apply Physical DB Design
Techniques:
Techniques for storing and finding data
Selection of techniques for particular
application
30/05/2007 DBDI / PhysicalDBs 3
Objectives /2
The main objective of good physical DB design
is to Improve Performance.
Improving performance requires incorporating
various techniques during the transformation
process.
The aim is to:
Reduce the volumes of data transferred to or
from secondary storage, and/or by
Decreasing the communication requirements
in a networked configuration.
30/05/2007 DBDI / PhysicalDBs 4
Database Administration
Management of physical realization of a
database system including:
Physical database design and
implementation,
Setting security and integrity controls,
Monitoring system performance, and
reorganizing the database.
30/05/2007 DBDI / PhysicalDBs 5
DBA role/ Operational Maintenance
Process of monitoring and maintaining
database system following installation.
Monitoring performance of system.
if performance falls, may require
tuning or reorganization of the
database.
Maintaining and upgrading database
application (when required).
Incorporating new requirements into
database application.
30/05/2007 DBDI / PhysicalDBs 6
Other factors influence efficient DB
structures
attribute access:
which attribute should have built-in fast
access paths?
–E.g. pre-constructed joins
buffers:
how large should the buffer space be?
clustering physical relations:
which base relations should be stored
together in the same file?
pf3
pf4
pf5

Partial preview of the text

Download DataBase Development and Implementation Lec11 - Physical DB Design and more Study notes Database Management Systems (DBMS) in PDF only on Docsity!

DBDI / Lecture 11

Physical DB Design

Dr. Ala Al-Zobaidie

The slides are based on the textbook Fundamentals of Database Systems by Elmasri & Navathe 30/05/2007 DBDI / PhysicalDBs 2

Objectives /

• DBA role in physical realization of a

database system

• Understand & Apply Physical DB Design

Techniques:

– Techniques for storing and finding data

– Selection of techniques for particular

application

30/05/2007 DBDI / PhysicalDBs 3

Objectives /

• The main objective of good physical DB designis to Improve Performance.

• Improving performance requires incorporatingvarious techniques during the transformation

process.

• The aim is to:

– Reduce the volumes of data transferred to orfrom secondary storage, and/or by

– Decreasing the communication requirements

in a networked configuration.

30/05/2007 DBDI / PhysicalDBs 4

Database Administration

• Management of physical realization of a

database system including:

– Physical database design and

implementation,

– Setting security and integrity controls,

– Monitoring system performance, and

reorganizing the database.

DBA role/ Operational Maintenance

• Process of monitoring and maintaining

database system following installation.

• Monitoring performance of system.

– if performance falls, may require

tuning or reorganization of the

database.

• Maintaining and upgrading database

application (when required).

• Incorporating new requirements into

database application.

Other factors influence efficient DB

structures

• attribute access:

– which attribute should have built-in fast

access paths?

–E.g. pre-constructed joins

• buffers:

– how large should the buffer space be?

• clustering physical relations:

– which base relations should be stored

together in the same file?

30/05/2007 DBDI / PhysicalDBs 7

Basic requirements of a physical

• RDB Physical implementation must:

– be translatable into tables– underlying physical structures must be hidden from

users

– be accessible by the value of any attribute, not just byunique identifiers (keys)

– be able to support select, project & join operations.

• Data Access can be:– Implicit:

– Physical proximity, or address calculation

– Explicit:

– using address pointers to link items

30/05/2007 DBDI / PhysicalDBs 8

Transformations: logical to physical

• Simplest solution:

– implement each table of the relational schema as asingle base relation

– to improve performance the following alternativesshould be considered:

– vertical partitioning– horizontal partitioning

– Joining (de-normalisation)

• Applicability of transformations depends on severalfactors:

– width of table (no. of bytes per row)– length of table (no. of rows)

– patterns of access to table:

30/05/2007 DBDI / PhysicalDBs 9

Logical to Physical Mapping

• Physical page = block

• Unit of information moved between disk

and memory

• Holds:

– table entries

– auxiliary information such as amount of

free space available in block and

starting position of each record

30/05/2007 DBDI / PhysicalDBs 10

Physical design criteria & storage structure

• Design Trade-off

– Storage Space vs. Response Time

– Query vs. Updates

• Basic Storage structure & Indexing

– File Organisations

– Indexing

– Tree structures

File Organisations

• Heap Files– Suitable when typical access is a file scan retrieving all records.

– Inappropriate when only selected records are to be retrieved.

• Sorted Files– Best if records must be retrieved in some order, or only a `range’

– Really used for DB storage unless a PK index is added to theof records is needed.

file.

• Hashed Files– Best when retrieval is based on an exact match.

– Not suitable when retrievals is based on:– pattern matching, range of values, part keys, or on a column

other than the hash key.

Indexing

• An Index on a file:– speeds up selections on the search key fields for the index.

• Any subset of the fields of a relation can be thesearch key for an index on the relation.

• Search key is not set of fields that uniquely identify the same as key (minimal a record in

a relation).

Page 1 Page 2 Page 3 Page N Data File

k1 k2 kN Index File

30/05/2007 DBDI / PhysicalDBs 19

Basic Tree structure

30/05/2007 DBDI / PhysicalDBs 20

Binary Tree

• Binary trees can be good provided theyare kept balanced (AVL trees), but

binary trees cannot be put directly ontodisk.

• e.g. A binary tree with N = 1024 Keys

average search time O(log2 N) ≈ 10

• i.e. 10 disc access on average for eachsearch - unacceptable.

• The goal is to minimize the no. ofaccesses required for a search which

means minimize the height of the tree.

30/05/2007 DBDI / PhysicalDBs 21

Degree (or the order) of a tree

• • Degree of a tree: is the max # children per parent.All nodes in an m-way search tree are of degree ≤ m. Larger

• degree means shallower tree.B-trees have been specifically designed for use on

secondary storage. They provide fast searching and are self-maintaining (balancing)

Total Search Time= 21 + 62 + 1*

Total Search Time= 1+22+43+2*

= 25 30/05/2007 DBDI / PhysicalDBs 22

ISAM

Leaf pages contain data entries.

Non-leafPages

Pages Overflow

page Primary pages

Leaf

P 0 K 1 P 1 K 2 P 2 K m P m

index entry

B-tree node

• Require a less deep tree with several keys at

each node, i.e. an m-way search tree rather

than one based on a binary tree.

• Also, instead of using overflow to handle

growth, cellular splitting is used.

B-tree criteria

• For an m-way tree of height h≥1, m is

max. # pointers from each node

,i.e. the order,

• Max no. of nodes =

• Since each node has max. m-1 keys,

Hence,

– Max no of keys is: m h^ -1 &

– Min no of keys is : 1 + 2 {  m/ 2 h −1^ - 1 }

i = 0^ ∑

h - 1

m i^ = ( m( m - 1 )^

h- 1 )

30/05/2007 DBDI / PhysicalDBs 25

Conditions of a B-tree

• A B-tree of order m is a multi-way lexicographic

search tree where every node has:

1. m/2 -1 ≤ K ≤ m-1 Keys (elements) appearing

in increasing order from left to right; an

exception is the root which may have as few as

one key.

2. A node with K keys either has K+1 pointers to

sons, which correspond to the partition induced

on the key-space by those K keys, or it has all

its pointers null, in which case it is terminal.

3. All terminal nodes are at the same level.

30/05/2007 DBDI / PhysicalDBs 26

Min & Max # keys in a B-tree

• For example, if h=3 and m=3,

– max. # keys which can be stored is:

• m h^ -1 = 3 3 – 1 = 26

– min. # keys which can be stored is:

• 1 + 2 [ 3/2 2 −1 ] =

• and average no. of accesses ≈ 3

• most keys are at the bottom level.

30/05/2007 DBDI / PhysicalDBs 27

Example

• for m-way tree, if the tree is balanced,

performance is very good.

• For a tree of h=3 and m=200;

• N min = 19,999 keys

• N max = 7,999,999 keys

• Hence, B-tree can be searched with a

small no. of accesses ≤ 3 even where

there is a large no. of keys.

30/05/2007 DBDI / PhysicalDBs 28

Insertion/ Deletion in a B-tree

• Find appropriate place at terminal node & insert.

• If overflow occurs (i.e. more than m-1 Keys), the

node splits in two and the middle Key (along with

pointers to its newly created sons) is passed up

to the next level for insertion and so on.

• Worse case: splitting will propagate along a path

to the root, resulting in the creation of a new root

node.

• Note that insertion should not violate any of the

properties

• Deletion is more complex ( This may be covered it in a tutorial)

Insertion Example: Inserting 8* in the tree

below Root

Notice that root was split, leading to increase in height.

Root 17

14* 16* 19* 20* 22^ 24 27* 29* 33* 34* 38*

Deletion Example: Deleting 19* and

Root 17

14* 16* 19* 20* 22^ 24 27* 29^ 33 34* 38*

Deleting 19* is easy. Deleting 20* is done with re-distribution. Notice howmiddle key, 24* is copied up.

Root 17