Database Optimization: Understanding RAID, Parity Checking, and Index Types, Slides of Database Management Systems (DBMS)

An in-depth exploration of raid (redundant array of independent disks) and indexing techniques used for database optimization. Topics covered include raid arrays, raid levels (0, 1, and 3), parity checking, recovery with parity, indexing concepts, and choosing indexes. Learn how to efficiently store and access data, improve speed, and ensure data redundancy.

Typology: Slides

2012/2013

Uploaded on 04/26/2013

duurga
duurga 🇮🇳

4.6

(25)

121 documents

1 / 30

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
Physical DB Issues, Indexes,
Query Optimisation
Docsity.com
pf3
pf4
pf5
pf8
pf9
pfa
pfd
pfe
pff
pf12
pf13
pf14
pf15
pf16
pf17
pf18
pf19
pf1a
pf1b
pf1c
pf1d
pf1e

Partial preview of the text

Download Database Optimization: Understanding RAID, Parity Checking, and Index Types and more Slides Database Management Systems (DBMS) in PDF only on Docsity!

Physical DB Issues, Indexes,

Query Optimisation

Physical Design

  • Design so far
    • E/R modelling helps find the requirements of a database
    • Normalisation helps to refine a design by removing data redundancy - Physical design - Concerned with storing and accessing the data - How to deal with media failures - How to access information efficiently

RAID Level 0

  • Files are split across

several disks

  • For a system with n disks, each file is split into n parts, one part stored on each disk
  • Improves speed, but no redundancy

Disk 1 Disk 2 Disk 3

Data

Data1 Data2 Data

RAID Level 1

  • As RAID 0 but with

redundancy

  • Files are split over multiple disks
  • Each disk is mirrored
  • For n disks, split files into n/2 parts, each stored on 2 disks
  • Improves speed, has redundancy, but needs lots of disks

Disk 1 Disk 2 Disk 3

Data

Data1 Data

Disk 4

Recovery With Parity

  • If one of our pieces

of data is lost we can

recover it

  • Just compute it as the parity of the remaining data and our original parity information

RAID Level 3

  • Data is striped over

disks, and a parity

disk for redundancy

  • For n disks, we split the data in n-1 parts
  • Each part is stored on a disk
  • The final disk stores parity information (^) Disk 1 Disk 2 Disk 3

Data

Data1 Data

Disk 4

Data3 Parity

Indexes

  • Indexes are to do

with ordering data

  • The relational model says that order doesn’t matter
  • From a practical point of view it is very important - Types of indexes - Primary or clustered indexes affect the order that the data is stored in a file - Secondary indexes give a look-up table into the file - Only one primary index, but many secondary ones

Index Example

  • A telephone book
    • You store people’s addresses and phone numbers
    • Usually you have a name and want the number
    • Sometimes you have a number and want the name - Indexes - A clustered index can be made on name - A secondary index can be made on number

Choosing Indexes

  • You can only have

one primary index

  • The most frequently looked-up value is often the best choice
  • Some DBMSs assume the primary key is the primary index, as it is usually used to refer to rows - Don’t create too

many indexes

  • They can speed up queries, but they slow down inserts, updates and deletes
  • Whenever the data is changed, the index may need to change

Index Example

  • A product database,

which we want to

search by keyword

  • Each product can have many keywords
  • The same keyword can be associated with many products

prodID prodName

prodID keyID

keyID keyWord

Products

WordLink

Keywords

Creating Indexes

  • In SQL we use

CREATE INDEX :

CREATE INDEX

ON

()

  • Example: CREATE INDEX keyIndex ON Keywords (keyWord) CREATE INDEX linkIndex ON WordLink(keyID) CREATE INDEX prodIndex ON Products (prodID)

Query Processing

  • Once a database is

designed and made

we can query it

  • A query language (such as SQL) is used to do this
  • The query goes through several stages to be executed - Three main stages - Parsing and translation - the query is put into an internal form - Optimisation - changes are made for efficiency - Evaluation - the optimised query is applied to the DB

Some Relational Operators

  • Product ×
    • Product finds all the combinations of one tuple from each of two relations
    • R1 × R2 is equivalent to **SELECT DISTINCT *** FROM R1, R - Selection σ - Selection finds all those rows where some condition is true - σ (^) cond R is equivalent

to

SELECT DISTINCT *

FROM R

WHERE

Some Relational Operators

  • Projection π
    • Projection chooses a set of attributes from a relation, removing any others
  • π (^) A1,A2,… R is

equivalent to

SELECT DISTINCT

A1, A2, ...

FROM R

  • Projection, selection

and product are

enough to express

queries of the form

SELECT FROM

WHERE