Understanding Indexing in Database Systems: ISAM and B+-tree - Prof. Jun Huan, Study notes of Deductive Database Systems

An introduction to indexing in database systems, focusing on isam (indexed sequential access method) and b+-tree. The basics of indexing, the motivation for using indexes, the structure and functionality of isam and b+-tree, and their comparisons. It also discusses the advantages and disadvantages of each indexing method.

Typology: Study notes

Pre 2010

Uploaded on 09/17/2009

koofers-user-0fl
koofers-user-0fl 🇺🇸

10 documents

1 / 29

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
EECS 647: Introduction to
Database Systems
Instructor: Luke Huan
Spring 2009
pf3
pf4
pf5
pf8
pf9
pfa
pfd
pfe
pff
pf12
pf13
pf14
pf15
pf16
pf17
pf18
pf19
pf1a
pf1b
pf1c
pf1d

Partial preview of the text

Download Understanding Indexing in Database Systems: ISAM and B+-tree - Prof. Jun Huan and more Study notes Deductive Database Systems in PDF only on Docsity!

  • EECS 647: Introduction toDatabase Systems Instructor: Luke HuanSpring

3/31/^

Luke Huan Univ. of Kansas^

Today’s Topic z How to locate data in a file^ fast? z Go with a simple example first z ISAM : Indexed sequence access method z An overview of the indexing strategy z Tree-based indexes+ z B -tree

Tree-Structured Indexes: Introduction z Tree-structured indexing techniques support both

range

search^ and^ equality search

z^ ISAM =Indexed Sequential Access Method^ z^ static structure; early index technology.

Motivation for Index z `` Find all students with gpa > 3.0 ’’ z If data file is sorted, do binary search z Cost of binary search in a database can be quite high,Why? z Simple idea: Create an `index’ file. Page N Page 1 Page 3^ Page 2 Can do binary search on (smaller) index file!

Data File kNk2 (^) k Index File PK P K P^0 1 2 1

P mK m index entry

3/31/^

Luke Huan Univ. of Kansas^

Updates with ISAM Example: insert 107 100, 200, …, 901Example: delete 129 100, 123, …, 192^107 Overflow block z Overflow chains and empty data blocks degradeperformance z Worst case: most records go into one long chain

Index blocks^

200, … 100, 108,123, 129,119, 121…

901, 907,996, 997,…… …^ …^

192, 197,200, 202,…… Data blocks

3/31/^

Luke Huan Univ. of Kansas^

A Note of Caution z ISAM is an old-fashioned idea z B+-trees are usually better, as we’ll see z But, ISAM is a good place to start to understand theidea of indexing z Upshot z Don’t brag about being an ISAM expert on yourresume z Do understand how they work, and tradeoffs with B

+^ -

trees

3/31/^

Luke Huan Univ. of Kansas^

Dense and sparse indexes z Dense: one index entry for each search key value z The index entry include (page id, record id) z Sparse: one index entry for each block z Records must be clustered according to the search key z The index entry include only page id.

3/31/^

Luke Huan Univ. of Kansas^

Dense versus sparse indexes z Index size z Sparse index is smaller z Requirement on records z Records must be clustered for sparse index z Lookup z Sparse index is smaller and may fit in memory z Dense index can directly tell if a record exists z Update z Easier for sparse index

3/31/^

Luke Huan Univ. of Kansas^

Clustered Indexes z An index is clustered^ if the entry order in the index fileis almost the same as that in the data file, otherwise, it isunclustered.

3/31/^

Luke Huan Univ. of Kansas^

Hash-based Indexes z Based on hash-table z Organized as a set of <key, value> tuples z Key is the attribute for indexing z Value is <page id, record id> From Database System Concepts, 5th Ed. ©Silberschatz, Korth and Sudarshan

3/31/^

Luke Huan Univ. of Kansas^

Recap

z^ An tree-based index structure built for the primary keyof a file that is sorted according to the primary key.

PRIMARYSPARSE^ CLUSTERED^ (Index File)(Data file)

3/31/^

Luke Huan Univ. of Kansas^

Recap II

z^ An tree-based index structure built for non-key attributeof a file that is sorted according to the primary key.

SECONDARYDENSE^ UNCLUSTERED^ (Index File)(Data file)

3/31/^

Luke Huan Univ. of Kansas^

+ Sample B -tree nodes^ degree: 2^150180120 to keysto keysto keys 100 · k < 120 120 · k < 150^150 ·^ k^ < 180

to keys^180 ·^ k to keys^100 ·^ k Non-leaf Leafto next leaf node in sequence (^120130) to records with these^ k^ values;or, store records directly in leaves

3/31/^

Luke Huan Univ. of Kansas^

+ B -tree balancing properties z Height constraint: all leaves at the same lowest level z Fan-out constraint: all nodes at least half full(except root)Max #^ Max #^ Min # Min #pointers^ keys^ active pointers

keys Non-leaf^ 2d+1^ 2d^

d+1^ d Root^ 2d+1^ 2d^

Leaf^ 2d^ 2d^

d^ d