Download Understanding Indexing in Database Systems: ISAM and B+-tree - Prof. Jun Huan and more Study notes Deductive Database Systems in PDF only on Docsity!
- EECS 647: Introduction toDatabase Systems Instructor: Luke HuanSpring
3/31/^
Luke Huan Univ. of Kansas^
Today’s Topic z How to locate data in a file^ fast? z Go with a simple example first z ISAM : Indexed sequence access method z An overview of the indexing strategy z Tree-based indexes+ z B -tree
Tree-Structured Indexes: Introduction z Tree-structured indexing techniques support both
range
search^ and^ equality search
z^ ISAM =Indexed Sequential Access Method^ z^ static structure; early index technology.
Motivation for Index z `` Find all students with gpa > 3.0 ’’ z If data file is sorted, do binary search z Cost of binary search in a database can be quite high,Why? z Simple idea: Create an `index’ file. Page N Page 1 Page 3^ Page 2 Can do binary search on (smaller) index file!
Data File kNk2 (^) k Index File PK P K P^0 1 2 1
P mK m index entry
3/31/^
Luke Huan Univ. of Kansas^
Updates with ISAM Example: insert 107 100, 200, …, 901Example: delete 129 100, 123, …, 192^107 Overflow block z Overflow chains and empty data blocks degradeperformance z Worst case: most records go into one long chain
Index blocks^
200, … 100, 108,123, 129,119, 121…
901, 907,996, 997,…… …^ …^
192, 197,200, 202,…… Data blocks
3/31/^
Luke Huan Univ. of Kansas^
A Note of Caution z ISAM is an old-fashioned idea z B+-trees are usually better, as we’ll see z But, ISAM is a good place to start to understand theidea of indexing z Upshot z Don’t brag about being an ISAM expert on yourresume z Do understand how they work, and tradeoffs with B
+^ -
trees
3/31/^
Luke Huan Univ. of Kansas^
Dense and sparse indexes z Dense: one index entry for each search key value z The index entry include (page id, record id) z Sparse: one index entry for each block z Records must be clustered according to the search key z The index entry include only page id.
3/31/^
Luke Huan Univ. of Kansas^
Dense versus sparse indexes z Index size z Sparse index is smaller z Requirement on records z Records must be clustered for sparse index z Lookup z Sparse index is smaller and may fit in memory z Dense index can directly tell if a record exists z Update z Easier for sparse index
3/31/^
Luke Huan Univ. of Kansas^
Clustered Indexes z An index is clustered^ if the entry order in the index fileis almost the same as that in the data file, otherwise, it isunclustered.
3/31/^
Luke Huan Univ. of Kansas^
Hash-based Indexes z Based on hash-table z Organized as a set of <key, value> tuples z Key is the attribute for indexing z Value is <page id, record id> From Database System Concepts, 5th Ed. ©Silberschatz, Korth and Sudarshan
3/31/^
Luke Huan Univ. of Kansas^
Recap
z^ An tree-based index structure built for the primary keyof a file that is sorted according to the primary key.
PRIMARYSPARSE^ CLUSTERED^ (Index File)(Data file)
3/31/^
Luke Huan Univ. of Kansas^
Recap II
z^ An tree-based index structure built for non-key attributeof a file that is sorted according to the primary key.
SECONDARYDENSE^ UNCLUSTERED^ (Index File)(Data file)
3/31/^
Luke Huan Univ. of Kansas^
+ Sample B -tree nodes^ degree: 2^150180120 to keysto keysto keys 100 · k < 120 120 · k < 150^150 ·^ k^ < 180
to keys^180 ·^ k to keys^100 ·^ k Non-leaf Leafto next leaf node in sequence (^120130) to records with these^ k^ values;or, store records directly in leaves
3/31/^
Luke Huan Univ. of Kansas^
+ B -tree balancing properties z Height constraint: all leaves at the same lowest level z Fan-out constraint: all nodes at least half full(except root)Max #^ Max #^ Min # Min #pointers^ keys^ active pointers
keys Non-leaf^ 2d+1^ 2d^
d+1^ d Root^ 2d+1^ 2d^
Leaf^ 2d^ 2d^
d^ d