Index Structures for Files: Understanding Ordered and B-Tree Indexes, Slides of Database Management Systems (DBMS)

These slides introduce index structures for files, focusing on single-level ordered indexes and B-trees. Ordered indexes allow binary search and are typically defined on a single field, while B-trees are specialized tree structures used as indexing mechanisms. B-trees are hierarchical: each node holds pointers to child nodes and to data, and they address the imbalance and wasted space that plain search trees suffer from.

Typology: Slides

2012/2013

Uploaded on 04/30/2013

archa


  • Indexes speed up the retrieval of records under certain search conditions
  • Indexes, also called secondary access paths, do not affect the physical placement of records; they provide fast access for searches based on the indexing field
  • Some types of indexes work only in conjunction with a certain file organization
  • Single level ordered indexes allow us to search for a record by searching the index file using binary search
  • The index is typically defined on a single field of the file called the indexing field
  • There are several kinds of ordered indexes
    • A primary index is specified on the ordering key field of an ordered file of records
  • There exists one index entry for each block in the data file
  • Note that this only works for ordered files. Why?
  • The first record in each block is called the anchor record of the block or the block anchor
  • A primary index is an example of a nondense index since we don’t have a pointer to every record in the data file
  • Insertion of records can be handled with an unordered overflow file and periodic maintenance
  • Deletion of records can be handled with deletion markers and periodic maintenance
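The block-anchor lookup described above can be sketched as follows. This is a minimal illustration, not a storage engine: the `blocks` list, its contents, and the `lookup` helper are all hypothetical, and list positions stand in for block pointers.

```python
from bisect import bisect_right

# Hypothetical data file: blocks are in key order, and records are
# sorted by key inside each block.
blocks = [
    [(1, "a"), (3, "b"), (5, "c")],
    [(8, "d"), (10, "e"), (12, "f")],
    [(15, "g"), (20, "h")],
]

# Primary index: one entry per block, holding the key of the block anchor.
# The position in this list plays the role of the block pointer.
anchors = [block[0][0] for block in blocks]


def lookup(key):
    """Return the payload stored under key, or None if it is absent."""
    # Binary search the index for the last anchor that is <= key.
    i = bisect_right(anchors, key) - 1
    if i < 0:
        return None  # key is smaller than every block anchor
    # Only this one data block has to be read and searched.
    for k, payload in blocks[i]:
        if k == key:
            return payload
    return None
```

Because the file is ordered, a key can only live in the block whose anchor is the largest one not exceeding it, which is why a single entry per block is enough.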
  • If the ordering field is not unique, we use a clustering index - It is common to reserve a block for each value of the ordering field
  • Note that the index file is an ordered file
  • We can create a primary index on this file with block anchoring to speed up access to this file
  • Repeat as necessary
  • This leads to the idea of a multilevel index, as illustrated in Figure 5.
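To make "repeat as necessary" concrete, the sketch below counts how many index levels are needed before the top level fits in a single block. The function name `index_levels` and the parameter `fan_out` (index entries per block) are assumptions introduced for illustration, and the numbers in the usage note are made up.

```python
def index_levels(first_level_entries, fan_out):
    """Count index levels until the top level occupies one block.

    first_level_entries: number of entries in the first-level index
    fan_out: number of index entries that fit in one block (assumed fixed)
    """
    levels = 1
    entries = first_level_entries
    blocks = -(-entries // fan_out)  # ceiling division
    while blocks > 1:
        # The next level up holds one entry per block of the level below.
        entries = blocks
        blocks = -(-entries // fan_out)
        levels += 1
    return levels
```

For example, with 30,000 first-level entries and 68 entries per block, `index_levels(30000, 68)` returns 3: the levels occupy 442, 7, and 1 blocks. A search then reads one block per index level plus one data block.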
  • B-trees and B+ trees are specialized tree structures
  • A tree is formed of nodes
  • Each node in the tree has one parent node and zero or more child nodes, except for the root node, which has no parent
  • A node that has no children is called a leaf node
  • A search tree is a specialized type of tree used to guide a search
    • A search tree of order p is a tree in which each node has at most p-1 search values and p pointers to subtrees: $\langle P_1, K_1, P_2, K_2, \ldots, P_{q-1}, K_{q-1}, P_q \rangle$, with $q \le p$
    • Each value in the subtree pointed to by $P_1$ is less than $K_1$, and each value in the subtree pointed to by $P_2$ is greater than $K_1$ (and likewise for the other subtrees)
  • To find a value in a search tree, search the root node; if the value is not found there, search the appropriate subtree. If there is no subtree (we are at a leaf node), the value does not exist in the tree
  • Insertion and deletion in a search tree will usually cause a search tree to be unbalanced
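The search procedure for a search tree can be sketched as below. The `Node` class and the sample tree are hypothetical, chosen only to illustrate the descent; each node stores its sorted search values and, unless it is a leaf, one more subtree pointer than it has values.

```python
from bisect import bisect_left


class Node:
    def __init__(self, keys, children=None):
        self.keys = keys                # sorted search values
        self.children = children or []  # subtree pointers; empty for a leaf


def search(node, value):
    """Return True if value occurs in the tree rooted at node."""
    while True:
        # Position of the first search value that is >= value.
        i = bisect_left(node.keys, value)
        if i < len(node.keys) and node.keys[i] == value:
            return True
        if not node.children:
            return False  # at a leaf with no subtree left to follow
        # Subtree i holds the values between keys[i-1] and keys[i].
        node = node.children[i]


# A small order-3 example: two search values and three subtrees at the root.
root = Node([10, 20], [Node([3, 7]), Node([12, 15]), Node([25])])
```

Each step discards all subtrees but one, so the cost is proportional to the height of the tree. That is why an unbalanced tree, which may be much taller than necessary, is a problem.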
  • In a B-tree node of order p:
    • Each $P_i$ is a tree pointer
    • Each $Pr_i$ is a data pointer
    • $K_1 < K_2 < \ldots < K_{q-1}$
    • The values in each subtree lie between the two neighboring key values
    • Each node has at most p tree pointers
    • Each node (except the root) has at least ceiling(p/2) tree pointers
    • A node with q tree pointers has q-1 search key field values
  • If there are empty spaces in a node, insertion is simple
  • Otherwise, the node is split and the middle value is promoted (moved up to the parent node). This, in turn, may cause the parent node to split
  • Deletion is (relatively!) easy if there are more than p/2 values after the deletion
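The insertion rule (insert if there is room, otherwise split and promote the middle value) can be sketched as follows. This is one simple way to code it, not the only one: the names `BNode`, `insert`, and `inorder` are hypothetical, data pointers are omitted, and a node is allowed to overflow to p keys briefly before it is split.

```python
from bisect import bisect_left


class BNode:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []          # at most p-1 keys when at rest
        self.children = children or []  # empty for a leaf


def _insert(node, key, p):
    """Insert key below node; return (middle_key, right_node) if node split."""
    i = bisect_left(node.keys, key)
    if i < len(node.keys) and node.keys[i] == key:
        return None  # key already present: nothing to do
    if not node.children:
        node.keys.insert(i, key)  # leaf: place the key directly
    else:
        split = _insert(node.children[i], key, p)
        if split is None:
            return None
        mid, right = split  # the child split: absorb the promoted key
        node.keys.insert(i, mid)
        node.children.insert(i + 1, right)
    if len(node.keys) <= p - 1:
        return None  # there was room, so the insertion stops here
    # Overflow: split around the middle key and promote it.
    m = len(node.keys) // 2
    mid = node.keys[m]
    right = BNode(node.keys[m + 1:], node.children[m + 1:])
    node.keys, node.children = node.keys[:m], node.children[:m + 1]
    return mid, right


def insert(root, key, p):
    """Insert key into the B-tree of order p; return the (possibly new) root."""
    split = _insert(root, key, p)
    if split is None:
        return root
    mid, right = split
    return BNode([mid], [root, right])  # the tree grows by one level


def inorder(node):
    """Return all keys in sorted order (used here only to check the tree)."""
    if not node.children:
        return list(node.keys)
    out = []
    for i, child in enumerate(node.children):
        out += inorder(child)
        if i < len(node.keys):
            out.append(node.keys[i])
    return out


# Build an order-3 tree by inserting 1..7 in ascending order.
tree = BNode()
for k in range(1, 8):
    tree = insert(tree, k, 3)
```

Inserting 1 through 7 in ascending order would degrade a plain binary search tree into a chain, but here the splits propagate upward and leave the root holding only 4, with 2 and 6 on the level below it.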
  • B+ trees are a variation of B-trees
    • Each data value occurs in a leaf node along with a pointer to the associated record
    • The leaf nodes are chained together to provide fast sequential access to the data records
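The benefit of chaining the leaves can be illustrated with a range scan. The `Leaf` class, the `range_scan` function, and the sample leaves are hypothetical. In a real B+ tree the starting leaf would be found by descending from the root; this sketch simply starts at the first leaf.

```python
class Leaf:
    def __init__(self, keys, records):
        self.keys = keys        # sorted data values stored in this leaf
        self.records = records  # records[i] stands in for the pointer of keys[i]
        self.next = None        # link to the next leaf in key order


def range_scan(leaf, low, high):
    """Yield (key, record) pairs with low <= key <= high, following the chain."""
    while leaf is not None:
        for k, r in zip(leaf.keys, leaf.records):
            if k > high:
                return  # keys only grow from here on, so stop
            if k >= low:
                yield k, r
        leaf = leaf.next  # move along the chain, never back up the tree


# Three chained leaves.
a = Leaf([1, 4], ["r1", "r4"])
b = Leaf([7, 9], ["r7", "r9"])
c = Leaf([11, 15], ["r11", "r15"])
a.next, b.next = b, c
```

Here `list(range_scan(a, 4, 11))` walks the chain once and returns the pairs for keys 4, 7, 9, and 11 without revisiting any internal node.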