



Study with the several resources on Docsity
Earn points by helping other students or get them with a premium plan
Prepare for your exams
Study with the several resources on Docsity
Earn points to download
Earn points by helping other students or get them with a premium plan
Learn about data indexing, its types including tree-based and hash-based indexes, primary and secondary indexes, and clustered and non-clustered indexes. The concepts, features, and examples of tree-based indexes, specifically the isam and b+ tree.
Typology: Study notes
1 / 5
This page cannot be seen from the preview
Don't miss anything!




Index: a data structure used to speed up selections on data records based on some attributes
Example: Index on ‘age’ attribute of Employee table. age is a search key
An index consists of data entries that point to some raw data records <key_value, data_pointer>
data_pointer raw data Cost of calling pointer to raw data is constant e.g. 1 page I/O
Type of Indexes:
Primary Index vs. Secondary Index
Primary: uses the key attribute of a relation as the search key
Example: an index based on SSN of Employee table
Secondary: search key is NOT the key attribute
Example: an index based on ‘age’ of Employee
Clustered vs. un-clustered index
If the order of data records in file is the same as the order of data entries in the index, that index is called CLUSTERED.
Index
interface b/w ______________________________________________________
raw data
There can only be ONE clustered index
a 0 p 0 a 1 p 1
a 0 a 1 …. an- 1 an
Cluster indexes give better performance in searching a range of values.
Tree structure is static Just insert/delete by modifying the leaf nodes
Example: Insert 23, 48, 41, 42
40
root
20 33 51 63
10 15 20 27 33 37 40 46 51 55 63 97
this part of the tree is never touched
23 48 41
42
Example: Delete 51, 97
2. B+^ -tree Follows the idea of ISAM. But optimizes towards search performance (I/O time) in a dynamic environment.
F: fanout d: order – number of data entries are maintained in range [d, 2d] meaning we leave some empty space in the tree nodes.
We chain all the leaf nodes together
Example:
B+-tree in real life Fanout = 133 each node = 1 page Typical order = 100
Height 4 tree: 133^4 ≈ 312M records Height 3 tree: 235M