


















Study with the several resources on Docsity
Earn points by helping other students or get them with a premium plan
Prepare for your exams
Study with the several resources on Docsity
Earn points to download
Earn points by helping other students or get them with a premium plan
Students who are in 2nd CSE jntuk
Typology: Lecture notes
1 / 26
This page cannot be seen from the preview
Don't miss anything!



















RAGHU INSTITUTE OF TECHNOLOGY AUTONOMOUS DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING
II BTECH II SEM Advanced Data Structures Unit- Prepared By Dr.V.Sangeetha & Mr. N.UdayKumar
References :
A Forouzan, Cengage.
Sahani, Andersonfreed
Contents:
2
A classic problem in computer science!
Data requested in sorted order
e.g., find students in increasing gpa order
Sorting is first step in bulk loading B+ tree index.
Sorting useful for eliminating duplicate copies in a
collection of records
Sort-merge join algorithm involves sorting.
Problem: sort 10GB of data with 1MB of RAM.
4
General Wisdom :
I/O costs dominate
Design algorithms to reduce I/O
5
merge , which combines two sorted lists into one
Merge Sort
7
Partition into lists of size n/
Example
[10, 4, 6, 3]
[10, 4, 6, 3, 8, 2, 5, 7]
[8, 2, 5, 7]
[10, 4] [6, 3]^
[8, 2] [5, 7]
[4] [10] (^) [3][6] [2][8] [5][7]
8
Merge Sort
10
81 94 11 96 12 35 17 99
81 94 11 96 12 35 17 99
Sort Sort
11 81 94 96 12 17 35 99
Merge : Merge two sorted lists and repeatedly choose the
smaller of the two “heads” of the lists
Merge Sort: Divide records into two parts; merge-sort those
recursively, and then merge the lists.
11
Phase 1: PREPARE.
Read a page, sort it, write it.
only one buffer page is used
Phase 2, 3, …, etc.: MERGE:
Three buffer pages used.
Disk
input
Main
memory
Disk
Main memory
buffers
INPUT 1
INPUT 2
OUTPUT
1 buffer
1 buffer
1 buffer
13
2-Way Sorting Cont’d
The most popular method for sorting on
external storage devices is merge sort. This
method consists of essentially two distinct
phases.
First, segments of the input file are sorted
using a good internal sort method. These
sorted segments, known as runs , are written
out onto external storage as they are
generated.
14
records) to obtain six runs R 1
or quicksort could be used. These six runs are written out
onto the scratch disk.
16
Example Cont’d
Step 2: Set aside three blocks of internal
memory, each capable of holding 250 records.
Two of these blocks will be used as input
buffers and the third as an output buffer.
Merge runs R
1
and R
2
. This is carried out by first
reading one block of each of these runs into
input buffers and etc
17
Example Analysis
Let us now analyze how much time is required to sort
these 4500 records. The analysis will use the
following notation:
t s
= maximum seek time
t l
= maximum latency time
t rw
= time to read or write one block of 250 records
t IO
= t s
+ t l
+ t rw
t IS
= time to internally sort 750 records
n t m
= time to merge n records from input buffers to
the output buffer
19
Computing times for disk sort
20