Download Disk Management in Database Management Systems - Prof. Kristen R. Lefevre and more Study notes Database Management Systems (DBMS) in PDF only on Docsity!
2/3/09 EECS 484: Database Management Systems, Kristen LeFevre 1
Storing Data: Disks and Files
Chapter 9
2/3/09 EECS 484: Database Management Systems, Kristen LeFevre 2 Disk Query Evaluation Engine SQL Query / Application Files & Access Methods Buffer Management Disk Space Management
DBMS Organization
This lecture 2/3/09 EECS 484: Database Management Systems, Kristen LeFevre 3 Non-volatile
The Memory Hierarchy
Price Speed cache main memory magnetic disk tape CPU Size Performance of Microprocessors and Memory improving faster than disks and tapes
2/3/09 EECS 484: Database Management Systems, Kristen LeFevre 4
Why Not Store Everything in Main Memory?
- Too expensive: RAM: $90 per GB, Disk: $0.25 per GB
- Main memory is volatile: Want data to persist between runs
- Typical storage hierarchy:
- Main memory (RAM) for currently used data
- Disk for the main database (secondary storage)
- Read and write data between disk and main memory
- Both are high-cost operations, relative to in-memory operations, so must be planned carefully!
- Tapes for archiving older versions of data (tertiary storage)
- Sequential access devices 2/3/09 EECS 484: Database Management Systems, Kristen LeFevre 5
Disks
- Secondary storage device of choice.
• Main advantage over tapes: random access vs.
sequential.
• Data is stored and retrieved in units called disk
blocks or pages.
- Unlike RAM, time to retrieve a disk page varies
depending upon location on disk.
- Therefore, relative placement of pages on disk has major impact on DBMS performance! 2/3/09 EECS 484: Database Management Systems, Kristen LeFevre 6
Magnetic Disks
Platters Disk head Arm movement Disk arm Tracks Sector Set of tracks with same diameter called a cylinder Data stored in blocks. Size of block is a multiple of sector size. Only one disk head reads or writes at a time.
2/3/09 EECS 484: Database Management Systems, Kristen LeFevre 10
Buffer Management in a DBMS
- Data must be in RAM for DBMS to operate on it!
- Can’t keep all the DBMS pages in main memory
- Buffer Manager: Efficiently uses main memory
- Memory divided into buffer frames: slots for holding disk pages Upper levels:
- release pages when done
- indicate if a page is modified DB MAIN MEMORY DISK disk page free frame BUFFER POOL^ Page Requests from Higher Levels choice of frame dictated by replacement policy 2/3/09 EECS 484: Database Management Systems, Kristen LeFevre 11
Buffer Manager
• Bookkeeping per frame:
• pin count: # of active users of the frame
- Pinning – “I’m using this page”
- Unpinning – “I’m no longer using this page”
• dirty bit: indicates if page has been modified
since it was brought in from disk
2/3/09 EECS 484: Database Management Systems, Kristen LeFevre 12
Buffer Manager
- When a page is requested:
- If page is in buffer pool
- Increment the frame’s pin count
- Return main memory address of the frame
- Else
- Choose a frame for replacement
- Use replacement policy
- Only choose a frame with pin count == 0
- If frame is dirty, write old page to disk
- Read requested page into chosen frame
- Set pin count = 1 (“pin the page”)
- Set dirty bit = false
2/3/09 EECS 484: Database Management Systems, Kristen LeFevre 13
Buffer Replacement Policy
- Chose a frame for replacement
- Least-recently-used (LRU), Clock, MRU etc.
- LRU: queue of pointers to “empty” frames
- Add to end of queue, grab frames from front of queue
- Clock: variant of LRU, but lower overhead
- Policy can have big impact on # of I/O’s; depends on the access pattern.
- Sequential flooding : Nasty situation caused by LRU + repeated sequential scans. - # buffer frames < # pages in file - Use MRU 2/3/09 EECS 484: Database Management Systems, Kristen LeFevre 14
Files of Records (tuples)
• Pages are great for storage system, but
DBMS views data as a collection of records
• Important questions:
- How is a record organized / implemented?
- How are records laid out on pages?
- How are pages organized into files? 2/3/09 EECS 484: Database Management Systems, Kristen LeFevre 15
Record Formats
• Two kinds:
- Fixed-length records
- Each field has a fixed length, and the number of fields is fixed
- Variable-length records
- Some of the fields are of variable length
- E.g., variable-length strings
2/3/09 EECS 484: Database Management Systems, Kristen LeFevre 19
Page Formats
- Page contains a collection of records
- Think of page as a collection of slots, each of which contains a record
- Record ID (RID): <page id, slot #>
- Many different slotted page organizations.
- Must support search, insert, delete records 2/3/09 EECS 484: Database Management Systems, Kristen LeFevre 20
Page Formats: Fixed Length Records
Record id = <page id, slot #>
First alternative: moving records changes rid
may not be acceptable. Slot 1 Slot 2 Slot N
N PACKED Free Space number of records
... 0 1 M M ... 3 2 1 UNPACKED, BITMAP Slot 1 Slot 2 Slot N Slot M 1 1 number of slots 2/3/09 EECS 484: Database Management Systems, Kristen LeFevre 21
Page Formats: Variable Length Records
120, 40 -1, 0 560, 90 -1, 0 0, 70 70, 50 6 Book-keeping 5 4 3 2 1 0 Slot directory Free Space Pointer
- Directory grows backwards!
- Move records on same page
- Good for fixed-length records too. Rid=?Rid= (11, 1) Page num = 11 Delete a record? Slot Entry: Offset, reclen
2/3/09 EECS 484: Database Management Systems, Kristen LeFevre 22
Unordered (Heap) Files
• Simplest file structure contains records in
no particular order.
• Page contains a set of records
• As file grows and shrinks, disk pages are
allocated and de-allocated.
• Many alternatives for storing file
2/3/09 EECS 484: Database Management Systems, Kristen LeFevre 23
Heap File Implemented as a List
- (heap file name, header page id) recorded in a known location
- Each page contains two pointers plus data: Pointer = Page ID (pid)
- Pages in the free space list have “some” free space
- What happens with variable length records?
- Fetch a record with rid Header Page Data Page Data Page Data Page Data Page Data Page Data Page Pages with Free Space Full Pages 2/3/09 EECS 484: Database Management Systems, Kristen LeFevre 24
Heap File Using a Page Directory
- Entry for a page:
- Free/full
- Number of free bytes
- Can locate pages for new tuples faster! Data Page 1 Data Page 2 Data Page N Header Page DIRECTORY Directory itself is implemented as a linked list of pages