Download Disk and File Management in Database Systems and more Slides Introduction to Database Management Systems in PDF only on Docsity!
Storing Data: Disks and Files
Lecture 5
(R&G Chapter 9)
āYea, from the table of my memory Iāll wipe away all trivial fond records.ā -- Shakespeare, Hamlet
Review
⢠Arenāt Databases Great?
⢠Relational model
⢠SQL
Disks, Memory, and Files
Query Optimization and Execution Relational Operators Files and Access Methods Buffer Management Disk Space Management DB The BIG pictureā¦
Disks and Files
- DBMS stores information on disks.
- In an electronic world, disks are a mechanical anachronism!
- This has major implications for DBMS design!
- READ: transfer data from disk to main memory (RAM).
- WRITE: transfer data from RAM to disk.
- Both are high-cost operations, relative to in-memory operations, so must be planned carefully!
Why Not Store Everything in Main Memory?
⢠Costs too much. For ~$1000,
PCConnection will sell you either
- ~20GB of RAM
- ~40GB of flash
- ~5 TB of disk
⢠Main memory is volatile. We want data
to be saved between runs. (Obviously!)
The Storage Hierarchy
Source: Operating Systems Concepts 5th Edition
- Main memory (RAM) for currently used data.
- Disk for the main database (secondary storage).
- Tapes for archiving older versions of the data (tertiary storage). Smaller, Faster Bigger, Slower
Thought Experiment: How Much
RAM?
- Say your biz has
- 100,000 customers
- 10,000 products
- Say space you need is
- How much space do you need?
- 1G cust + .5G product = 1.5G
- Double it for space utilization = 3G
- Times 10 for growth = 30G
- at, say, $100/G =
- ⦠nothing! (to a company with 100,000 customers)
Quick Review
- 1 millisecond = 1ms = 1/1000 second
- 1 microsecond = 1us = 1/1000 ms
- 1 nanosecond = 1ns = 1/1000 us
- Clock rate 3Ghz, how long is a cycle?
Jim Grayās Storage Latency Analogy:
How Far Away is the Data?
Registers On Chip Cache On Board Cache Memory Disk 1ns 2 10 100 Tape /Optical Robot 10 9 10 6 Sacramento This Lecture Hall This Room My Head 10 min 1.5 hr 2 Years 1 min Pluto 2,000 Years Andromeda
Disks
- Secondary storage device of choice for ~
years.
- Main advantage over
- tapes: random access vs. sequential
- RAM: persistence, easy growth
- Data is stored and retrieved in units called
disk blocks or pages.
- Unlike RAM, time to retrieve a disk block
varies depending upon location on disk.
- Therefore, relative placement of blocks on disk has major impact on DBMS performance!
Components of a Disk
Platters The platters spin (say, 120 rps). Spindle The arm assembly is moved in or out to position a head on a desired track. Tracks under heads make a cylinder (imaginary!). Disk head Arm movement Arm assembly Only one head reads/writes at any one time. Tracks Sector ļ¶ Block size is a multiple of sector size (which is fixed).
Accessing a Disk Page
- Time to access (read/write) a disk block:
- seek time (moving arms to position disk head on track)
- rotational delay (waiting for block to rotate under head)
- transfer time (actually moving data to/from disk surface)
- Seek time and rotational delay dominate.
- Seek time varies between about 0.3 and 10msec
- Rotational delay varies from 0 to 4msec
- Transfer rate .01 - .05msec per 8K block
- Key to lower I/O cost: reduce seek/rotation
delays! Hardware vs. software solutions?
More on Buffer Management
- Requestor of page must eventually unpin it,
and indicate whether page has been
modified:
- dirty bit is used for this.
- Page in pool may be requested many times,
- a pin count is used.
- To pin a page, pin_count++
- A page is a candidate for replacement iff pin count == 0 ( āunpinnedā)
- CC & recovery may entail additional I/O when
a frame is chosen for replacement.
- Write-Ahead Log protocol; more later!
Buffer Replacement Policy
⢠Frame is chosen for replacement by a
replacement policy:
- Least-recently-used (LRU), MRU, Clock,
etc.
⢠Policy can have big impact on # of
I/Oās; depends on the access pattern.
⢠For āTransactionalā workloads, notion of
a āworking setā - pages that āshouldā be
in memory.
LRU Replacement Policy
- Least Recently Used (LRU)
- for each page in buffer pool, keep track of time when last unpinned
- replace the frame which has the oldest (earliest) time
- very common policy: intuitive and simple
- Works well for repeated accesses to popular pages
- Problems?
- Problem: Sequential flooding
- LRU + repeated sequential scans.
buffer frames < # pages in file means each page
request causes an I/O. - Idea: MRU better in this scenario?
āClockā Replacement Policy
- An approximation of LRU
- Arrange frames into a cycle, store one reference bit per frame - Can think of this as the 2nd chance bit
- When pin count reduces to 0, turn on ref. bit
- When replacement necessary do for each page in cycle { if (pincount == 0 && ref bit is on) turn off ref bit; else if (pincount == 0 && ref bit is off) choose this page for replacement; } until a page is chosen; Questions: How like LRU? Problems?
A(1)
B(p) C(1)
D(1)
DBMS vs. OS File System
OS does disk space & buffer mgmt: why not let
OS manage these tasks?
- Some limitations, e.g., files canāt span disks.
- Buffer management in DBMS requires ability to:
- pin a page in buffer pool, force a page to disk & order writes (important for implementing CC & recovery)
- adjust replacement policy, and pre-fetch pages based on access patterns in typical DB operations.
Context
Query Optimization and Execution Relational Operators Files and Access Methods Buffer Management Disk Space Management DB
Files of Records
- Blocks are the interface for I/O, butā¦
- Higher levels of DBMS operate on records,
and files of records.
- FILE: A collection of pages, each containing a
collection of records. Must support:
- insert/delete/modify record
- fetch a particular record (specified using record id)
- scan all records (possibly with some conditions on the records to be retrieved)
Unordered (Heap) Files
- Simplest file structure contains records in no
particular order.
- As file grows and shrinks, disk pages are allocated
and de-allocated.
- To support record level operations, we must:
- keep track of the pages in a file
- keep track of free space on pages
- keep track of the records on a page
- There are many alternatives for keeping track of
this.
Heap File Implemented as a List
- The header page id and Heap file name must
be stored someplace.
- Database ācatalogā
- Each page contains 2 `pointersā plus data. Header Page Data Page Data Page Data Page Data Page Data Page Data Page Pages with Free Space Full Pages
Heap File Using a Page Directory
- The entry for a page can include the number
of free bytes on the page.
- The directory is a collection of pages; linked
list implementation is just one alternative.
- Much smaller than linked list of all HF pages! Data Page 1 Data Page 2 Data Page N Header Page DIRECTORY
Indexes (a sneak preview)
- A Heap file allows us to retrieve records:
- by specifying the rid, or
- by scanning all records sequentially
- Sometimes, we want to retrieve records by
specifying the values in one or more fields,
e.g.,
- Find all students in the āCSā department
- Find all students with a gpa > 3
- Indexes are file structures that enable us to
answer such value-based queries efficiently.
Record Formats: Fixed Length
- Information about field types same for all
records in a file; stored in system catalogs.
- Finding iāth field done via arithmetic. Base address (B)
L1 L2 L3 L
F1 F2 F3 F
Address = B+L1+L
Summary
- Disks provide cheap, non-volatile storage.
- Random access, but cost depends on location of page on disk; important to arrange data sequentially to minimize seek and rotation delays.
- Buffer manager brings pages into RAM.
- Page stays in RAM until released by requestor.
- Written to disk when frame chosen for replacement (which is sometime after requestor releases the page).
- Choice of frame to replace based on replacement policy.
- Tries to pre-fetch several pages at a time.
Summary (Contd.)
- DBMS vs. OS File Support
- DBMS needs features not found in many OSās, e.g., forcing a page to disk, controlling the order of page writes to disk, files spanning disks, ability to control pre-fetching and page replacement policy based on predictable access patterns, etc.
- Variable length record format with field offset
directory offers support for direct access to
iāth field and null values.
- Slotted page format supports variable length
records and allows records to move on page.
Summary (Contd.)
- File layer keeps track of pages in a file, and
supports abstraction of a collection of records.
- Pages with free space identified using linked list or directory structure (similar to how pages in file are kept track of).
- Indexes support efficient retrieval of records
based on the values in some fields.
- Catalog relations store information about
relations, indexes and views. ( Information that
is common to all records in a given collection.)