Data Representation in Database Systems: Fields, Records, Blocks, and Variable-Length Data, Slides of Database Management Systems (DBMS)

An overview of data representation in database systems, including fields, records, blocks, and variable-length data. It covers topics such as representing data types, tuples, record headers, packing records into blocks, and addressing. The document also discusses pointer swizzling, automatic swizzling, and unswizzling, as well as pinned records and blocks, and record modification.

Typology: Slides

2012/2013

Uploaded on 04/26/2013

duurga

Representing Data Elements
Fields, Records, Blocks
Variable-length Data
Modifying Records

Overview

  • Attributes are represented by sequences of bytes, called fields
  • Tuples are represented by collections of fields, called records
  • Relations are represented by collections of records, called files
  • Files are stored in blocks, using specialized data structures to support efficient modification and querying

Representing Tuples

  • For now, assume all attributes (fields) are fixed length.
  • Concatenate the fields
  • Store the offset of each field in the schema

Field      Type          Size       Offset
name       CHAR(30)      30 bytes   0
address    VARCHAR(255)  256 bytes  30
gender     CHAR(1)       1 byte     286
birthdate  DATE          10 bytes   287
Total record length: 297 bytes
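
As a rough illustration of this layout, the sketch below uses Python's standard `struct` module to concatenate the four fixed-length fields. The format string, the `OFFSETS` dictionary, the helpers `pack_person` and `read_field`, and the sample values are all assumptions made for this example. Padding the VARCHAR value with NUL bytes up to 256 bytes is one simple choice, not something the slides prescribe.

```python
import struct

# Fixed-length layout from the example: 30 + 256 + 1 + 10 = 297 bytes.
# "<" disables alignment padding, so the fields are simply concatenated.
RECORD = struct.Struct("<30s256s1s10s")

# Offsets as they might be stored in the schema: field -> (offset, size).
OFFSETS = {"name": (0, 30), "address": (30, 256),
           "gender": (286, 1), "birthdate": (287, 10)}

def pack_person(name, address, gender, birthdate):
    """Concatenate the fields; struct pads short byte strings with NULs."""
    return RECORD.pack(name.encode(), address.encode(),
                       gender.encode(), birthdate.encode())

def read_field(record, field):
    """Fetch one field directly through its schema offset."""
    start, size = OFFSETS[field]
    return record[start:start + size].rstrip(b"\x00").decode()

rec = pack_person("Ann Lee", "12 Elm St", "F", "1990-04-26")
print(len(rec))                      # 297
print(read_field(rec, "birthdate"))  # 1990-04-26
```

Because every offset is known from the schema, a single field can be read without parsing the fields before it.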

More on Tuples

  • Due to hardware considerations, certain types of data need to start at addresses that are multiples of 4 or 8
  • Previous example becomes:

Field      Type          Size       Padding  Offset
name       CHAR(30)      30 bytes   +2       0
address    VARCHAR(255)  256 bytes           32
gender     CHAR(1)       1 byte     +3       288
birthdate  DATE          10 bytes   +2       292
Total record length: 304 bytes
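
The padding arithmetic can be checked with a few lines of Python. This is a minimal sketch: the function name `layout` and its `align` parameter are hypothetical, and it assumes every field (and the record end) is rounded up to the same alignment, which reproduces the numbers in both examples.

```python
def layout(sizes, align=1):
    """Return (offsets, total) with each field start rounded up to `align`."""
    offsets, pos = [], 0
    for size in sizes:
        pos = -(-pos // align) * align   # round up to a multiple of align
        offsets.append(pos)
        pos += size
    total = -(-pos // align) * align     # pad the end of the record as well
    return offsets, total

sizes = [30, 256, 1, 10]   # name, address, gender, birthdate
print(layout(sizes))           # ([0, 30, 286, 287], 297)
print(layout(sizes, align=4))  # ([0, 32, 288, 292], 304)
```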

Packing Records into Blocks

  • Start with block header:
    • timestamp of last modification/access
    • offset of each record in the block, etc.
  • Follow with sequence of records
  • May end with some unused space

header | record 1 | record 2 | … | record n-1 | record n
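
One way to picture this block layout is the sketch below. The 4096-byte block size, the header format (an 8-byte timestamp, a 2-byte record count, a 2-byte free-space offset, then one 2-byte offset per record), and the names `pack_block` and `read_record` are assumptions chosen for illustration; real systems use their own header formats.

```python
import struct
import time

BLOCK_SIZE = 4096  # assumed block size in bytes

def pack_block(records, timestamp=None):
    """Lay out: header (timestamp, count, offsets), records, unused space."""
    ts = int(time.time()) if timestamp is None else timestamp
    n = len(records)
    header_size = 8 + 2 + 2 + 2 * n      # Q + H + H + one H per record
    offsets, pos = [], header_size
    for r in records:
        offsets.append(pos)
        pos += len(r)
    if pos > BLOCK_SIZE:
        raise ValueError("records do not fit in one block")
    # pos is now the start of the unused space at the end of the block.
    header = struct.pack(f"<QHH{n}H", ts, n, pos, *offsets)
    return (header + b"".join(records)).ljust(BLOCK_SIZE, b"\x00")

def read_record(block, i):
    """Use the offset table in the header to locate record i."""
    _ts, n, free_start = struct.unpack_from("<QHH", block, 0)
    bounds = struct.unpack_from(f"<{n}H", block, 12) + (free_start,)
    return block[bounds[i]:bounds[i + 1]]

block = pack_block([b"alpha", b"be", b"gamma!"], timestamp=1)
print(len(block))             # 4096
print(read_record(block, 1))  # b'be'
```

Storing the offsets in the header means records can be moved within the block without changing how other structures refer to "record i of this block".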

Representing Addresses

  • Often addresses (pointers) are part of records:
    • the application data in object-oriented databases
    • as part of indexes and other data structures supporting the DBMS
  • Every data item (block, record, etc.) has two addresses:
    • database address: address on the disk (typically 8-16 bytes)
    • memory address, if the item is in memory (typically 4 bytes)

Pointer Swizzling

  • When a block is moved from disk into main memory, change all the disk addresses that point to items in this block into main memory addresses.
  • Need a bit for each address to indicate if it is a disk address or a memory address.
  • Why? Following a memory pointer is faster (it takes only a single machine instruction), whereas a disk address must first be translated.
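
The flag bit can be sketched as follows. The choice of the top bit of a 64-bit word as the flag, the helper names, and the dictionaries standing in for main memory and disk are all assumptions for illustration.

```python
SWIZZLED_BIT = 1 << 63   # assumed: top bit set means "memory address"

def make_disk_ptr(db_addr):
    return db_addr                   # flag clear: database (disk) address

def make_mem_ptr(mem_addr):
    return mem_addr | SWIZZLED_BIT   # flag set: memory address

def is_swizzled(ptr):
    return bool(ptr & SWIZZLED_BIT)

def address(ptr):
    return ptr & ~SWIZZLED_BIT       # strip the flag bit

def follow(ptr, memory, load_from_disk):
    """Dereference directly if swizzled, else take the slower disk path."""
    if is_swizzled(ptr):
        return memory[address(ptr)]
    return load_from_disk(address(ptr))

memory = {0x10: "record A"}   # stand-in for main memory
disk = {7001: "record B"}     # stand-in for the disk
print(follow(make_mem_ptr(0x10), memory, disk.__getitem__))   # record A
print(follow(make_disk_ptr(7001), memory, disk.__getitem__))  # record B
```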

Example of Swizzling

[Figure: Blocks 1 and 2 on disk and in main memory; after a block is read into main memory, its pointers are labeled as swizzled or unswizzled.]

Automatic Swizzling

  • Locating all pointers within a block:
    • refer to the schema, which will indicate where addresses are in the records
    • for index structures, pointers are at known locations
  • Update translation table with memory addresses of items in the block
  • Update pointers in the block (in memory) with memory addresses, when possible, as obtained from translation table
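
The three steps above can be sketched in Python. Here a block is modeled as a dictionary from database address to record, a pointer as a `("db", addr)` or `("mem", addr)` pair, and `pointer_fields` plays the role of the schema; these representations, the counter used as a memory allocator, and the name `load_and_swizzle` are assumptions for this example only.

```python
import itertools

_next_addr = itertools.count(1000)   # hypothetical memory-address allocator

def load_and_swizzle(block, pointer_fields, translation):
    """block: {db_addr: record dict}; translation: {db_addr: mem_addr}."""
    # Update the translation table with the memory address of each item.
    for db_addr in block:
        translation[db_addr] = next(_next_addr)
    # The schema (pointer_fields) says where pointers are; swizzle those
    # whose targets are already in memory and leave the rest untouched.
    for record in block.values():
        for field in pointer_fields:
            kind, addr = record[field]
            if kind == "db" and addr in translation:
                record[field] = ("mem", translation[addr])
    return block

translation = {}
block1 = {
    ("B1", 0): {"next": ("db", ("B1", 1))},   # target is in the same block
    ("B1", 1): {"next": ("db", ("B2", 0))},   # target's block is not loaded
}
load_and_swizzle(block1, ["next"], translation)
print(block1[("B1", 0)]["next"][0])  # mem
print(block1[("B1", 1)]["next"][0])  # db
```

The pointer into block B2 stays unswizzled because its target has no entry in the translation table yet.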

Unswizzling

  • When a block is moved from memory back to disk, all pointers must go back to database (disk) addresses
  • Use translation table again
  • Important to have an efficient data structure for the translation table
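
A minimal sketch of unswizzling, assuming the translation table is kept as two hash maps so that lookups are cheap in both directions. The class `TranslationTable`, the function `unswizzle_block`, and the `("db", addr)` / `("mem", addr)` pointer pairs are hypothetical names and representations, not taken from the slides.

```python
class TranslationTable:
    """Two hash maps give average O(1) lookups in both directions."""

    def __init__(self):
        self.db_to_mem = {}
        self.mem_to_db = {}

    def add(self, db_addr, mem_addr):
        self.db_to_mem[db_addr] = mem_addr
        self.mem_to_db[mem_addr] = db_addr

def unswizzle_block(block, pointer_fields, table):
    """Replace every memory pointer in the block by its database address."""
    for record in block.values():
        for field in pointer_fields:
            kind, addr = record[field]
            if kind == "mem":
                record[field] = ("db", table.mem_to_db[addr])
    return block

table = TranslationTable()
table.add(("B1", 1), 2000)
block = {("B1", 0): {"next": ("mem", 2000)},
         ("B1", 1): {"next": ("db", ("B2", 0))}}
unswizzle_block(block, ["next"], table)
print(block[("B1", 0)]["next"])  # ('db', ('B1', 1))
```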

Unpinning a Block

  • Consider each item in the block to be unpinned
  • Keep in the translation table the places in memory holding swizzled pointers to that item (e.g., with a linked list)
  • Unswizzle those pointers: use translation table to replace the memory addresses with database (disk) addresses
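
The bookkeeping for unpinning might look like the sketch below. It uses a Python list per item where the slides suggest a linked list, and the class `PinTable` with its methods `note_swizzled` and `unpin` is a hypothetical design for illustration.

```python
from collections import defaultdict

class PinTable:
    def __init__(self):
        self.mem_to_db = {}
        # mem_addr of an item -> places holding swizzled pointers to it
        self.referrers = defaultdict(list)

    def note_swizzled(self, record, field, db_addr, mem_addr):
        """Swizzle record[field] and remember where that pointer lives."""
        self.mem_to_db[mem_addr] = db_addr
        record[field] = ("mem", mem_addr)
        self.referrers[mem_addr].append((record, field))

    def unpin(self, mem_addr):
        """Unswizzle every pointer to the item so its block can be moved."""
        db_addr = self.mem_to_db.pop(mem_addr)
        for record, field in self.referrers.pop(mem_addr, []):
            record[field] = ("db", db_addr)
        return db_addr

pins = PinTable()
r1 = {"next": ("db", "X")}
r2 = {"parent": ("db", "X")}
pins.note_swizzled(r1, "next", "X", 500)
pins.note_swizzled(r2, "parent", "X", 500)
pins.unpin(500)
print(r1["next"], r2["parent"])  # ('db', 'X') ('db', 'X')
```

Once no swizzled pointer refers to any item in a block, the block is no longer pinned and can be written back to disk.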

Variable Length Data

  • Data items with varying size (e.g., if maximum size of a field is large but most of the time the values are small)
  • Variable-format records (e.g., NULLs method for representing a hierarchy of entity sets as relations)
  • Records that do not fit in a block (e.g., an MPEG of a movie)

Variable Length Fields

Record layout (left to right):
header: other header info | record length | to var len field 2 | to var len field 3
body: fixed len field 1 | fixed len field 2 | var len field 1 | var len field 2 | var len field 3
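
A rough sketch of such a record: the header stores the record length and the offsets of variable-length fields 2 and onward, while variable-length field 1 needs no pointer because it starts right after the fixed-length fields. The 2-byte integer widths, the omission of "other header info", and the names `pack_record` and `unpack_var_fields` are assumptions for this example.

```python
import struct

def pack_record(fixed, var_fields):
    """fixed: bytes of all fixed-length fields; var_fields: list of bytes.

    Layout: record length | offsets of var fields 2..n | fixed | var fields.
    Assumes at least one variable-length field.
    """
    n_ptrs = len(var_fields) - 1
    header_size = 2 + 2 * n_ptrs
    pos = header_size + len(fixed)       # var field 1 starts here
    offsets = []
    for v in var_fields[:-1]:
        pos += len(v)
        offsets.append(pos)              # start of the next var field
    total = pos + len(var_fields[-1])
    header = struct.pack(f"<H{n_ptrs}H", total, *offsets)
    return header + fixed + b"".join(var_fields)

def unpack_var_fields(record, fixed_size, n_var):
    """Recover the variable-length fields using the header pointers."""
    (total,) = struct.unpack_from("<H", record, 0)
    ptrs = struct.unpack_from(f"<{n_var - 1}H", record, 2)
    first = 2 + 2 * (n_var - 1) + fixed_size
    bounds = (first,) + ptrs + (total,)
    return [record[bounds[i]:bounds[i + 1]] for i in range(n_var)]

fixed = b"F" + b"1990-04-26"             # two fixed-length fields, 11 bytes
rec = pack_record(fixed, [b"Ann", b"12 Elm St", b"x"])
print(len(rec))                          # 30
print(unpack_var_fields(rec, len(fixed), 3))
```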

Variable-Format Records

  • Represent by a sequence of tagged fields
  • Each tagged field contains
    • name
    • type
    • length, if not deducible from the type
    • value
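
A minimal sketch of tagged fields, assuming a byte encoding of name length, name, one-byte type code, value length, and value. For simplicity it always stores the length, even where it could be deduced from the type. The type codes `b"S"` and `b"I"` and the function names are hypothetical.

```python
import struct

def encode_tagged(fields):
    """fields: list of (name, type_code, value_bytes) triples."""
    out = bytearray()
    for name, type_code, value in fields:
        name_b = name.encode()
        out += struct.pack("<B", len(name_b)) + name_b       # name
        out += struct.pack("<cH", type_code, len(value))     # type, length
        out += value                                         # value
    return bytes(out)

def decode_tagged(data):
    """Walk the record, reading one self-describing field at a time."""
    fields, pos = [], 0
    while pos < len(data):
        (name_len,) = struct.unpack_from("<B", data, pos)
        pos += 1
        name = data[pos:pos + name_len].decode()
        pos += name_len
        type_code, value_len = struct.unpack_from("<cH", data, pos)
        pos += 3
        fields.append((name, type_code, data[pos:pos + value_len]))
        pos += value_len
    return fields

fields = [("name", b"S", b"Ann"), ("age", b"I", struct.pack("<i", 33))]
print(decode_tagged(encode_tagged(fields)) == fields)  # True
```

Since each field carries its own name and type, a record can omit fields entirely, which is what makes the format suitable for sparse or evolving schemas.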