File Systems: Implementation and Organization, Slides of Computer Numerical Control

An overview of file systems, discussing how to implement file system abstraction on top of raw disks, approaches to organizing file blocks such as contiguous allocation, linked list allocation, and indexed files, and the unix file header (i-node) and naming and directories. It also covers topics such as protection, performance issues, and access control.

Typology: Slides

2010/2011

Uploaded on 10/07/2011

File Systems
Arvind Krishnamurthy
Spring 2004

File Systems

- Implementing file system abstraction on top of raw disks
- Issues:
  - How to find the blocks of data corresponding to a given file?
  - How to organize files?
  - How to enforce protection?
  - Performance issues: need to minimize the number of “non-local” disk accesses
    - Try to keep related information together on the disk

File Blocks Organization

- Approaches:
  - Contiguous allocation
    - A file is stored on a contiguous set of blocks
    - Prevents incremental growth and complicates allocation
  - Linked list allocation
    - A file header points to the first block of the file
    - Each block of the file points to the next block
    - If blocks are dispersed across the disk → horrible performance for both sequential access and random access
    - Random access can be made faster by separating the next-block pointers from the data and storing them in a centralized place (FAT)
  - Indexed files
    - File header stores pointers to file blocks
    - Multi-level indexing required for large files
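To make the FAT idea concrete, here is a minimal Python sketch. It assumes a hypothetical in-memory table `fat` in which `fat[b]` is the block that follows block `b`, and a made-up marker `END` for the last block of a file. The block numbers are invented for illustration.

```python
END = -1  # hypothetical end-of-chain marker


def nth_block(fat, first_block, n):
    """Return the disk block holding the n-th (0-based) block of a file.

    Only the table is walked; no data block is read along the way, which
    is why a centralized FAT makes random access cheaper than pointers
    stored inside the data blocks themselves.
    """
    block = first_block
    for _ in range(n):
        block = fat[block]
        if block == END:
            raise IndexError("file has fewer blocks than requested")
    return block


def file_blocks(fat, first_block):
    """List every block of the file in order by following the chain."""
    blocks = []
    block = first_block
    while block != END:
        blocks.append(block)
        block = fat[block]
    return blocks


# The file header points to block 7; the blocks are dispersed: 7 -> 2 -> 9 -> 4.
fat = {7: 2, 2: 9, 9: 4, 4: END}
print(file_blocks(fat, 7))   # [7, 2, 9, 4]
print(nth_block(fat, 7, 2))  # 9
```

Reaching block n still takes n table steps, but the table is small enough to keep in memory, so the steps do not cost disk seeks.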

Hybrid Multi-level Indexing Scheme

- Used in Unix 4.
- 13 pointers in a header
  - 10 direct pointers
  - 11: 1-level indirect
  - 12: 2-level indirect
  - 13: 3-level indirect
- Pros & cons
  - Favors small files
  - Can grow
  - Limit is 16G (assuming 1K blocks)

[Figure: a file header whose entries 1, 2, ... point directly to data blocks, and whose entries 11, 12, 13 point to indirect blocks that lead to data blocks.]
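The 16G limit can be checked with a short calculation. The sketch below assumes 4-byte block pointers, so a 1K indirect block holds 1024/4 = 256 pointers; the function names are made up for illustration.

```python
BLOCK_SIZE = 1024    # 1K blocks
POINTER_SIZE = 4     # assumption: 4-byte block pointers
DIRECT = 10          # direct pointers in the header

PTRS = BLOCK_SIZE // POINTER_SIZE  # pointers per indirect block (256)


def max_file_blocks():
    """Blocks reachable through direct, 1-, 2-, and 3-level indirect pointers."""
    return DIRECT + PTRS + PTRS ** 2 + PTRS ** 3


def max_file_bytes():
    return max_file_blocks() * BLOCK_SIZE


print(max_file_blocks())           # 16843018
print(max_file_bytes() / 2 ** 30)  # about 16.06 (GiB)
```

Almost all of the capacity comes from the 3-level indirect pointer, while the first 10 blocks need no extra disk read at all, which is why the scheme favors small files.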

Naming and directories

- Options
  - Use index (ask users to specify the inode number). Easier for the system, not as easy for users.
  - Text name (need to map to index)
  - Icon (need to map to index; or map to name, then to index)
- Directories
  - A directory maps a name to a file index (where to find the file header)
  - A directory is just a table of (file name, file index) pairs
  - Each directory is stored as a file containing (name, index) pairs
  - Only the OS is permitted to modify a directory
- Approach 1: have a single directory for the entire system
  - Put the directory at a known location on disk
  - Directory contains <name, index> pairs
  - If one user uses a name, no one else can
  - Many older personal computers work this way
- Approach 2: have a single directory for each user
  - Still clumsy, and ls on 10,000 files is a real pain
- Approach 3: hierarchical name spaces
  - Allow a directory to map names to files or other dirs
  - File system forms a tree (or graph, if links allowed)
  - Large name spaces tend to be hierarchical (IP addresses, domain names, scoping in programming languages, etc.)

Directory structure

- Used since CTSS (1960s)
- Unix adopted the idea and used it really nicely
- Directories are stored on disk just like regular files
  - The inode has a special flag bit set
  - Users can read a directory just like any other file
  - Only special programs can write it
  - The file pointed to by the index may be another directory
  - Makes the FS into a hierarchical tree (what is needed to make a DAG?)
- Simple. Plus speeding up file ops = speeding up dir ops!

Hierarchical Unix

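[Figure: the root directory contains afs, bin, cdrom, dev, sbin, and tmp; one subdirectory contains awk, chmod, and chown. Each directory is a table of <name, inode#> entries, for example:]

<afs, 1021>
<tmp, 1020>
<bin, 1022>
<cdrom, 4123>
<dev, 1001>
<sbin, 1011>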

Naming

- Bootstrapping: Where do you start looking?
  - Root directory
  - inode #2 on the system
  - 0 and 1 used for other purposes
- Special names:
  - Root directory: “/” (bootstrap name system for users)
  - Current directory: “.”
  - Parent directory: “..”
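Name lookup in a hierarchical file system walks one directory per path component, starting from the root inode. The sketch below is a simplified illustration, not the actual Unix routine: directories are modeled as in-memory tables, the root entries reuse the <name, inode#> pairs shown earlier, and the contents of bin (including its inode numbers) are invented.

```python
ROOT_INODE = 2  # inodes 0 and 1 are used for other purposes

# Hypothetical stand-in for directories stored on disk:
# inode number -> table mapping each name to an inode number.
directories = {
    2: {".": 2, "..": 2, "afs": 1021, "tmp": 1020, "bin": 1022,
        "cdrom": 4123, "dev": 1001, "sbin": 1011},
    # Assumption: the entries and inode numbers below are made up.
    1022: {".": 1022, "..": 2, "awk": 3001, "chmod": 3002, "chown": 3003},
}


def resolve(path, cwd_inode=ROOT_INODE):
    """Translate a path name to an inode number, one component at a time."""
    # An absolute path starts at the root; a relative one at the current dir.
    inode = ROOT_INODE if path.startswith("/") else cwd_inode
    for name in path.split("/"):
        if name == "":
            continue  # empty component from a leading or doubled slash
        table = directories.get(inode)
        if table is None:
            raise NotADirectoryError(f"inode {inode} is not a directory")
        if name not in table:
            raise FileNotFoundError(name)
        inode = table[name]
    return inode


print(resolve("/bin/chmod"))               # 3002
print(resolve("../tmp", cwd_inode=1022))   # 1020
```

Because “.” and “..” are ordinary directory entries, the lookup loop needs no special cases for them.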

Outline

- Topics covered so far in file systems:
  - Data blocks
  - File headers
  - Directories
  - File system superblocks
- Remaining topics:
  - Hard and soft links
  - Permissions

Creating synonyms: hard and soft links

- More than one dir entry can refer to a given file
  - Unix stores a count of pointers (“hard links”) to the inode
  - To make one: “ln foo bar” creates a synonym (‘bar’) for ‘foo’
- Soft links:
  - Also point to a file (or dir), but the object can be deleted from underneath the link (or never even exist)
  - A soft link is a normal file that holds the name of its target, with a special “sym link” bit set
- When the file system encounters a symbolic link, it automatically translates it (if possible)

[Figure: directory entries foo and bar both point to the same inode, which has ref = 2; the soft link “baz” holds the name /bar.]
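The difference between the two kinds of link can be observed from user space. The sketch below assumes a POSIX system where the temporary directory supports both hard and symbolic links. It uses Python's `os.link` and `os.symlink`, which correspond to `ln` and `ln -s`; the file names follow the example above.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    foo = os.path.join(d, "foo")
    bar = os.path.join(d, "bar")
    baz = os.path.join(d, "baz")

    with open(foo, "w") as f:
        f.write("hello")

    os.link(foo, bar)     # like "ln foo bar": a second name for the same inode
    os.symlink(bar, baz)  # soft link: baz stores the name of bar

    same_inode = os.stat(foo).st_ino == os.stat(bar).st_ino
    links_after_ln = os.stat(foo).st_nlink   # link count is now 2

    os.remove(foo)        # removes one name; the inode survives through bar
    links_after_rm = os.stat(bar).st_nlink   # link count drops to 1
    with open(baz) as f:  # the soft link is translated automatically
        via_symlink = f.read()

    os.remove(bar)        # last hard link gone, so the file is deleted
    # baz still exists as a link, but its target no longer does.
    dangling = os.path.islink(baz) and not os.path.exists(baz)

print(same_inode, links_after_ln, links_after_rm, via_symlink, dangling)
```

The hard link keeps the file alive until the count reaches zero, while the soft link is left dangling once its target name disappears.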

Protection

- Goals:
  - Prevent accidental and maliciously destructive behavior
  - Ensure fair resource usage
- A key distinction to make: policy vs. mechanism
  - Policy: what is to be done
  - Mechanism: how something is to be done

Access control

- Domain structure
  - Access/usage rights associated with a particular domain
  - Example: user/kernel mode → two domains
  - Unix: each user is a domain; super-user domain; groups of users (and groups)
- Type of access rights
  - For files: read/write/execute
  - For directories: list/modify/delete
  - For access rights themselves
    - Owner (I have the right to change the access rights for some resource)
    - Copy (I have the right to give someone else a copy of an access right I have)
    - Control (I have the right to revoke someone else’s access rights)

A combined approach

- Objects have ACLs
- Users have capabilities (CAPs), called “groups” or “roles”
- ACLs can refer to users or groups
- Change permissions on an object by modifying its ACL
- Change broad user permissions via changes in group membership
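A minimal sketch of the combined approach follows. The users, groups, object name, and the `allowed` function are all hypothetical; the point is only that an ACL entry may name either a user or a group, so one change in group membership affects every object whose ACL mentions that group.

```python
# Hypothetical groups and objects, invented for illustration.
groups = {"staff": {"alice", "bob"}, "students": {"carol"}}

# Each object's ACL maps a user name or a group name to a set of rights.
acls = {
    "grades.txt": {"alice": {"read", "write"}, "staff": {"read"}},
}


def allowed(user, obj, right):
    """True if the object's ACL grants the right directly or through a group."""
    acl = acls.get(obj, {})
    if right in acl.get(user, set()):
        return True
    return any(user in members and right in acl.get(group, set())
               for group, members in groups.items())


print(allowed("bob", "grades.txt", "read"))     # True, via group "staff"
print(allowed("bob", "grades.txt", "write"))    # False
print(allowed("carol", "grades.txt", "read"))   # False

groups["staff"].add("carol")                    # change group membership only
print(allowed("carol", "grades.txt", "read"))   # True, the ACL was not touched
```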

Data structures for a typical file system

[Figure: in memory, the process control block holds an open file pointer array, which refers to the systemwide open file table, which refers to file headers (metadata). On disk are the file system info, file headers, directories, and file data.]

Appendix: Open a file

- File name lookup and authenticate
- Copy the file descriptor into the in-memory data structure, if it is not there yet
- Create an entry in the open file table (systemwide) if there isn’t one
- Create an entry in the PCB
- Link up the data structures
- Return a pointer to the user

[Figure: fd = open( FileName, access). File name lookup and authentication consult the file system on disk; the kernel then allocates and links up the PCB entry, the open file table entry, and the metadata.]
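The open steps can be sketched with plain Python data structures. Everything here is a simplified model under stated assumptions: the disk contents, the owner-only authentication check, and names such as `open_file` and `inode_cache` are invented, and a real kernel keeps far more state.

```python
# Hypothetical on-disk state: path -> inode number, inode number -> file header.
disk_names = {"/tmp/notes": 7}
disk_inodes = {7: {"owner": "alice", "size": 0, "blocks": []}}

inode_cache = {}      # in-memory copies of file headers (metadata)
open_file_table = {}  # systemwide: inode number -> shared entry


class PCB:
    """Process control block holding the open file pointer array."""

    def __init__(self, user):
        self.user = user
        self.fds = []  # slot i describes open file i, or is None if empty


def open_file(pcb, name, access):
    # 1. File name lookup and authenticate (assumed policy: owner only).
    ino = disk_names.get(name)
    if ino is None:
        raise FileNotFoundError(name)
    if disk_inodes[ino]["owner"] != pcb.user:
        raise PermissionError(name)
    # 2. Copy the file header into memory if it is not there yet.
    if ino not in inode_cache:
        inode_cache[ino] = dict(disk_inodes[ino])
    # 3. Create a systemwide open file table entry if there isn't one.
    entry = open_file_table.setdefault(
        ino, {"header": inode_cache[ino], "refs": 0})
    entry["refs"] += 1
    # 4. Create an entry in the PCB and link it to the systemwide entry.
    slot = {"entry": entry, "access": access, "file_pointer": 0}
    for fd, old in enumerate(pcb.fds):
        if old is None:          # reuse an empty slot
            pcb.fds[fd] = slot
            return fd
    pcb.fds.append(slot)
    # 5. Return the index of the slot to the user.
    return len(pcb.fds) - 1


proc = PCB("alice")
fd = open_file(proc, "/tmp/notes", "r")
print(fd, open_file_table[7]["refs"])  # 0 1
```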

Read a block

[Figure: read( fd, userBuf, size ) → find the open file descriptor through the PCB, open file table, and metadata → read( fileDesc, userBuf, size ) → logical → physical translation → read( device, phyBlock, size ) → disk device driver. The physical block is fetched into sysBuf in the buffer cache and then copied to userBuf.]

Example: open-read-close (cont’d)

  1. The kernel:
     - From “fd” find the file pointer
     - Based on the file system block size (let’s say 1 KB), find the blocks where the bytes (file_pointer, file_pointer+length) lie
     - Read the inode
     - For (each block) {
       - If the block # < 11, find the disk address of the block in the entries in the inode
       - If the block # >= 11 but < 11 + (1024/4): read the “single indirect” block to find the address of the block
       - If the block # >= 11 + (1024/4) but < 11 + 256 + 256 * 256: read the “double indirect” block and find the block’s address
       - Otherwise, read the “triple indirect” block and find the block’s address
       - Read the block from the disk
       - Copy the bytes in the block to the appropriate location in the buffer
       }
  2. The process calls close(fd);
  3. The kernel: deallocate the fd entry, mark it as empty.
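The block-number tests in the kernel's loop can be written out directly. The sketch below assumes block numbers start at 1, which is one reading of the “< 11” test given 10 direct pointers, and 4-byte pointers as implied by 1024/4. The function names are invented.

```python
BLOCK_SIZE = 1024
PTRS = BLOCK_SIZE // 4   # 256 pointers per indirect block
DIRECT_LIMIT = 11        # block numbers below 11 use the direct pointers


def index_level(block_no):
    """Classify a 1-based file block number by the index that reaches it."""
    if block_no < DIRECT_LIMIT:
        return "direct"
    if block_no < DIRECT_LIMIT + PTRS:
        return "single indirect"
    if block_no < DIRECT_LIMIT + PTRS + PTRS * PTRS:
        return "double indirect"
    return "triple indirect"


def blocks_for_range(file_pointer, length):
    """1-based block numbers covering bytes file_pointer .. file_pointer+length-1."""
    if length <= 0:
        return []
    first = file_pointer // BLOCK_SIZE + 1
    last = (file_pointer + length - 1) // BLOCK_SIZE + 1
    return list(range(first, last + 1))


# A 3000-byte read at file pointer 10000 crosses from direct to single indirect.
for b in blocks_for_range(10000, 3000):
    print(b, index_level(b))
# 10 direct
# 11 single indirect
# 12 single indirect
# 13 single indirect
```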

Example: the create-write-close cycle

  1. The process calls create (“README”);
  2. The kernel:
     - Get the current working directory of the process: Let’s say “/c/cs422/as/as
     - Call “namei” and see if a file named “README” already exists in that directory
     - If yes, return error “file already exists”;
     - If no: Allocate a new inode; Write the directory file “/c/cs422/as/as3” to add a new entry for the (“README”, disk address of inode) pair
     - Find an empty slot “fd” in the file descriptor table for the process;
     - Put the pointer to the inode in the slot “fd”;
     - Set the file pointer in the slot “fd” to 0;
     - Return “fd”;

Example: create-write-close (cont’d)

  1. The process calls write(fd, buffer, length);
  2. The kernel:
     - From “fd” find the file pointer;
     - Based on the file system block size (let’s say 1 KB), find the blocks where the bytes (file_pointer, file_pointer+length) lie;
     - Read the inode
     - For (each block) {
       - If the block is new, allocate a new disk block;
       - Based on the block no, enter the block’s address in the appropriate places in the inode or the indirect blocks; (the indirect blocks are allocated as needed)
       - Copy the bytes in buffer to the appropriate location in the block
       }
     - Change the file size field in the inode if necessary
  3. The process calls close(fd);
  4. The kernel deallocates the fd entry, marking it as empty.
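The write path can be sketched as a small simulation. This is a simplified model under several assumptions: the disk is a Python dictionary, the allocator just hands out increasing addresses, block numbers are 0-based, and only the direct and single indirect pointers are implemented. All names are invented for illustration.

```python
BLOCK_SIZE = 1024
DIRECT = 10
PTRS = BLOCK_SIZE // 4

disk = {}          # hypothetical disk: block address -> bytearray
next_free = [100]  # naive allocator state: next unused block address


def alloc_block():
    addr = next_free[0]
    next_free[0] += 1
    disk[addr] = bytearray(BLOCK_SIZE)
    return addr


def new_inode():
    return {"size": 0, "direct": [None] * DIRECT, "indirect": None}


def block_address(inode, n):
    """Disk address of 0-based file block n, allocating it if it is new."""
    if n < DIRECT:
        if inode["direct"][n] is None:
            inode["direct"][n] = alloc_block()
        return inode["direct"][n]
    n -= DIRECT
    if n >= PTRS:
        raise NotImplementedError("double and triple indirect omitted")
    if inode["indirect"] is None:
        # A list stands in for the indirect block, allocated only when needed.
        inode["indirect"] = [None] * PTRS
    if inode["indirect"][n] is None:
        inode["indirect"][n] = alloc_block()
    return inode["indirect"][n]


def write(slot, buffer):
    """Write buffer at the slot's file pointer, growing the file as needed."""
    inode, pos = slot["inode"], slot["file_pointer"]
    done = 0
    while done < len(buffer):
        n, offset = divmod(pos, BLOCK_SIZE)
        addr = block_address(inode, n)
        chunk = min(BLOCK_SIZE - offset, len(buffer) - done)
        disk[addr][offset:offset + chunk] = buffer[done:done + chunk]
        pos += chunk
        done += chunk
    slot["file_pointer"] = pos
    inode["size"] = max(inode["size"], pos)  # update the size field if needed
    return done


slot = {"inode": new_inode(), "file_pointer": 0}  # what create() would set up
write(slot, b"x" * 11000)
print(slot["inode"]["size"], len(disk))  # 11000 11
```

Writing 11000 bytes fills the 10 direct blocks and spills one block into the single indirect index, which is allocated only at that point.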