week 6 disks file system interface file system implementation operating systems cs3013 / cs502

85
WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Upload: adrian-palmer

Post on 02-Jan-2016

222 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

WEEK 6DISKS

FILE SYSTEM INTERFACEFILE SYSTEM IMPLEMENTATION

Operating SystemsCS3013 / CS502

Page 2: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Agenda2

Disks

File System Interface

File System Implementation

Page 3: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Objectives3

Identify different parts of a disk drive

Explain hard disk addressing schemes

Understand performance and stability issues

Differentiate various levels of RAID

Page 4: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Context4

Early days: disks thought of as I/O devices Controlled like I/O devices Block transfer, DMA, interrupts, etc. Data in and out of memory (where action is)

Today: disks as integral part of computing system Implementer of two fundamental abstractions

Virtual Memory Files

Long term storage of information within system The real center of action

Page 5: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Disk Drives5

External Connection IDE/ATA SCSI USB

Cache – independent of OS

Controller Details of

read/write Cache

management Failure

management

Page 6: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Price per Megabyte of Magnetic Hard Disk

6

Page 7: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Price per GB on Different Types of Disk Drives

7

19¢ per gigabyte – 500 GB Quadra (portable) 7200 rpm, 10 ms. avg. seek time, 16 MB drive cache USB 2.0 port (effective 40 MBytes/sec)

27.3¢ per GB – 1 TB Caviar (internal) 5400-7200 rpm, 8.9 ms. avg. seek time , 16 MB drive cache ATA

62¢ per GB – 80 GB Caviar (internal) 7200 rpm, 8.9 ms. ms. avg. seek time, 8 MB drive cache EIDE (theoretical 66-100 MBytes/sec)

$2.60 per GB – 146.8 GB HP SAS (hot swap) 15,000 rpm, 3.5 ms. avg. seek time ATA

Page 8: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Hard Disk Geometry8

Platters Two-sided magnetic material 1-16 per drive, 3,000 – 15,000 RPM

Tracks Concentric rings bits laid out serially Divided into sectors (addressable)

Cylinders Same track on each platter Arms move together

Operation Seek: move arm to track Read/Write:

wait till sector arrives under head Transfer data

Page 9: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Moving-head Disk Mechanism9

Page 10: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

More on Hard Disk Drives10

Manufactured in clean roomPermanent, air-tight enclosure

“Winchester” technology Spindle motor integral with shaft

“Flying heads” Aerodynamically “float” over moving surface Velocities > 100 meters/sec Parking position for heads during power-off

Excess capacity Sector re-mapping for bad blocks Managed by OS or by drive controller

20,000-100,000 hours mean time between failures Disk failure (usually) means total destruction of data!

Page 11: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Raw Disk Layout11

Track format – n sectors 200 < n < 2000 in modern disks Some disks have fewer sectors

on inner tracks Inter-sector gap

Enables each sector to be read or written independently

Sector format Sector address: Cylinder, Track,

Sector (or some equivalent code)

Optional header (HDR) Data Each field separated by small

gap and with its own CRC Sector length

Almost all operating systems specify uniform sector length

512 – 4096 bytes

Page 12: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Formatting the Disk12

Write all sector addresses Write and read back various data patterns on all

sectors Test all sectors Identify bad blocks

Bad block Any sector that does not reliably return the data that

was written to it!

Page 13: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Bad Block Management13

Bad blocks are inevitable Part of manufacturing process (less than 1%)

Detected during formatting Occasionally, blocks become bad during operation

Manufacturers add extra tracks to all disks Physical capacity = (1 + x) * rated capacity

Who handles them? Disk controller: Bad block list maintained internally

Automatically substitutes good blocks Formatter: Re-organize track to avoid bad blocks OS: Bad block list maintained by OS, bad blocks never

used

Page 14: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Bad Sector Handling – within track14

a) A disk track with a bad sectorb) Substituting a spare for the bad sectorc) Shifting all the sectors to bypass the bad one

Page 15: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

CTS Addressing15

Also known as CHS (Cylinder, Head, Sector) Addressing Blocks per platter = head per platter * cylinder *

sectors per cylinder

Total Blocks = Blocks per platter * platter = cylinder * head * sector

Page 16: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

LBA Addressing16

Linear Base Address (LBA) Addressing

Logical Address for the disk

Assume a disk with C cylinders, T tracks and S sectors and a CTS Address (c, t, s)

LBA Address = c * T * S + t * S + s – 1

Page 17: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Logical vs. Physical Sector Addresses17

Many modern disk controllers convert [cylinder, track, sector]

addresses into logical sector numbers Linear array No gaps in addressing Bad blocks concealed by controller

Reason: Backward compatibility with older PC’s Limited number of bits in C, T, and S fields

Page 18: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Disk Drive Performance18

Seek time Position heads over a cylinder – 1 to 25 ms

Rotational latency Wait for sector to rotate under head Full rotation - 4 to 12 ms (15000 to 5400 RPM) Latency averages ½ of rotation time

Transfer Rate approx 40-380 MB/sec (aka bandwidth)

Transfer of 1 Kbyte Seek (4 ms) + rotational latency (2ms) + transfer (40 μsec)

= 6.04 ms Effective BW here is about 170 KB/sec (misleading!)

Page 19: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Disk Reading Strategies19

Read and cache a whole track Automatic in many modern controllers Subsequent reads to same track have zero rotational

latency – good for locality of reference! Disk arm available to seek to another cylinder

Start from current head position Start filling cache with first sector under head Signal completion when desired sector is read

Start with requested sector When no cache, or limited cache sizes

Page 20: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Disk Writing Strategies20

There are noneThe best one can do is

collect together a sequence of contiguous (or nearby) sectors for writing

Write them in a single sequence of disk actions

Caching for later writing is (usually) a bad idea Application has no confidence that data is actually

written before a failure Some network disk systems provide this feature, with

battery backup power for protection

Page 21: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Efficiency and Performance21

Disk Cache Mechanism There are many places to store disk data so the

system doesn’t need to get it from the disk again and again.

Page 22: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Efficiency and Performance22

Disk Cache Management This is an essential part of

any well-performing Operating System.

The goal is to ensure that the disk is accessed as seldom as possible.

Keep previously read data in memory so that it might be read again.

They also hold on to written data, hoping to aggregate several writes from a process.

Can also be “smart” and do things like read-ahead. Anticipate what will be needed.

Page 23: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Performance Metrics23

Transaction & database systems Number of transactions per second Focus on seek and rotational latency, not bandwidth Track caching may be irrelevant (except read-modify-write)

Many little files (e.g., Unix, Linux) Same

Big files Focus on bandwidth and contiguous allocation Track caching important; seek time is secondary concern

Paging support for VM A combination of both Track caching is highly relevant – locality of reference

Page 24: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Problem24

Question:– If mean time to failure of a disk drive is 100,000 hours, and if your system has 100 identical disks, what is mean time between need to replace a drive?

Answer:– 1000 hours (i.e., 41.67 days ≈ 6 weeks)

I.e.:– You lose 1% of your data every 6 weeks!

But don’t worry – you can restore most of it from backup!

Page 25: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Can we do better?25

Yes, mirrored Write every block twice, on two separate disks Mean time between simultaneous failure of both disks

is >57,000 years

Can we do even better? E.g., use fewer extra disks? E.g., get more performance?

Page 26: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Redundant Array of Inexpensive Disks (RAID)

26

Distribute a file system intelligently across multiple disks to Maintain high reliability and availability

Enable fast recovery from failure

Increase performance

Page 27: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

“Levels” of RAID27

Level 0 – non-redundant striping of blocks across disk

Level 1 – simple mirroringLevel 2 – striping of bytes or bits with ECCLevel 3 – Level 2 with parity, not ECCLevel 4 – Level 0 with parity blockLevel 5 – Level 4 with distributed parity

blocks…

Page 28: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

RAID 0 – Simple Striping28

Each stripe is one or a group of contiguous blocksBlock/group i is on disk (i mod n)Advantage

Read/write n blocks in parallel; n times bandwidth

Disadvantage No redundancy at all. System MTBF is 1/n disk MTBF!

stripe 8stripe 4stripe 0

stripe 9stripe 5stripe 1

stripe 10stripe 6stripe 2

stripe 11stripe 7stripe 3

Page 29: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

RAID 1 – Striping and Mirroring29

Each stripe is written twice Two separate, identical disks

Block/group i is on disks (i mod 2n) & (i+n mod 2n)Advantages

Read/write n blocks in parallel; n times bandwidth Redundancy: System MTBF = (Disk MTBF)2 at twice the cost Failed disk can be replaced by copying

Disadvantage A lot of extra disks for much more reliability than we need

stripe 8stripe 4stripe 0

stripe 9stripe 5stripe 1

stripe 10stripe 6stripe 2

stripe 11stripe 7stripe 3

stripe 8stripe 4stripe 0

stripe 9stripe 5stripe 1

stripe 10stripe 6stripe 2

stripe 11stripe 7stripe 3

Page 30: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

RAID 2 & 330

Bit- or byte-level stripingRequires synchronized disks

Highly impractical

Requires fancy electronics For ECC calculations

Not used; academic interest only

See Silbershatz, §12.7.3 (pp. 471-472)

Page 31: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Observation31

When a disk or stripe is read incorrectly,

we know which one failed!

Conclusion: A simple parity disk can provide very high reliability (unlike simple parity in memory)

Page 32: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

RAID 4 – Parity Disk32

parity 0-3 = stripe 0 xor stripe 1 xor stripe 2 xor stripe 3 n stripes plus parity are written/read in parallel If any disk/stripe fails, it can be reconstructed from others

E.g., stripe 1 = stripe 0 xor stripe 2 xor stripe 3 xor parity 0-3 Advantages

n times read bandwidth System MTBF = (Disk MTBF)2 at 1/n additional cost Failed disk can be reconstructed “on-the-fly” (hot swap) Hot expansion: simply add n + 1 disks all initialized to zeros

Disadvantage Writing requires read-modify-write of parity stripe only 1x write

bandwidth.

stripe 8stripe 4stripe 0

stripe 9stripe 5stripe 1

stripe 10stripe 6stripe 2

stripe 11stripe 7stripe 3

parity 8-11parity 4-7parity 0-3

Page 33: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

RAID 5 – Distributed Parity33

Parity calculation is same as RAID Level 4 Advantages & Disadvantages – Mostly same as RAID Level

4 Additional advantages

avoids beating up on parity disk Some writes in parallel (if no contention for parity drive)

Writing individual stripes (RAID 4 & 5) Read existing stripe and existing parity Recompute parity Write new stripe and new parity

stripe 12stripe 8stripe 4stripe 0

parity 12-15stripe 9stripe 5stripe 1

stripe 13parity 8-11

stripe 6stripe 2

stripe 14stripe 10

parity 4-7stripe 3

stripe 15stripe 11stripe 7

parity 0-3

Page 34: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

RAID 4 & 534

Very popular in data centers Corporate and academic servers

Built-in support in Windows XP and Linux Connect a group of disks to fast SCSI port (320

MB/sec bandwidth) OS RAID support does the rest!

Other RAID variations also available

Page 35: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Another Problem - Incomplete Operation

35

Problem – how to protect against disk write operations that don’t finish Power or CPU failure in the middle of a block Related series of writes interrupted before all are

completed

Examples: Database update of charge and credit RAID 1, 4, 5 failure between redundant writes

Page 36: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Consistency36

Required when system crashes or data on the disk may be inconsistent:

Consistency Checker

compares data in the directory structure with data blocks on disk and tries to fix inconsistencies. For example, What if a file has a pointer to a block, but the bit map for the free-space-management says that block isn't allocated.

Back-upprovides consistency by copying data to a "safe" place.

Recoveryoccurs when lost data is retrieved from backup,

Page 37: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Solution 1 – Stable Storage37

Write everything twice to separate disks Be sure 1st write does not invalidate previous 2nd

copy RAID 1 is okay; RAID 4/5 not okay! Read blocks back to validate; then report completion

Reading both copies If 1st copy okay, use it – i.e., newest value If 2nd copy different or bad, update it with 1st copy If 1st copy is bad; update it with 2nd copy – i.e., old

value

Page 38: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Solution 1 – Stable Storage38

Crash recovery Scan disks, compare corresponding blocks If one is bad, replace with good one If both good but different, replace 2nd with 1st copy

Result:– If 1st block is good, it contains latest value If not, 2nd block still contains previous value

An abstraction of an atomic disk write of a single block Uninterruptible by power failure, etc.

Page 39: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Solution 2 – Log-structured File System

39

Also known as Journaling File System (JFS)Maintains a journal of changes to makeCrash Recovery

Replay the journal until the file system is consistent

Atomic changes Succeed (succeeded originally or replayed completely

during recovery) Fail (not replayed at all or skipped due to incomplete

journal)

Page 40: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Agenda40

Page 41: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Objectives41

Identify file systems, files and directories

Explain file and directory structures

Give examples of file and directory operations

Explain access methods

Explain protection in file system context

Page 42: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Questions42

What is a file system?

What is a file?

What is a directory?

Page 43: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

File System Interface43

User-level portion of the file system Access methods

Directory Structure

Protection

Page 44: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

File Concept44

A collection of related bytes having meaning only to the creator. The file can be "free formed", indexed, structured, etc.

The file is an entry in a directory.The file may have attributes (name, creator, date,

type, permissions)The file may have structure ( O.S. may or may not

know about this.) It's a tradeoff of power versus overhead. For example, An Operating System understands program image format

in order to create a process. The UNIX shell understands how directory files look. (In

general the UNIX kernel doesn't interpret files.) Usually the Operating System understands and interprets

file types.

Page 45: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Fundamental Ambiguity45

Is the file the “container of the information” or the “information” itself?

Almost all systems confuse the two.

Almost all people confuse the two.

Page 46: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Example46

Later, how do either of us know that we are using the same version of the document?

Windows/Outlook/Exchange: Time-stamp is a pretty good indication that they are Time-stamps preserved on copy, drag and drop,

transmission via e-mail, etc.

Unix By default, time-stamps not preserved on copy, ftp, e-

mail, etc. Time-stamp associated with container, not with

information

Page 47: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

File Attributes47

Name – only information kept in human-readable formIdentifier – unique tag (number) identifies file within

file systemType – needed for systems that support different typesLocation – pointer to file location on deviceSize – current file sizeProtection – controls who can do reading, writing,

executingTime, date, and user identification – data for

protection, security, and usage monitoring

Information about files is kept in the directory structure, which is maintained on the disk.

Page 48: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

File Operations48

CreateWriteReadReposition within a fileDeleteTruncate

Rename, Append, Copy

Page 49: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

File Types49

Page 50: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

File Structure50

None Sequence of words/bytes

Simple Record Structure Lines Fixed Length Variable Length

Complex Structure Formatted Document Relocatable load file

OS and programs interpret these structures.

Page 51: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Access Methods51

Sequential Access Implemented by the file system. Data is accessed one record right after the last. Reads cause a pointer to be moved ahead by one. Writes allocate space for the record and move the

pointer to the new End Of File. Such a method is reasonable for tape

Page 52: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Access Methods52

Direct Access Method useful for disks. The file is viewed as a numbered sequence of blocks

or records. There are no restrictions on which blocks are

read/written in any order. User now says "read n" rather than "read next". "n" is a number relative to the beginning of file, not

relative to an absolute physical disk location.

Page 53: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Access Methods53

OTHER ACCESS METHODS Built on top of direct access and often implemented by a user utility.

Indexed ID plus pointer. 

An index block says what's in each remaining block or contains pointers to blocks containing particular items. Suppose a file contains many blocks of data arranged by name alphabetically.

  Example 1: Index contains the name appearing as the first

record in each block. There are as many index entries as there are blocks.  

Example 2: Index contains the block number where "A" begins, where "B" begins, etc. Here there are only 26 index entries.

Page 54: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Access Methods54

Example 1: Index contains the name appearing as the first record in each block. There are as many index entries as there are blocks.

Example 2: Index contains the block number where "A" begins, where "B" begins, etc. Here there are only 26 index entries.

Adams

Arthur

Asher

Smith

Adams

Baker

Charles

Saarnin

Smith, John | Data

Adams | Data

Arthur | Data

Asher | Data

Baker | Data

Saarnin | Data

Smith | Data

Page 55: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Directory55

Special kind of a file

A tool for users & applications to organize and find files User-friendly names Names that are meaningful over long periods of time

The data structure for OS to locate files (i.e., containers) on disk

Page 56: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Directory Structures56

Single level One directory per system, one entry pointing to each file Small, single-user or single-use systems

PDA, cell phone, etc.Two-level

Single “master” directory per system Each entry points to one single-level directory per user Uncommon in modern operating systems

Hierarchical Any directory entry may point to

Another directory Individual file

Used in most modern operating systems

Page 57: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Directory Organization - Hierarchical57

Page 58: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Directory Considerations58

Efficiency – locating a file quickly.

Naming – convenient to users. Separate users can use same name for separate files. The same file can have different names for different

users. Names need only be unique within a directory

Grouping – logical grouping of files by properties e.g., all Java programs, all games, …

Page 59: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

More on Hierarchical Directories59

Most systems support idea of current (working) directory Absolute names – fully qualified from root of file system

/usr/group/foo.c Relative names – specified with respect to working directory

foo.c A special name – the working directory itself

“.”

Modified Hierarchical – Acyclic Graph (no loops) and General Graph Allow directories and files to have multiple names Links are file names (directory entries) that point to existing

(source) files

Page 60: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Directory Operations60

Create: Make a new directory

Add, Delete entry: Invoked by file create & destroy, directory create & destroy

Find, List: Search or enumerate directory entries

Rename: Change name of an entry without changing anything else

about itLink, Unlink:

Add or remove entry pointing to another entry elsewhere Introduces possibility of loops in directory graph

Destroy: Removes directory; must be empty

Page 61: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Links61

Hard links: bi-directional relationship between file names and file A hard link is directory entry that points to a source file’s

metadata Metadata counts the number of hard links (including the source

file) – link reference count Link reference count is decremented when a hard link is deleted File data is deleted and space freed when the link reference count

is 0

Symbolic (soft) links: uni-directional relationship between a file name and the file Directory entry points to new metadata Metadata points to the source file If the source file is deleted, the link pointer is invalid

Page 62: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Other Issues62

Mounting: Attaching portions of the file system into a directory structure.

Sharing: Sharing must be done through a protection scheme May use networking to allow file system access between

systems Manually via programs like FTP or SSH Automatically, seamlessly using distributed file systems Semi automatically via the world wide web

Client-server model allows clients to mount remote file systems from servers Server can serve multiple clients Client and user-on-client identification is insecure or complicated NFS is standard UNIX client-server file sharing protocol CIFS is standard Windows protocol Standard operating system file calls are translated into remote calls

Page 63: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Protection63

File owner/creator should be able to control: what can be done by whom

Types of access Read Write Execute Append Delete List

Page 64: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Protection64

Unix/Linux Mode of access: read (R), write (W), execute (X) Three classes of users

Owner Group Public

Ask manager to create a group (unique name), say G, and add some users to the group.

For a particular file (say game) or subdirectory, define an appropriate access.

drwxr-xr-x 2 clee01 8571 4096 Apr 30 2007 old-rwxr-x--- 1 clee01 8571 15827 Apr 30 2007 webserver-rw-r----- 1 clee01 8571 7758 Apr 27 2007 webserver.cxx

Page 65: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Protection65

Windows

Page 66: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Agenda66

Page 67: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Objectives67

Differentiate layers in file system structure

Explain different allocation methods

Explain different free space management schemes

Page 68: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

File System Implementation68

OS-level portion of the file system

Allocation and free space management

Directory implementation

Page 69: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

File System Structure69

When talking about “the file system”, you are making a statement about both the rules used for file access, and about the algorithms used to implement those rules. Here’s a breakdown of those algorithmic pieces.

Application Programs The code that's making a file request.

Logical File System This is the highest level in the OS; it does protection, and security. Uses the directory structure to do name resolution.

File Organization Module Here we read the file control block maintained in the directory so we know about files and the logical blocks where information about that file is located.

Basic File System Knowing specific blocks to access, we can now make generic requests to the appropriate device driver.

IO Control These are device drivers and interrupt handlers. They cause the device to transfer information between that device and CPU memory.

Devices The disks / tapes / etc.

Page 70: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Layered File System70

Handles the CONTENT of the file. Knows the file’s internal structure.

Handles the OPEN, etc. system calls. Understands paths, directory structure, etc.

Uses directory information to figure out blocks, etc. Implements the READ. POSITION calls.

Determines where on the disk blocks are located.

Interfaces with the devices – handles interrupts.

Page 71: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Virtual File System71

Virtual File Systems (VFS) provide an object-oriented way of implementing file systems.

VFS allows the same system call interface (the API) to be used for different types of file systems.

The API is to the VFS interface, rather than any specific type of file system.

Page 72: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Allocation Methods – Contiguous Allocation

72

Method: Lay down the entire file on contiguous sectors of the disk. Define by a dyad <first block location, length >.

1. Accessing the file requires a minimum of head movement.

2. Easy to calculate block location: block i of a file, starting at disk address b, is b + i.

3. Difficulty is in finding the contiguous space, especially for a large file. Problem is one of dynamic allocation (first fit, best fit, etc.) which has external fragmentation. If many files are created/deleted, compaction will be necessary.

 It's hard to estimate at create time what the size of the file will ultimately be. What happens when we want to extend the file --- we must either terminate the owner of the file, or try to find a bigger hole.

Page 73: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Allocation Methods – Contiguous Allocation

73

Ideal for large, static files Databases, fixed system structures, OS code CD-ROM, DVD

Simple address calculation Block address sector address

Fast multi-block reads and writes Minimize seeks between blocks

Prone to fragmentation when … Files come and go Files change size

Similar to unpaged virtual memory

Page 74: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Allocation Methods – Linked Allocation74

Each file is a linked list of disk blocks, scattered anywhere on the disk.

Each block contains pointer to next block

Directory points to first and last blocks

Sector header block: Pointer to next block ID and block number of file

At file creation time, simply tell the directory about the file. When writing, get a free block and write to it, enqueueing it to the file header.

Page 75: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Allocation Methods – Linked Allocation75

Advantages No space fragmentation Easy to create, extend files Ideal for lots of small files

Disadvantages Lots of disk arm movement Space taken up by links Sequential access only!

Page 76: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Example of Linked Allocation76

File Allocation Table (FAT) Instead of a link on

each block, put alllinks in one table the FAT (File

Allocation Table) fixed place on disk

One entry per physical block in disk

Directory points to first & last blocks of file

Each block points to next block (or EOF)

Page 77: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

FAT File System77

Advantages Advantages of Linked File System FAT can be cached in memory Searchable at CPU speeds, pseudo-random access

Disadvantages Limited size, not suitable for very large disks FAT cache describes entire disk, not just open files! Not fast enough for large databases

Used in MS-DOS, early Windows systems

Page 78: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Allocation Methods – Indexed Allocation

78

Each file uses an index block on disk to contain addresses of other disk blocks used by the file.

When the i th block is written, the address of a free block is placed at the i th position in the index block.

Method suffers from wasted space since, for small files, most of the index block is wasted. What is the optimum size of an index block?

If the index block is too small, we can: Link several together Use a multilevel index

UNIX keeps 12 pointers to blocks in its header. If a file is longer than this, then it uses pointers to single, double, and triple level index blocks.

Page 79: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Allocation Methods – Indexed Allocation

79

Unix Method Direct blocks:

Pointers to first n sectors

Single indirect table: Extra block

containing pointers to blocks n+1 .. n+m

Double indirect table: Extra block

containing single indirect blocks

Page 80: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Performance Issues80

It's difficult to compare mechanisms because usage is different. Let's calculate, for each method, the number of disk accesses to read block i from a file:

Contiguous 1 access from location start + i.

Linked i + 1 accesses, reading each block in turn. (is this a fair example?)

Indexed 2 accesses, 1 for index, 1 for data.

Page 81: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Free Space Management81

Bit Vector Method Each block is represented by a bit

1 1 0 0 1 1 0 means blocks 2, 3, 6 are free.

This method allows an easy way of finding contiguous free blocks. Requires the overhead of disk space to hold the bitmap.

A block is not REALLY allocated on the disk unless the bitmap is updated.

What operations (disk requests) are required to create and allocate a file using this implementation?

Page 82: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Free Space Management82

Free List Method Free blocks are chained together, each holding a pointer to

the next one free. This is very inefficient since a disk access is required to

look at each sector.Grouping Method

In one free block, put lots of pointers to other free blocks. Include a pointer to the next block of pointers.

Counting Method Since many free blocks are contiguous, keep a list of dyads

holding the starting address of a "chunk", and the number of blocks in that chunk.

Format < disk address, number of free blocks >

Page 83: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Directory Management83

The issue here is how to be able to search for information about a file in a directory given its name.

Could have linear list of file names with pointers to the data blocks. This is: simple to program BUT time consuming to search.

Could use hash table - a linear list with hash data structure. Use the filename to produce a value that's used as entry to hash

table. Hash table contains where in the list the file data is located. This decreases the directory search time (file creation and

deletion are faster.) Must contend with collisions - where two names hash to the

same location. The number of hashes generally can't be expanded on the fly.

Page 84: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Example: Linux Files84

[clee01@CCC3 ~]$ stat .addressbook File: `.addressbook' Size: 0 Blocks: 0 IO Block: 8192 regular

empty fileDevice: 16h/22d Inode: 32541601 Links: 1Access: (0640/-rw-r-----) Uid: ( 8571/ clee01) Gid: ( 8571/ UNKNOWN)Access: 2007-01-13 23:32:41.457165000 -0500Modify: 2007-01-13 23:32:41.457165000 -0500Change: 2007-01-13 23:32:41.457165000 -0500[clee01@CCC3 ~]$ stat .cshrc File: `.cshrc' Size: 84 Blocks: 8 IO Block: 8192 regular fileDevice: 16h/22d Inode: 24230479 Links: 1Access: (0600/-rw-------) Uid: ( 8571/ clee01) Gid: ( 8571/ UNKNOWN)Access: 2008-07-10 18:09:08.425782000 -0400Modify: 2006-07-12 01:36:38.000000000 -0400Change: 2007-01-10 11:08:20.775838000 -0500[clee01@CCC3 ~]$

Page 85: WEEK 6 DISKS FILE SYSTEM INTERFACE FILE SYSTEM IMPLEMENTATION Operating Systems CS3013 / CS502

Example : Linux Directory85

[clee01@CCC3 ~]$ stat cs502

File: `cs502'

Size: 4096 Blocks: 8 IO Block: 8192 directory

Device: 16h/22d Inode: 23494968 Links: 11

Access: (0750/drwxr-x---) Uid: ( 8571/ clee01) Gid: ( 8571/ UNKNOWN)

Access: 2008-07-06 10:42:55.745277000 -0400

Modify: 2008-07-05 09:26:33.081177000 -0400

Change: 2008-07-05 09:26:33.081177000 -0400

[clee01@CCC3 ~]$