protection and system calls · protection examples • cpu protection – prevent a user from using...
TRANSCRIPT
Protection and System Calls
Otto J. Anshus
How To Deal with Complexity
2
How To Deal with Complexity
2
I don’t know
How To Deal with Complexity
2
I don’t know
I do believe that the Best Advice You’ll Ever Get is to
Do early, fail early, and learn how to make things work
Why Protection• Observations
– Large system • ⇒ More Complexity ⇒ More crashes ⇒ More Service unavailability
– Evil humans • ⇒ Smart Architecture & Smart Evil Code ⇒ pain & suffering ⇒ anything can happen
– Good humans – ⇒ Dumb Architecture & Dumb Bad Code ⇒ pain & suffering ⇒ anything can happen
– [Large||Evil||Good] • ⇒ [SA||DA||SEC||DBC] ⇒ pain & suffering ⇒ anything can happen
– Security can be compromised in multiple ways
• Claim – Not so simple to solve – Many things are needed
• Good architectures, designs, code, protocols, even encryption • Protection at architectural and systems levels are fundamental
3
Typical Unix OS “Structure”
Application
Libraries
Portable OS Layer
Machine-dependent layer
•Low-level system initialization and bootstrap
•Fault, trap, interrupt and exception handling
•Memory management: hardware address translation
•Low-level kernel/user-mode process context switching
•I/O device driver and device initialization code
C
Assembler
System Call Interface
• Easier to do
•Performance
User LevelKernel Level
Where is the code and data to do all of this actually located?
Typical Unix OS “Structure”
Application
Libraries
Portable OS Layer
Machine-dependent layer
•Low-level system initialization and bootstrap
•Fault, trap, interrupt and exception handling
•Memory management: hardware address translation
•Low-level kernel/user-mode process context switching
•I/O device driver and device initialization code
C
Assembler
System Call Interface
• Easier to do
•Performance
User LevelKernel Level
The code & data are in memory (local DRAM, local disks, remote DRAM, remote disks, remote services)
Where is the code and data to do all of this actually located?
Protection Examples
• CPU protection – Prevent a user from using the CPU for too long
• Throughput of jobs, and response time to events (incl. user interactive response time)
• Memory protection – Prevent users from modifying kernel code and data structures – …and each others code and data
• I/O protection – Prevent users from performing illegal I/O’s (as well as other instructions that can
result in crashes, bad performance, compromised applications and system)
Sidebar: Protection & Security & “Merkelphones”
• Achieving security is fundamentally about system architectures (and designs, and implementations)
• Two very different approaches – German approach (Made possible by many years of German funding
of German OS research in Germany) • Main insight: all computers can be compromised. A smartphone is a computer. We need
a secure OS alongside Android/iOS/etc • http://www.telekom.com/medien/loesungen-fuer-unternehmen/200140
– Norwegian approach (Purchased from a German company) • Main insight: TBD. • https://cdn.rohde-schwarz.com/pws/dl_downloads/dl_common_library/
dl_brochures_and_datasheets/pdf_1/TopSecMobile_fly_en_3606-6692-32_v0300.pdf
6
Professor Hermann Härtig,
Chair of Operating Systems,
Technical University Dresden
Two level architecture supported by the Hardware
Application
Portable OS Layer
Libraries
Machine-dependent layer
•Low-level system initialization and bootstrap
•Fault, trap, interrupt and exception handling
•Memory management: hardware address translation
•Low-level kernel/user-mode process context switching
•I/O device driver and device initialization code
C
Assembler
System Call Interface
•...have to
•Performance
Two level architecture supported by the Hardware
Application
Portable OS Layer
Libraries
Machine-dependent layer
•Low-level system initialization and bootstrap
•Fault, trap, interrupt and exception handling
•Memory management: hardware address translation
•Low-level kernel/user-mode process context switching
•I/O device driver and device initialization code
C
Assembler
System Call Interface
•...have to
•Performance
Of course, nothing is as simple as it looks. When you think you are OK, somebody reminds you that the microcode of the processor can be changed and then anything is possible and bad things can happen
“Application”? What is running at User Level? a.k.a. What is a Process?
• Four segments – Code/text: instructions – Data: variables – Stack – Heap
• Why? – Separate code and data – Stack and heap grow
toward each other
• Stack – Layout by compiler – Allocate at process creation (fork) – Deallocate at process termination
• Heap – Linker and loader specify the
starting address – Allocate/deallocate by library
calls such as malloc() and free() called by application
• Data – Compiler allocate statically – Compiler specify names and
symbolic references – Linker translate references and
relocate addresses – Loader finally lay them out in memory
“Application”? What is running at User Level? a.k.a. What is a Process?
• Four segments – Code/text: instructions – Data: variables – Stack – Heap
• Why? – Separate code and data – Stack and heap grow
toward each other
• Stack – Layout by compiler – Allocate at process creation (fork) – Deallocate at process termination
• Heap – Linker and loader specify the
starting address – Allocate/deallocate by library
calls such as malloc() and free() called by application
• Data – Compiler allocate statically – Compiler specify names and
symbolic references – Linker translate references and
relocate addresses – Loader finally lay them out in memory
NB: Don’t sweat it too much now, but eventually We (well, You) must figure out how to, inside the OS Kernel, implement the process abstraction and how to handle a running process
Registers Directly Available & Used by User Level Processes
In “protected mode”, there are eight 32-bit general-purpose registers for use:
data registers ◦ EAX, the accumulator (32 bits (16 and AX (AH, AL))) ◦ EBX, the base register ◦ ECX, the counter register ◦ EDX, the data register address registers ◦ ESI, the source register ◦ EDI, the destination register ◦ ESP, the stack pointer register ◦ EBP, the stack base pointer register
9
User Level
Registers Directly Available & Used by User Level Processes
In “protected mode”, there are eight 32-bit general-purpose registers for use:
data registers ◦ EAX, the accumulator (32 bits (16 and AX (AH, AL))) ◦ EBX, the base register ◦ ECX, the counter register ◦ EDX, the data register address registers ◦ ESI, the source register ◦ EDI, the destination register ◦ ESP, the stack pointer register ◦ EBP, the stack base pointer register
9
Again, don’t worry if this smells and looks like Norwegian Gammalost (the only cheese being illegal as carry on), we will sneak it in during the next few weeks.
Besides, your fellow students and TA (and Google) are (sometimes at least) your friends.
User Level
Registers accessed by Kernel (cannot be directly accessed by user level processes)
Registers used by OS Kernel changing the state of the processor:
control registers ◦ CR0, CR1, CR2, CR3
test registers ◦ TR4, TR5, TR6, TR7
descriptor registers ◦ GDTR, the global descriptor table register (see below) ◦ LDTR, the local descriptor table register (see below) ◦ IDTR, the interrupt descriptor table register (see below)
task register ◦ TR 10
Kernel Level
Instruction Pointer Register• EIP
– The IP register points to where in the process’ code (address space) the processor will get the next instruction
– The IP register cannot be accessed by the programmer an “application level” process directly.
• So how do we do it then?
11
Instruction Pointer Register• EIP
– The IP register points to where in the process’ code (address space) the processor will get the next instruction
– The IP register cannot be accessed by the programmer an “application level” process directly.
• So how do we do it then?
11
instructions that control program flow, such as calls, jumps, loops, and interrupts, automatically change the instruction pointer.For example: push eax; ret;
Instruction Pointer Register• EIP
– The IP register points to where in the process’ code (address space) the processor will get the next instruction
– The IP register cannot be accessed by the programmer an “application level” process directly.
• So how do we do it then?
11
instructions that control program flow, such as calls, jumps, loops, and interrupts, automatically change the instruction pointer.For example: push eax; ret;
Why not let a user level process change the IP register directly?
Instruction Pointer Register• EIP
– The IP register points to where in the process’ code (address space) the processor will get the next instruction
– The IP register cannot be accessed by the programmer an “application level” process directly.
• So how do we do it then?
11
instructions that control program flow, such as calls, jumps, loops, and interrupts, automatically change the instruction pointer.For example: push eax; ret;
Why not let a user level process change the IP register directly?
(i) no legitimate use case (ii) hard to do branch prediction (iii) security concerns
User level vs. Kernel level
• User level • Some instructions are not available any more • Programs can be modified and substituted by user
• Kernel (a.k.a. supervisory or privileged) level – All instructions are available – Total control possible so OS should never ever give away
to a user process the right to run at kernel level
Application
Portable OS Layer
Libraries
Machine-dependent layer
Architecture Support: Privileged ModeUser Level Kernel Level
Boot OS & Run {OS||Processes}• Boot OS
– power on/start up code (the computer will already have this code) reads boot block from “hard disc” to memory and transfer control to boot block code
– boot block code (you get to do this) reads OS from disc to memory – OS (you get to do this) is given control
• OS starting user program (first time) as a process – Read (already compiled and readied by you) program image from disk – Initialize OS kernel data structures with info about the program, turning it into a process – ATOMICALLY DO {<Set privilege level ”user”>; <Load instruction register with start of process address>} %
why ‘atomically?
– Now we have a user level PROCESS running
• User level process requesting OS service – Process writes data ”somewhere” (memory, stack, registers) indicating which service is
requested and where to find the parameters – Execute instruction that will trap in decode, causing an interrupt
• Still need mechanism that allows OS to ”preempt” user process. So we need a way to activate the OS kernel independently of running program (a.k.a. processes). That can only be achieved “external” to the ”current” process (a.k.a. running program’s) instruction stream.
I will call the process being executed for the CURRENT process
Interrupts and Traps
• Interrupts – Raised by external events – CPU resume from the interrupt
handler in the kernel • switch to next process
– iret instruction: returns by popping return address from stack, and enable interrupts (IA32 instruction set)
• Traps – Internal events – System calls (syscalls) – Also return by iret
Application: program being executed: at least one process: with at least one thread
OS Kernel
By the way: What to Do When User Level Process is Trying to Execute an Illegal Instruction?
• Instruction Stream – Fetch instruction – Decode instruction – Fetch operands – Execute – Write back result – Next instruction
When decoding instruction, what to do if bit-pattern doesn’t represent a (legal) instruction?
– Halt? (No, but why not?) – Instead: “Trap”: fetch next instruction at
”predetermined” address in memory. Make sure that you have placed your (OS) code there beforehand……or else (what will happen?)
Interrupts and Exceptions• Interrupt sources
– HW (by external devices) – SW: INT n
• Exceptions (something unexpected or needing action happened)
– Program errors • faults: save address (CS, EIP) of faulting instruction, “fix” fault, restart
instruction – Case: page fault, divide by zero
• traps: save address of instruction – Case: debugger
• aborts: oops (no saving, not recoverable) – Case: your first attempts at OS kernel code
– SW generated: INT 3 – Machine-check exceptions
• See Intel doc Vol. 3 for details
Interrupts and Exceptions
Interrupts and Exceptions
Privileged Instruction Examples
• Memory address mapping • Data cache flush and invalidation • Invalidating TLB entries • Loading and reading system registers • Changing processor mode from kernel to user • Changing the voltage and frequency of the processor • Halting a processor • Reset a processor • I/O operations
IA32 Protection Rings
No worries, we will use level 0 and 3
System Calls• Operating System API
– Interface between a process and OS kernel – Seen as a set of library functions
• Process management – end, abort , load, execute, create, terminate, set, wait
• Memory management • mmap & munmap, mprotect, mremap, msync, swapon & off,
• File management • create, delete, open, close, R, W, seek
• Device management • res, rel, R, W, seek, get & set atrib., mount, unmount
• Communication • get ID’s, open, close, send, receive
System Call Mechanism• User code can be arbitrary • User code cannot modify kernel
memory • Makes a system call with parameters • The call mechanism switches code to
kernel mode • Execute system call • Return with results Kernel in
protected memory
entry
User program
User program
call
return
But HOW?
System Call Implementation
• Use an “interrupt” • Hardware devices (keyboard, serial port, timer, disk,…)
and software can request service using interrupts • The CPU is interrupted
– ...and a service handler routine is run • …when finished the CPU resumes from where it was
interrupted (or somewhere else determined by the OS kernel)
OS Kernel: Interrupt/Trap/Exception Handler
HW Device Interrupt
HW exceptions
SW exceptions
System Call
Virtual address exceptions
HW enforces this boundary, but we must use it correctly
System Service dispatcher System
services
Interrupt service routines
Exception dispatcher
Exception handlers
VM manager’s pager
Sys_call_table
User Level code running Interrupts ON
Kernel Level code running Interrupts OFF (simple)
This is inside the Kernel
Entry point of Int Handler...
...
Passing Parameters
• Pass by registers • #registers • #usable registers • #parameters in syscall
• Pass by memory vector – A register holds the address of
a location in users memory • Pass by stack
– Push: done by library – Pop: done by Kernel
frame
frame
Top
REMEMBER: Kernel has access to callers address space, but not vice versa
The Stack•Many stacks possible, but only one is “current”: the one in the segment referenced by the SS register
•Max size 4 gigabytes
•PUSH: write (--ESP);
•POP: read(ESP++);
•When setting up a stack remember to align the stack pointer on 16 bit word or 32 bit double-word boundaries
Library Stubs for System Calls
• User Level Library int read( int fd, char * buf, int size) { move READ-id to R0
move fd, buf, size to R1, R2, R3
int $0x80 load result code from Rresult
}
User stack
Registers
User memory
Kernel stack
Registers
Kernel memory
Returns here when work is done by kernel
Could be an error code
32-255 available to user
Win NT: 2E
Linux: 80
HW takes over and IP is set to OS Kernel
• User Level Process calls: read( fd, buf, size);
System Call Entry Point in Kernel
User stack
Registers
User memory
Kernel stack
Registers
Kernel memory
• Assume passing parameters in registers
EntryPoint inside OS Kernel: switch to kernel stack; save user context; if legal(R0) call service;
restore user context; switch to user stack; iret;
int 0x80
SW interrupt
Puts results into buf Or: User stack
Or: some register
Change to user mode and return
Kernel Mode: Total control. All interrupts are disabled
NB, black frame means KL...
A simplified Interrupt Handler
System Call Entry Point• Assume passing parameters in
registers (EntryPoint:) switch to kernel stack; save all registers; if legal(R0) call sys_call_table[R0]; restore user registers; switch to user stack; iret; %next instr in user space app
int 0x80
SW interrupt
Save/Restore Context?
If envoked code executes for a long time: should SCHEDULE or at least ENABLE interrupts (here or inside service routine).
READ returns with result and handler must return them to user(Or SCHEDULE to run another process)
Polling instead of Interrupt?
• OS kernel could check a request queue of syscalls instead of using an interrupt?
• Waste CPU cycles checking • All have to wait while the checks are being done • When to check?
– Non-predictable – Pulse every 10-100ms?
» too long time
• Same valid for HW Interrupts vs. Polling • However, spinning can give good performance (more
later)
But used for Servers
Design Issues for Syscall• We used only one result reg, what if more results? • In kernel and in called service: Use caller’s stack or a
special stack? – Use a special stack
• Quality assurance – Use a single entry or multiple entries?
• Simple is good? – Then a single entry is simpler, easier to make robust
• Can kernel code call system calls? – Yes, but should avoid the entry point mechanism
System calls vs. Library calls
• Division of labor (a.k.a. Separation of Concerns) • Memory management example
– Kernel • Allocates “pages” (w/HW protection) • Allocates many “pages” (a big chunk) to library
– Big chunks, no “small” allocations
– Library • Provides malloc/free for allocation and deallocation of memory • Application use malloc/free to manage its own memory at fine
granularity • When no more memory, library asks kernel for a new chunk of pages
User process vs. kernel
• To go from User Level Process to Kernel – syscalls using int
• To go from Kernel to User Level Process – iret (with “good” stack) – Kernel is all powerful
• Can write into user memory • Can terminate, block and activate user processes