Innovations in teaching OS concepts using native NT
Arkady Retik
Program Manager
Source Asset
Management
Dave Probert
Architect
Windows Core
Kernel
Microsoft Corporation
Integrate Windows internals into Operating Systems courses
Give students more real-world illustrations of the principles being taught
Achieve a better concept-to-effort ratio for OS projects
Include examples from the Windows kernel source code
Windows Academic Shared Source Program
Agenda
Program Overview
Windows OS Internals Curriculum Resource Kit
ProjectOZ
Windows Research Kernel
Q&A
Applications Services
GUI
Middleware
WEB LOB
Win32 WinFX POSIX
System Services
Net Interfaces, Protocol Stacks
Devices, File Systems
System Runtime Libraries
System Call Interface
Processes Threads Virtual Memory Security
I/O mgr Data cache Registry InterProcess
Object mgr Scheduler Interrupts Synchr
SOURCE
Lecture Materials, Textbooks
ProjectOZ
Working below ground in Windows
We believe Microsoft technologies are important to Computer Science education
• ubiquitous, empowering, scalable, innovative, customer-driven features
We know Computer Science Education is important to Microsoft
• a source of the human and intellectual resources that drive our industry
• quality of education determines technical capabilities of our customers, our partners, our employees
Partnership with Higher Education
CRK WRK
ProjectOZ
Windows Academic Shared Source Program
Windows Operating Systems Internals Curriculum Resource Kit (CRK) - presentation slides, experiments, labs, quizzes and assignments for introducing case studies from the Windows kernel into operating system courses. Available now
ProjectOZ - an operating systems project environment that uses the native kernel interfaces of Windows to provide simple, clean, user-mode abstractions of the CPU, MMU, trap mechanism, and physical memory that can be used to perform experiments in operating systems principles. Pilots this year
Windows Research Kernel – the core kernel sources and binaries integrated with an environment for building and testing experimental versions of the Windows kernel for use in teaching and research. Available soon
CRK
Andreas Polze is the Operating Systems and Middleware Professor at the Hasso-Plattner-Institute for Software Engineering at University Potsdam, Germany. He received a doctoral degree from Freie University Berlin, Germany, in 1994 and a habilitation degree from Humboldt University Berlin in 2001, both in computer science. His habilitation thesis investigates Predictable Computing in Multicomputer-Systems. Current research interests include Interconnecting Middleware and Embedded Systems, Mobility and Adaptive System Configuration, and End-to-End Service Availability for standard middleware platforms. At University Potsdam, his current teaching activities focus on architecture of operating systems, on component-based middleware, as well as on predictable distributed computing. Our curriculum includes lectures that discuss operating system issues based on standard platforms (Windows 2000/XP, Mac OS X (BSD Unix), and Solaris) as well as on embedded systems (Windows CE, Embedded Linux). Prof. Polze was a visiting scientist with the Dynamic Systems Unit at Software Engineering Institute, at Carnegie Mellon University, Pittsburgh, USA, where he worked on real-time computing on standard middleware (CORBA), and with the Real-Time Systems Laboratory at University of Illinois, Urbana-Champaign.
Mark Russinovich is chief software architect and cofounder of Winternals Software (www.winternals.com), a company that specializes in advanced systems software for Microsoft Windows. Mark is co-author of Inside Windows 2000, 3rd Edition (Microsoft Press) with David Solomon and its successor, Windows Internals, 4th Edition (Microsoft Press). Mark is a Microsoft Most Valuable Professional (MVP) and serves as senior contributing editor for Windows IT Pro magazine where he contributes to the Windows Power Tools column. He is also a frequent speaker at major industry conferences such as Microsoft Tech Ed, IT Forum, Windows IT Pro Magazine's Connections and Redmond Magazine's TechMentor. Mark has a B.S. from Carnegie Mellon University and an M.S. from Rensselaer Polytechnic Institute, both in computer engineering. In 1994, he earned a Ph.D. from Carnegie Mellon University, also in computer engineering.
David Solomon (www.solsem.com) teaches classes on Windows kernel internals to developers and IT professionals at companies worldwide, including Microsoft. He is the co-author of Windows Internals, 4th edition, the official Microsoft Press book on Windows kernel internals, as well as the previous edition, Inside Windows 2000. David also wrote Inside Windows NT, 2nd edition, and Windows NT for OpenVMS Professionals. He also co-created the Windows Internals COMPLETE video series which Microsoft licensed for worldwide internal training. David has served as technical chair for three past Windows NT conferences and has spoken at many TechEds and PDCs. He was a recipient of the 1993 & 2005 Microsoft Support Most Valuable Professional (MVP) award.
CRK Authors
industry academia
What about CRK content?
• cover all OS BOK units and more (based on Windows XP/Server 2003)
• scalable to multiple levels
• modular (can be used in whole / in part)
• case studies / compare & contrast
Basic module provides materials to incorporate into a complete basic level OS course of one semester in length. The module covers the Windows OS specific topics in the core and elective units of the OS BOK of Computing Curricula 2001.
Advanced module provides materials to incorporate into an advanced level OS course of one semester in length. The module covers the Windows OS specific topics in the core and elective units of the “CC2001” OS BOK as well as three supplementary units.
Which OS topics does the CRK cover?
a. Core topics
OS1. Overview of operating systems
OS2. Operating system principles
OS3. Concurrency
OS4. Scheduling and dispatch
OS5. Memory management
b. Elective topics
OS6. Device management
OS7. Security and protection
OS8. File systems
OS9. Real-time and embedded systems
OS10. Fault tolerance
OS11. System performance evaluation & troubleshooting
OS12. Scripting
c. Supplementary topics
13. Windows networking
14. Comparing the Linux and Windows Kernels
15. Windows – Unix Interoperability
Note: Labs and Exercises to reinforce the topics
Available now @ http://www.msdnaa.net/curriculum
Anything we missed?
Available now!
ProjectOZ
ProjectOZ Background
Collaboration with MSR University Relations, Windows Kernel & Architecture Team, and Source Asset Team
Goal is to provide better support for OS instruction and research using Windows
Part of a larger program:
• Windows Research Kernel
• Curriculum Resource Kit
• Textbooks and other resources
Based on observations from SPACE research project at UC Santa Barbara (Probert & Bruno)
Provide an alternative to Nachos
Alpha version of ProjectOZ implemented by Paul Turner, a summer intern from University of Waterloo
OS model of processor
OS can only control:
MMU (memory management unit)
trap vector
scheduling of external interrupts
when it does an RETI (Return from Interrupt)
OS only regains control through trap/interrupt
[Diagram: CPU, MMU, MEMORY, TRAP handler, RETI, External interrupts]
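The control-transfer model on this slide can be illustrated with a toy interpreter. This is only a sketch under stated assumptions: the names `run`, `handler`, and `timer_period`, and the 'work'/'trap' instruction set, are invented for illustration and are not part of ProjectOZ or SPACE.

```python
def run(program, trap_handler, timer_period):
    """Run `program` (a list of 'work'/'trap' ops) on a toy CPU.

    The OS code (`trap_handler`) runs only on a trap or a timer interrupt.
    Returns the (cause, pc) events at which the OS gained control.
    """
    if timer_period < 2:
        raise ValueError("timer_period must be at least 2")
    os_entries = []
    pc = 0
    ticks = 0
    while pc < len(program):
        ticks += 1
        if program[pc] == "trap":
            os_entries.append(("trap", pc))
            pc = trap_handler("trap", pc)        # handler picks the PC for RETI
        elif ticks % timer_period == 0:
            os_entries.append(("interrupt", pc))
            pc = trap_handler("interrupt", pc)
        else:
            pc += 1                              # ordinary instruction, no OS involvement
    return os_entries


def handler(cause, pc):
    # After a trap, return past the trapping instruction; after an interrupt,
    # return to the interrupted instruction so it can execute.
    return pc + 1 if cause == "trap" else pc
```

Between the recorded events the handler never runs, which is the point of the slide: the OS shapes execution only through what it sets up before returning.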
SPACE
Systems Programming using Address-spaces and Capabilities for Extensibility
– a reaction to distributed-shared virtual memory research
Key observation: extending core OS functionality difficult because existing kernel abstractions get in the way (i.e. threads, processes, inter-process communication)
SPACE uses lower-level abstractions: control flow, address spaces/domains, portals
– represent hardware abstractions, i.e. CPU, MMU, trap-vectors
– then threads, processes, IPC built on top
Monolithic kernel is not necessary => fundamental extensibility
Kernel Abstractions
[Diagram: two Processes, each containing threads and a pagetable, above the kernel, which runs on three CPUs, each with its own MMU]
SPACE Abstractions
Space: a mapping of addresses from logical to physical
Domain: permission bit-vector on each address mapping in a Space
– Each bit-vector indexed by the current protection-mode
– (Space, mode) → Domain
Portal: entry-point in a Domain
– (currDomain, trap/interrupt) → (newDomain, newPC)
– Each portal traversal saves state and associates a token
– SPACE implementation maintains stack of tokens corresponding to nested traversals of portals on a particular CPU
– Resume reverses portal-traversal to state at top of token stack
Two portal operations
– Suspend:
  • Save state token at top of current token stack
  • Create empty token stack, to be used at next portal traversal
  • Pass handle on token for previous stack to routine at newPC in newDomain
– Unsuspend(token) operation:
  • Takes handle to a previous token stack
  • Discards current token stack (if any)
  • Resumes token from top of previous stack
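One possible reading of the portal, token-stack, suspend, and unsuspend rules above, as a minimal sketch. The class `CPU`, its method names, and the modelling of suspend as a portal traversal that sets the whole token stack aside are assumptions made for illustration, not the SPACE implementation.

```python
class CPU:
    """Toy model of one CPU's portal traversals and token stacks."""

    def __init__(self, portals, domain, pc=0):
        self.portals = portals        # {(domain, event): (new_domain, new_pc)}
        self.domain, self.pc = domain, pc
        self.tokens = []              # current token stack of saved (domain, pc)
        self.suspended = {}           # handle -> token stack set aside by suspend
        self._next_handle = 0

    def traverse(self, event):
        """Portal traversal: save state as a token, enter the new domain."""
        new_domain, new_pc = self.portals[(self.domain, event)]
        self.tokens.append((self.domain, self.pc))
        self.domain, self.pc = new_domain, new_pc

    def resume(self):
        """Reverse the most recent traversal on the current token stack."""
        self.domain, self.pc = self.tokens.pop()

    def suspend(self, event):
        """Traverse a portal, set the old stack aside, and return its handle."""
        new_domain, new_pc = self.portals[(self.domain, event)]
        self.tokens.append((self.domain, self.pc))
        handle = self._next_handle
        self._next_handle += 1
        self.suspended[handle] = self.tokens
        self.tokens = []              # empty stack for the next traversal
        self.domain, self.pc = new_domain, new_pc
        return handle

    def unsuspend(self, handle):
        """Discard the current stack and resume from the top of a previous one."""
        self.tokens = self.suspended.pop(handle)
        self.resume()
```

For example, traversing from domain a to b to c, suspending into d, and then unsuspending returns to c with a and b still on the token stack, so a later resume returns to b.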
Kernels out of spaces & domains
kernel-mode domain 0
user-mode domain 1
kernel-mode domain 0
user-mode domain 1
kernel-mode domain 1
user-mode domain 1
space 0 space 1 space 2
Kernel-mode memory mappings (mostly) shared in all spaces
spaces used to build processes
Following the CPU
[Diagram: control flow on CPU 0 across Domains a to f. Labels: suspend, token stacks "a b c" and "f e d", tokens T0 and T1, unsuspend T0, resume, unsuspend T1]
Redrawing the picture
[Diagram: the same control flow on CPU 0 across Domains a to f (suspend, token stacks "a b c" and "f e d", tokens T0 and T1, unsuspend T0, resume, unsuspend T1), redrawn alongside a SCHEDULER with the events sleep1, wakeup1, start2, wakeup2, sleep2]
Building SPACE on top of NT
Spaces – use NT Processes
Domains – use a Space for each domain, but – other than the page permissions, the logical-to-‘physical’ mappings are identical for domains in the same space
Physical memory – creates an NT section, and selectively creates single ‘page’ views onto the section from each Space/Domain (64K page size)
CPUs – each domain has an NT thread corresponding to each logical CPU configured -- with only one thread per CPU runnable at a time
Space implementation – space.exe, controls the simulation, provides the space primitives such as portal traversal, implementing CPUs and MMUs
Building SPACE on top of NT
Exceptions – space.exe establishes an exception port for each domain, which it uses to detect exceptions (e.g. pagefaults) and implement portal traversal.
Traps – programmatic traps in a domain are forwarded to space.exe for portal traversal using either NT LPC or the exception mechanism
Interrupts – space.exe interrupts CPUs by suspending the running NT thread, and doing get/set thread context
MMU – simulated by space.exe by modifying the views each domain has for the ‘physical memory’ section (using NT memory management APIs)
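The idea of simulating an MMU by giving each domain its own views onto one shared 'physical memory' object can be sketched in a few lines. This sketch does not call any NT API: a `bytearray` stands in for the NT section, and the `Domain` class, its methods, and `PageFault` are hypothetical names chosen for illustration.

```python
PAGE_SIZE = 64 * 1024  # the 64K page size mentioned on the earlier slide


class PageFault(Exception):
    """Raised where the real system would deliver an exception to space.exe."""


class Domain:
    def __init__(self, physical):
        self.physical = physical   # shared bytearray standing in for the section
        self.views = {}            # virtual page -> (physical page, writable)

    def map(self, vpage, ppage, writable):
        self.views[vpage] = (ppage, writable)

    def unmap(self, vpage):
        self.views.pop(vpage, None)

    def _translate(self, vaddr, write):
        vpage, offset = divmod(vaddr, PAGE_SIZE)
        if vpage not in self.views:
            raise PageFault(vaddr)             # no view: unmapped page
        ppage, writable = self.views[vpage]
        if write and not writable:
            raise PageFault(vaddr)             # view exists but is read-only
        return ppage * PAGE_SIZE + offset

    def load(self, vaddr):
        return self.physical[self._translate(vaddr, write=False)]

    def store(self, vaddr, value):
        self.physical[self._translate(vaddr, write=True)] = value
```

Two domains of the same space would share identical mappings and differ only in the `writable` flag, mirroring the slide's remark that only the page permissions differ.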
SPACE Multi-computer
[Diagram: three space.exe instances, each hosting three NT processes ("NT Proc"), connected by a network simulator]
Teaching Objectives
SPACE Mission:
• An exciting, innovative, productive environment for OS instruction & research
Goals:
• Use SPACE to abstract hardware
• Let students focus on OS data structures and algorithms
• Provide a non-simulated environment for normal execution
• Build models for I/O devices, timers, DMA
• Support both project-level and lab-level experiments
• Provide an experimental apparatus for exploring the OS literature
Approach to OS experiments
Provide the BasicOZ environment
• SPACE core implementing SPACE abstractions
• Small vanilla OS implementation on top of SPACE
• System described by XML configuration file
• Development/measurement environment
• Tools for tracing/profiling/analyzing
• Workload/test library
• Access to native NT APIs (?)
Student experiments improve on BasicOZ
Experiments selected to complement lectures
Some experiments progressive, others independent
Approach to OS experiments
Multiple types of experiments can be assigned
• Lab-level experiments to implement different algorithms, make small extensions, explore performance
• Medium-level projects that do major work on a particular subsystem
• Competitive projects where different groups implement different algorithms and compare resulting performance
• Literature-based projects, where students implement algorithms/solutions from published papers
• Investigations into novel algorithms and new solutions (open-ended)
BasicOZ Environment
System calls
• implementation of basic system calls, using dynamic allocation of stacks in 'kernel'
• token-chains provide trapframes for returning to user-mode
User-mode Threads
• no preemption, no guard pages on stack
System devices
• timer, clock, console, disk simulator, network simulator (with fault-injection)
BasicOZ Environment
Input/output
• I/O device simulation framework
  – DMA, interrupts
  – simple device register operations
  – simulation of IRQLs
  – simulation of real device properties
Filesystem
• trivial file system
  – one directory
  – assumes infinite storage, contiguous allocation
  – no delete or other namespace operations, no file extension
  – populated as part of system specification
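A file system with the restrictions listed above fits in a few lines, which is what makes it a convenient starting point for student projects. The sketch below is an illustration only: `TrivialFS`, its method names, and the 512-byte block size are assumptions, not BasicOZ code.

```python
class TrivialFS:
    """One directory, contiguous allocation, no delete, no file extension."""

    def __init__(self, block_size=512):
        self.block_size = block_size
        self.directory = {}   # name -> (first block, length in bytes)
        self.blocks = []      # grows without limit: 'infinite storage'

    def create(self, name, data):
        """Write a whole file once, into the next run of contiguous blocks."""
        if name in self.directory:
            raise FileExistsError(name)
        first = len(self.blocks)
        for i in range(0, len(data), self.block_size):
            chunk = data[i:i + self.block_size]
            self.blocks.append(chunk.ljust(self.block_size, b"\0"))
        self.directory[name] = (first, len(data))

    def read(self, name):
        first, length = self.directory[name]
        count = -(-length // self.block_size)   # ceiling division
        return b"".join(self.blocks[first:first + count])[:length]
```

Because files can never grow or be deleted, the allocator is just "append at the end"; most of the file-system project ideas later in the deck amount to removing one of these restrictions.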
BasicOZ Environment
Processes
• single thread
• static executable images (no libraries or relocation)
• simple create/loadimage model (not fork/exec)
• simple virtual address management with linear freelist
Virtual Memory
• no shared virtual memory
• simple pagefile management
• pagefault handler always goes to disk
• management of physical memory with linear freelist
• artificial forcing of low-memory
• random page replacement, blocking on page writes
• fetch-from-previous-space for kernel implementation
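The combination of a linear freelist and random page replacement can be sketched as follows. The `FrameAllocator` class and its interface are hypothetical, and the pagefile write that would accompany an eviction is only noted in a comment.

```python
import random


class FrameAllocator:
    """Physical frames managed by a linear freelist, with random replacement."""

    def __init__(self, n_frames, seed=None):
        self.free = list(range(n_frames))   # linear freelist of frame numbers
        self.resident = {}                  # frame -> (space, virtual page)
        self.rng = random.Random(seed)

    def allocate(self, space, vpage):
        """Return (frame, evicted), where evicted is the displaced page or None."""
        evicted = None
        if self.free:
            frame = self.free.pop(0)
        else:
            # Random replacement: pick any resident frame as the victim.
            frame = self.rng.choice(sorted(self.resident))
            evicted = self.resident[frame]  # a kernel would write it to the pagefile
        self.resident[frame] = (space, vpage)
        return frame, evicted

    def release(self, frame):
        del self.resident[frame]
        self.free.append(frame)
```

A student project that "improves algorithms for managing physical memory" could replace the `rng.choice` line with LRU or clock and compare fault counts on the same workload.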
BasicOZ Environment
Boot loader
• load kernel configuration and images
Image library
• load the segments of an executable image into an address space
• access symbols, relocation information, headers, import/export tables, profiling support, stacktrace support, disassembly
Build environments
• environment for producing the 'kernel' (server)
• environment for building test programs (client)
BasicOZ Environment
Debug, test, instrumentation
• execution statistics and timing
• profiling information
• tracing (flight-data recorder)
Tests & Workloads
• library of individual applets, applications, and entire workloads for test/evaluation/demonstration, e.g.
  – multi-process, multi-thread, multi-computer loads
  – demonstrate synchronization, priority inversion, scheduling characteristics
  – IPC, shared-memory
  – asynchronous I/O
  – client/server applications
  – etc, etc, etc
Project Areas: multi-threading
Multi-threading and synchronization primitives
• use the timer to make user-mode threads preemptive
• implement a pluggable scheduler, with several different scheduling algorithms (including priority-based)
• demonstrate race conditions, including priority inversion
• implement basic kernel-mode blocking synchronization primitives, like semaphores and reader/writer locks
• user synchronization primitives to eliminate race conditions
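As a rough illustration of what "pluggable scheduler" could mean in such a project, the sketch below drives two interchangeable policies through the same timer-quantum loop. All names (`RoundRobin`, `PriorityScheduler`, `run`) and the `add`/`pick` interface are assumptions made for this example.

```python
import heapq
from collections import deque


class RoundRobin:
    def __init__(self):
        self.queue = deque()

    def add(self, thread, priority=0):
        self.queue.append(thread)            # priority is ignored by this policy

    def pick(self):
        return self.queue.popleft() if self.queue else None


class PriorityScheduler:
    def __init__(self):
        self.ready = []
        self.seq = 0                         # tie-breaker keeps FIFO order per priority

    def add(self, thread, priority=0):
        heapq.heappush(self.ready, (-priority, self.seq, thread))
        self.seq += 1

    def pick(self):
        return heapq.heappop(self.ready)[2] if self.ready else None


def run(scheduler, threads, quantum):
    """threads: {name: (priority, ticks of work)}. Returns [(name, ticks run)]."""
    remaining = {}
    for name, (priority, ticks) in threads.items():
        remaining[name] = ticks
        scheduler.add(name, priority)
    trace = []
    while (name := scheduler.pick()) is not None:
        ran = min(quantum, remaining[name])  # timer preempts after one quantum
        remaining[name] -= ran
        trace.append((name, ran))
        if remaining[name] > 0:
            scheduler.add(name, threads[name][0])
    return trace
```

Running the same workload under both policies makes the difference visible in the trace, which is the kind of comparison the competitive projects described earlier would ask for.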
Project Areas: handles
Implement handles and file table
• provide a user-mode mechanism for referencing kernel-mode objects
• implement a way of referring to open files in the trivial file system
• implement open/read/write/close on the file system
• experiment with ways of detecting bad closes and test with poorly synchronized multi-processor workload
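One way to detect bad closes is to pair each table index with a generation counter, so that a stale handle no longer matches its slot after the slot is reused. The sketch below shows that technique; it is one option among several, and `HandleTable` with its methods is a hypothetical design, not the ProjectOZ one.

```python
class HandleTable:
    """Maps opaque (index, generation) handles to kernel-side objects."""

    def __init__(self):
        self.slots = []   # index -> [generation, object or None]
        self.free = []    # indices available for reuse

    def open(self, obj):
        if self.free:
            index = self.free.pop()
            self.slots[index][1] = obj
        else:
            index = len(self.slots)
            self.slots.append([0, obj])
        return (index, self.slots[index][0])

    def lookup(self, handle):
        index, generation = handle
        if not 0 <= index < len(self.slots):
            raise KeyError("invalid handle")
        slot = self.slots[index]
        if slot[0] != generation or slot[1] is None:
            raise KeyError("stale or closed handle")
        return slot[1]

    def close(self, handle):
        self.lookup(handle)                  # a double close fails here
        index, _ = handle
        self.slots[index][0] += 1            # invalidate outstanding copies
        self.slots[index][1] = None
        self.free.append(index)
```

User-mode code only ever sees the handle tuple, never the object, which is the property the first bullet asks for.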
Project Areas: virtual memory
Virtual memory
• improve algorithms for managing
  – physical memory
  – pagefile space
  – virtual addresses
• implement shared memory between processes
• implement distributed-shared-virtual-memory across a SPACE multi-computer
Project Areas: processes
Process management
• create/destroy processes
  – using fork
  – using other algorithms
• build a capability-based sandbox
• build a process pool for isolating hosted code
Project Areas: I/O drivers
I/O driver
• implement IRQL-based protection of data structures
• write a traditional top-half/bottom-half I/O driver for a simple simulated device
• add DMA
• implement asynchronous completion of I/O
Project Areas: IPC
Inter-Process (i.e. cross-domain) Communication
• simple reader/writer synchronization
• basic message-based IPC between processes
  – copy-based
  – shared-memory
• named IPC ports
• named pipes
• mailboxes
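A minimal sketch of copy-based messaging through named ports, assuming a single shared namespace object. `PortNamespace` and its methods are invented for illustration; a real implementation would also need blocking receive and cross-domain protection.

```python
import copy
from collections import deque


class PortNamespace:
    """Named message ports with copy-based delivery."""

    def __init__(self):
        self.ports = {}   # port name -> queue of pending messages

    def create(self, name):
        if name in self.ports:
            raise KeyError(f"port {name!r} already exists")
        self.ports[name] = deque()

    def send(self, name, message):
        # Copy-based IPC: the receiver gets its own copy, so later changes
        # by the sender are not visible (unlike shared-memory IPC).
        self.ports[name].append(copy.deepcopy(message))

    def receive(self, name):
        queue = self.ports[name]
        return queue.popleft() if queue else None
```

The shared-memory variant on the slide would pass a reference to a common buffer instead of copying, trading isolation for speed.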
Project Areas: objects
Build simple kernel-level object model
• cross-domain invocation of object methods, with simple marshalling
• build a name server
• recover from cross-domain failures
• persist objects across reboots
Project Areas: file system
File system (and volumes)
• build a more complex file system (on the simulated disk -- or a USB thumbdrive)
• implement block management, directory hierarchies
• build a log-based file system
• implement namespace operations (like rename, link/unlink) and test for race conditions
• implement a cache (either blocks or files)
• implement memory mapped files
• implement get/put file protocols (incl memory mapping)
• build a RAID layer below the file system, evaluating robustness and performance
Project Areas: security
Investigate security features
• give processes identities
• add ACLs to files/objects
• demonstrate buffer-overflow
• implement ‘applications’
• implement client/server impersonation
• implement client/server capability mechanism
Project Areas: signals/exceptions
signals and exceptions
• deliver signals to threads
• test for race conditions
• use signals for delivery of asynchronous events (like I/O completion)
• exception notification using signals
• exception notification using unwinding
Project Areas: networking
networking
• using the SPACE multicomputer, build a simple network stack
• implement sockets
• packetize streams and send between computers
• Use network unreliability feature in simulation
  – implement reliable streams
  – explore techniques to minimize network latencies
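To give a sense of the "reliable streams over an unreliable network" exercise, here is a stop-and-wait sketch with sequence numbers and retransmission. It is a swapped-in textbook technique rather than anything taken from ProjectOZ; the function `send_reliably`, its parameters, and the random loss model are assumptions.

```python
import random


def send_reliably(messages, loss_rate, seed=0, max_tries=50):
    """Deliver `messages` in order over a link that drops packets and acks.

    Returns (delivered, transmissions): the receiver's in-order output and
    the number of data packets the sender had to transmit.
    """
    rng = random.Random(seed)
    delivered = []
    expected = 0                        # receiver's next expected sequence number
    transmissions = 0
    for seq, payload in enumerate(messages):
        for _ in range(max_tries):
            transmissions += 1
            if rng.random() < loss_rate:
                continue                # data packet lost: sender times out, resends
            if seq == expected:         # receiver accepts only the in-order packet
                delivered.append(payload)
                expected += 1
            # The receiver acks `seq` either way; a duplicate is acked, not redelivered.
            if rng.random() < loss_rate:
                continue                # ack lost: sender resends the same packet
            break
        else:
            raise TimeoutError(f"gave up on packet {seq}")
    return delivered, transmissions
```

The transmission count gives a simple cost measure, so students could compare this baseline against a sliding-window scheme when exploring the latency bullet.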
Project Areas: basic debugging
implement a basic debugger
• run/stop/step
• examine/modify memory
• disassemble
• set breakpoints
2005/2006 academic plans
• Initial version nearing completion (thanks Paul!)
• Start building community
• Pilot projects in China
  – building on the Chinese OS principles textbook by faculty at Peking, Tsinghua, and Behai
  – considering a follow-on project book
• Will use in short-courses in Japan this year
• Talking with some U.S. schools about special topics courses this year
• Working with faculty on proposal for internals book using ProjectOZ as basis for experiments
• Lot of interest in Europe
That’s as far as our travels have taken us so far
Windows Research Kernel
WRK Goals
• Make it easier for faculty and students to compare & contrast Windows to other operating systems
• Students can study source, and modify and build projects
• Better support for research & publication based on Windows internals
• Encourage more OS textbook and university-oriented internals books on Windows kernel
• Simplified licensing
NTOS Kernel Sources
Based on Windows XP/SP2 and Windows x64 NTOS
• Processes, threads, LPC, VM, scheduler, object manager, I/O manager, synchronization, worker threads, kernel memory manager, …
  – most everything in NTOS except plug-and-play, power-management, and specialized code such as the driver verifier, splash screen, branding, timebomb, etc.
  – non-kernel kernel-mode code (drivers, file systems, networking) is from the DDK and IFSKIT
• Simplified in a few places, cleaned up comments, improved spelling
• Non-source is encapsulated in a binary library
Build and set up utilities and tools
Tools for tracing, performance monitoring, logging, debugging, etc
Packaged with
– DDK subset and documentation for working with drivers
– File system sources from IFSKIT
– VirtualPC product
– Kernel regression tests
– Documentation for Native NT API
Something over 500K lines of source
WRK licensing
Improvements over current MSR UR license:
– Faculty feel comfortable agreeing to its conditions
– Students can use in classroom environment
License type:
– Non commercial, academic use only; allow derivative works for non-commercial purpose
Eligibility criteria:
– Available to faculty and students in colleges/universities worldwide
Usage scenarios:
– View, copy, reproduce, distribute within the institution
– Modify for teaching and experimentation purposes
– Produce teaching and research publications including relevant snippets of source
• Can use in textbooks and academic publications, and community forums
• Have to perpetuate MS copyright notices
– Share derivatives within academic community
Status CRK:
Core & security topics are available now
Elective & Supplementary topics will be available by end of 2005
ProjectOZ and WRK – we will be looking for participants in pilots and trials AY05/06
If you are interested - contact us at
More information on this and related topics
Shared Source: http://www.microsoft.com/resources/sharedsource
Curriculum Repository on MSDNAA: http://www.msdnaa.net/curriculum