CS194-3/CS16x Introduction to Systems
Lecture 1
What is a “System”?
August 27, 2007
Prof. Anthony D. Joseph
http://www.cs.berkeley.edu/~adj/cs16x
Lec 1.2 8/27/07 Joseph CS194-3/16x ©UCB Fall 2007
Who am I?
Professor Anthony D. Joseph
• 465 Soda Hall (RAD Lab)
• adj AT cs.berkeley.edu
• Office hours Mon/Tue 1-2pm in 413 Soda
• Background:
– MIT undergrad and grad student
• Research areas:
– Current: Network security, OS security, very large security testbeds
– Other: Mobile computing, wireless networking, cellular telephony
Goals for Today
• Motivation for a new course
• Topics:
– Operating systems, Databases, Networking, Security, Software engineering, Distributed systems
• Complexity
Interactive is important!
Ask Questions!
Note: Some slides and/or pictures in the following are adapted from slides ©2005 Silberschatz, Galvin, and Gagne. Slides courtesy of Kubiatowicz, AJ Shankar, George Necula, Alex Aiken, Eric Brewer, Ras Bodik, Ion Stoica, Doug Tygar, and David Wagner.
Why Change CS 162?
• Only minor changes since early 1990’s…
– Slides!
– Java version of Nachos
– Content: More crypto/security, less databases and distributed filesystems
– Time to update again!!
• Most CS students take CS 162 and 186
– But, not all take EE 122, CS 169/161
– We’d like all students to have a basic understanding of key concepts from these classes
• Each class introduces the same topics with class-specific biases
– Concurrency in an Operating System versus in a Database
– Introduce concepts with a common framework
Rapid Underlying Technology Change
• “Cramming More Components onto Integrated Circuits”
– Gordon Moore, Electronics, 1965
Computing Devices Everywhere
People-to-Computer Ratio Over Time
From David Culler
Increasing Software Complexity
From MIT’s 6.033 course
But, Latency Improves Slowly…
From MIT’s 6.033 course
Heat is a Major Problem!
From MIT’s 6.033 course
The Internet
The Dark Side of the Internet…
Zombie Networks
• Click on the link and you join the STORM zombie network of 250K-10M “0wned” PCs
• Zombies used by malicious hackers (crackers) for phishing, spamming, identity theft, extortion
• Crackers build zombie networks of 10K-1M compromised machines & sell services
– Ex: Take down competitor's website for $1K
• Hugely profitable!
– Massive spamming, ID fraud through phishing
– Roughly half of all spam is sent by zombies
• How can we secure our machines against folks like this?
Complexity
• How to manage complexity at all levels?
• Many issues and many tradeoffs
• Need a global view of systems
– Decompose into components
• Need a global understanding of systems
– Applications, networks, databases, operating systems, security, software engineering…
Course Administration
• Instructor: Anthony D. Joseph (adj AT cs.berkeley.edu)
– 465 Soda Hall
– Office Hours: M/Tu 1-2 in 413 Soda
• TAs: Kai Xia (dk7x@berkeley.edu)
• Website: http://www.cs.berkeley.edu/~adj/cs16x
• Reader and book: TBA
– Most likely: Silberschatz, Galvin, and Gagne, Operating Systems Concepts, 7th Ed., 2005
• Projects: First project will likely be Nachos-based
• Grading: TBA
Topic Coverage
• Managing complexity (abstractions, layering, modularity)
• Team programming, IDEs, documentation style
• OS, memory, database, and network security
• Kernel and address spaces, Address translation, Caching, TLBs, demand paging
• I/O Systems, File systems, directories, database buffer pools, tuple layouts, files of tuples
• Internet evolution, architectures, protocols, routing, P2P and overlay networks
• Concurrency, processes, threads, ACID
• Enforcing mutual exclusion, serializability, 2PL, logging, recovery, deadlock
• Viruses, worms, and botnets, DDoS
• Cryptographic algorithms: RSA, MD5, DES
• Simple authentication protocols, PKI
• Query (dataflow) operators, map-reduce
Creating Software Is Awesome
• It’s like art:
– There’s a vision, a realization, an aesthetic appeal, a sense of ownership and satisfaction
• It’s not like art:
– The end result is useful
» To you, and anyone else
• It’s immensely satisfying to do
– Your project is your baby
» It’ll keep you up at night, make you proud…
» But won’t disown you when it’s 14 (though you might disown it)
• Good software engineering can be learned
– But it is hard to teach
– Most people only learn through experience (making mistakes)
Group Project Simulates Industrial Environment
• Project teams have 4 or 5 members in same discussion section
– Must work in groups in “the real world”
• Communicate with colleagues (team members)
– Communication problems are natural
– What have you done?
– What answers do you need from others?
– You must document your work!!!
– Everyone must keep an on-line notebook
• Communicate with supervisor (TAs)
– How is the team’s plan?
– Short progress reports are required:
» What is the team’s game plan?
» What is each member’s responsibility?
Typical Lecture Format
• 1-Minute Review
• 20-Minute Lecture
• 5-Minute Administrative Matters
• 25-Minute Lecture
• 5-Minute Break (water, stretch)
• 25-Minute Lecture
• Instructor will come to class early & stay after to answer questions
[Figure: attention versus time over the lecture segments (20 min., break, 25 min., break, 25 min.), with “In Conclusion, ...” marked]
Academic Dishonesty Policy
• Copying all or part of another person's work, or using reference material not specifically allowed, are forms of cheating and will not be tolerated. A student involved in an incident of cheating will be notified by the instructor and the following policy will apply:
http://www.eecs.berkeley.edu/Policies/acad.dis.shtml
• The instructor may take actions such as:
– require repetition of the subject work,
– assign an F grade or a 'zero' grade to the subject work,
– for serious offenses, assign an F grade for the course.
• The instructor must inform the student and the Department Chair in writing of the incident, the action taken, if any, and the student's right to appeal to the Chair of the Department Grievance Committee or to the Director of the Office of Student Conduct.
• The Office of Student Conduct may choose to conduct a formal hearing on the incident and to assess a penalty for misconduct.
• The Department will recommend that students involved in a second incident of cheating be dismissed from the University.
Computer System Organization
• Computer-system operation
– One or more CPUs, device controllers connect through common bus providing access to shared memory
– Concurrent execution of CPUs and devices competing for memory cycles
Example: Some Mars Rover Requirements
• Serious hardware limitations/complexity:
– 20 MHz PowerPC processor, 128 MB of RAM
– Cameras, scientific instruments, batteries, solar panels, and locomotion equipment
– Many independent processes work together
• Can’t hit reset button very easily!
– Must reboot itself if necessary
– Always able to receive commands from Earth
• Individual programs must not interfere
– Suppose the MUT (Martian Universal Translator Module) is buggy
– Better not crash antenna positioning software!
• Further, all software may crash occasionally
– Automatic restart with diagnostics sent to Earth
– Periodic checkpoint of results saved?
• Certain functions time critical:
– Need to stop before hitting something
– Must track orbit of Earth for communication
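The "automatic restart" and "periodic checkpoint" requirements can be sketched in a few lines of Python. This is only an illustration, not how the rover software is actually built: the names `run_with_restart` and `flaky_survey`, the checkpoint dictionary layout, and the simulated fault are all assumptions made up for this example.

```python
import json


def run_with_restart(task, checkpoint, max_restarts=3):
    """Run task(checkpoint), restarting it after a crash.

    Returns (result, diagnostics); diagnostics holds one report per crash.
    """
    diagnostics = []
    for attempt in range(max_restarts + 1):
        try:
            return task(checkpoint), diagnostics
        except Exception as exc:
            # Stand-in for "diagnostics sent to Earth": record a report.
            diagnostics.append({
                "attempt": attempt,
                "error": repr(exc),
                "checkpoint": json.dumps(checkpoint, sort_keys=True),
            })
    raise RuntimeError(f"task still failing after {max_restarts} restarts")


def flaky_survey(checkpoint):
    """Hypothetical task: process 5 samples, crashing once at sample 3."""
    while checkpoint["next_sample"] < 5:
        i = checkpoint["next_sample"]
        if i == 3 and not checkpoint["crashed_once"]:
            checkpoint["crashed_once"] = True
            raise RuntimeError("simulated fault at sample 3")
        checkpoint["results"].append(i * i)
        # Checkpoint progress so a restart resumes instead of starting over.
        checkpoint["next_sample"] = i + 1
    return checkpoint["results"]


state = {"next_sample": 0, "results": [], "crashed_once": False}
results, reports = run_with_restart(flaky_survey, state)
print(results)       # work done before the crash is not repeated
print(len(reports))  # one diagnostic report for the single crash
```

Because progress lives in the checkpoint rather than in the task's local variables, the restarted task picks up at sample 3 instead of redoing samples 0 to 2.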
How do we tame complexity?
• Every piece of computer hardware different
– Different CPU
» Pentium, PowerPC, ColdFire, ARM, MIPS
– Different amounts of memory, disk, …
– Different types of devices
» Mice, Keyboards, Sensors, Cameras, Fingerprint readers
– Different networking environment
» Cable, DSL, Wireless, Firewalls,…
• Questions:
– Does the programmer need to write a single program that performs many independent activities?
– Does every program have to be altered for every piece of hardware?
– Does a faulty program crash everything?
– Does every program have access to all hardware?
OS Tool: Virtual Machine Abstraction
• Software Engineering Problem:
– Turn hardware/software quirks into what programmers want/need
– Optimize for convenience, utilization, security, reliability, etc…
• For any OS area (e.g. file systems, virtual memory, networking, scheduling):
– What’s the hardware interface? (physical reality)
– What’s the application interface? (nicer abstraction)
[Figure: layered stack, top to bottom: Application, Virtual Machine Interface, Operating System, Physical Machine Interface, Hardware]
Interfaces Provide Important Boundaries
• Why do interfaces look the way that they do?
– History, Functionality, Stupidity, Bugs, Management
– CS152 Machine interface
– CS160 Human interface
– EE122 Protocol stack
– CS169 Software engineering/management
• Should responsibilities be pushed across boundaries?
– RISC architectures, Graphical Pipeline Architectures
[Figure: the instruction set as the interface between software and hardware]
Virtual Machines
• Software emulation of an abstract machine
– Make it look like hardware has features you want
– Programs from one hardware & OS on another one
• Programming simplicity
– Each process thinks it has all memory/CPU time
– Each process thinks it owns all devices
– Different devices appear to have same interface
– Device interfaces more powerful than raw hardware
» Bitmapped display → windowing system
» Ethernet card → reliable, ordered networking (TCP/IP)
• Fault Isolation
– Processes unable to directly impact other processes
– Bugs cannot crash whole machine
• Protection and Portability
– Java interface safe and stable across many platforms
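The fault-isolation point can be seen with ordinary OS processes. The sketch below is an illustration under stated assumptions: `run_isolated` is a hypothetical helper written for this example, and it assumes the current Python interpreter can be launched again through `sys.executable`. A buggy program runs in its own process and dies, while the parent and a second child keep running.

```python
import subprocess
import sys


def run_isolated(source):
    """Run Python source text in a separate OS process; return its exit code."""
    completed = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,  # keep the child's traceback off our terminal
        text=True,
    )
    return completed.returncode


# The buggy child dies from an uncaught exception (division by zero).
buggy_exit = run_isolated("x = 1 / 0")

# The parent is unaffected and can go on to launch a healthy child.
healthy_exit = run_isolated("print('still running')")

print("buggy child exit code:", buggy_exit)
print("healthy child exit code:", healthy_exit)
```

A nonzero exit code is the only effect of the buggy child that the parent observes; the OS gives each process its own address space, so the failure cannot corrupt the parent's state.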
Four Components of a Computer System
Definition: An operating system implements a virtual machine that is (hopefully) easier and safer to program and use than the raw hardware.
Virtual Machines (cont’d): Layers of OSs
• Useful for OS development
– When OS crashes, restricted to one VM
– Can aid testing programs on other OSs
What does an Operating System do?
• Silberschatz and Galvin: “An OS is similar to a government”
– Begs the question: does a government do anything useful by itself?
• Coordinator and Traffic Cop:
– Manages all resources
– Settles conflicting requests for resources
– Prevents errors and improper use of the computer
• Facilitator:
– Provides facilities that everyone needs
– Standard Libraries, Windowing systems
– Make application programming easier, faster, less error-prone
• Some features reflect both tasks:
– E.g. File system is needed by everyone (Facilitator)
– But File system must be Protected (Traffic Cop)
OS Systems Principles
• OS as illusionist:
– Make hardware limitations go away
– Provide illusion of dedicated machine with infinite memory and infinite processors
• OS as government:
– Protect users from each other
– Allocate resources efficiently and fairly
• OS as complex system:
– Constant tension between simplicity and functionality or performance
• OS as history teacher
– Learn from past
– Adapt as hardware tradeoffs change
BREAK
Data Complexity
• Need to store information?
• Can put it in a file
• Too big or too complex?
– Use a database
• How big is the web?
– 400 million hosts
– 15-30 billion pages (http://www.pandia.com/sew/383-web-size.html)…
• With a billion users looking for information
What is a Database System Today?
More Complex Database Systems
So… What is a Database?
• We will be broad in our interpretation
• A Database:
– A very large, integrated collection of data.
• Typically models a real-world “enterprise”
– Entities (e.g., teams, games)
– Relationships (e.g. The A’s are playing in the World Series)
• Might surprise you how flexible this is
– Web search:
» Entities: words, documents
» Relationships: word in document, document links to document.
– P2P filesharing:
» Entities: words, filenames, hosts
» Relationships: word in filename, file available at host
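The web-search view (entities: words and documents; relationship: word in document) maps directly onto an inverted index. The sketch below is a toy illustration only; the three documents and the helper names `build_index` and `search` are made up for this example and are not taken from the lecture.

```python
from collections import defaultdict

# Hypothetical document collection: document id -> text.
documents = {
    "doc1": "operating systems manage hardware",
    "doc2": "database systems manage data",
    "doc3": "networks move data between hosts",
}


def build_index(docs):
    """Store the 'word in document' relationship as word -> set of doc ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, *words):
    """Return ids of the documents that contain every query word."""
    if not words:
        return set()
    matches = [index.get(word.lower(), set()) for word in words]
    return set.intersection(*matches)


index = build_index(documents)
print(sorted(search(index, "systems", "manage")))
print(sorted(search(index, "data")))
```

The same structure covers the P2P filesharing case by indexing filenames instead of document text and adding a second mapping from file to host.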
What is a Database Management System?
• A Database Management System (DBMS) is:
– A software system designed to store, manage, and facilitate access to databases.
• Typically this term is used narrowly
– Relational databases with transactions
» E.g. Oracle, DB2, SQL Server
– Mostly because they predate other large repositories
» Also because of technical richness
– When we say DBMS in this class we will usually follow this convention
» But keep an open mind about applying the ideas!
Is the WWW a DBMS?
• Fairly sophisticated search available
– Crawler indexes pages on the web
– Keyword-based search for pages
• But, currently
– data is mostly unstructured and untyped
– search only:
» can’t modify the data
» can’t get summaries, complex combinations of data
– few guarantees provided for freshness of data, consistency across data items, fault tolerance, …
– Web sites typically have a (relational) DBMS in the background to provide these functions.
• The picture is changing quickly
– Information Extraction to get structure from unstructured
– New standards e.g., XML, Semantic Web can help data modeling
“Search” versus Query
• What if you wanted to find out which actors donated to John Kerry’s presidential campaign?
• Try “actors donated to john kerry” in your favorite search engine.
• If it isn’t “published”, it can’t be searched!
A “Database Query” Approach
“Yahoo Actors” JOIN “FECInfo”
(Courtesy of the Telegraph research group @Berkeley)
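The join of an actor list with donation records can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration, not the Telegraph group's system: the table names, column names, and every row are placeholders invented for the example, and no real donation data is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-ins for the "Yahoo Actors" and "FECInfo" sources.
    CREATE TABLE actors (name TEXT);
    CREATE TABLE donations (donor TEXT, recipient TEXT, amount INTEGER);

    INSERT INTO actors VALUES ('Actor A'), ('Actor B');
    INSERT INTO donations VALUES
        ('Actor A',  'Candidate X', 500),
        ('Person C', 'Candidate X', 250),
        ('Actor B',  'Candidate Y', 100);
""")

# Join the two sources on the person's name, then filter by recipient.
rows = conn.execute(
    """
    SELECT a.name, d.amount
    FROM actors AS a
    JOIN donations AS d ON d.donor = a.name
    WHERE d.recipient = ?
    ORDER BY a.name
    """,
    ("Candidate X",),
).fetchall()

print(rows)
```

A keyword search needs one published page that already contains the answer. The join computes the answer from two sources that were never published together, which is the distinction the "Search" versus Query slide draws.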
Why Study Systems – OS/Net/DB/Sec/SE?
• Learn how to build complex systems:
– How can you manage complexity for future projects?
• Engineering issues:
– Why is the web so slow sometimes? Can you fix it?
– What features should be in the next Mars Rover?
– How do large distributed systems work? (e.g. Skype)
• Business issues:
– Will my web services application scale to 1M users?
• Buying and using a personal computer:
– Why do different PCs with the same CPU behave differently?
– Should you upgrade to Vista or wait?
– Why does Microsoft have such a bad name (and Apple a good name)?
• Security, viruses, and worms
– What exposure do you have to worry about?