computer science 425 distributed systems cs 425 / cse 424 / ece 428 fall 2010
DESCRIPTION
Computer Science 425 Distributed Systems CS 425 / CSE 424 / ECE 428 Fall 2010. Indranil Gupta (Indy) August 24-December 7, 2010 Lecture 1-29. Website: http://www.cs.uiuc.edu/class/fa10/cs425/. 2010, I . Gupta, K. Nahrtstedt, S. Mitra, N. Vaidya, M. T. Harandi, J. Hou. - PowerPoint PPT PresentationTRANSCRIPT
Lecture 29-1Lecture 29-1
Computer Science 425Distributed Systems
CS 425 / CSE 424 / ECE 428
Fall 2010
Computer Science 425Distributed Systems
CS 425 / CSE 424 / ECE 428
Fall 2010
Indranil Gupta (Indy)
August 24-December 7, 2010
Lecture 1-29
2010, I. Gupta, K. Nahrtstedt, S. Mitra, N. Vaidya, M. T. Harandi, J. Hou
Website: http://www.cs.uiuc.edu/class/fa10/cs425/
Lecture 29-2Lecture 29-2
Our First Aim in this Course was…(first lecture)….
Our First Aim in this Course was…(first lecture)….
To Define the Term Distributed System
(First lecture slide)
Lecture 29-3Lecture 29-3
Can you name some examples of Distributed Systems?
Can you name some examples of Distributed Systems?
• Client-Server (NFS)
• The Web
• The Internet
• An ad-hoc network
• A sensor network
• DNS
• Gnutella or BitTorrent (peer to peer overlays)
• A datacenter, e.g., The Planet, or Amazon EC2/S3
• (The Solar System?)
• (Society?)
• (Food Chain?)
(First lecture slide)
Lecture 29-4Lecture 29-4
What is a Distributed System?What is a Distributed System?
(First lecture slide)
Lecture 29-5Lecture 29-5
FOLDOC definitionFOLDOC definition
A collection of (probably heterogeneous) automata whose distribution is transparent to the user so that the system appears as one local machine. This is in contrast to a network, where the user is aware that there are several machines, and their location, storage replication, load balancing and functionality is not transparent. Distributed systems usually use some kind of client-server organization.
(First lecture slide)
Lecture 29-6Lecture 29-6
Textbook definitionsTextbook definitions
• A distributed system is a collection of independent computers that appear to the users of the system as a single computer
[Andrew Tanenbaum]
• A distributed system is several computers doing something together. Thus, a distributed system has three primary characteristics: multiple computers, interconnections, and shared state
[Michael Schroeder]
(First lecture slide)
Lecture 29-7Lecture 29-7
A working definition for usA working definition for us
A distributed system is a collection of entities, each of which is autonomous, programmable, asynchronous and failure-prone, and communicating through an unreliable communication medium.
• Our interest in distributed systems involves – design and implementation, maintenance, study, algorithmics
• Entity=a process on a device (PC, PDA)
• Communication Medium=Wired or wireless network
• What Evidence/Examples have we seen?
(First lecture slide)
Lecture 29-8Lecture 29-8
Problems we have Seen and Solved Since Then
Problems we have Seen and Solved Since Then
• Failure Detectors • Time and Synchronization• Global States and Snapshots• Multicast Communications • Mutual Exclusion • Leader Election • Impossibility of Consensus• Peer to peer systems – Gnutella and Chord• Cloud Computing • Networking and Routing• Sensor Networks• Measurements from real systems
Lecture 29-9Lecture 29-9
Problems we have Seen and Solved in this Class
Problems we have Seen and Solved in this Class
• Failure Detectors • Time and Synchronization• Global States and Snapshots• Multicast Communications • Mutual Exclusion • Leader Election • Impossibility of Consensus• Peer to peer systems – Gnutella and Chord• Cloud Computing • Networking and Routing• Sensor Networks• Measurements from real systems
Basic Theoretical Concepts
Bridge to Systems
What Lies Beneath
Lecture 29-10Lecture 29-10
Problems we have Seen and Solved in this Class (2)
Problems we have Seen and Solved in this Class (2)
• RPCs & Distributed Objects
• Concurrency Control
• Replication Control
• Gossiping
• Self-stabilization
• Distributed File Systems
• Distributed Shared Memory
• Byzantine Fault-tolerance
• Security
Basic Building Blocks
Distributed Services(e.g., databases)
Distributed Apps that people use directly
Easy programming
Necessary properties
Basic Theoretical Concepts
Lecture 29-11Lecture 29-11
Problems we have Seen and Solved in this Class (3)
Problems we have Seen and Solved in this Class (3)
• Midterm
• HW’s and MP’s
– A new cloud computing application from scratch!
How to get good grades(and regrades, and jobs
in some cases)
Something to boastabout to your friends(and in interviews!)
Lecture 29-12Lecture 29-12
A range of Challenges for DesignersA range of Challenges for Designers
•
• Heterogeneity
• Openness
• Security
• Scalability
• Failures
• Concurrency
• Transparency
•
•
What are
these?
(First lecture slide)
Lecture 29-13Lecture 29-13
What we Have Learned Since ThenWhat we Have Learned Since Then
• Heterogeneity – protocols should run in spite of heterogeneous nodes, e.g., supernodes in Kazaa, variety of nodes in a cloud
• Openness – each service/protocol can build on other services/protocols, e.g., layered or stacked architecture
• Security – it should be possible to add-in security to any existing protocol/system, e.g., encryption and signatures
• Scalability – clouds and peer to peer systems need to scale in overhead as the size of the system increases
• Failures – any protocol should be tolerant to the failure of nodes and of communication
• Concurrency – the more this is, the better the performance, but is a tradeoff with consistency requirements
• Transparency – clouds, distributed file systems, transactions, virtual synchrony, sequential consistency, etc. provide an abstraction of one property while allowing sufficient flexibility at run-time
Lecture 29-14Lecture 29-14
Problems we have Seen and Solved in this Class(and relation to other courses)
Problems we have Seen and Solved in this Class(and relation to other courses)• Failure Detectors
• Time and Synchronization• Global States and Snapshots• Multicast Communications • Mutual Exclusion • Leader Election • Impossibility of Consensus• Peer to peer systems – Gnutella and Chord• Cloud Computing • Measurements from real systems• Networking and Routing• Sensor Networks
Core Material of this course
Related to CS 525
Related to CS 438
Lecture 29-15Lecture 29-15
Problems we have Seen and Solved in this Class (and relation to other courses)
Problems we have Seen and Solved in this Class (and relation to other courses)
• RPCs & Distributed Objects
• Concurrency Control
• Replication Control
• Gossiping
• Self-stabilization
• Distributed File Systems
• Distributed Shared Memory
• Byzantine Fault-tolerance
• Security
Core Material of this course
Related to CS 411/CS 511
Related to CS 423/CS 523
Related to CS 523
Related to CS 421/CS 433
Related to CS 525
Lecture 29-16Lecture 29-16
Problems we have Seen and Solved in this Class (and relation to other courses)
Problems we have Seen and Solved in this Class (and relation to other courses)
• Midterm
• HW’s and MP’s
Related to all courses
Lecture 29-17Lecture 29-17
CS525: Advanced Distributed Systems(taught by Indy)
CS525: Advanced Distributed Systems(taught by Indy)
CS 525, Spring 2011– Looks at hot topics of research in distributed systems: clouds, p2p,
sensor networks, and other distributed systems
– If you liked CS425’s material, it’s likely you’ll enjoy CS525
– Project: Spring 2011 is a special “Entrepreneurial” semester for CS525
» Your project will build a distributed system for a new startup company idea (your own!) and perform associated research with it
» You (as student) own IP for your project! you can found your own company (if you so wish) based on your CS525 project
– Space is limited, so please sign up soon!
– Both graduates and undergraduates welcome! (let me know if you need my consent).
– Class size is around 20-30
Lecture 29-18Lecture 29-18
Questions?Questions?
Lecture 29-19Lecture 29-19
A working definition for usA working definition for us
A distributed system is a collection of entities, each of which is autonomous, programmable, asynchronous and failure-prone, and communicating through an unreliable communication medium.
• Our interest in distributed systems involves – design and implementation, maintenance, study, algorithmics
• Entity=a process on a device (PC, PDA)
• Communication Medium=Wired or wireless network
[Is this definition still ok, or would you want to change it?]
Lecture 29-20Lecture 29-20
Etc.Etc.
• Regular TA (Imranul) office hours on Wed Dec 8 and special office hours on Thursday (10.30am-12noon).
• Indy’s office hours – none this Thursday, special office hours this Friday 4pm-5pm (Dec 10).
• Final Exam– MONDAY Dec 13 1:30-4:30pm MEB 253. Includes all material since the
start of the course
– Allowed to bring a cheat sheet to the exam (A4 size, two sides only, at least 1 pt font)
– Final will be similar in structure to Midterm
– Revising homework problems, and midterm problems, (and if possible other problems in the textbook) will be ample preparation for the final exam
Lecture 29-21Lecture 29-21
Course EvaluationCourse Evaluation
• Main purpose: to give us feedback on how useful this course was to you (and to improve future versions of the course)
• I won’t see these evaluations until after you see your grades
• Use pencil only• Make sure you answer 1 and 2 (You may omit item 5 about
student gender (top box)).• Please write your detailed feedback on the back – this is
valuable for future versions of the course!
• Need a volunteer: 1. Please collect all reviews, and drop envelope in campus mail box2. Return the box of pencils to Siebel Center Academic Office (beside 1st
floor elevator)
Lecture 29-22Lecture 29-22
Thank YouThank You