MC714: Distributed Systems
Prof. Lucas Wanner
Instituto de Computação, Unicamp
Lecture 1: Introduction and Fundamentals
MC714 – Distributed Systems
Professor: Lucas Wanner – [email protected]
Schedule: Tuesdays 21:00-23:00, Room CB 06; Thursdays 19:00-21:00, Room CB 05
Website: http://www.lucaswanner.com/sd
Mailing list: https://groups.google.com/d/forum/sd-2016-2
All enrolled students have been added to the list with their DAC email addresses. Request to join the list if you have not received a notification.
MC714 – Distributed Systems
Syllabus
• Distributed systems • Interprocess communication • File systems • Name services • Coordination • Replication • Security
Bibliography
Main text: A. S. Tanenbaum and M. Van Steen. Distributed Systems: Principles and Paradigms. Second edition, Pearson, 2006. Download link on the course page.
G. Coulouris, J. Dollimore, T. Kindberg, and G. Blair. Distributed Systems: Concepts and Design. Fifth edition, Addison-Wesley, 2011.
A. D. Kshemkalyani and M. Singhal. Distributed Computing: Principles, Algorithms, and Systems. Paperback edition, Cambridge University Press, 2011.
Program: Part One
Topic                                          Chapter
Introduction and Fundamentals                  1
Architectures of distributed systems           2
Processes and Threads                          Review, 3
Clients/Servers, Virtualization and Cloud      3
Communication: Review, Sockets                 Review, 4
Message Passing, Multicast                     4
Information dissemination                      4
Remote Procedure Call                          4
Naming                                         5
Clock synchronization                          6
Logical clocks                                 6
Mutual exclusion                               6
Leader election                                6
Program: Part Two
Topic                                                       Chapter
Consistency: Fundamentals, Models                           7
Replication: Management, Content distribution               7
Fault tolerance: Fundamentals, Reliable communication       8
Distributed commit                                          8
Recovery, Checkpointing                                     8
Files: Architecture, Communication, Synchronization         11
Files: Consistency and replication, Fault tolerance         11
Peer-to-Peer: Introduction, Distributed Hash Table (DHT)    Coulouris 10
Peer-to-Peer: Chord, Kademlia, BitTorrent                   Singhal 18
Web: Architecture, Communication, HTTP, SOAP, Caching       12
Security in distributed systems                             9
Evaluation
Components
Exams (P): Two theoretical exams, P1 and P2, will be given.
Seminars (S): Seminars will be presented in class. Groups, dates, and presentation topics will be defined during the semester.
Tests (T): A series of short tests and implementation exercises will be given. The test grade T is the arithmetic mean of the tests given.
Late policy
Each day late incurs a penalty of 2.5/10 points for each deliverable.
Evaluation
Average
The course average M is computed as:
$$M = P_1 \times 0.3 + P_2 \times 0.4 + T \times 0.2 + S \times 0.1$$
Exam
Students with an average 2.5 ≤ M < 5 may take a final exam (E).
Final grade
The final grade F is computed as:
$$F = \begin{cases} \min\left\{5, \dfrac{M+E}{2}\right\} & \text{if } 2.5 \le M < 5 \text{ and the student took the exam,} \\ M & \text{otherwise.} \end{cases}$$
Evaluation
Important dates
P1: 06/10/2016
P2: 06/12/2016
Exam: 20/12/2017
Academic Integrity
Zero-tolerance policy
Any violation of academic integrity will be punished to the limit of the professor's authority, including but not limited to a grade of zero in the final course average for everyone involved.
Examples (non-exhaustive) of violations
• Cheating and plagiarism
• Sharing solutions and code (e.g., "taking a look" at someone else's code)
• Falsifying data and results
Not violations
• Study groups
• Discussing implementation strategies, excluding code details
Evaluation
How to do well in the course (in order of importance)
1. Solve the exercises for each lecture.
2. Read the book chapters before the corresponding lecture.
3. Hand in test solutions on time.
4. Give a good seminar presentation.
5. Attend the lectures.
Lecture Style
1. Brief review of the previous lecture.
2. Discussion of the exercises from the previous lecture.
3. Presentation of the questions for the lecture.
4. Content.
5. (In some lectures) Tests.
Participation
Participation will be actively encouraged in the discussion, review, and presentation of the content.
Exercises
1. Define and compare distributed systems and parallel systems.
2. What is the role of middleware in distributed systems?
3. Give examples of and define the different types of distribution transparency.
4. What is the difference between migration transparency and relocation transparency?
5. Define scalability. Which techniques are used to achieve scalability?
6. What is the difference between replication and caching?
7. The traditional view of transactions says that when a transaction is aborted, it is as if the transaction never happened. Give an example where this is not true.
8. What is the role of a transaction coordinator?
Distributed System: Definition
A distributed system is a piece of software that ensures that:
a collection of independent computers appears to its users as a single coherent system

Two aspects: (1) independent computers and (2) single system ⇒ middleware.
[Figure: Applications A, B, and C run on top of a distributed system layer (middleware). The middleware spans Computers 1 to 4, each with its own local OS (Local OS 1 to 4), all connected by a network.]
Source: Maarten van Steen, Distributed Systems: Principles and Paradigms
Distributed System: Alternative Definition
You know you have [a distributed system] when the crash of a computer you've never heard of stops you from getting any work done.
Leslie Lamport
Goals of Distributed Systems
• Making resources available
• Distribution transparency
• Openness
• Scalability
Distribution Transparency
Transparency   Description
Access         Hide differences in data representation and invocation mechanisms
Location       Hide where an object is located
Relocation     Hide that an object may be moved to another location while in use
Migration      Hide that an object may move to another location
Replication    Hide that an object is replicated
Concurrency    Hide that an object may be shared by several independent users
Failure        Hide failure and possible recovery of an object

Note
Distribution transparency is a nice goal, but achieving it is a different story.
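One way to picture location and migration transparency is a client that only ever names an object and lets a name service find it. The sketch below is a minimal illustration under that assumption; `Node`, `Registry`, `Proxy`, and `migrate` are hypothetical names, not part of any real middleware.

```python
class Node:
    """A hypothetical machine that stores objects by name."""
    def __init__(self, name):
        self.name = name
        self.objects = {}


class Registry:
    """Hypothetical name service: object name -> node currently holding it."""
    def __init__(self):
        self._location = {}

    def bind(self, obj_name, node):
        self._location[obj_name] = node

    def lookup(self, obj_name):
        return self._location[obj_name]


class Proxy:
    """Client-side handle: the client names the object, never the node."""
    def __init__(self, registry, obj_name):
        self._registry = registry
        self._obj_name = obj_name

    def read(self):
        # The location is resolved on every access, so a move goes unnoticed.
        node = self._registry.lookup(self._obj_name)
        return node.objects[self._obj_name]


def migrate(registry, obj_name, src, dst):
    """Move the object and rebind its name; clients are not notified."""
    dst.objects[obj_name] = src.objects.pop(obj_name)
    registry.bind(obj_name, dst)


node_a, node_b = Node("node-a"), Node("node-b")
registry = Registry()
node_a.objects["report"] = "quarterly figures"
registry.bind("report", node_a)

report = Proxy(registry, "report")
before = report.read()
migrate(registry, "report", node_a, node_b)
after = report.read()
```

The client code around `report.read()` is identical before and after the move, which is the property the table describes. Relocation transparency would additionally require the move to be safe while a read is in progress, which this sketch does not attempt.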
Degree of Transparency
Observation
Aiming at full distribution transparency may be too much:
• Users may be located in different continents
• Completely hiding failures of networks and nodes is (theoretically and practically) impossible
    • You cannot distinguish a slow computer from a failing one
    • You can never be sure that a server actually performed an operation before a crash
• Full transparency will cost performance, exposing distribution of the system
    • Keeping Web caches exactly up-to-date with the master
    • Immediately flushing write operations to disk for fault tolerance
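The point about slow versus failing computers can be made concrete with a timeout. The following is a small simulation, not a real network client: `SimulatedServer` and `call_with_timeout` are hypothetical, and a thread stands in for the remote machine.

```python
import threading
import time


class SimulatedServer:
    """Stand-in for a remote server; `delay` models load, `crashed` a failure."""
    def __init__(self, delay=0.0, crashed=False):
        self.delay = delay
        self.crashed = crashed
        self.performed = 0
        self.done = threading.Event()

    def handle(self, reply_box):
        if self.crashed:
            return  # a crashed server never replies
        time.sleep(self.delay)
        self.performed += 1  # the operation really is carried out
        reply_box.append("ok")
        self.done.set()


def call_with_timeout(server, timeout):
    """Return the reply, or None if nothing arrived within `timeout` seconds."""
    reply_box = []
    worker = threading.Thread(target=server.handle, args=(reply_box,), daemon=True)
    worker.start()
    worker.join(timeout)
    return reply_box[0] if reply_box else None


slow = SimulatedServer(delay=0.3)
dead = SimulatedServer(crashed=True)
slow_reply = call_with_timeout(slow, timeout=0.05)
dead_reply = call_with_timeout(dead, timeout=0.05)
```

Both calls return `None`, so the client sees the same thing in both cases. The slow server nevertheless goes on to perform the operation after the client has given up, which is why a timeout alone cannot tell the client whether the operation took effect.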
Openness of Distributed Systems
Open distributed system
Be able to interact with services from other open systems, irrespective of the underlying environment:
• Systems should conform to well-defined interfaces
• Systems should support portability of applications
• Systems should easily interoperate

Achieving openness
At least make the distributed system independent from heterogeneity of the underlying environment:
• Hardware
• Platforms
• Languages
Policies versus Mechanisms
Implementing openness
Requires support for different policies:
• What level of consistency do we require for client-cached data?
• Which operations do we allow downloaded code to perform?
• Which QoS requirements do we adjust in the face of varying bandwidth?
• What level of secrecy do we require for communication?

Implementing openness
Ideally, a distributed system provides only mechanisms:
• Allow (dynamic) setting of caching policies
• Support different levels of trust for mobile code
• Provide adjustable QoS parameters per data stream
• Offer different encryption algorithms
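The caching item gives a compact illustration of the split between mechanism and policy. In the sketch below the `Cache` class is the mechanism and the freshness rule is passed in as a policy. All names (`Cache`, `ttl`, `always_refetch`, `FakeClock`) are made up for this example, and the clock is injected only to keep it deterministic.

```python
class Cache:
    """Mechanism: stores entries and asks a policy whether each is still usable."""
    def __init__(self, is_fresh, clock):
        self._is_fresh = is_fresh  # policy: age in seconds -> bool
        self._clock = clock        # injected so the example is deterministic
        self._entries = {}         # key -> (value, time stored)
        self.fetches = 0

    def get(self, key, fetch):
        now = self._clock()
        if key in self._entries:
            value, stored_at = self._entries[key]
            if self._is_fresh(now - stored_at):
                return value
        value = fetch(key)         # miss or stale entry: go to the origin
        self.fetches += 1
        self._entries[key] = (value, now)
        return value


def ttl(seconds):
    """Policy: an entry is usable until it is `seconds` old."""
    return lambda age: age < seconds


def always_refetch(age):
    """Policy: never trust a cached entry (strongest consistency)."""
    return False


class FakeClock:
    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


def origin(key):
    return f"contents of {key}"


clock = FakeClock()
relaxed = Cache(ttl(60), clock)
strict = Cache(always_refetch, clock)

for moment in (0.0, 30.0, 90.0):
    clock.now = moment
    relaxed.get("/index.html", origin)
    strict.get("/index.html", origin)
```

The same `Cache` code serves both clients; only the policy object differs. With a 60-second lifetime the relaxed cache goes to the origin twice in this run, while the strict one goes every time.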
Scale in Distributed Systems
Observation
Many developers of modern distributed systems readily use the adjective “scalable” without making clear why their system actually scales.

Scalability
At least three components:
• Number of users and/or processes (size scalability)
• Maximum distance between nodes (geographical scalability)
• Number of administrative domains (administrative scalability)

Observation
Most systems account only, to a certain extent, for size scalability. The (non)solution: powerful servers. Today, the challenge lies in geographical and administrative scalability.
Techniques for Scaling
Hide communication latencies
Avoid waiting for responses; do something else:
• Make use of asynchronous communication
• Have separate handler for incoming response
• Problem: not every application fits this model
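A rough sketch of the effect, using Python's `asyncio` with `asyncio.sleep` standing in for network delay. The request names and the 0.1-second delay are invented for the example.

```python
import asyncio
import time


async def remote_call(name, delay):
    """Stand-in for a remote request; the sleep models network latency."""
    await asyncio.sleep(delay)
    return f"{name}: reply"


async def one_at_a_time(requests):
    # Wait for each reply before issuing the next request.
    return [await remote_call(name, delay) for name, delay in requests]


async def overlapped(requests):
    # Issue every request first, then collect replies as they arrive.
    calls = (remote_call(name, delay) for name, delay in requests)
    return list(await asyncio.gather(*calls))


def timed(strategy, requests):
    start = time.perf_counter()
    replies = asyncio.run(strategy(requests))
    return replies, time.perf_counter() - start


requests = [("users", 0.1), ("orders", 0.1), ("stock", 0.1)]
blocking_replies, blocking_time = timed(one_at_a_time, requests)
async_replies, async_time = timed(overlapped, requests)
```

Both strategies return the same replies, but the overlapped version takes roughly the time of the slowest request rather than the sum of all of them. This only helps when the requests are independent, which is the "not every application fits" caveat in the list.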
Hiding communication latency
Techniques for Scaling
Distribution
Partition data and computations across multiple machines:
• Move computations to clients (Java applets)
• Decentralized naming services (DNS)
• Decentralized information systems (WWW)
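To illustrate how a naming service can be partitioned, the sketch below models a DNS-like hierarchy in which each server knows only its own zone and whom to ask next. It is a toy model, not the DNS protocol: `ZoneServer` and `resolve` are hypothetical, and the name and address are reserved documentation values.

```python
class ZoneServer:
    """Holds one zone: its own hosts plus pointers to child zones."""
    def __init__(self, zone):
        self.zone = zone
        self.delegations = {}  # label -> ZoneServer for the child zone
        self.hosts = {}        # label -> address


def resolve(root, name):
    """Walk from the root toward the zone that owns `name`."""
    labels = name.split(".")[::-1]  # "www.example.org" -> org, example, www
    server, contacted = root, [root.zone]
    for label in labels[:-1]:
        server = server.delegations[label]  # each server knows only the next hop
        contacted.append(server.zone)
    return server.hosts[labels[-1]], contacted


root = ZoneServer(".")
org = ZoneServer("org")
example = ZoneServer("example.org")
root.delegations["org"] = org
org.delegations["example"] = example
example.hosts["www"] = "192.0.2.10"

address, contacted = resolve(root, "www.example.org")
```

No single server stores the whole name space: the root knows nothing about `www`, and adding hosts under `example.org` touches only that zone's server.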
Distribution: DNS
Techniques for Scaling
Replication/caching
Make copies of data available at different machines:
• Replicated file servers and databases
• Mirrored Web sites
• Web caches (in browsers and proxies)
• File caching (at server and client)
Scaling – The Problem
Observation
Applying scaling techniques is easy, except for one thing:
• Having multiple copies (cached or replicated) leads to inconsistencies: modifying one copy makes that copy different from the rest.
• Always keeping copies consistent, and in a general way, requires global synchronization on each modification.
• Global synchronization precludes large-scale solutions.

Observation
If we can tolerate inconsistencies, we may reduce the need for global synchronization, but tolerating inconsistencies is application dependent.
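The trade-off can be seen in a few lines. The sketch below is a deliberately simplified model with a hypothetical `ReplicatedStore` class: writes always land on copy 0 and are propagated to the other copies either immediately or later.

```python
class ReplicatedStore:
    """Toy model: copy 0 takes writes; other copies are updated now or later."""
    def __init__(self, copies, synchronous):
        self.replicas = [dict() for _ in range(copies)]
        self.synchronous = synchronous
        self.pending = []   # updates not yet sent to the other copies
        self.messages = 0   # update messages sent so far

    def write(self, key, value):
        self.replicas[0][key] = value
        if self.synchronous:
            self._propagate(key, value)  # every write pays for all copies
        else:
            self.pending.append((key, value))

    def _propagate(self, key, value):
        for replica in self.replicas[1:]:
            replica[key] = value
            self.messages += 1

    def sync(self):
        for key, value in self.pending:
            self._propagate(key, value)
        self.pending.clear()

    def read(self, key, replica):
        return self.replicas[replica].get(key)


eager = ReplicatedStore(copies=3, synchronous=True)
eager.write("x", 1)
eager_read = eager.read("x", replica=2)

lazy = ReplicatedStore(copies=3, synchronous=False)
lazy.write("x", 1)
messages_at_write = lazy.messages
stale_read = lazy.read("x", replica=2)  # copy 2 has not seen the write yet
lazy.sync()
fresh_read = lazy.read("x", replica=2)
```

The eager store never returns stale data but sends a message to every copy on every write. The lazy store makes the write cheap and returns a stale value until `sync()` runs; whether that is acceptable depends on the application.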
Developing Distributed Systems: Pitfalls
Observation
Many distributed systems are needlessly complex because of mistakes that required patching later on. There are many false assumptions:
• The network is reliable
• The network is secure
• The network is homogeneous
• The topology does not change
• Latency is zero
• Bandwidth is infinite
• Transport cost is zero
• There is one administrator
Types of Distributed Systems
• Distributed Computing Systems
• Distributed Information Systems
• Distributed Pervasive Systems
Distributed Computing Systems
Observation
Many distributed systems are configured for High-Performance Computing

Cluster Computing
Essentially a group of high-end systems connected through a LAN:
• Homogeneous: same OS, near-identical hardware
• Single managing node
Distributed Computing Systems
[Figure: A cluster. The master node runs the management application and the parallel libraries on its local OS and is reachable through a remote access network. Each compute node runs a component of the parallel application on its own local OS. The nodes are connected by a standard network and a high-speed network.]
Distributed Computing Systems
Grid Computing
The next step: lots of nodes from everywhere:
• Heterogeneous
• Dispersed across several organizations
• Can easily span a wide-area network

Note
To allow for collaborations, grids generally use virtual organizations. In essence, this is a grouping of users (or better: their IDs) that will allow for authorization on resource allocation.
Distributed Computing Systems: Clouds
[Figure: The cloud layer stack and the matching service models, top to bottom.
Application (Software as a Service): Web services, multimedia, business apps. Examples: Google Apps, YouTube, Flickr.
Platforms (Platform as a Service): Software framework (Java/Python/.Net), storage (DB, File). Examples: MS Azure, Amazon S3.
Infrastructure (Infrastructure as a Service): Computation (VM), storage (block). Example: Amazon EC2.
Hardware: Datacenters; CPU, memory, disk, bandwidth.]
Distributed Computing Systems: Clouds
Cloud computing
Make a distinction between four layers:
• Hardware: Processors, routers, power and cooling systems. Customers normally never get to see these.
• Infrastructure: Deploys virtualization techniques. Revolves around allocating and managing virtual storage devices and virtual servers.
• Platform: Provides higher-level abstractions for storage and such. Example: the Amazon S3 storage system offers an API for (locally created) files to be organized and stored in so-called buckets.
• Application: Actual applications, such as office suites (text processors, spreadsheet applications, presentation applications). Comparable to the suite of apps shipped with OSes.
Distributed Information Systems
Observation
A large share of the distributed systems in use today are forms of traditional information systems that now integrate legacy systems. Example: transaction processing systems.

    BEGIN_TRANSACTION(server, transaction)
    READ(transaction, file-1, data)
    WRITE(transaction, file-2, data)
    newData := MODIFIED(data)
    IF WRONG(newData) THEN
        ABORT_TRANSACTION(transaction)
    ELSE
        WRITE(transaction, file-2, newData)
        END_TRANSACTION(transaction)
    END IF

Note
Transactions form an atomic operation.
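The pseudocode above can be mirrored in runnable form. The sketch below is one possible reading, assuming a private-workspace implementation: the transaction works on a copy of the store and either installs it on commit or throws it away on abort. `Transaction`, `TransactionAborted`, and `copy_modified` are names chosen for this example.

```python
import copy


class TransactionAborted(Exception):
    """Raised inside a transaction to request an abort."""


class Transaction:
    """Run a block against a private copy; install it only on success."""
    def __init__(self, store):
        self._store = store

    def __enter__(self):
        self._workspace = copy.deepcopy(self._store)
        return self._workspace

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            # Commit: make the workspace the new state.
            self._store.clear()
            self._store.update(self._workspace)
            return False
        # Abort: drop the workspace; swallow only the explicit abort signal.
        return issubclass(exc_type, TransactionAborted)


def copy_modified(store, modify, is_wrong):
    """Same steps as the pseudocode: read, write, modify, then abort or commit."""
    with Transaction(store) as tx:
        data = tx["file-1"]
        tx["file-2"] = data
        new_data = modify(data)
        if is_wrong(new_data):
            raise TransactionAborted()
        tx["file-2"] = new_data


committed = {"file-1": "abc", "file-2": ""}
copy_modified(committed, str.upper, lambda d: False)

aborted = {"file-1": "abc", "file-2": ""}
copy_modified(aborted, str.upper, lambda d: True)
```

In the aborted run the first `WRITE` to `file-2` has already happened inside the workspace, yet `file-2` in the real store is untouched, which is what "atomic" means here.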
Distributed Information Systems: Transactions
Model
A transaction is a collection of operations on the state of an object (database, object composition, etc.) that satisfies the following properties (ACID):

Atomicity: All operations either succeed, or all of them fail. When the transaction fails, the state of the object will remain unaffected by the transaction.

Consistency: A transaction establishes a valid state transition. This does not exclude the possibility of invalid, intermediate states during the transaction's execution.

Isolation: Concurrent transactions do not interfere with each other. It appears to each transaction T that other transactions occur either before T, or after T, but never both.

Durability: After the execution of a transaction, its effects are made permanent: changes to the state survive failures.
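Atomicity and consistency can be observed with an ordinary database. The sketch below uses Python's standard `sqlite3` module with an in-memory database; the `accounts` table, the names, and the "no negative balance" rule are assumptions made for the example.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` between accounts; all updates apply or none do."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                # Invalid intermediate state: refuse to make it permanent.
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()

first_ok = transfer(conn, "alice", "bob", 30)
after_first = balances(conn)
second_ok = transfer(conn, "alice", "bob", 500)
after_second = balances(conn)
```

During the second transfer both rows are briefly updated and the source balance is negative, but the rollback restores the earlier state, so only valid states are ever made permanent. Isolation and durability are not exercised by this single-connection, in-memory example.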
Transaction Processing Monitor
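Observation
In many cases, the data involved in a transaction is distributed across several servers. A TP monitor is responsible for coordinating the execution of a transaction.

[Figure: A client application sends its requests to the TP monitor. The TP monitor sends a request to each of several servers, collects their replies, and returns a single reply to the client. The exchange between the TP monitor and the servers forms one transaction.]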
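The slide does not say how the TP monitor coordinates the servers. One common approach, assumed here purely for illustration, is to ask every server to tentatively accept its part and to apply the changes only if all of them agree. `ResourceServer` and `TPMonitor` are hypothetical classes, and real concerns such as crashes and timeouts are left out.

```python
class ResourceServer:
    """Hypothetical server holding one stock counter."""
    def __init__(self, name, stock):
        self.name = name
        self.stock = stock
        self._pending = None

    def prepare(self, delta):
        """Tentatively accept `delta`; refuse if stock would go negative."""
        if self.stock + delta < 0:
            return False
        self._pending = delta
        return True

    def commit(self):
        self.stock += self._pending
        self._pending = None

    def abort(self):
        self._pending = None


class TPMonitor:
    """Coordinator: apply every operation or none of them."""
    def run(self, operations):
        prepared = []
        for server, delta in operations:
            if not server.prepare(delta):
                for earlier in prepared:
                    earlier.abort()  # undo the tentative acceptances
                return False
            prepared.append(server)
        for server in prepared:
            server.commit()
        return True


warehouse = ResourceServer("warehouse", stock=5)
shop = ResourceServer("shop", stock=0)
monitor = TPMonitor()

moved = monitor.run([(warehouse, -3), (shop, +3)])
refused = monitor.run([(shop, +4), (warehouse, -4)])
```

In the second call the shop accepts its part but the warehouse refuses, so the monitor aborts the shop's tentative change and neither server is modified.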
Distr. Info. Systems: Enterprise Application Integration
Problem
A TP monitor doesn't separate apps from their databases. Also needed are facilities for direct communication between apps.

[Figure: Client applications and server-side applications communicate with one another through communication middleware.]

• Remote Procedure Call (RPC)
• Message-Oriented Middleware (MOM)
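The practical difference between the two styles shows up when the receiving application is unavailable. The sketch below imitates both in a single process: a direct method call stands in for RPC and a `queue.Queue` stands in for the message broker. `BillingApp`, `rpc_call`, and `MessageBroker` are invented for the example.

```python
import queue


class BillingApp:
    """Hypothetical server-side application."""
    def __init__(self):
        self.online = True
        self.invoices = []

    def create_invoice(self, order_id):
        if not self.online:
            raise ConnectionError("billing is unreachable")
        self.invoices.append(order_id)
        return f"invoice-{order_id}"


def rpc_call(billing, order_id):
    """RPC style: the caller waits for the result and fails if the callee is down."""
    return billing.create_invoice(order_id)


class MessageBroker:
    """MOM style: the sender enqueues and moves on; delivery happens later."""
    def __init__(self):
        self._queue = queue.Queue()

    def send(self, order_id):
        self._queue.put(order_id)

    def deliver_to(self, billing):
        delivered = 0
        while billing.online and not self._queue.empty():
            billing.create_invoice(self._queue.get())
            delivered += 1
        return delivered


billing = BillingApp()
broker = MessageBroker()
first = rpc_call(billing, 1)

billing.online = False
try:
    rpc_call(billing, 2)
    rpc_failed = False
except ConnectionError:
    rpc_failed = True
broker.send(2)  # accepted even though billing is down

billing.online = True
delivered = broker.deliver_to(billing)
```

With RPC both applications must be running at the same time. With message-oriented middleware the sender and receiver are decoupled in time, at the cost of the sender not getting an immediate result.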
Distributed Pervasive Systems
Observation
An emerging next generation of distributed systems in which nodes are small, mobile, and often embedded in a larger system.

Some requirements
• Contextual change: The system is part of an environment in which changes should be immediately accounted for.
• Ad hoc composition: Each node may be used in very different ways by different users. Requires ease of configuration.
• Sharing is the default: Nodes come and go, providing sharable services and information. Calls again for simplicity.

Note
Pervasiveness and distribution transparency: a good match?
Distributed Systems
Pervasive Systems: Examples
Home systems
Should be completely self-organizing:
• There should be no system administrator
• Simplest solution: a centralized home box?

Monitoring a person
Devices are physically close to a person:
• Where and how should monitored data be stored?
• How can we prevent loss of crucial data?
• What is needed to generate and propagate alerts?
• How can security be enforced?
• How can the environment provide online feedback?
Sensor networks
Characteristics
The nodes to which sensors are attached are:
• Many (10s-1000s)
• Simple (small memory/compute/communication capacity)
• Often battery-powered (or even battery-less)
Sensor networks as distributed systems
[Figure: Two organizations of a sensor network.
(a) Sensor data is sent directly to the operator's site.
(b) Each sensor can process and store data; the operator's site sends a query, and sensors send only answers.]
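The two organizations differ mainly in how much data crosses the network. The sketch below compares them for a single "maximum temperature" query. The sensor names and readings are invented, and the count covers only sensor-to-operator messages; the query messages in case (b) are not counted.

```python
readings = {
    "s1": [21.0, 21.5, 22.1],
    "s2": [19.8, 20.2, 20.0],
    "s3": [23.4, 23.0, 22.9],
}


def centralized_max(readings):
    """(a) Every sample travels to the operator, which computes the answer."""
    received = [value for samples in readings.values() for value in samples]
    return max(received), len(received)


def in_network_max(readings):
    """(b) Each sensor answers the query locally and sends a single value."""
    answers = [max(samples) for samples in readings.values()]
    return max(answers), len(answers)


central_answer, central_messages = centralized_max(readings)
local_answer, local_messages = in_network_max(readings)
```

Both organizations produce the same answer, but (b) sends one message per sensor instead of one per sample. On battery-powered nodes communication is typically the expensive part, so this saving matters, though it requires sensors that can store and process data themselves.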