MCTI-DSD: Scalable Systems in Distributed Environments (v4d)
![Page 1: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/1.jpg)
Distributed Systems Design
Master's in Computer Science and Technology (Máster en Ciencia y Tecnología Informática)
Academic year 2016-2017
Alejandro Calderón Mateos & Óscar Pérez Alonso
Scalable systems in distributed environments
![Page 2: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/2.jpg)
Contents
– Evolution
http://image.slidesharecdn.com/baronbigdatadeveloper-120418090010-phpapp02/95/the-big-data-developer-pavlobaron-1-728.jpg?cb=1338966838
![Page 3: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/3.jpg)
Distributed systems: the beginnings…
http://www.nethistory.info/History%20of%20the%20Internet/origins.html#apps
![Page 4: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/4.jpg)
Distributed systems: the beginnings…
http://www.nethistory.info/History%20of%20the%20Internet/origins.html#apps
• Centralized system
• Better maintenance
• Resource sharing
• Cost sharing
![Page 5: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/5.jpg)
Distributed systems: the beginnings…
http://www.nethistory.info/History%20of%20the%20Internet/origins.html#apps
![Page 6: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/6.jpg)
Distributed systems: the beginnings…
http://www.nethistory.info/History%20of%20the%20Internet/origins.html#apps
![Page 7: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/7.jpg)
Distributed systems: the beginnings…
http://microchip.wdfiles.com/local--files/tcpip:tcp-ip-five-layer-model/TCPIP_5_layer_overview.JPG
![Page 8: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/8.jpg)
Distributed systems: the beginnings…
http://www.nethistory.info/History%20of%20the%20Internet/origins.html#apps
![Page 9: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/9.jpg)
Distributed systems: the beginnings…
http://edc.tversu.ru/elib/inf/0091/tcpip/figs/tcp2_0303.gif
![Page 10: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/10.jpg)
Distributed systems: the beginnings…
https://en.wikipedia.org/wiki/IBM_Personal_Computer
![Page 11: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/11.jpg)
Distributed systems: the beginnings…
http://www.thefoa.org/tech/ref/appln/OLAN.html
![Page 12: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/12.jpg)
Distributed systems: the beginnings…
http://cdn.arstechnica.net/2011/09/23/hdd-capacity-scale-4e7ce6c-intro.png
![Page 13: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/13.jpg)
Distributed systems: the beginnings…
https://bitcointalk.org/index.php?topic=430357.0
![Page 14: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/14.jpg)
Distributed systems: the beginnings…
http://img.frbiz.com/news/145317_s/Mobile_communication_base_station_radio_equipment_greet_with_the_explosive_growth_mobile_communication_base_station_radio_equipment_3G_communication_industry.jpg
![Page 15: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/15.jpg)
Distributed systems: the beginnings…
https://www.dcsorg.com/images/image_centralized_management.jpg
![Page 16: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/16.jpg)
Contents
– Evolution
– Learning
http://image.slidesharecdn.com/baronbigdatadeveloper-120418090010-phpapp02/95/the-big-data-developer-pavlobaron-1-728.jpg?cb=1338966838
![Page 17: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/17.jpg)
Transitions are not easy…
https://www.dcsorg.com/images/image_centralized_management.jpg
![Page 18: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/18.jpg)
Transitions are not easy…
Expectations
![Page 19: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/19.jpg)
Transitions are not easy…
Realities
![Page 20: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/20.jpg)
Transitions are not easy…
Challenges…
http://cdn.comsol.com/wordpress/2014/02/Speeding-up-communications-distributed-memory-computing-copy.jpg
The network is homogeneous
![Page 21: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/21.jpg)
Transitions are not easy…
Challenges…
http://thenewstack.io/helix-a-linkedin-framework-for-distributed-systems-development/
![Page 22: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/22.jpg)
Transitions are not easy…
Areas of work…
http://cdn.comsol.com/wordpress/2014/02/Speeding-up-communications-distributed-memory-computing-copy.jpg
![Page 23: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/23.jpg)
Transitions are not easy…
Areas of work…
IS473 at http://www.xpowerpoint.com/ppt/system-model-distributed-systems.html
![Page 24: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/24.jpg)
Transitions are not easy…
Lessons learned…
http://www.pixempire.com/images/preview/orchestra-director-with-stick-icon.jpg
Extra software/hardware in distributed systems: making the existence of multiple components transparent
![Page 25: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/25.jpg)
Transitions are not easy…
Lessons learned…
http://www.ukoln.ac.uk/distributed-systems/jisc-ie/arch/
![Page 26: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/26.jpg)
Transitions are not easy…
Lessons learned: typical elements…
http://www.ukoln.ac.uk/distributed-systems/jisc-ie/arch/
![Page 27: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/27.jpg)
Transitions are not easy…
Lessons learned: typical architecture…
http://books.cs.luc.edu/distributedsystems/issues.html
![Page 28: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/28.jpg)
Contents
– Evolution
– Learning
– (re)Trend
http://image.slidesharecdn.com/baronbigdatadeveloper-120418090010-phpapp02/95/the-big-data-developer-pavlobaron-1-728.jpg?cb=1338966838
![Page 29: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/29.jpg)
Distributed systems: larger scale…
Every home connected
https://bitcointalk.org/index.php?topic=430357.0
![Page 30: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/30.jpg)
Distributed systems: larger scale…
Every person connected
http://www.videcom.com/Portals/0/iphone1.png & http://images.techtimes.com/data/images/full/127370/events-for-gmail.jpg?w=600
a computer in your hands
![Page 31: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/31.jpg)
Distributed systems: larger scale…
https://www.dcsorg.com/images/image_centralized_management.jpg http://www.hsi.es/images/cc.png
Cloud computing: ubiquitous, on-demand access to a shared pool of computing resources
![Page 32: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/32.jpg)
Larger-scale distributed systems…
Every thing connected
http://www.mercurynews.com/business/ci_24836116/internet-things-seen-bonanza-bay-area-businesses
![Page 33: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/33.jpg)
Larger-scale distributed systems…
Internet of Things (IoT)
http://tarrysingh.com/2014/07/fog-computing-happens-when-big-data-analytics-marries-internet-of-things/
![Page 34: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/34.jpg)
IoT example: PI0 (Raspberry Pi Zero) + HTML5 Server-Sent Events
https://www.raspberrypi.org/magpi/wp-content/uploads/2015/11/Pi_Zero-Pics-Spread.jpg
https://redbear.cc/content/blog/pi-zero-iot-hat/
![Page 35: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/35.jpg)
IoT example: PI0 + HTML5 Server-Sent Events
https://developer.mozilla.org/es/docs/Server-sent_events/utilizando_server_sent_events_sse
Client
• The web page gets updates from a server
– without asking for them; the updates arrive automatically
• Examples: Twitter updates, stock prices, news feeds, etc.
Server
data: some text

data: another message
data: with two lines
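The server-side stream shown above can be read back with a few lines of code. The following is a minimal sketch in JavaScript, assuming only `data:` fields matter; the function name `parseEventStream` is hypothetical, and a browser's `EventSource` does this work for you.

```javascript
// Minimal parser for the text/event-stream wire format (data fields only).
// Events are separated by a blank line; consecutive "data:" lines belong to
// the same event and are joined with "\n".
function parseEventStream(text) {
  const events = [];
  let dataLines = [];
  for (const line of text.split(/\r\n|\n|\r/)) {
    if (line === "") {
      // A blank line dispatches the pending event, if any.
      if (dataLines.length > 0) {
        events.push(dataLines.join("\n"));
        dataLines = [];
      }
    } else if (line.startsWith("data:")) {
      // Strip the field name and one optional leading space.
      let value = line.slice(5);
      if (value.startsWith(" ")) value = value.slice(1);
      dataLines.push(value);
    }
    // Other fields (event:, id:, retry:) and comments are ignored in this sketch.
  }
  return events;
}

const stream =
  "data: some text\n\ndata: another message\ndata: with two lines\n\n";
console.log(parseEventStream(stream));
// [ 'some text', 'another message\nwith two lines' ]
```

The second event shows why the blank line matters: two `data:` lines without a blank line between them form a single two-line message.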
![Page 36: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/36.jpg)
IoT example: PI0 + HTML5 Server-Sent Events
http://www.w3schools.com/html/html5_serversentevents.asp
• The web page gets updates from a server
– without asking for them; the updates arrive automatically
• Examples: Twitter updates, stock prices, news feeds, etc.
Client (demo.html)
<span id="result"></span>
<script>
  // Subscribe to the event stream served by demo.php
  var source = new EventSource("demo.php");
  source.onmessage = function (e) {
    var ref = document.getElementById("result");
    ref.innerHTML += e.data + "<br>";
  };
  source.onerror = function (e) { alert("Problems..."); };
</script>
Server (demo.php)
<?php
header('Content-Type: text/event-stream');
header('Cache-Control: no-cache');
while (1) {
  // "\n\n" must be double-quoted so PHP emits the blank line that ends the event
  echo 'data: {"time": "' . date('r') . '"}' . "\n\n";
  flush();
  sleep(1);
}
?>
![Page 37: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/37.jpg)
IoT example: PI0 + HTML5 Server-Sent Events
https://github.com/acaldero/moon
Client (demo.html)
<!DOCTYPE html>
<html>
<head> <meta charset="utf-8" /> </head>
<body>
<script>
var s = new EventSource('http://<ip>:8080');
s.onmessage = function(e) {
document.body.innerHTML += e.data + '<br>';
};
</script>
</body>
</html>
Server (demo.sh)
echo "HTTP/1.1 200 OK"
echo "Access-Control-Allow-Origin: *"
echo "Content-Type: text/event-stream"
echo "Cache-Control: no-cache"
echo ""
while true; do
  T=$(date +%H:%M:%S)
  # printf (not echo) so that \n\n becomes the blank line ending each event
  printf "data: {'timestamp': %s}\n\n" "$T"
  sleep 1
done
(demo-nc.sh)
./demo.sh | nc -l -p 8080
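For comparison, the bytes the server script writes can be produced by two small pure functions. This is a sketch in JavaScript under the assumption that the payload should be valid JSON (the slide's script sends a looser format); `ssePreamble` and `sseFrame` are hypothetical names, not part of any library.

```javascript
// Build the HTTP response preamble that the shell script prints with echo.
function ssePreamble() {
  return [
    "HTTP/1.1 200 OK",
    "Access-Control-Allow-Origin: *",
    "Content-Type: text/event-stream",
    "Cache-Control: no-cache",
    "", // end of headers
    "",
  ].join("\r\n");
}

// Serialize one event: every line of the payload gets its own "data:" prefix,
// and a blank line terminates the event.
function sseFrame(payload) {
  const text = typeof payload === "string" ? payload : JSON.stringify(payload);
  return text.split("\n").map((line) => "data: " + line).join("\n") + "\n\n";
}

console.log(JSON.stringify(sseFrame({ timestamp: "12:00:00" })));
// "data: {\"timestamp\":\"12:00:00\"}\n\n"
```

Keeping serialization separate from the socket handling makes the framing rule (one `data:` prefix per line, blank line at the end) easy to check in isolation.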
![Page 38: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/38.jpg)
Larger-scale distributed systems…
Internet of Things (IoT)
http://knowledgeblob.com/technology/a-brief-about-internet-of-things-iot/
![Page 39: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/39.jpg)
Contents
– Evolution
– Learning
– (re)Trend
– (re)Challenges
http://image.slidesharecdn.com/baronbigdatadeveloper-120418090010-phpapp02/95/the-big-data-developer-pavlobaron-1-728.jpg?cb=1338966838
![Page 40: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/40.jpg)
Ever-increasing dependence…
https://www.dcsorg.com/images/image_centralized_management.jpg http://www.hsi.es/images/cc.png
Cloud computing: ubiquitous, on-demand access to a shared pool of computing resources
![Page 41: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/41.jpg)
Ever-increasing dependence…
Criticality with no easy way back
http://kburnett.net/business-case/technology/mobility-2/
![Page 42: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/42.jpg)
New challenges…
http://tarrysingh.com/2014/07/fog-computing-happens-when-big-data-analytics-marries-internet-of-things/
![Page 43: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/43.jpg)
Big data is coming…
http://online.wsj.com/news/articles/SB10001424127887324178904578340071261396666
![Page 44: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/44.jpg)
Big data is coming…
http://online.wsj.com/news/articles/SB10001424127887324178904578340071261396666
![Page 45: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/45.jpg)
Big Data frequently used…
http://www.ndm.net/emcstore/storage/greenplum
http://fcw.com/~/media/GIG/FCWNow/Topics/Big%20Data/Big_data.png
• Text mining
• Index building
• Graph creation and analysis
• Pattern recognition
• Prediction model
• …
![Page 46: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/46.jpg)
Big Data & Big Processing…
https://www.linkedin.com/pulse/open-source-network-has-organized-workshop-big-data-unlock-sharmin
![Page 47: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/47.jpg)
Big Distributed Systems are coming…
IS473 at http://www.xpowerpoint.com/ppt/system-model-distributed-systems.html
![Page 48: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/48.jpg)
Contents
– Evolution
– Learning
– (re)Trend
– (re)Challenges
http://image.slidesharecdn.com/baronbigdatadeveloper-120418090010-phpapp02/95/the-big-data-developer-pavlobaron-1-728.jpg?cb=1338966838 http://gruposda.es/wp-content/uploads/sertecni.jpg
– Motivation
– Introduction
– Hands-on
![Page 49: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/49.jpg)
Review of options… (1/3)
http://www.scsi4me.com/images/1296909213_60.jpg
DAS: Direct Attached Storage
[Diagram: Computer containing App, File System, and two Disks]
• HW:
– Computer with storage peripherals.
• SW:
– Local file system or local database manager.
• Pros/cons:
– Speed, simplicity
– No sharing, limited growth
![Page 50: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/50.jpg)
Review of options… (2/3)
http://www.jollynas.com/nas/img/wJollyNAS.jpg
NAS: Network Attached Storage
[Diagram: Client running App, connected via NFS or CIFS to a NAS with File System and two Disks]
• HW:
– Computer with communication peripherals.
– NAS: computer with communication peripherals and storage peripherals.
• SW:
– Remote file system or remote database manager.
• Pros/cons:
– Sharing (RO), growth
– The network limits speed and growth
![Page 51: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/51.jpg)
Review of options… (3/3)
http://andysworld.org.uk/blog/wp-content/uploads/2010/05/ra4100.jpg
SAN: Storage Area Network
[Diagram: App. server running App and Shared F.S., connected via iSCSI to RAID/JBOD Disks]
• HW:
– Computer with communication peripherals.
– SAN: storage peripherals with a communication device.
• SW:
– Shared database manager or file system.
• Pros/cons:
– Better growth and speed
– Complexity, cost
![Page 52: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/52.jpg)
Combining options…
DAS (Direct Attached Storage) + NAS (Network Attached Storage) + SAN (Storage Area Network)
[Diagram: the three configurations side by side. DAS: Computer with App, File System, and two Disks. NAS: Client with App, connected via NFS or CIFS to a NAS with File System and two Disks. SAN: App. server with App and Shared F.S., connected via iSCSI to RAID/JBOD Disks.]
![Page 53: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/53.jpg)
Combining options…
NAS + SAN
[Diagram labels: App; Shared F.S. (GPFS, Lustre, …); iSCSI; Shared F.S.; SSD AFA; Disks; Disk]
![Page 54: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/54.jpg)
Combining options…
[Diagram labels: App; Shared F.S. (GPFS, Lustre, …); iSCSI; Shared F.S. exposing Blocks, Files, Objs.; SSD AFA; Disks; Disk]
![Page 55: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/55.jpg)
Combining options…
[Diagram labels: App; Shared F.S. (GPFS, Lustre, …); iSCSI; Shared F.S. exposing Blocks, Files, Objs.; SSD AFA; Disks; Disk]
• Objs.: S3, IL7, …
• Files: NFS, CEPH, …
• Blocks: iSCSI, FCoE, …
![Page 56: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/56.jpg)
Combining options…
[Diagram labels: App; Blocks; Files; Objs.]
• Objs.: S3, IL7, …
• Files: NFS, CEPH, …
• Blocks: iSCSI, FCoE, …
Large storage with the capacity to “grow”
![Page 57: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/57.jpg)
Problems…
[Diagram labels: App; Shared F.S. (GPFS, Lustre, …); iSCSI; Shared F.S.; SSD AFA; Disks; Disk]
• Large amount of traffic caused by data movement.
![Page 58: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/58.jpg)
Storage options…
[Diagram labels: App; Shared F.S. (GPFS, Lustre, …); iSCSI; Shared F.S.; SSD AFA; Disks; Disk; blocks A0 A1 A2 A3 A4 A5]
• Large amount of traffic caused by data movement.
• It can be reduced by moving part of the computation closer to the storage.
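A back-of-the-envelope sketch of the second bullet, in JavaScript. All numbers are hypothetical (six 64 MiB blocks named after A0..A5 in the diagram, a filter that keeps 1% of the data, 4 KiB of code shipped per node), and the function names are illustrative only.

```javascript
// Bytes crossing the network when the application pulls every block
// from shared storage and filters the data locally.
function trafficMovingData(blockSizes) {
  return blockSizes.reduce((sum, size) => sum + size, 0);
}

// Bytes crossing the network when each storage node runs the filter next to
// its block and returns only the matching fraction, plus the shipped code.
function trafficMovingCompute(blockSizes, selectivity, codeSize) {
  return blockSizes.reduce(
    (sum, size) => sum + codeSize + Math.ceil(size * selectivity),
    0
  );
}

const MiB = 1024 * 1024;
const blocks = Array(6).fill(64 * MiB); // A0..A5, hypothetical 64 MiB each

console.log(trafficMovingData(blocks) / MiB); // 384
console.log(trafficMovingCompute(blocks, 0.01, 4096) / MiB);
```

Under these assumptions the second figure is a small fraction of the first; the gain disappears when the computation keeps most of the data (selectivity close to 1).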
![Page 59: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/59.jpg)
Storage options…
[Diagram labels: App; Shared F.S. (GPFS, Lustre, …); iSCSI; Shared F.S.; SSD AFA; Disks; Disk; blocks A0 A1 A2 A3 A4 A5]
• Large amount of traffic caused by data movement.
• It can be reduced by moving part of the computation closer to the storage.
![Page 60: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/60.jpg)
Storage options…
[Diagram labels: App; Shared F.S. (GPFS, Lustre, …); iSCSI; Shared F.S.; SSD AFA; Disks; Disk; blocks A0 A1 A2 A3 A4 A5]
• Large amount of traffic caused by data movement.
• It can be reduced by moving part of the computation closer to the storage.
http://rsoltd.com/wp-content/uploads/2014/04/expensive.jpg
![Page 61: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/61.jpg)
Find/build the right tool…
http://storageio.com/images/SIO_ToolBox.png
![Page 62: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/62.jpg)
The Google File System
• Google presents MapReduce and the Google File System (GFS)
– MapReduce is a proposal for applying the same function to partitions of the data (map) and then obtaining the final result by processing the partial results (reduce)
– GFS is a proposal for storing petabytes of data on many commodity machines, dealing with failures, distribution, etc.
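The map/reduce idea described above can be sketched on a single machine. The following JavaScript example is an illustration only, not Google's or Hadoop's implementation; `mapReduce`, `wordCountMap`, and `wordCountReduce` are hypothetical names, and a real system would run the map calls on different machines.

```javascript
// Single-process sketch of the MapReduce programming model.
function mapReduce(partitions, mapFn, reduceFn) {
  // Map phase: apply the same function to every partition independently.
  const pairs = partitions.flatMap((partition) => mapFn(partition));

  // Shuffle: group the intermediate values by key.
  const groups = new Map();
  for (const [key, value] of pairs) {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  }

  // Reduce phase: combine the partial results collected for each key.
  const result = {};
  for (const [key, values] of groups) result[key] = reduceFn(key, values);
  return result;
}

// Word count: map emits [word, 1], reduce sums the ones.
const wordCountMap = (text) =>
  text.toLowerCase().split(/\s+/).filter(Boolean).map((word) => [word, 1]);
const wordCountReduce = (_word, counts) => counts.reduce((a, b) => a + b, 0);

const partitions = ["big data big systems", "big systems scale"];
console.log(mapReduce(partitions, wordCountMap, wordCountReduce));
// { big: 3, data: 1, systems: 2, scale: 1 }
```

Because each partition is mapped without looking at the others, the map phase can be spread over as many machines as there are partitions.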
http://static.googleusercontent.com/media/research.google.com/en//archive/gfs-sosp2003.pdf
![Page 63: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/63.jpg)
From this option…
[Diagram labels: App; Shared F.S. (GPFS, Lustre, …); iSCSI; Shared F.S.; SSD AFA; Disks; Disk; blocks A0 A1 A2 A3 A4 A5]
![Page 64: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/64.jpg)
…to this option
[Diagram: App and eight GFS nodes connected by a Network; each node has a local Disk holding one block: A0, A1, A3, A4, A5, A6, A7, A8]
![Page 65: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/65.jpg)
…to this option
[Diagram: App and eight GFS nodes connected by a Network; each node has a local Disk holding one block: A0, A1, A3, A4, A5, A6, A7, A8]
https://upload.wikimedia.org/wikipedia/commons/thumb/7/7c/Yin_and_Yang.svg/1024px-Yin_and_Yang.svg.png
Data
Computation
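Pairing data with computation amounts to running each task on a node that already stores the block it needs. A minimal JavaScript sketch follows, assuming each block has replicas on several nodes (GFS keeps multiple replicas of each chunk); the node names and the function `scheduleNearData` are hypothetical, and real schedulers weigh many more factors.

```javascript
// blockLocations: block id -> list of nodes that store a replica.
// Returns block id -> chosen node, preferring the replica holder with the
// fewest tasks assigned so far (ties go to the first node listed).
function scheduleNearData(blockLocations) {
  const load = new Map();
  const plan = {};
  for (const [block, nodes] of Object.entries(blockLocations)) {
    let best = nodes[0];
    for (const node of nodes) {
      if ((load.get(node) || 0) < (load.get(best) || 0)) best = node;
    }
    plan[block] = best;
    load.set(best, (load.get(best) || 0) + 1);
  }
  return plan;
}

const locations = {
  A0: ["node1", "node2"],
  A1: ["node1", "node3"],
  A3: ["node2", "node3"],
};
console.log(scheduleNearData(locations));
// { A0: 'node1', A1: 'node3', A3: 'node2' }
```

Every task lands on a node that holds its block, so only results, not input blocks, need to cross the network.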
![Page 66: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/66.jpg)
Hadoop: brief history
• Doug Cutting, working at Yahoo! and inspired by these technologies, starts developing Hadoop.
– The project uses Java and implements the ideas behind GFS and MapReduce.
– The logo is based on the yellow elephant that was his son's favorite toy.
– Doug Cutting later moved to Cloudera.
http://www.enriquedans.com/2011/11/hadoop-el-elefante-omnipresente.html
![Page 67: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/67.jpg)
Hadoop: brief history
• It is currently an open-source project under the Apache license.
– Its license has made it easier for a significant number of companies to adopt it.
• Support from IBM, Oracle, EMC, etc. has accelerated its adoption and its performance improvements.
– Widely used in big data projects.
– JPMorgan Chase: “We’re hiring, and we’re paying 10% more than the other guys.”
http://www.informationweek.com/software/information-management/its-next-hot-job-hadoop-guru/d/d-id/1101209?
![Page 68: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/68.jpg)
http://hadoop.apache.org
• Apache Hadoop is an open-source software project for reliable, scalable, distributed computing.
• It offers a framework that allows the distributed processing of large data sets across clusters of computers using simple programming models.
– Designed to scale up from a single server to thousands of machines, each offering both computation and storage.
– The framework is designed to detect and handle failures at the application level. This makes it possible to offer highly available services on top of a cluster of computers (even though the computers themselves may fail).
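Handling failures at the application level can be as simple as re-running a failed task on another machine. The sketch below, in JavaScript, only illustrates that idea; it is not Hadoop code, and `runWithReexecution`, the node names, and the simulated failure are all hypothetical.

```javascript
// Try the task on each candidate node in turn; a failure on one node is
// recorded and the task is re-executed on the next one.
function runWithReexecution(task, nodes) {
  const failures = [];
  for (const node of nodes) {
    try {
      return { node, value: task(node), failures };
    } catch (err) {
      failures.push(node + ": " + err.message);
    }
  }
  throw new Error("task failed on every node: " + failures.join("; "));
}

// Simulated cluster: node "n1" is down, the others work.
const flakyTask = (node) => {
  if (node === "n1") throw new Error("node unreachable");
  return "done on " + node;
};

const outcome = runWithReexecution(flakyTask, ["n1", "n2", "n3"]);
console.log(outcome.value); // done on n2
console.log(outcome.failures); // [ 'n1: node unreachable' ]
```

The caller still gets a result even though one machine failed, which is the sense in which the service stays available on unreliable hardware.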
http://hadoop.apache.org/
![Page 69: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/69.jpg)
B.D.A. Workflow example…
http://www.infoivy.com/2013/12/5-steps-for-big-data-application.html
![Page 70: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/70.jpg)
Big Opportunities…
http://online.wsj.com/news/articles/SB10001424127887324178904578340071261396666
![Page 71: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/71.jpg)
Big Opportunities…
http://www.indeed.com/jobtrends?q=Big-data&relative=1
![Page 72: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/72.jpg)
Contents
– (re)Evolution
– (re)Learning
– Motivation
– Introduction
– Hands-on
http://www.siliconweek.es/wp-content/uploads/2013/08/BigData-datos-guardar-almacenamiento-fichero-archivo.jpg
http://datameer2.datameer.com/blog/wp-content/uploads/2012/06/Hadoop-Ecosystem-Infographic-21.png
![Page 73: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/73.jpg)
Architecture
http://www.monitis.com/blog/2013/12/19/big-data-and-hadoop-whats-it-all-about/
![Page 74: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/74.jpg)
Architecture
http://www.sachinpbuzz.com/2014/01/big-data-overview-of-apache-hadoop.html
![Page 75: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/75.jpg)
Deployment
http://blog.csdn.net/suifeng3051/article/details/17288047
![Page 76: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/76.jpg)
Deployment
http://blog.csdn.net/suifeng3051/article/details/17288047
![Page 77: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/77.jpg)
Deployment
http://blog.csdn.net/suifeng3051/article/details/17288047
[Figure annotations: `dfs.replication` (set in `hdfs-site.xml`), port `:9000`, port `:50010`]
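The `dfs.replication` setting controls how many copies of each block are kept. A minimal Python sketch of why that matters follows. It assumes a simple round-robin placement, which is a hypothetical simplification (real HDFS placement also takes racks and load into account), and the names `place_replicas`, `surviving_blocks` and the `dn1`…`dn4` nodes are illustrative only.

```python
def place_replicas(num_blocks, datanodes, replication=3):
    """Assign each block to `replication` distinct DataNodes, round-robin.

    Hypothetical placement policy, used only to illustrate replication.
    """
    if replication > len(datanodes):
        raise ValueError("replication factor exceeds number of DataNodes")
    placement = {}
    for block in range(num_blocks):
        placement[block] = [datanodes[(block + i) % len(datanodes)]
                            for i in range(replication)]
    return placement


def surviving_blocks(placement, failed):
    """Return the blocks that still have at least one replica on a live node."""
    return {block for block, nodes in placement.items()
            if any(node not in failed for node in nodes)}


nodes = ["dn1", "dn2", "dn3", "dn4"]
layout = place_replicas(num_blocks=6, datanodes=nodes, replication=3)
print(layout[0])                                             # ['dn1', 'dn2', 'dn3']
print(len(surviving_blocks(layout, failed={"dn1", "dn2"})))  # 6
```

With three distinct replicas per block, any two simultaneous node failures still leave every block readable in this model.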
![Page 78: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/78.jpg)
Deployment
http://blog.csdn.net/suifeng3051/article/details/17288047
![Page 79: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/79.jpg)
Deployment
http://blog.csdn.net/suifeng3051/article/details/17288047
![Page 80: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/80.jpg)
Deployment
http://blog.csdn.net/suifeng3051/article/details/17288047
![Page 81: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/81.jpg)
Deployment
http://blog.csdn.net/suifeng3051/article/details/17288047
![Page 82: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/82.jpg)
![Page 83: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/83.jpg)
Hadoop: single node (Prerequisites · Installation · Basic usage)
[State diagram. States: initial, inactive, active. Transitions: `hdfs namenode -format`, `start-all.sh`, `stop-all.sh`. Operations: <hdfs>, <mapReduce>, <monitoring>.]
![Page 84: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/84.jpg)
![Page 85: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/85.jpg)
Hadoop: single node (Prerequisites · Installation · Basic usage)
hduser@h1:~$ hdfs namenode -format
14/09/25 23:02:59 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = h1/127.0.1.1
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 2.5.2
…
14/09/27 23:07:07 INFO blockmanagement.BlockManager: encryptDataTransfer = false
14/09/27 23:07:07 INFO namenode.FSNamesystem: fsOwner = hduser (auth:SIMPLE)
…
14/09/25 23:03:04 INFO util.ExitUtil: Exiting with status 0
14/09/25 23:03:04 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at h1/127.0.1.1
************************************************************/
![Page 86: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/86.jpg)
![Page 87: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/87.jpg)
Hadoop: single node (Prerequisites · Installation · Basic usage)
hduser@h1:~$ start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
14/09/28 13:31:35 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: starting namenode, logging to /usr/local/hadoop/logs/hadoop-hduser-namenode-h1.out
localhost: starting datanode, logging to /usr/local/hadoop/logs/hadoop-hduser-datanode-h1.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop/logs/hadoop-hduser-secondarynamenode-h1.out
14/09/28 13:32:03 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop/logs/yarn-hduser-resourcemanager-h1.out
localhost: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-hduser-nodemanager-h1.out
![Page 88: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/88.jpg)
Hadoop: single node (Prerequisites · Installation · Basic usage)
hduser@h1:~$ stop-all.sh
This script is Deprecated. Instead use stop-dfs.sh and stop-yarn.sh
14/09/28 13:33:22 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Stopping namenodes on [localhost]
localhost: stopping namenode
localhost: stopping datanode
Stopping secondary namenodes [0.0.0.0]
0.0.0.0: stopping secondarynamenode
14/09/28 13:33:47 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
stopping yarn daemons
stopping resourcemanager
localhost: stopping nodemanager
no proxyserver to stop
![Page 89: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/89.jpg)
![Page 90: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/90.jpg)
• NameNode: http://localhost:50070/
Hadoop: single node (Prerequisites · Installation · Basic usage)
![Page 91: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/91.jpg)
• SecondaryNameNode: http://localhost:50090/
Hadoop: single node (Prerequisites · Installation · Basic usage)
![Page 92: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/92.jpg)
• DataNode: http://localhost:50075/
Hadoop: single node (Prerequisites · Installation · Basic usage)
![Page 93: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/93.jpg)
![Page 94: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/94.jpg)
Hadoop: single node (Prerequisites · Installation · Basic usage)
: create a directory
hduser@h1:~$ hadoop fs -mkdir -p /user/hduser
: copy a file from the local filesystem to HDFS
hduser@h1:~$ echo "hdfs test" > hdfsTest.txt
hduser@h1:~$ hadoop fs -copyFromLocal hdfsTest.txt hdfsTest.txt
: list the contents of a directory
hduser@h1:~$ hadoop fs -ls
: show the contents of a file
hduser@h1:~$ hadoop fs -cat /user/hduser/hdfsTest.txt
: copy a file from HDFS to the local filesystem
hduser@h1:~$ hadoop fs -copyToLocal /user/hduser/hdfsTest.txt hdfsTest2.txt
: delete a file
hduser@h1:~$ hadoop fs -rm hdfsTest.txt
![Page 95: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/95.jpg)
Hadoop: single node (Prerequisites · Installation · Basic usage)
hduser@h1:~$ wget http://www.gutenberg.org/files/2000/old/2donq10.txt
…
2014-10-04 12:53:30 (1,10 MB/s) - ‘2donq10.txt’ saved [2143292/2143292]
hduser@h1:~$ dos2unix -n 2donq10.txt dq.txt
dos2unix: converting file 2donq10.txt to file dq.txt in Unix format ...
hduser@h1:~$ hadoop fs -copyFromLocal -f dq.txt /user/hduser/dq.txt
hduser@h1:~$ hadoop fs -ls /user/hduser
…
Found 1 items
-rw-r--r-- 3 hduser supergroup 2143292 2014-10-04 13:09 /user/hduser/dq.txt
![Page 96: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/96.jpg)
![Page 97: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/97.jpg)
Hadoop: single node (Prerequisites · Installation · Basic usage)
Native: Java
Wrapped: Perl, Python, …
![Page 98: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/98.jpg)
Hadoop: single node (Prerequisites · Installation · Basic usage)
hduser@h1:~$ hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.2.jar pi 2 5
Number of Maps = 2
Samples per Map = 5
…
Job Finished in 11.536 seconds
Estimated value of Pi is 3.60000000000000000000
http://www.bogotobogo.com/Hadoop/BigData_hadoop_Running_MapReduce_Job.php
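The estimate of 3.6 follows from the tiny sample: with 2 maps × 5 samples = 10 points, the result 4 × inside / total can only move in steps of 0.4, and 3.6 corresponds to 9 of the 10 points falling inside the circle. The Python sketch below illustrates the same map/reduce structure. It is a simplification under stated assumptions: it uses plain pseudo-random sampling on a quarter circle rather than the sampling scheme of the bundled Hadoop example, and `estimate_pi` and its `seed` parameter are hypothetical names.

```python
import random


def estimate_pi(num_maps, samples_per_map, seed=0):
    """Monte Carlo estimate of pi, organised like the map/reduce job.

    Assumption: pseudo-random points in the unit square; this is not the
    sampling scheme used by the Hadoop example jar.
    """
    rng = random.Random(seed)
    inside = 0
    total = num_maps * samples_per_map
    for _ in range(num_maps):              # each "map task" draws its samples
        for _ in range(samples_per_map):
            x, y = rng.random(), rng.random()
            if x * x + y * y <= 1.0:       # point lies inside the quarter circle
                inside += 1
    return 4.0 * inside / total            # "reduce": combine the counts


coarse = estimate_pi(num_maps=2, samples_per_map=5)      # a multiple of 0.4
fine = estimate_pi(num_maps=20, samples_per_map=10000)   # much closer to pi
print(coarse, fine)
```

Raising the number of maps or samples per map (the two arguments after `pi` in the command above) tightens the estimate.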
![Page 99: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/99.jpg)
Hadoop: single node (Prerequisites · Installation · Basic usage)
package org.myorg;
import java.io.IOException;
import java.util.*;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.*;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
http://wiki.apache.org/hadoop/WordCount
![Page 100: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/100.jpg)
Hadoop: single node (Prerequisites · Installation · Basic usage)
public class WordCount {

    // Mapper: emits (word, 1) for every token of each input line.
    public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                context.write(word, one);
            }
        }
    }
http://wiki.apache.org/hadoop/WordCount
![Page 101: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/101.jpg)
Hadoop: single node (Prerequisites · Installation · Basic usage)
    // Reducer: sums the counts emitted for each word.
    public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }
http://wiki.apache.org/hadoop/WordCount
![Page 102: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/102.jpg)
Hadoop: single node (Prerequisites · Installation · Basic usage)
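    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Job.getInstance replaces the Job constructor, deprecated in Hadoop 2.x.
        Job job = Job.getInstance(conf, "wordcount");
        // Lets Hadoop locate the jar that contains these classes on a cluster.
        job.setJarByClass(WordCount.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        // args[0]: input path; args[1]: output path.
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        // Report success or failure through the process exit code.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
} // class WordCount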
http://wiki.apache.org/hadoop/WordCount
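To see what the framework does between `Map.map` and `Reduce.reduce`, the three phases can be imitated in a few lines of plain Python. This is only an in-process sketch, not Hadoop code: the function names `map_phase`, `shuffle`, `reduce_phase` and `word_count` are hypothetical, and the grouping step stands in for the distributed shuffle and sort that Hadoop performs.

```python
from collections import defaultdict


def map_phase(line):
    """Mirror of Map.map: emit (word, 1) for each whitespace-separated token."""
    return [(word, 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Mirror of Reduce.reduce: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce over an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


print(word_count(["uno uno dos", "dos tres"]))
# {'uno': 2, 'dos': 2, 'tres': 1}
```

The reducer receives every value for a key at once, which is why the Java `reduce` method takes an `Iterable<IntWritable>`.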
![Page 103: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/103.jpg)
Hadoop: single node (Prerequisites · Installation · Basic usage)
hduser@h1:/usr/local/hadoop$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar wordcount /user/hduser/dq.txt /user/hduser/counterj
14/10/04 16:33:36 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
14/10/04 16:33:37 INFO input.FileInputFormat: Total input paths to process : 1
14/10/04 16:33:37 INFO mapreduce.JobSubmitter: number of splits:1
14/10/04 16:33:38 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local835374884_0001
…
File Input Format Counters
Bytes Read=2106143
File Output Format Counters
Bytes Written=454722
hduser@h1:/usr/local/hadoop$ hadoop fs -cat /user/hduser/counterj/* | sort -n -k 2 -r|head -5
…
que 19429
de 17986
y 15887
la 10199
a 9502
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
![Page 104: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/104.jpg)
![Page 105: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/105.jpg)
Hadoop Streaming API
http://www.michael-noll.com/tutorials/writing-an-hadoop-mapreduce-program-in-python/
mapper.sh
STDIN:
En un lugar…
awk '{i=1; while (i<=NF) {gsub(/[\.,;]/,"",$i); print tolower($i)" "1; i++;}}'
STDOUT:
en 1
un 1
lugar 1
…
![Page 106: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/106.jpg)
Hadoop Streaming API
http://www.michael-noll.com/tutorials/writing-an-hadoop-mapreduce-program-in-python/
reducer.sh
STDIN (sorted by key):
en 1
un 1
lugar 1
…
sed 's/ 1$//g' |uniq -c| awk '{print $2" "$1}'|sed 's/^$//g'
STDOUT:
en 10
un 20
lugar 3
…
![Page 107: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/107.jpg)
Hadoop: single node (Prerequisites · Installation · Basic usage)
hduser@h1:~$ echo "uno uno dos dos tres" | ./mapper.sh | more
…
hduser@h1:~$ echo "uno uno dos dos tres" | ./mapper.sh | sort | more
…
hduser@h1:~$ echo "uno uno dos dos tres" | ./mapper.sh | sort | ./reducer.sh | more
…
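The same local test can be reproduced in Python, the other language the Streaming approach is usually shown with. The sketch below re-implements the logic of `mapper.sh` and `reducer.sh` as generator functions over lines; it is an illustrative equivalent, not the deck's code. The names `mapper` and `reducer` are hypothetical, and real Streaming scripts would read `sys.stdin` and print to standard output instead of taking arguments.

```python
import re
from itertools import groupby


def mapper(lines):
    """Like mapper.sh: strip '.', ',' and ';', lower-case, emit 'word 1'."""
    for line in lines:
        for token in line.split():
            word = re.sub(r"[.,;]", "", token).lower()
            yield f"{word} 1"


def reducer(sorted_lines):
    """Like reducer.sh: count runs of identical keys (input must be sorted)."""
    keys = (line.rsplit(" ", 1)[0] for line in sorted_lines)
    for word, group in groupby(keys):
        yield f"{word} {sum(1 for _ in group)}"


text = ["uno uno dos dos tres"]
mapped = list(mapper(text))                # what `| ./mapper.sh` would print
result = list(reducer(sorted(mapped)))     # `| sort | ./reducer.sh`
print(result)
# ['dos 2', 'tres 1', 'uno 2']
```

As with `uniq -c`, the reducer only merges adjacent equal keys, so the `sort` step between mapper and reducer is essential. Because this mapper lower-cases words and strips punctuation, its counts can differ from those of a tokenizer that does neither.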
![Page 108: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/108.jpg)
Hadoop: single node (Prerequisites · Installation · Basic usage)
hduser@h1:/usr/local/hadoop$ hadoop jar share/hadoop/tools/lib/hadoop-streaming-2.5.2.jar \
    -file ./mapper.sh -mapper ./mapper.sh \
    -file ./reducer.sh -reducer ./reducer.sh \
    -input /user/hduser/ -output /user/hduser/counter
packageJobJar: [./mapper.sh, ./reducer.sh] [] /tmp/streamjob724842872862965882.jar tmpDir=null
14/10/04 15:48:02 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
…
File Input Format Counters
Bytes Read=2106143
File Output Format Counters
Bytes Written=320124
14/10/04 15:48:46 INFO streaming.StreamJob: Output directory: /user/hduser/counter
hduser@h1:/usr/local/hadoop$ hadoop fs -cat /user/hduser/counter/part-00000|sort -n -k 2 -r|head -5
…
que 20545
de 18154
y 18053
la 10338
a 9779
http://hadoop.apache.org/docs/r1.1.2/streaming.html#Hadoop+Streaming
![Page 109: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/109.jpg)
Contents
– (re)Evolution
– (re)Learning
– Motivation
– Introduction
– Hands-on
– Alternatives
– Ecosystem
http://www.siliconweek.es/wp-content/uploads/2013/08/BigData-datos-guardar-almacenamiento-fichero-archivo.jpg
http://datameer2.datameer.com/blog/wp-content/uploads/2012/06/Hadoop-Ecosystem-Infographic-21.png
![Page 110: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/110.jpg)
Alternatives
https://www.simple-talk.com/cloud/data-science/analyze-big-data-with-apache-hadoop-on-windows-azure-preview-service-update-3/
![Page 111: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/111.jpg)
Google Dataflow
• Data pipelines that help to ingest, transform, and analyze data.
http://techcrunch.com/2014/06/25/google-launches-cloud-dataflow-a-managed-data-processing-service/
![Page 112: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/112.jpg)
Google Dataflow
https://4.bp.blogspot.com/-RlLeDymI_mU/Vp-1cb3AxNI/AAAAAAAACSQ/5TphliHJA4w/s1600/dataflow%2BASF.png
![Page 113: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/113.jpg)
Google Dataflow
http://www.slideshare.net/GoogleCloudPlatformJP/google-cloud-dataflow-bqsushi
![Page 114: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/114.jpg)
Google Dataflow
https://cloud.google.com/solutions/processing-logs-at-scale-using-dataflow
![Page 115: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/115.jpg)
• Speed
– Avoids writing intermediate results to disk whenever possible
• Ease of use
– Transformations and operations
• Versatile
– Available in Scala, Python, Java
– Combines SQL, streaming, analytics…
– Works with HDFS, HBase, S3, …
https://spark.apache.org/
![Page 116: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/116.jpg)
http://aptuz.com/blog/is-apache-spark-going-to-replace-hadoop/
Hadoop vs. Spark

| | Hadoop | Spark |
|---|---|---|
| Data | Intermediate results on disk | Keep as much as possible in memory |
| Fault tolerance for data | HDFS (2+1 copies) | RDD |
| Processing | Map/reduce jobs | DAG |
| Programming | Java + other languages through pipes | Scala, Python, Java, etc. |
| Scenario | Slow batch jobs | Batch + real-time + iterative + interactive |
| | (More) disk and network I/O than Spark | (More) RAM than Hadoop |

• General-purpose framework
• An alternative to Hadoop rather than a replacement for it
![Page 117: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/117.jpg)
https://geekytheory.com/apache-spark-que-es-y-como-funciona/
http://hadoopgeek.com/wp-content/uploads/2013/12/hdfswrite.png
http://www.slideshare.net/cfregly/spark-streaming-40659876
HDFS (HDFS + 3 copies) vs. RDD (Resilient Distributed Dataset)
HDFS:
• Data is read in blocks
– 64 MB by default in Hadoop 1.x (128 MB in Hadoop 2.x)
• Used like files
RDD:
• Data is read as an RDD
– Immutable, recomputable, in memory
• DataFrame = RDD[row] + schema
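A short worked example of the block arithmetic: the Python sketch below splits a file size into fixed-size blocks and multiplies by the number of copies. It assumes the 64 MB block size and 3 copies quoted on the slide, and uses the 2143292-byte `dq.txt` from the earlier listing as input; `split_into_blocks` and `stored_bytes` are hypothetical helper names, not HDFS APIs.

```python
MIB = 1024 * 1024


def split_into_blocks(file_size, block_size=64 * MIB):
    """Return the sizes of the fixed-size blocks a file of `file_size` bytes needs."""
    full, rest = divmod(file_size, block_size)
    return [block_size] * full + ([rest] if rest else [])


def stored_bytes(file_size, replication=3):
    """Raw bytes kept across the cluster when every block has `replication` copies."""
    return file_size * replication


dq = 2143292                                             # dq.txt, in bytes
print(len(split_into_blocks(dq)))                        # 1
print(stored_bytes(dq))                                  # 6429876
print([b // MIB for b in split_into_blocks(200 * MIB)])  # [64, 64, 64, 8]
```

A file smaller than one block occupies a single block, which is consistent with the `number of splits:1` line in the WordCount job log; the last block of a larger file only holds the remaining bytes.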
![Page 118: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/118.jpg)
https://geekytheory.com/apache-spark-que-es-y-como-funciona/
M-R (Map-Reduce) vs. DAG (Directed Acyclic Graph)
M-R:
• DAG with 2 stages: Map + Reduce
• Intermediate data written to disk
DAG:
• DAG with N stages
• Intermediate data kept in memory
[Figure: Map-Reduce data flow annotated with ~MPI_Scatter, ~MPI_all-to-all and ~MPI_Gather, next to a DAG with nodes numbered 1 to 7.]
![Page 119: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/119.jpg)
https://geekytheory.com/apache-spark-que-es-y-como-funciona/
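HDFS + M-R vs. RDD + DAG
HDFS + M-R:
• Operations on blocks:
– Read in the Map phase (M)
– Written in the Reduce phase (R)
RDD + DAG:
• Two kinds of operations:
– Transformations (RDD[] -> RDD'[])
– Actions (RDD[] -> value)
[Figure: Map-Reduce flow over blocks (B1, B2, C1, C2) annotated with ~MPI_Scatter, ~MPI_all-to-all and ~MPI_Gather, next to a DAG of transformations (T) and actions (A).]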
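The split between transformations and actions is easier to grasp with a toy model. The Python class below is a hypothetical stand-in for an RDD, not Spark's API: transformations only extend a recorded lineage and return a new immutable object, while actions replay that lineage and return a plain value. The `reads` list exists only to show when the source is actually read.

```python
class LazyDataset:
    """Toy stand-in for an RDD: immutable, lazily evaluated, recomputable."""

    def __init__(self, source, lineage=()):
        self._source = source        # callable that returns the base records
        self._lineage = lineage      # tuple of (kind, function) steps

    # Transformations: return a new dataset and compute nothing.
    def map(self, fn):
        return LazyDataset(self._source, self._lineage + (("map", fn),))

    def flat_map(self, fn):
        return LazyDataset(self._source, self._lineage + (("flat_map", fn),))

    def filter(self, fn):
        return LazyDataset(self._source, self._lineage + (("filter", fn),))

    # Actions: replay the lineage from the source and return a plain value.
    def collect(self):
        records = list(self._source())
        for kind, fn in self._lineage:
            if kind == "map":
                records = [fn(r) for r in records]
            elif kind == "flat_map":
                records = [out for r in records for out in fn(r)]
            else:  # "filter"
                records = [r for r in records if fn(r)]
        return records

    def count(self):
        return len(self.collect())


reads = []


def load():
    reads.append(1)                  # note every time the source is read
    return ["en un lugar", "de la Mancha"]


words = LazyDataset(load).flat_map(str.split).map(str.lower)
print(len(reads))                                    # 0
print(words.filter(lambda w: len(w) > 2).collect())  # ['lugar', 'mancha']
print(len(reads))                                    # 1
```

Because each dataset keeps only its source and lineage, a lost result can be rebuilt by replaying the steps, which is the sense in which an RDD is recomputable.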
![Page 120: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/120.jpg)
http://aptuz.com/blog/is-apache-spark-going-to-replace-hadoop/
![Page 121: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/121.jpg)
https://www.youtube.com/watch?v=x8xXXqvhZq8
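import org.apache.spark.{SparkConf, SparkContext}

// `path` is assumed to hold the location of the input text file.
val conf = new SparkConf().setMaster("local[2]").setAppName("WordCount") // an application name is required
val sc = new SparkContext(conf)
val lines = sc.textFile(path, 2)            // RDD of lines, at least 2 partitions
val words = lines.flatMap(_.split(" "))     // transformation
val pairs = words.map(word => (word, 1))    // transformation
val wordCounts = pairs.reduceByKey(_ + _)   // transformation (requires a shuffle)
val localValues = wordCounts.take(100)      // action: triggers the computation
localValues.foreach(r => println(r))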
![Page 122: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/122.jpg)
![Page 123: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/123.jpg)
![Page 124: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/124.jpg)
Contents
– (re)Evolution
– (re)Learning
– Motivation
– Introduction
– Hands-on
– Ecosystem
http://www.siliconweek.es/wp-content/uploads/2013/08/BigData-datos-guardar-almacenamiento-fichero-archivo.jpg
http://datameer2.datameer.com/blog/wp-content/uploads/2012/06/Hadoop-Ecosystem-Infographic-21.png
![Page 125: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/125.jpg)
Ecosystem
http://ambuj4bigdata.blogspot.com.es/2014_05_01_archive.html
![Page 126: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/126.jpg)
Ecosystem
http://ambuj4bigdata.blogspot.com.es/2014_05_01_archive.html
![Page 127: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/127.jpg)
Storage
http://www.nextree.co.kr/p2865/
![Page 128: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/128.jpg)
Processing
http://www.nextree.co.kr/p2865/
![Page 129: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/129.jpg)
Ecosystem
http://ambuj4bigdata.blogspot.com.es/2014_05_01_archive.html
![Page 130: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/130.jpg)
CDH (Cloudera's Distribution including Apache Hadoop)
http://www.theregister.co.uk/2012/06/05/cloudera_cdh4_hadoop_stack/
![Page 131: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/131.jpg)
AEMR (Amazon Elastic MapReduce)
http://aws.amazon.com/es/elasticmapreduce/
![Page 132: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/132.jpg)
Ecosystem
http://ambuj4bigdata.blogspot.com.es/2014_05_01_archive.html
![Page 133: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/133.jpg)
Ecosystem
https://practicalanalytics.files.wordpress.com/2011/11/hadoopevolution.png
![Page 134: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/134.jpg)
Ecosystem (functionality)
http://mattturck.com/wp-content/uploads/2016/03/Big-Data-Landscape-2016-v18-FINAL.png + http://dfkoz.com/big-data-landscape/
![Page 135: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/135.jpg)
Ecosystem (functionality)
https://s-media-cache-ak0.pinimg.com/originals/0d/4f/0d/0d4f0d8aad9d144c52c696c603b8a27c.jpg
![Page 136: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/136.jpg)
Ecosystem (structure)
http://www.nextree.co.kr/p2865/
![Page 137: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/137.jpg)
Storage
http://www.nextree.co.kr/p2865/
![Page 138: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/138.jpg)
Processing
http://www.nextree.co.kr/p2865/
-- Pig Latin
purchases = LOAD '/data/purchases' AS (id, client_id, product_id, price);
bigpurchases = FILTER purchases BY price > 1000;
…
-- HiveQL
SELECT * FROM purchases WHERE price > 1000 ORDER BY client_id
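Both snippets on this slide express the same kind of query: keep the purchases whose price is above 1000 (the Hive version also orders them by client_id). As a rough illustration of that logic only, here is a minimal Python sketch. The sample records, the `big_purchases` and `order_by_client` helper names, and the in-memory list are assumptions made for this example; on a real cluster the data would be read from HDFS and the work would be distributed by Pig or Hive.

```python
# Hypothetical sample records standing in for the contents of /data/purchases.
purchases = [
    {"id": 1, "client_id": 20, "product_id": 7, "price": 1500},
    {"id": 2, "client_id": 10, "product_id": 3, "price": 250},
    {"id": 3, "client_id": 10, "product_id": 9, "price": 4000},
]


def big_purchases(rows, threshold=1000):
    """Keep rows whose price exceeds the threshold (the FILTER / WHERE step)."""
    return [row for row in rows if row["price"] > threshold]


def order_by_client(rows):
    """Sort rows by client_id (the ORDER BY step of the Hive query)."""
    return sorted(rows, key=lambda row: row["client_id"])


result = order_by_client(big_purchases(purchases))
for row in result:
    print(row)
```

The point of Pig and Hive is that the same two steps are written declaratively and compiled into distributed jobs, so the user does not write the loop or the sort by hand.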
![Page 139: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/139.jpg)
Storage: integration
http://www.nextree.co.kr/p2865/
![Page 140: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/140.jpg)
Processing: orchestration
http://www.nextree.co.kr/p2865/
![Page 141: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/141.jpg)
Example: Flume-Hive-Oozie
http://www.datadansandler.com/2013/03/big-data-and-hadoop-assets-available-on.html
![Page 142: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/142.jpg)
Ecosystem
http://ambuj4bigdata.blogspot.com.es/2014_05_01_archive.html
![Page 143: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/143.jpg)
Contents
– (re)Evolution
– (re)Learning
– Motivation
– Introduction
– Hands-on
– Ecosystem
http://www.siliconweek.es/wp-content/uploads/2013/08/BigData-datos-guardar-almacenamiento-fichero-archivo.jpg
http://datameer2.datameer.com/blog/wp-content/uploads/2012/06/Hadoop-Ecosystem-Infographic-21.png
![Page 144: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/144.jpg)
Bibliography: tutorials
• Official website:
– http://hadoop.apache.org/
• Introduction to how Hadoop works:
– http://blog.csdn.net/suifeng3051/article/details/17288047
• Tutorial on installing and using Hadoop:
– http://www.bogotobogo.com/Hadoop/BigData_hadoop_Install_on_ubuntu_single_node_cluster.php
– http://www.bogotobogo.com/Hadoop/BigData_hadoop_Running_MapReduce_Job.php
![Page 145: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/145.jpg)
Bibliography: book
• Hadoop: The Definitive Guide, 3rd Edition:
– http://shop.oreilly.com/product/0636920021773.do
– https://github.com/tomwhite/hadoop-book/
![Page 146: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/146.jpg)
Bibliography: bachelor's theses (TFG)
• Extracción de información social desde Twitter y análisis mediante Hadoop (Extracting social information from Twitter and analyzing it with Hadoop).
– Author: Cristian Caballero Montiel
– Advisors: Daniel Higuero Alonso-Mardones and Juan Manuel Tirado Martín
– http://e-archivo.uc3m.es/handle/10016/16784
• Adaptation, Deployment and Evaluation of a Railway Simulator in Cloud Environments
– Author: Silvina Caíno Lores
– Advisor: Alberto García Fernández
![Page 147: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/147.jpg)
Acknowledgments
• Last but not least, we thank the staff of the Computer Science Department Laboratory for all their comments and suggestions on this presentation.
![Page 148: MCTI-DSD Sistemas Escalables en Entornos Distribuidos (v4d)](https://reader030.vdocuments.site/reader030/viewer/2022021500/58f02cd11a28ab702f8b46e7/html5/thumbnails/148.jpg)
Distributed Systems Design
Master's in Computer Science and Technology
Academic year 2016-2017
Alejandro Calderón Mateos & Óscar Pérez Alonso
Scalable systems in distributed environments