Google Cloud
TRANSCRIPT
![Page 1: Google could](https://reader033.vdocuments.site/reader033/viewer/2022052223/559ebf091a28ab3f038b4585/html5/thumbnails/1.jpg)
The Internet's Super Cloud Computing Platform
Fang Kun (方坤)
Senior Software Engineer, Google
03/28/11 Google Inc.
![Page 2: Google could](https://reader033.vdocuments.site/reader033/viewer/2022052223/559ebf091a28ab3f038b4585/html5/thumbnails/2.jpg)
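Outline
• The scale and limits of computing
• Design challenges for cloud computing platforms
• Google's foundational cloud computing technologies
• Google's new cloud computing technologies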
![Page 3: Google could](https://reader033.vdocuments.site/reader033/viewer/2022052223/559ebf091a28ab3f038b4585/html5/thumbnails/3.jpg)
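Google's Cloud Computing Philosophy
• Meet the needs of search and data analysis
  – Intelligence, speed, scale
• Offer public services on the existing cloud architecture
  – Gmail, Calendar, Picasa, Docs, Google Apps, ...
  – App Engine
• Explore new cloud computing technologies
  – Cloud storage
  – Cloud computing
  – Task scheduling
  – Energy savings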
![Page 4: Google could](https://reader033.vdocuments.site/reader033/viewer/2022052223/559ebf091a28ab3f038b4585/html5/thumbnails/4.jpg)
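The Extremes of High-Performance Computing

| Application | Users | Precision | Reliability | Data volume |
|---|---|---|---|---|
| Scientific computing | Few | Very high | Low to medium | Tera |
| Stock trading | Many | High | Very high | Giga |
| Gene sequencing | Few | High | High | Tera to Peta |
| Search engines | Many | Medium to high | Medium | Peta |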
![Page 5: Google could](https://reader033.vdocuments.site/reader033/viewer/2022052223/559ebf091a28ab3f038b4585/html5/thumbnails/5.jpg)
Data Scale vs. Algorithm Accuracy
![Page 6: Google could](https://reader033.vdocuments.site/reader033/viewer/2022052223/559ebf091a28ab3f038b4585/html5/thumbnails/6.jpg)
Data Scale vs. Algorithm Accuracy
![Page 7: Google could](https://reader033.vdocuments.site/reader033/viewer/2022052223/559ebf091a28ab3f038b4585/html5/thumbnails/7.jpg)
Data Scale vs. Algorithm Accuracy
![Page 8: Google could](https://reader033.vdocuments.site/reader033/viewer/2022052223/559ebf091a28ab3f038b4585/html5/thumbnails/8.jpg)
Data Scale vs. Algorithm Accuracy
![Page 9: Google could](https://reader033.vdocuments.site/reader033/viewer/2022052223/559ebf091a28ab3f038b4585/html5/thumbnails/9.jpg)
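Outline
• The scale and limits of computing
• Design challenges for cloud computing platforms
• Google's foundational cloud computing technologies
• Google's new cloud computing technologies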
![Page 10: Google could](https://reader033.vdocuments.site/reader033/viewer/2022052223/559ebf091a28ab3f038b4585/html5/thumbnails/10.jpg)
Sample Platform
• Server
  – 16 GB DRAM; 160 GB SSD; 5 × 1 TB disk
• Rack
  – 40 servers
  – 48-port Gigabit Ethernet switch
• Data center (warehouse)
  – 10,000 servers (250 racks)
  – 2K-port Gigabit Ethernet switch
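To get a feel for what these figures add up to, the per-server numbers can be multiplied out to rack and data-center totals. The sketch below is only an illustration: it takes the server specification, 40 servers per rack, and 250 racks from the slide, and assumes decimal units (1 TB = 1,000 GB). The names `SERVER` and `aggregate` are made up for this example.

```python
# Per-server figures from the slide; the aggregation itself is an illustration.
SERVER = {"dram_gb": 16, "ssd_gb": 160, "disk_tb": 5 * 1}
SERVERS_PER_RACK = 40
RACKS_PER_DATACENTER = 250


def aggregate(servers: int) -> dict:
    """Sum the per-server resources over the given number of servers.

    Assumes decimal units: 1 TB = 1,000 GB.
    """
    return {
        "servers": servers,
        "dram_tb": servers * SERVER["dram_gb"] / 1000,
        "ssd_tb": servers * SERVER["ssd_gb"] / 1000,
        "disk_tb": servers * SERVER["disk_tb"],
    }


rack = aggregate(SERVERS_PER_RACK)
datacenter = aggregate(SERVERS_PER_RACK * RACKS_PER_DATACENTER)

for name, totals in (("rack", rack), ("data center", datacenter)):
    print(name, totals)
```

Under these assumptions a rack holds 200 TB of disk, and the full data center holds about 160 TB of DRAM and 50,000 TB (50 PB) of disk.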
![Page 11: Google could](https://reader033.vdocuments.site/reader033/viewer/2022052223/559ebf091a28ab3f038b4585/html5/thumbnails/11.jpg)
Storage: Single Server
(Figure labels: server, rack, data center)
![Page 12: Google could](https://reader033.vdocuments.site/reader033/viewer/2022052223/559ebf091a28ab3f038b4585/html5/thumbnails/12.jpg)
Storage: Rack
(Figure labels: server, rack, data center)
![Page 13: Google could](https://reader033.vdocuments.site/reader033/viewer/2022052223/559ebf091a28ab3f038b4585/html5/thumbnails/13.jpg)
Storage: Data Center
(Figure labels: server, rack, data center)
![Page 14: Google could](https://reader033.vdocuments.site/reader033/viewer/2022052223/559ebf091a28ab3f038b4585/html5/thumbnails/14.jpg)
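Design Challenges
• Saving energy
  – Hardware, software, data center
  – Encoding, compressing, and transferring data
• Failure recovery
  – Hardware and software faults
  – Waiting on slow machines
• New programming models
  – Parallel computing
  – Flash (SSD); GPU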
![Page 15: Google could](https://reader033.vdocuments.site/reader033/viewer/2022052223/559ebf091a28ab3f038b4585/html5/thumbnails/15.jpg)
Global Energy Consumption Growth Trend
Energy consumption doubles every 30 years
![Page 16: Google could](https://reader033.vdocuments.site/reader033/viewer/2022052223/559ebf091a28ab3f038b4585/html5/thumbnails/16.jpg)
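Energy Consumption: Regional Growth Trends
Asia-Pacific energy consumption doubles every 6–8 years
(Figure: average growth rate, 2000–2005, for Asia-Pacific (excluding Japan), Western Europe, other regions, the United States, and Japan, with the overall average growth rate)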
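A doubling time is easier to compare with other statistics once it is converted to a yearly growth rate. The sketch below does that for the 30-year global figure and the 6–8 year Asia-Pacific figure quoted on these slides. It assumes constant compound growth, which the slides do not state, and the function name is made up for this example.

```python
def annual_growth_rate(doubling_years: float) -> float:
    """Constant yearly growth rate r such that (1 + r) ** doubling_years == 2.

    Assumes steady compound growth over the whole period.
    """
    return 2 ** (1 / doubling_years) - 1


for years in (30, 8, 6):
    rate = annual_growth_rate(years)
    print(f"doubling every {years} years -> {rate:.1%} per year")
```

Under that assumption, doubling every 30 years corresponds to roughly 2.3% growth per year, while doubling every 6–8 years corresponds to roughly 9% to 12% per year.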
![Page 17: Google could](https://reader033.vdocuments.site/reader033/viewer/2022052223/559ebf091a28ab3f038b4585/html5/thumbnails/17.jpg)
Energy Consumption: Oil Equivalent
![Page 18: Google could](https://reader033.vdocuments.site/reader033/viewer/2022052223/559ebf091a28ab3f038b4585/html5/thumbnails/18.jpg)
Power Usage Effectiveness (PUE)
Average PUE = 2.0; Google PUE = 1.16
(Figure: breakdown of power use into IT, cooling, power distribution and backup power, and lighting)
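PUE is conventionally defined as total facility power divided by the power delivered to IT equipment, so PUE minus 1 is the overhead spent on cooling, power distribution, lighting, and so on per unit of IT power. The sketch below applies that definition to the two values on the slide. The 1,000 kW IT load is a hypothetical figure chosen only to make the arithmetic concrete.

```python
def overhead_fraction(pue: float) -> float:
    """Non-IT power (cooling, power distribution, lighting) per unit of IT power."""
    return pue - 1.0


def facility_power_kw(it_power_kw: float, pue: float) -> float:
    """Total facility power implied by an IT load and a PUE value."""
    return it_power_kw * pue


IT_LOAD_KW = 1000.0  # hypothetical IT load, not taken from the slides

for label, pue in (("average", 2.0), ("Google", 1.16)):
    total = facility_power_kw(IT_LOAD_KW, pue)
    print(f"{label}: PUE {pue} -> {total:.0f} kW total, "
          f"{overhead_fraction(pue):.0%} overhead")
```

For the same IT load, a PUE of 2.0 means as much power again goes to overhead, while a PUE of 1.16 means only 16% extra.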
![Page 19: Google could](https://reader033.vdocuments.site/reader033/viewer/2022052223/559ebf091a28ab3f038b4585/html5/thumbnails/19.jpg)
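Design Challenges
• Saving energy
  – Hardware, software, data center
  – Encoding, compressing, and transferring data
• Failure recovery
  – Hardware and software faults
  – Waiting on slow machines
• New programming models
  – Parallel computing
  – Flash (SSD); GPU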
![Page 20: Google could](https://reader033.vdocuments.site/reader033/viewer/2022052223/559ebf091a28ab3f038b4585/html5/thumbnails/20.jpg)
Failure Rates
• 99.9% uptime = about 9 hours of downtime per year
• In a 10,000-server data center:
  – 0.25 power outages
  – 3 router failures
  – 1,000 server failures
  – Thousands of disk failures
  – etc., etc., etc.
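The arithmetic behind these numbers is short enough to check directly. The sketch below converts an availability percentage into hours of downtime per year and turns a yearly failure count into a daily rate. It assumes the failure counts on the slide are per year, which the slide implies but does not state, and the function names are made up for this example.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def downtime_hours_per_year(availability: float) -> float:
    """Hours per year a service is down at the given availability (0.0 to 1.0)."""
    return (1.0 - availability) * HOURS_PER_YEAR


def failures_per_day(failures_per_year: float) -> float:
    """Average daily failure count, assuming failures are spread evenly over the year."""
    return failures_per_year / 365


print(f"99.9% uptime -> {downtime_hours_per_year(0.999):.2f} hours down per year")
print(f"1,000 server failures per year -> {failures_per_day(1000):.1f} per day")
```

At this scale a server fails almost three times a day on average, which is why failure handling has to be automatic and not an operator task.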
![Page 21: Google could](https://reader033.vdocuments.site/reader033/viewer/2022052223/559ebf091a28ab3f038b4585/html5/thumbnails/21.jpg)
Fast Recovery After Failures
• Replication
• Periodic checks
• Wherever possible:
  – Loose consistency
  – Approximate answers
  – Incomplete answers
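One way to picture replication combined with incomplete answers is a query that fans out to several data shards, retries on another replica when one copy is down, and returns whatever it could gather along with a note of what is missing. The sketch below is a toy model under those assumptions, not a description of Google's serving system. The shard names, the `Replica` type, and the helper functions are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A replica is modeled as a function from a query to a list of hits;
# it raises an exception when the machine behind it is unavailable.
Replica = Callable[[str], List[str]]


def query_shard(replicas: List[Replica], query: str) -> Optional[List[str]]:
    """Try each replica of one shard in turn; return None if all of them fail."""
    for replica in replicas:
        try:
            return replica(query)
        except Exception:
            continue  # fall through to the next copy of the same data
    return None


def search(shards: Dict[str, List[Replica]], query: str) -> Tuple[List[str], List[str]]:
    """Gather hits from every shard, skipping shards that cannot answer.

    Returns (hits, missing_shards) so the caller knows the answer is incomplete.
    """
    hits: List[str] = []
    missing: List[str] = []
    for name, replicas in shards.items():
        result = query_shard(replicas, query)
        if result is None:
            missing.append(name)
        else:
            hits.extend(result)
    return hits, missing


def healthy(docs: List[str]) -> Replica:
    """Build a working replica that does a substring match over its documents."""
    return lambda query: [d for d in docs if query in d]


def broken(query: str) -> List[str]:
    """A replica whose machine is down."""
    raise ConnectionError("replica unavailable")


shards = {
    "shard-a": [broken, healthy(["cloud storage", "cloud computing"])],  # first copy down
    "shard-b": [broken, broken],  # every copy down
    "shard-c": [healthy(["task scheduling", "cloud platform"])],
}
hits, missing = search(shards, "cloud")
print(hits, missing)
```

Here the query still returns hits from shard-a (via its second replica) and shard-c, and reports shard-b as missing, so the caller gets a fast but incomplete answer and can decide whether that is acceptable.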
![Page 22: Google could](https://reader033.vdocuments.site/reader033/viewer/2022052223/559ebf091a28ab3f038b4585/html5/thumbnails/22.jpg)
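Design Challenges
• Saving energy
  – Hardware, software, data center
  – Encoding, compressing, and transferring data
• Failure recovery
  – Hardware and software faults
  – Waiting on slow machines
• New programming models
  – Parallel computing
  – Flash (SSD); GPU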
![Page 23: Google could](https://reader033.vdocuments.site/reader033/viewer/2022052223/559ebf091a28ab3f038b4585/html5/thumbnails/23.jpg)
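Outline
• The scale and limits of computing
• Design challenges for cloud computing platforms
• Google's foundational cloud computing technologies
• Google's new cloud computing technologies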
![Page 24: Google could](https://reader033.vdocuments.site/reader033/viewer/2022052223/559ebf091a28ab3f038b4585/html5/thumbnails/24.jpg)
Google File System (GFS)
(Figure: GFS architecture. Labels: Replicas, Master, GFS Master, Client. Chunks C0, C1, C3, C4, and C5 are shown, with several chunks appearing in more than one place.)
![Page 25: Google could](https://reader033.vdocuments.site/reader033/viewer/2022052223/559ebf091a28ab3f038b4585/html5/thumbnails/25.jpg)
MapReduce
(Figure: five inputs, Data Block 1 through Data Block 5, each feed a map task; the map outputs flow into a Reduce step that produces the results.)
GFS (published in 2003) and MapReduce (published in 2004) became the basis of the open-source Hadoop system. Yahoo, Microsoft, and Facebook all use Hadoop in their own applications.
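The map and reduce flow in the diagram can be illustrated with the classic word-count example. The sketch below runs everything in one process and is only meant to show the shape of the programming model: a map function emits key/value pairs per data block, the pairs are grouped by key, and a reduce function combines each group. It is not Google's or Hadoop's API, and all function names are made up for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_block(block: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit (word, 1) for every word in one data block."""
    for word in block.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group intermediate values by key, standing in for the step between map and reduce."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce step: sum all the counts emitted for one word."""
    return key, sum(values)


def word_count(blocks: List[str]) -> Dict[str, int]:
    """Run map over every block, group by key, then reduce each group."""
    intermediate = [pair for block in blocks for pair in map_block(block)]
    grouped = shuffle(intermediate)
    return dict(reduce_counts(k, v) for k, v in grouped.items())


# Five small blocks, mirroring the five data blocks in the diagram.
blocks = ["data data results", "results data", "data", "data results", "data"]
print(word_count(blocks))
```

In a real deployment each call to `map_block` would run on a different machine close to its data block, and the grouping step would move data across the network; the single-process version keeps only the logic.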
![Page 26: Google could](https://reader033.vdocuments.site/reader033/viewer/2022052223/559ebf091a28ab3f038b4585/html5/thumbnails/26.jpg)
Google BigTable
• Memory management
• Example
  – Row: a web page
  – Column: a specific piece of information about the page
  – Timestamp: when the page information was fetched
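The row, column, and timestamp example can be pictured as a map from (row, column) to a list of timestamped versions. The sketch below is a toy in-memory model of that idea, not the BigTable API. The class name `TinyTable`, its methods, and the sample row key and values are all hypothetical.

```python
import bisect
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


class TinyTable:
    """Toy in-memory map: (row, column) -> versions sorted by timestamp."""

    def __init__(self) -> None:
        self._cells: Dict[Tuple[str, str], List[Tuple[int, str]]] = defaultdict(list)

    def put(self, row: str, column: str, timestamp: int, value: str) -> None:
        """Store one version of a cell, keeping versions ordered by timestamp."""
        bisect.insort(self._cells[(row, column)], (timestamp, value))

    def get(self, row: str, column: str, at: Optional[int] = None) -> Optional[str]:
        """Return the newest value, or the newest one with timestamp <= at."""
        versions = self._cells.get((row, column), [])
        if at is not None:
            versions = [v for v in versions if v[0] <= at]
        return versions[-1][1] if versions else None


table = TinyTable()
# Row = a web page, column = one piece of page information, timestamp = fetch time.
table.put("www.example.com", "contents", 1, "<html>v1</html>")
table.put("www.example.com", "contents", 3, "<html>v2</html>")
table.put("www.example.com", "title", 1, "Example")

print(table.get("www.example.com", "contents"))        # newest version
print(table.get("www.example.com", "contents", at=2))  # version as of time 2
```

Keeping several timestamped versions per cell is what lets a crawler store each fetch of a page without overwriting the previous one.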
![Page 27: Google could](https://reader033.vdocuments.site/reader033/viewer/2022052223/559ebf091a28ab3f038b4585/html5/thumbnails/27.jpg)
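Outline
• The scale and limits of computing
• Design challenges for cloud computing platforms
• Google's foundational cloud computing technologies
• Google's new cloud computing technologies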
![Page 28: Google could](https://reader033.vdocuments.site/reader033/viewer/2022052223/559ebf091a28ab3f038b4585/html5/thumbnails/28.jpg)
Examples of Winning Through Scale
• Google Translate
• Google speech recognition
• Trend prediction
![Page 29: Google could](https://reader033.vdocuments.site/reader033/viewer/2022052223/559ebf091a28ab3f038b4585/html5/thumbnails/29.jpg)
Flu Trends Chart
![Page 30: Google could](https://reader033.vdocuments.site/reader033/viewer/2022052223/559ebf091a28ab3f038b4585/html5/thumbnails/30.jpg)
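Summary
• The scale and limits of computing
• Design challenges for cloud computing platforms
• Google's foundational cloud computing technologies
• Google's new cloud computing technologies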
![Page 31: Google could](https://reader033.vdocuments.site/reader033/viewer/2022052223/559ebf091a28ab3f038b4585/html5/thumbnails/31.jpg)
Acknowledgments
• Thanks to the following Google colleagues for their contributions:
  – Peter Norvig
  – Stuart Feldman
  – Edward Chang
  – Xuemei Gu
  – Hai Fang
![Page 32: Google could](https://reader033.vdocuments.site/reader033/viewer/2022052223/559ebf091a28ab3f038b4585/html5/thumbnails/32.jpg)
Thank you!
![Page 33: Google could](https://reader033.vdocuments.site/reader033/viewer/2022052223/559ebf091a28ab3f038b4585/html5/thumbnails/33.jpg)
QCon Hangzhou · October 20–22, 2011 · www.qconhangzhou.com (launching in June)
QCon Beijing official website and slide downloads: www.qconbeijing.com