Hadoop & Neptune, Feb. 2009, 김형준
![Page 1: Hadoop & Neptune Feb. 2009 김형준](https://reader034.vdocuments.site/reader034/viewer/2022051303/5a4d1b637f8b9ab0599ae73d/html5/thumbnails/1.jpg)
Hadoop & Neptune
Feb. 2009
http://www.openneptune.com
http://www.jaso.co.kr
김형준
![Page 2: Hadoop & Neptune Feb. 2009 김형준](https://reader034.vdocuments.site/reader034/viewer/2022051303/5a4d1b637f8b9ab0599ae73d/html5/thumbnails/2.jpg)
The Data 'Tsunami'
![Page 3: Hadoop & Neptune Feb. 2009 김형준](https://reader034.vdocuments.site/reader034/viewer/2022051303/5a4d1b637f8b9ab0599ae73d/html5/thumbnails/3.jpg)
More CPU
Faster Disk
Program Tuning
More Memory
![Page 4: Hadoop & Neptune Feb. 2009 김형준](https://reader034.vdocuments.site/reader034/viewer/2022051303/5a4d1b637f8b9ab0599ae73d/html5/thumbnails/4.jpg)
Uninstall
![Page 5: Hadoop & Neptune Feb. 2009 김형준](https://reader034.vdocuments.site/reader034/viewer/2022051303/5a4d1b637f8b9ab0599ae73d/html5/thumbnails/5.jpg)
Where? Distributed File System
How? Distributed/Parallel Computing
![Page 6: Hadoop & Neptune Feb. 2009 김형준](https://reader034.vdocuments.site/reader034/viewer/2022051303/5a4d1b637f8b9ab0599ae73d/html5/thumbnails/6.jpg)
Hadoop DFS
• Unlimited Storage
• No Backup, Self-healing
• Thousands of Nodes
• But, No POSIX
• No Random Write
![Page 7: Hadoop & Neptune Feb. 2009 김형준](https://reader034.vdocuments.site/reader034/viewer/2022051303/5a4d1b637f8b9ab0599ae73d/html5/thumbnails/7.jpg)
[Diagram: Hadoop cluster layout. Legend: machine, daemon process]
Master daemons:
• NameNode (DFS Master)
• JobTracker (Job Master)
• SecondaryNameNode
Slave machines (three shown), each with:
• DataNode (DFS Slave)
• TaskTracker (Task Mgmt.)
• Local Disk
Client API: connected to the cluster by control and data paths
![Page 8: Hadoop & Neptune Feb. 2009 김형준](https://reader034.vdocuments.site/reader034/viewer/2022051303/5a4d1b637f8b9ab0599ae73d/html5/thumbnails/8.jpg)
Hadoop MapReduce
1 TB group by -> 10 minutes
More machines -> 1 minute
![Page 9: Hadoop & Neptune Feb. 2009 김형준](https://reader034.vdocuments.site/reader034/viewer/2022051303/5a4d1b637f8b9ab0599ae73d/html5/thumbnails/9.jpg)
• map (k1, v1) → list(k2, v2)
• reduce (k2, list(v2)) → result value
Input: This is a book. That book is on the desk. I like that book.
Split:
  Split 1: This is a book. That book is on the desk.
  Split 2: I like that book.
map():
  Split 1 → (This, 1) (book, 1) (That, 1) (book, 1) …
  Split 2 → (I, 1) (that, 1) (book, 1) …
Partition, Merge, Sort:
  (book, [1,1,1]) … (is, [1,1]) … (This, [1])
reduce():
  (book, 3) … (is, 2) … (This, 1)
Distributed/parallel execution: Map & Reduce execution platform
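The word-count flow on this slide can be sketched in a few lines of plain Python. This is a single-process illustration only, not Hadoop's API: the function names `map_fn`, `shuffle`, `reduce_fn`, and `word_count` are hypothetical, and `shuffle` merely stands in for the partition/merge/sort step that the framework performs between map and reduce. Words are counted case-sensitively, as on the slide, where "That" and "that" are separate keys.

```python
from collections import defaultdict
import string


def map_fn(_key, line):
    """map(k1, v1) -> list(k2, v2): emit (word, 1) for every word in the line."""
    pairs = []
    for token in line.split():
        word = token.strip(string.punctuation)  # "book." -> "book"
        if word:
            pairs.append((word, 1))
    return pairs


def shuffle(pairs):
    """Group values by key and sort the keys (stand-in for partition/merge/sort)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_fn(key, values):
    """reduce(k2, list(v2)) -> result value: sum the counts for one word."""
    return key, sum(values)


def word_count(splits):
    """Run map over every split, shuffle, then reduce each key group."""
    mapped = []
    for offset, split in enumerate(splits):
        mapped.extend(map_fn(offset, split))
    return dict(reduce_fn(key, values) for key, values in shuffle(mapped))


# The two input splits shown on the slide.
splits = [
    "This is a book. That book is on the desk.",
    "I like that book.",
]
counts = word_count(splits)
print(counts["book"], counts["is"], counts["This"])  # 3 2 1
```

In a real cluster each split would be handled by a separate map task, and the grouped keys would be divided among several reduce tasks; the sketch keeps everything in one process so the data flow stays visible.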
![Page 10: Hadoop & Neptune Feb. 2009 김형준](https://reader034.vdocuments.site/reader034/viewer/2022051303/5a4d1b637f8b9ab0599ae73d/html5/thumbnails/10.jpg)
[Diagram: Hadoop cluster layout. Legend: machine, daemon process]
Master daemons:
• NameNode (DFS Master)
• JobTracker (Job Master)
• SecondaryNameNode
Slave machines (three shown), each with:
• DataNode (DFS Slave)
• TaskTracker (Task Mgmt.)
• Local Disk
Client API: connected to the cluster by control and data paths
![Page 11: Hadoop & Neptune Feb. 2009 김형준](https://reader034.vdocuments.site/reader034/viewer/2022051303/5a4d1b637f8b9ab0599ae73d/html5/thumbnails/11.jpg)
A piece of Cake
![Page 12: Hadoop & Neptune Feb. 2009 김형준](https://reader034.vdocuments.site/reader034/viewer/2022051303/5a4d1b637f8b9ab0599ae73d/html5/thumbnails/12.jpg)
Neptune
• Database running on DFS (Hadoop)
• Unlimited Structured Data
• No Backup
• But, No JOIN, No SQL
• No Multiple-row Operation
• No Aggregation Function
![Page 13: Hadoop & Neptune Feb. 2009 김형준](https://reader034.vdocuments.site/reader034/viewer/2022051303/5a4d1b637f8b9ab0599ae73d/html5/thumbnails/13.jpg)
Operation
• Create/Drop Table
• put/get
• like/between
• scan/merge scan (join)
• MapReduce
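To make the operation list more concrete, here is a minimal in-memory sketch of a table kept sorted by row key. It is not Neptune's API: `ToyTable` and `merge_scan` are hypothetical names, and treating `like` as a row-key prefix match is an assumption made only for illustration. The point is that once rows are ordered by key, range lookups and a merge scan over two tables reduce to simple ordered traversals.

```python
import bisect


class ToyTable:
    """Hypothetical in-memory table sorted by row key (not Neptune's API)."""

    def __init__(self):
        self._keys = []  # row keys, kept in sorted order
        self._rows = {}  # row key -> value

    def put(self, row_key, value):
        if row_key not in self._rows:
            bisect.insort(self._keys, row_key)
        self._rows[row_key] = value

    def get(self, row_key):
        return self._rows.get(row_key)

    def between(self, start, end):
        """Rows with start <= row key <= end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_right(self._keys, end)
        return [(k, self._rows[k]) for k in self._keys[lo:hi]]

    def like(self, prefix):
        """Rows whose key starts with prefix (assumed meaning of 'like')."""
        lo = bisect.bisect_left(self._keys, prefix)
        matches = []
        for k in self._keys[lo:]:
            if not k.startswith(prefix):
                break
            matches.append((k, self._rows[k]))
        return matches

    def scan(self):
        """Yield every row in key order."""
        for k in self._keys:
            yield k, self._rows[k]


def merge_scan(left, right):
    """Join two key-ordered scans on equal row keys (a sort-merge join)."""
    left_iter, right_iter = iter(left.scan()), iter(right.scan())
    l, r = next(left_iter, None), next(right_iter, None)
    while l is not None and r is not None:
        if l[0] == r[0]:
            yield l[0], l[1], r[1]
            l, r = next(left_iter, None), next(right_iter, None)
        elif l[0] < r[0]:
            l = next(left_iter, None)
        else:
            r = next(right_iter, None)


users = ToyTable()
users.put("user:002", "lee")
users.put("user:001", "kim")
users.put("video:001", "clip")

orders = ToyTable()
orders.put("user:002", "book")

print(users.get("user:001"))
print(users.between("user:001", "user:002"))
print(users.like("user:"))
print(list(merge_scan(users, orders)))
```

Because both scans arrive in key order, the merge scan advances whichever side holds the smaller key and never needs to buffer either table in full.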
![Page 14: Hadoop & Neptune Feb. 2009 김형준](https://reader034.vdocuments.site/reader034/viewer/2022051303/5a4d1b637f8b9ab0599ae73d/html5/thumbnails/14.jpg)
Why Neptune?
[Diagram: running MapReduce over a Neptune table]
• Table A: Tablet A-1, Tablet A-2, Tablet A-3, …, Tablet A-N
• Make Map & Reduce functions
• Run on the Map & Reduce framework
• META Table: get the tablet list
• JobTracker: tasks are assigned to each node
• TaskTrackers (three shown): each runs several Map Tasks
• TaskTrackers (two shown): each runs a Reduce Task
• Table B: Tablet B-1, Tablet B-2
Distributed / parallel processing: Speed, Scalability
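The diagram above can be read as "one map task per tablet, then a merge in reduce." A minimal sketch of that idea follows, with a thread pool standing in for TaskTrackers on separate machines. Everything here is hypothetical: the `TABLETS` data, and the names `get_tablet_list`, `map_task`, `reduce_task`, and `run_job` are illustrative and are not part of Neptune or Hadoop.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical Table A, already split into tablets by row-key range.
TABLETS = {
    "A-1": [("row01", "hadoop"), ("row02", "neptune")],
    "A-2": [("row03", "hadoop"), ("row04", "hadoop")],
    "A-3": [("row05", "neptune")],
}


def get_tablet_list():
    """Stand-in for the META table lookup: tablet names in key order."""
    return sorted(TABLETS)


def map_task(tablet_name):
    """One map task per tablet: count the values found in that tablet only."""
    return Counter(value for _row_key, value in TABLETS[tablet_name])


def reduce_task(partial_counts):
    """Merge the per-tablet counts into one result (the rows of 'Table B')."""
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return dict(total)


def run_job(max_workers=3):
    """Look up the tablets, run the map tasks concurrently, then reduce."""
    tablets = get_tablet_list()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partials = list(pool.map(map_task, tablets))
    return reduce_task(partials)


table_b = run_job()
print(table_b)
```

Since each map task touches only its own tablet, adding tablets and workers increases parallelism without changing the map or reduce functions, which is the speed and scalability argument the slide makes.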
![Page 15: Hadoop & Neptune Feb. 2009 김형준](https://reader034.vdocuments.site/reader034/viewer/2022051303/5a4d1b637f8b9ab0599ae73d/html5/thumbnails/15.jpg)
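[Diagram: Neptune architecture]
• Distributed file system (Hadoop or other)
• TabletServer #1, TabletServer #2, …, TabletServer #n
• Cluster Management System
• NeptuneMaster
• Distributed / parallel computing platform (Hadoop)
• User application
• Neptune (large-scale distributed data store)
• Logical Table
• Physical storage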
![Page 16: Hadoop & Neptune Feb. 2009 김형준](https://reader034.vdocuments.site/reader034/viewer/2022051303/5a4d1b637f8b9ab0599ae73d/html5/thumbnails/16.jpg)
When to use Neptune
• Large data
• Online put/get and analysis
• Less complex
Google Personalized Search
Google Analytics
![Page 17: Hadoop & Neptune Feb. 2009 김형준](https://reader034.vdocuments.site/reader034/viewer/2022051303/5a4d1b637f8b9ab0599ae73d/html5/thumbnails/17.jpg)
Finding developer
![Page 18: Hadoop & Neptune Feb. 2009 김형준](https://reader034.vdocuments.site/reader034/viewer/2022051303/5a4d1b637f8b9ab0599ae73d/html5/thumbnails/18.jpg)
Principles of Google Infra
Cheap Hardware and Smart Software
• Use cheap commodity hardware (frequent failure)
• Develop smart software to reduce the cost of failure
Easy Management
• High scalability through automatic discovery of new servers and racks
• High redundancy against failure of servers, racks, even data centers
Speed and Then More Speed
• High speed at low cost
• Rapid development and deployment of new products
Use Existing Technologies
• Use techniques from the leading edge of computer science
• Use open source code as a starting point
![Page 19: Hadoop & Neptune Feb. 2009 김형준](https://reader034.vdocuments.site/reader034/viewer/2022051303/5a4d1b637f8b9ab0599ae73d/html5/thumbnails/19.jpg)
Google Infra
[Diagram: Google's software stack]
• Google Linux
• GFS
• Bigtable
• Map & Reduce
• Client API
• Chubby
• Cluster Mgmt
• Batch application
• Online Services
Hardware
• Low-end commodity servers
• 40 or more pizza-box servers per rack
Google's core competency
Google's software stack
![Page 20: Hadoop & Neptune Feb. 2009 김형준](https://reader034.vdocuments.site/reader034/viewer/2022051303/5a4d1b637f8b9ab0599ae73d/html5/thumbnails/20.jpg)
Q&A