Post on 13-Dec-2015
Google and Large Scientific Datasets
or
How To Move 100TB
Jon Trowbridge
Space Telescope Science Institute
March 15, 2007
Organize the world’s information and make it universally accessible
and useful.
Motivating Problem
What if a piece of information is too large to efficiently transmit across
the Internet as it exists today?
“Never underestimate the bandwidth of a station wagon full of tapes hurtling down the
highway.” - Andrew Tanenbaum (?)
Large Dataset Archive
• Move data by shipping hard drives
• Centralized repository stored on Google’s infrastructure
• Accepting data from all disciplines, but it must be open and free
• Ultimate goal: Promiscuous distribution
Nice Properties of Physical HD Shipment
• Uses commodity technologies: Linux, SATA, ext2
• High throughput
• Trivially scalable
• Cheap and easy: $2400 for 3 TB
• Rapidly getting cheaper
Real-World Throughputs
Method                          MiB/s      GiB/hr     TB/day     hrs/TB
1200 baud modem                 1.14E-04   4.02E-04   9.43E-06   2545166
My Home DSL (downstream)        0.3        1.41       0.03       728.18
Ethernet: 10baseT               0.8        2.81       0.07       364.09
Ethernet: 100baseT              8          28.13      0.66       36.41
End-to-end physical shipment    -          -          0.88       27.42
HD Transfer                     30         105.47     2.47       9.71
FedEx phase of shipment         -          -          3.00       8.00
Ethernet: Gigabit               60         210.94     4.94       4.85
LBNL, 2002: 10.6 GiB/s          10854      38160      894.38     0.03
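The columns of the throughput table are unit conversions of one another, which can be sketched as below. This is an illustration and not part of the original slides: the function name `throughput_row` is hypothetical, and it assumes "TB" in the table means 2^40 bytes (a binary terabyte), since that assumption reproduces the HD Transfer and Gigabit rows.

```python
MIB_PER_GIB = 1024          # MiB in one GiB
MIB_PER_TIB = 1024 * 1024   # MiB in one binary terabyte (assumed meaning of "TB")

def throughput_row(mib_per_s):
    """Convert a sustained rate in MiB/s into the table's other three columns."""
    gib_per_hr = mib_per_s * 3600 / MIB_PER_GIB
    tb_per_day = mib_per_s * 86400 / MIB_PER_TIB
    hrs_per_tb = MIB_PER_TIB / (mib_per_s * 3600)
    return gib_per_hr, tb_per_day, hrs_per_tb

# Rates taken from the MiB/s column of the table.
for name, rate in [("HD Transfer", 30), ("Ethernet: Gigabit", 60)]:
    gib_hr, tb_day, hrs_tb = throughput_row(rate)
    print(f"{name}: {gib_hr:.2f} GiB/hr, {tb_day:.2f} TB/day, {hrs_tb:.2f} hrs/TB")
```

The table itself probably used unrounded rates, so rows whose MiB/s value is shown rounded (such as the DSL row) may not convert exactly.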
The Cost of 1GB of Storage
• 1986: $100,000
• 1990: $10,000
• 1994: $1,000
• 1997: $100
• 2000: $10
• 2004: $1
• Today: About 40¢
Creative Computing - February, 1980
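The price list above implies a fairly steady exponential decline, and a short calculation gives a rough sense of its pace. This is a sketch under the assumption that the price falls smoothly between two listed points; the helper `halving_time_years` is hypothetical and not from the talk.

```python
import math

def halving_time_years(year_a, price_a, year_b, price_b):
    """Average years per halving of price, assuming a smooth exponential decline."""
    halvings = math.log2(price_a / price_b)
    return (year_b - year_a) / halvings

# Endpoints taken from the list: $100,000 in 1986 and $1 in 2004.
print(f"{halving_time_years(1986, 100_000, 2004, 1):.2f} years per halving")
```

Between those two endpoints, the cost per GB halved a little more often than once every 13 months on average.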
Not-So-Nice Properties of Physical HD Shipment
• Physical objects break, get stolen, occasionally explode
• HD copying bottleneck
• Customs/duties make international shipments more complicated
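The copying bottleneck can be made concrete with the figures from the throughput table. One reading of that table, which is an assumption and not something the slides state, is that an end-to-end shipment consists of copying the data onto drives, shipping them, and copying the data off again. The sketch below uses a hypothetical helper, `end_to_end_hours_per_tb`, to check that reading against the table's 9.71 hrs/TB copy time and 8.00 hrs/TB FedEx phase.

```python
def end_to_end_hours_per_tb(copy_hrs_per_tb, transit_hrs_per_tb):
    """Hours per TB, assuming copy-on, transit, and copy-off happen in sequence."""
    return 2 * copy_hrs_per_tb + transit_hrs_per_tb

copy_hrs = 9.71     # "HD Transfer" row of the table
transit_hrs = 8.00  # "FedEx phase of shipment" row of the table

total = end_to_end_hours_per_tb(copy_hrs, transit_hrs)
copy_share = 2 * copy_hrs / total

print(f"{total:.2f} hrs/TB, {24 / total:.2f} TB/day")
print(f"share of time spent copying: {copy_share:.0%}")
```

Under this assumption the result matches the table's end-to-end row (27.42 hrs/TB, 0.88 TB/day), and copying accounts for roughly 70% of the total time, more than the shipping itself.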
The Big Question
What happens when every astronomer has the complete Hubble Legacy Archive on the
computer in their office?
The Big Question
What happens when every high-school student has the complete
Hubble Legacy Archive on the computer in their bedroom?