TRANSCRIPT
1
CyberBricks: The Future of Database and Storage Engines
Jim Gray
http://research.Microsoft.com/~Gray
2
Outline
• What storage things are coming from Microsoft?
• TerraServer: a 1 TB DB on the Web
• Storage Metrics: Kaps, Maps, Gaps, Scans
• The future of storage: ActiveDisks
3
New Storage Software From Microsoft
• SQL Server 7.0:
»Simplicity: Auto-most-things
»Scalability on Win95 to Enterprise
»Data warehousing: built-in OLAP, VLDB
• NT 5:
»Better volume management (from Veritas)
»HSM architecture
»Intellimirror
»Active directory for transparency
4
Thin Client Support: TSO comes to NT
“Hydra” Server
Dedicated Windows terminal
Existing Desktop PC
MS-DOS, UNIX, Mac clients
Net PC
• Lower Per-Client cost
• Huge centralized data stores.
5
Windows NT 5.0
Intelli-Mirror™
• Files and settings mirrored on client and server
• Great for mobile users
• Facilitates roaming
• Easy to replace PCs
• Optimizes network performance
• Means HUGE data stores
6
Outline
• What storage things are coming from Microsoft?
• TerraServer: a 1 TB DB on the Web
• Storage Metrics: Kaps, Maps, Gaps, Scans
• The future of storage: ActiveDisks
7
Microsoft TerraServer: Scaleup to Big Databases
• Build a 1 TB SQL Server database
• Data must be
» 1 TB
» Unencumbered
» Interesting to everyone everywhere
» And not offensive to anyone anywhere
• Loaded
» 1.5 M place names from Encarta World Atlas
» 3 M sq km from USGS (1 meter resolution)
» 1 M sq km from Russian Space Agency (2 m)
• On the web (world’s largest atlas)
• Sell images with commerce server.
8
Microsoft TerraServer Background
• Earth is 500 tera square meters (tsm)
» USA is 10 tsm
• 100 tsm of land in 70ºN to 70ºS
• We have pictures of 6% of it
» 3 tsm from USGS
» 2 tsm from Russian Space Agency
• Compress 5:1 (JPEG) to 1.5 TB.
• Slice into 10 KB chunks
• Store chunks in DB
• Navigate with
» Encarta™ Atlas: globe, gazetteer
» StreetsPlus™ in the USA
40x60 km2 jump image
20x30 km2 browse image
10x15 km2 thumbnail
1.8x1.2 km2 tile
• Someday
» multi-spectral image
» of everywhere
» once a day / hour
9
USGS Digital Ortho Quads (DOQ)
• US Geological Survey
• 4 Tera Bytes
• Most data not yet published
• Based on a CRADA
» Microsoft TerraServer makes data available.
[Image: USGS “DOQ”: 1x1 meter, 4 TB, continental US, new data coming]
10
Russian Space Agency (SovInformSputnik) SPIN-2 (Aerial Images is Worldwide Distributor)
• 1.5 Meter Geo Rectified imagery of (almost) anywhere
• Almost equal-area projection
• De-classified satellite photos (from 200 KM),
• More data coming (1 m)
• Selling imagery on Internet.
• Putting 2 tsm onto Microsoft TerraServer.
SPIN-2
11
http://www.TerraServer.Microsoft.com/
Demo
12
Demo
• navigate by coverage map to White House
• Download image
• buy imagery from USGS
• navigate by name to Venice
• buy SPIN2 image & Kodak photo
• Pop out to Expedia street map of Venice
• Mention that DB will double in next 18 months (2x USGS, 2X SPIN2)
14
The Microsoft TerraServer Hardware
• Compaq AlphaServer 8400
• 8 x 400 MHz Alpha CPUs
• 10 GB DRAM
• 324 9.2 GB StorageWorks disks
» 3 TB raw, 2.4 TB of RAID5
• STK 9710 tape robot (~14 TB)
• Windows NT 4 EE, SQL Server 7.0
15
TerraServer Web Site Software
[Diagram: software components]
• Web Client: browser with HTML and Java viewer, connected over the Internet
• TerraServer Web Site: Internet Information Server 4.0, Active Server Pages, MTS, Image Server
• TerraServer DB: SQL Server 7, TerraServer stored procedures
• Automap Server: Microsoft Automap ActiveX Server, Internet Information Server 4.0
• Image Provider Site(s): Image Delivery Application, Internet Information Server 4.0, Microsoft Site Server EE, SQL Server 7
16
System Management & Maintenance
• Backup and Recovery
» STK 9710 tape robot
» Legato NetWorker™
» SQL Server 7 Backup & Restore
» Clocked at 80 MBps (peak) (~200 GB/hr)
• SQL Server Enterprise Mgr
» DBA Maintenance
» SQL Performance Monitor
17
Microsoft TerraServer File Group Layout
• Convert 324 disks to 28 RAID5 sets plus 28 spare drives
• Make 4 WinNT volumes (RAID 50), 595 GB per volume
• Build 30 20GB files on each volume
• DB is File Group of 120 files
[Diagram: seven pairs of HSZ70 controllers (A and B) behind volumes E:, F:, G:, H:]
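The layout arithmetic on this slide can be checked with a short sketch. The numbers come from the slide; the function name and the use of decimal units (1 TB = 1000 GB) are assumptions made for illustration.

```python
# Sketch of the file group arithmetic; the default values are taken from the slide.
# Assumption: decimal units, 1 TB = 1000 GB.

def file_group_summary(volumes=4, files_per_volume=30, file_size_gb=20):
    """Return (number of files, total size in GB) for the file group."""
    total_files = volumes * files_per_volume
    total_gb = total_files * file_size_gb
    return total_files, total_gb


total_files, total_gb = file_group_summary()
print(f"{total_files} files, {total_gb / 1000:.1f} TB in the file group")
```

The total agrees with the 2.4 TB of RAID5 space listed on the hardware slide.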
18
Image Delivery and Load
Incremental load of 4 more TB in next 18 months
[Diagram: image delivery and load pipeline. Labels: DLT tape “tar”; \Drop’N’; DoJob; Wait 4 Load; LoadMgr DB; 100 Mbit Ethernet switch; Enterprise Storage Array with three sets of 108 9.1 GB drives; AlphaServer 8400; STK DLT tape library; 60 4.3 GB drives; two AlphaServer 4100s; ESA; LoadMgr; DLT tape; NTBackup; ImgCutter; \Drop’N’ \Images]
LoadMgr steps:
10: ImgCutter
20: Partition
30: ThumbImg
40: BrowseImg
45: JumpImg
50: TileImg
55: Meta Data
60: Tile Meta
70: Img Meta
80: Update Place
20
Some Tera-Byte Databases
[Scale: Kilo, Mega, Giga, Tera, Peta, Exa, Zetta, Yotta]
• The Web: 1 TB of HTML
• TerraServer 1 TB of images
• Several other 1 TB (file) servers
• Hotmail: 7 TB of email
• Sloan Digital Sky Survey: 40 TB raw, 2 TB cooked
• EOS/DIS (picture of planet each week)
» 15 PB by 2007
• Federal Clearing house: images of checks
» 15 PB by 2006 (7 year history)
• Nuclear Stockpile Stewardship Program
» 10 Exabytes (???!!)
22
[Figure: information sizes placed on a scale of Kilo, Mega, Giga, Tera, Peta, Exa, Zetta, Yotta. Items shown: A letter, A novel, A Movie, Library of Congress (text), LoC (image), LoC (sound + cinema), All Photos, All Disks, All Tapes, All Information!]
23
Michael Lesk’s Points
www.lesk.com/mlesk/ksg97/ksg.html
• Soon everything can be recorded and kept
• Most data will never be seen by humans
• Precious Resource: Human attention
» Auto-Summarization and Auto-Search will be a key enabling technology.
24
Outline
• What storage things are coming from Microsoft?
• TerraServer: a 1 TB DB on the Web
• Storage Metrics: Kaps, Maps, Gaps, Scans
• The future of storage: ActiveDisks
25
Storage Latency: How Far Away is the Data?
Level                  Relative latency   Analogy
Registers              1                  My Head, 1 min
On Chip Cache          2                  This Room
On Board Cache         10                 This Campus, 10 min
Memory                 100                Sacramento, 1.5 hr
Disk                   10^6               Pluto, 2 Years
Tape / Optical Robot   10^9               Andromeda, 2,000 Years
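The analogy on this slide appears to scale each relative latency so that one unit equals one minute of human time. A minimal sketch of that conversion follows; the one-unit-equals-one-minute mapping and the function name are assumptions read off the slide, not something it states.

```python
# Sketch: convert relative storage latencies to "human" time.
# Assumption: one latency unit corresponds to one minute.

MINUTES_PER_YEAR = 60 * 24 * 365

LEVELS = [
    ("Registers", 1),
    ("On chip cache", 2),
    ("On board cache", 10),
    ("Memory", 100),
    ("Disk", 10**6),
    ("Tape / optical robot", 10**9),
]


def human_scale(units):
    """Format a latency (in units of one minute) as minutes, hours, or years."""
    minutes = units
    if minutes < 60:
        return f"{minutes} min"
    if minutes < MINUTES_PER_YEAR:
        return f"{minutes / 60:.1f} hr"
    return f"{minutes / MINUTES_PER_YEAR:,.0f} years"


for name, units in LEVELS:
    print(f"{name:22s} {human_scale(units)}")
```

The computed values land close to the rounded figures on the slide (about 1.7 hours for memory, about 2 years for disk, and roughly 1,900 years for tape).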
27
MetaMessage: Technology Ratios Are Important
• If everything gets faster & cheaper at the same rate THEN nothing really changes.
• Things getting MUCH BETTER:
» communication speed & cost 1,000x
» processor speed & cost 100x
» storage size & cost 100x
• Things staying about the same
» speed of light (more or less constant)
» people (10x more expensive)
» storage speed (only 10x better)
28
Today’s Storage Hierarchy: Speed & Capacity vs Cost Tradeoffs
[Figure, two panels, both plotted against access time in seconds (10^-9 to 10^3). Size vs Speed: typical system size in bytes (10^3 to 10^15). Price vs Speed: $/MB (10^-4 to 10^4). Devices plotted: Cache, Main, Secondary (Disc), Online Tape, Nearline Tape, Offline Tape]
29
Storage Ratios Changed in Last 20 Years
• Media price: 4000X, Bandwidth 10X, Access/s 10X
• DRAM : DISK $/MB: from 100:1 to 25:1
• TAPE : DISK $/GB: from 100:1 to 5:1
[Figure, three charts covering 1980 to 2000. Disk Performance vs Time: seeks per second and bandwidth in MB/s (1 to 100), capacity in GB (0.1 to 10). Disk accesses/second vs Time: accesses per second (1 to 100). Storage Price vs Time: megabytes per kilo-dollar (0.1 to 10,000)]
31
Disk Access Time
• Access time = SeekTime + RotateTime + ReadTime
» SeekTime: 6 ms, improving 5%/y
» RotateTime: 3 ms, improving 5%/y
» ReadTime: 1 ms, improving 25%/y
• Other useful facts:
» Power rises more than size^3 (so small is indeed beautiful)
» Small devices are more rugged
» Small devices can use plastics (forces are much smaller), e.g. bugs fall without breaking anything
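The access-time sum can be projected forward with a small sketch. It assumes that "x%/y" means each component time shrinks by x% per year, compounded; that reading, and the names used here, are assumptions rather than something the slide spells out.

```python
# Sketch: disk access time as the sum of seek, rotate, and read times.
# Assumption: "x%/y" means the component time shrinks by x% each year, compounded.

COMPONENTS_MS = {
    "seek": (6.0, 0.05),    # (time today in ms, yearly improvement)
    "rotate": (3.0, 0.05),
    "read": (1.0, 0.25),
}


def access_time_ms(years=0):
    """Projected access time in milliseconds after the given number of years."""
    return sum(t * (1 - rate) ** years for t, rate in COMPONENTS_MS.values())


print(f"today: {access_time_ms(0):.1f} ms")
print(f"in 10 years: {access_time_ms(10):.1f} ms")
```

Under these assumptions the total falls from 10 ms to about 5.4 ms in ten years, and almost all of what remains is seek and rotation, since the read time shrinks much faster.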
32
Standard Storage Metrics
• Capacity:
» RAM: MB and $/MB: today at 100 MB & 1 $/MB
» Disk: GB and $/GB: today at 10 GB and 50 $/GB
» Tape: TB and $/TB: today at .1 TB and 10 $/GB (nearline)
• Access time (latency)
» RAM: 100 ns
» Disk: 10 ms
» Tape: 30 second pick, 30 second position
• Transfer rate
» RAM: 1 GB/s
» Disk: 5 MB/s - - - Arrays can go to 1 GB/s
» Tape: 3 MB/s - - - not clear that striping works
33
New Storage Metrics: Kaps, Maps, Gaps, SCANs
• Kaps: How many kilobyte objects served per second
» the file server, transaction processing metric
•Maps: How many megabyte objects served per second
» the Mosaic metric
•Gaps: How many gigabyte objects served per hour
» the video & EOSDIS metric
• SCANS: How many scans of all the data per day
» the data mining and utility metric
• And: $/Kaps, $/Maps, $/Gaps, $/SCAN
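As a rough illustration, the four metrics can be estimated from a device's capacity, access time, and transfer rate. The sketch below assumes a simple model in which serving one object costs one access plus the transfer time, and it uses the disk figures from the Standard Storage Metrics slide (10 GB, 10 ms, 5 MB/s). The model, the decimal units, and the function names are assumptions for illustration.

```python
# Sketch: estimate Kaps, Maps, Gaps, and Scans for one device.
# Assumption: time per object = access time + object size / transfer rate.
# Assumption: decimal units (1 KB = 10**3 bytes, and so on).

KB, MB, GB = 10**3, 10**6, 10**9


def objects_per_second(object_bytes, access_s, bytes_per_s):
    """Objects of the given size served per second by one device."""
    return 1.0 / (access_s + object_bytes / bytes_per_s)


def storage_metrics(capacity_bytes, access_s, bytes_per_s):
    """Return the four metrics: Kaps and Maps per second, Gaps per hour, Scans per day."""
    return {
        "Kaps": objects_per_second(KB, access_s, bytes_per_s),
        "Maps": objects_per_second(MB, access_s, bytes_per_s),
        "Gaps": 3600 * objects_per_second(GB, access_s, bytes_per_s),
        "Scans": 86400 / (capacity_bytes / bytes_per_s),
    }


disk = storage_metrics(capacity_bytes=10 * GB, access_s=0.010, bytes_per_s=5 * MB)
for name, value in disk.items():
    print(f"{name}: {value:.1f}")
```

Under this model the access time dominates Kaps, while transfer rate dominates Maps, Gaps, and Scans.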
34
How To Get Lots of Maps, Gaps, SCANS
• Parallelism: use many little devices in parallel
» One device: 1 Terabyte at 10 MB/s takes 1.2 days to scan
» 1,000 x parallel: 1 Terabyte takes 100 seconds/scan
Parallelism: divide a big problem into many smaller ones to be solved in parallel.
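The scan-time arithmetic on this slide can be reproduced with a few lines. This is a sketch that assumes perfectly even striping and decimal units (1 TB = 10**12 bytes); the function name is illustrative.

```python
# Sketch: time to scan a data set spread evenly over several devices.
# Assumptions: perfect striping, no coordination overhead, decimal units.

def scan_seconds(data_bytes, bytes_per_s, devices=1):
    """Seconds to read all the data when each device reads an equal share."""
    return data_bytes / (bytes_per_s * devices)


TERABYTE = 10**12
RATE = 10 * 10**6  # 10 MB/s per device

serial = scan_seconds(TERABYTE, RATE)
parallel = scan_seconds(TERABYTE, RATE, devices=1000)
print(f"1 device: {serial / 86400:.1f} days")
print(f"1,000 devices: {parallel:.0f} seconds")
```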
35
Tape & Optical: Beware of the Media Myth
• Optical is cheap: 200 $/platter, 2 GB/platter => 100 $/GB (5x cheaper than disc)
• Tape is cheap: 100 $/tape, 40 GB/tape => 2.5 $/GB (100x cheaper than disc).
36
Tape & Optical Reality: Media is 10% of System Cost
• Tape needs a robot (10 k$ ... 3 M$) with 10 ... 1000 tapes (at 40 GB each) => 20 $/GB ... 200 $/GB (1x…10x cheaper than disc)
• Optical needs a robot (50 k$) with 100 platters = 200 GB (TODAY) => 250 $/GB (more expensive than disc)
• Robots have poor access times
• Not good for Library of Congress (25 TB)
• Data motel: data checks in but it never checks out!
37
The Access Time Myth
• The Myth: seek or pick time dominates
• The reality:
» (1) Queuing dominates
» (2) Transfer dominates BLOBs
» (3) Disk seeks often short
• Implication: many cheap servers better than one fast expensive server
» shorter queues
» parallel transfer
» lower cost/access and cost/byte
• This is obvious for disk & tape arrays
[Diagram: two timelines, one showing Seek, Rotate, Transfer and one showing Wait, Seek, Rotate, Transfer]
38
My Solution to Tertiary Storage: Tape Farms, Not Mainframe Silos
• Many independent tape robots (like a disc farm)
• One robot: 10 K$, 10 tapes, 400 GB, 6 MB/s, 25 $/GB, 30 Maps, 15 Gaps, 2 Scans
• 100 robots: 1 M$, 40 TB, 25 $/GB, 3K Maps, 1.5K Gaps, 2 Scans
• Scan in 12 hours.
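The farm figures follow from multiplying one robot by 100. The sketch below works through that scaling using the per-robot numbers on this slide; the class and function names are hypothetical, and decimal units (1 TB = 1000 GB) are assumed.

```python
# Sketch: scale one small tape robot up to a farm of independent robots.
# Per-robot figures are taken from the slide; names are illustrative.
from dataclasses import dataclass


@dataclass
class Robot:
    price_usd: float = 10_000
    tapes: int = 10
    gb_per_tape: float = 40
    maps: float = 30
    gaps: float = 15
    scans: float = 2

    @property
    def capacity_gb(self):
        return self.tapes * self.gb_per_tape


def farm(robot, count):
    """Aggregate metrics for `count` independent robots."""
    capacity_gb = robot.capacity_gb * count
    price_usd = robot.price_usd * count
    return {
        "capacity_tb": capacity_gb / 1000,
        "price_usd": price_usd,
        "usd_per_gb": price_usd / capacity_gb,
        "maps": robot.maps * count,   # throughput adds up across robots
        "gaps": robot.gaps * count,
        "scans": robot.scans,         # each robot scans only its own tapes
    }


print(farm(Robot(), 100))
```

Capacity, price, Maps, and Gaps grow linearly with the number of robots, while $/GB and Scans stay the same, which is the point of the slide.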
39
The Metrics: Disk and Tape Farms Win
[Chart: log scale from 0.01 to 1,000,000 comparing 1000 x Disc Farm, STK Tape Robot (6,000 tapes, 8 readers), and 100x DLT Tape Farm. Series labels: GB/K$, Kaps, Maps, Scans, SCANS/Day]
Data Motel: Data checks in, but it never checks out
40
Cost Per Access (3-year)
[Chart: log scale comparing 1000 x Disc Farm, STK Tape Robot (6,000 tapes, 16 readers), and 100x DLT Tape Farm on Kaps/$, Maps/$, Gaps/$, and SCANS/k$. Values shown, in extraction order: 100,000; 120; 2; 500K; 540,000; 67,000; 68; 77; 4.3; 1.5; 0.2; 23; 100]
41
Storage Ratios Impact on Software
• Gone from 512 B pages to 8192 B pages (will go to 64 KB pages in 2006)
• Treat disks as tape:
»Increased use of sequential access
»Use disks for backup copies
• Use tape for
»VERY COLD data or
»Offsite Archive
»Data interchange
42
Summary
• Storage accesses are the bottleneck
• Accesses are getting larger (Maps, Gaps, SCANS)
• Capacity and cost are improving, BUT
• Latencies and bandwidth are not improving much, SO
• Use parallel access (disk and tape farms)
• Use sequential access (scans)
43
The Memory Hierarchy
• Measuring & Modeling Sequential IO
• Where is the bottleneck?
• How does it scale with
» SMP, RAID, new interconnects
• Goals:
» balanced bottlenecks
» Low overhead
» Scale many processors (10s)
» Scale many disks (100s)
[Diagram: data path through App address space, File cache, Mem bus, Memory, PCI, Adapter, SCSI, Controller]
45
PAP (Peak Advertised Performance) vs RAP (Real Application Performance)
• Goal: RAP = PAP / 2 (the half-power point)
[Diagram: data path from Application Data and File System Buffers through System Bus, PCI, and SCSI to Disk. Advertised rates shown: 422 MBps (System Bus), 133 MBps, 40 MBps, 10-15 MBps. Measured rate at every stage: 7.2 MB/s]
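A bottleneck analysis of this kind comes down to two checks: the slowest stage bounds the whole path, and the goal is for measured throughput to reach half of the advertised rate. The sketch below assumes the advertised rates map to stages as System Bus 422, PCI 133, SCSI 40, and Disk 10 to 15 MBps; apart from the system bus, that mapping is a reading of the figure, and the function names are illustrative.

```python
# Sketch: find the bottleneck stage and test the half-power goal (RAP >= PAP / 2).
# Assumption: the stage-to-rate mapping below is a reading of the slide's figure.

def bottleneck(stage_mbps):
    """Return (stage name, rate) for the slowest stage in the data path."""
    name = min(stage_mbps, key=stage_mbps.get)
    return name, stage_mbps[name]


def at_half_power(rap_mbps, pap_mbps):
    """True when real application performance reaches half of the advertised peak."""
    return rap_mbps >= pap_mbps / 2


pap = {"System Bus": 422, "PCI": 133, "SCSI": 40, "Disk": 10}  # low end of 10-15 MBps
measured = 7.2

stage, rate = bottleneck(pap)
print(f"bottleneck: {stage} at {rate} MBps")
print("half-power vs 10 MBps disk:", at_half_power(measured, 10))
print("half-power vs 15 MBps disk:", at_half_power(measured, 15))
```

With these numbers the disk is the bottleneck, and 7.2 MB/s meets the half-power goal against a 10 MBps disk but falls just short against a 15 MBps one.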
46
The Best Case: Temp File, NO IO
• Temp file Read / Write File System Cache
• Program uses small (in cpu cache) buffer.
• So, write/read time is bus move time (3x better than copy)
• Paradox: fastest way to move data is to write then read it.
• This hardware is limited to 150 MBps per processor
[Chart: Temp File Read/Write, in MBps: Temp read 148, Temp write 136, Memcopy() 54]
47
Bottleneck Analysis
• Drawn to linear scale
• Theoretical Bus Bandwidth: 422 MBps = 66 Mhz x 64 bits
• Memory Read/Write: ~150 MBps
• MemCopy: ~50 MBps
• Disk R/W: ~9 MBps
51
PAP vs RAP
• Reads are easy, writes are hard
• Async write can match WCE.
[Diagram: data path from Application Data and File System through PCI and SCSI to Disks. Rates shown: 422 MBps, 142 MBps, 133 MBps, 72 MBps, 40 MBps, 31 MBps, 10-15 MBps, 9 MBps]
52
Bottleneck Analysis
• NTFS Read/Write 9 disk, 2 SCSI bus, 1 PCI
» ~ 65 MBps Unbuffered read
» ~ 43 MBps Unbuffered write
» ~ 40 MBps Buffered read
» ~ 35 MBps Buffered write
[Diagram: Memory Read/Write ~150 MBps; PCI ~70 MBps; two Adapters (~30 MBps); 70 MBps]
53
Peak Throughput on Intel/NT
• NTFS Read/Write 24 disk, 4 SCSI, 2 PCI (64 bit)
» ~ 190 MBps Unbuffered read
» ~ 95 MBps Unbuffered write
» so: 0.8 TB/hr read, 0.4 TB/hr write on a 25k$ server.
[Diagram: Memory Read/Write ~150 MBps; two PCI buses (~70 MBps); four Adapters (~30 MBps); 190 MBps]
54
Penny Sort Ground Rules
http://research.microsoft.com/barc/SortBenchmark
• How much can you sort for a penny.
» Hardware and Software cost
» Depreciated over 3 years
» 1M$ system gets about 1 second,
» 1K$ system gets about 1,000 seconds.
» Time (seconds) = 946,080 / SystemPrice ($)
• Input and output are disk resident
• Input is
» 100-byte records (random data)
» key is first 10 bytes.
• Must create output file and fill with sorted version of input file.
• Daytona (product) and Indy (special) categories
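The time budget follows from depreciating the system price over three years: one penny buys 0.01 / price of the system's lifetime. The two examples on the slide (about 1 second for a 1M$ system, about 1,000 seconds for a 1K$ system) imply that the budget is 946,080 divided by the price. A minimal sketch, assuming 365-day years and an illustrative function name:

```python
# Sketch: how many seconds of machine time one penny buys.
# Assumption: straight-line depreciation over three 365-day years.

SECONDS_IN_3_YEARS = 3 * 365 * 24 * 3600  # 94,608,000


def penny_seconds(system_price_usd, budget_usd=0.01):
    """Seconds of system time purchased by the budget."""
    return budget_usd * SECONDS_IN_3_YEARS / system_price_usd


for price in (1_000_000, 1_000, 1_107):
    print(f"{price:>9,} $ system: {penny_seconds(price):8.1f} seconds")
```

For the 1107$ PennySort machine this gives a budget of roughly 855 seconds, which the reported elapsed time of 820 seconds fits within.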
55
PennySort
• Hardware
» 266 Mhz Intel PPro
» 64 MB SDRAM (10ns)
» Dual Fujitsu DMA 3.2GB EIDE
• Software
» NT workstation 4.3
» NT 5 sort
• Performance
» sort 15 M 100-byte records (~1.5 GB)
» Disk to disk
» elapsed time 820 sec, cpu time = 404 sec
[Pie chart: PennySort Machine (1107$): cpu 32%, Disk 25%, Other 22%, board 13%, Network, Video, floppy 9%, Memory 8%, Cabinet + Assembly 7%, Software 6%]
56
Cluster Sort Conceptual Model
• Multiple Data Sources
• Multiple Data Destinations
• Multiple nodes
• Disks -> Sockets -> Disk -> Disk
[Diagram: three source nodes each hold mixed records AAABBBCCC; after the exchange, destination nodes A, B, and C hold AAAAAAAAA, BBBBBBBBB, and CCCCCCCCC]
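The conceptual model can be sketched as a range-partitioned sort: every source sends each record to the destination that owns its key range, and each destination then sorts what it received. The code below is a single-process illustration under that assumption; the function name and the split keys are hypothetical, and a real cluster sort would move records over sockets between nodes.

```python
# Sketch: range-partitioned sort across several sources and destinations.
# Assumption: destinations own contiguous key ranges defined by split keys.
import bisect


def cluster_sort(sources, split_keys):
    """Partition records by key range, then sort each destination's share.

    split_keys is a sorted list; destination i receives keys below
    split_keys[i], and the last destination receives the rest.
    """
    destinations = [[] for _ in range(len(split_keys) + 1)]
    for source in sources:
        for record in source:
            # bisect_right picks the destination whose key range holds the record
            destinations[bisect.bisect_right(split_keys, record)].append(record)
    return [sorted(dest) for dest in destinations]


sources = [list("AAABBBCCC") for _ in range(3)]
result = cluster_sort(sources, split_keys=["B", "C"])
print(["".join(part) for part in result])
```

Concatenating the destinations in order yields the fully sorted output, since every key on one destination precedes every key on the next.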
60
Outline
• What storage things are coming from Microsoft?
• TerraServer: a 1 TB DB on the Web
• Storage Metrics: Kaps, Maps, Gaps, Scans
•The future of storage: ActiveDisks
61
Crazy Disk Ideas
• Disk Farm on a card: surface mount disks
• Disk (magnetic store) on a chip: (micro machines in Silicon)
• NT and BackOffice in the disk controller (a processor with 100MB dram)
ASIC
62
Remember Your Roots
63
Year 2002 Disks
• Big disk (10 $/GB)
» 3”
» 100 GB
» 150 kaps (k accesses per second)
» 20 MBps sequential
• Small disk (20 $/GB)
» 3”
» 4 GB
» 100 kaps
» 10 MBps sequential
• Both running Windows NT™ 7.0? (see below for why)
64
The Disk Farm On a Card
• The 1 TB disc card: an array of discs
• Can be used as
» 100 discs
» 1 striped disc
» 10 Fault Tolerant discs
» ....etc
• LOTS of accesses/second and bandwidth
[Diagram: a 14" card]
Life is cheap, it’s the accessories that cost ya.
Processors are cheap, it’s the peripherals that cost ya (a 10k$ disc card).
65
Put Everything in Future (Disk) Controllers (it’s not “if”, it’s “when?”)
Acknowledgements:
Dave Patterson explained this to me a year ago
Kim Keeton
Erik Riedel
Catharine Van Ingen
Helped me sharpen these arguments
66
Technology Drivers: Disks
• Disks on track
• 100x in 10 years: 2 TB 3.5” drive
• Shrink to 1” is 200GB
• Disk replaces tape?
• Disk is super computer!
[Scale: Kilo, Mega, Giga, Tera, Peta, Exa, Zetta, Yotta]
67
Data Gravity: Processing Moves to Transducers (moves to data sources & sinks)
• Move Processing to data sources
• Move to where the power (and sheet metal) is
• Processor in
»Modem
»Display
»Microphones (speech recognition) & cameras (vision)
»Storage: Data storage and analysis
68
It’s Already True of Printers: Peripheral = CyberBrick
• You buy a printer
• You get
» several network interfaces
» a PostScript engine: cpu, memory, software, a spooler (soon)
» and… a print engine.
69
Functionally Specialized Cards
• Storage
• Network
• Display
[Diagram: each card has an ASIC, a P mips processor, and M MB DRAM]
• Today: P = 50 mips, M = 2 MB
• In a few years: P = 200 mips, M = 64 MB
71
Basic Argument for x-Disks
• Future disk controller is a super-computer.
» 1 bips processor
» 128 MB dram
» 100 GB disk plus one arm
• Connects to SAN via high-level protocols
» RPC, HTTP, DCOM, Kerberos, Directory Services,….
» Commands are RPCs
» Management, security,….
» Services file/web/db/… requests
» Managed by general-purpose OS with good dev environment
• Apps in disk saves data movement
»need programming environment in controller
72
The Slippery Slope
• If you add function to server
•Then you add more function to server
•Function gravitates to data.
Nothing = Sector Server
Everything = App Server
Something =
Fixed App Server
73
Why Not a Sector Server? (let’s get physical!)
• Good idea, that’s what we have today.
• But
»cache added for performance
»Sector remap added for fault tolerance
»error reporting and diagnostics added
»SCSI commands (reserve,..) are growing
»Sharing problematic (space mgmt, security,…)
• Slipping down the slope to a 2-D block server
74
Why Not a 1-D Block Server? Put A LITTLE on the Disk Server
• Tried and true design
» HSC - VAX cluster
» EMC
» IBM Sysplex (3980?)
• But look inside
» Has a cache
» Has space management
» Has error reporting & management
» Has RAID 0, 1, 2, 3, 4, 5, 10, 50,…
» Has locking
» Has remote replication
» Has an OS
» Security is problematic
» Low-level interface moves too many bytes
75
Why Not a 2-D Block Server? Put A LITTLE on the Disk Server
• Tried and true design
» Cedar -> NFS
» file server, cache, space,..
» Open file is many fewer msgs
• Grows to have
» Directories + Naming
» Authentication + access control
» RAID 0, 1, 2, 3, 4, 5, 10, 50,…
» Locking
» Backup/restore/admin
» Cooperative caching with client
• File Servers are a BIG hit: NetWare™
» SNAP! is my favorite today
76
Why Not a File Server? Put a Little on the Disk Server
• Tried and true design
»Auspex, NetApp, ...
» NetWare
• Yes, but look at NetWare
» File interface gives you app invocation interface
» Became an app server: Mail, DB, Web,….
» NetWare had a primitive OS: hard to program, so optimized wrong thing
77
Why Not Everything?
Allow Everything on Disk Server (thin clients)
• Tried and true design
»Mainframes, Minis, ...
»Web servers,…
»Encapsulates data
»Minimizes data moves
»Scalable
• It is where everyone ends up.
• All the arguments against are short-term.
79
Disk = Node
• has magnetic storage (100 GB?)
• has processor & DRAM
• has SAN attachment
• has execution environment
[Diagram: software stack, top to bottom: Applications; Services, DBMS; File System, RPC, ...; OS Kernel with SAN driver and Disk driver]
80
Technology Drivers: System on a Chip
• Integrate Processing with memory on chip
» chip is 75% memory now
» 1MB cache >> 1960 supercomputers
» 256 Mb memory chip is 32 MB!
» IRAM, CRAM, PIM,… projects abound
• Integrate Networking with processing on chip
» system bus is a kind of network
» ATM, FiberChannel, Ethernet,.. logic on chip.
» Direct IO (no intermediate bus)
• Functionally specialized cards shrink to a chip.
82
Technology Drivers: What if Networking Was as Cheap As Disk IO?
• TCP/IP
»Unix/NT 100% cpu @ 40MBps
• Disk
»Unix/NT 8% cpu @ 40MBps
Why the Difference?
• Host Bus Adapter does SCSI packetizing, checksum,… flow control, DMA
• Host does TCP/IP packetizing, checksum,… flow control, small buffers
83
Technology Drivers: The Promise of SAN/VIA: 10x in 2 years
http://www.ViArch.org/
• Today:
» wires are 10 MBps (100 Mbps Ethernet)
» ~20 MBps tcp/ip saturates 2 cpus
» round-trip latency is ~300 us
• In the lab
» Wires are 10x faster: Myrinet, Gbps Ethernet, ServerNet,…
» Fast user-level communication: tcp/ip ~ 100 MBps at 10% of each processor; round-trip latency is 15 us
84
SAN: Standard Interconnect
[Diagram: bandwidth ladder: Gbps Ethernet 110 MBps; PCI 70 MBps; UW SCSI 40 MBps; FW SCSI 20 MBps; SCSI 5 MBps]
• LAN faster than memory bus?
• 1 GBps links in lab.
• 100$ port cost soon
• Port is computer
[Tombstones: RIP FDDI, RIP ATM, RIP SCI, RIP SCSI, RIP FC, RIP ?]
86
Technology Drivers: Plug & Play Software
• RPC is standardizing: (DCOM, IIOP, HTTP)
» Gives huge TOOL LEVERAGE
» Solves the hard problems for you: naming, security, directory service, operations,...
• Commoditized programming environments
» FreeBSD, Linux, Solaris,… + tools
» NetWare + tools
» WinCE, WinNT,… + tools
» JavaOS + tools
• Apps gravitate to data.
• General purpose OS on controller runs apps.
87
Basic Argument for x-Disks
• Future disk controller is a super-computer.
» 1 bips processor
» 128 MB dram
» 100 GB disk plus one arm
• Connects to SAN via high-level protocols
» RPC, HTTP, DCOM, Kerberos, Directory Services,….
» Commands are RPCs
» Management, security,….
» Services file/web/db/… requests
» Managed by general-purpose OS with good dev environment
• Move apps to disk to save data movement
» need programming environment in controller
88
Outline
• What storage things are coming from Microsoft?
• TerraServer: a 1 TB DB on the Web
• Storage Metrics: Kaps, Maps, Gaps, Scans
• The future of storage: ActiveDisks
• Papers and Talks at
http://research.Microsoft.com/~Gray