Stephen Dart, LaRDS Service Manager, Monash e-Research Centre
LaRDS Staging Post: Enhancing Workgroup Productivity
Managing User Expectations
In a perfect world
• Dedicated wire
• 1 Gb/s
• 125 MB per second
• 7.5 GB per minute
• 450 GB per hour
• 10 TB per day
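The "perfect world" figures above are straight unit conversion from a 1 Gb/s wire; a quick sketch of the arithmetic (integer shell math, so the 7.5 GB/minute figure rounds down):

```shell
# 1 Gb/s = 1000 Mb/s; 8 bits per byte
mb_per_sec=$((1000 / 8))                          # 125 MB/s
gb_per_min=$((mb_per_sec * 60 / 1000))            # 7 GB/min (7.5 truncated)
gb_per_hour=$((mb_per_sec * 3600 / 1000))         # 450 GB/hour
tb_per_day=$((mb_per_sec * 86400 / 1000 / 1000))  # 10 TB/day
echo "$mb_per_sec MB/s, $gb_per_hour GB/h, $tb_per_day TB/day"
```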
In reality, inconsistency
• Slow speed
• 3~30 MB per second
• Workstation, Server or LaRDS?
• Share Hangs or Disconnects
• Please Explain!
Network at the Edge
Complications at the Core
Current LaRDS Samba service
- LaRDS Samba service for workgroup file sharing
- End-user experience is speed limitations
- Not suited for workstation backup
- Not suited for bulk upload
- Oversubscribed disk is pushed to tape
- Something faster, please
Many factors make things run slowly
• Current situation
– LaRDS Samba based on a virtual server
– Workstations at the edge of the network
– Network bandwidth contention getting to LaRDS
Current ARMI workstation service
- Single network port per workstation
- 1 Gb/s bit rate on port
- Effective throughput peaks below 10%
- Common network switch for the whole floor
- Can handle many point-to-point transfers within the floor
- Must share floor bandwidth to the building switch
- Common network switch for the building
- Must share building bandwidth to the precinct switch
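The sub-10% effective throughput follows from contention: workstations on a floor share one 1 Gb/s uplink (~125 MB/s raw) toward the building and precinct switches. A rough sketch, with illustrative (not measured) user counts:

```shell
# Fair-share upper bound per user when n transfers contend for one
# 1 Gb/s uplink; protocol overhead pushes real figures lower still.
uplink_mb_s=125
for n in 4 10 40; do
  echo "$n concurrent transfers -> ~$((uplink_mb_s / n)) MB/s each"
done
```

The resulting ~3 to ~31 MB/s per user brackets the observed "3~30 MB per second" range.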
What can be done now
- Provide a local data service for workstations
- Install a Staging Post on the same switch as users
  - Bypass VeRA for uploads and backup
- Increase bandwidth between the floor switch and the precinct router
  - Extra floor and building uplinks
  - Faster links between switches
What can be done now
- Offload the big data as quickly as possible
  - To a local cache that can be used as a working share
  - Sync the data on a daily basis with LaRDS
Something still not right
• NAS on same switch and subnet as workstation
• One session ok, but second session kills first!
• Network engineers insist NAS too slow and dropping packets
• Serious detective work starts
Network Engineers in Denial
• Network bandwidth to NDT server
– http://ndt.its.monash.edu.au/toolkit/
• Network bandwidth to Speedtest.net
– http://www.speedtest.net/
• Network Weather Map all clear
– http://cacti.its.monash.edu.au/cacti/weathermap/weathermap.html
– Low utilization and no errors
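Checks like these can also be run point-to-point between workstation and NAS with iperf, which takes the web tools out of the picture (the host name is a placeholder, and iperf is assumed to be installed on both ends):

```shell
iperf -s                                     # on the NAS
iperf -c nas.example.monash.edu -t 30 -P 4   # on the workstation, 4 parallel TCP streams
# Starting a second client while the first runs reproduces the
# "second session kills the first" symptom if QoS is misclassifying traffic.
```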
QoS policy set at default for VoIP
Research networks generate data at the edge for upload to the core
Traditional Corporate Intranet
Research and Instrumentation Intranet
Tackle System Integration
• Rethink QoS
– Trial with QoS off (unmanaged)
– Open call with Cisco
– TCP/IP behaviour
– Get Network Engineers trained in QoS
• Make sure the NAS is connected to AD
– VeRA Samba was not AD-connected
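For reference, the core of an AD-connected Samba configuration is only a few smb.conf lines (the realm and workgroup names below are placeholders); the one-off join is then done with `net ads join` and verified with `net ads testjoin`:

```ini
[global]
    security = ads
    realm = EXAMPLE.MONASH.EDU
    workgroup = MONASH
```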
Updated QoS rolled out to all switches
Five Size Options for Staging Post

Staging Post    Capacity                User load    NIC Speed   Cost
QNAP-509Pro     5 x 1.5TB (6TB RAID5)   ~10 users                $2,500
QNAP-809Pro     8 x 2TB (12TB RAID5)    ~20 users    1Gb/s       $4,300
QNAP-859URP     8 x 3TB (18TB RAID6)    ~30 users    1Gb/s       $4,750
(rackmounted)
QNAP-1279U-RP   12 x 3TB (30TB RAID6)   ~50 users    10Gb/s      $8,000
SGI ISS-3500    24 x 2TB (40TB RAID6)   ~100 users   10Gb/s      $25,000
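The usable-capacity figures in the table follow from the RAID level: RAID5 sacrifices one disk to parity, RAID6 two. A small sketch of the arithmetic (rows whose listed figure comes out lower than this raw calculation, such as the 809Pro's 12TB and the ISS-3500's 40TB, presumably also reserve hot spares):

```shell
raid_usable() {  # raid_usable <disks> <tb_per_disk> <parity_disks>
  awk -v n="$1" -v s="$2" -v p="$3" 'BEGIN { print (n - p) * s }'
}
raid_usable 5 1.5 1    # QNAP-509Pro,   RAID5 -> 6 (TB)
raid_usable 8 3 2      # QNAP-859URP,   RAID6 -> 18
raid_usable 12 3 2     # QNAP-1279U-RP, RAID6 -> 30
```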
Rearrange existing disk usage
• Provide two file systems to match usage
• Working data sets (fast, local disk)
– Online now, used often, interim results
• Archive data sets (deep, NFS to DMF)
– Step or phase completion
– Reference for future work
– Storage object as a group of files
– Publication and citation
Integrate with Grid Access
• Grid users using DMF for home folders
– Grid processes flooding DMF shares
– Many small files gone by the time they hit the front of the migration queue
– DMF recalls stall Grid jobs
• Provide non-DMF Grid scratch
– Don’t back it up
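A non-DMF scratch export might look like this in /etc/exports (the path and host pattern are assumptions); because it sits on plain local disk, there is no migration queue and no recall to stall a job:

```
# /etc/exports: non-DMF Grid scratch, excluded from backup
/scratch  gridnode*.example.monash.edu(rw,async,no_subtree_check)
```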
Outstanding Issues
• Speeding up other VMs without hardware scale out
• Presenting Samba users with an indication of offline status
• User Indoctrination
Questions