what happens when the science dmz meets the commodity internet? presenter: joe breen work done:...

Post on 17-Jan-2016

What happens when the Science DMZ meets the Commodity Internet?

Presenter: Joe Breen
Work done: Brian Haymore, Sam Liston

University of Utah Center for High Performance Computing

Boil and Bubble, Toil and Trouble,
What do you get with a Science DMZ,
That's mixed up with unlimited storage
Fast and free?

Image credit: http://www.zastavki.com/eng/Holidays/Halloween/wallpaper-24660.htm

Start with a researcher

Offer him candy (unlimited storage)

Tell her it's free

Mix in a "frictionless" Science DMZ environment with 40G and 10G Data Transfer Nodes

Image: https://fasterdata.es.net/science-dmz/science-dmz-architecture/

Mix in a well built commodity cloud service that can consume lots of data quickly

Throw in an open source parallel tool that knows how to efficiently utilize a cloud provider’s API

• Google Drive
• Amazon S3
• Openstack Swift / Rackspace cloud files / Memset Memstore
• Dropbox
• Google Cloud Storage
• Amazon Cloud Drive
• The local filesystem

http://rclone.org/

What do you get?

14+ Gb/s spikes, 5-8 Gb/s of sustained traffic

One 10G commodity pipe fills completely, traffic rolls to next available close peering point, and fills it too

R&E routes yanked temporarily by Cloud vendor NOC to allow service to other commodity users and to better understand nature of congestion

A call from a cloud provider NOC early in the morning asking us to stop (at least for a bit)

Almost 100TB of data moved in 2.5 days
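As a rough check on what that volume means for the network, the average rate can be worked out directly. This is a back-of-the-envelope sketch only: it assumes decimal units (1 TB = 10^12 bytes) and treats "almost 100TB" as exactly 100 TB; the function name is made up for illustration.

```python
def average_gbps(terabytes: float, days: float) -> float:
    """Average throughput in Gb/s for a transfer of `terabytes` over `days`.

    Assumes decimal units: 1 TB = 10**12 bytes, 1 Gb = 10**9 bits.
    """
    bits = terabytes * 10**12 * 8
    seconds = days * 24 * 60 * 60
    return bits / seconds / 10**9


# "Almost 100TB" is rounded up to 100 TB here, so this is an upper estimate.
rate = average_gbps(100, 2.5)
print(f"{rate:.2f} Gb/s")  # prints "3.70 Gb/s"
```

An average of about 3.7 Gb/s over the whole window is below the 5-8 Gb/s sustained figure, which would be consistent with the transfer not running at full rate for the entire 2.5 days.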

Very happy researchers who want more

What's an HPC center to do? Start mixing more...

Multiple vendors now offering Apps/unlimited storage targeted at EDUcation *individual* users.

Same vendors offer multi-tier Cloud storage for purchase for archival storage, mid-level storage, highly available storage, specific application storage

Today's HPC researcher use cases for large personal cloud storage
• Using as another storage bin for keeping point-in-time source code, input files, etc.
• Using instead of a USB drive - a temporary location for:
  • Saving snippets of code
  • Looking at single files
  • Moving data from a national resource that requires cleaning
• Sharing data to a distributed international audience

Potential other use cases for personal cloud storage or organization cloud storage
• Use as large archive for different data sets for individual researcher
• Use as archive for a research group or a collection of collaborating groups
• "service account" with Linux extended ACLs to tar up and back up multiple collaborating research groups using the same file systems
• allows a simple formula for groups to replicate data in another location -- flexibility for users at their level and convenience
• Back off info from scratch drives - shuffle back and forth from Cloud provider and from HPC scratch – Accommodate over DMZ DTN box
• ... The list continues …

Some of these vendors have Research & Education peering directly, some of them only maintain Commodity peering

Commodity peering not designed for bursty research or long term large data set file transfers

Large research data set transfer

Normal Commodity traffic

Commodity peering points serve lots of companies and businesses

Do we need additional tools in our ecosystem?

How do we protect the community, the vendors, and the collaborative network environments, AND, encourage the innovation?

Should we more aggressively leverage the emerging capabilities of the national Research and Education Software Defined Networking backbone?

Match: flow (source IP, source port, destination IP, destination port…) => Action: apply QoS or block, per flow

Match = Action
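The match/action idea can be sketched as a small rule table. This is an illustrative model only, not any controller's or switch's actual API: the `Flow`, `Rule`, and `decide` names, the "forward" default, and the addresses (taken from the documentation ranges 192.0.2.0/24, 198.51.100.0/24, 203.0.113.0/24) are all assumptions made for the example.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Flow:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


@dataclass(frozen=True)
class Rule:
    action: str                      # "qos" or "block" (hypothetical action names)
    src_net: Optional[str] = None    # None acts as a wildcard
    src_port: Optional[int] = None
    dst_net: Optional[str] = None
    dst_port: Optional[int] = None
    rate_mbps: Optional[int] = None  # only meaningful when action == "qos"

    def matches(self, flow: Flow) -> bool:
        def in_net(ip: str, net: Optional[str]) -> bool:
            return net is None or ipaddress.ip_address(ip) in ipaddress.ip_network(net)

        def same_port(port: int, want: Optional[int]) -> bool:
            return want is None or port == want

        return (in_net(flow.src_ip, self.src_net)
                and same_port(flow.src_port, self.src_port)
                and in_net(flow.dst_ip, self.dst_net)
                and same_port(flow.dst_port, self.dst_port))


def decide(rules: List[Rule], flow: Flow) -> str:
    """Return the action of the first matching rule, or "forward" if none match."""
    for rule in rules:
        if rule.matches(flow):
            if rule.action == "qos":
                return f"qos:{rule.rate_mbps}"
            return rule.action
    return "forward"


# One rule: rate-limit a single host's HTTPS traffic toward one cloud prefix.
RULES = [
    Rule(action="qos", src_net="192.0.2.10/32",
         dst_net="198.51.100.0/24", dst_port=443, rate_mbps=2000),
]

cloud_flow = Flow("192.0.2.10", 50000, "198.51.100.7", 443)
other_flow = Flow("192.0.2.10", 50001, "203.0.113.5", 2811)
print(decide(RULES, cloud_flow))  # prints "qos:2000"
print(decide(RULES, other_flow))  # prints "forward"
```

In this sketch only the flow toward the cloud prefix is shaped; a second flow from the same source host falls through to the default and is forwarded untouched, which is the per-flow granularity the slide is pointing at.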

Data Transfer Nodes are generally multi-tenant, serving the transfer needs of several groups simultaneously.

How might we work with one tenant AND continue to service other tenants well without null routing whole networks or even individual DTNs?

Image: https://fasterdata.es.net/science-dmz/science-dmz-architecture/

Does the community need a more precise tool?

OR

Image: http://onsurg.com/instrument-handling-10-blade/

Summary

• Science DMZs are frictionless, Commodity connections to Cloud vendors are not
• Cloud vendors are coming up with new business plans that researchers like to use creatively
• Do emerging technologies allow us to create new tools that might enable the innovation and protect the ecosystem?
