Keynote: IEEE International Workshop on Cloud Analytics. Dennis Gannon

Upload: microsoft-azure-for-research

Post on 26-Jan-2015

Category:

Data & Analytics


DESCRIPTION

This talk describes our experiences hosting scientific research applications in the Microsoft cloud. It covers an overview of Microsoft Azure capabilities, examples of big data analysis for science, data collections, science gateways, and science virtual machine libraries.

TRANSCRIPT

Page 1: Keynote IEEE International Workshop on Cloud Analytics. Dennis  Gannon
Page 4:

• Thousand years ago: Description of natural phenomena

• Last few hundred years: Newton’s laws, Maxwell’s equations…

• Last few decades: Simulation of complex phenomena

• Today and the Future: Unify theory, experiment and simulation with large multidisciplinary Data

  • Using data exploration and data mining (from instruments, sensors, humans…)

  • Distributed Communities

[Slide equation, garbled in extraction]

Page 7:

Melbourne

Sydney

Page 8:

IT PAC

Page 10:

application building blocks

Page 20:

Architecture diagram with numbered steps 1 to 7 connecting:

• User Browser

• Web Portal

• Task Queue

• Executable / Data (Windows Azure Storage)

• Compute Nodes
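
The components named on this slide (a portal that accepts work, a task queue, compute nodes, and shared storage for executables and data) form a common queue-and-worker pattern. The sketch below is only an illustration of that pattern, assuming in-process stand-ins: a Python `queue.Queue` plays the task queue, a dictionary plays the storage account, and threads play the compute nodes. The blob names, the `submit`, `compute_node`, and `run` functions, and the upper-casing "executable" are all hypothetical, and the order of the numbered steps in the original diagram cannot be recovered from the transcript.

```python
import queue
import threading

# Stand-in for blob storage: maps a blob name to its contents (hypothetical data).
storage = {"input/a.txt": "acgt", "input/b.txt": "ttga"}
storage_lock = threading.Lock()

# Stand-in for the task queue between the web portal and the compute nodes.
task_queue = queue.Queue()


def submit(blob_name):
    """Portal side: enqueue a task that names the input blob to process."""
    task_queue.put(blob_name)


def compute_node():
    """Worker side: pull tasks until a None sentinel arrives."""
    while True:
        blob_name = task_queue.get()
        if blob_name is None:
            task_queue.task_done()
            break
        with storage_lock:
            data = storage[blob_name]
        result = data.upper()  # placeholder for running the real executable
        with storage_lock:
            storage[blob_name.replace("input/", "output/")] = result
        task_queue.task_done()


def run(num_nodes=2):
    """Submit every input blob, wait for the workers, and return the outputs."""
    inputs = [name for name in storage if name.startswith("input/")]
    workers = [threading.Thread(target=compute_node) for _ in range(num_nodes)]
    for worker in workers:
        worker.start()
    for name in inputs:
        submit(name)
    for _ in workers:
        task_queue.put(None)  # one stop signal per worker
    task_queue.join()
    for worker in workers:
        worker.join()
    with storage_lock:
        return {k: v for k, v in storage.items() if k.startswith("output/")}


if __name__ == "__main__":
    print(run())
```

Because the portal and the workers only share the queue and the storage, the number of compute nodes can change without touching the portal code, which is the main reason the pattern suits elastic cloud resources.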

Page 22:

ChronoZoom: An infinite canvas in time

Page 25:

• Many Examples

• The Challenge: sustainability

• Manage locality

  • Keep the hot data local on cloud disk

  • Manage the working set over time

  • The rest is archival
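
The locality bullets above can be read as a two-tier policy: a small, fast tier holds the current working set and everything else stays in an archive. The following is a minimal sketch under that assumption, using least-recently-used eviction as one possible way to track the working set over time. The `TieredStore` class, its capacity, and the sample keys are hypothetical and are not taken from the talk.

```python
from collections import OrderedDict


class TieredStore:
    """Bounded hot tier in front of a complete archive (illustrative only)."""

    def __init__(self, archive, hot_capacity):
        self.archive = archive        # slow, complete copy (the archival tier)
        self.hot = OrderedDict()      # fast, bounded copy (the cloud-disk tier)
        self.hot_capacity = hot_capacity
        self.archive_reads = 0        # counts how often the slow tier was hit

    def get(self, key):
        if key in self.hot:
            # Hot hit: mark the item as most recently used.
            self.hot.move_to_end(key)
            return self.hot[key]
        # Miss: fetch from the archive and promote into the hot tier.
        value = self.archive[key]
        self.archive_reads += 1
        self.hot[key] = value
        if len(self.hot) > self.hot_capacity:
            # Evict the least recently used item; it remains in the archive.
            self.hot.popitem(last=False)
        return value


store = TieredStore({"a": 1, "b": 2, "c": 3}, hot_capacity=2)
for key in ["a", "b", "a", "c"]:
    store.get(key)
```

After these four reads the hot tier holds `a` and `c`, and only three of the four reads touched the archive.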

Data acquisition & modelling

Collaboration and visualisation

Analysis & data mining

Dissemination & sharing

Archiving and preserving

Page 27:

• A core library for science in the cloud.

• Built on community tools

  • IPython Notebook, Python, NumPy, SciPy, Scikit-Learn, Biopython

• Standard community tools

• Deploy as VM library

• Deploy data collections.

  • Genomic libraries, medical image libraries, geophysics, astronomy, etc.

• Build community resources

Page 32:

• The Genetic Causes of Disease (David Heckerman)

• Wellcome Trust data for a GWAS (genome-wide association study) of a large population

• Looking for causes of seven common diseases (bipolar disorder, rheumatoid arthritis, coronary artery disease, hypertension, …)

• Confounding is a problem. Needed a new algorithm.

• Ran on Azure cloud using 35,000 cores in 3 weeks.
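
To see why confounding matters in an association study, consider two subpopulations that differ in both allele frequency and baseline disease rate. The sketch below uses made-up counts and a classical stratified estimate (the Mantel-Haenszel odds ratio) purely for illustration. It is not the new algorithm mentioned on the slide, and the `Table` class and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Table:
    """2x2 counts for one subpopulation: allele carrier status vs. disease."""
    carrier_cases: int
    carrier_controls: int
    noncarrier_cases: int
    noncarrier_controls: int

    @property
    def total(self):
        return (self.carrier_cases + self.carrier_controls
                + self.noncarrier_cases + self.noncarrier_controls)


def odds_ratio(t):
    """Odds of disease among carriers divided by odds among non-carriers."""
    return (t.carrier_cases * t.noncarrier_controls) / (
        t.carrier_controls * t.noncarrier_cases)


def pooled(tables):
    """Ignore population structure by summing the counts across strata."""
    return Table(
        sum(t.carrier_cases for t in tables),
        sum(t.carrier_controls for t in tables),
        sum(t.noncarrier_cases for t in tables),
        sum(t.noncarrier_controls for t in tables),
    )


def mantel_haenszel(tables):
    """Stratified odds ratio that adjusts for the subpopulation."""
    num = sum(t.carrier_cases * t.noncarrier_controls / t.total for t in tables)
    den = sum(t.carrier_controls * t.noncarrier_cases / t.total for t in tables)
    return num / den


# Made-up data: the allele is common where the disease is common, but within
# each subpopulation carriers and non-carriers have the same disease rate.
tables = [
    Table(240, 560, 60, 140),   # subpopulation A: 80% carriers, 30% disease
    Table(20, 180, 80, 720),    # subpopulation B: 20% carriers, 10% disease
]

naive = odds_ratio(pooled(tables))
adjusted = mantel_haenszel(tables)
print(f"pooled odds ratio:   {naive:.2f}")
print(f"adjusted odds ratio: {adjusted:.2f}")
```

The pooled odds ratio is about 2.16 even though the allele has no effect within either subpopulation, while the stratified estimate is 1.0. Stratification only works when the confounding groups are known in advance, which is one reason large studies with hidden population structure call for other methods.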

Page 33:

Inputs (training data)

Labels

Hidden layers

Input data, Detected features, Mona Lisa
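
The labels on this slide describe a layered network in which inputs pass through hidden layers to produce a label. As a rough illustration of that data flow only, the sketch below runs a forward pass through one hidden layer with hand-picked weights that compute XOR. Nothing here is learned from training data, and the weights, the `layer` and `predict` functions, and the step activation are assumptions chosen to keep the example small.

```python
def step(x):
    """Threshold activation: fire (1) when the weighted sum is positive."""
    return 1 if x > 0 else 0


def layer(inputs, weights, biases):
    """Apply one fully connected layer followed by the step activation."""
    return [
        step(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Hand-picked parameters (not trained): the two hidden units detect the
# features "at least one input is on" and "both inputs are on".
HIDDEN_W = [[1, 1], [1, 1]]
HIDDEN_B = [-0.5, -1.5]
# The output unit fires when the first feature is present without the second.
OUT_W = [[1, -1]]
OUT_B = [-0.5]


def predict(x1, x2):
    """Inputs -> hidden layer (detected features) -> output label."""
    hidden = layer([x1, x2], HIDDEN_W, HIDDEN_B)
    return layer(hidden, OUT_W, OUT_B)[0]


for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", predict(a, b))
```

In a trained network the weights would be fitted to the labelled training data rather than chosen by hand, and the hidden layers would detect features that nobody specified in advance.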

Page 36:

The Windows Azure for Research program:

· Free access to Windows Azure cloud computing and storage (submit proposals for Windows Azure Research Awards)

· Windows Azure for Research training classes (20 classes worldwide)

· Support and technical resources

azure4research.com
