david abramson v2 - australian national university · pdf filedederko, nevo, altshuler, wu,...


Page 1: David Abramson v2 - Australian National University · PDF fileDederko, Nevo, Altshuler, Wu, Mcculloch, Mihaylova, Kerckhoffs , UCSD APD Restitution 0 ... David Abramson Monash e-Science

Grid enabling 'real' science and engineering ?

David Abramson

Monash e-Science and Grid Engineering Lab (MESSAGE Lab)

Faculty of Information Technology

Acting Director: Monash e-Research Centre

ARC Professorial Fellow

Contents
• The changing face of science
• Grid testbeds: The PRAGMA testbed
• Software tools for “real” science
  – The MESSAGE Lab approach
• Some successful projects
  – Environmental science
  – Systems biology
  – Chemistry
  – Physics
  – Engineering
• Conclusions

THE CHANGING FACE OF SCIENCE

e-Science
• Pre-Internet
  – Theorize and/or experiment, alone or in small teams; publish paper
• Post-Internet
  – Construct and mine large databases of observational or simulation data
  – Develop simulations & analyses
  – Access specialized devices remotely
  – Exchange information within distributed multidisciplinary teams

Source: Ian Foster

“Grids are not just communities of computers, but communities of researchers, of people.”

— Peter Arzberger, UCSD


Typical Grid Applications
• Characteristics
  – High Performance Computation
  – Distributed infrastructure
  – Instruments are first class resources
  – Lots of data
  – Not just bigger – fundamentally different
• Some examples
  – In-silico biology (See MyGrid)
  – Earthquake simulation
  – Virtual observatory
  – High energy physics
  – Medical applications
  – Environmental applications

COMPUTATIONAL PLATFORMS

The Grid

• Infrastructure (“middleware” & “services”) for establishing, managing, and evolving multi-organizational federations
  – Dynamic, autonomous, domain independent
  – On-demand, ubiquitous access to computing, data, and services
• Mechanisms for creating and managing workflow within such federations
  – New capabilities constructed dynamically and transparently from distributed services
  – Service-oriented, virtualization

Source: Ian Foster


What is a Grid?
• Three key criteria
  – Coordinates distributed resources …
  – using standard, open, general-purpose protocols and interfaces …
  – to deliver non-trivial qualities of service.
• What is not a Grid?
  – A cluster, a network attached storage device, a scientific instrument, a network, etc.
  – Each may be an important component of a Grid, but by itself does not constitute a Grid

Source: Ian Foster

The (Power) Grid: On-Demand Access to Electricity

[Figure: axes labelled "Time" and "Quality, economies of scale"]

Source: Ian Foster


By analogy, some challenges

• Voltage: 110 – 220 – 240 V
• Frequency: 50 – 60 Hz

THE PRAGMA TESTBED: A GRASS ROOTS GRID

PRAGMA Grid Testbed

32 Clusters from 29 institutions in 14 countries/regions (+ 7 in preparation)

[Map: testbed sites by country/region]
• Switzerland: UZurich
• Thailand: NECTEC, ThaiGrid
• India: UoHyd
• Malaysia: MIMOS, USM
• Hong Kong: CUHK
• Taiwan: ASGC, NCHC
• Vietnam: IOIT-HCM
• Japan: AIST, OsakaU, UTsukuba, TITech
• Singapore: BII, IHPC, NGO
• Australia: MU, APAC, QUT
• Korea: KISTI
• China: JLU, CNIC, GUCAS, LZU
• USA: SDSC, UMC, UUtah, NCSA, BU
• Mexico: CICESE, UNAM
• Chile: UCN, UChile
• Costa Rica: ASURC
• New Zealand: BESTGrid

7 Gfarm sites: AIST, SDSC, NGO, NECTEC, ThaiGrid, ASGC, CNIC

Source: Cindy Zheng

PRAGMA Testbed

What have we learned?
• Faults (hardware, software and network) are normal
• Testbed supports a rich variety of application needs
• Testbed is large, raising issues of
  – resource availability, network bandwidth and contention, distributed resource monitoring, and global name spaces
• Coordination of resources, monitoring, inter-operable software deployments, and resource/user policies is complicated because applications may have different requirements for middleware and resource allocation
• Users and administrators are located across 12 time zones
  – A challenge in putting together a team with very diverse cultural backgrounds
• Countries have different levels of network speed, quality, and system resources
• Strong feedback between application and middleware (e.g. Ninf-G, Nimrod and Gfarm)

Zheng, C., Abramson, D., Arzberger, P., Ayyub, S., Enticott, C., Garic, S., Katz, M., Kwak, J., Lee, B. S., Papadopoulos, P., Phatanapherom, S., Sriprayoonsakul, S., Tanaka, Y., Tanimura, Y., Tatebe, O., Uthayopas, P. “The PRAGMA Testbed: Building a Multi-Application International Grid”, Workshop on Grid Testbeds, 6th IEEE International Symposium on Cluster Computing and the Grid, 16-19 May 2006, Singapore.

MESSAGE LAB APPROACH


Supporting the software life-cycle

[Diagram: layered software stack]
• Applications: Environmental Sciences, Life & Pharmaceutical Sciences, Geo Sciences
• Upper Middleware/Tools, arranged by life-cycle phase (Development, Deploy, Test/Debug, Execution): Kepler, Nimrod/K, Worqbench, Eclipse, IE, VS, DistANT, Motor, Guard, Nimrod/G /O /E, Nimrod Portal & WS, REMUS, GriddLeS, NetFiles, Active Data, ActiveSheets, Excel
• Lower Middleware: Globus GT4, Web Services, SRB, VPN, SSH
• Platform Infrastructure: Unix, Windows, JVM, TCP/IP, MPI, .Net Runtime

Nimrod Development Cycle
• Prepare jobs using portal
• Jobs scheduled and executed dynamically
• Sent to available machines
• Results displayed & interpreted
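The cycle above amounts to a parameter sweep: expand a parameter space into jobs, hand each job to whichever machine is free, then gather the results. A minimal Python sketch of that idea follows. It assumes a local thread pool standing in for Grid resources; `prepare_jobs`, `run_model`, `run_sweep` and the parameter names are hypothetical illustrations and are not part of Nimrod.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from itertools import product


def prepare_jobs(parameters):
    """Expand {name: [values]} into one job (a dict) per combination."""
    names = sorted(parameters)
    value_lists = [parameters[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]


def run_model(job):
    """Hypothetical stand-in for the user's simulation code."""
    return job["thickness"] * job["energy"]


def run_sweep(parameters, workers=4):
    """Run each job on whichever worker is free; return (job, result) pairs."""
    jobs = prepare_jobs(parameters)
    results = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(run_model, job): job for job in jobs}
        # as_completed yields futures in finishing order, not submission order
        for future in as_completed(futures):
            results.append((futures[future], future.result()))
    return results


if __name__ == "__main__":
    for job, value in run_sweep({"thickness": [0.1, 0.2], "energy": [1.0, 2.0]}):
        print(job, value)
```

Results arrive in completion order, so a real study would sort or index them by parameter values before plotting.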

PHYSICS

Ionisation Chamber Design
Lew Kotler, ARPANSA

[Diagram: ionisation chamber, labelled with front wall, guard electrode, collecting electrode, ion collection volume (10 mm diameter x 21 mm length), direction of photons, polystyrene insulator, Teflon insulator, aluminium, field-shaping electrodes]

[Plot: EGS4 simulation of the response of an aluminium cavity chamber to 60Co photons. x-axis: thickness of front cap (cm), 0 to 0.4; y-axis: ion pairs per incident photon, 0.125 to 0.175. Series: EGS4 calculation; fitted to sum of exponentials]

Radiotherapy planning
Giddy, Chin, Lewis, Welsh e-Science Centre, UK

[Diagram: radiation source, patient, imager. Image: www.utsouthwestern.edu/.../270177SynergyS.2.bmp]

Codes: BEAMnrc, EGS, DOSXYZnrc

Outcomes

[Figure panels: Convolution/Superposition; Monte Carlo. Spezi E 2003 PhD Thesis Med Phys 31(3)]

SmartPET - A Compton Camera
Toby Beveridge, Monash University

A SmartPET Detector
• Large volume: 20 x 60 x 60 mm³
• Operating range: 0.1 – 2 MeV
• Detector resolution depends on Pulse Shape Analysis

A Compton Camera
• Extensive FoV
• Multi-resolution data
• Angular precision depends on detector resolution

• Multi-parameter space is difficult to characterise, and optimise, analytically
• Monte-Carlo solutions such as GEANT4 are computationally expensive

Outcomes

Each pixel (at a particular incident energy) was assigned a separate job

At each point the resolution matrix could be calculated

For a Single Trial
• 242 point-source locations (11² field over 2 orthogonal planes)
• 5 energies (between 140 keV and 1000 keV)
• 2 different detection conditions
• 242 x 5 x 2 x (20 mins per run) ≈ 806 hours
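The single-trial cost on this slide can be checked with a short calculation. The Python sketch below reads the field as 11 x 11 points on each of 2 planes (121 x 2 = 242) and multiplies out the sweep; the function names and the processor count in the usage note are illustrative assumptions, not figures from the talk.

```python
def sweep_cost(field_side, planes, energies, conditions, minutes_per_run):
    """Return (locations, runs, cpu_hours) for a full-factorial sweep."""
    locations = field_side ** 2 * planes
    runs = locations * energies * conditions
    cpu_hours = runs * minutes_per_run / 60
    return locations, runs, cpu_hours


def ideal_wall_clock_hours(cpu_hours, processors):
    """Elapsed time if runs spread evenly over identical processors."""
    return cpu_hours / processors


if __name__ == "__main__":
    locations, runs, cpu_hours = sweep_cost(11, 2, 5, 2, 20)
    print(locations, runs, round(cpu_hours, 1))
```

With these inputs the sweep is 2420 independent 20-minute runs, about 806.7 CPU hours. On a hypothetical 100 processors with no queueing or failures, `ideal_wall_clock_hours` gives roughly 8 hours, which is why one job per pixel and energy suits a Grid.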

ENVIRONMENT

Climate Studies
Lynch, Abramson, Görgen, Beringer, Uotila, Monash University
• Extensive savanna eco-systems in northern Australia
• Changing fire regime
• Fires lead to abrupt changes in surface properties
  – Surface energy budgets
  – Partitioning of convective fluxes
  – Increased soil heat flux
  → Modified surface-atmosphere coupling
• Sensitivity study: do the fires' effects on atmospheric processes lead to changes in the highly variable precipitation regime of the Australian Monsoon?
• Many potential impacts (e.g. agricultural productivity)

(J. Beringer)

Outcomes

[Plot: time series from 8 Mar 2006 to 18 Aug 2006, y-axis 0 to 100. Series: mahar, rocks-52, ume, jupiter, pragma001, amata1, tgc, Total]

A Workshop On Earth System Models of Intermediate Complexity, 28-29 March 2006 at the Bureau of Meteorology Research Centre, Melbourne

SYSTEMS BIOLOGY

Cardiac Modelling
Sher, Gavaghan, Hinch, Noble, Oxford University
• Heart disease still leading cause of death
• Understanding the underlying physiological mechanisms is cheaper and faster when experimental studies are performed together with mathematical models & computer simulations
  – Studying pathologies
  – Developing & testing drugs

[Figure panels: Normal Rhythm; Ventricular Fibrillation]

Cardiac Modelling
• Based on experimental data, mathematical models have been developed
  – ODEs
  – Initial conditions
  – Ion movement in single cells

[Figure: Shannon et al. model, 2004]

Studying ionic models
Anna Sher, Oxford
• Examine the effect of various parameters on Ca²⁺-induced Ca²⁺ release and on shape of the action potential
• Fit simulated to experimental data
• Identify parameter(s) that are critical to distinguish Ca²⁺ dynamics within various species

Outcomes
• Single cell ionic models allow us to study:
  – Whole cell currents during an action potential (AP)
  – Currents in response to voltage-clamp stimuli
  – Dynamics of ions such as Ca²⁺ and Na⁺
  – Force-frequency relationship
  – etc.

[Plots: Vm [mV] and INCX [A/F] against time (ms), with the transient inward current marked. Blue: control; Red: local NCX 0.1; Green: local NCX 0.3]

Cardiac Modelling
Dederko, Nevo, Altshuler, Wu, McCulloch, Mihaylova, Kerckhoffs, UCSD

[Plot: 3.0 Hz APD Restitution. x-axis: DI (s), 0 to 0.2; y-axis: APD90 (s), 0 to 0.2]

[Plot: APD Restitution. x-axis: DI (s), 0 to 0.5; y-axis: APD90 (s), 0 to 0.4. Series: 0.5 Hz, 1.0 Hz, 1.5 Hz, 2.0 Hz, 3.0 Hz, 4.0 Hz]

[Plot: APD Restitution. x-axis: DI (s), 0 to 0.8; y-axis: APD90 (s), 0 to 0.2. Series: 0.5 Hz, 1.0 Hz, 1.5 Hz, 2.0 Hz, 3.0 Hz, 4.0 Hz]

[Plot: 1.0 Hz Epicardial Cell Calcium Restitution. x-axis: DI (s), 0 to 0.6; y-axis: Calcium (a.u.), 0 to 0.0025. Series: Control, 0.01 Iso, 0.1 Iso, 1.0 Iso]

CHEMISTRY

Quantum Chemistry
Wibke Sudholt, Univ Zurich

[Plot: Potential U/a.u. (0 to 1) against Radius r/a.u. (0 to 3), with parameters A1, A2 and B1, B2; inset: molecular structure diagrams with C, H and F atoms]

$$U_{\mathrm{eff}}(r) = A_1 \exp\left(-B_1 r^2\right) + A_2 \exp\left(-B_2 r^2\right)$$
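Reading the effective potential on this slide as a sum of two Gaussians in r, with amplitudes A1, A2 and exponents B1, B2, it can be evaluated in a few lines. The Python sketch below is only an illustration of that assumed form; the function names and the parameter values in the usage lines are hypothetical and are not taken from the talk.

```python
import math


def u_eff(r, a1, b1, a2, b2):
    """Two-Gaussian effective potential: a1*exp(-b1*r^2) + a2*exp(-b2*r^2)."""
    return a1 * math.exp(-b1 * r ** 2) + a2 * math.exp(-b2 * r ** 2)


def sample_curve(a1, b1, a2, b2, r_max=3.0, steps=6):
    """Evaluate u_eff at evenly spaced radii from 0 to r_max inclusive."""
    radii = [r_max * i / steps for i in range(steps + 1)]
    return [(r, u_eff(r, a1, b1, a2, b2)) for r in radii]


if __name__ == "__main__":
    # Hypothetical parameters, chosen only to show the shape of the curve
    for r, u in sample_curve(a1=0.6, b1=1.0, a2=0.4, b2=4.0):
        print(f"r = {r:.1f} a.u.  U = {u:.4f} a.u.")
```

Each choice of (A1, B1, A2, B2) is an independent evaluation, so a search over the four parameters maps naturally onto a parameter sweep.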

Drug docking pipeline
Baldridge, Amoreira, Univ Zurich; Berstis, Kondrick, UCSD
• Goal is to minimize the free binding energy
• Use quantum calculations for more realism

[Pipeline diagram: Protein Data Bank → PDB2PQR → WHATIF → QMView → APBS → Compute free binding energy. Step labels: Add hydrogen atoms; Remove the water network; Place ligand; Solve Poisson-Boltzmann equation]

ENGINEERING

Flame Kernel Growth in Turbulent Flows
Tom Dunstan, Karl Jenkins, Cranfield University

Laminar Flame solutions

Laminar simulations are carried out for verification of the code and adjustment of the initial reaction parameters.

Fig 1. Non-dimensional flame speed. Time step = 3 x 10⁻⁵ s

Fig 2. Progress variable C (top) and Reaction Rate (bottom) at progressive time steps through centre of domain.

NIMROD optimisation results

Mass Fraction Burned describes global state of reaction (from 0 in unburned state to 1 in fully burned).

Fig 3. Mass fraction burned for varying values of pre-exponential factor B* and heat release parameter.

Turbulent Flame propagation

From parallel computations using 384³ grids (56 million grid points).
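The grid size can be verified directly: a cubic grid with 384 points per side has 384 x 384 x 384 points. The Python sketch below does that arithmetic and, as an added assumption not stated in the talk, estimates the memory for one scalar field stored as 8-byte floats.

```python
def grid_points(points_per_side, dimensions=3):
    """Number of points in a uniform grid with equal sides."""
    return points_per_side ** dimensions


def field_memory_gib(points, bytes_per_value=8):
    """Memory for one scalar field, assuming 8-byte floating-point values."""
    return points * bytes_per_value / 2 ** 30


if __name__ == "__main__":
    points = grid_points(384)
    print(points, field_memory_gib(points))
```

This gives 56,623,104 points, matching the "56 million" on the slide, and about 0.42 GiB per double-precision field under the stated assumption.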

Iso-surface of progress variable, C = 0.5

t = 0.8 ms t = 4.8 ms

t = 9.6 ms t = 14.2 ms

2D slices of progress variable taken at position L/4.

Pre-heat zone (pale blue)

Reaction zone (yellow)

Conclusions
• Significant benefits for science and engineering
• Users want to focus on their research
• Tools need to be powerful and hide the complexity of the underlying platforms
• Real studies improve tools
• Application studies are a significant component of the research

www.csse.monash.edu.au/~davida


eResearch Australasia, Brisbane, 27 June 2007

Data Linking and Integration for Health Applications

Dr Anthony Maeder
Research Director

E-Health Research Centre / CSIRO ICT Centre
Brisbane, Queensland, Australia

http://ict.csiro.au/

Overview

• E-Health and Health Data
• The HDI Software Tool
• Current Projects
• Future Directions


E-Health and Health Data

Scope of e-Health

Contemporary health care has adopted evidence-based medicine delivered by multi-disciplinary, multi-party health care teams in a patient-centred approach

e-Health encompasses the broad application of Information and Communication Technologies, in support of health care needs

The main e-Health domains of activity are:
• Health Information Systems (data and software tools)
• Health Services Delivery (work practices and processes)


Australian Health Systems Scene

• State-based systems, with different structures for management of hospitals and community health
• National and local organisation of private providers
• Australian Government provides financial resource
• Many independent legacy software systems: specialised, non-interoperable, unsupported
• Safety, Quality and Efficiency issues driving reform of work practices and wider sharing of information
• Key problem is lack of universal health identifier: National e-Health Transition Authority aims at this

Need and Benefits of Data Linking

Currently patient data resides across numerous different databases which are unconnected and owned separately
• Different information systems and reporting systems
• Government vs Hospital vs GP vs Allied health systems

Health care improvement opportunities flow from linking this data
• higher levels of patient care due to fuller information
• extension of evidence-based practice
• better planning or decision making for specific cases
• improvements to training and education, safety and quality


Privacy Issues

National Privacy Principles
• Use of health data for treatment purposes
• Secondary use of data for research purposes

State Acts related to Health Data:
• Health Records and Information Privacy Act (2002) - NSW
• Health Records Act (2001) - Victoria
• Many others

Organisation principles
• Other policies applicable site by site
• Access to data governed by ethics compliance

Source: Victorian Privacy Commissioner

Data Linkage and Integration


Health Data Sources

National level
• Medicare: cost codes identify treatments
• PBS: Pharmaceutical Benefits Scheme

State level
• Health department hospital admissions data
• State based disease-specific data collections
• Pathology reports & results
• Radiotherapy reports
• Radiology reports & images
• Registries

Hospital level
• Hospital Information system
• Hospital pharmaceuticals database

Hospital units
• Clinical information systems
• Unit specific data sources

Clinical areas
• Clinician based data sources

External sources
• General Practitioners
• Emergency Services
• Allied health enterprises

Data Utilization

• Data collection management and organisation
• Knowledge discovery (population or cohorts)
• Understanding and comparison of cases
• Pre-processing and reshaping
• Statistical correlation and analysis
• Data aggregation and integration
• Identify events and trends
• Health awareness and promotion


Problems to Overcome for Data Linking

Major practical impediments exist for data linking
• Patient security and privacy restrictions
• Diversity and independence of databases
• Complexity of data formats
• Sophistication of aggregation methods

Existing solutions tend to adopt a heavy approach
• Manual processing to achieve one-off linking
• Data repositories or warehousing
• Trusted third party units offering linking services
• Full scale integration and interoperability of all systems

The HDI software tool


The Health Data Integration Project

• Aims to provide novel methods for linking multiple databases, by allowing data custodians to retain control of data
• Processes queries and reporting operations remotely at the data locations (in situ)
• Allows privacy and security restrictions to be met
• Enables audit trails of data access operations and users to be produced
• Fully software engineered application product has been produced at EHRC after about 20 person-years of effort

The HDI Solution

[Architecture diagram. Labels: User Interface; Planner; Authorisation; Authentication; PEP; HDI Data Integration; Query Executer; Plan; query; Custodian 1 (Data Service, Data; Data Service, Data); Custodian 2 (Data Service, Data); Analysis; Reporting; Linking. Workflow: Query, Link, Analyse, Report.]


HDI: Data Custodian Control

Data custodian retains control and security
• No warehousing of data
• All patient-identifying data is encrypted

Databases are added to a HDI installation by the data custodian

The data custodian specifies who can use the data and how they can use it

Metadata layer linked to industry standards to provide a common language across repositories, increasing usability

HDI: Delineation of Responsibility

HDI Domain concept
Provides demarcation of roles and responsibilities
• Domain Administrator
• Data Custodian
• Project Administrator
• Project member

Supports existing ethics committee approval process


HDI: Building a virtual data collection

Query on linked CRC surgical data and chemotherapy data to get a dataset of information on CRC patients, their current status and their chemotherapy treatments.

[Screenshot labels: CRC Surgical Data; Chemotherapy Record Data; Link Table; Add Search Terms]

HDI: Building a virtual data collection

Query results show CRC surgical data and chemotherapy data for each de-identified patient.

[Screenshot labels: CRC surgical data; Chemotherapy data]
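The slides do not say how HDI builds its link table or encrypts identifiers. Purely as an illustration of the general idea (each custodian replaces the identifying field with a keyed hash, and linking happens only on that pseudonym), a minimal Python sketch might look like the following. Every name in it (`SHARED_KEY`, `pseudonym`, `deidentify`, `link`, the record fields and the sample data) is hypothetical and not taken from HDI.

```python
import hashlib
import hmac

# Hypothetical secret agreed between custodians; not part of HDI.
SHARED_KEY = b"example-key-for-illustration-only"


def pseudonym(identifier: str) -> str:
    """Keyed hash of a patient identifier, so the raw value never leaves the custodian."""
    normalised = identifier.strip().upper()
    return hmac.new(SHARED_KEY, normalised.encode("utf-8"), hashlib.sha256).hexdigest()


def deidentify(records, id_field):
    """Replace the identifying field with its pseudonym (run at each custodian)."""
    out = []
    for rec in records:
        clean = {k: v for k, v in rec.items() if k != id_field}
        clean["link_key"] = pseudonym(rec[id_field])
        out.append(clean)
    return out


def link(left, right):
    """Join two de-identified record sets on the shared pseudonym."""
    index = {}
    for rec in right:
        index.setdefault(rec["link_key"], []).append(rec)
    linked = []
    for rec in left:
        for match in index.get(rec["link_key"], []):
            merged = dict(rec)
            merged.update(match)
            linked.append(merged)
    return linked


# Invented example data, one list per custodian.
surgical = [{"patient_id": "ur1001", "stage": "II"}, {"patient_id": "UR1002", "stage": "III"}]
chemo = [{"patient_id": "UR1001 ", "regimen": "5-FU"}]

linked = link(deidentify(surgical, "patient_id"), deidentify(chemo, "patient_id"))
print(linked)
```

The linked rows carry the clinical fields and the pseudonym only; the original `patient_id` values stay with the custodians.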


HDI: Performing an Analysis

Generate the survival chart, for example, Kaplan-Meier.
Analyse survival outcomes by stage.

[Screenshot labels: Survival chart; Survival analysis statistics]
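For readers unfamiliar with the Kaplan-Meier estimate mentioned above, the following Python sketch shows the calculation on invented follow-up data grouped by stage. It is not HDI code; the function name `kaplan_meier` and the sample numbers are assumptions made for illustration.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each time with at least one event.

    durations: follow-up time per patient
    events:    1 if the event (e.g. death) was observed, 0 if censored
    """
    # Count events and total removals (events + censored) at each distinct time.
    deaths, removed = {}, {}
    for t, e in zip(durations, events):
        removed[t] = removed.get(t, 0) + 1
        if e:
            deaths[t] = deaths.get(t, 0) + 1

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(removed):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving this time point.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= removed[t]
    return curve


# Invented follow-up data (months) and event flags, grouped by stage.
by_stage = {
    "Stage II": ([6, 12, 18, 24, 30], [0, 1, 0, 0, 1]),
    "Stage III": ([3, 6, 9, 12, 15], [1, 1, 0, 1, 1]),
}
curves = {stage: kaplan_meier(d, e) for stage, (d, e) in by_stage.items()}
for stage, curve in curves.items():
    print(stage, curve)
```

Censored patients leave the risk set without lowering the curve, which is why survival drops only at times where an event was observed.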

HDI: Generating a report

Data from multiple databases can be used for reporting.

Eg… linked data from surgical databases, chemotherapy records and cancer registries

Report on patients given adjuvant chemotherapy

[Screenshot labels: Select report; Select variables on which to report; Data for report]


Current Projects

Data Linking Projects

Some Current Clinical Applications of HDI
• Queensland Health: Queensland Oncology On Line
• Royal Melbourne Hospital: Colorectal Cancer
• Sydney South West Area Health Services: Colorectal Cancer
• Automated Cancer Staging: Lung Cancer


[Diagram labels: MATCH; LINK; STANDARDISE; DATA LAYER; SOURCE SYSTEMS; CLINICAL LAYER; QOOL Application Suite; Queensland Oncology Repository (QOR); HL7 ‘Trickle Feed’]

Summary of Data Flow
• Identifies data sources
• Processes the data
• Stores the data
• Makes the data accessible
• Allows users to enter data on-line

Royal Melbourne Hospital: Colorectal Cancer - Overview

Objective
Integration of CRC screening, surgery and family history information:
• For research on sensitivity and specificity of the Faecal Occult Blood Test (FOBT)
• For research on surgical outcomes including factors such as adjuvant therapy and co-morbidities

Cohort
Surveillance program patient data over 25 years (approx 3000 records)
Surgical data of approx 4000 records; Family history approx 4000 records

Databases
3 databases (CRC Screening Program (administrative and outcomes data), Surgical and Clinical CRC data, Family History data)

Data Quality
Significant cleaning work required on databases to regularise recorded data and remove data entry, data format, or historical information change errors


Royal Melbourne Hospital: Colorectal Cancer - Outcomes

Data cleaning effort and data analysis gave access to 25 years of surveillance information

FOBT sensitivity published

Surgical Outcomes
By factors such as smoking history, adjuvant therapy or diabetes status

SENSITIVITY OF FAECAL OCCULT BLOOD TESTING (FOBT) FOR ASYMPTOMATIC COLORECTAL CANCER AND ADVANCED ADENOMAS OVER A 25-YEAR EXPERIENCE IN COLORECTAL CANCER SCREENING

F.A. Macrae1, M.A. Slattery1, G.J. Brown1, M.A. O’Dwyer2, C.G. Murphy2, M.E. Hibbert3, D.J. St.John1.
Colorectal Medicine & Genetics, The Royal Melbourne Hospital1; e-Health Research Centre, CSIRO ICT Centre, Brisbane2; & Bio21 MMIM Project, Melbourne3.

Introduction

The performance characteristics of FOBT for asymptomatic colorectal cancer and advanced adenomas are uncertain. Sensitivity, for example, requires direct comparison with a gold standard such as colonoscopy or, less accurately, close follow-up of a cohort of negative screenees to detect interval cancers.

The Royal Melbourne Hospital Bowel Cancer Surveillance Program began in 1979. It coordinates colonoscopic surveillance (along NHMRC guidelines) and annual FOBT of individuals at moderate and high risk of bowel cancer. Hemoccult-type FOBT was used from 1979 to 1986, immunochemical testing from 1995 to now, and both test types in the interval 1987 to 1994.

A database of FOBT and colonoscopy results has been maintained throughout this time, but the technical resources to examine the data have not been available until now.

Figure 1 The immunochemical FOBT kit in current use

Aim

In collaboration with the EHRC (e-Health Research Centre) and Bio21 MMIM (Molecular Medicine Informatics Model) project we reviewed our experience of 25 years of FOBT testing in asymptomatic moderate and high risk subjects undergoing scheduled colonoscopy, to assess the sensitivity, specificity and predictive value for neoplasia of FOBT.

Methods

• Using MMIM, the Surveillance Program database was interrogated to identify asymptomatic patients who had a scheduled colonoscopy and an FOBT within the preceding 3 months
• Patients who had colonoscopy because of a positive FOBT or symptoms were not included
• De-identified data was analysed at the EHRC using HDI (Health Data Integration) and other tools
• The results of the colonoscopy were characterized as normal, carcinoma, advanced adenoma and other adenoma
• ‘Advanced adenoma’ was defined as:
  – 3 or more polyps
  – size >10mm
  – villous component or severe dysplasia
• Sensitivity, specificity and predictive value for neoplasia of FOBT were calculated according to the efficient-score method described by Newcombe

Results

• 4821 registrants have had 5398 colonoscopies since the program started in 1979
• 893 asymptomatic patients with planned colonoscopies had a preceding FOBT within 3 months (see Table)
• 5 cancers were detected, with 3 (60%) being FOBT positive
• There were 49 advanced adenomas, of which 17 (34.7%) were FOBT positive
• There were 148 episodes with detection of any neoplasia, of which 38 (25.7%) were FOBT positive

Table FOBT results by colonoscopy findings in asymptomatic at-risk people

FOBT result   Normal   Other adenoma   Advanced adenoma   Carcinoma   Total
Negative         645              76                 32           2     755
Positive         100              18                 17           3     138
Total            745              94                 49           5     893

Figure 2 FOBT sensitivity by colonoscopy findings in asymptomatic at-risk people
[Bar chart; y-axis: FOBT positive rate (%); x-axis: colonoscopy findings. Negative (n=745): 13.4; Other Adenomas (n=94): 19.1; Advanced Adenomas (n=49): 34.7; Carcinoma (n=5): 60]

Results (cont)

• Specificity for advanced adenomas or cancers was 85.9% (95% Confidence Interval, 83.4% - 88.2%)
• The predictive value for a positive FOBT for any neoplasia in AS subjects was 27.5% (20.4% to 35.9%)
• The likelihood ratio for detection of an advanced adenoma or cancer if the FOBT was positive was 2.6 (1.8 to 3.9)

Conclusion

These results:
• Show that FOBT identifies a sub-group of at-risk but asymptomatic people enriched with colorectal neoplasia
• Support the need for colonoscopy surveillance in this population
• Illustrate the potential of technologically advanced database interrogation methods such as MMIM

This study also supports introduction of FOBT-based screening for average risk subjects; however, results should be interpreted in the light of our population being at above-average risk
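The point estimates quoted in the poster's results follow directly from the counts in its FOBT table. As a check, the Python sketch below recomputes them. The interval function is the plain Wilson score interval, which is how Newcombe's "efficient-score" method is usually understood; the poster may have used a continuity-corrected variant, so its published limits can differ slightly from what this prints. All variable and function names are illustrative.

```python
import math

# Counts from the poster's table: rows are FOBT result, columns are colonoscopy findings.
table = {
    "negative": {"normal": 645, "other_adenoma": 76, "advanced_adenoma": 32, "carcinoma": 2},
    "positive": {"normal": 100, "other_adenoma": 18, "advanced_adenoma": 17, "carcinoma": 3},
}


def count(result, findings):
    """Sum the table cells for one FOBT result across the given findings."""
    return sum(table[result][f] for f in findings)


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (no continuity correction)."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


advanced = ["advanced_adenoma", "carcinoma"]
not_advanced = ["normal", "other_adenoma"]
neoplasia = ["other_adenoma", "advanced_adenoma", "carcinoma"]

tp = count("positive", advanced)      # 17 + 3 = 20
fn = count("negative", advanced)      # 32 + 2 = 34
fp = count("positive", not_advanced)  # 100 + 18 = 118
tn = count("negative", not_advanced)  # 645 + 76 = 721

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
likelihood_ratio_positive = sensitivity / (1 - specificity)

all_positive = sum(table["positive"].values())  # 138
ppv_any_neoplasia = count("positive", neoplasia) / all_positive

print(f"specificity {specificity:.1%}, PPV {ppv_any_neoplasia:.1%}, "
      f"LR+ {likelihood_ratio_positive:.1f}")
print("Wilson interval for specificity:", wilson_interval(tn, tn + fp))
```

Collapsing the four colonoscopy categories into "advanced adenoma or cancer" versus everything else gives the 2×2 table behind the specificity and likelihood ratio; the predictive value instead groups all three neoplasia categories together.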


CRC Services Research Group (NSW) - Overview

Objective
Demonstrate the ability to gather patient information across databases and across hospitals to provide summary information on quality and safety of patient care, adherence to clinical guidelines and comparison across hospitals
To be able to provide this information to clinicians for access to their own data for case management AND for monitoring as above

Cohort
Approx 1000 ‘dummy’ patient records with CRC and corresponding administrative and treatment information

Databases
Different databases (Registry, Administrative, Surgical, Chemotherapy) for 3 different hospitals

Data Quality
Constructed data included deliberate errors in patient demographics and different formats of clinical information to demonstrate HDI’s ability in inexact matching and transformation of data entries to a selected standard
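The slides do not describe HDI's matching algorithm. To make the two ideas concrete (transforming entries to a selected standard, then matching inexactly), here is a small Python sketch using only the standard library. The date formats, the 0.85 threshold, the field names and the sample records are all assumptions chosen for the example, not details of HDI.

```python
from datetime import datetime
from difflib import SequenceMatcher

# Assumed input formats; a real deployment would list whatever its sources use.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d %b %Y")


def standardise_date(text):
    """Convert a date written in any of the assumed formats to ISO 8601."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognised date: {text!r}")


def name_similarity(a, b):
    """Similarity in [0, 1] between two names, ignoring case and extra spaces."""
    def norm(s):
        return " ".join(s.lower().split())
    return SequenceMatcher(None, norm(a), norm(b)).ratio()


def is_probable_match(rec_a, rec_b, threshold=0.85):
    """Treat two records as the same patient if birth dates agree and names are close."""
    same_dob = standardise_date(rec_a["dob"]) == standardise_date(rec_b["dob"])
    return same_dob and name_similarity(rec_a["name"], rec_b["name"]) >= threshold


# Invented records with a deliberate spelling error and mixed date formats.
registry = {"name": "Catherine O'Neil", "dob": "1948-03-07"}
surgical = {"name": "Catherine  ONeil", "dob": "07/03/1948"}
other = {"name": "Katherine Neal", "dob": "07/03/1949"}

print(is_probable_match(registry, surgical))
print(is_probable_match(registry, other))
```

Requiring an exact match on the standardised birth date while allowing a fuzzy match on the name is one simple design choice; production record linkage typically weighs several fields probabilistically.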


CRC Services Research Group (NSW) - Scenario

Hospital A: Large teaching hospital, included in Cancer Institute NSW Clinical Cancer Registry Pilot
• Hosp A: Admin DB
• Hosp A, Surgeon 1: Procedural Records DB (no chemo data)
• Hosp A, Surgeon 2: Procedural Records DB (no chemo data)
• Hosp A: Cancer Institute NSW Clinical Cancer Registry
• All chemotherapy (if required) performed by Hospital A.

Hospital B: Private hospital, not included in Cancer Institute NSW Clinical Cancer Registry Pilot
• Hosp B: Admin DB
• Hosp B, Surgeon 3: Procedural Records DB (no chemo data)
• Hosp B, Oncologist 5: Chemotherapy Records DB
• All chemotherapy (if required) performed by Hospital B.

Hospital C*: Non-teaching hospital, not included in Cancer Institute NSW Clinical Cancer Registry Pilot
• Hosp C: Admin DB
• Hosp C, Surgeon 4: Procedural Records DB (no chemo data)

*Hospital C: fewer CRC cases in general, and some harder cases referred to Hospital A or B. All chemotherapy (if required) performed by Hospital A or B.

[Database labels in diagram: AlphaHosp; CINCCChemoRec; BudgetHosp CHADMIN; CSurgAccSurgAllSurgery BuddsData]

CRC Services Research Group (NSW) - Outcomes

We have demonstrated that with HDI it is possible to report on indicators that require information to be linked within and across hospitals

[Chart; y-axis: % of Patients]


Cancer Stage Interpretation System (CSIS)

Staging of a cancer requires access to all available data:
• Radiology and histology text reports
• Information extracted from other forms of data, for example, radiological images

Cancer Stage Interpretation System (CSIS)
• Cancer staging is necessary to determine effective care for individual patients, as well as to design and evaluate health programmes at a population level
• Develop improved ways to access and analyse stored medical images and reports to better facilitate the staging of cancer patients

Cancer Stage Interpretation System (CSIS)


Future directions

Clinical to Genomic

Integrating biomarker data sources and patient data can provide information on the efficacy, safety and toxicity of drugs
• Advanced non small-cell lung cancer: Iressa effective in only 10% to 15% of patients. Scientists pinpointed mutations in a gene within some tumor cells that allow Iressa to work
• Advanced breast cancer: Herceptin issued to 35,000 women world-wide 1998-2005. Only appears to work for patients whose breast cancer cells carry extra copies of the HER2 protein.

Micro Array experiments may be useful in finding disease biomarkers
• Linking micro array results with known clinical outcomes will allow new biomarkers to be found


Figure by Dr Terry O’Brien, Univ Melbourne (1 July 2005)

Bio21 Molecular Medicine Informatics Model (2005)

EHRs and Virtual Registry

Pre-population of Electronic Health Record
Use data linking to gather as much information as possible (to required health record specification) about an individual to pre-populate their complete health record
Outcomes:
• Reduced manual workload for initial set up of health record

Virtual registry
Use data linking to gather registry data set information about patients from existing sources (rather than building brand new sources with paper based information submission), and present back a view of the registry for administration and analysis
Outcomes:
• Significantly reduced manual and paper based workload for clinicians and registry officers resulting in more timely registry information, and greater compliance with registry requirements if data capture is at source


Contemporary ICT Advances

Web Services
• Easy(ier) federation of data and services

Ontologies
• Relating concepts using semantic properties
• Discipline based
  – Need domain expertise to define the ontology of data sources and fields

The semantic web
• Using ontologies in the web applications

Using SNOMED CT

Use SNOMED CT for mapping of data to terms
• Structured data sources
• Natural language
  – Reports
  – Discussions

To make this possible
• Subsets for particular domains
• Augmenting the domain specific subsets
• Fast querying


SNOMED-CT Scope

Clinical Terms
• Comprehensive, not specialty or domain
  – Human, veterinary, drugs, social, disease, observations, interventions and wellness

~400,000 concepts (fully specified names)
~1M descriptions (synonyms etc)
~1.4M relationships (900,000 defining)

URU principles
• Useable, Repeatable, Understandable

SNOMED CT Top-level Concepts

138875005 | SNOMED CT Concept
  123037004 | body structure |
  404684003 | clinical finding |
  243796009 | context dependent category |
  308916002 | environments and geographical locations |
  272379006 | event |
  106237007 | linkage concept |
  363787002 | observable entity |
  410607006 | organism |
  373873005 | pharmaceutical / biologic product |
  78621006 | physical force |
  260787004 | physical object |
  71388002 | procedure |
  362981000 | qualifier value |
  419891008 | record artifact |
  48176007 | social context |
  370115009 | special concept |
  123038009 | specimen |
  254291000 | staging and scales |
  105590001 | substance |


Expressions, equivalence and subsumption

85189001|acute appendicitis|

subsumed by
• 74400008|appendicitis|
• 64572001|disease|:
    116676008|associated morphology|=23583003|inflammation|
    363698007|finding site|=66754008|appendix structure|

equivalent to
• 74400008|appendicitis|:
    260908002|course|=53737009|acute|

Querying

Simple queries can use subsumption/equivalence, for example

find all cases of 74400008|appendicitis| with a particular treatment or outcome

finding those also classified as 85189001|acute appendicitis| or even as a 64572001|disease| with 23583003|inflammation| of the 66754008|appendix structure|
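A subsumption query of this kind can be sketched in Python over a toy is-a graph. The three concept identifiers are the ones quoted on the slide; the direct parent links, the function names and the case records are simplifications invented for the example (a real SNOMED CT release has many intermediate concepts and multiple parents per concept, and would normally be queried through a terminology server).

```python
# Toy is-a hierarchy using the concept identifiers quoted on the slide.
# Intermediate concepts between appendicitis and disease are omitted.
IS_A = {
    "85189001": ["74400008"],  # acute appendicitis -> appendicitis
    "74400008": ["64572001"],  # appendicitis -> disease
    "64572001": [],            # disease (top of this toy graph)
}


def ancestors(concept):
    """All concepts that subsume `concept`, found by walking is-a links upward."""
    seen = set()
    stack = list(IS_A.get(concept, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(IS_A.get(parent, []))
    return seen


def is_subsumed_by(concept, candidate):
    """True if `concept` is `candidate` itself or one of its descendants."""
    return concept == candidate or candidate in ancestors(concept)


# Invented case records coded at different levels of detail.
cases = [
    {"case": 1, "code": "85189001", "outcome": "appendicectomy"},
    {"case": 2, "code": "74400008", "outcome": "antibiotics"},
    {"case": 3, "code": "64572001", "outcome": "observation"},
]

# "Find all cases of appendicitis": matches records coded as appendicitis
# or as anything more specific, such as acute appendicitis.
appendicitis_cases = [c["case"] for c in cases if is_subsumed_by(c["code"], "74400008")]
print(appendicitis_cases)
```

Case 3 is excluded because a record coded only as "disease" is more general than appendicitis; recognising the post-coordinated form (disease with inflammation of the appendix structure) as equivalent needs a description-logic classifier, which this sketch does not attempt.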


http://e-hrc.net


HDI Platform Technology

[Diagram labels: HDI HUB; HDI Data Source (three instances); Custodial controlled data; HDI integrates data; De-identified virtual linked data set; De-identified linked data for analysis; Statistical Packages e.g. R, SPSS; Reporting Tools e.g. Crystal Reports; Custom Applications]

[Screenshot labels: Colon Cancer Project; PatientProcedureQuery; ProcedureByEventQuery; DeceasedPersonsQuery; Custodial Data; BHospPersonDetails; BHospEventProcedure; BHospProcedure; Procedure; EventPersonDetails]