![Page 1: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/1.jpg)
Auspice: AUtomatic Service Planning in Cloud/Grid Environments
David Chiu
Dissertation Defense
May 25, 2010
Committee: Prof. Gagan Agrawal (Advisor), Prof. Hakan Ferhatosmanoglu, Prof. Christopher Stewart
![Page 2: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/2.jpg)
2
Explosion of Scientific Data Sources
• The amount of scientific data has increased dramatically over the years
• In just one example:
‣ Large Hadron Collider (LHC)
‣ 15 petabytes annually
‣ 60 petabytes overall
• Management and processing have become challenging
![Page 3: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/3.jpg)
3
Data Sources
A Live Cyber Infrastructure
![Page 4: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/4.jpg)
4
Computing & Storage Resources
A Live Cyber Infrastructure
![Page 5: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/5.jpg)
5
Shared/Proprietary Web Services
= Web Service
A Live Cyber Infrastructure
![Page 6: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/6.jpg)
6
A Live Cyber Infrastructure
![Page 7: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/7.jpg)
7
Service Interaction with Cyber Infrastructure
[Figure: arrows labeled "invoke" and "results" between services and the cyber infrastructure]
![Page 8: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/8.jpg)
8
Current GUI for Creating Workflows
![Page 9: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/9.jpg)
9
Scientific Workflow Challenges
• Difficulties for the scientist:
‣ How to identify which data sets to use, and from where to get them?
‣ Which services are available to me to use?
‣ What resources to utilize?
‣ How can I accelerate workflow execution?
‣ Do I really have to do all this myself?
![Page 10: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/10.jpg)
10
Contributions
• Workflow system with the following support:
• High-level scientific user querying
‣ D. Chiu and G. Agrawal. A Keyword Querying Interface for Invoking Scientific Workflows. (OSU-TR, submitting to ACM-GIS'10)
‣ D. Chiu and G. Agrawal. Enabling Ad Hoc Queries over Low-Level Scientific Data Sets. (SSDBM'09)
• Automatic workflow planning
‣ D. Chiu and G. Agrawal. Enabling Ad Hoc Queries over Low-Level Scientific Data Sets. (SSDBM'09)
‣ D. Chiu and G. Agrawal. Ad Hoc Scientific Workflows through Data-driven Service Composition. (eScience'07)
![Page 11: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/11.jpg)
11
Contributions (continued)
• Quality of Service
‣ D. Chiu, S. Deshpande, G. Agrawal, and R. Li. A Dynamic Approach toward QoS-Aware Service Workflow Composition. (ICWS'09)
‣ D. Chiu, S. Deshpande, G. Agrawal, and R. Li. Cost and Accuracy Sensitive Dynamic Workflow Composition over Grid Environments. (GRID'08)
‣ D. Chiu, S. Deshpande, G. Agrawal, and R. Li. Composing Geoinformatics Workflows with User Preferences. (GIS'08)
• Accelerating Workflow Execution
‣ D. Chiu and G. Agrawal. Evaluating Caching and Storage Options on the Amazon Web Service Cloud. (OSU-TR, submitted to GRID'10)
‣ D. Chiu, A. Shetty, and G. Agrawal. Elastic Cloud Caches for Derived Data Reuse. (OSU-TR, submitted to SC'10)
‣ D. Chiu and G. Agrawal. Hierarchical Caches for Grid Workflows. (CCGrid'09)
![Page 12: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/12.jpg)
12
Presentation Outline
• Motivation & Introduction
• Our Service Composition System: Auspice
‣ Metadata Framework
‣ Cost-Aware Service Planning
‣ Supporting Keyword Queries
‣ Elastic Cache Deployment
• Conclusion
Auspice
![Page 13: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/13.jpg)
13
Auspice System
![Page 14: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/14.jpg)
14
Auspice System
D. Chiu & G. Agrawal, eScience ’07
D. Chiu & G. Agrawal, SSDBM ’09
![Page 15: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/15.jpg)
15
Systematic Way to Plan Workflows?
• Goal-Driven, Recursive Concept Derivation
• Example User Goal: Coastline Extraction
‣ We are targeting some coastline concept in the geospatial domain
‣ What known data or services can derive a coastline?
[Figure: the target concept node, Coast Line]
![Page 16: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/16.jpg)
16
Systematic Way to Plan Workflows?
[Figure: derivation tree rooted at the Coast Line concept. Node labels: Coast Extract 1 ... Coast Extract K (available services); Coast Data 1 ... Coast Data N (available data); Water Level and CTM (service parameters, each with its own available services and data types).]
• What are its parameters?
• What known data or services can derive water level?
• What known data or services can derive a CTM?
![Page 17: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/17.jpg)
17
Systematic Way to Plan Workflows?
[Figure: the derivation tree expanded recursively. Node labels: Coast Line; Coast Extract 1 ... Coast Extract K; Coast Data 1 ... Coast Data N; Water Level; CTM.]
![Page 18: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/18.jpg)
18
Systematic Way to Plan Workflows?
[Figure: the expanded derivation tree for Coast Line, with subtrees marked Workflow 1, Workflow 2, Workflow 3, ...]
![Page 19: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/19.jpg)
19
Ontology for Applying Domain Information
• Domain concepts can be derived from executing a service
• Domain concepts can be derived from retrieving an existing data set
• Service parameters can be represented by certain domain concepts
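The three ontology relationships are enough to drive the goal-driven, recursive planning shown on the earlier slides. The following is a minimal sketch, not the Auspice implementation: the dictionaries, the service and data set names, and the `plan` function are all hypothetical, and the ontology is assumed to be acyclic.

```python
from itertools import product

# Hypothetical ontology fragment (names are illustrative, not from Auspice).
# Concept -> services whose execution derives it.
DERIVED_FROM_SERVICE = {
    "coastline": ["extract_coast"],
    "water_level": ["get_water_level"],
}
# Concept -> existing data sets that already hold it.
DERIVED_FROM_DATA = {
    "coastline": ["coast_data_1"],
    "ctm": ["ctm_file"],
}
# Service -> domain concepts that represent its parameters.
SERVICE_INPUTS = {
    "extract_coast": ["ctm", "water_level"],
    "get_water_level": [],
}


def plan(concept):
    """Return every workflow plan (as nested tuples) that derives `concept`.

    Assumes the ontology has no cycles; a real planner would need a guard.
    """
    plans = [("data", d) for d in DERIVED_FROM_DATA.get(concept, [])]
    for service in DERIVED_FROM_SERVICE.get(concept, []):
        # Plan each parameter recursively, then combine the alternatives.
        per_param = [plan(c) for c in SERVICE_INPUTS[service]]
        for combo in product(*per_param):
            plans.append(("service", service, combo))
    return plans


for p in plan("coastline"):
    print(p)
```

Each returned tuple corresponds to one candidate workflow: either a direct data retrieval or a service call whose inputs are themselves plans.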
![Page 20: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/20.jpg)
21
Auspice Metadata Registration
• Given a data set or service,
‣ Ontology is applied to new resources
‣ Resources are indexed and immediately usable in workflow planner
‣ Non-intrusive
![Page 21: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/21.jpg)
22
Registering Data Sets
![Page 22: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/22.jpg)
23
Registering Services
![Page 23: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/23.jpg)
24
Subset of Ontology, with Shoreline Target
![Page 24: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/24.jpg)
25
Service Planning: An Example
A Derived Execution Plan for shoreline concept
![Page 25: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/25.jpg)
26
What Users Want
Do what you can to provide me results in under 20 minutes.
I want the fastest results with at least 75% accuracy
- Exec time prediction
- Online data reduction
- Domain-specific error modeling
![Page 26: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/26.jpg)
27
Presentation Outline
• Motivation & Introduction
• Our Service Composition System: Auspice
‣ Metadata Framework
‣ Cost-Aware Service Planning
‣ Supporting Keyword Queries
‣ Elastic Cache Deployment
• Conclusion
Auspice
![Page 27: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/27.jpg)
28
Auspice System
![Page 28: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/28.jpg)
29
Auspice System
D. Chiu, S. Deshpande, G. Agrawal, & R. Li, GRID ’08
D. Chiu, S. Deshpande, G. Agrawal, & R. Li, ACM-GIS ’08
D. Chiu, S. Deshpande, G. Agrawal, & R. Li, ICWS ’09
![Page 29: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/29.jpg)
30
Challenges
• We wish to project workflow execution time and workflow accuracy costs at planning time
• Allow input models per service
• We should prune all workflows unlikely to meet the user’s demands
![Page 30: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/30.jpg)
31
Estimating Workflow Execution Time
• Service execution time (t_x)
‣ Each service's model is trained beforehand with inputs of various sizes
• Data output size (d_size)
‣ Known for files, but models are again trained for service output
• Network transmission time (t_net)
‣ Bandwidth between nodes is typically known
• Recall the workflow structure:
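The slide's own formula did not survive in this transcript, so the following is only a sketch of how such estimates could be combined over a workflow tree. It assumes that a node's inputs are produced in parallel, so only the slowest input subtree counts; the `Node` class and `total_time` function are hypothetical. The service names and t_x values are borrowed from the water level example later in the talk.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    t_x: float = 0.0      # predicted service execution time
    t_net: float = 0.0    # predicted time to transmit this node's output
    children: list = field(default_factory=list)


def total_time(node):
    """Recursive estimate: the node's own costs plus its slowest input.

    Assumption: inputs are derived in parallel, hence max() rather than sum().
    """
    slowest_input = max((total_time(c) for c in node.children), default=0.0)
    return node.t_x + node.t_net + slowest_input


stations = Node("GetGSListGreatLakes", t_x=2.0)
nearest = Node("getKNearestStations", t_x=0.5, children=[stations])
get_wl = Node("getWL", t_x=1.0, children=[nearest])

print(total_time(get_wl))  # 3.5 when every t_net is zero
```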
![Page 31: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/31.jpg)
32
Estimating Workflow Error/Accuracy
• The recursive sum is similar for error propagation
• The errors attributed to services and data are implemented by domain scientists
• is an accuracy parameter, e.g., sampling rate, resolution, ..
![Page 32: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/32.jpg)
33
Cost Models Declared per Operation
![Page 33: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/33.jpg)
34
Water Level Workflow Example

Workflow Plan 1
[t_total=3.5001 t_x=1 t_d=0 o=47889 e=0.004]
SRVC.getWL(
    X=482593 Y=4628522
    StnID=
        [t_total=2.5 t_x=0.5 t_d=0 o=0 e=0.004]
        SRVC.getKNearestStations(
            Longitude=482593 Latitude=4628522
            ListOfStations=
                [t_total=2 t_x=2 t_d=0 o=47889 e=0]
                SRVC.GetGSListGreatLakes()
            RadiusKM=100 K=3 )
    time=00:06 date=01/30/2008)
Total Projected Costs: Workflow Execution Time = 3.251, Workflow Error = 0.004

Workflow Plan 2
[t_total=2 t_x=2 t_d=0 o=47889 e=2.4997]
SRVC.getWLfromModel( X=482593 Y=4628522 time=00:06 date=01/30/2008)
Total Projected Costs: Workflow Execution Time = 1.674, Workflow Error = 2.4997
![Page 34: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/34.jpg)
38
Cost Model Overheads
![Page 35: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/35.jpg)
39
Experimented Workflow
• Shoreline Extraction
• Users can specify the following QoS parameters:
‣ Allowed execution time
‣ Allowed error
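Given projected costs for each candidate plan, the two QoS parameters can act as filters. A minimal sketch, assuming each plan is summarized as a (name, projected time, projected error) tuple; the `prune` function is hypothetical, and the sample costs are the projected totals from the water level example.

```python
def prune(plans, max_time=None, max_error=None):
    """Keep only plans whose projected costs satisfy the user's limits."""
    kept = []
    for name, time, error in plans:
        if max_time is not None and time > max_time:
            continue
        if max_error is not None and error > max_error:
            continue
        kept.append((name, time, error))
    # Among the surviving plans, list the fastest first.
    return sorted(kept, key=lambda p: p[1])


projected = [("plan_1", 3.251, 0.004), ("plan_2", 1.674, 2.4997)]

print(prune(projected, max_error=1.0))  # only plan_1 is accurate enough
print(prune(projected, max_time=2.0))   # only plan_2 is fast enough
```

Pruning at planning time is simulated here on finished plans; a planner could equally apply the same checks to partial plans as it builds them.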
![Page 36: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/36.jpg)
40
On Meeting Time Constraints
![Page 37: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/37.jpg)
41
On Meeting Error Constraints
![Page 38: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/38.jpg)
42
Presentation Outline
• Motivation & Introduction
• Our Service Composition System: Auspice
‣ Metadata Framework
‣ Cost-Aware Service Planning
‣ Supporting Keyword Queries
‣ Elastic Cache Deployment
• Conclusion
Auspice
![Page 39: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/39.jpg)
43
Current GUI for Creating Workflows
![Page 40: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/40.jpg)
44
Auspice System
![Page 41: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/41.jpg)
45
Auspice System
D. Chiu & G. Agrawal, SSDBM’09
D. Chiu & G. Agrawal, (submitting to GIS’10)
![Page 42: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/42.jpg)
46
Supporting Keyword Querying
• Planning workflows is hard, while keyword search has become an extremely popular interface for information retrieval
‣ No need to know the underlying structure of the data
‣ No need to understand structured query languages like SQL
• Goal: Given a set of key terms in the scientific domain, return a ranked list of workflow plans to the user for execution
![Page 43: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/43.jpg)
47
Keyword Decomposition
[Figure: a keyword query with the terms coast, line, CTM, 7/8/2003, and (41.30, -82.4)]
• Filter: stopping / stemming / pattern-match
• Map
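The filter step can be pictured as follows. This is a rough sketch under stated assumptions: the stop-word list, the plural-stripping `stem` helper, and the two regular expressions are stand-ins chosen for illustration, not the filters Auspice uses.

```python
import re

STOP_WORDS = {"the", "of", "at", "on", "for", "a", "an", "in"}  # assumed list
DATE = re.compile(r"^\d{1,2}/\d{1,2}/\d{4}$")
NUMBER = re.compile(r"^-?\d+(\.\d+)?$")


def stem(word):
    """Very rough stemmer: strip one trailing plural 's' (illustrative only)."""
    return word[:-1] if len(word) > 3 and word.endswith("s") else word


def decompose(query):
    """Split a query into (kind, token) pairs: date, number, or term."""
    tokens = re.findall(r"[^\s(),]+", query)
    result = []
    for tok in tokens:
        if DATE.match(tok):
            result.append(("date", tok))        # pattern match: dates
        elif NUMBER.match(tok):
            result.append(("number", tok))      # pattern match: coordinates
        elif tok.lower() not in STOP_WORDS:     # stopping
            result.append(("term", stem(tok.lower())))  # stemming
    return result


for pair in decompose("coast line CTM 7/8/2003 (41.30, -82.4)"):
    print(pair)
```

The resulting terms would then be mapped to ontology concepts, and the dates and numbers treated as candidate query parameters.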
![Page 44: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/44.jpg)
48
Keyword Maximization
[Figure: keywords mapped to ontology concepts (C) and data values (D)]
• Unsubstantiated Concepts: coast, line, CTM
‣ Any combination of these is potentially what the query is targeting!
• Data-Substantiated Concepts: date, latitude, and longitude, backed by the values 7/8/2003, 41.30, and -82.4
‣ Potential query parameters
![Page 45: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/45.jpg)
49
Keyword Querying
[Figure: the concepts coast, CTM, and line, with the labels Merged SuperConcept, Query Target Candidate, and Requisite Concepts]
• Query Parameters: date, latitude, and longitude, substantiated by 7/8/2003, 41.30, and -82.4
![Page 46: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/46.jpg)
50
Keyword Querying
[Figure: the concepts coast, CTM, and line, regrouped under the labels Merged SuperConcept, Query Target Candidate, and Requisite Concepts]
• Query Parameters: date, latitude, and longitude, substantiated by 7/8/2003, 41.30, and -82.4
![Page 47: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/47.jpg)
51
Keyword Querying
[Figure: the concepts coast, CTM, and line, regrouped under the labels Merged SuperConcept, Query Target Candidate, and Requisite Concepts]
• Query Parameters: date, latitude, and longitude, substantiated by 7/8/2003, 41.30, and -82.4
• Enumerate Workflows
![Page 48: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/48.jpg)
52
Ranking Workflow Plans by Relevance
• Method:
‣ Let be the set of input keyword-concepts
‣ Rank workflow plans on
![Page 49: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/49.jpg)
53
A Case Study
• The following keyword queries were submitted to Auspice
![Page 50: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/50.jpg)
54
Search Time
![Page 51: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/51.jpg)
55
Precision
![Page 52: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/52.jpg)
57
Presentation Outline
• Motivation & Introduction
• Our Service Composition System: Auspice
‣ Metadata Framework
‣ Cost-Aware Service Planning
‣ Supporting Keyword Queries
‣ Elastic Cache Deployment
• Conclusion
Auspice
![Page 53: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/53.jpg)
58
Problem: Query Intensive Circumstances
![Page 54: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/54.jpg)
59
Caching Intermediate Results
• Shoreline Extraction
Time consuming!
Can't we cache the result from when it was last computed?
![Page 55: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/55.jpg)
61
Auspice System
![Page 56: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/56.jpg)
62
Auspice System
D. Chiu & G. Agrawal, CCGrid’09
D. Chiu, A. Shetty, & G. Agrawal, (submitted to SC’10)
D. Chiu & G. Agrawal, (submitted to GRID’10)
![Page 57: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/57.jpg)
63
Cloud Computing
• Pay-as-you-go computing
• Elasticity
‣ Cloud applications can stretch and relax their resource requirements
• "Infinite" compute and storage resources
![Page 58: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/58.jpg)
64
A Workflow Cache
[Figure: a Compute Cloud with nodes A and B]
![Page 59: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/59.jpg)
65
Consistent Hashing
[Figure: hash ring with node A at position 75 and node B at position 25; the label 8 also appears on the ring]
![Page 60: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/60.jpg)
66
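Consistent Hashing
• invoke: service(35)
• Which proxy has the page? h(k) = (k mod 100)
• h(35) = (35 mod 100) = 35
[Figure: hash ring with node A at position 75 and node B at position 25; the label 8 also appears on the ring]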
![Page 61: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/61.jpg)
67
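Our Algorithm for Scaling Up
GBA: Greedy Bucket Allocation
• Only records hashing into (25,50] need to be moved from A to C!
[Figure: hash ring with node A at position 75, node B at position 25, and new node C at position 50; the label 8 also appears on the ring]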
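The ring in the figure can be reproduced with a few lines of code. This sketch shows plain consistent hashing with h(k) = k mod 100 and the node positions from the slides; it does not implement GBA's own policy for choosing where a new node goes, and the `HashRing` class is a hypothetical name.

```python
import bisect


class HashRing:
    """Consistent hashing over buckets 0..99 with h(k) = k mod 100."""

    def __init__(self, size=100):
        self.size = size
        self.positions = []   # sorted ring positions of the cache nodes
        self.nodes = {}       # position -> node name

    def add_node(self, name, position):
        bisect.insort(self.positions, position)
        self.nodes[position] = name

    def lookup(self, key):
        """Return the node at the first position >= h(key), wrapping around."""
        h = key % self.size
        i = bisect.bisect_left(self.positions, h)
        if i == len(self.positions):
            i = 0
        return self.nodes[self.positions[i]]


ring = HashRing()
ring.add_node("A", 75)
ring.add_node("B", 25)
before = {k: ring.lookup(k) for k in range(100)}

ring.add_node("C", 50)  # scale up with one new node
after = {k: ring.lookup(k) for k in range(100)}

moved = sorted(k for k in range(100) if before[k] != after[k])
print(before[35], after[35])  # A C
print(moved[0], moved[-1])    # 26 50
```

Only the keys in (25,50] change owner, and all of them move from A to C, which is the property the slide highlights.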
![Page 62: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/62.jpg)
68
Experimental Configuration
• Workload
‣ Shoreline Extraction Workflow
‣ Takes 23 seconds to complete without benefits of cache
‣ Executed on a miss
• Amazon EC2 Cloud
‣ Each Cloud node:
- Small Instances (single core 1.2 GHz, 1.7 GB, 32-bit)
- Ubuntu Linux
‣ Caches start out cold
‣ Cache stored in memory only
![Page 63: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/63.jpg)
69
Experimental Configuration
• Our approach exploits a dynamic Cloud environment:
‣ Consistent Hashing: Greedy Bucket Allocation (GBA)
‣ Elastic number of nodes
• We compare GBA against statically allocated Cloud environments:
‣ 2 fixed nodes (static-2)
‣ 4 fixed nodes (static-4)
‣ 8 fixed nodes (static-8)
‣ Cache overflow --> LRU eviction
![Page 64: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/64.jpg)
70
Relative Speedup
Querying Rate: 255 invocations/sec
Cost Savings
![Page 65: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/65.jpg)
72
That’s Not Completely Elastic
• What about relaxing the number of nodes to help save Cloud costs?
• First, we need an eviction scheme
![Page 66: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/66.jpg)
73
Exponential Decay Eviction
• At eviction time:
‣ A value is calculated for each data record in the evicted slice
‣ The value is higher:
- if the record was accessed more recently
- if the record was accessed frequently
‣ If the value is lower than some fixed threshold, evict
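The scoring formula itself is missing from this transcript. One way to get a value that rises with both recency and frequency is to weight per-slice access counts by an exponentially decaying factor. The sketch below assumes exactly that; the `decay_score` and `select_evictions` functions, the decay factor `alpha`, and the sample records are all hypothetical.

```python
def decay_score(access_counts, alpha=0.9):
    """Score a record from its per-slice hit counts.

    access_counts[i] is the number of hits in the slice that is i slices old
    (index 0 is the newest slice). Older hits are discounted by alpha ** i.
    """
    return sum(count * alpha ** age for age, count in enumerate(access_counts))


def select_evictions(records, threshold, alpha=0.9):
    """Return the keys whose decayed score falls below the fixed threshold."""
    return [key for key, counts in records.items()
            if decay_score(counts, alpha) < threshold]


records = {
    "recent": [3, 0, 0, 0],    # three hits in the newest slice
    "stale": [0, 0, 0, 3],     # three hits, but three slices ago
    "popular": [2, 2, 2, 2],   # hit steadily in every slice
}

print(select_evictions(records, threshold=1.0, alpha=0.5))  # ['stale']
```

With alpha = 0.5 the three records score 3.0, 0.375, and 3.75, so only the stale one drops below the threshold of 1.0.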
![Page 67: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/67.jpg)
74
Experimental Configuration
• Amazon EC2 Cloud
‣ Each Cloud node:
- Small Instances (single core 1.2 GHz, 1.7 GB, 32-bit)
- Ubuntu Linux
‣ Caches start out cold
‣ Data stored in memory
‣ When 2 nodes become < 30% capacity, merge
• Sliding Window Configuration:
‣ Time Slice: 1 sec
‣ Size: 100 Time Slices
![Page 68: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/68.jpg)
75
Data Eviction: 50/255/50 queries per sec
Sliding Window Size = 100 sec
50 q/sec 255 q/sec 50 q/sec
![Page 69: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/69.jpg)
76
Cache Contraction: 50/255/50 queries per sec
![Page 70: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/70.jpg)
77
Cache Contraction: 50/255/50 queries per sec
![Page 71: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/71.jpg)
80
Presentation Outline
• Motivation & Introduction
• Our Service Composition System: Auspice
‣ Metadata Framework
‣ Cost-Aware Service Planning
‣ Supporting Keyword Queries
‣ Elastic Cache Deployment
• Conclusion
Auspice
![Page 72: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/72.jpg)
81
Future Work
• Dynamic sliding window size
• Evaluate and model various Cloud infrastructure options to optimize cost for sustaining the cache
• Transparent remote data analysis over Clouds
• Deep Web Integration into querying framework
![Page 73: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/73.jpg)
82
Summary and Conclusion
• Auspice is a workflow system, which
‣ Supports high-level keyword/NLP user queries
‣ Automatically composes workflows, and adapts to QoS constraints
‣ Caches workflow results to accelerate workflow execution
• Questions?
Auspice
![Page 74: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/74.jpg)
83
Capturing Concept Derivability
Domain concepts can be derived from executing a service
Domain concepts can be derived from retrieving existing data
Service parameters represent different domain concepts
![Page 75: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/75.jpg)
84
Indexing Data Sets
![Page 76: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/76.jpg)
85
Applying Domain Information
Domain concepts can be derived from executing a service
Domain concepts can be derived from retrieving existing data
Service parameters represent different domain concepts
![Page 77: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/77.jpg)
86
A Case for Semantics
• Service Identification:
‣ Assume the following service retrieves a satellite image pertaining to (x,y) at a resolution determined by r
get_image(double x, double y, double r)
• Questions to ask the system:
‣ How to deduce that this service can be used?
‣ How to determine what information is needed for input?
‣ Did the user provide enough information to invoke this service?
[Diagram: the concepts latitude, longitude, and grid_size each link to get_image via inputsTo; get_image links to the concept satellite image via outputsTo]
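The three questions on the slide above can be illustrated with a small sketch. This is not Auspice's implementation; it only assumes that each registered service records the concepts linked to it by inputsTo and the concept it reaches by outputsTo. The `Service` class, the `REGISTRY` list, and the helper names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Service:
    name: str
    inputs: tuple  # concepts linked to the service via inputsTo
    output: str    # concept the service reaches via outputsTo

# Hypothetical registry entry mirroring the slide's diagram.
# The order of inputs does not claim which of x, y, r each concept binds to.
GET_IMAGE = Service(
    "get_image", ("latitude", "longitude", "grid_size"), "satellite image"
)
REGISTRY = [GET_IMAGE]

def services_for(concept, registry=REGISTRY):
    """Question 1: which services can derive the requested concept?"""
    return [s for s in registry if s.output == concept]

def missing_inputs(service, provided):
    """Questions 2 and 3: which required concepts did the user not supply?"""
    return [c for c in service.inputs if c not in provided]

candidates = services_for("satellite image")
user_values = {"latitude": 41.5, "longitude": -81.7}  # made-up user input
gaps = missing_inputs(candidates[0], user_values)
print(candidates[0].name, "still needs:", gaps)
```

An empty result from `missing_inputs` would mean the user supplied enough information to invoke the service directly; a non-empty result names the concepts the system would still have to obtain some other way.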
![Page 78: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/78.jpg)
87
Indexing Services
• Services (inputs, outputs) are also registered in much the same way
![Page 79: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/79.jpg)
88
Systematic Service Planning
Ontology, O
Compose workflows in this form:
data derivation
service derivation
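One way to read the data derivation and service derivation forms is as a recursive enumeration over the ontology: a concept is derived either by retrieving a data set or by running a service whose input concepts are themselves derived. The sketch below assumes that reading. The `DATA` and `SERVICES` tables and all concept, data set, and service names in them are hypothetical, and the algorithm is an illustration rather than the planner presented in the talk.

```python
# Hypothetical ontology O, split into two lookup tables.
DATA = {"elevation": ["dem_file_1"]}  # concept -> data sets that derive it
SERVICES = {                          # concept -> [(service, input concepts)]
    "water level": [("get_water_level", ["elevation", "gauge reading"])],
    "gauge reading": [("read_gauge", [])],
}

def plan(concept, visiting=frozenset()):
    """Enumerate workflows that derive `concept`.

    A workflow is ('data', data_set) or ('service', name, [sub-workflows]).
    """
    if concept in visiting:  # guard against cyclic derivations
        return []
    visiting = visiting | {concept}

    # Data derivation: retrieve an existing data set.
    workflows = [("data", d) for d in DATA.get(concept, [])]

    # Service derivation: every input concept must itself be derivable.
    for name, inputs in SERVICES.get(concept, []):
        combos = [[]]  # one entry per way of satisfying the inputs so far
        for inp in inputs:
            subs = plan(inp, visiting)
            combos = [c + [s] for c in combos for s in subs]
        workflows += [("service", name, c) for c in combos]
    return workflows

for workflow in plan("water level"):
    print(workflow)
```

If any input concept has no derivation, `combos` becomes empty and the service contributes no workflow, so only fully executable plans are returned.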
![Page 80: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/80.jpg)
89
Presentation Outline
• Motivation & Introduction
• Our Service Composition System: Auspice
‣ Metadata Framework
‣ Cost-Aware Service Planning
‣ Supporting Keyword Queries
‣ Caching Intermediate Results
‣ Elastic Cache Deployment
• Conclusion
Auspice
![Page 81: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/81.jpg)
90
Caching Intermediate Results
![Page 82: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/82.jpg)
91
A Hierarchical Cache
![Page 83: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/83.jpg)
92
Cache Access Types
[Figure labels: Misses, Fast, Slow, Hits (Slow)]
Wouldn’t it be faster to centralize the index on the broker node?
Do we really need the broker index? Isn’t hashing faster?
![Page 84: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/84.jpg)
93
Experimental Workflows
• Against Heterogeneous Bandwidths
![Page 85: Auspice: AUtomatic Service Planning in Cloud/Grid Environments David Chiu Dissertation Defense May 25, 2010 Committee: Prof. Gagan Agrawal, Advisor Prof](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649efe5503460f94c13b79/html5/thumbnails/85.jpg)
94
Centralized on Broker vs. Hierarchical
[Figure labels: Out-of-core!, In-core]