
  • Shantenu Jha

    http://cct.lsu.edu/~sjha & http://saga.cct.lsu.edu/

    Developing Scientific Applications for Distributed Infrastructures

  • Understanding Distributed Applications. IDEAS: First-Principles Development Objectives

    Interoperability: Ability to work across multiple distributed resources

    Distributed Scale-Out: The ability to utilize multiple distributed resources concurrently

    Extensibility: Support new patterns/abstractions, different programming systems, functionality & Infrastructure

    Adaptivity: Respond to fluctuations in dynamic resources and in the availability of dynamic data

    Simplicity: Accommodate above distributed concerns at different levels easily

    Challenge: How to develop distributed applications (DAs) effectively and efficiently with the above as first-class objectives?

  • SAGA: In a nutshell

    There is a lack of programmatic approaches that provide general-purpose, basic and common grid functionality for applications, and thus hide the underlying complexity and varying semantics.

    These are the building blocks upon which to construct consistent higher levels of functionality and abstraction.

    Meets the needs of a broad spectrum of applications: simple scripts, gateways, smart applications, production-grade tooling, and workflows.

    Simple, integrated, stable, uniform and high-level interface. Simple and Stable: 80:20 restricted scope, and standard. Integrated: similar semantics & style across the interface. Uniform: same interface for different distributed systems.

    SAGA provides application* developers with the units required to compose high-level functionality across (distinct) distributed systems. (*) One person's application is another person's tool.

  • SAGA and Distributed Applications

  • Taxonomy of Distributed Application Development

    Example of Distributed Execution Mode: Implicitly Distributed

    HTC of HPC: 1,000 job submissions of NAMD on the TG/LONI; SAGA shell example (cf. DESHL)

    Example of Explicit Coordination and Distribution Explicitly Distributed

    DAG-based Workflows (example of Higher-level API)

    Examples of SAGA-based frameworks: Pilot-Jobs, fault-tolerant autonomic frameworks, MapReduce, All-Pairs

    Note: An application can be developed differently, and thus be in more than one category (e.g. DAG-based workflow)

  • Generalized Ensemble Methods: Replica and Replica-Exchange

    Sampling is the challenge: Long run vs multiple short runs?

    Task-level parallelism: embarrassingly distributable!

    Create replicas of the initial configuration.

    Spawn N replicas over different machines.

    RE: Run for time t; attempt a configuration swap; run for a further time t; repeat until finished.

    [Figure: replicas R1 ... RN run at a ladder of temperatures from 300K up to "hot"; exchange attempts are made every time interval t]
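The exchange step above can be sketched as follows. This is an illustrative Metropolis-style acceptance rule and driver loop (with k_B = 1, and with `run_md` and the replica structure as stand-ins), not the production NAMD/SAGA implementation:

```python
import math
import random

def attempt_swap(E_i, E_j, T_i, T_j):
    """Metropolis criterion for exchanging configurations between
    replicas at temperatures T_i, T_j with energies E_i, E_j (k_B = 1)."""
    delta = (1.0 / T_i - 1.0 / T_j) * (E_j - E_i)
    return delta <= 0 or random.random() < math.exp(-delta)

def replica_exchange(replicas, run_md, n_rounds):
    """replicas: list of dicts with 'T' (temperature) and 'E' (energy).
    run_md: advances one replica for time t and returns its new energy."""
    for _ in range(n_rounds):
        for r in replicas:                       # run each replica for time t
            r["E"] = run_md(r)
        for i in range(len(replicas) - 1):       # attempt neighbour swaps
            a, b = replicas[i], replicas[i + 1]
            if attempt_swap(a["E"], b["E"], a["T"], b["T"]):
                a["T"], b["T"] = b["T"], a["T"]  # exchange temperatures
    return replicas
```

In a distributed scale-out setting, the inner `run_md` calls are exactly the tasks spawned concurrently over different machines.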

  • Abstractions for Dynamic Execution (1) Container Task

    Adaptive: Type A: Fix number of replicas; vary cores assigned to each replica.

    Type B: Fix the size of replica; vary number of replicas (Cool Walking) -- Same temperature range (adaptive sampling) -- Greater temperature range (enhanced dynamics)

  • Abstractions for Dynamic Execution (2) SAGA Pilot-Job (BigJob)

  • Deployment & Scheduling of Multiple Infrastructure-Independent Pilot-Jobs

  • Distributed Adaptive Replica Exchange (DARE) Multiple Pilot-Jobs on the Distributed TeraGrid

    Ability to dynamically add HPC resources. On TG: each Pilot-Job uses 64 processors; each NAMD instance uses 16 processors.

    Time-to-completion improves; no loss of efficiency.

    Time-per-generation is measure of sampling

    Adaptive Replica-Exchange, Phil. Trans of Royal Society A (2009)

  • IDEAS: Facilitating Novel Execution Modes

    Interoperability and Scale-out enable new ways of resource planning and application execution

    Deadline-driven scheduling: e.g., Need Workload X done before time Y

    Adapt workload distribution and resource utilization to ensure completion

    Accepted for IEEE CCGrid10
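A deadline-driven planner of this kind can be sketched as below. The greedy strategy and the cost model (worst queue wait plus serialized batches of tasks) are illustrative assumptions, not the algorithm from the CCGrid10 paper:

```python
import math

def plan_resources(n_tasks, task_time, deadline, resources):
    """Greedy deadline-driven resource planner (an illustrative sketch).

    resources: list of (name, cores, est_queue_wait) tuples; times in
    the same unit as task_time and deadline. Adds resources, shortest
    estimated queue wait first, until the estimated makespan fits."""
    chosen, total_cores, worst_wait = [], 0, 0.0
    for name, cores, wait in sorted(resources, key=lambda r: r[2]):
        chosen.append(name)
        total_cores += cores
        worst_wait = max(worst_wait, wait)
        # rough makespan: wait for the slowest queue, then run the
        # workload in serialized batches across all acquired cores
        batches = math.ceil(n_tasks / total_cores)
        makespan = worst_wait + batches * task_time
        if makespan <= deadline:
            return chosen, makespan
    return None, None  # deadline not achievable with these resources
```

A real planner would refresh the queue-wait estimates (e.g. from BQP) and re-plan as conditions change.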

  • Characterizing Reservoirs: Permeability and Porosity

    Porosity: Measure of capacity (buckets)

    Permeability: Measure of flow (pipes)

  • Results: Scale-Out Performance

    Using more machines decreases the time-to-completion (TTC) and the variation between experiments

    Using BQP (batch-queue prediction) decreases the TTC & the variation between experiments further

    Lowest time to completion achieved when using BQP and all available resources

    Khamra & Jha, GMAC, ICAC'09; TeraGrid Performance Challenge Award 2008

  • SAGA-based Applications: Examples from a Data-Intensive Computational Biology Research Agenda

    Many More Questions than Answers!

    SAGA NxM Framework (All-Pairs): compute matrix elements, each of which is a task

    All-to-all sequence comparison. Control the distribution of tasks and data; data-locality optimization via an external (runtime) module.

    SAGA MapReduce Framework: Master-Worker: File-Based &/or Stream-Based

    SAGA-based Sphere (Stream based processing)

    SAGA-based DAG execution: extended to support dynamic decisions/placement, load balancing (LB) & scheduling

    Applications ordered from more to less regular: All-Pairs is very structured; DAG-based applications can be very irregular

  • DDIA: Some questions SAGA-based All-Pairs

    We want to understand: performance sensitivity to data decomposition, workload granularity, degree of distribution, and their interplay; novel relative compute-data placement

    Affinity-based, data-access patterns. Advantage of interoperability: which infrastructure to use for a specific problem? Examine sensitivity to placement techniques.

    Performance tradeoffs of a DFS compared to regular distribution. Why a DFS? It is an abstract layer between the application and local file systems. Examples include HDFS, GFS, and CloudStore (an open-source high-performance DFS based on Google's distributed file system, GFS).

    It is common to load a DFS as part of a VM image; multiple open-source DFSs are now available, and they are generally more reliable now.

  • SAGA-based All-Pairs: Performance Determined at Multiple Levels

    We use a SAGA-based All-Pairs abstraction as a representative example: it applies an operation to the input data set such that every possible pair in the set is input to the operation. Examples: genome comparison; image similarity (biometrics) [D. Thain].

    Initial data condition: data may be distributed across resources, possibly localized

    Work Decomposition: Granularity of work-load

    8x8 matrix (= 64 matrix elements): workload units of 4? 16? 64?

    Workload-to-worker mapping: for a fixed data-set size, this is equal to the number of workers

    Worker placement: all local? All distributed? I/O saturation? Compute-bound? Network effects?

    Dynamic & irregular data: the stage at which workload is bound to workers
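The decomposition questions above can be made concrete with a small sketch. Here `chunk` and `workers` stand in for the workload-unit size and worker count being varied; the names and the threading model are illustrative, not the SAGA framework:

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

def all_pairs(items, op, chunk=16, workers=4):
    """Illustrative All-Pairs sketch: apply op to every (i, j) pair,
    grouping pairs into workload units of `chunk` tasks each.
    Returns the full result matrix."""
    pairs = list(product(range(len(items)), repeat=2))
    units = [pairs[k:k + chunk] for k in range(0, len(pairs), chunk)]

    def run_unit(unit):
        # one workload unit, executed by one worker
        return [(i, j, op(items[i], items[j])) for i, j in unit]

    matrix = [[None] * len(items) for _ in items]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for results in pool.map(run_unit, units):
            for i, j, v in results:
                matrix[i][j] = v
    return matrix
```

Varying `chunk` against `workers` is exactly the granularity/mapping trade-off asked about above: too-small units inflate coordination cost, too-large units under-use workers.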

  • Distributed Data Base Line tests

    Time-to-completion curves downward as the number of workers (Nw) goes up.

    Adding workers eventually becomes ineffective: coordination costs dominate.

    Accessing remote data is expensive.
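The trend can be captured by a toy cost model (illustrative constants, not measured data): parallelized compute shrinks as W/n while per-worker coordination cost grows with n, so TTC has a minimum in the number of workers.

```python
def time_to_completion(n_workers, total_work, coord_cost):
    """Toy model: TTC(n) = W/n + c*n. The W/n term is the parallelized
    compute; the c*n term is the per-worker coordination overhead that
    eventually dominates as workers are added."""
    return total_work / n_workers + coord_cost * n_workers
```

With W = 100 and c = 1 the minimum falls at n = 10 workers; beyond that, adding workers makes TTC worse, matching the observed curve.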

  • Distributing Workload (Intelligent)

    Configurations: Normal & Intelligent

    Very simple heuristic: assign tasks to the worker with the lowest transfer time

    The overhead of implementing this intelligence is negligible: ~1% of the time

    Scales similarly for different file sizes; scales out to > 3 resources
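That heuristic fits in a few lines. Here `transfer_time` is a stand-in for the simple network measures (ping, throughput) mentioned later in the deck, and all names are illustrative:

```python
def assign_tasks(tasks, workers, transfer_time):
    """Greedy placement sketch of the 'very simple heuristic': each
    task goes to the worker with the lowest estimated transfer time
    for that task's data.

    transfer_time(task, worker) -> estimated seconds to move the
    task's input data to that worker (a hypothetical probe)."""
    assignment = {}
    for task in tasks:
        assignment[task] = min(workers, key=lambda w: transfer_time(task, w))
    return assignment
```

Because the probe is cheap relative to the data movement it avoids, the decision overhead stays around the ~1% figure quoted above.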

  • DFS or Simple Intelligence? Scale-out Performance

  • SAGA-MapReduce (GSOC08 Miceli, Jha et al CCGrid09; Merzky, Jha GPC09)

    Interoperability: use multiple infrastructures concurrently; control the placement of workers (Nw)

    Dynamic resource allocation: the Map phase differs from the Reduce phase

    Distribution of data

    Ts: Time-to-solution, including data-staging for SAGA-MapReduce (simple file-based mechanism)
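The phase structure can be sketched as follows. This sequential toy shows only the master-worker decomposition and the map/reduce split; data staging, SAGA job submission, and the file- vs stream-based transport are omitted:

```python
from collections import defaultdict

def map_reduce(records, map_fn, reduce_fn, n_workers=4):
    """Minimal MapReduce sketch. The master partitions records across
    n_workers map partitions; the reduce phase may use a different
    number of workers than the map phase (here it runs per key)."""
    # Map phase: master partitions records across workers
    partitions = [records[i::n_workers] for i in range(n_workers)]
    intermediate = defaultdict(list)
    for part in partitions:          # each partition would be one worker
        for rec in part:
            for key, value in map_fn(rec):
                intermediate[key].append(value)
    # Reduce phase: combine all values collected for each key
    return {k: reduce_fn(k, vs) for k, vs in intermediate.items()}
```

In the file-based variant, `intermediate` would be staged to files between the phases; that staging is what the Ts measurement above includes.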

  • Digedag: SAGA Workflow Package

    Digedag is a prototype implementation of an experimental SAGA-based workflow package, with:

    An API for programmatically expressing workflows

    A parser for (abstract or concrete) workflow descriptions

    An (in-time) workflow planner

    A workflow enactor (using the SAGA engine)

    An integrated API allows node and data dependencies to be specified directly, and removes the need to manually (explicitly) build DAGs.

    Can accept mDAG output, or Pegasus output

    Move back and forth between A & C-DAG;

    Facilitates dynamic execution of DAGs
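A workflow API of this general shape can be illustrated with a toy DAG class. This is a hypothetical API for illustration, not Digedag's actual interface:

```python
class DAG:
    """Toy workflow DAG: declare tasks and their dependencies, then
    enact them in dependency order."""

    def __init__(self):
        self.tasks, self.deps = {}, {}

    def add_task(self, name, fn, after=()):
        """Register callable fn under name, depending on tasks in after."""
        self.tasks[name] = fn
        self.deps[name] = list(after)

    def run(self):
        """Enact tasks whose dependencies are satisfied; return the order."""
        done, order = set(), []
        while len(done) < len(self.tasks):
            ready = [t for t in self.tasks
                     if t not in done and all(d in done for d in self.deps[t])]
            if not ready:
                raise ValueError("cycle in workflow")
            for t in ready:   # a real enactor would dispatch these concurrently
                self.tasks[t]()
                done.add(t)
                order.append(t)
        return order
```

Because the DAG is held as data rather than baked into code, a planner can rewrite it between rounds, which is what makes dynamic execution of DAGs possible.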

  • Application Development Phase

    Generation & Exec. Planning Phase

    Execution Phase

    DAG-based Workflow Applications: Extensibility and a Higher-level API

  • SAGA-based DAG Execution Preserving Performance

  • Dist. Data-Intensive Applications (DDIA) Case for Frameworks IDEAS

    Data is inherently distributed. Distributed DIA is not just the simple sum of DIA concerns.

    Multiple, heterogeneous infrastructures: decouple application development from the underlying infrastructure; interoperation, e.g., concurrently across Grids and Clouds

    Support runtime or application characteristics for multiple applications and different infrastructures

    Support multiple programming models: master-worker, but irregular

    Support application-level patterns: MapReduce, file- vs stream-based

    Support distributed affinities

  • Intelligent Compute-Data Placement

    Objective: Intelligence in Compute-Data placement

    Strategies: assignment of workers (statically) determined by the lowest Ttransfer

    Simple network measures (ping, throughput, etc.) for Ttransfer; Data pre-s