
www.openfabrics.org

Open MPI Project
State of the Union - April 2007

Jeff Squyres

Cisco, Inc.


Overview

Project purpose
Sub-projects
Current status
Continuing / future directions


Why does Open MPI exist?

Maximize all MPI expertise
Research / academia
Industry
…elsewhere

Capitalize on [literally] years of MPI research and implementation experience

The sum is greater than the parts



Why separate from M[VA]PICH?

Open, inclusive community
Not limited to just OpenFabrics
Common: TCP, shared memory, OFED* (MVAPICH only)
OMPI-specific: Myrinet, Portals, InfiniPath
M[VA]PICH have different project goals
They both chose to remain separate


Current membership

14 members, 6 contributors
4 US DOE labs
8 universities
7 vendors
1 individual


Not-so-subtle hint

…would love to see an iWARP vendor in the list! (please come talk to me!)


Current projects

“Open MPI Project” is an umbrella organization for multiple projects
OMPI: Open MPI
ORTE: Open Run-Time Environment
PLPA: Portable Linux Processor Affinity
MTT: MPI (Middleware) Testing Tool


Project: Open MPI / ORTE

Recently released new 1.2 series
OF-related changes compared to v1.1 series:
Better overall performance, lots of bug fixes
Improvements for run-time/launch scalability
Relocate installed MPI (good for ISVs)
Support for fork() with OFED 1.2
Support fixed limits for registered memory
Fixes for heterogeneous network environments
Native InfiniPath support


Version history


Success stories

OFED + Open MPI:
Thunderbird (Sandia cluster)
• #6 in Top 500
Roadrunner (Los Alamos cluster)
• 16k Opteron cores + 16k Cell Broadband Engines
Coyote (Los Alamos cluster)
• 2,580 Opteron cores
Sun ClusterTools v7


OFED involvement

Initially planned on “v1.2ofed”
Included some OF-specific updates
But community released v1.2.1 before OFED 1.2
Therefore, included community OMPI v1.2.1 release in OFED v1.2
[Diagram: OMPI SVN development trunk → v1.2 series branch → v1.2 → v1.2ofed → v1.2.1 (today)]


OFED involvement

“MPI Selector”
Menu-based and CLI commands
Trivially set system-wide and per-user default MPI selection
No editing of “dot” files necessary
Displays / selects between all installed MPIs
Works with all MPIs
Including HP MPI and Intel MPI
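The selector's command-line side can be sketched as follows; this is a hedged example assuming the OFED-era mpi-selector tool is installed, and the exact option names may differ across OFED releases:

```shell
# List the MPI implementations registered with mpi-selector
# (the implementation name below is illustrative; yours will vary)
mpi-selector --list

# Set the per-user default MPI; --system would set the system-wide default
mpi-selector --set openmpi-1.2.1 --user

# Show which MPI is currently selected
mpi-selector --query
```

A new login shell typically picks up the new selection, since the tool works through shell startup scripts rather than by editing each user's “dot” files.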


Ongoing OFA-related work

More flexible OF wireup schemes
Heterogeneous networking scenarios
Multiple QPs per connection
More flexible resource affinity schemes
Processor / core, HCA / port
Automatic path migration
RDMA CM functionality
Better LMC / multi-LID routing


Ongoing OFA-related work

Message coalescing
Asynchronous progress
Exploit new Mellanox HCA capabilities
Better utilization of network resources
Heterogeneity
Multicast, UD


Roadmap

1.2 series is current stable
v1.2.1 latest release
1.3 series tentatively targeted at end of year:
Checkpoint / restart (and other FT)
Integration with debuggers
Windows support (*)
MPI collectives performance improvements
LSF integration


Project: Processor affinity (PLPA)

Linux API for affinity has changed 3 times
Changed number and type of arguments
Used same function name (!)
Both kernel and glibc functions
Installed glibc may not match kernel!
Affinity is critical for performance
Especially with increasing core count per host
Already critical on NUMA machines (locality!)


Which API to use?

Compile-time solution not sufficient
Need complex “configure” script to figure it out
Only determines glibc API, not kernel API, so it may not even be sufficient
Does not help for shipping static binaries (ISVs)
Need a run-time solution
Paul Hargrove (LBNL) devised safe kernel probe
PLPA library born


PLPA library

Constant API suitable for ISVs
BSD license
Automatically performs the run-time probe
Dispatches to correct back-end kernel function
Bypasses glibc


Current status

Releases:
Stable series: 1.0.x
Upcoming series: 1.1
New 1.1 features:
Topology information
• Mapping between (socket, core) tuples, hardware threads, CPU nodes, and Linux processor IDs
plpa-taskset(1) command
• Same as taskset(1), but groks topology information


Project: MPI Testing Tool (MTT)

Could be named “Middleware Testing Tool”
Very little (no?) MPI-specific code
Not specific to Open MPI
Has been used with LAM, MPICH2, MVAPICH2
Used as primary test mechanism for OMPI
Distributed testing by member organizations


Open MPI MTT Usage

Distributed regression testing
Nightly and weekend runs
Results e-mailed every weekday morning
Supports various resource managers
Supports correctness and performance tests
Cornerstone of Open MPI release process
Each member tests the platforms they care about


Nightly Regression Testing
[Diagram: Indiana U. server builds a nightly tarball; member sites download it and upload results to a central DB]
Nightly tarballs created at Indiana U.
Member sites download tarball and tests, compile and run tests
Members upload results to central DB
E-mail sent at 12 and 24 hour intervals
Real-time web querying


Usage in Open MPI

Currently available to all OMPI members
Strongly “encouraged”
E-mail results examined every day
12 and 24 hour windows
Weekend windows

MTT software to be released publicly later this year


The Open MPI Project

More than just MPI
Concerned with real-world HPC
Open community
Come join us!
Solid OpenFabrics support is critical
Many unanswered questions
Plenty of room for academic and industry ongoing work


Thank You

http://www.open-mpi.org/