Session 2: ESMF Distributed Data Classes Instructors: Rocky Dunlap and Fei Liu NOAA Cooperative Institute for Research in Environmental Sciences University of Colorado, Boulder Training at NRL Monterey August 5-6, 2015


Session 2: ESMF Distributed Data Classes

Instructors: Rocky Dunlap and Fei Liu
NOAA Cooperative Institute for Research in Environmental Sciences
University of Colorado, Boulder

Training at NRL Monterey
August 5-6, 2015

2

Two Day Overview

• Day 1 morning
  – Overview of ESMF and component-based modeling
  – Coding exercise: Run a single-component ESMF application
• Day 1 afternoon
  – ESMF distributed data classes and wrapping your model data types with ESMF data types
  – Regridding, LocStream
  – Coding exercise: Grid/Mesh construction
• Day 2 morning
  – Overview of NUOPC interoperability layer
  – Coding exercise: Run a NUOPC prototype application
• Day 2 afternoon
  – Questions and time for discussion of ESMF/NUOPC in COAMPS and other NRL modeling applications

3

Connecting ESMF to your Model’s Data

ESMF’s distributed data classes are used to reference your model’s data as it is spread across PETs. These classes are the main connection between ESMF and your model.

You must instantiate ESMF data types to take advantage of ESMF’s fast parallel communication operations, such as redistribution and regridding.

NUOPC relies on ESMF distributed data classes for inter-component communication.

4

This Session…

Learn about ESMF classes for representing model domains and wrapping model data.

• Overview of distributed data classes
• Grids
• Meshes
• Fields / FieldBundle
• Later…
  – Regridding
  – New LocStream and remapping to point list


6

Physical Space and Index Space

Physical space is concerned with the geometry of the modeled domain. A physical space representation is built on an underlying index space.

The related ESMF classes are Grid, Mesh, LocStream, Field, and FieldBundle.

Index space is concerned with the logical layout (topology) of model data. This is directly related to how data is stored in memory.

The related ESMF classes are DistGrid, DELayout, Array, and ArrayBundle.

7

Relationship of ESMF Distributed Data Classes

[Diagram: relationships among Field / FieldBundle, Grid, Mesh, LocStream, Array / ArrayBundle, DistGrid, and DELayout.]

Physical space
• Field – storage for a physical field
• Grid – logically rectangular regions
• Mesh – unstructured
• LocStream – set of points, e.g., observations

Index space
• Array – index-based distributed data storage
• DistGrid – distributed, multi-dimensional index space
• DELayout – maps decomposition elements to PETs

8

Relationship of ESMF Distributed Data Classes

In many cases, you can create physical space objects directly.

The underlying index space objects will be created automatically.

This saves time and reduces code size.

9

Operations on Distributed Data Classes

• Sparse Matrix Multiply
  – Apply coefficients (weights) in parallel to distributed data
  – Highly tuned for efficiency/auto-tunes for optimal execution
  – Underlies most of ESMF distributed data operations
• Redistribution
  – Move data between distributions without changing values (no interpolation)
• Scatter / Gather
  – Distribute native data from single PET to multiple PETs or vice versa
• Halo
  – Fill surrounding “Halo” cells which hold data from another processor
  – Useful during computations on distributed data
• Regridding
  – Move data from one grid/mesh to a different one
  – Only available on physical space data classes
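The sparse matrix multiply named in the first bullet can be illustrated with a few lines of plain Python. This is only a serial sketch of the arithmetic, not ESMF code: the function name `sparse_matrix_multiply` and the `(dst_index, src_index, weight)` triple layout are assumptions made for illustration, and ESMF performs the equivalent accumulation in parallel across PETs.

```python
def sparse_matrix_multiply(weights, src, dst_size):
    """Accumulate dst[d] += w * src[s] for every (d, s, w) triple."""
    dst = [0.0] * dst_size
    for dst_index, src_index, weight in weights:
        dst[dst_index] += weight * src[src_index]
    return dst


# Hypothetical example: map 3 source values onto 2 destination points.
src = [10.0, 20.0, 30.0]
weights = [
    (0, 0, 0.5), (0, 1, 0.5),    # dst 0 is the average of src 0 and src 1
    (1, 1, 0.25), (1, 2, 0.75),  # dst 1 is weighted toward src 2
]
dst = sparse_matrix_multiply(weights, src, dst_size=2)
print(dst)
```

In this picture, a redistribution can be viewed as the special case in which each destination entry has exactly one source and every weight is 1, so values move without changing.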

10

Typical Flow of Execution

! Create grid and field objects
myGrid = ESMF_GridCreate(…)
myField = ESMF_FieldCreate(myGrid, …)

! Create halo communication structure
call ESMF_FieldHaloStore(myField, …, routehandle)

! Loop over time
do t=1,…

  ! Access Field data pointer
  call ESMF_FieldGet(myField, farrayPtr=ptr_to_data, …)

  ! Loop over memory doing computations on data
  do i=1, …
    do j=1, …
      ptr_to_data(i,j) = …
    enddo
  enddo

  ! Update halo
  call ESMF_FieldHalo(myField, routehandle, …)

enddo

call ESMF_FieldHaloRelease(routehandle, …)

Efficient reuse of communication RouteHandle


12

ESMF Grids

ESMF Grids represent logically rectangular regions.

Fully specifying a physical Grid requires setting a number of parameters through the ESMF Grid API.

The typical steps are:
• define the index space
• define the topology in terms of periodic dimensions and poles
• define the parallel decomposition (or use the default)
• add storage for grid coordinates at one or more stagger locations
• set the coordinates
• if needed, add storage for and set masks

Once set, multiple Fields can be defined using the same Grid object.

13

Grid Creation Shortcut Methods

For many basic grids, most parameters can be set with a single API call.

• ESMF_GridCreateNoPeriDim()
  – Creates a Grid with no edge connections, e.g., a regional Grid with closed boundaries
• ESMF_GridCreate1PeriDim()
  – Creates a Grid with one periodic dimension and supports a range of options for what to do at the pole (bipole, tripole spheres)
• ESMF_GridCreate2PeriDim()
  – Creates a Grid with two periodic dimensions, e.g., a torus, or a regional Grid with doubly periodic boundaries
• ESMF_GridCreate("gridfile.nc")
  – Support for SCRIP and Gridspec formats

14

Example of Grid Creation: Regular Decomposition

type(ESMF_Grid) :: my2DGrid

my2DGrid = ESMF_GridCreateNoPeriDim(minIndex=(/1,1/), &
    maxIndex=(/12,18/), regDecomp=(/2,3/), rc=rc)

[Figure: index space from (1,1) to (12,18), divided 2 ways along dim 1 and 3 ways along dim 2 into Decomposition Elements (DEs) 0 to 5.]

Distribution of a Grid is cell-based
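To make the regular decomposition concrete, the following plain-Python sketch computes which index range each DE would hold for the 12 x 18 example with regDecomp=(/2,3/). It is not ESMF code. The helper name `regular_decomp` is hypothetical, the sketch only handles extents that divide evenly (as they do here), and it assumes DEs are numbered with the first dimension varying fastest, which matches the DE numbering in the explicit petMap example later in this session.

```python
def regular_decomp(min_index, max_index, reg_decomp):
    """Return {de: (lower, upper)} with inclusive 2D index bounds per DE."""
    counts = []
    for lo, hi, n in zip(min_index, max_index, reg_decomp):
        extent = hi - lo + 1
        if extent % n != 0:
            raise ValueError("this sketch only handles evenly divisible extents")
        counts.append(extent // n)

    bounds = {}
    de = 0
    # Assumption: the first dimension varies fastest in the DE numbering.
    for j in range(reg_decomp[1]):
        for i in range(reg_decomp[0]):
            lower = (min_index[0] + i * counts[0], min_index[1] + j * counts[1])
            upper = (lower[0] + counts[0] - 1, lower[1] + counts[1] - 1)
            bounds[de] = (lower, upper)
            de += 1
    return bounds


bounds = regular_decomp((1, 1), (12, 18), (2, 3))
for de, (lower, upper) in bounds.items():
    print(f"DE {de}: {lower} to {upper}")
```

Each of the six DEs ends up with a 6 x 6 block of cells under these assumptions.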

15

Irregular Grid Decomposition

type(ESMF_Grid) :: my2DGrid

my2DGrid = ESMF_GridCreateNoPeriDim( &
    countsPerDEDim1=(/3,7,2/), &
    countsPerDEDim2=(/3,5,10/), rc=rc)

[Figure: the same index space from (1,1) to (12,18), divided into DEs 0 to 8 of unequal size.]
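The countsPerDEDim arguments give the number of cells each DE receives along a dimension, so the index ranges follow from a running sum. The plain-Python sketch below shows that arithmetic for the counts on this slide. The helper name `bounds_from_counts` is hypothetical, and the starting index of 1 is an assumption that matches the (1,1) corner shown in the figure.

```python
from itertools import accumulate


def bounds_from_counts(counts, start=1):
    """Convert per-DE cell counts along one dimension into inclusive (lower, upper) pairs."""
    edges = list(accumulate(counts, initial=start - 1))
    return [(lo + 1, hi) for lo, hi in zip(edges[:-1], edges[1:])]


dim1 = bounds_from_counts([3, 7, 2])    # counts sum to 12
dim2 = bounds_from_counts([3, 5, 10])   # counts sum to 18
num_des = len(dim1) * len(dim2)

print(dim1)
print(dim2)
print(num_des)
```

The counts along each dimension must add up to the full extent of the index space, which is why no maxIndex is needed in the call above.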

16

Arbitrary Grid Decomposition

type(ESMF_Grid) :: my2DGrid
integer :: localCount
! array of size (localCount, num distributed dims)
integer :: localIndex(:,:)

… ! fill in localIndex array per PET

my2DGrid = ESMF_GridCreateNoPeriDim( &
    maxIndex=(/12,18/), &
    arbIndexList=localIndex, &
    arbIndexCount=localCount, rc=rc)

[Figure: grid cells assigned to DEs 0 to 3 in an arbitrary pattern.]

Restrictions:
- one DE per PET
- only local indexing
- only center stagger

17

Distributing DEs to PETs

Assuming the application is executed with 6 PETs, the default DELayout will assign each DE to a separate PET, i.e., a 1-to-1 mapping.

[Figure: DEs 0 to 5 mapped one-to-one onto PETs 0 to 5.]

18

Distributing DEs to PETs

If there are fewer PETs available than DEs, DEs will be assigned in a cyclic manner leading to some PETs having multiple DEs.

[Figure: DEs 0 to 5 distributed across PETs 0 to 3.]
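A cyclic assignment can be sketched in plain Python as shown below. This assumes that "cyclic" means DE number modulo PET count; the function name `cyclic_de_to_pet` is hypothetical, and the exact default layout should be confirmed against the DELayout documentation.

```python
def cyclic_de_to_pet(de_count, pet_count):
    """Assign DEs to PETs round-robin (assumed meaning of 'cyclic')."""
    return {de: de % pet_count for de in range(de_count)}


mapping = cyclic_de_to_pet(de_count=6, pet_count=4)

# Invert the mapping to see which DEs each PET holds.
des_on_pet = {}
for de, pet in mapping.items():
    des_on_pet.setdefault(pet, []).append(de)

print(des_on_pet)
```

With 6 DEs and 4 PETs, two PETs hold two DEs each and the other two hold one each under this assumption.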

19

Explicit DE-to-PET Mapping

allocate( petMap(2,3,1) )

! explicit assignment of DEs to PETs
petMap(:,1,1) = (/0,0/) ! DEs 0 and 1 on PET 0
petMap(:,2,1) = (/1,1/) ! DEs 2 and 3 on PET 1
petMap(:,3,1) = (/0,0/) ! DEs 4 and 5 on PET 0

my2DGrid = ESMF_GridCreateNoPeriDim(maxIndex=(/12,18/), &
    regDecomp=(/2,3/), petMap=petMap, rc=rc)

[Figure: DEs 0, 1, 4, and 5 placed on PET 0; DEs 2 and 3 placed on PET 1.]

20

Non-distributed Dimensions

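type(ESMF_Grid) :: grid3D

grid3D = ESMF_GridCreateNoPeriDim( &
    countsPerDEDim1=(/45,75,40,20/), &
    countsPerDEDim2=(/30,40,20/), &
    countsPerDEDim3=(/40/), rc=rc)

[Figure: index space extending to (180,90,40), divided into DEs 0 to 11 across the first two dimensions; the third dimension is not distributed.]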

21

Supported Coordinates

          | Uniform                       | Rectilinear           | Curvilinear
Sphere    | Global uniform lat-lon grid   | Gaussian grid         | Displaced pole grid
Rectangle | Regional uniform lat-lon grid | Gaussian grid section | Polar stereographic grid section

22

Staggering

Staggering is a finite difference technique in which the values of different physical quantities are placed at different locations within a grid cell.

The Grid class supports stagger locations at cell centers, corners, and edge centers. Coordinates are defined at specific stagger locations.

23

Adding Grid Coordinates

When a Grid is created, memory is not automatically allocated for storing coordinates. ESMF_GridAddCoord() is used to allocate memory for coordinates at a specific stagger location.

! Rectilinear coordinates
grid2D = ESMF_GridCreateNoPeriDim( &
    maxIndex=(/10,20/), & ! define index space
    regDecomp=(/2,3/), & ! define how to divide among DEs
    coordSys=ESMF_COORDSYS_CART, & ! Cartesian coordinates
    coordDep1=(/1/), & ! 1st coord is 1D and depends on 1st Grid dim
    coordDep2=(/2/), & ! 2nd coord is 1D and depends on 2nd Grid dim
    indexflag=ESMF_INDEX_GLOBAL, rc=rc)

! Allocate space for coordinates
call ESMF_GridAddCoord(grid2D, &
    staggerloc=ESMF_STAGGERLOC_CENTER, rc=rc)

! Retrieve local pointer to coordinate array
call ESMF_GridGetCoord(grid2D, coordDim=1, localDE=0, &
    staggerloc=ESMF_STAGGERLOC_CENTER, &
    computationalLBound=lbnd, computationalUBound=ubnd, &
    farrayPtr=coordX, rc=rc)

! Set coordinates
do i=lbnd(1),ubnd(1)
  coordX(i) = i*10.0
enddo

24

Padding for Stagger Locations

For a Grid of N x M cells:

Stagger Location       | Required Storage
ESMF_STAGGERLOC_CENTER | N x M
ESMF_STAGGERLOC_CORNER | (N+1) x (M+1)
ESMF_STAGGERLOC_EDGE1  | (N+1) x M
ESMF_STAGGERLOC_EDGE2  | N x (M+1)

ESMF automatically provides padding (extra index space) when required for storing coordinate data (and other grid items) based on the stagger location.
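The storage sizes in the table above are simple functions of the cell counts. As a quick illustration, this plain-Python sketch evaluates them for a given N and M; the function name `stagger_storage` is hypothetical and the code does not call ESMF.

```python
def stagger_storage(n, m):
    """Return the storage shape needed at each stagger location for an n x m cell grid."""
    return {
        "ESMF_STAGGERLOC_CENTER": (n, m),
        "ESMF_STAGGERLOC_CORNER": (n + 1, m + 1),
        "ESMF_STAGGERLOC_EDGE1": (n + 1, m),
        "ESMF_STAGGERLOC_EDGE2": (n, m + 1),
    }


sizes = stagger_storage(10, 20)
for staggerloc, shape in sizes.items():
    print(f"{staggerloc}: {shape[0]} x {shape[1]}")
```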


26

ESMF Meshes

• A Mesh is constructed of nodes and elements.

• The parametric dimension is the dimension of the elements.

• The spatial dimension is the dimension of the space the Mesh is embedded in, i.e., the number of coordinate dimensions.

• ESMF supports 2D elements in 2D space, 3D elements in 3D space, and 2D elements in 3D space.

• In 2D, there is no limit to the number of polygon sides.

• In 3D, elements may be tetrahedra or hexahedra.

• Meshes are distributed by element. Nodes may be duplicated on multiple PETs but are owned by one PET.

[Figure: example mesh with nodes 1 to 9 and elements [1] to [5], annotated with node IDs, element IDs, and the PET that owns each node.]

27

Explicit Mesh Construction

! create Mesh from explicit nodes and element lists

mesh = ESMF_MeshCreate(parametricDim=2, spatialDim=2, &
    nodeIds=nodeIds, & ! 1d array of unique node ids
    nodeCoords=nodeCoords, & ! 1d array of size spatialDim*nodeCount
    nodeOwners=nodeOwners, & ! 1d array of PETs
    elementIds=elemIds, & ! 1d array of unique element ids
    elementTypes=elemTypes, & ! 1d array of element types
    elementConn=elemConn) ! 1d array of corner node local indices

! the parameters above are for PET-local nodes and elements

2D element types:
ESMF_MESHELEMTYPE_TRI
ESMF_MESHELEMTYPE_QUAD
N (n-sided polygon)

3D element types:
ESMF_MESHELEMTYPE_TETRA
ESMF_MESHELEMTYPE_HEX
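The flat 1D arrays passed to ESMF_MeshCreate can be easier to understand when built from a small example. The plain-Python sketch below assembles them for a hypothetical two-element mesh (one quadrilateral and one triangle sharing an edge). The mesh, the variable names, and the use of corner counts as stand-ins for the element type constants are all assumptions for illustration; nothing here calls ESMF.

```python
spatial_dim = 2

# Hypothetical mesh: node id -> (x, y)
nodes = {1: (0.0, 0.0), 2: (1.0, 0.0), 3: (2.0, 0.0), 4: (0.0, 1.0), 5: (1.0, 1.0)}
# Hypothetical elements: element id -> corner node ids, listed counterclockwise
elements = {1: [1, 2, 5, 4], 2: [2, 3, 5]}

node_ids = list(nodes)
# Coordinates are flattened node by node: x1, y1, x2, y2, ...
node_coords = [c for nid in node_ids for c in nodes[nid]]
# Connectivity refers to the 1-based position of each node in node_ids
local_index = {nid: k + 1 for k, nid in enumerate(node_ids)}

elem_ids = list(elements)
elem_corner_counts = [len(elements[e]) for e in elem_ids]  # stand-in for element types
elem_conn = [local_index[n] for e in elem_ids for n in elements[e]]

print(node_coords)
print(elem_conn)
```

Two consistency checks follow from the comments on the slide: the coordinate array has spatialDim * nodeCount entries, and the connectivity array has one entry per element corner.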

28

Mesh Construction from File

! create Mesh from file in SCRIP format

mesh = ESMF_MeshCreate(filename="data/ne4np4-pentagons.nc", &
    filetypeflag=ESMF_FILEFORMAT_SCRIP, &
    nodalDistgrid=nodalDistgrid, & ! optional, describes decomp
    elementDistgrid=elementDistgrid, & ! optional, describes decomp
    rc=rc)

File type flags:

ESMF_FILEFORMAT_SCRIP: format from SCRIP regridding tool, grid_rank must be 1

ESMF_FILEFORMAT_ESMFMESH: efficient, custom format designed to match capabilities of Mesh class

ESMF_FILEFORMAT_UGRID: proposed extension to CF convention; only 2D flexible mesh topology supported


30

ESMF Fields

An ESMF Field is a distributed data structure that wraps model variables. Fields are added to import and export States and can be transferred between Components.

• Start by creating a Grid, Mesh, or LocStream object. This describes the underlying index space required for the Field and how it is decomposed across PETs.

• Use one of the ESMF_FieldCreate() routines, passing in the Grid/Mesh/LocStream.

1. ESMF can allocate memory for you, in which case you provide the typekind

2. ESMF can use a pre-allocated Fortran array or array pointer
   • When a Field references internal model arrays, it removes the requirement to copy data in/out of the model for coupling exchanges.

31

Create Field with Memory Automatically Allocated

! create a grid
grid = ESMF_GridCreateNoPeriDim(minIndex=(/1,1/), &
    maxIndex=(/10,20/), &
    regDecomp=(/2,2/), name="atmgrid", rc=rc)

! create a Field from the Grid and typekind
field1 = ESMF_FieldCreate(grid, typekind=ESMF_TYPEKIND_R4, &
    indexflag=ESMF_INDEX_DELOCAL, &
    staggerloc=ESMF_STAGGERLOC_CENTER, &
    name="pressure", rc=rc)

call ESMF_FieldGet(field1, localDe=0, farrayPtr=farray2d, &
    computationalLBound=clb, computationalUBound=cub, &
    totalCount=ftc)

do i = clb(1), cub(1)
  do j = clb(2), cub(2)
    farray2d(i,j) = … ! computation over local DE
  enddo
enddo

32

Create Field from Fortran Array

real(ESMF_KIND_R8), pointer :: farrayPtr2D(:,:)

! local PET array allocation in model
allocate( farrayPtr2D(5,5) )

! create 10x10 grid with regular 2x2 decomposition
grid = ESMF_GridCreateNoPeriDim(minIndex=(/1,1/), &
    maxIndex=(/10,10/), regDecomp=(/2,2/), name="atmgrid")

! local array size is 5x5
! default center stagger
field = ESMF_FieldCreate(grid, farrayPtr2D, &
    indexflag=ESMF_INDEX_DELOCAL)

33

Field Regions and Default Bounds

Exclusive region (fixed): elements owned exclusively by a DE; sole source for halo and reduce operations

Total region (fixed): additional elements on the rim of the exclusive region; typically overlaps with non-local DE and receives halo updates

Computational region (dynamic): can be arbitrarily set to bounds of cells updated by DE-local computation kernel

Default: exclusive = computational = total bounds

[Figure: the regions of one Decomposition Element.]

34

Setting Bounds for Halo Padding

! create field with halo width of 2 in each dimension
! total widths are with respect to exclusive region

field = ESMF_FieldCreate(grid, &
    farray2d, ESMF_INDEX_DELOCAL, &
    totalLWidth=(/2,2/), totalUWidth=(/2,2/))

[Figure: halo rim around the exclusive region, labeled totalLWidth(1), totalLWidth(2), totalUWidth(1), and totalUWidth(2).]
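Since the total widths are measured from the exclusive region, the total bounds follow by subtracting the lower widths and adding the upper widths. The plain-Python sketch below works through this for a hypothetical 5 x 5 exclusive region with DE-local indexing; the function name `total_bounds` and the 5 x 5 size are assumptions for illustration, not part of the ESMF API.

```python
def total_bounds(excl_lbound, excl_ubound, total_lwidth, total_uwidth):
    """Extend the exclusive bounds outward by the halo widths in each dimension."""
    tlb = tuple(lo - w for lo, w in zip(excl_lbound, total_lwidth))
    tub = tuple(hi + w for hi, w in zip(excl_ubound, total_uwidth))
    return tlb, tub


# Hypothetical DE: exclusive region (1,1) to (5,5), halo width 2 on every side.
tlb, tub = total_bounds((1, 1), (5, 5), (2, 2), (2, 2))
shape = tuple(hi - lo + 1 for lo, hi in zip(tlb, tub))

print(tlb, tub, shape)
```

Under these assumptions, an array passed to ESMF_FieldCreate for such a DE would need to be 9 x 9 to hold both the exclusive cells and the halo.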

35

Retrieving Local-DE Bounds

integer :: elb(2), eub(2)
integer :: clb(2), cub(2)
integer :: tlb(2), tub(2)

call ESMF_FieldGetBounds(field2d, &
    exclusiveLBound=elb, &
    exclusiveUBound=eub, &
    computationalLBound=clb, &
    computationalUBound=cub, &
    totalLBound=tlb, &
    totalUBound=tub, &
    rc=rc)

! iterate over computational region
do j=clb(2), cub(2)
  do i=clb(1), cub(1)
    field2dPtr(i,j) = …
  enddo
enddo

36

Global and Local Indexing

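ESMF_INDEX_DELOCAL: Lower bound of each DE’s exclusive region is (/1,1,…/). In the pictured example with four DEs, every DE spans (1,1) to (3,3).

ESMF_INDEX_GLOBAL: DEs use global indices. In the same example, the four DEs span (1,1) to (3,3), (1,4) to (3,6), (4,1) to (6,3), and (4,4) to (6,6).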
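The difference between the two index flags can be reproduced with a short plain-Python sketch. It computes the exclusive bounds of each DE in a 6 x 6 index space split into 2 x 2 blocks of 3 x 3 cells, as in the figure. The function name `exclusive_bounds` and the zero-based block position argument are hypothetical conveniences for this illustration.

```python
def exclusive_bounds(de_position, cells_per_de, global_indexing):
    """Return inclusive (lower, upper) bounds for the DE at zero-based block position de_position."""
    if global_indexing:
        # Global indexing: bounds reflect the DE's place in the full index space.
        lower = tuple(p * n + 1 for p, n in zip(de_position, cells_per_de))
    else:
        # DE-local indexing: every DE starts counting at 1.
        lower = tuple(1 for _ in de_position)
    upper = tuple(lo + n - 1 for lo, n in zip(lower, cells_per_de))
    return lower, upper


for position in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    local = exclusive_bounds(position, (3, 3), global_indexing=False)
    glob = exclusive_bounds(position, (3, 3), global_indexing=True)
    print(position, "local:", local, "global:", glob)
```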

37

Creating a Field on a Mesh

! create Mesh
mesh = ESMF_MeshCreate(parametricDim=2, spatialDim=2, &
    nodeIds=nodeIds, nodeCoords=nodeCoords, &
    nodeOwners=nodeOwners, elementIds=elemIds, &
    elementTypes=elemTypes, elementConn=elemConn)

! create arrayspec w/ rank and data type
call ESMF_ArraySpecSet(arrayspec, 1, ESMF_TYPEKIND_I4, rc=rc)

! the field is created on locally owned nodes on each PET
field = ESMF_FieldCreate(mesh, arrayspec, rc=rc)

call ESMF_MeshGet(mesh, nodalDistgrid=distgrid, &
    numOwnedNodes=numOwnedNodes, &
    numOwnedElements=numOwnedElements)

allocate(ownedNodeCoords(2*numOwnedNodes))
allocate(ownedElemCoords(2*numOwnedElements))

! retrieve coordinates of nodes/elements owned locally
call ESMF_MeshGet(mesh, nodalDistgrid=distgrid, &
    ownedNodeCoords=ownedNodeCoords, &
    ownedElemCoords=ownedElemCoords)

38

Creating a Field on a Mesh with an Ungridded Dimension

! create Mesh
mesh = ESMF_MeshCreate(parametricDim=2, spatialDim=2, &
    nodeIds=nodeIds, nodeCoords=nodeCoords, &
    nodeOwners=nodeOwners, elementIds=elemIds, &
    elementTypes=elemTypes, elementConn=elemConn)

call ESMF_ArraySpecSet(arrayspec, 2, ESMF_TYPEKIND_I4, rc=rc)

field = ESMF_FieldCreate(mesh, arrayspec=arrayspec, &
    gridToFieldMap=(/2/), &
    ungriddedLBound=(/1/), &
    ungriddedUBound=(/3/))

The gridToFieldMap indicates that the second field dimension maps to the first grid dimension.

The ungriddedLBound and ungriddedUBound define the bounds of the first field dimension.
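The way gridToFieldMap and the ungridded bounds combine can be sketched in plain Python. The helper below places each gridded dimension at the field dimension named in the map, then fills the remaining field dimensions with the ungridded bounds in order. The function name `field_bounds`, the local node count of 7, and the assumption that the gridded dimension is indexed from 1 are all illustrative choices, not ESMF behavior confirmed by the slides.

```python
def field_bounds(grid_counts, grid_to_field_map, ungridded_lbound, ungridded_ubound):
    """Return per-dimension (lower, upper) bounds of the field array."""
    rank = len(grid_counts) + len(ungridded_lbound)
    bounds = [None] * rank
    # Place each gridded dimension at its mapped (1-based) field dimension.
    for grid_dim, field_dim in enumerate(grid_to_field_map):
        bounds[field_dim - 1] = (1, grid_counts[grid_dim])
    # Remaining field dimensions take the ungridded bounds in order.
    ungridded = iter(zip(ungridded_lbound, ungridded_ubound))
    for k in range(rank):
        if bounds[k] is None:
            bounds[k] = next(ungridded)
    return bounds


# Hypothetical PET owning 7 mesh nodes, with the arguments from the slide.
bounds = field_bounds([7], [2], [1], [3])
print(bounds)
```

With the slide's arguments, the first field dimension is the ungridded one (1 to 3) and the second runs over the locally owned mesh nodes.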

39

Field Bundle

The ESMF FieldBundle is a container for storing Fields that are discretized on the same Grid/Mesh/LocStream and have the same decomposition.

Currently this is a convenience mechanism, e.g., all Fields in a FieldBundle can be regridded with one call. In the future, optimizations may be applied such as packing Fields together to enable collective manipulations.

simplefield = ESMF_FieldCreate(grid, arrayspec, &
    staggerloc=ESMF_STAGGERLOC_CENTER, name="rh", rc=rc)

bundle2 = ESMF_FieldBundleCreate(name="fb", rc=rc)

call ESMF_FieldBundleAdd(bundle2, (/simplefield/), rc=rc)

40

Conclusions

Wrapping model data with ESMF types is required for taking advantage of fast, parallel communication operations and is also required by NUOPC.

ESMF has a flexible set of distributed data classes designed to support the index space and physical domain representations used in most Earth system model components.

41

Extra Slides

42

ESMF Infrastructure

• Distributed data classes are used to hold data spread across PETs and they are the main connection between ESMF and your model’s data
  – Represents model data so ESMF can perform operations on it
  – Provides a standard representation to be passed between components
• Utilities:
  – Time Manager: Classes to represent time, time intervals, alarms…
    • Used in ESMF for passing time info between models, time loops, etc.
    • Also useful for doing calculations with time, conversions, etc.
  – Attributes: Allow metadata to be attached to ESMF classes
    • Can be written to various file formats, e.g. CIM compliant XML
  – Log
  – Config: reads text-based configuration files

43

Working with Fields and Grids

• Often you can work directly with Fields and Grids, allowing ESMF to automatically create the lower-level objects (i.e., Array, DistGrid, DELayout).

• However, building from the bottom up provides the highest level of flexibility, when necessary.

type(ESMF_Grid) :: my2DGrid
type(ESMF_DistGrid) :: myDistGrid

! create a new grid object – DistGrid created automatically
my2DGrid = ESMF_GridCreateNoPeriDim(maxIndex=(/100,200/), …)

! retrieve DistGrid
call ESMF_GridGet(my2DGrid, distgrid=myDistGrid, …)

44

Adding Grid Coordinates

! Curvilinear coordinates
grid2D = ESMF_GridCreateNoPeriDim( &
    countsPerDEDim1=(/3,7/), &
    countsPerDEDim2=(/11,2,7/), &
    coordDep1=(/1,2/), & ! 1st coord is 2D and depends on both Grid dims
    coordDep2=(/1,2/), & ! 2nd coord is 2D and depends on both Grid dims
    indexflag=ESMF_INDEX_GLOBAL, rc=rc)

call ESMF_GridAddCoord(grid2D, &
    staggerloc=ESMF_STAGGERLOC_CENTER, rc=rc)

! Retrieve local pointer to coordinate array
call ESMF_GridGetCoord(grid2D, coordDim=1, localDE=0, &
    staggerloc=ESMF_STAGGERLOC_CENTER, &
    computationalLBound=lbnd, computationalUBound=ubnd, &
    farrayPtr=coordX2D, rc=rc)

! Set X coordinate for every cell in the 2D grid
do j=lbnd(2),ubnd(2)
  do i=lbnd(1),ubnd(1)
    coordX2D(i,j) = i+j
  enddo
enddo

! Then set Y coordinate…