rrtmgp : a high-performance broadband radiation code for the next decade


Post on 24-Feb-2016


Page 1: RRTMGP : A High-Performance Broadband Radiation Code for the Next  Decade


RRTMGP: A High-Performance Broadband Radiation Code for the Next Decade

A 3-year project funded by the Office of Naval Research

Eli Mlawer, AER
David Berthiaume, AER
Robert Pincus, Univ. of Colorado
Brian Eaton, NCAR
Ming Liu, ONR
Mike Iacono, AER

Page 2

Overview of Talk

• Motivation for Project

• Basics of radiation calculations

• Radiation calculations in GCMs

• Overview of AER radiation codes, validation

• RRTMG – Background, accuracy, GCM Implementations

• RRTMG – Computational issues

• Introduction to Other Talks

Page 3

Motivation for project

• Climate/weather prediction models need physics parameterizations
• Trend toward greater complexity in GCMs
  - e.g. greater temporal, spatial, vertical resolution
• Accuracy and computational efficiency are linked
• Radiative processes important
  - means by which planet stays in long-term equilibrium with the universe
  - differential distribution of absorbed solar radiation drives equator-to-pole temperature gradient
  - small radiative imbalance due to increasing GHGs drives modern-era warming
• Radiative processes complex
• Fast parameterizations developed, but still can take ~30% of computational time in GCMs
  - Even with that, radiation package is not called at every time step
    e.g. NAVGEM: SW code called every 2 hrs, LW code every 2 hrs per 6 grid cells

Conclusion: Radiation is a key bottleneck in predictive modeling.

Goal: Develop next generation radiation code for multi-core, vector- and cache-based computational architectures.

Page 4

Basics of Monochromatic Radiative Transfer (1)

[Figure: single layer (P, T) containing gases (e.g. H2O, CO2), with radiances Rin(ν) and Rout(ν) and the labels B(ν,T) E(ν) and T(ν)]

E(ν) is layer emissivity at this frequency
B(ν,T) is the Planck function at ν and T
T(ν) is layer transmissivity at this frequency

1 – E(ν) = T(ν) = exp(-τ(ν))

τ(ν) is the layer optical depth at this frequency (assumes no scattering)
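The single-layer relations on this slide can be sketched in a few lines of Python. This is a minimal illustration, not code from RRTMG or LBLRTM: the function name `layer_outgoing_radiance` is hypothetical, and the combination Rout = Rin·T + B·E is the standard non-scattering form, assumed here because the slide's diagram did not survive extraction.

```python
import math

def layer_outgoing_radiance(r_in, planck, tau):
    """Radiance leaving a non-scattering layer at one frequency.

    r_in   : radiance entering the layer, Rin(nu)
    planck : Planck function B(nu, T) at the layer temperature
    tau    : layer optical depth tau(nu)
    """
    transmissivity = math.exp(-tau)      # T(nu) = exp(-tau(nu))
    emissivity = 1.0 - transmissivity    # E(nu) = 1 - T(nu)
    # Assumed standard form: transmitted part plus emitted part
    return r_in * transmissivity + planck * emissivity

# Limiting cases: a transparent layer passes Rin unchanged,
# an opaque layer emits like a blackbody at the layer temperature.
transparent = layer_outgoing_radiance(2.0, 5.0, 0.0)
opaque = layer_outgoing_radiance(2.0, 5.0, 50.0)
```

The two limiting cases are a quick sanity check on the sign conventions: as τ grows, the outgoing radiance moves from Rin(ν) toward B(ν,T).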

Page 5

Basics of Monochromatic Radiative Transfer (2)

[Figure: zenith-sky optical depths (Univ. of Oxford)]

τ(ν) = W_CO2 * k_CO2(P,T) + W_H2O * k_H2O(P,T) + ...

W is number of molecules in layer; k is absorption coefficient at P and T
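As a rough illustration of the sum above, the layer optical depth at one frequency is a gas-by-gas product of amount and absorption coefficient. The function name, the units (molecules per cm² and cm² per molecule) and the numbers below are all assumptions made for the example; they are not real spectroscopic values.

```python
def layer_optical_depth(column_amounts, absorption_coeffs):
    """tau(nu) = sum over gases of W_gas * k_gas(P, T) at one frequency.

    column_amounts    : dict gas -> W (assumed molecules per cm^2 in the layer)
    absorption_coeffs : dict gas -> k (assumed cm^2 per molecule at this nu, P, T)
    """
    return sum(w * absorption_coeffs[gas] for gas, w in column_amounts.items())

# Made-up numbers, chosen only to give an order-one optical depth
W = {"CO2": 8.0e19, "H2O": 3.0e21}
k = {"CO2": 2.0e-21, "H2O": 1.0e-22}
tau = layer_optical_depth(W, k)   # 0.16 from CO2 + 0.30 from H2O
```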

Page 6

Basics of Monochromatic Radiative Transfer (3)

[Figure panels:
- Downwelling radiances measured at surface (Oklahoma, 7/22/01); labeled spectral regions: CO2 band, H2O lines, H2O continuum, ozone band, H2O band
- Differences between measured and calculated radiances (model circa 1999)
- Differences between measured and calculated radiances (current model)]

Page 7

Basics of Monochromatic Radiative Transfer (4)

• By the 1990s, validation of line-by-line models against spectral radiation measurements had provided confidence in model quality
• U.S. Department of Energy Atmospheric Radiation Measurement (ARM) program was instrumental
  - dedicated to the collection of high-quality observations of geophysical properties at a number of ground sites and to using these observational data sets to improve physical parameterizations in climate models
  - initial emphasis was on understanding the spectrally detailed distribution of radiation at the Earth’s surface and how this distribution depends on the state of the atmosphere
  - comparisons of high-spectral-resolution radiometric measurements with LBLRTM led to significant improvements in the spectroscopy and other physics underlying such models

Conclusion: model is accurate under a wide range of atmospheric conditions

• Key point: we know “the answer” for clear-sky RT
• Development of RRTMG at AER was funded by ARM (1995-2000)

Page 8

RRTMG – Basics of correlated-k method (1)

• Rearrange absorption coefficients (k’s) in ascending order (i.e. a k-distribution)
  - defines a mapping from ν- to g-space
• Break into sub-intervals (‘g-points’), compute average k for each, store in LUTs
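The two steps on this slide (sort the k's, then average over g sub-intervals) can be sketched as follows. This is a toy under stated assumptions: the spectrum is synthetic, the frequencies are taken as equally spaced, and the sub-intervals are equal in width, which the slide does not claim for RRTMG. The function name `k_distribution` is hypothetical.

```python
import numpy as np

def k_distribution(k_nu, n_gpoints):
    """Sort monochromatic k's and average them over g sub-intervals.

    Equal-width sub-intervals and equally spaced frequencies are
    simplifying assumptions of this sketch.
    """
    k_sorted = np.sort(k_nu)                       # ascending order: k(g)
    chunks = np.array_split(k_sorted, n_gpoints)   # sub-intervals in g-space
    k_avg = np.array([c.mean() for c in chunks])   # one average k per g-point
    weights = np.array([c.size for c in chunks]) / k_sorted.size
    return k_avg, weights

rng = np.random.default_rng(0)
k_nu = rng.lognormal(mean=0.0, sigma=2.0, size=10_000)  # synthetic spectrum
k_avg, weights = k_distribution(k_nu, n_gpoints=16)
```

Because sorting only reorders the values, the weighted mean of the g-point averages equals the spectral mean of the original k's, while the rapidly varying spectrum is replaced by a short, monotonic table.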

Page 9

RRTMG – Specifics (2)

Some considerations

• Full spectrum broken up into bands based on absorbing species in region
  - 16 in LW, 14 in SW
• k-distributions for different layers in same column will not generally be spectrally correlated (i.e. ν- to g-space mapping not fixed)
  - Monochromatic RT equations not obeyed, contributes to model’s error budget
• Other spectrally dependent values (e.g. Planck function) included by applying ν- to g-space mapping
• LUTs needed for a wide range of P, T, and ratios of abundances of key species (η)
  Per g-point:
  - troposphere: 13 P’s x 5 T’s x 9 ratios for bands with two ‘key’ species; otherwise 13 P’s x 5 T’s
  - stratosphere: 47 P’s x 5 T’s x 5 ratios for bands with two ‘key’ species; otherwise 47 P’s x 5 T’s
• For each layer, use LUTs to interpolate in ln P, T, and (possibly) η
• Tuning occasionally needed
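The table lookup described above can be illustrated with a bilinear interpolation in ln P and T for a band with one key species (no η dimension). Everything here is a hypothetical stand-in: the grids, the table values and the function name `interp_k` are invented for the sketch, and RRTMG's actual tables and interpolation differ in detail.

```python
import numpy as np

def interp_k(lut, ln_p_grid, t_grid, ln_p, t):
    """Bilinear interpolation of a k table indexed [pressure, temperature]."""
    # Lower bracketing index on each axis, clipped so index + 1 stays in range
    i = int(np.clip(np.searchsorted(ln_p_grid, ln_p) - 1, 0, len(ln_p_grid) - 2))
    j = int(np.clip(np.searchsorted(t_grid, t) - 1, 0, len(t_grid) - 2))
    fp = (ln_p - ln_p_grid[i]) / (ln_p_grid[i + 1] - ln_p_grid[i])
    ft = (t - t_grid[j]) / (t_grid[j + 1] - t_grid[j])
    return ((1 - fp) * (1 - ft) * lut[i, j] + fp * (1 - ft) * lut[i + 1, j]
            + (1 - fp) * ft * lut[i, j + 1] + fp * ft * lut[i + 1, j + 1])

# Hypothetical grids sized like the slide's one-key-species tropospheric
# case: 13 pressures x 5 temperatures
ln_p_grid = np.log(np.linspace(100.0, 1000.0, 13))   # ascending ln P (hPa)
t_grid = np.linspace(200.0, 320.0, 5)                # K
# Toy table that is exactly linear in ln P and T, so the result can be checked
lut = 2.0 * ln_p_grid[:, None] + 0.01 * t_grid[None, :]
k = interp_k(lut, ln_p_grid, t_grid, np.log(500.0), 250.0)
```

For bands with two key species, the same idea extends to a third axis in η, which is where the table sizes on the slide grow by a factor of 9 (troposphere) or 5 (stratosphere).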

Page 10

RRTMG – Interpolating in binary species param. η (3)

Different interpolation methods are used for the outer 1/8’s and the inner 3/4 of η-space

Page 11

RRTMG – Relationship with LBLRTM (4)

[Diagram]
Line-by-line modeling: LBLRTM
Optical depths
Parameterizations: OSS, RRTMG
- OSS application: radiances and Jacobians
- RRTMG application: fluxes and heating rates

Validations through Radiative Closure Studies
- Spectral measurements
  Satellite – e.g. AIRS, TES, IASI
  Aircraft – e.g. HIS
  Ground – e.g. AERI, TCCON
- Broadband measurements
  Satellite – e.g. CERES
  Ground – e.g. ARM

Page 12

RRTMG – Relationship with LBLRTM (5)

Absorption line parameters (HITRAN)

Continuum absorption models

Cross-sections

Page 13

RRTMG – Spectral Bands (6)

Selected Spectral Bands in RRTMG (troposphere)

Range (cm-1)   Key Species   Minor Species
350-500        H2O           -
500-630        H2O, CO2      N2O
630-700        H2O, CO2      -
700-820        H2O, CO2      O3, CCl4
820-980        H2O           CO2
980-1080       H2O, O3       CO2
1080-1180      H2O           CO2, O3, N2O, CFC-12, CFC-22
1180-1390      H2O, CH4      N2O

Page 14

RRTMG - Accuracy (7)

‘Effective’ Accuracy

Equivalent to LBLRTM

Page 15

RRTMG - Accuracy (8)

CIRC RT Intercomparison

[Figure: percentage errors in calculated flux for each participating model]

RRTM – Code #1

RRTMG – Code #2

Page 16

RRTMG – Cloud Approach (9)

• The Monte-Carlo Independent Column Approximation provides an efficient and unbiased method to handle cloud inhomogeneity and vertical correspondence.

• Each g-point is matched with a single element from the pdf of cloud amounts and vertical overlap
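The pairing of g-points with cloud states can be illustrated with a toy comparison against the full independent column approximation (ICA). This sketch assumes a single cloud layer with a given cloud fraction, so the "pdf of cloud amounts and vertical overlap" collapses to one cloudy/clear draw per g-point; `toy_flux` is a made-up stand-in for an RT solver, and none of these names come from RRTMG.

```python
import numpy as np

def toy_flux(g, cloudy):
    """Made-up per-g-point flux; cloud reduces it. Not a real RT solver."""
    clear = 10.0 + g
    return 0.4 * clear if cloudy else clear

def ica_flux(weights, cloud_fraction):
    """Full ICA: every g-point sees both the cloudy and the clear state."""
    return sum(w * (cloud_fraction * toy_flux(g, True)
                    + (1 - cloud_fraction) * toy_flux(g, False))
               for g, w in enumerate(weights))

def mcica_flux(weights, cloud_fraction, rng):
    """McICA: each g-point is paired with one randomly drawn cloud state."""
    cloudy = rng.random(len(weights)) < cloud_fraction
    return sum(w * toy_flux(g, bool(c))
               for g, (w, c) in enumerate(zip(weights, cloudy)))

rng = np.random.default_rng(1)
weights = np.full(16, 1 / 16)          # assumed equal g-point weights
reference = ica_flux(weights, cloud_fraction=0.3)
samples = [mcica_flux(weights, 0.3, rng) for _ in range(20_000)]
mean_mcica = float(np.mean(samples))   # converges to the ICA reference
```

Each McICA call costs one RT calculation per g-point instead of one per g-point per cloud state; the price is random noise in any single call, which averages out because the estimate is unbiased.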

Page 17

RRTMG – Selected Implementations (10)

Organization                           Models
ECMWF                                  IFS, ERA40
Max Planck Institute                   ECHAM
NCEP                                   CFS, GFS, CFS Reanalysis, RUC
NCAR                                   CAM, CESM, WRF-ARW
NASA/GSFC                              GEOS
Laboratory for Dynamical Meteorology   LMDZ
China Met. Administration              GRAPES
Meteo-France                           Meso-NH
FNMOC (Navy)                           NAVGEM

Page 18

RRTMG – Moving Toward Parallelization (11)

• RT in a GCM: repetitive, independent calculations. For each time step:
  - number of ‘columns’ = grid cells (~10^4) x g-points (~200)
  - number of optical depths = ‘columns’ x layers (~70)
• RRTMG: developed in 1990s for CPUs
  - has many conditional branches aimed at minimizing the number of floating point operations
  - is not vectorized
  - is generous in its use of memory copying

Conclusion: it’s time to bring RRTMG into the modern computational era
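To make the problem size concrete, the counts on this slide can be multiplied out, reading the grid-cell count as roughly 10^4. The second half is a small NumPy sketch of what "vectorized across columns" means in practice; the optical depths are synthetic and the batch is kept small to limit memory.

```python
import numpy as np

n_cells, n_gpoints, n_layers = 10**4, 200, 70   # approximate sizes from the slide
n_columns = n_cells * n_gpoints                 # 2,000,000 'columns' per time step
n_optical_depths = n_columns * n_layers         # 140,000,000 optical depths

# Vectorizing across columns: array operations act on all columns at once
# instead of a branch-heavy loop over individual columns.
rng = np.random.default_rng(0)
tau = rng.uniform(0.0, 2.0, size=(1000, n_layers))    # synthetic optical depths
transmissivity = np.exp(-tau)                         # every column and layer
column_transmissivity = transmissivity.prod(axis=1)   # through all layers
```

Because every column is independent, the same array expression maps naturally onto vector units, multiple cores, or a GPU.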

Page 19

RRTMGP – Our Team (1)

• AER
  Eli Mlawer: PI, lead developer of RRTMG, PI for NASA RRTMGPU project
  David Berthiaume: developer of RRTMGPU, advanced programming skills
  Mike Iacono: great experience wrt RRTMG implementation in atmospheric models
• University of Colorado
  Robert Pincus: developer of PSrad and McICA, worked with many modeling centers to improve treatment of radiation in atmospheric models
• National Center for Atmospheric Research (NCAR)
  Brian Eaton: significant experience wrt infrastructure and interfaces for physics packages, dynamical cores, and physics-dynamics coupling in CAM/CESM
• Naval Research Laboratory
  Ming Liu: substantial experience with physics packages and code structure in NAVGEM

Page 20

RRTMGP – Project Schedule/Milestones (2)

Year 1
• Redesign RRTMG to be more flexible and easier to optimize
  - Refactor RRTMG to be more modular, using the structure of PSrad as a guide
  - Focus on optimality across vector- and cache-based architectures, including the use of high-performance libraries where possible. When possible, GPU and MIC drop-in versions of routines will be created.
  - Vectorize all routines across columns, with no preference as to vertical direction
  - Design code to allow calculation of full broadband irradiance or selected sets of subintervals
  - Redesign to be done in consultation with collaborators at NRL and NCAR
• Begin recoding of RRTMGP
• Develop large test suite of clear and cloudy cases

Year 2
• Complete recoding of RRTMGP
  - Revise gas optics scheme: modified interpolation scheme that is more uniform across bands
• Thorough testing and validation on a variety of platforms

Year 3
• Work with NRL and CESM collaborators to test performance of RRTMGP across the full range of platforms on which their respective GCMs are run

Page 21

RRTMGP – Elements of the Development (3)

Rebuilding RRTMG for the computers of the next decade
• Redesign, refactoring, parallelization
• Revised interfaces
• Uniform table interpolations for optical depths
• Allow spectral sampling
• Compiler directives for OpenACC and OpenMP
• Fast math routines
• CMake build system

Validation and benchmarking will be key.

Implementation and performance testing in CESM and NAVGEM.

Need to coordinate with NASA project to improve physics (regenerate k’s with updated spectroscopy, fix known issues, add LW scattering)

Page 22

Extra Slides

Page 23

COUND Project: Acceleration of RRTMG with GPU Technology

• Calculation of radiative fluxes, heating rates in a GCM can take ~30% of time
• Individual radiation calculations in a GCM are independent – ideal for GPU
• Steps in redevelopment of RRTMG to run on GPU – first LW code, then SW
  1. Create timing/accuracy benchmarking code
  2. Restructure so that each component is parallelized
  3. Pass global data to GPU; implement kernel functions
  4. Port each component/subroutine to GPU
     - Main RT code running on GPU; subroutine ‘cldprmc’ completed
  5. Optimize memory throughput; evaluate – repeat as needed
  6. Deliver to collaborators at GMAO (Suarez/Oreopoulos)
     - Work with GSFC to evaluate performance of GEOS-5 with GPU/RRTMG
• With GPU/RRTMG implemented, GEOS-5 can exploit the time savings by introducing other new physics packages

GCM Schematic

[Diagram: Model Physics (1), Radiation Code, Model Physics (2), next time step. The radiation code takes the atmospheric state and returns radiative fluxes and heating rates through N independent calculations.]

N is a large number
= nlat x nlong (~10^4)
or
= nlat x nlong x ‘spectral’ dimension of RT code (~10^6)

Timing statistics for porting of cldprmc subroutine to GPU

# profiles   CPU time (s)   GPU time (s)   speed-up
1000         0.315          0.006          52.2x
3000         1.061          0.012          88.4x
5000         2.042          0.016          127x
10000        4.696          0.033          142x

Overhead not included; speed-up shown here may not be representative of final speed-up of RRTMG

Page 24

Improvement in Accuracy of LBLRTM

Two AERI Measurements

Page 25

LBLRTM: Improvements to CO2 Spectroscopy

Previous version (2006)
· Q branch line coupling
· HITRAN 2000 CO2 parameters

Latest version (2011)
• P, Q and R line coupling
  • Niro et al. [2005]
• Widths, line coupling coeffs
  • Lamouroux et al. [2010]
• Tashkun positions, intensities
  • Flaud et al. [2003]
• Updated CO2 and H2O continua
  • Mlawer et al. [2012]

[Figure: mean residuals from 36 AIRS ARM TWP nighttime cases using Tobin et al. best estimate sonde profiles, shown for the previous and latest versions; labeled regions: CO2 v2, CO2 v3, Mid-Upper Trop]

Improved agreement (Obs - Calc) and consistency across spectral bands!

(Input profiles supplied by L. Strow and S. Hannon).

Page 26

An example related to flexibility

· A solution for cloud-scale models

– Monte Carlo Spectral Integration (Pincus and Stevens 2008) approximates G ~ 100 calculations every N time steps with G’ ~ 1 calculations every time step

(slide from Robert Pincus)
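The trade described on this slide can be illustrated with a toy estimator: rather than summing all G spectral terms, draw one g-point per time step and scale it so that the estimate is unbiased. The per-g-point fluxes are synthetic, the atmosphere is held fixed, and uniform sampling is an assumption of this sketch rather than a statement about the published scheme.

```python
import numpy as np

rng = np.random.default_rng(2)
G = 100
flux_per_g = rng.uniform(0.5, 1.5, size=G)   # synthetic per-g-point fluxes
broadband = flux_per_g.sum()                 # full calculation: G terms

def sampled_broadband(rng):
    """One g-point per time step, scaled by G so the estimate is unbiased."""
    g = rng.integers(G)            # uniform random choice of g-point
    return G * flux_per_g[g]

n_steps = 50_000
estimates = np.array([sampled_broadband(rng) for _ in range(n_steps)])
time_mean = estimates.mean()       # approaches the full broadband sum
```

Any single step is noisy, but the noise is uncorrelated from step to step, so the time-averaged flux converges to the broadband value at a small fraction of the cost per step.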