Craig Davies
Introduction to Dataflow Computing
HPC Finance Conference, Tampere – May, 2013
Overview
• Maxeler Overview
• Dataflow Technology
• HPC Finance: Risk Management
• MAX-UP: Maxeler University Program
• Example Application: Credit Derivatives Valuation & Risk
• Maxeler delivers dataflow solutions for analytics (including risk), low-latency trading infrastructure, and HPC for scientific computing
• We build the HPC compute fabric around the application, in a multi-disciplinary, data-centric approach
What we do
Hardware
We build 1U boxes, workstations and the cards inside.
We build custom large-memory systems to deal with Big Data.
We integrate rack systems with networking and storage.
Software
Full stack of software for runtime, compile time, and monitoring.
MaxJ (Java-based programming) and MaxIDE development tools.
Consulting
HPC System Performance Architecture
Algorithms and Numerical Optimization
Integration into business and technical processes
Dataflow Technology
Programmable Spectrum
[Figure: spectrum running from control-flow processors to the dataflow processor. Categories: Single-Core CPU, Multi-Core, Several-Cores, Many-Cores, FPGA. Examples named: Intel, AMD; Tilera, XMOS etc.; GPU (NVIDIA, AMD), GK110; Hybrid e.g. AMD Fusion, IBM Cell; Maxeler. Axes: Increasing Parallelism (#cores), Increasing Core Complexity.]
Where is the silicon used?
Intel 6-Core X5680 “Westmere”
Where is the silicon used?
Intel 6-Core X5680 “Westmere”
Dataflow Processor
Computation
MaxelerOS
Computation (Dataflow cores)
DataFlow Engine (DFE) vs FPGA
[Figure: side-by-side comparison of CPU, DFE and GPU]
Technology: ASIC (CPU), FPGA (DFE), ASIC (GPU)
Compute Architecture: Cores (CPU), Cores (GPU)
DRAM Ctrl and PCIe on all three; CPU Interconnect (CPU); DFE Interconnect (DFE)
Compiler: C Compiler (CPU), MaxCompiler (DFE), OpenCL (GPU)
Run-time system: Operating System (CPU), MaxelerOS (DFE), Driver (GPU)
Traditional (CPU) Computing
Only a small proportion of the chip is actually used for computation; time is wasted talking to levels of cache.
Multiscale Dataflow Computing
Dataflow computes 30-200x faster, with a 10-50x smaller physical footprint and 10-50x better power efficiency.
Solution scalability – how it works…
Maxeler Dataflow Engines (DFEs) compute in space, not in time. In dataflow computing the trick is in maximising data movement and bandwidth.
CPUs compute in time
DFEs compute in space
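The time-versus-space distinction can be sketched in plain Python. This is only an illustration, not Maxeler code: the function names are hypothetical, and the 3-point moving average is an assumed example workload. The first function reuses one loop body for every output (computing in time); the second chains fixed stages that data streams through (computing in space).

```python
from collections import deque


def moving_average_in_time(xs):
    """Control-flow style: one loop body is reused for every output."""
    out = []
    for i in range(2, len(xs)):
        out.append((xs[i - 2] + xs[i - 1] + xs[i]) / 3.0)
    return out


def window3(stream):
    """Stage 1: keep the last three samples in a small local buffer."""
    buf = deque(maxlen=3)
    for x in stream:
        buf.append(x)
        if len(buf) == 3:
            yield tuple(buf)


def add3(stream):
    """Stage 2: a dedicated adder applied to each window."""
    for a, b, c in stream:
        yield a + b + c


def divide(stream, divisor):
    """Stage 3: a dedicated divider."""
    for s in stream:
        yield s / divisor


def moving_average_in_space(xs):
    """Dataflow style: data streams through a fixed chain of operators."""
    return list(divide(add3(window3(iter(xs))), 3.0))
```

Python generators still execute one step at a time, so this only models the structure. In a hardware pipeline each stage would be a separate piece of logic, and all stages would work on different data items in the same clock cycle.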
Explaining Control Flow versus Data Flow
• Many specialized workers are more efficient (data flow)
• Experts are expensive and slow (control flow)
Analogy 1: The Ford Production Line
Maxeler Hardware Solutions
CPUs plus DFEs: Intel Xeon CPU cores and up to 6 DFEs with 288GB of RAM
DFEs shared over Infiniband: Up to 8 DFEs with 384GB of RAM and dynamic allocation of DFEs to CPU servers
Low latency connectivity: Intel Xeon CPUs and 1-2 DFEs with up to six 10Gbit Ethernet connections
MaxWorkstation: Desktop development system
MaxCloud: On-demand scalable accelerated compute resource, hosted in London
MPC-X1000
• 8 dataflow engines (192-384GB RAM)
• High-speed MaxRing
• Zero-copy RDMA between CPUs and DFEs over Infiniband
• Dynamic CPU/DFE balancing
Accelerator Programming Models
[Figure: several DSLs positioned by level of abstraction (vertical axis) against possible applications (horizontal axis)]
Flexible Compiler System: MaxCompiler
Higher Level Libraries
Risk Library
Programming with MaxCompiler
[Figure: MaxCompiler build flow]
Host application (user input): *.c, *.f90 ... -> Compiler -> Linker -> Executable
Kernel and manager (user input, MaxIDE): MyKernel.maxj, MyManager.maxj -> MaxCompiler -> MAX File (.max), Sim or H/W
Libraries: MaxelerOS Library, SLiC Library
Rewrite only code to be accelerated
HPC Finance: Risk Management
Problem statement
Problem
Provide consistent, real-time valuation and risk management across all major asset classes, enabling measurement and control of risk as well as optimal capital use.
Constraints
• Time to deliver completed client solution.
• Integration with pre-existing client technology stack.
• Increasing complexity of regulatory requirements.
• Scalability of solution in time, space and performance.
• Reliability and support.
• Deliver previously infeasible results.
• Total cost of solution.
Solution architecture
Consistent, real-time valuation and risk management calculations across all major asset classes
Maxeler’s dataflow accelerated finance library provides ultra high speed computation of PV and risk
Client provides trade, market and static data in own format
Finance appliance covers 10 asset classes
Risk summarizations in hardware avoid use of complex databases
Basic finance
• Dates
• Cashflows
• Products
• European, Bermudan & American options
• Utilities
• Maths and Stats
Risk management
• Scenario engine generator
• Monte Carlo VaR
• VaR & CVaR
• Expected Shortfall
• EVT
• CVA/DVA
• Basel II/III and MiFID II
Risk management functionality
Risk management reporting
• First and second order risks: deltas, gammas and cross-partial risks
• Consistent risk sensitivities over all asset classes
• Client driven, flexible, risk summarizations
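As a rough illustration of how first and second order risks can be obtained by bumping market inputs, the sketch below uses central finite differences. It is an assumption about the general technique, not a description of the Maxeler appliance; the function names and the generic `price_fn` callable are hypothetical.

```python
def bump_delta_gamma(price_fn, x, bump):
    """Central-difference delta and gamma of price_fn with respect to x."""
    up = price_fn(x + bump)
    mid = price_fn(x)
    down = price_fn(x - bump)
    delta = (up - down) / (2.0 * bump)
    gamma = (up - 2.0 * mid + down) / (bump * bump)
    return delta, gamma


def bump_cross_gamma(price_fn, x, y, bump_x, bump_y):
    """Cross-partial risk d2P/dxdy from four bumped revaluations."""
    pp = price_fn(x + bump_x, y + bump_y)
    pm = price_fn(x + bump_x, y - bump_y)
    mp = price_fn(x - bump_x, y + bump_y)
    mm = price_fn(x - bump_x, y - bump_y)
    return (pp - pm - mp + mm) / (4.0 * bump_x * bump_y)
```

Each sensitivity needs several independent revaluations of the same portfolio, which is why bump scenarios parallelise well.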
Spot Market
Random Bumps
Historical
Historical Markets
Bump Scenarios
Monte Carlo
Market Scenarios
Bootstrap
Market Curves
Pricing Engine
Price Scenarios
Risk Analysis
Results Aggregation
Market Instruments
Cashflow Generator
Trade Portfolio
Cashflow Generator
Maxeler Finance Appliance
functionality and information flow
Risk management architecture
Client input data
Client generated risk management data is passed to client risk database for analysis
• Maxeler’s finance appliance accepts client input data and generates required risk management data at any and all requested levels of aggregation.
• Scenario analysis can be permutative, combinatorial, ad hoc or Monte Carlo.
• The finance appliance is fully scalable to client requirements.
Maxeler’s finance library provides accelerated analytics for valuation and risk management across all major asset classes
Random Number
MersenneTwister
LinearCongruence
UniformContinuous
UniformDiscrete
UnivariateGaussian
MultivariateGaussian
UnivariateLognormal
MultivariateLognormal
Poisson
Exponential
Gamma
ChiSquare
NonCentralChiSquare
UniformPerturbation
LatinHypercube
InverseCummulativeMethod
AcceptanceRejectionMethod
GaussianWallace
Stochastic Process
UnivariateGaussianProcess
MultivariateGaussianProcess
UnivariateLognormalProcess
MultivariateLognormalProcess
SquareRootProcess
HestonProcess
PoissonProcess
CompoundPoissonProcess
AffineJumpProcess
CEVProcess
GenericEulerStepper
GenericPredictorCorrectorStepper
GenericTerminalDistributionStepper
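To make the random number and stochastic process entries concrete, here is a minimal sketch of simulating a univariate lognormal process with exact log-space steps. It is not the Maxeler library's interface: the function name and parameters are hypothetical. Python's standard `random` module is used because its generator is a Mersenne Twister.

```python
import math
import random


def simulate_lognormal_path(s0, drift, vol, horizon, steps, rng):
    """Simulate one path of dS = drift*S dt + vol*S dW on an even time grid."""
    dt = horizon / steps
    path = [s0]
    for _ in range(steps):
        z = rng.gauss(0.0, 1.0)  # standard normal draw
        growth = (drift - 0.5 * vol * vol) * dt + vol * math.sqrt(dt) * z
        path.append(path[-1] * math.exp(growth))
    return path


# Usage: twelve monthly steps over one year with a fixed seed.
example_path = simulate_lognormal_path(100.0, 0.05, 0.2, 1.0, 12, random.Random(42))
```

Monte Carlo scenario generation repeats this for many independent paths, each of which can be computed in parallel.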
Finance library coverage - details
Pricing and Risk
CDSPricer
CDSRisk
CDSIndexPricer
CDSIndexRisk
SwapPricer
SwapRisk
OISPricer
OISRisk
BondPricer
BondRisk
EurodollarOptionPricer
EuroDollarOptionRisk
Pricing and Risk continued…
TNoteOptionPricer
TNoteOptionRisk
FXOption
FXOptionRisk
FXBarrierOption
FXBarrierOptionRisk
FuturesConvexityPricer
EuropeanFuturesOptionPricer
EuropeanFuturesOptionRisk
AmericanFuturesOptionPricer
AmericanFuturesOptionRisk
AsianFuturesOptionPricer
AsianFuturesOptionRisk
Distribution
NormalCummulative
NormalDensity
InverseNormalCummulative
PoissonDensity
PoissonCummulative
GaussianCopulaCummulative
NonCentralChiSquareCummulative
Distribution continued…
ExponentialDensity
ExponentialCummulative
GammaDensity
GammaCummulative
Financial Support Functions
ExponentialInterpolation
PWLinearSimpleSpotInterpolation
PWLinearForwardInterpolation
PWConstantForwardInterpolation
ConvexMonotoneInterpolation
CDSHazardCurveBootstrap
CDSHazardToUpfront
OISRateCurveBootstrap
Swap1CurveBootstrap
Swap2CurveBootstrap
GenericObjectiveBootstrap
Discount
RiskyDiscount
BusinessDayLogic
DateLogic
HolidayLogic
Models
BlackScholes
AmericanOptionBAW
HullWhiteTree
StaticCashflow
MultiStateMonteCarlo
1DExplicitFiniteDifference
2DExplicitFiniteDifference
3DExplicitFiniteDifference
1DImplicitFiniteDifference
2DImplicitFiniteDifference
3DImplicitFiniteDifference
FourierTimeStepping
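As an example of the simplest model in the list above, the following sketch prices a European option with the standard Black-Scholes formula, building the normal cumulative distribution from `erf`. The function names and argument order are hypothetical and do not reflect the Maxeler library's interface; dividends are assumed to be zero.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes(spot, strike, rate, vol, expiry, is_call=True):
    """Black-Scholes price of a European option on a non-dividend asset."""
    sqrt_t = math.sqrt(expiry)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol * vol) * expiry) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    discount = math.exp(-rate * expiry)
    if is_call:
        return spot * norm_cdf(d1) - strike * discount * norm_cdf(d2)
    return strike * discount * norm_cdf(-d2) - spot * norm_cdf(-d1)
```

For spot 100, strike 100, rate 5%, volatility 20% and one year to expiry, this gives a call price of about 10.45 and a put price of about 5.57.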
Math Library
CholeskyDecomposition
SchurDecomposition
GaussianElimination
SOR
AMG
ForwardBackwardSubstitution
LUDecomposition
QRDecomposition
SVDecomposition
SteepestDecent
ThomasAlgorithm
FactorAnalysis
CubicSplineInterpolation
ConjugateGradient
BisectionSolver
SecantSolver
BrentSolver
NewtonRaphsonSolver
1DFFT, 2DFFT & 3DFFT
JacobiSolver
Math Library continued…
GaussSeidelSolver
GeneralFunctionApproximation
InverseMatrix
MatrixMultiply
NumericalQuadrature
1DConvolution
2DConvolution
3DConvolution
WaveletTransforms
LeastSquareRegression
HermitePolynomial
LaguerrePolynomial
LegendrePolynomial
BinarySearch
BackwardsEulerODESolver
LeapfrogODESolver
RungeKuttaODESOlver
ImplicitRungeKuttaNystromODESolver
ExplicitRungeKuttaNystromODESolver
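One entry from the math library list, the Thomas algorithm, is short enough to sketch in full. It solves a tridiagonal linear system in linear time, as arises in implicit finite-difference schemes. The function name and the array convention below are assumptions for illustration: `lower[i]`, `diag[i]` and `upper[i]` are the three entries of row i, with `lower[0]` and `upper[-1]` unused. No pivoting is done, so the matrix is assumed to be diagonally dominant or otherwise well behaved.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system by forward elimination and back substitution."""
    n = len(diag)
    c_prime = [0.0] * n  # modified super-diagonal
    d_prime = [0.0] * n  # modified right-hand side
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / denom
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x
```

The forward sweep carries a dependency from each row to the next, so the parallelism in practice comes from solving many independent systems at once rather than from within one solve.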
Function Library
abs
ceil
cos
sin
tan
cosh
sinh
tanh
arcsin
arccos
arctan
erf
erfc
divMod
exp
floor
log
log2
Function Library
max
min
modulo
pow2
scalb
sin
sqrt
Risk Analysis
BumpMarket
PerturbMarket
LogDistHistorical
ExpWgtLogDistHistorical
VaRFromSample
ExpectedShortfallFromSample
ComponentVaR
SystemicVaR
VaRBucketAggregation
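The sample-based risk measures in the list above can be illustrated with a small sketch. The function names echo the list entries but are hypothetical Python, not the library's interface. The conventions are also assumptions: P&L is positive for gains, VaR is reported as a positive loss, and the quantile is taken as an order statistic without interpolation.

```python
import math


def _tail_start(n, confidence):
    """Index of the VaR observation among n losses sorted ascending."""
    k = math.ceil(confidence * n) - 1
    return min(max(k, 0), n - 1)


def var_from_sample(pnl, confidence=0.99):
    """Value at Risk: the loss at the given confidence level."""
    losses = sorted(-p for p in pnl)
    return losses[_tail_start(len(losses), confidence)]


def expected_shortfall_from_sample(pnl, confidence=0.99):
    """Expected Shortfall: the mean loss at or beyond the VaR observation."""
    losses = sorted(-p for p in pnl)
    tail = losses[_tail_start(len(losses), confidence):]
    return sum(tail) / len(tail)
```

The input sample can come from historical or Monte Carlo scenarios; the calculation is the same in both cases.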
MAX-UP: Maxeler University Program
Maxeler University Program Members
There are over 89 affiliated academic members across disciplines; here is a sample.
• Access to the fastest HPC computing technology
• Academic pricing on Maxeler dataflow computing hardware
• Free access to Maxeler software
• Free access to Maxeler educational material
• Guest lectures
• Internships for undergraduates and postgraduates
• Support for research proposals or joint project proposals
• An online forum for exchanging experience, expertise and ideas
MAX-UP Program Benefits
http://www.maxeler.com/solutions/universities/
Maxeler University Program
Example Accelerated Applications
Credit Derivatives Valuation & Risk
• Compute value of complex financial derivatives (CDOs)
• Typically run overnight, but beneficial to compute in real-time
• Many independent jobs
• Speedup: 220-270x
• Power consumption drops from 250W to 235W per node
O. Mencer and S. Weston, 2010
Application Analysis
DFE Convolution Architecture
• Calculation of current value and credit spread risk for population of 2,925 bespoke tranches.
• Speedup from 1 MAX2:
– 219 – 270x compared to 1 core
– ~30x compared to 8-core node
• Power consumption drops from 250W/node to 235W/node with acceleration
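The speedup and power figures above can be combined into a rough energy-per-job estimate. This is back-of-envelope arithmetic and not a result from the presentation. It assumes the quoted node power is the average draw for the whole run and takes the approximate 30x node-level speedup at face value; the function name is hypothetical.

```python
def energy_per_job_ratio(speedup, power_before_w, power_after_w):
    """How many times less energy one job uses after acceleration.

    Energy is power multiplied by time, and time shrinks by the speedup.
    """
    return speedup * power_before_w / power_after_w


# About 30x faster than the 8-core node, at 235W instead of 250W.
ratio = energy_per_job_ratio(30.0, 250.0, 235.0)
```

Under these assumptions the accelerated node uses roughly 32 times less energy per job. Almost all of the saving comes from the shorter run time, not from the small drop in power.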
Credit Derivatives Results
• Dataflow engines provide massive parallelism at low clock frequencies
• Many applications are amenable to dataflow processing, and can achieve high acceleration
Summary & Conclusions