Stochastic optimization in high dimension

Stéphane Vialle

Xavier Warin

Sophia, 21 October 2008

EDF R&D

EDF and its customers

The assets

58 nuclear reactors on 19 sites (86.6%)

14 thermal power plants (4.6%)

440 hydro plants and 220 dams (8.8%)

Solar energy, wind power (< 0.5%)

Customers

Different kinds of customers

Many types of contracts (for example, swing contracts, where the producer can suspend delivery)



Asset management at EDF

Manage water stocks, fuel and customer contracts.

Goals:

maximize the expected cash flow

minimize risk

Under constraints

Satisfy the customer load

Respect pollution constraints



Asset management at EDF

Hazards:

Demand

Hydraulicity (inflows)

Weather patterns (cold means high demand)

Market prices

Asset outages

Stochastic control problem in high dimension:

The number of state variables is linked to:

The number of hazards

The number of stocks to be dealt with



Associated numerical methods

Decomposition methods (Lemaréchal)

Very efficient in computing time

Duality gap (non-convexity)

Dynamic programming (Bellman 1957)

Very general (non convex, binary …)

Faces the curse of dimensionality; global risk constraints are difficult to implement

Stochastic Dual Dynamic programming (Pereira)

Approximate convex Bellman values for stocks (needs convexity)

Benders cuts lead to a linear programming problem

Global risk constraints are difficult to implement
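The idea of approximating a convex Bellman value by cuts can be illustrated on a toy function. This is only a sketch: in Stochastic Dual Dynamic Programming the cuts come from the dual solutions of linear programs, whereas here they are tangents of a known one-dimensional function. The function `f`, its derivative `df` and the helpers `make_cut` and `cut_approximation` are hypothetical names chosen for the illustration.

```python
import numpy as np

def make_cut(f, df, x0):
    """Tangent (supporting) line of a convex function at x0: (slope, intercept)."""
    slope = df(x0)
    return slope, f(x0) - slope * x0

def cut_approximation(cuts, x):
    """Piecewise-linear lower bound: pointwise maximum over all cuts."""
    x = np.asarray(x, dtype=float)
    return np.max([a * x + b for a, b in cuts], axis=0)

# Toy convex "Bellman value" of a single stock level x (an assumption)
f = lambda x: (x - 2.0) ** 2
df = lambda x: 2.0 * (x - 2.0)

# One cut per visited stock level
cuts = [make_cut(f, df, x0) for x0 in (0.0, 1.0, 2.0, 3.0, 4.0)]
grid = np.linspace(0.0, 4.0, 41)
approx = cut_approximation(cuts, grid)
```

The maximum of the cuts never exceeds the convex function and matches it at the points where cuts were built, which is why convexity is required for this kind of approximation.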



EDF process

Optimize the cost function J on an approximate model and keep all the optimal commands at each step (no asset constraints such as ramp constraints, minimum time before restarting, etc.)

Use a Monte Carlo simulator with all the assets and constraints to calculate accurate average earnings and risk measures

Goal

Incorporate more stocks in the optimizer to be more accurate

A way to do it :

Use parallelism for the stochastic programming optimization

Assess the influence of the optimization parallelization on the simulation
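As an illustration of the simulation step, average earnings and tail statistics can be read directly from simulated scenarios. The sketch below is an assumption-laden example: the slides do not say which risk measure is used, so the quantile and tail mean, the level `alpha` and the function name `earnings_statistics` are all hypothetical.

```python
import numpy as np

def earnings_statistics(earnings, alpha=0.05):
    """Mean earnings and two tail statistics at level alpha from simulated scenarios."""
    earnings = np.sort(np.asarray(earnings, dtype=float))
    # Number of scenarios in the worst alpha fraction (at least one)
    n_tail = max(1, int(np.ceil(alpha * earnings.size)))
    return {
        "mean": earnings.mean(),
        "quantile": earnings[n_tail - 1],       # best outcome within the worst alpha fraction
        "tail_mean": earnings[:n_tail].mean(),  # average over the worst alpha fraction
    }

# Hypothetical simulated earnings, one value per Monte Carlo scenario
rng = np.random.default_rng(0)
stats = earnings_statistics(rng.normal(100.0, 15.0, size=10_000))
```

Because the statistics are computed on the simulated scenarios themselves, any risk measure can be swapped in without touching the optimizer.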



Dynamic programming implementation

Use Monte Carlo simulations for the hazards (flexible, easy to use for risk)

Backward algorithm (Longstaff-Schwartz version)

At t = 0 interpolate J for current stock c and current uncertainty s

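$$
\begin{aligned}
&\text{for } t = (M-1)\Delta t \text{ to } 0\\
&\quad \text{for } c \in \text{admissible stock levels}\\
&\quad\quad \text{for } s \in \text{possible hazard levels}\\
&\quad\quad\quad \tilde{J}^*(s, c) = -\infty\\
&\quad\quad\quad \text{for } nc \in \text{possible commands for stocks}\\
&\quad\quad\quad\quad \tilde{J}^*(s, c) = \max\big(\tilde{J}^*(s, c),\ \phi(nc) + \mathbb{E}(J^*(s^*, c + nc) \mid s)\big);\\
&\quad J^* = \tilde{J}^*;
\end{aligned}
$$

Here $\phi(nc)$ is the gain of command $nc$ and $s^*$ is the hazard level at $t + \Delta t$.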
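The backward recursion can be sketched on a toy problem. Everything specific here is an assumption made for illustration: a single stock with integer levels, the price as the only hazard, two commands (keep the stock or sell one unit at the current price), and conditional expectations estimated by least-squares regression on a polynomial basis. The function names are hypothetical, and the sketch regresses the value function directly, which is a simplification of the Longstaff-Schwartz approach.

```python
import numpy as np

def conditional_expectation(s, y, degree=2):
    """Least-squares estimate of E[y | s] on a polynomial basis (regression step)."""
    basis = np.vander(s, degree + 1)            # columns s^degree, ..., s, 1
    coef, *_ = np.linalg.lstsq(basis, y, rcond=None)
    return basis @ coef

def backward_dp(prices, n_levels, degree=2):
    """prices: (n_paths, n_dates) simulated hazard; stock levels c = 0..n_levels-1."""
    n_paths, n_dates = prices.shape
    J_star = np.zeros((n_levels, n_paths))      # J* at the horizon
    for t in range(n_dates - 2, -1, -1):        # sequential in time
        s = prices[:, t]
        # E[J*(s*, c) | s] for every stock level c
        cont = np.array([conditional_expectation(s, J_star[c], degree)
                         for c in range(n_levels)])
        J_tilde = np.full((n_levels, n_paths), -np.inf)
        for c in range(n_levels):               # admissible stock levels
            for nc in (0, -1):                  # possible commands
                if c + nc < 0:
                    continue                    # cannot sell from an empty stock
                gain = -nc * s                  # phi: sell |nc| units at price s
                J_tilde[c] = np.maximum(J_tilde[c], gain + cont[c + nc])
        J_star = J_tilde
    return J_star

# Hypothetical hazard: random-walk prices, all starting from 50
rng = np.random.default_rng(0)
steps = rng.normal(0.0, 1.0, size=(1000, 10))
prices = 50.0 + np.concatenate([np.zeros((1000, 1)), np.cumsum(steps, axis=1)], axis=1)
J0 = backward_dp(prices, n_levels=4)
print(J0[:, 0])  # estimate of J* at t = 0 for stock levels c = 0..3
```

The loop over `c` is the one the rest of the talk parallelizes: each stock level only needs the next-date values at the levels it can reach.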



Algorithm issues

Sequential in time

Rather sequential for the nc nest

Parallel for the c nest if all $J^*$ values are available in memory for all (c, s)

$N_i$: number of discretization points in direction $i$

$\prod_i N_i$ is the number of c to explore

IDEA: parallelize the c nest by splitting the hypercube

Use the communication scheme for both optimization and simulation (commands are spread with the stock levels across processors)
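Splitting the hypercube of stock levels can be sketched as follows. This is a minimal illustration that assumes a regular processor grid and near-equal contiguous ranges per direction; the actual splitting rule of the tool is not given in the slides, and `split_axis` and `split_hypercube` are hypothetical names.

```python
from itertools import product
from math import prod

def split_axis(n_points, n_parts):
    """Cut range(n_points) into n_parts contiguous, near-equal half-open intervals."""
    base, extra = divmod(n_points, n_parts)
    bounds, start = [], 0
    for k in range(n_parts):
        stop = start + base + (1 if k < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds

def split_hypercube(shape, proc_grid):
    """Map each processor rank to its sub-cube: one (start, stop) range per direction."""
    per_axis = [split_axis(n, p) for n, p in zip(shape, proc_grid)]
    return dict(enumerate(product(*per_axis)))

def volume(box):
    """Number of stock-level points inside a sub-cube."""
    return prod(stop - start for start, stop in box)

# Hypothetical case: 3 stocks with N = (225, 5, 5) points on a 4 x 2 x 2 processor grid
shape = (225, 5, 5)
boxes = split_hypercube(shape, (4, 2, 2))
```

Each of the 16 processors then loops only over the stock levels of its own sub-cube, and the sub-cubes together cover every point exactly once.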



Example splitting for 3 stocks

[Figure: the 3D grid of stock-1, stock-2 and stock-3 levels is split into sub-cubes, one per processor Pi. The sub-cube of Pi at tn is shown together with its influence area at tn+1 on the tn computations.]



2D example for routing (receive)

P0 P1 P2 P3

P4 P5 P6 P7

P8 P9 P10 P11

P12 P13 P14 P15

What happens on P5 (for example)?

It determines all the 2D-subcubes it has to receive from other processors.

P5 routing plan: one Recv entry and one Send entry per processor (Proc 0 to 15).



2D example for routing (send)

P5 determines all the 2D-subcubes it has to send to other processors:

• compute the "influence area" of P0

• compute the intersection with P5's own tn+1 2D-subcube of data



2D example for routing

P5 determines all the 2D-subcubes it has to send to other processors:

• repeat with the other processors…

The routing plan of P5 is complete! Execute it quickly!
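The construction of a routing plan can be sketched for the 4 x 4 processor grid of the example. The sketch assumes that the influence area of a processor is its own subcube enlarged by a fixed number of grid points (`reach`) in each direction and clipped to the global grid; in the real tool it depends on the commands. The grid size, `reach` and all function names are hypothetical.

```python
from itertools import product

def intersect(box_a, box_b):
    """Intersection of two boxes of half-open (start, stop) ranges, or None if empty."""
    out = []
    for (a0, a1), (b0, b1) in zip(box_a, box_b):
        lo, hi = max(a0, b0), min(a1, b1)
        if lo >= hi:
            return None
        out.append((lo, hi))
    return tuple(out)

def influence_area(box, reach, shape):
    """Subcube enlarged by `reach` points per direction, clipped to the global grid."""
    return tuple((max(0, lo - reach), min(n, hi + reach))
                 for (lo, hi), n in zip(box, shape))

def routing_plan(me, boxes, reach, shape):
    """Recv and Send entries of processor `me`, as {other_rank: subcube}."""
    my_area = influence_area(boxes[me], reach, shape)
    recv, send = {}, {}
    for other, box in boxes.items():
        if other == me:
            continue
        # Data needed at tn that `other` owns at tn+1
        needed = intersect(my_area, box)
        # Data owned here at tn+1 that falls in the influence area of `other`
        wanted = intersect(influence_area(box, reach, shape), boxes[me])
        if needed:
            recv[other] = needed
        if wanted:
            send[other] = wanted
    return recv, send

# 8 x 8 stock levels on a 4 x 4 processor grid: each Pi owns a 2 x 2 subcube
shape = (8, 8)
boxes = {4 * i + j: ((2 * i, 2 * i + 2), (2 * j, 2 * j + 2))
         for i, j in product(range(4), repeat=2)}
recv, send = routing_plan(5, boxes, reach=1, shape=shape)
```

With this setting P5 exchanges data only with its eight neighbours; every processor builds its own plan in the same way, without any global coordination.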



C++ implementation

Parallelization:

• MPI: Mpich-1, OpenMPI, IBM MPI

communication routines: MPI_Issend, MPI_Irecv, MPI_Wait

overlap all communications when executing a routing plan (to speed up)

do not use "extra communication buffers" (to size up)

• + multithreading: Intel TBB or OpenMP

to speed up and to size up more (than using only message passing)

Scientific computing libraries: Blitz++, Boost, Clapack, Sprng.

Total:

• 57000 lines of C++ code

• 10% for parallelization management

• Parallelization can be removed by preprocessing to debug small cases

• Same source code on PC cluster, Blue Gene/L and Blue Gene/P



Test case presentation

Optimization and simulation over 518 days with a time step of one day

One stock of water:

225 discretization points (c)

5 commands (0 to 5000 MW each day for nc)

6 stocks of monthly futures products with delivery of energy (peak and off-peak hours)

5 discretization points for each one

5 commands (-2000 MW (sell) to 2000 MW (buy)), tested every 2 weeks

Aggregated view of thermal assets.

Up to 225*5^6 discretization points and 5^7 commands to test
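The sizes quoted for the test case can be checked with a few lines of arithmetic. The variable names are hypothetical, and the last product is only an upper bound on the number of (stock level, command) pairs per hazard level, reached on the dates where all 7 commands are tested.

```python
# Discretization of the test case, as stated on the slide
water_points = 225        # points for the water stock
futures_points = 5        # points for each futures stock
n_futures = 6             # number of futures stocks
commands_per_stock = 5    # commands tested for each stock
n_stocks = 1 + n_futures  # water + futures

grid_points = water_points * futures_points ** n_futures   # 225 * 5^6
commands = commands_per_stock ** n_stocks                  # 5^7
pairs_upper_bound = grid_points * commands                 # per hazard level, per date

print(grid_points, commands, pairs_upper_bound)
```

This gives about 3.5 million stock levels and about 78 thousand command combinations, which is what makes a distributed c nest necessary.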



Results: Intel 256*2 cores, BG 8192*4 cores

Comparison of Blue Gene and cluster without multithreading

[Figure: "Opti15 - BGP-p4t1 - IC-p2t1". Execution times (s), from 1E+2 to 1E+5, versus number of nodes, from 10 to 10000, on logarithmic axes. Curves: T-BGP-tot-p4t1, T-BGP-opti-p4t1, T-BGP-simu-p4t1, T-IC-tot-p2t1, T-IC-opti-p2t1, T-IC-simu-p2t1.]



Results

Comparison of Blue Gene and cluster with multithreading

[Figure: "Opti15 - BGP-p1t4 - IC-p1t2". Execution times (s), from 1E+2 to 1E+5, versus number of nodes, from 10 to 10000, on logarithmic axes. Curves: T-BGP-tot-p1t4, T-BGP-opti-p1t4, T-BGP-simu-p1t4, T-IC-tot-p1t2, T-IC-opti-p1t2, T-IC-simu-p1t2.]



Results

• Some optimizations carried out for Blue Gene

• They should also improve the results on Intel.

• The Intel compiler (ICC) should be used on Intel

• Some more optimizations on Blue Gene should bring the optimization part to around 1000 s on 8192 MPI processes with 4 threads each



Conclusion

• Tool developed for stochastic optimization with a limited number of stocks (< 10)

• Will provide reference calculations for other methods (those assuming convexity, for example), showing how far from optimality they are.

• To be tested on EDF data without asset approximations

• Will be a candidate for GPU cluster optimization
