CS 471 Final Project: 2D Advection/Wave Equation Using Fourier Methods
December 10, 2003
Jose L. Rodriguez
[email protected]


Page 1: CS 471 Final Project 2d Advection/Wave Equation Using Fourier Methods

CS 471 Final Project
2D Advection/Wave Equation Using Fourier Methods
December 10, 2003

Jose L. Rodriguez

[email protected]

Page 2: CS 471 Final Project 2d Advection/Wave Equation Using Fourier Methods

Project Description

• Use a Spectral Method (Fourier Method) for the equation:

$$\frac{\partial C}{\partial t} + a_x \frac{\partial C}{\partial x} + a_y \frac{\partial C}{\partial y} = 0$$

• Use the JST Runge-Kutta Time Integrator for each time step.

Page 3: CS 471 Final Project 2d Advection/Wave Equation Using Fourier Methods

Algorithm

• For each time step that we take, we do s sub-stages (a minimal C sketch follows):

Set C = C^n
for k = s : -1 : 1
    C = C^n - (dt/k) * (a_x ∂C/∂x + a_y ∂C/∂y)
end
C^{n+1} = C
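As a rough illustration, here is a minimal serial C sketch of this loop. The function name jst_rk_step and the helper advection_rhs are hypothetical, not taken from the project code:

#include <string.h>

/* One JST Runge-Kutta time step: s sub-stages, each restarting from C^n.
 * advection_rhs() is a hypothetical helper that evaluates
 * a_x*dC/dx + a_y*dC/dy into rhs. */
void jst_rk_step(double *c, double *cn, double *rhs, int npts,
                 int s, double dt,
                 void (*advection_rhs)(const double *c, double *rhs, int npts))
{
    int j, k;
    memcpy(cn, c, npts * sizeof(double));      /* save C^n */
    for (k = s; k >= 1; k--) {
        advection_rhs(c, rhs, npts);           /* a_x C_x + a_y C_y */
        for (j = 0; j < npts; j++)
            c[j] = cn[j] - (dt / (double) k) * rhs[j];
    }
    /* c now holds C^{n+1} */
}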

Page 4: CS 471 Final Project 2d Advection/Wave Equation Using Fourier Methods

Algorithm with Spectral Representation

Set c = C^n
for k = s : -1 : 1
    c_hat = fft(c)
    dhat_x = D_x c_hat
    dhat_y = D_y c_hat
    d_x = ifft(dhat_x)
    d_y = ifft(dhat_y)
    for j = 1, ..., N^2
        c_j = C^n_j - (dt/k) * (a_x d_x,j + a_y d_y,j)
    end
end
C^{n+1} = c

Here D_x and D_y are the diagonal spectral differentiation operators, i.e. multiplication by i*k_x and i*k_y in Fourier space (a C sketch of the D_x step follows).
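A sketch of the D_x step using the FFTW 2.x serial interface (the version current in 2003). This assumes a 2*pi-periodic domain, takes the first (row) dimension as x, and omits special handling of the Nyquist mode:

#include <fftw.h>

/* Spectral x-derivative of an N x N row-major complex field c,
 * using chat as scratch: chat = fft(c); chat *= i*kx; c = ifft(chat)/N^2. */
void spectral_dx(fftw_complex *c, fftw_complex *chat, int N)
{
    fftwnd_plan fwd = fftw2d_create_plan(N, N, FFTW_FORWARD,  FFTW_ESTIMATE);
    fftwnd_plan inv = fftw2d_create_plan(N, N, FFTW_BACKWARD, FFTW_ESTIMATE);
    int i, j;

    fftwnd_one(fwd, c, chat);                      /* chat = fft(c) */
    for (i = 0; i < N; i++) {
        double kx = (i <= N / 2) ? i : i - N;      /* wavenumber of row i */
        for (j = 0; j < N; j++) {
            double re = chat[i*N + j].re, im = chat[i*N + j].im;
            chat[i*N + j].re = -kx * im;           /* multiply by i*kx */
            chat[i*N + j].im =  kx * re;
        }
    }
    fftwnd_one(inv, chat, c);                      /* unnormalized ifft */
    for (i = 0; i < N * N; i++) {                  /* FFTW leaves factor N^2 */
        c[i].re /= (double) (N * N);
        c[i].im /= (double) (N * N);
    }
    fftwnd_destroy_plan(fwd);
    fftwnd_destroy_plan(inv);
}

The same pattern with i*k_y applied along the second dimension gives the d_y term.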

Page 5: CS 471 Final Project 2d Advection/Wave Equation Using Fourier Methods

Code Development

• Develop serial C code, based on the given Matlab code, using the FFTW libraries for the fft and ifft calls.
• Very straightforward.
• The code was verified simply by comparing its output with the Matlab result.
• Develop parallel C code based on the serial C code.
• The FFTW libraries provide fft and ifft calls that do all the MPI calls for you.
• The tricky part of this development was placing the data correctly on each processor for the fft and ifft calls.
• The code was again verified by comparison with the Matlab result.

Page 6: CS 471 Final Project 2d Advection/Wave Equation Using Fourier Methods

Results: N=512, 1000 Iterations

Page 7: CS 471 Final Project 2d Advection/Wave Equation Using Fourier Methods

Results: N=512, 1000 Iterations

Page 8: CS 471 Final Project 2d Advection/Wave Equation Using Fourier Methods

Results: N=512, 1000 Iterations

Page 9: CS 471 Final Project 2d Advection/Wave Equation Using Fourier Methods

Results: N=512, 1000 Iterations

Page 10: CS 471 Final Project 2d Advection/Wave Equation Using Fourier Methods

Usage of FFTW Libraries in Parallel: Function Calls

Notice: Message Passing is transparent to the user

Page 11: CS 471 Final Project 2d Advection/Wave Equation Using Fourier Methods

Usage of FFTW Libraries in Parallel: MPI Data Layout

• The transform data used by the MPI FFTW routines is distributed: a distinct portion of it resides with each process involved in the transform. This allows the transform to be parallelized, for example, over a cluster of workstations, each with its own separate memory, so that you can take advantage of the total memory of all the processors you are parallelizing over.

• In particular, the array is divided according to the rows (first dimension) of the data: each process gets a subset of the rows of the data. (This is sometimes called a "slab decomposition.") One consequence of this is that you can't take advantage of more processors than you have rows (e.g. a 64x64x64 matrix can use at most 64 processors). This isn't usually much of a limitation, however, as each processor needs a fair amount of data in order for the parallel-computation benefits to outweigh the communication costs.

Taken from the FFTW website/documentation.

Page 12: CS 471 Final Project 2d Advection/Wave Equation Using Fourier Methods

Usage of FFTW Libraries in Parallel: MPI Data Layout

These calls are needed to create the fft and ifft plans, and to find out how much memory must be allocated locally (sketched below).
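A sketch of what those calls look like in the FFTW 2.x MPI interface; the function name setup_mpi_fft is hypothetical, and the variable names follow the slides:

#include <stdlib.h>
#include <mpi.h>
#include <fftw_mpi.h>

/* Create forward/inverse plans for an N x N transform and ask FFTW how
 * the slab decomposition falls on this process and how much local
 * storage must be allocated. */
fftw_complex *setup_mpi_fft(int N, fftwnd_mpi_plan *fwd, fftwnd_mpi_plan *inv,
                            int *ilocal_nx, int *ilocal_x_start)
{
    int local_ny_after_transpose, local_y_start_after_transpose;
    int total_local_size;

    *fwd = fftw2d_mpi_create_plan(MPI_COMM_WORLD, N, N,
                                  FFTW_FORWARD,  FFTW_ESTIMATE);
    *inv = fftw2d_mpi_create_plan(MPI_COMM_WORLD, N, N,
                                  FFTW_BACKWARD, FFTW_ESTIMATE);

    fftwnd_mpi_local_sizes(*fwd, ilocal_nx, ilocal_x_start,
                           &local_ny_after_transpose,
                           &local_y_start_after_transpose,
                           &total_local_size);

    /* total_local_size may exceed ilocal_nx * N; allocate what FFTW asks. */
    return (fftw_complex *) malloc(total_local_size * sizeof(fftw_complex));
}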

Page 13: CS 471 Final Project 2d Advection/Wave Equation Using Fourier Methods

Usage of FFTW Libraries in Parallel: MPI Data Layout

ilocal_x_start tells us where we are (which row) in the global 2D array, and ilocal_nx tells us how many rows the current processor holds.

The data uses row-major format (see the indexing sketch below).
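For illustration, a hypothetical sketch of filling the local slab with an initial condition f(x, y) using exactly this layout; a 2*pi-periodic domain is assumed, and rows are taken as the x direction to match ilocal_nx and ilocal_x_start:

#include <math.h>
#include <fftw.h>

/* Fill this process's slab of the global N x N grid, row-major:
 * local row i corresponds to global row ilocal_x_start + i. */
void fill_local_slab(fftw_complex *data, int N,
                     int ilocal_nx, int ilocal_x_start,
                     double (*f)(double x, double y))
{
    int i, j;
    for (i = 0; i < ilocal_nx; i++) {
        int iglobal = ilocal_x_start + i;             /* global row */
        for (j = 0; j < N; j++) {
            double x = 2.0 * M_PI * iglobal / (double) N;
            double y = 2.0 * M_PI * j       / (double) N;
            data[i * N + j].re = f(x, y);             /* row-major index */
            data[i * N + j].im = 0.0;
        }
    }
}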

Page 14: CS 471 Final Project 2d Advection/Wave Equation Using Fourier Methods

Notice: Message Passing is transparent to the user

Page 15: CS 471 Final Project 2d Advection/Wave Equation Using Fourier Methods

Parallel Results

• Two versions were written (the two call shapes are sketched after this list).
• A Non-Efficient version that is not optimized for the FFTW MPI calls:
• No extra work array is used.
• An extra un-transposing of the data is done before coming out of the fft calls.
• An Efficient version that is optimized for the FFTW MPI calls:
• An extra work array is used.
• The data is left transposed, so the extra communication step of un-transposing it is avoided.
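In FFTW 2.x terms, the difference plausibly comes down to how fftwnd_mpi is called. A sketch (not the project's actual code), reusing fwd_plan, data, and a work array of total_local_size elements from the setup sketch earlier:

/* Non-Efficient version: no work array, and the output is returned in
 * normal order, which costs an extra un-transposing communication. */
fftwnd_mpi(fwd_plan, 1, data, NULL, FFTW_NORMAL_ORDER);

/* Efficient version: a work array is supplied and the output is left
 * transposed, skipping that extra communication step; later stages must
 * then index the data in its transposed layout. */
fftwnd_mpi(fwd_plan, 1, data, work, FFTW_TRANSPOSED_ORDER);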

Page 16: CS 471 Final Project 2d Advection/Wave Equation Using Fourier Methods

Notice the slight differences.

Page 17: CS 471 Final Project 2d Advection/Wave Equation Using Fourier Methods

[Figure: two plots for N=256, Iterations=100. Left: time taken (seconds) vs. number of processors on a log-log scale, for the Efficient and Non-Efficient versions. Right: efficiency vs. number of processors (0-35) for both versions.]

The Efficient Version is both faster and more efficient.

Page 18: CS 471 Final Project 2d Advection/Wave Equation Using Fourier Methods

[Figure: two plots for N=512, Iterations=100. Left: time taken (seconds) vs. number of processors on a log-log scale, for the Efficient and Non-Efficient versions. Right: efficiency vs. number of processors (0-35) for both versions.]

We begin to see some scaling; however, efficiency starts to taper off, indicating that much of the time is spent in communication.

Page 19: CS 471 Final Project 2d Advection/Wave Equation Using Fourier Methods

[Figure: two plots for N=1024, Iterations=100. Left: time taken (seconds) vs. number of processors on a log-log scale, for the Efficient and Non-Efficient versions. Right: efficiency vs. number of processors (0-35) for both versions.]

Overall, we see the same trend as N increases: some scaling as the number of processors grows, but the speedup starts to flatten and efficiency steadily decreases.

Page 20: CS 471 Final Project 2d Advection/Wave Equation Using Fourier Methods

The Sea of Black for the Non-Efficient Version

N=256, 10 Iterations

Page 21: CS 471 Final Project 2d Advection/Wave Equation Using Fourier Methods

A lot of communication between processors.

Page 22: CS 471 Final Project 2d Advection/Wave Equation Using Fourier Methods

Communication goes on between each pair of processors with MPI_Sendrecv, since each processor needs data from every other processor. We can actually see here when an fft is being performed. (A sketch of this exchange pattern follows.)
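A hypothetical sketch of that pairwise pattern, illustrative only (not FFTW's internal code): each process swaps one block with every other process in turn.

#include <mpi.h>

void pairwise_exchange(double *sendbuf, double *recvbuf,
                       int blocksize, int rank, int nprocs)
{
    MPI_Status status;
    int p;
    for (p = 0; p < nprocs; p++) {
        if (p == rank) continue;
        /* Swap block p: send ours to process p, receive its block back. */
        MPI_Sendrecv(sendbuf + p * blocksize, blocksize, MPI_DOUBLE, p, 0,
                     recvbuf + p * blocksize, blocksize, MPI_DOUBLE, p, 0,
                     MPI_COMM_WORLD, &status);
    }
}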

Page 23: CS 471 Final Project 2d Advection/Wave Equation Using Fourier Methods

8 processors and 16 processors: same trend of communication.

Page 24: CS 471 Final Project 2d Advection/Wave Equation Using Fourier Methods

The Sea of White for the Efficient Version

N=256, 10 Iterations

Page 25: CS 471 Final Project 2d Advection/Wave Equation Using Fourier Methods

The Efficient Version uses MPI_Alltoall for its communication between all processors (a one-call sketch follows).
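The same exchange expressed as one collective call; again a hypothetical illustration rather than the project's code. Handing the whole exchange to the MPI implementation in a single call lets it schedule all the transfers at once, which matches the much whiter trace:

#include <mpi.h>

/* Each process sends block p of sendbuf to process p and receives one
 * block from every process, all in a single collective. */
void alltoall_exchange(double *sendbuf, double *recvbuf, int blocksize)
{
    MPI_Alltoall(sendbuf, blocksize, MPI_DOUBLE,
                 recvbuf, blocksize, MPI_DOUBLE, MPI_COMM_WORLD);
}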

Page 26: CS 471 Final Project 2d Advection/Wave Equation Using Fourier Methods

Again, the white bars for each process show when an fft call is being performed.

Page 27: CS 471 Final Project 2d Advection/Wave Equation Using Fourier Methods

8 processors and 16 processors: same trend of communication.

Page 28: CS 471 Final Project 2d Advection/Wave Equation Using Fourier Methods

Conclusions

• A lot of time is spent in communication, since each process communicates with every other process.
• Efficiency goes down as a result: as the number of processes increases for a given size N, more communication is needed.
• We saw some scaling, but it starts to drop off as the number of processors increases (efficiency issues).
• Time spent on this project:
• Code development: ~8 hours, including debugging
• Data collection: ~2 days
• Overall: quite a bit of time