
Department of Physics, Umeå University
Swedish Defence Research Agency (FOI)
June 24, 2015

Master’s thesis

Algebraic multigrid for a mass-consistent wind model, the Nordic Urban Dispersion model

Markus Pogulis

June 24, 2015

Supervisor: Niklas Brännström
Examiner: Krister Wiklund


Abstract

In preparation for, and for decision support during, CBRN (chemical, biological, radiological and nuclear) emergencies it is essential to know how such an event would unfold, so that a possible evacuation can be prepared. Afterwards it can be valuable to backtrack and determine what caused the emergency and, in the case of e.g. a gas leak, where it began. The Swedish Defence Research Agency (FOI) develops models for such scenarios.

In this thesis FOI's model, "The Nordic Urban Dispersion model" (NUD), has been studied. The system of equations set up by this model was originally solved using Intel's PARDISO solver, which is a direct solver. This thesis evaluates how well an iterative multigrid method would solve the system. The wind model is a mass-consistent model which sets up a diagnostic initial wind field. The final wind field is then obtained by minimizing its deviation from the initial field under the constraint of the continuity equation. The minimization problem is solved using Lagrange multipliers and the system turns into a Poisson-like problem.

The iterative algebraic multigrid solver (AMG) which has been evaluated had difficulties solving the problem with the asymmetric system matrix generated by NUD. The AMG solver was then tried on a symmetric discrete Poisson problem instead, and the solution turned out to be the same as for the PARDISO solver. A comparison was made between the AMG and PARDISO solvers, and for the discrete Poisson case the AMG solver came out ahead: it handled larger systems and needed less computational time.

To try out the solvers on the original NUD case, the boundary conditions were modified to make the system matrix symmetric. This modification turns the problem into a mathematical problem rather than a physical one, as the wind fields generated are not physically correct. For this modified case both solvers reach the same solution in essentially the same computational time. A method for solving the original (asymmetric) problem in the future, by modifying the discretization of the boundary conditions, has been discussed.

Keywords: Mass-consistent model, urban environment, numerical method, algebraic multigrid



Sammanfattning (Summary)

In preparation for, and for decision support during, CBRN events (chemical (C), biological (B), radiological (R) or nuclear (N)) it is important to know what such an event could lead to, so that a possible evacuation can be prepared. Afterwards it can be useful to know what may have caused the accident and, in the case of e.g. a gas leak, where it started. The Swedish Defence Research Agency (FOI) develops models for such scenarios.

In this degree project FOI's model, "The Nordic Urban Dispersion model" (NUD), has been studied. The system of equations set up by the model was originally solved with Intel's solver PARDISO, which is a direct solver. An evaluation of how well an iterative multigrid method would solve the system has been carried out in this project. The wind model is a mass-consistent model which sets up a diagnostic initial wind field. The final wind field is later minimized under the constraint of the continuity equation. The minimization problem is solved using Lagrange multipliers and the system becomes a Poisson-like problem.

The iterative algebraic multigrid solver (AMG) used here had difficulties solving the problem with an asymmetric system matrix generated by NUD. The AMG solver was therefore tested first on a symmetric discretized Poisson problem, where the solution turned out to be the same as for the PARDISO solver. A comparison has been made between the AMG and PARDISO solvers, and for the discretized Poisson case the AMG solver proved better, both handling larger system sizes and having lower computational time.

To attempt to solve the original NUD case, the boundary conditions had to be changed in order to obtain a symmetric system matrix. This modification turns the problem into a mathematical problem rather than a physical one, and the wind fields generated are thus not physically correct. For this modified case the two solvers performed very similarly and obtained similar solutions in approximately the same computational time. A method for how the original (asymmetric) problem can be solved in the future, by modifying the discretization of the boundary conditions, has been discussed.



Contents

1 Introduction
1.1 Background
1.2 The problem at hand

2 Theory
2.1 Wind fields
2.1.1 Navier-Stokes
2.1.2 Mass-consistent model
2.1.2.1 The initial wind field, $\vec{V}_{00}$
2.1.2.2 Recirculation zones, $\vec{V}_0$
2.1.2.3 The mass-consistent wind field, $\vec{V}$
2.1.2.4 Boundary conditions
2.1.2.5 The final formulation
2.2 Geometry
2.3 The discretization of the problem
2.4 Multigrid methods
2.4.1 The idea of multigrid
2.4.2 A deeper understanding
2.4.2.1 Different meanings of smooth, geometric and algebraic
2.4.2.2 Different types of multigrid, geometric and algebraic

3 Numerics and implementation
3.1 Sparse matrices
3.1.1 Compressed Sparse Row
3.1.2 Skyline Sparse
3.1.3 Conversion from CSR to SSK
3.2 The problem of asymmetries
3.2.1 Dirichlet only
3.2.2 Discrete Poisson

4 Results
4.1 Discrete Poisson
4.1.1 Sparsity
4.1.2 The asymmetric case
4.2 NUD - Dirichlet only
4.2.1 Sparsity

5 Discussion
5.1 Discrete Poisson
5.2 NUD - Dirichlet only

6 Conclusions
6.1 Future works
6.2 Final remarks

Appendices
Appendix A Euler-Lagrange equations
Appendix B Discrete Poisson
B.1 Solution for PARDISO and MATLAB
B.2 The asymmetric case
Appendix C Sparsity
C.1 Discrete Poisson
C.2 NUD - Dirichlet only
Appendix D NUD - Original problem


1 Introduction

This work is a master's thesis performed at Umeå University in collaboration with the Swedish Defence Research Agency (FOI). The task is to investigate the use of multigrid solvers to solve a Poisson-like equation, which in this case comes from a mass-consistent wind model.

1.1 Background

Sweden, together with 12 other countries, forms the consortium behind the crisis management system ARGOS [1]. This is a software system for decision support during, and preparation for, CBRN¹ emergencies. In Sweden the system has been in use since 2005 by FOI and the Swedish Radiation Safety Authority (SSM).

One of FOI's contributions to ARGOS is to develop, together with the Danish Risø National Laboratory (DTU), a dispersion model for urban environments. The model they have developed is called the Nordic Urban Dispersion model (NUD). FOI's task was to generate a stationary wind field from meteorological data, and DTU's task was to later model the dispersion in this wind field.

1.2 The problem at hand

Currently FOI has a program which takes meteorological data and data for the environment and sets up the system of equations for the finite difference method. This system is then solved by a direct solver, Intel's PARDISO solver [2]. This solver, and direct solvers in general, cannot handle all problems of interest. In particular, PARDISO cannot handle systems beyond a certain size, which puts an unnecessary constraint on the resolution of the grid. FOI therefore wants to try a different approach, an iterative solver, which can generally handle larger problems. FOI has done some earlier testing of an iterative method based on a geometric multigrid method, but experienced problems with convergence.

The purpose of this master's thesis is to further investigate the possibilities of using multigrid methods to solve problems of this type. The motivation is that FOI wants to be able to solve larger systems without being restricted by the solver. As FOI has already tried a geometric multigrid method, but did not get it to converge, they now want to try an algebraic multigrid (AMG) method instead. An AMG code, amg1r6, already exists and is available at FOI. It will be integrated into the finite difference method already in NUD.

One restriction is that we will use a wind field model, NUD, already implemented by FOI, for which the required theory will be presented later. For a full description see [3]. Another restriction is that the problem is already described using a finite difference method in NUD. We will therefore continue with that approach and not consider other methods, such as finite element or finite volume. In this thesis there will be a detailed description of the model handled here, the algebraic multigrid. For different methods of solving the problem, see [3][4].

¹ Chemical, biological, radiological and nuclear

This thesis has a Plan A and a Plan B. The main part, Plan A, is the work of integrating two methods: one finite difference method describing the wind field (NUD) and one iterative solver to solve the system (amg1r6). Since multigrid methods are not full black box solvers and can be problem dependent, there is also a fallback. If the integration of these methods does not fully work, the thesis will instead focus on investigating why it is not working, Plan B. If time permits, an attempt will then be made to solve the problem using a self-made implementation of either the problem or the solver.



2 Theory

This section first describes the theory of wind fields. The minimization of the wind field, under the constraint of the continuity equation, is set up and leads to a Poisson-like equation. After that the geometry is described, followed by the necessary theory of the finite difference method. Finally, the theory of the iterative solver and of multigrid methods, used to solve the resulting system, is described.

2.1 Wind fields

A stationary wind field is a three-dimensional spatial pattern of winds and can be modeled in many ways, of which two will be handled here: the standard way of the Navier-Stokes equations and a mass-consistent approach. The one which will be handled in depth is the mass-consistent model.

2.1.1 Navier-Stokes

Classically one would use the Navier-Stokes equations (NSE), equation (1), to describe a wind field fully. These equations are based on conservation of mass, momentum and energy. The incompressible Navier-Stokes equations are

$$
\begin{aligned}
\frac{\partial \vec{u}}{\partial t} + \vec{u} \cdot \nabla \vec{u} &= -\frac{\nabla p}{\rho} + \nu \nabla^2 \vec{u}, \\
\nabla \cdot \vec{u} &= 0,
\end{aligned}
\tag{1}
$$

where $\vec{u}$ is the velocity of the fluid parcel, $p$ is the pressure, $\rho$ is the density and $\nu$ is the kinematic viscosity.

When using NSE to describe a turbulent wind field one has to resolve the Kolmogorov microscales, which are the smallest scales of turbulence and where the dissipation of energy takes place [6], and thus use a very fine grid. This approach is called a Direct Numerical Simulation (DNS). For the complex geometries handled in this thesis this is too computationally heavy. Thus one would rather use a simpler model to approximate the wind field.

2.1.2 Mass-consistent model

The mass-consistent model only uses conservation of mass and then approximates energy and momentum conservation using empirical diagnostic studies of wind fields in the desired environment, which in this case is an urban environment. In this section we will describe FOI's interpretation of the model [3], which is based on [7][8][9].

The model first takes an initial wind velocity, $\vec{V}_{in}$, which is known at a height, $h_1$. Based on meteorological models and inter-/extrapolation we can estimate a first wind field, $\vec{V}_{00}$. Then, by adjusting the wind field to take the recirculation zones generated by the buildings in the area into account, we arrive at $\vec{V}_0$. Under the constraint of the continuity equation we solve a minimization problem to arrive at the final wind field, $\vec{V}$. A final wind field, solved by the original NUD using the PARDISO solver, can be seen in Appendix D.

So much work has to be done prior to solving for the desired wind field $\vec{V}$ because a mass-consistent model has difficulty introducing correct fluid dynamics, such as recirculations close to a building [3]. The process can be seen in figure 1 and is described in more detail below.

$\vec{V}_{in} \rightarrow \vec{V}_{00} \rightarrow \vec{V}_0 \rightarrow \vec{V}$

Figure 1: Schematic figure of the process of generating the different wind fields.

2.1.2.1 The initial wind field, $\vec{V}_{00}$

We will first describe the vertical (z-direction) inter-/extrapolation. We first load data for the wind velocities, $\vec{V}_{in}$, to find the mean direction. The velocity, $\vec{V}_{in}$, is known at at least one height, $h_1$. See figure 2 for an example, even though this figure is not of an urban environment. From this we can, based on empirical studies, set up different layers of wind behaviour close to the ground. We have three layers: closest to the ground is the constant layer, next comes the surface layer and after that the linear interpolation layer, all seen in figure 3. For the first two layers, the constant layer and the surface layer, we extrapolate the velocity from the lowest known height, $h_1$. For the linear interpolation layer an interpolation between all known heights, $h_1$ through $h_N$, is done, see equation (3). The reason for having different layers is that the wind is affected by ground friction when approaching the ground [3][4].

To determine where the different layers should start and stop we make use of different heights. The constant layer holds from the ground level, $h_0$, up to the mean building height of the domain, $h_m$, which is calculated as in equation (2). From the mean building height up to the first known wind velocity height, $h_1$, we find the surface layer. After that, from $h_1$ to the other known wind velocity heights, $h_2$ through $h_N$, we have the linear interpolation layer.

$$
h_m = \frac{\sum_i W_i L_i H_i}{\sum_i W_i L_i}
\tag{2}
$$

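Figure 2: A numerical weather forecast, used to describe a step in the process of generating the wind field. The figure is taken from FOI with permission.

Figure 3: The three different wind layers. The figure is taken from [3] with permission and is modified.

The equations for the vertical inter-/extrapolation are

$$
\vec{V}_{00}(z) =
\begin{cases}
\left(\dfrac{|V_{in}(h_{i+1})| - |V_{in}(h_i)|}{h_{i+1} - h_i}\right)(z - h_i) + |V_{in}(h_i)|, & h_i < z \le h_{i+1} \le h_N, \\[2ex]
\dfrac{|V_{in}(h_1)|}{\ln\left(\frac{h_1 - D}{Z_0}\right)} \ln\left(\dfrac{z - D}{Z_0}\right), & h_m < z \le h_1, \\[2ex]
\dfrac{|V_{in}(h_1)|}{\ln\left(\frac{h_1 - D}{Z_0}\right)} \ln\left(\dfrac{h_m - D}{Z_0}\right), & 0 < z \le h_m,
\end{cases}
\tag{3}
$$

where

$$
D = 0.8\, \frac{\sum_i W_i L_i H_i}{\sum_i W_i L_i},
\qquad
Z_0 = 0.2\, \frac{\sum_i W_i L_i H_i}{L_x L_y}.
$$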

In these equations $W$, $L$ and $H$ are the dimensions of the buildings with respect to the wind direction: the width, $W$, perpendicular to the wind direction, the length, $L$, along the wind direction, and the height, $H$. The two remaining parameters, $L_x$ and $L_y$, are the dimensions of the computational domain [7][9].
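As an illustration of equations (2) and (3), the following is a minimal sketch in Python, not NUD's implementation. The function names, the representation of buildings as `(W, L, H)` tuples and the use of scalar wind speeds are assumptions made for this example. It also assumes $h_m < h_1$ and $h_m > D$, so that the logarithms are defined.

```python
import math

def building_parameters(buildings, Lx, Ly):
    """Return (h_m, D, Z0) from a list of (W, L, H) building dimensions."""
    volume = sum(W * L * H for W, L, H in buildings)
    footprint = sum(W * L for W, L, _ in buildings)
    h_m = volume / footprint           # equation (2)
    D = 0.8 * h_m                      # D as defined below equation (3)
    Z0 = 0.2 * volume / (Lx * Ly)      # Z0 as defined below equation (3)
    return h_m, D, Z0

def wind_speed(z, heights, speeds, h_m, D, Z0):
    """Evaluate the speed profile of equation (3) at a height z > 0.

    heights = [h_1, ..., h_N] (increasing), speeds = |V_in| at those heights.
    """
    h1, v1 = heights[0], speeds[0]
    log_ref = math.log((h1 - D) / Z0)
    if z <= h_m:                       # constant layer
        return v1 * math.log((h_m - D) / Z0) / log_ref
    if z <= h1:                        # surface layer
        return v1 * math.log((z - D) / Z0) / log_ref
    for i in range(len(heights) - 1):  # linear interpolation layer
        if heights[i] < z <= heights[i + 1]:
            slope = (speeds[i + 1] - speeds[i]) / (heights[i + 1] - heights[i])
            return slope * (z - heights[i]) + speeds[i]
    raise ValueError("z lies above the highest known height")
```

With two hypothetical buildings `[(10, 20, 15), (20, 20, 30)]` in a 100 by 100 domain this gives $h_m = 25$, and the profile is constant below $h_m$, logarithmic up to $h_1$ and piecewise linear above.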



The wind field we have now generated, $\vec{V}_{00}(z)$, is on the same grid as $\vec{V}_{in}$ was presented on, just inter-/extrapolated in the z-direction. This grid is usually very coarse and we want to use a much finer one. A horizontal weighted interpolation is therefore done to transfer the wind field to the grid we want to use, which generates $\vec{V}_{00}(x, y, z)$. Equation (4) shows the weighted interpolation, where $u$ is the wind velocity on the grids, as presented in figure 4 [3].

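$$
\begin{aligned}
u(x_p, y_p) ={}& \frac{(x_2 - x_p)(y_2 - y_p)\, u(x_1, y_1)}{A} + \frac{(x_p - x_1)(y_2 - y_p)\, u(x_2, y_1)}{A} \\
&+ \frac{(x_2 - x_p)(y_p - y_1)\, u(x_1, y_2)}{A} + \frac{(x_p - x_1)(y_p - y_1)\, u(x_2, y_2)}{A},
\end{aligned}
\tag{4}
$$

where $A = (x_2 - x_1)(y_2 - y_1)$ is the area of the coarse cell.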

Figure 4: Figure visualising the weighted interpolation. The figure is taken from [3] with permission.

Here the point $(x_p, y_p)$ is on the grid we want to go to, and the other (corner) points are known points from the coarser grid and the earlier interpolation. There may exist many points $(x_p, y_p)$ in such a region, which is why the weighting is important.
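The weighted interpolation of equation (4) can be sketched in a few lines of Python. The function name and argument order are assumptions for this example and are not taken from NUD.

```python
def bilinear(xp, yp, x1, x2, y1, y2, u11, u21, u12, u22):
    """Equation (4): interpolate u at (xp, yp) inside one coarse cell.

    u11 = u(x1, y1), u21 = u(x2, y1), u12 = u(x1, y2), u22 = u(x2, y2).
    """
    A = (x2 - x1) * (y2 - y1)  # area of the coarse cell
    return ((x2 - xp) * (y2 - yp) * u11
            + (xp - x1) * (y2 - yp) * u21
            + (x2 - xp) * (yp - y1) * u12
            + (xp - x1) * (yp - y1) * u22) / A
```

Each corner value is weighted by the area of the sub-rectangle opposite to it, so a fine-grid point close to a corner is dominated by that corner, and a point at a corner reproduces the corner value exactly.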

2.1.2.2 Recirculation zones, $\vec{V}_0$

Now that we have $\vec{V}_{00}(x, y, z)$ we want to add the recirculation zones. These are turbulence induced and appear behind objects, in this case buildings. Once the recirculation zones are introduced we have $\vec{V}_0(x, y, z)$. All buildings are seen as rectangular shapes and the same notation as in the previous section is used for width, length and height [3].

There are three different zones in connection with buildings: the Displacement zone, the Cavity zone and the Wake zone. All zones can be seen in figure 5. In the upwind direction, on the windward side of the building, we find the first zone, the Displacement zone, which is described by equations (5) and (6) [9].

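Figure 5: The three recirculation zones generated from a building.

$$
\frac{L_f}{H} = \frac{2\,\frac{W}{H}}{1 + 0.8\,\frac{W}{H}},
\tag{5}
$$

$$
\frac{X^2}{L_f^2 \left(1 - \left(\frac{Z}{0.6H}\right)^2\right)} + \frac{Y^2}{W^2} = 1,
\tag{6}
$$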

where $L_f$ is the length of the zone and $X$, $Y$ and $Z$ form the internal coordinate system for the zone. There exists one such system of coordinates for each zone and its origin is located at the foot of the building. The $X$-direction is the upwind direction (to the right in figure 5), the $Z$-direction is along the height of the building and the $Y$-direction is along the width of the building ("into" the figure when studying figure 5). Equation (6) describes the surface of the Displacement zone, which separates from the ground and reattaches to the building at a height of about $0.6H$ along the $Z$-axis of the zone [7].

Next we have the Cavity zone, which is positioned at the leeward side of the building. The separation of this zone reaches all the way from the ground to the top of the building and is described by equations (7) and (8) [9].

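$$
\frac{L_r}{H} = \frac{1.8\,\frac{W}{H}}{\left(\frac{L}{H}\right)^{0.3} \left(1 + 0.24\,\frac{W}{H}\right)},
\tag{7}
$$

$$
\frac{X^2}{L_r^2 \left(1 - \left(\frac{Z}{H}\right)^2\right)} + \frac{Y^2}{W^2} = 1.
\tag{8}
$$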

In these equations $L_r$ is the length of the zone and $X$, $Y$ and $Z$ form the internal coordinate system of this zone. Equation (8) describes the surface of the zone. The Wake zone is parametrized in the same way as the Cavity zone except that $L_{r,\mathrm{wake}} = 3 L_{r,\mathrm{cavity}}$ [7].
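The zone lengths of equations (5) and (7), and the test of whether a point lies within the surfaces (6) and (8), can be sketched as follows. This is a hedged illustration only: the function names are hypothetical, and treating "left-hand side at most 1" as "inside the zone" is an assumption of this example rather than something stated in [9].

```python
def displacement_length(W, H):
    """L_f from equation (5)."""
    r = W / H
    return H * 2.0 * r / (1.0 + 0.8 * r)

def cavity_length(W, L, H):
    """L_r from equation (7)."""
    return H * 1.8 * (W / H) / ((L / H) ** 0.3 * (1.0 + 0.24 * W / H))

def wake_length(W, L, H):
    """Wake zone length, three times the Cavity zone length."""
    return 3.0 * cavity_length(W, L, H)

def inside_zone(X, Y, Z, length, W, z_top):
    """Assumed inside test for the surfaces (6) and (8).

    z_top is 0.6*H for the Displacement zone and H for the Cavity zone.
    """
    if not 0.0 <= Z < z_top:
        return False
    shrink = 1.0 - (Z / z_top) ** 2
    return X ** 2 / (length ** 2 * shrink) + Y ** 2 / W ** 2 <= 1.0
```

For a cubic building with $W = L = H$, these formulas give $L_f = H \cdot 2/1.8$ and $L_r = H \cdot 1.8/1.24$, so the Cavity zone is somewhat longer than the Displacement zone.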

For buildings not standing orthogonal to the wind direction similar formulas are used, but they will not be discussed here. All the zones can also be modified if buildings are close to each other. Both these phenomena are described in [9].

Now that we have described the spatial extent of the zones we want to know how the velocities are modified within the zones. In the Displacement zone the velocity is set such that the average wind is zero [9]. In the Cavity zone the wind direction is opposite to the roof-level wind [9], $\vec{V}_{00}(H)$, and the velocity decays with $X$ as

$$
\vec{V}_0 = -\vec{V}_{00}(H) \left(1 - \frac{X}{d_N}\right)^2,
\tag{9}
$$

where

$$
d_N = L_r \sqrt{\left(1 - \left(\frac{Z}{H}\right)^2\right)\left(1 - \left(\frac{Y}{W}\right)^2\right)} - 0.5L.
\tag{10}
$$

Finally, the Wake zone wind velocity is modified by altering the unperturbed boundary layer wind profile, $\vec{V}_{00}(Z)$, according to equation (11) [9].

$$
\vec{V}_0 = \vec{V}_{00}(Z) \left(1 - \left(\frac{d_N}{X}\right)^{1.5}\right).
\tag{11}
$$
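The velocity modifications (9)-(11) can be sketched as below. The function names are hypothetical, and the wind is represented by its signed speed along the $X$-axis, which is a simplification assumed for this example.

```python
import math

def d_N(Y, Z, L_r, W, L, H):
    """Equation (10): extent of the Cavity zone along X at the position (Y, Z)."""
    return L_r * math.sqrt((1.0 - (Z / H) ** 2) * (1.0 - (Y / W) ** 2)) - 0.5 * L

def cavity_speed(X, d, v_roof):
    """Equation (9): reversed flow in the Cavity zone, v_roof = V00(H)."""
    return -v_roof * (1.0 - X / d) ** 2

def wake_speed(X, d, v_profile):
    """Equation (11): reduced flow in the Wake zone, v_profile = V00(Z)."""
    return v_profile * (1.0 - (d / X) ** 1.5)
```

At $X = 0$ the Cavity zone speed equals the reversed roof-level speed, and it vanishes at $X = d_N$. The Wake zone speed also vanishes at $X = d_N$ and approaches the unperturbed profile for large $X$, so the two expressions join continuously at the end of the Cavity zone.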

2.1.2.3 The mass-consistent wind field, $\vec{V}$

The goal here is to achieve a corrected, mass-consistent, wind field, $\vec{V} = (u, v, w)$, where $u$, $v$ and $w$ are the wind velocities in the x-, y- and z-direction. This wind field should fulfill the continuity equation, $\nabla \cdot \vec{V} = 0$, which our current diagnostic wind field, $\vec{V}_0 = (u_0, v_0, w_0)$, does not. Using this we can set up the minimization problem for the weighted least squares functional,

$$
J(u, v, w) = \iiint_V \left(\alpha^2 (u - u_0)^2 + \beta^2 (v - v_0)^2 + \gamma^2 (w - w_0)^2\right) dx\,dy\,dz
\tag{12}
$$

under the constraint,

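$$
\nabla \cdot \vec{V} \equiv \frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z} = 0,
\tag{13}
$$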

where the weights $\alpha$, $\beta$ and $\gamma$ may be functions of $(x, y, z)$ and $V$ is the region of interest. Using Lagrange multipliers we can include the constraint in the functional and rewrite the functional as

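$$
E(u, v, w, \lambda) = \iiint_V \left(\alpha^2 (u - u_0)^2 + \beta^2 (v - v_0)^2 + \gamma^2 (w - w_0)^2 + \lambda \left(\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z}\right)\right) dx\,dy\,dz
\tag{14}
$$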

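where $\lambda$ is the Lagrange multiplier function, a function of $(x, y, z)$ [3]. By minimizing this functional we arrive at the mass-consistent wind field, $\vec{V}$. Taking the first variation of equation (14) we obtain the Euler-Lagrange equations (for a derivation see Appendix A):

$$
u = u_0 + \frac{1}{2\alpha^2} \frac{\partial \lambda}{\partial x},
\tag{15}
$$

$$
v = v_0 + \frac{1}{2\beta^2} \frac{\partial \lambda}{\partial y},
\tag{16}
$$

$$
w = w_0 + \frac{1}{2\gamma^2} \frac{\partial \lambda}{\partial z},
\tag{17}
$$

$$
0 = \frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z}.
\tag{18}
$$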

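Substituting equations (15)-(17) into equation (18) yields the final equation of Poisson form,

$$
\frac{\partial}{\partial x}\left(\frac{1}{2\alpha^2} \frac{\partial \lambda}{\partial x}\right) + \frac{\partial}{\partial y}\left(\frac{1}{2\beta^2} \frac{\partial \lambda}{\partial y}\right) + \frac{\partial}{\partial z}\left(\frac{1}{2\gamma^2} \frac{\partial \lambda}{\partial z}\right) = -\nabla \cdot \vec{V}_0.
\tag{19}
$$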

Solving equation (19) gives the Lagrange multipliers $\lambda$. Substituting back into equations (15)-(17) we find the mass-consistent wind field $\vec{V} = (u, v, w)$.
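To make the substitution step explicit, the derivatives of (15)-(17) can be inserted into (18) term by term. The constant-weight special case at the end is an assumption introduced here for illustration only, since the weights may in general depend on $(x, y, z)$.

$$
\begin{aligned}
0 &= \frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z} \\
  &= \underbrace{\frac{\partial u_0}{\partial x} + \frac{\partial v_0}{\partial y} + \frac{\partial w_0}{\partial z}}_{\nabla \cdot \vec{V}_0}
   + \frac{\partial}{\partial x}\left(\frac{1}{2\alpha^2} \frac{\partial \lambda}{\partial x}\right)
   + \frac{\partial}{\partial y}\left(\frac{1}{2\beta^2} \frac{\partial \lambda}{\partial y}\right)
   + \frac{\partial}{\partial z}\left(\frac{1}{2\gamma^2} \frac{\partial \lambda}{\partial z}\right).
\end{aligned}
$$

Moving $\nabla \cdot \vec{V}_0$ to the right-hand side gives the Poisson-form equation for $\lambda$. If the weights are constant and equal, $\alpha = \beta = \gamma$, this reduces to the standard Poisson equation

$$
\nabla^2 \lambda = -2\alpha^2\, \nabla \cdot \vec{V}_0 .
$$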

2.1.2.4 Boundary conditions

To make the problem well-defined we need to specify the boundary conditions. We split the boundary into two different parts: one for the free (also called flow-through) boundary, $\partial_1 V$, and one for the buildings and terrain, $\partial_2 V$. For a 2D description, see figure 6.

Figure 6: 2D figure describing the two different boundary parts. $\partial_1 V$ points to the free boundary and $\partial_2 V$ points to the buildings and terrain part of the boundary.

$$
\partial V = \partial_1 V \cup \partial_2 V.
$$

For the free, flow-through, boundary, $\partial_1 V$, we use $\lambda = 0$, which means that the velocity field can vary freely there.

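For the other part of the boundary, $\partial_2 V$, we have a no-flow-through boundary, for which the boundary condition is found by studying the normal derivative part of the Euler-Lagrange equations (15)-(18) [3],

$$
\begin{aligned}
\frac{\partial \lambda}{\partial n} :={}& \frac{\partial \lambda}{\partial x} \vec{n}_x + \frac{\partial \lambda}{\partial y} \vec{n}_y + \frac{\partial \lambda}{\partial z} \vec{n}_z \\
={}& 2\alpha^2 (u - u_0) \vec{n}_x + 2\beta^2 (v - v_0) \vec{n}_y + 2\gamma^2 (w - w_0) \vec{n}_z,
\end{aligned}
\tag{20}
$$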

where $\vec{n}_x$, $\vec{n}_y$ and $\vec{n}_z$ are the components of the boundary normal in the x-, y- and z-direction respectively.

2.1.2.5 The final formulation

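In summary, we have a Poisson-like equation, equation (19),

$$
\frac{\partial}{\partial x}\left(\frac{1}{2\alpha^2} \frac{\partial \lambda}{\partial x}\right) + \frac{\partial}{\partial y}\left(\frac{1}{2\beta^2} \frac{\partial \lambda}{\partial y}\right) + \frac{\partial}{\partial z}\left(\frac{1}{2\gamma^2} \frac{\partial \lambda}{\partial z}\right) = -\nabla \cdot \vec{V}_0
$$

with boundary conditions

$$
\begin{aligned}
\lambda &= 0 && \text{on } \partial_1 V \text{ (flow-through)}, \\
\frac{\partial \lambda}{\partial n} &= 2\alpha^2 (u - u_0) \vec{n}_x + 2\beta^2 (v - v_0) \vec{n}_y + 2\gamma^2 (w - w_0) \vec{n}_z && \text{on } \partial_2 V \text{ (no-flow-through)}.
\end{aligned}
$$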

2.2 Geometry

To set up the grid we first find the mean values of the wind velocities using data from e.g. a numerical weather forecast or from observations made at the site. The direction of this mean wind is used to set up the domain, which will be a rectangle aligned with this direction, see figure 7. The buildings are generated by importing height data, currently only from shapefiles (a shapefile is a collection of files [5]), and if a building cuts the domain boundary the part outside is ignored [3].

The grid can now be generated and is aligned with the direction of the mean wind. There are three different types of cells in the grid, see figure 8. An empty cell is a cell which has no part of a building in it (white), a partly filled cell is a cell which has a part of a building in it (red) and a full cell is a cell which consists fully of a building (grey) [3].

Figure 7: Figure describing how the domain is set up. The figure is taken from [3] with permission. The (red) dot is a point of emission which is not relevant in this context.

Figure 8: Figure describing the different cells. The figure is taken from [3] with permission.

2.3 The discretization of the problem

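Equation (19) is solved using a finite difference method to approximate the solution in all grid points. For a free point a 7-point stencil of second order in all spatial directions is used [10],

$$
\frac{\lambda_{i+1,j,k} - 2\lambda_{i,j,k} + \lambda_{i-1,j,k}}{2\alpha^2 (\Delta x)^2}
+ \frac{\lambda_{i,j+1,k} - 2\lambda_{i,j,k} + \lambda_{i,j-1,k}}{2\beta^2 (\Delta y)^2}
+ \frac{\lambda_{i,j,k+1} - 2\lambda_{i,j,k} + \lambda_{i,j,k-1}}{2\gamma^2 (\Delta z)^2}
= -f_0,
\tag{21}
$$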

where $f_0$ is the divergence, defined as

$$
f_0 = \frac{u_{0,i+1,j,k} - u_{0,i-1,j,k}}{2\Delta x}
+ \frac{v_{0,i,j+1,k} - v_{0,i,j-1,k}}{2\Delta y}
+ \frac{w_{0,i,j,k+1} - w_{0,i,j,k-1}}{2\Delta z}.
$$

The stencil is shown in figure 9.

Figure 9: The used 7-point stencil for the discretization of a free point.
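As a hedged illustration of equation (21) and of the definition of $f_0$, the sketch below evaluates both at one free point of a small grid stored as nested lists. The function names and the data layout are assumptions for this example (NUD assembles a sparse system matrix instead), and the weights are taken as constants.

```python
def stencil_lhs(lam, i, j, k, dx, dy, dz, alpha, beta, gamma):
    """Left-hand side of equation (21) at the free point (i, j, k)."""
    c = lam[i][j][k]
    term_x = (lam[i + 1][j][k] - 2 * c + lam[i - 1][j][k]) / (2 * alpha ** 2 * dx ** 2)
    term_y = (lam[i][j + 1][k] - 2 * c + lam[i][j - 1][k]) / (2 * beta ** 2 * dy ** 2)
    term_z = (lam[i][j][k + 1] - 2 * c + lam[i][j][k - 1]) / (2 * gamma ** 2 * dz ** 2)
    return term_x + term_y + term_z

def divergence_f0(u0, v0, w0, i, j, k, dx, dy, dz):
    """Central-difference divergence f0 of the initial wind field."""
    return ((u0[i + 1][j][k] - u0[i - 1][j][k]) / (2 * dx)
            + (v0[i][j + 1][k] - v0[i][j - 1][k]) / (2 * dy)
            + (w0[i][j][k + 1] - w0[i][j][k - 1]) / (2 * dz))

def sample(f, n, dx, dy, dz):
    """Sample f(x, y, z) on an n x n x n grid with the given spacings."""
    return [[[f(i * dx, j * dy, k * dz) for k in range(n)]
             for j in range(n)] for i in range(n)]
```

Because the stencil is of second order, it is exact for quadratic functions: for $\lambda = x^2 + 2y^2 + 3z^2$ the left-hand side equals $1/\alpha^2 + 2/\beta^2 + 3/\gamma^2$ up to rounding, which gives a simple check of an implementation.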

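The boundary conditions involving the normal derivative have a different discretization, which may vary between first and second order. Ideally the discretization is of second order and uses two additional points in the normal direction,

$$
\left.\frac{\partial \lambda}{\partial n}\right|_{i+1} = \frac{3\lambda_{i+1} - 4\lambda_i + \lambda_{i-1}}{2\Delta x},
\tag{22}
$$

or

$$
\left.\frac{\partial \lambda}{\partial n}\right|_{i-1} = \frac{-3\lambda_{i-1} + 4\lambda_i - \lambda_{i+1}}{2\Delta x},
\tag{23}
$$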

depending on whether we need to use a forward or a backward difference. The same equations can be used for all directions by changing the derivative from $x$ to $y$ or $z$ and the index from $i$ to $j$ or $k$. The discretization is only "mostly" of second order because, if two buildings are just one grid point away from each other, a three-point discretization cannot be used. In that case we have no choice but to use a first order discretization. The first order discretization is

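$$
\left.\frac{\partial \lambda}{\partial x}\right|_{i+1} = \frac{\lambda_{i+1} - \lambda_i}{\Delta n},
\tag{24}
$$

or

$$
\left.\frac{\partial \lambda}{\partial x}\right|_{i-1} = \frac{\lambda_i - \lambda_{i-1}}{\Delta n},
\tag{25}
$$

again depending on whether we want to use a forward or a backward difference. By the same reasoning as above these equations can also be used for all directions, $x$, $y$ and $z$.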

This discretization yields a diagonally dominant but not fully symmetric system matrix. Any asymmetries are due to the presence of boundaries. In addition, the matrix depends on the order in which the grid is traversed when the system matrix is generated.
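How a one-sided boundary row breaks the symmetry can be seen in a small 1D analogue. The sketch below is an illustration under stated assumptions, not NUD's matrix: the function names are hypothetical, the Dirichlet value at the left end is assumed to be eliminated from the unknowns, and the last row uses the second order one-sided difference of equation (22).

```python
def model_matrix(n, h):
    """1D model with n >= 3 unknowns: interior rows use the second difference,
    the last row uses the one-sided normal derivative of equation (22)."""
    A = [[0.0] * n for _ in range(n)]
    for r in range(n - 1):
        A[r][r] = -2.0 / h ** 2
        if r > 0:
            A[r][r - 1] = 1.0 / h ** 2
        A[r][r + 1] = 1.0 / h ** 2
    A[n - 1][n - 1] = 3.0 / (2 * h)
    A[n - 1][n - 2] = -4.0 / (2 * h)
    A[n - 1][n - 3] = 1.0 / (2 * h)
    return A

def is_symmetric(A, tol=1e-12):
    """True if A[r][c] equals A[c][r] for all entries, within tol."""
    n = len(A)
    return all(abs(A[r][c] - A[c][r]) <= tol
               for r in range(n) for c in range(r + 1, n))
```

The interior rows alone form a symmetric tridiagonal block. The boundary row couples the last unknown to the two before it with coefficients that have no counterpart in the corresponding interior rows, so `is_symmetric(model_matrix(6, 0.1))` is false.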

2.4 Multigrid methods

The multigrid method is not a solver by itself, but a technique to make other solvers behave better and thus converge faster. In this section the theory of multigrid methods will be handled.

2.4.1 The idea of multigrid

When using a standard iterative solver, such as Gauss-Seidel or Successive Over-Relaxation (SOR), the error reduction tends to fall off as more iterations are done. On a fine mesh this behaviour is easier to observe, and in fact many solvers behave better on coarser grids [11]. Studying this we find that the behaviour depends on the frequency content of the error. From figures 10-12 (all generated using a MATLAB script from the University of Cambridge [12]) we can study the frequency dependence. In this example the 1D Poisson equation,

$$
u'' = -x + 100 \cos(24\pi x) + 50000 \cos(1000\pi x),
\tag{26}
$$

is solved using a Gauss-Seidel method on a grid containing 2048 points. This function is chosen so that we can see a few different frequencies [12]. The upper (blue) curve shows the analytical solution and the lower (red) curve shows the current solution of the iterative solver. In the lower window we can see the error. In figure 10 we see the solution after 100 and 500 iterations, and in the error plot we can still see the high frequency behaviour in both cases. In the next figure, figure 11, we see iterations 1200 and 2000. Here the high frequency behaviour is almost gone from the error plot. Lastly, in figure 12, we see the solution after 5000 and 100000 iterations, and in the error plot it is almost impossible to see any high frequency behaviour. It might be counter-intuitive that the small-scale, high frequency behaviour is what is easy to capture, but this is indeed what this tells us [13]. The high frequency part of the error smooths out faster than the low frequency part.
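The frequency dependence can also be reproduced with a few lines of Python. The sketch below is a simplified stand-in for the MATLAB script of [12], not a port of it: it applies Gauss-Seidel to the homogeneous problem $u'' = 0$ with zero boundary values, so that the iterate itself is the error, and starts from a single sine mode. The function names and the grid size are assumptions for this example.

```python
import math

def gauss_seidel_sweeps(e, sweeps):
    """Gauss-Seidel sweeps for u'' = 0 with zero boundary values.

    Since the exact solution is zero, the iterate e is the error itself.
    """
    e = list(e)
    for _ in range(sweeps):
        for i in range(1, len(e) - 1):
            e[i] = 0.5 * (e[i - 1] + e[i + 1])
    return e

def mode(k, n):
    """Sine mode with wave number k on a grid with n intervals."""
    return [math.sin(k * math.pi * i / n) for i in range(n + 1)]

def reduction(k, n=64, sweeps=10):
    """Ratio of the maximum error after and before the sweeps."""
    e0 = mode(k, n)
    e = gauss_seidel_sweeps(e0, sweeps)
    return max(abs(v) for v in e) / max(abs(v) for v in e0)
```

Comparing `reduction(1)` with `reduction(32)` shows the effect described above: after ten sweeps the smooth mode is almost untouched, while the oscillatory mode has lost most of its amplitude.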

Figure 10: This figure shows an analytical (upper) and iterative (lower) solution, for 100 (left) and 500 (right) iterations, of equation (26) and is used to explain the frequency dependence of an iterative solver. The lower window contains the error plot. Figure generated using a MATLAB script from the University of Cambridge [12].

Figure 11: This figure shows an analytical (upper) and iterative (lower) solution, for 1200 (left) and 2000 (right) iterations, of equation (26) and is used to explain the frequency dependence of an iterative solver. The lower window contains the error plot. Figure generated using a MATLAB script from the University of Cambridge [12].

Figure 12: This figure shows an analytical (upper) and iterative (lower) solution, for 5000 (left) and 100000 (right) iterations, of equation (26) and is used to explain the frequency dependence of an iterative solver. The lower window contains the error plot. Figure generated using a MATLAB script from the University of Cambridge [12].


A multigrid method uses this behaviour to its advantage as it moves between different grids, from finer to coarser and back again. When going from a finer grid to a coarser one the error will behave differently: if the error was smooth on the finer grid it need not be smooth on the coarser one. This is because what is considered a low frequency error on the fine grid might be of high frequency on the coarser grid. Hence, if we (after some iterations, when the error has been smoothed out) go from a fine grid to a coarser one, we can run a few iterations on this coarser grid to smooth out the errors on this scale. If we then went back to the fine grid, that error would be smoothed out there as well. This is called a global correction and is what the multigrid method is based on. The procedure can be repeated recursively to go through many coarse grids until we reach the coarsest level. This process is shown in figure 13. Going from a finer to a coarser grid is called a restriction, and going from a coarser grid to a finer one is called a prolongation (or interpolation).

Figure 13: Multilayer scheme to explain the concept of multigrid methods.

This process, of cycling between finer and coarser grids, is called a multigrid cycle. A multigrid cycle can be performed in different ways, and to fully understand the ins and outs of this a deeper study is needed.

2.4.2 A deeper understanding

The most essential steps of a multigrid method are the processes of the coarse grid correction: the prolongation and the restriction. Both the prolongation and the restriction are forms of interpolation, and what needs to be done is to define these interpolation operators. The interpolation operators can be set up and then applied to the system matrix to perform the coarse grid correction.

From here on we always talk about a two-level method that can be applied recursively, unless otherwise stated. The system of equations we want to solve is given by

$$
A_h u^h = f^h,
\tag{27}
$$

where A is the system matrix, u is the solution vector and f is the right handside vector. The index, h, denotes that we are on the fine grid and H the coarsergrid. Using an interpolation operator, IhH , we can map vectors from H to h,and the other way around using another interpolation operator IHh = (IhH)>.We can then define the coarser level system matrix as [14],

AH = IHh AhIhH . (28)

Then, a two-level correction looks like,

uhnew = uhold + IhHeH , (29)

where eH is the correction (or error) vector given by the solution of

AHeH = rH , (30)

where rH = IHh rhold = IHh (fh − Ahu

hold) is the vector of coarse level residuals

[14].
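The two-level correction of equations (28) to (30) can be sketched in a few lines of Python. This is only an illustration under stated assumptions and not the amg1r6 implementation: it uses a one dimensional Poisson matrix, linear interpolation as the prolongation and weighted Jacobi as the smoother, and all function names are hypothetical.

```python
import numpy as np

def poisson_1d(n):
    """Tridiagonal 1D Poisson matrix (2 on the diagonal, -1 next to it)."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def linear_prolongation(n_coarse):
    """Linear interpolation from n_coarse points to 2*n_coarse + 1 fine points."""
    P = np.zeros((2 * n_coarse + 1, n_coarse))
    for j in range(n_coarse):
        i = 2 * j + 1                        # fine index of coarse point j
        P[i, j] = 1.0
        P[i - 1, j] = 0.5
        P[i + 1, j] = 0.5
    return P

def jacobi(A, u, f, sweeps, omega=2.0 / 3.0):
    """Weighted Jacobi smoothing (assumed smoother, chosen for brevity)."""
    d = np.diag(A)
    for _ in range(sweeps):
        u = u + omega * (f - A @ u) / d
    return u

def two_level_cycle(A_h, f_h, u_h, P, sweeps=2):
    R = P.T                                  # restriction as the transpose
    A_H = R @ A_h @ P                        # coarse matrix, eq. (28)
    u_h = jacobi(A_h, u_h, f_h, sweeps)      # smooth the error first
    r_H = R @ (f_h - A_h @ u_h)              # coarse level residual
    e_H = np.linalg.solve(A_H, r_H)          # coarse correction, eq. (30)
    u_h = u_h + P @ e_H                      # update, eq. (29)
    return jacobi(A_h, u_h, f_h, sweeps)     # smooth again after correcting

n_coarse = 15
P = linear_prolongation(n_coarse)
A_h = poisson_1d(2 * n_coarse + 1)
f_h = np.ones(A_h.shape[0])
u_h = np.zeros_like(f_h)
for cycle in range(5):
    u_h = two_level_cycle(A_h, f_h, u_h, P)
    print(cycle + 1, np.linalg.norm(f_h - A_h @ u_h))
```

In a full multigrid method the direct solve on the coarse level would itself be replaced by a recursive call on an even coarser level.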

There are two different types of multigrid: algebraic and geometric. In geometric multigrid we work on a hierarchy of grids, with one grid set up for every level, and the needed operators are set up from geometric information [14]. In algebraic multigrid, or rather multilevel [15] (as we have no grids), we only have the initial grid, which is used to set up the initial system matrix. After that, all the interpolation operators are set up from the coefficients (and their signs) in the matrix.

2.4.2.1 Different meanings of smooth, geometric and algebraic

When studying these multigrid methods the word smoothing is used in both, but what does it mean in each situation?

The geometric one is easier to get a grip on, as we have grids on which something actually can be smooth. Smooth in this context is not the mathematical definition of smooth, but is meant in a geometrical sense; see figure 12, where the error plot "looks smooth". If the error is smooth relative to the next pre-defined coarser grid, this is sufficient to say that the error is indeed smooth.

In contrast, for the algebraic one we have no grids, so we cannot use the same definition of smoothness. Instead we define a new property, algebraic smoothness: an error is algebraically smooth if it converges slowly on the current level. Such an error has to be approximated on a coarser level to speed up convergence. This is important for an algebraic multigrid, as this property is what determines the convergence speed [15].



2.4.2.2 Different types of multigrid, geometric and algebraic

In the early development of multigrid the geometric multigrid was dominant, and the coarsening strategy was often a simple doubling of the mesh size, $h \to 2h$. With such a strategy we know that isotropic errors are smoothed in both directions. However, for anisotropic errors we only see smoothing in the direction of strong couplings [14]. Thus, a typical $h \to 2h$ coarsening will not suffice for most problems. For efficient smoothing and fast multigrid convergence we need good interplay between prolongation and restriction.

Geometric multigrid becomes harder on more complex grids. For complex flows in the car industry, for example in cooling or heating systems under the hood or in the vehicle interior, the grids are typically very complex and unstructured [14]. In such a case it would be impossible to use a normal $h \to 2h$ coarsening, and one would have to study the grid further to set up suitable interpolation operators.

On the other hand, for the algebraic version of multigrid all operators are based on coefficients in the system matrix. Because of this the interplay between interpolation and restriction operators is not as important, since this method has no fixed way in which the smoothing needs to be done. The AMG chooses its interpolation in the direction in which the error is already smooth. This is based only on information from the current system matrix, and the coarser levels are thus locally adapted [15]. In this way the algebraic multigrid is more of a black box solver than the geometric one, and also more flexible for complex problems.

When using a geometric multigrid one makes use of the grid and can thus inspect it to see which points are suitable to interpolate. For example, one could look at normal directions: if many points close to each other have the same normal direction, one could guess that they belong to, say, a roof or a wall of a building. Using the algebraic version one instead has to look at similarities in the system matrix. At an edge between a wall and a roof, this could lead to a roof point and a wall point being interpolated into one point, which is not suitable. This can be seen as the normal vectors of the roof and the wall being merged into one, so that the house is no longer rectangular but has a slanted connection between the wall and the roof. This example shows that the choice of interpolation and restriction operators is very important.

When comparing geometric and algebraic multigrid one also has to account for memory usage. For the geometric one all the grids are created and stored in memory; using the grids we then create a system matrix and transfer the errors from finer to coarser levels. For the algebraic one, once the system matrix is set up all the work is done numerically on the system matrix, and grids are irrelevant. From the system matrix one creates interpolation operators, and those are used to create the coarse-level system matrices in the way described above.



3 Numerics and implementation

The work we have done is to integrate the already existing AMG solver (amg1r6) into the NUD code. Doing this we ran into a few problems. This section describes the problems and how they were overcome.

One thing which, unexpectedly, took a lot of time was implementing different types of sparse representations and performing different matrix manipulations on them. This was needed because the PARDISO solver (the solver which was already in use) requires its input in one sparse representation, compressed sparse row (CSR), and the AMG solver in another, sparse skyline (SSK). In the NUD program the matrices were already in CSR format, so in order to call the AMG solver we needed to convert the matrix representation from CSR to SSK format (and back when displaying the results).

It also turned out that the boundary conditions required modifications. This in turn required the system matrix to be manipulated; in particular, rows and columns had to be identified and removed.

Both the NUD and AMG programs were written in Fortran, so it was natural to continue in Fortran. Hence, both the discrete Poisson problem and the parts needed to make the transition from PARDISO to AMG for NUD are written in Fortran. Analysis of the algorithms has been done in MATLAB to confirm that they work as intended.

3.1 Sparse matrices

For large problems such as this one, generated by a finite difference method (or a finite element or finite volume method, for that matter), the system matrix is often very large but contains a lot of zeros; such a matrix is called sparse. Handling these large matrices in the standard way, as a two dimensional array, is a waste of memory and sometimes not even possible. An alternative is to represent the matrix without the zeros, a sparse representation. This is done by using one vector with all the non-zero elements and other vectors that point to where the elements would lie in the full matrix; all other elements are then considered to be zero. Such a representation requires far less memory, as we have at most a few vectors of size "number of non-zero elements" (from here on referred to as nnz), instead of a full matrix of $N^2$ elements, where $N$ is the number of nodal points [16].

3.1.1 Compressed Sparse Row

The compressed sparse row format, used by the PARDISO solver, makes use of the first non-zero element on each row. This format uses three vectors to represent the full matrix: two vectors of size nnz, called $a_{CSR}$ and $ja_{CSR}$, and one of size $N + 1$, called $ia_{CSR}$.

The first vector, $a_{CSR}$, holds all the non-zero elements of the full matrix, stored row by row. Secondly, $ja_{CSR}$ holds the column that the corresponding element in $a_{CSR}$ has in the full matrix. Lastly, $ia_{CSR}$ points to the index of the first element of each row in $a_{CSR}$, which means that $ia_{CSR}(i+1) - ia_{CSR}(i)$ gives the number of non-zero elements on row $i$. Also, $ia_{CSR}(N+1) = nnz + 1$, so that one can get information about the length of the last row as well.
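As a small illustration of the three CSR vectors, the following Python sketch builds them from a dense matrix. It keeps the 1-based numbering used in the text (as in Fortran); the function name and the 3 × 3 example matrix are chosen here only for illustration.

```python
def dense_to_csr(dense):
    """Return (a, ja, ia) for a dense matrix, with 1-based ja and ia."""
    a, ja, ia = [], [], [1]
    for row in dense:
        for col, value in enumerate(row, start=1):
            if value != 0:
                a.append(value)      # non-zero values, row by row
                ja.append(col)       # column of each stored value
        ia.append(len(a) + 1)        # start of the next row
    return a, ja, ia

dense = [
    [4, -1, 0],
    [-1, 4, -1],
    [0, -1, 4],
]
a, ja, ia = dense_to_csr(dense)
print(a)   # [4, -1, -1, 4, -1, -1, 4]
print(ja)  # [1, 2, 1, 2, 3, 2, 3]
print(ia)  # [1, 3, 6, 8]
```

Here nnz = 7, the last entry of ia is nnz + 1 = 8, and ia(3) − ia(2) = 3 is the number of non-zero elements on the second row.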

3.1.2 Skyline Sparse

The skyline sparse format, used by the AMG solver, is quite similar to CSR but instead makes use of the diagonal element on each row. Here too we use two vectors of size nnz, called $a_{SSK}$ and $ja_{SSK}$, and one of size $N + 1$, called $ia_{SSK}$. The format is often used to store symmetric matrices, as it then requires even less memory because only one triangular part of the matrix, either upper or lower, needs to be stored [17]. In this case, however, it is used to store both the lower and upper triangular parts, as required by the AMG solver.

As mentioned, both representations are quite similar. Here $a_{SSK}$ also holds all the non-zero elements, but not entirely row by row, as the diagonal element of each row is moved to become the first element of that row. $ja_{SSK}$ still holds the column that the corresponding element in $a_{SSK}$ has in the full matrix. For $ia_{SSK}$ there is another difference: in this format it points to the index of the diagonal element of each row in $a_{SSK}$. This means that $ja_{SSK}(ia_{SSK}(i)) = i$, which is the column of the diagonal element itself. For this format $ia_{SSK}(N+1) = nnz + 1$ still holds, and one can still use $ia_{SSK}(i+1) - ia_{SSK}(i)$ to find the number of non-zero elements on a row.

3.1.3 Conversion from CSR to SSK

From the above description we conclude that the difference lies in whether the first element or the diagonal element of a row is stored first. So how do we go between these two representations? If we can find the positions of the first and the diagonal element of a row, we put the diagonal element in the place of the first element. After this we loop over the elements that were to the left of the diagonal, from left to right, and put them, in order, just after the diagonal element (which now is first on that row). This means that a change in representation is only needed if the diagonal element is not already the first element on a given row. Afterwards, the diagonal element is first on that row, the element which used to be first is now second, and so on. Elements which lie to the right of the diagonal element are stored in the same place in both formats, and are thus untouched when changing representation. For an algorithm of how to go from CSR to SSK, see algorithm 1.



Algorithm 1 Conversion algorithm

1. For each row
   1.1. Find the diagonal element and remember its position.
   1.2. Put the diagonal element at the place of the first element, also change the numbering in ja.
   1.3. For each element in-between the first and diagonal element, from left to right.
        1.3.1. Put elements, in order, just after the diagonal element, also change the numbering in ja.
   1.4. For each element after the diagonal.
        1.4.1. Do nothing, as these elements will already have the correct position on the row.
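The steps of algorithm 1 can be sketched in Python as follows. This is an illustrative sketch and not the Fortran routine used in NUD; it assumes 1-based ja and ia vectors as in the text and that every row stores its diagonal element. The function name is hypothetical.

```python
def csr_to_ssk(a, ja, ia):
    """Move the diagonal entry to the front of every row (1-based ja, ia)."""
    a_ssk, ja_ssk = list(a), list(ja)
    n_rows = len(ia) - 1
    for row in range(1, n_rows + 1):
        start = ia[row - 1] - 1              # 0-based position of first entry
        end = ia[row] - 1                    # one past the last entry
        cols = ja[start:end]
        if row not in cols:
            raise ValueError(f"row {row} has no stored diagonal element")
        diag = start + cols.index(row)       # step 1.1
        # Steps 1.2 and 1.3: diagonal first, then the entries left of it.
        order = [diag] + list(range(start, diag))
        for offset, src in enumerate(order):
            a_ssk[start + offset] = a[src]
            ja_ssk[start + offset] = ja[src]
        # Step 1.4: entries right of the diagonal stay where they are.
    return a_ssk, ja_ssk, list(ia)

# CSR vectors of the 3 x 3 matrix [[4,-1,0],[-1,4,-1],[0,-1,4]].
a = [4, -1, -1, 4, -1, -1, 4]
ja = [1, 2, 1, 2, 3, 2, 3]
ia = [1, 3, 6, 8]
a_ssk, ja_ssk, ia_ssk = csr_to_ssk(a, ja, ia)
print(a_ssk)   # [4, -1, 4, -1, -1, 4, -1]
print(ja_ssk)  # [1, 2, 2, 1, 3, 3, 2]
```

After the conversion the first stored column of each row equals the row number, which is the property $ja_{SSK}(ia_{SSK}(i)) = i$ described for the skyline format.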

3.2 The problem of asymmetries

When we first started to replace the PARDISO solver with the AMG solver it went mostly fine. However, once the implementation was done, all bugs were resolved and the code appeared to be running as designed, we still did not get the answer that we wanted: we got mostly NaN (Not a Number). We realised that for the AMG solver symmetry of the system matrix is essentially a must, and also that the rows should preferably have a positive sum (or sum to 0). To test how sensitive the AMG solver was to asymmetries we tried out methods to make the system matrix symmetric. We also constructed a controlled toy problem to perform tests.
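The two properties mentioned here, symmetry and non-negative row sums, are easy to check before a matrix is handed to the solver. The Python sketch below does this for a small dense matrix; the function name and tolerance are assumptions made for illustration, and for realistic system sizes the same check would have to be done on the sparse representation.

```python
import numpy as np

def check_matrix(A, tol=1e-12):
    """Return (is_symmetric, has_nonnegative_row_sums) for a dense matrix."""
    A = np.asarray(A, dtype=float)
    symmetric = bool(np.allclose(A, A.T, atol=tol))
    nonneg_row_sums = bool(np.all(A.sum(axis=1) >= -tol))
    return symmetric, nonneg_row_sums

symmetric_case = [[4, -1, 0], [-1, 4, -1], [0, -1, 4]]
perturbed_case = [[4, -1, 0], [-1, 4, -1], [-2, -1, 4]]   # one extra entry
print(check_matrix(symmetric_case))   # (True, True)
print(check_matrix(perturbed_case))   # (False, True)
```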

3.2.1 Dirichlet only

We studied the discretization of the boundary conditions and how it could be modified to make the system matrix symmetric. The first thing we tried was to change all the boundary conditions to Dirichlet conditions. This is not the physically correct problem we wanted to solve, but it was an easy first step to make sure that the solver could handle larger symmetric problems, and also to see that NUD was implemented in such a way that the row sum is either 0 or positive.

However, using Dirichlet conditions will not automatically generate a symmetric system matrix; we have to make use of a "trick" to make it symmetric. This trick is to move all the boundary elements to the right hand side. The reason why the matrix is not symmetric to start with is that if a free (Laplace) element has a boundary element as its neighbor, it will use the boundary element in its discretization (on its row in the system matrix). But the boundary point (assuming Dirichlet) will not use the free element in its discretization. This might also hold for a Neumann or Robin boundary condition, depending on how the discretization is done.

Why this "trick" works is described below. A Dirichlet boundary condition is a boundary condition where the value of the element is known, and thus we do not need to apply a finite difference scheme at such points. This means that in the system matrix a row with a Dirichlet condition will have only one element. Hence, when a free (Laplace) element uses a Dirichlet boundary element in its discretization (on its row in the system matrix), the value of this element is already known. Due to this, we can move that element to the right hand side. In fact, all the elements in the same column as the Dirichlet element can be moved to the right hand side, as all of them multiply a known value. As described above, the asymmetries appear when a free element uses the boundary element in its discretization, but not the other way around. By shifting the boundary element to the right hand side we get around this problem and the matrix becomes symmetric.

Below, in figure 14, we see an example of the full matrix including Dirichlet boundary elements, and in figure 15 we show an example of the reduced matrix where the boundary elements have been shifted to the right hand side. Both figures show spy-plots from MATLAB, where each dot is a value and all white space is full of zeros.

For an algorithmic description, see algorithm 2.

Algorithm 2 Dirichlet Trick

1. Find a row containing a Dirichlet element.
   1.1. Find all columns containing this Dirichlet element.
   1.2. Move those elements to the corresponding position in the right hand side.
   1.3. Save the solution for this row and remove the row from the system matrix.
2. Repeat from 1. until all Dirichlet elements are found.
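A compact Python sketch of the Dirichlet trick is given below, using a dense matrix for readability. It is not the Fortran routine used in NUD: the function name is hypothetical, and it assumes that the Dirichlet rows are known in advance and that each of them has only its diagonal element. The example is a one dimensional problem with five points, where the first and last points are Dirichlet points with values 1 and 3.

```python
import numpy as np

def eliminate_dirichlet(A, b, dirichlet_rows):
    """Shift known Dirichlet values to the right hand side and drop their rows."""
    A = np.array(A, dtype=float)
    b = np.array(b, dtype=float)
    # A Dirichlet row reads A[row, row] * u[row] = b[row], so u[row] is known.
    known = {row: float(b[row] / A[row, row]) for row in dirichlet_rows}
    for col, value in known.items():
        b -= A[:, col] * value               # move the whole column to the rhs
    free = [i for i in range(A.shape[0]) if i not in known]
    return A[np.ix_(free, free)], b[free], known

A_full = [
    [1, 0, 0, 0, 0],       # Dirichlet row
    [-1, 2, -1, 0, 0],
    [0, -1, 2, -1, 0],
    [0, 0, -1, 2, -1],
    [0, 0, 0, 0, 1],       # Dirichlet row
]
b_full = [1, 0, 0, 0, 3]
A_red, b_red, known = eliminate_dirichlet(A_full, b_full, [0, 4])
print(A_red)                               # symmetric 3 x 3 matrix
print(b_red)                               # [1. 0. 3.]
print(np.linalg.solve(A_red, b_red))       # [1.5 2.  2.5]
```

The full matrix is not symmetric, since the free rows refer to the boundary columns but not the other way around, while the reduced matrix is.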

3.2.2 Discrete Poisson

We implemented a two dimensional discrete Poisson problem [18]. Using a square grid, where $n$ is the number of points on one side of the grid ($N = n^2$ is the number of nodal points), and the same spacing in both dimensions yields the following discretisation for a free point,

$$(\nabla^2 u)_{i,j} = \frac{1}{(\Delta x)^2}\left(u_{i+1,j} + u_{i-1,j} + u_{i,j+1} + u_{i,j-1} - 4u_{i,j}\right) = g_{i,j}, \tag{31}$$

where $2 \le i \le n-1$ and $2 \le j \le n-1$. For $i = 1$, $i = n$, $j = 1$ or $j = n$ we have Dirichlet boundary conditions. This gives a linear system,

$$A\vec{x} = \vec{b}.$$

Using the trick described above and shifting the boundary elements to the right hand side gives a symmetric $(n-2)^2 \times (n-2)^2$ matrix $A$ and a vector $\vec{b}$. An example of the shape and symmetry of the matrix, using $n = 5$, can be seen below. This example is the same as the one used in the MATLAB spy-plots, see figures 14 and 15. We can see that all the $u$-elements in the right hand side (in $\vec{b}$) have an index of either 1 or 5, which tells us that those are indeed boundary elements which have been shifted.

Figure 14: Spy-plot from MATLAB showing the full discrete Poisson problem, including Dirichlet boundary elements.

Figure 15: Spy-plot from MATLAB showing the reduced discrete Poisson problem, where the boundary elements have been shifted to the right hand side.


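$$A = \begin{pmatrix}
4 & -1 & 0 & -1 & 0 & 0 & 0 & 0 & 0 \\
-1 & 4 & -1 & 0 & -1 & 0 & 0 & 0 & 0 \\
0 & -1 & 4 & 0 & 0 & -1 & 0 & 0 & 0 \\
-1 & 0 & 0 & 4 & -1 & 0 & -1 & 0 & 0 \\
0 & -1 & 0 & -1 & 4 & -1 & 0 & -1 & 0 \\
0 & 0 & -1 & 0 & -1 & 4 & 0 & 0 & -1 \\
0 & 0 & 0 & -1 & 0 & 0 & 4 & -1 & 0 \\
0 & 0 & 0 & 0 & -1 & 0 & -1 & 4 & -1 \\
0 & 0 & 0 & 0 & 0 & -1 & 0 & -1 & 4
\end{pmatrix}$$

$$\vec{b} = \begin{pmatrix}
-\Delta x^2 g_{2,2} + u_{1,2} + u_{2,1} \\
-\Delta x^2 g_{3,2} + u_{3,1} \\
-\Delta x^2 g_{4,2} + u_{5,2} + u_{4,1} \\
-\Delta x^2 g_{2,3} + u_{1,3} \\
-\Delta x^2 g_{3,3} \\
-\Delta x^2 g_{4,3} + u_{5,3} \\
-\Delta x^2 g_{2,4} + u_{1,4} + u_{2,5} \\
-\Delta x^2 g_{3,4} + u_{3,5} \\
-\Delta x^2 g_{4,4} + u_{5,4} + u_{4,5}
\end{pmatrix}$$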

This problem, where $A$ is symmetric and has non-negative row sums, was solvable using the AMG solver. But if we introduced just a few extra elements in only the lower or the upper triangular part of $A$, the solver quickly became unstable.
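The reduced matrix for $n = 5$ can be reproduced with a short Python sketch. The construction through Kronecker products is a standard way of building the 5-point Laplacian and is chosen here for brevity; it is not taken from the Fortran implementation, and the function name is hypothetical.

```python
import numpy as np

def reduced_poisson_matrix(n):
    """5-point Poisson matrix for the (n - 2)^2 free points of an n x n grid."""
    m = n - 2                                        # free points per side
    T = 2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)
    I = np.eye(m)
    return np.kron(I, T) + np.kron(T, I)

A = reduced_poisson_matrix(5)
print(A.astype(int))
print("symmetric:", np.allclose(A, A.T))
print("smallest row sum:", A.sum(axis=1).min())
```

The printed matrix has 4 on the diagonal and −1 for each neighboring free point; the only row that sums to zero is that of the centre point, which has no boundary neighbor.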



4 Results

In this section results from both the discrete Poisson and NUD problems are presented. For NUD we only consider the case with Dirichlet boundary conditions on all boundaries; the case of mixed Dirichlet and Neumann boundary conditions is presented in a "Future works" section.

All results from the AMG and PARDISO solvers are generated from Fortran programs and scripts. We trust the results of the PARDISO solver to be true. Some of the results are also confirmed using MATLAB, whose results we also trust to be true. All plots are generated in MATLAB, into which data from our simulations have been imported.

4.1 Discrete Poisson

We start off by considering the two dimensional discrete Poisson problem, see equation (31), with Dirichlet boundary conditions equal to zero and an all-zero right hand side, except for one element in the center of the grid. The solution to this problem looks like a peak in the center which decays to zero towards the boundaries. The solution for the AMG solver can be seen in figure 16, using $n = 30$. Solutions for the PARDISO solver and MATLAB's backslash operator can be seen in Appendix B.

Figure 16: Solution for the discrete Poisson problem using the AMG solver, n = 30.

The AMG solver used seven levels ("grids") of different resolution when solving the problem, and a total of 10 cycles (iterations) were needed to get these results. Table 1 below presents the residual and convergence factor for each cycle. The total residual is calculated as the sum of the squared residuals of each equation (each row in the system matrix), as

$$\sum_i \left(A_i \cdot \vec{x} - b_i\right)^2, \tag{32}$$

where $A_i$ is a vector of one row of the system matrix, $\vec{x}$ is the solution vector and $b_i$ is the corresponding element in the right hand side vector. The convergence factor is the ratio of the current and the previous residual and shows how much the solution has improved. Values in the table are rounded to avoid displaying too many decimal points. Compared to other solvers, 10 iterations (or rather cycles) may seem very few to reduce the error so quickly and get a good solution. But as described above, one cycle of a multigrid method is quite extensive.
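The residual of equation (32) and the convergence factor can be written out in Python as a small sketch. The function names are hypothetical, and the list of residuals simply repeats the rounded values for cycles 1 to 4 of table 1, so the computed factors agree with the table only up to rounding.

```python
import numpy as np

def total_residual(A, x, b):
    """Sum of squared row residuals, as in equation (32)."""
    r = np.asarray(A, dtype=float) @ np.asarray(x, dtype=float) - np.asarray(b, dtype=float)
    return float(np.sum(r ** 2))

def convergence_factors(residuals):
    """Ratio of each residual to the one from the previous cycle."""
    return [cur / prev for prev, cur in zip(residuals, residuals[1:])]

# Tiny example: the rows give residuals 1 and -1, so the total is 2.
print(total_residual([[2, 0], [0, 2]], [1, 1], [1, 3]))

# Rounded residuals of cycles 1 to 4 from table 1.
residuals = [0.159e0, 0.553e-2, 0.304e-3, 0.190e-4]
factors = convergence_factors(residuals)
print(factors)   # factors for cycles 2, 3 and 4
```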

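Table 1: Table showing residuals and convergence factors for a discrete Poisson problem using the AMG solver, n = 30.

| n | Cycle | Residual | Convergence factor |
|---|---|---|---|
| 30 | 0 | $0.349 \cdot 10^{2}$ | - |
| | 1 | $0.159 \cdot 10^{0}$ | $0.404 \cdot 10^{-2}$ |
| | 2 | $0.553 \cdot 10^{-2}$ | $0.347 \cdot 10^{-1}$ |
| | 3 | $0.304 \cdot 10^{-3}$ | $0.550 \cdot 10^{-1}$ |
| | 4 | $0.190 \cdot 10^{-4}$ | $0.624 \cdot 10^{-1}$ |
| | 5 | $0.128 \cdot 10^{-5}$ | $0.677 \cdot 10^{-1}$ |
| | 6 | $0.897 \cdot 10^{-7}$ | $0.699 \cdot 10^{-1}$ |
| | 7 | $0.623 \cdot 10^{-8}$ | $0.695 \cdot 10^{-1}$ |
| | 8 | $0.426 \cdot 10^{-9}$ | $0.684 \cdot 10^{-1}$ |
| | 9 | $0.287 \cdot 10^{-10}$ | $0.673 \cdot 10^{-1}$ |
| | 10 | $0.190 \cdot 10^{-11}$ | $0.662 \cdot 10^{-1}$ |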

We have already seen (in Appendix B) that for a discrete Poisson problem the solutions of the AMG solver, the PARDISO solver and MATLAB's backslash operator are very much alike. To see how fast the AMG solver is, timings were studied for a few different solvers on different system sizes. The problem was solved for system sizes starting from a lower limit of $n = 200$. The reason we start at $n = 200$ is that for smaller problems the times are so small that they vary between different runs (i.e. we are not measuring the CPU clock cycles for the solver only, but other processes start to dominate). The upper limit is solver dependent, as different solvers can handle different upper limits due to convergence errors or memory problems.

Even though we relate all the sizes to $n$ (the number of points along one side of the two dimensional grid), all figures are plotted against nnz (the number of non-zero elements in the matrix), which is what drives the computational cost.

In figure 17 we can see the timings for seven different solvers: the AMG solver, the PARDISO solver, MATLAB's backslash operator and two of MATLAB's iterative solvers, conjugate gradient and bi-conjugate gradient. Both of MATLAB's iterative solvers were run with and without a preconditioner (when a preconditioner was used it was the incomplete Cholesky factorization). Figure 17 shows results for all solvers for $n = 200$ up to $n = 900$ (which translates to nnz = 199200 and nnz = 4046400 non-zero elements). After $n = 900$ some solvers stop converging: the bi-conjugate gradient method failed both with and without preconditioning, and the conjugate gradient method failed without preconditioning.

Figure 17: CPU time versus nnz for multiple solvers. n going from 200 to 900.

The remaining solvers, AMG, PARDISO, MATLAB's backslash operator and the preconditioned conjugate gradient, all ran up to $n = 1200$ (nnz = 7195200), which can be seen in figure 18. Next, in figure 19, the preconditioned conjugate gradient was also removed to make a comparison of AMG, PARDISO and MATLAB's backslash operator easier.

In figure 20 we see timings of AMG and PARDISO, where AMG goes all the way to $n = 2050$, but PARDISO gets memory issues at $n = 1200$.

4.1.1 Sparsity

As mentioned, the systems of equations generated for problems like this are very sparse. In table 2 below, data about the system size and sparsity is presented. The columns of the table are: $n$ is the number of points on one side of the grid, $n^2$ is the number of equations (same as the number of nodal points), $n^4$ is the total number of elements in the full system matrix and nnz is the number of non-zero elements in the sparse representation of the system matrix. The sparsity column tells how many zero-elements there are for every non-zero element, e.g. for $n = 200$ there are about 8000 zero-elements for every non-zero element. Values in the table are rounded to avoid displaying too many decimal points. For a full table see Appendix C.
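The sizes in this section can be reproduced with a few lines of Python. The sketch below assumes that the matrix has $n^2$ rows with a 5-point stencil, which gives nnz = 5n² − 4n; this formula is not stated in the text but reproduces the nnz values quoted for n = 200, 900 and 1200. The function name is hypothetical.

```python
def poisson_sparsity(n):
    """Sizes for an n^2 x n^2 matrix with a 5-point stencil (assumed layout)."""
    equations = n ** 2
    total_elements = n ** 4
    nnz = 5 * n ** 2 - 4 * n         # assumption: 4n stencil legs fall outside
    return equations, total_elements, nnz, total_elements / nnz

for n in (200, 900, 1200):
    equations, total_elements, nnz, ratio = poisson_sparsity(n)
    print(n, equations, total_elements, nnz, round(ratio))
```

For n = 200 this gives nnz = 199200 and one non-zero element per roughly 8032 elements of the full matrix.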

Figure 18: CPU time versus nnz for AMG, PARDISO and MATLAB's pcg solver (using a preconditioner) and backslash operator. n going from 200 to 1200.

Figure 19: CPU time versus nnz, zoomed in on AMG, PARDISO and MATLAB's backslash operator for an easier comparison. n going from 200 to 1200.

In figure 21 the sparsity is presented in a graphical way.



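Figure 20: CPU time versus nnz for AMG and PARDISO. n going from 200 to 2050, where PARDISO cannot handle more than n = 1200.

Table 2: Part of the table showing different sizes of the system of equations set up for the problems, also showing the sparsity. For the full table see Appendix C.

| n | no. of equations, $n^2$ | no. of elements, $n^4$ | nnz | sparsity, $nnz/n^4$ |
|---|---|---|---|---|
| 200 | $4.00 \cdot 10^{4}$ | $1.60 \cdot 10^{9}$ | $1.99 \cdot 10^{5}$ | 1/8032 |
| 400 | $1.60 \cdot 10^{5}$ | $2.56 \cdot 10^{10}$ | $7.98 \cdot 10^{5}$ | 1/32064 |
| 600 | $3.60 \cdot 10^{5}$ | $1.30 \cdot 10^{11}$ | $1.80 \cdot 10^{6}$ | 1/72318 |
| 800 | $6.40 \cdot 10^{5}$ | $4.10 \cdot 10^{11}$ | $3.20 \cdot 10^{6}$ | 1/128253 |
| 1000 | $1.00 \cdot 10^{6}$ | $1.00 \cdot 10^{12}$ | $5.00 \cdot 10^{6}$ | 1/200160 |
| 1200 | $1.44 \cdot 10^{6}$ | $2.07 \cdot 10^{12}$ | $7.20 \cdot 10^{6}$ | 1/287691 |
| 1400 | $1.96 \cdot 10^{6}$ | $3.84 \cdot 10^{12}$ | $9.79 \cdot 10^{6}$ | 1/392060 |
| 1600 | $2.56 \cdot 10^{6}$ | $6.55 \cdot 10^{12}$ | $1.28 \cdot 10^{7}$ | 1/511974 |
| 1800 | $3.24 \cdot 10^{6}$ | $1.05 \cdot 10^{13}$ | $1.62 \cdot 10^{7}$ | 1/648436 |
| 2000 | $4.00 \cdot 10^{6}$ | $1.60 \cdot 10^{13}$ | $2.00 \cdot 10^{7}$ | 1/800320 |
| 2050 | $4.20 \cdot 10^{6}$ | $1.76 \cdot 10^{13}$ | $2.10 \cdot 10^{7}$ | 1/837923 |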

4.1.2 The asymmetric case

Figure 21: Plot of number of non-zero elements versus number of total elements to show the sparsity relation.

To show that asymmetries do indeed cause problems for the AMG solver we constructed a case of an almost symmetric discrete Poisson problem by modifying a small number of elements. We used the same system size as above, $n = 30$, which gives a 900 × 900 system matrix, and then perturbed the system by introducing 2 elements in places where they cause asymmetries. Using the AMG solver on this problem yields the solution shown in figure 22, and for the PARDISO solver the solution is presented in figure 23. For MATLAB's backslash operator the solution looks like the PARDISO case, see Appendix B. Also, if the system matrix was perturbed by just another one or two elements, the AMG solver would not return an answer at all, but instead only NaN.

To study the error, the residuals are presented in table 3. Values in the table are rounded to avoid displaying too many decimal points.

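Table 3: Table showing residuals and convergence factors for a discrete Poisson problem with a few asymmetric perturbations using the AMG solver, n = 30.

| n | Cycle | Residual | Convergence factor |
|---|---|---|---|
| 30 | 0 | $0.399 \cdot 10^{2}$ | - |
| | 1 | $0.379 \cdot 10^{0}$ | $0.950 \cdot 10^{-2}$ |
| | 2 | $0.815 \cdot 10^{-1}$ | $0.215 \cdot 10^{0}$ |
| | 3 | $0.606 \cdot 10^{-1}$ | $0.743 \cdot 10^{0}$ |
| | 4 | $0.648 \cdot 10^{-1}$ | $0.107 \cdot 10^{1}$ |
| | 5 | $0.732 \cdot 10^{-1}$ | $0.113 \cdot 10^{1}$ |
| | 6 | $0.833 \cdot 10^{-1}$ | $0.114 \cdot 10^{1}$ |
| | 7 | $0.949 \cdot 10^{-1}$ | $0.114 \cdot 10^{1}$ |
| | 8 | $0.108 \cdot 10^{0}$ | $0.114 \cdot 10^{1}$ |
| | 9 | $0.123 \cdot 10^{0}$ | $0.114 \cdot 10^{1}$ |
| | 10 | $0.140 \cdot 10^{0}$ | $0.114 \cdot 10^{1}$ |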

Figure 22: Asymmetric case for the discrete Poisson problem using the AMG solver, n = 30.

4.2 NUD - Dirichlet only

Now we want to consider the wind fields of the Nordic Urban Dispersion model. For this part we are using only Dirichlet boundary conditions to achieve symmetry, as described in section 3.2.1. Figure 24 shows the empirical wind field $\vec{V}_0$ around a building, seen from above. This wind field is the initial one, and the system of equations has not yet been solved. In figures 25 and 26 we see the wind field $\vec{V}$, which has been solved for using the AMG and PARDISO solvers, respectively, also seen from above around a building. In figures 24-26 we have a domain of size $n_x = 73$, $n_y = 73$ and $n_z = 39$, which results in $N = n_x \cdot n_y \cdot n_z = 207831$, which is the size of one side of the system matrix. In order to have something to compare the results to, see Appendix D for the original NUD solved using the PARDISO solver.

Figures 27-29 show the same wind field as above, around the same building, but this time seen from the side. Here too the domain size is $n_x = 73$, $n_y = 73$ and $n_z = 39$. In the first figure, figure 27, we see the initial empirical wind field $\vec{V}_0$, and in figures 28-29 we see the solutions for AMG and PARDISO.

In figure 30 we have the initial wind field $\vec{V}_0$ seen from the side in the absence of buildings. This is the same domain as for the wind field displayed above, so $n_x = 73$, $n_y = 73$ and $n_z = 39$ was used.

In table 4 we can see the residuals and convergence for this problem using the AMG solver. The residuals are calculated in the same way as earlier, see equation (32), and the system size is $n_x = 43$, $n_y = 43$ and $n_z = 33$, which results in $N = 61017$. Values in the table are rounded to avoid displaying too many decimal points.

The results from the AMG and PARDISO solvers look qualitatively very similar, and that holds for the time spent solving the problems as well. In figure 31 we see timings for the AMG and PARDISO solvers, from $N = 15376$ ($n_x = n_y = 31$ and $n_z = 16$) to $N = 306030$ ($n_x = n_y = 101$ and $n_z = 30$), which corresponds to nnz = 85792 up to nnz = 1906746.

Figure 23: Asymmetric case for the discrete Poisson problem using the PARDISO solver, n = 30.

Figure 24: Empirical wind field, $\vec{V}_0$, around a building seen from above. Using N = 207831.

Figure 25: Final wind field, $\vec{V}$, around a building seen from above, solved using the AMG solver. Using N = 207831.

Figure 26: Final wind field, $\vec{V}$, around a building seen from above, solved using the PARDISO solver. Using N = 207831.

4.2.1 Sparsity

In NUD we are also dealing with very sparse system matrices as a result of using a finite difference method. Table 5 below shows data about system size and sparsity. For the full table see Appendix C. The columns of this table are: $n_x$, $n_y$ and $n_z$ show the size of the calculation domain, $N = n_x \cdot n_y \cdot n_z$ is the total number of equations (same as the number of nodal points), $N^2$ is the total number of elements in the system matrix and nnz is the total number of non-zero elements. The last column shows the sparsity. Values in the table are rounded to avoid displaying too many decimal points.

Figure 27: Empirical wind field, $\vec{V}_0$, around a building seen from one side. Using N = 207831.

Figure 28: Final wind field, $\vec{V}$, around a building seen from one side, solved using the AMG solver. Using N = 207831.

Figure 29: Final wind field, $\vec{V}$, around a building seen from one side, solved using the PARDISO solver. Using N = 207831.

Figure 30: Empirical wind field, $\vec{V}_0$, in the direction of the mean wind seen from one side in absence of buildings. Using N = 207831.

In figure 32 the sparsity is presented in the same graphical way as earlier.



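Table 4: Table showing residuals and convergence factors for NUD using the AMG solver, N = 61017.

| N | Cycle | Residual | Convergence factor |
|---|---|---|---|
| 61017 | 0 | $0.469 \cdot 10^{2}$ | - |
| | 1 | $0.369 \cdot 10^{0}$ | $0.788 \cdot 10^{-2}$ |
| | 2 | $0.130 \cdot 10^{-1}$ | $0.352 \cdot 10^{-1}$ |
| | 3 | $0.480 \cdot 10^{-3}$ | $0.369 \cdot 10^{-1}$ |
| | 4 | $0.171 \cdot 10^{-4}$ | $0.355 \cdot 10^{-1}$ |
| | 5 | $0.595 \cdot 10^{-6}$ | $0.349 \cdot 10^{-1}$ |
| | 6 | $0.217 \cdot 10^{-7}$ | $0.364 \cdot 10^{-1}$ |
| | 7 | $0.859 \cdot 10^{-9}$ | $0.396 \cdot 10^{-1}$ |
| | 8 | $0.362 \cdot 10^{-10}$ | $0.422 \cdot 10^{-1}$ |
| | 9 | $0.155 \cdot 10^{-11}$ | $0.429 \cdot 10^{-1}$ |
| | 10 | $0.707 \cdot 10^{-13}$ | $0.455 \cdot 10^{-1}$ |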

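Table 5: Table showing different sizes of the system of equations set up for the problems, also showing the sparsity. For the full table see Appendix C.

| $n_x$ | $n_y$ | $n_z$ | N | $N^2$ | nnz | sparsity, $nnz/N^2$ |
|---|---|---|---|---|---|---|
| 31 | 31 | 16 | 15376 | $2.36 \cdot 10^{8}$ | $8.58 \cdot 10^{4}$ | 1/2755 |
| 51 | 51 | 18 | 46818 | $2.19 \cdot 10^{9}$ | $2.68 \cdot 10^{5}$ | 1/8182 |
| 71 | 71 | 22 | 110902 | $1.23 \cdot 10^{10}$ | $6.54 \cdot 10^{5}$ | 1/18812 |
| 91 | 91 | 28 | 231868 | $5.38 \cdot 10^{10}$ | $1.43 \cdot 10^{6}$ | 1/37629 |
| 101 | 101 | 30 | 306030 | $9.37 \cdot 10^{10}$ | $1.91 \cdot 10^{6}$ | 1/49117 |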

Figure 31: CPU time versus nnz for the AMG and PARDISO solvers. Domain sizes according to table 7.

Figure 32: Plot of number of non-zero elements versus number of total elements to show the sparsity relation.


5 Discussion

In this section the results presented in section 4 are discussed. For all the results we assume the solutions of the PARDISO solver (and the MATLAB solvers) to be correct, and the comparisons and discussions are based on this. This seems to be a fair assumption given that PARDISO and MATLAB are both produced by commercial companies, Intel and MathWorks.

5.1 Discrete Poisson

In figure 16 we see the AMG solver's solution for the discrete Poisson problem described above; this is a well known problem with a well known solution. Comparing it to the solution of the PARDISO solver (see Appendix B) we see that they are essentially the same, and as mentioned we trust the PARDISO solver to have the correct results. We have also seen that dividing the solutions by each other element by element gives a ratio of essentially 1 (differing only around the tenth decimal). Both these facts tell us that the AMG solver can handle problems of this type.

Studying table 1 we see the residuals and convergence factors, and we can see that the residuals are reduced by more than a factor of 10 in every cycle (all convergence factors are below 0.07). We can also see that the solution was obtained by the AMG solver in a total of 10 multigrid cycles, using seven levels of different resolution. An iterative solver which solves the problem in 10 cycles or fewer may seem incredible, but we have to keep in mind that the set up phase is what takes time for a solver like this. Also, each multigrid cycle is quite extensive compared to standard iterations of other iterative solvers.

In figures 17-19 we see the CPU times versus the number of non-zero elements for different sets of solvers. We can see that the AMG solver is faster than most other solvers for this problem; the only one which is slightly faster is MATLAB's backslash operator. PARDISO, AMG and MATLAB's backslash operator are comparable in time, with the PARDISO solver slightly slower. This problem is very well-posed: it is symmetric, has a row-sum of zero and the system matrix is structured. This means that the set up of the interpolation and restriction operators for the different levels is probably very easy, which results in a short set up time in total, and thus a short solution time.

Looking at figure 20 we can see that the AMG solver can solve larger problems than the PARDISO solver: the PARDISO solver ran out of memory at n = 1200 while the AMG solver ran all the way to n = 2050. At n = 2050 the number of non-zero elements, nnz, is about $\frac{2.1 \cdot 10^{7}}{7.20 \cdot 10^{6}} \approx 2.9$ times as large as at n = 1200. For the full system matrix there are about $\frac{1.76 \cdot 10^{13}}{2.07 \cdot 10^{12}} \approx 8.5$ times as many total elements for n = 2050 compared to n = 1200. Comparing nnz when both solvers have similar computational times we can see that the AMG solver handles a system that has almost twice as many non-zero elements.
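The two ratios can be checked by hand. As an assumption for this check, the 5-point stencil on an $n \times n$ grid is taken to give $\mathrm{nnz}(n) = 5n^2 - 4n$ non-zero elements (five per equation, minus the neighbours that fall outside the grid), which agrees with the rounded values quoted above:

$$
\begin{aligned}
\frac{\mathrm{nnz}(2050)}{\mathrm{nnz}(1200)} &= \frac{5 \cdot 2050^2 - 4 \cdot 2050}{5 \cdot 1200^2 - 4 \cdot 1200} = \frac{21\,004\,300}{7\,195\,200} \approx 2.92, \\
\frac{2050^4}{1200^4} &= \left(\frac{2050}{1200}\right)^4 \approx 1.7083^4 \approx 8.52 .
\end{aligned}
$$

The memory needed for the sparse representation thus grows roughly like $n^2$, while a full matrix would grow like $n^4$.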

From table 2 and figure 21 we can study the sparsity of the system. Comparing nnz to $n^2$ we can see that there are very close to five times as many non-zero elements as there are equations. This makes sense as we use a 5-point stencil for all points except the boundary points. But comparing nnz to the total number of elements in the system matrix, $n^4$, we see that nnz is very small. In figure 21 we can see that the total number of elements increases a lot faster than nnz for the larger system sizes, so the system gets more sparse.
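The count of non-zero elements can be reproduced by walking over the grid and counting the stencil entries that stay inside it. The sketch below is a hypothetical illustration, assuming that neighbours outside the grid are simply dropped from the matrix (their known boundary values would go to the right-hand side); the function names are not taken from the thesis code.

```python
def count_nonzeros(n):
    """Count matrix entries of the 5-point stencil on an n x n grid of unknowns.

    Each unknown contributes its diagonal entry plus one entry for every
    neighbour (up, down, left, right) that lies inside the grid.
    """
    nnz = 0
    for i in range(n):
        for j in range(n):
            nnz += 1  # diagonal entry
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                if 0 <= i + di < n and 0 <= j + dj < n:
                    nnz += 1
    return nnz


def sparsity(n):
    """Fraction of the n^2 x n^2 system matrix that is non-zero."""
    return count_nonzeros(n) / float(n ** 4)


for n in (50, 100, 200):
    nnz = count_nonzeros(n)
    print(n, nnz, nnz / float(n * n), sparsity(n))
```

For every grid size the ratio of nnz to the number of equations stays just below five, while the sparsity fraction shrinks as the grid grows.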

Comparing figures 22 and 23 we see that the solutions differ. As we trust the PARDISO solution to be correct, we can see, as stated earlier, that the AMG solver cannot handle the asymmetric case. The solver seems to have problems accounting for the asymmetries and does not get the same peaks as the PARDISO solver does in its solution. In table 3 we can see that the residuals are first reduced a bit, though far more slowly than for the earlier case of table 1, and then actually start to increase. So after 10 cycles the solution has barely improved at all. We conclude that the AMG solver which we have used cannot handle asymmetries.

5.2 NUD - Dirichlet only

The problem described in section 3.2.1 is not the full physical wind problem derived in the theory section. The reason we used this reduced problem instead was, as described earlier, that it was a way to achieve a symmetric system matrix, which was essentially a must for the AMG solver we used. From a mathematical point of view this is an interesting problem, but it does not correspond to the physical assumptions posed in the theory section. Nevertheless, it verifies that the AMG solver works for this kind of elliptic problem. The boundary conditions of this problem are Dirichlet boundary conditions on all boundaries. This means that all boundaries are flow-through boundaries, which is what we want to use for the free boundaries. So the “buildings” of this problem are not buildings at all but more like pillars of air which are just standing there and are not considered part of the domain. The same goes for the ground: it is just like the “roof” of our domain, a simple flow-through boundary. We stress again that this is not the full physical problem, but a mathematical one.

This mathematical problem helped us understand the difficulties of asymmetry while still allowing us to evaluate the solver. Studying figures 24-26 we can see that the initial wind field, $\vec{V}_0$, is modified when solved under the constraint of mass conservation. The results for the AMG and PARDISO solvers are also very much alike; as stated we assume that the solution of the PARDISO solver is correct, so we conclude that the AMG solver does indeed solve the problem correctly. Discussing the wind field we note that some of the wind velocity vectors do not avoid going into the house, which is to be expected for flow-through boundaries such as those given by the Dirichlet condition. We can also see that some of the wind velocity vectors are curved inwards, toward the house, at the end of the house. This seems reasonable because mass has to be conserved; any wind velocity vectors that enter terrain or buildings need to be compensated for by wind velocity vectors exiting terrain or buildings elsewhere.

For figures 27-29 we also see that the wind field has changed, maybe more so than for the already discussed case. The solutions for the AMG and PARDISO solvers are also here essentially the same, which tells us that the AMG solver has solved the given problem correctly. For this wind field we make the same observation: the wind velocity vectors do enter the building. Here we also notice a new behaviour: the vectors go into the ground right before the building, and come up from the ground right after. As earlier this is to be expected as the boundaries are flow-through for the Dirichlet case we have. The reason that they come up from the ground after the building is to maintain mass conservation. We also see that some of the wind vectors at the top of the building curve slightly upwards to avoid this region. Studying the initial wind field, figure 27, we see that the empirical field does have the recirculation zones added. For the displacement zone we can tell that the wind vectors are zero, which ensures that the condition that the average wind should be zero holds. The cavity zone and wake zone are more difficult to distinguish, but we observe that the wind vectors are slightly smaller in the vicinity of the building and looking carefully we can see the outlines of a recirculation zone.

By comparing figure 30 to figure 3 we verify that the diagnostic wind layer model of how the wind field decays closer to the ground is implemented in NUD. We can clearly see the constant layer and the surface layer, while we can faintly discern the linear interpolation layer at the top.

In table 4 the residuals and the convergence factor for the AMG solver are presented. Looking at cycle 10 we notice that the residuals are very small, hence the solution is good. Studying the cycles in order, from 0 to 10, we observe that the reduction of the residuals is quite fast. Again, 10 iterations may seem like a low number for an iterative solver to reach a solution with such small residuals, but by the same reasoning as above a cycle for the AMG solver is not the same as an iteration for other iterative solvers.

For the timings, in figure 31, we see that there is an unexpected bump early on. We are not sure where this bump comes from, and we have run the solver multiple times to make sure it is not caused by randomness. Our guess is that at this system size irregularities are introduced into the system matrix. Say that, when the system size is increased to this point, a part of a building enters the grid, which causes irregularities in the system matrix. As the grid grows further the whole building enters the grid, and we get a system matrix without irregularities and the solver is well-behaved again. A phenomenon like this, or similar, could explain the bump in the timings.

Apart from these irregularities the timings for this case are essentially the same for both the AMG and PARDISO solvers. Our guess as to why there is less of a gap between the AMG and PARDISO solvers for this case than for the discrete Poisson case is that we are no longer considering a “school book example”. For this problem we may not have perfect row sums and such. The grid is also site dependent (if a building comes up the grid may be altered), so our system matrix may not have a full seven-diagonal structure (as we use a 7-point stencil) all the way through; irregularities may arise. This may make it more difficult to create the restriction and interpolation operators needed by the AMG solver, and thus more time has to be spent on the set-up phase.



From table 5 and figure 32 we can, again, see that the system is sparse. The largest system here is not as large as the largest for the discrete Poisson case; this is because NUD has a hard-coded maximum size of the domain. For the largest case the discrete Poisson system ($10^7$) is one order of magnitude larger than the NUD system ($10^6$). That said, we can still see that the largest system here takes twice as long to solve as the largest discrete Poisson system, even though it has fewer elements. This tells us that both solvers have difficulties handling the complexities of this case.



6 Conclusions

We have learned about multigrid in various forms and seen that it is applicable to Poisson-like problems of the type considered in this thesis. The solver uses few cycles and reduces the error greatly with each one, though one has to consider that one cycle of a multigrid solver is more extensive than an iteration of a standard one-level iterative solver. The solver we used (amg1r6) may be a bit outdated as it cannot handle asymmetric problems, which turned out to be a problem in this case. By relaxing the conditions imposed by physics we turned the problem formulation into a mathematical one, and we obtained a modified NUD problem that was solved using the AMG solver.
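The difference between a multigrid cycle and a one-level iteration can be illustrated on a small model problem. The sketch below is not the amg1r6 solver and not algebraic multigrid: it is a geometric two-grid cycle for the one-dimensional Poisson problem with zero Dirichlet boundaries, chosen only because it fits in a few lines. The grid size, the damping factor 2/3, the two pre- and post-smoothing sweeps and all function names are assumptions made for the example.

```python
import math


def residual(u, f, h):
    """r = f - A u for A = tridiag(-1, 2, -1) / h^2 (zero Dirichlet boundaries)."""
    n = len(u)
    r = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        r.append(f[i] - (2.0 * u[i] - left - right) / (h * h))
    return r


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def jacobi(u, f, h, sweeps, omega=2.0 / 3.0):
    """Damped Jacobi: u <- u + omega * D^-1 * r, with D = 2 / h^2."""
    for _ in range(sweeps):
        r = residual(u, f, h)
        u = [ui + omega * 0.5 * h * h * ri for ui, ri in zip(u, r)]
    return u


def solve_coarse(rhs, h):
    """Direct solve of tridiag(-1, 2, -1) / h^2 * x = rhs (Thomas algorithm)."""
    n = len(rhs)
    diag = [2.0] * n
    b = [h * h * v for v in rhs]
    for i in range(1, n):
        m = -1.0 / diag[i - 1]      # sub-diagonal entry divided by the pivot
        diag[i] -= m * (-1.0)       # eliminate using the super-diagonal entry -1
        b[i] -= m * b[i - 1]
    x = [0.0] * n
    x[-1] = b[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (b[i] + x[i + 1]) / diag[i]
    return x


def prolong(ec, n_fine):
    """Linear interpolation from the coarse grid to the fine grid."""
    m = len(ec)
    ef = [0.0] * n_fine
    for j in range(m):
        ef[2 * j + 1] = ec[j]       # coarse points coincide with odd fine points
    for j in range(m + 1):
        left = ec[j - 1] if j > 0 else 0.0
        right = ec[j] if j < m else 0.0
        ef[2 * j] = 0.5 * (left + right)
    return ef


def two_grid_cycle(u, f, h):
    """Pre-smooth, restrict the residual, solve on the coarse grid, correct, post-smooth."""
    u = jacobi(u, f, h, sweeps=2)
    r = residual(u, f, h)
    m = (len(u) - 1) // 2
    rc = [0.25 * r[2 * j] + 0.5 * r[2 * j + 1] + 0.25 * r[2 * j + 2] for j in range(m)]
    ec = solve_coarse(rc, 2.0 * h)
    u = [ui + ei for ui, ei in zip(u, prolong(ec, len(u)))]
    return jacobi(u, f, h, sweeps=2)


n = 63                               # fine-grid unknowns (odd, so the coarse grid has 31)
h = 1.0 / (n + 1)
f = [1.0] * n
r0 = norm(f)                         # residual norm of the zero initial guess

u_mg = [0.0] * n
mg_history = [r0]
for _ in range(10):
    u_mg = two_grid_cycle(u_mg, f, h)
    mg_history.append(norm(residual(u_mg, f, h)))

# Same number of smoothing sweeps (10 cycles x 4 sweeps) without any coarse grid.
u_jac = jacobi([0.0] * n, f, h, sweeps=40)
jac_final = norm(residual(u_jac, f, h))

print("two-grid residuals:", mg_history)
print("Jacobi residual after 40 sweeps:", jac_final)
```

Each cycle costs several smoothing sweeps plus a coarse solve, so it is indeed more work than one Jacobi sweep, but the coarse-grid correction removes the smooth error components that Jacobi alone reduces only very slowly.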

From the results we observe that the AMG solver gets the same results as the PARDISO solver. Looking at the timings we see that for the discrete Poisson case, which is essentially a school book example, the AMG solver solves the problem faster than the PARDISO solver. For this problem we also note that the PARDISO solver ran out of memory sooner than the AMG solver did, so the AMG solver is able to solve a considerably larger problem. For the NUD case both solvers solve the problem in similar times, but due to limited time we could unfortunately not test how large systems the solvers could handle, as there was a hard-coded maximum system size.

We conclude that if the problem of asymmetries can be avoided, a multigrid method is able to generate the stationary three-dimensional wind field needed to study dispersion in an urban environment. The solver itself shows fast convergence for the cases we have studied, and the solution times should be fast enough for emergency situations such as those considered within ARGOS.

6.1 Future works

To solve the full NUD problem we have to modify the problem to avoid asymmetries. Here an idea of how to solve the problem of asymmetries, potentially at the cost of some accuracy, will be presented. We will describe the asymmetries and what causes them, and then give an idea of how to change the discretizations to produce a symmetric system matrix.

The asymmetries of the system matrix arise when a point uses another point in its discretization, but not the other way around. In our case this occurs for a Neumann point, which uses a three-point discretization (the point itself and two points outward from the boundary in its normal direction). The Neumann point’s equation uses a point two steps away from the boundary, but the free point two steps away does not use the Neumann point in its equation, as the free points only use points one step away in each direction. Hence, we have to change the discretization. We would have to either increase the order of accuracy for the free points from second order to fourth order (from a 7-point stencil to a 13-point stencil, two steps in each direction) or decrease the order of the Neumann points to first order (just use the point itself and one step outwards from the building in its discretization).
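The mechanism can be seen in a one-dimensional toy matrix. The sketch below is a hypothetical illustration and not the NUD discretization: index 0 is a Neumann point, all other indices are free points with a three-point stencil, and the rows are scaled for readability (the signs and scaling of the Neumann row are assumptions).

```python
def build_matrix(n, neumann_order):
    """Toy 1D system matrix with a Neumann point at index 0.

    Free points (index >= 1) use the stencil (-1, 2, -1). The Neumann row uses
    either a one-sided second-order stencil (3, -4, 1), which reaches two steps
    into the domain, or a first-order stencil (1, -1).
    """
    A = [[0.0] * n for _ in range(n)]
    for i in range(1, n):
        A[i][i] = 2.0
        A[i][i - 1] = -1.0
        if i + 1 < n:
            A[i][i + 1] = -1.0
    if neumann_order == 2:
        A[0][0], A[0][1], A[0][2] = 3.0, -4.0, 1.0
    else:
        A[0][0], A[0][1] = 1.0, -1.0
    return A


def unreciprocated_pairs(A):
    """Pairs (i, j) where row i uses point j but row j does not use point i."""
    n = len(A)
    return [(i, j) for i in range(n) for j in range(n)
            if i != j and A[i][j] != 0.0 and A[j][i] == 0.0]


def is_symmetric(A):
    n = len(A)
    return all(A[i][j] == A[j][i] for i in range(n) for j in range(n))


second_order = build_matrix(5, neumann_order=2)
first_order = build_matrix(5, neumann_order=1)

print(unreciprocated_pairs(second_order), is_symmetric(second_order))
print(unreciprocated_pairs(first_order), is_symmetric(first_order))
```

With the second-order Neumann row, point 0 uses point 2 while point 2 never uses point 0, so no row scaling can make the matrix symmetric. With the first-order row every coupling is reciprocated, and with the scaling chosen here the matrix is symmetric.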

We also have another source of asymmetries. Think of the case where we have an edge (or corner) between two (or three) walls, or between a wall and a roof. A grid point which is an edge point has to be chosen to be either a wall point or a roof point, and its normal direction has to be chosen accordingly. This is done in the current version of NUD, but what is not considered are the points which use an edge point in their discretization. If we say this edge point is a roof point, and thus has its normal direction chosen, only the free point in its normal direction can use this roof point in its discretization. We cannot have any other free points use this point in their discretization, as this would cause asymmetries. Hence, a Neumann point which uses a free point in its normal direction for its discretization needs to have that free point use the Neumann point in the free point’s discretization.

The important thing to understand here is that a point which makes use of another point in its discretization needs to have that other point make use of it in return. If this is not fulfilled we cannot ensure that we have a symmetric system matrix.

Another thing which could be done is to implement our own version of a multigrid solver, which could then be designed to handle asymmetries.

6.2 Final remarks

In the introduction, see section 1.2, we discussed a Plan A and a Plan B, and went with Plan A quite quickly. In work like this you always have to account for how much time things will take and strike a balance. In this case we thought it would be more time efficient to go with Plan A, and halfway through we thought we were closer than we actually were and thus continued with Plan A. It is impossible to know if it would have been better to go with Plan B and make our own implementation of a multigrid method; this would also have been even more conceptual and would probably not have led to any more results. In retrospect I wish I had considered it more carefully and spent more time researching early on rather than going with Plan A directly. On the other hand, it is always difficult to find all the faults of a method just by inspecting it, and one often has to actually work with the method for some time before realizing its problems.

During my master’s thesis I was also invited to come along to Uppsala, where FOI and other Swedish government agencies held a start-up meeting for an upcoming project. At this meeting I gave a short presentation of the model as they intend to use it. For the project, wind fields over a rural environment have to be handled, so the initial wind field would have to be modified to account for fluid dynamic properties which could arise in such an environment. Following these alterations the name of the model may change accordingly: “The Nordic Rural Dispersion model” (NRD).



References

[1] PDC-ARGOS CBRN Crisis Management, http://www.pdc-argos.com/, accessed: 2015-02-10.

[2] Intel, “Intel MKL PARDISO - Parallel Direct Sparse Solver Interface”, https://software.intel.com/en-us/node/470282, accessed: 2015-05-20.

[3] Jonsson, L., Persson, L., Thaning, L., Schoenberg, P., Westman, S., Burman, J., “Plan for the development of NUD, Nordic Urban Dispersion Model”, FOI, Umea, Memo 1608, 2005.

[4] Eriksson, D., “A Mass Conserving Wind Model Evaluation With Finite Element”, Master’s thesis, Umea University, Umea, 2013.

[5] Wikipedia, “Shapefile”, http://en.wikipedia.org/wiki/Shapefile, accessed: 2015-05-11.

[6] George, W.K., “Lectures in Turbulence for the 21st Century”, Chalmers University of Technology, Goteborg, 2013.

[7] Rockle, R., “Bestimmung der Stromungsverhaltnisse im Bereich komplexer Bebauungstrukturen”, Ph.D. thesis, Vom Fachbereich Mechanik, der Technischen Hochschule Darmstadt, Germany, 1990.

[8] Sherman, C.A., Journal of Applied Meteorology, vol. 17 (1978), pp. 312-319, “A Mass-Consistent Model for Wind Field over Complex Terrain”.

[9] Kaplan, H., Dinar, N., Atmospheric Environment, vol. 30 (1996), pp. 4197-4207, “A Lagrangian dispersion model for calculating concentration distribution within built-up domain”.

[10] Wang, Y., Williamson, C., Garvet, D., Chang, S., Cogan, J., Journal of Applied Meteorology, vol. 44 (2005), pp. 1078-1089, “Application of a Multigrid Method to a Mass-Consistent Diagnostic Wind Model”.

[11] CFD Online, “Multigrid methods”, http://www.cfd-online.com/Wiki/Multigrid_methods, accessed: 2015-04-24.

[12] University of Cambridge, “Multigrid methods”, http://www.maths.cam.ac.uk/undergrad/course/na/ii/multigrid/multigrid.php, accessed: 2015-04-24.

[13] LeVeque, R.J., “Finite Difference Methods for Differential Equations”, University of Washington, 2006.

[14] Stuben, K., Journal of Computational and Applied Mathematics, vol. 128 (2001), pp. 281-309, “A review of algebraic multigrid”.

[15] Stuben, K., “Algebraic Multigrid (AMG): An Introduction with Applications”, preprint of Appendix that appeared in “Multigrid” by Trottenberg, U., Oosterlee, C.W., Schuller, A., 2000.

[16] Wikipedia, “Sparse matrix”, http://en.wikipedia.org/wiki/Sparse_matrix, accessed: 2015-04-24.



[17] Pysparse, “Sparse Matrix Formats”, http://pysparse.sourceforge.net/formats.html, accessed: 2015-03-30.

[18] Wikipedia, “Discrete Poisson equation”, http://en.wikipedia.org/wiki/Discrete_Poisson_equation, accessed: 2015-04-24.



Appendix A Euler-Lagrange equations

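The derivation starts from equation (14), repeated below,

$$
E(u, v, w, \lambda) = \iiint_V \left( \alpha^2 (u - u_0)^2 + \beta^2 (v - v_0)^2 + \gamma^2 (w - w_0)^2 + \lambda \left( \frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z} \right) \right) dx\,dy\,dz .
$$

We can take its first variation as follows (in this case w.r.t. $u$),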

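$$
\begin{aligned}
\delta_u E(u, v, w, \lambda; u') &= \lim_{\varepsilon \to 0} \frac{E(u + \varepsilon u', v, w, \lambda) - E(u, v, w, \lambda)}{\varepsilon} \\
&= \iiint_V \left( 2\alpha^2 (u - u_0) u' + \lambda \frac{\partial u'}{\partial x} \right) dx\,dy\,dz \\
&= \iiint_V \left( 2\alpha^2 (u - u_0) - \frac{\partial \lambda}{\partial x} \right) u' \, dx\,dy\,dz + \iint_{\partial V} \lambda u' n_x \, dS ,
\end{aligned}
$$

where the last step follows from integrating the $\lambda$ term by parts. Correspondingly we have,

$$
\begin{aligned}
\delta_v E(u, v, w, \lambda; v') &= \iiint_V \left( 2\beta^2 (v - v_0) - \frac{\partial \lambda}{\partial y} \right) v' \, dx\,dy\,dz + \iint_{\partial V} \lambda v' n_y \, dS , \\
\delta_w E(u, v, w, \lambda; w') &= \iiint_V \left( 2\gamma^2 (w - w_0) - \frac{\partial \lambda}{\partial z} \right) w' \, dx\,dy\,dz + \iint_{\partial V} \lambda w' n_z \, dS , \\
\delta_\lambda E(u, v, w, \lambda; \lambda') &= \iiint_V \left( \frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z} \right) \lambda' \, dx\,dy\,dz .
\end{aligned}
$$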

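Assuming that the functional is differentiable to at least first order, a necessary condition for a minimum is

$$
\begin{aligned}
\delta_u E(u, v, w, \lambda; u') &= 0, \\
\delta_v E(u, v, w, \lambda; v') &= 0, \\
\delta_w E(u, v, w, \lambda; w') &= 0, \\
\delta_\lambda E(u, v, w, \lambda; \lambda') &= 0,
\end{aligned}
$$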

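and by requiring that $(u', v', w')$ is 0 on the surface $\partial V$ we obtain the Euler-Lagrange equations [3],

$$
\begin{aligned}
u &= u_0 + \frac{1}{2\alpha^2} \frac{\partial \lambda}{\partial x}, \\
v &= v_0 + \frac{1}{2\beta^2} \frac{\partial \lambda}{\partial y}, \\
w &= w_0 + \frac{1}{2\gamma^2} \frac{\partial \lambda}{\partial z}, \\
0 &= \frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z}.
\end{aligned}
$$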
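As a short worked step, inserting the first three Euler-Lagrange equations into the fourth (the continuity equation) shows why the problem is Poisson-like. The simplification in the second line assumes that $\alpha$, $\beta$ and $\gamma$ are constant in space, which is an assumption made here for illustration:

$$
\begin{aligned}
\frac{\partial}{\partial x}\left( \frac{1}{2\alpha^2} \frac{\partial \lambda}{\partial x} \right) + \frac{\partial}{\partial y}\left( \frac{1}{2\beta^2} \frac{\partial \lambda}{\partial y} \right) + \frac{\partial}{\partial z}\left( \frac{1}{2\gamma^2} \frac{\partial \lambda}{\partial z} \right) &= -\left( \frac{\partial u_0}{\partial x} + \frac{\partial v_0}{\partial y} + \frac{\partial w_0}{\partial z} \right), \\
\frac{1}{2\alpha^2} \frac{\partial^2 \lambda}{\partial x^2} + \frac{1}{2\beta^2} \frac{\partial^2 \lambda}{\partial y^2} + \frac{1}{2\gamma^2} \frac{\partial^2 \lambda}{\partial z^2} &= -\nabla \cdot \vec{V}_0 .
\end{aligned}
$$

In the special case $\alpha = \beta = \gamma$ this reduces to the Poisson equation $\nabla^2 \lambda = -2\alpha^2 \, \nabla \cdot \vec{V}_0$: the divergence of the initial wind field acts as the source term for $\lambda$.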



Appendix B Discrete Poisson

B.1 Solution for PARDISO and MATLAB

In figures 33 and 34 below we can see the solution to the discrete Poisson problem for the PARDISO solver and MATLAB’s backslash operator, respectively. The solutions are almost identical.

Figure 33: Solution for the discrete Poisson problem using the PARDISO solver, n = 30.

Figure 34: Solution for the discrete Poisson problem using MATLAB’s backslash operator, n = 30.



B.2 The asymmetric case

In figure 35 we see the asymmetric discrete Poisson problem solved using MATLAB’s backslash operator; this figure is almost identical to the corresponding solution in the results section obtained using the PARDISO solver.

Figure 35: Asymmetric case for the discrete Poisson problem using MATLAB’s backslash operator, n = 30.



Appendix C Sparsity

C.1 Discrete Poisson

In table 6 we see the domain size and sparsity of the discrete Poisson problem. Here $n$ is the number of points on one side of the grid, $n^2$ is the number of equations, $n^4$ is the total number of elements in the full system matrix, nnz is the number of non-zero elements in the sparse representation of the system matrix, and the sparsity $\frac{nnz}{n^4}$ is a measure of how sparse the matrix is: it tells how large a part of the total number of elements is non-zero.

Table 6: Full table showing different sizes of the system of equations set up for the problems, also showing the sparsity.

| n | no. of equations, n^2 | no. of elements, n^4 | nnz | sparsity, nnz/n^4 |
|---|---|---|---|---|
| 50 | 2.50 · 10^3 | 6.25 · 10^6 | 1.23 · 10^4 | 1/507 |
| 100 | 1.00 · 10^4 | 1.00 · 10^8 | 4.96 · 10^4 | 1/2000 |
| 150 | 2.25 · 10^4 | 5.06 · 10^8 | 1.12 · 10^5 | 1/4545 |
| 200 | 4.00 · 10^4 | 1.60 · 10^9 | 1.99 · 10^5 | 1/8032 |
| 250 | 6.25 · 10^4 | 3.90 · 10^9 | 3.12 · 10^5 | 1/12500 |
| 300 | 9.00 · 10^4 | 8.10 · 10^9 | 4.49 · 10^5 | 1/18182 |
| 350 | 1.23 · 10^5 | 1.50 · 10^10 | 6.11 · 10^5 | 1/24390 |
| 400 | 1.60 · 10^5 | 2.56 · 10^10 | 7.98 · 10^5 | 1/32064 |
| 450 | 2.03 · 10^5 | 4.10 · 10^10 | 1.01 · 10^6 | 1/40000 |
| 500 | 2.50 · 10^5 | 6.25 · 10^10 | 1.25 · 10^6 | 1/50000 |
| 550 | 3.03 · 10^5 | 9.15 · 10^10 | 1.51 · 10^6 | 1/58824 |
| 600 | 3.60 · 10^5 | 1.30 · 10^11 | 1.80 · 10^6 | 1/72318 |
| 650 | 4.23 · 10^5 | 1.80 · 10^11 | 2.11 · 10^6 | 1/83333 |
| 700 | 4.90 · 10^5 | 2.40 · 10^11 | 2.45 · 10^6 | 1/100000 |
| 750 | 5.63 · 10^5 | 3.16 · 10^11 | 2.81 · 10^6 | 1/112360 |
| 800 | 6.40 · 10^5 | 4.10 · 10^11 | 3.20 · 10^6 | 1/128253 |
| 850 | 7.23 · 10^5 | 5.22 · 10^11 | 3.61 · 10^6 | 1/142857 |
| 900 | 8.10 · 10^5 | 6.56 · 10^11 | 4.05 · 10^6 | 1/161290 |
| 950 | 9.03 · 10^5 | 8.14 · 10^11 | 4.51 · 10^6 | 1/181818 |
| 1000 | 1.00 · 10^6 | 1.00 · 10^12 | 5.00 · 10^6 | 1/200160 |
| 1050 | 1.10 · 10^6 | 1.22 · 10^12 | 5.51 · 10^6 | 1/222222 |
| 1100 | 1.21 · 10^6 | 1.46 · 10^12 | 6.05 · 10^6 | 1/243902 |
| 1150 | 1.32 · 10^6 | 1.75 · 10^12 | 6.61 · 10^6 | 1/263158 |
| 1200 | 1.44 · 10^6 | 2.07 · 10^12 | 7.20 · 10^6 | 1/287691 |



C.2 NUD - Dirichlet only

In table 7 we see the domain size and sparsity of NUD. Here $n_x$, $n_y$ and $n_z$ show the size of the calculation domain, $N = n_x \cdot n_y \cdot n_z$ is the total number of equations (same as the number of nodal points), $N^2$ is the total number of elements in the system matrix and nnz is the total number of non-zero elements. The last column shows the sparsity.

Table 7: Full table showing different sizes of the system of equations set up for the problems, also showing the sparsity.

| nx | ny | nz | N | N^2 | nnz | sparsity, nnz/N^2 |
|---|---|---|---|---|---|---|
| 31 | 31 | 16 | 15376 | 2.36 · 10^8 | 8.58 · 10^4 | 1/2755 |
| 36 | 36 | 16 | 20736 | 4.30 · 10^8 | 1.16 · 10^5 | 1/3695 |
| 41 | 41 | 16 | 26896 | 7.23 · 10^8 | 1.50 · 10^5 | 1/4831 |
| 46 | 46 | 17 | 35972 | 1.30 · 10^9 | 2.02 · 10^5 | 1/6396 |
| 51 | 51 | 18 | 46818 | 2.19 · 10^9 | 2.68 · 10^5 | 1/8182 |
| 56 | 56 | 19 | 59584 | 3.55 · 10^9 | 3.45 · 10^5 | 1/10296 |
| 61 | 61 | 20 | 74420 | 5.54 · 10^9 | 4.35 · 10^5 | 1/12741 |
| 66 | 66 | 21 | 91476 | 8.37 · 10^9 | 5.34 · 10^5 | 1/15659 |
| 71 | 71 | 22 | 110902 | 1.23 · 10^10 | 6.54 · 10^5 | 1/18812 |
| 76 | 76 | 25 | 144400 | 2.09 · 10^10 | 8.69 · 10^5 | 1/23986 |
| 81 | 81 | 26 | 170586 | 2.91 · 10^10 | 1.04 · 10^6 | 1/28088 |
| 86 | 86 | 27 | 199692 | 3.99 · 10^10 | 1.22 · 10^6 | 1/32616 |
| 91 | 91 | 28 | 231868 | 5.38 · 10^10 | 1.43 · 10^6 | 1/37629 |
| 96 | 96 | 29 | 267264 | 7.14 · 10^10 | 1.66 · 10^6 | 1/43123 |
| 101 | 101 | 30 | 306030 | 9.37 · 10^10 | 1.91 · 10^6 | 1/49117 |



Appendix D NUD - Original problem

In figures 36 and 37 below we see the NUD problem solved using the original, physically correct problem formulation. The problem generating these figures is solved using the PARDISO solver.

Figure 36: Original NUD problem of correct physics, solved using the PARDISO solver. Final wind field, $\vec{V}$, around a building seen from above. Using N = 207831.

Figure 37: Original NUD problem of correct physics, solved using the PARDISO solver. Final wind field, $\vec{V}$, around a building seen from the side. Using N = 207831.
