
Page 1: Barth_VKI95

Lecture Notes Presented at the VKI Lecture Series 1994-05. Revised February 1995.

ASPECTS OF UNSTRUCTURED GRIDS AND FINITE-VOLUME SOLVERS FOR THE EULER AND NAVIER-STOKES EQUATIONS

Timothy J. Barth
Advanced Algorithms and Applications Branch
NASA Ames Research Center
Moffett Field, CA
[email protected], http://www.nas.nasa.gov/~barth


Introduction

The intent of these notes is to discuss basic algorithms for unstructured mesh generation and fluid flow calculation. The volume of literature on these subjects has grown enormously in recent years. Consequently, we consider only a small subset of this new technology in the remaining notes. Hopefully, the remaining portion of these notes will convince the reader that the theory and application of numerical methods for unstructured meshes have improved significantly in recent years.

1.0 Preliminaries

1.1 Graphs and Meshes

Graph theory offers many valuable theoretical results which directly impact the design of efficient algorithms using unstructured grids. For purposes of the present discussion, only simple graphs which do not contain self loops or parallel edges will be considered. Results concerning simple graphs usually translate directly into results relevant to unstructured grids. Perhaps the best known graph-theoretic result is Euler's formula, which relates the number of edges n(e), vertices n(v), and faces n(f) of a polyhedron (see figure 1.1.0(a)):

    n(f) = n(e) - n(v) + 2    (Euler's formula)    (1.1.0)

This polyhedron can be embedded in a plane by mapping one face to infinity. This makes the graph formula (1.1.0) applicable to 2-D unstructured meshes. In the example below, the face 1-2-3-4 has been mapped to infinity to form the exterior (infinite) face. If all faces are numbered, including the exterior face, then Euler's formula (1.1.0) remains valid.

Figure 1.1.0 (a) 3-D polyhedron with vertices numbered 1-10, (b) 2-D planar embedding.
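As a quick aside (my own illustration, not part of the original notes), Euler's formula (1.1.0) can be checked programmatically; the sketch below counts the components of a cube graph:

```python
# Sketch: verify Euler's formula n(f) = n(e) - n(v) + 2 on a cube.
# The 8 vertices are the corners of the unit cube, numbered 0..7 by bit pattern.
vertices = range(8)

# Two corners share an edge when their coordinates differ in exactly one bit.
edges = [(a, b) for a in vertices for b in vertices
         if a < b and bin(a ^ b).count("1") == 1]

n_v = 8               # corners
n_e = len(edges)      # 12 edges for a cube
n_f = n_e - n_v + 2   # Euler's formula predicts the face count

print(n_e, n_f)       # 12 6
```

The same counts apply to the planar embedding, provided the exterior face is included among the n_f faces.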


The infinite face can be eliminated by describing the outer boundary in terms of boundary edges which share exactly one interior face (interior edges share two). We also consider boundary edges which form simple closed polygons in the interior of the mesh. These polygons serve to describe possible objects embedded in the mesh (in this case, the polygon which they form is not counted as a face of the mesh). The number of these polygons is denoted by n(h). The modified Euler's formula now reads

    n(f) + n(v) = n(e) + 1 - n(h)    (1.1.1)

Since interior edges share two faces and boundary edges share one face, the number of interior and boundary edges can be related to the number of faces by the following formula:

    2 n(e)_interior + n(e)_bound = Σ_{i=3}^{max d(f)} i n(f)_i    (1.1.2)

where n(f)_i denotes the number of faces of a particular edge degree, d(f) = i. Note that for pure triangulations T, these formulas can be used to determine, independent of the method of triangulation, the number of triangles or edges given the number of vertices n(v), boundary edges n(e)_bound, and interior holes n(h):

    n(f)_3 = 2 n(v) - n(e)_bound - 2 + 2 n(h)    (1.1.3)

or

    n(e) = 3 n(v) - n(e)_bound - 3 + 3 n(h).    (1.1.4)

This is a well known result for planar triangulations. (For brevity, we will sometimes use N to denote n(v) in the remainder of these notes.) In many cases boundary edges are not explicitly given and the boundary is taken to be the convex hull of the vertices, the smallest convex polygon surrounding the vertices. (To obtain the convex hull in two dimensions, envision placing an elastic rubber band around the cloud of points; the final shape of this rubber band is the convex hull.) A key observation for planar meshes is the asymptotically linear storage requirement with respect to the number of vertices for arbitrary mesh arrangements.

The Euler formula extends naturally to an arbitrary number of space dimensions. In this general setting, Euler's formula relates the basic components (vertices, edges, faces, etc.) of higher dimensional polytopes. In computational geometry jargon, vertices, edges, and faces are all specific examples of "k-faces". A 0-face is simply a vertex, a 1-face corresponds to an edge, a 2-face corresponds to a facet (or simply a face), etc. A polytope P in R^d contains k-faces for all k in {-1, 0, 1, ..., d}. The -1-face denotes the null set face by standard convention. Let the number of k-faces contained in the polytope P be denoted by N_k(P). For example, N_0(P) would denote the number of vertices. Using this notation, we have the following relationships:

    N_0 ≡ n(v) (vertices),    N_1 ≡ n(e) (edges),
    N_2 ≡ n(f) (faces),       N_3 ≡ n(τ) (volumes)
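The counting formulas (1.1.3) and (1.1.4) can be verified on any concrete triangulation; the square-with-center mesh below is my own toy example, not one from the notes:

```python
# Sketch: check n(f)_3 = 2 n(v) - n(e)_bound - 2 + 2 n(h) and
#         n(e)   = 3 n(v) - n(e)_bound - 3 + 3 n(h)
# on a unit square triangulated by connecting its 4 corners to a center vertex.
triangles = [(0, 1, 4), (1, 2, 4), (2, 3, 4), (3, 0, 4)]  # vertex 4 is the center

# Collect the undirected edges of the triangulation.
edges = set()
for a, b, c in triangles:
    for u, v in ((a, b), (b, c), (c, a)):
        edges.add((min(u, v), max(u, v)))

n_v, n_h = 5, 0      # five vertices, no interior holes
n_e_bound = 4        # the four sides of the square
assert len(triangles) == 2 * n_v - n_e_bound - 2 + 2 * n_h   # (1.1.3): 4 triangles
assert len(edges) == 3 * n_v - n_e_bound - 3 + 3 * n_h       # (1.1.4): 8 edges
```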


By convention, there is exactly one null set contained in any polytope so that N_{-1}(P) = 1. Using these results, we can succinctly state the general Euler formula for an arbitrary polytope in R^d:

    Σ_{k=-1}^{d} (-1)^k N_k(P) = 0    (Euler's formula in R^d)    (1.1.5)

On the surface of a polyhedron in 3-space, the standard Euler formula (1.1.0) is recovered since

    -1 + N_0 - N_1 + N_2 - 1 = 0

or equivalently

    n(f) + n(v) = n(e) + 2.

To obtain results relevant to three-dimensional unstructured grids, the Euler formula (1.1.5) is applied to a four-dimensional polytope:

    n(f) + n(v) = n(e) + n(τ)    (1.1.6)

This formula relates the number of vertices, edges, faces, and volumes n(τ) of a three-dimensional mesh. As in the two-dimensional case, this formula does not account for boundary effects because it is derived by looking at a single four-dimensional polytope. The example below demonstrates how to derive exact formulas including boundary terms for a tetrahedral mesh. Derivations valid for more general meshes in three dimensions are also possible.

Example: Derivation of an Exact Euler Formula for a 3-D Tetrahedral Mesh.

Consider the collection of volumes incident to a single vertex v_i and the polyhedron which describes the shell formed by these volumes. Let F_b(v_i) and N_b(v_i) denote the number of faces and vertices, respectively, of this polyhedron which actually lie on the boundary of the entire mesh. Also let E(v_i) denote the total number of edges on the polyhedron surrounding v_i. Finally, let d_τ(v_i) and d_e(v_i) denote the number of tetrahedral volumes and edges, respectively, that are incident to v_i. On this polyhedron, we have exact satisfaction of Euler's formula (1.1.0), i.e.

    [d_τ(v_i) + F_b(v_i)] + [d_e(v_i) + N_b(v_i)] = E(v_i) + 2    (1.1.7)

where the first bracketed term counts polyhedral faces and the second counts vertices on the polyhedron. Note that this step assumes that the polyhedron is homeomorphic to a sphere (otherwise Euler's formula fails). In reality, this is not a severe assumption. (It would preclude a mesh consisting of two tetrahedra which touch at a single vertex.) On the polyhedron we also have that

    2 E(v_i) = 3 [d_τ(v_i) + F_b(v_i)]    (1.1.8)

since each triangular face of the polyhedron has three edges and each edge is shared by two faces. Combining (1.1.7) and (1.1.8) yields

    d_e(v_i) = (1/2) d_τ(v_i) + (1/2) F_b(v_i) - N_b(v_i) + 2.    (1.1.9)


Summing this equation over all vertices produces

    d̄_e n(v) = (1/2) [ d̄_τ n(v) + Σ_i F_b(v_i) ] - Σ_i N_b(v_i) + 2 n(v)    (1.1.10)

where d̄_e and d̄_τ are the average vertex degrees with respect to edges and volumes. Since globally we have that

    d̄_e n(v) = 2 n(e),    d̄_τ n(v) = 4 n(τ),    (1.1.11)

substitution of (1.1.11) into (1.1.10) reveals that

    n(e) = n(τ) + n(v) + (1/4) Σ F_b(v_i) - (1/2) Σ N_b(v_i)    (1.1.12)

where Σ N_b(v_i) = n(v)_bound. Finally, note that Σ F_b(v_i) = 3 n(f)_bound. Inserting these relationships into (1.1.12) yields

    n(e) = n(τ) + n(v) + (3/4) n(f)_bound - (1/2) n(v)_bound    (1.1.13)

Other equivalent formulas are easily obtained by combining this equation with the formula relating volumes, faces, and boundary faces, i.e.

    n(f) = 2 n(τ) + (1/2) n(f)_bound    (1.1.14)

An exact formula, similar to (1.1.6), is obtained by combining (1.1.14) and (1.1.13):

    n(e) + n(τ) = n(f) + n(v) + (1/4) n(f)_bound - (1/2) n(v)_bound    (1.1.15)

1.2 Duality

Consider a planar graph G of vertices, edges, and faces (cells). We define a dual graph G_Dual to be any graph which exhibits the following three properties: each vertex of G_Dual is associated with a face of G; each edge of G is associated with an edge of G_Dual; and if an edge separates two faces f_i and f_j of G, then the associated dual edge connects the two vertices of G_Dual associated with f_i and f_j. This duality plays an important role in some algorithms for solving conservation law equations.
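As a quick sanity check (my own aside, not from the notes), the exact formulas (1.1.13)-(1.1.15) can be evaluated on the simplest possible mesh, a single tetrahedron, where every face and vertex lies on the boundary:

```python
# Sketch: check the exact 3-D counting formulas on a single tetrahedron.
n_v, n_e, n_f, n_t = 4, 6, 4, 1   # vertices, edges, faces, volumes
n_f_bound, n_v_bound = 4, 4       # all faces and vertices are on the boundary

assert n_e == n_t + n_v + 3 * n_f_bound // 4 - n_v_bound // 2        # (1.1.13)
assert n_f == 2 * n_t + n_f_bound // 2                               # (1.1.14)
assert n_e + n_t == n_f + n_v + n_f_bound // 4 - n_v_bound // 2      # (1.1.15)
print("single tetrahedron satisfies (1.1.13)-(1.1.15)")
```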


Figure 1.2.0 Several triangulation duals: mesh, median dual, centroid dual, Dirichlet region.

In figure 1.2.0, edges and faces about the central vertex are shown for duals formed from median segments, centroid segments, and by Dirichlet tessellation. (The Dirichlet tessellation of a set of points is a pattern of convex regions in the plane, each region being the portion of the plane closer to some given point P of the set of points than to any other point.) These geometric duals arise naturally for two-dimensional finite-volume schemes which solve conservation law equations. Schemes which use the cells of the mesh as control volumes are often called "cell-centered" schemes. Other "vertex" schemes use mesh duals constructed from median segments, Dirichlet regions, or centroid segments. In later sections, we will construct algorithms for solving the Euler and Navier-Stokes equations which use these geometric duals. The graph-theoretic results of the previous section will provide sharp estimates of the computational work and storage requirements for these algorithms. The estimates obtained in two space dimensions are reasonably intuitive, whereas results in three space dimensions can be nonintuitive and somewhat surprising. This is the power of graph-theoretic and combinatorial analysis.

1.3 Data Structures

The choice of data structures used in representing unstructured grids varies considerably depending on the type of algorithmic operations to be performed. In this section, a few of the most common data structures will be discussed. The mesh is assumed to have a numbering of vertices, edges, faces, etc. In most cases, the physical coordinates are simply listed by vertex number. The "standard" finite element (FE) data structure lists the connectivity of each element. For example, in figure 1.3.0(a), a list of the three vertices of each triangle would be given. The FE structure extends naturally to three dimensions and is used extensively in finite element solvers for solids and fluids.


Figure 1.3.0 Data structures for planar graphs. (a) FE data structure, (b) Edge structure, (c) Out-degree structure, (d) Quad-edge structure.

For planar meshes, another typical structure is the edge structure (figure 1.3.0(b)), which lists the connectivity of vertices and adjacent faces by storing a quadruple for each edge consisting of the origin and destination of the edge as well as the two faces (cells) that share the edge. This structure allows easy traversal through a mesh, which can be useful in certain grid generation algorithms. (This traversal is not easily done using the FE structure.) The extension to three dimensions is facewise (the vertices of a face are given as well as the two neighboring volumes) and requires distinction between different face types.

A third data structure provides connectivity via vertex lists, as shown in figure 1.3.0(c). The brute force approach is to list all adjacent neighbors for each vertex (usually as a linked list). Many sparse matrix solver packages specify the nonzeros of a matrix using row or column storage schemes which list all nonzero entries of a given row or column. For discretizations involving only adjacent neighbors of a mesh, this would be identical to specifying a vertex list. An alternative to specifying all adjacent neighbors is to direct the edges of the mesh. In this case only those edges which point outward from a vertex are listed. In the next section, it will be shown that an out-degree list can be constructed for planar meshes by directing the graph such that no vertex has more than three outgoing edges. This is asymptotically optimal. The extension of the out-degree structure to three dimensions is not straightforward, and algorithms for obtaining optimal edge directions for nonplanar graphs are still under development.

The last structure considered here is the quad-edge structure proposed by Guibas and Stolfi [GuiSt85]; see figure 1.3.0(d). Each edge is stored as a pair of directed edges. Each of the directed edges stores the edge origin and pointers to the next and previous directed edge of the region which lies to the left of the edge. The quad-edge structure is extremely useful in grid generation, where distinctions between topological and geometrical aspects are sometimes crucial. The structure has been extended to three-dimensional arrangements by Dobkin and Laszlo [DobL89] and Brisson [Bri89].
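The connection between the FE structure and the edge structure can be sketched concretely. The function below is my own illustration (names and conventions are mine; triangles are assumed counterclockwise): it derives the edge quadruples (origin, destination, left face, right face) from a triangle list, marking the exterior with face index -1:

```python
# Sketch: derive the edge structure (origin, dest, left_face, right_face)
# from the "standard" FE triangle list.  Face -1 denotes the exterior.
def fe_to_edge_structure(triangles):
    quads = {}  # (origin, dest) with origin < dest -> [left_face, right_face]
    for f, (a, b, c) in enumerate(triangles):
        for u, v in ((a, b), (b, c), (c, a)):   # directed edges, CCW per triangle
            key = (min(u, v), max(u, v))
            quads.setdefault(key, [-1, -1])
            if (u, v) == key:
                quads[key][0] = f   # triangle lies to the left of origin->dest
            else:
                quads[key][1] = f   # triangle lies to the right
    return [(o, d, lf, rf) for (o, d), (lf, rf) in sorted(quads.items())]

# Square-with-center example: interior edges get two faces, boundary edges one.
quads = fe_to_edge_structure([(0, 1, 4), (1, 2, 4), (2, 3, 4), (3, 0, 4)])
boundary = [q for q in quads if -1 in (q[2], q[3])]
print(len(quads), len(boundary))   # 8 4
```

Scanning the quadruples for a -1 entry recovers the boundary edges, the kind of traversal that is awkward with the FE structure alone.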


2.0 Some Basic Graph Operations Important in CFD

Implementation of unstructured grid methods on differing computer architectures has stimulated research in exploiting properties and characterizations of planar and nonplanar graphs. For example, in Hammond and Barth [HamB91] we exploited a recent theorem by Chrobak and Eppstein [ChroE91] concerning directed graphs with minimum out-degree. In this work an Euler equation solver was constructed on arbitrary triangulations. The directed graph of the triangulation was used as a means for optimal load balancing of computation on a SIMD parallel computer.

In this section, we review this result as well as present other basic graph operations that are particularly useful in CFD. Some of these algorithms are specialized to planar graphs (2-D meshes) while others are very general and apply in any number of space dimensions.

2.1 Planar Graphs with Minimum Out-Degree

Theorem: Every planar graph has a 3-bounded orientation [ChroE91].

In other words, each edge of a planar graph can be assigned a direction such that the maximum number of edges pointing outward from any vertex is less than or equal to three. A constructive proof is given in [ChroE91] consisting of the following steps. The first step is to find a reduceable boundary vertex: any vertex on the boundary with incident exterior (boundary) edges that connect to at most two other boundary vertices and any number of interior edges. Chrobak and Eppstein prove that reduceable vertices can always be found for arbitrary planar graphs. (In fact, two reduceable vertices can always be found!) Once a reduceable vertex is found, the two edges connecting to the other boundary vertices are directed outward and the remaining edges are directed inward. These directed edges are then removed from the graph; see figures 2.1.0(a-j). The process is then repeated on the remaining graph until no more edges remain. The algorithm shown pictorially in figure 2.1.0 is summarized in the following steps:

Algorithm: Orient a Graph with Maximum Out-Degree ≤ 3.
Step 1. Find a reduceable boundary vertex.
Step 2. Direct its exterior edges outward and its interior edges inward.
Step 3. Remove the directed edges from the graph.
Step 4. If undirected edges remain, go to step 1.


Figure 2.1.0 (a-i) Mesh orientation procedure with out-degree 3, (j) final oriented triangulation.

2.2 Graph Ordering Techniques

The particular ordering of solution unknowns can have a marked influence on the amount of computational effort and memory storage required to solve large sparse linear systems and eigenvalue problems. In many algorithms, the bandwidth and profile characteristics of the matrix determine the amount of computation and memory required. Most meshes obtained from grid generation codes have very poor natural orderings.


Figure 2.2.0 Typical Steiner triangulation about a multi-component airfoil.

Figure 2.2.1 Nonzero entries of the Laplacian matrix produced from the natural ordering.

Figures 2.2.0 and 2.2.1 show a typical mesh generated about a multi-component airfoil and the nonzero entries associated with the "Laplacian" of the graph. The Laplacian of a graph represents the nonzero entries due to a discretization which involves only adjacent neighbors of the mesh. Figure 2.2.1 indicates that the bandwidth of the natural ordering is almost equal to the dimension of the matrix! In parallel computation, ordering algorithms can also be used as a means for partitioning a mesh among the processors of the computer. This will be addressed in the next section.
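Bandwidth under a given ordering is easy to measure directly. The sketch below (a toy chain graph of my own, not the airfoil mesh) shows how strongly the ordering alone controls the bandwidth; the algorithms described next aim to shrink exactly this quantity:

```python
# Sketch: bandwidth of a graph's matrix under two different vertex orderings.
def bandwidth(edges, order):
    # order[v] is the row/column index assigned to vertex v
    return max(abs(order[u] - order[v]) for u, v in edges)

# A chain of 8 vertices: its Laplacian is tridiagonal only if the chain
# is numbered consecutively.
chain = [(i, i + 1) for i in range(7)]

natural = {v: v for v in range(8)}                          # consecutive numbering
scrambled = {0: 0, 1: 7, 2: 3, 3: 5, 4: 1, 5: 6, 6: 2, 7: 4}

print(bandwidth(chain, scrambled), bandwidth(chain, natural))  # 7 1
```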


Several algorithms exist which construct new orderings by attempting to minimize the bandwidth of a matrix or to minimize the fill that occurs in the process of matrix factorization. These algorithms usually rely on heuristics to obtain high efficiency and do not usually obtain an optimum ordering. One example is Rosen's algorithm [Ros68], which iterates on the ordering to minimize the maximum bandwidth.

Algorithm: Graph ordering, Rosen.
Step 1. Determine the bandwidth and the defining index pair (i, j) with i < j.
Step 2. Does there exist an exchange which increases i or decreases j so that the bandwidth is reduced? If so, exchange and go to step 1.
Step 3. Does there exist an exchange which increases i or decreases j so that the bandwidth remains the same? If so, exchange and go to step 1.

This algorithm produces very good orderings but can be very expensive for large matrices. A popular method which is much less expensive for large matrices is the Cuthill-McKee algorithm [CutM69].

Algorithm: Graph ordering, Cuthill-McKee.
Step 1. Find the vertex with lowest degree. This is the root vertex.
Step 2. Find all neighboring vertices connecting to the root by incident edges. Order them by increasing vertex degree. This forms level 1.
Step 3. Form level k by finding all neighboring vertices of level k-1 which have not been previously ordered. Order these new vertices by increasing vertex degree.
Step 4. If vertices remain, go to step 3.

Figure 2.2.2 Nonzero entries of the Laplacian matrix after Cuthill-McKee ordering.

The heuristics behind the Cuthill-McKee algorithm are very simple. In the graph of the mesh, neighboring vertices must have numberings which are nearby, otherwise they will


produce entries in the matrix with large bandwidth. The idea of sorting elements within a given level is based on the heuristic that vertices with high degree should be given indices as large as possible so that they will be as close as possible to the vertices of the next level generated. Figure 2.2.2 shows the dramatic improvement of the Cuthill-McKee ordering on the matrix shown in figure 2.2.1.

Studies of the Cuthill-McKee algorithm have shown that the fill characteristics of a matrix during L-U decomposition can be greatly reduced simply by reversing the ordering of the Cuthill-McKee algorithm; see George [Geo71]. This amounts to a renumbering given by

    k → n - k + 1    (2.2.0)

where n is the size of the matrix. While this does not change the bandwidth of the matrix, it can dramatically reduce the fill that occurs in Cholesky or L-U matrix factorization when compared to the original Cuthill-McKee algorithm.

2.3 Graph Partitioning for Parallel Computing

An efficient partitioning of a mesh for distributed memory machines is one that ensures an even distribution of computational workload among the processors and minimizes the amount of time spent in interprocessor communication. The former requirement is termed load balancing: if the load is not evenly distributed, some processors will sit idle at synchronization points waiting for other processors to catch up. The second requirement comes from the fact that communication between processors takes time and it is not always possible to hide this latency in data transfer. In our parallel implementation of a finite-volume flow solver on unstructured grids, data for the vertices that lie on the boundary between two processors is exchanged, hence requiring a bidirectional data transfer. On some systems, a synchronous exchange of data can yield higher performance than an asynchronous one.
To exploit this fact, edges of the communication graph are colored such that no vertex has more than one edge of a certain color incident upon it.

Figure 2.3.0 (a) Four-partition mesh, (b) Communication graph.

A communication graph is a graph in which the vertices are the processors and an edge connects two vertices if the two corresponding processors share an interprocessor boundary. The colors in the graph represent separate communication cycles. For instance, the mesh partitioned amongst four processors as shown in figure 2.3.0(a) would produce the communication graph shown in figure 2.3.0(b). The graph shown in figure 2.3.0(b) can be colored edgewise using three colors. In the first communication cycle, processors (1,4) could perform a synchronous data exchange, as would processors (2,3). In the second communication cycle, processors (1,2) and (3,4) would exchange, and in the third cycle, processors (1,3) would exchange while processors 2 and 4 sit idle. Vizing's theorem proves that any graph of maximum vertex degree Δ (number of edges incident upon a vertex) can be edge-colored using n colors such that Δ ≤ n ≤ Δ + 1. Hence, any operation that calls for every processor to exchange data with its neighbors will require n communication cycles.

The actual cost of communication can often be accurately modeled by the linear relationship

    Cost = α + β m    (2.3.0)

where α is the time required to initiate a message, β is the inverse rate of data transfer between two processors, and m is the message length. For n messages, the cost would be

    Cost = Σ_n (α + β m_n).    (2.3.1)

This cost can be reduced in two ways. One way is to reduce Δ, thereby reducing n. The alternative is to reduce the individual message lengths. The bounds on Δ are 2 ≤ Δ ≤ P - 1 for P ≥ 3, where P is the total number of processors. Figure 2.3.1(a) shows the partitioning of a mesh which reduces Δ, and figure 2.3.1(b) shows a partitioning which minimizes the message lengths. For the mesh in figure 2.3.1(a), Δ = 2 while in figure 2.3.1(b), Δ = 3. However, the average message length for the partitions shown in figure 2.3.1(b) is about half as much as that in figure 2.3.1(a).

Figure 2.3.1 (a) Mesh partitioning with minimized Δ, (b) Mesh partitioning with minimized message length.

In practice, it is difficult to partition an unstructured mesh while simultaneously minimizing the number and length of messages.
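The assignment of communication cycles can be computed with a simple greedy edge coloring; the sketch below is my own, using the processor pairs described in the text for figure 2.3.0(b):

```python
# Sketch: greedy edge coloring of the communication graph of figure 2.3.0(b).
# Each color is one synchronous communication cycle.
edges = [(1, 4), (2, 3), (1, 2), (3, 4), (1, 3)]  # processor pairs from the text

color_of = {}
used_at = {p: set() for p in (1, 2, 3, 4)}  # colors already incident to a vertex
for u, v in edges:
    c = 0
    while c in used_at[u] or c in used_at[v]:  # smallest color free at both ends
        c += 1
    color_of[(u, v)] = c
    used_at[u].add(c)
    used_at[v].add(c)

n_colors = max(color_of.values()) + 1
max_degree = max(len(s) for s in used_at.values())
print(n_colors, max_degree)  # 3 3
```

Greedy coloring is not guaranteed to reach Vizing's lower bound Δ in general, but it never needs more than 2Δ - 1 colors and here matches the three cycles described in the text.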
In the following paragraphs, a few of the most popular partitioning algorithms which approximately accomplish this task will be discussed. All of the algorithms discussed below (coordinate bisection, Cuthill-McKee, and spectral partitioning) are evaluated in the paper by Venkatakrishnan, Simon, and Barth [VenSB91]. This paper evaluates the partitioning techniques within the confines of an explicit, unstructured finite-volume Euler solver. Spectral partitioning has been extensively studied by Simon [Simon91] for other applications.

Note that for the particular applications that we have in mind (a finite-volume scheme with solution unknowns at vertices of the mesh), it makes sense to partition the domain such that the separators correspond to edges of the mesh. Also note that the partitioning algorithms can all be implemented recursively. The mesh is first divided into two sub-meshes of nearly equal size. Each of these sub-meshes is subdivided into two more sub-meshes, and the process is repeated until the desired number of partitions P is obtained (P an integer power of 2). Since we desire the separator of the partitions to coincide with edges of the mesh, the division of a sub-mesh into two pieces can be viewed as a 2-coloring of the faces of the sub-mesh. For the Cuthill-McKee and spectral partitioning techniques, this amounts to supplying these algorithms with the dual of the graph for purposes of the 2-coloring. The balancing of each partition is usually done cellwise, although an edgewise balancing is more appropriate in the present applications. Due to the recursive nature of the partitioning, the algorithms outlined below represent only a single step of the process.

Coordinate Bisection

In the coordinate bisection algorithm, face centroids are sorted either horizontally or vertically depending on the current level of the recursion.

Figure 2.3.2 Coordinate bisection (16 partitions).

A separator is chosen which balances the number of faces. Faces are colored depending on which side of the separator they are located. The actual edges of the mesh corresponding to the separator are characterized as those edges which have adjacent faces of different color; see figure 2.3.2.
This partitioning is very efficient to create but gives sub-optimal performance in parallel computations owing to the long message lengths that can routinely occur.
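Recursive coordinate bisection can be sketched in a few lines. In the sketch below (my own minimal rendering; a real implementation would operate on the face centroids of the mesh), points stand in for centroids and the sort direction alternates with recursion level:

```python
# Sketch: recursive coordinate bisection of face centroids into 2**levels parts.
def coordinate_bisection(points, levels, axis=0):
    if levels == 0:
        return [points]
    pts = sorted(points, key=lambda p: p[axis])   # sort along current axis
    half = len(pts) // 2                          # median separator
    nxt = 1 - axis                                # alternate x / y each level
    return (coordinate_bisection(pts[:half], levels - 1, nxt) +
            coordinate_bisection(pts[half:], levels - 1, nxt))

# 64 centroids on a jittered lattice, split into 4 balanced partitions.
pts = [(i % 8 + 0.1 * ((i * 7) % 3), i // 8) for i in range(64)]
parts = coordinate_bisection(pts, 2)
print([len(p) for p in parts])  # [16, 16, 16, 16]
```

Balance is guaranteed by the median split; the long, poorly shaped interfaces the text mentions are the price of ignoring connectivity entirely.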


Cuthill-McKee

The Cuthill-McKee algorithm described earlier can also be used for recursive mesh partitioning. In this case, the Cuthill-McKee ordering is performed on the dual of the mesh graph. A separator is chosen either at the median of the ordering (which would balance the coloring of faces of the original mesh) or at the level-set boundary as close to the median as possible. This latter technique has the desired effect of reducing the number of disconnected sub-graphs that occur during the course of the partitioning. Figure 2.3.3 shows a Cuthill-McKee partitioning for the multi-component airfoil mesh. The Cuthill-McKee ordering tends to produce long boundaries because of the way that the ordering is propagated through a mesh. The maximum degree of the communication graph also tends to be higher using the Cuthill-McKee algorithm. The results shown in [VenSB91] for multi-component airfoil grids indicate a performance on parallel computations which is slightly worse than the coordinate bisection technique.

Figure 2.3.3 Cuthill-McKee partitioning of mesh.

Spectral Partitioning

The last partitioning considered is the spectral partitioning [PothSL90], which exploits properties of the Laplacian L of a graph (defined below). The algorithm consists of the following steps:

Algorithm: Spectral Partitioning.
Step 1. Calculate the matrix L associated with the Laplacian of the graph (dual graph in the present case).
Step 2. Calculate the eigenvalues and eigenvectors of L.
Step 3. Order the eigenvalues by magnitude, λ1 ≤ λ2 ≤ λ3 ≤ ... ≤ λN.
Step 4. Determine the smallest nonzero eigenvalue, λf, and its associated eigenvector xf (the Fiedler vector).
Step 5. Sort elements of the Fiedler vector.


Step 6. Choose a divisor at the median of the sorted list and 2-color vertices of the graph (or dual) which correspond to elements of the Fiedler vector less than or greater than the median value.

The spectral partitioning of the multi-component airfoil is shown in figure 2.3.4. In [VenSB91] it was observed that superior performance was attained for parallel computations on the spectral partitioning than on the coordinate bisection or Cuthill-McKee partitionings. The cost of the spectral partitioning is high (even using a Lanczos algorithm to compute the eigenvalue problem). This cost has been reduced by the use of a multilevel Lanczos algorithm in [BarnS93].

Figure 2.3.4 Spectral partitioning of multi-component airfoil.

The spectral partitioning exploits a peculiar property of the "second" eigenvalue of the Laplacian matrix associated with a graph. The Laplacian matrix of a graph is simply

    L = -D + A    (2.3.2)

where A is the standard adjacency matrix

    A_ij = 1 if e(v_i, v_j) ∈ G, 0 otherwise

and D is a diagonal matrix with entries equal to the degree of each vertex, D_ii = d(v_i). From this definition, it should be clear that the rows of L each sum to zero. Define an N-vector s = [1, 1, 1, ...]^T. By construction we have that

    L s = 0.    (2.3.3)

This means that at least one eigenvalue is zero with s as an eigenvector.
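The definition (2.3.2) and property (2.3.3) are easy to verify numerically, along with the cut-edge identity 4Ec = ||Lp||_1 used later in the proof sketch. The small graph below is my own example:

```python
# Sketch: build L = -D + A for a small graph, check L s = 0, and check that
# ||L p||_1 equals 4 * (number of edges cut by the 2-coloring p).
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]   # a 4-cycle plus one diagonal
n = 4

L = [[0] * n for _ in range(n)]
for u, v in edges:
    L[u][v] = L[v][u] = 1          # adjacency part A
    L[u][u] -= 1                   # -D on the diagonal
    L[v][v] -= 1

s = [1] * n
assert all(sum(L[i][j] * s[j] for j in range(n)) == 0 for i in range(n))  # (2.3.3)

p = [+1, -1, +1, -1]               # a balanced 2-coloring, so s . p = 0
cut = sum(1 for u, v in edges if p[u] != p[v])
Lp_l1 = sum(abs(sum(L[i][j] * p[j] for j in range(n))) for i in range(n))
assert Lp_l1 == 4 * cut            # the cut-edge identity
print(cut, Lp_l1)                  # 4 16
```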


The objective of the spectral partitioning is to divide the mesh into two partitions of equal size such that the number of edges cut by the partition boundary is approximately minimized.

Technically speaking, the smallest nonzero eigenvalue need not be the second. Graphs with disconnected regions will have more than one zero eigenvalue, depending on the number of disconnected regions. For purposes of discussion, we assume that disconnected regions are not present, i.e. that λ2 is the relevant eigenmode.

Elements of the proof: Define a partitioning vector which 2-colors the vertices,

    p = [+1, -1, -1, +1, +1, ..., +1, -1]^T    (2.3.4)

depending on the sign of the elements of p and the one-to-one correspondence with vertices of the graph; see for example figure 2.3.5. Balancing the number of vertices of each color amounts to the requirement that

    s ⊥ p    (2.3.5)

where we have assumed an even number of vertices.

Figure 2.3.5 Arbitrary graph with 2-coloring (vertices labeled +1 and -1) showing separator and cut edges.

The key observation is that the number of cut edges, Ec, is precisely related to the L1 norm of the Laplacian matrix multiplying the partitioning vector, i.e.

    4 Ec = ||L p||_1    (2.3.6)

which can be easily verified. The goal is to minimize cut edges, that is, to find p which minimizes ||L p||_1 subject to the constraints that ||p||_1 = N and s ⊥ p. Since L is a real symmetric (positive semi-definite) matrix, it has a complete set of real eigenvectors which can be orthogonalized with each other. The next step of the proof would be to extend the


domain of p to include real numbers (this introduces an inequality) and expand p in terms of the orthogonal eigenvectors:

    p = Σ_{i=1}^{n} c_i x_i    (2.3.7)

By virtue of (2.3.5) we have that x1 = s. It remains to be shown that ||L p||_1 is minimized when p = p0 = n x2 / ||x2||_1, i.e. when the Fiedler vector is used. Inserting this expression for p, we have that

    ||L p0||_1 = n λ2    (2.3.8)

It is a simple matter to show that adding any other eigenvector component to p0, while insisting that ||p||_1 = N, can only increase the L1 norm. This completes the proof. Figure 2.3.6 plots contours (level sets) of the Fiedler vector for the multi-component airfoil problem.

Figure 2.3.6 Contours of the Fiedler vector for spectral partitioning. Dashed lines are less than the median value.

3.0 Triangulation Methods

Although many algorithms exist for triangulating sites (points) in an arbitrary number of space dimensions, only a few have been used on practical problems. In particular, Delaunay triangulation has proven to be a very useful triangulation technique. This section will present some of the basic concepts surrounding Delaunay and related triangulations as well as discuss some of the most popular algorithms for constructing these triangulations.

3.1 Voronoi Diagrams and the Delaunay Triangulation

Recall the definition of the Dirichlet tessellation in a plane. The Dirichlet tessellation of a point set is the pattern of convex regions, each being closer to some point P in the point set than to any other point in the set. These Dirichlet regions are also called Voronoi


regions [Vor07]. The edges of Voronoi polygons comprise the Voronoi diagram, see �gure3.1.0. The idea extends naturally to higher dimensions.Figure 3.1.0 Voronoi diagram of 40 random sites.Voronoi diagrams have a rich mathematical theory. The Voronoi diagram is believed to beone of the most fundamental constructs de�ned by discrete data. Voronoi diagrams havebeen independently discovered in a wide variety of disciplines. Computational geometri-cians have a keen interest in Voronoi diagrams. It is well known that Voronoi diagrams arerelated to convex hulls via stereographic projection. Point location in a Voronoi diagramcan be performed in O(log(n)) time with O(n) storage for n regions. This is useful insolving post-o�ce or related problems in optimal time. Another example of the Voronoidiagram which occurs in the natural sciences can be visualized by placing crystal \seeds"at random sites in 3-space. Let the crystals grow at the same rate in all directions. Whentwo crystals collide simply stop their growth. The crystal formed for each site would rep-resent that volume of space which is closer to that site than to any other site. This woulde�ectively construct a Voronoi diagram. We now consider the role of Voronoi diagrams inDelaunay triangulation [Del34].De�nition: The Delaunay triangulation of a point set is de�ned as the dual of the Voronoidiagram of the set.The Delaunay triangulation in two space dimensions is formed by connecting two points ifand only if their Voronoi regions have a common border segment. If no four or more pointsare cocircular, then we have that vertices of the Voronoi are circumcenters of the triangles.This is true because vertices of the Voronoi represent locations that are equidistant to three(or more) sites. 
Also note that from the de�nition of duality, edges of the Voronoi are inone-to-one correspondence to edges of the Delaunay triangulation (ignoring boundaries).Because edges of the Voronoi diagram are the locus of points equidistant to two sites, eachedge of the Voronoi diagram is perpendicular to the corresponding edge of the Delaunaytriangulation. This duality extends to three dimensions in a straightforward way. The19
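Since vertices of the Voronoi diagram are circumcenters of the Delaunay triangles, the dual diagram can be recovered from circumcenter computations alone. The following is a minimal floating-point sketch (the function name is ours, and no robustness safeguards for nearly collinear sites are included):

```python
def circumcenter(a, b, c):
    """Circumcenter of triangle (a, b, c); in the dual view, a Voronoi vertex."""
    ax, ay = a
    bx, by = b
    cx, cy = c
    # Twice the signed area of the triangle; zero means the sites are collinear.
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if d == 0.0:
        raise ValueError("degenerate (collinear) triangle")
    ux = ((ax**2 + ay**2) * (by - cy) + (bx**2 + by**2) * (cy - ay)
          + (cx**2 + cy**2) * (ay - by)) / d
    uy = ((ax**2 + ay**2) * (cx - bx) + (bx**2 + by**2) * (ax - cx)
          + (cx**2 + cy**2) * (bx - ax)) / d
    return (ux, uy)
```

By construction the returned point is equidistant to the three sites, which is exactly the defining property of a Voronoi vertex.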


The Delaunay triangulation possesses several alternate characterizations and many properties of importance. Unfortunately, not all of the two-dimensional characterizations have three-dimensional extensions. To avoid confusion, properties and algorithms for construction of two-dimensional Delaunay triangulations will be considered first. The remainder of this section will then discuss the three-dimensional Delaunay triangulation.

3.2 Properties of 2-D Delaunay Triangulations

(1) Uniqueness. The Delaunay triangulation is unique. This assumes that no four sites are cocircular. The uniqueness follows from the uniqueness of the Dirichlet tessellation.

(2) The circumcircle criterion. A triangulation of N ≥ 3 sites is Delaunay if and only if the circumcircle of every interior triangle is point-free. For if this were not true, the Voronoi regions associated with the dual would not be convex and the Dirichlet tessellation would be invalid. Related to the circumcircle criterion is the incircle test for four points, as shown in figures 3.2.0-3.2.1.

Figure 3.2.0 Incircle test for △ABC and D (true).

Figure 3.2.1 Incircle test for △ABC and D (false).

This test is true if point D lies interior to the circumcircle of △ABC, which is equivalent to testing whether ∠ABC + ∠CDA is less than or greater than ∠BCD + ∠BAD. More precisely, we have that

    ∠ABC + ∠CDA < 180°   incircle false
    ∠ABC + ∠CDA = 180°   A, B, C, D cocircular      (3.2.0)
    ∠ABC + ∠CDA > 180°   incircle true

Since the interior angles of the quadrilateral sum to 360°, if the circumcircle of △ABC contains D then swapping the diagonal edge from position A-C into B-D guarantees that the new triangle pair satisfies the circumcircle criterion. Furthermore, this process of diagonal swapping is local, i.e. it does not disrupt the Delaunayhood of any triangles adjacent to the quadrilateral.

(3) Edge circle property. A triangulation of sites is Delaunay if and only if for each and every edge there exists some point-free circle passing through its endpoints. This characterization is very useful because it also provides a mechanism for defining a constrained Delaunay triangulation, in which certain edges are prescribed a priori. A triangulation of sites is a constrained Delaunay triangulation if for each and every edge of the mesh there exists some circle passing through its endpoints containing no other site in the triangulation which is visible to the edge. In figure 3.2.2, site d is not visible to the segment a-c because of the constrained edge a-b.

Figure 3.2.2 Constrained Delaunay triangulation. Site d is not visible to a-c due to constrained segment a-b.

(4) Equiangularity property. The Delaunay triangulation maximizes the minimum angle of the triangulation. For this reason the Delaunay triangulation is often called the MaxMin triangulation. This property also holds locally for all adjacent triangle pairs which form a convex quadrilateral. This is the basis for the local edge swapping algorithm of Lawson [Law77] described below.

(5) Minimum containment circle. A recent result by Rajan [Raj91] shows that the Delaunay triangulation minimizes the maximum containment circle over the entire triangulation. The containment circle is defined as the smallest circle enclosing the three vertices of a triangle. This is identical to the circumcircle for acute triangles, and to the circle whose diameter is the longest side of the triangle for obtuse triangles (see figure 3.2.3). This property extends to n dimensions. Unfortunately, the result does not hold lexicographically.

Figure 3.2.3 Containment circles for (a) acute and (b) obtuse triangles.

(6) Nearest neighbor property. An edge formed by joining a vertex to its nearest neighbor is an edge of the Delaunay triangulation. This property makes Delaunay triangulation a powerful tool in solving closest proximity problems. Note that the nearest neighbor edges do not describe all edges of the Delaunay triangulation.

(7) Minimal roughness. The Delaunay triangulation is a minimal roughness triangulation for arbitrary sets of scattered data [Rippa90]. Given arbitrary data f_i at all vertices of the mesh and a triangulation of these points, a unique piecewise linear interpolating surface can be constructed. The Delaunay triangulation has the property that of all triangulations it minimizes the roughness of this surface as measured by the following Sobolev semi-norm:

    ∫_T [ (∂f/∂x)² + (∂f/∂y)² ] dx dy      (3.2)

This is an interesting result as it does not depend on the actual form of the data. It also indicates that the Delaunay triangulation approximates well those functions which minimize this semi-norm. One example would be the harmonic functions satisfying Laplace's equation with suitable boundary conditions, which minimize exactly this norm. In a later section, we will prove that a Delaunay triangulation guarantees a maximum principle for the discrete Laplacian approximation (with linear elements).

3.3 Algorithms for 2-D Delaunay Triangulation

We now consider several techniques for Delaunay triangulation in two dimensions. These methods were chosen because they perform optimally in rather different situations. The discussion of the 2-D algorithms is organized as follows:


(a) Divide and Conquer Algorithm
(b) Incremental Insertion Algorithms
    (i) Bowyer algorithm
    (ii) Watson algorithm
    (iii) Green and Sibson algorithm
    (iv) Randomized algorithms
(c) Global Edge Swapping (Lawson)

3.3a Divide-and-Conquer Algorithm

In this algorithm, the sites are assumed to be prespecified. The idea is to partition the cloud of points T (sorted along a convenient axis) into left (L) and right (R) half planes. Each half plane is then recursively Delaunay triangulated. The two halves must then be merged together to form a single Delaunay triangulation. Note that we assume that the points have been sorted along the x-axis for purposes of the following discussion (this can be done with O(N log N) complexity).

Algorithm: Delaunay Triangulation of Point Set Via Divide-and-Conquer
Step 1. Partition T into two subsets T_L and T_R of nearly equal size.
Step 2. Delaunay triangulate T_L and T_R recursively.
Step 3. Merge T_L and T_R into a single Delaunay triangulation.

Figure 3.3.0 Triangulated subdivisions.


Figure 3.3.1 Triangulation after merge.

The only difficult step in the divide-and-conquer algorithm is the merging of the left and right triangulations. The process is simplified by noting two properties of the merge:

(1) Only cross edges (L-R or R-L) are created in the merging process. Since vertices are neither added nor deleted in the merge process, the need for a new R-R or L-L edge would indicate that the original right or left triangulation was defective. (Note that the merging process will require the deletion of edges L-L and/or R-R.)

(2) Vertices with minimum (maximum) y value in the left and right triangulations always connect as cross edges. This provides an initial guess for determining the lower common tangent (LCT) of the two triangulations (dashed line in figure 3.3.2).

Figure 3.3.2 Calculation of the LCT given an initial guess.

Given a starting guess for the LCT, a traversal algorithm finds the LCT in overall time O(N).
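Both the tangent-finding traversal and the convexity checks used by the swapping algorithms later in these notes reduce to a single signed-area orientation test. A floating-point sketch follows (robust implementations substitute exact arithmetic for the subtraction-heavy expression):

```python
def orient2d(a, b, c):
    """Twice the signed area of triangle a-b-c: positive for a counterclockwise
    turn a -> b -> c, negative for a clockwise turn, zero for collinear points."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
```

Walking toward the lower common tangent amounts to advancing an endpoint while `orient2d` reports that some hull neighbor lies on the wrong side of the current candidate edge.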


Given these properties and an LCT for the two triangulations, we now outline the "rising bubble" [GuiS85] merge algorithm. This algorithm produces cross edges in ascending y-order. The LCT forms the initial cross edge for the rising bubble algorithm. More generally, consider the situation in which we have a cross edge between A and B and all edges incident to the points A and B with endpoints above the half plane formed by a line passing through A-B, see figure 3.3.3.

Figure 3.3.3 Circle of increasing radius in rising bubble algorithm.

This figure depicts a continuously transformed circle of increasing radius passing through the points A and B. Eventually the circle increases in size and encounters a point C from the left or right triangulation (in this case, point C is in the left triangulation). A new cross edge (dashed line in figure 3.3.3) is then formed by connecting this point to a vertex of A-B in the other half triangulation. Given the new cross edge, the process can then be repeated, and it terminates when the top of the two meshes is reached. The deletion of L-L or R-R edges can take place during or after the addition of the cross edges. Properly implemented, the merge can be carried out in linear time, O(N). Denoting T(N) as the total running time, step 2 is completed in approximately 2T(N/2). Thus the total running time is described by the recurrence T(N) = 2T(N/2) + O(N) = O(N log N).

3.3b Incremental Insertion Algorithms

For simplicity, assume that the site to be added lies within a bounding polygon of the existing triangulation. If we desire a triangulation from a new set of sites, three initial phantom points can always be added which define a triangle large enough to enclose all points to be inserted. In addition, interior boundaries are usually temporarily ignored for purposes of the Delaunay triangulation. After completing the triangulation, spurious edges are then deleted as a postprocessing step. Incremental insertion algorithms begin by inserting a new site into an existing Delaunay triangulation. This introduces the task of point location in the triangulation. Some incremental algorithms require knowing which triangle the new site falls within. Other algorithms require knowing any triangle whose circumcircle contains the new site. In either case, two extremes arise in this regard. Typical mesh adaptation and refinement algorithms determine the particular cell for site insertion as part of the mesh adaptation algorithm, thereby reducing the burden of point location. In the other extreme, initial triangulations of randomly distributed sites usually require advanced searching techniques for point location to achieve asymptotically optimal complexity O(N log N). Search algorithms based on quad-tree and split-tree data structures work extremely well in this case. Alternatively, search techniques based on graph traversal or "walking" algorithms are frequently used because of their simplicity. These methods work extremely well when successively added points are close together. The basic idea is to start at the location in the mesh of the previously inserted point and move one edge (or cell) at a time in the general direction of the newly added point. In the worst case, each point insertion requires O(N) traversals. This would result in a worst case overall complexity of O(N²). For randomly distributed points, the average point insertion requires O(N^(1/2)) traversals, which gives an overall complexity of O(N^(3/2)).

The second issue in complexity analysis concerns the number of structural changes required when each site is inserted. Unfortunately this is highly dependent on the order in which sites are inserted. In fact, poor orderings can give rise to worst case situations requiring O(N²) structural changes. In section 3.3c we consider algorithms which randomize the order in which sites are presented for insertion. When coupled with the Green-Sibson technique, the resulting algorithm has expected N log N performance. As we will see, the technique also suggests a new randomized data structure for proximity query which we later exploit in a multigrid method for solving differential equations.

Bowyer's algorithm [Bow81]

The basic idea in Bowyer's algorithm is to insert a new site into an existing Voronoi diagram (for example site Q in figure 3.3.4), determine its territory (dashed line in figure 3.3.4), delete any edges completely contained in the territory, then add new edges and reconfigure existing edges in the diagram.
The following is Bowyer's algorithm essentially as presented by Bowyer:

Algorithm: Incremental Delaunay Triangulation, Bowyer [Bow81].
Step 1. Insert new point (site) Q into the Voronoi diagram.
Step 2. Find any existing vertex in the Voronoi diagram closer to the new point than to its forming points. This vertex will be deleted in the new Voronoi diagram.
Step 3. Perform a tree search to find the remaining set of deletable vertices V that are closer to the new point than to their forming points. (In figure 3.3.4 this would be the set {v3, v4, v5}.)
Step 4. Find the set P of forming points corresponding to the deletable vertices. In figure 3.3.4, this would be the set {p2, p3, p4, p5, p7}.
Step 5. Delete those edges of the Voronoi diagram described by pairs of vertices in the set V for which both forming points of the edge are contained in P.
Step 6. Calculate the new vertices of the Voronoi diagram, compute their forming points and neighboring vertices, and update the Voronoi data structure.
Figure 3.3.4 Voronoi diagram modified by Bowyer.

Implementational details and suggested data structures are given in the paper by Bowyer.

Watson's algorithm [Wat81]

Implementation of the Watson algorithm is relatively straightforward. The first step is to insert a new site into an existing Delaunay triangulation and to find any triangle (the root) such that the new site lies interior to that triangle's circumcircle. Starting at the root, a tree search is performed to find all triangles with circumcircles containing the new site. This is accomplished by recursively checking triangle neighbors. (The resulting set of deletable triangles violating the circumcircle criterion is independent of the starting root.) Removal of the deletable triangles exposes a polygonal cavity surrounding site Q with all the vertices of the polygon visible to site Q. The interior of the cavity is then retriangulated by connecting the vertices of the polygon to site Q, see figure 3.3.5(b). This completes the algorithm. A thorough account of Watson's algorithm is given by Baker [Bak89], where he considers issues associated with constrained triangulations.

Algorithm: Incremental Delaunay Triangulation, Watson [Wat81].
Step 1. Insert new site Q into existing Delaunay triangulation.
Step 2. Find any triangle with circumcircle containing site Q.
Step 3. Perform tree search to find the remaining set of deletable triangles with circumcircles containing site Q.
Step 4. Construct the list of edges associated with the deletable triangles. Delete all edges from the list that appear more than once.
Step 5. Connect the remaining edges to site Q and update the Delaunay data structure.
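The five steps above translate almost directly into code. The sketch below is a deliberately naive illustration of one Watson insertion step: it finds the deletable set by scanning every triangle rather than by the tree search of Steps 2-3, and it assumes the caller has seeded the triangulation with an enclosing super-triangle of phantom points as described earlier. All names are ours, and no floating-point robustness is attempted.

```python
import math

def circumcircle(tri, pts):
    """Circumcenter and circumradius of triangle tri (a vertex-index triple into pts).
    Assumes the triangle is non-degenerate."""
    (ax, ay), (bx, by), (cx, cy) = (pts[i] for i in tri)
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    ux = ((ax*ax + ay*ay)*(by - cy) + (bx*bx + by*by)*(cy - ay)
          + (cx*cx + cy*cy)*(ay - by)) / d
    uy = ((ax*ax + ay*ay)*(cx - bx) + (bx*bx + by*by)*(ax - cx)
          + (cx*cx + cy*cy)*(bx - ax)) / d
    return (ux, uy), math.hypot(ax - ux, ay - uy)

def insert_site(q, triangles, pts):
    """One Watson insertion step: delete triangles whose circumcircle contains q,
    then retriangulate the exposed cavity by joining its boundary edges to q."""
    pts.append(q)
    iq = len(pts) - 1
    deletable = []                      # Steps 2-3, found here by a full scan
    for t in triangles:
        (cx, cy), r = circumcircle(t, pts)
        if math.hypot(q[0] - cx, q[1] - cy) < r:
            deletable.append(t)
    counts = {}                         # Step 4: edges appearing once bound the cavity
    for a, b, c in deletable:
        for e in ((a, b), (b, c), (c, a)):
            key = tuple(sorted(e))
            counts[key] = counts.get(key, 0) + 1
    boundary = [e for e, n in counts.items() if n == 1]
    remaining = [t for t in triangles if t not in deletable]
    return remaining + [(a, b, iq) for a, b in boundary]    # Step 5
```

To triangulate a point set, seed `pts` with the three phantom points, start from the single super-triangle, call `insert_site` once per site (the function appends to `pts`), and finally discard all triangles touching a phantom vertex.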


Figure 3.3.5 (a) Delaunay triangulation with site added. (b) Triangulation after deletion of invalid edges and reconnection.

Green and Sibson algorithm [GreS77]

The algorithm of Green and Sibson is very similar to the Watson algorithm. The primary difference is the use of local edge transformations (swapping) to reconfigure the triangulation. The first step is location, i.e. find the triangle containing point Q. Once this is done, three edges are created connecting Q to the vertices of this triangle, as shown in figure 3.3.6(a).

Figure 3.3.6 (a) Insertion of new vertex, (b) Swapping of suspect edge.

If the point falls on an edge, then the edge is deleted and four edges are created connecting to the vertices of the newly created quadrilateral. Using the circumcircle criterion it can be shown that the newly created edges (3 or 4) are automatically Delaunay. Unfortunately, some of the original edges are now incorrect. We need to somehow find all "suspect" edges which could possibly fail the circle test. Given that this can be done (described below), each suspect edge is viewed as a diagonal of the quadrilateral formed from the two adjacent triangles. The circumcircle test is applied to either one of the two adjacent triangles of the quadrilateral. If the fourth point of the quadrilateral is interior to this circumcircle, the suspect edge is swapped as shown in figure 3.3.6(b), and two more edges then become suspect. At any given time we can immediately identify all suspect edges. To do this, first consider the subset of all triangles which share Q as a vertex. One can guarantee at all times that all initial edges incident to Q are Delaunay and that any edge made incident to Q by swapping must be Delaunay. Therefore, we need only consider the remaining edges of this subset, which form a polygon about Q, as suspect and subject to the incircle test. The process terminates when all suspect edges pass the circumcircle test.

The algorithm can be summarized as follows:

Algorithm: Incremental Delaunay Triangulation, Green and Sibson [GreS77]
Step 1. Locate existing cell enclosing point Q.
Step 2. Insert site and connect to 3 or 4 surrounding vertices.
Step 3. Identify suspect edges.
Step 4. Perform edge swapping of all suspect edges failing the incircle test.
Step 5. Identify new suspect edges.
Step 6. If new suspect edges have been created, go to Step 3.

The Green and Sibson algorithm can be implemented using standard recursive programming techniques. The heart of the algorithm is the recursive procedure, which would take the following form for the configuration shown in figure 3.3.7:

procedure swap[vq, v1, v2, v3, edges]
  if (incircle[vq, v1, v2, v3] = TRUE) then
    call reconfig_edges[vq, v1, v2, v3, edges]
    call swap[vq, v1, v4, v2, edges]
    call swap[vq, v2, v5, v3, edges]
  endif
endprocedure

This example illustrates an important point. The nature of Delaunay triangulation guarantees that any edges swapped incident to Q will be final edges of the Delaunay triangulation. This means that we need only consider forward propagation in the recursive procedure. In a later section, we will consider incremental insertion and edge swapping for generating non-Delaunay triangulations based on other swapping criteria. Such an algorithm can also be programmed recursively but requires backward propagation in the recursive implementation. For the Delaunay triangulation algorithm, the insertion algorithm simplifies to the following three steps:

Recursive Algorithm: Incremental Delaunay Triangulation, Green and Sibson
Step 1. Locate existing cell enclosing point Q.
Step 2. Insert site and connect to surrounding vertices.
Step 3. Perform recursive edge swapping on newly formed cells (3 or 4).
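The incircle predicate invoked by this procedure can be written directly from the opposite-angle characterization of equation (3.2.0): point d lies inside the circumcircle of triangle a-b-c exactly when the opposite angles of the quadrilateral a-b-c-d sum to more than 180°. A floating-point sketch (function names are ours; nearly cocircular cases would need exact arithmetic):

```python
import math

def angle_at(p, a, b):
    """Interior angle, in degrees, at vertex p of the wedge a-p-b."""
    v1 = (a[0] - p[0], a[1] - p[1])
    v2 = (b[0] - p[0], b[1] - p[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cosv = dot / (math.hypot(*v1) * math.hypot(*v2))
    return math.degrees(math.acos(max(-1.0, min(1.0, cosv))))

def incircle(a, b, c, d):
    """Equation (3.2.0): true iff d is interior to the circumcircle of triangle a-b-c,
    tested via the opposite angles of quadrilateral a-b-c-d."""
    return angle_at(b, a, c) + angle_at(d, c, a) > 180.0
```

Production codes normally use the equivalent 3x3 incircle determinant instead of trigonometry, but the angle form mirrors the derivation in section 3.2.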



Figure 3.3.7 Edge swapping with forward propagation.

3.3c Randomized Algorithms [GuiKS92]

As mentioned in the introduction to incremental insertion methods, special configurations of sites can lead to O(N²) structural changes. Figure 3.3.8 shows one such example. This quadratic behavior can be eliminated with extremely high probability by randomizing the order in which sites are triangulated.
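In code, the randomization itself is trivial; the expected-case savings come entirely from the insertion order, not from any new machinery. A sketch (assuming the shuffled sites are then fed to an incremental algorithm such as Green-Sibson; the function name and fixed-seed convention are ours):

```python
import random

def randomized_insertion_order(sites, seed=0):
    """Return the sites in a shuffled order for incremental insertion.
    A fixed seed keeps runs reproducible; shuffling defeats adversarial
    orderings such as the right-to-left sequence of figure 3.3.8."""
    order = list(sites)              # leave the caller's list untouched
    random.Random(seed).shuffle(order)
    return order
```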


Figure 3.3.8 (a) Initial triangulation, (b) final triangulation. Difficult case which produces a quadratic number of structural changes when sites 1-6 are added from right to left.

Define the scope of a triangle, w(△), to be the number of sites inside the circumcircle of △, and let T_j be the number of triangles of scope j. Then

    Probability[△ of scope j appears as Delaunay] = 3! j! / (j + 3)!


which states the probability of picking the triangle's 3 sites before the j others. Thus the total number of Delaunay triangles appearing is

    τ = sum_{j=0}^{N-3} [3! j! / (j + 3)!] T_j

which is shown in [GuiKS93] to be O(N). In fact, for the Green-Sibson algorithm the total number of Delaunay triangles that arise, τ, satisfies

    τ ≤ 6N − 15 H_N + 21/2      (3.3.0)

where H_N = 1 + 1/2 + 1/3 + ... + 1/N.

The randomization of site insertion in the Green-Sibson algorithm also permits the construction of an efficient randomized structure for point location. This is needed to find the triangle in which the new site falls. We will give an algorithm which contributes an overall complexity of O(N log N) to the insertion algorithm. This complexity is known to be optimal for planar triangulations. The central idea is to keep a history of the construction of the triangulation. Guibas makes an analogy of the history to a stack of paper triangles glued on top of one another. Thus we think of storing triangulations one atop another.

Assume we wish to insert a new site Q into the triangulation and that we know that the triangle T contains site Q at some step in the construction of the triangulation. If T is present in the final triangulation then the search is complete. More generally, T will have been previously split in one of two ways: either a new site of the triangulation was added which divided T into three children, or one of the edges of T was swapped, in which case T will be split into two children. In both cases the appropriate children are found in constant time.


Figure 3.3.9 Data tree for site location. (a) Original triangulation containing Q. (b) Triangle is split into 3 from site insertion. (c) Triangle is split into 2 from edge swap.


The process would then continue until the search is complete. In this way we can think of the data structure as a tree with node branching factors of either two or three. In [GuiKS92] an amortized analysis reveals O(log N) depth per search, yielding the overall O(N log N) complexity for the entire triangulation.

3.3d Delaunay Triangulation Via Global Edge Swapping

This algorithm, due to Lawson [Law77], assumes that a triangulation exists (not Delaunay) and makes it Delaunay through edge swaps that increase the equiangularity of the triangulation. The equiangularity of a triangulation, A(T), is defined as the ordering of angles A(T) = [θ_1, θ_2, θ_3, ..., θ_{3n(c)}] such that θ_i ≤ θ_j if i < j. We write A(T*) < A(T) if θ*_j < θ_j and θ*_i = θ_i for 1 ≤ i < j. A triangulation T is globally equiangular if A(T*) ≤ A(T) for all triangulations T* of the point set. Lawson's algorithm examines all interior edges of the mesh. Each of these edges represents the diagonal of the quadrilateral formed by the union of the two adjacent triangles. In general one must first check that the quadrilateral is convex, so that a potential diagonal swap can take place without edge crossing. If the quadrilateral is convex then the diagonal position is chosen which optimizes a local criterion (in this case the local equiangularity). This amounts to maximizing the minimum angle of the two adjacent triangles. Lawson's algorithm continues until the mesh is locally optimized and locally equiangular everywhere. It is easily shown that the condition of local equiangularity is equivalent to satisfaction of the incircle test described earlier. Therefore a mesh which is locally equiangular everywhere is a Delaunay triangulation. Note that each edge swap (yielding triangulation T*) ensures that the global equiangularity increases, A(T*) > A(T). Because the triangulation is of finite dimension, this guarantees that the process will terminate in a finite number of steps.
A more refined proof exploits the fact that the Delaunay triangulation problem in R^d is equivalent to a lower convex hull problem for the point set lifted onto a unit paraboloid in R^(d+1). In two dimensions the lower convex hull characterization reveals that once an edge is removed in Lawson's algorithm it can never return. Since the number of possible edges is N(N−1)/2 = O(N²), Lawson's algorithm must terminate after at most O(N²) swaps.

Iterative Algorithm: Triangulation via Lawson's Algorithm
  swapedge = true
  While (swapedge) do
    swapedge = false
    Do (all interior edges)
      If (adjacent triangles form convex quad) then
        Swap diagonal to form T*.
        If (optimization criterion satisfied) then
          T = T*
          swapedge = true
        EndIf
      EndIf
    EndDo
  EndWhile
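The swap test at the heart of the loop, maximizing the minimum angle of the adjacent triangle pair, can be sketched as follows for a convex quadrilateral a-b-c-d whose current diagonal is a-c (convexity is assumed to have been checked beforehand; the function names are ours):

```python
import math

def tri_angles(a, b, c):
    """The three interior angles of triangle a-b-c, in degrees."""
    def ang(p, q, r):  # interior angle at vertex p
        v1 = (q[0] - p[0], q[1] - p[1])
        v2 = (r[0] - p[0], r[1] - p[1])
        cosv = (v1[0]*v2[0] + v1[1]*v2[1]) / (math.hypot(*v1) * math.hypot(*v2))
        return math.degrees(math.acos(max(-1.0, min(1.0, cosv))))
    return [ang(a, b, c), ang(b, c, a), ang(c, a, b)]

def prefer_swap(a, b, c, d):
    """MaxMin criterion: for convex quadrilateral a-b-c-d with diagonal a-c,
    return True if swapping to diagonal b-d strictly raises the minimum angle."""
    current = min(tri_angles(a, b, c) + tri_angles(a, c, d))
    swapped = min(tri_angles(a, b, d) + tri_angles(b, c, d))
    return swapped > current
```

Replacing the comparison of minimum angles with another local predicate gives the more general optimization criteria discussed in the following sections.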


When Lawson's algorithm is used for constructing Delaunay triangulations, the test for quadrilateral convexity is not needed: it can be shown that nonconvex quadrilaterals formed from triangle pairs never violate the circumcircle test. When a more general optimization criterion is used (discussed later), the convexity check must be performed.

3.4 Other 2-D Triangulation Algorithms

In this section, other algorithms which do not necessarily produce Delaunay triangulations are explored.

The MinMax Triangulation

As Babuška and Aziz [BabuA76] point out, from the point of view of finite element approximations, the MaxMin (Delaunay) triangulation is not essential. What is essential is that no angle be too close to 180°. In other words, triangulations which minimize the maximum angle are more desirable. These triangulations are referred to as MinMax triangulations. One way to generate a 2-D MinMax triangulation is via Lawson's edge swapping algorithm. In this case, the diagonal position for convex pairs of triangles is chosen which minimizes the maximum interior angle of the two triangles. The algorithm is guaranteed to converge in a finite number of steps using arguments similar to those for Delaunay triangulation. Figures 3.4.0 and 3.4.1 present a Delaunay (MaxMin) and a MinMax triangulation of 100 random points.

Figure 3.4.0 Delaunay Triangulation.


Figure 3.4.1 MinMax Triangulation.

Note that application of local MinMax optimization via Lawson's algorithm may only result in a mesh which is locally optimal and not necessarily at a global minimum. Attaining a globally optimal MinMax triangulation is a much more difficult task. The best algorithm to date (Edelsbrunner, Tan, and Waupotitsch [Edel90]) has a high complexity of O(n² log n). The Green-Sibson algorithm can be modified to produce locally optimal MinMax triangulations using incremental insertion and local edge swapping. Again the basic idea is to replace the incircle test with another local optimization predicate. This added flexibility makes the Green-Sibson algorithm attractive. The algorithm is implemented using recursive programming with complete forward and backward propagation (contrast figures 3.4.2 and 3.3.7). This is a necessary step to produce locally optimized meshes.


Figure 3.4.2 Edge swapping with forward and backward propagation.

The MinMax triangulation has proven to be very useful in CFD. Figure 3.4.3 shows the Delaunay triangulation near the trailing edge region of an airfoil with extreme point clustering.

Figure 3.4.3 Delaunay triangulation near trailing edge of airfoil.

Upon first inspection, the mesh appears flawed near the trailing edge of the airfoil. Further inspection and extreme magnification near the trailing edge of the airfoil (figure 3.4.4) indicate that the grid is a mathematically correct Delaunay triangulation.

Figure 3.4.4 Extreme closeup of Delaunay triangulation near trailing edge of airfoil.

Because the Delaunay triangulation does not control the maximum angle, the cells near the trailing edge have angles approaching 180°. The presence of nearly collapsed triangles leaves considerable doubt as to the accuracy of any numerical solutions computed in the trailing edge region. Edge swapping based on the MinMax criterion via Lawson's algorithm, or incremental insertion using the modified Green-Sibson algorithm, produces the desired result as shown in figure 3.4.5.

Figure 3.4.5 MinMax triangulation near trailing edge of airfoil.

The Greedy Triangulation

A greedy method is one that never undoes what it did earlier. The greedy triangulation continually adds edges compatible with the current triangulation (edge crossing not allowed) until the triangulation is complete, i.e. Euler's formula is satisfied. One objective of a triangulation might be to choose a set of edges with shortest total length. The best that the greedy algorithm can do is adopt a local criterion whereby only the shortest edge available at that moment is considered for addition to the current triangulation. (This does not lead to a triangulation with shortest total length.) Note that greedy triangulation easily accommodates constrained triangulations containing interior boundaries and a nonconvex outer boundary. In this case the boundary edges are simply listed first in the ordering of candidate edges. The entire algorithm is outlined below.

Algorithm: Greedy Triangulation
Step 1. Initialize triangulation T as empty.
Step 2. Compute the n(n−1)/2 candidate edges.
Step 3. Order the pool of candidate edges.
Step 4. Remove current edge e_s from the ordered pool.
Step 5. If e_s does not intersect edges of T, add e_s to T.
Step 6. If Euler's formula is not satisfied, go to Step 4.
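The steps above admit a direct, unoptimized rendering. The sketch below simply exhausts the sorted candidate pool, which fills the convex hull and hence satisfies Euler's formula for points in general position; degenerate collinear configurations and the constrained-boundary variant are ignored, and all names are ours.

```python
import math
from itertools import combinations

def _orient(a, b, c):
    """Sign of the signed area of triangle a-b-c."""
    v = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (v > 0) - (v < 0)

def _crosses(e1, e2):
    """True if the open segments e1 and e2 properly cross
    (segments sharing an endpoint do not count as crossing)."""
    a, b = e1
    c, d = e2
    if a in (c, d) or b in (c, d):
        return False
    return (_orient(a, b, c) != _orient(a, b, d) and
            _orient(c, d, a) != _orient(c, d, b))

def greedy_triangulation(pts):
    """Accept candidate edges in order of increasing length, skipping any
    edge that crosses an already-accepted edge (Steps 2-6)."""
    candidates = sorted(combinations(pts, 2), key=lambda e: math.dist(*e))
    accepted = []
    for e in candidates:
        if not any(_crosses(e, f) for f in accepted):
            accepted.append(e)
    return accepted
```

The quadratic candidate pool and the linear scan in Step 5 make the naive cost cubic, which is precisely the complexity issue discussed next.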


Figure 3.4.6 Greedy Triangulation.

Figures 3.4.0 and 3.4.6 contrast the Delaunay and greedy algorithms. The lack of angle control is easily seen in the greedy triangulation. The greedy algorithm suffers from both high running time and high storage. In fact, a naive implementation of Step 5 leads to an algorithm with O(N³) complexity. Efficient implementation techniques are given in Gilbert [Gil79], with the result that the complexity can be reduced to O(N² log N) with O(N²) storage.

Data Dependent Triangulation

Unlike mesh adaptation, a data dependent triangulation assumes that the number and position of vertices are fixed and unchanging. Of all possible triangulations of these vertices, the task is to find the best triangulation under data dependent constraints. Dyn, Levin, and Rippa [Nira90a] consider several data dependent constraints together with piecewise linear interpolation. In order to determine if a new mesh is "better" than a previous one, a local cost function is defined for each interior edge. Two choices which prove to be particularly effective are the JND (Jump in Normal Derivatives) and the ABN (Angle Between Normals). Using their notation, consider an interior edge with adjacent triangles T1 and T2, and let P1(x, y) and P2(x, y) be the linear interpolation polynomials in T1 and T2 respectively:

    P1(x, y) = a1 x + b1 y + c1
    P2(x, y) = a2 x + b2 y + c2

The JND cost function measures the jump in normal derivatives of P1 and P2 across a common edge with normal components n_x and n_y:

    s(f_T, e) = |n_x (a1 − a2) + n_y (b1 − b2)|      (JND cost function)
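The JND cost is easy to evaluate once the interpolation coefficients are recovered from the vertex data. A sketch, with the plane coefficients obtained by Cramer's rule (function names are ours):

```python
import math

def plane_coeffs(tri, f):
    """Coefficients (a, b, c) of the linear interpolant P(x, y) = a x + b y + c
    matching data f[i] at the three vertices tri[i] (Cramer's rule)."""
    (x1, y1), (x2, y2), (x3, y3) = tri
    det = x1*(y2 - y3) + x2*(y3 - y1) + x3*(y1 - y2)
    a = (f[0]*(y2 - y3) + f[1]*(y3 - y1) + f[2]*(y1 - y2)) / det
    b = (f[0]*(x3 - x2) + f[1]*(x1 - x3) + f[2]*(x2 - x1)) / det
    c = (f[0]*(x2*y3 - x3*y2) + f[1]*(x3*y1 - x1*y3) + f[2]*(x1*y2 - x2*y1)) / det
    return a, b, c

def jnd_cost(p1, p2, edge):
    """JND cost across a shared edge: |n_x (a1 - a2) + n_y (b1 - b2)|,
    with (n_x, n_y) a unit normal of the edge and p1, p2 the (a, b, c) triples."""
    (xa, ya), (xb, yb) = edge
    ln = math.hypot(xb - xa, yb - ya)
    nx, ny = (yb - ya) / ln, -(xb - xa) / ln
    return abs(nx * (p1[0] - p2[0]) + ny * (p1[1] - p2[1]))
```

Note that the sign convention of the edge normal is immaterial because of the absolute value.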


The ABN cost function measures the acute angle between the two normals of the planes P1 and P2. Using the notation of [Nira90a]:

    s(f_T, e) = θ = cos⁻¹(A),
    A = (a1 a2 + b1 b2 + 1) / sqrt((a1² + b1² + 1)(a2² + b2² + 1))      (ABN cost function)

The next step is to construct a global measure of these cost functions. This measure is required to decrease for each legal edge swap, which ensures that the edge swapping process terminates. The simplest measures are the l1 and l2 norms:

    R1(f_T) = sum over edges of |s(f_T, e)|
    R2(f_T) = sum over edges of s(f_T, e)²

Recall that a Delaunay triangulation would result if the cost function were chosen which maximizes the minimum angle between adjacent triangles (Lawson's algorithm). Although it would be desirable to obtain a global optimum for all cost functions, this could be very costly in many cases. An alternate strategy is to abandon the pursuit of a globally optimal triangulation in favor of a locally optimal triangulation. Once again Lawson's algorithm is used. Note that in using Lawson's algorithm, we require that the global measure decrease at each edge swap. This is not as simple as before, since each edge swap can affect other surrounding edge cost functions. Nevertheless, this domain of influence is very small and easily found.

Iterative Algorithm: Data Dependent Triangulation via Modified Lawson's Algorithm
  swapedge = true
  While (swapedge) do
    swapedge = false
    Do (all interior edges)
      If (adjacent triangles form convex quadrilateral) then
        Swap diagonal to form T*.
        If (R(f_T*) < R(f_T)) then
          T = T*
          swapedge = true
        EndIf
      EndIf
    EndDo
  EndWhile

Edge swapping only occurs when R(f_T*) < R(f_T), which guarantees that the method terminates in a finite number of steps. Figures 3.4.0 and 3.4.7 plot the Delaunay triangulation of 100 random vertices in a unit square and piecewise linear contours of (1 + tanh(9y − 9x))/9 on this mesh. The exact solution consists of straight line contours with unit slope.

Figure 3.4.7 Piecewise Linear Interpolation of (1 + tanh(9y − 9x))/9.

Figure 3.4.8 Data Dependent Triangulation.

In figures 3.4.8 and 3.4.9 the data dependent triangulation and solution contours using the JND criteria and l1 measure are plotted.
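As a concrete illustration (not part of the original notes; the function names are ours), the ABN cost of an interior edge can be sketched as follows, where each triangle's data is represented by the coefficients (a, b, c) of its interpolating plane z = a·x + b·y + c, matching the notation of the cost function above:

```python
import math

def abn_cost(p1, p2):
    """Angle-between-normals cost s(f_T, e) for an interior edge shared by
    two data planes z = a*x + b*y + c, each given as p = (a, b, c).
    The constant c drops out: only the normals (a, b, -1) matter."""
    a1, b1, _ = p1
    a2, b2, _ = p2
    A = (a1*a2 + b1*b2 + 1.0) / math.sqrt((a1**2 + b1**2 + 1.0) *
                                          (a2**2 + b2**2 + 1.0))
    # Guard against rounding slightly outside [-1, 1] before acos.
    return math.acos(max(-1.0, min(1.0, A)))

def global_measure_l1(edge_costs):
    """R1 measure: sum of |s(f_T, e)| over all interior edges."""
    return sum(abs(s) for s in edge_costs)
```

Two coplanar data triangles give zero cost, so a flat interpolant favors any triangulation equally, while a crease across the edge yields a positive angle that the swapping loop tries to reduce.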


Figure 3.4.9 Piecewise Linear Interpolation of (1 + tanh(9y − 9x))/9.

Note that the triangulations obtained from this method are not globally optimal and are highly dependent on the order in which edges are accessed. Several possible ordering strategies are mentioned in [Nira90b].

3.5 2-D Steiner Triangulations

Definition: A Steiner triangulation is any triangulation that adds additional sites to an existing triangulation to improve some measure of grid quality [BernE92].

The insertion algorithms described earlier also provide a simple mechanism for generating Steiner triangulations. In the following paragraphs we consider different strategies for the computation of Steiner triangulations in 2-D. The first strategy produces nearly equilateral triangles. These meshes are suitable for the finite-element and finite-volume computation of inviscid fluid flows. The second strategy gives preference to the generation of right triangles, thus allowing the element shape to attain a high aspect ratio (the ratio of circumcircle to incircle). These meshes are suitable for the calculation of viscous fluid flow at high Reynolds numbers.

Circumcenter Refinement

A number of researchers have independently discovered the feasibility of inserting sites at circumcenters of Delaunay triangles into an existing 2-D triangulation to improve measures of grid quality, [HolS88], [Chew90], and [Rupp92]. This strategy has the desired effect of placing the new site in a position that guarantees that no other site in the triangulation can lie closer than the radius of the circumcircle, see figure 3.5.0. In a loose sense, the new site is placed as far away from other nearby sites as conservatively possible.
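The circumcenter itself follows from the perpendicular-bisector equations; a minimal sketch (an illustration, not the notes' implementation):

```python
def circumcenter(a, b, c):
    """Circumcenter of triangle (a, b, c) in the plane, solved from the
    perpendicular-bisector equations.  Points are (x, y) tuples."""
    ax, ay = a; bx, by = b; cx, cy = c
    d = 2.0 * (ax*(by - cy) + bx*(cy - ay) + cx*(ay - by))
    if d == 0.0:
        raise ValueError("degenerate (collinear) triangle")
    ux = ((ax**2 + ay**2)*(by - cy) + (bx**2 + by**2)*(cy - ay)
          + (cx**2 + cy**2)*(ay - by)) / d
    uy = ((ax**2 + ay**2)*(cx - bx) + (bx**2 + by**2)*(ax - cx)
          + (cx**2 + cy**2)*(bx - ax)) / d
    return (ux, uy)
```

The returned point is equidistant from all three vertices, so inserting it guarantees the spacing property described above.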


Figure 3.5.0 Inserting site at circumcenter of acute and obtuse triangles.

Most algorithms then follow a procedure similar to that proposed by Chew:

Algorithm: Steiner Triangulation, [Chew90]
Step 1. Construct a constrained Delaunay triangulation of the boundary points and edges.
Step 2. Compute a measure of shape and size for each element. A triangle passes only if: (1) it is well-shaped, i.e. the smallest angle is greater than 30°, (2) it is well-sized, i.e. the triangle passes a measure of size (area, containment circle size, etc.). Any size measure can be specified as long as it can be achieved by making the triangle smaller.
Step 3. If all triangles pass then halt. Otherwise choose the largest triangle, Δ, which fails and determine its circumcenter, c.
Step 4. Traverse from Δ toward c until either a constraining boundary edge is encountered or the triangle containing c is found.
Step 5. If a triangle is found containing c then insert c into the triangulation and proceed to Step 2.
Step 6. If a boundary edge is encountered during the traversal then split the boundary edge into halves and update the triangulation. Let l be the length of the new edges and consider the new vertex located on the boundary. Delete each interior vertex of the triangulation which is closer than l to this boundary site. Proceed to Step 2.

Using this algorithm it is proven in [Chew90], [Chew93] that guaranteed-quality meshes are obtained:
(1) All angles in the triangulation lie between 30° and 120°. (Smaller angles are required if boundary angles less than 30° are allowed.)
(2) All triangles will pass the user specified measure.
(3) All boundary edge constraints will be preserved by the final triangulation.

Figure 3.5.1 shows Step 1 of the algorithm. The triangulation is constrained to the boundary edge set using local transformations, see [BernE90] or [George91]. Note that in


Step 2 it is possible to maintain a dynamic heap data structure of the quality measure. (Heap structures are a very efficient way of keeping a sorted list of entries, with insertion and query time O(log N) for N entries.) The triangle with the largest value of the specified measure will be located at the top of the heap at all times during the triangulation. In Step 5 sites are inserted using the Green-Sibson algorithm, with boundary refinement as discussed in Step 6. The resulting triangulation is shown in fig. 3.5.2.

Figure 3.5.1 Initial triangulation of boundary points.

Figure 3.5.2 Steiner triangulation with sites inserted at circumcenters to reduce maximum cell aspect ratio.
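The heap bookkeeping of Step 2 can be sketched with Python's heapq module (a hypothetical illustration, not the notes' data structure; heapq is a min-heap, so measures are negated to pop the worst triangle first, and deletions are handled lazily):

```python
import heapq

class RefinementQueue:
    """Keep triangles ordered by a quality measure; pop the worst first.
    Stale heap entries are skipped lazily, mimicking a dynamic heap."""
    def __init__(self):
        self._heap = []
        self._live = set()

    def push(self, tri_id, measure):
        self._live.add(tri_id)
        heapq.heappush(self._heap, (-measure, tri_id))  # negate: max-heap

    def remove(self, tri_id):
        self._live.discard(tri_id)                      # lazy deletion

    def pop_worst(self):
        while self._heap:
            neg_measure, tri_id = heapq.heappop(self._heap)
            if tri_id in self._live:
                self._live.discard(tri_id)
                return tri_id, -neg_measure
        return None
```

When a triangle is destroyed by an insertion it is removed from the live set; its stale heap entry is discarded whenever it surfaces, which keeps each operation O(log N).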


This triangulation has proven to be very flexible. For instance, figure 3.5.3 shows a Steiner triangulation of the Texas coast and Gulf of Mexico.

Figure 3.5.3 Steiner triangulation of Texas coast and the Gulf of Mexico.

Stretched Triangle Refinement

In this approach we consider a related strategy for the generation of stretched triangulations. The stretching is based on a parameterization of a scalar function representing the minimum Euclidean distance from interior vertices to boundary segments. The basic Steiner point insertion algorithm follows closely that discussed in the previous section with two notable differences:
(1) Determination of candidate Steiner point locations.
(2) Proximity filtering of candidate Steiner point locations.

The user specifies a one-dimensional function of desired mesh spacings (shown in fig. 3.5.4) which is parameterized by the minimum Euclidean distance function, S, so that Δr = Δr(S).

Candidate Steiner point locations for each triangle are determined in the following way. For each interior vertex v of the existing triangulation we first find a point vb on the boundary which achieves minimum Euclidean distance from vb to v. This minimum distance, S, is stored for all vertices. In addition we compute and store a unit vector ŝ from vb to v. Thus at each vertex of a triangle we have the distance function S, the direction vector ŝ, and the desired mesh spacing Δr. In Step 2 of Chew's algorithm, a triangle measure is chosen which is the ratio of the maximum dimension of the triangle projected along the vector ŝ to the desired mesh spacing Δr.
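The distance function S and direction vector ŝ for a vertex can be sketched as follows (an illustration, not the notes' implementation; a brute-force scan over the boundary segments is used for clarity):

```python
import math

def wall_distance(v, segments):
    """Minimum Euclidean distance S from vertex v to a set of boundary
    segments, plus the unit vector s_hat pointing from the nearest
    boundary point toward v.  Each segment is a pair of (x, y) endpoints."""
    best = (float('inf'), None)
    vx, vy = v
    for (ax, ay), (bx, by) in segments:
        dx, dy = bx - ax, by - ay
        t = 0.0
        L2 = dx*dx + dy*dy
        if L2 > 0.0:  # clamp projection parameter onto the segment
            t = max(0.0, min(1.0, ((vx - ax)*dx + (vy - ay)*dy) / L2))
        px, py = ax + t*dx, ay + t*dy          # closest point on segment
        S = math.hypot(vx - px, vy - py)
        if S < best[0]:
            best = (S, (px, py))
    S, (px, py) = best
    if S == 0.0:
        return 0.0, (0.0, 0.0)                 # v lies on the boundary
    return S, ((vx - px)/S, (vy - py)/S)
```

In a production mesh generator this query would be accelerated with a spatial search structure, but the returned pair (S, ŝ) is exactly the per-vertex data described above.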


Figure 3.5.4 Typical triangle abc showing the determination of distances to solid boundaries S(a), S(b), S(c) and the associated direction vectors, ŝ, at a, b, and c.

In Step 3, candidate Steiner point locations are obtained from the values of ŝ and Δr at the three vertices of a triangle. The candidate Steiner points are placed a distance Δr from the vertices of the triangle along the direction vectors ŝ, see fig. 3.5.4. Candidate Steiner point locations are eliminated (filtered) if any other vertex in the existing triangulation lies within radius Δr of the candidate point. This proximity query is performed using a dynamic tree structure of the current triangulation vertex set. Of the remaining candidates, the optimal candidate is determined which produces the smallest maximum angle when inserted into the triangulation.

Figure 3.5.5 Stretched Triangulation Computed About Multi-element Airfoil.
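The filtering step can be sketched with a brute-force proximity query (the notes use a dynamic tree structure; plain scanning is shown here for clarity, and the function name is ours):

```python
import math

def filter_candidates(candidates, vertices, dr):
    """Discard any candidate Steiner point that lies within distance dr
    of an existing triangulation vertex; keep the rest."""
    kept = []
    for cx, cy in candidates:
        if all(math.hypot(cx - vx, cy - vy) >= dr for vx, vy in vertices):
            kept.append((cx, cy))
    return kept
```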


Figure 3.5.6 Closeup of Triangulation Computed About Multi-element Airfoil.

Figures 3.5.5 and 3.5.6 show a typical mesh generated about a multi-element airfoil using this strategy. The minimum wall spacing was prespecified at .0001 chord units. The maximum angle generated by the algorithm is approximately 135°. Figure 3.5.7 shows an extreme closeup near the leading edge of the main element showing the highly stretched elements.

Figure 3.5.7 Extreme Closeup of Triangulation Computed About Multi-element Airfoil.

Mesh Adaptation

One clear advantage of triangulations is the ability to locally refine the mesh. The incremental triangulation algorithms described earlier all extend naturally to include mesh adaptation. (This is the good news.) One strategy is to modify Step 2 of the Chew algorithm to include


solution dependent measures.

Figure 3.5.8 Adapted mesh obtained from supersonic channel flow computation showing reflecting shock pattern.

Figure 3.5.9 Adapted mesh obtained from transonic airfoil computation showing shock and slip lines.

In Step 3 of the Chew algorithm circumcenter locations are chosen for site insertion. Note that although the theory of solution adaptive meshing is well developed for elliptic equations, the bad news is that a general theory suitable for nonlinear hyperbolic systems of equations is still under development. Several subtle issues surround the development of a general theory. A discussion of mesh adaptation theory and error estimation is well beyond the scope of these notes and will not be pursued. Even so, solution dependent measures


based on heuristic arguments can often yield impressive results. Figures 3.5.8 and 3.5.9 show density gradient-adapted triangulations obtained from two fluid flow computations. These calculations are presented in detail in section 6.0.

3.6 Three-Dimensional Triangulations

The Delaunay triangulation extends naturally into three dimensions as the geometric dual of the 3-D Voronoi diagram. The Delaunay triangulation in 3-D can be characterized as the unique triangulation such that the circumsphere passing through the four vertices of any tetrahedron must not contain any other point in the triangulation. As in the 2-D case, the 3-D Delaunay triangulation has the property that it minimizes the maximum containment sphere. In two dimensions, it can be shown that a mesh entirely comprised of acute triangles is automatically Delaunay. To prove this, consider an adjacent triangle pair forming a quadrilateral. By swapping the position of the diagonal it is easily shown that the minimum angle always increases. Rajan [Raj91] shows the natural extension of this idea to three or more space dimensions. He defines a "self-centered" simplex in R^d to be a simplex which has the circumcenter of its circumsphere interior to the simplex. In two dimensions, acute triangles are self-centered and obtuse triangles are not. Rajan shows that a triangulation entirely composed of self-centered simplices in R^d is automatically Delaunay.

3.6a 3-D Bowyer and Watson Algorithms

The algorithms of Bowyer [Bow81] and Watson [Wat81] extend naturally to three dimensions with estimated complexities of O(N^(5/3)) and O(N^(4/3)) for N randomly distributed vertices. They do not give worst case estimates. It should be noted that in three dimensions, Klee [Klee80] shows that the maximum number of tetrahedra which can be generated from N vertices is O(N²).
Thus an optimal worst case complexity would be at least O(N²). Under normal conditions this worst case scenario is rarely encountered.

3.6b 3-D Edge Swapping Algorithms

Until recently, the algorithm of Green and Sibson based on edge swapping was thought not to be extendable to three dimensions because it was unclear how to generalize the concept of edge swapping to three or more dimensions. In 1986, Lawson [Law86] published a paper in which he proved the fundamental combinatorial result:

Theorem: (Lawson, 1986) The convex hull of d + 2 points in R^d can be triangulated in at most 2 ways.

Joe [Joe89,91] and Rajan [Raj91] have constructed algorithms based on this theorem. In joint work with A. Gandhi [GanB93], we independently constructed an incremental Delaunay triangulation algorithm based on Lawson's theorem. The remainder of this section will review our algorithm and the basic ideas behind 3-D edge swapping algorithms.

It is useful to develop a taxonomy of possible configurations addressed by Lawson's theorem. Figure 3.6.0 shows configurations in 3-D of five points which can be triangulated in only one way and hence no change is possible. We call these arrangements "nontransformable". Figure 3.6.1 shows configurations which allow two ways of triangulation. It is possible to flip between the two possible triangulations and we call these arrangements


"transformable". These figures reveal an important difference between the two and three-dimensional algorithms: the number of tetrahedra involved in the swapping operation need not be constant.


Figure 3.6.0 Generic nonswappable configurations of 5 points. Shaded region denotes planar surface.


Figure 3.6.1 Generic swappable configurations of 5 points. Shaded region denotes planar surface.

There are two arrangements that allow two triangulations. Figures 3.6.1(a) and 3.6.1(b) show the subclass of companion triangulations that can be transformed from one type to another, thereby changing the number of tetrahedra from 2 to 3 or vice-versa. Figures 3.6.1(c) and 3.6.1(d) show the other subclass of configurations that can be transformed from one type to another while keeping the number of tetrahedra constant (2). It can be shown that all nontransformable configurations satisfy the circumsphere characterization of the Delaunay triangulation. Among the class of transformable configurations, if the particular configuration fails the insphere test it must necessarily pass in the other configuration.

Incremental Delaunay Triangulation

The local transformations described above provide the essential machinery for extending the Green-Sibson algorithm into R^d.

Algorithm: Incremental Delaunay Triangulation Via Local Transformations
Step 1. Locate existing tetrahedron enclosing point Q.
Step 2. Insert site and connect to 4 or 5 surrounding vertices.
Step 3. Identify suspect faces.
Step 4. Perform edge swapping of all suspect faces failing the insphere test.
Step 5. Identify new suspect faces.
Step 6. If new suspect faces have been created, go to step 3.


The correctness of this algorithm has been proven by Joe [Joe89]. The above algorithm can be implemented using recursive programming techniques with forward propagation. One distinct advantage of the present technique is that it allows other criteria for edge swapping to be used. The only additional complexity involves the inclusion of back propagation in the recursion process. Given the experience gained in two dimensions concerning stretched triangulations, this added flexibility appears highly advantageous.

3.6c 3-D Surface Triangulation

The modified Green-Sibson algorithm has been extended to include the triangulation of surface patches. Although the concept of Dirichlet tessellation is well defined on a smooth manifold using the concept of geodesic distance, in practice this is too expensive. Finding geodesic distance is a variational problem that is not easily solved. We have implemented a simpler procedure in which surface grids in 3-D are constructed from rectangular surface patches (assumed at least C⁰ smooth) using a generalization of the 2-D Steiner triangulation scheme.


Figure 3.6.2 Mapping of rectangular patches on the (s, t) plane.

Points are first placed on the perimeter of each patch using an adaptive refinement strategy based on absolute error and curvature measures. The surface patches are projected onto the plane, see figure 3.6.2. Simple stretching of the rectangular patches permits the user to produce preferentially stretched meshes. (This is useful near the leading edge of a wing.) The triangulation takes place in the two dimensional (s, t) plane. The triangulation is adaptively refined using Steiner point insertion to minimize the maximum user specified absolute error and curvature tolerance on each patch. The absolute error is approximated by the perpendicular distance from the triangle centroid (projected back to 3-space) to the true surface as depicted in figure 3.6.3.
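A sketch of this error estimate for a parametric surface f(s, t) (illustrative only; the notes measure the perpendicular distance to the true surface, which is approximated here by the distance to the surface point at the parametric centroid):

```python
def chordal_error(f, tri_st):
    """Approximate absolute error of a surface triangle given by its
    (s, t) parameter-plane vertices: distance between the centroid of
    the mapped 3-space triangle and the true surface point evaluated
    at the parametric centroid.  f maps (s, t) -> (x, y, z)."""
    p3 = [f(s, t) for s, t in tri_st]                       # vertices in R^3
    cen3 = tuple(sum(q[i] for q in p3) / 3.0 for i in range(3))
    sc = sum(s for s, _ in tri_st) / 3.0                    # parametric centroid
    tc = sum(t for _, t in tri_st) / 3.0
    true_pt = f(sc, tc)
    return sum((cen3[i] - true_pt[i])**2 for i in range(3)) ** 0.5
```

For a planar patch the estimate vanishes, and for a curved patch it is positive and shrinks as the triangle is refined, which is exactly what drives the adaptive insertion.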



Figure 3.6.3 Calculation of triangulation absolute error by measurement of distance from face centroid to true surface.

The user can further refine based on triangle aspect ratio in the (s, t) plane if desired. Figure 3.6.4 shows a typical adaptive surface grid generated using the Steiner triangulation method.

Figure 3.6.4 Adaptive Steiner triangulation of surface geometry about Boeing 737 with flaps deployed.

3.7 Computational Aspects of Triangulation

In practical application, it is important to understand the impact of finite-precision arithmetic in the construction of two and three-dimensional triangulations using the Green-Sibson and Joe algorithms. For purposes of this discussion we assume that the sites to be inserted are prespecified and the task at hand is to insert them (correctly) into the existing triangulation.


Convexity Test:

The convexity test determines whether the shape formed by the tetrahedra T1 and T2 is convex. Let the vertices of the tetrahedra be numbered T1 = (1, 2, 3, 4) and T2 = (1, 2, 3, 5), i.e., (1, 2, 3) is the face shared by T1 and T2, and nodes 4 and 5 are at the two ends of T1 and T2 respectively. We make use of the notion of barycentric coordinates to perform the convexity test. The coordinates b1, b2, b3, b4 satisfying

    | 1   1   1   1  | | b1 |   | 1  |
    | x1  x2  x3  x4 | | b2 | = | x5 |
    | y1  y2  y3  y4 | | b3 |   | y5 |
    | z1  z2  z3  z4 | | b4 |   | z5 |

are called the barycentric coordinates of node 5. They indicate the position of node 5 in relation to the nodes of tetrahedron T1. For each s, the sign of bs indicates the position of node 5 relative to the plane Hs passing through the triangular face opposite node s. Thus bs = 0 when 5 is in Hs, bs > 0 when 5 is on the same side of Hs as node s, and bs < 0 when 5 is on the opposite side of Hs from node s. Clearly, b1, b2, b3, b4 > 0 if 5 lies inside tetrahedron (1, 2, 3, 4). If we imagine a cone formed by the planes H1, H2, H3 of T1, then T1 and T2 form a convex shape if and only if node 5 lies in the cone on the side opposite from node 4 (figure 3.7.0).


Figure 3.7.0 Convexity cone for node 4.

This requires that b4 < 0 and b1, b2, b3 > 0. Thus we observe that convexity only requires knowledge of the sign of each bi. This simplifies the task considerably and reduces the problem to the following:

    b1 = sign det | 1   1   1   1  |      b2 = sign det | 1   1   1   1  |
                  | x5  x2  x3  x4 |                    | x1  x5  x3  x4 |
                  | y5  y2  y3  y4 |                    | y1  y5  y3  y4 |
                  | z5  z2  z3  z4 |                    | z1  z5  z3  z4 |      (3.7.0a-b)

    b3 = sign det | 1   1   1   1  |      b4 = sign det | 1   1   1   1  |
                  | x1  x2  x5  x4 |                    | x1  x2  x3  x5 |
                  | y1  y2  y5  y4 |                    | y1  y2  y3  y5 |
                  | z1  z2  z5  z4 |                    | z1  z2  z3  z5 |      (3.7.0c-d)

These computations effectively determine the orientation of the tetrahedra (5, 2, 3, 4), (1, 5, 3, 4), (1, 2, 5, 4), (1, 2, 3, 5) formed from the configuration.
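Because the signs above come from integer determinant evaluations, the test can be carried out exactly in a language with arbitrary-precision integers. The sketch below (an illustration with our own function names, not the notes' implementation) evaluates the barycentric sign determinants and normalizes them by the orientation of T1, which the formulas above implicitly assume to be positive:

```python
def sign_det4(M):
    """Sign of a 4x4 determinant, exactly, by cofactor expansion along
    the first row (inputs must be integers)."""
    def det3(m):
        return (m[0][0]*(m[1][1]*m[2][2] - m[1][2]*m[2][1])
              - m[0][1]*(m[1][0]*m[2][2] - m[1][2]*m[2][0])
              + m[0][2]*(m[1][0]*m[2][1] - m[1][1]*m[2][0]))
    d = 0
    for j in range(4):
        minor = [[M[i][k] for k in range(4) if k != j] for i in range(1, 4)]
        d += (-1)**j * M[0][j] * det3(minor)
    return (d > 0) - (d < 0)

def column_matrix(pts):
    """Build the homogeneous matrix whose columns are the four points."""
    return [[1, 1, 1, 1],
            [pts[0][0], pts[1][0], pts[2][0], pts[3][0]],
            [pts[0][1], pts[1][1], pts[2][1], pts[3][1]],
            [pts[0][2], pts[1][2], pts[2][2], pts[3][2]]]

def is_convex_pair(p1, p2, p3, p4, p5):
    """Convexity test for tetrahedra T1=(1,2,3,4), T2=(1,2,3,5) sharing
    face (1,2,3): sign(b1), sign(b2), sign(b3) > 0 and sign(b4) < 0."""
    pts = [p1, p2, p3, p4]
    ref = sign_det4(column_matrix(pts))       # orientation of T1
    b = []
    for s in range(4):
        cols = list(pts)
        cols[s] = p5                          # replace node s+1 by node 5
        b.append(sign_det4(column_matrix(cols)) * ref)
    return b[0] > 0 and b[1] > 0 and b[2] > 0 and b[3] < 0
```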


Circumsphere Test:

The 3-D Delaunay triangulation is defined as the unique triangulation such that the circumsphere of any tetrahedron contains no other point in the mesh. To determine where point e lies in relation to the circumsphere of tetrahedron (a, b, c, d), denoted by ⊙(abcd), we use the InSphere primitive:

    InSphere(abcde) =  1 if e is inside ⊙(abcd)
                       0 if e is on ⊙(abcd)
                      -1 if e is outside ⊙(abcd)

where InSphere is computed from the following determinant:

    InSphere(abcde) = sign det | 1    1    1    1    1   |
                               | xa   xb   xc   xd   xe  |
                               | ya   yb   yc   yd   ye  |
                               | za   zb   zc   zd   ze  |
                               | wa²  wb²  wc²  wd²  we² |

with wp² = xp² + yp² + zp².

This test is motivated by the observation in 3-D that the intersection of a cylinder and a unit paraboloid is an ellipse lying in a plane (figure 3.6.3). So, any four co-circular points in 2-D will project to four co-planar points, and the volume of the tetrahedron made from these four co-planar points will be zero. The paraboloid is a surface that is convex upward; points interior to a circle get projected to the paraboloid below the intersection plane and points exterior to it get projected above the intersection plane. If a point lies outside the circumcircle of three other points, the tetrahedron made from these four points will have positive volume provided the three points were ordered in a counter-clockwise fashion. The volume will be negative if the point lies inside the circumcircle.


Figure 3.6.3 Projection of cocircular points onto unit paraboloid.
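The lifting argument can be demonstrated in its 2-D form (a sketch, with an illustrative function name): lifting the points onto the paraboloid z = x² + y² turns the circumcircle test into a determinant sign, exact for integer coordinates.

```python
def in_circle(a, b, c, d):
    """2-D analogue of the InSphere primitive via the paraboloid lifting:
    returns +1 if d is inside the circle through a, b, c (ordered
    counter-clockwise), 0 if cocircular, -1 if outside."""
    rows = []
    for px, py in (a, b, c):
        x, y = px - d[0], py - d[1]          # translate d to the origin
        rows.append((x, y, x*x + y*y))       # lift onto the paraboloid
    det = (rows[0][0]*(rows[1][1]*rows[2][2] - rows[1][2]*rows[2][1])
         - rows[0][1]*(rows[1][0]*rows[2][2] - rows[1][2]*rows[2][0])
         + rows[0][2]*(rows[1][0]*rows[2][1] - rows[1][1]*rows[2][0]))
    return (det > 0) - (det < 0)
```

The determinant is the (scaled, signed) volume of the lifted tetrahedron, so its sign reproduces the inside/on/outside trichotomy described above.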


Robust Triangulation:

In this section we assume that the (predetermined) sites to be inserted can be represented exactly by b-bit mantissas. IEEE single precision floating point representations have a 24 bit mantissa length while double precision floating point representations have a 53 bit mantissa length. In practice the accuracy to which geometry information can be obtained is far less than 53 bits. In typical 3-D applications we usually insist, by least significant bit truncation, that all site coordinates be exactly represented by 36 bits. This simplifies the task described below.

An important observation is that floating point arithmetic is only required for the calculation of the ternary decision predicates for convexity and the circumsphere test given d + 2 sites in R^d. Consequently, these predicates can be evaluated exactly since they only require the operations of addition, subtraction, and multiplication. Notably absent is the operation of division. To see why this is true, recall from the previous sections that the determination of convexity and the circumsphere test can both be reduced to the problem of determinant evaluation. Moreover, the determinant evaluation is equivalent to the problem of polynomial evaluation, which only requires the operations of addition, subtraction, and multiplication. Also note that the decision predicates are ternary, that is to say that the determinants can be positive, negative, or zero. The latter response indicates a degenerate situation which may call for a "tie-breaking" decision. A rather elaborate and elegant theory for handling these situations has been developed by Knuth [Knu92].
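As an aside (an illustration not present in the original notes), the exactness claim is easy to verify in a language with arbitrary-precision integers. The sketch below accumulates a three-term dot product in the redundant digit form a·2^28 + b·2^14 + c by splitting each 28-bit operand into 14-bit halves:

```python
def dot_sign_redundant(x, y):
    """Sign of x[0]*y[0] + x[1]*y[1] + x[2]*y[2] for integer coordinates,
    accumulated in the redundant form a*2**28 + b*2**14 + c and then
    recombined exactly.  Returns +1, 0, or -1."""
    a = b = c = 0
    for xi, yi in zip(x, y):
        lx, rx = divmod(xi, 1 << 14)   # hi/lo 14-bit digits of xi
        ly, ry = divmod(yi, 1 << 14)   # hi/lo 14-bit digits of yi
        a += lx * ly                   # contributes at weight 2**28
        b += lx * ry + ly * rx         # contributes at weight 2**14
        c += rx * ry                   # contributes at weight 1
    total = a * (1 << 28) + b * (1 << 14) + c
    return (total > 0) - (total < 0)
```

In C one cannot simply recombine, since the total overflows a machine word; the case analysis in the example that follows exists precisely to decide the sign without ever forming the full-width sum.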
In the context of the Green-Sibson or Joe algorithms, choosing one possibility over another never leads to a topologically incorrect triangulation but can produce a final triangulation which deviates from a true Delaunay triangulation.

In actual computer implementation, the exact computation of these predicates can be carried out using conventional arithmetic on redundant expressions. One example of a redundant expression is the following formula:

    a·2^28 + b·2^14 + c,    |a|, |b|, |c| < 2^29.

The use of redundant expressions is illustrated in the following computer code for the calculation of the arithmetic dot product sign(x1·y1 + x2·y2 + x3·y3) assuming 28 bit coordinates.

Example [Knu93]: Exact Redundant Expression Calculation.

Let si = sign(xi·yi); if all si ≥ 0 or all si ≤ 0 then the answer is clear. Otherwise reorder and possibly negate the data so that s1 = s2 ≥ 0, s3 < 0 and all terms xi·yi ≥ 0.

    { register long lx, rx, ly, ry;
      lx = x1 / 0x4000;  rx = x1 % 0x4000;
      ly = y1 / 0x4000;  ry = y1 % 0x4000;
      a = lx*ly;  b = lx*ry + ly*rx;
      c = rx*ry;


      lx = x2 / 0x4000;  rx = x2 % 0x4000;
      ly = y2 / 0x4000;  ry = y2 % 0x4000;
      a += lx*ly;  b += lx*ry + ly*rx;
      c += rx*ry;
      lx = x3 / 0x4000;  rx = x3 % 0x4000;
      ly = y3 / 0x4000;  ry = y3 % 0x4000;
      a -= lx*ly;  b -= lx*ry + ly*rx;
      c -= rx*ry;
      if (a == 0) goto ez;
      if (a < 0) a = -a, b = -b, c = -c, s3 = -s3;
      while (c < 0) {
        a--;  c += 0x10000000;
        if (a == 0) goto ez;
      }
      if (b >= 0) return -s3;
      b = -b;
      a -= b / 0x4000;
      if (a > 0) return -s3;
      if (a <= -2) return s3;
      return -s3*((a*0x4000 - b%0x4000)*0x4000 + c);
    ez:
      if (b >= 0x8000) return -s3;
      if (b <= -0x8000) return s3;
      return -s3*(b*0x4000 + c);
    }

where 0x4000 = 2^14 and 0x10000000 = 2^28.

Fortune and Van Wyk [FortW93] have also examined this problem and have proposed procedures which use interval estimates of the error in finite precision calculations so that exact evaluation is rarely needed.

4.0 Maximum Principle Analysis

One of the best known tools employed in the study of differential equations is the maximum principle. Any function f(x) which satisfies the inequality f'' > 0 on the interval [a, b] attains its maximum value at one of the endpoints of the interval. Solutions of the inequality f'' > 0 are said to satisfy a maximum principle. Functions which satisfy a differential inequality in a domain Ω and, because of the form of the differential equation, achieve a maximum value on the boundary ∂Ω are said to possess a maximum principle.

Recall the maximum principle for Laplace's equation. Let Δu ≡ u_xx + u_yy denote the Laplace operator. If a function u satisfies the strict inequality

    Δu > 0                                                    (4.0.0)

at each point in Ω, then u cannot attain its maximum at any interior point of Ω. The strict inequality can be weakened

    Δu ≥ 0                                                    (4.0.1)


so that if a maximum value M is attained in the interior of Ω then the entire function must be a constant with value M. Without any change in the above argument, if u satisfies the inequality

    Δu + c1·u_x + c2·u_y > 0                                  (4.0.2)

in Ω, then u cannot attain its maximum at an interior point.

The second model equation of interest is the nonlinear conservation law equation:

    u_t + (f(u))_x = 0,    df/du = a(u)                       (4.0.3)

In the simplest setting the initial value problem is considered in which the solution is specified along the x-axis, u(x, 0) = u0(x), in a periodic or compactly supported fashion. The solution can be depicted in the x-t plane by a series of converging and diverging characteristic straight lines. From the solution of (4.0.3) Lax provides the following observation: the total increasing and decreasing variations of a differentiable solution between any pairs of characteristics are conserved.

    I(t + t0) = I(t0),    I(t) = ∫_{-∞}^{+∞} |∂u(x, t)/∂x| dx

Moreover, in the presence of entropy satisfying discontinuities the total variation decreases (information is destroyed) in time:

    I(t + t0) ≤ I(t0)                                         (4.0.4)

An equally important consequence of Lax's observation comes from considering a monotonic solution between two nonintersecting characteristics: between pairs of characteristics, monotonic solutions remain monotonic and no new extrema are created. Also from (4.0.4) we have that
(1) Local maxima are nonincreasing
(2) Local minima are nondecreasing

These properties of the differential equations serve as basic design principles for numerical schemes which approximate them. In the next section we review the fundamental theory surrounding discrete matrix operators equipped with maximum principles.

4.1 Discrete Maximum Principles for Elliptic Equations

Laplace's Equation on Structured Meshes

Consider Laplace's equation with Dirichlet data

    Lu = 0,  (x, y) ∈ Ω
    u = g,   (x, y) ∈ ∂Ω                                      (4.1.0)
    L = Δ


From the maximum principle property we have that

    sup_{(x,y)∈Ω} |u(x, y)| ≤ sup_{(x,y)∈∂Ω} |u(x, y)|

For simplicity consider the unit square domain Ω = {(x, y) ∈ R² : 0 ≤ x, y ≤ 1} with spatial grid xj = jΔx, yk = kΔx, and JΔx = 1. Let U_{j,k} denote the numerical approximation to u(xj, yk). It is well known that the standard second order accurate approximation

    L^Δ U = (1/Δx²) [U_{j+1,k} + U_{j-1,k} + U_{j,k+1} + U_{j,k-1} - 4 U_{j,k}]     (4.1.1)

exhibits a discrete maximum principle. To see this, simply solve for the value at (j, k):

    U_{j,k} = (1/4) [U_{j+1,k} + U_{j-1,k} + U_{j,k+1} + U_{j,k-1}]

If U_{j,k} achieves a maximum value M in the interior then

    M = (1/4) [U_{j+1,k} + U_{j-1,k} + U_{j,k+1} + U_{j,k-1}]

which implies that

    M = U_{j+1,k} = U_{j-1,k} = U_{j,k+1} = U_{j,k-1}

Repeated application of this argument to the four neighboring points yields the desired result.

Monotone Operators

The discrete Laplacian operator -L^Δ U obtained from (4.1.1) is one example of a monotone operator.

Definition: The discrete matrix operator M is a monotone operator if and only if M⁻¹ ≥ 0 (all entries are nonnegative).

Lemma 4.1.0: A sufficient but not necessary condition for M monotone is that the operator be M-type. M-type matrix operators have the sign pattern Mii > 0 for each i, Mij ≤ 0 whenever i ≠ j. In addition M must either be strictly diagonally dominant

    Mii > ∑_{j=1, j≠i}^{n} |Mij|,  i = 1, 2, ..., n     (strict diagonal dominance)

or else M must be irreducible and

    Mii ≥ ∑_{j=1, j≠i}^{n} |Mij|,  i = 1, 2, ..., n     (diagonal dominance)


with strict inequality for at least one i.

Proof: The proof for strictly diagonally dominant M is straightforward. Rewrite the matrix operator in the following form

    M = D - N,          D > 0, N ≥ 0
      = [I - N D⁻¹] D,  D⁻¹ > 0
      = [I - P] D,      P ≥ 0

From the strict diagonal dominance of M we have that the eigenvalues of P = N D⁻¹ are less than unity, so that the Neumann series for [I - P]⁻¹ is convergent. This yields the desired result:

    M⁻¹ = D⁻¹ [I + P + P² + P³ + ...] ≥ 0                     (4.1.2)

When M is not strictly diagonally dominant then M must be irreducible, so that no permutation P exists such that

    Pᵀ M P = | M11  M12 |
             | 0    M22 |     (reducibility)

This ensures that the eigenvalues of P are less than unity [Taussky48]. Once again we have that the Neumann series is convergent and the final result follows immediately.

Example: Maximum Principles and Uniform Convergence.

Consider the model problem (4.1.0) on the unit square:

    L^Δ U_{j,k} = 0

Inserting the exact solution into the discrete operator yields the truncation error

    L^Δ u(xj, yk) = T_{j,k}                                   (4.1.3)

Denote the error e_{j,k} = U_{j,k} - u(xj, yk) so that the problem becomes

    L^Δ e_{j,k} = -T_{j,k},  (j, k) ∈ Ω                       (4.1.4)

with e_{j,k} = 0 for all (j, k) ∈ ∂Ω. Next let

    ||e||_∞ = max_{(j,k)∈Ω} |e_{j,k}|

and

    ||T||_∞ = max_{(j,k)∈Ω} |T_{j,k}|

so that

    ||e||_∞ = ||L⁻¹ T||_∞ ≤ ||L⁻¹||_∞ ||T||_∞                 (4.1.5)


From consistency of the approximation we have from Taylor series analysis that there exists C > 0 independent of Δx such that

    max_{(j,k)∈Ω} |T_{j,k}| = ||T||_∞ ≤ C Δx²                 (4.1.6)

From (4.1.2) we have that

    -L⁻¹ = (Δx²/4) [I + P + P² + P³ + ...],  P ≥ 0

Next define the "summation" vector

    s = [1, 1, ..., 1]ᵀ

so that ||P||_∞ = ||P s||_∞ since all entries of P are nonnegative. Close inspection of the terms appearing in the Neumann series reveals that terms in the sum eventually geometrically vanish (and can be summed) so that, when combined with the fact that JΔx = 1, we have a bound on the inverse operator of 1/8:

    ||L⁻¹||_∞ = (Δx²/4) ||I + P + P² + P³ + ...||_∞
              = (Δx²/4) ||s + P s + P² s + P³ s + ...||_∞
              ≤ 1/8                                           (4.1.7)

When the stability estimate (4.1.7) is combined with the consistency result (4.1.6), convergence is proven:

    ||e||_∞ ≤ ||L⁻¹||_∞ ||T||_∞ ≤ (C/8) Δx²                   (4.1.8)

Laplace's Equation on Unstructured Meshes

Consider solving the Laplace equation problem (4.1.0) on a planar triangulation using a Galerkin finite element approximation with linear elemental shape functions. (Results using a finite volume method are identical but are not considered here.) In the finite element method we consider the variational form of the Laplace operator which is obtained by integration by parts:

    ∫_Ω w Δu dΩ = -∫_Ω (∇w · ∇u) dΩ + ∫_∂Ω w (∇u · n) dΓ      (4.1.9)

This equation is effectively discretized by assuming a solution and test space which are piecewise linear interpolations of pointwise function values. For example, given function values fj at vertices of the triangulation, a global piecewise linear interpolant can be constructed

    f(x, y) = ∑_vertices Nj(x, y) fj


where the shape functions Nj(x, y) are piecewise linear functions with distance-one support on the triangulation so that Nj(xj, yj) = 1 and Nj(xi, yi) = 0, i ≠ j, see fig. 4.1.1.


Figure 4.1.0 Local mesh configuration and geometry about vertex v0.

Since the shape functions serve as a complete basis for representing any piecewise linear function on the triangulation, we need only consider w equal to a single shape function to extract the discretized form of the Laplace operator. This implies that the discretization will involve only distance one neighbors in the triangulation as shown in fig. 4.1.0. Notationally, we define the following sets:

    T0 = {triangles incident to v0}
    N0 = {vertices distance 1 from v0}

and the convention that T_{i+1/2} ≡ Δ(v0, vi, vi+1) with area A_{i+1/2}. The domain of interest is now Ω = ∪_{T0} Ti with w = 0 on ∂Ω.


Figure 4.1.1 Shape function N0(x, y) about central vertex v0.

For piecewise linear u and w, both ∇u and ∇w are constant in each triangle so that

    ∫_Ω w Δu dΩ = -∫_Ω (∇w · ∇u) dΩ + 0 = -∑_{i∈N0} (∇w · ∇u)_{i+1/2} A_{i+1/2}     (4.1.10)


Let ñ be a normal vector scaled by the length of the edge as shown in fig. 4.1.0. Some algebra reveals that

    ∇w_{i+1/2} = -(1/(2 A_{i+1/2})) ñ_{i+1/2}

and

    ∇u_{i+1/2} = -(1/(2 A_{i+1/2})) (ñ_{i+1} (Ui - U0) - ñ_i (U_{i+1} - U0))

Inserting these expressions into (4.1.10) with some rearrangement yields

    L^Δ U0 = ∫_Ω w Δu dΩ = ∑_{i∈N0} W_{0i} (Ui - U0)          (4.1.11)

with

    W_{0i} = -(1/4) ( (ñ_{i+1/2} · ñ_{i+1}) / A_{i+1/2} - (ñ_{i-1/2} · ñ_{i-1}) / A_{i-1/2} )

This result can be further simplified. Note that

    (ñ_{i+1/2} · ñ_{i+1}) / A_{i+1/2} = 2 (ñ_{i+1/2} · ñ_{i+1}) / |ñ_{i+1/2} × ñ_{i+1}| = -2 cos(α_{0i}) / sin(α_{0i}) = -2 cotan(α_{0i})

and

    -(ñ_{i-1/2} · ñ_{i-1}) / A_{i-1/2} = -2 (ñ_{i-1/2} · ñ_{i-1}) / |ñ_{i-1/2} × ñ_{i-1}| = -2 cos(β_{0i}) / sin(β_{0i}) = -2 cotan(β_{0i})

where α_{0i} and β_{0i} are the two angles subtending the edge e(v0, vi), see fig. 4.1.2. Using the subtending angles, the discretized Laplacian weights assume a particularly simple form:

    W_{0i} = (1/2) [cotan(α_{0i}) + cotan(β_{0i})]            (4.1.12)

From (4.1.11) we see that Lemma 4.1.0 does not hold unless all W_{0i} ≥ 0. Fortunately, we have the following result for planar triangulations.

Lemma 4.1.1: A necessary and sufficient condition for the weights appearing in (4.1.12) to be nonnegative (-L^Δ monotone) is that the triangulation be a Delaunay triangulation.

Proof: Rearrangement of the weights appearing in (4.1.12) yields

    W_{0i} = (1/2) [cotan(α_{0i}) + cotan(β_{0i})]
           = (1/2) [cos(α_{0i})/sin(α_{0i}) + cos(β_{0i})/sin(β_{0i})]
           = (1/2) [sin(α_{0i} + β_{0i}) / (sin(α_{0i}) sin(β_{0i}))]     (4.1.13)
Since \alpha_{0i} < \pi and \beta_{0i} < \pi, the denominator is always positive, and nonnegativity requires that \alpha_{0i} + \beta_{0i} \le \pi.
Figure 4.1.2 Circumcircle test for adjacent triangles.

Some trigonometry reveals that for the configuration of figure 4.1.2 with circumcircle passing through \{v_0, v_i, v_{i+1}\}, the sum \alpha_{0i} + \beta_{0i} depends on the location of v_{i-1} with respect to the circumcircle in the following way:

\alpha_{0i} + \beta_{0i} < \pi,  v_{i-1} exterior
\alpha_{0i} + \beta_{0i} > \pi,  v_{i-1} interior
\alpha_{0i} + \beta_{0i} = \pi,  v_{i-1} cocircular   (4.1.14)

Also note that we could have considered the circumcircle passing through \{v_0, v_i, v_{i-1}\} with similar results for v_{i+1}. The condition of nonnegativity implies a circumcircle condition for all pairs of adjacent triangles, whereby the circumcircle passing through either triangle cannot contain the fourth point. This is precisely the unique characterization of the Delaunay triangulation, which completes the proof.

Keep in mind that from equation (4.1.13) we have \cot(\alpha) \ge 0 and \cot(\beta) \ge 0 if \alpha, \beta \le \pi/2. Therefore a sufficient but not necessary condition for nonnegativity of the Laplacian weights is that all angles of the mesh be less than or equal to \pi/2. This is a standard result in finite element theory [Ciarlet73] and applies in two or more space dimensions.

From Lemma 4.1.1 we have the following theorem concerning a discrete maximum principle.

Theorem 4.1.2: The discrete Laplacian operator obtained from the finite element discretization (4.1.11) with linear elements exhibits a discrete maximum principle for arbitrary point sets in two space dimensions if the triangulation of these points is a Delaunay triangulation.

Elements of the Proof: From Lemma 4.1.1, a one-to-one correspondence exists between nonnegativity of weights and Delaunay triangulation. Assume a Delaunay triangulation of the point set so that for some arbitrary interior vertex v_0 we have all W_{0i} \ge 0, and solve for U_0:

U_0 = \frac{\sum_{i\in N_0} W_{0i} U_i}{\sum_{i\in N_0} W_{0i}} = \sum_{i\in N_0} \alpha_i U_i
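The cotangent-weight formula (4.1.12) and the circumcircle test are easy to exercise numerically. The following sketch is not part of the original notes; the helper names `cotan`, `edge_weight`, and `in_circumcircle` are our own. It computes the Laplacian weight of an edge from the two subtending angles and evaluates the standard in-circle determinant for the adjacent triangle pair:

```python
def cotan(p, q, r):
    """Cotangent of the angle at vertex p in triangle (p, q, r)."""
    ux, uy = q[0] - p[0], q[1] - p[1]
    vx, vy = r[0] - p[0], r[1] - p[1]
    dot = ux * vx + uy * vy
    cross = ux * vy - uy * vx      # |cross| = 2 * (triangle area)
    return dot / abs(cross)

def edge_weight(v0, vi, va, vb):
    """Laplacian edge weight (4.1.12): W_0i = (cot(alpha) + cot(beta)) / 2,
    where alpha, beta are the angles at va and vb subtending edge (v0, vi)."""
    return 0.5 * (cotan(va, v0, vi) + cotan(vb, v0, vi))

def in_circumcircle(a, b, c, d):
    """True if d lies strictly inside the circumcircle of triangle (a, b, c);
    (a, b, c) must be given in counterclockwise order."""
    rows = [(p[0] - d[0], p[1] - d[1],
             (p[0] - d[0]) ** 2 + (p[1] - d[1]) ** 2) for p in (a, b, c)]
    (ax, ay, az), (bx, by, bz), (cx, cy, cz) = rows
    det = (ax * (by * cz - bz * cy)
           - ay * (bx * cz - bz * cx)
           + az * (bx * cy - by * cx))
    return det > 0.0
```

For a locally Delaunay pair the fourth point falls outside the circumcircle and the weight is nonnegative; moving the fourth point inside the circle drives the weight negative, exactly as Lemma 4.1.1 predicts.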
with

\alpha_i = \frac{W_{0i}}{\sum_{i\in N_0} W_{0i}}

which satisfies \alpha_i \ge 0 and \sum_{i\in N_0} \alpha_i = 1. Since U_0 is a convex combination of the neighboring values we have

\min_{i\in N_0} U_i \le U_0 \le \max_{i\in N_0} U_i   (4.1.15)

If U_0 attains a maximum value M, then all U_i = M. Repeated application of (4.1.15) to neighboring vertices in the triangulation establishes the discrete maximum principle.

We can ask if the result concerning Delaunay triangulation and the maximum principle extends to three space dimensions. As we will show, the answer is no. The resulting formula for the three-dimensional Laplacian is

\int_{\Omega_0} w \Delta u \, dv = \sum_{i\in N_0} W_{0i}(U_i - U_0)   (4.1.16)

where

W_{0i} = \frac{1}{6} \sum_{k=1}^{d(v_0, v_i)} |\Delta R_{k+1/2}| \cot(\theta_{k+1/2}).   (4.1.17)

In this formula \Omega_0 is the volume formed by the union of all tetrahedra that share vertex v_0, N_0 is the set of indices of all adjacent neighbors of v_0 connected by incident edges, k is a local cyclic index describing the associated vertices which form a polygon of degree d(v_0, v_i) surrounding the edge e(v_0, v_i), \theta_{k+1/2} is the face angle between the two faces associated with \vec S_{k+1/2} and \vec S'_{k+1/2} which share the edge e(v_k, v_{k+1}), and |\Delta R_{k+1/2}| is the magnitude of the edge (see figure below).
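The convex-combination solve for U_0 used in the proof suggests a simple relaxation method. The following is a minimal sketch of ours (not from the notes), assuming a small graph with prescribed nonnegative edge weights and Dirichlet boundary values; `solve_laplace` is a hypothetical helper name:

```python
def solve_laplace(weights, neighbors, boundary, n_sweeps=2000):
    """Gauss-Seidel sweeps of the convex-combination update
    U_0 = sum_i W_0i U_i / sum_i W_0i at every interior vertex.
    weights:   dict mapping (vertex, neighbor) -> nonnegative weight W_0i
    neighbors: dict mapping each interior vertex -> list of its neighbors
    boundary:  dict mapping boundary vertices -> prescribed values"""
    u = dict(boundary)
    interior = [v for v in neighbors if v not in boundary]
    for v in interior:
        u[v] = 0.0
    for _ in range(n_sweeps):
        for v in interior:
            wsum = sum(weights[(v, nb)] for nb in neighbors[v])
            u[v] = sum(weights[(v, nb)] * u[nb] for nb in neighbors[v]) / wsum
    return u
```

Because every interior value is a convex combination of its neighbors, the converged solution necessarily satisfies the bound (4.1.15) with respect to the boundary data.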
Figure 4.1.3 Set of tetrahedra sharing interior edge e(v_0, v_i) with local cyclic index k.
A maximum principle is guaranteed if all W_{0i} \ge 0. We now proceed to describe a valid Delaunay triangulation with one or more W_{0i} < 0. It will suffice to consider the Delaunay triangulation of N points in which a single point v_0 lies interior to the triangulation and the remaining N-1 points describe vertices of boundary faces which completely cover the convex hull of the point set.
Figure 4.1.4 Subset of 3-D Delaunay triangulation which does not maintain nonnegativity.

Consider a subset of the N vertices; in particular, consider an interior edge incident to v_0 connecting to v_i as shown in figure 4.1.4 by the dashed line segment, and all neighbors adjacent to v_i on the hull of the point set. In this experiment we consider the height of the interior edge, z, as a free parameter. Although it will not be proven here, the remaining N-8 points can be placed without conflicting with any of the conclusions obtained from looking at the subset.

It is known that a necessary and sufficient condition for the 3-D Delaunay triangulation is that the circumsphere passing through the vertices of any tetrahedron must be point free [Law86]; that is to say, no other point of the triangulation can lie interior to this sphere. Furthermore, a property of locality exists so that we need only inspect adjacent tetrahedra for the satisfaction of the circumsphere test. For the configuration of points shown in figure 4.1.4, convexity of the point cloud constrains z \ge 1 and the satisfaction of the circumsphere test requires that z \le 2:

1 \le z \le 2   (Delaunay triangulation)

From (4.1.17) we find that W_{0i} \ge 0 if and only if z < 7/4:

1 \le z < 7/4   (nonnegativity)

This indicates that for 7/4 < z \le 2 we have a valid Delaunay triangulation which does not satisfy a discrete maximum principle. In fact, the Delaunay triangulation of 400 points randomly distributed in the unit cube revealed that approximately 25% of the interior edge weights were of the wrong sign (negative).
Keep in mind that from (4.1.17) we can obtain a sufficient but not necessary condition for nonnegativity: that all face angles be less than or equal to \pi/2. This is consistent with the known result from [Ciarlet73].

4.2 Discrete Variation and Maximum Principles for Hyperbolic Equations

In this section we examine discrete total variation and maximum principles for scalar conservation law equations. We first consider the nonlinear conservation law equation:

u_t + (f(u))_x = 0, \quad u(x, 0) = u_0(x)   (4.2.0)

which is discretized in conservation form:

U_j^{n+1} = U_j^n - \frac{\Delta t}{\Delta x}\left(h_{j+1/2} - h_{j-1/2}\right) = H(U_{j-l}^n, U_{j-l+1}^n, \dots, U_{j+l}^n)   (4.2.1)

where h_{j+1/2} = h(U_{j-l+1}, \dots, U_{j+l}) is the numerical flux function satisfying the consistency condition

h(U, U, \dots, U) = f(U).

A finite-difference scheme (4.2.1) is said to be monotone in the sense of Harten, Hyman, and Lax [HartHL76] if H is a monotone increasing function of each of its arguments:

\frac{\partial H}{\partial U_i}(U_{-k}, \dots, U_k) \ge 0 \quad \forall\; -k \le i \le k   (HHL monotonicity)

This is a strong definition of monotonicity. In [HartHL76] it is proven that schemes satisfying this condition also satisfy the entropy inequality which distinguishes physically relevant discontinuities. Unfortunately, they also prove that HHL monotone schemes in conservation form are at most first order spatially accurate.

To allow higher order accuracy, Harten introduced a weaker concept of monotonicity. A grid function U is called monotone if for all i

\min(U_{i-1}, U_{i+1}) \le U_i \le \max(U_{i-1}, U_{i+1}).   (4.2.2)

A scheme is called monotonicity preserving if monotonicity of U^{n+1} follows from monotonicity of U^n. Observe the close relationship between monotonicity preservation in time and the discrete maximum principle for Laplace's equation (4.1.15) in space. It follows immediately from the definition of monotonicity preservation that

(1) Local maxima are nonincreasing
(2) Local minima are nondecreasing

which is a property of the conservation law equation (discussed earlier).
Using this weaker form of monotonicity, Harten [Hart83] introduced the notion of total variation diminishing schemes. Define the total variation in one dimension:

TV(U) = \sum_{-\infty}^{\infty} |U_i - U_{i-1}|.   (4.2.3)
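As a small illustration of ours (not from the notes, assuming NumPy), the total variation (4.2.3) is straightforward to compute on a periodic grid, and a first-order upwind step for u_t + a u_x = 0 never increases it when the CFL number lies between 0 and 1. The operator R = I - nu*D built here is the explicit difference operator used in the analysis that follows: its entries are nonnegative and its columns sum to one, so its L1 norm is one.

```python
import numpy as np

def total_variation(u):
    """TV(U) = sum_j |U_j - U_{j-1}| on a periodic grid (eq. 4.2.3)."""
    return float(np.sum(np.abs(u - np.roll(u, 1))))

def upwind_step(u, nu):
    """One Euler-explicit step of first-order upwind for u_t + a u_x = 0
    with a > 0 and CFL number nu = a*dt/dx."""
    return u - nu * (u - np.roll(u, 1))

def explicit_operator(n, nu):
    """The operator R = I - nu*D acting on the difference vector DU.
    For 0 <= nu <= 1 all entries are nonnegative and every column sums
    to one, hence ||R||_1 = 1 and the scheme is TVD."""
    eye = np.eye(n)
    shift = np.roll(eye, -1, axis=1)   # E^{-1}: picks out U_{j-1}
    d = eye - shift                    # DU = [I - E^{-1}] U
    return eye - nu * d
```

Running the upwind step on a discontinuous profile shows the total variation decreasing monotonically, in agreement with the TVD property established below.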
A scheme is said to be total variation diminishing (TVD) if

TV(U^{n+1}) \le TV(U^n)   (4.2.4)

This is a discrete analog of the total variation statement (4.0.4) given for the conservation law equation. Harten has proven that schemes which are HHL monotone are TVD, and schemes that are TVD are monotonicity preserving. Furthermore, it can be shown that all linear monotonicity preserving schemes are at most first order accurate. Thus high order accurate TVD schemes must necessarily be nonlinear in a fundamental way.

To understand the basic design principles for TVD schemes, assume a one-dimensional periodic grid together with the following numerical scheme in abstract matrix operator form

[I + \theta \Delta t\, \widetilde M D]\, U^{n+1} = [I - (1-\theta)\Delta t\, M D]\, U^n   (4.2.5)

where \widetilde M and M are matrices which can be nonlinear functions of the solution U. The matrix D denotes the difference operator

DU = [I - E^{-1}]U = \begin{pmatrix} U_1 - U_J \\ U_2 - U_1 \\ U_3 - U_2 \\ \vdots \\ U_J - U_{J-1} \end{pmatrix}.

The scheme (4.2.5) represents a general family of explicit (\theta = 0) and implicit (\theta = 1) schemes with arbitrary support. More importantly, we claim that schemes written in conservative form can be rewritten in this form using (exact) mean value constructions. Using this notation, we have an equivalent definition of the total variation in terms of the L1 norm:

TV(U) = \|DU\|_1.

To analyze the scheme (4.2.5), multiply by D from the left and regroup:

[I + \theta \Delta t\, D \widetilde M]\, DU^{n+1} = [I - (1-\theta)\Delta t\, D M]\, DU^n   (4.2.6)

or in symbolic form

L\, DU^{n+1} = R\, DU^n, \qquad DU^{n+1} = L^{-1} R\, DU^n   (4.2.7)

where we have assumed invertibility of L. This invertibility will be guaranteed by the diagonal dominance required below. Next we take the L1 norm of both sides and apply matrix-vector inequalities:

TV(U^{n+1}) = \|DU^{n+1}\|_1 \le \|L^{-1}R\|_1 \|DU^n\|_1 = \|L^{-1}R\|_1\, TV(U^n)

Thus we see that a sufficient condition is that \|L^{-1}R\|_1 \le 1. Recall that the L1 norm of a matrix is obtained by summing the absolute value of elements of columns of the matrix
and choosing the column whose sum is largest. Furthermore, we have the usual matrix inequality \|L^{-1}R\|_1 \le \|L^{-1}\|_1 \|R\|_1, so that sufficient TVD conditions are \|L^{-1}\|_1 \le 1 and \|R\|_1 \le 1. As we will see, these simple inequalities are enough to recover the TVD criteria of previous investigators; see [Hart83], [Jam86].

Theorem 4.2.0 [BarL87]: A sufficient condition for the scheme (4.2.5) to be TVD with \theta = 0 is that R be bounded with R \ge 0 (all elements are nonnegative).

Proof: Consider the explicit operator R = I - \Delta t\, DM and multiply it from the left by the summation vector s^T = [1, 1, \dots, 1]. It is clear that s^T D = 0 so that s^T R = s^T (columns sum to unity). Because the L1 norm of R is the maximum of the sum of absolute values of elements in columns of R, we immediately obtain the desired result.

Next we consider the implicit scheme with \theta = 1 and sufficient conditions for \|L^{-1}\|_1 \le 1. From the previous development, one way to do this would be to show that L is monotone, i.e. L^{-1} \ge 0 with columns that sum to unity.

Theorem 4.2.1 [BarL87]: A sufficient condition for the scheme (4.2.5) to be TVD with \theta = 1 is that L be an M-type monotone matrix, i.e. diagonally dominant with positive diagonal entries and negative off-diagonal entries.

Proof: Consider the implicit operator L = I + \Delta t\, D\widetilde M and multiply it from the left by the summation vector. Again we have s^T D = 0 so that

s^T L = s^T \;\rightarrow\; s^T = s^T L^{-1}

which implies that the columns of L^{-1} sum to unity. Finally, we appeal to Lemma 4.1.0 concerning M-type monotone matrix operators to obtain the final result.

Next we will demonstrate that this general theory reproduces some well known results by Harten [Hart83]. Consider the following explicit scheme in Harten's notation:

U_j^{n+1} = U_j^n + C^+_{j+1/2}\Delta_{j+1/2}U^n - C^-_{j-1/2}\Delta_{j-1/2}U^n

where \Delta_{j+1/2}U = U_{j+1} - U_j. The operator R in this case has the following banded structure:

\begin{pmatrix}
\ddots & \ddots & 0 & 0 & \\
\ddots & \ddots & C^+_{j+1/2} & 0 & 0 \\
0 & C^-_{j-1/2} & 1 - C^+_{j+1/2} - C^-_{j+1/2} & C^+_{j+3/2} & 0 \\
0 & 0 & C^-_{j+1/2} & \ddots & \ddots \\
& 0 & 0 & \ddots & \ddots
\end{pmatrix}

We need only require that this matrix be nonnegative to arrive at Harten's criteria:

C^+_{j+1/2} \ge 0
C^-_{j+1/2} \ge 0
1 - C^+_{j+1/2} - C^-_{j+1/2} \ge 0

Next we consider Harten's implicit form:

U_j^{n+1} + D^+_{j+1/2}\Delta_{j+1/2}U^{n+1} - D^-_{j-1/2}\Delta_{j-1/2}U^{n+1} = U_j^n

In this case L has the general structure

\begin{pmatrix}
\ddots & \ddots & 0 & 0 & \\
\ddots & \ddots & -D^+_{j+1/2} & 0 & 0 \\
0 & -D^-_{j-1/2} & 1 + D^+_{j+1/2} + D^-_{j+1/2} & -D^+_{j+3/2} & 0 \\
0 & 0 & -D^-_{j+1/2} & \ddots & \ddots \\
& 0 & 0 & \ddots & \ddots
\end{pmatrix}

To obtain Harten's TVD criteria for the implicit scheme, we need only require that this operator be an M-matrix, obtaining the following conditions as did Harten:

D^+_{j+1/2} \ge 0
D^-_{j+1/2} \ge 0

Maximum Principles and Monotonicity Preserving Schemes on Multidimensional Structured Meshes

Unfortunately, we have two motivations for further weakening the concept of monotonicity. The first motivation concerns a negative result by Goodman and LeVeque [GoodLV85] in which they show that conservative TVD schemes on Cartesian meshes in two space dimensions are first order accurate. The second motivation is the apparent difficulty in extending the TVD concept to arbitrary unstructured meshes. The first motivation inspired Spekreijse [Spek87] to consider a new class of monotonicity preserving schemes based on positivity of coefficients. Consider the following conservation law equation in two space dimensions:

u_t + (f(u))_x + (g(u))_y = 0.   (4.2.8)

Next construct a discretization of (4.2.8) on a logically rectangular mesh

\frac{U_{j,k}^{n+1} - U_{j,k}^n}{\Delta t} = A^+_{j+1/2,k}(U_{j+1,k}^n - U_{j,k}^n) + A^-_{j-1/2,k}(U_{j-1,k}^n - U_{j,k}^n) + B^+_{j,k+1/2}(U_{j,k+1}^n - U_{j,k}^n) + B^-_{j,k-1/2}(U_{j,k-1}^n - U_{j,k}^n)   (4.2.9)

with nonlinear coefficients

A^\pm_{j\pm1/2,k} = A(\dots, U_{j-1,k}^n, U_{j,k}^n, U_{j+1,k}^n, \dots)
B^\pm_{j,k\pm1/2} = B(\dots, U_{j-1,k}^n, U_{j,k}^n, U_{j+1,k}^n, \dots)

Theorem 4.1.2: The scheme (4.2.9) exhibits a discrete maximum principle at steady state if all coefficients are uniformly bounded and nonnegative,

A^\pm_{j\pm1/2,k} \ge 0, \quad B^\pm_{j,k\pm1/2} \ge 0.

Furthermore, the scheme (4.2.9) is monotonicity preserving under a CFL-like condition if

\Delta t \le \frac{1}{\sum_\pm (A^\pm_{j\pm1/2,k} + B^\pm_{j,k\pm1/2})}.

Proof: We first prove a discrete maximum principle at steady state by solving for the value at (j, k):

U_{j,k} = \frac{\sum_\pm (A_{j\pm1/2,k} U_{j\pm1,k} + B_{j,k\pm1/2} U_{j,k\pm1})}{\sum_\pm (A_{j\pm1/2,k} + B_{j,k\pm1/2})} = \sum_\pm (\alpha_{j\pm1/2,k} U_{j\pm1,k} + \beta_{j,k\pm1/2} U_{j,k\pm1})   (4.2.10)

with the constraints

\alpha_{j-1/2,k} + \alpha_{j+1/2,k} + \beta_{j,k-1/2} + \beta_{j,k+1/2} = 1

and \alpha_{j\pm1/2,k} \ge 0 and \beta_{j,k\pm1/2} \ge 0. From positivity of coefficients and convexity of (4.2.10) we have

\min(U_{j\pm1,k}, U_{j,k\pm1}) \le U_{j,k} \le \max(U_{j\pm1,k}, U_{j,k\pm1}).   (4.2.11)

If U_{j,k} attains a maximum value M at (j, k), then

M = U_{j-1,k} = U_{j+1,k} = U_{j,k-1} = U_{j,k+1}.

Repeated application of (4.2.11) to neighboring mesh points establishes the maximum principle.

Next we determine a CFL-like condition for monotonicity preservation in time by again seeking positivity of coefficients and a convex local mapping from U^n to U^{n+1}:

U_{j,k}^{n+1} = \left(1 - \Delta t \sum_\pm (A^\pm_{j\pm1/2,k} + B^\pm_{j,k\pm1/2})\right) U_{j,k}^n + \Delta t \sum_\pm (A_{j\pm1/2,k} U_{j\pm1,k}^n + B_{j,k\pm1/2} U_{j,k\pm1}^n) = \gamma_{j,k} U_{j,k}^n + \sum_\pm (\alpha_{j\pm1/2,k} U_{j\pm1,k}^n + \beta_{j,k\pm1/2} U_{j,k\pm1}^n)   (4.2.12)

with the derivable constraints

\gamma_{j,k} + \alpha_{j-1/2,k} + \alpha_{j+1/2,k} + \beta_{j,k-1/2} + \beta_{j,k+1/2} = 1
and \alpha_{j\pm1/2,k} \ge 0 and \beta_{j,k\pm1/2} \ge 0. To show that (4.2.12) is a local convex mapping, we need only satisfy the CFL-like condition for nonnegativity of \gamma_{j,k}:

\Delta t \le \frac{1}{\sum_\pm (A^\pm_{j\pm1/2,k} + B^\pm_{j,k\pm1/2})}.   (4.2.13)

If (4.2.13) is satisfied, then monotonicity preservation in time follows immediately:

\min(U^n_{j\pm1,k}, U^n_{j,k\pm1}, U^n_{j,k}) \le U^{n+1}_{j,k} \le \max(U^n_{j\pm1,k}, U^n_{j,k\pm1}, U^n_{j,k})   (4.2.14)

Maximum Principles and Monotonicity Preserving Schemes on Unstructured Meshes

In this section we examine the maximum principle theory for unstructured meshes. We restrict our attention to the analysis of Godunov-like schemes based on solution reconstruction and evolution. Thus the present analysis differs from maximum principle theory based on the "upwind triangle" scheme developed by Desideri and Dervieux [DesD88], Arminjon and Dervieux [ArmD89], and Jameson [Jam93].

Consider an integral form of (4.2.8) for some domain \Omega comprised of nonoverlapping control volumes \Omega_i, such that \Omega = \bigcup \Omega_i and \Omega_i \cap \Omega_j = \emptyset, i \ne j. In each control volume we have

\frac{\partial}{\partial t}\int_{\Omega_i} u \, d\Omega + \int_{\partial\Omega_i} (F \cdot n)\, d\Gamma = 0   (4.2.15)

where F(u) = f(u)\,\hat i + g(u)\,\hat j. The situation is depicted for a control volume \Omega_0 in fig. 4.2.0.
Figure 4.2.0 Local control volume configuration for unstructured mesh.

For two- and three-dimensional triangulations, several control volume choices are available: the triangles themselves, Voronoi duals, median duals, etc. Although the actual choice of control volume tessellation is very important, the monotonicity analysis contained in the remainder of this section is largely independent of this choice. Therefore we assume a generic control volume \Omega_0 with neighboring control volumes \Omega_i, i \in N_0, as shown in fig. 4.2.0. In later sections we will examine the particular choice of control volume in detail.
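An edge-based residual loop over such a generic control volume tessellation can be sketched as follows. This is an illustration of ours (not from the notes), assuming NumPy, piecewise constant data, and the linear flux F(u) = a u; the tiny periodic "mesh" used in the test degenerates to one dimension but exercises the same edge-based data structure:

```python
import numpy as np

def upwind_residual(u, edges, a):
    """Accumulate -sum_edges h(u_i, u_j, n_ij) into each control volume.
    edges: list of (i, j, n_ij) with n_ij the outward normal of volume i
    on the face shared with volume j, scaled by the face length."""
    res = np.zeros_like(u)
    for i, j, n in edges:
        an = a[0] * n[0] + a[1] * n[1]
        # upwind flux for the linear flux F(u) = a u:
        # average flux plus dissipation proportional to |a . n|
        h = 0.5 * an * (u[i] + u[j]) - 0.5 * abs(an) * (u[j] - u[i])
        res[i] -= h
        res[j] += h          # equal and opposite on the neighbor
    return res

def step(u, edges, a, area, dt):
    """Euler-explicit update of the control volume averages."""
    return u + dt / area * upwind_residual(u, edges, a)
```

Because each face flux is added to one volume and subtracted from its neighbor, the update conserves the total amount of u exactly, and under a CFL-like timestep restriction the global extrema cannot grow.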
Example: Analysis of an Upwind Advection Scheme with Piecewise Constant Reconstruction

In this example we assume that the solution data u(x, y)_i in each control volume \Omega_i is constant, with value equal to the integral average value, i.e.

u(x, y)_i = u_i = \frac{1}{A_i} \int_{\Omega_i} u \, d\Omega, \quad \forall i \in \Omega   (4.2.16)

where A_i is the area of \Omega_i. Next we define the unit exterior normal vector n_{0i} for the control volume boundary separating \Omega_0 and \Omega_i. It is also useful to define a variant of the unit normal vector, \vec n_{0i}, which is scaled by the length (area in 3-D) of that portion of the control volume boundary separating \Omega_0 and \Omega_i. Finally, to simplify the exposition we define

f(u, \vec n) = F(u) \cdot \vec n

and assume the existence of a mean value linearization such that

f(v, \vec n) - f(u, \vec n) = df(u, v, \vec n)(v - u).   (4.2.17)

Using this notation, we construct the following upwind scheme

\frac{d}{dt}(A_0 u_0) = -\sum_{i \in N_0} h(u_0, u_i, \vec n_{0i})   (4.2.18)

with

h(u_0, u_i, \vec n_{0i}) = \frac{1}{2}\left(f(u_0, \vec n_{0i}) + f(u_i, \vec n_{0i})\right) - \frac{1}{2}|df(u_0, u_i, \vec n_{0i})|(u_i - u_0)   (4.2.19)

In Barth and Jespersen [BarJ89] we proved a maximum principle and monotonicity preservation of the scheme (4.2.18) for scalar advection.

Theorem 4.2.2: The upwind algorithm with piecewise constant solution data (4.2.18) exhibits a discrete maximum principle for arbitrary unstructured meshes and is monotonicity preserving under the CFL-like condition:

\Delta t \le \frac{-A_j}{\sum_{i \in N_j} df^-(u_i, u_j, \vec n_{ji})}, \quad \forall j \in \Omega

Proof: For simplicity, consider the control volume surrounding v_0 as shown in fig. 4.2.0. Recall that the flux function was constructed using a mean value linearization such that

f(u_i, \vec n_{0i}) - f(u_0, \vec n_{0i}) = df(u_0, u_i, \vec n_{0i})(u_i - u_0)   (4.2.20)

This permits regrouping terms into the following form:

\frac{d}{dt}(u_0 A_0) = -\sum_{i \in N_0} f(u_0, \vec n_{0i}) - \sum_{i \in N_0} df^-(u_0, u_i, \vec n_{0i})(u_i - u_0)   (4.2.21)
where (\cdot) = (\cdot)^+ + (\cdot)^- and |(\cdot)| = (\cdot)^+ - (\cdot)^-. For any closed control volume, we have that

\sum_{i \in N_0} f(u_0, \vec n_{0i}) = 0.

Combining the remaining terms yields a final form for analysis:

\frac{d}{dt}(u_0 A_0) = -\sum_{i \in N_0} df^-(u_0, u_i, \vec n_{0i})(u_i - u_0)   (4.2.22)

To verify a maximum principle at steady state, set the time term to zero and solve for u_0:

u_0 = \frac{\sum_{i \in N_0} df^-(u_0, u_i, \vec n_{0i})\, u_i}{\sum_{i \in N_0} df^-(u_0, u_i, \vec n_{0i})} = \sum_{i \in N_0} \alpha_i u_i   (4.2.23)

with \sum_{i \in N_0} \alpha_i = 1 and \alpha_i \ge 0. Since u_0 is a convex combination of all neighbors,

\min_{i \in N_0} u_i \le u_0 \le \max_{i \in N_0} u_i.   (4.2.24)

If u_0 takes on a maximum value M in the interior, then u_i = M, \forall i \in N_0. Repeated application of (4.2.24) to neighboring control volumes establishes the maximum principle.

For explicit time stepping, a CFL-like condition is obtained for monotonicity preservation. For Euler explicit time stepping, we have the time approximation

\frac{d}{dt}(u_0 A_0) \approx A_0 \frac{u_0^{n+1} - u_0^n}{\Delta t}

which results in the following scheme:

u_0^{n+1} = u_0^n - \frac{\Delta t}{A_0} \sum_{i \in N_0} df^-(u_0^n, u_i^n, \vec n_{0i})(u_i^n - u_0^n) = \alpha_0 u_0^n + \sum_{i \in N_0} \alpha_i u_i^n   (4.2.25)

It should be clear that the coefficients in (4.2.25) sum to unity. To prove monotonicity preservation, it is sufficient to show nonnegativity of the coefficients. By inspection we have \alpha_i \ge 0 for all i > 0. To achieve monotonicity preservation we require \alpha_0 \ge 0:

\alpha_0 = 1 + \frac{\Delta t}{A_0} \sum_{i \in N_0} df^-(u_0^n, u_i^n, \vec n_{0i}) \ge 0   (4.2.26)

A local convex mapping from u^n to u^{n+1} exists under the CFL-like condition

\Delta t \le \frac{-A_j}{\sum_{i \in N_j} df^-(u_0^n, u_i^n, \vec n_{0i})}, \quad \forall j \in \Omega   (4.2.27)
which establishes monotonicity preservation in time.

Note that in one dimension this number corresponds to the conventional CFL number. In multiple space dimensions, this inequality is sufficient but not necessary for stability. In practice, somewhat larger timestep values may be used.

Example: Analysis of High Order Accurate Upwind Advection Schemes Using Arbitrary Order Reconstruction

In this example we examine maximum principle properties for a general class of high order accurate schemes on unstructured meshes. The solution algorithm is a relatively standard procedure for extensions of Godunov's scheme in Eulerian coordinates; see for example [God59], [VanL79], [ColW84], [WoodC84], [HartO85], [HartEOC86]. The basic idea in Godunov's method is to treat the integral control volume averages, \bar u, as the basic unknowns. Using information from the control volume averages, k-th order piecewise polynomials are reconstructed in each control volume \Omega_i:

U^k(x, y)_i = \sum_{m+n \le k} \alpha_{(m,n)} P_{(m,n)}(x - x_c, y - y_c)   (4.2.28)

where P_{(m,n)}(x - x_c, y - y_c) = (x - x_c)^m (y - y_c)^n and (x_c, y_c) is the control volume centroid. The process of reconstruction amounts to finding the polynomial coefficients \alpha_{(m,n)}. Near steep gradients and discontinuities, these polynomial coefficients may be altered based on monotonicity arguments. Because the reconstructed polynomials vary discontinuously from control volume to control volume, a unique value of the solution does not exist at control volume interfaces. This nonuniqueness is resolved via exact or approximate solutions of the Riemann problem. In practice, this is accomplished by supplanting the true flux function with a numerical flux function (described below) which produces a single unique flux given two solution states. Once the flux integral is carried out (either exactly or by numerical quadrature), the control volume average of the solution can be evolved in time. In most cases, standard techniques for integrating ODE equations are used for the time evolution, i.e. Euler implicit, Euler explicit, Runge-Kutta. The result of the evolution process is a new collection of control volume averages. The process can then be repeated. The process can be summarized in the following steps:

(1) Reconstruction in Each Control Volume: Given integral solution averages in all \Omega_j, reconstruct a k-th order piecewise polynomial U^k(x, y)_i in each \Omega_i for use in equation (4.2.15). In faithful implementations of Godunov's method we require that

\int_{\Omega_i} U^k(x, y)_i \, d\Omega = (\bar u A)_i

during the reconstruction process (see discussion below). For solutions containing discontinuities and/or steep gradients, monotonicity enforcement may be required.

(2) Flux Evaluation on Each Edge: Supplant the true flux by a numerical flux function. Given two solution states, the numerical flux function returns a single unique flux. Using
the notation of the previous section, we define f(u, n) = F(u) \cdot n so that

\int_{\partial\Omega_i} f(u, n) \, d\Gamma \approx \int_{\partial\Omega_i} h(U_L, U_R, n) \, d\Gamma.   (4.2.29)

Consider each control volume boundary \partial\Omega_i to be a collection of polygonal edges (or dual edges) from the mesh. Along each edge (or dual edge), perform a high order accurate flux quadrature. When the reconstruction polynomial is piecewise linear, single (midpoint) quadrature is usually employed on both structured and unstructured meshes:

\int_{\partial\Omega_i} h(U_L, U_R, n) \, d\Gamma \approx \sum_{j \in N_i} h(U_L, U_R, \vec n)_{ij}   (4.2.30)

where U_L and U_R are solution values evaluated at the midpoint of control volume edges as shown in fig. 4.2.0. When multi-point quadrature formulas are employed, we assume that they are of the form

\int_0^1 f(s)\, ds = \sum_{q \in Q} w_q f(\xi_q)

with w_q > 0 and \xi_q \in [0, 1]. We denote the inclusion of multi-point quadrature formulas using the augmented notation

\int_{\partial\Omega_i} h(U_L, U_R, n) \, d\Gamma \approx \sum_{j \in N_i} \sum_{q \in Q} w_q\, h(U_L, U_R, \vec n)_{ijq}   (4.2.31)

(3) Evolution in Each Control Volume: Collect flux contributions in each control volume and evolve in time using any time stepping scheme, i.e., Euler explicit, Euler implicit, Runge-Kutta, etc. The result of this step is once again control volume averages, and the process can be repeated.

In the present analysis we assume that the reconstruction polynomials U^k(x, y)_i in each \Omega_i are given. The result of the analysis will be conditions or constraints on the reconstruction so that maximum principles can be obtained. The topic of reconstruction and implementation of the constraints determined by this analysis will be examined in a later section. Using this notation, we construct the following upwind scheme for the configuration in fig. 4.2.0:

\frac{d}{dt}(A_0 u_0) = -\sum_{i \in N_0} \sum_{q \in Q} w_q\, h(U_L, U_R, \vec n)_{0iq}   (4.2.32)

with a numerical flux function obtained from (4.2.17):

h(U_L, U_R, \vec n_{0i}) = \frac{1}{2}\left(f(U_L, \vec n) + f(U_R, \vec n)\right)_{0i} - \frac{1}{2}|df(U_L, U_R, \vec n)|_{0i}(U_R - U_L)_{0i}   (4.2.33)
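For a concrete scalar case, the flux (4.2.33) with its mean value linearization can be sketched in one dimension with a unit normal. This is an illustration of ours, assuming Burgers' flux f(u) = u^2/2 as the example; note that this Roe-type flux is used here without an entropy fix, so transonic rarefactions are not treated:

```python
def flux(u):
    """Scalar flux for Burgers' equation, f(u) = u^2/2 (an assumed example)."""
    return 0.5 * u * u

def df(ul, ur):
    """Mean value linearization (4.2.17): f(ur) - f(ul) = df(ul, ur)*(ur - ul).
    For Burgers' flux this is the arithmetic mean of the two states;
    the limit ur -> ul recovers f'(ul) = ul."""
    if abs(ur - ul) < 1e-14:
        return ul
    return (flux(ur) - flux(ul)) / (ur - ul)

def h(ul, ur):
    """Upwind numerical flux (4.2.33), 1-D form with unit normal:
    h = (f(ul) + f(ur))/2 - |df(ul, ur)|*(ur - ul)/2."""
    return 0.5 * (flux(ul) + flux(ur)) - 0.5 * abs(df(ul, ur)) * (ur - ul)
```

By construction h(u, u) = f(u), the consistency condition, and the dissipation term selects the upwind state according to the sign of the linearized wave speed.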
To analyze the scheme, we use the fact that the flux function was constructed using a mean value linearization such that

f(U_R, \vec n) - f(U_L, \vec n) = df(U_L, U_R, \vec n)(U_R - U_L)

This permits regrouping terms into the following form:

\frac{d}{dt}(u_0 A_0) = -\sum_{i \in N_0} \sum_{q \in Q} w_q \left(f(U_L, \vec n) + df^-(U_L, U_R, \vec n)(U_R - U_L)\right)_{0iq}   (4.2.34)

Next we rewrite the first term in the sum using a mean value construction:

\sum_{i \in N_0} \sum_{q \in Q} w_q f(u_0, \vec n)_{0iq} + w_q \left(df(u_0, U_L, \vec n)(U_L - u_0)\right)_{0iq}

The first term vanishes when summed over a closed volume, so that (4.2.34) reduces to

\frac{d}{dt}(u_0 A_0) = -\sum_{i \in N_0} \sum_{q \in Q} w_q \left(df(u_0, U_L, \vec n)(U_L - u_0) + df^-(U_L, U_R, \vec n)(U_R - U_L)\right)_{0iq}   (4.2.35)

By introducing difference ratios we rewrite the scheme in the following form:

\frac{d}{dt}(u_0 A_0) = -\sum_{i \in N_0} \sum_{q \in Q} w_q \left(df^-(u_0, U_L, \vec n)\, \Gamma\right)_{0iq} (u_i - u_0) - \sum_{i \in N_0} \sum_{q \in Q} w_q \left(df^+(u_0, U_L, \vec n)\, \gamma\right)_{0iq} (u_0 - u_k) - \sum_{i \in N_0} \sum_{q \in Q} w_q \left(df^-(U_L, U_R, \vec n)\, \psi\right)_{0iq} (u_i - u_0)   (4.2.36)

with

\Gamma_{0iq} = \frac{(U_L)_{0iq} - u_0}{u_i - u_0}, \quad \gamma_{0iq} = \frac{(U_L)_{0iq} - u_0}{u_0 - u_k}, \quad \psi_{0iq} = \frac{(U_R)_{0iq} - (U_L)_{0iq}}{u_i - u_0}   (4.2.37)

In this equation, the k subscript refers to some as yet unspecified index value such that k \in N_0.

Theorem 4.2.3: The generalized Godunov scheme with arbitrary order reconstruction (4.2.32) exhibits a discrete maximum principle at steady state if the following three conditions are fulfilled:

\Gamma_{jiq} \ge 0, \quad \gamma_{jiq} \ge 0, \quad \psi_{jiq} \ge 0 \quad \forall j, q,\; i \in N_j.   (4.2.38)

Furthermore, the scheme is monotonicity preserving under a CFL-like condition if, \forall j \in \Omega,

\Delta t \le \frac{-A_j}{\sum_{i\in N_j}\sum_{q\in Q} \left(\widehat{df}^-\Gamma - \widehat{df}^+\gamma + \overline{df}^-\psi\right)_{jiq}}
Proof: Assume that (4.2.38) holds. Define \widehat{df}_{0iq} = w_q\, df(u_0, U_L, \vec n)_{0iq} and similarly \overline{df}_{0iq} = w_q\, df(U_L, U_R, \vec n)_{0iq}. Setting the time term to zero and solving for u_0 yields

u_0 = \frac{\sum_{i\in N_0}\sum_{q\in Q} (\widehat{df}^+\gamma)_{0iq}\, u_k - (\widehat{df}^-\Gamma + \overline{df}^-\psi)_{0iq}\, u_i}{\sum_{i\in N_0}\sum_{q\in Q} (\widehat{df}^+\gamma - \widehat{df}^-\Gamma - \overline{df}^-\psi)_{0iq}} = \sum_{i\in N_0} \alpha_i u_i.

Examining the individual coefficients, we see that \sum_{i\in N_0} \alpha_i = 1 and \alpha_i \ge 0. Thus a convex local mapping exists and we obtain

\min_{i\in N_0} u_i \le u_0 \le \max_{i\in N_0} u_i.   (4.2.40)

If u_0 takes on a maximum value M in the interior, then u_i = M, \forall i \in N_0. Repeated application of (4.2.40) to neighboring control volumes establishes the maximum principle.

To establish monotonicity preservation in time, we consider the Euler explicit timestepping scheme

\frac{d}{dt}(u_0 A_0) \approx A_0 \frac{u_0^{n+1} - u_0^n}{\Delta t}.

Inserting this formula into (4.2.36) yields

u_0^{n+1} = u_0^n - \frac{\Delta t}{A_0}\sum_{i\in N_0}\sum_{q\in Q} \widehat{df}^-_{0iq}\Gamma_{0iq}(u_i^n - u_0^n) - \frac{\Delta t}{A_0}\sum_{i\in N_0}\sum_{q\in Q} \widehat{df}^+_{0iq}\gamma_{0iq}(u_0^n - u_k^n) - \frac{\Delta t}{A_0}\sum_{i\in N_0}\sum_{q\in Q} \overline{df}^-_{0iq}\psi_{0iq}(u_i^n - u_0^n) = \alpha_0 u_0^n + \sum_{i\in N_0}\sum_{q\in Q} \alpha_i u_i^n   (4.2.41)

with \alpha_0 + \sum_{i\in N_0} \alpha_i = 1 and \alpha_i \ge 0, i > 0. A locally convex mapping in time from U^n to U^{n+1} is achieved when \alpha_0 \ge 0; this assures monotonicity in time. Some algebra reveals the following formula for \alpha_0:

\alpha_0 = 1 + \frac{\Delta t}{A_0}\sum_{i\in N_0}\sum_{q\in Q} \left(\widehat{df}^-\Gamma - \widehat{df}^+\gamma + \overline{df}^-\psi\right)_{0iq}

From this we obtain the CFL-like condition for monotonicity preservation

\Delta t \le \frac{-A_0}{\sum_{i\in N_0}\sum_{q\in Q} \left(\widehat{df}^-\Gamma - \widehat{df}^+\gamma + \overline{df}^-\psi\right)_{0iq}}
so that

\min_{i\in N_0}(u_i^n, u_0^n) \le u_0^{n+1} \le \max_{i\in N_0}(u_i^n, u_0^n)

Applying this result to all control volumes establishes monotonicity preservation.

5.0 Finite-Volume Schemes for Scalar Conservation Law Equations

In this section we consider the specific application of the maximum principle theory of Section 4 to the design of numerical schemes for solving conservation law equations. Some portions of this section will repeat ideas introduced in Section 4, but now emphasis is placed on the details of implementation.

5.1 Conservation Laws

Definition: A conservation law asserts that the rate of change of the total amount of a substance with density z in a fixed region \Omega is equal to the flux \mathcal F of the substance through the boundary \partial\Omega:

\frac{d}{dt}\int_\Omega z \, da + \int_{\partial\Omega} \mathcal F(z)\cdot n \, dl = 0   (integral form)

The choice of a numerical algorithm used to solve a conservation law equation is often influenced by the form in which the conservation law is presented. A finite-difference practitioner would apply the divergence theorem to the integral form and let the area of \Omega shrink to zero, thus obtaining the divergence form of the equation:

\frac{d}{dt} z + \nabla\cdot\mathcal F(z) = 0   (divergence form)

The finite-element practitioner constructs the divergence form, then multiplies by an arbitrary test function \phi and integrates by parts:

\frac{d}{dt}\int_\Omega \phi z \, da - \int_\Omega \nabla\phi\cdot\mathcal F(z)\, da + \int_{\partial\Omega} \phi\,\mathcal F(z)\cdot n \, dl = 0   (weak form)

Algorithm developers starting from these three forms can produce seemingly different numerical schemes. In reality, the final discretizations are usually very similar. Some differences do appear in the handling of boundary conditions, solution discontinuities, and nonlinearities. When considering flows with discontinuities, the integral form appears advantageous since conservation of fluxes comes for free and the proper jump conditions are assured. Discretizations of the divergence form on structured meshes yield conventional finite-difference formulas.
The divergence form of the equations is rarely used in the discretization of conservation law equations on unstructured meshes because of the difficulty in ensuring conservation and the proper jump conditions. The weak form of the equation guarantees satisfaction of the jump conditions over the extent of the support and is fully
compatible with unstructured meshes. For these reasons the integral and weak forms are both used extensively in numerical modeling of conservation laws on unstructured meshes. The remaining portion of these notes will be devoted to the finite-volume method, which is a technique for obtaining discretizations based on the integral conservation law form of the equations.

5.2 Upwind Finite-Volume Schemes Via Godunov's Method

In this section, we consider upwind algorithms for scalar hyperbolic equations. In particular, we concentrate on upwind schemes based on Godunov's method [God59]. The development presented here follows many of the ideas developed previously for structured meshes. For example, in the extension of Godunov's scheme to second order accuracy in one space dimension, van Leer [VanL79] developed an advection scheme based on the reconstruction of discontinuous piecewise linear distributions together with Lagrangian hydrodynamics. Colella and Woodward [ColW84] and Woodward and Colella [WoodC84] further extended these ideas to include discontinuous piecewise parabolic approximations with Eulerian or Lagrangian hydrodynamics. Harten et al. [HartO85], [HartEOC86] later extended related schemes to arbitrary order and clarified the entire process. These techniques have been applied to structured meshes in multiple space dimensions by applying one-dimensional-like schemes along individual coordinate lines. This has proven to be a highly successful approximation but does not directly extend to unstructured meshes. In reference [BarJ89], we proposed a scheme for multi-dimensional reconstruction on unstructured meshes using discontinuous piecewise linear distributions of the solution in each control volume. Monotonicity of the reconstruction was enforced using a limiting procedure similar to that proposed by van Leer for structured grids.
In later papers ([BarF90], [Bar93]) we developed numerical schemes for unstructured meshes utilizing a reconstruction algorithm of arbitrary order. Recall that in section 4.2 the three basic steps in the generalized Godunov scheme were outlined:

(1) Reconstruction in Each Control Volume: Given integral cell averages in all control volumes, reconstruct piecewise polynomial approximations in each control volume, U^k(x, y)_i, for use in the integral conservation law equation (4.2.29). For solutions containing discontinuities or steep gradients, monotonicity enforcement may be required.

(2) Flux Evaluation on Each Edge: Consider each control volume boundary, \partial\Omega_i, to be a collection of edges (or dual edges) from the mesh. Along each edge (or dual edge), perform a high order accurate flux quadrature.

(3) Evolution in Each Control Volume: Collect flux contributions in each control volume and evolve in time using any time stepping scheme, i.e., Euler explicit, Euler implicit, Runge-Kutta, etc. The result of this process is once again cell averages.

By far, the most difficult of these steps is the polynomial reconstruction given cell averages. In the following paragraphs, we describe design criteria for a general reconstruction operator with fixed stencil size. In work by Vankeirsbilck and Deconinck [VankD92] and Michell [Mich94], ENO-like schemes have been constructed for both unstructured and structured meshes.
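The three steps above can be sketched for the 1-D advection equation u_t + u_x = 0 on a periodic grid. This is a minimal illustration of ours (assuming NumPy): minmod-limited linear reconstruction, an upwind interface flux, and Euler-explicit evolution of the cell averages.

```python
import numpy as np

def minmod(a, b):
    """Slope limiter used to enforce monotonicity of the reconstruction."""
    return np.where(a * b <= 0.0, 0.0, np.where(np.abs(a) < np.abs(b), a, b))

def godunov_step(ubar, nu):
    """One cycle of the three steps for u_t + u_x = 0 on a periodic grid,
    with nu = dt/dx:
      (1) reconstruct limited piecewise-linear data from the cell averages,
      (2) evaluate an upwind flux at each interface,
      (3) evolve the averages with Euler-explicit time stepping."""
    # (1) reconstruction: one limited slope per cell; centering the linear
    # polynomial at the cell midpoint conserves the cell average
    s = minmod(ubar - np.roll(ubar, 1), np.roll(ubar, -1) - ubar)
    # (2) flux: the advection speed is +1, so the interface flux is the
    # left cell's reconstructed value at its right face
    ul = ubar + 0.5 * s               # U_L at interface j+1/2
    flux = ul
    # (3) evolution: new averages from the telescoping flux balance
    return ubar - nu * (flux - np.roll(flux, 1))
```

Because the flux balance telescopes, the total amount of u is conserved exactly, and with the minmod limiter the cell averages remain bounded by their initial extrema for modest CFL numbers.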


The reconstruction operator serves as a finite-dimensional (possibly pseudo) inverse of the cell-averaging operator A whose j-th component A_j computes the cell average of the solution in Ω_j:

    ū_j = A_j u = (1/A_j) ∫_{Ω_j} u(x,y) dΩ    (5.2.0)

In addition, we place the following additional requirements:

(1) Conservation of the mean: Simply stated, given cell averages ū, we require that all polynomial reconstructions u^k have the correct cell average:

    if u^k = R^k ū then ū = A u^k

This means that R^k is a right inverse of the averaging operator A:

    A R^k = I    (5.2.1)

Conservation of the mean has an important implication: unlike finite-element schemes, Godunov schemes have a diagonal mass matrix.

(2) k-exactness: We say that a reconstruction operator R^k is k-exact if R^k A reconstructs polynomials of degree k or less exactly:

    if u ∈ P_k and ū = A u, then u^k = R^k ū = u

In other words, R^k is a left inverse of A restricted to the space of polynomials of degree at most k:

    R^k A |_{P_k} = I    (5.2.2)

This insures that exact solutions contained in P_k are in fact solutions of the discrete equations. For sufficiently smooth solutions, the property of k-exactness also insures that when piecewise polynomials are evaluated at control volume boundaries, the difference between solution states diminishes with increasing k at a rate proportional to h^{k+1}, where h is a maximum diameter of the two control volumes. Figure 5.2.0 shows a global quartic polynomial u ∈ P_4 which has been averaged in each interval.

Figure 5.2.0 Cell averaging of quartic polynomial.
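Conservation of the mean (5.2.1) is easy to verify numerically in one dimension. The following sketch (the mesh, test function, and variable names are ours, purely for illustration) shows that a linear reconstruction about the cell centroid has the correct cell average for any slope, since the linear term integrates to zero about the centroid:

```python
import numpy as np

# Illustrative 1-D sketch: the averaging operator A maps a function to
# cell averages; a linear reconstruction u_i(x) = ubar_i + s_i*(x - xc_i)
# satisfies A R = I for ANY slope s_i.

def cell_average(f, a, b, n=4096):
    """Composite midpoint rule; exact here since the integrands are linear."""
    h = (b - a) / n
    x = a + (np.arange(n) + 0.5) * h
    return f(x).mean()

edges = np.array([0.0, 0.3, 0.7, 1.2, 2.0])          # non-uniform mesh
xc = 0.5 * (edges[:-1] + edges[1:])                  # cell centroids
u = lambda x: 2.0 + 3.0 * x                          # linear test function
ubar = np.array([cell_average(u, a, b) for a, b in zip(edges[:-1], edges[1:])])

s = 42.0                                             # arbitrary slope
ubar2 = np.array([cell_average(lambda x: ubar[i] + s * (x - xc[i]),
                               edges[i], edges[i + 1]) for i in range(4)])
print(np.max(np.abs(ubar2 - ubar)))                  # ~0 for any slope s
```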


Figure 5.2.1 shows a quadratic reconstruction u^k ∈ P_2 given the cell averages. Close inspection of fig. 5.2.1 reveals small jumps in the piecewise polynomials at interval boundaries.

Figure 5.2.1 Piecewise quadratic reconstruction.

These jumps would decrease even more for cubics and vanish altogether for quartic reconstruction. Property (1) requires that the area under each piecewise polynomial is exactly equal to the cell average.

5.3 Linear Reconstruction

In this section, we consider algorithms for linear reconstruction. The process of linear reconstruction in one dimension is depicted in fig. 5.3.0.

Figure 5.3.0 Linear reconstruction of cell-averaged data.


One of the most important observations concerning linear reconstruction is that we can dispense with the notion of cell averages as unknowns by reinterpreting the unknowns as pointwise values of the solution sampled at the centroid (midpoint in 1-D) of the control volume. This well known result greatly simplifies schemes based on linear reconstruction. The linear reconstruction in each interval shown in figure 5.3.0 was obtained by a simple central-difference formula given point values of the solution at the midpoint of each interval.

In section 5.5, results for arbitrary order reconstruction will be presented. The reconstruction strategy presented there satisfies all the design requirements of the reconstruction operator. For linear reconstruction, simpler formulations are possible which exploit the edge data structure mentioned in Section 1. For each edge of a triangulation, we store the following two pieces of information:

(1) The two vertices which form the edge.
(2) The centroid location of the two cells which share that edge.

Several of these reconstruction schemes are given below. Note that for steady-state computations, conservation of the mean in the data reconstruction is not necessary. The implication of violating this conservation is that a nondiagonal mass matrix appears in the time integral. Since time derivatives vanish at steady-state, the effect of this mass matrix vanishes at steady-state. The reconstruction schemes presented below assume that solution variables are placed at the vertices of the mesh, which may not be at the precise centroid of the control volume, thus violating conservation of the mean. Even so, the linear reconstruction methods given below can all be implemented using an edge data structure and satisfy k-exactness for linear functions.

5.3a Green-Gauss Reconstruction

Consider a domain Ω_0 consisting of all triangles incident to some vertex v_0, see fig.
5.3.0, and the exact integral relation

    ∫_{Ω_0} ∇u dΩ = ∫_{∂Ω_0} u n dΓ.

In [BarJ89] we show that given function values at vertices of a triangulation, a discretization of this formula can be constructed which is exact whenever u varies linearly:

    (∇u)_{v_0} = (1/A_0) Σ_{i∈N_0} (1/2)(u_i + u_0) ñ_{0i}    (5.3.0)

In this formula ñ_{0i} = ∫_a^b dñ for any path which connects the triangle centroids adjacent to the edge e(v_0, v_i), and A_0 is the area of the nonoverlapping dual regions formed by this choice of path integration. Two typical choices are the median and centroid duals as shown below.


Figure 5.3.0 Local mesh with centroid and median duals.

This approximation extends naturally to three dimensions, see [Bar91b]. The formula (5.3.0) suggests a natural computer implementation using the edge data structure. Assume that the normals ñ_{ij} for all edges e(v_i, v_j) have been precomputed with the convention that the normal vector points from v_i to v_j. An edge implementation of (5.3.0) can be performed in the following way:

    For k = 1, n(e)                  ! Loop through edges of mesh
      j1 = e⁻¹(k,1)                  ! Pointer to edge origin
      j2 = e⁻¹(k,2)                  ! Pointer to edge destination
      uav = (u(j1) + u(j2))/2        ! Gather
      ux(j1) += normx(k)*uav         ! Scatter
      ux(j2) -= normx(k)*uav
      uy(j1) += normy(k)*uav
      uy(j2) -= normy(k)*uav
    Endfor
    For j = 1, n(v)                  ! Loop through vertices
      ux(j) = ux(j)/area(j)          ! Scale by area
      uy(j) = uy(j)/area(j)
    Endfor

It can be shown that the use of edge formulas for the computation of vertex gradients is asymptotically optimal in terms of work done.

5.3b Linear Least-Squares Reconstruction

To derive this reconstruction technique, consider a vertex v_0 and suppose that the solution varies linearly over the support of adjacent neighbors of the mesh. In this case, the change in vertex values of the solution along an edge e(v_i, v_0) can be calculated by

    (∇u)_0 · (R_i − R_0) = u_i − u_0    (5.3.1)
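The edge loop above can be mirrored almost line-for-line in a high-level language. The sketch below builds a hypothetical one-vertex "star" mesh of six triangles around a central vertex (all names and the geometry are ours, for illustration only), forms the centroid-dual normals explicitly, and checks that formula (5.3.0) recovers the gradient of a linear field:

```python
import numpy as np

# One interior vertex v0 at the origin surrounded by a regular hexagon of
# neighbors; the dual normal for edge e(v0, vi) connects the centroids of
# the two triangles sharing that edge.
th = np.arange(6) * np.pi / 3.0
verts = np.vstack([[0.0, 0.0], np.column_stack([np.cos(th), np.sin(th)])])
cent = np.array([(verts[0] + verts[1 + k] + verts[1 + (k + 1) % 6]) / 3.0
                 for k in range(6)])                 # triangle centroids

u = lambda p: 1.0 + 2.0 * p[..., 0] - 3.0 * p[..., 1]   # linear field
uv = u(verts)

grad = np.zeros(2)
for i in range(6):                   # loop over edges e(v0, v_{i+1})
    d = cent[i] - cent[i - 1]        # path between adjacent centroids
    n = np.array([d[1], -d[0]])      # outward dual normal, scaled by length
    grad += 0.5 * (uv[0] + uv[1 + i]) * n

# dual area around v0 via the shoelace formula on the centroid polygon
x, y = cent[:, 0], cent[:, 1]
A0 = 0.5 * np.sum(x * np.roll(y, -1) - np.roll(x, -1) * y)
grad /= A0
print(grad)     # recovers (2, -3) for a linear field
```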


This equation represents the scaled projection of the gradient along the edge e(v_i, v_0). A similar equation can be written for all incident edges subject to an arbitrary weighting factor. The result is the following matrix equation, shown here in two dimensions:

    [ w_1 Δx_1   w_1 Δy_1 ]            [ w_1 (u_1 − u_0) ]
    [   ...        ...    ] [ u_x ]  = [       ...       ]    (5.3.2)
    [ w_n Δx_n   w_n Δy_n ] [ u_y ]    [ w_n (u_n − u_0) ]

or in symbolic form

    L ∇u = f,   where L = [ L̃_1  L̃_2 ]    (5.3.3)

in two dimensions. Exact calculation of gradients for linearly varying u is guaranteed if any two row vectors w_i (R_i − R_0) span all of 2-space. This implies linear independence of L̃_1 and L̃_2. The system can then be solved via a Gram-Schmidt process, i.e.,

    [ Ṽ_1 ]
    [ Ṽ_2 ] [ L̃_1  L̃_2 ] = [ 1  0 ]
                            [ 0  1 ]    (5.3.4)

The row vectors Ṽ_i are given by

    Ṽ_1 = (l_22 L̃_1 − l_12 L̃_2)/(l_11 l_22 − l_12²)
    Ṽ_2 = (l_11 L̃_2 − l_12 L̃_1)/(l_11 l_22 − l_12²)

with l_ij = (L̃_i · L̃_j).

Note that reconstruction of N independent variables in R^d implies (d+1 choose 2) + dN inner product sums. Since only dN of these sums involve the solution variables themselves, the remaining sums can be precalculated and stored in computer memory. This makes the present scheme competitive with the Green-Gauss reconstruction. Using the edge data structure, the inner product sums can be calculated for arbitrary combinations of polyhedral cells. In all cases linear functions are reconstructed exactly. We demonstrate this idea by example:

    For k = 1, n(e)                    ! Loop through edges of mesh
      j1 = e⁻¹(k,1)                    ! Pointer to edge origin
      j2 = e⁻¹(k,2)                    ! Pointer to edge destination
      dx = w(k)*(x(j2) − x(j1))        ! Weighted Δx
      dy = w(k)*(y(j2) − y(j1))        ! Weighted Δy
      l11(j1) = l11(j1) + dx*dx        ! l11 orig sum
      l11(j2) = l11(j2) + dx*dx        ! l11 dest sum
      l12(j1) = l12(j1) + dx*dy        ! l12 orig sum
      l12(j2) = l12(j2) + dx*dy        ! l12 dest sum
      l22(j1) = l22(j1) + dy*dy        ! l22 orig sum
      l22(j2) = l22(j2) + dy*dy        ! l22 dest sum


      du = w(k)*(u(j2) − u(j1))        ! Weighted Δu
      lf1(j1) += dx*du                 ! L̃_1·f sum
      lf1(j2) += dx*du
      lf2(j1) += dy*du                 ! L̃_2·f sum
      lf2(j2) += dy*du
    Endfor
    For j = 1, n(v)                    ! Loop through vertices
      det   = l11(j)*l22(j) − l12(j)²
      ux(j) = (l22(j)*lf1(j) − l12(j)*lf2(j))/det
      uy(j) = (l11(j)*lf2(j) − l12(j)*lf1(j))/det
    Endfor

This formulation provides freedom in the choice of weighting coefficients, w_i. These weighting coefficients can be a function of the geometry and/or solution. Classical approximations in one dimension can be recovered by choosing geometrical weights of the form w_i = 1/|r_i − r_0|^t for values of t = 0, 1, 2. The L2 gradient calculation technique is optimal in a weighted least-squares sense and determines gradient coefficients with least sensitivity to Gaussian noise. This is an important property when dealing with highly distorted (stretched) meshes which typically arise in the computation of viscous flow.

5.3c Data Dependent Reconstruction

Both the Green-Gauss and L2 gradient calculation techniques can be generalized to include data dependent (i.e. solution dependent) weights. In the case of the Green-Gauss formulation, the sum

    Σ_{i∈I_0} (1/2)(u_0 + u_i) ñ_{0i}

is replaced by

    Σ_{i∈I_0} p⁻_{0i} (1/2)(u_0 + u_i) ñ_{0i} + p⁺_{0i} (1/2)((∇u)_0 · (R_i − R_0)) ñ_{0i}    (5.3.5)

If the p±_{0i} are chosen such that p⁻_{0i} + p⁺_{0i} = 1 then the gradient calculation is exact whenever the solution varies linearly over the support. In two space dimensions, equation (5.3.5) implies the solution of a linear 2×2 system of the form

    [ A_0 − m_xx     −m_xy    ] [ u_x ]
    [  −m_yx      A_0 − m_yy  ] [ u_y ]  =  Σ_{i∈I_0} p⁻_{0i} (1/2)(u_0 + u_i) ñ_{0i}

where

    m_xx = Σ_{i∈I_0} p⁺_{0i} Δx_i n_{x_i},   m_yy = Σ_{i∈I_0} p⁺_{0i} Δy_i n_{y_i}
    m_xy = Σ_{i∈I_0} p⁺_{0i} Δx_i n_{y_i},   m_yx = Σ_{i∈I_0} p⁺_{0i} Δy_i n_{x_i}
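For a single vertex, the unweighted (w_i = 1) version of this least-squares gradient can be sketched in a few lines; the neighbor coordinates and test field below are invented for illustration. The result is exact for linear data regardless of the neighbor distribution:

```python
import numpy as np

# Accumulate the inner products l11, l12, l22 and right-hand sides f1, f2
# of (5.3.2)-(5.3.4) for one vertex r0, then invert the 2x2 system.
r0 = np.array([0.2, -0.1])
nbrs = np.array([[1.0, 0.3], [-0.4, 0.9], [0.1, -1.2], [0.8, -0.7]])
u = lambda p: 4.0 - 1.5 * p[..., 0] + 2.5 * p[..., 1]   # linear field

dx, dy = (nbrs - r0).T
du = u(nbrs) - u(r0)
l11, l12, l22 = dx @ dx, dx @ dy, dy @ dy
f1, f2 = dx @ du, dy @ du
det = l11 * l22 - l12 ** 2          # nonzero if neighbors span 2-space
ux = (l22 * f1 - l12 * f2) / det
uy = (l11 * f2 - l12 * f1) / det
print(ux, uy)                       # recovers -1.5, 2.5
```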


Care must be exercised in the selection of p± in order that the system be invertible. This is similar to the spanning space requirement of the least-squares gradient calculation technique.

5.3d Monotonicity Enforcement

When solution discontinuities and steep gradients are present, additional steps must be taken to prevent oscillations from developing in the numerical solution. One way to do this was pioneered by van Leer [VanL79] in the late 1970's. The basic idea is to take the reconstructed piecewise polynomials and enforce strict monotonicity in the reconstruction. Monotonicity in this context means that the value of the reconstructed polynomial does not exceed the minimum and maximum of neighboring cell averages. The final reconstruction must guarantee that no new extrema have been created. When a new extremum is produced, the slope of the reconstruction in that interval is reduced until monotonicity is restored. This implies that at a local minimum or maximum in the cell averaged data the slope in 1-D is always reduced to zero, see for example fig. 5.3.2.

Figure 5.3.2 Linear data reconstruction with monotone limiting.

Theorem 4.2.3 provides sufficient conditions for a discrete maximum principle using arbitrary order reconstruction. Consider the control volume interface separating Ω_i and Ω_j as shown in fig. 5.3.3.

Figure 5.3.3 Control volume interface between Ω_i and Ω_j showing left and right extrapolations.


From Theorem 4.2.3 we can guarantee a maximum principle if, for all quadrature points on the interface separating Ω_i and Ω_j, the coefficients appearing in that theorem are nonnegative. By performing the analysis for the control volume Ω_i we determined the following conditions:

    0 ≤ (U_L − u_i)/(u_j − u_i)    (a)
    0 ≤ (U_L − u_i)/(u_i − u_k)    (b)
    0 ≤ (U_R − U_L)/(u_j − u_i)    (c)    (5.3.6)

where constraint (b) should be interpreted as: find any k ∈ N_i such that the inequality holds. By considering the neighboring control volumes, constraint (c) is reproduced at shared interfaces and the remaining new constraints do not involve the reconstruction polynomial in the other control volume. In other words, constraints (a) and (b) are purely local to a control volume and can be enforced on an individual control volume basis. Constraint (c) is not local and must be enforced pairwise at common interface boundaries. The entire picture becomes clearer by considering a one-dimensional mesh with points contiguously numbered i = 1, 2, ..., N. In this scenario, we would have that j = i+1 and k would necessarily be equal to i−1, so that the previous result (5.3.6) reduces to

    0 ≤ (U_L − u_i)/(u_{i+1} − u_i)
    0 ≤ (U_L − u_i)/(u_i − u_{i−1})
    0 ≤ (U_R − U_L)/(u_{i+1} − u_i)    (5.3.7)

for the interface separating i and i+1. Looking at the same interface from i+1 to i yields the additional constraints:

    0 ≤ (U_R − u_{i+1})/(u_i − u_{i+1})
    0 ≤ (U_R − u_{i+1})/(u_{i+1} − u_{i+2})
    0 ≤ (U_R − U_L)/(u_{i+1} − u_i)    (5.3.8)

The local constraints (a) and (b) in (5.3.7) and (5.3.8) are satisfied if the reconstructed U_L and U_R are bounded between the minimum and maximum of neighboring cell averages. The last inequality appearing in (5.3.7)-(5.3.8) requires that the difference in the extrapolated


states at a cell interface must be of the same sign as the difference in the cell average values. For example, in fig. 5.3.4(a) this condition is violated but can be remedied either by a symmetric reduction of slopes or by replacing the larger slope by the minimum value of the two slopes.

Figure 5.3.4 (a) Reconstruction profile with increased variation violating monotonicity constraints. (b) Profile after modification to satisfy monotonicity constraints.

Observe that in one space dimension the net effect of the slope limiting in the reconstruction process is to ensure that the total variation of the reconstructed function does not exceed the total variation of the cell averaged data.

Next we consider the implementation of slope limiting procedures on unstructured meshes. In Barth and Jespersen [BarJ89], we gave a simple recipe for slope limiting on arbitrary unstructured meshes. Extensive testing has shown that the limiting characteristics of our limiter can damage the overall accuracy of computations, especially for flows on coarse meshes. In addition, the limiter behaves poorly when the solution is nearly constant unless additional heuristic parameters are added. This has prompted other researchers (c.f. [Ven93], [AftGT94]) to propose alternative limiter functions, but no serious attempt is made to appeal to the rigors of maximum principle theory. The design of accurate limiters satisfying the maximum principle constraints is a topic of current research.

In the following paragraphs the limiter function developed in [BarJ89] is discussed. Consider writing the linearly reconstructed data in the following form for Ω_0:

    U(x,y)_0 = u_0 + ∇u_0 · (r − r_0)    (5.3.9)

Now consider a "limited" form of this piecewise linear distribution:

    U(x,y)_0 = u_0 + Φ_0 ∇u_0 · (r − r_0)    (5.3.10)

Next compute the minimum and maximum of all adjacent neighbors

    u_j^min = min_{i∈N_j}(u_0, u_i)


    u_j^max = max_{i∈N_j}(u_0, u_i)

and require that

    u_j^min ≤ U(x,y)_0 ≤ u_j^max    (5.3.11)

when evaluated at the quadrature points used in the flux integral computation. For each quadrature point location in the flux integral, compute the extrapolated state U_{L0i} and determine the smallest Φ_0 so that

    Φ_0 = min(1, (u_j^max − u_0)/(U_{L0i} − u_0)),  if U_{L0i} − u_0 > 0
          min(1, (u_j^min − u_0)/(U_{L0i} − u_0)),  if U_{L0i} − u_0 < 0
          1,                                        if U_{L0i} − u_0 = 0

When the above procedures are combined with the flux function given earlier (4.2.23),

    h(U_L, U_R; n) = (1/2)[f(U_L; n) + f(U_R; n)] − (1/2)|df(U_L, U_R; n)| (U_R − U_L)    (5.3.12)

the resulting scheme has good shock resolving characteristics. To demonstrate this, we consider the scalar nonlinear hyperbolic problem suggested by Struijs, Deconinck, et al. [StruVD89]. The equation is a multidimensional form of Burgers' equation:

    u_t + (u²/2)_x + u_y = 0

This equation is solved in a square region [0, 1.5] × [0, 1.5] with boundary conditions: u(x,0) = 1.5 − 2x for x ≤ 1; u(x,0) = −.5 for x > 1; u(0,y) = 1.5; and u(1.5,y) = −.5. Figures 5.3.5 and 5.3.6 show carpet plots and contours of the solution on regular and irregular meshes.

Figure 5.3.5a Carpet plot of Burgers' equation solution on regular mesh.
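The limiter computation above is compact in code. A sketch for a single control volume (all names are ours, for illustration):

```python
# Given the cell value u0, neighboring values, and the unlimited
# extrapolated states at the quadrature points, find the largest
# Phi0 in [0, 1] keeping every extrapolation inside [umin, umax].
def limiter_phi(u0, u_nbrs, u_extrap):
    umax = max(u0, *u_nbrs)
    umin = min(u0, *u_nbrs)
    phi = 1.0
    for uL in u_extrap:
        d = uL - u0
        if d > 0:
            phi = min(phi, (umax - u0) / d)
        elif d < 0:
            phi = min(phi, (umin - u0) / d)
    return phi

u0, u_nbrs = 1.0, [0.5, 1.2, 0.8]
u_extrap = [1.5, 0.9]          # 1.5 overshoots umax = 1.2
phi = limiter_phi(u0, u_nbrs, u_extrap)
print(phi)                     # 0.4: limited state u0 + phi*(1.5 - u0) = 1.2
```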


The exact solution to this problem consists of converging straight-line characteristics which eventually form a shock that propagates to the upper boundary.
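For Burgers' flux f(u) = u²/2, the flux function (5.3.12) reduces to a one-line scalar formula. A sketch, using the arithmetic-mean linearization df = (u_L + u_R)/2 (our choice here for illustration; any mean-value linearization serves):

```python
# Scalar upwind flux h(uL, uR) = (f(uL) + f(uR))/2 - |df|/2 * (uR - uL)
# applied to the Burgers flux f(u) = u^2/2.
def burgers_flux(uL, uR):
    f = lambda u: 0.5 * u * u
    a = 0.5 * (uL + uR)                       # mean-value df/du
    return 0.5 * (f(uL) + f(uR)) - 0.5 * abs(a) * (uR - uL)

print(burgers_flux(2.0, 0.0))   # 2.0 = f(uL): pure left upwinding when a > 0
print(burgers_flux(0.0, -2.0))  # 2.0 = f(uR): pure right upwinding when a < 0
```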

Figure 5.3.5b Solution contours on regular mesh.

Figure 5.3.6a Carpet plot of Burgers' equation solution on irregular mesh.


Figure 5.3.6b Solution contours.

The carpet plots indicate that the numerical solution on both meshes is monotone. Even so, most people would prefer the solution on the regular mesh. This is an unavoidable consequence of irregular meshes; the only remedy appears to be mesh adaptation. Similar results for the Euler equations will be shown on irregular meshes in the next section.

5.4 Simplified Quadratic Reconstruction

Several possible strategies exist for piecewise quadratic reconstruction. As a general design criterion we require that the reconstruction exhibit quadratic precision, so that the reconstruction will be exact whenever the solution varies quadratically. Recall that a quadratic polynomial contains six degrees of freedom in two dimensions. This means that the support (stencil) of the reconstruction operator must contain at least six members. In [BarF] and [VankD92] this is accomplished by increasing the physical support on the mesh until six or more members are included. This approach has several pitfalls. Increasing the physical support may not always be possible due to the presence of boundaries. In this situation either the data support must be shifted in an unnatural way or the order of polynomial reconstruction must be lowered. Increasing the physical support also has the undesirable effect of bringing less relevant data into the reconstruction. One example would be data reconstruction in a flow field containing two shock waves in close proximity.

Another approach to higher order reconstruction is to add additional degrees of freedom into the solution representation. To simplify the approach, pointwise values of the solution are assumed rather than cell averages. Thus the resulting scheme has a nondiagonal mass matrix multiplying the time derivatives. We presume that this matrix can be safely ignored for steady-state computations.


Figure 5.4.0 Six-noded quadratic element with control volume tessellation (bold lines).

In our approach a quadratic element approximation is used, see fig. 5.4.0. The solution unknowns are the six nodal values. These values uniquely describe a quadratic function within the element. The control volumes for the scheme are then formed from a tessellation of the elements. The particular tessellation which we prefer is obtained by connecting centroids and edges as shown in fig. 5.4.0.

For each control volume surrounding a vertex or midside node, v_0, a quadratic polynomial of the form

    u(x,y)_0 = u_0 + Δr^T ∇u_0 + (1/2) Δr^T H_0 Δr    (5.3.13)

must be reconstructed from surrounding data. In this equation ∇u is the usual solution gradient and H is the Hessian matrix of second derivatives

    H = [ u_xx  u_xy ]
        [ u_xy  u_yy ]

Least-Squares Quadratic Reconstruction

Following the same procedure developed for linear reconstruction, a least-squares problem for the gradient and Hessian derivatives can be devised, see [Bar93] for full details. The result is the following nonsquare matrix problem:

    [ L̃_1  L̃_2  L̃_3  L̃_4  L̃_5 ] (u_x, u_y, u_xx, u_xy, u_yy)^T = Δu    (5.3.14)

with

    Δu  = [Δu_1, Δu_2, ..., Δu_n]^T
    L̃_1 = [Δx_1, Δx_2, ..., Δx_n]^T
    L̃_2 = [Δy_1, Δy_2, ..., Δy_n]^T
    L̃_3 = (1/2)[Δx_1², Δx_2², ..., Δx_n²]^T
    L̃_4 = [Δx_1Δy_1, Δx_2Δy_2, ..., Δx_nΔy_n]^T
    L̃_5 = (1/2)[Δy_1², Δy_2², ..., Δy_n²]^T


where n is the number of quadrature points, u_int is the value of the reconstructed data at a quadrature point, and Δu = u_int − u_0. Note that rows of this matrix can be scaled by weighting factors without sacrificing the property of k-exactness. Multiplication by the matrix transpose produces the normal form of the least-squares problem:

    [ L_11 L_12 L_13 L_14 L_15 ]
    [ L_12 L_22 L_23 L_24 L_25 ]
    [ L_13 L_23 L_33 L_34 L_35 ] (u_x, u_y, u_xx, u_xy, u_yy)^T = (f_1, f_2, f_3, f_4, f_5)^T    (5.3.15)
    [ L_14 L_24 L_34 L_44 L_45 ]
    [ L_15 L_25 L_35 L_45 L_55 ]

This equation requires the calculation of 15 inner products L_ij and 5 inner products f_i. Note that certain identities exist relating the L_ij which further reduce the number of L_ij needed to 12.

Two concerns arise in the general implementation of this least-squares technique. First, we desire that the reconstruction be invariant to affine coordinate transformations, i.e. translation, rotation, dilatation, and shear. These effects can be eliminated or reduced by a preimage mapping to a normalized control volume. Second, the least-squares solution by way of normal equations can be poorly conditioned. When implementing quadratic reconstruction, we have noted some conditioning problems when the cell aspect ratio gets very large. Other methods for solving the basic least-squares problem are much less sensitive to the matrix condition number: Householder transforms, singular value decompositions, etc.
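In practice the nonsquare system (5.3.14) can be handed directly to an orthogonalization-based least-squares routine rather than forming the normal equations (5.3.15) explicitly. A sketch with invented support points (NumPy's lstsq uses an SVD internally):

```python
import numpy as np

# Build the rectangular matrix [L1 .. L5] of (5.3.14) from point offsets
# about r0 = (0,0) and recover gradient and Hessian of a quadratic exactly.
rng = np.random.default_rng(0)
pts = rng.uniform(-1, 1, size=(12, 2))          # support points around r0
u = lambda x, y: 1 + 2*x - y + 0.5*x*x + 3*x*y - 2*y*y

dx, dy = pts[:, 0], pts[:, 1]
du = u(dx, dy) - u(0.0, 0.0)
L = np.column_stack([dx, dy, 0.5*dx**2, dx*dy, 0.5*dy**2])
coeffs, *_ = np.linalg.lstsq(L, du, rcond=None)
print(coeffs)    # [ux, uy, uxx, uxy, uyy] = [2, -1, 1, 3, -4]
```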
For highly stretched meshes the use of these methods may be a necessity.

Enforcing Monotonicity of the Reconstruction

In [Bar93] we devised a limiting procedure similar to that used in linear reconstruction. Consider the reconstructed quadratic polynomial for the control volume Ω_0:

    u(x,y)_0 = u_0 + Δr^T ∇u_0 + (1/2) Δr^T H_0 Δr

One approach to enforcing monotonicity of the reconstruction is to introduce a parameter Φ into the reconstruction polynomial

    u(x,y)_0 = u_0 + Φ_0 [ Δr^T ∇u_0 + (1/2) Δr^T H_0 Δr ]

with the goal of finding the largest admissible Φ_0 ∈ [0,1] while invoking a monotonicity principle that values of the reconstructed function must not exceed the maximum and minimum of neighboring nodal values and u_0. To calculate Φ_0, first compute

    u_0^max = max_{i∈N_j}(u_0, u_i)
    u_0^min = min_{i∈N_j}(u_0, u_i)


then require that u_0^min ≤ u(x,y)_0 ≤ u_0^max. Extrema in u(x,y)_0 can occur anywhere in the interior or on the boundary of the control volume surrounding v_0. Determining the location and type of extrema is clumsy and computationally expensive. One approximation which considerably simplifies the task is to interrogate the reconstructed polynomial for extreme values at the quadrature points used in the flux integration. For each quadrature point location in the flux integral, compute the extrapolated state u_{L0i} and determine the smallest Φ_0 so that

    Φ_0 = min(1, (u_j^max − u_0)/(u_{L0i} − u_0)),  if u_{L0i} − u_0 > 0
          min(1, (u_j^min − u_0)/(u_{L0i} − u_0)),  if u_{L0i} − u_0 < 0
          1,                                        if u_{L0i} − u_0 = 0

This limiting procedure is very effective in removing spurious solution oscillations, although the discontinuous nature of the limiter can hinder steady-state convergence of the scheme.

To assess the merits of quadratic reconstruction, we consider a test problem which solves the two-dimensional scalar advection equation

    u_t + (yu)_x − (xu)_y = 0

or equivalently

    u_t + λ · ∇u = 0,   λ = (y, −x)^T

on a grid centered about the origin, see fig. 5.4.1. Discontinuous inflow data is specified along an interior cut line: u(x,0) = 1 for −.6 < x < −.3 and u(x,0) = 0 otherwise. The exact solution is a solid body rotation of the cut line data throughout the domain.

Figure 5.4.1 Grid for the circular advection problem.

The discontinuities admitted by this equation are similar to the linear contact and slip line solutions admitted by the Euler equations. Linear discontinuities are often more difficult to compute accurately than nonlinear shock wave solutions which naturally steepen due to converging characteristics. Figures 5.4.2 - 5.4.4 display solution contours for the schemes.


Figure 5.4.2 Solution contours, piecewise constant reconstruction.

The improvement from piecewise constant reconstruction to piecewise linear is quite dramatic. The improvement from piecewise linear to piecewise quadratic also looks impressive. The width of the discontinuities is substantially reduced with little observable grid dependence. Note, however, that the quadratic approximation has roughly quadruple the number of solution unknowns because of the use of six-noded triangles. In Section 6 we will show computations of Euler flow using these same approximations.

Figure 5.4.3 Solution contours, p.w. linear reconstruction.


Figure 5.4.4 Solution contours, p.w. quadratic reconstruction.

5.5 k-Exact Reconstruction

In this section, a brief account is given of the reconstruction scheme presented in Barth and Frederickson [BarF90] for arbitrary order reconstruction. The task at hand is to determine reconstruction polynomials for each Ω_i. These polynomials are assumed to be of the form

    U^k(x,y)_i = Σ_{m+n≤k} α_{(m,n)} P_{(m,n)}(x − x_c, y − y_c)    (5.5.0)

where P_{(m,n)}(x − x_c, y − y_c) = (x − x_c)^m (y − y_c)^n and (x_c, y_c) is the control volume centroid. Upon first inspection, the use of high order reconstruction appears to be an expensive proposition. The present reconstruction strategy optimizes the efficiency of the reconstruction by precomputing, as a one time preprocessing step, the set of weights W_j in each cell Ω_j with neighbor set N_j such that

    α_{(m,n)} = Σ_{i∈N_j} W_{(m,n)i} ū_i    (5.5.1)

where the α_{(m,n)} are the polynomial coefficients to be used in (5.5.0). This effectively reduces the problem of reconstruction to multiplication of predetermined weights and cell averages to obtain polynomial coefficients.

During the preprocessing to obtain the reconstruction weights W_j, a coordinate system with origin at the centroid of Ω_j is assumed, to minimize roundoff errors. To insure that the reconstruction is invariant to affine transformations, we then temporarily transform (rotate and scale) to another coordinate system (x̄, ȳ) which is normalized to the cell Ω_j:

    ( x̄ )   [ D_11  D_12 ] ( x )
    ( ȳ ) = [ D_21  D_22 ] ( y )

with the matrix D chosen so that

    A_j(x̄²) = A_j(ȳ²) = 1


    A_j(x̄ ȳ) = A_j(ȳ x̄) = 0

Polynomials on Ω_j are temporarily represented using the polynomial basis functions

    P̄ = [1, x̄, ȳ, x̄², x̄ȳ, ȳ², x̄³, ...]

Note that polynomials in this system are easily transformed to the standard cell-centroid basis:

    x̄^m ȳ^n = Σ_{s,t} (m choose s)(n choose t) D_11^s D_12^{m−s} D_21^t D_22^{n−t} x^{s+t} y^{m+n−s−t}

Since 0 ≤ s+t ≤ k and 0 ≤ m+n−s−t ≤ k, we can reorder and rewrite in terms of the standard and transformed basis polynomials:

    P̄_{(m,n)} = Σ_{s+t≤k} G^{s,t}_{m,n} P_{(s,t)}    (5.5.2)

Satisfaction of conservation of the mean is guaranteed by introducing into the transformed coordinate system zero mean basis polynomials P̄′ in which all but the first have zero cell average, i.e.

    P̄′ = [1, x̄, ȳ, x̄² − 1, x̄ȳ, ȳ² − 1, x̄³ − A_j(x̄³), ...]

Note that using these polynomials requires a minor modification of (5.5.2) but retains the same form:

    P̄′_{(m,n)} = Σ_{s+t≤k} G^{s,t}_{m,n} P_{(s,t)}    (5.5.3)

Given this preparatory work, we are now ready to describe the formulation of the reconstruction algorithm.

Minimum Energy (Least-Squares) Reconstruction

We note that the set of control volume neighbors N_j must contain at least (k+1)(k+2)/2 members in two space dimensions if the reconstruction operator R^k_j is to be k-exact. That (k+1)(k+2)/2 members is not sufficient in all situations is easily observed. If, for example, the control volume centers all lie on a single straight line, one can find a linear function u such that A_j(u) = 0 for every Ω_j, which means that reconstruction of u is impossible. In other cases a k-exact reconstruction operator R^k_j may exist, but due to the geometry may be poorly conditioned.

Our approach is to work with a slightly larger support containing more than the minimum number of cells. In this case the operator R^k_j is likely to be nonunique, because various subsets would be able to support reconstruction operators of degree k.
Although all would reproduce a polynomial of degree k exactly, if we disregard roundoff, they would differ in their treatment of non-polynomials, or of polynomials of degree higher than k. Any k-exact reconstruction operator R^k_j is a weighted average of these basic ones. Our approach is to choose the one of minimum Frobenius norm. This operator is optimal, in a certain sense, when the function we are reconstructing is not exactly a polynomial of degree k, but one that has been perturbed by the addition of Gaussian noise, for it minimizes the expected deviation from the unperturbed polynomial in a certain rather natural norm.


As we begin the formulation of the reconstruction preprocessing algorithm, the reader is reminded that the task at hand is to calculate the weights W_j for each control volume Ω_j which, when applied via (5.5.1), produce piecewise polynomial approximations. We begin by first rewriting the piecewise polynomial for Ω_j in terms of the reconstruction weights:

    U^k(x,y) = Σ_{m+n≤k} P_{(m,n)} Σ_{i∈N_j} W_{(m,n)i} ū_i

or equivalently

    U^k(x,y) = Σ_{i∈N_j} ū_i Σ_{m+n≤k} W_{(m,n)i} P_{(m,n)}

Polynomials of degree k or less are equivalently represented in the transformed coordinate system using zero mean polynomials:

    U^k(x,y) = Σ_{i∈N_j} ū_i Σ_{m+n≤k} W′_{(m,n)i} P̄′_{(m,n)}    (5.5.4)

Using (5.5.3), we can relate weights in the transformed system to weights in the original system:

    W_{(s,t)i} = Σ_{m+n≤k} G^{s,t}_{m,n} W′_{(m,n)i}    (5.5.5)

We satisfy k-exactness by requiring that (5.5.4) is satisfied for all linear combinations of the polynomials P̄′_{(s,t)}(x,y) such that s+t ≤ k. In particular, if U^k(x,y) = P̄′_{(s,t)}(x,y) for some s+t ≤ k, then

    P̄′_{(s,t)}(x,y) = Σ_{m+n≤k} P̄′_{(m,n)} Σ_{i∈N_j} W′_{(m,n)i} A_i(P̄′_{(s,t)})

This is satisfied if, for all s+t ≤ k and m+n ≤ k,

    Σ_{i∈N_j} W′_{(m,n)i} A_i(P̄′_{(s,t)}) = δ^{st}_{mn}

Transforming basis polynomials back to the original coordinate system, we have

    Σ_{i∈N_j} W′_{(m,n)i} Σ_{u+v≤k} G^{u,v}_{s,t} A_i(P_{(u,v)}) = δ^{st}_{mn}    (5.5.6)

This can be locally rewritten in matrix form as

    W′_j A′_j = I    (5.5.7)

and transformed in terms of the standard basis weights via

    W_j = G W′_j    (5.5.8)


Note that W′_j is a (k+1)(k+2)/2 by N_j matrix and A′_j has dimensions N_j by (k+1)(k+2)/2. To solve (5.5.7) in the optimum sense described above, an L_j Q_j decomposition of A′_j is performed, where the orthogonal matrix Q_j and the lower triangular matrix L_j have been constructed using a modified Gram-Schmidt algorithm (or a sequence of Householder reflections). The weights W′_j are then given by

    W′_j = Q*_j L_j⁻¹

Applying (5.5.5), these weights are transformed to the standard centroid basis and the preprocessing step is complete.

We now show a few results presented earlier in [BarF90]. The first calculation involves the reconstruction of a sixth order polynomial with random normalized coefficients which has been cell averaged onto a random mesh. Figure 5.5.0 shows a sample mesh and fig. 5.5.1 graphs the absolute L2 error of the reconstruction for various meshes and reconstruction degrees.

Figure 5.5.0 Random mesh.
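A one-dimensional sketch of the weight preprocessing makes the linear algebra concrete. Here (the mesh and all names are ours) the entries of A are exact cell averages of centroid-based monomials, and the minimum-Frobenius-norm left inverse is obtained with a pseudoinverse in place of the LQ factorization described above:

```python
import numpy as np

# For control volume j, A[i, m] = cell average of (x - xc_j)^m over stencil
# cell i. With more cells than coefficients, W = pinv(A) is the
# minimum-Frobenius-norm left inverse: W A = I, the 1-D analogue of (5.5.7).
k = 2
edges = np.array([-1.0, -0.4, 0.1, 0.9, 1.6, 2.2])   # 5-cell stencil
xc = 0.5 * (edges[2] + edges[3])                     # centroid of cell j = 2

def avg_monomial(m, a, b):       # exact cell average of (x - xc)^m on [a, b]
    return ((b - xc)**(m + 1) - (a - xc)**(m + 1)) / ((m + 1) * (b - a))

A = np.array([[avg_monomial(m, edges[i], edges[i + 1]) for m in range(k + 1)]
              for i in range(5)])
W = np.linalg.pinv(A)            # (k+1) x 5 weight matrix, W A = I

# k-exactness: cell-average a quadratic, apply W, recover its coefficients
coeffs = np.array([1.0, -2.0, 3.0])   # u(x) = 1 - 2(x-xc) + 3(x-xc)^2
ubar = A @ coeffs
print(W @ ubar)                  # recovers [1, -2, 3]
```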


[Figure 5.5.1 data: absolute L2 error of the reconstruction versus degree k of reconstruction (k = 0, ..., 5), plotted on a logarithmic axis spanning roughly 10^1 down to 10^-3, for meshes of 89, 188, 381, and 777 cells.]
Figure 5.5.1 L2 error of reconstruction.

The reconstruction algorithm has also been tested on more realistic problems. Figures 5.5.2-5.5.4 show a mesh and reconstructions (linear and quadratic) of a cell averaged density field corresponding to a Ringleb flow, an exact hodograph solution of the gasdynamic equations, see [Chio85].

Figure 5.5.2 Randomized mesh for Ringleb flow.


Figure 5.5.3 Piecewise linear reconstruction of Ringleb flow.

The reader should note that the use of piecewise contours gives a crude visual critique as to how well the solution is represented by the piecewise polynomials. The improvement from linear to quadratic is dramatic in the case of Ringleb flow. In the paper by Barth and Frederickson we show flow computations and error measures for the Ringleb problem.

Figure 5.5.4 Piecewise quadratic reconstruction of Ringleb flow.


6.0 Finite-Volume Schemes for the Euler and Navier-Stokes Equations

In this section, we consider the extension of upwind scalar advection schemes to the Euler equations of gasdynamics. As we will see, the changes are relatively minor since most of the difficult work has already been done in designing the scalar scheme.

6.1 Euler Equations in Integral Form

The physical laws concerning the conservation of mass, momentum, and energy for an arbitrary region Ω can be written in the following integral form:

Conservation of Mass

    d/dt ∫_Ω ρ da + ∫_∂Ω ρ(V·n) dl = 0    (6.1.1)

Conservation of Momentum

    d/dt ∫_Ω ρV da + ∫_∂Ω ρV(V·n) dl + ∫_∂Ω p n dl = 0    (6.1.2)

Conservation of Energy

    d/dt ∫_Ω E da + ∫_∂Ω (E + p)(V·n) dl = 0    (6.1.3)

In these equations ρ, V, p, and E are the density, velocity, pressure, and total energy of the fluid. The system is closed by introducing a thermodynamical equation of state for a perfect gas:

    p = (γ − 1)(E − (1/2)ρ(V·V))    (6.1.4)

These equations can be written as a more compact vector equation:

    d/dt ∫_Ω u da + ∫_∂Ω F(u)·n dl = 0    (6.1.5)

with

    u = (ρ, ρV, E)^T,   F(u)·n = (ρ(V·n), ρV(V·n) + pn, (E + p)(V·n))^T

In the next section, we show the natural extension of the scalar advection scheme to include (6.1.5).

6.2 Extension of Scalar Advection Schemes to Systems of Equations

The extension of the scalar advection schemes to the Euler equations requires three rather minor modifications:
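The projected flux F(u)·n in (6.1.5), closed with the perfect-gas relation (6.1.4), is straightforward to code. A 2-D sketch (variable names are ours; γ = 1.4 assumed):

```python
import numpy as np

GAMMA = 1.4

def euler_flux(u, n):
    """F(u).n for 2-D Euler; u = (rho, rho*vx, rho*vy, E), n a unit normal."""
    rho, mx, my, E = u
    vx, vy = mx / rho, my / rho
    p = (GAMMA - 1.0) * (E - 0.5 * rho * (vx * vx + vy * vy))   # eq. (6.1.4)
    vn = vx * n[0] + vy * n[1]
    return np.array([rho * vn,
                     mx * vn + p * n[0],
                     my * vn + p * n[1],
                     (E + p) * vn])

# quiescent gas: only the pressure term survives in the momentum flux
u0 = np.array([1.0, 0.0, 0.0, 2.5])            # p = 0.4 * 2.5 = 1.0
print(euler_flux(u0, np.array([1.0, 0.0])))    # [0, 1, 0, 0]
```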


(1) Vector Flux Function. The scalar flux function is replaced by a vector flux function. In the present work, the mean value linearization due to Roe [Roe81] is used. The form of this vector flux function is identical to the scalar flux function (4.2.33), i.e.
$$\mathbf{h}(\mathbf{u}_R,\mathbf{u}_L,\mathbf{n}) = \tfrac{1}{2}\left(\mathbf{f}(\mathbf{u}_R,\mathbf{n}) + \mathbf{f}(\mathbf{u}_L,\mathbf{n})\right) - \tfrac{1}{2}\,|A(\mathbf{u}_R,\mathbf{u}_L,\mathbf{n})|\,(\mathbf{u}_R - \mathbf{u}_L) \qquad (5.6)$$
where $\mathbf{f}(\mathbf{u},\mathbf{n}) = \mathbf{F}(\mathbf{u})\cdot\mathbf{n}$ and $A = d\mathbf{f}/d\mathbf{u}$ is the flux Jacobian.

(2) Componentwise Limiting. The solution variables are reconstructed componentwise. In principle, any set of variables can be used in the reconstruction (primitive variables, entropy variables, etc.). Note that conservation of the mean can make certain variable combinations more difficult to implement than others because of the nonlinearities that may be introduced. The simplest choice is obviously the conserved variables themselves. When conservation of the mean is not important (steady-state calculations), we typically use primitive variables in the reconstruction step.

(3) Weak Boundary Conditions. Boundary conditions for inviscid flow at solid surfaces are enforced weakly. For solid wall boundary edges, the flux is calculated with $\mathbf{V}\cdot\mathbf{n}$ set identically to zero:
$$\mathbf{f}(\mathbf{u},\mathbf{n}) = \begin{pmatrix} 0 \\ n_x p \\ n_y p \\ 0 \end{pmatrix}.$$
Boundary conditions at far field boundaries are also imposed weakly. Define the characteristic projectors of the flux Jacobian $A$ in the following way:
$$P^{\pm} = \tfrac{1}{2}\left[I \pm \mathrm{sign}(A)\right].$$
At far field boundary edges the fluxes are assumed to be of the form
$$\mathbf{f}(\mathbf{u}_n,\mathbf{n}) = \mathbf{F}(\mathbf{u}_{proj})\cdot\mathbf{n}$$
where $\mathbf{u}_{proj} = P^{+}\mathbf{u}_n + P^{-}\mathbf{u}_\infty$ and $\mathbf{u}_\infty$ represents a vector of prescribed far field solution values. At first glance, prescribing the entire vector $\mathbf{u}_\infty$ is an overspecification of boundary conditions. Fortunately the characteristic projectors remove or ignore certain combinations of data so that the correct number of conditions are specified at inflow and outflow.

The remainder of this section will present calculations of Euler flow using various reconstruction schemes and mesh adaptation.
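To make the flux machinery above concrete, here is a minimal Python sketch of the normal Euler flux F(u)·n with the perfect-gas closure (6.1.4), together with a numerical flux in the form of (5.6). One loud caveat: Roe's mean-value matrix |A(uR, uL, n)| is replaced here by its spectral radius |V·n| + c (a local Lax-Friedrichs dissipation), which keeps the consistency property h(u, u, n) = F(u)·n but is more dissipative than Roe's scheme; the function names are ours, not from the notes.

```python
import math

GAMMA = 1.4  # ratio of specific heats for a perfect gas

def euler_flux(u, n):
    """Normal flux F(u).n for the 2-D Euler equations, u = (rho, rho*vx, rho*vy, E)."""
    rho, mx, my, E = u
    vx, vy = mx / rho, my / rho
    p = (GAMMA - 1.0) * (E - 0.5 * rho * (vx * vx + vy * vy))  # eq. (6.1.4)
    vn = vx * n[0] + vy * n[1]                                 # V.n
    return (rho * vn, mx * vn + p * n[0], my * vn + p * n[1], (E + p) * vn)

def max_wave_speed(u, n):
    """Spectral radius |V.n| + c of the flux Jacobian A = df/du."""
    rho, mx, my, E = u
    vx, vy = mx / rho, my / rho
    p = (GAMMA - 1.0) * (E - 0.5 * rho * (vx * vx + vy * vy))
    return abs(vx * n[0] + vy * n[1]) + math.sqrt(GAMMA * p / rho)

def numerical_flux(uR, uL, n):
    """Form (5.6) with |A| replaced by its spectral radius (local Lax-Friedrichs)."""
    fR, fL = euler_flux(uR, n), euler_flux(uL, n)
    s = max(max_wave_speed(uR, n), max_wave_speed(uL, n))
    return tuple(0.5 * (a + b) - 0.5 * s * (r - l)
                 for a, b, r, l in zip(fR, fL, uR, uL))
```

By construction the dissipation term vanishes when uR = uL, so h(u, u, n) collapses to the true Euler flux F(u)·n, the consistency property invoked again in Section 6.3b.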
The resulting scheme for the Euler equations has essentially the same shock resolving characteristics as the scalar scheme. The actual solution strategy is based on an implicit Newton-like solver. Details of the implicit scheme are discussed in Section 7.

Transonic Airfoil Flow


Figures 6.2.0a-b show a simple Steiner triangulation and the resulting solution obtained with a linear reconstruction scheme for transonic Euler flow (M∞ = 0.80, α = 1.25°) over a NACA 0012 airfoil section.

Figure 6.2.0a Initial triangulation of airfoil, 3155 vertices.

Figure 6.2.0b Mach number contours on initial triangulation, M∞ = 0.80, α = 1.25°.

Even though the grid is very coarse with only 3155 vertices, the upper surface shock is captured cleanly with a profile that extends over two cells of the mesh. Clearly, the power of the unstructured grid method is the ability to locally adapt the mesh to resolve flow features. Figures 6.2.1a-b show an adaptively refined mesh and solution for the same flow.


Figure 6.2.1a Solution adaptive triangulation of airfoil, 6917 vertices.

Figure 6.2.1b Mach number solution contours on adapted airfoil.

The mesh has been locally refined based on a posteriori error estimates. These estimates were obtained by performing k-exact reconstruction in each control volume using linear and quadratic functions and comparing the difference. The flow features in fig. 6.2.1b are clearly defined with a weak lower surface shock now visible. Figure 6.2.1c shows the surface pressure coefficient distribution on the airfoil. The discontinuities are monotonically captured by the scheme.


Figure 6.2.1c Comparison of Cp distributions on initial and adapted meshes.

Inviscid Multi-Element Airfoil Flow

As we have seen from Section 2, one major advantage of unstructured grids is the ability to automatically mesh complex geometries. The next example shown in figures 6.2.2a-b is a Steiner triangulation and solution about a multi-element airfoil.

Figure 6.2.2a Steiner triangulation about multi-element airfoil.


Figure 6.2.2b Mach number contours about multi-element airfoil, M∞ = 0.2, α = 0°.

Using the incremental Steiner algorithm discussed previously, the grid can be constructed from curve data in about ten minutes of time on a standard engineering workstation using less than a minute of actual CPU time. The flow calculation shown in figure 6.2.2b was performed on a CRAY supercomputer taking just a few minutes of CPU time using a linear reconstruction scheme with implicit time advancement. Details of the implicit scheme are given in the next section.

Supersonic Oblique Shock Reflections

In this example, two supersonic streams (M = 2.50 and M = 2.31) are introduced at the left boundary. These streams interact, producing a pattern of supersonic shock reflections down the length of the converging channel, see Fig. 6.2.3. The grid is a subdivided 15x52 mesh with perturbed coordinates.

Figure 6.2.3a Channel grid (780 Verts).

Solution Mach contours are shown in Figs. 6.2.3b-d. Figure 6.2.5 graphs density profiles along a horizontal cut at 70 percent of the vertical height of the left boundary. As expected, the piecewise constant reconstruction scheme severely smears the shock system while the scheme based on a linear solution reconstruction, fig. 6.2.3c, performs very well. The piecewise quadratic approximation, fig. 6.2.3d, shows some improvement in shock wave thickness although the improvement is not dramatic given the increased number of unknowns in the quadratic element. The number of unknowns required for the quadratic approximation is roughly four times the number required for the piecewise linear scheme. The less than dramatic improvement is not a surprising result since the solution has large regions of constant flow which do not benefit greatly from the quadratic approximation. At solution discontinuities the quadratic scheme reduces to a low order approximation which again negates the benefit of the quadratic reconstruction.

Figure 6.2.3b Mach contours, piecewise constant reconstruction.

Figure 6.2.3c Solution Mach contours, piecewise linear reconstruction.

Figure 6.2.3d Solution Mach contours, piecewise quadratic reconstruction.

Figure 6.2.4a Adapted channel grid (1675 Verts).

Figure 6.2.4a shows the same mesh adaptively refined. The number of mesh points has roughly doubled. The solution shown in fig. 6.2.4b was calculated using linear reconstruction. The results are very comparable with the calculation performed using quadratic reconstruction.


Figure 6.2.4b Mach contours on adapted mesh, piecewise linear reconstruction.

Figure 6.2.5 Density profiles, y/h = 0.70 (piecewise constant, linear, and quadratic reconstruction, and linear reconstruction on the adapted mesh).

ONERA M6 Wing

The algorithms outlined in Sections 5-6 have been extended to the Euler equations in three dimensions. In [Bar91], we showed the natural extension of the edge data structure in the development of an Euler equation solver on tetrahedral meshes. One of the calculations presented in this paper simulated Euler flow about the ONERA M6 wing. The tetrahedral mesh used for the calculations was a subdivided 151x17x33 hexahedral C-type mesh with a spherical wing tip cap. The resulting tetrahedral mesh contained 496,350 tetrahedra, 105,177 vertices, 11,690 boundary vertices, and 23,376 boundary faces. Figure 6.2.7 shows a closeup of the surface mesh near the outboard tip.


Figure 6.2.7 Closeup of M6 Wing Surface Mesh Near Tip.

Transonic calculations, M∞ = 0.84, α = 3.06°, were performed on the CRAY Y-MP computer using the upwind code with both the Green-Gauss and L2 gradient reconstruction. Figure 6.2.8 shows surface pressure contours on the wing surface and Cp profiles at several span stations.

Figure 6.2.8 M6 Wing Surface Pressure Contours and Spanwise Cp Profiles (M∞ = 0.84, α = 3.06°).

Pressure contours clearly show the lambda-type shock pattern on the wing surface. Figures 6.2.9a-c compare pressure coefficient distributions at three span stations on the wing measured in the experiment, y/b = 0.44, 0.65, 0.95.


Figure 6.2.9a M6 Wing Spanwise Pressure Distribution, y/b = 0.44 (upwind code with Green-Gauss and L2 gradients, CFL3D, and experiment).

Each graph compares the upwind code with Green-Gauss and L2 gradient calculation with the CFL3D results appearing in Thomas et al. [ThomLW85] and the experimental data reported by Schmitt [Schmitt79]. Numerical results on the tetrahedral mesh compare very favorably with the CFL3D structured mesh code. The results for the outboard station appear better for the present code than the CFL3D results. This is largely due to the difference in grid topology and subsequent improved resolution in that area.

Figure 6.2.9b M6 Wing Spanwise Pressure Distribution, y/b = 0.65.


Figure 6.2.9c M6 Wing Spanwise Pressure Distribution, y/b = 0.95.

6.3 Calculation of Navier-Stokes Flow

6.3a Maximum Principles

Section 4.1 discusses the maximum principle properties of Laplace's equation on structured and unstructured meshes. The maximum principle analysis extends naturally to the self-adjoint viscous-like term $\nabla\cdot\nu(x,y)\nabla u$, $\nu \geq 0$. Assuming a linear solution variation in each element, both the finite-element and finite-volume methods on planar triangulations yield the following result
$$\int_{\Omega_0} w\,(\nabla\cdot\nu\nabla u)\, d\Omega \approx \sum_{i\in N_0} W_{0i}\,(U_i - U_0) \qquad (6.3.0)$$
$$W_{0i} = \tfrac{1}{2}\left(\nu_L \cot(\theta_L) + \nu_R \cot(\theta_R)\right)_{0i}$$
where $\theta_L$ and $\theta_R$ are the left and right angles subtending the edge $e(v_0, v_i)$. Similarly, $\nu_L$ and $\nu_R$ are integral averaged values of $\nu(x,y)$ in the left and right adjacent triangles. Thus we immediately arrive at sufficient conditions for a maximum principle:

Consider discretizations of $\nabla\cdot\nu(x,y)\nabla u$ using a finite-volume or Galerkin finite-element method with linear elements. A sufficient condition for the discretized scheme to exhibit a discrete maximum principle is that all angles in the triangulation be acute.

The proof follows immediately from nonnegativity of the weights appearing in (6.3.0). In Section 4.1 we saw that for planar triangulations a discrete maximum principle was assured if the triangulation of the point set was a Delaunay triangulation. This result does not extend to (6.3.0).

While the existence of a discrete maximum principle simplifies the proofs for stability and uniform convergence of numerical approximations, the existence of a discrete maximum principle for self-adjoint equations is more of a luxury than a necessity. For example, the L2 finite-element theory still yields a stability estimate for the above equation even when the maximum principle analysis does not hold. In fact, the variational operator $a(u,w)$ associated with $\nabla\cdot\nu(x,y)\nabla u$,
$$a(u,w) = \int_\Omega \nu\,(\nabla u\cdot\nabla w)\, d\Omega,$$
serves as a proper energy norm $\|u\|_a^2 = a(u,u)$ for proving convergence of the method.

6.3b Alternative Tessellations

The computation of viscous flow on stretched triangulations places high demands on the discretization. Recall that the spatial accuracy of the generalized Godunov scheme relies heavily on the property of k-exactness, i.e. that certain complete polynomials are reconstructed exactly whenever the numerical solution has that polynomial dependence. If this is the case then the reconstructed polynomials are continuous at interface boundaries. From consistency of the numerical flux function we have
$$\mathbf{h}(\mathbf{u},\mathbf{u},\mathbf{n}) = \mathbf{F}(\mathbf{u})\cdot\mathbf{n}$$
so that the numerical flux function collapses to the true Euler flux. If we assume an exact flux integration in space, then all component calculations related to the spatial operator are done exactly. This implies that schemes that possess k-exactness for some value of k have artificial viscosity which vanishes completely when the true solution behaves as a k-th order polynomial.

Even so, the actual choice of control volume shape can significantly affect the absolute level of discretization error. Assume that the control volumes are formed from a geometric dual of a triangulation. A number of possible geometric duals exist. The most frequent choice is the median dual.
It is known that the Galerkin finite-element method with linear elements can be rewritten as a finite-volume scheme on a median dual tessellation. The median dual is also attractive because it is well defined for any triangulation and has nice geometrical relationships which simplify many discretization formulas. Unfortunately, the median dual regions can become very distorted when triangles attain high aspect ratio, see fig. 6.3.0a.

Figure 6.3.0 (a) Median Dual Tessellation, (b) Containment Circle (Sphere) Tessellation for Stretched Triangulations.


The Riemann problems associated with fig. 6.3.0a become very nonphysical. The net result is a degradation in the accuracy of the schemes discussed in Sections 4-6. One interesting alternative which avoids this problem to a large extent is the containment circle tessellation shown in fig. 6.3.0b. The containment circle is defined as the smallest circle which contains a triangle. For acute triangles the minimum containment circle is the circumcircle, see fig. 6.3.1.

Figure 6.3.1 Containment circle centers for acute and obtuse triangles.

For an obtuse triangle the minimum containment circle has a center which lies on the longest edge of the triangle. The geometric dual is obtained by connecting containment circle centers to the midsides of the edges of a triangle. This construction produces portions of the Voronoi diagram for acute triangles but does not for obtuse triangles. Nicolaides and coworkers [XiaN92] have extensively studied properties of the divergence and curl operators on Voronoi duals. Note that connecting circumcenters only makes sense for a Delaunay triangulation, see fig. 6.3.2a. On the other hand, the notion of a containment circle remains well defined for all triangulations.
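The containment-circle center is easy to compute. A 2-D sketch (our naming, not the notes' code): classify the triangle by comparing squared edge lengths; for a right or obtuse triangle the minimum containment circle has the longest edge as its diameter, so its center is that edge's midpoint, while for an acute triangle it is the circumcenter.

```python
def containment_center(a, b, c):
    """Center of the minimum containment circle of triangle (a, b, c)."""
    d2 = lambda p, q: (p[0] - q[0])**2 + (p[1] - q[1])**2
    edges = sorted([(d2(b, c), b, c), (d2(c, a), c, a), (d2(a, b), a, b)])
    if edges[2][0] >= edges[0][0] + edges[1][0]:
        # right or obtuse: longest edge is the diameter, center at its midpoint
        _, p, q = edges[2]
        return (0.5 * (p[0] + q[0]), 0.5 * (p[1] + q[1]))
    # acute: circumcenter (equidistant from all three vertices)
    ax, ay = a; bx, by = b; cx, cy = c
    D = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    ux = ((ax*ax + ay*ay) * (by - cy) + (bx*bx + by*by) * (cy - ay)
          + (cx*cx + cy*cy) * (ay - by)) / D
    uy = ((ax*ax + ay*ay) * (cx - bx) + (bx*bx + by*by) * (ax - cx)
          + (cx*cx + cy*cy) * (bx - ax)) / D
    return (ux, uy)
```

Unlike the circumcenter alone, this center always lies on or inside the triangle's minimum containment circle construction, which is why it remains usable on non-Delaunay, highly stretched triangulations.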

Figure 6.3.2 (a) Dual Obtained by Connecting Circumcenters, (b) Containment Circle Centers and Midsides, (c) Centroids and Midsides.

In fig. 6.3.3a we show a subdivided quadrilateral mesh with stretched elements near the surface of the airfoil. Next we calculate subsonic inviscid compressible flow on this mesh using the scheme presented in the previous section using linear reconstruction and the median dual tessellation shown in fig. 6.3.3b. The calculation of inviscid flow on viscous-like stretched meshes provides an excellent test for the numerical methods.

Figure 6.3.3a Triangulation About NACA 0012 Airfoil Obtained by Subdividing a Quadrilateral Mesh.

Figure 6.3.3b Median Dual Tessellation of the Triangulation Shown in fig. 6.3.3a.

Mach number contours of the solution are shown in fig. 6.3.3c. The correct solution has Mach contours which smoothly approach the surface of the airfoil. The present solution shows a noticeable inaccuracy near the surface of the airfoil with significant entropy production and ultimately several counts of airfoil drag.

Figure 6.3.3c Mach Contours for Inviscid Subsonic Flow Over NACA 0012 (M∞ = 0.50, α = 2.0°) Using Median Dual Tessellation Showing Inaccuracy of the Computation Near the Airfoil Surface.

Figure 6.3.3d Containment Sphere Tessellation Derived From Figure 6.3.3a by Connecting Containment Sphere Centers and Edge Midsides.

Next we repeat the calculation using the same algorithm with linear reconstruction but a containment circle tessellation as shown in fig. 6.3.3d. The solution obtained on this

Page 115: Barth_VKI95

mesh has improved qualitative features. The peak entropy produced is reduced by almost a factor of 10 and the drag level is reduced to near zero.

Figure 6.3.3e Mach Contours for Inviscid Subsonic Flow Over NACA 0012 (M∞ = 0.50, α = 2.0°) Using Containment Sphere Tessellation Showing Improved Accuracy of the Computation Near the Airfoil Surface.

6.3c The Navier-Stokes Equations

The development of efficient and robust solution strategies for solving the Navier-Stokes equations on unstructured meshes is probably one of the most difficult aspects facing algorithm developers. In this section we will show several steady-state Navier-Stokes calculations performed on stretched triangulations. The preferred method of reconstruction for these calculations is the least-squares method with unit weights. In solving high Reynolds number flows we must also contend with the modeling of fluid turbulence. Two simple transport models suitable for unstructured meshes are presented. We will delay the discussion of the implicit solution strategy until Section 7. At that time we will also discuss other acceleration procedures for steady-state calculations.

We begin with the Navier-Stokes equations in integral form
$$\frac{d}{dt}\int_\Omega \mathbf{u}\, d\Omega + \int_{\partial\Omega} (\mathbf{F}\cdot\mathbf{n})\, d\Gamma = \int_{\partial\Omega} (\mathbf{G}\cdot\mathbf{n})\, d\Gamma \qquad (6.3.1)$$
where $\mathbf{u}$ represents the vector of conserved variables, $\mathbf{F}$ the inviscid Euler flux vector,
$$\mathbf{u} = \begin{pmatrix} \rho \\ \rho u \\ \rho v \\ E \end{pmatrix}, \qquad
\mathbf{F} = \begin{pmatrix} \rho u \\ \rho u^2 + p \\ \rho u v \\ u(E+p) \end{pmatrix}\hat{i}
+ \begin{pmatrix} \rho v \\ \rho u v \\ \rho v^2 + p \\ v(E+p) \end{pmatrix}\hat{j}$$
and $\mathbf{G}$ the viscous flux vector,
$$\mathbf{G} = \begin{pmatrix} 0 \\ \tau_{xx} \\ \tau_{yx} \\ u\tau_{xx} + v\tau_{xy} - q_x \end{pmatrix}\hat{i}
+ \begin{pmatrix} 0 \\ \tau_{xy} \\ \tau_{yy} \\ u\tau_{yx} + v\tau_{yy} - q_y \end{pmatrix}\hat{j}$$
$$\mathbf{q} = -\kappa\nabla T$$
$$\tau = \lambda(u_x + v_y)I + \mu\begin{pmatrix} 2u_x & u_y + v_x \\ v_x + u_y & 2v_y \end{pmatrix}$$
$$p = (\gamma - 1)\left(E - \tfrac{1}{2}\rho(u^2 + v^2)\right)$$
$$p = \rho R T$$
In these equations $\mu$ and $\lambda$ are diffusion coefficients, $\kappa$ the coefficient of thermal conductivity, $\gamma$ the ratio of specific heats, and $R$ the ideal gas law constant.

At solid walls with no-slip boundary conditions the flux reduces to the following form for steady flow:
$$\mathbf{F}\cdot\mathbf{n} = \begin{pmatrix} 0 \\ n_x p \\ n_y p \\ 0 \end{pmatrix}, \qquad
\mathbf{G}\cdot\mathbf{n} = \begin{pmatrix} 0 \\ \mu\nabla u\cdot\mathbf{n} \\ \mu\nabla v\cdot\mathbf{n} \\ -\mathbf{q}\cdot\mathbf{n} \end{pmatrix}.$$
The last entry in the viscous flux vanishes for adiabatic flow. The conditions can be enforced weakly, or alternatively the second and third equations can be replaced by the strong no-slip wall condition u = v = 0.

6.3d High Reynolds Number Flow

Simulating the effects of turbulence on unstructured meshes via the Reynolds-averaged Navier-Stokes equations and turbulence modeling is a relatively unexplored topic. In early work by Rostand [Rost88], an algebraic turbulence model was incorporated into an unstructured mesh flow solver. This basic methodology was later refined by Mavriplis [Mav88] for the flow about multi-element airfoil configurations. Both of these implementations utilize locally structured meshes to produce one-dimensional-like boundary-layer profiles from which algebraic models can determine an eddy viscosity coefficient for use in the Reynolds-averaged Navier-Stokes equations. The use of local structured meshes limits the general applicability of the method.

The next level of turbulence modeling involves the solution of one or more auxiliary differential equations. Ideally these differential equations would only require initial data and boundary conditions in the same fashion as the Reynolds-averaged mean equations. The use of turbulence models based on differential equations greatly increases the class of geometries that can be treated "automatically."
Unfortunately, this does not make the issue of turbulence modeling a "solved" problem since most turbulence models do not perform well across the broad range of flow regimes usually generated by complex geometries. Also keep in mind that most turbulence models for wall-bounded flow require knowledge of


distance to the wall for use in "damping functions" which simulate the damping effect of solid walls on turbulence. The distance required in these models is measured in "wall units", which means that the physical distance from the wall y is scaled by the local wall shear, density, and viscosity:
$$y^+ = \frac{\sqrt{\rho_{wall}\,\tau_{wall}}}{\mu_{wall}}\, y \qquad (6.3.2)$$
Scaling by wall quantities is yet another complication but does not create serious implementation difficulties for unstructured meshes, as we will demonstrate shortly.

In a recent report with Baldwin [BalB90] we proposed a single-equation turbulence transport model. In this report, the model was tested on various subsonic and transonic flow problems: flat plates, airfoils, wakes, etc. The model consists of a single advection-diffusion equation with source term for a field variable which is the product of turbulence Reynolds number and kinematic viscosity, $\nu \tilde{R}_T$. This variable is proportional to the eddy viscosity except very near a solid wall. The model equation is of the form:
$$\frac{D(\nu \tilde{R}_T)}{Dt} = \left(c_{\epsilon 2} f_2(y^+) - c_{\epsilon 1}\right)\sqrt{\nu \tilde{R}_T P}
+ \left(\nu + \frac{\nu_t}{\sigma_\epsilon}\right)\nabla^2(\nu \tilde{R}_T)
- \frac{1}{\sigma_\epsilon}\,(\nabla \nu_t)\cdot\nabla(\nu \tilde{R}_T) \qquad (6.3.3)$$
In this equation $P$ is the production of turbulent kinetic energy and is related to the mean flow velocity rate-of-strain and the kinematic eddy viscosity $\nu_t$. Equation (6.3.3) depends on distance to solid walls in two ways. First, the damping function $f_2$ appearing in equation (6.3.3) depends directly on distance to the wall (in wall units). Secondly, $\nu_t$ depends on $\nu \tilde{R}_T$ and damping functions which require distance to the wall:
$$\nu_t = c_\mu D_1(y^+) D_2(y^+)\, \nu \tilde{R}_T$$
It is important to realize that the damping functions $f_2$, $D_1$, and $D_2$ deviate from unity only when very near a solid wall. For a typical turbulent boundary-layer (see fig. 6.3.4) accurate distance to the wall is only required for mesh points which fall below the logarithmic region of the boundary-layer.
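The wall-unit scaling (6.3.2), together with the distance-to-boundary-edge computation used later in this section to precompute the distance function, can be sketched in a few lines. The names and the clamped-projection formula below are our illustrative choices; the notes store, for each vertex, the closest boundary edge and an interpolation weight along it, which is exactly what `point_to_edge` returns.

```python
import math

def y_plus(tau_wall, rho_wall, mu_wall, y):
    """Wall-unit distance, eq. (6.3.2): y+ = sqrt(rho_w * tau_w) * y / mu_w."""
    return math.sqrt(rho_wall * tau_wall) * y / mu_wall

def point_to_edge(p, a, b):
    """Distance from vertex p to boundary edge (a, b), plus the linear
    interpolation weight t of the closest point (t = 0 at a, t = 1 at b)."""
    abx, aby = b[0] - a[0], b[1] - a[1]
    apx, apy = p[0] - a[0], p[1] - a[1]
    t = (apx * abx + apy * aby) / (abx * abx + aby * aby)
    t = min(1.0, max(0.0, t))        # closest point may be an edge end point
    dx, dy = apx - t * abx, apy - t * aby
    return math.hypot(dx, dy), t

def wall_distance(p, boundary_edges):
    """Minimum distance from p over all solid-wall edges, with the weight
    and the minimizing edge; wall quantities for (6.3.2) can later be
    interpolated along that edge using the stored weight."""
    return min((point_to_edge(p, a, b) + (a, b) for a, b in boundary_edges),
               key=lambda r: r[0])
```

The precomputation is done once per mesh; during the solve, y+ at a vertex is obtained by interpolating the wall shear, density, and viscosity along the stored edge and applying `y_plus`.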


Figure 6.3.4 Typical flat plate boundary-layer (U+ vs. y+; laminar sublayer, log layer, Re_θ = 37,300) showing dependence of the turbulence model on distance to wall.

The relative insensitivity to distance to the wall means that accurate estimation of distance to the wall is only required for a small number of points that are extremely close to a boundary surface. The remaining points can be assigned any approximate value of physical distance since the damping functions are essentially at their asymptotic values. A general procedure for calculation of distance to the wall in wall units is to precompute and store, for each vertex of the mesh, the minimum distance from the vertex to the closest solid wall (see fig. 6.3.8 below). This strategy can only fail if two bodies are in such close proximity that the near-wall damping functions never reach their asymptotic values. Realistically speaking, the validity of most turbulence models would be in serious question in this situation anyway. In general, the minimum distance from vertex to boundary edge does not occur at the end points of a boundary edge but rather interior to a boundary edge. For each vertex, information concerning which boundary edge achieves the minimum distance and the weight factor for linear interpolation along the boundary edge must be stored. Data can then be interpolated to the point of minimum distance whenever needed. In the course of solving (6.3.3), distance to a solid wall in wall units is calculated by retrieving physical distance to the wall and the local wall quantities needed for (6.3.2) as interpolated along the closest boundary edge.

In the paper with Baldwin [BalB90] we construct a numerical scheme and show precise conditions for which a maximum principle is guaranteed. This aspect of the work was critical in producing a robust numerical model for complex fluid flows. In the following
In the followingparagraphs we discuss another turbulence model which shares this feature.One noticeable weakness of the Baldwin-Barth model is the poor accuracy in computingfree shear-layer ows. This motivated Spalart and Allaras [SpalA92] to devised a variantof our model. Using their notation the model is writtenDe�Dt =1� �r � ((� + e�)re�) + cb2(re�)2�� hcw1fw � cb1�2 ft2i� e�d�2 + cb1eSe� (6:3:4)118


where $d$ is the physical distance to solid walls, $c_{b1}$, $c_{b2}$, $c_{w1}$, and $\sigma$ are constants, and $\tilde{S}$ is related to the strain tensor. The complete model also includes terms for simulating the transition of turbulence. Numerical schemes can be constructed for solving this equation which naturally exhibit a maximum principle. When combined with the improved modeling of shear-layer flows, the model is very attractive.

6.3e Examples of Navier-Stokes Flow

The numerical calculations presented in this section represent a successful application of the ideas discussed in previous sections.

Blasius Boundary-Layer

A self-similar laminar boundary-layer is simulated on a 10x20 subdivided quadrilateral mesh, see fig. 6.3.5. An analytical Blasius profile is specified at inflow and computations at low Mach number (M = 0.08) were performed using the various reconstruction schemes.

Figure 6.3.5 Coarse Flat Plate Grid.

Viscous terms have been discretized in finite-volume form using the continuous triangle interpolants (linear or quadratic) and integration quadrature rules consistent with these interpolants. Figure 6.3.6 graphs the velocity profile, which has been vertically sampled at x/L = 0.88. Data is plotted at locations corresponding to the intersection of this vertical line and the edges of the mesh. This results in a somewhat irregular spacing of data points.
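The analytical Blasius profile used at inflow can be generated by integrating the similarity equation f''' + ½ f f'' = 0 with f(0) = f'(0) = 0 and f'(∞) = 1. A minimal sketch (our code, not the notes' implementation) using classical RK4 shooting; the initial curvature f''(0) ≈ 0.332057 is the standard published value that makes f' approach 1 in the free stream.

```python
def blasius_profile(eta_max=10.0, d_eta=0.01, fpp0=0.332057):
    """Integrate the Blasius equation f''' + 0.5*f*f'' = 0 with
    f(0) = f'(0) = 0 and f''(0) = fpp0 using classical RK4.
    Returns a list of (eta, f') samples, where f'(eta) = u/U_inf."""
    def rhs(y):
        f, fp, fpp = y
        return (fp, fpp, -0.5 * f * fpp)

    y = (0.0, 0.0, fpp0)
    samples = [(0.0, y[1])]
    n = round(eta_max / d_eta)
    for i in range(n):
        k1 = rhs(y)
        k2 = rhs(tuple(y[j] + 0.5 * d_eta * k1[j] for j in range(3)))
        k3 = rhs(tuple(y[j] + 0.5 * d_eta * k2[j] for j in range(3)))
        k4 = rhs(tuple(y[j] + d_eta * k3[j] for j in range(3)))
        y = tuple(y[j] + d_eta / 6.0 * (k1[j] + 2*k2[j] + 2*k3[j] + k4[j])
                  for j in range(3))
        samples.append(((i + 1) * d_eta, y[1]))
    return samples
```

The dimensional inflow velocity follows from u(y) = U∞ f'(η) with the usual similarity variable η built from the local Reynolds number.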

Figure 6.3.6 Velocity profiles at x/L = 0.88 (piecewise constant, linear, and quadratic reconstruction vs. the Blasius solution).


The piecewise constant reconstruction performs rather poorly owing to the excessive diffusion in the Euler scheme. The linear reconstruction scheme (without limiting) produces a reasonably accurate solution with a slight overshoot in the outer portion of the boundary-layer. The quadratic scheme performs very well with a slight error in the region of maximum curvature. Both higher order methods appear to have acceptable accuracy for this problem.

RAE 2822 Airfoil

The geometry consists of a single RAE 2822 airfoil. Navier-Stokes flow is computed about this geometry assuming turbulent flow with the following free-stream conditions: M∞ = 0.725, α = 2.31°, Re = 6.5 million. Wind tunnel experiments for the RAE 2822 geometry have been performed by Cook, McDonald, and Firmin [CookMF79]. The RAE 2822 airfoil mesh shown in fig. 6.3.7 contains approximately 14,000 vertices and 41,000 edges.

Figure 6.3.7 Mesh near RAE 2822 airfoil.

Navier-Stokes calculations on the RAE 2822 airfoil were performed using the implicit scheme described in the next section. The Baldwin-Barth turbulence model was used to simulate the effects of turbulence. For purposes of calculating distance to solid walls in the turbulence model, a distance function is first computed for all mesh vertices. Contours of this function are shown in fig. 6.3.8.


Figure 6.3.8 Contours of distance function for turbulence model.

Figures 6.3.9-10 plot Mach number contours and pressure coefficient distributions for the RAE 2822 airfoil. The pressure coefficient distribution compares favorably with the experiment.

Figure 6.3.9 Closeup of Mach number contours near airfoil.

Leading edge trip strips were used on the experimental model but not simulated in the computations. This may explain the minor differences in the leading edge pressure distribution.


Figure 6.3.10 Pressure coefficient distribution on airfoil (calculation vs. experiment, Harris).

Multi-Element Airfoil Flow

Turbulent high Reynolds number flow (M = 0.2, α = 8.2°, Re = 9 million) is computed about a 3-element airfoil configuration. Experimental data is available at these conditions, see Valarezo et al. [ValDM91].

Figure 6.3.11 Grid about multi-element airfoil configuration.


Figure 6.3.12 Closeup of grid near flap region.

Computations were performed using the linear least-squares reconstruction without limiting. The Baldwin-Barth one-equation turbulence model was used to simulate the effects of turbulence. This model has not been implemented in the other schemes at this date, so the computations have been restricted to the linear reconstruction scheme only. The mesh containing approximately 28,000 vertices is shown in figs. 6.3.11-12.

Figure 6.3.13 Cp distribution on element surfaces (M = 0.2, α = 8.2°, Re = 9 million, Baldwin-Barth turbulence model; experiment vs. computation with piecewise linear reconstruction).

The pressure coefficient distribution is graphed in fig. 6.3.13 for the computation and experiment. The agreement with experiment is reasonably good although the mesh resolution is inadequate in the wake regions of the flow. This can be seen in the Mach contour plots shown in figs. 6.3.14-15. Proper mesh resolution of the wake flow is an excellent opportunity for mesh adaptation although we have not developed an adaptation procedure


for this purpose.

Figure 6.3.14 Mach contours about multi-element airfoil.

Figure 6.3.15 Closeup of Mach contours in trailing edge region.

7.0 Convergence Acceleration For Steady-State Calculations

In this section we examine two techniques for accelerating the convergence of numerical schemes to steady-state. Section 7.1 will discuss a randomized multigrid procedure. The central idea is to exploit the randomized theory for Delaunay and related triangulations. Then in Section 7.2 we consider implicit solution strategies based on a preconditioned minimum residual method.


7.1 Randomized Multigrid

Recall from Section 3.3c the theory of randomized incremental triangulation. If the order in which sites are introduced for insertion into the triangulation is randomized then it is possible to store a history of the entire triangulation process as a data tree (see fig. 7.1.0) with the following characteristics:

(1) Expected size O(m log n + n)
(2) Average traversal depth O(log n)

where m is the size of the original triangulation T0 and n is the number of sites to be inserted into T0. Usually m is a small number so that the size is approximately O(n). Thus we construct a data structure which effectively encodes a composition of structural changes transforming the mesh from its coarsest representation, T0, to its finest representation, Tn. Letting Si denote the i-th structural change, the encoding process is represented by the following composition:
$$T_n = (S_l(S_{l-1}(S_{l-2} \ldots (S_3(S_2(S_1 T_0))) \ldots ))) \qquad (7.1.0)$$
This structure permits us to stop at any point in the composition and extract a valid intermediate triangulation. This is the central idea that we exploit in a multigrid setting. In addition, this composition is used for global restriction and prolongation given local restriction and prolongation operators on convex sets of d+1 points in R^d, see discussion below.

Figure 7.1.0 Data tree for site location. (a) Original triangulation containing Q. (b) Triangle is split into 3 from site insertion. (c) Triangle is split into 2 from edge swap.

Note that the intermediate meshes produced by this procedure are node-nested subsets. This is a departure from the multigrid algorithms of Jameson and Mavriplis [JamM85],


[MavJ86]. The actual multigrid algorithm follows a standard procedure for nonlinear problems, see Jespersen [Jesp83] or Spekreijse [Spek87].

Stage I: Nested iteration

Let
$$R_k(\mathbf{u}_k) = \mathbf{r}_k \qquad (7.1.1)$$
be a discretization of the conservation law with source term r on T_k, k = 0, ..., l. Denote the solution of (7.1.1) by u*_k, k = 0, ..., l. The nested iteration scheme starts with some initial guess, u*_0, and proceeds recursively. Given an approximation u*_k, an approximation of u*_{k+1} is obtained as follows. The approximation of u*_k is improved by a single NMG iteration and this improved solution is projected onto a finer mesh, T_{k+1}. These steps are repeated until u*_l is obtained.

Stage II: Nonlinear multigrid (NMG)

One iteration of NMG on a general grid T_k is defined recursively by the following steps:

(0) Start with an approximation u_k of u*_k.
(1) Improve u_k by application of p relaxation iterations to R_k(u_k) = r_k.
(2) Compute the defect d_k := r_k − R_k(u_k).
(3) Find an approximation u_{k−1} of u*_{k−1} on the coarser grid T_{k−1} via the restriction operator I_k^{k−1}.
(4) Compute r_{k−1} := R_{k−1}(u_{k−1}) + I_k^{k−1} d_k.
(5) Approximate the solution of R_{k−1}(u_{k−1}) = r_{k−1} by NMG on T_{k−1}. The result is ũ_{k−1}.
(6) Correct the current approximation by u_k := u_k + I_{k−1}^k(ũ_{k−1} − u_{k−1}).
(7) Improve u_k by application of q relaxation iterations to R_k(u_k) = r_k.

To demonstrate the method, consider solving the Burgers' equation problem introduced in Section 5.3 using the upwind scheme (4.2.18) based on piecewise constant reconstruction. A 4-stage Runge-Kutta scheme is used for pre- and post-relaxation iterations with p = q = 2. For ease of exposition, assume that the control volumes are the actual triangles. In this case the restriction and prolongation operators are particularly simple. We also introduce a new local operator called the "recombination" operator.
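The NMG cycle can be made concrete on a toy problem. The sketch below (our code, not from the notes) applies steps (1)-(7) to the linear operator R_k(u) = A_k u, a 1-D Laplacian with homogeneous Dirichlet boundaries, using weighted-Jacobi relaxation, full-weighting restriction of the defect, injection for the solution transfer in step (3), and linear-interpolation prolongation. For a linear operator this FAS formulation reduces to the standard coarse-grid correction, but the step structure is identical.

```python
def apply_A(u, h):
    """1-D Laplacian R(u) = A u with zero Dirichlet boundary values."""
    n = len(u)
    return [(2.0 * u[i]
             - (u[i - 1] if i > 0 else 0.0)
             - (u[i + 1] if i < n - 1 else 0.0)) / (h * h) for i in range(n)]

def relax(u, r, h, sweeps, w=2.0 / 3.0):
    """Weighted-Jacobi relaxation iterations for A u = r (diagonal is 2/h^2)."""
    for _ in range(sweeps):
        Au = apply_A(u, h)
        u = [u[i] + w * 0.5 * h * h * (r[i] - Au[i]) for i in range(len(u))]
    return u

def restrict(d):
    """Full weighting; coarse point i coincides with fine point 2i+1."""
    return [0.25 * d[2*i] + 0.5 * d[2*i + 1] + 0.25 * d[2*i + 2]
            for i in range((len(d) - 1) // 2)]

def inject(u):
    """Solution transfer to the coarse grid at coincident points."""
    return [u[2*i + 1] for i in range((len(u) - 1) // 2)]

def prolong(e, nf):
    """Linear interpolation of the coarse correction to the fine grid."""
    out = [0.0] * nf
    for i, v in enumerate(e):
        out[2*i + 1] += v
        out[2*i] += 0.5 * v
        out[2*i + 2] += 0.5 * v
    return out

def nmg(u, r, h, p=2, q=2):
    """One NMG V-cycle, steps (1)-(7) of the text, for R_k(u) = A_k u."""
    if len(u) <= 3:
        return relax(u, r, h, 50)                  # coarsest grid: solve by relaxation
    u = relax(u, r, h, p)                          # (1) pre-relaxation
    Au = apply_A(u, h)
    d = [r[i] - Au[i] for i in range(len(u))]      # (2) defect
    uc = inject(u)                                 # (3) restrict the solution
    rc = [a + b for a, b in zip(apply_A(uc, 2 * h), restrict(d))]   # (4)
    uc_new = nmg(list(uc), rc, 2 * h, p, q)        # (5) recursive coarse-grid NMG
    e = prolong([x - y for x, y in zip(uc_new, uc)], len(u))
    u = [u[i] + e[i] for i in range(len(u))]       # (6) coarse-grid correction
    return relax(u, r, h, q)                       # (7) post-relaxation
```

On a 63-point grid this contracts the residual by roughly an order of magnitude per V-cycle, essentially independent of the grid size, which is the behavior one hopes to approach in the unstructured setting below.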
This operator accounts for the edge-swapping transformation during the triangulation process. The restriction and prolongation operators can now be represented as the composition of local operators, T_i, for example:

R_k = I_{k-1}^k R_{k-1} = T_j(T_{j-1}(...(T_{i+1}(T_i R_{k-1}))...))

where i and j are the locations of the levels k-1 and k and j > i. The three local transfer operators are depicted in fig. 7.1.1.
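A hedged sketch of how such local operators might act on cell residuals: restriction sums the residuals of two cells merged on the coarser level (R_k = R_i + R_j, as in fig. 7.1.1(c)), and prolongation splits a coarse residual between its children in proportion to cell area so that the transfer is conservative. The dict encoding and function names are assumptions for illustration, and the recombination (edge-swap) operator is omitted.

```python
def restrict_pair(res, area, i, j, k):
    """Local restriction T: fine cells i, j merge into coarse cell k."""
    res[k] = res.pop(i) + res.pop(j)      # R_k = R_i + R_j
    area[k] = area[i] + area[j]

def prolong_pair(res, area, k, i, j):
    """Local prolongation: coarse cell k splits into fine cells i, j."""
    total = res.pop(k)
    frac = area[i] / (area[i] + area[j])  # area-weighted share for cell i
    res[i] = total * frac
    res[j] = total * (1.0 - frac)

# Composing such local operators in sequence realizes the full transfer,
# e.g. R_k = T_j(T_{j-1}(... T_i(R_{k-1}) ...)).  Cell ids/areas are made up.
res = {'a': 1.0, 'b': 2.0}
area = {'a': 1.0, 'b': 3.0}
restrict_pair(res, area, 'a', 'b', 'c')   # res == {'c': 3.0}
prolong_pair(res, area, 'c', 'a', 'b')    # res == {'a': 0.75, 'b': 2.25}
```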


Figure 7.1.1 Local Transfer Operators: (a) Recombination (b) Prolongation (c) Restriction

The algorithm is tested using the Burgers' equation problem of Section 5.3 on meshes containing 3200 and 12800 cells. Intermediate meshes are extracted from (7.1.0) so that the number of solution unknowns is successively reduced by a factor of 4. Figure 7.1.2 plots the coarsest mesh and fig. 7.1.3 plots the mesh containing 3200 cells.

Figure 7.1.2 Coarsest mesh for multigrid calculation.


Figure 7.1.3 Finest mesh (3200 cells) for multigrid calculation.

Figure 7.1.4 graphs the convergence history for the multigrid scheme using 5-level V-cycles on the mesh containing 3200 cells and 6-level V-cycles on the mesh containing 12800 cells.

Figure 7.1.4 Convergence history for randomized multigrid scheme.

Ideally we would like to observe grid-size-independent rates of convergence. The results do not indicate true grid-independent convergence but come fairly close. The results may be improved somewhat by alternate mesh level choices. One alternative might be a nonstationary randomized algorithm for level selection.

7.2 Implicit Solution Algorithms

In this section we consider implicit solution strategies for the upwind schemes described in Sections 4-6. Defining the solution vector

u = [u_1, u_2, u_3, ..., u_N]^T

the basic scheme is written as

D u_t = r(u)   (7.2.0)

where D is a positive diagonal matrix. Performing a backward Euler time integration, equation (7.2.0) is rewritten as

D (u^{n+1} - u^n) = Δt r(u^{n+1})   (7.2.1)

where n denotes the iteration (time step) level. Linearizing the right-hand side of (7.2.1) in time produces the following form:

D (u^{n+1} - u^n) = Δt [ r(u^n) + (dr^n/du) (u^{n+1} - u^n) ]   (7.2.2)

By rearrangement of terms, we arrive at the delta form of the backward Euler scheme:

[ D - Δt (dr^n/du) ] (u^{n+1} - u^n) = Δt r(u^n)

Note that for large time steps, the scheme becomes equivalent to Newton's method for finding roots of a nonlinear system of equations. In practice the time step is usually scaled as an exponential function of the norm of the residual,

Δt = f(||r(u)||),

so that when ||r(u)|| approaches zero the time step becomes large. Each iteration of the scheme requires the solution of a linear system of equations of the form M x = b. The two most difficult tasks in this procedure are:
(1) Calculation of the Jacobian matrix elements
(2) Solution of the sparse linear system, M x = b

Before discussing these steps, it is worthwhile to assess the storage requirements for this strategy. Assume that the solution unknowns are associated with vertices of the mesh. All of the schemes discussed in previous sections require knowledge of distance-one neighbors in the graph (mesh). Furthermore, the higher order accurate schemes require more distant neighbors in building the scheme. If we consider only distance-one neighbors then the number of nonzero entries in each row of the matrix is related to the number of edges incident to the vertex associated with that row. Or equivalently, each edge e(v_i, v_j) will guarantee nonzero entries in the i-th column and j-th row and similarly the j-th column and i-th row. In addition, nonzero entries will be placed on the diagonal of the matrix.
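The nonzero pattern just described can be generated directly from the edge list. The sketch below builds the set of (row, column) positions for a made-up four-vertex mesh; the function name and data are illustrative.

```python
def matrix_pattern(n_vertices, edges):
    """Return the set of (row, col) nonzero positions for a distance-one stencil."""
    nz = {(i, i) for i in range(n_vertices)}   # diagonal entries
    for i, j in edges:
        nz.add((i, j))                         # i-th row, j-th column
        nz.add((j, i))                         # j-th row, i-th column
    return nz

# Two triangles sharing an edge plus one diagonal: N = 4 vertices, E = 5 edges.
edges = [(0, 1), (1, 2), (2, 0), (2, 3), (1, 3)]
pattern = matrix_pattern(4, edges)
assert len(pattern) == 2 * len(edges) + 4      # 2E + N = 14 nonzeros
```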
From this counting argument we see that the number of nonzero entries in the matrix is exactly twice the number of edges plus the number of vertices, 2E + N. This argument holds in three dimensions as well. Although we do not prove it here, the storage requirements become prohibitively large when more distant neighbors are included in the matrix pattern.

The strategy taken here is to store only the matrix associated with distance-one neighbors. We will require two copies of this matrix, which are used in two different ways: (1) in the preconditioning of the linear system, (2) in matrix-vector products in the iterative solver.

7.2a Calculation of Matrix Elements

The main burden in this calculation is the linearization of the numerical flux vector with respect to the two solution states. For example, given the flux vector

h(u_R, u_L, n) = (1/2) [ f(u_R, n) + f(u_L, n) ] - (1/2) |A(u_R, u_L, n)| (u_R - u_L)   (7.2.3)

we require the Jacobian terms dh/du_R and dh/du_L. Exact analytical expressions for these terms are available [Bar87]. As mentioned above, in practice we only compute the matrix elements associated with distance-one neighbors in the mesh. For the edge e(v_i, v_j) this requires the Jacobian terms dh_ij/du_i and dh_ij/du_j.

Using the edge data structure mentioned in Section 1, the actual assembly of the sparse matrix from components becomes very simple. Again the idea is that we can determine the flux Jacobian terms edge-wise. The edge e(v_i, v_j) then contributes entries in the i-th row and j-th column as well as the j-th row and i-th column.

7.2b Solution of the Linear System

The task is to solve the large sparse linear system

M x - b = 0

or, in left preconditioned form,

P (M x - b) = 0.

The approach taken here is to use the GMRES algorithm of Saad and Schultz [SaadS86] with incomplete L-U factorization preconditioning. Consider for simplicity the nonpreconditioned form of the linear system. Let x_0 be an approximate solution of M x - b = 0, where M is assumed invertible. The solution is advanced from x_0 to x_k = x_0 + y_k. Let v_1 = M x_0 - b and r_k = M x_k - b.
GMRES(k) finds the best possible solution for y_k over the Krylov subspace [v_1, M v_1, M^2 v_1, ..., M^{k-1} v_1] by solving the minimization problem

||r_k|| = min_y ||v_1 + M y||.
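This minimization can be sketched compactly. The following illustrative GMRES(k) (no restarting, no preconditioning) builds the Krylov basis by Arnoldi with modified Gram-Schmidt and solves a small least-squares problem on the Hessenberg matrix. numpy is assumed available, and the sign convention r0 = b - M x0 differs from the text's v1 = M x0 - b only by y replaced with -y.

```python
import numpy as np

def gmres_k(M, b, x0, k):
    """Return x_k = x0 + y_k minimizing ||M x - b|| over x0 + Krylov_k."""
    r0 = b - M @ x0
    beta = np.linalg.norm(r0)
    if beta == 0.0:
        return x0                        # x0 is already the exact solution
    V = np.zeros((len(b), k + 1))        # orthonormal Krylov basis vectors
    H = np.zeros((k + 1, k))             # upper Hessenberg projection of M
    V[:, 0] = r0 / beta
    for j in range(k):                   # Arnoldi with modified Gram-Schmidt
        w = M @ V[:, j]
        for i in range(j + 1):
            H[i, j] = V[:, i] @ w
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-14:          # happy breakdown: subspace is invariant
            k = j + 1
            break
        V[:, j + 1] = w / H[j + 1, j]
    e1 = np.zeros(k + 1)
    e1[0] = beta
    y, *_ = np.linalg.lstsq(H[:k + 1, :k], e1, rcond=None)
    return x0 + V[:, :k] @ y
```

With k equal to the system dimension this recovers the exact solution of a nonsingular system (in exact arithmetic), which makes the sketch easy to check on a small matrix.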


GMRES forms an orthogonal basis v_1, v_2, ..., v_k spanning the Krylov subspace. As k increases, the storage increases linearly and the computation quadratically. Thus Saad and Schultz devised GMRES(k, m), in which the GMRES algorithm is restarted every m steps. A complete discussion of the GMRES solver and Krylov subspace methods will be deferred to the lectures of Prof. Saad.

If we consider the backward Euler time stepping scheme with large Δt, we see that the matrix-vector products are actually Fréchet directional derivatives of the residual vector in the direction v, i.e.

M v = (∂R(u)/∂u) v.

This has prompted researchers [Shak88] to consider "matrix-free" versions of GMRES in which the right-hand-side term is calculated numerically:

(∂R(u)/∂u) v ≈ [ R(u + εv) - R(u) ] / ε

We do not adopt this strategy because we actually desire a matrix (presumably better than a diagonal approximation) for the preconditioning step. Let M_0 denote the Jacobian matrix associated with the distance-one neighbors in the mesh. We compute the matrix-vector product terms in GMRES in the following fashion:

M v = M_0 v + [ ΔR(u + εv) - ΔR(u) ] / ε

where ΔR(u) represents the remaining terms in the discretization. We find that this choice makes the scheme less dependent on the choice of ε. This decomposition strategy is a departure from previous work by Venkatakrishnan, who examined iterative ILU-GMRES algorithms [VenkM93] together with fluid flow algorithms but used the distance-one neighbor matrix for matrix-vector products in GMRES.

The effectiveness of GMRES hinges heavily on the preconditioning of the linear system. In the present applications, we precondition using incomplete L-U factorization with zero additional fill, ILU(0). The basic idea in ILU(0) is to perform a lower-upper factorization of the matrix ignoring any new fill that might occur. Again we defer a discussion of ILU to the lectures of Prof. Saad.

7.3c Numerical Examples

Subsonic Airfoil Flow

As a first example, inviscid subsonic flow (M∞ = .63, α = 2.0°) is computed about the NACA 0012 airfoil using the first order spatially accurate scheme. In this case the Jacobian linearization M_0 is exact and Newton's method is obtained for large Δt. The grid, solution, and convergence history are shown in fig. 7.3.0a-c.
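The remark that Newton's method is obtained for large Δt can be made concrete with a minimal scalar sketch of the delta-form iteration; the model residual r(u) = 1 - u^2 and the step-size law Δt = c/|r(u)| are illustrative assumptions, not the notes' choices.

```python
# Delta-form backward Euler for a scalar model residual r(u) = 1 - u**2
# (root u* = 1), with D = 1 and a time step that grows as the residual
# falls.  As dt grows, the update du = dt*r / (D - dt*dr/du) tends to
# -r / (dr/du), i.e. a Newton step.

def residual(u):
    return 1.0 - u * u

def dresidual(u):                              # dr/du
    return -2.0 * u

def backward_euler_delta(u, c=1.0, D=1.0, tol=1e-12, max_iter=50):
    for _ in range(max_iter):
        r = residual(u)
        if abs(r) < tol:
            break
        dt = c / abs(r)                        # dt = f(||r(u)||)
        du = dt * r / (D - dt * dresidual(u))  # (D - dt dr/du) du = dt r
        u += du
    return u
```

Starting from u = 0.5 the iteration is cautious while the residual is large, then approaches the root with Newton-like steps once the residual is small.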


Figure 7.3.0a Grid about NACA 0012 airfoil.

Figure 7.3.0b Mach contours about NACA 0012 airfoil using first order accurate scheme.

In the convergence history plot we also show the convergence behavior of one GMRES cycle with Δt corresponding to a CFL number of about 10000. For comparison we show the behavior with ILU preconditioning and simple block-diagonal preconditioning. The improvement with ILU preconditioning is dramatic at this large time step.


Figure 7.3.0c Convergence history showing global convergence of the scheme and convergence of 1 cycle of GMRES(k,m), k = 12, m = 5, using block-diagonal and ILU preconditioning.

As a second example, inviscid subsonic flow is computed about the NACA 0012 airfoil using the second order spatially accurate scheme with linear reconstruction and no limiting. In this case Fréchet numerical derivatives are also included in the GMRES computation. The numerical solution and convergence history are shown in fig. 7.3.1a-b.

Figure 7.3.1a Mach contours about NACA 0012 airfoil using second order accurate scheme.


Figure 7.3.1b Convergence history showing global convergence of the scheme and convergence of 1 cycle of GMRES(k,m), k = 12, m = 5, using ILU preconditioning.

The residual history shows a rapid convergence to steady-state but never achieves Newton convergence. This could be due to the choice of ε = 1.e-6. We also show the convergence behavior of one GMRES cycle with Δt corresponding to a CFL number of about 10000. For comparison we show the behavior with ILU preconditioning and simple block-diagonal preconditioning. The ILU preconditioning is degraded somewhat from the previous example but is still very effective. The block-diagonal preconditioning is largely ineffective at this large time step.

Transonic Airfoil Flow

As a third example, inviscid transonic flow (M∞ = .8, α = 1.25°) is computed about the NACA 0012 airfoil using the first and second order spatially accurate schemes. To avoid convergence problems owing to the nondifferentiability of the limiter function in the second order accurate scheme, the smooth limiter function proposed by Venkatakrishnan [Venkat93] is used (although the limiter allows small overshoots in the solution). The numerical solution and convergence histories are shown in fig. 7.3.2a-b.


Figure 7.3.2a Mach contours about NACA 0012 airfoil using second order accurate scheme.

Figure 7.3.2b Convergence history showing global convergence of the scheme and convergence of 1 cycle of GMRES(k,m), k = 12, m = 5, using ILU preconditioning.

Although both the first and second order accurate schemes converge rapidly near steady-state, the second order scheme requires almost twice as many iterations before the solution is "sufficiently close" so that Newton's method may yield rapid convergence. This is largely due to the strong shock which develops early in the transient and must propagate upstream to its final position.

Viscous Flow

As a final example, viscous turbulent flow (M∞ = .3, α = 2.31°, Re = 6 million) is computed about the RAE 2822 airfoil using the first and second order spatially accurate schemes with linear reconstruction. The mesh has been previously shown in fig. 6.3.7. The numerical solution and convergence history are shown in fig. 7.3.3b-c. In these calculations, the turbulence model equations are solved independently of the mean flow equations. This creates a serious departure from Newton's method.

Figure 7.3.3a Mach contours about RAE 2822 airfoil using second order accurate scheme.


Figure 7.3.3b Convergence history of the scheme for laminar flow (Re = .5 million) and turbulent flow (Re = 6 million) using first and second order schemes.

The effect is quite noticeable for the first order scheme. In this case the number of iterations required has doubled. The effect is very damaging for the second order scheme. In this case the only reasonable solution would be to fully couple the turbulence equations to the mean flow equations. This is an avenue for future work.


References

[AftGT94] "On the Accuracy, Stability and Monotonicity of Various Reconstruction Algorithms for Unstructured Meshes," AIAA paper 94-0415, 1994.
[And92] Anderson, W.K., "A Grid Generation and Flow Solution Method for the Euler Equations on Unstructured Grids," NASA Langley Research Center, USA, unpublished manuscript, 1992.
[ArmD89] "Construction of TVD-like Artificial Viscosities on 2-Dimensional Arbitrary FEM Grids," INRIA Report 1111, October 1989.
[BabuA76] Babuška, I., and Aziz, A. K., "On the Angle Condition in the Finite Element Method," SIAM J. Numer. Anal., Vol. 13, No. 2, 1976.
[Bak89] Baker, T. J., "Automatic Mesh Generation for Complex Three-Dimensional Regions Using a Constrained Delaunay Triangulation," Engineering with Computers, Vol. 5, 1989, pp. 161-175.
[BaldB90] Baldwin, B.S., and Barth, T.J., "A One-Equation Turbulence Transport Model for High Reynolds Number Wall-Bounded Flows," NASA TM-102847, August 1990.
[Barn93] Barnard, S.T., and Simon, H.D., "A Fast Multilevel Implementation of Recursive Spectral Bisection," in Proc. 6th SIAM Conf. Parallel Proc. for Sci. Comp., 1993, pp. 711-718.
[BarL87] Barth, T.J., and Lomax, H.L., "Algorithm Development," NASA CP 2454, 1987, pp. 191-200.
[Bar87] Barth, T.J., "Analysis of Implicit Local Linearization Techniques for Upwind and TVD Schemes," AIAA Paper 87-0595, 1987.
[BarJ89] Barth, T. J., and Jespersen, D. C., "The Design and Application of Upwind Schemes on Unstructured Meshes," AIAA-89-0366, Jan. 9-12, 1989.
[BarF90] Barth, T. J., and Frederickson, P. O., "Higher Order Solution of the Euler Equations on Unstructured Grids Using Quadratic Reconstruction," AIAA-90-0013, Jan. 8-11, 1990.
[Bar91a] Barth, T.J., "Numerical Aspects of Computing Viscous High Reynolds Number Flows on Unstructured Meshes," AIAA paper 91-0721, January 1991.
[Bar91b] Barth, T. J., "A Three-Dimensional Upwind Euler Solver for Unstructured Meshes," AIAA Paper 91-1548, Honolulu, Hawaii, 1991.
[Bar93] Barth, T.J., "Recent Developments in High Order K-Exact Reconstruction on Unstructured Meshes," AIAA paper 93-0668, January 1993.
[BernE90] "Mesh Generation and Optimal Triangulation," Tech Report CSL-92-1, Xerox PARC, March 1992.
[Bow81] Bowyer, A., "Computing Dirichlet Tessellations," The Computer Journal, Vol. 24, No. 2, 1981, pp. 162-166.
[Bri89] Brisson, E., "Representing Geometric Structures in d Dimensions: Topology and Order," in Proceedings of the 5th ACM Symposium on Computational Geometry, 1989, pp. 218-227.
[Chio85] Chiocchia, G., "Exact Solutions to Transonic and Supersonic Flows," AGARD Advisory Report AR-211, 1985.


[Chew89] Chew, L.P., "Guaranteed-Quality Triangular Meshes," Department of Computer Science Tech Report TR 89-983, Cornell University, 1989.
[Chew93] Chew, L.P., "Guaranteed-Quality Mesh Generation for Curved Surfaces," ACM Press, Proceedings of the 9th Annual Symposium on Computational Geometry, San Diego, CA, 1993.
[ChroE91] Chrobak, M., and Eppstein, D., "Planar Orientations with Low Out-Degree and Compaction of Adjacency Matrices," Theo. Comp. Sci., Vol. 86, No. 2, 1991, pp. 243-266.
[Ciarlet73] Ciarlet, P.G., and Raviart, P.-A., "Maximum Principle and Uniform Convergence for the Finite Element Method," Comp. Meth. in Appl. Mech. and Eng., Vol. 2, 1973, pp. 17-31.
[ColW84] Colella, P., and Woodward, P., "The Piecewise Parabolic Method for Gas-Dynamical Simulations," J. Comp. Phys., Vol. 54, 1984.
[CookMF79] Cook, P.H., McDonald, M.A., and Firmin, M.C.P., "AEROFOIL RAE 2822 Pressure Distributions, and Boundary Layer and Wake Measurements," AGARD Advisory Report No. 139, 1979.
[CutM69] Cuthill, E., and McKee, J., "Reducing the Band Width of Sparse Symmetric Matrices," Proc. ACM Nat. Conf., 1969, pp. 157-172.
[Del34] Delaunay, B., "Sur la Sphère Vide," Izvestia Akademii Nauk SSSR, Vol. 7, No. 6, Oct. 1934, pp. 793-800.
[DesD88] Desideri, J. A., and Dervieux, A., "Compressible Flow Solvers on Unstructured Grids," VKI Lecture Series 1988-05, March 7-11, 1988, pp. 1-115.
[DobL89] Dobkin, D.P., and Laszlo, M.J., "Primitives for the Manipulation of Three-Dimensional Subdivisions," Algorithmica, Vol. 4, 1989, pp. 3-32.
[Edel90] Edelsbrunner, H., Tan, T.S., and Waupotitsch, R., "An O(n^2 log n) Time Algorithm for the MinMax Angle Triangulation," Proceedings of the 6th ACM Symposium on Computational Geometry, 1990, pp. 44-52.
[Fort93] Fortune, S.J., and Van Wyk, C.J., "Efficient Exact Arithmetic for Computational Geometry," ACM Press, Proceedings of the 9th Annual Symposium on Computational Geometry, San Diego, CA, 1993.
[GanB92] Gandhi, A. S., and Barth, T. J., "3-D Unstructured Grid Generation and Refinement Using 'Edge-Swapping'," NASA TM-103966, 1993.
[Geo71] George, J.A., "Computer Implementation of the Finite Element Method," Technical Report No. STAN-CS-71-208, Computer Science Dept., Stanford University, 1971.
[Gil79] Gilbert, P.N., "New Results on Planar Triangulations," Tech. Rep. ACT-15, Coord. Sci. Lab., University of Illinois at Urbana, July 1979.
[God59] Godunov, S. K., "A Finite Difference Method for the Numerical Computation of Discontinuous Solutions of the Equations of Fluid Dynamics," Mat. Sb., Vol. 47, 1959.
[GoodLV85] Goodman, J.D., and Le Veque, R.J., "On the Accuracy of Stable Schemes for 2D Conservation Laws," Math. Comp., Vol. 45.
[GreS77] Green, P. J., and Sibson, R., "Computing the Dirichlet Tessellation in the Plane," The Computer Journal, Vol. 21, No. 2, 1977, pp. 168-173.


[GuiS85] Guibas, L.J., and Stolfi, J., "Primitives for the Manipulation of General Subdivisions and the Computation of Voronoi Diagrams," ACM Trans. Graph., Vol. 4, 1985, pp. 74-123.
[GuiKS92] Guibas, L.J., Knuth, D.E., and Sharir, M., "Randomized Incremental Construction of Delaunay and Voronoi Diagrams," Algorithmica, Vol. 7, 1992, pp. 381-413.
[Guill1993] Guillard, H., "Node-Nested Multi-grid Method with Delaunay Coarsening," INRIA Report 1898, 1993.
[HamB91] Hammond, S., and Barth, T.J., "An Efficient Massively Parallel Euler Solver for Unstructured Grids," AIAA paper 91-0441, Reno, 1991.
[Hart83] Harten, A., "High Resolution Schemes for Hyperbolic Conservation Laws," J. Comp. Phys., Vol. 49, pp. 357-393.
[HartEOC] Harten, A., Engquist, B., Osher, S., and Chakravarthy, S., "Uniformly High Order Accurate Essentially Non-Oscillatory Schemes III," ICASE Report 86-22, 1986.
[HartHL76] Harten, A., Hyman, J.M., and Lax, P.D., "On Finite-Difference Approximations and Entropy Conditions for Shocks," Comm. Pure Appl. Math., Vol. 29, 1976, pp. 297-322.
[HartO85] Harten, A., and Osher, S., "Uniformly High-Order Accurate Non-oscillatory Schemes, I," MRC Technical Summary Report 2823, 1985.
[HolS88] Holmes, G., and Snyder, D., "The Generation of Unstructured Triangular Meshes Using Delaunay Triangulation," in Numerical Grid Generation in CFD, Pineridge Press, 1988, pp. 643-652.
[Jam86] Jameson, A., and Lax, P., "Conditions for the Construction of Multi-Point Total Variation Diminishing Difference Schemes," ICASE Report 178076, March 1986.
[Jam93] Jameson, A., "Artificial Diffusion, Upwind Biasing, Limiters and their Effect on Accuracy and Multigrid Convergence in Transonic and Hypersonic Flows," AIAA paper 93-3359, 1993.
[JamM85] Jameson, A., and Mavriplis, D., "Finite Volume Solution of the Two-Dimensional Euler Equations on a Regular Triangular Mesh," AIAA paper 85-0435, January 1985.
[Jesp83] Jespersen, D.C., "Design and Implementation of a Multigrid Code for the Euler Equations," Appl. Math. and Computat., Vol. 13, 1983, pp. 357-374.
[Joe89] Joe, B., "Three-Dimensional Delaunay Triangulations from Local Transformations," SIAM J. Sci. Stat. Comput., Vol. 10, 1989, pp. 718-741.
[Joe91] Joe, B., "Construction of Three-Dimensional Delaunay Triangulations from Local Transformations," CAGD, Vol. 8, 1991, pp. 123-142.
[Klee80] Klee, V., "On the Complexity of d-dimensional Voronoi Diagrams," Archiv der Mathematik, Vol. 34, 1980.
[Knuth92] Knuth, D.E., Axioms and Hulls, Lecture Notes in Computer Science, Springer-Verlag, Heidelberg, 1992.
[Knuth93] Knuth, D.E., Stanford GraphBase: A Platform for Combinatorial Computing, Addison-Wesley Publishing, New York, 1993.
[Law77] Lawson, C. L., "Software for C1 Surface Interpolation," Mathematical Software III (Ed. John R. Rice), Academic Press, New York, 1977.


[Law86] Lawson, C. L., "Properties of n-dimensional Triangulations," CAGD, Vol. 3, April 1986, pp. 231-246.
[Mav88] Mavriplis, D., "Adaptive Mesh Generation for Viscous Flows Using Delaunay Triangulation," ICASE Report No. 88-47, 1988.
[MavJ87] Mavriplis, D., and Jameson, A., "Multigrid Solution of the Two-Dimensional Euler Equations on Unstructured Triangular Meshes," AIAA paper 87-0353, January 1987.
[Mich94] Michell, C., "Improved Reconstruction Schemes for the Navier-Stokes Equations on Structured Meshes," AIAA paper 94-0642, January 1994.
[Nira90a] Nira, D., Levin, D., and Rippa, S., "Data Dependent Triangulations for Piecewise Linear Interpolation," J. Numer. Anal., Vol. 10, No. 1, 1990, pp. 137-154.
[Nira90b] Nira, D., Levin, D., and Rippa, S., "Algorithms for the Construction of Data Dependent Triangulations," in Algorithms for Approximation, II, Chapman and Hall, London, 1990, pp. 192-198.
[PothSL90] Pothen, A., Simon, H.D., and Liou, K.-P., "Partitioning Sparse Matrices with Eigenvectors of Graphs," SIAM J. Matrix Anal. Appl., Vol. 11, No. 3, July 1990, pp. 430-452.
[Raj91] Rajan, V. T., "Optimality of the Delaunay Triangulation in R^d," Proceedings of the 7th ACM Symposium on Computational Geometry, 1991, pp. 357-372.
[Rippa90] Rippa, S., "Minimal Roughness Property of the Delaunay Triangulation," CAGD, Vol. 7, No. 6, 1990, pp. 489-497.
[Roe81] Roe, P.L., "Approximate Riemann Solvers, Parameter Vectors, and Difference Schemes," J. Comput. Phys., Vol. 43, 1981.
[Ros68] Rosen, R., "Matrix Band Width Minimization," Proc. ACM Nat. Conf., 1968, pp. 585-595.
[Rost88] Rostand, P., "Algebraic Turbulence Models for the Computation of Two-Dimensional High Speed Flows Using Unstructured Grids," ICASE Report 88-63, 1988.
[Rupp92] Ruppert, J., "A New and Simple Algorithm for Quality 2-Dimensional Mesh Generation," Report UCB/CSD 92/694, University of California, Berkeley, 1992.
[SaadS86] Saad, Y., and Schultz, M.H., "GMRES: A Generalized Minimal Residual Algorithm for Solving Nonsymmetric Linear Systems," SIAM J. Sci. Stat. Comp., Vol. 7, No. 3, 1986, pp. 856-869.
[Schmitt79] Schmitt, V., and Charpin, F., "Pressure Distributions on the ONERA M6-Wing at Transonic Mach Numbers," in "Experimental Data Base for Computer Program Assessment," AGARD AR-138, 1979.
[Shak88] Shakib, F., "Finite Element Analysis of the Compressible Euler and Navier-Stokes Equations," PhD Thesis, Department of Mechanical Engineering, Stanford University, 1988.
[Simon91] Simon, H.D., "Partitioning of Unstructured Problems for Parallel Processing," NASA Ames R.C., Tech. Report RNR-91-008, 1991.
[SpalA92] Spalart, P., and Allmaras, S., "A One-Equation Turbulence Model for Aerodynamic Flows," AIAA Paper 92-0439, 1992.
[Spek87] Spekreijse, S.P., "Multigrid Solution of Monotone Second-Order Discretizations of Hyperbolic Conservation Laws," Math. Comp., Vol. 49, pp. 135-155.


[Spek88] Spekreijse, S.P., "Multigrid Solution of the Steady Euler Equations," CWI Tract, Centre for Mathematics and Computer Science, Amsterdam, The Netherlands, 1988.
[StruVD89] Struijs, R., Vankeirsbilck, P., and Deconinck, H., "An Adaptive Grid Polygonal Finite Volume Method for the Compressible Flow Equations," AIAA-89-1959-CP, 1989.
[ThomLW85] Thomas, J.L., van Leer, B., and Walters, R.W., "Implicit Flux-Split Schemes for the Euler Equations," AIAA paper 85-1680, 1985.
[VanL79] Van Leer, B., "Towards the Ultimate Conservative Difference Scheme V. A Second Order Sequel to Godunov's Method," J. Comp. Phys., Vol. 32, 1979.
[ValDM91] Valarezo, W.O., Dominik, C.J., and McGhee, R.J., "Multi-Element Airfoil Optimization for Maximum Lift at High Reynolds Numbers," AIAA Paper 91-3332, September 1991.
[VankD92] Vankeirsbilck, P., and Deconinck, H., "Higher Order Upwind Finite Volume Schemes with ENO Properties for General Unstructured Meshes," AGARD Report R-787, May 1992.
[VenSB91] Venkatakrishnan, V., Simon, H.D., and Barth, T.J., "A MIMD Implementation of a Parallel Euler Solver for Unstructured Grids," NASA Ames R.C., Tech. Report RNR-91-024, 1991.
[Ven93] Venkatakrishnan, V., "On the Accuracy of Limiters and Convergence to Steady State Solutions," AIAA paper 93-0880, 1993.
[Vor07] Voronoi, G., "Nouvelles Applications des Paramètres Continus à la Théorie des Formes Quadratiques. Premier Mémoire: Sur Quelques Propriétés de Formes Quadratiques Positives Parfaites," J. Reine Angew. Math., Vol. 133, 1907, pp. 97-178.
[Wat81] Watson, D. F., "Computing the n-dimensional Delaunay Tessellation with Application to Voronoi Polytopes," The Computer Journal, Vol. 24, No. 2, 1981, pp. 167-171.
[WoodC84] Woodward, P., and Colella, P., "The Numerical Simulation of Two-Dimensional Fluid Flow with Strong Shocks," J. Comp. Phys., Vol. 54, 1984.
[XiaN92] Xia and Nicolaides, "Covolume Techniques for Anisotropic Media," Numer. Math., Vol. 61, 1992, pp. 215-234.