Constructing Trees in Parallel


8/13/2019 Constructing Trees in Parallel
Purdue University
Purdue e-Pubs
Computer Science Technical Reports, Department of Computer Science
1989

Constructing Trees in Parallel
Mikhail J. Atallah, Purdue University, [email protected]
S. R. Kosaraju
L. L. Larmore
G. L. Miller

Report Number: 89-883

Atallah, Mikhail J.; Kosaraju, S. R.; Larmore, L. L.; and Miller, G. L., "Constructing Trees in Parallel" (1989). Computer Science Technical Reports.

http://docs.lib.purdue.edu/cstech
CONSTRUCTING TREES IN PARALLEL

M. J. Atallah
S. R. Kosaraju
L. Larmore
G. L. Miller
S. H. Teng

CSD TR 883
May 1989

Constructing Trees in Parallel

M. J. Atallah*  S. R. Kosaraju†  L. L. Larmore‡  G. L. Miller§  S. H. Teng

Abstract

O(log^2 n) time, n^2/log n processor, as well as O(log n) time, n^3/log n processor, CREW deterministic parallel algorithms are presented for constructing Huffman codes from a given list of frequencies. The time can be reduced to O(log n (log log n)^2) on a CRCW model, using only n^2/(log log n)^2 processors. Also presented is an optimal O(log n) time, O(n/log n) processor EREW parallel algorithm for constructing a tree given a list of leaf depths when the depths are monotonic. An O(log^2 n) time, n/log n processor parallel algorithm is given for the general tree construction problem. We also give an O(log^2 n) time, n^2/log n processor algorithm which finds a nearly optimal binary search tree. An O(log^2 n) time, n^2.36 processor algorithm for recognizing linear context-free languages is given. A crucial ingredient in achieving those bounds is a formulation of these problems as multiplications of special matrices which we call concave matrices. The structure of these matrices makes their parallel multiplication dramatically more efficient than that of arbitrary matrices.

*Department of Computer Science, Purdue University. Supported by the Office of Naval Research under Grants N00014-84-K-0502 and N00014-86-K-0689, and the National Science Foundation under Grant DCR-8451393, with matching funds from AT&T.
†Department of Computer Science, Johns Hopkins University. Supported by the National Science Foundation through grant CCR-88-04284.
‡ICS, UC Irvine.
§School of Computer Science, CMU, and Department of Computer Science, USC. Supported by the National Science Foundation through grant CCR-87-13489.

1 Introduction

In this paper we present several new parallel algorithms. Each algorithm uses substantially fewer processors than used in previously known algorithms. The four problems considered are: the Tree Construction Problem, the Huffman Code Problem, the Linear Context-Free Language Recognition Problem, and the Optimal Binary Search Tree Problem. For these problems the computationally expensive part is finding the associated tree. We observe that these trees are not arbitrary trees, and we take advantage of their special form to decrease the number of processors.

All of the problems we consider, as well as many other problems, can be solved in sequential polynomial time using Dynamic Programming. NC algorithms for each of the problems can be obtained by parallelization of Dynamic Programming. Unfortunately, this approach yields algorithms which use O(n^6) or more processors, and an algorithm which increases the work from O(n) or O(n^2) to O(n^6) is not of much practical interest. In this paper we present several new techniques improving the processor efficiency for dynamic programming problems. For all the problems, a tree or class of trees is given implicitly and the algorithm must find one such tree.

The construction of optimal codes is a classical problem in communication. Let Σ be an alphabet. A code C = {c_1, ..., c_n} is a finite nonempty set of distinct finite strings over Σ.

such that the average word length Σ_{i=1}^{n} p_i |c_i| is minimum, where |c_i| is the length of c_i.

It is easy to see that prefix codes have the nice property that a message can be decomposed into code words in only one way: they are uniquely decipherable. It is interesting to point out that Kraft and McMillan proved that for any code which is uniquely decipherable there is always a prefix code with the same average word length [13]. In 1952, Huffman [9] gave an elegant sequential algorithm which can generate an optimal prefix code in O(n log n) time. If the probabilities are presorted then his algorithm is actually linear time [11]. Using parallel dynamic programming, Kosaraju and Teng [18], independently, gave the first NC algorithms for the Huffman Coding Problem. However, both constructions use n^6 processors. In this paper, we first show how to reduce the processor count to n^3 while using O(log n) time, by showing that we may assume that the tree associated with the prefix code is left-justified (to be defined in Section 2). The n^3 processor count arises from the fact that we are multiplying n x n matrices over a closed semiring. We reduce the processor count still further to n^2/log n by showing that, after suitable modification, the matrices which are multiplied are concave (to be defined later). The structure of these matrices makes their parallel multiplication dramatically more efficient than that of arbitrary matrices. An O(log n log log n) time, n^2/log n processor CREW algorithm is presented for multiplying them. Also given is an O((log log n)^2) time, n^2/log log n processor CRCW algorithm for multiplying two concave matrices.

The algorithm for construction of a Huffman code still uses n^2 processors, which is probably too large for practical consideration since Huffman's algorithm only takes O(n log n) sequential time. Shannon and Fano gave a code, the Shannon-Fano Code, which is only one bit off from optimal. That is, the expected length of a Shannon-Fano code word is at most one bit longer than the Huffman code word.

    The construction of the Shannon-Fano Code reduces to the following Tree Construction Problem,

We give an O(log^2 n) time, n processor PRAM parallel algorithm for the tree construction problem. In the case when l_1, ..., l_n are monotone we give an O(log n) time and n/log n processor PRAM parallel algorithm. In fact, the levels of the leaves are monotone when constructing both Huffman Codes and Shannon-Fano Codes.

Using our solution of the tree construction problem we get an O(log n) time, n/log n processor EREW PRAM algorithm for constructing Shannon-Fano Codes.

We also consider the problem of constructing optimal binary search trees, studied by Knuth [10]. The best known NC algorithm for this problem is the parallelization of dynamic programming, which uses n^6 processors. Using the new concave matrix multiplication, we show how to compute a nearly optimal binary search tree in O(log^2 n) time using n^2/log n processors. These search trees are only n^{-k} off from optimal for any fixed k.

Finally, we consider recognition of linear context-free languages. A CFL is said to be linear if all productions are of the form A → bB, A → Bb, or A → b, where A and B are nonterminal variables and b is a terminal variable. It is well known [17] that the general CFL recognition problem can be performed on a CRCW PRAM with n^6 processors, again by parallelization of dynamic programming. By observing that the parse tree of a linear context-free language is of very restricted form, we construct an O(n^3) processor, O(log^2 n) time PRAM algorithm for it. Using the fact that the problem reduces to Boolean matrix multiplication, we reduce the processor count to n^2.36.

2 Preliminaries

Throughout this paper a tree will be rooted and ordered: the children of each node are ordered from left to right. The level of a node in a tree is its distance from the root.

then T_u is complete at level l, where T_u and T_v denote the subtrees rooted at u and v, respectively. Right-justified trees can be defined similarly.

Let RAKE be an operation that removes all leaves from a tree. We shall consider a restricted form of RAKE where a leaf is removed only when its siblings are also leaves.

Proposition 2.1 The set of left-justified trees (right-justified trees) is closed under the RAKE operation.

The values of all H_{i,j}, including the output value H_{1,n}, may be obtained by the following algorithm, which simulates the RAKE:

1. Estimate H_{i,j} to be 0 if i = j, +∞ otherwise.
2. Iterate this step until all H_{i,j} converge: use relation (1) to re-estimate H_{i,j} from the values of H obtained during the previous estimation step.

Lemma 2.1 For any left-justified tree T of n vertices, ⌈log_2 n⌉ applications of RAKE will reduce T to a single chain. Moreover, the resulting chain comes from the leftmost path in T.

Proof: We need only show that any vertex not on the leftmost path of T is removed by ⌈log n⌉ iterations of RAKE.

Let v be a vertex of T not on the leftmost path, and let h be the height of v, the maximum length of a path from v to a leaf in T_v. Since T is left-justified, there exists a vertex u of T at the same level as v and to the left of v, and since T is not empty at level h, T_u is complete at level h and hence has at least 2^h leaves. Since T has n leaves altogether, h < log n. Each RAKE decreases the height of every non-empty subtree by 1, thus log n iterations of RAKE completely eliminate T_v. □

Let the height of a tree T be the height of its root.
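As a concrete illustration (our sketch, not the paper's), the restricted RAKE can be simulated sequentially on an adjacency-dict representation; the function name and encoding are ours:

```python
# Sequential sketch of the restricted RAKE: a leaf is removed only when
# all of its siblings are also leaves.

def rake(children):
    """children: dict mapping each node to the list of its children
    (a leaf maps to the empty list)."""
    removable = set()
    for node, kids in children.items():
        if kids and all(not children[k] for k in kids):
            removable.update(kids)          # every child of `node` is a leaf
    return {n: [k for k in kids if k not in removable]
            for n, kids in children.items() if n not in removable}

# A complete (hence left-justified) binary tree on 7 nodes shrinks to its
# root after two RAKE rounds.
t = {0: [1, 2], 1: [3, 4], 2: [5, 6], 3: [], 4: [], 5: [], 6: []}
t = rake(rake(t))
assert list(t) == [0]
```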

3. Output the value of H_{1,n}.

Each iteration of the second step can be performed in O(log n) time using n^3/log n processors when the CRCW PRAM model of computation is used. In general, the best upper bound on the number of iterations needed is O(n), since each iteration simulates one RAKE operation.

The algorithm can be improved by adding a step which simulates the COMPRESS operation as well. The COMPRESS operation halves the height of a tree by doubling. For any 1 ≤ i ≤ j ≤ n, define F_{i,j} to be the minimum average word length of a tree for (p_1, ..., p_j), where the only trees considered are those which contain a subtree which is a tree for (p_i, ..., p_j). Given the values of all H_{i,j}, the F_{i,j} can be defined by the following:

3 Parallel Tree Contraction and Dynamic Programming

Corollary 2.1 If T is a left-justified tree of n vertices, then for all v not on the leftmost path of T, the height of T_v is bounded by ⌈log n⌉.

1. For 1 ≤ i ≤ j ≤ n, estimate H_{i,j} to be 0 if i = j, +∞ otherwise.
2. Iterate this step ⌈log n⌉ times: for 1 ≤ i ≤ j ≤ n, re-estimate H_{i,j} using relation (1).

We now describe the modified algorithm, which makes use of relations (1) and (2), and consists of log n iterations of RAKE followed by log n iterations of COMPRESS:

In this section, we present a parallel algorithm for finding an optimal Huffman tree for a given monotonic frequency vector (p_1, ..., p_n). The general Huffman Coding Problem reduces to this monotone case by presorting the frequencies.

5. Output the value F_{1,n}, which will be the minimum average word length of any Huffman code.

Intuitively, each re-estimation of the values of H simulates one RAKE step, while each re-estimation of the values of F simulates one COMPRESS step. Correctness of the algorithm follows from the fact that any left-justified binary tree can be reduced to the empty tree by ⌈log n⌉ iterations of RAKE followed by ⌈log n⌉ iterations of COMPRESS, and

Lemma 3.1 For each monotonically increasing frequency vector (p_1, ..., p_n), there is an optimal positional tree (Huffman tree) that is left-justified.

Proof: This lemma can be proven by a simple induction on n. In fact, the procedure given in the proof of Lemma 3.1 of Teng [18] transforms any Huffman tree to a left-justified one. □

Theorem 3.1 The Huffman Coding Problem can be solved in O(log n) time, using O(n^3/log n) processors on a CRCW PRAM.

4 Multiplication of Concave Matrices

In this section we introduce a new subclass of matrices which we call concave matrices. A concave matrix is a rectangular matrix M which satisfies the quadrangle condition [19], that is, M_{i,j} + M_{k,l} ≤ M_{i,l} + M_{k,j} for all i < k, j < l in the range of the indices of M.

Matrix multiplication shall be defined over the closed semiring (min, +), where the domain is the set of rational numbers extended with +∞. For example, if M is the n x n matrix giving the weights of the edges of a complete digraph of size n, then M^k is the matrix giving the minimum weight of any path of length exactly k between any given pair of vertices.

We give a recursive concave matrix multiplication algorithm which takes O(log n log log n) time, using n^2/log n processors on a CREW machine, and O((log log n)^2) time, using n^2/log log n processors on a CRCW machine. Our algorithm is very simple.

4.1 The Matrix Cut(A, B)

Let A be a concave matrix of size p x q and B a concave matrix of size q x r. By the definition of multiplication above, (AB)_{i,j} = min_k {A_{i,k} + B_{k,j}}. We can define a matrix Cut(A, B) with entries in {1, ..., q} as follows: Cut(A, B)_{i,j} = that k such that A_{i,k} + B_{k,j} is minimized. If there is more than one value of k for which that sum is minimized, take the smallest.

To compute AB it clearly suffices to compute Cut(A, B), since we can construct AB from Cut(A, B) in O(1) time using pr processors. Below, we just indicate how to compute Cut(A, B).

Define A_even to be the submatrix of A consisting of all the entries of A whose row index is even; by an abuse of notation we define B_even to be the submatrix of B consisting of all entries of B whose column index is even.

MULTIPLICATION ALGORITHM
Procedure: Cut(A, B)
if A has just one row, or if B has just one column, then
    compute Cut(A, B) by examining all choices
else
    compute Cut(A_even, B_even) by recursion
    compute Cut(A_even, B) by interpolation
    compute Cut(A, B_even) by interpolation
    compute Cut(A, B) by interpolation

Interpolation: The concavity property of A guarantees the following inequality: Cut(A, B)_{i,j} ≤ Cut(A, B)_{i+1,j}, while concavity of B guarantees a similar inequality: Cut(A, B)_{i,j} ≤ Cut(A, B)_{i,j+1}. The combination of these two properties means that the total

number of comparisons needed (for the fixed value of j) is thus only q - 1. For all j together, (q - 1)r are enough. Similarly, monotonicity allows us to compute Cut(A_even, B) given Cut(A_even, B_even) using at most p(q - 1)/2 comparisons, and Cut(A, B) given Cut(A_even, B) and Cut(A, B_even) using at most qr/2 operations.
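The quadrangle condition that defines concave matrices can be checked directly; this brute-force checker is our illustration (O(p^2 q^2), purely for pinning down the definition):

```python
def is_concave(M):
    """Check the quadrangle condition M[i][j] + M[k][l] <= M[i][l] + M[k][j]
    for all i < k and j < l (the paper's definition of a concave matrix)."""
    p, q = len(M), len(M[0])
    return all(M[i][j] + M[k][l] <= M[i][l] + M[k][j]
               for i in range(p) for k in range(i + 1, p)
               for j in range(q) for l in range(j + 1, q))

# M[i][j] = -(i*j) satisfies the condition; M[i][j] = i*j does not.
assert is_concave([[-(i * j) for j in range(4)] for i in range(4)])
assert not is_concave([[i * j for j in range(4)] for i in range(4)])
```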

Time and Work Analysis: Except for the recursion, the time to execute the multiplication algorithm is O(log q) or O(log log q) on a CREW and CRCW machine, respectively. Since the depth of the recursion is min{log p, log r}, the total time is O(log q (min{log p, log r})) on a CREW machine and O(log log q (min{log p, log r})) on a CRCW machine.

4.2 More Efficient Concave Matrix Multiplication Algorithm

In the MULTIPLICATION ALGORITHM given in the above subsection, the size of the matrices gets smaller during the recursion, while the number of processors available is still n^2. Hence at a certain stage, we have enough processors to run the general matrix multiplication algorithm to compute the Cut matrix in one step. This implies that we can stop the recursion whenever the sizes of the matrices are small enough to run the general matrix multiplication algorithm. Thus, the parallel (concave matrix) multiplication algorithm can be sped up.

For each integer m, let A mod m be the submatrix of A consisting of all the entries of A whose row index is a multiple of m, while (by an abuse of notation) let B mod m be the submatrix of B consisting of all the entries of B whose column index is a multiple of m. Clearly, Cut(A mod ⌊√n⌋, B mod ⌊√n⌋) requires n^2 comparisons, and can be computed in O(log n) time and O(log log n) time on a CREW machine and a CRCW machine, respectively.

The following is a bottom-up procedure for computing Cut(A, B):

for m = 1 to ⌈log log n⌉ + 1 do

Note that in Step (2), each row requires at most n√n comparisons; since there are √n rows, n^2 comparisons suffice. Similarly, Step (3) takes n^2 comparisons. For m > 1, by monotonicity, each row (column) takes n^{1/2^m} comparisons, and there are n^{1-1/2^{m-1}} · n^{1/2^m} rows (columns), so n^2 comparisons overall are sufficient.

Therefore, the algorithm takes O(log n log log n) time, using n^2/log n processors on a CREW machine, or O((log log n)^2) time, using n^2/log log n processors on a CRCW machine.

5 The Parallel Generation of Huffman Codes

Presented in this section is an efficient parallel algorithm for the Huffman Coding Problem. The algorithm runs in O(log^2 n) time, using n^2/log n processors on a CRCW PRAM. This algorithm improves upon the previously known NC algorithms in the processor count. It hinges on the concave matrix multiplication.

By Lemma 3.1, for each nondecreasing (p_1, ..., p_n), there exists an optimal Huffman tree for (p_1, ..., p_n) which is left-justified. By Corollary 2.1, it follows that there exists an optimal tree for (p_1, ..., p_n) such that the heights of the subtrees induced by nodes not on the leftmost path are bounded by ⌈log n⌉.

This observation suggests the following two-phase algorithm for the Huffman Coding Problem.

1. Constructing Height-Bounded Trees: for all i, j, compute H_{i,j}, an optimal tree for (p_i, ..., p_j) whose height is bounded by ⌈log n⌉.

2. Constructing the Optimal Tree: using the information provided in the first phase, construct an optimal Huffman tree for (p_1, ..., p_n).

Assume that the weights p_1, ..., p_n are given in monotonically increasing order. Define S[i, j] = Σ_{l=i}^{j} p_l for i ≤ j.

Theorem 5.1 The Huffman Coding Problem can be solved in O(log^2 n) time, using n^2/log n processors.

Think of M as a digraph derived by adding a self-loop of weight 0 at each vertex; one can verify that M satisfies the quadrangle condition, so M is a concave matrix.

Note that any path of length k to j in M corresponds to a path of length at most k from j to i in M.

The left edge of the optimal Huffman tree has length less than n; therefore (M^k)_{1,n} equals the weighted path length of the optimal tree for any k > n.

Note that since M is a concave matrix, M^r is also a concave matrix. Hence M^n can be computed by starting with M and squaring ⌈log n⌉ times. Using the parallel concave matrix multiplication algorithm, each squaring can be done in O(log n) time, using n^2/log n processors.
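The repeated-squaring step can be illustrated sequentially in plain (min,+) arithmetic; this small sketch (our names, no concavity speed-up) shows why ⌈log n⌉ squarings suffice once 0-weight self-loops are present:

```python
INF = float("inf")

def min_plus_square(M):
    """One (min,+) squaring: entry (i, j) becomes min_k M[i][k] + M[k][j]."""
    n = len(M)
    return [[min(M[i][k] + M[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def shortest_paths(M, rounds):
    """After ceil(log2 n) squarings, entry (i, j) holds the minimum weight of
    any path from i to j; the 0-weight self-loops let shorter paths survive."""
    for _ in range(rounds):
        M = min_plus_square(M)
    return M

# Tiny digraph 0 -> 1 -> 2 with 0-weight self-loops on the diagonal.
M = [[0, 1, INF],
     [INF, 0, 1],
     [INF, INF, 0]]
assert shortest_paths(M, 2)[0][2] == 2
```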

6 Constructing Almost Optimal Binary Search Trees in Parallel

In this section, the parallel construction of binary search trees, an important data structure for table maintenance and information retrieval, is considered. An O(log^2 n) time, n^2/log^2 n processor parallel algorithm is given for constructing an almost optimal binary search tree whose weighted path length is within ε of the optimal, where ε = n^{-k}. Note that the best known sequential optimal search tree construction algorithm, due to Knuth, takes O(n^2) time, so our algorithm is work-optimal up to approximately a polylogarithmic factor. The algorithm too hinges on the judicious use of concave matrix multiplication.

The sequential version of the optimal binary search tree problem was first studied by Knuth [10], who used monotonicity to give an O(n^2) time algorithm.

Theorem 6.1 For any 0 < ε ...

... we must store a as a linked list of nonzero entries. For simplicity of the exposition assume that m ≤ n. We first show how to reduce a to a vector a' such that a'_i ≤ 2 for 1 ≤ i ≤ n. We compute a' from a by setting a'_i = ⌊a_i/2⌋ + (a_{i-1} mod 2). It follows by the Kraft sum that the tree for a exists iff it does for a'. Further, from the tree for a' we can in unit time construct one for a. This reduction from a to a' is very analogous to the RAKE in the Huffman code algorithm for left-justified trees. We shall apply this reduction until all a_i ≤ 2, at most O(log n) times. To see that we only need n/log n processors for the log n reductions, observe that the total work is O(Σ_{a_i > 2} log a_i) ≤ n. To balance the work, any processor that computes a'_i at a given stage will be required to compute a'_i at the next stage. Thus we distribute the a_i > 2 based on the work of a_i, which is log a_i. Constructing a tree for a where all a_i ≤ 2 reduces to computing the sum of two n-bit numbers and their intermediate carries. This can all be done optimally using prefix sums.

This gives the following theorem:

Theorem 7.1 Trees with monotone leaf patterns can be constructed in O(log n) time, using n/log n processors on an EREW PRAM.
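A sequential sketch of what the theorem's tree looks like: for monotone nondecreasing leaf depths, the Kraft sum decides feasibility, and a running counter yields one valid set of codewords (a canonical-code construction; our simplification, not the parallel algorithm):

```python
def canonical_code(depths):
    """Given monotone nondecreasing leaf depths, return a prefix code with
    one codeword per depth if the Kraft sum allows it, else None."""
    if sum(2.0 ** -d for d in depths) > 1.0:
        return None                       # Kraft inequality violated: no tree
    code, c, prev = [], 0, 0
    for d in depths:
        c <<= (d - prev)                  # extend the running codeword to depth d
        code.append(format(c, "0{}b".format(d)) if d else "")
        c += 1
        prev = d
    return code

assert canonical_code([1, 2, 3, 3]) == ["0", "10", "110", "111"]
assert canonical_code([1, 1, 2]) is None   # Kraft sum 1.25 > 1
```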

A pattern (l_1, ..., l_n) is a bitonic pattern if there exists i such that (l_1, ..., l_i) is monotone increasing and (l_i, ..., l_n) is monotone decreasing.

Lemma 7.2 The tree construction problem for a bitonic leaf pattern (l_1, ..., l_n) has a solution iff Σ_{j=1}^{n} 2^{-l_j} ≤ 1.

Using the methods presented for monotone leaf patterns and the above lemma we get the following theorem:

The theorem reduces a general leaf pattern to one with bitonic patterns. Moreover, the reduction can be performed with n/log n processors. Therefore, an n/log n processor EREW PRAM algorithm results.

A segment representation of a pattern is ((l_1, n_1), ..., (l_m, n_m)), where the pattern consists of n_1 leaves at level l_1, followed by n_2 leaves at level l_2, and so on.

For simplicity, we write (l_1, n_1), ..., (l_m, n_m) as a pattern. In a pattern (l_1, n_1), ..., (l_m, n_m), i is a min-point if l_{i-1} > l_i < l_{i+1}, and i is a max-point if l_{i-1} < l_i > l_{i+1}.

In a pattern (l_1, n_1), ..., (l_m, n_m), a segment l_i, ..., l_j is a right-finger if (1) i - 1 is a min-point and (2) no k with i ≤ k ≤ j is a min-point. A left-finger is defined similarly, except that j + 1 is a min-point. Note that a finger may be both a left and a right finger. We next show how to remove a finger from a leaf pattern using the tree construction for bitonic patterns. Finally, we observe that the resulting pattern will have at most half as many fingers. Thus we need only remove fingers O(log m) times.

Finger-Reduction applied to one finger in a pattern (l_1, n_1), ..., (l_m, n_m) is defined as follows. Without loss of generality assume that the finger is a right-finger and l_{i+1} < l_{i-1}. Set

Finger-Reduction returns the pattern

We have just replaced a finger with the number (Lemma 7.2) of leaves at level l_{i-1} that can generate it. In general Finger-Reduction simultaneously removes all fingers, both left and right. It will return with a pattern.

To see that Finger-Reduction reduces the number of fingers by at least one half, observe

a pattern (l_1, ..., l_n), then the tree construction problem with pattern (l_1, ..., l_n) has a solution iff there is a solution to the tree construction problem with the reduced pattern.

To obtain the tree for a pattern we apply Finger-Reduction until the pattern is reduced to a single finger. We then construct the root tree for the finger. In an expansion phase we attach the trees constructed while removing the fingers during Finger-Reduction to the root tree.

Theorem 7.3 A tree can be constructed for a pattern (l_1, ..., l_n) with m fingers in O(log n log m) time, using n/log n processors.

7.3 Constructing Approximate Optimal Trees

The Shannon-Fano coding method can be specified as: upon input (p_1, ..., p_n), compute (l_1, ..., l_n) such that log(1/p_i) ≤ l_i ≤ log(1/p_i) + 1, then construct a prefix code C = (c_1, ..., c_n) such that |c_i| = l_i.

The proof of the following claim can be found in [8].

Claim 7.1 Let SF(A) be the average word-length of the Shannon-Fano code of A = {a_1, ..., a_n} and HUFF(A) be that of the Huffman code; then

HUFF(A) ≤ SF(A) ≤ HUFF(A) + 1.

The second part of the Shannon-Fano method can be implemented by the parallel tree construction algorithm presented in Section 7.1. Therefore,

Theorem 7.4 In O(log n) time, using n/log n processors, a prefix code can be constructed with average word length bounded by that of the corresponding Huffman code plus one.

8 Parallel Linear Context-free Language Recognition

Definition 8.1 (Context-free Language) A context-free grammar is a 4-tuple G = (V, Σ, P, S) where:

V is a finite nonempty set called the total vocabulary; Σ ⊂ V is a finite nonempty set called the terminal alphabet; N = V - Σ is the set of nonterminals; S ∈ N is called the start symbol; P is a finite set of rules of the form A → α, where A ∈ N and α ∈ V*.

A context-free grammar G = (V, Σ, P, S) is linear if each rule is of the form A → uBv or A → u, where B ∈ N and u, v ∈ Σ*.

Let G = (V, Σ, P, S) be a context-free grammar, and let w, w' ∈ V*. w is said to directly derive w', written w ⇒ w', if there exist α, β ∈ V* and a rule A → γ ∈ P such that w = αAβ and w' = αγβ. ⇒* stands for the reflexive-transitive closure of ⇒.

The language generated by G, written L(G), is {w ∈ Σ* | S ⇒* w}.

The CFL-recognition problem is: given a context-free grammar G and a finite string w = w_1 ... w_n ∈ Σ*, decide whether w ∈ L(G) (and produce a parse tree).

Each linear context-free grammar G_1 = (V_1, Σ, P_1, S) can be normalized by constructing another linear context-free grammar G = (V, Σ, P, S) with L(G) = L(G_1), where P is a finite set of rules of the form

A → bB or A → Bc or A → a, where a, b, c ∈ Σ and A, B, C ∈ N.

A linear context-free grammar is normalized by finding such a G.

The induced graph IG(G, w) has vertex set {v_{i,j,p} | 1 ≤ i ≤ j ≤ n, p ∈ N} and edge set E_l ∪ E_r, where E_l = {(v_{i,j,p}, v_{i,j-1,q}) | i < j, p → q w_j ∈ P} and E_r = {(v_{i,j,p}, v_{i+1,j,q}) | i < j, p → w_i q ∈ P}.

approximately equal components (in Figure 3 the triangle is meant to depict IG(G, w) or its collapsed version). Moreover, such a separator can be uniformly found in each component recursively.

We have the following observation.

Claim 8.1 Let G be a linear context-free grammar and w ∈ Σ*. Let IG(G, w) be the induced graph of G and w. Then w ∈ L(G) iff there exists a path in IG(G, w) from v_{1,n,S} to v_{i,i,q} for some i, 1 ≤ i ≤ n, where q → w_i ∈ P.

The above observation reduces the linear context-free recognition problem to a path problem (reachability problem) in the induced graph IG(G, w). Let cluster (i, j) refer to the set of vertices of the form v_{i,j,p} (see Figure 1). Note that if all vertices of each cluster (i, j) are collapsed into one vertex (call it v_{i,j}) then a planar grid graph is obtained (see Figure 2a), which we schematically draw as a triangle (see Figure 2b). Although IG(G, w) itself is typically not planar, we shall take the liberty of talking about its external face to refer to the subset of its nodes that map into the external face of the collapsed version.
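The reachability formulation can be mimicked sequentially by a dynamic program over spans (i, j); this O(n^2 |G|) sketch uses our own encoding of normalized rules and is not the paper's parallel separator-based algorithm:

```python
def recognize_linear(w, rules, term_rules, start):
    """Span DP for a normalized linear CFG (sequential sketch; names ours).
    rules[A] lists ('L', b, B) for A -> bB and ('R', B, c) for A -> Bc;
    term_rules[A] is the set of terminals a with A -> a."""
    n = len(w)
    nonterms = set(rules) | set(term_rules)
    derives = {}                          # derives[(A, i, j)]: A =>* w[i..j] ?
    for span in range(1, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for A in nonterms:
                ok = span == 1 and w[i] in term_rules.get(A, set())
                for kind, x, y in rules.get(A, []):
                    if kind == 'L' and span > 1 and w[i] == x:
                        ok = ok or derives[(y, i + 1, j)]   # consume w[i] on left
                    if kind == 'R' and span > 1 and w[j] == y:
                        ok = ok or derives[(x, i, j - 1)]   # consume w[j] on right
                derives[(A, i, j)] = ok
    return derives.get((start, 0, n - 1), False)

# L = { a^k b a^k }: S -> a A | b, A -> S a  (already in normalized form).
rules = {'S': [('L', 'a', 'A')], 'A': [('R', 'S', 'a')]}
terms = {'S': {'b'}}
assert recognize_linear("aabaa", rules, terms, 'S')
assert not recognize_linear("aaba", rules, terms, 'S')
```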

Figure 1: In IG(G, w) the only edges leaving cluster (i, j) go to clusters (i, j - 1) and (i + 1, j).

Figure 3: Illustrating the small separator of IG(G, w).

The outline of the parallel algorithm is now clear. Let U, M, L, R be the four pieces induced by the separator (see Figure 3). The reachability matrix Reach_U between all pairs of vertices on the external face of U is recursively computed. The same is done for each of M, L, R, yielding the matrices Reach_M, Reach_L, Reach_R, respectively. From the four boolean matrices returned by the recursive calls, the reachability matrix between all pairs of vertices on the external face is computed. This can be done with a constant number of boolean matrix multiplications (actually three multiplications), taking time O(log n) with M(n) processors. It is known that M(n) = O(n^{2+ε}); this determines the time complexity of the algorithm. The processor count can be obtained by the recurrence:

P(n) = max(4 P(n/2), M(n)),

which implies P(n) = O(M(n)).

Theorem 8.1 Linear context-free languages can be recognized in O(log^2 n) time, using M(n) processors, where M(n) is the number of processors needed to boolean matrix multiply in O(log n) time.

    9 Open Questions

Can a linear context-free language be recognized in polylogarithmic time, using n^2 (or o(M(n))) processors?

Can the Optimal Binary Search Tree Construction Problem be solved in polylogarithmic time using fewer than O(n^6) processors?

Acknowledgements. We would like to thank Manuela Veloso of CMU for carefully reading drafts of the paper and for many helpful comments. We would also like to thank Alok Aggarwal for helpful discussions.

References

[1] A. Apostolico, M. J. Atallah, L. L. Larmore, and H. S. McFaddin. Efficient parallel algorithms for string editing and related problems. In Proc. 26th Annual Allerton Conf. on Communication, Control, and Computing, Monticello, Illinois, pages 253-263, September 1988.

[2] A. Aggarwal and J. Park. Notes on searching in multidimensional monotone arrays. In 29th Annual Symposium on Foundations of Computer Science, IEEE, 1988.

[3] A. Aho, J. Hopcroft, and J. Ullman. The Design and Analysis of Computer Algorithms. Addison-Wesley, 1974.

[4] R. Cole. Parallel merge sort. In FOCS 27, pages 511-516, IEEE, Toronto, October 1987.

[5] S. Even. Graph Algorithms. Computer Science Press, Potomac, Maryland, 1979.

[6] M. R. Garey. Optimal binary search trees with restricted maximal depth. SIAM Journal on Computing, 3:101-110, 1974.

[7] R. Güttler, K. Mehlhorn, and W. Schneider. Binary search trees: average and worst case behavior.

[9] D. A. Huffman. A method for the construction of minimum-redundancy codes. Proceedings of the IRE, 40:1098-1101, 1952.

[10] D. E. Knuth. Optimum binary search trees. Acta Informatica, 1:14-25, 1971.

[11] L. L. Larmore. Height restricted optimal binary trees. SIAM Journal on Computing, 16:1115-1123, 1987.

[12] L. L. Larmore. A subquadratic algorithm for constructing approximately optimal binary search trees. Journal of Algorithms, 8:579-591, 1987.

[13] B. McMillan. Two inequalities implied by unique decipherability. IRE Transactions on Information Theory, 185-189, 1956.

[14] G. L. Miller and J. H. Reif. Parallel tree contraction and its applications. In 26th Symposium on Foundations of Computer Science, IEEE, Portland, Oregon, 1985.

[15] G. L. Miller and S.-H. Teng. Systematic methods for tree based parallel algorithm development. In Second International Conference on Supercomputing, pages 392-403, Santa Clara, 1987.

[16] V. Pan and J. Reif. Fast and efficient parallel solution of linear systems. SIAM Journal on Computing, to appear, 1988.

[17] W. L. Ruzzo. On uniform circuit complexity. Journal of Computer and System Sciences, June 1981.

[18] S.-H. Teng. The construction of Huffman-equivalent prefix code in NC. SIGACT News, 18(4):54-61, 1987.

[19] F. F. Yao. Efficient dynamic programming using the quadrangle inequalities. In Proceedings of the 12th Annual ACM Symposium on Theory of Computing, pages 429-435, ACM, 1980.

Figures: clusters (i, j) of IG(G, w) and the collapsed triangular grid (Figure 2); the separator of IG(G, w) (Figure 3). Diagrams not recoverable from this scan.