Proceedings of the Workshop on Algorithmic Aspects of Advanced Programming Languages WAAAPL’99 Paris, France September 30, 1999 Chris Okasaki, Editor Department of Computer Science Columbia University Technical Report CUCS–023–99


Proceedings of the

Workshop on Algorithmic Aspects of Advanced Programming Languages

WAAAPL’99

Paris, France

September 30, 1999

Chris Okasaki, Editor

Department of Computer Science, Columbia University

Technical Report CUCS–023–99

Foreword

The first Workshop on Algorithmic Aspects of Advanced Programming Languages was held on September 30, 1999, in Paris, France, in conjunction with the PLI’99 conferences and workshops.

The choice of programming language has a huge effect on the algorithms and data structures that are to be implemented in that language. Traditionally, algorithms and data structures have been studied in the context of imperative languages. This workshop considers the algorithmic implications of choosing an advanced functional or logic programming language instead.

A total of eight papers were selected for presentation at the workshop, together with an invited lecture by Robert Harper.

We would like to thank Didier Rémy, general chair of PLI’99, for his assistance in organizing this workshop.

Program Chair:
Chris Okasaki, Columbia University, USA

Program Committee:
Gerth Stølting Brodal, BRICS, University of Aarhus, Denmark
Adam Buchsbaum, AT&T Labs Research, USA
Iliano Cervesato, Stanford University, USA
Ralf Hinze, Rheinische Friedrich-Wilhelms-Universität Bonn, Germany
John O’Donnell, University of Glasgow, Scotland
Ricardo Peña, Universidad Complutense de Madrid, Spain

1999 Workshop on Algorithmic Aspects of Advanced Programming Languages

Paris, France, September 30, 1999

Invited Talk: 9:45–10:45

Parallel Scientific Computing: The CMU PSciCo Project
Robert Harper (Carnegie Mellon University, USA)

Session 1: 11:15–12:45

Manufacturing Datatypes
Ralf Hinze (Universität Bonn, Germany)

Dependently Typed Data Structures
Hongwei Xi (Oregon Graduate Institute, USA)

Teaching Monadic Algorithms to First-Year Students
Ricardo Peña, Yolanda Ortega, and Fernando Rubio (Universidad Complutense de Madrid, Spain)

Session 2: 14:30–16:00

Modular Lazy Search for Constraint Satisfaction Problems
Thomas Nordin (Oregon Graduate Institute, USA) and Andrew Tolmach (Portland State University, USA)

Persistent Triangulations
Guy Blelloch, Hal Burch, Karl Crary, Robert Harper, Gary Miller, and Noel Walkington (Carnegie Mellon University, USA)

An Algebraic Dynamic Programming Approach to the Analysis of Recombinant DNA Sequences
Robert Giegerich (Universität Bielefeld, Germany), Stefan Kurtz (Universität Bielefeld, Germany), and Georg Weiller (Australian National University, Australia)

Session 3: 16:30–17:30

Constructing Red-Black Trees
Ralf Hinze (Universität Bonn, Germany)

An Experimental Study of Compression Methods for Functional Tries
Jukka-Pekka Iivonen (Nokia Telecommunications, Finland), Stefan Nilsson (Royal Institute of Technology, Sweden), and Matti Tikkanen (Nokia Telecommunications, Finland)

Manufacturing Datatypes

Ralf Hinze

Institut für Informatik III, Universität Bonn
Römerstraße 164, 53117 Bonn, Germany

Abstract

This paper describes a general framework for designing purely functional datatypes that automatically satisfy given size or structural constraints. Using the framework we develop implementations of different matrix types (eg square matrices) and implementations of several tree types (eg Braun trees, 2-3 trees). Consider, for instance, representing square $n \times n$ matrices. The usual representation using lists of lists fails to meet the structural constraints: there is no way to ensure that the outer list and the inner lists have the same length. The main idea of our approach is to solve in a first step a related, but simpler problem, namely to generate the multiset of all square numbers. In order to describe this multiset we employ recursion equations involving finite multisets, multiset union, addition and multiplication lifted to multisets. In a second step we mechanically derive datatype definitions from these recursion equations, which enforce the ‘squareness’ constraint. The transformation makes essential use of polymorphic types.

1 Introduction

Many information structures are defined by certain size or structural constraints. Take, for instance, the class of perfectly balanced, binary leaf trees [10] (perfect leaf trees for short): a perfect leaf tree of height $0$ is a leaf and a perfect leaf tree of height $h + 1$ is a node with two children, each of which is a perfect leaf tree of height $h$. How can we represent perfect leaf trees of arbitrary height such that the structural constraints are enforced? The usual recursive representation of binary leaf trees is apparently not very helpful since there is no way to ensure that the children of a node have the same height. As another example, consider square $n \times n$ matrices [14]. How do we represent square matrices such that the matrices are actually square? Again, the standard representation using lists of lists fails to meet the constraints: the outer list and the inner lists do not necessarily have the same length. In this paper, we present a framework that allows us to design representations of perfect leaf trees, square matrices, and many other information structures that automatically satisfy the given size or structural constraints.

Let us illustrate the main ideas by means of examples. As a first example, we will devise a representation of Toeplitz matrices [6], where a Toeplitz matrix is an $n \times n$ matrix $(a_{ij})$ such that $a_{ij} = a_{i-1,j-1}$ for $1 < i, j \le n$. Clearly, to represent a Toeplitz matrix of size $n + 1$ it suffices to store $2 * n + 1$ elements. Now, instead of designing a representation from scratch we first solve a related, but apparently simpler problem, namely, to generate the set of all odd numbers. Actually, we will work with multisets instead of sets for reasons to be explained later. In order to describe multisets of natural numbers we employ systems of recursion equations. The following system, for instance, specifies the multiset of all odd numbers, ie the multiset which contains one occurrence of each odd number.

$$\mathit{odd} = \lbag 1 \rbag \uplus \lbag 2 \rbag + \mathit{odd}$$

Here, $\lbag n \rbag$ denotes the singleton multiset that contains $n$ exactly once, $(\uplus)$ denotes multiset union and $(+)$ is addition lifted to multisets: $A + B = \lbag a + b \mid a \leftarrow A;\ b \leftarrow B \rbag$. We agree that $(+)$ binds more tightly than $(\uplus)$. Now, how can we turn the equation into a sensible datatype definition for Toeplitz matrices? The first thing to note is that we are actually looking for a datatype that is parameterized by the type of matrix elements. Such a type is also known as a type constructor or as a functor¹. An element of a parameterized type is called a container. The equation above has the following counterpart in the world of functors.

$$\mathit{Odd} = \mathit{Id} \mid (\mathit{Id} \times \mathit{Id}) \times \mathit{Odd}$$

Here, $\mathit{Id}$ is the identity functor given by $\mathit{Id}\ a = a$. Furthermore, $(\mid)$ and $(\times)$ denote disjoint sums and products lifted to functors, ie $(F_1 \mid F_2)\ a = F_1\ a \mid F_2\ a$ and $(F_1 \times F_2)\ a = F_1\ a \times F_2\ a$. Comparing the two equations we see that $\lbag 1 \rbag$ corresponds to $\mathit{Id}$, $(\uplus)$ corresponds to $(\mid)$, and $(+)$ corresponds to $(\times)$. This immediately implies that $\mathit{Id} \times \mathit{Id}$ corresponds to $\lbag 1 \rbag + \lbag 1 \rbag = \lbag 2 \rbag$. The relationship is very tight: the functor corresponding to a multiset $M$ contains, for each member of $M$, a container of that size. For instance, $\mathit{Id} \times \mathit{Id}$ corresponds to $\lbag 1 \rbag + \lbag 1 \rbag = \lbag 2 \rbag$ as it contains one container of size 2; $\mathit{Id} \mid \mathit{Id} \times \mathit{Id}$ corresponds to $\lbag 1 \rbag \uplus \lbag 1 \rbag + \lbag 1 \rbag = \lbag 1, 2 \rbag$ as it contains one container of size 1 and another one of size 2.

Functor equations are written in a compositional style. To derive a datatype declaration from a functor equation we simply rewrite it into an applicative form, additionally adding constructor names and possibly making cosmetic changes.²

    data Toeplitz a = Corner a | Extend a a (Toeplitz a)

The left upper corner of a Toeplitz matrix is represented by Corner a; Extend r c m extends the matrix m by an additional row and an additional column, both of which are represented by elements. For instance, the $4 \times 4$ Toeplitz matrix $(a_{ij})$ is represented by

    Extend a41 a14 (Extend a31 a13 (Extend a21 a12 (Corner a11))).

Of course, this is not the only implementation conceivable. Alternatively, we can define $\mathit{odd}$ in terms of the set of all even numbers.

$$\mathit{odd} = \lbag 1 \rbag + \mathit{even}$$
$$\mathit{even} = \lbag 0 \rbag \uplus \lbag 2 \rbag + \mathit{even}$$

As innocent as this variation may look, it has the advantage that the left upper corner can be accessed in constant time as opposed to linear time with the first representation.

    data Toeplitz a = Toeplitz a (List2 a)
    data List2 a    = Nil2 | Cons2 a a (List2 a)

Easier still, we may define $\mathit{odd}$ in terms of the natural numbers using the fact that each odd number is of the form $1 + n * 2$ for some $n$.

$$\mathit{odd} = \lbag 1 \rbag + \mathit{nat} * \lbag 2 \rbag$$
$$\mathit{nat} = \lbag 0 \rbag \uplus \lbag 1 \rbag + \mathit{nat}$$

¹ Categorically speaking, a functor must satisfy additional conditions, see [3]. All the type constructors listed in this paper are functors in the category-theoretical sense.

² Examples are given in the functional language Haskell 98 [15].

The first equation makes use of the multiplication operation, which is defined analogously to $(+)$. To which operation on functors does multiplication correspond? We will see that, under certain conditions to be spelled out later, $(*)$ corresponds to the composition of functors $(\cdot)$ given by $(F_1 \cdot F_2)\ a = F_1\ (F_2\ a)$. The functor equations derived from $\mathit{odd}$ and $\mathit{nat}$ are

$$\mathit{Odd} = \mathit{Id} \times \mathit{Nat} \cdot (\mathit{Id} \times \mathit{Id})$$
$$\mathit{Nat} = K\ \mathit{Unit} \mid \mathit{Id} \times \mathit{Nat}$$

Here, $K\ t$ denotes the constant functor given by $K\ t\ a = t$ and $\mathit{Unit}$ is the unit type containing a single element. Note that $K\ \mathit{Unit}$ corresponds to $\lbag 0 \rbag$. Unsurprisingly, $\mathit{Nat}$ models the ubiquitous datatype of polymorphic lists.

    data Toeplitz a = Toeplitz a (List (a, a))
    data List a     = Nil | Cons a (List a)

Thus, to store an even number of elements we simply use a list of pairs. This representation has the advantage that the list type can be easily replaced by a more efficient sequence type.
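As a rough illustration of this last representation, the sketch below stores the corner together with a list of pairs and reads back an arbitrary entry. It uses Haskell's built-in lists in place of the List type above. The pairing convention (the k-th pair holds $a_{k+1,1}$ and $a_{1,k+1}$) and the names `size`, `entry`, and `toRows` are assumptions made for this example and are not fixed by the text.

```haskell
-- Toeplitz matrix: the corner a11 plus one pair per additional row/column.
-- Assumed convention: the k-th pair (k = 1, 2, ...) is (a_{k+1,1}, a_{1,k+1}).
data Toeplitz a = Toeplitz a [(a, a)]

-- Number of rows (equivalently, columns).
size :: Toeplitz a -> Int
size (Toeplitz _ ps) = 1 + length ps

-- Entry in row i, column j (both 1-based and assumed to be in range);
-- in a Toeplitz matrix the value depends only on the difference i - j.
entry :: Toeplitz a -> Int -> Int -> a
entry (Toeplitz c ps) i j
  | d == 0    = c
  | d > 0     = fst (ps !! (d - 1))
  | otherwise = snd (ps !! (negate d - 1))
  where d = i - j

-- Expand into the ordinary list-of-rows form.
toRows :: Toeplitz a -> [[a]]
toRows m = [[entry m i j | j <- [1 .. n]] | i <- [1 .. n]]
  where n = size m
```

A matrix of size n + 1 built this way always carries exactly 2 * n + 1 elements, so the shape constraint holds by construction.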

Next, let us apply the technique to design a representation of perfect leaf trees. The related problem is simple: we have to generate the multiset of all powers of 2.

$$\mathit{power} = \lbag 1 \rbag \uplus \mathit{power} * \lbag 2 \rbag$$

The corresponding functor equation is

$$\mathit{Power} = \mathit{Id} \mid \mathit{Power} \cdot (\mathit{Id} \times \mathit{Id})$$

from which we can easily derive the following datatype definition.

    data Perfect a = Zero a | Succ (Perfect (a, a))

Thus, a perfect leaf tree of height $0$ is a leaf and a perfect leaf tree of height $h + 1$ is a perfect leaf tree of height $h$ whose leaves contain pairs of elements. Note that this definition proceeds bottom-up whereas the definition given in the beginning proceeds top-down. The type Perfect is an example of a so-called nested datatype [4]: the recursive call of Perfect on the right-hand side is not a copy of the declared type on the left-hand side, ie the type recursion is nested.
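To see how functions over such a nested datatype look, here is a small sketch. The datatype is the one above; the helper names `height`, `leaves`, and `example` are made up for this illustration. Because the recursive calls are at type `Perfect (a, a)`, the functions are polymorphically recursive and their type signatures cannot be omitted.

```haskell
-- Perfect leaf trees as a nested datatype (taken from the text).
data Perfect a = Zero a | Succ (Perfect (a, a))

-- The height is the number of Succ constructors.
height :: Perfect a -> Int
height (Zero _) = 0
height (Succ t) = 1 + height t

-- Leaves from left to right. The recursive call is at type Perfect (a, a),
-- so the signature is mandatory (polymorphic recursion).
leaves :: Perfect a -> [a]
leaves (Zero x) = [x]
leaves (Succ t) = concatMap (\(x, y) -> [x, y]) (leaves t)

-- A perfect leaf tree of height 2 with four leaves.
example :: Perfect Int
example = Succ (Succ (Zero ((1, 2), (3, 4))))
```

An ill-balanced tree such as `Succ (Zero 1)` is rejected by the type checker, since the argument of `Zero` under one `Succ` must be a pair.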

As the final example, let us tackle the problem of representing square matrices. We soon find that the related problem of generating the multiset of all square numbers is not quite as easy as before. One could be tempted to define $\mathit{square} = \mathit{nat} * \mathit{nat}$. However, this does not work since the resulting multiset contains products of arbitrary numbers. Incidentally, $\mathit{nat} * \mathit{nat}$ is related to $\mathit{List} \cdot \mathit{List}$, the lists of lists implementation we already rejected. We must somehow arrange that $(*)$ is only applied to singleton multisets. A trick to achieve this is to first rewrite the definition of $\mathit{nat}$ into a tail-recursive form.

$$\mathit{nat} = \mathit{nat}'\ \lbag 0 \rbag$$
$$\mathit{nat}'\ n = n \uplus \mathit{nat}'\ (\lbag 1 \rbag + n)$$

The definition of $\mathit{nat}'$ closely resembles the function from :: Int → [Int] given by from n = n : from (n + 1), which generates the infinite list of successive integers beginning with n. Now, to obtain square numbers we simply replace $n$ by $n * n$ in the second equation.

$$\mathit{square} = \mathit{square}'\ \lbag 0 \rbag$$
$$\mathit{square}'\ n = n * n \uplus \mathit{square}'\ (\lbag 1 \rbag + n)$$
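The analogy with `from` can be made concrete with a small sketch in which a multiset of naturals is modelled by a lazy list whose order is ignored. This modelling is an assumption of the example, not part of the framework. Since the parameter is always a singleton, the union with the recursive call simply becomes list construction.

```haskell
-- Multisets of naturals modelled (as an assumption) by lazy lists;
-- joining a singleton to a multiset becomes (:).

-- nat' n = n `union` nat' (1 + n)
nat' :: Integer -> [Integer]
nat' n = n : nat' (1 + n)

-- square' n = n * n `union` square' (1 + n)
square' :: Integer -> [Integer]
square' n = n * n : square' (1 + n)

-- The multisets of all naturals and of all square numbers.
nat, square :: [Integer]
nat    = nat' 0
square = square' 0
```

For instance, `take 5 square` yields the first five square numbers, each occurring exactly once.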

Using this trick we are, in fact, able to enumerate the codomain of an arbitrary polynomial. Even more interesting, this trick is applicable to other representations of sequences, as well. But, we are skipping ahead. For now, let us determine the datatypes corresponding to $\mathit{square}$ and $\mathit{square}'$. From the functor equations

$$\mathit{Square} = \mathit{Square}'\ (K\ \mathit{Unit})$$
$$\mathit{Square}'\ f = f \cdot f \mid \mathit{Square}'\ (\mathit{Id} \times f)$$

we can derive the following datatype declarations.

    type Matrix a    = Matrix' Nil a
    data Matrix' t a = Zero (t (t a)) | Succ (Matrix' (Cons t) a)
    data Nil a       = Nil
    data Cons t a    = Cons a (t a)

The type constructors Nil and Cons t correspond to $K\ \mathit{Unit}$ and $\mathit{Id} \times f$. As an aside, note that Nil and Cons are obtained by decomposing the List datatype into a base and into a recursive case. Furthermore, note that $\mathit{Square}'$ is not a functor but a higher-order functor as it takes functors to functors. Accordingly, Matrix' is a type constructor of kind $(* \to *) \to (* \to *)$. Recall that the kind system of Haskell specifies the ‘type’ of a type constructor [12]. The ‘$*$’ kind represents nullary constructors like Bool or Int. The kind $\kappa_1 \to \kappa_2$ represents type constructors that map type constructors of kind $\kappa_1$ to those of kind $\kappa_2$. Though the type of square matrices looks daunting, it is comparatively easy to construct elements of that type. Here is a square matrix of size 3.

    Succ (Succ (Succ (Zero (Cons (Cons a11 (Cons a12 (Cons a13 Nil)))
                           (Cons (Cons a21 (Cons a22 (Cons a23 Nil)))
                           (Cons (Cons a31 (Cons a32 (Cons a33 Nil)))
                            Nil))))))

Perhaps surprisingly, we have essentially a list of lists! The only difference from the standard representation is that the size of the matrix is additionally encoded into a prefix of Zero and Succ constructors. It is this prefix that takes care of the size constraints.
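The following sketch shows one way to get back from this representation to an ordinary list of rows. The datatypes are those of the text; the class `Row` and the functions `toList`, `rows'`, `rows`, and `example` are hypothetical helpers introduced only for this illustration. A type class is used because the row shape `t` changes with every `Succ`.

```haskell
data Nil a    = Nil
data Cons t a = Cons a (t a)

data Matrix' t a = Zero (t (t a)) | Succ (Matrix' (Cons t) a)
type Matrix a    = Matrix' Nil a

-- Hypothetical helper class: flatten a row-shaped container into a list.
class Row t where
  toList :: t a -> [a]

instance Row Nil where
  toList Nil = []

instance Row t => Row (Cons t) where
  toList (Cons x xs) = x : toList xs

-- Recover the rows. The recursive call is at type Matrix' (Cons t) a,
-- so the type signature is required.
rows' :: Row t => Matrix' t a -> [[a]]
rows' (Zero m) = map toList (toList m)
rows' (Succ m) = rows' m

rows :: Matrix a -> [[a]]
rows = rows'

-- A 2 x 2 matrix: two Succ constructors, then two rows of two elements.
example :: Matrix Int
example =
  Succ (Succ (Zero (Cons (Cons 1 (Cons 2 Nil))
                   (Cons (Cons 3 (Cons 4 Nil))
                    Nil))))
```

Dropping an element from one row, or adding a third row, changes the type of the argument of `Zero` and is therefore rejected at compile time.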

This completes the overview. The rest of the paper is organized as follows. Section 2 introduces multisets and operations on multisets. Furthermore, we show how to transform equations into a tail-recursive form. Section 3 explains functors and makes the relationship between multisets and functors precise. A multitude of examples is presented in Section 4: among other things we study random-access lists, Braun trees, 2-3 trees, and square matrices. Finally, Section 5 reviews related work and points out directions for future work.

2 Multisets

A multiset of type $\lbag a \rbag$ is a collection of elements of type $a$ that takes account of their number but not of their order. In this paper, we will only consider multisets formed according to the following grammar.

$$M ::= \emptyset \mid \lbag 0 \rbag \mid \lbag 1 \rbag \mid (M \uplus M) \mid (M + M) \mid (M * M)$$

Here, $\emptyset$ denotes the empty multiset, $\lbag n \rbag$ denotes the singleton multiset that contains $n$ exactly once, $(\uplus)$ denotes multiset union, and $(+)$ and $(*)$ are addition and multiplication lifted to multisets, ie they are defined by $A \otimes B = \lbag a \otimes b \mid a \leftarrow A;\ b \leftarrow B \rbag$ for $\otimes \in \{+, *\}$. If the meaning can be resolved from the context, we abbreviate $\lbag n \rbag$ by $n$. Furthermore, we agree that multiplication takes precedence over addition, which in turn takes precedence over multiset union.

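Singletons:
$$\lbag m \rbag + \lbag n \rbag = \lbag m + n \rbag$$
$$\lbag m \rbag * \lbag n \rbag = \lbag m * n \rbag$$

Union:
$$A \uplus (B \uplus C) = (A \uplus B) \uplus C$$
$$A \uplus B = B \uplus A$$
$$\emptyset \uplus A = A$$

Addition:
$$A + (B + C) = (A + B) + C$$
$$A + B = B + A$$
$$0 + A = A$$
$$\emptyset + A = \emptyset$$

Multiplication:
$$A * (B * C) = (A * B) * C$$
$$a * b = b * a$$
$$1 * A = A$$
$$A * 1 = A$$
$$0 * A = 0$$
$$\emptyset * A = \emptyset$$

Distributivity:
$$(A \uplus B) + C = A + C \uplus B + C$$
$$(A \uplus B) * C = A * C \uplus B * C$$
$$(A + B) * c = A * c + B * c$$

$A$, $B$, $C$ are multisets; $a$, $b$, $c$ are simple multisets; $m$, $n$ are natural numbers.

Figure 1: Laws of the operations.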

Multisets are defined by higher-order recursion equations. Higher-order means that the equations may not only involve multisets, but also functions over multisets, functions over functions over multisets etc. In this paper, we will, however, restrict ourselves to first-order equations. The exploration of higher-order kinds is the topic of future research. The meaning of higher-order recursion equations is given by the usual least fixpoint semantics.

A multiset is called simple iff it is either the empty multiset or a multiset containing a single element arbitrarily often. Simple multisets are denoted by lower case letters. A product $A * B$ is called admissible iff $B$ denotes a simple multiset. For instance, $\mathit{nat} * 2$ is admissible while $\mathit{nat} * \mathit{nat}$ is not. We will see in Section 3 that only admissible products correspond to compositions of functors. That is, $\mathit{nat} * 2$ corresponds to $\mathit{Nat} \cdot (\mathit{Id} \times \mathit{Id})$ but $\mathit{nat} * \mathit{nat}$ does not correspond to $\mathit{Nat} \cdot \mathit{Nat}$. For that reason, we confine ourselves to admissible products when defining multisets.

A multiset is called unique iff each element occurs at most once. For instance, the multiset $\mathit{pos}$ given by $\mathit{pos} = 1 \uplus 1 + \mathit{pos}$ is unique whereas $\mathit{pos} = 1 \uplus \mathit{pos} + \mathit{pos}$ denotes a non-unique multiset. Note that the first definition corresponds to non-empty lists and the second to leaf trees. The ability to distinguish between unique and non-unique representations is the main reason for using multisets instead of sets.

The multiset operations satisfy a variety of laws listed in Figure 1. The laws have been chosen so that they hold both for multisets and for the corresponding operations on functors. This explains why, for instance, $a * b = b * a$ is restricted to simple multisets: the corresponding property on functors, $F \cdot G = G \cdot F$, does not hold in general. It is valid, however, if $G$ only comprises containers of one size. Of course, for functors the equations state isomorphisms rather than equalities.

In the introduction we have transformed the recursive definition of the multiset of all natural numbers into a tail-recursive form. In the rest of this section we will study this transformation in more detail. A function $h :: \lbag a \rbag \to \lbag a \rbag$ on multisets is said to be a homomorphism iff $h\ \emptyset = \emptyset$ and $h\ (A \uplus B) = h\ A \uplus h\ B$. For instance, $h\ N = A + N * b$ is a homomorphism while $g\ N = N + N$ is not. Let $h_1, \ldots, h_n$ be homomorphisms, let $A$ be a multiset, and let $X$ be given by

$$X = A \uplus h_1\ X \uplus \cdots \uplus h_n\ X.$$

The definition of $X$ is not tail-recursive as the recursive occurrences of $X$ are nested inside function calls. Note that $\mathit{nat}$ is an instance of this scheme with $A = \lbag 0 \rbag$, $n = 1$, and $h_1\ N = \lbag 1 \rbag + N$. Now, the tail-recursive variant of $X$ is $f\ A$ with $f$ given by

$$f\ N = N \uplus f\ (h_1\ N) \uplus \cdots \uplus f\ (h_n\ N).$$

The definition of $f$ is called tail-recursive for obvious reasons. Note that $\mathit{nat}'\ \lbag 0 \rbag$ is the tail-recursive variant of $\mathit{nat}$. The correctness of the transformation is implied by the following theorem.

Theorem 1 Let $X :: \lbag a \rbag$, $A :: \lbag a \rbag$, and $f :: \lbag a \rbag \to \lbag a \rbag$ be given as above, then $X = f\ A$.

3 Functors

In close analogy to multiset expressions we define the syntax of functor expressions by the following grammar.

$$F ::= K\ \mathit{Void} \mid K\ \mathit{Unit} \mid \mathit{Id} \mid (F \mid F) \mid (F \times F) \mid (F \cdot F)$$

Here, $K\ t$ denotes the constant functor given by $K\ t\ a = t$, $\mathit{Void}$ is the empty type, and $\mathit{Unit}$ is the unit type containing a single element. By $\mathit{Id}$ we denote the identity functor given by $\mathit{Id}\ a = a$; $F_1 \cdot F_2$ denotes functor composition given by $(F_1 \cdot F_2)\ a = F_1\ (F_2\ a)$. Disjoint sums and products are defined pointwise: $(F_1 \mid F_2)\ a = F_1\ a \mid F_2\ a$ and $(F_1 \times F_2)\ a = F_1\ a \times F_2\ a$.

All these constructs can be easily defined in Haskell. First of all, we require the following type definitions.

    type Unit         = ()
    data Either a1 a2 = Left a1 | Right a2
    data (a1, a2)     = (a1, a2)

The predefined types Either a1 a2 and (a1, a2) implement disjoint sums and products. The operations on functors are then defined by

    newtype Id a         = Id a
    newtype K a b        = K a
    newtype Sum t1 t2 a  = Sum (Either (t1 a) (t2 a))
    newtype Prod t1 t2 a = Prod (t1 a, t2 a)
    newtype Comp t1 t2 a = Comp (t1 (t2 a)).

Using these type constructors it is straightforward to translate a functor equation into a Haskell datatype definition. For reasons of readability, we will often define special instances of the general schemes, writing Nil instead of K Unit or Cons t instead of Prod Id t.
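As a sanity check that these type constructors behave like functors in the programming sense, the sketch below equips them with instances of Haskell's standard `Functor` class. The instances are not part of the original development; they are a straightforward assumption about how one would map a function over the elements of each container.

```haskell
newtype Id a         = Id a
newtype K a b        = K a
newtype Sum t1 t2 a  = Sum (Either (t1 a) (t2 a))
newtype Prod t1 t2 a = Prod (t1 a, t2 a)
newtype Comp t1 t2 a = Comp (t1 (t2 a))

-- Identity: exactly one element to transform.
instance Functor Id where
  fmap f (Id a) = Id (f a)

-- Constant functor: no elements, so nothing changes.
instance Functor (K a) where
  fmap _ (K a) = K a

-- Disjoint sum: map inside whichever alternative is present.
instance (Functor t1, Functor t2) => Functor (Sum t1 t2) where
  fmap f (Sum (Left x))  = Sum (Left (fmap f x))
  fmap f (Sum (Right y)) = Sum (Right (fmap f y))

-- Product: map inside both components.
instance (Functor t1, Functor t2) => Functor (Prod t1 t2) where
  fmap f (Prod (x, y)) = Prod (fmap f x, fmap f y)

-- Composition: map through the outer and then the inner container.
instance (Functor t1, Functor t2) => Functor (Comp t1 t2) where
  fmap f (Comp x) = Comp (fmap (fmap f) x)
```

With these instances, a value of type `Prod Id Id Int` is a container of size 2, matching the multiset $\lbag 1 \rbag + \lbag 1 \rbag = \lbag 2 \rbag$.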

The translation of multisets into functors is given by the following table.

    multiset                functor
    $\emptyset$             $K\ \mathit{Void}$
    $\lbag 0 \rbag$         $K\ \mathit{Unit}$
    $\lbag 1 \rbag$         $\mathit{Id}$
    $m_1 \uplus m_2$        $f_1 \mid f_2$
    $m_1 + m_2$             $f_1 \times f_2$
    $m_1 * m_2$             $f_1 \cdot f_2$

We say that $F$ corresponds to $M$ if $F$ is obtained from $M$ using this translation. In the rest of this section we will briefly sketch the correctness of the translation. Informally, the functor corresponding to a multiset $M$ contains, for each member of $M$, a container of that size. This statement can be made precise using the framework of polytypic programming [11]. Briefly, a polytypic function is one that is defined by induction on the structure of functor expressions. A simple example of a polytypic function is $\mathit{sum}\langle f \rangle :: f\ \mathbb{N} \to \mathbb{N}$, which sums a structure of natural numbers. To make the relationship between multisets and functors precise we furthermore require the function $\mathit{fan}\langle f \rangle :: a \to \lbag f\ a \rbag$, which generates the multiset of all structures of type $f\ a$ from a given seed of type $a$. For instance, $\mathit{fan}\langle \mathit{List} \rangle\ 1$ generates the multiset of all lists that contain $1$ as the single element.

Theorem 2 If the functor $F$ corresponds to the multiset $M$ and if $M$'s definition only involves admissible products, then $M = \lbag \mathit{sum}\langle F \rangle\ a \mid a \leftarrow \mathit{fan}\langle F \rangle\ 1 \rbag$.

The following example shows that it is necessary to restrict products to admissible products: if we compose the functors corresponding to $\lbag 1, 2 \rbag$ and $\lbag 1, 3 \rbag$, we obtain a functor that corresponds to $\lbag 1, 2, 3, 4, 4, 6 \rbag$. In general, functor composition corresponds to the multiset operation $(\circledast)$ given by

$$A \circledast B = \lbag b_1 + \cdots + b_a \mid a \leftarrow A;\ b_1 \leftarrow B;\ \ldots;\ b_a \leftarrow B \rbag.$$

We take a container of type $A$ and fill each of its slots with a container of type $B$. Summing the sizes of the $B$ containers yields the overall size. The operations $(*)$ and $(\circledast)$ coincide only for admissible products, ie if the containers of type $B$ all have equal size.
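The difference between the lifted product and the operation that functor composition induces can be checked on finite multisets. In the sketch below finite multisets are modelled as lists of `Int` compared up to order; the names `Multiset`, `times`, `compose`, and `normal` are assumptions of this example.

```haskell
import Control.Monad (replicateM)
import Data.List (sort)

-- Finite multisets of naturals, modelled as lists compared up to order.
type Multiset = [Int]

-- Multiplication lifted to multisets: all products a * b.
times :: Multiset -> Multiset -> Multiset
times ms ns = [a * b | a <- ms, b <- ns]

-- The operation induced by functor composition: for every size a in ms,
-- fill each of the a slots with some size from ns and add the sizes up.
compose :: Multiset -> Multiset -> Multiset
compose ms ns = [sum slots | a <- ms, slots <- replicateM a ns]

-- Canonical form, so that two multisets can be compared with (==).
normal :: Multiset -> Multiset
normal = sort
```

Here `normal (compose [1, 2] [1, 3])` gives `[1, 2, 3, 4, 4, 6]`, whereas `normal (times [1, 2] [1, 3])` gives `[1, 2, 3, 6]`. When the second argument is simple, as in `[3]`, both operations agree.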

4 Examples

In this section we apply the framework to generate efficient implementations of vectors (aka lists or sequences or arrays) and matrices.

4.1 Lists

A vector or a sequence type contains containers of arbitrary size. The problem related to designing a sequence type is, of course, to generate the multiset of all natural numbers. Different ways to describe this set correspond to different implementations of vectors. Perhaps surprisingly, there is an abundance of ways to solve this problem. In the introduction we already encountered the most direct solution:

$$\mathit{nat} = 0 \uplus 1 + \mathit{nat}.$$

If we transform the corresponding functor equation

$$\mathit{Nat} = K\ \mathit{Unit} \mid \mathit{Id} \times \mathit{Nat}$$

into a Haskell datatype, we obtain the ubiquitous datatype of polymorphic lists.

    data Vector a = Nil | Cons a (Vector a)

As an example, the list representation of the vector (0, 1, 2, 3, 4, 5) is

    Cons 0 (Cons 1 (Cons 2 (Cons 3 (Cons 4 (Cons 5 Nil))))).

The tail-recursive variant of $\mathit{nat}$ is given by

$$\mathit{nat}_1 = \mathit{nat}_1'\ 0$$
$$\mathit{nat}_1'\ n = n \uplus \mathit{nat}_1'\ (1 + n).$$

From the functor equations

$$\mathit{Nat}_1 = \mathit{Nat}_1'\ (K\ \mathit{Unit})$$
$$\mathit{Nat}_1'\ f = f \mid \mathit{Nat}_1'\ (\mathit{Id} \times f)$$

we can derive the following datatype definitions.

    type Vector      = Vector' Nil
    data Vector' t a = Zero (t a) | Succ (Vector' (Cons t) a)

Using this representation the vector (0, 1, 2, 3, 4, 5) is written, somewhat lengthily, as

    Succ (Succ (Succ (Succ (Succ (Succ (Zero (Cons 0 (Cons 1 (Cons 2 (Cons 3 (Cons 4 (Cons 5 Nil)))))))))))).

Fortunately, we can simplify the definitions slightly. Recall that Vector' is a type of kind $(* \to *) \to (* \to *)$. In this case the ‘higher-orderness’ is, however, not required. Noting that the first argument of Vector' is always applied to the second, we can transform Vector' into a first-order functor of kind $* \to * \to *$.

    type Vector      = Vector' ()
    data Vector' t a = Zero t | Succ (Vector' (a, t) a)

The two variants of Vector' are related by $\mathit{Vector}'_{ho}\ t\ a = \mathit{Vector}'_{fo}\ (t\ a)\ a$ and $\mathit{Vector}'_{fo}\ t\ a = \mathit{Vector}'_{ho}\ (K\ t)\ a$. Note that the type Matrix' defined in the introduction is not amenable to this transformation since the first argument of Matrix' is used at different instances. Using the first-order definition, (0, 1, 2, 3, 4, 5) is represented by

    Succ (Succ (Succ (Succ (Succ (Succ (Zero (0, (1, (2, (3, (4, (5, ())))))))))))).

4.2 Random-access lists

The definition of $\mathit{nat}_1$ is based on the unary representation of the natural numbers: a natural number is either zero or the successor of a natural number. Of course, we can also base the definition on the binary number system: a natural number is either zero, even, or odd.

$$\mathit{nat}_2 = 0 \uplus \mathit{nat}_2 * 2 \uplus 1 + \mathit{nat}_2 * 2$$

Transforming the corresponding functor equation

$$\mathit{Nat}_2 = K\ \mathit{Unit} \mid \mathit{Nat}_2 \cdot (\mathit{Id} \times \mathit{Id}) \mid \mathit{Id} \times \mathit{Nat}_2 \cdot (\mathit{Id} \times \mathit{Id})$$

into a Haskell datatype yields

    data Vector a = Null | Zero (Vector (a, a)) | One a (Vector (a, a)).

Interestingly, this definition implements random-access lists [13], which support logarithmic access to individual vector elements. A random-access list is basically a sequence of perfect leaf trees of increasing height. The vector (0, 1, 2, 3, 4, 5), for instance, is represented by

    Zero (One (0, 1) (One ((2, 3), (4, 5)) Null)).

The sequence of Zero and One constructors encodes the size of the vector in binary representation (with the least significant bit first): we have $(011)_2 = 6$. The representation of a vector of size 11 is depicted in Figure 2(a). Note that the representation is not unique because of leading zeros: the empty sequence, for example, can be represented by Null, Zero Null, Zero (Zero Null) etc. There are at least two ways to repair this defect. The following definition ensures that the leading digit is always a one.

$$\mathit{nat}_3 = 0 \uplus \mathit{pos}_3$$
$$\mathit{pos}_3 = 1 \uplus \mathit{pos}_3 * 2 \uplus 1 + \mathit{pos}_3 * 2$$

More elegantly, one can define a zeroless representation [13], which employs the digits 1 and 2 instead of 0 and 1. We call this variant of the binary number system the 1-2 system.

$$\mathit{nat}_4 = 0 \uplus 1 + \mathit{nat}_4 * 2 \uplus 2 + \mathit{nat}_4 * 2$$

This alternative has the further advantage that accessing the $i$-th element runs in $O(\log i)$ time [13].
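To make the logarithmic access of the plain binary variant (the Vector type with Null, Zero, and One) concrete, here is a minimal sketch of the usual operations. The function names `size`, `cons`, `index`, `toList`, and `paperExample` are chosen for this illustration, and the code follows the standard treatment of binary random-access lists rather than anything spelled out in the text.

```haskell
data Vector a = Null | Zero (Vector (a, a)) | One a (Vector (a, a))

-- Read the constructor prefix as a binary number, least significant bit first.
size :: Vector a -> Int
size Null      = 0
size (Zero v)  = 2 * size v
size (One _ v) = 1 + 2 * size v

-- Add an element at the front: binary increment with carry.
cons :: a -> Vector a -> Vector a
cons x Null      = One x Null
cons x (Zero v)  = One x v
cons x (One y v) = Zero (cons (x, y) v)

-- Look up the i-th element (0-based); Nothing if i is out of range.
index :: Int -> Vector a -> Maybe a
index _ Null = Nothing
index i (One x v)
  | i == 0    = Just x
  | otherwise = index (i - 1) (Zero v)
index i (Zero v) = fmap pick (index (i `div` 2) v)
  where pick (x, y) = if even i then x else y

-- All elements in order.
toList :: Vector a -> [a]
toList Null      = []
toList (Zero v)  = concatMap (\(x, y) -> [x, y]) (toList v)
toList (One x v) = x : concatMap (\(a, b) -> [a, b]) (toList v)

-- The representation of (0, 1, 2, 3, 4, 5) given in the text.
paperExample :: Vector Int
paperExample = Zero (One (0, 1) (One ((2, 3), (4, 5)) Null))
```

Each step of `index` descends one level, so the number of steps is bounded by the number of digits in the binary representation of the size.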

4.3 Fork-node trees

Now, let us transform $\mathit{nat}_3$ into a tail-recursive form.

$$\mathit{nat}_5 = 0 \uplus \mathit{pos}_5'\ 1$$
$$\mathit{pos}_5'\ n = n \uplus \mathit{pos}_5'\ (n * 2) \uplus \mathit{pos}_5'\ (1 + n * 2)$$

Note that we may replace $n * 2$ by $2 * n = n + n$ if $\mathit{pos}_5'$ is called with a simple multiset as in $\mathit{pos}_5'\ 1$. The corresponding functor equations look puzzling.

$$\mathit{Nat}_5 = K\ \mathit{Unit} \mid \mathit{Pos}_5'\ \mathit{Id}$$
$$\mathit{Pos}_5'\ f = f \mid \mathit{Pos}_5'\ (f \cdot (\mathit{Id} \times \mathit{Id})) \mid \mathit{Pos}_5'\ (\mathit{Id} \times f \cdot (\mathit{Id} \times \mathit{Id}))$$

In order to improve the readability of the derived datatypes let us define idioms for $2 * n = n + n$ and $1 + 2 * n = 1 + n + n$.

    data Fork t a = Fork (t a) (t a)
    data Node t a = Node a (t a) (t a)

These definitions assume that t is a simple functor. The alternative definitions newtype Fork t a = Fork (t (a, a)) and data Node t a = Node a (t (a, a)), which correspond to $n * 2$ and $1 + n * 2$, work for arbitrary functors but are more awkward to use. Building upon Fork and Node the Haskell datatypes read

    data Vector a    = Empty | NonEmpty (Vector' Id a)
    data Vector' t a = Base (t a)
                     | Zero (Vector' (Fork t) a)
                     | One (Vector' (Node t) a).

A vector of size $n$ is represented by a complete binary tree of height $\lfloor \log_2 n \rfloor + 1$. A node in the $i$-th level of this tree is labelled with an element iff the $i$-th digit in the binary decomposition of $n$ is one. The lowest level, which corresponds to a leading one, always contains elements. To the best of the author's knowledge this data structure, which we baptize fork-node trees for want of a better name, has not been described elsewhere.³

Our running example, the vector (0, 1, 2, 3, 4, 5), is represented by

    NonEmpty (One (Zero (Base (Fork (Node 0 (Id 1) (Id 2)) (Node 3 (Id 4) (Id 5)))))).

Again, the size of the vector is encoded into the prefix of constructors: replacing NonEmpty and One by 1 and Zero by 0 yields the binary decomposition of the size with the most significant bit first. Figure 2(b) shows a sample vector of 11 elements. The vector elements are stored in left-to-right preorder: if the tree has a root, it contains the first element; the elements in the left tree precede the elements in the right tree. This layout is, however, by no means compelling. Alternatively, one can interleave the elements of the left and the right subtree: if l represents the vector $(b_1, \ldots, b_n)$ and r represents $(c_1, \ldots, c_n)$, then Fork l r represents the vector $(b_1, c_1, \ldots, b_n, c_n)$ and Node a l r represents $(a, b_1, c_1, \ldots, b_n, c_n)$. This choice facilitates the extension of a vector at the front and also slightly simplifies accessing a vector element.

As always for vector types we can ‘firstify’ the type definitions.

    data Vector a    = Empty | NonEmpty (Vector' a a)
    data Vector' t a = Base t
                     | Zero (Vector' (t, t) a)
                     | One (Vector' (a, t, t) a)

³ Since this paper was written, I have learned that Hongwei Xi has independently discovered the same data structure.

The representation of (0, 1, 2, 3, 4, 5) now consists of nested pairs and triples.

    NonEmpty (One (Zero (Base ((0, 1, 2), (3, 4, 5)))))

Finally, let us remark that the tail-recursive variant of $\mathit{nat}_4$, which is based on the 1-2 system, yields a similar tree shape: a node on the $i$-th level contains $d$ elements where $d$ is the $i$-th digit in the 1-2 decomposition of the vector's size.
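For the first-order ('firstified') fork-node trees, the size and the element sequence can be recovered as sketched below. The helper names `sizeFrom`, `size`, `toList'`, `toList`, and `example` are assumptions of this illustration. Since the shape parameter `t` changes at every level, `toList'` carries along a function that knows how to flatten the current shape, using the left-to-right preorder layout described above.

```haskell
data Vector a = Empty | NonEmpty (Vector' a a)

data Vector' t a
  = Base t
  | Zero (Vector' (t, t) a)
  | One (Vector' (a, t, t) a)

-- Read the constructor prefix as a binary number, most significant bit first.
sizeFrom :: Int -> Vector' t a -> Int
sizeFrom n (Base _) = n
sizeFrom n (Zero v) = sizeFrom (2 * n) v
sizeFrom n (One v)  = sizeFrom (2 * n + 1) v

size :: Vector a -> Int
size Empty        = 0
size (NonEmpty v) = sizeFrom 1 v

-- Flatten in left-to-right preorder; the first argument flattens a value
-- of the current shape t. Polymorphic recursion makes the signature mandatory.
toList' :: (t -> [a]) -> Vector' t a -> [a]
toList' f (Base t) = f t
toList' f (Zero v) = toList' (\(l, r) -> f l ++ f r) v
toList' f (One v)  = toList' (\(x, l, r) -> x : f l ++ f r) v

toList :: Vector a -> [a]
toList Empty        = []
toList (NonEmpty v) = toList' (\x -> [x]) v

-- The nested pairs and triples representing (0, 1, 2, 3, 4, 5).
example :: Vector Int
example = NonEmpty (One (Zero (Base ((0, 1, 2), (3, 4, 5)))))
```

Note that `size` never inspects the elements: the prefix of constructors alone determines the size.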

4.4 Rightist right-perfect trees

The definition of $\mathit{nat}_2$ is based on the fact that all natural numbers can be generated by shifting ($n * 2$) and setting the least significant bit ($1 + n * 2$). The following definition sets bits at arbitrary positions by repeatedly shifting a one.

$$\mathit{nat}_6 = \mathit{nat}_6'\ 1$$
$$\mathit{nat}_6'\ p = 0 \uplus \mathit{nat}_6'\ (p * 2) \uplus p + \mathit{nat}_6'\ (p * 2)$$

Of course, the two definitions are not unrelated; we have

$$\mathit{nat}_2 * p = \mathit{nat}_6'\ p,$$

ie $\mathit{nat}_6'\ p$ generates all multiples of $p$. In the $i$-th level of recursion the parameter of $\mathit{nat}_6'$ equals $p * 2^i$ if the initial call was $\mathit{nat}_6'\ p$. Now, transforming the corresponding functor equations, which assume that $f$ is simple,

$$\mathit{Nat}_6 = \mathit{Nat}_6'\ \mathit{Id}$$
$$\mathit{Nat}_6'\ f = K\ \mathit{Unit} \mid \mathit{Nat}_6'\ (f \times f) \mid f \times \mathit{Nat}_6'\ (f \times f)$$

into Haskell datatypes yields

    type Vector      = Vector' Id
    data Vector' t a = Null
                     | Zero (Vector' (Fork t) a)
                     | One (t a) (Vector' (Fork t) a).

This datatype implements higher-order random-access lists [9]. If we ‘firstify’ the type constructor Vector', we obtain the first-order variant as defined in Section 4.2. For a discussion of the tradeoffs we refer the interested reader to [9]. The vector (0, 1, 2, 3, 4, 5) is represented by

    Zero (One (Fork (Id 0) (Id 1)) (One (Fork (Fork (Id 2) (Id 3)) (Fork (Id 4) (Id 5))) Null)).

Interestingly, using a slight generalization of Theorem 1 we can transform $\mathit{nat}_6'$ into a tail-recursive form, as well.

$$\mathit{nat}_7 = \mathit{nat}_7'\ 0\ 1$$
$$\mathit{nat}_7'\ n\ p = n \uplus \mathit{nat}_7'\ n\ (p * 2) \uplus \mathit{nat}_7'\ (n + p)\ (p * 2)$$

The function $\mathit{nat}_7'$ is related to $\mathit{nat}_2$ by

$$n + \mathit{nat}_2 * p = \mathit{nat}_7'\ n\ p.$$

Assuming that $p$ is simple we get the following functor equations

$$\mathit{Nat}_7 = \mathit{Nat}_7'\ (K\ \mathit{Unit})\ \mathit{Id}$$
$$\mathit{Nat}_7'\ f\ p = f \mid \mathit{Nat}_7'\ f\ (p \times p) \mid \mathit{Nat}_7'\ (f \times p)\ (p \times p),$$

from which we can easily derive the datatype definitions below.

    type Vector        = Vector' (K Unit) Id
    data Vector' t p a = Base (t a)
                       | Even (Vector' t (Prod p p) a)
                       | Odd (Vector' (Prod t p) (Prod p p) a)

This datatype implements rightist right-perfect trees or RR-trees [7], where the offspring of the nodes on the left spine form a sequence of perfect leaf trees of decreasing height. Note that if we change Prod t p to Prod p t in the last line, we obtain leftist left-perfect trees. Here is the vector (0, 1, 2, 3, 4, 5) written as an RR-tree.

    Even (Odd (Odd (Base (Prod (Prod (K (), Prod (Id 0, Id 1)),
                                Prod (Prod (Id 2, Id 3), Prod (Id 4, Id 5)))))))

Reading the constructors Even and Odd as digits (LSB first) gives the size of the vector. A sample vector of size 11 is shown in Figure 2(c). The ‘firstification’ of Vector' is left as an exercise to the reader.

4.5 Braun tr ees

Let usapplytheframework to designarepresentationof Brauntrees[5]. Brauntreesarenode-orientedtrees,which arecharacterizedby the following balancecondition: for all subtrees,the sizeof the left subtreeiseitherexactly thesizeof theright subtree,or oneelementlarger. In otherwords,aBrauntreeof size2 � n � 1hastwo childrenof sizen anda Brauntreeof size2 � n � 2 hasa left child of sizen � 1 anda right child ofsizen. Thismotivatesthefollowing definition.

braun | braun} 0 1

braun} n n} | n � braun} ~ n � 1 � n�x~ n} � 1 � n�� braun} ~ n}5� 1 � n�x~ n}�� 1 � n}@�

Theargumentsof braun} arealwaystwo successive naturalnumbers.Fromthecorrespondingfunctorequa-tions

Braun | Braun} ~ K Unit � Id

Braun} f f } | f�Braun} ~ f � Id � f �x~ f } � Id � f ��Braun} ~ f }w� Id � f �-~ f }w� Id � f }Z�

we canderivethefollowing datatypedefinitions.

data Bin t1 t2 a = Bin (t1 a) a (t2 a)

type Braun = Braun' (K Unit) Id

data Braun' t t' a = Null (t a)
                   | One (Braun' (Bin t t) (Bin t' t) a)
                   | Two (Braun' (Bin t' t) (Bin t' t') a)

Interestingly, Braun trees are based on the 1-2 number system (MSB first). The vector (0, 1, 2, 3, 4, 5), for instance, is represented as follows.

Two (Two (Null (Bin (Bin (Id 0) 1 (Id 2)) 3 (Bin (Id 4) 5 (K ())))))

Figure 2(d) displays the representation of a vector of 11 elements. R. Paterson has described a similar implementation (personal communication).
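As a sanity check on the 1-2 reading, the following Haskell sketch transcribes the definitions above and computes the size from the constructor spine alone. Id and K are assumed to be the usual identity and constant functors (with K () for K Unit), and sizeWith and size are hypothetical names.

```haskell
-- Assumed building blocks; the paper defines its own versions earlier.
newtype Id a = Id a
newtype K b a = K b

data Bin t u a = Bin (t a) a (u a)

data Braun' t t' a
  = Null (t a)
  | One (Braun' (Bin t t) (Bin t' t) a)
  | Two (Braun' (Bin t' t) (Bin t' t') a)

type Braun = Braun' (K ()) Id

-- sizeWith n b: n is the size of trees built from the first parameter t.
-- One maps n to 2n + 1 and Two maps n to 2n + 2 (digits 1 and 2, MSB first).
sizeWith :: Int -> Braun' t t' a -> Int
sizeWith n (Null _) = n
sizeWith n (One b)  = sizeWith (2 * n + 1) b
sizeWith n (Two b)  = sizeWith (2 * n + 2) b

size :: Braun a -> Int
size = sizeWith 0

-- The vector (0, 1, 2, 3, 4, 5) from the text.
example :: Braun Int
example = Two (Two (Null (Bin (Bin (Id 0) 1 (Id 2)) 3 (Bin (Id 4) 5 (K ())))))
```

For the example, the spine Two, Two gives 2 and then 2 * 2 + 2 = 6. The spine Two, One, One of Figure 2(d) gives 2, 5, 11 in the same way.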

[Figure 2 panels, constructor spines only: (a) random-access list: One, One, Zero, One, Null; (b) fork-node tree: NonEmpty, Zero, One, One, Base; (c) rightist right-perfect tree: Odd, Odd, Even, Odd, Base; (d) Braun tree: Two, One, One, Null.]

Figure 2: Different representations of a vector with 11 elements. The original figure uses three distinct symbols: one for a leaf (an element of Id), one for an unlabelled node (an element of Id × Id, Fork t, or Prod t1 t2), and one for a labelled node (an element of Node t or Bin t1 t2).

4.6 2-3 trees

Up to now we have mainly considered unique representations where the shape of a data structure is completely determined by the number of elements it contains. Interestingly, unique representations are not well-suited for implementing search trees: one can prove a lower bound of $\Omega(\sqrt{n})$ for insertion and deletion in this case [16]. For that reason, popular search tree schemes such as 2-3 trees [2], red-black trees [8], or AVL-trees [1] are always based on non-unique representations. Let us consider how to implement, say, 2-3 trees. The other search tree schemes can be handled in an analogous fashion. The definition of 2-3 trees is similar to that of perfect leaf trees: a 2-3 tree of height 0 is a leaf and a 2-3 tree of height $h + 1$ is a node with either two or three children, each of which is a 2-3 tree of height $h$. This similarity suggests modelling 2-3 trees as follows.

$\mathit{tree23} = \mathit{tree23}'\ 0$

$\mathit{tree23}'\ N = N \uplus \mathit{tree23}'\ (N + 1 + N \uplus N + 1 + N + 1 + N)$

Note that, contrary to previous definitions, the parameter of the auxiliary function does not range over simple sets. The corresponding functor equations

$\mathit{Tree23} = \mathit{Tree23}'\ (K\ \mathit{Unit})$

$\mathit{Tree23}'\ F = F + \mathit{Tree23}'\ (F \times \mathit{Id} \times F + F \times \mathit{Id} \times F \times \mathit{Id} \times F)$

give rise to the following datatype definitions.

type Tree23 a = Tree23' Nil a

data Tree23' t a = Zero (t a)
                 | Succ (Tree23' (Node23 t) a)

data Node23 t a = Node2 (t a) a (t a)
                | Node3 (t a) a (t a) a (t a)

The vector (0, 1, 2, 3, 4, 5) has three different representations; one alternative is

Succ (Succ (Zero (Node3 (Node3 Nil 0 Nil 1 Nil) 2 (Node2 Nil 3 Nil) 4 (Node2 Nil 5 Nil)))).

Algorithms for insertion and deletion are described in [9].
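To make the nested 2-3 tree type concrete, here is a minimal Haskell sketch that computes the height and the in-order element list of such a tree. It is not the insertion or deletion algorithm of [9]. The definition of Nil as an empty container and the Flatten class are assumptions made for this illustration.

```haskell
-- Assumed: Nil is the empty container used for height-0 trees.
data Nil a = Nil

data Node23 t a
  = Node2 (t a) a (t a)
  | Node3 (t a) a (t a) a (t a)

data Tree23' t a
  = Zero (t a)
  | Succ (Tree23' (Node23 t) a)

type Tree23 a = Tree23' Nil a

-- The number of Succ constructors is the height of the tree.
height :: Tree23' t a -> Int
height (Zero _) = 0
height (Succ t) = 1 + height t

-- Hypothetical constructor class for in-order flattening of a layer.
class Flatten t where
  flatten :: t a -> [a]

instance Flatten Nil where
  flatten Nil = []

instance Flatten t => Flatten (Node23 t) where
  flatten (Node2 l x r)     = flatten l ++ [x] ++ flatten r
  flatten (Node3 l x m y r) = flatten l ++ [x] ++ flatten m ++ [y] ++ flatten r

-- Polymorphic recursion: the recursive call is at type Tree23' (Node23 t) a.
toList :: Flatten t => Tree23' t a -> [a]
toList (Zero t) = flatten t
toList (Succ t) = toList t

-- The representation of (0, 1, 2, 3, 4, 5) given in the text.
example :: Tree23 Int
example =
  Succ (Succ (Zero
    (Node3 (Node3 Nil 0 Nil 1 Nil) 2 (Node2 Nil 3 Nil) 4 (Node2 Nil 5 Nil))))
```

The type guarantees that all leaves sit at the same depth: a value with two Succ constructors can only contain a Node23 (Node23 Nil) layer.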

4.7 Matrices

Let us finally design representations of square matrices and rectangular matrices. In the introduction we have already discussed the central idea: we take a tail-recursive definition of the natural numbers (or of the positive numbers)

$X = f\ a$

$f\ n = n \uplus f\ (h_1\ n) \uplus \cdots \uplus f\ (h_n\ n)$

and replace $n$ by $n * n$ in the second equation:

$\mathit{square} = \mathit{square}'\ a$

$\mathit{square}'\ n = n * n \uplus \mathit{square}'\ (h_1\ n) \uplus \cdots \uplus \mathit{square}'\ (h_n\ n).$

This transformation works provided $a$ is a simple multiset and the $h_i$ preserve simplicity. These conditions hold for all of the examples above with the notable exception of 2-3 trees.

[Figure 3, constructor spine only: NonEmpty, One, Zero, Base.]

Figure 3: The representation of a $6 \times 6$ matrix based on fork-node trees.

As a concrete example, here is an implementation of square matrices based on fork-node trees.

data Matrix a = Empty | NonEmpty (Matrix' Id a)

data Matrix' t a = Base (t (t a))
                 | Zero (Matrix' (Fork t) a)
                 | One (Matrix' (Node t) a)
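The following Haskell sketch shows how the side length of such a matrix can be read off the constructor spine. The definitions of Id, Fork and Node are assumptions standing in for the fork-node tree building blocks defined earlier in the paper, and side and sideWith are hypothetical names.

```haskell
-- Assumed building blocks; the paper defines its own versions earlier.
newtype Id a = Id a
data Fork t a = Fork (t a) (t a)      -- unlabelled node: doubles the size
data Node t a = Node (t a) a (t a)    -- labelled node: doubles and adds one

data Matrix a = Empty | NonEmpty (Matrix' Id a)

data Matrix' t a
  = Base (t (t a))                    -- a t-vector of t-vectors
  | Zero (Matrix' (Fork t) a)
  | One  (Matrix' (Node t) a)

-- sideWith n m: n is the number of elements held by t.
sideWith :: Int -> Matrix' t a -> Int
sideWith n (Base _) = n
sideWith n (Zero m) = sideWith (2 * n) m
sideWith n (One m)  = sideWith (2 * n + 1) m

side :: Matrix a -> Int
side Empty        = 0
side (NonEmpty m) = sideWith 1 m

-- A 2 x 2 matrix with rows (1, 2) and (3, 4).
example :: Matrix Int
example =
  NonEmpty (Zero (Base
    (Fork (Id (Fork (Id 1) (Id 2))) (Id (Fork (Id 3) (Id 4))))))
```

Because Base stores t (t a), rows and columns are built from the same functor t, so the type only admits square matrices.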

The representation of a $6 \times 6$ matrix is shown in Figure 3.

Rectangular matrices are equally easy to implement. In this case we replace $n$ by $\mathit{nat} * n$ in the second equation:

$\mathit{rect} = \mathit{rect}'\ a$

$\mathit{rect}'\ n = \mathit{nat} * n \uplus \mathit{rect}'\ (h_1\ n) \uplus \cdots \uplus \mathit{rect}'\ (h_n\ n).$

Alternatively, one may use the following scheme.

$\mathit{rect} = \mathit{rect}'\ a\ a$

$\mathit{rect}'\ m\ n = m * n \uplus \mathit{rect}'\ (h_1\ m)\ (h_1\ n) \uplus \cdots \uplus \mathit{rect}'\ (h_1\ m)\ (h_n\ n) \uplus \cdots \uplus \mathit{rect}'\ (h_n\ m)\ (h_1\ n) \uplus \cdots \uplus \mathit{rect}'\ (h_n\ m)\ (h_n\ n)$

This representation requires more constructors than the first one ($n^2 + 1$ instead of $n + 1$). On the positive side, it can be easily generalized to higher dimensions.

5 Related and future work

This work is inspired by a recent paper of C. Okasaki [14], who derives representations of square matrices from exponentiation algorithms. He shows, in particular, that the tail-recursive version of the fast exponentiation gives rise to an implementation based on rightist right-perfect trees. Interestingly, the simpler implementation based on fork-node trees is not mentioned. The reason is probably that fast exponentiation algorithms typically process the bits from least to most significant bit while fork-node trees and Braun trees are based on the reverse order. The relationship between number systems and data structures is explained at great length in [13]. The development in Section 3 can be seen as putting this design principle on a formal basis.

Extensions to the Hindley-Milner type system that allow structural invariants to be captured in a more straightforward way have been described by C. Zenger [18, 19] and H. Xi [17]; the latter paper also appears in the proceedings of this workshop. Using the indexed types of C. Zenger one can, for instance, parameterize vectors and matrices by their size. Size compatibility is then statically ensured by the type checker. H. Xi achieves the same effect using dependent datatypes. In his system, de Caml, the type of perfect leaf trees is, for instance, declared as follows.

datatype 'a perfect with nat = Leaf(0) of 'a
                             | {n:nat} Fork(n+1) of 'a perfect(n) * 'a perfect(n)

This definition is essentially a transliteration of the top-down definition of perfect leaf trees given in the introduction. A practical advantage of dependent types is that standard regular datatypes and functions on these types can be adapted with little or no change. Often it suffices to annotate datatype declarations and type signatures with appropriate size constraints.
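For comparison, a similar height-indexed declaration can be sketched in Haskell with GADTs and promoted data kinds. This is only an analogue written for illustration, not de Caml. The type Nat and the function leaves are names introduced here.

```haskell
{-# LANGUAGE GADTs, DataKinds, KindSignatures #-}
import Data.Kind (Type)

-- Unary natural numbers, promoted to the type level by DataKinds.
data Nat = Z | S Nat

-- Perfect leaf trees indexed by their height.
data Perfect (n :: Nat) (a :: Type) where
  Leaf :: a -> Perfect 'Z a
  Fork :: Perfect n a -> Perfect n a -> Perfect ('S n) a

-- Collect the leaves from left to right.
leaves :: Perfect n a -> [a]
leaves (Leaf x)   = [x]
leaves (Fork l r) = leaves l ++ leaves r

-- A perfect tree of height 2. An unbalanced term such as
-- Fork (Leaf 1) (Fork (Leaf 2) (Leaf 3)) is rejected by the type checker.
example :: Perfect ('S ('S 'Z)) Int
example = Fork (Fork (Leaf 1) (Leaf 2)) (Fork (Leaf 3) (Leaf 4))
```

As in the de Caml declaration, both subtrees of a Fork must carry the same index, so the balance invariant is checked statically.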

Directions for future work suggest themselves. It remains to adapt the standard vector and matrix algorithms to the new representations. Some preparatory work has been done in this respect. In [9] the author shows how to adapt search tree algorithms to nested representations of search trees using constructor classes. It is conceivable that this approach can be applied to matrix algorithms as well. Furthermore, many functions like map, listify, sum, etc. can be generated automatically using the technique of polytypic programming [11]. On the theoretical side, it would be interesting to investigate the expressiveness of the framework and of higher-order polymorphic types in general. Which class of multisets can be described using higher-order recursion equations? For instance, it appears to be impossible to specify the multiset of all prime numbers. Do higher-order kinds increase the expressiveness?

Acknowledgements

I am grateful to Iliano Cervesato, Lambert Meertens, and John O'Donnell for many valuable comments.

References

[1] G.M. Adel'son-Vel'skiĭ and Y.M. Landis. An algorithm for the organization of information. Doklady Akademiia Nauk SSSR, 146:263–266, 1962. English translation in Soviet Math. Dokl. 3, pp. 1259–1263.

[2] Alfred V. Aho, John E. Hopcroft, and Jeffrey D. Ullman. Data Structures and Algorithms. Addison-Wesley Publishing Company, 1983.

[3] Richard Bird and Oege de Moor. Algebra of Programming. Prentice Hall Europe, London, 1997.

[4] Richard Bird and Lambert Meertens. Nested datatypes. In J. Jeuring, editor, Fourth International Conference on Mathematics of Program Construction, MPC'98, Marstrand, Sweden, volume 1422 of Lecture Notes in Computer Science, pages 52–67. Springer-Verlag, June 1998.

[5] W. Braun and M. Rem. A logarithmic implementation of flexible arrays. Memorandum MR83/4, Eindhoven University of Technology, 1983.

[6] Thomas H. Cormen, Charles E. Leiserson, and Ronald L. Rivest. Introduction to Algorithms. The MIT Press, Cambridge, Massachusetts, 1991.

[7] Victor J. Dielissen and Anne Kaldewaij. A simple, efficient, and flexible implementation of flexible arrays. In Third International Conference on Mathematics of Program Construction, MPC'95, Kloster Irsee, Germany, volume 947 of Lecture Notes in Computer Science, pages 232–241. Springer-Verlag, July 1995.

[8] Leo J. Guibas and Robert Sedgewick. A dichromatic framework for balanced trees. In Proceedings of the 19th Annual Symposium on Foundations of Computer Science, pages 8–21. IEEE Computer Society, 1978.

[9] Ralf Hinze. Numerical representations as higher-order nested datatypes. Technical Report IAI-TR-98-12, Institut für Informatik III, Universität Bonn, December 1998.

[10] Ralf Hinze. Functional Pearl: Perfect trees and bit-reversal permutations. Journal of Functional Programming, 1999. To appear.

[11] Ralf Hinze. Polytypic functions over nested datatypes (extended abstract). In 3rd Latin-American Conference on Functional Programming (CLaPF'99), March 1999.

[12] Mark P. Jones. Functional programming with overloading and higher-order polymorphism. In First International Spring School on Advanced Functional Programming Techniques, volume 925 of Lecture Notes in Computer Science, pages 97–136. Springer-Verlag, 1995.

[13] Chris Okasaki. Purely Functional Data Structures. Cambridge University Press, 1998.

[14] Chris Okasaki. From fast exponentiation to square matrices: An adventure in types. In Proceedings of the 1999 ACM SIGPLAN International Conference on Functional Programming, Paris, France, 1999. To appear.

[15] Simon Peyton Jones and John Hughes, editors. Haskell 98: A Non-strict, Purely Functional Language, February 1999.

[16] Lawrence Snyder. On uniquely represented data structures (extended abstract). In 18th Annual Symposium on Foundations of Computer Science, Providence, pages 142–146, Long Beach, Ca., USA, October 1977. IEEE Computer Society Press.

[17] Hongwei Xi. Dependently typed data structures. In Proceedings of the Workshop on Algorithmic Aspects of Advanced Programming Languages, WAAAPL'99, Paris, France, September 1999.

[18] Christoph Zenger. Indexed types. Theoretical Computer Science, 187(1–2):147–165, November 1997.

[19] Christoph Zenger. Indizierte Typen. PhD thesis, Universität Karlsruhe, 1998.

Dependently Typed Data Structures*

Hongwei Xi

Oregon Graduate Institute

Abstract

The mechanism for declaring datatypes in functional programming languages such as ML and Haskell is of great use in practice. This mechanism, however, often suffers from its imprecision in capturing the invariants inherent in data structures. We remedy the situation with the introduction of dependent datatypes so that we can model data structures with significantly more accuracy. We present a few interesting examples such as implementations of red-black trees and binomial heaps to illustrate the use of dependent datatypes in capturing some sophisticated invariants in data structures. We claim that dependent datatypes can enable the programmer to implement algorithms in a way that is more robust and easier to understand.

1 Introduction

The mechanism that allows the programmer to declare datatypes seems indispensable in functional programming languages such as Standard ML [15] and Haskell [19]. In practice, we often encounter situations where the declared datatypes do not accurately capture what we really need. For instance, if what we need is a data structure for pairs of integer lists of the same length, we often declare a datatype in Standard ML or Haskell that is for all pairs of integer lists. This inaccuracy problem is often a rich source of program errors. A typical scenario is that a function which should only receive as its argument a pair of integer lists of the same length is mistakenly applied to a pair of integer lists of different lengths. Unfortunately, such a mistake causes no type errors if pairs of integer lists of equal length are given a type that is for all pairs of integer lists, and thus can usually hide in a program unnoticed until run-time, when debugging often becomes much more demanding than at compile-time.

The inaccuracy problem becomes more serious when we start to implement more sophisticated data structures such as red-black trees, binomial heaps, ordered lists, etc. There are some relatively complex invariants in these data structures that we must maintain in order to implement them correctly. For instance, a correct implementation of an insertion operation on a red-black tree should always yield a red-black tree. If we can form a datatype to precisely capture the properties of a red-black tree, then it becomes possible to detect a program error through type-checking when such an error leads to the violation of one of these captured properties. This is evidently a desirable feature in programming if it can be made practical.

The need for forming more accurate datatypes partially motivated the design of Dependent ML (DML), an enrichment of ML with a restricted form of dependent types. More precisely, DML is a language schema. Given a constraint domain $C$, DML($C$) is the language in the schema where all type index expressions are drawn from $C$. Roughly speaking, a type index expression is simply a term that can be used to index a type. Type-checking in DML($C$) can then be reduced to constraint satisfaction in $C$. In this paper, we restrict $C$ to some integer domain and use the name DML for this particular DML($C$). A variant of DML, de Caml, has been implemented on top of Caml-light [14]. This implementation essentially replaces the front-end of Caml-light with a dependent type-checker and keeps the back-end of Caml-light intact. It also modifies many library functions, assigning to them more accurate types.

*Partially supported by the United States Air Force Materiel Command (F19628-96-C-0161) and the Department of Defense.

An alternative approach to forming more accurate datatypes is to use nested datatypes [2]. For instance, a nested datatype exactly representing red-black trees can be readily formed. However, there exist various significant differences between DML-style dependent types and nested datatypes, which we will illustrate later.

We use typewriter font in this paper to represent code in de Caml, all of which has been verified in a prototype implementation. A significant consequence of the introduction of dependent types is the loss of the notion of principal types in DML. For instance, both of the following types can be assigned to an implementation in de Caml which zips two lists together.

'a list * 'b list -> ('a * 'b) list
{n:nat} 'a list(n) * 'b list(n) -> ('a * 'b) list(n)

The first type has the usual meaning, while the second one implies that for every natural number n, the function yields a list of length n when given a pair of lists of length n. Notice that we use 'a list(n) for the type of a list of length n in which every element is of type 'a. If a dependent type is to be assigned to a function in DML, it is the responsibility of the programmer to annotate the function with such a dependent type. This is probably the most significant difference between the programming styles in ML and in DML. In practice, we observe that the type annotations in a typical DML program often constitute less than 20% of the entire code. Since dependent type annotations can often lead to more accurate reports of type error messages and serve as informative program documentation, we feel that the DML programming style is acceptable from a practical point of view. We will provide some concrete examples for the reader to judge this claim, including implementations of red-black trees and binomial heaps. Both of these implementations are adapted from the corresponding ones in [17]. The implementations in de Caml have several advantages over the original ones. We have verified more invariants in the de Caml implementations. For instance, it is verified in the type system of de Caml that the function which merges two binomial heaps indeed yields a binomial heap. Also the type annotations in the implementations, which can be fully trusted since they are mechanically verified, offer some pedagogical value. We feel it is easier to understand the de Caml implementations because the reader can reason in the presence of these informative dependent types.
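The second, length-indexed type of zip can be approximated in Haskell with a GADT. The sketch below is an analogue for illustration only and is not de Caml. Nat, Vec, zipV and toList are names introduced here.

```haskell
{-# LANGUAGE GADTs, DataKinds, KindSignatures #-}
import Data.Kind (Type)

-- Unary natural numbers, promoted to the type level by DataKinds.
data Nat = Z | S Nat

-- Lists indexed by their length.
data Vec (n :: Nat) (a :: Type) where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

-- Both arguments and the result share the same length index n,
-- so no case for lists of unequal length is needed (or even well-typed).
zipV :: Vec n a -> Vec n b -> Vec n (a, b)
zipV VNil         VNil         = VNil
zipV (VCons x xs) (VCons y ys) = VCons (x, y) (zipV xs ys)

-- Forget the index again.
toList :: Vec n a -> [a]
toList VNil         = []
toList (VCons x xs) = x : toList xs
```

An application such as zipV (VCons 1 VNil) VNil is rejected at compile time, which is the kind of error the paper describes as going unnoticed under the first type.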

In this paper, it is neither possible nor necessary to formally present DML. Instead, we focus on presenting some concrete examples in the programming language de Caml, a variant of DML, as well as some intuitive explanation. We refer the interested reader to [22] for the formal development of DML, though we strongly believe that this is largely unnecessary for comprehending this paper.

The rest of the paper is organized as follows. In Section 2, we give a brief overview of the types in DML. We then introduce de Caml in Section 3, presenting some of its main features and illustrating type-checking in de Caml with a short example. Some case studies are given in Section 4, including implementations of Braun trees, random-access lists, red-black trees and binomial heaps. Lastly, we discuss some related work and then conclude.

2 Types in Dependent ML

In this section, we present a brief explanation of the types in DML. The reader is encouraged to skip this section and read it later, though it could be helpful to gather some intuition before studying the concrete examples in Section 3.

Intuitively speaking, dependent types are types which depend on the values of language expressions. For instance, we may form a type int(i) for each integer i to mean that every integer expression of this type must have value i, that is, int(i) is a singleton type. Note that i is the expression on which this type depends. We use the name type index expression for such an expression. There are various compelling reasons, such as practical type-checking, for imposing restrictions on expressions which can be chosen as type index expressions. A novelty in DML is to require that type index expressions be drawn only from a given constraint domain. For the purpose of this paper, we restrict type index expressions to integers. We present the syntax for type index expressions in Figure 1, where we use $a$ for type index variables and $c$ for a fixed integer. Note that the language for type index expressions is typed. We use sorts for the types in this language in order to avoid potential confusion. We use $\cdot$ for the empty index variable context and omit the standard sorting rules for this language. We also use certain transparent abbreviations, such as $0 \le i < j$, which stands for $0 \le i \wedge i < j$. The subset sort $\{a : \gamma \mid P\}$ stands for the sort of those elements of sort $\gamma$ which satisfy the proposition $P$. For example, we use nat as an abbreviation for the subset sort $\{a : \mathit{int} \mid a \ge 0\}$.

index expressions    $i, j ::= a \mid c \mid i + j \mid i - j \mid i * j \mid i \div j \mid \min(i, j) \mid \max(i, j) \mid \mathrm{mod}(i, j)$
index propositions   $P ::= i < j \mid i \le j \mid i \ge j \mid i > j \mid i = j \mid i \ne j \mid P_1 \wedge P_2 \mid P_1 \vee P_2$
index sorts          $\gamma ::= \mathit{int} \mid \{a : \gamma \mid P\} \mid \gamma_1 * \gamma_2$
index contexts       $\phi ::= \cdot \mid \phi, a : \gamma \mid \phi, P$

Figure 1: The syntax for type index expressions

Types in DML are formed as follows. We use $\alpha$ for type variables and $\delta$ for type constructors.

types   $\tau ::= \alpha \mid (\tau_1, \ldots, \tau_n)\,\delta(i) \mid \mathbf{1} \mid \tau_1 * \tau_2 \mid \tau_1 \to \tau_2 \mid \Pi a : \gamma.\ \tau \mid \Sigma a : \gamma.\ \tau$

For instance, list is a type constructor and (int)list(n) stands for the type of an integer list of length n. $\Pi a : \gamma.\ \tau$ and $\Sigma a : \gamma.\ \tau$ form a universal dependent type and an existential dependent type, respectively. For instance, the universal dependent type $\Pi a : \mathit{nat}.\ (\mathit{int})\mathit{list}(a) \to (\mathit{int})\mathit{list}(a)$ captures the invariant of a function which, for every natural number $a$, returns an integer list of length $a$ when given an integer list of length $a$. Also we can use the existential dependent type $\Sigma a : \mathit{nat}.\ (\mathit{int})\mathit{list}(a)$ to mean an integer list of some unknown length. We demonstrate how a type constructor is declared in Section 3.

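The typing rules for this language should be familiar from a dependently typed $\lambda$-calculus (such as the ones underlying Coq or NuPrl). The critical notion of type conversion uses the judgment $\phi \vdash \tau_1 \equiv \tau_2$, which is the congruent extension of equality on index expressions to arbitrary types:

$\dfrac{\phi \vdash \tau_1 \equiv \tau_1' \quad \cdots \quad \phi \vdash \tau_n \equiv \tau_n' \qquad \phi \models i \doteq i'}{\phi \vdash (\tau_1, \ldots, \tau_n)\,\delta(i) \equiv (\tau_1', \ldots, \tau_n')\,\delta(i')}$

$\dfrac{\phi \vdash \tau_1 \equiv \tau_1' \qquad \phi \vdash \tau_2 \equiv \tau_2'}{\phi \vdash \tau_1 * \tau_2 \equiv \tau_1' * \tau_2'}$

$\dfrac{\phi \vdash \tau_1' \equiv \tau_1 \qquad \phi \vdash \tau_2 \equiv \tau_2'}{\phi \vdash \tau_1 \to \tau_2 \equiv \tau_1' \to \tau_2'}$

$\dfrac{\phi, a : \gamma \vdash \tau \equiv \tau'}{\phi \vdash \Pi a : \gamma.\ \tau \equiv \Pi a : \gamma.\ \tau'}$

$\dfrac{\phi, a : \gamma \vdash \tau \equiv \tau'}{\phi \vdash \Sigma a : \gamma.\ \tau \equiv \Sigma a : \gamma.\ \tau'}$

Notice that it is the application of these rules which generates constraints. For instance, the constraint $\phi \models (a + 1) + b \doteq c + 1$ is generated in order to derive $\phi \vdash (\mathit{int})\mathit{list}((a + 1) + b) \equiv (\mathit{int})\mathit{list}(c + 1)$.

It is difficult to present more details given the space limitation. For those who are interested, we point out that the detailed formal development of DML can be found in [22].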

3 Some Features in de Caml

In this section, we use examples to present some unique and significant features in de Caml, preparing for the case studies in Section 4.

The programmer often declares datatypes when programming in ML. For instance, the following datatype declaration defines a type constructor list.

type 'a list = nil | cons of 'a * 'a list

Roughly speaking, this declaration states that a polymorphic list is formed with two constructors nil and cons, whose types are 'a list and 'a * 'a list -> 'a list, respectively. We use 'a for a type variable. However, the declared type 'a list is coarse. For instance, we cannot use the type to distinguish an empty list from a non-empty one. In de Caml, this type can be refined as follows.

refine 'a list with nat =
    nil(0) | {n:nat} cons(n+1) of 'a * 'a list(n)

The clause refine 'a list with nat means that we refine the type 'a list with an index of sort nat, that is, the index is a natural number. In this case, the index stands for the length of a list.

• nil(0) means that nil is of type 'a list(0), that is, it is a list of length 0.

• {n:nat} cons(n+1) of 'a * 'a list(n) means that cons is of type

{n:nat} 'a * 'a list(n) -> 'a list(n+1),

that is, for every natural number n, cons yields a list of length n + 1 when given an element of type 'a and a list of length n. Note that {n:nat} is a universal quantifier, which is usually written as $\Pi n : \mathit{nat}$ in type theory.

Now list types have become more informative. The following code defines the append function on lists. We use [] for nil and :: as the infix operator for cons.

let rec append = function
    ([], ys) -> ys
  | (x :: xs, ys) -> x :: append(xs, ys)
withtype {m:nat}{n:nat} 'a list(m) * 'a list(n) -> 'a list(m+n)

The withtype clause is a type annotation supplied by the programmer, which simply states that the function returns a list of length m + n when given a pair of lists of lengths m and n, respectively. We now present an informal description of type-checking in this case.

For the first clause ([], ys) -> ys, the type-checker assumes that ys is of type 'a list(b) for some index variable b of sort nat. This implies that ([], ys) is of type 'a list(0) * 'a list(b). The type-checker then instantiates m and n with 0 and b, respectively, and verifies that the ys on the right side of -> is of type 'a list(0+b). Since ys is of type 'a list(b) by assumption, the type-checker generates the constraint b = 0 + b under the assumption that b is a natural number. This constraint can be easily verified.

Let us now type-check the second clause (x :: xs, ys) -> x :: append(xs, ys). Assume that xs and ys are of type 'a list(a) and 'a list(b), respectively, where a and b are index variables of sort nat. Then (x :: xs, ys) is of type 'a list(a+1) * 'a list(b), and we therefore instantiate m and n with a + 1 and b, respectively. Also we infer that the right side x :: append(xs, ys) is of type 'a list((a+b)+1) since xs and ys are assumed to be of types 'a list(a) and 'a list(b), respectively. We need to prove that the right side is of type 'a list(m+n) for m = a + 1 and n = b. This leads to the following constraint,

(a + 1) + b = (a + b) + 1

which can be immediately verified under the assumption that a and b are natural numbers. This finishes type-checking the above de Caml program. The interested reader is referred to [22] for the formal presentation of type-checking in DML.
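The same two clauses can be replayed in Haskell with a type-level addition function. This is an illustrative analogue rather than de Caml: Nat, Plus, Vec, append and toList are names introduced here, and the equalities are discharged by unfolding the definition of Plus instead of by a linear constraint solver.

```haskell
{-# LANGUAGE GADTs, DataKinds, KindSignatures, TypeFamilies #-}
import Data.Kind (Type)

data Nat = Z | S Nat

-- Type-level addition, defined by recursion on the first argument.
type family Plus (m :: Nat) (n :: Nat) :: Nat where
  Plus 'Z     n = n
  Plus ('S m) n = 'S (Plus m n)

data Vec (n :: Nat) (a :: Type) where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

append :: Vec m a -> Vec n a -> Vec (Plus m n) a
-- First clause: m is 'Z, and Plus 'Z n reduces to n (cf. b = 0 + b).
append VNil         ys = ys
-- Second clause: m is 'S a, and Plus ('S a) n reduces to 'S (Plus a n)
-- (cf. (a + 1) + b = (a + b) + 1).
append (VCons x xs) ys = VCons x (append xs ys)

toList :: Vec n a -> [a]
toList VNil         = []
toList (VCons x xs) = x : toList xs
```

One design difference is worth noting: DML verifies such equations by arithmetic reasoning over integers, whereas this sketch only succeeds because Plus happens to recurse on the argument that append pattern-matches on.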


Clearly, a natural question is whether the type for append can be reconstructed or synthesized. For such a simple example, this seems highly possible. However, our experience indicates that it seems exceedingly difficult in general to synthesize dependent types in practice, though we have not formally studied this issue.

Instead of refining a type, it is also allowed to declare a dependent type in de Caml. For instance, we can declare the following.

datatype 'a list with nat = nil(0) | {n:nat} cons(n+1) of 'a * 'a list(n)

The declaration is basically equivalent to the refinement we made earlier. However, there is also a significant difference. When we declare a refinement, we must be able to interpret the corresponding unrefined types in terms of refined ones. For example, after refining the type 'a list, we must interpret this type in terms of the refined list type. We need existential dependent types for this purpose. 'a list is interpreted as [n:nat] 'a list(n), that is, 'a list is 'a list(n) for some (unknown) natural number n. Note that [n:nat] is an existential quantifier, which is often written as $\Sigma n : \mathit{nat}$ in type theory. This provides a smooth interaction between ML types and dependent types. Suppose that f is defined before the list type is refined and its type is 'a list -> 'a list. After refining the list type, we can assign to f the type ([n:nat] 'a list(n)) -> [n:nat] 'a list(n), that is, f takes a list of unknown length and returns a list of unknown length. This makes it possible for f to be applied to an argument of dependent type, say, int list(2). This is also essential for ensuring backward compatibility, a very important issue when the use of existing ML code is concerned.

However, there is a need for imposing some restriction on datatype refinement. We give a short example to illustrate such a need. The datatype 'a tree is declared as follows for all binary trees.

datatype 'a tree = Leaf | Node of 'a tree * 'a * 'a tree

Suppose we declare the following refinement, where the type index stands for the height of a tree.

refine 'a tree with nat =
    Leaf(0) | {h:nat} Node(h+1) of 'a tree(h) * 'a * 'a tree(h)

This refinement is problematic since the type [h:nat] 'a tree(h) now stands for the type of all perfect binary trees, and therefore it cannot be used to represent the original 'a tree, which is the type for all binary trees. There is a syntactic restriction that can be imposed to rule out such problematic datatype refinements. We do not discuss the restriction further since it is simply not needed in this paper.

There is another important use of existential dependent types. In order to guarantee practical type-checking in de Caml, we must make constraints relatively simple. Currently, we only accept linear integer constraints. This immediately implies that there are many (realistic) constraints that are inexpressible in the type system of de Caml. For instance, the following code implements a filter function on a list which removes from the list all elements not satisfying the property p.

let rec filter p = function
    [] -> []
  | x :: xs -> if p(x) then x :: (filter p xs) else (filter p xs)

In general, it is impossible to know the length of the list (filter p l) without knowing what p is. Therefore, it is impossible to type the function using only universal dependent types. Nonetheless, we know that the length of (filter p l) is less than or equal to that of l. This invariant can be captured by assigning filter the following type.

('a -> bool) -> {m:nat} 'a list(m) -> [n:nat | n <= m] 'a list(n)

Note that [n:nat | n <= m] stands for $\Sigma n : \{a : \mathit{nat} \mid a \le m\}$.

Another significant use of existential dependent types is to represent a range of values. We can use ([n:nat] int(n)) array to represent the type for the vectors whose elements are natural numbers. This is very useful for eliminating array bound checks at run-time [20]. In general, we view the use of existential types in de Caml for handling functions like filter as crucial to the scalability of the type system of de Caml since such functions are abundant in practice.

datatype 'a brauntree with nat =
    L(0)
  | {m:nat}{n:nat | n <= m <= n+1}
    B(m+n+1) of 'a * 'a brauntree(m) * 'a brauntree(n) ;;

let rec diff k = function
    L -> 0
  | B(_, l, r) ->
      if k = 0 then 1
      else if k mod 2 = 1 then diff (k/2) l else diff (k/2 - 1) r
withtype {k:nat}{n:nat | k <= n <= k+1}
         int(k) -> 'a brauntree(n) -> int(n-k) ;;

let rec size = function
    L -> 0
  | B(_, l, r) -> let n = size r in 1 + n + n + diff n l
withtype {n:nat} 'a brauntree(n) -> int(n) ;;

Figure 2: An implementation of the size function on Braun trees
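The existential typing of filter discussed above can be approximated in Haskell by hiding the result length behind an existential wrapper. This sketch is an assumption-laden analogue: SomeVec, filterV and toList are names introduced here, and, unlike the de Caml type, the wrapper does not record the bound n <= m.

```haskell
{-# LANGUAGE GADTs, DataKinds, KindSignatures #-}
import Data.Kind (Type)

data Nat = Z | S Nat

data Vec (n :: Nat) (a :: Type) where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

-- A vector of some unknown length, analogous to [n:nat] 'a list(n).
data SomeVec a where
  SomeVec :: Vec n a -> SomeVec a

-- The result length depends on the predicate, so it is hidden.
filterV :: (a -> Bool) -> Vec m a -> SomeVec a
filterV _ VNil = SomeVec VNil
filterV p (VCons x xs) =
  case filterV p xs of
    SomeVec ys
      | p x       -> SomeVec (VCons x ys)
      | otherwise -> SomeVec ys

toList :: Vec n a -> [a]
toList VNil         = []
toList (VCons x xs) = x : toList xs

someToList :: SomeVec a -> [a]
someToList (SomeVec v) = toList v
```

Callers must unpack SomeVec with a pattern match before using the vector, which mirrors the way an existentially quantified index has to be opened before it can be used.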

Lastly, we mention a convention in de Caml. After declaring a dependent type as follows,

datatype ($\alpha_1, \ldots, \alpha_m$) $\delta$ with ($\gamma_1, \ldots, \gamma_n$) = ...

we may write ($\tau_1, \ldots, \tau_m$) $\delta$ to stand for the following.

[$a_1 : \gamma_1$] ... [$a_n : \gamma_n$] ($\tau_1, \ldots, \tau_m$) $\delta$($a_1, \ldots, a_n$)

For example, 'a list stands for [n:nat] 'a list(n).

4 Case Studies

In this section, we present some examples to demonstrate the use of dependent datatypes in capturing invariants in data structures. All these examples in de Caml have been successfully verified in a prototype compiler for de Caml, which is written on top of the Caml-light compiler [14]. The claim we make is that dependent datatypes enable the programmer to implement algorithms in a way that is more robust and easier to understand.

4.1 Braun Trees

A Braun tree is a balanced binary tree [4] such that for every branch node in the tree, its left subtree is either the same size as its right subtree, or contains one more element. Braun trees can be used to give neat implementations for flexible arrays and priority queues. In [16], there is an algorithm which computes the size of a Braun tree in $O(\log^2 n)$ time, where $n$ is the size of the Braun tree. We implement this algorithm in Figure 2. We first declare a dependent datatype 'a brauntree(n) for Braun trees of size n. Note that the type of B is

{m:nat}{n:nat | n <= m <= n+1}
'a * 'a brauntree(m) * 'a brauntree(n) -> 'a brauntree(m+n+1)

which states that B yields a Braun tree of size $m + n + 1$ when given an element, a Braun tree of size $m$ and a Braun tree of size $n$ where $n \le m \le n + 1$ holds. This exactly captures the invariant on Braun trees mentioned above.

Given a natural number $k$ and a Braun tree of size $n$ satisfying $k \le n \le k + 1$, the function diff yields the difference between $n$ and $k$. With this function, the size function on Braun trees can be defined straightforwardly. An interesting point in this example is that the type of the function size precisely indicates that this is the size function on Braun trees since it states that the function returns an integer of value $n$ when given a Braun tree of size $n$.

The reason that diff $n$ $l$ yields the difference between $|l|$, the size of $l$, and $n$ can be found in [16]. We give some brief explanation below. It is clear that $|l| - n$ is either 0 or 1. If $l$ is a leaf, $|l| - n$ must be 0. Otherwise, $|l| = 1 + |l'| + |r'|$, where $l'$ and $r'$ are the left and right branches of $l$, respectively. If $n$ is odd, then $n = 1 + \lfloor n/2 \rfloor + \lfloor n/2 \rfloor$ and thus

$|l| - n = 1 + |l'| + |r'| - (1 + \lfloor n/2 \rfloor + \lfloor n/2 \rfloor) = (|l'| - \lfloor n/2 \rfloor) + (|r'| - \lfloor n/2 \rfloor)$

Since $|l'| - 1 \le |r'| \le |l'|$ holds, we have the following.

$2(|l'| - \lfloor n/2 \rfloor) - 1 \le |l| - n \le 2(|l'| - \lfloor n/2 \rfloor)$

It can now be readily verified that $|l| - n = 1$ if $|l'| - \lfloor n/2 \rfloor = 1$ and $|l| - n = 0$ if $|l'| - \lfloor n/2 \rfloor = 0$. Therefore, if $n$ is odd, $|l| - n = |l'| - \lfloor n/2 \rfloor$. With some similar reasoning, we can eventually prove the correctness of the defined function diff.

This example also shows that although the datatype declaration for Braun trees contains size information, this information is not available at run-time and therefore a recursive walk through the tree is necessary to determine the size of a tree.
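For readers who want to experiment with the algorithm without a de Caml compiler, the following is a plain Haskell transcription of diff and size from Figure 2, without the dependent type annotations. The helper fromList, which builds a Braun tree from a list, is an assumption added here for testing and is not part of the paper.

```haskell
data Braun a = L | B a (Braun a) (Braun a)

-- diff k t: assuming k <= |t| <= k + 1, return |t| - k.
-- The precondition is not checked by Haskell's type system.
diff :: Int -> Braun a -> Int
diff _ L = 0
diff k (B _ l r)
  | k == 0    = 1
  | odd k     = diff (k `div` 2) l
  | otherwise = diff (k `div` 2 - 1) r

-- The size of the right subtree is computed recursively; the left
-- subtree is at most one larger, which diff detects in O(log n) steps.
size :: Braun a -> Int
size L         = 0
size (B _ l r) = let n = size r in 1 + n + n + diff n l

-- Hypothetical helper: distribute the tail alternately so that the
-- left subtree gets the extra element when the tail has odd length.
fromList :: [a] -> Braun a
fromList []       = L
fromList (x : xs) = B x (fromList ls) (fromList rs)
  where
    (ls, rs) = split xs
    split []       = ([], [])
    split (y : ys) = let (as', bs') = split ys in (y : bs', as')
```

Nothing in these Haskell types prevents diff from being called with a k that violates its precondition, which is precisely the gap the withtype annotations in Figure 2 close.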

4.2 Random-Access Lists

A random-access list is a list representation such that list lookup (update) can be implemented in an efficient way. In this case, the lookup (update) function takes $O(\log n)$ time in contrast to the usual $O(n)$ time (worst case), where $n$ is the length of the input list.

We present an implementation of random-access lists in Figures 3 and 4. We first declare the dependent datatype for random-access lists. Note that 'a rlist(n) stands for the type of random-access lists of length n. Nil and One are the constructors for empty and singleton random-access lists, respectively. Furthermore, the constructors Even and Odd form random-access lists of even and odd lengths, respectively. If l1 and l2 represent lists $x_1, \ldots, x_n$ and $y_1, \ldots, y_n$ for some $n > 0$, respectively, then Even(l1, l2) represents the list $x_1, y_1, \ldots, x_n, y_n$. Similarly, if l1 and l2 represent lists $x_1, \ldots, x_n, x_{n+1}$ and $y_1, \ldots, y_n$ for some $n > 0$, respectively, Odd(l1, l2) represents $x_1, y_1, \ldots, x_n, y_n, x_{n+1}$. With such a data structure, we can implement a lookup (update) function on random-access lists which takes $O(\log n)$ time. A crucial invariant on this data structure is that l1 and l2 must have the same length if Even(l1, l2) is formed, or l1 contains one more element than l2 if Odd(l1, l2) is formed. This is clearly captured by the dependent datatype declaration for 'a rlist. The function cons adds an element to the front of a list and uncons decomposes a list into a pair consisting of the head and the tail of the list. Note that the type of uncons requires this function only to be applied to a non-empty list. Both cons and uncons take $O(\log n)$ time.
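The interleaving scheme can be tried out in plain Haskell. The sketch below is an unindexed approximation introduced for illustration: the length invariants on Even and Odd are not enforced by the types, so the lookup function (here called lookupR, a hypothetical name) returns a Maybe and checks bounds at run-time.

```haskell
data RList a
  = Nil
  | One a
  | Even (RList a) (RList a)  -- invariant (unchecked): equal lengths
  | Odd  (RList a) (RList a)  -- invariant (unchecked): first is one longer

-- Add an element to the front; the two halves swap roles.
cons :: a -> RList a -> RList a
cons x Nil          = One x
cons x (One y)      = Even (One x) (One y)
cons x (Even l1 l2) = Odd  (cons x l2) l1
cons x (Odd l1 l2)  = Even (cons x l2) l1

-- Even positions live in the first half, odd positions in the second.
lookupR :: Int -> RList a -> Maybe a
lookupR _ Nil          = Nothing
lookupR i (One x)      = if i == 0 then Just x else Nothing
lookupR i (Even l1 l2) = descend i l1 l2
lookupR i (Odd l1 l2)  = descend i l1 l2

descend :: Int -> RList a -> RList a -> Maybe a
descend i l1 l2
  | i < 0     = Nothing
  | even i    = lookupR (i `div` 2) l1
  | otherwise = lookupR (i `div` 2) l2

fromList :: [a] -> RList a
fromList = foldr cons Nil
```

The run-time checks in lookupR (the Nil case, the test i == 0, and the guard i < 0) are the ones that a length-indexed type for the index and the list would make unnecessary.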

Thefunction lookup_safe deservessomeexplanation.Thetypeof this functionindicatesthat it canbeappliedto i andl only if i is a naturalnumberandits valueis lessthanthelengthof l . Noticethat thelook_up i l simply returnx whenthe l matchesthepatternOne x . Thereis noneedto checkwhether


datatype 'a rlist with nat =
    Nil(0)
  | One(1) of 'a
  | {n:nat | n > 0} Even(n+n) of 'a rlist(n) * 'a rlist(n)
  | {n:nat | n > 0} Odd(n+n+1) of 'a rlist(n+1) * 'a rlist(n) ;;

exception Subscript ;;

let rec cons x = function
    Nil -> One x
  | One y -> Even(One(x), One(y))
  | Even(l1, l2) -> Odd(cons x l2, l1)
  | Odd(l1, l2) -> Even(cons x l2, l1)
withtype {n:nat} 'a -> 'a rlist(n) -> 'a rlist(n+1) ;;

let rec uncons = function
    One x -> (x, Nil)
  | Even(l1, l2) ->
      let (x, l1) = uncons l1 in begin
        match l1 with
          Nil -> (x, l2)
        | _ -> (x, Odd(l2, l1))
      end
  | Odd(l1, l2) -> let (x, l1) = uncons l1 in (x, Even(l2, l1))
withtype {n:nat | n > 0} 'a rlist(n) -> 'a * 'a rlist(n-1) ;;

let rec length = function
    Nil -> 0
  | One _ -> 1
  | Even (l1, _) -> 2 * (length l1)
  | Odd (_, l2) -> 2 * (length l2) + 1
withtype {n:nat} 'a rlist(n) -> int(n) ;;

let rec lookup_safe i = function
    One x -> x
  | Even (l1, l2) ->
      if i mod 2 = 0 then lookup_safe (i / 2) l1
      else lookup_safe (i / 2) l2
  | Odd(l1, l2) ->
      if i mod 2 = 0 then lookup_safe (i / 2) l1
      else lookup_safe (i / 2) l2
withtype {i:nat}{n:nat | i < n} int(i) -> 'a rlist(n) -> 'a ;;

Figure 3: An implementation of random-access lists in deCaml (I)


let rec update_safe i x = function
    One y -> One x
  | Even(l1, l2) ->
      if i mod 2 = 0 then Even(update_safe (i / 2) x l1, l2)
      else Even(l1, update_safe (i / 2) x l2)
  | Odd(l1, l2) ->
      if i mod 2 = 0 then Odd(update_safe (i / 2) x l1, l2)
      else Odd(l1, update_safe (i / 2) x l2)
withtype {i:nat}{n:nat | i < n} int(i) -> 'a -> 'a rlist(n) -> 'a rlist(n) ;;

Figure 4: An implementation of random-access lists in deCaml (II)

datatype 'a rlist with nat =
    Nil(0)
  | One(1) of 'a
  | {n:nat | n > 0} Even(n+n) of ('a * 'a) rlist(n)
  | {n:nat | n > 0} Odd(n+n+1) of 'a * ('a * 'a) rlist(n)

Figure 5: A nested dependent datatype for random-access lists

i is 0: it must be, since i is a natural number and i is less than the length of l, which is 1 in this case. The ordinary lookup function, which accepts an arbitrary integer index, can be implemented in the usual way or as follows.

let rec lookup i l =
  if i < 0 then raise Subscript
  else if i >= length l then raise Subscript
  else lookup_safe i l
withtype int -> 'a rlist -> 'a ;;
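For comparison, the following is a minimal sketch of the same structure in plain OCaml, assuming no dependent types are available. The length indices of Figures 3 and 4 are dropped, so the invariants on Even and Odd are maintained only by construction and out-of-range indices are caught by run-time tests. The names lookup_unchecked, update_unchecked, of_list and to_list are helpers introduced for this illustration and do not appear in the paper.

```ocaml
(* Plain OCaml sketch of the random-access lists of Figures 3 and 4.
   The length indices are gone, so the compiler checks none of the
   invariants; they hold only because cons/uncons preserve them. *)
type 'a rlist =
  | Nil
  | One of 'a
  | Even of 'a rlist * 'a rlist   (* both halves have the same length *)
  | Odd of 'a rlist * 'a rlist    (* first half has one more element *)

exception Subscript

let rec cons x = function
  | Nil -> One x
  | One y -> Even (One x, One y)
  | Even (l1, l2) -> Odd (cons x l2, l1)
  | Odd (l1, l2) -> Even (cons x l2, l1)

(* Without the index n > 0 we need an explicit case for Nil. *)
let rec uncons = function
  | Nil -> raise Subscript
  | One x -> (x, Nil)
  | Even (l1, l2) ->
    let (x, l1') = uncons l1 in
    (match l1' with
     | Nil -> (x, l2)
     | _ -> (x, Odd (l2, l1')))
  | Odd (l1, l2) ->
    let (x, l1') = uncons l1 in
    (x, Even (l2, l1'))

let rec length = function
  | Nil -> 0
  | One _ -> 1
  | Even (l1, _) -> 2 * length l1
  | Odd (_, l2) -> 2 * length l2 + 1

(* Assumes 0 <= i < length l; the caller is responsible for that. *)
let rec lookup_unchecked i = function
  | Nil -> raise Subscript
  | One x -> if i = 0 then x else raise Subscript
  | Even (l1, l2) | Odd (l1, l2) ->
    if i mod 2 = 0 then lookup_unchecked (i / 2) l1
    else lookup_unchecked (i / 2) l2

let lookup i l =
  if i < 0 || i >= length l then raise Subscript
  else lookup_unchecked i l

let rec update_unchecked i x = function
  | Nil -> raise Subscript
  | One _ -> if i = 0 then One x else raise Subscript
  | Even (l1, l2) ->
    if i mod 2 = 0 then Even (update_unchecked (i / 2) x l1, l2)
    else Even (l1, update_unchecked (i / 2) x l2)
  | Odd (l1, l2) ->
    if i mod 2 = 0 then Odd (update_unchecked (i / 2) x l1, l2)
    else Odd (l1, update_unchecked (i / 2) x l2)

let update i x l =
  if i < 0 || i >= length l then raise Subscript
  else update_unchecked i x l

(* Conversion helpers, for illustration only. *)
let of_list xs = List.fold_right cons xs Nil

let rec to_list l =
  match l with
  | Nil -> []
  | _ -> let (x, tl) = uncons l in x :: to_list tl
```

The extra Nil and index tests are exactly the cases that the dependent types of lookup_safe and uncons rule out statically.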

We point out that an implementation of random-access lists is given in [17], which uses the feature of nested datatypes. Okasaki's implementation supports (on average) $O(1)$-time consing and unconsing operations and is thus superior to our implementation in this respect. On the other hand, the update function in Okasaki's implementation requires the use of a higher-order feature, which is not needed in our implementation. We view this as an advantage of our implementation.

It should be stressed that nested datatypes and DML-style dependent types are orthogonal to each other. For instance, we can form a nested dependent datatype in Figure 5 for random-access lists, imitating a corresponding datatype in [17]. Unfortunately, we currently cannot experiment with such a dependent datatype because polymorphic recursion is not supported in Caml-light.

4.3 Red-Black Trees

A red-black tree (RBT) is a balanced binary tree which satisfies the following conditions: (a) all leaves are marked black and all other nodes are marked either red or black; (b) for every node there are the same number of black nodes on every path connecting the node to a leaf, and this number is called the black height of the node; (c) the two sons of every red node must be black. It is a common practice to use the RBT data structure for implementing a dictionary. We declare a datatype in Figure 6, which precisely captures these properties of a RBT.


type key == int ;;

sort color == {a:int | 0 <= a <= 1} ;;

datatype rbtree with (color, nat, nat) =
    E(0, 0, 0)
  | {c:color}{cl:color}{cr:color}{bh:nat}
    B(0, bh+1, 0) of rbtree(cl, bh, 0) * key * rbtree(cr, bh, 0)
  | {cl:color}{cr:color}{bh:nat}
    R(1, bh, cl+cr) of rbtree(cl, bh, 0) * key * rbtree(cr, bh, 0) ;;

let restore = function
    (R(R(a, x, b), y, c), z, d) -> R(B(a, x, b), y, B(c, z, d))
  | (R(a, x, R(b, y, c)), z, d) -> R(B(a, x, b), y, B(c, z, d))
  | (a, x, R(R(b, y, c), z, d)) -> R(B(a, x, b), y, B(c, z, d))
  | (a, x, R(b, y, R(c, z, d))) -> R(B(a, x, b), y, B(c, z, d))
  | (a, x, b) -> B(a, x, b)
withtype {cl:color}{cr:color}{bh:nat}{vl:nat}{vr:nat | vl+vr <= 1}
         rbtree(cl, bh, vl) * key * rbtree(cr, bh, vr) ->
         [c:color] rbtree(c, bh+1, 0) ;;

exception Item_already_exists ;;

let insert x t =
  let rec ins = function
      E -> R(E, x, E)
    | B(a, y, b) -> if x < y then restore(ins a, y, b)
                    else if y < x then restore(a, y, ins b)
                    else raise Item_already_exists
    | R(a, y, b) -> if x < y then R(ins a, y, b)
                    else if y < x then R(a, y, ins b)
                    else raise Item_already_exists
  withtype {c:color}{bh:nat}
           rbtree(c, bh, 0) -> [c':color][v:nat | v <= c] rbtree(c', bh, v) in
  match ins t with
      R(a, y, b) -> B(a, y, b)
    | t -> t
withtype {c:color}{bh:nat} key -> rbtree(c, bh, 0) ->
         [bh':nat] rbtree(0, bh', 0) ;;

Figure 6: A red-black tree implementation


A sort color is declared for the type index expressions representing the colors of nodes. We use 0 for black and 1 for red. For simplicity, we use integers for keys. Of course, one can readily use other ordered data structures. The type rbtree is indexed with a triple (c, bh, v), where c is the color of the node, bh is the black height of the tree, and v is the number of color violations. We record one color violation if a red node is followed by another red node, and thus a RBT must have no color violations. Clearly, the types of the constructors indicate that color violations can only occur at the top node. Also, notice that a leaf, that is, E, is considered black. Given the datatype declaration and the explanation, it should be clear that the type of a RBT is simply

[c:color][bh:nat] rbtree(c,bh,0) ,

that is, a tree which has some top node color c and some black height bh but no color violations.

It is an involved task to implement RBT. The implementation we present is basically adapted from the one in [17], though there are some minor modifications. We explain how the insertion operation on a RBT is implemented. Clearly, the invariant we intend to capture is that inserting an entry into a RBT yields another RBT. In other words, we intend to declare that the insertion operation is of the following type.

key->[c:color][bh:nat] rbtree(c,bh,0) -> [c:color][bh:nat] rbtree(c,bh,0)

If we insert an entry into a RBT, some properties of the RBT may be violated. These properties can be restored through some rotation operations. The function restore in Figure 6 is defined for this purpose.

The type of restore is easy to understand. It states that this function takes a tree, an entry and another tree, where the two trees together carry at most one color violation, and returns a RBT. The two trees in the argument must have the same black height $bh$ for some natural number $bh$, and the returned RBT has black height $bh + 1$. This information can be of great help for understanding the code. If the information had been informally expressed through comments, it would be difficult to know whether the comments can be trusted. Also notice that it is not trivial at all to verify the information manually. We could imagine that almost everyone who did this would appreciate the availability of a type-checker to perform it automatically.

There is a great difference between type-checking a pattern matching clause in DML and in ML. The operational semantics of ML requires that pattern matching be performed sequentially, that is, the chosen pattern matching clause is always the first one which matches a given value. For instance, in the definition of the function restore, if the last clause is chosen at run-time, then we know the argument of restore does not match any of the clauses ahead of the last one. This must be taken into account when we type-check pattern matching in DML. One approach is to expand patterns into disjoint ones. For instance, the pattern (a, x, b) expands into 36 patterns $(pattern_1, x, pattern_2)$, where $pattern_1$ and $pattern_2$ range over the following six patterns: R(B _, _, B _), R(B _, _, E), R(E, _, B _), R(E, _, E), B _, and E. Unfortunately, such expansion may lead to combinatorial explosion. An alternative is to require the programmer to indicate whether such expansion is needed. Neither of these is currently available in deCaml, and the author has taken the trouble to expand patterns into disjoint ones by hand when necessary. We emphasize that the code in Figure 6 must be expanded in this way in order to pass type-checking in deCaml. Though this can be fixed straightforwardly, it is currently unclear which method solves the problem best.

The complete implementation of the insertion operation follows immediately. Notice that the type of the function ins indicates that ins may return a tree with one color violation if it is applied to a tree with a red top node. This is fixed by replacing the top node with a black one for every returned tree with a red top node.
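To make the invariants concrete, here is a minimal sketch of the same insertion in plain OCaml, assuming no dependent types. A plain datatype cannot express color violations or black heights, so the function black_height below (a helper introduced for this illustration, not part of the paper) recomputes at run time what the deCaml type-checker establishes statically. The four rebalancing clauses of restore are merged into one or-pattern; the behaviour is intended to match Figure 6.

```ocaml
type key = int

type rbtree =
  | E
  | B of rbtree * key * rbtree
  | R of rbtree * key * rbtree

exception Item_already_exists

(* Rebuild a black node one of whose subtrees may carry a single
   red-red violation at its root. *)
let restore = function
  | (R (R (a, x, b), y, c), z, d)
  | (R (a, x, R (b, y, c)), z, d)
  | (a, x, R (R (b, y, c), z, d))
  | (a, x, R (b, y, R (c, z, d))) -> R (B (a, x, b), y, B (c, z, d))
  | (a, x, b) -> B (a, x, b)

let insert x t =
  let rec ins = function
    | E -> R (E, x, E)
    | B (a, y, b) ->
      if x < y then restore (ins a, y, b)
      else if y < x then restore (a, y, ins b)
      else raise Item_already_exists
    | R (a, y, b) ->
      (* May produce a red-red violation at the top; the caller repairs it. *)
      if x < y then R (ins a, y, b)
      else if y < x then R (a, y, ins b)
      else raise Item_already_exists
  in
  match ins t with
  | R (a, y, b) -> B (a, y, b)   (* blacken a red root *)
  | t' -> t'

(* Some bh if the tree has no red-red violation and every path from the
   root to a leaf has black height bh; None otherwise. *)
let rec black_height = function
  | E -> Some 0
  | R (R _, _, _) | R (_, _, R _) -> None
  | B (l, _, r) ->
    (match black_height l, black_height r with
     | Some hl, Some hr when hl = hr -> Some (hl + 1)
     | _ -> None)
  | R (l, _, r) ->
    (match black_height l, black_height r with
     | Some hl, Some hr when hl = hr -> Some hl
     | _ -> None)

(* In-order traversal, used only to inspect the result. *)
let rec to_list = function
  | E -> []
  | B (l, x, r) | R (l, x, r) -> to_list l @ (x :: to_list r)
```

In this untyped version a mistake in restore would only show up when black_height returns None on some test input, whereas in deCaml the same mistake is a compile-time type error.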

Moreover, we can use an extra index to indicate the size of a RBT. If we do so, we can then show that the insert function always returns a RBT of size $n + 1$ when given a RBT of size $n$ (note that an exception is raised if the inserted entry already exists in the tree). Please refer to [23] for details.

4.4 Binomial Heaps

A binomial tree is defined recursively; a binomial tree $B_0$ with rank 0 consists of a single node, and a binomial tree $B_{k+1}$ of rank $k + 1$ consists of two linked binomial trees $B_k$ of rank $k$ such that the root of one $B_k$ is the leftmost son of the other $B_k$. A binomial heap $H$ is a collection of binomial trees that satisfy the properties: (a) each binomial tree in $H$ is heap-ordered, that is, the key of a node is greater than or equal to the key of its parent, and (b) there is at most one binomial tree in $H$ whose root has a given degree. Please refer to [6] for details.

We declare some datatypes in Figure 7 for forming binomial heaps. The type tree(n) is for binomial trees of rank $n$, and the type treelist(n) is for a list of binomial trees with decreasing ranks and $n = m + 1$ if the list is not empty, where $m$ is the rank of the first binomial tree in the list. We represent a binomial heap as a list of binomial trees with increasing ranks. For a heap of type heap(n), if $n = 0$ then the heap is empty; otherwise $n = m + 1$ where $m$ is the rank of the first binomial tree in the heap. Notice that we attach the rank to each tree node in order to efficiently compute the rank of a tree, while using the type of Node to guarantee that the first component of each node indeed represents the rank of that node.

Notice that the datatype for binomial trees does not capture the invariant stating that these trees are heap-ordered. This seems to be beyond the reach of dependent datatypes. Also note that we would not be able to capture some of the invariants if we used the ordinary list constructors, that is, nil and cons, to form tree lists. This leads to the introduction of Tempty, Tcons, Hempty and Hcons. This special feature in programming with dependent datatypes has an unpleasant consequence, which we mention in Section 5.

The implementation in Figures 7 and 8 is largely adapted from [17]. Since the type for the function merge is relatively complex, we explain it as follows. This type states that given two binomial heaps of types heap(m) and heap(n), respectively, this function returns a binomial heap of type heap(l) for some $l$ such that

$l = m$ if $n = 0$, or
$l = n$ if $m = 0$, or
$l \geq \min(m, n) > 0$ otherwise.
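The relation stated by the type of merge can be observed at run time in a plain OCaml sketch. The version below assumes ordinary lists in place of treelist and heap, so neither the rank stored in a node nor the ordering of ranks is checked by the compiler. The functions index and increasing are helpers introduced for this illustration: index computes the number $n$ that indexes heap(n), and increasing tests the rank ordering that Hcons enforces statically. With ordinary lists, List.rev stands in for the to_heap function of Figure 8.

```ocaml
(* A node stores its rank, its key, and its children in decreasing rank. *)
type tree = Node of int * int * tree list

(* A heap is a list of trees in strictly increasing rank (unchecked). *)
type heap = tree list

exception Heap_is_empty

let rank (Node (r, _, _)) = r
let root (Node (_, x, _)) = x

(* Link two trees of equal rank r into one tree of rank r + 1. *)
let link (Node (r, x1, ts1) as t1) (Node (_, x2, ts2) as t2) =
  if x1 <= x2 then Node (r + 1, x1, t2 :: ts1)
  else Node (r + 1, x2, t1 :: ts2)

let rec ins_tree t = function
  | [] -> [t]
  | t' :: ts' as ts ->
    if rank t < rank t' then t :: ts else ins_tree (link t t') ts'

let insert x hp = ins_tree (Node (0, x, [])) hp

let rec merge hp1 hp2 =
  match hp1, hp2 with
  | _, [] -> hp1
  | [], _ -> hp2
  | t1 :: hp1', t2 :: hp2' ->
    if rank t1 < rank t2 then t1 :: merge hp1' hp2
    else if rank t1 > rank t2 then t2 :: merge hp1 hp2'
    else ins_tree (link t1 t2) (merge hp1' hp2')

let rec remove_min_tree = function
  | [] -> raise Heap_is_empty
  | [t] -> (t, [])
  | t :: hp ->
    let (t', hp') = remove_min_tree hp in
    if root t < root t' then (t, hp) else (t', t :: hp')

let find_min hp = root (fst (remove_min_tree hp))

let delete_min hp =
  let (Node (_, _, ts), hp') = remove_min_tree hp in
  merge (List.rev ts) hp'

(* The index n of heap(n): 0 if empty, else rank of the first tree + 1. *)
let index = function
  | [] -> 0
  | t :: _ -> rank t + 1

(* Run-time check of the ordering that Hcons guarantees statically. *)
let rec increasing = function
  | t1 :: (t2 :: _ as rest) -> rank t1 < rank t2 && increasing rest
  | _ -> true
```

For two non-empty heaps h1 and h2, one would expect index (merge h1 h2) >= min (index h1) (index h2) to hold, which is the third case of the type above.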

5 Limitation

We mention some limitations of dependent datatypes in this section.

In order to capture invariants, we may have to declare new datatypes instead of using existing ones. For instance, we declared the datatype treelist in Figure 7 instead of using the existing list constructors to form a list of trees. The reason is that we wanted to form only lists of binomial trees with decreasing rank. Similarly, we introduced the datatype heap to capture the invariant that a binomial heap is a list of trees with increasing rank. This forces us to define the function to_heap later, which essentially reverses a list of trees and appends it to a heap. If we used the existing list constructors without declaring either treelist or heap, we could then use some existing function on lists instead of defining to_heap. In other words, using dependent datatypes may lose some opportunities for code reuse.

Another limitation can be illustrated using the following example. Let B be the constructor declared in Figure 2, which is used to form Braun trees. Suppose that B(x, l, r) occurs in the code where the programmer knows for some reason that l is the same size as r or contains one more element, but this cannot be established in the type system of deCaml. In this case, the code is rejected by the deCaml type-checker, though the code will cause no run-time error (if we trust the programmer). The situation is very similar to the case where we move from an untyped programming language to a typed one. A solution to this problem is to introduce some run-time checks. For instance, we may define the following function and replace B(x, l, r) with make_brauntree x l r.

let make_brauntree x l r =
  let m = size(l) and n = size(r) in
  if n <= m && m <= n+1 then B(x, l, r) else raise Illegal_argument
withtype int -> brauntree -> brauntree -> brauntree

The function make_brauntree can readily pass type-checking in deCaml (we refer the interested reader to [22] for further details). The penalty in this case is that make_brauntree takes $O(\log^2 n)$ time to build a tree of size $n$, though this can be avoided if we store size information in each node.
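The remark about storing size information can be sketched as follows in plain OCaml. This is an illustration under stated assumptions rather than the declaration of Figure 2: the constructor names L and B, the position of the cached size as the first component of B, and the helper cons are choices made here. With the size cached, the run-time check in make_brauntree costs constant time.

```ocaml
exception Illegal_argument

(* Each B node caches the number of elements stored in its subtree. *)
type 'a brauntree =
  | L
  | B of int * 'a * 'a brauntree * 'a brauntree

(* Constant time, thanks to the cached size. *)
let size = function
  | L -> 0
  | B (s, _, _, _) -> s

(* Smart constructor: checks size r <= size l <= size r + 1 at this node.
   It does not re-check the subtrees, which are assumed to have been
   built through make_brauntree as well. *)
let make_brauntree x l r =
  let m = size l and n = size r in
  if n <= m && m <= n + 1 then B (m + n + 1, x, l, r)
  else raise Illegal_argument

(* Add an element at the front; the two subtrees swap roles so that
   the balance condition is preserved. *)
let rec cons x = function
  | L -> make_brauntree x L L
  | B (_, y, l, r) -> make_brauntree x (cons y r) l
```

The price is one extra integer per node, and the invariant is still only as trustworthy as the discipline of always going through make_brauntree.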

In general, if the programmer anticipates the above situation to occur frequently, then she or he should either make sure that run-time checks can be done efficiently or switch back to non-dependent datatypes.


datatype tree with nat =
    {n:nat} Node(n) of int(n) * int * treelist(n)

and treelist with nat =
    Tempty(0)
  | {m:nat}{n:nat | m >= n} Tcons(m+1) of tree(m) * treelist(n) ;;

datatype heap with nat =
    Hempty(0)
  | {m:nat}{n:nat | n = 0 \/ m+1 < n} Hcons(m+1) of tree(m) * heap(n) ;;

let rank = function Node(r, _, _) -> r
withtype {n:nat} tree(n) -> int(n) ;;

let root = function Node(_, x, _) -> x
withtype {n:nat} tree(n) -> int ;;

let link (Node(r, x1, ts1) as t1) = function
    Node(_, x2, ts2) as t2 ->
      if (x1 <= x2) then Node(r+1, x1, Tcons(t2, ts1))
      else Node(r+1, x2, Tcons(t1, ts2))
withtype {r:nat} tree(r) -> tree(r) -> tree(r+1) ;;

let rec insTree t = function
    Hempty -> Hcons(t, Hempty)
  | Hcons(t', ts') as ts ->
      if rank t < rank t' then Hcons(t, ts) else insTree (link t t') ts'
withtype {r:nat}{n:nat | n = 0 \/ r < n}
         tree(r) -> heap(n) -> [l:nat | l > r] heap(l) ;;

let insert x hp = insTree (Node(0, x, Tempty)) hp
withtype int -> [n:nat] heap(n) -> [n:nat | n > 0] heap(n) ;;

let rec merge = function
    (hp1, Hempty) -> hp1
  | (Hempty, hp2) -> hp2
  | (Hcons(t1, hp1') as hp1), (Hcons(t2, hp2') as hp2) ->
      if rank t1 < rank t2 then Hcons(t1, merge(hp1', hp2))
      else if rank t1 > rank t2 then Hcons(t2, merge(hp1, hp2'))
      else let hp = merge(hp1', hp2') in insTree (link t1 t2) hp
withtype {m:nat}{n:nat} heap(m) * heap(n) ->
         [l:nat | (n = 0 /\ l = m) \/ (m = 0 /\ l = n) \/
                  (l >= min(m, n) > 0)] heap(l) ;;

Figure 7: An implementation of binomial heaps in deCaml (I)


exception Heap_is_empty ;;

let rec removeMinTree = function
    Hempty -> raise Heap_is_empty
  | Hcons(t, Hempty) -> (t, Hempty)
  | Hcons(t, hp) ->
      let (t', hp') = removeMinTree hp in
      if root t < root t' then (t, hp) else (t', Hcons(t, hp'))
withtype {n:nat} heap(n) ->
         [r:nat][l:nat | l = 0 \/ l >= n > 0] (tree(r) * heap(l)) ;;

let findMin hp = let (t, _) = removeMinTree hp in root t
withtype {n:nat} heap(n) -> int ;;

let rec to_heap hp = function
    Tempty -> hp
  | Tcons(t, ts) -> to_heap (Hcons(t, hp)) ts
withtype {m:nat}{n:nat | m = 0 \/ m > n}
         heap(m) -> treelist(n) -> heap ;;

let deleteMin hp =
  let (Node(_, x, ts), hp) = removeMinTree hp
  in merge (to_heap Hempty ts, hp)
withtype heap -> heap ;;

Figure 8: An implementation of binomial heaps in deCaml (II)

We recommend that the programmer avoid complex encodings when using dependent datatypes to capture invariants in data structures.

6 Related Work

The use of type systems in program error detection is ubiquitous. Usually, the types in general purpose programming languages such as ML and Java are relatively inexpressive for the sake of practical type-checking. In these languages, the use of types in program verification is effective but too limited. Our work can be viewed as providing a more expressive type system to allow the programmer to capture more program properties through types and thus catch more errors at compile-time. As a consequence, types can serve as informative program documentation, facilitating program comprehension. We assign priority to the practicality of type-checking in our language design and emphasize the need for restricting the expressiveness of a type system.

In [21], we have compared our work with some traditional dependent type systems such as the ones underlying Coq [8] and NuPrl [5], which are far more refined than the type system of DML. There, we also give a comparison with the notion of indexed types [25] (an earlier version of which is described in [24]), the notion of refinement types [9, 7], the notion of sized types [13], and the programming language Cayenne [1].

There have been many recent studies on the use of nested datatypes [2] in constructing (sophisticated) datatypes to capture more invariants in data structures. For instance, a variety of examples can be found in [3, 18, 10, 12, 11]. We feel that the advantage of this approach is that it requires relatively minor language extensions, which may include polymorphic recursion, higher-order kinds, and rank-2 polymorphism, to existing functional programming languages such as Haskell, while type-checking in DML is much more involved. On the other hand, this approach seems less flexible, often requiring some involved treatment at both type and program level. The important notion of datatype refinement in DML cannot be captured with nested datatypes. For instance, it is impossible to form a nested datatype that can capture the notion of the length of a list, since this would imply that one could simply use types to distinguish non-empty lists from empty ones. In general, we think that these two approaches are essentially orthogonal in spite of some similar motivations behind their development, and they can be readily combined with little effort.

7 Conclusion

The use of dependent datatypes in capturing invariants in data structures is novel. This practice can offer many advantages when we implement algorithms in advanced programming languages equipped with such a mechanism. The most significant advantage is probably in program error detection. We argued in Section 1 that the imprecision of datatypes in Standard ML or Haskell in capturing invariants can be a rich source of run-time program errors. In addition, the dependent type annotations supplied by the programmer are mechanically verified and can thus be fully trusted. They can serve as valuable program documentation, facilitating program understanding. There are also various uses of dependent datatypes in compiler optimization.

Type-checking in DML is largely independent of the size of a program, since a type-checking unit is roughly the body of a toplevel function. In general, what matters in type-checking is the difficulty level of the properties that are to be checked. A more serious issue is how to report error messages in case of type errors. The type-checker of deCaml implements a top-down style algorithm, which usually pinpoints the location of a type error. Unfortunately, the author finds that it may often be surprisingly difficult to figure out the cause of a type error. On the positive side, the type-checker of deCaml is often capable of detecting a variety of subtle errors. For instance, the author once used Even(l1, l2) to form a random-access list (in Figure 3) and the type-checker raised an error because it could not prove that l1 cannot be Nil. If this had gone unnoticed, it would have invalidated some invariant assumed by the programmer, potentially causing run-time errors that are difficult to trace. We are currently in the process of gathering more statistics regarding the use of deCaml.

The usual focus of data structure design is mainly on enhancing time and/or space efficiency, and less attention is paid to program error detection. The introduction of dependent datatypes provides an opportunity to remedy the situation. In general, we are interested in promoting the use of light-weight formal methods in practical programming, enhancing the robustness of programs. We have presented some concrete examples of dependent datatypes in this paper in support of such a promotion. We hope these examples can raise the awareness of dependent datatypes and their use in implementing algorithms.

8 Acknowledgment

I thank Chris Okasaki, Ralf Hinze and an anonymous referee for their constructive comments, which have undoubtedly raised the quality of the paper. Also I thank Jim Hook for his generous help in presenting the paper.

References

[1] Lennart Augustsson. Cayenne – a language with dependent types. In Proceedings of the 3rd ACM SIGPLAN International Conference on Functional Programming, pages 239–250, 1998.

[2] Richard Bird and Lambert Meertens. Nested datatypes. In Mathematics of Program Construction, pages 52–67. Springer-Verlag LNCS 1422, 1998.


[3] Richard Bird and Ross Paterson. de Bruijn notation as a nested datatype. Journal of Functional Programming, To appear.

[4] W. Braun and M. Rem. A logarithmic implementation of flexible arrays. Technical Report Memorandum MS83/1, Eindhoven University of Technology, 1983.

[5] Robert L. Constable et al. Implementing Mathematics with the NuPrl Proof Development System. Prentice-Hall, Englewood Cliffs, New Jersey, 1986.

[6] Thomas H. Cormen, Charles E. Leiserson, and Ronald L. Rivest. Introduction to Algorithms. The MIT Press, Cambridge, Massachusetts, 1989.

[7] Rowan Davies. Practical refinement-type checking. Thesis Proposal, November 1997.

[8] Gilles Dowek, Amy Felty, Hugo Herbelin, Gérard Huet, Chet Murthy, Catherine Parent, Christine Paulin-Mohring, and Benjamin Werner. The Coq proof assistant user's guide. Rapport Techniques 154, INRIA, Rocquencourt, France, 1993. Version 5.8.

[9] Tim Freeman and Frank Pfenning. Refinement types for ML. In ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 268–277, Toronto, Ontario, 1991.

[10] Ralf Hinze. Numerical Representations as Higher-Order Nested Types. Technical Report IAI-TR-98-12, Institut für Informatik III, Universität Bonn, April 1998.

[11] Ralf Hinze. Constructing Red-Black Trees. In Proceedings of Workshop on Algorithmic Aspects of Advanced Programming Languages, September 1999.

[12] Ralf Hinze. Manufacturing Datatypes. In Proceedings of Workshop on Algorithmic Aspects of Advanced Programming Languages, September 1999. Also available as Technical Report IAI-TR-99-5, Institut für Informatik III, Universität Bonn.

[13] John Hughes, Lars Pareto, and Amr Sabry. Proving the correctness of reactive systems using sized types. In Conference Record of 23rd ACM SIGPLAN Symposium on Principles of Programming Languages, pages 410–423, 1996.

[14] INRIA. Caml-light. http://caml.inria.fr.

[15] Robin Milner, Mads Tofte, Robert W. Harper, and D. MacQueen. The Definition of Standard ML. MIT Press, Cambridge, Massachusetts, 1997.

[16] Chris Okasaki. Three Algorithms on Braun Trees. Journal of Functional Programming, 7(6):661–666, November 1997.

[17] Chris Okasaki. Purely Functional Data Structures. Cambridge University Press, 1998.

[18] Chris Okasaki. From Fast Exponentiation to Square Matrices: An Adventure in Types. In Proceedings of the 4th ACM SIGPLAN International Conference on Functional Programming, September 1999.

[19] Simon Peyton Jones et al. Haskell 98 – A non-strict, purely functional language. Available from http://www.haskell.org/onlinereport/, February 1999.

[20] H. Xi and F. Pfenning. Eliminating array bound checking through dependent types. In Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 249–257, Montreal, June 1998.

[21] H. Xi and F. Pfenning. Dependent types in practical programming. In Proceedings of ACM SIGPLAN Symposium on Principles of Programming Languages, pages 214–227, San Antonio, January 1999.

[22] Hongwei Xi. Dependent Types in Practical Programming. PhD thesis, Carnegie Mellon University, 1998. pp. viii+189. Available as http://www.cs.cmu.edu/~hwxi/DML/thesis.ps.

[23] Hongwei Xi. Some programming examples in deCaml. Available at http://www.cse.ogi.edu/~hongwei/DML/deCaml/examples/, 1999.

[24] Christoph Zenger. Indexed types. Theoretical Computer Science, 187:147–165, 1997.

[25] Christoph Zenger. Indizierte Typen. PhD thesis, Fakultät für Informatik, Universität Karlsruhe, 1998. Forthcoming.


Teaching monadic algorithms to first-year students*

Ricardo Peña, Yolanda Ortega, Fernando Rubio

Departamento de Sistemas Informáticos y Programación
Universidad Complutense de Madrid, 28040 Madrid, Spain.

e-mails: {ricardo, yolanda}@sip.ucm.es, [email protected]

Abstract

The main claim of this paper is that imperative concepts such as sequencing, repetition, mutable state, and I/O can be taught to first-year students by using the monadic facilities of a functional language such as Haskell.

We report on an experience of teaching algorithms involving arrays that are typical of a first programming course (such as insertion sort, bubble sort, linear search, and so on) by using the monadic style. It appears that our students do not have special difficulties in grasping both the imperative concepts and the algorithms. They learn these algorithms after a previous exposure to classical functional programming.

In the paper, we provide a rich sample of the algorithms used in the course. We also claim that higher-order constructions make it easier for our students to design complex monadic algorithms.

Keywords: imperative functional programming, monadic algorithms, education.

1 Introduction

Since the end of the 80's, there has been a broad trend to abandon imperative languages in favour of functional ones in introductory programming courses. So, at many universities, Pascal has been replaced by Scheme, ML (and its variants), Miranda or, more recently, Gofer and Haskell as the first programming language to be learnt by undergraduates. Experiences of this kind have already been reported in a number of papers (see, for instance, [7, 9, 8]). Therefore, there is no need to repeat here the benefits of the functional paradigm for 'inexperienced' students.

Since execution efficiency is the weak point of functional programming languages, most of the recent research in the functional field has been devoted to increasing the efficiency of functional programs. One of the most interesting results is monadic programming [15, 17]. A monadic style enables the programmer to cope with interaction and state-based computations in a functional setting. Also, higher-order structures can be defined which mimic the control structures of imperative languages, giving rise to the term imperative functional programming [6, 10]. However, due to the relationship between monads and category theory, and the proximity of the monadic style to the tempting realm of imperative programming, these advances have been mostly relegated to postgraduate courses.

We claim that it is completely viable to teach monadic algorithms to freshmen. Moreover, this can, and should, be done without explaining the technical details of monads. The benefits of accepting this challenge are two-fold: on the one hand, students are able to tackle a wider spectrum of programming applications; on the other hand, they learn imperative concepts without leaving the functional world.

* Work partially supported by the Spanish projects CAM-06T/033/96 and CICYT-TIC97-0672.

The aim of the present paper is to substantiate our claim by explaining how to gently introduce the monadic style of programming to first-year students, and by providing some simple, yet illustrative, examples for this teaching task. We start with a brief presentation of the context in which our proposal arose. Then we explain the proposal in some detail. Section 4 is the core of the paper, containing the teaching sequence we have followed and the corresponding monadic algorithms. We end by commenting on some results from our experience.

2 The context

Before presenting our proposal, it is important to clearly explain the context and circumstances of our course. Attempts to give introductory programming courses based on functional programming languages have sometimes been forsaken for fear of a not completely satisfactory integration with the rest of the curriculum. Functional programming turns out to be so natural and close to problem-thinking that students find it difficult to handle languages like FORTRAN or C when they are confronted with this low-level programming style in successor courses.

It is a reality that computer science curricula are mostly imperative programming oriented, with most subjects based on this style, while other programming paradigms are included as complementary or optional courses. For instance, while there is a great variety of first courses on programming from the functional perspective, there are very few proposals for a second course on programming (advanced data structures and program design methods) in a functional style (see [13] for a proposal).

Nonetheless, our proposal is not addressed to future computer engineers, but to first-year undergraduate mathematics students, who must follow a compulsory course on programming and, probably, will never learn anything more about computers or programming. Although, in our case, there is a second programming course on data structures and algorithms, this is only one option among a large and diverse offering of pure and applied mathematics subjects. Therefore, the main goal of this introductory course to programming is not to prepare students for later courses in the computing discipline, but to teach them how to use such a powerful and nowadays indispensable tool: a programming language. Of course, it is not our goal to teach a particular language and system, but to make the students understand the main concepts in programming so that they will be able to design algorithms to solve their problems, and to express them in the available programming language, which is imperative in most cases. While the functional style is excellent for algorithm design, even more so for mathematicians, the training would be incomplete without an understanding of the imperative computing model and of the most typical data structures of the imperative style, i.e. arrays and files, which will be extensively used in subjects like numerical analysis or statistics.

A first attempt we made (inspired by the approach of [5]) was to present functional languages as a specification tool for describing algorithms, which could be directly executed, or which could be later efficiently implemented in an imperative language. Actually, Hartel and Muller describe how to learn C after a first course on SML. A related experience is presented in [3], where Miranda is used for ADT specifications to be implemented in C. In [7] a first-year course combining functional and imperative programming is described. Our project was not so ambitious, because we were constrained to a one-year course. Thus, 75% of the course was devoted to pure functional programming, while the remaining 25% was employed to explain, by using a conventional imperative language, the main imperative concepts (updatable variables, sequencing, iteration, arrays, files, subprograms). However, this first experience was quite a failure. The main reason was the scarce integration between the two programming styles. The methodology 'functional specification – imperative implementation' only worked for simple examples because many of the functional constructions, like non-tail recursion or higher-order functions, were difficult to translate in a systematic way to the imperative style. We concluded that it was easier to design the algorithms directly using the imperative features. Consequently, the students 'divided' our course into two independent subjects: Haskell and Pascal, which were the languages chosen to be used in laboratories. This disintegration was aggravated by the lack of time: 60 classroom hours plus 30 hours in labs appeared to be too scanty to make them understand the two paradigms. Thus, while students were still fighting against higher-order functions, we suddenly started to talk about states and iterations. It is not the case that these concepts are difficult to grasp, but the students were unable to express them in the new syntax, and the confusion between the two notations was great.

3 The proposal

As we have explained in the previous section, the failure of our first experience was caused by the disintegration between the two programming styles, increased by the use of two different syntaxes. Hence, what about having the two programming models in a single language? This led us to the monadic programming style mentioned in the introduction of this paper.

Our actual proposal distributes the subject in a 75% 'pure' functional + 25% 'imperative' functional. In this way, we still keep a quarter of the course devoted to the essentials of the imperative model:

• control of sequence;

• repetition, as an alternative to recursion; and

• a mutable state, allowing efficient data structures (arrays) and permanent data (files).

We have chosen Haskell as the supporting language because it includes all the features we desire to communicate to our students, while enjoying an easy to learn and handy syntax. Moreover, it is widely known in the functional language community, with much ongoing research on it, and it provides very efficient compilers. There also exists the possibility of using an interpreter like HUGS, which allows the students to quickly test on the computer the examples learned in the classroom, and to easily develop small programs.

A detailedprogramis givennext:

Part I: Introducing Programming
Lesson 1: Introduction. Algorithms and programs. Underlying hardware. Programming languages. Operating systems and translators.
Lesson 2: Program correctness. Program specification. Program design and verification.

Part II: Basic Functional Programming
Lesson 3: Basic types and simple expressions. Haskell: basic syntax and evaluation. Values and data types. Integers, floating point numbers, booleans, characters and strings.
Lesson 4: Function definitions. Conditional expressions and guards. Simple patterns. Function application. Function composition.
Lesson 5: Top-down design. Declaration scope. Programming with local definitions. Function refinement.
Lesson 6: Recursive functions. Mathematical induction. Recursive decomposition. Recursive functions over integers. Proof by induction.
Lesson 7: The type system. Introducing classes. A tour of the built-in Haskell classes. Monomorphic and polymorphic types. Type checking.
Lesson 8: Tuples. Concept. Value construction and patterns. Standard operations. Component selection and pattern matching.
Lesson 9: Lists. Concept. Value construction and patterns. Polymorphic lists. Standard operations. Recursive functions over lists. Proof by structural induction.
Lesson 10: Designing functions over lists. List traversals and searches. Sorting lists: selection sort, insertion sort, merge sort. Analysis of correctness.
Lesson 11: Program efficiency. O-notation. Basic orders of efficiency. Time complexity analysis.
Lesson 12: Higher-order functions. Functions as arguments. Higher-order functions over lists: filtering, mapping and folding. Insertion sort revisited. Functions as values and results. Partial application. Sections and lambda abstractions. Currying and uncurrying.
Lesson 13: List comprehensions. Concept and syntax. Examples: primes, quicksort. List comprehension and higher-order functions.
Lesson 14: Introducing abstract data types. The ADT concept. Modules in Haskell. Examples: stacks, FIFO queues, and sets. Implementation using lists.

Part III: Imperative Functional Programming
Lesson 15: The imperative computing model. Updatable variables and states. Sequential composition and iteration. Relationship with the underlying hardware.
Lesson 16: Interactive input and output. Interactive keyboard input and screen output. Interactive programs with file input/output. Sequencing using >> and >>=. The do notation.
Lesson 17: Immutable arrays. Index types. The Array module. Array creation and subscripting. Useful functions over arrays. Examples: tabulating results, binary search, inserting in a sorted array, matrix product.
Lesson 18: Mutable arrays. The ST (Strict State Thread) module. Basic actions over (ST s) a. Constructing a mutable computation. Examples: insertion sort, bubble sort.

Notice that we introduce classes (Lesson 7). We find it difficult for students to understand the type information provided by HUGS if they know nothing about Haskell classes. However, we restrict ourselves to explaining the most basic concepts, and we do not expect our students to create new classes. On the other hand, algebraic types are absent from the program presented here. The main use of algebraic types is the definition of recursive types (e.g. trees), which we think are better suited to a second-year course. The structures we expect our students to master are the linear ones: lists and arrays.

The last part of the course, the one devoted to imperative functional programming, starts (Lesson 15) with an introduction to typical imperative concepts, without mentioning the functional paradigm.

The expected advantages of this new approach reside not only in keeping the same syntax for the two styles, but also in keeping the same programming environment in the laboratories. This saves a lot of time and mistakes. Besides that, it allows the student to continue using, in the imperative part, the usual functional style for the nonmonadic functions, thus contributing to their maturity in the paradigm.

4 Imperative functional programming by example

This section contains a detailed presentation of the teaching sequence we have followed in the imperative part of the course, and a number of illustrative monadic and nonmonadic algorithms we have used to convey the imperative concepts to the students.

4.1 Sequence and iteration: I/O interaction

The simplest imperative concept to start with is sequential composition of actions. For the first time in the course, we care about the specific order in which actions should be performed. Input/output interaction is an area in which the student can naturally appreciate that the control of this ordering is important.

Atomic actions We start by explaining output, the type IO (), and the most elementary I/O action, the one doing nothing: done :: IO (). Then, we go on with other atomic output actions: writing a character, writing a string, writing a complete file, and so on. Then, we generalize to input, to the type IO a and its atomic actions: return a, reading a character, reading a line, reading a complete file, and so on. As in [16] and in [1, Chapter 10], we stress the difference between defining an I/O action and performing it.

Sequencing actions In order to be able to establish dialogues, some way of sequencing these elementary actions must be provided. First we introduce the sequential combinator >>:

main = putChar 'a' >> putChar 'b'

Once two actions have been combined, recursion provides the means to sequence a variable number of actions:

putStr ""     = done
putStr (c:cs) = putChar c >> putStr cs

When an I/O action returns a value different from (), some way must be provided so that the rest of the interaction can use this value. If we write

main = getChar >> putChar 'a'

the >> combinator simply ignores the value returned by the first action, so we justify the second combinator >>=:

main = getChar >>= putChar

We explain the type (>>=) :: IO a -> (a -> IO b) -> IO b and build more complex interactions:

getLine = getChar >>= \c ->
          if c == '\n' then return ""
          else getLine >>= \cs -> return (c:cs)

To explain these ideas we do not appeal to monads. For students, the >>= combinator is just read 'followed by'.

The do-notation At this point, the need for a more compact and clear notation is strongly felt, and we introduce the do-notation, explaining that this is just an abbreviation of the more cumbersome combination of >>, >>= and lambda abstractions:

getLine = do c <- getChar
             if c == '\n' then return ""
               else do cs <- getLine
                       return (c:cs)

We could have chosen to explain only the do-notation, instead of presenting it as an abbreviation of more elementary concepts. But, in doing so, we could have given the impression of a magical behaviour behind imperative-style algorithms. We have preferred to stress that programs are still functional.
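As a small illustration of this point, the sketch below writes one action twice, first with raw >>=, >> and lambda abstractions and then in do-notation. To keep it checkable without a keyboard it uses the ST monad instead of IO. The names counterBind and counterDo are ours, and the modules Control.Monad.ST and Data.STRef are the current GHC library names, which is an assumption with respect to the libraries available when the course was given.

```haskell
import Control.Monad.ST (runST)
import Data.STRef (newSTRef, readSTRef, writeSTRef)

-- Hypothetical example: increment a mutable counter twice and return its
-- final value, written with the explicit combinators.
counterBind :: Int -> Int
counterBind start = runST
  (newSTRef start >>= \ref ->
   readSTRef ref >>= \x ->
   writeSTRef ref (x + 1) >>
   readSTRef ref >>= \y ->
   writeSTRef ref (y + 1) >>
   readSTRef ref)

-- The same action in do-notation: each "v <- act" abbreviates
-- "act >>= \v -> ...", and each plain statement abbreviates "act >> ...".
counterDo :: Int -> Int
counterDo start = runST (do
  ref <- newSTRef start
  x <- readSTRef ref
  writeSTRef ref (x + 1)
  y <- readSTRef ref
  writeSTRef ref (y + 1)
  readSTRef ref)
```

Both definitions denote the same function; only the notation differs.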

Repetition Frequently in interactions, there is the need to repeat an action until some desired property holds. Here is an example of a program reading an integer between 1 and a given number n:

readInt :: Int -> IO Int
readInt n = do putStr ("Type an integer between 1 and " ++ show n ++ ": ")
               s <- getLine
               let x = read s in
                 if all isDigit s && 1 <= x && x <= n then return x
                 else readInt n

We tell the students that this construction is very typical in an imperative language and that there are special control structures such as while and repeat to express it.

Top-down design of interactions Monadic dialogues should not look like long sequences of actions. Top-down design has its place here. When a complex dialogue must be designed, it is advisable to split it into pieces, each one taking care of a part of the interaction. For instance, we can design a program performing the following loop: displaying a menu, inviting the user to choose an option, performing the corresponding action, and going back to the loop, or leaving it if the option chosen was the last one:

main = do showMenu
          i <- readInt 3
          case i of
            1 -> do {action1 ; main}
            2 -> do {action2 ; main}
            3 -> done

4.2 Read-only state: immutable arrays

The next important imperative concept is the array data structure. It mimics the computer memory and so allows access to any single component in constant time, independently of the number of elements stored in the array. It is important that students understand the differences between this structure and lists: (i) once created, an array cannot be extended with new elements to produce a new array; and (ii) the recursion patterns for arrays are based on changing index intervals instead of on applying the recursive function to a substructure of the same type.

The type Array a b of immutable arrays is a good starting point for introducing mutable arrays later on. There are many algorithms, mainly reading from immutable arrays, having the same time complexity as if they were programmed in an imperative language. One of them is linear search:

linSearch :: Eq b => Array Int b -> b -> Maybe Int
linSearch a x = linSearch' a x low up
  where (low,up) = bounds a

linSearch' a x j up
  | j > up    = Nothing
  | x == a!j  = Just j
  | otherwise = linSearch' a x (j+1) up

We will always use the technique shown in this example in every recursive definition related to either immutable or mutable arrays: the function to be designed is embedded in a more general one having at least two additional parameters, the next index to be dealt with and the upper bound of this index. This more general function is recursively designed: a base case is reached when we get the empty interval of indices; in the recursive case, we decrease the length of the index interval. Of course, if the array is sorted, we can do better by using a binary search:

binSearch :: Ord b => Array Int b -> b -> Maybe Int
binSearch a x = binSearch' a x low up
  where (low,up) = bounds a

binSearch' a x j k
  | j > k    = Nothing
  | x < a!m  = binSearch' a x j (m - 1)
  | x == a!m = Just m
  | x > a!m  = binSearch' a x (m + 1) k
  where m = (j + k) `div` 2

It is well known that this algorithm has logarithmic cost. We emphasize the fact by explaining that no search algorithm using lists as a search structure can achieve this cost.
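The logarithmic bound can be made tangible with a variant that also counts how many array elements are inspected. This is only a sketch: the probe counter, the names binSearchCount and evens, and the use of the current Data.Array module are our assumptions, not part of the course material.

```haskell
import Data.Array (Array, bounds, listArray, (!))

-- Binary search that also reports how many elements were inspected.
-- The search follows the definition in the text; the counter is our addition.
binSearchCount :: Ord b => Array Int b -> b -> (Maybe Int, Int)
binSearchCount a x = go low up 0
  where
    (low, up) = bounds a
    go j k probes
      | j > k     = (Nothing, probes)
      | otherwise =
          let m = (j + k) `div` 2
          in case compare x (a ! m) of
               LT -> go j (m - 1) (probes + 1)
               EQ -> (Just m, probes + 1)
               GT -> go (m + 1) k (probes + 1)

-- A sorted array holding the even numbers 0, 2, ..., 2046 (1024 elements).
evens :: Array Int Int
evens = listArray (0, 1023) [0, 2 ..]
```

Since each probe at least halves the index interval, a search in this 1024-element array never inspects more than 11 elements, whether or not the value is present.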

Other interesting algorithms with immutable arrays include matrix multiplication, Fibonacci tabulation, and the definition of higher-order functions for arrays, similar to map, fold, all, any and so on. We also give a version of insertion sort for immutable arrays (whose cost is in $O(n^2)$, $n$ being the length of the array):

isort :: Ord b => Array Int b -> Array Int b
isort a = foldl insert a [low+1..up]
  where (low,up) = bounds a

insert :: Ord b => Array Int b -> Int -> Array Int b
insert a n = insert' a low n
  where (low,_) = bounds a

insert' a j n
  | j >= n    = a
  | a!n > a!j = insert' a (j+1) n
  | otherwise = a // ((j,a!n) : [(k+1,a!k) | k <- [j..n-1]])

This algorithm will be the basis for a similar algorithm using mutable arrays. A call to insert a n assumes that the elements of a in positions [low..n-1] are ordered, and that $low < n \leq up$; then, it rearranges the elements in positions [low..n] in such a way that, at the end, this portion becomes ordered. In the worst case, each call to insert creates a new array by modifying the one given as parameter. This cost is in $O(n)$, $n$ being the number of elements in the array. As there are $n-1$ calls to insert, the total cost of isort is in $O(n^2)$.

4.3 Read-write state: mutable arrays

Coming back to the analogy between arrays and the computer memory, it is easy to justify the need for mutable arrays: we would like to modify an array element, as we can do with a memory position, with a cost in $O(1)$. We explain that it is possible to express mutable arrays in a pure functional language such as Haskell provided two conditions are met:

• The programmer imposes a strict sequential order on the actions performed on a mutable array.

• The programmer accepts that, once a mutable array is modified, only the new copy is available to the remaining actions of the sequence. This implies accepting that a name is connected to different values in different parts of a text (we know that this fact does not violate referential transparency since a name always denotes the same mutable variable. What 'changes' is the state. More exactly, it is passed around from one action to the following one).

We explain that the tools for creating sequences of mutable actions are already known: the >> and >>= combinators, the return action, and the do-notation are not exclusive to the type IO a. We say that they are overloaded and that the type ST s a of mutable state actions can also use them (we say in passing that both constructors, IO and ST s, and some others, belong to the constructor class Monad).

In the following, we assume that the library module ST, standard in all Haskell distributions, which provides the interface to the mutable state actions proposed in Launchbury and Peyton Jones's paper [11], has been imported. We explain to the students the elementary mutable actions: creating a mutable array or a mutable variable, reading from them, writing to them, and so on. We also present the special function runST :: ST s a -> a, which is mandatory if we wish to encapsulate state-based computations into a non-state-based one.
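For readers using a current GHC, the elementary mutable actions can be tried with the sketch below. It assumes today's module and function names (Control.Monad.ST, Data.STRef, and Data.Array.ST with newArray, readArray and writeArray) rather than the ST module and the STArray-suffixed names used in the text. The names overwriteDemo and sumTo are hypothetical.

```haskell
import Control.Monad.ST (ST, runST)
import Data.Array.ST (STArray, newArray, readArray, writeArray)
import Data.STRef (modifySTRef', newSTRef, readSTRef)

-- Create a mutable array, update one position in place, and read it back.
-- runST hides the state: from outside, overwriteDemo is an ordinary value.
overwriteDemo :: (Int, Int)
overwriteDemo = runST (do
  ma <- newArray (0, 4) 0 :: ST s (STArray s Int Int)
  before <- readArray ma 2
  writeArray ma 2 42
  after <- readArray ma 2
  return (before, after))

-- A mutable variable used as an accumulator, as in an imperative loop.
sumTo :: Int -> Int
sumTo n = runST (do
  acc <- newSTRef 0
  mapM_ (\i -> modifySTRef' acc (+ i)) [1 .. n]
  readSTRef acc)
```

In current GHC the type of runST is written (forall s. ST s a) -> a; the quantifier is what prevents a mutable array or variable from escaping the computation that created it.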

Our first algorithms use embedding and recursion on indices as we did with immutable arrays. Here is the mutable version of insert:

insert :: Ord b => STArray s Int b -> Int -> ST s ()
insert ma n = insert' ma low n
  where (low,_) = boundsSTArray ma

insert' :: Ord b => STArray s Int b -> Int -> Int -> ST s ()
insert' ma j n
  | j >= n    = return ()
  | otherwise = do a_n <- readSTArray ma n
                   a_j <- readSTArray ma j
                   if a_n > a_j then insert' ma (j+1) n
                     else do shift ma j (n-1)
                             writeSTArray ma j a_n

The reader is invited to compare this program with the one given in Section 4.2. The similarities are obvious. The big difference is that now, as we are working with only one array instead of two, the order in which modifications to the array are performed is crucial. Once we have found that element a_n must go into position j, we must first shift the elements between position j and position n-1 one place to the right and then write a_n into position j. Should we change this order, the array would become corrupted. The shifting action can also be defined by recursion on indices. We will present a higher-order version of shift in Section 4.4. The cost of insert is clearly in $O(m)$, $m$ = n - low + 1 being the length of the array portion affected by insertion. Every position in this portion is subject to a read and/or a write operation, each one with a cost in $O(1)$.

For the complete insertion sort algorithm, we cannot use foldl because the types do not match. Nor can we use the monadic version of foldl, called foldM :: Monad m => (a -> b -> m a) -> a -> [b] -> m a, for much the same reason. For the moment, we content ourselves with a recursive version:

mutIsort :: Ord b => STArray s Int b -> ST s ()
mutIsort ma = mutIsort' (low+1) up ma
  where (low,up) = boundsSTArray ma

mutIsort' :: Ord b => Int -> Int -> STArray s Int b -> ST s ()
mutIsort' j up ma
  | j > up    = return ()
  | otherwise = do insert ma j
                   mutIsort' (j+1) up ma

If the programmer wishes to hide the whole stateful computation, he can use runST to encapsulate it:

isort :: Ord b => Array Int b -> Array Int b
isort a = runST (do ma <- thawSTArray a
                    mutIsort ma
                    a' <- unsafeFreezeSTArray ma
                    return a')

The function first converts an immutable array into a mutable one, sorts it, and saves its final state into a new immutable array, which is returned as the result. From the outside, the algorithm looks like one sorting immutable arrays.
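The same encapsulation can be sketched with the array package shipped with current GHC. Everything here is an assumption on our part rather than the code used in the course: the names insertAt, isortST and example are hypothetical, getBounds, readArray, writeArray and thaw replace the STArray-suffixed functions of the text, and runSTArray plays the role of the runST and unsafeFreezeSTArray pair.

```haskell
import Control.Monad (forM_)
import Control.Monad.ST (ST)
import Data.Array (Array, elems, listArray)
import Data.Array.ST (STArray, getBounds, readArray, runSTArray, thaw, writeArray)

-- Insert the element at position n into the already sorted prefix [low..n-1].
insertAt :: Ord b => STArray s Int b -> Int -> ST s ()
insertAt ma n = do
  (low, _) <- getBounds ma
  a_n <- readArray ma n
  -- Find the first position whose element is not smaller than a_n.
  let findPos j
        | j >= n    = return n
        | otherwise = do
            a_j <- readArray ma j
            if a_n > a_j then findPos (j + 1) else return j
  j <- findPos low
  -- Shift positions j..n-1 one place to the right, starting from the right,
  -- and only then write a_n: the order of these updates matters.
  forM_ [n - 1, n - 2 .. j] $ \k -> readArray ma k >>= writeArray ma (k + 1)
  writeArray ma j a_n

-- From outside this is a pure function on immutable arrays.
isortST :: Ord b => Array Int b -> Array Int b
isortST a = runSTArray (do
  ma <- thaw a
  (low, up) <- getBounds ma
  forM_ [low + 1 .. up] (insertAt ma)
  return ma)

example :: [Int]
example = elems (isortST (listArray (0, 5) [3, 1, 4, 1, 5, 9]))
```

Here thaw copies the input array, so the argument of isortST is never modified, and runSTArray returns the final state as an immutable array.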

4.4 Higher order abstractions

Functional programming is known to be good for abstracting common computation patterns into higher-order functions. In the area of monadic algorithms, useful computation patterns more or less correspond to control structures present in most imperative languages.

Simulating an imperative for-loop The first useful abstraction is the predefined function

sequence :: Monad m => [m a] -> m ()
sequence = foldr (>>) (return ())

converting a list of monadic actions into a single action which performs sequentially the actions in the list. Used in combination with map, it can serve as a good simulation of the for control structure of many imperative languages. Consider the expression sequence (map f indices). Function map creates a list of actions by mapping a function, depending on an index, to the list of indices; sequence threads the action list into a single action. So, by providing an appropriate list of indices and a 'body' function we get a functional equivalent of the imperative for. Here is the higher-order implementation of function shift in Section 4.3:

shift :: STArray s Int b -> Int -> Int -> ST s ()
shift ma i j = sequence (map action [j,j-1..i])
  where action k = do x <- readSTArray ma k
                      writeSTArray ma (k+1) x

Notice the order in which positions are shifted. Likewise, here is the higher-order version of mutIsort of Section 4.3:

mutIsort :: Ord b => STArray s Int b -> ST s ()
mutIsort ma = sequence (map (insert ma) [low+1..up])
  where (low,up) = boundsSTArray ma

If the teacher wishes to use a style with a more imperative flavour, he can define

for :: Monad m => [a] -> (a -> m ()) -> m ()
for indices body = sequence (map body indices)

and translate the above examples to use this construction. A slightly different for function was originally proposed in [12]. For instance, the shift function would look like:

shift ma i j = for [j,j-1..i] action
  where action k = ...

But we claim that for functional programmers (e.g. our students) the direct use of sequence and map is more illustrative than that of for.
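In current Haskell libraries the role of for is played by forM_ from Control.Monad, and sequence (map f xs) is usually written mapM_ f xs. The sketch below restates shift in that vocabulary; the modern function names and the value shiftedExample are our assumptions, not part of the course libraries.

```haskell
import Control.Monad (forM_)
import Control.Monad.ST (ST)
import Data.Array (Array, elems)
import Data.Array.ST (STArray, newListArray, readArray, runSTArray, writeArray)

-- Move the elements at positions i..j one place to the right, iterating from
-- j down to i so that no element is overwritten before it has been read.
shift :: STArray s Int b -> Int -> Int -> ST s ()
shift ma i j = forM_ [j, j - 1 .. i] $ \k -> do
  x <- readArray ma k
  writeArray ma (k + 1) x

-- Shifting positions 1..3 of [10,20,30,40,50,60] gives [10,20,20,30,40,60].
shiftedExample :: Array Int Int
shiftedExample = runSTArray (do
  ma <- newListArray (0, 5) [10, 20, 30, 40, 50, 60]
  shift ma 1 3
  return ma)
```

Iterating over [i..j] instead would copy the element at position i into every position up to j+1, which is the corruption the text warns about.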

General linear search When working with mutable arrays, useful abstractions include the corresponding versions of map, fold, any, all, and so on. Another interesting abstraction is looking for the first array element satisfying a given property, i.e. a generalization of linear search:

gLinSearch :: STArray s Int b -> (b -> Bool) -> ST s (Maybe Int)
gLinSearch ma p = gLinSearch' ma p low up
  where (low,up) = boundsSTArray ma

gLinSearch' ma p j up
  | j > up    = return Nothing
  | otherwise = do a_j <- readSTArray ma j
                   if p a_j then return (Just j)
                     else gLinSearch' ma p (j+1) up

By using it, we can write a very compact version of the mutable insert function of Section 4.3:

insert ma n = do a_n <- readSTArray ma n
                 ~(Just j) <- gLinSearch ma (a_n <=)
                 shift ma j (n-1)
                 writeSTArray ma j a_n

Notice that, in the worst case, the search ends up with j = n. In this case, the shift action just does nothing, and writing a_n into position n does no harm. The irrefutable pattern in the second line is a requirement of the do-notation.

Simulating an imperative while-loop The last abstraction we present is a kind of while loop. Unlike the one presented in [14, Chapter 14] for the type IO, we have found that the action in the body is usually different from one iteration to the next, so we propose to give a list of actions as the second argument:

while :: Monad m => m Bool -> [m ()] -> m ()
while test []     = return ()
while test (a:as) = do continue <- test
                       if continue then do {a ; while test as}
                         else return ()

The loop ends either when the test fails or when the list of actions is exhausted, if ever. By using it, we can write a higher-order version of the well-known bubble sort algorithm:

bubbleSort :: Ord b => STArray s Int b -> ST s ()
bubbleSort ma = do boolVar <- newSTRef True
                   while (readSTRef boolVar) (map (stage boolVar up) [low..up-1])
  where (low,up) = boundsSTArray ma
        stage v up k = do writeSTRef v False
                          sequence (map (action v) [up-1,up-2..k])
        action v j = do x <- readSTArray ma j
                        y <- readSTArray ma (j+1)
                        if x <= y then return ()
                          else do -- array is being changed
                                  writeSTArray ma j y
                                  writeSTArray ma (j+1) x
                                  writeSTRef v True
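The bubble sort above can be tried on a current GHC with the following sketch. The while combinator is the one from the text; the rest is adapted under our own assumptions: getBounds, readArray and writeArray replace the STArray-suffixed functions, forM_ and when from Control.Monad replace sequence-map and the explicit if, and sortList is a hypothetical wrapper added for testing.

```haskell
import Control.Monad (forM_, when)
import Control.Monad.ST (ST)
import Data.Array (elems)
import Data.Array.ST (STArray, getBounds, newListArray, readArray, runSTArray, writeArray)
import Data.STRef (newSTRef, readSTRef, writeSTRef)

-- while as in the text: perform the next action only while the test succeeds.
while :: Monad m => m Bool -> [m ()] -> m ()
while _    []       = return ()
while test (a : as) = do
  continue <- test
  when continue (a >> while test as)

bubbleSort :: Ord b => STArray s Int b -> ST s ()
bubbleSort ma = do
  (low, up) <- getBounds ma
  changed <- newSTRef True
  -- Stage k bubbles the minimum of positions k..up down to position k and
  -- records whether any swap took place.
  let stage k = do
        writeSTRef changed False
        forM_ [up - 1, up - 2 .. k] $ \j -> do
          x <- readArray ma j
          y <- readArray ma (j + 1)
          when (x > y) $ do
            writeArray ma j y
            writeArray ma (j + 1) x
            writeSTRef changed True
  while (readSTRef changed) (map stage [low .. up - 1])

-- Hypothetical wrapper: sort a list by loading it into a mutable array.
sortList :: Ord b => [b] -> [b]
sortList xs = elems (runSTArray (do
  ma <- newListArray (0, length xs - 1) xs
  bubbleSort ma
  return ma))
```

Because the list of stages is built lazily, a stage that is never reached is never constructed, so an already sorted input costs a single pass.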

For an array with $n$ elements, the algorithm performs, in the worst case, $n-1$ stages, with index k ranging from low to up-1. At the end of stage k we have at position k the next minimum element of the array. So, the array gets sorted from left to right. We make use of a mutable boolean variable boolVar to record whether there has been any modification to the array in the current stage. If not, the test fails in the next iteration, the while loop is exited, and the whole computation terminates. This means that, for an initially sorted array, bubble sort performs only one stage, with a time complexity in $O(n)$.

4.5 Putting it all together

At the end of the course, students should be able to combine imperative functional programming with classical functional programming. So, in order to know whether they have acquired these skills, we have asked them to write, as a final laboratory assignment, a program whose core is the Floyd algorithm [4]. The aim of this algorithm is to compute the shortest paths between each pair of nodes of a given graph. It receives the graph as input, and generates as output two two-dimensional arrays: one to record the shortest distance between each pair of nodes; and the other to store the necessary information to obtain the shortest paths. This is a dynamic programming problem and, of course, first-year students are not expected to discover it by themselves. Instead, we explain to them in words what has to be done to solve the problem, and then they have to implement it.

We have chosen this example because it combines all the features we have taught in the course:

• There are several I/O operations, and it is important to perform them in the right sequence: at the beginning, the original graph is to be read from a file; after computing the arrays, the program interacts with the user, who can ask for the shortest path between any pair of nodes.

• It is convenient to use mutable arrays, because the core of the algorithm is a loop that computes the paths by refining the solutions found so far. At each stage $k$, for each pair of nodes $(i, j)$ it is decided whether a better path between $i$ and $j$ can be obtained by visiting node $k$ as an intermediate step. Each time a better path is found, both arrays are modified.

• After computing the arrays, there is no need to modify them any more. Therefore, immutable arrays can be used.

• It is easier and clearer to express Floyd's algorithm by using higher-order functions than by using recursion.

Assuming that the original graph is represented by a matrix in which position $(i, j)$ contains $\infty$ if the nodes are not directly connected, and contains the distance of the connection otherwise, a compact and precise way to write the algorithm is:

-- Encapsulates the mutable computations of the program
floydAlg :: Array (Int,Int) Int -> (Array (Int,Int) Int, Array (Int,Int) Int)
floydAlg t = runST (do tm <- thawSTArray t
                       um <- newSTArray (bounds t) 0
                       floyd tm um
                       ti <- unsafeFreezeSTArray tm
                       ui <- unsafeFreezeSTArray um
                       return (ti,ui))

-- Floyd algorithm using two mutable arrays
floyd :: STArray a (Int,Int) Int -> STArray a (Int,Int) Int -> ST a ()
floyd tm um = sequence (map stage [l..u])
  where ((l,l'),(u,u')) = boundsSTArray tm
        stage k = sequence (map (refine k) (range ((l,l'),(u,u'))))
        refine k (i,j) = do tij <- readSTArray tm (i,j)
                            tik <- readSTArray tm (i,k)
                            tkj <- readSTArray tm (k,j)
                            if tik + tkj < tij
                              then do writeSTArray tm (i,j) (tik + tkj)
                                      writeSTArray um (i,j) k
                              else return ()

We have found that our students are able to write programs similar to the solution given above, and that they understand the conceptual differences between 'normal' operations and operations involving a state.
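For experimentation on a current GHC, the algorithm can be sketched as follows. Several details are our assumptions and not part of the assignment: the modern Data.Array.ST names, the safe freeze in place of unsafeFreezeSTArray, the encoding of an absent edge as the constant inf (half of maxBound, so that adding two such values cannot overflow), and the three-node graph used as an example.

```haskell
import Control.Monad (forM_, when)
import Control.Monad.ST (ST, runST)
import Data.Array (Array, bounds, listArray, range, (!))
import Data.Array.ST (STArray, freeze, newArray, readArray, thaw, writeArray)

-- Assumed encoding of "no direct connection".
inf :: Int
inf = maxBound `div` 2

-- Returns the shortest distances and, for each pair, the last intermediate
-- node that improved the path (0 if the direct connection was never improved).
floydAlg :: Array (Int, Int) Int -> (Array (Int, Int) Int, Array (Int, Int) Int)
floydAlg t = runST (do
  tm <- thaw t :: ST s (STArray s (Int, Int) Int)
  um <- newArray (bounds t) 0 :: ST s (STArray s (Int, Int) Int)
  let ((l, _), (u, _)) = bounds t
  forM_ [l .. u] $ \k ->
    forM_ (range (bounds t)) $ \(i, j) -> do
      tij <- readArray tm (i, j)
      tik <- readArray tm (i, k)
      tkj <- readArray tm (k, j)
      when (tik + tkj < tij) $ do
        writeArray tm (i, j) (tik + tkj)
        writeArray um (i, j) k
  ti <- freeze tm
  ui <- freeze um
  return (ti, ui))

-- Hypothetical graph on nodes 1..3 with edges 1->2 (4), 2->3 (1), 1->3 (7).
graph :: Array (Int, Int) Int
graph = listArray ((1, 1), (3, 3))
  [ 0,   4,   7
  , inf, 0,   1
  , inf, inf, 0 ]
```

With this graph, the direct edge from node 1 to node 3 of length 7 is replaced by the path through node 2 of length 5, and node 3 remains unconnected to node 1 in the opposite direction.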

5 Results

We only report here the results relevant to the subject of this paper. General results about the use of a functional language in a first-year course have been reported elsewhere (see, for example, [2]).

At the time of writing we can assess whether some of the goals of the course have been met, but, unfortunately, we cannot do so for all of them. In particular, it is too early to know what kind of difficulties these students will have when confronted, in the next few years, with actual imperative languages such as C or Pascal. Will the concepts learned in our course be enough to understand the new languages? Will they recognize the imperative model of computation in spite of the change of syntax? Will they easily replace recursion by iteration? We plan to follow the evolution of these students over the next two years to collect information about this aspect but, for now, we can only guess what may happen.

For the moment, through their laboratory assignments and written examinations, we have collected enough information to assess the quality of the skills they have acquired. The most important conclusion relevant to this paper is that we have not detected that the students have special difficulties with monadic algorithms. In particular, they accept very naturally the concepts of sequential actions and of mutable state.

With respect to sequential composition, we think that the do-notation, proposed originally in [10], deserves most of the credit. It is very simple, illustrative of what is going on, and hides a lot of clumsy details that the students are happy to ignore. In our opinion, it has been a very good decision to include it as part of Haskell.

However, and perhaps because the do-notation is a high-level abstraction, the students tend to confuse the <- in a do sequence with the = in an equation, and produce programs in which they mix both notations, such as the following one:

main = do x <- action
          y = f x
          ...

The confusion is favoured by the fact that the syntax <- is also used in list comprehensions, with a second meaning. The essence of the problem is that they do not see a clear difference between the type IO a and the type a. This question (why are they different?) has been put to us very frequently. Fortunately, the type system takes care of these mistakes and forces them to use the correct syntax. The question also has to do with understanding the >>= combinator underlying the syntax x <- action. We have found that this combinator is much more difficult to understand than the >> one. For this reason, we think that perhaps it is a good approach to move quickly from using raw >>= and lambda abstractions to the do-notation.
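The distinction can be shown side by side in a few lines. In Haskell the form the students were reaching for is a let statement inside the do sequence. The sketch uses the ST monad so that it can be evaluated as a pure function; the name doubled and the current module names are our assumptions.

```haskell
import Control.Monad.ST (runST)
import Data.STRef (newSTRef, readSTRef)

-- "x <- action" performs an action and names its result;
-- "let y = e" merely names the value of a pure expression.
doubled :: Int -> Int
doubled n = runST (do
  ref <- newSTRef n     -- newSTRef n :: ST s (STRef s Int), so ref :: STRef s Int
  x <- readSTRef ref    -- readSTRef ref :: ST s Int, so x :: Int
  let y = 2 * x         -- a pure expression: no action is performed here
  return y)
```

Writing x = readSTRef ref with let would instead make x a name for the action itself, of type ST s Int, which is exactly the difference between the type IO a and the type a discussed above.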

Another interesting result is that higher-order abstractions, such as those proposed in Section 4.4, are very easily grasped in this part of the course. For instance, they are willing to give up recursion in favour of the sequence-map combination when they detect that the same action has to be repeated for a set of indices. This is in contrast to what has happened in the rest of the course, where they are strongly reluctant to use higher-order functions (in particular, those of the fold family).

In summary, we think that the approach followed here can be useful for those whose circumstances are similar to ours: (i) you believe that functional programming has didactic advantages over imperative programming for first-year students; (ii) your students also need to understand the imperative model of computation to be able to learn imperative languages in subsequent courses; (iii) there is not much time available for the course.

References

[1] R. Bird. Introduction to Functional Programming using Haskell. Prentice-Hall, 2nd edition, 1998.

[2] C. Clack and C. Myers. The dys-functional student. In LNCS 1022, pages 289–309. Springer-Verlag, 1995. FPLE'95, Nijmegen (The Netherlands).

[3] A. Davison. Teaching C after Miranda. In LNCS 1022, pages 35–50. Springer-Verlag, 1995. FPLE'95, Nijmegen (The Netherlands).

[4] R. W. Floyd. Algorithm 97: Shortest path. Communications of the ACM, 5(6):345, 1962.

[5] P. Hartel and H. Muller. Functional C. Addison-Wesley, 1997.

[6] S. Peyton Jones and P. Wadler. Imperative functional programming. In ACM Principles of Programming Languages. ACM, 1993. Charleston, N. Carolina.

[7] S. Joosten, K. van den Berg, and G. van der Hoeven. Teaching functional programming to first-year students. Journal of Functional Programming, 3:49–65, 1993.

[8] E. T. Keravnou. Introducing computer science undergraduates to principles of programming through a functional language. In LNCS 1022, pages 15–34. Springer-Verlag, 1995. FPLE'95, Nijmegen (The Netherlands).

[9] T. Lambert, P. Lindsay, and K. Robinson. Using Miranda as a first programming language. Journal of Functional Programming, 3:5–34, 1993.

[10] J. Launchbury. Lazy imperative programming. In ACM Workshop on State in Programming Languages, pages 1–11, 1993.

[11] J. Launchbury and S. L. Peyton Jones. Lazy functional state threads. In Proceedings of the ACM Conference on Programming Languages Design and Implementation, PLDI'94, pages 24–35, June 1994.

[12] J. Launchbury and S. L. Peyton Jones. State in Haskell. Lisp and Symbolic Computation, 8(4):293–341, Dec. 1995. Elaboration of [11].

[13] M. Nunez, P. Palao, and R. Pena. A Second Year Course on Data Structures based on Functional Programming. In LNCS 1022, pages 65–84. Springer-Verlag, 1995. FPLE'95, Nijmegen (The Netherlands).

[14] S. Thompson. Haskell: The Craft of Functional Programming. Addison-Wesley, 1996.

[15] P. Wadler. The essence of functional programming. In 19th Symposium on Principles of Programming Languages. ACM, January 1992. Albuquerque, New Mexico.

[16] P. Wadler. How to declare an imperative. In International Logic Programming Symposium. MIT Press, 1995.

[17] P. Wadler. Monads for functional programming. In J. Jeuring and E. Meijer, editors, Advanced Functional Programming. LNCS 925. Springer-Verlag, 1995.


Modular Lazy Search for Constraint Satisfaction Problems*

Thomas Nordin    Andrew Tolmach

Pacific Software Research Center
Oregon Graduate Institute & Portland State University


Abstract

We describe a unified, lazy, declarative framework for solving constraint satisfaction problems, an important subclass of combinatorial search problems. These problems are both practically significant and hard. Finding good solutions involves combining good general-purpose search algorithms with problem-specific heuristics. Conventional imperative algorithms are usually implemented and presented monolithically, which makes them hard to understand and reuse, even though new algorithms often are combinations of simpler ones. Lazy functional languages, such as Haskell, encourage modular structuring of search algorithms by separating the generation and testing of potential solutions into distinct functions communicating through an explicit, lazy intermediate data structure. But only relatively simple search algorithms have been treated in this way in the past.

Our framework uses a generic generation and pruning algorithm parameterized by a labeling function that annotates search trees with conflict sets. We show that many advanced imperative search algorithms, including backmarking, conflict-directed backjumping, and minimal forward checking, can be obtained by suitable instantiation of the labeling function. More importantly, arbitrary combinations of these algorithms can be built by simply composing their labeling functions. Our modular algorithms are as efficient as the monolithic imperative algorithms in the sense that they make the same number of consistency checks, and most of our algorithms are within a constant factor of their imperative counterparts in runtime and space usage. We believe our framework is especially well-suited for experimenting to find good combinations of algorithms for specific problems.

1 Introduction

Combinatorial search problems offer a great challenge to the academic researcher: they are of tremendous interest to commercial users, and they are often very computationally intensive to solve. Over the past several decades the AI community has responded to this challenge by producing a steady stream of improvements to generic search algorithms. There have also been numerous attempts to organize the various algorithms into standardized frameworks for comparison (e.g., [8, 6, 16]).

While the speed and cunning of the search algorithms have improved, the new algorithms are more complicated and harder to understand, even though they are often combinations of simpler standard algorithms. The problem is exacerbated by the fact that most algorithms are described by large, monolithic chunks of pseudo-code (or C code). Although it is recognized that most problems benefit from a tailor-made solution, involving a combination of existing generic and domain-specific algorithms, modularity has not been a strong point of much of the recent research. It is difficult to reuse code except via cut-and-paste. Moreover, to prove these algorithms correct we must resort to complex reasoning about their dynamic behavior. For example, although most of these search algorithms are conceived as varieties of "tree search," no actual tree data structures appear in their implementations; only virtual trees are present, in the form of recursive routine activation histories. Perhaps for this reason, even widely used and well-studied algorithms often lack correctness proofs.

*Work supported, in part, by the US Air Force Materiel Command under contract F19628-26-C-0161.

In the lazy functional programming world, the idea of implementing a search algorithm using modular techniques is commonplace. The classic paper of Hughes [9] and the text of Bird and Wadler [3] both give examples of search algorithms in which generation and testing of candidate solutions are separated into distinct phases, glued together using an explicit, lazy, intermediate data structure. This "generate-and-test" paradigm makes essential use of laziness to synchronize the two functions (really coroutines) in such a way that we never need to store much of the (exponential-sized) intermediate data structure at any one time. In general, the modular lazy approach can lead to algorithms that are much simpler to read, write, and modify than their imperative counterparts. However, the algorithms described in these sources are fairly elementary.
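To make the generate-and-test idea concrete, here is a minimal sketch for the n-queens problem. It is an elementary example of the kind found in those sources, not the framework developed in this paper, and all the names in it (Tree, mkTree, safe, prune, leaves, queens) are chosen by us for illustration.

```haskell
-- A rose tree of search states; subtrees are only built when demanded.
data Tree a = Node a [Tree a]

-- Generator: every way of extending a partial placement by one more queen.
-- A state lists the column chosen for each row placed so far, newest first.
mkTree :: Int -> [Int] -> Tree [Int]
mkTree n qs
  | length qs == n = Node qs []
  | otherwise      = Node qs [mkTree n (q : qs) | q <- [1 .. n]]

-- Tester: the newest queen must not attack any of the earlier ones.
safe :: [Int] -> Bool
safe []       = True
safe (q : qs) = and [q /= c && abs (q - c) /= d | (d, c) <- zip [1 ..] qs]

-- Pruning drops every subtree whose root fails the test; thanks to laziness
-- the dropped subtrees are never constructed.
prune :: (a -> Bool) -> Tree a -> Tree a
prune ok (Node x ts) = Node x [prune ok t | t@(Node y _) <- ts, ok y]

-- Collect the states at the leaves, left to right.
leaves :: Tree a -> [a]
leaves (Node x []) = [x]
leaves (Node _ ts) = concatMap leaves ts

-- Glue: generate, prune, collect, and keep only the complete placements
-- (pruning can leave dead ends, which are leaves with fewer than n queens).
queens :: Int -> [[Int]]
queens n = filter ((== n) . length) (leaves (prune safe (mkTree n [])))
```

The generator knows nothing about the constraint and the tester knows nothing about the tree, yet asking for head (queens 8) explores only as much of the tree as a hand-written backtracking search would.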

In this paperwe presenta lazy declarative framework for solving oneimportantclassof combinatorialsearchproblems,namelyconstraintsatisfactionproblems(CSPs).For simplicity, we restrictour attentionto binaryconstraintproblems,andto searchalgorithmsthat usea fixed variableorder;neitherof thesere-strictionsis fundamental.Our framework is basedon explicit, lazy, treestructures,in which eachtreenoderepresentsa statein the searchspace;a subsetof the tree’s leaf nodescorrespondsto problemsolutions.Nodescanbelabeledwith conflictsets, which recordconstraintviolationsin thecorrespondingstates;manyalgorithmsusethesesetsto prunesubtreesthatcannotcontributeasolution.

Our framework is written in Haskell. We provide a small library of separate functions for generating, labeling, rearranging, pruning, and collecting solutions from trees. In particular, we describe a generic search algorithm, parameterized by a labeling function, and show that a variety of standard imperative CSP algorithms, including simple backtracking, backmarking, conflict-directed backjumping, and forward checking, can be obtained by making a suitable choice of labeling function. Using an explicit representation of the search tree allows us to think about the intermediate values and gives us new insights into more efficient algorithms. As in recent work on functional data structures [10, 14], we found that recasting imperative algorithms into a declarative lazy setting casts new light on the fundamental algorithmic ideas. In particular, it is easy to see how to combine our algorithms, simply by composing their labeling functions, and to see that the result will be correct.

Since the whole point of improving search algorithms is to be able to solve larger problems faster, we must obviously be concerned with the performance of our lazy algorithms. Our experiments show that lazy, modular Haskell code is several times slower than strict, manually integrated Haskell code; moreover, even the latter can be an order of magnitude slower than highly optimized C code. However, since search times often explode exponentially, even slowdowns of one or two orders of magnitude have little effect on the size of problem we can solve within a fixed time bound. All our algorithms and their combinations are fast enough for experiments that have been interesting to researchers in the past; for example, we are able to reproduce parts of the tables in [2, 11]. More importantly, our implementations are fast enough to allow experimentation with different combinations of algorithms on problems of realistic size. For such experiments, CPU time is often not an ideal comparison metric, since it is difficult to compare numbers obtained from different implementations on different systems. A widely used alternative metric is the number of consistency checks performed by the algorithm.

The remainder of the paper is organized as follows. Section 2 describes our problem domain and Section 3 gives a Haskell specification for it. Section 4 describes simple tree-based backtracking search. Section 5 introduces conflict sets and our generic search algorithm, and recasts backtracking search in that framework. Section 6 briefly discusses search heuristics. Sections 7, 8, and 9 describe more sophisticated algorithms, and Section 10 discusses their combination. Section 11 summarizes performance results, Section 12 describes related work, and Section 13 concludes.

The reader is assumed to have a working knowledge of functional programming, and some familiarity with laziness. Peculiarities of Haskell syntax will be explained as they arise. All the code examples in this paper are available on the World Wide Web at http://www.cs.pdx.edu/~apt/CSP.hs .

2 Binary Constraint Satisfaction Problems

A binary constraint satisfaction problem is:

• a set of variables $V = \{v_1, v_2, \ldots, v_n\}$;
• for each variable $v_i$, a finite set $D_i$ of possible values (its domain);
• and a set $C$ of pairwise constraints between variables that restrict the values that they can take on simultaneously.

Each constraint is a relation on two named variables, i.e., a triple $(i, j, R)$ where $R \subseteq D_i \times D_j$. An assignment $v_i := x_i$ associates a variable $v_i$ to some value $x_i \in D_i$. A state is a collection of assignments for a subset of $V$. A state $\{\ldots, v_i := x_i, \ldots, v_j := x_j, \ldots\}$ satisfies a constraint $(i, j, R)$ if $(x_i, x_j) \in R$. A state is consistent if it satisfies every constraint on its variables, i.e., if for every pair of assignments $v_j := x_j$, $v_k := x_k$ in the state, and every matching constraint $(j, k, R)$ in $C$, $(x_j, x_k) \in R$. A state is complete if it assigns all the variables of $V$; otherwise it is partial. A solution to a CSP is any complete consistent state. For some problems we want to calculate all solutions, but for many we only wish to find the first solution as quickly as possible.

In this paper, we fix the variable order $v_1, v_2, \ldots, v_n$, i.e., we consider only states such that if $v_i$ is in the state so is $v_j$ for all $j < i$. We define the level of a variable $v_i$ to be $i$ and the level of a state to be the maximum of its variables' levels. To simplify the presentation, we further assume that all domains have the same size $m$ and that their values are represented by integers in the set $\{1, 2, \ldots, m\}$.

A naive approach to solving a CSP is to enumerate all possible complete states and then check each in turn for consistency. In a binary CSP, consistency of a state can be determined by performing consistency checks on each pair of assignments in the state, until an inconsistent pair of variables is detected, or all pairs have been checked. Following the conventions of the search literature, we use the number of consistency checks as a key measure of execution cost, although it is not necessarily an accurate measure unless each check can be performed in unit time, which is not the case for all problems.

3 CSPs in Haskell

Figure 1 gives a Haskell framework for describing CSP problems and an implementation of a naive solver. An assignment is constructed using the infix constructor := . A CSP is modeled as a record containing the number of variables, vars, the size of their domain, vals, and a constraint oracle, rel. We represent the oracle as a Haskell function taking two assignments and returning False iff the assignments violate some constraint. This function can be implemented by a four-dimensional array of booleans or by a mathematical formula.

We present the solver in the standard “lazy pipeline” style that separates generation of candidate solutions (here the set of all complete states) from consistency testing. States are represented as lists of assignments sorted in decreasing order by variable number. Although this code appears to produce a huge intermediate list data structure candidates, lazy evaluation ensures that list elements are generated only on demand, and elements that fail the filter in test can be immediately garbage collected. Similarly, although inconsistencies appears to build a list of all inconsistent variable pairs in the state¹, consistent

¹ This function uses a Haskell list comprehension, which is similar to a familiar set comprehension: this one builds a list of pairs of variable levels such that the corresponding assignments are drawn from the current state and are in conflict according to rel.

-- Patterns such as CSP{vars} bind a record field to a variable of the same
-- name (record punning; the NamedFieldPuns extension in current GHC).

type Var = Int
type Value = Int

data Assign = Var := Value deriving (Eq, Ord, Show)

type Relation = Assign -> Assign -> Bool

data CSP = CSP { vars, vals :: Int, rel :: Relation }

type State = [Assign]

level :: Assign -> Var
level (var := val) = var

value :: Assign -> Value
value (var := val) = val

maxLevel :: State -> Var
maxLevel [] = 0
maxLevel ((var := val):_) = var

complete :: CSP -> State -> Bool
complete CSP{vars} s = maxLevel s == vars

generate :: CSP -> [State]
generate CSP{vals,vars} = g vars
  where g 0 = [[]]
        g var = [ (var := val):st | val <- [1..vals], st <- g (var-1) ]

inconsistencies :: CSP -> State -> [(Var,Var)]
inconsistencies CSP{rel} as =
  [ (level a, level b) | a <- as, b <- reverse as, a > b, not (rel a b) ]

consistent :: CSP -> State -> Bool
consistent csp = null . (inconsistencies csp)

test :: CSP -> [State] -> [State]
test csp = filter (consistent csp)

solver :: CSP -> [State]
solver csp = test csp candidates
  where candidates = generate csp

queens :: Int -> CSP
queens n = CSP { vars = n, vals = n, rel = safe }
  where safe (i := m) (j := n) = (m /= n) && abs (i - j) /= abs (m - n)

Figure 1: A formulation of CSPs in Haskell.

data Tree a = Node a [Tree a]

label :: Tree a -> a
label (Node lab _) = lab

type Transform a b = Tree a -> Tree b

mapTree :: (a -> b) -> Transform a b
mapTree f (Node a cs) = Node (f a) (map (mapTree f) cs)

foldTree :: (a -> [b] -> b) -> Tree a -> b
foldTree f (Node a cs) = f a (map (foldTree f) cs)

filterTree :: (a -> Bool) -> Transform a a
filterTree p = foldTree f
  where f a cs = Node a (filter (p . label) cs)

prune :: (a -> Bool) -> Transform a a
prune p = filterTree (not . p)

leaves :: Tree a -> [a]
leaves (Node leaf []) = [leaf]
leaves (Node _ cs) = concat (map leaves cs)

initTree :: (a -> [a]) -> a -> Tree a
initTree f a = Node a (map (initTree f) (f a))

Figure 2: Trees in Haskell.

actually demands only the head of the list (to check whether the list is null). Thus the solver actually calculates only the earliest inconsistent pair of variables for each state. Finally, although the solver returns a list of all solutions if demanded, it can be used to obtain just the first solution (and do no further computation) by asking for just the head of the result. Although the code thus uses much less space than a strict reading would suggest, this solver is still extremely inefficient because it duplicates work, but it is useful to illustrate lazy coding style and as a specification for the more sophisticated solvers we introduce below.

A simple problem useful for illustrating different search strategies is the $n$-queens problem, that is, trying to put $n$ queens on an $n \times n$ chessboard such that no queen is threatening another, using the standard optimization that we only try to place one queen in each column [13]. Given the definition of queens, we can apply the general-purpose CSP machinery to solve it; for example, the expression solver (queens 5) generates a list of solutions to the 5-queens problem.
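As a concrete illustration of the naive pipeline, the following sketch restates the relevant parts of Figure 1 in a form that should compile without record punning. It is a minimal sketch, not the paper's distributed code: the helper firstSolution is a name introduced here for illustration, and the field accessors vars, vals, and rel are applied explicitly.

```haskell
type Var   = Int
type Value = Int

data Assign = Var := Value deriving (Eq, Ord, Show)

type State    = [Assign]
type Relation = Assign -> Assign -> Bool

data CSP = CSP { vars, vals :: Int, rel :: Relation }

level :: Assign -> Var
level (var := _) = var

-- All complete states, each sorted by decreasing variable number.
generate :: CSP -> [State]
generate csp = g (vars csp)
  where
    g 0   = [[]]
    g var = [ (var := val) : st | val <- [1 .. vals csp], st <- g (var - 1) ]

-- Pairs of variable levels whose assignments conflict according to rel.
inconsistencies :: CSP -> State -> [(Var, Var)]
inconsistencies csp as =
  [ (level a, level b) | a <- as, b <- reverse as, a > b, not (rel csp a b) ]

consistent :: CSP -> State -> Bool
consistent csp = null . inconsistencies csp

-- Generate-and-test: laziness interleaves the two phases.
solver :: CSP -> [State]
solver csp = filter (consistent csp) (generate csp)

queens :: Int -> CSP
queens n = CSP { vars = n, vals = n, rel = safe }
  where safe (i := p) (j := q) = p /= q && abs (i - j) /= abs (p - q)

-- Hypothetical helper: demanding only the head stops the search
-- as soon as the first solution has been found.
firstSolution :: CSP -> Maybe State
firstSolution csp =
  case solver csp of
    []      -> Nothing
    (s : _) -> Just s
```

With these definitions, solver (queens 5) yields the ten solutions of the 5-queens problem, and firstSolution (queens 4) returns the state [4 := 2, 3 := 4, 2 := 1, 1 := 3], the first consistent candidate in the order produced by generate.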

4 Backtracking and Tree Search

The most obvious defect of the naive solver is that it can duplicate a tremendous amount of work by repeatedly checking the consistency of assignments that are common to many complete states. We say state $s'$ extends state $s$ if it contains all the assignments of $s$ together with zero or more additional assignments. A fundamental fact about CSPs is that no extension to an inconsistent state can ever be consistent, so there is no point in searching such an extension for a solution. This observation immediately suggests a better solver algorithm. A backtracking solver searches for solutions by constructing and checking partial states, beginning with the empty state and extending with one assignment at a time. Whenever the solver discovers an inconsistent state, it immediately backtracks to try a different assignment, thus avoiding the fruitless exploration of that state's extensions. Moreover, consistency of each new state can be tested just by comparing

mkTree :: CSP -> Tree State
mkTree CSP{vars,vals} = initTree next []
  where next ss = [ ((maxLevel ss + 1) := j):ss | maxLevel ss < vars, j <- [1..vals] ]

-- Maybe is predefined in the Haskell Prelude; it is repeated here for reference.
data Maybe a = Just a | Nothing deriving Eq

earliestInconsistency :: CSP -> State -> Maybe (Var,Var)
earliestInconsistency CSP{rel} [] = Nothing
earliestInconsistency CSP{rel} (a:as) =
  case filter (not . rel a) (reverse as) of
    [] -> Nothing
    (b:_) -> Just (level a, level b)

labelInconsistencies :: CSP -> Transform State (State,Maybe (Var,Var))
labelInconsistencies csp = mapTree f
  where f s = (s,earliestInconsistency csp s)

btsolver0 :: CSP -> [State]
btsolver0 csp =
  (filter (complete csp) . leaves . (mapTree fst) . prune ((/= Nothing) . snd)
   . (labelInconsistencies csp) . mkTree) csp

Figure 3: Simple backtracking solver for CSPs.

the newly added assignment to all previous assignments in the state, since any inconsistency involving only the previous assignments would already have been discovered earlier. If the solver manages to reach a complete state without encountering an inconsistency, it records a solution; if multiple solutions are wanted, it backtracks to find the others.

Backtracking solvers can be viewed very naturally as searching a tree, in which each node corresponds to a state and the descendents of a node correspond to extensions of its state. In conventional imperative implementations of backtracking, the tree is not explicit in the program; if a recursive implementation is used, the tree is isomorphic to the dynamic activation history tree of the program, but usually the tree is little more than a metaphor for helping the programmer reason informally about the algorithm. In the lazy functional paradigm it is natural to treat search trees as explicit data structures, i.e., programs are constructed as pipelines of operations that build, search, label, manipulate, and prune actual trees. As before, we rely on laziness to avoid actually building the entire tree.

Figure 2 gives Haskell definitions for a tree datatype and associated utility functions. A Tree is a node containing a label and a list of children, themselves Trees. mapTree, foldTree, and filterTree are the analogues of the familiar functions on lists. leaves extracts the labels of the leaves of a tree into a list in left-to-right order. initTree generates a tree from a function that computes the children of a node [9].

The code in Figure 3 uses these trees to implement a backtracking solver btsolver0 using a lazy pipeline. All the algorithms discussed in this paper expect the tree to be generated and maintained in fixed variable order, so that nodes at level $l$ of the tree (counting the root as level 0) always extend their parent by an assignment to $v_l$. Thus, the generator, mkTree, works by providing a next function to initTree that generates one extension for each possible value of the next variable. Each node describes an entire (partial) state, but (in any reasonable Haskell implementation) it actually stores only a single assignment, together with a pointer to the remainder of the state embedded in its parent node.

The application (labelInconsistencies csp) returns a tree transformer: it adds an annotation to each node recording its earliest inconsistent pair (if any), as returned by earliestInconsistency. The standard tree function prune $p$ removes nodes for which predicate $p$ is true; in this instance it prunes all inconsistent nodes. The annotations are then removed by (mapTree fst). Any nodes representing

Figure 4: Portion of search tree for queens 5. Nodes at level $l$ are annotated with their assigned value $x_l$ (in bold), and with their earliest inconsistent pair, if any.

Figure 5: Two positions from the queens 5 search tree in Figure 4. The left and right diagrams correspond to the left-most and right-most subtrees of level 3, respectively.

complete states that are still left in the tree must be solutions; the remaining pipeline stages extract these using the standard tree function leaves and the standard list function filter. Figure 4 illustrates the labels produced by btsolver0 on part of the tree for queens 5; the corresponding board positions are shown in Figure 5. Note that the children of inconsistent nodes have been pruned.

It is essential to note that this pipeline is demand driven: each stage executes only when demanded by the following stage. In particular, inconsistency calculations will not be performed on nodes of the tree excised by prune, because the values of these nodes will never be demanded. Thus we get the desired effect of backtracking without any explicit manipulation of control flow. Also, as before, only a small part of each intermediate tree is ever “live” (non-garbage data) at any one time, namely the spine of the tree from root to current node, i.e., essentially what would be stored in activation records for a recursive imperative implementation. So our lazy algorithms pay at worst a constant factor more space than their imperative counterparts. We do, however, pay some overhead for building, storing, and garbage collecting each tree node, and, unless our Haskell implementation performs effective deforestation [7], this cost will be repeated for each intermediate tree in the pipeline. For these reasons, the lazy implementation of backtracking is about four times slower than a monolithic, strict Haskell implementation (see Section 11).
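The demand-driven behavior described above can be observed directly on small trees. The sketch below repeats the tree functions of Figure 2 and applies them to two toy trees that are assumptions made here for illustration (toy, survivors, and bounded are not part of the paper's library): one whose pruned subtree would raise an error if it were ever explored, and one that is infinite.

```haskell
data Tree a = Node a [Tree a]

label :: Tree a -> a
label (Node lab _) = lab

foldTree :: (a -> [b] -> b) -> Tree a -> b
foldTree f (Node a cs) = f a (map (foldTree f) cs)

filterTree :: (a -> Bool) -> Tree a -> Tree a
filterTree p = foldTree f
  where f a cs = Node a (filter (p . label) cs)

prune :: (a -> Bool) -> Tree a -> Tree a
prune p = filterTree (not . p)

leaves :: Tree a -> [a]
leaves (Node leaf []) = [leaf]
leaves (Node _ cs)    = concat (map leaves cs)

initTree :: (a -> [a]) -> a -> Tree a
initTree f a = Node a (map (initTree f) (f a))

-- Hypothetical toy tree: the children of the "bad" node are undefined,
-- so exploring below that node would raise an error.
toy :: Tree String
toy = Node "root"
        [ Node "bad"  (error "explored a pruned subtree")
        , Node "good" [Node "leaf1" [], Node "leaf2" []]
        ]

-- Only the label of "bad" is inspected before the node is discarded.
survivors :: [String]
survivors = leaves (prune (== "bad") toy)

-- Pruning makes a finite tree out of an infinite one: nodes labeled
-- above 7 are removed, so nodes 4 to 7 become the leaves.
bounded :: [Int]
bounded = leaves (prune (> 7) (initTree (\n -> [2 * n, 2 * n + 1]) 1))
```

Here survivors evaluates to ["leaf1", "leaf2"] without ever touching the erroneous children, and bounded evaluates to [4, 5, 6, 7] even though the underlying tree has no finite depth.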

data ConflictSet = Known [Var] | Unknown deriving Eq

knownConflict :: ConflictSet -> Bool
knownConflict (Known (a:as)) = True
knownConflict _ = False

knownSolution :: ConflictSet -> Bool
knownSolution (Known []) = True
knownSolution _ = False

checkComplete :: CSP -> State -> ConflictSet
checkComplete csp s = if complete csp s then Known [] else Unknown

type Labeler = CSP -> Transform State (State, ConflictSet)

search :: Labeler -> CSP -> [State]
search labeler csp =
  (map fst . filter (knownSolution . snd) . leaves .
   prune (knownConflict . snd) . labeler csp . mkTree) csp

bt :: Labeler
bt csp = mapTree f
  where f s = (s,
               case earliestInconsistency csp s of
                 Nothing -> checkComplete csp s
                 Just (a,b) -> Known [a,b])

btsolver :: CSP -> [State]
btsolver = search bt

Figure 6: Conflict-directed solving of CSPs.

5 Conflict Sets and Generic Search

The utility of the backtracking solver is based on its ability to prune subtrees rooted at inconsistent nodes; it does nothing with consistent nodes. Of course, just because a state is consistent doesn't mean it can be extended to a solution; the assignments already made may be inconsistent with any possible choices for future variables. Figure 4 shows an example for queens 5: the assignment to value 1 at level 3 of the left-hand tree is consistent, but cannot be extended to a solution.

If a solver could identify such conflicted states, it could prune their subtrees too. To make precise the exact conditions under which such pruning is possible, we use the following definition. A conflict set for a state is a subset of (the indices of) the variables assigned by the state such that any solution must assign a different value to at least one member of the subset. More formally, given a state $s = \{v_1 := x_1, v_2 := x_2, \ldots, v_l := x_l\}$, a conflict set $cs$ for $s$ is a subset of $\{1, 2, \ldots, l\}$ such that, if $\{v_1 := y_1, v_2 := y_2, \ldots, v_n := y_n\}$ is a solution, then $\exists i \in cs .\ x_i \neq y_i$. (Thinking imperatively, we might say a conflict set contains variables at least one of which “must be changed” to reach a solution.) Note that conflict sets are not, in general, uniquely defined. In particular, if a state at level $l$ has a non-empty conflict set $cs$, then every subset of $\{1, \ldots, l\}$ containing $cs$ is also a conflict set. If a state has a non-empty conflict set then no extension of that state can be a solution; conversely, if it has an empty conflict set, then it must have at least one extension that is a solution. This is a very strong characterization of states: for example, if we could compute a conflict set for the root of the tree (the empty state), we could test whether it were empty and thereby determine whether the problem has a solution at all! We will therefore often operate in an environment where many conflict sets are unknown. It is obviously not possible to identify a conflicted, but consistent, state without exploring some of its extensions;

hrandom :: Int -> Transform a a
hrandom seed (Node a cs) = Node a (randomList seed' (zipWith hrandom (randoms seed') cs))
  where seed' = random seed

btr :: Int -> Labeler
btr seed csp = bt csp . hrandom seed

Figure 7: A randomization heuristic.

the trick is to avoid exploring all of them, and save effort by pruning the remainder. We address algorithms with this property beginning in Section 7.

For the moment, note that any inconsistent state has a non-empty conflict set. In particular, if a state has an earliest inconsistent pair $(i, j)$ then it has $\{i, j\}$ as a conflict set, which we call the earliest conflict set. So we can subsume backtracking search in a more general algorithm we call conflict-directed search, shown in Figure 6. We define a generic routine search, parameterized by a labeler function, which annotates nodes with conflict sets. More precisely, if the labeler can determine a legal conflict set $cs$ for the node, it annotates the node with Known $cs$; otherwise, it annotates it with Unknown. (In general, we also permit the labeler to rearrange or prune its input tree, so long as its output tree is properly labeled and still contains all solution states.) The output of the labeling stage is fed to a pruner, which removes subtrees rooted at nodes labeled with known non-empty conflict sets. Again, demand-driven execution guarantees that the excised subtrees never need to be labeled. Because of this arrangement, the labeler is allowed to assume that if it labels a node with a non-empty conflict set, it will never be called on a descendent of that node, so it need not annotate such descendents properly; this allows simpler labeler code. After pruning, the solution nodes are just the leaves of the tree annotated with known empty conflict sets; the remainder of the pipeline simply filters these out.

The framework of Figure 6 is sufficiently general-purpose to accommodate all the search algorithms discussed in the remainder of the paper. By instantiating search with the labeler function bt we obtain a simple backtracking solver btsolver that behaves just like btsolver0. The more sophisticated algorithms discussed below are all obtained by using fancier labeler functions, leaving search itself unchanged.

6 Heuristics and Search Order

As with the naive solver, if we are interested in only the first solution rather than all solutions, we can still use search unchanged; we merely demand just the head of the solution list. Since solutions are always extracted in left-to-right order, this implies that the time required to find the first solution will be very sensitive to the order in which values are tried for each variable. The use of value-ordering heuristics is well-established in the imperative search literature. Such heuristics can be implemented using specialized generator functions that produce the initial tree in the desired order. A more modular approach, however, is to view these heuristics as rearrangements of a canonically-ordered initial tree; this keeps the initial generator simple and allows multiple heuristics to be readily composed.

Such rearrangement heuristics can be easily expressed in our framework by incorporating them into the labeler function. For example, queens search can be sped up by considering values in random order. The function hrandom in Figure 7 transforms a canonical tree by randomizing its children (using a random number generator not shown here). The application (btr seed) returns a labeler that combines randomization with standard backtracking search. We have implemented a number of other such heuristics, both generic and problem-specific, but we omit details from this paper for lack of space.
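Because hrandom depends on a random number generator that is not shown, the sketch below uses a deterministic stand-in to make the idea concrete: a hypothetical heuristic hreverse (not one of the paper's heuristics) that tries values in descending order, composed with bt in the same way as btr. The supporting definitions follow Figures 1, 2, 3, and 6, restated compactly under the assumption that prune may be written by direct recursion and that record fields are accessed explicitly.

```haskell
type Var   = Int
type Value = Int
data Assign = Var := Value deriving (Eq, Ord, Show)
type State  = [Assign]
data CSP = CSP { vars, vals :: Int, rel :: Assign -> Assign -> Bool }

data Tree a = Node a [Tree a]
type Transform a b = Tree a -> Tree b

label :: Tree a -> a
label (Node lab _) = lab

mapTree :: (a -> b) -> Transform a b
mapTree f (Node a cs) = Node (f a) (map (mapTree f) cs)

-- Remove every subtree whose root label satisfies p.
prune :: (a -> Bool) -> Transform a a
prune p (Node a cs) = Node a [ prune p c | c <- cs, not (p (label c)) ]

leaves :: Tree a -> [a]
leaves (Node leaf []) = [leaf]
leaves (Node _ cs)    = concatMap leaves cs

level :: Assign -> Var
level (var := _) = var

maxLevel :: State -> Var
maxLevel []      = 0
maxLevel (a : _) = level a

mkTree :: CSP -> Tree State
mkTree csp = grow []
  where
    grow s = Node s (map grow (next s))
    next s = [ ((maxLevel s + 1) := j) : s | maxLevel s < vars csp, j <- [1 .. vals csp] ]

data ConflictSet = Known [Var] | Unknown deriving Eq
type Labeler = CSP -> Transform State (State, ConflictSet)

checkComplete :: CSP -> State -> ConflictSet
checkComplete csp s = if maxLevel s == vars csp then Known [] else Unknown

-- Backtracking labeler: annotate with the earliest inconsistent pair, if any.
bt :: Labeler
bt csp = mapTree f
  where
    f s = (s, conflict s)
    conflict []         = checkComplete csp []
    conflict s@(a : as) =
      case filter (not . rel csp a) (reverse as) of
        (b : _) -> Known [level a, level b]
        []      -> checkComplete csp s

search :: Labeler -> CSP -> [State]
search labeler csp =
  (map fst . filter ((== Known []) . snd) . leaves
    . prune (knownConflict . snd) . labeler csp . mkTree) csp
  where
    knownConflict (Known (_ : _)) = True
    knownConflict _               = False

-- Hypothetical heuristic: try the values of every variable in descending order.
hreverse :: Transform a a
hreverse (Node a cs) = Node a (reverse (map hreverse cs))

btrev :: Labeler
btrev csp = bt csp . hreverse

queens :: Int -> CSP
queens n = CSP { vars = n, vals = n, rel = safe }
  where safe (i := p) (j := q) = p /= q && abs (i - j) /= abs (p - q)
```

Under these assumptions, search bt (queens 4) and search btrev (queens 4) contain the same two solutions, but the first delivers [4 := 3, 3 := 1, 2 := 4, 1 := 2] at the head of the list while the second delivers [4 := 2, 3 := 4, 2 := 1, 1 := 3]. The generator and the search routine are untouched; only the labeler changes.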

type Table = [Row]          -- indexed by Var
type Row = [ConflictSet]    -- indexed by Value

bm :: Labeler
bm csp = mapTree fst . lookupCache csp . cacheChecks csp (emptyTable csp)

emptyTable :: CSP -> Table
emptyTable CSP{vars,vals} = []:[[Unknown | m <- [1..vals]] | n <- [1..vars]]

cacheChecks :: CSP -> Table -> Transform State (State, Table)
cacheChecks csp tbl (Node s cs) =
  Node (s, tbl) (map (cacheChecks csp (fillTable s csp (tail tbl))) cs)

fillTable :: State -> CSP -> Table -> Table
fillTable [] csp tbl = tbl
fillTable ((var' := val'):as) CSP{vars,vals,rel} tbl =
  zipWith (zipWith f) tbl [[(var,val) | val <- [1..vals]] | var <- [var'+1..vars]]
  where f cs (var,val) =
          if cs == Unknown && not (rel (var' := val') (var := val))
          then Known [var',var]
          else cs

lookupCache :: CSP -> Transform (State, Table) ((State, ConflictSet), Table)
lookupCache csp t = mapTree f t
  where f ([], tbl) = (([], Unknown), tbl)
        f (s@(a:_), tbl) = ((s, cs), tbl)
          where cs = if tableEntry == Unknown then checkComplete csp s else tableEntry
                tableEntry = (head tbl)!!(value a-1)

Figure 8: Backmarking.

7 Backmarking

Given the formulation of backtracking search as a pipelined algorithm with separate labeling and pruning phases, using a tree annotated with conflict sets as intermediate data structure, it makes sense to ask if there are other ways to perform the labeling phase. bt works by checking each assignment against all previous assignments in its state. Although this approach checks the overall consistency of each partial state only once, it can still perform many duplicate pairwise consistency checks because all the children of a given node are isomorphic. Consider a node $s$ at level $l$, and consider any descendent of $s$. In checking the consistency of the descendent, pairwise checks will be made between its assignment and all the assignments in $s$ at levels less than $l$. These checks will be duplicated for the corresponding descendents of every sibling of $s$ (unless, of course, they had an inconsistent ancestor and have been pruned away). For an example, compare the leftmost nodes of the left-most and right-most subtrees on level 5 of Figure 4: to generate these conflict sets, bt makes the same three comparisons in each case.

An alternative approach is to cache the results of such consistency checks so they can be reused for each sibling; this should reduce the total number of consistency checks at the cost of the space needed for the cache. Figure 8 shows a Haskell algorithm incorporating this idea. We annotate each node with a cache to store information about inconsistencies between that node's state and the assignments made in its descendents. Each cache is organized as a table of earliest conflict sets for all descendents, indexed by level (greater than or equal to the node's own level) and value; the table is represented as a list of lists. The root has a table in which every entry contains Unknown. fillTable computes the table contents for a node based on the node's assignment and the node's parent's table by considering each possible future assignment in turn. If the parent's table already records a known conflict pair for the future assignment, that conflict

Figure 9: Same portion of search tree for queens 5, annotated with conflict sets as computed by bj.

pair is copied into the current table; otherwise a conflict check is performed and the result (a known conflict pair or Unknown) is recorded. Note that each node's table contains a refinement of the information in its parent's table, with a table at level $l$ containing complete consistency information about assignments at level $l$. Once the tree has been annotated with cache tables, lookupCache is mapped over each node to extract the conflict pair for the node's own assignment from the node's table; if the node has no recorded conflicts and represents a complete state, it is a solution and is therefore given an empty conflict set. The ultimate annotated tree is identical to that produced by bt.

As usual, we rely on lazy evaluation to avoid building the tables or their contents unless they are needed. So most of the tables remain unbuilt, and the actual order in which consistency checks are performed is similar to bt. The important point is that, because many of a node's table entries are inherited from its parent's table, all duplicate consistency checks are avoided.

As before, we obtain a complete solver by using bm as the labeler parameter to search. Somewhat surprisingly, this algorithm turns out to be equivalent (in terms of consistency checks made) to a standard imperative algorithm called backmarking [1].
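To check that the cache-based labeler really is interchangeable with bt, the following sketch assembles bm together with the supporting definitions it needs and runs both labelers through the same search routine. It is a restatement under assumptions, not the paper's distributed code: record fields are accessed explicitly instead of by punning, prune is written by direct recursion, and the local names grow and entry are introduced here.

```haskell
type Var   = Int
type Value = Int
data Assign = Var := Value deriving (Eq, Ord, Show)
type State  = [Assign]
data CSP = CSP { vars, vals :: Int, rel :: Assign -> Assign -> Bool }

data Tree a = Node a [Tree a]
type Transform a b = Tree a -> Tree b

label :: Tree a -> a
label (Node lab _) = lab

mapTree :: (a -> b) -> Transform a b
mapTree f (Node a cs) = Node (f a) (map (mapTree f) cs)

prune :: (a -> Bool) -> Transform a a
prune p (Node a cs) = Node a [ prune p c | c <- cs, not (p (label c)) ]

leaves :: Tree a -> [a]
leaves (Node leaf []) = [leaf]
leaves (Node _ cs)    = concatMap leaves cs

level :: Assign -> Var
level (var := _) = var

value :: Assign -> Value
value (_ := val) = val

maxLevel :: State -> Var
maxLevel []      = 0
maxLevel (a : _) = level a

mkTree :: CSP -> Tree State
mkTree csp = grow []
  where
    grow s = Node s (map grow (next s))
    next s = [ ((maxLevel s + 1) := j) : s | maxLevel s < vars csp, j <- [1 .. vals csp] ]

data ConflictSet = Known [Var] | Unknown deriving Eq
type Labeler = CSP -> Transform State (State, ConflictSet)

checkComplete :: CSP -> State -> ConflictSet
checkComplete csp s = if maxLevel s == vars csp then Known [] else Unknown

search :: Labeler -> CSP -> [State]
search labeler csp =
  (map fst . filter ((== Known []) . snd) . leaves
    . prune (knownConflict . snd) . labeler csp . mkTree) csp
  where
    knownConflict (Known (_ : _)) = True
    knownConflict _               = False

-- Plain backtracking labeler, for comparison.
bt :: Labeler
bt csp = mapTree f
  where
    f s = (s, conflict s)
    conflict []         = checkComplete csp []
    conflict s@(a : as) =
      case filter (not . rel csp a) (reverse as) of
        (b : _) -> Known [level a, level b]
        []      -> checkComplete csp s

type Row   = [ConflictSet]  -- indexed by Value
type Table = [Row]          -- indexed by Var

emptyTable :: CSP -> Table
emptyTable csp = [] : [ [ Unknown | _ <- [1 .. vals csp] ] | _ <- [1 .. vars csp] ]

-- Every child of a node shares one table, derived from the parent's table.
cacheChecks :: CSP -> Table -> Transform State (State, Table)
cacheChecks csp tbl (Node s cs) =
  Node (s, tbl) (map (cacheChecks csp (fillTable s csp (tail tbl))) cs)

fillTable :: State -> CSP -> Table -> Table
fillTable [] _ tbl = tbl
fillTable ((var' := val') : _) csp tbl =
  zipWith (zipWith f) tbl
          [ [ (var, val) | val <- [1 .. vals csp] ] | var <- [var' + 1 .. vars csp] ]
  where
    f cs (var, val)
      | cs == Unknown && not (rel csp (var' := val') (var := val)) = Known [var', var]
      | otherwise = cs

lookupCache :: CSP -> Transform (State, Table) ((State, ConflictSet), Table)
lookupCache csp = mapTree f
  where
    f ([], tbl)        = (([], Unknown), tbl)
    f (s@(a : _), tbl) = ((s, cs), tbl)
      where
        entry = head tbl !! (value a - 1)
        cs    = if entry == Unknown then checkComplete csp s else entry

bm :: Labeler
bm csp = mapTree fst . lookupCache csp . cacheChecks csp (emptyTable csp)

queens :: Int -> CSP
queens n = CSP { vars = n, vals = n, rel = safe }
  where safe (i := p) (j := q) = p /= q && abs (i - j) /= abs (p - q)
```

With these definitions, search bm (queens 5) and search bt (queens 5) return the same list of ten solutions in the same order. The two labelers may record the members of a conflict pair in a different order, so a direct comparison of annotated trees should treat conflict sets as sets.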

8 Conflict-Directed Backjumping

The bt and bm algorithms annotate inconsistent nodes with known conflict sets, but most internal nodes remain marked Unknown. If we could somehow compute non-empty conflict sets for internal nodes closer to the root of the tree, we could prune larger subtrees and so speed up search. In fact, many such nodes do have non-empty conflict sets; for example, see the leftmost node at level 3 in Figure 9.

One approach to computing internal node conflict sets is to construct them bottom-up from the conflict sets of a subset of their children. To do this, we make use of two key facts about conflict sets:

• (i) If a node $s$ at level $l$ has a child (at level $l+1$) with a known conflict set $cs$ that does not contain $l+1$, then $cs$ is also a conflict set for $s$. (In particular, if $s$ has a child with an empty conflict set, then $s$ also has an empty conflict set.)

• (ii) If all the children of node $s$ at level $l$ have non-empty conflict sets $cs_1, cs_2, \ldots, cs_m$, then $(cs_1 \cup cs_2 \cup \cdots \cup cs_m) \cap \{1, \ldots, l\}$ is a conflict set for $s$.

import Data.List (union)   -- library function used by combine (module List in older Haskell)

bjbt :: Labeler
bjbt csp = bj csp . bt csp

bj :: CSP -> Transform (State, ConflictSet) (State, ConflictSet)
bj csp = foldTree f
  where f (a, Known cs) chs = Node (a,Known cs) chs
        f (a, Unknown) chs = Node (a,Known cs') chs
          where cs' = combine (map label chs) []

combine :: [(State, ConflictSet)] -> [Var] -> [Var]
combine [] acc = acc
combine ((s, Known cs):css) acc =
  if maxLevel s `notElem` cs then cs else combine css (cs `union` acc)

bj' :: CSP -> Transform (State, ConflictSet) (State, ConflictSet)
bj' csp = foldTree f
  where f (a, Known cs) chs = Node (a,Known cs) chs
        f (a, Unknown) chs =
          if knownConflict cs' then Node (a,cs') [] else Node (a,cs') chs
          where cs' = Known (combine (map label chs) [])

Figure 10: Conflict-directed backjumping.

These facts are easy to prove from the definition of conflict set. Intuitively, fact (i) says that if any child of $s$ has conflicts that don't involve $v_{l+1}$, then all children of $s$ have (at least) the same conflicts, and hence so does $s$ itself. (The special case just says that if a child of $s$ can be extended to a solution, then so can $s$.) Fact (ii) says that if no child of $s$ can be extended to a solution, then neither can $s$, and any solution must differ from $s$ in the value of at least one of the offending variables of one of the children. Fact (i) is the crucial one for optimizing search, since it permits the parent's conflict set to be computed from a strict subset of the children's conflict sets.

We can now define a lazy bottom-up algorithm for computing internal node conflict sets from a tree that has been (lazily) “seeded” with at least one conflict set along every path from root to leaf. Function bj in Figure 10 is a Haskell version of this labeling algorithm. At each parent node that doesn't already have a conflict set, bj calls combine to build one. combine inspects the conflict sets of the children in turn. If it finds a child to which fact (i) can be applied, it immediately returns this as the conflict set for the parent; if no such child is found, it applies fact (ii).² Under lazy evaluation, the subtrees corresponding to the remaining children are never explored.
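The behavior of combine can be traced on a small hand-made example. The sketch below is a simplified variant, assumed here for illustration: each child is represented by its level paired with its conflict set, rather than by its full state as in Figure 10, and the example inputs are invented. An explicit error case for Unknown children is also an addition of this sketch.

```haskell
import Data.List (union)

type Var = Int
data ConflictSet = Known [Var] | Unknown deriving (Eq, Show)

-- Each child is given as (its level, its conflict set).
combine :: [(Var, ConflictSet)] -> [Var] -> [Var]
combine [] acc = acc
combine ((lvl, Known cs) : rest) acc
  | lvl `notElem` cs = cs                             -- fact (i): stop at once
  | otherwise        = combine rest (cs `union` acc)  -- accumulate for fact (ii)
combine ((_, Unknown) : _) _ = error "child conflict set not yet known"

-- Parent at level 2; the second child's set omits level 3, so fact (i)
-- applies and the third child (left undefined here) is never inspected.
viaFactOne :: [Var]
viaFactOne = combine [(3, Known [1, 3]), (3, Known [1, 2]), undefined] []

-- Every child's set mentions level 3, so fact (ii) applies and the
-- result is the union of the children's sets.
viaFactTwo :: [Var]
viaFactTwo = combine [(3, Known [1, 3]), (3, Known [2, 3]), (3, Known [1, 3])] []
```

In this example viaFactOne is [1, 2], and viaFactTwo contains exactly the variables 1, 2, and 3. The child level 3 is retained in the second result because, as in the paper's implementation, the intersection step of fact (ii) is omitted.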

This algorithm works correctly for any initial seeding of conflict sets, but it is most effective when the conflict sets are small and contain low-numbered variables, because this increases the number of levels for which fact (i) can be applied. This is why we use earliest inconsistent pairs to represent consistency conflicts. The combination of bj with bt is commonly referred to as conflict-directed backjumping (CBJ) (or just backjumping) in the literature and it is the cornerstone of many newly-developed algorithms [6]. In its usual imperative formulation this algorithm is notoriously difficult to understand or prove correct. While we have relied on the analysis of Caldwell, et al. [4] for our understanding of conflict sets, we are unaware of any description of the algorithm as a form of labeling.

While search bjbt behaves just like imperative CBJ in the sense that it performs the same number of consistency checks, it has an unfortunate space leak. The problem is that the pruning phase cannot remove the children of a node until that node's conflict set has been computed, but that computation may generate a substantial part of the children's subtrees into memory. We can plug the space leak effectively, if not too

² To simplify the implementation, we don't bother performing the intersection step in fact (ii), since it is harmless for a node's (non-empty) conflict set to include indices of its descendents.

fc :: Labeler
fc csp = domainWipeOut csp . lookupCache csp . cacheChecks csp (emptyTable csp)

collect :: [ConflictSet] -> [Var]
collect [] = []
collect (Known cs:css) = cs `union` (collect css)

domainWipeOut :: CSP -> Transform ((State, ConflictSet), Table) (State, ConflictSet)
domainWipeOut CSP{vars} t = mapTree f t
  where f ((as, cs), tbl) = (as, cs')
          where wipedDomains = ([vs | vs <- tbl, all (knownConflict) vs])
                cs' = if null wipedDomains then cs else Known (collect (head wipedDomains))

Figure 11: Forward checking.

neatly, by adding additional pruning into the labeler itself, as illustrated by bj'.

9 Forward Checking

Another way of assigning conflict sets to consistent internal nodes can be developed on the basis of the cache tables introduced for backmarking (Section 7). Recall that these tables record, for each node, the earliest conflict sets for all descendent nodes; table entries for consistent nodes will remain marked Unknown. Suppose, however, that the table for some node $s$ at level $l$ contains a row, corresponding to a domain level $k > l$, in which every entry contains a non-empty conflict set. Then it is evident that the node can never be extended to a solution, because the assignments in $s$ rule out all possible values for variable $k$. (As an example, consider the left diagram in Figure 5; if we add a queen at position (4,4), then we can immediately see that no row placement will work for column 5.) Therefore, there must exist a non-empty conflict set for $s$. By labelling $s$ with such a set, we can avoid further search in the subtree rooted at $s$. This technique has been called domain wipeout [1]. The combination of domain wipeout with backmarking corresponds to the well-known imperative algorithm called forward checking. Because our cache table construction is lazy, we have actually rediscovered (“for free”) minimal (or lazy) forward checking, itself a recent discovery in the imperative literature [5].

Figure 11 shows code for implementing domain wipeout. Gathering a list of wipedDomains and testing whether it is non-empty is straightforward. The interesting question is what conflict set to assign to the node $n$ if domain wipeout has occurred. Since it is always valid to throw additional variables into a non-empty conflict set, we could just use the set $\{1, \ldots, l\}$. But it is better to use the smallest conflict set that the available information supports, because this can increase its utility for other algorithms (e.g., CBJ). In this case, the cache table row for a wiped-out domain records which existing assignment rules out each possible value for that domain. The union of the variables in these assignments (restricted to $\{1, \ldots, l\}$)³ is a valid conflict set for $n$, since any solution must assign a different value to at least one of them. If there is more than one wiped-out domain, we could compute a conflict set from any one of them; for simplicity and to limit computation, domainWipeOut just chooses the first.
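The choice of conflict set described above can be illustrated in isolation. The following is a minimal sketch, not the paper's code: the names `Row`, `nonEmptyKnown`, `wipeOutConflict` and `exampleTable` are hypothetical, the `ConflictSet` type is a stand-in, and the table is assumed to be a plain list of rows with one entry per domain value.

```haskell
import Data.List (sort, union)

type Var = Int

-- Hypothetical stand-in for the conflict-set type used in the figures.
data ConflictSet = Known [Var] | Unknown
  deriving (Eq, Show)

-- Assumption: one row per future variable, one entry per domain value.
type Row = [ConflictSet]

-- True only for a known, non-empty conflict set.
nonEmptyKnown :: ConflictSet -> Bool
nonEmptyKnown (Known (_ : _)) = True
nonEmptyKnown _               = False

-- If some row has every value ruled out, return the union of the
-- variables recorded in the first such row (sorted for readability).
wipeOutConflict :: [Row] -> Maybe [Var]
wipeOutConflict table =
  case [row | row <- table, not (null row), all nonEmptyKnown row] of
    []        -> Nothing
    (row : _) -> Just (sort (foldr union [] [vs | Known vs <- row]))

-- Illustrative data only: the second row is a wiped-out domain.
exampleTable :: [Row]
exampleTable =
  [ [Known [1], Unknown, Known [2]]
  , [Known [1], Known [2], Known [1, 3]]
  ]
```

Here `wipeOutConflict exampleTable` combines the variables named in the second row, while a table with any `Unknown` entry in every row yields `Nothing`.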

10 Mixing and Matching

A major advantage of our declarative approach is that we can trivially combine algorithms using function composition, so long as they take a consistent view of conflict set annotations. The combination of forward checking and backjumping

    bjfc csp = bj csp . fc csp

is well known, although to our knowledge it has not previously been achieved for lazy forward checking. Imperative forward checking is traditionally described as filtering out all the conflicting values from the domains of future variables; this makes it hard to explain how it can be profitably combined with backjumping, since the latter would seem to have no information on which to base backjumping decisions. Our viewpoint is that forward checking is just a more (time-)efficient way of generating conflict sets, which makes the combination perfectly reasonable.

³ Again, we simplify the implementation by omitting the restriction step, which is harmless.

    Queens                  8     9     10     11      12       13
    CSPlib BT            0.01  0.05   0.27   1.50    8.91    57.34
    ghc monolithic BT    0.14  0.60   3.20  18.03  108.34   686.92
    ghc btsolver0        0.56  2.84  14.18  76.29  440.72  2686.13

Table 1: Runtime in seconds for different versions of simple backtracking search for the $n$-queens problem.

Similarly, the combination of backmarking and backjumping

    bjbm csp = bj csp . bm csp

is tricky to implement correctly in an imperative setting [11], but is simple for us, and turns out to do fewer consistency checks on queens than any of our other algorithms.

Once problem-specific value ordering heuristics are introduced, many more possibilities for new algorithm design open up. Since the best combination of algorithm features tends to depend on the particular problem at hand, it is important to be able to experiment with different combinations; our framework makes this extremely easy.

11 Experimental Results

To estimate the cost of modularity and laziness we wrote an integrated, strict version of simple backtracking search for the $n$-queens problem in Haskell and compared the runtime with that of btsolver0. Table 1 reports the results; they indicate an overhead factor of about four (between 3.9 and 4.7 for the sizes in Table 1). The measurements were taken using ghc (the Glasgow Haskell compiler) version 3.02 with optimization turned on, running on a lightly loaded Sun Ultra 1 under Solaris 2.5.1. We also show the runtime of an optimized C library for solving CSPs [17] compiled with egcs version 2.93.06 using -O4 on the same platform; it runs an order of magnitude faster, partly because it performs consistency checks via lookup into a precomputed table. Table 2 gives the number of consistency checks made by the different algorithms for the $n$-queens problem.

12 Related Work

Hughes [9] gives a lazy development of minimax tree search. Bird and Wadler [3] treat the n-queens problem using generate-and-test and lazy lists. Laziness (not in the context of lazy languages) has been used for improving the efficiency of existing CSP algorithms [15, 5], but as far as we know laziness has not previously been used to modularize any of the CSP algorithms presented here.


    Queens       5     6     7      8       9       10       11        12         13
    bjbm       276   909  3158  11928   49369   210210   975198   4938324   26709008
    bjfc       279   916  3182  12229   51314   218907  1026826   5231284   28387767
    bm         276   944  3236  12308   50866   220052  1026576   5224512   28405086
    fc         279   920  3189  12276   51642   220745  1038129   5297651   28817439
    bjbt       405  1828  8230  41128  214510  1099796  6129447  36890689  233851850
    bt         405  2016  9297  46752  243009  1297558  7416541  45396914  292182579
    Solutions   10     4    40     92     352      724     2680     14200      73712

Table 2: Number of consistency checks performed by various algorithms on the $n$-queens problem. Algorithms are identified by their labeler function name.

Many reformulations of standard algorithms into a framework exist in the literature [8, 6, 16, 2], but the frameworks typically aren’t modular; in the best case the differences between two algorithms are highlighted by showing which lines of pseudo-code have changed [11]. Algorithms have been classified according to the amount of arc consistency (AC) they do [12] or the number of nodes visited [11]. These classifications have shown that the backmarking and forward checking algorithms, which were previously thought of as being fundamentally different, actually share the same foundation [1], as we independently rediscovered (Section 9). There often remains confusion, even among experts in the field, about which algorithm a given description really implements.

Considering how long the standard algorithms have existed and how much they are used, there have been surprisingly few proofs of correctness. A correctness criterion for search algorithms based on soundness and completeness was presented in Kondrak [11] and an automatic theorem prover was used to derive the algorithms in Caldwell, et al. [4].

The term “conflict set” is very common in the literature, but a precise definition is difficult to achieve; we base ours on that of Caldwell, et al. [4].

13 Conclusion

Expressing algorithms in a lazy functional language often clarifies what an algorithm does and what invariants it depends on. With a little bit of care we can modularize code that traditionally has been expressed in monolithic imperative form. Experimentation is also very easy. New combinations of algorithms, such as forward checking plus conflict-directed backjumping, can be expressed in a single line of code; the equivalent algorithm in the imperative literature requires many lines of (mysterious) C or pseudocode. Despite the overheads introduced by laziness and use of Haskell, large experiments can be conducted. For example, combining hrandom with bjbt allowed us to find solutions for the queens problem with well over 100 queens, even using the Haskell interpreter Hugs.

The major problem of working with lazy code is difficulty in predicting runtime behavior, particularly for space. Very minor code changes can often lead to asymptotic differences in space requirements, and the available tools for investigating such problems in Haskell are inadequate.

For future work, we plan to work on formal proofs of algorithmic correctness, which should be relatively easy in our framework, and to investigate variable-reordering heuristics, which are at the core of current work in the AI search literature.


References

[1] F. Bacchus and A. Grove. On the forward checking algorithm. In Principles and Practice of Constraint Programming, pages 293–309, Cassis, France, September 1995.

[2] F. Bacchus and P. van Run. Dynamic variable ordering in CSPs. In U. Montanari and F. Rossi, editors, Principles and Practice of Constraint Programming, pages 258–275, Cassis, France, September 1995.

[3] R. Bird and P. Wadler. Introduction to Functional Programming. Prentice Hall, 1988.

[4] J. L. Caldwell, I. P. Gent, and J. Underwood. Search algorithms in type theory. Submitted to: Theoretical Computer Science: Special Issue on Proof Search in Type-theoretic Languages, September 1997.

[5] M. J. Dent and R. Mercer. Minimal forward checking. In Proc. of the Int’l Conference on Tools with Artificial Intelligence, pages 432–438, New Orleans, Louisiana, 1994. IEEE Computer Society.

[6] D. H. Frost. Algorithms and Heuristics for Constraint Satisfaction Problems. PhD thesis, University of California Irvine, 1997.

[7] A. Gill, J. Launchbury, and S. Peyton Jones. A short-cut to deforestation. In Proc. ACM FPCA, 1993.

[8] M. L. Ginsberg. Dynamic backtracking. Journal of Artificial Intelligence Research, 1:25–46, 1993.

[9] J. Hughes. Why functional programming matters. Computer Journal, 32(2):98–107, 1989.

[10] D. King and J. Launchbury. Structuring depth first search algorithms in Haskell. In Proc. ACM Principles of Programming Languages, 1995.

[11] G. Kondrak. A theoretical evaluation of selected backtracking algorithms. Master’s thesis, University of Alberta, 1994.

[12] V. Kumar. Algorithms for constraint satisfaction problems: A survey. AI Magazine, 13(1):32–44, 1992.

[13] B. A. Nadel. Representation selection for constraint satisfaction: A case study using n-queens. IEEE Expert, 5(3):16–23, June 1990.

[14] C. Okasaki. Purely Functional Data Structures. Cambridge University Press, 1998.

[15] T. Schiex, J. C. Regin, C. Gaspin, and G. Verfaillie. Lazy arc consistency. In Proc. of AAAI, pages 216–221, Portland, Oregon, USA, 1996.

[16] E. Tsang. Foundations of Constraint Satisfaction. Academic Press Limited, 1993.

[17] P. van Beek. A ‘C’ library of constraint satisfaction techniques, 1999. Available from ftp://ftp.cs.ualberta.ca/pub/vanbeek/software/.


Persistent Triangulations*

Guy Blelloch, Hal Burch, Karl Crary, Robert Harper, Gary Miller, Noel Walkington

Carnegie Mellon University, Pittsburgh, PA 15213

Abstract

Triangulations of a surface are of fundamental importance in computational geometry, engineering simulation, and computer graphics. For example, the convex hull of a set of points may be constructed as a triangulation, and there is a close relationship between Delaunay triangulations and Voronoi diagrams in geometry. Triangulations are ordinarily represented as mutable graph structures for which both edge traversal and adding edges take constant time per operation. These representations of triangulations make it difficult to support persistence, including “multiple futures” (the ability to use a data structure in several unrelated ways in a given computation), “time travel” (the ability to move freely among versions of a data structure), and parallel computation (the ability to operate concurrently on a data structure without interference). We propose a new representation of triangulations that supports persistence. To demonstrate its use we give a new algorithm for the three-dimensional convex hull that is asymptotically optimal in the expected case, and we give an implementation of a terrain-modelling algorithm based on this representation. To assess its practicality we measure the performance of both applications.

1 Introduction

A persistent data structure is one for whose operations (even those that “update” or “modify” it) preserve the data structure across calls. To achieve persistence, update operations must create a “fresh” copy of the data structure so that the original is not disturbed by the operation. Persistent data structures arise naturally in “value-oriented” programming languages such as ML or Haskell. In these languages all data structures are values that are passed as arguments and returned as results, much as numbers are handled in nearly every language.

In contrast, most familiar data structures are ephemeral: the operations on the data structure “mutate” it by modifying its representation in memory in such a way that all references to it change simultaneously. Ephemeral data structures arise naturally in “object-oriented” programming languages, including Java or C++. In these languages data structures are thought of as regions of mutable storage that is modified by the operations on the structure.

Persistent data structures offer a number of advantages not shared by their ephemeral counterparts. The term persistence, however, is sometimes used loosely in the literature and there are several types of persistence that can be categorized by the features they support. The key features we are concerned with are:

1. time travel, the ability to go back and view any previous version of a data structure,

2. multiple futures, the ability to go back and make a sequence of modifications to a previous version of the data structure without affecting the current version (this allows for a version tree),

3. combining, the ability to combine persistent data structures, such as taking the union of two persistent sets, and

4. implicit parallelism, the ability to view and modify different versions of a data structure in parallel.

*This research was supported in part by NSF Grant CCR-9706572. Burch was supported by an NSF Graduate Fellowship.

Driscoll, et al. [10] distinguish between partially persistent and fully persistent data structures. The former support only time travel, whereas the latter also support multiple futures. They describe techniques for building both partially and fully persistent data structures, but their methods do not support combining or implicit parallelism. Driscoll, Sleator and Tarjan [11] introduced the term confluently persistent for fully persistent structures that also support combining, and showed a technique for supporting confluently persistent lists with catenation. Their technique does not support parallelism. We will use the term strongly persistent for a data structure that supports all four features.

The above-mentioned techniques are heavily based on the use of side-effects. It is well known, however, that purely functional programs (no side-effects) are inherently persistent, and support at least time travel, multiple futures and combining. Furthermore, if the program is strictly functional (i.e., does not use lazy evaluation or any other form of memoization) then it will also support implicit parallelism. Tarjan and Kaplan [18] make use of this in a design of a strictly functional catenable list that is strongly persistent. Okasaki [20] developed a simpler algorithm with similar time bounds based on functional programs. Since his method uses lazy evaluation, however, it does not support parallelism.
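The claim that purely functional code is persistent by construction can be seen in a few lines. The sketch below is only an illustration with hypothetical names (`Tree`, `insert`, `toList`, `v0`, `v1`, `futureA`, `futureB`); it uses an unbalanced search tree for brevity and is not one of the representations discussed in this paper.

```haskell
-- A tiny purely functional search tree (unbalanced, for illustration).
data Tree = Leaf | Branch Tree Int Tree

-- Insertion rebuilds only the path to the new key; the old tree is untouched.
insert :: Int -> Tree -> Tree
insert x Leaf = Branch Leaf x Leaf
insert x t@(Branch l y r)
  | x < y     = Branch (insert x l) y r
  | x > y     = Branch l y (insert x r)
  | otherwise = t

-- In-order listing of the keys.
toList :: Tree -> [Int]
toList Leaf           = []
toList (Branch l y r) = toList l ++ [y] ++ toList r

-- An initial version and a later version built from it.
v0, v1 :: Tree
v0 = foldr insert Leaf [3, 1, 2]
v1 = insert 5 v0

-- Two unrelated futures: one continues from v1, one goes back to v0.
futureA, futureB :: Tree
futureA = insert 4 v1
futureB = insert 0 v0
```

Every version remains valid after the later ones are built, so `v0` can still be inspected (time travel) and extended independently (multiple futures).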

The subject of this paper is the persistent representation of closed surfaces in multi-dimensional space. Closed surfaces are of fundamental importance in computational geometry, and have a number of applications in a wide variety of areas, including geographic information systems, mesh generation for engineering simulations, and surface representations in computer graphics. One application of closed surfaces is the construction of the convex hull of a set of points in three dimensions. The convex hull of a set of points is the surface, or boundary, of the smallest convex polytope containing those points. Another application is to terrain modelling, in which the topography of a geographical region is approximated to within a specified resolution by a closed surface.

A number of representations of closed surfaces have been considered in the literature [16, 4, 8, 7], but all are based on ephemeral data structures such as graphs or mutable dictionaries. Using these ephemeral data structures, several asymptotically optimal algorithms for the construction of the convex hull of a set of points in three dimensions are known [4, 7]. Garland and Heckbert’s terrain modelling algorithm [13] is also based on an ephemeral representation of surfaces.

Persistence offers a number of advantages over the more familiar ephemeral representations. In the case of the convex hull algorithm, we may exploit implicit parallelism to provide a simultaneous display of the construction of the hull during its construction, without imposing any synchronization or checkpointing overhead. By exploiting time travel, we may move backwards and forwards among stages of the construction of the hull, allowing the user to explore the dynamics of its creation or to modify the unconsidered points to see the effect on the final hull. In the case of terrain modelling, persistence supports viewing the model at many different resolutions, on demand. Although we have not explored the direct use of persistent surfaces in algorithm design, persistent data structures are used as components of several algorithms in other areas of computational geometry [21, 15, 17].

Conventional implementations of the 3-dimensional convex hull algorithm rely on mutable data structures such as the doubly-connected edge list described by de Berg, et al. [4]. The hull is represented by a planar graph whose nodes are triangles and whose edges represent the adjacency relation among them. The analysis of these algorithms relies on the assumption that graph edges may be traversed, and new edges added, in constant time.¹ A naive translation of these algorithms to the persistent setting would proceed by simply replacing the ephemeral graph structure by a persistent dictionary recording the adjacency relation. Using a simple dictionary representation the cost of the fundamental graph operations increases from $O(1)$ to $O(\log n)$, which would lead to a sub-optimal $O(n \log^2 n)$ time bound in the persistent case.
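The sub-optimal bound for the naive translation follows from a one-line count. As a sketch, assume (for illustration only) that the ephemeral algorithm performs $O(n \log n)$ graph operations in total, and that each one becomes a dictionary access costing $O(\log n)$ instead of $O(1)$:

```latex
\[
  \underbrace{O(n \log n)}_{\text{graph operations}}
  \;\times\;
  \underbrace{O(\log n)}_{\text{cost per dictionary access}}
  \;=\; O(n \log^2 n).
\]
```

Under the same assumption, any per-operation cost $c(n)$ for the surface representation gives a total of $O(n \log n \cdot c(n))$, which is why the number of surface operations, and not only their individual cost, matters for the overall bound.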

One approach to reducing this cost might be to use a persistent representation of graphs (such as the one given by Erwig [12]) that performs well in the case that each graph has only one “logical future”. The obvious disadvantage of this approach is that in the multiple-future case the performance once again degrades to sub-optimal. Moreover, these methods usually rely on so-called “benign” effects that inhibit parallelism. Another approach might be to use Dietz’s persistent representation of arrays [9] for which fully persistent array updates and searches require only $O(\log \log n)$ time. Used naively in a convex hull algorithm Dietz’s method would lead to a suboptimal time bound of $O(n \log n \log \log n)$. Moreover, Dietz’s representation does not support combining or implicit parallelism.

We therefore consider whether it is possible to achieve the $\Omega(n \log n)$ lower bound while retaining the advantages of strong persistence. We present a new randomized algorithm, called the bulldozer algorithm, for the construction of the convex hull of a set of points in three dimensions that achieves the lower bound in the expected case. The crucial feature of the algorithm is that the expected number of adjacency relationship checks and updates among faces of the hull is at most $O(n)$, while the expected number of floating-point operations to determine the spatial relationships among points is $O(n \log n)$. Since the algorithm only requires $O(n)$ operations on the surface, we can afford to spend $O(\log n)$ time per operation without affecting the overall time bound. We can therefore use a simple dictionary based on balanced trees to represent the adjacency relationship of the surface.

Asymptotic complexity is important, but so are empirical performance measurements. To assess the practicality of our representation, we measure the performance of the hull algorithm and the terrain modelling algorithm on representative data sets. There are several types of experiment we could run. Rather than just measuring running times, which are strongly influenced by the machine used and the optimization of the algorithm, we counted the number of basic operations. For the numerical component of the algorithm we counted the number of floating-point operations. For the manipulation of the surface we counted the number of key comparisons that are made in our dictionary implementation of the surface, which is based on red-black trees. For the convex hull algorithms we expect both these counts to grow as $O(n \log n)$, but we were interested in comparing the constant factors. Our results are given in Section 4 and show that there are more floating-point operations than key comparisons over a wide variety of point distributions and sizes. Similar results for the terrain-modelling code are given in Section 5.

Since one could argue that a “key comparison” and associated overhead could be much greater than the cost of a floating-point operation, we also ran a timed experiment. To avoid biasing the results, our comparisons are made to the fastest 3D convex hull algorithm we knew of [2], which was developed at the Minnesota geometry center. We first measured the time for Minnesota Quickhull using its own ephemeral surface representation. We then measured the additional time required to implement the surface operations using a dictionary instead of the ephemeral structure. Our initial experiments show that the dictionary adds a cost of about 50% to the overall cost of the algorithm.

2 Surfaces

Both the convex hull algorithm and the terrain modelling algorithm presented in later sections construct a two-dimensional surface enclosing a set of points. In the case of the convex hull, this is the surface of the smallest enclosing convex polytope of a set of points, and in the case of the terrain modelling, the surface represents an approximation to the topography of a geographical region. In both cases it is convenient to think of the surface as a connected set of triangles covering the surface; if the surface is specified by polygonal faces they are subdivided into triangles. Consequently, the surfaces of interest are sometimes called triangulated surfaces, or simply triangulations.

¹ This may not be a reasonable assumption when the number of nodes is extremely large, due to memory hierarchy effects. Nevertheless, it is a standard assumption to ignore such issues.

Following Giblin [14], we define a closed surface to consist of a set of triangles satisfying the following three conditions:

1. Any two triangles have at most one vertex or one edge (and its two vertices) in common; no other forms of overlap are permitted. This is called the intersection condition.

2. The surface is connected in the sense that there is a path from any vertex to any other vertex consisting of edges of the triangles of the surface.

3. The set of edges “opposite” any vertex, called the link of that vertex, forms a simple, closed polygon.

This definition relies on the familiar concept of a triangle. A triangle consists of a set of three distinct vertices, specified in some order. This raises the question of when two triangles are equivalent. Under an ordered interpretation, $abc$ is distinct from both $bca$ and $cab$, even though they enumerate the vertices in the same cyclic sequence, and is also distinct from $acb$, which reverses the order of presentation. Two orderings that differ by an even permutation (i.e., that can be obtained from one another by an even number of swaps) are said to determine the same orientation. Thus $abc$, $bca$, and $cab$ all have the same orientation, whereas $acb$ (and its even permutations) have the opposite orientation. The orientation may be thought of as determining two “sides” of a triangle; $abc$ is the “front” of the triangle $abc$, and, correspondingly, $acb$ is the “back” of $abc$. Under an oriented interpretation we identify triangles that have the same orientation, and distinguish those that do not.
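The notion of orientation can be made concrete by computing the parity of the permutation that relates two orderings. The following sketch is illustrative only; `inversions` and `sameOrientation` are hypothetical names, and vertices are assumed to be distinct values of any type with equality.

```haskell
import Data.List (elemIndex, sort)
import Data.Maybe (mapMaybe)

-- Number of out-of-order pairs; its parity is the parity of the permutation.
inversions :: [Int] -> Int
inversions xs = length [() | (i, x) <- indexed, (j, y) <- indexed, i < j, x > y]
  where indexed = zip [0 :: Int ..] xs

-- Just True  : the orderings differ by an even permutation.
-- Just False : they differ by an odd permutation (opposite orientation).
-- Nothing    : they are not orderings of the same distinct vertices.
sameOrientation :: Eq v => [v] -> [v] -> Maybe Bool
sameOrientation s t
  | sort positions /= [0 .. length s - 1] = Nothing
  | otherwise                             = Just (even (inversions positions))
  where positions = mapMaybe (`elemIndex` s) t
```

For example, `sameOrientation "abc" "bca"` and `sameOrientation "abc" "cab"` both report the same orientation, while `sameOrientation "abc" "acb"` reports the opposite one.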

Following Giblin, we maintain a careful distinction between the configuration of the triangles on the surface (i.e., their adjacency relationships) and the embedding of the triangles in three-dimensional space (i.e., the assignment of coordinates to their vertices). When embedding a triangle $abc$ in three-dimensional space, we require that the points assigned to the vertices be affinely independent, which is to say that the vectors $b - a$ and $c - a$ are linearly independent, or, equivalently, that the three points are not collinear. The convex hull algorithm will determine not only the configuration of triangles, but also their embedding in three-dimensional space.²

A closed surface is a special case of the more general concept of a simplicial complex [14, 1], which applies in an arbitrary dimension. Our implementations of the three-dimensional convex hull and of the terrain modelling algorithm are based on an abstract type of simplicial complexes. Not only does this support generalization to higher-dimensional spaces, but it also allows us to experiment with various implementations of them without disturbing the application code. Indeed, we experimented with several different implementations before settling on the one we describe here.

Just as a closed surface is a set of triangles satisfying some conditions, a simplicial complex is a set of simplices over a set of vertices satisfying some related conditions. A zero-dimensional simplex is a “bare” vertex, a one-dimensional simplex is a line segment, a two-dimensional simplex is a triangle, a three-dimensional simplex is a tetrahedron, and so on. A complex is a configuration of simplices subject to some simple conditions that ensure that the simplices “fit together” to form a coherent “solid” in $d$-dimensional space.

We assume given a totally ordered set $V$ of vertices.³ A $d$-dimensional ordered simplex, or $d$-simplex, is a $(d+1)$-tuple of distinct vertices. An ordered simplex is oriented iff we do not distinguish between two orderings that differ by an even permutation (one that can be expressed as an even number of swaps). A simplex $s$ is a sub-simplex, or a face, of a simplex $t$, written $s \le t$, iff $s$ is a subsequence of $t$.

² To avoid degeneracies and to simplify the presentation, we assume that the input set of points to the hull algorithm has the property that no four points are coplanar.

³ This is not ordinarily required in the mathematical setting, but is necessary for implementation reasons.
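The definition of a face as a subsequence translates directly into code. This is a minimal sketch under the ordered interpretation only (orientation is ignored); the names `Simplex`, `dimension`, `isFaceOf` and `facets` are hypothetical, and vertices are assumed to be integers.

```haskell
import Data.List (isSubsequenceOf)

type Vertex  = Int
type Simplex = [Vertex]   -- assumption: a list of distinct vertices

-- A simplex with k+1 vertices has dimension k.
dimension :: Simplex -> Int
dimension s = length s - 1

-- s is a face of t iff s is a subsequence of t.
isFaceOf :: Simplex -> Simplex -> Bool
isFaceOf = isSubsequenceOf

-- The faces of one dimension lower, obtained by deleting each vertex in turn.
facets :: Simplex -> [Simplex]
facets s = [take i s ++ drop (i + 1) s | i <- [0 .. length s - 1]]
```

For the triangle `[1, 2, 3]`, `facets` yields its three edges, each of which satisfies `isFaceOf`; the reversed edge `[3, 1]` does not, because the check is on ordered sequences.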


    signature VERTEX =
    sig
      type vertex
      val compare : vertex * vertex -> order

      type point
      val new : point -> vertex
      val loc : vertex -> point
    end

Figure 1: Signature of Vertices

A $d$-dimensional, oriented, pure simplicial complex, or just $d$-complex for short, consists of a set $V$ of vertices and a set $S$ of oriented simplices satisfying the following conditions:

1. Every vertex determines a 0-simplex. We usually do not distinguish between a vertex $v$ and its associated 0-simplex $\langle v \rangle$.

2. Every sub-simplex of a simplex in $S$ is also a simplex of $S$.

3. Every simplex $s \in S$ is a sub-simplex of some $d$-simplex in $S$. That is, there are no $k$-simplices, with $k < d$, other than those that are faces of a $d$-simplex in $S$.

A closed surface is a 2-complex in which the link of every 0-simplex is a simple, closed polygon having that 0-simplex as an interior point.

The signature (interface) of the simplicial complex abstract type is given in Figure 3. This abstraction relies on an abstract type of vertices, whose signature is given in Figure 1, and an abstract type of simplices, whose signature is given in Figure 2. Taken together, these signatures summarize the entire suite of operations available to applications that build and manipulate complexes. To fix ideas we summarize the operations provided by these abstractions.

The signature VERTEX specifies that vertices admit a total ordering, which is required for efficiently associating data with vertices. In particular we associate a point with each vertex; this is used to embed a simplex in space, as described earlier. The embedding is established by the new operation, which creates a “new” vertex at the specified point. The location of a vertex in space is obtained using the loc operation, which yields the point in space associated with the vertex. The type of points is left completely unspecified since the simplicial complex package need not be concerned with its exact representation.

The signature SIMPLEX defines the abstract type of (ordered) simplices over a given type of vertices. As with vertices, we require that simplices be totally ordered by some unspecified order relation so that simplices may be used as keys in a dictionary. The operation dim yields the dimension of a simplex. Since the order of vertices in a simplex is significant, we distinguish one vertex as the apex of the simplex, with the others following in order; this is the first vertex in the enumeration of vertices of the simplex. The vertices operation yields the sequence of vertices of a simplex in order, apex first.⁴ A $d$-simplex is created by applying the simplex operation to a sequence of $d+1$ vertices; the first vertex in the sequence is the apex. The orders operation yields a sequence of $d+1$ orderings of the simplex with the same orientation, one ordering for each choice of apex. The faces operation yields a sequence of $(d-1)$-dimensional sub-simplices of a given $d$-simplex. The flip operation inverts the orientation of a simplex (flips to its reverse side). The down operation passes from a $d$-simplex to its apex and the “opposing” $(d-1)$-simplex of that apex. (In the case of a triangle, this is the base opposite to a specified vertex of the triangle.) The join operation builds a $d$-simplex from a given vertex and $(d-1)$-simplex, taking the vertex as apex and the $(d-1)$-simplex as its opposite face.

⁴ We make use of an abstract type of sequences, a form of immutable array whose primitive operations are designed to support implicit parallelism [5].

    signature SIMPLEX =
    sig
      structure Vertex : VERTEX

      type simplex
      val compare : simplex * simplex -> order

      val dim : simplex -> int
      val vertices : simplex -> Vertex.vertex seq
      val simplex : Vertex.vertex seq -> simplex

      val down : simplex -> Vertex.vertex * simplex
      val join : Vertex.vertex * simplex -> simplex
      val orders : simplex -> simplex seq
      val faces : simplex -> simplex seq
      val flip : simplex -> simplex
    end

Figure 2: Signature of Simplices
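To make the apex-first convention concrete, here is a sketch of four of the simplex operations specialised to triangles. It is an illustration in Haskell, not the SML implementation behind the SIMPLEX signature; the tuple representation and the name `flipTriangle` are assumptions (the other names mirror the signature).

```haskell
type Vertex   = Char
type Edge     = (Vertex, Vertex)
type Triangle = (Vertex, Vertex, Vertex)   -- apex first

-- Split a triangle into its apex and the opposing edge.
down :: Triangle -> (Vertex, Edge)
down (a, b, c) = (a, (b, c))

-- Rebuild a triangle from an apex and its opposite edge.
join :: Vertex -> Edge -> Triangle
join a (b, c) = (a, b, c)

-- The three orderings with the same orientation, one per choice of apex.
-- For a triangle these are exactly the cyclic rotations.
orders :: Triangle -> [Triangle]
orders (a, b, c) = [(a, b, c), (b, c, a), (c, a, b)]

-- Reverse the orientation while keeping the apex (one swap, hence odd).
flipTriangle :: Triangle -> Triangle
flipTriangle (a, b, c) = (a, c, b)
```

Note that `uncurry join . down` is the identity on triangles, and applying `flipTriangle` twice returns the original ordering. The rotation trick in `orders` is specific to triangles: in other dimensions a cyclic rotation need not be an even permutation.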

The signature SIMPCOMP specifies the abstract type of simplicial complexes. There are no mutation operations on complexes. Instead we supply operations to create new complexes from old, as discussed in the introduction. The type 'a complex of $d$-dimensional simplicial complexes is parameterized by a type 'a of data values associated with the $d$-simplices of the complex. The dimension of the complex is a fixed property of the abstract type; different instances of the abstraction may have different dimension. The empty complex is the value empty; the operation isempty tests for it. The sequence of vertices of a complex is returned by the vertices operation, in an arbitrary order. The simplices of a given dimension (at most dim) are returned by the simplices operation. The grep operation finds all the simplices of maximal dimension having a given simplex as a face. More precisely, given a dimension $k \le$ dim and a $k$-simplex $s$, grep returns the sequence (in unspecified order) of simplices of dimension dim having $s$ as a face. The find operation is a specialization of grep for dimension dim $- 1$. The operation add adds a simplex to a complex, with specified data value; to ensure that condition 3 in the definition of simplicial complexes is preserved, we may only add a $d$-simplex to a $d$-complex. The operation rem removes a simplex from a complex, yielding the reduced complex. The update operation applies a specified function to the data values of every simplex in the complex, yielding a new complex.

In our implementation, a $d$-simplex is represented by a sequence of vertices of length $d+1$, with the apex being the lead vertex of the sequence. The down operation strips off the apex and returns the remaining $(d-1)$-simplex, as described above. Simplices are compared by comparison of sequences so that different orderings determine different simplices. We implement complexes using the Map signature taken from the SML/NJ library. A $d$-complex with $d > 1$ is represented by a mapping from vertices to the set of $(d-1)$-complexes incident on it. (In the case $d = 2$, each vertex has associated with it the edges, together with their vertices, incident on that vertex.) A 1-complex is implemented specially to avoid the overhead of maintaining the map.

We may build a $d$-complex by a sequence of $d-1$ applications of a “bootstrapping functor” that builds a $d$-complex from a $(d-1)$-complex, starting with the direct implementation of the 1-complex. However, for reasons of efficiency, we choose to implement the 2-complexes directly, rather than by bootstrapping. In this optimized implementation we use the first vertex of a simplex as a key into a red-black tree [3]. Each node of the red-black tree then stores as its value an association list that maps the second vertex to the third vertex and the data. Using an association list is adequate in practice since the number of entries is small (the average number is 6). To make the implementation optimal in theory one could convert to a balanced tree if the list becomes too long.

    signature SIMPCOMP =
    sig
      structure Simplex : SIMPLEX

      type 'a complex
      val dim : int

      val empty : 'a complex
      val isempty : 'a complex -> bool

      val vertices : 'a complex -> Simplex.Vertex.vertex seq
      val simplices : 'a complex -> int -> Simplex.simplex seq
      val data : 'a complex * Simplex.simplex -> 'a option

      val grep : 'a complex -> int * Simplex.simplex -> Simplex.simplex seq
      val find : 'a complex * Simplex.simplex -> Simplex.simplex option

      val add : 'a complex * Simplex.simplex * 'a -> 'a complex
      val rem : 'a complex * Simplex.simplex -> 'a complex
      val update : 'a complex * Simplex.simplex * ('a -> 'a) -> 'a complex
    end

Figure 3: Signature of Simplicial Complexes

In our direct implementation searching for a simplex involves searching the red-black tree and then the association list. Adding a simplex involves searching the red-black tree to see if the vertex is already there. If it is, the simplex is added to the existing association list, otherwise a new association list is created. We note that when a simplex is added, it needs to be added to the tree in all three orders that have the same orientation. Deleting a simplex involves searching the tree and deleting the simplex from the corresponding association list. If the association list becomes empty, then the tree node is also deleted. As with adding, the deletion needs to be executed in all three orderings.
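The direct representation of a 2-complex described above can be sketched as follows. This is an illustration under stated assumptions, not the paper's SML code: it uses Haskell's `Data.Map` (a balanced tree, standing in for the red-black tree), integer vertices, and hypothetical names `Complex`, `emptyComplex`, `rotations`, `addTriangle`, `findTriangle` and `removeTriangle`. It also assumes each directed edge belongs to at most one triangle.

```haskell
import qualified Data.Map as Map

type Vertex   = Int
type Triangle = (Vertex, Vertex, Vertex)

-- First vertex -> association list from second vertex to (third vertex, data).
type Complex a = Map.Map Vertex [(Vertex, (Vertex, a))]

emptyComplex :: Complex a
emptyComplex = Map.empty

-- The three orderings of a triangle with the same orientation.
rotations :: Triangle -> [Triangle]
rotations (a, b, c) = [(a, b, c), (b, c, a), (c, a, b)]

-- Add a triangle under all three same-orientation orderings.
addTriangle :: Triangle -> a -> Complex a -> Complex a
addTriangle t x cx = foldr insertOne cx (rotations t)
  where
    insertOne (a, b, c) = Map.insertWith (++) a [(b, (c, x))]

-- Find the triangle (and its data) whose first two vertices are the given edge.
findTriangle :: (Vertex, Vertex) -> Complex a -> Maybe (Triangle, a)
findTriangle (a, b) cx = do
  assoc  <- Map.lookup a cx
  (c, x) <- lookup b assoc
  return ((a, b, c), x)

-- Remove a triangle under all three orderings, dropping emptied tree nodes.
removeTriangle :: Triangle -> Complex a -> Complex a
removeTriangle t cx = foldr deleteOne cx (rotations t)
  where
    deleteOne (a, b, _) = Map.update (keep . filter ((/= b) . fst)) a
    keep []      = Nothing
    keep entries = Just entries
```

Because every operation returns a new map, a complex obtained before a call to `removeTriangle` can still be queried afterwards, which is the persistence property the representation is designed for.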

3 Convex Hull: The Bulldozer Algorithm

It is well known that the problem of constructing the convex hull of a set of points in three dimensions requires $\Omega(n \log n)$ time [4]. Asymptotically optimal algorithms for the problem are also known [8, 7] for the ephemeral case. In this section we will give a randomized optimal algorithm for the persistent case.

We will be concerned with incremental methods that extend the convex hull of a set of points to include a new point. Many algorithms, including our own, are based on tent construction. Given a point $p$ exterior to the hull of a set of points, we may extend the hull to include this point as follows. We view the exterior point as a light source illuminating a subset of the faces of the hull. The boundary of the lit faces is a set of edges, which we call the horizon. We then construct a pyramidal tent whose apex is the exterior point and whose base is the horizon, removing the lit faces. This construction extends the convex hull to include the given point as a new vertex.

Several incremental algorithms based on the tent construction are known; they differ in how the exterior point is chosen, and how the set of exterior points is maintained during the construction. Our algorithm maintains, for each exterior point, one face that is visible to that point. In particular, the algorithm begins by selecting a point that will always be interior to the hull; we call this the center point. Consider the ray from the center point to each exterior point. Each such ray penetrates a face, to which we associate the point. A face is visible to each of its associated points. The set of faces that are visible to any one point is connected, so knowing one visible face allows one to walk through the visible faces to find them all. Clarkson and Shor's algorithm [8], the Minnesota Quickhull algorithm [2], the algorithm presented by Motwani and Raghavan [19], our basic algorithm, and the Bulldozer algorithm [6] all maintain such information. All the other algorithms, however, either do not care which visible face the point is associated with or keep a complete list of visible faces for each point.

Our algorithm requires a representation of the hull, for which we use a simplicial complex, and some method for associating points with faces, for which we use the data associated with each simplex in the complex.

Each incremental step of our algorithm selects a random face that has points associated with it. A random point associated with this face is designated as the light source. The algorithm then finds all the other faces visible to the light source by searching adjacent triangles on the surface starting with the selected face. The algorithm defines a directed acyclic graph whose nodes are the visible faces and the horizon edges, and whose arcs connect adjacent faces or a face with one of its horizon edges. The selected face has in-degree zero (i.e., it is the root), and the horizon edges have out-degree zero. The faces are visited in a topological ordering of the graph. When visiting a face, every point assigned to the face is either discarded, because it is interior to the hull, or pushed out along an out-going arc (hence the name “bulldozer” algorithm). This requires at most two plane-side tests per point. When the search is complete each point associated with any of the visible faces has either been discarded or associated with a horizon edge. One additional test can determine whether a point is interior to the hull or visible to the face formed by this edge and the light source.
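The search for lit faces and the tent construction can be illustrated with a small sketch. It is written in Python rather than ML and covers only the geometric core: a plane-side test, a walk over adjacent faces starting from one known visible face, and the replacement of the lit faces by a tent over the horizon. It does not perform the bulldozing of associated points, it rebuilds the edge map on every call instead of using a persistent complex, and the names `plane_side` and `extend_hull` are hypothetical. Faces are assumed to be vertex-index triples listed counter-clockwise as seen from outside the hull.

```python
def plane_side(a, b, c, d):
    """Signed volume of (a, b, c, d): positive iff d lies on the outer side of face (a, b, c)."""
    u = [b[k] - a[k] for k in range(3)]
    v = [c[k] - a[k] for k in range(3)]
    w = [d[k] - a[k] for k in range(3)]
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))


def extend_hull(points, faces, start, light):
    """Add vertex `light` to the hull, given one face `start` that is visible from it."""
    # Each directed edge (u, v) belongs to exactly one face; its neighbour owns (v, u).
    edge_face = {}
    for f in faces:
        a, b, c = f
        for e in ((a, b), (b, c), (c, a)):
            edge_face[e] = f

    def visible(f):
        a, b, c = f
        return plane_side(points[a], points[b], points[c], points[light]) > 0

    assert visible(start)
    lit, horizon, stack = {start}, [], [start]
    while stack:
        a, b, c = stack.pop()
        for u, v in ((a, b), (b, c), (c, a)):
            neighbour = edge_face[(v, u)]
            if neighbour in lit:
                continue
            if visible(neighbour):
                lit.add(neighbour)
                stack.append(neighbour)
            else:
                horizon.append((u, v))  # edge between a lit and an unlit face
    tent = {(u, v, light) for u, v in horizon}
    return (set(faces) - lit) | tent, horizon
```

For a tetrahedron with one exterior point above its slanted face, the sketch removes that single lit face and adds three tent faces over the three horizon edges.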

This algorithm visits each visible face once. It is possible to show, by backwards analysis and Euler's formula, that the expected number of faces visited is linear in the number of input points when summed across all steps. The cost of the algorithm can be separated into plane-side tests and the cost of the graph traversal and surface manipulation (i.e., finding adjacent faces). Each visit requires at most a constant number of graph and surface operations, each taking $O(\log n)$ time, hence the total expected cost of graph traversal and surface manipulation is $O(n \log n)$. It is also possible to show that the expected number of plane-side tests is $O(n \log n)$ [6]. Thus the total expected cost is $O(n \log n)$.

4 Convex Hull: Experimental Evaluation

Although our theory shows that using a purely persistent dictionary for storing a simplicial complex is asymptotically optimal, we are interested in the actual overhead. In particular we were worried that the constant factors could make the idea impractical. For this reason we ran several experiments to study the overhead. These experiments involved measurements on the bulldozer 3d hull algorithm, and on a terrain triangulation algorithm, described in the next section. The goal in the experiments is to compare the work needed to maintain the simplicial complex to the other work in the algorithm. This other work mostly consists of the numerical aspects and is dominated by floating-point operations.

In our experiments we used the following 5 distributions of points in 3d:

1. OnSphere: Random uniformly distributed points on the unit 2-sphere (i.e., the surface of the unit ball in 3d).

2. EqHeavy: Random points on the sphere that are weighted to be mostly on the equator. These are generated by producing random points on the sphere, stretching the equator (x and y coordinates) by a factor of 100 so that the distribution is on a disk-like surface, and then projecting the points back down onto a sphere by scaling their length to one.

Figure 4: Operation counts as a function of input size for two of the distributions using the Bulldozer algorithm. The two sets of counts for OnSphere are almost identical. [Plot: number of operations (0 to 5e+08) against problem size (128K, 256K, 512K points); series: key-comparisons and floating-point ops for OnSphere and for BordHeavy.]

3. PolHeavy: Random points on the sphere that are weighted to be mostly at the poles. These are generated by producing random points on the sphere, stretching the poles (z coordinate) by a factor of 100 so that the distribution is on a stretched ellipsoid surface, and then scaling the points back down onto a sphere as in the EqHeavy distribution.

4. InBall: Random uniformly distributed points in the unit ball.

5. BordHeavy: Generated by producing points randomly in a unit ball and then mapping each point $(x, y, z)$ to the point $(x, y, x^2 + y^2 + z^2)$. It can be shown that using this distribution for $n$ points, the size of the convex hull is expected to be $O(n^{2/3})$.

We selected these since we wanted data sets both where all the points are in the final result (the expensive case) and where some are inside. We also wanted nonuniform distributions, which are what EqHeavy, PolHeavy, and BordHeavy give us.
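The first four distributions can be generated along the following lines. This Python sketch assumes Gaussian normalization for uniform points on the sphere and rejection sampling for the ball; the text does not say how the points were actually drawn, and the function names are hypothetical. BordHeavy is left out.

```python
import math
import random


def normalize(p):
    """Scale a non-zero vector to unit length."""
    r = math.sqrt(sum(c * c for c in p))
    return tuple(c / r for c in p)


def on_sphere(rng):
    """OnSphere: a normalized Gaussian vector is uniform on the unit sphere."""
    while True:
        p = tuple(rng.gauss(0.0, 1.0) for _ in range(3))
        if any(p):
            return normalize(p)


def eq_heavy(rng, factor=100.0):
    """EqHeavy: stretch x and y, then project back onto the sphere."""
    x, y, z = on_sphere(rng)
    return normalize((factor * x, factor * y, z))


def pol_heavy(rng, factor=100.0):
    """PolHeavy: stretch z, then project back onto the sphere."""
    x, y, z = on_sphere(rng)
    return normalize((x, y, factor * z))


def in_ball(rng):
    """InBall: rejection sampling from the enclosing cube."""
    while True:
        p = tuple(rng.uniform(-1.0, 1.0) for _ in range(3))
        if sum(c * c for c in p) <= 1.0:
            return p
```

Passing an explicit `random.Random(seed)` as `rng` keeps the generated data sets reproducible between runs.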

To get a machine- and language-independent measurement of the costs we first measured various operation counts. For the manipulation of the simplicial complex (the topological part of the algorithm) we count both the number of dictionary operations and the total number of key-comparisons made by the dictionary code. For the numerical (geometric) part of the algorithm we count the number of plane-side tests, from which we can easily determine the number of floating-point operations.

As mentioned in Section 2, the simplicial complex is implemented using red-black trees with vertex identifiers used as keys. For a tree of size $n$ each insertion, deletion or search will traverse $O(\log n)$ nodes. At each node, the key being searched (an integer identifier for the vertex) is compared to the key at the node. In addition to the key-comparisons made in the red-black tree, which are based on the first vertex of the simplex being searched, key-comparisons are also required when searching for the second vertex of the simplex in the association list of the node that is found (see Section 2). Our key-comparison counts include these association-list comparisons. The key-comparison count is therefore a measure of the total number of red-black-tree nodes visited, plus the total number of association-list elements visited. Our theory states that the expected total number of dictionary operations is $O(n)$ and since the red-black tree operations visit $O(\log n)$ nodes, the total number of expected key-comparisons is $O(n \log n)$.

We measured the number of key-comparisons and floating-point operations for all the distributions and for a range of input sizes up to 512K points. A graph showing the operation counts as a function of size is given in Figure 4 for two of the distributions. A bar graph showing the operation counts for the five distributions on 512K points is given in Figure 5. The graphs show that the number of key-comparisons is approximately the same as the number of floating-point operations for the first three distributions in which all the points are on the sphere. For the other two distributions in which some points are inside the ball, the number of key-comparisons is very much less than the number of floating-point operations (by a factor of 30 for the InBall distribution and a factor of 10 for the BordHeavy distribution). This is to be expected since the resulting hull is significantly smaller than the size of the input, and the simplicial-complex operations are only used on the simplexes that are actually created, while plane-side tests are required on all the input points.

Figure 5: Operation counts for all five data distributions using the Bulldozer algorithm. The input size is 512K points. [Bar chart: dictionary comparisons and floating-point ops (0 to 8e+08) for OnSphere, EqHeavy, PolHeavy, InSphere, BordHeavy.]

We were also interested in the actual running time of the simplicial complex code since one might imagine that traversing a node of a tree is more expensive than a floating-point operation. To be fair on this measure we wanted to compare times to a well tuned existing implementation of 3D Convex Hull. We therefore selected the Minnesota Quickhull code [2]. Since our code is written in ML and the Minnesota code is written in C, we could not compare the times directly. We also did not want to completely rewrite our code in C, or the Minnesota code in ML. Instead we instrumented our code to dump out traces of all the operations on the simplicial complex. We then wrote C code that simulates the complex operations using balanced trees and linked lists. The idea is to get a sense of how much time, relative to the Quickhull code, the persistent implementation of the simplicial complex requires. The results are shown in Figure 6. As can be seen, the cost of the simplicial-complex operations is at most half the total cost of the Minnesota code, and this is for a distribution where the number of operations on the complex is high. Since some of the cost of the Minnesota code is dedicated to manipulating its representation of the simplicial complex (it would be hard actually to separate this out) it is reasonably safe to conclude that using a persistent dictionary in their code for manipulating the surface would incur less than a 50% overhead, and for many distributions very much less.

5 Terrain Modeling: Experimental Evaluation

One interesting real-world application of the convex hull algorithm is to terrain modeling [13]. Terrain data is important to many real-world applications, such as flight simulators. However, rendering a terrain at full resolution is impractical for terrains of any significant size. Therefore, applications that rely on terrain data require terrain models that approximate full terrains using substantially fewer polygons.

Given a two-dimensional array of evenly spaced height samples from the full terrain, a terrain modeling procedure computes a triangulation of the terrain that minimizes the error between the actual sample values and the values given by the triangulation. Moreover, the triangulation so determined, when projected onto the plane, is required to have the Delaunay property^5 [4], as such triangulations have several desirable properties. However, since it is prohibitively expensive to compute a triangulation that is actually optimal, heuristics are typically employed that perform well in practice.

Figure 6: Running time as a function of the input size for both Minnesota Quickhull and for the C implementation of the dictionary operations. The distribution used is OnSphere. [Plot: running time (0 to 6 seconds) against problem size (16K, 32K, 64K points); series: Minnesota Quickhull time and additional time for simplicial complex ops.]

One such heuristic is the greedy insertion heuristic. The greedy insertion heuristic starts by dividing the plane into two triangles, and initializes a priority queue with one point from each triangle, the point having the greatest error between the sample value and the value given by the triangle. The heuristic then builds the triangulation incrementally, at each step obtaining the sample point with maximum error from the priority queue and updating the Delaunay triangulation to include that point.^6 The priority queue is then updated to include the points of maximum error for each new triangle. Typically only a few triangles are created in each step, resulting in only moderate rescanning of the terrain samples. This process is then repeated until an acceptable maximum error is achieved.
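The control structure of greedy insertion (a priority queue of worst-error points, refilled only for newly created pieces) can be seen in a one-dimensional analogue. The Python sketch below approximates a height profile by line segments instead of a terrain by Delaunay triangles, so it is a deliberately simplified stand-in and not the heuristic as implemented in the paper; `greedy_insertion_1d` and its parameters are hypothetical names.

```python
import heapq


def greedy_insertion_1d(heights, max_error):
    """Pick sample indices so that linear interpolation between them is within max_error."""
    def worst(lo, hi):
        # Sample strictly between lo and hi with the largest interpolation error.
        best_err, best_i = 0.0, None
        for i in range(lo + 1, hi):
            t = (i - lo) / (hi - lo)
            approx = heights[lo] + t * (heights[hi] - heights[lo])
            err = abs(heights[i] - approx)
            if err > best_err:
                best_err, best_i = err, i
        return best_err, best_i

    heap = []

    def push(lo, hi):
        err, i = worst(lo, hi)
        if i is not None:
            heapq.heappush(heap, (-err, i, lo, hi))  # max-heap via negated error

    kept = [0, len(heights) - 1]
    push(0, len(heights) - 1)
    while heap and -heap[0][0] > max_error:
        _, i, lo, hi = heapq.heappop(heap)
        kept.append(i)
        # Only the two new segments need to be rescanned.
        push(lo, i)
        push(i, hi)
    return sorted(kept)
```

In the triangulated setting an insertion can also invalidate neighbouring triangles, so queue entries for destroyed triangles would have to be discarded; in one dimension the segments are independent and no such bookkeeping is needed.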

We implemented this heuristic using our persistent triangulation package. Delaunay triangulations can be computed using a three-dimensional convex hull procedure by projecting the points from the plane onto a paraboloid (the surface specified by the equation $z = x^2 + y^2$) and computing the convex hull of the projected points [4], so the implementation was straightforward. To measure its performance, we ran it on two sets of terrain sample data, one from the vicinity of Ozark, Missouri, and the other from the west end of Crater Lake, Oregon. A 1000-point triangulation of each of these data sets is given in Figures 7 and 8. As in the previous section, we counted key-comparisons and floating-point operations for each run. The results appear in Figure 9 and show that the number of key comparisons is significantly smaller than the number of floating-point operations, especially for the smaller sizes.
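The connection between the paraboloid and the Delaunay property can be checked directly: a point lies inside the circumcircle of a triangle exactly when its lifted image lies below the plane through the lifted triangle. The following Python sketch demonstrates this with a plane-side test on lifted points; the function names are hypothetical, and the triangle is assumed to be given in counter-clockwise order.

```python
def lift(p):
    """Project a planar point onto the paraboloid z = x^2 + y^2."""
    x, y = p
    return (x, y, x * x + y * y)


def plane_side(a, b, c, d):
    """Signed volume of (a, b, c, d); its sign says on which side of plane abc the point d lies."""
    u = [b[k] - a[k] for k in range(3)]
    v = [c[k] - a[k] for k in range(3)]
    w = [d[k] - a[k] for k in range(3)]
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))


def in_circumcircle(a, b, c, d):
    """True iff d is strictly inside the circumcircle of the counter-clockwise triangle abc."""
    # For a counter-clockwise triangle the lifted plane's normal points upward,
    # so a negative value means the lifted d is below the plane.
    return plane_side(lift(a), lift(b), lift(c), lift(d)) < 0
```

The lower faces of the hull of the lifted points are therefore exactly the triangles whose circumcircles are empty, which is why a 3d hull procedure yields the Delaunay triangulation.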

^5 The Delaunay property specifies that no point lies within the circumcircle of any triangle of which it is not a vertex, except in certain degenerate circumstances.

^6 An alternative greedy heuristic, designed to avoid narrow triangles, is to add the circumcenter of the triangle containing the point of maximum error, rather than the point of maximum error itself.

Figure 7: 1000-point triangulation of Ozark

6 Conclusion

Purely functional, persistent data structures offer a number of programming advantages over their more familiar, ephemeral counterparts. Many algorithms in computational geometry are based on low-level graph structures. The analysis of these algorithms is based on a unit-cost assumption for the fundamental operations on a graph. A naive translation of these algorithms using a persistent tree structure to represent the graph would introduce an $O(\log n)$ factor into the asymptotic complexity, resulting in asymptotically sub-optimal performance in the persistent case.

This paper addresses the question of whether this logarithmic penalty is avoidable in specific cases. We consider the fundamental problem of constructing the convex hull of a set of points in three dimensions. We give a high-level description of the hull as a simplicial complex, and provide a persistent implementation of it. We also present a new algorithm, called the bulldozer algorithm, that achieves the asymptotically optimal $O(n \log n)$ time bound (in a randomized sense) and works with this abstract representation of the hull. To assess the practicality of the algorithm, we implemented it in Standard ML and measured its performance on a variety of artificial data sets. We also used this algorithm to build a terrain modelling application derived from work of Garland and Heckbert [13]. Our results confirm that the persistent representation of the hull, together with our new algorithm for constructing it, is both theoretically and practically efficient.

Acknowledgements

We thank Chris Okasaki for his red-black tree library, which we used in our implementation of simplices.

References

[1] Paul Alexandroff. Elementary Concepts of Topology. Dover Publications, Inc, New York, 1961.

[2] C. B. Barber, D. P. Dobkin, and H. T. Huhdanpaa. The quickhull algorithm for convex hulls. ACM Trans. on Mathematical Software, December 1996.

Figure 8: 1000-point triangulation of Crater Lake

[3] R. Bayer. Symmetric binary B-trees: Data structure and maintenance algorithms. Acta Informatica, 1:290–306, 1972.

[4] M. de Berg, M. van Kreveld, M. Overmars, and O. Schwarzkopf. Computational Geometry: Algorithms and Applications. Springer-Verlag, Berlin, 1997.

[5] Guy E. Blelloch. Programming parallel algorithms. Communications of the ACM, 39(3):85–97, March 1996.

[6] Hal Burch and Gary Miller. Computing the convex hull in a functional language. In Preparation, September 1999.

[7] Bernard Chazelle. An optimal convex hull algorithm and new results on cuttings. In FOCS, pages 29–38, 1991.

[8] K. L. Clarkson and P. W. Shor. Applications of random sampling in computational geometry. Discrete and Computational Geometry, 4:387–421, 1989.

[9] Paul F. Dietz. Fully persistent arrays. In Workshop on Algorithms and Data Structures, volume 382 of Lecture Notes in Computer Science, pages 67–74. Springer-Verlag, August 1989.

[10] James R. Driscoll, Neil Sarnak, Daniel D. Sleator, and Robert E. Tarjan. Making data structures persistent. Journal of Computer and System Sciences, 38(1):86–124, February 1989.

[11] James R. Driscoll, Daniel D. Sleator, and Robert E. Tarjan. Fully persistent lists with catenation. Journal of the ACM, 41(5):943–959, 1994.

[12] Martin Erwig. Functional programming with graphs. In Proc. ACM Sigplan International Conference on Functional Programming, pages 52–55, June 1997.

Figure 9: Operation counts as a function of input size. [Plot: number of operations (0 to 1.6e+07) against problem size (1600, 3200, 6400, 12800 points); series: key-comparisons and floating-point ops for Ozark and for Crater Lake.]

[13] Michael Garland and Paul Heckbert. Fast polygonal approximation of terrains and height fields. Technical Report CMU-CS-95-181, CS Dept, Carnegie Mellon U., September 1995.

[14] P. J. Giblin. Graphs, Surfaces, and Homology. Chapman and Hall, London, 1977.

[15] C. M. Gold and P. R. Remmele. Voronoi methods in GIS. In Algorithmic foundations of geographic information systems, pages 21–35. Springer-Verlag, 1997.

[16] L. Guibas and J. Stolfi. Primitives for the manipulation of general subdivisions and the computation of Voronoi diagrams. ACM Transactions on Graphics, 4(2):74–123, 1985.

[17] N. Gupta and S. Sen. An improved output-size sensitive parallel algorithm for hidden-surface removal for terrains. In Proceedings of the First Merged International Parallel Processing Symposium and Symposium on Parallel and Distributed Processing, pages 215–219, March 1998.

[18] Haim Kaplan and Robert E. Tarjan. Persistent lists with catenation via recursive slow-down. In ACM Symposium on Theory of Computing, pages 93–102, May 1995.

[19] Rajeev Motwani and P. Raghavan. Randomized Algorithms. Cambridge University Press, 1995.

[20] Chris Okasaki. Amortization, lazy evaluation, and persistence: Lists with catenation via lazy linking. In Proc. IEEE Symposium on Foundations of Computer Science, pages 646–654, October 1995.

[21] Neil Sarnak and Robert Endre Tarjan. Planar point location using persistent search trees. CACM, 29(7):669–679, 1986.


An Algebraic Dynamic Programming Approach

to the Analysis of Recombinant DNA Sequences

Robert Giegerich* Stefan Kurtz† Georg F. Weiller‡

1 Introduction

1.1 From Biosequences to Structure to Function

Dynamic programming (DP, for short) is a fundamental programming technique, applicable to great advantage whenever the input to a problem spawns an exponential search space in a structurally recursive fashion, and solutions to subproblems adhere to an optimality principle. No wonder that DP is the predominant paradigm in computational (molecular) biology. Sequence data (DNA, RNA, and proteins) are determined on an industrial scale today. The desire to give a meaning to these molecular data gives rise to an ever increasing number of sequence analysis tasks. Given the mass of these data and the length of these sequences (on the order of $10^6$ bases for a bacterial genome, $10^9$ for the human genome), program efficiency is crucial. DP is used for assembling DNA sequence data from the fragments that are delivered by the automated sequencing machines [1], and to determine the intron/exon structure of genes [3]. It is used to infer function of proteins by homology to other proteins with known function [10, 11], and to determine the secondary structure of functional RNA genes or regulatory elements [15]. In some areas, DP problems arise in such variety that a specific code generation system for implementing the typical DP recurrences has been developed [2]. This system, however, does not support the development or validation of these recurrences.

1.2 Outline of Algebraic Dynamic Programming

The systematic development of DP solutions for problems in computational biology has been recently addressed by Giegerich [4]. There, an algebraic approach to dynamic programming (ADP) was developed and applied to the problem of folding an RNA sequence into its secondary structure. Here we will adapt ADP to the problem of comparing two sequences in the edit distance model. ADP is based on the following principles:

1. The analysis problem at hand is conceptually split into a structure recognition and a structure evaluation phase. Recognized structures are represented by an algebraic data type $T$. Evaluation is specified in terms of a particular $T$-algebra.

2. A subset of well-formed structures in $T$ is distinguished by a tree grammar. We require that
   • structure recognition finds all and only all well-formed structures,
   • structure recognition constructs each such structure exactly once,
   • structure evaluation is performed only on well-formed structures.

3. By providing parsers for the terminal symbols and parser combinators for the alternative, applicative, and sequential operators of the tree grammar, the grammar turns into a recognizer for its language.

4. A recursive recognizer is turned into a DP algorithm by tabulation: Each (recursive) parser is substituted by a (recursively defined) table of results. This is achieved by an efficiency annotation that does not change the declarative meaning of the grammar.

5. An abstract evaluator is a recognizer written in terms of an abstract $T$-algebra, applying an abstract choice function to each intermediate result. Instantiated with a concrete $T$-algebra, it interleaves structure recognition and evaluation. The concrete evaluator so obtained runs in polynomial time and space, if the concrete evaluation algebra has a constant time and space bound with respect to each intermediate result.^1

6. DP recurrences, suitable for implementation in any imperative language, can be derived from the specification by straightforward substitution and program simplification.

* Technische Fakultät, Universität Bielefeld, Postfach 100131, 33501 Bielefeld, Germany, E-mail: [email protected], partially supported by a grant from the Australian National University

† Technische Fakultät, Universität Bielefeld, Postfach 100131, 33501 Bielefeld, Germany, E-mail: [email protected], partially supported by DFG-grant Ku 1257/1-1

‡ Bioinformatics Laboratory, Research School of Biological Science, Australian National University, Canberra, ACT 0200, Australia, E-mail: [email protected]
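The effect of tabulation (Principle 4) together with a choice function (Principle 5) can be seen on the classical unit-cost edit distance. The Python sketch below is only an illustration under simplifying assumptions: memoization via `functools.lru_cache` stands in for the lazy arrays used in the Haskell setting, `min` plays the role of the choice function, and unit costs replace the scoring functions discussed later.

```python
from functools import lru_cache


def edit_distance(x, y):
    """Unit-cost edit distance between x and y, computed over pairs of suffixes."""
    @lru_cache(maxsize=None)  # the table: one entry per pair of suffix positions (i, j)
    def d(i, j):
        if i == len(x):
            return len(y) - j  # only insertions remain
        if j == len(y):
            return len(x) - i  # only deletions remain
        return min(
            d(i + 1, j + 1) + (x[i] != y[j]),  # replacement or match
            d(i + 1, j) + 1,                   # deletion from x
            d(i, j + 1) + 1,                   # insertion into y
        )

    return d(0, 0)
```

Without the cache the recursion explores an exponential number of alignments; with it, each of the (len(x)+1)·(len(y)+1) subproblems is solved once, and because `min` keeps a single value per entry the intermediate results stay of constant size.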

1.3 Why Functional Programming Matters

ADP is a program development method, and the resulting program can (and normally will) eventually be implemented in an imperative language. A functional language like Haskell, however, makes the approach much more practical, and even enjoyable. The ADP approach can be completely embedded in Haskell, allowing us to experiment with executable programs at all stages of development. A wide range of lazy functional programming techniques is used, the most essential being parser combinators [9], programming with unknowns, and lazy (though immutable) arrays.

The productivity of the approach results from the modularity (cf. [8]) we achieve by separating structure recognition from structure evaluation. This advantage only exists in the functional paradigm; it is sacrificed in the final step (see Section 1.2, Principle 6).

Although ADP is a program development method, and not an equivalence transformation on programs, it bears some resemblance to deforestation [13], particularly in the form of [5]. The essential speed-up from exponential to polynomial time complexity, however, is not achieved by deforestation, but by tabulation and the simultaneous introduction of a choice function that reduces the volume of the intermediate results.

2 Biosequence Comparison in the Edit Distance Model

2.1 Searching for the Signals of Recombination

Comparison of DNA or protein sequences is predominantly done in the edit distance model. Two or more sequences are rearranged by introducing gaps, in a way that best exhibits their (dis)similarities. The concrete way in which distance or similarity is measured is expressed by means of a scoring function for matches, mismatches, and gaps. The scoring function varies from application to application. Sequence similarity is taken as an indication of homology, and multiple alignments or pairwise distances so obtained are frequently fed into programs that try to reconstruct phylogenies, i.e., evolutionary relationships of genes or species.

^1 More precisely, all operations of the algebra may be allowed to have polynomial efficiency, but the choice function is critical and must have a constant bound on the size of its output.


DNA recombination is an important mechanism in molecular evolution. Genes that have evolved independently in different strains of a virus, for example, may recombine in a new strain. This adds the power of parallel processing to Darwinian evolution, which is otherwise based on trial and error (i.e., random mutation and selection). In the presence of recombinant DNA, practically all commonly used analysis programs go wrong. There is no longer a tree-like phylogeny, as different parts of a sequence stem from different ancestors. In such a case, the best we can hope for from a tree reconstruction program is to tell us that there is no clear support for either of several possible trees in the distance data.

But there is a difference between data which are just noisy, and data which carry a clear signal about recombination events. There are different ways to explicitly search for recombination signals. The PhylPro program [14] does so by monitoring patterns of change in the mutual similarities in a multiple sequence alignment. In this paper, we take a direct approach, applicable to pairwise sequence alignment.

Traditionally, insertions and deletions are seen as random events, independent of their sequence context. But this is not totally adequate: Insertions and deletions in DNA sequences often stem from recombination events. The molecular mechanisms of recombination may leave traces in the form of target site duplications of varying length. Similar repeats may be formed through replication slippage, the other cellular process responsible for indel formation. Current methods of sequence analysis ignore these signals.

2.2 Extending the Edit Distance Model

Let $x$ and $y$ be two DNA sequences of length $m$ and $n$, respectively. The classical edit distance model considers the following edit operations:

• $R\binom{a}{b}$ denotes the replacement of nucleotide $a$ in $x$ by $b$ in $y$. If $a = b$, this is called a match, otherwise a proper replacement.

• $I\binom{-}{u}$ denotes the insertion of a non-empty sequence $u$ of nucleotides into $y$, thereby introducing in $x$ a gap of the same length, i.e., a sequence of $|u|$ dashes.

• $D\binom{u}{-}$ denotes the deletion of a non-empty sequence $u$ of nucleotides from $x$, thereby introducing in $y$ a gap of the same length, i.e., a sequence of $|u|$ dashes.

As new edit operations, we introduce recombinant deletion and insertion. Let $t$ be a non-empty (but typically short) sequence of nucleotides that occurs both in $x$ and in $y$.

• $S\binom{t\;-\;-}{t\;u\;t}$ denotes a recombinant insertion in $y$: Following the target site $t$, present in both $x$ and $y$, a sequence $u$ of nucleotides is inserted into $y$, followed by a new copy of $t$ in $y$. In $x$, a gap of the combined length of $u$ and $t$ is introduced.

• $L\binom{t\;u\;t}{t\;-\;-}$ denotes a recombinant deletion from $x$: Following the target site $t$, present in both $x$ and $y$, a sequence $u$ of nucleotides is deleted from $x$. This requires a second copy of $t$ to follow $u$ in $x$. In $y$, a gap of the combined length of $u$ and $t$ is introduced.

In both cases, we allow the deleted or inserted sequence $u$ to be empty, which makes the target site and its duplication form a tandem repeat in $x$ or $y$.

Example 1 Here is an alignment of attcgaa and acgtatacgac:

$R\binom{a}{a}\; D\binom{tt}{--}\; S\binom{cg\,------}{cg\,tata\,cg}\; R\binom{a}{a}\; R\binom{a}{c}$

It shows three replacements, a short deletion, and a recombinant insertion.


Proceeding from this operational view of recombination events to the analytic view, we must define the sequence pattern that can be interpreted in retrospect as a signal left from a recombination.

For any sequence $z$ and any $i \in [0, |z|]$, $i \downarrow z$ denotes the suffix of $z$ after dropping $i$ symbols from the beginning of $z$. A target site duplication in $y$ is a pair $(i, j)$ such that $i \downarrow x = tz$ and $j \downarrow y = tutw$ for some $t, u, w, z$ such that $t$ is not empty. It is maximal if the first character of $u, w, z$ is not the same, whenever these strings are not empty. A target site duplication in $x$ is a pair $(i, j)$ such that $i \downarrow x = tutw$ and $j \downarrow y = tz$ for some $t, u, w, z$ such that $t$ is not empty. It is maximal if the first character of $u, w, z$ is not the same, whenever these strings are not empty.
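The definition of a target site duplication in $y$ can be turned into a small enumeration. The Python sketch below uses 0-based string slicing, so `x[i:]` is the suffix after dropping `i` symbols; it lists every pair `(k, r)` of target length and insert length that fits the pattern at a given position, and it leaves the selection of maximal duplications aside. The function name is hypothetical.

```python
def target_site_duplications_in_y(x, y, i, j):
    """All (k, r) such that x[i:] = t z and y[j:] = t u t w with |t| = k >= 1 and |u| = r."""
    found = []
    k = 1
    # t must be a common prefix of both suffixes, so stop at the first mismatch.
    while i + k <= len(x) and j + k <= len(y) and x[i:i + k] == y[j:j + k]:
        t = y[j:j + k]
        # Try every insert length r that leaves room for the second copy of t in y.
        for r in range(0, len(y) - j - 2 * k + 1):
            start = j + k + r
            if y[start:start + k] == t:
                found.append((k, r))
        k += 1
    return found
```

For the sequences of Example 1 at position (3, 1), the entry (2, 4) corresponds to the target site cg with the inserted sequence tata; the shorter candidates with target site c are also reported.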

Note that several target site duplications may be identified at the same position, differing in the length of the target sequence $t$. However, in the following we restrict to maximal duplications, since a longer target site duplication is to be taken as the stronger signal of a recombination event. We say that a recombinant deletion is signalled by a maximal target site duplication in $x$, and a recombinant insertion is signalled by a maximal target site duplication in $y$.

3 Computing Optimal Alignments in the Extended Edit Distance Model

3.1 An Algebraic Data Type for Extended Alignments

An alignment of $x$ and $y$ is traditionally represented by placing the aligned sequences on different lines, with inserted dashes to denote gaps. Successive dashes inside $x$ are interpreted as an insertion into $y$, and successive dashes inside $y$ as a deletion from $x$. The eye of the reader implicitly groups successive dashes into gaps of maximal length. A slightly more explicit view defines the alignment as a sequence of edit operations, with the additional restriction that a deletion (resp. insertion) must not immediately follow another deletion (resp. insertion).

With the new edit operations introduced here, we must resort to an even more explicit notation, marking target sites and their duplications. We also have to distinguish between gaps resulting from recombinations and gaps for which such an event is not indicated. We give up the view of a sequence of edit operations in favor of a recursive data type Alignment with a constructor for each edit operation.

type Sequence a = Array Int a   -- indexed from 1
type Region = (Int,Int)         -- region (i,j) of x denotes x_{i+1} ... x_j
data Alignment a = R a (Alignment a) a |
                   D Region (Alignment a) |
                   I (Alignment a) Region |
                   S Region (Alignment a) Region Region Region |
                   L Region Region Region (Alignment a) Region |
                   Empty

Within the datatype Alignment, a target site duplication $i \downarrow x = tz$, $j \downarrow y = tutw$, is represented by an expression of the form S t azw t u t, wherein azw denotes an alignment of the suffixes $z$ and $w$, and the three occurrences of t denote the target site in $x$ and (duplicated) in $y$. Subwords of $x$ and $y$ are represented by their boundaries. Hence each edit operation requires constant space. If $k = |t|$ and $r = |u|$, then the above expression is actually written as

S (i,i+k) azw (j+k+r,j+k+r+k) (j+k,j+k+r) (j,j+k) .

More space efficient representations are possible, since we only need to store $i$, $j$, $k$, and $r$.

Example 2 Given a datatype Base with constants A, C, G, and T, the expression

R A (D (1,3) (S (3,5) (R A (R A Empty C) A) (7,9) (3,7) (1,3))) A


denotes the alignment of attcgaa and acgtatacgac shown in Example 1. It may be printed in ASCII as

x = a t t c g - - - - - - a a
y = a - - c g t a t a c g a c
    R D D S S U U U U T T R R

The third line indicates the edit operation involved. The recombinant insertion is labeled in the form S U T to indicate the starting target site S, the insert U, and the duplicated target site T.
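To make the region-based representation concrete, the following sketch mirrors the constructors R, D, I, and S as Python dataclasses and renders a value into the three-line layout described above. It is a Python stand-in for the Haskell datatype, the constructor L is omitted, the field names and `render` are hypothetical, and a region `(i, j)` is read as the slice `s[i:j]`.

```python
from dataclasses import dataclass
from typing import Tuple

Region = Tuple[int, int]  # (i, j) stands for s[i:j], i.e. s_{i+1} ... s_j


@dataclass
class Empty:
    pass


@dataclass
class R:
    a: str          # base in x
    rest: object
    b: str          # base in y


@dataclass
class D:
    region: Region  # deleted subword of x
    rest: object


@dataclass
class I:
    rest: object
    region: Region  # inserted subword of y


@dataclass
class S:
    site_x: Region    # target site in x
    rest: object
    copy_y: Region    # duplicated target site in y
    insert_y: Region  # inserted subword of y
    site_y: Region    # target site in y


def render(al, x, y):
    """Return the x line, the y line, and the operation labels of an alignment."""
    top, bottom, ops = [], [], []

    def emit(xs, ys, label):
        top.extend(xs)
        bottom.extend(ys)
        ops.extend(label * len(xs))

    while not isinstance(al, Empty):
        if isinstance(al, R):
            emit(al.a, al.b, "R")
        elif isinstance(al, D):
            seg = x[slice(*al.region)]
            emit(seg, "-" * len(seg), "D")
        elif isinstance(al, I):
            seg = y[slice(*al.region)]
            emit("-" * len(seg), seg, "I")
        else:  # S: target site, then insert, then duplicated target site
            emit(x[slice(*al.site_x)], y[slice(*al.site_y)], "S")
            ins = y[slice(*al.insert_y)]
            emit("-" * len(ins), ins, "U")
            dup = y[slice(*al.copy_y)]
            emit("-" * len(dup), dup, "T")
        al = al.rest
    return "".join(top), "".join(bottom), "".join(ops)
```

Applied to the expression of Example 2, written as `R("a", D((1, 3), S((3, 5), R("a", R("a", Empty(), "c"), "a"), (7, 9), (3, 7), (1, 3))), "a")`, the three returned strings correspond column by column to the printed alignment.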

At this point the reader is encouraged to take a look ahead at Section 4. It shows the improvement over standard alignment algorithms that we aim for in the subsequent sections.

3.2 A Grammar for Well-Formed Alignments

The datatype Alignment is not specific enough to describe exactly all meaningful alignments. For example, it allows us to represent two subsequent insertions, which should rather be merged into a single, longer insertion:

x = a t t c g - - - - - - - - a a   -- malformed
y = a - - c g t a t a c g g g a c
    R D D S S U U U U T T I I R R

We do not accept a non-recombinant insertion immediately following a recombinant insertion. (We do, however, accept the opposite order.) It seems accidental to locate a duplicated target site in the middle of a gap. In such a situation, the alignment should rather show a single (non-recombinant) insertion (left alignment). Alternatively we might call for a recombinant insertion with a shorter target site (right alignment).

x = a t t c g - - - - - - - - a a     x = a t t c g - - - - - - - - a a
y = a - - c g t a t a c g g g a c     y = a - - c g t a t a c g g g a c
    R D D R R I I I I I I I I R R         R D D R S U U U U U U U T R R

It will be the task of the scoring function to choose between the latter two alternatives, while the malformed alignment above will not even be scored.

We introduce a grammar generating exactly the well-formed alignments. Following the discipline of [4], we use a tree grammar over the datatype Alignment, see Figure 1. The terminal symbols of this grammar are base, region, uregion, denoting a single nucleotide, a non-empty and an arbitrary sequence of nucleotides, respectively. The nonterminals are alignment, noDel, noIns, and match.

A production in this notation should be read as: “An alignment is either a match, or alternatively a deletion of some region from $x$ followed by a noDel, or alternatively an insertion of some region in $y$, followed by a noIns.”

As is easily seen in the grammar, noDel generates all alignments that do not start with a deletion. The use of noDel in the first production prevents successive deletions. Similarly for noIns. Leaving out the clauses for recombinant deletions and insertions, this tree grammar expresses the classical edit distance model [10,11], used in biosequence analysis as well as in string processing.

The grammar still lacks one syntactic restriction: The three occurrences of region in the productions associated with S and L must all derive the same nucleotide sequence.

We now turn the grammar into a recognizer by defining terminal parsers and parser combinators [9]. For simplicity (and reasons of space) we assume that the input sequences $x$ and $y$, as well as their lengths $m$ and $n$, are globally known. We do not show how these values are threaded through the functions.

A parser is given a pair of indices $(i,j)$ and returns a list of all well-formed alignments of the suffix of $x$ at position $i$ with the suffix of $y$ at position $j$. Parsers for terminal symbols, however, are applied to one of the input sequences, so in their case, a call for $(i,j)$ recognizes the subword $(i,j)$ in either $x$ or $y$. There is a parser combinator for the alternative, and for using parser results. Since we have two strings to process, there are two sequential combinators +~~ and ~~+. The combinators -~~ and ~~- are special forms of these (see footnote 2).

alignment → match | D region noDel | I noIns region
noDel     → match | I match region
noIns     → match | D region match
match     → Empty | R base alignment base
                  | S region noIns region uregion region
                  | L region uregion region noDel region

Figure 1: A tree grammar for well-formed alignments

type Parser b = (Int,Int) -> [b]            -- all parses of suffix pair

xbase, ybase :: Parser a
xbase (i,j) = [x!j | i+1 == j]              -- recognize a base from x
ybase (i,j) = [y!j | i+1 == j]              -- recognize a base from y

region, uregion :: Parser (Int,Int)
region  (i,j) = [(i,j) | i <  j]            -- recognize a non-empty region
uregion (i,j) = [(i,j) | i <= j]            -- recognize any region

empty :: b -> Parser b
empty v (i,j) = [v | i == m && j == n]      -- recognize empty alignment

(|||) :: Parser b -> Parser b -> Parser b
(|||) q r inp = q inp ++ r inp              -- alternative

(<<<) :: (b -> c) -> Parser b -> Parser c
(<<<) f q = map f . q                       -- using parser results

(+~~), (~~+), (-~~), (~~-) :: Parser (b -> c) -> Parser b -> Parser c
(+~~) q r (i,j) = [s t | k <- [i..m], s <- q (i,k), t <- r (k,j)]
(~~+) q r (i,j) = [s t | k <- [j..n], s <- q (i,k), t <- r (j,k)]
(-~~) q r (i,j) = [s t | i < m, s <- q (i,i+1), t <- r (i+1,j)]
(~~-) q r (i,j) = [s t | j < n, s <- q (i,j+1), t <- r (j,j+1)]

2 To explain our ideas it would sometimes suffice to present less Haskell code. However, we want to make our paper self-contained and therefore show the code almost completely.

suchthat :: Parser b -> (b -> Bool) -> Parser b
suchthat q f inp = [s | s <- q inp, f s]    -- check property of parser results

axiom :: ((Int,Int) -> b) -> b
axiom q = q (0,0)                           -- declare start symbol of grammar

In the grammar, written as a recognizer, we add syntactic restrictions for maximal target site duplication to the corresponding productions.

enum_alignments :: (Eq a) => Sequence a -> Sequence a -> [Alignment a]
enum_alignments x y = axiom alignment where

  alignment = match                                 |||
              D <<< region +~~ noDel                |||
              I <<< noIns ~~+ region

  noDel     = match                                 |||
              I <<< match ~~+ region

  noIns     = match                                 |||
              D <<< region +~~ match

  match     = empty Empty                           |||
              R <<< xbase -~~ alignment ~~- ybase   |||
              recombIns                             |||
              recombDel

  recombIns = ((S <<< region +~~ noIns ~~+ region ~~+ uregion ~~+ region)
               `suchthat` targetsiteduplication)
               `suchthat` maximality

  recombDel = ((L <<< region +~~ uregion +~~ region +~~ noDel ~~+ region)
               `suchthat` targetsiteduplication)
               `suchthat` maximality

3.3 Dynamic Programming = Parsing + Tabulation

The above recognizer is easy to develop, but its associated parser is highly inefficient: Not only is there an exponential number of well-formed alignments for each pair of input sequences; the recognizer will also repeatedly parse the same subwords when called from different contexts.

The latter inefficiency is removed by introducing tabulation of intermediate parser results (representing alignments of suffixes of the two inputs). In other words, we employ DP. In contrast to memoization [7], DP uses explicitly and statically allocated tables.

type Parsetable b = Array (Int,Int) [b]

tabulated :: Parser b -> Parsetable b
tabulated q = array ((0,0),(m,n)) [((i,j), q (i,j)) | i <- [0..m], j <- [0..n]]

We modify the previous grammar such that all parsers that do a non-constant amount of work per call use tabulation. Calling a parser then means a table lookup. For reasons of space we only show the parser alignment. Note that our “efficiency annotation” does not affect the declarative meaning of the grammar.

dp_alignments :: (Eq a) => Sequence a -> Sequence a -> [Alignment a]
dp_alignments x y = axiom (alignment!) where

  alignment = tabulated ((match!)                    |||
                         D <<< region +~~ (noDel!)   |||
                         I <<< (noIns!) ~~+ region)

It is folklore knowledge that DP combines recursion and tabulation. After all, DP is normally formulated via matrix recurrences. The remarkable point here is the swiftness of the transition, merely by adding the “keyword” tabulated and a few “!” to the grammar. The declarative and the operational meaning of the grammar remain unaffected, while efficiency improves from exponential to polynomial. If we had not been in love with Haskell before, this is where it would have happened.

The recognizer specified by this grammar runs in $O(n^2)$ space and in $O(n^6)$ time, due to the four sequential combinators in the productions associated with recombinant insertions and deletions (see footnote 3).

3.4 An $O(n^3)$ Implementation Using a Precomputed Lookahead

The above parser independently chooses three regions: for the target site in $x$, in $y$, and for the duplication site in either $x$ or $y$. Thereafter, those are checked for identity. Its efficiency can be greatly improved by the following observation: Consider a maximal target site duplication in which the suffix of $x$ at position $i$ is $tz$ and the suffix of $y$ at position $j$ is $tutw$. Assume we have chosen and fixed the combined length $d = |tu|$ of the target site $t$ and the insert $u$. Now for given $x$ and $y$, there is really no variation left for the remaining constituents of the pattern:

• The start positions of the identical subwords must be $i$, $j$, and $j+d$.

• Their lengths are uniquely determined by the maximality condition.

Thus we will modify the parser to guess the position $j+d$, and then use a precomputed table lookahead to determine the length of $t$. For each $(i,j) \in [0,m] \times [0,n]$ this table stores the length of the longest common prefix of the suffix of $x$ at position $i$ and the suffix of $y$ at position $j$. It is computed and stored in $O(n^2)$ time and space. The overall running time of the recognizer is reduced to $O(n^3)$, while the space requirement remains $O(n^2)$. Note that since the three sites are now chosen as identical subwords of maximal length, this approach obviates the a-posteriori check for these properties. The resulting grammar is very similar to the grammar ab_alignments given in Section 3.5.
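The lookahead table can be sketched as follows. This is a minimal illustration, not the authors' code: the name lookaheadTable is hypothetical, sequences are plain lists, and entry (i,j) is assumed to describe the suffixes that start after position i of x and after position j of y, matching the index convention of the parsers above.

```haskell
import Data.Array

-- Hypothetical helper: entry (i, j) holds the length of the longest common
-- prefix of the suffix of xs after position i and the suffix of ys after j.
lookaheadTable :: Eq a => [a] -> [a] -> Array (Int, Int) Int
lookaheadTable xs ys = table
  where
    m = length xs
    n = length ys
    x = listArray (1, m) xs
    y = listArray (1, n) ys
    -- The array is lazy in its elements, so each entry may refer to the
    -- entry diagonally below it without an explicit evaluation order.
    table = array ((0, 0), (m, n))
              [((i, j), lcp i j) | i <- [0 .. m], j <- [0 .. n]]
    lcp i j
      | i < m && j < n && x ! (i + 1) == y ! (j + 1) = 1 + table ! (i + 1, j + 1)
      | otherwise                                    = 0
```

Each entry needs constant work, so filling the table is quadratic in the sequence lengths. For x = attcgaa and y = acgtatacgac from Example 2, the entries at (3,1) and (3,7) are 2 and 3; their minimum, 2, is the length of the target site (3,5) used in that example.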

3.5 The Abstract Evaluator and Evaluation Algebras

According to [4], an abstract evaluator is obtained by abstracting from the constructors of the underlying datatype Alignment. Additionally, an abstract choice function is associated with each production, by the combinator (...). Such an ensemble of functions of appropriate types constitutes an alignment algebra.

type Algebra a b =
  (b,                                                          -- Empty
   a -> b -> a -> b,                                           -- R
   (Int,Int) -> b -> b,                                        -- D
   b -> (Int,Int) -> b,                                        -- I
   (Int,Int) -> (Int,Int) -> (Int,Int) -> b -> (Int,Int) -> b, -- L
   (Int,Int) -> b -> (Int,Int) -> (Int,Int) -> (Int,Int) -> b, -- S
   [b] -> [b])                                                 -- choice function

(...) :: Parser b -> ([b] -> c) -> (Int,Int) -> c
(...) q choice = choice . q                 -- applying a choice function

The abstract evaluator takes an alignment algebra as an additional parameter and adds the choice function.

3 We generally assume that $m$ and $n$ are of the same order, to simplify asymptotic results.

ab_alignments :: (Eq a) => Algebra a b -> Sequence a -> Sequence a -> [b]
ab_alignments alg x y = axiom (alignment!) where

  (fE, fR, fD, fI, fL, fS, choice) = alg

  alignment = tabulated ((match!)                                |||
                         fD <<< region +~~ (noDel!)              |||
                         fI <<< (noIns!) ~~+ region              ... choice)

  noDel     = tabulated ((match!)                                |||
                         fI <<< (match!) ~~+ region              ... choice)

  noIns     = tabulated ((match!)                                |||
                         fD <<< region +~~ (match!)              ... choice)

  match     = tabulated (empty fE                                |||
                         fR <<< xbase -~~ (alignment!) ~~- ybase |||
                         (recombIns!)                            |||
                         (recombDel!)                            ... choice)

  recombIns = tabulated (r ... choice) where
    r (i,j) = [fS t' noins d u t | l <- [j+1..n-1],
                                   let k = min h (lookahead!(i,l)),
                                   t'    <- region (i,i+k),
                                   noins <- noIns!(i+k,l+k),
                                   d     <- region (l,l+k),
                                   u     <- uregion (j+k,l),
                                   t     <- region (j,j+k)]
      where h = lookahead!(i,j)

  recombDel = tabulated (r ... choice) where
    r (i,j) = [fL t u d nodel t' | l <- [i+1..m-1],
                                   let k = min h (lookahead!(l,j)),
                                   t     <- region (i,i+k),
                                   u     <- uregion (i+k,l),
                                   d     <- region (l,l+k),
                                   nodel <- noDel!(l+k,j+k),
                                   t'    <- region (j,j+k)]
      where h = lookahead!(i,j)

The virtue of the abstract evaluator is, of course, that it can be called with arbitrary Alignment algebras: The enumeration algebra is trivially given by the constructors of the Alignment datatype.

enum_alg :: Algebra a (Alignment a)
enum_alg = (Empty, R, D, I, L, S, id)

The counting algebra may be used to determine the number of well-formed alignments (without calculating the alignments, of course).

count_alg :: Algebra a Int
count_alg = (fE, fR, fD, fI, fL, fS, choice) where
  fE = 1
  fR _ x _ = x
  fD _ x = x
  fI x _ = x
  fL _ _ _ x _ = x
  fS _ x _ _ _ = x
  choice [] = []
  choice xs = [sum xs]

The following scoring algebra implements a model with affine gap scores [6]. Such a model is used e.g. by CLUSTAL W, a popular sequence alignment tool [12]. We have extended this algebra by scores for recombinant insertions and deletions. We have given a clear advantage to recombinant indels over regular ones by dividing their penalties by the length of the observed target site duplication.

affine_alg :: Algebra Base Float
affine_alg = (fE, fR, fD, fI, fL, fS, choice) where
  fE = 0
  fR a x b = x + matchscore a b
  fD (i,j) x = x + open + fromInt(j-i)*extend
  fI x (i,j) = x + open + fromInt(j-i)*extend
  fL (i,j) (u,u') _ x _ = x + ropen (i,j) + fromInt(u'-u)*rextend
  fS (i,j) x _ (u,u') _ = x + ropen (i,j) + fromInt(u'-u)*rextend
  choice [] = []
  choice xs = [minimum xs]
  open = 5.0
  extend = 0.2
  ropen (i,j) = open/fromInt(j-i)
  rextend = extend

matchscore :: Base -> Base -> Float
matchscore a b | a == b    = 0
               | a > b     = matchscore' b a   -- function is symmetric
               | otherwise = matchscore' a b
  where matchscore' A G = 1
        matchscore' A _ = 3
        matchscore' C G = 3
        matchscore' C T = 1
        matchscore' G T = 3

The optimal alignment algebra combines the scoring algebra with the enumeration algebra. This is straightforward. It returns an optimal alignment together with its score, in $O(n^2)$ space and $O(n^3)$ time.
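The separation of recognition and evaluation can be tried out on a much smaller model. The sketch below is an assumption-laden simplification, not the paper's grammar: it only knows single-base replacement, deletion and insertion (no regions, no recombination), and the names Algebra (as a record), evaluate, countAlg and unitAlg are hypothetical. Because single-base indels are not merged into regions, its counts differ from those of the well-formed alignments above.

```haskell
import Data.Array

-- Hypothetical, simplified algebra: single-base edit operations only.
data Algebra a b = Algebra
  { fE     :: b                  -- empty alignment
  , fR     :: a -> b -> a -> b   -- replacement (or match)
  , fD     :: a -> b -> b        -- deletion of one base of x
  , fI     :: b -> a -> b        -- insertion of one base of y
  , choice :: [b] -> [b]         -- choice function
  }

-- Tabulated evaluator: one lazily computed table entry per suffix pair (i, j).
evaluate :: Algebra a b -> [a] -> [a] -> [b]
evaluate alg xs ys = table ! (0, 0)
  where
    m = length xs
    n = length ys
    x = listArray (1, m) xs
    y = listArray (1, n) ys
    table = array ((0, 0), (m, n))
              [((i, j), cell i j) | i <- [0 .. m], j <- [0 .. n]]
    cell i j = choice alg $
         [fE alg | i == m, j == n]
      ++ [fR alg (x ! (i + 1)) v (y ! (j + 1)) | i < m, j < n, v <- table ! (i + 1, j + 1)]
      ++ [fD alg (x ! (i + 1)) v | i < m, v <- table ! (i + 1, j)]
      ++ [fI alg v (y ! (j + 1)) | j < n, v <- table ! (i, j + 1)]

-- Counting algebra: number of alignments in the simplified model.
countAlg :: Algebra a Integer
countAlg = Algebra 1 (\_ v _ -> v) (\_ v -> v) (\v _ -> v)
                   (\vs -> [sum vs | not (null vs)])

-- Unit-cost scoring algebra: classical edit distance.
unitAlg :: Eq a => Algebra a Int
unitAlg = Algebra 0 (\a v b -> v + (if a == b then 0 else 1))
                    (\_ v -> v + 1) (\v _ -> v + 1)
                    (\vs -> [minimum vs | not (null vs)])
```

The same evaluate function serves both analyses; only the algebra changes, which is the point made in the text about exchanging algebras without touching the grammar.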

4 Applications

We have applied our programs to chicken immunoglobulin sequences taken from a multiple alignment. The typical improvements achieved by our algorithm are shown in Figure 2:

• In the left part of the recombinant alignment, a gap of length 12 (present in the multiple alignment) is re-discovered in the correct position. Additionally, it is marked as a direct repeat, as it may result from a recombinant insertion with an empty insert. Further experiments reveal that an alignment insensitive to recombination, but with the same scoring otherwise, has an insertion in approximately the same position, but does not exhibit the repeat due to an accidental ambiguity which causes one base to shift from the end to the beginning of the insert.

• The right part of the multiple alignment is poor with 8 mismatches (marked by the symbol *) within a region of 23 bases (between the delimiters > and <). The recombinant alignment offers an alternative explanation. It exhibits both a recombinant deletion and an insertion, with significant target sites, reducing the mismatch count to 1 over the same region as in the top alignment.

                                         >*     * **  * **      *   <
...tac------------tatggctggtaccag...ctccggttc cctatcc ggctcca caggca cat...
...tactatggctggtactatggctggtaccag...ctccggctc cccaggc agaacca caagca cat...

                                         >                  *       <
...tactatggctggtac------------cag...ctccggttc cctatcc ggctcca caggca ------- -----ca t...
...tactatggctggtactatggctggtaccag...ctccgg--- ------- --ctccc caggca gaaccac aagcaca t...
...RRRSSSSSSSSSSSSTTTTTTTTTTTTRRR...RLLLLLUUU UUUUTTT TTRRRRR RRRSSS UUUUUUU UUTTTRR R...

Figure 2: Original alignment (top) and recombinant alignment (bottom)

From the Haskell program, DP recurrences were derived (see Section 1.2, Principle 6). Their implementation in C by a student required three days of work, including debugging. The functional program helped to spot errors in the C program that might otherwise have gone unnoticed. First measurements show that the C program runs faster than the compiled Haskell program by a factor of 68, while using 2% of the space (see footnote 4).

5 Conclusion

ADP is a method for algorithm development. It can be applied beneficially merely with pencil and paper. Its embedding in Haskell adds the convenience of testing ideas very early, i.e., on a very high level of abstraction. The benefits of the functional methods are manifold:

1. Haskell's infix operators are a notational convenience which is essential in this context.

2. The combinator parsing technique allows us to have a consistent declarative and operational meaning of the grammar.

3. The equivalence of arrays and functions gives us polynomial efficiency without intellectual complication.

4. Laziness frees us from explicitly programming the order of computation of table entries, which is a most error-prone task in a strict setting. Our experience is summarized in the motto “No subscripts, no errors”.

5. Algebraic datatypes and higher order functions allow us to separate recognition phase and evaluation algebra. $m$ grammars and $n$ evaluation algebras combine to $m \cdot n$ different analyses. In biosequence analysis, which involves much experimental programming, this compositionality takes the logarithm of the programming effort required otherwise.

The implementation effort can be summarized as follows. Having applied ADP in a different context before, it took an afternoon to adapt the combinator definitions and arrive at the $O(n^6)$ algorithm. Different evaluation algebras were helpful to test the program. Coming up with the lookahead based implementation required some thinking, but again, its implementation and testing was a matter of hours.

4 For example, when computing the alignment of Figure 2 (for sequences of length 200), the C program takes 5 seconds using 1.08 megabytes of space, while the Haskell program takes 340 seconds using 50 megabytes of space. These results were obtained on a Pentium PII computer with 300 MHz and 128 MB RAM. We used the C compiler gcc version 2.7.2.3, and the Haskell compiler ghc version 4.04-1.

Although the improved parsers are “hard-coded” rather than defined via combinators, they fit in the rest of the program without friction. The flexibility makes us believe that the ADP method has virtually unlimited potential for improving programming productivity in biosequence analysis.

Acknowledgement We thank Matthias Hochsmann for implementing the recurrences in C and performing the experiments. Dirk Evers carefully read previous versions of the paper.

References

[1] E.L. Anson and E.W. Myers. ReAligner: A Program for Refining DNA Sequence Multi-Alignments. In Proceedings of the First Conference on Computational Molecular Biology, pages 9–16. ACM Press, 1997.

[2] E. Birney and R. Durbin. Dynamite: A Flexible Code Generation Language for Dynamic Programming Methods Used in Sequence Comparison. In Proc. of ISMB'97, pages 56–64, 1997.

[3] M.A. Gelfand, L.I. Podolsky, T.V. Astakhova, and M.A. Roytberg. Recognition of Genes in Human DNA Sequences. J. Comp. Biol., 3(2):223–234, 1996.

[4] R. Giegerich. A Declarative Approach to the Development of Dynamic Programming Algorithms, Applied to RNA-Folding. Report 98–02, Technische Fakultät, Universität Bielefeld, 1998. ftp://ftp.uni-bielefeld.de/pub/papers/techfak/pi/Report98-02.ps.gz.

[5] A. Gill, J. Launchbury, and S. Peyton-Jones. A Short Cut to Deforestation. In Proceedings of the Conference on Functional Programming Languages and Computer Architecture, June 1993. ACM Press, New York, NY, 1993.

[6] O. Gotoh. An Improved Algorithm for Matching Biological Sequences. J. Mol. Biol., 162:705–708, 1982.

[7] J. Hughes. Lazy Memo-Function. In Proceedings of the Conference on Functional Programming Languages and Computer Architecture, pages 129–146. Lecture Notes in Computer Science 201, Springer Verlag, 1985.

[8] J. Hughes. Why Functional Programming Matters. The Computer Journal, 32(2):98–107, 1989.

[9] G. Hutton. Higher Order Functions for Parsing. J. Functional Programming, 3(2):323–343, 1992.

[10] S.B. Needleman and C.D. Wunsch. A General Method Applicable to the Search for Similarities in the Amino-Acid Sequence of Two Proteins. J. Mol. Biol., 48:443–453, 1970.

[11] T.F. Smith and M.S. Waterman. Identification of Common Molecular Subsequences. J. Mol. Biol., 147:195–197, 1981.

[12] J.D. Thompson, D.G. Higgins, and T.J. Gibson. CLUSTAL W: Improving the Sensitivity of Progressive Multiple Sequence Alignment through Sequence Weighting, Position Specific Gap Penalties and Weight Matrix Choice. Nucleic Acids Res., 22:4673–4680, 1994.

[13] P. Wadler. Deforestation: Transforming Programs to Eliminate Trees. Theoretical Computer Science, 73:231–248, 1990.

[14] G.F. Weiller. Phylogenetic Profiles: A Graphical Method for Detecting Genetic Recombinations in Homologous Sequences. Mol. Biol. Evol., 15(3):326–335, 1998.

[15] M. Zuker. The Use of Dynamic Programming Algorithms in RNA Secondary Structure Prediction. In Mathematical Methods for DNA Sequences, Waterman, M.S. (editor), pages 159–184. 1989.

Constructing Red-Black Trees

Ralf Hinze

Institut für Informatik III, Universität Bonn
Römerstraße 164, 53117 Bonn, Germany

Abstract

This paper explores the structure of red-black trees by solving an apparently simple problem: given an ascending sequence of elements, construct, in linear time, a red-black tree that contains the elements in symmetric order. Several extreme red-black tree shapes are characterized: trees of minimum and maximum height, trees with a minimal and with a maximal proportion of red nodes. These characterizations are obtained by relating tree shapes to various number systems. In addition, connections to left-complete trees, AVL trees, and half-balanced trees are highlighted.

1 Introduction

Red-black trees are an elegant search-tree scheme that guarantees $O(\log n)$ worst-case running time of basic dynamic-set operations. Recently, C. Okasaki [10, 11] presented a beautiful functional implementation of red-black trees. In this paper we plunge deeper into the structure of red-black trees by solving an apparently simple problem: given an ascending sequence of elements, construct a red-black tree that contains the elements in symmetric order. Since the sequence is ordered, the construction should only take linear time.

There are at least two ways of approaching this problem. One can try to analyse and to improve the standard method, which works by repeatedly inserting elements into an empty initial tree. Or one can build upon well-known algorithms for constructing trees of minimum height [4, 5]. In the latter case one must solve the following related problem: given an arbitrary binary search-tree, is there a way of coloring the nodes such that a red-black tree emerges?

We follow both paths as each provides us with different insights into the structure of red-black trees. Along the way, we will encounter several extreme red-black tree shapes: trees of minimum and maximum height, trees with a minimal proportion of red nodes, and others with a maximal proportion. In addition, connections to left-complete trees [13], AVL trees [2], and half-balanced trees [12] are highlighted.

2 Functional red-black trees

Let us start with a brief review of C. Okasaki's functional red-black trees [10, 11]. A red-black tree is a binary tree whose nodes are colored either red or black.

data Color    = R | B
data RBTree a = E | N Color (RBTree a) a (RBTree a)

The balance conditions are best explained if we take a look at their historical roots. Red-black trees were developed by R. Bayer [3] under the name symmetric binary B-trees. This term indicates that red-black trees were originally designed as binary tree representations of 2-3-4 trees. Recall that a 2-3-4 tree consists of 2-, 3- and 4-nodes (a 3-node, for instance, has 2 keys and 3 children) and satisfies the invariant that all leaves appear on the same level. The idea of red-black trees is to represent 3- and 4-nodes by small binary trees, which consist of a black root and one or two auxiliary red children. This explains the following two balance conditions.

Red condition: Each red node has a black parent.

Black condition: Each path from the root to an empty node contains exactly the same number of black nodes (this number is called the tree's black height).
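The two conditions translate directly into a checker. The following is only a sketch for experimentation: the functions color, redCondition, blackHeight and isRedBlack are hypothetical names, and empty trees are assumed to count as black.

```haskell
import Data.Maybe (isJust)

data Color    = R | B deriving (Eq, Show)
data RBTree a = E | N Color (RBTree a) a (RBTree a) deriving (Eq, Show)

-- Assumption: an empty tree is treated as black.
color :: RBTree a -> Color
color E           = B
color (N c _ _ _) = c

-- Red condition: every red node has a black parent, so the root is black
-- and no red node has a red child.
redCondition :: RBTree a -> Bool
redCondition t = color t == B && go t
  where
    go E           = True
    go (N c l _ r) = (c == B || (color l == B && color r == B)) && go l && go r

-- Black condition: Just h if every path to an empty node passes through
-- exactly h black nodes, Nothing otherwise.
blackHeight :: RBTree a -> Maybe Int
blackHeight E           = Just 0
blackHeight (N c l _ r) = do
  hl <- blackHeight l
  hr <- blackHeight r
  if hl == hr then Just (hl + (if c == B then 1 else 0)) else Nothing

isRedBlack :: RBTree a -> Bool
isRedBlack t = redCondition t && isJust (blackHeight t)
```

For instance, N B (N R E 1 E) 2 E passes both checks, whereas N B (N B E 1 E) 2 E fails the black condition because its two subtrees have different black heights.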

Note that the red condition implies that the root of a red-black tree is black. The algorithm for inserting an element into a red-black tree is nearly identical to the standard algorithm for unbalanced binary trees. The main difference is that the constructor for building nodes, N, is replaced by a smart constructor [1] that maintains the invariants.

insert :: (Ord a) => a -> RBTree a -> RBTree a
insert a t = blacken (ins t)
  where ins E = N R E a E
        ins (N c l b r)
          | a <  b = bal c (ins l) b r
          | a == b = N c l a r
          | a >  b = bal c l b (ins r)

        blacken (N _ l a r) = N B l a r

Since a new node is colored red, only the red condition is possibly violated. The smart constructor bal detects and repairs such violations.

bal B (N R (N R t1 a1 t2) a2 t3) a3 t4 = N R (N B t1 a1 t2) a2 (N B t3 a3 t4)
bal B (N R t1 a1 (N R t2 a2 t3)) a3 t4 = N R (N B t1 a1 t2) a2 (N B t3 a3 t4)
bal B t1 a1 (N R (N R t2 a2 t3) a3 t4) = N R (N B t1 a1 t2) a2 (N B t3 a3 t4)
bal B t1 a1 (N R t2 a2 (N R t3 a3 t4)) = N R (N B t1 a1 t2) a2 (N B t3 a3 t4)
bal c l a r                            = N c l a r

The simplest way to construct a red-black tree is to repeatedly insert elements into an empty initial tree.

top-down :: (Ord a) => [a] -> RBTree a
top-down = foldr insert E

3 A closer look at top-down

What tree shapes does top-down produce when the given sequence is ascending? It is instructive to peek at some small examples first. The following trees are generated by top-down [1..i] for $1 \le i \le 8$.

[Figure: the eight trees generated by top-down [1..i], $1 \le i \le 8$, with red and black nodes drawn differently]

Note that we do not care to label the nodes as the keys are uniquely determined by the search-tree property. Since the list is processed from right to left, the elements are inserted in descending order. Consequently, ins always traverses the left spine of the tree to the leftmost leaf. The examples show that the color of the leftmost leaf alternates between black and red, implying that the red condition is violated in every second step. Because ins always branches to the left, only the first equation of the smart constructor bal can possibly match. If we draw the left spine horizontally, the balancing operation takes the following form.

[Figure: the first equation of bal drawn along the left spine, before and after balancing]

The balancing operation paints the first node black and combines the next two, putting the black node below the second red one. Since this is the only operation applicable, all nodes not on the left spine must be black. We know even more: the black condition implies that the trees below the left spine must be perfectly balanced binary trees (perfect trees for short). Thus, the generated red-black trees correspond to sequences of topped perfect trees. A topped tree is a tree with an additional unary node on top. Topped perfect trees are, in fact, a widespread plant in the design and analysis of data structures. J.-R. Sack and T. Strothotte [13], who call them pennants, employ them to design algorithms for splitting and merging heaps in the form of left-complete binary trees. In a left-complete tree all leaves appear on at most two adjacent levels and the leaves on the lowest level are in the leftmost possible positions. We will exhibit further connections to their work in Section 6. The author recently showed that pennants also underlie binomial heaps [7].

A topped perfect tree or a pennant of rank $r$ is a perfect tree of height $r$ with an additional node on top. It follows that a pennant of rank $r$ contains exactly $2^r$ nodes. Turning to the analysis of top-down [1..i] we are left with the task of determining the pennants' ranks. It is helpful to redraw our examples according to the left-spine view.

[Figure: the example trees redrawn in the left-spine view]

A pattern begins to emerge: let $r$ be the rank of the rightmost pennant; the black condition implies that a pennant of rank $i$ appears either once or twice for all $0 \le i \le r$. Since the size of a rank $i$ pennant is $2^i$, we have that the trees correspond to ‘binary numbers’ composed of the digits 1 and 2. It is worthwhile to study this number system, which we call 1-2 system, in more detail. Recall that the value of the radix-2 number $(b_{n-1} \ldots b_0)_2$ is $\sum_{i=0}^{n-1} b_i 2^i$. Since the number system abandons the digit 0 in favour of the digit 2, each natural number has, in fact, a unique representation. It is conceivable that the number system was already known in the middle ages when the number 0 was frowned upon and fell into oblivion later. Purists are probably attracted by the fact that there is no need to disallow leading zeros. Counting is easy: $()_2, (1)_2, (2)_2, (11)_2, (12)_2, (21)_2, (22)_2, (111)_2, (112)_2, \ldots$ Note that 0 is represented by the empty sequence. The increment is similar to the ordinary binary increment; we have $s1 + 1 = s2$ and $s2 + 1 = (s + 1)1$. In the sequel we use the notation $(b_{n-1} \ldots b_0)_{1\text{-}2}$ to emphasize that the digits are drawn from the set $\{1, 2\}$.

We have seen that red-black trees generated by top-down [1..n] are uniquely determined by the 1-2 decomposition of n. Let us examine some examples: $(1^{\{n\}})_{1\text{-}2} = 2^n - 1$ corresponds to a perfect tree of height $n$ ($d^{\{n\}}$ means the digit $d$ repeated $n$ times); left-complete trees are produced for $(1^{\{n\}}2)_{1\text{-}2} = 2^{n+1}$ and $(21^{\{n\}})_{1\text{-}2} = 3 \cdot 2^n - 1$. Hence, we know that top-down [1..i] produces trees of minimum height for some values of $i$. This is good news. On the other hand $(2^{\{n\}})_{1\text{-}2} = 2^{n+1} - 2$ and $(12^{\{n\}})_{1\text{-}2} = 3 \cdot 2^n - 2$ correspond to skinny trees of height $2n$ and $2n+1$, respectively. A skinny tree is a tree of smallest possible size for a given height. Fig. 1 depicts a skinny tree of height 7 and its ‘successor’, which is a left-complete tree. Note that we cannot remove a single node from the skinny tree without either lowering the tree's height or violating the black condition.

[Figure: the trees $(1222)_{1\text{-}2}$ and $(2111)_{1\text{-}2}$]

Figure 1: A skinny tree of height 7 and its ‘successor’

Skinny trees give us a precise upper bound for the height of a red-black tree [14]:

height t ≤ 2 lg (size t + 2) − 2.

So the bad news is that top-down [1..i] produces red-black trees of maximum height for some values of $i$.
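As a quick sanity check of this bound, one can substitute the sizes stated above for skinny trees (assumed here to be $2^{n+1} - 2$ nodes at height $2n$ and $3 \cdot 2^n - 2$ nodes at height $2n+1$) into the right-hand side:

```latex
\begin{align*}
2 \lg\bigl((2^{n+1} - 2) + 2\bigr) - 2 &= 2(n+1) - 2 = 2n, \\
2 \lg\bigl((3 \cdot 2^{n} - 2) + 2\bigr) - 2 &= 2n + 2\lg 3 - 2 \approx 2n + 1.17 .
\end{align*}
```

Under these assumptions the bound is attained exactly for even heights, and for odd heights it is tight once the right-hand side is rounded down to an integer.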

The trees generated by top-down have another intriguing property. They contain the minimal number of red nodes among all red-black trees of that size.

Sketch of proof. The central idea is to show that each tree of size $i$ can be transformed into the shape generated by top-down [1..i] and that the transformations do not increase the number of red nodes. We base the proof on 2-3-4 trees, which underlie red-black trees, see Section 2. The shape of a 2-3-4 tree is uniquely represented by a sequence of level descriptions where each level description is a sequence of the numbers 2, 3 and 4. As an example, the trees of Fig. 1 are represented by

2                  3
32                 222
32222              222222
32222222222        222222222222.

To simplify the proof we consider not only 2-3-4 trees but general multiway branching trees and transformations on these trees. To this end we generalize level descriptions to sequences of arbitrary natural numbers. We only require the following ‘sanity’ condition: A sequence of level descriptions $\ell_n \ell_{n-1} \ldots \ell_2 \ell_1$ is valid iff $|\ell_n| = 1$ and $\sum \ell_{i+1} = |\ell_i|$ for $1 \le i < n$. We use two kinds of transformations. The first transformation replaces a subsequence of numbers in a certain level by another subsequence and is depicted $s \to t$. To ensure that the resulting tree is valid, $s$ and $t$ must satisfy $\sum s = \sum t$ and $|s| = |t|$. The second transformation is $s \xrightarrow{+k} t$ with $k \ge 1$, $\sum s = \sum t$, and $|s| + k = |t|$. Here, $s$ is replaced by $t$ in a certain level and the level above is increased by $k$, ie one or more numbers in this level are increased by a total of $k$. If there is no level above, we silently create a new level consisting of a 1-node. The reader is invited to relate the two transformations to operations on multiway branching trees. Note that $s \to t$ does not affect the number of red nodes and that $s \xrightarrow{+k} t$ decreases the number of red nodes by $k$ (recall that a node of size $n$ consists of one black node and $n - 1$ red nodes). Now, an arbitrary 2-3-4 tree can be transformed into the desired shape by repeatedly applying the following transformations from the bottom level to the top level.

$2\,3 \to 3\,2$        $3\,3 \xrightarrow{+1} 2\,2\,2$        $(n+4) \xrightarrow{+1} (n+2)\,2$

In the resulting tree each level description has the form $3^{?}2^{*}$, ie an optional 3-node followed by an arbitrary number of 2-nodes. Since the tree generated by top-down satisfies the same property and since each number has a unique 1-2 decomposition, the claim follows.

Using the 1-2 number system we can even quantify the minimal number of red nodes: a red-black tree of size n contains at least k red nodes where k is the number of 2's in the 1-2 representation of n.
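The 1-2 number system is easy to experiment with. The sketch below is illustrative only: digits are stored least significant first, and the names OneTwo, inc, fromNat, value, render and twos are hypothetical. The function inc follows the two increment laws, and twos computes the lower bound on red nodes stated above.

```haskell
-- Digits of a 1-2 number, least significant digit first.
type OneTwo = [Int]

-- Increment: a trailing 1 becomes 2; a trailing 2 becomes 1 with a carry.
inc :: OneTwo -> OneTwo
inc []       = [1]
inc (1 : ds) = 2 : ds
inc (2 : ds) = 1 : inc ds
inc (d : _)  = error ("not a 1-2 digit: " ++ show d)

-- The 1-2 representation of a natural number, by counting up from zero.
fromNat :: Int -> OneTwo
fromNat n = iterate inc [] !! n

-- The value of a 1-2 number.
value :: OneTwo -> Int
value = foldr (\d acc -> d + 2 * acc) 0

-- Most significant digit first, as written in the text.
render :: OneTwo -> String
render = concatMap show . reverse

-- Number of 2's: the minimal number of red nodes for a tree of that size.
twos :: OneTwo -> Int
twos = length . filter (== 2)
```

For example, map (render . fromNat) [0..8] reproduces the counting sequence from Section 3, and twos (fromNat 22) is 3, the number of 2's in the representation 1222 of the skinny tree of Fig. 1.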

4 Improving top-down

The analogy to the 1-2 number system can be exploited to give a better implementation of top-down for the special case that the elements appear in ascending order. The digits become containers for pennants:

data Digit a = One a (RBTree a)
             | Two a (RBTree a) a (RBTree a)

A red-black tree is represented by a list of digits in increasing order of size (the least significant digit comes first). Inserting an element corresponds to incrementing a 1-2 number. The function incr, which does the job, essentially implements the two laws $s1 + 1 = s2$ and $s2 + 1 = (s + 1)1$ where $s$ is any sequence of 1-2 digits.

incr :: Digit a -> [Digit a] -> [Digit a]
incr (One a t) []                       = [One a t]
incr (One a1 t1) (One a2 t2 : ps)       = Two a1 t1 a2 t2 : ps
incr (One a1 t1) (Two a2 t2 a3 t3 : ps) = One a1 t1 : incr (One a2 (N B t2 a3 t3)) ps

The reader is invited to relate incr to the definitions of ins and bal given in Section 2. The rest is easy: we repeatedly insert elements into the list of digits; the final result is converted to a red-black tree.

bottom-up :: [a] -> RBTree a
bottom-up = linkAll . foldr add []

add a ps = incr (One a E) ps

linkAll = foldl link E

link l (One a t)         = N B l a t
link l (Two a1 t1 a2 t2) = N B (N R l a1 t1) a2 t2

It is a routine matter to prove bottom-up correct. We must essentially show that add implements insert on the left-spine view (labels lists the labels of a red-black tree):

all (a <) (labels t) ⟹ linkAll (add a t) = insert a (linkAll t).

The reimplementation of top-down is worth the effort: a standard amortization argument shows that bottom-up takes only linear time.
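The claimed agreement between the two constructions can be tested on ascending inputs. The following self-contained sketch restates the definitions from Sections 2 and 4 in compilable form; the names topDown and bottomUp stand in for top-down and bottom-up, and the extra equations for blacken and incr are assumptions added only to make the functions total.

```haskell
data Color    = R | B deriving (Eq, Show)
data RBTree a = E | N Color (RBTree a) a (RBTree a) deriving (Eq, Show)

-- Insertion with the smart constructor bal, as reviewed in Section 2.
insert :: Ord a => a -> RBTree a -> RBTree a
insert a t = blacken (ins t)
  where
    ins E = N R E a E
    ins (N c l b r)
      | a < b     = bal c (ins l) b r
      | a > b     = bal c l b (ins r)
      | otherwise = N c l a r
    blacken (N _ l b r) = N B l b r
    blacken E           = E            -- unreachable; added for totality

bal :: Color -> RBTree a -> a -> RBTree a -> RBTree a
bal B (N R (N R t1 a1 t2) a2 t3) a3 t4 = N R (N B t1 a1 t2) a2 (N B t3 a3 t4)
bal B (N R t1 a1 (N R t2 a2 t3)) a3 t4 = N R (N B t1 a1 t2) a2 (N B t3 a3 t4)
bal B t1 a1 (N R (N R t2 a2 t3) a3 t4) = N R (N B t1 a1 t2) a2 (N B t3 a3 t4)
bal B t1 a1 (N R t2 a2 (N R t3 a3 t4)) = N R (N B t1 a1 t2) a2 (N B t3 a3 t4)
bal c l a r                            = N c l a r

topDown :: Ord a => [a] -> RBTree a
topDown = foldr insert E

-- Digits of the 1-2 system holding one or two pennants of equal rank.
data Digit a = One a (RBTree a) | Two a (RBTree a) a (RBTree a)

incr :: Digit a -> [Digit a] -> [Digit a]
incr (One a t)   []                     = [One a t]
incr (One a1 t1) (One a2 t2 : ps)       = Two a1 t1 a2 t2 : ps
incr (One a1 t1) (Two a2 t2 a3 t3 : ps) = One a1 t1 : incr (One a2 (N B t2 a3 t3)) ps
incr (Two _ _ _ _) _ = error "incr: only single pennants are inserted"

-- Linear-time construction; the input is assumed to be ascending.
bottomUp :: [a] -> RBTree a
bottomUp = linkAll . foldr add []
  where
    add a ps                 = incr (One a E) ps
    linkAll                  = foldl link E
    link l (One a t)         = N B l a t
    link l (Two a1 t1 a2 t2) = N B (N R l a1 t1) a2 t2
```

Comparing bottomUp [1..n] with topDown [1..n] for a range of n gives some confidence in the correctness statement; it is of course a test, not a proof.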

Remark. Red-blacktreesunderthe left-spineview correspondcloselyto finger search-trees[6]. A fingersearch-treeis a representationof an orderedlist that allows for efficient insertionin the vicinity of certainpoints,termedfingers.Herewe havea singlestaticfingerat thefront endof thelist. Thisdatastructuremaybe of further interestbecauseit makesa nice implementationof updatablepriority queues,which supportdeletinganddecreasingakey.

93

5 Lessheight, please!

Having succeededin implementingtop-downefficiently, let usnow try to reducetheheightof thegeneratedtrees. It is well-known [8] that a binary search-treeis optimal if all leavesappearon at mosttwo adjacentlevels—underthe assumptionthatall keys areequallylikely. It turnsout that it is almosttrivial to modifybottom-upsothatit producestreesof thatshape.A simplerotationto theright suffices:

link N l O Two a P t P aQ t QRCS N B l aP�O N R t P aQ t Q:R*THerearetheshallow variantsof thetreesshown in Fig. 1.

It is interestingto seehow theleaveson thebottomlevel arearranged.Thepatternbecomesapparentif wetakea look at a longersequenceof trees.

In each case the leaves appear on two adjacent levels. The definition of link' brings about that the leaves on the lowest level are descendants of a red node (apart from perfect trees). Depending on the position of the red node we have either a group of 1, 2 or 4 nodes as indicated by the shading. If we convert the groups into binary digits, the binary numbers $(000)_2$, $(001)_2$, $(010)_2$, ..., $(111)_2$ appear on the last level. The number system helps to explain why this is the case. The 1-2 number $(d_{n-1} \ldots d_0)_{1\text{-}2}$ can be decomposed into two binary numbers:

    $(d_{n-1} \ldots d_0)_{1\text{-}2} = (1 \ldots 1)_{0\text{-}1} + (d'_{n-1} \ldots d'_0)_{0\text{-}1}$  with  $d'_i = d_i - 1$.

The number $(1 \ldots 1)_{0\text{-}1}$ corresponds to the perfect tree on top; the residue $(d'_{n-1} \ldots d'_0)_{0\text{-}1}$ to the leaves on the bottom level.
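The decomposition can be checked on small examples. The following is a minimal sketch, assuming digits are stored most significant first; the names `value` and `decompose` are ours and do not appear in the text.

```haskell
-- Digits are listed most significant first; in a 1-2 number every digit is 1 or 2.
type Digits = [Int]

-- Value of a digit list in base 2, whatever the digit set.
value :: Digits -> Int
value = foldl (\acc d -> 2 * acc + d) 0

-- Split a 1-2 number into the all-ones number (the perfect tree on top)
-- and the binary residue d'_i = d_i - 1 (the leaves on the bottom level).
decompose :: Digits -> (Digits, Digits)
decompose ds = (map (const 1) ds, map (subtract 1) ds)
```

For instance, `decompose [1,2,2]` gives `([1,1,1],[0,1,1])`: a perfect tree with 7 nodes plus 3 nodes on the bottom level, 10 nodes in total.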

6 Digression: Left-complete binary heaps

The generated trees, which we call quasi left-complete trees, are closely related to left-complete trees, which have the leaves on the lowest level in the leftmost possible positions. Consider the parents of the red nodes. If we swap their children, we obtain a left-complete tree. Again, it is trivial to modify link accordingly:

    link'' l (Two a1 t1 a2 t2) = N B (N R t1 a2 t2) a1 l.

To complete the picture here are the left-complete colleagues of the trees above.
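The two variants of link can be compared side by side. The sketch below is self-contained under stated assumptions: the types `Color`, `RBTree` and `Two` are our guesses at the shapes used earlier in the paper (only the two-element digit is modelled), and `inorder` is a helper we add for inspection.

```haskell
data Color = R | B deriving (Eq, Show)
data RBTree a = E | N Color (RBTree a) a (RBTree a) deriving (Eq, Show)

-- Assumed shape of a two-element digit: two labels, each with its subtree.
data Two a = Two a (RBTree a) a (RBTree a)

-- Shallow variant: the red node hangs to the right of the new root.
link' :: RBTree a -> Two a -> RBTree a
link' l (Two a1 t1 a2 t2) = N B l a1 (N R t1 a2 t2)

-- Left-complete variant: the children of the new root are swapped.
link'' :: RBTree a -> Two a -> RBTree a
link'' l (Two a1 t1 a2 t2) = N B (N R t1 a2 t2) a1 l

-- Hypothetical helper: the labels in symmetric order.
inorder :: RBTree a -> [a]
inorder E           = []
inorder (N _ l a r) = inorder l ++ [a] ++ inorder r
```

With `l = N B E 1 E` and the digit `Two 2 (N B E 3 E) 4 (N B E 5 E)`, the result of `link'` lists its labels in the order 1, 2, 3, 4, 5, whereas `link''` yields 3, 4, 5, 2, 1. The swap keeps the smallest label of the digit at the root but gives up the symmetric order.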


Of course, the above transformation does not preserve the search-tree property. So let us assume in this section that we deal with binary heaps instead. It is not difficult to adapt incr to the new situation. We change our point of view because this provides an interesting link to the work of J.-R. Sack and T. Strothotte [13]. The decomposition of a left-complete tree into a list of pennants also lies at the heart of their algorithms for splitting and merging heaps. There is, however, a slight difference. They decompose a left-complete heap along the path from the root to the last leaf, i.e. the rightmost leaf on the last level. This seems to be an obvious choice but, as we shall see, gives rise to a more complicated number system. Here are the two ways of decomposing a left-complete tree.

[Figure: three left-complete trees, each decomposed in two ways. First row: $(122)_{1\text{-}2}$ and $(210)_{0\text{-}1\text{-}2}$; $(211)_{1\text{-}2}$ and $(1011)_{0\text{-}1\text{-}2}$; $(212)_{1\text{-}2}$ and $(1020)_{0\text{-}1\text{-}2}$. Versus, second row: $(202)_{0\text{-}1\text{-}2*}$, $(211)_{0\text{-}1\text{-}2*}$, $(1012)_{0\text{-}1\text{-}2*}$.]

The difference is really minor: in the first row we follow the path to the first free position; in the second row to the last occupied position. Given a left-complete tree of size $n$, the first choice yields $\lfloor \lg (n+1) \rfloor$ pennants while the latter choice gives $\lceil \lg (n+1) \rceil$ pennants. Let us examine the number system corresponding to the latter choice, which we call the 0-1-2* system, for want of a better name. The examples show that the numbers are composed of the digits 0, 1 and 2. The digit $d$ appears in the $i$-th position iff the path contains $d$ pennants of size $2^i$. For instance, the rightmost tree contains one pennant of size 8, one of size 2, and two of size 1. Consequently, the corresponding number is $(1012)_{0\text{-}1\text{-}2*}$. Without further restrictions the 0-1-2* binary system is clearly redundant. It turns out that the number $(d_{n-1} \ldots d_0)_{0\text{-}1\text{-}2*}$ corresponds to a left-complete tree iff

    $i + 1 \le \sum_{j=0}^{i} d_j \le i + 2$  for all $i < n$ [13].

This condition implies that we never have two successive 2's.

In fact, $2\,1^{*}\,2$ cannot appear as a subsequence ($d^{*}$ means $d$ repeated arbitrarily often). Incrementing a 0-1-2* number is funny: first make a 'normal' increment (this can be done in constant time since the subsequence 22 is forbidden); then apply the transformation $2\,1^{*}\,2 \Longrightarrow 1\,0\,1^{*}\,2$ (at most once). If a segmented representation is used [10, Section 9.2.4], the latter transformation can also be done in constant time.

How do the number systems relate to each other? Well, we obtained the left-complete trees using rotations and swaps. A rotation corresponds to the carry propagation $2 \Longrightarrow 1\,0$ turning a 1-2 number into a 0-1-2 number. A swap does not affect the numeric representation. Thus, $(122)_{1\text{-}2}$ becomes $(210)_{0\text{-}1\text{-}2}$ and $(211)_{1\text{-}2}$ becomes $(1011)_{0\text{-}1\text{-}2}$. Note that the last digit of the 0-1-2 number is either 0 or 1. If we increment the last digit, we get the 0-1-2* number corresponding to its successor. This relation is not too surprising since the path to the first free position corresponds to the path to the last element in the successor tree.
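The examples quoted above can be verified mechanically. The following is a small sketch, assuming digit lists are written most significant first; `size`, `pennants` and `bumpLast` are names we introduce for illustration.

```haskell
-- Number of nodes denoted by a digit list (most significant digit first):
-- a digit d in position i stands for d pennants of size 2^i.
size :: [Int] -> Int
size = foldl (\acc d -> 2 * acc + d) 0

-- Number of pennants in the decomposition: the sum of the digits.
pennants :: [Int] -> Int
pennants = sum

-- Increment the last digit; assumes a non-empty digit list.
bumpLast :: [Int] -> [Int]
bumpLast ds = init ds ++ [last ds + 1]
```

For example, `size [1,2,2]` and `size [2,1,0]` both evaluate to 10, and `bumpLast [2,1,0]` is `[2,1,1]`, the 0-1-2* number given above for the tree with 11 nodes. Likewise `pennants [2,1,0]` is 3 while `pennants [2,0,2]` is 4, which matches the floor and ceiling of lg 11.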

7 Coloring binary search-trees

Let us now approach the problem of constructing red-black trees from a different perspective. Say we are given an arbitrary binary search-tree and we are asked to color the nodes such that a red-black tree emerges, or to report that it is not possible to do so. Here is the first test case.


The longest path from the root to an empty tree comprises 4 nodes; the shortest path consists of 2 nodes. Thus, the black height must be two and the nodes on the longest path must be colored black, red, black and red. We get a better picture of the situation if we draw the tree slightly differently. In the picture below the vertical position of a node corresponds to its height rather than its distance from the root.

We additionally assign a level number to each node: a node of height h receives the level number ⌈h / 2⌉. This way we divide the tree two-levelwise from the bottom to the top. Coloring is now easy: a node is colored red iff it has a parent with the same level number. Does this scheme work in general? Not quite, as the second test case shows.

All nodes of the right subtree must be colored black but our scheme colors the two leaves red. Fortunately, the picture on the right also contains an indication of the failure: an edge crosses two levels, i.e. the level numbers of two adjacent nodes differ by more than one. We can remedy this defect by lifting the right son of the root to the second level. Generally, it is possible to adjust the level numbers in a single top-down pass. It may, of course, happen that the leaves no longer appear on the same level. In this case the given tree is not colorable.

We are now ready to tackle the implementation. For simplicity, let us assume that the nodes of the input tree are decorated with the level number, i.e. trees are given as elements of the datatype

    type Level = Int

    data Tree a = Empty | Node Level (Tree a) a (Tree a)

in which ⌈height (Node h l a r) / 2⌉ = h. The algorithm takes the following form:

    rbtree :: Tree a -> RBTree a
    rbtree Empty          = E
    rbtree (Node h l a r) = N B (rbtree' h l) a (rbtree' h r)

    rbtree' :: Level -> Tree a -> RBTree a
    rbtree' hp Empty
      | hp == 1   = E
      | otherwise = error "rbtree: not colorable"  -- message wording assumed
    rbtree' hp (Node h l a r) = N color (rbtree' h' l) a (rbtree' h' r)
      where h' = h `max` (hp - 1)
            color | hp == h   = R
                  | otherwise = B

The auxiliary function rbtree' receives two arguments: the uncolored tree and the level number, hp, of the tree's parent. If the tree is empty, the level number must necessarily be one. Otherwise, the given tree cannot be colored. The root of a non-empty tree is colored red iff its level number h coincides with hp. The adjusted level number of the root, which is passed to the recursive calls of rbtree', is given by the expression h `max` (hp - 1). Note that the level number of the input tree equals the black height of the generated tree. Furthermore, the longest path in the red-black tree contains alternating black and red nodes (if the height is odd and greater than one, the path starts with two black nodes). This in turn implies that rbtree produces trees of minimum black height.
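To experiment with the coloring algorithm, the following self-contained sketch may be helpful. It is a variant, not the definition from the text: it returns `Nothing` where rbtree signals an error, and the helpers `height` and `node` (which compute the level number ⌈h / 2⌉ naively) as well as the names `rbtreeMaybe` and `go` are our own assumptions.

```haskell
data Color = R | B deriving (Eq, Show)
data RBTree a = E | N Color (RBTree a) a (RBTree a) deriving (Eq, Show)

type Level = Int
data Tree a = Empty | Node Level (Tree a) a (Tree a) deriving (Eq, Show)

-- Hypothetical helpers: naive height, and a smart constructor that
-- decorates a node with the level number ceiling (height / 2).
height :: Tree a -> Int
height Empty          = 0
height (Node _ l _ r) = 1 + max (height l) (height r)

node :: Tree a -> a -> Tree a -> Tree a
node l a r = Node ((h + 1) `div` 2) l a r
  where h = 1 + max (height l) (height r)

-- Variant of rbtree: Nothing means the tree is not colorable.
rbtreeMaybe :: Tree a -> Maybe (RBTree a)
rbtreeMaybe Empty          = Just E
rbtreeMaybe (Node h l a r) = N B <$> go h l <*> pure a <*> go h r

-- Counterpart of rbtree': hp is the level number of the parent.
go :: Level -> Tree a -> Maybe (RBTree a)
go hp Empty
  | hp == 1   = Just E
  | otherwise = Nothing
go hp (Node h l a r) = N color <$> go h' l <*> pure a <*> go h' r
  where
    h'    = h `max` (hp - 1)             -- adjusted level number
    color = if hp == h then R else B     -- red iff same level as the parent
```

A perfect tree with three nodes is colored with a black root and two red children, whereas a left-leaning chain of three nodes is rejected, since its right spine ends one level too early.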

It is relatively easy to see that rbtree yields a valid red-black tree. The converse is not so obvious: can we be sure that the given tree is not colorable if rbtree signals an error? It turns out that the correctness of the algorithm is best shown using an alternative characterization of red-black trees. Define the min-height of a tree as the length of the shortest path from the root to an empty node and the max-height as the length of the longest path. A binary tree t is said to be half-balanced [12] if for every subtree u of t,

    $\tfrac{1}{2}\,\textit{max-height}\ u \le \textit{min-height}\ u$.

Every red-black tree is half-balanced because height t ≤ 2 black-height t and black-height t ≤ min-height t. The function rbtree can be viewed as a constructive proof of the reverse implication. One must essentially show that the first parameter of rbtree' satisfies the following invariant

    $\tfrac{1}{2}\,(\textit{max-height}\ t + 1) \le \textit{hp} \le \textit{min-height}\ t + 1$.

If the input tree satisfies the AVL property [2], the algorithm can be slightly simplified: the test hp == 1 becomes obsolete and h' may be safely replaced by h. The resulting function already appears in the seminal paper on red-black trees [3, p. 295].
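The half-balance condition is easy to state as a predicate. The sketch below assumes a plain binary tree without level numbers; the type `BinTree` and the function names are ours, and the quadratic running time is accepted for the sake of brevity.

```haskell
data BinTree a = Leaf | Bin (BinTree a) a (BinTree a)

-- Length of the shortest and of the longest path to an empty node.
minHeight, maxHeight :: BinTree a -> Int
minHeight Leaf        = 0
minHeight (Bin l _ r) = 1 + min (minHeight l) (minHeight r)
maxHeight Leaf        = 0
maxHeight (Bin l _ r) = 1 + max (maxHeight l) (maxHeight r)

-- Half-balanced: in every subtree the longest path is at most
-- twice as long as the shortest one.
halfBalanced :: BinTree a -> Bool
halfBalanced Leaf          = True
halfBalanced t@(Bin l _ r) =
  maxHeight t <= 2 * minHeight t && halfBalanced l && halfBalanced r
```

A node with a single child is half-balanced (max-height 2, min-height 1), while a chain of three nodes is not (max-height 3, min-height 1).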

It is high time to see the algorithm in action. Here are the colored variants of the trees shown in Fig. 1.

It is not hard to show that rbtree produces trees with a maximal proportion of red nodes. However, we already know that the skinny tree on the left hand side contains the smallest possible number of red nodes (see Section 3). Both results imply that there is exactly one way of coloring skinny trees.

If rbtree is applied to a left-complete tree or to a tree generated by bottom-up', it produces a red-black tree that contains the maximal possible number of red nodes among all trees of that size. Note that it is actually desirable that a tree contains many red nodes since the balancing operation bal takes only black nodes into account. To summarize: let quasi-left-complete be a variant of bottom-up' that constructs an uncolored tree of type Tree a. Then

    build :: [a] -> RBTree a
    build = rbtree . quasi-left-complete

builds a red-black tree that has minimum path length and a maximal portion of red nodes.

Sketch of proof. It remains to show that build constructs a red-black tree with a maximal portion of red nodes. The proof is largely analogous to the one in Section 3. This time we use transformations that do not decrease the number of red nodes. In particular, we employ the transformation $s \stackrel{k}{\Longrightarrow} s'$ with $k \ge 0$, $\sum s = \sum s'$, and $|s| - k = |s'|$. This transformation replaces $s$ by $s'$ in a certain level and decreases the level above by $k$ (numbers must not become negative). A given 2-3-4 tree can be transformed into the desired shape by repeatedly applying the following transformations from bottom to top (permutations such as $3\,2 \Longrightarrow 2\,3$ are omitted from the list).

    $0 \stackrel{1}{\Longrightarrow} \epsilon$
    $1\,1 \stackrel{1}{\Longrightarrow} 2$    $1\,2 \stackrel{1}{\Longrightarrow} 3$    $1\,3 \stackrel{1}{\Longrightarrow} 4$    $1\,4 \Longrightarrow 2\,3$
    $2\,2 \stackrel{1}{\Longrightarrow} 4$    $3\,3 \Longrightarrow 2\,4$

We tacitly assume that levels that consist only of a singleton 1-node are silently removed. In the resulting tree each level has the form [2][3]4*, i.e. an optional 2-node followed by an optional 3-node followed by an arbitrary number of 4-nodes. Using an induction on the length of the left spine one can show that the trees generated by build can be transformed into the same shape using only '$\Longrightarrow$' transformations. Finally, trees of this shape are uniquely determined by the size since they correspond to quaternary numbers composed of the digits 1, 2, 3 and 4 and since this number system is non-redundant.
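The non-redundancy of the quaternary system with digits 1 to 4 can be illustrated by a conversion in both directions. This is only a sketch of the number system itself, not of the tree transformation; `toDigits` and `fromDigits` are hypothetical names, and the argument of `toDigits` is assumed to be non-negative.

```haskell
-- Quaternary digits 1..4, most significant first; zero is the empty list.
toDigits :: Int -> [Int]
toDigits = reverse . go
  where
    go 0 = []
    go n = d : go ((n - d) `div` 4)
      where d = (n - 1) `mod` 4 + 1   -- least significant digit, in 1..4

fromDigits :: [Int] -> Int
fromDigits = foldl (\acc d -> 4 * acc + d) 0
```

Every natural number has exactly one such representation: for example, 5 is written as the digits 1 1 and 20 as the digits 4 4, and converting back and forth is the identity.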

Remark. In solving the problem of constructing red-black trees we have answered quite a few exercises to be found in textbooks on data structures and algorithms, most notably exercises 10.9, 10.10 and 10.14 in [14] and exercises 3.9 and 9.7 in [10].

8 Acknowledgement

I would like to thank Chris Okasaki for his valuable comments on an earlier draft of this paper. Adam Buchsbaum, Gerth Stølting Brodal, and two anonymous referees made several helpful suggestions. Thanks are also due to Joachim Korittky for implementing functional METAPOST [9], which was used for drawing the diagrams.

References

[1] Stephen Adams. Functional Pearls: Efficient sets – a balancing act. Journal of Functional Programming, 3(4):553–561, October 1993.

[2] G.M. Adel'son-Vel'skiĭ and Y.M. Landis. An algorithm for the organization of information. Doklady Akademiia Nauk SSSR, 146:263–266, 1962. English translation in Soviet Math. Dokl. 3, pp. 1259–1263.

[3] Rudolf Bayer. Symmetric binary B-trees: Data structure and maintenance algorithms. Acta Informatica, 1:290–306, 1972.

[4] Richard S. Bird. Functional algorithm design. Science of Computer Programming, 26:15–31, 1996.

[5] Richard S. Bird. Functional Pearl: On building trees with minimum height. Journal of Functional Programming, 7(4):441–445, July 1997.

[6] Leo J. Guibas, Edward M. McCreight, Michael F. Plass, and Janet R. Roberts. A new representation for linear lists. In Conference Record of the Ninth Annual ACM Symposium on Theory of Computing, pages 49–60, Boulder, Colorado, May 1977.

[7] Ralf Hinze. Functional Pearl: Explaining binomial heaps. Journal of Functional Programming, 9(1):93–104, January 1999.

[8] Donald E. Knuth. The Art of Computer Programming, Volume 3: Sorting and Searching. Addison-Wesley Publishing Company, 2nd edition, 1998.

[9] Joachim Korittky. Functional METAPOST. Diplomarbeit, Universität Bonn, December 1998.

[10] Chris Okasaki. Purely Functional Data Structures. Cambridge University Press, 1998.

[11] Chris Okasaki. Functional Pearl: Red-Black trees in a functional setting. Journal of Functional Programming, 9(4), July 1999. To appear.

[12] H.J. Olivié. A new class of balanced search trees: Half-balanced search trees. RAIRO Informatique théorique, 16:51–71, 1982.

[13] Jörg-Rüdiger Sack and Thomas Strothotte. A characterization of heaps and its applications. Information and Computation, 86(1):69–86, May 1990.

[14] Derick Wood. Data Structures, Algorithms, and Performance. Addison-Wesley Publishing Company, 1993.


An experimental study of compression methods for functional tries

Jukka-Pekka Iivonen (Nokia Telecommunications), Stefan Nilsson (Royal Institute of Technology), Matti Tikkanen (Nokia Telecommunications)

RSRHTVUEWSXSWSYSRSZSR\[�]SR ^SWSRS_S^S^SUSWSYSWS`EaS`\[�ZSbSc\[�^SX dJ`SbSbSR\[�bSRSZSZS`SWSXSWEYSWSUSZSRS`\[�eSUHd

Abstract

We develop compression methods for functional tries and study them experimentally. Trie compression is usually implemented either as a combination of path compression and width compression or as a combination of path compression and level compression. We develop a new efficient implementation for width compression and show that in functional tries the combination of all of the three compression methods yields the best results when taking into account the trie size, the trie depth, the copy cost, and the search and update performance. We also show that the 4-ary trie minimizes the copy cost in balanced trees and compare our experimental results for tries to approximate analytical results for internal and external B-trees. Our conclusion is that the path-, width-, and level-compressed trie is an ideal choice for a functional main-memory index structure.

Keywords: functional data structures, imperative data structures, trie, shadowing, path copying, path compression, width compression, level compression

� ��ñxúqöE���xÿ��`ú�÷��7ñno�Qs�#�$�p���s�&2��$q�P��r���w �'uv��$�&'��+H-.&'��p�}\#�(Ss�#�uvr�w���-.-.�'#�$\uv��&,x�#�p�-( #�wS( ��$�s�&,�'#�$���+2&,w��'��-���1�x���w ����-.#�$\�Qx�}q����s�x�#�#�-.�&'#t-�&,��p�}t&'w��'�6-�&,w���s�&'��w���-J��-��,$��t����r���w��,uv��$�&,��+�w���&,x���w�&'x���$|��$���+'}�&'�'s���+zuv��&'x�#�p�-L�'-{&'x���&�&,x��6(�#�w�u�#�(�\&'w �'�6p���r���$�p�-#�$t&,x��6���'$�pv#�(�p���&'�*-.&'#�w���p3�� �$���+'}�&'�'sQ-.&'��p��,��-V��+,����}�-V$�����p�&'#t��-.�Q�*p���&'�6uv#�p���+,��-���s�x���-J��$��'(�#�w�u�p��,-�&'w �'����&,�'#�$���'$�p���r���$�p���$�&Ew���$�p�#�u¡-���uvr�+'��-*(�w #�u¢�|s���w &'���'$�p��'-.&'w �'�O��&'�'#�$o(���$�s�&'�'#�$���#�w{£���w�$�#���+'+'�'��&'}�r��vr�w #�s���-.-.��-<����&�'-\����+'+��$�#��Q$�&,x���&�&,w��,��-Ex��P!O���L+'#��¤�G!O��w ������p���r�&'x�( #�w���+'+�&'x���-.�L&'}�r���-S#�(zp���&'��������&��,&��,-�$�#�&O�'uvuv��p��,��&'��+'}�s�+'����w�x�#��¥&'x���-��w ��-���+,&'-S&'w ��$�-�+'��&'��&,#6w ����+O��#�w +'p�p���&,�H�<��$�&,x��,-�-.&'��p�}����Lx��G!��y��-���p�-���uvr�+'��-S#�(�¦E$���+'�,-�x�&'�P��&'��� $�&'��w $���&�w #���&'�,$���&,����+'��-����$�p6����#���w ��r�x��'syr�#��'$�&Hp���&'�V�'$6��p�p��'&'�,#�$Q&,#*-.}�$�&'x���&,�'sVp���&'�H�

no����w ��s���w w���$�&'+,}*p���-��,��$��,$��*��$�p\�,uvr�+'��uv��$�&'�,$��6��r�w�#�&'#�&'}�r���� ����-.��p�-�&,w��,s�&�(���$�s�&,�'#�$���+2p���&,������-.�{r�w�#���w���uquv�'$��+'��$�����������s���+'+'��p�%�x��'$���-��<%ix��'$���-����'+'+Or�w #O!��'p���&'x���r�w #���w ��uquv��w����'&'x������'+'&'� �'${uv��r�-S��$�p{-.��&'-�&,x���&���w��y�'uqr�+'��uv��$�&'��p��}Qr���w�-.�'-.&'��$�&�( ��$�s�&'�'#�$���+�&,w��'��-���� $6#�w p���wi&,#Q#���&,���'$Quv�����,uv��+Os�#�uvr���&'��&,�'#�$���+���(��is��'��$�sP}��<���Jx��G!O���'uvr�+,��uv��$�&,��p�#���w(���$�s�&,�'#�$���+yp���&,��-.&'w���s�&'��w���-\�'$0§�#�$0&,#�r¨#�(�%�x���p���-�©�����ª �J��r���w -��,-�&,��$�&�(���$�s�&,�'#�$���+yx�����r¨-.��r�r�#�w�&,�'$��¨����&'#�uv��&,�'sp��'-.��� ����s��O��p6uq��uv#�w }*uq��$�������uv��$�&��'$\��-.#�(�&�w ����+'��&'�'uv�V��$�!��'w #�$�uv��$�&«��%�x���p���-�w���r�+'��s���-+'#������'$��6�Q�'&'x*-�x���p�#����'$��s�#�uv���'$���pJ�Q�,&'xQw�����+,��&'�,uv�y����$���w ��&'�,#�$���+�-�&,#�r�� ��$�p���s�#�rO}V����w���������s�#�+'+,��s�&'�'#�$3��%�x���p�#����'$����'-E��-�-.��$�&,�'��+'+'}��L-.}�$�#�$O}�u(�#�wi&'x��{( ��$�s�&'�,#�$���+O��r�p���&'�Lr�#�+'�,s�}Q��x���w �V��$*��r�p���&'�Ls���$*���V!��'�G����pQ��-��{&,x�w ������-.&'��rQr�w�#�s���p���w �2�P¬i�,w�-.&2�{s�#�r�}Q#�(3&,x��uv��uq#�w }�s���+,+S-����G­«��s�&'��p|&,#�&,x��q��r�p���&,�*�,-�s�w�����&,��p3�L1Ex���$|&'x��\��s�&'����+���r�p���&'�*�,-�r���w (�#�w�uv��pv#�$�&'x��\s�#�r�}��i��$�p�+,��-.&�Qw���(���w ��$�s���&,#t&,x���uv��uv#�w�}vs���+'+ix�#�+,p��'$��q&'x��Qs�#�rO}v�'-Vw���&'��w�$���p3�3� $tw ��s���w -���!O��-.&'w ��s�&'��w���-�-���s�x���-J&,w�����-L��$���r�p���&'�+'����p�-�&'#\-.x���p�#����'$���&'x��J��$�&'�'w �J-.����w�s�x*r���&,x��<1Ex��'-��'-���+,-�#6��$�#���$Q��-�r���&'x6s�#�rO}��,$��t©���®Oª¯�

��$J�'&,-S#�w �'���'$���+�( #�w�u�&'x��&,w��'�6© °�°��°G��ª��,-S��p���&'��-�&,w���s�&,��w ���x���w�����-���&�#�(�-�&,w��'$���-�( w�#�u���$J��+,r�x�������&�s�#�$�&'���'$��'$���±s�x���w���s�&'��w�-E�'-�-.&'#�w���p6�,$*��$q±�����w }Q&²w ���L��$�p6����s�xQ-�&,w��'$��Qs�#�w�w ��-.r�#�$�p�-S&'#*�V��$��'³�����r���&'x3�<1Ex���w��L��w �J-.�G!O��w���+�uv��&'x�#�p�-( #�wS�,uvr�+'��uq��$�&'�'$��\&,w��'��-.&'w���s�&'��w���-y�'$v&,x���+'�'&'��w���&'��w���©�´O�2µ��2¶O�E°P®O�E°G·Oª¯�z��$���#�(E&'x���p�w ��������s���-�#�(E&'x���-���uv��&'x�#�p�-y�'-&'x���&�&'x��P}\$�����p*s�#�$�-.�'p���w�����+'}Quv#�w��{uq��uv#�w }*&'x���$\������+'��$�s���p6-.����w�s�x*&'w�����p����{&'#q��uvr�&'}\-.����&'w�����-<�Hno��-�x�#���x�#��&'#6�G!�#��'p6&'x��'-�r�w�#���+'��u¸��}*s�x�#�#�-.�,$��Q�J-.uv��+,+2��+,r�x�������&���$�p6��r�r�+'}��'$���&'w �'�Js�#�uvr�w���-�-.�'#�$3�


1Ex��V�G!O��w���������s���-�������x��G!��'#�wi#�(i&'w��,�J-.&'w ��s�&'��w���-Ex���-�������$Q&'x��J-.���G­º��s�&�#�(z&,x�#�w #�����x�&,x���#�w ��&,�'s���+���$���+'}�-.�'-{©�°G¹��z°G¶����´O����µOªG»P��$Q�P��&'��$�-.��!���+,�'-.&H#�(zw ��( ��w���$�s���-3s���$Q���L(�#���$�p��,$Q¼Q��$�p���#�#��{#�(z1Ex���#�w���&'�,s���+O§�#�uvr���&,��w�%�s��,��$�s��6© °G½�ª¯��1�x���P��r���s�&'��pt�P!O��w������*p���r�&'xt#�(y�\&,w��'�qs�#�$�&'���'$��'$���¾0�'$�p���r���$�p���$�&�w���$�p�#�u�-�&,w��,$���-{( w�#�u��qp��,-.&'w��,����&,�'#�$|���,&'x�p���$�-.�'&'}(���$�s�&'�,#�$\¿ ÀÂÁVÃy�,-{Ä�Å�Æ'Ç�ÈL¾�ÉL©�®�ª¯��1Ex��,-w ��-���+'&Hx�#�+,p�-���+'-�#6( #�wzp���&'�V( w #�uÊ�V£L��w $�#���+'+'�,� &'}�r���r�w�#�s���-�-{©"½O��Ë�ª¯�

1Ex��y����-�&O��$�#��Q$Js�#�uvr�w���-�-.�'#�${&,��s�x�$��,Ì����y( #�w�&'w��,��-S�,-�r���&'x�s�#�uvr�w ��-.-��,#�$3��1Ex��y�,p������'-E-.�'uvr�+'�HÍOr���&,x�-3s�#�$�-��,-�&,�'$��#�(3�J-���Ì�����$�s��V#�(3-.�'$���+'����s�x��,+'pQ$�#�p���-���w��Vs�#�uvr�w���-�-.��p�����--.x�#���$Q�'$*¬��'����w��\°P���< �r���&,x*s�#�uqr�w ��-.-���pQ���,$���w�}Q&'w �'�{�,-#�(�&'��$*w ��(���w w���p�&'#*��-��{����&'w �'s��'�{&'w ���2������&'x\s�#�uvr�w ��-�-��'#�$6uv��}6w���p���s��L&,x��V-��'���{#�(�&'x��J&,w��,�{p�w ��uv��&,�'s���+'+,}2�<� $*(Î��s�&'��&'x��$���uv����w�#�(i$�#�p���-��,$Q�Jr���&,x6s�#�uvr�w ��-.-.��p����'$���w�}�&,w��,�J-.&'#�w��,$��\¾|���P}�-E�,-LÏO¾�Ð�ÑO�<1�x��V��-.}�uvr�&'#�&'�'s��P��r���s�&'��p��G!���w������p���r�&'x���x�#����G!O��wÒ�<�'-&,}�r��,s���+'+,}Q$�#�&Hw ��p���s���pt©�°GÓO��°GÔOªG�


¬��'����w��|°�ÍS å���'$���w�}v&'w �'�6æ���ç����*r���&,x���s�#�uvr�w���-�-.��p\���'$���w�}v&,w��,�*æ ��ç ����$�pv�6r���&'x�����$�pt+,�G!���+'� s�#�uvr�w���-.-.��p*���,$���w�}v&'w �'�æ�s�çG��1Ex��{p���&,�{-.��&�s�#�$�-��,-�&,-y#�(L°GÓ\-.&'w �'$��\��+'��uq��$�&'-<��§�#�uvr�+'��&,�{-.&'w �'$���-��w �{-.&'#�w���p\�'$\&'x��{+,���G!O��-<�H1Ex��{+'������+'-Lè�иÑ�é�'$�p��'s���&'�{&'x��Jr�w���-���$�s��V#�(3��$*��+'��uq��$�&'��$�#�&���!���+'���V-�&,#�w���p3��êE��w �'����+'�QëV����!���-�&'x���+'#�$�����-�&�s�#�uvuv#�$6r�w �����*-.&'w �'$��\#�(�{r���&'x���s�#�uvr�w���-�-.��p�$�#�p��H�

ìE�G!���+zs�#�uvr�w���-�-.�'#�$¨© °Gª��'-��6uv#�w �*w���s���$�&z&,��s�x�$��²Ì����2���$�s��*�������²$���&'x��\�'p����*�'-{-.�'uvr�+'�2Í�s�#�uvr�+'��&'�6$�#�p���-Væ ��+'+s�x��'+,p�w ��$���w �*$�#�$�� ��uvr�&'}�çS��w��6s�#�uvr�w���-.-.��p�����$�p|&,x��,-Js�#�uvr�w ��-.-��,#�$��'-Jr���w�(�#�w�uv��pv&'#�rtp�#���$��z-.���*¬��'����w���°Ps2�y1Ex��+'�G!O��+,��s�#�uvr�w���-.-���p\&'w �'����ìE§���&'w �'����x���-Lr�w #�!O��p*&'#�����#�(��,$�&,��w���-.&z��#�&'xv�'$v&,x���#�w }q��$�pvr�w ��s�&'�,s��2����&��'-V��$�#���$q&'x���&z&,x���G!O��w �����\�P��r���s�&'��ptp���r�&'x�#�(y��$�ìE§�� &'w��'�\�'-ví�Å�Æ'Ç�ÈJÆ,Ç�ÈL¾�ÉL( #�wp���&'�q(�w #�u~�v+'��wÎ���*s�+'��-�-�#�(�p��'-.&'w �'����&'�'#�$�-v©��OªG��1Ex��'--.x�#���+'p6���{s�#�uvr���w���pQ&,#\&'x��{+,#�����w��,&'x�uv�'s�p���r�&'x6#�(���$�s�#�uvr�w���-.-.��p���$�p*r���&'x6s�#�uvr�w���-�-.��p*&,w��'��-��O1Ex���-.�{w���-���+'&,-��+,-.#&'w ��$�-.+'��&'�V&'#6��#�#�p6r���w�( #�w�uv��$�s��y�'$*r�w���s�&'�'s�������--.x�#���$Q��}*�Vw���s���$�&H-.#�( &'����w��L�'uvr�+'��uv��$�&'��&'�'#�$�#�(z���tw�#���&'�'$���&'����+'��-��-��,$��6��$6ìE§L� &'w �,�q©���î�ª¯�


1Ex���ìE§���&'w �'������-�#�w��'���,$���+'+,}q��-�&,��&,�'s�p���&,��-�&,w���s�&,��w ��&'x���&ip��'p\$�#�&�-.��r�r�#�w�&��,$�-.��w &'-���$�pqp���+'��&,��-<�z12#v#���wS��$�#���+'���p�������&'x��tì���§�� &'w �'�ï©"�3°¯ª{����-*&,x��|��w�-.&�p�}�$���uv�,s�!���w��,��$�&E#�({&'x���ìE§���&,w��'�H��1Ex��tìE��§���&,w��,�|�,-*��$ï�,uvr���w���&,��!��vp���&'�-.&'w���s�&'��w����Sx�#����G!O��wº�6� $�(���$�s�&'�,#�$���+S&'w �'��-.�E��$o��r�p���&'�\+'����p�-�&,#�s�#�rO}��'$���&,x��v��$�&,�'w �v-�����w�s�x�r���&'x�(�w�#�u¡&'x��vw�#�#�&S&'#&'x���+'����(S-.���P­º��s�&�&,#v&'x�����r�p���&'�H��ìE�G!O��+'� s�#�uvr�w���-�-.��p6$�#�p���-yuv��}qx��G!��{+'��w ����#���&,��p��P��w ����-���x��'s�xquv���O��-(���$�s�&'�,#�$���+��r�p���&'��-��P��&'w ��uv��+'}�s�#�-�&,+'}H�\12#��,uvr�+'��uv��$�&S��( ��$�s�&'�,#�$���+�ìE��§���&,w��,���E��r�#�-�-.�,��+,�v��r�r�w #���s�x|&'#��G!�#��,p��,$����P��s���-.-.��!O�s�#�rO}��'$����'$*��r�p���&'��-E�'-�&'#*w���-�&,w��,s�&H&,x��V��r�r�+,�'s�����+'�L#���&,��p��P��w�����-S&'#6���Oð��O½��O��$�p6r�#�-�-.�,��+,}|°G®���1Ex��J-.�,uvr�+,�Vuv��&,x���uv��&,��'s���+i�P����w�s��,-�������+'#��¸w �G!O����+,-��Hx�#����G!O��wÒ��&'x���&i�6���'$���w }q&'w �'�Q�'-V$�#�&��Q��#�#�p\-�&,��w�&,�'$��vr�#��,$�&i�,(�#���wE��#���+��'-V&'#v�O����r�&,x��s�#�-.&2#�(is�#�rO}��'$��Q��--.uv��+'+���-�r�#�-.-.�'��+'�2�

ñE��&O��-S��-�-���uv��&'x���&�#���w�&'w �'�L�'-S#�(i�Lp��P��w ���JòQ��$�p���+,+�+,���G!O��-S��w�����&�&,x���-���uv��+'�G!O��+��,$�&'x���&'w �'�H�<1�x�����r�r�w�#����'uv��&,�s�#�rO}\s�#�-.&��'-�����!���$\��}q&'x���( #�w uv��+,�Qóyô�ò�õ�ö÷ô�òtø�ù2õ�ú,û�ü2ýþÿ��x���w �6ù\�'-�&'x���-.�'����#�(S��p�uv�'$��'-�&'w ��&'��!O�Jp���&'���,$v��&'w �'�$�#�p�����òtø¸ù*&'x���-��,����#�(S&'x���&'w �'��$�#�p��{�,$q��#�w�p�-���þÿ&'x���$���uv����w3#�(S+'���G!O��-���#�w���+'��uv��$�&'-y-.&'#�w���p�������$�ptú,û�ü ý þÿ&,x��r���&,xt+,��$���&'x3�Sn0�'&'x�ù ö��{&'x��Quv�'$��'uv��u #�(�óyô�ò�õ��'-V����!���$v��}�ó���ô ò�õqö�<���«� �H�LòJú�6ò ��ò ö����S§�#�u�r���&'�'$��v&'x����r�r�w�#����,uv��&,�6-.#�+'��&'�'#�$t(�#�w{ò|}��,��+'p�-6ò�ö���� �����V1Ex��*p��,-�s�w���&,�\!���+'����-{s�+'#�-���-�&S&,#�&'x��\uv�,$��'uv��u���w �\����$�p�ð��3��$�póyô��Hõ��¸óyô��Hõ�( #�wi��+'+zþ�������1Ex���-.���{( ��$�s�&,�'#�$���+OÌ�����&,��w $���w }�&'w��,�Juv�,$��,uv�'����-�&,x��Vs�#�rO}6s�#�-�&º�

1Ex��*w���&,�'#�$���+'�6����#O!��6x�#�+,p�-{#�$�+'}t(�#�wy&,w��,��-�-.&'#�w��,$��|��$��'( #�w uv+'}tp��'-.&'w��,����&,��ptp���&'�q#�wys�+'��-�&,��w���p�p���&'�H�{¬�#�wy#�&,x���wp��'-.&'w��,����&,�'#�$�-6#�(Vp���&'�t&'x��|&,#�r�uv#�-�&�&'w �'�t$�#�p���-Qs���$�$�#�&����t��-�-.��uv��pï&,#ox��G!O�t#�$�+'}o( �G����uvr�&'}os�x��,+'p�w���$3�|¦Euvr�&,}-.����&'w �'��-6�'$�s�w ����-.�|&,x��ts�#�-.&#�({s�#�rO}��,$��o��$�p�&,x��t#O!O��w���+'+�uv��uv#�w�}�s�#�$�-���uvr�&'�'#�$3�|no��x��G!O�|��&#���wJp��,-�r�#�-���+}���&��$�#�&'x���w2s�#�uvr�w���-.-.�'#�${uv��&'x�#�p���x�#O���G!O��wÒ��&,x���&�s���$�������-.��p�(�#�w�w���uv#�!��'$��J&'x��y��uvr�&'}�-.����&,w��'��-�Í����'p�&,x�s�#�uvr�w ��-�-.�'#�$3���$J����'p�&'x�� s�#�uvr�w ��-.-���pL&,w��'���<&'x��#���&,��p�����w����S#�(2&'x��&,w��'���'-�������p��P����&�#�$�+'}V$�#�$�����uvr�&'}L-.����&'w �'��-i��w�����s�&'����+,+'}Jr�w���-���$�&,�¬��'����w �J�������¯��nÂ�,p�&,x*s�#�uvr�w���-�-.�'#�$6s���$*��+'-�#6���Js�#�uv���'$���p6���'&,x*r���&'x6s�#�uvr�w ��-.-��'#�$���¬��'����w �V�������G�


¬��'����w����3Í�  ���'p�&,x�� s�#�uvr�w���-�-.��p{Ì�����&'��w�$���w }�&'w �'�3���4������$�p6�Lr���&,x�����$�p����,p�&,x�� s�#�uvr�w���-.-.��p�Ì�����&,��w $���w�}J&'w �'�3�����G�<1Ex��p���&'�Q-.��&i#�(�¬��'����w��t°��'-V��-.��p���nÂ�'p�&'x�s�#�uvr�w ��-.-.�'#�$q�'-Vw�����+'�'����pv��}v�6���'&'uv��rv-�&,#�w���pv�'$t����s�xv$�#�p��2�L°�-�&,��$�p�-L(�#�wE�$�#�$�����uvr�&,}�s�x��,+'pQ��$�p65*( #�wi��$Q��uvr�&,}�s�x��'+,p���1Ex��L#�w�p���w �'$���#�(i���'&,-��,$�p��'s���&'��-�&,x��L#�w�p���w��,$���#�(3$�#�$�����uvr�&'}Js�x��'+,p�w ��$��� $7����� �H!���w��'����+'��8V����!���-y&,x��J+'#�$�����-�&Hs�#�uvuv#�$*r�w������\-.&'w��,$��\#�(S��r���&'x�� s�#�uvr�w���-.-.��pQ-.����&,w��'�H�H Q-����(�#�w�����&,x��{+'������+,-�9�:�4��#�$�+'}Q�,$�p��'s���&'�V&'x��Vr�w���-���$�s��L#�(3��$*��+'��uv��$�&«�

1Ex��6r�w��,uv��w }v��#���+z#�(+,�G!���+zs�#�uqr�w ��-.-��,#�$v�'-V&'#tw���p���s��Q&'x��6p���r�&'xv#�(&,x��Q&'w �'�H�no�\s���$���+'-.#t��r�r�+'}�+'�G!O��+is�#�uv�r�w���-�-.�'#�$�&'#��6r���&'x��E��$�p����'p�&'x�� s�#�uvr�w���-�-.��p\&'w �'�6��-V+'#�$��v��-V�Q�6�O����rv&'x��*s�#�-.&i#�(�s�#�rO}��'$��vw�����-.#�$�����+'���H¬i�,����w �6�3�)�#�&,��&'x���&��'$��tr���&,x��J��$�po���,p�&'x�� s�#�uvr�w���-.-.��p|&'w��,��+'�G!O��+�s�#�uvr�w���-�-.�'#�$��'-6��r�r�+'�,��p�&,#ï-.r���w -���+'}�r�#�r���+,��&'��p�$�#�p���-<�1Ex��'-y�,-yÌ����'&,�{s�#�$�&,w���w }6&'#q&,x��{����}q+'�G!O��+2s�#�uvr�w���-.-.�'#�$6��#�w���-��'$q&,x��{r���&,x��3��$�p*+'�G!O��+'� s�#�uvr�w���-�-.��pQ&'w �'�J��x���w��J#�$�+,}


s�#�uvr�+'��&'�{$�#�p���-��w �Q+,�G!O��+,��s�#�uvr�w���-�-.��p���� $v&'x��Qr���&,x����2���,p�&,x�� �2��$�pq+,�G!���+'��s�#�uvr�w���-.-.��p6&'w �'�Q����-�&,��w &z���'&,xv�Q&'w��,��#�(&'x��J+,��w����-.&Hr�#�-.-��,��+'�J#���&'� p��P��w����y����s�����-.�J���'p�&'x*s�#�uvr�w���-�-.�,#�$�w ��uq#�!O��-���+'+H��uvr�&'}6-�����&'w �'��-<�<>=


¬��'����w��L�3Í�  r���&'x��i��$�pQ�Q�,p�&'x���s�#�uqr�w ��-.-���p�x�������p���s��'uv��+�&'w �'�Y����� ����$�pQ&'x��Js�#�w�w ��-.r�#�$�p��'$��{r���&,x�� �����'p�&'x��i��$�pQ+'�G!O��+,�s�#�uvr�w���-.-���pV&'w �'�Z� ���G�<1Ex��p���&'�y-.��&O#�(H¬��'����w���°��'-���-.��p3�<� ${&,w��,�Z� �[� �<&,x��yw�#�#�&�#�(�&,w��,�Z���4�2�,-S&'w ��$�-.(�#�w�uv��pJ�,$�&,#{�ys�+,��-.&,��w#�(3Ì�����&,��w $���w�}�$�#�p���-�&'#*w���p���s��L&'x��{s�#�-.&H#�(zs�#�rO}��'$��3�< Q-����(�#�w ����&'x��V+'������+,-]\_^:`4a�#�$�+'}6�'$�p��'s���&'�V&'x��Vr�w ��-.��$�s��V#�(��$*��+'��uq��$�&«�

§�#�uvr�w ��-.-.�'#�$�uq��&'x�#�p�-�( #�wL(���$�s�&'�'#�$���+E&'w �'��-6��w���x���w p�+'}�p��'-.s���-.-.��p��,$o+'�'&'��w���&'��w �2�cb�#�������+'&J© °G�OªVp���-.s�w �'����-Q�-.}�-.&'��uv��+,�G!���+��'uvr�+'��uv��$�&'��&'�,#�$q#�(��6( ��$�s�&'�'#�$���+����'$���w }v&'w �'�Q��$�p���-���-V&,x��Q&,w��'�6(�#�wE�'$�&'w #�p���s��'$��\-.��&'-V��$�p�uv��r�-L�,$�&'#�q( ��$�s�&'�,#�$���+�-.��&,��#�w��,��$�&,��ptr�w�#���w���uvuq�'$���+'��$����������H�{¼��qp�#���-�$�#�&S��r�r�+,}���$O}�s�#�uvr�w ��-.-��,#�$|uv��&'x�#�p�-{&,#�&'x��q&,w��,���x�#����G!���w«��������-.������©����OªS-.x�#��Q-�x�#�� uv��r�-���w ����-.��p\�'$q��( ��$�s�&'�,#�$���+Hr�w�#���w���uvuv�,$��6+,��$����������J&,#q�'uvr�+,��uv��$�&�&'w��,��-<�¦E!O��$q&,x�#�����xquv��r�-�s���$�����!��'�¯����pv��-L�Quv����$�-�&'#��,uvr�+'��uv��$�&����'p�&,xvs�#�uvr�w���-.-��,#�$��H&'x��P}qp�#�$�#�&���w��,$��q����#���&�&'x��-��G!��'$���-6�'$�uv��uv#�w }���&'�'+'�,����&,�'#�$o���|��w ��-�&'w ��!��'$��ï( #�w«�|12#ï#���wV��$�#���+'��p�������&,x��,-*�,-\&,x��t�iw -�&��P��r���w �,uv��$�&'��+E-�&'��p�}s�#�uvr���w��'$�����(���s��,��$�&H-.}�-�&,��uq��+'�G!O��+�s�#�uvr�w ��-.-.�'#�$�uv��&,x�#�p�-E(�#�w3( ��$�s�&'�,#�$���+O&,w��'��-<�

d e �7øùõêöKò`ý�ý�÷�7ñ øùò`ú������xý ���7ö���ÿxñt�.ú�÷�7ñxû7ü�ú�ö`÷aò.ý��$v&,x��,-L-���s�&'�,#�$v��������!����Q��w��,��(S#O!O��w !��'�G��#�(E&,x���s�#�uvr�w ��-.-��,#�$q&'��s�x�$��'Ì�����-y������-��Q( #�wE( ��$�s�&'�'#�$���+2&,w��,��-<�Sno�Q-.&'��w�&���'&'x6&'x��Jp����i$��'&'�'#�$�#�(z�J���'$���w�}Q&'w �'�Vp���r��,s�&,��p��,$*¬��'����w��*°����4�¯����&�s�#�$�&'���'$�-E-�&,w��,$���-�( w #�u¸&'x��V��+'r�x�������&gfihJjk\gl4`�m��no�{-.��}*&'x���&��{-.&'w �'$��_n �'-�&'x��6o�p C�@�K$q�r #�(3&'x��J-.&'w �'$��9s��O�'(�&,x���w �J�'-��J-.&'w �'$��_tQ#�(z+'��$���&'x7oz-.��s�x*&'x���&gs,hutLnv�v��4wgx�y j y'��xcz|{ ���'$���w�}Q&'w �'� >�5�A�B,9�8,A�8,A�N�}�;�=,;�?�;�A�B,C�8,C96B�~º;�;���8,B'M*B'M�;2KP5�=,='5[��8,A�N���~ 5��3;�~.B'8,;�C0�

� I K]} h�\�� B,M�;VB�~.8';{8,C9�A*;�?��3B F\B�~�;�;��� I K]} h�`P� B,M�;VB�~.8';J>�5�A�C.8,C�B,C5�KV5�A�;VA�5[��;3~º;���~�;�C�;�A�B,8'A�NQB'M�;J;�=,;�?�;�A�B��� I K7}�� `P� B'M�;QB�~.8';6>�5�A�C.8'C.B'CV5ºK�9�A�5���;Q5�KQB���5|>PM�8'=��[~º;�A[�3��5[~V;�9�>PM�� � j�\�l4`Xm B'M�;4~º;Q8,CJ9�>PM�8,=�� � ��M�8'>PMv8'CV9� 8,A�9�~�F*B�~.8,;{9�A��Q>�5�A�B,9�8,A�C�B'M�; `;p C�@�K$q�rH;�CE5ºKL9�=,=H;�=';�?�;�A�B'C�C�B,9[~.B'8'A�N���8,B'Mc���


no�v��-.-���uv�*&'x���&S��+,+�-.&,w��'$���-{�,$��q&'w �'�\��w��\r�w ��������( w����HÍ�$�#�-.&'w �'$���s���$����\�vr�w������|#�(���$�#�&'x���wº�J��$�r���w�&,�'s���+,��wÒ�i&'x��'-�'uvr�+,�'��-V&,x���&zp���r�+'�,s���&,��-.&'w��,$���-L��w �Q$�#�&���+,+'#�����p3����(E��+,+3-.&'w �'$���-�-�&,#�w���pv�,$�&'x��6&,w��,�6��w��Q��$��'Ì������2�,&z�,-J����-.}�&,#���$�-.��w �&'x���&�&'x��J-�&'w �'$���-y��w ��r�w����i����( w����L��}q��r�r���$�p��'$��Q��-�r���s��'��+�uv��w �O��w3��&�&'x�����$�p*#�(S����s�x\-.&,w��'$��3�H¬�#�w3������uvr�+'��������s���$��r�r���$�pQ&'x��{-.&'w �'$�� ���[���������G&'#\&'x��{��$�p*#�(�����s�x\-.&'w �'$����O ��i$��'&'�J-�&,w��'$��*&'x���&2x���-�������$*�P��&'��$�p���pQ�'$\&,x��'-����}\�'-y#�( &'��$w ��( ��w w���pQ&,#\��-�V-���uv�'� �'$��i$��'&'�V-.&'w��'$��6#�w3-.�'-.&'w �'$��3�

  r���&'x���s�#�uvr�w ��-.-���p|���'$���w }�&,w��'���S¬��'����w��o°�� �������,-6�\&'w��,�v��x���w��\��+'+E-.����&,w��,��-����'&,x��v-��,$���+,�q$�#�$�� ��uvr�&'}ts�x��,+'px��G!O�y������$�w ��uv#O!O��p3�P1�x��yr���&,x��'$�( #�s�#�w�w ��-.r�#�$�p��'$��L&'#�&,x��yw ��uv#O!O��pJ-��,$���+,��� s�x��'+,p{-.����&,w��'��-S�'-Sw ��r�w���-���$�&,��pJ��-E��p��'���'&-.&'w �'$��3�v��4w�xgy j y'��xc�|� r���&'x���s�#�uvr�w ��-.-���p����'$���w }6&,w��,� >�5�A�B'9�8'A�8'A�N���;�=';�?�;�A�B'C�8'C�9\B���;�;Y��8'B,M\B'M�;2KP5�=,='5���8'A�N3����5;��;4��B,8';�C� 

¡ I K]�£¢ ��¤ B,M�;VB��.8';{8,C9�A*;�?��3B F\B���;�;�¥¡ I K]�£¢ �P¤ B,M�;VB��.8';J>�5�A�C.8,C�B,C5�KV5�A�;VA�5[¦�;3�º;�����;�C�;�A�B,8'A�NQB'M�;J;�=,;�?�;�A�B�¥¡ I K]�¨§ �;¤ B,M�;JB��.8';{>�5�A�C.8'C.B'C�5ºKJ96A�5[¦�;V5ºKVB���5\A�5�A[© ;�?���B F6>PM�8,=�¦��º;�AQ9�A�¦*97ª�8'A�9���F*C.B��.8'A�N�«L5�KJ=';�A�N�B'M ¬ «L¬ ¥i:iM�8'C

C�B���8,A�N�8'C{B'M�;*='5�A�NO;�C.B[����;I­�®�>�5�?�?�5�AtB'5|9�='=3;�=';�?Q;�A�B'CJC.B'5��º;4¦�8'A|B'M�;\B��.8';�¥�¯�5���;�9�>PM±°³²J´ ��µ4�X¶ B,M�;4�º;\8,C{9>�M�8'=�¦ ¤ ��M�8'>PM\8,C�9Y�39�B'M�©�>�5�?Z���º;�C.C�;4¦·ª�8,A�9��ÒFvB��.8';{9�A�¦\>�5�A�B'9�8'A�C�B'M�;±¬ «K¬�¸ � ©�C.@�K�­�®2;�C�5�K{9�='=H;�=,;�?�;�A�B'CC�B,9[�.B'8,A�N�L8,B'Mc«�°�¥

The next step is to apply level compression to a path-compressed binary trie. The resulting trie is a multi-digit trie called the path- and level-compressed trie (see Figure 1), and it is defined as follows.

Definition 2. A path- and level-compressed trie containing $n$ elements is a tree with the following properties:

• If $n = 0$, the trie is an empty tree.
• If $n = 1$, the trie consists of one node representing the element.
• If $n > 1$, the trie consists of a node of $2^i$ children, where $i \ge 1$ is maximal such that all children are non-empty, and a binary string $s$ of length $|s|$. The string is the longest prefix common to all elements stored in the trie. For each binary string $x$ of length $i$ there is a child, which is a path- and level-compressed trie and contains the $(|s|+|x|)$-suffixes of all elements starting with $sx$.

As indicated by the definition above, there are no empty children in a path- and level-compressed trie storing more than one element. If empty children are allowed in the data structure, it is called the relaxed path- and level-compressed trie.

Width compression is yet another method for removing empty children from a trie. In a width-compressed trie, as depicted in the corresponding figure, only non-empty children are present.

Definition 3. A width-compressed trie containing $n$ elements is a tree with the following properties:

• If $n = 0$, the trie is an empty tree.
• If $n = 1$, the trie consists of one node representing the element.
• If $n > 1$, the trie consists of a node of $2^k$ children for some fixed $k \ge 1$. For each binary string $x$ of length $k$ there is a child that is a width-compressed trie, but only non-empty children are represented in the node. Let $c[0], \ldots, c[2^k - 1]$ be the sequence of all children and $p[0], \ldots, p[m]$, $m \le 2^k - 1$, the sequence of non-empty children. Each non-empty $c[i]$ maps injectively and order-preservingly to some $p[j]$.

Path compression can be combined with width compression (see the corresponding figure). The path- and width-compressed trie is defined as follows.


Definition 4. A path- and width-compressed trie containing $n$ elements is a tree with the following properties:

• If $n = 0$, the trie is an empty tree.
• If $n = 1$, the trie consists of one node representing the element.
• If $n > 1$, the trie consists of a node of $2^k$ children for some fixed $k \ge 1$ and a binary string $s$ of length $|s|$. The string is the longest prefix common to all elements stored in the trie. For each binary string $x$ of length $k$ there is a child, which is a path- and width-compressed trie and contains the $(|s|+|x|)$-suffixes of all elements starting with $sx$. Only non-empty children are represented in the node. Let $c[0], \ldots, c[2^k - 1]$ be the sequence of all children and $p[0], \ldots, p[m]$, $m \in [1, 2^k - 1]$, the sequence of non-empty children. Each non-empty $c[i]$ maps injectively and order-preservingly to some $p[j]$.

If we choose $k = 1$ in the previous definition, we get the path-compressed binary trie. The path-, width-, and level-compressed trie generalizes the definition of the path- and width-compressed trie by allowing trie nodes of varying outdegree (see the corresponding figure).

Definition 5. A path-, width-, and level-compressed trie containing $n$ elements is a tree with the following properties:

• If $n = 0$, the trie is an empty tree.
• If $n = 1$, the trie consists of a node representing the element.
• If $n > 1$, the trie consists of a node of $2^i$ children for some $i \ge 1$ and a binary string $s$ of length $|s|$. The string is the longest prefix common to all elements stored in the trie. For each binary string $x$ of length $i$ there is a child, which is a path-, width-, and level-compressed trie and contains the $(|s|+|x|)$-suffixes of all elements starting with $sx$. Only non-empty children are represented in the node. Let $c[0], \ldots, c[2^i - 1]$ be the sequence of all children and $p[0], \ldots, p[m]$, $m \in [1, 2^i - 1]$, the sequence of non-empty children. Each non-empty $c[j]$ maps injectively and order-preservingly to some $p[l]$.

As indicated by our simple analysis in Section 1, a quaternary trie minimizes the cost of copying. To avoid binary nodes, we have to define $i = 2^j$, $j \ge 1$. Width compression can be applied to tries of arbitrary degrees. The choice of outdegrees depends on the method that is used for implementing the set of labels for non-empty children and the injective and order-preserving mapping from the $c[\cdot]$'s to the $p[\cdot]$'s. We use a bitmap for the set of labels of non-empty children (see the corresponding figure). The bitmap is part of the one-word administrative data. In our system, eight bits are reserved for low-level type information. As we have only 24 bits at our disposal in a 32-bit machine architecture, the upper limit to the outdegree is 16, a hexadecimal subtrie.

3 Implementation

When implementing a functional path-, width-, and level-compressed trie, we are confronted with the following issues. First, we have to specify the outdegrees we are able to support. The mere outdegree is not enough, however, because we want to keep the copy cost reasonable. Thus an additional parameter, the upper bound for non-empty children, must be specified for each outdegree. Then we have to find an efficient method for the manipulation of bitmaps in width-compressed nodes. When using a bitmap representation, it is simple to specify whether a child is empty or non-empty. It is less obvious how to efficiently compute the location of a non-empty child in an array storing only non-empty children. Finally, we have to find the most suitable representations for variable-size trie nodes and longest common prefix strings.
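The mapping from all children to the stored non-empty children can be sketched without any bit tricks. This is only an illustration under stated assumptions: `compress` and `child` are hypothetical helper names, an empty child is modelled as `None`, and the labels of the non-empty children are kept in an ordinary sorted list rather than in a bitmap.

```python
from bisect import bisect_left


def compress(children):
    """Drop empty children; keep the labels of the survivors in order."""
    labels = [i for i, c in enumerate(children) if c is not None]
    packed = [children[i] for i in labels]
    return labels, packed


def child(labels, packed, i):
    """Return logical child i, or None when that child is empty."""
    j = bisect_left(labels, i)          # position of label i among the labels
    if j < len(labels) and labels[j] == i:
        return packed[j]
    return None
```

The position `j` is the number of non-empty children with a smaller label, so the mapping from labels to positions is injective and order-preserving. A bitmap representation stores the same label set in a single machine word.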

Figure 4: The average and maximum costs of copying as a function of the upper bound for the number of children.

We have implemented our data structure in C on top of Shades. Shades organizes data in cells, each having a one-word header reserved for administrative data. In the header, eight bits are reserved for low-level type information devoted to the internal use of the memory manager. We have restricted the outdegrees of our trie to 4 and 16 and set the upper bounds for non-empty children to 4 and 13, respectively. The number of children specifies the low-level type of a node. For example, there are three low-level types for a quaternary node: one having two, one three, and one four children.

We have specified the upper bounds for non-empty children experimentally. To find the average and maximum costs of copying, we ran update tests for a hexadecimal trie with different upper bounds for non-empty children using 100,000 unique uniform random keys, see Figure 4. Note the abrupt increase in the average and the maximum costs of copying when full hexadecimal nodes are allowed. This increase is distribution dependent and caused by full nodes at the upper levels of the trie when storing random keys.

Figure 5: The size of the trie as a function of the upper bound for the number of children.

The same data set was used to measure the overall memory consumption for each upper bound, see Figure 5. We have chosen 13 as the upper bound for a hexadecimal node, because it provides a good trade-off between the cost of copying and the overall memory consumption. Another reason for our choice was that a hexadecimal node of 14 children gives rise to a cluster of quaternary nodes where the father is full and each of its children has at least two children of its own. Thus none of the children in the cluster is path-compressed. This speeds up searches within the cluster of quaternary nodes, because the path compression checks and the node type checks can be eliminated altogether; see the pseudo-code for the function PWL-lookup below.

PWL-lookup(t: Trie; key: Key to be searched)
    if t is a hexadecimal node then
        n: the number of bits in the longest common prefix of t
        b: the next value of four bits succeeding the most significant n bits in key
        q := t.children[Index(t.bitmap, b)]
        PWL-lookup(q, key)
    else if t is a quaternary node then
        n: the number of bits in the longest common prefix of t
        b: the next value of two bits succeeding the most significant n bits in key
        q := t.children[Index(t.bitmap, b)]
        b: the next value of two bits succeeding the most significant n + 2 bits in key
        r := q.children[Index(q.bitmap, b)]
        PWL-lookup(r, key)
    else if t is a node representing the element and t is equal to key then
        t
    else
        nil

Yet another reason for restricting the nominal outdegrees to 16 is an efficient implementation of the set of labels for non-empty children and the mapping from all children to non-empty children. Recall that eight bits of the one-word cell header are used for low-level typing. In the 32-bit architecture, the remaining 24 bits of the header are at our disposal for other purposes. We use 16 bits for a bitmap where a set bit indicates the existence of a non-empty child with a digit equal to the bit position. Now, only non-null pointers need to be stored. The order of the pointers is given by the order of the bits in the bitmap.

The bitmap is used for computing the relative position of a pointer within the child pointer array. An array of $2^{16}$ elements would be required to represent all the positions of the possible child pointer combinations in a hexadecimal node. However, we can do the computation piecewise, eight bits at a time, by using a smaller array of $2^8$ elements. One array lookup is needed to compute the relative position of a pointer in a width-compressed quaternary node. One lookup, or two lookups and an addition, are required for a width-compressed hexadecimal node; see the pseudo-code for the function Index below.

Index(bitmap: 0...0xffff; bit: 0...15)
    map: a 256-element array constant mapping logical array indexes to physical ones
    if bit < 8 then
        map[bitmap & mask[bit]]
    else
        map[bitmap & 0xff] + map[(bitmap >> 8) & mask[bit - 8]]
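The piecewise computation can be sketched in Python with a 256-entry table of bit counts. This is an illustration under assumptions, not a transcription of the authors' Index function: the table name `BITC`, the function name `index`, and the convention that bit position `i` of the bitmap stands for child `i` (least significant bit first) are choices made here for the example.

```python
# BITC[b] is the number of set bits in the byte b (0..255).
BITC = [bin(b).count("1") for b in range(256)]


def index(bitmap, bit):
    """Physical position of logical child `bit` in a 16-bit bitmap node.

    The result is the number of non-empty children with a smaller label,
    computed with at most two table lookups and one addition.
    """
    below = bitmap & ((1 << bit) - 1)    # keep only the bits under `bit`
    if bit <= 8:
        return BITC[below]               # everything fits in the low byte
    return BITC[below & 0xFF] + BITC[below >> 8]
```

For an empty child the function still returns a valid insertion position, which is why a caller that only ever verifies the key at the leaf does not need a separate emptiness check on the way down.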


We are still left with one implementation issue: how to represent the longest common prefix strings. We have solved this by merely storing the length of the longest common prefix string. Note that we need the value of the longest common prefix string in updates only. In searches, it is cheaper to search further and perform a single key comparison in the leaf. Thus, there is always one key comparison per search. This strategy manifests itself in the Index function, too, as no checks are needed for empty children: the labels of empty children are mapped to the closest non-empty children. We have actually chosen a similar strategy for updates, too. When the prefix string is needed, we locate one of the leaves of the current node by searching. Of course, each node could store a direct pointer to a leaf, but the moderate computational overhead due to leaf searching saves us one word per node.
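Storing only the length of the common prefix and deferring the comparison to the leaf can be sketched for the binary case. The names `Leaf`, `Node`, `build` and `lookup` are hypothetical, keys are assumed to be prefix-free strings over '0' and '1', and leaves keep the complete key so that one comparison at the end settles membership.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Leaf:
    key: str  # the complete key, compared once at the end of a search


@dataclass(frozen=True)
class Node:
    skip: int  # length of the common prefix; the prefix itself is not stored
    children: Tuple["Trie", "Trie"]


Trie = Union[None, Leaf, Node]


def build(keys, depth=0):
    """Build a trie over keys that all agree on their first `depth` bits."""
    keys = sorted(set(keys))
    if not keys:
        return None
    if len(keys) == 1:
        return Leaf(keys[0])
    first, last = keys[0], keys[-1]
    skip = 0
    while first[depth + skip] == last[depth + skip]:
        skip += 1
    branch = depth + skip
    zeros = [k for k in keys if k[branch] == "0"]
    ones = [k for k in keys if k[branch] == "1"]
    return Node(skip, (build(zeros, branch + 1), build(ones, branch + 1)))


def lookup(trie, key):
    """Follow branching bits only; compare the key once, in the leaf."""
    pos = 0
    while isinstance(trie, Node):
        pos += trie.skip                 # skipped bits are not inspected
        if pos >= len(key):
            return False
        trie = trie.children[int(key[pos])]
        pos += 1
    return trie is not None and trie.key == key
```

A key that differs from the stored keys only in skipped positions still reaches some leaf, and the final comparison rejects it. The price is that a failed search always walks to a leaf, which matches the trade-off described in the text.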

4 Experimental results

We have compared different compression strategies for functional binary, quaternary, hexadecimal, and multi-digit tries: mere path compression, a combination of path and level compression, and a combination of path, width and level compression. In the functional multi-digit trie, we have restricted the nominal outdegrees to four and sixteen due to the reasons explained in the previous section. The multi-digit trie is the best choice for a functional trie, as it provides a good compromise between the memory requirements, the cost of copying in updates, and the average and maximum path lengths.

4.1 Method

The data structures have been implemented on top of the persistent memory manager Shades. Shades supports disk-backed automatic memory management in a soft real-time environment. Recovery is based on shadowing combined with real-time generational stop-and-copy garbage collection. As shadowing relies upon path copying, recovery does not impose any additional overhead on the implementation of functional data structures when disk writes are eliminated.

It is always problematic to conduct performance measurements in an environment supporting automatic memory management. In updates, there is no exact way of specifying the overhead of automatic memory management. We have used relatively large memory sizes to minimize the overhead of the mature generation garbage collection. We have also directed the output to /dev/null to avoid the overhead of disk writes.

We have implemented six trie structures: the path-compressed binary trie (P-2), the path-compressed quaternary trie (P-4), the path- and width-compressed quaternary trie (PW-4), the path-compressed hexadecimal trie (P-16), the path- and width-compressed hexadecimal trie (PW-16), and the path-, width-, and level-compressed trie (PWL). In the tables below, the tries are identified by the shorter names given in parentheses. For each trie we give its size, its average and maximum path lengths, its average and maximum copy costs per update, and the number of searches and updates per second. The size figures do not include the memory consumption of leaves, because it is equal in all tries.

We have chosen the following test data sets for our experimental study. The first data set is synthetic and consists of 100,000 unique keys generated using a linear congruence random number generator (Table 1). The other data sets are real data: binary strings from Internet routing tables (Table 2), two-dimensional geometric point data (Table 3) visualized in Figure 6, and ASCII strings from the Calgary Text Compression Corpus, which is a standardized text corpus used frequently in data compression research (Table 4).

The performance tests were measured in the Linux 2.0 operating system environment running on a Pentium Pro 200 MHz server with 256 KB internal cache. Before doing the actual measurements we first populated the data structures and created a table of 50,000 elements that were randomly chosen from the data sets in question. Then each test was run for two minutes, and the table of random elements was repeatedly used as input. Note that the update figures do not include inserts and deletes. The cost of inserts and deletes is very close to the cost of updates, because inserts and deletes do not induce any major reorganizations in functional trie structures.

Figure 6: Drill holes in Munich.

Table 1: Uniform random (100,000 unique entries). Columns: Trie; Size; Depth (aver, max); Copy cost (aver, max); Searches per sec; Updates per sec. Rows: P-2, P-4, PW-4, P-16, PW-16, PWL.

Table 2: Mae-east routing table. Columns: Trie; Size; Depth (aver, max); Copy cost (aver, max); Searches per sec; Updates per sec. Rows: P-2, P-4, PW-4, P-16, PW-16, PWL.


Table 3: Drill holes. Columns: Trie; Size; Depth (aver, max); Copy cost (aver, max); Searches per sec; Updates per sec. Rows: P-2, P-4, PW-4, P-16, PW-16, PWL.

Table 4: English text. Columns: Trie; Size; Depth (aver, max); Copy cost (aver, max); Searches per sec; Updates per sec. Rows: P-2, P-4, PW-4, P-16, PW-16, PWL.

¹à Q R>SUTWV�XYTJT7S_ÊLZ1Ex��J&'��-�&Hw���-.��+'&,-�p���uq#�$�-.&'w ��&'�L&,x��V����$�����&'-�#�(3��r�r�+,}��,$�����+'+2#�(3&,x��Js�#�uvr�w ��-.-��,#�$Quq��&'x�#�p�-E&'#�����&'x���wº�<����&'x6s�#�uvr�w ��-.�-.�'#�$\��$�p6���'p�&'x\s�#�uvr�w���-�-.�'#�$Qr�w�#�p���s���&'x��{uq#�-.&2-.�'��$��'�is���$�&H-��G!��'$���-��'$\uv��uv#�w�}Q��&²�'+'�'����&'�'#�$*��$�p6s�#�-.&2#�(�s�#�rO}��'$��3�ñE�G!���+is�#�uvr�w���-�-.�'#�$v( ��w &'x���wEw ��p���s���-�&,x��6-��,���6#�(�&,x��6p���&,�6-.&'w ��s�&'��w �6��$�p��'&'-V�G!O��w������6p���r�&'xv���,&'x�#���&i�*-.����-�&,��$�&'�,��+�,$�s�w�����-.���'$v&,x��{s�#�r�}qs�#�-.&«��1Ex����{��}v+,�P!O��+�s�#�uvr�w���-�-�-.�'#�$\��$�p\���'p�&'xvs�#�uvr�w���-�-.�'#�$6�'$�&'��w���s�&��,-V����-�&�p���uv#�$�-.&'w ��&,��p��}o&'x��\[�»�w��'+,+x�#�+'��-7]���$�p�[�¦�$���+'�,-�x�&,�P��&^]t&,��-.&s���-.��-<�|¦E!O��$o&'x�#�����x�&'x���$���uv����wL#�(Jp�w��,+'+�x�#�+,��-Q�'-*��w ����&'��wL&'x���$&'x��Q$���uv����w�#�(�&'�P��&��'&'��uv-.�i���,p�&,xvs�#�uvr�w���-�-.�'#�$q��$�pv+,�G!O��+�s�#�uvr�w���-�-.�'#�$qr���s��v&'x��Qx��,��x�+'}\s�+,��-.&'��w���pvp�w �'+'+ix�#�+'��-�-.#��(��is��'��$�&'+'}Q&'x���&2&'x��J-.�'���{#�(3&,x��F[�»�w �'+'+�x�#�+'��-7]V&'w �'�J�'--.uv��+'+'��w3&,x���$6&'x��J-��,���J#�(z&'x��4[�¦E$���+'�,-�xQ&'�P��&^]J&,w��,�2�

 �-�-�&,��&'��p��'$�&'x��qr�w��G!��'#���-{-.��s�&'�'#�$����v��-.��p�+'��w ���\uq��uv#�w }�-.�'����-�&'#���+'�'uv�'$���&'�q&,x��q#O!���w�x�����pt#�(�&'x��quv��&'��w �����$���w ��&'�,#�$t����w��������6s�#�+,+'��s�&'�'#�$3�{��$|w�����+S��r�r�+'�,s���&'�'#�$�-.�3x�#��Q�G!O��wÒ�z( w����*uv��uv#�w�}|uv��}|����s�#�uv�\�P��x�����-.&'��p��Jn¤x���$uv��&,��w ������$���w ��&,�'#�$�-y��w ����+'-.#vs�#�+,+'��s�&'��p��2&'x���-.�'���Q#�(S�Qp���&'��-.&'w���s�&'��w���-.x�#���+,pqx��G!O���Qp��'w ��s�&��,uvr���s�&�#�$v&'x���#O!���w���+'+r���w (�#�w uv��$�s��2�i£���s�����-��6#���wEuv����-���w ��uv��$�&'-yp��'p�$�#�&z&'�����Q&'x��'-Ls���uv��+'��&'��!���s�#�-�&z#�(Es�#�rO}��'$��q�,$�&'#t��s�s�#���$�&'�����6w ��$�vuv���'$�� uv��uv#�w�}�!���w �'��$�&3#�(�&,x��v1E��§���£¸����$�s�x�uv��w �¨©�°Gð�ªL&'#��'$�!O��-.&'�'����&'�*&'x��q��(�(���s�&S#�(�&'x��qs�#�+'+'��s�&'�'#�$|#�(�uv��&'��w������$���w ��&'�,#�$�-V��-��,$��|����&,w��'s��'�*&'w�����-.��r���&'x��y��$�p����'p�&'x���s�#�uqr�w ��-.-���pvÌ�����&'��w�$���w }�&'w �'��-.�S��$�p|r���&'x����z�Q�,p�&'x����$�p|+'�G!O��+,�s�#�uvr�w���-.-���pQ&'w �'��-���� $*&,x��J&,��-�&,-y���Vp��'p6$�#�&���$�p6��$O}Q-��,��$��,��s���$�&�p��G!��,��&,�'#�$�-E(�w #�u¸#���wi����w�+,�'��w3w ��-.��+'&,-<�

We have not included balanced trees in this experimental study. Our experiences of running the AVL-tree variant of the main-memory TPC-B benchmark indicate that tries outperform AVL-trees in terms of memory utilization, copy cost, and update and search performance. The AVL-tree is not an optimal choice for a functional balanced tree, however. We end the paper by minimizing the approximate copy cost of a balanced tree by repeating the simple analysis we did in the first chapter for tries. As previously, we assume that all leaves are at the same level in the tree. Thus, our tree is an external balanced tree. The tree consists of internal nodes holding $2k - 1$ words ($k$ pointers to subtrees and $k - 1$ key pointers serving as routers) and $n$ leaves (the stored keys). The approximate copy cost is given by $C(k) = (2k - 1 + a) \log_k n$, where $a$ is the size of the administrative data in a tree node. With $a = 1$ the minimum of $C(k)$ is given by $C'(k) = 0$, that is, $\ln k = 1$, or simply $k = e$. The discrete values closest to $e$ are 2 and 3, and $C(2) > C(3)$ for all $n > 1$. Thus, the external 2-3 tree minimizes the copy cost. Interestingly, the analysis above applies even to internal balanced trees assuming a large $n$.
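The copy-cost argument is easy to check numerically. The sketch below evaluates the approximate cost $(2k - 1 + a) \log_k n$ for integer outdegrees; the function names `copy_cost` and `best_outdegree` and the candidate range 2..32 are assumptions made for this illustration.

```python
import math


def copy_cost(k, n, a=1):
    """Approximate copy cost of an external balanced tree of outdegree k."""
    return (2 * k - 1 + a) * math.log(n) / math.log(k)


def best_outdegree(n, a=1, candidates=range(2, 33)):
    """Integer outdegree with the smallest approximate copy cost."""
    return min(candidates, key=lambda k: copy_cost(k, n, a))
```

With $a = 1$ the cost reduces to $2k \ln n / \ln k$, whose continuous minimum lies at $k = e$. Among the integers the value at $k = 3$ is the smallest, and $k = 2$ and $k = 4$ give the same cost up to rounding.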

°�°�°

� ú9èÏã�áA��î�� ����� èÏá�ã�ã7� ö ç�ç]á ö¿÷ ã�á ö ç�ç]á ö¿÷ ã�á ö ç�çeá ö ÷ ã�á��� ï ó ã�è�ï�î7ç]î�ï�â5è ìs� ä�â5å ã òuó�ôTõ æ�ã�çeè5é ê²ë�ç�ì�ê²ë�í�è&6'�)+*�,�-�.2-��7'�0�,�. ������� ��� ��� �eþ�� ���?@�"$�ü�$"�"9A8 -�,�3�8+)+'�CF8+�"!�#�$ ý����� � �� �� ����� ��H6-�)+#+#WI�,�#+$"9 þ������ � ��� �� ����� ��MN'�C�#%)+9AID8+$7O�8 þ�ý���� � ��� �eþ ����� ���

� �7!�#+$������NI�$��7����-�,�O�)+.4�78+$49;)%�7$�ÿB8+I�$��7����-�,�O�)+.4�78%$D�P��$"-��7C�$D0�$7��8+I�ÿ<�"'�0�8+I�$(�"����-�,�O�)+.��78+$D�P��$7-��"C�$D 7,���¡� 7,�9;8¢*�,�-£$PO�8+$7-�'��7#Eý�ü��8+-�$7$"9K¤6)+8+I�!�3� 7¥�$"8B "�"���7 ")+8+¡F���

¦P��è5ã�á;��î�� ����� è5á�ã�ã ö ç�ç]á ö¿÷ ã�á ö ç�ç]á ö¿÷ ã�á ö ç�çeá ö ÷ ã�áä�â5å ã òuó�ôTõ æ�ã�çeè5é ê²ë�ç�ì�ê²ë�í�è

&6'�)+*�,�-�.2-��7'�0�,�. þ�ÿ ý����� � ��� eþ �eþ�� ���?@�"$�ü�$"�"9A8 -�,�3�8+)+'�CF8+�"!�#�$ ���ý�� �� ��� ����� ��H6-�)+#+#WI�,�#+$"9 ý������ � �� ��ý ����� ���MN'�C�#%)+9AID8+$7O�8 ý�����þ ��uþ ����� ý�

� �7!�#+$D���6�NI�$��"����-�,�O�)+.4�"8+$F9A)+�7$�ÿ<8+I�$D�"����-�,�O�)%.4�78+$4�P��$"-��7C�$D0�$7��8+I�ÿ¢�7'�0�8+I�$D�7����-�,�O�)+.4�78%$F�P��$"-��7C�$D 7,���¡� 7,�9A8�*�,�-§)+'�8+$7-�'��"#Eý�ü��8+-�$"$79A�

��$t12����+,��-{Ó|��$�pt®����q����!��6��r�r�w�#����'uv��&,�'#�$�-�( #�w�&'x��\�G!O��w �����6p���r�&'x��i&'x��\�G!O��w������6s�#�rO}ts�#�-�&3��$�pt&'x��\#O!���w���+'+-�&'#�w������V��&'�'+,�'����&'�,#�$Q#�(i&'x��V����&'��w $���+O��� �6&,w����V���'&'xQ����s�����&�s���r���s��'&'}QÓ*��$�p6&,x��V�'$�&'��w�$���+���� �6&,w������Ow���-�r���s�&'��!O��+,}�����x���$&'x��P}q��w ����-.��pq(�#�w�-.&'#�w �'$��q#���w3&'��-.&�p���&'�H��1Ex�������s��O��&�s���r���s��'&,}\Óquv������-y&'x���-.�'����#�(�&,x�������s��O��&2$�#�p�����Ì�����+2&'#v&,x��-.�'���{#�(3&'x��V�'$�&'��w $���+�$�#�p��H�<no�{��+'-�#*&'���O�V�'$�&'#6��s�s�#���$�&�&'x��V��(�( ��s�&�#�(3!���w��'����+'��� -��'���V$�#�p���-���}Q��-.�'$��6&'x��J-.s���+'�{(Î��s�&'#�w¨z©Yª �'$�s�#�uvr���&'�'$��L&'x���s�#�rO}{s�#�-�&���$�p�-�&'#�w ������w���Ì����'w ��uq��$�&'-<��1Ex��y-.s���+'��( ��s�&'#�w ¨z©Yª ����!O��-�&,x��y��r�r�w #����'uv��&'���G!O��w������-.&'#�w �����Q��&'�'+'�'����&'�'#�$v#�(�x��'��w ��w s�x��'s���+���s�s���-�-Luv��&'x�#�p�-V����-.��pv#�$t&,x��Q���'$���w }v-.r�+'�'&'&,�'$��tr�#�+'�,s�}�©���ªG�E �+'+3&'x��Q(�#�w uv��+'��-��-���p*( #�wis�#�uvr���&'�'$���&'x��{!���+'����-E��w �{���"!O��$��,$\&,x��V��r�r���$�p��,���

It is clear that approximations cannot be taken for measured results. For balanced trees, however, approximations give a fairly reliable estimate of the actual results. In all test cases, the measured sizes of the LPC-trie are closer to the approximated sizes of the external 2-3 tree than to those of the internal 2-3 tree. Excluding the "English text" test case, the measured average depths are closer to the approximated average depths of the external 2-3 tree than to those of the internal 2-3 tree. Excluding uniform random data, the measured copy costs of the LPC-trie are somewhat greater than the approximated copy costs of both the internal and the external 2-3 tree. This is probably due to better balance in the balanced trees. The differences are not significant, however.

It is difficult to draw any reliable conclusions about the performance of the LPC-trie compared to 2-3 trees. Some observations affecting the performance can be made, however. LPC-trie lookups are extremely efficient because only a single key comparison is needed per lookup. In inserts and deletes, all modifications of the LPC-trie are local within the scope of a hexadecimal node. No global operations such as balancing are needed.
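The single-comparison property can be seen in a much simpler structure than the LPC-trie. The sketch below is a plain path-compressed binary trie, not the authors' implementation: it has no width or level compression, and the names `Leaf`, `Branch`, `build`, `lookup` and the 8-bit key width are assumptions chosen for illustration. The descent only inspects individual bits; the full key is compared once, at the leaf.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union

WIDTH = 8  # key length in bits (an assumption for this sketch)


@dataclass(frozen=True)
class Leaf:
    key: int
    value: object


@dataclass(frozen=True)
class Branch:
    bit: int       # index of the tested bit, 0 = most significant
    left: "Node"   # keys with a 0 at position `bit`
    right: "Node"  # keys with a 1 at position `bit`


Node = Union[Leaf, Branch]


def bit_at(key: int, i: int) -> int:
    """Return bit i of key, counting from the most significant bit."""
    return (key >> (WIDTH - 1 - i)) & 1


def build(items: List[Tuple[int, object]], start: int = 0) -> Node:
    """Build a trie from a non-empty list of (key, value) pairs with distinct keys."""
    if len(items) == 1:
        key, value = items[0]
        return Leaf(key, value)
    i = start
    # Path compression: skip every bit position on which all keys agree.
    while len({bit_at(k, i) for k, _ in items}) == 1:
        i += 1
    zeros = [kv for kv in items if bit_at(kv[0], i) == 0]
    ones = [kv for kv in items if bit_at(kv[0], i) == 1]
    return Branch(i, build(zeros, i + 1), build(ones, i + 1))


def lookup(root: Node, key: int) -> Optional[object]:
    """Follow single bits down to a leaf, then compare the whole key once."""
    node = root
    while isinstance(node, Branch):
        node = node.right if bit_at(key, node.bit) else node.left
    # The only full key comparison of the lookup.
    return node.value if node.key == key else None


if __name__ == "__main__":
    trie = build([(3, "a"), (5, "b"), (200, "c"), (201, "d")])
    print(lookup(trie, 5))   # key present
    print(lookup(trie, 4))   # key absent: the final comparison fails
```

Because skipped bits are never checked during the descent, the final comparison is what rejects keys that are absent from the trie.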

Conclusions

In our experimental study we developed and compared different compression methods for functional tries. All of the compression methods have been previously used for imperative tries, but their application to functional tries is hardly discussed in the literature. In addition to applying compression methods to functional tries, we also developed a new method for efficient implementation of width compression that can even be used for imperative tries.

Six different functional trie structures were designed, implemented, and tested using both synthetic and real data: the path-compressed binary trie, the path-compressed quaternary trie, the path- and width-compressed quaternary trie, the path-compressed hexadecimal trie, the path- and width-compressed hexadecimal trie, and the path-, width- and level-compressed trie. When accounting for the trie size, the average path length, the copy cost, and the search and update performance, the path-, width- and level-compressed trie, or the LPC-trie for short, was found to be the best. To our knowledge, the LPC-trie is the first functional trie structure where path compression, width compression and level compression have been efficiently combined.

We also compared some of the results of the LPC-trie to approximations calculated for the external 2-3 tree with bucket capacity 5 and the internal 2-3 tree. We chose 2-3 trees because, as we showed, they minimize the copy cost in balanced trees. In all test cases the size and the average depth of the LPC-trie were close to the approximated average size and approximated average depth of the external 2-3 tree. The approximated copy cost in 2-3 trees was smaller than the measured copy cost for the LPC-trie. The difference was not significant, however, and it is probably due to better balance in 2-3 trees. Further experimental research is still needed, however, to verify the results of our approximate analysis.

LPC-trie lookups are extremely efficient because only a single key comparison is needed per lookup. In inserts and deletes, all modifications of the LPC-trie are local with respect to the current node. No global operations such as balancing are needed. Moreover, most operations, especially group operations, are easier to implement in tries than in balanced trees, and they generalize to multi-dimensional data. Taking additionally into account the moderate size and copy cost of the LPC-trie, we consider it an ideal choice for a general-purpose functional main-memory index structure.
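The notions of local modification and copy cost can be sketched with path copying in a functional trie. The code below is an illustration under stated assumptions, not the LPC-trie itself: it uses a path-compressed binary trie, 8-bit keys, and hypothetical names (`insert`, `_rebuild`), and it measures copy cost simply as the number of newly allocated nodes rather than in the units used in the tables.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union

WIDTH = 8  # key length in bits (an assumption for this sketch)


@dataclass(frozen=True)
class Leaf:
    key: int
    value: object


@dataclass(frozen=True)
class Branch:
    bit: int       # index of the tested bit, 0 = most significant
    left: "Node"   # keys with a 0 at position `bit`
    right: "Node"  # keys with a 1 at position `bit`


Node = Union[Leaf, Branch]


def bit_at(key: int, i: int) -> int:
    return (key >> (WIDTH - 1 - i)) & 1


def insert(root: Node, key: int, value: object) -> Tuple[Node, int]:
    """Return (new_root, allocated_nodes); the old version stays intact."""
    # Find the leaf the key leads to, then the first bit where the keys differ.
    probe = root
    while isinstance(probe, Branch):
        probe = probe.right if bit_at(key, probe.bit) else probe.left
    diff: Optional[int] = None
    if probe.key != key:
        diff = WIDTH - (key ^ probe.key).bit_length()
    return _rebuild(root, key, value, diff)


def _rebuild(node: Node, key: int, value: object,
             diff: Optional[int]) -> Tuple[Node, int]:
    # Copy only the branches on the search path; all siblings are shared.
    if isinstance(node, Branch) and (diff is None or node.bit < diff):
        if bit_at(key, node.bit):
            child, n = _rebuild(node.right, key, value, diff)
            return Branch(node.bit, node.left, child), n + 1
        child, n = _rebuild(node.left, key, value, diff)
        return Branch(node.bit, child, node.right), n + 1
    if diff is None:
        return Leaf(key, value), 1          # same key: replace the leaf
    new_leaf = Leaf(key, value)             # new key: add one branch and one leaf
    if bit_at(key, diff):
        return Branch(diff, node, new_leaf), 2
    return Branch(diff, new_leaf, node), 2


if __name__ == "__main__":
    v1 = Leaf(3, "a")
    v2, c2 = insert(v1, 5, "b")
    v3, c3 = insert(v2, 200, "c")
    v4, c4 = insert(v3, 201, "d")
    print(c2, c3, c4)            # nodes allocated by each insert
    print(v4.left is v3.left)    # untouched subtree is shared between versions
```

Every earlier version (`v1`, `v2`, `v3`) remains usable after later inserts, and the work per update is bounded by the length of one search path; no rebalancing step touches other parts of the structure.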

Acknowledgements

We are greatly indebted to Kenneth Oksanen for his pioneering work on real-time garbage collection methods for persistent functional heaps. We thank [name not recoverable from the source], Petri Mäenpää, Eljas Soisalon-Soininen, and the anonymous referees for comments that helped us to improve this paper.

This work has been carried out in the HiBase and Hiero projects, joint research projects of Nokia Telecommunications and Helsinki University of Technology. For more details, please consult our home page at http://hibase.cs.hut.fi/. The projects have been financially supported by the Technology Development Centre of Finland (Tekes).

Appendix: Formulas

The external 2-3 tree with bucket capacity b

1. The approximate average height of the tree: [formula not recoverable from the source], where n is the number of elements, b the bucket capacity, and ln 2 the approximate average storage utilization of a hierarchical access method based on binary splitting.

2. The approximate average copy cost: [formula not recoverable from the source], where the average height is given above and the node-size term is the average size of the tree node including administrative data.

3. The approximate number of tree nodes: [formula not recoverable from the source], where the average height is given above.

The internal 2-3 tree

1. The approximate number of tree nodes: [formula not recoverable from the source], where n is the number of elements, ln 2 the approximate average storage utilization of a hierarchical access method based on binary splitting, and 2 ln 2 the approximate average number of keys in a tree node.

2. The approximate average height of the tree: the solution of the equation [formula not recoverable from the source].

3. The approximate average path length of the tree: [formula not recoverable from the source].

4. The approximate average copy cost: [formula not recoverable from the source], where the average path length is given above and the node-size term is the average size of the tree node including administrative data.

References

[1] A. Andersson, S. Nilsson. Improved behaviour of tries by adaptive branching. Information Processing Letters, 46(6):295-300, 1993.

[2] A. Andersson, S. Nilsson. Faster searching in tries and quadtrees: an analysis of level compression. In Proceedings of the Second Annual European Symposium on Algorithms, pages 82-93, 1994.

[3] C.-H. Ang, H. Samet. Approximate average storage utilization of bucket methods with arbitrary fanout. Nordic Journal of Computing, 3:280-291, 1996.

[4] J.-I. Aoe, K. Morimoto. An efficient implementation of trie structures. Software - Practice and Experience, 22(9):695-721, 1992.

[5] J. J. Darragh, J. G. Cleary, I. H. Witten. Bonsai: A compact representation of trees. Software - Practice and Experience, 23(3):277-291, 1993.

[6] L. Devroye. A note on the average depth of tries. Computing, 28(4):367-371, 1982.

[7] J. A. Dundas III. Implementing dynamic minimal-prefix tries. Software - Practice and Experience, 21(10):1027-1040, 1991.

[8] P. Flajolet. On the performance evaluation of extendible hashing and trie searching. Acta Informatica, 20:345-369, 1983.

[9] P. Flajolet, M. Régnier, D. Sotteau. Algebraic methods for trie statistics. Ann. Discrete Math., 25:145-188, 1985.

[10] P. Flajolet. Digital search trees revisited. SIAM Journal on Computing, 15(3):748-767, 1986.

[11] E. Fredkin. Trie memory. Communications of the ACM, 3:490-499, 1960.

[12] G. H. Gonnet, R. A. Baeza-Yates. Handbook of Algorithms and Data Structures. Addison-Wesley, second edition, 1991.

[13] J. Goubault. HimML: Standard ML with fast sets and maps. Proc. of the 5th ACM SIGPLAN Workshop on Standard ML and its Applications (ML'94), Orlando, Florida, USA, 1994.

[14] J. Gray, editor. The Benchmark Handbook for Database and Transaction Processing Systems. Morgan Kaufmann Publishers, San Mateo (CA), USA.

[15] P. Kirschenhofer, H. Prodinger. Some further results on digital search trees. Proc. 13th ICALP, pages 177-185. Springer-Verlag, 1986. Lecture Notes in Computer Science Vol. 226.

[16] D. E. Knuth. TeX: The Program. Addison-Wesley, 1986.

[17] D. E. Knuth. Sorting and Searching, volume 3 of The Art of Computer Programming. Addison-Wesley, 1973.

[18] J. van Leeuwen (ed.). Algorithms and Complexity, volume A of Handbook of Theoretical Computer Science. Elsevier, 1990.

[19] K. Morimoto, H. Iriguchi, J.-I. Aoe. A method of compressing trie structures. Software - Practice and Experience, 24(3):265-288, 1994.

[20] S. Nilsson, G. Karlsson. Fast address lookup for internet routers. In P. Kühn, editor, Proceedings of the 4th IFIP International Conference on Broadband Communications (BC'98). Chapman & Hall, 1998.

[21] S. Nilsson, M. Tikkanen. Implementing a dynamic compressed trie. In K. Mehlhorn, editor, Proceedings of the 2nd Workshop on Algorithm Engineering (WAE'98), 25-36, Aug 20-22, 1998.

[22] C. Okasaki. Purely Functional Data Structures. Cambridge University Press, 1998.

[23] K. Oksanen. Real-time Garbage Collection of a Functional Persistent Heap. Licentiate's Thesis, Helsinki University of Technology, Faculty of Information Technology, Department of Computer Science, 1999. Also available at http://hibase.cs.hut.fi/.

[24] B. Pittel. Asymptotical growth of a class of random trees. The Annals of Probability, 13(2):414-427, 1985.

[25] B. Pittel. Paths in a random digital tree: Limiting distributions. Advances in Applied Probability, 18:139-155, 1986.

[26] D. Sarnak and R. E. Tarjan. Planar point location using persistent search trees. Communications of the ACM, 29(7):669-679, Jul 1986.

[27] F. Wagner and A. Wolff. A combinatorial framework for map labeling. In Whitesides, S. H. (ed.), Proceedings of the Symposium on Graph Drawing (GD'98), Lecture Notes in Computer Science, 1547:316-331, Springer-Verlag, 1998. Also available from http://www.inf.fu-berlin.de/map-labeling/papers/ (file name not recoverable from the source).
