advanced algorithmics (6eap) dynamic programming · 2016. 10. 7. · • a simple inspection of the...

15
07.10.16 1 Advanced Algorithmics (6EAP) Dynamic Programming Jaak Vilo 2016 Fall 1 Jaak Vilo Example Wind has blown away the +, *, (, ) signs Whats the maximal value? Minimal? 2 1 7 1 4 3 2 17143 (2+1)*7*(1+4)*3 = 21*15 = 315 2*1 + 7 + 1*4 + 3 = 16 Q: How to maximize the value of any expression? 2 4 5 1 9 8 12 1 9 8 7 2 4 4 1 1 2 3 = ? http://net.pku.edu.cn/~course/cs101/resource/Intro2Algorithm/book6/chap16.htm Dynamic programming, like the divide-and-conquer method, solves problems by combining the solutions to subproblems. Dünaamiline planeerimine. Divide-and-conquer algorithms partition the problem into independent subproblems, solve the subproblems recursively, and then combine their solutions to solve the original problem. In contrast, dynamic programming is applicable when the subproblems are not independent, that is, when subproblems share subsubproblems.

Upload: others

Post on 12-Sep-2020

7 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Advanced Algorithmics (6EAP) Dynamic Programming · 2016. 10. 7. · • A simple inspection of the nested loop structure of MATRIX-CHAIN-ORDER yields a running time of O(n3) for

07.10.16

1

AdvancedAlgorithmics(6EAP)DynamicProgramming

JaakVilo2016Fall

1JaakVilo

Example

• Windhasblownawaythe+,*,(,)signs• What’sthemaximalvalue?• Minimal?

2 1 7 1 4 3

• 217143

• (2+1)*7*(1+4)*3=21*15=315• 2*1+7+1*4+3=16

• Q:Howtomaximizethevalueofanyexpression?

2451981219872441123=?

• http://net.pku.edu.cn/~course/cs101/resource/Intro2Algorithm/book6/chap16.htm

• Dynamicprogramming,likethedivide-and-conquermethod,solvesproblemsbycombiningthesolutionstosubproblems.– Dünaamilineplaneerimine.

• Divide-and-conqueralgorithmspartitiontheproblemintoindependentsubproblems,solvethesubproblemsrecursively,andthencombinetheirsolutionstosolvetheoriginalproblem.

• Incontrast,dynamicprogrammingisapplicablewhenthesubproblemsarenotindependent,thatis,whensubproblemssharesubsubproblems.

Page 2: Advanced Algorithmics (6EAP) Dynamic Programming · 2016. 10. 7. · • A simple inspection of the nested loop structure of MATRIX-CHAIN-ORDER yields a running time of O(n3) for

07.10.16

2

Dynamicprogramming

• Avoidcalculatingrepeatingsubproblems

• fib(1)=fib(0)=1;• fib(n)=fib(n-1)+fib(n-2)

• Althoughnaturaltoencode(andausefultaskfornoviceprogrammerstolearnaboutrecursion)recursively,thisisinefficient.

n

n-1 n-2

n-2 n-3 n-3 n-4

n-3 n-4

Structurewithintheproblem

• Thefactthatitisnotatree indicatesoverlappingsubproblems.

• Adynamic-programmingalgorithmsolveseverysubsubproblemjustonceandthensavesitsanswerinatable,therebyavoidingtheworkofrecomputingtheanswereverytimethesubsubproblemisencountered.

Topp-down(recursive,memoized)

• Top-downapproach:Thisisthedirectfall-outoftherecursiveformulationofanyproblem.Ifthesolutiontoanyproblemcanbeformulatedrecursivelyusingthesolutiontoitssubproblems,andifitssubproblemsareoverlapping,thenonecaneasilymemoize orstorethesolutionstothesubproblemsinatable.Wheneverweattempttosolveanewsubproblem,wefirstcheckthetabletoseeifitisalreadysolved.Ifasolutionhasbeenrecorded,wecanuseitdirectly,otherwisewesolvethesubproblemandadditssolutiontothetable.

Bottom-up

• Bottom-upapproach:Thisisthemoreinterestingcase.Onceweformulatethesolutiontoaproblemrecursivelyasintermsofitssubproblems,wecantryreformulatingtheprobleminabottom-upfashion:trysolvingthesubproblemsfirstandusetheirsolutionstobuild-onandarriveatsolutionstobiggersubproblems.Thisisalsousuallydoneinatabularformbyiterativelygeneratingsolutionstobiggerandbiggersubproblemsbyusingthesolutionstosmallsubproblems.Forexample,ifwealreadyknowthevaluesofF41 andF40,wecandirectlycalculatethevalueofF42.

• Dynamicprogrammingistypicallyappliedtooptimizationproblems.Insuchproblemstherecanbemanypossiblesolutions.Eachsolutionhasavalue,andwewishtofindasolutionwiththeoptimal(minimumormaximum)value.

• Wecallsuchasolutionan optimalsolutiontotheproblem,asopposedtothe optimalsolution,sincetheremaybeseveralsolutionsthatachievetheoptimalvalue.

Page 3: Advanced Algorithmics (6EAP) Dynamic Programming · 2016. 10. 7. · • A simple inspection of the nested loop structure of MATRIX-CHAIN-ORDER yields a running time of O(n3) for

07.10.16

3

Thedevelopmentofadynamic-programmingalgorithmcanbebrokenintoasequenceoffoursteps.

1.Characterizethestructureofanoptimalsolution.2.Recursivelydefinethevalueofanoptimalsolution.3.Computethevalueofanoptimalsolutioninabottom-upfashion.

4.Constructanoptimalsolutionfromcomputedinformation.

Editdistance(Levenshteindistance)

• Smallestnrofeditoperationstoconvertonestringintotheother

I N D U S T R Y

I N T E R E S T

I N D U S T R Y

I N T E R E S T

Edit(Levenshtein)distance• Definition TheeditdistanceD(A,B)betweenstringsAandBis

theminimalnumberofeditoperationstochangeAintoB.Allowededitoperationsaredeletionofasingleletter,insertionofaletter,orreplacingoneletterwithanother.

• LetA=a1 a2 ...am andB=b1 b2 ...bm.– E1:Deletion ai →ε

– E2:Insertion ε→bi

– E3:Substitution ai →bj (ifai ≠bj)

• Otherpossiblevariants:– E4:Transposition aiai+1 →bjbj+1 andai=bj+1 jaai+1=bj

(e.g.lecture→letcure)

Howcanwecalculatethis?

α

β b

a

D(αa, βb) =

1. D(α, β) if a=b2. D(α, β)+1 if a≠b

3. D(αa, β)+14. D(α, βb)+1

min

Howcanwecalculatethisefficiently?

D(S,T) = 1. D(S[1..n-1], T[1..m-1] ) + (S[n]=T[m])? 0 : 1 2. D(S[1..n], T[1..m-1] ) +13. D(S[1..n-1], T[1..m] ) +1

min

d(i,j) = 1. d(i-1,j-1) + (S[n]=T[m])? 0 : 1 2. d(i, j-1) +13. d(i-1, j) +1

min

Define: d(i,j) = D( S[1..i], T[1..j] )

Recursion?

d(i,j)

j

id(i-1,j-1)

d(i,j-1)

d(i-1,j)

x

y

Page 4: Advanced Algorithmics (6EAP) Dynamic Programming · 2016. 10. 7. · • A simple inspection of the nested loop structure of MATRIX-CHAIN-ORDER yields a running time of O(n3) for

07.10.16

4

Recursion?

d(i,j)

j

id(i-1,j-1)

d(i,j-1)

d(i-1,j)

0

0 n0123456789

1 2 3 4 5 6 7 8 9 n

m

AlgorithmEditdistanceD(A,B)usingDynamicProgramming(DP)

Input:A=a1a2...an,B=b1b2...bmOutput: Valuedmn inmatrix(dij),0≤i≤m,0≤j≤n.

for i=0tomdo di0=i;for j=0to ndo d0j=j ;for j=1to ndofor i=1tomdo

dij =min(di-1,j-1 +(if ai==bj then 0else 1),di-1,j +1,di,j-1 +1)

return dmn

DynamicProgramming

d(i,j)

j

id(i-1,j-1)

d(i,j-1)

d(i-1,j)

0

0 n0123456789

1 2 3 4 5 6 7 8 9 n

m

1234

x

y

Editdistance=shortestpathin Matrixmultiplication• for i=1..n

for j=1..kcij =Σx=1..m aix bxj

A Bn

m

X = Cm

k n

k

O(nmk)

Page 5: Advanced Algorithmics (6EAP) Dynamic Programming · 2016. 10. 7. · • A simple inspection of the nested loop structure of MATRIX-CHAIN-ORDER yields a running time of O(n3) for

07.10.16

5

MATRIX-MULTIPLY(A,B)1if columns[A]≠rows[B]2 thenerror "incompatibledimensions"3 elsefori=1to rows[A]4 dofor j =1to columns[B]5 do C[i,j]=06 for k=1 to columns[A]7 do C[i,j]=C[i,j]+A[i,k]*B[k,j]8 return C

Chainmatrixmultiplication

• Thematrix-chainmultiplicationproblem canbestatedasfollows:givenachain<A1,A2,...,An>ofn matrices

• matrixAihasdimensionpi-1Xpi

• fullyparenthesizetheproductA1 A2...An inawaythatminimizesthenumberofscalarmultiplications.

A1A2A3A4

• (A1(A2(A3A4))),• (A1((A2A3)A4)),• ((A1A2)(A3A4)),• ((A1(A2A3))A4),• (((A1A2)A3)A4).

Page 6: Advanced Algorithmics (6EAP) Dynamic Programming · 2016. 10. 7. · • A simple inspection of the nested loop structure of MATRIX-CHAIN-ORDER yields a running time of O(n3) for

07.10.16

6

• DenotethenumberofalternativeparenthesizationsofasequenceofnmatricesbyP(n).

• Sincewecansplitasequenceofnmatricesbetweenthekthand(k +1)stmatricesforanyk =1,2,...,n -1andthenparenthesizethetworesultingsubsequencesindependently,weobtaintherecurrence

• Problem13-4askedyoutoshowthatthesolutiontothisrecurrenceisthesequenceofCatalannumbers:

• P(n)=C (n - 1),where

• Thenumberofsolutionsisthusexponentialinn,andthebrute-forcemethodofexhaustivesearchisthereforeapoorstrategyfordeterminingtheoptimalparenthesization ofamatrixchain.

Let’scracktheproblem

Ai..j=Ai•Ai+1 •••Aj

• OptimalparenthesizationofA1•A2 •••Ansplitsatsomek,k+1.

• Optimal=A1..k• Ak+1..n

• T(A1..n)=T(A1..k) +T(Ak+1..n)+T(A1..k• Ak+1..n)

• T(A1..k)mustbeoptimalforA1•A2 •••Ak

Recursion

• m[i,j]- minimumnumberofscalarmultipli-cationsneededtocomputethematrixAi..j;

• m[i,i]=0• cost(Ai..k• Ak+1..j)=pi-1 pk pj• m[i,j]=m[i,k]+m[k+1,j]+pi-1pkpj.

• Thisrecursiveequationassumesthatweknowthevalueofk,whichwedon't.Thereareonlyj- i possiblevaluesfork,however,namelyk=i,i+1,...,j- 1.

• Sincetheoptimalparenthesizationmustuseoneofthesevaluesfork,weneedonlycheckthemalltofindthebest.Thus,ourrecursivedefinitionfortheminimumcostofparenthesizingtheproductAiAi+1 ...Aj becomes

• Tohelpuskeeptrackofhowtoconstructanoptimalsolution,letusdefines[i,j]tobeavalueofk atwhichwecansplittheproductAiAi+1 ...Aj toobtainanoptimalparenthesization.Thatis, s[i,j]equalsavaluek suchthat m[i,j] =m[i,k] +m[k+1,j]+pi- 1pkpj.

Recursion

• Checksallpossibilities…

• But– thereisonlyafewsubproblems–choosei,js.t.1≤i≤j≤n- O(n2)

• Arecursivealgorithmmayencountereachsubproblemmanytimesindifferentbranchesofitsrecursiontree.Thispropertyofoverlappingsubproblemsisthesecondhallmarkoftheapplicabilityofdynamicprogramming.

Page 7: Advanced Algorithmics (6EAP) Dynamic Programming · 2016. 10. 7. · • A simple inspection of the nested loop structure of MATRIX-CHAIN-ORDER yields a running time of O(n3) for

07.10.16

7

// foreach length from 2 to n

// check all mid-points for optimality

// foreach start index i

// new best value q

// achieved at mid point k

Example

matrix dimensions: A1 30 X 35 A2 35 X 15 A3 15 X 5 A4 5 X 10 A5 10 X 20 A6 20 X 25

((A1(A2A3))((A4A5)A6))

• AsimpleinspectionofthenestedloopstructureofMATRIX-CHAIN-ORDERyieldsarunningtimeofO(n3)forthealgorithm.Theloopsarenestedthreedeep,andeachloopindex(l,i,and k)takesonatmostn values.

• TimeΩ(n3) => Θ(n3)• SpaceΘ(n2)

• Step4ofthedynamic-programmingparadigmistoconstructanoptimalsolutionfromcomputedinformation.

• Usethetables[1..n,1..n]todeterminethebestwaytomultiplythematrices.

MultiplyusingStable

MATRIX-CHAIN-MULTIPLY(A,s,i,j)1if j>i2 then X=MATRIX-CHAIN-MULTIPLY(A,s,i,s[i,j])3 Y =MATRIX-CHAIN-MULTIPLY(A,s,s[i,j]+1,j)4 returnMATRIX-MULTIPLY(X,Y)5elsereturn Ai

((A1(A2A3))((A4A5)A6))

Elementsofdynamicprogramming

• Optimalsubstructurewithinanoptimalsolution

• Overlappingsubproblems

• Memoization

Page 8: Advanced Algorithmics (6EAP) Dynamic Programming · 2016. 10. 7. · • A simple inspection of the nested loop structure of MATRIX-CHAIN-ORDER yields a running time of O(n3) for

07.10.16

8

• Amemoizedrecursivealgorithmmaintainsanentryinatableforthesolutiontoeachsubproblem.Eachtableentryinitiallycontainsaspecialvaluetoindicatethattheentryhasyettobefilledin.Whenthesubproblemisfirstencounteredduringtheexecutionoftherecursivealgorithm,itssolutioniscomputedandthenstoredinthetable.Eachsubsequenttimethatthesubproblemisencountered,thevaluestoredinthetableissimplylookedupandreturned. (tabulated)

• Thisapproachpresupposesthatthesetofallpossiblesubproblemparametersisknownandthattherelationbetweentablepositionsandsubproblemsisestablished.Anotherapproach

istomemoizebyusinghashingwiththesubproblemparametersaskeys.

Overlappingsubproblems

LongestCommonSubsequence(LCS) Optimaltriangulation

Two ways of triangulating a convex polygon. Every triangulation of this 7-sided polygon has 7 - 3 = 4 chords and divides the polygon into 7 - 2 = 5 triangles.

The problem is to find a triangulation that minimizes the sum of the weights of the triangles in the triangulation

Parsetree

Parse trees. (a) The parse tree for the parenthesized product ((A1(A2A3))(A4(A5A6))) and for the triangulation of the 7-sided polygon (b) The triangulation of the polygon with the parse tree overlaid. Each matrix Ai corresponds to the side vi-1 vi for i = 1, 2, . . . , 6.

Optimaltriangulation

Page 9: Advanced Algorithmics (6EAP) Dynamic Programming · 2016. 10. 7. · • A simple inspection of the nested loop structure of MATRIX-CHAIN-ORDER yields a running time of O(n3) for

07.10.16

9

Dynamicprogramming

• Avoidre-calculatingsamesubproblemsby– Characterisingoptimalsolution– Cleverorderingofcalculations

DynamicProgramming

d(i,j)

j

id(i-1,j-1)

d(i,j-1)

d(i-1,j)

0

0 n0123456789

1 2 3 4 5 6 7 8 9 n

m

1234

x

y

Editdistanceisametric

• Itcanbeshown,thatD(A,B)isametric– D(A,B)≥0,D(A,B)=0iffA=B– D(A,B)=D(B,A)– D(A,C)≤D(A,B)+D(B,C)

Pathofeditoperations

• Optimalsolutioncanbecalculatedafterwards– Quitetypicalindynamicprogramming

• Memorizesetspred[i,j]dependingfromwherethedij wasreached.

Page 10: Advanced Algorithmics (6EAP) Dynamic Programming · 2016. 10. 7. · • A simple inspection of the nested loop structure of MATRIX-CHAIN-ORDER yields a running time of O(n3) for

07.10.16

10

Threepossibleminimizingpaths

• Addintopred[i,j]– (i-1,j-1)if dij =di-1,j-1 +(ifai==bj then0else1)– (i-1,j)if dij =di-1,j +1– (i,j-1)if dij =di,j-1 +1

The path (in reverse order) ε → c6, b5 → b5, c4 → c4, a3 → a3, a2 → b2, b1 → a1.

Multiplepathspossible

• Allpathsarecorrect

• Therecanbemany(howmany?)paths

Spacecanbereduced

CalculationofD(A,B)inspaceΘ(m)

Input:A=a1a2...am,B=b1b2...bn (choosem<=n)Output: dmn=D(A,B)for i=0tomdo C[i]=ifor j=1to ndo

C=C[0];C[0]=j;for i=1tomdo

d=min(C+(if ai==bj then 0else 1), C[i-1]+1, C[i]+1)C=C[i] //memorizenew“diagonal” valueC[i]=d

write C[m]

TimecomplexityisΘ(mn)sinceC[0..m]isfilledntimes

Shortestpathinthegraphhttp://en.wikipedia.org/wiki/Shortest_path_problem

Page 11: Advanced Algorithmics (6EAP) Dynamic Programming · 2016. 10. 7. · • A simple inspection of the nested loop structure of MATRIX-CHAIN-ORDER yields a running time of O(n3) for

07.10.16

11

Shortestpathinthegraphhttp://en.wikipedia.org/wiki/Shortest_path_problem

All nodes at distance 1 from source

Editdistance=shortestpathin

Observations?

• Shortestpathisclosetothediagonal– Ifashortdistancepathexists

• Valuesalonganydiagonalcanonlyincrease(byatmost1)

Diagonal

Diagonal nr. 2, d02, d13, d24, d35, d46

Diagonal k , -m ≤ k ≤ n, s.t. diagonal k contains only dij where j-i = k.

Property of any diagonal: The values of matrix (dij) can on any specific diagonal either increase by 1 or stay the same

DiagonallemmaLemma: Foreachdij,1≤i≤m,1≤j≤nholds:dij=di-1,j-1 ordij =di-1,j-1 +1.

(noticethatdij anddi-1,j-1 areonthesamediagonal)Proof: Sincedij isaninteger,show:

1. dij ≤di-1,j-1 +12. dij ≥di-1,j-1

Fromthedefinitionofeditdistance1.holdssincedij ≤di-1,j-1 +1Inductiononi+j:

– Basisistrivialwheni=0orj=0(ifweagreethatd-1,j-1=d0j)– Inductionstep:thereare3possibilities-

• Onminimizationthedij iscalculatedfromentrydi-1,j-1,hencedij ≥di-1,j-1

• Onminimizationthedij iscalculatedfromentrydi-1,j,hencedij=di-1,j+1≥di-2,j-1+1≥di-1,j-1

• Onminimizationthedij iscalculatedfromentrydi,j-1.Analogicalto2.– Hence,di-1,j-1 ≤dij

Page 12: Advanced Algorithmics (6EAP) Dynamic Programming · 2016. 10. 7. · • A simple inspection of the nested loop structure of MATRIX-CHAIN-ORDER yields a running time of O(n3) for

07.10.16

12

Transformthematrixintofkp• Foreachdiagonalonlyshowtheposition(rowindex)wherethevalueisincreasedby1.

• Also,onecanrestrictthematrix(dij)toonlythispartwheredij ≤dmn sinceonlythosedij canbeontheshortestpath.

• We'llusethematrix(fkp)thatrepresentsthediagonalsofdij– fkp isarowindexifromdij,suchthatondiagonalkthevaluepreachesrowi(dij=pandj-i=k).

– Initialization:f0,-1=-1andfkp=-∞whenp≤|k|-1;– dmn =p,suchthatfn-m,p=m

Calculatingmatrix(fkp)bycolumns

• Assumethecolumnp-1hasbeencalculatedin(fkp),andwewanttocalculatefkp. (theregionofdij=p)

• Ondiagonalkvaluespreachatleasttherowt=max(fk,p-1+1,fk-1,p-1 ,fk+1,p-1+1)ifthediagonalkreachessofar.

• Ifonrowt+1additionallyai =bj onthesamediagonal,thendij cannotincrease,andvaluepreachesrowt+1.

• Repeatpreviousstepuntilai ≠bj ondiagonalk.

• fk,p-1+1 - samediagonal

• fk-1,p-1 - diagonalbelow

• fk+1,p-1+1 - diagonalabove

AlgorithmA():calculatefkp

A(k,p)1.t=max(fk,p-1 +1,fk-1,p-1 ,fk+1,p-1+1)2.while at+1 ==bt+1+k do t=t+13. fkp =if t>mort+k>nthen undefinedelse t

f 0,2 … t= max (3,2,3) = 2A[3]=B[3] , … A[5]=B[5] => f(0,2) = 5

f(1,2) … t

Page 13: Advanced Algorithmics (6EAP) Dynamic Programming · 2016. 10. 7. · • A simple inspection of the nested loop structure of MATRIX-CHAIN-ORDER yields a running time of O(n3) for

07.10.16

13

Algorithm:Diagonalmethodbycolumns

p=-1while fn-m,p ≠mp=p+1for k=-pto p do //fkp =A(k,p)

t=max(fk,p-1 +1,fk-1,p-1 ,fk+1,p-1+1)while at+1 =bt+1+k do t=t+1fkp =if t>mor t+k>nthen undefinedelse t

• pcanonlyocurondiagonals-p≤k≤p.• Methodcanbeimprovedsincekisoftensuchthatfkp isundefined.

• Wecandecreasevaluesofk:– -m≤k≤n(diagonalnumbers)– Letm≤nanddij ondiagonalk.

• if-m≤k≤0then|k|≤dij ≤m• if1≤k≤nthenk≤dij ≤k+m• Hence,-m≤k≤mifp≤mandp-m≤k≤pifp≥m

Extensionstobasiceditdistance

• Newoperations

• Variablecosts

• TimeWarping

Transposition(ab→ba)

• E4: Transpositionaiai+1 →bjbj+1 ,s.t. ai=bj+1 and ai+1=bj

• (e.g.:lecture→letcure)

d(i,j) =

1. d(i-1,j-1) + (S[n]=T[m])? 0 : 1 2. d(i, j-1) +13. d(i-1, j) +14. d(i-2,j-2) + ( if S[i-1,i] = T[j,j-1] then 1 else ∞ )

min

Generalizededitdistance

• UsemoreoperationsE1...En,andtoprovidedifferentcoststoeach.

• Definition. Letx,y∈ Σ*.Theneveryx→yisaneditoperation.Editoperationreplacesxbyy.– IfA=uxvthenaftertheoperation,A=uyv

• Wenotebyw(x→y)thecostorweightoftheoperation.

• Costmaydependonxand/ory.Butweassumew(x→y)≥0.

Page 14: Advanced Algorithmics (6EAP) Dynamic Programming · 2016. 10. 7. · • A simple inspection of the nested loop structure of MATRIX-CHAIN-ORDER yields a running time of O(n3) for

07.10.16

14

Generalizededitdistance

• Ifoperationscanonlybeappliedinparallel,i.e.thepartalreadychangedcannotbemodifiedagain,thenwecanusethedynamicprogramming.

• Otherwiseitisanalgorithmicallyunsolvableproblem,sincequestion- canAbetransformedintoBusingoperationsofG,isunsolvable.

• Thediagonalmethodingeneralmaynotbeapplicable.

• But,sinceeachdiversionfromdiagonal,thecostslightlyincreases,onecanstaywithinthenarrowregionaroundthediagonal.

Applicationsofgeneralizededitdistance

• Historicdocuments,names• Humanlanguageanddialects• Transliterationrulesfromonealphabettoanothere.g.Tõugu=>Tyugu(viaRussian)

• ...

Examples

näituseks– näiteksAhwrika - Aafrikaweikese - väikesematerjaali - materjali

tuseks -> teksa -> aa , hw -> fw -> v , e -> äaa -> a

“kavalam” otsimineDush, dušš, dushsh ? Gorbatšov, Gorbatshov, Горбачов, Gorbachevrežiim, rezhiim, riim

Page 15: Advanced Algorithmics (6EAP) Dynamic Programming · 2016. 10. 7. · • A simple inspection of the nested loop structure of MATRIX-CHAIN-ORDER yields a running time of O(n3) for

07.10.16

15

Links• Est-Eng;OldEstonian;Est-Rustransliteration

– https://biit-dev.cs.ut.ee/~orasmaa/gen_ed_test/

• Pronunciation– https://biit-dev.cs.ut.ee/~orasmaa/ing_ligikaudne/

• Github(ReinaUba;SiimOrasmaa)– https://github.com/soras/genEditDist

– Orasmaa,Siim;Käärik,Reina;Vilo,Jaak;Hennoste,Tiit(2010).InformationRetrievalofWordFormVariantsinSpokenLanguageCorporaUsingGeneralizedEditDistance.In:ProceedingsoftheSeventhconferenceonInternationalLanguageResourcesandEvaluation(LREC'10):TheInternationalConferenceonLanguageResourcesandEvaluation;Valletta,Malta;May17-23,2010.(Toim.)Calzolari,Nicoletta;Choukri,Khalid;Maegaard,Bente;Mariani,Joseph;Odjik,Jan.Valletta,Malta:ELRA,2010,623- 629.

How?

• ApplyAho-Corasicktomatchforallpossibleeditoperations

• Useminimumoverallpossiblesuchoperationsandcosts

• Implementation:ReinaKäärik,SiimOrasmaa

Possibleproblems/tasks

• Manuallycreatesensiblelistsofoperations– ForEnglish,Russian,etc…– Oldlanguage,

• Improvethespeedofthealgorithm(testing)

• Trainforautomaticextractionofeditoperationsandrespectivecostsfromexamplesofmatchingwords…

AdvancedDynamicProgramming

• RobertGiegerich:– http://www.techfak.uni-bielefeld.de/ags/pi/lehre/ADP/

• Algebraicdynamicprogramming– Functionalstyle– HaskellcompilesintoC