Virdi Sabegh Singh (Advisor: Dr. Robert A. Walker)
Computer Science Department
Kent State University
Solving the Longest Common Subsequence (LCS) problem using the Associative ASC Processors with Reconfigurable 2D Mesh
Presentation Outline
String matching and its variations
Motivation of LCS
Role of LCS in Molecular Biology
Overview of LCS
Discussion on Folklore algorithm
Parallel Algorithms for LCS
Discussion on ASC processor
Brief introduction on Coterie Network
Reconfigurable Network in the ASC Processor
Modifying the Network for LCS Algorithm
Longest Common Subsequence on Reconfigurable 2D Mesh: Exact match
Longest Common Subsequence on Reconfigurable 2D Mesh: Approximate match
Summary and Future work
String Matching
Fundamental operation in computing
Compares characters, words, etc. to determine their similarity
Of particular interest in bioinformatics, especially for searching genetic databases
Strings are enormous, so efficient string processing is a requirement
String Matching Variations
Is exact match the only solution?
What if the pattern does not occur in the text?
Find the longest subsequence that occurs in both the pattern and the text
Longest Common Subsequence, Longest Common Substring, Sequence Alignment and Edit Distance are all variations of the string matching (SM) problem
Sequence alignment
Procedure of comparing two or more sequences
Searches for a series of individual characters or character patterns that occur in the same order in the sequences
LCS
Find a common string for both sequences, preserving symbol order
Sequence alignment vs. LCS
GGHSRLILSQLGEEG.RLLAIDRDPQAIAVAKT....IDDPRFSII
|||::::| : |::| ||:::||||:|:|||:: ::| |::::
GGHAERFL.E.GLPGLRLIGLDRDPTALDVARSRLVRFAD.RLTLV
Motivation of LCS
Molecular biology
File comparison
Screen redisplay
Cheater finder
Plagiarism detection
Codes and error control
Spell checking
Human speech
Gas chromatography
Bird song analysis
Data compression
Speech recognition
Role of LCS in Molecular Biology
DNA sequences (genes) are represented by the four letters A, C, G and T, corresponding to the four submolecules forming DNA
When biologists find a new sequence, they typically want to know which other sequences it is most similar to
One way of computing how similar (homologous) two sequences are is to find the length of their longest common subsequence
Role of LCS in Molecular Biology
This is a simplification, since in the biological situation one would typically take into account not only the length of the LCS but also, for example, how gaps occur when the LCS is embedded in the two original sequences
An obvious measure for the closeness of two strings is the maximum number of identical symbols (preserving symbol order)
This, by definition, is the longest common subsequence of the strings
Longest Common Subsequences
Formally, we compare two strings, X[1..m] and Y[1..n], which are elements of the set Σ*; here Σ denotes the input alphabet containing σ symbols
The LCS of strings X and Y, lcs(X,Y), is a common subsequence of maximal length
Special case of the edit distance problem
  The distance between X and Y is defined as the minimal number of elementary operations needed to transform the source string X into the target string Y
  In practical applications, operations are restricted to insertions, deletions and substitutions
  For each operation, an application-dependent cost is assigned
Longest Common Subsequences
LCS(X,Y) is typically solved with dynamic programming by filling an m × n table
Table elements act as vertices in a graph, and the simple dependencies between the table values define the edges
The task is to find the longest path between the vertices in the upper-left and lower-right corners of the table
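As an illustration of the table-filling idea on this slide, here is a minimal Python sketch of the textbook dynamic-programming recurrence. The function names lcs_table and lcs are chosen for this example and do not come from the presentation; the extra row and column of zeros is a common convention, assumed here.

```python
def lcs_table(x, y):
    """Fill the table: c[i][j] is the LCS length of x[:i] and y[:j]."""
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]  # row 0 and column 0 stay 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                # A match extends the diagonal predecessor by one.
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    return c


def lcs(x, y):
    """Walk back from the lower-right corner to recover one LCS."""
    c = lcs_table(x, y)
    i, j = len(x), len(y)
    out = []
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            out.append(x[i - 1])
            i, j = i - 1, j - 1
        elif c[i - 1][j] >= c[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


print(lcs("AGACTGAGGTA", "ACTGAG"))
```

The walk back from the lower-right corner follows the path through the table that the slide describes; when several LCSs exist, this sketch returns just one of them.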
Folklore Algorithm
Foundation of most LCS algorithms
Given two strings, find the LCS common to both strings
Example:
  String 1: AGACTGAGGTA
  String 2: ACTGAG
List of possible alignments:
  AGACTGAGGTA
  --ACTGAG---
  --ACTGA-G--
  A--CTGA-G--
  A--CTGAG---
The time complexity of this algorithm is clearly O(nm)
Folklore Algorithm
The complexity does not depend on the sequences u and v themselves but only on their lengths
By carefully choosing the order in which the d(i,j) values are computed, one can execute the above algorithm in O(n+m) space
The bottleneck in efficient parallelization of the LCS problem is the calculation of the diagonal elements, as shown
As seen, the value at {i,j} depends on the previous element {i-1,j-1} when a match is found
There may be more than one LCS for the same problem
In order to find the best LCS, we associate some parameter with each candidate
The Smith-Waterman algorithm uses the same dynamic-programming concept as the folklore algorithm, but scores matches, mismatches and gaps to give an optimal local alignment
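The space remark above can be sketched as follows. This Python example assumes a row-by-row evaluation order and keeps only the previous row, which computes the LCS length (not the subsequence itself) within the stated O(n+m) space; lcs_length is a name chosen for this illustration.

```python
def lcs_length(x, y):
    """LCS length using one previous row instead of the full table."""
    if len(y) > len(x):
        x, y = y, x  # keep the shorter string along the row
    prev = [0] * (len(y) + 1)
    for ch in x:
        cur = [0]
        for j, other in enumerate(y, start=1):
            if ch == other:
                cur.append(prev[j - 1] + 1)   # diagonal dependency on a match
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


print(lcs_length("AGACTGAGGTA", "ACTGAG"))
```

Recovering the subsequence itself in linear space needs a more elaborate divide-and-conquer scheme, which is outside the scope of this sketch.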
Folklore Algorithm
[Figure: dynamic-programming table of the Folklore algorithm for the text A G A C T G A G G T A (columns) and the pattern A C T G A G (rows), with the boundary row and column initialized to 0]
Parallel Counterpart
The serial LCS algorithm runs in O(nm) time, where n is the length of the text string and m is the length of the pattern string
Efficient parallel algorithms exist for this computationally intensive task
  Some algorithms run in O(max{n,m}) time using O(min{n,m}) processors
  Others run in O(log n) time using O(mn/log n) processors
  There are constant-time algorithms for the LCS problem using the DP approach, under some assumptions
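One common way to reach the O(max{n,m}) bound with O(min{n,m}) processors is a wavefront over the anti-diagonals of the DP table; the slides do not say which scheme the cited algorithms use, so the following Python sketch is an assumption made for illustration. All cells with the same i + j are independent, so each pass of the outer loop stands in for one parallel step (the inner loop is what the processors would do simultaneously).

```python
def lcs_wavefront(x, y):
    """Return (LCS length, number of wavefront steps)."""
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    steps = 0
    for d in range(2, m + n + 1):              # d = i + j, one anti-diagonal
        lo, hi = max(1, d - n), min(m, d - 1)
        if lo > hi:
            continue
        for i in range(lo, hi + 1):            # at most min(m, n) independent cells
            j = d - i
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1  # reads diagonal d - 2
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])  # reads diagonal d - 1
        steps += 1
    return c[m][n], steps


print(lcs_wavefront("AGACTGAGGTA", "ACTGAG"))
```

For strings of length 11 and 6 the wavefront needs 11 + 6 - 1 = 16 steps, compared with 66 cell updates in the serial order.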
Computation Model
Various network models have been used to solve the LCS problem
  PRAM model, suffix tree, 2D mesh network, mesh with reconfigurable buses, mesh with multiple buses, etc.
Algorithms that run in constant time assume that most of the operations are done in constant time
In the parallel version, one of the important tasks is to distribute the data efficiently and easily
The ASC Processor
A scalable design implemented on a million-gate Altera FPGA
SIMD-like architecture
Searches data by content instead of by address
8-bit Instruction Stream (IS) control unit with 8-bit instruction and data addresses and 32-bit instructions
[Figure: The ASC Architecture. A Control Unit drives the Instruction Bus and Data Bus; the PE Array consists of PEs with memory and supporting circuitry, connected by the PE Network; Common Registers and the Responder Resolution Unit sit between the PE Array and the Control Unit]
The ASC Architecture
Each PE listens to the IS through the broadcast and reduction network
PEs can communicate among themselves using the PE Network
A PE may either execute or ignore the microcode instruction broadcast by the IS, under the control of the Mask Stack
The ASC Features
Associative Search
  Each PE can search its local memory for a key under the control of the IS
Responder Resolution
  A special circuit signals if ‘at least one’ record was found
Masked Operation
  Local Mask Stacks can turn the execution of instructions from the IS on or off
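The three features can be imitated in software to show how they fit together. The Python sketch below is only a serial simulation under simplifying assumptions: each PE holds a single value, the mask is a plain list of booleans rather than a stack, and the function names are invented for this example rather than taken from the ASC instruction set.

```python
def associative_search(memory, key, mask):
    """Every active PE compares its own value with the broadcast key."""
    return [active and value == key for value, active in zip(memory, mask)]


def some_responder(responders):
    """Responder resolution: signal whether at least one PE matched."""
    return any(responders)


def pick_one(responders):
    """Select a single responder (here: the lowest-numbered one)."""
    return responders.index(True) if any(responders) else None


memory = ["A", "C", "T", "G", "A"]      # one value per PE
mask = [True] * len(memory)             # all PEs enabled
hits = associative_search(memory, "A", mask)
print(hits, some_responder(hits), pick_one(hits))

mask[0] = False                         # masked PEs ignore the instruction
print(associative_search(memory, "A", mask))
```

On the real hardware the comparison happens in all PEs at once; the list comprehension only stands in for that single broadcast step.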
Communication between PEs
In a 2D mesh network, communication between the PEs themselves takes place in two different ways
  By using the nearest-neighbor mesh interconnection network
  By using a powerful variation on the nearest-neighbor mesh called the “Coterie network”, developed in response to the requirement for nonlocal communication
When the processors in a group share common properties and purpose, we call the group a coterie, hence the name coterie network
Coteries [Weems & Herbordt]
“A small often selected group of persons who associate with one another frequently”
Features:
  Related to other reconfigurable broadcast networks
  Describable using hypergraphs
  Dynamic in nature
Advantages:
  Propagation of information quickly over long distances at electrical speed
  Support of one-to-many communication within a coterie
  Reconfigurability of the coterie
Coterie Network
Provides a method of performing operations on regions of an image in parallel
Used extensively for matrix arithmetic, FFT, convex hull computation, simulating a pyramid processor, general permutation routing and parallel prefix
Note that the coterie network is separate from the nearest-neighbor mesh, which we refer to as the SEWN network
The coterie network results in a new mode of parallelism that falls between SIMD and MIMD
PEs form Coteries
[Figure: 5 x 5 coterie network with switches shown in “arbitrary” settings. Shaded areas denote coteries (the sets of PEs sharing the same circuit)]
Coterie’s Physical Structure
In the physical implementation, each PE controls a set of switches
  Four of these switches control access in the different directions (N, S, E, W)
  Two switches, H and V, are used to emulate horizontal and vertical buses
  The two switches NE and NW are used to create eight-way connected regions
Coteries Structure
[Figure: switch layout of a single PE, showing the N, S, E, W, H, V, NE and NW switches]
Coterie Network
The isolated groups of processors, called coteries, have access only to the multicast within their own coterie
When the switches are set, connected processors form a coterie
The coterie network switches are set by loading the corresponding bits of the mesh control register in each PE
Basic Coterie structure algorithms
The complexity is assumed to be O(1) unless otherwise stated
  Transfer of data between two adjacent coteries
  Symmetry breaking between a pair of nodes in a coterie
  Two nodes within a coterie exchange information
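To make the statement "connected processors form a coterie" concrete, here is a small Python sketch that derives the coteries from a given switch configuration. The model is a simplification assumed for this example: each PE has just two flags, one saying it is connected to its east neighbor and one saying it is connected to its south neighbor, and the H, V, NE and NW switches are ignored. Coteries are then the connected components, found here with a union-find structure.

```python
def coteries(rows, cols, east_closed, south_closed):
    """Group PEs (r, c) into coteries given the closed east/south links."""
    parent = list(range(rows * cols))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    def union(a, b):
        parent[find(a)] = find(b)

    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols and east_closed[r][c]:
                union(r * cols + c, r * cols + c + 1)
            if r + 1 < rows and south_closed[r][c]:
                union(r * cols + c, (r + 1) * cols + c)

    groups = {}
    for r in range(rows):
        for c in range(cols):
            groups.setdefault(find(r * cols + c), []).append((r, c))
    return sorted(groups.values())


# Column coteries on a 2 x 3 mesh: only the south links are closed.
no_links = [[False] * 3 for _ in range(2)]
all_links = [[True] * 3 for _ in range(2)]
print(coteries(2, 3, no_links, all_links))
```

With only the south links closed, every column becomes its own coterie, which is the configuration a column-wise multicast would use.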
Reconfigurable Network in the ASC Processor
Scalable design with reconfigurable network
Can be used as a dedicated ASIC or as a co-processor
Implemented on Altera APEX20KC1000, with a single CPU, 50 pipelined PEs and a linear PE interconnection network
Key to reconfigurability is the Data Switch inside each PE
[Figure: Data Switch with N, S, E and W ports]
Reconfigurable Network in the ASC Processor
Linear network: a PE communicates both ways
2D reconfigurable network: a PE communicates with all of its neighbors (N-E-S-W)
The data switch has a bypass mode that allows PE communication to skip non-responders, so as to support associative computing
Modifying the Network for LCS Algorithm
The Coterie Network is a powerful network
But we do not need its full feature set for the LCS algorithm
Augmented ASC with a new 2D mesh, with row and column broadcast buses
Modified the linear network into a 2D mesh
Added features inspired by the Coterie network
A PE can now communicate with any of its four neighbors
Bypass mode augmented to support H and V bypass as well
LCS Algorithm on Reconfigurable 2D Mesh
We assume that initially all the internal switches of the PEs are open
Each PE has a Match register “M” and a Length register “L”, both initially holding the value 0
Let the text string T=T(1)T(2)…T(n) be fed into row 1 of the reconfigurable 2D mesh
PE(0,j) stores T(j), where 0<=j<=n, as shown
This step takes unit time
[Figure: the text string A G A C T G A C T G A loaded into the first row of the mesh]
LCS Algorithm on Reconfigurable 2D Mesh
Broadcast each character of the text string along its column, using the column broadcast bus
In the case of a Coterie network:
  Form coteries along the columns
  Perform a multicast operation in all coteries
This step takes unit time
[Figure: the text string A G A C T G A C T G A broadcast down the columns, so that each of the six rows of the mesh holds a copy]
LCS Algorithm on Reconfigurable 2D Mesh
Let the pattern string P=P(1)P(2)…P(m) be fed into column 1 of the reconfigurable 2D mesh
PE(i,0) stores P(i), where 0<=i<=m, as shown
This step takes unit time
[Figure: the pattern string A C T G A C loaded into the first column of the mesh]
LCS Algorithm on Reconfigurable 2D Mesh
Broadcast each character of the pattern string along its row, using the row broadcast bus
In the case of a Coterie network:
  PEs form coteries along the rows
  Perform a multicast operation in all coteries
This step takes unit time
[Figure: the pattern string A C T G A C broadcast along the rows, so that each of the eleven columns of the mesh holds a copy]
LCS Algorithm on Reconfigurable 2D Mesh
After this step, each PE with index [i,j] holds P[i] and T[j]
Each PE now compares the two characters held in its internal registers
It sets its Match register M to 1 if they are equal, and to 0 otherwise
This step takes unit time
The next figure shows the values after this operation
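The broadcast-and-compare steps can be mimicked serially as follows. This Python sketch uses 0-based indices and a nested list in place of the PE array; match_registers is a name chosen for this example. On the mesh all comparisons happen in one step, whereas here they are simply enumerated.

```python
def match_registers(text, pattern):
    """M[i][j] is 1 when pattern[i] == text[j], as each PE [i, j] would compute."""
    return [[1 if p == t else 0 for t in text] for p in pattern]


text, pattern = "AGACTGACTGA", "ACTGAC"
M = match_registers(text, pattern)
print("  " + " ".join(text))
for p, row in zip(pattern, M):
    print(p + " " + " ".join(str(v) for v in row))
```

The printed grid has one row per pattern character and one column per text character, which is the layout of the 6 x 11 mesh used in the example.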
[Figure: Match register values (0 or 1) in every PE for the text A G A C T G A C T G A (columns) and the pattern A C T G A C (rows)]
Parallel VLDC SM Algorithm on MCCRB Network
A parallel SM algorithm with VLDC was proposed by K.L. Chung in 1995
Uses the mesh-connected computer with reconfigurable buses (MCCRB) system
Runs in O(1) time
For a pattern of size m and a text of size n, it uses O(nm) PEs
LCS Algorithm on Reconfigurable 2D Mesh
Now, except for the PEs with index [0,j], where 0<=j<=n, all PEs having the value 0 in their Match register M close the N-E switch
PEs with the value 1 in their Match register M close the W-S switch, as shown
Both steps take unit time
[Figure: switch settings derived from the Match register values for the text A G A C T G A C T G A (columns) and the pattern A C T G A C (rows)]
LCS Algorithm on Reconfigurable 2D Mesh
Sequential Version:
Each PE at the beginning (bottom) of an LCS sends a token to its west neighbor
A PE receiving a token adds 1 to the token if its Match register “M” contains 1, passes the token on if its W-S bypass switch is set, and stores it in its Length register “L”
Perform the operation MAX on the entire network
The PE with the largest value in its Length register “L” is the start of the LCS
The complexity is the length of the LCS found
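A serial simulation may help to see what the tokens compute. The Python sketch below assumes, based on the worked example in the slides, that a token effectively moves one row up and one column to the left per hop along a diagonal of matching PEs; the function name token_lengths and the 0-based indexing are choices made for this example.

```python
def token_lengths(text, pattern):
    """L[i][j] = number of consecutive matches on the diagonal starting at (i, j)."""
    rows, cols = len(pattern), len(text)
    M = [[pattern[i] == text[j] for j in range(cols)] for i in range(rows)]
    L = [[0] * cols for _ in range(rows)]
    # Visit PEs from the bottom-right so the incoming token is already known.
    for i in range(rows - 1, -1, -1):
        for j in range(cols - 1, -1, -1):
            if M[i][j]:
                incoming = L[i + 1][j + 1] if i + 1 < rows and j + 1 < cols else 0
                L[i][j] = incoming + 1  # add 1 to the token and store it
    return L


L = token_lengths("AGACTGACTGA", "ACTGAC")
best = max(max(row) for row in L)           # the MAX operation over the network
start = [(i, j) for i, row in enumerate(L) for j, v in enumerate(row) if v == best]
print(best, start)
```

For the example strings the largest Length value is 6, held by the PE in the top row under the third text character, which is where the matched run ACTGAC begins.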
[Figure: Length register values after token passing for the text A G A C T G A C T G A (columns) and the pattern A C T G A C (rows); values grow by 1 along each diagonal of matches, reaching 6 in the top row]
LCS Algorithm on Reconfigurable 2D Mesh
Parallel Version:
Each PE at the beginning (bottom) sends its [row, column] ID to its west neighbor
A PE receiving an ID passes it on
Or, if it is the end of an LCS, it subtracts its own ID from the received ID
and stores the value in its Length register “L”
Perform the operation MAX on the network
The PE having the largest value in its Length register “L” is the start of the LCS
Complexity: constant time
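The ID-subtraction idea can be sketched as follows. In this Python example the while loop only stands in for the ID travelling along the closed switches, which the slides treat as a single bus operation; the inclusive "+ 1" in the length and the name run_lengths_by_id are assumptions of this illustration, not details given in the presentation.

```python
def run_lengths_by_id(text, pattern):
    """Map the top PE of each diagonal run of matches to the run length."""
    rows, cols = len(pattern), len(text)
    M = [[pattern[i] == text[j] for j in range(cols)] for i in range(rows)]
    lengths = {}
    for i in range(rows):
        for j in range(cols):
            if not M[i][j]:
                continue
            if i > 0 and j > 0 and M[i - 1][j - 1]:
                continue  # an interior PE just passes the ID on
            bi, bj = i, j  # follow the run down to the PE that sent its ID
            while bi + 1 < rows and bj + 1 < cols and M[bi + 1][bj + 1]:
                bi, bj = bi + 1, bj + 1
            # Received row ID minus own row ID, counted inclusively (assumption).
            lengths[(i, j)] = bi - i + 1
    return lengths


L = run_lengths_by_id("AGACTGACTGA", "ACTGAC")
start = max(L, key=L.get)   # the MAX operation over the network
print(start, L[start])
```

The point of the design is that the length comes from the two end points alone, so no per-hop increment is needed and the time no longer grows with the length of the match.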
[Figure: [row, column] IDs held by the PEs for the text A G A C T G A C T G A (columns) and the pattern A C T G A C (rows)]
[Figure: resulting Length register values (1, 6, 5 and 3 shown) for the text A G A C T G A C T G A (columns) and the pattern A C T G A C (rows)]
LCS Algorithm on Reconfigurable 2D Mesh
Exact match implemented on an Altera APEX1000KC FPGA
Sufficient to hold the 6 x 11 array of PEs used in the example
Ran at a clock speed of 37 MHz with this number of PEs
A larger network can easily be supported, due to ASC scalability
LCS Algorithm on Reconfigurable 2D Mesh
The algorithm described above solves the LCS problem for exact match
It does not address approximate match
The next example demonstrates this problem
For the strings:
  Text: AGACTGAGGTA
  Pattern: ACCAGG
  LCS: ACAGG
[Figure: Match register values for the text A G A C T G A G G T A (columns) and the pattern A C C A G G (rows)]
[Figure: Match register values for the text A G A C T G A G G T A (columns) and the pattern G A C A G G (rows)]
LCS Algorithm on Reconfigurable 2D Mesh
Inject a token from the bottom row
When the token reaches a gap, it enters the south port of some PE and stops at that PE, whose W-S switch is not set
Close the W-S bypass switch of that PE, and set the vertical (N-S) bypass in all PEs above the PE identified in the previous step
LCS Algorithm on Reconfigurable 2D Mesh
Inject a token from the top row
When the token reaches a gap, it enters the west port of some PE and stops at that PE, whose W-S switch is not set
Close the W-S bypass switch of that PE, and set the horizontal (W-E) bypass in all PEs to the right of the PE identified in the previous step
Set the W-S bypass switch of all those PEs where the H and V switches cross over
LCS Algorithm on Reconfigurable 2D Mesh
Inject a token from the bottom row
A PE receiving a token adds 1 to the token if its Match register “M” contains 1 and passes it on if its W-S bypass switch is set; the PE at the end of the LCS stores it in its Length register “L”
The PE with the largest value in its “L” register is the start of the LCS
Increment “L” by 1 if the “M” register has the value 1
LCS Algorithm on Reconfigurable 2D Mesh
When an H or V switch is set, the token bypasses this switch and the “L” value remains unchanged
We bypass only the token whose value in the Match register “M” is maximal and whose value in the Length register “L” is minimal
If both tokens have the same “M” value, block the token with the larger “L” value
If both the “L” and “M” values are the same, select either one of them
[Figure: mesh values for the text A G A C T G A G G T A (columns) and the pattern A C C A G G (rows)]
[Figure: mesh values for the text A G A C T G A G G T A (columns) and the pattern G A C A G G (rows)]
Summary and Future work
Summary:
  In this presentation, we have described a new parallel algorithm on specialized hardware
  Inspired by certain features of the Coterie Network
  Modified the ASC processor to add a reconfigurable 2D mesh
  Exact match implemented on an Altera FPGA
  Constant-time algorithm for exact match
  The running time of the approximate-match algorithm depends on the diameter of the network
Summary and Future work
Future Work:
  Optimize the algorithm for approximate match
  Incorporate additional parameters to find the best LCS, instead of the longest one
  Incorporate different weighting schemes
  Conserve memory by using an encoding scheme
    Use two bits to represent the four bases of DNA
    Using this idea, we save 75% of space/memory
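The two-bit encoding can be sketched as below. This Python example assumes the baseline is one 8-bit character per base, which is where the 75% figure comes from (2 bits instead of 8); the particular bit assignment and the names pack and unpack are arbitrary choices for this illustration.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}   # arbitrary 2-bit assignment
BASES = "ACGT"


def pack(seq):
    """Pack four bases into each byte, two bits per base."""
    out = bytearray((len(seq) + 3) // 4)
    for k, base in enumerate(seq):
        out[k // 4] |= CODE[base] << (2 * (k % 4))
    return bytes(out)


def unpack(data, length):
    """Recover the first `length` bases from the packed bytes."""
    return "".join(BASES[(data[k // 4] >> (2 * (k % 4))) & 3] for k in range(length))


seq = "AGACTGAGGTA"
packed = pack(seq)
print(len(seq), len(packed), unpack(packed, len(seq)))
```

The sequence length has to be stored separately, because the last byte may be only partly used; the saving is exactly 75% only when the length is a multiple of four.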
Acknowledgements
Professor Walker
Committee members for their time
ASC/MASC Group for their useful comments
Professor Helen Piontkivska from the Biology Department
Professor Charles Weems and Martin Herbordt
Hong Wang for implementing the exact match algorithm on FPGA
THANK YOU
Questions….