parallel random projection for motif discovery on gpus

Finding Planted (l, d)-Motifs in Parallel using Random Projection on GPUs

Jhoirene Barasi Clemente

Algorithms and Complexity Laboratory
Department of Computer Science
University of the Philippines-Diliman
[email protected]

March 31, 2012

J.B. Clemente (ACLab, DCS, UPD) CUDA-FMURP March 31, 2012 1 / 88


Overview

Introduction

Definitions and Notations

Finding Motifs using Random Projection (FMURP)

Parallel Implementations of CUDA-FMURP

Results and Analysis

Conclusion


Introduction

In this work, we are interested in solving the Planted (l, d)-Motif Problem using Random Projection (FMURP).

The focus of this study is the parallelization of FMURP, for which we present three versions of the parallel algorithm. The correctness of the parallelization is also discussed.

We implement two of these parallel algorithms on GPUs. Theoretical and actual performance analyses are also presented.


A DNA motif is defined as a nucleic acid sequence pattern that has some biological significance, such as being a DNA binding site for a regulatory protein, i.e., a transcription factor [Das,2007].


DNA Sequences as Strings


The pattern is fairly short (5 to 20 base pairs (bp) long) and is known to recur in different genes or several times within a gene [Rombauts,1999].


Notations

Set of t sequences S.

Example 1 (Sequences S = {S0, S1, ..., S(t−1)})

S0 : C G G G G C T A T G G A A C T G G G T C G T C A C A T T C C C C T T T C G A T A
S1 : T T T G A G G G T G C C C A A T A A A T G C C A C T C C A A A G C G G A C A A A
S2 : G G A T G C A A C T G A T G C C G T T T G A C G A C C T A A A T C A A C G G C C
S3 : A A G G A T G C A A C T C C A G G A G C G C C T T T G C T G G T T C T A C C T G
S4 : A A T T T T C T A A A A A G A T T A T A A T G T C G G T C C A T G C A A C T T C
S5 : C T G C T G T A C A A C T G A G A T C A T G C T G C A T G C A A C T T T C A A C
S6 : T A C A T G A T C T T T T G A T G C A A C G T G G A T G A G G G A A T G A T G C

Set of sequences S = {S0, S1, S2, S3, S4, S5, S6} defined over ΣDNA = {A, C, T, G}, where each sequence Si in S has length ni = 40 for all i ∈ {0, ..., (t − 1)}.


An l-mer is a string of length l defined over ΣDNA.

To denote an l-mer in S, we use Si,j, where i ∈ {0, 1, ..., (t − 1)} is the sequence number and j ∈ {0, 1, ..., (n − l)} is the starting position in Si.

Example 2 (Si,j in S)

For instance, an 8-mer S0,7 is

A T G G A A C T

S0 : C G G G G C T A T G G A A C T G G G T C G T C A C A T T C C C C T T T C G A T A
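The Si,j notation can be illustrated with a minimal Python sketch, assuming each sequence is stored as a plain string without spaces. The helper name `lmer` is illustrative and does not come from the slides.

```python
# Sequence S0 from Example 1, written without spaces.
S0 = "CGGGGCTATGGAACTGGGTCGTCACATTCCCCTTTCGATA"


def lmer(seq: str, j: int, l: int) -> str:
    """Return the l-mer of seq that starts at 0-based position j."""
    if not 0 <= j <= len(seq) - l:
        raise ValueError("starting position must lie in {0, ..., n - l}")
    return seq[j:j + l]


# The 8-mer S0,7 of Example 2.
print(lmer(S0, 7, 8))
```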


Let s = (a0, a1, ..., a(t−1)) be the array of starting positions in S, where ai ∈ {0, 1, ..., (n − l)}.
Let A(s) denote the alignment formed by the l-mers in the set {S0,a0, S1,a1, ..., S(t−1),a(t−1)}.


Example 3 (Alignment matrix A(s))

Suppose we have a starting position vector s = (7, 18, 2, 4, 30, 26, 14)

A(s):
S0,7:  A T G G A A C T
S1,18: A T G C C A C T
S2,2:  A T G C A A C T
S3,4:  A T G C A A C T
S4,30: A T G C A A C T
S5,26: A T G C A A C T
S6,14: A T G C A A C G

S0 : C G G G G C T A T G G A A C T G G G T C G T C A C A T T C C C C T T T C G A T A
S1 : T T T G A G G G T G C C C A A T A A A T G C C A C T C C A A A G C G G A C A A A
S2 : G G A T G C A A C T G A T G C C G T T T G A C G A C C T A A A T C A A C G G C C
S3 : A A G G A T G C A A C T C C A G G A G C G C C T T T G C T G G T T C T A C C T G
S4 : A A T T T T C T A A A A A G A T T A T A A T G T C G G T C C A T G C A A C T T C
S5 : C T G C T G T A C A A C T G A G A T C A T G C T G C A T G C A A C T T T C A A C
S6 : T A C A T G A T C T T T T G A T G C A A C G T G G A T G A G G G A A T G A T G C


A profile matrix P(s) of dimension (|ΣDNA| × l) is derived from the frequency of each letter in each column of A(s).

Example 4 (Profile Matrix P(s))

A(s):
S0,7:  A T G G A A C T
S1,18: A T G C C A C T
S2,2:  A T G C A A C T
S3,4:  A T G C A A C T
S4,30: A T G C A A C T
S5,26: A T G C A A C T
S6,14: A T G C A A C G

P(s):
A : 7 0 0 0 6 7 0 0
T : 0 7 0 0 0 0 0 6
C : 0 0 0 6 1 0 7 0
G : 0 0 7 1 0 0 0 1
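The alignment and profile matrix of Examples 3 and 4 can be reproduced with a short Python sketch. It assumes the sequences of Example 1 are stored as strings without spaces; the function names `alignment` and `profile` are illustrative, not the author's.

```python
# Sequences S0..S6 from Example 1, written without spaces.
SEQS = [
    "CGGGGCTATGGAACTGGGTCGTCACATTCCCCTTTCGATA",
    "TTTGAGGGTGCCCAATAAATGCCACTCCAAAGCGGACAAA",
    "GGATGCAACTGATGCCGTTTGACGACCTAAATCAACGGCC",
    "AAGGATGCAACTCCAGGAGCGCCTTTGCTGGTTCTACCTG",
    "AATTTTCTAAAAAGATTATAATGTCGGTCCATGCAACTTC",
    "CTGCTGTACAACTGAGATCATGCTGCATGCAACTTTCAAC",
    "TACATGATCTTTTGATGCAACGTGGATGAGGGAATGATGC",
]


def alignment(seqs, starts, l):
    """A(s): one l-mer per sequence, taken at the given starting positions."""
    return [seq[a:a + l] for seq, a in zip(seqs, starts)]


def profile(rows):
    """P(s): for each base, the count of that base in every column of A(s)."""
    l = len(rows[0])
    return {
        base: [sum(row[j] == base for row in rows) for j in range(l)]
        for base in "ATCG"
    }


s = (7, 18, 2, 4, 30, 26, 14)
A = alignment(SEQS, s, 8)
P = profile(A)
for base, counts in P.items():
    print(base, counts)
```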


From P(s), we define MP(s)(j), where 0 ≤ j ≤ (l − 1), to be the maximum entry in the jth column of the profile matrix.

Example 5 (MP(s)(j))

P(s):
A : 7 0 0 0 6 7 0 0
T : 0 7 0 0 0 0 0 6
C : 0 0 0 6 1 0 7 0
G : 0 0 7 1 0 0 0 1

MP(s)(j) for j = 0, ..., 7: 7 7 7 6 6 7 7 6


A consensus string is an l-mer whose ith element is the nucleotide base that attains MP(s)(i) in column i of the profile matrix.

Example 6 (Consensus String)

P(s):
A : 7 0 0 0 6 7 0 0
T : 0 7 0 0 0 0 0 6
C : 0 0 0 6 1 0 7 0
G : 0 0 7 1 0 0 0 1

Consensus String: A T G C A A C T


We define Score(s, S) to be equal to

$$\mathrm{Score}(s, S) = \sum_{i=0}^{l-1} M_{P(s)}(i). \tag{1}$$

Example 7 (Consensus Score)

P(s):
A : 7 0 0 0 6 7 0 0
T : 0 7 0 0 0 0 0 6
C : 0 0 0 6 1 0 7 0
G : 0 0 7 1 0 0 0 1

Score(s, S) = 7 + 7 + 7 + 6 + 6 + 7 + 7 + 6 = 53
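Equation (1) and the consensus string can be computed directly from a profile matrix. The following Python sketch assumes the profile is held as a dictionary of per-column counts; the name `consensus_and_score` and the tie-breaking rule (first base in dictionary order) are assumptions made for illustration.

```python
# Profile matrix P(s) from Example 7, one list of column counts per base.
P = {
    "A": [7, 0, 0, 0, 6, 7, 0, 0],
    "T": [0, 7, 0, 0, 0, 0, 0, 6],
    "C": [0, 0, 0, 6, 1, 0, 7, 0],
    "G": [0, 0, 7, 1, 0, 0, 0, 1],
}


def consensus_and_score(profile):
    """Return the consensus string and Score(s, S) for a profile matrix."""
    l = len(next(iter(profile.values())))
    consensus, score = [], 0
    for j in range(l):
        # Base with the largest count in column j (ties: first base listed).
        best = max(profile, key=lambda base: profile[base][j])
        consensus.append(best)
        score += profile[best][j]  # this is M_P(s)(j)
    return "".join(consensus), score


print(consensus_and_score(P))
```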


Motif Finding Problem

Definition 8 (Motif Finding Problem [Pevzner,2004])

INPUT:

A motif length l

A set of t sequences S = {S0, S1, S2, ..., S(t−1)}, where each Si is of length ni

OUTPUT:

An array of starting positions s = (a0, a1, ..., a(t−1)) maximizing the consensus Score(s, S)


Naive MFP Solver [Pevzner,2004]

Input: DNA (sequences), motif length l
Output: Starting positions s and the consensus string corresponding to s

1. For each possible array of starting positions in S, i.e., s ∈ {(0, 0, ..., 0), ..., ((n − l), (n − l), ..., (n − l))}:
   1. Get the alignment A(s).
   2. Compute P(s).
   3. Evaluate Score(s, S).
2. From the s with the maximum Score, get the consensus string.
3. Output the consensus string.

Step 1 iterates (n − l + 1)^t times, because the possible starting positions are all arrays

s = (a0, a1, ..., a(t−1)), with ai ∈ {0, ..., (n − l)} for all i.
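The exhaustive search above can be sketched in Python for a toy input. This is only an illustration of the (n − l + 1)^t enumeration, assuming equal-length sequences stored as strings; `score` and `naive_mfp` are hypothetical helper names, and the three short sequences are chosen only to keep the search small.

```python
from itertools import product


def score(seqs, starts, l):
    """Score(s, S): sum over columns of the largest base count in A(s)."""
    rows = [seq[a:a + l] for seq, a in zip(seqs, starts)]
    return sum(max(col.count(b) for b in "ACGT") for col in zip(*rows))


def naive_mfp(seqs, l):
    """Try every array of starting positions and keep the best-scoring one."""
    ranges = [range(len(seq) - l + 1) for seq in seqs]
    best_s, best_score, tried = None, -1, 0
    for s in product(*ranges):  # (n - l + 1)^t candidates
        tried += 1
        current = score(seqs, s, l)
        if current > best_score:
            best_s, best_score = s, current
    return best_s, best_score, tried


TOY = ["CGGTCAGG", "TTCGACAT", "ACGATGAA"]  # t = 3, n = 8
best_s, best_score, tried = naive_mfp(TOY, 4)
print(best_s, best_score, tried)
```

Even for this toy input the loop visits 5^3 = 125 candidates, which shows why the approach does not scale to t = 20 and n = 600.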


Definitions

Definition 9 (Challenge Problem [Pevzner,2000])

INPUT:

Motif length l = 15,

Expected mismatches d = 4,

20 DNA sequences each with ni = 600 nucleotide bases

OUTPUT:

A consensus string M from an alignment A(s), where each l-mer Si,ai in A(s) has

dE(M, Si,ai) = 4,

for all i ∈ {0, ..., (t − 1)}.


Why challenging?

Suppose we have A(s),

S0,a0: A C T T G G G G C A A G A G G
S1,a1: G G A C G G G G C A G A C T G
S2,a2: A C T T G C T A A A G A C T G
S3,a3: A C T G C G G G C A C A G T G
S4,a4: A C C T G G G T C G T A C T G

A: 4 0 1 0 0 0 0 1 1 4 1 4 1 0 0
C: 0 4 1 1 1 1 0 0 4 0 1 0 3 0 0
T: 0 0 3 3 0 0 1 1 0 0 1 0 0 4 0
G: 1 1 0 1 4 4 4 3 0 1 2 1 1 1 5

Consensus: A C T T G G G G C A G A C T G

dE(S0,a0, S1,a1) = 2d = 8

Score(s, S) = 4 + 4 + 3 + 3 + 4 + 4 + 4 + 3 + 4 + 4 + 2 + 4 + 3 + 4 + 5 = 55
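The distances in this example can be checked with a few lines of Python. The sketch assumes that dE counts mismatching positions between two equal-length strings, which is consistent with dE(S0,a0, S1,a1) = 8 above; the helper name `mismatches` is illustrative.

```python
# Rows of the alignment A(s) above and the consensus string M.
ROWS = [
    "ACTTGGGGCAAGAGG",
    "GGACGGGGCAGACTG",
    "ACTTGCTAAAGACTG",
    "ACTGCGGGCACAGTG",
    "ACCTGGGTCGTACTG",
]
M = "ACTTGGGGCAGACTG"


def mismatches(u: str, v: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(u) != len(v):
        raise ValueError("strings must have the same length")
    return sum(a != b for a, b in zip(u, v))


# Every planted instance is d = 4 away from the consensus ...
print([mismatches(M, row) for row in ROWS])
# ... yet two instances can be as far as 2d = 8 from each other.
print(mismatches(ROWS[0], ROWS[1]))
```

Two planted instances may disagree in up to 2d positions, so pairwise comparison of l-mers gives a much weaker signal than comparison against the unknown motif.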


Definitions

Definition 10 (Planted (l, d)-Motif Finding Problem [Tompa,2001])

INPUT:

Motif length l,

Expected number of mismatches d, and

A set of t sequences S = {S0, S1, S2, ..., S(t−1)}, where each Si is of length ni

OUTPUT:

A consensus string M from an alignment A(s), where each l-mer Si,ai in A(s) has

dE(M, Si,ai) = d,

for all i ∈ {0, ..., (t − 1)}.


Solutions for Planted (l, d)-Motif Finding

SP-STAR [Pevzner,2000]

Winnower [Pevzner,2000]

Random Projection [Tompa,2001]

Aggregation [Mohammed,2004]

GibbsDST [Shida,2006]


Finding Motifs using Random Projection (FMURP)

INPUT: Set of sequences S, motif length l, expected mismatches d, projection dimension k, and bucket threshold δ
OUTPUT: Motif

1. Projection
   1. Get all l-mers Si,j in S.
   2. Get the projection hI(Si,j) for each Si,j in S.
   3. Hash each Si,j to the bucket with identifier hI(Si,j).
   4. Get the enriched buckets.
2. Refine each enriched bucket using EM.
3. Refine each enriched bucket using SP-STARσ.
4. Maximize the score to output the best motif.


Definition 11 (Random Projection)

Given an l-mer Si,j, a projection dimension k, and a set I ⊂ L = {0, ..., (l − 1)} with |I| = k, whose elements are chosen at random from L and sorted in increasing order, the k-dimensional projection of Si,j is

hI(Si,j) = Si,j(I0), Si,j(I1), ..., Si,j(I(k−1)),

where hI(Si,j) is a k-mer and Ii denotes the ith element of I.
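Definition 11 translates almost literally into code. The Python sketch below assumes l-mers are strings; `random_positions` and `project` are illustrative names, and the random generator is passed in explicitly so that runs can be repeated.

```python
import random


def random_positions(l: int, k: int, rng: random.Random) -> list:
    """Choose k distinct positions from {0, ..., l - 1}, sorted increasingly."""
    return sorted(rng.sample(range(l), k))


def project(lmer: str, I) -> str:
    """h_I: keep only the symbols of the l-mer at the positions listed in I."""
    return "".join(lmer[i] for i in I)


I = random_positions(8, 3, random.Random(0))
print(I, project("ATGGAACT", I))

# With a fixed I the projection is deterministic:
print(project("ATGGAACT", [0, 2, 5]))
```

Two l-mers that differ only at positions outside I receive the same projection, which is what lets planted instances of a motif collect in one bucket.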


FMURP: Example

Example 12

Given a set of DNA sequences S, pattern length l = 4, projection dimension k = 2, and bucket threshold δ = 3.

S0 : C G G T C A G G
S1 : T T C G A C A T
S2 : A C G A T G A A

Figure: Set of t = 3 sequences each with n = 8

Let I = {0, 1}.
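The projection step for this example can be traced with a small Python sketch. It is a sequential illustration under the stated parameters (l = 4, I = {0, 1}, δ = 3), using a dictionary as the bucket table; it is not taken from the slides.

```python
from collections import defaultdict

SEQS = ["CGGTCAGG", "TTCGACAT", "ACGATGAA"]  # S0, S1, S2 of Example 12
l, I, delta = 4, (0, 1), 3

# Hash every l-mer S_{i,j} to the bucket named by its projection h_I(S_{i,j}).
buckets = defaultdict(list)
for i, seq in enumerate(SEQS):
    for j in range(len(seq) - l + 1):
        lmer = seq[j:j + l]
        key = "".join(lmer[p] for p in I)
        buckets[key].append((i, j))

# A bucket is enriched when it holds at least delta l-mers.
enriched = {key: items for key, items in buckets.items() if len(items) >= delta}
print(enriched)
```

With these inputs the only bucket that reaches the threshold is CG, which collects S0,0, S1,2, and S2,1.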

Projection

[Figure: projection steps, four slides]

Parallel Motif Finding using Random Projection

How do we parallelize FMURP?

FMURP:

1. Projection
   1. Get all l-mers Si,j in S.
   2. Get the projection hI(Si,j) for each Si,j in S.
   3. Hash each Si,j to the bucket with identifier hI(Si,j).
   4. Get the enriched buckets.
2. Refine each enriched bucket using EM.
3. Refine each enriched bucket using SP-STARσ.
4. Maximize the score to output the best motif.

Parallel FMURP:

1. Projection
   1. Get all l-mers Si,j in S in parallel.
   2. Get the projection hI(Si,j) for each Si,j in S in parallel.
   3. Hash each Si,j to the bucket with identifier hI(Si,j) in parallel.
   4. Get the enriched buckets in parallel.
2. Refine each enriched bucket using EM in parallel.
3. Refine each enriched bucket using SP-STARσ in parallel.
4. Maximize the score to output the best motif.


Parallel Algorithms for Motif Finding

CUDA-MEME

CUDA-Gibbs Sampling


CUDA


Computing Framework

Figure: Flowchart showing the processes done in the CPU and GPU


CUDA-FMURP v1

Figure: Thread ID is denoted by an ordered pair (i, j), 0 ≤ i ≤ w and 0 ≤ j ≤ v, where v is the maximum number of threads per block and w is the number of allocated thread blocks in the grid. The algorithm uses a total of x = t · (n − l + 1) threads that are linearly arranged in the GPU.


INPUT: Set of sequences S, motif length l, expected mismatches d, projection dimension k, and bucket threshold δ
OUTPUT: Motif

1. In CPU, generate k random positions for projection. Let this be the set I = {i | i ∈ {0, ..., (l − 1)}} with |I| = k.
2. In GPU, for each thread tid in {0, ..., (x − 1)},
   1. Get hI(Si,j) for each Si,j in S.
   2. Convert each k-mer hI(Si,j) to its corresponding integer representation k∗i,j.
   3. Perform a linear search over all k∗i,j to determine which l-mers are 'hashed' to the same bucket. The tids of the matched k∗i,j are recorded instead of the actual l-mers.
3. In CPU, identify the set of enriched buckets and prune duplicates in preparation for EM refinement.
4. In GPU, for each tid in {0, ..., (e − 1)},
   1. Perform EM refinement for each enriched bucket.
   2. Perform SP-STARσ for each enriched bucket.
   3. Maximize the σ score to output the best motif.
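Steps 2 and 3 can be emulated sequentially to see what each GPU thread would compute. The Python sketch below is not CUDA code: each loop index plays the role of one thread tid. The toy sequences, the fixed I = (0, 1), and the digit order (first projected symbol most significant) are assumptions made for illustration.

```python
SEQS = ["CGGTCAGG", "TTCGACAT", "ACGATGAA"]
l, I, delta = 4, (0, 1), 3
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}

n = len(SEQS[0])
per_seq = n - l + 1          # l-mers per sequence
x = len(SEQS) * per_seq      # one "thread" per l-mer


def key_of(tid: int) -> int:
    """Steps 2.1-2.2: project the l-mer owned by tid and encode it in base 4."""
    i, j = divmod(tid, per_seq)
    value = 0
    for p in I:
        value = value * 4 + CODE[SEQS[i][j + p]]
    return value


keys = [key_of(tid) for tid in range(x)]

# Step 2.3: every thread scans all keys and records the tids that match its own.
matches = [[q for q in range(x) if keys[q] == keys[tid]] for tid in range(x)]

# Step 3 (CPU): keep buckets of size >= delta and prune duplicate buckets.
enriched = sorted({tuple(m) for m in matches if len(m) >= delta})
print(enriched)
```

The linear search costs x comparisons per thread, and the same bucket is found once by each of its members, which is why duplicates have to be pruned on the CPU.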


Integer Conversion

Step 2.2 converts each hI(Si,j) to its corresponding integer representation k∗i,j. Given a k-mer obtained from projection, the corresponding integer is computed using the following mapping. Let us define

$$f : \Sigma_{DNA} \to \{0, 1, 2, 3\}, \qquad A \mapsto 0,\quad C \mapsto 1,\quad G \mapsto 2,\quad T \mapsto 3,$$

where each symbol in the DNA alphabet is mapped to a unique integer. For a string v of length k,

$$f^{*} : \Sigma_{DNA}^{+} \to \mathbb{Z}^{+} \cup \{0\}, \qquad v \mapsto \sum_{i=0}^{k-1} f(v_i)\, 4^{i},$$

where v_i denotes the symbol at the ith position starting from the least significant digit, and the integer representation takes values in the non-negative integers.
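The mapping f∗ is a base-4 encoding and can be sketched in Python as follows. The sketch assumes that the rightmost symbol of the k-mer is the least significant digit; `kmer_to_int` and `int_to_kmer` are illustrative names.

```python
F = {"A": 0, "C": 1, "G": 2, "T": 3}  # the mapping f


def kmer_to_int(v: str) -> int:
    """f*: read the k-mer as a base-4 number, leftmost symbol most significant."""
    value = 0
    for symbol in v:
        value = value * 4 + F[symbol]
    return value


def int_to_kmer(value: int, k: int) -> str:
    """Inverse of f* for k-mers of a known, fixed length k."""
    symbols = []
    for _ in range(k):
        value, digit = divmod(value, 4)
        symbols.append("ACGT"[digit])
    return "".join(reversed(symbols))


print(kmer_to_int("CG"))   # 1 * 4 + 2
print(int_to_kmer(6, 2))
```

Decoding needs the length k because leading A symbols contribute zero to the sum.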


CUDA-Projection v1: Example

Given a set of DNA sequences, pattern length l = 4, projection dimensionk = 2, and bucket threshold δ = 3. Projection in parallel is shown as follows


CUDA-Projection v1: Integer Conversion example


CUDA-Projection: Parallel Integer Conversion Example


CUDA-Projection: Getting enriched buckets


CUDA-EM


CUDA-SP-STARσ


Correctness v1


The uniqueness of the representation we defined using f∗ follows from the results below.
Let Σk = {0, 1, 2, ..., k − 1}, and let Ck be a regular language such that

$$C_k = \{\varepsilon\} \cup (\Sigma_k - \{0\})\,\Sigma_k^{*}.$$

Theorem 4.1 (Fundamental Theorem of base-k Representation [Allouche,2003])
Let k ≥ 2 be an integer. Then every non-negative integer has a unique representation of the form

$$N = \sum_{i=0}^{t} a_i k^{i},$$

where $a_t \neq 0$ and $0 \le a_i < k$ for $0 \le i \le t$.


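In the case of our representation f∗, we have k = 4 and ai = f(vi), where vi ∈ ΣDNA. Note that the mapping f is one-to-one and onto by definition. Because every projection hI(Si,j) has the same fixed length (the projection dimension), leading symbols that map to 0 cause no ambiguity. Thus we have the following: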

Proposition 4.1

f ∗ provides a unique representation of hI(Si,j), for each i, j, and element of I.


We have to show that the set of enriched buckets EB obtained in FMURP is equivalent to the set of enriched buckets $\overline{EB}$ obtained in CUDA-FMURP v1 (an overline marks the CUDA-FMURP v1 counterpart of a symbol).

EB = {B | |B| ≥ δ}.

Two elements Si,j and Si′,j′ belong to the same bucket B if they satisfy the relation R defined below.

Definition 13 (Relation R)

(Si,j, Si′,j′) ∈ B ⇔ (Si,j, Si′,j′) ∈ R
(Si,j, Si′,j′) ∈ R ⇔ hI(Si,j) = hI(Si′,j′)

Proposition 4.2
R is an equivalence relation.


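In CUDA-FMURP v1, the set of enriched buckets is defined as

$$\overline{EB} = \{\overline{B} \mid |\overline{B}| \ge \delta\},$$

where $\overline{B}$ is a bucket in CUDA-FMURP (an overline marks the CUDA-FMURP v1 counterpart of a symbol) and two elements p and q belong to the same bucket $\overline{B}$ if they satisfy the relation $\overline{R}$ defined below.

Definition 14 (Relation $\overline{R}$)

$$(p, q) \in \overline{B} \Leftrightarrow (p, q) \in \overline{R}$$
$$(p, q) \in \overline{R} \Leftrightarrow k^{*}_{i,j} = k^{*}_{i',j'}$$

where $i = \lfloor p/(n - l + 1) \rfloor$, $j = p \bmod (n - l + 1)$, $i' = \lfloor q/(n - l + 1) \rfloor$, and $j' = q \bmod (n - l + 1)$.

Lemma 15

Relations R and $\overline{R}$ are equivalent.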


Correctness

Note that the elements of B are l-mers Si,j, while the elements of $\overline{B}$ (the corresponding bucket in CUDA-FMURP v1) are integers p ∈ {0, ..., (x − 1)}. Using the equations

$$tid = i \times (n - l + 1) + j \tag{2}$$

$$i = \left\lfloor \frac{tid}{n - l + 1} \right\rfloor \tag{3}$$

$$j = tid \bmod (n - l + 1) \tag{4}$$

we can retrieve the l-mer Si,j corresponding to tid and vice versa. The theorem below follows from the fact that R and $\overline{R}$ are equivalent.

Theorem 4.2

The sets of enriched buckets EB and $\overline{EB}$ are equivalent.
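Equations (2) to (4) describe a bijection between thread IDs and (sequence, position) pairs. A minimal Python sketch, with illustrative function names and assuming all sequences share the same length n:

```python
def to_tid(i: int, j: int, n: int, l: int) -> int:
    """Equation (2): thread ID of the l-mer S_{i,j}."""
    return i * (n - l + 1) + j


def from_tid(tid: int, n: int, l: int):
    """Equations (3) and (4): recover (i, j) from a thread ID."""
    return divmod(tid, n - l + 1)


n, l, t = 8, 4, 3
print(to_tid(1, 2, n, l))
print(from_tid(7, n, l))
```

Because the mapping is invertible, storing tids in a bucket loses no information compared with storing the l-mers themselves.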


CUDA-FMURP v2

INPUT: Set of sequences S, motif length l, expected mismatches d, projection dimension k, and bucket threshold δ
OUTPUT: Motif

1. In CPU, generate k random positions for projection. Let this be the set I = {i | i ∈ {0, ..., (l − 1)}} with |I| = k.
2. In GPU, for each thread tid in {0, ..., (x − 1)},
   1. Get hI(Si,j) for all Si,j in S, where i ∈ {0, ..., (t − 1)} and j ∈ {0, ..., (n − l)}.
   2. Convert each k-mer hI(Si,j) to its corresponding integer representation k∗i,j.
3. In CPU, hash the list of k∗i,j.
4. In CPU, identify the set of enriched buckets.
5. In GPU, for each tid in {0, ..., (e − 1)},
   1. Perform EM refinement for each enriched bucket.
   2. Perform SP-STARσ for each enriched bucket.
   3. Maximize the σ score to output the best motif.
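Steps 3 and 4 of v2 replace the per-thread linear search with a single hashing pass on the CPU. The Python sketch below uses a built-in dictionary in place of a hand-written hash table; the key list is a hypothetical GPU output for three sequences of length 8 with l = 4 and I = {0, 1}, and `enriched_buckets` is an illustrative name.

```python
from collections import defaultdict


def enriched_buckets(keys, delta):
    """Group thread IDs by integer key and keep groups of size >= delta."""
    table = defaultdict(list)
    for tid, key in enumerate(keys):  # one pass over the list returned by the GPU
        table[key].append(tid)
    return {key: tids for key, tids in table.items() if len(tids) >= delta}


# Hypothetical list of k*_{i,j} values, indexed by tid.
keys = [6, 10, 11, 13, 4, 15, 13, 6, 8, 1, 1, 6, 8, 3, 14]
print(enriched_buckets(keys, 3))
```

Each bucket is produced exactly once, so no duplicate pruning is needed, in contrast to the linear search of v1.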


Hash Table in CPU

To resolve collisions between two items with different $k^*_{i,j}$, linear probing is implemented.
Suppose we hash item p with key $k^*_{i',j'}$ and find that position $h(k^*_{i',j'})$ is not empty, i.e. there exists $k^*_{i,j}$ such that $h(k^*_{i,j}) = h(k^*_{i',j'})$ and $k^*_{i,j} \neq k^*_{i',j'}$.
We have to look for an empty position in the table where we can place item p.
We explore positions

$$h'(k^*_{i',j'}, i) = (h(k^*_{i',j'}) + i) \bmod x$$

for i from 0 to (m − 1), until an empty hash table position is found.
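A minimal Python sketch of linear probing for bucket keys is given below. It assumes the hash function is the key modulo the table size and that each slot stores one key together with the list of items hashed to it; neither detail is stated on the slide, and the name `insert` is hypothetical.

```python
def insert(table, key, item):
    """Place item in the slot owned by key, probing linearly past other keys.

    table is a list whose entries are None (empty) or (key, [items]).
    Returns the position used.
    """
    size = len(table)
    home = key % size  # assumed hash function h(key) = key mod table size
    for i in range(size):
        pos = (home + i) % size
        slot = table[pos]
        if slot is None:
            # Empty position found: start a new bucket for this key.
            table[pos] = (key, [item])
            return pos
        if slot[0] == key:
            # Same key: the item belongs to the existing bucket.
            slot[1].append(item)
            return pos
    raise RuntimeError("hash table is full")
```

For example, in a table of size 5 the keys 6 and 11 both hash to position 1, so the second one is placed at position 2, while a later item with key 6 joins the bucket at position 1.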

CUDA-FMURP v3

INPUT: Set of sequences S, motif length l, expected mismatches d, projection dimension k, and bucket threshold δ
OUTPUT: Motif

1 In CPU, generate k random positions for projection. Let this be the set $I = \{i \mid i \in \{0, \dots, l-1\}\}$ and $|I| = k$.
2 In GPU, for each thread tid in $\{0, \dots, t-1\}$,
    1 Get $h_I(S_{tid,j})$ for all $S_{tid,j}$ in S, where $j \in \{0, \dots, n-l\}$.
    2 Convert each k-mer $h_I(S_{tid,j})$ to its corresponding integer representation $k^*_{tid,j}$.
3 In CPU, hash the list of $k^*_{i,j}$.
4 In CPU, identify the set of enriched buckets.
5 In GPU, for each thread tid in $\{0, \dots, e-1\}$,
    1 Perform EM refinement for each enriched bucket.
    2 Perform SP-STARσ for each enriched bucket.
    3 Maximize σ score to output best motif.
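The difference between the two decompositions can be sketched in Python as below. The sketch assumes that x, the number of threads in v2, equals the number of l-mers t(n − l + 1), so that each v2 thread owns one l-mer; the slides in this part do not define x, and both function names are hypothetical.

```python
def flat_items(t, n, l):
    """v2-style split (assumed): one work item per l-mer, x = t * (n - l + 1)."""
    per_sequence = n - l + 1
    # divmod turns a flat thread id into (sequence index i, start position j).
    return [divmod(tid, per_sequence) for tid in range(t * per_sequence)]


def per_sequence_items(t, n, l):
    """v3-style split: work item tid covers every start position of sequence tid."""
    return [[(tid, j) for j in range(n - l + 1)] for tid in range(t)]
```

Both splits cover the same (i, j) pairs. The v3 split uses only t work items, each of which loops over n − l + 1 positions.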

CUDA-Projection v3

Integer Conversion

Result and Analysis

Running Time and Space Complexity

Algorithm        Time          Space              Number of Processors
FMURP            O(x log x)    O(x)               1
SEQ-FMURP        O(x^2)        O(e(n − l + 1))    1
CUDA-FMURP v1    O(x)          O(e(n − l + 1))    x
CUDA-FMURP v2    O(x)          O(e(n − l + 1))    x
CUDA-FMURP v3    O(x)          O(e(n − l + 1))    t

Table: Total running time and space complexity of the three parallel algorithms for CUDA-FMURP in comparison with the two sequential implementations.

Speedup and Efficiency

FMURP: O(x log x)

Speedup is the ratio of the sequential running time to the parallel running time.

$$S_P = \frac{\text{Sequential}}{\text{Parallel}}$$

A comparison of the speedups $S_P$, $S'_P$, and $S''_P$ for CUDA-FMURP versions 1 to 3, respectively, is shown below.

$$S_P = S'_P = S''_P = \frac{O(x \log x)}{O(x)} = O(\log x)$$

Processor efficiency is computed from the speedup $S_P$ and the number of processors used $p$.

$$E_P = \frac{1}{p} \cdot S_P$$

A comparison of the efficiencies $E_P$, $E'_P$, and $E''_P$ for CUDA-FMURP versions 1 to 3, respectively, is shown below.

$$E_P = \frac{1}{x} \cdot O(\log x) = \frac{\log x}{x} \tag{5}$$

$$E'_P = \frac{1}{x} \cdot O(\log x) = \frac{\log x}{x} \tag{6}$$

$$E''_P = \frac{1}{t} \cdot O(\log x) = \frac{\log x}{t} \tag{7}$$

$$E_P = E'_P < E''_P$$
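The two formulas can be written as small Python helpers. The numeric comparison that follows is only illustrative: it drops the constants hidden by the O-notation, uses a base-2 logarithm, and assumes x = t(n − l + 1) with t = 20, n = 600, l = 10, which is an assumption about what x denotes.

```python
import math


def speedup(t_sequential, t_parallel):
    """S_P: sequential running time divided by parallel running time."""
    return t_sequential / t_parallel


def efficiency(s_p, processors):
    """E_P = (1 / p) * S_P."""
    return s_p / processors


# Illustrative values only (assumed, see the lead-in).
t, n, l = 20, 600, 10
x = t * (n - l + 1)
s_asymptotic = math.log2(x)            # stands in for O(log x)
e_v1 = efficiency(s_asymptotic, x)     # versions 1 and 2 use x processors
e_v3 = efficiency(s_asymptotic, t)     # version 3 uses t processors
```

Under these assumptions `e_v3` exceeds `e_v1` by the factor x / t, which is the ordering stated in the last relation.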

Dataset

t     n      l     d    Instances generated
20    600    10    2    100
20    600    11    2    100
20    600    12    3    100
20    600    13    3    100
20    600    14    4    100
20    600    15    4    100
20    600    16    5    100
20    600    17    5    100
20    600    18    6    100
20    600    19    6    100

Table: Summary of the generated dataset used to determine the accuracy of CUDA-FMURP. For each generated instance, the OOPS search model is assumed, that is, each sequence contains exactly one occurrence of the planted motif.

Accuracy

t     n      l     d    FMURP    FMURP*    SEQ-FMURP    CUDA-FMURP    m
20    600    10    2    13       100       98           98            72
20    600    11    2    99       100       100          100           16
20    600    12    3    3        96        83           83            259
20    600    13    3    81       100       100          100           62
20    600    14    4    1        86        79           79            645
20    600    15    4    49       100       100          100           172
20    600    16    5    0        77        53           53            1292
20    600    17    5    19       98        98           98            378
20    600    18    6    0        82        38           38            2217
20    600    19    6    9        98        94           94            711

Table: Number of correctly identified planted motifs over 100 random input instances. For each instance, parameters k = 7 and s = 4 are used. The column labelled FMURP* is based on the results presented in [Tompa,2001], using the dataset they generated.

Machine Setups

System specifications       Values
Host processors (procs)     2 × Intel Quad-core 2.26GHz
Total number of cores       8
Max host RAM                12GB
Device/s (GPU/s)            2 × NVIDIA GT120
Compute capability          1.1
CUDA Cores/GPU              4 (multiprocs) × 8 (cores/proc) = 32
GPU clock rate              1.40 GHz
Memory clock rate           500 MHz
Max device global memory    512MB
Operating system            Mac OS X 10.6.8
CUDA version                3.2

System specifications       Values
Host processors (procs)     Core(TM) i7-2600 CPU 3.40GHz
Total number of cores       4 × 2 (hyperthreaded) = 8
Max host RAM                8GB
Device/s (GPU/s)            1 × NVIDIA GeForce GTX 580
Compute capability          2.0
CUDA Cores/GPU              16 (multiprocs) × 32 (cores/proc) = 512
GPU clock rate              1.54 GHz
Memory clock rate           2004 MHz
Max device global memory    1535MB
Operating system            64-bit Ubuntu 10.0.4
CUDA version                4.1

Actual Speedup

Actual speed of CUDA-Projection v3 with respect to CUDA-Projection v1

Actual speed of CUDA-FMURP v1 and CUDA-Projection v3

Actual Speed Result: Setup 1

Figure: Actual speed comparison and speedup of CUDA-FMURP v1 with respect to SEQ-FMURP using Setup 1.

Memory Requirement

Actual speed comparison and speedup of CUDA-FMURP v1 with respect to SEQ-FMURP and FMURP using Setup 2

Conclusion

In this work, we presented three versions of parallel algorithms for FMURP.

Algorithm        Processors    S_P wrt FMURP    S_P wrt SEQ-FMURP    Efficiency
CUDA-FMURP v1    x             O(log x)         O(x)                 (log x)/x
CUDA-FMURP v2    x             O(log x)         O(x)                 (log x)/x
CUDA-FMURP v3    t             O(log x)         O(x)                 (log x)/t

We implemented CUDA-FMURP v1 and CUDA-FMURP v2 and achieved maximum actual speedups of 6.8 and 6.6, respectively, with respect to SEQ-FMURP.

References

J.P. Allouche and J. Shallit, “Automatic Sequences: Theory, Applications, Generalizations”, Cambridge University Press, Chapter 3: Numeration Systems, pp. 70-73, 2003

P. Pevzner and S. H. Sze, “Combinatorial Approaches to Finding Subtle Signals in DNA Sequences”, Proceedings of the 8th Int. Conf. on Intelligent Systems for Molecular Biology (ISMB), pp. 269-78, 2000

J. Buhler and M. Tompa, “Finding Motifs Using Random Projections”, RECOMB ’01: Proceedings of the Fifth Annual International Conference on Computational Biology, 2001

D. Kirk and W. Hwu, “Programming Massively Parallel Processors: A Hands-on Approach”, 1st ed., MA, USA: Morgan Kaufmann, 2010

M. Harris, “Mapping Computational Concepts to GPUs”, ACM SIGGRAPH 2005 Courses, NY, USA, 2005

N. Jones and P. Pevzner, “An Introduction to Bioinformatics Algorithms”, Massachusetts Institute of Technology Press, 2004

M. K. Das and H.-K. Dai, “A Survey of DNA Motif Finding Algorithms”, BioMed Central Bioinformatics, 2007

S. Rombauts, P. Dehais, M. Van Montagu, and P. Rouze, “PlantCARE, a Plant cis-Acting Regulatory Element Database”, Nucleic Acids Res, 1999

K. Shida, “Hybrid Gibbs-Sampling Algorithm for Challenging Motif Discovery: GibbsDST”, Genome Informatics, 17(2):3-13, 2006

A. Mohammed, W. Cheung, J. G. Estrada, and J. King, “Improvement of the PROJECTION Motif Finding Algorithm”, 2004

Extra Slides

Finding Motifs using Random Projection (FMURP)

INPUT: Set of sequences S, motif length l, expected mismatches d, projection dimension k, and bucket threshold δ
OUTPUT: Motif

1 Projection
    1 Generate k random positions for projection. Let this be the set $I = \{i \mid i \in \{0, \dots, l-1\}\}$ and $|I| = k$.
    2 For each $S_{i,j}$ in S,
        1 Get $h_I(S_{i,j})$ from all $S_{i,j}$ in S, where $i \in \{0, \dots, t-1\}$ and $j \in \{0, \dots, n-l\}$.
        2 Sort the $S_{i,j}$ with respect to $h_I(S_{i,j})$.
        3 Perform a linear search over all $h_I(S_{i,j})$ to determine which l-mers are ‘hashed’ in the same bucket.
2 Refine each enriched bucket using Expectation Maximization (EM).
3 Refine each enriched bucket using SP-STARσ.
4 Maximize score to output best motif.

Projection: Example

Given a set of DNA sequences S, pattern length l = 4, projection dimension k = 2, and bucket threshold δ = 3.

S0 : C G G T C A G G
S1 : T T C G A C A T
S2 : A C G A T G A A

Figure: Set of t = 3 sequences, each of length n = 8

We generate the set of k random positions used in the actual projection. Suppose we have the set I = {0, 1}.
For all $S_{i,j}$ in S, we get $h_I(S_{i,j})$ using the random positions in I generated in step 1.
To hash each $S_{i,j}$ to its bucket using $h_I(S_{i,j})$, the list defined above is sorted lexicographically by $h_I(S_{i,j})$, together with the corresponding $S_{i,j}$.
The sorted list is obtained.

Label    S_{i,j}    h_I(S_{i,j})    Label    Sorted S_{i,j}    Sorted h_I(S_{i,j})
S0,0     CGGT       CG              S2,0     ACGA              AC
S0,1     GGTC       GG              S1,4     ACAT              AC
S0,2     GTCA       GT              S2,3     ATGA              AT
S0,3     TCAG       TC              S0,4     CAGG              CA
S0,4     CAGG       CA              S0,0     CGGT              CG
S1,0     TTCG       TT              S2,1     CGAT              CG
S1,1     TCGA       TC              S1,2     CGAC              CG
S1,2     CGAC       CG              S1,3     GACA              GA
S1,3     GACA       GA              S2,2     GATG              GA
S1,4     ACAT       AC              S0,1     GGTC              GG
S2,0     ACGA       AC              S0,2     GTCA              GT
S2,1     CGAT       CG              S1,1     TCGA              TC
S2,2     GATG       GA              S0,3     TCAG              TC
S2,3     ATGA       AT              S2,4     TGAA              TG
S2,4     TGAA       TG              S1,0     TTCG              TT

Figure: The set of $h_I(S_{i,j})$ computed in step 2. The list sorted by $h_I(S_{i,j})$ is shown in the right part of the table.

To get the list of buckets, we perform a linear search over the sorted $h_I(S_{i,j})$ to collect the $S_{i,j}$ with equal $h_I(S_{i,j})$.

h_I(S_{i,j})    Count    S_{i,j}
AC              2        { ACGA, ACAT }
AT              1        { ATGA }
CA              1        { CAGG }
CG              3        { CGGT, CGAT, CGAC }
GA              2        { GACA, GATG }
GG              1        { GGTC }
GT              1        { GTCA }
TC              2        { TCGA, TCAG }
TG              1        { TGAA }
TT              1        { TTCG }

Figure: Buckets obtained from Projection

From the set of buckets obtained, we identify those that contain at least δ hashed l-mers and consider them enriched. With δ = 3, only bucket CG, which holds 3 l-mers, is enriched.
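The sort-then-scan bucketing of this example can be sketched in Python as follows. The function names are hypothetical, and `itertools.groupby` stands in for the linear search over the sorted list.

```python
from itertools import groupby


def buckets_by_sorting(sequences, l, positions):
    """Group all l-mers by their projection h_I using a sort and one linear pass."""
    pairs = []
    for seq in sequences:
        for j in range(len(seq) - l + 1):
            lmer = seq[j:j + l]
            key = "".join(lmer[p] for p in sorted(positions))
            pairs.append((key, lmer))
    # Stable sort by projection only, so equal keys become adjacent.
    pairs.sort(key=lambda pair: pair[0])
    return {
        key: [lmer for _, lmer in group]
        for key, group in groupby(pairs, key=lambda pair: pair[0])
    }


def enriched(buckets, delta):
    """Keep the buckets that hold at least delta l-mers."""
    return {key: lmers for key, lmers in buckets.items() if len(lmers) >= delta}


S = ["CGGTCAGG", "TTCGACAT", "ACGATGAA"]
buckets = buckets_by_sorting(S, l=4, positions={0, 1})
enriched_buckets = enriched(buckets, delta=3)
```

For the three sequences above this yields ten buckets, and `enriched_buckets` contains only the CG bucket.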

Expectation Maximization (EM)

INPUT: Motif model $\theta_0$ from one enriched bucket, maximum number of iterations, and threshold for convergence $\delta_{EM}$
OUTPUT: Motif model $\theta_y$

1 For j in $\{1, \dots, y\}$ or until convergence
    1 (E-step) For all l-mers in each sequence $S_i$, compute $E(S_{i,a_i} \mid \theta_j)$ given the current motif model.
    2 (M-step) For all $S_i$ in S, get starting positions s such that for each $a_i \in s$, $E(S_{i,a_i} \mid \theta_j)$ is maximum over all $a_i$ in $\{0, \dots, n-l\}$.
    3 (Test for Convergence) Compute $L(\theta_j)$. Compare the previous likelihood $L(\theta_{j-1})$ to the current $L(\theta_j)$. If the difference satisfies the threshold $\delta_{EM}$, stop iteration.
    4 (Update step) For the alignment made by the starting position vector s identified in the M-step, get motif model $\theta_{j+1}$.

EM: Example

From the set of enriched buckets from Projection, EM performs the following operations.

From an enriched bucket (EB), get the alignment made by the hashed l-mers.

C G G T
C G A C
C G A T

From the alignment made, a profile matrix is computed.

A: 0 0 2 0
C: 3 0 0 1
G: 0 3 1 0
T: 0 0 0 2

Normalize the profile matrix obtained.

A: 0.00 0.00 0.66 0.00
C: 1.00 0.00 0.00 0.33
G: 0.00 1.00 0.33 0.00
T: 0.00 0.00 0.00 0.66

To avoid zero values for $\Pr(S_{i,j} \mid \theta)$, [Tompa,2001] performed Laplace correction. For each row corresponding to a symbol, say a, the probability $p_a$ that symbol a appears in the sequence is added to that row. Since all symbols in $\Sigma_{DNA}$ have a uniform frequency distribution, 0.25 is added to each row.

A: 0.25 0.25 0.91 0.25
C: 1.25 0.25 0.25 0.58
G: 0.25 1.25 0.58 0.25
T: 0.25 0.25 0.25 0.91

Normalize the matrix obtained and let the resulting matrix be the initial motif model $\theta_0$.

A: 0.125 0.125 0.455 0.125
C: 0.625 0.125 0.125 0.290
G: 0.125 0.625 0.290 0.125
T: 0.125 0.125 0.125 0.455

For each $S_i$ in S, get the j in $\{0, \dots, n-l\}$ for which $E(S_{i,j} \mid \theta_0)$ is maximum. For instance, let us identify the l-mer in sequence $S_0$ with maximum expectation $E(S_{0,j} \mid \theta_0)$.

$E(S_{0,0} \mid \theta_0) = E(CGGT \mid \theta_0) = ((0.625)(0.625)(0.290)(0.455))/(0.25^4) = 13.195$
$E(S_{0,1} \mid \theta_0) = E(GGTC \mid \theta_0) = ((0.125)(0.625)(0.125)(0.290))/(0.25^4) = 0.725$
$E(S_{0,2} \mid \theta_0) = E(GTCA \mid \theta_0) = ((0.125)(0.125)(0.125)(0.125))/(0.25^4) = 0.063$
$E(S_{0,3} \mid \theta_0) = E(TCAG \mid \theta_0) = ((0.125)(0.125)(0.455)(0.125))/(0.25^4) = 0.228$
$E(S_{0,4} \mid \theta_0) = E(CAGG \mid \theta_0) = ((0.625)(0.125)(0.290)(0.125))/(0.25^4) = 0.725$

Among all $S_{0,j}$ in $S_0$, the l-mer $S_{0,0}$ obtains the highest expectation.

The set of l-mers with the highest expectation in each sequence defines another alignment, as in Step 1. From this set of l-mers, we can obtain the next motif model $\theta_1$.

S0,0 : C G G T : 13.20
S1,2 : C G A C : 13.20
S2,1 : C G A T : 20.70

We compute the likelihood of a motif model $\theta_y$ using the best expectations.

$$L(\theta) = 13.20 + 13.20 + 20.70 = 47.10$$

Update the motif model $\theta_0$ to get $\theta_1$, using the set of l-mers from each sequence that maximize the expectation.

Stop iteration if $L(\theta_y) - L(\theta_{y-1}) \leq \delta_{EM}$.

The output of EM in this example is the consensus string CGAT.
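The construction of the initial model and the expectation step of this example can be sketched in Python as below. The function names are hypothetical, and the sketch keeps exact fractions (1/3, 2/3) instead of the two-decimal values on the slides, so its scores differ slightly from the rounded figures while selecting the same l-mers.

```python
ALPHABET = "ACGT"
BACKGROUND = 0.25  # uniform background probability, as in the example


def initial_model(lmers):
    """Column frequencies plus the background probability, renormalized per column."""
    model = []
    for col in range(len(lmers[0])):
        column = [lmer[col] for lmer in lmers]
        raw = {a: column.count(a) / len(lmers) + BACKGROUND for a in ALPHABET}
        total = sum(raw.values())
        model.append({a: raw[a] / total for a in ALPHABET})
    return model


def expectation(lmer, model):
    """Ratio of the l-mer's probability under the model to its background probability."""
    ratio = 1.0
    for col, symbol in enumerate(lmer):
        ratio *= model[col][symbol] / BACKGROUND
    return ratio


def best_positions(sequences, model):
    """For each sequence, the start position whose l-mer has the highest expectation."""
    l = len(model)
    best = []
    for seq in sequences:
        scores = [expectation(seq[j:j + l], model) for j in range(len(seq) - l + 1)]
        best.append(max(range(len(scores)), key=scores.__getitem__))
    return best


theta0 = initial_model(["CGGT", "CGAC", "CGAT"])
starts = best_positions(["CGGTCAGG", "TTCGACAT", "ACGATGAA"], theta0)
```

Here `starts` picks positions 0, 2, and 1, that is, the l-mers CGGT, CGAC, and CGAT, which matches the alignment used for the update step.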

SP-STARσ

INPUT: Consensus string M from $\theta_y$ and expected mismatches d
OUTPUT: Refined consensus string $M^*$

1 For j in $\{1, \dots, y'\}$ or until convergence
    1 Compute $S_b$, where $S_b$ is the set of all l-mers from each sequence that have the least edit distance from M.

      $$S_b = \{S_{i,j} \mid d_E(M, S_{i,j}) \text{ is minimum } \forall S_{i,j} \text{ in } S_i\}$$

    2 Compute the score $\sigma(S_b)$, which is equal to the number of sequences in $S_b$ such that

      $$d_E(M, S_{i,j}) \leq d$$

    3 Compute the consensus string $M'$ from the alignment made by $S_b$.
    4 Compute $S'_b$ from $M'$.
    5 Compute $\sigma(S'_b)$.
    6 If $\sigma(S'_b) > \sigma(S_b)$, continue iteration using $M = M'$, else $M^* = M'$.

SP-STARσ: Example

Using M = CGAT and expected mismatches d = 1.

Compute $S_b$. For $S_0$, the $S_{0,j}$ is identified as follows.

$d_E(M, S_{0,0}) = d_E(CGAT, CGGT) = 1$
$d_E(M, S_{0,1}) = d_E(CGAT, GGTC) = 3$
$d_E(M, S_{0,2}) = d_E(CGAT, GTCA) = 4$
$d_E(M, S_{0,3}) = d_E(CGAT, TCAG) = 3$
$d_E(M, S_{0,4}) = d_E(CGAT, CAGG) = 3$

The set $S_b$ contains

$S_b = \{S_{0,0}, S_{1,2}, S_{2,1}\}$

$S_b = \{CGGT, CGAC, CGAT\}$

The score for $S_b$ is

$$\sigma(S_b) = 3$$

because the least edit distances in the three sequences are 1, 1, and 0. That is, all 3 sequences satisfy

$$d_E(M, S_{i,j}) \leq 1$$

The consensus string from $S_b$ is $M'$ = CGAT.

$S'_b$ from $M'$ is the same as $S_b$.

$S'_b = \{S_{0,0}, S_{1,2}, S_{2,1}\}$

$S'_b = \{CGGT, CGAC, CGAT\}$

Since $\sigma(S_b) = \sigma(S'_b)$, $M^* = M$ = CGAT.
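One SP-STARσ scoring pass over this example can be sketched in Python as follows. The slides call $d_E$ an edit distance; the sketch assumes substitutions only (Hamming distance on equal-length strings), which reproduces the distances listed in the example. All function names are hypothetical.

```python
from collections import Counter


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def closest_lmers(sequences, motif):
    """S_b: from each sequence, the l-mer with the least distance to the motif."""
    l = len(motif)
    return [
        min((seq[j:j + l] for j in range(len(seq) - l + 1)),
            key=lambda lmer: hamming(motif, lmer))
        for seq in sequences
    ]


def sigma(lmers, motif, d):
    """Number of selected l-mers within d mismatches of the motif."""
    return sum(hamming(motif, lmer) <= d for lmer in lmers)


def consensus(lmers):
    """Most frequent symbol in each column of the alignment."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*lmers))


S = ["CGGTCAGG", "TTCGACAT", "ACGATGAA"]
S_b = closest_lmers(S, "CGAT")
score = sigma(S_b, "CGAT", d=1)
M_prime = consensus(S_b)
```

With M = CGAT and d = 1 the sketch selects CGGT, CGAC, and CGAT, scores 3, and returns the consensus CGAT, so a second pass would not improve the score.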
