
DNA: Structure, Notation, Operations

Vincenzo Manca

Dipartimento di Informatica

Università di Verona

13 Years of Molecular Computing

1994 Adleman's Experiment *
1995 Lipton's Model *
1996 Int. Conf. on Math. Linguistics (Marcus)
1997 Mangalia (Paun, Head)
1998 MFCS Brno (Molecular Computing)
1999 (Paun's WMC)
2000 DNA6 Leiden *
2001 DNA7 Tampa (FL) : 3-SAT
2002 DNA8 Sapporo : DNA Duplication
2004 DNA10 Milano : XPCR Extraction
2005 DNA11 Ontario : XPCR Recombination
2007 DNA13 Memphis : MP Systems
2008 DNA14 Prague : Genetic Drift

DNA Computing Motto

Problem: Data and Requirements
Algorithm: Solutions

Encode data by DNA strands
Encode algorithms by biotech procedures
Decode final strands as solutions

http://profs.scienze.univr.it/~manca/

Faculty Page

Papers and Tutorials

A General Schema of Combinatorial Problems

A set of requirements for "assignments", that is, 0/1 sequences of some length n

The space of possible solutions has E(2,n) = 2^n elements, but only some of them satisfy the requirements

Encode assignments by DNA strands

Encode requirements as biotech protocols that filter the strands encoding the true solutions
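The generate-and-filter schema can be mimicked in software. The following Python sketch is only an illustration: the function names and the two sample requirements are assumptions made for this example, not part of the slides.

```python
from itertools import product

def solution_space(n):
    """All 0/1 assignments of length n; there are 2**n of them."""
    return list(product((0, 1), repeat=n))

def filter_solutions(space, requirements):
    """Keep the assignments that satisfy every requirement (a predicate)."""
    return [a for a in space if all(req(a) for req in requirements)]

# Hypothetical requirements on assignments of length 3.
requirements = [
    lambda a: a[0] or a[1],         # first or second bit is set
    lambda a: not (a[1] and a[2]),  # second and third bit are not both set
]

space = solution_space(3)
solutions = filter_solutions(space, requirements)
print(len(space), len(solutions))
```

In the DNA setting each assignment would be a strand and each requirement a filtering protocol; here lists and predicates stand in for both.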

Space Generation: in linear time

Solution Extraction: in linear time

!!!

New Trends in DNAC

o DNA Self Assembly (Seeman, Winfree, …)

o DNA Automata (Shapiro)

o DNA Algorithms ==> new biotech protocols

Biotech Protocols

Algorithms

DNA Computing / Computing DNA

A change of perspective

In the search for ways to implement algorithms on DNA, general algorithmic principles are discovered in fundamental biomolecular processes.

Nucleotides

[Figure: three nucleotides with sugar carbons numbered 1' to 5', bases A, C, T, G, phosphate groups (PO4 3-, P3O10 5-), and the 5' and 3' strand ends]

Bases: {A, T, C, G}

Nucleoside

Bonds: phosphodiester, glycosidic, hydrogen

Nucleotide mass: ~330 Dalton; 1 Dalton = 1.66 × 10^-24 g; 1 g of H = 6.02 × 10^23 atoms

1' to 1' distance: ~1 nm

A few grams of DNA = the amount of all electronic information stored in the world

The marvelous form

Bilinearity
Complementarity
Antiparallelism

[Figure: strand ends labeled 5' and 3']

V. Manca, On the logic of bilinear forms, Fundamenta Informaticae, 2006

Strand Hybridization

[Figure: strands labeled α, β, γ]

DNA Ligase

Ligase joins 5' phosphate to 3' hydroxyl

[Figure: strands α, δ paired with α', δ', before and after ligation]

Strings

Strings over an alphabet are sequences of symbols of the alphabet:

abbabbba

On strings an associative concatenation operation is defined (λ is the empty string):

(αβ)γ = α(βγ)
α = αλ = λα

A language L is a set of strings

DNA Sequences are Mobile Double Strings

B = {A, T, C, G}

B* = strings over B

α[i,j] (the substring of α from position i to position j)

|α| (the length of α)

s is an α-strand, or s : α, or type(s) = α

α : n, or mult(α) = n

Complementation c (involutive)
Reverse rev (involutive)
Mirror mir (involutive)

mir(α) = rev(α^c)

Reverse and Complementation commute
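The three operations and their stated properties can be checked on a concrete string. This is a minimal Python sketch; the function names `comp`, `rev`, `mir` mirror the slide notation, and the sample string is chosen only for illustration.

```python
# Watson-Crick complement of each base.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def comp(s):
    """Complementation: replace every base by its complement."""
    return "".join(COMPLEMENT[b] for b in s)

def rev(s):
    """Reverse: read the string backwards."""
    return s[::-1]

def mir(s):
    """Mirror: reverse of the complement, mir(s) = rev(comp(s))."""
    return rev(comp(s))

alpha = "ATTGGC"
assert comp(comp(alpha)) == alpha            # complementation is involutive
assert rev(rev(alpha)) == alpha              # reverse is involutive
assert mir(mir(alpha)) == alpha              # mirror is involutive
assert rev(comp(alpha)) == comp(rev(alpha))  # reverse and complementation commute
print(mir(alpha))  # GCCAAT
```

Mirror is involutive precisely because the other two operations are involutive and commute.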

DNA Sequences are Floating Double Strings

B = {A, T, C, G}

B/B* = double and single strings over B

Hybridization: ||, ] [, ] γ [

Pairing: α β

Hybridization: α || mir(α)

α ] γ [ β <==> α ⊃ γ, β ⊃ mir(γ)

α ] [ β <==> α ] γ [ β for some γ

Pairing: α ] [ β ==> α / rev(β)
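The relation α ] γ [ β can be tested directly on strings: look for a substring γ of α whose mirror occurs in β. The Python sketch below is an illustration under assumptions: the function names are hypothetical, and the `min_len` threshold is not in the slides (without some minimum length almost any two strands would count as hybridizing).

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def mir(s):
    """Mirror: reversed complement of s."""
    return "".join(COMPLEMENT[b] for b in reversed(s))

def hybridization_sites(alpha, beta, min_len=4):
    """All substrings gamma of alpha (length >= min_len) with mir(gamma) in beta."""
    sites = set()
    for i in range(len(alpha)):
        for j in range(i + min_len, len(alpha) + 1):
            gamma = alpha[i:j]
            if mir(gamma) in beta:
                sites.add(gamma)
    return sites

def hybridizes(alpha, beta, min_len=4):
    """True if alpha ] gamma [ beta holds for some gamma of sufficient length."""
    return bool(hybridization_sites(alpha, beta, min_len))

print(hybridizes("ATTGGC", "GCCAAT", min_len=6))
print(hybridizes("AAAA", "AAAA", min_len=2))
```

The quadratic scan over substrings is adequate for short oligos; it is meant to make the definition concrete, not to model annealing thermodynamics.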

ATTGGCGCCAAT

ATTGGC

GCCAAT

Axiom

[Axiom on stacked double strings; the layout was lost in extraction, terms as extracted: α, rev(α), rev(β), β]

<α> = <mir(α)>

Fraction Notation

α / λ = α = α->

λ / α = rev(α) = <- α

α / mir(α) = <α>

<α> = <mir(α)>

B = {A, T, C, G}

BB* = (double) strings over B

ext

overlap

overlapping concatenation

paired concatenation

Polymerase Extension

ext

Overlap Relation, Overlapping Concatenation

overlap

overlapping concatenation

Ligase Catenation

paired concatenation

A pool P of DNA molecules is a multiset of strands

i) Set of strands typed by strings

ii) Set of strings with multiplicities

P = {s1:α1, s2:α2, …}

P = {α1: n1, α2: n2, …}

mult_P(α1) = n1, mult_P(α2) = n2

s ∈ P

α ∈ P

Types of DNA Pools are Languages of BB*

Type(T) = {η ∈ BB* | s : η, s ∈ T}
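Both views of a pool, and its type, map naturally onto a multiset data structure. The sketch below uses Python's `collections.Counter`; the helper names `mult`, `pool_type`, and `strands` are assumptions chosen to echo the slide notation, and single-strand strings stand in for the elements of BB*.

```python
from collections import Counter

# View ii): strings with multiplicities, P = {alpha1: n1, alpha2: n2, ...}
pool = Counter({"ATTGGC": 3, "GCCAAT": 1})

def mult(pool, alpha):
    """Multiplicity of the string alpha in the pool (0 if absent)."""
    return pool[alpha]

def pool_type(pool):
    """Type(P): the language of strings that type at least one strand of P."""
    return {alpha for alpha, n in pool.items() if n > 0}

def strands(pool):
    """View i): individual strands s1, s2, ... each typed by a string."""
    return [(f"s{k}", alpha) for k, alpha in enumerate(pool.elements(), start=1)]

print(mult(pool, "ATTGGC"))
print(pool_type(pool))
print(strands(pool))
```

The type forgets multiplicities, which is why it is a language (a set of strings) rather than a multiset.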

Test Tube Operations in DNAC

Denature (Melting)
Renature (Hybridization, Annealing)
Mix
Split
Fish (by Affinity)
Remove
Length
Separate (Gel Electrophoresis)
Ligate (Ligase)
Extend (Polymerase)
Synthesize (Oligos)
Infix

[Figure labels: pTpApCpGOH, pGOH, COH]
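Several of these operations have simple multiset counterparts. The following Python sketch covers only Mix, Split, Separate (by length) and Fish; the function names are hypothetical, the split is idealized as an exact halving, and fishing is modeled as a plain substring test rather than as probe hybridization.

```python
from collections import Counter

def mix(t1, t2):
    """Mix: pour two tubes together (multiset sum)."""
    return t1 + t2

def split(tube):
    """Split: idealized halving; an odd leftover strand goes to the first tube."""
    a = Counter({s: (n + 1) // 2 for s, n in tube.items()})
    b = Counter({s: n // 2 for s, n in tube.items() if n // 2 > 0})
    return a, b

def separate_by_length(tube, length):
    """Separate: keep the strands of a given length (gel electrophoresis)."""
    return Counter({s: n for s, n in tube.items() if len(s) == length})

def fish(tube, gamma):
    """Fish: keep the strands that contain gamma as a substring."""
    return Counter({s: n for s, n in tube.items() if gamma in s})

tube = Counter({"ATTG": 4, "GC": 3})
a, b = split(tube)
print(a, b)
print(mix(a, b) == tube)
print(separate_by_length(tube, 2))
print(fish(tube, "TT"))
```

Each function takes tubes and returns tubes, which matches the view of operations as maps between typed test tubes.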

Gel Electrophoresis: Separation of DNA fragments

[Figure: buffer, gel, two electrodes, samples; bands labeled Faster and Slower]

More Complex Operations

Amplification (PCR)
Sequencing (Sanger)
Restriction (R. Enzymes)
Cloning (Plasmid Transfection)

ddA, ddT, ddC, ddG

PCR: Polymerase Chain Reaction

PCR with 3' sticky end

[Figure: strands α, β with h(α), h(β); long and short products; exponential vs. linear amplification]

PCR Lemma

Let P be a pool of type {α⁄β} including primers γ, δ; then PCR(P, γ, δ) provides an exponential amplification iff one of 4 cases holds (defined by means of overlapping concatenation), and, at most at the third step, the (blunt) seed of an exponential amplification is generated (its form depends on the specific case which holds).

V. Manca, G. Franco, "Computing by polymerase chain reaction", Mathematical Biosciences, 211, 282–298, 2008.
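A small counting model shows why long products grow linearly while short products grow exponentially, and why the first blunt duplex appears at the third cycle. This Python sketch assumes the standard idealized PCR model (every strand is copied in every cycle; copies of original strands are "long", copies of long or short strands are "short"). It does not reproduce the four cases of the lemma, and the function name is hypothetical.

```python
def pcr_strand_counts(cycles, templates=1):
    """Count single strands by kind after each idealized PCR cycle."""
    orig = 2 * templates  # the two strands of each initial duplex
    long_ = 0             # copies of original strands (one end fixed by a primer)
    short = 0             # copies of long or short strands (both ends fixed)
    history = []
    for cycle in range(1, cycles + 1):
        new_long = orig            # each original strand yields one long copy
        new_short = long_ + short  # each long or short strand yields a short copy
        blunt = short              # short template + short copy = blunt duplex
        long_ += new_long
        short += new_short
        history.append({"cycle": cycle, "long": long_, "short": short,
                        "blunt_duplexes": blunt})
    return history

for row in pcr_strand_counts(5):
    print(row)
```

In this model the long strands increase by a constant amount per cycle, whereas the total number of strands doubles, so the short strands dominate after a few cycles.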

T of type L ==> Operation ==> T' of type L'

Test Tube Operations, Mathematically

Type(T) = L means that the types of the strands of T constitute the language L

Given some test tubes as arguments with some types, the operations provide as results test tubes with other types

DNA Test Tube Machine

Register Machines where:

- Registers are Test Tubes (multisets of strands instead of numbers)

- DNA Test Tube operations (instead of arithmetic operations)

Adleman's Problem

Given a graph (of seven nodes), find (if there are any) the paths from node 0 to node 6 that pass exactly once through every node (Hamiltonian paths)

Adleman-Lipton's Extract Model in Combinatorial Problems

The generation of all possible solutions in linear time

The extraction of true solutions in linear time

Extraction is performed in a number of sub-steps, and each of them selects all the strands that include a sub-strand of a given type

Adleman's Graph

Adleman's Encoding

Node i = αi βi

Arc ij = mir(βi αj)

|αi| = |βi| = 10, i, j = 0, …, 6

[Figure: node strands Ai Bi and Bj; arc strand Bj' Ai'; complements αi^c, βj^c]
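The encoding can be reproduced with random oligos. The Python sketch below is an illustration under assumptions: the sequences are drawn at random (a real design would also avoid unwanted cross-hybridization), the two sample arcs are hypothetical, and the function names are not from the slides.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def mir(s):
    """Mirror: reversed complement of s."""
    return "".join(COMPLEMENT[b] for b in reversed(s))

def random_oligo(length, rng):
    """A random string over {A, T, C, G} of the given length."""
    return "".join(rng.choice("ATCG") for _ in range(length))

def encode_graph(nodes, arcs, half=10, seed=0):
    """Node i = alpha_i beta_i; arc (i, j) = mir(beta_i alpha_j)."""
    rng = random.Random(seed)
    alpha = {i: random_oligo(half, rng) for i in nodes}
    beta = {i: random_oligo(half, rng) for i in nodes}
    node_strand = {i: alpha[i] + beta[i] for i in nodes}
    arc_strand = {(i, j): mir(beta[i] + alpha[j]) for (i, j) in arcs}
    return alpha, beta, node_strand, arc_strand

# Hypothetical arcs on seven nodes 0..6.
alpha, beta, node_strand, arc_strand = encode_graph(range(7), [(0, 1), (1, 2)])

# The arc strand is the mirror of the junction between node 0 and node 1,
# so it can act as a splint that holds the two node strands together.
print(mir(arc_strand[(0, 1)]) in node_strand[0] + node_strand[1])
```

Because the arc strand pairs with the second half of node i and the first half of node j, ligation of adjacent node strands yields a strand that spells out a path.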

Adleman's Algorithm

Generation of Hamiltonian paths from v0 to v6

Generate paths of G (hybridization/ligation)
Perform PCR with primers α0, mir(β6)
Separate paths of length 140 (7 x 20)
for j := 0 to 6 do
    Select strands where αjβj occurs
output remaining strands
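The filtering steps can be simulated on node sequences instead of strands. In the Python sketch below a walk of k nodes stands for a strand of 20·k bases; the exhaustive walk enumeration replaces random hybridization and ligation, and the example graph is hypothetical (it is not the graph of the slides).

```python
def walks(arcs, max_nodes):
    """All walks with 1..max_nodes nodes (stand-in for hybridization/ligation)."""
    succ = {}
    for i, j in arcs:
        succ.setdefault(i, []).append(j)
    nodes = {v for arc in arcs for v in arc}
    frontier = [(v,) for v in nodes]
    result = list(frontier)
    for _ in range(max_nodes - 1):
        frontier = [w + (j,) for w in frontier for j in succ.get(w[-1], [])]
        result.extend(frontier)
    return result

def adleman(arcs, start, end, n):
    """Hamiltonian paths from start to end on nodes 0..n-1, by filtering."""
    pool = walks(arcs, n)
    # PCR with the two primers: keep walks that begin at start and finish at end.
    pool = [w for w in pool if w[0] == start and w[-1] == end]
    # Separation by length: keep walks with exactly n nodes (20 * n bases).
    pool = [w for w in pool if len(w) == n]
    # One selection step per node: keep walks in which node j occurs.
    for j in range(n):
        pool = [w for w in pool if j in w]
    return pool

# Hypothetical seven-node graph.
arcs = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (0, 2), (2, 1), (1, 3)]
print(sorted(adleman(arcs, 0, 6, 7)))
```

A walk with exactly n nodes in which each of the n nodes occurs must visit every node once, which is why the length filter plus the n selections leave only Hamiltonian paths.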

Mix and Split Method

Generation of the solution space of N variables

Merge X1 and ¬X1 in a tube T
Split T into A and B
For J := 2 To N
    Extend strands of A with XJ
    Extend strands of B with ¬XJ
    Merge A and B into T
    Split T into A and B
Merge A and B
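The procedure can be traced with sets of bit tuples. In this Python sketch the literal XJ is written as 1 and ¬XJ as 0, and the split is idealized so that every strand type ends up in both halves; both choices, like the function name, are assumptions of the example.

```python
def mix_and_split(n):
    """Generate all assignments of n variables with merge, split and extend."""
    t = {(1,), (0,)}             # merge X1 and not-X1 in a tube T
    a, b = set(t), set(t)        # split T into A and B (idealized)
    for _ in range(2, n + 1):
        a = {s + (1,) for s in a}  # extend strands of A with XJ
        b = {s + (0,) for s in b}  # extend strands of B with not-XJ
        t = a | b                  # merge A and B into T
        a, b = set(t), set(t)      # split T into A and B
    return a | b                   # final merge

space = mix_and_split(3)
print(len(space))
print(sorted(space))
```

The loop performs a constant number of tube operations per variable, so the number of laboratory steps is linear in N even though the number of strand types doubles at each step.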

Lipton's Algorithm 3-Sat(N, M)

o Generate the N-variable solution space in T
o For J = 1 To M
    T1 := Extract[T, L(1,J)]
    T := T - T1
    T2 := Extract[T, L(2,J)]
    T := T - T2
    T3 := Extract[T, L(3,J)]
    T := Merge(T1, T2)
    T := Merge(T, T3)
o Detect T
o if T ≠ ∅, then take a clone and sequence it (Solution)
o else "Unsolvable Problem"
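The steps above translate almost line by line into set operations. In the Python sketch below a literal L(k,J) is written as a pair (variable index, required value) and strands are bit tuples; this representation and the function names are assumptions made for the example.

```python
from itertools import product

def extract(tube, literal):
    """Strands in which the variable of the literal has the required value."""
    var, value = literal
    return {a for a in tube if a[var] == value}

def lipton_3sat(n, clauses):
    """Return the assignments of n variables that satisfy every 3-literal clause."""
    t = set(product((0, 1), repeat=n))  # the N-variable solution space
    for l1, l2, l3 in clauses:
        t1 = extract(t, l1)
        t = t - t1
        t2 = extract(t, l2)
        t = t - t2
        t3 = extract(t, l3)
        t = t1 | t2                     # T := Merge(T1, T2)
        t = t | t3                      # T := Merge(T, T3)
    return t

# (x0 or x1 or x2) and (not x0 or not x1 or not x2)
clauses = [((0, 1), (1, 1), (2, 1)), ((0, 0), (1, 0), (2, 0))]
result = lipton_3sat(3, clauses)
print(sorted(result) if result else "Unsolvable Problem")
```

Each clause costs three extractions and two merges, so the number of operations grows linearly with M once the solution space has been generated.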

DNA Extraction

Strands of type γ are called γ-strands (or instances of γ)

A β-strand with β including γ as a substring is called a γ-superstrand (β is a γ-superstring)

Problem: Extract all the γ-superstrands of a pool P
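On a pool represented as a multiset of strings, extraction amounts to a substring filter. The following Python sketch is a software stand-in only; the function name is hypothetical, and it says nothing about how the extraction is carried out in the laboratory.

```python
from collections import Counter

def extract_superstrands(pool, gamma):
    """Split a pool into its gamma-superstrands and the remaining strands."""
    selected = Counter({b: n for b, n in pool.items() if gamma in b})
    rest = pool - selected  # Counter subtraction keeps only positive counts
    return selected, rest

pool = Counter({"ATTGGC": 2, "GGCCAA": 1, "TTTT": 5})
selected, rest = extract_superstrands(pool, "GGC")
print(selected)
print(rest)
```

Returning both parts mirrors the way an extraction step leaves two tubes, the extracted strands and the remainder.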

[Figure: strands X1, Y1, X2, Y2, X3, Y3, …, Xn, Yn]

From 2n strands to 2^n strands, starting from 4 strands (n-multiples of X, Y), in linear time.
