Post on 20-Dec-2015

Reconstructing Circular Order from Inaccurate Adjacency Information

Applications in NMR Data Interpretation

Ming-Yang Kao

Problem Description

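[Figure: circular arrangement labeled with the values 540, 190, 480, 160, 520 and 220, shown together with the pairs below.]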

(220,480)

(520,220)

(480,190)

(190,540)

(540,160)

(160,520)

Problem Description

[Figure: circular arrangement in which every position is marked "?".]

(220,480)

(520,220)

(480,190)

(190,540)

(540,160)

(160,520)

Given: the pairs above. Find: the correct order.
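For exact (noise-free) values, the order can be recovered by chaining the pairs. The sketch below is only an illustration: it assumes each pair (x, y) means "x is followed by y", which is one reading of the slide, and the function name `circular_order` is hypothetical.

```python
def circular_order(pairs):
    """Chain pairs (x, y), read as 'x is followed by y', into one circular order.

    Assumption: values are exact and each value occurs once as a first element.
    """
    successor = dict(pairs)
    start = pairs[0][0]
    order = []
    current = start
    while current not in order:
        order.append(current)
        current = successor[current]
    # The walk must close at the start and use every pair exactly once.
    if current != start or len(order) != len(pairs):
        raise ValueError("pairs do not form a single cycle")
    return order


pairs = [(220, 480), (520, 220), (480, 190), (190, 540), (540, 160), (160, 520)]
print(circular_order(pairs))  # [220, 480, 190, 540, 160, 520]
```

With inaccurate values the dictionary lookup fails, which is why the rest of the talk treats the task as a matching problem.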

Introduction

• Nuclear Magnetic Resonance (NMR)

– A strong magnetic field is used to align the spins of nuclei (isotopes).

– Applied radiation of the right frequency flips an aligned spin. When this spin transition occurs, the nuclei are said to be in resonance with the applied radiation.

NMR Measurement

• Chemical Shift (measured in ppm)

– Electrons in the molecule have small magnetic fields

– When a magnetic field is applied, the electrons tend to oppose it, which shifts the resonance frequency of the nucleus

• NMR Spectrum

Determining Protein Structure Using NMR

1. NMR Spectral Data generation

2. Peak Picking

3. Peak Assignment

4. Structural Restraint Extraction

5. Structure Calculation

NMR Data Interpretation

• Peak Assignment

– Map resonance peaks from different NMR spectra to the same residue

– Identify adjacency relationships

– Assign the segments to the protein sequence

• Currently done manually

• Bottleneck for high-throughput structure determination

Our Focus

Peak Assignment

• Two kinds of information available

– Distribution of spin systems for different amino acids

– The adjacency information between spin systems

Problem Description (Input)

(a1,b1)

(a2,b2)

(a3,b3)

(a4,b4)

(a5,b5)

(a6,b6)

a1 a2 a3 a4 a5 a6

b5 b6 b4 b3 b2 b1

Problem Description (Output)

(a1,b1)

(a5,b5)

(a3,b3)

(a4,b4)

(a2,b2)

(a6,b6)

a1 a2 a3 a4 a5 a6

b5 b6 b4 b3 b2 b1

Problem Description (Output)

(a1,b1)

(a5,b5)

(a3,b3)

(a4,b4)

(a2,b2)

(a6,b6)

a1 ≤ a3 ≤ a6 ≤ a4 ≤ a2 ≤ a5

b3 ≤ b4 ≤ b2 ≤ b6 ≤ b1 ≤ b5

Equivalent Problem Description

(a1,b1)

(a5,b5)

(a3,b3)

(a4,b4)

(a2,b2)

(a6,b6)

u1 u2 u3 u4 u5 u6

v5 v6 v4 v3 v2 v1

Cyclic Augmentation

(a1,b1)

(a5,b5)

(a3,b3)

(a4,b4)

(a2,b2)

(a6,b6)

u1 u2 u3 u4 u5 u6

v5 v6 v4 v3 v2 v1

A matching M is called a cyclic augmentation if H ∪ M forms a Hamiltonian cycle.
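The definition can be checked directly by walking alternately along edges of H and M. This is a minimal sketch, assuming both matchings are given as dictionaries from each u to its partner v; the function name and the string labels are hypothetical.

```python
def is_cyclic_augmentation(H, M):
    """Return True if the union of H and M is one cycle through all vertices.

    H and M are perfect matchings, given as dicts mapping each u to its v.
    """
    n = len(H)
    if n < 2 or set(H) != set(M) or set(H.values()) != set(M.values()):
        return False
    m_inverse = {v: u for u, v in M.items()}
    start = next(iter(H))
    u, visited = start, 0
    while True:
        v = H[u]          # follow the H edge from u to v
        u = m_inverse[v]  # follow the M edge from v back to the U side
        visited += 1
        if u == start:
            break
    # A Hamiltonian cycle passes through all n vertices of U before closing.
    return visited == n


H = {"u1": "v1", "u2": "v2", "u3": "v3"}
print(is_cyclic_augmentation(H, {"u1": "v2", "u2": "v3", "u3": "v1"}))  # True
print(is_cyclic_augmentation(H, {"u1": "v2", "u2": "v1", "u3": "v3"}))  # False
```

The second matching fails because u3 and v3 are joined by both H and M, so they close a cycle of their own.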

Not every matching forms a cycle

Cost of an edge in M

• Edge between the values 200 and 270: cost |200 − 270| = 70

• Edge between the values 1200 and 100: cost |1200 − 100| = 1100

Sum of the costs of the edges

Minimum Bipartite Cyclic Augmentation

Input: U = {u1, u2,…, un}

V = {v1, v2,…, vn}

H : a perfect matching between U and V

Output: A perfect matching M such that

1. H ∪ M forms a Hamiltonian cycle

2. ∑_{(u,v) ∈ M} |u − v| is minimized

u1 u2 u3 u4 u5 u6

v5 v6 v4 v3 v2 v1
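To make the objective concrete, here is a brute-force sketch for tiny inputs. It is not the algorithm of the talk: it simply tries every matching. It assumes H pairs u[i] with v[i], so a matching M that pairs u[i] with v[p[i]] is a cyclic augmentation exactly when the permutation p is a single cycle. The noisy values are hypothetical, made by perturbing the pairs from the opening example.

```python
from itertools import permutations


def is_single_cycle(p):
    """True if the permutation i -> p[i] is one cycle over all indices."""
    seen, i = 0, 0
    while True:
        i = p[i]
        seen += 1
        if i == 0:
            return seen == len(p)


def min_cyclic_augmentation(u, v):
    """Try every matching; H pairs u[i] with v[i], M pairs u[i] with v[p[i]]."""
    best = None
    for p in permutations(range(len(u))):
        if not is_single_cycle(p):
            continue
        cost = sum(abs(u[i] - v[p[i]]) for i in range(len(u)))
        if best is None or cost < best[0]:
            best = (cost, p)
    return best


# Hypothetical noisy pairs (u[i], v[i]).
u = [220, 520, 480, 190, 540, 160]
v = [483, 218, 191, 541, 158, 522]
print(min_cyclic_augmentation(u, v))  # (11, (1, 5, 0, 2, 3, 4))
```

The running time grows factorially with n, so this is useful only for checking small cases by hand.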

Cost of the most expensive edge

Bottleneck Bipartite Cyclic Augmentation

Input: U = {u1, u2,…, un}

V = {v1, v2,…, vn}

H : a perfect matching between U and V

Output: A perfect matching M such that

1. H ∪ M forms a Hamiltonian cycle

2. max_{(u,v) ∈ M} |u − v| is minimized

u1 u2 u3 u4 u5 u6

v5 v6 v4 v3 v2 v1
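The bottleneck objective can pick a different matching than the sum objective. The brute-force sketch below illustrates this on a hypothetical three-pair instance, again assuming H pairs u[i] with v[i] and M pairs u[i] with v[p[i]]; it is an illustration of the objective, not the method of the talk.

```python
from itertools import permutations


def cycle_count(p):
    """Number of cycles of the permutation i -> p[i]."""
    seen, count = set(), 0
    for start in range(len(p)):
        if start not in seen:
            count += 1
            i = start
            while i not in seen:
                seen.add(i)
                i = p[i]
    return count


def bottleneck_cyclic_augmentation(u, v):
    """Try every single-cycle matching and minimize its most expensive edge."""
    n = len(u)
    candidates = (p for p in permutations(range(n)) if cycle_count(p) == 1)
    return min((max(abs(u[i] - v[p[i]]) for i in range(n)), p) for p in candidates)


u = [0, 10, 20]  # hypothetical values
v = [1, 11, 21]
print(bottleneck_cyclic_augmentation(u, v))  # (19, (1, 2, 0))
```

On this instance the other single-cycle matching, (2, 0, 1), has the smaller total cost (39 against 41) but a larger most expensive edge (21 against 19).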

Outline

• MD : the minimum cost matching

• Transform MD into an optimal cyclic augmentation using exchange operations

• Properties of an optimal matching that prune the space of exchanges required

• Exchange graph

• Optimal matching: corresponds to an MST in the exchange graph

MD : the minimum cost matching

The minimum cost matching may not be a cyclic augmentation
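A small sketch of why this happens. For edge cost |u − v|, matching the k-th smallest u with the k-th smallest v is a standard way to obtain a minimum cost matching; whether the talk builds MD this way is an assumption. The helper names are hypothetical, and H is assumed to pair u[i] with v[i].

```python
def sorted_matching(u, v):
    """Match the k-th smallest u with the k-th smallest v.

    Returns p such that u[i] is matched with v[p[i]].
    """
    u_order = sorted(range(len(u)), key=lambda i: u[i])
    v_order = sorted(range(len(v)), key=lambda j: v[j])
    p = [None] * len(u)
    for i, j in zip(u_order, v_order):
        p[i] = j
    return p


def cycle_count(p):
    """Number of cycles formed by H and M, where H pairs u[i] with v[i]."""
    seen, count = set(), 0
    for start in range(len(p)):
        if start not in seen:
            count += 1
            i = start
            while i not in seen:
                seen.add(i)
                i = p[i]
    return count


u = [0, 10, 20]  # hypothetical values
v = [1, 11, 21]
md = sorted_matching(u, v)
print(md, cycle_count(md))  # [0, 1, 2] 3
```

Here the cheapest matching coincides with H, so the union falls apart into three separate cycles instead of one.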

Exchanges

Exchanges between different cycles merge them
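The merging effect can be seen on a toy instance. The sketch assumes an exchange swaps the partners of two edges of the matching, which is one natural reading of the slides; `exchange` and `cycle_count` are hypothetical helper names, and H is taken to pair u[i] with v[i].

```python
def exchange(p, i, j):
    """Swap the partners of u[i] and u[j] and return the new matching."""
    q = list(p)
    q[i], q[j] = q[j], q[i]
    return q


def cycle_count(p):
    """Number of cycles formed by H and M, where M pairs u[i] with v[p[i]]."""
    seen, count = set(), 0
    for start in range(len(p)):
        if start not in seen:
            count += 1
            i = start
            while i not in seen:
                seen.add(i)
                i = p[i]
    return count


m = [0, 1, 2]                # three separate cycles
m1 = exchange(m, 0, 1)       # the two edges lie in different cycles
m2 = exchange(m1, 1, 2)
print(cycle_count(m), cycle_count(m1), cycle_count(m2))  # 3 2 1
```

Each exchange between two different cycles reduces the number of cycles by one, so k cycles need k − 1 such exchanges.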

Cost of an Exchange

[Figure: an exchange, with a length x marked.]

Cost of the exchange is 2x
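One reading that reproduces the factor of two: if the two matched edges cover disjoint intervals separated by a gap of length x, swapping their partners makes both new edges span that gap. The sketch assumes this interpretation of x and the edge cost |u − v|; the values and the name `exchange_cost` are hypothetical.

```python
def exchange_cost(u, v, p, i, j):
    """Increase in total cost when u[i] and u[j] swap partners."""
    before = abs(u[i] - v[p[i]]) + abs(u[j] - v[p[j]])
    after = abs(u[i] - v[p[j]]) + abs(u[j] - v[p[i]])
    return after - before


# Edges 0-1 and 4-5 are separated by a gap of x = 3 (between 1 and 4).
print(exchange_cost([0, 4], [1, 5], [0, 1], 0, 1))  # 6, which is 2 * 3

# Edges 0-3 and 2-5 overlap, so the swap costs nothing extra.
print(exchange_cost([0, 2], [3, 5], [0, 1], 0, 1))  # 0
```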

Transform MD into a minimum cost cyclic augmentation using exchange operations

Which exchanges will yield the optimal cyclic augmentation?

Clusters

l1 l2 l3 l4 l5 l6 l7 l8

Exchange Graph

l1 l2 l3 l4 l5 l6 l7 l8

Nodes ≡ Cycles in MD

Edges ≡ Adjacent Clusters in MD

Weight on Edges ≡ Cost of the corresponding Exchange

[Figure: edges labeled 12, 23, 34, 45, 56, 67, 78, one for each pair of adjacent clusters.]

Solution

Exchanges corresponding to the Minimum Spanning Tree on the Exchange Graph yield a minimum cost cyclic augmentation
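The selection step can be sketched with Kruskal's algorithm. The exchange graph below is hypothetical: the assignment of clusters to cycles and the edge weights are made up for illustration, and the edge labels follow the 12, 23, ... naming of the slides. Turning the chosen exchanges back into a matching is not shown.

```python
def mst_exchanges(num_cycles, edges):
    """Kruskal's algorithm on an exchange graph.

    edges: list of (weight, cycle_a, cycle_b, label) tuples.
    Returns the labels of the chosen exchanges and their total weight.
    """
    parent = list(range(num_cycles))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    chosen, total = [], 0
    for weight, a, b, label in sorted(edges):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # this exchange merges two different cycles
            parent[root_a] = root_b
            chosen.append(label)
            total += weight
    return chosen, total


# Hypothetical instance: clusters l1..l8 lie in cycles 0, 1, 0, 2, 1, 2, 3, 3.
edges = [
    (6, 0, 1, "12"), (2, 1, 0, "23"), (4, 0, 2, "34"), (10, 2, 1, "45"),
    (2, 1, 2, "56"), (8, 2, 3, "67"), (4, 3, 3, "78"),
]
print(mst_exchanges(4, edges))  # (['23', '56', '67'], 12)
```

Four cycles need three merging exchanges, and the spanning tree picks the cheapest set that connects them all.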

Results

• Minimum Bipartite Cyclic Augmentation

• Bottleneck Bipartite Cyclic Augmentation

Ω(n log n)

3-approximation algorithm

The End

Thank You