Post on 20-Dec-2015
Reconstructing Circular Order from Inaccurate Adjacency Information
Applications in NMR Data Interpretation
Ming-Yang Kao
Problem Description
[Figure: the values 540, 190, 480, 160, 520, 220 placed around a circle]
(220,480)
(520,220)
(480,190)
(190,540)
(540,160)
(160,520)
Problem Description
Given: the adjacent pairs
(220,480)
(520,220)
(480,190)
(190,540)
(540,160)
(160,520)
Find: the correct circular order
[Figure: a circle with six unknown positions, each marked "?"]
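As a minimal sketch of the noise-free version of this task, the pairs above can be chained into a circle by following each value to its partner. The reading of each pair as (value, next value) and the function name `circular_order` are assumptions made for illustration; with inaccurate measurements the values no longer match exactly, which is the harder problem the talk addresses.

```python
def circular_order(pairs):
    """Chain exact (value, next value) pairs into one circular order.

    Assumes every value appears once as a first entry and once as a
    second entry; raises ValueError if the pairs do not close one cycle.
    """
    successor = dict(pairs)
    start = pairs[0][0]
    order = [start]
    current = successor[start]
    # Walk successor links; stop when we return to the start or have
    # already collected as many values as there are pairs.
    while current != start and len(order) < len(pairs):
        order.append(current)
        current = successor[current]
    if current != start or len(order) != len(pairs):
        raise ValueError("pairs do not form a single cycle")
    return order


pairs = [(220, 480), (520, 220), (480, 190), (190, 540), (540, 160), (160, 520)]
print(circular_order(pairs))
```

Starting from a different pair gives the same circle rotated to a different starting value.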
Introduction
• Nuclear Magnetic Resonance (NMR)
– A strong magnetic field is used to align the spins of nuclei (isotopes).
– When applied radiation induces a spin transition, the nuclei are said to be in resonance with that radiation.
NMR Measurement
• Chemical shift (measured in ppm)
– Electrons in the molecule have small magnetic fields.
– When a magnetic field is applied, the electrons tend to oppose it.
• NMR spectrum
Determining Protein Structure Using NMR
1. NMR Spectral Data generation
2. Peak Picking
3. Peak Assignment
4. Structural Restraint Extraction
5. Structure Calculation
NMR Data Interpretation
• Peak assignment
– Map resonance peaks from different NMR spectra to the same residue
– Identify adjacency relationships
– Assign the segments to the protein sequence
• Currently done manually
• Bottleneck for high-throughput structure determination
Our Focus
Peak Assignment
• Two kinds of information available
– Distribution of spin systems for different amino acids
– The adjacency information between spin systems
Problem Description (Input)
(a1,b1)
(a2,b2)
(a3,b3)
(a4,b4)
(a5,b5)
(a6,b6)
a1 a2 a3 a4 a5 a6
b1 b2 b3 b4 b5 b6
Problem Description (Output)
(a1,b1)
(a5,b5)
(a3,b3)
(a4,b4)
(a2,b2)
(a6,b6)
a1 a2 a3 a4 a5 a6
b1 b2 b3 b4 b5 b6
Problem Description (Output)
(a1,b1)
(a5,b5)
(a3,b3)
(a4,b4)
(a2,b2)
(a6,b6)
a1 ≤ a3 ≤ a6 ≤ a4 ≤ a2 ≤ a5
b3 ≤ b4 ≤ b2 ≤ b6 ≤ b1 ≤ b5
Equivalent Problem Description
(a1,b1)
(a5,b5)
(a3,b3)
(a4,b4)
(a2,b2)
(a6,b6)
u1 u2 u3 u4 u5 u6
v1 v2 v3 v4 v5 v6
Cyclic Augmentation
(a1,b1)
(a5,b5)
(a3,b3)
(a4,b4)
(a2,b2)
(a6,b6)
u1 u2 u3 u4 u5 u6
v1 v2 v3 v4 v5 v6
A matching M is called a cyclic augmentation if H ∪ M forms a Hamiltonian cycle.
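The definition can be checked mechanically by walking alternately along an H edge and an M edge and counting how many U nodes are visited before the walk returns to its start. The sketch below assumes an index encoding that is not in the slides: `h[i]` is the index of the V node matched to u_i in H, and `m[j]` is the index of the U node matched to v_j in M.

```python
def is_cyclic_augmentation(h, m):
    """Return True if H ∪ M is a single cycle through all 2n nodes.

    h[i] = index j of the v_j matched to u_i in H (assumed encoding).
    m[j] = index i of the u_i matched to v_j in M (assumed encoding).
    Both lists must be permutations of range(n).
    """
    n = len(h)
    u, steps = 0, 0
    while True:
        u = m[h[u]]      # u --H--> v --M--> next u
        steps += 1
        if u == 0:       # the alternating walk has closed
            break
    return steps == n    # Hamiltonian only if every u was visited


h = [0, 1, 2]                                  # H pairs u_i with v_i
print(is_cyclic_augmentation(h, [1, 2, 0]))    # one cycle through all nodes
print(is_cyclic_augmentation(h, [1, 0, 2]))    # splits into two cycles
```

Because both matchings are perfect, the walk always returns to its start, so the loop terminates.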
Minimum Bipartite Cyclic Augmentation
Objective: the sum of the edge costs
Input: U = {u1, u2, …, un}
V = {v1, v2, …, vn}
H: a perfect matching between U and V
Output: a perfect matching M such that
1. H ∪ M forms a Hamiltonian cycle
2. ∑_{(u,v)∈M} |u − v| is minimized
u1 u2 u3 u4 u5 u6
v1 v2 v3 v4 v5 v6
Bottleneck Bipartite Cyclic Augmentation
Objective: the cost of the most expensive edge
Input: U = {u1, u2, …, un}
V = {v1, v2, …, vn}
H: a perfect matching between U and V
Output: a perfect matching M such that
1. H ∪ M forms a Hamiltonian cycle
2. max_{(u,v)∈M} |u − v| is minimized
u1 u2 u3 u4 u5 u6
v1 v2 v3 v4 v5 v6
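To make the two objectives concrete, here is a brute-force reference that tries every perfect matching M on a tiny instance. It is only a sketch for checking small cases, not the method of the talk; the index encoding (`h[i]`, `m[j]`), the function names, and the sample values are assumptions made for illustration.

```python
from itertools import permutations


def closes_single_cycle(h, m):
    """True if alternating H and M edges visits all n U nodes in one cycle."""
    u, steps = 0, 0
    while True:
        u = m[h[u]]
        steps += 1
        if u == 0:
            return steps == len(h)


def brute_force_augmentation(U, V, h):
    """Return (minimum sum cost, minimum bottleneck cost) over all cyclic M.

    U, V: lists of numeric values; h[i] = index of the V node matched to
    u_i in H. m[j] = index of the U node matched to v_j in M.
    Runs in O(n! * n) time, so it is only usable for very small n.
    """
    n = len(U)
    best_sum = best_max = None
    for m in permutations(range(n)):
        if not closes_single_cycle(h, m):
            continue
        costs = [abs(U[m[j]] - V[j]) for j in range(n)]
        if best_sum is None or sum(costs) < best_sum:
            best_sum = sum(costs)
        if best_max is None or max(costs) < best_max:
            best_max = max(costs)
    return best_sum, best_max


# Hypothetical instance: H pairs u_i with v_i.
min_sum, bottleneck = brute_force_augmentation([10, 20, 30], [12, 19, 33], [0, 1, 2])
print(min_sum, bottleneck)   # 40 18
```

In this instance one matching happens to be best for both objectives; in general the two optima can be attained by different matchings, which is why the function tracks them separately.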
Outline
• M_D: the minimum cost matching
• We will transform M_D into an optimal matching using exchange operations
• Properties of an optimal matching prune the space of exchanges that must be considered
• Exchange graph
• Optimal matching: MST in the exchange graph
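The starting point of the outline can be sketched as follows. For costs of the form |u − v|, pairing the i-th smallest u with the i-th smallest v gives a minimum cost perfect matching, which is used here as a stand-in for the slides' minimum cost matching. Together with H it generally splits into several cycles, and those cycles are what the exchanges must merge. The index encoding and the sample instance are assumptions for illustration.

```python
def sorted_matching(U, V):
    """Match the i-th smallest u to the i-th smallest v.

    Returns m with m[j] = index of the U node matched to v_j.
    """
    u_order = sorted(range(len(U)), key=lambda i: U[i])
    v_order = sorted(range(len(V)), key=lambda j: V[j])
    m = [None] * len(V)
    for ui, vj in zip(u_order, v_order):
        m[vj] = ui
    return m


def cycles(h, m):
    """List the cycles of H ∪ M, each given by the U indices it visits.

    h[i] = index of the V node matched to u_i in H (assumed encoding).
    """
    seen = [False] * len(h)
    result = []
    for start in range(len(h)):
        if seen[start]:
            continue
        cycle, u = [], start
        while not seen[u]:
            seen[u] = True
            cycle.append(u)
            u = m[h[u]]          # follow one H edge, then one M edge
        result.append(cycle)
    return result


# Hypothetical instance: H pairs u0-v1, u1-v0, u2-v3, u3-v2.
U, V, h = [10, 20, 30, 40], [12, 19, 33, 41], [1, 0, 3, 2]
m = sorted_matching(U, V)
print(m, cycles(h, m))   # [0, 1, 2, 3] [[0, 1], [2, 3]]
```

Here the sorted matching leaves two cycles, so it is not yet a cyclic augmentation.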
Transform M_D into a minimum cost cyclic augmentation using exchange operations
Which exchanges will yield the optimal cyclic augmentation?
Exchange Graph
[Figure: exchange graph drawn over l1, l2, …, l8, with edges labeled 12, 23, 34, 45, 56, 67, 78]
Nodes ≡ cycles in M_D
Edges ≡ adjacent clusters in M_D
Exchange Graph
[Figure: the exchange graph over l1, l2, …, l8, with edges labeled 12, 23, 34, 45, 56, 67, 78]
Weight on edges ≡ cost of the corresponding exchange
Solution
Exchanges corresponding to the minimum spanning tree of the exchange graph yield a minimum cost cyclic augmentation
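The spanning-tree step can be illustrated with a standard Kruskal sketch using union-find. The slides do not give the exchange costs, so the graph below (four cycles as nodes, with made-up weights) is purely hypothetical, and the code only selects which exchanges to apply; it does not perform them.

```python
def minimum_spanning_tree(num_nodes, edges):
    """Kruskal's algorithm; edges are (weight, node_a, node_b) tuples.

    Nodes stand for cycles and each edge for one candidate exchange
    that would merge the two cycles it connects.
    """
    parent = list(range(num_nodes))

    def find(x):
        # Find the set representative, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    chosen = []
    for weight, a, b in sorted(edges):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:          # exchange merges two distinct groups
            parent[root_a] = root_b
            chosen.append((weight, a, b))
    return chosen


# Hypothetical exchange graph on 4 cycles; weights are illustrative only.
edges = [(3, 0, 1), (1, 1, 2), (4, 2, 3), (2, 0, 2), (5, 1, 3)]
tree = minimum_spanning_tree(4, edges)
print(tree, sum(w for w, _, _ in tree))   # [(1, 1, 2), (2, 0, 2), (4, 2, 3)] 7
```

A spanning tree on k cycles always has k − 1 edges, which matches the number of merges needed to join k cycles into one.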
Results
• Minimum Bipartite Cyclic Augmentation
• Bottleneck Bipartite Cyclic Augmentation
Ω(n log n)
3 approx. algorithm