Summer at ViaLogy Ronald J. Perez

Post on 21-Dec-2015

Summer at ViaLogy

Ronald J. Perez

ViaLogy

Developers of computational products for increased performance of molecular detection systems

ViaAmp

Gene expression amplification software designed to use active signal processing technology

Differentiation of true signal from background noise

Project

Showing the limitations of passive analysis

Standard microarray image analysis software, represented by GenePix

Deliverables

Processed 18 microarray images using passive analysis

Classified arrays into triplicates according to dilution

Given initial condition: ratio of green to red intensity is 1

Focus on array intensities as opposed to gene regulation

What did I analyze?

18 microarrays with 340 spots each. Because each gene is spotted in duplicate, each array covers 170 genes.

There were 6 dilution levels across the 18 arrays, which gives 3 arrays per level.

Dilution Step     Dilution Factor
Stock solution    Original concentration
1                 1:10
2                 1:100
3                 1:1000
4                 1:10000
5                 1:100000

[Figure: most concentrated array and most diluted array]

GenePix Output File

Block  Column  Row  Name     ID  B635 SD  B532 SD  Ratio of Medians (635/532)  F635 Median - B635  F532 Median - B532  F635 Mean - B635  F532 Mean - B532
1      1       1    Gene 1a  NC  84       21       0                           0                   13                  1                 15
1      1       2    Gene 1b  NC  74       11       0.679                       24                  36                  28                35
1      1       3    Gene 2a  NC  76       11       0.781                       11                  14                  11                15
1      1       4    Gene 2b  NC  74       12       1.043                       19                  18                  17                19
1      1       5    Gene 3a  NC  76       10       0.73                        24                  34                  22                35
1      1       6    Gene 3b  NC  70       12       0.281                       3                   13                  1                 14
1      1       7    Gene 4a  NC  68       10       0.85                        13                  15                  13                15
1      1       8    Gene 4b  NC  68       11       0.314                       5                   18                  13                19
1      1       9    Gene 5a  NC  66       14       1.328                       15                  11                  17                11
1      1       10   Gene 5b  NC  66       13       0.981                       13                  13                  15                13

Calculating Signal to Noise

There are two ways to calculate the signal-to-noise ratio (S/N) of a microarray spot.

The first S/N definition, used by ViaLogy:

S/N (1) = Foreground Median - Background

The second S/N definition, used by the client who sent us the arrays:

S/N (2) = (Foreground Mean - Background) / SD of Background
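As a minimal sketch, the two definitions can be computed directly from the background-subtracted columns of a GenePix output row. The function names are hypothetical, and the example values are the green-channel (532) figures read from the Gene 1b row of the sample output: F532 Median - B532 = 36, F532 Mean - B532 = 35, B532 SD = 11.

```python
def snr_definition_1(foreground_median_minus_background):
    """S/N (1): foreground median minus background (already one GenePix column)."""
    return foreground_median_minus_background


def snr_definition_2(foreground_mean_minus_background, background_sd):
    """S/N (2): (foreground mean minus background) divided by the background SD."""
    if background_sd == 0:
        # Assumption: a zero background SD is treated as an error, not as infinity.
        raise ValueError("background SD is zero; S/N (2) is undefined")
    return foreground_mean_minus_background / background_sd


# Green-channel values read from the Gene 1b row of the sample output.
snr1 = snr_definition_1(36)
snr2 = snr_definition_2(35, 11)
print(snr1, round(snr2, 2))
```

The two definitions are on different scales: the first is an intensity difference, the second is that difference expressed in units of background variation.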

GenePix Reproducibility

If GenePix data were 100% reproducible, plotting the S/N ratios from two independent analyses against each other would give a straight line with a slope of 1.

When a scatter plot was made, some data points did not fall on that line.

Since most data points fell on the line, we assumed the output data were credible and safe to continue analyzing.
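One way to put a number on this check is to fit a least-squares line to the two analysts' values and list the spots that disagree. This is only a sketch: the slides describe a visual inspection, not a fit, and the values, the tolerance, and the function names below are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def off_line_spots(xs, ys, tolerance):
    """Indices of spots where the two analyses differ by more than tolerance."""
    return [i for i, (x, y) in enumerate(zip(xs, ys)) if abs(x - y) > tolerance]


# Hypothetical [F532 Median - B] values from two independent analyses.
person_a = [13, 36, 14, 18, 34, 13]
person_b = [13, 36, 14, 20, 34, 13]

slope, intercept = fit_line(person_a, person_b)
outliers = off_line_spots(person_a, person_b, tolerance=1)
print(round(slope, 3), round(intercept, 3), outliers)
```

A slope close to 1 with few flagged spots corresponds to the "most points fall on the line" situation described above.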

Reproducibility Graph

[Figure: GenePix Output Reproducibility Check. Scatter plot of Person B [F532 Median-B] (y-axis, 0 to 140) against Person A [F532 Median-B] (x-axis, 0 to 140).]

Microarray Categorization

The first approach to classifying this set of 18 arrays into triplicates was to plot the S/N ratio of all 170 genes against the arrays.

This plot gives a rough idea of the intensity pattern across the microarrays.

Did GenePix do a good job analyzing these microarrays?
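The per-array value used in this first approach (the green-channel intensity, averaged over gene duplicates and then over genes) can be sketched as follows. The grouping rule is an assumption based on the sample output, where duplicates are named "Gene 1a" and "Gene 1b"; the function names are hypothetical, and only four spots from the sample rows are used here.

```python
from collections import defaultdict
from statistics import mean


def gene_intensities(spots):
    """Average duplicate spots; 'Gene 1a' and 'Gene 1b' both map to 'Gene 1'."""
    by_gene = defaultdict(list)
    for name, intensity in spots:
        by_gene[name[:-1]].append(intensity)  # assumption: last character marks the duplicate
    return {gene: mean(values) for gene, values in by_gene.items()}


def array_intensity(spots):
    """Mean of the per-gene averages for one array."""
    return mean(gene_intensities(spots).values())


# [F532 Median - B532] values from the first four sample rows.
spots = [("Gene 1a", 13), ("Gene 1b", 36), ("Gene 2a", 14), ("Gene 2b", 18)]
per_gene = gene_intensities(spots)
print(per_gene, array_intensity(spots))
```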

Gene Intensity per Array

[Figure: Average Green Gene Intensity per Array. [F532 Med-B], averaged over gene duplicates (y-axis, 0 to 14000), for arrays A through R (x-axis).]

Client Selected Groups

Instead of looking at all 170 genes, our client gave us a list of 48 genes to focus on.

These genes had an S/N ratio greater than 2 and were classified into the following 4 groups.

Focus on Levels C and D because they have the highest S/N ratios.

Group      S/N ratio
Level A    3 - 10
Level B    11 - 25
Level C    26 - 50
Level D    > 50
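The grouping can be expressed as a small lookup. This is a sketch under one stated assumption: the slide lists whole-number bounds (3 - 10, 11 - 25, 26 - 50, > 50), so for non-integer S/N values the upper bound of each level is treated here as inclusive and anything at or below 2 is left unclassified. The function name is hypothetical.

```python
def classify_snr(snr):
    """Return the client's level ('A'..'D') for an S/N ratio, or None if it is 2 or less."""
    if snr <= 2:
        return None  # below the client's selection threshold
    if snr <= 10:
        return "A"
    if snr <= 25:
        return "B"
    if snr <= 50:
        return "C"
    return "D"


levels = [classify_snr(value) for value in (1, 3, 18, 36, 51)]
print(levels)
```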

Analysis of Groups C and D

Genes in Levels C and D were used to design a different categorization scheme.

Mean approach: for each array, the MEAN of the S/N ratios of the Level C genes was taken. The same was done for the Level D genes. The Level C and Level D MEANS were then averaged.

Median approach: for each array, the MEDIAN of the S/N ratios of the Level C genes was taken. The same was done for the Level D genes. The Level C and Level D MEDIANS were also averaged.
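A minimal sketch of the two summaries for a single array is shown here. The S/N values are hypothetical (chosen only to fall inside the Level C and Level D ranges), and the function name is an illustration, not part of any ViaLogy or GenePix tool.

```python
from statistics import mean, median


def summarize_array(level_c_snr, level_d_snr):
    """Average the Level C and Level D summaries for one array, for both approaches."""
    mean_c, mean_d = mean(level_c_snr), mean(level_d_snr)
    median_c, median_d = median(level_c_snr), median(level_d_snr)
    return {
        "mean_approach": (mean_c + mean_d) / 2,
        "median_approach": (median_c + median_d) / 2,
    }


# Hypothetical S/N ratios for one array.
level_c = [30, 40, 50]
level_d = [60, 80, 130]
summary = summarize_array(level_c, level_d)
print(summary)
```

The median approach is less sensitive to a single very bright gene (130 in this example), which is why the two summaries can order arrays differently.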

Mean and Median Approach

[Figure, mean intensity panel: AVERAGE S/N ratio of Group C & D - GREEN. [(F532 Median-B)/B532 SD] on a logarithmic y-axis (0.1 to 100) for arrays in the order D H N I L Q A B C F K R M O P E G J. Series: Average C, Average D, Average of C+D Ave.]

[Figure, median intensity panel: MEDIAN S/N ratio of Group C & D - GREEN. [(F532 Median-B)/B532 SD] on a logarithmic y-axis (0.1 to 100) for arrays in the order D H N I L Q A B C F K R M O P E G J. Series: Median C, Median D, Ave of C+D Medians.]

Categorization Summary

Approach                  A    B    C    D    E    F    G    H    I    J    K    L    M    N    O    P    Q    R
[F Med-B] for all genes   16   10   14   3    17   15   9    1    4    18   11   7    12   2    13   5    8    6
[F Med-B] for 48 genes    18   13   12   3    17   14   10   2    4    16   9    5    15   1    11   8    6    7
MEAN S/N for 48 genes     9    10   7    1    17   8    18   3    4    16   11   5    15   2    14   13   6    12
MEDIAN S/N for 48 genes   15   12   7    1    17   9    18   2    4    14   8    5    13   3    16   11   6    10
Median of All Approaches  16   11   9.5  2    17   12   14   2    4    16   10   5    14   2    14   9.5  6    8.5
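The last row of the summary can be recomputed by taking, for each array, the median of its four ranks. The sketch below copies the four rank rows from the summary; the variable and function names are hypothetical. For arrays A, F, and O the four-value median is a half-integer (15.5, 11.5, 13.5), which the summary row appears to show rounded up.

```python
from statistics import median

ARRAYS = list("ABCDEFGHIJKLMNOPQR")

# Rank rows copied from the Categorization Summary.
RANKS = {
    "[F Med-B] for all genes": [16, 10, 14, 3, 17, 15, 9, 1, 4, 18, 11, 7, 12, 2, 13, 5, 8, 6],
    "[F Med-B] for 48 genes": [18, 13, 12, 3, 17, 14, 10, 2, 4, 16, 9, 5, 15, 1, 11, 8, 6, 7],
    "MEAN S/N for 48 genes": [9, 10, 7, 1, 17, 8, 18, 3, 4, 16, 11, 5, 15, 2, 14, 13, 6, 12],
    "MEDIAN S/N for 48 genes": [15, 12, 7, 1, 17, 9, 18, 2, 4, 14, 8, 5, 13, 3, 16, 11, 6, 10],
}


def median_rank_per_array(ranks):
    """Median of the ranks each array received across all approaches."""
    columns = zip(*ranks.values())  # one tuple of four ranks per array
    return {array: median(column) for array, column in zip(ARRAYS, columns)}


median_ranks = median_rank_per_array(RANKS)
for array in ARRAYS:
    print(array, median_ranks[array])
```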

[Figure: Median of All Approaches. RANKING (y-axis, 0 to 20) for arrays A through R (x-axis).]

Future Direction

I have only told half of the story. The next steps are:

Process microarrays using ViaAmp

Passive analysis vs. active analysis

ViaAmp results

Sensitivity and specificity studies

Thanks

ViaLogy Team - Dr. David Robbins

SoCalBSI Team

NSF, NIH