steps involved in microarray analysis after the...

39
Steps involved in microarray analysis after the experiments • Scanning slides to create images • Conversion of images to numerical data • Processing of raw numerical data • Further analysis – Clustering – Integration with genomic data Steps involved in microarray analysis after the experiments • Scanning slides to create images • Conversion of images to numerical data • Processing of raw numerical data • Further analysis – Clustering – Integration with genomic data

Upload: others

Post on 08-Oct-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation

1

Steps involved in microarray analysisafter the experiments

• Scanning slides to create images• Conversion of images to numerical data• Processing of raw numerical data• Further analysis

– Clustering– Integration with genomic data

Steps involved in microarray analysisafter the experiments

• Scanning slides to create images• Conversion of images to numerical data• Processing of raw numerical data• Further analysis

– Clustering– Integration with genomic data

Page 2: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation

2

Processing raw microarray data

• Main aims: – To identify and reduce the noise found in

microarray data

– To identify differentially expressed genes/fragments in an experiment

Methods used so far…

• Many have been ad hoc

Page 3: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation

3

Methods used so far…

• Many have been ad hoc– Visual identification (!)

Methods used so far…

• Many have been ad hoc– Visual identification (!)– 2-fold change in hybridisation signals

Page 4: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation

4

Methods used so far…

• Many have been ad hoc– Visual identification (!)– 2-fold change in hybridisation signals– Ranking ratios of hybridisation signals

Methods used so far…

• Many have been ad hoc– Visual identification (!)– 2-fold change in hybridisation signals– Ranking ratios of hybridisation signals

• Microarray expts are

Page 5: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation

5

Methods used so far…

• Many have been ad hoc– Visual identification (!)– 2-fold change in hybridisation signals– Ranking ratios of hybridisation signals

• Microarray expts are– Easy to do, but very sensitive– Lot of noise, artifacts, errors– ~20,000 spots per slide

Methods used so far…

• Many have been ad hoc– Visual identification (!)– 2-fold change in hybridisation signals– Ranking ratios of hybridisation signals

• Microarray expts are– Easy to do, but very sensitive– Lot of noise, artifacts, errors– ~20,000 spots per slide

• Without good processing the results can be completely wrong

Page 6: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation

6

This is changing…..

0

200

400

600

800

10001997

1998

1999

2000

2001

2002

Year

PubM

ed h

its

clusteringprocessing

microarray

This is changing…..• Opposing camps

0

200

400

600

800

1000

1997

1998

1999

2000

2001

2002

Year

PubM

ed h

its

clusteringprocessing

microarray

Page 7: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation

7

This is changing…..• Opposing camps

– Ad hoc camp• Biologists• Too simple and

may be wrong

0

200

400

600

800

10001997

1998

1999

2000

2001

2002

Year

PubM

ed h

its

clusteringprocessing

microarray

This is changing…..• Opposing camps

– Ad hoc camp• Biologists• Too simple and

may be wrong

– Complicated maths

• Biostatisticians• Incomprehensible• Idealised datasets0

200

400

600

800

1000

1997

1998

1999

2000

2001

2002

Year

PubM

ed h

its

clusteringprocessing

microarray

Page 8: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation

8

This is changing…..• Opposing camps

– Ad hoc camp• Biologists• Too simple and

may be wrong

– Complicated maths

• Biostatisticians• Incomprehensible• Idealised datasets

– Must strike a balance

0

200

400

600

800

10001997

1998

1999

2000

2001

2002

Year

PubM

ed h

its

clusteringprocessing

microarray

Getting the most out of your microarray

• Processing the raw data

• Cleaning and assessing the quality of your data

• Identifying differentially hybridised spots

• How do you get the correct list of differentially expressed genes out of ~20000 data points?

Page 9: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation

9

Getting the most out of your microarray

• Processing the raw data

• Cleaning and assessing the quality of your data

• Identifying differentially hybridised spots

• How do you get the correct list of differentially expressed genes out of ~20000 data points?

Processing flow chart

Data input

Background

correction

Cy5/Cy3

normalisation

Merging

replicate

experiments

Score

differential

hybridisation

Page 10: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation

10

Processing flow chart

Data input

Background

correction

Cy5/Cy3

normalisation

Merging

replicate

experiments

Score

differential

hybridisation

The data – a GenePix file

Page 11: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation

11

GenePix file – what do we look at?

• Measure red and green intensities separately

• Signal intensity = foreground – background

GenePix file – what do we look at?

• Measure red and green intensities separately

• Signal intensity = foreground – background

• Foreground signal

Page 12: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation

12

GenePix file – what do we look at?

• Measure red and green intensities separately

• Signal intensity = foreground – background

• Foreground signal

• Background signal

GenePix file – what do we look at?

• Measure red and green intensities separately

• Signal intensity = foreground – background

• Foreground signal

• Background signal

• Ratio = red/green or green/red

Page 13: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation

13

Processing flow chart

Data input

Background

correction

Cy5/Cy3

normalisation

Merging

replicate

experiments

Score

differential

hybridisation

Background correction

• Genepix background– Median intensity of

immediate area surrounding each spot

• But– Very variable

between individual spots

– Artifactual background from smudges

• Therefore add to noise

Page 14: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation

14

Background correction

• Calculate the average background from surrounding area of spot

• Recommendation of 3x3 – 5x5 area

• Repeat for red and green separately

Background correction

• Still have variable distribution of intensity

• Much smoother distribution of background intensity

• Remove artifactual smudges

Page 15: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation

15

Processing flow chart

Data input

Background

correction

Cy5/Cy3

normalisation

Merging

replicate

experiments

Score

differential

hybridisation

Red/green normalisation

• Normalise red and green intensities – spots with equal hybridisation should have similar

intensities– ie ratio ~ 1 for similarly expressed genes

• Otherwise, will have wrong list of differentially expressed genes

• Multiply one set of intensities by a scale factor

• But must obtain scale factor

Page 16: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation

16

0

500

1000

1500

2000

2500

3000

3500

4000

4500

5000

0 1 2 3 4

red/green ratio

num

ber o

f spo

tsMajority method

• Majority method:– Assume that most spots do

not change expression level

– Find average ratio (red/green intensity)

– Scale factor is the amount by which need to multiply the ratio so it is 1.

Majority method

• But several issues– Hybridisation levels differ according to location on slide

– Scanning properties for different colours differ at different intensities

• Scale factor must take these into account

Page 17: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation

17

Intensity considerations

• Simple scale factor fits a straight line

0

10000

20000

30000

40000

50000

60000

0 10000 20000 30000 40000 50000 60000

green intensity

red

inte

nsity

Intensity considerations

• Simple scale factor fits a straight line

• Distribution is curved– Difference in ratio can be

10-fold different depending on intensity

0

10000

20000

30000

40000

50000

60000

0 10000 20000 30000 40000 50000 60000

green intensity

red

inte

nsity

Page 18: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation

18

Intensity considerations

• Simple scale factor fits a straight line

• Distribution is curved– Difference in ratio can be

10-fold different depending on intensity

• Different scale factors should be used for different intensities

0

10000

20000

30000

40000

50000

60000

0 10000 20000 30000 40000 50000 60000

green intensity

red

inte

nsity

Positional considerations

• Different regions of the slide have different levels of hybridization

• Difference in average ratio can be ~10 fold

• Different scale factors needed for each region of slide

Page 19: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation

19

Positional considerations

0

500

1000

1500

2000

2500

3000

3500

4000

4500

5000

0 1 2 3 4

red/green ratio

num

ber o

f spo

ts

Scale factor for each spot• Calculate the scale factor using surrounding area of spot

• Recommendation of 12x12 – 20x20 area

Positional considerations

• Raw data has large positional dependence

Before

Page 20: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation

20

Positional considerations

• Raw data has large positional dependence

• Normalisationwithout pos. shifted ratios towards red intensity, but does not remove artifact

Before No positional data

Positional considerations

• Raw data has large positional dependence

• Normalisationwithout pos. shifted ratios towards red intensity, but does not remove artifact

• Positional normalisationremoves most of the artifact

Before No positional data Positional data

Page 21: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation

21

Processing flow chart

Data input

Background

correction

Cy5/Cy3

normalisation

Merging

replicate

experiments

Score

differential

hybridisation

Combining multiple experiments

• Often have replicates of the same experiment

• Do you have to scale between them?• How do you combine the data from them?

Page 22: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation

22

Replicate scaling• Different slides may have

different spreads in intensity and ratios

• Adjust spread of distributions by measuring standard deviation

• Estimate of variance by quartiles, fitting distribution, or bootstrapping

0

500

1000

1500

2000

2500

0

200

400

600

800

1000

1200

1400

Combining replicate data

• Take medians of ratios?• Take means of intensity values?• Take weighted means?• Treat each experiment individually and see

which spots are consistently differntially expressed?

Page 23: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation

23

Getting the most out of your microarray

• Processing the raw data

• Cleaning and assessing the quality of your data

• Identifying differentially hybridised spots

• How do you get the correct list of differentially expressed genes out of ~20000 data points?

Processing flow chart

Data input

Background

correction

Cy5/Cy3

normalisation

Merging

replicate

experiments

Score

differential

hybridisation

Spot

qualityArtifactual

regions

Duplicate spot

variability

Replicate experiment variability

Page 24: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation

24

Processing flow chart

Data input

Background

correction

Cy5/Cy3

normalisation

Merging

replicate

experiments

Score

differential

hybridisation

Spot

qualityArtifactual

regions

Duplicate spot

variability

Replicate experiment variability

Filter bad spots

• Small spots• Smudgy spots• Non-round spots

etc…

Page 25: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation

25

Processing flow chart

Data input

Background

correction

Cy5/Cy3

normalisation

Merging

replicate

experiments

Score

differential

hybridisation

Spot

qualityArtifactual

regions

Duplicate spot

variability

Replicate experiment variability

Artifactual array regions

Remove regions that have artifactual background after correction

Background artifacts usually vary from slide to slide – ie not consistent

Page 26: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation

26

Processing flow chart

Data input

Background

correction

Cy5/Cy3

normalisation

Merging

replicate

experiments

Score

differential

hybridisation

Spot

qualityArtifactual

regions

Duplicate spot

variability

Replicate experiment variability

Measurement of chip quality

• How successful is an experiment?– How consistent are the hybridisations

within experiments

Page 27: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation

27

Measurement of intrachip quality• Genes are often placed as

neighbouring pairs

• Variation between them provides a measure of variation within an experiment– (x1 – x2)/(x1+x2)

• Mean gives overall variation for expt.

• Remove outliers

• Are the same spots always inconsistent across replicate expts?0

20

40

60

80

100

120

140

-1.2-1.02-0.84-0.66-0.48 -0.

3-0.12 0.0

60.240.42 0.6 0.7

80.961.14

Variation

Num

ber o

f spo

ts

Experimentquality

Outliers

Intrachip variability

0

20

40

60

80

100

120

140

-1.2-1.02-0.84-0.66-0.48 -0.

3-0.12 0.0

60.240.42 0.6 0.7

80.961.14

Variation

Num

ber o

f spo

ts

Mean = 3.7%

Page 28: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation

28

Intrachip variability – single experiment quality

0

20

40

60

80

100

120

140

-1.2-1.02-0.84-0.66-0.48 -0.

3-0.12 0.0

60.240.42 0.6 0.7

80.961.14

Variation

Num

ber o

f spo

ts

Mean = 3.7% Mean = 10.7%

Filtering poor duplicates

0

1

2

3

4

5

0 1 2 3 4 5

ratio 1

ratio

2

Page 29: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation

29

Filtering poor duplicates

0

1

2

3

4

5

0 1 2 3 4 5

ratio 1

ratio

2

0

1

2

3

4

5

0 1 2 3 4 5

ratio 1

ratio

2

Processing flow chart

Data input

Background

correction

Cy5/Cy3

normalisation

Merging

replicate

experiments

Score

differential

hybridisation

Spot

qualityArtifactual

regions

Duplicate spot

variability

Replicate experiment variability

Page 30: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation

30

Measurement of interchip quality• How consistent are

replicate experiments?

• Use same measure for equivalent spots :(x1 – x2)/(x1+x2)

• Mean gives overall variation for expt. With respect to other replicates

• Remove outlier experiments

Measurement of replicate chip quality

0

10000

20000

30000

40000

50000

60000

70000

0 10000 20000 30000 40000 50000

Green intensity

Red

inte

nsity

Page 31: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation

31

Measurement of replicate chip quality

0

10000

20000

30000

40000

50000

60000

70000

0 10000 20000 30000 40000 50000

Green intensity

Red

inte

nsity

0

10000

20000

30000

40000

50000

60000

70000

0 10000 20000 30000 40000 50000

Green intensity

Red

inte

nsity

Measurement of replicate chip quality

0

10000

20000

30000

40000

50000

60000

70000

0 10000 20000 30000 40000 50000

Green intensity

Red

inte

nsity

0

10000

20000

30000

40000

50000

60000

70000

0 10000 20000 30000 40000 50000

Green intensity

Red

inte

nsity

0

10000

20000

30000

40000

50000

60000

70000

0 10000 20000 30000 40000 50000

Green intensity

Red

inte

nsity

Page 32: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation

32

Measurement of replicate chip quality

0

10000

20000

30000

40000

50000

60000

70000

0 10000 20000 30000 40000 50000

Green intensity

Red

inte

nsity

0

10000

20000

30000

40000

50000

60000

70000

0 10000 20000 30000 40000 50000

Green intensity

Red

inte

nsity

0

10000

20000

30000

40000

50000

60000

70000

0 10000 20000 30000 40000 50000

Green intensity

Red

inte

nsity

0

10000

20000

30000

40000

50000

60000

70000

0 10000 20000 30000 40000 50000 60000

Green intensity

Red

inte

nsity

Measurement of replicate chip quality

0

10000

20000

30000

40000

50000

60000

70000

0 10000 20000 30000 40000 50000

Green intensity

Red

inte

nsity

0

10000

20000

30000

40000

50000

60000

70000

0 10000 20000 30000 40000 50000

Green intensity

Red

inte

nsity

0

10000

20000

30000

40000

50000

60000

70000

0 10000 20000 30000 40000 50000

Green intensity

Red

inte

nsity

0

10000

20000

30000

40000

50000

60000

70000

0 10000 20000 30000 40000 50000 60000

Green intensity

Red

inte

nsity

0

10000

20000

30000

40000

50000

60000

0 2000 4000 6000 8000 10000 12000 14000

Green intensity

Red

inte

nsity

Page 33: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation

33

Measurement of replicate chip quality

0

10000

20000

30000

40000

50000

60000

70000

0 10000 20000 30000 40000 50000

Green intensity

Red

inte

nsity

0

10000

20000

30000

40000

50000

60000

70000

0 10000 20000 30000 40000 50000

Green intensity

Red

inte

nsity

0

10000

20000

30000

40000

50000

60000

70000

0 10000 20000 30000 40000 50000

Green intensity

Red

inte

nsity

0

10000

20000

30000

40000

50000

60000

70000

0 10000 20000 30000 40000 50000 60000

Green intensity

Red

inte

nsity

0

10000

20000

30000

40000

50000

60000

0 2000 4000 6000 8000 10000 12000 14000

Green intensityR

ed in

tens

ity

Mean var. = 8.4% Mean var. = 8.2%

Mean var. = 15.2% Mean var. = 32.3%

Mean var. = 7.3%

0

20

40

60

80

100

120

140

-1.2-1.02-0.84-0.66-0.48 -0.

3-0.12 0.0

60.240.42 0.6 0.7

80.961.14

Variation

Num

ber o

f spo

ts

Measurement of replicate chip quality

0

10000

20000

30000

40000

50000

60000

70000

0 10000 20000 30000 40000 50000

Green intensity

Red

inte

nsity

0

10000

20000

30000

40000

50000

60000

70000

0 10000 20000 30000 40000 50000

Green intensity

Red

inte

nsity

0

10000

20000

30000

40000

50000

60000

70000

0 10000 20000 30000 40000 50000

Green intensity

Red

inte

nsity

0

10000

20000

30000

40000

50000

60000

70000

0 10000 20000 30000 40000 50000 60000

Green intensity

Red

inte

nsity

0

10000

20000

30000

40000

50000

60000

0 2000 4000 6000 8000 10000 12000 14000

Green intensity

Red

inte

nsity

Mean var. = 8.4% Mean var. = 8.2%

Mean var. = 15.2% Mean var. = 32.3%

Mean var. = 7.3%

0

20

40

60

80

100

120

140

-1.2-1.02-0.84-0.66-0.48 -0.

3-0.12 0.0

60.240.42 0.6 0.7

80.961.14

Variation

Num

ber o

f spo

ts

Page 34: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation

34

Measurement of interchip quality

• Quite easy to see by eye, but provides a systematic and objective method for determining consistency

• Use to measure overall consistency of replicates

• Identify and remove bad replicate expts

• Also identify regions of the slide that are consistently error prone

Getting the most out of your microarray

• Processing the raw data

• Cleaning and assessing the quality of your data

• Identifying differentially hybridised spots

• How do you get the correct list of differentially expressed genes out of ~20000 data points?

Page 35: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation

35

Processing flow chart

Data input

Background

correction

Cy5/Cy3

normalisation

Merging

replicate

experiments

Score

differential

hybridisation

Spot

qualityArtifactual

regions

Duplicate spot

variability

Replicate experiment variability

Scoring differentially hybridized probes

• Identify spots that have differential hybridisation on red and green channels

• Many different methods being published

Page 36: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation

36

Scoring differentially:ratio-based method

• Calculate red/green ratio for each spot

Scoring differentially:ratio-based method

• Calculate red/green ratio for each spot• Plot distribution:

0

500

1000

1500

2000

2500

3000

0 1 2

ratio (red/green)

Num

ber o

f spo

ts

Page 37: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation

37

Scoring differentially:ratio-based method

• Calculate red/green ratio for each spot• Plot distribution:

• Define cut-off based on normal distribution(or use 2-fold cut-off)

0

500

1000

1500

2000

2500

3000

0 1 2

ratio (red/green)

Num

ber o

f spo

ts

Problems

Page 38: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation

38

Problems

• Many experiments don’t give normal distribution

0

100

200

300

400

500

600

700

800

900

0 1 2 3 4 5

ratio (red/green)

Num

ber o

f spo

ts

Problems

• Many experiments don’t give normal distribution

• Ratios ignore the signal intensity– More stringent for high

intensity spots

0

100

200

300

400

500

600

700

800

900

0 1 2 3 4 5

ratio (red/green)

Num

ber o

f spo

ts

0

10000

20000

30000

40000

50000

60000

0 10000 20000 30000 40000 50000 60000

green intensity

red

inte

nsity

Page 39: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation

39

Processing flow chart

Data input

Background

correction

Cy5/Cy3

normalisation

Merging

replicate

experiments

Score

differential

hybridisation

Spot

qualityArtifactual

regions

Duplicate spot

variability

Replicate experiment variability

Processing flow chartData input

Background

correction

Cy5/Cy3

normalisation

Merging

replicate

experiments

Score

differential

hybridisation

Spot

qualityArtifactual

regions

Duplicate spot

variability

Replicate experiment variability

• Raw microarray requires a lot of initial processing before being useful• Very important as can completely change the answers you get• The issues are beginning to emerge

• Different people have different ideas of how to resolve them• There is no standard method yet – each has problems

• Very labour intensive, but can be computed relatively easily