
What is Microarray

A powerful technology for biological exploration which enables simultaneous measurement of the level of activity of thousands of genes.
The amount of mRNA for each gene in a given sample (or a pair of samples) is measured.

Microarrays are:
Parallel
High-throughput
Large-scale
Genomic scale

Best-known commercial microarray platforms

Platform            Technology                   Probe length
Agilent             Inkjet synthesis             60-mers
Affymetrix          Photolithography             25-mers
Applied Biosystems  Contact spotting             60-mers
NimbleGen           Mirror photolithography      24-mers
Illumina            Beads-linked oligo library   50-mers

Commercial microarrays give in general more reproducible results due to the high QC, which can be achieved only at industrial level.

Robot pin spotting

Agilent inkjet technology

Agilent uses inkjet technology to print oligos onto glass slides.
The non-contact inkjet technology produces microarrays with more uniform and consistent features.
The inkjet process does not introduce defects as a result of surface tension interactions with the microarray surface.
The non-contact inkjet process provides substantial improvement over pin spotting with respect to consistency and spot uniformity.

Probe design

Proprietary algorithms are used to design the probes.
Design of consensus target sequences is based on the integration of public data derived from different resources.
Probes are experimentally validated.

Affymetrix microarrays

A.k.a. GeneChip.
Instead of putting entire genes on an array, put sets of DNA 25-mers (oligonucleotides).
Produced using a photolithography process similar to that used to make semiconductor chips.
mRNA samples are processed separately instead of in pairs (single-channel technology).

Affymetrix GeneChips

Various 25-mers are put on the chip to interrogate a single gene.
Additionally, a slight variant (that differs only at the 13th base) is put next to each of them.
The gene expression measure is made by combining the hybridization information derived from many (20, 16, 11, >3) separate measurements.

New GeneChip releases are characterized by an increase in the number of interrogated transcripts. To keep the GeneChip format constant (a square of 1.28 x 1.28 cm):
the size of the feature is reduced (e.g. 11 μm → 5 μm)
the number of probes for each probe set decreases (e.g. 11 probes → 5 probes).
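The relation between feature size and chip capacity can be checked with a little arithmetic. The sketch below is an idealized estimate only: it assumes that the whole 1.28 x 1.28 cm square is tiled with square features with no spacing between them, and the function name `max_features` is hypothetical.

```python
CHIP_SIDE_UM = 12_800  # 1.28 cm expressed in micrometres


def max_features(chip_side_um: int, feature_side_um: int) -> int:
    """Idealized number of square features that fit on a square chip.

    Assumes the whole chip is tiled with features and no gaps, so the
    result is an estimate, not a chip specification.
    """
    per_side = chip_side_um // feature_side_um
    return per_side * per_side


for size_um in (18, 11, 5):
    print(f"{size_um} um features: about {max_features(CHIP_SIDE_UM, size_um):,}")
```

Under these assumptions, 11 μm features give roughly 1.35 million features and 5 μm features roughly 6.5 million, which is why shrinking the feature allows more transcripts on a chip of constant size.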

Affymetrix GeneChips: Probe Array (Photolithography)

1.4 million features on each chip, and therefore the sections of the mask can be very tiny.

[Figure: photolithographic synthesis of probes; labels: 400 chips, mask; feature sizes 11 μm and 18 μm]

Probe set (Affymetrix)

[Figure: a probe set along the gene sequence; labels: cell, probe pair, perfect match (PM), mismatch (MM)]

PM: ACCAGATCTGTAGTCCATGCGATGC
MM: ACCAGATCTGTAATCCATGCGATGC
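The PM/MM pair shown on this slide can be checked programmatically, and the idea of combining several probe pairs into one expression value can be illustrated at the same time. This is a minimal sketch: the plain average of PM minus MM differences is an assumption chosen for illustration, not necessarily the algorithm used by the vendor software, and the intensity values are made up.

```python
def differing_positions(probe_a: str, probe_b: str) -> list[int]:
    """Return the 1-based positions at which two equal-length probes differ."""
    if len(probe_a) != len(probe_b):
        raise ValueError("probes must have the same length")
    return [i for i, (a, b) in enumerate(zip(probe_a, probe_b), start=1) if a != b]


def average_difference(pm_signals: list[float], mm_signals: list[float]) -> float:
    """Mean of PM - MM over the probe pairs of one probe set.

    Assumption: a plain average of differences, used only to illustrate
    how probe-pair measurements can be combined.
    """
    if not pm_signals or len(pm_signals) != len(mm_signals):
        raise ValueError("need the same non-zero number of PM and MM signals")
    diffs = [pm - mm for pm, mm in zip(pm_signals, mm_signals)]
    return sum(diffs) / len(diffs)


# The two 25-mers shown on the slide.
pm = "ACCAGATCTGTAGTCCATGCGATGC"
mm = "ACCAGATCTGTAATCCATGCGATGC"
print(len(pm), differing_positions(pm, mm))

# Made-up intensities for a probe set of four probe pairs.
print(average_difference([1200.0, 950.0, 1430.0, 800.0],
                         [300.0, 410.0, 390.0, 350.0]))
```

The first call confirms that the two 25-mers differ only at base 13, the central position of the probe.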

Exon arrays

On the GeneChip® Human Exon 1.0 ST Array, 5,362,207 features are used to interrogate one million exon clusters (collections of overlapping exons) with over 1.4 million probe sets.

Transcript clusters

The core type was so named because the annotations in this type were intended to be the foundation from which the gene annotations were built. The extended type derived its name from the sense that these annotations would extend the boundaries of the core genes. The idea behind the name of the full type was that it would signify all possible content.

Core Gene Annotation sources
RefSeq alignments
GenBank alignments of ‘complete CDS’ transcripts

Extended Gene Annotation sources
cDNA alignments
Ensembl annotations (Hubbard, T. et al.)
Mapped syntenic mRNA from rat and mouse
microRNA annotations
Mitomap annotations
Vegagene (The HAVANA group, Hillier et al., Heilig et al.)
VegaPseudogene (The HAVANA group, Hillier et al., Heilig et al.)

Full Gene Annotations
Geneid (Grup de Recerca en Informàtica Biomèdica)
Genscan (Burge, C. et al.)
GENSCAN Suboptimal (Burge, C. et al.)
Exoniphy (Siepel et al.)
RNAgene (Sean Eddy Lab)
SgpGene (Grup de Recerca en Informàtica Biomèdica)
TWINSCAN (Korf, I. et al.)

NimbleGen photolithography

NimbleGen technology

Feature characteristics:
The most consistent results are obtained using:
33-μm features, created by clustering four activated mirrors surrounded by a border of inactivated mirrors.
16-μm features, in which a single activated mirror is surrounded by a border of inactivated mirrors.

These formats produce microarrays with 85,000 or 195,000 features, respectively, when the entire printable area is used.
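The two feature counts can be roughly reproduced by counting mirrors. The sketch below rests on two assumptions that are not stated in the slides: a micromirror array of 1024 x 768 mirrors, and a tiling in which neighbouring features share their one-mirror-wide border of inactivated mirrors. The function name is hypothetical.

```python
def features_per_array(mirror_cols: int, mirror_rows: int, active_side: int) -> int:
    """Count features when each feature is an active_side x active_side block
    of activated mirrors and adjacent features share a one-mirror-wide border
    of inactivated mirrors.

    The tiling period is therefore active_side + 1 mirrors in each direction.
    """
    period = active_side + 1
    return (mirror_cols // period) * (mirror_rows // period)


# Assumed mirror array size (not given in the slides).
COLS, ROWS = 1024, 768

print(features_per_array(COLS, ROWS, active_side=2))  # four-mirror features
print(features_per_array(COLS, ROWS, active_side=1))  # single-mirror features
```

Under these assumptions the counts come out at 87,296 and 196,608, the same order of magnitude as the 85,000 and 195,000 features quoted for the two formats.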

Probe design

Completely custom.
It is possible to re-design low-quality probes and update the custom array.
Probes can be scattered all over the chip.

Applera arrays are based on 60-mer probes spotted on a 3D matrix.

Chemiluminescence is used to:
Measure gene expression.
Quality control.

Fluorescence is used to:
Locate and auto-grid.
Normalize every feature in a way independent of the gene expression signal.

Probes between 50 and 80 mers give the best hybridization specificity

[Figure: hybridization specificity versus probe length (axis ticks at 20, 50, 60, 70, 80 and 300); labels: Agilent, Applera, Illumina, home made, Affymetrix, home made cDNA arrays]

T.R. Hughes et al., Nature Biotechnology 19:342-347, 2001

Probe design

Signal mainly derived as the average between the various isoforms

Applera mouse microarray content

Celera curated only: 18%
GenBank mRNA only: 2.6%
Curated public only (RefSeq + RIKEN): 24%
Curated public mapping to Celera: 55%

Illumina technology

8 sub-arrays with 24K probes

Individual fibers conduct light to enable data acquisition and quantitation of signal emitted from each bead.

Scanner has < 1 μm resolution

Illumina decoding

[Figure: bead decoding; example digits shown: 1 1 0 1 2 2 0 2]

Probe design

Each address and probe sequence combination has been selected bioinformatically and functionally screened in the laboratory to ensure the absence of cross-hybridization.
Gene-specific probes were designed using a multi-step algorithm scoring the following parameters:
Similarity to other genes
Absence of highly repeated sequence in the genome
Sequence complexity
EST coverage
Self-complementarity for hairpin structure prediction
Melting temperature for hybridization uniformity
Distance from 3’ end of the transcript

The design also took into account exon structure:
Probe design incorporated splice isoforms that have been identified and documented in the RefSeq database.
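Two of the parameters listed above, melting temperature and self-complementarity, are easy to illustrate. The sketch below is not the vendor's multi-step algorithm: it assumes a common GC-content approximation for the melting temperature and a crude hairpin indicator, and all function names and the example probe are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def melting_temperature(seq: str) -> float:
    """Rough GC-content estimate of Tm in degrees Celsius.

    Uses the approximation Tm = 64.9 + 41 * (G + C - 16.4) / N,
    which ignores salt and probe concentration.
    """
    gc = seq.count("G") + seq.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(seq)


def longest_self_complementary_stem(seq: str, min_len: int = 4) -> int:
    """Length of the longest stretch whose reverse complement occurs again
    downstream without overlapping it (a crude hairpin indicator).

    Returns 0 if no stem of at least min_len bases is found.
    """
    best = 0
    n = len(seq)
    for length in range(min_len, n // 2 + 1):
        found = False
        for start in range(n - length + 1):
            partner = reverse_complement(seq[start:start + length])
            if seq.find(partner, start + length) != -1:
                found = True
                break
        if not found:
            break  # a longer stem would contain a stem of this length
        best = length
    return best


# Made-up 50-mer used only to exercise the functions.
probe = "ACGTTGCAAGCTTGGATCCATGCCGTAAGCTTACGGATTCAGCTAGGCTA"
print(round(melting_temperature(probe), 1), longest_self_complementary_stem(probe))
```

A design pipeline would typically reject probes whose estimated melting temperature falls outside a narrow window or whose longest stem exceeds a chosen threshold; the thresholds themselves are design choices not given in the slides.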

Labeling for two-channel arrays

Direct labeling:
Cy3 and Cy5 are directly incorporated during the cDNA synthesis.

Indirect labeling:
The process of indirect labeling allows incorporation of modified nucleotides, commonly 5-(3-aminoallyl)-2′-deoxyuridine 5′-triphosphate (a reactive amine derivative of dUTP), into the reverse transcription reaction.
The aminoallyl nucleotide is readily incorporated by both DNA and RNA polymerases.
A reactive fluorescent dye is then chemically attached to the cDNA transcript in a second reaction.

Amplification labeling (IVT):
RT is used with an oligo d(T) primer associated to the T7 promoter sequence to make ds cDNA.
T7 RNA polymerase produces a cRNA, which results in a 100x linear amplification of the dsDNA.

Cy3, Cy5 spectra

Critical parameters

Many factors have an impact on the reliability of signals.
The scanner used is important to consider, as it generates the actual signals to be quantified.
The choice of image analysis software is clearly critical in the production of reliable intensity values.

Affymetrix labeling

Detection

Hybridization detection is performed in two steps:
Streptavidin labeled with phycoerythrin is applied to the array.
Signal amplification is performed using anti-streptavidin moAb labeled with biotin, followed by addition of streptavidin labeled with FITC.
The fluorescence signal is read by a laser-based scanner.

[Figure: detection scheme; labels: biotin, streptavidin, FITC, moAb anti-streptavidin]

RNA fragments with fluorescent tags from sample to be tested

RNA fragment hybridizes with DNA on GeneChip® array

GeneChip Output

Applera RT/IVT labeling

[Figure: RT/IVT labeling workflow, from mRNA (AAAAAAA-3’) through cDNA carrying the T7 promoter to purified DIG-labeled cRNA (UUUUUUU-5’); steps shown:]
Reverse transcription
2nd strand DNA synthesis
RNA degradation
cDNA purification
cRNA synthesis
cRNA purification

Detection

Hybridization detection is performed using anti-digoxigenin moAb labeled with alkaline phosphatase (AP).
Chemiluminescence is acquired by a CCD camera.

[Figure: detection scheme; labels: digoxigenin, moAb anti-digoxigenin, AP, reagent and enhancer, chemiluminescence signal]

Internal Control Probe (ICP)

[Figure: labels: DIG, LIZ®, 60 mer oligo, 24 mer oligo, hybridization, immobilized ICP oligo (co-spotted in every location)]

Illumina labeling

Detection

Hybridization detection is performed using anti-biotin moAb labeled with Cy3.
The fluorescence signal is read by a laser-based scanner.

[Figure: detection scheme; labels: biotin, hybridization, Cy3, moAb anti-biotin]