What is Microarray
A powerful technology for biological exploration that makes it possible to measure simultaneously the level of activity of thousands of genes.
The amount of mRNA for each gene in a given sample (or a pair of samples) is measured.
Microarrays are:
Parallel
High-throughput
Large-scale
Genomic scale
Best-known commercial microarray platforms
Agilent: inkjet synthesis, 60-mers
Affymetrix: photolithography, 25-mers
Applied Biosystems: contact spotting, 60-mers
NimbleGen: mirror photolithography, 24-mers
Illumina: beads-linked oligo library, 50-mers
Commercial microarrays generally give more reproducible results because of the high level of QC, which can be achieved only at industrial scale.
Agilent inkjet technology
Agilent uses inkjet technology to print oligos onto glass slides.
The non-contact inkjet technology produces microarrays with more uniform and consistent features.
The inkjet process does not introduce defects as a result of surface tension interactions with the microarray surface.
The non-contact inkjet process provides a substantial improvement over pin spotting with respect to consistency and spot uniformity.
Probe design
Proprietary algorithms are used to design the probes.
The design of consensus target sequences is based on the integration of public data derived from different resources.
Probes are experimentally validated.
Affymetrix microarrays
a.k.a. GeneChip.
Instead of putting entire genes on an array, sets of DNA 25-mers (oligonucleotides) are put on it.
Produced using a photolithography process similar to that used to make semiconductor chips.
mRNA samples are processed separately instead of in pairs (single-channel technology).
Affymetrix GeneChips
Various 25-mers are put on the chip to interrogate a single gene.
Additionally, a slight variant (that differs only at the 13th base) is put next to each of them.
The gene expression measure is obtained by combining the hybridization information derived from many (20, 16, 11, >3) separate measurements.
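As an illustration of how many probe-pair measurements might be combined into one expression value, here is a minimal sketch. It assumes a simple average of the differences between each 25-mer and its variant, which is only one of several possible summarization strategies and is not stated in these slides; the function name and the intensity values are hypothetical.

```python
from statistics import mean


def average_difference(pm, mm):
    """Summarize one probe set as the mean of (probe - variant) intensities."""
    if len(pm) != len(mm) or not pm:
        raise ValueError("pm and mm must be non-empty and of equal length")
    return mean(p - m for p, m in zip(pm, mm))


# Hypothetical intensities for a probe set made of 11 probe pairs.
pm_intensities = [820, 910, 760, 1005, 870, 930, 790, 880, 950, 840, 900]
mm_intensities = [310, 295, 330, 400, 280, 350, 300, 320, 360, 290, 340]

signal = average_difference(pm_intensities, mm_intensities)
print(round(signal, 1))
```

Averaging over many pairs reduces the influence of any single poorly behaving probe, which is one reason a gene is interrogated with several probes rather than one.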
New GeneChip releases are characterized by an increase in the number of interrogated transcripts. To keep the GeneChip format constant (a square of 1.28 x 1.28 cm):
the size of the feature is reduced (e.g. 11 μm → 5 μm)
the number of probes for each probe set decreases (e.g. 11 probes → 5 probes).
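The effect of shrinking the feature can be checked with a rough calculation. The sketch below assumes square features that tile the whole 1.28 x 1.28 cm area with no gaps, and feature sizes expressed in micrometres; both are simplifying assumptions, so the results are upper bounds rather than actual chip specifications. The function name is hypothetical.

```python
def max_features(side_cm, feature_um):
    """Upper bound on the number of square features on a square chip."""
    side_um = round(side_cm * 10_000)   # 1 cm = 10,000 micrometres
    per_side = side_um // feature_um    # whole features along one edge
    return per_side ** 2


for size_um in (11, 5):
    print(size_um, max_features(1.28, size_um))
# prints:
# 11 1352569
# 5 6553600
```

Under these assumptions the bounds are of the same order as the 1.4 million and 5,362,207 features quoted for GeneChips elsewhere in these slides.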
Affymetrix GeneChips
Probe Array (Photolithography)
1.4 million features on each chip and therefore the sections of the mask can be very tiny.
400 chips mask
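Probe set (Affymetrix)
[Figure: a probe set is made of probe pairs; each probe pair consists of a perfect match (PM) cell and a mismatch (MM) cell designed on the gene sequence.]
PM: ACCAGATCTGTAGTCCATGCGATGC
MM: ACCAGATCTGTAATCCATGCGATGC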
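The probe pair above can be checked programmatically. This short sketch, with a hypothetical function name, lists the positions at which the two 25-mers shown on the slide differ; it is meant only to confirm that the mismatch sits at the central (13th) base.

```python
def mismatch_positions(pm, mm):
    """Return the 1-based positions where two equal-length probes differ."""
    if len(pm) != len(mm):
        raise ValueError("probes must have the same length")
    return [i for i, (a, b) in enumerate(zip(pm, mm), start=1) if a != b]


pm_probe = "ACCAGATCTGTAGTCCATGCGATGC"  # PM sequence, copied from the slide
mm_probe = "ACCAGATCTGTAATCCATGCGATGC"  # MM sequence, copied from the slide

positions = mismatch_positions(pm_probe, mm_probe)
print(len(pm_probe), positions)  # prints: 25 [13]
```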
Exon arrays
On the GeneChip® Human Exon 1.0 ST Array, 5,362,207 features are used to interrogate one million exon clusters (collections of overlapping exons) with over 1.4 million probe sets.
Transcript clusters
The core type was so named because its annotations were intended to be the foundation on which the gene annotations were built. The extended type derived its name from the sense that these annotations would extend the boundaries of the core genes. The idea behind the name of the full type was that it would signify all possible content.
Core Gene Annotation sources
RefSeq alignments
GenBank alignments of ‘complete CDS’ transcripts
Extended Gene Annotation sources
cDNA alignments
Ensembl annotations (Hubbard, T. et al.)
Mapped syntenic mRNA from rat and mouse
microRNA annotations
Mitomap annotations
Vegagene (The HAVANA group, Hillier et al., Heilig et al.)
VegaPseudogene (The HAVANA group, Hillier et al., Heilig et al.)
Full Gene Annotations
Geneid (Grup de Recerca en Informàtica Biomèdica)
Genscan (Burge, C. et al.)
GENSCAN Suboptimal (Burge, C. et al.)
Exoniphy (Siepel et al.)
RNAgene (Sean Eddy Lab)
SgpGene (Grup de Recerca en Informàtica Biomèdica)
TWINSCAN (Korf, I. et al.)
NimbleGen technology
Feature characteristics:
The most consistent results are obtained using:
33-μm features, created by clustering four activated mirrors surrounded by a border of inactivated mirrors.
16-μm features, in which a single activated mirror is surrounded by a border of inactivated mirrors.
These formats produce microarrays with 85,000 or 195,000 features, respectively, when the entire printable area is used.
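The two feature counts can be reproduced approximately with a tiling calculation. The sketch below assumes a 1024 x 768 micromirror array, a figure that does not appear in these slides, and assumes that neighbouring features share their one-mirror border. Under those assumptions a four-mirror (2 x 2) feature occupies a 3 x 3 tile and a single-mirror feature occupies a 2 x 2 tile. The function name is hypothetical.

```python
def feature_count(mirror_cols, mirror_rows, active_side):
    """Features that fit when each active block shares a 1-mirror border."""
    tile = active_side + 1  # active mirrors plus one shared border mirror
    return (mirror_cols // tile) * (mirror_rows // tile)


COLS, ROWS = 1024, 768  # assumed mirror array size, not given in the slides

print(feature_count(COLS, ROWS, 2))  # 33-um features: prints 87296
print(feature_count(COLS, ROWS, 1))  # 16-um features: prints 196608
```

Both values come out close to the 85,000 and 195,000 features quoted above; the exact layout of the printable area may differ from this simple tiling.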
Probe design
Completely custom.
It is possible to re-design low-quality probes and update the custom array.
Probes can be scattered over the whole chip.
Applera arrays are based on 60-mer probes spotted on a 3D matrix.
Chemiluminescence is used to:
Measure gene expression.
Quality control.
Fluorescence is used to:
Locate and auto-grid.
Normalize every feature in a way independent of the gene expression signal.
Probes between 50 and 80 mers give the best hybridization specificity.
[Figure: hybridization specificity as a function of probe length (axis ticks: 20, 50, 60, 70, 80, 300). Labels: Agilent, Applera, Illumina, Affymetrix, home made, home made cDNA arrays.]
T.R.Hughes et al, Nature Biotechnology 19:342-347, 2001
Applera mouse microarray content
Celera curated only: 18%
GenBank mRNA only: 2.6%
Curated public (RefSeq + RIKEN) only: 24%
Curated public mapping to Celera: 55%
Illumina technology
8 sub-arrays with 24K probes
Individual fibers conduct light to enable data acquisition and quantitation of the signal emitted from each bead.
Scanner has < 1 μm resolution
Probe design
Each address and probe sequence combination has been selected bioinformatically and functionally screened in the laboratory to ensure the absence of cross-hybridization.
Gene-specific probes were designed using a multi-step algorithm scoring the following parameters:
Similarity to other genes
Absence of highly repeated sequence in the genome
Sequence complexity
EST coverage
Self-complementarity for hairpin structure prediction
Melting temperature for hybridization uniformity
Distance from 3’ end of the transcript
The design also takes into account exon structure:
Probe design incorporated splice isoforms that have been identified and documented in the RefSeq database.
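Some of the sequence-level parameters listed above can be sketched in a few lines. The code below is not Illumina's algorithm. The function names, the example sequence, and the choice of metrics are all assumptions made for illustration: GC fraction stands in as a crude proxy for melting temperature, the longest homopolymer run as a crude proxy for sequence complexity, and the longest stretch whose reverse complement also occurs in the probe as a crude hairpin indicator.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq):
    """Fraction of G and C bases in a non-empty sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def longest_homopolymer(seq):
    """Length of the longest run of one repeated base (non-empty sequence)."""
    best = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written with A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_self_complementary_stem(seq, min_len=4):
    """Longest substring length whose reverse complement also occurs in seq.

    Returns 0 when no stretch of at least min_len bases qualifies.
    Overlapping matches are counted, so this is only a rough indicator.
    """
    best = 0
    for length in range(min_len, len(seq) // 2 + 1):
        found = any(
            reverse_complement(seq[i:i + length]) in seq
            for i in range(len(seq) - length + 1)
        )
        if not found:
            break  # longer stems cannot exist if this length has none
        best = length
    return best


# Hypothetical sequence, not a real probe.
probe = "ATGGCGTACCTGAAGTTCGACCTGATCGGTACCAAGCTTGCATGCCTGCA"
print(gc_fraction(probe), longest_homopolymer(probe),
      longest_self_complementary_stem(probe))
```

A real design pipeline would combine such scores with genome-wide similarity searches and EST coverage, which need external data and are outside the scope of this sketch.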
Labeling for two channel arrays
Direct labeling:
Cy3 and Cy5 are directly incorporated during cDNA synthesis.
Indirect labeling:
The process of indirect labeling allows incorporation of modified nucleotides, commonly 5-(3-aminoallyl)-2′-deoxyuridine 5′-triphosphate (a reactive amine derivative of dUTP), into the reverse transcription reaction.
The aminoallyl nucleotide is readily incorporated by both DNA and RNA polymerases.
A reactive fluorescent dye is then chemically attached to the cDNA transcript in a second reaction.
Amplification labeling (IVT):
RT is used with an oligo d(T) primer associated with the T7 promoter sequence to make ds cDNA.
T7 RNA polymerase produces cRNA, resulting in a 100x linear amplification of the ds cDNA.
Critical parameters
Many factors have an impact on the reliability of signals.
The scanner used is important to consider, as it generates the actual signals to be quantified.
The choice of image analysis software is clearly critical in the production of reliable intensity values.
Detection
Hybridization detection is performed in two steps:
Streptavidin labeled with phycoerythrin is applied to the array.
Signal amplification is performed using anti-streptavidin moAb labeled with biotin, followed by addition of streptavidin labeled with FITC.
The fluorescence signal is read by a laser-based scanner.
[Figure labels: biotin, streptavidin, FITC, moAb anti-streptavidin.]
RNA fragments with fluorescent tags from sample to be tested
RNA fragment hybridizes with DNA on GeneChip® array
Applera RT/IVT labeling
[Figure: RT/IVT labeling workflow, from mRNA (5’ ... AAAAAAA 3’) to purified DIG-labeled cRNA, through cDNA carrying the T7 promoter. Steps shown: reverse transcription, 2nd strand DNA synthesis, RNA degradation, cDNA purification, cRNA synthesis, cRNA purification.]
Detection
Hybridization detection is performed using anti-digoxigenin moAb labeled with alkaline phosphatase (AP).
Chemiluminescence is acquired by a CCD camera.
[Figure labels: digoxigenin, moAb anti-digoxigenin, AP, reagent and enhancer, chemiluminescence signal.]
Internal Control Probe (ICP)
[Figure labels: DIG®, LIZ®, 60-mer oligo, 24-mer oligo, hybridization, immobilized ICP oligo (co-spotted in every location).]
Detection
Hybridization detection is performed using anti-biotin moAb labeled with Cy3.
The fluorescence signal is read by a laser-based scanner.
[Figure labels: biotin, moAb anti-biotin, Cy3, hybridization.]