cDNA microarray



INTRODUCTION

A DNA microarray is a multiplex technology used in molecular biology and in medicine. It consists of an arrayed series of thousands of microscopic spots of DNA oligonucleotides, called features, each containing picomoles of a specific DNA sequence. This can be a short section of a gene or other DNA element that is used as a probe to hybridize a cDNA or cRNA sample (called the target) under high-stringency conditions. Probe-target hybridization is usually detected and quantified by fluorescence-based detection of fluorophore-labeled targets to determine the relative abundance of nucleic acid sequences in the target.

In standard microarrays, the probes are attached to a solid surface by a covalent bond to a chemical matrix (via epoxy-silane, amino-silane, lysine, polyacrylamide or others). The solid surface can be glass or a silicon chip, in which case they are commonly known as gene chips or, colloquially, Affy chips when an Affymetrix chip is used. Other microarray platforms, such as Illumina, use microscopic beads instead of the large solid support. DNA arrays differ from other types of microarray only in that they either measure DNA or use DNA as part of their detection system.

Microarray technology evolved from Southern blotting, where fragmented DNA is attached to a substrate and then probed with a known gene or fragment. The use of a collection of distinct DNAs in arrays for expression profiling was first described in 1987, and the arrayed DNAs were used to identify genes whose expression is modulated by interferon. These early gene arrays were made by spotting cDNAs onto filter paper with a pin-spotting device. The use of miniaturized microarrays for gene expression profiling was first reported in 1995, and a complete eukaryotic genome (Saccharomyces cerevisiae) on a microarray was published in 1997.

PRINCIPLE

Microarray technology consists of spotting PCR products or long oligonucleotides (50-mer to 70-mer) on glass slides at densities of up to 6000 spots/cm². These slides are hybridised using fluorescent targets (cDNAs or genomic DNAs). The fluorescent molecules most commonly used are members of the cyanine family, Cy3 and Cy5. After hybridisation, the signals are detected using a fluorescence scanner. The use of two different fluorochromes allows the hybridisation signals from two distinct strains to be determined in a single experiment. Once the fluorescence intensities have been obtained, the major part of the work is the analysis of the data in order to extract the biological information.

The procedure can be divided into five steps: target preparation, hybridisation, slide scanning, data analysis, and expression profile clustering.
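To give a feel for what a density of 6000 spots per cm² means, the sketch below estimates the centre-to-centre spacing of the spots. It assumes that the spots lie on a regular square grid, which the text does not state, and the function name is hypothetical.

```python
import math

def spot_pitch_um(spots_per_cm2: float) -> float:
    """Centre-to-centre spacing in micrometres, assuming a square grid."""
    area_per_spot_cm2 = 1.0 / spots_per_cm2
    pitch_cm = math.sqrt(area_per_spot_cm2)
    return pitch_cm * 10_000  # 1 cm = 10,000 micrometres

print(round(spot_pitch_um(6000)))  # about 129 micrometres
```

Under this assumption, each spot together with its surrounding gap occupies a square roughly 0.13 mm on a side.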

MATERIALS

DNA sources

About 5200 human cDNA clones of the IMAGE library were obtained from the RZPD Resource Centre (Berlin, Germany). Some 21 000 random shotgun clones representing the genome of Trypanosoma brucei were provided by Najib El-Sayed of the Institute for Genomic Research (TIGR, Rockville, USA). Nearly 4550 shotgun clones covering the entire genome of Pseudomonas putida as a minimal tiling path were obtained from Helmut Hilbert of Qiagen (Hilden, Germany). PCR products for some 21 000 predicted open reading frames (ORFs) of Drosophila melanogaster were produced directly from genomic DNA. The template for some 7300 ORF-specific PCR products of Candida albicans was strain SC5314 (Can14).

PCR amplification

PCR amplifications were performed in 384- or 96-well microtitre plates. For PCR on the cDNA and shotgun clones, 0.2 µM of the respective vector-specific primer pairs d(TCACACAGGAAACAGCTATGAC) and d(GTAAAACGACGGCCAGTG) (human clones), d(TTGTAAAACGACGGCCAGTG) and d(GCGGATAACAATTTCACACAGGA) (T. brucei), or d(TCGGATCCACTAGTAACG) and d(GGCCGCCAGTGTGATG) (P. putida) (all from Interactiva, Ulm, Germany) were used. The reactions were started by inoculating 25 or 100 µl of PCR mix, usually 10 mM Tris–HCl pH 8.3, 2.25 mM MgCl2, 50 mM KCl, 0.2 mM each of dATP, dTTP, dGTP and dCTP, 1.5 M betaine, 0.1 mM cresol red and 2 U Taq polymerase, with a few Escherichia coli cells transferred from a growth culture using a plastic 384- or 96-pin gadget (Genetix, New Milton, UK). The plates were incubated for 3 min at 94 °C before 35 cycles of denaturation at 94 °C for 30 s, annealing at 51 °C for 30 s and elongation at 72 °C for 90 s were performed, followed by a final elongation phase at 72 °C for 10 min. In some cases, the PCR was performed without betaine. The Drosophila ORFs were initially amplified on 100 ng of genomic DNA with some 43 000 gene-specific primers, all of which contained one of several common tag sequences of 15 nt length at their 5′-ends. Subsequent re-amplification was carried out using the fitting primer pair. PCR products of C. albicans ORFs were produced on 20 ng of genomic DNA with 7300 specific primer pairs.
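The cycling programme described above can be written down as data, which makes it easy to check the minimum run time. This is only a sketch: it ignores the time the thermocycler needs to ramp between temperatures, and all names are hypothetical.

```python
# Each step is (label, temperature in degrees Celsius, duration in seconds).
INITIAL_DENATURATION = ("initial denaturation", 94, 3 * 60)
CYCLE = [
    ("denaturation", 94, 30),
    ("annealing", 51, 30),
    ("elongation", 72, 90),
]
N_CYCLES = 35
FINAL_ELONGATION = ("final elongation", 72, 10 * 60)

def total_hold_time_s() -> int:
    """Sum of the programmed hold times, excluding temperature ramping."""
    per_cycle = sum(duration for _, _, duration in CYCLE)
    return INITIAL_DENATURATION[2] + N_CYCLES * per_cycle + FINAL_ELONGATION[2]

print(total_hold_time_s() / 60)  # 100.5 minutes of hold time
```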

Microarray production process

DNA fragments amplified by PCR are spotted onto a glass microscope slide that has been coated with polylysine before spotting. The purpose of the polylysine coating is to fix the DNA through electrostatic interactions. In our case, the PCR fragments are the expressed part (ORF) of the 6200 genes of Saccharomyces cerevisiae (baker's yeast). Slide preparation is completed by blocking the polylysine that is not bound to DNA, in order to prevent the target from binding to it. Prior to hybridisation, the DNA on the microarray is denatured to obtain single-stranded DNA; this allows the probe to bind to the complementary strand from the target.

Target preparation

RNA is extracted from the two yeast cultures whose expression levels we want to compare. Messenger RNAs are then converted into cDNA by reverse transcription. At this stage, the cDNA from the first culture is labelled with a green dye, whereas the cDNA from the second culture is labelled with a red dye.

The available target-preparation methods can be divided into two groups: first-strand cDNA that is labeled or tagged with a capture sequence, or the generation of antisense RNA (aRNA) from double-stranded cDNA during an in vitro transcription (IVT) reaction. Labeled cDNA can be prepared via direct incorporation of a fluorophore-labeled nucleotide, or through incorporation of an aminoallyl-labeled nucleotide followed by coupling of a fluorophore containing an amine-reactive group to the aminoallyl nucleotide (Schena et al. 1995; for review see Lockhart and Winzeler 2000). Alternatively, the first-strand cDNA can be tagged with a capture sequence that is used for subsequent detection steps (Stears et al. 2000). DNA microarrays containing short oligonucleotide probes (<35 nucleotides long) require more target for each hybridization, which calls for an amplification method when sample sizes are small. Typically, the generation of aRNA (also commonly called complementary RNA or cRNA) is preceded by first-strand synthesis of cDNA using an oligonucleotide primer containing a bacteriophage T7 RNA polymerase promoter proximal to an oligo(dT) sequence (van Gelder et al. 1990; Eberwine et al. 1992; Lockhart et al. 1996). After second-strand cDNA synthesis and cDNA purification, an IVT reaction is performed using T7 RNA polymerase in the presence of labeled nucleotides. Alternatives to this labeling strategy produce unlabeled aRNA, followed by cDNA synthesis in the presence of a fluorophore-labeled nucleotide (Wang et al. 2000). Any target preparation method requires a linear amplification of the available transcripts so that the target remains representative of the transcript population.

Hybridisation

Green-labelled and red-labelled cDNAs are mixed together (the mixture is called the target) and applied to the matrix of spotted single-stranded DNA (called the probe). The chip is then incubated overnight at 60 °C. At this temperature, a DNA strand that encounters its complementary strand pairs with it to form double-stranded DNA. The fluorescent cDNA thus hybridises to the spotted DNA.

The discrepancies in microarray results are a consequence of differences in microarray measures such as accuracy [i.e. ‘the degree of conformity of the measured quantity to its actual (true) value’], sensitivity [i.e. ‘the concentration range of target molecules in which accurate measurements can be made’], reproducibility [i.e. ‘the degree to which repeated measurements of the same quantity will show the same or similar results’] and specificity [i.e. ‘the ability of a probe to provide a signal that is influenced only by the presence of the target molecule’].

Accuracy, sensitivity and reproducibility may be affected by several effectors. These measures and their effectors are discussed by Dufva and Draghici et al. and will not be detailed here. An example of an effector of sensitivity, reproducibility and accuracy is the type of microarray platform: oligonucleotide arrays have been found to be more reproducible and sensitive than cDNA arrays, and some oligonucleotide arrays have been found to be more accurate than others. Sensitivity is also affected by probe density (i.e. the number of different probes that are fabricated in a given area), which has been shown to be an effector for the availability of probes for hybridization; this availability may also be affected by the steric restrictions imposed by the solid microarray surface. A higher availability of probes for hybridization has been demonstrated to increase sensitivity. In addition, sensitivity is affected by the hybridization signal-to-noise ratio (i.e. the ratio between the spot signal and that of the background); a low background increases microarray hybridization sensitivity.
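The signal-to-noise ratio mentioned above can be sketched as follows. The text defines it only as the ratio between the spot signal and that of the background; summarising each region by the median of its pixel intensities, the example values, and the function name are assumptions made for illustration.

```python
from statistics import median

def signal_to_noise(spot_pixels, background_pixels):
    """Ratio of the median spot intensity to the median background intensity."""
    background = median(background_pixels)
    if background <= 0:
        raise ValueError("background must be positive")
    return median(spot_pixels) / background

spot = [1200, 1350, 1280, 1310]
print(signal_to_noise(spot, [100, 110, 90, 100]))   # low background
print(signal_to_noise(spot, [400, 420, 380, 400]))  # high background
```

The same spot gives a much higher ratio against the low background, which is why a low background improves sensitivity.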

Low specificity of microarray hybridizations has been suggested to be one of the prime measures affecting discrepancies in gene-expression profiles between different probes targeting the same region of a given transcript, or between different microarray platforms. In the present review, we will highlight the issue of microarray-hybridization specificity as a key measure that, once improved, may increase the validity of microarray results.

Microarrays consist of multiple probes. Hence, a prime key for specificity during microarray hybridization, for either short-oligomer or cDNA microarrays, is the ability of the probe to discriminate between different target molecules.

Probes are designed to be complementary to the target molecule according to the Watson–Crick rules of binding. Therefore, a probe with high specificity to its target molecule should provide a signal influenced only by the presence of the target molecule. Nevertheless, a perfect match in terms of sequence-similarity-based complementarity between a probe and its target molecule does not guarantee specificity. This is due to the presence of thousands of target molecules during microarray hybridization (each target molecule being composed of tens, hundreds or thousands of the four nucleotide bases) and to the effect of different effectors (discussed subsequently) of hybridization specificity, which may alter the ability of a probe to bind to a target molecule. Hence, there is often some degree of microarray-probe hybridization to a target molecule that is not strictly complementary to it or, vice versa, a variable number of target molecules that are hybridized to a microarray probe that is not exactly complementary to them.
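To make the difference between a perfect match and a partial match concrete, here is a minimal sketch that counts Watson–Crick mismatches between a probe and a target of equal length. It is deliberately simplistic: real hybridization depends on thermodynamics and not only on base identity, and the sequences and function names are invented for illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq: str) -> str:
    """Return the strand that pairs perfectly with seq, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))

def mismatches(probe: str, target: str) -> int:
    """Count positions where target fails to pair with probe.

    Both sequences are written 5' to 3' and must have the same length.
    """
    if len(probe) != len(target):
        raise ValueError("sequences must have equal length")
    perfect_target = reverse_complement(probe)
    return sum(1 for a, b in zip(perfect_target, target.upper()) if a != b)

probe = "ATGCGTAC"
print(reverse_complement(probe))      # GTACGCAT
print(mismatches(probe, "GTACGCAT"))  # 0, a perfect match
print(mismatches(probe, "GTACGGAT"))  # 1, a candidate for cross-hybridization
```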

FOUR LEVELS OF HYBRIDIZATION SPECIFICITY

We define four levels of hybridization specificity in the context of microarray hybridization. The first is that of hybridization between a single probe molecule and a single target molecule. The two molecules may exhibit perfect hybridization, partial hybridization (i.e. the target molecule is only partially hybridized to the probe) or no hybridization.

The second level of specificity is that of a spot. At this level, multiple probe molecules that compose one spot are hybridized to multiple target molecules. The spot probes may exhibit perfect, partial or no hybridization with the target molecules. Notably, at this level partial hybridization may take one or both of two forms: only some of the probes may be hybridized to the target molecule, or probes may be hybridized to only some of the target molecules. This partial hybridization at the spot level may be a result of cross-hybridization (i.e. hybridization between sequences that are not strictly complementary) due to the presence and hybridization of nontarget molecules with sequences similar to that of the spot probes. Since a spot is composed of multiple probes, a single spot may simultaneously bear all combinations of one to four of the presented probe-target molecule types of binding.

The third level of specificity is that of a spot-set (or, in Affymetrix terminology, ‘probe-set’), in which multiple spots represent different segments of the same reference sequence (e.g. different exons of a gene). At this level, different spots of a spot-set may exhibit: perfect hybridization with the target molecule; partial hybridization with the target molecule, due to the presence of probes with mismatches to the target molecule as a result of, for example, an annotation error in the gene sequence or intended mismatches introduced to quantify nonspecific hybridization; no hybridization, due to, for example, alternative splicing of a transcript leading to probes with no match to the target molecule; or cross-hybridization, due to, for example, a spot within a spot-set that represents an evolutionarily conserved gene segment, which hybridizes with nontarget molecules derived from various gene-family members.

The fourth level of specificity is that of the microarray, in which a variable number of spot-sets may exhibit different forms of hybridization with target sequences: perfect hybridization (i.e. all target molecules are hybridized to their representative spot-sets and all spot-sets are hybridized to the target molecules they represent), partial hybridization in either direction, no hybridization (i.e. target molecules are not hybridized to any spot-set, or spot-sets do not match any target molecules) or cross-hybridization (e.g. target molecules of different genes hybridize to the same spot-set, or target molecules of a particular gene hybridize to several different genes’ spot-sets). These different forms may exist for a large number of different target molecules or spot-sets.

Slide scanning

A laser excites each spot, and the fluorescent emission is collected by a photomultiplier tube (PMT) coupled to a confocal microscope. We obtain two images in which grey levels represent the fluorescence intensities read. If we replace the grey levels by shades of green for the first image and shades of red for the second, superimposing the two images gives one image composed of spots ranging from green (where only DNA from the first condition is bound) to red (where only DNA from the second condition is bound), passing through yellow (where DNA from the two conditions is bound in equal amounts).
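The false-colour overlay described above can be sketched in a few lines. Each scanned channel is treated as a grid of grey levels from 0 to 255, and the composite pixel is simply (red, green, 0). The tiny one-row images and the function name are invented for illustration; real scanner software works on much larger images and usually rescales the intensities first.

```python
def overlay(green_img, red_img):
    """Combine two greyscale images into one RGB image.

    green_img holds the first-condition channel, red_img the second.
    Returns rows of (R, G, B) tuples.
    """
    return [
        [(r, g, 0) for g, r in zip(green_row, red_row)]
        for green_row, red_row in zip(green_img, red_img)
    ]

green = [[255, 0, 255]]  # first condition
red = [[0, 255, 255]]    # second condition
print(overlay(green, red))
# [[(0, 255, 0), (255, 0, 0), (255, 255, 0)]]  i.e. green, red and yellow spots
```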

Data analysis

We now have two microarray images from which we have to estimate the number of DNA molecules bound in each experimental condition. To do so, we measure the signal at the emission wavelength of the green dye and the signal at the emission wavelength of the red dye. We then normalise these amounts according to various parameters (amount of yeast in each culture condition, emission power of each dye, etc.). We assume that the amount of fluorescent DNA bound is proportional to the amount of mRNA present in each cell at the start, and we calculate the red/green fluorescence ratio. If this ratio is greater than 1 (red on the image), gene expression is greater in the second experimental condition; if this ratio is smaller than 1 (green on the image), gene expression is greater in the first condition. We can visualize these differences in expression using software such as ArrayPlot, which was developed in the laboratory (see the image below). From the list of spot intensities, this software displays the red intensity of each spot as a function of its green intensity.
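The ratio calculation can be sketched as follows. The text does not give a normalisation formula, so the sketch assumes one simple choice: rescaling so that the median red/green ratio over all spots equals 1, which amounts to assuming that the typical gene is unchanged. The log2 transform is also an addition, used so that a ratio of 1 maps to 0 and induction and repression are symmetric. The intensity values and the function name are invented.

```python
import math
from statistics import median

def normalised_log_ratios(red, green):
    """Return log2(red/green) per spot after median-ratio normalisation."""
    raw = [r / g for r, g in zip(red, green)]
    scale = median(raw)  # assumed: the typical gene is unchanged
    return [math.log2(ratio / scale) for ratio in raw]

red = [400, 200, 1600, 200, 50]
green = [100, 100, 400, 100, 100]
for value in normalised_log_ratios(red, green):
    if value > 0:
        call = "higher in second condition (red)"
    elif value < 0:
        call = "higher in first condition (green)"
    else:
        call = "unchanged (yellow)"
    print(f"{value:+.2f}  {call}")
```

A positive value corresponds to a ratio greater than 1 and a negative value to a ratio smaller than 1, matching the colour reading described in the text.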

Fabrication

Microarrays can be manufactured in different ways, depending on the number of probes under examination, costs, customization requirements, and the type of scientific question being asked. Arrays from commercial vendors may have as few as 10 probes or up to 2.1 million micrometre-scale probes.

Surface engineering

The first step of DNA microarray fabrication involves surface engineering of a substrate in order to obtain desirable surface properties for the application of interest. Optimal surface properties are those which produce high signal-to-noise ratios for the DNA targets of interest. Generally, this involves maximizing the probe surface density and activity while minimizing the non-specific binding of the targets of interest. Methods of surface engineering vary depending on the platform material, design, and application.

Spotted vs oligonucleotide arrays

Microarrays can be fabricated using a variety of technologies, including printing with fine-pointed pins onto glass slides, photolithography using pre-made masks, photolithography using dynamic micromirror devices, ink-jet printing, or electrochemistry on microelectrode arrays.

In spotted microarrays, the probes are oligonucleotides, cDNA or small fragments of PCR products that correspond to mRNAs. The probes are synthesized prior to deposition on the array surface and are then spotted onto glass. A common approach utilizes an array of fine pins or needles, controlled by a robotic arm, that is dipped into wells containing DNA probes and then deposits each probe at designated locations on the array surface. The resulting grid of probes represents the nucleic acid profiles of the prepared probes and is ready to receive complementary cDNA or cRNA targets derived from experimental or clinical samples. This technique is used by research scientists around the world to produce in-house printed microarrays in their own labs. These arrays may be easily customized for each experiment, because researchers can choose the probes and printing locations on the arrays, synthesize the probes in their own lab (or a collaborating facility), and spot the arrays. They can then generate their own labeled samples for hybridization, hybridize the samples to the array, and finally scan the arrays with their own equipment. This provides a relatively low-cost microarray that may be customized for each study, and avoids the cost of purchasing often more expensive commercial arrays that may represent vast numbers of genes that are not of interest to the investigator. Publications exist which indicate that in-house spotted microarrays may not provide the same level of sensitivity as commercial oligonucleotide arrays, possibly owing to the small batch sizes and reduced printing efficiencies when compared to industrial manufacturers of oligo arrays.

In oligonucleotide microarrays, the probes are short sequences designed to match parts of the sequence of known or predicted open reading frames. Although oligonucleotide probes are often used in spotted microarrays, the term oligonucleotide array most often refers to a specific technique of manufacturing. Oligonucleotide arrays are produced by printing short oligonucleotide sequences designed to represent a single gene or family of gene splice-variants, by synthesizing this sequence directly onto the array surface instead of depositing intact sequences. Sequences may be longer (60-mer probes, such as the Agilent design) or shorter (25-mer probes, produced by Affymetrix) depending on the desired purpose: longer probes are more specific to individual target genes, whereas shorter probes may be spotted at higher density across the array and are cheaper to manufacture. One technique used to produce oligonucleotide arrays is photolithographic synthesis (Agilent and Affymetrix) on a silica substrate, where light and light-sensitive masking agents are used to build a sequence one nucleotide at a time across the entire array. Each applicable probe is selectively unmasked prior to bathing the array in a solution of a single nucleotide; then a masking reaction takes place and the next set of probes is unmasked in preparation for a different nucleotide exposure. After many repetitions, the sequences of every probe become fully constructed. More recently, Maskless Array Synthesis from NimbleGen Systems has combined flexibility with large numbers of probes.
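The cycle of unmasking, nucleotide exposure and re-masking can be illustrated with a toy simulation. In each step one nucleotide is offered to the whole array, and only the probes whose next required base matches are "unmasked" and extended. The fixed A, C, G, T order of exposures and the short probe sequences are assumptions made for illustration; real arrays use probes of 25 nucleotides or more and optimised mask sets.

```python
def synthesise(probes, order="ACGT"):
    """Simulate in-situ synthesis; return (built sequences, number of steps)."""
    for probe in probes:
        if any(base not in order for base in probe):
            raise ValueError(f"unsupported base in probe {probe!r}")
    built = ["" for _ in probes]
    steps = 0
    while any(len(b) < len(p) for b, p in zip(built, probes)):
        nucleotide = order[steps % len(order)]
        for i, probe in enumerate(probes):
            # Unmask only probes whose next base is the one being offered.
            if len(built[i]) < len(probe) and probe[len(built[i])] == nucleotide:
                built[i] += nucleotide
        steps += 1
    return built, steps

built, steps = synthesise(["ACGT", "TTAG", "GACA"])
print(built, steps)
```

With this fixed exposure order, the number of steps is at most four times the probe length, however many probes are on the array, because all probes are extended in parallel.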

Two-channel vs one-channel detection

Diagram of typical dual-colour microarray experiment

Two-color microarrays, or two-channel microarrays, are typically hybridized with cDNA prepared from two samples to be compared (e.g. diseased tissue versus healthy tissue) that are labeled with two different fluorophores. Fluorescent dyes commonly used for cDNA labelling include Cy3, which has a fluorescence emission wavelength of 570 nm (corresponding to the green part of the light spectrum), and Cy5, with a fluorescence emission wavelength of 670 nm (corresponding to the red part of the light spectrum). The two Cy-labelled cDNA samples are mixed and hybridized to a single microarray, which is then scanned in a microarray scanner to visualize the fluorescence of the two fluorophores after excitation with a laser beam of a defined wavelength. The relative intensities of each fluorophore may then be used in ratio-based analysis to identify up-regulated and down-regulated genes.

Oligonucleotide microarrays often contain control probes designed to hybridize with RNA spike-ins. The degree of hybridization between the spike-ins and the control probes is used to normalize the hybridization measurements for the target probes. Although absolute levels of gene expression may be determined in the two-color array, analysis of the relative differences in expression among different spots within a sample and between samples is the preferred method of data analysis for the two-color system. Examples of providers of such microarrays include Agilent with their Dual-Mode platform, Eppendorf with their DualChip platform for fluorescence labeling, and TeleChem International with Arrayit.

In single-channel microarrays, or one-color microarrays, the arrays are designed to give estimations of the absolute levels of gene expression. Therefore, the comparison of two conditions requires two separate single-dye hybridizations. As only a single dye is used, the data collected represent absolute values of gene expression. These may be compared to other genes within a sample or to reference normalizing probes used to calibrate data across the entire array and across multiple arrays. Three popular single-channel systems are the Affymetrix Gene Chip, the Applied Microarrays CodeLink arrays, and the Eppendorf DualChip & Silverquant. One strength of the single-dye system lies in the fact that an aberrant sample cannot affect the raw data derived from other samples, because each array chip is exposed to only one sample (as opposed to a two-color system, in which a single low-quality sample may drastically impinge on overall data precision even if the other sample was of high quality). Another benefit is that data are more easily compared to arrays from different experiments; the absolute values of gene expression may be compared between studies conducted months or years apart. A drawback of the one-color system is that, compared to the two-color system, twice as many microarrays are needed to compare samples within an experiment.

Expression profile clustering

Next, we can try to group genes that share the same expression profile across several experiments. This clustering can be done gradually, as in phylogenetic analysis, which consists of calculating similarity criteria between expression profiles and grouping the most similar ones. We can also use more complex techniques such as principal component analysis or neural networks.

Finally, hierarchical clustering is usually displayed as a matrix in which each column represents one experiment and each row a gene. Ratios are displayed using a colour scale going from green (repressed genes) to red (induced genes).
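A minimal sketch of the gradual (agglomerative) clustering described above follows. The similarity criterion is taken to be the Pearson correlation between expression profiles, and clusters are merged with average linkage until the best remaining similarity falls below a threshold. The choice of Pearson correlation, average linkage, the threshold of 0.8, and the gene names and values are all assumptions made for illustration. Profiles that are constant across experiments would need special handling, since their correlation is undefined.

```python
from itertools import combinations
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def cluster(profiles, threshold=0.8):
    """Merge the most similar clusters until none is similar enough."""
    clusters = [[name] for name in profiles]

    def similarity(c1, c2):
        # Average linkage: mean correlation over all cross-cluster gene pairs.
        return mean(pearson(profiles[a], profiles[b]) for a in c1 for b in c2)

    while len(clusters) > 1:
        c1, c2 = max(combinations(clusters, 2), key=lambda pair: similarity(*pair))
        if similarity(c1, c2) < threshold:
            break
        clusters.remove(c1)
        clusters.remove(c2)
        clusters.append(c1 + c2)
    return clusters

profiles = {  # log ratios of four hypothetical genes over four experiments
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [1.1, 2.1, 2.9, 4.2],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [3.8, 3.1, 2.2, 0.9],
}
print(cluster(profiles))
```

Here geneA and geneB rise together across the experiments while geneC and geneD fall together, so the sketch ends with two clusters.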

Uses and types

Arrays of DNA can be spatially arranged, as in the commonly known gene chip (also called genome chip, DNA chip or gene array), or can be specific DNA sequences labelled such that they can be independently identified in solution. The traditional solid-phase array is a collection of microscopic DNA spots attached to a solid surface, such as a glass, plastic or silicon biochip. The affixed DNA segments are known as probes (although some sources use different terms, such as reporters). Thousands of them can be placed in known locations on a single DNA microarray.

DNA microarrays can be used to detect DNA (as in comparative genomic hybridization) or to detect RNA (most commonly as cDNA after reverse transcription) that may or may not be translated into proteins. The process of measuring gene expression via cDNA is called expression analysis or expression profiling.

Since an array can contain tens of thousands of probes, a microarray experiment can accomplish that many genetic tests in parallel. Therefore, arrays have dramatically accelerated many types of investigation.

Applications include the following, each given as a technology or application followed by a synopsis:

Gene expression profiling

In an mRNA or gene expression profiling experiment, the expression levels of thousands of genes are simultaneously monitored to study the effects of certain treatments, diseases, and developmental stages on gene expression. For example, microarray-based gene expression profiling can be used to identify genes whose expression is changed in response to pathogens or other organisms by comparing gene expression in infected cells or tissues to that in uninfected ones.

Comparative genomic hybridization

Assessing genome content in different cells or closely related organisms.

Chromatin immunoprecipitation on Chip

DNA sequences bound to a particular protein can be isolated by immunoprecipitating that protein (ChIP); these fragments can then be hybridized to a microarray (such as a tiling array), allowing the determination of protein binding site occupancy throughout the genome. Example proteins to immunoprecipitate are histone modifications (H3K27me3, H3K4me2, H3K9me3, etc.), Polycomb-group proteins (PRC2:Suz12, PRC1:YY1) and trithorax-group proteins (Ash1) to study the epigenetic landscape, or RNA polymerase II to study the transcription landscape.

SNP detection

Identifying single nucleotide polymorphisms among alleles within or between populations. Several applications of microarrays make use of SNP detection, including genotyping, forensic analysis, measuring predisposition to disease, identifying drug candidates, evaluating germline mutations in individuals or somatic mutations in cancers, assessing loss of heterozygosity, and genetic linkage analysis.

Alternative splicing detection

An exon junction array design uses probes specific to the expected or potential splice sites of predicted exons for a gene. Its density, or coverage, is intermediate between that of a typical gene expression array (with 1-3 probes per gene) and that of a genomic tiling array (with hundreds or thousands of probes per gene). It is used to assay the expression of alternative splice forms of a gene. Exon arrays have a different design, employing probes designed to detect each individual exon for known or predicted genes, and can be used for detecting different splicing isoforms.

Tiling array

Genome tiling arrays consist of overlapping probes designed to densely represent a genomic region of interest, sometimes as large as an entire human chromosome. The purpose is to empirically detect expression of transcripts or alternatively spliced forms which may not have been previously known or predicted.
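Overlapping tiling probes can be generated with a sliding window, as in the sketch below. The probe length, the step size and the example sequence are invented; whenever the step is smaller than the probe length, consecutive probes overlap.

```python
def tiling_probes(region: str, probe_len: int, step: int):
    """Yield (start, probe) pairs covering region with a sliding window."""
    if not 0 < step <= probe_len:
        raise ValueError("step must be between 1 and probe_len for gap-free coverage")
    for start in range(0, len(region) - probe_len + 1, step):
        yield start, region[start:start + probe_len]

region = "ATGGCGTACGTTAGCCTAGG"  # a 20-nt toy 'genomic region'
for start, probe in tiling_probes(region, probe_len=8, step=4):
    print(start, probe)
```

With a probe length of 8 and a step of 4, each probe shares half of its sequence with the next one.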

PRINCIPLE

The microarray technology consists of spotting PCR products or long oligonucleotides (50mer-70mer) on glass slides at densities of up to 6000 spots cm2 These slides are hybridised using fluorescent targets (cDNAs or genomic DNAs) The fluorescent molecules most commonly used are members of the cyanine family Cy3 et Cy5 After hybridisation the signals are detected using a fluorescence scanner The use of two different fluorochromes allows the determination of hybridisation signals from two distinct strains in one single experiment One the fluorescent intensities have been obtained the major part of the work is the analysis of the data in order to extract the biological information

This analysis can be divided into five steps Target preparation Hybridization Slide scanning Data analysis Expression profile clustering

MATERIALS

DNA sourcesAbout 5200 human cDNA clones of the IMAGE library were obtained from the RZPD Resource Centre (Berlin Germany) Some 21 000 random shotgun clones representing the genome of Trypanosoma brucei were provided by Najib El-Sayed of the Institute for Genomic Research (TIGR Rockville USA) Nearly 4550 shotgun clones covering the entire genome of Pseudomonas putida as a minimal tiling path were obtained from Helmut Hilbert of Qiagen (Hilden Germany) PCR products for some 21 000 predicted open reading frames (ORFs) of Drosophila melanogaster were produced directly from genomic DNA The template for some 7300 ORF-specific PCR products of Candida albicans was strain SC5314 (Can14)

PCR amplificationPCR amplifications were performed in 384- or 96-well microtitre plates For PCR on the cDNA and shotgun clones 02 microM of the respective vector-specific primer pairs d(TCA CACAGGAAACAGCTATGAC) and d(GTAAAACGACGGCCAGTG) (human clones) d(TTGTAAAACGACGGCCAGTG) and d(GCGGATAACAATTTCACACAGGA) (Tbrucei) or d(TCGGATCCACTAGTAACG) and d(GGCCGCCAGTGTGATG) (Pputida) (all from Interactiva Ulm Germany) were used The reactions were started by inoculating 25 or 100 microl of PCR mix usually in 10 mM TrisndashHCl pH 83 225 mM MgCl2 50 mM KCl 02 mM each dATP dTTP dGTP and dCTP 15 M betaine 01 mM cresol red and 2 U Taq polymerase with a few Escherichia coli cells transferred from a growth culture using a plastic 384- or 96-pin gadget (Genetix New Milton UK) The plates were incubated for 3 min at 94degC before 35 cycles of denaturation at 94degC for 30 s annealing at 51degC for 30 s and elongation at 72degC for 90 s were performed followed by a final elongation phase at 72degC for 10 min In some cases the PCR was performed without betaine The Drosophila ORFs were initially amplified on 100 ng genomic DNA with some 43 000 gene-specific primers all of which contained one of several common tag sequences of 15 nt length at their 5-ends Subsequent re-amplification was carried out using the fitting primer pair PCR products of Calbicans ORFs were produced on 20 ng genomic DNA with 7300 specific primer pairs

Microarray production process

DNA fragments amplified by PCR are spotted onto a glass microscope slide that has been coated with polylysine beforehand. The polylysine coating fixes the DNA to the slide through electrostatic interactions. In our case the PCR fragments are the expressed parts (ORFs) of the 6200 genes of Saccharomyces cerevisiae (baker's yeast). Slide preparation is completed by blocking the polylysine that is not bound to DNA, in order to prevent the target from binding to the slide surface. Prior to hybridisation the DNA is denatured to obtain single-stranded DNA on the microarray; this allows the probe to bind to the complementary strand from the target.

Target preparation

RNA is extracted from the two yeast cultures whose expression levels we want to compare. The messenger RNA is then converted into cDNA by reverse transcription. At this stage the cDNA from the first culture is labelled with a green dye, whereas the cDNA from the second culture is labelled with a red dye.

The available target-preparation methods can be divided into two groups: first-strand cDNA that is labeled or tagged with a capture sequence, or the generation of antisense RNA (aRNA) from double-stranded cDNA during an in vitro transcription (IVT) reaction. Labeled cDNA can be prepared via direct incorporation of a fluorophore-labeled nucleotide, or through incorporation of an aminoallyl-labeled nucleotide followed by coupling of a fluorophore containing an amine-reactive group to the aminoallyl nucleotide (Schena et al. 1995; for review see Lockhart and Winzeler 2000). Alternatively, the first-strand cDNA can be tagged with a capture sequence that is used for subsequent detection steps (Stears et al. 2000). DNA microarrays containing short oligonucleotide probes (<35 nucleotides long) require more target for each hybridization, which calls for an amplification method when sample sizes are small. Typically, the generation of aRNA (also commonly called complementary RNA, or cRNA) is preceded by first-strand synthesis of cDNA using an oligonucleotide primer containing a bacteriophage T7 RNA polymerase promoter proximal to an oligo(dT) sequence (van Gelder et al. 1990; Eberwine et al. 1992; Lockhart et al. 1996). After second-strand cDNA synthesis and cDNA purification, an IVT reaction is performed using T7 RNA polymerase in the presence of labeled nucleotides. Alternatives to this labeling strategy produce unlabeled aRNA, followed by cDNA synthesis in the presence of a fluorophore-labeled nucleotide (Wang et al. 2000). Any target preparation method requires linear amplification of the available transcripts if the target is to be representative of the transcript population.

Hybridisation

Green-labelled and red-labelled cDNAs are mixed together (the target) and placed on the matrix of spotted single-stranded DNA (the probe). The chip is then incubated overnight at 60 °C. At this temperature, a DNA strand that encounters its complementary strand pairs with it to form double-stranded DNA. The fluorescent cDNA thus hybridises to the spotted DNA.

The discrepancies in microarray results are a consequence of differences in microarray measures such as accuracy [i.e. 'the degree of conformity of the measured quantity to its actual (true) value'], sensitivity [i.e. 'the concentration range of target molecules in which accurate measurements can be made'], reproducibility [i.e. 'the degree to which repeated measurements of the same quantity will show the same or similar results'] and specificity [i.e. 'the ability of a probe to provide a signal that is influenced only by the presence of the target molecule'].

Accuracy, sensitivity and reproducibility may be affected by several effectors. These measures and their effectors are discussed by Dufva and by Draghici et al. and will not be detailed here. An example of an effector of sensitivity, reproducibility and accuracy is the type of microarray platform: oligonucleotide arrays have been found to be more reproducible and sensitive than cDNA arrays, and some oligonucleotide arrays have been found to be more accurate than others. Sensitivity is also affected by probe density (i.e. the number of different probes that are fabricated in a given area), which has been shown to be an effector of the availability of probes for hybridization; this availability may also be affected by the steric restrictions imposed by the solid microarray surface. A higher availability of probes for hybridization has been demonstrated to increase sensitivity. In addition, sensitivity is affected by the hybridization signal-to-noise ratio (i.e. the ratio between the spot signal and that of the background): a low background increases microarray hybridization sensitivity.
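The signal-to-noise ratio mentioned above can be illustrated with a short sketch. The text only defines it as the ratio between the spot signal and the background; the use of the median as the summary of each pixel set, the function name and the example pixel values are all assumptions made for the illustration.

```python
from statistics import median


def signal_to_noise(spot_pixels, background_pixels):
    """Ratio of the median spot intensity to the median local background.

    Using the median is an assumption made here for robustness against
    outlier pixels; the text only speaks of 'spot signal' and 'background'.
    """
    background = median(background_pixels)
    if background <= 0:
        raise ValueError("background must be positive to form a ratio")
    return median(spot_pixels) / background


# Hypothetical pixel intensities for one spot and its surrounding background.
spot = [1200, 1350, 1280, 1310]
low_background = [100, 120, 110, 90]
high_background = [400, 480, 440, 360]

print(signal_to_noise(spot, low_background))
print(signal_to_noise(spot, high_background))
```

With the same spot signal, the lower background gives the higher ratio, which is the sense in which a low background increases sensitivity.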

Low specificity of microarray hybridizations has been suggested to be one of the prime measures affecting discrepancies in gene-expression profiles between different probes targeting the same region of a given transcript, or between different microarray platforms. In the present review we will highlight the issue of microarray-hybridization specificity as a key measure that, once improved, may increase the validity of microarray results.

Microarrays consist of multiple probes. Hence, a prime key for specificity during microarray hybridization, for either short-oligomer or cDNA microarrays, is the ability of the probe to discriminate between different target molecules.

Probes are designed to be complementary to the target molecule according to the Watson-Crick rules of binding. Therefore, a probe with high specificity to its target molecule should provide a signal influenced only by the presence of the target molecule. Nevertheless, a perfect match in terms of sequence-similarity-based complementarity between a probe and its target molecule does not guarantee specificity. This is due to the presence of thousands of target molecules during microarray hybridization (each target molecule being composed of tens, hundreds or thousands of the four nucleotide bases) and to the effect of different effectors (discussed subsequently) of hybridization specificity, which may alter the ability of a probe to bind to a target molecule. Hence, there is often some degree of microarray-probe hybridization to a target molecule which is not strictly complementary to it or, vice versa, a variable number of target molecules that are hybridized to a microarray probe which is not exactly complementary to them.
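The difference between a strictly complementary target and a nearly complementary one can be made concrete with a small sketch. It only counts base mismatches under Watson-Crick pairing; it does not model hybridization thermodynamics, and the function names and example sequences are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Sequence that pairs with `seq` under Watson-Crick rules, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def best_site_mismatches(probe, target):
    """Fewest mismatches between the probe's binding site and any window of the target.

    0 means the target contains a strictly complementary site; small non-zero
    values indicate sites where cross-hybridization could plausibly occur.
    """
    site = reverse_complement(probe)
    n = len(site)
    if len(target) < n:
        raise ValueError("target is shorter than the probe")
    return min(
        sum(a != b for a, b in zip(site, target[i:i + n]))
        for i in range(len(target) - n + 1)
    )


probe = "AAGCTTGC"                    # hypothetical probe
intended_target = "TTGCAAGCTTAA"      # contains the exact complementary site
related_target = "TTGCAAGGTTAA"       # same site with a single base change

print(best_site_mismatches(probe, intended_target))
print(best_site_mismatches(probe, related_target))
```

A related target that differs at a single position is not strictly complementary, yet it may still bind the probe to some degree, which is the situation described above.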

FOUR LEVELS OF HYBRIDIZATION SPECIFICITY

We define four levels of hybridization specificity in the context of microarray hybridization. The first is that of hybridization between a single probe molecule and a single target molecule. The two molecules may exhibit perfect hybridization, partial hybridization (i.e. the target molecule is only partially hybridized to the probe) or no hybridization.

The second level of specificity is that of a spot. At this level, multiple probe molecules that compose one spot are hybridized to multiple target molecules. The spot probes may exhibit perfect, partial or no hybridization with the target molecules. Notably, at this level partial hybridization may take one or both of two forms: only some of the probes may be hybridized to the target molecule, or probes may be hybridized to only some of the target molecules. This partial hybridization at the spot level may be a result of cross-hybridization (i.e. hybridization between sequences that are not strictly complementary), due to the presence and hybridization of nontarget molecules with sequences similar to that of the spot probes. Since a spot is composed of multiple probes, a single spot may simultaneously bear all combinations of one to four of the presented probe-target molecule types of binding.

The third level of specificity is that of a spot-set (or, in Affymetrix terminology, 'probe-set'), in which multiple spots represent different segments of the same reference sequence (e.g. different exons of a gene). At this level, different spots of a spot-set may exhibit: perfect hybridization with the target molecule; partial hybridization with the target molecule, due to the presence of probes with mismatches to the target molecule as a result of, for example, an annotation error in the gene sequence or intended mismatches introduced to quantify nonspecific hybridization; no hybridization, due to, for example, alternative splicing of a transcript leading to probes with no match to the target molecule; or cross-hybridization, due to, for example, a spot within a spot-set that represents an evolutionarily conserved gene segment, which hybridizes with nontarget molecules derived from various gene-family members.

The fourth level of specificity is that of the microarray, in which a variable number of spot-sets may exhibit different forms of hybridization with target sequences: perfect hybridization (i.e. all target molecules are hybridized to their representative spot-sets and all spot-sets are hybridized to the target molecules they represent), partial hybridization in either direction, no hybridization (i.e. target molecules are not hybridized to any spot-set, or spot-sets do not match any target molecules) or cross-hybridization (e.g. target molecules of different genes hybridize to the same spot-set, or target molecules of a particular gene hybridize to several different genes' spot-sets). These different forms may exist for a large number of different target molecules or spot-sets.

Slide scanning

A laser excites each spot and the fluorescent emission is collected by a photomultiplier tube (PMT) coupled to a confocal microscope. We obtain two images in which grey levels represent the fluorescence intensities read. If we replace the grey scale by a green scale for the first image and by a red scale for the second, superimposing the two images gives a single image composed of spots ranging from green (where only DNA from the first condition is bound) to red (where only DNA from the second condition is bound), passing through yellow (where DNA from the two conditions is bound in equal amounts).
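The superimposition of the two scans can be sketched as follows. Each grey-scale image is represented as a list of rows of intensities, and the result is a grid of (red, green, blue) tuples. The 16-bit intensity range and the function name are assumptions made for the illustration; a real scanner file would normally be read with an image library.

```python
def overlay(green_image, red_image, max_value=65535):
    """Merge two grey-scale scans into one false-colour RGB image.

    The first scan feeds the green channel and the second the red channel,
    so equal intensities in both scans appear yellow. `max_value` assumes
    16-bit scanner output; values are rescaled to 0..255 for display.
    """
    def to_8bit(value):
        clipped = min(max(value, 0), max_value)
        return round(255 * clipped / max_value)

    return [
        [(to_8bit(r), to_8bit(g), 0) for g, r in zip(green_row, red_row)]
        for green_row, red_row in zip(green_image, red_image)
    ]


# Three hypothetical spots: first condition only, second only, both equally.
green_scan = [[65535, 0, 65535]]
red_scan = [[0, 65535, 65535]]
print(overlay(green_scan, red_scan))
```

The three example spots come out as pure green, pure red and yellow, matching the colour interpretation given above.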

Data analysis

We now have two microarray images from which we have to estimate the amount of DNA in each experimental condition. To do so, we measure the signal at the emission wavelength of the green dye and the signal at the emission wavelength of the red dye. We then normalise these amounts according to various parameters (amount of yeast in each culture condition, emission power of each dye, etc.). We assume that the amount of fluorescent DNA bound is proportional to the amount of mRNA present in the cells at the start, and we calculate the red/green fluorescence ratio. If this ratio is greater than 1 (red on the image), gene expression is greater in the second experimental condition; if this ratio is smaller than 1 (green on the image), gene expression is greater in the first condition. We can visualise these differences in expression using software such as ArrayPlot, which was developed in the laboratory (cf. image below). From the list of spot intensities, this software displays the red intensity of each spot as a function of its green intensity.
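The ratio calculation can be sketched in a few lines. The normalisation used here (shifting the log2 ratios so that their median is zero, which assumes that most genes do not change between the two conditions) is one common choice and is an assumption of this sketch, not necessarily the procedure used in the laboratory; the function name and intensities are hypothetical.

```python
import math
from statistics import median


def normalised_log_ratios(red, green):
    """log2(red/green) per spot, centred so that the median spot has ratio 1.

    Centring on the median is an assumed normalisation that compensates for
    global differences in dye emission or in the amount of starting material.
    """
    raw = [math.log2(r / g) for r, g in zip(red, green)]
    shift = median(raw)
    return [value - shift for value in raw]


# Hypothetical background-corrected spot intensities.
red_intensities = [200, 400, 800, 100]
green_intensities = [100, 200, 100, 200]

for log_ratio in normalised_log_ratios(red_intensities, green_intensities):
    if log_ratio > 0:
        call = "higher in second condition (red)"
    elif log_ratio < 0:
        call = "higher in first condition (green)"
    else:
        call = "unchanged"
    print(f"{log_ratio:+.1f}  {call}")
```

Working with log2 ratios makes induction and repression symmetric: a ratio of 2 becomes +1 and a ratio of 1/2 becomes -1.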

Fabrication

Microarrays can be manufactured in different ways, depending on the number of probes under examination, costs, customization requirements and the type of scientific question being asked. Arrays from commercial vendors may have as few as 10 probes or up to 2.1 million micrometre-scale probes.

Surface engineering

The first step of DNA microarray fabrication involves surface engineering of a substrate in order to obtain desirable surface properties for the application of interest. Optimal surface properties are those which produce high signal-to-noise ratios for the DNA targets of interest. Generally, this involves maximizing the probe surface density and activity while minimizing the non-specific binding of the targets of interest. Methods of surface engineering vary depending on the platform material, design and application.

Spotted vs oligonucleotide arrays

Microarrays can be fabricated using a variety of technologies, including printing with fine-pointed pins onto glass slides, photolithography using pre-made masks, photolithography using dynamic micromirror devices, ink-jet printing, or electrochemistry on microelectrode arrays.

In spotted microarrays, the probes are oligonucleotides, cDNA or small fragments of PCR products that correspond to mRNAs. The probes are synthesized prior to deposition on the array surface and are then spotted onto glass. A common approach utilizes an array of fine pins or needles, controlled by a robotic arm, that is dipped into wells containing DNA probes and then deposits each probe at a designated location on the array surface. The resulting grid of probes represents the nucleic acid profiles of the prepared probes and is ready to receive complementary cDNA or cRNA targets derived from experimental or clinical samples. This technique is used by research scientists around the world to produce in-house printed microarrays in their own labs. These arrays may be easily customized for each experiment, because researchers can choose the probes and printing locations on the arrays, synthesize the probes in their own lab (or a collaborating facility) and spot the arrays. They can then generate their own labeled samples for hybridization, hybridize the samples to the array and finally scan the arrays with their own equipment. This provides a relatively low-cost microarray that may be customized for each study, and avoids the cost of purchasing often more expensive commercial arrays that may represent vast numbers of genes that are not of interest to the investigator. Some publications indicate that in-house spotted microarrays may not provide the same level of sensitivity as commercial oligonucleotide arrays, possibly owing to the small batch sizes and reduced printing efficiencies compared with industrial manufacturers of oligo arrays.

In oligonucleotide microarrays, the probes are short sequences designed to match parts of the sequence of known or predicted open reading frames. Although oligonucleotide probes are often used in spotted microarrays, the term "oligonucleotide array" most often refers to a specific manufacturing technique. Oligonucleotide arrays are produced by printing short oligonucleotide sequences designed to represent a single gene or family of gene splice-variants, synthesizing each sequence directly onto the array surface instead of depositing intact sequences. Sequences may be longer (60-mer probes, such as the Agilent design) or shorter (25-mer probes, produced by Affymetrix) depending on the desired purpose: longer probes are more specific to individual target genes, while shorter probes may be spotted at higher density across the array and are cheaper to manufacture. One technique used to produce oligonucleotide arrays is photolithographic synthesis (Agilent and Affymetrix) on a silica substrate, where light and light-sensitive masking agents are used to build a sequence one nucleotide at a time across the entire array. Each applicable probe is selectively unmasked prior to bathing the array in a solution of a single nucleotide; a masking reaction then takes place, and the next set of probes is unmasked in preparation for exposure to a different nucleotide. After many repetitions, the sequence of every probe is fully constructed. More recently, Maskless Array Synthesis from NimbleGen Systems has combined flexibility with large numbers of probes.
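The unmask-and-expose cycle described above can be simulated with a toy model. This is a deliberately simplified sketch: it assumes the four nucleotides are offered in a fixed repeating order, ignores the chemistry and the direction of synthesis, and uses hypothetical probe sequences and function names.

```python
def synthesise_in_parallel(probes, exposure_order="ACGT"):
    """Grow every probe on the array one nucleotide at a time.

    In each step the array is exposed to a single nucleotide; only the probes
    whose next required base is that nucleotide are 'unmasked' and extended.
    Returns the built sequences and the number of exposure steps used.
    """
    if any(set(p) - set(exposure_order) for p in probes):
        raise ValueError("probes may only contain bases from exposure_order")

    built = ["" for _ in probes]
    steps = 0
    while any(len(b) < len(p) for b, p in zip(built, probes)):
        base = exposure_order[steps % len(exposure_order)]
        for i, probe in enumerate(probes):
            position = len(built[i])
            if position < len(probe) and probe[position] == base:
                built[i] += base      # this feature was unmasked for this step
        steps += 1
    return built, steps


designed = ["ACG", "CAT", "GGA"]      # hypothetical 3-mer probes
built, steps = synthesise_in_parallel(designed)
print(built, steps)
```

Because all features are processed together, the number of exposure steps grows with probe length (at most four steps per position in this model) rather than with the number of probes on the array.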

Two-channel vs one-channel detection

Diagram of typical dual-colour microarray experiment

Two-color microarrays, or two-channel microarrays, are typically hybridized with cDNA prepared from two samples to be compared (e.g. diseased tissue versus healthy tissue) that are labeled with two different fluorophores. Fluorescent dyes commonly used for cDNA labelling include Cy3, which has a fluorescence emission wavelength of 570 nm (corresponding to the green part of the light spectrum), and Cy5, with a fluorescence emission wavelength of 670 nm (corresponding to the red part of the light spectrum). The two Cy-labelled cDNA samples are mixed and hybridized to a single microarray, which is then scanned in a microarray scanner to visualize the fluorescence of the two fluorophores after excitation with a laser beam of a defined wavelength. The relative intensities of each fluorophore may then be used in ratio-based analysis to identify up-regulated and down-regulated genes.

Oligonucleotide microarrays often contain control probes designed to hybridize with RNA spike-ins. The degree of hybridization between the spike-ins and the control probes is used to normalize the hybridization measurements for the target probes. Although absolute levels of gene expression may be determined with a two-color array, analysis of the relative differences in expression among different spots within a sample and between samples is the preferred method for the two-color system. Examples of providers of such microarrays include Agilent with their Dual-Mode platform, Eppendorf with their DualChip platform for fluorescence labeling, and TeleChem International with Arrayit.
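One simple way to use spike-in controls for normalization is sketched below. The scheme (a single scale factor equal to the mean of expected over observed control intensities) is an assumption chosen for illustration; the text does not specify how the control measurements are applied, and the probe names and values are hypothetical.

```python
from statistics import mean


def spike_in_factor(observed, expected):
    """Scale factor that brings the observed spike-in signals to their expected levels."""
    ratios = [
        expected[name] / observed[name]
        for name in expected
        if observed.get(name, 0) > 0
    ]
    if not ratios:
        raise ValueError("no usable spike-in control probes")
    return mean(ratios)


def normalise(intensities, factor):
    """Apply the spike-in derived factor to every probe on the array."""
    return {probe: value * factor for probe, value in intensities.items()}


# Hypothetical array on which the controls read at half their expected signal.
expected_controls = {"spike1": 1000.0, "spike2": 4000.0}
array = {"spike1": 500.0, "spike2": 2000.0, "geneA": 150.0}

factor = spike_in_factor(array, expected_controls)
print(factor, normalise(array, factor)["geneA"])
```

More elaborate schemes fit a curve through several spike-ins at different concentrations instead of using a single factor.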

In single-channel microarrays, or one-color microarrays, the arrays are designed to give estimates of the absolute levels of gene expression. Therefore, the comparison of two conditions requires two separate single-dye hybridizations. As only a single dye is used, the data collected represent absolute values of gene expression. These may be compared to other genes within a sample, or to reference normalizing probes used to calibrate data across the entire array and across multiple arrays. Three popular single-channel systems are the Affymetrix Gene Chip, the Applied Microarrays CodeLink arrays and the Eppendorf DualChip & Silverquant. One strength of the single-dye system lies in the fact that an aberrant sample cannot affect the raw data derived from other samples, because each array chip is exposed to only one sample (as opposed to a two-color system, in which a single low-quality sample may drastically impinge on overall data precision even if the other sample was of high quality). Another benefit is that data are more easily compared to arrays from different experiments: the absolute values of gene expression may be compared between studies conducted months or years apart. A drawback of the one-color system is that, compared with the two-color system, twice as many microarrays are needed to compare samples within an experiment.

Expression profile clustering

We can then try to group genes that share the same expression profile across several experiments. This clustering can be done step by step, as in phylogenetic analysis, by calculating a similarity criterion between expression profiles and grouping the most similar ones. We can also use more complex techniques such as principal component analysis or neural networks.

In the end, hierarchical clustering is usually displayed as a matrix in which each column represents one experiment and each row a gene. Ratios are displayed using a colour scale going from green (repressed genes) to red (induced genes).
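The stepwise grouping described above can be sketched as a small agglomerative clustering routine. The similarity criterion (Pearson correlation), the average-linkage rule and the example profiles are assumptions made for the illustration; in practice a dedicated statistics package would normally be used.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two expression profiles of equal length."""
    mx, my = mean(x), mean(y)
    numerator = sum((a - mx) * (b - my) for a, b in zip(x, y))
    denominator = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    if denominator == 0:
        raise ValueError("a profile with no variation cannot be correlated")
    return numerator / denominator


def hierarchical_clusters(profiles, n_clusters):
    """Repeatedly merge the two most similar clusters until n_clusters remain."""
    clusters = [[gene] for gene in profiles]

    def distance(c1, c2):
        # Average linkage on the correlation distance 1 - r (an assumed choice).
        return mean([1 - pearson(profiles[a], profiles[b]) for a in c1 for b in c2])

    while len(clusters) > n_clusters:
        pairs = [(i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda p: distance(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical log ratios of four genes across three experiments.
profiles = {
    "g1": [1.0, 2.0, 3.0],
    "g2": [2.0, 4.0, 6.5],
    "g3": [3.0, 2.0, 1.0],
    "g4": [-1.0, -2.0, -3.2],
}
print(hierarchical_clusters(profiles, 2))
```

Recording the order of the merges, rather than stopping at a fixed number of clusters, gives the tree that is usually drawn next to the coloured matrix.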

Uses and types

Arrays of DNA can be spatially arranged, as in the commonly known gene chip (also called genome chip, DNA chip or gene array), or can be specific DNA sequences labelled such that they can be independently identified in solution. The traditional solid-phase array is a collection of microscopic DNA spots attached to a solid surface, such as a glass, plastic or silicon biochip. The affixed DNA segments are known as probes (although some sources use different terms, such as reporters). Thousands of them can be placed in known locations on a single DNA microarray.

DNA microarrays can be used to detect DNA (as in comparative genomic hybridization) or to detect RNA (most commonly as cDNA after reverse transcription) that may or may not be translated into proteins. The process of measuring gene expression via cDNA is called expression analysis or expression profiling.

Since an array can contain tens of thousands of probes, a microarray experiment can accomplish that many genetic tests in parallel. Therefore, arrays have dramatically accelerated many types of investigation.

Applications include the following (each technology or application is followed by a synopsis):

Gene expression profiling

In an mRNA or gene expression profiling experiment, the expression levels of thousands of genes are simultaneously monitored to study the effects of certain treatments, diseases and developmental stages on gene expression. For example, microarray-based gene expression profiling can be used to identify genes whose expression is changed in response to pathogens or other organisms, by comparing gene expression in infected cells or tissues to that in uninfected ones.

Comparative genomic hybridization

Assessing genome content in different cells or closely related organisms

Chromatin immunoprecipitation on Chip

DNA sequences bound to a particular protein can be isolated by immunoprecipitating that protein (ChIP); these fragments can then be hybridized to a microarray (such as a tiling array), allowing the determination of protein binding site occupancy throughout the genome. Example targets for immunoprecipitation are histone modifications (H3K27me3, H3K4me2, H3K9me3, etc.), Polycomb-group proteins (PRC2:Suz12, PRC1:YY1) and trithorax-group proteins (Ash1) to study the epigenetic landscape, or RNA polymerase II to study the transcription landscape.

SNP detection

Identifying single nucleotide polymorphisms among alleles within or between populations. Several applications of microarrays make use of SNP detection, including genotyping, forensic analysis, measuring predisposition to disease, identifying drug candidates, evaluating germline mutations in individuals or somatic mutations in cancers, assessing loss of heterozygosity, and genetic linkage analysis.

Alternative splicing detection

An exon junction array design uses probes specific to the expected or potential splice sites of predicted exons for a gene. It is of intermediate density, or coverage, between a typical gene expression array (with 1-3 probes per gene) and a genomic tiling array (with hundreds or thousands of probes per gene). It is used to assay the expression of alternative splice forms of a gene. Exon arrays have a different design, employing probes designed to detect each individual exon for known or predicted genes, and can be used for detecting different splicing isoforms.
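The idea of a junction-specific probe can be sketched as follows: the probe is built from the end of one exon and the start of a downstream exon, so it matches fully only when the two exons are spliced together. The function name, the equal flank lengths and the toy exon sequences are assumptions made for the illustration.

```python
from itertools import combinations


def junction_probes(exons, flank):
    """Candidate probes spanning every possible exon-exon junction.

    `exons` are given in genomic order; a junction (i, j) with j > i + 1
    corresponds to a transcript that skips the exons in between.
    """
    probes = {}
    for i, j in combinations(range(len(exons)), 2):
        upstream, downstream = exons[i], exons[j]
        if len(upstream) < flank or len(downstream) < flank:
            continue                  # exon too short to supply a full flank
        probes[(i, j)] = upstream[-flank:] + downstream[:flank]
    return probes


exons = ["AAACCC", "GGGTTT", "CCCAAA"]    # hypothetical exon sequences
for junction, sequence in junction_probes(exons, flank=3).items():
    print(junction, sequence)
```

The probe for junction (0, 2) would only find its full match in a transcript in which the middle exon has been skipped.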

Tiling array

Genome tiling arrays consist of overlapping probes designed to densely represent a genomic region of interest, sometimes as large as an entire human chromosome. The purpose is to empirically detect expression of transcripts or alternatively spliced forms which may not have been previously known or predicted.
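How overlapping probes cover a region can be illustrated with a short sketch. The probe length, step size, function name and sequence are hypothetical; real tiling designs also filter probes for repeats and melting temperature, which is not modelled here.

```python
def tile_region(region, probe_length, step):
    """Return (start, probe) pairs covering `region` at a fixed step.

    Consecutive probes overlap whenever step < probe_length.
    """
    if probe_length <= 0 or step <= 0:
        raise ValueError("probe_length and step must be positive")
    return [
        (start, region[start:start + probe_length])
        for start in range(0, len(region) - probe_length + 1, step)
    ]


region = "ACGTACGTAC"                 # hypothetical 10-base region
for start, probe in tile_region(region, probe_length=4, step=2):
    print(start, probe)
```

With a probe length of 4 and a step of 2, each probe shares half of its bases with the next one, so every position in the interior of the region is interrogated by two probes.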

MATERIALS

DNA sourcesAbout 5200 human cDNA clones of the IMAGE library were obtained from the RZPD Resource Centre (Berlin Germany) Some 21 000 random shotgun clones representing the genome of Trypanosoma brucei were provided by Najib El-Sayed of the Institute for Genomic Research (TIGR Rockville USA) Nearly 4550 shotgun clones covering the entire genome of Pseudomonas putida as a minimal tiling path were obtained from Helmut Hilbert of Qiagen (Hilden Germany) PCR products for some 21 000 predicted open reading frames (ORFs) of Drosophila melanogaster were produced directly from genomic DNA The template for some 7300 ORF-specific PCR products of Candida albicans was strain SC5314 (Can14)

PCR amplificationPCR amplifications were performed in 384- or 96-well microtitre plates For PCR on the cDNA and shotgun clones 02 microM of the respective vector-specific primer pairs d(TCA CACAGGAAACAGCTATGAC) and d(GTAAAACGACGGCCAGTG) (human clones) d(TTGTAAAACGACGGCCAGTG) and d(GCGGATAACAATTTCACACAGGA) (Tbrucei) or d(TCGGATCCACTAGTAACG) and d(GGCCGCCAGTGTGATG) (Pputida) (all from Interactiva Ulm Germany) were used The reactions were started by inoculating 25 or 100 microl of PCR mix usually in 10 mM TrisndashHCl pH 83 225 mM MgCl2 50 mM KCl 02 mM each dATP dTTP dGTP and dCTP 15 M betaine 01 mM cresol red and 2 U Taq polymerase with a few Escherichia coli cells transferred from a growth culture using a plastic 384- or 96-pin gadget (Genetix New Milton UK) The plates were incubated for 3 min at 94degC before 35 cycles of denaturation at 94degC for 30 s annealing at 51degC for 30 s and elongation at 72degC for 90 s were performed followed by a final elongation phase at 72degC for 10 min In some cases the PCR was performed without betaine The Drosophila ORFs were initially amplified on 100 ng genomic DNA with some 43 000 gene-specific primers all of which contained one of several common tag sequences of 15 nt length at their 5-ends Subsequent re-amplification was carried out using the fitting primer pair PCR products of Calbicans ORFs were produced on 20 ng genomic DNA with 7300 specific primer pairs

Microarray production process

DNA fragments amplified by PCR technique are spotted on a microscopic glass slide coated with polylysine prior to spotting process The polylysine coating goal is to ensure DNA fixation through electrostatic interactions PCR fragments are in our case the expressed part (ORF) of the 6200 Saccharomyces cerevisae genes (baker yeast) Slide preparation is achieved by blocking the polylysine not fixed to DNA in order to avoid target binding Prior to hybridisation DNA is denatured to obtained a single strand DNA

on the microarray this will allow the probe to bind to the complementary strand from the target

Target preparation

RNA are extracted from two yeast cultures from which we want to compare expression level Messengers RNA are then transformed in cDNA by reverse transcription On this stage DNA from the first culture with a green dye whereas DNA from the second culture is labelled with a red dye

The available target-preparation methods can be divided into two groups first-strand cDNA that is labeled or tagged with a capture sequence or the generation of antisense RNA (aRNA) from double-stranded cDNA during an in vitro transcription (IVT) reaction Labeled cDNA can be prepared via direct The incorporation of a fluorophore-labeled nucleotide or through incorporation of an aminoallyl-labeled nucleotide followed by coupling to a fluorophore containing an amine-reactive group to the aminoallyl nucleotide (Schena et al 1995 for review see Lockhart and Winzeler 2000) Alternatively the first-strand cDNA can be tagged with a capture sequence that is used for subsequent detection steps (Stears et al 2000) DNA microarrays containing short oligonucleotide probes (lt35 nucleotides long) require more target for each hybridization which requires an amplification method with smaller sample sizes Typically the generation of aRNA (aRNA is also commonly called complementary RNA or cRNA) is preceded by first-strand synthesis of cDNA using an oligonucleotide primer containing a bacteriophage T7 RNA polymerase promoter proximal to an oligo(dT) sequence (van Gelder et al 1990Eberwine et al 1992 Lockhart et al 1996) After second-strand cDNA synthesis and cDNA purification an IVT reaction is performed using T7 RNA polymerase in the presence of labeled nucleotides Alternatives to this labeling strategy produce unlabeled aRNA followed by a cDNA synthesis in the presence of a fluorophore-labeled nucleotide (Wang et al 2000) Any target preparation method requires a linear amplification of the available transcripts to be representative of the transcript population

Hybridisation

Green labelled cDNA and red labelled ones are mixed together (call the target) and put on the matrix of spotted single strand DNA (call the probe) The chip is then incubated one night at 60 degrees At this temperature a DNA strand that encounter the complementary strand and match together to create a double strand DNA The fluorescent DNA will then hybridise on the spotted ones

The discrepancies in microarray results are a consequence of differences in microarray measures such as accuracy [ie lsquothe degree of conformity of the measured quantity to its actual (true) valuersquo sensitivity [ie lsquothe concentration range of target molecules in which accurate measurements can be madersquo reproducibility [ie lsquothe degree to which repeated measurements of the same quantity will show the same or similar resultsrsquo and specificity [ie lsquothe ability of a probe to provide a signal that is influenced only by the presence of the target moleculersquo

Accuracy sensitivity and reproducibility may be affected by several effectors These measures and their effectors are discussed by Dufva and Draghici et al and will not be detailed here An example for an effector of sensitivity reproducibility and accuracy is the type of microarray platform oligonucleotide arrays have been found to be more reproducible and sensitive than cDNA arrays and some oligonucleotide arrays have been found to be more accurate than others Sensitivity is also affected by probe density (ie the number of different probes that are fabricated in a given area) which has been shown to be an effector for the availability of probes for hybridization this availability may also be affected by the steric restrictions imposed by the solid microarray surface A higher availability of probes for hybridization has been demonstrated to increase sensitivity In addition sensitivity is affected by the hybridization signal-to-noise ratio (ie the ratio between the spot signal and that of the background) a low background increases microarray hybridization sensitivity

Low specificity of microarray hybridizations has been suggested to be one of the prime measures affecting discrepancies in gene-expression profiles between different probes targeting the same region of a given transcript or between different microarray platforms in the present review we will highlight the issue of microarray - hybridization specificity as a key measure that once improved may increase the validity of microarray results

Microarrays consist of multiple probes Hence a prime key for specificity during microarray hybridiation for either short-oligomer or cDNA microarrays is the ability of the probe to discriminate between different target molecules

Probes are designed to be complementary to the target molecule according to the WatsonndashCrick rules of binding Therefore a probe with high specificity to its target molecule should provide a signal influenced only by the presence of the target molecule Nevertheless a perfect match in terms of sequence-similarity-based complementarity between a probe and its target molecule does not guarantee specificity This is due to the presence of thousands of target molecules during microarray hybridizationmdasheach target molecule being composed of tens of hundreds or thousands of four-nucleotide bases and to the effect of different effectors (discussed subsequently) of hybridization specificity which may alter the ability of a probe to bind to a target molecule Hence there is often some degree of microarray-probe hybridization to a target molecule which is not strictly complementary to it or vice versa a variable number of target molecules that are hybridized to a microarray probe which is not exactly complementary to them

FOUR LEVELS OF HYBRIDIZATION SPECIFICITY

We define four levels of hybridization specificity in the context of microarray hybridization The first is of hybridization between a single probe molecule and a single target molecule The two molecules may exhibit perfect hybridization partial hybridization (ie the target molecule is only partially hybridized to the probe or no hybridization

The second level of specificity is of a spot At this level multiple probe molecules that compose one spot are hybridized to multiple target molecules The spot probes may exhibit perfect partial or no hybridization with the target molecules Notably at this level partial hybridization may have one or both of two forms only some of the probes may be hybridized to the target molecule or probes may be hybridized to only some of the target molecules This partial hybridization at the spot level may be a result of cross- hybridization (ie hybridization between sequences that are not strictly complementary due to the presence and hybridization of nontarget molecules with sequences similar to that of the spot probes Since a spot is composed of multiple probes a single spot may simultaneously bear all combinations of one to four of the presented probe-target molecule types of binding

The third level of specificity is that of a spot-set (or, in Affymetrix terminology, 'probe-set'), in which multiple spots represent different segments of the same reference sequence (e.g. different exons of a gene). At this level, different spots of a spot-set may exhibit: perfect hybridization with the target molecule; partial hybridization with the target molecule due to the presence of probes with mismatches to the target molecule as a result of, for example, an annotation error in the gene sequence or intended mismatches introduced to quantify nonspecific hybridization; no hybridization due to, for example, alternative splicing of a transcript leading to probes with no match to the target molecule; or cross-hybridization due to, for example, a spot within a spot-set that represents an evolutionarily conserved gene segment which hybridizes with nontarget molecules derived from various gene-family members.

The fourth level of specificity is that of the microarray, in which a variable number of spot-sets may exhibit different forms of hybridization with target sequences: perfect hybridization (i.e. all target molecules are hybridized to their representative spot-sets, and all spot-sets are hybridized to the target molecules they represent), partial hybridization in either direction, no hybridization (i.e. target molecules are not hybridized to any spot-set, or spot-sets do not match any target molecules) or cross-hybridization (e.g. target molecules of different genes hybridize to the same spot-set, or target molecules of a particular gene hybridize to several different genes' spot-sets). These different forms may exist for a large number of different target molecules or spot-sets.

Slide scanning

A laser excites each spot, and the fluorescent emission is gathered through a photomultiplier tube (PMT) coupled to a confocal microscope. We obtain two images in which grey levels represent the fluorescence intensities read. If we replace the grey scale with a green scale for the first image and a red scale for the second, superimposing the two images gives one image composed of spots ranging from green (where only DNA from the first condition is fixed) to red (where only DNA from the second condition is fixed), passing through yellow (where DNA from the two conditions is fixed in equal amounts).
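The superimposition of the two grey-scale images can be sketched in a few lines of Python. This is only an illustration: it assumes that each image is a list of rows of integer intensities and that the scanner delivers 16-bit readings (maximum 65535). Both the data layout and the function names are assumptions made for the example.

```python
def composite_pixel(green_intensity, red_intensity, max_intensity=65535):
    """Map the two grey-level readings of one pixel to an 8-bit (R, G, B) colour."""
    red = round(min(red_intensity, max_intensity) * 255 / max_intensity)
    green = round(min(green_intensity, max_intensity) * 255 / max_intensity)
    # The blue channel stays empty, so equal red and green signals appear yellow.
    return (red, green, 0)


def superimpose(green_image, red_image, max_intensity=65535):
    """Combine two equally sized grey-scale images into one pseudo-colour image."""
    return [
        [composite_pixel(g, r, max_intensity) for g, r in zip(green_row, red_row)]
        for green_row, red_row in zip(green_image, red_image)
    ]
```

A spot that is bright only in the first image comes out pure green, one that is bright only in the second comes out pure red, and one that is equally bright in both comes out yellow.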

Data analysis

We now have two microarray images from which we have to calculate the number of DNA molecules in each experimental condition. To do so, we measure the signal in the green dye emission wavelength and the signal in the red dye emission wavelength. Then we normalise these amounts according to various parameters (yeast amount in each culture condition, emission power of each dye, …). We suppose that the amount of fluorescent DNA fixed is proportional to the mRNA amount present in each cell at the beginning, and we calculate the red/green fluorescence ratio. If this ratio is greater than 1 (red on the image), the gene expression is greater in the second experimental condition; if this ratio is smaller than 1 (green on the image), the gene expression is greater in the first condition. We can visualize these differences in expression using software such as ArrayPlot, developed in the laboratory (cf. image below). From the list of spot intensities, this software displays the red intensity of each spot as a function of its green intensity.
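As a rough illustration of the ratio calculation, the Python sketch below rescales the red channel so that both channels carry the same total signal, then computes a log2 red/green ratio per spot. The total-signal normalisation is an assumption chosen for simplicity; it stands in for the "various parameters" mentioned above and is not necessarily what the laboratory used. The two-fold threshold in `call_spot` is likewise hypothetical, and intensities are assumed to be positive, background-subtracted values.

```python
import math


def normalise_red(green, red):
    """Rescale the red channel so both channels have the same total signal."""
    factor = sum(green) / sum(red)
    return [value * factor for value in red]


def log2_ratios(green, red):
    """Per-spot log2(red/green); 0 means equal expression in both conditions."""
    return [math.log2(r / g) for g, r in zip(green, red)]


def call_spot(log_ratio, threshold=1.0):
    """Label a spot; a threshold of 1.0 corresponds to a two-fold change."""
    if log_ratio >= threshold:
        return "higher in second condition (red)"
    if log_ratio <= -threshold:
        return "higher in first condition (green)"
    return "unchanged (yellow)"
```

Working on a log2 scale makes induction and repression symmetric: a ratio of 2 becomes +1 and a ratio of 0.5 becomes -1.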

Fabrication

Microarrays can be manufactured in different ways, depending on the number of probes under examination, costs, customization requirements, and the type of scientific question being asked. Arrays from commercial vendors may have as few as 10 probes or up to 2.1 million micrometre-scale probes.

Surface engineering

The first step of DNA microarray fabrication involves surface engineering of a substrate in order to obtain desirable surface properties for the application of interest. Optimal surface properties are those which produce high signal-to-noise ratios for the DNA targets of interest. Generally, this involves maximizing the probe surface density and activity while minimizing the non-specific binding of the targets of interest. Methods of surface engineering vary depending on the platform material, design, and application.

Spotted vs oligonucleotide arrays

Microarrays can be fabricated using a variety of technologies, including printing with fine-pointed pins onto glass slides, photolithography using pre-made masks, photolithography using dynamic micromirror devices, ink-jet printing, or electrochemistry on microelectrode arrays.

In spotted microarrays, the probes are oligonucleotides, cDNA or small fragments of PCR products that correspond to mRNAs. The probes are synthesized prior to deposition on the array surface and are then spotted onto glass. A common approach utilizes an array of fine pins or needles, controlled by a robotic arm, that is dipped into wells containing DNA probes and then deposits each probe at designated locations on the array surface. The resulting grid of probes represents the nucleic acid profiles of the prepared probes and is ready to receive complementary cDNA or cRNA targets derived from experimental or clinical samples. This technique is used by research scientists around the world to produce in-house printed microarrays in their own labs. These arrays may be easily customized for each experiment, because researchers can choose the probes and printing locations on the arrays, synthesize the probes in their own lab (or a collaborating facility), and spot the arrays. They can then generate their own labeled samples for hybridization, hybridize the samples to the array, and finally scan the arrays with their own equipment. This provides a relatively low-cost microarray that may be customized for each study, and avoids the cost of purchasing often more expensive commercial arrays that may represent vast numbers of genes that are not of interest to the investigator. Publications exist which indicate that in-house spotted microarrays may not provide the same level of sensitivity as commercial oligonucleotide arrays, possibly owing to the small batch sizes and reduced printing efficiencies when compared to the industrial manufacture of oligo arrays.

In oligonucleotide microarrays, the probes are short sequences designed to match parts of the sequence of known or predicted open reading frames. Although oligonucleotide probes are often used in spotted microarrays, the term oligonucleotide array most often refers to a specific technique of manufacturing. Oligonucleotide arrays are produced by printing short oligonucleotide sequences, designed to represent a single gene or family of gene splice-variants, by synthesizing this sequence directly onto the array surface instead of depositing intact sequences. Sequences may be longer (60-mer probes such as the Agilent design) or shorter (25-mer probes produced by Affymetrix) depending on the desired purpose: longer probes are more specific to individual target genes, while shorter probes may be spotted in higher density across the array and are cheaper to manufacture. One technique used to produce oligonucleotide arrays is photolithographic synthesis (Affymetrix) on a silica substrate, where light and light-sensitive masking agents are used to build a sequence one nucleotide at a time across the entire array. Each applicable probe is selectively unmasked prior to bathing the array in a solution of a single nucleotide; then a masking reaction takes place and the next set of probes is unmasked in preparation for a different nucleotide exposure. After many repetitions, the sequence of every probe is fully constructed. More recently, Maskless Array Synthesis from NimbleGen Systems has combined flexibility with large numbers of probes.
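The unmask, expose and re-mask cycle described above can be mimicked with a small Python simulation. It is a deliberately simplified sketch: it ignores the chemistry (protecting groups, coupling efficiency, direction of synthesis) and simply assumes that the array is bathed in A, C, G and T in a fixed repeating order, which is an assumption of this example and not a documented vendor protocol.

```python
def synthesize(probes, cycle_order="ACGT"):
    """Build every probe in parallel, one single-nucleotide exposure at a time.

    Returns the sequences built and the number of exposures that were needed.
    """
    for probe in probes:
        if any(base not in cycle_order for base in probe):
            raise ValueError("probe contains a base outside the cycle order")

    built = ["" for _ in probes]
    exposures = 0
    while any(len(done) < len(goal) for done, goal in zip(built, probes)):
        nucleotide = cycle_order[exposures % len(cycle_order)]
        for index, goal in enumerate(probes):
            position = len(built[index])
            # "Unmask" only the probes whose next required base is this nucleotide.
            if position < len(goal) and goal[position] == nucleotide:
                built[index] += nucleotide
        exposures += 1
    return built, exposures
```

In this model the number of exposures depends on the probe sequences and the exposure order, not on the number of probes, which is why the same cycle can build the whole array at once.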

Two-channel vs one-channel detection

Diagram of typical dual-colour microarray experiment

Two-color microarrays, or two-channel microarrays, are typically hybridized with cDNA prepared from two samples to be compared (e.g. diseased tissue versus healthy tissue) that are labeled with two different fluorophores. Fluorescent dyes commonly used for cDNA labelling include Cy3, which has a fluorescence emission wavelength of 570 nm (corresponding to the green part of the light spectrum), and Cy5, with a fluorescence emission wavelength of 670 nm (corresponding to the red part of the light spectrum). The two Cy-labelled cDNA samples are mixed and hybridized to a single microarray, which is then scanned in a microarray scanner to visualize fluorescence of the two fluorophores after excitation with a laser beam of a defined wavelength. Relative intensities of each fluorophore may then be used in ratio-based analysis to identify up-regulated and down-regulated genes.

Oligonucleotide microarrays often contain control probes designed to hybridize with RNA spike-ins. The degree of hybridization between the spike-ins and the control probes is used to normalize the hybridization measurements for the target probes. Although absolute levels of gene expression may be determined in the two-color array, comparing the relative differences in expression among different spots within a sample and between samples is the preferred method of data analysis for the two-color system. Examples of providers of such microarrays include Agilent with their Dual-Mode platform, Eppendorf with their DualChip platform for fluorescence labeling, and TeleChem International with Arrayit.
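One simple way to picture spike-in normalization is to rescale each array so that the mean signal of its control probes matches a common reference level. The Python sketch below does just that. It is a minimal illustration under that assumption; the function names and the reference level of 1000 are hypothetical, and actual platforms may use more elaborate schemes.

```python
def spike_in_scale_factor(control_signals, reference_level):
    """Factor that brings the mean control-probe signal to the reference level."""
    mean_control = sum(control_signals) / len(control_signals)
    return reference_level / mean_control


def normalise_array(target_signals, control_signals, reference_level=1000.0):
    """Apply the spike-in scale factor to every target-probe measurement."""
    factor = spike_in_scale_factor(control_signals, reference_level)
    return {probe: signal * factor for probe, signal in target_signals.items()}
```

Because the same quantity of spike-in RNA is added to every sample, a weaker control signal on one array suggests that its target measurements are proportionally weak as well, which the scale factor compensates for.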

In single-channel microarrays, or one-color microarrays, the arrays are designed to give estimations of the absolute levels of gene expression. Therefore, the comparison of two conditions requires two separate single-dye hybridizations. As only a single dye is used, the data collected represent absolute values of gene expression. These may be compared to other genes within a sample or to reference normalizing probes used to calibrate data across the entire array and across multiple arrays. Three popular single-channel systems are the Affymetrix Gene Chip, the Applied Microarrays CodeLink arrays, and the Eppendorf DualChip & Silverquant. One strength of the single-dye system lies in the fact that an aberrant sample cannot affect the raw data derived from other samples, because each array chip is exposed to only one sample (as opposed to a two-color system, in which a single low-quality sample may drastically impinge on overall data precision even if the other sample was of high quality). Another benefit is that data are more easily compared to arrays from different experiments: the absolute values of gene expression may be compared between studies conducted months or years apart. A drawback of the one-color system is that, when compared to the two-color system, twice as many microarrays are needed to compare samples within an experiment.

Expression profile clustering

Then we can try to group genes that share the same expression profile across several experiments. This clustering can be done step by step, as in phylogenetic analysis, which consists in calculating similarity criteria between expression profiles and grouping the most similar ones. We can also use more complex techniques such as principal component analysis or neural networks.

In the end, hierarchical clustering is usually displayed as a matrix in which each column represents one experiment and each row a gene. Ratios are displayed using a colour scale going from green (repressed genes) to red (induced genes).
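The step-by-step grouping described above can be sketched as a small agglomerative clustering routine in Python. This is an illustration only: it assumes each gene's profile is a list of log ratios (one per experiment), uses Euclidean distance as the similarity criterion and average linkage between clusters. These choices, and all function names, are assumptions of the example; correlation-based measures are an equally common choice.

```python
import math


def distance(profile_a, profile_b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(profile_a, profile_b)))


def average_linkage(cluster_a, cluster_b, profiles):
    """Mean distance over all gene pairs drawn from the two clusters."""
    pairs = [(i, j) for i in cluster_a for j in cluster_b]
    return sum(distance(profiles[i], profiles[j]) for i, j in pairs) / len(pairs)


def hierarchical_clustering(profiles, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    clusters = [[gene] for gene in profiles]
    while len(clusters) > n_clusters:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = average_linkage(clusters[x], clusters[y], profiles)
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters
```

Calling `hierarchical_clustering(profiles, 2)` on a dictionary mapping gene names to profiles returns two lists of gene names. Recording the order of the merges instead of stopping at a fixed number would give the tree usually drawn beside the colour matrix.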

Uses and types

Arrays of DNA can be spatially arranged, as in the commonly known gene chip (also called genome chip, DNA chip or gene array), or can be specific DNA sequences labelled such that they can be independently identified in solution. The traditional solid-phase array is a collection of microscopic DNA spots attached to a solid surface, such as glass, plastic or a silicon biochip. The affixed DNA segments are known as probes (although some sources use different terms, such as reporters). Thousands of them can be placed in known locations on a single DNA microarray.

DNA microarrays can be used to detect DNA (as in comparative genomic hybridization) or to detect RNA (most commonly as cDNA after reverse transcription) that may or may not be translated into proteins. The process of measuring gene expression via cDNA is called expression analysis or expression profiling.

Since an array can contain tens of thousands of probes, a microarray experiment can accomplish that many genetic tests in parallel. Therefore, arrays have dramatically accelerated many types of investigation.

Applications include

Technology or Application

Synopsis

Gene expression profiling

In an mRNA or gene expression profiling experiment, the expression levels of thousands of genes are simultaneously monitored to study the effects of certain treatments, diseases, and developmental stages on gene expression. For example, microarray-based gene expression profiling can be used to identify genes whose expression is changed in response to pathogens or other organisms by comparing gene expression in infected cells or tissues to that in uninfected ones.

Comparative genomic hybridization

Assessing genome content in different cells or closely related organisms

Chromatin immunoprecipitation on Chip

DNA sequences bound to a particular protein can be isolated by immunoprecipitating that protein (ChIP); these fragments can then be hybridized to a microarray (such as a tiling array), allowing the determination of protein binding site occupancy throughout the genome. Examples of targets to immunoprecipitate are histone modifications (H3K27me3, H3K4me2, H3K9me3, etc.), Polycomb-group proteins (PRC2:Suz12, PRC1:YY1) and trithorax-group proteins (Ash1) to study the epigenetic landscape, or RNA polymerase II to study the transcription landscape.

SNP detection

Identifying single nucleotide polymorphisms among alleles within or between populations. Several applications of microarrays make use of SNP detection, including genotyping, forensic analysis, measuring predisposition to disease, identifying drug candidates, evaluating germline mutations in individuals or somatic mutations in cancers, assessing loss of heterozygosity, and genetic linkage analysis.

Alternative splicing detection

An exon junction array design uses probes specific to the expected or potential splice sites of predicted exons for a gene. Its density, or coverage, is intermediate between that of a typical gene expression array (with 1-3 probes per gene) and that of a genomic tiling array (with hundreds or thousands of probes per gene). It is used to assay the expression of alternative splice forms of a gene. Exon arrays have a different design, employing probes designed to detect each individual exon for known or predicted genes, and can be used for detecting different splicing isoforms.

Tiling array

Genome tiling arrays consist of overlapping probes designed to densely represent a genomic region of interest, sometimes as large as an entire human chromosome. The purpose is to empirically detect expression of transcripts or alternatively spliced forms which may not have been previously known or predicted.
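The idea of overlapping probes can be illustrated with a short Python sketch that slides a fixed-length window along a genomic region. The function name and parameters are hypothetical, and real probe design also screens candidates for melting temperature, repeats and cross-hybridization, which this example leaves out.

```python
def tile_probes(region, probe_length, step):
    """Return (start, sequence) pairs covering the region.

    Consecutive probes overlap whenever step is smaller than probe_length.
    """
    if probe_length <= 0 or step <= 0:
        raise ValueError("probe_length and step must be positive")
    last_start = len(region) - probe_length
    return [
        (start, region[start:start + probe_length])
        for start in range(0, last_start + 1, step)
    ]
```

With a probe length of 25 and a step of 10, for example, each probe would share 15 bases with its neighbour, so every position away from the ends of the region is covered by more than one probe.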

on the microarray this will allow the probe to bind to the complementary strand from the target

Target preparation

RNA is extracted from the two yeast cultures whose expression levels we want to compare. Messenger RNAs are then converted into cDNA by reverse transcription. At this stage, DNA from the first culture is labelled with a green dye, whereas DNA from the second culture is labelled with a red dye.

The available target-preparation methods can be divided into two groups: first-strand cDNA that is labeled or tagged with a capture sequence, or the generation of antisense RNA (aRNA) from double-stranded cDNA during an in vitro transcription (IVT) reaction. Labeled cDNA can be prepared via direct incorporation of a fluorophore-labeled nucleotide, or through incorporation of an aminoallyl-labeled nucleotide followed by coupling of a fluorophore containing an amine-reactive group to the aminoallyl nucleotide (Schena et al. 1995; for review see Lockhart and Winzeler 2000). Alternatively, the first-strand cDNA can be tagged with a capture sequence that is used for subsequent detection steps (Stears et al. 2000). DNA microarrays containing short oligonucleotide probes (<35 nucleotides long) require more target for each hybridization, which calls for an amplification method when sample sizes are small. Typically, the generation of aRNA (also commonly called complementary RNA, or cRNA) is preceded by first-strand synthesis of cDNA using an oligonucleotide primer containing a bacteriophage T7 RNA polymerase promoter proximal to an oligo(dT) sequence (van Gelder et al. 1990; Eberwine et al. 1992; Lockhart et al. 1996). After second-strand cDNA synthesis and cDNA purification, an IVT reaction is performed using T7 RNA polymerase in the presence of labeled nucleotides. Alternatives to this labeling strategy produce unlabeled aRNA, followed by cDNA synthesis in the presence of a fluorophore-labeled nucleotide (Wang et al. 2000). For the target to be representative of the transcript population, any target-preparation method requires linear amplification of the available transcripts.

Hybridisation

Green-labelled and red-labelled cDNAs are mixed together (called the target) and put on the matrix of spotted single-stranded DNA (called the probe). The chip is then incubated overnight at 60 °C. At this temperature, a DNA strand that encounters its complementary strand pairs with it to form double-stranded DNA. The fluorescent DNA thus hybridises to the spotted DNA.

The discrepancies in microarray results are a consequence of differences in microarray measures such as accuracy (i.e. 'the degree of conformity of the measured quantity to its actual (true) value'), sensitivity (i.e. 'the concentration range of target molecules in which accurate measurements can be made'), reproducibility (i.e. 'the degree to which repeated measurements of the same quantity will show the same or similar results') and specificity (i.e. 'the ability of a probe to provide a signal that is influenced only by the presence of the target molecule').

Accuracy, sensitivity and reproducibility may be affected by several effectors. These measures and their effectors are discussed by Dufva and by Draghici et al. and will not be detailed here. An example of an effector of sensitivity, reproducibility and accuracy is the type of microarray platform: oligonucleotide arrays have been found to be more reproducible and sensitive than cDNA arrays, and some oligonucleotide arrays have been found to be more accurate than others. Sensitivity is also affected by probe density (i.e. the number of different probes that are fabricated in a given area), which has been shown to be an effector of the availability of probes for hybridization; this availability may also be affected by the steric restrictions imposed by the solid microarray surface. A higher availability of probes for hybridization has been demonstrated to increase sensitivity. In addition, sensitivity is affected by the hybridization signal-to-noise ratio (i.e. the ratio between the spot signal and that of the background): a low background increases microarray hybridization sensitivity.

Low specificity of microarray hybridizations has been suggested to be one of the prime measures affecting discrepancies in gene-expression profiles between different probes targeting the same region of a given transcript, or between different microarray platforms. In the present review, we will highlight the issue of microarray-hybridization specificity as a key measure that, once improved, may increase the validity of microarray results.

Microarrays consist of multiple probes. Hence, a prime key for specificity during microarray hybridization, for either short-oligomer or cDNA microarrays, is the ability of the probe to discriminate between different target molecules.

Probes are designed to be complementary to the target molecule according to the WatsonndashCrick rules of binding Therefore a probe with high specificity to its target molecule should provide a signal influenced only by the presence of the target molecule Nevertheless a perfect match in terms of sequence-similarity-based complementarity between a probe and its target molecule does not guarantee specificity This is due to the presence of thousands of target molecules during microarray hybridizationmdasheach target molecule being composed of tens of hundreds or thousands of four-nucleotide bases and to the effect of different effectors (discussed subsequently) of hybridization specificity which may alter the ability of a probe to bind to a target molecule Hence there is often some degree of microarray-probe hybridization to a target molecule which is not strictly complementary to it or vice versa a variable number of target molecules that are hybridized to a microarray probe which is not exactly complementary to them

FOUR LEVELS OF HYBRIDIZATION SPECIFICITY

We define four levels of hybridization specificity in the context of microarray hybridization The first is of hybridization between a single probe molecule and a single target molecule The two molecules may exhibit perfect hybridization partial hybridization (ie the target molecule is only partially hybridized to the probe or no hybridization

The second level of specificity is of a spot At this level multiple probe molecules that compose one spot are hybridized to multiple target molecules The spot probes may exhibit perfect partial or no hybridization with the target molecules Notably at this level partial hybridization may have one or both of two forms only some of the probes may be hybridized to the target molecule or probes may be hybridized to only some of the target molecules This partial hybridization at the spot level may be a result of cross- hybridization (ie hybridization between sequences that are not strictly complementary due to the presence and hybridization of nontarget molecules with sequences similar to that of the spot probes Since a spot is composed of multiple probes a single spot may simultaneously bear all combinations of one to four of the presented probe-target molecule types of binding

The third level of specificity is of a spot-set [or in Affymetrix terminology lsquoprobe-setrsquo in which multiple spots represent different segments of the same reference sequence (eg different exons of a gene) At this level different spots of a spot-set may exhibit perfect hybridization with the target molecule partial hybridization with the target molecule due to the presence of probes with mismatches to the target molecule as a result of for example an annotation error in the gene sequence or intended mismatches introduced to quantify nonspecific hybridization no hybridization due to for example alternative

splicing of a transcript leading to probes with no match to the target molecule cross hybridization due to for example a spot within a spot-set that represents an evolutionarily conserved gene segment which hybridizes with nontarget molecules derived from various gene-family members

The fourth level of specificity is that of the microarray in which a variable number of spot-sets may exhibit different forms of hybridization with target sequences perfect hybridization (ie all target molecules are hybridized to their representative spot-sets and all spot-sets are hybridized to the target molecules they represent) partial hybridization in either direction no hybridization (ie target molecules are not hybridized to any spot-set or spot-sets do not match any target molecules) or cross- hybridization (eg target molecules of different genes hybridize to the same spot-set or target molecules of a particular gene hybridize to several different genesrsquo spot-sets) These different forms may exist for a large number of different target molecules or spot-sets

Slide scanning

A laser excites each spot and the fluorescent emission gather through a photo-multiplicator (PMT) coupled to a confocal microscope We obtained two images where grey scales represent fluorescent intensities read If we replace grey scales by green scales for the first image and red scales for the second one we obtained by superimposing the two images one image composed of spots going from green ones (where only DNA from the first condition is fixed) to red (where only DNA from the second condition is fixed) passing through the yellow colour (where DNA from the two conditions are fixed on equal amount)

Data analysis

We have now two microarray images from which we have to calculate the number of DNA molecules in each experimental condition To dos o we measure the signal amount in the green dye emission wavelength and the signal amount in the red dye emission wavelength Then we normalise these amount according to various parameters (yeast amount in each culture condition emission power of each dye hellip) We suppose that the amount of fluorescent DNA fixed is proportional to the mRNA amount present in each cell at the beginning and we calculate the redgreen fluorescence ratio If this ratio is greater than 1 (red on the image) the gene expression is greater in the second experimental condition if this ration is smaller than 1 (green on the image) the gene expression is greater in the first condition We can visualize these differences in expression using software as the one developed in the laboratory call ArrayPlot (cf below image) This software allows from the intensities list of spot to display the red intensities of each spot as a function of the green intensities

Fabrication

Microarrays can be manufactured in different ways depending on the number of probes under examination costs customization requirements and the type of scientific question being asked Arrays may have as few as 10 probes to up to 21 million micrometre-scale probes from commercial vendors

Surface engineering

The first step of DNA microarray fabrication involves surface engineering of a substrate in order to obtain desirable surface properties for the application of interest Optimal surface properties are those which produce high signal to noise ratios for the DNA targets of interest Generally this involves maximizing the probe surface density and activity while minimizing the non-specific binding of the targets of interest Methods of surface engineering vary depending on the platform material design and application

Spotted vs oligonucleotide arrays

Microarrays can be fabricated using a variety of technologies including printing with fine-pointed pins onto glass slides photolithography using pre-made masks photolithography using dynamic micromirror devices ink-jet printing or electrochemistry on microelectrode arrays

In spotted microarrays the probes are oligonucleotides cDNA or small fragments of PCR products that correspond to mRNAs The probes are synthesized prior to deposition on the array surface and are then spotted onto glass A common approach utilizes an array of fine pins or needles controlled by a robotic arm that is dipped into wells containing DNA probes and then depositing each probe at designated locations on the array surface The resulting grid of probes represents the nucleic acid profiles of the prepared probes and is ready to receive complementary cDNA or cRNA targets derived from experimental or clinical samples This technique is used by research scientists around the world to produce in-house printed microarrays from their own labs These arrays may be easily customized for each experiment because researchers can choose the probes and printing locations on the arrays synthesize the probes in their own lab (or collaborating facility) and spot the arrays They can then generate their own labeled samples for hybridization hybridize the samples to the array and finally scan the arrays with their own equipment This provides a relatively low-cost microarray that may be customized for each study and avoids the costs of purchasing often more expensive commercial arrays that may represent vast numbers of genes that are not of interest to the investigator Publications exist which indicate in-house spotted microarrays may not provide the same level of sensitivity compared to commercial oligonucleotide arrays possibly owing to the small batch sizes and reduced printing efficiencies when compared to industrial manufactures of oligo arrays

In oligonucleotide microarrays the probes are short sequences designed to match parts of the sequence of known or predicted open reading frames Although oligonucleotide probes are often used in spotted microarrays the term oligonucleotide array most often refers to a specific technique of manufacturing Oligonucleotide arrays are produced by printing short oligonucleotide sequences designed to represent a single gene or family of gene splice-variants by synthesizing this sequence directly onto the array surface instead of depositing intact sequences Sequences may be longer (60-mer probes such as the Agilent design) or shorter (25-mer probes produced by Affymetrix) depending on the desired purpose longer probes are more specific to individual target genes shorter probes may be spotted in higher density across the array and are cheaper to manufacture One technique used to produce oligonucleotide arrays include photolithographic synthesis (Agilent and Affymetrix) on a silica substrate where light and light-sensitive masking agents are used to build a sequence one nucleotide at a time across the entire array Each applicable probe is selectively unmasked prior to bathing the array in a solution of a single nucleotide then a masking reaction takes place and the next set of probes are unmasked in preparation for a different nucleotide exposure After many repetitions the sequences of every probe become fully constructed More recently Maskless Array Synthesis from NimbleGen Systems has combined flexibility with large numbers of probes

Two-channel vs one-channel detection

Diagram of typical dual-colour microarray experiment

Two-color microarrays or two-channel microarrays are typically hybridized with cDNA prepared from two samples to be compared (eg diseased tissue versus healthy tissue) and that are labeled with two different fluorophores Fluorescent dyes commonly used for cDNA labelling include Cy3 which has a fluorescence emission wavelength of 570 nm (corresponding to the green part of the light spectrum) and Cy5 with a fluorescence emission wavelength of 670 nm (corresponding to the red part of the light spectrum) The two Cy-labelled cDNA samples are mixed and hybridized to a single microarray that is then scanned in a microarray scanner to visualize fluorescence of the two fluorophores after excitation with a laser beam of a defined wavelength Relative intensities of each fluorophore may then be used in ratio-based analysis to identify up-regulated and down-regulated genes

Oligonucleotide microarrays often contain control probes designed to hybridize with RNA spike-ins The degree of hybridization between the spike-ins and the control probes is used to normalize the hybridization measurements for the target probes Although absolute levels of gene expression may be determined in the two-color array the relative differences in expression among different spots within a sample and between samples is the preferred method of data analysis for the two-color system Examples of providers for such microarrays includes Agilent with their Dual-Mode platform Eppendorf with their DualChip platform for fluorescence labeling and TeleChem International with Arrayit

In single-channel microarrays or one-color microarrays, the arrays are designed to give estimations of the absolute levels of gene expression. Therefore, the comparison of two conditions requires two separate single-dye hybridizations. As only a single dye is used, the data collected represent absolute values of gene expression. These may be compared to other genes within a sample or to reference normalizing probes used to calibrate data across the entire array and across multiple arrays. Three popular single-channel systems are the Affymetrix Gene Chip, the Applied Microarrays CodeLink arrays, and the Eppendorf DualChip & Silverquant.

One strength of the single-dye system lies in the fact that an aberrant sample cannot affect the raw data derived from other samples, because each array chip is exposed to only one sample (as opposed to a two-color system, in which a single low-quality sample may drastically impinge on overall data precision even if the other sample was of high quality). Another benefit is that data are more easily compared to arrays from different experiments: the absolute values of gene expression may be compared between studies conducted months or years apart. A drawback of the one-color system is that, when compared to the two-color system, twice as many microarrays are needed to compare samples within an experiment.

Expression profile clustering

We can then try to group genes that share the same expression profile across several experiments. This clustering can be done step by step, as in phylogenetic analysis, which consists of calculating similarity criteria between expression profiles and grouping the most similar ones. We can also use more complex techniques such as principal component analysis or neural networks.

Finally, hierarchical clustering is usually displayed as a matrix in which each column represents one experiment and each row a gene. Ratios are displayed using a colour scale going from green (repressed genes) to red (induced genes).
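The step-by-step grouping described above can be illustrated with a small agglomerative clustering sketch. It assumes Pearson correlation as the similarity criterion and average linkage between groups; neither choice is stated in the text, and the function names and the example profiles are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no shape information
    return cov / (sd_x * sd_y)

def cluster_profiles(profiles, n_clusters):
    """Merge the two most similar groups until n_clusters remain.

    profiles: dict mapping gene name -> list of ratios, one per experiment.
    Distance between groups: average of (1 - Pearson) over all gene pairs
    (average linkage, an assumption of this sketch).
    """
    clusters = [[gene] for gene in profiles]

    def distance(c1, c2):
        total = sum(1 - pearson(profiles[a], profiles[b])
                    for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical log ratios over three experiments.
profiles = {
    "geneA": [1.0, 2.0, 3.0],
    "geneB": [2.0, 4.0, 6.5],
    "geneC": [3.0, 2.0, 1.0],
    "geneD": [6.0, 4.0, 1.5],
}
groups = cluster_profiles(profiles, 2)
```

Here geneA and geneB rise across the experiments while geneC and geneD fall, so the two pairs end up in separate groups.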

Uses and types

Arrays of DNA can be spatially arranged, as in the commonly known gene chip (also called genome chip, DNA chip or gene array), or can be specific DNA sequences labelled such that they can be independently identified in solution. The traditional solid-phase array is a collection of microscopic DNA spots attached to a solid surface such as glass, plastic or a silicon biochip. The affixed DNA segments are known as probes (although some sources use different terms, such as reporters). Thousands of them can be placed in known locations on a single DNA microarray.

DNA microarrays can be used to detect DNA (as in comparative genomic hybridization) or to detect RNA (most commonly as cDNA after reverse transcription) that may or may not be translated into proteins. The process of measuring gene expression via cDNA is called expression analysis or expression profiling.

Since an array can contain tens of thousands of probes, a microarray experiment can accomplish that many genetic tests in parallel. Therefore, arrays have dramatically accelerated many types of investigation.

Applications include the following; each technology or application is followed by its synopsis.

Gene expression profiling

In an mRNA or gene expression profiling experiment, the expression levels of thousands of genes are simultaneously monitored to study the effects of certain treatments, diseases, and developmental stages on gene expression. For example, microarray-based gene expression profiling can be used to identify genes whose expression is changed in response to pathogens or other organisms by comparing gene expression in infected cells or tissues to that in uninfected cells or tissues.

Comparative genomic hybridization

Assessing genome content in different cells or closely related organisms

Chromatin immunoprecipitation on Chip

DNA sequences bound to a particular protein can be isolated by immunoprecipitating that protein (ChIP); these fragments can then be hybridized to a microarray (such as a tiling array), allowing the determination of protein binding site occupancy throughout the genome. Example targets for immunoprecipitation are histone modifications (H3K27me3, H3K4me2, H3K9me3, etc.), Polycomb-group proteins (PRC2:Suz12, PRC1:YY1) and trithorax-group proteins (Ash1) to study the epigenetic landscape, or RNA Polymerase II to study the transcription landscape.

SNP detection

Identifying single nucleotide polymorphisms among alleles within or between populations. Several applications of microarrays make use of SNP detection, including genotyping, forensic analysis, measuring predisposition to disease, identifying drug candidates, evaluating germline mutations in individuals or somatic mutations in cancers, assessing loss of heterozygosity, and genetic linkage analysis.

Alternative splicing detection

An exon junction array design uses probes specific to the expected or potential splice sites of predicted exons for a gene. Its density, or coverage, is intermediate between that of a typical gene expression array (with 1-3 probes per gene) and that of a genomic tiling array (with hundreds or thousands of probes per gene). It is used to assay the expression of alternative splice forms of a gene. Exon arrays have a different design, employing probes designed to detect each individual exon for known or predicted genes, and can be used for detecting different splicing isoforms.

Tiling array

Genome tiling arrays consist of overlapping probes designed to densely represent a genomic region of interest, sometimes as large as an entire human chromosome. The purpose is to empirically detect expression of transcripts or alternatively spliced forms which may not have been previously known or predicted.
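As an illustration of how overlapping probes can cover a region, the sketch below cuts a sequence into windows of fixed length that start at regular intervals. The probe length, the step and the function name tile_probes are assumptions made for the example; real designs also filter probes on criteria not covered here.

```python
def tile_probes(sequence, probe_length, step):
    """Return (start, probe) pairs covering the sequence with overlap.

    Consecutive probes overlap by probe_length - step bases, so the step
    must be smaller than the probe length for the tiling to overlap.
    """
    if probe_length <= 0 or step <= 0:
        raise ValueError("probe_length and step must be positive")
    if step >= probe_length:
        raise ValueError("step must be smaller than probe_length to overlap")
    last_start = len(sequence) - probe_length
    return [(start, sequence[start:start + probe_length])
            for start in range(0, last_start + 1, step)]

# Hypothetical 10-base region tiled with 4-base probes every 2 bases.
tiles = tile_probes("ACGTACGTAC", probe_length=4, step=2)
```

A smaller step gives denser coverage of the region at the cost of more probes on the array.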

The discrepancies in microarray results are a consequence of differences in microarray measures such as accuracy (i.e., 'the degree of conformity of the measured quantity to its actual (true) value'), sensitivity (i.e., 'the concentration range of target molecules in which accurate measurements can be made'), reproducibility (i.e., 'the degree to which repeated measurements of the same quantity will show the same or similar results') and specificity (i.e., 'the ability of a probe to provide a signal that is influenced only by the presence of the target molecule').

Accuracy, sensitivity and reproducibility may be affected by several effectors. These measures and their effectors are discussed by Dufva and by Draghici et al. and will not be detailed here. An example of an effector of sensitivity, reproducibility and accuracy is the type of microarray platform: oligonucleotide arrays have been found to be more reproducible and sensitive than cDNA arrays, and some oligonucleotide arrays have been found to be more accurate than others. Sensitivity is also affected by probe density (i.e., the number of different probes that are fabricated in a given area), which has been shown to be an effector of the availability of probes for hybridization; this availability may also be affected by the steric restrictions imposed by the solid microarray surface. A higher availability of probes for hybridization has been demonstrated to increase sensitivity. In addition, sensitivity is affected by the hybridization signal-to-noise ratio (i.e., the ratio between the spot signal and that of the background); a low background increases microarray hybridization sensitivity.

Low specificity of microarray hybridizations has been suggested to be one of the prime measures affecting discrepancies in gene-expression profiles between different probes targeting the same region of a given transcript, or between different microarray platforms. In the present review, we will highlight the issue of microarray-hybridization specificity as a key measure that, once improved, may increase the validity of microarray results.
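The signal-to-noise ratio defined above (spot signal divided by background signal) can be computed as in the following sketch. How the pixel values are summarized is an assumption: the mean of the spot pixels and the median of the surrounding background pixels are used here, and the function name and pixel values are hypothetical.

```python
from statistics import mean, median

def signal_to_noise(spot_pixels, background_pixels):
    """Ratio between the spot signal and the local background signal.

    Assumption: spot signal = mean of spot pixels, background = median
    of the pixels around the spot.
    """
    background = median(background_pixels)
    if background <= 0:
        raise ValueError("background must be positive to form a ratio")
    return mean(spot_pixels) / background

# Same spot measured against a low and a high background.
snr_low_background = signal_to_noise([1200, 1000, 1100], [100, 110, 90])
snr_high_background = signal_to_noise([1200, 1000, 1100], [500, 550, 450])
```

With the same spot signal, the lower background yields the higher ratio, which matches the statement that a low background increases hybridization sensitivity.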

Microarrays consist of multiple probes. Hence, a prime key for specificity during microarray hybridization, for either short-oligomer or cDNA microarrays, is the ability of the probe to discriminate between different target molecules.

Probes are designed to be complementary to the target molecule according to the Watson-Crick rules of binding. Therefore, a probe with high specificity to its target molecule should provide a signal influenced only by the presence of the target molecule. Nevertheless, a perfect match in terms of sequence-similarity-based complementarity between a probe and its target molecule does not guarantee specificity. This is due to the presence of thousands of target molecules during microarray hybridization, each target molecule being composed of tens, hundreds or thousands of the four nucleotide bases, and to the effect of different effectors (discussed subsequently) of hybridization specificity, which may alter the ability of a probe to bind to a target molecule. Hence, there is often some degree of microarray-probe hybridization to a target molecule which is not strictly complementary to it or, vice versa, a variable number of target molecules that are hybridized to a microarray probe which is not exactly complementary to them.
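Watson-Crick complementarity between a probe and a target site can be checked with a short sketch like the one below. It only counts mismatched positions between equal-length DNA sequences; it does not model hybridization thermodynamics or any of the effectors mentioned above, and the function names and sequences are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(sequence):
    """Return the Watson-Crick reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(sequence.upper()))

def count_mismatches(probe, target_site):
    """Count positions where the probe is not complementary to the target.

    Both sequences are written 5' to 3' and must have the same length;
    0 means the probe is a perfect match for this target site.
    """
    if len(probe) != len(target_site):
        raise ValueError("probe and target site must have the same length")
    perfect_probe = reverse_complement(target_site)
    return sum(1 for observed, expected in zip(probe.upper(), perfect_probe)
               if observed != expected)

# Hypothetical 5-base target site and two candidate probes.
perfect = count_mismatches("ACGTT", "AACGT")
one_off = count_mismatches("ACGAT", "AACGT")
```

A probe with a small but non-zero mismatch count against a nontarget molecule is the kind of near-complementary pairing that can give rise to cross-hybridization.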

FOUR LEVELS OF HYBRIDIZATION SPECIFICITY

We define four levels of hybridization specificity in the context of microarray hybridization. The first is that of hybridization between a single probe molecule and a single target molecule. The two molecules may exhibit perfect hybridization, partial hybridization (i.e., the target molecule is only partially hybridized to the probe), or no hybridization.

The second level of specificity is that of a spot. At this level, multiple probe molecules that compose one spot are hybridized to multiple target molecules. The spot probes may exhibit perfect, partial or no hybridization with the target molecules. Notably, at this level, partial hybridization may take one or both of two forms: only some of the probes may be hybridized to the target molecule, or probes may be hybridized to only some of the target molecules. This partial hybridization at the spot level may be a result of cross-hybridization (i.e., hybridization between sequences that are not strictly complementary) due to the presence and hybridization of nontarget molecules with sequences similar to that of the spot probes. Since a spot is composed of multiple probes, a single spot may simultaneously bear all combinations of one to four of the presented probe-target molecule types of binding.

The third level of specificity is that of a spot-set (or, in Affymetrix terminology, 'probe-set'), in which multiple spots represent different segments of the same reference sequence (e.g., different exons of a gene). At this level, different spots of a spot-set may exhibit perfect hybridization with the target molecule; partial hybridization with the target molecule, due to the presence of probes with mismatches to the target molecule as a result of, for example, an annotation error in the gene sequence or intended mismatches introduced to quantify nonspecific hybridization; no hybridization, due to, for example, alternative splicing of a transcript leading to probes with no match to the target molecule; or cross-hybridization, due to, for example, a spot within a spot-set that represents an evolutionarily conserved gene segment, which hybridizes with nontarget molecules derived from various gene-family members.

The fourth level of specificity is that of the microarray, in which a variable number of spot-sets may exhibit different forms of hybridization with target sequences: perfect hybridization (i.e., all target molecules are hybridized to their representative spot-sets and all spot-sets are hybridized to the target molecules they represent), partial hybridization in either direction, no hybridization (i.e., target molecules are not hybridized to any spot-set, or spot-sets do not match any target molecules), or cross-hybridization (e.g., target molecules of different genes hybridize to the same spot-set, or target molecules of a particular gene hybridize to several different genes' spot-sets). These different forms may exist for a large number of different target molecules or spot-sets.

Slide scanning

A laser excites each spot and the fluorescent emission is gathered through a photomultiplier tube (PMT) coupled to a confocal microscope. We obtain two images in which grey scales represent the fluorescence intensities read. If we replace the grey scales with green scales for the first image and red scales for the second, we obtain, by superimposing the two images, one image composed of spots ranging from green (where only DNA from the first condition is fixed) to red (where only DNA from the second condition is fixed), passing through yellow (where DNA from the two conditions is fixed in equal amounts).
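The superimposition of the two scanned images can be sketched as follows. The images are represented as plain nested lists of grey levels (0 to 255), which is an assumption made to keep the example self-contained; the function name overlay is hypothetical.

```python
def overlay(green_image, red_image):
    """Combine two grey-scale images into one image of (R, G, B) pixels.

    The first image fills the green channel and the second the red
    channel, so equal intensities in both appear yellow.
    """
    if len(green_image) != len(red_image):
        raise ValueError("images must have the same number of rows")
    return [[(red, green, 0) for green, red in zip(green_row, red_row)]
            for green_row, red_row in zip(green_image, red_image)]

# One row of three spots: first condition only, second only, both equally.
first_condition = [[255, 0, 200]]
second_condition = [[0, 255, 200]]
composite = overlay(first_condition, second_condition)
```

The third pixel has equal red and green components and therefore appears yellow in the composite image.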

Data analysis

We now have two microarray images from which we have to calculate the number of DNA molecules in each experimental condition. To do so, we measure the signal amount at the green dye emission wavelength and the signal amount at the red dye emission wavelength. Then we normalise these amounts according to various parameters (yeast amount in each culture condition, emission power of each dye, ...). We suppose that the amount of fluorescent DNA fixed is proportional to the mRNA amount present in each cell at the beginning, and we calculate the red/green fluorescence ratio. If this ratio is greater than 1 (red on the image), the gene expression is greater in the second experimental condition; if this ratio is smaller than 1 (green on the image), the gene expression is greater in the first condition. We can visualize these differences in expression using software such as the one developed in the laboratory, called ArrayPlot (cf. image below). From the list of spot intensities, this software displays the red intensity of each spot as a function of its green intensity.
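The ratio calculation can be sketched as below. The normalisation used here, rescaling the red channel so that both channels have the same total intensity, is an assumption that stands in for the parameters listed above (cell amount, emission power of each dye); the function names and intensities are hypothetical.

```python
def red_green_ratios(green, red):
    """Return the normalised red/green ratio for each spot.

    green, red: dicts mapping spot name -> measured intensity.
    Assumption: the red channel is rescaled so that its total intensity
    equals the total green intensity before ratios are taken.
    """
    scale = sum(green.values()) / sum(red.values())
    return {spot: (red[spot] * scale) / green[spot] for spot in green}

def interpret(ratio):
    """Translate a ratio into the condition with the higher expression."""
    if ratio > 1:
        return "higher in second condition"
    if ratio < 1:
        return "higher in first condition"
    return "equal in both conditions"

# Hypothetical intensities for three spots.
green = {"spot_1": 100.0, "spot_2": 200.0, "spot_3": 100.0}
red = {"spot_1": 400.0, "spot_2": 200.0, "spot_3": 200.0}
ratios = red_green_ratios(green, red)
calls = {spot: interpret(value) for spot, value in ratios.items()}
```

Without the rescaling step, spot_3 would appear twice as strong in red simply because the red channel is brighter overall in this example.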

Fabrication

Microarrays can be manufactured in different ways, depending on the number of probes under examination, costs, customization requirements, and the type of scientific question being asked. Arrays from commercial vendors may have as few as 10 probes or up to 2.1 million micrometre-scale probes.

Surface engineering

The first step of DNA microarray fabrication involves surface engineering of a substrate in order to obtain desirable surface properties for the application of interest. Optimal surface properties are those which produce high signal-to-noise ratios for the DNA targets of interest. Generally, this involves maximizing the probe surface density and activity while minimizing the non-specific binding of the targets of interest. Methods of surface engineering vary depending on the platform material, design, and application.

Spotted vs oligonucleotide arrays

Microarrays can be fabricated using a variety of technologies, including printing with fine-pointed pins onto glass slides, photolithography using pre-made masks, photolithography using dynamic micromirror devices, ink-jet printing, or electrochemistry on microelectrode arrays.

In spotted microarrays, the probes are oligonucleotides, cDNA or small fragments of PCR products that correspond to mRNAs. The probes are synthesized prior to deposition on the array surface and are then spotted onto glass. A common approach utilizes an array of fine pins or needles, controlled by a robotic arm, that is dipped into wells containing DNA probes and then deposits each probe at designated locations on the array surface. The resulting grid of probes represents the nucleic acid profiles of the prepared probes and is ready to receive complementary cDNA or cRNA targets derived from experimental or clinical samples.

This technique is used by research scientists around the world to produce in-house printed microarrays in their own labs. These arrays may be easily customized for each experiment, because researchers can choose the probes and printing locations on the arrays, synthesize the probes in their own lab (or a collaborating facility), and spot the arrays. They can then generate their own labeled samples for hybridization, hybridize the samples to the array, and finally scan the arrays with their own equipment. This provides a relatively low-cost microarray that may be customized for each study, and avoids the costs of purchasing often more expensive commercial arrays that may represent vast numbers of genes that are not of interest to the investigator. Publications exist which indicate that in-house spotted microarrays may not provide the same level of sensitivity as commercial oligonucleotide arrays, possibly owing to the small batch sizes and reduced printing efficiencies when compared to industrial manufacture of oligo arrays.

In oligonucleotide microarrays, the probes are short sequences designed to match parts of the sequence of known or predicted open reading frames. Although oligonucleotide probes are often used in spotted microarrays, the term oligonucleotide array most often refers to a specific technique of manufacturing. Oligonucleotide arrays are produced by printing short oligonucleotide sequences designed to represent a single gene or family of gene splice-variants, by synthesizing this sequence directly onto the array surface instead of depositing intact sequences. Sequences may be longer (60-mer probes, such as the Agilent design) or shorter (25-mer probes, produced by Affymetrix) depending on the desired purpose: longer probes are more specific to individual target genes, while shorter probes may be spotted in higher density across the array and are cheaper to manufacture. One technique used to produce oligonucleotide arrays is photolithographic synthesis (Agilent and Affymetrix) on a silica substrate, where light and light-sensitive masking agents are used to build a sequence one nucleotide at a time across the entire array. Each applicable probe is selectively unmasked prior to bathing the array in a solution of a single nucleotide; then a masking reaction takes place and the next set of probes is unmasked in preparation for a different nucleotide exposure. After many repetitions, the sequences of every probe become fully constructed. More recently, Maskless Array Synthesis from NimbleGen Systems has combined flexibility with large numbers of probes.
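The unmask-and-expose cycle of photolithographic synthesis can be illustrated with a toy simulation. It assumes that the array is exposed to the four nucleotides in a fixed repeating order (A, C, G, T) and ignores synthesis direction, coupling efficiency and mask design; the function name synthesize and the probe sequences are hypothetical.

```python
def synthesize(probes, cycle_order="ACGT"):
    """Grow all probes in parallel, one nucleotide exposure at a time.

    At each exposure, only the probes whose next base equals the
    nucleotide in solution are unmasked and extended by one base.
    Returns the finished sequences and the number of exposures used.
    """
    for probe in probes:
        if set(probe) - set(cycle_order):
            raise ValueError("probe contains a base outside the cycle order")
    grown = ["" for _ in probes]
    exposures = 0
    while any(len(g) < len(p) for g, p in zip(grown, probes)):
        base = cycle_order[exposures % len(cycle_order)]
        for index, probe in enumerate(probes):
            position = len(grown[index])
            # Unmask this feature only if its next base is the one in solution.
            if position < len(probe) and probe[position] == base:
                grown[index] += base
        exposures += 1
    return grown, exposures

# Two hypothetical 2-base probes built side by side.
finished, exposures = synthesize(["AC", "GA"])
```

In this example the second probe must wait for the exposure cycle to come back to A, which is why five exposures are needed for two 2-base probes.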



Microarrays consist of multiple probes Hence a prime key for specificity during microarray hybridiation for either short-oligomer or cDNA microarrays is the ability of the probe to discriminate between different target molecules

Probes are designed to be complementary to the target molecule according to the WatsonndashCrick rules of binding Therefore a probe with high specificity to its target molecule should provide a signal influenced only by the presence of the target molecule Nevertheless a perfect match in terms of sequence-similarity-based complementarity between a probe and its target molecule does not guarantee specificity This is due to the presence of thousands of target molecules during microarray hybridizationmdasheach target molecule being composed of tens of hundreds or thousands of four-nucleotide bases and to the effect of different effectors (discussed subsequently) of hybridization specificity which may alter the ability of a probe to bind to a target molecule Hence there is often some degree of microarray-probe hybridization to a target molecule which is not strictly complementary to it or vice versa a variable number of target molecules that are hybridized to a microarray probe which is not exactly complementary to them

FOUR LEVELS OF HYBRIDIZATION SPECIFICITY

We define four levels of hybridization specificity in the context of microarray hybridization The first is of hybridization between a single probe molecule and a single target molecule The two molecules may exhibit perfect hybridization partial hybridization (ie the target molecule is only partially hybridized to the probe or no hybridization

The second level of specificity is of a spot At this level multiple probe molecules that compose one spot are hybridized to multiple target molecules The spot probes may exhibit perfect partial or no hybridization with the target molecules Notably at this level partial hybridization may have one or both of two forms only some of the probes may be hybridized to the target molecule or probes may be hybridized to only some of the target molecules This partial hybridization at the spot level may be a result of cross- hybridization (ie hybridization between sequences that are not strictly complementary due to the presence and hybridization of nontarget molecules with sequences similar to that of the spot probes Since a spot is composed of multiple probes a single spot may simultaneously bear all combinations of one to four of the presented probe-target molecule types of binding

The third level of specificity is of a spot-set [or in Affymetrix terminology lsquoprobe-setrsquo in which multiple spots represent different segments of the same reference sequence (eg different exons of a gene) At this level different spots of a spot-set may exhibit perfect hybridization with the target molecule partial hybridization with the target molecule due to the presence of probes with mismatches to the target molecule as a result of for example an annotation error in the gene sequence or intended mismatches introduced to quantify nonspecific hybridization no hybridization due to for example alternative

splicing of a transcript leading to probes with no match to the target molecule cross hybridization due to for example a spot within a spot-set that represents an evolutionarily conserved gene segment which hybridizes with nontarget molecules derived from various gene-family members

The fourth level of specificity is that of the microarray in which a variable number of spot-sets may exhibit different forms of hybridization with target sequences perfect hybridization (ie all target molecules are hybridized to their representative spot-sets and all spot-sets are hybridized to the target molecules they represent) partial hybridization in either direction no hybridization (ie target molecules are not hybridized to any spot-set or spot-sets do not match any target molecules) or cross- hybridization (eg target molecules of different genes hybridize to the same spot-set or target molecules of a particular gene hybridize to several different genesrsquo spot-sets) These different forms may exist for a large number of different target molecules or spot-sets

Slide scanning

A laser excites each spot and the fluorescent emission gather through a photo-multiplicator (PMT) coupled to a confocal microscope We obtained two images where grey scales represent fluorescent intensities read If we replace grey scales by green scales for the first image and red scales for the second one we obtained by superimposing the two images one image composed of spots going from green ones (where only DNA from the first condition is fixed) to red (where only DNA from the second condition is fixed) passing through the yellow colour (where DNA from the two conditions are fixed on equal amount)

Data analysis

We have now two microarray images from which we have to calculate the number of DNA molecules in each experimental condition To dos o we measure the signal amount in the green dye emission wavelength and the signal amount in the red dye emission wavelength Then we normalise these amount according to various parameters (yeast amount in each culture condition emission power of each dye hellip) We suppose that the amount of fluorescent DNA fixed is proportional to the mRNA amount present in each cell at the beginning and we calculate the redgreen fluorescence ratio If this ratio is greater than 1 (red on the image) the gene expression is greater in the second experimental condition if this ration is smaller than 1 (green on the image) the gene expression is greater in the first condition We can visualize these differences in expression using software as the one developed in the laboratory call ArrayPlot (cf below image) This software allows from the intensities list of spot to display the red intensities of each spot as a function of the green intensities

Fabrication

Microarrays can be manufactured in different ways depending on the number of probes under examination costs customization requirements and the type of scientific question being asked Arrays may have as few as 10 probes to up to 21 million micrometre-scale probes from commercial vendors

Surface engineering

The first step of DNA microarray fabrication involves surface engineering of a substrate in order to obtain desirable surface properties for the application of interest Optimal surface properties are those which produce high signal to noise ratios for the DNA targets of interest Generally this involves maximizing the probe surface density and activity while minimizing the non-specific binding of the targets of interest Methods of surface engineering vary depending on the platform material design and application

Spotted vs oligonucleotide arrays

Microarrays can be fabricated using a variety of technologies including printing with fine-pointed pins onto glass slides photolithography using pre-made masks photolithography using dynamic micromirror devices ink-jet printing or electrochemistry on microelectrode arrays

In spotted microarrays the probes are oligonucleotides cDNA or small fragments of PCR products that correspond to mRNAs The probes are synthesized prior to deposition on the array surface and are then spotted onto glass A common approach utilizes an array of fine pins or needles controlled by a robotic arm that is dipped into wells containing DNA probes and then depositing each probe at designated locations on the array surface The resulting grid of probes represents the nucleic acid profiles of the prepared probes and is ready to receive complementary cDNA or cRNA targets derived from experimental or clinical samples This technique is used by research scientists around the world to produce in-house printed microarrays from their own labs These arrays may be easily customized for each experiment because researchers can choose the probes and printing locations on the arrays synthesize the probes in their own lab (or collaborating facility) and spot the arrays They can then generate their own labeled samples for hybridization hybridize the samples to the array and finally scan the arrays with their own equipment This provides a relatively low-cost microarray that may be customized for each study and avoids the costs of purchasing often more expensive commercial arrays that may represent vast numbers of genes that are not of interest to the investigator Publications exist which indicate in-house spotted microarrays may not provide the same level of sensitivity compared to commercial oligonucleotide arrays possibly owing to the small batch sizes and reduced printing efficiencies when compared to industrial manufactures of oligo arrays

In oligonucleotide microarrays, the probes are short sequences designed to match parts of the sequence of known or predicted open reading frames. Although oligonucleotide probes are often used in spotted microarrays, the term "oligonucleotide array" most often refers to a specific manufacturing technique. Oligonucleotide arrays are produced by synthesizing short oligonucleotide sequences, each designed to represent a single gene or a family of gene splice variants, directly onto the array surface instead of depositing intact sequences. Sequences may be longer (60-mer probes, such as the Agilent design) or shorter (25-mer probes, produced by Affymetrix) depending on the desired purpose: longer probes are more specific to individual target genes, while shorter probes may be placed at higher density across the array and are cheaper to manufacture. One technique used to produce oligonucleotide arrays is photolithographic synthesis (Affymetrix) on a silica substrate, where light and light-sensitive masking agents are used to build a sequence one nucleotide at a time across the entire array. Each applicable probe is selectively unmasked before the array is bathed in a solution of a single nucleotide; a masking reaction then takes place, and the next set of probes is unmasked in preparation for exposure to a different nucleotide. After many repetitions, the sequence of every probe is fully constructed. More recently, Maskless Array Synthesis from NimbleGen Systems has combined flexibility with large numbers of probes.
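
The unmask, expose, and re-mask cycle described above can be sketched as a small simulation. This is only an illustration, not any vendor's actual process: the function name `synthesize`, the fixed A, C, G, T flow order, and the three toy probe sequences are assumptions made for the example. In each step one nucleotide is offered to the whole array, and only the probes whose next required base matches it are unmasked and extended.

```python
def synthesize(probes, flow_order="ACGT"):
    """Simulate mask-directed in situ synthesis (illustrative only).

    probes: list of target probe sequences (hypothetical examples).
    Returns (steps, built), where steps is a list of
    (nucleotide, unmasked_probe_indices) and built holds the sequences
    that were grown on the simulated array.
    """
    for p in probes:
        if set(p) - set(flow_order):
            raise ValueError(f"probe {p!r} contains a base outside {flow_order!r}")

    built = ["" for _ in probes]
    steps = []
    while any(len(b) < len(p) for b, p in zip(built, probes)):
        for base in flow_order:
            # Unmask only the probes whose next required base is `base`.
            unmasked = [i for i, (b, p) in enumerate(zip(built, probes))
                        if len(b) < len(p) and p[len(b)] == base]
            if unmasked:
                for i in unmasked:
                    built[i] += base  # the exposed probes gain one nucleotide
                steps.append((base, unmasked))
    return steps, built


probes = ["ACG", "CAT", "GGA"]  # hypothetical 3-mers, far shorter than real probes
steps, built = synthesize(probes)
for base, unmasked in steps:
    print(f"flow {base}: unmask probes {unmasked}")
```

Every probe on the array is extended in parallel, so the number of exposure steps depends on probe length and base order rather than on the number of probes, which is why the approach scales to very dense arrays.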

Two-channel vs one-channel detection

Diagram of typical dual-colour microarray experiment

Two-color microarrays, or two-channel microarrays, are typically hybridized with cDNA prepared from two samples to be compared (e.g., diseased tissue versus healthy tissue), each labeled with a different fluorophore. Fluorescent dyes commonly used for cDNA labelling include Cy3, which has a fluorescence emission wavelength of 570 nm (corresponding to the green part of the light spectrum), and Cy5, with a fluorescence emission wavelength of 670 nm (corresponding to the red part of the light spectrum). The two Cy-labelled cDNA samples are mixed and hybridized to a single microarray, which is then scanned in a microarray scanner to visualize the fluorescence of the two fluorophores after excitation with a laser beam of a defined wavelength. The relative intensities of the two fluorophores may then be used in ratio-based analysis to identify up-regulated and down-regulated genes.

Oligonucleotide microarrays often contain control probes designed to hybridize with RNA spike-ins. The degree of hybridization between the spike-ins and the control probes is used to normalize the hybridization measurements for the target probes. Although absolute levels of gene expression may be determined with a two-color array, the preferred method of data analysis for the two-color system is based on the relative differences in expression among different spots within a sample and between samples. Examples of providers of such microarrays include Agilent with their Dual-Mode platform, Eppendorf with their DualChip platform for fluorescence labeling, and TeleChem International with Arrayit.
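
Spike-in normalization can be illustrated with a minimal sketch. It assumes the simplest possible scheme, in which each array is rescaled so that the median intensity of its spike-in control probes matches a common reference value; real analysis pipelines use more elaborate models. The probe names, the intensities, the reference value, and the function `normalize_by_spike_ins` are all hypothetical.

```python
from statistics import median


def normalize_by_spike_ins(intensities, control_ids, reference=1000.0):
    """Rescale one array so its spike-in controls have median `reference`.

    intensities: dict mapping probe id -> raw intensity.
    control_ids: ids of the control probes that bind the RNA spike-ins.
    """
    control_median = median(intensities[c] for c in control_ids)
    if control_median <= 0:
        raise ValueError("spike-in controls have no usable signal")
    scale = reference / control_median
    return {probe: value * scale for probe, value in intensities.items()}


# Hypothetical raw data: the second array was scanned twice as brightly.
array_a = {"spike1": 900.0, "spike2": 1100.0, "geneX": 500.0}
array_b = {"spike1": 1800.0, "spike2": 2200.0, "geneX": 1000.0}
controls = ["spike1", "spike2"]

norm_a = normalize_by_spike_ins(array_a, controls)
norm_b = normalize_by_spike_ins(array_b, controls)
print(norm_a["geneX"], norm_b["geneX"])
```

Because the same amount of spike-in RNA is added to every sample, a difference in control intensity between arrays is attributed to the measurement rather than to biology, and the target probes are corrected by the same factor.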

In single-channel microarrays, or one-color microarrays, the arrays are designed to give estimates of the absolute levels of gene expression. The comparison of two conditions therefore requires two separate single-dye hybridizations. As only a single dye is used, the data collected represent absolute values of gene expression. These may be compared to other genes within a sample or to reference normalizing probes used to calibrate data across the entire array and across multiple arrays. Three popular single-channel systems are the Affymetrix Gene Chip, the Applied Microarrays CodeLink arrays, and the Eppendorf DualChip & Silverquant. One strength of the single-dye system is that an aberrant sample cannot affect the raw data derived from other samples, because each array chip is exposed to only one sample (as opposed to a two-color system, in which a single low-quality sample may drastically impinge on overall data precision even if the other sample was of high quality). Another benefit is that data are more easily compared to arrays from different experiments: the absolute values of gene expression may be compared between studies conducted months or years apart. A drawback of the one-color system is that, compared with the two-color system, twice as many microarrays are needed to compare samples within an experiment.

Expression profile clustering

We can then try to group genes that share the same expression profile across several experiments. This clustering can be done step by step, as in phylogenetic analysis, by calculating a similarity criterion between expression profiles and grouping the most similar ones. We can also use more complex techniques, such as principal component analysis or neural networks.

Finally, hierarchical clustering is usually displayed as a matrix in which each column represents one experiment and each row a gene. Ratios are displayed using a colour scale going from green (repressed genes) to red (induced genes).
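
The step-by-step grouping described above can be sketched as a small agglomerative clustering routine. The text does not name a similarity criterion or a linkage rule, so this example assumes Pearson correlation as the similarity measure and average linkage between clusters; the gene names and log-ratio values are invented for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)  # undefined for a perfectly flat profile


def cluster(profiles):
    """Agglomerative clustering with average linkage.

    profiles: dict gene -> list of log ratios, one per experiment.
    Returns the merge history as (cluster_a, cluster_b, distance) tuples.
    """
    clusters = [(gene,) for gene in profiles]
    history = []

    def distance(c1, c2):
        # Average of (1 - correlation) over all cross-cluster gene pairs.
        pairs = [(g, h) for g in c1 for h in c2]
        total = sum(1 - pearson(profiles[g], profiles[h]) for g, h in pairs)
        return total / len(pairs)

    while len(clusters) > 1:
        # Find and merge the two most similar clusters.
        d, a, b = min(
            (distance(a, b), a, b)
            for i, a in enumerate(clusters)
            for b in clusters[i + 1:]
        )
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        history.append((a, b, d))
    return history


# Hypothetical log ratios for four genes across four experiments.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [1.1, 2.1, 2.9, 4.2],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [3.8, 3.1, 2.2, 0.9],
}
history = cluster(profiles)
for a, b, d in history:
    print(a, "+", b, f"distance={d:.3f}")
```

The merge history is what a dendrogram draws: genes joined at a small distance sit on adjacent rows of the displayed matrix, so co-regulated genes form visible green or red blocks.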

Uses and types

Arrays of DNA can be spatially arranged, as in the commonly known gene chip (also called genome chip, DNA chip, or gene array), or can be specific DNA sequences labelled such that they can be independently identified in solution. The traditional solid-phase array is a collection of microscopic DNA spots attached to a solid surface, such as a glass, plastic, or silicon biochip. The affixed DNA segments are known as probes (although some sources use different terms, such as reporters). Thousands of them can be placed in known locations on a single DNA microarray.

DNA microarrays can be used to detect DNA (as in comparative genomic hybridization) or to detect RNA (most commonly as cDNA after reverse transcription) that may or may not be translated into proteins. The process of measuring gene expression via cDNA is called expression analysis or expression profiling.

Since an array can contain tens of thousands of probes, a microarray experiment can accomplish that many genetic tests in parallel. Arrays have therefore dramatically accelerated many types of investigation.

Applications include the following; each technology or application is listed with a short synopsis.

Gene expression profiling

In an mRNA or gene expression profiling experiment, the expression levels of thousands of genes are monitored simultaneously to study the effects of certain treatments, diseases, and developmental stages on gene expression. For example, microarray-based gene expression profiling can be used to identify genes whose expression changes in response to pathogens or other organisms, by comparing gene expression in infected cells or tissues with that in uninfected ones.

Comparative genomic hybridization

Assessing genome content in different cells or closely related organisms.

Chromatin immunoprecipitation on Chip

DNA sequences bound to a particular protein can be isolated by immunoprecipitating that protein (ChIP). These fragments can then be hybridized to a microarray (such as a tiling array), allowing protein binding site occupancy to be determined throughout the genome. Examples of targets to immunoprecipitate are histone modifications (H3K27me3, H3K4me2, H3K9me3, etc.), Polycomb-group proteins (PRC2:Suz12, PRC1:YY1), and trithorax-group proteins (Ash1) to study the epigenetic landscape, or RNA polymerase II to study the transcription landscape.

SNP detection

Identifying single nucleotide polymorphisms among alleles within or between populations. Several applications of microarrays make use of SNP detection, including genotyping, forensic analysis, measuring predisposition to disease, identifying drug candidates, evaluating germline mutations in individuals or somatic mutations in cancers, assessing loss of heterozygosity, and genetic linkage analysis.

Alternative splicing detection

An exon junction array design uses probes specific to the expected or potential splice sites of predicted exons for a gene. Its density, or coverage, is intermediate between that of a typical gene expression array (with 1 to 3 probes per gene) and that of a genomic tiling array (with hundreds or thousands of probes per gene). It is used to assay the expression of alternative splice forms of a gene. Exon arrays have a different design, employing probes designed to detect each individual exon of known or predicted genes, and can be used for detecting different splicing isoforms.

Tiling array

Genome tiling arrays consist of overlapping probes designed to densely represent a genomic region of interest, sometimes as large as an entire human chromosome. The purpose is to empirically detect expression of transcripts or alternatively spliced forms that may not have been previously known or predicted.
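
How overlapping probes cover a region can be shown with a short sketch. The 25-base probe length echoes the short probes mentioned earlier, but the 10-base step, the function name `tile_probes`, and the toy sequence are assumptions chosen for illustration; real designs also filter probes for uniqueness and melting temperature, which is not attempted here.

```python
def tile_probes(sequence, probe_length=25, step=10):
    """Return (start, probe) pairs that tile `sequence` with overlap.

    Consecutive probes overlap by probe_length - step bases, so `step`
    must be smaller than `probe_length` for the tiling to overlap.
    Any tail shorter than a full probe is left uncovered in this sketch.
    """
    if not 0 < step < probe_length:
        raise ValueError("step must be positive and smaller than probe_length")
    last_start = len(sequence) - probe_length
    return [(start, sequence[start:start + probe_length])
            for start in range(0, last_start + 1, step)]


region = "ACGT" * 15  # hypothetical 60-base region of interest
tiles = tile_probes(region, probe_length=25, step=10)
for start, probe in tiles:
    print(start, probe)
```

A smaller step gives denser coverage and finer resolution of transcript boundaries at the cost of more probes per region.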

splicing of a transcript, leading to probes with no match to the target molecule; or cross-hybridization due to, for example, a spot within a spot-set that represents an evolutionarily conserved gene segment, which hybridizes with nontarget molecules derived from various gene-family members.

The fourth level of specificity is that of the microarray, in which a variable number of spot-sets may exhibit different forms of hybridization with target sequences: perfect hybridization (i.e., all target molecules are hybridized to their representative spot-sets, and all spot-sets are hybridized to the target molecules they represent); partial hybridization in either direction; no hybridization (i.e., target molecules are not hybridized to any spot-set, or spot-sets do not match any target molecules); or cross-hybridization (e.g., target molecules of different genes hybridize to the same spot-set, or target molecules of a particular gene hybridize to several different genes' spot-sets). These different forms may exist for a large number of different target molecules or spot-sets.

Slide scanning

A laser excites each spot, and the fluorescent emission is collected by a photomultiplier tube (PMT) coupled to a confocal microscope. We obtain two images in which grey levels represent the fluorescence intensities read. If we replace the grey scale with a green scale for the first image and a red scale for the second, then superimposing the two images gives a single image composed of spots ranging from green (where only DNA from the first condition is bound) to red (where only DNA from the second condition is bound), passing through yellow (where DNA from both conditions is bound in equal amounts).
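
The superposition of the two scans can be sketched as follows. The example assumes 16-bit raw intensities (0 to 65535) and a simple linear mapping to 8-bit colour channels; both are assumptions, as are the function name `overlay` and the three toy pixel values.

```python
def overlay(green_image, red_image, max_intensity=65535):
    """Combine two grey-scale scans into one RGB image.

    Each image is a list of rows of raw intensities (0..max_intensity).
    The first scan drives the green channel and the second the red
    channel, so equal signals in both scans appear yellow.
    """
    def to_byte(value):
        clipped = min(max(value, 0), max_intensity)
        return round(255 * clipped / max_intensity)

    return [
        [(to_byte(r), to_byte(g), 0) for g, r in zip(green_row, red_row)]
        for green_row, red_row in zip(green_image, red_image)
    ]


# Hypothetical 1 x 3 scan: a green-only, an equal, and a red-only spot.
green = [[65535, 40000, 0]]
red = [[0, 40000, 65535]]
pixels = overlay(green, red)
print(pixels)
```

Each output pixel is an (R, G, B) triple with the blue channel left at zero, which is why the composite image contains only shades of green, yellow, and red.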

Data analysis

We now have two microarray images, from which we have to estimate the number of DNA molecules in each experimental condition. To do so, we measure the signal in the green dye emission wavelength and the signal in the red dye emission wavelength. We then normalise these amounts according to various parameters (the amount of yeast in each culture condition, the emission power of each dye, etc.). We assume that the amount of fluorescent DNA bound is proportional to the amount of mRNA present in each cell at the beginning, and we calculate the red/green fluorescence ratio. If this ratio is greater than 1 (red on the image), gene expression is greater in the second experimental condition; if this ratio is smaller than 1 (green on the image), gene expression is greater in the first condition. We can visualize these differences in expression using software such as ArrayPlot, which was developed in the laboratory (cf. image below). From the list of spot intensities, this software displays the red intensity of each spot as a function of its green intensity.
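
The ratio calculation can be sketched in a few lines. The normalisation parameters are not specified in detail above, so this example substitutes a single, commonly used simplification: the red channel is rescaled so that both channels have the same total signal. The log2 transform, the two-fold threshold, the function names, and the spot intensities are likewise assumptions made for illustration.

```python
from math import log2


def log_ratios(green, red):
    """Return log2(red/green) per spot after total-intensity scaling.

    green, red: dicts mapping spot id -> positive, background-corrected
    intensity for the first and second condition respectively.
    """
    # Scale the red channel so both channels have the same total signal.
    scale = sum(green.values()) / sum(red.values())
    return {spot: log2(red[spot] * scale / green[spot]) for spot in green}


def call(log_ratio, threshold=1.0):
    """Label a spot; the two-fold threshold is an arbitrary assumption."""
    if log_ratio >= threshold:
        return "higher in condition 2 (red)"
    if log_ratio <= -threshold:
        return "higher in condition 1 (green)"
    return "unchanged (yellow)"


# Hypothetical intensities; the red channel is twice as bright overall.
green = {"g1": 1000.0, "g2": 4000.0, "g3": 500.0, "g4": 2500.0}
red = {"g1": 2000.0, "g2": 2000.0, "g3": 4000.0, "g4": 8000.0}

ratios = log_ratios(green, red)
for spot, value in ratios.items():
    print(spot, f"{value:+.2f}", call(value))
```

On the log2 scale a ratio of 1 becomes 0, so induction and repression of the same magnitude are symmetric around zero, which makes thresholds easier to apply than on the raw ratio.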

Fabrication

Microarrays can be manufactured in different ways, depending on the number of probes under examination, costs, customization requirements, and the type of scientific question being asked. Arrays from commercial vendors may have as few as 10 probes or up to 2.1 million micrometre-scale probes.



Surface engineering

The first step of DNA microarray fabrication involves surface engineering of a substrate in order to obtain desirable surface properties for the application of interest Optimal surface properties are those which produce high signal to noise ratios for the DNA targets of interest Generally this involves maximizing the probe surface density and activity while minimizing the non-specific binding of the targets of interest Methods of surface engineering vary depending on the platform material design and application

Spotted vs oligonucleotide arrays

Microarrays can be fabricated using a variety of technologies including printing with fine-pointed pins onto glass slides photolithography using pre-made masks photolithography using dynamic micromirror devices ink-jet printing or electrochemistry on microelectrode arrays

In spotted microarrays the probes are oligonucleotides cDNA or small fragments of PCR products that correspond to mRNAs The probes are synthesized prior to deposition on the array surface and are then spotted onto glass A common approach utilizes an array of fine pins or needles controlled by a robotic arm that is dipped into wells containing DNA probes and then depositing each probe at designated locations on the array surface The resulting grid of probes represents the nucleic acid profiles of the prepared probes and is ready to receive complementary cDNA or cRNA targets derived from experimental or clinical samples This technique is used by research scientists around the world to produce in-house printed microarrays from their own labs These arrays may be easily customized for each experiment because researchers can choose the probes and printing locations on the arrays synthesize the probes in their own lab (or collaborating facility) and spot the arrays They can then generate their own labeled samples for hybridization hybridize the samples to the array and finally scan the arrays with their own equipment This provides a relatively low-cost microarray that may be customized for each study and avoids the costs of purchasing often more expensive commercial arrays that may represent vast numbers of genes that are not of interest to the investigator Publications exist which indicate in-house spotted microarrays may not provide the same level of sensitivity compared to commercial oligonucleotide arrays possibly owing to the small batch sizes and reduced printing efficiencies when compared to industrial manufactures of oligo arrays

In oligonucleotide microarrays the probes are short sequences designed to match parts of the sequence of known or predicted open reading frames Although oligonucleotide probes are often used in spotted microarrays the term oligonucleotide array most often refers to a specific technique of manufacturing Oligonucleotide arrays are produced by printing short oligonucleotide sequences designed to represent a single gene or family of gene splice-variants by synthesizing this sequence directly onto the array surface instead of depositing intact sequences Sequences may be longer (60-mer probes such as the Agilent design) or shorter (25-mer probes produced by Affymetrix) depending on the desired purpose longer probes are more specific to individual target genes shorter probes may be spotted in higher density across the array and are cheaper to manufacture One technique used to produce oligonucleotide arrays include photolithographic synthesis (Agilent and Affymetrix) on a silica substrate where light and light-sensitive masking agents are used to build a sequence one nucleotide at a time across the entire array Each applicable probe is selectively unmasked prior to bathing the array in a solution of a single nucleotide then a masking reaction takes place and the next set of probes are unmasked in preparation for a different nucleotide exposure After many repetitions the sequences of every probe become fully constructed More recently Maskless Array Synthesis from NimbleGen Systems has combined flexibility with large numbers of probes

Two-channel vs one-channel detection

Diagram of typical dual-colour microarray experiment

Two-color microarrays or two-channel microarrays are typically hybridized with cDNA prepared from two samples to be compared (eg diseased tissue versus healthy tissue) and that are labeled with two different fluorophores Fluorescent dyes commonly used for cDNA labelling include Cy3 which has a fluorescence emission wavelength of 570 nm (corresponding to the green part of the light spectrum) and Cy5 with a fluorescence emission wavelength of 670 nm (corresponding to the red part of the light spectrum) The two Cy-labelled cDNA samples are mixed and hybridized to a single microarray that is then scanned in a microarray scanner to visualize fluorescence of the two fluorophores after excitation with a laser beam of a defined wavelength Relative intensities of each fluorophore may then be used in ratio-based analysis to identify up-regulated and down-regulated genes

Oligonucleotide microarrays often contain control probes designed to hybridize with RNA spike-ins The degree of hybridization between the spike-ins and the control probes is used to normalize the hybridization measurements for the target probes Although absolute levels of gene expression may be determined in the two-color array the relative differences in expression among different spots within a sample and between samples is the preferred method of data analysis for the two-color system Examples of providers for such microarrays includes Agilent with their Dual-Mode platform Eppendorf with their DualChip platform for fluorescence labeling and TeleChem International with Arrayit

In single-channel microarrays or one-color microarrays the arrays are designed to give estimations of the absolute levels of gene expression Therefore the comparison of two conditions requires two separate single-dye hybridizations As only a single dye is used the data collected represent absolute values of gene expression These may be compared to other genes within a sample or to reference normalizing probes used to calibrate data across the entire array and across multiple arrays Three popular single-channel systems

are the Affymetrix Gene Chip the Applied Microarrays CodeLink arrays and the Eppendorf DualChip amp Silverquant One strength of the single-dye system lies in the fact that an aberrant sample cannot affect the raw data derived from other samples because each array chip is exposed to only one sample (as opposed to a two-color system in which a single low-quality sample may drastically impinge on overall data precision even if the other sample was of high quality) Another benefit is that data are more easily compared to arrays from different experiments the absolute values of gene expression may be compared between studies conducted months or years apart A drawback to the one-color system is that when compared to the two-color system twice as many microarrays are needed to compare samples within an experiment

Expression profile clustering

Then we can try to gather genes that share the same expression profile on several experiments This clustering can be done gradually as for phylogenetic analysis which consist in calculating similarity criteria between expression profiles and gather the most similar ones We can also use more complex techniques as principal component analysis or neuronal networks

In the end, hierarchical clustering is usually displayed as a matrix in which each column represents one experiment and each row a gene. Ratios are displayed on a colour scale going from green (repressed genes) to red (induced genes).
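The stepwise grouping described above can be sketched in a few lines of Python. This is a minimal illustration only: the use of one minus the Pearson correlation as the dissimilarity, average linkage between clusters, and the gene names and log-ratio values are assumptions chosen for the example, not choices stated in the text.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (must not be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def hierarchical_clusters(profiles, n_clusters):
    """Merge the two most similar clusters until n_clusters remain."""
    clusters = [[gene] for gene in profiles]

    def distance(c1, c2):
        # Average linkage: mean (1 - correlation) over all cross-cluster pairs.
        pairs = [(a, b) for a in c1 for b in c2]
        total = sum(1 - pearson(profiles[a], profiles[b]) for a, b in pairs)
        return total / len(pairs)

    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: distance(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


# Hypothetical log-ratios for four genes across four experiments.
profiles = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [1.1, 2.1, 2.9, 4.2],
    "gene_c": [4.0, 3.0, 2.0, 1.0],
    "gene_d": [3.9, 3.2, 1.8, 1.1],
}
result = hierarchical_clusters(profiles, n_clusters=2)
print(sorted(sorted(c) for c in result))
```

Here the two rising profiles end up in one cluster and the two falling profiles in the other. Recording the order of the merges, rather than stopping at a fixed number of clusters, would give the tree that is usually drawn beside the coloured matrix.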

Uses and types

Arrays of DNA can be spatially arranged, as in the commonly known gene chip (also called genome chip, DNA chip or gene array), or can be specific DNA sequences labelled such that they can be independently identified in solution. The traditional solid-phase array is a collection of microscopic DNA spots attached to a solid surface, such as a glass, plastic or silicon biochip. The affixed DNA segments are known as probes (although some sources use different terms, such as reporters). Thousands of them can be placed in known locations on a single DNA microarray.

DNA microarrays can be used to detect DNA (as in comparative genomic hybridization) or to detect RNA (most commonly as cDNA after reverse transcription) that may or may not be translated into proteins. The process of measuring gene expression via cDNA is called expression analysis or expression profiling.

Since an array can contain tens of thousands of probes, a microarray experiment can accomplish that many genetic tests in parallel. Therefore, arrays have dramatically accelerated many types of investigation.

Applications include the following, each given as a technology or application followed by a synopsis:

Gene expression profiling

In an mRNA or gene expression profiling experiment, the expression levels of thousands of genes are simultaneously monitored to study the effects of certain treatments, diseases, and developmental stages on gene expression. For example, microarray-based gene expression profiling can be used to identify genes whose expression is changed in response to pathogens or other organisms by comparing gene expression in infected cells or tissues to that in uninfected ones.

Comparative genomic hybridization

Assessing genome content in different cells or closely related organisms

Chromatin immunoprecipitation on Chip

DNA sequences bound to a particular protein can be isolated by immunoprecipitating that protein (ChIP). These fragments can then be hybridized to a microarray (such as a tiling array), allowing the determination of protein binding site occupancy throughout the genome. Example proteins to immunoprecipitate are histone modifications (H3K27me3, H3K4me2, H3K9me3, etc.), Polycomb-group proteins (PRC2:Suz12, PRC1:YY1) and trithorax-group proteins (Ash1) to study the epigenetic landscape, or RNA Polymerase II to study the transcription landscape.

SNP detection

Identifying single nucleotide polymorphisms among alleles within or between populations. Several applications of microarrays make use of SNP detection, including genotyping, forensic analysis, measuring predisposition to disease, identifying drug candidates, evaluating germline mutations in individuals or somatic mutations in cancers, assessing loss of heterozygosity, and genetic linkage analysis.

Alternative splicing detection

An exon junction array design uses probes specific to the expected or potential splice sites of predicted exons for a gene. Its density, or coverage, is intermediate between that of a typical gene expression array (with 1-3 probes per gene) and that of a genomic tiling array (with hundreds or thousands of probes per gene). It is used to assay the expression of alternative splice forms of a gene. Exon arrays have a different design, employing probes designed to detect each individual exon for known or predicted genes, and can be used for detecting different splicing isoforms.

Tiling array

Genome tiling arrays consist of overlapping probes designed to densely represent a genomic region of interest, sometimes as large as an entire human chromosome. The purpose is to empirically detect expression of transcripts or alternatively spliced forms which may not have been previously known or predicted.
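To make the idea of overlapping probes concrete, the following Python sketch cuts a region into fixed-length probes whose start positions advance by a step smaller than the probe length. The function name, the 20-nucleotide region, and the probe length and step are hypothetical values chosen for readability; actual tiling designs use much longer probes and regions.

```python
def tile_probes(sequence, probe_length, step):
    """Return (start, probe) pairs covering the sequence.

    Consecutive probes overlap by probe_length - step bases when
    step is smaller than probe_length.
    """
    if probe_length <= 0 or step <= 0:
        raise ValueError("probe_length and step must be positive")
    last_start = len(sequence) - probe_length
    return [
        (start, sequence[start:start + probe_length])
        for start in range(0, last_start + 1, step)
    ]


# Hypothetical 20-nucleotide region tiled with 8-mers every 4 bases.
region = "ACGTACGTACGTACGTACGT"
probes = tile_probes(region, probe_length=8, step=4)
for start, probe in probes:
    print(start, probe)
```

With these settings, each probe shares four bases with its neighbour, so every position in the interior of the region is covered by two probes.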

In spotted microarrays, the probes are oligonucleotides, cDNA or small fragments of PCR products that correspond to mRNAs. The probes are synthesized prior to deposition on the array surface and are then spotted onto glass. A common approach utilizes an array of fine pins or needles controlled by a robotic arm that is dipped into wells containing DNA probes and then deposits each probe at designated locations on the array surface. The resulting grid of probes represents the nucleic acid profiles of the prepared probes and is ready to receive complementary cDNA or cRNA targets derived from experimental or clinical samples.

This technique is used by research scientists around the world to produce in-house printed microarrays in their own labs. These arrays may be easily customized for each experiment, because researchers can choose the probes and printing locations on the arrays, synthesize the probes in their own lab (or collaborating facility), and spot the arrays. They can then generate their own labeled samples for hybridization, hybridize the samples to the array, and finally scan the arrays with their own equipment. This provides a relatively low-cost microarray that may be customized for each study, and avoids the costs of purchasing often more expensive commercial arrays that may represent vast numbers of genes that are not of interest to the investigator. Publications exist which indicate that in-house spotted microarrays may not provide the same level of sensitivity as commercial oligonucleotide arrays, possibly owing to the small batch sizes and reduced printing efficiencies when compared to industrial manufacture of oligo arrays.

In oligonucleotide microarrays, the probes are short sequences designed to match parts of the sequence of known or predicted open reading frames. Although oligonucleotide probes are often used in spotted microarrays, the term oligonucleotide array most often refers to a specific technique of manufacturing. Oligonucleotide arrays are produced by printing short oligonucleotide sequences designed to represent a single gene or family of gene splice-variants, synthesizing each sequence directly onto the array surface instead of depositing intact sequences. Sequences may be longer (60-mer probes such as the Agilent design) or shorter (25-mer probes produced by Affymetrix) depending on the desired purpose: longer probes are more specific to individual target genes, while shorter probes may be spotted in higher density across the array and are cheaper to manufacture.

One technique used to produce oligonucleotide arrays is photolithographic synthesis (Agilent and Affymetrix) on a silica substrate, where light and light-sensitive masking agents are used to build a sequence one nucleotide at a time across the entire array. Each applicable probe is selectively unmasked prior to bathing the array in a solution of a single nucleotide; then a masking reaction takes place and the next set of probes is unmasked in preparation for a different nucleotide exposure. After many repetitions, the sequence of every probe is fully constructed. More recently, Maskless Array Synthesis from NimbleGen Systems has combined flexibility with large numbers of probes.
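The unmask-and-expose cycle described above can be modelled very roughly in Python. In this sketch each step exposes the whole array to one nucleotide, and only the probes whose next required base matches are treated as unmasked and extended. The fixed cyclic exposure order A, C, G, T, the function name, and the three short probe sequences are assumptions for illustration; the model ignores the chemistry of masking and deprotection entirely.

```python
def synthesize(probes, flow_order="ACGT"):
    """Simulate building all probes in parallel, one nucleotide exposure per step.

    Returns the finished sequences and the number of exposure steps used.
    """
    if any(base not in flow_order for probe in probes for base in probe):
        raise ValueError("probe contains a base outside the flow order")
    grown = ["" for _ in probes]
    steps = 0
    while any(g != p for g, p in zip(grown, probes)):
        base = flow_order[steps % len(flow_order)]
        for i, target in enumerate(probes):
            position = len(grown[i])
            # "Unmask" only probes whose next base is the one being supplied.
            if position < len(target) and target[position] == base:
                grown[i] += base
        steps += 1
    return grown, steps


# Hypothetical 3-mer probes; real probes are 25-mers or longer.
probes = ["ACG", "CAT", "GGA"]
grown, steps = synthesize(probes)
print(grown, steps)
```

For these three probes the simulation needs nine exposure steps, more than the three bases in each probe, because a probe can only grow on the steps that supply its next base. An upper bound under this cyclic scheme is four steps per base of the longest probe.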

Two-channel vs one-channel detection

Diagram of typical dual-colour microarray experiment

Two-color microarrays, or two-channel microarrays, are typically hybridized with cDNA prepared from two samples to be compared (e.g. diseased tissue versus healthy tissue) that are labeled with two different fluorophores. Fluorescent dyes commonly used for cDNA labelling include Cy3, which has a fluorescence emission wavelength of 570 nm (corresponding to the green part of the light spectrum), and Cy5, with a fluorescence emission wavelength of 670 nm (corresponding to the red part of the light spectrum). The two Cy-labelled cDNA samples are mixed and hybridized to a single microarray, which is then scanned in a microarray scanner to visualize fluorescence of the two fluorophores after excitation with a laser beam of a defined wavelength. Relative intensities of each fluorophore may then be used in ratio-based analysis to identify up-regulated and down-regulated genes.
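The ratio-based analysis mentioned above is commonly carried out on a log2 scale, so that a doubling and a halving are symmetric around zero. The Python sketch below is a minimal example under stated assumptions: the Cy5 channel is taken to hold the test sample and Cy3 the reference, a twofold change (log2 ratio of 1) is used as the cut-off, and the gene names and intensities are invented. None of these choices comes from the text.

```python
import math


def log2_ratio(cy5, cy3, floor=1.0):
    """log2(Cy5 / Cy3), flooring intensities to avoid taking the log of zero."""
    return math.log2(max(cy5, floor) / max(cy3, floor))


def classify(ratio, threshold=1.0):
    """Label a spot by its log2 ratio; threshold=1.0 means a twofold change."""
    if ratio >= threshold:
        return "up"
    if ratio <= -threshold:
        return "down"
    return "unchanged"


# Hypothetical background-corrected intensities per spot: (Cy5, Cy3).
spots = {
    "gene_1": (4000.0, 1000.0),
    "gene_2": (500.0, 2000.0),
    "gene_3": (1200.0, 1000.0),
}
calls = {
    gene: classify(log2_ratio(cy5, cy3))
    for gene, (cy5, cy3) in spots.items()
}
print(calls)
```

In practice the two channels are normalized against each other before ratios are taken, and replicate spots and statistical tests are used instead of a single fixed cut-off.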
