current microrna target identification methods are heavily ... · microrna target identi cation...

45
False positives in microRNA target identification Introduction Results Robustness of predicted targets Comparative genomics yield false positives Conclusion Acknowledgements Supplementary data Current microRNA target identification methods are heavily contaminated with false positives Herv´ e Seitz IGH (UMR 9002 CNRS and university of Montpellier), Montpellier, France November 8, 2018 This slideshow is available at: http://www.igh.cnrs.fr/equip/Seitz/Sifr2018.pdf

Upload: others

Post on 21-Oct-2020

9 views

Category:

Documents


0 download

TRANSCRIPT

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Current microRNA target identificationmethods are heavily contaminated with false

    positives

    Hervé Seitz

    IGH (UMR 9002 CNRS and university of Montpellier), Montpellier, France

    November 8, 2018

    This slideshow is available at:http://www.igh.cnrs.fr/equip/Seitz/Sifr2018.pdf

    http://www.igh.cnrs.fr/equip/Seitz/Sifr2018.pdf

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Introduction

    miRNA sequences are known (for the most part); can weidentify their targets ?

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Introduction

    Target geneTranscription mRNA Translation

    Protein

    mRNAdecay

    Translationalrepression

    miRNA

    miRNA sequences are known (for the most part); can weidentify their targets ?

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Introduction

    Ago protein+ miRNA

    Target geneTranscription mRNA Translation

    Protein

    mRNAdecay

    Translationalrepression

    miRNA sequences are known (for the most part); can weidentify their targets ?

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Introduction

    Ago protein+ miRNA

    Target geneTranscription mRNA Translation

    Protein

    mRNAdecay

    Translationalrepression

    miRNA sequences are known (for the most part); can weidentify their targets ?

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Introduction

    I Sequence complementarity to the miRNA.

    I Physically bind the miRNA effector complex.

    I mRNA and protein level respond to miRNA.

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Introduction

    Stringent

    Not stringent

    (illustration by D. Goodsell, Scripps Research Institute andRutgers University)

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Introduction

    Stringent

    Not stringent

    (illustration by D. Goodsell, Scripps Research Institute andRutgers University)

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Introduction

    Comparative genomics dogma:if it is conserved, it must be functional.

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Introduction

    Comparative genomics dogma:if it is conserved, it must be functional.

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Introduction

    Comparative genomics dogma:if it is conserved, it must be functional.

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Introduction

    .

    ||||||NNNNNNN NNNNNNNNNNNNNN5´ 3´miRNA:

    target:

    2 7

    −→ the “seed”

    Comparative genomics (if a seed match is conserved, it mustbe functional).

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Introduction

    .

    ||||||NNNNNNN NNNNNNNNNNNNNN5´ 3´miRNA:

    target:

    2 7

    −→ the “seed”

    Comparative genomics (if a seed match is conserved, it mustbe functional).

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Introduction

    .

    ||||||NNNNNNN NNNNNNNNNNNNNN5´ 3´miRNA:

    target:

    2 7

    −→ the “seed”

    Comparative genomics (if a seed match is conserved, it mustbe functional).

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Introduction

    I Experimental methods (CLIP and avatars): hundreds oftargets per miRNA.

    I Computational methods (seed matches, conserved inevolution): hundreds of targets per miRNA.

    miRNA-mediated repression is very modest (usually< 2-fold): lower than well tolerated fluctuations in geneexpression (e.g., haplosufficiency). Why have these sitesbeen conserved if they are not functional?

    CLIP and avatars

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Introduction

    I Experimental methods (CLIP and avatars): hundreds oftargets per miRNA.

    I Computational methods (seed matches, conserved inevolution): hundreds of targets per miRNA.

    miRNA-mediated repression is very modest (usually< 2-fold): lower than well tolerated fluctuations in geneexpression (e.g., haplosufficiency). Why have these sitesbeen conserved if they are not functional?

    CLIP and avatars

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Introduction

    I Experimental methods (CLIP and avatars): hundreds oftargets per miRNA.

    I Computational methods (seed matches, conserved inevolution): hundreds of targets per miRNA.

    miRNA-mediated repression is very modest (usually< 2-fold): lower than well tolerated fluctuations in geneexpression (e.g., haplosufficiency).

    Why have these sitesbeen conserved if they are not functional?

    CLIP and avatars

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Introduction

    I Experimental methods (CLIP and avatars): hundreds oftargets per miRNA.

    I Computational methods (seed matches, conserved inevolution): hundreds of targets per miRNA.

    miRNA-mediated repression is very modest (usually< 2-fold): lower than well tolerated fluctuations in geneexpression (e.g., haplosufficiency). Why have these sitesbeen conserved if they are not functional?

    CLIP and avatars

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Results

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Robustness of predicted targets

    Baek et al., 2008: quantification of miR-223-mediatedrepression in mouse neutrophils.

    Neutrophil isolation

    Blood collection

    RNA extraction

    cDNA labeling,

    array hybridization

    Blood collection

    Pooled blood

    Split in 5 replicates

    Neutrophil isolation

    RNA extraction

    cDNA labeling,

    array hybridization

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Robustness of predicted targets

    Baek et al., 2008: quantification of miR-223-mediatedrepression in mouse neutrophils.

    Neutrophil isolation

    Blood collection

    RNA extraction

    cDNA labeling,

    array hybridization

    Blood collection

    Pooled blood

    Split in 5 replicates

    Neutrophil isolation

    RNA extraction

    cDNA labeling,

    array hybridization

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Robustness of predicted targets

    Baek et al., 2008: quantification of miR-223-mediatedrepression in mouse neutrophils.

    Neutrophil isolation

    Blood collection

    RNA extraction

    cDNA labeling,

    array hybridization

    Blood collection

    Pooled blood

    Split in 5 replicates

    Neutrophil isolation

    RNA extraction

    cDNA labeling,

    array hybridization

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Robustness of predicted targets

    Baek et al., 2008: quantification of miR-223-mediatedrepression in mouse neutrophils.

    Neutrophil isolation

    Blood collection

    RNA extraction

    cDNA labeling,

    array hybridization

    Blood collection

    Pooled blood

    Split in 5 replicates

    Neutrophil isolation

    RNA extraction

    cDNA labeling,

    array hybridization

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Robustness of predicted targets

    p: probability that the difference between two individualmice is smaller than miRNA-guided repression

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Robustness of predicted targets

    p: probability that the difference between two individualmice is smaller than miRNA-guided repression

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Robustness of predicted targets

    p: probability that the difference between two individualmice is smaller than miRNA-guided repression

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Robustness of predicted targets

    p: probability that the difference between two individualmice is smaller than miRNA-guided repression

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Robustness of predicted targets

    For 150 predicted targets out of 192:inter-individual fluctuations across 5 wild-type mice exceedsmiRNA-mediated regulation (p-value < 0.05).

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Robustness of predicted targets

    For 150 predicted targets out of 192:inter-individual fluctuations across 5 wild-type mice exceedsmiRNA-mediated regulation (p-value < 0.05).

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Comparative genomics yield false positives

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Comparative genomics yield false positives

    Taeniopygia guttata

    Monodelphis domestica

    Mus musculus

    Homo sapiens

    Danio rerio

    Sus scrofa

    Bos taurus

    USP9X mRNA miR-134 site

    Sarcophilus harrisii

    Gallus gallus

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Comparative genomics yield false positives

    :|| || |||||||

    Taeniopygia guttata

    Monodelphis domestica

    Mus musculus

    Homo sapiens

    Danio rerio

    Sus scrofa

    Bos taurus

    GACAUAACCAGCAAUGAAGA-UUUU-AGUCUCA

    GGCAUGACACUG-AUUCAGG-AUUUCAGUCACA

    GGCAUGAC----AAUUCAGG-CGUUCAGUCACA

    GACAUGACCCUGAAUUCAGG-AGUUCAGUCACA

    AGCAUGACACUGAAUUCAGGAAUUUCAGUCACA

    AGCAUGACACUGAAUUCAGGAAUUUCAGUCACA

    GGCACUAUGCUGAGUUCUGGAACAACAGUCACA

    GGCACUAUGCUGAAUUCUGGAACAACAGUCACA

    GAUGUGACGCUGAAUUCUGAAGCCACAGUCACA

    USP9X mRNA miR-134 site

    Sarcophilus harrisii

    Gallus gallus

    GGGGAGACCAGUUGGUCAGUGU5´3´

    3´5´

    miR-134:

    : conserved in 8 or 9 species out of 9 : conserved in 6 or 7 species out of 9 : conserved in less than 6 species out of 9

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Comparative genomics yield false positives

    :|| || |||||||

    Taeniopygia guttata

    Monodelphis domestica

    Mus musculus

    Homo sapiens

    Danio rerio

    miR-134

    Sus scrofa

    Bos taurus

    GACAUAACCAGCAAUGAAGA-UUUU-AGUCUCA

    GGCAUGACACUG-AUUCAGG-AUUUCAGUCACA

    GGCAUGAC----AAUUCAGG-CGUUCAGUCACA

    GACAUGACCCUGAAUUCAGG-AGUUCAGUCACA

    AGCAUGACACUGAAUUCAGGAAUUUCAGUCACA

    AGCAUGACACUGAAUUCAGGAAUUUCAGUCACA

    GGCACUAUGCUGAGUUCUGGAACAACAGUCACA

    GGCACUAUGCUGAAUUCUGGAACAACAGUCACA

    GAUGUGACGCUGAAUUCUGAAGCCACAGUCACA

    USP9X mRNA miR-134 site

    Sarcophilus harrisii

    Gallus gallus

    GGGGAGACCAGUUGGUCAGUGU5´3´

    3´5´

    miR-134:

    : conserved in 8 or 9 species out of 9 : conserved in 6 or 7 species out of 9 : conserved in less than 6 species out of 9

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Comparative genomics yield false positives

    Hominidae

    Catarrhini

    Boreoeutheria

    Euteleostomi

    Comparison to prediction programs

    Effect of tree architecture

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Comparative genomics yield false positives

    Hominidae

    Catarrhini

    Boreoeutheria

    Euteleostomi

    3´ UTR seed matches:

    Clade-specific miRNA families

    Hominidae Catarrhini Boreo-eutheria

    Euteleostomi

    Prop

    ortio

    n of

    ove

    r-con

    serv

    ed s

    eed

    mat

    ches

    0.0

    0.2

    0.4

    0.6

    0.8

    1.0(n=10 miRNA

    families) (n=14 miRNAfamilies)

    (n=14 miRNAfamilies)

    (n=10 miRNAfamilies)

    Comparison to prediction programs

    Effect of tree architecture

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Conclusion

    Most known miRNA “targets” are probably not functionallytargeted.

    Observing a molecular does not mean that it has a biologicalfunction.

    Homeostasis attenuates changes of external origin: it alsobuffers internal changes.

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Conclusion

    Most known miRNA “targets” are probably not functionallytargeted.

    Observing a molecular event does not mean that it has abiological function.

    Homeostasis attenuates changes of external origin: it alsobuffers internal changes.

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Conclusion

    Most known miRNA “targets” are probably not functionallytargeted.

    Observing a molecular object does not mean that it has abiological function.

    Homeostasis attenuates changes of external origin: it alsobuffers internal changes.

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Conclusion

    Most known miRNA “targets” are probably not functionallytargeted.

    Observing a molecular object does not mean that it has abiological function.

    Homeostasis attenuates changes of external origin: it alsobuffers internal changes.

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Acknowledgements

    I. Busseau A. SergeevaL. Martinez N. Pinzón

    B. Li S. Mockly

    Collaborators: F. Apparailly, J. Presumey (INM, Montpellier)

    October 2019: CNRS � école thématique � on smallregulatory RNAs (miRNAs, siRNAs, piRNAs) in Sète.

    This slideshow is available at:http://www.igh.cnrs.fr/equip/Seitz/Sifr2018.pdf

    http://www.igh.cnrs.fr/equip/Seitz/Sifr2018.pdf

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Acknowledgements

    I. Busseau A. SergeevaL. Martinez N. Pinzón

    B. Li S. Mockly

    Collaborators: F. Apparailly, J. Presumey (INM, Montpellier)

    October 2019: CNRS � école thématique � on smallregulatory RNAs (miRNAs, siRNAs, piRNAs) in Sète.

    This slideshow is available at:http://www.igh.cnrs.fr/equip/Seitz/Sifr2018.pdf

    http://www.igh.cnrs.fr/equip/Seitz/Sifr2018.pdf

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Supplementary data

    CLIP and avatars

    Overconservation statistics

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Introduction

    Reverse transcription,amplification, sequencing

    mRNA

    Tag clusters

    Partial RNA digestion

    Immuno-precipitation

    Ago

    target RNA

    miRNA

    U.V. crosslinking

    Gel purification,proteolysis

    Adapter ligation,gel purification

    Return

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Over-conservation in prediction programs

    Predictions by TargetScan 7.0:

    Clade-specific miRNA families

    Hominidae Catarrhini Boreo-eutheria

    EuteleostomiProp

    ortio

    n of

    ove

    r-con

    serv

    ed s

    eed

    mat

    ches

    0.0

    0.2

    0.4

    0.6

    0.8

    1.0Predictions by miRanda (aug2010):

    Clade-specific miRNA families

    Hominidae Catarrhini Boreo-eutheria

    EuteleostomiProp

    ortio

    n of

    ove

    r-con

    serv

    ed s

    eed

    mat

    ches

    0.0

    0.2

    0.4

    0.6

    0.8

    1.0

    Predictions by PicTar2:

    Clade-specific miRNA families

    Prop

    ortio

    n of

    ove

    r-con

    serv

    ed s

    eed

    mat

    ches

    Hominidae Catarrhini Boreo-eutheria

    Euteleostomi0.0

    0.2

    0.4

    0.6

    0.8

    1.0

    Predictions by microT 5.0:

    Clade-specific miRNA families

    Hominidae Catarrhini Boreo-eutheria

    EuteleostomiProp

    ortio

    n of

    ove

    r-con

    serv

    ed s

    eed

    mat

    ches

    0.0

    0.2

    0.4

    0.6

    0.8

    1.0

    (conservationnot considered)

    Return

  • False positives inmicroRNA target

    identification

    Introduction

    Results

    Robustness of predictedtargets

    Comparative genomics yieldfalse positives

    Conclusion

    Acknowledgements

    Supplementarydata

    Effect of tree architecture

    Hominidae Catarrhini Boreoeutheria Euteleostomi

    Clade of interest

    0.0

    0.2

    0.4

    0.6

    0.8

    1.0

    Prop

    ortio

    n of

    see

    d m

    atch

    es c

    onse

    rved

    outs

    ide

    clad

    e of

    inte

    rest

    Hominidae-specific seedsCatarrhini-specific seedsBoreoeutheria-specific seedsEuteleostomi-specific seeds

    Non-seed hexamers3´ UTR seed matches:

    Return

    IntroductionResultsRobustness of predicted targetsComparative genomics yield false positives

    ConclusionAcknowledgementsSupplementary data

    anm0: 0.14: 0.13: 0.12: 0.11: 0.10: 0.9: 0.8: 0.7: 0.6: 0.5: 0.4: 0.3: 0.2: 0.1: 0.0: