Source: gcp21.org/Tanzania/MoragFerguson3.pdf (posted 31-May-2020)

Genetic Fingerprinting

Introduction

Unique identification

Purpose of Fingerprinting in Germplasm Curation

• Characterisation of ‘type-specimen’
• Confirmation of identity
• To identify variants within a ‘variety’
• Identification of duplicates
• Study diversity
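
The "identification of duplicates" purpose can be illustrated with a minimal sketch that compares SNP profiles pairwise. Everything here is an assumption made for illustration: the accession names, the genotype calls, the `N` code for a missing call and the 5% mismatch threshold are hypothetical, not values from the presentation.

```python
from itertools import combinations

MISSING = "N"  # hypothetical code for a failed SNP call


def mismatch_rate(a, b):
    """Fraction of SNPs that differ, ignoring SNPs missing in either profile."""
    compared = [(x, y) for x, y in zip(a, b) if MISSING not in (x, y)]
    if not compared:
        return None  # no shared calls, so nothing can be concluded
    return sum(x != y for x, y in compared) / len(compared)


def find_duplicates(profiles, max_mismatch=0.05):
    """Return name pairs whose profiles differ at no more than max_mismatch of shared SNPs."""
    pairs = []
    for (n1, p1), (n2, p2) in combinations(sorted(profiles.items()), 2):
        rate = mismatch_rate(p1, p2)
        if rate is not None and rate <= max_mismatch:
            pairs.append((n1, n2))
    return pairs


# Toy profiles: one genotype call per SNP, in the same SNP order for every accession.
profiles = {
    "accession_A": ["AA", "AG", "CC", "TT", "GG"],
    "accession_B": ["AA", "AG", "CC", "TT", "GG"],
    "accession_C": ["AA", "GG", "CT", "TT", "N"],
}
print(find_duplicates(profiles))
```

A small non-zero threshold is used rather than exact equality because genotyping errors can make true duplicates differ at a few SNPs; the right threshold would have to be calibrated on real data.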

Use the genetic code to fingerprint

How to read the genetic code?

• Gregor Mendel published in 1866
• Enzymes (isozymes)
• DNA:
  – Non-PCR based (RFLPs, RFLP-VNTR)
  – PCR based (1983) – arbitrary primed (RAPD, AFLP)
  – PCR based – site-targeted PCR (SSR, STS)
  – Sequencing (SNP)

[Figure panels: AFLP, RFLP, SSR]

Single Nucleotide Polymorphism

Detecting SNPs

• ‘Chip’ based – use of specific primers
  – Illumina GoldenGate (96, 384 and 1536 SNPs)
  – Affymetrix chips (over 100,000 SNPs)
  – KBioSciences (flexible)
• Sequencing

All DNA sequencing is based on the principles of DNA synthesis

• A DNA template
• A primer to initiate
• An enzyme to add new nucleotides
• A way to record which nucleotide is added

Advances in Sequencing Technology

• Sanger sequencing
  – The reaction occurs in the tube
  – The gel/machine simply reads out the results
  – Limited by physical capabilities of electrophoresis
  – Requires physical space to separate by size
  – This limits capacity & speed of a sequencing machine
• In Next Generation Sequencing (NGS):
  – The reaction occurs in the machine
  – Read the DNA sequence as it is generated
  – Growing length of DNA strand is now represented in time, not in physical separation
  – Allows much higher capacity
  – Plus very high resolution imaging

Using sequencing for SNP genotyping

Options:
• Whole genome re-sequencing (too much sequencing and bioinformatics)
• Reduced Representation Genomic sequencing
  – Genotyping-by-sequencing (GBS)
  – RADSeq

Genotyping-by-sequencing (GBS)

A reduced-representation approach.
• Restriction enzyme used to generate many fragments of genomic DNA, which are then sequenced. Only a subset of SNPs is sampled from each individual, so fewer reads are needed per individual, allowing for multiplexing.
• Restriction digest ensures that the same sites are sampled from each individual.

[Figure: Reference Genome Sequence, Genotype 1, Genotype 2]
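
The point that a restriction digest samples the same sites from each individual can be sketched with an in-silico digest. This is a toy illustration, not the GBS pipeline itself: the recognition site used for ApeKI (GCWGC, with W = A or T, cut after the first G) is an assumption not stated on the slides, and both sequences are made up.

```python
import re

# Assumption: ApeKI recognises GCWGC (W = A or T) and cuts after the first G.
# The lookahead lets overlapping sites be found as well.
SITE = re.compile(r"(?=GC[AT]GC)")
CUT_OFFSET = 1  # cut position relative to the start of the site


def cut_positions(seq):
    """Return the 0-based indices at which the sequence is cut."""
    return [m.start() + CUT_OFFSET for m in SITE.finditer(seq.upper())]


def digest(seq):
    """Split the sequence into the fragments produced by cutting at every site."""
    bounds = [0] + cut_positions(seq) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


# Two hypothetical individuals that differ by one SNP (index 8: A vs G)
# lying between the restriction sites, not inside them.
genotype_1 = "TTGCAGCAAAGCTGCTT"
genotype_2 = "TTGCAGCAGAGCTGCTT"

print(cut_positions(genotype_1), digest(genotype_1))
print(cut_positions(genotype_2), digest(genotype_2))
```

Because the SNP does not alter a recognition site, both individuals are cut at the same positions, so the fragment carrying the SNP is sampled from both and the two alleles can be compared directly.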

Relative Genotyping Costs

Technology      Number of SNPs                           Cost/genotype ($)
GoldenGate      1536                                     70
KBioSciences    500                                      64
KBioSciences    300                                      40
GBS (Cornell)   4500 upwards (now 15,000 using ApeKI)    53 (48 plex), 38 (96 plex)

Includes $8 for bioinformatics service. IGD website: $20 for 96 plex and $10 for 386 plex.
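
One way to compare the platforms in the table is cost per SNP rather than cost per genotype. The short sketch below does that arithmetic using only the figures from the table; the labels are made up for readability, and for GBS the lower bound of 4500 SNPs is assumed, so its per-SNP cost is, if anything, overstated.

```python
# (label, number of SNPs, cost per genotype in $), copied from the table above.
# For GBS the lower bound of 4500 SNPs is assumed.
PLATFORMS = [
    ("GoldenGate", 1536, 70),
    ("KBioSciences (500 SNPs)", 500, 64),
    ("KBioSciences (300 SNPs)", 300, 40),
    ("GBS, 48 plex", 4500, 53),
    ("GBS, 96 plex", 4500, 38),
]


def cost_per_snp(cost_per_genotype, n_snps):
    """Cost in dollars of one SNP call for one genotype."""
    return cost_per_genotype / n_snps


# Rank the platforms from cheapest to most expensive per SNP.
ranked = sorted(PLATFORMS, key=lambda p: cost_per_snp(p[2], p[1]))
for name, n_snps, cost in ranked:
    print(f"{name:26s} {100 * cost_per_snp(cost, n_snps):6.2f} cents per SNP")
```

Cost per SNP is only one of the factors; turn-around time and whether the same SNPs are recovered in every run are not captured by this calculation.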

SNPs in Cassava

• Estimated one SNP every 121 bp
• Cassava genome estimated to be ~770 Mb
• Approx. 6.3 million SNPs in cassava
• Ideal for fingerprinting
• How do we visualise?
• Should we sub-sample and if so how?
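
The SNP count follows from the first two figures by simple division. A quick check, using only the numbers on the slide (the variable names are illustrative):

```python
genome_size_bp = 770_000_000  # ~770 Mb, as estimated on the slide
bp_per_snp = 121              # one SNP every 121 bp

# Expected number of SNPs if they occur at that average density genome-wide.
estimated_snps = genome_size_bp / bp_per_snp
print(f"{estimated_snps / 1e6:.2f} million SNPs")  # prints "6.36 million SNPs"
```

This is consistent with the slide's figure of approximately 6.3 million, given that both inputs are themselves estimates.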

SNPs in Cassava (cont.)

Current SNP genotyping for diversity assessment in cassava

• Primer-specific SNPs
  – GoldenGate (960 genotypes)
  – KBioSciences (96 genotypes)
• GBS
  – 700 genotypes from breeding program and genebank (CRP)
  – 650 from genetic gain (Next/Gen)
• RADSeq
  – 577 (CIAT, CRP)

Current SNP genotyping for genetic linkage mapping in cassava

[Figure: linkage maps for linkage group 2 (panels labelled LG2, 2_P1 and 2_P2). SNP markers are named by scaffold and position, e.g. s06715:142408 at 0.0, s03823:20746 and s03823:37685 at 0.7, s06715:198298 at 1.4; the LG2 panel runs from 0.0 to 147.7.]

• Linkage map – order of markers on chromosomes
• Currently 3500 SNP markers
• It is possible to select a sub-set of markers evenly distributed across genome for primer-specific SNP genotyping
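
Selecting an evenly distributed sub-set of markers from a linkage map could be done in many ways; the sketch below shows one simple greedy approach for a single linkage group. It is not the method used in the presentation: the function name, the marker names and the map positions are all hypothetical.

```python
def pick_evenly_spaced(markers, n):
    """Pick n markers closest to n evenly spaced target positions.

    markers: list of (name, map_position) tuples for one linkage group.
    """
    if n < 2 or n > len(markers):
        raise ValueError("n must be between 2 and the number of markers")
    remaining = sorted(markers, key=lambda m: m[1])
    start, end = remaining[0][1], remaining[-1][1]
    step = (end - start) / (n - 1)
    chosen = []
    for i in range(n):
        target = start + i * step
        # Take the unused marker nearest to the target position.
        best = min(remaining, key=lambda m: abs(m[1] - target))
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Hypothetical markers on one linkage group (name, map position).
markers = [
    ("m1", 0.0), ("m2", 0.7), ("m3", 1.4), ("m4", 2.9), ("m5", 12.4),
    ("m6", 20.7), ("m7", 36.9), ("m8", 49.3), ("m9", 74.3), ("m10", 100.0),
]
print([name for name, _ in pick_evenly_spaced(markers, 5)])
```

With clustered markers, as at the start of this toy map, the greedy choice skips most of the cluster and keeps only the marker nearest each target, which is the behaviour wanted for a compact primer-specific panel.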

Factors to consider

• Long-term availability and applicability of technology
• Turn-around time
• Cost
• How much data do we need to:
  – fingerprint a type-specimen
  – determine identity
  – identify variants within a ‘variety’
  – study diversity

Relative Genotyping Costs

Technology      Number of SNPs                           Cost/genotype ($)
GoldenGate      1536                                     70
KBioSciences    500                                      64
KBioSciences    300                                      40
GBS (Cornell)   4500 upwards (now 15,000 using ApeKI)    53 (48 plex), 38 (96 plex)

Includes $8 for bioinformatics service. IGD website: $20 for 96 plex and $10 for 386 plex.

My conclusion

• Go with GBS
• We should have a standard set of approx. 300 SNPs, evenly spaced and in different regions (coding, non-coding) of the genome, for KBioSciences genotyping. These must also be captured by GBS.
• You will need DNA extraction and quantification facilities