Frédéric Choulet. A pseudomolecule of 774 Mb: the 3B experience. INRA GDEC – Clermont-Ferrand,...
TRANSCRIPT
3B: sequenced physical map
3B MTP-BAC sequencing
  #MTP BACs: 8,452
  #BAC pools: 922
  #Roche 8 kb MP lib.: 922
  bp coverage (Roche/454): 36x
  BAC-ends (Sanger): 42,551
  Whole Genome Prof. tags: 327,282
  Whole 3B shotgun (Illumina): 82x
Assembly and scaffolding
o Curation of the scaffolding: V. Barbe, S. Mangenot (Genoscope)
  Integration of BAC-end match positions
  Parsing of MP read positions
  [Figure: scaffolds scaff00001, scaff00013, scaff00024, scaff00008, scaff00011, scaff00007, scaff00005]
  3B-v1: 16,136 scaffolds, 1,040 Mb, 18% Ns
  3B-v3: 4,999 scaffolds, 992 Mb, 13% Ns
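The three figures reported for each assembly version (scaffold count, total size, % Ns) can be recomputed from the scaffold sequences. A minimal sketch, assuming the scaffolds are already loaded into a dictionary; the function name `assembly_stats` and the toy sequences are hypothetical and not part of the 3B pipeline:

```python
def assembly_stats(scaffolds):
    """Return (scaffold count, total length in bp, percent of N bases).

    `scaffolds` maps a scaffold name to its sequence string.
    """
    total = sum(len(seq) for seq in scaffolds.values())
    n_bases = sum(seq.upper().count("N") for seq in scaffolds.values())
    pct_n = 100.0 * n_bases / total if total else 0.0
    return len(scaffolds), total, pct_n


# Toy input for illustration only, not real 3B data.
toy = {"scaffA": "ACGTNNNNAC", "scaffB": "NNACGTACGT"}
count, total, pct_n = assembly_stats(toy)
print(f"{count} scaffolds, {total} bp, {pct_n:.0f}% Ns")
```

On real data the sequences would come from a FASTA parser, and the total would be reported in Mb rather than bp.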
o Gap filling and sequence error corrections: JM. Aury, A. Couloux (Genoscope)
  Illumina reads: whole 3B shotgun
  109,914 gaps filled
  126,290 bases corrected (error rate: 0.1%)
  3B-v4: 8% Ns (scaffold count and total size shown as "-")
o Redundancy removal and scaffold merging: S. Theil (INRA GDEC)
  scaffAssembler.pl
  [Figure: overlapping contigs ctg1 and ctg2 from Pool_A and Pool_B]
  3B-v4: 4,999 scaffolds, 992 Mb (redundancy: 160 Mb)
  3B-v443: 2,808 scaffolds, 833 Mb
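The reported sizes are consistent with each other: 992 Mb minus 160 Mb of redundancy is about 832 Mb, close to the 833 Mb of 3B-v443. The idea of merging two contigs that overlap at their ends can be sketched as below. This is an exact suffix/prefix match on toy sequences, given only as an illustration; it is not the method implemented in scaffAssembler.pl, and `merge_overlap` and `min_overlap` are hypothetical names.

```python
def merge_overlap(left, right, min_overlap=3):
    """Merge `left` and `right` when a suffix of `left` equals a prefix of `right`.

    Returns the merged sequence using the longest exact overlap of at
    least `min_overlap` bases, or None when no such overlap exists.
    """
    max_len = min(len(left), len(right))
    # Try the longest candidate overlap first, never shorter than 1 base.
    for k in range(max_len, max(min_overlap, 1) - 1, -1):
        if left[-k:] == right[:k]:
            return left + right[k:]
    return None


# Toy contigs sharing the 4-base end "TGCA".
ctg1 = "ACGTTGCA"
ctg2 = "TGCAGGTC"
merged = merge_overlap(ctg1, ctg2)
removed = len(ctg1) + len(ctg2) - len(merged)
print(merged, removed)
```

Real BAC-pool assemblies contain sequencing errors and repeats, so a production tool would rely on alignments with identity and length thresholds rather than exact string matches.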
Ordering scaffolds
o SNP discovery
  SureSelect® sequence capture (E. Paux, N. Cubizolles, E. Rey)
  52,265 baits (isbpProbeDesign.pl)
  DNA captured from 10 genotypes
  39,077 SNPs
  [Figure: bait positions relative to a gene and a TE]
o Genotyping of the mapping populations: 3,075 SNPs
  Genetic mapping (P. Sourdille)
  • Anchor map: 384 individuals, Cs x Renan
  + Neighbor map: 3,865 markers
  LD mapping (F. Balfourier)
  • 367 lines from a core collection
[Figure: chromosome 3B genetic map and LD map]
  genetic map: 0 cM to 133 cM, 366 bins
  44.8 cM: 152 scaffolds
  LD map: 19 LD blocks
  554 bins
pseudomolBuilder.pl
  pseudomolecule: 1,358 scaffolds, 774 Mb (93%)
  unlocalized: 1,450 scaffolds, 59 Mb (7%)
  [Figure: ordered scaffolds joined by runs of Ns]
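The pseudomolecule and unlocalized sets add up to the 3B-v443 totals (1,358 + 1,450 = 2,808 scaffolds; 774 + 59 = 833 Mb, of which 774 Mb is about 93%). The general construction, anchored scaffolds sorted by map position and joined by runs of Ns with the rest set aside, can be sketched as follows. This is an illustration under stated assumptions and not the logic of pseudomolBuilder.pl: the function name, the `spacer_len` parameter and the toy data are hypothetical.

```python
def build_pseudomolecule(scaffolds, positions, spacer_len=5):
    """Join anchored scaffolds in map order, separated by N spacers.

    `scaffolds` maps name -> sequence; `positions` maps name -> genetic
    position (cM) for anchored scaffolds only.
    Returns (pseudomolecule, ordered anchored names, unlocalized names).
    """
    # Ties in position are broken by name: scaffolds in the same bin
    # have no real relative order.
    anchored = sorted(
        (name for name in scaffolds if name in positions),
        key=lambda name: (positions[name], name),
    )
    unlocalized = sorted(name for name in scaffolds if name not in positions)
    spacer = "N" * spacer_len
    pseudomolecule = spacer.join(scaffolds[name] for name in anchored)
    return pseudomolecule, anchored, unlocalized


# Toy data for illustration only.
toy_scaffolds = {"s1": "AAAA", "s2": "CCCC", "s3": "GGGG", "s4": "TTTT"}
toy_positions = {"s3": 0.0, "s1": 1.5, "s2": 1.5}
pseudo, order, unloc = build_pseudomolecule(toy_scaffolds, toy_positions)
print(order, unloc)
print(pseudo)
```

In the toy run, s1 and s2 share the position 1.5 cM, so their relative order is arbitrary; scaffold orientation is not handled at all here.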
o Integration of physical map information
  [Figure: scaffolds A to E placed against genetic map positions (cM)]
o orientation unknown: 48% of the sequence
o micro-order unknown: 554 bins / 1,358 scaffolds
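With 1,358 scaffolds spread over 554 bins, a bin holds about 2.45 scaffolds on average, which is why the order inside a bin stays unresolved. A small sketch of how such bins could be listed, assuming a hypothetical scaffold-to-bin table (the function name and toy data are illustrative only):

```python
from collections import defaultdict


def unresolved_bins(scaffold_bins):
    """Return the bins that hold more than one scaffold.

    `scaffold_bins` maps a scaffold name to its bin identifier.
    The result maps each such bin to the sorted names it contains.
    """
    by_bin = defaultdict(list)
    for name, bin_id in scaffold_bins.items():
        by_bin[bin_id].append(name)
    return {b: sorted(names) for b, names in by_bin.items() if len(names) > 1}


# Toy assignment: scaffolds B/C and D/E fall in the same bins.
toy_bins = {"A": 0, "B": 1, "C": 1, "D": 2, "E": 2}
print(unresolved_bins(toy_bins))

# Average number of scaffolds per bin from the figures on the slide.
print(round(1358 / 554, 2))
```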
Future improvements
o RH map
o Optical map
o Long reads
Bioinformatics
  Assembly: Newbler
  Scaffolding/pseudomolecule construction: scaffAssembler.pl, gapCloser, ssrFinishing, pseudomolBuilder.pl, isbpProbeDesign.pl
  Annotation: triAnnot (new modules: filtering, pseudogenes, transfer annotation), clari-TE & clari-TE-lib
  Data management: gowDB (Bio::DB::SeqFeature::Store), Gbrowse @ URGI
Acknowledgments
  Sébastien Theil, Natasha Glover, Josquin Daron, Lise Pingault, Pierre Sourdille, Etienne Paux, Philippe Leroy, Jacques Le Gouis, Nicolas Guilhot, Aurélien Bernard, Nelly Cubizolles, Catherine Feuillet, François Balfourier, Hélène Rimbert
  URGI: M. Alaux, L. Couderc, V. Jamilloux, H. Quenesville
  CNRGV: H. Berges, A. Bellec
  BIA: C. Gaspin
  Genoscope: A. Alberti, V. Barbe, J. Poulain, C. Durand, S. Mangenot, JM. Aury, A. Couloux, P. Wincker
  IEB: J. Dolezel, J. Safar
  VIB: K. Vandepoele
  MIPS: K. Mayer et al.
  SAB: P. Schnable, S. Rounsley, D. Ware
  TGAC: J. Rogers, M. Caccamo et al.
  J. Rogers, K. Eversole