Data S2 - Additional figures supporting the manuscript
Data S2A. Validation of subgenome phasing by K-mer (K=13) in E. colona
Data S2B. Validation of subgenome phasing accuracy based on K-mer and transposon element analysis in E. oryzicola and E. crus-galli
Data S2C. Centromeric regions on chromosomes of E. oryzicola and E. crus-galli
Data S2D. Synteny analysis
Data S2E. Genomic composition of each subgenome in Echinochloa
Data S2F. Detailed genomic structures of momilactone A gene cluster regions in Poaceae
Data S2G. Phylogeny of all Poaceae homologs of three genes neighboring the MA cluster in Echinochloa
Data S2H. Fourteen scenarios used to test the existence and direction of gene flow between groups
Data S2I. The effect of sample size on the estimation of nucleotide diversity (π)
Data S2J. Genomic differentiation (Fst) between two groups in E. crus-galli and E. oryzicola
Data S2K. Characterization of AUX/IAA genes in E. crus-galli
Data S2L. Sequence conservation of degron regions in AUX/IAA proteins
Genomic Insights into the Evolution of Echinochloa towards Weed and Crop
Dongya Wu, Chu-Yu Ye, Bowen Jiang, Yu Feng, Wei Tang, Sangting Lao, Lei Jia, Hanyang
Lin, Lingjuan Xie, Xifang Weng, Chenfeng Dong, Qinghong Qian, Feng Lin, Haiming Xu,
Huabin Lu, Luan Cutti, Longbiao Guo, Beng-Kah Song, Laura Scarabel, Jie Qiu, Qian-Hao
Zhu, Qin Yu, Michael P. Timko, Hirofumi Yamaguchi, Aldo Merotto Jr, Yingxiong Qiu,
Kenneth M. Olsen, Longjiang Fan*
Data S2A. Validation of E. colona subgenome phasing by K-mer (K=13). Clustering of counts of 13-mers that differentiate homeologous chromosomes enables the consistent partitioning of the hexaploid E. colona genome into subgenomes.
[Figure panels a-d; subgenome labels: AT, BT, AH, BH, CH]
Data S2B. Validation of subgenome phasing accuracy based on K-mer and transposon element analysis in E. oryzicola and E. crus-galli. Clustering of counts of 13-mers that differentiate homeologous chromosomes enables the consistent partitioning of the genome into subgenomes in tetraploid E. oryzicola (a) and hexaploid E. crus-galli (b). Transposon element abundance was used to confirm the accuracy of subgenome phasing in tetraploid E. oryzicola (c) and hexaploid E. crus-galli (d). Subgenome-specific repeat elements occur predominantly in the corresponding subgenome, as shown by the red blocks on each chromosome. For example, in the E. oryzicola genome, 61 transposon elements (2 LTR/Copia, 6 LTR/Gypsy and 53 of other types) were found to phase the subgenomes.
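The K-mer idea behind Data S2A and S2B can be illustrated with a minimal sketch. It is not the pipeline used in the manuscript. The function names, the `min_count` and `fold` thresholds, and the greedy assignment that stands in for the clustering step are all assumptions made for illustration. Only the forward strand is counted, and the first homeologous set is used as the reference for subgenome labels.

```python
from collections import Counter
from itertools import combinations

K = 13  # k-mer length named in the captions


def kmer_counts(seq, k=K):
    """Count k-mers on the forward strand, skipping any k-mer that contains N."""
    seq = seq.upper()
    kmers = (seq[i:i + k] for i in range(len(seq) - k + 1))
    return Counter(km for km in kmers if "N" not in km)


def differentiating_kmers(counts_a, counts_b, min_count=2, fold=2.0):
    """k-mers abundant in one homeolog and at least `fold` times rarer in the other.

    `min_count` and `fold` are hypothetical thresholds, not values from the source.
    """
    picked = set()
    for kmer in set(counts_a) | set(counts_b):
        a, b = counts_a.get(kmer, 0), counts_b.get(kmer, 0)
        if max(a, b) >= min_count and max(a, b) >= fold * min(a, b):
            picked.add(kmer)
    return picked


def cosine(u, v):
    """Cosine similarity between two sparse count profiles stored as dicts."""
    dot = sum(count * v.get(kmer, 0) for kmer, count in u.items())
    norm_u = sum(x * x for x in u.values()) ** 0.5
    norm_v = sum(x * x for x in v.values()) ** 0.5
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def phase_by_kmers(homeolog_sets, k=K):
    """Assign each chromosome to a subgenome label (an integer).

    homeolog_sets: list of dicts {chromosome_name: sequence}, one dict per
    set of homeologous chromosomes. Returns {chromosome_name: label}.
    """
    counts = {name: kmer_counts(seq, k)
              for hset in homeolog_sets for name, seq in hset.items()}

    # Keep only k-mers that separate homeologs within at least one set.
    informative = set()
    for hset in homeolog_sets:
        for a, b in combinations(hset, 2):
            informative |= differentiating_kmers(counts[a], counts[b])

    profiles = {name: {km: c[km] for km in informative if km in c}
                for name, c in counts.items()}

    # Greedy one-to-one matching against the first set (a stand-in for clustering).
    seeds = list(homeolog_sets[0])
    labels = {name: i for i, name in enumerate(seeds)}
    for hset in homeolog_sets[1:]:
        scored = sorted(((cosine(profiles[n], profiles[s]), n, s)
                         for n in hset for s in seeds), reverse=True)
        used_names, used_seeds = set(), set()
        for _, name, seed in scored:
            if name not in used_names and seed not in used_seeds:
                labels[name] = labels[seed]
                used_names.add(name)
                used_seeds.add(seed)
    return labels
```

Chromosomes that share subgenome-specific repeats receive the same label because their profiles over the differentiating 13-mers are highly similar. A real analysis would typically replace the greedy matching with hierarchical clustering of the full count matrix.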
Data S2C. Synteny analysis. (a) Macro-synteny among Echinochloa subgenomes, O. sativa and S. italica. Ehap, diploid E. haploclada; AT and BT, two subgenomes in tetraploid E. oryzicola; AH, BH and CH, three subgenomes in hexaploid E. crus-galli; DH, EH and FH, three subgenomes in E. colona. (b) Micro-synteny at chromosome 4 in nine Echinochloa subgenomes. The evolutionary relationship is shown at the left. Genes with different functions are marked by different colors.
Data S2D. Centromeric regions on chromosomes of E. oryzicola and E. crus-galli.
[Figure panels: E. oryzicola; E. crus-galli]
[Figure: Echinochloa subgenomes (Ehap, AT, BT, AH, BH, CH, DH, EH, FH) plotted against Length (Mb), axis 0.0 to 500.0; legend: LTR/Gypsy, LTR/Copia, LINE/L1, DNA/CMC-EnSpm, DNA/MULE-MuDR, Simple_repeat, Unknown, non-TE]
Data S2E. Genomic composition of each subgenome in Echinochloa.
[Figure: gene cluster regions by species]
Hordeum vulgare: chr4H: HORVU4Hr1G029060-HORVU4Hr1G028780; chr5H: HORVU5Hr1G082390, HORVU5Hr1G082380, HORVU5Hr1G067310
Brachypodium distachyon: chr4: Bradi4g10060-Bradi4g09900 ×6
Olyra latifolia: Ola011497-Ola011474 ×5
Leersia perrieri: chr11: LPERR11G19220-LPERR11G19390
O. brachyantha: chr4: OB04G12320-OB04G12440; chr11: OB11G27160-OB11G27320
O. barthii: chr4: OBART04G02800-OBART04G03080; ×7 chr11: OBART11G22730-OBART11G22900
O. glaberrima: chr11: ORGLA11G0183100-ORGLA11G0184100; chr4: ORGLA04G0020000-ORGLA04G0022100
O. rufipogon: chr4: ORUFI04G03490-ORUFI04G03710; chr11: ORUFI11G24440-ORUFI11G24630
Oryza sativa: chr11: LOC_Os11g45400-LOC_Os11g45740; chr4: LOC_Os04g09780-LOC_Os04g10350
O. punctata: chr11: OPUNC11G19270-OPUNC11G19400; chr4: OPUNC04G02700-OPUNC04G03000
C. songorica (B): chr9: CsB901529-CsB901537; chr6: CsB601182-CsB601177
Cleistogenes songorica (A): chr9: CsA900080-CsA900065; chr6: CsA600445-CsA600457
Eragrostis curvula: TVU42411-TVU42426; TVU23487-TVU23468
Panicum hallii: chr2: Pahal.2G214200-Pahal.2G214300; chr1: Pahal.1G03910; chr8: Pahal.8G259800-Pahal.8G262700
Setaria italica: chr2: 2G145700-2G145900; chr8: 8G234000-8G237700
E. colona (DH): DH04.2272-DH04.2321 ×17
E. crus-galli (CH): CH04.2621-CH04.2648
E. haploclada: EH04.2682-EH04.2742
E. colona (EH): EH04.1884-EH04.1913
E. crus-galli (BH): BH04.2290-BH04.2254
E. oryzicola (BT): BT04.2194-BT04.2172
E. colona (FH): FH04.1776-FH04.1796
E. crus-galli (AH): AH04.2217-AH04.2244
Echinochloa oryzicola (AT): AT04.1673-AT04.1726
Legend: CYP99A, CPS, KSL, CYP76M, MAS, Acyltransferase, PPR repeat, Multicopper oxidase, Cytochrome P450, RNA recognition motif, NB-ARC, MuLE, Cupin, Glutathione S-transferase, O-methyltransferase, Fbox, DUF538, Glycosyl hydrolase, Pectinesterase, Myb-like, PF01657, PF13920, PF00282, PF03140, PF05514, PF07716, PF03171
Data S2F. Detailed genomic structures of momilactone A gene cluster regions in Poaceae. Red and purple circles at the left represent intact and partial clusters, respectively.
[Figure: gene labels CH04.2644, CH04.2645, CH04.2647; subfamily labels: Bambusoideae, Oryzoideae, Pooideae, Chloridoideae, Panicoideae, Pharoideae]
Data S2G. Phylogeny of all Poaceae homologs of three genes neighboring the MA cluster in Echinochloa. Bootstrap values ≤ 95 are shown.
[Figure: fourteen scenario diagrams. Parameter labels: NAnc, Npop1, Npop2, M12, M21, Tdiv, Tm. Scenario names: No_migration, Migration_12, Migration_21, IMA, Recent_geneflow, Ancient_geneflow, Recent_M21, Recent_M12, Ancient_M12, Ancient_M21, Ancient_recent12, Ancient_recent21, Recent_ancient21, Recent_ancient12]
Data S2H. Fourteen scenarios used to test the existence and direction of gene flow between groups.
Data S2I. The effect of sample size on the estimation of nucleotide diversity (π). Populations with sample sizes of 10, 20, 30, 40, 50, 100, 150 and 200 were randomly selected from E. crus-galli var. crus-galli. Nucleotide diversity was estimated in each subgenome.
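The subsampling experiment in Data S2I can be sketched as follows, under simplifying assumptions. π is computed as the mean number of pairwise differences per site over a list of equal-length haplotype strings, and random subsets of each size are drawn repeatedly. The function names, the haplotype-string input, the number of replicates and the seed are hypothetical. The manuscript's own software and input format are not stated in this caption.

```python
import random
from collections import Counter


def nucleotide_diversity(haplotypes):
    """Mean pairwise differences per site (π) for equal-length haplotype strings."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two haplotypes are required")
    length = len(haplotypes[0])
    total_pairs = n * (n - 1) // 2
    differing = 0
    for column in zip(*haplotypes):
        # Pairs that carry the same allele at this site do not contribute to π.
        same = sum(c * (c - 1) // 2 for c in Counter(column).values())
        differing += total_pairs - same
    return differing / total_pairs / length


def pi_by_sample_size(haplotypes, sizes, replicates=10, seed=0):
    """Average π over random subsamples of each requested size.

    Sizes larger than the available sample are skipped.
    """
    rng = random.Random(seed)
    result = {}
    for size in sizes:
        if size > len(haplotypes):
            continue
        draws = [nucleotide_diversity(rng.sample(haplotypes, size))
                 for _ in range(replicates)]
        result[size] = sum(draws) / len(draws)
    return result
```

To mirror the caption, the same routine would be run separately on the variants of each subgenome with `sizes=[10, 20, 30, 40, 50, 100, 150, 200]`.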
[Figure: Fst plots, panels a and b; y-axis: Fst; labelled genes: MYB3R-2, MADS57, ZFP182, FD1, Hd1, Dof12, PIN3t, LEA3-2, ABCG31, RCI2-5, ZFP179, IBR5, CL1-2, DERF1]
Data S2J. Genomic differentiation (Fst) between two groups in E. crus-galli and E. oryzicola. (a) Fst between E. crus-galli var. crus-galli and var. praticola. (b) Fst between E. oryzicola groups LL and HL. Genes related to flowering time, cold response and drought response are marked in green, blue and yellow, respectively.
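A genome scan like the one in Data S2J can be sketched as a windowed Fst calculation. The caption does not state which estimator or window size was used. The sketch below assumes Hudson's estimator, computed as a ratio of sums over the sites in each window, and a hypothetical default window of 100 kb. Its inputs are per-site allele frequencies for the two groups and the number of sampled chromosomes in each group.

```python
def hudson_fst(p1, n1, p2, n2):
    """Hudson's Fst over a set of sites, as a ratio of summed numerators and denominators.

    p1, p2: allele frequencies per site in group 1 and group 2.
    n1, n2: number of sampled chromosomes in each group (must exceed 1).
    """
    num = den = 0.0
    for a, b in zip(p1, p2):
        num += (a - b) ** 2 - a * (1 - a) / (n1 - 1) - b * (1 - b) / (n2 - 1)
        den += a * (1 - b) + b * (1 - a)
    return num / den if den > 0 else float("nan")


def windowed_fst(positions, p1, n1, p2, n2, window=100_000):
    """Fst in non-overlapping windows; returns {window_start: fst}.

    The window size is an assumption, not a value taken from the source.
    """
    bins = {}
    for pos, a, b in zip(positions, p1, p2):
        start = (pos // window) * window
        freqs_1, freqs_2 = bins.setdefault(start, ([], []))
        freqs_1.append(a)
        freqs_2.append(b)
    return {start: hudson_fst(f1, n1, f2, n2)
            for start, (f1, f2) in sorted(bins.items())}
```

Windows with high Fst can then be intersected with gene annotations, for example to flag the flowering-time, cold-response and drought-response genes highlighted in the figure.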
[Figure panels a and b; annotation: *Arg→Gln]
Data S2K. Characterization of AUX/IAA genes in E. crus-galli. (a) Phylogeny of AUX/IAA genes in the E. crus-galli and O. sativa genomes. Bootstrap values are shown at the branches. (b) Structure of EcAUX/IAA12 (CH01.1437). One mutation, Arg86Gln, was found in one quinclorac-resistant sample (SA88) from Brazil. The green sequence LxLxL indicates the ethylene response factor-associated amphiphilic repression (EAR) motif. The sequence in red shows the degron motif, which binds to auxin and the co-receptor TIR/AFB proteins. The K, DxD and ExD sequences (OPCA motifs) in the PB1 domain are shown in blue.
[Figure: sequence groups labelled Arabidopsis thaliana, O. sativa and E. crus-galli]
Data S2L. Sequence conservation of degron regions in AUX/IAA proteins. All AUX/IAA proteins in A. thaliana, O. sativa and E. crus-galli were identified and aligned with MAFFT. The position of the candidate mutation (Arg86) in AUX/IAA is highlighted with a red box.