Data S2 - Additional figures supporting the manuscript

Data S2A. Validation of subgenome phasing by K-mer (K=13) in E. colona
Data S2B. Validation of subgenome phasing accuracy based on K-mer and transposon element analysis in E. oryzicola and E. crus-galli
Data S2C. Centromeric regions on chromosomes of E. oryzicola and E. crus-galli
Data S2D. Synteny analysis
Data S2E. Genomic compositions in each subgenome in Echinochloa
Data S2F. Detailed genomic structures of momilactone A gene cluster regions in Poaceae
Data S2G. Phylogeny of all Poaceae homologs of three genes neighboring the MA cluster in Echinochloa
Data S2H. Fourteen scenarios used to test the existence and direction of gene flow between groups
Data S2I. The effects of sample size on estimation of nucleotide diversity (π)
Data S2J. Genomic differentiation (Fst) between two groups in E. crus-galli and E. oryzicola
Data S2K. Characterization of AUX/IAA genes in E. crus-galli
Data S2L. Sequence conservation of degron regions in AUX/IAA proteins

Genomic Insights into the Evolution of Echinochloa towards Weed and Crop

Dongya Wu, Chu-Yu Ye, Bowen Jiang, Yu Feng, Wei Tang, Sangting Lao, Lei Jia, Hanyang Lin, Lingjuan Xie, Xifang Weng, Chenfeng Dong, Qinghong Qian, Feng Lin, Haiming Xu, Huabin Lu, Luan Cutti, Longbiao Guo, Beng-Kah Song, Laura Scarabel, Jie Qiu, Qian-Hao Zhu, Qin Yu, Michael P. Timko, Hirofumi Yamaguchi, Aldo Merotto Jr, Yingxiong Qiu, Kenneth M. Olsen, Longjiang Fan*

Data S2A. Validation of E. colona subgenome phasing by K-mer (K=13). Clustering of counts of 13-mers that differentiate homeologous chromosomes enables consistent partitioning of the hexaploid E. colona genome into subgenomes.
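The clustering idea in this caption can be illustrated with a minimal Python sketch. It counts 13-mers per chromosome, keeps the k-mers whose counts differ strongly between chromosomes, and groups chromosomes with similar count profiles. The toy sequences, the function names, the fold-change filter and the greedy cosine-similarity grouping are all assumptions made for illustration; the caption does not describe the software or thresholds actually used.

```python
from collections import Counter
from math import sqrt

K = 13  # k-mer length stated in the caption


def kmer_counts(seq, k=K):
    """Count every overlapping k-mer in a DNA string."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def differential_kmers(counts, min_fold=2.0):
    """Keep k-mers whose highest count is at least min_fold times
    (lowest count + 1). The threshold is a hypothetical choice."""
    all_kmers = set().union(*counts.values())
    keep = []
    for kmer in sorted(all_kmers):
        values = [c[kmer] for c in counts.values()]
        if max(values) >= min_fold * (min(values) + 1):
            keep.append(kmer)
    return keep


def cosine(u, v):
    """Cosine similarity between two count vectors (0.0 if either is empty)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cluster_chromosomes(seqs, threshold=0.5):
    """Greedy grouping: a chromosome joins the first cluster whose first
    member has a similar differential k-mer profile, else starts a new one."""
    counts = {name: kmer_counts(s) for name, s in seqs.items()}
    kmers = differential_kmers(counts)
    profiles = {name: [c[k] for k in kmers] for name, c in counts.items()}
    clusters = []
    for name in seqs:
        for cluster in clusters:
            if cosine(profiles[name], profiles[cluster[0]]) >= threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters


# Toy data (hypothetical): two "subgenomes", each marked by its own 13-bp repeat.
toy = {
    "chrA1": "ACGTTGCAAGCTA" * 20,
    "chrB1": "TTGACCGGTATCA" * 20,
    "chrA2": "ACGTTGCAAGCTA" * 15,
    "chrB2": "TTGACCGGTATCA" * 18,
}
print(cluster_chromosomes(toy))
```

On real assemblies the profiles would be built from millions of k-mers and clustered hierarchically; the greedy grouping here only keeps the example short.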

[Figure: panels a, b, c and d; subgenome labels AT and BT (E. oryzicola) and AH, BH and CH (E. crus-galli)]

Data S2B. Validation of subgenome phasing accuracy based on K-mer and transposon element analysis in E. oryzicola and E. crus-galli. Clustering of counts of 13-mers that differentiate homeologous chromosomes enables consistent partitioning of the genome into subgenomes in tetraploid E. oryzicola (a) and hexaploid E. crus-galli (b). Transposon element abundance was used to confirm the accuracy of subgenome phasing in tetraploid E. oryzicola (c) and hexaploid E. crus-galli (d). Subgenome-specific repeat elements are expected to appear predominantly in the corresponding subgenome, as shown by the red blocks on each chromosome. For example, in the E. oryzicola genome, 61 transposon elements (including 2 LTR/Copia, 6 LTR/Gypsy and 53 of other types) were identified for subgenome phasing.

Data S2C. Synteny analysis. (a) Macro-synteny among Echinochloa subgenomes, O. sativa and S. italica. Ehap, diploid E. haploclada; AT and BT, the two subgenomes in tetraploid E. oryzicola; AH, BH and CH, the three subgenomes in hexaploid E. crus-galli; DH, EH and FH, the three subgenomes in E. colona. (b) Micro-synteny at chromosome 4 in nine Echinochloa subgenomes. The evolutionary relationship is shown on the left. Genes with different functions are marked in different colors.


Data S2D. Centromeric regions on chromosomes of E. oryzicola and E. crus-galli.

[Figure panels: E. oryzicola; E. crus-galli]

[Figure: genomic composition per Echinochloa subgenome (Ehap, AT, BT, AH, BH, CH, DH, EH, FH). Axis: Length (Mb), 0 to 500. Legend: LTR/Gypsy, LTR/Copia, LINE/L1, DNA/CMC-EnSpm, DNA/MULE-MuDR, Simple_repeat, Unknown, non-TE]

Data S2E. Genomic compositions in each subgenome in Echinochloa.

[Figure: gene cluster regions by species. Gene ID ranges are listed as recovered from the figure; ×n markers and loose numerals are figure labels kept where they appeared.]

Hordeum vulgare: chr4H: HORVU4Hr1G029060-HORVU4Hr1G028780; chr5H: HORVU5Hr1G082390, HORVU5Hr1G082380, HORVU5Hr1G067310
Brachypodium distachyon: chr4: Bradi4g10060-Bradi4g09900 ×6
Olyra latifolia: Ola011497-Ola011474 ×5
Leersia perrieri: chr11: LPERR11G19220-LPERR11G19390
O. brachyantha: chr4: OB04G12320-OB04G12440; chr11: OB11G27160-OB11G27320
O. barthii: chr4: OBART04G02800-OBART04G03080 ×7; chr11: OBART11G22730-OBART11G22900
O. glaberrima: chr11: ORGLA11G0183100-ORGLA11G0184100; chr4: ORGLA04G0020000-ORGLA04G0022100
O. rufipogon: chr4: ORUFI04G03490-ORUFI04G03710; chr11: ORUFI11G24440-ORUFI11G24630
Oryza sativa: chr11: LOC_Os11g45400-LOC_Os11g45740; chr4: LOC_Os04g09780-LOC_Os04g10350
O. punctata: chr11: OPUNC11G19270-OPUNC11G19400; chr4: OPUNC04G02700-OPUNC04G03000
C. songorica (B): chr9: CsB901529-CsB901537; chr6: CsB601182-CsB601177
Cleistogenes songorica (A): chr9: CsA900080-CsA900065; chr6: CsA600445-CsA600457
Eragrostis curvula: TVU42411-TVU42426; TVU23487-TVU23468
Panicum hallii: chr2: Pahal.2G214200-Pahal.2G214300; chr1: Pahal.1G03910; chr8: Pahal.8G259800-Pahal.8G262700
Setaria italica: chr2: 2G145700-2G145900; chr8: 8G234000-8G237700
E. colona (DH): DH04.2272-DH04.2321 ×17
E. crus-galli (CH): CH04.2621-CH04.2648
E. haploclada: EH04.2682-EH04.2742 [labels: 3 4 5]
E. colona (EH): EH04.1884-EH04.1913
E. crus-galli (BH): BH04.2290-BH04.2254 [labels: 5 3]
E. oryzicola (BT): BT04.2194-BT04.2172 [label: 5]
E. colona (FH): FH04.1776-FH04.1796
E. crus-galli (AH): AH04.2217-AH04.2244 [label: 5]
Echinochloa oryzicola (AT): AT04.1673-AT04.1726 [labels: 21, 5]

Gene legend: CYP99A, CPS, KSL, CYP76M, MAS, Acyltransferase, PPR repeat, Multicopper oxidase, Cytochrome P450, RNA recognition motif, NB-ARC, MuLE, Cupin, Glutathione S-transferase, O-methyltransferase, Fbox, DUF538, Glycosyl hydrolase, Pectinesterase, Myb-like, PF01657, PF13920, PF00282, PF03140, PF05514, PF07716, PF03171

Data S2F. Detailed genomic structures of momilactone A gene cluster regions in Poaceae. Red and purple circles on the left represent intact and partial clusters, respectively.

[Figure: phylogenies for CH04.2644, CH04.2645 and CH04.2647; clade labels: Bambusoideae, Oryzoideae, Pooideae, Chloridoideae, Panicoideae, Pharoideae]

Data S2G. Phylogeny of all Poaceae homologs of three genes neighboring the MA cluster in Echinochloa. Bootstrap values ≤ 95 are shown.

[Figure: diagrams of the fourteen scenarios: No_migration, Migration_12, Migration_21, IMA, Recent_geneflow, Ancient_geneflow, Recent_M21, Recent_M12, Ancient_M12, Ancient_M21, Ancient_recent12, Ancient_recent21, Recent_ancient21, Recent_ancient12. Parameter labels in the diagrams: NAnc, Npop1, Npop2, M12, M21, Tdiv, Tm]

Data S2H. Fourteen scenarios used to test the existence and direction of gene flow between groups.

Data S2I. The effects of sample size on estimation of nucleotide diversity (π). Populations with sample sizes of 10, 20, 30, 40, 50, 100, 150 and 200 were each randomly selected from E. crus-galli var. crus-galli. Nucleotide diversity was estimated in each subgenome.
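The subsampling design in this caption can be sketched in Python as follows. Nucleotide diversity is computed here as the mean number of pairwise differences per site, and the estimate is repeated on random subsamples of the sizes listed above. The simulated haplotypes, the function names and the random seeds are assumptions for illustration only; the caption does not specify the software used.

```python
import random
from collections import Counter
from math import comb


def nucleotide_diversity(haplotypes):
    """Mean pairwise differences per site across equal-length haplotypes."""
    n = len(haplotypes)
    length = len(haplotypes[0])
    pairs = comb(n, 2)
    diff = 0
    for site in zip(*haplotypes):
        # Pairs carrying the same allele at this site do not differ.
        same = sum(comb(c, 2) for c in Counter(site).values())
        diff += pairs - same
    return diff / (pairs * length)


def pi_by_sample_size(haplotypes, sizes, seed=0):
    """Estimate pi on one random subsample per requested sample size."""
    rng = random.Random(seed)
    return {n: nucleotide_diversity(rng.sample(haplotypes, n)) for n in sizes}


# Toy population (hypothetical): 200 haplotypes of 50 sites,
# with allele C at frequency about 0.25 at every site.
rng = random.Random(1)
population = ["".join(rng.choice("AAAC") for _ in range(50)) for _ in range(200)]

sizes = [10, 20, 30, 40, 50, 100, 150, 200]  # sample sizes from the caption
for n, pi in pi_by_sample_size(population, sizes).items():
    print(n, round(pi, 4))
```

A single draw per sample size is used to keep the sketch short; repeating the draw several times per size would show the variance of the estimate as well as its mean.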

[Figure: panels a and b; axis label Fst; labels A and B; annotated genes: MYB3R-2, MADS57, ZFP182, FD1, Hd1, Dof12, PIN3t, LEA3-2, ABCG31, RCI2-5, ZFP179, IBR5, CL1-2, DERF1]

Data S2J. Genomic differentiation (Fst) between two groups in E. crus-galli and E. oryzicola. (a) Fst between E. crus-galli var. crus-galli and var. praticola. (b) Fst between E. oryzicola groups LL and HL. Genes related to flowering time, cold response and drought response are marked in green, blue and yellow, respectively.
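For readers who want a concrete picture of the statistic, the following Python sketch computes Fst between two groups from per-SNP allele frequencies. The caption does not state which estimator was used, so Hudson's estimator (a ratio of sums over SNPs) is substituted here as an assumption; the function name and the example frequencies are likewise hypothetical.

```python
def hudson_fst(p1, p2, n1, n2):
    """Hudson's Fst as a ratio of sums over SNPs.

    p1, p2: per-SNP frequencies of the same allele in group 1 and group 2.
    n1, n2: number of sampled allele copies in each group.
    Returns NaN when no SNP is informative.
    """
    num = 0.0
    den = 0.0
    for a, b in zip(p1, p2):
        # Squared frequency difference, corrected for finite sample size.
        num += (a - b) ** 2 - a * (1 - a) / (n1 - 1) - b * (1 - b) / (n2 - 1)
        # Probability that one copy from each group carries different alleles.
        den += a * (1 - b) + b * (1 - a)
    return num / den if den else float("nan")


# Hypothetical allele frequencies at five SNPs in one genomic window.
group1 = [0.90, 0.80, 0.10, 0.50, 0.95]
group2 = [0.20, 0.30, 0.15, 0.50, 0.10]
print(round(hudson_fst(group1, group2, n1=40, n2=40), 3))
```

Applying the function to SNPs binned into consecutive windows would give a genome-wide profile of the kind the figure describes; the window size would be a further assumption.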

[Figure: panels a and b; mutation site marked *Arg→Gln]

Data S2K. Characterization of AUX/IAA genes in E. crus-galli. (a) Phylogeny of AUX/IAA genes in the E. crus-galli and O. sativa genomes. Bootstrap values are shown at the branches. (b) Structure of EcAUX/IAA12 (CH01.1437). One mutation, Arg86Gln, was found in one quinclorac-resistant sample (SA88) from Brazil. The green sequence LxLxL indicates the ethylene response factor-associated amphiphilic repression (EAR) motif. The sequence in red shows the degron motif, which binds auxin and the co-receptor TIR/AFB proteins. The K, DxD and ExD sequences of the OPCA motif in the PB1 domain are shown in blue.

[Figure: alignment rows grouped by species: Arabidopsis thaliana, O. sativa, E. crus-galli]

Data S2L. Sequence conservation of degron regions in AUX/IAA proteins. All AUX/IAA proteins in A. thaliana, O. sativa and E. crus-galli were identified and aligned with MAFFT. The position of the candidate mutation (Arg86) in AUX/IAA is highlighted with a red box.