In-depth secretome analysis of Puccinia striiformis f. sp. tritici in
infected wheat uncovers effector functions

(Short Title: In-depth secretome analysis of Pst)

Ahmet Caglar Ozketen1*, Ayse Andac-Ozketen1, Bayantes Dagvadorj1,2, Burak
Demiralay1, Mahinur S. Akkaya3*

1 Middle East Technical University, Biotechnology Program, Ankara 06800, Turkey.
2 Division of Plant Sciences, Research School of Biology, The Australian National
University, Canberra, Australian Capital Territory 2601, Australia.
3 School of Bioengineering, Dalian University of Technology, No. 2 Linggong Road,
Dalian, Liaoning 116023, China.

*Corresponding authors: Mahinur S. Akkaya and Ahmet Caglar Ozketen
E-mails: [email protected] & [email protected]

Keywords
Transcriptome, secretome, effectors, yellow/stripe rust, pathogenesis, localization,
function, Puccinia striiformis f. sp. tritici

bioRxiv preprint doi: https://doi.org/10.1101/2020.01.11.897884; this version posted January 13, 2020. The copyright holder for this
preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in
perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

ABSTRACT
The importance of wheat yellow rust disease, caused by Puccinia striiformis f. sp. tritici
(Pst), has increased substantially due to the emergence of aggressive new Pst races in
the last couple of decades. In an era of escalating human populations and climate
change, it is vital to understand the infection mechanism of Pst in order to develop
better strategies to combat wheat yellow rust disease. This study focuses on the
identification of small secreted proteins (SSPs) and candidate secreted effector proteins
(CSEPs) that are used by the pathogen to support infection and control disease
development. We generated de novo assembled transcriptomes of Pst collected from
wheat fields in central Anatolia. We inoculated both susceptible and resistant seedlings
with Pst and analyzed haustoria formation. At 10 days post-inoculation (dpi), we
analyzed the transcriptomes and identified 10,550 Differentially Expressed Unigenes
(DEGs), of which 6,220 were Pst-related. Among those Pst-related genes, 230 were
predicted as PstSSPs. In silico characterization was performed using an approach
combining the transcriptomic data and data mining results to provide a reliable list to
narrow down the ever-expanding repertoire of the predicted effectorome. The
comprehensive analysis detected 14 Differentially Expressed Small-Secreted Proteins
(DESSPs) that overlapped with the genes in available literature data to serve as the best
CSEPs for experimental validation. One of the CSEPs was cloned and studied to test
the reliability of the presented data. Biological assays show that the randomly selected
CSEP, Unigene17495 (PSTG_10917), localizes in the chloroplast and is able to
suppress cell death induced by INF1 in a Nicotiana benthamiana heterologous
expression system.

INTRODUCTION
Puccinia striiformis f. sp. tritici (Pst) is an obligate biotrophic fungus and the
causative agent of wheat stripe (yellow) rust disease, a disease that can drastically
reduce the production of wheat worldwide. The use of fungicides is a suitable solution
for combatting Pst, but using resistant wheat strains to control Pst genetically would be
better for the environment and more cost-effective. In order to achieve host resistance
against rapidly evolving virulent strains of Pst, understanding the molecular basis of the
disease is crucial.

Pst infects the leaf of the wheat plant by penetrating the interior through the stomata
and later differentiating into various developmental entities before finally entering
the mesophyll cells, in which it can generate a feeding structure called a haustorium.
The main interface for pathogen-host interactions is the haustorium, which expands
between the plant cell wall and the cell membrane. This interface can activate primary
host defenses when pathogen-associated molecular patterns (PAMPs) are detected by
pattern recognition receptors (PRRs), resulting in PAMP-triggered immunity (PTI) [1].
Virulence is achieved by Pst through the deployment of secreted proteins, termed
effectors, into the apoplastic fluid or inside the host [2]. The effectors are useful
weapons in the fungal arsenal that allow Pst i) to overcome PTI, ii) to establish a
suitable environment, and iii) to help proliferation [3]. Particular effectors are
recognized by cytoplasmic receptor proteins in the host cell, triggering a defense
response called effector-triggered immunity (ETI). Compatible and incompatible
interactions depend on whether the effectors can achieve virulence by evading
Resistance (R) proteins, while the R proteins evolve to detect the effectors. Therefore,
evolutionary pressure imparts a driving force on the race between host resistance factors
and pathogen virulence factors [4].

Investigating the secreted effectorome of the fungus guides us towards a better
comprehension of the nature of pathogenic virulence and determinants that can lead to
avirulence. The combination of next-generation sequencing (NGS) and data mining
approaches represents a great opportunity to determine candidate effectors secreted by
the fungus. Several studies attempting to construct a secretome of either Pst or its
closest relatives (stem rust pathogen Puccinia graminis f. sp. tritici (Pgt) and leaf rust
pathogen Puccinia triticina (Ptt)) have been published. Yin and colleagues reported a
set of secreted proteins generated from expressed sequence tag (EST) sequences of a
cDNA library generated from Pst haustoria [5]. A repertoire of small-secreted proteins
(SSPs) was reported using a pipeline for effector mining in the genome sequence
information of Pgt and Melampsora larici-populina (the pathogen that causes poplar
leaf rust) [6]. The isolate PST-130 was sequenced and assembled to discover Pst genes
and identify candidate effectors [7]. The genome sequences of Pgt (CRL 75-36-700-3),
Ptt (BBDD), PST-78 (2K-041), PST-1 (3-5-79), PST-127 (08-220), and PST-CYR-32
(09-001) were released by the Broad Institute (http://www.broadinstitute.org/). Cantu and
colleagues used comprehensive genome analysis to identify candidate secreted
effectors in two United Kingdom isolates (PST-87/7 and PST-08/21) and two United
States isolates (PST-21 and PST-43) [8]. They also released data of infection-expressed
genes (6 and 14 dpi) and haustorial-expressed genes (7 dpi) for PST-87/7. Another
report was published using transcriptome sequencing on Pst isolate 104E137A to
elucidate haustorial secreted proteins (HSPs) and expose the best candidate effectors
[9]. A pioneering correlation analysis between genomes of seven existing races and seven
new races of Pst was pursued to predict avirulence (Avr) determinants among SSPs
[10]. In that study, 7 new races of Pst were sequenced with NGS and then combined
with 7 older published genomes for a total of 14 races of Pst that were then subjected
to correlation analysis to point out Avr candidates [10].

Data mining methodology is a promising strategy to identify candidate secreted effector
proteins (CSEPs) in genome and transcriptome data, as the incorporation of newly
generated data continually improves the prediction algorithms using deep machine
learning. Likewise, our understanding of the infection and disease progression of the
pathogen is steadily expanding. Therefore, new information (candidate effectors) can
be generated by exhausting presently available sequence data. A combined effort of
using both comparative transcriptomics and data mining to narrow down the
ever-expanding sets of CSEPs will simplify and focus the work being done to discover the
function of each CSEP experimentally. Only a few reports have focused on integrating
both in silico prediction and transcriptome sequencing data on particular Pst races at
the defined time intervals of haustoria formation and infection [8,9].

In this study, our objective is to identify differentially expressed gene sequences
(DEGs) of Pst during the interaction with both susceptible and resistant host genotypes.
We acquired de novo transcriptome sequencing data from both susceptible and resistant
wheat cultivars infected with Pst for 10 days in order to discover differentially
expressed secretome repertoires composed of small secreted proteins at a unique time
point of the infection process after haustoria formation. Detected PstDEGs were then
subjected to in silico prediction analysis to identify differentially expressed small
secreted proteins (PstDESSPs). Then, the cataloged secretome candidates were
analyzed in depth via annotation and prediction programs to sort them into classified
functions. Furthermore, a comparison with published Pst expression data [8, 9] indicated
the presence of 14 overlapping CSEPs as the most promising subset for
experimental validation. To test the reliability of our work, cDNA of one of the CSEP
genes (Unigene17495, homologous to PSTG_10917) was generated from Pst-infected
wheat leaves (10 dpi) and cloned into Agrobacterium tumefaciens-compatible
expression vectors. Heterologous expression of the CSEP in Nicotiana benthamiana
revealed that it localized to the chloroplast in plant cells. Furthermore, we observed the
biological function of Unigene17495 to be suppression of the INF1-mediated cell death
response in N. benthamiana.

MATERIALS AND METHODS
Plant materials, growth, and pathogen inoculation
The near-isogenic wheat lines Avocet-S and Avocet-YR10 were used to study both
compatible and incompatible interactions upon Pst inoculation. Pst isolates were
collected from the fields of Anatolia, Turkey and obtained from the General Directorate
of Agricultural Research and Policies (TAGEM) of the Ministry of Food, Agriculture,
and Livestock. Avocet-S is susceptible to PstTR (which has virulence
towards Yr2, 6, 7, 8, 9, 25, A, EP, Vic, Mich), whereas Avocet-YR10 is resistant. Seeds
were planted in soil and grown in a growth chamber under ideal conditions (relative
humidity 60%, 22°C, 16 h light, and 8 h dark). The Pst inoculations were performed
when the seedlings reached the two-leaf stage by spraying freshly collected
urediniospores directly onto the leaves. Before the inoculation, urediniospores (100
mg) were germinated for 5 min at 42°C and treated with mineral oil (1 mL) to enhance
their attachment to the leaf blade. For the control samples (mock inoculations), only
mineral oil was used. After air-drying for 20 min, the plants were kept in the dark at
10°C overnight in the presence of continuous mist in order for infection to take place.
After overnight infection, the standard growth conditions were resumed. The wheat
leaves were collected at 10 days post-inoculation (10 dpi), when the disease symptoms
were visible on the susceptible cultivar.

RNA extraction, de novo transcriptome sequencing, assembly, and annotation
The flow chart in Fig 1 describes the overall protocol followed for collecting infected
leaf samples using homogenization with liquid nitrogen and a sterile mortar and pestle.
For each 0.1 g leaf sample, 0.02 g polyvinylpolypyrrolidone (PVPP) was added during
mortar and pestle grinding. For each condition, three biological replicates were
homogenized together. Total RNA isolation was carried out using QIAzol Lysis
Reagent (Qiagen) by following the manufacturer’s protocol. A NanoDrop was used for
the spectrophotometric quantification of the total RNA samples. The samples were
shipped to Beijing Genomics Institute (BGI) in absolute ethanol for sequencing. BGI
measured the RNA quality of the samples on an Agilent 2100 Bioanalyzer, and the
samples with a RIN value above 8 were allowed to proceed to mRNA enrichment and
cDNA synthesis using random hexamers. A single adenine nucleotide was added to the
purified and end-repaired short cDNA fragments. After adapter ligation, size selection
was used to obtain templates for PCR amplification. The Illumina HiSeq™ 2000
platform was used to perform de novo transcriptome sequencing. BGI-Shenzhen
(Shenzhen, China) conducted all of the protocol steps from mRNA enrichment to
sequencing. All quality controls were evaluated during sequencing using the Agilent
2100 Bioanalyzer and ABI StepOnePlus Real-Time PCR System for quantification and
qualification of the sample libraries.

Before assembly, clean reads were obtained from raw reads using internal BGI software
(filter_fq) that applies three main criteria: (1) elimination of reads with adaptors,
(2) removal of reads containing more than 5% unidentified nucleotides, and (3) elimination
of low-quality reads (reads in which more than 20% of the bases have a low quality
value, Q ≤ 10). For de novo transcriptome assembly, clean reads of each
library were assembled into contigs and unigenes by the Trinity program
(release-20130225, http://trinityrnaseq.sourceforge.net/) [11].
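
The three filtering criteria can be illustrated with a minimal Python sketch. This is not the BGI filter_fq program, whose internals are not described here; the function name keep_read, its parameters, and the precomputed has_adaptor flag are assumptions made for illustration. The thresholds are read as "more than 5% N bases" and "more than 20% of bases with Q ≤ 10".

```python
from typing import Sequence


def keep_read(seq: str, quals: Sequence[int], has_adaptor: bool,
              max_n_frac: float = 0.05, low_q: int = 10,
              max_low_q_frac: float = 0.20) -> bool:
    """Return True if a read passes the three filters described in the text.

    has_adaptor is assumed to come from an upstream adaptor-matching step
    (hypothetical; adaptor detection itself is not sketched here).
    """
    # Reject malformed input: empty reads or mismatched quality strings.
    if not seq or len(quals) != len(seq):
        return False
    # (1) Discard reads that contain adaptor sequence.
    if has_adaptor:
        return False
    # (2) Discard reads with more than 5% unidentified (N) bases.
    if seq.upper().count("N") / len(seq) > max_n_frac:
        return False
    # (3) Discard reads in which more than 20% of bases have Q <= 10.
    low_quality_bases = sum(1 for q in quals if q <= low_q)
    if low_quality_bases / len(quals) > max_low_q_frac:
        return False
    return True
```

A read of 100 bases with 5 N bases is kept under this reading of the thresholds, whereas one with 6 N bases is discarded.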

Unigenes were aligned using the Blastx program (e-value < 0.00001) against the following
protein databases: NR (NCBI, non-redundant database) [12], Swiss-Prot (EMBL
protein database) [13], KEGG (Kyoto Encyclopedia of Genes and Genomes) [14], and
COG (Clusters of Orthologous Groups) [15]. The direction of the unigene sequences
was determined using the best alignment scores. If a conflict in sequence orientation
arose, the coding region (CDS) of a unigene was predicted using the databases in the
following order of priority: NR, Swiss-Prot, KEGG, and COG. ESTScan software was run
on unigenes with no successful alignments [16]. The results for NR, NT, Swiss-Prot,
KEGG, and COG database alignments were also used for unigene function annotation.
To obtain gene ontology (GO) annotation from NR annotations, the Blast2GO program
(v2.5.0) was run (e-value < 0.00001) on fungi databases (taxa, 4751) [17]. Web
Gene Ontology Annotation Plot (WEGO) software was used for the classification of
GO annotations [18]. The KEGG database was employed to investigate the metabolic
pathway analysis of unigenes [14].
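
The database priority used to resolve orientation conflicts can be sketched as a simple lookup. This is an illustrative Python sketch, not the pipeline actually run by BGI; the function name pick_orientation and the dictionary input format (database name mapped to "+" or "-") are assumptions.

```python
from typing import Dict, Optional

# Priority order stated in the text: NR first, then Swiss-Prot, KEGG, COG.
PRIORITY = ("NR", "Swiss-Prot", "KEGG", "COG")


def pick_orientation(strand_by_db: Dict[str, str]) -> Optional[str]:
    """Return the strand from the highest-priority database with a hit.

    strand_by_db maps a database name to the strand ("+" or "-") of its
    best alignment (hypothetical input format). None means no database
    produced an alignment, in which case the text falls back to ESTScan.
    """
    for db in PRIORITY:
        strand = strand_by_db.get(db)
        if strand in ("+", "-"):
            return strand
    return None
```

For example, a unigene with a "+" hit in Swiss-Prot and a "-" hit in KEGG would be oriented "+", because Swiss-Prot ranks above KEGG.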

Expression difference analysis
Differentially expressed genes (DEGs) were identified using the FPKM method, also
known as the RPKM method [19]. FPKM stands for fragments per kb per
million reads, and RPKM stands for reads per kb per million reads. We used the
following formula:
$$\mathrm{FPKM} = \frac{10^{6}\,C}{N L / 10^{3}}$$

FPKM signifies the expression of Unigene A, C is the number of fragments that
uniquely aligned to Unigene A, N is the total number of fragments that uniquely aligned
to all unigenes, and L is the number of bases in the CDS of Unigene A. After obtaining
FPKM values for each unigene, we calculated FPKM ratios between two samples at a
time. We defined differentially expressed genes (DEGs) as having an FPKM ratio ≥ 2 and
a false discovery rate (FDR) ≤ 0.001. We executed GO functional analysis (P-value ≤
0.05), including GO functional enrichment and functional classification on DEGs.
Furthermore, we carried out the KEGG pathway analysis (Q-value ≤ 0.05) for pathway
enrichment to further understand biological functions [20].
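
As a worked illustration of the formula and the DEG thresholds, the following Python sketch computes FPKM from C, N, and L and applies the ratio and FDR cut-offs. It is a sketch under stated assumptions rather than the software used in the study: the function names are hypothetical, the ratio is taken in either direction (up or down), and the floor value substituted for a zero FPKM is an assumption that the text does not specify.

```python
def fpkm(c: int, n: int, l: int) -> float:
    """FPKM = 10^6 * C / (N * L / 10^3).

    c: fragments uniquely aligned to the unigene
    n: fragments uniquely aligned to all unigenes
    l: number of bases in the CDS of the unigene
    """
    return (1e6 * c) / (n * l / 1e3)


def is_deg(fpkm_a: float, fpkm_b: float, fdr: float,
           min_ratio: float = 2.0, max_fdr: float = 0.001,
           floor: float = 0.01) -> bool:
    """Apply the thresholds from the text: FPKM ratio >= 2 and FDR <= 0.001.

    The ratio is direction-agnostic (assumption). Zero FPKM values are
    replaced by a small floor to avoid division by zero (assumption).
    """
    high = max(fpkm_a, fpkm_b, floor)
    low = max(min(fpkm_a, fpkm_b), floor)
    return high / low >= min_ratio and fdr <= max_fdr
```

For instance, a unigene with a 1,000-base CDS and 500 uniquely aligned fragments in a library of 10,000,000 uniquely aligned fragments has an FPKM of 50.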

Identification of pathogen-associated DEGs
We identified our DEGs through transcriptome data obtained from pathogen-inoculated
wheat leaves. However, we needed to eliminate any possible contaminating sequences
from the host. To accomplish this, we constructed a local database (named ‘The
Pucciniale database’) by merging protein data of Puccinia striiformis f. sp. tritici
(isolate 2K41-Yr9, race PST-78), Puccinia graminis f. sp. tritici (strain CRL
75-36-700-3, race SCCL), and Puccinia triticina (isolate 1-1, race 1-bbbd). All protein data
were obtained from the Broad Institute Puccinia website
(http://www.broadinstitute.org/). We also merged protein data of the WheatD genome
of Aegilops tauschii (wheatD_final_43150.gff.cds, downloaded from GigaDB,
http://gigadb.org/) [21] into ‘The Pucciniale database’. We performed a BlastX analysis
(e-value cut-off ≤ 1e-10) on DEGs using our local database of Pucciniale and wheat [22].
If the best hit of a sequence matched the wheat proteome, we discarded the sequence,
even if it also produced a significant hit with the Pucciniale proteome. Sequences whose
best hits fell within the Pucciniale proteome were evaluated as PstDEGs. The sequences
with no significant hits against either the pathogen or the host were assessed as unknown
novel DEGs.
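
The decision rule can be sketched as follows. This Python example assumes that "best hit" means the significant hit with the highest bit score, which the text does not state explicitly; the Hit type, the source labels, and the function name classify_deg are hypothetical, and no BLAST output parsing is shown.

```python
from typing import Iterable, NamedTuple


class Hit(NamedTuple):
    source: str      # "pucciniales" or "wheat" (hypothetical labels)
    evalue: float
    bitscore: float


def classify_deg(hits: Iterable[Hit], max_evalue: float = 1e-10) -> str:
    """Assign a DEG to pathogen, host, or unknown from its BlastX hits.

    Only hits at or below the e-value cut-off are considered. The best
    hit is taken to be the one with the highest bit score (assumption).
    """
    significant = [h for h in hits if h.evalue <= max_evalue]
    if not significant:
        return "unknown"      # no significant hit to pathogen or host
    best = max(significant, key=lambda h: h.bitscore)
    if best.source == "pucciniales":
        return "PstDEG"       # retained for secretome analysis
    return "host"             # discarded as wheat-derived
```

Under this rule, a sequence that hits both proteomes is retained only when its strongest hit lies in the Pucciniales set.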

Construction and characterization of secretome data
We predicted the ORFs and protein sequences from the PstDEG data using the
ORFPredictor web tool (http://bioinformatics.ysu.edu/tools/OrfPredictor.html). We
followed a widely accepted secretome prediction pipeline to construct our own
differentially expressed secretome database [6, 23, 24]. The predictions were made
following three criteria: 1) presence of an N-terminal signal peptide, 2) absence of
transmembrane helices, and 3) a mature protein length of fewer than 300 amino
acids. SignalP (version 4.1) was used to predict the presence of a secretion signal [25].
We estimated transmembrane helices using the TMHMM webtool (version 2.0,
http://www.cbs.dtu.dk/services/TMHMM/) [26]. Filtered sequences were termed
differentially expressed small-secreted proteins of Puccinia striiformis f. sp. tritici
(PstDESSPs). The same filtering pipeline was applied to total proteins obtained from
the Broad Institute database for Pst and Pst relatives (Pgt and Ptt)
(http://www.broadinstitute.org/). Four secretome data sets (PstDESSPs, PstSSPs,
PgtSSPs, PttSSPs) were subjected to effector prediction via the EffectorP 1.0 and 2.0
programs (http://effectorp.csiro.au/) [27, 28]. The Blastp program was used (e-value <
1e-5) for characterization of PstDESSPs by scanning against the following databases: the
lipase engineering database (http://www.led.uni-stuttgart.de/), the protease and
protease inhibitors (MEROPs) database (http://merops.sanger.ac.uk/) [29], the
carbohydrate-active enzymes (CAZymes) database (http://csbl.bmb.uga.edu/dbCAN/)
[30], the fungal peroxidase database (fPoxDB, http://peroxidase.riceblast.snu.ac.kr/)
[31], and the pathogen-host interaction database (http://phi-blast.phi-base.org/) [32].
The ClustalW program was executed for multiple alignment analysis of PstDESSPs
(https://www.ebi.ac.uk/Tools/msa/clustalo/) [33]. The iTOL web-tool was used to
construct a phylogenetic tree using multiple alignment results (https://itol.embl.de/)
[34]. Small secreted proteins were classified as cysteine-rich if they
possessed four or more cysteine residues. A comparison with infection- and
haustoria-expressed genes was made by a similarity search using Blastp (e-value
cut-off ≤ 1e-10) on publicly available data [8, 9]. We scanned for PstDESSPs overlapping
with Avr determinant candidates originally interpreted using correlation analysis by Xia
and colleagues [10]. Subcellular localization predictions were done on mature proteins
(without N-terminal signal peptides) to forecast translocation sites in the host plant
using four different tools: TargetP 1.1 (http://www.cbs.dtu.dk/services/TargetP/) [35],
Localizer (http://localizer.csiro.au/) [36], WolfPSORT (https://wolfpsort.hgc.jp/) [37],
and ApoplastP (http://apoplastp.csiro.au/) [38]. All of the analyses, annotations, and
predictions regarding PstDESSPs are displayed in Table S3.
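
The three secretome criteria and the cysteine-rich rule can be combined in a short filter. The Python sketch below assumes that signal peptide and transmembrane predictions have already been obtained (for example from SignalP and TMHMM output) and stored per protein; the Prediction record, its field names, and the function names are hypothetical and do not reproduce either tool's output format.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    name: str
    mature_seq: str           # protein sequence with the signal peptide removed
    has_signal_peptide: bool  # assumed to be taken from a SignalP result
    tm_helices: int           # assumed to be taken from a TMHMM result


def is_small_secreted(p: Prediction, max_len: int = 300) -> bool:
    """Criteria from the text: signal peptide present, no transmembrane
    helices, and a mature protein shorter than 300 amino acids."""
    return (p.has_signal_peptide
            and p.tm_helices == 0
            and len(p.mature_seq) < max_len)


def is_cysteine_rich(p: Prediction, min_cys: int = 4) -> bool:
    """Cysteine-rich means four or more cysteine residues."""
    return p.mature_seq.upper().count("C") >= min_cys


def filter_ssps(predictions):
    """Return the names of proteins that pass the small-secreted filter."""
    return [p.name for p in predictions if is_small_secreted(p)]
```

Running filter_ssps over all predicted proteins of a data set would yield the candidate list that is then passed to effector prediction and database searches.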

Cloning of Unigene17495 (PSTG_10917)
The cloning primers were designed using the PSTG_10917 sequence with a CACC site at
the 5′ end of the forward primers (with and without signal peptide (SP)) and no stop codon
in the reverse primers (Table S4). Total RNA samples of infected Avocet-S leaves (10 dpi)
used in transcriptome sequencing were sourced to synthesize cDNA after DNase
treatment. The Transcriptor First Strand cDNA Synthesis Kit (Roche) was used
following the manufacturer’s protocol, except that we used a mixture of
random hexamer primers and oligo(dT) primers at equal volumes as the reaction primer.
The effector of interest, with and without SP, was amplified from cDNA, cloned into
the pENTR/D-TOPO vector (Invitrogen), and then subsequently cloned into the
pK7FWG2 expression vector [39] utilizing Gateway Cloning. Both colony PCR and
sequencing analysis validated the cloning of constructs at the proper positions. Then,
the plasmid constructs were electroporated into Agrobacterium tumefaciens GV3101.

Agrobacterium tumefaciens-mediated gene transfer
A. tumefaciens infiltration experiments were performed using the previously described
protocol [40] with slight alterations. A. tumefaciens GV3101 strains carrying
expression vectors were plated on selective agar media. After two days, bacteria were
scraped off and suspended in sterile water. Bacteria were collected by centrifugation at
4,000 rpm for 4 min at room temperature (24-25°C). Bacteria were washed two times
with sterile water and two times with A. tumefaciens-mediated gene transfer-induction
medium (10 mM MES, pH 5.6, 10 mM MgCl2). The final cell concentration was
adjusted to 0.4 at A600nm for infiltration unless otherwise stated. The inocula were
injected into Nicotiana benthamiana leaves using a needleless syringe. Expression
of the genes of interest was observed over the next 2-4 d.

Confocal microscopy
A. tumefaciens GV3101 cells carrying pK7FWG2/Pstg_10917 and
pK7FWG2/ΔSP-Pstg_10917 were injected into N. benthamiana leaves (4-5 weeks old) for transient
expression. After 2-3 d, leaves were cut into small pieces of 3-5 mm length and soaked
in tap water. A Leica 385 TCS SP5 confocal microscope (Leica Microsystems,
Germany) was used for subcellular localization imaging. The wavelengths used for
imaging GFP fluorescence were excitation at 488 nm and emission at 495-550 nm.
Alternatively, the excitation and emission wavelengths for chloroplast autofluorescence
were in the far-infrared (>800 nm).

Cell death suppression assay
The pTRBO/GFP plasmid for GFP expression (courtesy of the Kamoun Lab, Sainsbury
Laboratory, Norwich, UK) and the pK7FWG2/SP-GFP plasmid [41] for SP-GFP were
used as negative controls in the cell death suppression assays. The A. tumefaciens
GV3101 cells carrying the negative control plasmids (GFP and SP-GFP) or the effector
of interest (with and without SP) were injected into an N. benthamiana leaf. After 24 h,
A. tumefaciens GV3101 carrying the cell death elicitor pGR106/Inf1 (a gift from the
Bozkurt lab, Imperial College, London, UK) was injected in the same spot without
damaging leaf tissue. Cell death was measured at 4 dpi.

RESULTS AND DISCUSSION
Sequencing of Pst-infected wheat leaves and de novo transcriptome assembly
We collected total RNA from Pst-infected wheat leaves of susceptible and
resistant seedlings to analyze gene expression differences between the compatible and
incompatible pathogen-host interactions. The sequence reads have been deposited in
the NCBI BioProject database under the accession number PRJNA545454. A total of
9,204,786 clean reads for Pst-infected susceptible wheat (Avocet-S_PST) and
6,899,596 clean reads for Pst-infected resistant wheat (Avocet-YR10_PST)
were obtained after quality filtering. Their Q20 values were 98.16% and 97.90%,
respectively (Table 1).

The clean reads were assembled into more than 100,000 contigs with an average length
of more than 200 nucleotides using Trinity software (Table 2). Unigenes were
clustered based on their level of similarity with each other. A cluster was defined as
containing unigenes with more than 70% similarity. We generated 12,956 distinct
unigene clusters for Avocet-YR10_PST and 13,564 for Avocet-S_PST (Table 2). A
total of 42,275 and 55,004 unigenes came from single genes within the two sets,
respectively. These are listed in Table 2 as distinct singletons. The length
distribution of assembled contigs and unigenes is provided in Fig S1.

Functional annotation and classification of de novo transcriptome
The unigenes obtained from both Avocet-S_PST and Avocet-YR10_PST were merged
into a single integrated data set to conduct an overall interpretation of transcriptome
sequencing. The unigenes were aligned against the following databases: NR, NT,
Swiss-Prot, KEGG, COG, and GO for annotation via the Blastx program (e-value <
0.00001). Overall, 26,843 of 61,105 unigenes, corresponding to 43.9%, were annotated
in one or more databases (Table 3). The detailed annotation results are listed in Table
S1. The NR database provided the highest number of annotations. Approximately half
of the annotated unigenes shared high similarity (>60%) and aligned to either Pst genes
(41.4%) or genes of other fungal species (Fig 2). The results of COG classification and
GO functional annotation of the unigenes are presented in Fig S2.

Discovery of differentially expressed unigenes (DEGs)
We used the FPKM (RPKM) method [19] to determine DEGs between compatible and
incompatible interactions. We identified 10,550 DEGs with a false discovery rate
(FDR) of less than 0.001 and an expression difference of at least two-fold (Fig 3).

We mapped the DEGs to our custom database constructed by merging the pathogen
and host genomes. Local Blastx analysis (e-value < 0.00001) revealed that 6,220 of the
DEGs aligned best to the Pucciniales genome and 2,880 of the DEGs aligned best to
the wheat genome. The remaining DEGs yielded no significant homology to either the
pathogen or the host; hence, they are considered to be unknown novel DEGs. The 6,220
Pst-mapped DEGs (PstDEGs) were selected as the differentially expressed,
pathogen-associated candidate genes to proceed with for secretome analysis.

The identified PstDEGs were subjected to Blast2GO analysis to elucidate their
functional classification (Table S2). A graphical view of the molecular functions and
biological processes of the PstDEGs is displayed in Fig 4. We were able to annotate the
majority (75.8%) of PstDEGs with their corresponding functional classification. The
proportions of proteins with catalytic activity (61.73%) and
binding activity (61.07%) attributes were practically equal in the Blast2GO level 2 364
analysis (Gene Ontology (GO) specificity increases as the “level” of the analysis gets 365
higher). During level 3 analysis of the PstDEGs, we observed that the proteins with 366
catalytic activity belonged to the hydrolase, transferase, and oxidoreductase categories 367
(Fig 4A). The Blast2GO analysis illustrates that the pathogen undergoes radical 368
changes during disease progression. A large number of PstDEGs (6,220 genes 369
compared to the total predicted protein number of Pst genes which is over 20,000) were 370
sorted into various categories of biological processes (Fig 4B). Therefore, we wanted 371
to determine the number of PstDEGs that sorted as enzymes. In total, we cataloged 372
1,824 PstDEGs as enzymes. Among them, hydrolases were the most common, followed 373
by transferases and oxidoreductases (Fig 4C). 374

Identification and characterization of the differentially expressed secretome
We used a previously constructed pipeline that has been defined and validated by several research groups [6, 23, 24]. Applying it to all 6,220 differentially expressed genes identified in our transcriptome data (PstDEGs), we extracted a secretome repertoire of 230 small secreted proteins of Puccinia striiformis, referred to as PstDESSPs. We identified other small secreted proteins by applying the same pipeline to all of the available genome sequences of Puccinia species: Pst, Pgt, and Ptt (http://www.broadinstitute.org/). We then compared all of the identified secretomes using a method similar to that of Xia et al., 2017 [10]. Among the 230 small secreted proteins, 94 were predicted to be effectors by the EffectorP 2.0 software and were classified as Pst-Small Secreted Candidate Effectors (PstSSCEs) (Fig 5A). PstDESSPs made up 3.7% of all differentially expressed Pst genes (230 out of 6,220). This proportion is considerably smaller than the proportion of small secreted proteins predicted from the PST-78 genome (6.5%, 1,332 out of 20,482) and from the other Pucciniales genomes. We therefore concluded that the number of small secreted proteins actively involved at 10 dpi is much lower than the number predicted. This indicates that the number of small secreted proteins can be overestimated when it is predicted from genome sequences alone.
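
The two proportions compared above follow directly from the reported counts. The short check below only reproduces that arithmetic; the helper name is hypothetical.

```python
def percent(part, whole):
    """Express part/whole as a percentage."""
    return 100.0 * part / whole


expressed = percent(230, 6_220)     # PstDESSPs among the PstDEGs
predicted = percent(1_332, 20_482)  # predicted SSPs in the PST-78 genome
print(f"{expressed:.1f}% vs {predicted:.1f}%")  # prints "3.7% vs 6.5%"
```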

Certain types of gene functions are useful for the fungus in pathogen-host interactions, including Carbohydrate-Active Enzymes (CAZymes), proteases, peroxidases, and lipases. The PstDESSPs were compared against several established public databases (MEROPS, dbCAN, LED, fPoxDB) in order to predict their functions. We identified 4 proteases and protease inhibitors, 4 CAZymes, 11 lipases, and 9 peroxidases among the PstDESSPs (Fig 5B). We searched the Pathogen-Host Interaction (PHI) database to predict the virulence attributes of each PstDESSP based on sequence similarity. A total of 24 PstDESSPs exhibited significant similarity to at least one entry in the PHI database. For example, Unigene31938 (Pstg_04010) had a predicted ‘copper-zinc superoxide dismutase’ domain match with PHI383 (C. albicans) and PHI6412 (P. striiformis) (Table S3). Although the level of similarity barely exceeded the cutoff value, both of these entries have a superoxide dismutase domain. PHI6412 (PsSOD1) has been reported to increase the resistance of the pathogen against host-derived oxidative stress [42]. These results suggest that all of the PstDESSPs that matched entries in the PHI database are worth exploring further in in vivo systems. Hence, comparison against multiple databases provides a better way to discover CSEPs and to understand the importance of the PstDESSPs listed in Table S3. Furthermore, a considerable percentage (80%) of the PstDESSPs remain uncharacterized (Fig 5B), because candidate effectors tend to show homology only to pathogenicity-related domains and not to other typical functional domains.

To understand the relationship between PstDESSPs, we generated a phylogenetic tree from a multiple sequence alignment. The outcome suggests that there are no significantly conserved sequences among the 230 PstDESSPs. In general, we identified three main branches (or classes) of PstDESSPs based on sequence similarity. The purple branch in Fig 5C contains a substantial number of the PstDESSPs and is subdivided into two smaller arms. No significantly conserved motifs were detected among the PstDESSPs, except for the Y/F/WxC motif (Table S3), which is common among powdery mildew effector candidates [43].

The PstDESSPs determined in this study were compared with other published expression data sets of Pst to investigate whether the SSPs in this study were unique or similar to haustorial differentially expressed effector candidates. Cantu and colleagues studied the expression data for effectors expressed in infected leaf and haustoria samples and provided a pioneering list of SSP tribes in their analysis [7, 8]. We analyzed the similarity between tribes of SSPs with the Blastp program (e-value < 0.00001). A total of 116 PstDESSPs shared significant homology with haustoria-expressed secreted effector proteins (HESPs), and 133 PstDESSPs were homologous to infected leaves-expressed secreted effector proteins (IESPs) (Table S3). The remaining 96 PstDESSPs, which did not match HESPs or IESPs, were found to be differentially expressed at 10 dpi in this study. Comparison with the list of haustorial secreted proteins (HSPs) published by Garnica et al. indicated that 94 of the PstDESSPs are highly similar to HSPs [9]. Overall, 32 of the PstDESSPs that we identified are not similar to any entry in the previously reported data sets, signifying their exclusivity to the present study. Xia and colleagues [10] screened effector candidates for avirulence (Avr) by comparing the expression data sets of Cantu et al., 2013 [8]. We compared our list of PstDESSPs with their list of Avr candidates, and the results suggest that three previously overlooked Avr candidates and six known Avr candidates are present in our PstDESSP data set (Table S5).

The cysteine content of a candidate effector protein has long been used as a criterion in effector prediction [8–10, 23, 24]. Although several effectors, such as AvrM [44] and AvrL567 [45], are not rich in cysteine, we examined the cysteine content of the PstDESSPs. We identified 64 cysteine-rich small secreted effector proteins (SCEPs) that have four or more cysteine residues in their mature protein sequence (excluding the N-terminal signal peptide) (Table S5).
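
The cysteine criterion above counts residues in the mature protein only. A minimal sketch of that count is given below; the function names and the toy sequence are hypothetical, and the signal peptide length is assumed to come from a separate prediction step.

```python
def mature_cysteine_count(protein_seq, signal_peptide_len):
    """Count cysteines after removing the N-terminal signal peptide."""
    mature = protein_seq[signal_peptide_len:]
    return mature.upper().count("C")


def is_cysteine_rich(protein_seq, signal_peptide_len, min_cys=4):
    """Apply the 'four or more cysteines in the mature protein' rule."""
    return mature_cysteine_count(protein_seq, signal_peptide_len) >= min_cys


# Toy sequence (not a real Pst protein): 8-residue signal peptide + mature part.
toy = "MKCLLAVC" + "ACDCEFCGHC"
print(mature_cysteine_count(toy, 8))  # prints 4
```

Counting over the full-length sequence would give six cysteines for this toy example, which shows why the signal peptide has to be removed before the threshold is applied.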

To find the most promising subset of candidate effectors, we focused on the PstDESSPs that produced positive results in the EffectorP analysis, had more than three cysteine residues, and had homologous matches in the previously mentioned reports (HESPs, IESPs, and HSPs). A total of 14 PstDESSPs met all five criteria and were defined as reliable CSEPs for further investigation (Fig 5D). A more detailed description of the features of the 14 CSEPs is provided in Table 4.
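
The five criteria can be read as a simple conjunction: EffectorP-positive, more than three cysteines, and a homolog in each of the HESP, IESP, and HSP lists. The sketch below encodes that reading; it is an assumption that a match in each of the three lists was required, and the class, field names, and example records are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    effectorp_positive: bool
    mature_cysteines: int
    hesp_homolog: bool
    iesp_homolog: bool
    hsp_homolog: bool


def meets_all_criteria(c):
    """True only when all five filters described in the text are satisfied."""
    return (
        c.effectorp_positive
        and c.mature_cysteines > 3
        and c.hesp_homolog
        and c.iesp_homolog
        and c.hsp_homolog
    )


# Made-up records for illustration only.
candidates = [
    Candidate("toy_A", True, 6, True, True, True),
    Candidate("toy_B", True, 3, True, True, True),   # too few cysteines
    Candidate("toy_C", True, 8, True, False, True),  # no IESP homolog
]
shortlist = [c.name for c in candidates if meets_all_criteria(c)]
print(shortlist)  # prints ['toy_A']
```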

Subcellular localization of Unigene17495 (Pstg_10917)
One of the CSEPs from Table 4 was chosen at random to test the reliability of the presented PstDESSPs data. Unigene17495 (Pstg_10917) was predicted to have a transit peptide sequence following the N-terminal secretion signal. Thus, if the effector candidate is translocated into the host cell, it may be able to partially target the chloroplast, a site of control for programmed cell death (PCD) against both biotic and abiotic stressors. Heterologous expression of GFP-tagged Unigene17495 in N. benthamiana leaves showed chloroplast localization (Fig 6). This localization was unexpected for a protein predicted to be secreted from the cell. However, when we re-checked the predictions with other algorithms, TargetP predicted a secretion signal and, only after removal of that signal, also detected a transit peptide within the remaining sequence (111 a.a.). Localizer, on the other hand, predicted a transit peptide overlapping the secretion signal region. Therefore, the experimentally observed chloroplast localization, instead of the originally predicted secretion from the cell, can be explained by the presence of a transit peptide and a possible disruption of the secretion signal due to a particular condition in the vicinity of the molecule. Alternatively, in the natural host, trafficking to both locations may be possible, or a slight change in physiological conditions may favor one target location over the other.

Unigene17495 (Pstg_10917) suppresses INF1-induced cell death
Unigene17495 (Pstg_10917) is conserved among Pst relatives. However, none of the homologs have previously been studied for their function. Interestingly, an effector candidate of the soybean rust pathogen (Phakopsora pachyrhizi), PpEC82 [46], resembles Unigene17495 (Pstg_10917) with an e-value of e-14. PpEC82 was reported to localize to the cytosol and nucleus with strong aggregation properties [47], unlike Unigene17495 (Pstg_10917), which targets the chloroplast as we have demonstrated. Moreover, Qi et al. stated that PpEC82 was able to suppress BAX-induced cell death when expressed in yeast cells [47]. Accordingly, we conducted cell death suppression assays to test whether Unigene17495 (Pstg_10917) has suppression abilities similar to those of its distant homolog, PpEC82. We used the elicitor INF1 as a cell death inducer and categorized the observed suppression of cell death into four levels: strong, moderate, weak, or no suppression. Across 15 replicates of Unigene17495-treated N. benthamiana leaves, we observed three cases of strong, three cases of moderate, and three cases of weak cell death suppression. The other six leaves showed no significant cell death suppression. In all 15 trials, leaves treated with the negative controls (GFP and SP-GFP) showed tissue collapse and cell death in the infiltrated area. Therefore, we concluded that Unigene17495 (Pstg_10917) without its signal peptide (SP) was able to partially suppress programmed cell death triggered by the INF1 elicitor (Fig 7). We emphasize that not all attempts produced equal levels of cell death suppression. Although expression of the effector was verified by microscopy at 2 dpi for all of the leaves, 6 of the 15 leaves showed no detectable suppression of INF1-induced cell death. Fisher’s exact test gave a P-value of 0.0007 for these findings, and thus the results were significant (P < 0.05). Together, these data show that Pstg_10917ΔSP-GFP (Unigene17495) is capable of significantly suppressing PCD induced by INF1.
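
The reported P-value can be approximated with a two-sided Fisher's exact test on one contingency table that is consistent with the counts above. The table layout is an assumption, since the text does not state which table was tested: 9 of 15 effector-treated leaves with any suppression versus 0 of 15 control leaves. The function name is hypothetical and only the Python standard library is used.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Assumed table: rows = effector, control; columns = suppression, no suppression.
p = fisher_exact_two_sided(9, 6, 0, 15)
print(f"{p:.4f}")  # prints 0.0007
```

Under this assumed table the result rounds to 0.0007, which matches the value given in the text. Treating only strong suppression as a positive outcome would give a different table and a different P-value.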

It should be noted that heterologous expression systems are a double-edged sword because the cell death response in N. benthamiana is variable. However, the versatility of the system provides a fast and effective way to study candidate effectors. Other candidate effectors of Pst have been studied using heterologous expression systems. For example, the Pst effector PstHa5a23 was reported to suppress PCD triggered by BAX, INF1, and two mitogen-activated protein kinases (MKK1 and NPK1) involved in host cell resistance [48]. PstHa5a23 is a homolog of Pstg_00676, which is in our PstDESSPs list. In a recent publication, Pstg_14695 and Pstg_09266 were reported to suppress PCD as part of effector-triggered immunity [49]. Furthermore, Petre et al. used mass spectrometry to identify host cell factors that interact with GFP-tagged candidate effectors in co-immunoprecipitation samples [50]. Eleven of the twenty candidate effectors reported in that study are homologous to our PstDESSPs. These recent reports support the reliability of our PstDESSPs data as a collection of candidate effectors that can be used for effector function studies in the future.

Conclusion
The obligate biotrophic fungus Pst causes yellow rust disease by establishing an interface between the host and pathogen using small secreted proteins. Transcriptome sequencing has proven a useful strategy for elucidating the crosstalk of proteins between host and pathogen, including the interaction between wheat and Pst. We compared compatible and incompatible interactions in order to identify the differentially expressed genes (PstDEGs) at 10 dpi. In silico predictions highlighted the SSPs within the group of PstDEGs. The annotations, comparisons, and characterizations led to a list of the most promising SSPs that may function as phytopathogen effectors. The presented SSP data provide candidates for functional analysis to better understand yellow rust disease in wheat. In particular, we were able to determine that Unigene17495 (Pstg_10917) from our PstDESSPs list localizes to chloroplasts when expressed in N. benthamiana and is capable of partially suppressing INF1-induced programmed cell death.

ACKNOWLEDGMENTS
This study was funded by COST-113Z350 (call FA1208), PI: MSA. BD (Dagvadorj), ACO, and AA-O were supported by TUBITAK-BIDEB: 2215 and 2211/C, respectively. The authors want to thank the Kamoun Lab, Sainsbury Lab, and Tolga Bozkurt of Imperial College London, Department of Life Sciences, London, SW7 2AZ, UK, for vectors used in the study.

AUTHOR CONTRIBUTIONS
Ahmet Caglar Ozketen: Methodology, Investigation, Data Curation, Validation, Writing – original draft preparation, Writing – review & editing
Ayse Andac-Ozketen: Investigation, Validation
Bayantes Dagvadorj: Investigation
Burak Demiralay: Data Curation
Mahinur S. Akkaya: Methodology, Funding acquisition, Supervision, Writing – review & editing

REFERENCES
1. Stotz HU, Mitrousia GK, de Wit PJGM, Fitt BDL. Effector-triggered defence against apoplastic fungal pathogens. Trends in Plant Science. 2014. doi:10.1016/j.tplants.2014.04.009
2. Garnica DP, Nemri A, Upadhyaya NM, Rathjen JP, Dodds PN. The Ins and Outs of Rust Haustoria. PLoS Pathog. 2014;10: 10–13. doi:10.1371/journal.ppat.1004329
3. De Jonge R, Bolton MD, Thomma BPHJ. How filamentous pathogens co-opt plants: The ins and outs of fungal effectors. Current Opinion in Plant Biology. 2011. doi:10.1016/j.pbi.2011.03.005
4. Boller T, Felix G. A Renaissance of Elicitors: Perception of Microbe-Associated Molecular Patterns and Danger Signals by Pattern-Recognition Receptors. Annu Rev Plant Biol. 2009;60: 379–406. doi:10.1146/annurev.arplant.57.032905.105346
5. Yin C, Chen X, Wang X, Han Q, Kang Z, Hulbert SH. Generation and analysis of expression sequence tags from haustoria of the wheat stripe rust fungus Puccinia striiformis f. sp. tritici. BMC Genomics. 2009;10: 1–9. doi:10.1186/1471-2164-10-626
6. Duplessis S, Hacquard S, Delaruelle C, Tisserant E, Frey P, Martin F, et al. Melampsora larici-populina Transcript Profiling During Germination and Timecourse Infection of Poplar Leaves Reveals Dynamic Expression Patterns Associated with Virulence and Biotrophy. Mol Plant-Microbe Interact. 2011;24: 808–818. doi:10.1094/MPMI-01-11-0006
7. Cantu D, Govindarajulu M, Kozik A, Wang M, Chen X, Kojima KK, et al. Next generation sequencing provides rapid access to the genome of Puccinia striiformis f. sp. tritici, the causal agent of wheat stripe rust. PLoS One. 2011;6: 4–11. doi:10.1371/journal.pone.0024230
8. Cantu D, Segovia V, MacLean D, Bayles R, Chen X, Kamoun S, et al. Genome analyses of the wheat yellow (stripe) rust pathogen Puccinia striiformis f. sp. tritici reveal polymorphic and haustorial expressed secreted proteins as candidate effectors. BMC Genomics. 2013;14. doi:10.1186/1471-2164-14-270
9. Garnica DP, Upadhyaya NM, Dodds PN, Rathjen JP. Strategies for Wheat Stripe Rust Pathogenicity Identified by Transcriptome Sequencing. PLoS One. 2013; doi:10.1371/journal.pone.0067150
10. Xia C, Wang M, Cornejo OE, Jiwan DA, See DR, Chen X. Secretome characterization and correlation analysis reveal putative pathogenicity mechanisms and identify candidate avirulence genes in the wheat stripe rust fungus Puccinia striiformis f. sp. tritici. Front Microbiol. 2017;8. doi:10.3389/fmicb.2017.02394
11. Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, et al. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotechnol. 2011; doi:10.1038/nbt.1883
12. Pruitt KD, Tatusova T, Maglott DR. NCBI reference sequences (RefSeq): A curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res. 2007; doi:10.1093/nar/gkl842
13. Bairoch A, Apweiler R. The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nucleic Acids Res. 2000;28: 45–8. Available: http://www.ncbi.nlm.nih.gov/pubmed/10592178
14. Kanehisa M, Goto S. KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 2000;28: 27–30. Available: http://www.ncbi.nlm.nih.gov/pubmed/10592173
15. Tatusov RL, Fedorova ND, Jackson JD, Jacobs AR, Kiryutin B, Koonin E V, et al. The COG database: an updated version includes eukaryotes. BMC Bioinformatics. 2003;4: 41. doi:10.1186/1471-2105-4-41
16. Iseli C, Jongeneel C V, Bucher P. ESTScan: a program for detecting, evaluating, and reconstructing potential coding regions in EST sequences. Proceedings Int Conf Intell Syst Mol Biol. 1999; 138–48. doi:10.1007/s12576-011-0153-z
17. Conesa A, Götz S, García-Gómez JM, Terol J, Talón M, Robles M. Blast2GO: A universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics. 2005;21: 3674–3676. doi:10.1093/bioinformatics/bti610
18. Ye J, Fang L, Zheng H, Zhang Y, Chen J, Zhang Z, et al. WEGO: A web tool for plotting GO annotations. Nucleic Acids Res. 2006; doi:10.1093/nar/gkl031
19. Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods. 2008; doi:10.1038/nmeth.1226
20. Kanehisa M, Araki M, Goto S, Hattori M, Hirakawa M, Itoh M, et al. KEGG for linking genomes to life and the environment. Nucleic Acids Res. 2008; doi:10.1093/nar/gkm882
21. Jia J, Zhao S, Kong X, Li Y, Zhao G, He W, et al. Aegilops tauschii draft genome sequence reveals a gene repertoire for wheat adaptation. Nature. 2013; doi:10.1038/nature12028
22. Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, et al. Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Research. 1997. doi:10.1093/nar/25.17.3389
23. Saunders DGO, Win J, Cano LM, Szabo LJ, Kamoun S, Raffaele S. Using Hierarchical Clustering of Secreted Protein Families to Classify and Rank Candidate Effectors of Rust Fungi. Stajich JE, editor. PLoS One. 2012;7: e29847. doi:10.1371/journal.pone.0029847
24. Hacquard S, Joly DL, Lin Y-C, Tisserant E, Feau N, Delaruelle C, et al. A Comprehensive Analysis of Genes Encoding Small Secreted Proteins Identifies Candidate Effectors in Melampsora larici-populina (Poplar Leaf Rust). Mol Plant-Microbe Interact. 2012;25: 279–293. doi:10.1094/MPMI-09-11-0238
25. Petersen TN, Brunak S, von Heijne G, Nielsen H. SignalP 4.0: discriminating signal peptides from transmembrane regions. Nat Methods. 2011; doi:10.1038/nmeth.1701
26. Möller S, Croning MDR, Apweiler R. Evaluation of methods for the prediction of membrane spanning regions. Bioinformatics. 2001; doi:10.1093/bioinformatics/17.7.646
27. Sperschneider J, Gardiner DM, Dodds PN, Tini F, Covarelli L, Singh KB, et al. EffectorP: Predicting fungal effector proteins from secretomes using machine learning. New Phytol. 2016;210: 743–761. doi:10.1111/nph.13794
28. Sperschneider J, Dodds PN, Gardiner DM, Singh KB, Taylor JM. Improved prediction of fungal effector proteins from secretomes with EffectorP 2.0. Mol Plant Pathol. 2018; doi:10.1111/mpp.12682
29. Rawlings ND, Barrett AJ, Thomas PD, Huang X, Bateman A, Finn RD. The MEROPS database of proteolytic enzymes, their substrates and inhibitors in 2017 and a comparison with peptidases in the PANTHER database. Nucleic Acids Res. 2018; doi:10.1093/nar/gkx1134
30. Yin Y, Mao X, Yang J, Chen X, Mao F, Xu Y. DbCAN: A web resource for automated carbohydrate-active enzyme annotation. Nucleic Acids Res. 2012; doi:10.1093/nar/gks479
31. Choi J, Détry N, Kim KT, Asiegbu FO, Valkonen JP, Lee YH. FPoxDB: Fungal peroxidase database for comparative genomics. BMC Microbiol. 2014; doi:10.1186/1471-2180-14-117
32. Urban M, Pant R, Raghunath A, Irvine AG, Pedro H, Hammond-Kosack KE. The Pathogen-Host Interactions database (PHI-base): Additions and future developments. Nucleic Acids Res. 2015; doi:10.1093/nar/gku1165
33. Sievers F, Wilm A, Dineen D, Gibson TJ, Karplus K, Li W, et al. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol Syst Biol. 2011; doi:10.1038/msb.2011.75
34. Letunic I, Bork P. Interactive tree of life (iTOL) v3: an online tool for the display and annotation of phylogenetic and other trees. Nucleic Acids Res. 2016; doi:10.1093/nar/gkw290
35. Emanuelsson O, Nielsen H, Brunak S, Von Heijne G. Predicting subcellular localization of proteins based on their N-terminal amino acid sequence. J Mol Biol. 2000; doi:10.1006/jmbi.2000.3903
36. Sperschneider J, Catanzariti AM, Deboer K, Petre B, Gardiner DM, Singh KB, et al. LOCALIZER: Subcellular localization prediction of both plant and effector proteins in the plant cell. Sci Rep. 2017; doi:10.1038/srep44598
37. Horton P, Park KJ, Obayashi T, Fujita N, Harada H, Adams-Collier CJ, et al. WoLF PSORT: Protein localization predictor. Nucleic Acids Res. 2007; doi:10.1093/nar/gkm259
38. Sperschneider J, Dodds PN, Singh KB, Taylor JM. ApoplastP: prediction of effectors and plant proteins in the apoplast using machine learning. New Phytol. 2018; doi:10.1111/nph.14946
39. Karimi M, Inzé D, Depicker A. GATEWAY™ vectors for Agrobacterium-mediated plant transformation. Trends in Plant Science. 2002. doi:10.1016/S1360-1385(02)02251-3
40. Win J, Kamoun S, Jones AME. Purification of effector-target protein complexes via transient expression in Nicotiana benthamiana. Methods Mol Biol. 2011; doi:10.1007/978-1-61737-998-7_15
41. Dagvadorj B, Ozketen AC, Andac A, Duggan C, Bozkurt TO, Akkaya MS. A Puccinia striiformis f. sp. tritici secreted protein activates plant immunity at the cell surface. Sci Rep. 2017;7: 1–10. doi:10.1038/s41598-017-01100-z
42. Liu J, Guan T, Zheng P, Chen L, Yang Y, Huai B, et al. An extracellular Zn-only superoxide dismutase from Puccinia striiformis confers enhanced resistance to host-derived oxidative stress. Environ Microbiol. 2016;18: 4118–4135. doi:10.1111/1462-2920.13451
43. Godfrey D, Böhlenius H, Pedersen C, Zhang Z, Emmersen J, Thordal-Christensen H. Powdery mildew fungal effector candidates share N-terminal Y/F/WxC-motif. BMC Genomics. 2010;11. doi:10.1186/1471-2164-11-317
44. Catanzariti A-M, Dodds PN, Lawrence GJ, Ayliffe MA, Ellis JG. Haustorially Expressed Secreted Proteins from Flax Rust Are Highly Enriched for Avirulence Elicitors. Plant Cell Online. 2006;18: 243–256. doi:10.1105/tpc.105.035980
45. Dodds PN, Lawrence GJ, Catanzariti A, Ayliffe MA, Ellis JG. The Melampsora lini AvrL567 Avirulence Genes Are Expressed in Haustoria and Their Products Are Recognized inside Plant Cells. Plant Cell Online. 2004;16: 755–768. doi:10.1105/tpc.020040
46. Link TI, Lang P, Scheffler BE, Duke M V, Graham MA, Cooper B, et al. The haustorial transcriptomes of Uromyces appendiculatus and Phakopsora pachyrhizi and their candidate effector families. Mol Plant Pathol. 2014;15: 379–393. doi:10.1111/mpp.12099
47. Qi M, Grayczyk JP, Seitz JM, Lee Y, Link TI, Choi D, et al. Suppression or Activation of Immune Responses by Predicted Secreted Proteins of the Soybean Rust Pathogen Phakopsora pachyrhizi. Mol Plant-Microbe Interact. 2018;31: 163–174. doi:10.1094/MPMI-07-17-0173-FI
48. Cheng Y, Wu K, Yao J, Li S, Wang X, Huang L, et al. PSTha5a23, a candidate effector from the obligate biotrophic pathogen Puccinia striiformis f. sp. tritici, is involved in plant defense suppression and rust pathogenicity. Environ Microbiol. 2017;19: 1717–1729. doi:10.1111/1462-2920.13610
49. Ramachandran SR, Yin C, Kud J, Tanaka K, Mahoney AK, Xiao F, et al. Effectors from Wheat Rust Fungi Suppress Multiple Plant Defense Responses. Phytopathology. 2017;107: 75–83. doi:10.1094/PHYTO-02-16-0083-R
50. Petre B, Saunders DGO, Sklenar J, Lorrain C, Krasileva K V., Win J, et al. Heterologous expression screens in Nicotiana benthamiana identify a candidate effector of the wheat yellow rust pathogen that associates with processing bodies. PLoS One. 2016;11: 1–16. doi:10.1371/journal.pone.0149035

FIGURE LEGENDS 726
Fig 1. Flowchart to summarize sampling, sequencing, and in silico analysis. 727
Pipeline for the steps performed to achieve de novo assembly, expression analysis, and 728
mapping of Pst genes. * = FDR (False Discovery Rate). 729
Table 1. Sequencing statistics for Pst inoculated/infected wheat transcriptome.
Samples                        Avocet-S_Pst    Avocet-Yr10_Pst
Total Clean Reads              9,204,786       6,899,596
Total Clean Nucleotides (nt)   828,430,740     620,963,640
Q20                            98.16 %         97.90 %
GC                             51.44 %         54.04 %
Q20 is the proportion of nucleotides with a quality value larger than 20.
GC is the proportion of guanine and cytosine nucleotides among total nucleotides.
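
The two definitions in the Table 1 footnote can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function names `q20_fraction` and `gc_fraction` and the toy read are assumptions made for illustration, and the strict "larger than 20" cutoff simply follows the wording of the footnote.

```python
def q20_fraction(phred_scores):
    """Fraction of bases with a Phred quality value larger than 20."""
    scores = list(phred_scores)
    if not scores:
        raise ValueError("no quality scores given")
    return sum(1 for q in scores if q > 20) / len(scores)


def gc_fraction(sequence):
    """Fraction of G and C among all nucleotides in the sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


# Toy read and quality values, invented for illustration only.
toy_read = "ATGCGC"
toy_quals = [35, 30, 21, 20, 12, 40]

print(f"Q20 = {q20_fraction(toy_quals):.2%}")
print(f"GC  = {gc_fraction(toy_read):.2%}")

# Consistency check on Table 1: clean nucleotides per clean read.
print(828_430_740 / 9_204_786, 620_963_640 / 6_899_596)
```

The last line divides the Table 1 nucleotide totals by the read totals; both samples work out to 90 nt per clean read.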
Table 2. De novo assembly statistics for Pst inoculated/infected wheat transcriptome.
Type     Sample           Number   Length (nt)  Mean Length (nt)  N50  Distinct Clusters  Distinct Singletons
Contig   Avocet-Yr10_Pst  105,866  23,189,064   219               249  -                  -
Contig   Avocet-S_Pst     128,680  29,848,685   232               272  -                  -
Unigene  Avocet-Yr10_Pst  55,231   18,753,197   340               371  12,956             42,275
Unigene  Avocet-S_Pst     68,568   24,329,132   355               394  13,564             55,004
Unigene  All              61,105   26,978,874   442               490  15,016             46,089
Distinct clusters are groups of similar unigenes (more than 70% similarity) that may originate from the same
gene or from homologous genes, whereas distinct singletons are unigenes that come from a single gene.
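
For readers unfamiliar with the N50 and mean-length columns of Table 2, the following Python sketch shows how such statistics are commonly computed from a list of sequence lengths. It assumes the usual definition of N50 (the length at which the cumulative sum of lengths, taken from longest to shortest, first reaches half of the total assembly length); the manuscript does not state which implementation produced the reported values, and the toy lengths are invented.

```python
def n50(lengths):
    """Length L such that sequences of length >= L cover at least half the assembly."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no sequence lengths given")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def mean_length(lengths):
    """Arithmetic mean of the sequence lengths."""
    values = list(lengths)
    if not values:
        raise ValueError("no sequence lengths given")
    return sum(values) / len(values)


# Toy contig lengths, invented for illustration only.
toy_lengths = [100, 200, 300, 400, 500]
print("N50:", n50(toy_lengths))
print("Mean length:", mean_length(toy_lengths))

# Consistency check on Table 2: total length / number of contigs (Avocet-Yr10_Pst).
print(round(23_189_064 / 105_866))
```

The mean lengths reported in Table 2 agree with total length divided by sequence number, rounded to the nearest nucleotide.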
Table 3. Annotation statistics of the assembled unigenes of the transcriptome
sequencing project
Sequence file   NR      NT     Swiss-Prot  KEGG    COG     GO      All
All-Unigene.fa  24,933  9,945  16,318      16,275  13,278  10,015  26,843

Fig 2. Statistics of the NR classification of unigenes. (A) The e-value distribution of
the alignment results of NR annotation. (B) The similarity distribution and (C) the
species distribution of the results of NR annotation.

Fig 3. Distribution of differentially expressed unigenes (DEGs). Red: up-regulated
genes (6851); green: down-regulated genes (3699); blue: genes with no differential
expression. Differential expression was called with the parameters FDR ≤ 0.001 and
|log2 ratio| ≥ 1.
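
The thresholds in the Fig 3 legend amount to a simple three-way classification, sketched below in Python. This is an illustration under stated assumptions rather than the authors' code: the function name `classify_unigene`, the toy records, and the sign convention (a positive log2 ratio is treated as up-regulation) are all hypothetical.

```python
from collections import Counter


def classify_unigene(log2_ratio, fdr, fdr_max=0.001, min_abs_log2=1.0):
    """Label a unigene as 'up', 'down', or 'not_de' using the Fig 3 cutoffs."""
    if fdr <= fdr_max and abs(log2_ratio) >= min_abs_log2:
        # Assumed sign convention: positive log2 ratio means up-regulated.
        return "up" if log2_ratio > 0 else "down"
    return "not_de"


# Toy records (unigene ID, log2 ratio, FDR), invented for illustration only.
toy_records = [
    ("u1", 2.3, 0.0001),
    ("u2", -1.0, 0.001),
    ("u3", 4.0, 0.05),   # large change, but FDR above the cutoff
    ("u4", 0.4, 0.0),    # significant, but change below the cutoff
]

counts = Counter(classify_unigene(ratio, fdr) for _, ratio, fdr in toy_records)
print(dict(counts))
```

Both conditions must hold at once, so a unigene with a large fold change but an FDR above 0.001 falls in the blue (not differentially expressed) group.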
Fig 4. Functional annotations of PstDEGs via the Blast2GO program. (A)
Molecular function analysis (Level 3) of PstDEGs. (B) Biological process classification
(Level 2) of PstDEGs. (C) Enzyme class distribution of PstDEGs.

Fig 5. Characterization of the PstDESSPs. (A) The number of total proteins, small
secreted proteins, and small secreted candidate effectors among our data (PstDEGs)
and three wheat rust fungi genomes: Pst, Pgt (P. graminis f. sp. tritici), and Ptt (P.
triticina). (B) Distribution of PstDESSPs for functional annotations. (C) Phylogenetic
analysis based on the protein sequence similarity of PstDESSPs. (D) Characterization
and comparison of PstDESSPs in five categories: small cysteine-rich effector proteins
(SCEPs); proteins predicted to be effector candidates by a machine learning algorithm
(EffectorP 2.0-positive); proteins matching previously reported candidate rust
effectors among haustorial expressed secreted proteins (HESPs) or infection expressed
secreted proteins (IESPs) [8]; and haustorial secreted proteins (HSPs) [9].

Table 4. List of the most promising CSEPs overlapped in all five classes in Figure 5D.
CSEP Properties                                                                Subcellular Localization Prediction
ID              Corresponding        Conserved           Length      Cys  DE   TP   Loc.  WolfPSORT  ApoplastP
                Pst78 Homologs       Domains             (a.a.) [a]  [b]  [c]  [d]  [e]
CL887.Contig1   PSTG_16009                               138         10   Up   -    -     C          AP
CL887.Contig2   PSTG_16009                               138         10   Up   -    -     C          AP
Unigene13023    PSTG_03450                               100         10   Up   -    -     C          AP
Unigene17495    PSTG_10917                               111         6    Up   C    C     M          AP
Unigene28760    PSTG_14090           DPBB_1 superfamily  112         6    Up   -    -     C          AP
Unigene29179    PSTG_04274                               82          6    Up   -    -     C          AP
Unigene31294    PSTG_00994                               254         10   Up   -    N     N          NonAP
CL5364.Contig2  PSTG_01880                               205         6    Up   -    -     ER         NonAP
CL5364.Contig3  PSTG_01881                               205         6    Up   -    -     N          AP
Unigene33801    PSTG_05079                               92          6    Up   -    -     C          AP
Unigene36354    PSTG_08969                               196         8    Up   C    -     C          AP
Unigene36901    PSTG_03879, partial                      49          6    Up   -    -     N          AP
Unigene38807    PSTG_14378                               191         10   Up   -    -     G          AP
Unigene7930     PSTG_02173                               109         4    Up   -    -     C          AP
[a] Length of the mature protein, without the N-terminal signal peptide.
[b] Number of cysteine residues in the mature protein.
[c] Differential expression in the compatible interaction relative to the incompatible interaction.
[d] Results of the TargetP (TP) program. [e] Results of the Localizer (Loc.) program.
Abbreviations: C: chloroplast, M: mitochondria, N: nucleus, G: Golgi, ER: endoplasmic reticulum, Cy: cytosolic,
AP: apoplastic, NonAP: non-apoplastic. All predictions were performed using mature protein sequences under the
assumption that mature proteins translocate into the host cell.
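
The "Length" and "Cys" columns of Table 4 both refer to the mature protein, that is, the sequence left after the N-terminal signal peptide is removed. The Python sketch below illustrates that bookkeeping under explicit assumptions: the cleavage position is taken as given (in practice it would come from a signal peptide predictor), the helper names are hypothetical, and the example sequence is invented and is not a Pst protein.

```python
def mature_protein(sequence, signal_peptide_length):
    """Return the sequence with the N-terminal signal peptide removed."""
    if not 0 <= signal_peptide_length < len(sequence):
        raise ValueError("signal peptide length must be within the sequence")
    return sequence[signal_peptide_length:]


def mature_summary(sequence, signal_peptide_length):
    """Mature length (a.a.) and cysteine count, as in the Table 4 columns."""
    mature = mature_protein(sequence, signal_peptide_length).upper()
    return {"mature_length": len(mature), "cysteines": mature.count("C")}


# Invented toy sequence: a 16-residue signal peptide followed by 10 residues.
toy_signal = "MKFLSAVALLATAVSA"
toy_mature = "CPEGKCTDCA"
toy_protein = toy_signal + toy_mature

print(mature_summary(toy_protein, len(toy_signal)))
```

Counting on the mature sequence rather than the full precursor matters because any cysteines inside the signal peptide would otherwise inflate the count.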
Fig 6. Subcellular localization of Unigene17495 (Pstg10917) tagged with GFP at the
C-terminus. (A) Expression of Pstg10917ΔSP-GFP (without the N-terminal secretion
signal). (B) and (E) Chloroplast autofluorescence. (C) and (F) Overlaid images. (D)
Expression of Pstg10917 (with the N-terminal secretion signal). Expression in N.
benthamiana leaves was visualized under a confocal microscope at 2 dpi.

Fig 7. Suppression of cell death mediated by Inf1 expression. (A) Daylight and (B)
UV light exposure. Constructs were expressed via Agrobacterium tumefaciens: (1)
GFP, (2) SP-GFP, (3) Unigene17495 without SP (Pstg10917ΔSP-GFP), and (4)
Unigene17495 (Pstg10917-GFP). The cell death elicitor (Inf1) was infiltrated into the
same region after 24 hours. Photos were taken 4 days after the Inf1 challenge.

SUPPORTING INFORMATION
Fig S1. Length distribution of contigs and unigenes after the assembly process.
Graphical representation of the length distribution of A) contigs of AvocetS_PST, B)
unigenes of AvocetS_PST, C) contigs of AvocetYR10_PST and D) unigenes of
AvocetYR10_PST.

Fig S2. Function annotation of the unigene data. The function of each of the unigenes
was determined using the A) COG or B) GO databases.

Table S1. Annotation of the assembled unigenes discovered in transcriptome
sequencing.

Table S2. Annotation of the PstDEGs mapped to the Pucciniales local database.

Table S3. In-depth characterization of the differentially expressed small
secretome profile of Pst (PstDESSPs).

Table S4. Primers used in this study for cloning.

Table S5. Matching PstDESSPs to the YR candidates identified in the study by
Xia et al., 2017.