Genomic and transcriptomic analysis unveils population evolution and development of pesticide resistance in fall armyworm Spodoptera frugiperda

Genome assembly, evaluation and annotation

A total of 110.93 Gb high-quality single tube long fragment read (stLFR) sequencing data was generated from a male individual from Yunnan Province (Table S1 and Fig. 1A). The size of the primary genome assembled by the stLFR data is 542.42 Mb, and the scaffold N50 is 507.12 kb. Using the High-throughput chromosome conformation capture (Hi-C) technology, the scaffolds were finally concatenated to 31 chromosomes, whereby scaffold N50 is 14.16 Mb. Genomes size assembled in this study fell within the scope of 246 Mb to 809 Mb of the Lepidopteran species (Triant et al., 2018). However, the size of the genome reported in the current study is larger than that of the previously assembled FAW genomes (Kakumani et al., 2014; Gouin et al., 2017; Nam et al., 2018, 2019; Nandakumar et al., 2017; Zhang et al., 2019, 2020; Xiao et al., 2020) (Table 1). The extra regions of our FAW genome compared with that published by Xiao (Xiao et al., 2020) are evenly distributed on 31 chromosomes (Fig. S2). We mapped the raw data (raw reads used for assembly of our genome and Xiao’s genome) to those regions, and the mapping rates are more than 30%, suggesting the validity of the extra regions. Besides, 1,494 genes were identified in the extra regions, of which 1,421 genes were expressed gene (average FPKM value of control > 0) (Fig. S3) and 512 genes were differential expression genes (DEGs) in 23 pesticide treatments (Fig. S4), indicating the assembled extra regions might be genuine and functional.

Figure 1figure 1

FAW genomic characterization and synteny with Spodoptera litura. (A) The male adult of fall armyworm (FAW) collected in Yunnan used for genome assembly. (B) Genomic characterization of FAW genome, it shows the GC content, gene number, repeat content, Single Nucleotide Polymorphism (SNP) density and chromosome from the center to the edge. (C) Synteny between FAW and Spodoptera litura genome. The gray lines reveal the genome-wide collinearity, and the red lines reveal the Cytochrome P450 (CYP) gene collinearity

Table 1 Comparison of SFynMstLFR and other published Spodoptera frugiperda genome assemblies.

The GC content of FAW genome is 36.52% (Fig. 1B) which fell within the scope of 31.6% to 37.7% of its closely related Lepidopteran species, Bicyclus anynana (Nowell et al., 2017). The genome covers 95% for complete BUSCO genes with duplicated BUSCO components of 9.8% (Table S2). The quality of these data was largely improved, because of more information available from other published FAW genomes (PRJNA380964, PRJNA257248, PRJEB13110, and PRJNA344686) (Table 1), including the recently published FAW assembled using the Pacbio RSII technology (Xiao et al., 2020). The sequencing data generated from libraries for Hi-C, WGS, RNA-seq were also mapped to the assembled genome, and all the mapping rates and sequencing coverage were above 90% (Tables S3 and S4), indicated that most of the sequencing data can be mapped back to the genome, and the K-mer analysis showed that the inferred genome size were various, indicating the possibly big differences on genome size between different individuals (Fig. S5 and Table S5). Moreover, the FAW expressed sequence tags (EST) and transcripts from NCBI were perfectly mapped to the genome with > 90% and > 80% mapping ratios, respectively (Table S6).

Combining de novo and homology-based search, we identified 153 Mb repeat elements, accounting for 28.24% of the FAW genome (Fig. 1B). A total of 22,201 genes were annotated using the repeat-masked genome based on de novo, RNA-seq and homology-based methods (Tables S7 and S8). The gene set covered 94.2% of complete BUSCO genes, which presented a more complete gene set than previously published FAW genome (Table S9). Totally, 93.48% of the identified genes in the gene set were identified as functional genes (Table S8). The FAW genome established in this study revealed a high sequence coverage and identity with its closely related species, Spodoptera litura (Fig. 1C), another important insect pest on vegetable and horticultural plants.

Comparative genomics revealed the possible genetic basis of pesticides resistance

Based on the 4,056,205 bp fourfold degenerate synonymous site (4DTv) sequences across the genomes of FAW and other seven Lepidopteran species, the phylogenetic analysis showed that FAW is closely related with S. litura (Fig. 2). Branch-site likelihood ratio test was performed to identify positively selected genes (PSGs) based on 20,051 single-copy orthologous genes of FAW and Bombyx mori abstracted by syntenic alignment, and a total of 363 PSGs were identified, which were enriched in reverse transcriptase domain (IPR000477), endonuclease/exonuclease/phosphatase (IPR005135), zinc finger, CCHC-type (IPR001878), fructose-2,6-bisphosphatase (IPR003094), etc. (Table S10). Of them, ABCC4, UGT and ChuaMOX genes were related to detoxification function.

Figure 2figure 2

Phylogenetic relationships among seven lepidopteran and D. melanogaster genomes. Below: the gene family number of ATP-binding cassette (ABC), Carboxylesterase (CES), Cytochrome P450 (CYP), Glutathione S-transferase (GST), UDP-glucuronosyl transferase (UGT), Aldehyde Oxidase (AOX), Chitinase Acidic (CHIA), Epoxide Hydrolase (EPHX), PIF1 5′-to-3′ DNA helicase (PIF), Patched Domain Containing (PTCHD) and protein tyrosine phosphatase (PTP) genes

Several insect gene families are supposed to be associated with detoxification function, including ATP-binding cassette (ABC) (Koenig et al., 2015), Carboxyl esterase (CES) (Satoh and Hosokawa, 2009), Cytochrome P450 (CYP) (Zhu et al., 2013), Glutathione S-transferase (GST) (Hayes and Pulford, 2008), UDP-glucuronosyl transferase (UGT) (Bock, 2016), Aldehyde Oxidase (AOX) (Chang et al., 2010), Chitinase Acidic (CHIA) (Downing et al., 2000), Epoxide Hydrolase (EPHX) (Iarmarcovai et al., 2008), PIF1 5′-to-3′ DNA helicase (PIF) (Erlandson 2009), Patched Domain Containing (PTCHD) and Protein Tyrosine Phosphatase (PTP) (Herraiz et al., 2006) gene families. Here, we scanned all of these genes or gene families to detect expanded detoxification-related genes (Fig. 2 and Table S11). A total of 1,256 CYP genes were detected from these eight insect species, and 425 CYP genes were identified in the FAW genome, indicating that they are extremely expanded in FAW than the other seven Lepidopteran species. The significant expansion trend is similar with the published FAW genome (Xiao et al., 2020). Besides, the expanded CYP genes from FAW shows quite short-time divergence, and 283 genes are specific to FAW (Fig. 3A, 3C and Table S12). According to the DNA sequence identity, the 425 CYP genes of FAW were merged into 152 gene clusters (Fig. 3D).

Figure 3figure 3

The gene tree of Cytochrome P450 gene family. (A) 1,256 Cytochrome P450 (CYP) genes in 8 species, the label color shows different species, the bar color shows different CYP gene sub-families. (B) The seq-logo plot of the motif and 10 bp flanking sequences of CYP genes in 3 species. (C) The 425 CYP genes of FAW, the bar color shows different CYP gene sub-families, 107 differential expression genes (DGEs) between all treatments and control are shown as orange triangle, 37 DGEs between Bt treatment and control are shown as red star, and the histogram means CYP gene FPKM value of control. (D) The 152 CYP genes filtered by CD-HIT, the bar color shows different CYP gene sub-families

Furthermore, the MEME software verified the accuracy of the CYP gene family for the FAW-specific CYP gene set based on the control CYP genes from B. mori and S. litura. The conservation of motif and 10 bp flanking sequences is shown as a seq-logo plot (Fig. 3B). From the highlighted seq-logo plot, the conservation of motifs in the three species (FAW, B. mori, and S. litura) displayed a striking variant. The motif sequence in FAW is more conserved than in the other two species.

Whole genome resequencing Variation discovery

To assess the genetic diversity among FAW accessions, 163 individuals from America (n = 39), Africa (n = 24) and China (n = 100) (Table S20) were mapped to the FAW assembly (Fig. 4A). The average mapping rate and sequencing coverage were 95.60% ± 3.34% and 90.41% ± 2.38%, respectively (Table S4). After filtration (See “METHODS”), we identified 45,569,102 single nucleotide polymorphism (SNPs) and 11,391,362 short genomic insertions and deletions (InDels) (less than 40 bp) with a minor allele frequency (MAF) > 0.01. About 71.71% of the SNPs are located in the intergenic regions, whereas 3.99% of them are in the coding sequences. The non-synonymous-to-synonymous substitution ratio for the SNPs in the coding regions is 0.242, indicating high quality of the SNP call sets. Moreover, 72.94% of the InDels are located in the intergenic regions, whereas 0.38% of them in the coding regions. About 68.77% of the InDels in the coding regions were estimated to cause frame shift mutations. The SNP density is shown in Figs. 1B and S6.

Figure 4figure 4

Phylogeny and population structure of FAW. (A) Sampling locations of six geographical populations. (Ethiopia: red, Kenya: light red, America: blue, South Africa: pink, Guangxi: green and Yunnan: light green) (B) PCA plots of the first three components of major FAW accessions using whole-genome SNP data. (C) Maximum likelihood phylogenetic tree of FAW population (from what data). Fabricius (Spodoptera litura) was used as an outgroup. (D) Cross-validation (CV) error of ADMIXTURE. (E) Population structure of major FAW categories estimated by ADMIXTURE. Each color represents one ancestral population. Each accession is represented by a bar, and the length of each colored segment in the bar represents the proportion contributed by that ancestral population. (F) CLR scores calculated by SweeD across the genome in both American D and Yunnan accessions. The dashed lines mark the regions at the top 0.5%. The red arrow indicates the detoxification candidate gene regions

African origin of FAW populations in China

The core set of SNPs were used to analyze the phylogeny and population structure of FAW. Principal component analysis (PCA) showed substantial genetic diversity among the six FAW populations, PC1, PC2 and PC3 revealing a total genetic variance of 5.49%, 5.34% and 2.70%, respectively (Fig. 4B). PC1 separated American FAW from other five accessions, PC2 evidently separated the four American accessions and PC3 separated AA accessions from the rest. American populations showed relatively large genetic divergence within populations, compared to Chinese and African populations.

Maximum likelihood (ML) phylogenetic analysis revealed distinct monophyletic clades for Ethiopia (ET), Kenya (KE), America (A), South Africa (SA), Guangxi (GX), and Yunnan (YN) FAW populations (Fig. 4C). This ML-tree further illustrated that the American group could be marginally divided into subgroups AA and AB-AC-AD (Fig. 4C). Chinese (YN, GX) and African (ET, KE, SA) clades displayed close phylogenetic relationship (Fig. 4C). Structure analysis showed that populations in China and Africa comprise accordant components (Fig. 4D and 4E, K = 5). Varying the number of K (presumed ancestral populations) revealed genetic distinctions between American and other accessions. When K = 2, we observed two separate clusters: American (A) and other accessions, and when K = 3, three separate clusters were exhibited: AA, AB-AC-AD and other accessions. Similar patterns are also supported by the PCA and ML-tree analyses (Fig. 4B and 4C), suggesting that Chinese accessions (YN and GX) shared much closer genetic relationships with African accessions (SA, ET, KE) than that of America (AA, AB, AC, AD). Taken together, FAW populations in China (YN, GX) migrated directly from Africa, because the genetic structure of these populations is highly mixed. Besides, we estimated admixture graphs of different geographically defined groups using TreeMix. Extensive gene flow (45.17%) were found from AB to AC when m = 1 (Fig. S7). Additional gene flow from KE to ET (14.66%) and YN (15.17%) were also found when setting the parameter m = 2 and m = 3, respectively. These findings proved that American accessions are deep divergent from other accessions, further supporting that Chinese accessions was derived from Africa, rather than directly from America.

All the samples were identified as C strain in FAW populations invading China

In our study, we identified 240 samples by using TpiE4-183, including 163 resequencing samples, 72 RNA-seq samples used for pesticide resistance analysis, one sample used for genome assembly survey, one sample used for stLFR genome assembly, one sample used for Hi-C sequencing and two RNA-seq samples used for gene annotation (Tables S1 and S20). Interestingly, all 100 samples from China and 24 samples from Africa were identified as C strain, while both C strain and R strain were found in the American samples (Fig. S8). At present, only C strain was found in the Chinese and African populations, indicating all these samples have similar genomic backgrounds.

Differentiation of FAW demographic histories

To reveal the demographic and geophylogenic history of FAW, a pairwise sequentially Markovian coalescent (PSMC) model was used to analyze SNP data from FAW populations of America (n = 10, three from AA, three from AB, two from AC, and two from AD), Africa (n = 10, four from ET, four from KE and two from SA) and China (n = 10, five from YN and five from GX). The results were scaled to the real time by assuming a generation time of 0.25 year and a neutral mutation rate of 2.9 × 10−9 per year (See “METHODS”). From the one Mya (million years ago), the effective population size of each subgroup was found to be less than 800,000 (Fig. S9). Subsequently, African populations (except for KE), all Chinese populations and AB population from America exhibited a mild Ne expansion in 100–400 Kya (thousand years ago) (Ne ≈ 500,000–1,000,000), which levels off during 10–100 Kya (Ne ≈ 750,000–1,000,000). However, American accessions (except for AB), as well as the KE population from Africa, exhibited a rapid Ne expansion during 9–200 Kya (Ne from 500,000 to even 16,000,000). Subsequently in 100 Kya, Chinese accessions, as well as ET, SA and AB populations had a rapid Ne contraction (Ne ≈ 100,000). At 7–10 Kya, American accessions (AA, AC and AD) had a rapid Ne contraction as well (Ne ≈ 50,000). Overall, we found extremely similar demographic history within Chinese FAW populations, which displayed more similar demographic distribution patterns with most African populations than with American populations.

Genetic diversity, population differentiation and natural selections

Across the FAW genome, the global nucleotide diversity (π) is well correlated to the global SNP and InDel density (Fig. S6). Subsequent analysis revealed that America accessions were highly polymorphic with AB (π = 1.02 × 10−3) being less polymorphic than AA (π = 2.59 × 10−3). AC had a nucleotide diversity of 1.95 × 10−3, whereas that of AD was 1.74 × 10−3. KE accession exhibited the highest nucleotide diversity of 3.51 × 10−3 (Fig. S10 and Table S13).

FST (Weir and Cockerham’s estimator) analysis indicated that AB branch had the greatest genetic divergence compared with the other eight accessions (Table S14), consistent with the result of nucleotide diversity analysis (Fig. S10 and Table S13). AD branch displayed great genetic divergence among SA, ET, KE, YN and GX accessions (Table S14) compared to American accessions. Thus, Chinese FAW populations were evolutionally closer to African FAW than American FAW, further suggesting that Chinese FAW originated from Africa.

We also performed to investigate potential selective signals in the genomes of each FAW group by identifying regions (phromosome length/1000) that scored top 0.5% in the model. The selective sweep signals and harbored gene number are shown in Table S15. As the different management strategies between America and China, which Bt crop has been used in America for a long history, whereas chemical pesticides have been largely applied in China, American accession (AA, AB, AC and AD) and Chinese accession (YN and GX) were selected to detect the differences. In American accession, CYP6, CYP9, UGT, SLC22A4/5, SLCO1B and ABCA3 genes were under positive selection. In Yunnan accession, ChuaMOX, SLC22A4/5, CYP6, CYP9, CES2, UGT and CYP4V2 genes were detected to have signals of selective sweep (Fig. 4F).

Transcriptomic responses of FAW to pesticides exposure

To investigate the transcriptomic profiles of FAW to pesticides, twenty-three pesticides including four biological, ten single and nine mixed chemical pesticides were applied to the third-instar larvae of FAW (Fig. 5 and Table S16). The FAW displayed differential responses to different pesticides with regard to the transcriptional expressions (Fig. 5A). Among the biological pesticides, Bacillus thuringiensis (P2) induced the most dramatically transcriptional expression with 1,231 up-regulated and 824 down-regulated genes. In the single and mixed chemical pesticides, eleven pesticides treatments induced the differential expressions of more than 1,500 genes. Then we obtained a union set with 7,991 differential expression genes (DEGs) by merging the DEGs after 23 pesticides treatment. Further clustering of the 7,991 DEGs categorized the 23 pesticides-treated expression profiles into four clusters (Fig. 5B). The four pesticide clusters represent differentially functional groups that effective (Clusters 3 and 4) or ineffective (Clusters 1 and 2) to FAW pest control. Several most effective chemical pesticides in FAW management, such as emamectin benzoate, spinetoram, chlorantraniliprole and cyantraniliprole are concentrated in Clusters 3 and 4, irrespective of single or mixing with other chemicals. Ryanodine receptor-targeted pesticides, such as flutolanil (P18), thiamethoxam (P19), cyantraniliprole (P21) and chlorantraniliprole (P23) were clustered together in Cluster 4. Emamectin benzoate-containing mixture pesticides were clustered with ryanodine receptor-targeted pesticides in Cluster 4. Different with other two biological pesticides, Bacillus thuringiensis and azadirachtin (P2 and P4) were also clustered in Cluster 4 (Fig. 5B).

Figure 5figure 5

Gene expression profiling after twenty-three pesticides exposure. (A) Number of differential expression genes (DEGs) after each pesticide treatment. (B) Pesticides clusters using the combined 7991 DEGs. (C) The heatmap of 107 DGEs of FAW cytochrome P450 genes. X axial shows the different treatment (P1–P23), the color shows the fold change of DGEs, yellow means up regulation, blue means down regulation, black means the gene is not DEG in one/some treatment(s). (D) DGEs of Bt involved in hsa00982 pathway. The red means up-regulated DEGs, the blue means down-regulated DEGs

The 107 differentially expressed P450 genes were further clustered (Fig. 5C). Cluster I represented 48 up-regulated genes, including 2 CYP2, 26 CYP3, 5 CYP4, 6 mitochondrial CYPs and 9 FAW-specific CYPs after 23 pesticides treatment (Fig. 5C). Cluster II represented 22 down-regulated genes, including 3 CYP3, 13 CYP4, 2 mitochondrial CYPs and 4 FAW-specific CYPs after 23 pesticides treatment (Fig. 5C). Further KEGG enrichment analysis showed that cutin, suberine and wax biosynthesis (map00073) pathways were mutually up-regulated after nineteen pesticides exposure (Table S17). Meanwhile, many sugar metabolism pathways, such as fructose and mannose metabolism (map00051), amino sugar and nucleotide sugar metabolism (map00520), galactose metabolism (map00052) and pentose and glucuronate interconversions (map00040) pathways were mutually down-regulated after variant pesticides treatment (Table S17).

In addition, we focused on some detoxification gene families such as CYP, UGT, GST, ABC and CES to identify the potential pesticides resistance genes:

CYP gene

Under the selection of pesticide, many insects have evolved to produce new CYPs, or CYP gene family expands, like Plutella xylostella (You et al., 2013), Aphis glycines (Wenger et al., 2017) and Bemisia tabaci (Chen et al., 2016) and so on. New CYP genes or largely expanded CYP gene family have potential roles in detoxification and/or pesticide resistance.

In this study, a remarkable change in the expression of the CYP genes was observed when FAW individuals were treated with the pesticides. Indeed, many CYP genes (107/425) were differently expressed as shown in Fig. 3C and 3D. Fold change values between treatments and controls in the 107 DEGs are shown in Table S18. There were 2,055 DEGs between Bt-treated larvae and controls, which contained 37 CYP genes including CYP4, CYP6, CYP9, CYP12 and CYP49 (Figs. 3C and S11). CYP49 and most of CYP4 showed down regulation, while CYP6, CYP9 and CYP12 exhibited up regulation, which related to pesticide resistance even the cross resistance in multiple pesticides. Bt could be an effective pesticide target at many gene like CYP. Moreover, the expression of DEGs was significantly elevated in drug metabolism—cytochrome P450 (hsa00982, Fig. 5D) and metabolism of xenobiotics by cytochrome P450 (hsa00980).

UGT gene

Glucuronidation is a major conjugation reaction catalyzed by UGT family, which is associated with detoxification process reported in many insect genomes, such as P. xylostella (You et al., 2013) and B. tabaci (Chen et al., 2016). Transcriptome analysis indicated that all 49 UGT genes were detected as expressed genes (FPKM > 0 in any repetition) (Fig. S12), of which 22 genes were DEGs between 23 treatments and control. Notably, there were 9 down-regulated DEGs between chlorfenapyr treatment and control as well as 8 down-regulated DEGs between spinetoram water treatment and control. Glucuronic acid could combine with many harmful substances in insect liver, aiding in the detoxification process, indicating that the two types of pesticides may target UGT gene effectively.

GST gene

GST catalyzes the binding of nucleophilic glutathione to various electron-friendly exogenous chemicals, critical in toxicology. Isomers of GST display unusual broad-spectrum catalysis as well as the ability to sequester non-substrate drugs and hormones. Twenty-seven out of 29 GST genes were identified as expressed genes (Fig. S13), and 14 genes were DEGs between 23 treatments and control. There were 9 down-regulated DEGs between spinetoram water treatment and control, as well as 6 down-regulated DEGs between chlorfenapyr treatment and control. As a critical role of GST gene in toxicology, these two types of pesticides may target at GST gene effectively.

ABC gene

Research shows that transporters, particularly subfamilies B and C, both of which are involved in detoxification and multidrug resistance, are up-regulated in the gut when larvae feed on plants (Koenig et al., 2015). Besides, ABC transporters are targets of Bacillus thuringiensis insecticidal toxins (Tay et al., 2015). In this study, 57 out of 58 ABC gene families (ABCA3 and ABCC4) were expressed genes (Fig. S14), and 21 genes were DEGs between 23 treatments and control. There were seven down-regulated DEGs between Emamectin*Flutolanil microemulsion treatment and control, implying the ABC gene family may be a potential effective target for combination of Emamectin and Flutolanil.

CES gene

The CES genes encode members of the large carboxylesterase family, which are also involved in detoxification or metabolic activation of various drugs, pesticides, environmental toxins and carcinogens (Satoh and Hosokawa, 2009). In this study, all 30 CES genes (CES1 and CES2) were detected as expressed genes (Fig. S15), and of which, 16 genes were DEGs between 23 treatments and control. There were eight down-regulated DEGs between Metarhizium anisopliae treatment and control as well as between chlorantraniliprole treatment and control, indicating that the CES gene might be a potentially effective target for Metarhizium anisopliae and chlorantraniliprole.

Other genes

Apart from the above genes (families), there were other DEGs under varied treatments. SLC22A3, SLC22A4 and SLC22A5 participated in poly-specific transportation of organic cations in the many organs, critical for elimination of many endogenous small organic cations as well as a wide array of drugs and environmental toxins. There were 51 expressed solute carrier family 22 (SLC22A) gene member (Fig. S16). Interestingly, there was at least one DEG between each treatment and control, and the total DEG number was thirty-one. There were 21 DEGs between spinetoram water treatment and control, and 16 DEGs between chlorantraniliprole*thiamethoxam treatment and control.

All 60 ChuaMOX genes were detected as expressed genes (Fig. S17), and 41 genes were DEGs between all 23 treatments and control. Enzyme encoded by ChuaMOX is similar to one produced in sacs of millipede Chamberlinius hualienensis. In millipedes, the sacs rupture during defensive behavior, producing benzoyl cyanide—a protective chemical. There were over 15 ChuaMOX DEGs in seven treatments (one biological, four single and two mixed chemical pesticides). Besides, SULT1E1, NNT, PTPMT1 and JHEH genes showed differential expression in different treatments.

Azadirachtin is a chemical compound in limonoid group in seeds as a secondary metabolite. There were 1,657 DEGs between azadirachtin treatment and control, which significantly enriched in detoxification (hsa04146), drug metabolism—cytochrome P450 (hsa00982, Fig. 5D) and metabolism of xenobiotics by cytochrome P450 (hsa00980) about 34 KEGG pathways (Table S19). There were 15 DEGs (Fig. S18) involved in hsa00982 and hsa00980, indicating that similar to Bt toxin, azadirachtin pesticide regulates expression of detox gene as well.

Bt, Empedobacter brevis and azadirachtin act on the digestive system of insects. There were 199 DEGs (Fig. S19) overlapped among them, including PTPMT1 and JHEH, CES1, ChuaMOX, SLC22A4 and SLC22A5 genes associated with detoxification were also among the overlapped DEGs.

留言 (0)

沒有登入
gif