Primary succession of Bifidobacteria drives pathogen resistance in neonatal microbiota assembly

To comprehensively examine NGM assembly dynamics, we expanded on phase 1 of our BBS cohort (BBS1)1,9 with an additional 688 neonatal participants (primarily day 7) in phase 2 (BBS2), effectively doubling our sampling effort. A large-scale, longitudinal metagenomic characterization of the combined BBS dataset, comprising 2,387 gut microbiota samples from 1,288 healthy UK neonates (≤1 month), enabled us to study neonatal microbiota assembly with unparalleled scale and resolution (Extended Data Fig. 1a,b and Supplementary Tables 13). To investigate the origin and both short-term and long-term stability of the NGM primary colonizers, we utilized three subgroups from the expanded BBS2 cohort. These included (1) 183 neonate–mother pairs (representing 14% of participants), referred to as investigating ‘maternal transmission’; (2) 359 participants with longitudinal sampling within the neonatal period (median = 3 samples per participant on days 4, 7 and 21; representing 28% of participants), referred to as investigating ‘neonatal longitudinal colonization’; and (3) 302 participants with paired samples taken both in the neonatal period and later in infancy (at 8.75 ± 1.98 months; representing 23% of participants), referred to as investigating ‘infancy persistence’ (Extended Data Fig. 1c).

Complementing the increased sample size, we have also updated extensive, high-quality clinical and sociodemographic metadata harmonized from BBS clinical record forms and hospital electronic records (Methods), thereby facilitating robust statistical and epidemiological assessment of primary succession patterns. Most neonates in this cohort (84.5%, N = 836) were at least partially breastfed by their mothers, with 44.1% being exclusively breastfed (N = 436). A large majority of participants at the time of infancy sampling were still being breastfed (86.2%; N = 199), with very few fully weaned (0.87%; N = 2). Only 11.3% (N = 123) received postnatal antibiotics during the first week of life (Supplementary Table 4).

Three community states in the neonatal gut microbiota

To delineate the primary succession patterns of the NGM, we sought to identify the primary colonizers driving gut microbial community structure during the neonatal period. Applying partitioning around medoids (PAM) clustering to 1,904 BBS neonatal gut metagenomes at the species level revealed an optimal clustering of three within the NGM, hereafter referred to as ‘NGM community states’10 (Fig. 1a and Extended Data Fig. 2a,b). These three community states were further validated by another widely used microbial community typing method: the Dirichlet multinomial mixture (DMM) modelling framework (Extended Data Fig. 2c,d). Both the PAM and DMM-based approaches showed strong concordance in community state assignments (Cramér’s V correlation of 0.726; Extended Data Fig. 2e) and core species compositions (Extended Data Fig. 2f). Notably, these three community states were consistently observed across the three main sampling points in the BBS cohort (days 4, 7 and 21), underscoring their representativeness of the neonatal period, irrespective of the timing of sample collection (Extended Data Fig. 3).

Fig. 1: Dominant species driving three NGM community states.figure 1

a, Principal coordinates analysis (PCoA) plots of 1,904 neonatal gut metagenomes sampled within the first 30 days of life and clustered using the PAM algorithm on the basis of species-level JSD. Three distinct NGM community states (optimal number clusters k = 3) were identified via PAM clustering. The inset pie chart displays the proportion of the three NGM community states, each labelled according to its primary driver species, namely B. breve (BB, green; N = 336, 17.6% of the samples), E. faecalis (EF, purple; N = 827, 43.4% of the samples) and B. longum (BL, orange; N = 741, 38.9% of the samples). Ellipses encapsulate 67% of the samples within each respective cluster. b, Top 10 driver species contributing to variation observed in the ordination space, as ranked by effect size (‘envfit’ R2, false discovery rate (FDR)-corrected two-sided test, P < 0.05). c,d, Each NGM community state is dominated by a single driver species, as measured by the high relative abundance of the driver species (c) and the low alpha diversity (d) across the three NGM community states (FDR-corrected, two-sided Wilcoxon test). Boxplot centre line and red point indicate the median and mean, respectively; box limits indicate the upper and lower quartiles; and whiskers indicate 1.5× the interquartile range (BB n = 336, EF n = 827, BL n = 741).

Three bacterial species, Bifidobacterium longum subsp. longum (BL), Bifidobacterium breve (BB) and Enterococcus faecalis (EF) acted as the taxonomic drivers for each community state (Fig. 1b and Extended Data Figs. 2g and 4). Each species dominated their respective NGM community states with a relative mean abundance of 56.5% for BB, 21.7% for EF and 27.2% for BL (Fig. 1c). Henceforth, they are referred to as NGM driver species with acronyms indicating each respective community state.

The observed single-species dominance of either B. breve, B. longum or E. faecalis in very early life can also be consistently observed in other cohorts, albeit underreported owing to the previous undersampling during the neonatal period (the largest sample size being <100). Evidence for this comes from diverse populations and methodologies, including 16S gene or quantitative PCR (qPCR)-based observations in Norway11 (N = 87) and Denmark12 (N = 16), as well as shotgun metagenomic surveys of neonates across industrialized urban populations similar to the UK BBS cohort in Europe (Sweden13), Asia (Israel14) and North America (the TEDDY cohort15,16) (Extended Data Fig. 5). Importantly, the NGM community states observed across industrialized cohorts are paralleled in non-industrialized populations. In a peri-urban cohort in South Asia (Bangladesh17), although B. breve continues to be a primary NGM driver species, the community states typically driven by B. longum and E. faecalis in industrialized settings are instead represented by closely similar species: B. infantis (closely related to B. longum) and Escherichia coli (sharing facultative anaerobic and opportunistic pathogenic traits with E. faecalis). Collectively, these cross-study validations strengthen the generalizability of our results in neonatal populations from different geographical regions and lifestyles beyond the UK, and using different methodologies.

Of note, B. longum subsp. infantis (B. infantis), which is closely related to BL and often used as an infant probiotic, was not identified as a driver species. It was rarely detected (~2% prevalence based on 0.5% relative abundance) in the BBS neonates14. The near absence of B. infantis in our UK neonatal cohort aligns with findings from other Western industrialized countries, including a recent meta-analysis14 of cohorts from Israel, Sweden, Finland, Estonia, Italy and the USA18, where there is little evidence of B. infantis naturally colonizing the gut microbiota of healthy, full-term infants. This underscores the importance of distinguishing between closely related species that exhibit very different host colonization patterns.

Applying metagenomic strain tracking analysis on the ‘maternal transmission’ subset, only B. longum exhibited evidence of maternal transmission, with all evaluable BL neonates (15 out of 15) harbouring the exact same B. longum strain found in their mothers’ gut microbiota. This result, consistent with a recent global meta-analysis19, strongly indicates the maternal gut microbiota as the main source of the BL community state (Extended Data Fig. 6). While we could have overlooked maternal transmission of very low-abundance B. breve and E. faecalis below the metagenomic strain detection limit, we consider it more likely that they originate from unsampled maternal (for example, B. breve in breast milk20,21) or environmental sources (for example, E. faecalis in the hospital birth environment22,23) previously implicated as potential sources of these species in the NGM.

The abundant dominance of single driver species was particularly pronounced in community state BB, in which B. breve constituted over half of the NGM by mean relative abundance, and exhibited the lowest microbial richness and evenness, as reflected by the alpha (Shannon) diversity (Fig. 1d). In comparison, the other two NGM community states, BL and EF, had higher microbial diversity, and other moderately abundant species frequently co-occurred with the driver species (Extended Data Fig. 2f,g); B. longum with commensal E. coli, Bacteroides and other Bifidobacterium species; E. faecalis with environment and skin-associated Streptococcus, Staphylococcus spp., as well as healthcare-associated opportunistic pathogens Enterococcus, Klebsiella, Enterobacter spp. and C. perfringens. Notably, these less-dominant species in EF were also known signatures of hospital CS birth not only in this UK cohort1 but also in cohorts from North America24,25, Latin America24 and Europe13,24,26.

Factors influencing the acquisition of the NGM community states

To determine the perinatal factors influencing the acquisition of each NGM community state, we performed epidemiological analyses using 20 high-quality clinical and sociodemographic metadata variables (N = 1,108 eligible participants; Fig. 2 and Supplementary Table 5). After adjusting for potential confounders in multivariate fixed-effect logistic regression models, we found that the acquisition of an EF community state was independently associated with being born via CS birth (compared to vaginal delivery (VD); adjusted odds ratio (AOR) = 2.30 [95% CI 1.34–3.95], P = 0.003; 70.5/23.6/40.0% among EF/BL/BB, respectively) and with the mother receiving intrapartum antibiotics during labour (AOR = 3.69 [95% CI 2.11–6.42], P < 0.001; 80.8/32.7/46.3% among EF/BL/BB, respectively). Conversely, being born via CS birth and labour antibiotics exposure were negatively associated with BL acquisition (AOR for CS vs VD = 0.36 [95% CI 0.21–0.64], P < 0.001; AOR for receiving antibiotics during labour = 0.46 [95% CI 0.26–0.79], P = 0.005, respectively).

Fig. 2: Clinical and sociodemographic variables associated with NGM community state in the first week of life (N = 1,108).figure 2

ac, Multivariate associations between clinical and sociodemographic variables and each week-1 NGM community state. Three different models were built: EF vs non-EF (a), BL vs non-BL (b) and BB vs non-BB (c). Likelihood ratio tests (two-sided) were used to calculate P values (without FDR correction), with P ≤ 0.05 in the multivariate models displayed. Odds ratios (OR) are plotted on a log10 scale. For details of univariate and multivariate analyses, refer to Supplementary Tables 5 and 6. The week-1 NGM community state was identified for each eligible participant using the earliest available sample from week 1, either on day 4 (N = 64) or day 7 (N = 1,044).

Interestingly, several intrinsic host factors including sex (male with BB), maternal ethnicity (Asian with EF and BB), age (<30 and ≥40 with EF and BL, respectively) and parity (first time giving birth with EF) were also independently associated with specific community states. For example, mothers identifying as Asian (compared with white participants) were more likely to acquire BB (AOR = 2.11 [95% CI 1.32–3.38], P = 0.006) but less likely to acquire EF (AOR = 0.63 [95% CI 0.41–0.95], P = 0.04; 9.0/12.1/19.5% among EF/BL/BB, respectively). It is noteworthy that BB is the only community state that was exclusively influenced by host factors and independent of any clinical factors including mode of birth and antibiotics, which may suggest a distinct route of BB acquisition that remains unaffected by the perturbations associated with hospital births. These observations align with the hypotheses that maternal factors, such as genetic determinants of breast milk composition (for example, secretor status of the mothers)27, a history of previous pregnancies or cohabitation with children28, as well as cross-cultural differences in infant-care-associated behaviours7 may influence the vertical transmission of maternal microbiota.

Neither postnatal antibiotics nor breastfeeding exposure, whether immediately after birth or within the first week of life, appeared to predispose neonates to any specific community state. This lack of association is probably attributed to the uniformly high-levels of antibiotic-free status (84.7/90.8/89.5% among EF/BL/BB, respectively) and breastfeeding rates (79.1/81.8/88.6% among EF/BL/BB, respectively) during the earliest postnatal window sampled in this cohort. The absence of an association between breastfeeding and EF also aligns with previous reports that, despite its antimicrobial properties, breast milk alone does not inhibit E. faecalis growth in vitro29,30.

Priority effects in NGM community state stability

We reasoned that the three primary colonizers as NGM drivers could benefit from priority effects, which would be evident through the exclusion of, or replacement by, later-arriving species in the NGM. To search for evidence of such priority effects, we sought to examine the stability and temporal signals of both the NGM community states and their driver species in the ‘neonatal longitudinal’ subset, stratified by birth modes. Most VD neonates who initially acquired a Bifidobacterium-dominated community state (either 92% for BB or 89% for BL, 79% or 72% by considering transient switches between day 4 and 7) during week 1 retained their community state when resampled in week 3 (Fig. 3a and Extended Data Fig. 7a). By contrast, EF was the most unstable community state, with less than half of the neonates (29% in VD and 39% in CS) remaining in their early EF community state during the neonatal period (EF vs BB AOR 16.2 [95%CI 3.84–68.10], EF vs BL AOR 13.89 [4.02–48.02]; P < 0.001; Supplementary Table 6). Irrespective of birth mode, BB proved more stable than EF (pairwise chi-square test, corrected P < 0.001), while the sample size was insufficient to be confident about the relative stability of BL in CS neonates (65% versus 48% for EF; pairwise chi-square test, corrected P = 0.52).

Fig. 3: Dynamics and stability of NGM community states and their driver species.figure 3

a,b, Stability of NGM community states (a) and levels of three species driving NGM community states (b) (week 1, based on the earlier sample of day 4 or 7) in neonates longitudinally sampled from weeks 1 to 3 (day 21, total N = 306; VD N = 140; CS N = 166). The proportion of community states that remained consistent from weeks 1 to 3 is depicted as a percentage of their initial sample size in week 1 (labelled in black). Participants starting with BB or BL on week 1 were significantly more likely to retain their community state in week 3 compared with those with EF (pairwise chi-square tests with FDR correction, P < 0.001). ce, Persistence of the dominant abundance of driver species of NGM community states in week 1 (c,d) or week 3 (e) in the paired longitudinal samples obtained later at week 3 (c) and in infancy (d,e). f, Persistent carriage of week-1 driver species in paired longitudinal samples obtained later in infancy. Species carriage is defined using a threshold of 0.1% relative abundance. Sample sizes of participants longitudinally sampled for weeks 1 and 3 shown in ac are: total N = 306; VD N = 26/39/75 among BB/EF/BL, respectively; CS N = 26/114/26 among BB/EF/BL, respectively; for week 1 and infancy (also referred to as the ‘infancy persistence’ group) shown in d and f: total N = 302; VD N = 27/43/90 among BB/EF/BL, respectively; CS N = 17/108/17 among BB/EF/BL, respectively; and for week 3 and infancy shown in e: total N = 146; VD N = 12/11/43 among BB/EF/BL, respectively; CS N = 17/37/26 among BB/EF/BL, respectively. Colour represents NGM community states or driver species: BB and B. breve in green; EF and E. faecalis in purple; BL and B. longum in orange. Boxplots as in Fig. 1. Statistical differences in abundance between time points (a), species (ce) and carriage frequency (f) were determined using paired t-tests, Wilcoxon tests and chi-square tests (all two-sided) with FDR correction, respectively.

The stability of the underlying driver species closely mirrored the observed community state dynamics. In contrast to E. faecalis, which rapidly declined throughout the stochastic assembly trajectory of the early community state EF, both B. breve and B. longum retained their high abundance within their respective community states throughout the 3-week neonatal sampling window (Fig. 3b and Extended Data Fig. 7b). Notably, both species, as late-arriving secondary colonizers (that is, colonized NGM only in week 3), exhibited signs of competitively excluding E. faecalis in CS neonates who initially acquired the EF community state (Fig. 3b). This competitive exclusion effect seemed most pronounced for B. breve; in contrast to B. longum, it was able to colonize VD neonates at increasing levels as a late-arriving species (Extended Data Fig. 7b). Among the primary colonizers that dominated the NGM in the first week, B. breve is the only species conferring durable colonization dominance (relative to the other driver species), which persisted as far as the final neonatal period sampling point at week 3 (P < 0.001 in VD and CS; Fig. 3c).

The stability of the two Bifidobacterium species is also reflected at the strain level (Extended Data Fig. 6); most of the neonates retained the same B. longum (79.5%, N = 35/44 BL neonates) or B. breve (75%, N = 24/32 BB neonates) strain they initially acquired throughout the neonatal period, in contrast to 62.3% for E. faecalis (N = 43/69 EF neonates; the denominators represent longitudinally sampled individuals with detectable strain sharing events).

Together, as primary colonizers, both Bifidobacterium species benefit from priority effects, maintaining a stable NGM assembly trajectory owing to their ability to confer durable species dominance and inhibit the later arrival of opportunistic pathogens such as E. faecalis. In particular, B. breve exhibits stronger priority effects between the two species (that is, only as a primary colonizer), as well as strong deterministic exclusion of E. faecalis (that is, as either a primary or a secondary colonizer).

Stability of NGM driver species into infancy

We also assessed the longer-term engraftment of the NGM driver species in participants resampled 6–12 months beyond the neonatal period using the ‘infancy persistence’ subset. Remarkably, the relative dominance of B. breve (over the other driver species, in VD, P < 0.05; Fig. 3d) also extended into infancy when there was still no significant difference in breastfeeding rates between early NGM community states (BB/EF/BL: 88.4%/89.6%/80.5%, chi-square test, P = 0.18). In addition, the long-term competitive exclusion effect of B. breve was evident in CS neonates who either retained or transitioned into BB (primarily from EF) by week 3. These long-term stability patterns were exclusively observed for B. breve, with its abundance in infancy being almost double in neonates who previously had a BB community state compared with those with other community states (Fig. 3e).

Although NGM driver species rarely retained their differential abundance later in infancy (except B. breve), the frequency of carriage for all three driver species was consistently higher in infants stratified by their corresponding NGM community states (Fig. 3f). As many as 93% of VD (or 77% of CS) neonates with week-1 community state of BB still carried B. breve, compared with 58% and 66% (or 65% of CS) of VD neonates with week-1 community states EF and BL, respectively (pairwise chi-square tests, P < 0.001). While levels of E. faecalis in community state EF waned over time to non-differential levels later in infancy, neonatal acquisition of EF remains a predisposing factor for longer-term carriage of E. faecalis. This opportunistic pathogen species was still detected in higher proportions (44%) in neonates from the EF community state during their first week (relative to 37–41% in BB and 35–38% in BL) when resampled later in infancy, regardless of their birth mode (pairwise chi-square tests, P < 0.001; Fig. 3f).

EF state enriched with virulence and antibiotic resistance genes

To determine the functional differences among NGM community states, we leveraged their driver species as proxies for functional analyses, using 1,249 high-quality isolate (N = 133) and metagenome-assembled genomes (N = 1,116) generated from the corresponding community state samples (BB N = 297, EF N = 561, BL N = 391; Supplementary Table 7). We found a striking difference between Bifidobacterium spp. and E. faecalis functional profiles in antimicrobial resistance (AMR) and virulence potential. Importantly, all E. faecalis strain genomes recovered from neonates with EF community states encoded known virulence factors including 70% predicted to produce the toxin cytolysin31. By contrast, both Bifidobacterium driver species genomes displayed markedly reduced levels of AMR and virulence-associated genes, with a burden 10- to a 100-fold less than in EF (median 17 versus 0; Fig. 4a). Further AMR gene screening of the entire gut resistome within each community state revealed a higher carriage of high-risk AMR genes, such as CTX-M-15 linked to extended-spectrum beta-lactamase (ESBL), in both BL and EF community states (Fig. 4b). This underscores the notable pathogenic potential of ESBL-carrying Enterobacteriaceae pathogens co-occurring in non-BB community states. These findings align with our risk factor analyses (Fig. 2), which identified maternal antibiotics exposure during labour (to some VD and all CS neonates) as a strong risk for the acquisition of an EF community state that bears increased risk of AMR and virulence.

Fig. 4: Bifidobacterium species drive resistance to antimicrobial resistance and pathogen colonization.figure 4

a, Counts of detected AMR and virulence genes in driver species genomes, with median values enclosed in brackets. Wilcoxon test (two-sided) with FDR correction; number of genomes (isolates in brackets): BB N = 297 (30), EF N = 561 (54) and BL N = 391 (49). b, Carriage of high-risk AMR genes associated with ESBL in the day-7 NGM community state samples based on raw metagenomic assemblies (BB N = 207, EF N = 498, BL N = 444). The x axis shows the most clinically prevalent ESBL genes belonging to CTX-M, OXA, SHV and TEM families. c, Proportion of species genomes, indicated by a colour gradient, predicted to utilize HMOs or their primary downstream products, lactose and fucose. The actual proportions are labelled for genotypes that are not completely present. The predictions are based on the presence of both the gene and its encoded transporters required for utilization of each substrate. 2′-fucosyllactose (2′-FL) liberates lactose and fucose which are also present in breast milk. Utilizations of LNnT, LNT and LNB will all liberate lactose. d, NGM driver species BB confers pathogen colonization resistance in vivo. The boxplot depicts the relative abundance of BB compared to the opportunistic pathogen species EF or K. oxytoca (KO). The x axis represents three experimental groups co-colonized as follows: (1) BB type strain DSM 20213 (2′-FL+) with EF; (2) BB natural variant D19 (2′-FL−, isolated from a BBS neonate) with EF; and (3) BB type strain (2′-FL+) with KO (D63). The BB genotype (2′-FL+/−) indicates whether the strain encodes the α-l-fucosidase (GH95) enzyme encoding for 2′-FL metabolism. In each co-colonization group, one group of mice also received a 2′-FL supplement (50 mg ml−1 per day) in their daily drinking water. The y axis for BB co-colonization with KO is shown on a log scale. Each experimental condition included 3–5 mice per cage and 3 technical replicate cages. Statistical differences between treatment groups were determined using a t-test with Welch’s correction (two-sided). Boxplot centre line indicates the median, box limits indicate upper and lower quartiles, and whiskers indicate 1.5× the interquartile range.

Pathogen resistance of B. breve via metabolic adaptation to HMOs

At the genome-wide functional level, we observed distinct metabolic landscapes of NGM community states based on KEGG orthologues (Extended Data Fig. 8a), particularly in metabolic repertoire of carbohydrate-active enzymes (Extended Data Fig. 8b). Both Bifidobacterium community states, in contrast to EF, exhibited an enrichment in carbohydrate-active enzymes associated with metabolizing human milk oligosaccharides (HMOs) abundant and exclusively found in human breast milk. By contrast, EF predominantly possesses genes tailored for utilizing complex dietary glycans such as mannan and chitin, as well as those like starch and cellulose that are commonly found in a plant-based diet usually consumed later in life (Extended Data Figs. 8b and 9).

Compared with the limited HMO metabolic capability of the BL community state, BB is capable of utilizing the all the major HMO substrates including lacto-N-tetraose (LNT), lacto-N-neotetraose (LNnT) and lacto-N-biose (LNB), as well as the primary end-products of HMO metabolism l-fucose and d-lactose, which are naturally present in human breast milk (Fig. 4c). Interestingly, among the three community states, only BB—comprising nearly all B. breve genomes (97.6%, N = 290/297)—encode the enzyme (α-l-fucosidase, GH95 or GH29) required for metabolizing the most abundant HMO component 2′-fucosyllactose (2′-FL). Although these B. breve strains lack known transporters for importing 2′-FL for intracellular metabolism, previous in vitro experiments have shown that similar strains are capable of growing on 2′-FL32,33. Therefore, B. breve might be able to metabolize 2′-FL via a previously uncharacterized pathway. In contrast, such capability is extremely rare among BL (5.0%, N = 19/391) and completely absent in EF (Fig. 4c). Notably, the species-level variations in HMO utilization observed in the study strains are representative of BB/BL/EF species, exhibiting patterns consistent with those previously reported34. These patterns are not influenced by breastfeeding rates in this neonatal cohort, which are uniformly high and statistically indistinguishable among the community states (79.1%, 81.8% and 88.6% for EF, BL and BB, respectively).

Given that opportunistic pathogens including E. faecalis, E. faecium, Klebsiella oxytoca, K. pneumoniae, Enterobacter cloacae and Clostridium perfringens, which are enriched in the EF community state, lack the capability to metabolize HMOs and their by-products, we hypothesize that B. breve’s versatility in utilizing these predominant neonatal dietary components substantially enhances its fitness against opportunistic pathogens in vivo. Considering that all neonates in the study would have been exposed to the same level of HMOs through a predominantly breast milk-based diet, regardless of their community state, we reason that the metabolic capability to utilize HMOs, including but not limited to 2′-FL, not only contributes to the dominance and stability of the BB community state but also enables B. breve to outcompete pathogenic species that cannot utilize HMOs. Supporting our hypothesis, we demonstrate in a gnotobiotic mouse model, co-colonized with B. breve and the opportunistic pathogen driver species E. faecalis, that B. breve dominates, and this dominance is amplified by dietary 2′-FL supplementation (Fig. 4d). The 2′-FL-mediated pathogen resistance in vivo phenotype of B. breve also extends to the Gram-negative enteropathogen K. oxytoca, albeit to a lesser extent. Importantly, the anti-pathogen effect was absent in mice colonized with a natural B. breve variant isolated from a BBS neonate lacking the α-l-fucosidase (GH95) enzyme necessary for 2′-FL metabolism. These findings suggest that B. breve’s strain-specific and gene-dependent utilization of HMOs could have a crucial role in enhancing resistance to pathogen colonization by inhibiting pathogen growth.

留言 (0)

沒有登入
gif