[1] Center for Genome Sciences and Systems Biology, Washington University in St. Louis, St. Louis, Missouri 63108, USA [2] Department of Anthropology, New School for Social Research, New York, New York 10003, USA.

4 Division of Statistical Genomics, Washington University in St. Louis, St. Louis, Missouri 63108, USA.

5 Departments of Medicine, Microbiology and Pathology, University of Virginia School of Medicine, Charlottesville, Virginia 22908, USA.

Abstract

Therapeutic food interventions have reduced mortality in children with severe acute malnutrition (SAM), but incomplete restoration of healthy growth remains a major problem. The relationships between the type of nutritional intervention, the gut microbiota, and therapeutic responses are unclear. In the current study, bacterial species whose proportional representation defines a healthy gut microbiota as it assembles during the first two postnatal years were identified by applying a machine-learning-based approach to 16S ribosomal RNA data sets generated from monthly faecal samples obtained from birth onwards in a cohort of children living in an urban slum of Dhaka, Bangladesh, who exhibited consistently healthy growth. These age-discriminatory bacterial species were incorporated into a model that computes a 'relative microbiota maturity index' and 'microbiota-for-age Z-score' that compare postnatal assembly (defined here as maturation) of a child's faecal microbiota relative to healthy children of similar chronologic age. The model was applied to twins and triplets (to test for associations of these indices with genetic and environmental factors, including diarrhoea), children with SAM enrolled in a randomized trial of two food interventions, and children with moderate acute malnutrition. Our results indicate that SAM is associated with significant relative microbiota immaturity that is only partially ameliorated following two widely used nutritional interventions. Immaturity is also evident in less severe forms of malnutrition and correlates with anthropometric measurements. Microbiota maturity indices provide a microbial measure of human postnatal development, a way of classifying malnourished states, and a parameter for judging therapeutic efficacy. More prolonged interventions with existing or new therapeutic foods and/or addition of gut microbes may be needed to achieve enduring repair of gut microbiota immaturity in childhood malnutrition and improve clinical outcomes.

Illustration of the equations used to calculate ‘relative microbiota maturity’ and ‘Microbiota-for-Age Z score’

The procedure to calculate both microbiota maturation metrics is shown for a single fecal sample from a focal child (pink circle) relative to microbiota age values calculated in healthy reference controls. These reference values are computed in samples collected from children used to validate the Random-Forests-based sparse 24-taxa model and are shown in a, as a broken line of the interpolated spline fit (- - -), and in b, as median ± SD values for each monthly chronologic age bin from months 1 to 24.
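The two metrics described in this legend can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes that relative microbiota maturity is the focal child's microbiota age minus the spline-interpolated microbiota age of healthy children at the same chronologic age, and that the Microbiota-for-Age Z score expresses the deviation from the healthy median of the matching monthly age bin in units of that bin's SD. The function names and the example numbers are hypothetical.

```python
from statistics import median, stdev


def relative_microbiota_maturity(microbiota_age, healthy_spline_age):
    """Difference (in months) between the focal child's microbiota age and
    the spline-interpolated value for healthy children of the same
    chronologic age. The spline value is supplied by the caller."""
    return microbiota_age - healthy_spline_age


def microbiota_for_age_z(microbiota_age, healthy_ages_in_bin):
    """Assumed form of the Z score: deviation from the median microbiota age
    of healthy children in the same monthly age bin, divided by the
    (sample) SD of that bin."""
    return (microbiota_age - median(healthy_ages_in_bin)) / stdev(healthy_ages_in_bin)


# Hypothetical focal child: microbiota age 8 months, healthy spline value 12 months.
print(relative_microbiota_maturity(8.0, 12.0))
# Hypothetical healthy reference values for the same monthly bin.
print(microbiota_for_age_z(8.0, [10.0, 12.0, 14.0]))
```

Negative values of either metric indicate a microbiota that appears younger than expected for the child's chronologic age.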

Transient microbiota immaturity and reduction in diversity associated with diarrhea in healthy twins and triplets

a, The transient effect of diarrhea in healthy children. Seventeen children from 10 families with healthy twins/triplets had a total of 36 diarrheal illnesses where fecal samples were collected. Fecal samples collected in the months immediately prior to and following diarrhea in these children were examined in an analysis that included multiple environmental factors in the ‘healthy twins and triplets’ birth cohort. Linear mixed models of these specified environmental factors indicated that ‘diarrhea’, ‘month following diarrhea’ and ‘presence of formula in diet’ have significant effects on relative microbiota maturity, accounting for random effects arising from within-family and within-child dependence in measurements of this maturity metric. The factors ‘postnatal age’, ‘presence/absence of solid foods’, ‘exclusive breastfeeding’, ‘enteropathogen detected by microscopy’, ‘antibiotics’ as well as ‘other periods relative to diarrhea’ had no significant effect. The numbers of fecal samples (n) are shown in parentheses. Mean values ± SEM are plotted. **, p<0.01. See for the effects of dietary and environmental covariates. b, Effect of diarrhea and recovery on age-adjusted Shannon Diversity Index (SDI). Mean values of effect on SDI ± SEM are plotted. *, p<0.05; **, p<0.01.

Microbiota variation in families with twins and triplets during the first year of life

a, Maternal influence. Heatmap of the mean relative abundances of 13 bacterial taxa (97% ID OTUs) found to be statistically significantly enriched in the first month post-partum in the fecal microbiota of mothers (see column marked ‘1’) compared to microbiota sampled between the second and twelfth months post-partum (FDR-corrected p<0.05; ANOVA of linear mixed-effects model with random by-mother intercepts). An analogous heatmap of the relative abundance of these taxa in their twin/triplet offspring is shown. Three of these 97% ID OTUs are members of the top 24 age-discriminatory taxa (blue) and belong to the genus Bifidobacterium. b-e, Comparisons of maternal, paternal and infant microbiota. Mean values ± SEM of Hellinger and unweighted UniFrac distances between the fecal microbiota of family members sampled over time were computed. Samples obtained at postnatal months 1, 4, 10 and 12 from twins/triplets, mothers and fathers were analyzed (n=12 fathers; 12 mothers; 25 children). b, Intrapersonal variation in the bacterial component of the maternal microbiota is greater between the first and fourth months after childbirth than variation in fathers. c, Distances between the fecal microbiota of spouses (each mother-father pair) compared to distances between all unrelated adults (male-female pairs). The microbial signature of co-habitation is only evident 10 months following childbirth. d,e, The degree of similarity between mother and infant during the first postpartum month is significantly greater than the similarity between microbiota of fathers and infants (panel d), while the fecal microbiota of co-twins are significantly more similar to one another than to age-matched unrelated children during the first year of life (panel e). For all distance analyses, Hellinger and unweighted UniFrac distance matrices were permuted 1,000 times between the groups tested. P-values represent the fraction of times permuted differences between tested groups were greater than real differences between groups. *, p<0.05; **, p<0.01; ***, p<0.001.
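The permutation P-value defined in this legend can be illustrated with a short sketch. It is a simplification under stated assumptions: rather than permuting a full distance matrix, it pools two lists of pairwise distances, shuffles the group labels, and counts how often the permuted difference in group means exceeds the observed one. The function names, the use of the absolute difference in means as the test statistic, and the fixed seed are choices made for this example only.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(group_a, group_b, n_permutations=1000, seed=0):
    """Fraction of label permutations whose absolute difference in group
    means is strictly greater than the observed difference."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    exceed = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign distances to the two groups at random
        permuted = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if permuted > observed:
            exceed += 1
    return exceed / n_permutations


# Hypothetical distances for two kinds of pairs (values are made up).
related_pairs = [0.42, 0.45, 0.40, 0.47, 0.44]
unrelated_pairs = [0.61, 0.58, 0.66, 0.63, 0.60]
print(permutation_p_value(related_pairs, unrelated_pairs))
```

With 1,000 permutations the smallest non-zero P-value this procedure can report is 0.001, which matches the finest significance threshold used in the legend.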

Anthropometric measures of nutritional status in children with SAM before, during and after both randomized food interventions

a-c, Weight-for-Height Z-scores (WHZ), Height-for-Age Z-scores (HAZ) and Weight-for-Age Z-scores (WAZ). Mean values ± SEM are plotted and referenced to national average anthropometric values for children surveyed between the ages of 6 and 24 months during the 2011 Bangladeshi Demographic Health Survey (BDHS)34.

Persistent reduction of diversity in the gut microbiota of children with SAM

Age-adjusted Shannon Diversity Index for fecal microbiota samples collected from healthy children (n=50), and from children with SAM at various phases of the clinical trial (mean values ± SEM are plotted). The significance of differences between SDI at various stages of the clinical trial is indicated relative to healthy controls (above the bars) and versus the time of enrollment prior to treatment (below the bars). *, p<0.05; **, p<0.01, ***, p<0.001 (post-hoc Dunnett’s multiple comparison procedure of linear mixed models). Also see .
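For readers unfamiliar with the diversity metric used here, the following sketch computes an unadjusted Shannon Diversity Index from per-taxon read counts. It assumes the natural logarithm, which the legend does not specify, and it does not reproduce the age adjustment applied in the study; the function name is hypothetical.

```python
import math


def shannon_diversity(counts):
    """Shannon index H = -sum(p_i * ln(p_i)) over taxa with non-zero counts."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)


# Four equally abundant taxa give the maximum for four taxa, ln(4).
print(shannon_diversity([10, 10, 10, 10]))
# A sample dominated by one taxon has lower diversity.
print(shannon_diversity([97, 1, 1, 1]))
```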

Heatmap of bacterial taxa significantly altered during the acute phase of treatment and nutritional rehabilitation in the microbiota of children with SAM compared to similarly aged healthy children

Bacterial taxa (97% ID OTUs) significantly altered (FDR-corrected p-value <0.05) in children with SAM are shown (see for p-values and effect size for individual taxa). Three groups of bacterial taxa are shown: a, those enriched prior to the food intervention; b, those enriched during the follow-up phase compared to healthy controls; and c, those that are initially depleted but return to healthy levels. Members of the top 24 age-discriminatory taxa are highlighted in blue. Note that there were no children represented in the Khichuri-Halwa arm under the age of 12 months during the ‘Follow-up after 3 months’ period.
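The legend reports FDR-corrected p-values without naming the procedure. As an illustration only, the sketch below applies the Benjamini-Hochberg step-up adjustment, a common choice for this kind of per-taxon testing; whether the study used this exact procedure is an assumption, and the function name and example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four taxa.
raw = [0.01, 0.04, 0.03, 0.5]
for p, q in zip(raw, benjamini_hochberg(raw)):
    print(p, round(q, 4), "significant" if q < 0.05 else "not significant")
```

In this made-up example only the first taxon remains below the 0.05 threshold after adjustment, even though three raw p-values were below 0.05.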

Heatmap of bacterial taxa altered during long-term follow-up in the fecal microbiota of children with SAM compared to similarly aged healthy children

Bacterial taxa (97% ID OTUs) significantly altered (FDR-corrected p-value <0.05) in children with SAM are shown (see for p-values and effect size for individual taxa). a, Taxa depleted across all phases of SAM relative to healthy. b, Those depleted during the follow-up phase. Members of the top 24 age-discriminatory taxa are highlighted in blue. Note that there were no children represented in the Khichuri-Halwa arm under the age of 12 months during the ‘Follow-up after 3 months’ period.

Plots of microbiota and anthropometric parameters in nine children sampled before antibiotics (abx), after oral amoxicillin plus parenteral gentamicin/ampicillin, and at the end of the antibiotic and dietary interventions administered over the course of nutritional rehabilitation in the hospital. All comparisons were made relative to the pre-antibiotic sample using the non-parametric Wilcoxon matched-pairs rank test, where each child served as his/her own control. a-c, Microbiota parameters, plotted as mean values ± SEM, include relative microbiota maturity, Microbiota-for-age Z score (MAZ), and Shannon Diversity Index (SDI). Weight-for-Height Z scores (WHZ) are provided in panel d. e,f, The two predominant bacterial family-level taxa showing significant changes following antibiotic treatment. ns, not significant; **, p<0.01.

Relative microbiota maturity and MAZ scores correlate with Weight-for-Height Z-scores (WHZ) in children with MAM

a-c, WHZ scores are significantly positively correlated with relative microbiota maturity (panel a) and MAZ scores (panel b) in a cross-sectional analysis of 33 children at 18 months of age who were above and below the anthropometric threshold for MAM (Spearman rho = 0.62 and 0.63, respectively; ***, p < 0.001). In contrast, there is no significant correlation between WHZ scores and microbiota diversity (panel c). d-l, Relative abundances of age-discriminatory 97% ID OTUs that are inputs to the Random Forests model and that are significantly different in the fecal microbiota of children with MAM compared to age-matched 18-month-old healthy controls (p<0.05, Mann-Whitney U-test). Box plots represent the upper and lower quartiles (boxes), the median (middle horizontal line), and measurements that are beyond 1.5 times the interquartile range (whiskers), above or below the 75th/25th percentile, respectively (points) (Tukey’s method, PRISM software v6.0d). Taxa are presented in descending order of their importance to the Random Forests model. Also see .
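The rank correlation reported in panels a-c can be computed as sketched below. This is a generic implementation of Spearman's rho (Pearson correlation of average ranks, with ties sharing their mean rank), not the software used in the study; the function names and data values are hypothetical, and the significance test is omitted.

```python
def average_ranks(values):
    """1-based ranks; tied values receive the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of ranks start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical WHZ scores and relative microbiota maturity values (months).
whz = [-3.1, -2.4, -1.0, 0.2, -1.8]
maturity = [-5.0, -1.5, -2.0, 1.0, -3.5]
print(round(spearman_rho(whz, maturity), 2))
```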

Cross-sectional assessment of microbiota maturity at 18 months of age in Bangladeshi children with and without MAM plus extension of Bangladeshi-based model of microbiota maturity to Malawi

a,b, Children with MAM (WHZ scores lower than -2 SD; grey) have significantly lower relative microbiota maturity (panel a) and MAZ scores (panel b) than healthy individuals (blue). Mean values ± SEM are plotted. *, p<0.05 (Mann-Whitney U test). See for correlations of metrics of microbiota maturation with WHZ scores and box plots of age-discriminatory taxa whose relative abundances are significantly different in children with MAM relative to healthy reference controls. c, Microbiota age predictions resulting from application of the Bangladeshi 24-taxon model to 47 fecal samples (brown circles) obtained from concordant healthy Malawian twins and triplets are plotted versus the chronologic age of the Malawian donor (collection occurred in individuals ranging from 0.4 to 25.1 months old). The results show the Bangladeshi model generalizes to this population, which is also at high risk for malnutrition (each circle represents an individual fecal sample collected during the course of a previous study11). d, Spearman rho and significance of rank order correlations between the relative abundances of Bangladeshi age-discriminatory taxa and chronologic age of all healthy Bangladeshi children described in the present study and the concordant healthy Malawian twins and triplets. *, p<0.05.

Bacterial taxonomic biomarkers for defining gut microbiota maturation in healthy Bangladeshi children during the first two years of life

a, Twenty-four age-discriminatory bacterial taxa were identified by applying Random Forests regression of their relative abundances in fecal samples against chronologic age in 12 healthy children (n=272 fecal samples). 97% ID OTUs with their deepest level of confident taxonomic annotation (also see ) are shown, ranked in descending order of their importance to the accuracy of the model. Importance was determined based on the percentage increase in mean-squared error of microbiota age prediction when the relative abundance values of each taxon were randomly permuted (mean importance ± SD, n=100 replicates). The inset shows 10-fold cross-validation error as a function of the number of input 97% ID OTUs used to regress against chronologic age of hosts used in the training set, in order of variable importance (blue line). b, Microbiota age predictions in a birth cohort of healthy singletons used to train the 24 bacterial taxa model (brown, each circle represents an individual fecal sample). The trained model was subsequently applied to two sets of healthy children: 13 singletons set aside for model testing (green circles, n=276 fecal samples) and another birth cohort of 25 twins and triplets (blue circles, n=448 fecal samples). The curve is a smoothed spline fit between microbiota age and chronologic age in the validation sets (right two panels of b), accounting for the observed sigmoidal relationship (see Methods). c, Heatmap of mean relative abundances of the 24 age-predictive bacterial taxa shown in panel a plotted against the chronologic age of healthy singletons used to train the Random Forests model, and correspondingly in the healthy singletons and twins/triplets used to validate the model (hierarchical clustering performed using the Spearman rank correlation distance metric).
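The importance measure described in panel a (percentage increase in mean-squared error when one taxon's values are randomly permuted) can be sketched independently of any particular model. The example below is not the Random Forests implementation used in the study: it accepts any prediction function, uses a hypothetical stand-in predictor, and evaluates on the supplied samples rather than on out-of-bag samples. All names and data are invented for illustration.

```python
import random


def mean_squared_error(predict, rows, targets):
    return sum((predict(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(predict, rows, targets, feature, n_replicates=100, seed=0):
    """Mean percentage increase in MSE after shuffling one feature column.
    Assumes the baseline MSE is non-zero."""
    rng = random.Random(seed)
    baseline = mean_squared_error(predict, rows, targets)
    increases = []
    for _ in range(n_replicates):
        column = [r[feature] for r in rows]
        rng.shuffle(column)  # break the link between this feature and the target
        permuted_rows = [r[:feature] + [v] + r[feature + 1:] for r, v in zip(rows, column)]
        permuted = mean_squared_error(predict, permuted_rows, targets)
        increases.append(100.0 * (permuted - baseline) / baseline)
    return sum(increases) / len(increases)


# Hypothetical data: each row holds the relative abundances of two taxa;
# the target is chronologic age in months.
rows = [[float(i), float((i * 7) % 5)] for i in range(1, 11)]
ages = [2.0 * r[0] + (0.5 if i % 2 else -0.5) for i, r in enumerate(rows)]


def stand_in_model(row):
    """Stand-in for a trained regressor; it only uses the first taxon."""
    return 2.0 * row[0]


print(permutation_importance(stand_in_model, rows, ages, feature=0))
print(permutation_importance(stand_in_model, rows, ages, feature=1))
```

A taxon that the model ignores yields an importance of zero, while permuting an informative taxon inflates the prediction error, which is the basis for the ranking shown in panel a.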

a, Design of the randomized interventional trial. b, Microbiota maturity defined during various phases of treatment and follow-up in children with SAM. Relative microbiota maturity in the upper portion of the panel is based on the difference between calculated microbiota age (Random Forests-derived taxonomic biomarker model) and values calculated in healthy children of similar chronologic age, as interpolated over the first two years of life using a spline curve. In the lower portion of the panel, maturity is expressed as a microbiota-for-age Z score (MAZ). Mean values ± SEM are plotted. The significance of differences between microbiota indices at various stages of the clinical trial is indicated relative to healthy controls (arrows above the bars) and versus samples collected at enrollment for each intervention group (arrows below the bars) (post-hoc Dunnett’s multiple comparison procedure of linear mixed models; *, p<0.05; **, p<0.01, ***, p<0.001). Healthy children not used to train the Random Forests model served as healthy controls (n=38). c-f, Plot of microbiota age for each child with SAM at enrollment, at the conclusion of the food intervention phase, and within and beyond 3 months of follow-up. The curve shown in each panel was fit using predictions in healthy children: this curve is the same as that replicated across each plot in .