Abstract

Objective. Despite its relative frequency among autosomal recessive diseases and the availability of the sweat test, cystic fibrosis (CF) has been difficult to diagnose in early childhood, and delays can lead to severe malnutrition, lung disease, or even death. The Wisconsin CF Neonatal Screening Project was designed as a randomized clinical trial to assess the benefits and risks of early diagnosis through screening. In addition, the incidence of CF was determined, and the validity of our randomization method assessed by comparing 16 demographic variables.

Methodology. Immunoreactive trypsinogen analysis was applied to dried newborn blood specimens for recognition of CF risk from 1985 to 1991 and was coupled to DNA-based detection of the ΔF508 mutation from 1991 to 1994. Randomization of 650 341 newborns occurred when their blood specimens reached the Wisconsin screening laboratory. This created 2 groups—an early diagnosis, screened cohort and a standard diagnosis or control group. To avoid selection bias, we devised a unique unblinding method with a surveillance program to completely identify the control subjects. Because sequential analysis of nutritional outcome measures revealed significantly better growth in screened patients during 1996, we accelerated the unblinding and completely identified the control group by April 1998. Having each member of this cohort enrolled and evaluated for at least 1 year and having completed a comprehensive surveillance program, we performed another statistical analysis of anthropometric evaluated indices that includes all CF patients without meconium ileus.

Results. The incidence of classical CF, ie, patients diagnosed in this trial with a sweat chloride of 60 mEq/L or greater, was 1:4189. By incorporating other CF patients born during the randomization period, including 2 autopsy diagnosed patients and 8 probable patients, we calculate a maximum incidence of 1:3938 (95% confidence interval: 3402–4611). Although there were group differences in the proportion of patients with ΔF508 genotypes and with pancreatic insufficiency, validity of the randomization plan was demonstrated by analyzing 16 demographic variables and finding no significant difference after adjustment for multiple comparisons. Focusing on patients without meconium ileus, we found a marked difference in the mean ± standard deviation age of diagnosis for screened patients (13 ± 37 weeks), compared with the standard diagnosis group (100 ± 117 weeks). Anthropometric indices of nutritional status were significantly higher at diagnosis in the screened group, including length/height, weight, and head circumference. During 13 years of study, despite similar nutritional therapy and the inherently better pancreatic status of the control group, analysis of nutritional outcomes revealed significantly greater growth associated with early diagnosis. Most impressively, the screened group had a much lower proportion of patients with weight and height data below the 10th percentile throughout childhood.

Conclusions. Although the screened group had a higher proportion of patients with pancreatic insufficiency, their growth indices were significantly better than those of the control group during the 13-year follow-up evaluation and, therefore, this randomized clinical trial of early CF diagnosis must be interpreted as unequivocally positive. Our conclusions did not change when the height and weight data before 4 years of age for the controls detected by unblinding were included in the analysis. Also, comparison of growth outcomes after 4 years of age in all subjects showed persistence of the significant differences. Therefore, selection bias has been eliminated as a potential explanation. In addition, the results show that severe malnutrition persists after delayed diagnosis of CF and that catch-up may not be possible. We conclude that early diagnosis of CF through neonatal screening combined with aggressive nutritional therapy can result in significantly enhanced long-term nutritional status.

Worldwide experience has shown that cystic fibrosis (CF) is difficult to diagnose, despite its relatively high prevalence among life-threatening autosomal recessive disorders, its propensity to cause severe disease, and the availability for 5 decades of a reliable diagnostic procedure, ie, the sweat test.1 In the United States, the mean age at the time of diagnosis was 4.8 years during 1996 and has ranged from 2.6 to 4.8 years from 1986 onward, according to data from the Cystic Fibrosis Foundation Registry.2,3 Similar delays have been observed in European countries when the standard diagnostic strategy is used, ie, performance of a sweat test after recognition of signs/symptoms or a positive family history.4,5 The delay in diagnosis is associated with occurrence of severe malnutrition,6 progressive lung disease,7 and other abnormalities.8–10 It is alarming that nearly one half of newly diagnosed patients in the United States have severe malnutrition (weight or height less than the 5th percentile) at the time of diagnosis,6 particularly because poor nutrition has been associated with a worse prognosis.11 The potential to achieve early diagnosis through neonatal screening has existed since Crossley et al12 reported that blood immunoreactive trypsinogen (IRT) levels are high in newborn infants with CF. It has been shown convincingly that the IRT test alone,13 when used with a recall specimen14 or when coupled with DNA testing,15,16 can be used to diagnose most CF patients in the neonatal period. The application of molecular genetics testing has been particularly attractive, whether used to detect the most common mutant allele (ΔF508) or for CF transmembrane regulator (CFTR) multimutation analysis.16 Widespread application of this technology in regions of the world where CF is relatively common has not occurred for a variety of reasons (the only exception being Australia17). The most commonly cited explanation for reticence to implement such testing has been the need for convincing evidence of benefits attributable to early diagnosis. The US Cystic Fibrosis Foundation promoted this attitude by urging caution and more research in 1983 through an influential position paper.18

We began a randomized clinical trial of early diagnosis of CF through neonatal screening in 1985. Assessment of benefits has focused on nutritional outcome measures and evaluation of pulmonary disease. Accrual of screened and control patients occurred between April 1985 and 1998 using neonatal screening for early diagnosis coupled with an unblinding/surveillance system with unique characteristics. Before the complete recognition of patients in the standard diagnosis (control) group, however, sequential statistical analyses revealed significantly better growth associated with early diagnosis. A preliminary report of our nutritional evaluations was published with the caution that complete patient accrual had not occurred.19 An imbalance in the demographic characteristics of the 2 groups was also reported; specifically, the control group had more subjects with pancreatic sufficiency and relatively fewer patients with ΔF508 genotypes.19 When these data were reported, criticisms ensued because of concern about potential selection bias.20 Demonstration of potential nutritional benefits led us to perform accelerated unblinding to complete accrual of the control group patients and to eliminate any possible selection bias. We now report assessment of the randomization method validity, a description of the overall CF incidence with as great a precision as possible, and a comparison of nutritional status in the 2 groups. This constitutes the definitive comparison of growth of the 2 cohorts generated through randomization in the Wisconsin CF Neonatal Screening Study.

METHODS

This project has been conducted in 3 phases, namely: 1) comprehensive planning (while grant support was being secured); 2) patient accrual (with concurrent evaluation and treatment); and 3) longitudinal evaluation (concentrating on nutritional and pulmonary outcome measures). Our study design is illustrated in Fig 1. The investigation has been performed as a collaborative project involving Wisconsin's 2 Cystic Fibrosis Centers (in Madison and Milwaukee) and the State Laboratory of Hygiene's centralized newborn screening program that is responsible for testing the entire newborn population of ∼70 000 annual births (Table 1). This project was approved by the human subjects committees of the University of Wisconsin and the Medical College of Wisconsin, as well as the Research and Publications Committee and Human Rights Board at the Children's Hospital of Wisconsin. In addition, several community hospitals in the State elected to submit our protocol to their institutional review boards, and all of them approved the project.

Fig 1. Experimental design used to perform a randomized clinical trial for assessment of the benefits, risks, and costs of CF neonatal screening. The laboratory method of early detection changed in July 1991 from a single IRT test to a combination of IRT testing and DNA analysis. Both methods have 99.9% specificity, but the sensitivity and positive predictive value are better with the 2-tiered screening strategy.16

This project was designed as a randomized clinical trial during 1983–1984 to address the following hypothesis: early diagnosis of CF will be medically beneficial without major risk. We planned from the outset to use trypsinogen analysis on dried newborn blood specimens to achieve early diagnosis of one half the birth cohort and then to follow longitudinally 2 concurrent groups of patients who received the same evaluation and treatment. Although we recognized that the IRT test was not 100% sensitive, various observations suggested at least a 90% detection rate, even in patients with pancreatic sufficiency.13 As illustrated in Fig 1, the IRT test with a cutoff level of 180 ng/mL was used when randomized screening began on April 15, 1985 after a 6-month laboratory startup phase.21 After discovery of the major mutation (ΔF508) in the CFTR and our determination that a 2-tiered screening test would be advantageous,22 we implemented an IRT/DNA (ΔF508) method to facilitate early diagnosis after June 1991. Using a lower cutoff point for IRT (namely 110 ng/mL) improved sensitivity, allowed more rapid diagnosis (with one half the patients being homozygous for the ΔF508 mutant allele), and did not increase costs.2,16 Whenever the IRT or DNA test was positive, we contacted the infant's primary care physician who, in turn, communicated the result to parents and recommended a diagnostic sweat test by 4 to 6 weeks of age at either the Madison or Milwaukee CF Center. During the planning phase, it was concluded that an unambiguous case definition of CF should be predetermined and adhered to throughout the patient accrual phase. Because we anticipated that many infants would not show signs or symptoms of the disease or a positive family history when their sweat test was performed, our protocol specified that the diagnosis of CF required a sweat chloride value of ≥60 mEq/L. This enabled us to identify a cohort of CF subjects with the classical form of the disease, but any child with a borderline sweat test (40–60 mEq/L23) was also followed in our protocol in a category referred to as the “other CF group.”

It was concluded during the planning phase that every effort would be made to avoid selection bias in the standard diagnosis group. This cohort was generated from children randomized to the control arm and identified by either a positive family history or signs/symptoms of CF (eg, meconium ileus [MI], malnutrition, unexplained hypoelectrolytemia, or characteristic respiratory disease). A unique feature was incorporated into the study design, namely to perform the IRT or IRT/DNA test on the control group infants' dried blood specimens, but to do so blindly and to computer-store the results until 1 of 3 developments occurred, namely: 1) parents requested the results; 2) CF was diagnosed in the standard manner; or 3) the child reached 4 years of age (the average age of diagnosis in Wisconsin and in the United States at the time randomization began). The unblinding feature at 4 years of age was accompanied by a systematic surveillance program in which all primary care physicians were contacted periodically to learn whether there were any unreferred CF patients; in addition, we surveyed death certificates twice during the project to determine that no unknown patient had died of CF during the project.

The duration of the projected patient accrual phase was determined as part of our planning process to be a period sufficient to generate at least 45 CF patients in the screened group who did not have MI. This conclusion was based on statistical calculations indicating that significant nutritional and pulmonary outcome differences could be detected with 80% power at an α of .05 with 90 patients. Proceeding under the prevailing assumptions of the time that CF occurred in 1:2000 live births with 10% of subjects having MI,24 we originally anticipated that randomized screening would last 3 to 4 years. Two unexpected developments, however, extended patient accrual in the screened group to nearly 9 years: 1) our discovery that CF is only approximately one half as prevalent as suspected in the United States25; and 2) determining that MI occurs with a frequency of 20% to 25%.26
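The accrual arithmetic described above can be sketched as follows. This is a rough illustration, not the authors' planning calculation: the function name is hypothetical, the 1:4000 incidence and 22.5% MI frequency are assumed stand-ins for "approximately one half as prevalent" and "20% to 25%," and consent refusals and attrition are ignored.

```python
# Rough accrual arithmetic under stated assumptions (not the authors' calculation).

def years_to_accrue(target, births_per_year, incidence, mi_fraction,
                    screened_fraction=0.5):
    """Years of randomized screening needed to identify `target` CF patients
    without MI in the screened arm, assuming a constant birth rate."""
    cases_per_year = (births_per_year * screened_fraction
                      * incidence * (1.0 - mi_fraction))
    return target / cases_per_year

# Planning assumptions from the text: ~70 000 births/year, 1:2000, 10% MI.
planned = years_to_accrue(45, 70_000, 1 / 2000, 0.10)

# Assumed revised figures: incidence about one half (1:4000), MI at 22.5%.
revised = years_to_accrue(45, 70_000, 1 / 4000, 0.225)

print(f"planned: {planned:.1f} years, revised: {revised:.1f} years")
```

Under these assumptions the required accrual period more than doubles, which is in line with the direction of the extension reported above; the actual 9-year duration also reflects factors this sketch leaves out.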

After continuous patient accrual for 9 years, randomization was discontinued in July 1994 when over 50 CF patients without MI were enrolled in the screened group and a trend toward more favorable nutritional outcome was detected by sequential statistical analyses. After enrollment of 40 control subjects in October of 1996, a comprehensive statistical analysis was performed revealing significantly better growth in the screened group. Although we recognized at that point that the control group was not completely identified through unblinding (in fact, 89% of subjects in retrospect were diagnosed), a comprehensive statistical analysis was performed and the apparent nutritional benefits from that preliminary analysis were reported.19 Our Policy and Data Monitoring Board recommended in March 1997 that the unblinding process be accelerated to offer potential nutritional benefits to the remaining CF patients in the standard diagnosis group. Then, after approval by the institutional review boards, we doubled the rate of unblinding and no longer required that children be 4 years of age before contacts with parents were made and before sweat tests were performed.

Evaluation and Treatment Protocol to Assess Benefits

After diagnosis of CF was established through pilocarpine iontophoresis sweat testing, informed consent was requested from parents, and children were entered into a protocol that specified evaluation methods, their sequence, and standardized therapeutic interventions. At diagnosis, 93% of children's parents gave informed consent for enrollment, including 97% of those eligible in the screened group and 89% of the controls. At the end of the patient accrual phase, 76% and 87% of enrolled subjects, respectively, had been maintained in the screened and control groups being followed longitudinally in the project; 87 total patients (43 screened and 44 controls) adhered to the protocol for follow-up evaluations and treatment when unblinding was completed in April 1998. The Evaluation and Treatment Protocol was developed by physicians and nurses of the 2 CF centers to systematically apply the same patient management methods in Madison and Milwaukee. Throughout all phases of the project, staff members from the 2 centers have met on a regular basis to discuss the Evaluation and Treatment Protocol, compare practices, and review any potential changes that might be needed with regard to treatment. All caregivers and investigators other than the biostatisticians were blinded to patient group identity after the first visit.

Clinic visits occurred routinely at the Madison and Milwaukee CF Centers every 6 weeks during the first year of life and every 3 months thereafter. Compliance with scheduled visits was monitored and as of April 1999, we found that of the total scheduled protocol visits 92% were attended by each group. This is especially impressive when the winter weather conditions (∼5 months annually in Wisconsin) are taken into account. A variety of regular, generally interpersonal, methods were used to ensure maintenance in the project and compliance. These included a consistently compassionate, caring attitude during clinic visits; reimbursement of travel expenses; periodic telephone contact and mailings; progress reports on the project; birthday cards; CF family day gatherings; and other social events.

Assessment of benefits has concentrated on nutritional and pulmonary outcome measures and has excluded patients with MI for 3 reasons: 1) they are readily diagnosed without screening tests (in fact, they are not encompassed by the standard definition27 of screening—“application of a test to those who are asymptomatic for the purpose of classifying them with respect for having a disease”); 2) obviously there would be no difference in the age of diagnosis of patients with immediate neonatal intestinal obstruction in the 2 groups; and 3) a variety of observations suggested during the planning phase that prognosis of CF patients with MI was heavily influenced by initial treatment requirements to relieve intestinal obstruction and that both morbidity and mortality were worse than uncomplicated CF.28 During the course of this project, however, CF patients with MI have been categorized separately and studied as distinct cohorts along with the “other CF group.”21,26,29

During the planning phase, it was determined that the start point for the nutritional and pulmonary assessments would be the date of a positive sweat test at one of the participating CF Centers. Nutritional assessments include biochemical measures and anthropometric indices of growth. At each visit, a research nurse or dietician measures the child's length, height, and weight. In addition, during the first 2 years of life, head circumference measurements were taken as part of routine pediatric care to determine the occipito-frontal diameter. Detailed information on methods used to assess growth and the biochemical techniques is provided elsewhere.19,30 Pulmonary outcome measures will be the subject of another report. Determination of pancreatic functional status depended on either fat absorption studies with 3-day fecal collections whenever possible (67% of subjects) or a new method with proven validity19 that relies on measurement of fat soluble vitamins and trypsinogen level changes in blood (33% of patients). Prospective 3-day food intake records were generated by parents (who were trained to do this accurately) every 6 months, as well as during the fat absorption study.

Therapeutic management of CF in this project was governed by our Evaluation and Treatment Protocol. This specified systematic nutritional interventions tailored to the needs of individual patients. After diagnosis, assessments were made of pancreatic functional status using measurements of plasma fat-soluble vitamin levels (E and A), measurement of essential fatty acids, and 3-day fat absorption studies. Nutritional management included supplementation with pancreatic enzymes in microsphere form, fat-soluble vitamins, and sources of linoleic acid.30,31 Relatively high intakes of energy and fat were recommended, ie, 120% to 150% of the age-appropriate level as specified in the 1989 edition of Recommended Dietary Allowances.32 For a variety of reasons,33,34 we recommended predigested formula for infants, unless the mother wished to breastfeed; when this occurred, we monitored serum albumin and electrolytes closely and supplemented infants as needed. Although the composition of the predigested formula changed during this study as described elsewhere,34 no other deviations from the original plan occurred with regard to nutritional management. Plasma levels of albumin, fat-soluble vitamins A and E, and the essential fatty acids were measured at diagnosis and then every 6 months (or more frequently if any abnormalities were detected). Supplements were given as needed using doses described elsewhere.31 Pulmonary disease management was systematic and based on the needs of individual patients. Our strategy relied on standard approaches35 with emphasis on oral antibiotics when mild respiratory infections occurred, chest physiotherapy, and hospitalization for intravenous antibiotics if appropriate. When new treatments became available, the research team met and determined whether the treatment protocol should be modified; this led to a decision that alternate day corticosteroid therapy would not be used in either CF center for enrolled patients.

Statistical Analyses

The 158 patients with sweat chloride values above 60 mEq/L assigned either to the screened or control groups in the Wisconsin Neonatal Screening Study were evaluated to determine whether the randomization to group was appropriately balanced in terms of total patients as well as the 16 other variables listed in Table 2. The P value for the simple binomial test for balance was used to compare the numbers of patients in each group. To compare the 16 remaining variables, we computed the proportions (for categorical variables) or means (for continuous variables) along with percent (for categorical variables) or standard deviations (SDs) for continuous variables. A Poisson assumption was used to compute the SD for births per year. The categorical variables were analyzed with χ2 tests. Birth date (used to assess births per year) was treated as a failure time and analyzed with a log-rank test. Birth weight was treated as a continuous variable and analyzed with a z test (allowing for unequal variances). Next, 2000 Monte Carlo permutations of group assignment were determined and the minimum P value of the 16 tests was computed for each permutation to determine the appropriate P value after adjusting for the multiplicity of tests. Confidence intervals (CIs), at the 95% level, for the incidence of CF were computed by using the square-root transformation for the Poisson model.36 For standard comparisons, statistical methods included the Wilcoxon rank-sum test for continuous variables and Fisher's exact test for categorical variables (all such tests were 2-sided).
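The multiplicity adjustment described above (permute group assignment, record the minimum P value across all tests, and compare each observed P value with that distribution) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the data are randomly generated toy values, every variable is treated as binary, a two-proportion test (equivalent to an uncorrected χ2 test on a 2 × 2 table) stands in for the mix of tests named in the text, and all function names are hypothetical.

```python
import math
import random

def two_proportion_p(x, groups):
    """Two-sided P value comparing the proportion of 1s in x between groups
    (equivalent to an uncorrected chi-square test on a 2x2 table)."""
    n1 = sum(groups)
    n0 = len(groups) - n1
    a1 = sum(xi for xi, g in zip(x, groups) if g == 1)
    a0 = sum(x) - a1
    pooled = (a1 + a0) / (n1 + n0)
    var = pooled * (1 - pooled) * (1 / n1 + 1 / n0)
    if var == 0:
        return 1.0
    z = (a1 / n1 - a0 / n0) / math.sqrt(var)
    return math.erfc(abs(z) / math.sqrt(2))

def min_p_adjust(columns, groups, n_perm=2000, seed=0):
    """Return (observed, adjusted) P values; each adjusted value is the share
    of permutations whose minimum P value is at most the observed one."""
    rng = random.Random(seed)
    observed = [two_proportion_p(col, groups) for col in columns]
    labels = list(groups)
    min_ps = []
    for _ in range(n_perm):
        rng.shuffle(labels)  # random reassignment, group sizes preserved
        min_ps.append(min(two_proportion_p(col, labels) for col in columns))
    adjusted = [sum(m <= p for m in min_ps) / n_perm for p in observed]
    return observed, adjusted

# Toy data (assumption): 158 patients split evenly, 16 unrelated binary variables.
rng = random.Random(1)
groups = [1] * 79 + [0] * 79
columns = [[int(rng.random() < 0.5) for _ in groups] for _ in range(16)]
observed, adjusted = min_p_adjust(columns, groups)
for p, q in sorted(zip(observed, adjusted))[:3]:
    print(f"raw P = {p:.3f}, adjusted P = {q:.3f}")
```

With 16 tests, a raw P value near .03 will often arise by chance for at least one variable, so the adjusted value for the smallest raw P is typically much larger than the raw value itself.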

Table 2. Demographic, Clinical, and Study Characteristics of the Control Group Compared With the Early Diagnosis Group Among Randomized Patients

We also performed a repeated-measures analysis using generalized estimating equation (GEE) methods37 with a working assumption of independence among observations to assess the difference in anthropometric indices between the early diagnosis group and the control group. The analyses were adjusted for age, sex, center, genotype, pancreatic status, and age at diagnosis. Interaction terms for sex and other covariates were also included in the regression models to determine whether the differences between the groupings were caused by sex. For some of the regression models, one or more of the interaction terms needed to be dropped to avoid problems with collinearity.38 The identity link was used for continuous outcomes, whereas the logit link with binomial variance function was used for dichotomous outcomes. Three basic modeling strategies were used: first, all study data were included in the regression models; second, the data were restricted to measurements taken after 4 years of age; and third, a “what if analysis” was performed, which involved supplementing the study with data obtained before diagnosis for control patients as a result of the unblinding process. These last 2 strategies were performed to eliminate any potential bias.

Surveillance

Successful investigation of the benefits and risks of neonatal screening for CF required us to establish an active surveillance system in support of our randomized clinical trial. Because this study is the first controlled, prospective assessment of the benefits and risks of newborn screening, new methods needed to be established. These included the following: 1) surveys of primary care physicians on 3 occasions during the 9-year randomization; 2) surveys of hospitals through 6 mailings; 3) examination of birth certificates on several occasions to identify children and their parents; 4) examination of death certificates through computerized searches with the help of the Wisconsin Center for Vital Statistics to determine whether any CF patients died during this study without being recognized; and 5) systematic tracking mechanisms to locate families in the unblinding phase using a sequence of methods beginning with the child's primary care physician, using demographic information to locate parents, using employment records, telephone directory searching services, insurance company data, social service agencies, and the Wisconsin Department of Transportation. Often, mailings were conducted by registered letters with compulsive follow-ups included.

RESULTS

A total of 650 341 newborns were randomized between April 15, 1985 and June 30, 1994, when their dried blood specimen reached the State Laboratory of Hygiene for trypsinogen determination. This accounts for exactly 99% of Wisconsin live births during this period (see Table 1). The IRT test alone was used for 220 862 infants, and IRT/DNA testing was used for 104 308 infants in the screened group. There were 325 171 infants assigned to the early diagnosis group and one fewer in the control group. Only 195 (.03%) of parents requested screening test results, and 91 of these were in the control group; however, no CF patient was identified among those 195 infants. Table 1 provides data on the incidence of CF and Table 3 summarizes the numbers of patients identified in each group, their age of diagnosis, and their anthropometric indices of nutritional status at the time of positive sweat test. The average age of diagnosis was significantly lower in the screened group without MI, as expected, although there were 5 false-negative CF patients in that cohort with the following ages at diagnosis: 6.9, 19, 21, 124, and 281 weeks. In all, 157 patients with classical, unequivocal CF were identified through randomization, and there was one more child with an abnormal sweat test and probable CF (see Table 1). Accelerated unblinding/surveillance resulted in addition of 2 more subjects diagnosed with signs or symptoms of CF and 6 more detected by unblinding added to the 9 already identified through our unblinding and surveillance system. When patient accrual was completed in April 1998, the average age of diagnosis of the subgroup identified through accelerated unblinding was 4.1 years, compared with 5.2 years in the 9 others. A decision was reached at that time to collect growth data on all subjects identified through unblinding before their diagnoses and to follow each subject for another year and to obtain more anthropometric indices for a final, definitive statistical analysis after April 1999.

Table 3. Demographic, Nutritional, and Clinical Characteristics at the Time of Diagnosis of CF in Patients Without MI

The screened and the standard diagnosis (control) cohorts were evaluated to determine whether the randomization to group was appropriately balanced and to determine whether the assignment method based on terminal digit discrimination was satisfactory. A summary of differences between early diagnosis and control patients, in terms of total patients as well as 16 other variables, is provided in Table 2. The P value for the simple binomial test for balanced numbers of patients is given and indicates that the total number of patients in each group is balanced. The remaining variables were then evaluated to determine whether any significant imbalances in the randomization remained. The results of our statistical tests of individual variables are given in the first P value column of Table 2. This analysis indicated that only pancreatic status (P = .012) and genotype (P = .031) were significantly different between the 2 groups. Specifically, the control group was found to have more patients with pancreatic sufficiency and genotypes with CFTR alleles other than ΔF508; the other mutant alleles are described in Table 3. Next, 2000 Monte Carlo permutations of group assignment were determined and the minimum P value of the 16 tests was computed for each. After adjusting for the multiplicity of tests, however, the resulting adjusted P values given in the far right column of Table 2 show that none of the imbalances for any of the variables remains statistically significant. Therefore, although the screened group has relatively more patients with pancreatic insufficiency and ΔF508 genotypes, this difference cannot be attributed to failure of the randomization method but seems to be a chance occurrence. Nevertheless, these differences are relevant to group comparisons of nutritional and pulmonary outcomes because the 2 factors more prevalent in the screened group are associated with greater morbidity and a worse prognosis28,39; in other words, the control group is intrinsically better off than is the screened cohort.

Because of the combination of a thorough screening program and our compulsive surveillance methods, we were able to calculate as precise a CF incidence as has ever been determined for a large, fully defined newborn population. Table 1 provides a summary of calculated incidence figures and reveals that one patient with classical CF was detected for every 4189 live births (95% CI: 3603–4930). Adding in 2 infants with CF diagnosed at autopsy and 7 patients with typical clinical manifestations of CF but with a borderline sweat test (40–60 mEq/L) increases the incidence to 1:3962; 2 of these other group patients had 2 CFTR mutations and 2 others had 1 mutant allele. There were also 6 children with borderline sweat test results and 1 CFTR mutation, and 1 child with a mean sweat chloride of 64 mEq/L who has 1 CFTR mutation but no symptoms. Adding them in and recalculating with this total of 173 children reveals a maximum CF incidence of 1:3801 and a 95% CI of 3292 to 4438. The incidence of CF in Wisconsin newborns is not different from that of the United States.25
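The square-root transformation for a Poisson count, named in the Statistical Analyses section, can be illustrated as follows. This is a sketch under stated assumptions, not a reproduction of the authors' computation: the case count of 155 is back-calculated here from the reported 1:4189 and 650 341 births rather than taken from the text, and the function name is hypothetical. Because the variance of the square root of a Poisson count is approximately 1/4, the interval on the root scale is the root plus or minus z/2.

```python
import math

def incidence_ci(cases, births, z=1.959964):
    """Point estimate and approximate 95% CI for a '1 in N' incidence,
    using the square-root transformation of a Poisson count."""
    root = math.sqrt(cases)
    low_count = (root - z / 2) ** 2   # lower limit for the expected count
    high_count = (root + z / 2) ** 2  # upper limit for the expected count
    # More cases means a smaller N, so the count limits swap roles.
    return births / cases, births / high_count, births / low_count

# Assumption: about 155 classical CF cases among 650 341 randomized newborns.
point, n_low, n_high = incidence_ci(155, 650_341)
print(f"1:{point:.0f} (95% CI: {n_low:.0f} to {n_high:.0f})")
```

With these assumed inputs the interval comes out close to, though not identical with, the reported 3603 to 4930, which is what one would expect if the published figures were based on a slightly different count or denominator.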

Nutritional Status at Diagnosis

Table 3 summarizes data at diagnosis of CF patients without MI who were randomly assigned to either the screened or the control group and evaluated. They were well balanced with regard to gender and center, but the standard diagnosis group reflected their entire cohort in having fewer patients with pancreatic insufficiency and ΔF508 genotypes. As shown in Table 3 the average age of diagnosis was significantly different between the 2 groups of CF patients without MI. Eliminating the false-negative infants in the screened group yields median and mean ages of diagnosis of 6.71 and 7.24 weeks, respectively, compared with 26.14 and 97.87 in the control group. The median of the screened group reflects our plan as specified in the protocol. Biochemical assessment of nutritional status at diagnosis revealed some differences between the 2 cohorts, when all patients of each group were included in the analysis. The mean plasma level of vitamin A was lower in the screened group (Table 3); however, the proportion of screened patients with probable vitamin A deficiency (retinol <20 μg/dL) was not significantly different. Serum albumin levels were lower on average in screened patients but their younger age probably accounts for the difference. Anthropometric indices of nutritional status were significantly different at diagnosis for the 2 groups with screened subjects showing significantly higher values for height, weight, and head circumference compared with controls (Table 3). Values obtained in the control group were similar to data published on newly diagnosed CF patients born in the United States and registered with the National Cystic Fibrosis Foundation.40 In contrast, screened patients more closely resemble the entire population of children in the United States based on standards published by the National Center for Health Statistics.41

Growth Throughout Childhood

Observations on the growth of patients in the 2 groups from birth until 13 years of age are presented in the accompanying figures. These graphs also show the pattern of patient accrual during the first 13 years of this project. Growth evaluations included weight and height for age percentiles, z scores, and the proportion of patients with a weight or height for age below the 10th percentile (an indication of severe malnutrition). Significant differences by GEE analysis were found in most of the comparisons, but the discrepancy in height, ie, stunting, was particularly impressive. As shown in Fig 2, mean height for age z scores were consistently lower in the control group until an apparent convergence at 10 to 12 years of age (long after accrual/enrollment occurred along with initiation of aggressive nutritional therapy). Similarly, the proportion of patients below either the 5th percentile3 or the 10th percentile was much different between the 2 groups throughout the trial. When height or weight below the 10th percentile was used as an index of severe malnutrition (Fig 3), the outcome was significantly better in the screened group. The odds ratio for the risk of a weight below the 10th percentile in the control group, compared with the screened group, was 4.12 (95% CI: 1.64–10.38), and the corresponding odds ratio for height was 4.62 (95% CI: 1.70–12.61). Neonatal screening eliminated the risk of height below the 10th percentile in the screened group after 9 years of age, but many controls continued to show anthropometric evidence of severe malnutrition. The differences are greater than we observed in our preliminary analysis19 and are especially impressive in view of the intrinsically better prognosis expected in the control group, which had a higher percentage of patients with pancreatic sufficiency. Also, it should be emphasized again that all patients whose growth is presented in these figures received similar, standardized treatment after diagnosis.
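The odds ratios quoted above can be checked for internal consistency with a short calculation. The sketch below assumes the intervals are Wald-type, that is, symmetric on the log scale, which is common for model-based odds ratios but is an assumption here rather than something stated in the text. Under that assumption the point estimate should sit at the geometric mean of the interval bounds, and the standard error of the log odds ratio can be backed out from the interval width. The function name is hypothetical.

```python
import math


def log_scale_summary(lower: float, upper: float, z: float = 1.96):
    """Return (geometric_midpoint, se_log_or) implied by a CI.

    Assumption: the interval is exp(log(OR) +/- z * SE), i.e. symmetric
    on the log scale.
    """
    log_lower, log_upper = math.log(lower), math.log(upper)
    midpoint = math.exp((log_lower + log_upper) / 2)
    se_log_or = (log_upper - log_lower) / (2 * z)
    return midpoint, se_log_or


# Odds ratios and intervals as reported in the text: (OR, lower, upper).
reported = {"weight": (4.12, 1.64, 10.38), "height": (4.62, 1.70, 12.61)}
for name, (odds_ratio, lower, upper) in reported.items():
    midpoint, se = log_scale_summary(lower, upper)
    print(f"{name}: reported OR {odds_ratio}, implied midpoint {midpoint:.2f}, "
          f"SE(log OR) {se:.2f}")
```

For both outcomes the implied midpoint agrees with the reported odds ratio to within rounding, which is what one would expect if the intervals were constructed on the log scale.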

Fig 2. Growth of enrolled CF patients without MI who were diagnosed either through neonatal screening or by standard methods (control group). The numbers of subjects are shown for each year of observation of continuously enrolled subjects. The value indicated for each age represents the mean for subjects who had a weight or height measured within 2 months of their birthday (672 observations by year on 99 patients); standard errors are depicted by the vertical bars. Statistical analyses with the GEE model were performed using 2894 observations obtained from 104 patients (56 screened subjects and 48 controls). Repeated-measures analysis using the GEE method revealed a marginally significant difference (P = .06) in the weight for age z scores, while the difference in height was highly significant (P = .009). The analysis included adjustment for the following covariates: group, age, sex, center, genotype, pancreatic status, age at diagnosis, sex by age interaction, sex by genotype interaction, and sex by pancreatic status interaction. Adjusting for birth weight did not change the statistical results.

Fig 3. Evidence of severe malnutrition in the standard diagnosis (control) group as reflected by weight or height less than the 10th percentile for age.41 The numbers of subjects and observations are the same as described for Fig 2. Using GEE analysis, we found that the proportions of the 2 groups differed significantly for both weight (P = .003) and height (P = .003). Covariates adjusted for include group, age, sex, center, genotype, pancreatic status, age at diagnosis, and sex by age interaction.
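Because growth is reported both as z scores and as percentiles, it can help to see how the two scales relate. The following is a minimal sketch, assuming the reference distribution is exactly normal; published growth references are only approximately normal, so real percentile tables can differ slightly. The function names are hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def z_to_percentile(z: float) -> float:
    """Percentile (0-100) for a z score under a normal reference."""
    return 100 * STANDARD_NORMAL.cdf(z)


def percentile_to_z(percentile: float) -> float:
    """z score for a percentile strictly between 0 and 100."""
    return STANDARD_NORMAL.inv_cdf(percentile / 100)


# The 10th percentile cutoff used as the index of severe malnutrition.
print(f"10th percentile is about z = {percentile_to_z(10):.2f}")
# A mean height z score of -0.73, expressed as an approximate percentile.
print(f"z = -0.73 is about the {z_to_percentile(-0.73):.0f}th percentile")
```

Under this assumption the 10th percentile cutoff lies a little more than 1 SD below the reference mean, so a group mean z score can be well above the cutoff while a sizable fraction of individual patients still fall below it.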

Two other approaches were used to analyze the growth data and compare screened patients with controls. First, our biostatisticians performed what we refer to as a “what if analysis” to determine whether growth differences would have been significant had all the control patients been identified from early childhood. To do this, we used 87 more prediagnosis observations on the growth of the CF control patients detected after delayed diagnosis beyond 2 years or through unblinding (data obtained from primary care physicians and parents). Reanalysis with the GEE method showed persistence of the significant differences in both height and weight. For instance, the proportion of controls below the 10th percentile was significantly greater for weight (P = .007) and height (P = .004). Second, we compared anthropometric indices in the 2 groups beginning at 4 years of age, as suggested by Wald and Morris.20 Using the GEE method applied to data from 91 patients, we again found that the screened group grew significantly better. After 4 years of age, the proportion of patients with heights below the 10th percentile is clearly greater in the standard diagnosis group (Fig 3).

Figure 4 presents a summary of dietary intake of energy, protein, and fat by the 2 groups at each year of age through childhood. These data are based on 802 prospectively obtained, 3-day food intake records on 94 CF patients (averaging 1.6 records per patient per follow-up year) and probably represent the most complete assessment yet performed of nutrient intake by patients with CF from birth until adolescence. The results were analyzed according to pancreatic functional status for obvious reasons.1,31 Indeed, we found that the children with pancreatic sufficiency in each group at nearly every observation period showed less intake of energy, protein, and fat than did those with pancreatic insufficiency. The patients with pancreatic sufficiency consumed energy at approximately the recommended daily allowance (RDA) levels (99% on the average) and consumed 34% of calories as fat. Their average height z score was +.01 SD, which was significantly greater than that of the control patients with pancreatic insufficiency (−.73 SD; P = .014) but not greater than that of the screened group (−.09 SD; P = .74). The patients with pancreatic insufficiency had a higher caloric intake, averaging 118% of RDA levels. However, there was no statistical difference at any age between the screened and control patients who were pancreatic insufficient, as shown in Fig 4. Fat intake averaged 37% and 38% of dietary energy in the screened and control patients with pancreatic insufficiency, respectively, and showed very little variation after 1 year of age.

Fig 4. Dietary intake of 91 CF patients without MI who were enrolled and followed longitudinally in this trial. Caloric intake is expressed relative to the age-specific RDA dietary energy needs.32 Intake data were calculated from prospective 3-day food intake records kept by parents (usually mothers) of study patients. Note the lower intakes of patients with pancreatic sufficiency but their normal height for age z score data.
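The two intake measures used above, energy as a percentage of the RDA and fat as a percentage of dietary energy, reduce to simple ratios. The sketch below is illustrative only: it assumes the conventional factor of 9 kcal per gram of fat, and the function names and example values are hypothetical rather than taken from the food records.

```python
KCAL_PER_GRAM_FAT = 9.0  # conventional Atwater factor for fat


def percent_of_rda(energy_kcal: float, rda_kcal: float) -> float:
    """Energy intake as a percentage of the age-specific RDA."""
    if rda_kcal <= 0:
        raise ValueError("rda_kcal must be positive")
    return 100 * energy_kcal / rda_kcal


def percent_energy_from_fat(fat_grams: float, energy_kcal: float) -> float:
    """Share of total dietary energy supplied by fat, in percent."""
    if energy_kcal <= 0:
        raise ValueError("energy_kcal must be positive")
    return 100 * fat_grams * KCAL_PER_GRAM_FAT / energy_kcal


# Hypothetical daily average from one 3-day record, not trial data.
energy, fat, rda = 1770.0, 73.0, 1500.0
print(f"{percent_of_rda(energy, rda):.0f}% of RDA, "
      f"{percent_energy_from_fat(fat, energy):.0f}% of energy from fat")
```

In practice the 3 recorded days would be averaged first, and the RDA would be looked up for the child's age and sex before these ratios are taken.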

DISCUSSION

This report presents our definitive assessment of nutritional outcomes of the 2 groups of patients resulting from the Wisconsin randomized trial of early diagnosis of CF through neonatal screening. The results confirm and extend the preliminary results published19 before patient accrual was completed through the unblinding and surveillance strategies that we performed over a 14-year period throughout Wisconsin. The analysis presented herein is definitive because complete patient accrual and at least 1-year follow-up were needed for us to be satisfied with the validity of the study. The addition of 9 patients detected through our unblinding/surveillance method after March 1997 enabled us to generate comparable cohorts of screened and control patients. Randomization validity was important to establish because of our finding of group differences in 2 of 16 demographic variables: pancreatic functional status and genotype profile (both factors that could not be controlled by randomization). Because these 2 variables are known to alter prognosis in CF, we needed to assess the randomization method carefully, particularly in view of the fact that the standard diagnosis (control) group had relatively fewer patients with pancreatic insufficiency and ΔF508 genotypes. We conclude from the analysis presented in Table 2 that randomization was effective and that these differences can be attributed to chance.

The incidence of CF described herein, while as precise as possible, is certainly lower than was expected when we planned the study in 1983–1984. Nevertheless, we are convinced that in Wisconsin, 1 CF patient should be expected for every 4000 live births. This is the same as the incidence that we reported for the United States as a whole.25 Data from some Northern European countries, including results based on newborn screening studies, suggest a higher incidence, at least in Great Britain.5 Attempts in the past to adjust US incidence data for white births have often led to a higher presumed incidence. In our study, we did such an adjustment and our conclusions did not change (Table 1). We believe that in view of the likelihood of detecting CF patients among blacks, Hispanics, and Native Americans and the compelling reasons to avoid diagnostic bias, this practice should be abandoned in the United States. Indeed, the misconception that CF is rare in these minority populations probably accounts for underdiagnosis because physicians may be less inclined to obtain sweat tests in such instances.

After publication of our preliminary results on nutritional status,19 skepticism and criticism by Wald and Morris20 appeared in the literature because of their concern about selection bias. We considered that issue carefully during the planning phase of this project and had to overcome potential selection bias to obtain and maintain grant support from the National Institutes of Health. The blinded screening and subsequent unblinding/surveillance methods applied to the control group, which satisfied ethical scrutiny,29 represent a unique strategy for a randomized clinical trial and can be applied to other newborn screening tests. Therefore, the design of this project and success in both patient accrual and data collection clearly allowed us to avoid selection bias. Incorporating prediagnostic growth data from the controls detected through unblinding served to strengthen our conclusions about the nutritional benefits of CF screening. Also, analysis after 4 years of age when all patients were enrolled showed that the statistical differences persisted, despite similar therapy and dietary intake. The results presented in this report, therefore, provide very strong evidence of nutritional benefits associated with early diagnosis through neonatal screening. Similar evidence has been generated in the study performed in Australia.42 Their analysis also showed better pulmonary outcomes,42 as did that of investigators in The Netherlands.43 It should be mentioned that the severity of lung disease was relatively mild in the 2 groups that we studied at the time of diagnosis, based on Shwachman-Kulczycki44 chest radiograph scores (23 ± 2.7 and 22 ± 3.4 in the screened and control patients, respectively).

In our longitudinal investigation, we have used a variety of anthropometric indices to characterize nutritional status. These include height for age percentile, weight for age percentile, and a weight for height index called the percentage of ideal weight for height (%IWH), which is described in the CF Foundation Nutrition Consensus Report.31 The %IWH index is similar to body mass index in that they both measure the relative proportion of weight to stature. Analyses by GEE models indicated no significant difference in %IWH between the screened and control groups.
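As a rough illustration of the index, %IWH can be written as the measured weight divided by the ideal weight for the child's height, times 100. The sketch below assumes the ideal weight has already been looked up from a growth reference for the child's sex and stature; that lookup is not shown, and the function name and example values are hypothetical.

```python
def percent_ideal_weight_for_height(weight_kg: float,
                                    ideal_weight_kg: float) -> float:
    """%IWH = 100 * actual weight / ideal weight for the child's height.

    Assumption: ideal_weight_kg comes from an external growth reference.
    """
    if ideal_weight_kg <= 0:
        raise ValueError("ideal_weight_kg must be positive")
    return 100 * weight_kg / ideal_weight_kg


# Hypothetical child weighing 18 kg whose ideal weight for height is 20 kg.
print(f"%IWH = {percent_ideal_weight_for_height(18.0, 20.0):.0f}")
```

Because the index normalizes weight to stature, a child who is both short and light can have a near-normal %IWH, which may be why it did not separate the groups even though height and weight percentiles did.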

Although many factors are known to influence the nutritional status of CF patients, age of diagnosis has only recently been suspected to be a critical determinant. In this project, we isolated age of diagnosis as carefully as possible by randomized neonatal screening coupled with the routine use of a standardized evaluation and treatment protocol. We also devoted considerable attention to high rates of enrollment, maintenance in the project, and compliance with care. Because the therapeutic interventions in this trial were applied equally to the 2 groups, it is evident from our results that age of diagnosis is not only a critical factor in nutritional status but also has more long-term effects than have been expected previously. Indeed, our results suggest that catch-up growth may not occur in some patients who have experienced delayed diagnosis and who were severely malnourished when recognized as having CF. In addition, the associated decrease in head circumference that has been observed by others45 raises long-term concerns about the possibility of impaired brain growth in early childhood and altered cognitive function in later years. The greater severity of malnutrition and growth failure in the control group is even more impressive when one takes into account the fact that this group was relatively better off because of inclusion of more patients with pancreatic sufficiency (a subgroup that grows better with relatively fewer calories, as shown in Fig 4).

Although the efficacy of neonatal screening for CF continues to be debated and concerns have been expressed about cost, we believe as stated elsewhere46 that “the burden of proof is on those who argue against neonatal screening for CF.” Our rationale for this conclusion is as follows: unbiased, longitudinal assessment of nutritional outcomes in children with CF strongly indicates significantly better long-term growth in patients who experienced early diagnosis through screening, while no persistent risks have been detected, despite continuous scrutiny over 14 years in this study.3 In addition, there are supportive observational studies and accumulating data on pulmonary benefits.13,42,43 All of this adds compellingly to the intuitive expectation that early diagnosis of CF should be beneficial, especially because delays in recognition can be harmful.6,28 The apparently favorable benefit/risk relationship achieved through neonatal screening without greater costs than standard diagnostic methods2,46 makes IRT/DNA testing very attractive. This 2-tier screening strategy also has high sensitivity16 but must be incorporated into good follow-up and treatment programs to achieve the potential benefits of early diagnosis.

CONCLUSION

It should be emphasized that our positive randomized trial revealed both statistically and clinically significant differences. Indeed, nutritional advantages associated with early diagnosis and treatment are significant enough to eliminate stunting in CF patients who are diagnosed through neonatal screening. All of the criteria for “assessing the effectiveness of community screening programs”47 are met by CF neonatal screening. The favorable evidence accumulated thus far is as strong as that generated by scientific studies of any newborn screening test. Therefore, we suggest that more regions initiate such screening programs, as recommended by the Centers for Disease Control and Prevention.48

ACKNOWLEDGMENTS

This work was supported by the National Institutes of Health (Grants DK 34108 and M01 RR03186) and the Cystic Fibrosis Foundation (Grant A001-5-01).

We are deeply grateful to the many pediatric nurses and dieticians who have participated in the project and to the parents of enrolled children with CF. Their cooperation and patience have truly been the sine qua non of this investigation.

Preliminary results from this study were reported before complete accrual of patients in N Engl J Med. 1997;337:963–969; and a summary of this report was presented at the Annual Meeting of the Pediatric Academic Societies; May 4, 1999; San Francisco, CA.