Background: Organismal performance assays (OPAs) are a unique method of toxicity quantification used to assess the safety of potentially toxic compounds such as pharmaceuticals. OPAs utilize genetically diverse wild mice (Mus musculus) housed in large seminatural enclosures wherein exposed individuals compete directly with controls for resources. Previously, OPAs have been successful in detecting adverse effects in mice that were exposed to paroxetine. Here, we further test OPAs' utility in pharmaceutical safety assessment by testing OPAs with rofecoxib, a drug with known adverse effects on humans. Materials and Methods: We exposed mice to rofecoxib (~37.5 mg/kg/day) during gestation and into early adulthood. Exposure ceased when individuals were released into enclosures. Five independent populations were established and rofecoxib-exposed individuals (n = 58) competed directly with control individuals (n = 58) over 28 weeks. Organismal performance was determined by quantifying reproduction, survival, and male competitive ability. Results: In enclosures, rofecoxib-exposed males had equal reproduction, survival, and competitive ability. Rofecoxib-exposed females had equal survival compared to controls but experienced 40% higher reproductive output. Conclusions: The adverse health effects of rofecoxib seen in humans escaped detection by OPAs, just as they had during traditional preclinical assays. These results may be explained by the exposure design (in the enclosures, all animals were on the control diet), the relatively short duration of exposure, species differences, or because the health benefits of the drug negated the side effects. Similar to numerous assays used in preclinical trials, OPAs cannot reveal all maladies, despite their demonstrated sensitivity in detecting cryptic toxicity from numerous exposures.

Pharmaceutical development is associated with financial costs of ~$1.4 billion per compound, and 12-15 years of research per drug are required. [1] During human clinical trials, 73% of pharmaceuticals that passed preclinical safety assessments fail [2] and 10% of Food and Drug Administration (FDA)-approved pharmaceuticals are recalled after market release due to unforeseen toxicity. [3] In addition to the human cost in pain, suffering, and loss of life, pharmaceuticals that fail after FDA approval can also incur substantial litigation costs. One potential solution to increase the detection rate of harmful compounds during preclinical trials is to employ new methodologies. We have developed a unique toxicity assessment tool, known as the organismal performance assay (OPA), which has the potential to be valuable if implemented during preclinical trials.

Preclinical studies were not geared toward assessing the cardiac safety of rofecoxib. Rather, preclinical studies are designed to determine whether a pharmaceutical of interest causes mutagenicity, carcinogenicity, teratogenicity, or infertility. [4] Rofecoxib exposure was not found to be mutagenic in rodent cells, nor was it found to be carcinogenic, teratogenic, or to cause infertility in rodents. [4] Cardiac adversity was first suspected after market release during a phase IV clinical trial, the Vioxx® gastrointestinal outcome research (VIGOR) study. [5] These results were later confirmed by the adenomatous polyp prevention on Vioxx® (APPROVe) study, which ultimately led to the recall of rofecoxib. [6]

OPAs provide a unique combination of breadth and sensitivity, and provide unambiguous information on the effects of exposures on overall Darwinian fitness. OPAs utilize genetically diverse wild-derived mice (Mus musculus), and treatment individuals compete directly with controls for resources, such as food, mates, and nesting sites in seminatural enclosures. The use of wild-derived mice is important because they have cohabitated with humans since the beginning of agriculture and thus demonstrate normal behaviors in man-made structures, behaviors that many laboratory strains no longer possess. [7] Performance of individuals is measured on an ultimate level in terms of Darwinian fitness (i.e., lifelong reproduction and survival) and a major component of fitness (e.g., social dominance). Quantifying fitness is key in OPAs, as the ultimate function of an organism's physiology is the performance of complex behaviors that facilitate reproduction.

OPAs have been used to quantify the adverse effects of a broad array of treatments. Most recently, OPAs have demonstrated their sensitivity by quantifying the effects of the pharmaceutical paroxetine (Paxil®), an antidepressant belonging to the selective serotonin reuptake inhibitor drug class currently available on the market. [8] Additional OPA studies have detected adverse effects from cousin- and sibling-level inbreeding, harboring a selfish gene, and consuming added sugar at human-relevant levels. [8],[9],[10],[11],[12],[13] In all of these studies, OPAs found substantial deleterious effects that were missed by current methodologies. OPAs are capable of revealing mammalian toxicity with high sensitivity because when wild mice live under social conditions, even small declines in behavior or physiological performance are revealed in OPA endpoint measures. As OPAs challenge all physiological systems simultaneously, reduced physiological function will be detectable as the inability of treatment individuals to perform comparably to controls.

The focus of this study was to use a drug with known adverse effects to determine if OPAs would indicate a decrease in fitness when the mice were exposed at a dose regime near therapeutic levels. We selected rofecoxib (Vioxx®, Merck, Whitehouse Station, NJ, USA), a selective nonsteroidal anti-inflammatory drug (NSAID) that was prescribed to patients to relieve arthritic pain. The FDA approved rofecoxib in 1999, but it was recalled by Merck in 2004 after the drug was shown to increase the risk of cardiovascular events when patients took the drug for >18 months. [6] During its time on the market, ~107 million prescriptions of rofecoxib were dispensed [14] to 80 million patients, [15] generating a revenue of 5 billion USD in sales. [16] More than 27,000 cardiac events were associated with the use of rofecoxib, [17] and Merck has paid more than 6 billion USD in lawsuit settlements and legal fees. [18]

Here we used rofecoxib to further test OPAs as a broad screening tool for adverse effects by determining if rofecoxib exposure causes fitness declines in mice. If rofecoxib exposure adversely affects any physiological system, we predict that exposed individuals will suffer survival, dominance, and reproduction declines relative to controls. If OPAs are successful in detecting rofecoxib-induced adversity, these results will provide additional evidence that OPAs could potentially be a powerful tool if implemented during preclinical studies.

Materials And Methods

Animals

Genetically diverse, wild-derived house mice were used in this experiment. Wild mice possess natural behaviors needed to function within seminatural environments. [7] The wild mice used in this experiment were from the 12th generation of the colony described in a study by Meagher et al. [10] The relatedness of this colony was assessed in a subset of mice from the 11th generation and found to be comparable to wild populations. [19] Animals were provided food and water ad libitum and maintained on a 12:12 h light:dark cycle. The University of Utah Institutional Animal Care and Use Committee (IACUC) approved all procedures and protocols.

Drug exposure

Dosing was achieved by incorporating 12.5 g of rofecoxib (AK Scientific Inc., Union City, CA, USA; molecular formula: C17H14O4S) into 50 kg of rodent chow (TD.130006; Harlan Teklad, Madison, WI, USA). As wild mice eat approximately 3 g per day and weigh approximately 20 g, [20] individuals ingested ~0.75 mg of rofecoxib per day or ~37.5 mg/kg/day. Using a metabolic rate conversion factor, this is equivalent to a human dose of ~3.0 mg/kg/day, or a daily dose of ~182.4 mg, assuming the average human weighs 60 kg. [20] Rofecoxib was prescribed at doses 12.5-50 mg/day; [4] thus, animals in this experiment were exposed to a dose ~3.5-fold higher than the human therapeutic dose but within the range of doses in preclinical studies. [4]
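The dose arithmetic above can be retraced with a short sketch. The chow, intake, and body-mass figures come from the text; the mouse-to-human divisor of 12.3 is an assumption (a commonly used body-surface-area scaling factor), since the text does not state which conversion factor was applied. With that assumed factor, the result lands close to, though not exactly on, the reported ~182.4 mg/day.

```python
# Worked dose arithmetic using the figures stated in the text.
DRUG_G = 12.5             # g of rofecoxib mixed into the chow
CHOW_KG = 50.0            # kg of chow
INTAKE_G_PER_DAY = 3.0    # approximate daily food intake of a wild mouse
MOUSE_MASS_G = 20.0       # approximate body mass of a wild mouse
HUMAN_MASS_KG = 60.0      # human body mass assumed in the text

# ASSUMPTION: mouse-to-human divisor (body-surface-area scaling).
# The text only says "a metabolic rate conversion factor".
MOUSE_TO_HUMAN_DIVISOR = 12.3


def dose_summary():
    """Return the mouse dose and the human-equivalent dose."""
    conc_mg_per_g = (DRUG_G * 1000.0) / (CHOW_KG * 1000.0)  # mg drug per g chow
    mouse_mg_per_day = conc_mg_per_g * INTAKE_G_PER_DAY
    mouse_mg_per_kg_day = mouse_mg_per_day / (MOUSE_MASS_G / 1000.0)
    human_mg_per_kg_day = mouse_mg_per_kg_day / MOUSE_TO_HUMAN_DIVISOR
    human_mg_per_day = human_mg_per_kg_day * HUMAN_MASS_KG
    return {
        "mouse_mg_per_day": mouse_mg_per_day,
        "mouse_mg_per_kg_day": mouse_mg_per_kg_day,
        "human_mg_per_kg_day": human_mg_per_kg_day,
        "human_mg_per_day": human_mg_per_day,
    }


if __name__ == "__main__":
    for name, value in dose_summary().items():
        print(f"{name}: {value:.2f}")
```

Changing `MOUSE_TO_HUMAN_DIVISOR` shows how sensitive the human-equivalent figure is to the choice of scaling factor.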

Sixty breeding pairs were selected for this experiment and divided into two treatment groups, rofecoxib-exposed and control. All breeders were individually housed 8 days prior to pairing; females in the rofecoxib treatment started exposure at this time, while males in the rofecoxib treatment started exposure 5 days prior to pairing. Breeding pairs were kept together until a maximum of three litters were produced. All offspring were weaned at 28 days of age, standard for our wild-derived mice, and housed with same-sex siblings. Upon weaning, litter size, sex, and weight were recorded. Offspring were kept on their respective treatment until adulthood, and released into enclosures; this duration of rofecoxib exposure maximized the ability of OPAs to detect health consequences as, once released into the seminatural enclosures, all animals were fed the control treatment. Currently, there is no way to keep animals on their respective treatments while they are free-ranging during OPAs, and switching the rofecoxib-exposed animals to the control treatment was a more conservative approach of detecting fitness impacts rather than switching the control individuals to the rofecoxib diet.

OPA enclosures

Enclosures have previously been described in Ruff et al. [12] Briefly, there are 11 independent enclosures of ~30 m² each. Each enclosure is divided into six territories by hardwire mesh that is easily climbed but adds a component of spatial complexity. Additionally, each enclosure had four optimal territories and two suboptimal territories. Optimal territories consisted of a large storage bin with multiple dark nesting sites and direct access to food, whereas suboptimal territories consisted of light-exposed nesting sites [Figure 1]. All territories contained ad libitum access to food and water and were kept on a 12:12 h light:dark cycle.

Figure 1: An image of a seminatural enclosure used in OPA experiments. Each enclosure is ~30 m² and contains six territories that are divided by wire mesh that is easily climbed but adds spatial complexity. The four optimal territories have the large blue bins, which contain multiple dark nesting sites and are defendable. The two suboptimal territories consist of light-exposed nesting sites. Each territory contains food within the chimney-like structures and water (poultry waterers). PIT tag antennas are placed above each feeding site. Photograph courtesy Douglas Cornwall

Five independent OPA populations were established and four were maintained for 28 weeks. One population was terminated at 11 weeks due to 100% control male mortality. Populations consisted of 8-10 males and 12-18 females for a total of 116 animals (42 male, 74 female); these animals are referred to as founders. Half the mice of each sex were on the rofecoxib-exposed treatment, while the remaining half served as controls. The population structure allowed for direct competition between exposed and control individuals. Enclosure space and population size created a population density reported to be within the range observed in the wild. [21]

Upon release into enclosures, male mice were 16.31 ± 5.24 [mean (M) ± standard deviation (SD)] weeks old and females were 15.34 ± 5.31 weeks old. To allow males to establish territories and to prevent incidental matings, males from both treatments were released into the enclosures with nonexperimental females. One week later, nonexperimental females were removed and replaced with experimental females from both treatments. To eliminate any inbreeding, no populations contained opposite-sex individuals related at the cousin level or below. Three populations contained one set of brothers, four populations contained one or two sister pairs, four populations contained sister triplets, and one population contained sister quadruplets. When relatedness occurred, it was balanced between treatments. Populations in the wild typically contain related individuals.

Reproductive success

Founder reproductive success was determined by genetically analyzing offspring born in the enclosures. Offspring were first removed at week 8, then during 5-week intervals, referred to as pup sweeps. This time interval prevented offspring from reaching sexual maturity and from confounding the reproductive success data. During pup sweeps, offspring were removed and sacrificed, and a tissue sample was collected from each for genetic analyses. A total of 1,138 samples were collected with a mean of 227.60 ± 104.50 offspring per population.
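As a small illustration of the sweep schedule described above, the following sketch lists the weeks at which offspring would be removed and rechecks the reported per-population mean. The function name and parameters are hypothetical, and the sketch assumes no extra sweep was performed when a population ended early, which the text does not specify.

```python
def pup_sweep_weeks(duration_weeks, first_sweep=8, interval=5):
    """Weeks at which pup sweeps fall for a population run of the given length.

    First sweep at week 8, then every 5 weeks, as stated in the text.
    ASSUMPTION: no additional sweep at early termination.
    """
    return list(range(first_sweep, duration_weeks + 1, interval))


# Populations maintained for the full 28 weeks.
full_run = pup_sweep_weeks(28)

# The population terminated at 11 weeks.
short_run = pup_sweep_weeks(11)

# Mean offspring per population from the totals given in the text.
mean_offspring = 1138 / 5

if __name__ == "__main__":
    print(full_run, short_run, mean_offspring)
```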

A population-level approach was used to determine reproductive success in three of the five populations; this approach is extensively described in Meagher et al. [10] Briefly, founder individuals from each treatment were selected based on nonoverlapping sex-specific allelic variants: mitochondrial variants for females and Y-chromosome variants for males. To control for confounding effects, such as segregating genes linked with the markers, reciprocal markers were assigned across populations. Mitochondrial genotypes were assessed in 860 samples (three of five populations) and obtained for 100% of offspring. Of the 1,138 offspring, 570 Y-chromosome genotypes were obtained (from all five populations), suggesting that 100% of all males were typed if the sex ratio was 1:1. In one of the populations not typed with the above method, female reproductive success was determined by parentage analysis using multiple microsatellite loci to provide data for another study; this is described in detail in the supplementary information. The final population was not genotyped for female reproductive success, because its results could not have shifted the overall trend far enough to meet our predictions.

Male competitive ability

Prior to release into enclosures, all individuals received a passive integrated transponder (PIT) tag (TX1400ST, BioMark, Boise, ID, USA) and a unique ear punch for identification purposes. Two sets of PIT antennas and readers (FS2001F-ISO, BioMark, Boise, ID, USA) were used in this experiment and were rotated twice per week among concurrent populations. PIT tag antennas were placed above each feeding station within the six territories of an enclosure. All PIT tag data were downloaded to a computer containing data logging software (Minimon, Culver City, CA, USA). A male was considered a territorial occupant (dominant) in a territory if >80% of the total reads belonged to him at a particular location. PIT tag data were collected on female mice but were not analyzed.
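The dominance criterion above can be sketched as a simple share-of-reads rule. This is a minimal illustration, not the authors' analysis code: the function name and tag IDs are hypothetical, and it assumes the caller has already restricted the reads to one feeding station and to the animals of interest (the text does not say whether female reads count toward the total).

```python
from collections import Counter

DOMINANCE_THRESHOLD = 0.80  # from the text: >80% of total reads at a location


def territorial_occupant(reads, threshold=DOMINANCE_THRESHOLD):
    """Return the tag ID holding more than `threshold` of the reads, else None.

    reads: iterable of PIT tag IDs logged at a single feeding station.
    """
    counts = Counter(reads)
    total = sum(counts.values())
    if total == 0:
        return None  # no reads, so the territory is treated as unoccupied
    tag, n = counts.most_common(1)[0]
    # Strict inequality: exactly 80% does not qualify.
    return tag if n / total > threshold else None


if __name__ == "__main__":
    # Hypothetical tag IDs for illustration only.
    print(territorial_occupant(["M01"] * 9 + ["M02"]))      # M01 holds 90%
    print(territorial_occupant(["M01"] * 8 + ["M02"] * 2))  # 80% is not enough
```

A territory for which the function returns `None` corresponds to an unoccupied (undefended) territory in the results.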

Survivorship

Survivorship was assessed by daily noninvasive health checks and extensive enclosure checks during pup sweeps. Extensive checks were not performed more frequently so as not to disrupt territory formation, because such disruption increases infanticidal behavior. Research personnel entered enclosures only to freshen waters, fill feeders, rotate PIT tag readers, remove deceased individuals, and to conduct pup sweeps. Deceased founders were identified by PIT tag IDs. The date of death was estimated from the condition of the corpse. Individuals that had died long before research personnel discovered them were given a death date halfway between the date they were found and the last date they had been read by PIT tag readers.
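The midpoint rule for animals found long after death can be written as a short sketch. The function name and the example dates are hypothetical; when the interval spans an odd number of days, this version rounds toward the earlier day, which is an assumption the text does not address.

```python
from datetime import date


def estimated_death_date(last_pit_read, found):
    """Midpoint between the last PIT tag read and the date the animal was found."""
    if found < last_pit_read:
        raise ValueError("found date precedes the last PIT tag read")
    # Adding a timedelta to a date ignores the sub-day part,
    # so odd intervals round toward the earlier day.
    return last_pit_read + (found - last_pit_read) / 2


if __name__ == "__main__":
    # Hypothetical dates for illustration only.
    print(estimated_death_date(date(2014, 1, 1), date(2014, 1, 11)))
```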

Statistical analyses

Linear mixed models (LMMs) were used to analyze wean weight from cages and body weight in OPA enclosures, as these data are continuous and normally distributed. Generalized linear mixed models (GLMMs) were used to analyze litter size, reproductive success, and male competitive ability, as these data are discrete counts and conform to either a Poisson distribution (litter size and reproductive success) or binomial distribution (male competitive ability). GLMMs with a Poisson distribution are fit on a logarithmic scale; because the values reported in the results section have been back-transformed, their standard errors (SE) are asymmetric. Both LMMs and GLMMs were conducted in R 3.0.2 using the lme4 library. [22],[23] P values were calculated for LMMs with the Satterthwaite approximation using the lmerTest package. [24] Cox proportional hazard models (PH) were used for survivorship (JMP 9.0.3, SAS Institute Inc., Cary, NC, USA). A complete description of statistical analyses can be found in the supplementary information.
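To see why back-transformed standard errors are asymmetric, consider a minimal sketch that exponentiates an estimate and its SE from the log scale. The log-scale SE of 0.113 used here is not reported in the paper; it is an assumed value chosen because it approximately reproduces the female estimate given in the Results (28.76, +3.44, -3.07).

```python
import math


def back_transform(log_mean, log_se):
    """Convert a log-scale estimate and SE to the response scale.

    Returns (mean, upper_se, lower_se); the two SEs differ because
    exp() stretches values above the mean more than values below it.
    """
    mean = math.exp(log_mean)
    upper_se = math.exp(log_mean + log_se) - mean
    lower_se = mean - math.exp(log_mean - log_se)
    return mean, upper_se, lower_se


if __name__ == "__main__":
    # ASSUMPTION: log-scale SE of 0.113, inferred for illustration only.
    mean, up, down = back_transform(math.log(28.76), 0.113)
    print(f"{mean:.2f} (+{up:.2f}, -{down:.2f})")
```

The upper interval is always the wider one, which matches the pattern in the reported estimates.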

Rofecoxib exposure increased female reproductive success by 40% relative to controls. At the model intercept (week eight), rofecoxib-exposed females had significantly more offspring than controls (GLMM; z = 3.89, P < 0.0001) [Figure 4]a with a mean of 28.76 offspring per population (SEM +3.44, -3.07). Control females had a mean of 18.53 offspring per population (+4.24, -3.44). No effect of time (GLMM; z = 0.39, P = 0.70) or time by treatment interaction (GLMM; z = 1.20, P = 0.23) was detected, suggesting that rofecoxib-exposed females had more offspring throughout the duration of the study. For a complete readout of mixed model results on reproduction in enclosures and competitive ability, see Supplementary Table 3 [Additional file 3].

Figure 4: Reproductive success of rofecoxib-exposed and control animals in seminatural enclosures. (a) Rofecoxib-exposed females had 40% more offspring than controls [n = 5, observations = 34 (GLMM; z = 3.89, P < 0.0001)] and this was consistent throughout the study. (b) No difference in male reproduction was detected between treatments [n = 5, observations = 44 (GLMM; z = 0.19, P = 0.85)]. A trend was detected in which rofecoxib-exposed males had fewer male offspring over time (GLMM; z = −0.05, P = 0.09). Lines connect means of the populations at each time point for each sex, and error bars represent standard error

Male competitive ability was not impacted by treatment. At week three (model intercept), control males occupied 30% of territories and rofecoxib-exposed males occupied 30%, leaving 40% of territories unoccupied (GLMM; z = 0.08, P = 0.94) [Figure 5]. The percentage of undefended territories is not unusual because about one-third of the territories are suboptimal and often difficult to defend. There was a marginally significant increase in the number of territories being occupied over time (GLMM; z = 1.72, P = 0.09), but no time by treatment interaction occurred (GLMM; z = -0.27, P = 0.79), suggesting that males from both treatments occupied more territories over time.

Figure 5: Male competitive ability between rofecoxib-exposed males and controls. Males of both treatments occupied equal percentages of territories: 30% each [n = 5, observations = 104 (GLMM; z = 0.08, P = 0.94)]. This effect was consistent throughout the study. Points represent the mean number of territories of five populations. Lines connect means of the five populations, and error bars represent standard error. To aid in visualization, time points from 5-week intervals have been pooled, except for the first time point consisting of 8 weeks

Among females (n = 74), controls had higher mortality; however, the data were not analyzed because overall mortality was low: only two control females died, while 100% of rofecoxib-exposed females survived [Figure 6]a. No significant differences were detected in male mortality between treatments (PH; χ2 = 0.04, P = 0.83) [Figure 6]b. The mortality rate did not differ in replicate populations (PH; χ2 = 4.85, P = 0.30), nor was there a difference in the effect of treatment among populations (PH; χ2 = 2.92, P = 0.57).

Figure 6: Survivorship of rofecoxib-exposed animals compared to controls in seminatural enclosures. (a) Among females (n = 74), controls experienced higher mortality than rofecoxib-exposed females; however, these data were not analyzed because there were so few deaths: 100% survival of rofecoxib-exposed females and two mortalities in the control group. (b) No difference in male mortality was detected between treatments [n = 42 (PH χ2 = 0.04; P = 0.83)].

In cages, rofecoxib exposure did not affect the wean weight of offspring, but a trend was detected where rofecoxib-exposed litters were smaller than control litters. In enclosures, rofecoxib-exposed males had reproduction, survival, and competitive ability equal to those of controls. In enclosures, rofecoxib-exposed females also had equal survival compared to controls, but experienced 40% higher reproductive output.

One possible explanation for our observed results is that the drug did not cause lasting deleterious effects; any effects ceased after exposure ended, allowing exposed individuals to rebound to average fitness. A second explanation is that perhaps animals in this experiment were not exposed to rofecoxib long enough to induce fitness effects. Animals were exposed to rofecoxib during gestation and to ~15-16 weeks of age, whereas cardiac adversity was not detected in humans until after taking the prescription for >18 months. Another possible explanation is that although cardiac events occurred in 27,000 people, an estimated 80 million people took the drug, [15] indicating that <0.04% of the people who took the drug underwent a cardiac event. Due to such a small overall percentage of cardiac adversity in humans, this effect would not have been statistically detectable in five populations of mice with a total of 116 individuals. Lastly, perhaps rofecoxib exposure does not cause the same adverse effects in mice as it did in humans, and differences between the species may be why adverse effects were not detected by OPAs. Rofecoxib-exposed litters tended to be smaller when born in cages. This result is in line with previous research on NSAIDs showing that these drugs can cause transient infertility, [25] as prostaglandin synthesis is important for normal ovulation. [26] Disruption of prostaglandin synthesis was also suggested to be the cause of increased embryo loss in rofecoxib-exposed rats during preclinical studies. [4]
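The incidence argument above (27,000 cardiac events among an estimated 80 million patients) can be made concrete with a back-of-envelope sketch. It assumes, purely for illustration, that each exposed mouse carries the same per-individual risk as a human patient and that events are independent; neither assumption is claimed by the text.

```python
CARDIAC_EVENTS = 27_000     # events associated with rofecoxib, from the text
PATIENTS = 80_000_000       # estimated patients, from the text
EXPOSED_MICE = 58           # rofecoxib-exposed founders, from the text

# Per-patient incidence implied by the two figures above.
rate = CARDIAC_EVENTS / PATIENTS

# ASSUMPTION: mice share the human per-individual risk, independently.
expected_events = rate * EXPOSED_MICE
p_at_least_one = 1 - (1 - rate) ** EXPOSED_MICE

if __name__ == "__main__":
    print(f"incidence: {rate:.4%}")
    print(f"expected events among exposed mice: {expected_events:.3f}")
    print(f"P(at least one event): {p_at_least_one:.3f}")
```

Under these assumptions, far fewer than one event would be expected among the exposed founders, which is consistent with the argument that an effect of this size would not be statistically detectable at this sample size.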

Although rofecoxib-exposed litters tended to be smaller in cages, rofecoxib-exposed females had 40% more offspring when compared to controls when they were in enclosures. These contrasting results are interesting because the main variable that changed was the type of environment these animals were in and because increased reproduction was detected in a treatment where the immune system is targeted. Tradeoffs exist between pregnancy and the immune system, and because rofecoxib suppresses the immune system, perhaps less energy was allocated toward immune function and more energy allocated to reproduction. [27] The fact that we detected a positive impact on female reproduction within a seminatural environment does indicate that normal physiological systems were being disrupted. Thus, such results could be treated as a red flag during pharmaceutical development, triggering further investigation into what physiological system was displaying abnormal behavior.

Conclusion

The adverse health effects of rofecoxib seen in humans escaped detection by OPAs; however, rofecoxib-induced adverse effects were also missed by current preclinical methodologies. These results may be explained by the exposure design (animals competing in enclosures were removed from rofecoxib prior to release into seminatural enclosures), the relatively short duration of exposure, species differences, or because the health benefits of the drug negated (or outweighed, in the case of female reproduction) the side effects. Similar to all other assays used in preclinical trials, OPAs cannot reveal all maladies, despite their demonstrated sensitivity in detecting cryptic toxicity from numerous exposures. OPAs' ability to detect adverse health outcomes in a variety of other treatments suggests that their implementation during pharmaceutical development would be beneficial. In addition to the proximate-level animal studies, OPAs would be useful because they quantify fitness on an ultimate level that provides clear interpretation on overall health. This is achieved because the nature of the OPA is that it challenges most physiological systems simultaneously and consequently, toxicity that reduces performance of any physiological system (e.g. cardiorespiratory, metabolic, or neurological) is likely to be detected in the endpoint measures of this assay (such as survival and reproductive success). Further development of the OPA as a tool for drug safety evaluation is needed, such as evaluating additional pharmaceuticals that have known adverse effects.

The project was funded by the University of Utah's Technology Commercialization Program and was partially conducted while Wayne Potts was supported by NSF grant DEB 09-18969 and NIH grant R01-GM109500. Shannon Gaukler was supported by an NSF GK-12 Educational Outreach Fellowship (DGE 08-41233).