*Professor, Department of Anesthesia, Stanford University School of Medicine, Stanford, California. †Associate Professor, Departments of Psychiatry and Behavioral Sciences and of Pediatrics, Stanford University School of Medicine. ‡Research Assistant, Department of Anesthesia, Stanford University School of Medicine. §Associate Professor, Department of Anesthesia, Stanford University School of Medicine. ‖Research Nurse, Department of Anesthesia, Stanford University School of Medicine. #Research Scientist, Departments of Psychiatry and Behavioral Sciences, Stanford University School of Medicine. **Research Scientist and Director of the Center of Health Sciences, SRI International, Menlo Park, California. ††Professor, Department of Anesthesia, Stanford University School of Medicine, and Department of Anesthesia, Veterans Affairs Palo Alto Health Care System, Palo Alto, California.

OPIOIDS are the cornerstone medication for the management of moderate to severe pain. They are a key component of balanced anesthetic techniques and remain pivotal for the management of pain after surgery or trauma. Unfortunately, the clinical utility of opioids is limited by several aversive drug effects including respiratory depression, sedation, nausea, pruritus, and addiction. Patients' susceptibility to any of these effects varies greatly.1,2 Gaining a better understanding of the mechanisms underlying such differences is essential to identify patients who are at risk.

Sedation and respiratory depression are among the most worrisome adverse opioid effects. For example, patient-controlled opioid analgesia in the postoperative period is associated with severe respiratory depression requiring administration of an opioid antagonist at a rate of approximately 0.5%.3,4 Co-occurrence of somnolence is typical. Although advanced age, obesity, and concomitant use of sedative medications are well-established covariates that increase the risk of respiratory depression, work examining genetic factors is quite limited.5 For example, an experimental study in homozygous carriers of the 118G allele of the OPRM1 (opioid receptor, μ 1) variant suggested that carriers of the G allele experienced less respiratory depression at equianalgesic opioid doses.6 However, another experimental study in heterozygous carriers of the G allele as well as a clinical study could not confirm such protective effects of the G allele.7,8

Approximately one-third of patients undergoing surgery suffer from postoperative nausea and vomiting, a condition strongly associated with the perioperative use of opioids.9 Postoperative nausea and vomiting appear to be more prevalent in homozygous carriers of the 118A allele of the OPRM1 variant.10–12 Although variants of the ABCB1 (adenosine triphosphate binding cassette, subfamily B, member 1) gene have also been associated with the incidence of postoperative nausea and vomiting and opioid-mediated nausea during chronic opioid therapy, results of these studies are inconsistent.13–15 Finally, a large gene-association study in cancer patients linked opioid-related nausea to variants of the HTR3B (5-hydroxytryptamine receptor 3B), COMT (catechol-O-methyltransferase), and CHRM3 (cholinergic receptor, muscarinic 3) genes.16 Remarkably, variants of the CHRM3 gene were also associated with the risk of postoperative nausea and vomiting in surgical patients.17 Very few data are available regarding the genetics of pruritus, another common opioid-related side effect.

In addition, prescription opioid abuse has reached alarming dimensions as accidental death from overdose has increased exponentially.18,19 Although many studies have focused on identifying relevant gene variants in addict populations, not much work has examined the genetics underlying acute reinforcing opioid effects.20 However, subjective responses such as drug liking or disliking may be promising index phenotypes. For example, drug liking on first exposure is predictive of opioid abuse and is a commonly assessed outcome for estimating the abuse potential of novel opioid formulations.21,22

Studies examining the relative importance of genetic and environmental influences on aversive and reinforcing opioid responses are lacking. The aim of this study was to provide estimates of heritability and familial aggregation by studying twins under well-controlled, laboratory-type conditions.23–25 Demonstration of significant heritability is particularly important to clarify whether genetic factors are of clinical importance, which in turn would justify larger-scale and more detailed molecular studies.

Materials and Methods

This pharmacogenomic study was registered at ClinicalTrials.gov on May 2, 2008 (NCT00672438; PI: Angst MS) and was conducted between September 2008 and June 2010. The study produced large datasets covering four outcome domains. Here we present data on aversive and reinforcing opioid effects, whereas data on pain sensitivity and analgesic opioid effects are reported in a separate manuscript.3 Pain sensitivity and analgesic opioid effects were primary outcomes, whereas aversive and reinforcing effects were secondary outcomes. Some portions of the methods section, including the description of subjects, general study setting (including fig. 1), drug administration, and statistical analysis, are analogous and are included in both manuscripts to ensure completeness. A detailed description of the methods and procedures required for the conduct of an interventional and laboratory-type pharmacogenomic study in a sizable number of twins has previously been published.25

Fig. 1. Two hundred twenty-eight monozygotic and dizygotic twins successfully underwent a computer-controlled infusion with the μ-opioid agonist alfentanil in a single occasion, randomized, double-blinded and placebo-controlled study paradigm. Baseline assessments included respiratory parameters (transcutaneous carbon dioxide and respiratory rate), cognitive speed, pain tests (reported elsewhere), covariates potentially affecting measured opioid effects (demographics, psychometric tests, sleep quality), vital signs, and blood draws. Fifty percent of twin pairs were allocated to receive alfentanil first and saline placebo second, whereas the other 50% of twin pairs received alfentanil and saline placebo in reverse order. The alfentanil target concentrations for both treatment sequences are depicted in the graph. A concentration of 100 ng/ml produces significant analgesic and aversive opioid effects in patients suffering from postoperative pain. Respiratory parameters, cognitive speed, subjective aversive effects (nausea, dizziness, sedation, pruritus), reinforcing effects (drug liking and disliking), analgesic effects (reported elsewhere), and vital signs were assessed in identical fashion during both stages of the infusion protocol. Blood draws for assaying alfentanil plasma concentrations were also obtained. IV = intravenous. This figure has been reproduced with permission of the International Association for the Study of Pain®(IASP®). The figure may not be reproduced for any other purpose without permission.

Twins were recruited by a joint effort of SRI International and Stanford University School of Medicine. Initial contact and primary enrollment were the responsibility of study staff of SRI International. Recruitment was mainly achieved through the Twin Research Registry and advertisements broadcast by regional radio stations.26 The study was approved by the Institutional Review Boards of SRI International (Menlo Park, California) and Stanford University School of Medicine (Stanford, California). Two hundred forty-two monozygotic and dizygotic twins were enrolled after giving written informed consent. A medical history was taken and participants were screened for inclusion and exclusion criteria. Inclusion criteria were (1) age 18–70 yr, (2) fluency in the English language, and (3) negative urine pregnancy test on the study day (premenopausal women). Exclusion criteria were (1) clinically relevant systemic diseases such as psychiatric, neurologic, and dermatologic conditions interfering with the collection and interpretation of study data, (2) cardiorespiratory diseases causing at least moderate impairment in daily activities, (3) renal and hepatic diseases with functional impairment, (4) morbid obesity, (5) sleep apnea, (6) history of addiction, (7) allergy to study medication, (8) chronic intake of medications with recognized analgesic/antihyperalgesic activity, (9) intake of over-the-counter analgesics within 2 days before the study, (10) Raynaud disease, (11) pregnancy, and (12) other conditions compromising a participant's safety or the integrity of the study.

Study Setting

The study took place in the Human Pain Laboratory of the Department of Anesthesia at Stanford University School of Medicine. The laboratory offers a quiet environment and precise lighting and temperature control. Critical equipment for the successful and safe conduct of the study included ergonomic and adjustable treatment chairs (Cloud 9, Living Earth Crafts, Vista, CA), vital signs monitors (Propaq Model 244, Welch Allyn, Beaverton, OR), oxygen supply, and a resuscitation cart with a defibrillator, airway management equipment, and emergency drugs. Laboratory staff included (1) a research associate who was blinded to treatment, solely interacted with study participants, performed all testing procedures, and collected all subjective and behavioral data; (2) a registered nurse trained in critical care or emergency medicine who was not blinded to treatment, performed the phlebotomy, administered the study drug, monitored and recorded vital signs, and collected blood specimens; and (3) an anesthesiologist who was blinded and was physically present to oversee the drug infusion and ensure the safety of study participants. Vital signs including heart rate (electrocardiogram), blood pressure, and hemoglobin oxygen saturation were monitored throughout the study. Participants were required to fast overnight except for clear liquids, which were allowed up to 2 h before starting the drug infusion. Participants were also required to have at least 6 h of nighttime sleep before a study session. Twin pairs were not allowed to share their study experience before the completion of the experiments in both twins. During the drug infusion participants received supplemental oxygen (2 l/min) via nasal cannula. Resting periods during the study were standardized, the room lighting was dimmed, participants listened to relaxing music of their choice via headphones, and activities by study staff causing noise or possibly distracting participants were prohibited. At the end of the study session, participants were discharged when they met criteria used for patients undergoing noninvasive, ambulatory procedures requiring sedation (e.g., colonoscopy). Criteria included (1) blood pressure ± 20% of baseline, (2) hemoglobin oxygen saturation more than 95%, (3) wakefulness, (4) no or mild nausea, (5) no vomiting, (6) ability to urinate, and (7) prearranged transport home available.

General Study Design and Randomization

Twins underwent a single-occasion, randomized, double-blind and placebo-controlled study protocol (fig. 1). During the preinfusion phase, cognitive speed, respiratory parameters, pain sensitivity, and covariates potentially affecting measured opioid effects were assessed. Blood was collected for genotyping. During the infusion phase, changes in cognitive speed, respiratory parameters, and pain sensitivity were measured to infer sedative, respiratory depressant, and analgesic opioid effects. Similarly, occurrence and magnitude of nausea, pruritus, and reinforcing opioid effects were assessed. Blood was collected to assay for drug plasma concentrations. Fifty percent of the twin pairs were randomized to receive an infusion of the μ-opioid agonist alfentanil followed by the infusion of saline placebo, whereas the other 50% of twin pairs were randomized to receive the infusions in reverse order. The randomization list allocating twin pairs to a particular infusion sequence was generated via Research Randomizer.1 The list was generated by staff of SRI International who were not further involved in the conduct of the study.
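The allocation scheme described above (half of the twin pairs per infusion sequence, with the pair as the unit of randomization) can be illustrated with a minimal Python sketch. This is not the Research Randomizer procedure used in the study; the function name, the sequence labels, the seed, and the pair identifiers are assumptions made for illustration only.

```python
import random

def allocate_sequences(pair_ids, seed=None):
    """Randomly assign half of the twin pairs to each infusion sequence.

    The pair (not the individual twin) is the unit of randomization, so both
    twins of a pair always share the same sequence. Labels are hypothetical.
    """
    rng = random.Random(seed)
    shuffled = list(pair_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    allocation = {}
    for position, pair_id in enumerate(shuffled):
        if position < half:
            allocation[pair_id] = "alfentanil-first"
        else:
            allocation[pair_id] = "placebo-first"
    return allocation

# Illustrative run with 114 hypothetical pair identifiers.
allocation = allocate_sequences(range(1, 115), seed=2008)
n_alfentanil_first = sum(v == "alfentanil-first" for v in allocation.values())
n_placebo_first = sum(v == "placebo-first" for v in allocation.values())
```

Shuffling the full list and splitting it in the middle guarantees an exact 50/50 split, which simple coin-flip allocation would not.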

The single-occasion study design had to control for placebo effects that could potentially confound some of the measured opioid effects. Traditional study designs test for drug and placebo effects on separate study occasions. However, asking twins to return for a second study occasion was not considered feasible, because such a requirement may have hampered our ability to recruit and retain a sufficient number of twins. For example, 31% of twins lived more than 60 miles away from the study location. Although a single-occasion design did not allow assessing placebo effects in all participants, such effects could be assessed in the 50% of twins randomized to receive saline placebo before alfentanil. Randomizing the other 50% of twins to receive alfentanil before saline placebo was necessary to maintain the blinding. However, placebo effects could not be assessed in these twins because residual alfentanil plasma concentrations were still present during the saline placebo infusion.

Opioid Administration and Assay

A computer-controlled infusion paradigm was used to quickly achieve and maintain steady-state drug plasma and effect site concentrations. An equilibration period of 20 min was observed between starting the infusion of alfentanil or saline placebo and the first assessment of drug effects, which allowed measuring all outcomes at similar drug concentrations.27 Although a computer-controlled infusion allows maintaining a stable plasma concentration in an individual participant, plasma concentrations vary among participants. Therefore, alfentanil plasma concentrations were measured to include interindividual differences in drug concentrations as a covariate in the final analysis. The μ-opioid agonist alfentanil (Janssen Pharmaceutica, Titusville, NJ) was chosen because of its quick onset and offset of action, and a previously validated computer-controlled infusion algorithm for its administration.27 Alfentanil was administered intravenously via a computer-controlled infusion pump (Harvard Pump 22, Harvard Apparatus, Inc., South Natick, MA) targeting a steady-state plasma concentration of 100 ng/ml. This target concentration produces clinically relevant opioid effects without causing harmful side effects.27,28 STANPUMP2 using Scott's weight-adjusted pharmacokinetic parameters was the software driving the infusion pump.29

Alfentanil plasma concentrations were assayed at the Clinical Research and Development Unit of the Department of Anesthesia at the University of Colorado Health Sciences Center (Denver, Colorado). Six milliliters of venous blood were drawn into heparinized glass tubes, centrifuged, and the plasma frozen and stored at −70°C until assayed. Using liquid chromatography/liquid chromatography mass spectrometry/mass spectrometry, the lower limit of quantitation was 1.25 pg/ml with a 1,000-fold linear range (R = 0.99), and the intraassay and between-assay coefficients of variation ranged between 4–16% and 3–14%, respectively.

Respiratory Depression

Respiratory depression was quantified by measuring changes in partial pressure of transcutaneous carbon dioxide and respiratory rate. Transcutaneous carbon dioxide was quantified with the aid of a partial pressure of oxygen/partial pressure of carbon dioxide (pO2/pCO2) electrode (Perimed Inc., North Royalton, OH) mounted on the anterior chest wall. Although measured tissue carbon dioxide is higher than the arterial carbon dioxide, the two measures correlate strongly and relative changes match closely.30 Respiratory rate was assessed by counting the number of breaths over a period of 1 min. Both measures were obtained after strictly observing a standardized resting period of at least 15 min. During this period, participants listened to soft music via headphones, the room lighting was dimmed, any interaction was avoided, and activities by study staff causing noise or possibly distracting participants were prohibited. The partial pressure of transcutaneous carbon dioxide was continuously monitored and measures were only recorded at the end of the resting period in undisturbed participants with readings that were stable over a period of 2–3 min.

The use of more invasive or complex techniques to assess respiratory depression was considered but not deemed feasible. Additional interventions such as the insertion of an arterial line for repeatedly measuring arterial carbon dioxide, or the use of time-consuming and stressful rebreathing techniques for determining the transcutaneous carbon dioxide response function, may have significantly hampered our ability to recruit a sufficient number of twins for this demanding study protocol.

Sedation

Subjective sedation scores were assessed immediately after the recording of the respiratory parameters. Participants indicated on a 100-mm visual analog scale (VAS) anchored by the words “not sedated at all” (VAS = 0) and “sedated as much as possible” (VAS = 100) how sedated they felt.

Cognitive performance was assessed with the trail-making test. The trail-making test measures cognitive speed and correlates significantly with tests quantifying intelligence.31 This paper-and-pencil test consists of four matrices featuring 90 numbers organized in 9 rows and 10 columns on a 23 × 21 cm sheet of paper. Subsequent numbers are located randomly in neighboring rows or columns. Starting at number 1, a participant has to connect numbers in ascending order as quickly as possible. The time to completion of the test is recorded. Mistakes are called out by an observer and have to be corrected by the participant before continuing with the test. The particular matrix that a participant had to complete during a test cycle was chosen randomly. All participants were trained in the trail-making test before first recordings were made.

Nausea, Pruritus, and Dizziness

At the end of an infusion stage participants were asked to rate the average and maximum severity of nausea, pruritus, and dizziness on a 100-mm VAS anchored by the words “not at all” and “as much as possible.”

Reinforcing Opioid Effects

At the end of an infusion stage participants were asked the following questions: (1) Did you like the drug at any moment (yes/no)? (2) Did you dislike the drug at any moment (yes/no)? (3) If you liked and disliked the drug, did you like or dislike it first? (4) How much did you like the drug on average (100-mm VAS, 0 = “not at all,” 100 = “as much as possible”)? (5) What was the maximum that you liked the drug at any moment (VAS)? (6) What was the maximum that you disliked the drug at any moment (VAS)?

Anxiety: Anxiety was assessed with the Profile of Mood States, a self-reported questionnaire that evaluates six dimensions of mood (anxiety, depression, anger, vigor, inertia, and bewilderment).32 Participants rated 65 mood-related adjectives on a five-point scale (0 = not at all, 1 = a little, 2 = moderately, 3 = quite a bit, 4 = extremely). The Profile of Mood States yields a total score and subscores for each dimension of mood (anxiety subscore range: 0 to 36).

Sleep: Sleep quality during the month preceding the study was assessed with the Pittsburgh Sleep Quality Index, a self-reported questionnaire that assesses seven components of sleep (quality, latency, duration, efficiency, disturbance, medication, and daytime dysfunction). The total score ranges between 0 and 21, and a value of more than 6 is indicative of sleep disturbance.

Zygosity Testing

Zygosity was assessed by genotyping 47 single nucleotide polymorphisms, a recently published high throughput method providing high accuracy.34 Genotyping was performed with a custom-designed Oligo Pool for Methylation Assay (Golden Gate Genotyping Assay, Illumina Inc, San Diego, CA) and BeadXpress (iGenix Inc., Bainbridge Island, WA).

Statistical Analysis

Data are presented as mean and standard deviation (SD) or as median and interquartile range (IQR). Summary statistics, parametric or nonparametric hypothesis testing with paired or nonpaired test procedures, and correlation analysis on continuous and ranked data were performed in Systat Version 13 (Chicago, IL). An α level of P < 0.05 indicated statistical significance. All outcomes were of secondary nature, which did not require adjusting the P value to the number of outcomes.

The primary statistical analysis of heritability (genetic effects) and familial aggregation (genetic and/or shared environmental effects) was based on a classic twin model.35 In principle, this model takes advantage of the fact that monozygotic twins share 100% of their genes, whereas dizygotic twins share approximately 50% of their genes. In contrast, monozygotic and dizygotic twins are assumed to share the familial environment to the same degree. Comparing the degree of similarity in a phenotype by correlational analysis in monozygotic and dizygotic twin pairs allows estimating the relative contribution of genetic and environmental factors, which can be broken down further into shared (familial) and unique (random) environmental factors. Greater similarity in monozygotic than in dizygotic twin pairs suggests that genetic effects at least partially account for the studied phenotype. Similarities that are equal or nearly equal in monozygotic and dizygotic twin pairs suggest that shared environmental effects at least partially account for the studied phenotype. Finally, phenotypical dissimilarities in monozygotic twin pairs suggest that unique environmental factors at least partially account for the studied phenotype, because monozygotic twins share all of their genes and all of the common environment.
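The logic of the classic twin model can be made concrete with the textbook decomposition that follows from these assumptions: if the monozygotic correlation equals A + C and the dizygotic correlation equals 0.5 A + C, then A, C, and E follow by simple arithmetic. The Python sketch below illustrates this reasoning only; it is not the regression-based estimator used in this study, and the function name and the example correlations are hypothetical.

```python
def ace_from_correlations(r_mz, r_dz):
    """Textbook decomposition under the classic twin model.

    Assumes r_mz = A + C and r_dz = 0.5 * A + C, with A + C + E = 1.
    Returns the additive genetic (A), shared environmental (C), and
    unique environmental (E) variance fractions.
    """
    a2 = 2.0 * (r_mz - r_dz)   # heritability
    c2 = 2.0 * r_dz - r_mz     # shared (familial) environment
    e2 = 1.0 - r_mz            # unique environment
    return a2, c2, e2

# Hypothetical twin-pair correlations, chosen only to illustrate the arithmetic.
a2, c2, e2 = ace_from_correlations(r_mz=0.60, r_dz=0.40)
```

With equal monozygotic and dizygotic correlations, A is zero and all familial resemblance is attributed to C, which mirrors the interpretation given in the text.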

The specific analysis was based on a generalized form of the DeFries-Fulker (DF) regression model for twins (Stata Version 11, StataCorp LP, College Station, TX).36 In this model, phenotypic measurements of each twin are regressed on his or her cotwin's phenotypic measurement, while accounting for the joint distribution of the twin data. In contrast with alternative methods of twin analysis, the DF regression model produces unbiased estimates of twin pair intraclass correlations, which provide the basis for estimates of genetic and familial effects. The DF regression model was modified to allow for the simultaneous estimation of covariate effects, as well as genetic and familial effects. The method is equivalent to competing methods in terms of power but is more robust in the presence of nonnormality. The modified model was specifically developed to address the highly skewed distribution of the visual analog scale data collected in this study.
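For orientation, the basic (not the generalized) DF regression for unselected twin samples can be sketched in a few lines of Python. In this basic form each twin's score is regressed on the cotwin's score P, the coefficient of relatedness R (1.0 for monozygotic, 0.5 for dizygotic pairs), and their product; the coefficient of P then estimates the shared environmental fraction and the coefficient of the product estimates heritability. The sketch uses ordinary least squares on double-entered data and simulated pairs; the function names, the simulation, and its parameter values are assumptions for illustration and do not reproduce the modified model or the Stata implementation used here.

```python
import random

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def df_regression(pairs):
    """Basic DF regression: cotwin = b0 + b1*P + b2*R + b3*P*R.

    pairs: list of (score_twin1, score_twin2, relatedness).
    Returns [b0, b1, b2, b3]; b1 estimates C and b3 estimates A.
    """
    rows = []
    for t1, t2, r in pairs:
        rows.append(([1.0, t1, r, t1 * r], t2))  # double entry: each twin
        rows.append(([1.0, t2, r, t2 * r], t1))  # serves once as predictor
    k = 4
    xtx = [[sum(x[i] * x[j] for x, _ in rows) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * y for x, y in rows) for i in range(k)]
    return solve_linear(xtx, xty)

def simulate_pairs(n_pairs, relatedness, a2, c2, rng):
    """Simulate standardized phenotypes with known A and C fractions."""
    e2 = 1.0 - a2 - c2
    pairs = []
    for _ in range(n_pairs):
        shared_g, shared_env = rng.gauss(0, 1), rng.gauss(0, 1)
        scores = []
        for _ in range(2):
            g = relatedness ** 0.5 * shared_g + (1 - relatedness) ** 0.5 * rng.gauss(0, 1)
            scores.append(a2 ** 0.5 * g + c2 ** 0.5 * shared_env + e2 ** 0.5 * rng.gauss(0, 1))
        pairs.append((scores[0], scores[1], relatedness))
    return pairs

rng = random.Random(42)
data = simulate_pairs(3000, 1.0, 0.5, 0.2, rng) + simulate_pairs(3000, 0.5, 0.5, 0.2, rng)
b0, b1, b2, b3 = df_regression(data)  # expect b3 near 0.5 and b1 near 0.2
```

The simulated sample is deliberately much larger than the study cohort so that the recovered coefficients sit close to the simulated values; with roughly a hundred pairs the estimates would be far noisier, which is why the study reports bootstrap confidence intervals.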

To identify the most significant covariate effects, a forward selection algorithm using the DF regression model with both genetic and familial effects was applied to each of the studied phenotypes. Covariates were added sequentially, starting with the covariate associated with the smallest P value (P < 0.05; Wald test for the corresponding Z-statistic). This procedure was repeated until remaining covariates no longer yielded a P value less than 0.05.
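The forward selection loop described above can be sketched generically in Python. The sketch takes a caller-supplied function that returns the P value of a candidate covariate given the covariates already in the model; in the study that P value came from a Wald test within the DF regression model, whereas here a toy lookup with invented covariate names and P values stands in for the model fit.

```python
def forward_select(candidates, p_value_for, alpha=0.05):
    """Add covariates one at a time, smallest P value first.

    p_value_for(selected, candidate) must return the P value of `candidate`
    when added to a model that already contains `selected`.
    Stops when no remaining covariate has a P value below alpha.
    """
    selected = []
    remaining = list(candidates)
    while remaining:
        p_values = {c: p_value_for(selected, c) for c in remaining}
        best = min(p_values, key=p_values.get)
        if p_values[best] >= alpha:
            break
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy stand-in for a model fit: all names and P values are invented.
toy_p = {"plasma_concentration": 0.001, "age": 0.03, "sex": 0.20, "bmi": 0.04}

def toy_p_value(selected, candidate):
    # In this toy example, bmi is no longer significant once the plasma
    # concentration is already in the model.
    if candidate == "bmi" and "plasma_concentration" in selected:
        return 0.30
    return toy_p[candidate]

chosen = forward_select(toy_p, toy_p_value)
```

Recomputing the P values at every step matters because a covariate that looks significant on its own may add nothing once a correlated covariate has entered the model.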

All subsequent tests of genetic and familial effects incorporated the significant covariates for the respective phenotypes. Z-statistics (Wald test) were used for significance testing. Reported significance levels and 95% CIs for selected covariates, and genetic and familial effects were estimated by generating 10,000 random bootstrap datasets. Datasets were created with clustered resampling techniques using twin pairs as the resampling unit. Bootstrap resampling was stratified by zygosity. A one-sided Wald test was used to test for a significant genetic effect. This test examined whether the intraclass correlation within monozygotic twin pairs was significantly larger than the intraclass correlation for dizygotic twin pairs, using the standard assumption that monozygotic and dizygotic twins shared a common environment. If the genetic test was not statistically significant, a secondary one-sided test procedure was used to test for a significant familial effect. This test examined whether the average twin-pair intraclass correlation, assumed to be equal for monozygotic and dizygotic pairs, was greater than zero. Familial aggregation was not assessed for phenotypes that showed a significant genetic effect. Heritability estimates were calculated under the DF ACE twin model that allows estimating the differential contributions of additive genetic (A), shared environmental (C), and unique environmental effects (E). Familial aggregation was estimated under a DF CE model. CIs for heritability and familial aggregation were truncated to lie between 0 and 1, which is consistent with the assumptions underlying the twin model.
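The resampling scheme (twin pairs as the resampling unit, stratified by zygosity) can be sketched as follows. The sketch is a generic illustration and not the authors' code: the percentile method for the interval is an assumption, since the text does not state how intervals were derived from the bootstrap datasets, and the function names and toy data are hypothetical. The truncation of interval limits to the range 0 to 1 follows the text.

```python
import random

def stratified_pair_bootstrap(pairs, n_boot, statistic, seed=None):
    """Resample whole twin pairs with replacement within each zygosity group.

    pairs: list of dicts, each with a 'zygosity' key plus the pair's data.
    statistic: function mapping a resampled list of pairs to a number.
    """
    rng = random.Random(seed)
    strata = {}
    for pair in pairs:
        strata.setdefault(pair["zygosity"], []).append(pair)
    estimates = []
    for _ in range(n_boot):
        sample = []
        for members in strata.values():
            # Each stratum keeps its original size in every bootstrap dataset.
            sample.extend(rng.choices(members, k=len(members)))
        estimates.append(statistic(sample))
    return estimates

def percentile_interval(estimates, level=0.95):
    """Percentile interval (an assumed choice of interval method)."""
    ordered = sorted(estimates)
    last = len(ordered) - 1
    return ordered[int((1 - level) / 2 * last)], ordered[int((1 + level) / 2 * last)]

def clip_unit_interval(interval):
    """Truncate interval limits to lie between 0 and 1."""
    return tuple(min(1.0, max(0.0, limit)) for limit in interval)

# Toy data: 8 monozygotic and 4 dizygotic pairs (invented).
toy_pairs = [{"zygosity": "MZ", "id": i} for i in range(8)]
toy_pairs += [{"zygosity": "DZ", "id": i} for i in range(8, 12)]
mz_counts = stratified_pair_bootstrap(
    toy_pairs, n_boot=200,
    statistic=lambda sample: sum(p["zygosity"] == "MZ" for p in sample),
    seed=1,
)
```

Because resampling is stratified, every bootstrap dataset contains the same number of monozygotic and dizygotic pairs as the original sample, which the toy statistic above makes visible.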

Power Analysis

The power analysis was based on the enrollment of 85 monozygotic and 40 dizygotic twin pairs. Power was approximated based on Fisher log transformation for the Pearson correlation coefficient between measurements for twin A and twin B at an α level of P < 0.05. Eighty-five monozygotic pairs yielded a power of 0.8 to detect familial aggregation more than 30%. Eighty-five monozygotic and 40 dizygotic pairs yielded a power of 0.5 to detect heritability more than 50%, assuming that shared environmental effects account for a modest 20% of the variance. However, these calculations were best approximations because of uncertainty regarding the relative contribution of additive genetic, shared environmental, and unique environmental effects to the response variance.
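The first of these power figures can be approximated with the Fisher transformation of a correlation coefficient, whose standard error is 1/sqrt(n − 3) for n pairs. The Python sketch below assumes a two-sided test of a zero correlation, which the text does not state explicitly; the function name and this choice of test are assumptions, and the heritability power calculation (which compares two correlations) is not reproduced here.

```python
import math
from statistics import NormalDist

def power_correlation(r, n_pairs, alpha=0.05, two_sided=True):
    """Approximate power to detect a true correlation r with n_pairs pairs.

    Uses the Fisher transformation atanh(r), which is approximately normal
    with standard error 1 / sqrt(n_pairs - 3).
    """
    normal = NormalDist()
    z_effect = math.atanh(r) * math.sqrt(n_pairs - 3)
    if two_sided:
        z_crit = normal.inv_cdf(1 - alpha / 2)
        return normal.cdf(z_effect - z_crit) + normal.cdf(-z_effect - z_crit)
    z_crit = normal.inv_cdf(1 - alpha)
    return normal.cdf(z_effect - z_crit)

# A twin-pair correlation of 0.30 with 85 monozygotic pairs.
power_mz = power_correlation(0.30, 85)
```

Under these assumptions the result is approximately 0.8, in line with the reported power to detect familial aggregation of more than 30% with 85 monozygotic pairs.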

Results

This manuscript reports results on aversive and reinforcing opioid effects, whereas a separate manuscript reports pain sensitivity and analgesic outcomes.3 For completeness, demographic variables, drug plasma concentrations, and safety parameters are reported in both manuscripts. During the study no adjustments to the protocol or outcome assessments were required.

Subjects

A total of one hundred twenty-one twin pairs were recruited. One hundred fourteen pairs completed the study. Seven pairs were excluded for the following reasons: (1) two pairs because of a positive pregnancy test; (2) two pairs because of missed appointments; and (3) three pairs because of poor compliance with the study procedures. Detailed demographics of the final cohort are depicted in table 1. Most twins were women (62%), monozygotic (71%), Caucasian (78%), and of non-Hispanic origin (82%). The median age was 29 yr (22–47; IQR) and the median body mass index was 24.3 kg/m2 (21.8–27.8; IQR). Educational levels were a high school degree for 19, some college education for 95, and a college degree for 114 subjects.

The average alfentanil plasma concentration was 72 ± 16 ng/ml (SD), which is 28% lower than predicted. The size of this prediction error is consistent with previous results, and actual plasma concentrations produced sizable drug effects.27 Although plasma concentrations covered a 3.7-fold range (lowest to highest), approximately 80% of the concentrations were within a twofold range. The plasma concentration was positively correlated with the body mass index (R = 0.53; P < 0.001) and was significantly higher in men than in women (76.6 ± 22.7 ng/ml vs. 62.6 ± 18.8 ng/ml (SD); P < 0.001). The body mass index was not different between men and women. The average within-pair difference in plasma concentration was 17 ± 14% in monozygotic twins and 21 ± 16% (SD) in dizygotic twins.

Respiratory Depression

Alfentanil-induced changes in transcutaneous carbon dioxide varied widely among studied twins, being absent in some and increasing by up to 18.6 mmHg in others (fig. 2). The median transcutaneous carbon dioxide before drug infusion was 40.7 mmHg (38.0–43.7; IQR) and significantly increased to 47.7 mmHg (43.9–51.8; IQR) during the infusion of alfentanil (P < 0.001). Due to an equipment malfunction involving the pO2/pCO2 electrode, reliable measurements were only obtained in 196 twins (missing values in 32 participants).

Fig. 2. Respiratory depression was assessed by measuring opioid-induced decreases in respiratory rate (RR) and increases in transcutaneous carbon dioxide (tc-CO2). Results are ranked from smallest to largest along the x-axis. The interindividual differences in drug-induced changes in respiratory rate (A) varied widely and ranged from −12 to 3 breaths/min. Similarly, the increase in tc-CO2 varied widely (B), being absent in some and increasing by up to 18.6 mmHg in others. The solid and dashed lines indicate the median and the interquartile range, respectively.

The effects of alfentanil on respiratory rate varied widely among studied twins, increasing in a few but decreasing by up to 12 breaths/min in most others (fig. 2). The median respiratory rate before drug infusion was 15 min−1 (13–17; IQR) and significantly decreased to 11 min−1 (9–13; IQR) during the infusion of alfentanil (P < 0.001).

Sedation

Alfentanil-induced changes in cognitive speed (trail-making test) varied widely among studied twins, being absent in some and increasing by more than 60 s in others (fig. 3). The median time to complete the trail-making test before drug infusion was 62 s (55–73; IQR) and significantly increased to 69 s (55–84; IQR) during the infusion of alfentanil (P < 0.001).

Fig. 3. Sedation was assessed by measuring cognitive speed and by asking participants to indicate on a 100-mm visual analog scale (VAS) how sedated they felt. Results are ranked from smallest to largest along the x-axis. Drug-mediated slowing in cognitive speed (A) varied widely among participants, being absent in some and modest in most. Subjective sedation scores (B) increased in all participants, but the magnitude of such increase varied remarkably. The solid and dashed lines indicate the median and the interquartile range, respectively. TMT = trail-making test.

The effects of alfentanil on subjective sedation scores varied remarkably among studied twins, ranging from a 5- to a 100-point VAS score (fig. 3). The median sedation score was 75 (60–85; IQR) during the infusion of alfentanil. Given that sedation scores were only determined during the infusion of saline placebo and alfentanil but not at baseline, the statistical significance of drug-induced changes was evaluated in the subpopulation of twins receiving placebo before alfentanil (table 2; P < 0.001).

Subjects were asked to rate average as well as maximum drug-induced changes in nausea, pruritus, and dizziness. Average and maximum changes were tightly correlated (R = 0.89 to 0.92; P < 0.001). Effects of alfentanil varied widely among studied twins as shown for maximum drug effects in figure 4. The median VAS scores for average and maximum nausea during the infusion of alfentanil were 1 (0–32; IQR) and 8 (0–70; IQR) in all subjects, whereas they were 32 (17–60; IQR) and 70 (41–89; IQR) in the 50% of subjects reporting this side effect. The median VAS scores for average and maximum pruritus were 20 (0–44; IQR) and 30 (0–65; IQR), whereas they were 30 (15–50; IQR) and 50 (26–71; IQR) in the 58% of subjects reporting this side effect. The median VAS scores for average and maximum dizziness were 20 (0–50; IQR) and 40 (0–71; IQR), whereas they were 39 (20–60; IQR) and 60 (34–80; IQR) in the 68% of subjects reporting this side effect. Given that scores were only determined during the infusion of saline placebo and alfentanil but not at baseline, the statistical significance of drug-induced changes was evaluated in the subpopulation of twins receiving placebo before alfentanil (table 2; P < 0.001 for all comparisons).

Fig. 4. Subjective aversive opioid effects were all assessed on a 100-mm visual analog scale (VAS). Participants were asked at the end of the infusion phase to provide ratings for average and maximum nausea, pruritus, and dizziness. Average and maximum ratings correlated tightly (R = 0.89–0.92). Maximum scores are displayed in the figure. Results are ranked from smallest to largest along the x-axis. The VAS scores for nausea (A), pruritus (B), and dizziness (C) varied widely among participants, being absent in many but ranking close to the maximum in others. The solid and dashed lines indicate the median and the interquartile range, respectively.

During the infusion of alfentanil, 65 participants liked the drug, 31 disliked it, 14 neither liked nor disliked it, and 118 both liked and disliked it at different times. Of the 118 participants who liked and disliked the drug, 93 liked it initially and 25 disliked it initially. The VAS scores for average liking, maximum liking, and maximum disliking varied greatly among participants (fig. 5). Average and maximum liking were strongly correlated (R = 0.81; P < 0.001). Average liking was negatively correlated with maximum disliking (R = −0.35; P < 0.001), whereas maximum liking and maximum disliking were not correlated. The median VAS scores for average and maximum liking during the infusion of alfentanil were 50 (0–75; IQR) and 70 (30–90; IQR). The median VAS score for maximum disliking was 35 (0–80; IQR).

Fig. 5. Reinforcing drug effects were assessed on a 100-mm visual analog scale (VAS). Participants were asked at the end of the infusion phase to provide ratings for average (A) and maximum drug liking (B), and maximum drug disliking (C). Results are ranked from smallest to largest along the x-axis. The VAS scores for drug liking and disliking varied widely among participants, being absent in many but ranking close to the maximum in others. The solid and dashed lines indicate the median and the interquartile range, respectively.

Liking and disliking correlated with some of the subjective aversive opioid effects (table 3). Most notably, average liking was negatively correlated with nausea and dizziness, which accounted for 3–8% of the observed variance. Maximum disliking was positively correlated with nausea and dizziness, which accounted for 19–37% of the observed variance. By contrast, maximum liking was neither positively nor negatively correlated with nausea or dizziness.
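
The shares of variance quoted above can be related to correlation coefficients with a short calculation. The sketch below assumes that "variance accounted for" refers to the squared correlation coefficient, which the text does not state explicitly. The function names are illustrative and are not part of the study's analysis.

```python
import math


def variance_explained(r: float) -> float:
    """Share of variance accounted for by a linear correlation r (r squared)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    return r * r


def correlation_magnitude(share: float) -> float:
    """Absolute correlation that corresponds to a given share of variance."""
    if not 0.0 <= share <= 1.0:
        raise ValueError("a share of variance must lie in [0, 1]")
    return math.sqrt(share)


# R = -0.35 is the correlation between average liking and maximum
# disliking reported in the text.
print(f"{variance_explained(-0.35):.1%} of variance")

# Correlation magnitudes implied by the variance ranges quoted above,
# under the r-squared assumption.
for share in (0.03, 0.08, 0.19, 0.37):
    print(f"{share:.0%} of variance -> |R| = {correlation_magnitude(share):.2f}")
```

Under this assumption, the reported correlation of −0.35 between average liking and maximum disliking corresponds to roughly 12% of shared variance.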

Possible placebo effects were evaluated in the 50% of twins who received the infusion of saline placebo before the infusion of alfentanil. Objective outcomes assessed at baseline and during the infusion of saline placebo and alfentanil included respiratory rate, transcutaneous carbon dioxide, and the trail-making test. For these outcomes, changes observed between baseline and saline placebo administration were compared with changes observed between baseline and alfentanil administration. However, subjective drug-related outcomes including sedation, dizziness, nausea, drug liking, and drug disliking were only assessed during saline placebo and alfentanil administration. For these outcomes the absolute values obtained during the administration of saline placebo and alfentanil were compared.

Overall, measures of respiratory depression (respiratory rate and transcutaneous carbon dioxide) and cognitive speed changed modestly if at all during the infusion of saline placebo compared with predrug assessments. Consequently, drug effects were inferred in all subjects by subtracting predrug measurements from measurements obtained during the infusion of alfentanil.
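
The subtraction described above, followed by a median and interquartile range summary, can be sketched as follows. The data are hypothetical, and the quartile convention (the "inclusive" method of Python's `statistics.quantiles`) is an assumption, because the text does not say how quartiles were computed.

```python
from statistics import median, quantiles


def change_from_predrug(predrug, during):
    """Per-participant change: value during infusion minus predrug value."""
    if len(predrug) != len(during):
        raise ValueError("both series need one value per participant")
    return [d - p for p, d in zip(predrug, during)]


def median_iqr(values):
    """Return (median, first quartile, third quartile) of the values."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q1, q3


# Hypothetical respiratory rates (breaths/min) for five participants.
predrug = [16, 14, 18, 15, 17]
alfentanil = [12, 13, 11, 10, 15]

changes = change_from_predrug(predrug, alfentanil)
print(median_iqr(changes))
```

Negative values indicate a decrease relative to the predrug measurement, which matches the sign convention used for the respiratory rate results.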

The median respiratory rate was 16 min⁻¹ (13–18; IQR) before drug infusion and 15 min⁻¹ (13–17; IQR) during the infusion of saline placebo (P = 0.06). The median net change in respiratory rate from predrug measurements was −1 min⁻¹ (−2 to 1; IQR) during saline placebo and −4 min⁻¹ (−7 to −2; IQR) during alfentanil administration.

The median time required to complete the trail-making test was 65 s (56–74; IQR) before drug infusion and 64 s (55–74; IQR) during the infusion of saline placebo (P = 0.15). The median net change from predrug measurements was −1.5 s (−5 to 3; IQR) during saline placebo administration and 4.5 s (−1 to 11; IQR) during alfentanil administration.

Subjective outcomes were differentially affected by the administration of saline placebo (table 2; also see figures, Supplemental Digital Content 1 and 2, which depict incidence and magnitude of aversive and reinforcing opioid effects during administration of saline placebo and alfentanil). The incidence of dizziness, nausea, and drug disliking during placebo administration was low (3–8%), and the median effect size was zero (0–0; IQR). The incidence of pruritus and drug liking was modest (12–18%), and the median effect size was zero (0–0; IQR). However, the incidence of sedation was remarkable (48%), whereas the median effect size was zero (0–15; IQR). These results suggest that changes in dizziness, nausea, and drug disliking during alfentanil administration were essentially drug-related. Changes in pruritus and drug liking were largely drug-related. However, changes in sedation were only partially drug-related. The heritability analysis assumed that effects measured during the administration of alfentanil were entirely drug-related. Although this assumption appears to be reasonable for most subjective outcomes, it may not be entirely valid for sedation.

Possible effects of drug sequence (saline placebo – alfentanil vs. alfentanil – placebo) on assessed outcomes were examined by comparing effect sizes measured during the infusion of alfentanil between the two groups of twins who received the infusions in reversed order. No significant sequential effects were detected (P = 0.12–0.92).

Covariates

Covariates significantly affected several of the measured phenotypes. Table 4 and table 5 summarize these covariate contributions. Age and sex were the covariates most commonly associated with the measured phenotypes.

Alfentanil plasma concentration was not a significant covariate for any of the measured phenotypes, implying that pharmacokinetic variability played a minor role for estimates of heritability and familial aggregation. This result is corroborated by two additional findings. First, differences in the magnitude of opioid-mediated aversive or reinforcing effects were not related to differences in plasma concentrations when considering all study participants. Second, differences in the magnitude of opioid-mediated effects within twin pairs were not related to differences in plasma concentration within twin pairs. Exemplary data for parameters of respiratory depression are shown in figure 6.
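
The second check, relating within-pair differences in effect to within-pair differences in plasma concentration, can be sketched with a plain squared Pearson correlation. The twin-pair values and units below are hypothetical, and the helper names are illustrative. The text does not specify the exact regression used, so this is a sketch of the idea and not the study's procedure.

```python
def within_pair_differences(pairs):
    """Difference (twin 1 minus twin 2) for each twin pair."""
    return [first - second for first, second in pairs]


def r_squared(x, y):
    """Squared Pearson correlation between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r squared is undefined for a constant sequence")
    return sxy * sxy / (sxx * syy)


# Hypothetical data: (twin 1, twin 2) plasma concentrations in ng/ml and
# changes in respiratory rate in breaths/min for four twin pairs.
concentration_pairs = [(95, 80), (70, 88), (102, 99), (60, 75)]
rate_change_pairs = [(-4, -6), (-2, -1), (-7, -3), (-5, -5)]

d_conc = within_pair_differences(concentration_pairs)
d_rate = within_pair_differences(rate_change_pairs)
print(f"within-pair r squared = {r_squared(d_conc, d_rate):.3f}")
```

A value close to zero, as reported for the study data in figure 6, would indicate that twins who differed more in plasma concentration did not differ more in drug effect.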

Fig. 6. The figure depicts changes in respiratory parameters during the infusion of alfentanil. A illustrates that the reduction in respiratory rate was not related to the plasma concentration within the range of studied plasma concentrations (r² < 0.01). The inset graph demonstrates that within twin-pair differences in the reduction of respiratory rate were not related to within-pair differences in plasma concentrations (r² < 0.01). B illustrates that the increase in carbon dioxide was not related to the plasma concentration within the range of studied plasma concentrations (r² < 0.01). The inset graph demonstrates that within twin-pair differences in the increase of carbon dioxide were not related to within-pair differences in plasma concentrations (r² < 0.01). These findings indicate that plasma concentrations were not a relevant factor affecting estimates of heritability and familial aggregation.

Sedation: No heritability was detected for opioid-mediated increases in subjective sedation scores or alterations in cognitive speed. Although 29% of the response variance associated with subjective sedation scores was explained by significant familial effects, no such effects were detected for alterations in cognitive speed.

Nausea, pruritus, and dizziness: Significant heritability was detected for opioid-induced nausea. Genetic effects accounted for an impressive 56–59% of observed response variance. Although no significant heritability was detected for pruritus and dizziness, both phenotypes were significantly aggregated in families. For pruritus, familial effects accounted for 17–38% of observed response variance. For dizziness, familial effects accounted for 32–39% of observed response variance.

Reinforcing effects: Significant heritability was detected for the disliking of the drug. Genetic effects accounted for 36% of observed response variance. Although no significant heritability was detected for drug liking, significant familial effects accounted for 23–26% of observed response variance.
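
For readers less familiar with twin designs, the distinction between heritability and familial aggregation can be illustrated with Falconer's classical approximation, which splits variance into additive genetic (A), shared environmental (C), and unique environmental (E) components from the correlations observed in monozygotic and dizygotic pairs. This is a textbook simplification and is not presented as the model behind the estimates reported here. The twin correlations in the example are hypothetical.

```python
def falconer_components(r_mz: float, r_dz: float):
    """Falconer's approximation from twin correlations.

    r_mz: trait correlation within monozygotic pairs
    r_dz: trait correlation within dizygotic pairs
    Returns (a2, c2, e2): additive genetic, shared environmental, and
    unique environmental shares of variance. Estimates can fall outside
    [0, 1] when the simple model does not fit the data.
    """
    a2 = 2.0 * (r_mz - r_dz)   # heritability
    c2 = 2.0 * r_dz - r_mz     # shared (family) environment
    e2 = 1.0 - r_mz            # unique environment plus measurement error
    return a2, c2, e2


# Hypothetical twin correlations, not taken from the study.
a2, c2, e2 = falconer_components(r_mz=0.6, r_dz=0.4)
familial = a2 + c2  # familial aggregation combines genetic and shared effects
print(f"A = {a2:.2f}, C = {c2:.2f}, E = {e2:.2f}, familial = {familial:.2f}")
```

In this simplified picture, familial aggregation (A + C) can be substantial even when the genetic share A cannot be separated from the shared environment, which is the situation described above for pruritus, dizziness, and drug liking.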

The use of opioids has grown dramatically over the past decades due to the increased attention of health care providers to pain-related suffering. Similarly, our appreciation of factors limiting the utility of opioids has grown. Problems that often manifest acutely after initiation of opioid therapy include nausea, sedation, pruritus, and respiratory depression. When mild, these effects may simply be nuisances that can be addressed with adjustments in opioid dosing, the use of alternative opioid formulations, and the addition of other agents. On the opposite end of the spectrum, severe sedation and respiratory depression can result in patient injury and death. Moreover, opioid abuse and addiction have become very problematic. Lacking from our current knowledge is an understanding of the relative importance of genetic and environmental factors that underlie patients' susceptibility to experience aversive opioid effects and develop abusive behavior. We used a twin study paradigm and an experimental laboratory setting to provide quantitative estimates of the overall genetic and environmental contributions to aversive and affective opioid effects and to quantify the influence of important covariates on these effects. Aversive and affective opioid effects were secondary outcomes of a larger dataset. Consequently, the statistical analysis did not require adjusting P values for the number of reported outcomes. However, exact P values are reported, which allows independent assessment for potential type I errors.

We observed a wide diversity of responses for most of the studied phenotypes. In particular, significant heritability was documented for opioid-mediated respiratory depression, nausea, and drug disliking. With the exception of the trail-making test quantifying cognitive speed, all other outcomes including sedation, pruritus, dizziness, and drug liking were significantly aggregated in families. Both genetic and shared environmental effects contribute to the finding of familial aggregation; thus, failure to detect heritability per se for these outcomes does not preclude relevant genetic effects. In particular, mild to moderate genetic effects may have gone undetected considering the size of our study.

Nausea related to the use of opioids is particularly problematic, both in the acute and more chronic setting. After surgery, nausea delays discharge from the recovery room and results in unanticipated hospital admissions.37,38 Furthermore, nausea in postoperative periods while using opioids is common and often necessitates intervention.39,40 Our results suggest that the interindividual variability in opioid-mediated nausea is highly heritable, with more than 50% of the response variance attributable to genetic factors.

A limited amount of existing genetic data demonstrate that specific gene variants may be associated with opioid-induced nausea. A small study in cancer patients found that variants of UGT2B7 (UDP glucuronosyltransferase 2 family, polypeptide B7) were associated with higher levels of opioid-induced nausea, whereas variants of ABCB1 were associated with the frequency of vomiting.14 In a much larger multicenter study also involving cancer patients on chronic opioid therapy, variants of HTR3B, COMT, and CHRM3 were significantly associated with nausea.16 The 5-HT3 association is particularly plausible as this receptor is the target for efficacious antiemetic drugs. In agreement with some of the results in cancer patients, a more recent genome-wide association study in surgical patients also reported a significant association between a variant of CHRM3 and nausea.17 Finally, a study in patients receiving morphine for postoperative pain found that an interaction between the A118G variant of OPRM1 and the G1947A variant of COMT was associated with reduced levels of nausea.12

The chief barriers to the aggressive use of opioids are concerns regarding their respiratory depressant effects. For example, most studies reviewing patient-controlled analgesia protocols indicate an incidence of approximately 0.5% for severe respiratory depression requiring administration of an opioid antagonist.3,4 Incidences tend to be higher if decreases in respiratory rate, hypoxia, or hypercapnia are used as measures of respiratory depression.3 Moreover, the rapidly expanding use of opioids for the control of chronic pain has been associated with an equally rapid increase in the number of emergency department visits related to opioid overdose and, more concerningly, deaths from accidental overdose.41 We found that approximately 30% of interindividual variance in opioid-induced respiratory depression as measured by changes in respiratory rate was heritable and therefore attributable to genetic factors. Similarly, approximately 30% of the response variance in opioid-induced elevations of transcutaneous carbon dioxide was aggregated in families. Few gene association studies have carefully quantified opioid-mediated respiratory depression. One study indicated that homozygous carriers of the G allele of the A118G variant of OPRM1 experienced less respiratory depression at equianalgesic plasma concentrations of alfentanil compared with carriers of the A allele.6 However, a previous study administering the morphine metabolite morphine-6-glucuronide failed to identify an effect of the A118G variant of OPRM1 on respiratory depression. It is noteworthy that this small study of 16 subjects did not include any homozygous carriers of the G allele.8 More recently, a study administering an intravenous bolus of fentanyl in 189 patients after laparoscopic surgery also failed to detect a clinically relevant correlation between the A118G variant of OPRM1 and opioid-induced respiratory depression.7 Thus it appears that although genetics may contribute significantly to interindividual differences in opioid-mediated respiratory depression, we hardly understand the basis underlying these differences.

The abuse potential of opioids prescribed for pain control has come to the forefront of interest due to the rapidly escalating rate of abuse and accidental overdose.19,41 Addiction to opioids is heritable, and genetic studies have been designed to address the specific molecular underpinning.20,42 Although our study paradigm did not allow direct study of the complex clinical phenotype of opioid addiction, we were able to precisely measure acute reinforcing effects such as the liking and disliking of the drug. Liking in response to acute opioid administration is an established index phenotype to predict abuse potential, whereas disliking upon first exposure is associated with lack of abuse.21 Scales of liking and disliking are included in questionnaires such as the Drug Effect Questionnaire that are used to assess abuse potential.43,44 Drug liking has been assessed to quantify abuse potential of several opioids including heroin, morphine, buprenorphine, oxycodone, fentanyl, and remifentanil.21,22,45,46 Our results indicate that drug liking is significantly aggregated in families, although heritability could not be established. We observed that “maximum” liking was less influenced by other subjective opioid effects than “average” liking. Maximum liking may therefore be a less convoluted and perhaps preferable measure to assess positive reinforcing opioid effects. On the other hand, disliking was significantly heritable, suggesting that genetics may contribute to mechanisms protective against the abuse of opioids. Furthermore, disliking was correlated with nausea, which seems quite plausible. We suggest that opioid disliking may constitute a useful and easily measurable index phenotype to assess the abuse potential of opioids in future research.

Another dimension of data analysis made possible by our study design concerns the effect of major covariates on measured outcomes. This analysis also eliminated confounding influences of covariates such as age and sex on estimates of heritability and familial aggregation when assuming similar influences of genetic and common environmental effects on covariate strata. However, the number of twins enrolled in our study precluded formal analysis of this assumption. Age and sex were the covariates most commonly affecting aversive and reinforcing opioid effects. Age was associated with greater respiratory depression and drug-induced slowing of cognitive speed. Such observations are consistent with previous reports.47,48 Likewise, advanced age was associated with greater drug disliking, which is consistent with lower rates of opioid abuse in aging patients with chronic pain.49 Women reported more pruritus and dizziness during the infusion of alfentanil. However, there was no detectable effect on nausea of female sex, which is a well-established risk factor for nausea when using opioids in pain management.50 On the other hand, our analysis did reveal some novel findings. Asians displayed much higher VAS dizziness scores than did members of the other races. We also found significantly greater drug liking in Caucasians and non-Hispanics. Interestingly, studies focused on prescription opioid abuse have identified Caucasian race as a risk factor.51,52 Finally, the measured alfentanil plasma concentration was not a significant covariate for any of the studied phenotypes. This finding suggests that substantial pharmacodynamic variability concealed any pharmacokinetic variability within the range of studied plasma concentrations. Inspection of our data supports this conclusion as the range of measured plasma concentration was 3.7-fold, whereas the range of observed pharmacodynamic changes was substantially greater than 10-fold.

In summary, we provide estimates for the global genetic and environmental contributions to a range of common and clinically important aversive and reinforcing opioid effects. We also report on the frequency, variability, and magnitude of these effects as well as their modulation by a series of important covariates. To our knowledge this is the first study to amass a broad array of quantitative data characterizing acute opioid effects under carefully controlled and laboratory-type conditions. We also demonstrated that results were typically not affected in relevant ways by placebo responses or sequential effects. Laboratory-type procedures as described here may therefore provide an excellent paradigm for future studies examining the molecular genetics of individual opioid response profiles.

Fig. 2. Respiratory depression was assessed by measuring opioid-induced decreases in respiratory rate (RR) and increases in transcutaneous carbon dioxide (tc-CO2). Results are ranked from smallest to largest along the x-axis. The interindividual differences in drug-induced changes in respiratory rate (A) varied widely and ranged from −12 to 3 breaths/min. Similarly, the increase in tc-CO2 varied widely (B), being absent in some and increasing by up to 18.6 mmHg in others. The solid and dashed lines indicate the median and the interquartile range, respectively.
