
1Stroke Program, University of Michigan, Ann Arbor, MI; 2Sleep Disorders Center, University of Michigan, Ann Arbor, MI; 3Department of Epidemiology, University of Michigan School of Public Health, Ann Arbor, MI

ABSTRACT

Study Objectives:

As the importance of portable monitors for the detection of sleep apnea increases, efficient and cost-minimizing methods for data interpretation are needed. In stroke patients, for whom portable studies often have particular advantages, we sought to compare results from a cardiopulmonary monitoring device with and without manual edits by a polysomnographic technologist.

Methods:

Participants in an ongoing stroke surveillance study in Corpus Christi, Texas, underwent sleep apnea assessments with the ApneaLink Plus device within 45 days of stroke onset. Recordings were analyzed by the device's software unedited, and again after edits were made to the raw data by a registered polysomnographic technologist. Sensitivity and specificity were calculated, with the edited data as the reference standard. Sleep apnea was defined by 3 different apnea-hypopnea index (AHI) thresholds: ≥ 5, ≥ 10, and ≥ 15.

Results:

Among 327 subjects, 54% were male, 59% were Hispanic, and the median age was 65 years (interquartile range: 57, 77). The median AHI for the unedited data was 9 (4, 22), and for the edited data was 13 (6, 27) (p < 0.01). Specificity was above 98% for each AHI cutoff, while sensitivity was 81% to 82%. For each cutoff threshold, the edited data yielded a higher proportion of positive sleep apnea screens (p < 0.01), by approximately 10 percentage points in each group.

INTRODUCTION

Although polysomnography remains the gold standard for evaluation of sleep apnea, the procedure entails expense, wait times, and labor intensity that have motivated development and testing of unattended portable cardiopulmonary monitors. Portable devices are also more practical than polysomnography in particular clinical situations, such as during an acute stroke hospitalization, in which inpatient resources are limited and patients may be intolerant to elaborate testing.1 As sleep apnea is a risk factor for stroke2 and a risk factor for poor outcomes after stroke,3 further studies of sleep apnea after stroke are important.

The ApneaLink device, a portable cardiopulmonary monitoring device, has been validated against full polysomnography4–11 and has been used in several clinical research studies, including a large international clinical trial.1,12 While clinical guidelines for the use of portable monitoring state that the devices “must allow for the display of raw data for manual scoring or editing of automated scoring by a trained and qualified sleep technician/technologist,”13 little is known about the value of such a review for some of the most commonly used recording systems, such as the ApneaLink, and no study has addressed this issue in stroke patients. Furthermore, a technologist's review of the raw data from home monitoring can be both costly and time-consuming. The relative value units (RVUs) that determine practice expense reimbursement for home studies are approximately one-tenth of those assigned for laboratory studies,14 which can make detailed technologist review of respiratory signals less feasible. We therefore sought to compare the results from the ApneaLink Plus device with and without manual edits by a sleep technologist. We focused on stroke patients, for whom rapid access to simple and tolerable monitoring in an acute or subacute setting is a particularly high priority.

BRIEF SUMMARY

Current Knowledge/Study Rationale: The value of a polysomnographic technologist's review of raw data from a portable respiratory monitor is unknown. This study compared results from a commonly used portable respiratory monitor, the ApneaLink Plus, in stroke patients, before and after raw data were edited by a polysomnographic technologist.

Study Impact: This study showed that for cohorts of stroke patients, ApneaLink Plus data may be used without the added expense and time of review and editing by a polysomnographic technologist. However, if ApneaLink Plus results are to be used for decisions about an individual patient, for clinical or research purposes, the raw data should be edited by a polysomnographic technologist; such edits lead to a higher proportion of positive screening test results.

METHODS

Acute stroke patients who enrolled in the Brain Attack Surveillance in Corpus Christi (BASIC) project were screened for participation. The BASIC project is an ongoing stroke surveillance project in Nueces County, in south Texas. The BASIC methods have been published previously.15–17 Ischemic stroke and intracerebral hemorrhage patients enrolled in the parent BASIC project were eligible for sleep apnea screening if they met none of the following exclusion criteria: current use of supplemental oxygen, current mechanical ventilation or other positive pressure ventilation, or pregnancy. Patients with predominantly central sleep apnea were not excluded, as the large majority of sleep apnea after stroke is obstructive, and many patients with central sleep apnea have an obstructive component as well.18 All subjects were enrolled within 45 days of stroke symptom onset. The study was approved by the institutional review board of the University of Michigan and the two Corpus Christi hospital systems. Subjects or their proxy provided written informed consent to participate.

Sleep apnea assessment was performed in the subject's hospital room, at home, or at a nursing home, with the ApneaLink Plus device. This device monitors nasal pressure with a nasal cannula, and oxygen saturation and pulse with a flexible oxygen saturation probe. The unit differentiates central from obstructive events with the use of an effort belt. The device was administered to subjects by trained study personnel. Data were downloaded from the device and automatically analyzed by the ApneaLink Plus software (“unedited data”) using its “AASM criteria,” meaning that hypopneas were defined as a decrease in nasal pressure ≥ 30% for ≥ 10 sec, if followed by ≥ 4% oxygen desaturation. Unedited scores were recalculated for a convenience sample of cases in which the oximetry data were poor for ≥ 5% of the recording. In this exploratory analysis, hypopneas were defined as a reduction in airflow of ≥ 50% for ≥ 10 sec (“classic criteria”).

The data were edited by a single registered polysomnographic technologist with 28 years of clinical and research experience, whose recent inter-scorer reliability for respiratory events, averaged over 600 epochs, was 90% against the AASM gold standard scorers; the editing was supervised by a board-certified sleep physician (R.D.C.). Specifically, the technologist reviewed the raw data and edited it manually to adjust the start and stop times to eliminate artifacts, adjust poor quality data and reactivate some data for scoring through sensitivity adjustment, and reclassify small numbers of apneic events that were not scored in a manner fully consistent with AASM 2007 guidelines (“edited data”).19 After this review, the software automatically recalculated results, using the newly defined recording periods and any reclassified apneas and hypopneas accepted “as is.” Otherwise, the software used default criteria and settings, as follows5,8: an apnea was identified by the software when nasal pressure fell ≥ 80% compared to baseline for ≥ 10 sec. A hypopnea was identified based on the AASM criteria above or, if oximetry data were intermittent and unreliable throughout the recording based on the technologist's assessment, the classic criteria (applied by the technologist in 21 cases). In practice, negligible numbers of apneas had been automatically scored in a false-positive manner because of the setting that required only an 80% airflow decrement rather than the 90% AASM 2007 criterion.19 However, upon technologist review, occasional hypopneas were rescored as apneas, and rare central apneas were rescored as obstructive apneas. Precise numbers of these corrections were not tracked.
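The scoring thresholds described above can be summarized in a brief sketch. This is a simplified illustration, not the ApneaLink Plus software's actual algorithm: the function name and its inputs (fractional flow reduction, event duration, and associated desaturation) are hypothetical.

```python
# Illustrative sketch only -- not the ApneaLink Plus software's real algorithm.
# Inputs are simplified, hypothetical summaries of one candidate event:
#   flow_drop    : reduction in nasal pressure as a fraction of baseline
#   duration_sec : event duration in seconds
#   desat_pct    : associated fall in SpO2, in percentage points

def classify_event(flow_drop, duration_sec, desat_pct, aasm=True):
    """Return 'apnea', 'hypopnea', or None for one candidate event."""
    if duration_sec < 10:          # all events require >= 10 s duration
        return None
    if flow_drop >= 0.80:          # device default: >= 80% flow decrement
        return "apnea"
    if aasm:
        # "AASM criteria": >= 30% flow decrease with >= 4% desaturation
        if flow_drop >= 0.30 and desat_pct >= 4:
            return "hypopnea"
    else:
        # "classic criteria": >= 50% flow decrease, no oximetry requirement
        if flow_drop >= 0.50:
            return "hypopnea"
    return None
```

The `aasm` flag mirrors the study's fallback to the classic criteria when oximetry data were unreliable.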

If 2 h of interpretable data were not available, a second night of testing was offered to the subject. Twenty-five subjects had insufficient data without a second successful study and thus are not part of this report. The quality of each remaining study was rated subjectively by the technologist as good (all channels worked well for the majority of the total recording time), fair (nasal pressure, chest excursion, and oximetry worked well simultaneously for ≥ 2 h in total), or poor (any study that did not qualify as good or fair). The apnea-hypopnea index (AHI) was calculated as the sum of apneas and hypopneas per hour of recording. The oxygen desaturation index was calculated as the sum of desaturations ≥ 4% below baseline per hour.
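The index arithmetic is straightforward; the following sketch uses illustrative variable names, not those of the study software.

```python
# Sketch of the index definitions above; names are illustrative.

def ahi(n_apneas, n_hypopneas, recording_hours):
    """Apnea-hypopnea index: apneas + hypopneas per hour of recording."""
    return (n_apneas + n_hypopneas) / recording_hours

def odi(n_desats_ge_4pct, recording_hours):
    """Oxygen desaturation index: >= 4% desaturations per hour."""
    return n_desats_ge_4pct / recording_hours

# Example: 40 apneas and 50 hypopneas over a 9-hour recording
print(ahi(40, 50, 9.0))   # 10.0 events/h -> a positive screen at AHI >= 5 or >= 10
```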

Statistical Analysis

Descriptive statistics were used to assess baseline characteristics. AHI values from the two methods were compared with a Wilcoxon signed-rank test for paired data, for all subjects and by data quality category. While ApneaLink Plus results are not a reference standard for the diagnosis of sleep apnea, the edited data serve as a reference standard for the unedited data. We therefore calculated standard measures of accuracy for the unedited results, with the edited results as the reference. Subjects were categorized by sleep apnea status, based on both edited and unedited data, at 3 commonly used AHI cutoff thresholds for the ApneaLink Plus (≥ 5, ≥ 10, ≥ 15) and, to assess greater degrees of severity, at ≥ 20 and ≥ 30; the resulting classifications were compared with McNemar χ2 tests. The area under the receiver operating characteristic curve (AUC), with 95% confidence intervals, was calculated for each edited AHI cutoff.20 Alternative cutoffs for the unedited AHI were also explored: the optimal unedited AHI cutoff was selected as the one that maximized the sum of sensitivity and specificity. A Bland-Altman plot of the absolute difference in AHI (edited − unedited) against the mean of the unedited and edited AHI was inspected. In an exploratory analysis, the main analyses were repeated for the convenience sample of cases (n = 23) in which the classic hypopnea criteria were applied to the unedited data and compared with the edited data. Analyses were performed with TIBCO Spotfire S+ 8.1 for Windows and R version 2.13.1.
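As a sketch of the accuracy calculations described above (sensitivity and specificity of the unedited AHI against the edited AHI as reference, and the unedited cutoff that maximizes their sum), the following uses made-up AHI pairs rather than study data:

```python
# Illustrative sketch; AHI values below are invented, not study data.

def sens_spec(unedited, edited, cutoff_unedited, cutoff_edited):
    """Sensitivity and specificity of the unedited AHI, with the edited
    AHI (dichotomized at cutoff_edited) as the reference standard."""
    pairs = list(zip(unedited, edited))
    tp = sum(u >= cutoff_unedited and e >= cutoff_edited for u, e in pairs)
    fn = sum(u < cutoff_unedited and e >= cutoff_edited for u, e in pairs)
    tn = sum(u < cutoff_unedited and e < cutoff_edited for u, e in pairs)
    fp = sum(u >= cutoff_unedited and e < cutoff_edited for u, e in pairs)
    return tp / (tp + fn), tn / (tn + fp)

def optimal_cutoff(unedited, edited, cutoff_edited, candidates):
    """Unedited-AHI cutoff maximizing sensitivity + specificity."""
    return max(candidates,
               key=lambda c: sum(sens_spec(unedited, edited, c, cutoff_edited)))

# Invented example: paired unedited/edited AHIs for 7 subjects
unedited = [3, 4, 6, 9, 12, 14, 20]
edited = [4, 6, 9, 12, 15, 18, 25]
print(sens_spec(unedited, edited, 10, 10))      # sensitivity, specificity at AHI >= 10
print(optimal_cutoff(unedited, edited, 10, [5, 8, 10]))
```

Lowering the unedited cutoff below the edited one recovers the borderline cases the autoscore misses, which is the logic behind the alternative-cutoff analysis.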

RESULTS

Baseline characteristics of the 327 subjects are found in Table 1. Nineteen of the tracings were from a study repeated due to insufficient data. The ApneaLink Plus tests were performed a median of 13 (IQR: 6, 20) days after stroke onset. About half of subjects were male (54%), and a little over half were Hispanic (59%); the median age was 65 (57, 77). The median duration of analyzable recording from the unedited data was 8 h 59 min (IQR: 6 h 34 min, 10 h 33 min). Recordings were rated by the polysomnographic technologist as good (50%), fair (32%), or poor (18%).

Table 1

One recording had no oximetry data and thus was excluded from the main analyses. Unedited and edited raw data are presented in Figure 1. The median AHI for the unedited data was 9 (4, 21) and for the edited data was 13 (6, 27), with a median difference in AHI (edited − unedited) of 3 (1, 5), p < 0.01. The median difference in AHI varied little across quality ratings: good quality (2 [IQR: 1, 4]), fair quality (3 [IQR: 1, 5]), and poor quality (3 [IQR: 2, 6]) studies. The apnea index, hypopnea index, obstructive apnea index, and central apnea index were each higher by 0–2 units (median) in the edited versus unedited data, but the oxygen desaturation index did not differ (Table 2). Specificity was > 98% for all categories defined by the different AHI cutoffs, whereas sensitivity was 81% to 82% (Table 3). For each AHI cutoff, the edited data yielded a higher proportion of positive sleep apnea screens (p < 0.01), by approximately 10 percentage points (Table 3). The AUC (Table 4, Figure 2) suggested excellent discrimination (0.96–0.97). Optimized AHI thresholds for the unedited data, with their associated sensitivities and specificities, are presented by edited AHI cutoff in Table 4. Visual inspection of the scatterplot and Bland-Altman plot (Figure 3) suggested good agreement, though a few outliers existed. The largest differences appeared to occur in subjects with higher AHIs.


Figure 1

Scatterplot of AHI before and after edits of the ApneaLink Plus raw data by a polysomnographic technologist (n = 326).

Figure 2


Figure 3

Bland-Altman plot of the absolute difference in AHI (edited − unedited) against the mean of the unedited and edited AHI (n = 326). The middle dashed line is the mean difference; the two outer lines mark ±2 standard deviations.

Exploratory Analysis

Within the convenience sample of 23 subjects for whom the unedited data were reanalyzed with the “classic” hypopnea definition, the median AHI was 20 (9, 38) for the unedited data and 21 (12, 40) for the edited data, with no significant difference (p = 0.37; Table 2). The median differences in apnea index and obstructive apnea index were higher by 1 unit in the edited versus unedited data, whereas the hypopnea index, central apnea index, and oxygen desaturation index did not differ (Table 2).

DISCUSSION

This prospective observational study in acute and subacute stroke patients shows that the ApneaLink Plus autoscored AHI without prior data edits approximates the autoscored AHI obtained after a registered polysomnographic technologist manually edits the raw data. While the differences were statistically significant, the bias was rarely clinically significant, given a median AHI difference of only 3. Therefore, to assess stroke patient populations for sleep apnea with the ApneaLink Plus device, it may be unnecessary to expend important resources to process the data, especially if the goal is a low proportion of false positives, as our results demonstrated high specificity for the identification of sleep apnea using unedited data. However, if the results are used to make clinical decisions in individual patients, such as to determine referral for polysomnography, or to screen for sleep apnea in research studies, review of the data may be advisable before the autoscore software is applied, given that the proportion of false negatives was almost 20%. Interestingly, the small difference between edited and unedited AHI was not present in the subsample of unedited recordings reanalyzed without oximetry contributions to the hypopnea definition. This may have arisen because any differences in desaturation interpretation were eliminated, perhaps leaving fewer opportunities for discrepancies between edited and unedited versions.

Our data suggest that a small reduction in the AHI threshold applied to unedited ApneaLink data more closely approximates results from the polysomnographic technologist's edited data. The associated improvements in sensitivity are accompanied by very little impact on specificity (see Table 4). However, before the optimized AHI thresholds identified in the current study are applied widely, they should be confirmed in other studies.

Two previous studies compared ApneaLink autoscored and manually scored results with polysomnography, but a direct comparison between auto and manual scores was not reported.21,22 A single-channel ApneaLink study in 95 patients with suspected sleep apnea showed a somewhat higher area under the receiver operating characteristic curve (though all values were above 0.80) at each AHI threshold for the manual scores than for the autoscores, with higher sensitivity for the manual scores and higher specificity for the autoscores in the diagnosis of sleep apnea.21 A second study, in 25 obese adolescents with the ApneaLink Plus, found close agreement between manual and autoscored results when each was compared with polysomnography.22

Strengths of this study include the large sample size, well characterized group of stroke patients, high representation of minorities, and prospective population-based design. Use of a single experienced technologist to score each study ensured consistency that would have been lost with multiple scorers, but may also be viewed as a limitation in generalizability, to the extent that not all technologists would edit the ApneaLink recordings identically. We did not assess the reliability of study edits across more than one scorer, and we did not examine night-to-night variability. We were also unable to compare the ApneaLink Plus results to those of full, laboratory-based polysomnography, as this would have been challenging if not unfeasible just after stroke.1 Thus the sensitivities and specificities presented necessarily focus on differences between unedited and edited ApneaLink data, rather than those between unedited ApneaLink data and polysomnography.

In this study, ApneaLink Plus-generated AHIs were somewhat higher and more likely to produce a positive screen result after the automatically scored data were edited by a polysomnographic technologist. Although overall results may differ little between cohorts of subjects, an underestimate of sleep apnea severity and lower likelihood of a positive sleep apnea screen may be unacceptably frequent for an individual patient if unedited raw data are used. Application of lower cutoffs might be considered if unedited ApneaLink Plus data are used to screen for sleep apnea in stroke patients.

DISCLOSURE STATEMENT

This was not an industry supported study. This work was funded by the NIH (R01 NS070941, R01 HL098065, and R01 NS038916). Dr. Morgenstern has received research support from St. Jude Medical Corporation. Dr. Chervin has participated in research supported through the University of Michigan by Philips Respironics and Fisher Paykel; has consulted for Proctor & Gamble through a contract with the University of Michigan; has consulted for Zansors; and is named in copyrighted material, patents, and patents pending held by the University of Michigan for the assessment and treatment of sleep disorders. The other authors have indicated no financial conflicts of interest. The work was performed at the University of Michigan, Ann Arbor, MI.

BaHammam AS, Sharif M, Gacuan DE, George S. Evaluation of the accuracy of manual and automatic scoring of a single airflow channel in patients with a high probability of obstructive sleep apnea. Med Sci Monit. 2011;17:MT13–19.