Benjamin A Goldstein

Abstract

Electronic health record (EHR) data are becoming a primary resource for clinical research. Compared to traditional research data, such as those from clinical trials and epidemiologic cohorts, EHR data have a number of appealing characteristics. However, because EHRs have no mechanisms in place to ensure that the appropriate data are collected, they also pose a number of analytic challenges. In this paper, we illustrate that how a patient interacts with a health system influences which data are recorded in the EHR. These interactions are typically informative, potentially resulting in bias. We term the overall set of induced biases informed presence. To illustrate these biases, we use examples from EHR-based analyses. Specifically, we show that: 1) where a patient receives services within a health facility can induce selection bias; 2) which health system a patient chooses for an encounter can result in information bias; and 3) referral encounters can create an admixture bias. While addressing these biases is often straightforward, it is important to understand how they are induced in any EHR-based analysis.