Abstract

Early detection of patients with chronic diseases at risk of developing persistent pain is clinically desirable for timely initiation of multimodal therapies. Quality follow-up registries may provide the necessary clinical data; however, their design is not focused on a specific research aim, which poses challenges for the data-analysis strategy. Here, machine learning was used to identify early parameters that provide information about the future development of persistent pain in rheumatoid arthritis (RA). Data of 288 patients were queried from a registry based on the Swedish Epidemiological Investigation of RA (EIRA). Unsupervised machine learning identified three distinct patient subgroups with low, median and high persistent pain intensities. Next, supervised machine learning, implemented as random forests followed by computed ABC analysis-based item categorization, was used to select predictive parameters among 21 different demographic, patient-rated and objective clinical factors. The selected parameters were used to train machine-learned algorithms to assign patients to pain-related subgroups (1,000 random resamplings, 2/3 training, 1/3 test data). Algorithms trained with three-month data of patient global assessment and health assessment questionnaire provided pain group assignment at a balanced accuracy of 70 %. When restricting the predictors to objective clinical parameters of disease severity, swollen joint count and tender joint count acquired at three months provided a balanced accuracy of 59 %. Results indicate that machine learning is suited to extract knowledge from data queried from pain- and disease-related registries. Early functional parameters of RA are informative for the development and degree of persistent pain.
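
For readers unfamiliar with the evaluation metric, the following minimal sketch illustrates how balanced accuracy (the mean of the per-class recalls) and a single random 2/3 versus 1/3 split could be computed. It is an illustration only, not the authors' implementation: the function names, the toy labels and the coding of the pain subgroups as 0, 1 and 2 are assumptions introduced here, and the classifier itself is omitted.

```python
import random
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally regardless of size."""
    hits = defaultdict(int)    # correctly predicted cases per true class
    totals = defaultdict(int)  # number of cases per true class
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def split_two_thirds(n_samples, rng):
    """Return (train_idx, test_idx) for one random 2/3 vs 1/3 split."""
    idx = list(range(n_samples))
    rng.shuffle(idx)
    cut = (2 * n_samples) // 3
    return idx[:cut], idx[cut:]


# Hypothetical toy labels: 0 = low, 1 = median, 2 = high persistent pain
y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 0]
score = balanced_accuracy(y_true, y_pred)

# One of the repeated resamplings for a cohort of 288 patients (seed is arbitrary)
train_idx, test_idx = split_two_thirds(288, random.Random(0))
```

In a resampling scheme such as the one described above, the split and the scoring would be repeated many times and the resulting balanced accuracies summarized, for example by their median.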