
Psychotherapies PubMed battle

Introduction

Today's mission was to visualize publication frequencies over the years for different psychotherapies using PubMed and R. I chose to concentrate on Psychodynamic Therapy (PDT), Cognitive Behavior Therapy (CBT), Acceptance and Commitment Therapy (ACT) and Mindfulness. I know there is an R function to query PubMed, but I didn't use it; instead I searched PubMed manually and saved the results as four separate csv files.

PubMed search filter
I used a really rudimentary search filter:

“cognitive behavior therapy” OR “cognitive therapy” (13245 hits)

“psychodynamic therapy” OR “psychoanalytic therapy” (13933 hits)

Mindfulness (1099 hits)

“Acceptance and commitment therapy” (374 hits)

Writing the R code

So, after I'd saved the data files from PubMed to my computer, I switched over to R and loaded the necessary packages.

library(stringr) # used for str_extract()
library(ggplot2) # used for plotting
library(plyr) # used for count()

Then I wrote a short function to extract year and frequency from the csv file. Since publication year wasn't under its own header, I used str_extract() to pull that information out of the text.

# Create a function named scrape.pubmed.
# It expects the loaded csv file to be in a data frame called pubmed_result.
scrape.pubmed <- function() {
  # We only need column 5
  results <- pubmed_result[5]
  # The header is repeated regularly, so I remove those rows.
  # (Row-subsetting a one-column data frame returns a plain vector.)
  results <- results[results$Details != "ShortDetails", ]
  # The remaining field contains the journal name in addition to pub. year.
  # The regular expression catches values 1900--1999 OR 2000--2019.
  # Most rows had year information at the end of the string,
  # but not all rows had this. Therefore I used this regular expression to
  # minimize mismatches.
  year <- as.numeric(str_extract(as.character(results), "(19[0-9]{2}|20[0-1][0-9])"))
  # Use the count() function to group years together.
  # I didn't want values from 2012, hence the < 2012 condition.
  # Rows without a recognizable year (NA) are dropped as well.
  counts <- count(year[!is.na(year) & year < 2012])
  # Return the counts (columns: x = year, freq = number of hits)
  return(counts)
}
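
A quick way to see what the regular expression picks up is to run it on a few strings. The sketch below uses made-up strings (not rows from the actual PubMed files) and only assumes that the field looks roughly like "journal name, then year". Note that str_extract() returns the first match in each string, so a four-digit number that looks like a year earlier in the string wins over the real year at the end.

```r
library(stringr)  # str_extract()

# Made-up example strings, invented only to illustrate the pattern
details <- c("Some Journal. 2008",
             "Another Journal. 1994 Spring",
             "No year here",
             "Vol 1999 in the title. 2003")

# Same pattern as in scrape.pubmed()
years <- str_extract(details, "(19[0-9]{2}|20[0-1][0-9])")
print(years)
# "2008" "1994" NA     "1999"
```

The third string has no year and gives NA, and the last one shows the kind of mismatch that can still slip through.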

Now that I had a function to collect my data, I loaded the different csv files I had saved from PubMed and collected year data from them.
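
The loading and plotting steps could look something like the sketch below. This is not necessarily the exact code behind the plot: the helper count_years(), the file names, and the column names year, freq and therapy are all assumptions made for illustration. The helper follows the same logic as scrape.pubmed(), but takes the file path as an argument instead of relying on a global pubmed_result object.

```r
library(stringr)  # str_extract()
library(plyr)     # count()
library(ggplot2)  # plotting

# Hypothetical helper: read one PubMed csv export and count hits per year.
count_years <- function(path, label) {
  pubmed_result <- read.csv(path, stringsAsFactors = FALSE)
  details <- pubmed_result$Details
  # Drop the repeated header rows
  details <- details[details != "ShortDetails"]
  year <- as.numeric(str_extract(details, "(19[0-9]{2}|20[0-1][0-9])"))
  # Keep only rows with a recognizable year before 2012
  year <- year[!is.na(year) & year < 2012]
  counts <- count(year)  # data frame with columns x and freq
  names(counts) <- c("year", "freq")
  counts$therapy <- label
  counts
}

# Hypothetical file names; replace with wherever the exports were saved
files <- c(CBT = "cbt.csv", PDT = "pdt.csv",
           ACT = "act.csv", Mindfulness = "mindfulness.csv")

# Only run the plotting part when the files are actually there
if (all(file.exists(files))) {
  all_counts <- do.call(rbind, Map(count_years, files, names(files)))
  p <- ggplot(all_counts, aes(x = year, y = freq, colour = therapy)) +
    geom_line() +
    labs(x = "Publication year", y = "Number of publications")
  print(p)
}
```

Stacking the four results into one long data frame with a therapy column lets ggplot2 draw one line per therapy from a single geom_line() call.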

Results

We can see that PDT really declined in the early 90s, while CBT started its rise and then skyrocketed in the 21st century. PDT also increased in the 21st century, but it has a heck of a lot of work to do before it can match CBT's publication output. The plot also shows that mindfulness had a surge and that its output is now on par with PDT.