Compared with that of physical conditions, the quality of care for mental health disorders remains poor, and the rate of improvement in treatment is slow. Understanding of the efficacy of psychotherapy is limited by a lack of both large-scale data and quantifiable measures of therapy content. We present an approach that uses a deep-learning model to automatically categorise the content (i.e. natural language) of cognitive behavioural therapy (CBT) sessions. The model was applied to approximately 90,000 transcripts of internet-enabled, message-based CBT. Using this quantifiable measure of treatment, we determine the relationship between the “dose” of different aspects of therapy delivered and clinical outcomes. The approach represents a significant advance in developing a data-driven understanding of the treatment of mental health conditions, with implications for monitoring and standardising clinical practice as well as enhancing the efficacy of psychotherapy.