Extracting Personality Data from Linn’s Texts

The final leg of the Linn project turned out to be more exciting than I anticipated. Jingya and I were tasked with creating one or more visualizations by analyzing Linn’s diary entries or letters. Of course, every visualization needs some underlying data, which in our case was Linn’s personality data. A major part of our challenge was figuring out how to generate this data from his diary entries. Jingya and I thought we could employ an existing web Application Programming Interface (API). In layman’s terms, an API is a set of rules through which a program sends an input to a service and receives the desired output in a structured format. In our scenario, the input was the text from the journal, and the output was the personality data associated with that text.

We started off slowly because we could not find a freely available API that met our needs. We eventually stumbled upon IBM Watson’s Personality Insights Service. As stated on their website, “Personality Insights extracts and analyzes a spectrum of personality attributes to help discover actionable insights about people and entities.” This API was what we needed; however, we were required to use a server-side technology called Node.js, which neither of us was familiar with.

My primary task then was to set up a skeleton Node.js application. As I soon discovered to my disappointment, Node.js, being a server-side technology, does not have direct access to the text displayed on web pages. So I wrote a script (shown below) that stores the text in a form I could easily feed into my Node.js application.

Script (left) added to the Journal website to produce a dictionary (right) mapping each date to its diary entry content
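
For readers who want to try something similar, here is a minimal sketch of that kind of script, not the exact one in the figure. It assumes each diary entry on the page is an element that carries its date in a `data-date` attribute; the function name `buildEntryDictionary`, the attribute, and the `.entry` selector in the usage comment are all hypothetical.

```javascript
// Hypothetical sketch: build a dictionary from date to diary entry text.
// Assumes each entry node exposes its date as node.dataset.date (an
// assumption about the page markup, not something the Journal site guarantees).
function buildEntryDictionary(entryNodes) {
  const dictionary = {};
  for (const node of Array.from(entryNodes)) {
    const date = node.dataset.date;
    // Collapse runs of whitespace so the text is easy to pass to an API later.
    const text = node.textContent.replace(/\s+/g, " ").trim();
    // Skip nodes that are missing either a date or any text.
    if (date && text) {
      dictionary[date] = text;
    }
  }
  return dictionary;
}

// In a browser console on the journal page (the selector is an assumption):
// const entries = buildEntryDictionary(document.querySelectorAll(".entry"));
// console.log(JSON.stringify(entries, null, 2));
```

The printed JSON can then be saved to a file and read by the Node.js application, which is one way to bridge the gap between the page in the browser and the code on the server.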

Once I had access to the text in a readily parsable format, I used Node.js to make API calls to retrieve the personality data for each of Linn’s diary entries. Initially, I was skeptical about the number of API calls I would be allowed to make with the free-tier subscription. Fortunately, I was able to make enough calls to test my application and store the needed data as a JSON file. Jingya is using this data to create a dynamically generated visualization using p5 and/or d3.

Node.js script to automate retrieval and storage of personality data for all diary entries
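
The overall shape of such a script can be sketched as follows. This is an illustration under stated assumptions rather than the script in the figure: `collectProfiles`, `saveProfiles`, and `fetchProfile` are hypothetical names, and `fetchProfile` stands in for the actual Personality Insights request, which is deliberately not shown here.

```javascript
const fs = require("fs");

// Hypothetical helper: profile every entry in turn and collect results by date.
// `fetchProfile` is a placeholder for the real API call; it is assumed to take
// the entry text and return a promise that resolves to the personality data.
async function collectProfiles(entries, fetchProfile) {
  const profiles = {};
  for (const [date, text] of Object.entries(entries)) {
    try {
      // One call at a time keeps the request rate low on a free-tier plan.
      profiles[date] = await fetchProfile(text);
    } catch (err) {
      // Record the failure for this entry and carry on with the rest.
      profiles[date] = { error: String(err.message || err) };
    }
  }
  return profiles;
}

// Write the collected profiles to disk as pretty-printed JSON.
function saveProfiles(profiles, outputPath) {
  fs.writeFileSync(outputPath, JSON.stringify(profiles, null, 2));
}

// Example wiring (file names are assumptions):
// const entries = JSON.parse(fs.readFileSync("entries.json", "utf8"));
// collectProfiles(entries, fetchProfile)
//   .then((profiles) => saveProfiles(profiles, "profiles.json"));
```

Passing the API call in as a function keeps the loop easy to test with a stub, so the limited free-tier calls are spent only on real runs.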

My experience working on this project was fascinating. I got to learn a well-established and popular technology, something I did not expect to do at the outset of this course. I think I did a fair job of familiarizing myself with Node.js and the Personality Insights API. However, I could have saved myself some trouble had I read the Node.js documentation carefully. I tried a few hacks to access the web page content from Node.js, but could not find a workaround. Overall, I have picked up valuable skills such as transcription, text encoding and analysis over the course of this project, and I look forward to applying them in my future endeavors.