Research Projects

On the corner of 41st St. and 8th Ave., in The New York Times (NYT) Building in Manhattan, a team of 13 moderators scours an endless stream of NYT user comments. Led by Bassey Etim, The Times community manager, the team searches both for inflammatory comments to remove and for intriguing comments to feature, referred to as NYT Picks. By removing comments that are inflammatory or off-topic and featuring the NYT Picks, the team seeks to maintain a healthy community and extend the NYT style to its comments sections [1]. Etim explains, “We use real people because humans can absorb the variables of conversation and weigh them in more intricate ways.” [2] Such comment moderation is pioneering, as many publishers have opted to simply disable commenting or to leave their comment sections unmoderated for long periods. The NYT approach has its drawbacks, however: because Etim’s team can only read so many comments at a time, the website has to limit both the time period during which a user may comment on an article and the number of articles open to commenting.

A team of researchers at the University of Maryland (UMD) is working on an interdisciplinary approach, dubbed CommentIQ, to help the NYT’s moderators and to enable comment moderation for other online publishers such as blogs and other news sites. Consisting of researchers from the university’s schools of journalism, information science, and computer science, the team seeks to develop automated techniques and an open source application program interface (API) that may help rather than replace a human moderator’s judgment.

The design of CommentIQ [3, 4] is centered on the idea of a human working with an automated program via a feedback loop. The scores generated by the algorithm provide a ranking and a statistical overview, which are fed to the moderator to assist in decision-making. In turn, the final classification the human moderator makes after reading the comment and viewing the predictions is fed back into the system, helping the algorithm make more accurate score predictions in the future. As a result, the approach enables the professional moderator to “teach” the algorithm, and ideally the algorithm will generate greater cost savings as it learns more from its moderator over time.
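The feedback loop described above can be sketched as a simple online-learning step: the system scores a comment, the moderator makes the final call, and the moderator's decision nudges the model's weights. This is a minimal, hypothetical illustration; the feature names, weights, and perceptron-style update are placeholders, not CommentIQ's actual model.

```python
def score(weights, features):
    """Linear score over comment features: higher suggests a stronger Picks candidate."""
    return sum(weights.get(k, 0.0) * v for k, v in features.items())

def update(weights, features, moderator_label, lr=0.1):
    """After the moderator decides (1 = picked, 0 = rejected), nudge the
    weights toward that decision (a perceptron-style update)."""
    predicted = 1.0 if score(weights, features) > 0 else 0.0
    error = moderator_label - predicted
    for k, v in features.items():
        weights[k] = weights.get(k, 0.0) + lr * error * v
    return weights

# One loop iteration: the system ranks, the human decides, the model learns.
weights = {"writing_quality": 0.0, "relevance": 0.0}
comment_features = {"writing_quality": 0.8, "relevance": 0.6}
weights = update(weights, comment_features, moderator_label=1)  # moderator picked it
```

Each moderator decision becomes a training signal, so the ranking should drift toward the moderator's editorial judgment over time.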

Using natural language processing and predictions made by machine learning algorithms, the approach automatically evaluates a comment on multiple criteria such as general writing quality, personal story, relevance to the article and to other comments, and general sentiment. These evaluations are then merged with other information about the comment -- such as the user's location and the time of submission -- into a user-friendly interface designed by researchers in human-computer interaction. While the design of the interface is still in progress, when complete it will give moderators an easy-to-navigate overview of comments as well as the ability to filter and search the comment data as needed. By allowing moderators to more easily explore the distribution of opinions along various dimensions -- such as geographic location, gender, or political bias -- the interface helps them select representative comments spanning each. Ideally, this will result in a more objective selection process and a comments section that reflects a wide distribution of opinions.
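To make the multi-criteria evaluation concrete, the sketch below scores comments with a few crude heuristics and then applies a moderator-style filter. The criteria (length, keyword relevance, a first-person "personal story" signal) and the threshold are illustrative assumptions, not the UMD system's actual features or models.

```python
def evaluate(comment_text, article_keywords):
    """Score a comment on a few illustrative criteria (all heuristics are placeholders)."""
    words = comment_text.lower().split()
    return {
        "length": len(words),
        # Crude relevance: fraction of article keywords the comment mentions.
        "relevance": sum(1 for k in article_keywords if k in words) / max(len(article_keywords), 1),
        # Crude personal-story signal: rate of first-person pronouns.
        "personal": sum(1 for w in words if w in {"i", "my", "me"}) / max(len(words), 1),
    }

comments = [
    {"text": "I moved here in 1990 and my rent has tripled.", "location": "NY"},
    {"text": "Great article.", "location": "CA"},
]
keywords = ["rent", "housing"]

for c in comments:
    c["scores"] = evaluate(c["text"], keywords)

# Moderator-style filter: relevant comments carrying a personal-story signal.
picks = [c for c in comments if c["scores"]["relevance"] > 0 and c["scores"]["personal"] > 0]
```

In the real interface these per-criterion scores would feed the overview, filtering, and search views, letting a moderator slice the comment stream by any combination of dimensions rather than reading it linearly.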

The UMD research team includes Dr. Nicholas Diakopoulos, a professor at the College of Journalism, and Dr. Niklas Elmqvist, a professor at the iSchool specializing in information visualization and human-computer interaction. The team also includes Deok Gun Park, a Ph.D. student in the Department of Computer Science working on natural language processing and the moderation interface, and Simranjit Singh, a graduate student in information technology.