Metareview Service

Automated Metareview Web Services

Metareviewing (also spelled “meta-reviewing”) is the process of assessing reviews. Academic conferences often have procedures for metareviewing referee reports to build confidence that acceptance decisions are made on the basis of careful reviews performed by competent reviewers. Metareviewing is important in the context of (classroom) peer assessment, because (1) not all students have learned enough to accurately assess their peers’ work on a particular topic, and (2) students will spend inadequate time on reviews if there is no mechanism to keep them accountable for the quality of their reviewing.

This web service presents a suite of methods for obtaining various metrics on a review. For example, one method measures review volume—the number of different words used in a review. In most cases, a review that says more is more useful than a review that says less, so volume can be considered a (crude) measure of review quality.

A peer-assessment system might use metareview metrics in various ways. It might present the instructor with a report giving, for each student, metrics on that student’s reviews. The instructor could use these metrics—either directly or after further investigation—to assign a grade for reviewing. Or, a system might present metareview metrics to a student, to let the reviewer see how well his/her reviews stack up against reviews done by other students in the course. The metrics could be presented in the form of a graph or chart, so that the student could see at a glance if the current review “makes the grade” relative to other students in the class. The system might give the reviewer an opportunity to revise the review after seeing the feedback.

Each method corresponds to a single metareview metric. A score for that metric for a particular review can be generated by passing a JSON object in a POST request with required input fields. The parameters required for individual web methods are listed below. A JSON object containing the score for the review is returned to the client.

After the score is returned, the system can use it in any way desired, e.g., display it on a page, or use it in calculating a grade.

Alternative 1: Access the Peerlogic Server

The metareview web services can be accessed by issuing web-service requests to the Peerlogic project server, at http://peerlogic.csc.ncsu.edu/metareview/metareviewgenerator/[methodname]. You (or your peer-assessment system) can use this service without building the code on your local machine. The format for calling these services is given below.
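As a sketch of the calling convention, a Python client might build such a request as follows. The method name and the "reviews" payload field follow the format described in the metric definitions below; actually sending the request requires network access to the Peerlogic server.

```python
import json
import urllib.request

BASE_URL = "http://peerlogic.csc.ncsu.edu/metareview/metareviewgenerator"

def build_request(method_name, review_text):
    """Build a JSON POST request for the given metareview method."""
    payload = json.dumps({"reviews": review_text}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/{method_name}",
        data=payload,
        headers={"Content-Type": "application/json"},
    )

req = build_request("volume", "Good work. The document is informative.")
# To actually send the request (requires network access):
# with urllib.request.urlopen(req) as resp:
#     score = json.loads(resp.read())   # e.g. {"volume": ...}
```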

Metric Definitions

Volume

The volume metric gives the count of unique tokens present in a textual input. It ignores articles and pronouns.

Sample input

Method name: volume

Input: {"reviews":"Good work. I didn't like the introduction of the article. This can be easily written in a better way. But the document provides very interesting and important information regarding the cloud."}

Output: {"volume":19}
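The tokenizer and the exact article and pronoun lists used by the service are not published, so the sketch below uses small illustrative stand-ins. It is meant only to show the idea of counting unique tokens while ignoring articles and pronouns, and will not necessarily reproduce the score shown in the sample output.

```python
import re

# Illustrative stop lists; the service's actual lists are not published.
ARTICLES = {"a", "an", "the"}
PRONOUNS = {"i", "you", "he", "she", "it", "we", "they", "this", "that",
            "me", "him", "her", "us", "them", "my", "your", "its"}

def volume(review):
    """Count unique word tokens, ignoring articles and pronouns."""
    tokens = re.findall(r"[a-z']+", review.lower())
    return len({t for t in tokens if t not in ARTICLES | PRONOUNS})
```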

Tone

Tone refers to the choice of words used by the reviewer. It can be positive, negative, or neutral. We use positive and negative indicators from an opinion lexicon provided by Liu et al. to determine the semantic orientation of a review.

Positive: A review is said to have a positive tone if it predominantly contains positive feedback, i.e., it uses words or phrases that have a positive semantic orientation. Example: “The page is very well-organized and the information under corresponding titles is complete and accurate.” Adjectives such as “well organized”, “complete” and “accurate” are indicators of a positive semantic orientation.

Negative: This category contains reviews that predominantly use words or phrases with a negative semantic orientation. Reviews offering negative criticism of the author’s work fall into this category, since reviewers making negative remarks tend to use language that is likely to offend the authors. Such reviews could be rephrased to be less offensive to the author of a submission. Example: “The approach is trivial, and the paper has been formatted very poorly.” This example contains the negatively oriented terms “trivial” and “very poorly”, and the author could consider such a review rude. One way to rephrase this review to convey the message politely is: “The approach needs improvement. In its present form it does not appear to be conveying any new information. The paper could have been formatted better.”

Neutral: Reviews that contain neither positively nor negatively oriented words or phrases, or that contain an equal number of both, are considered neutral. Example: “The organization looks good overall. But lots of IDEs are mentioned in the first part and only a few of them are compared with each other. I did not understand the reason for that.” This review contains both positively and negatively oriented segments: “The organization looks good overall” is positively oriented, while “I did not understand the reason for that” is negatively oriented. The positively and negatively oriented segments, taken together, give this review a neutral orientation.

Sample input

Method name: tone

Input: {"reviews":"Good work. I didn't like the introduction of the article. This can be easily written in a better way. But the document provides very interesting and important information regarding the cloud."}
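The classification above can be sketched as a word-count comparison. The tiny lexicons below are illustrative stand-ins for Liu et al.’s opinion lexicon, which contains thousands of entries; the real service’s scoring is more sophisticated than a simple majority vote.

```python
# Tiny stand-in lexicons; the service uses Liu et al.'s full opinion lexicon.
POSITIVE = {"good", "interesting", "important", "accurate", "complete", "better"}
NEGATIVE = {"poorly", "trivial", "bad", "wrong", "offend"}

def tone(review):
    """Classify a review as positive, negative, or neutral by word counts."""
    words = [w.strip(".,!?\"'") for w in review.lower().split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"
```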

Content

Content identifies the type of feedback provided by the reviewer. It can contain praise, identify problems with the reviewed artifact, or make suggestions on how to improve the work.

Summation: This metric identifies whether a review contains a summary of, or a positive (praise) assessment of, the reviewed artifact. This type of review does not point out any problems in the reviewed work or offer any suggestions for improvement.

Problem detection: As the name suggests, this metric identifies whether the review identifies any problem with the reviewed work. A suggestion for improvement (see below) does not count as detection of a problem.

Advisement: A review can provide ways in which an artifact can be improved. The advisory metric is used to measure the degree to which the review offers suggestions.

The content metric returns three scores for each review, one for each of the above three aspects (summation, problem detection, advisement).

Sample input

Method name: content

Example 1

Input: {"reviews":"Good work. The introduction of the article is not written in a good format. I didn't like the introduction of the article. This can be easily written in a better way. But the document provides very interesting and important information regarding the cloud."}
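A per-sentence, cue-phrase classification gives a rough feel for how the three content scores could be derived. The cue lists below are hypothetical; the service’s actual detection of summation, problem detection, and advisement is not published and is presumably more robust than substring matching.

```python
import re

def content_scores(review):
    """Return rough per-sentence counts for summation, problem detection,
    and advisement, based on simple illustrative cue phrases."""
    sentences = re.split(r"(?<=[.!?])\s+", review.strip())
    praise = ("good", "great", "interesting", "important", "well")
    problem = ("not", "didn't", "problem", "missing", "unclear")
    advice = ("could", "should", "can be", "better way", "consider")
    scores = {"summation": 0, "problem": 0, "advisement": 0}
    for s in sentences:
        low = s.lower()
        if any(c in low for c in advice):
            scores["advisement"] += 1
        elif any(c in low for c in problem):
            scores["problem"] += 1
        elif any(c in low for c in praise):
            scores["summation"] += 1
    return scores
```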

Plagiarism

This metric detects whether a reviewer copied the content of the review from the reviewed artifact or another Internet source. Content taken from the artifact is marked as copied unless it is placed within double quotes.

Sample input

Method name: plagiarism

Input: {"reviews":"Good work. The introduction of the article is not written in a good format."}
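One common way to implement such a check is word n-gram overlap after stripping quoted segments; the sketch below follows that approach, but the service’s actual matching method (and how it compares against other Internet sources) is not published.

```python
import re

def strip_quoted(text):
    """Remove double-quoted segments, which may legitimately match the artifact."""
    return re.sub(r'"[^"]*"', " ", text)

def copied_fraction(review, artifact, n=4):
    """Fraction of the review's word n-grams that also occur in the artifact.
    A crude illustrative stand-in for the service's plagiarism check."""
    def ngrams(text):
        words = re.findall(r"[a-z']+", text.lower())
        return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}
    review_ngrams = ngrams(strip_quoted(review))
    if not review_ngrams:
        return 0.0
    return len(review_ngrams & ngrams(artifact)) / len(review_ngrams)
```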