Assessment in MOOCs

I was asked: "I was wondering how they might work with the Humanities, as I teach
Seventeenth-Century Literature, Shakespeare
and other related subjects, which require research papers and final
examinations. I can see using MOOCs for people who simply have a
(non-credit) interest in these subjects, but I can't see myself marking
5,000 term papers, and a similar number of exams. Multiple-choice
evaluation, as in science, is easily taken care of electronically, but
not in humanities. I am sure this looks like a naive question, but I
think MOOCs are a wonderful idea for people who simply wish to enrich
their knowledge, and would like to know a little
more about them."

First of all, the MOOCs I have worked on have not focused on assessment -
they have been courses, yes, with a small number (20 or so) taking them
for credit, but the vast majority of participants auditing. So the
question of marking term papers never came up. And like you, I would not
contemplate multiple-choice exams in humanities and literature courses.

If you really need assessment, a few solutions have been proposed and, to a limited extent, tried out:

- automated essay assessment - this is not as far-fetched as it may
seem, though it's not necessarily a cure-all. Automated essay assessment
needs to be seeded with a large number of already-marked essays; on
being given this seed, it extracts the properties of high-quality
essays, and then matches new essays to those properties. There's a
really good essay describing the process here:
http://mfeldstein.com/si-ways-the-edx-announcement-gets-automated-essay-grading-wrong/ Here's another article: http://tlt.its.psu.edu/2013/04/12/mooc-moments-essay-grading-software/

- another form of automated assessment is based on task-completion or
success-based metrics. The best example of this is Codecademy.
http://www.codecademy.com/#!/exercises/0 It's a bit like programmed
learning
http://www.gsis.kumamoto-u.ac.jp/en/opencourses/pf/3Block/07/07-2_text.html
where people are stepped through the material and asked for active
responses along the way; "The extent of a learner's understanding is
ascertained from what is demonstrated in the responses." In many
cases, this can be supported through a form of self-assessment, using
simple techniques such as flash cards and more complex techniques such
as sample responses to questions; participants can determine for
themselves whether they passed and can move on.

- peer assessment - essays are graded not by professors but by other
course participants. This would require that each essay be graded by a
largeish number of other participants; otherwise, the grading would be
no better than random. How large is enough? It might be too large,
especially when you account for people who grade without reading, people
who grade based on poor criteria, etc. Peer grading can work really
well for blog posts and discussion lists, where it can be managed with a
simple thumbs-up thumbs-down metric. Here's an example:
http://www.nytimes.com/2012/11/20/education/colleges-turn-to-crowd-sourcing-courses.html?_r=0
And the fact of being graded by peers often spurs people to greater
accomplishment. There's some discussion of peer grading here:
http://degreeoffreedom.org/moocs-and-peer-grading-1/ And here's a critique of the technique: http://www.insidehighered.com/views/2013/03/05/essays-flaws-peer-grading-moocs

- network-based grading - in this model, individuals are graded not on
individual pieces of work but according to network metrics; the idea is
that quality work will produce quality network metrics. The model is
not unlike that pioneered by
Klout http://klout.com/home which counts the number of Twitter
followers, Facebook likes, and similar indicators, to produce a single
Klout score.

The problem with Klout is that it is simplistic and easily gamed.
Nonetheless there is potential for a more fine-grained assessment to
look at how ideas created by one person propagate through a network, to
look at whether a person's reading recommendations have become
influential, and similar less obvious measures. These can be pretty
fine-grained, based on semantic analysis. Here's a simple example, of a
Twitter scanner that looks for instances of bullying (obviously,
something that would lower the person's score):
http://phys.org/news/2012-08-machines-scour-twitter-bullying.html And
here are some links to research by my colleagues at NRC on the analysis
of sentiments and emotions in online postings:
http://www.umiacs.umd.edu/~saif/
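
To make the idea of "how ideas created by one person propagate through a network" a little more concrete, here is a minimal sketch in Python. It is only an illustration, not how Klout or any actual MOOC platform works: the `reshares` data, the participant names and the `cascade_reach` function are all hypothetical, and real network metrics would need far more care (timing, semantic matching, resistance to gaming).

```python
from collections import defaultdict, deque


def cascade_reach(reshares, origin):
    """Count the distinct participants an idea reached, starting at `origin`.

    `reshares` is a list of (from_person, to_person) pairs, each meaning
    that `to_person` relayed something they picked up from `from_person`.
    This data format is an assumption made for the sketch.
    """
    # Build a lookup: person -> set of people who relayed their material.
    relayed_by = defaultdict(set)
    for source, relayer in reshares:
        relayed_by[source].add(relayer)

    # Breadth-first walk outward from the originator.
    seen = {origin}
    queue = deque([origin])
    while queue:
        person = queue.popleft()
        for nxt in relayed_by.get(person, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)

    return len(seen) - 1  # do not count the originator


# Hypothetical course activity.
reshares = [
    ("ana", "ben"),
    ("ben", "cal"),
    ("ben", "dee"),
    ("eve", "ana"),
    ("cal", "ana"),  # a loop back to ana; it must not be counted twice
]

for name in ("ana", "eve", "dee"):
    print(name, cascade_reach(reshares, name))
```

A count like this is exactly the kind of raw number that is easily gamed, which is why it would only be a starting point for the finer-grained measures described above.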

Each of the last two methods runs some risks:

- the "blind-leading-the-blind" phenomenon, whereby the collection of
participants in a course can elevate myths about the subject matter to
the status of fact (I remember instances from my childhood, where the
position of "backcatcher" became a baseball position, and "touching
iron" became a foul in basketball).

- the "charlatan" phenomenon - students not already expert in a subject
matter may mistakenly believe that one of their number is an expert.

For these reasons, I have always recommended that a MOOC seek to attract
not only students and novices in a discipline, but also practitioners
and experts in the discipline. Such people will quite rightly gain the
greatest 'authority' according to peer or network assessment measures, and as a
result, their actions (such as relaying an idea, passing on a link,
etc.) will gain more weight, shifting the outcome of peer or network
based assessment to one based more on credibility.
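
As a rough illustration of how extra weight for experts shifts a peer grade, here is a small Python sketch of a credibility-weighted average. The `authority` weights, the names and the `weighted_grade` function are assumptions made up for the example; how such weights would actually be earned in a course is left open here.

```python
def weighted_grade(ratings, authority, default_weight=1.0):
    """Return the authority-weighted mean of peer ratings.

    ratings:   dict mapping grader name -> score they gave
    authority: dict mapping grader name -> weight (hypothetical values);
               graders not listed get `default_weight`
    """
    total = 0.0
    total_weight = 0.0
    for grader, score in ratings.items():
        weight = authority.get(grader, default_weight)
        total += weight * score
        total_weight += weight
    if total_weight == 0:
        raise ValueError("no ratings with positive weight")
    return total / total_weight


# Two novices are impressed by an essay; one expert is not.
ratings = {"novice1": 9.0, "novice2": 9.0, "expert": 5.0}

unweighted = weighted_grade(ratings, {})              # everyone counts equally
weighted = weighted_grade(ratings, {"expert": 4.0})   # expert counts four times

print(round(unweighted, 2), round(weighted, 2))
```

With equal weights the two novices dominate the result; once the expert's rating counts for more, the grade moves toward the expert's judgement.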

The difficulty lies in attracting these people, who are often very busy,
to a MOOC. This is one of the benefits of scale; a very large MOOC is
more likely to attract experts (and the presence of experts is more
likely to increase the size of the MOOC). But experts aren't likely to
attend a carefully choreographed "Intro to Victorian Literature" course;
it's all very old and familiar to them. The model of learning needs to change to involve the experts.

This is what we attempted, with some success, in our connectivist MOOCs -
rather than set it up as a series of lessons, we set it up as a series
of discussions. The experts would participate at a high level, often
interacting mostly with each other, while participants at other levels
observed and were able to emulate this practice. Yes, we did provide
scaffolding, to help the novices get into the flow of the discussion,
but the scaffolding did not become the course.

Another model of MOOC addresses the issue by sharing the assessment.

The "distributed MOOC" is essentially a MOOC that is shared by a number
of institutions (again, this was something we attempted in the earliest
connectivist MOOCs). Today this is sometimes called a 'wrapped
MOOC'. The idea is that some or all of the MOOC contents are shared by
members of classes from any number of institutions. Participants
interact with each other, and follow online events and resources
together. Each, though, is subject to individual assessment by their
home institution, which may attach whatever rubric it wishes to the material.

A final option is to bypass grading entirely, and let a person's
outcomes stand on their own as evidence of accomplishment in the course.
This is the objective of portfolio-based courses, more common in the
arts and writing, but also increasingly popular in the sciences, and
especially design, engineering and computing. The idea is essentially
that a person presents an artifact that can be studied directly by
potential employers. This artifact may or may not be subject to peer
grading, which may produce a course score. But the course score is
secondary to the artifact itself. Here's a quick guide to
portfolio-based assessment:
http://www.unm.edu/~devalenz/handouts/portfolio.html

One of the advantages of portfolio-based assessment is that it removes
the ambiguity inherent in grades that result from tests and assignments.
Here, for example, is an article describing how portfolio-based
assessment can help parents see directly how well their children are
performing.
http://www.earlychildhoodnews.com/earlychildhood/article_view.aspx?ArticleID=495
Portfolio-based assessment is often based on matching production to
rubrics; this example http://www.ncbi.nlm.nih.gov/pubmed/17457074
demonstrates portfolio-based assessment in medicine.