AWS Infrastructure for Stanford Course in Parallel Programming

Over the last two winters I have had the privilege of being a teaching
assistant for CS149: Parallel Programming at Stanford University
with instructors Alex Aiken and Kunle Olukotun. When I
started, I took up the job of overhauling the management of machines
for the course. CS149 is unusual in that each assignment investigates
a different programming model. As a result, every assignment runs on a
different hardware and software stack.

In the beginning, TAs maintained separate physical hardware for each
assignment. Setting up an assignment meant going down into the
basement of the Gates building to check that the machines hadn’t been
borrowed or stolen, and configuring (often manually) the software for
each new quarter. This process became even more problematic as
turnover among the TAs resulted in a lack of continuity between
iterations of the course. Clearly this was unmanageable in the long
term.

The previous year’s TAs had begun to move to AWS EC2, but left the
transition incomplete. I started with a handful of half-baked AMIs and
the remaining physical hardware from before the transition. After
deciding to complete the move to EC2, I started over with a new
infrastructure. My goal was to make the process entirely automated so
that each assignment could be run with a small handful of commands.

That infrastructure, which has been in production use at Stanford for
two years, is now released as open source under an MIT license. The
source and instructions are available at the project page.

I am proud of my work, but to be entirely honest, the code is not that
pretty. As a TA, my goal was always to be good enough, not perfect. In
places where it made sense, I reused existing technologies
(e.g. Kerberos). In other places, I rolled my own lightweight
solutions rather than adopt what I felt to be unnecessary technical
burdens (e.g. Puppet et al.). Given that my deadline for an initial
working implementation was two weeks, I stand by my decisions. But if I
had been given more time to build the MVP, I might have made different
choices.

The code is abstracted well enough to enable reuse for courses similar
to CS149. That said, for programming models which differ substantially
from the ones taught in CS149 (e.g. MPI), more effort will be
required.