Background and Goals

HSCC has a rich history of publishing strong papers emphasizing
computational contributions; however, subsequent re-creation of
these computational elements is often challenging because details
of the implementation are unavoidably absent in the paper. Some
authors post their code and data to their websites, but there is
little formal incentive to do so and no easy way to determine
whether others can actually use the result. As a
consequence, computational results often become non-reproducible
-- even by the research group that originally produced them --
after just a few years.

The goal of the HSCC repeatability evaluation process is to
improve the reproducibility of computational results in the papers
selected for the conference.

Benefits for Authors

We hope that this process will provide the following benefits to
authors:

Raise the profile of papers containing repeatable
computational results by highlighting them at the conference
and online.

Raise the profile of HSCC as a whole, by making it easier to
build upon the published results.

Provide authors with an incentive to adopt best practices
for code and data management that are known to improve the
quality and extendability of computational results.

Provide authors an opportunity to receive feedback from
independent reviewers about whether their computational
results can be repeated.

Provide authors the opportunity to obtain a special mention in the
conference proceedings and to take part in the competition for the
best repeatability evaluation (RE) award.

While creating a repeatability package will require some work from
the authors, we believe the cost of that extra work is outweighed
by a direct benefit to members of the authors' research lab: if an
independent reviewer can replicate the results with a minimum of
effort, it is much more likely that future members of the lab will
also be able to do so, even if the primary author has departed.

Author Instructions and Submission Guidelines

Authors of papers accepted to
HSCC 2017 - and especially Tool Papers - are invited to
submit a repeatability package (RP). An RP submission is
optional, and will not affect the final publication of the
corresponding paper.
RPs are considered confidential material in the same sense as
initial paper submissions: committee members agree not to share RP
contents and to delete them after evaluation. RPs remain the
property of the authors, and there is no requirement to post them
publicly (although we encourage you to do so).
Papers whose RPs pass the repeatability evaluation criteria will
be listed online and in the final proceedings. On the other hand,
papers whose RPs do not pass the repeatability evaluation criteria
will be treated the same as papers for which no RP was submitted
(that is, failing RPs will not be individually identified).

The RP consists of three components:

A copy (in PDF
format) of the final camera-ready paper. This copy will
be used by the Repeatability Evaluation Committee (REC) to evaluate
how well the elements of the RP match the paper.

A document (either
a webpage, a PDF, or a plain text file) explaining at a
minimum:

What elements of the paper are included in the RP (e.g.,
specific figures, tables, etc.).

The system requirements for running the RP (e.g., OS,
compilers, environments, etc.).

Instructions for installing and running the software and
extracting the corresponding results (a sketch of such a
document appears after this list).

The software and any
accompanying data. We will accept at least the
following formats:

A link to a public online repository, such as
bitbucket.org, code.google.com, github.com or
sourceforge.net.

An archive in a standard file format (e.g., zip, gz, tgz)
containing all the necessary components.

A link to a virtual machine image (using either VirtualBox
or VMware) which can be downloaded.

If you would like to submit software and/or data in another
format, please contact the RE committee chair in advance to
discuss options.
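
For concreteness, a minimal instructions document might look roughly
like the sketch below. All file names, figure numbers, and version
requirements here are hypothetical, not a required template:

    Repeatability Package for "Paper Title", HSCC 2017

    Contents:
      src/             scripts that regenerate Figure 3 and Table 2
      data/            input data sets (CSV)
      readme.txt       this file

    Requirements:
      Ubuntu 16.04 (other Linux distributions should also work)
      Python 3.5 with numpy and matplotlib

    Installation:
      pip install -r requirements.txt

    Reproducing the results:
      python src/make_figure3.py    writes figure3.png (cf. Figure 3)
      python src/make_table2.py     prints Table 2 to standard output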

The RP should be submitted through EasyChair (see next paragraph,
and note that this is a different site from the one used for initial
paper submissions). When preparing your RP, keep in mind
that other conferences have reported that the most common reason
for reproducibility failure is installation problems. We
recommend that you have an independent member of your lab test
your installation instructions and RP on a clean machine before
final submission.
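
One lightweight way to approximate such a test for a Python-based RP,
assuming it ships a requirements.txt and a single top-level script
(both hypothetical names), is to install and run everything inside a
freshly created virtual environment. This does not catch missing
operating-system-level dependencies the way a clean virtual machine
would, but it does expose Python packages that the instructions
forgot to declare:

    import subprocess
    import venv

    # Build a throwaway virtual environment that starts with no
    # third-party packages installed.
    venv.EnvBuilder(with_pip=True, clear=True).create("rp_test_env")
    python = "rp_test_env/bin/python"  # rp_test_env\Scripts\python.exe on Windows

    # Install only what the RP's own instructions declare ...
    subprocess.run([python, "-m", "pip", "install", "-r", "requirements.txt"],
                   check=True)
    # ... and then run the single entry point the instructions advertise.
    subprocess.run([python, "reproduce_all.py"], check=True)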

The repeatability evaluation process uses anonymous reviews so as
to solicit honest feedback. Authors of RPs should make a
genuine effort to avoid learning the identity of the
reviewers. This effort may require turning off analytics or
only using systems with high enough traffic that REC accesses will
not be apparent. In all cases where tracing is unavoidable,
the authors should provide warnings in the documentation so that
reviewers can take necessary precautions to maintain anonymity.

Repeatability Evaluation Criteria

Each member of the Repeatability Evaluation Committee assigned to
review a Repeatability Package (RP) will judge it based on three
criteria -- coverage,
instructions, and quality -- where each criterion is
assessed on the following scale:

significantly exceeds expectations (5),

exceeds expectations (4),

meets expectations (3),

falls below expectations (2),

missing or significantly falls below expectations (1).

In order to be judged "repeatable" an RP must "meet expectations"
(average score of 3), and must not have any missing elements (no
scores of 1). Each RP is evaluated independently according
to the objective criteria. The higher scores ("exceeds" or
"significantly exceeds expectations") in the criteria should be
considered aspirational goals, not requirements for acceptance.

Coverage

What fraction of the appropriate figures and tables is reproduced
by the RP? Note that some figures and tables should not be
included in this calculation; for example, figures generated in a
drawing program, or tables listing only parameter values.
The focus is on those figures or tables in the paper containing
computationally generated or processed experimental evidence to
support the claims of the paper.

Note that satisfying this criterion does not
require that the corresponding figures or tables be recreated in
exactly the same format as appears in the paper, merely that the
data underlying those figures or tables be generated in a
recognizable format.

A repeatable element is
one for which the computation can be rerun by following the
instructions in the RP in a suitably equipped environment.
An extensible element is
one for which variations of the original computation can be run by
modifying elements of the code and/or data. Consequently,
necessary conditions for extensibility include that the modifiable
elements be identified in the instructions or documentation, and
that all source code be available and/or involve calls to
commonly available and trusted software (e.g., Windows, Linux, C or
Python standard libraries, Matlab, etc.).

The categories for this criterion are:

None (missing / 1):
There are no repeatable elements. This case
automatically applies to papers that do not submit an RP or
papers that contain no computational elements.

Some (falls below
expectations / 2): There is at least one repeatable
element.

Most (meets expectations /
3): The majority (at least half) of the elements are
repeatable.

All repeatable or most
extensible (exceeds expectations / 4): All elements
are repeatable or most are repeatable and easily
modified. Note that if there is only one computational
element and it is repeatable, then this score should be
awarded.

Instructions

The categories for this criterion include:

Complete (meets
expectations / 3): For every computational element
that is repeatable, there is a specific instruction which
explains how to repeat it. The environment under which
the software was originally run is described.

Comprehensive (exceeds
expectations / 4): For every computational element
that is repeatable there is a single command which recreates
that element almost exactly as it appears in the published
paper (e.g., file format, fonts, line styles, etc. might not be
the same, but the content of the element is the same); a sketch
of such a single-command script appears after this list.
In addition to identifying the specific environment under
which the software was originally run, a broader class of
environments is identified under which it could run.

Outstanding (significantly
exceeds expectations / 5): In addition to the
criteria for a comprehensive set of instructions, explanations
are provided of:

all the major components / modules in the software,

important design decisions made during implementation,

how to modify / extend the software, and/or

what environments / modifications would break the
software.
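
To make the "single command" idea in the comprehensive level above
concrete, an RP might ship one self-contained script per figure,
invoked as, say, python make_figure2.py. The sketch below is purely
illustrative; the file names, data format, and figure number are
invented, and the plotting details would depend on the paper:

    # make_figure2.py -- regenerate Figure 2 of the paper from the raw
    # timing data shipped in data/run_times.csv
    # (columns: state_dimension, cpu_seconds).
    import csv

    import matplotlib
    matplotlib.use("Agg")            # write a file; no display required
    import matplotlib.pyplot as plt

    dims, times = [], []
    with open("data/run_times.csv") as f:
        for row in csv.DictReader(f):
            dims.append(int(row["state_dimension"]))
            times.append(float(row["cpu_seconds"]))

    # Content should match Figure 2 of the paper; fonts and line styles
    # may differ from the published version.
    plt.semilogy(dims, times, "o-")
    plt.xlabel("state dimension")
    plt.ylabel("CPU time (s)")
    plt.savefig("figure2.png")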

Quality

This criterion explores the documentation and trustworthiness
of the software and its results. While a set of scripts
that exactly recreates, for example, the figures from the paper
certainly aids repeatability, without well-documented code it
is hard to understand how the data in a figure were
processed; without well-documented data it is hard to determine
whether the input is correct; and without testing it is hard to
determine whether the results can be trusted.

If there are tests in the RP which are not included in the
paper, they should at least be mentioned in the instructions
document. Documentation of test details can be put into
the instructions document or into a separate document in the RP.

The categories for this criterion are:

None (missing / 1):
There is no evidence of documentation or testing.

Rudimentary documentation
(falls below expectations / 2): The purpose of almost
all files is documented (preferably within the file, but
otherwise in the instructions or a separate readme file).

Comprehensive documentation
(meets expectations / 3): The purpose of almost all
files is documented. Within source code files, almost
all classes, methods, attributes and variables are given
clear, descriptive names and/or documentation of their
purpose. Within data files, the format and structure of
the data is documented; for example, in comma-separated value
(CSV) files there is a header row and/or comments explaining
the contents of each column.

Comprehensive documentation
and rudimentary testing (exceeds expectations / 4):
In addition to the criteria for comprehensive documentation,
there are identified test cases with known solutions which can
be run to validate at least some components of the code (a small
sketch of such a test appears after this list).

Comprehensive documentation
and testing (significantly exceeds expectations / 5):
In addition to the criteria for comprehensive documentation,
there are clearly identified unit tests (preferably run with a
unit test framework) which exercise a significant fraction of
the smaller components of the code (individual functions and
classes) and system-level tests which exercise a significant
fraction of the full package. Unit tests are typically
self-documenting, but the system-level tests will require
documentation of at least the source of the known solution.

Note that tests are a form of documentation, so it is not really
possible to have testing without documentation.
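
As a small, purely illustrative sketch of the documentation and
rudimentary-testing levels described above (the function, its name,
and the test values are invented for this example), a documented
routine with a unit test against a known solution might look like:

    import unittest

    def trapezoid_integral(samples, dt):
        """Approximate the integral of a uniformly sampled signal.

        samples : sequence of signal values taken every dt seconds
        dt      : sampling period in seconds
        Returns the trapezoid-rule approximation of the integral.
        """
        return dt * (sum(samples) - 0.5 * (samples[0] + samples[-1]))

    class TrapezoidTest(unittest.TestCase):
        def test_constant_signal(self):
            # Known solution: integrating the constant 1 over four steps
            # of length 0.5 gives exactly 2.0.
            self.assertAlmostEqual(trapezoid_integral([1, 1, 1, 1, 1], 0.5), 2.0)

    if __name__ == "__main__":
        unittest.main()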

Example Evaluation

As an illustration, here is how a sample RP might be scored against
the three criteria:

Coverage: all
repeatable (exceeds expectations / 4). Code to
recreate figures 1-5 and 7-8 is provided. Figure 6 is
a hand-drawn coordinate system. There are no tables.

Instructions:
complete (meets expectations / 3). The included
readme.txt file lists which m-files are used to recreate
which figures. The environment is described (Matlab
R2010b or later, link to the Toolbox of Level Set
Methods). However, some effort is required to extract
certain figures (e.g., figures 7 & 8).

Quality:
comprehensive documentation (meets expectations / 3).
All source files include Matlab help entries for every
function as well as numerous comments. There are no
data files. However, there is no sign of testing.