What is Deterministic Record/Replay

Deterministic application record and replay is the ability to record
application execution and deterministically replay it at a later time.
Record-replay has many potential uses:
* Diagnosing and debugging applications by capturing and reproducing hard-to-find bugs.
* Dynamic application analysis by performing costly instrumentation on replicas that
replay application behavior recorded on production systems.
* Intrusion analysis by capturing intrusions involving non-deterministic effects.
* Fault tolerance by providing replicas that replay execution and, when a fault
occurs, go live in place of the previously running application instance.

What is Scribe

Scribe is a record/replay engine that provides deterministic record and
replay of generic applications on Linux.

Scribe is Extensible

Scribe records application execution into a log file, and can later replay
the same application from that log file. It has APIs to inspect the log files,
modify the logged execution, control recording and replay, and adjust its
state.
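
To make the log-based idea concrete, here is a minimal toy sketch in Python. It is not Scribe's API: the `Recorder` and `Replayer` classes, the `call` method, and the JSON log layout are all made up for illustration. The point is only that recording captures the results of non-deterministic calls, and replay feeds those results back in the same order.

```python
import json
import random
import time


class Recorder:
    """Hypothetical helper: runs non-deterministic calls for real and logs each result."""

    def __init__(self):
        self.log = []

    def call(self, name, fn):
        value = fn()
        self.log.append({"event": name, "value": value})
        return value


class Replayer:
    """Hypothetical helper: returns logged results in order instead of running the calls."""

    def __init__(self, log):
        self.log = list(log)
        self.pos = 0

    def call(self, name, fn):
        entry = self.log[self.pos]
        if entry["event"] != name:
            raise RuntimeError(f"diverged: expected {entry['event']}, got {name}")
        self.pos += 1
        return entry["value"]


def app(env):
    """A tiny 'application' whose output depends on two non-deterministic inputs."""
    started = env.call("time", time.time)
    roll = env.call("random", lambda: random.randint(1, 6))
    return f"started={started} roll={roll}"


recorder = Recorder()
original = app(recorder)

# Round-trip through JSON to stand in for a log file on disk.
saved = json.dumps(recorder.log)
replayed = app(Replayer(json.loads(saved)))

print(original == replayed)
```

Because the log here is plain data, it can also be inspected or edited before replay, which is the same kind of hook the APIs above expose at a much lower level.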

But there is more to Scribe …

Scribe can do tandem-like application execution, where the recording on one
host is streamed to a second host and replayed in real time.

Scribe can be used to record application execution, then modify the
resulting log to force different behavior when the application is replayed.
For example, replay a multi-process application with different scheduling to
automatically expose and detect harmful race conditions
(Racepro).
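
As a rough illustration of why reordering a recorded schedule can expose a race, here is a toy Python sketch. It assumes nothing about how Scribe or Racepro represent schedules: the `worker` and `replay` functions and the list-of-names schedule format are hypothetical, and generators stand in for processes.

```python
def worker(shared):
    """Non-atomic increment: read, give up control, then write back."""
    value = shared["counter"]      # read
    yield                          # a point where the scheduler may switch
    shared["counter"] = value + 1  # write


def replay(schedule):
    """Run two workers, stepping them in the order given by the schedule."""
    shared = {"counter": 0}
    procs = {"A": worker(shared), "B": worker(shared)}
    for pid in schedule:
        next(procs[pid], None)  # advance one step; the default absorbs a finished worker
    return shared["counter"]


# Hypothetical logged order: A finishes before B starts.
recorded = ["A", "A", "B", "B"]
# Same steps, reordered so both workers read before either writes.
mutated = ["A", "B", "A", "B"]

print(replay(recorded), replay(mutated))
```

With the recorded order both increments land; with the reordered schedule one update is lost, which is the kind of harmful race a modified replay can surface.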

Scribe can be used to record an application execution, and replay it
after modifying the application, tolerating a divergent execution from the
original one to some extent. For example, replay an application with debugging
enabled from a recording without debugging output. This is a new concept that
we’ve introduced in the mutable replay paper.
The mutable replay engine plugs into the Scribe engine through the Python
library.
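
The debugging-output example can be pictured with a small toy in Python. This is only a sketch of the idea of tolerating divergence, not how the mutable replay engine works internally: the `TolerantReplayer` class, the event names, and the tuple-based log are all invented for illustration.

```python
class TolerantReplayer:
    """Hypothetical helper: replays logged events in order, but runs a few named events live."""

    def __init__(self, log, tolerated):
        self.log = list(log)
        self.pos = 0
        self.tolerated = set(tolerated)
        self.live_events = []

    def call(self, name, fn):
        # An event that matches the next log entry is served from the log.
        if self.pos < len(self.log) and self.log[self.pos][0] == name:
            value = self.log[self.pos][1]
            self.pos += 1
            return value
        # An event missing from the log is allowed only if explicitly tolerated.
        if name in self.tolerated:
            self.live_events.append(name)
            return fn()
        raise RuntimeError(f"unexpected event during replay: {name}")


def modified_app(env):
    """The recorded application plus one debug call that the original lacked."""
    first = env.call("read_input", lambda: "live-1")
    env.call("debug_print", lambda: None)
    second = env.call("read_input", lambda: "live-2")
    return first + second


# Hypothetical log from the original run, which had no debug output.
log = [("read_input", "hello "), ("read_input", "world")]
env = TolerantReplayer(log, tolerated={"debug_print"})
result = modified_app(env)

print(result, env.live_events)
```

The recorded inputs are still replayed faithfully, while the extra debug call runs live instead of aborting the replay.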

Future Work

Here’s what I’d focus on next:

Coverage: Some applications don’t replay well. Some, like Firefox and
OpenOffice, used to work on the original prototype but no longer do, for
reasons that still need investigation.

Interpreters: Scribe is transparent, meaning it requires no changes to
the applications. In some cases, though, it does make sense to change the application.
For example, if the goal is to record and replay programs in languages like
Ruby/Python/Java, we may get away without recording and replaying the internals
of the respective VM. I started to patch the Ruby interpreter to make
it Scribe-aware (see here). Mutable
replay works much better when it has context.

Distributed: I want to be able to record an application that spans multiple
servers. Scribe records all the interactions the application has with its
external environment, so you don’t want to record the database and the
application separately.

“I have a Dream”

With these three components in place, I can fulfil a dream: as a web developer,
I’d like to have an entire web stack recorded. When a user clicks on the
“Feedback” button, I would replay the whole system locally and observe exactly
what the user got by replaying her entire session. I’d like that replay to be
faithful down to the race that may have happened in the database. I’d also
like to be able to modify the code to understand it better while it’s
replaying. With enough brains on this, we can make it a reality.