Jolt framework lets users force some hung programs to recover


MIT researchers have developed an experimental software framework called Jolt that allows applications to recover in some cases when they hang. When Jolt detects that a program is stuck in a certain kind of infinite loop, it can force it to exit the loop and continue executing.

The researchers have published a paper that describes their implementation of Jolt and how it performed in a number of tests against bugs in well-known open source software utilities. In several test cases, Jolt allowed hung programs to continue to completion in situations where the user would otherwise have to forcefully terminate the process.

The inspiration for the research project came from a bug in Microsoft Word. An MIT professor was writing a document in the word processor one morning when it froze unexpectedly. Using a debugging tool, he found the loop in which the program was stuck and forced it to move on, allowing him to save his document and restart the program. He described the incident in an e-mail to his colleague, Professor Martin Rinard, who then got the idea of building an automated tool for breaking out of infinite loops.

The idea is compelling, but the initial implementation comes with some caveats. The method that Jolt uses to identify infinite loops is narrow: it only catches loops that stop making any progress at all. Jolt compares the program's state from one iteration of a loop to the next to determine whether any values are changing. If the state is identical between iterations, Jolt forces the program to branch out of the loop so that execution can continue.
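To make the state-comparison idea concrete, here is a minimal Python sketch. It is not Jolt's code: the function names, the dictionary standing in for program state, and the iteration limit are all assumptions made for illustration. The guard copies the state at the top of each iteration and leaves the loop if the copy matches the one from the previous iteration.

```python
import copy


def guarded_loop(condition, body, state, limit=10_000):
    """Run `body` while `condition` holds, escaping if the state stops changing.

    Hypothetical illustration only; `state` is a plain dict standing in
    for the memory a real tool would inspect.
    """
    previous = None
    iterations = 0
    while condition(state):
        snapshot = copy.deepcopy(state)  # state at the top of this iteration
        if snapshot == previous:
            # Nothing changed since the last iteration, so the exit
            # condition can never become true. Break out of the loop.
            return "escaped", iterations
        previous = snapshot
        body(state)
        iterations += 1
        if iterations >= limit:
            return "gave_up", iterations
    return "finished", iterations


def not_at_end(state):
    return state["pos"] < len(state["text"])


def advance(state):
    state["pos"] += 1  # healthy loop body: makes progress


def forget_to_advance(state):
    pass  # buggy loop body: never moves the cursor


print(guarded_loop(not_at_end, advance, {"pos": 0, "text": "abc"}))
print(guarded_loop(not_at_end, forget_to_advance, {"pos": 0, "text": "abc"}))
```

The healthy loop runs to completion, while the buggy one is cut short on its second pass, once the guard sees two identical snapshots in a row.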

Jolt isn't effective in cases where the operations within a loop are changing the program's state but not changing it in ways that fulfill the loop's natural exit condition. Another issue is that Jolt can't identify infinite loops that are caused by recursive function calls.
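The first limitation can be illustrated with a small Python sketch. The helper and the two loop bodies below are hypothetical and are not taken from the paper. One body leaves the state untouched, so consecutive iterations look identical. The other keeps decrementing a counter that is supposed to reach 10, so the state differs on every pass even though the loop can never finish.

```python
def states_repeat(body, state, probes=1000):
    """Return True if two consecutive iterations leave `state` identical.

    Hypothetical checker: `state` is a flat dict, and `probes` bounds how
    many iterations are examined so the demo itself always terminates.
    """
    previous = dict(state)
    for _ in range(probes):
        body(state)
        if state == previous:
            return True
        previous = dict(state)
    return False


def stuck_body(state):
    pass  # changes nothing: the kind of loop a state comparison can catch


def runaway_body(state):
    # Loop intended to stop when i == 10, but it counts the wrong way.
    # The state changes on every pass, so no two iterations ever match.
    state["i"] -= 1


print(states_repeat(stuck_body, {"i": 0}))    # True
print(states_repeat(runaway_body, {"i": 0}))  # False
```

A detector that relies only on unchanged state would flag the first loop and miss the second.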

For Jolt to work properly, an application has to be instrumented at compile time: function calls that track loop entry and exit are injected into its code. To accomplish this, the researchers built on the Low Level Virtual Machine (LLVM) compiler infrastructure and added a step to perform the necessary modifications. That step also adds a label outside of each loop to indicate where execution should resume when Jolt causes the program to exit the loop.

The real heavy lifting in Jolt is done by a dynamic instrumentation system that attaches to a program at runtime and tracks operations that write to memory during loops. It uses that data to build a snapshot of the memory state when it reaches the beginning of each loop. The snapshots are compared to determine if the state is actually changing. The researchers built their dynamic instrumentation mechanism on top of Pin.
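The write-tracking approach described above can be sketched roughly as follows. This is a simplified Python model, not the Pin-based implementation: the `WriteTracker` class, the dictionary used as memory, and the `first_repeated_head` helper are all assumptions for illustration. Only addresses written during the loop are recorded, and the snapshot taken at each loop head covers just those addresses.

```python
class WriteTracker:
    """Hypothetical stand-in for instrumentation that observes memory writes."""

    def __init__(self, memory):
        self.memory = memory
        self.written = set()  # addresses written since the loop was entered

    def write(self, addr, value):
        self.memory[addr] = value
        self.written.add(addr)

    def snapshot(self):
        # Capture only the locations the loop has actually written to.
        return {addr: self.memory[addr] for addr in self.written}


def first_repeated_head(tracker, body, max_heads=100):
    """Return the index of the first loop head whose snapshot matches the
    previous one, or None if the snapshots keep changing."""
    previous = None
    for head in range(max_heads):
        current = tracker.snapshot()
        if current == previous:
            return head
        previous = current
        body(tracker)
    return None


def rewrite_same_value(tracker):
    tracker.write(0, 5)  # writes memory, but always the same value


def increment(tracker):
    tracker.write(0, tracker.memory[0] + 1)  # genuinely changes state


print(first_repeated_head(WriteTracker({0: 0}), rewrite_same_value))
print(first_repeated_head(WriteTracker({0: 0}), increment))
```

Restricting the snapshot to written addresses keeps the comparison small. In this sketch the cost is that a loop which rewrites the same value is only recognized as stuck one iteration later than it otherwise could be.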

To see how Jolt works in practice, the researchers put it to the test with real-world software. In one of the tests, they pitted Jolt against a Python parsing bug in ctags, a tool that analyzes code and generates an index of names. The bug apparently caused ctags to go into an infinite loop when it encountered two triple-quoted strings on the same line.

Without Jolt, ctags would just hang indefinitely upon hitting that error, forcing the user to terminate the process. Jolt allowed the program to finish running. It moved on and finished the other files that it was supposed to process, leaving abridged data for the file where the error was encountered.

They performed similar tests with other common command line tools, including grep and ping. In seven of their eight tests, Jolt identified the infinite loop within half a second or less and allowed the program to continue. In two of the eight test cases, the program that Jolt forced out of its loop emitted the same output as a fixed version of the program.

The paper also includes data that show how Jolt instrumentation impacts the performance of an application. The overhead ranges from 0.5 percent to 8.6 percent.

The researchers' findings are intriguing and offer some insight into how automated mechanisms can be used to allow users to recover from certain kinds of program faults. The project still has a long way to go, though, before it will be a practical option for regular end users.

The researchers are working on a follow-up, called Bolt, that they hope will overcome the need for static instrumentation at compile time. That could help the project move one step closer to delivering a convenient standalone solution for unhanging hung applications.