Wednesday, April 19, 2017

Researchers find flaws in RISC-V core

Researchers at Princeton University have found a number of significant flaws in the RISC-V open source processor core. The specification is set to come to market later this year, although some companies such as SiFive are already using it.

The researchers, testing a technique they created for analyzing computer memory use, found over 100 errors involving incorrect orderings in the storage and retrieval of information from memory in variations of the RISC-V processor architecture. The researchers warned that, if uncorrected, the problems could cause errors in software running on RISC-V chips. Officials at the RISC-V Foundation said the errors would not affect most versions of RISC-V but would have caused problems for higher-performance systems.

"Incorrect memory access orderings can result in software performing calculations using the wrong values," said Margaret Martonosi, Professor of Computer Science at Princeton and the leader of the Princeton team that also includes Ph.D. students Caroline Trippel and Yatin Manerkar. "These in turn can lead to hard-to-debug software errors that either cause the software to crash or to be vulnerable to security exploits. With RISC-V processors often envisioned as control processors for real-world physical devices (i.e., internet of things devices) these errors can cause unreliability or security vulnerabilities affecting the overall safety of the systems."

Krste Asanović, the chair of the RISC-V Foundation, welcomed the researchers' contributions. He said the RISC-V Foundation has formed a working group, headed by Martonosi's former graduate student and co-researcher Daniel Lustig, to solve the memory-ordering problems. Asanović, a professor of electrical engineering and computer science at the University of California-Berkeley, said the RISC-V project was looking for input from the design community to "fill the gaps and the holes and getting a spec that everyone can agree on."

"The goal is to ratify the spec in 2017," he said. "The memory model is part of that."

Lustig, a co-author of Martonosi's recent paper and now a research scientist at NVIDIA, said work was underway to improve the RISC-V memory model.

"RISC-V is in the fortunate position of being able to look back on decades' worth of industry and academic experience," he said. "It will be able to learn from all of the insights and mistakes made by previous attempts."

The RISC-V instruction set was first developed at UC-Berkeley, with the idea that any designer could use the instruction set to create processor cores and software compilers. The project is now run by the RISC-V Foundation, whose membership includes a roster of universities, nonprofit organizations and top technology companies, including Google, IBM, Microsoft, NVIDIA and Oracle.

Martonosi's team discovered the problems when testing their new system to check memory operations across any computer architecture. The system, called TriCheck, allows designers and others interested in working with a design, to detect memory ordering errors before they become a problem. The tool has three general levels of computing: the high-level programs that create modern applications from web browsers to word processors; the instruction set architecture that functions as a basic language of the machine; and the underlying hardware implementation, a particular microprocessor designed to execute the instruction set.

The memory ordering challenge stems from the complexity of modern computers. As designers squeeze more performance out of computer systems, they rely on many concurrent operations sharing the same sections of computer memory. This parallel, shared-memory operation is extremely efficient, both for speed and power usage, but it puts a heavy demand on the computer's ability to interleave and properly order memory usage. If, for example, several processes are using the same section of memory, the computer needs to make sure that operations are applied to memory in the correct order, which may not always be the order in which they arrive from different concurrently running processors.

Subtle changes in any of the three computing levels — the machine level, the compiler and the high-level programming languages — can have unintended effects on the other layers. All three have to work together seamlessly to make sure memory errors don't crop up. One advantage of TriCheck is that it allows experts in one of these layers to avoid conflicts with the other two layers, even if they do not have expertise in them.

"If I write a program in C, it makes some assumptions about memory ordering," said Martonosi. "Subsequently, a different set of memory ordering rules are defined by the instruction-set architecture. We need to check that the high-level program's assumptions are accurately supported by the underlying instruction set and processor design."

However, the researchers said the TriCheck's greatest strength is its ability to give designers a broad view of memory usage. Although designers have long been interested in this perspective, previous attempts to comprehensively analyze memory operations have been too slow to be practical.

TriCheck is able to check memory ordering efficiently by using succinct formal specifications of memory ordering rules, known as axioms. For a given program, compiler, instruction set and hardware implementation, TriCheck can enumerate many ordering possibilities from these axioms, and then check for errors. By expressing the memory-ordering possibilities as connected graphs, TriCheck can identify potential errors by looking for cycles in the graphs. These checks can be done very efficiently on modern high-performance computers, and TriCheck's speed has allowed it to explore larger and more complex designs than prior work.

"TriCheck is an important step in our overall goal of verifying correct memory orderings comprehensively across complex hardware and software systems," she said. "Given the increased reliance on computer systems everywhere — including finance, automobiles and industrial control systems — moving towards verifiably correct operation is important for their reliability and safety."

No comments:

Flaherty Publishing

By looking across all the different technologies and markets in the embedded space, this blog pulls together trends and opportunities through exclusive news, video and comment that you might not have seen from sites dedicated to individual topic areas. The labels below allow you to select your own interest areas, and please look through the archive.