March 9, 2016

In the GhostRider paper, we presented a memory-trace-oblivious system
comprising a type system, a compiler, ISA additions, and an FPGA hardware
implementation. GhostRider leverages compiler and microarchitecture co-design
to improve upon past systems such as Ascend and Phantom, which protect against
attackers snooping the off-chip DRAM address bus. By exposing
on-chip scratchpads at the ISA level, the GhostRider compiler is able to control
when off-chip accesses occur, preventing timing side channels. Similarly, direct
control of the placement of data in Oblivious RAM (ORAM) or Encrypted RAM (ERAM)
based on static analysis of the program’s access pattern allows for improved
performance with the same security guarantee as placing all data in ORAM. The
GhostRider prototype was implemented using JavaCC for the compiler and a
modified RISC-V Rocket Chip that ran on the Convey HC-2ex.
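
The placement decision can be pictured with a toy example. The sketch below is not the GhostRider compiler. It assumes a hypothetical per-array summary (the array's name and whether any index used to access it depends on secret data) and a simplified rule: arrays with secret-dependent indices go to ORAM, and all other arrays go to ERAM. The names `ArrayInfo` and `place_arrays` are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArrayInfo:
    """Hypothetical per-array summary that a static analysis might produce."""
    name: str
    secret_dependent_index: bool  # True if any access index depends on secret data


def place_arrays(arrays):
    """Map each array name to "ORAM" or "ERAM".

    Assumed rule: an array whose addresses could reveal secrets goes to ORAM;
    an array whose access pattern is independent of secrets only needs
    encryption, so it goes to ERAM.
    """
    return {
        a.name: "ORAM" if a.secret_dependent_index else "ERAM"
        for a in arrays
    }


placement = place_arrays([
    ArrayInfo("histogram", secret_dependent_index=True),  # e.g. hist[secret] += 1
    ArrayInfo("input", secret_dependent_index=False),     # e.g. scanned sequentially
])
print(placement)
```

Under this assumed rule, only `histogram` pays the ORAM cost, while `input` is served from the cheaper ERAM without changing what an observer of the address bus can learn.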

In our prototype, the specific latencies reported for ERAM/ORAM accesses include
overhead for moving data to and from the scratchpads. This overhead is specific
to our choice of the Convey HC-2ex platform as well as our hardware design. Our
simple academic implementation makes many tradeoffs that do not reflect the
fundamental characteristics of the architecture. For example, the transfer unit
expects data to be received from an ORAM controller on a separate FPGA. This
incurs additional latency as the cross-FPGA link is only 32 bits wide and data
must be received and formatted to be written to the scratchpad. Similarly, the
ERAM access latency is dominated by data movement that could be optimized with
more engineering effort. As a result, ORAM latency is less than 10x ERAM
latency in our prototype, whereas prior work reports ORAM bandwidth overheads
of almost 100x.

In light of this discrepancy, one might ask whether the GhostRider
prototype is unrealistic and the experimental results presented in the paper
invalid. However, this reasoning misses the point of why we do experiments in
systems and architecture research: to convince ourselves that our idea works
and that we did not overlook something simple, and to convince the community
that the idea has potential benefit and deserves further investigation. Thus,
when comparing a
new system to GhostRider, it is vital to ensure that parameters such as
ERAM/ORAM latency are consistent with the experimental setup of the new system.
An even better approach would be to re-implement the compiler analysis and
scratchpad architecture in the same system for comparison.
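
To see why the latency parameters matter in such a comparison, consider a deliberately simple cost model. This is a sketch under stated assumptions, not a model of the GhostRider hardware: execution time is taken to be compute cycles plus a fixed latency per ERAM or ORAM access, and every number below is hypothetical. The function names `total_cycles` and `speedup_over_all_oram` are illustrative.

```python
def total_cycles(compute, eram_accesses, oram_accesses, eram_latency, oram_latency):
    """Assumed linear cost model: compute time plus a fixed cost per memory access."""
    return compute + eram_accesses * eram_latency + oram_accesses * oram_latency


def speedup_over_all_oram(compute, eram_accesses, oram_accesses,
                          eram_latency, oram_latency):
    """Speedup of a mixed ERAM/ORAM placement over placing all data in ORAM."""
    all_oram = total_cycles(compute, 0, eram_accesses + oram_accesses,
                            eram_latency, oram_latency)
    mixed = total_cycles(compute, eram_accesses, oram_accesses,
                         eram_latency, oram_latency)
    return all_oram / mixed


# Hypothetical workload: 1000 compute cycles, 90 ERAM-eligible accesses,
# 10 accesses that must stay in ORAM, and an ERAM latency of 100 cycles.
workload = dict(compute=1000, eram_accesses=90, oram_accesses=10, eram_latency=100)

speedup_10x = speedup_over_all_oram(oram_latency=10 * 100, **workload)
speedup_100x = speedup_over_all_oram(oram_latency=100 * 100, **workload)

print(f"ORAM/ERAM latency ratio 10x:  speedup {speedup_10x:.2f}")
print(f"ORAM/ERAM latency ratio 100x: speedup {speedup_100x:.2f}")
```

With these made-up numbers, the same program and the same placement show a speedup of roughly 5x at a 10x latency ratio and roughly 9x at a 100x ratio. Results measured under different ERAM/ORAM latencies are therefore not directly comparable.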