The Plasma Project

Distributed Filesystem and MapReduce
- by gerd, 2011-10-02

PlasmaFS is a distributed filesystem for large files, implemented
in user space. Plasma Map/Reduce implements the well-known
map/reduce scheme for mapping and rearranging large files. Plasma KV
is a key/value database on top of PlasmaFS.
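
To make the map/reduce scheme concrete, here is a minimal in-memory sketch in Python. It is not Plasma's API (Plasma itself is written in OCaml and works on files stored in PlasmaFS); all function names below are hypothetical and chosen only for illustration. The sketch shows the three steps of the scheme: map each record to key/value pairs, rearrange the pairs so that equal keys are adjacent, and reduce each group of values.

```python
from itertools import groupby
from operator import itemgetter


def map_phase(records, map_fn):
    """Apply map_fn to every input record and collect the (key, value) pairs."""
    pairs = []
    for record in records:
        pairs.extend(map_fn(record))
    return pairs


def shuffle_phase(pairs):
    """Rearrange the pairs: sort by key so that equal keys become adjacent."""
    return sorted(pairs, key=itemgetter(0))


def reduce_phase(sorted_pairs, reduce_fn):
    """Call reduce_fn once per distinct key with all values for that key."""
    results = []
    for key, group in groupby(sorted_pairs, key=itemgetter(0)):
        results.append(reduce_fn(key, [value for _, value in group]))
    return results


# Hypothetical example job: count how often each word occurs.
def word_count_map(line):
    return [(word, 1) for word in line.split()]


def word_count_reduce(word, counts):
    return (word, sum(counts))


def run_word_count(lines):
    pairs = map_phase(lines, word_count_map)
    return reduce_phase(shuffle_phase(pairs), word_count_reduce)
```

A real distributed implementation would run the map and reduce steps on many nodes and do the rearranging with external sorting on disk; the sketch keeps everything in one process so that the data flow stays visible.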

PlasmaFS is deployed on an arbitrary number of namenodes and
datanodes. All data and metadata are replicated. ACID transactions
provide data safety and clear query semantics. PlasmaFS focuses on
large files and block sizes in the range 64K to 1M. It is
error-resilient and extensible.
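
As a rough feel for what these block sizes mean, the following Python sketch computes how many blocks a file occupies. It assumes that 64K and 1M mean 64 KiB and 1 MiB, and the `replicas` parameter is a hypothetical replication factor; the text above does not state one. The function names are made up for this example and are not part of PlasmaFS.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def blocks_needed(file_size, block_size):
    """Number of blocks for a file; a partial last block counts as a full one."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return -(-file_size // block_size)  # ceiling division on integers


def stored_blocks(file_size, block_size, replicas):
    """Total block copies across datanodes (replicas is a hypothetical factor)."""
    return blocks_needed(file_size, block_size) * replicas
```

Under these assumptions a 1 GiB file needs 16384 blocks at 64 KiB and 1024 blocks at 1 MiB, so the larger block size means far fewer block entries to track per file.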

PlasmaFS is accessible through a command-line client (plasma), NFS v3,
and its own native network API.

Plasma is designed for high performance: for example, it uses
shared memory where possible and minimizes network traffic. It is
implemented in OCaml and compiles to machine code (no VM). It
focuses on 64-bit machines. The design, however, also aims at
clean semantics and data safety in order to minimize the risk of
losing data.

Plasma KV is a key/value database whose data files are
stored in PlasmaFS. It targets simple database applications that
are dominated by reads and that need to be extremely scalable.
Unlike many NoSQL implementations, Plasma KV provides high data
safety by using the transactional interface of PlasmaFS. It also
provides strong isolation between readers and writers: in
particular, a writer does not lock readers out.
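
One common way to let a writer proceed without locking readers out is to keep committed versions immutable and publish each new version with a single reference swap. The Python sketch below illustrates that general idea only; it is an assumption for illustration, not a description of how Plasma KV is actually implemented, and `SnapshotStore` and its methods are hypothetical names.

```python
import threading


class SnapshotStore:
    """Toy key/value store: readers see a stable snapshot, writers never block them."""

    def __init__(self):
        self._current = {}  # committed version; never mutated in place
        self._write_lock = threading.Lock()  # serializes writers only

    def snapshot(self):
        """Return the committed version without taking any lock.

        Callers must treat the returned dict as read-only.
        """
        return self._current

    def commit(self, updates):
        """Build a new version from the current one and publish it."""
        with self._write_lock:
            new_version = dict(self._current)  # copy, so old snapshots stay intact
            new_version.update(updates)
            self._current = new_version  # readers pick this up on their next snapshot()
```

A reader that took its snapshot before a commit keeps seeing the old, consistent state, while readers that start afterwards see the new one:

```python
store = SnapshotStore()
store.commit({"a": 1})
old = store.snapshot()
store.commit({"a": 2, "b": 3})
# old still holds {"a": 1}; store.snapshot() now holds {"a": 2, "b": 3}
```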