Riak 1.2 Release Notes

Features and Improvements for Riak

Aggregation of non-streamed MapReduce results was
improved. Previous versions used an O(n^2) process, where n is
the number of outputs for a phase. The aggregation in Riak 1.2 is
O(n) (riak_kv#331,333, riak-erlang-client#58,59).

Timeouts of MapReduce jobs now produce less error log spam. The
safe-to-ignore-but-confusing {sink_died, normal} messages have
been removed. (riak_pipe#45)

The riak-admin transfers command now reports the
status of active transfers. This gives more insight
into what transfers are occurring, their type, their running time,
and the rate at which data is being transferred. Calling this
command will no longer stall handoff.

The memory storage backend for Riak KV now supports secondary indexes and has a "test" mode that lets developers quickly clear all local storage (useful in the context of an external test suite).

Protocol Buffers Enhancements

The server-side design of the Protocol Buffers interface has been significantly refactored, allowing sub-applications other than Riak KV to supply services to clients.

Secondary indexes can be natively queried from Protocol Buffers clients. They no longer need to emulate them with MapReduce.

Riak Search indexes can be natively queried from Protocol Buffers clients. They no longer need to emulate them with MapReduce.

Stats improvements

Getting stats from riak_kv should no longer time out under very heavy load, as there is no longer a gen_server
process for stats.

Stats can still be retrieved as before, with the addition that one can now attach to a node and
query stats directly through folsom. Use folsom_metrics:get_metrics() to see a list of available stats.
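A minimal sketch of such a session, assuming you are attached to the console of a running node and that the stat name {riak_kv, node_get_fsm_time} is registered as a folsom histogram. The names actually present on your node are whatever get_metrics() returns, so treat the name used here as an assumption:

```erlang
%% Evaluate these expressions in the console of an attached node.

%% List the names of all registered metrics.
Names = folsom_metrics:get_metrics().

%% Assumption: {riak_kv, node_get_fsm_time} appears in Names and is a
%% histogram. This returns summary statistics for its current sample.
folsom_metrics:get_histogram_statistics({riak_kv, node_get_fsm_time}).

%% Raw value of a metric, whatever its type.
folsom_metrics:get_metric_value({riak_kv, node_get_fsm_time}).
```

Checking Names first avoids querying a metric that does not exist on that node.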

Sample types for histogram metrics in riak_kv and riak_search are now configurable. The default is a one-minute sliding window with a random uniform reservoir size of 1028 readings per second. As a result, the following statistics may show slightly different results from pre-1.2 nodes, as there may be fewer readings than the total number of events:

riak_kv_node_get_fsm_siblings

riak_kv_node_get_fsm_time

riak_kv_node_put_fsm_time

riak_kv_node_get_fsm_objsize

You can configure the sample type by adding
{stat_sample_type, {slide, Window::int()}} or {stat_sample_type, {slide_uniform, {Window::int(), Size::int()}}} to your app.config under the section for riak_kv and/or riak_search. You may also change the sample type for a single named stat, like this: {{riak_kv, node_get_fsm_time}, {slide_uniform, {60, 10000}}}
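Put together, the settings above might look like the following app.config fragment. This is a sketch under two assumptions: that Window is expressed in seconds (so 60 matches the one-minute default), and that the per-stat entry sits in the same riak_kv section as stat_sample_type. The values are illustrative, not recommendations.

```erlang
%% app.config fragment (sketch, not a complete configuration).
[
 {riak_kv, [
     %% Sample type for all riak_kv histogram stats: a 60-second window
     %% (assumed unit) with a uniform reservoir of 1028 readings per second.
     {stat_sample_type, {slide_uniform, {60, 1028}}},

     %% Per-stat override, assumed to live in the same section:
     %% a larger reservoir for GET FSM timings only.
     {{riak_kv, node_get_fsm_time}, {slide_uniform, {60, 10000}}}
 ]},
 {riak_search, [
     %% Plain sliding window: keep every reading from the last 60 seconds.
     {stat_sample_type, {slide, 60}}
 ]}
].
```

A plain slide window keeps every reading in the window, while slide_uniform caps the number kept per second, which bounds memory use under heavy load at the cost of sampling.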

Packaging Improvements

A binary package for FreeBSD 9 is now provided

A binary package for SmartOS is now provided

Ubuntu packages for 10.04 (Lucid), 11.04 (Natty), and 12.04 (Precise) are now provided as separate packages

See "Bugs Fixed" for packaging-related bug fixes

LevelDB tuning

Bloom filter code from Google has been added. This greatly reduces the search time for keys that do not exist.

Capability Negotiation

Riak nodes now negotiate with each other to determine supported operating modes,
allowing clusters containing mixed versions of Riak to work properly without special
configuration.

This simplifies rolling upgrades. In the past, users needed to disable new features
during the rolling upgrade, and then enable them after all nodes were upgraded. This
is now handled automatically by Riak.

This change replaces several existing configuration parameters; the old settings
are ignored entirely in Riak 1.2. The values that are no longer used in Riak 1.2,
along with the new behavior, are as follows:

To override capability negotiation (which is discouraged), there is now a per-component override setting
that can be set in app.config. For example, the following could be added to the riak_kv section of
app.config to alter negotiation of the listkeys_backpressure and mapred_system settings:

    %% Override listkeys_backpressure setting to always be set to 'false'.
    %%
    %% Override mapred_system setting to use 'legacy' if all nodes in the cluster
    %% support 'legacy', otherwise use the built-in default setting.
    [{override_capability,
      [{listkeys_backpressure, [{use, false}]},
       {mapred_system, [{prefer, legacy}]}]
     }]

Overhauled Cluster Administration

Riak now provides a multi-phase approach to cluster administration
that allows changes to be staged and reviewed before being committed.

This change allows multiple changes to be grouped together, such as
adding multiple nodes at once, or adding some nodes while removing
others.

This new approach also provides details about how a set of staged
changes will impact the cluster, listing the future ring ownership
as well as the number of transfers necessary to implement the planned
changes.

This new approach is currently implemented only by riak-admin, and
is not yet part of Riak Control. The older riak-admin commands, such
as join, leave, and force-remove, have been deprecated, although they can
still be used by appending -f, e.g., riak-admin join -f.

The new cluster admin interface is accessed through riak-admin cluster:

    Usage: riak-admin cluster <command>

    The following commands stage changes to cluster membership. These commands
    do not take effect immediately. After staging a set of changes, the staged
    plan must be committed to take effect:

       join <node>                    Join node to the cluster containing <node>
       leave                          Have this node leave the cluster and shutdown
       leave <node>                   Have <node> leave the cluster and shutdown

       force-remove <node>            Remove <node> from the cluster without
                                      first handing off data. Designed for
                                      crashed, unrecoverable nodes

       replace <node1> <node2>        Have <node1> transfer all data to <node2>,
                                      and then leave the cluster and shutdown

       force-replace <node1> <node2>  Reassign all partitions owned by <node1> to
                                      <node2> without first handing off data, and
                                      remove <node1> from the cluster.

    Staging commands:
       plan                           Display the staged changes to the cluster
       commit                         Commit the staged changes
       clear                          Clear the staged changes

Enhancements

Known Issues

When returning RpbErrorResp responses to the client, the Protocol Buffers interface now sets the errcode field to 0, whereas before it was 1 or unset. Only client libraries that previously attempted to apply meaning to the errcode field will be affected. Improvement of the error responses from Protocol Buffers is planned for the next major release.

Some spurious messages may be sent to the log after a Pipe-based MapReduce job sent via PBC has been shut down. This does not affect normal operations. (basho/riak_kv#366)

The SmartOS packages were tested against 1.5.x and 1.6.x datasets from Joyent. The newest datasets of SmartOS 1.7.x have not been tested and are not supported currently.

Secondary index queries against a heavily loaded cluster may hit an improperly handled internal timeout and result in error responses. This affects both the HTTP and Protocol Buffers interfaces and has existed since Riak 1.0. (basho/riak_kv#379)

MapReduce queries may print messages in the log of the form [error] Module <module name> must be purged before loading, due to a race in the code that ensures a module is loaded before it is used. This message may be safely ignored. It can be silenced by attaching to the Riak console and evaluating code:purge(<module name>).

Some users may experience a performance regression in 2I compared to 1.0 and 1.1. The problem manifests as higher latencies for range and equality queries. A preliminary investigation suggests the change of the Erlang VM from R14B04 to R15B01 is partially responsible, but there may be other factors.

Notes

The Luke application, and with it the "legacy" MapReduce system, should be considered deprecated. All systems should be configured to use Riak Pipe as their MapReduce system (the default since 1.0). The Luke application may be removed as soon as the next release.

The Innostore storage backend is deprecated and will not be supported in the 1.2 release.