Filter Example

This is a simple filter example. It uses the AgeOffFilter that is provided as
part of the core package org.apache.accumulo.core.iterators.user. Filters are
iterators that select desired key/value pairs (or weed out undesired ones).
Filters extend the org.apache.accumulo.core.iterators.Filter class
and must implement the method accept(Key k, Value v). This method returns true
if the key/value pair is to be delivered and false if it is to be ignored.
Filter takes a “negate” parameter which defaults to false. If set to true, the
return value of the accept method is negated, so that key/value pairs accepted
by the method are omitted by the Filter.
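
The accept/negate behavior described above can be sketched in plain Java. This
is only a model: SimpleFilter, delivers and deliveredKeys are hypothetical
names, and String keys and values stand in for Accumulo's Key and Value so the
sketch runs without Accumulo on the classpath. It is not the actual Filter
implementation.

```java
import java.util.ArrayList;
import java.util.List;

public class NegateSketch {
    /** Hypothetical stand-in for a filter; not Accumulo's Filter class. */
    abstract static class SimpleFilter {
        private final boolean negate;

        SimpleFilter(boolean negate) {
            this.negate = negate;
        }

        /** Subclasses decide whether a key/value pair is accepted. */
        abstract boolean accept(String key, String value);

        /** With negate set, the result of accept is inverted. */
        final boolean delivers(String key, String value) {
            return accept(key, value) != negate;
        }
    }

    /** Returns the keys delivered by a filter that accepts non-empty values. */
    public static List<String> deliveredKeys(boolean negate, List<String[]> entries) {
        SimpleFilter filter = new SimpleFilter(negate) {
            @Override
            boolean accept(String key, String value) {
                return !value.isEmpty();
            }
        };
        List<String> out = new ArrayList<>();
        for (String[] entry : entries) {
            if (filter.delivers(entry[0], entry[1])) {
                out.add(entry[0]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> entries =
            List.of(new String[] {"row1", "x"}, new String[] {"row2", ""});
        System.out.println(deliveredKeys(false, entries)); // [row1]
        System.out.println(deliveredKeys(true, entries));  // [row2]
    }
}
```

With negate left at false the accepted pair is delivered; with negate set to
true the same accept method causes that pair to be omitted instead.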

Note the absence of the entry inserted more than 30 seconds ago. Because the
scope was set to “scan”, the entry is still in Accumulo but is being filtered
out at query time. To delete entries from Accumulo based on the age of their
timestamps, AgeOffFilters should also be set up for the “minc” and “majc”
scopes.
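
The difference between filtering at scan time and filtering during compaction
can be illustrated with a small model. The sketch below assumes an age-off
rule of the form "keep the entry while now - timestamp <= ttl" and represents
stored entries as a list of timestamps in milliseconds; the class and method
names are hypothetical and none of this is Accumulo code.

```java
import java.util.ArrayList;
import java.util.List;

public class AgeOffScopeSketch {
    // Assumed age-off rule: keep an entry while its age is at most ttlMs.
    public static boolean accept(long timestampMs, long nowMs, long ttlMs) {
        return nowMs - timestampMs <= ttlMs;
    }

    // Scan scope: returns a filtered view and leaves the stored entries untouched.
    public static List<Long> scan(List<Long> stored, long nowMs, long ttlMs) {
        List<Long> visible = new ArrayList<>();
        for (long ts : stored) {
            if (accept(ts, nowMs, ttlMs)) {
                visible.add(ts);
            }
        }
        return visible;
    }

    // Compaction scopes (minc/majc): rewrite the stored entries, dropping aged-off ones.
    public static void compact(List<Long> stored, long nowMs, long ttlMs) {
        stored.removeIf(ts -> !accept(ts, nowMs, ttlMs));
    }

    public static void main(String[] args) {
        List<Long> stored = new ArrayList<>(List.of(60_000L, 95_000L));
        long now = 100_000L;
        long ttl = 30_000L;
        System.out.println(scan(stored, now, ttl)); // [95000]
        System.out.println(stored.size());          // 2, the old entry is still stored
        compact(stored, now, ttl);
        System.out.println(stored.size());          // 1, the old entry is gone
    }
}
```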

To force an ageoff of the persisted data, after setting up the ageoff iterator
on the “minc” and “majc” scopes you can flush and compact your table. This will
happen automatically as a background operation on any table that is being
actively written to, but can also be requested in the shell.

The first setiter command used the special -ageoff flag to specify the
AgeOffFilter, but any Filter can be configured by using the -class flag. The
following commands show how to enable the AgeOffFilter for the minc and majc
scopes using the -class flag, then flush and compact the table.

By default, flush and compact execute in the background, but with the -w flag
they do not return until the operation has completed. Both forms are
demonstrated above, though only one call to each would be necessary. The
table to operate on can be specified with -t.

After the compaction runs, the newly created files will not contain any data
that should have been aged off, and the Accumulo garbage collector will remove
the old files.

When setting new iterators, assign their priority numbers (specified with -p)
in the order you would like the iterators to be applied; iterators with lower
priority numbers are applied first. Also, each iterator must have a unique
name and a unique priority within each scope.
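
The priority and uniqueness rules can be sketched as a small check over the
iterators configured for one scope. Setting and applicationOrder are
hypothetical names, and the sketch assumes that iterators are applied in
ascending priority order; it only illustrates the rule and is not how
Accumulo validates its configuration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class IteratorOrderSketch {
    /** Hypothetical holder for one iterator's name and priority in a scope. */
    public static final class Setting {
        final String name;
        final int priority;

        public Setting(String name, int priority) {
            this.name = name;
            this.priority = priority;
        }
    }

    /**
     * Returns iterator names in the order they would be applied, assuming
     * ascending priority. Rejects duplicate names or priorities in the scope.
     */
    public static List<String> applicationOrder(List<Setting> scopeSettings) {
        Set<String> names = new HashSet<>();
        Set<Integer> priorities = new HashSet<>();
        for (Setting s : scopeSettings) {
            if (!names.add(s.name)) {
                throw new IllegalArgumentException("duplicate name: " + s.name);
            }
            if (!priorities.add(s.priority)) {
                throw new IllegalArgumentException("duplicate priority: " + s.priority);
            }
        }
        List<Setting> sorted = new ArrayList<>(scopeSettings);
        sorted.sort(Comparator.comparingInt(s -> s.priority));
        List<String> order = new ArrayList<>();
        for (Setting s : sorted) {
            order.add(s.name);
        }
        return order;
    }

    public static void main(String[] args) {
        List<Setting> scan = List.of(new Setting("myfilter", 20), new Setting("ageoff", 10));
        System.out.println(applicationOrder(scan)); // [ageoff, myfilter]
    }
}
```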