Some supervised learning algorithms (such as decision trees and neural nets)
tend to generalize well, i.e. achieve good classification performance, only
when the classes are roughly equally distributed. If the input data is
unbalanced, for instance when there are only a few objects of the "active"
class but many of the "inactive" class, this node adjusts the class distribution
by adding artificial rows (in this example, rows for the "active" class).

The algorithm works roughly as follows: It creates synthetic rows by interpolating
between a real object of a given class (in the above example "active")
and one of its nearest neighbors (of the same class). It picks a random point
on the line between these two objects and derives the attributes (cell values)
of the new object from this randomly chosen point.
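
The interpolation step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the node's actual implementation: the function name interpolate and the example coordinates are hypothetical, and rows are assumed to hold numeric attributes only.

```python
import random

def interpolate(sample, neighbor, rng):
    """Return a synthetic point on the line segment between two rows.

    Hypothetical helper for illustration; both rows are lists of numbers.
    """
    gap = rng.random()  # random position in [0, 1) along the segment
    return [s + gap * (n - s) for s, n in zip(sample, neighbor)]

rng = random.Random(0)
sample = [1.0, 2.0]    # a real "active" row (made-up values)
neighbor = [3.0, 6.0]  # one of its same-class neighbors (made-up values)
synthetic = interpolate(sample, neighbor, rng)
print(synthetic)
```

Every attribute of the synthetic row lies between the corresponding attributes of the two real rows, because the same random position is applied to all coordinates.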

Options

Class Column

Pick the column that contains the class information.

Nearest neighbor

Determines how many nearest neighbors are considered.
The algorithm picks an object from the target class, randomly selects
one of its nearest neighbors and places the new synthetic example on the
line between the sample and the neighbor.
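
As a rough sketch of how this setting enters the procedure, the snippet below finds the k closest same-class rows and picks one of them at random. Euclidean distance, the helper name nearest_neighbors and the sample data are assumptions made for the example; the node may compute distances differently.

```python
import math
import random

def nearest_neighbors(sample, candidates, k):
    """Return the k rows closest to sample (Euclidean distance, assumed)."""
    others = [c for c in candidates if c is not sample]
    return sorted(others, key=lambda c: math.dist(sample, c))[:k]

# Made-up rows that all belong to the target class.
active = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0]]

rng = random.Random(42)
sample = active[0]
neighbors = nearest_neighbors(sample, active, k=2)
neighbor = rng.choice(neighbors)  # randomly select one of the k neighbors
gap = rng.random()                # random position along the line
synthetic = [s + gap * (n - s) for s, n in zip(sample, neighbor)]
print(neighbors, synthetic)
```

With a larger k, the neighbor may be further away from the sample, so the synthetic rows spread over a wider region of the class.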

Oversample by

Checking this option oversamples each class by the same factor. Specify
how much synthetic data is introduced: a value of 2 adds two more
portions for each class. For example, if 50 rows in the input table
are labeled "A", the output will contain 150 rows belonging to "A"
(the 50 original rows plus 100 synthetic ones).
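
The row counts for this option follow from simple arithmetic, sketched here. The function name rows_after_oversampling and the count for class "B" are hypothetical; only the 50-to-150 example for "A" comes from the description above.

```python
def rows_after_oversampling(class_counts, rate):
    """Each class keeps its original rows and gains rate * n synthetic rows."""
    return {label: n + rate * n for label, n in class_counts.items()}

counts = {"A": 50, "B": 200}  # "B" is a made-up second class
result = rows_after_oversampling(counts, 2)
print(result)  # {'A': 150, 'B': 600}
```

Because every class grows by the same factor, the ratio between the classes stays unchanged with this option.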

Oversample minority classes

This option adds synthetic examples to all classes that are
not the majority class. The output contains the same number
of rows for each of the possible classes.
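
To illustrate the balancing, the following sketch computes how many synthetic rows each class would need to reach the size of the majority class. The function name and the class counts are made up for the example.

```python
def synthetic_rows_needed(class_counts):
    """Number of rows to add per class so every class matches the majority."""
    majority = max(class_counts.values())
    return {label: majority - n for label, n in class_counts.items()}

counts = {"active": 30, "inactive": 200, "unknown": 120}  # made-up counts
needed = synthetic_rows_needed(counts)
print(needed)  # {'active': 170, 'inactive': 0, 'unknown': 80}
```

The majority class receives no synthetic rows, and after the additions every class has 200 rows in this example.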

Enable static seed

Check this option if you want to use a seed for the random number
generator. This will cause consecutive runs of the node to produce
the same output data. If unchecked, each run of the node generates
a new seed. Use "Draw new seed" to randomly draw a new seed.
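
The effect of a static seed can be demonstrated with any seeded random number generator. The sketch below uses Python's random module as a stand-in (the node itself runs on a different generator), and the helper name draw_gaps and the seed value 1234 are hypothetical.

```python
import random

def draw_gaps(seed, n=3):
    """Draw n random interpolation positions from a generator with this seed."""
    rng = random.Random(seed)  # seed=None takes a fresh seed from the system
    return [rng.random() for _ in range(n)]

static_a = draw_gaps(seed=1234)
static_b = draw_gaps(seed=1234)
fresh = draw_gaps(seed=None)

print(static_a == static_b)  # True: same seed, same random choices
print(fresh)                 # varies from run to run
```

Since the synthetic rows depend only on these random choices and the input data, a fixed seed yields identical output on consecutive runs.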

Installation

To use this node in KNIME, install KNIME Core from the KNIME 4.0 update site.
