Operations

Managing your DC/OS NiFi service

This section describes the operational tasks you may need to perform. DC/OS NiFi allows you to:

Update your configuration after launch

Update your placement constraints

Add, replace, restart or resize a node

Back up your application

Use the DC/OS NiFi Administration Toolkit

Use metrics to troubleshoot your nodes

Updating Configuration

You can make changes to the service after it has been launched. Configuration management is handled by the scheduler process, which in turn handles DC/OS NiFi deployment itself.

After you make a change, the scheduler is restarted and automatically deploys any detected changes to the service, one node at a time. For example, a given change will first be applied to nifi-0, then nifi-1, and so on.

Nodes are configured with a “Readiness check” to ensure that the underlying service appears to be in a healthy state before continuing with applying a given change to the next node in the sequence.

Some changes, such as decreasing the number of nodes or changing volume requirements, are not supported after initial deployment. See Limitations.

The instructions below describe how to update the configuration for a running DC/OS service.

Enterprise DC/OS 1.10 and later

Enterprise DC/OS 1.10 introduces a convenient command line option that allows for easier updates to a service’s configuration, as well as allowing users to inspect the status of an update, to pause and resume updates, and to restart or complete steps if necessary.

Prerequisites

The service’s subcommand CLI must be installed and available on your local machine.

You can install just the subcommand CLI by running:

dcos package install --cli --yes nifi

If you are running an older version of the subcommand CLI that doesn’t have the update command, uninstall and reinstall your CLI:

dcos package uninstall --cli nifi
dcos package install --cli nifi

Preparing configuration

If you installed this service with Enterprise DC/OS 1.10 or later, you can fetch the full configuration of a service (including any default values that were applied during installation). For example:

dcos nifi describe > options.json

Make any configuration changes to the options.json file.

If you installed this service with a prior version of DC/OS, this configuration will not have been persisted by the DC/OS package manager. You can instead use the options.json file that was used when installing the service.

NOTE: You need to specify all configuration values in the options.json file when performing a configuration update. Any unspecified values will be reverted to the default values specified by the DC/OS service. See the "Recreating options.json" section below for information on recovering these values.
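The workflow above (fetch the configuration, edit it, then apply it) can be sketched as a small shell helper. This is a minimal sketch, not the definitive procedure: it assumes the subcommand CLI accepts "update start --options=<file>", and the function name start_update is hypothetical.

```shell
#!/bin/sh
# Sketch: apply an edited options.json to a running service.
# Assumption: the subcommand CLI provides "update start --options=<file>".
# start_update is a hypothetical helper name, not part of the CLI.
start_update() {
    options_file="${1:-options.json}"
    # Stop if the file is missing or empty: any value not specified in the
    # file would be reverted to the service default.
    if [ ! -s "$options_file" ]; then
        echo "error: $options_file is missing or empty" >&2
        return 1
    fi
    dcos nifi update start --options="$options_file"
}

# Usage (requires an authenticated DC/OS CLI):
#   dcos nifi describe > options.json
#   ... edit options.json ...
#   start_update options.json
```

The empty-file check is only a guard against the most obvious mistake; it does not verify that every configuration value is present.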

Recreating options.json (optional)

If the options.json from the last service installation or update is not available, you will need to manually recreate it using the following steps.

First, we’ll fetch the default application’s environment, the current application’s environment, and the actual template that maps config values to the environment:

Managing nodes

Adding a Node

The service deploys two nodes by default. You can customize this value at initial deployment or after the cluster is already running. Shrinking the cluster is not supported.

Modify the COUNT setting, for example "node":{"count":3}, to update the node count. If you decrease this value, the scheduler will block the configuration change until the value is restored to its original value or larger.
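Because the scheduler rejects a smaller node count, it can help to compare the current and desired counts before submitting a change. The following is a minimal sketch of such a pre-check; the function names is_uint and check_node_count are hypothetical, and both counts are passed in by hand rather than read from the service.

```shell
#!/bin/sh
# Sketch: refuse a node count change that would shrink the cluster.
# Both helper names are hypothetical; the counts are supplied by the caller.

# Succeeds only if the argument is a non-negative integer.
is_uint() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# check_node_count CURRENT DESIRED
# Returns 0 if DESIRED >= CURRENT, 1 if the change would shrink the cluster,
# and 2 if either argument is not a non-negative integer.
check_node_count() {
    if ! is_uint "$1" || ! is_uint "$2"; then
        echo "error: counts must be non-negative integers" >&2
        return 2
    fi
    if [ "$2" -lt "$1" ]; then
        echo "error: shrinking from $1 to $2 nodes is not supported" >&2
        return 1
    fi
    echo "ok: node count $1 -> $2"
}

# Usage:
#   check_node_count 2 3 && echo "safe to update the count"
```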

Resizing a Node

The CPU and memory requirements of each node can be increased or decreased as follows:

Restarting a Node

This operation will restart a node, while keeping it at its current location and with its current persistent volume data. This may be thought of as similar to restarting a system process, but it also deletes any data that is not on a persistent volume.

Run the following command, replacing <NUM> with the index of the node (for example, nifi-2):

dcos nifi pod restart nifi-<NUM>

Replacing a Node

NOTE: Nodes are not moved automatically. You must perform the following steps manually to move nodes to new systems. You can automate node replacement according to your own preferences.

This operation will move a node to a new agent and will discard the persistent volumes at the prior system to be rebuilt at the new system. Perform this operation if a given system is about to be offlined or has already been offlined.

Run dcos nifi pod replace nifi-<NUM> (for example, nifi-2) to halt the current instance with ID <NUM> (if still running) and launch a new instance on a different agent. For example, let’s say nifi-2's host system has died and nifi-2 needs to be moved.

Now that the node has been decommissioned (if needed by your service), start nifi-2 at a new location in the cluster:

dcos nifi pod replace nifi-2
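The restart and replace commands differ only in the action name, so a thin wrapper can validate the node index before calling the CLI. This is a sketch under that assumption; the wrapper name nifi_pod is hypothetical and only the two pod commands shown in this section are used.

```shell
#!/bin/sh
# Sketch: validate arguments before restarting or replacing a node.
# nifi_pod is a hypothetical wrapper name.
# nifi_pod ACTION NUM, where ACTION is "restart" or "replace".
nifi_pod() {
    action="$1"
    num="$2"
    case "$action" in
        restart|replace) ;;
        *) echo "error: action must be restart or replace" >&2; return 2 ;;
    esac
    case "$num" in
        ''|*[!0-9]*) echo "error: node index must be a non-negative integer" >&2; return 2 ;;
    esac
    # restart keeps the node in place with its persistent volume data;
    # replace discards the persistent volumes and moves the node.
    dcos nifi pod "$action" "nifi-$num"
}

# Usage:
#   nifi_pod restart 2
#   nifi_pod replace 2
```

Keeping replace behind an explicit action argument makes it harder to discard persistent volumes by accident.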

Advanced update actions

The following sections describe advanced commands that can be used to interact with an update in progress.

Monitoring the update

Once the Scheduler has been restarted, it will begin a new deployment plan as individual pods are restarted with the new configuration.

You can query the status of the update as follows:

dcos nifi update status

If the Scheduler is still restarting, DC/OS will not be able to route to it and this command will return an error message. Wait a short while and try again. You can also go to the Services tab of the DC/OS web interface to check the status of the restart.
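The "wait a short while and try again" advice can be automated with a simple retry loop. The sketch below assumes that dcos nifi update status exits with a non-zero status while the Scheduler is unreachable; the function name wait_for_scheduler and its default attempt count and delay are hypothetical choices.

```shell
#!/bin/sh
# Sketch: retry the status query until the Scheduler answers.
# Assumption: "dcos nifi update status" exits non-zero while the Scheduler
# is still restarting. wait_for_scheduler is a hypothetical helper name.
# wait_for_scheduler [ATTEMPTS] [DELAY_SECONDS]
wait_for_scheduler() {
    attempts="${1:-30}"
    delay="${2:-10}"
    i=1
    while [ "$i" -le "$attempts" ]; do
        if dcos nifi update status; then
            return 0
        fi
        echo "scheduler not reachable yet (attempt $i of $attempts)" >&2
        sleep "$delay"
        i=$((i + 1))
    done
    echo "error: scheduler did not respond after $attempts attempts" >&2
    return 1
}

# Usage:
#   wait_for_scheduler 30 10
```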

Pause

To pause an ongoing update, issue a pause command:

dcos nifi update pause

You will receive an error message if the plan has already completed or has already been paused. Once the pause takes effect, the plan enters the WAITING state.

Resume

If a plan is in a WAITING state, as a result of being paused or reaching a breakpoint that requires manual operator verification, you can use the resume command to continue the plan:

dcos nifi update resume

You will receive an error message if you attempt to resume a plan that is already in progress or has already completed.
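The pause and resume rules described above can be summarized as a small lookup. This is an illustrative sketch only: WAITING is the state named in this section, while IN_PROGRESS and COMPLETE are assumed labels for "in progress" and "completed", and the function name update_action_allowed is hypothetical.

```shell
#!/bin/sh
# Sketch: which update action is accepted in which plan state.
# WAITING comes from this section; IN_PROGRESS and COMPLETE are assumed
# state labels. update_action_allowed is a hypothetical helper name.
# update_action_allowed ACTION STATE
update_action_allowed() {
    case "$1:$2" in
        pause:IN_PROGRESS) return 0 ;;  # only a running plan can be paused
        resume:WAITING)    return 0 ;;  # only a waiting plan can be resumed
        *)                 return 1 ;;  # every other combination is an error
    esac
}

# Usage:
#   if update_action_allowed resume WAITING; then
#       dcos nifi update resume
#   fi
```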

Force-Complete

To manually “complete” a step (such that the Scheduler stops attempting to launch a task), you can issue a force-complete command. This instructs the Scheduler to mark a specific step within a phase as complete. You need to specify both the phase and the step, for example:

dcos nifi update force-complete service-phase service-0:[node]

Force-Restart

Similar to force-complete, you can also force a restart. This can be done for an entire plan, a phase, or a specific step.

To restart the entire plan:

dcos nifi update force-restart

Or for all steps in a single phase:

dcos nifi update force-restart service-phase

Or for a specific step within a specific phase:

dcos nifi update force-restart service-phase service-0:[node]
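The three scopes differ only in how many arguments follow force-restart, which a short wrapper can make explicit. This is a minimal sketch using only the command forms shown above; the wrapper name force_restart is hypothetical.

```shell
#!/bin/sh
# Sketch: choose the force-restart scope by the number of arguments.
# force_restart is a hypothetical wrapper name.
#   force_restart               restarts the entire plan
#   force_restart PHASE         restarts all steps in one phase
#   force_restart PHASE STEP    restarts one step within a phase
force_restart() {
    if [ "$#" -gt 2 ]; then
        echo "usage: force_restart [PHASE [STEP]]" >&2
        return 2
    fi
    dcos nifi update force-restart "$@"
}

# Usage (quote the step so the shell does not treat [node] as a glob):
#   force_restart service-phase 'service-0:[node]'
```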

Disaster recovery

Backing up

The DC/OS NiFi framework allows you to back up your DC/OS NiFi application to Amazon S3. The following values are required for backup:

AWS_ACCESS_KEY_ID

AWS_SECRET_ACCESS_KEY

AWS_REGION

S3_BUCKET_NAME
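Before triggering a backup, it can be useful to confirm that all four values are available. The sketch below assumes they are held in shell variables with the same names as the parameters listed above; the function name check_backup_params is hypothetical, and the check reports only the names of missing values, never the values themselves.

```shell
#!/bin/sh
# Sketch: verify that the four required backup values are set.
# Assumption: the values are stored in shell variables named after the
# parameters. check_backup_params is a hypothetical helper name.
check_backup_params() {
    missing=0
    for name in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_REGION S3_BUCKET_NAME; do
        # Look up the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing required backup parameter: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage:
#   check_backup_params || exit 1
```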

To enable backup, trigger the backup-S3 plan with the following plan parameters:

Once this plan execution is completed, the backup will be uploaded to S3.

The backup is taken using the DC/OS NiFi toolkit and is performed by three sidecar tasks:

Backup - Back up to local node (ROOT/MOUNT). The Backup task is responsible for backing up the local application to the local node, which may be on the ROOT or Mount Volume.

Figure 1. Backing up to local node

Upload_to_S3 - Upload the backup from the local node to S3. This sidecar task takes the backup created in Step 1, from the ROOT/Mount volume, and uploads it to Amazon S3 in the Bucket Name specified.

Figure 2. S3 upload

Cleanup - Remove the backup from local node. When Step 2 is complete and the backup has been uploaded to S3, a sidecar task known as Cleanup is triggered. This task removes the backup folder from the local ROOT/Mount volumes.

Figure 3. Cleanup service

DC/OS NiFi Toolkit Commands

The Admin Toolkit contains command line utilities for administrators to support DC/OS NiFi maintenance in standalone and clustered environments. These utilities include:

Notify — The notification tool allows administrators to send bulletins to the DC/OS NiFi UI using the command line.

Node Manager — The node manager tool allows administrators to perform a status check on a node as well as to connect, disconnect, or remove nodes that are part of a cluster.