Version 1.2.0

Version 1.2.0 of Apache NiFi is a feature and stability release that adds many new processors along with record reader and writer services.

Release Date: May 8th, 2017

Highlights of the 1.2.0 release include:

Core Framework

The framework now supports running multiple versions of the same components. This makes upgrades and multi-tenant flows easier to manage and sets the stage for the upcoming Apache NiFi Registry work!

A new provenance repository implementation is available, called 'WriteAheadProvenanceRepository'. It offers a huge performance increase over the standard implementation and indexes events in real time.

New or Improved Processors, Controller Services, and Reporting Tasks

New Record oriented abstraction for reading/writing schema aware event streams from CSV, JSON, AVRO, Grok, and plaintext with easy extension for other formats/schemas
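As a rough, hypothetical illustration of the record-oriented idea (this is not NiFi's actual API), a CSV record reader paired with a JSON record writer behaves conceptually like:

```python
import csv
import io
import json

def csv_records_to_json_lines(csv_text):
    """Parse CSV text into records (dicts) and re-serialize them as JSON.

    The key point mirrored here: the intermediate form is a schema-aware
    record, not raw bytes, so the same records could just as easily be
    written back out as Avro, CSV, or any other supported format.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [json.dumps(row) for row in reader]

lines = csv_records_to_json_lines("name,age\nalice,30\nbob,25\n")
# Each CSV row becomes one JSON object string.
```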

QueryRecord processor to execute SQL queries over a stream of records powered by Apache Calcite
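QueryRecord is powered by Apache Calcite rather than a relational database, but the idea of running SQL directly over an in-memory batch of records can be sketched with Python's sqlite3 as a stand-in:

```python
import sqlite3

# A small batch of records, as a record stream might present them.
records = [("alice", 30), ("bob", 25), ("carol", 41)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flowfile (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO flowfile VALUES (?, ?)", records)

# Select matching records, much as QueryRecord routes the result set
# of each configured query to its own relationship.
adults = conn.execute(
    "SELECT name FROM flowfile WHERE age >= 30 ORDER BY name"
).fetchall()
conn.close()
# adults == [('alice',), ('carol',)]
```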

ConvertRecord processor to efficiently transform records from a given schema and format into another schema and format

SplitRecord processor to efficiently split huge record bundles into configurable batch sizes, whether to divide and conquer or to protect downstream systems
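The batching behavior itself is simple to picture; a minimal sketch (not NiFi code) of splitting a record bundle into bounded batches:

```python
def split_records(records, batch_size):
    """Split a large record bundle into batches of at most batch_size."""
    return [records[i:i + batch_size]
            for i in range(0, len(records), batch_size)]

batches = split_records(list(range(10)), 4)
# batches == [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```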

Processors to efficiently stream Records into and out of Apache Kafka in a format- and schema-aware manner, automatically achieving high throughput and full provenance

Controller Services for plugging into and managing data schemas (Avro Schema Registry, Hortonworks Schema Registry) that integrate nicely into the record readers and writers

Features/improvements related to Change Data Capture (CDC), including CaptureChangeMySQL which reads from the MySQL binlogs, EnforceOrder, and PutDatabaseRecord processors, as well as a "Rollback on Failure" capability of some Put processors

New processors and a controller service to support a Wait/Notify pattern, enabling one portion of a flow to signal another portion to continue or execute.
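The pattern is analogous to a thread-level wait/notify. A minimal Python sketch of the idea (NiFi's Wait/Notify actually coordinates through a distributed cache service, not threads):

```python
import threading

gate = threading.Event()
results = []

def waiting_branch():
    # Like a Wait processor: hold this branch until a signal arrives.
    gate.wait(timeout=5)
    results.append("released")

def notifying_branch():
    # Like a Notify processor: signal the waiting branch to proceed.
    results.append("signaled")
    gate.set()

t = threading.Thread(target=waiting_branch)
t.start()
notifying_branch()
t.join()
# results == ["signaled", "released"]
```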

For those who like to build new capabilities on the fly using scripting languages, you can now write your own reporting tasks, record readers, and writers in various scripting languages, and the ExecuteScript processor now supports Clojure as well

The JSON Jolt Transform processor now allows Jolt transforms to include NiFi expression language statements and is much faster

New processors to compute and compare content using Fuzzy Hashing, powerful for cyber security and other cases
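Fuzzy hashes (such as ssdeep-style similarity digests) yield graded match scores rather than exact equality. As a loose conceptual stand-in, not the actual algorithm, the standard-library difflib produces a similar kind of score:

```python
from difflib import SequenceMatcher

def similarity(a: bytes, b: bytes) -> float:
    """Return a 0..1 similarity score.

    Real fuzzy hashes compare compact digests instead of full content,
    but the output is likewise a graded match score, which is what makes
    them useful for spotting near-duplicate malware samples or documents.
    """
    return SequenceMatcher(None, a, b).ratio()

score = similarity(b"malware sample v1", b"malware sample v2")
# score is high but below 1.0: the inputs are near-duplicates.
```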

New processors to interact with Google Cloud Platform/Google Cloud Storage and Azure Blob and Table Storage

Version 1.1.0

Version 1.1.0 of Apache NiFi is a feature and stability focused release which builds on the great improvements and community progress of the 1.0 release.

Release Date: November 29, 2016

There are many changes in the 1.1.0 release with some highlights including:

Core Framework Improvements

Cluster Management logic has been stabilized and improved to better support zero-master clustering. For example, recovery time is faster because the cluster no longer has to wait an artificial length of time to see whether any new nodes will join before deciding which flow is the golden copy.

The flowfile, provenance, and content repositories have been reworked to support rollback. We've always been careful to ensure that upgrades work well and that existing flow state is honored; what was harder was supporting rollback, where state written by a new version of NiFi must still be usable after the user reverts to an older version. This is now supported. This powerful feature also sets the stage for future work to enable rolling upgrades and automated rollbacks!

Startup times for flows that have large backlogs should be far faster as the swap files have been reworked to provide summaries and avoid the need for full scans.

Developers can now indicate that their processor should be given an instance-isolated classloader. Some libraries, such as the Hadoop client and scripting engines, use static variables that can pollute instances of processors on the graph. This feature allows those cases to be easily overcome by enabling isolation per processor instance. It also makes it easy to expose classloader extension for custom jars to users.

Developers can now migrate in-flight process session state to another process session. This yields higher efficiency and makes for a far easier programming model for aggregation-type patterns such as the one seen in MergeContent.

User Experience Improvements

We now provide visual indication of queue growth relative to back pressure settings and when back pressure is engaged. This will make the concept of congestion and back pressure far more intuitive and frankly it is just fun to see in the UI. Definitely check this out.

After the 1.0.0 release several members of the community expressed how much they love the new look and feel but wished we had kept some of the colors. Better and more intuitive color contrast is back.

Validation now occurs only for components which are not scheduled to execute. This results in much faster UX behavior, as many operations performed through the UI and REST API previously triggered expensive and unnecessary validation.

Users can now export images of the provenance graphs.

Users can now use cron-scheduling for components even on primary node only tasks.

Updated Versions of Dependencies

We now leverage the Azure Event Hubs 0.9.0 client library.

We now interact with Apache Spark using the 2.0.1 libraries.

We now interact with HDFS using the Apache Hadoop 2.7.3 libraries.

New or Improved Processors

New Fetch and Put processors to interact with ElasticSearch 5.0 and new processors to execute Query and Scroll operations against ElasticSearch.

New processors to parse CEF formatted logs

The Extract Email processors now support TNEF formatted attachments.

New processor to validate CSV files.

The Apache Solr processors have been updated to support SSL and Kerberos.

New processors to act as client and server for Websockets.

New Utility

In upgrading from 0.x to 1.x we provided a lot of capabilities to make the process easy and automatic. However, we didn't account for migrating from the embedded use of ZooKeeper to an external instance. We've now provided a utility that helps you migrate NiFi state from one ZooKeeper cluster to another.

Previously it was difficult to change the sensitive property key which is used to encrypt all sensitive properties contained within an actual flow configuration. A utility now exists to easily convert from an old key to a new key which is a valuable piece of an overall security process.

Security Improvements

NiFi now supports the concept of restricted components. These are processors, controller services, and reporting tasks that allow an authorized user to execute unsanitized code or to access and alter files accessible by the NiFi user on the system on which NiFi is running. These components are therefore tagged by the developer as restricted, and when running NiFi in secure mode an administrator must explicitly grant each user access to the policy allowing restricted component access.

Site-to-Site has been improved to work well even when port forwarding is utilized. This is very helpful for cases where an administrator might run NiFi with lower privileges but want external interaction to use well-known privileged ports.

The policy management user experience has been improved to make it more intuitive what is happening in certain cases.

The encrypted configuration feature has now been extended to cover the Login Identity Provider capability. This is really helpful, for example, because your LDAP password can be stored only in encrypted form in the login provider configuration file. Additional work is planned for these encrypted configurations to make interaction with a Hardware Security Module available as well.

Version 0.5.1

Version 0.5.1 of Apache NiFi addresses several bugs and issues.

Release Date: February 26, 2016

Highlights of the 0.5.1 release include:

Highlights of Bugs addressed

Close a case that could lead to data loss: In the event that all flow files assigned to a given resource claim were swapped out, the archive had to reclaim old space or was not engaged, and NiFi was restarted, resource claims could be removed, resulting in data loss.

LDAP based authorization blocking valid API calls: When using LDAP based authentication there were times when the access token wasn't being propagated, which resulted in some valid API calls being rejected, such as when downloading content or templates.

Version 0.5.0

Apache NiFi 0.5.0 includes several exciting new capabilities, including new processors and new developer tools for building and testing processors. Stability and performance continue to be a priority, and a long list of bugs were identified and resolved!

Release Date: February 16, 2016

Highlights of the 0.5.0 release include:

New Application Features

Data inspection: We've greatly enhanced the ability to interact with data and inspect it as it is flowing through NiFi.

Things to make NiFi Development Better and Easier

State Management: As many developers are aware, keeping state in a Processor was often "up to you". This extension to the framework addresses this by adding state management as a core feature. In addition, many existing processors which kept state were modified to take advantage of this new capability.
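Conceptually, the framework now hands each processor a small durable key/value map. A hypothetical sketch of the idea (this is not NiFi's actual StateManager API):

```python
class ProcessorState:
    """Hypothetical sketch of framework-managed processor state.

    NiFi's real state management also supports local vs. cluster-wide
    scope and atomic replacement; this only shows the shape of the idea.
    """

    def __init__(self):
        self._state = {}

    def get_state(self):
        return dict(self._state)

    def set_state(self, new_state):
        self._state = dict(new_state)

# A listing-style processor might remember the last timestamp it saw
# so a restart does not re-ingest everything (key name is invented):
state = ProcessorState()
snapshot = state.get_state()
snapshot["last.listed.timestamp"] = "1455600000"
state.set_state(snapshot)
```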

Testing Improvements: We've improved the performance of many unit tests, added support for Groovy in unit tests, and provided better support for integration testing

Improvements to Existing Capabilities

RELP Support for Syslog: In addition to adding support for RELP (Reliable Event Logging Protocol) for transporting syslog messages, emphasis was put on improving the framework to enable even more extensions in future releases.

S3 Improvements: We've pushed past the 5GB maximum upload limitation of the previous PutS3Object processor by adding Multipart Upload support. Additionally, we added a couple of authorization capabilities, like expression language support for keys, and an exciting new controller service created initially to support fetching from buckets accessed across accounts.

Encryption: A broad set of enhancements were made to improve encryption and decryption of content and associated user documentation.

New Extensions

Script Execution: We've dramatically enhanced the ability to operate on data flowing through NiFi by adding support for launching scripts in a flow. We added a broad set of scripting languages - you can launch JRuby, Groovy, JavaScript, Lua, or Jython scripts with access to FlowFile data, which will allow much more dynamism and quick reaction capability in your data flows.

NIFI-1497 (Resolved) - Access token not included in all requests: An issue with Access Tokens may cause Viewing Content and Custom UIs to fail

NIFI-1527 (Resolved) - Resource Claim counts not incremented on restart for FlowFiles that are swapped out: An issue that under certain conditions can result in data loss

NIFI-1694 (Open) - EncryptContent processor should accept keyring file or individual key file for PGP encryption/decryption: Users upgrading from NiFi ≤ 0.4.1 may encounter errors in PGP key-based encryption/decryption with EncryptContent processors if they have provided an individual key file for the public keyring file or secret keyring file property. A valid keyring file must be provided, and the full userID "Name (Comment) <Email>" form should be provided.

Version 0.4.1

Version 0.4.1 of Apache NiFi is an incremental release addressing several bugs and providing a few minor improvements over the 0.4.0 release.

Release Date: December 22, 2015

Highlights of the 0.4.1 release include:

Bugs addressed

Site-To-Site: If the remote system was applying back-pressure and the sending system attempted to stop the connection, the sending system's flow configuration could hang. If one of the nodes in a cluster being delivered to went offline, it could cause site-to-site to stop delivering to the other nodes under certain conditions. The automatic account request mechanism was broken.

Flow File Ordering: Ordering based on timestamp of flow file entry has been augmented with a sequence counter to provide better ordering precision.
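The effect can be pictured as sorting on a compound key; a small illustrative sketch (not NiFi's internal code):

```python
# FlowFiles that entered within the same timestamp tick previously had
# ambiguous relative order; adding a monotonically increasing sequence
# counter to the sort key makes the ordering deterministic.
flowfiles = [
    {"ts": 100, "seq": 2, "id": "b"},
    {"ts": 100, "seq": 1, "id": "a"},
    {"ts": 99,  "seq": 3, "id": "c"},
]
ordered = sorted(flowfiles, key=lambda f: (f["ts"], f["seq"]))
# ordered ids: c, a, b
```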

Run duration: Only shown now when the processor supports the batching mechanism of run duration.

Version 0.4.0

Version 0.4.0 of Apache NiFi provides substantial improvements in functionality and usability, as well as stability and performance improvements and bug fixes.

Release Date: December 11, 2015

Highlights of the 0.4.0 release include:

New Application Features

Multiple Authentication Mechanisms: NiFi now supports multiple Authentication Mechanisms! No longer is NiFi tied to being either non-secure or security based on two-way SSL but now can provide User Authentication via LDAP. This was a significant undertaking, but has paved the way to far more easily provide new Authentication Mechanisms. Future releases will include additional mechanisms, such as Kerberos.

Drop FlowFiles from Queue: Users are now able to right-click on a Connection and drop the FlowFiles in the queue, rather than relying on FlowFile Expiration to remove unwanted FlowFiles.

Usability Improvements

Explicit Processor Connectivity: Processors that do not expect incoming data will no longer allow incoming Connections. Attempting to draw a connection to a "Source Processor" will show the connection line as a red, dotted line, and will not allow the Connection to be made. Likewise, Processors that require input to perform work will be invalid until they have an incoming Connection. Some Processors may accept incoming data or run without any incoming data. These Processors will be valid regardless of whether or not they have incoming connections.

Getting Started Guide: Getting Started Guide is added to the 'help' screen of the application. This guide provides an introduction to NiFi terms, introduces the key concepts of NiFi and discusses how to work with FlowFiles and their Attributes. This is similar in concept to the User Guide but is far less verbose and explains concepts at a higher level.

Provenance Fetch Event: A new Provenance Event Type (FETCH) was added. This Event Type is used to indicate that an existing FlowFile's contents were modified as a result of obtaining data from an external resource. This is in contrast to a RECEIVE event, which is used to indicate that a FlowFile entered the system as a result of obtaining data from an external resource.

New Extensions

RouteText: Allows user to easily establish queries against textual data. Each line of a FlowFile is matched against the specified rules and routed according to the rules (potentially many lines of text are included in each output FlowFile). Also supports grouping of textual data so that FlowFiles that are output do not contain text from two different groups.
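Conceptually, RouteText evaluates each line against user-defined rules and groups matching lines per rule. A hypothetical sketch of that behavior (the rule names and patterns here are invented for illustration):

```python
import re

# User-defined routing rules: relationship name -> matching pattern.
rules = {
    "errors": re.compile(r"\bERROR\b"),
    "warnings": re.compile(r"\bWARN\b"),
}

def route_lines(text):
    """Group each line of text under the first rule it matches,
    collecting non-matching lines separately, in the spirit of
    RouteText's per-relationship output FlowFiles."""
    routed = {name: [] for name in rules}
    routed["unmatched"] = []
    for line in text.splitlines():
        for name, pattern in rules.items():
            if pattern.search(line):
                routed[name].append(line)
                break
        else:
            routed["unmatched"].append(line)
    return routed

out = route_lines("WARN low disk\nERROR db down\nINFO started")
```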

TailFile: Allows user to "tail" a file, consuming data from the end of the file as it is written by another process. This is typically used to consume data from log files as it is written. Processor will pick up where it left off, even if NiFi is restarted and log files roll over.
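The core trick of tailing, resuming from a persisted byte offset, can be sketched in a few lines (an illustration only; NiFi's TailFile additionally handles log rollover and state persistence):

```python
import os
import tempfile

def tail_new_data(path, offset):
    """Read anything written past `offset`; return (data, new_offset).

    Persisting the returned offset between calls is what lets a tailer
    resume where it left off after a restart.
    """
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
        return data, f.tell()

# Simulate another process appending to a log file.
with tempfile.NamedTemporaryFile("wb", delete=False, suffix=".log") as f:
    f.write(b"line one\n")
    path = f.name

chunk1, pos = tail_new_data(path, 0)
with open(path, "ab") as f:
    f.write(b"line two\n")
chunk2, pos = tail_new_data(path, pos)
os.unlink(path)
# chunk1 == b"line one\n"; chunk2 == b"line two\n"
```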

ListFile / FetchFile: Performs a listing of files in a given directory and fetches those files. These processors differ from GetFile in that the ListFile processor keeps state about files that have been consumed, so that the file can be ingested only once without deleting the source file. Additionally, if the directory being monitored exists in a mounted volume, the state can be shared across the cluster so that a new primary node can pick up where the previous processor left off, and the listing can also be shared across the cluster, distributing the work of pulling in and processing the files.

ListSFTP / FetchSFTP: Performs a listing of files on an SFTP server and fetches those files. Similarly to ListFile / FetchFile, state can be distributed across the cluster so that multiple nodes can perform the work in parallel.

DeleteS3Object: Removes an Object from Amazon S3.

PutHBaseCell / GetHBase / PutHBaseJSON: Allows users to put the contents of a FlowFile to HBase and listen for changes to an HBase table, automatically pulling in the rows that are added/updated.

GetAzureEventHub / PutAzureEventHub: Send the contents of FlowFiles as Events to Microsoft Azure Event Hub or listen for incoming Events on an Event Hub and create FlowFiles for those Events.

GetCouchbaseKey / PutCouchbaseKey: Send the contents of FlowFiles to Couchbase or fetch the contents of a record from Couchbase.

AttributesToJSON: Easily form a JSON document (as the contents of a FlowFile or as an Attribute) from a user-defined set of FlowFile Attributes.
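The transformation itself is straightforward to picture; a minimal sketch (the attribute names below are hypothetical):

```python
import json

def attributes_to_json(attributes, keys):
    """Serialize a chosen subset of FlowFile attributes as a JSON document,
    as AttributesToJSON does for the attributes the user selects."""
    return json.dumps({k: attributes[k] for k in keys if k in attributes},
                      sort_keys=True)

doc = attributes_to_json(
    {"filename": "data.csv", "path": "/in", "uuid": "abc-123"},
    keys=["filename", "path"],
)
# doc == '{"filename": "data.csv", "path": "/in"}'
```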

SplitAvro: Splits a FlowFile that consists of many Avro records into individual FlowFiles, each containing a smaller number of Avro records.

ExtractAvroMetadata: Extracts the metadata from the header of an Avro file and adds the metadata to the FlowFile as a set of Attributes.

Image Content Viewer: When users look at the details of a Provenance Event, in the Content tab, if the View button is clicked, and the contents of the FlowFile are an image, that image will be rendered in the UI, rather than indicating that no viewer is available for this content type.