Setting up MiNiFi CPP 0.2 for Telematics Tracking

Frequently in our IoT edge use cases there is a need to transmit GPS data over varying types of data links such as cellular (3G/4G) and WiFi. This post reviews how to set up MiNiFi CPP on a Raspberry Pi Zero-W with GPS, a 4G cellular modem, and WiFi. A lot of the focus* here is on how you can use the MiNiFi product today to enable your sandbox platform. As a result this example may feel more ‘clunky’ than it should, because many of the actions the tracking use case required had to be worked around this early in MiNiFi’s life.

*How to set up the Cloud/DataCenter NiFi servers for Site-2-Site is not discussed here.

WARN: The version of MiNiFi CPP used is 0.2, which is very early for this product. If you’re reading this blog post from a historical perspective it may still be useful, but if a year has passed it’s highly likely that the product has matured and would change most of the following designs and the user experience for the better.

Prototype Dongle Config

Edge & Platform Requirements

Many people ask me for my opinion on whether the ‘Edge’ is part of the Spoke in a Hub & Spoke architecture. Given the increase in Edge computing, I think of the edge as the Tire, or the rubber on the road. I add this new layer to the traditional Hub & Spoke design because it’s an independent device that comes and goes based on jagged connectivity.

Edge (TIRE) Requirements

Captures DataPoints @ Given Sensor Intervals

GPS, Cell Strength, etc

Can be configured as N-Seconds or CRON like scheduling

Back Pressure

Can Prioritize what expires when storage becomes full

Can Prioritize which messages in Queues/Connections to send downstream first

Multiple Data Links (Cellular and WiFi)

Links can be Encrypted (TLS)

Delivery Guarantees

Can Receive From Hubs (*In MiNiFi C2)

Security

The Hub and Edge both authenticate each other with certificates

Very Small Resource FootPrint of ‘Framework Software [MiNiFi]’

See Prior Blog Post graphs at the Bottom for MiNiFi on Pi Zero-W with Idle System Metrics and even the workflow discussed here.

Platform (HUB & SPOKE) Requirements

Delivery Guarantees

Back Pressure

If a DataLink is down for days, keep the important data

Transfer Throughput

Transfers should perform AUTOMATIC Load Balancing, and discovery between nodes

No Reverse Proxy

No Load Balancer (Soft or Hard)

Cluster AutoDiscover for Hosts to Load Balance on

Compression

Reduce egress costs

Improve throughput (if spare cpu cycles exist)

Security

Edge, Spoke, and Hub all use certificates to authenticate

Hub can PULL data from Spoke

Prevents exposing the DataCenter (No Inbound Connections)

Can use an HTTP_PROXY

This makes many security folks happy to have a single audit point and is valid so long as the bandwidth is available

Data Links can be Encrypted

Logical Architecture of the Platform

From an architecture point of view, each of these ‘zones’ is its own Architecture Failure Domain. This allows each of the zones to fail without causing an outright failure in any of its connected systems. In the simplest terms, the Platform uses a Hub & Spoke (and Tire) architecture where the DataCenter itself is the Hub and the Cloud is the Spoke. There are many reasons to go with this design: it reduces the security surface when multiple utility solutions need data flow, it provides a bi-directional pipeline between the Cloud and DataCenter with Site-2-Site (see NiFi C2 for MiNiFi bi-directional comms), it allows a regional Cloud for latency, and so on. It is important to note that the Site-2-Site capabilities in NiFi provide native load balancing between clusters, compression, encryption, and delivery guarantees for data, and if required can connect through an HTTP_PROXY to access clusters on other network segments.

Hardware

Edge Device

Pi Zero-W (using a U3 C10 Flash Storage Card)

BU-353-S4 GPS Dongle

Huawei Boltz 4G LTE Modem Unlocked

Using Google Fi SIM

Micro USB 2.0 Male to Female OTG Cable

Anker USB Hub

Cloud

4GB 2 Core Server

DataCenter

8 Core 96GB Ram Server

Software

Edge Device

Raspbian GNU/Linux 8 (jessie)

MiNiFi CPP 0.2 (built from source)

GPSD

usb-modeswitch

libqmi

Bash

Cloud

Ubuntu 16.04 Server

NiFi 1.3

DataCenter

Ubuntu 16.04 Server

NiFi 1.3

Code Flow and Implementation

The data flow designed here is very basic, with the goal of just getting GPS data collected while in the field. Four shell scripts were built to provide the needed functionality, and all of them link directly to the Site-2-Site Remote Process Group. All connections are configured as LIFO so that the most recent data is sent to the cloud whenever we regain connectivity, and the older data gets sent when it can. Each connection was configured based on the importance of its data, so that GPS datapoints have up to 100MB of storage while simpler items like the WiFi connection status will only gather 10MB worth of events before expiring them. Time expiration is also being used: GPS events wait 168 hours before expiring if they have not been sent, while other informational events expire after 48 hours.
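To make the queue behaviour described above concrete, here is a minimal Python sketch of a connection that delivers newest-first and expires events by total size and by age. It is a toy model only, not how MiNiFi implements its connections; the class name, method names and the choice to drop the oldest events first are all assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Event:
    payload: bytes
    created: float  # seconds, from any clock that never goes backwards


class BoundedLifoQueue:
    """Toy model of a connection: newest-first delivery with size and age limits."""

    def __init__(self, max_bytes: int, max_age_seconds: float):
        self.max_bytes = max_bytes
        self.max_age_seconds = max_age_seconds
        self._events = deque()  # oldest on the left, newest on the right
        self._bytes = 0

    def _expire(self, now: float) -> None:
        # First drop anything older than the age limit.
        while self._events and now - self._events[0].created > self.max_age_seconds:
            self._bytes -= len(self._events.popleft().payload)
        # Then drop the oldest events until the queue fits the size limit.
        while self._events and self._bytes > self.max_bytes:
            self._bytes -= len(self._events.popleft().payload)

    def put(self, payload: bytes, now: float) -> None:
        self._events.append(Event(payload, now))
        self._bytes += len(payload)
        self._expire(now)

    def take(self, now: float) -> Optional[bytes]:
        """Return the newest unexpired payload (LIFO), or None if empty."""
        self._expire(now)
        if not self._events:
            return None
        event = self._events.pop()
        self._bytes -= len(event.payload)
        return event.payload
```

In this model the GPS connection would be roughly BoundedLifoQueue(100 * 1024 * 1024, 168 * 3600) and an informational one BoundedLifoQueue(10 * 1024 * 1024, 48 * 3600).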

GPS Tracking Flow in NiFi

Shell Scripts in ExecuteProcess Processors

If you want to replicate the above flow, you can find all the scripts and templates linked below. The scripts all generate JSON details about the actions they perform, such as scanning for WiFi SSIDs, resetting the entire 4G LTE modem, disk metrics and even the GPS. The GPS JSON is generated by the GPSD application itself, so we just capture it as is.
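The flow here forwards the GPSD JSON untouched, but if you wanted to inspect or filter it further downstream, a small Python sketch could look like the following. It relies on gpsd emitting one JSON object per line with a "class" field, where position reports have class "TPV" and a "mode" of 2 or 3 once there is a fix. The function name and the shape of the returned record are assumptions for illustration.

```python
import json
from typing import Optional


def extract_fix(line: str) -> Optional[dict]:
    """Return a small position record from one gpsd JSON line, or None.

    None is returned for anything that is not a TPV report with a 2D/3D fix.
    """
    try:
        report = json.loads(line)
    except json.JSONDecodeError:
        return None
    if not isinstance(report, dict) or report.get("class") != "TPV":
        return None
    # mode 0/1 means no fix yet; 2 is a 2D fix and 3 is a 3D fix.
    if report.get("mode", 0) < 2 or "lat" not in report or "lon" not in report:
        return None
    return {
        "time": report.get("time"),
        "lat": report["lat"],
        "lon": report["lon"],
        "speed": report.get("speed"),
    }
```

Other report classes such as SKY, and TPV reports without a fix, are simply skipped rather than treated as errors.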

Setting It all Up

Headless Pi Configuration

To enable ssh by default, drop a 0-byte file called ‘ssh’ into ‘/boot/’. This enables ssh, which on later versions of Raspbian is disabled by default for security.

touch /path/to/flashcard/boot/ssh

Configure the Pi to auto-join your WiFi SSID by editing /etc/wpa_supplicant/wpa_supplicant.conf. This is different from the WiFi script, as it takes effect at boot even if the scripts are not running. You will also have to configure /etc/network/interfaces to autoconnect.

Set up the 4G LTE modem. This seems complex but isn’t too bad. If you Google it you will find lots of other methods to do this, including WvDial, but here QMI is used as it currently appears to be the most common method. Your device may hiccup and require you to run modeswitch again. See the LTE-Connect script for a brute-force way to handle such problems.
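The brute-force idea can be sketched in a few lines of Python: try to connect, and if that fails, reset the modem and try again. This is not the LTE-Connect script itself. The function name is hypothetical, and the two callables stand in for whatever commands you use to bring up the QMI connection and to re-run modeswitch.

```python
from typing import Callable


def connect_with_reset(
    connect: Callable[[], bool],
    reset: Callable[[], None],
    attempts: int = 3,
) -> bool:
    """Try to connect up to `attempts` times, resetting the modem between tries."""
    for attempt in range(attempts):
        if connect():
            return True
        # Only reset if another attempt is still to come.
        if attempt < attempts - 1:
            reset()
    return False
```

Passing the commands in as callables keeps the retry logic testable without a modem attached.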

It Lives!

Depending on the chipset where MiNiFi CPP is installed, you may be required to compile it from source (as was the case for the Raspberry Pi, since it is ARM). If you’re testing this on an x86 system you can just download the tarball. The absolute path of the tarball folder becomes $MINIFI_HOME. Because the scripts in the ExecuteProcess Processors take care of starting almost everything they need, such as the GPS daemons, init.d scripts are not required for anything other than the MiNiFi service itself. You can install MiNiFi for auto-start with: $MINIFI_HOME/bin/minifi.sh install

In my vehicle this device gets a hard power down every time the vehicle’s ignition is turned off, as it is powered from the accessory port. It has continued to operate and restart each time without issue in spite of this behavior.

Conclusion

While some functional needs for our use case have to be written as shell scripts (for now), there are a number of benefits to working within MiNiFi. Some may take it as a negative to have to use shell scripts to enable our functionality, but it should also be seen as the flexibility to leverage ALL of the functions that our System on Chip (Pi Zero-W) provides. Out of the box you’re provided with first-class back-pressure mechanics, allowing edge devices to purge events that could not be sent, either after they lose value or because they are less valuable than others. This includes methods to prioritize events (FIFO/LIFO/etc) and to expire them based on event count per queue, storage utilization, and time. The Site-2-Site mechanism provides compression, encryption and delivery guarantees, letting you focus on what matters rather than on getting the basics of data transfer correct. System impact is almost non-existent for idle MiNiFi systems, allowing it to fit on very small SoCs (Pi Zero-W: 1 CPU, 512MB RAM). The features just keep getting better, with some of what I had to do above already being changed or improved!

ARM still seems to be fresh even in 2017; be ready to compile libraries that are available on x86 but not on ARM.

If you looked at the LTE connection scripts you will see I have taken a very brute-force approach to handling issues. It’s likely there is significant room for improvement in this ENTIRE MiNiFi flow, but for the purpose of demonstration it has served its need.

That said, driving around I have had a few LTE failures that did not recover and need more investigation.

The data link used for Site-2-Site is currently decided by the OS. If we wanted to manipulate it, that may require modifying root, or extending S2S to target a specific interface. Currently we are using WiFi and LTE and grabbing whichever is available as the data link at transfer time. There is significant room to make this process more ‘miserly’ (cost sensitive).
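As a rough illustration of what a more ‘miserly’ policy could look like, the Python sketch below prefers the cheapest link that is up and only lets selected event kinds travel over a metered link. It only models the decision and does not bind Site-2-Site to an interface. The interface names wlan0 and wwan0, the event kinds, and both function names are assumptions, not something the flow above uses.

```python
from typing import Collection, Mapping, Optional, Sequence


def pick_link(link_up: Mapping[str, bool], preference: Sequence[str]) -> Optional[str]:
    """Return the first link in `preference` (cheapest first) that is up."""
    for name in preference:
        if link_up.get(name, False):
            return name
    return None


def should_send(
    link: Optional[str],
    event_kind: str,
    metered_links: Collection[str],
    kinds_allowed_on_metered: Collection[str],
) -> bool:
    """Decide whether an event should be sent now over the chosen link."""
    if link is None:
        return False  # nothing is up, so keep queueing
    if link in metered_links:
        return event_kind in kinds_allowed_on_metered
    return True


# Example policy: WiFi is free, LTE is metered and reserved for GPS events.
link = pick_link({"wlan0": False, "wwan0": True}, preference=["wlan0", "wwan0"])
send_gps = should_send(link, "gps", {"wwan0"}, {"gps"})
send_disk = should_send(link, "disk", {"wwan0"}, {"gps"})
```

With only LTE up, GPS events go out while disk metrics wait in their queue until WiFi returns.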

Today there is no way to ‘identify’ that a specific set of data came from an edge device. It would be nice if all flowfiles were attributed with a specific host serial number that could be used downstream.

In NiFi, flowfiles received over S2S have the s2s.host and s2s.address attributes added to them.
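Until something like that exists, one workaround is to stamp a device identifier into the JSON that the scripts emit. The Python sketch below is one hedged way to do it on a Raspberry Pi, where /proc/cpuinfo carries a "Serial" line. The device_serial field name and both function names are hypothetical.

```python
import json
import re
from typing import Optional

# Matches a line such as "Serial          : 00000000abcdef01".
_SERIAL_RE = re.compile(r"^Serial\s*:\s*([0-9a-fA-F]+)\s*$", re.MULTILINE)


def parse_serial(cpuinfo_text: str) -> Optional[str]:
    """Extract the board serial from the contents of /proc/cpuinfo, if present."""
    match = _SERIAL_RE.search(cpuinfo_text)
    return match.group(1) if match else None


def tag_event(event_json: str, serial: str) -> str:
    """Return the JSON event with a device_serial field added."""
    event = json.loads(event_json)
    event["device_serial"] = serial
    return json.dumps(event, sort_keys=True)
```

On the device you would read /proc/cpuinfo once at startup, pass its contents to parse_serial, and reuse the result for every event.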