RPA Reporting and Analytics 102: NoSQL and Logging for Business Metrics

Are you interested in logging some custom metrics for your business? You’ve come to the right blog post! First things first: to understand this post, there’s a prerequisite reading assignment, namely my previous post. You’ll need to know where logs come from before you can best understand how to use them.

Assuming you’ve done your homework, let’s pick up where we left off from my previous post. I had gone over using SQL as a data source and emphasized that it’s great for operational reporting, but not as much for business reporting. While all those custom logs are there, they’re sitting in a table called logs inside of a field called raw message, still in their original JSON format. That means that Tableau, Power BI, etc., are going to treat everything inside that raw message as one item and you won’t be able to get individual fields out. To access those values, you’ll need to 'flatten' the JSON, meaning that you’ll have to pull those values out and make them each accessible as their own field. It’s possible, but it requires some extra steps and effort. Here’s an example of a raw message with a custom field highlighted:
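As a minimal sketch of what flattening involves, assume a hypothetical raw message like the one in the code below. The field names (invoiceTotal, vendor, and so on) are illustrative and not taken from a real robot log. The helper walks the JSON and gives every nested value its own dotted field name, which is roughly the shape a BI tool needs:

```python
import json


def flatten(value, prefix=""):
    """Recursively turn nested JSON objects into one flat dict with dotted keys."""
    flat = {}
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else key
            flat.update(flatten(child, path))
    else:
        # Leaf value (string, number, list, ...): store it under its full path.
        flat[prefix] = value
    return flat


# Hypothetical raw message; "invoiceTotal" and "vendor" stand in for custom fields.
raw_message = (
    '{"message": "Invoice processed", "level": "Information", '
    '"processName": "InvoiceDemo", "invoiceTotal": 300, '
    '"vendor": {"name": "Acme", "country": "US"}}'
)

row = flatten(json.loads(raw_message))
for field, value in row.items():
    print(f"{field} = {value}")
```

Each key in row can now be treated as its own column. In practice you would run something like this over every row of the logs table before handing the result to your reporting tool.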

If reporting on custom fields and metrics is really important to you, you’ll probably want to consider Elasticsearch or another NoSQL-based platform. At this point, I’ve mentioned Elasticsearch a few times and haven’t quite explained what it is, so let me do that. Elasticsearch is actually just one part of a three-tiered “stack” provided by Elastic. That stack (a set of technologies that work together as one system) breaks down as follows:

• Beats and Logstash – the ingestion layer, basically how you bring logs into the database. You can ignore these two products because we skip right over them and send logs directly to the database using a logging library called NLog.

• Elasticsearch – as Elastic best puts it, “the heart of the Elastic Stack.” This is the actual database where the logs are sent. Elasticsearch is a super sophisticated database: it stores each log as a “document” and groups related documents together into a collection called an index, which is what makes them fast to search. We’ll talk about indices and index patterns in a bit since they’re critical to understand before you use this tool. The moral of the story is, the indices are what allow you to search Elasticsearch just like you would Google. No complicated queries! Exciting, I know.

• Kibana – Probably the only portion of the Elastic Stack that most of you are going to work with. This is where you get to use the information from the logs to build cool dashboards and get insights. The best part about Kibana is that when you use the Add Log Fields activity as we discussed earlier, those fields are added right to your logs and they’re available to use as a field when building a dashboard! Easy, right? Ninety-nine percent of the time, it’s a no-brainer. The 1% downside is that creating custom fields requires you to write queries using Lucene, which honestly even I struggle with, and I was a developer for three years. Don’t get too discouraged though; for simple calculations like field1/field2, it’s not tough at all. Take a look at their instructions to see how far you can get.

Now, let’s get back to that topic of indices and index patterns. UiPath takes care of the indices for you, so all of your logs are grouped by tenant name, year, and month. The index will look something like “default-2018.10,” meaning these logs came from a tenant called default in October of 2018. Once you get to Kibana, you’re going to have to pick a subset of your indices to report off of. This is called an index pattern. You can choose to be super broad and get ALL your data in Kibana by using a lone asterisk (“*”). You can also narrow down by tenant (“tenant*”), by tenant and year (“tenant-2018*”), etc. Once you have an index pattern created, you’re ready to roll.
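To get a feel for which indices a given pattern would pick up, here is a rough sketch that approximates the wildcard with Python’s shell-style matching. This is only an illustration of the idea, not how Kibana implements it, and every index name other than default-2018.10 is made up:

```python
from fnmatch import fnmatchcase

# Hypothetical index names following the tenant-year.month convention.
indices = [
    "default-2018.09",
    "default-2018.10",
    "default-2019.01",
    "finance-2018.10",
]


def matching_indices(pattern):
    """Return the indices a wildcard pattern would select (shell-style approximation)."""
    return [name for name in indices if fnmatchcase(name, pattern)]


everything = matching_indices("*")                 # every tenant, every month
one_tenant = matching_indices("default*")          # one tenant, all time
one_tenant_year = matching_indices("default-2018*")  # one tenant, one year

print(everything)
print(one_tenant)
print(one_tenant_year)
```

The broader the pattern, the more data your dashboards have to scan, so it is usually worth narrowing to the tenant (and year, if you can) that a dashboard actually needs.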

Creating that index pattern is pretty much the only step you’ll have to do to use the out-of-the-box operational Kibana dashboards I made! You can check them out on UiPath Go! Feel free to use them as a starting point: add things in, take things out, and add business metrics too. There are a few other Kibana dashboards on Go!, so you can pick what works best for your needs. When you download the set I built, the ZIP file will include documentation about how to create index patterns and a few more technical things about the way we send logs.

One thing you should keep in mind before committing to the Elastic Stack is that prior to version 6.8.0 of Elastic, there was no built-in security on Kibana. This meant that anyone who had the link to your Kibana could see all of your data. To get security, purchasing a plugin called X-Pack was required. Luckily, Kibana 6.8.0 and above takes care of this issue, but X-Pack still provides some nifty features like alerting, infrastructure monitoring, and machine learning that might be interesting or useful to you. You can check out the subscription listings to see a comparison of the feature sets.

Like I mentioned earlier, working with custom logs in Kibana is usually a breeze. To make sure you get the best results, here are some best practices you should follow:

1. Don’t use the Log Message activity if you want to get some specific value out of it. It’ll be a nightmare to extract specific words or numbers out of the string of text. For instance, “the $ value of my invoice is 300.” Good luck extracting “300” from your field on Kibana.

2. Make sure your logging is standardized across all of the processes you’re tracking. Otherwise, you’ll have a bunch of strangely named fields and no one will know which process they belong to.

3. Any field you add will be present in all logs until the moment it goes out of scope (meaning the portion of the workflow it’s held in is over), or the moment the job/transaction is over. This means you’ll likely have multiple logs with the same values. If you try to build anything on Kibana without filtering these logs, you’ll have quite a mess. Pro tip: to get only the logs from processes that were completed, you can use this query in the search bar on Kibana: _exists_: "totalExecutionTime" (keep the quotes exactly as they are, don’t add any more).

4. Know the key performance indicators (KPIs) and metrics you want to track before building anything in UiPath Studio. This will help you best design your logging logic and you won’t miss anything along the way.

5. Perhaps one of the most important and most overlooked considerations: DO NOT LOG SENSITIVE DATA. Social security numbers, credit card numbers, etc. are the absolute last things you should be sending anywhere. If you absolutely need to track them, I recommend either using the last 4 digits or applying some sort of hash to them so that no one but you can actually understand what they mean when they’re in plain sight.

6. Understand that the logging level you set determines how detailed your logs are. For instance, verbose logging gets the value of every single variable and argument from every activity. Even if you don’t use a log activity for sensitive data, it will still be logged unless you check ‘private’ on the activity. It’s worth noting that you can set the logging level in a few places: the Robot, Orchestrator, and even the Log Message activity. Remember that you can tell the Robot to log on verbose, but if the Orchestrator logging level is set to information, anything more detailed will be cut before being sent to Elasticsearch or another target.
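Practices 5 and 6 are easier to picture with a small sketch. The Python below is illustrative only and is not UiPath code: the secret key is a placeholder, the keyed hash (HMAC-SHA256) is just one reasonable way to make a value unreadable to anyone without the key, and the level names and their ordering are my assumption of how the levels rank from most to least detailed.

```python
import hashlib
import hmac

# Placeholder key (hypothetical); in practice, load it from a secure store.
SECRET_KEY = b"replace-with-a-secret-from-your-vault"


def last_four(value):
    """Keep only the last four digits of a sensitive number."""
    digits = [ch for ch in value if ch.isdigit()]
    return "".join(digits[-4:])


def keyed_hash(value, key=SECRET_KEY):
    """Replace a sensitive value with an HMAC-SHA256 digest.

    The same input always gives the same digest, so you can still count or
    group by it, but the original value is not readable from the log.
    """
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


# Assumed ordering, most detailed first.
LEVELS = ["Verbose", "Trace", "Information", "Warning", "Error", "Critical"]


def is_forwarded(message_level, robot_level, orchestrator_level):
    """A log survives only if it clears the stricter of the two settings."""
    rank = LEVELS.index
    threshold = max(rank(robot_level), rank(orchestrator_level))
    return rank(message_level) >= threshold


print(last_four("4111-1111-1111-1234"))
print(keyed_hash("4111-1111-1111-1234")[:12])
print(is_forwarded("Verbose", "Verbose", "Information"))
print(is_forwarded("Error", "Verbose", "Information"))
```

The last two calls mirror the scenario in practice 6: with the Robot on verbose and Orchestrator on information, a verbose log is dropped while an error log still goes through.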

Now that we have logging best practices out of the way, you’ll be pleased to know there are a few other NoSQL-based tools that you can send your logs to! As of 19.4, we support connections to MongoDB via NLog. You can find instructions for that target here.

Along those lines, a target for Azure Cosmos DB has also been requested, so there are instructions for that coming in future posts. Though Splunk isn’t technically a NoSQL-based tool, it’s a really popular ask from our customers. We have two ways to connect to Splunk via NLog: the HTTP Event Collector or the Network target. Stay tuned for upcoming posts for those instructions.

If I can offer a word of advice, NoSQL is probably the way to go when it comes to custom logging and business metrics. However, even if you’re following all of our best practices for logging, you also need to make sure that you’re following our best practices for development in general. For instance, did you know that a lot of the processes you’ve been using tons of custom logging with would probably be best suited for a queue?

You’d be surprised how much more efficient your processing (and often reporting) can get when you start leveraging UiPath Queues. You can read more about that in my post RPA Reporting and Analytics 103: Queues.