Creating new fields

One common use for the Code Engine is to modify events by adding new fields. These fields can store the results of lookups or external data, concatenations of existing fields, or other data.

Regardless of how the new fields are created or populated, they're added as new keys in the event object. New keys added via the Code Engine will be replicated as new fields in the Mapper and then added to the table in the data destination.

Suppose we have an event that has an address and we'd like to add the postal code from another table or perhaps from a Geo-IP lookup service. When we add a new field (postal_code in our example below), the Mapper automatically adds that column to our table in the data destination.
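As a minimal sketch of the postal code case, the transform below adds a postal_code key to each event. It assumes the Code Engine calls a transform(event) function for every event, and it uses a hypothetical in-memory POSTAL_CODES table and hypothetical address fields (city, state) in place of a real lookup.

```python
# Hypothetical lookup table; in practice the values could come from
# another table or a lookup service.
POSTAL_CODES = {
    ("Seattle", "WA"): "98101",
    ("Austin", "TX"): "78701",
}


def transform(event):
    # "address", "city", and "state" are assumed field names.
    address = event.get("address") or {}
    key = (address.get("city"), address.get("state"))
    # Adding a key to the event is what creates the new field.
    event["postal_code"] = POSTAL_CODES.get(key)
    return event
```

Events with no matching address get postal_code set to None rather than raising an error.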

Note

Take care when enriching events with data from external sites as this can cause lag if the service you're querying slows down or stops responding. Often, connecting to external/third party services can be more efficient when performed within the data destination.

Discarding events

Suppose that an input sets a field for the user’s login status, and you want to record only events from users who are logged in. The following code could be used to discard events where the user is not logged in.
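A minimal sketch of such a filter, assuming that returning None from transform discards the event and that the login status is stored in a hypothetical logged_in field:

```python
def transform(event):
    # "logged_in" is an assumed field name for the login status.
    if not event.get("logged_in"):
        # Assumption: returning None tells the Code Engine to drop the event.
        return None
    return event
```

Events that lack the field entirely are treated as not logged in and are dropped as well.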

Splitting events

An event can be split into multiple events. For example, suppose incoming events each include a list of websites visited by a user, and you want a separate event for every website that each user visits.

This sample function returns a list of event dictionaries, where each dictionary is composed of a site and the user from the original single event.
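A sketch of what such a function might look like, assuming hypothetical user and sites fields on the incoming event:

```python
def transform(event):
    # "user" and "sites" are assumed field names; "sites" holds a list.
    # One output event is produced per visited site.
    return [
        {"user": event["user"], "site": site}
        for site in event.get("sites", [])
    ]
```

An event with no sites produces an empty list in this sketch.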

After returning multiple events, each event is automatically packaged with a _metadata dictionary corresponding to its parent event. However, the metadata fields on such events are not available for access in the Code Engine. Thus, the _metadata fields cannot be transformed unless explicitly copied to each event object. The following code example amends the previous example with an explicit metadata copy and field assignment:
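A sketch of the split with an explicit metadata copy, again assuming hypothetical user and sites fields. The event_type value assigned here ("site_visit") is illustrative only.

```python
import copy


def transform(event):
    events = []
    for site in event.get("sites", []):
        new_event = {"user": event["user"], "site": site}
        # Deep copy so each child event owns its metadata and can be
        # changed without affecting its siblings or the parent.
        new_event["_metadata"] = copy.deepcopy(event.get("_metadata", {}))
        # Illustrative field assignment on the copied metadata.
        new_event["_metadata"]["event_type"] = "site_visit"
        events.append(new_event)
    return events
```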

Regardless of whether the _metadata dictionary is added automatically or explicitly, it will appear in the Mapper. The _metadata dictionary and its fields are covered separately in the Alooma documentation.

Flattening JSON

If you are importing JSON data that includes nested fields (typically from a webhook, SDK, or REST API data source), you may want to flatten the JSON before it's loaded into your data destination.

When your JSON data is imported, Alooma creates a column in the target data warehouse for every top-level key (except for _metadata). This can be a problem when the value of a key is nested JSON, because the entire nested object becomes the contents of that single column.

Here's an example of a basic JSON flattening function as it might appear in the Code Engine:
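The sketch below is one way such a function could be written. It assumes nested keys are joined with an underscore, leaves lists untouched, and passes _metadata through unchanged. The flatten helper and its parameters are illustrative names, not part of the Code Engine.

```python
def flatten(obj, parent_key="", sep="_"):
    """Return a one-level dict; nested keys are joined with `sep`."""
    items = {}
    for key, value in obj.items():
        new_key = parent_key + sep + key if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested dictionaries.
            items.update(flatten(value, new_key, sep))
        else:
            # Scalars and lists are copied as they are.
            items[new_key] = value
    return items


def transform(event):
    # Keep _metadata out of the flattening and copy it over unchanged.
    metadata = event.pop("_metadata", None)
    flat = flatten(event)
    if metadata is not None:
        flat["_metadata"] = metadata
    return flat
```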

Note

This is not intended to be a one-size-fits-all example of how to flatten JSON. Your data will vary, and you will likely need to modify, perhaps heavily, the example above. That said, the example should help you on your way. If you have questions, please reach out.

Here's some very simple sample data, before flattening (the _metadata is just copied over so it's not important for this example):
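The original sample is not reproduced here, so the data below is hypothetical and only illustrates the shape of the problem. It assumes nested keys are joined with an underscore when flattened.

```python
# Hypothetical event before flattening: "customer" holds nested JSON.
sample_event = {
    "id": 1001,
    "customer": {
        "name": "Ada",
        "address": {"city": "Seattle", "zip": "98101"},
    },
    "_metadata": {"event_type": "orders"},
}

# The same event after flattening, with nested keys joined by "_".
flattened_event = {
    "id": 1001,
    "customer_name": "Ada",
    "customer_address_city": "Seattle",
    "customer_address_zip": "98101",
    "_metadata": {"event_type": "orders"},
}
```

Before flattening, the destination table would get a single customer column holding the whole nested object. After flattening, each value gets its own column.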

Geo-IP resolution (enriching events)

The Alooma Code Engine supports direct extraction of geographical information from IP addresses. This is an example of how to use the Code Engine to enrich existing events with supplemental data. Simply import the geoip library and call the geoip.lookup function on an IP address. The function returns an object containing the country, country code, region, city, and postal (zip) code.

Note

Take care when enriching events with data from external sites as this can cause lag if the service you're querying slows down or stops responding. Often, connecting to external/third party services can be more efficient when performed within the data destination.

Alooma uses IP2Location for Geo-IP resolution and we update to the latest version each month. Geo-IP resolution works on both IPv4 and IPv6 addresses. If an IP address is invalid, or in the rare case that a country cannot be found, then the lookup function returns None. City and postal code data is less comprehensive, and may be None if there is no information for a given IP address.
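As a sketch of how the enrichment might be written, the helper below takes the lookup function as a parameter so that it can be exercised outside the Code Engine. The event field ip and the attribute names read from the result (country, city, postal_code) are assumptions; check them against the object that geoip.lookup actually returns.

```python
def enrich_with_geo(event, lookup):
    """Add geographical fields to `event` using `lookup(ip)`."""
    ip = event.get("ip")  # "ip" is an assumed field name
    if ip is None:
        return event
    geo = lookup(ip)
    # The lookup returns None for invalid or unresolvable addresses.
    if geo is None:
        return event
    # Attribute names are assumptions; city and postal code may be None.
    event["country"] = getattr(geo, "country", None)
    event["city"] = getattr(geo, "city", None)
    event["postal_code"] = getattr(geo, "postal_code", None)
    return event


# In the Code Engine this could be wired up roughly as:
#   import geoip
#   def transform(event):
#       return enrich_with_geo(event, geoip.lookup)
```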

Notification generation

Alooma provides an API to generate notifications that appear in the notification pane of the Dashboard page. You can generate notifications to display information, warnings, and errors. A notification has two string arguments: a title and a description.

Multiple notifications are aggregated by their title when received within 15 minutes of one another. Aggregated notifications can be expanded in the notification pane of the Dashboard page in order to see the separate descriptions for each notification.

Note that when running code in the Code Engine, notifications from the execution will not appear in the notification pane.

Retrieving elements from nested dictionaries

Transform code often has long or nested conditional statements to check for the presence of nested dictionary elements in the event object. This convention can result in cumbersome code, but is necessary to avoid KeyError exceptions when accessing a dictionary.

The following get function provides a shortcut to retrieve values if they exist, and avoids KeyError exceptions if values do not exist.
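One possible shape for such a helper is sketched below. The signature (a dictionary, a list of keys forming the path, and an optional default) is an illustrative choice.

```python
def get(dictionary, path, default=None):
    """Walk `path` (a list of keys) through nested dictionaries."""
    current = dictionary
    for key in path:
        # Stop as soon as a level is missing or is not a dictionary.
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current
```

For example, get(event, ["user", "address", "city"]) returns the city if every level exists and None otherwise, with no chain of conditionals.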

Handling surrogates in data

If your data includes characters that UTF-16 encodes with surrogates, the mapping can fail as the event is processed and the output may become corrupted. The solution is to strip out (or replace) those characters. In our example below, the discard_surrogates() function replaces each such character with a question mark (?).
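A sketch of this approach, assuming Python 3 string semantics: the pattern matches stray surrogate code points as well as every character above U+FFFF (the ones UTF-16 stores as a surrogate pair). The FIELDS_TO_CLEAN map of event types to fields is hypothetical and limits the work to the fields listed in it.

```python
import re

# Lone surrogates plus all code points that need a surrogate pair in UTF-16.
_SURROGATES = re.compile("[\ud800-\udfff\U00010000-\U0010ffff]")

# Hypothetical map: event type -> fields worth checking.
FIELDS_TO_CLEAN = {"comments": ["body", "title"]}


def discard_surrogates(text):
    """Replace each matching character with a question mark."""
    return _SURROGATES.sub("?", text)


def transform(event):
    event_type = event.get("_metadata", {}).get("event_type")
    # Only the fields listed for this event type are scanned.
    for field in FIELDS_TO_CLEAN.get(event_type, []):
        value = event.get(field)
        if isinstance(value, str):
            event[field] = discard_surrogates(value)
    return event
```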

You should not run this code on every field in every event as that may slow processing of events in large volume environments. Rather, create a map of the event types and fields that you do wish to parse and have the transform only check fields in the map.

Hashing information

One way to avoid having Personally Identifiable Information (PII) in your data warehouse is to hash it as it flows through the Code Engine. Here's an example of a basic hash function and transform as it might appear in the Code Engine. In this example, we're looking for events in the "Customer" table, and we'll hash the values of the "Address" and "Income" fields.
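A minimal sketch of the idea, assuming the table name is available as event_type in the event's _metadata and using SHA-256 from the standard hashlib module. The constant and helper names are illustrative.

```python
import hashlib

HASH_TABLE = "Customer"              # table whose events are hashed
HASH_FIELDS = ["Address", "Income"]  # fields to hash within that table


def hash_value(value):
    """Return the SHA-256 hex digest of the value's string form."""
    return hashlib.sha256(str(value).encode("utf-8")).hexdigest()


def transform(event):
    # Assumption: the table name is stored in _metadata["event_type"].
    if event.get("_metadata", {}).get("event_type") == HASH_TABLE:
        for field in HASH_FIELDS:
            if event.get(field) is not None:
                event[field] = hash_value(event[field])
    return event
```

Keep in mind that an unsalted hash of a low-variety value, such as an income figure, can be recovered by hashing candidate values and comparing the results.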

The idea is to specify the table that holds the data and the fields within that table to hash. Here's some very simple sample data, prior to hashing (the _metadata is not important for this example):
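The original sample is not reproduced here. The hypothetical event below shows the kind of input involved, with the table name assumed to be stored in the _metadata event_type field.

```python
# Hypothetical "Customer" event before hashing.
sample_event = {
    "Name": "Jane Doe",
    "Address": "123 Main St",
    "Income": 85000,
    "_metadata": {"event_type": "Customer"},
}
```

After hashing, Address and Income would each hold a fixed-length digest string, while Name and _metadata would be unchanged.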

Prepending the schema onto the event type

When it comes to mapping, there are several options for designing your data destination. For some configurations, using the OneClick mapping makes sense. In others, creating the target schemas based on the source schemas is the right approach. In Alooma, when a value is prepended to an event type and automapping is on, we automatically create a new schema based on that prepended value.

For MySQL inputs, you can take advantage of this by adding the schema to the event type name (so event_type becomes schema.event_type). This can be helpful when sending events from MySQL to schemas in the data warehouse that match the source schemas.
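A sketch of the renaming step follows. The _metadata keys it reads (input_type, schema, and event_type) are assumptions about how the source schema is exposed on MySQL events, so verify them against your own events.

```python
def transform(event):
    metadata = event.get("_metadata", {})
    schema = metadata.get("schema")          # assumed key
    event_type = metadata.get("event_type")  # assumed key
    # Only rename MySQL events that carry both values.
    if metadata.get("input_type") == "mysql" and schema and event_type:
        prefix = schema + "."
        # Avoid prepending twice if the event is processed again.
        if not event_type.startswith(prefix):
            metadata["event_type"] = prefix + event_type
    return event
```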

Working with secrets and alooma.py

At times you will need to pass sensitive information (such as tokens, keys, usernames, and passwords) from the Code Engine. You can define these values as "secrets" via the Alooma API and then reference them in the Code Engine without having to know or show their values.

For example, using alooma.py, you can set (and get and delete) secrets:
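The sketch below assumes client is an already authenticated alooma.py client object, and that it exposes set_secrets, get_secrets, and delete_secret methods. These method names are assumptions and are not confirmed by this page, so check them against the alooma.py reference before relying on them.

```python
def store_credentials(client):
    """Define two secrets on the account (method name is an assumption)."""
    client.set_secrets({
        "my_user": "joe@example.com",
        "my_password": "12345678",
    })


def list_secrets(client):
    """Return whatever the client reports for the stored secrets."""
    return client.get_secrets()


def remove_secret(client, name):
    """Delete a single secret by name (method name is an assumption)."""
    client.delete_secret(name)
```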

In this case, we've set two secrets: one called "my_user" with the value of "joe@example.com" and one called "my_password" with the value of "12345678". Now, once you include alooma_secrets, you can reference those secrets in the Code Engine:
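As an illustration of using the two secrets, the helper below builds an HTTP Basic authorization header from a dict-like secrets mapping. How alooma_secrets exposes that mapping is not shown on this page, so the mapping is passed in as a parameter; the helper name and the Basic auth use case are illustrative assumptions.

```python
import base64


def basic_auth_header(secrets):
    """Build a Basic auth header from the my_user and my_password secrets."""
    # `secrets` is assumed to behave like a dict keyed by secret name.
    token = "{}:{}".format(secrets["my_user"], secrets["my_password"])
    encoded = base64.b64encode(token.encode("utf-8")).decode("ascii")
    return {"Authorization": "Basic " + encoded}
```

The values are read only at the point of use, so they never need to appear as literals in the transform code.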