Technical Article

Populating Tables With CSV Data Via Sideband Connections

Datagroups and tables are the two primary methods we have in iRules for organizing key and value pairs. Both can be reused for subsequent connections. Datagroups have the advantage of being directly editable from the BIG-IP user interface, however they cannot be modified from within an iRule. This would open a potential security hole by allowing BIG-IP filesystem access from an iRule. Tables on the other hand must be populated from within an iRule, which allows for tracking user state, session data, etc. Table cannot be modified from any user interface though. With the advent of sideband connections, we can now make a connection to a separate service retrieve structured data and populate a table.

Tables were introduced in BIG-IP version 10 as a means to store key and value pairs across connections and sessions. Because tables are stored in memory, they are ultra-low latency and can handle large volumes of transactions. There are many cases when we want to functionality of a datagroup without requiring administrative access to the BIG-IP. With the introduction of sideband connections in version 11, we now have the ability to retrieve data from other sources. We can then parse that data and populate the table.

For this example, we’ve chosen the CSV (comma-separated value) file format to store the tabular data. CSV files are one of the oldest file structures for formatting tabular data. The standard is loosely described in RFC 4180, but there have been several interpretations of CSV structure in the past. We will be using the most common format whereby there is no header line, each field is separated by a comma, and each line terminated with carriage return and line feed (CR+LF).

Session Continuity, Table Locking, and Graceful Failures

The simplest implementation of this functionality would be to periodically delete all table data, retrieve new data, parse, and insert. This method presents a number of challenges for any virtual that is handling connections at a faster rate than the data retrieval can complete and even then this would be a poor implementation. As a remedy, we conduct the process differently, by only deleting old records after all new records have been added. This insures that there are no interruptions in the user experience.

Locking is another consideration that must be taken for high volume implementations. In order to prevent table corruption, we will lock the table to prevent two instances from modifying the same table simultaneously. We handle the locking mechanism using an entry in another table with a timeout and lifetime equal to the table’s cache timeout.

Because we are modifying a potentially mission critical table, we want to handle any interruption in updates gracefully. Therefore, no existing data will be removed until we receive a valid set of new data. If a failure occurs, the next update attempt will not commence until the next scheduled refresh.

Caching

There is a small, but measurable amount of latency in triggering retrieval of new records. This latency was measured as less than one second for a sideband connection from an LTM virtual edition to a web server on the same LAN. In order to avoid each connection from performing this operation, we added a caching mechanism so updates are only initiated periodically. The length of time you’ll want to cache records is dependent on your application. Some content such as a sitemap or a robot.txt file could be updated once daily. Other examples such as redirects or ACLs may need to be refreshed more regularly. This option is tunable via a static variable in RULE_INIT.

Putting It All Together

When we condense all of those provisions into a single process, we wind up with something that looks similar to the diagram below. This process outlines the procedure for importing CSV data. Each additional request during the cache period will bypass the import process and retrieve data directly from the table.

The ability to populate tables with data via a sideband connection opens up many possibilities for handling application logic on the LTM. The biggest advantage is having an off-box method for adding data to iRules. We mentioned a few of the use cases such as redirects, sitemaps, and ACLs, but the list is really limitless. In addition to being able to manually edit the CSV files and import data to the LTM, there is now the capability to ingest dynamically generated data of any format. While the ability to perform redirects with an iRule may seem mundane, the underlying code opens up endless possibilities all of which translate to lower management effort. Until next time, happy coding!

Matt: Aren't they awesome? We've had a ton of ideas for sideband connection Tech Tips stuck in limbo. Now that we've got sideband fucntionality we are working our way down the "if we only had sideband connections" Tech Tip list.