all been renamed for clarity. The codebase was
upgraded to Java 8 and uses new features in the latest version of the Java SDK, such as optionals and the
streams API, for more expressive and maintainable
code.
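
As a minimal sketch of the style these Java 8 features allow, the example below cleans a list of raw skill names using `Optional` and the streams API. The class and method names (`SkillNameExample`, `normalize`, `normalizeAll`) are hypothetical and chosen only for illustration; they are not taken from the SKILL codebase.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Optional;
import java.util.stream.Collectors;

// Hypothetical example class; not part of the SKILL service.
class SkillNameExample {

    // Trims and lowercases a raw name; returns an empty Optional for null or blank input.
    static Optional<String> normalize(String raw) {
        return Optional.ofNullable(raw)
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .map(s -> s.toLowerCase(Locale.ROOT));
    }

    // Normalizes every name, drops the unusable ones, and removes duplicates.
    static List<String> normalizeAll(List<String> rawNames) {
        return rawNames.stream()
                .map(SkillNameExample::normalize)
                .filter(Optional::isPresent)   // Java 8 has no Optional.stream()
                .map(Optional::get)
                .distinct()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList(" Java ", null, "", "SQL", "java");
        System.out.println(normalizeAll(raw)); // prints [java, sql]
    }
}
```

The null and blank checks that would otherwise need explicit branching are expressed as a single pipeline, which is the kind of readability gain described above.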

The development team that deploys and operates
the SKILL service was also able to reduce the number of lines of
code in the project by ripping out the old servlet
configuration code and using a homegrown web
service chassis instead. This chassis is used in many
web services at CB and handles application startup,
request deserialization and response serialization, and other boilerplate web service functionality, thereby removing
complexity from the SKILL service and improving
ease of maintenance. The service has also been
enhanced to respond with a variety of descriptive
HTTP status codes for various errors, such as 400 Bad
Request errors for improperly structured requests and
401 Unauthorized errors for requests that do not
present the required authentication credentials.
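
The mapping from request problems to status codes can be sketched as follows. This is an illustration under stated assumptions, not the service's actual logic: the class `StatusCodeExample`, the method `statusFor`, the order of the checks, and the "non-blank body" rule standing in for real structural validation are all hypothetical.

```java
// Hypothetical example class; not part of the SKILL service or its chassis.
class StatusCodeExample {
    static final int OK = 200;
    static final int BAD_REQUEST = 400;
    static final int UNAUTHORIZED = 401;

    // Assumed check order: reject missing credentials first, then malformed bodies.
    // A blank body is used here as a stand-in for "improperly structured".
    static int statusFor(String credentials, String body) {
        if (credentials == null || credentials.trim().isEmpty()) {
            return UNAUTHORIZED;
        }
        if (body == null || body.trim().isEmpty()) {
            return BAD_REQUEST;
        }
        return OK;
    }

    public static void main(String[] args) {
        System.out.println(statusFor(null, "{\"text\":\"java developer\"}")); // prints 401
        System.out.println(statusFor("token", ""));                           // prints 400
        System.out.println(statusFor("token", "{\"text\":\"java developer\"}")); // prints 200
    }
}
```

Returning distinct codes lets a caller tell a credentials problem apart from a malformed request without parsing an error message.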

The SKILL service has been available for production use within CB’s technology department for more
than two years at the time of writing. In this time, a
large number of development teams have found
applications for the service.

Figure 4 shows a graph of production traffic to the service over a three-day window, broken down by caller. In total, the service provides skill tagging for over a dozen applications within the CB ecosystem. Traffic patterns vary per customer: some have higher volume, some send spiky bursts of traffic, and so on. Figure 5 shows a graph of production traffic for a single application, CB’s demand data processing system, which runs in very large batches and creates massive amounts of traffic in short bursts. This caller’s traffic was omitted from the graph in Figure 4 to ensure visual legibility. Even during our highest traffic periods, the SKILL service remains highly performant, with a 0.00 percent error rate and a 99th-percentile response time of 35ms. In the past year, we have been able to tune our performance and scalability to these levels by moving the service to a Docker-based containerized infrastructure, which allows us to bring up new instances in seconds during traffic spikes, and also reduces operational overhead costs.

Scaling up our server fleet to handle these traffic
spikes smoothly proved quite difficult. Our first solution to this was to run more instances at all times, but
this was wasteful and expensive. The service itself was
already quite optimized, so there were no easy gains
to be made with regard to performance. Ultimately,
we found that the best solution was to consult with
our users and ask them to build gradual scaling into
their batch processes. Currently, a locally deployable,
offline version of the SKILL service is being developed
that will enable teams to perform skills enrichment
without sending requests to our service at all, at whatever speed their own hardware will allow.
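
One way a batch caller might build in gradual scaling is to ramp its request rate up linearly instead of starting at full speed. The sketch below is a hypothetical client-side illustration; the class `RampUpExample`, the method `allowedRate`, and the linear ramp itself are assumptions, since the approach each team actually took is not described here.

```java
// Hypothetical example class showing a linear ramp-up schedule for a batch client.
class RampUpExample {

    // Returns the request rate (requests per second) permitted at elapsedSeconds.
    // The rate rises linearly from startRate to maxRate over rampSeconds,
    // then holds at maxRate.
    static double allowedRate(double startRate, double maxRate,
                              long rampSeconds, long elapsedSeconds) {
        if (rampSeconds <= 0 || elapsedSeconds >= rampSeconds) {
            return maxRate;
        }
        if (elapsedSeconds <= 0) {
            return startRate;
        }
        double fraction = (double) elapsedSeconds / rampSeconds;
        return startRate + fraction * (maxRate - startRate);
    }

    public static void main(String[] args) {
        // Ramp from 10 to 110 requests per second over 100 seconds.
        System.out.println(allowedRate(10, 110, 100, 0));   // prints 10.0
        System.out.println(allowedRate(10, 110, 100, 50));  // prints 60.0
        System.out.println(allowedRate(10, 110, 100, 200)); // prints 110.0
    }
}
```

A batch job would consult a schedule like this before each burst of requests, giving newly started service instances time to come online before the peak rate is reached.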

Usage and Maintenance

After maintaining the SKILL service in production for some time, we received customer feedback indicating a desire for a service that would return related skills for a skill. We were able to develop and deploy this functionality in a short amount of time and with

Figure 4. Skills Traffic Over a Three-Day Window, Broken Down by Caller.