Software engineering – Schibsted Bytes
Insights from products & tech

Comparing three solutions for estimating population sizes

In an earlier post we discussed a service to estimate the number of unique visitors to a website. In this article we explore the algorithms we used to build the system – HyperLogLog (HLL) and the Kth Minimal Value (KMV) – and evaluate each.

When you set up an advertising campaign, it’s crucial to know how many users the campaign will reach. We built an estimation subsystem inside Schibsted’s Audience Targeting Engine (ATE) which can answer questions like:

“How many males from Oslo who are interested in sports visit Schibsted websites during a week?”

“How many users in Helsinki age 25-35 visit Schibsted websites in a day?”

Example: males from Norway who are interested in sports

The Problem

To estimate the number of unique users for the next week, we can look at historical data for the previous weeks and assume that we will get a similar number of users.

From a technical point of view this is a count-distinct problem. We don’t need the exact cardinality, but we do need a reasonably accurate approximation.

As we discussed in our earlier article, we have several limitations:

We should provide the estimate in real time for any future targeting campaigns. This allows advertising campaign managers to try multiple combinations of targeting parameters and select the best one.

In the ATE, we have many values for our targeting parameters. For example, we have more than 100,000 locations, more than 100,000 interests and more than a million search terms, so it’s impossible to calculate in advance all combinations of targeting parameters.

HLL approach

The most popular approach to solve the count-distinct problem is to use the HyperLogLog (HLL) algorithm, which allows us to estimate the cardinality with a single iteration over the set of users, using constant memory.

The popular databases Redis and Redshift as well as the Spark computing framework already have a built-in implementation of this algorithm.

HLL in Spark

Our original approach was to use a Spark cluster. An external job server allows us to execute queries on Spark through a REST API. These queries run on data that is already cached in memory and on disk, so no additional time is needed to load the data from the file system and distribute it across the nodes.

To get an estimate, Spark executes queries using the countApproxDistinct function. For example, to find all males in Oslo aged 25-35, the query looks like the one sketched below.
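The original code sample was not preserved in this export; here is a minimal PySpark sketch of the idea, assuming a hypothetical cached dataset of user records with user_id, gender, city and age fields:

from collections import namedtuple
from pyspark import SparkContext

User = namedtuple("User", ["user_id", "gender", "city", "age"])

sc = SparkContext(appName="audience-estimate")

# Tiny in-memory sample standing in for the real cached dataset.
users_rdd = sc.parallelize([
    User("u1", "male", "Oslo", 30),
    User("u2", "male", "Oslo", 28),
    User("u3", "female", "Oslo", 31),
    User("u4", "male", "Bergen", 27),
]).cache()

males_in_oslo = (users_rdd
    .filter(lambda u: u.gender == "male" and u.city == "Oslo" and 25 <= u.age <= 35)
    .map(lambda u: u.user_id))

# countApproxDistinct builds an HLL internally; relativeSD is the target
# relative standard deviation of the estimate.
print(males_in_oslo.countApproxDistinct(relativeSD=0.05))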

This approach has an issue: Spark has to filter the whole dataset and only then it can create the corresponding HLL. These queries are very fast on small datasets but take up to several seconds on large datasets like ours. That is usually suitable for ad-hoc analytics, but it is not acceptable in our case, because our goal is to get a response within no more than a couple of seconds.

HLL in Redis

Redis uses a slightly different approach. Rather than calculate an HLL at query time, it stores pre-calculated HLLs as a value in its key-value storage. At query time, it only calculates an estimate based on the HLL. So the query executes within milliseconds.

Redis also supports unions of several HLLs. So in order to select people who have been in Oslo or Helsinki, we can build two HLLs for these cities and compute their union. But for intersections – for example, when we need to find people who were both in Oslo and Helsinki – it performs poorly in certain cases, such as when we need to intersect multiple sets or sets of very different sizes. To understand why this happens, we need to look at how the HLL algorithm works.
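As a rough illustration, here is how the counts, a union and an inclusion-exclusion intersection could look with the redis-py client (the key names and user IDs are made up for the example):

import redis

r = redis.Redis()

# One pre-calculated HLL per city; PFADD updates the sketch in place.
r.pfadd("visitors:oslo", "u1", "u2", "u3")
r.pfadd("visitors:helsinki", "u2", "u4")

oslo = r.pfcount("visitors:oslo")          # estimate for Oslo
helsinki = r.pfcount("visitors:helsinki")  # estimate for Helsinki

# Union: merge both sketches into a temporary key and count it.
r.pfmerge("visitors:union", "visitors:oslo", "visitors:helsinki")
union = r.pfcount("visitors:union")

# Intersection via the inclusion-exclusion principle; the errors of the
# three estimates compound here, which is why this degrades in corner cases.
intersection = oslo + helsinki - union
print(oslo, helsinki, union, intersection)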

HLL Algorithm

The algorithm consists of two parts:

In the first part we create a basic HLL array which can be modified if we need to add more users. This is the most time-consuming part.

The second part is the actual evaluation, which is very fast and can be performed at any time.

Calculating the base array

Internally, an HLL is represented as an array M of length 2^b. In order to add an element to the array we need three steps (a small Python sketch follows the list):

Compute the hashcode of the element.

Split the binary hash value in 2 parts.

The first part, of length b, represents the position in the array M; we call it i.

The second part (we call it w) is used to derive the position of the leftmost “1” digit in its binary representation; we call that position leftmostPos(w).

Insert the value into the array using the formula M[i] = max(M[i], leftmostPos(w))
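A minimal Python sketch of these three steps, assuming a 64-bit hash (a production implementation would handle sizing, sparse representations and so on):

import hashlib

b = 4                # number of index bits
m = 2 ** b           # array size
M = [0] * m

def hll_add(M, element):
    # Step 1: compute a hash of the element (64 bits here).
    h = int.from_bytes(hashlib.sha256(element.encode()).digest()[:8], "big")
    bits = format(h, "064b")
    # Step 2: split the binary hash into the index part and the rest.
    i = int(bits[:b], 2)          # first b bits -> array index
    w = bits[b:]                  # remaining bits
    pos = w.find("1")
    leftmost = pos + 1 if pos >= 0 else len(w) + 1   # position of the leftmost 1 (1-based)
    # Step 3: keep the maximum observed position for this register.
    M[i] = max(M[i], leftmost)

for user_id in ["user-1", "user-2", "user-3"]:
    hll_add(M, user_id)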

In the example we take b=4, so the array M has 16 elements. We start with an empty array M:

M = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

Let’s say we insert an element x where hash(x) = 0011000110.

Split the hash value into two parts: the first b = 4 bits are 0011, which gives the index i = 3; the remaining bits are w = 000110, whose leftmost “1” is at position 4.

Insert the position of the leftmost 1 (= 4) at the 3rd position of the M array (array indexes start from zero):

M = [0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

Estimation

To get the cardinality estimate we first calculate the harmonic-mean term over the registers:

Z = 1 / (2^(-M[0]) + 2^(-M[1]) + … + 2^(-M[m-1]))

and then finally determine the count:

E = α · m² · Z

where m = 2^b is the length of the array and α is a constant that corrects for hash collisions.

These formulas may not look very intuitive; for the mathematical proof of the algorithm you can check the original article.
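Continuing the sketch above, the raw estimate can be computed as follows (real implementations such as Redis’s add small- and large-range bias corrections on top):

def hll_estimate(M):
    m = len(M)
    # alpha corrects for hash-collision bias; the constant below is the
    # standard approximation for m >= 128, smaller m use tabulated values.
    alpha = 0.7213 / (1 + 1.079 / m)
    z = 1.0 / sum(2.0 ** -register for register in M)   # harmonic-mean term
    return alpha * m * m * z

print(hll_estimate(M))   # M is the array filled by hll_add above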

HLL Unions

To calculate a union, we need two arrays M1 and M2 with their leftmostPos(w) values already calculated. Based on these two arrays, we calculate a new array M: for each element we apply a formula similar to the one in step 3, M[i] = max(M1[i], M2[i]). This gives us a new base array, so we can run the estimation on it.
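In code this is just an element-wise maximum (a sketch using the same plain-list representation as above):

def hll_union(M1, M2):
    # The register-wise maximum of two HLL arrays is exactly the HLL of the
    # union of the two underlying sets, so no precision is lost.
    return [max(a, b) for a, b in zip(M1, M2)]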

Intersections in HLL

There are two approaches to building intersections on top of HLLs:

The most common approach is the inclusion-exclusion principle. It lets us derive the intersection from the original estimates and their union, based on the fact that the size of the union of two sets equals the sum of their sizes minus the size of their intersection:

|A ∪ B| = |A| + |B| – |A ∩ B|

This idea can be extended to any number of sets. For example, for three sets the formula is slightly more complex:

|A ∪ B ∪ C| = |A| + |B| + |C| – |A ∩ B| – |A ∩ C| – |B ∩ C| + |A ∩ B ∩ C|

Another way of calculating intersections is with the Minhash approach. It suggests building an additional minhash data structure which helps to estimate intersections.

As we will see in the evaluations, the first approach has issues when we need to intersect multiple HLLs, and the second one requires additional time to build the minhash structure.

Kth Minimal Value approach

We wanted to find an algorithm that has the same speed and precision as HLL but supports intersections by design.

This type of data structure exists in the DataSketches framework and is called a Theta sketch. It was developed fairly recently at Yahoo and, to our knowledge, only the Druid database is using it at the moment. The core of a Theta sketch is based on the KMV (Kth Minimal Value) algorithm.

The KMV Algorithm

This algorithm has two steps, similar to HLL.

For each incoming element, calculate a hash value as a floating-point number between 0 and 1, and keep only the K minimal values.

Based on the value of the Kth minimal element, it is possible to make an estimate:

Count = (K – 1) / (value of the Kth minimal sample)
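A minimal Python sketch of the two steps (using a heap to keep the K smallest hash values; a real Theta sketch adds resizing and more careful corrections):

import hashlib
import heapq

def kmv_sketch(elements, k):
    """Return the k smallest hash values (mapped into [0, 1)) of the elements."""
    def h(x):
        digest = hashlib.sha256(x.encode()).digest()[:8]
        return int.from_bytes(digest, "big") / 2 ** 64
    hashes = {h(x) for x in elements}          # de-duplicate identical elements
    return heapq.nsmallest(k, hashes)

def kmv_estimate(sample):
    # Count = (K - 1) / value of the Kth minimal sample
    k = len(sample)
    return (k - 1) / max(sample)

users = [f"user-{i}" for i in range(10_000)]
sketch = kmv_sketch(users, k=2 ** 12)
print(kmv_estimate(sketch))                    # roughly 10,000, within a few percent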

For example, suppose we have only 10 elements and we select K=3.

Hash codes of our elements:

0.02, 0.07, 0.18, 0.20, 0.21, 0.31, 0.56, 0.59, 0.81, 0.96

Because we selected K=3, we keep only 3 elements.

0.02, 0.07, 0.18, …

The estimate in this case is

Count = (3 – 1) / 0.18 ≈ 11.1

This estimate is pretty close, but in this case we are just lucky, because such a value of K is too small for practical purposes.

The best way to pick the value of K is to find a suitable tradeoff between accuracy and sketch size. For example, if we select K=2^12 then the error will be ~3% and the size of the sketch will be no more than 61 KB.

Unions

In order to calculate a union we need to have several sets with K (or less) minimal values in each. Then we can combine them into a single set that will contain the K minimal values of all the lists.
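In code, the union is simply the K smallest distinct values of the combined samples (a sketch in the same style as above):

import heapq

def kmv_union(sample_a, sample_b, k):
    # Combine both K-minimal-value samples, drop duplicates, and keep the
    # K smallest values overall - the result is a valid sketch of A ∪ B.
    return heapq.nsmallest(k, set(sample_a) | set(sample_b))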

For example, we can take the set of 10 elements from the previous example and union it with another set of 10 elements that has 5 elements in common with it, so the expected estimate of the union is 15.

Set 1: 0.02, 0.07, 0.18, 0.20, 0.21, 0.31, 0.56, 0.59, 0.81, 0.96

Set 2: 0.02, 0.12, 0.18, 0.20, 0.21, 0.46, 0.61, 0.66, 0.81, 0.82

We use the same value of K=3 as in the previous example, so we keep only the 3 minimal values from each set.

Set 1: 0.02, 0.07, 0.18, …

Set 2: 0.02, 0.12, 0.18, …

Now we can select the 3 smallest unique values from these two sets.

Union set:

0.02, 0.07, 0.12, …

Count = (3 – 1) / 0.12 ≈ 16.7, while the expected estimate was 15.

Intersections

Intersections can be calculated as an intersection of two arrays with K minimal values. In this case K may become smaller, but if the intersection is big enough this is not a problem.

For example, we can take the same two sets of 10 elements each, with 5 elements in common, from the previous example. In this case, the expected estimate is 5.

We use the same value of K=3 as in the previous example, so we keep only the 3 minimal values.

Set 1: 0.02, 0.07, 0.18, …

Set 2: 0.02, 0.12, 0.18, …

After intersection, we keep only duplicated elements. In this case there are only two of them, so our K reduces to 2.
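A sketch of this simplified intersection recipe (production Theta sketches use a more careful estimator, but the shape is the same; kmv_estimate is the helper defined earlier):

def kmv_intersection(sample_a, sample_b):
    # Keep only the hash values present in both K-minimal-value samples;
    # the effective K shrinks to the number of surviving values.
    return sorted(set(sample_a) & set(sample_b))

common = kmv_intersection([0.02, 0.07, 0.18], [0.02, 0.12, 0.18])
print(common)                      # [0.02, 0.18] - K is reduced to 2
print(kmv_estimate(common))        # (2 - 1) / 0.18 ≈ 5.6, expected 5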

Metrics

Our goal was to build an accurate, high performance system, so the most important metrics for us were:

Accuracy

Time to compute an estimate (query time).

Accuracy in our case means relative error in percent, which can be described as

error = (measured – truth) / truth

HLLs and Theta sketches provide theoretical estimations of this metric, which depend on the size of the data structure. The more accurate the estimate, the more memory it uses. However, in some cases, especially involving intersections, there are no theoretical numbers, so it is useful to get experimental results.

The second important metric is the response time for a user, i.e. how quickly we can compute an estimate from HLLs or Theta sketches. This includes the time that is required to perform a union or an intersection of several sketches.

Memory usage is less important because all solutions use constant memory, and accuracy is usually a tradeoff against memory consumption. Time to build the initial data structure is also not very important because we can always precalculate it.

Experiments

When an advertising campaign is set up to target only males, it reaches millions of users. If it targets people with a specific interest, it may reach just a couple of thousand users. But in both cases it is important to get fast and precise estimates. We conducted a large number of experiments with HLLs and DataSketches; the experiments below are the most representative.

To make sure we could compare relative error and execution time of the algorithms we set up their internal parameters to have a 1% theoretical error.

To make the experiments more descriptive and flexible to set up, the algorithms were conducted on various sets of randomly generated data. The experiments were executed on AWS c4.xlarge instances.

Experiment 1: One category

Some advertising campaigns are very simple and target just one category. In this case, we build an HLL/DataSketch with between 100 and several million elements and query it. All the experimental errors were below the theoretical 1%, and the execution time for all algorithms was less than 0.1 ms.

Experiment 2: Unions of multiple groups of users

In the first article, we explained how we use a grid which can have up to several thousand points to target users by geographical coordinates. Each point has a relatively small number of users, but to get a final estimate we need to union all of them.

Unions of 1000 groups of 1000 unique users:

Algorithm                       Relative Error    Execution time
HLL with inclusion/exclusion    0.25 %            60 ms
HLL with minhash                0.05 %            400 ms
Theta sketches                  0.20 %            30 ms

As we can see from the table, this estimate is very precise and very fast in most cases.

Experiment 3: Intersection of several large groups of users

Another common case is when we want to target users with several targeting parameters. For example, it can be a popular website in a country where we target males of a specific age who are interested in sports. So we get an intersection of several groups with many users.

Intersection of 20 groups, 1 million users each:

Algorithm                       Relative Error    Execution time
HLL with inclusion/exclusion    0.35 %            30 seconds
HLL with minhash                0.60 %            700 ms
Theta sketches                  0.20 %            50 ms

The precision of all algorithms is very good but in this case, HLL with inclusion/exclusion struggles with execution time.

Experiment 4: Intersection of a small and a large group of users

This represents the case when we want to target a relatively small group of users with several targeting parameters. For example, females in a Norwegian town. The group “females” will have millions of users, but a Norwegian town may have only a few thousand people.

Intersection of a group of 1,000 users with a group of 1 million users:

Algorithm                       Relative Error    Execution time
HLL with inclusion/exclusion    34 %              4.4 ms
HLL with minhash                11 %              78 ms
Theta sketches                  13 %              3.2 ms

In this experiment, in contrast to Experiment 3, all implementations turn out to be very fast, but the estimates in these corner cases are not precise enough.

Conclusions

The results of our experiments show that, in general, all algorithms deliver good accuracy and execution time. In simple use cases, when we have only one category or a union of several categories, there is no significant difference between the algorithms. However, in certain cases, especially with intersections, execution time and accuracy were not ideal for the HLL-based solutions.

For our service, we chose the Theta sketch implementation from Yahoo’s DataSketches library because it shows the most stable results for both accuracy and execution time across the different complex cases.


So you have been working on your e-commerce platform and you are almost ready to join the hot sale season (Buen Fin or Black Friday). Below you will find a preparation guide and a checklist to make sure your platform will not let you down.

Scale your infrastructure

Warm up your gears: Whether you are running on AWS or Google Cloud, start talking to your account manager about pre-warming 3-4x your currently running instances, so that you can scale with ease instead of finding yourself out of capacity in the middle of the top sale because you didn’t anticipate your servers’ load.

Make sure you are taking full advantage of the multiple availability zone feature of your cloud hosting. Migrations happen regularly in both Amazon and Google Cloud and can affect your platform’s stability; if one zone is down or having problems, you can still serve your users without any downtime.

Stress test your platform

First of all, prepare your stress-test strategy. This is very important before jumping into different tools. Start by opening your analytics tool. Most common tools offer a user-flow view as stacked columns or funnels. Create the user steps based on every entry point to your platform and build your chart as you can see below:

Or you can use different funnel charts for your starting points (home, product, category and campaign pages).

Now that you know where you will start from, here comes the easy part. There are a lot of free tools to perform stress tests on your platform, such as Apache JMeter, Apache Bench (ab) and Siege; I will focus on open-source and free tools. The one I will use for our example is called Vegeta, which is very easy to use from your Linux terminal.
Example of Vegeta usage:
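The original code sample was not preserved in this export; a typical invocation might look like the following (the URL, rate and duration are placeholders):

echo "GET https://www.example.com/" | \
    vegeta attack -rate=100 -duration=60s | \
    tee results.bin | \
    vegeta report

vegeta attack reads the list of targets from stdin and fires requests at the given rate for the given duration, while vegeta report summarizes latencies and error rates.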
Vegeta is commonly used at Google to run performance and scalability tests on different parts of their technologies, including Kubernetes (the Kubernetes team used Vegeta in their 10,000,000 QPS load test).

Website Speed Test

Personally, my favorite tool for measuring and improving frontend performance is webpagetest.org, as it gives you a beautiful waterfall showing what your assets are, what is taking a long time and what is impacting your page load time. You can see an example of our page load time here: Segundamano.mx web page test.
Looking at the top right of our results page, you will see the following ranking:
First Byte Time (A): It should always be below 500 ms; push for 200 to 300 ms.

Apache gzip configuration (add the following to your .htaccess file):
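The original snippet did not survive the export; a common mod_deflate setup looks like this:

<IfModule mod_deflate.c>
    AddOutputFilterByType DEFLATE text/html text/plain text/css
    AddOutputFilterByType DEFLATE text/javascript application/javascript application/json
    AddOutputFilterByType DEFLATE application/xml image/svg+xml
</IfModule>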
Nginx gzip configurations:
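Again, the original snippet is missing; a typical gzip block for the http or server context looks like this:

gzip on;
gzip_comp_level 5;
gzip_min_length 256;
gzip_proxied any;
gzip_vary on;
gzip_types text/plain text/css application/json application/javascript application/xml image/svg+xml;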
Compress images (A): Ensure all the images you serve are compressed. This is normally a result of configuring your asset bundler; whether you are using Webpack, Gulp or Rollup, you can easily Google the image compression configuration.

Cache static content (F): Make sure you have a good caching strategy, so that the most essential assets of your website are cached and the experience stays fast as users navigate from one page to another. Normally, you will not be able to control caching of external resources, especially marketing or tracking pixels, since they are handled by the ad server provider.

Prepare for Failure

“Anything that can go wrong will go wrong.” Murphy's law

That doesn’t mean you should accept this law and not prepare yourself for platform failure. There is a way to hack this and minimize the risk: operate in a mode of “Productive Paranoia” (from the book Great by Choice by Jim Collins and Morten Hansen).
Start by writing on your whiteboard the critical and essential parts of your platform that could take the whole platform down if they fail. Start with your database, then your APIs, and work your way up to the frontend servers. Make an action plan for what to do if any of these components or microservices fail and how you can recover in the shortest possible time.

Security Penetration Test

This year has been a crazy year for security experts and big companies, with plenty of ransomware attacks and data breaches. You had better test every part of your platform and business, from cloud infrastructure and apps to internal infrastructure, and perform regular social-engineering attempts against your own business to discover your security vulnerabilities early and fix them.

Leboncoin, the leading classified ads platform in France and one of the most visited globally, is powered by a classic three-tier architecture. Our business logic tier is one big monolith, though. It powers the leboncoin.fr website, the iOS and Android applications, and all our APIs, and it continues to serve us, our users and our customers well.

But over time, as a codebase grows in size, so does its complexity. Different parts of the codebase become increasingly tightly coupled with one another (this is referred to as software entropy. You can read more about it in The Pragmatic Programmer book).

This coupling means that changes in one part of the codebase can impact other parts, so they need to be changed in turn – although they are outside the scope of the original feature or fix being made.

Software entropy makes introducing new features slow and complex with a lot of overhead. It also makes it difficult to streamline processes and can cause friction with stakeholders. Refactorings to reduce the technical debt become expensive, complex and time consuming, making them less likely to happen.

Working with such a codebase can really slow you down. To address the challenges of software entropy, ever-growing complexity and the resulting decline in velocity, the backend team at leboncoin has devised a three-part strategy, which will eventually take us to a microservices architecture-based backend.

Part one: Feature Freeze (a.k.a don’t touch it!)

Throwing away the existing codebase is obviously not an option, and rewriting from scratch is a risky endeavor that would bring the introduction of new features to a halt. Feature freeze basically means that the legacy codebase enters maintenance mode: changes are limited to bug fixes or otherwise urgent changes, and no new features are introduced. Maintenance can be a resource hog, but the good news here is that our codebase is covered by an extensive regression test suite. These tests run as a nightly job in our CI environment, alerting us when and if the monolith needs attention. The monolith can rest comfortably while we transition to our target architecture.

Part two: APIs, APIs everywhere

The pun about “rest” in the previous section was intended. As stated above, part of our strategy is to transition our entire backend to a set of RESTful API endpoints implemented in microservices and following a microservices architectural style.

The first step is to create shims over the legacy (non-RESTful) APIs (the blue blocks in the diagram above). Creating a shim in this context means introducing a new API using only the existing software. In other words, we are not going to write new code for functionality already provided by the legacy monolith, but introduce “old new” APIs based on existing code. Once all of our clients have migrated to these new APIs, we are free to rewrite the underlying code as we see fit, without putting current developments at risk or delaying new features. This work is already underway, with components like authentication and user account management in production.

Part three: Code, Ship, Repeat using API Gateways

The third stage of our strategy is to create microservices to implement new RESTful APIs for our clients to ship new features from now on. The new APIs would be implemented by cohesive microservices containing a small number of strongly related API endpoints.

To support this microservice-based architecture, we have introduced a new piece of infrastructure: an API Gateway. Using an API gateway lowers the barrier to enabling a new microservice in production. The team or developer responsible for a microservice just needs to deploy it and configure the API gateway to route incoming requests to it. And voilà, the microservice is up and running in production, ready to process requests.

The API Gateway is the first step in a broader plan we are laying out to build an infrastructure that supports our migration to a microservices architecture ... but that will be a topic for another post!

These are really exciting times to be a software engineer at leboncoin. There is a lot going on, on many fronts, and not only on the backend side. So if you enjoy taking on interesting technical challenges and working with cool new technologies (did I mention that we have switched to golang as a primary programming language?), we are looking for talented software engineers to join us.


Developing UI at scale can be challenging. At Schibsted, our large, distributed team of iOS developers collaborates on a single codebase. Our modular UI is shared by several apps, each of which applies a unique theme. Our requirements are effectively the worst-case scenario for using Storyboards and Interface Builder.

Our solution to these requirements is Layout, a new collaboration-friendly UI system for iOS.

The Challenge

We needed a system that:

Makes it easy for multiple developers to collaborate on the same screens and components, and to review and merge changes using standard tools.

Allows for faster UI development than recompiling the whole app after every change.

Supports runtime theming and customization.

Is modular and composable.

Is simple and familiar for existing iOS developers.

Can deliver over-the-air updates for bug fixes or A/B testing.

So what’s wrong with the tools that Apple provides with the iOS SDK?

The Trouble with Interface Builder

When Apple first released the iOS SDK in 2008, they included Interface Builder, a WYSIWYG drag-and-drop tool originally developed for laying out desktop application windows, now adapted for the iPhone touchscreen.

Right from the outset, the iOS developer community was divided on whether this was the right approach, and many eschewed Interface Builder in favor of creating views programmatically.

Interface Builder has come a long way since it was first introduced, but the reasons for not using it remain largely unchanged:

Nibs and Storyboards (the serialized UI files exported by Interface Builder) only support a subset of UIKit features. Any non-trivial screen will require parts of the layout or control wiring to be done in code, resulting in an awkward division of logic between the Nib/Storyboard file and source code, making it harder for developers to find where a change needs to be made.

The XML format for Nib files (xib) is chock-full of unique machine-generated identifiers, making it impossible to effectively edit, review or merge changes. They can only safely be worked on by one developer at a time, even when using Git.

Nibs cannot easily be subclassed or composed, and cannot reference code constants for fonts, colors, etc. All properties set inside Nibs and Storyboards are effectively "magic" values that cannot be assigned meaningful names, and can only be reused by copying and pasting them between screens that share common elements.

Changing the appearance of views at runtime (e.g. for night mode, or to support multiple themes) means creating code bindings or subclasses for each and every on-screen view, and setting the properties programmatically. Different themes can't be previewed at build time, eliminating most of the benefit of Interface Builder's WYSIWYG preview.

The mechanism behind Interface Builder encourages use of implicitly-unwrapped optionals and string-based name/type binding that will crash at runtime if mistyped. This was onerous when using Objective-C, but is especially unpleasant when programming in Swift, which favors static typing and compile-time error checking.

While these problems have remained unsolved since its inception, Interface Builder has acquired new ones:

When Nibs gave way to Storyboards, Apple encouraged developers to begin using IB for creating not just individual screens, but segues between screens as well, grouping multiple screens in a single Storyboard. This exacerbates the problem of file conflicts, as now developers cannot safely make edits to any screen in the app (or whatever subset of the app is contained within one Storyboard) without coordinating to avoid conflicts.

Interface Builder was designed before the introduction of AutoLayout, when the layout system used a more traditional “springs and struts” model called Autoresizing. IB encourages you to place views with drag and drop, but AutoLayout constraints completely replace that model, making it redundant. The relationship between the views shown on screen and the constraints that position them there is confusing and brittle, and making significant changes to layout usually requires removing all constraints and starting again.

IBDesignable, the new system introduced to solve the problem with previewing custom views, is flaky and doesn't work well with code that is split across multiple modules, so a screen containing custom components often appears as a patchwork of empty squares in Interface Builder, rather than an accurate preview. It also puts the burden on developers to add extra boilerplate code to support design mode.

In the face of these problems, why not simply ignore Interface Builder and create views in code?

The Trouble with Hand-Coding UI

The most common alternative to using Interface Builder for UI is to create all your views programmatically. But this approach has its own problems:

The more code you write, the more bugs (or potential for bugs) you introduce into your program. The files generated by Interface Builder replace human-generated code with machine-generated data, and as such reduce the space for bugs to hide.

AutoLayout, Apple's technology for specifying the size and positions of views on screen, is extremely long-winded to use in code. Despite the problems mentioned earlier, Interface Builder's GUI approach makes it somewhat less cumbersome.

Being able to drag and drop components and see the result immediately in a (mostly) WYSIWYG interface is a huge productivity gain over having to stop, edit, compile and run every time you want to see the result of a change.

Apple expects developers to build their apps using Storyboards. Example code uses it. Default project templates are automatically set up to use it. Some features, such as localized app startup screens, are essentially required to use it.

This last point is perhaps the most important: Ultimately, any deviation from Apple's status quo carries a cost.

The Trouble with 3rd Party Solutions

There are many existing 3rd party frameworks that address some of the issues, but none that meet all of our requirements:

AutoLayout wrappers such as Masonry help to speed up development when creating views in code, but they don’t avoid the fundamental problem of having to recompile after every change - a problem exacerbated by Swift’s relatively slow compile times.

Alternative development toolchains like React Native offer live reloading without recompiling, but at the cost of porting your app to JavaScript and React. Our iOS developers are happy with the power and flexibility of Apple’s UIKit and Foundation frameworks, and we already have a large codebase written in Swift, so switching to JavaScript would be a hard sell. It would also be a hard decision to reverse if it didn't work out.

Over the years we have seen many such 3rd party development tools come and go, but the majority fail to gain significant traction because the pros so rarely outweigh the cons.

By adopting a nonstandard tool you run the risk that it will be slow to support new iOS features, or that its developers will stop maintaining it altogether. Even if it has a strong development community behind it, that community is a tiny fraction of the total iOS developer base - and that impacts your ability to find developers who can work on your product, to find solutions to bugs on Stack Overflow, and so on.

In light of that, creating our own UI framework may seem like an odd decision, so why build our own tool? In short, because the problems are real, and the potential benefits of solving them are worth the risk.

But if we are going to pay the cost of adopting a nonstandard tool, it had better be something we are comfortable using, that meets all of our requirements, that we can maintain ourselves, and that is easy to migrate away from if our requirements change.

That's where Layout comes in.

Layout

Layout (available here from Schibsted’s github page) is a drop-in replacement for Nibs and (to some extent) Storyboards, based on human-readable XML files that can easily be edited and merged.

In place of hard-coded constraints, Layout offers dynamic, parametric layouts via the use of runtime-evaluated expressions (more on these later).

In place of WYSIWYG editing, Layout offers live reloading, so you can make tweaks and bug fixes to your real UI, in-situ, without recompiling the app.

In place of procedural, stateful view logic, Layout offers declarative data binding.

Like Nibs and Storyboards, Layout uses reflection to automatically recognize and bind components at runtime, so there is no need to create plug-ins for your existing view components - they will mostly just work without changes.

Unlike Nibs and Storyboards, Layout's runtime bindings are fully type-checked and crash-safe if used correctly. Type or naming errors are detected statically and reported at the first opportunity, not deferred until a button is pressed or an outlet is accessed.

Layout is written in 100% pure Swift code. There is no reliance on JavaScript or any other resource-hungry scripting language.

Most importantly, Layout is not a replacement for UIKit. It is not a cross-platform abstraction, and it does not provide its own UI components - everything that Layout puts on the screen is either a standard UIKit view, or an ordinary UIView subclass that you have built yourself.

It is easy to add Layout to an existing project, easy to use existing components, and easy to remove it again if it’s not the right fit. You can use a handful of Layout-driven screens or components in an otherwise standard app, and it need not affect your app architecture in the slightest.

Expressions

Layout's XML files describe view properties in terms of runtime expressions, powered by the Expression library. Expressions are simple, pure functions that allow dynamic values to be specified in a declarative fashion.

In recent years, the development community has come to the realization that traditional, procedural code that relies on mutable state is a difficult way to manage complex logic such as user interfaces. Frameworks such as Facebook's React have demonstrated a simpler, more maintainable approach to UI based on the composition of pure functions that take state as an input and produce a view hierarchy as the output.

Apple's own AutoLayout framework is also based on this idea, replacing complex procedural code with dynamic constraints that can describe flexible layouts mathematically instead of programmatically.

So why doesn't Layout just use AutoLayout for view positioning? It would be relatively simple to specify AutoLayout constraints in XML and rely on that for positioning views, so why reinvent the wheel?

Despite its name, the scope of Layout goes beyond merely laying out views - it can be used to configure essentially any property of a view, not just its size and position. And because these properties are specified in terms of expressions rather than static values, the ability to describe flexible layouts comes essentially for free.

So Layout doesn't use AutoLayout because it would be redundant - you can replicate all the features of AutoLayout using expressions, as well as more complex behaviors that cannot easily be described in terms of AutoLayout constraints.

With that said, Layout does interoperate nicely with AutoLayout-based views. If a view has internal AutoLayout constraints, Layout will use those to determine its size.

Here is an example. This is the native Swift code in the view controller which loads the XML template and passes in some constants to use:
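The original listing was not preserved in this export. A rough sketch of what it might have looked like follows; the exact Layout API (assumed here to be a loadLayout(named:constants:) call on a LayoutViewController subclass), the template name and the constant names are assumptions, so check the Layout README for the real syntax:

import UIKit
import Layout

class ExampleViewController: LayoutViewController {
    override func viewDidLoad() {
        super.viewDidLoad()
        // Load the XML template and pass in the constants it references.
        loadLayout(
            named: "Example.xml",
            constants: [
                "title": "Hello World",
                "image": UIImage(named: "Logo") as Any,
            ]
        )
    }
}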
The XML describes a UIView containing a UIImageView and a UILabel:
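Again, the original template is missing. Based on the description below, it might have looked roughly like this; the attribute and expression names are approximations of Layout's syntax, not a verbatim copy:

<UIView width="100%" height="auto + 20">
    <UIImageView
        image="{image}"
        left="10"
        top="50% - height / 2"
        width="max(auto, height)"
        height="width"
    />
    <UILabel
        text="{title}"
        left="previous.right + 10"
        top="50% - height / 2"
        width="auto"
        height="auto"
    />
</UIView>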
All the dimensions here are flexible: The image and label are sized to fit their content; the container view is set to 100% of the width of its container, and its height is set to auto + 20, which means it will fit the height of the image view or label (whichever is larger) plus a 10-point margin on either side.

The image view's width is determined to be whichever is the larger of its intrinsic width or height (determined by the source image), and the height is set to match the width, so that it is always square. The label is positioned 10 points to the right of the image view, and the image view and label are both vertically centered in their container.

The expressions used to calculate the value for each layout property should look familiar to any iOS developer - they are very much like the layout code you might have written inside a layoutSubviews method in the days before AutoLayout. But these expressions are not compiled Swift code, and they can be tweaked and debugged without rebuilding the app.

Below, you can see the resultant view as it would appear in portrait and landscape on an iPhone:

Live Reloading

Developing a pixel-perfect layout involves a lot of iteration, and developer productivity depends on making that process as fast and painless as possible. Layout's solution to this is live reloading, a mechanism which allows Layout to reload XML files on-the-fly without recompiling the app.

Live reloading works by scanning your project directory for layout XML files and matching them up to the versions bundled in the built application. Layout then loads these files preferentially over the bundled file. If Layout cannot locate the file it will display an error, and if it finds multiple candidate files, you will be asked to choose which one it should use.

Layouts can be reloaded at any time using the Cmd-R keyboard shortcut when the simulator has focus.

Unlike React Native, Layout's live reloading system is serverless, and does not rely on any background processes. The live reloading code is encapsulated inside your app, and only included in apps built for the simulator, as a real iOS device doesn't have access to your Mac's file system.

The Red Box

Almost any mistake you make in a Storyboard will result in a crash inside UIKit, leaving you hunting through the console log trying to work out where you went wrong. Layout's live reloading feature wouldn't be much use if every typo resulted in a crash, so instead we have the Red Box.

The Red Box (a concept borrowed from React Native) is a fullscreen error dialog that appears whenever you make a mistake in your XML. Whether it's a syntax error, a misnamed property, or a circular reference in one of your expressions, the Red Box will tell you exactly where you went wrong, and a tap will dismiss it again once the problem has been corrected.

Layout's error handling is enabled in production apps too, so you can intercept and log production errors, and then either crash or fail gracefully at your own discretion.

Over-The-Air Updates

While Nibs or Storyboards can be loaded from a remote URL, there is no official support for this in UIKit, and doing so is very unsafe, as a mismatch between app and Storyboard versions may result in an untrappable crash.

Layout supports asynchronous loading of remote XML files and has the basic infrastructure in place for over-the-air UI updates. Errors can be trapped and handled gracefully, so that the app can roll back to a safe state in the event of a bad update.

Convenience

There are a number of rough edges in iOS development, and Layout takes the opportunity to simplify those wherever possible. For example:

Fonts: Programmatically specifying a particular font at a given weight and style can be tricky; you need to know the exact font name and/or mess around with nested dictionaries of stringly-typed constants to specify the required attributes. Layout lets you specify fonts as a single space-delimited, CSS-style string that it parses to find the closest match. Font sizes can be specified in fixed or relative units, and Layout provides seamless support for iOS’s dynamic text resizing feature, even for custom fonts, which normally require you to calculate point sizes manually.

Colors: Colors are almost universally specified using a 6-digit hex format or a triplet of RGB values in the range 0-255, but iOS requires you to specify RGB values in the range 0-1. This makes sense mathematically, but it's a pain to do this conversion manually. Layout lets you use 3-, 4-, 6- or 8-digit hex color literals directly in your XML, along with CSS-style rgb() and rgba() functions.

Rich Text: Creating attributed strings in code is hard work, and to date iOS still doesn't provide any way to put styled text in a localized strings file. Layout lets you drop basic HTML straight into your XML file for doing simple styling like bold or italics, and if you use HTML in your strings file, those strings will be automatically converted to attributed strings when used with Layout components.

Table cells: Storyboards provide a convenient way to create cell and header templates directly inside your table. But if you later decide that you need to share those cells between more than one screen, that's an awkward refactor. Layout lets you specify XML cell templates either inline within the table or in their own file, using exactly the same syntax, making it painless to refactor later.

Localization: Using localized strings inside Interface Builder is awkward as you have limited control over key names, there is no automatic way to add new strings after the initial extraction, and no warning if you forget or mistype a key. Layout lets you reference strings from your Localizable.strings file directly in your XML, and will display an error if any string is missing. It also applies the same live loading feature to strings as it does for XML, so you can add, modify, or remove strings and see the results without recompiling the app.

Roadmap

Though it has only been in development for a few months, Layout has already been integrated into several Schibsted apps, and is running in production on the App Store.

In the next few months, we'll be focussing on:

Supporting more special-case iOS components, properties and types

Even more robust error handling, and improved unit and integration test coverage

Better tooling, such as static analysis, linting and refactoring for template files

Improving the examples and documentation

We're really excited about Layout's potential, which is why we've decided to open it up to the wider iOS community. Your feedback will help it to grow faster, and to expand it support new use cases beyond our requirements at Schibsted.

]]>Developing UI at scale can be challenging.At Schibsted, our large, distributed team of iOS developers collaborates on a single codebase. Our modular UI is shared by several apps, each of which applies a unique theme. Our requirements are effectively the worst case scenario for using Storyboards and Interface Builder.Our solution to these requirements is Layout, a new collaboration-friendly UI system for iOS.

The Challenge

We needed a system that:

Makes it easy for multiple developers to collaborate on the same screens and components, and to review and merge changes using standard tools.

Allows for faster UI development than recompiling the whole app after every change.

Supports runtime theming and customization.

Is modular and composable.

Is simple and familiar for existing iOS developers.

Can deliver over-the-air updates for bug fixes or A/B testing.

So what’s wrong with the tools that Apple provides with the iOS SDK?

The Trouble with Interface Builder

When Apple first released the iOS SDK in 2008, they included Interface Builder, a WYSIWYG drag-and-drop tool originally developed for laying out desktop application windows, now adapted for the iPhone touchscreen.
Right from the outset, the iOS developer community was divided on whether this was the right approach, and many eschewed Interface Builder in favor of creating views programmatically.
Interface Builder has come a long way since it was first introduced, but the reasons for not using it remain largely unchanged:

Nibs and Storyboards (the serialized UI files exported by Interface Builder) only support a subset of UIKit features. Any non-trivial screen will require parts of the layout or control wiring to be done in code, resulting in an awkward division of logic between the Nib/Storyboard file and source code, making it harder for developers to find where a change needs to be made.

The XML format for Nib files (xib) is chock-full of unique machine-generated identifiers, making it impossible to effectively edit, review or merge changes. They can only safely be worked on by one developer at a time, even when using Git.

Nibs cannot easily be subclassed or composed, and cannot reference code constants for fonts, colors, etc. All properties set inside Nibs and Storyboards are effectively "magic" values that cannot be assigned meaningful names, and can only be reused by copying and pasting them between screens that share common elements.

Changing the appearance of views at runtime (e.g. for night mode, or to support multiple themes) means creating code bindings or subclasses for each and every on-screen view, and setting the properties programmatically. Different themes can't be previewed at build time, eliminating most of the benefit of Interface-Builder's WYSIWYG preview.

The mechanism behind Interface Builder encourages use of implicitly-unwrapped optionals and string-based name/type binding that will crash at runtime if mistyped. This was onerous when using Objective-C, but is especially unpleasant when programming in Swift, which favors static typing and compile-time error checking.

[caption id="attachment_1732295" align="alignleft" width="2744"] Apple's Interface Builder[/caption]
While these problems have remained unsolved since its inception, Interface Builder has acquired new ones:

When Nibs gave way to Storyboards, Apple encouraged developers to begin using IB for creating not just individual screens, but segues between screens as well, grouping multiple screens in a single Storyboard. This exacerbates the problem of file conflicts, as now developers cannot safely make edits to any screen in the app (or whatever subset of the app is contained within one Storyboard) without coordinating to avoid conflicts.

Interface Builder was designed before the introduction of AutoLayout, when the layout system used a more traditional “springs and struts” model called Autoresizing. IB encourages you to place views with drag and drop, but AutoLayout constraints completely replace that model, making it redundant. The relationship between the views shown on screen and the constraints that position them there is confusing and brittle, and making significant changes to layout usually requires removing all constraints and starting again.

IBDesignable, the new system introduced to solve the problem with previewing custom views, is flaky and doesn't work well with code that is split across multiple modules, so a screen containing custom components often appears as a patchwork of empty squares in Interface Builder, rather than an accurate preview. It also puts the burden on developers to add extra boilerplate code to support design mode.

In the face of these problems, why not simply ignore Interface Builder and create views in code?

The Trouble with Hand-Coding UI

The most common alternative to using Interface Builder for UI is to create all your views programmatically. But this approach has its own problems:

The more code you write, the more bugs (or potential for bugs) you introduce into your program. The files generated by Interface Builder replace human-generated code with machine-generated data, and as such reduce the space for bugs to hide.

AutoLayout, Apple's technology for specifying the size and positions of views on screen, is extremely long-winded to use in code. Despite the problems mentioned earlier, Interface Builder's GUI approach makes it somewhat less cumbersome.

Being able to drag and drop components and see the result immediately in a (mostly) WYSIWYG interface is a huge productivity gain over having to stop, edit, compile and run every time you want to see the result of a change.

Apple expects developers to build their apps using Storyboards. Example code uses it. Default project templates are automatically set up to use it. Some features, such as localized app startup screens, are essentially required to use it.

This last point is perhaps the most important: Ultimately, any deviation from Apple's status quo carries a cost.

The Trouble with 3rd Party Solutions

There are many existing 3rd party frameworks that address some of the issues, but none that meet all of our requirements:

AutoLayout wrappers such as Masonry help to speed up development when creating views in code, but they don’t avoid the fundamental problem of having to recompile after every change - a problem exacerbated by Swift’s relatively slow compile times.

Alternative development toolchains like React Native offer live reloading without recompiling, but at the cost of porting your app to JavaScript and React. Our iOS developers are happy with the power and flexibility of Apple’s UIKit and Foundation frameworks, and we already have a large codebase written in Swift, so switching to JavaScript would be a hard sell. It would also be a hard decision to reverse if it didn't work out.

Over the years we have seen many such 3rd party development tools come and go, but the majority fail to gain significant traction because the pros so rarely outweigh the cons. By adopting a nonstandard tool you run the risk that it will be slow to support new iOS features, or that its developers will stop maintaining it altogether. Even if it has a strong development community behind it, that community is a tiny fraction of the total iOS developer base - that impacts your ability to find developers who can work on your product, or to find solutions to bugs on Stack Overflow, and so on.

In light of that, creating our own UI framework may seem like an odd decision, so why build our own tool? In short, because the problems are real, and the potential benefits of solving them are worth the risk. But if we are going to pay the cost of adopting a nonstandard tool, it had better be something we are comfortable using, that meets all of our requirements, which we can maintain ourselves, and which is easy to migrate away from again if our requirements change.

That's where Layout comes in.

Layout

Layout (available here from Schibsted's github page) is a drop-in replacement for Nibs and (to some extent) Storyboards, based on human-readable XML files that can easily be edited and merged.

In place of hard-coded constraints, Layout offers dynamic, parametric layouts via the use of runtime-evaluated expressions (more on these later). In place of WYSIWYG editing, Layout offers live reloading, so you can make tweaks and bug fixes to your real UI, in-situ, without recompiling the app. In place of procedural, stateful view logic, Layout offers declarative data binding.

Like Nibs and Storyboards, Layout uses reflection to automatically recognize and bind components at runtime, so there is no need to create plug-ins for your existing view components - they will mostly just work without changes. Unlike Nibs and Storyboards, Layout's runtime bindings are fully type-checked and crash-safe if used correctly. Type or naming errors are detected statically and reported at the first opportunity, not deferred until a button is pressed or an outlet is accessed.

Layout is written in 100% pure Swift code. There is no reliance on JavaScript or any other resource-hungry scripting language.

Most importantly, Layout is not a replacement for UIKit. It is not a cross-platform abstraction, and it does not provide its own UI components - everything that Layout puts on the screen is either a standard UIKit view, or an ordinary UIView subclass that you have built yourself.

It is easy to add Layout to an existing project, easy to use existing components, and easy to remove it again if it's not the right fit. You can use a handful of Layout-driven screens or components in an otherwise standard app, and it need not affect your app architecture in the slightest.

Expressions

Layout's XML files describe view properties in terms of runtime expressions, powered by the Expression library. Expressions are simple, pure functions that allow dynamic values to be specified in a declarative fashion.

In recent years, the development community has come to the realization that traditional, procedural code that relies on mutable state is a difficult way to manage complex logic such as user interfaces. Frameworks such as Facebook's React have demonstrated a simpler, more maintainable approach to UI based on the composition of pure functions that take state as an input and produce a view hierarchy as the output.

Apple's own AutoLayout framework is also based on this idea, replacing complex procedural code with dynamic constraints that can describe flexible layouts mathematically instead of programmatically. So why doesn't Layout just use AutoLayout for view positioning? It would be relatively simple to specify AutoLayout constraints in XML and rely on that for positioning views, so why reinvent the wheel?

Despite its name, the scope of Layout goes beyond merely laying out views - it can be used to configure essentially any property of a view, not just its size and position. And because these properties are specified in terms of expressions rather than static values, the ability to describe flexible layouts comes essentially for free. So Layout doesn't use AutoLayout because it would be redundant - you can replicate all the features of AutoLayout using expressions, as well as more complex behaviors that cannot easily be described in terms of AutoLayout constraints.

With that said, Layout does interoperate nicely with AutoLayout-based views. If a view has internal AutoLayout constraints, Layout will use those to determine its size.

Here is an example. This is the native Swift code in the view controller which loads the XML template and passes in some constants to use:
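A minimal sketch of such a view controller, following the loadLayout API described in Layout's README; the "image" and "text" constant names are assumptions rather than the original listing:

import UIKit
import Layout

class ExampleViewController: UIViewController, LayoutLoading {
    override func viewDidLoad() {
        super.viewDidLoad()
        // Load the XML template and pass in the constants it references
        loadLayout(
            named: "Example.xml",
            constants: [
                "image": UIImage(named: "Sunset") as Any,
                "text": "Hello World",
            ]
        )
    }
}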
The XML describes a UIView containing a UIImageView and a UILabel:
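A sketch of the kind of XML this refers to, written in Layout's expression syntax; the exact expressions are an approximation of the behaviour described below (the image-sizing expression in particular is simplified):

<UIView width="100%" height="auto + 20">
    <UIImageView
        left="10"
        top="50% - height / 2"
        width="auto"
        height="width"
        image="{image}"
    />
    <UILabel
        left="previous.right + 10"
        top="50% - height / 2"
        width="auto"
        height="auto"
        text="{text}"
    />
</UIView>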
All the dimensions here are flexible: The image and label are sized to fit their content; the container view is set to 100% of the width of its container, and its height is set to auto + 20, which means it will fit the height of the image view or label (whichever is larger) plus a 10-point margin on either side.
The image view's width is determined to be whichever is the larger of its intrinsic width or height (determined by the source image), and the height is set to match the width, so that it is always square. The label is positioned 10 points to the right of the image view, and the image view and label are both vertically centered in their container.
The expressions used to calculate the value for each layout property should look familiar to any iOS developer - they are very much like the layout code you might have written inside a layoutSubviews method in the days before AutoLayout. But these expressions are not compiled Swift code, and they can be tweaked and debugged without rebuilding the app.
Below, you can see the resultant view as it would appear in portrait and landscape on an iPhone:

Live Reloading

Developing a pixel-perfect layout involves a lot of iteration, and developer productivity depends on making that process as fast and painless as possible. Layout's solution to this is live reloading, a mechanism which allows Layout to reload XML files on-the-fly without recompiling the app.
Live reloading works by scanning your project directory for layout XML files and matching them up to the versions bundled in the built application. Layout then loads these files preferentially over the bundled file. If Layout cannot locate the file it will display an error, and if it finds multiple candidate files, you will be asked to choose which one it should use.
Layouts can be reloaded at any time using the Cmd-R keyboard shortcut when the simulator has focus.
Unlike React Native, Layout's live reloading system is serverless, and does not rely on any background processes. The live reloading code is encapsulated inside your app, and only included in apps built for the simulator, as a real iOS device doesn't have access to your Mac's file system.
[caption id="attachment_1732325" align="alignnone" width="1000"] Layout in action[/caption]

The Red Box

Almost any mistake you make in a Storyboard will result in a crash inside UIKit, leaving you hunting through the console log trying to work out where you went wrong. Layout's live reloading feature wouldn't be much use if every typo resulted in a crash, so instead we have the Red Box.
The Red Box (a concept borrowed from React Native) is a fullscreen error dialog that appears whenever you make a mistake in your XML. Whether it's a syntax error, a misnamed property, or a circular reference in one of your expressions, the Red Box will tell you exactly where you went wrong, and a tap will dismiss it again once the problem has been corrected.
Layout's error handling is enabled in production apps too, so you can intercept and log production errors, and then either crash or fail gracefully at your own discretion.

Over-The-Air Updates

While Nibs or Storyboards can be loaded from a remote URL, there is no official support for this in UIKit, and doing so is very unsafe, as a mismatch between app and Storyboard versions may result in an untrappable crash.
Layout supports asynchronous loading of remote XML files and has the basic infrastructure in place for over-the-air UI updates. Errors can be trapped and handled gracefully, so that the app can roll back to a safe state in the event of a bad update.

Convenience

There are a number of rough edges in iOS development, and Layout takes the opportunity to simplify those wherever possible. For example:

Fonts: Programmatically specifying a particular font at a given weight and style can be tricky; you need to know the exact font name and/or mess around with nested dictionaries of stringly-typed constants to specify the required attributes. Layout lets you specify fonts as a single space-delimited, CSS-style string that it parses to find the closest match. Font sizes can be specified in fixed or relative units, and Layout provides seamless support for iOS’s dynamic text resizing feature, even for custom fonts, which normally require you to calculate point sizes manually.

Colors: Colors are almost universally specified using a 6-digit hex format or a triplet of RGB values in the range 0-255, but iOS requires you to specify RGB values in the range 0-1. This makes sense mathematically, but it's a pain to do this conversion manually. Layout lets you use 3-, 4-, 6- or 8-digit hex color literals directly in your XML, along with CSS-style rgb() and rgba() functions.

Rich Text: Creating attributed strings in code is hard work, and to date iOS still doesn't provide any way to put styled text in a localized strings file. Layout lets you drop basic HTML straight into your XML file for doing simple styling like bold or italics, and if you use HTML in your strings file, those strings will be automatically converted to attributed strings when used with Layout components.

Table cells: Storyboards provide a convenient way to create cell and header templates directly inside your table. But if you later decide that you need to share those cells between more than one screen, that's an awkward refactor. Layout lets you specify XML cell templates either inline within the table or in their own file, using exactly the same syntax, making it painless to refactor later.

Localization: Using localized strings inside Interface Builder is awkward as you have limited control over key names, there is no automatic way to add new strings after the initial extraction, and no warning if you forget or mistype a key. Layout lets you reference strings from your Localizable.strings file directly in your XML, and will display an error if any string is missing. It also applies the same live loading feature to strings as it does for XML, so you can add, modify, or remove strings and see the results without recompiling the app.
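As an illustration of the font and color conveniences above, a hypothetical label node might look like this (the attribute names are just ordinary UILabel properties, bound by reflection; the values follow the formats described above):

<UILabel
    font="AvenirNext bold 17"
    textColor="#f00"
    backgroundColor="rgba(255, 255, 255, 0.8)"
    text="{greeting}"
/>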

Roadmap

Though it has only been in development for a few months, Layout has already been integrated into several Schibsted apps, and is running in production on the App Store.
In the next few months, we'll be focussing on:

Supporting more special-case iOS components, properties and types

Even more robust error handling, and improved unit and integration test coverage

Better tooling, such as static analysis, linting and refactoring for template files

Improving the examples and documentation

We're really excited about Layout's potential, which is why we've decided to open it up to the wider iOS community. Your feedback will help it to grow faster, and to expand to support new use cases beyond our requirements at Schibsted.

Data sketches: the secret sauce behind population estimation

We built a service to estimate the number of website visitors reached by new audience segments in real time, for queries with any combination of user attributes. Here's how we did it.

By Manuel Weiss, with help from the ATE team

Audience targeting and segments

Schibsted’s Audience Targeting Engine (ATE) allows us to target advertising based on a user’s attributes, like their age, gender, search history, location, etc.

ATE’s Segment Manager allows the creation of audience “segments”. A segment represents a group of users and is defined by a set of attributes that an advertiser is interested in (e.g. users aged 40-45, based in London). These segments can then be attached to an advertisement campaign.

When an ad campaign manager sets up a segment to target a certain group of users, it is important to know the approximate size of that user group so that the number of impressions can be estimated.

[caption id="attachment_1732281" align="aligncenter" width="1551"] Example of a segment: users interested in business and technology, aged 40-45, based in London[/caption]

The problem

To calculate the number of users matching a segment, we can look at historical data and compute how many unique users we have observed over a given time period (last week, for example) that would fall into this segment. (This is also known as the “count-distinct problem”.)

This is fairly easy to do for existing segments. But the estimate is most crucial when a new segment is created or an existing one changed. The problem is that it takes a while to go through the historical data and count all users matching the newly created segment. And we cannot precompute all possible segments, as they can consist of any combination of attributes. Additionally, it’s also possible to upload lists of user IDs, which can then also be combined (i.e. intersected) with other targeting attributes. Obviously, this custom list is unknown until it is uploaded.

[caption id="attachment_1732276" align="aligncenter" width="1931"] Example of unions and intersections for a segment[/caption]

The solution to that problem is to record the unique set of users separately for every criterion and then do the unions and intersections as needed when a new segment is created or changed. To give an example, let’s assume the following segment:

Gender: Male

Age groups: 36-45, 46-55

Location: Oslo, Bergen, Stavanger

Interests: Porsche, Tesla

We will have 8 sets of users: all male users, users aged 36 to 45 or 46 to 55, users in Oslo, Bergen or Stavanger and users interested in Porsche or Tesla. Now we’ll do a union of all sets within a targeting category and then an intersection between categories. This gives us the count of unique users for our segment.
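Expressed with ordinary sets of user IDs (hypothetical and empty here, before sketches enter the picture), the computation is simply:

// Hypothetical sets of user IDs observed for each criterion value
val male, age36to45, age46to55, oslo, bergen, stavanger, porsche, tesla = Set.empty[String]

// Union within a targeting category, intersection between categories
val segment =
  male intersect
    (age36to45 union age46to55) intersect
    (oslo union bergen union stavanger) intersect
    (porsche union tesla)

val uniqueUsers = segment.size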

In order to do this for any new segment, we need to store a set of unique user IDs for each possible value of each criterion:

3 for gender: Male, Female, Unknown

6 age groups

> 100.000 for location by district/region/country

Several 100.000s for interest categories (separate for each supported website)

Several millions for location by latitude/longitude

Several millions for search terms

Which creates a new problem: how do we efficiently store millions of user IDs for each of these sets, while being able to do unions and intersections?

To the rescue: sketches!

Luckily, smart people have been thinking about this for a while and have come up with a smart solution: sketches! They are probabilistic data structures, like, for example, the slightly better known Bloom filters. The basic idea is similar to lossy compression (as we know it from JPEGs and MP3s): you don’t get quite the same thing back, but it is good enough for your purposes, using only a fraction of the original size.

There are many different versions of these data sketches, as they're also called, but they all build on the following observation: if I store some information about specific patterns in the incoming data, I can estimate how many distinct items I have observed so far. As a very simplified example, if I flip a coin many times and only store the largest number of heads in a row, I can estimate how many times the coin was flipped. For a more detailed explanation of sketches, see here.
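As a toy version of that coin-flip idea (not one of the real sketch algorithms), the longest run of heads grows roughly like the base-2 logarithm of the number of flips, so two raised to the longest observed run gives a crude order-of-magnitude estimate:

import scala.util.Random

val flips = Seq.fill(100000)(Random.nextBoolean())

// Longest run of consecutive heads seen in the sequence
val longestRun = flips.foldLeft((0, 0)) { case ((best, current), isHeads) =>
  val run = if (isHeads) current + 1 else 0
  (math.max(best, run), run)
}._1

// Crude estimate of how many flips were observed
val estimatedFlips = math.pow(2, longestRun)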

All of these sketches rely on a uniform distribution of the incoming values, which can easily be achieved by hashing the original user IDs.

So we can construct a data structure that allows us to add a very large number of values, and then ask how many distinct values were observed. But even better – we can take two such data structures and do a union (i.e. how many distinct values show up in either of the two: “lives in Oslo or Bergen”) or an intersection (i.e. how many distinct values show up in both: “is male and 46-55 years old”).

For a more detailed introduction to these concepts, have a look at these excellent blogposts.

In an upcoming blog post, we will discuss some data sketch implementations and the benchmarks we did. Here, we want to focus on how we used data sketches to solve the problem outlined in the introduction and how we implemented this system in production.

Commercial importance of population estimates

For our campaign managers, seeing the population estimates in real time as they work on their segments is absolutely crucial. A segment’s reach needs to be greater than a certain number of unique users to be commercially viable. What this size is depends completely on the type of segment. The segments that make up the standard offering in the product portfolio need to be very large and reach a cross section of our users. Custom segments for a specific advertiser can be very small in comparison, but still very attractive for a particular campaign. And the ad sales team always needs to know the size of each segment to make the call: should I sell this? Will the campaign be able to deliver?

[caption id="attachment_1732274" align="aligncenter" width="1024"] The segment creation user interface with the estimated weekly unique users in the top right.[/caption]

Using data sketches in a production system

Our previous solution

When we first started estimating reach, we had only a couple of targeting attributes, and needed to support internal users only. For each query, we would kick off a job via Spark Jobserver that would calculate the estimate on the fly based on HyperLogLog data structures kept in memory. This approach did not scale with the number of targeting attributes, the amount of data and the number of concurrent estimation requests. When there was an issue with Spark, the system would stop responding. The new design overcomes all these issues and can respond to many simultaneous requests.

Calculating the sketches

[caption id="attachment_1732282" align="aligncenter" width="633"] Data flow in our population estimation system[/caption]

Every night, we run a batch job on Spark which reads all the user profiles recorded on the previous day and calculates a data sketch for each attribute. The attributes in these profiles have been inferred by our User Modelling data science team based on browsing behaviour on our sites. Once all the sketches are calculated, we load them from S3 into a PostgreSQL RDS (serialised as a byte array).
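The sketch implementation is not named here, so for illustration only: building and serialising one per-attribute sketch with the Yahoo/Apache DataSketches theta library looks roughly like this (the Spark job applies something like it to the user IDs grouped by attribute value):

import com.yahoo.sketches.theta.UpdateSketch

// Build a theta sketch over the user IDs seen for one attribute value and
// serialise it to the byte array that gets stored in PostgreSQL.
def buildSketch(userIds: Iterator[String]): Array[Byte] = {
  val sketch = UpdateSketch.builder().build()
  userIds.foreach(id => sketch.update(id))
  sketch.compact().toByteArray
}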

This might seem like an odd choice, given there’s nothing relational about our list of sketches. But we are relying on two very convenient features: PostgreSQL’s geospatial indexing, which we use for the latitude/longitude sketches described below, and its table partitioning, covered in the next section.

The geo location sketches come in two different flavours, as there are two ways to define locations in a segment: by postcode/district code/region code and by latitude/longitude + radius. The latter allows the precise definition of target areas; e.g. for a new shopping mall which wants to target everyone living within 20km of their location or a coffee shop targeting mobile users within 500m of any of their branches.

To allow the targeting by latitude/longitude, we divide the world into a grid of points on the surface and calculate a data sketch for every quadrant in this grid for which we have seen any users. For efficiency reasons, we calculate these sketches at three different resolutions. Obviously, the user defining a segment in the UI simply clicks on a point in a map.

[caption id="attachment_1732280" align="aligncenter" width="1024"] Lowest-resolution coverage for Norway, Sweden, Finland and Belarus. Each dot represents one data sketch. Areas without dots don’t have any active users.[/caption]

Partitioning in PostgreSQL

Another nice feature of PostgreSQL that we rely on is partitioning. Partitioning refers to splitting what is logically one large table into smaller physical pieces. We are loading new data sketches into the database every day and want to query the last seven days of data. At the same time, we want to remove older data when it is not in use anymore. As a precaution, we always keep 14 days of data in case there is some issue with loading new data.

To achieve this, we have a parent table, for example low_res_geo_sketches, and one partition for each day. When we add a new partition, we remove the oldest from the parent table, but keep it around for another week. For selects on the parent table, PostgreSQL automatically includes data from all partitions associated with it.

Another optimisation is to run the CLUSTER command after loading a new partition and creating the geo index. This reorganises the data physically on disk to match the index, so that retrieval is as fast as possible. It can take quite a while, depending on the amount of data, but it only needs to be done once and happens in the middle of the night.
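In SQL terms, and assuming the inheritance-based partitioning used on PostgreSQL versions of that era (the table and column names below are hypothetical), a daily load looks roughly like this:

-- New child partition holding one day of sketches
CREATE TABLE low_res_geo_sketches_20171128 (
    CHECK (day = DATE '2017-11-28')
) INHERITS (low_res_geo_sketches);

-- After loading the day's rows: create the geo index, then physically
-- reorder the rows on disk to match it
CREATE INDEX low_res_geo_sketches_20171128_geo_idx
    ON low_res_geo_sketches_20171128 USING gist (location);
CLUSTER low_res_geo_sketches_20171128 USING low_res_geo_sketches_20171128_geo_idx;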

Computing population estimates on demand

Now that we have all the data sketches ready in our database, how do we actually compute the population estimates for a given segment? When someone creates or edits a segment, the browser sends a request to the population estimation API. This request contains the list of targeting values, like in our example at the beginning.

The server will fetch all 8 required sketches from the db, deserialise them, do a union of all sketches within a targeting category and then an intersection between categories. This resulting sketch is then queried for its estimated count of unique users. Depending on the number of sketches that need to be retrieved, this typically takes less than 1 second.
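Sketched in Scala, and again assuming the DataSketches theta library purely for illustration (the names are hypothetical), that union/intersection step looks like this:

import com.yahoo.memory.Memory
import com.yahoo.sketches.theta.{CompactSketch, SetOperation, Sketch, Sketches}

// Deserialise a byte array fetched from PostgreSQL back into a sketch
def deserialise(bytes: Array[Byte]): Sketch = Sketches.wrapSketch(Memory.wrap(bytes))

// Union of all sketches within one targeting category
def unionAll(sketches: Seq[Sketch]): CompactSketch = {
  val union = SetOperation.builder().buildUnion()
  sketches.foreach(s => union.update(s))
  union.getResult()
}

// Intersection between categories, then the final unique-user estimate
def estimate(categories: Seq[Seq[Sketch]]): Double = {
  val intersection = SetOperation.builder().buildIntersection()
  categories.map(unionAll).foreach(s => intersection.update(s))
  intersection.getResult().getEstimate()
}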

If there are many geo radius criteria defined in a segment, up to several thousand sketches may need to be loaded from the database. This can sometimes take up to 20 seconds, which is still acceptable as these segments are not created or changed very often. The limiting factor is not the speed of computing the unions/intersections, which is very fast, but the time it takes to load all the bytes from the database. There are many ways to optimise this further, for example by precomputing the union per attribute across the lookback period of seven days, or even by doing the unions/intersections in the database. But for our current needs, it is good enough.

Conclusion

Even estimating unique counts for complex queries is not rocket science and can be done in real time, provided enough effort has been spent to compute things upfront. There are a number of existing robust implementations of the required data structures, or “sketches”, which can just be used as a library, with results that come close to magic!

Get your updates right, but don’t leave old files behind

During development you often update the deliverables with the latest version of your code. But don’t forget what you removed. Here is how you can do it with Make.

What is Make?

GNU Make is a tool which controls the generation of executables and other non-source files of a program from the program’s source files.

Make gets its knowledge of how to build your program from a file called the makefile, which lists each of the non-source files and how to compute it from other files. When you write a program, you should write a makefile for it, so that it is possible to use Make to build and install the program. (Source)

With awareness of both how to generate a file (the target) and what that file needs (its dependencies), make can help save valuable time and prevent you from having to rebuild a target whose dependencies did not change since the last build.
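As a tiny, hypothetical example of a target and its dependencies (recipe lines must start with a tab):

# 'hello' is rebuilt only if hello.o or main.o is newer than the existing binary
hello: hello.o main.o
	$(CC) -o hello hello.o main.o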

Although make is great at saving time, it takes some expertise to manage efficient work during code cleanup phases.

The make clean target

Most projects come with a make clean target providing a developer with the ability to remove any non-source file the make program generated. Even the reference itself instructs how to clean build output:
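The canonical clean rule from the GNU Make manual looks roughly like this, with the program and object file names as placeholders:

.PHONY: clean
clean:
	rm -f hello *.o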

Conclusion

Whenever you’re considering performing an update, either with make as described here or with Puppet, Ansible or other configuration management tools, don’t forget to remove anything redundant, to prevent the famous “it works for me” pattern.

How we use Java Nashorn to transform data on the fly

We were asked to migrate data back and forth between two schema-incompatible systems. At the same time, their schemas were evolving. We decided not to waste our developer energy on writing disposable mapping tables in Excel. Instead, we provided our product colleagues with the tools to transform data themselves in a more efficient way. And we used Java Nashorn to do it.

An old problem

We’ve all faced this challenge many times: migrate an old system to a new one. Keep all the data and adapt it to the new schemas, and on top of that provide interoperability. At Schibsted, we’ve had many cases where we have upgraded the technology for our classified ad websites, and then we have had to ensure we keep our existing user base and content data.

We poor developers lose sleep over these operations! Often we create ad hoc scripts:

mix_fields_together, extract_some_info_from_them, call_third_parties_to_fill_in_the_gaps, optimize_so_migration_doesn_t_take_ages, and so on ad nauseam.

And worst of all, when we get the result of the script, often someone will realize that something has to be changed, e.g. we were missing fields or had a mistaken preconception.

Start again. Edit script. Run migration. Show result. More changes.

Sound familiar?

Room for improvement

This time we had to do it better. We wanted to make it so anyone in the team could do the debugging, and also make it a nicer task, so everyone would be able to continue working on adding value.

Hopefully, we would also find a better solution to the problem at hand than our many previous attempts. So what could we do?

Experience has shown that most data transformations are simple mappings that can be managed on Excel worksheets. We will support this.

The same experience shows that a small but nasty number of those transformations require more complex formulas for adding, subtracting, combining or other manipulation of the original data. We will also manage this.

So in summary, these came to be the requirements:

Whoever deals with the migration can manage the mapping tables and the rules to mix, remove, extract and everything else on data fields.

It shouldn’t be a painful process.

Oh, and some other minor stuff like reliability and performance …

Choices we made

So, we wanted to let users upload their Excel mapping tables, and also let them define how to compose their destination field definitions based on those tables.

That called for giving users a language capable of expressions, and also a place to drop those definitions.

[Let me divert for a second to state that we had decided to use Scala for the project implementation, for reasons that would require a separate post to explain!]

We settled on providing a service to upload the tables and the field definitions. The language we would provide to write the definitions was the next big ugly monster–I mean interesting problem!–we faced.

We don't want to implement a language. What if we use something that already exists? Let us find out if and which interpreters are available. How cool is that? Java includes a Javascript interpreter, and anything available to Java is available to Scala too!

Enter Nashorn

It turns out that Java 8 comes with a first-class, ECMA-compliant Javascript engine out of the box. Its name is Nashorn (German for rhinoceros) and we decided to try it and see what it could do for us. Javascript is an easy language to use at this level, and we knew many non-technical people were already familiar with it, if only to do some basic magic on web page prototypes and such.

Besides, this programming language had the added advantage of being “untyped”: for a job like this where the user might end up doing crazy things with data of unknown types, it looked like a perfect fit!

Javascript then.

Back to the task

Or, the most code-oriented part of the post

The Javascript engine

We needed a Javascript interpreter in our Scala code. For that we used the simple instantiation code:
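A minimal sketch of that instantiation, using the standard javax.script API that ships with Java 8:

import javax.script.{ScriptEngine, ScriptEngineManager}

// Nashorn is registered under the name "nashorn" in Java 8
val engine: ScriptEngine = new ScriptEngineManager().getEngineByName("nashorn")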

The transformation rules

In order to migrate data schemas, you need to define a set of output fields derived from a set of input fields and mapping tables. Some simple examples:

We ask the operator to write a YAML file with one field per output field, whose value is the javascript code to calculate it. Our system will read it and later run it for each of the output fields:
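A hypothetical rules file of that shape (the field names and expressions are illustrative, not taken from the real migration):

title: "input.subject.trim()"
category: "categoryTable[input.old_category_id]"
price: "input.price_cents / 100"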
As you can guess from the above, we provide bound variables with the full JSON input object plus lookup tables (Javascript arrays) copied from the good old Excel sheets, so you can compose and look up in order to define your output fields.

It turns out you can bind Javascript variables to the runtime environment where you execute your Javascript code, like this:
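Sketched here with illustrative variable names (the bindings object is obtained from the engine as shown just below):

import javax.script.Bindings

// Expose the parsed input record and the Excel-derived lookup tables to Javascript
def bindInput(bindings: Bindings, inputJson: AnyRef, categoryTable: AnyRef): Unit = {
  bindings.put("input", inputJson)
  bindings.put("categoryTable", categoryTable)
}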
Of course, the magic is in bindings. We have previously obtained the bindings object from Nashorn in initEngine with (not shown above for clarity):
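With the standard javax.script API, that is simply:

import javax.script.{Bindings, ScriptContext}

val bindings: Bindings = engine.getBindings(ScriptContext.ENGINE_SCOPE)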

The schema transformation

The last step in the process is to actually apply the transformation definitions to the input. We did that in the transform method, actually asking Nashorn to run Javascript code: a transform global function that we also bound in a separate step.
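A sketch of that call, using the Invocable interface that Nashorn's engine implements:

import javax.script.Invocable

// Call the global Javascript function transform(input, ruleset) bound earlier
def transform(input: AnyRef, ruleset: AnyRef): AnyRef =
  engine.asInstanceOf[Invocable].invokeFunction("transform", input, ruleset)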
For reference, the Javascript transform function is given next. ruleset contains the definition rules parsed from the YAML above. Both transform and ruleset are bound to the engine in a way similar to the input data:
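A hypothetical version of such a function, which evaluates each output field's expression with input and the bound lookup tables in scope:

function transform(input, ruleset) {
    var output = {};
    for (var field in ruleset) {
        // Each rule is a Javascript expression string, e.g. "input.subject.trim()"
        output[field] = eval(ruleset[field]);
    }
    return output;
}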
And thus we obtain a new record with as many fields as defined in the rules.

Colophon

Using an embedded engine for a well-known language like Javascript has allowed us to provide the tools for frictionless migrations. Nashorn has delivered excellently in terms of performance as well as in terms of API. It gives you a lot of power in a very easy way.

References

Nashorn

Scala

A year of UI Testing with XCTest

If you ask around you’ll quickly realize that UI tests don’t have an enviable reputation in the iOS developer community. “Difficult to write”, “Slow”, “Hard to maintain”, “Don’t work” are complaints you hear a lot. While to some extent these complaints are genuine, the truth is that there’s no viable alternative when working on medium/large scale projects.

Do you submit your app to the AppStore relying solely on the result of unit tests? It is likely you don’t, and instead precede submission with a set of tedious and time-consuming manual tests. This was our release flow for a rather long time, to the point that manual testing became unsustainable, making some kind of automation mandatory.

We decided to give Apple’s XCTest framework a try and a year (and a lot of sweat) later we managed to reduce time wasted on manual tests to a bare minimum, getting pretty close to the continuous delivery dream.
Today we have 220 test cases on iPhone and 214 on iPad, which run in ~2h and ~2h30m respectively (iPad execution is slower). The development team took charge of writing UI tests (along with unit tests), greatly increasing the quality of the first development iteration's outcome. As a result we achieved a happier team that is able to submit updates quicker and with greater confidence.

The good
If you’re still skeptical, focus for a moment on the bright sides of UI tests and you’ll realize there is indeed a lot of potential:

Interact with your app exactly as the user will do (isn’t that exactly what you want?)

Forget about the application’s implementation, which allows thoughtless refactorings

Test a lot with short and readable code where acceptance criteria can be easily expressed

Lift yourself from the pain of manual testing (isn’t that alone enough?)

The bad
There are of course some downsides and limitations, but you’ll see how we worked them out:

Inflexible. UI tests run in a separate process that has no access to the code of the application under test. This makes it tricky to do things like mocking or data injection

Slow. You don't run them locally but rather in a continuous integration environment, using xcodebuild / fastlane. However… ⤸

Results. It can get pretty difficult to understand what went wrong from the test logs

Making tests flexible
By far the most discouraging issue when approaching UI testing is its apparent inflexibility. You quickly realize that the main issue is that the test process has no access to the application’s internal state. Doing something trivial like checking that a setting is properly stored in NSUserDefaults can get close to impossible.
We managed to work around this by developing SBTUITestTunnel, which adds the missing link between the test and application targets. The idea is very simple: when running tests, a web server is instantiated in the app, which receives a set of requests from the testing target. The tool sits on top of XCTest, extending its functionality in a neat and usable fashion, allowing you to stub and monitor network requests, inject data into NSUserDefaults and the Keychain, download/upload files from/to the app’s sandbox and much more.
Let’s see for example how easy it is to stub a specific network request:
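A sketch based on SBTUITestTunnel's README - the URL match and stubbed payload are made up, and exact module and class names may differ between versions of the library:

import SBTUITestTunnelClient
import XCTest

class SearchTests: XCTestCase {
    let app = SBTUITunneledApplication()

    func testSearchShowsStubbedResults() {
        app.launchTunnel()
        // Stub any request whose URL matches, returning a fixed JSON payload
        app.stubRequests(
            matching: SBTRequestMatch(url: "api.example.com/search"),
            response: SBTStubResponse(response: ["results": []])
        )
        // ... drive the UI and assert against the stubbed data
    }
}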

These are very simple examples that show some of the capabilities of the testing framework we developed. A lot more can be achieved, making it possible to write very complex, yet readable, tests.
A real life example of what one of our UI tests looks like is the following:

Adapting workflow to UI testing
It’s a fact that it takes time to execute a full UI test suite; this is, however, not a problem per se if you adapt your testing workflow appropriately. Our tests today take ~4h30m to execute, running overnight (using fastlane / xcodebuild) on a single continuous integration machine. This daily outcome is all we need to get effective feedback on our latest changes.
On the other hand, we work locally when developing new tests, as running a few of them takes an acceptable amount of time.

xcodebuild, screencasting and finding what went south
We chose xcodebuild for running tests since it allows greater flexibility of integration with our development pipeline and opens up for unique features which would not be possible otherwise. Among others, we were able to add a screencasting job to the whole testing session, as shown below.
The downside of using xcodebuild is the massive amount of hard-to-read logging output that is produced, which makes it challenging to understand why a particular test is failing. A similar problem can be experienced in Xcode, where there is a lot of clicking involved before you get the full picture of what eventually failed.
Rethinking how we would want test results to be summarized, we built another custom tool, sbtuitestbrowser, that parses and presents results in a simple (yet effective) web interface. Written in Swift, of course.
The home page shows the latest testing sessions, testing classes and single tests.

Each test has the list of actions that were performed, the same ones you’re used to in Xcode. You can enable/disable screenshots or rely on the screencast that is synced with the actions: tap on an action and the video jumps to that point of the test.

Specifying environments/devices
Along with the tools described so far we’ve been exploring other ways to further enhance our testing experience. Having different backend environments (development, integration, pre-production, production), we needed a way to easily specify which tests should run on which environments. We don’t want certain tests to run in production, e.g. replying to a real user with a test message.
By swizzling the XCTest framework and by defining a set of protocols used to «decorate» XCTest classes, we made it possible to specify on which devices and environments a test case should run. Protocols are divided in two categories: environments and devices.
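A sketch of what the decorated test classes look like - the protocol names are the ones mentioned below, while the test class names are hypothetical:

class LoginTests: XCTestCase, SBTTestOnAnyDevice, SBTTestOnAnyEnvironment { /* ... */ }
class MessagingTests: XCTestCase, SBTTestOnAnyDevice, SBTTestOnAnyEnvironment { /* ... */ }
class SettingsTests: XCTestCase, SBTTestOnAnyDevice, SBTTestOnAnyEnvironment { /* ... */ }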

These 3 test cases will be executed independently of the device (SBTTestOnAnyDevice) and environment (SBTTestOnAnyEnvironment) the app is running on.
The implementation of this functionality has been tailored to our specific app configuration (with regards to how we are specifying the running environment), but we are planning to make something reusable available soon.

What‘s next?
There are scenarios, like for example a broad refactoring (e.g. changing the network layer), where you want to verify the entire app’s behaviour before pushing your changes to the main branch. In these cases quicker feedback might be required, which you can achieve by running tests in parallel. We’ll work on expanding our continuous integration environment to run tests in parallel and provide that quicker feedback.

Conclusions
We expanded the UI testing framework provided by Apple, building additional tools that helped us to reduce manual testing and increase confidence in our code.

SBTUITestTunnel: adds a link between the app and test targets, allowing you to write much more sophisticated tests

LiveData vs ObservableField

In Android Data Binding we find the sister class ObservableField<T>. If I were to write a similar slide for that one, it would be:

An Observable data holder

The difference is that ObservableField<T> isn't lifecycle-aware and hence there cannot be any automatic subscription management. However the Android Data Binding framework does the subscription management for you - it's just not directly aware of the lifecycle. Instead of immediately unsubscribing when a fragment pauses or stops, it has somewhat of a delay. Notice this memory dump from a bound ObservableField:

You see that the mListener is a ViewDataBinding$WeakListener. If you look at the ViewDataBinding class inner workings, you see that it uses a WeakReference to the views and view bindings. This ensures that whenever an Activity or Fragment is dead, the reference to the ViewModel and its ObservableFields are removed in due time. Just not immediately as you would get if it was lifecycle aware.

Using it with data binding

So the key question is - can we just swap all usages of ObservableField<T> with LiveData<T> in the view models we use with Android Data Binding? After all, they're just observables with lifecycle awareness.

The Android Data Binding framework doesn't know how to observe on LiveData.

The first bullet point is easily fixable. We could just write a simple converter. The second one is not as simple though. At least not for us laymen. Google could fix it, and I hope they will. Notice the difference in the interfaces for subscribing on the two classes:
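Roughly, the two subscription signatures are (from the Data Binding and Lifecycle libraries respectively):

// ObservableField<T>, via the data binding Observable interface: no lifecycle involved
void addOnPropertyChangedCallback(Observable.OnPropertyChangedCallback callback);

// LiveData<T>: subscribing requires a LifecycleOwner
void observe(LifecycleOwner owner, Observer<T> observer);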
You have to provide the LiveData object with a LifecycleOwner. As of now, the Android Data Binding framework isn't rigged to do that. But it could be, and I hope it will be soon. I notice that it's Yigit Boyar who presents these concepts - the developer who's also behind Android Data Binding. He should know how to tie these things together.

The ViewModel holder

Another interesting class introduced in the Android Architecture Components talks is ViewModelProviders. I've found it does three things:

Retains view models across device rotations and other lifecycle-affecting events

Gets rid of the view models in onDestroy

Saves and restores the view models, helped by the Parcelable interface, in onSaveInstanceState and onCreate *

*) Disclaimer: Very superficial research done here
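To make that concrete, here is a minimal usage sketch (CounterViewModel and CounterActivity are made-up names, and the API shown is the original android.arch.lifecycle one):

import android.arch.lifecycle.ViewModel
import android.arch.lifecycle.ViewModelProviders
import android.os.Bundle
import android.support.v7.app.AppCompatActivity

class CounterViewModel : ViewModel() {
    var count = 0   // survives rotation because the instance is retained
}

class CounterActivity : AppCompatActivity() {
    private lateinit var viewModel: CounterViewModel

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        // Returns the retained instance after a rotation, or creates a new one the first time
        viewModel = ViewModelProviders.of(this).get(CounterViewModel::class.java)
    }
}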

The problem it solves is avoiding double calls to expensive services when a long-running request is interrupted by a lifecycle change. Illustrated by this slide in the talk, just to set the context:

I'm claiming that this class brings more value to the table for data binding users than the LiveData class. It wasn't very difficult to achieve before either, though. The Data Binding framework already allowed for this pattern as long as you implemented the holder class yourself.

Example: I made an improvement to my earlier MVVM example project where I've introduced a holder for the ViewModel class accessed through an application-wide singleton. There wasn't a lot of code change needed. You'll notice after this commit that the "# logged in users" in the top right corner of the example activity doesn't reset after device rotation using this technique, even if you interrupt the two-second initial load.
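The commit itself has the details; as a sketch of the idea only (not the actual code from the project), such an application-wide holder can be as small as this:

object ViewModelHolder {
    private val viewModels = mutableMapOf<String, Any>()

    // Returns the retained instance for the key, creating it on first use
    @Suppress("UNCHECKED_CAST")
    fun <T : Any> getOrCreate(key: String, factory: () -> T): T =
        viewModels.getOrPut(key, factory) as T

    // Call when the screen is finishing for good, not just rotating
    fun remove(key: String) {
        viewModels.remove(key)
    }
}

// Usage in onCreate (LoginViewModel is a placeholder name):
// val viewModel = ViewModelHolder.getOrCreate("login") { LoginViewModel() }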

In the branch androidarch I solved the same problem using the ViewModelProviders class from the Architecture Components. I also experimented with binding to a LiveData class instead of ObservableField, but found that it didn't bring any value to the architecture or solution, due to the arguments stated above.

Conclusion

I think the Android Architecture Components are a good set of components that will surely make it easier for new developers to create a solid foundation from the start. The earlier examples from Google have reeked of bad practices and been riddled with pitfalls. But in terms of interoperability with Android Data Binding it seems very unfinished. Maybe they hoped to get more done for Google I/O than what was presented. In any case I look forward to the next versions of this library and its migration into the support framework.

FINN.no is the largest online marketplace in Norway, and we take continuous deployment seriously. We are about one hundred developers deploying new code to production 978 times each week. That is 978 / 100 = 9.78 deployments to production per developer every week. In order to get to these numbers and still keep it under control, we have had to both change our development process and establish the right set of tools.
In this article, we introduce feature toggles and how this technique has contributed to our continuous deployment success. We will also present Unleash, an open source framework we created to support feature toggles at an enterprise scale. Unleash allows us to enable features for specific users via the activation strategies concept. Unleash is already set up and in use by multiple teams within Schibsted, including FINN, SPT Payment, and Knocker.
[caption id="attachment_1731747" align="aligncenter" width="658" class="center "] Number of production deploys in “week 50” in recent years[/caption]

Feature Toggles

‘Feature toggles’ is a simple technique to separate the process of putting new code into production from the process of releasing new features to our users. In its simplest form a feature toggle is just an “if” statement in the code, guarding the new feature:
if (unleash.isEnabled("AwesomeFeature")) {
    // magic new feature code
} else {
    // old boring stuff
}
The fastest way to get started with feature toggles in your project is to use a plain old properties file. This allows you to integrate unfinished features into the master branch and hide them from users in production. A simple properties file was exactly how we started playing with feature toggles in FINN.
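As an illustration only (the file name and property key are made up), such a properties-based toggle could look like this in Kotlin:

import java.io.FileInputStream
import java.util.Properties

// features.properties contains a line like: feature.awesomeFeature=false
fun isFeatureEnabled(name: String): Boolean {
    val props = Properties()
    FileInputStream("features.properties").use { props.load(it) }
    return props.getProperty("feature.$name", "false").toBoolean()
}

// if (isFeatureEnabled("awesomeFeature")) { /* new code */ } else { /* old code */ }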
We realized that feature toggles enabled us to move faster because we were able to integrate new and unfinished features into the master branch early. This also avoids long-running feature branches that tend to be harder to merge. We learned the hard way that having many parallel feature branches adds a lot of extra complexity related to merging, but with the help of feature toggles, we can merge them directly to master.
This went well, but we also felt we could do better. We wanted one simple dashboard to turn the feature toggles on and off. The dashboard should contain all feature toggles for all web apps and applications in FINN. After creating several proof-of-concepts using various technologies, we proposed a solution to our technology management group. They loved the idea and we created Unleash, our shiny and open source feature toggle system.
Unleash gives us a great overview of all feature toggles across all our services and applications. It also makes it super easy to activate new features in real time in production, and of course to deactivate a feature instantly if there's a problem.

System Overview

Unleash comprises three parts:
1. Unleash API - The service holding all feature toggles and their configurations. Configurations declare which activation strategies to use and which parameters they should get.
2. Unleash UI - The dashboard used to manage feature toggles, define new strategies, look at metrics, etc.
3. Unleash SDK - Used by clients to check if a feature is enabled or disabled. The SDK also collects metrics and sends them to the Unleash API. Activation Strategies are also implemented in the SDK. Unleash currently provides official SDKs for Java and Node.js
[caption id="attachment_1731749" align="aligncenter" width="508"] Unleash system overview[/caption]
Unleash is written with performance and resilience in mind.

Performance - In order to be super fast, the client SDK caches all feature toggles and their current configuration in memory. The activation strategies are also implemented in the SDK. This makes it really fast to check if a toggle is on or off because it is just a simple function operating on local state, without the need to poll data from the database.

Resilience - If the Unleash API becomes unavailable for a short amount of time, the cache in the SDK will minimize the effect. The client will not be able to get updates while the API is unavailable, but the SDK will keep running with the last known state.

Built-in backup - The SDK also persists the latest known state to a local file at the instance where the client is running. It will persist a local copy every time the client detects changes from the API. Having a local backup of the latest known state minimizes the consequence of clients not being able to talk to Unleash API at startup. This is required because the network is unreliable.

Gradual Roll Out of Features

It is powerful to be able to turn a feature on and off instantaneously without redeploying the whole application. The next level of control comes when you are able to enable a feature for a specific user or a small subset of the users. We achieve this level of control with the help of activation strategies. The simplest strategy is the “default” strategy, which basically means that the feature should be enabled for everyone.
Another common strategy is to enable a new feature only for specific users. Typically I want to enable a new feature only for myself in production before I enable it for everyone else. To achieve this we can use the “UserWithIdStrategy”. This strategy allows you to specify a list of specific user ids that you want to activate the new feature for. (A user id may, of course, be an email if that is more appropriate in your system.)
When I have verified that the new feature works as expected in production, I can activate the feature for some real users. To achieve this I can use one of the gradual rollout strategies. At FINN we use three variants of gradual rollout, which have slightly different purposes:
GradualRolloutUserId is the strategy we use when we want to gradually roll out a new feature to our logged-in users. This strategy guarantees that the same user gets the same experience every time, across devices. It also guarantees that a user who is part of the first 10% will also be part of the first 20% of the users, so users keep the same experience even as we gradually increase the number of users who are exposed to a particular feature. To achieve this we hash the user id and normalize the hash value to a number between 1 and 100 with a simple modulo operator.
[caption id="attachment_1731751" align="aligncenter" width="518"] Converting a user id string to a number between 1 and 100[/caption]
GradualRolloutSessionId is almost identical to the previous strategy, with the exception that it works on session ids. This makes it possible to target all users (not just logged in users), guaranteeing that a user will get the same experience within a session.
GradualRolloutRandom has no form of user stickiness; it just picks a random number for each "isEnabled" call. We have found this rollout strategy to be very useful in some scenarios when we are enabling a feature which is not visible to the end user. Because of its randomness, it becomes a great way to sample metrics and error reports, and it can be a useful way to detect JavaScript errors in the browser.
In Unleash 2.0 we added support for multiple strategies. This is especially convenient when you both want to expose your new shiny feature to a percentage of the users and at the same time make sure to expose it to yourself.

Metrics

When you enable a feature for a small group of users it is nice to know how many users actually get exposed to it. This is why we added metrics to Unleash 2.0. The SDK will locally collect metrics in the background and regularly share this information with the Unleash API. The API then aggregates the metrics from all clients and calculates how many times a feature toggle evaluated to enabled in a given time period.
When implementing the metrics feature we discovered that this was a nice way to display which toggles are actually in use in an application. It even made it possible to show feature toggles that are used in client code but not defined in the Unleash API. Surfacing these undefined toggles makes them easier to discover; this can happen when a developer starts using a new feature toggle but has not yet defined it in the Unleash UI. The undefined feature toggle is also shown as a clickable link to make it easy to define it in Unleash.

Types of Toggles

We have seen how release toggles allow us to decouple deployment of code from the release of new features. At FINN we have ended up with three main categories of feature toggles:
Release toggles are the most common type of feature toggles we use. These are used for new features we roll out in the market and are mainly used to roll them out safely: instead of exposing a new feature to everyone at once we can try it out on a small subset of our users and verify that it scales, does not contain bugs, and that it moves our KPIs in the right direction.
Kill switches are sometimes necessary to protect our site. Like many other large sites, we occasionally have some features that struggle under certain edge cases, often related to third party integrations. The Kill switches allow us to turn these features off, which is a simple way to degrade our service in order to keep the business running. Of course, it would be nice to not need the kill switches, but they have proven their value over time.
Business toggles are used to turn on certain features for specific users, segments or customers. It’s pretty straightforward to solve this in Unleash because of the flexible activation strategies, but we generally don’t use Unleash for this. It’s considered acceptable for experimenting with new ways to segment our service, but we don’t want the business toggles to become a permanent part of Unleash. We think permanent business rules should be handled by the application code and not Unleash.

Tips and Best Practices

We have been using feature toggles for several years in FINN. Here are some tips about how to use them more effectively:
Feature Toggles Increase Technical Debt
One of the biggest issues we have found is that, because they are so easy to add, developers love to add new feature toggles but don't seem to remove them afterward. Sometimes developers keep them around because they feel they are "nice to have". Other times they are just forgotten or no one has the time to refactor and remove them. A feature toggle is technical debt from the moment it is added to the code. The general advice is to remove the toggle as soon as it has served its purpose. Every feature toggle adds a new code path through the application, making it harder to test and debug the code.
Don’t Use Feature Toggles if You Don’t Need Them
In many cases, you can safely add a new and unused feature to the application without protecting it with a feature toggle. It might be a new API endpoint or a new page on a separate URL; either way, a feature toggle is not needed.
If you do have to protect your new feature, try to use as few if-statements as possible; ideally, just one if-statement should guard the feature.
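One way to keep that single guard in one place is to hide the check behind a small, intention-revealing wrapper (the toggle name and the wrapper itself are only a suggestion; the isEnabled call is the same one shown earlier):

class Features(private val isEnabled: (String) -> Boolean) {
    // Removing the toggle later means deleting this one function and its call sites
    fun newSearchEnabled() = isEnabled("new-search")
}

// val features = Features { name -> unleash.isEnabled(name) }
// if (features.newSearchEnabled()) { /* new search */ } else { /* old search */ }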

Further Plans for Unleash

Unleash has been actively used in production by FINN since 2014. The project has been lucky to have many different contributors over the years.
Currently Ivar, Sveinung and Vegard actively maintain the project. We have some great ideas for the future and we hope to see some of them implemented. In this section, we will briefly explain some of these ideas. We would love to hear your feedback on them!
Unleash as SaaS
Currently, you have to host your own instance of the API and the UI in order to use Unleash. This includes setting up a database and running a Node.js application in production. We believe it could greatly improve adoption if Unleash were provided as software as a service (SaaS). It would, of course, require us to add some access control, but this is something we have on the roadmap anyway. This might require us to charge users of Unleash a monthly fee and spend that money on hosting and new development.
“AND Strategies”
Unleash 2.0 added support for multiple strategies. This was a huge improvement! But it is implemented as a simple array where all the strategies are “OR”-ed in order to determine if a feature should be enabled. We believe we could improve this by adding support for required strategies. This would make it possible to combine strategies more freely to create new segmented roll-out groups. For example, this would make it trivial to only gradually roll out a feature toggle to our beta users.
Authentication and Access Control
The Unleash service already provides hooks for registering your own middleware, so that you can actually add your own access control. But why not add a few ready to use implementations, such as Google Authentication or Okta? This would make it much easier to integrate into an enterprise already using an identity provider.
Application Scoped Toggles
Today all applications receive all toggles for everyone in the same cluster. In Unleash 2.0 we added a client registration feature, where all clients register with the API using a unique identifier. We could use the application identifiers to provide a simple way to define which application you intend to use a feature toggle for. This would make it possible to only expose relevant feature toggles to relevant applications.
Time to Live
Because most feature toggles should be used for a limited time, it would be nice to be able to set a time-to-live field on them. The Unleash UI could then inform developers about toggles they use which have expired, which would be a great motivator to clean up feature toggles that have served their purpose.
A/B Testing
At FINN we use Unleash as a way to set up and manage simple A/B tests. Our current implementation is a bit cumbersome and we believe there is room for generalizing and standardizing this, making it available to anyone. Also, in order to support multivariate experiments, we might need to change the API slightly.
These are just some of the ideas we have for Unleash. Which features would you like to see in Unleash? I would love to hear from you!

Summary

In this article, we introduced feature toggles and how FINN uses Unleash to gradually release new features. This approach gives us more control because it decouples the process of putting new code into production from the release of new features for our users.
Sound interesting? Check out the getting started guide. If you need help please do not hesitate to file an issue or reach out to any of the core contributors.
Thanks to everyone who has contributed to the project. All contributions are highly appreciated.

9 months ago I wrote about asynchronous programming patterns in different languages, covering C#, JavaScript and Java. Lately I've been digging into Kotlin and specifically the Coroutines implementation in the 1.1 beta. This finally brings the async/await pattern to Android, a pattern I learned to love when it was introduced to C# some years ago. The main benefit from my point of view is how it integrates asynchronous programming into the language syntax, making writing, and even more importantly reading, asynchronous code simple.

What are coroutines?

Although I was familiar with async/await, the term "coroutines" was new to me. I'm going to quote the author of Kotlin Coroutines on the definition:

A coroutine — is an instance of suspendable computation. It is conceptually similar to a thread, in the sense that it takes a block of code to run and has a similar life-cycle — it is created and started, but it is not bound to any particular thread. It may suspend its execution in one thread and resume in another one. Moreover, like a future or promise, it may complete with some result or exception.

So from a programmer's point of view it's just a stateful block of code. It's the contract of where and when it executes that makes it a coroutine. It is not a thread, but it does indeed execute on a thread (all code does). You as a developer can control where it executes by specifying a scheduler (a coroutine dispatcher in Kotlin terms). In Kotlin on Android the most common choice is to schedule it on the CommonPool thread pool, which is a pool of roughly 3-4 background threads.
Kotlin does not reinvent the wheel on scheduling and thread pooling: under the hood of CommonPool, the JVM's ForkJoinPool.commonPool() is used if present. If not, a fallback pool is provided by the Kotlin coroutine implementation.
The base interface for a coroutine instance in Kotlin is a Job. If you want more control over the lifecycle you can create a Deferred using the async() method, which gives the Job five states instead of three:

New

Active

Resolved

Failed*

Cancelled*

*) Only present for Deferred. Instead of Resolved you have Completed for the base interface Job.

The async patterns

As initially stated I wanted to see how the three use cases with serial, parallel fork-join and parallel take-first execution manifested themselves in Kotlin with coroutines. The code speaks for itself. The serial and parallel fork-join cases were trivial. For the parallel take-first implementation I couldn't find a helper method to do what I wanted: wait until the first of X jobs has finished and use that result. This is equivalent to RxJava's .first() and C#'s Task.WhenAny. To accomplish this I implemented a helper method awaitFirst. This implementation may or may not be sound, please correct me if it's bad.
See my original post for the explanation of the use cases

Support functions

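The original snippet was lost in publishing. A stand-in sketch of what such support functions could look like (the names and delays are invented, and the code uses the current kotlinx.coroutines API rather than the experimental 1.1-era one):

import kotlinx.coroutines.delay

// Simulated remote calls: each suspends instead of blocking a thread
suspend fun fetchToken(): String {
    delay(500)
    return "token-123"
}

suspend fun fetchUser(token: String): String {
    delay(500)
    return "user-for-$token"
}

suspend fun fetchFromMirror(name: String, latencyMs: Long): String {
    delay(latencyMs)
    return "response-from-$name"
}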

Pipeline / Serial execution

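Again the original code is missing; here is a sketch of the serial case, reusing the support functions above:

import kotlinx.coroutines.runBlocking

fun main() = runBlocking {
    // Serial: fetchUser starts only after fetchToken has completed
    val token = fetchToken()
    val user = fetchUser(token)
    println(user)   // takes roughly 500 + 500 ms
}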

Parallel fork-join

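A sketch of the fork-join case, again using the support functions above:

import kotlinx.coroutines.async
import kotlinx.coroutines.runBlocking

fun main() = runBlocking {
    // Fork: both requests start immediately; join: await both results
    val first = async { fetchFromMirror("a", 500) }
    val second = async { fetchFromMirror("b", 700) }
    println("${first.await()} and ${second.await()}")   // takes roughly max(500, 700) ms
}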

Parallel take first

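The original awaitFirst helper is gone as well; one possible version built on select, plus a take-first sketch using it (whether this matches the author's original approach is unknown):

import kotlinx.coroutines.Deferred
import kotlinx.coroutines.async
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.selects.select

// Completes with whichever Deferred finishes first
suspend fun <T> awaitFirst(deferreds: List<Deferred<T>>): T =
    select {
        deferreds.forEach { deferred ->
            deferred.onAwait { it }
        }
    }

fun main() = runBlocking {
    val mirrors = listOf(
        async { fetchFromMirror("mirror-1", 800) },
        async { fetchFromMirror("mirror-2", 300) }
    )
    val fastest = awaitFirst(mirrors)
    mirrors.forEach { it.cancel() }   // cancel the slower requests
    println(fastest)                  // "response-from-mirror-2"
}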

Summary

Since we're already using RxAndroid in our projects at VG, taking the step to Kotlin Coroutines is mainly an improvement of syntax. If you're still doing callbacks though, this step is huge. It basically makes asynchronous code look synchronous. Make no mistake though: it still involves the complexity of threads and concurrency. You need to know what you're doing and what's happening under the hood to avoid bugs. But once you do, this will bring your code to the next level and can have a positive impact on both performance and reliability. I dare say Kotlin Coroutines are much simpler to work with than RxAndroid/RxJava, so I hope that in the future we'll see support libraries (HTTP comms, UI) based on Coroutines that make the dependency on RxJava smaller or non-existent.

To deliver a truly personalised experience across multiple devices we require our users to log in. To get our users to log in we need to create a seamless login experience. Users often forget their username or password, or do not realise that they can use the same login credentials across the different products and services we offer.
Password management services make it easy for users to keep track of their login credentials and use them across multiple platforms.
Apple and Google have created services to manage user credentials. Apple's initiative is called "Shared Web Credentials" and Google's is "Google Smart Lock for Passwords".
If you update your apps and website to support both Shared Web Credentials and Google Smart Lock for Passwords, you can simplify the login experience for those users who have chosen to save their credentials with Apple and Google. The biggest benefit is in sharing login credentials between your website and your apps.
Shared Web Credentials
Shared Web Credentials makes it easy for native iOS apps and their corresponding websites to share the same login credentials. For example, when a user who has logged into your website in Safari opens your app, they can be prompted to log in with the same credentials they used in Safari. Any changes the user makes to their account, either in the app or on the website, will update the iCloud Keychain so that the changes are available in both places.
There is also an added advantage for websites that share the same authentication domain with other products and services. For example, two different brands owned by the same company and sharing the same authentication domain can give the user the option of logging in with the credentials the user has used on the other brand's app or website.
How to support Shared Web Credentials in your app and website.
Google Smart Lock for Passwords
Google's Smart Lock for Passwords works like Shared Web Credentials, allowing easy credential sharing between apps and websites on Android devices and Chromebooks. If your app and website have been 'associated' with each other, the user will seamlessly be able to use the same credentials between website and app without having to enter their details again. In addition to making the required updates to your website and app, you will need to submit the Smart Lock for Passwords affiliation form to verify your app/website association.
How to support Google Smart Lock in your app and website.
Another popular password management service that may be worth supporting in your app is 1Password. You can see how easy the login process is for users of 1Password in this video:
https://vimeo.com/102142106
For relatively little work you can make it a lot easier for your users to log in to your website and apps, removing the hassle of remembering and typing in credentials that are already stored within credential management services.