The Kaggle hackathon is a meetup that gives data scientists the opportunity to make friends in the industry and solve challenging, exciting data problems.

The day is a sprint of tackling cool data projects and sharing interesting ideas, as well as a chance to meet new people and build relationships across the sector. It ends with drinks, which are also a big part of the fun!

The events are organised by the London Data Science Workshop, and the projects that data scientists are invited to work on are part of Kaggle, an online community where data scientists get challenged with real-world machine learning problems.

On 27 January 2018 about 60 data hackers, scientists and analysts met in ZPG’s trendy London office to take on several competitions available on the Kaggle platform. At the end of the day people had the chance to present their findings to the whole audience.

To give an idea of what types of machine learning problems you can collaborate on, here’s a brief summary of the three most popular ones from the January workshop.

The toxic comment classification challenge

This challenge is about building a model that is capable of detecting different types of toxicity like threats, obscenity, insults and identity-based hate. The dataset is a Wikipedia corpus that has been rated by human raters for toxicity.

The corpus contains 63 million comments from discussions relating to user pages and articles dating from 2004 to 2015. About ten people worked on this project, sharing ideas and early results.

Shannon, Head of Analytics at uSwitch ZPG, says about the challenge:

“I hadn’t used Python much before to transform free text fields so I was introduced for the first time to the nltk package.

“At the end of the morning I had a few new data transformation techniques under my belt and my faith in humanity had dropped considerably; there were some truly vile comments in the dataset, which hit home the need to accurately flag such comments in an automated way.”

The 2018 Data Science Bowl

The idea behind this challenge is that being able to spot nuclei automatically helps speed up the search for cures for people who suffer from diseases like cancer, heart disease, chronic obstructive pulmonary disease, Alzheimer’s and diabetes. The 2018 Data Science Bowl proposes the following mission: create an algorithm to automate nucleus detection.
Participants are asked to create a computer model that can identify a range of nuclei across varied conditions. By observing patterns, asking questions, and building a model, participants will have a chance to push state-of-the-art technology further.

Mercari price suggestion challenge

The challenge is about building an algorithm that automatically suggests the right product prices.
Participants are provided with user-inputted text descriptions of the products, including details like product category name, brand name, and item condition.
More details about this challenge can be found here.

Outside of these challenges, there were many people who focussed on the introductory Titanic dataset competition, found here. In this challenge, participants were asked to apply the tools of machine learning to predict which passengers survived the Titanic tragedy. Skills that may come in handy for this challenge include binary classification, logistic regression, and Python and R basics.

Eight questions are presented to the audience and you'll have to pick one of the possible answers. Anyone with a working knowledge of the language will be able to understand the puzzles, but even the most senior Clojure master will be puzzled. The interactive talk format is entertaining, while the puzzles teach you about the subtleties of Clojure and its standard library.

HTML is the backbone of the web and the code you write should be semantic.

If you write semantic HTML, it's easier to read and ascertain developer intent. Take a look at the following semantic and unsemantic examples:

No thank you

Yes please!

It's also more accessible. Screenreaders read through elements and feed back the structure of the page to the person using the device. It's more useful to hear that there is a header with a navigation element, than a div containing a bunch of anchors.

< React 16.2

Prior to React version 16, each component could only display a single element (that could contain other elements).

You can't do this

You have to wrap the headings in a containing element

Because of the necessary containing element, I've peeped a lot of code in the last year that when rendered looks like this:

Too. Many. Divs.

Because the components that rendered the above HTML are built in isolation, it's very easy for us to just wrap our elements in a div and forget about it. We often ignore the generated HTML in favour of the cute component display we get with React Developer Tools.

Hard to read and not very accessible.

React 16.2+

One of the features that React 16 gave us was the ability to render an array of components without a containing element.

Hooray, no containing element!
But... it doesn't feel like HTML, and I don't want to have to put all my elements in an array just to get them to render!

Enter... drumroll... <Fragment />!

Fragment is a new component you can import from React, which will replace all those horrible divs.

When the page renders, all you'll see in the DOM is

<h1>Cool title</h1>
<p>Cooler subtitle</p>

Look, no messy divs!

You can reduce the code even more by using <>, which is equivalent to using <Fragment />.

It works, but it looks a little weird to me, as if I accidentally put extra angle brackets in my code. I'll likely stick to <Fragment />.

There are a couple of caveats with the <></> syntax at present.

If you want to give the fragment a key, you have to use <Fragment />, not <></>

A decade ago uSwitch was firmly gripped by enterprise shackles with hierarchy and departmentalisation the order of the day.

Product development was constrained by the business/IT divide. Like many organisations we had optimised for control and uniformity rather than organisational effectiveness.

Today, following an organisational transformation in which people have been encouraged to escape their silos, uSwitch consists of several independent product teams covering product, commercial, marketing, design and technical delivery capabilities, allowing effective collaboration. Product delivery has been reduced from months to weeks to days (and in some cases hours!).

We will describe in detail how we have combined agile, lean and theories of motivation (such as Dan Pink's Drive) to create an environment where teams can achieve what was previously impossible and do meaningful work that can be directly linked to revenue generation.

We will describe how these radical changes have required us to re-think almost every aspect of the business including: how we attract, onboard, and develop people.

uSwitch is a price comparison website that helps customers find the best deals on energy, broadband, insurance and more. Each of the products we compare is unique and operates against different market constraints. As such each of them has a dedicated, self-contained team working on delivering the best experience for the customers and partners. The teams have a large degree of freedom to organise themselves, so while there are a lot of similarities in how different teams operate, they are not the same. The differences can stem from the size and maturity of the vertical, the different ways we integrate with the end providers, regulations and finally the specific experiences and preferences of the team members. This particular post concentrates on the insurance team where I’m working at the moment.

Insurance is one of the smallest products in uSwitch, with seven people working on it: a marketing person, a content person, an operations person, a commercial person and three engineering people. There is also one SEO person shared with another team and ideally, we’d have another front-end engineer working with us (yes, we’re hiring!).

Our main product is car insurance and it’s the only one we develop in-house. We also provide a few other types of insurance through uSwitch-labelled third parties. We sit together in one area and collaborate closely all the time. Whenever you have a question the person you need is sitting just beside you and on multiple occasions, you tend to overhear interesting information from people talking around you.

We set the direction of our work via the Objectives and Key Results (OKRs) approach, with some of the objectives coming from the company-wide initiatives while the rest are set by the team. We share one external email account per product, through which all external partner communication happens, and we have insight into the budgetary targets that we’re working towards. In a way, we operate like a startup but with the stability, benefits and support of a much larger, profitable business.

As a developer, I have both more freedom and more responsibility than in a traditional company. As you might have noticed, there are no QAs, BAs or PMs in our team, and all those roles are to a degree performed by developers. Engineers become experts in the insurance domain and gain an in-depth understanding of the product. As an engineer you’re expected to own the product and come up with the (not always technical) solutions to the problems.

We like data-driven decisions — a client-facing change in a product would often be A/B tested to validate our hypothesis before the launch. Developers have full access to production environments and are responsible for running the systems on the underlying infrastructure, which these days often means Kubernetes running on top of AWS with a service mesh in sight. We make sure that each developer can deploy directly to all environments quickly and we try to make new starters deploy at least one of their changes to production on their first day.

Given the developers are deeply embedded in the business context they can often prioritise the work themselves. They also make the decisions on how to approach quality and risk in the project. In some cases that might mean carefully testing a business-critical piece of code. More often though we go test-light or test-less and instead depend on extensive monitoring, both on software and business levels, to make sure that the site is both working for the customers and earning us money. In the context of page reliability as long as we’re earning money nothing else matters, and as long as we’re not earning money nothing else matters.

One of the nice consequences of having multiple teams working in an often similar problem space is the culture of stealing solutions from each other. In insurance one of our key results this quarter is implementing a tracking pixel for completed purchases on our partners' sites. That will allow us to get quicker and more accurate feedback on the product changes we implement. I know that quite recently the broadband team implemented a system that is very likely to meet our requirements. Guess what will likely happen! Eventually, if the problem is shared across multiple teams we might decide to provide a company-wide solution. Recent examples include integrating with affiliate networks and working towards a common A/B testing platform.

Our company setup means that new technologies and approaches can be first tested by a single team and, if the results are promising, implemented by more teams. Such an approach allows us to experiment and explore while limiting risk and initial investment. For the core parts of the systems, we would usually choose conservative, well-known technologies that we’re most comfortable with like Clojure, Rails or React. And we start introducing new technologies in smaller, less business-critical systems or tools.

Essentially, the core factors that we try to maximise in our environment are autonomy, mastery and purpose, as presented in the book Drive: The Surprising Truth About What Motivates Us by Daniel Pink. His theory is that being able to direct your own actions, continually improve your skills and work towards meaningful goals increases performance and satisfaction. I tend to agree.

This is the second part of an article dedicated to Clojure transducers. In the first part we discussed transducer fundamentals and the functional abstraction they represent. In this article, we are going to explore how they are used, covering composability, reuse across transports, custom transducers, laziness and parallelism.

Composability

Transducers isolate transforming reducing functions (like map or filter in their transducer version) from sequential iteration. One interesting consequence of this design is that transducers can now compose like any other function. Have a look at the following example:
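A minimal sketch of the kind of example being described here:

(def inc-and-filter (comp (map inc) (filter odd?)))

(def special+ (inc-and-filter +))

(special+ 1 1)
;; 1 - the second argument is incremented to 2, which is even, so nothing is added

(special+ 1 2)
;; 4 - the second argument is incremented to 3, which is odd, so it is added to 1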

(map inc) and (filter odd?) both generate a function with the same interface: taking one item and returning a transformation of that item. We can compose them with comp to form a new function inc-and-filter which is the composition of the two.

Finally, we can provide an argument + to inc-and-filter which again returns a new function. We call the new function special+ because it enhances the normal + with two transformations. We can see the effect of incorporating inc and odd? into special+ by calling it with two arguments; the second argument is incremented and then conditionally added to the first depending on it being odd or even. We can use reduce with special+ as per usual and compare it with the version using simple +:
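Something along these lines (a sketch):

(reduce special+ 0 (range 10))
;; 25

(reduce + 0 (filter odd? (map inc (range 10))))
;; 25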

The two reduce calls are only apparently the same; the mechanics are completely different. In the first case, our special+ applies transformations as it iterates through the input sequence. The second case produces intermediate sequences for each transformation, plus one for the final reduction. More importantly, we can now isolate transformations from reduction, and transduce makes that explicit in its arguments:

(transduce (comp (map inc) (filter odd?)) + (range 10))
;; 25

How many transducers (our transforming reducing functions) can be composed this way? The answer is … as many as you like:
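The gist strings together most of the transducer-returning functions in the standard library; quoted approximately (not verbatim), it looks like this:

(def xform
  (comp (map inc)
        (filter even?)
        (dedupe)
        (mapcat range)
        (partition-all 3)
        (partition-by #(< (apply + %) 7))
        (mapcat flatten)
        (random-sample 1.0)
        (take-nth 1)
        (keep #(when (odd? %) (* % %)))
        (keep-indexed #(when (even? %1) (* %1 %2)))
        (replace {2 "two" 6 "six" 18 "eighteen"})
        (take 11)
        (take-while #(not= 300 %))
        (drop 1)
        (drop-while string?)
        (remove string?)))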

The example above is a famous "gist" created by Rich Hickey (the inventor of Clojure) to show off the available transducers in the standard library.

But there’s more: composition can be applied on top of more composition, allowing programmers to isolate and name concepts consistently. We could group the previous, long chain of transducers into more manageable chunks:
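For example (the grouping and the names below are illustrative, not necessarily the ones from the original):

(def prepare
  (comp (map inc)
        (filter even?)
        (dedupe)
        (mapcat range)))

(def batch
  (comp (partition-all 3)
        (partition-by #(< (apply + %) 7))
        (mapcat flatten)))

(def compute
  (comp (random-sample 1.0)
        (take-nth 1)
        (keep #(when (odd? %) (* % %)))
        (keep-indexed #(when (even? %1) (* %1 %2)))))

(def select
  (comp (replace {2 "two" 6 "six" 18 "eighteen"})
        (take 11)
        (take-while #(not= 300 %))
        (drop 1)
        (drop-while string?)
        (remove string?)))

(def xform (comp prepare batch compute select))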

As you can see, comp can be used several times to compose over already composed transducers, with the end result that logically consistent pieces of computation can be grouped together and named for better reuse and readability.

Note that, although there are structural similarities with the thread-last macro ->>, without transducers and comp it wouldn't be possible to have more than one level of grouping.

Reuse across transports

There is another aspect which contributes to reuse (apart from composition). Now that transducers are isolated from the collection they are applied to, we can reuse transducers with other transports. A transport is essentially the way a collection of items is iterated. One of the most common transports in the standard library is sequential iteration (that is, using sequence functions like map, filter, etc.). But there are other examples of transports. The core.async library, for instance, implements iteration through an abstraction called ‘channel’, which behaves similarly to a blocking queue.

The following example shows the same transducers chain we have seen before to process incoming elements from a core.async channel:
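A minimal sketch of such a setup (assuming core.async is on the classpath):

(require '[clojure.core.async :as a])

(def xform (comp (map inc) (filter odd?)))

;; the transducer is attached to the channel itself
(def out (a/chan 1 xform))

;; consume the transformed items as they arrive
(a/go-loop [acc []]
  (if-some [item (a/<! out)]
    (recur (conj acc item))
    (println "result:" acc)))

;; copy the input range onto the channel and close it
(a/onto-chan out (range 10))
;; eventually prints: result: [1 3 5 7 9]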

The transducer xform is now attached to a channel ‘out’ that is consumed inside a go-loop. Every item pushed down the channel passes through the transducer chain. The input is a range of 10 numbers copied onto the channel, but items could be streamed asynchronously (and often are; for example, as server-sent events on a web page).

The go-loop works sequentially in this case, but core.async also contains facilities to apply the same transducers in parallel. A pipeline is an abstraction, sitting between an input and an output channel, that supports multi-threaded processing.

A pipeline accepts a maximum number of parallel threads (usually the same as the number of physical cores available in the system). Each item coming from the input channel is processed through the transducer chain in parallel, up to the maximum number of threads.
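A sketch of the same chain going through a pipeline with four worker threads:

(let [in  (a/chan)
      out (a/chan)]
  (a/pipeline 4 out (comp (map inc) (filter odd?)) in)
  (a/onto-chan in (range 10))
  (a/<!! (a/into [] out)))
;; [1 3 5 7 9]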

Custom transducers

The standard library provides a core set of transducer-enabled functions, which cover most situations. When those are not enough, you can create your own.

However, a custom transducer needs to obey a few rules to play well with other transducers:

It should support at least zero-, one- and two-argument calls.

The zero-argument call defines the initial value for the reduction. Despite not being used at the moment, it is recommended to invoke the reducing function (usually abbreviated "rf") with no arguments and return the result, a conservative approach that plays well with the current behavior of transduce in this case.

The single-argument call defines the tear-down behavior. This is especially useful if the transducer contains any state that needs deallocation. It will be called once with the final result of the reduction. After providing any custom logic (optional), the custom transducer is expected to call "rf" with the result, so other transducers in the chain can also have their chance for cleanup.

The two-argument call represents the reduction step, which contains the business logic of the custom transducer. This is the typical reduce operation: the first argument represents the result so far, followed by the next item from the transport. It is expected that the reducing function "rf" is invoked after applying any transformation, propagating the call to other transducers that might be in the chain.

A transducer can decide to terminate reduction at any time by calling reduced on the results. Other transducers should pay attention to reduced? elements and prevent unnecessary computation. This is, for example, what happens using transducers like take-while; when the predicate becomes false, no other items should be reduced and computation should stop there. If after take-while there are other computationally intensive transducers, they should also stop processing and just let the results flow to the end.

These rules mainly have to do with the fact that each transducer in the chain should give the others a fair chance to perform (or stop) their transformations, initialisation or cleanup logic.
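Putting the rules together, the skeleton of a custom transducer looks like this (a do-nothing example, just to show the shape):

(defn passthrough-xform []
  (fn [rf]
    (fn
      ([] (rf))               ;; init: delegate to the inner reducing function
      ([result] (rf result))  ;; completion: propagate so others can clean up
      ([result item]          ;; step: transform the item here, then propagate
       (rf result item)))))

(transduce (passthrough-xform) + (range 5))
;; 10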

Other than that, a custom transducer can implement any sort of complicated behaviours, including maintaining internal state, if necessary. Not surprisingly, transducers maintaining state are called "stateful" to distinguish them from "stateless" transducers. Maintaining state in a transducer is quite common and roughly half of the transducers in the standard library do.

It is so common that a new concurrency primitive called volatile has been created. In the historical group of vars, atoms, refs and agents, volatile is the primitive that allows multiple threads to promptly see what's inside as soon as it is updated. It's very different from the other concurrency primitives, which instead "protect" state from concurrent access. If you are wondering why, the answer is that in a context like core.async pipelines, where the same transducer is accessed concurrently, a volatile guarantees that the state is seen by all threads (but it won't work in the foldable stateful transducers that we are going to see later in this article).
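For reference, the volatile API is tiny:

(def state (volatile! 0))

(vswap! state inc)
;; 1

@state
;; 1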

A logging (stateless) transducer

So, let's get practical and create our first stateless transducer. Assume that you just created a complicated, but nicely composed, transducer chain for data processing (we are simulating that here with a much shorter and simpler one).

How would you go about debugging it? You might need to understand at which step in the chain things are not working as expected. Here's an idea for a logging transducer with a printing side effect. Other than printing on screen, the transducer is not altering the reduction:
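A sketch of such a logging transducer (not necessarily identical to the original listing):

(defn log
  ([] (log nil))
  ([idx]
   (fn [rf]
     (fn
       ([] (rf))
       ([result] (rf result))
       ([result item]
        (when idx (print "step" idx "- "))
        (println "Item:" item "Result:" result)
        (rf result item))))))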

In this example, log is a transducer accepting an optional single argument "idx". When "idx" is present, log additionally prints it, assuming that it is the position of the transducer in a composed chain. (We'll see in a second how that information can be used.) Before being composed via comp, transducers are just a list of functions. The idea is that we can interleave the list with the logging transducer ahead of comp, and use a dynamic variable to control when to print to the console:
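A sketch of comp* along with the two invocations discussed below:

(def ^:dynamic *dbg?* false)

(defn comp* [& xforms]
  (if *dbg?*
    (apply comp (interleave xforms (map log (range (count xforms)))))
    (apply comp xforms)))

(transduce (comp* (map inc) (filter odd?)) + (range 5))
;; 9, nothing is printed

(binding [*dbg?* true]
  (transduce (comp* (map inc) (filter odd?)) + (range 5)))
;; 9, plus one logging line per item at each step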

Here, comp* is a thin wrapper around the normal comp function in the standard library. It has the responsibility to check the *dbg?* dynamic variable. When *dbg?* is true, we interleave our logging transducer to an already existing chain of transducers passed as input. We don't do anything otherwise.

The first example invocation of transduce shows that comp* behaves like normal comp. When we bind *dbg?* to true though, the logging transducer starts printing. Several logging transducer instances are created — as many as necessary to interleave the input chain. Each logging transducer carries the "idx" information about its position in the chain, so it can print it. By looking at the source code, we know that ‘step 1’ corresponds to the transducer at index 1 in the (0-indexed) list passed to comp* (which is filter). If we see some odd value of ‘Item’ or ‘Result’, we know which step is producing it and we can take action.

An interleave (stateful) transducer

Let's now see an example of a stateful custom transducer.

The following reduce illustrates the point by showing a way to implement the Egyptian multiplication algorithm. Ancient Egyptians didn't use times tables to multiply numbers; instead they worked out the operation by decomposing numbers into powers of two:
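A sequence-based sketch of the idea: halve one operand, double the other, and sum the doubled values wherever the halved one is odd.

(defn egypt-mult [x y]
  (->> (interleave (iterate #(quot % 2) x)
                   (iterate #(* % 2) y))
       (partition-all 2)
       (take-while #(pos? (first %)))
       (filter #(odd? (first %)))
       (map second)
       (reduce +)))

(egypt-mult 640 10)
;; 6400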

We would like to express this algorithm with transducers but we don't have an interleave transducer in the standard library — just normal interleave. All other operations inside the thread last ->> form are available as transducers.

So, how should we design the interleave transducer to work with transduce? First of all, transduce does not support multiple collections as input. So, the idea is to use one sequence as the main input for transduce and the other as the sequence of interleaving elements. The sequence of interleaving elements lives inside the state of the interleave transducer, taking the first item to interleave at each reducing step. The remaining elements have to be stored as state inside the transducer while waiting for the next call. Without further ado, here's the interleave transducer:
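Here is a sketch of it, annotated with numbers matching the notes below:

(defn interleave-xform [coll]
  (fn [rf]
    (let [fillers (volatile! (seq coll))]               ;; (2)
      (fn                                               ;; (1)
        ([] (rf))
        ([result] (rf result))
        ([result input]
         (if-let [[filler] @fillers]                    ;; (3)
           (let [result (rf result input)]
             (if (reduced? result)                      ;; (4)
               result
               (do (vswap! fillers next)                ;; (5)
                   (rf result filler))))
           (reduced result)))))))                       ;; (6)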

There is quite a lot to cover; please refer to the numbers in the listing above:

interleave-xform is modeled on the same semantics as the interleave function in the standard library: it interleaves elements up to the end of the shortest sequence. interleave-xform contains all the required arities: zero-argument, single-argument and two-argument.

interleave-xform assumes the interleaving elements come from a collection passed when creating the transducer. We need to keep track of the remaining items in that sequence as we consume them, so the rest are stored in a volatile! instance.

During the reducing step, we verify that there is at least one more element to interleave before allowing the reduction. Note the use of if-let and destructuring on the first element of the content of the volatile instance.

Like any good transducer citizen, we need to check whether another transducer along the chain has requested the end of the reduction. In that case, we obey without propagating any further reducing step.

If, instead, we are not at the end of the reduction and we have more elements to interleave, we can proceed to update our volatile state and call the next transducer using the "filler" element coming from the internal state. Note that at this point, this is the second time we invoke ‘rf’; the first being for the normal reducing step, the second is an additional reducing step for the interleaving.

In case we don't have any more items to interleave, we end the reduction using reduced. This prevents nil elements from appearing in the final output — exactly the same as normal interleave.

With interleave-xform in place, we can express the Egyptian multiplication method as follows:
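A sketch of how that could look, reusing the interleave-xform above:

(defn egypt-mult-xform [x y]
  (transduce
   (comp (interleave-xform (iterate #(* % 2) y))
         (partition-all 2)
         (take-while #(pos? (first %)))
         (filter #(odd? (first %)))
         (map second))
   +
   (iterate #(quot % 2) x)))

(egypt-mult-xform 640 10)
;; 6400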

The second iteration of increasingly doubling numbers is now considered the interleaving sequence that we pass when creating the transducer. The other iteration with increasingly halved numbers is instead the normal input for transduce. The two sequences are interleaved together and partitioned into vectors as part of the transducing step. Apart from the interleaving part, the rest of the processing is a mechanical refactoring from a sequence operation into the related transducer version.

Laziness

There are four main ways to apply transducers in the standard library: transduce, sequence, eduction and into. Up until now we've seen one of the most popular, transduce, which is designed to completely consume the input collection. Even when transduce is used to output a new sequence, it doesn't work lazily, as we can quickly verify by using a counter on the number of transformations happening on each item:
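A sketch of such a verification (the counting transducer below simply wraps map with a side effect):

(def cnt (atom 0))

(def counting-xform
  (map (fn [x] (swap! cnt inc) (inc x))))

(take 10 (transduce counting-xform conj () (range 1000)))
;; (1000 999 998 997 996 995 994 993 992 991)

@cnt
;; 1000 - every single item has been processed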

In the example above, you can see that transduce completely consumes a lazy sequence, despite the fact that we want just the first 10 elements (note that conj on a list is pre-pending element at the beginning of the current list). The counter shows that the transducer has been called on all of the items, fully evaluating the range. into uses transduce underneath and has the same behaviour. If you are interested in applying a transducer chain lazily by gradually consuming the input, sequence and eduction are designed for that:
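Using the same counting transducer (a sketch):

(reset! cnt 0)

(take 10 (sequence counting-xform (range 1000)))
;; (1 2 3 4 5 6 7 8 9 10)

@cnt
;; 32 - only the first chunk of the input has been processed

(reset! cnt 0)

(take 10 (eduction counting-xform (range 1000)))
;; (1 2 3 4 5 6 7 8 9 10)

@cnt
;; 32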

The type of laziness in eduction and sequence is called ‘chunked’. It means that they are lazy but not extremely lazy, by allowing some consumption of the input sequence ahead of the actual position in the iteration. New items are processed in chunks of 32 elements each: once the 32th element is reached, all others up to the 64th are processed, and so on. In our example, we consume 10 elements but process the whole first chunk of 32.

So, what’s the difference between eduction and sequence? eduction allows a variable number of transducers to be passed as argument without comp. Besides that, eduction starts a brand-new loop for every sequence operation, including running all transducers again if necessary. Here's a way to demonstrate this behaviour:
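A sketch, again using the counting transducer from above on a 10-item input:

(reset! cnt 0)

(let [ed (eduction counting-xform (range 10))]
  (first ed)
  (rest ed)
  @cnt)
;; 20 - each call walks the 10 input items again

(reset! cnt 0)

(let [sq (sequence counting-xform (range 10))]
  (first sq)
  (rest sq)
  @cnt)
;; 10 - the results are computed once and then cached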

In our new examples, we are using both first and rest on the results of eduction and sequence, respectively. You can see that both first and rest force eduction to scan the input sequence, as demonstrated by the counter printing ‘20’, which is twice the number of items in the input sequence. sequence caches the results internally, so it doesn't execute the transducers again, just returning the result from the cache. The eduction approach has benefits in terms of memory allocation, at the price of multiple evaluations. A rule of thumb for picking between them would be:

Use sequence when you plan to use the produced output multiple times, for example assigning it to a local binding and then proceeding to further processing. The internal caching of sequence results in better performance overall. At the same time, it could consume more memory, as the entire sequence could be loaded into memory if the last element is requested.

Use eduction when there is no plan to perform multiple scans of the output, saving on unnecessary caching. This is the case, for example, with transducer chains that depend on some user-generated search parameters. In this scenario, the application needs to produce a one-off view of the system that is discarded as soon as the response is returned.

Parallelism

We have seen an elegant example of parallelism in transducers using core.async pipelines. There is also another option to parallelize transducible operations using fold, a function from the reducers namespace in the standard library. fold offers parallelism based on the "divide and conquer" model: chunks of work are created and computation happens in parallel while, at the same time, finished tasks are combined back into the final result.

The following example shows ptransduce, a function that works like a parallel transduce. Since reducers are based on the same functional abstraction, we can leverage fold without changing the transducers chain:
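A sketch of what such a function could look like (the exact signature used in the original may differ):

(require '[clojure.core.reducers :as r])

(defn ptransduce [xform rf coll]
  ;; rf is used both as the per-chunk reducing function (wrapped in the
  ;; transducer chain) and as the combining function for partial results
  (r/fold rf (xform rf) coll))

(def xform (comp (map inc) (filter odd?)))

(ptransduce xform + (vec (range 10000)))
;; 25000000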

Note that the input collection needs to be a vector (or a map) in order for reducers to work in parallel. It also needs to be bigger than a certain size (512 items by default) to enable parallelism. Apart from this, the xform transducer chain is the same as before, but we need to call it as (xform rf) because fold expects a reducing function, which is what xform returns (with the added transformations) when invoked with a basic reducing function (like + in our case).

Both pipelines and fold parallelism work unless stateful transducers are involved. Observe:
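A minimal way to see the problem with fold (a sketch; the parallel result varies from run to run, and a pipeline version misbehaves in its own way as described below):

(def xform (comp (drop 100) (map inc) (filter odd?)))

(transduce xform + (range 10000))
;; 24997500

(r/fold + (xform +) (vec (range 10000)))
;; typically a different number, and not the same one on every run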

(drop 100) is now part of the transducer chain. We can see the expected result of ‘24997500’ with a normal transduce, but pipeline and fold both produce unexpected results. The reason is that multiple instances of the same stateful transducer exist. In the case of pipeline, one transducer chain is instantiated for each item (so drop is always dropping the single element available).

In the case of fold, there is one transducer chain per chunk, and the situation is further complicated by work stealing, a feature in which threads that are currently not doing any work can steal items from other threads. When that happens, the stateful transducer is not initialized again, with the result that the same state is suddenly shared across threads. That's why, to see the inconsistency, you need to run fold multiple times.

The problem of parallelism with stateful transducers is both technical and semantic. The drop operation, for instance, is well defined sequentially: the requested number of elements is skipped at the beginning of the collection. But the definition needs to be reformulated when chunks are involved: should drop remove the same number of elements from each chunk? In that case, sequential drop would diverge from parallel drop. Since a unique and consistent solution does not exist in the standard library at the moment, the problem of parallel stateful transducers remains unsolved and they should be avoided with fold or pipeline.

Conclusions

As we’ve seen in the previous part of this two-article series, transducers are, at their core, a simple but powerful functional abstraction. "Reduce" encapsulates a pattern for recursion that can be adapted to many situations and, at the same time, it promotes the design of code in terms of a standard reducing interface. Since transducers’ building blocks conform to the same contract, they are easy to compose and reuse. Transducers were introduced late in Clojure, perhaps explaining why some struggle to adopt them at first.

There are still some rough edges and room for improvement in transducers as they are today. A few libraries have started to emerge to provide transducer versions of other functions (most notably, Christophe Grand’s xforms), hinting at the fact that more could be added to the standard library itself. Transducers are also amenable to parallel computation, but there is no solid semantics for parallel stateful transducers, so they can’t be used with fold. This somewhat discourages their parallel use as a whole.

On the positive side, transducers already cover a fair number of common use cases and you should consider reaching for them as the default for everyday programming.

We frequently find ourselves observing and reacting to changes in our customers and our website. Here are a few of the more common situations that we experience:

An enticing product is added to one of our comparison pages, causing an increase in people selecting products on that comparison page.

A bug is released onto an important landing page, preventing people from continuing onto the next page.

News stories over several days cause a spike in search engine traffic to some of our guide pages.

These types of situations are quickly noticed by people.

‘Oh, traffic has gone up on the site!’ says someone watching a real-time data stream at the right moment.

‘Why has this KPI dropped in the last hour?’ asks someone watching a KPI dashboard.

‘This landing page is getting loads of traffic,’ says an analyst looking at landing pages in a weekly trends analysis.

There are a few challenges here. The first is that real-time data streams can’t be checked constantly in a reliable way by people who have other things to do. It distracts from more important tasks, and there are events, like meetings or lunch hours, that take up the time of entire teams. As a result, things get missed. Sometimes something important that could have been known within 10 minutes becomes known three hours later. Sometimes it’s someone from outside the most relevant team that spots the unexpected blip that pushes the team into a flurry of action. Not ideal.

The second challenge is that it’s time-consuming to check thousands of time series regularly for changes. Although we have a fairly manageable number of KPIs and metrics we’re interested in, when split by other variables they often become unmanageable. We have many thousands of pages on our site; hundreds of groups of these pages; hundreds of variable values about the device, browser and operating system used by the people accessing our site; a staggering number of search terms from search engine traffic; and many other ways to split a quick-to-check list of important metrics into a tedious, soul-destroying, coffee-demanding endurance checklist. It’s enough to make analysts want to say, “Let’s just look at the top-level metrics and maybe check some likely variable splits if any of the top-level metrics are looking funny.”

To be clear, these are still challenges for us and we haven’t investigated all the likely detection methods available. But we’ve progressed and continue to do so. My guess is that some of the useful things we’ve found in this area could be useful to some of you, too.

What we want from this area

We primarily want these three things:

To detect large and important changes quickly. For us this means detecting large spikes or dips in important metrics as they’re happening, usually on the timescale of minutes.

To detect less time-sensitive changes eventually, which for us usually means the day after, stretching out to a week or month for less dramatic changes.

To have a small overall number of false positives, so that each set of points identified as anomalous is investigated. We want to save time and get important things noticed quickly. We don’t want an algorithm that constantly says ‘check everything, every minute, every day’.

Some approaches we use

The approaches we’ve taken range from the simple to the complex. A few that we’ve looked at are rules based on domain knowledge, Robust Principal Component Analysis and changepoint detection.

Approach one: Rules based on domain knowledge

There are some simple situations based on domain knowledge and experience that we can look out for in the data. Examples include:

Time series starting or stopping, corresponding to items being added or removed on site. For instance, if a page were to be added to the site, the number of daily pageviews of that page would be zero until the very first pageviews. Similarly, if a page is removed, the daily pageviews would be zero for each day after the removal date. We can categorise these time series as such with a simple rule like ‘if a time series starts with zero counts at the first X time steps, categorise this time series as “new time series started”’.

Minimum/maximum thresholds, corresponding to known danger zones. For instance, if we had zero pageviews of our results pages between 13:00 and 13:30 on a Monday, we would be certain that something was wrong with our site (or our tracking). On the opposite end of the spectrum, we also have known upper limits where our servers might struggle to handle the traffic. Both kinds of rule are simple to express in code, as the sketch below shows.
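(A small illustrative sketch of both rules, in Clojure; the window sizes and thresholds are made up.)

(defn new-series? [x series]
  (and (every? zero? (take x series))
       (some pos? (drop x series))))

(defn below-minimum? [threshold series]
  (some #(< % threshold) series))

(new-series? 3 [0 0 0 0 12 40 35])
;; true

(below-minimum? 1 [130 95 0 210])
;; true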

Approach two: Robust Principal Component Analysis

A number of useful articles have been written about this on the web so I won’t go into much technical detail about this method.

This method is useful for finding temporary spikes and dips in time series. It can account for seasonalities as well as work with multiple variables simultaneously. We’ve found it useful in alerting us to unusual changes in less important variables from the previous day, as well as pointing us to any spikes and dips in seasonal time series.

Robust Principal Component Analysis in our case works in the following way:

Start with a time series such as hourly views of A Very Important Page every hour over the last 52 weeks.

Choose an interval of time that corresponds to a season in our variable — a week, for example.

Convert our time series into a collection of these intervals (in our example, a collection of week-long time series).

Create a matrix D with each row as one of these intervals, so row one is week one, row two is week two, and so on.

Actual RPCA step: find a way of representing this matrix as a low-rank matrix L, plus a sparse matrix S, plus a (small) error matrix E.

A low rank matrix mostly has rows which are linear combinations of a small subset of rows, a sparse matrix has mostly zeroes as its elements, and the error matrix has very small, normally-distributed values. Our interpretation is that L gives what is expected to happen if everything behaves normally considering other data points in the cycle/season, and S shows the effect of the anomalous behaviour.


Robust Principal Component Analysis can be done with Netflix's Surus Package in R. If you provide the time series and the periodicity, it will perform the matrix decomposition for you.

As an example, we took hourly pageviews to a page on our site over the course of a year and looked at the weeks with the largest spikes or dips. Here are the top two results:

The algorithm that finds the L and S matrices does so by minimising this cost function:
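The formula itself is not reproduced here; reconstructed from the description that follows, it has this shape, with LPenalty and SPenalty as the two tuning coefficients:

\min_{L,\,S}\; \lVert X - (L + S) \rVert_F^2 \;+\; \text{LPenalty} \cdot \lVert L \rVert_* \;+\; \text{SPenalty} \cdot \lVert S \rVert_1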

The first term is the squared Euclidean distance between X and L + S. The second term is the nuclear norm of L, which approximates the rank of L. The third term is the L-1 norm of S, i.e. the sum of the absolute values of the elements of the matrix.

The coefficients (LPenalty and SPenalty) allow us to control how much we care about optimising the sensitivity of the anomaly detection. For instance, if the coefficient for the L-1 norm of S is relatively small, then we’ll see more data points being treated as anomalous. Conversely, if it’s relatively large, we’ll see very few data points treated as anomalous and L being quite close to the original matrix.

Approach three: Changepoint detection

Many changepoint detection methods take the following approach:

Choose a time series.

Assume that a known probability distribution with an unknown mean generates the data at each point.

Find a partition of the series such that the mean is constant on each segment.

How the partition is chosen varies between methods. The changepoint package in R is effective at changepoint detection using several different methods.

Here are some examples using dummy data with the means (in orange) determined using the cpt.mean function from the package. The dummy data points are sampled from three separate normal distributions with different means and variances. The method used is BinSeg (an implementation of the method in this paper: https://www.ime.usp.br/~abe/lista/pdfXz71qDkDx1.pdf) with different values for Q (the maximum number of changepoints).

A useful working method for us is to use the BinSeg method with a small maximum number of changepoints, apply this to a set of time series and rank the series in descending order of largest step change.

Summary

In conclusion, we:

Use Robust Principal Component Analysis to identify spikes and dips

Use changepoint detection to identify step changes in data

Use simple size metrics based on the above algorithms to prioritise anomalies when working with many time series

Use rules based on domain knowledge to ensure known event indicators aren’t missed

Think of a code base that you hack on; one that changes frequently. Ask yourself:

How often does it change?

Are there particular types of change that recur?

What is the cost of making each change?

At uSwitch we deal with lots of energy suppliers. Each sends us a number of different reports, in various formats, containing data that we need to parse and analyse. We automate this process as much as we can, but the structure of the reports often changes, requiring frequent modifications to the parsing code.

After asking ourselves the three questions above, we accepted that having to change our parsing code often was inevitable and we set out to create a parsing system that would make the cost of change as cheap as possible.

The parsing code that we started with typically looked like the following:
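(The original snippet is not reproduced in this version of the post; as a purely hypothetical stand-in for the style being described, picture something like the following, with hard-coded column positions and formats.)

(require '[clojure.string :as str])
(import '(java.time LocalDate)
        '(java.time.format DateTimeFormatter))

(def date-format (DateTimeFormatter/ofPattern "dd/MM/yyyy"))

(defn parse-row [row]
  {:supplier     (str/trim (nth row 0))
   :switch-date  (when-not (str/blank? (nth row 3))
                   (LocalDate/parse (nth row 3) date-format))
   :annual-usage (when-not (str/blank? (nth row 7))
                   (Double/parseDouble (nth row 7)))})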

Sure, it's functional, it's quite concise, it's even fairly easy to work out what it's doing if you read it a couple of times, so what's the problem?

Getting to what matters

The problem is that this code tells us very little about the data that it's intended to parse.

Which columns are we interested in?

What data types do we expect?

Can a column contain blanks?

Are there any edge cases?

These are the questions that we have to answer every time we come to change this code. What we need is a way to express this information directly so that changing the code is done at the level of these questions.

Write what you mean; no more

To parse a field for example, we might state the field name, the column heading under which the field can be found, and the data type that we expect to parse.

{:switch-date {:column "Switch Date" :parse date}}

We can easily extend this scheme to support reports that use different data types within the same column by using an ordered collection of parsers. The order reflects the preference of data types to be parsed.

{:switch-date {:column "Switch Date" :parse [date string]}}

Parsers are tried in succession until a value is successfully parsed, or until all parsers have been exhausted, at which point we fail with an error.

This also provides us with an elegant way of expressing that a field can optionally be empty.

{:switch-date {:column "Switch Date" :parse [date string blank]}}

Similarly, we can support changing column headings across reports by using a collection of column names. In this case, order is not significant, so we use a set.
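For example (the alternative heading below is invented, purely for illustration):

{:switch-date {:column #{"Switch Date" "Transfer Date"} :parse [date string blank]}}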

Combining these field declarations into a single structure gives us a recipe for parsing a whole report. This particular report is in spreadsheet format, so we use a regular expression to identify the worksheet by name.

Since we also have to sanitise some of the reports that we deal with, we added the ability to perform row filtering and arbitrary transforms to our parsing recipes. We also allow them to be specified in sets; one entry per worksheet. This is an example of what our new report parsing code looks like.
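A purely illustrative sketch of such a recipe; the worksheet pattern, keys and field names below are invented rather than our real ones:

{:worksheet  #"(?i)switch report"
 :row-filter #(not= "TOTAL" (:supplier %))
 :transform  #(update % :supplier clojure.string/trim)
 :fields     {:supplier     {:column #{"Supplier" "Supplier Name"} :parse [string blank]}
              :switch-date  {:column #{"Switch Date" "Transfer Date"} :parse [date string blank]}
              :annual-usage {:column #{"Annual Usage"} :parse [string blank]}}}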

Implementing the code that takes these specifications and uses them to parse a spreadsheet was a fairly simple task that only had to be done once. It was also easy to extend the parsing to CSVs, simply by having an alternate implementation. All of the parsers, which extract the specific value types, are shared between implementations.

Profit

The result is that we now have descriptive and readable report parsers that can be understood at a glance and, most importantly, can be quickly and easily changed. This has transformed the tasks of supporting new types of reports, or tweaking existing reports, from a frustrating time sink to a simple undertaking. It has also greatly increased our confidence in our parsing code for reasons best described by Tony Hoare:

There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies, and the other is to make it so complicated that there are no obvious deficiencies.

Time, of course, is money, and we now have more of it to spend doing other, more exciting things, knowing that we can react to everyday report changes quickly and with confidence.

Remember

Find out what changes most frequently

Optimise to make those changes as trivial as possible

Spend your time working on things that bring real benefit

Sadly, this is the last post on our journey through EuroClojure. We explored the basics of AI and application security in my first post, discovered new tools in the second one, and now all we have to do is put them to work with some real-life experience.

The following talks are showcases of applications used in the real world, from real users, that the speakers shared with us to show what they achieved using Clojure.

Moving people with Clojure

This talk was about — you guessed it — moving people with Clojure. It focused on an Indonesian company that moved from a Java monolith to a Clojure service — also passing through a Golang service — in less than two years. They were able to rewrite their applications and go into production in about two months.
Go Jek is an Indonesian company founded in 2010. They started out offering two-wheel transportation by phone call, but today they offer a lot of different services performed by drivers, including delivery services (for food, medicine, tickets and other items) and rides.
In early 2016, they moved from a Java monolith to a Golang service. In August 2016, however, they decided to move to Clojure due to its features and flexible domain model. Believe it or not, they went from nothing to a smooth and almost bug-free release in production in two months!

How did they do that, and why? The first thing that comes to my mind is the code syntax. For the functionality of finding an available driver in the selected area who’s able to complete the requested task, they moved from a single, huge function with many lines of code to a thread macro and a list of smaller functions. What does that mean? Easier code maintenance, readability and writability.
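Purely as an illustration of that style (this is not Go Jek's code), compare a nested call with its thread-last equivalent:

(def drivers
  [{:id 1 :available true  :distance 2.5}
   {:id 2 :available false :distance 0.8}
   {:id 3 :available true  :distance 1.1}])

;; nested calls read inside-out...
(first (sort-by :distance (filter :available drivers)))

;; ...the same logic reads top to bottom with a thread-last macro
(->> drivers
     (filter :available)
     (sort-by :distance)
     first)
;; both return {:id 3 :available true :distance 1.1}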

But how did they move from one service to another? They built the Clojure service and sent it a copy of all the live traffic, which was then directed to a mock server.

Thanks to generative testing, it was easy to test its functionality on ever-growing data, until the result was good enough to be released. At that point, the testing continued on live data. One per cent of their traffic was redirected to the Clojure allocation service and then to the same servers that received data from the Golang service. From there, a gradual rollout allowed them to completely move to their new service without any major hiccups.

Using Clojurescript to launch iOS/Android apps to 1M users

The author of this talk gave a very deep, thorough context to explain the needs for this project and the reasons behind his choices.

The author is from Azerbaijan, a country between the Caspian Sea and the Caucasus Mountains which has seen rapid technological growth in the last few years. This, of course, influences people’s everyday lives and their expectations for technical products — especially mobile apps since nowadays, there’s a native app for almost everything.

That’s why they needed a quick solution to create a mobile app for their website, a buying/selling online service similar to Craigslist. They had some key criteria for the project:

Be able to launch and iterate quickly

Maintain same codebase for iOS/Android apps

Avoid having to use Objective-C or Java

Have access to platform native features

Get fast, native experience on both platforms

As most of you know, the next step is to investigate all the possible solutions and find the best match for the project’s needs. And that’s exactly what was done, listing all the pros and cons of each possibility:

Website wrapper

Quick and easy, same codebase

Worse experience than website due to user expectations

Doesn’t add much value, except for an icon on your screen

Semi-native

Part native, part web views

Easy to start, lots of code shared with website

Need to know native platforms

Slower than native apps in many cases

App store updates still required for any change to native code

Almost native

React Native / Xamarin / NativeScript / others

Develop using the same tools and get almost native apps

Most of the code is shared between iOS/Android

Most updates do not require app stores

Ultimately, they landed on the last option, bringing the project much closer to a native app than the other two would have. After deciding on almost native, the investigation moved on to React Native. They figured out that it was great as a platform but didn’t like the language (JavaScript).

That’s where ClojureScript comes into play. They opted to use Reagent, a ClojureScript interface built on top of React. This allowed them to simplify the code, define components using functions and data, manage the app state with atoms and trigger re-renders by changing them.
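As a flavour of what that looks like, here is a minimal Reagent sketch (not the app's actual code):

(ns example.views
  (:require [reagent.core :as r]))

;; app state lives in a Reagent atom
(defonce clicks (r/atom 0))

;; a component is just a function returning data (hiccup-style vectors);
;; changing the atom triggers a re-render of any component that derefs it
(defn counter []
  [:div
   [:p "Clicks so far: " @clicks]
   [:button {:on-click #(swap! clicks inc)} "Click me"]])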

That worked: it gave them what they needed, and it was helpful to have immediate feedback on screen thanks to Reagent and React. The Transit data format gave a final boost to this whole system. Why? Because it’s language-agnostic, supports any data type, accepts arbitrary keys and works on top of JSON.

The app reached production in three months and launched one year ago for both iOS and Android. So far, it has almost 200,000 installs, 20% of the daily sessions are from the app, 99.8% of the sessions are crash-free on both platforms, and it has an average rating above 4.7 from around 3,000 reviews.

Conclusion

Overall, I can say I’m really happy about this experience. All my worries about being a woman junior developer at a tech conference disappeared the exact moment I was welcomed and got involved in the social groups at EuroClojure.

I’ve realised that my knowledge has improved. Even if what I’m bringing back is a list of tools and indirect experience, at least I now have something new to explore.

Will I implement any of these tools? Hard to say for sure, but I’ll definitely put them on the table the next time I need to improve an application’s security, come up with an idea on how to change technology or build a robot that helps me move house!

In my previous post, I introduced the EuroClojure 2017 conference and covered two beginner-friendly talks to warm up a bit with something simple. As promised, I’m moving into more technical talks for this post and will cover the showcases in the third, and last, post of this series.

In both of the talks I’ll be summarising in this post, the speakers experimented with the tools available and either created new ones to fit their needs or demonstrated what you can do with them.

otplike – Erlang/OTP processes and behaviours for Clojure

I must admit, this was one of those ‘out of my range’ talks where you think, “I’m not here to learn tech; I’m here to learn what I need to learn.” I found it interesting for several reasons.

I’ve recently started to understand what processes and components are, what they’re for, and how they’re started/stopped, so I thought it might be really useful for me to learn something new about them.

Also, I’m already familiar with some similar libraries; knowing about more widens my knowledge. It gives me more insight into the pros and cons of each so I can make an informed decision on which one is the best for me to use. Even if I don’t understand all the technical aspects of a particular library, I know that it exists, what it’s for and its requirements — which, in my opinion, is more important than knowing everything about it.

I’ve worked with Clojure for almost two years now, and I find it interesting to see how people use it to implement new things and how it can be used to build programs using concepts from other technologies and languages. I can’t give you all the technical details, but I can tell you that this library uses Clojure to build an asynchronous program using Erlang/OTP concepts.

How many times have you and your colleagues ended up comparing different technologies, languages, or libraries and wondered how amazing it would be to have the functionality of one and the syntax of the other?

That’s exactly what the team at Suprematic needed, and they made it possible. They were able to take what they needed from Erlang/OTP, remove its weird syntax and apply those concepts on top of Clojure’s core.async. This is a great example of what you can do when you know the available tools, have a clear idea of what you need and understand the best solution for your project.

Building a collaborative web app with ClojureScript

Have you ever thought about building your own version of Google Docs for developers? Or wanted to play tic-tac-toe with your best friend on the other side of the world but couldn’t find a decent online version of the game? Or tried your hand at solving a 4Clojure exercise with remote teammates? Whatever your reason, if you need an online code editor that allows you to share code live with other people, CREPL (Collaborative REPL) is the solution for you.

How can you build a CREPL? What are the ingredients to build one? Here’s the recipe:

ClojureScript with self-hosting support in order to compile and evaluate your code

Pedestal for the server side, which makes it easier to create ‘live’ applications, letting the app give immediate feedback even while backend work is going on

clojure.java.jdbc for the database

Now we have all the ingredients and we know what they’re for, but what we don’t really know is this: What can we do with the final result? What are the features of this CREPL?

Obviously we can write code and see live changes from other users. Then, we can evaluate the current expression and see the result added to the document in a “comment” block. We can also evaluate the whole document, or, even more exciting, we can run the code continuously.

But what happens if two or more users change the same part of the code? We’re used to Git, which detects all the conflicts and lets us fix them, commit by commit. In the CREPL, we can find a faster approach using Operational Transformation, the same method used by Google Docs to let users work on the same files at the same time.

The process isn’t so different from Git’s. The document is a log of operations executed one after the other, applying each of them to the result of the previous one. The history of the document is still stored as snapshots, which allows the application to detect conflicts and resolve them.

Of course, the more users editing the same file or the larger the file is, the slower the process runs. But this is still a good starting point, even if there’s still room for improvements.

CREPL probably isn’t going to replace Git; maybe it’s not even meant to. CREPL might not be the solution for a big tech problem, but I appreciate the author creating something new and leaving the user free to find other ways to use it. Who knows, maybe someone will use CREPL and find new useful features and ways to implement them. Anyone want to take this chance?

This year, I was one of the lucky winners of an opportunity grant for EuroClojure. I had high expectations going in, and these were met.

EuroClojure differs from some other tech conferences. It wasn’t only about tech talks; it was about networking, socialising, and involving people in the tech world.

Because of the greater focus on socialising, the organisers of EuroClojure put a lot of effort into making people feel comfortable and at home, creating a friendly environment. Alongside the usual ‘unsessions’, they organised a social breakfast on the first day for solo attendees who wanted to meet new people, needed a guide or just felt a bit lost.

I went there alone, but I never really felt alone. Wherever I went, no matter what I was doing, someone was always coming up to me for a quick chat, whether about the conference or Italian food. I also noticed that there were a lot of women, or at least more than I remember from the previous EuroClojure.

EuroClojure took place in Berlin, which is a great city (although make sure you carry cash with you since cards aren’t widely accepted). The venue itself was a huge cinema, so you could sit in a comfortable chair with a cupholder and still see and hear perfectly.

Overall, it was a great experience, despite the A/C breaking on the first day.

Before I get too deep into this post, I want to make it clear that I won’t be discussing all the talks at the conference. This is for a variety of reasons, but the biggest is that I’m a junior developer, so there are a lot of things I don’t know and some technical talks were definitely out of my range.

I found so many of the talks interesting and useful, so instead of covering everything in one post, I’m going to start with the most exciting talks for me — the ones that taught me something new and were beginner friendly — before moving onto some more technical (but equally interesting) talks and real life experiences in later posts.

If you’re interested in all of the talks, you can find the full list of videos here.

Clojure tools for Symbolic Artificial Intelligence

This talk was about a library that the speakers built for AI applications written in Clojure.

It started by explaining a common theme: taking a start state and moving your application to a goal state. For example, let’s say that you need to move a box from the bottom of a pile to the top of another pile. Your application knows that first, it needs to move all the boxes on top. Mind blowing!

But how does the application know that? That’s where their library comes into play. The breadth-first search takes the start state, the goal state and a transformation function. This function receives a state and returns a sequence of successor states. By working through the successive states, the search eventually reaches the goal state. For more detail on how this works, they provided some free materials.
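A minimal sketch of the idea (my own, not the speakers’ library) is a breadth-first search that takes a start state, a goal predicate and a successors function:

(defn breadth-first-search [start goal? successors]
  (loop [frontier (conj clojure.lang.PersistentQueue/EMPTY start)
         seen #{start}]
    (when-let [state (peek frontier)]
      (if (goal? state)
        state
        ;; enqueue unseen successor states and keep searching
        (let [next-states (remove seen (successors state))]
          (recur (into (pop frontier) next-states)
                 (into seen next-states)))))))

;; e.g. find a number reachable from 1 by incrementing or doubling:
;; (breadth-first-search 1 #(= % 10) (fn [n] [(inc n) (* 2 n)]))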

Interesting — but I’m not going to use AI the next time I have to move home, unfortunately. So what’s a real world application for this?

We can instruct our application, giving it rules to follow so it can then make inferences. So if I teach my application that a grandmother is the mother of a mother or a father and I then tell it that Angela is my mom and Sarah is her mother, it can understand that Sarah is my grandmother. And so I can tell Siri/Cortana to remind me that tomorrow is my grandmother’s birthday without even having to use my granny’s name. Amazing, isn’t it?
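Here’s a small sketch of that kind of inference using core.logic’s pldb facts; this is my own illustration rather than the library from the talk, and it only covers the mother-of-a-mother case for brevity:

(require '[clojure.core.logic :as l]
         '[clojure.core.logic.pldb :as pldb])

(pldb/db-rel mother m c)

(def family
  (pldb/db
   [mother "Sarah" "Angela"]
   [mother "Angela" "me"]))

;; who is the mother of my mother?
(pldb/with-db family
  (l/run* [grandmother]
    (l/fresh [parent]
      (mother grandmother parent)
      (mother parent "me"))))
;; => ("Sarah")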

Simple AND secure?

In my opinion, this was a really beginner-friendly talk. It started with a nice introduction to Ring and Compojure for building a simple application — cool. I never thought it could be so easy to get an app running with a few ‘ready to cook’ libraries.

But what’s the point of building an app if it can be crashed by an unauthorised user in under five minutes? That’s where things get more interesting. The talk offered a nice list of good practices and useful libraries to help you protect your app:

XSS - Cross-site scripting
One of the first things to do is to validate the user’s input so your app always receives something expected. It doesn’t really like surprises, especially if they’re harmful.
But what happens if someone manages to get through your validation and send a potentially dangerous script? It could be run by the client and blow everything up. But don’t worry: you only need to escape the output, translating special characters into harmless HTML entities. Enlive, Selmer and Hoplon are some examples of libraries that escape your output by default.
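As an illustration of what escaping means (using Hiccup’s helper here, which wasn’t on the talk’s list but does the same job):

(require '[hiccup.util :refer [escape-html]])

;; untrusted input is turned into inert text before being rendered
(escape-html "<script>alert(1)</script>")
;; => "&lt;script&gt;alert(1)&lt;/script&gt;"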

CSRF - Cross-site request forgery
These things are pretty cool but aren’t always enough. You need to protect both the client and the server sides of your app. To do that, you have to recognise the sender of a request in order to allow only applications that you really trust. Ring-Anti-Forgery helps by issuing a secret token that must accompany every request; the server rejects anything that arrives without it.
Also, a good practice to keep your database safe from someone else’s input is to parameterise your queries, avoiding a Bobby Tables style of SQL injection.
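A minimal sketch of both ideas, assuming a plain Ring handler (the handler and find-user names are made up; wrap-anti-forgery, wrap-session and clojure.java.jdbc/query are the real APIs):

(require '[ring.middleware.anti-forgery :refer [wrap-anti-forgery]]
         '[ring.middleware.session :refer [wrap-session]]
         '[clojure.java.jdbc :as jdbc])

(defn handler [request]
  {:status 200 :headers {"Content-Type" "text/plain"} :body "ok"})

;; CSRF: requests without the anti-forgery token are rejected before they reach the handler
(def app
  (-> handler
      wrap-anti-forgery
      wrap-session))

;; SQL injection: user input is bound as a parameter, never concatenated into the SQL string
(defn find-user [db username]
  (jdbc/query db ["select * from users where username = ?" username]))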

Authentication and Authorisation
Last but not least, you have to protect your users alongside your application. What are the sections of my app accessible to everyone? Which ones are accessible only by registered users? And what happens if someone tries to access the edit profile page of a different user? That’s where Buddy comes into play. Buddy is a library that lets you create a set of rules for each route, defining which ones require authorisation and which ones require authentication.
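A sketch of what those rules might look like with buddy-auth’s access rules (the routes, the owner? check and the handler are hypothetical):

(require '[buddy.auth :refer [authenticated?]]
         '[buddy.auth.accessrules :refer [wrap-access-rules]])

(defn handler [request]
  {:status 200 :body "ok"})

;; the profile being edited must belong to the identity stored in the session
(defn owner? [request]
  (= (get-in request [:params :id])
     (get-in request [:session :identity])))

(def rules
  [{:uri "/" :handler (constantly true)}
   {:uri "/profile" :handler authenticated?}
   {:uri "/profile/:id/edit" :handler owner?}])

(def app
  (wrap-access-rules handler
                     {:rules rules
                      :on-error (fn [request _] {:status 403 :body "Forbidden"})}))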

Finally: always use HTTPS to secure your APIs!

These were just the talks I found useful from a beginner’s point of view, but I’ll be writing follow-ups on more technical talks. In the meantime, you might find it useful to watch the videos linked here and explore the other talks I haven’t mentioned yet in this blog.

This is the first part of an article dedicated to Clojure transducers. This part concentrates on the functional abstraction they implement; the second part will show practical examples, tools and techniques.

Transducers were introduced in Clojure 1.7 (at the end of 2014) and never got the attention they deserve. They are still relegated to niche or advanced uses, but there are compelling reasons to reach for them as the default, for example replacing thread first/last (-> and ->> respectively) when applicable.

Maybe because their name is similar to "reducers" (a similar Clojure feature which focuses on parallelism), or because of some conceptual learning curve, many programmers I know are still reluctant to use transducers in everyday code.

In this article I’m going to show you that transducers are a functional abstraction (similar to a combination of object-oriented patterns). They can be derived with a few refactoring moves on top of existing collection processing functions. The fact that Clojure offers them out of the box removes any excuse not to start using them now!

Same function, different implementations

Map and filter are very common operations in stream-oriented programming (there are many more in the Clojure standard library, and the following examples apply to most of them as well). Here's a simplified version of how map and filter are implemented in Clojure:
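Something along these lines (a rough sketch that shadows the core names purely for illustration):

;; note: these shadow clojure.core/map and clojure.core/filter, purely for illustration
(defn map [f coll]
  (if (seq coll)
    (conj (map f (rest coll)) (f (first coll)))
    '()))

(defn filter [pred coll]
  (if (seq coll)
    (let [x (first coll)]
      (if (pred x)
        (conj (filter pred (rest coll)) x)
        (filter pred (rest coll))))
    '()))

Even in a few lines like these, several concerns are mixed together: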

The access mechanism to the input collection (first, rest, the empty list '() are all specific to the Clojure sequential interface).

Building the output (conj is used to put elements in the final list, but something else could be used).

The recursion mechanism used to consume the input (note that this is a stack-consuming, not tail-recursive, loop).

The "essence" of the operation itself, which is the way "mapping" or "filtering" works (filter requires a conditional for example).

There are similar operations for data pipelines in other Clojure libraries. For example, core.async is a library implementing the CSP model for concurrency (Communicating Sequential Processes), in which channels exchange information between processes. A common case for the sender is to apply transformations to the outgoing messages, including map, filter and many others.

Let's have a look at how they could be implemented (this is a simplified version of the now deprecated ones in core.async):
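Roughly, with hypothetical names map-chan and filter-chan (the real, now deprecated core.async versions differ in the details):

(require '[clojure.core.async :as async :refer [go-loop <! >! close!]])

(defn map-chan [f in out]
  (go-loop []
    (let [v (<! in)]
      (if (nil? v)
        (close! out)                ;; channel closed upstream
        (do (>! out (f v))
            (recur))))))

(defn filter-chan [pred in out]
  (go-loop []
    (let [v (<! in)]
      (if (nil? v)
        (close! out)
        (do (when (pred v)
              (>! out v))
            (recur))))))

Once again, several concerns are interleaved: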

The access mechanism uses core.async channel primitives (<!). The channel semantics also include checking whether the message is nil before proceeding.

Building the output consists of pushing (>!) items down an output channel.

The recursion mechanism is implemented by the go-loop macro and related recur instruction.

The "essence" of the operation itself is the same as before: map consists of applying "f" to each value and filter uses a predicate on each value in a when condition.

We are going to see one last example inspired by another library: the Clojure Reactive extensions. RxClojure is a library implementing the Clojure bindings for RxJava.

Reactive programming is a push-based model built around streams: events are collected into "observables" and routed to components that "react" to them to compose behavior. How could map or filter be implemented in this case?

These operations are not implemented in RxClojure itself, which simply calls into the corresponding Java versions. If we had to implement them in Clojure, though, they would probably follow the same shape described below.

We start to see a pattern emerging; once again we can distinguish between:

The access mechanism uses lift to iterate through the incoming xs in conjunction with on-next inside the operator implementation.

Building the output is not as explicit as before. Events are consumed downstream without accumulating.

The recursion mechanism is implicit. Somewhere else in the code a loop is happening, but it's not exposed as part of the main API.

The "essence" of the operation is the same as before: map consists of (f v) for each value and filter uses a when condition.

Combinatorial explosion

By looking at three popular implementations of map and filter, we learned that the essence of the implemented operation and some form of iteration are common traits. Accessing the input and building the output, on the other hand, depend on the specific transport. The same happens if we look at the implementations of other collection-processing functions (in Clojure alone), such as mapcat, take, drop, keep, remove, partition-all and many others.

To the list above you should also include any custom functions that you might need beyond what's offered by the Clojure standard library.

The dilemma is: how can we deal with the ensuing combinatorial explosion? Are we doomed to implement the same function all over again for each new type of transport, sequence or collection? Could we just write map once (and possibly only once) and use it everywhere? Transducers are the solution to this problem (and much more).

An exercise in refactoring

Our goal now is to isolate the "essence" of map or filter (or any other collection-processing function) and provide a way to run it in a transport-independent fashion. If we succeed, we'll have a recipe to build processing pipelines that can be reused in different contexts. It turns out that reduce, a well-known operation in functional programming, is the key to achieving this goal.

Reduce is an abstraction that encapsulates the prototypical tail-recursive loop. The following "sum of all numbers in the list" is a typical example:
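For instance (a sketch; the accumulator is threaded through explicitly, and then the same loop is handed over to reduce):

(defn sum [acc coll]
  (if (seq coll)
    (recur (+ acc (first coll)) (rest coll))   ;; accumulate into an explicit parameter
    acc))

(sum 0 (range 10))
;; => 45

(reduce + 0 (range 10))
;; => 45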

Reduce accumulates the result explicitly as one of the parameters. This form of recursion is also called "iterative" and once transformed into a loop-recur does not consume the stack. The other interesting fact about reduce is that it decouples the iteration from the transformation semantic, which is what we are trying to achieve.

Map and filter (as well as many other recursive algorithms) can be rewritten "reduce style". The fact that a stack-consuming algorithm can be rewritten as an iterative one is a well-known property in the theory of computation. By rewriting map and filter (and possibly other sequential functions) as iterative, we are offered the possibility to extract the "essence" of the operation:
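The rewrite might look roughly like this (again shadowing the core names for illustration):

(defn map [f result coll]
  (if (seq coll)
    (recur f (f result (first coll)) (rest coll))
    result))

(defn filter [f result coll]
  (if (seq coll)
    (recur f (f result (first coll)) (rest coll))
    result))

(map (fn [result el] (conj result (inc el))) [] (range 10))
;; => [1 2 3 4 5 6 7 8 9 10]
(filter (fn [result el] (if (odd? el) (conj result el) result)) [] (range 10))
;; => [1 3 5 7 9]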

"f" is now passed as part of the parameters in our new map and filter. If you look carefully, the two functions map and filter are now identical (except for the name). Invoking them requires a more sophisticated "f" function than before, one taking two arguments: the result so far (also called accumulator) and the next element to process.

One big plus after this change is that the essence of filtering (or mapping) is now isolated from the recursion and input-iteration mechanics. It is not yet isolated from the way the output is built (conj in both cases) or from the actual function (inc and odd? respectively). But let's take baby steps and do some renaming (map and filter can be renamed reduce) and extract functions ("f" can be called mapping for map and filtering for filter):
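Sketched out:

(defn mapping [result el]
  (conj result (inc el)))

(defn filtering [result el]
  (if (odd? el)
    (conj result el)
    result))

(reduce mapping [] (range 10))
;; => [1 2 3 4 5 6 7 8 9 10]
(reduce filtering [] (range 10))
;; => [1 3 5 7 9]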

Reduce encapsulates the iteration and the sequential access mechanism. But there is still a problem with mapping and filtering: if we want to be able to use mapping or filtering on a core.async channel, for instance, we need to abstract conj away. We can't modify the mapping or filtering interface, because it is part of the reduce contract. But we can add a parameter "rf" (for Reducing Function) in a wrapping lambda and return again a function of two parameters:
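Roughly:

(defn mapping [rf]
  (fn [result el]
    (rf result (inc el))))

(defn filtering [rf]
  (fn [result el]
    (if (odd? el)
      (rf result el)
      result)))

(reduce (mapping conj) [] (range 10))
;; => [1 2 3 4 5 6 7 8 9 10]
(reduce (filtering conj) [] (range 10))
;; => [1 3 5 7 9]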

We also need to extract inc and odd?, which are just example functions and should be passed in from the outside. Again, we don't want to alter the two-argument interface required by reduce, so we wrap again and introduce the new parameter "f" (or "pred"):
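Which gives, in sketch form:

(defn mapping [f]
  (fn [rf]
    (fn [result el]
      (rf result (f el)))))

(defn filtering [pred]
  (fn [rf]
    (fn [result el]
      (if (pred el)
        (rf result el)
        result))))

(reduce ((mapping inc) conj) [] (range 10))
;; => [1 2 3 4 5 6 7 8 9 10]
(reduce ((filtering odd?) conj) [] (range 10))
;; => [1 3 5 7 9]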

This is exactly how clojure.core/map and clojure.core/filter appear in the Clojure standard library. Clojure core contains additional "arities" (not implemented here for clarity) to enable the use of map and filter without necessarily going through reduce. transduce, for example, is a thin layer on top of reduce. At the same time, it improves readability:
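For example (these use the real transducer-returning arities of clojure.core/map and clojure.core/filter, not the sketches above):

(transduce (map inc) conj [] (range 10))
;; => [1 2 3 4 5 6 7 8 9 10]
(transduce (filter odd?) conj [] (range 10))
;; => [1 3 5 7 9]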

The standard library also provides "transducer awareness" in many other places. The following, for instance, are also possible and remove the need for an explicit conj:

(sequence (map inc) (range 10))
(into [] (map inc) (range 10))

Conj is not explicit because the reducing function can be inferred from the specific call to sequence (we want to build a sequence) or into [] (we want to build a vector). Now that we have the basic recipe, it's time to put the new construct in practice and see how it can be used in our daily code.

Conclusions

The article shows that transducers are built on top of a simple functional abstraction and that there is nothing magic happening under the hood. Apart from the interesting refactoring exercise, transducers have deeper consequences in terms of reusability, composability and performance that we are going to explore in the second part of this article.

It’s coming up to a year since I picked up Elixir. I instantly fell in love with it, and it inspired me to do many things I may not have done otherwise - start writing blog posts (it pushed me to help start this Labs blog!), contribute to open source, and attend more meetups and conferences than any previous year. I’ve never felt this connected with any other developer community before and I feel it’s been a massive factor in my enjoyment of the language. Elixir LDN takes place this week, so I thought it’d be interesting to reflect on my experiences and what I love about the Elixir community.

One of my first real introductions to Elixir was actually last year’s Elixir LDN conference - a pretty small event, but the people attending were excited to see their community grow. It was great to see both veterans of industry, such as Robert Virding (co-inventor of Erlang), as well as people who were sharing their first success stories with Elixir (Marni Cohen, Gabriella Chronis). Ju Liu blew everyone away with his live demo of Elixir running on Raspberry Pis. It was a great introduction to the language and community alike.

After spending some time with the community, I started recognising names. The guy who helped me with a pull request to the AWS library was interviewed on the Elixir Fountain podcast, and he also spends a crazy amount of time helping people on Slack (thanks Ben!). Baris, an old colleague of mine, runs the Elixir London meetup. The community is at a fantastic point in time where it’s big enough that you can still discover new people and new ways of solving problems, but small enough that you can easily become a part of it. My blog post earlier this year made the rounds on a few Elixir newsletters and it was thrilling to see many people who I was able to help, and who then reached out to ask more questions or just thank me.

My team was happy for me to experiment with Elixir - I started by rewriting one of our small microservices into it - and while there were bumps in the road, it was much easier to pick it up due to the excellent documentation, helpful error messages and - as mentioned above - the extremely helpful Elixir community, be it on the forums or Slack. The project was successful, it runs well and the code base is easy to reason about. Since then we’ve recruited another two developers who are excited to use Elixir in production, use it where it makes sense and share their experiences with the rest of the company.

In the last year I’ve attended three Elixir conferences and a few of the local meetups, and I idle in the Slack where the community grows every day - but I feel that a lot of the success of the culture and spirit is due to José Valim, creator of Elixir. His positive attitude sets the tone for the whole community (“José is nice, so we are nice” - while the saying might have originated from Ruby, it’s certainly true in the case of Elixir too). I’ve had the chance to see him speak and ask him some questions in person - he’s very humble and down to earth, but clearly very passionate about his work. I recommend you watch his keynote from this year’s ElixirConf EU, where he talks about Elixir’s history.

I think it was José who said that the first few years were focused on establishing the language concepts, helping create books and online resources to help people learn, and now that everything is in a very stable place, it’s a good time to convince companies to pick it up. I’m thankful that uSwitch allows us to work with any tools and languages, if they make sense, so it was relatively easy to get started with it and I’m sure it’s made me a better developer. Now, almost a year later, I’m proud that we’re sponsoring the Elixir LDN conference and I can’t wait to speak to people there and meet some new faces!

I hope you feel encouraged to create new content, participate in the conversations, contribute to open source libraries or create your own. It really is a time where you can make an impact. It’s up to us - the developers - to encourage our employers and coworkers to try great programming languages, such as Elixir, and see where it might be a good fit!

According to my CTO, the age of the cyborg has begun. We are pretty much doomed as one day we will all be replaced by bioengineered post-humans. Well, that’s his long-term prediction.

In the meantime, he hired me to help the business to find real people for our software development roles. The issue the business was facing was quite complex and required a new approach. Before I start talking about our process for hiring top-quality developers, I would briefly like to describe the type of developers we hire here at uSwitch and our recruitment process.

What uSwitch look for in a developer

Our teams are autonomous and we focus on problems, not projects. Developers are the product experts. If we feel a language is a better fit for a problem than one we are currently using, we will rewrite our code in that language. We are running production code written in Clojure, Ruby, Elixir, Go, React et al.

Therefore, we are looking for people interested in understanding the whole business problem we face and leveraging technology as a solution.

Just to put things into perspective, these are the kind of developers who are also on the radar of Facebook, Amazon, Google and a few other well-known names. So, there was no pressure.

The uSwitch hiring process for developers

We use Trello as a kanban board to monitor a candidate’s journey throughout the process. When a CV comes in, a card with the candidate’s details is created. At every single stage, our developers are free to make comments and suggestions.

When I joined uSwitch, I was simply blown away by the quality of feedback made by engineers. Every single Trello card helps us to build our internal bank of knowledge about our first impressions of a candidate’s CV, their performance during a phone screen, code test and a final interview. Below is our process:

Phone interview with two developers (max 40 min)

Code test - A technical challenge that candidates are given approximately one week to complete. Two developers review the submission, and the comments they make are either voted up or down. Once two people have reviewed the test, it is moved either to “rejected” or to the "in office" stage

In office interview - This is a two hour interview split into a technical and cultural section

Defining the recruitment problem part one: Looking at the data

Here are some stats. In the past, if uSwitch wanted to hire 10 developers, we had to bring, on average, about 300 to 330 applicants a year into our hiring process.

Historically, our recruitment “conversion rate” (i.e. the percentage of applicants who ended up receiving offers from us) was about 3%. That level indicated that the task ahead of me was seriously daunting.

Almost every single person in our development team is responsible for hiring. We did some calculations, and it turned out that to deal with such a large volume of applicants, the effort expended by our developers on hiring new people would come close to 650 hours! It’s mind-boggling: 650 hours is about 82 working days spent on hiring just 10-12 people.

Now, the recruitment plan for 2017 was even more ambitious, as we were hoping to recruit twice as many engineers. Despite the fact that we worked with about 20 agencies, including some well-known high street brands, only one understood us well and managed to place a number of developers in our organisation.
However, even for this specialist consultancy, the “success rate” was around 10%, meaning that only 1 in 10 applicants sent across would secure an offer from us. The remaining 19 recruiters, despite submitting about 250 CVs, placed only 3 candidates in two years — another shocking number.

For these agencies, the numbers were even worse, as their “conversion rate” was around 2-3%. Hardly an incentive to keep working with us long-term. Looking at those stats, you start asking yourself: is it true that they were all that bad? Or maybe the fault was on our side? It turned out that it was the latter.

Defining the recruitment problem part two: Talk it out

Every business is different. And that’s down to the people who work there, the strength (or lack) of its culture, its business prospects, the market it operates in, etc. Every company has its unique attributes and needs.

Prompted by our CTO, I went around the teams talking to people about uSwitch, trying to understand what people really think about the company and what they like or dislike about working here. It was like a journey of discovery to strip away the layers until the core of the problem was exposed.

Once we had learnt more about the various issues affecting the company, we zoomed in closer to focus on the most important one — the core, burning problem. We quickly uncovered that the company had an external image issue. As a business, we have been around for nearly 15 years and, apart from a few Clojure enthusiasts who already knew about us from conferences and events we have sponsored, people viewed uSwitch as that old-school price comparison biz that probably only a handful of geeks would consider a somewhat attractive place to work.

They had a point. We weren’t a start-up anymore (although our teams work like them), but we weren’t a well-established corporation either.

Within the business, we all knew we had a great environment built by a fantastic group of people but, sadly, no one else knew about it. For a business that employs a whole bunch of smart, experienced engineers, embraces new technology, and gives a lot of freedom and autonomy to experiment with new ideas and approaches to how we work, we really struggled to communicate even a fraction of this message to people on the outside.

So, our issue was slightly different. To remain attractive to any potential candidates in the developer community, we had to fix the way we presented ourselves to the outside world — that included candidates as well as our suppliers and recruitment agencies.

Solving the problem: Changing the perception of our business

Changing a company / brand image can be an expensive and time-consuming process, and we didn’t have much time. So, we had to use a slightly different approach. First, we got the most promising, specialist recruiters to our office. Yes, we turned to specialist recruiters for help in the first instance as we quickly realised that the task of hiring at least 18-20 engineers ASAP wouldn’t be a one person job (i.e. my job). Instead, it would mean getting everyone on board: executive team, HR, line managers and the whole engineering team, working together to improve our processes.

The plan was simple: we wanted to turn a group of 3-4 niche recruiters into our sales force, and getting them into our new office was part of the plan to upsell some of the benefits of working at uSwitch (and ZPG, our parent company). Our new office, with great features like a modern workspace, free breakfast, afternoon snacks, and free on-site gym, yoga and pilates classes, was in itself an important piece of information worth sharing with people who were soon to act as our brand representatives.

We also gave the recruiters the opportunity to meet with some of the key people in our team. We reflected on issues of the past: a slow feedback loop, not the best understanding of what we’re looking for, general breakdown of communication and trust between us and recruiters. Our promise to them was simple:

A quick process

24-hour feedback on all CVs submitted (later changed to 48 hours as the volume of applications increased)

Thorough, very detailed feedback directly from developers on all applicants, starting from first initial comments on a candidate’s CV, then our feedback on a code test and finally, our remarks on candidate performance in a final, face-to-face interview. (Our aim is always to pass on as much information as possible to candidates who interview with us, so they can learn and improve.)

Developing the uSwitch recruitment model

Now, we had something: a better, sleeker and more efficient process, great feedback that we could use to improve our internal and external communication, and motivated and fully briefed recruiters. Most important of all, we had the idea to create Labs, this very blog, which started back in March and helps us to shed some light on our culture and the way we work.
We also knew our historical data, our past sins and some key stats (i.e. our percentage conversion rate). Now, the key thing was to put everything into a model that would help us to monitor our process, agencies’ performance, referrals, time to hire, etc. So, we created a simple Excel spreadsheet with a few useful pivot tables that could help us to visualise some of our data.

For instance, below you can see how many CVs we’ve received on a weekly basis since the beginning of the year. Just to explain, every single “spike” in the number of applicants (they happened in February, April and, most recently, June) resulted in offers and new placements.

Here, on the other hand, you can see our simplified model predicting what would happen in the next 12 months (i.e. how soon we could hope to hire the people we wanted) if the number of applicants reached 5-7 per week with a conversion rate of 7%.

Looking at the model, you could actually see multiple moving parts and elements in various stages of progress giving us a sense of where we were and what still needed to be done.

Our model quickly became the backbone of our whole recruitment strategy. Seeing the stats, basic numbers and probabilities also made it easier for us to weather the temporary problems that inevitably occur when the recruitment gods stop smiling at us. It happens from time to time: one applicant after another gets either rejected or simply drops out, and we start losing confidence in the process. We experienced this moment of doubt back in the second week of February when we had six final interviews scheduled for that week. By Thursday, five out of six candidates had been rejected for various reasons, with only one left in the pipeline. Our first reaction was panic. There must be something wrong either with our system, the way we screen candidates or maybe even with us.

As you can see, it can get very emotional. And here the model came to the rescue. As it happened, on the last day of that week, the sixth candidate got the job! Stats don’t lie, and the odds started working to our advantage. It meant that, statistically, we were on track, and if we kept doing what we had been doing so far, we would definitely hire the people we needed. Just keep calm and carry on, as they say.

Conclusion

Putting all this together, understanding your past data (however ugly it was), honest discussion with people in the business, working together to uncover the real problem and building a model that supports the process helped us to fix our hiring issue and make our hiring process better and more efficient. This approach has worked really well for us, but there is still a lot that needs to be done — for example ongoing work on our brand, looking into employee retention, diversity, increasing the number of direct hires and referrals, polishing our process even further, etc.

However, for now we did solve the most urgent problem and we were off to a good start. Below are some of the highlights:

In just under 7 months we managed to find 16 developers through a healthy mix of agency and direct hires.

Our time-to-hire is around 25 days (as opposed to 3 months or longer in previous years!), meaning it takes us on average about three to four weeks from the moment a vacancy goes live until we find the right person.

Our recruitment process has also become very efficient. With quick, detailed feedback and only three stages we can move really fast if necessary.
Our conversion rate is slowly approaching the 10% level, which means that we and our recruiters are getting better at identifying the talent we need. Plus, we spend less time interviewing which has a positive impact on overall productivity.

We rebuilt our relationships with recruiters, narrowing down the group of agencies we use to four specialist consultancies that can respond to our needs fast.

Our brand is becoming more visible in the developer community thanks to this very blog, our presence in conferences (both UK and Europe) and many in-house events we host.

We have a healthy pipeline of candidates which bodes well for the future.

Now, we are in a strong position to complete our hiring plan for this financial year and start thinking about implementing some other ideas we have in mind for the next 6-12 months.

In the broadband team at uSwitch, we're constantly striving to improve and innovate. Part of this work involves automating some of the more boring repetitive tasks the team performs, giving that work to a machine instead. A little while ago we moved to a fully automated approach for deciding the ordering of the products on our tables, taking over from our Performance Marketing Lead, Ewan. Today we're going to focus on a small part of how that works and the code we wrote to solve it.

Once we’ve finished the main process of ordering the table, we want to ensure there aren’t any large blocks from any one retailer. The rule for this, in human terms, is simple: no more than two products from the same retailer should appear in a row, and anything beyond that gets pushed further down the table.

First up, we can see that ewan-shuffle is a recursive function: It's plausible that in rearranging to break up one grouping, we can create another. So looking at the end of our function, we can see the following:

(if (= new-list original)
new-list
(recur new-list))

What this means is that if the result of ewan-shuffle (which is in new-list) is the same as the list was before we shuffled it (in this loop), we've reached the end of our algorithm and we can return that list as completely shuffled. Normally this is a completely unshuffled list as our retailers are always vying for the top position on our table. It's not always that simple, though, so let's have a look at how the shuffle itself works.

We'll go through this in stages, looking at the arguments we can expect to have at each point and why the logic is written the way that it is. Let's start with the outside of our reducer, our let block:
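The let block essentially groups the products into consecutive runs per retailer. Using some made-up sample data (the real products obviously have more fields), partition-by does the heavy lifting:

(def products
  [{:rank 1 :retailer_id 1} {:rank 2 :retailer_id 1} {:rank 3 :retailer_id 1}
   {:rank 4 :retailer_id 2} {:rank 5 :retailer_id 2}
   {:rank 6 :retailer_id 3}])

(partition-by :retailer_id products)
;; => (({:rank 1, :retailer_id 1} {:rank 2, :retailer_id 1} {:rank 3, :retailer_id 1})
;;     ({:rank 4, :retailer_id 2} {:rank 5, :retailer_id 2})
;;     ({:rank 6, :retailer_id 3}))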

Then we pass this partitioned list of products into our reduce function, which has an initial value of a map with an empty list of results and an empty list of "remainders". Now let's have a look at the internals of the reducer to see how we turn our partitioned list into a shuffled one in the reducer:
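A rough sketch of the shape of that reducer (not the production code; splice and shuffle-step are my own names), reusing the sample products above:

(defn splice [result incoming-ok remainder]
  ;; break up the previous retailer's leftover products by slotting them
  ;; between the head and tail of the incoming block
  (let [[incoming-head & incoming-tail] incoming-ok]
    (concat result [incoming-head] remainder incoming-tail)))

(defn shuffle-step [{:keys [result remainder]} incoming]
  (let [[incoming-ok leftover] (split-at 2 incoming)]   ;; at most two in a row
    {:result (splice result incoming-ok remainder)
     :remainder leftover}))

(reduce shuffle-step
        {:result [] :remainder []}
        (partition-by :retailer_id products))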

We pull the result and remainder out of our accumulator and call our first collection of products (partitioned already by :retailer_id) our incoming. We split this list of incoming products at two. This two is the maximum number of products from any one retailer we want in a row, as mentioned above. Then we splice the result (currently empty) with our (up to) two products from our retailer and our remainder (also currently empty). Here's what's in scope:
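Roughly, on the first pass through the sample data:

result       ;; => ()
incoming-ok  ;; => ({:rank 1, :retailer_id 1} {:rank 2, :retailer_id 1})
remainder    ;; => ()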

splice has taken result, incoming-ok, and remainder as its arguments. On our first run-through, this doesn't do very much: it simply concats everything together into ({:rank 1, :retailer_id 1} {:rank 2, :retailer_id 1}) — our original incoming-ok — thanks to the emptiness of remainder.

Returning to our reducer, it's return time: We return our result from splice in result and our leftover product in remainder. Now the reducer runs again, with fresh scope after the let:
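With the sample data, the second pass starts out looking something like this (values illustrative):

result     ;; => ({:rank 1, :retailer_id 1} {:rank 2, :retailer_id 1})
remainder  ;; => ({:rank 3, :retailer_id 1})
incoming   ;; => ({:rank 4, :retailer_id 2} {:rank 5, :retailer_id 2})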

We concat these things together, starting with our shuffled, accepted list. We then include the incoming-head in order to break up our list of products we've identified in the last iteration, followed by our remainder, the rest of the block of ‘retailer 1’ products from the last iteration. Finally, we include the incoming-tail. Our return value is:
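With the illustrative data, that works out as:

;; => ({:rank 1, :retailer_id 1} {:rank 2, :retailer_id 1}
;;     {:rank 4, :retailer_id 2} {:rank 3, :retailer_id 1}
;;     {:rank 5, :retailer_id 2})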

In our final run-through, the last product is added without incident, and our shuffled list is returned! The loop runs once more, checking that in our efforts to clean up existing blocks of a single retailer, we haven't created more. We haven't, so shuffling this new list is a no-op. When the same list is returned from the shuffle, we are done with the loop, and we return our shuffled list of products, ready to be picked up by other systems and applied to our product tables, such as the broadband packages page.

This is one of the many ways that we as devs collaborate with our retailer-facing colleagues to try and take some of the repetitive tasks away from them to allow them to focus on the core of their roles: working closely with our retailers to come up with deals that work best for our customers. I love being able to automate boring jobs like this with a relatively short amount of clojure, and uSwitch does a great job of encouraging us to spot opportunities like these to improve the world around us for ourselves, our colleagues and our customers.