Why are @illumina flowcell names so similar (but only occasionally rude)?

Anyone who’s run Illumina instruments over the years is likely to have noticed how flowcells can have remarkably similar (and occasionally amusing) names. This can create a real headache when looking for a specific run, as a single mismatched character can leave you spending time looking at one run when you meant to be looking at another. Even the BaseSpace search appears to allow mismatches in this key identifier.

This has been an obvious issue on their HiSeq platforms for many years but has remained unchanged. It’s only a problem for labs running lots of flowcells, because flowcells are generally shipped in a similar order to their manufacturing date. As such it is not uncommon to have two similarly named flowcells being run on the same day with different sets of samples on paper and in the LIMS – a recipe for flowcell mixups if ever there was one! And one my team is very careful to avoid!!!

This is what HiSeq flowcells look like (feel free to fill in the gaps):

NovaSeq looks like it’ll have the same issue – but with many fewer flowcells being run per week due to the vast volumes of data obtainable on Illumina’s newest sequencer:

NovaSeq: H27FHDSXX

The problem is worst on MiSeq. Of over 1500 MiSeq runs performed in my lab, almost one third of the 5-character flowcell IDs differed from another ID by only a single character, e.g. A9YV6, A9YVD, A9YVK, A9YVL, A9YVM, A9YVR, A9YVT & A9YVU (a run of 8) or 64D50, 64D53, 64D54, 64D55, 64D5A, 64D5B, 64D5C, 64D5F, 64D5G & 64D5L (a run of 10).
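If you want to check your own run history for the same thing, here is a minimal Python sketch of one way to do it. It is not the method behind the numbers above. The function names are made up for illustration, and it assumes you have already pulled your flowcell IDs into a plain list (for example from your LIMS or run folders). It groups IDs that are identical except at one position, then reports what fraction of IDs have at least one such near-twin.

```python
from collections import defaultdict


def near_duplicate_groups(flowcell_ids):
    """Return sorted groups of IDs that are identical except at one position.

    Hypothetical helper: each ID is masked at every position in turn, so
    IDs that differ only at that position end up sharing the same key.
    """
    buckets = defaultdict(set)
    for fc in set(flowcell_ids):
        for i in range(len(fc)):
            # Key holds the masked position and the ID with that position blanked
            key = (i, fc[:i] + "?" + fc[i + 1:])
            buckets[key].add(fc)
    # Keep only keys shared by two or more distinct IDs
    return sorted(sorted(ids) for ids in buckets.values() if len(ids) > 1)


def fraction_with_near_twin(flowcell_ids):
    """Fraction of unique IDs that sit one character away from another ID."""
    unique = set(flowcell_ids)
    if not unique:
        return 0.0
    close = {fc for group in near_duplicate_groups(unique) for fc in group}
    return len(close) / len(unique)


if __name__ == "__main__":
    # A few of the MiSeq IDs quoted above, plus one unrelated made-up ID
    ids = ["A9YV6", "A9YVD", "A9YVK", "64D50", "64D53", "ZZZZZ"]
    for group in near_duplicate_groups(ids):
        print(group)
    print(fraction_with_near_twin(ids))
```

Masking one position at a time avoids comparing every ID against every other ID, which keeps this quick even for a few thousand runs.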

It’d be great if Illumina increased the number of characters, or used these few characters more sensibly, or randomly assigned flowcell IDs rather than simply incrementing them (which is what it looks like they do). Sometimes small things make all the difference.

Rude flowcells:

One of the interesting, but unsubstantiated, things I’ve learnt about Illumina over the years is that they employ a rude word filter to screen flowcell IDs. If you’ve got any interesting ones to share please do (I’ll start a Twitter thread on this).

Fortunately for me they did not seem to catch these little w*****s: C6J0WANXX, C5MCWANXX, C5U6WANXX, C4NYWANXX, C5ECWANXX, C40JWANXX, C4EKWANXX !

2 Comments

Hi James,
The GAI used four or five numbers (without letters) in 2007
The GAII used a combination of five numbers and letters, starting with two or three numbers in 2008 (FC3….)
The GAIIx used a combination of five numbers and letters, starting with two or three numbers in 2010 (FC6…)
So it has always been a combination of five numbers and letters, where the flow cells for HiSeq seem to have more letters than numbers, and an additional four letter code specific for the sequencing platform.
