All Scans are Normal, but Some are More Normal Than Others

February 18, 2019

A few weeks ago, we told you about the release of a number of machine learning tools in our products, in particular for product authenticity. Our overall idea is to implement what we call Data Driven Product Authenticity. In short, we want to move the industry from fighting illicit products with tags that are hard to copy (e.g., because of the inks used) to an approach that authenticates products with the new gold: data.

Why, you may ask? First, because putting special tags on products carries significant costs. Second, because counterfeiters keep getting better at replicating all kinds of sophisticated tags. Replicating the data footprints that products leave throughout the supply chain is much harder.

The tools we are releasing allow us to assess the authenticity of products using the traces of data they generate. We started by applying machine learning techniques to detect gray markets and missing supply chain data. We then announced a system combining traceability data, master data and image recognition. We are now proud to announce the first application of our patent-pending Data Driven Product Authenticity system to detecting counterfeit products by running machine learning algorithms on consumer scan data. We call this abnormal scan detection: detecting clusters of consumer scans that do not fit the usual patterns. These data points are interesting because they could reveal illicit activities such as theft or counterfeiting.

As with the previous applications, our goal is to make machine learning accessible to our customers, so we are packaging this new model and its process in a way that makes it directly actionable for them. Let us explain how this works.

Most interactions with digitized products in the EVRYTHNG platform start with scanning a tag (e.g., a QR code, an NFC tag or a Bluetooth tag). For example, scanning a QR code gives you access to a product’s unique GS1 Digital Link (the GS1 URL standard co-chaired by EVRYTHNG’s CTO Dominique Guinard). The GS1 Digital Link is the unique identifier of a product and the means by which people are able to interact with a product on a digital platform. Let’s look at some examples. You scan the QR code on a new, rather fancy handbag to unlock a promotion. Or perhaps you scan a QR code on a shirt in-store to verify that it’s genuine. Each of these scans is recorded in our platform as an event, building up the history of a product: when and where it was scanned.
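To make the identifier concrete: a GS1 Digital Link is a web address whose path carries GS1 Application Identifiers, such as 01 for the GTIN and 21 for the serial number. The sketch below pulls those two values out of a link. It is a simplified illustration rather than a full implementation of the standard, and the domain, the function name and the sample values are made up for the example.

```python
from urllib.parse import unquote, urlparse

# Two GS1 Application Identifiers: 01 is the GTIN, 21 is the serial number.
AI_NAMES = {"01": "gtin", "21": "serial"}


def parse_digital_link(url):
    """Extract the GTIN and serial number from a Digital Link style URL.

    Simplified: only the numeric /01/.../21/... form is handled.
    """
    segments = [unquote(s) for s in urlparse(url).path.split("/") if s]
    if "01" not in segments:
        return {}
    # Identifier/value pairs start at the GTIN segment.
    keyed = segments[segments.index("01"):]
    pairs = zip(keyed[0::2], keyed[1::2])
    return {AI_NAMES[ai]: value for ai, value in pairs if ai in AI_NAMES}


# Hypothetical link, as it might be encoded in a QR code on a product.
link = "https://example.com/01/09506000134352/21/12345"
print(parse_digital_link(link))
```

A scan event can then be thought of as this identifier plus a timestamp and a location.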

What if we see the same product being scanned many times in short succession and in different countries? Or a type of product suddenly being scanned a lot more than in previous weeks? Patterns in scan data are really interesting because abnormal scans could reveal counterfeits or parallel trade. But scans that result from illicit activities can be buried deep within normal scans, which is why writing simple rules, for example with the EVRYTHNG Reactor, is usually not enough. What we need is the power of machine learning: to learn from the scans and distinguish normal from abnormal patterns.
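To see why simple rules fall short, here is what one such rule might look like in plain Python. This is only a sketch: the Scan record, its fields and the one-hour window are assumptions made for illustration, not the EVRYTHNG data model or the Reactor API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Scan:
    # Hypothetical fields chosen for illustration only.
    product_id: str
    timestamp: datetime
    country: str


def flag_multi_country(scans, window=timedelta(hours=1)):
    """Return IDs of products scanned in two different countries within `window`."""
    by_product = {}
    for scan in sorted(scans, key=lambda s: s.timestamp):
        by_product.setdefault(scan.product_id, []).append(scan)
    flagged = set()
    for product_id, history in by_product.items():
        # Comparing consecutive scans is enough: any country change inside
        # the window shows up between two neighbouring scans.
        for earlier, later in zip(history, history[1:]):
            if (later.country != earlier.country
                    and later.timestamp - earlier.timestamp <= window):
                flagged.add(product_id)
                break
    return flagged


t0 = datetime(2019, 2, 18, 9, 0)
scans = [
    Scan("bag-1", t0, "CH"),
    Scan("bag-1", t0 + timedelta(minutes=20), "GB"),
    Scan("shirt-7", t0, "FR"),
    Scan("shirt-7", t0 + timedelta(days=3), "DE"),
]
print(flag_multi_country(scans))  # {'bag-1'}
```

The rule catches exactly one hand-picked pattern, and its window is an arbitrary guess. Anything the rule author did not think of goes unnoticed.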

A machine learning workflow consists of several steps: data selection, data transformation, model training, model deployment and ongoing improvement. Selecting the data is super easy: we just scoop up all of those scan actions in our platform. We then transform and normalize the data before feeding it to an isolation forest, an unsupervised machine learning algorithm that flags outliers by how easily random splits isolate them from the rest of the data.
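The normalize-then-isolate step can be sketched in a few lines of plain Python. This is a toy version of the isolation forest idea, not our production code: the two features (scans per hour and distinct countries), the sample data and the 0.7 threshold are all assumptions made for illustration. In practice one would more likely reach for a library implementation.

```python
import math
import random


def normalize(rows):
    """Min-max scale every feature column to the range [0, 1]."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    return [[(v - lo) / sp for v, lo, sp in zip(r, lows, spans)] for r in rows]


def c_factor(n):
    """Expected path length of an unsuccessful binary search tree lookup."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def build_tree(rows, depth, max_depth, rng):
    """Recursively split the rows on a random feature at a random value."""
    if depth >= max_depth or len(rows) <= 1:
        return ("leaf", len(rows))
    feature = rng.randrange(len(rows[0]))
    values = [r[feature] for r in rows]
    lo, hi = min(values), max(values)
    if lo == hi:  # nothing to split on for this feature
        return ("leaf", len(rows))
    split = rng.uniform(lo, hi)
    left = [r for r in rows if r[feature] < split]
    right = [r for r in rows if r[feature] >= split]
    return ("node", feature, split,
            build_tree(left, depth + 1, max_depth, rng),
            build_tree(right, depth + 1, max_depth, rng))


def path_length(tree, row, depth=0):
    """Depth at which `row` lands in a leaf, adjusted for the leaf size."""
    if tree[0] == "leaf":
        return depth + c_factor(tree[1])
    _, feature, split, left, right = tree
    child = left if row[feature] < split else right
    return path_length(child, row, depth + 1)


def anomaly_scores(rows, n_trees=100, sample_size=64, seed=0):
    """Score each row in (0, 1); short average paths give scores near 1."""
    rng = random.Random(seed)
    sample_size = min(sample_size, len(rows))
    max_depth = math.ceil(math.log2(max(sample_size, 2)))
    trees = [build_tree(rng.sample(rows, sample_size), 0, max_depth, rng)
             for _ in range(n_trees)]
    norm = c_factor(sample_size) or 1.0
    return [2 ** (-sum(path_length(t, r) for t in trees) / (n_trees * norm))
            for r in rows]


# Hypothetical features per product: [scans per hour, distinct countries].
raw = [[1 + i % 5, 1 + i % 2] for i in range(30)] + [[60, 9]]
scores = anomaly_scores(normalize(raw))
# The 0.7 cut-off is an arbitrary choice for this toy data.
labels = ["abnormal" if s > 0.7 else "normal" for s in scores]
print(labels[-1], round(scores[-1], 2))
```

In this toy data only the last row, a product scanned 60 times an hour across 9 countries, ends up above the threshold, because a single random split is usually enough to separate it from everything else.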

Out of the unsupervised learning we get labelled actions, which in our case means each scan carries either a normal or an abnormal label. These labelled actions are used to train a deep neural network, which is deployed on our machine learning service to classify future scans as normal or abnormal.
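The hand-over from unsupervised labels to a supervised classifier can be illustrated with a very small network. The sketch below is a stand-in, not our deployed model: it has a single hidden layer rather than a deep architecture, and the class name, the features and the labelled rows are invented for the example (1 means abnormal, 0 means normal).

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyMLP:
    """One hidden tanh layer and a sigmoid output, trained with plain SGD."""

    def __init__(self, n_inputs, n_hidden=4, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def _forward(self, x):
        hidden = [math.tanh(sum(w * v for w, v in zip(ws, x)) + b)
                  for ws, b in zip(self.w1, self.b1)]
        out = sigmoid(sum(w * h for w, h in zip(self.w2, hidden)) + self.b2)
        return hidden, out

    def predict_proba(self, x):
        """Estimated probability that scan features `x` are abnormal."""
        return self._forward(x)[1]

    def fit(self, rows, labels, epochs=1000, lr=0.2):
        for _ in range(epochs):
            for x, y in zip(rows, labels):
                hidden, out = self._forward(x)
                # Cross-entropy gradient at the output pre-activation.
                delta_out = out - y
                for j, h in enumerate(hidden):
                    # Backpropagate through tanh before updating w2[j].
                    delta_h = delta_out * self.w2[j] * (1.0 - h * h)
                    self.w2[j] -= lr * delta_out * h
                    self.b1[j] -= lr * delta_h
                    for i, v in enumerate(x):
                        self.w1[j][i] -= lr * delta_h * v
                self.b2 -= lr * delta_out


# Hypothetical normalized features with labels from the unsupervised step.
rows = [[0.0, 0.0], [0.05, 0.0], [0.1, 0.125], [0.02, 0.125], [0.07, 0.0],
        [0.03, 0.0], [1.0, 1.0], [0.9, 0.75], [0.8, 1.0]]
targets = [0, 0, 0, 0, 0, 0, 1, 1, 1]

model = TinyMLP(n_inputs=2)
model.fit(rows, targets)
print(model.predict_proba([0.95, 0.9]) > 0.5)  # a new, suspicious scan
```

Once trained, a model like this scores a new scan with a single forward pass, which is what makes it practical to deploy as a service.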

So far so good. This entire workflow is automated. Yet, we haven’t answered the most pressing question: What is normal anyway? That depends on the context, which a human is best suited to provide. As a starting point, the isolation forest will detect outliers. But then we need feedback from users to continuously train the model so that it reflects the users' own perception of normality. And users need to see progress and understand why a model came to a given conclusion. We must admit, we don’t have all the answers. We’re working with our fantastic UX team to develop a truly enjoyable and productive user experience for identifying those illicit activities hiding in the forest and protecting your brand.

Our tests of the system so far have been really positive and we are looking forward to implementing this system with our first customers. We will report on the results here soon. Meanwhile, contact us if you are interested in deploying this workflow for your products. And remember: the future of product authenticity isn’t in complex inks but in data. Data is the new gold and at EVRYTHNG we are developing ways of extracting this gold using the power of machine learning!