What is AI Inference at the Edge?

To answer this question, it is worth first briefly explaining the difference between deep learning and inference.

Deep learning is the process of training a computer model to identify whatever you need it to, such as faces in CCTV footage or product defects on a production line.

Inference is the process of taking that trained model and deploying it onto a device, which then processes incoming data (usually images or video) to look for and identify whatever the model has been trained to recognise.
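The split between training and inference can be illustrated with a minimal sketch. Here the "model" is just a set of frozen weights standing in for the output of a training run; inference is simply applying those weights to each new sample. All values below are hypothetical, chosen only to make the idea concrete.

```python
# Minimal sketch: inference = applying a frozen, pre-trained model to new data.
# The weights stand in for learned parameters that would normally be loaded
# from a model file produced by training (the values here are hypothetical).
WEIGHTS = [0.8, -0.3, 0.5]
THRESHOLD = 0.6

def infer(features):
    """Score one incoming sample against the frozen model."""
    score = sum(w * x for w, x in zip(WEIGHTS, features))
    return "match" if score >= THRESHOLD else "no match"

# Each incoming sample (e.g. features extracted from a video frame) is scored:
print(infer([1.0, 0.2, 0.1]))  # score 0.79 -> "match"
print(infer([0.1, 1.0, 0.2]))  # score -0.12 -> "no match"
```

The key point is that the expensive part (learning the weights) happened once, elsewhere; the device only repeats the cheap scoring step for every frame it sees.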

Deep Learning and Inference in the Cloud

Deep learning is generally carried out in the cloud or on extremely high-performance computing platforms, often using multiple graphics cards to accelerate the process.

Inference can be carried out in the cloud too, which works well for workflows that are not time-critical. However, inference is now commonly carried out on a device local to the data being analysed, which significantly reduces the time taken to generate a result (for example, recognising the face of someone on a watch list).

Inference at the Edge = Real-Time Results

Clearly, for real-time applications such as facial recognition or the detection of defective products in a production line, it is important that the result is generated as quickly as possible, so that a person of interest can be identified and tracked, or the faulty product can be quickly rejected.

This is where AI inference at the edge makes sense. Installing a low-power computer with an integrated inference accelerator close to the source of data results in a much faster response. Compared with cloud inference, inference at the edge can potentially reduce the time to a result from a few seconds to a fraction of a second, which for applications like those above can make the difference between acting on an event and missing it.
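The latency difference can be shown with back-of-envelope arithmetic: cloud inference pays a network round trip plus upload and queueing overhead on top of the model run, while edge inference pays only the local model run. The figures below are illustrative assumptions, not measurements.

```python
# Back-of-envelope latency comparison (all figures are illustrative assumptions).
CLOUD_RTT_S = 0.15    # network round trip to a remote data centre
CLOUD_OVERHEAD_S = 0.50  # video upload, queueing and scheduling overhead
MODEL_RUN_S = 0.04    # time for the model itself to process one frame

cloud_latency = CLOUD_RTT_S + CLOUD_OVERHEAD_S + MODEL_RUN_S
edge_latency = MODEL_RUN_S  # the data never leaves the device

print(f"cloud: {cloud_latency:.2f}s, edge: {edge_latency:.2f}s")
```

Even with generous assumptions for the network, the edge result arrives an order of magnitude sooner, because the per-frame network and queueing costs disappear entirely.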

Low Power, Cost Effective Inference

To give the computer carrying out inference the necessary performance without an expensive, power-hungry CPU or GPU, an inference accelerator card or a specialist inference platform can be the perfect solution.

Utilising accelerators based on Intel Movidius, Nvidia Jetson, or a specialist FPGA can significantly reduce both the cost and the power consumption per inference ‘channel’.

Applications that Benefit from Inference at the Edge

Apart from the facial recognition and visual inspection applications mentioned previously, inference at the edge is also ideal for object detection, automatic number plate recognition and behaviour monitoring.

To learn more about Inference at the Edge, get in touch with one of the team on 01527 512400 or email us at computers@steatite.co.uk