Is data quality an obstacle for predictive analytics optimization?

First of all, if you feed whatever data you have into a predictive algorithm, it will predict some results, but they will not be the results you need. Instead, the algorithm will deliver a set of very low-quality predictions. In other words, it follows the principle of "garbage in, garbage out": decision-making ends up flawed because the underlying data is incomplete or imprecise. Improving the quality of the historical logistics data is extremely difficult but it is a must before you even start thinking about predictive optimization.

“Improving the quality of the historical logistics data is extremely difficult but it is a must before you even start thinking about predictive optimization.”

You may be surprised, but even the largest companies, including the top 10 in ground transportation, do not know how full their trucks are. These companies have no reliable KPI telling them that truck number 1 traveled 70% full yesterday, truck number 2 traveled 40% full, and so on. The reason they lack these KPIs is not that they don't care about the numbers; the problem goes much deeper into the industry. The data generated in logistics is of far poorer quality than the data you would find at a bank or a healthcare provider, for a couple of reasons.

First, the data is generated mostly by humans, since EDI penetration in the industry is still not that high, and even the standards for EDI records and for data transfers between sender and transporter are not yet at a trusted level. Second, the number of shipments transported every day is extremely high: the top logistics companies now move hundreds of millions of shipments per month, especially with the growth of e-commerce. Having people enter all of this information into IT systems manually and accurately would be enormously costly, and the companies obviously cannot afford that. As a result, it is a common pattern for logistics companies to be stuck with data that is very sparse and not very precise.
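A fill-rate KPI like the one described is trivial to compute once reliable per-shipment volumes exist; the hard part is obtaining those volumes in the first place. A minimal sketch of the metric itself, with all field names and the trailer capacity invented for illustration:

```python
# Hypothetical sketch: a volume-based fill-rate KPI per truck.
# The shipment records, field names, and capacity figure are all
# illustrative assumptions, not real operational data.

TRUCK_CAPACITY_M3 = 90.0  # assumed usable volume of one trailer

shipments = [
    {"truck": 1, "volume_m3": 30.0},
    {"truck": 1, "volume_m3": 33.0},
    {"truck": 2, "volume_m3": 36.0},
]

def fill_rates(records, capacity=TRUCK_CAPACITY_M3):
    """Return the volume-based fill rate (0..1) for each truck."""
    loads = {}
    for s in records:
        loads[s["truck"]] = loads.get(s["truck"], 0.0) + s["volume_m3"]
    return {truck: load / capacity for truck, load in loads.items()}

print(fill_rates(shipments))  # → {1: 0.7, 2: 0.4}
```

The point of the sketch is how little computation is involved: the KPI gap in the industry comes from missing and imprecise volume data, not from the arithmetic.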

Typical data quality problem: shipments are measured by weight, yet trucks mostly fill up based on volume and dimensions

What we have seen is that as soon as a company starts going into optimization and tries to optimize its business processes, it immediately runs into the problem of poor data quality and cannot proceed any further. That is the main reason Transmetrics starts by helping logistics companies with data cleansing and enrichment, and that is also where we apply many complex Artificial Intelligence algorithms. To give an example: you may have cargo shipments where for some you know they are pallets, for others you know only the weight, the volume, or the dimensions, and for others you know only the customer, and so on. All of this makes the data very sparse. Our self-learning algorithms work over the past data, applying heuristic principles and machine learning. As a result, the algorithms produce a complete description of every shipment: the weight, the volume, the dimensions, the density, whether the shipment is stackable or not, and other properties that the logistics companies might need. With this information, companies can start understanding how full their trucks were and how well they were loaded, which gives a previously unachievable degree of process transparency.
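One of the simplest enrichment heuristics of this kind can be sketched as follows: when a shipment's volume is missing, estimate it from its weight using the typical density observed in complete records. This is a deliberately tiny illustration under invented field names and figures; a production pipeline would use far richer features (customer, route, pallet counts, and so on) and actual machine-learned models rather than a single median:

```python
# Illustrative enrichment heuristic: fill in a missing shipment volume
# from its weight via the median density of fully described records.
# All records and field names here are invented for the example.
from statistics import median

complete = [
    {"weight_kg": 200.0, "volume_m3": 1.0},
    {"weight_kg": 450.0, "volume_m3": 2.0},
    {"weight_kg": 150.0, "volume_m3": 1.0},
]

def enrich(shipment, complete_records):
    """Estimate a missing volume using the median density (kg/m3)."""
    if shipment.get("volume_m3") is None:
        densities = [r["weight_kg"] / r["volume_m3"] for r in complete_records]
        shipment["volume_m3"] = shipment["weight_kg"] / median(densities)
    return shipment

sparse = {"weight_kg": 400.0, "volume_m3": None}
print(enrich(sparse, complete))  # → {'weight_kg': 400.0, 'volume_m3': 2.0}
```

Even this crude rule captures the core idea: every filled-in attribute makes the sparse dataset a little more usable for downstream KPIs and forecasting.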

“Once companies have good-quality data, it unlocks all kinds of opportunities for predictive optimization to achieve much higher levels of operational efficiency.”

At this stage, companies can already start noticing some obvious issues in their supply chain, which had existed for a long time but were not visible. And once companies have good-quality data, it unlocks all kinds of opportunities for predictive optimization to achieve much higher levels of operational efficiency.

Typically, the next step we take at Transmetrics is implementing a predictive model, which forecasts operational performance for the next few weeks based on the pool of cleaned and enriched historical shipment data. Then, in our Optimization module, we apply Artificial Intelligence and complex stochastic optimization algorithms to give planners and dispatchers suggestions on exactly how to adjust the operations.
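To make the forecasting step concrete, here is a deliberately naive baseline: predicting next week's shipment volume as a moving average over recent weeks. The actual Transmetrics models are far more sophisticated than this, and the weekly totals below are invented; the sketch only shows why clean historical data is a precondition for any forecast at all:

```python
# Naive moving-average forecast over cleaned weekly shipment volumes.
# The history values are invented for illustration; real models would
# account for trend, seasonality, and many shipment-level features.
from statistics import mean

weekly_volumes = [1200, 1350, 1280, 1400, 1380, 1450]  # cleaned weekly totals

def forecast(history, window=4):
    """Forecast next week's volume as the mean of the last `window` weeks."""
    return mean(history[-window:])

print(forecast(weekly_volumes))  # → 1377.5
```

Garbage in, garbage out applies directly here: if the weekly totals are built from sparse or imprecise shipment records, the forecast inherits every one of those errors.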

And of course, because cargo is a very complicated market with more than 40 different business models, Transmetrics has different solutions for different types of logistics operations. For a network such as DHL, UPS, or DPD Group, the solution takes the form of a tool telling them which trips within the network to cancel in order to remove overcapacity. For a shipping line, it provides suggestions on how to reposition empty containers so that the costs of moving them around are reduced, while at the same time the line never runs out of empties. It can also help warehouse companies schedule staff shifts in the most optimal way. The end solution may look different depending on the type of business the company is in, but in the end, it is always about improving the data quality to a reasonable level first, and only then doing forecasting and optimization.
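The network example, cancelling trips to take out overcapacity, can be pictured with a toy greedy rule: cancel the smallest trips first, as long as the remaining capacity still covers the forecast demand. Real systems solve this with stochastic optimization over an entire network; all trip identifiers, capacities, and demand figures below are invented:

```python
# Toy sketch of "cancel trips to remove overcapacity": greedily cancel
# the smallest-capacity trips while remaining capacity still covers the
# forecast demand. All numbers and ids are illustrative assumptions.

def trips_to_cancel(trips, forecast_demand_m3):
    """Return ids of cancellable trips, smallest capacity first."""
    remaining = sum(cap for _, cap in trips)
    cancelled = []
    for trip_id, cap in sorted(trips, key=lambda t: t[1]):
        if remaining - cap >= forecast_demand_m3:
            remaining -= cap
            cancelled.append(trip_id)
    return cancelled

trips = [("T1", 90.0), ("T2", 60.0), ("T3", 90.0), ("T4", 45.0)]
print(trips_to_cancel(trips, forecast_demand_m3=160.0))  # → ['T4', 'T2']
```

A greedy single-lane rule like this ignores network effects and demand uncertainty, which is exactly why the text speaks of complex stochastic optimization rather than a simple heuristic.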