Client: One of the Largest Automobile Manufacturers in the World
Problem they faced
They have distribution centers on all 6 continents, in multiple countries on each. Each car they make requires thousands of components, so the distribution centers need to stock roughly a quarter of a million components.
The manufacturing units need to predict the demand for each of these components in order to plan production and the supply chain efficiently.
What was tricky here?
The client said they were using well-respected, reliable time series models on enterprise-grade software. Still, the performance was not up to the mark. They came to us hoping that we knew how to do magic!
What did we do?
We followed two approaches. The first was to try out a variety of time series models, namely TSLM, STLM, ARIMA, MA, HW, Croston, ARIMAX and others. We were able to mix them into an ensemble and improve the accuracy a bit. Not enough, though.
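To make the ensembling idea concrete, here is a minimal sketch that combines per-model forecasts by simple averaging. The averaging scheme, the function name `ensemble_forecast`, and all the numbers are assumptions made for illustration; the actual combination method used on the project may well have been different.

```python
from statistics import mean


def ensemble_forecast(model_forecasts):
    """Average the forecasts of several models, period by period.

    model_forecasts maps a model name to its list of forecasts,
    one value per future period. All lists must be equally long.
    """
    horizons = {len(values) for values in model_forecasts.values()}
    if len(horizons) != 1:
        raise ValueError("all models must forecast the same number of periods")
    # zip(*...) groups the forecasts of every model for the same period.
    return [mean(period) for period in zip(*model_forecasts.values())]


# Illustrative forecasts for one component over three periods (made-up numbers).
forecasts = {
    "arima": [100.0, 110.0, 120.0],
    "holt_winters": [90.0, 100.0, 130.0],
    "croston": [110.0, 120.0, 110.0],
}

combined = ensemble_forecast(forecasts)
print(combined)
```

A plain average is only the simplest choice; weighting each model by its past accuracy is a common alternative.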
We also analyzed the errors carefully and found that the true reason for the bad performance, as measured by Mean Absolute Percentage Error (MAPE), was that some components were ordered in 1s and 2s, while most other components were ordered in 100s and 1000s. For an actual demand of 100, if we predict 105, the error is 5%. For an actual demand of 1, if we predict 2, the error is 100%! So the mean of these numbers is pushed up substantially. Hence, we suggested Median Absolute Percentage Error, and it worked like a charm!
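The effect is easy to reproduce. The sketch below compares the mean and the median of the absolute percentage errors on a handful of made-up components; the demand figures and the helper name `absolute_percentage_errors` are illustrative assumptions, not project data.

```python
from statistics import mean, median


def absolute_percentage_errors(actuals, forecasts):
    """Return |forecast - actual| / |actual| in percent for each pair.

    Assumes every actual is non-zero; percentage error is undefined otherwise.
    """
    return [100 * abs(f - a) / abs(a) for a, f in zip(actuals, forecasts)]


# Three high-volume components and two ordered in 1s and 2s (made-up numbers).
actuals = [100, 1000, 500, 1, 2]
forecasts = [105, 950, 510, 2, 1]

apes = absolute_percentage_errors(actuals, forecasts)
mape = mean(apes)     # pulled up by the two low-volume components
mdape = median(apes)  # reflects the typical component

print(f"per-component errors: {apes}")
print(f"mean APE:   {mape:.1f}%")
print(f"median APE: {mdape:.1f}%")
```

Here the two low-volume components contribute errors of 100% and 50%, which drag the mean to about 32%, while the median stays at 5%, the error of a typical component.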
Moral of the story
A well-designed, business-aligned performance metric plays an extremely important role in the success of data science implementations. Feel free to break away from textbooks and published works and design a performance metric that really means something to the project.