A steel pipe manufacturer supplies sturdy pipes used in oil drilling. The ends of the pipes are shaped according to each client's drilling needs, so the company stores the pipes in a warehouse until it receives the client's specifications.
However, not all the pipes are straight. Once an order is received, those that are not go through a “straightening” process (heating, straightening through rolling, and cooling) before being carved. Cooling a pipe after straightening takes around 45 minutes, during which time several more pipes are processed under the same established settings.
If the output then turns out to be unsatisfactory, many pipes will already have been straightened under the same conditions, all at a loss.
A different set of input conditions (temperature, pressure, etc.) then has to be used to repeat the process, but there is no guarantee that the new settings are the right ones either, since all of them are based on assumptions.
The client wanted us to provide a better solution.
What was tricky here?
Prima facie, this looks like a regression problem where one needs to predict the output straightness as a function of input straightness, “straightening” process conditions and other parameters (like the type of steel, etc.).
However, the firm used different numbering systems for the pipes in its two departments (pipe making and pipe straightening). Pipes could be matched only at the batch level, not at the individual pipe level, so there was no way for the data scientists to know the input straightness of any given output pipe.
We were given 100 input pipes with their straightness and 100 output pipes with their straightness, but the IDs of the two sets were completely different!
This is a classic scenario that illustrates the issues that data scientists face while solving real-world problems.
What did we do?
To address this problem, INSOFE developed an original optimization solution (a variant of the classical assignment problem) to map the inputs to outputs. We assigned costs to the various combinations of input and output straightness after thorough consultation with the operations teams. Once developed, the engine provided a simple way to map pipe IDs between departments. This optimization, which was not part of the original requirements, took 35% of the total time we spent on the project!
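To make the idea concrete, here is a minimal sketch of an assignment-style matching. It is not the engine INSOFE built: the pipe IDs, the straightness values (deviation, lower is straighter), the `pairing_cost` function and the `PENALTY` constant are all hypothetical, whereas the real costs came from the operations teams. It brute-forces every pairing, which is only practical for a handful of pipes. For a batch of 100 one would use a proper solver such as `scipy.optimize.linear_sum_assignment`.

```python
from itertools import permutations

# Hypothetical penalty for pairings where the "straightened" pipe
# would be less straight than the input pipe (physically implausible).
PENALTY = 100.0


def pairing_cost(input_dev, output_dev):
    """Hypothetical cost of claiming an input pipe became a given output pipe."""
    cost = (input_dev - output_dev) ** 2
    if output_dev > input_dev:
        cost += PENALTY
    return cost


def match_pipes(inputs, outputs):
    """Return the input-ID -> output-ID mapping with the lowest total cost.

    Tries every permutation, so it is exact but only usable for tiny batches.
    """
    in_ids = list(inputs)
    out_ids = list(outputs)
    best_total, best_perm = None, None
    for perm in permutations(out_ids):
        total = sum(
            pairing_cost(inputs[i], outputs[o]) for i, o in zip(in_ids, perm)
        )
        if best_total is None or total < best_total:
            best_total, best_perm = total, perm
    return dict(zip(in_ids, best_perm)), best_total


# Made-up toy batch: IDs from the two departments share nothing.
inputs = {"A1": 5.0, "A2": 2.0, "A3": 8.0}    # pipe-making department
outputs = {"S7": 1.8, "S8": 0.9, "S9": 1.1}   # straightening department

mapping, total_cost = match_pipes(inputs, outputs)
print(mapping)
```

With this particular cost, the most bent input is paired with the least straight output. A different cost function would encode different beliefs about the process, which is why agreeing on the costs with the operations teams matters.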
A host of regression techniques were then tested, of which gradient boosting machines provided the best results.
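For readers unfamiliar with the technique, the sketch below shows the core loop of gradient boosting for squared error: start from the mean, then repeatedly fit a small tree (here a one-split "stump") to the current residuals and add a damped version of it to the prediction. This is an illustration under assumptions, not the project's model: the features (input deviation, temperature), the toy numbers and all function names are made up, and in practice one would reach for a library implementation such as scikit-learn's `GradientBoostingRegressor`.

```python
def fit_stump(rows, residuals):
    """Find the single (feature, threshold) split that minimizes squared error."""
    best = None  # (sse, feature_index, threshold, left_mean, right_mean)
    for f in range(len(rows[0])):
        values = sorted(set(row[f] for row in rows))
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [r for row, r in zip(rows, residuals) if row[f] <= threshold]
            right = [r for row, r in zip(rows, residuals) if row[f] > threshold]
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            sse = (sum((r - left_mean) ** 2 for r in left)
                   + sum((r - right_mean) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, f, threshold, left_mean, right_mean)
    return best[1:]


def stump_predict(stump, row):
    f, threshold, left_mean, right_mean = stump
    return left_mean if row[f] <= threshold else right_mean


def fit_gbm(rows, targets, n_rounds=50, learning_rate=0.1):
    """Boost stumps on residuals (the negative gradient of squared error)."""
    base = sum(targets) / len(targets)
    preds = [base] * len(targets)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(targets, preds)]
        stump = fit_stump(rows, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, row)
                 for p, row in zip(preds, rows)]
    return base, learning_rate, stumps


def gbm_predict(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_predict(s, row) for s in stumps)


def mse(preds, targets):
    return sum((p - y) ** 2 for p, y in zip(preds, targets)) / len(targets)


# Made-up toy data: [input deviation, temperature] -> output deviation.
rows = [[2.0, 600], [2.5, 650], [5.0, 600], [5.5, 700], [8.0, 650], [8.5, 700]]
targets = [0.6, 0.5, 1.3, 1.0, 1.9, 1.7]

model = fit_gbm(rows, targets)
baseline_mse = mse([model[0]] * len(targets), targets)
model_mse = mse([gbm_predict(model, row) for row in rows], targets)
print(baseline_mse, model_mse)
```

The comparison at the end only shows that boosting fits the training data better than predicting the mean. A real evaluation would use held-out pipes.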
A lookup table was prepared for plant operations engineers, with variables they could adjust for a variety of input conditions. The app recommends the conditions for the best output and also lets operators vary the recommendations to see how the outputs change as the inputs change.
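One plausible way to build such a lookup table is to sweep a grid of candidate process conditions through the trained model and keep the best-scoring combination for each input straightness. The sketch below assumes exactly that and is not a description of the delivered app: `predict_output_deviation` is a hypothetical stand-in for the real regression model, and every number in it (temperatures, pressures, coefficients) is invented for illustration.

```python
from itertools import product


def predict_output_deviation(input_dev, temp_c, pressure_bar):
    """Hypothetical stand-in for the trained model (lower is straighter)."""
    ideal_temp = 600 + 10 * input_dev  # made-up relationship
    return (0.2 * input_dev
            + abs(temp_c - ideal_temp) / 100
            + abs(pressure_bar - 40) / 20)


def build_lookup(input_devs, temps, pressures):
    """For each input deviation, pick the grid point with the best prediction."""
    table = {}
    for dev in input_devs:
        table[dev] = min(
            product(temps, pressures),
            key=lambda cond: predict_output_deviation(dev, *cond),
        )
    return table


lookup = build_lookup(
    input_devs=[2.0, 5.0, 8.0],
    temps=[600, 650, 700],
    pressures=[30, 40, 50],
)
for dev, (temp_c, pressure_bar) in lookup.items():
    print(dev, temp_c, pressure_bar)
```

An operator-facing tool could expose the same prediction function directly, so that changing a recommended temperature or pressure immediately shows the predicted effect on the output.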
The client productized the system and used it as a guide for the production operators. The number of failed components fell drastically (by more than 50%) with the solution provided by INSOFE.
Moral of the story
Most of the time, you end up spending much more time than anticipated cleaning and understanding the data 😊.