Today we are going to share the story of a project that suffered from low-quality data and missed the expected model accuracy. In the end, though, both we and the client see this experience as a success. Let’s see why.
It was suggested at Transform 2019 by VentureBeat that 87% of all AI projects fail. Other sources say that 85% of models never make it into production. With AI, fast companies will outperform slow ones regardless of their size, and tiny, no-name companies are already taking market share from the giants. Yet by these estimates, only 13-15% of Machine Learning projects succeed.
The reasons may lie in the team (poor management, lack of expertise, misunderstanding between Data Scientists and developers) and in the data itself. Volume and quality of data, data silos inside the company, poor data labeling, and other issues can all prevent a project’s delivery. Flexibility and proactive communication always help to mitigate these difficulties.
Over 18 years we have delivered more than 280 technical solutions and received several awards for them: the technical Academy Award for the planar tracking solution Mocha, and titles and awards from Microsoft for projects with Machine Learning modules. For 8 of those 18 years, we have been delivering AI projects in Logistics, Healthcare, Agriculture, Ecommerce, and other industries.
Warehouse management - a very promising case
A large furniture distribution company was looking for a partner to develop a system to predict future shipments. It was a competition: the team that achieved the goal would be chosen as the developer of the full system.
So the task was to predict shipments of items for the next 4 weeks for each of their 70+ warehouses with an accuracy of 65% or more. The final test on a closed dataset was going to be conducted on several specific types of goods unknown to the participants. The company was also well-known. The task seemed both intriguing and promising.
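The story does not say how the 65% accuracy was measured. As a minimal sketch, assume it was one minus the weighted absolute percentage error (WAPE) over the 4-week horizon. The function name, the sample numbers, and the choice of metric are illustrative assumptions, not the contest's actual definition.

```python
def forecast_accuracy(actual, forecast):
    """Return 1 - WAPE, floored at 0 (an assumed metric, not the contest's)."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    total = sum(abs(a) for a in actual)
    if total == 0:
        raise ValueError("accuracy is undefined when all actual values are zero")
    error = sum(abs(a - f) for a, f in zip(actual, forecast))
    return max(0.0, 1 - error / total)


# Hypothetical weekly shipments for one item at one warehouse over 4 weeks.
actual = [10, 12, 8, 10]
forecast = [8, 15, 8, 12]

score = forecast_accuracy(actual, forecast)
meets_target = score >= 0.65  # the 65% threshold from the task statement
```

A volume-weighted metric like this is lenient on rarely shipped items. A per-item metric would punish sparse history much harder, so the choice of definition matters a lot for data like this.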
The introductory meeting revealed several issues. First of all, the dataset provided for the competition contained records from only two years. For a business that relies heavily on seasons, this is a very small sample. Deriving a valid trend and seasonal differences from only two yearly cycles is challenging.
Next, the list of items for this company consisted of hundreds of different types. These items were also divided between several warehouses, which made the available history of shipments very sparse. In such conditions it was almost impossible to make an accurate forecast for every single type of item.
This was an unbelievably challenging task, and a project based on data of this quality has a high chance of failing.
We decided to participate anyway.
Even though the dataset wasn’t consistent at that moment and potentially had other issues, that didn’t mean the data would stay inconsistent forever. The company had the necessary resources to fix it. What they needed was a reliable partner who could help them with their data. Obviously, the best way for us to become one was to help them during the competition.
Problems with the initial dataset and how we addressed them
So, we decided to try our best in a project that had every potential to fail, and to look for every possibility to prevent that.
First, we came up with a plan for the whole R&D process and made sure that all of the team members were familiar with it.
When the client provided the whole dataset, we understood that working on this project would bring us more pain than we could have imagined.
- The dataset contained a lot of false samples for the items that were stored in the warehouses, but weren’t actually shipped. After we finished filtering, we lost approximately 80% of the samples.
It was a huge blow to our confidence and our hopes of success. Of more than a hundred thousand item/warehouse pairs in total, 20% were shipped only once a year! This wasn’t anything that we could change. Still, we informed the customer.
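The filtering and the sparsity check described above can be sketched as follows. The record layout `(item_id, warehouse_id, week, units_shipped)` and the sample rows are hypothetical. The real tables were not published.

```python
from collections import Counter

# Hypothetical records: (item_id, warehouse_id, week, units_shipped).
# Rows with 0 units stand for items that were stored but never shipped.
records = [
    ("chair", "W1", 1, 0),
    ("chair", "W1", 2, 5),
    ("chair", "W1", 3, 2),
    ("desk", "W1", 1, 0),
    ("desk", "W1", 2, 1),
    ("sofa", "W2", 1, 0),
    ("sofa", "W2", 2, 0),
]


def drop_unshipped(rows):
    """Keep only rows where something was actually shipped."""
    return [row for row in rows if row[3] > 0]


def shipments_per_pair(rows):
    """Count shipment events for each (item, warehouse) pair."""
    return Counter((item, warehouse) for item, warehouse, _week, _units in rows)


shipped = drop_unshipped(records)
lost_share = 1 - len(shipped) / len(records)  # share of rows removed by the filter
counts = shipments_per_pair(shipped)
sparse_pairs = [pair for pair, n in counts.items() if n <= 1]  # too sparse to forecast
```

Reporting `lost_share` and the number of sparse pairs to the customer early makes the data problem visible before any model is trained.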
- It seemed that the data was initially prepared for another problem statement.
As mentioned before, we had to predict shipments from different warehouses. However, some of the data tables contained secondary records for item/store pairs that could not be mapped to the warehouses. As a result, some details about the shipments were lost. As it turned out, there had been a sudden change of priorities on the client's side during the preparation for this competition. Everything got mixed up and some of the data tables were just left as they were. Nevertheless, after a few calls with the customer, we were able to extract at least some useful insights from this leftover data.
- We ended up not having information about special offers. Well, we did have it, but we couldn’t use it. This problem had several aspects.
Special offers were divided into two types with completely different origins and ways of distribution. This division was completely neglected in the data: we were able to figure it out only by talking to the customer while addressing the problem mentioned above. The tables also contained collisions: one table could state that an item was shipped as part of a special offer, while another stated the opposite.
After consulting with the client, we agreed that special offer shipments would not be included in the closed testing dataset.
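Collisions of this kind can be surfaced with a simple cross-check between tables. This is a minimal sketch, assuming each table can be reduced to a mapping from a shipment id to a special-offer flag. The table names and ids are hypothetical.

```python
# Hypothetical tables: shipment id -> "was part of a special offer" flag.
shipments_table = {"s1": True, "s2": False, "s3": False}
offers_table = {"s1": True, "s2": True, "s4": False}


def find_collisions(table_a, table_b):
    """Return the ids present in both tables whose flags disagree."""
    shared = table_a.keys() & table_b.keys()
    return sorted(key for key in shared if table_a[key] != table_b[key])


collisions = find_collisions(shipments_table, offers_table)
```

Ids that appear in only one table are ignored here. In practice they deserve a separate report, since they may point to the same mapping problems.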
At one point we realised that the project would fail anyway
These issues didn’t make the development easier, and our master plan had to be scrapped a couple of times. However, each adjustment helped keep us on track, so we never lost a clear understanding of where we were and what we should do next. Further development didn’t bring any significant surprises, just deeper data analysis, architecture and model development. Throughout this time, we shared all of our steps, problems and discoveries with our potential customer.
Closer to the end of the development, we (along with the other teams in this contest) got the last part of the data: the three months prior to the closed dataset. And that’s where everything went completely wrong.
- The distribution of the new dataset was completely different from what we had observed before, to the point that it didn’t make any sense. The values showed that some warehouses had barely any shipments at all. Previous years did not show anything like this either.
Of course, we asked the customer what the reason could be. In the end, it turned out that the data had been extracted with mistakes. Although the customer’s in-house specialists promised to fix this, they were not able to deal with it completely. Part of the information was lost for good due to technical issues or data storage problems. They were not very happy about it and didn’t go into much detail.
Considering that this period was right before the one chosen for testing – we knew that the project was doomed.
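A sanity check like the one that exposed the broken extract can be sketched as a comparison of recent volumes against the same calendar period in earlier years. The weekly totals, the `min_ratio` threshold, and the function name are illustrative assumptions.

```python
from statistics import mean

# Hypothetical weekly shipment totals per warehouse.
# `history` covers the same calendar months in previous years,
# `latest` covers the newly delivered three-month extract.
history = {"W1": [120, 130, 110, 125], "W2": [80, 75, 90, 85]}
latest = {"W1": [118, 122, 127, 115], "W2": [3, 0, 2, 1]}


def flag_suspicious(history, latest, min_ratio=0.5):
    """Flag warehouses whose recent mean volume fell below min_ratio of the historical mean."""
    flagged = {}
    for warehouse, past in history.items():
        past_mean = mean(past)
        recent = latest.get(warehouse)
        if not recent or past_mean == 0:
            continue  # nothing to compare against
        ratio = mean(recent) / past_mean
        if ratio < min_ratio:
            flagged[warehouse] = round(ratio, 3)
    return flagged


suspicious = flag_suspicious(history, latest)
```

A flag does not prove the data is wrong. It is a prompt to ask the data owner whether the drop is real, which is exactly the conversation described above.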
But was it a complete failure?
At this point you might want to give up and just let it go. But we didn’t – here’s why. Let’s look at the results table.
| Team 1 | Team 2 | Team 3 | Team 4 | WA Team | Team 6 | Team 7 |
Not a single team came even close to the 65% goal for the project. Our team was somewhere in the middle, with 46% accuracy.
From almost any point of view it looks like a complete failure: our team did not reach the required accuracy, and our result wasn’t even the best one. But despite all of that, we were the ones chosen to consult the customer on how to update their data collection and analysis. That was our success, regardless of failing the initial task.
As for the customer – although they were not able to start the full-scale project immediately, now they had a much better understanding of what needed to be done to solve the problems in the future. This was the real value of the project for them.
How come we were chosen for future cooperation
The key is that during the whole project we kept in touch with the client: gave them intermediate results, reported data related problems, consulted them, and asked for their advice.
The most valuable thing we provided was a transparent and honest view of the situation, so when everything went wrong, it wasn’t a complete surprise to them. More than that, we suggested ways of dealing with the problems and finding alternative approaches.
Of course, if our results had been significantly lower than the others’, this would have cancelled out most of the good points. This is why having a good master plan is crucial. Having one throughout the development helps manage resources and focus on what is really important at each moment. Otherwise we would inevitably have got stuck solving minor problems without actually getting any closer to the final goal.
The next big thing is reporting. When a company just shows its results, the customer cannot relate to, and therefore appreciate, all the work put into the project. Also, it may be hard for the client to recall all the details of the final presentation. What algorithms did the developer use? Why were they chosen? Writing down all the significant details in a report allows our clients to make a thorough review later, at a moment that is convenient for them.
As a final note, there is a very obvious but also very important detail (especially for such projects): the attitude of our team. Pessimistic thoughts, criticism, and mistrust have ruined a lot of less complicated projects.
Customers choose us because we give the most detailed and interpretable information about their system, data, and the problem itself, and because we communicate proactively about all possible issues as they occur.
As for the customer: even if they aren’t ready to start the full-scale project immediately, after the pilot they have a much better understanding of what needs to be done to deal with the initial task. This is the real value of the ‘doomed’ project.
In a previous article, we shared how to improve your company's data for a machine learning project. Don't miss it!