Catch our Drift? The Monumental Impact of Data Drift

Have you heard of the 1 in 60 rule? It can have pretty big implications for the long-term success of your venture. On November 28, 1979, a sightseeing flight crashed into a mountain in Antarctica, killing all 257 people on board. The investigation found that the crew had not been told of a two-degree correction to their flight plan. As a result, their navigation system routed them straight toward Mount Erebus instead of through McMurdo Sound.

Two degrees doesn’t sound like a big deal, especially in the short term. However, even one degree off can put you miles off course! The 1 in 60 rule states that after traveling 60 miles with a one-degree heading error, you will be about one mile off course. The longer you travel without correcting, the bigger the gap gets. This analogy has many implications in life and in business, but we see a very clear parallel in the world of machine learning and artificial intelligence when it comes to data drift.
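
To make the arithmetic concrete, here is a minimal Python sketch of the 1 in 60 rule. The function names are our own, and the "exact" version assumes flat ground and a straight-line path, which is a simplification.

```python
import math


def one_in_sixty_estimate(distance_miles: float, heading_error_deg: float) -> float:
    """Rule-of-thumb offset: one mile off course per degree of error per 60 miles."""
    return distance_miles * heading_error_deg / 60.0


def off_course_miles(distance_miles: float, heading_error_deg: float) -> float:
    """Trigonometric offset after flying distance_miles along the wrong heading.

    Assumes a flat surface and a constant heading error (illustrative only).
    """
    return distance_miles * math.sin(math.radians(heading_error_deg))


if __name__ == "__main__":
    for miles in (60, 600, 3000):
        rule = one_in_sixty_estimate(miles, 2)
        trig = off_course_miles(miles, 2)
        print(f"{miles} miles at 2 degrees off: rule {rule:.1f} mi, trig {trig:.1f} mi")
```

The rule of thumb and the trigonometry stay close for small angles, and both grow in proportion to distance traveled. That is the whole point of the analogy: a small uncorrected error compounds the longer you go.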

Drift is a term we hear a lot in the world of machine learning and artificial intelligence. It refers to the performance of a machine learning model in production slowly getting worse over time. Let’s be honest, data drift is going to happen, so we need a wise approach to mitigating its impact, planning how to manage it, and getting smarter with our models for the future.

Mitigate the Impact

We’re not trying to be a downer over here, but the reality is machine learning models drift on the regular. To ignore that reality would be irresponsible. Machine learning models drift over time because the data they were trained on might be outdated, or just doesn’t represent current conditions. Model drift can also happen if the model wasn’t designed to handle changes in the data. Do you have a plan in place to mitigate the impacts of data drift?

Managing Drift

You’ve already taken the first step, which is to admit that it is a problem. As with most things in life, you have to start there. From there, you need to be watching out for it. There are two main ways to detect drift: a statistical approach, using tests such as sequential analysis or time distribution methods that compare data across time windows, and a model-based approach, where you build a custom model to flag when incoming data no longer looks like the training data. You can dive much deeper into how these work, but the important thing is that you are looking out for drift and also have a plan to get back on course.
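
As one illustration of the statistical approach, the sketch below compares a reference window of a single feature (say, values seen at training time) against a recent window using the two-sample Kolmogorov-Smirnov statistic, which is the largest gap between the two empirical distributions. This is one common choice of test, not the only one, and the function names and the 0.2 alert threshold are assumptions made for the example rather than recommended settings.

```python
from bisect import bisect_right


def ks_statistic(reference, current):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between
    the empirical CDFs of the two samples (0 = identical, 1 = disjoint)."""
    ref = sorted(reference)
    cur = sorted(current)
    if not ref or not cur:
        raise ValueError("both samples must be non-empty")
    largest_gap = 0.0
    # The gap between two step functions can only change at observed values.
    for x in ref + cur:
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_cur = bisect_right(cur, x) / len(cur)
        largest_gap = max(largest_gap, abs(cdf_ref - cdf_cur))
    return largest_gap


def drift_detected(reference, current, threshold=0.2):
    """Flag drift when the KS statistic exceeds a threshold.

    The default threshold is a hypothetical value; tune it for your data.
    """
    return ks_statistic(reference, current) > threshold


if __name__ == "__main__":
    training_window = list(range(100))                # stand-in for training-time values
    recent_window = [x + 50 for x in training_window]  # same shape, shifted upward
    print("KS statistic:", ks_statistic(training_window, recent_window))
    print("Drift detected:", drift_detected(training_window, recent_window))
```

In practice you would run a check like this on a schedule for each important feature and for the model’s predictions, and treat an alert as the cue to investigate or retrain. A production setup would more likely rely on an established statistics library and a proper significance test than on a hand-picked threshold.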

Get Smarter

Make data drift a top priority in your venture. Build quality control checks and balances into your standard process to help ensure what you’re building has merit over the long haul. Learn from the data drift. As you do, you’ll not only have a better product, but you’ll also have a better idea of what is causing the drift. Every point in the process is an opportunity to build your machine learning model just a little bit better to accommodate trends in the drift you’re seeing, changes in user behavior or incoming data, or new needs among your users. Your output, and ultimately your offer, will be better if you actively manage data drift.

We may not crash into icy mountains as a result of our drift, but the implications are still important to consider. Being even slightly off from the goal of our machine learning model has a big impact on the ultimate offer we’re sharing with the world. The details matter, and when it comes to businesses leveraging artificial intelligence, we’re seeing these details matter even more!