STRATEGY
6 min read
March 10, 2026

Why 87% of ML Projects Fail (And How the 13% Succeed)

Root cause analysis of ML project failures based on a decade of deployments. The organizational and technical patterns that separate success from expensive science experiments.

The single biggest predictor of ML project success isn't the model — it's whether someone with production engineering experience is in the room.

The Statistic

Gartner, VentureBeat, and half a dozen research firms all converge on the same number: roughly 85-90% of ML projects never make it to production. The specific number varies by study, but the pattern is consistent.

I've spent the past decade deploying ML systems across defense, oil & gas, manufacturing, healthcare, and financial services. I've seen projects succeed spectacularly and fail expensively. The failure modes are remarkably predictable once you know what to look for.

The Organizational Failure Modes

1. The Solution Looking for a Problem

The CTO reads about transformer models and decides the company needs one. A team is assembled. Six months later, they've built an impressive NLP system that solves a problem nobody actually has.

The fix: Start with the business problem, not the technology. The first question should always be "What manual process costs us the most?" not "How can we use ML?"

2. The Missing Production Engineer

A team of data scientists builds a model in Jupyter notebooks. It achieves great metrics on test data. Then someone asks "How do we deploy this?" and the room goes quiet.

Data scientists are not production engineers. They're researchers who happen to code. Expecting a data scientist to build a production-grade ML system is like expecting an architect to do the plumbing. Both are skilled professionals; the skills are just different.

The fix: Every ML project needs at least one person who has deployed and maintained a production system. Not someone who's read about it — someone who's been paged at 3 AM because the model started returning nonsense.

3. The Infinite Scope

"Let's start with a simple classification model, then we'll add real-time inference, then we'll add a recommendation engine, then we'll add NLP, then..." By month three, the project has scope-creeped into a platform initiative with no deliverables.

The fix: Scope ruthlessly. One problem. One model. One deployment target. Deliver something that works before expanding.

4. The Data Isn't Ready

The project plan assumes clean, labeled, accessible data. Reality: the data is spread across three systems, partially labeled, inconsistently formatted, and the team that owns it doesn't want to share it.

Data preparation typically takes 60-80% of an ML project's effort. If the project plan allocates 10% for data prep, the project will fail.

The fix: Do a data audit before committing to the project. Understand exactly what data exists, where it lives, how clean it is, and who owns it. If the data isn't ready, the project isn't ready.
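A data audit doesn't need heavy tooling to be useful. A minimal sketch of the idea, using pandas on a toy table (the column names here are illustrative, not from any real project):

```python
import pandas as pd

def audit(df: pd.DataFrame) -> pd.DataFrame:
    """Per-column completeness and cardinality summary:
    what exists, how much is missing, how varied it is."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "missing_pct": df.isna().mean().round(3) * 100,
        "n_unique": df.nunique(),
    })

# A toy sensor table with the kinds of gaps real data has:
# missing IDs, missing readings, and only partial labels.
df = pd.DataFrame({
    "sensor_id": ["a", "a", "b", None],
    "temp_c": [21.5, None, 19.8, 22.1],
    "label": [0, 1, None, None],  # partially labeled, as usual
})
report = audit(df)
print(report)
```

If the label column comes back 50% empty, you've learned in five minutes what a six-month project would otherwise discover the hard way.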

The Technical Failure Modes

5. Training-Serving Skew

The model was trained with one feature engineering pipeline and served with a different one. The features are computed slightly differently at training time vs inference time. The model receives inputs that are technically valid but semantically different from what it learned.

This is the silent killer of ML systems. The model produces outputs that look reasonable but are subtly wrong. Nobody catches it until the damage is done.
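The most reliable defense is structural: define the feature logic exactly once, and import the same function from both the training job and the serving path. A minimal sketch, with illustrative feature names:

```python
import math

def make_features(raw: dict) -> dict:
    """Single source of truth for feature computation.
    Both the offline training pipeline and the online serving
    path must call this function -- never reimplement it."""
    return {
        "log_amount": math.log1p(float(raw["amount"])),
        "is_weekend": 1 if raw["day_of_week"] in (5, 6) else 0,
    }

# Training reads strings from a CSV; serving receives JSON numbers.
# Because both go through the same code path, the features agree.
train_row = {"amount": "120.0", "day_of_week": 6}
serve_row = {"amount": 120.0, "day_of_week": 6}
assert make_features(train_row) == make_features(serve_row)
```

When the two pipelines can't literally share code (say, Spark offline and a microservice online), the same principle applies: generate both implementations from one definition, and test that they agree on the same inputs.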

6. No Monitoring

The model is deployed and... that's it. No performance monitoring, no drift detection, no alerting. Six months later, accuracy has degraded from 95% to 72%, and nobody knows until a human notices something is off.
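Even basic drift detection catches this months earlier. One common technique (an industry convention, not something specific to this article) is the Population Stability Index: compare a feature's distribution at training time against a recent production window. A self-contained sketch:

```python
import math

def psi(expected, actual, bins=10):
    """Population Stability Index between a training-time sample and
    a production sample of the same feature. A common rule of thumb:
    PSI > 0.2 signals significant drift worth alerting on."""
    lo, hi = min(expected), max(expected)
    width = hi - lo

    def frac(xs):
        counts = [0] * bins
        for x in xs:
            i = int((x - lo) / width * bins) if width > 0 else 0
            i = min(max(i, 0), bins - 1)  # clamp out-of-range production values
            counts[i] += 1
        return [max(c / len(xs), 1e-6) for c in counts]  # avoid log(0)

    e, a = frac(expected), frac(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

train = [i / 100 for i in range(100)]          # training distribution
shifted = [0.5 + i / 200 for i in range(100)]  # drifted production window
print(psi(train, train))    # near zero: no drift
print(psi(train, shifted))  # large: alert
```

Run this on a schedule against every important input feature and the prediction distribution itself, and the 95%-to-72% slide gets caught in days instead of months.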

7. The Notebook-to-Production Gap

The model works beautifully in a Jupyter notebook. But the notebook uses pandas DataFrames while the production system uses Spark. The notebook runs on a single machine; the production system needs to handle 10,000 requests per second. The notebook processes data in batch; the production system needs real-time inference.
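Beyond scale, the notebook never faces hostile inputs. A sketch of the kind of defensive wrapper a notebook never needs but a production endpoint always does (function and field names are illustrative): validate inputs, fall back gracefully, never let one bad request crash the service.

```python
def predict_safe(model_predict, payload: dict, fallback: float = 0.0):
    """Wrap a raw predict function with input validation and a fallback."""
    try:
        features = payload.get("features")
        if not isinstance(features, list) or not features:
            raise ValueError("missing or empty 'features'")
        xs = [float(v) for v in features]  # coerce; non-numeric values fail here
        return {"ok": True, "prediction": model_predict(xs)}
    except (ValueError, TypeError) as err:
        # Degrade gracefully: return a fallback instead of a 500 crash,
        # and surface the error for monitoring.
        return {"ok": False, "prediction": fallback, "error": str(err)}

def mean_model(xs):
    """Toy stand-in for a real model: the mean of the features."""
    return sum(xs) / len(xs)

print(predict_safe(mean_model, {"features": [1.0, 2.0, 3.0]}))  # ok
print(predict_safe(mean_model, {"features": None}))             # graceful fallback
```

None of this is ML. It's ordinary service engineering, which is exactly why a notebook-trained team doesn't see it coming.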

The Success Pattern

The 13% that succeed share these characteristics:

They Start with a Clear Business Metric

Not "build an ML model" but "reduce unplanned downtime by 30%" or "cut false positive rate in fraud detection from 60% to 20%." The success criteria are defined before anyone writes a line of code.

They Have Production Engineers from Day One

Not added later as an afterthought. Production engineers are involved in architecture decisions from the start, influencing model design, feature engineering, and deployment strategy.

They Deploy Something in Weeks, Not Months

The first deployment might be a simple baseline model — logistic regression, XGBoost, whatever gets you to production fastest. Once the infrastructure is in place and the feedback loop is working, you can iterate on model quality.
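Taken to its extreme, the first deployment can be a model that barely models anything, as long as it exercises the full deploy-and-monitor loop. A hypothetical majority-class baseline, for illustration:

```python
from collections import Counter

class MajorityBaseline:
    """Predicts the most common training label. Deliberately trivial:
    its job is to stand up the serving and monitoring infrastructure
    and establish an accuracy floor, not to win on model quality."""

    def fit(self, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, n):
        return [self.label_] * n

model = MajorityBaseline().fit([0, 0, 1, 0, 1])
print(model.predict(3))  # every later model must beat this floor
```

Once something this simple is deployed, monitored, and wired into the feedback loop, swapping in logistic regression or XGBoost is a model change, not an infrastructure project.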

They Monitor Aggressively

Every production ML system that succeeds long-term has comprehensive monitoring: input data distributions, prediction distributions, latency, error rates, and business metrics. They know within hours when something degrades.

They Plan for Maintenance

ML systems are not deploy-and-forget. Models degrade. Data drifts. Upstream systems change. The successful 13% budget for ongoing maintenance, retraining, and monitoring from the start.

The Actionable Checklist

If you're starting an ML project, answer these questions honestly:

  1. Can you describe the business problem in one sentence without using the word "AI"?
  2. Do you have at least 6 months of relevant, accessible data?
  3. Is there someone on the team who has deployed an ML system to production before?
  4. Do you have a clear success metric that the business stakeholders agree on?
  5. Is the project scoped to a single, well-defined problem?
  6. Do you have a plan for monitoring the model after deployment?
  7. Do you have a plan for retraining when performance degrades?
  8. Is there executive sponsorship that will survive the first setback?

If the answer to any of these is "no," fix that before writing any code. The cost of fixing these issues up front is a fraction of the cost of discovering them six months into a failed project.

The Bottom Line

ML project success is 20% algorithms and 80% engineering, organizational alignment, and operational maturity. The teams that succeed aren't necessarily the ones with the best models — they're the ones with the best engineering practices, the clearest scope, and the most honest assessment of their own readiness.

If you want to be in the 13%, start with the boring stuff: clean data, clear metrics, production engineers, and realistic scope. The exciting ML stuff comes after the foundation is solid.

Discussion (5)

ops_leader_energy · VP Operations, Oil & Gas · 2 weeks ago

This hit home. We spent $300K with a Big 4 firm on an 'AI transformation roadmap.' Got 200 slides. Zero production systems. The part about vendors optimizing for deliverables instead of outcomes is exactly what happened to us. We needed fault detection, they gave us a strategy deck.

Mostafa Dhouib (Author) · 2 weeks ago

Unfortunately this is extremely common. The incentive structure at large consultancies rewards billable hours and slide production, not deployed systems. The real question is: after 200 slides, did anyone actually look at your sensor data? That's usually where the answer lives — not in a framework diagram.

cto_manufacturing · CTO, Manufacturing · 10 days ago

We're in the 87% right now. Hired an ML team internally — three data scientists, none with production experience. They built a defect detection model with 94% accuracy in Jupyter. It crashes every 6 hours on the production line. The gap between notebook accuracy and production reliability is something nobody warned us about.

Mostafa Dhouib (Author) · 10 days ago

The notebook-to-production gap is the #1 killer. A few things that are likely breaking it: memory leaks from not releasing tensors, no input validation so malformed images crash the pipeline, and probably no graceful degradation when the camera feed drops. These aren't ML problems — they're engineering problems that ML teams aren't trained to solve. Happy to take a look if you want to share the architecture.

cto_manufacturing · CTO, Manufacturing · 9 days ago

You nailed it — the camera feed drops were exactly the issue. Sent you an email. Appreciate the honesty.

Mostafa Dhouib · Founder & ML Engineer at Opulion

Facing a similar challenge?

Tell us about your problem. We'll respond with an honest technical assessment within 24 hours.