Why Enterprise AI Projects Stall on Data Access, Not Models
Enterprise AI fails due to data access bottlenecks, multi-hop pipelines, and late governance. Leaders from SMBC Group, Marsh McLennan, and Conduktor share real-world solutions.

Enterprises are racing to adopt AI, but most engineering teams are fighting a different battle: accessing, moving, and trusting operational data.
This reality came through in our recent panel with leaders from SMBC Group, Marsh McLennan, and Conduktor. The session focused on making valuable data safe for AI, but the conversation quickly shifted to architectural problems that slow AI long before a model is ever trained.
Everyone on the panel agreed: AI isn't the problem. Data movement is.

Data Access Controls Block AI Before It Starts
Ask any architect or data engineer where AI projects stall, and the answer won't be "the model". It will be access.
The panelists described operational data constrained by:
- Strict role-based access controls
- Data residency rules across regions
- Sensitivity classifications nobody fully trusts
- Spreadsheets leaking data outside governance
- The constant fear of accidental misuse
One quote hit hard:
"Not all data is appropriate for everyone, and even accidental misuse still has consequences."
This is the opposite of the AI hype narrative. Before any innovation can happen, teams are stuck in a loop of approvals, clarifications, and risk reviews.
The blocker isn't ambition. It's architecture.
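To make those constraints less abstract, here is a minimal sketch (in Python, with an entirely illustrative policy table and role names) of what evaluating residency and role rules before any data moves could look like. Nothing here is from the panel; it's just one way to express access rules as code rather than as a queue of approval tickets:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Dataset:
    name: str
    sensitivity: str   # "public" | "internal" | "restricted"
    residency: str     # region the data must not leave, e.g. "eu-west-1"

@dataclass(frozen=True)
class Requester:
    user: str
    roles: frozenset
    region: str

# Illustrative policy: which roles may read each sensitivity tier.
ALLOWED_ROLES = {
    "public": {"analyst", "engineer", "scientist"},
    "internal": {"engineer", "scientist"},
    "restricted": {"scientist"},
}

def can_access(req: Requester, ds: Dataset) -> tuple[bool, str]:
    """Evaluate residency and role rules before any data moves."""
    if ds.residency != req.region:
        return False, f"{ds.name} must stay in {ds.residency}"
    if not req.roles & ALLOWED_ROLES[ds.sensitivity]:
        return False, f"no role grants '{ds.sensitivity}' access"
    return True, "granted"

print(can_access(
    Requester("ada", frozenset({"engineer"}), "eu-west-1"),
    Dataset("trades_raw", "restricted", "eu-west-1"),
))  # -> (False, "no role grants 'restricted' access")
```

The point of a sketch like this isn't the twenty lines of Python; it's that a denial comes back in milliseconds with a reason attached, instead of weeks later from a risk review.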
10+ Hops Before Data Reaches the Lakehouse
Once access is granted, the next challenge emerges: the pipeline itself.
Across all three companies, the number of hops operational data passes through was staggering: sometimes 10, 11, even 15 hops before it lands in a lakehouse.
Every hop introduces risk:
- Schema changes that ripple downstream
- Different tooling across teams
- Inconsistent transformations
- Unclear ownership
- No shared understanding of semantics
- Lineage that exists only in tribal knowledge
Sreeni from Marsh McLennan captured this perfectly:
"By the time the data ends up in your lakehouse, you're not always sure it's the right piece of information anymore."
The pipeline itself has become too fragmented to trust. The industry talks endlessly about model drift, yet the bigger, more dangerous drift is happening inside pipelines that no one has full visibility into.
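To make the first of those risks concrete, here is a rough sketch of catching a breaking schema change between hops. The schemas are illustrative {field: type} mappings and the field names are invented; a real pipeline would pull both from a schema registry rather than hard-coding them:

```python
def breaking_changes(upstream: dict, downstream: dict) -> list[str]:
    """Flag schema changes at one hop that would silently break the next."""
    problems = []
    for field, expected in downstream.items():
        if field not in upstream:
            problems.append(f"field dropped upstream: {field}")
        elif upstream[field] != expected:
            problems.append(
                f"type changed: {field} is {upstream[field]}, "
                f"consumer expects {expected}"
            )
    return problems

# A consumer several hops downstream still expects the original contract.
v2_producer = {"trade_id": "string", "amount": "double"}
v1_consumer = {"trade_id": "string", "amount": "decimal", "desk": "string"}

for issue in breaking_changes(v2_producer, v1_consumer):
    print(issue)
# type changed: amount is double, consumer expects decimal
# field dropped upstream: desk
```

Run at every hop instead of only at the lakehouse, a check like this turns silent drift into a loud, early failure.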
Governance Applied at the End of the Pipeline Is Too Late
If data access is the first bottleneck and multi-hop fragility is the second, governance is the third and arguably the most consequential.
The panelists shared challenges that rarely make it into conference keynotes:
- Spreadsheets exported outside control
- Inconsistent masking and tokenization
- Sensitive fields blended with non-sensitive ones
- Unvalidated data feeding downstream decisions
- Classification models that don't match real usage
- Audit logs built for after-the-fact forensics, not prevention
"It's better not to provide any data than provide incorrect data that leads to wrong decisions."
Governance applied at the end of the pipeline is too late. Quality, sensitivity detection, controls, and lineage must shift left, closer to the source, before data moves. It's a worldview shift: governance is no longer traffic control; it's an architectural foundation.
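As one hedged illustration of shifting left, here is a minimal sketch of tokenizing sensitive fields before a record ever leaves the source system. The field list, salt, and record are all hypothetical; a real deployment would drive the field list from classification metadata and the salt from a secrets manager:

```python
import hashlib

# Fields flagged as sensitive at the source (illustrative; a real system
# would read these from classification metadata, not a hard-coded set).
SENSITIVE = {"account_id", "email"}

def tokenize(value: str, salt: str = "per-env-secret") -> str:
    """Deterministic token: still joinable downstream, not reversible."""
    return hashlib.sha256((salt + value).encode()).hexdigest()[:16]

def mask_record(record: dict) -> dict:
    """Apply tokenization before the record leaves the source system."""
    return {
        k: tokenize(v) if k in SENSITIVE else v
        for k, v in record.items()
    }

print(mask_record({
    "account_id": "ACC-10442",
    "email": "jane@example.com",
    "balance": 1532.07,
}))
```

Because the tokens are deterministic, downstream teams can still join and aggregate on them; what they can no longer do is accidentally export the raw identifiers into a spreadsheet.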
Data Request Overhead Kills Velocity
The slowest part of data movement isn't the movement. It's everything around it.
A single data request triggers a chain reaction:
- Analytics teams define attributes
- SMEs decipher which systems contain them
- Source teams extract the right tables
- Validation teams check consistency
- Governance teams classify sensitivity
- Engineers assess scaling, contracts, and tools
- Transformation teams align semantics
"Most of the time is spent before development, identifying the source, agreeing on transformations, validating quality."
Ask any architect why projects take months, and they'll say the same thing: It's not the code. It's the conversations, the negotiations, the risk tradeoffs.
Until enterprises fix this, AI will always move slower than promised.
Stale Data Undermines AI Predictions
In risk scoring, fraud detection, market-trend analysis, or credit-exposure monitoring, delay becomes a liability.
As Shuchi noted:
"A lot of insights are time-sensitive. If they arrive late, the value is gone."
AI amplifies this. Models trained on inconsistent data can drift into dangerous territory: bias, wrong decisions, broken customer experiences, regulatory violations. Real-time is the new normal, and most pipelines aren't ready for it.
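One small sketch of what readiness can mean in practice: a freshness guard that keeps late events away from live scoring. The five-minute budget here is entirely hypothetical; every use case sets its own, and a real system would emit metrics rather than print:

```python
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(minutes=5)  # hypothetical freshness budget

def is_fresh(event_time: datetime, now: datetime | None = None) -> bool:
    """True if the insight window for this event is still open."""
    now = now or datetime.now(timezone.utc)
    return now - event_time <= MAX_AGE

event_time = datetime.now(timezone.utc) - timedelta(minutes=12)
if not is_fresh(event_time):
    print("stale: route to batch backfill, not the live scoring path")
```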
Five Principles for Intelligent Data Movement
The panel discussed where modern architectures are heading:
1. Understand data at the source
Detect sensitive fields, semantics, and schema early.
2. Strengthen contracts
Changes upstream should never silently break downstream logic.
3. Shift governance left
Quality checks, lineage capture, and controls must happen before movement.
4. Build context-aware pipelines
Not diagrams. Actual end-to-end lineage with ownership and business meaning.
5. Automate pre-work
Profiling, validation, classification, and consistency checks should be built in, not bolted on.
Collectively, the panel was describing the same need: an intelligent data movement layer that understands risk, quality, and governance before a single record flows. That's where the industry is heading, and where the biggest innovation is happening.
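To sketch what principles 1 and 5 might look like together, here is a toy sensitivity profiler that tags columns at the source. The regex detectors are illustrative only; production systems layer patterns, ML classifiers, and human review on top of sampling like this:

```python
import re

# Illustrative detectors; real classification combines patterns,
# models, and human confirmation.
DETECTORS = {
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[a-z]{2,}", re.I),
    "iban": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
}

def classify_column(values: list[str]) -> set[str]:
    """Profile a sample of values; return every sensitivity tag that fires."""
    return {
        tag for tag, pattern in DETECTORS.items()
        if any(pattern.search(v) for v in values)
    }

sample = ["jane@example.com", "GB29NWBK60161331926819"]
print(classify_column(sample))  # {'email', 'iban'}
```

Run against a sample at ingestion time, a profiler like this means sensitivity is known before the first hop, not discovered after the tenth.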
Watch the Full Conversation
This post captures only part of the discussion. The panel shared far more, including stories about schema drift disasters, data residency dead-ends, lineage headaches, and the internal negotiation required to move data across global organizations.
If you want to hear how three leaders are rethinking data movement inside complex, regulated enterprises:
