AI-First Data Ecosystems

The phrase AI-first data ecosystem gets thrown around like it means a single design decision. It does not. It is a stack of three decisions, taken in this order: where the data lives, how the semantics are owned, and which workloads get the freedom to call which sources. Get the order wrong and you spend a year refactoring while AI workloads run anyway, on bad data, in front of the board.

At JCorp the where question got answered first: centralisation on Azure, with a Group-wide backbone. The how question, semantic ownership, turned out to be the harder one. We resisted the temptation to centralise semantics into a single team, because semantic ownership belongs with the people who live the operating reality. What sits centrally is the contract: how a semantic term is defined and where its source of truth lives. Anyone can use the term, but no one redefines it locally.
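To make the contract idea concrete, here is a minimal sketch in Python. It is an illustration under assumptions, not a description of JCorp's actual tooling: the class names, the `net_revenue` term, the `finance.ledger` source and the team names are all hypothetical. It shows the one rule that matters: the owning team registers a term centrally, anyone can look it up, and a second team cannot redefine it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SemanticTerm:
    """One centrally registered contract entry (all field names are illustrative)."""
    name: str             # the shared term, e.g. "net_revenue"
    definition: str       # how the term is defined
    source_of_truth: str  # where the authoritative data lives
    owner: str            # the operating team that owns the meaning


class SemanticRegistry:
    """Central contract: owners register terms, everyone else only reads."""

    def __init__(self) -> None:
        self._terms: dict[str, SemanticTerm] = {}

    def register(self, term: SemanticTerm) -> None:
        # Only the existing owner may update a term; other teams are refused.
        existing = self._terms.get(term.name)
        if existing is not None and existing.owner != term.owner:
            raise PermissionError(
                f"{term.name!r} is owned by {existing.owner}; "
                f"{term.owner} cannot redefine it"
            )
        self._terms[term.name] = term

    def lookup(self, name: str) -> SemanticTerm:
        # Any consumer can read the agreed definition and its source of truth.
        return self._terms[name]


# Hypothetical usage: finance owns the term, marketing tries to redefine it.
registry = SemanticRegistry()
registry.register(SemanticTerm(
    "net_revenue", "Gross revenue less returns and discounts",
    "finance.ledger", "finance",
))
try:
    registry.register(SemanticTerm(
        "net_revenue", "Gross revenue", "marketing.dashboard", "marketing",
    ))
    redefinition_blocked = False
except PermissionError:
    redefinition_blocked = True
```

The registry holds no data and does no modelling. Ownership stays with the team named in `owner`; the central piece is only the lookup and the refusal.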

The third question is the one that exposes architecture choices to AI workloads directly. Which models can read which sources? Which agents can write back? Where is the human gate, and where is it not? Those decisions can't be retrofitted; they have to be made before agentic AI lands in production. The talks I give about this thesis are getting longer, not shorter, as the deployment evidence catches up with the principle.
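The read, write-back and human-gate questions can be written down as an explicit policy before any agent ships. What follows is a minimal Python sketch of that idea, not a production design: the workload names, source names and the `Grant` and `AccessPolicy` types are all hypothetical. It assumes a default-deny stance, and assumes writes wait for a human unless the gate is explicitly waived.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    NEEDS_HUMAN = "needs_human"
    DENY = "deny"


@dataclass(frozen=True)
class Grant:
    """One workload-to-source permission (illustrative shape)."""
    workload: str
    source: str
    can_write: bool = False
    human_gate: bool = True  # assumption: writes are gated unless waived


class AccessPolicy:
    def __init__(self, grants: list[Grant]) -> None:
        self._grants = {(g.workload, g.source): g for g in grants}

    def check(self, workload: str, source: str, action: str) -> Decision:
        grant = self._grants.get((workload, source))
        if grant is None:
            return Decision.DENY  # no grant means no access at all
        if action == "read":
            return Decision.ALLOW  # any grant implies read access
        if action == "write" and grant.can_write:
            return Decision.NEEDS_HUMAN if grant.human_gate else Decision.ALLOW
        return Decision.DENY


# Hypothetical grants: a read-only model, a gated agent, an ungated agent.
policy = AccessPolicy([
    Grant("forecast_model", "sales.orders"),
    Grant("pricing_agent", "pricing.rules", can_write=True),
    Grant("tagging_agent", "catalog.tags", can_write=True, human_gate=False),
])
```

Here `policy.check("pricing_agent", "pricing.rules", "write")` comes back as `Decision.NEEDS_HUMAN`, while the same write from `tagging_agent` against `catalog.tags` is allowed outright. Each of the three questions has an answer somebody wrote down on purpose, not one that emerged in production.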