
The AI contract gaps the Google-Pentagon deal just made visible


On Tuesday, Google signed a deal permitting the U.S. Department of Defense to use its Gemini AI models for classified military work, under terms allowing “any lawful government purpose.” The restrictions reportedly written into the agreement — no domestic mass surveillance, no autonomous weapons without human oversight — are not contractually binding. And Google has limited ability to monitor or restrict how those systems are ultimately applied.

The geopolitical and ethical implications of that arrangement will be debated at length, but for enterprise CIOs, the contract’s more immediate relevance lies elsewhere. The structure of the master service agreement (MSA) exposes familiar pressure points: contracts that signal intent without enforcing it; limited visibility into how systems behave in production; and a governance model that struggles to keep pace with how AI is actually used.

None of these issues are unique to defense. What the Google–DoD relationship illustrates is how quickly they surface once AI systems are deployed at scale.

Contracts that don’t constrain behavior

Enterprise AI contracts often contain detailed language around acceptable use, data handling and safeguards. On paper, these provisions can appear robust; in practice, they frequently operate as expressions of intent rather than enforceable constraints.

Chris Hutchins, founder and CEO of Hutchins Data Strategy Consulting and strategic advisor to Reliath AI, said this disconnect is built into how enterprise organizations think about their AI vendor contracts in the first place.

“Contracts are only as good as the control mechanisms that govern them,” he said. “An MSA is not a control mechanism. It is a snapshot of what the vendor said on that day.”

That snapshot quickly becomes outdated in an environment where models evolve continuously. Hutchins said enterprises often treat clauses on data use or model behavior as if they provide ongoing assurance, but legacy SaaS governance frameworks can’t simply be transposed onto AI models.

“If you believe the clause stating that the training data will not be used is a control mechanism, you are mistaken,” he said.

The gap becomes more pronounced when looking at how contracts handle downstream use. Hutchins said many agreements contain exceptions that materially weaken their protections. “You would be surprised what ‘improvements, abuse, safety and evaluation, and research’ actually mean,” he said, noting that these categories can create pathways for secondary use of data that customers did not anticipate. 

“Anyone signing that clause without reviewing the exceptions is signing a contract that is almost the opposite of the one in their minds,” he warned.

Simon Ratcliffe, fractional CIO at Freeman Clarke, framed the issue more broadly. “The overarching problem with AI governance is enterprises are trying to apply static governance tools — contracts, policies, controls — to something inherently dynamic,” he said. “This is a mismatch with potential for disaster.”

He was more direct on the limits of policy as a control mechanism. “At scale, pure control is a fiction,” Ratcliffe said. “Policies can define intent, boundaries and consequences, but they cannot fully govern behavior in distributed, API-driven, often employee-led adoption environments.”

The gray areas in these contracts are not simply a matter of poor drafting. They reflect a long-held assumption that contractual language can still meaningfully shape behavior in systems that are continuously updated, integrated, and repurposed. The Google–DoD agreement makes clear how limited that assumption can be when applied at scale.


The observability gap in production

If contracts define intent, enforcement depends on visibility. This is where many enterprise AI strategies begin to break down.

Most governance frameworks are established at the point of procurement or initial deployment. Risk assessments, usage policies and approval processes are designed to shape how systems should be used. But as Ratcliffe said, “AI risk actually materializes during operation, when we see how models behave with real data, how prompts evolve, how outputs are used downstream.”

The problem is that few organizations have the infrastructure to observe those dynamics in real time. “The largest gap is runtime visibility,” Ratcliffe said. Policies may prohibit sensitive data from being shared with external models, but “production systems pass metadata, logs or user inputs that violate that principle.”
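
As a rough sketch of what runtime visibility can mean in practice (the patterns, function names and logging setup below are illustrative assumptions, not any vendor's tooling), an outbound guard can inspect prompts before they reach an external model and record what it finds:

```python
import re
import logging

# Minimal sketch of a runtime guard: scan outbound prompts for sensitive
# markers before they are sent to an external model. Patterns and names
# are illustrative placeholders, not a production-ready DLP policy.
SENSITIVE_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "api_key": re.compile(r"\b(?:sk|key)-[A-Za-z0-9]{16,}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

logger = logging.getLogger("ai_runtime_guard")

def check_outbound_prompt(prompt: str, destination: str) -> bool:
    """Return True if the prompt is clear to send; log and block otherwise."""
    hits = [name for name, pattern in SENSITIVE_PATTERNS.items()
            if pattern.search(prompt)]
    if hits:
        # The policy says this data never leaves the boundary; the guard is
        # what turns that sentence into observable, enforceable behavior.
        logger.warning("Blocked prompt to %s: matched %s", destination, hits)
        return False
    logger.info("Prompt to %s passed outbound checks", destination)
    return True

if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    ok = check_outbound_prompt("Customer SSN is 123-45-6789", "external-llm")
    print("allowed" if ok else "blocked")
```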

Hutchins described a similar divide between documented policy and operational reality. “What policy you have, what you have published in slide decks, is policy intent,” he said. “The reality of what you have in production is in another policy file.” Without sufficient monitoring, organizations are effectively operating on assumptions about how their AI systems behave, rather than empirical evidence.

In highly controlled environments — such as classified networks — the problem becomes more visible because it is more extreme. But the underlying dynamic is consistent across enterprise contexts. Once AI systems are integrated into business processes, both vendors and customers can lose sight of how they are being used. 

“Users copy outputs into the next tool down the line, and the chain of custody is lost,” Hutchins said.

That raises a practical question for CIOs: if governance depends on the ability to observe and intervene, what happens when that visibility is incomplete by design?

Strengthening AI contracts in practice

When contracts prove increasingly inadequate, the response is not to abandon them altogether, but to rethink what they are expected to do and how they are structured.

Ratcliffe argued that organizations need to move from what he described as “service assurance” to “outcome assurance.” In practice, that means shifting away from general commitments and toward mechanisms that account for how models evolve over time.

This is an area that Hutchins flags as being currently under-addressed in AI agreements. “The AI vendor retains the right to swap out models, and change prompts and filters, meaning your implementation may change with no notice,” he said. “Changes may occur overnight, and a new version of the AI may perform in a completely different manner with no explanation.”

To combat this, Ratcliffe recommends that contracts include model change notification clauses with defined impact thresholds, along with versioning guarantees or the ability to pin to specific model versions. This returns some of the control over model application to the enterprise.
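
How that looks operationally will differ by vendor, but one minimal customer-side sketch (the model identifiers and metadata fields here are placeholders, not any particular provider's API) is to pin a dated version in configuration and fail loudly if the served version drifts:

```python
# Sketch of enforcing a version-pinning clause on the customer side.
# MODEL_PIN and the response fields are illustrative; real version
# identifiers and metadata depend on the vendor's API.
MODEL_PIN = "example-model-2026-01-15"   # pinned, dated version named in the contract
MODEL_ALIAS = "example-model-latest"     # floating alias, shown only for contrast: it can change silently

def assert_pinned_version(response_metadata: dict) -> None:
    """Raise if the vendor served a different model version than the one pinned."""
    served = response_metadata.get("model_version", "")
    if served != MODEL_PIN:
        raise RuntimeError(
            f"Model drift detected: contract pins {MODEL_PIN!r}, "
            f"but the vendor served {served!r}. Trigger the change-notification clause."
        )

if __name__ == "__main__":
    # Simulated response metadata from two model calls.
    assert_pinned_version({"model_version": "example-model-2026-01-15"})  # passes
    try:
        assert_pinned_version({"model_version": "example-model-2026-03-02"})
    except RuntimeError as err:
        print(err)
```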

Data handling is another area where specificity matters. Ratcliffe said organizations should define clear data boundaries, including zero-retention options and indemnity around misuse. Hutchins, meanwhile, pointed to the need to scrutinize exceptions within data clauses, where secondary use is often permitted under broad categories.

Observability also needs to be addressed contractually, not just technically. Ratcliffe said enterprises should embed audit and observability rights, including access to logs, evaluation metrics, and testing environments. Without those rights, enforcing governance policies becomes significantly more difficult.
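
Those contractual rights are easier to exercise when the enterprise keeps its own records as well. A minimal sketch of that customer-side half (the record fields are an assumed schema, not a standard one) is an audit wrapper that logs every model call with enough context to reconstruct it later:

```python
import json
import time
import uuid
from typing import Callable

def audited_call(model_fn: Callable[[str], str], prompt: str,
                 model_version: str, user: str,
                 log_path: str = "ai_audit.jsonl") -> str:
    """Call a model function and append an audit record of the exchange.

    model_fn stands in for whatever client call the vendor provides;
    the record fields are an illustrative schema, not a standard.
    """
    started = time.time()
    output = model_fn(prompt)
    record = {
        "id": str(uuid.uuid4()),
        "timestamp": started,
        "user": user,
        "model_version": model_version,
        "prompt": prompt,
        "output": output,
        "latency_s": round(time.time() - started, 3),
    }
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return output

if __name__ == "__main__":
    fake_model = lambda p: f"(stub response to: {p})"
    print(audited_call(fake_model, "Summarize Q3 churn drivers",
                       "example-model-2026-01-15", "analyst1"))
```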

Finally, both experts emphasized the importance of planning for an exit or a total renegotiation. Ratcliffe highlighted the need for portability of prompts, workflows and embeddings, while Hutchins emphasized timing. “Renewal is when the most options are available,” he said. “Don’t wait for some crisis to act.”

From governance as policy to governance as system

The combined effect of these dynamics is a shift in how AI governance needs to be approached. Contracts, policies and upfront controls remain necessary, but they are no longer sufficient on their own.

Ratcliffe argues for a move toward runtime governance, where monitoring, evaluation and intervention are continuous rather than episodic. He said organizations that are making progress are treating AI not as a feature, but as “an operational risk surface.” 
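
One concrete shape that runtime governance can take (the evaluation cases, threshold and scoring below are placeholders, not a recommended benchmark) is to re-run a fixed evaluation set against the production model on a schedule and alert when quality drifts, rather than testing once at procurement:

```python
# Sketch of continuous, rather than one-off, evaluation: score the production
# model against a fixed reference set and alert when results fall below a
# threshold. The cases, scoring rule and threshold are illustrative.
REFERENCE_SET = [
    {"prompt": "Classify: 'invoice overdue 90 days'", "expected": "collections"},
    {"prompt": "Classify: 'password reset request'", "expected": "it_support"},
]
ALERT_THRESHOLD = 0.9  # minimum acceptable accuracy, set per use case

def evaluate(model_fn, reference_set) -> float:
    """Fraction of reference prompts the model answers as expected."""
    correct = sum(
        1 for case in reference_set
        if case["expected"] in model_fn(case["prompt"]).lower()
    )
    return correct / len(reference_set)

def run_governance_check(model_fn) -> None:
    score = evaluate(model_fn, REFERENCE_SET)
    if score < ALERT_THRESHOLD:
        # In practice this would page the owning team and open an incident,
        # not just print: the point is intervention, not just measurement.
        print(f"ALERT: evaluation accuracy {score:.0%} fell below {ALERT_THRESHOLD:.0%}")
    else:
        print(f"OK: evaluation accuracy {score:.0%}")

if __name__ == "__main__":
    stub_model = lambda p: "collections" if "overdue" in p else "it_support"
    run_governance_check(stub_model)
```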

“We need to change our thought process because organizations that still think in terms of prohibition or rigid approval models will either fail or drive usage underground,” he warned.

That shift comes at a price. Hutchins did not shy away from the ramifications of a more tightly governed AI deployment framework: the visible cost of equipping a small team to inventory, evaluate and monitor AI systems at runtime; slower project approvals; and a change in how vendors need to sell their AI-enhanced products.

Despite this, he unequivocally recommends taking action.

“The biggest cost will come from delaying this decision, because the alternatives are an irrational system with unclear processes, class action lawsuits and government inquiries,” he said. “The math for this decision is easy.”


