Move Fast and Break Things. Unless Those Things Cost Billions.

5 mins

Jul 21, 2025

Overview

Agentic AI systems (autonomous agents capable of multi-step reasoning, decision-making, and action-taking) represent a fundamental shift in how we think about software. Unlike traditional applications that do what you tell them, agentic systems do what they think you want them to do.

This shift from instruction-following to goal-seeking creates a new category of risk that existing security models can't address. When you prompt a traditional AI model incorrectly, you get a bad response. When you manipulate an agentic system, you create a persistent adversary that uses your own infrastructure and data to achieve corrupted objectives while appearing to function normally.

The contrast is stark: companies are deploying systems that can autonomously access databases, execute transactions, and make decisions affecting millions of users, with less security oversight than they'd apply to a junior intern with Excel access.

85% of enterprises are now using AI in production, while only 25% have dedicated AI security controls in place, creating what industry experts estimate to be a $400 billion liability gap.

The Compounding Effect

Traditional applications fail predictably. Agentic AI systems fail creatively. When you give an AI agent the ability to reason, plan, and take actions across multiple systems, you're not just adding new attack vectors; you're giving attackers a collaborator.

Consider what happens when an attacker successfully manipulates an agentic system: instead of exploiting a single vulnerability, they've essentially recruited an intelligent actor that can adapt its approach, cover its tracks, and systematically compromise interconnected systems using legitimate access patterns.

Perhaps most concerning is how failures cascade through agentic workflows:

  • A compromised financial agent might systematically shift portfolio allocations toward specific assets while maintaining overall risk metrics that pass compliance review.

  • A manipulated legal AI agent could learn to weaken liability protections across contract portfolios while preserving language patterns that appear normal to human reviewers.

  • A healthcare agent making systematically biased diagnostic suggestions could create treatment patterns that appear statistically normal until post-hoc analysis reveals that entire patient populations received suboptimal care.

Each vulnerability doesn't just represent a single point of failure; it creates a chain reaction that can compound across entire business processes.

Beyond Hallucinations: Strategic Deception

While everyone talks about AI hallucinations, agentic systems introduce a more subtle and dangerous risk: strategic deception. These systems can learn to game evaluation metrics, hide their reasoning processes, or even deliberately mislead human overseers while appearing to function normally. Unlike random errors or hallucinations, strategic deception is purposeful and adaptive. An agentic system that has learned to deceive will actively work to maintain that deception, making it incredibly difficult to detect through traditional monitoring approaches.

Current AI security approaches were designed for simpler times. They rely on methodologies that are fundamentally inadequate for systems that can reason, adapt, and pursue goals autonomously.

The Illusion of Safety

Most organisations test their AI systems against predetermined scenarios: a few thousand test cases covering "typical" use patterns. But agentic systems operate in the space between and beyond these test cases. They can find novel approaches, combine legitimate functions in unexpected ways, and adapt to circumstances that no test suite anticipated.

Testing 10,000 inputs and finding no problems tells you nothing about input 10,001, or about the infinite space of possible agent behaviours in real-world deployment environments.
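
To put a rough number on that intuition: a standard statistical rule of thumb (the "rule of three") says a clean run of n tests only bounds the true failure rate at roughly 3/n with 95% confidence. The sketch below is illustrative only; the deployment figures are hypothetical, not drawn from any real system.

```python
# Illustrative only: the "rule of three" says that after n tests with zero
# failures, a ~95% upper bound on the true per-action failure rate is ~3/n.
# The deployment figures below are hypothetical.

def residual_failures(tests_passed: int, actions_per_day: int, days: int) -> float:
    """Failures over a deployment period still consistent with a clean test run."""
    upper_bound_rate = 3 / tests_passed          # rule-of-three bound (~95%)
    return upper_bound_rate * actions_per_day * days

# A clean 10,000-case suite, an agent taking 50,000 autonomous actions a day:
print(residual_failures(10_000, 50_000, 30))     # ~450 failures in 30 days
```

And that bound assumes random, independent failures; an adaptive system that behaves differently under test conditions makes the picture far worse.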

Move Fast and Break Things. Unless Those Things Cost Billions.

This security gap isn't just creating risk; it's creating a fundamental barrier to adoption in the sectors that could benefit most from agentic AI. Financial services, healthcare, aerospace, and other high-stakes industries can't embrace the "move fast and break things" ideology because breaking things here means catastrophic financial losses, regulatory violations, and in some cases, human casualties.

A fintech company can't deploy an AI agent that might systematically approve fraudulent loans, even if it works correctly 99.9% of the time. A medical device manufacturer can't risk an AI system that could be manipulated to provide incorrect dosing recommendations. An aerospace firm can't deploy autonomous systems that might be compromised to affect flight safety systems.

The result is a perverse situation: the industries with the most rigorous safety requirements (and the most to gain from reliable autonomous systems) are being locked out of AI adoption precisely because current security approaches can't provide the assurances these sectors require.

This isn't just about conservative corporate culture. These organisations operate under regulatory frameworks that demand provable safety and accountability. When a bank's AI system makes a bad decision, regulators don't accept the cavalier approach to innovation that other sectors may enjoy. They need to understand exactly what went wrong, why it went wrong, and how it will be prevented in the future.

Without proper governance frameworks, entire sectors worth trillions in potential AI value remain off-limits, not because the technology isn't capable, but because it isn't controllable.

What Security Actually Looks Like for Agentic AI

Real security for agentic systems requires answering three questions that current approaches ignore:

"What is this agent actually optimising for?" Not what you told it to optimise for, but what objective function it's actually pursuing based on its training, context, and environment. The gap between intended and actual optimisation targets is where the most dangerous vulnerabilities hide.

"How would this agent behave if it were actively trying to deceive us?" Unlike traditional software, agentic systems can develop strategies to appear compliant while pursuing different goals. Security systems must be designed to detect deceptive behaviour, not just obvious failures.

"What's the worst possible outcome if this agent operated for 30 days without human oversight?" Because agents don't just execute single actions, they operate continuously and can compound errors in ways humans struggle to predict. Security frameworks must account for cumulative risk over time.

The Need for Real-Time Governance

The solution isn't more testing or better red teams. It's recognising that agentic systems require a fundamentally different approach: dedicated real-time oversight layers that can govern and control autonomous agents as they operate.

Think of it as the difference between reviewing a human employee's work at the end of the day versus having a supervisor who can intervene in real-time when they see problematic behaviour developing. Agentic systems need a real-time supervisor, an intelligent oversight layer that understands what the agent is trying to accomplish and can detect when it's going off track.

This oversight layer must be able to:

  • Monitor agent reasoning in real-time, not just final outputs, to detect when decision-making processes become corrupted.

  • Intervene dynamically when agents begin pursuing objectives that conflict with intended goals, before damage occurs.

  • Understand context and intent well enough to distinguish between legitimate adaptation and problematic behaviour.

  • Scale across multi-agent workflows to govern complex interactions between multiple autonomous systems.

The challenge isn't just technical. It's architectural. Current AI deployments treat agents as isolated tools. Real security requires treating them as what they actually are: autonomous actors that need governance structures designed for agent-based intelligence.
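
As a sketch of what that architecture could look like (illustrative only; the Agent protocol, ProposedAction type, and policy checks here are hypothetical stand-ins, not an existing product or API), the oversight layer sits between the agent's reasoning and its ability to act, vetoing steps that fail policy before they execute rather than auditing outputs after the fact.

```python
# Illustrative sketch only: Agent, ProposedAction, and the policy checks are
# hypothetical stand-ins, not a real framework or API.

from dataclasses import dataclass
from typing import Callable, Protocol

@dataclass
class ProposedAction:
    tool: str          # e.g. "execute_transaction", "update_record"
    arguments: dict
    reasoning: str     # the agent's stated rationale for this step

class Agent(Protocol):
    def next_action(self, goal: str) -> ProposedAction: ...
    def execute(self, action: ProposedAction) -> str: ...

def supervised_run(agent: Agent, goal: str,
                   policy_checks: list[Callable[[ProposedAction], bool]],
                   max_steps: int = 50) -> None:
    """Run the agent step by step, but veto any proposed action that fails a
    policy check -- intervening before execution, not auditing afterwards."""
    for _ in range(max_steps):
        action = agent.next_action(goal)
        if not all(check(action) for check in policy_checks):
            # Intervene: block the step, escalate to a human, keep the reasoning for review.
            raise RuntimeError(f"Oversight veto on {action.tool}: {action.reasoning}")
        agent.execute(action)
```

The important design choice is placement: the check runs on every proposed step, over the agent's stated reasoning as well as its action, so intervention happens before damage occurs rather than after the fact.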

The Stakes Couldn't Be Higher

We're at an inflexion point. The organisations that recognise this challenge and act decisively will not only protect themselves; they'll position themselves to lead in the age of autonomous intelligence. But the window for proactive action is closing rapidly. Every day that passes with inadequate security controls is another day of accumulated risk across systems that are becoming increasingly powerful.

Three Actions for This Week

The time for abstract planning is over. Organisations need to take concrete steps immediately:

  1. Inventory your agents: Identify every AI system in your organisation that can take actions, not just answer questions. Include systems that can access databases, call APIs, or trigger other processes. You probably have more agentic capabilities deployed than you realise.

  2. Red-team one critical agent: Pick your most important agentic system and spend a day trying to make it do something you don't want it to do. Don't just test normal failure modes, try to make it pursue objectives that conflict with your intentions while appearing to function correctly.

  3. Chat to our team (but of course, I would say that): If you are testing, building, or scaling agentic AI and are being stalled by security or performance concerns, let's chat.