When Your AI Agents Start Tricking Each Other

Feb 23

In 2024, the big worry was humans being fooled by AI. By 2026, a new threat is emerging: Agent-on-Agent (AoA) fraud. As more organizations rely on AI agents for procurement, treasury, and customer service, the traditional “perimeter of trust” is disappearing. Fraud isn’t just human-to-machine anymore. Now, it can happen machine-to-machine, where one autonomous AI agent tricks another.

What Is “Anthropomorphic Spoofing”?

Some threat actors are already experimenting with AI agents that mimic legitimate human behavior with incredible precision. These aren’t simple bots spamming a system. They inject subtle variations like unusual timing, keystroke patterns, or non-linear reasoning paths to bypass standard behavioral controls.

A growing tactic is Indirect Prompt Injection: a malicious agent “poisons” the context of other AI agents through seemingly harmless messages, like invoices or emails. In practice, a hijacked procurement AI could trick a treasury agent into authorizing an unauthorized transfer, leaving companies exposed.

The law is struggling to catch up. Regulations like the upcoming EU AI Act aim to hold companies responsible for the actions of their AI systems. But when one AI agent deceives another, questions like “who is responsible?” and “what counts as intent?” remain unsettled. Companies now may face a “liability vacuum” where legal accountability isn’t fully defined..

Monitoring Agentic AI: Some Metrics

Traditional fraud monitoring, like IP tracking or device logs, is no longer enough. Security teams are developing new ways to detect agent-on-agent fraud:

Reasoning-to-Action Ratio (RAR)
- What measures: How long an agent takes to execute a task after receiving a complex instruction.
- Why matters: Humans naturally take longer with more complex tasks. An AI agent that processes everything at the same speed may be a red flag.
Agent Referral Traffic Patterns
- What measures: Volume and intent of API requests between agents.
- Why matters: Sudden spikes or “recursive loops” (multiple rapid approvals) can indicate malicious activity designed to overwhelm other AI agents.
Contextual Drift & Goal Hijacking
- What it measures: How much an agent’s output strays from its original instructions.
- Why it matters: Significant drift can signal that an agent has been influenced by an external actor, intentionally or accidentally.

Real Case Example: When AI Agents Can Be Tricked Into Acting Against Each Other

In late 2025, cybersecurity researchers uncovered a real vulnerability that illustrates the kinds of risks we talk about when discussing agent‑on‑agent fraud. The issue was found in ServiceNow’s Now Assist AI platform, where built‑in features that let AI agents communicate and collaborate with each other could be abused by attackers.

In a scenario dubbed “second‑order prompt injection,” a seemingly harmless instruction embedded in data fields — like a ticket description or form input — can mislead a lower‑privileged AI agent. That compromised agent then unwittingly “recruits” a higher‑privileged agent to perform sensitive actions, such as accessing or exporting confidential data. Because the platform allowed agent‑to‑agent communication by default, the higher‑privileged agent trusted the request and executed it without human review.

This real‑world example shows that the risk isn’t just about human attackers fooling a single AI. When AI agents trust each other and act autonomously on each other’s instructions, malicious actors can exploit that trust to escalate privileges or exfiltrate data — even when the attackers never directly interact with a human operator.

Key Takeaways

Agent-on-agent fraud may still be emerging, but organizations that rely heavily on autonomous AI must rethink risk monitoring, operational controls, and legal responsibility. It’s no longer enough to protect systems from human attackers—AI systems themselves now require careful oversight.

Companies need to invest in both technical safeguards and human expertise to prevent costly mistakes and maintain trust in their AI-driven operations.

Fraud isn’t just human-to-machine anymore; it can happen AI-to-AI. Traditional monitoring is no longer enough—organizations need new metrics and safeguards to detect and prevent these emerging threats.

Legal frameworks are evolving, but questions about accountability and responsibility remain unresolved. Proactive oversight—both technical and human—is essential to stay ahead of these risks and protect organizational assets.

References (2025–2026)

IBS Intelligence — Agentic AI to drive next wave of fraud in 2026
https://ibsintelligence.com/ibsi-news/agentic-ai-to-drive-next-wave-of-fraud-in-2026/
OWASP GenAI Security Project — Top 10 for Agentic Applications 2026
https://genai.owasp.org/resource/owasp-top-10-for-agentic-applications-for-2026/
European Commission — EU AI Act regulatory framework
https://digital-strategy.ec.europa.eu/en/policies/regulatory-framework-ai
Baker Donelson — 2026 AI Legal Forecast: From Innovation to Compliance
https://www.bakerdonelson.com/2026-ai-legal-forecast-from-innovation-to-compliance
Second‑order prompt injection can turn AI into a malicious insider — TechRadar article on the vulnerability in ServiceNow’s AI agents.
https://www.techradar.com/pro/security/second-order-prompt-injection-can-turn-ai-into-a-malicious-insider
ServiceNow Now Assist exploit enables AI prompt injection attacks — SignalPlus coverage of how default configurations enable inter‑agent abuse.
https://t.signalplus.com/crypto-news/detail/servicenow-now-assist-exploit-ai-prompt-injection

Albert Planas

When Your AI Agents Start Tricking Each Other

What Is “Anthropomorphic Spoofing”?

Monitoring Agentic AI: Some Metrics

Real Case Example: When AI Agents Can Be Tricked Into Acting Against Each Other

Key Takeaways

From Agent-on-Agent Fraud to Autonomous AI Attacks: The Next Escalation

Vibe Coding: Hype, Reality, and What It Actually Means