All this author’s posts

When an offensive security AI agent can compromise one of the world’s most sophisticated consulting firms in under two hours with no credentials, guidance, or insider knowledge, it’s not just a breach but a warning sign to industry.

That’s exactly what happened when an AI agent targeted McKinsey’s Generative AI platform, Lilli. The agent chained together application flaws, API misconfigurations, and AI-layer vulnerabilities into a machine-speed attack. This wasn’t a novel zero-day exploit. It was the exploitation of familiar application security gaps and newer AI attack vectors, amplified by AI speed, autonomy, and orchestration.

Enterprises are already connecting functionality and troves of data through APIs. Increasingly, they’re wiring up applications with Generative AI and agentic workflows to accelerate their businesses. The risk of intellectual property loss and sensitive data exposure is amplified exponentially. Organizational teams must rethink their AI security strategy and likely also revisit API security in parallel.

Offensive AI Exposes the Soft Underbelly of AI Systems

Let’s be precise about what happened, avoiding the blame for McKinsey moving at a pace that much of the industry is already adopting with application and AI technology.

The offensive AI agent probing McKinsey’s AI system was quickly able to:

Discover 200+ API endpoints via public docs.
Identify 22 unauthenticated endpoints.
Exploit a SQL injection flaw in an unauthenticated search API
Enumerate a backend database via error messages.
Escalate privileges for full access across the platform.

From there, the AI agent accessed:

46.5M internal chat messages with sensitive data.
728K files containing research and client records.
57K user accounts covering the entire workforce.
95 system prompts governing AI behaviors.

Even experienced penetration testers don’t move this fast, not without AI tools to augment their testing. Many would stumble to find the type of SQL injection flaw present, let alone all the other elements in the attack chain.

Attack Chains Span Application Layers

What makes this security incident different and intriguing is how the AI agent crossed layers of the technology stack that are now prominent in AI-native designs.

1. Application Layer

McKinsey’s search API was vulnerable to blind SQL injection. The AI agent discovered that while values were parameterized (a security best practice), it could still inject into JSON keys used as field names in the backend database and analyze the resulting error messages. Through continued probing and evaluation of these error messages, the agent mapped the query structure and extracted production data.

These are long-known weaknesses in how applications are secured. Many organizations rely on web application firewall (WAF) instances to filter and monitor web application traffic and to stop attacks such as SQL injection. However, attack methods constantly evolve. Blind SQL injection, where attackers infer information from the system without seeing direct results, is harder to detect and works by analyzing system responses to invalid queries, such as those that delay server response. These attacks can also be made to look like normal data traffic.

Security teams need monitoring capabilities that analyze application traffic over time to identify anomalous behaviors and the signals of an attack.

2. API Layer

The offensive agent quickly performed reconnaissance of McKinsey’s system to understand its API footprint and discovered that 22 API endpoints were unauthenticated, one of which served as the initial and core point of compromise.

The public API documentation served as a roadmap for the AI agent, detailing the system's structure and functionality. This presents a tricky proposition, since well-documented APIs and API schema definitions are critical to increasing adoption of productized APIs, enabling AIs to find your services, and facilitating agent orchestration.

APIs aren’t just data pipes anymore; they’re also control planes for AI systems.

APIs serve as control planes in AI-native designs, managing the configuration of model commands and access controls, and also connecting the various AI and data services. Compromising this layer enables attackers to manipulate AI configuration, control AI behavior, and exfiltrate data.

The major oversight here was the presence of 22 unauthenticated API endpoints that allowed unfettered access. This is a critical API security vulnerability, known as broken authentication.

Lack of proper authorization enabled the AI agent to manipulate unique identifiers assigned to data objects within the API calls, increase its own access permissions (escalate privileges), and retrieve other users' data. The weakness is commonly known as broken object-level authorization (BOLA), where system checks fail to restrict user or machine access to specific data. McKinsey’s AI design also allowed direct API access to backend systems, potentially exposing internal technical resources and violating zero-trust architecture (ZTA) principles. With ZTA, you must presume that the given identity and the environment are compromised, operate with least privilege, and ensure controls are in place to limit blast radius in the event of an attack. At a minimum, all identities must be continuously authenticated and authorized before accessing resources.

3. AI Layer

A breach in an AI system essentially provides centralized access to all organizational knowledge. A successful intrusion can grant control over system logic via features such as writable system prompts. This enables attackers to rewrite AI guardrails, subtly steering AI to bypass compliance policies, generate malicious code, or leak sensitive information.

New risks arise when organizations aim to improve AI system usefulness by grounding them with other sources (e.g., web searches, databases, documents, files) or using retrieval-augmented generation (RAG) pipelines that connect data sources to AI systems. This is done to tweak the prompts sent to LLMs and improve the quality of responses. However, attackers exploit these connections to corrupt the information processing or trick the AI into revealing sensitive or proprietary data.

With its elevated access, the AI agent had the ability to gain influence over:

How the system reasons by corrupting prompts or RAG context that dictate AI logic
What the system returns by manipulating LLM prompts and responses
What users trust when malicious code or actions are inserted into outputs from authoritative internal tools that are used unknowingly by users or other agents.

A breach in the AI layer is not just a security incident, but a core attack on the integrity and competence of the business.

The rise of generative AI has further dissolved traditional security perimeters and created critical new attack vectors. Attackers can now target core mechanisms of institutional intelligence and reasoning, not just data.

Point Solutions Will Continue to Fail

Traditional "defense in depth" thinking segments application and AI protection into isolated layers, commonly WAFs, API gateways, API runtime security, and AI guardrails. While offering granular protection, such approaches inadvertently create a critical security blind spot: they fail to track sophisticated, multi-stage attacks that exploit handoffs between application layers.

Modern attacks are fluid campaigns. They may target frontend code as the initial attack vector, abuse APIs to attack business logic, bypass access controls enforced by gateways, pivot to database services for data exfiltration, and leverage access to manipulate reasoning of AI services.

The fatal flaw is the inability to maintain a single, unbroken chain of contextual awareness across the entire sequence. Each isolated WAF, gateway, or AI guardrail only sees a segment of the event and loses visibility once the request passes to the next layer. This failure to correlate events in real-time across APIs, applications, databases, and AI services is the blind spot that attackers exploit. By the time related signals are gathered and correlated in an organization’s SIEM, the breach has already occurred. True resilience requires a unified runtime platform to quickly identify, correlate, and respond to complex application attack chains.

Shift From Tools to Platforms

To connect signals and stop advanced attacks, organizations need correlated visibility and control across their application, API, and AI footprint. This essential capability comes from three key elements.

1. Unified Visibility & Signal Correlation

A platform must identify your application assets by combining and analyzing traffic signals from:

Application traffic: Signals generated by user interfaces, including chatbots, web applications, and client-side code.
API traffic: Signals generated by applications, system integrations, and agents that invoke functionality or act on data.
AI interactions: Signals generated when user or machine identities generate prompts, receive responses, locate services through MCP, invoke tools, and find AI resources.

2. Context-Awareness

Runtime protection must go beyond simple authentication checks. It requires a deep understanding of other application context including:

Authorization flows: Verify if a user, machine, or agent should be allowed to perform a specific action, not just if a request is authenticated.
Data flows: Identify 1st- and 3rd-party APIs, including AI services, called by your AI applications and the data or sensitive data they pass.

3. Multi-Layer Runtime Protection

Threat detection and prevention must happen at multiple levels during runtime, which include:

Request stitching: Detect abusive sequences and anomalous transaction flows instead of focusing solely on isolated events or rule violations.
API requests & responses: Protect REST and GraphQL API calls and detect attacks such as BOLA and introspection.
AI prompts & responses: Protect model interactions to preserve AI safety and prevent attacks like prompt injection or sensitive data leaks.

AI Doesn’t Break Security, It Exposes It

The incident with McKinsey's AI system didn’t introduce new vulnerabilities. It revealed something more important.

AI systems amplify every weakness across your stack, and AI excels at finding them.

Act now by reevaluating your AI security posture, unifying security monitoring, and bridging gaps that AI can exploit before attackers do.

It’s fortunate this event was essentially a research experiment and not a motivated threat actor. Attackers are already thinking in terms of AI-native designs. It’s not about endpoints or services for them; it’s about attack chains that enable them to get to your organization’s data or intelligence.

When reviewing your application security strategy, it’s not whether you have application firewalls, API protection, or AI guardrails to mitigate attacks; it’s whether they work together effectively.

Michael Isbitski

All this author’s posts

Michael Isbitski has nearly 30 years in the industry, with experience across diverse roles, including analyst, architect, engineer, and marketer, with a focus on cybersecurity and systems engineering.

The McKinsey Incident Is a Warning Shot for AI-Native Design
| Harness Blog

Offensive AI Exposes the Soft Underbelly of AI Systems