GTIG: First AI-Built Zero-Day Found

Google says AI helped craft a 2FA bypass; patching stopped planned mass exploitation

Google’s Threat Intelligence Group (GTIG) on May 11, 2026, published an AI Threat Tracker report describing what it believes is the first confirmed case of a zero‑day exploit developed with the help of an AI model.

GTIG says the exploit was implemented as a Python script that bypassed two‑factor authentication (2FA) in a widely used open‑source, web‑based system administration tool. The company says it coordinated a responsible disclosure and a patch with the vendor before a planned mass‑exploitation campaign could start.

Researchers told GTIG the code bore multiple hallmarks of AI generation: detailed educational docstrings, a hallucinated Common Vulnerability Scoring System (CVSS) score, and an unusually textbook Pythonic structure of the kind common in LLM training data. Those signals helped analysts conclude that an AI model likely assisted both discovery and weaponization.

GTIG declined to name the tool or the criminal group in public materials, saying the disclosure and coordinated patch reduced the risk of a mass event. The company described the disruption as proactive counter‑discovery, crediting internal AI agents and threat hunting for the intervention.

The report places the zero‑day finding in a broader pattern: adversaries are moving from one‑off experiments with generative models toward industrialized, agentic workflows that automate vulnerability research, exploit testing, and parts of malware development. GTIG highlighted several other incidents where models were used to scale operations.

One striking case GTIG documented is PROMPTSPY, an Android backdoor with an embedded agent module that calls out to an LLM API to navigate device UIs and persist on compromised phones. The report and linked analysis describe AI‑driven autonomy and previously unreported capabilities for such malware.

GTIG also observed new forms of AI‑augmented obfuscation and polymorphism, where models generate decoy code or just‑in‑time modifications to evade signature detection. The firm says these techniques can hide malicious intent behind coherent, inert code blocks that look legitimate to many static scanners.

Security vendors and researchers say Google’s disclosure is a warning shot: AI lowers the technical barrier for creating exploits and accelerates the timeline from discovery to weaponization. That speed narrows defenders’ windows to detect, patch, and respond before criminals launch mass attacks.

Industry reporting and threat research over the past year have tracked similar shifts: CrowdStrike’s 2026 intelligence reporting described a sharp rise in AI‑enabled operations and much shorter breakout times (the interval from initial breach to lateral movement), while IBM’s X‑Force flagged growing exploitation of public‑facing applications. Those trends make GTIG’s finding especially urgent for software maintainers.

For open‑source projects and small vendors, the report underscores a hard fact: documentation, helpful comments, and example code that aid users can also be high‑quality signals for an LLM. GTIG warns that models trained or primed on such material can surface high‑level logic flaws that traditional scanners miss.

GTIG recommends practical mitigations: remove risky hardcoded trust assumptions, apply least‑privilege and multi‑factor checks at multiple enforcement points, improve unit and integration testing for authorization logic, and establish rapid coordinated disclosure channels with major cloud and repository hosts. The group also urged defenders to adopt AI‑assisted detection tools to keep pace.
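
To make two of those recommendations concrete, here is a minimal Python sketch of what checking the second factor at more than one enforcement point might look like, along with the kind of authorization test that could catch a bypass. It is an illustration only: GTIG has not published the affected code, and every name here (`Session`, `require_full_auth`, `login_gate`, `run_admin_action`) is hypothetical rather than taken from the report or from any real tool.

```python
from dataclasses import dataclass


class AuthorizationError(Exception):
    """Raised when a session fails an enforcement check."""


@dataclass(frozen=True)
class Session:
    # Hypothetical session record; field names are assumptions for this sketch.
    user: str
    password_ok: bool = False
    second_factor_ok: bool = False
    roles: frozenset = frozenset()


def require_full_auth(session: Session) -> None:
    """Check both factors, with no shortcut for 'trusted' callers or networks."""
    if not session.password_ok:
        raise AuthorizationError("password not verified")
    if not session.second_factor_ok:
        raise AuthorizationError("second factor not verified")


def login_gate(session: Session) -> Session:
    """Enforcement point 1: the login flow."""
    require_full_auth(session)
    return session


def run_admin_action(session: Session, action: str) -> str:
    """Enforcement point 2: the sensitive action re-checks on its own.

    It does not assume login_gate already ran, so a request that reaches
    this function by another path is still rejected.
    """
    require_full_auth(session)
    if "admin" not in session.roles:  # least privilege: role is checked too
        raise AuthorizationError("admin role required")
    return f"{action} executed for {session.user}"


if __name__ == "__main__":
    # Authorization test: a session that skipped the second factor must fail
    # at the action itself, even though login_gate was never called.
    half_authenticated = Session("mallory", password_ok=True,
                                 roles=frozenset({"admin"}))
    try:
        run_admin_action(half_authenticated, "restart-service")
    except AuthorizationError as exc:
        print(f"blocked: {exc}")
```

The design choice worth noting is that the second check lives inside the action rather than only in the login flow, so a logic flaw in one layer does not by itself expose the other.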

The disclosure is likely to change how defenders and vendors prioritize fuzzing, logic testing, and code reviews. GTIG’s paper also raises policy questions about model access and the gray market for API relays that let adversaries proxy requests to high‑capability models while evading rate limits and monitoring. Security teams should treat the GTIG case as a sign that AI‑assisted zero‑days are a present, not hypothetical, risk.