Research & Writing

Ideas from the safety frontier.

Technical research, threat analysis, and field notes from our work building foundational AI safety tooling.

Featured analysis

Apr 22, 2026

One Employee's AI Side App Just Cost Vercel Its Customer Data

A Vercel employee connected an unauthorized AI tool to their corporate Google account with "Allow All" permissions. Attackers used that access to breach Vercel's internal systems and put customer data up for sale at $2 million. Here is exactly what happened and what it means for enterprises with employees using unsanctioned AI tools.

SuperAlign Team

announcement

Mar 31, 2026

Introducing Project Horizon: SuperAlign's Research Preview for Context-Aware AI Safety

Project Horizon is SuperAlign's research preview for context-aware AI safety systems. We're introducing Halo and Prism — two specialized moderation systems designed for enterprise AI workflows.

analysis

Mar 26, 2026

46 Minutes: How a Poisoned Python Package Reached 47,000 AI Environments

A threat group called TeamPCP injected credential-stealing malware into LiteLLM versions 1.82.7 and 1.82.8 on PyPI. Nearly 47,000 downloads happened in 46 minutes. Here is what the attack did, how it started with a compromised security scanner, and what enterprises running AI agents need to check now.

research

Mar 20, 2026

When the Assembly Line Becomes the Attack Surface: Supply Chain Threats in the Age of AI Agents

Software supply chain attacks can steal your credentials in minutes. Now AI agents are running the same attacks autonomously. What the hackerbot-claw campaign against Microsoft, Datadog, and Aqua Security reveals about the enterprise AI security gap.

analysis

Mar 5, 2026

When Your AI Ignores Your Security Policies: What the Copilot DLP Failures Reveal

Microsoft Copilot bypassed DLP policies twice in eight months, and no security tool caught either failure. Here's what it means for enterprise AI governance.

research

Mar 1, 2026

The Hidden Supply Chain Threat Hiding in Your AI Agent's Markdown Files

Agent behavioral configuration lives in markdown files that lack the governance of code. This creates a new supply chain attack surface.

analysis

Feb 17, 2026

When Guardrails Fail: What Claude Opus 4.6 Reveals About Prompt Injection Risk

Anthropic's Claude Opus 4.6 system card finally quantifies prompt injection risk at scale. These numbers should reshape how enterprises deploy AI agents.

research

Feb 4, 2026

How MCP Servers Turn AI Integrations Into Systemic Security Risks

The Model Context Protocol enables AI integration but carries fundamental security flaws. 43% of implementations have critical vulnerabilities.

research

Jan 28, 2026

The Moltbot Rush: When Viral AI Agents Expose Your Entire Digital Life

Moltbot gained 85,000 GitHub stars by promising to automate your digital life. Security researchers found it introduces risks most users don't understand.

research

Jan 23, 2026

Hidden in Plain Language: How Calendar Invites Became Data Extraction Tools Through Prompt Injection

A calendar event with crafted instructions could silently extract your private meeting data when you ask Gemini about your schedule. This reveals fundamental gaps in how AI systems handle untrusted inputs.

research

Jan 20, 2026

When AI Agents Have Privileged Access: The BodySnatcher Vulnerability Exposes a Critical Design Flaw

The BodySnatcher vulnerability shows how authentication gaps in AI agent platforms can become critical security breaches. Nearly half of Fortune 100 companies use affected systems.

analysis

Dec 30, 2025

When AI Democratization Meets Vulnerability: The Real Cost of No-Code AI Agents

No-code AI platforms promise accessibility. Recent research shows they also introduce security challenges traditional approaches don't address.

report

Dec 1, 2025

The Shadow AI Crisis: Why 40% of Organizations Will Face Security Incidents by 2030

Gartner predicts that 40% of organizations will suffer security incidents from unauthorized AI usage by 2030. Most are unprepared.

research

Nov 17, 2025

Cursor's Browser Just Became a Target: What MCP Server Hijacking Means for Your Security Posture

Malicious MCP servers can take over Cursor's browser, harvest credentials, and run persistent code. Learn how to protect your development environment.

research

Nov 17, 2025

How SuperAlign Helps Enterprises Counter AI-Powered Threats

Traditional tools cannot defend against AI-orchestrated attacks. Learn how SuperAlign helps enterprises address the critical security gaps that GTG-1002 exposed.