Microsoft Security Research published 120 days of operational data on an autonomous AI agent that runs inside tens of thousands of customer networks. It is the first time a major vendor in the crowded agentic-security category has shown its numbers.
The paper, published Wednesday, describes the Dynamic Threat Detection Agent (DTDA), an autonomous component of Microsoft’s Security Copilot that has been running in customer Defender environments since February.
DTDA is not a chatbot. It is a backend service. When Defender flags a security incident, DTDA independently investigates. It pulls together alerts, system events, user behavior signals, and threat intelligence, then uses a large language model to reason about whether anything malicious is going on that the original alert missed. If it finds something, it writes its own alert and drops it into the analyst’s queue, labeled as a Copilot-generated detection.
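The flow described above (gather evidence, reason over it, emit a labeled alert only if something new turns up) can be sketched in a few lines of Python. This is an illustration only, not Microsoft's implementation: every name here (`Incident`, `Evidence`, `GeneratedAlert`, `investigate`, and the two stub functions) is hypothetical, and the language-model step is replaced by a trivial keyword check so the example runs on its own.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Incident:
    incident_id: str
    alerts: list[str]


@dataclass
class Evidence:
    # The four evidence types the article lists; field names are assumptions.
    alerts: list[str]
    system_events: list[str] = field(default_factory=list)
    user_signals: list[str] = field(default_factory=list)
    threat_intel: list[str] = field(default_factory=list)


@dataclass
class GeneratedAlert:
    incident_id: str
    title: str
    source: str = "copilot-generated"  # label the analyst sees in the queue


def investigate(
    incident: Incident,
    retrieve: Callable[[Incident], Evidence],
    reason: Callable[[Evidence], Optional[str]],
) -> Optional[GeneratedAlert]:
    """Run one investigation: retrieve evidence, reason, maybe emit an alert."""
    evidence = retrieve(incident)
    finding = reason(evidence)  # in a real system this would be the LLM call
    if finding is None:
        return None  # nothing beyond the original alert, so stay silent
    return GeneratedAlert(incident.incident_id, finding)


def fake_retrieve(incident: Incident) -> Evidence:
    # Hypothetical stand-in for telemetry retrieval.
    return Evidence(
        alerts=incident.alerts,
        system_events=["powershell.exe spawned by winword.exe"],
    )


def fake_reason(evidence: Evidence) -> Optional[str]:
    # Hypothetical stand-in for model reasoning: a simple keyword check.
    for event in evidence.system_events:
        if "powershell" in event:
            return "Script execution not covered by the original alert"
    return None


if __name__ == "__main__":
    incident = Incident("INC-1", ["Suspicious macro in document"])
    print(investigate(incident, fake_retrieve, fake_reason))
```

Passing the retrieval and reasoning steps in as functions keeps the control flow visible and makes it easy to see that no human sits between the incident and the generated alert.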
Most enterprise customers think of Security Copilot as something analysts query during an investigation. DTDA inverts that: it works without a human in the loop, and customers with eligible licenses have been receiving DTDA-generated alerts for months.
The report’s authors — Scott Freitas, a Senior Applied Scientist at Microsoft Security Research in Redmond, and Amir Gharib, of Microsoft Security Research in Toronto — describe this as a move “from analyst-assistive workflows toward continuous, autonomous threat discovery.” It is a capability many Security Copilot customers may not yet realize is already running in their tenant.
Microsoft says it will include DTDA in Microsoft 365 E5 subscriptions starting in July.
The disclosed numbers, from 120 days of production:
[Figure: Key production metrics for Microsoft’s Dynamic Threat Detection Agent, from 120 days of deployment.]
The disclosure lands in an increasingly noisy agentic-SOC market. Google Cloud, Palo Alto Networks (Cortex AgentiX), CrowdStrike, and a wave of startups have all announced autonomous investigation agents over the past 12 months — but virtually none have published comparable production precision, cost, or failure-rate data.
The authors note this gap directly: industry documentation in the category typically emphasizes capability claims over mechanisms or measured performance.
Freitas and Gharib are also candid about DTDA’s limits. The paper does not benchmark against rival products, because the data to do so does not exist publicly. Roughly 89% of the agent’s runtime is spent retrieving telemetry, not reasoning about it. And because DTDA reads data that attackers can influence, prompt injection is an active risk — one Microsoft says it mitigates with strict output schemas and Azure’s content-safety controls.
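To make the "strict output schema" idea concrete, here is a minimal sketch of how an agent's model output might be validated before anything downstream acts on it. The field names, allowed verdicts, and length limit are assumptions chosen for illustration; the paper's actual schema is not reproduced here, and Azure's content-safety controls are not modeled at all.

```python
import json

# Hypothetical schema: exactly these keys, nothing more, nothing less.
REQUIRED_KEYS = {"verdict", "confidence", "summary"}
ALLOWED_VERDICTS = {"malicious", "benign", "inconclusive"}
MAX_SUMMARY_CHARS = 500


def parse_verdict(raw: str) -> dict:
    """Accept model output only if it matches the schema exactly."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("output is not valid JSON") from exc

    if not isinstance(data, dict) or set(data) != REQUIRED_KEYS:
        raise ValueError("missing or unexpected fields")

    verdict = data["verdict"]
    if not isinstance(verdict, str) or verdict not in ALLOWED_VERDICTS:
        raise ValueError("verdict outside the allowed set")

    confidence = data["confidence"]
    if (
        isinstance(confidence, bool)  # bool is an int subclass, reject it
        or not isinstance(confidence, (int, float))
        or not 0 <= confidence <= 1
    ):
        raise ValueError("confidence must be a number in [0, 1]")

    summary = data["summary"]
    if not isinstance(summary, str) or len(summary) > MAX_SUMMARY_CHARS:
        raise ValueError("summary must be a short string")

    return data


if __name__ == "__main__":
    good = '{"verdict": "malicious", "confidence": 0.9, "summary": "Lateral movement"}'
    print(parse_verdict(good))

    # An injected instruction that tries to smuggle in an extra field is rejected.
    bad = '{"verdict": "benign", "confidence": 1, "summary": "ok", "action": "close"}'
    try:
        parse_verdict(bad)
    except ValueError as err:
        print("rejected:", err)
```

A schema check like this does not stop an attacker from influencing the verdict itself, but it narrows what a manipulated model can emit to a small, predictable set of values.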
Whether DTDA’s 80.1% precision turns out to be the high-water mark or the floor for the category is the open question. For now, it is the only number on the board.