Microsoft Reveals Production Metrics for Defender’s AI Hunter

Microsoft’s AI system found hidden attack activity missed by existing alerts, but the results come from Microsoft’s own environment and data.

byTom Spring

May 21, 2026

Microsoft Security Research published 120 days of operational data on an autonomous AI agent that runs inside tens of thousands of customer networks. It is the first time a major vendor in the crowded agentic-security category has shown its numbers.

The paper, published Wednesday (PDF), describes the Dynamic Threat Detection Agent — DTDA — an autonomous component of Microsoft’s Security Copilot that has been running in customer Defender environments since February.

DTDA is not a chatbot. It is a backend service. When Defender flags a security incident, DTDA independently investigates. It pulls together alerts, system events, user behavior signals, and threat intelligence, then uses a large language model to reason about whether anything malicious is going on that the original alert missed. If it finds something, it writes its own alert and drops it into the analyst’s queue, labeled as a Copilot-generated detection.

Most enterprise customers think of Security Copilot as something analysts query during an investigation. DTDA inverts that: it works without a human in the loop, and customers with eligible licenses have been receiving DTDA-generated alerts for months.

The report’s authors — Scott Freitas, a Senior Applied Scientist at Microsoft Security Research in Redmond, and Amir Gharib, of Microsoft Security Research in Toronto — describe this as a move “from analyst-assistive workflows toward continuous, autonomous threat discovery.” It is a capability many Security Copilot customers may not yet realize is already running in their tenant.

Microsoft says it will include DTDA in Microsoft 365 E5 subscriptions starting in July.

The disclosed numbers, from 120 days of production:

DTDA · 120-day production sample · GPT-4.1

Alert-level precision

80.1%

1,088 graded alerts · 208 orgs

Novel alerts generated

~15%

of investigated incidents

Job-level failure rate

0.38%

236K+ production jobs

Median latency

28 min

single-incident, end-to-end

Median token cost

$2.04

p95: $7.82 · 90K incidents

Offline gap-recovery F1

0.78

GPT-5.4 · +0.12 over GPT-4.1

Precision by attack phase

Initial access 72.9%

86 TPs · 32 FPs · 71 orgs · 95% CI 64.1–80.2%

Execution 80.7%

159 TPs · 38 FPs · 78 orgs · 95% CI 74.6–85.7%

Post-compromise 81.0%

626 TPs · 147 FPs · 150 orgs · 95% CI 78.1–83.6%

The disclosure lands in an increasingly noisy agentic-SOC market. Google Cloud, Palo Alto Networks (Cortex AgentiX), CrowdStrike, and a wave of startups have all announced autonomous investigation agents over the past 12 months — but virtually none have published comparable production precision, cost, or failure-rate data.

The authors note this gap directly: industry documentation in the category typically emphasizes capability claims over mechanisms or measured performance.

Freitas and Gharib are also candid about DTDA’s limits. The paper does not benchmark against rival products, because the data to do so does not exist publicly. Roughly 89% of the agent’s runtime is spent retrieving telemetry, not reasoning about it. And because DTDA reads data that attackers can influence, prompt injection is an active risk — one Microsoft says it mitigates with strict output schemas and Azure’s content-safety controls.

Whether DTDA’s 80.1% precision turns out to be the high-water mark or the floor for the category is the open question. For now, it is the only number on the board.

Photo by BoliviaInteligente on Unsplash

Author

Tom Spring

Tom Spring is the founder of Security Point Break and is based in Boston, MA. For over two decades he has worked at national publications in the leadership roles of senior editorial director of SC Media, publisher at Threatpost, as executive news editor PCWorld/Macworld, and as technical editor at CRN. He is a seasoned cybersecurity reporter, editor and storyteller that aims always for truth and clarity.

Author

Tom Spring

Tom Spring is the founder of Security Point Break and is based in Boston, MA. For over two decades he has worked at national publications in the leadership roles of senior editorial director of SC Media, publisher at Threatpost, as executive news editor PCWorld/Macworld, and as technical editor at CRN. He is a seasoned cybersecurity reporter, editor and storyteller that aims always for truth and clarity.

AI's Next Security Problem: Network Traffic No One Watches

byTom Spring

Microsoft Quietly Patches 8 Critical Cloud Vulnerabilities

byTom Spring

The Latest

Cracks in Claude Code, Cursor, Amazon Q, Codex Expose ‘Trust Boundaries’

Critical Blocksy WordPress Plugin Bug Lets Attackers Skip Login Entirely

GitHub AI Agent Bug Let Attackers Leak Private Code

AI SOC Tools Misflag Up to 86% of Safe Traffic, Study Finds

Microsoft Reveals Production Metrics for Defender’s AI Hunter

Author

Leave a ReplyCancel reply

AI's Next Security Problem: Network Traffic No One Watches

Microsoft Quietly Patches 8 Critical Cloud Vulnerabilities

Microsoft Reveals Production Metrics for Defender’s AI Hunter

Key production metrics for Microsoft’s Dynamic Threat Detection Agent, from 120 days of deployment.

Author

Related

Leave a ReplyCancel reply

AI's Next Security Problem: Network Traffic No One Watches

Microsoft Quietly Patches 8 Critical Cloud Vulnerabilities

Related Posts

Discover more from Security Point Break