The idea of turning AI loose to find software’s hidden flaws was largely a research curiosity a mere three months ago. At its Build conference this week, Microsoft debuted it as a product – MDASH.
The tool puts Microsoft in the AI bug-hunting race alongside Anthropic and OpenAI. It also distilled the larger Build message: AI is moving into enterprise software faster than security teams can comfortably absorb it, forcing DevSecOps into the center of a conference traditionally built around developers and code.
Against that backdrop, Satya Nadella, Microsoft chairman and chief executive, used his Tuesday keynote to make Microsoft’s security argument plain: the AI developer boom will not scale unless cybersecurity is built into it from the start.
Build 2026 outlined five standout cybersecurity pillars:
- MDASH: AI systems that find and prove exploitable bugs faster than human researchers.
- MXC: A Windows containment layer for running AI agents securely on user workstations.
- Agent 365: An agent-identity tool for assigning granular or broad guardrails to agents.
- Agent Control Specification: Open specifications meant to set the rules for how agents are governed industry-wide.
- Confidential computing: Chip-level encryption that keeps rival tenants’ agent data protected even while it is being processed.
Taken together, the announcements show Microsoft betting that its command of the entire computing stack (Windows, Azure, GitHub and the identity tools that tie them together) gives it an edge no rival can match in an escalating AI arms race.
1. Software that hunts its own flaws
The most consequential security news at Build was a system built to find software vulnerabilities the way an attacker would. Nadella introduced Microsoft Security Multi-Model Agentic Scanning Harness (MDASH) as an “agent harness for security,” for a moment when the task is to “defend yourself using AI against attacks that may, in fact, be using AI.”
In a demonstration, a Microsoft engineer ran it as a command-line tool inside the company’s new GitHub Copilot app, scanning a codebase and ranking flaws by severity. It flagged hard-coded secrets and what the demo called “AI-specific vulnerabilities,” then rewrote the buggy code locally and produced a diff for review before opening a pull request. The centerpiece was a real open-source flaw, split across three files and accompanied by a developer comment insisting the code was sound. One set of agents flagged it, a second debated it, and a third proved it by triggering a crash.
MDASH represents Microsoft’s official entry into the category of AI tools that can hunt down and patch bugs at machine speed. Anthropic unveiled Claude Mythos in April, and OpenAI followed with Daybreak.
“Last month we announced MDASH,” Nadella said. “We are bringing together 100 agents across the frontier and custom models to really find these exploitable bugs better than any single model does.”
He claimed MDASH topped both Anthropic and OpenAI in bug hunting, citing testing based on the CyberGym benchmark.
The comparison is not quite like for like. MDASH is a system of 100-plus agents, while Mythos is a single model, one that Anthropic restricts through a consortium Microsoft itself belongs to.
2. The agent becomes the attack surface
Microsoft’s recurring argument was that autonomous agents are now an enterprise attack surface, and that identity is the unsolved piece. Agents, Nadella said, “require their own identities, access controls, even when they’re working on your behalf.” That rationale is why Microsoft is extending its Entra identity, Defender and Purview tools to work not just on Azure but on rival clouds as well.
In a demonstration, Microsoft deployed a locally built agent to its Foundry platform. The agent carried its own identity and licenses, worked alongside employees in a Teams chat, and required administrator approval, with admins able to monitor or block it at any time. Microsoft also showed how additional guardrails could block personal data from leaking into the agent’s tool calls.
Every agent in an organization needs to be managed with the same rigor as users, apps, and devices, Nadella said.
Microsoft’s edge is its identity-and-endpoint footprint within the IT stack. The open question is whether customers will see routing every agent through one vendor’s stack as governance or lock-in.
3. Building the cage into the operating system
Nadella used his keynote to debut a new process isolation technology called Microsoft Execution Containers (MXC), which is meant to let Windows isolate and contain untested agents.
On stage, OpenClaw creator Peter Steinberger ordered an autonomous AI agent to delete every file on a Windows desktop. The agent tried and failed multiple times, and the files survived inside a sandbox the operating system refused to open.
“What makes MXC so powerful is also what makes companies a bit nervous,” Steinberger said. Watching the agent “try to delete all your desktop files and just fail made me really happy, because six months ago it totally would have worked.”
The point was that AI agents can be managed securely, albeit in cages.
Nadella described MXC as “a new policy layer that lets Windows apply isolation and containment using OS-native primitives.” It is a way, he said, to box in agents that “generate and run code dynamically,” regardless of who built them.
MXC is still in development, with a preview release slated for July.
4. A contest over who writes the rules
Microsoft also used Build to stake a claim on how agents get governed industry-wide, releasing two open-source projects. The first is the Agent Control Specification, meant to standardize where and how limits are applied inside an agent’s decision loop. The second, which works as part of that control, is a new framework called ASSERT, for policy-driven safety evaluation.
The ASSERT specification allows security teams to define enforceable guardrails they already have or want. For example, a policy can require agents to seek human approval before high-stakes actions (such as giving a customer a $500 refund) and to disclose to users that they are dealing with an agent.
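To make those two example guardrails concrete, here is a minimal Python sketch of how such a policy check might look. It does not use the ASSERT API or syntax, which the keynote did not detail. Every name in it (`Action`, `evaluate`, `APPROVAL_THRESHOLD_USD`) is hypothetical, and the $500 cutoff is an assumption borrowed from the refund example.

```python
from dataclasses import dataclass

# Hypothetical names throughout; this is an illustration, not the ASSERT API.
APPROVAL_THRESHOLD_USD = 500  # assumed cutoff, taken from the refund example


@dataclass
class Action:
    kind: str                         # e.g. "refund"
    amount_usd: float = 0.0
    human_approved: bool = False      # has a person signed off on this action?
    disclosed_as_agent: bool = False  # was the user told they are talking to an agent?


def evaluate(action: Action) -> list[str]:
    """Return the guardrails this action violates; an empty list means it may proceed."""
    violations = []
    is_high_stakes = (
        action.kind == "refund" and action.amount_usd >= APPROVAL_THRESHOLD_USD
    )
    if is_high_stakes and not action.human_approved:
        violations.append("needs_human_approval")
    if not action.disclosed_as_agent:
        violations.append("must_disclose_agent")
    return violations
```

In this sketch, a $500 refund that has been disclosed but not approved returns `["needs_human_approval"]`, and the same action returns an empty list once `human_approved` is set.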
The move comes alongside Anthropic’s Model Context Protocol and a bevy of competing agent-identity efforts. Critics point out that an open specification can offer genuine interoperability or serve as a bid to define the governance layer on its author’s terms, whether that author is Microsoft, Anthropic or another vendor.
With no standard yet established, the ambiguity over who defines the specification is not a small point.
5. Locking down the data agents run on
The security thread reached down to the silicon.
In the keynote’s opening conversation, Nadella and NVIDIA chief executive Jensen Huang outlined an NVIDIA-Microsoft co-engineered stack designed for confidential agentic workloads and based on NVIDIA’s Grace Blackwell and Vera Rubin chip platforms.
The AI “supercomputing stack” includes Grace Blackwell for training and reinforcement learning, and Vera Rubin for running large fleets of agents. The security emphasis is on end-to-end encryption for workloads running on a “low-latency CPU architecture designed for agents.”
He described it as a platform built for a cloud where autonomous agents from rival customers share the same physical hardware. Workloads, he said, are encrypted not only in storage and in transit but “also encrypted in use.”
Protecting data at rest and in motion is routine. Keeping it encrypted while a processor is actively working on it, an approach known as confidential computing, becomes a high-wire act once agentic computing is added to the mix.
“Every step of the agentic compute path — storage, memory, and execution — is encrypted,” Huang said. The technology could swing the doors open wide for agentic systems to run in highly regulated environments and industries such as healthcare, finance, and critical infrastructure where “encrypted in use” is table stakes.
No product was demonstrated and no release date given.
With Build 2026 ending today, the takeaway is clear: Microsoft is positioning itself to secure the agentic era at every layer it owns, from chip to cloud to desktop.
The question is whether customers decide that letting one company secure every layer is a bargain or a bind. The race now is whether the defenses can keep arriving as fast as the agents do.
Photo courtesy of Microsoft by Dan DeLong