AI is not just making software development faster. It is turning the audit trail into an expense report written by AI, approved by AI and submitted with no receipts. Good luck defending that after the breach.
As developers use AI tools to write, review and accelerate code, application security teams are being asked to answer questions many organizations are not yet equipped to answer. What code shipped? Which tools or models touched it? Which findings were ignored? Which AI components are embedded in the application? And who accepted the risk before release?
Those trends are creating what Checkmarx describes as an AI Catch-22 for application security. AI is helping teams create, review and ship code faster than many security tools, QA processes and governance models can validate it. The antidote often proposed is more AI in the same pipeline. What could go wrong?
The risks tied to getting AppSec wrong are not theoretical. MOVEit showed how one exploited application flaw could cascade into mass downstream data theft. The XZ Utils backdoor showed how a trusted open-source component could become a near-miss supply chain catastrophe. The Polyfill.io compromise showed how widely embedded web code could be turned against thousands of sites.
Bad code happens: Deal!
For AppSec teams, the issue is no longer only whether vulnerable code gets shipped. It is whether the organization can reconstruct what entered the pipeline, what security found, what was waived and who accepted the risk before release.
That accountability gap runs through Checkmarx’s latest research and product releases focused on the future of application security, hybrid static application security testing and AI asset inventory.
The company’s 2026 Future of Application Security report found that 75% of organizations knowingly deploy vulnerable code at least some of the time. It also found that 95% of CISOs say they have felt pressure to delay or suppress compliance-related security findings when business deadlines are at stake.
Scapegoat speedbump
Those numbers are not just a warning about vulnerable code. They point to a larger problem: AI-assisted development is moving faster than many organizations’ ability to document, prioritize and defend the decisions that put code into production.
“It’s a common misunderstanding among industry insiders, and especially executives who aren’t in the weeds every day doing security stuff, that it’s fundamentally a tools problem,” said Darren Meyer, security research advocate at Checkmarx, in an interview with Security Point Break. “It’s not fundamentally a tools problem. It’s fundamentally an organization, process and people problem.”
The tools are often present. The evidence chain is weaker.
Checkmarx found that 96% of developers have AI security tooling in their integrated development environments, yet only 18% apply security continuously as code is written. The result is a software factory in which AI can help create code, security tools can flag risk, and business pressure can still push vulnerable code through the release process.
The new AppSec question: Can you prove what happened?
Application security has long been framed as a tension between speed and security. Product teams want releases shipped. Developers are measured on delivery. Security teams are expected to stop critical flaws without becoming the department of no.
AI changes that equation by making the handoff between writing, reviewing, approving and shipping code less visible.
In some organizations, Meyer said, production code is now being submitted by people who are not trained software engineers but can generate working code with AI tools. Those submissions then enter review processes already under time pressure.
Peer review, pull request review and security review all become compressed. In some cases, AI is used not only to write code but to review it, creating a path where software can move from idea to production without enough human examination.
“Code can go from idea to production without anyone ever evaluating it properly and asking, ‘Is this actually what we want to ship?’” Meyer said.
That is not only a code-quality issue. It is an audit issue.
If a vulnerability later surfaces, security and engineering leaders may need to reconstruct whether the flaw came from AI-generated code, an open-source component, an approved model, an unauthorized AI library, a missed scanner finding or a risk exception made under release pressure.
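One way to picture the evidence such a reconstruction would need is a per-release record that ties each finding to its origin, its status and a named risk owner. The sketch below is illustrative only: the class names, fields and sample values are hypothetical and do not come from Checkmarx or any specific tool.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Finding:
    """One scanner finding and the decision made about it (hypothetical schema)."""
    finding_id: str
    source: str                          # e.g. "ai-generated", "open-source"
    status: str                          # "fixed", "open" or "waived"
    accepted_by: Optional[str] = None    # who accepted the risk, if waived
    accepted_on: Optional[date] = None   # when the risk was accepted


@dataclass
class ReleaseRecord:
    """Evidence kept per release so decisions can be reconstructed later."""
    release: str
    ai_tools_used: List[str] = field(default_factory=list)
    findings: List[Finding] = field(default_factory=list)

    def unowned_waivers(self) -> List[Finding]:
        """Waived findings with no named risk owner."""
        return [f for f in self.findings
                if f.status == "waived" and not f.accepted_by]


# Sample data is invented for illustration.
record = ReleaseRecord(
    release="2.4.0",
    ai_tools_used=["example-code-assistant"],
    findings=[
        Finding("F-1", "ai-generated", "fixed"),
        Finding("F-2", "open-source", "waived", "j.doe", date(2026, 1, 15)),
        Finding("F-3", "ai-generated", "waived"),
    ],
)
gaps = record.unowned_waivers()
print([f.finding_id for f in gaps])  # prints ['F-3']
```

A waived finding with no owner, like F-3 here, is the kind of gap that is hard to explain after an incident.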
Meyer said the business upside of AI-assisted development often appears before the security cost does. Faster releases, lower friction and more output show up immediately. Security debt, brittle code and missed review decisions may surface months later, when the people who approved the shortcut or understood the context may have moved on.
Scanner noise weakens the evidence trail
The audit-trail problem is not limited to AI-generated code. It also shows up in the findings security tools produce.
Checkmarx is positioning its new hybrid SAST engine as one answer. The engine combines deterministic rules-based scanning, a purpose-tuned large language model and a Finding Analysis Engine designed to confirm true positives and suppress false positives before findings reach developers.
The company said the hybrid engine achieved an F1 score of 0.64 in testing across seven production codebases, compared with a 0.20 average across competing approaches it evaluated, while reducing false positives by 60%.
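For readers unfamiliar with the metric, an F1 score is the harmonic mean of precision (the share of reported findings that are real) and recall (the share of real issues that get reported). The short sketch below shows the standard calculation. The counts are invented for illustration and are not Checkmarx's test data.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall from raw counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical scanner results: same true positives and misses,
# but the second run reports far fewer false positives.
noisy = f1_score(tp=40, fp=100, fn=10)
quieter = f1_score(tp=40, fp=40, fn=10)
print(round(noisy, 2), round(quieter, 2))  # prints 0.42 0.62
```

Because F1 penalizes both noise and misses, cutting false positives raises the score only if real findings are not lost in the process.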
The larger issue is trust.
Traditional SAST can cover large codebases and known vulnerability patterns, but it can produce noisy results. AI-based analysis can reason about code in ways rules-based scanners cannot, but AI alone may miss too much.
“Using AI to do security work has some problems,” Meyer said. “It tends to be very good at certain things, but it tends to not be very predictable, and it tends to miss a lot of things.”
Traditional scanning has the opposite weakness.
“Well-established SAST approaches have the opposite problem,” Meyer said. “They surface a lot of things, but a lot of it is noisy. They cover a lot, but their awareness isn’t there.”
False positives are often treated as a developer-experience problem. In an AI-driven release pipeline, they are also an accountability problem. If a tool floods developers with hundreds of low-confidence findings, teams may ignore the output or delay triage. If leaders then make risk decisions based on raw finding counts, they may not know which vulnerabilities are exploitable, which have been fixed and which were accepted.
“If you say there’s a security issue and there is not, you are wasting the organization’s time,” Meyer said.
False negatives create the reverse problem: hidden risk that reaches production without being acknowledged.
AI inventory pushes SBOMs into new territory
The next visibility gap is the AI layer itself.
Traditional software bills of materials were built to identify software packages, libraries and dependencies. They were not designed to capture AI models, agents, Model Context Protocol servers, AI services, SDKs and other components that increasingly shape how applications behave.
Checkmarx’s AI Inventory release is aimed at that gap. The company said the capability identifies AI components across repositories, traces them to specific files and line numbers, and generates AI Bills of Materials, or AI-BOMs, in the CycloneDX 1.7 format.
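To make the idea concrete, here is a rough sketch of what a single AI-BOM entry could look like when expressed as CycloneDX-style JSON. It is not output from Checkmarx's product. The model name, file path and line number are hypothetical, and the field names follow the CycloneDX component and evidence objects and should be checked against the published 1.7 schema before use.

```python
import json

# Hypothetical AI-BOM with one model component traced to a file and line.
ai_bom = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.7",
    "version": 1,
    "components": [
        {
            "type": "machine-learning-model",
            "name": "example-sentiment-model",   # invented name
            "version": "1.0.0",
            "evidence": {
                "occurrences": [
                    # invented location in a repository
                    {"location": "src/scoring/model_loader.py", "line": 27}
                ]
            },
        }
    ],
}


def occurrences(bom):
    """Yield (component name, file, line) for each recorded occurrence."""
    for comp in bom.get("components", []):
        for occ in comp.get("evidence", {}).get("occurrences", []):
            yield comp["name"], occ.get("location"), occ.get("line")


print(json.dumps(ai_bom, indent=2))
for row in occurrences(ai_bom):
    print(row)
```

The point of the structure is traceability: each AI component is listed by type and tied back to where it appears in the codebase.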
“Instead of just looking at, ‘I have adopted a library that does AI stuff,’ it looks at things that SBOMs can’t see,” Meyer said. “What models and weights am I using? Am I using something that has provenance? Am I using something that has been approved? What agents am I using? What services am I using?”
That turns AI inventory into more than an asset-management exercise. It becomes a way to prove what AI components are in production, where they came from, whether they were approved and whether they can be defended to auditors, customers and regulators.
For CISOs, the immediate question is not simply whether developers are using AI. In many organizations, they already are. The more important question is whether security, AppSec and engineering leaders can reconstruct the path from AI-assisted development to production release.
“The problem of doing security at scale at the increasing pace of development is genuinely hard, and it is getting harder,” Meyer said. “Many organizations have seen this coming slowly over the past decade or so and, for budgetary reasons or other reasons, kicked the can down the road. AI means we’re running out of road extremely quickly.”
The problem is no longer only whether code is vulnerable. It is whether the company can prove how the code got there, what security knew before it shipped and who decided the risk was acceptable.