The Real Lesson From the Delve Story: Automate the Paperwork, Not the Certificate

Hands up if you've seen this in tubes of London. It belonged to another company, but I AI-edited it to explain something.

Using Delve isn't a compliance strategy - London Tube ad

Automation is cool until you cross a line. So there's a ceiling on how much a compliance startup can grow and be automated. We saw the ceiling, but the ones closer to it are still dangerous, and we are overlooking them.

The article about Delve is compelling because it speaks to a deeper failure in the compliance market. The real problem is not just one company, one leak, or one vendor that may have pushed automation too far. The real problem is that too many buyers now want compliance to behave like software: instant, low-friction, visually clean, and cheap.

That is how you end up with trust pages instead of trust, dashboards instead of discipline, and paperwork that looks complete while the underlying security program is still hollow.

The wrong lesson from this story is that automation is useless. It is not. The right lesson is that automation must be used in the right layer. It can help collect artifacts, monitor environments, run tests, normalize evidence, and reduce manual drudgery. But it cannot replace independent judgment. It cannot replace human consulting. And it absolutely should not replace independent certification. %10 lesser Delve practices might still be as dangerous as Delve's.

Compliance was never supposed to be a form-filling exercise

A SOC 2 report has value because it is supposed to provide assurance to outsiders about controls relevant to security, availability, processing integrity, confidentiality, or privacy. The whole point is that someone relying on that report should be able to trust it. That is also why AICPA guidance emphasises that accountants performing attestation work must be independent in both fact and appearance, and why the profession frames those standards as essential for users to have confidence that reports were not unduly influenced by management or other interested parties.

That principle matters because compliance is not just clerical work. A real consultant does not merely move evidence from one folder to another. A real consultant decides what matters, what is missing, what is misleading, what is immature, what needs remediation first, and what management is claiming that the environment cannot yet support. They ask the annoying questions. They challenge scope. They force specificity. They tell founders that "mostly implemented" is not the same thing as implemented.

Software can assist that process. It cannot own it.

If a company wants real assurance, it needs humans in the loop at the places where judgment lives: control design, evidence review, exception handling, remediation planning, readiness assessment, and executive accountability. Remove the human layer and you do not get better compliance. You get cleaner-looking theater.

Point-in-time compliance is already weak. For AI systems, it is dangerously weak.

Even traditional compliance has always suffered from snapshot thinking. A company gathers artifacts, survives an audit window, and then treats the certificate as if it represents the next twelve months. That was already a bad habit. In AI systems, it becomes absurd.

AI agents do not sit still. Prompts change. tools change. models change. integrations change. runtime behavior changes. attack surfaces change. Third-party components change underneath you.

NIST's AI Risk Management Framework and its Generative AI Profile explicitly push toward ongoing monitoring, periodic review, continuous monitoring of third-party GAI systems in deployment, and regular review of safety and security guardrails. The OWASP AI Testing Guide makes the same broader point from another angle: AI trustworthiness requires more than conventional security testing because these systems face adversarial manipulation, sensitive data leakage, unsafe or excessive agency, and model drift over time.

That is why companies should stop thinking about "compliance" as a one-time documentation sprint and start thinking in terms of continuous assurance.

For offensive testing, tools in the emerging AI security stack can help. PenClaw AI describes itself as an AI pentester agent operated through workplace messaging tools. Audn AI says it runs adversarial security tests against AI agents and produces evidence-backed vulnerability reports with remediation guidance. Audn Blue positions itself as a runtime protection layer designed to block jailbreaks, deep-fakes, and data leaks in real time.

That is the right direction.

Red teaming belongs in the continuous assurance layer. So do blue-team runtime controls such as policy enforcement, tool-use restrictions, exfiltration detection, voice-agent firewalls, and live prompt-defense systems. Those controls do not exist to decorate a trust center. They exist to keep systems safe when real users, real attackers, and real edge cases hit production.

Fake evidence is the real original sin

People talk about fake certificates as if the certificate is where the fraud begins. It is not.

The fake certificate is just the final printed artifact in a chain that broke much earlier.

The real break happens when evidence is fabricated, templated into fiction, accepted without challenge, or mass-produced without regard to whether the underlying work occurred. A certificate becomes meaningless because the evidence behind it is meaningless. And the evidence becomes meaningless when nobody actually did the work that evidence is supposed to reflect.

That is why fake evidence is at least as dangerous as a fake certificate. In practice, it is more dangerous, because it poisons everything downstream.

It poisons audits because the auditor is no longer assessing reality.

It poisons enterprise sales because questionnaires get answered from fantasy rather than practice.

It poisons internal governance because management starts believing its own dashboard.

It poisons security culture because teams learn that passing matters more than being right.

Evidence is hard to produce for a reason. Real access reviews are hard. Real backup restoration exercises are hard. Real incident response rehearsals are hard. Real device-management evidence is hard. Real data-mapping is hard. Real control ownership is hard. The difficulty is not a bug in the system. The difficulty is the price of reality.

Once a vendor promises to remove that cost entirely, the buyer should ask a simple question: have they removed the work, or only removed the visibility of the work?

The future is not "no automation." The future is "automation in the right place."

I do believe a huge part of compliance consulting will be automated.

Evidence collection can be automated.

Evidence normalization can be automated.

Control mapping can be automated.

Reminder workflows can be automated.

Questionnaire drafting can be automated.

Change detection can be automated.

Continuous technical testing can be automated.

Even large parts of evidence packaging and readiness preparation can be automated.

But there is a line that should not be crossed.

Independent certification should not be automated.

Auditor judgement should not be automated.

Risk acceptance should not be automated.

Exception adjudication should not be automated.

Executive assertions should not be automated.

Governance accountability should not be automated.

NIST's Generative AI Profile is useful here because it makes a point many founders do not want to hear: the quality of AI red-team outputs depends on the background and expertise of the red team, and those results should receive additional analysis before being folded into governance and decision-making. OWASP's testing guidance similarly treats AI testing as multidisciplinary, not just a button-clicking security scan.

That is the model I believe in: automate collection, keep interpretation human, and keep attestation independent.

What companies should actually buy

Companies should buy less compliance theater and more assurance.

They should hire human consultants to scope systems honestly, identify what is missing, and force the organization to face uncomfortable truths. No, you can't get ISO 27K in 3 weeks. Any investor forcing you to do so also don't know it. That's why this is not Delve's founders' issue, all alone, it's an ecosystem issue.

They should run continuous adversarial testing for AI systems, including red teaming of voice and text agents, prompt-injection pathways, tool misuse, social-engineering edge cases, and data-exfiltration scenarios.

They should deploy blue-team runtime controls that can actually block unsafe behavior in production rather than merely describe good intentions in a document.

They should use software to reduce administrative pain, not to manufacture certainty.

And when the time comes for certification, they should demand independence, specificity, and a clear separation between the party helping them prepare and the party signing the opinion.

That is slower than fantasy. It is also the only thing that scales without collapsing into liability.

Closing

The market does not need fewer security tools. It needs fewer lies about what those tools are doing.

The future compliance company will not win by pretending that evidence no longer requires labor. It will win by helping organizations capture real evidence faster, validate controls continuously, and expose weaknesses before auditors, customers, or attackers do. So a really good boutique firm that knows the best tools and constantly build and improves it's system will win.

Automate the plumbing. Keep the judgement human. Keep the certificate independent.

Anything else is just paperwork wearing a security costume.