Measured workflow evidence

Measured proof for the contract gate.

Start here for the measured gate evidence. In fixed public AI-patch tasks, DriftFence caught approved-behavior changes before merge while normal repo checks still passed. The first public tasks were chosen because they are CI-visible and costly to change silently.

See the main AI patch example
Check workflow fit

What this shows: The contract gate caught changed approved behavior before merge.

Where it started: Public tasks with clear behavior boundaries and existing CI checks.

How to use it: Pick one protected workflow, store approved behavior in Git, and require approval when CI sees behavior drift.
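As a rough illustration of that loop, the Python sketch below compares observed behavior with an approved snapshot of the kind you might commit to Git, then reports the first difference. It is not DriftFence's implementation or file format: the function names, the JSON snapshot shape, and the `private` and `publish` fields are assumptions made up for this example.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class GateResult:
    passed: bool
    first_difference: Optional[str]  # path of the first drifted value, if any


def first_difference(approved, observed, path="$"):
    """Return the path of the first place two JSON-like values differ, else None."""
    if isinstance(approved, dict) and isinstance(observed, dict):
        for key in sorted(set(approved) | set(observed)):
            if key not in approved or key not in observed:
                return f"{path}.{key}"
            diff = first_difference(approved[key], observed[key], f"{path}.{key}")
            if diff is not None:
                return diff
        return None
    if isinstance(approved, list) and isinstance(observed, list):
        if len(approved) != len(observed):
            return f"{path}.length"
        for index, (a, o) in enumerate(zip(approved, observed)):
            diff = first_difference(a, o, f"{path}[{index}]")
            if diff is not None:
                return diff
        return None
    return None if approved == observed else path


def contract_gate(approved_json: str, observed: dict) -> GateResult:
    """Pass only when observed behavior matches the approved snapshot exactly."""
    approved = json.loads(approved_json)
    diff = first_difference(approved, observed)
    return GateResult(passed=diff is None, first_difference=diff)


# Hypothetical snapshot (as stored in Git) and hypothetical behavior seen in CI.
approved_snapshot = '{"private": true, "publish": false}'
observed_behavior = {"private": True, "publish": True}

result = contract_gate(approved_snapshot, observed_behavior)
print(result.passed, result.first_difference)  # False $.publish
```

In CI, a failed result like this is where the merge would wait for explicit approval. In a Git-based flow, that approval could take the form of a reviewed change to the approved snapshot.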

Method guardrail: the model prompt did not mention DriftFence or include DriftFence files.

Run setting: GPT-5.5 at extra-high reasoning, writing blind patches on fixed public tasks.

release-it result: 3/6 private-package patches blocked while repo checks stayed green.

verdaccio result: 5/5 test-passing patches blocked while repo checks stayed green.

Independent review: 8/8. Every blocked patch was later upheld on review.

First measured examples

Use this page to decide whether your first workflow has the right shape: one owner, one behavior boundary, and one CI path that can exercise it.

The measured proof is the CI contract gate. A pilot validates the full agent workflow in your repo: guardrails before edits, agent revisions after failed checks, and Git review for intentional changes.

What was measured.

Start with the AI-written patch runs, then use the supporting replay examples for additional workflow patterns.

AI patch runs first, supporting replays second

`release-it`: private-package changes blocked.

In release-it, DriftFence blocked 3 of 6 model-written private-package patches while the same standard CI checks still passed. Each blocked patch changed release-it's private-package behavior and was stopped before merge.

6 / 6 standard CI checks passed
3 / 6 blocked
3 / 3 blocked and upheld on review

Blocked case

Private-package publishing behavior

DriftFence blocked three test-passing patches here, and a separate blinded follow-up review judged all three meaningful.

What changed

Release behavior changed while the usual checks stayed green.

This is the clearest example: AI-written code changed approved release behavior without tripping the repo's usual checks.

Full case

See the full `release-it` case.

In release-it, one representative patch changed approved release behavior. The benchmark command still passed, and DriftFence reported the first difference.

Open the full release-it case

`verdaccio`: deprecation-merge changes blocked.

In verdaccio, DriftFence blocked 5 of 5 test-passing deprecation-merge patches while the same standard CI checks still passed. This second repo showed the same pattern, this time on package metadata.

5 / 5 standard CI checks passed
5 / 5 blocked
5 / 5 blocked and upheld on review

Blocked case

Deprecation metadata merge

DriftFence blocked every patch in this case while the same repo checks stayed green, and a separate blinded follow-up review judged all five meaningful.

What changed

Package metadata changed without a normal CI failure.

Package metadata changed while the usual repo checks still passed, so the pattern is not limited to one release-it workflow.

Similar measured cases did not trigger DriftFence.

The same measured set includes release automation and package-publishing cases DriftFence did not flag, so the evidence does not rely only on successful catches.

release-it similar cases

Two other release-it cases did not trigger DriftFence.

In the prerelease publish case, DriftFence blocked 0 of 5 test-passing patches. The subdirectory version case produced no standalone DriftFence block; the one failure there was a normal CI failure.

verdaccio similar case

Proxy selection did not trigger DriftFence.

In the proxy-selection case, DriftFence blocked 0 of 5 test-passing patches while the same repo checks still passed.

Measured boundaries

The full measured set stays visible.

The blocked cases were measured alongside similar cases that did not trigger DriftFence.
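To see the full set in one place, the short Python sketch below tabulates the counts reported on this page. The `Case` structure and helper name are illustrative assumptions, and the subdirectory version case is left out because no per-patch counts are given for it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    repo: str
    name: str
    blocked: int       # test-passing patches DriftFence blocked
    test_passing: int  # patches that passed the repo's standard CI checks


# Counts as reported on this page.
CASES = [
    Case("release-it", "private-package", 3, 6),
    Case("verdaccio", "deprecation-merge", 5, 5),
    Case("release-it", "prerelease publish", 0, 5),
    Case("verdaccio", "proxy selection", 0, 5),
]


def triggered(cases):
    """Cases where DriftFence blocked at least one test-passing patch."""
    return [case for case in cases if case.blocked > 0]


for case in CASES:
    status = "triggered" if case.blocked else "quiet"
    print(f"{case.repo:<11} {case.name:<19} {case.blocked}/{case.test_passing} blocked  {status}")

# The blocked total lines up with the 8 patches that went to independent review.
total_blocked = sum(case.blocked for case in CASES)
print("total blocked:", total_blocked)
```

Laid out this way, two of the four counted cases triggered DriftFence and two stayed quiet.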

Examples from repo history.

These pages replay real commits against fixed approved behavior and show where DriftFence would have flagged a change while the same repo checks still passed.
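The replay idea can be sketched in a few lines of Python: hold one approved snapshot fixed, observe each commit in order, and record where the observed behavior first departs from it. This is a minimal sketch under stated assumptions, not the replay harness behind these pages. The commit names, the `observe` callback, and the `exit_code` and `message` fields are hypothetical.

```python
from typing import Callable, Iterable, List, Tuple


def replay(
    commits: Iterable[str],
    observe: Callable[[str], dict],
    approved: dict,
) -> List[Tuple[str, bool]]:
    """Compare each commit's observed behavior with one fixed approved snapshot."""
    return [(commit, observe(commit) != approved) for commit in commits]


def summarize(results: List[Tuple[str, bool]]) -> Tuple[int, int]:
    """Return (quiet commits before the first flag, total flagged commits)."""
    quiet = 0
    for _, flagged in results:
        if flagged:
            break
        quiet += 1
    flagged_total = sum(1 for _, flagged in results if flagged)
    return quiet, flagged_total


# Hypothetical approved behavior and per-commit observations.
APPROVED = {"exit_code": 1, "message": "release failed"}
OBSERVED = {
    "c1": {"exit_code": 1, "message": "release failed"},
    "c2": {"exit_code": 1, "message": "release failed"},
    "c3": {"exit_code": 1, "message": "release failed"},
    "c4": {"exit_code": 0, "message": "release failed"},  # behavior changes here
    "c5": {"exit_code": 0, "message": "release failed"},
}

results = replay(list(OBSERVED), OBSERVED.__getitem__, APPROVED)
quiet, flagged = summarize(results)
print(f"{quiet} quiet, then flagged on {flagged} commits")  # 3 quiet, then flagged on 2 commits
```

Because the approved snapshot never moves during a replay, a change keeps being flagged on every later commit until someone approves the new behavior.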

Firebase: firebase-tools replay

DriftFence started flagging App Hosting release error handling.

3 quiet, then flagged

Release error handling changed on the fourth replayed commit. The same App Hosting checks still passed in standard CI, and DriftFence kept flagging the same change on five later commits.

See the Firebase replay page

Cloudflare: workers-sdk replay

DriftFence kept flagging the same recurring Wrangler deploy change.

193 commits, same signal

Error code 10007 disappeared and stayed gone across the full replay window. The same Wrangler deploy checks still passed in standard CI.

See the Workers replay page

AWS: serverless replay

DriftFence flagged AWS dev runtime matching.

4 flagged commits

Runtime matching changed while the same AWS dev checks stayed green. Two similar on-exit cases did not trigger DriftFence.

See the Serverless replay page

AWS: configure-aws-credentials replay

DriftFence started flagging AWS role chaining output.

2 quiet, then flagged

Cross-account role chaining started reporting the assumed account instead of the source account. Two earlier commits stayed quiet in standard CI, and the OIDC and direct IAM-user controls did not trigger DriftFence.

See the AWS credentials replay page

AWS: amazon-ecr-login replay

DriftFence started flagging ECR password masking.

5 quiet, then flagged

Explicit `mask-password: true` started masking forwarded Docker password outputs. The same login-action checks still passed in standard CI, and the `mask-password: false` and `skip-logout: true` controls did not trigger DriftFence.

See the ECR login replay page

Docker: login-action replay

DriftFence started flagging Docker registry-auth redaction.

2 quiet, then flagged

Passwords supplied through `registry-auth` started being redacted with `core.setSecret`. The same login-action checks still passed in standard CI, and the saved-registry-state and standard-login controls did not trigger DriftFence.

See the login-action replay page

Docker: build-push-action replay

DriftFence started flagging Docker build argument forwarding.

1 quiet, then 2 forwarding changes

The `call` input started forwarding into `--call`, then the `allow` input started emitting one flag per value. The same build-action checks still passed in standard CI, and the existing builder control did not trigger DriftFence.

See the build-push-action replay page

Docker: setup-buildx-action replay

DriftFence started flagging Buildx unknown-driver flags.

3 quiet, then flagged

Unknown drivers stopped receiving the default buildkitd entitlement flags. The same Buildx setup checks still passed in standard CI, and the `docker-container` and `remote` controls did not trigger DriftFence.

See the setup-buildx replay page

Docker: metadata-action replay

DriftFence started flagging Docker annotation forwarding.

5 quiet, then flagged

Default OCI annotations started mirroring generated labels, and custom descriptions stopped staying null. The same metadata-action checks still passed in standard CI, and the existing labels-control case did not trigger DriftFence.

See the metadata-action replay page

Supabase: cli replay

DriftFence kept flagging Supabase orphan pruning.

8 commits, same prune change

Orphan pruning started deleting remote functions after deploy. The same functions deploy checks still passed in standard CI, and the no-orphans and invalid-slug controls did not trigger DriftFence.

See the Supabase replay page

Google: release-please-action replay

DriftFence started flagging release config forwarding.

8 quiet, then flagged

The `versioning-strategy` and `release-as` inputs started forwarding into manifest construction. The same release-action checks still passed in standard CI, and the `target-branch` and fork controls did not trigger DriftFence.

See the release-please-action replay page

prisma replay

DriftFence flagged datasource path handling.

1 flagged commit

A narrow config-path change altered datasource resolution. The same SQLite migrate diff checks still passed in standard CI, and the schema-only controls did not trigger DriftFence.

See the Prisma replay page

Evidence boundary and supporting material.

These results focus on cases where DriftFence caught behavior changes while the usual repo checks stayed green, and they keep the surrounding boundaries visible too.

Measured examples

Specific pre-merge release automation and package-publishing results.

AI-written package-release and package-metadata changes altered approved behavior while the repo's own test checks still passed.

Buyer takeaway

Use the results to choose one workflow.

These cases show the workflow-firewall loop on measured public examples. Use them to decide whether one private workflow has enough risk and CI coverage for a paid pilot.

Supporting material

For the release-it deep dive, open the release-it experiment. For repo-history examples, use the replay pages linked above and the public benchmark results.

Next step

Move from result review to one real workflow.

If your team has one critical workflow whose behavior must be preserved, start with a workflow fit review. It captures the workflow, CI gap, and stack context in one private email so the first reply can be specific.

Start with your current workflow, the risky behavior, and the CI slice you already trust.
Share a repo, PR, or release path and keep the discussion in the tools your team already uses.
The fastest next step is one workflow, one CI gap, and one fit review.

Check workflow fit
See pricing

Bring this context

One risky workflow and one CI gap.

That is enough to tell whether DriftFence fits your repo or whether the workflow needs a different approach first.

Critical workflow
Relevant CI slice
Why it matters

Commercial start

Team from $750/month. Pilots from $15,000.

DriftFence pricing is built around the first protected workflow, not per-seat sprawl or metered standard CI.

Private path

Start with a workflow fit review.

Use the fit-review page when you want to discuss a real repo, pilot scoping, or a private rollout path.