Measured workflow evidence

Measured proof for the contract gate.

Start here for the measured gate evidence. In fixed public AI-patch tasks, DriftFence caught approved-behavior changes before merge while normal repo checks still passed. The first public tasks were chosen because they are CI-visible and costly to change silently.

  • What this shows: the contract gate caught changed approved behavior before merge.
  • Where it started: public tasks with clear behavior boundaries and existing CI checks.
  • How to use it: pick one protected workflow, store approved behavior in Git, and require approval when CI sees behavior drift.
Method guardrail: the model prompt did not mention DriftFence or include DriftFence files.
Run setting: GPT-5.5 at extra-high reasoning on fixed public tasks.

What was measured.

Start with the AI-written patch runs, then use the supporting replay examples for additional workflow patterns.

AI patch runs first, supporting replays second

release-it blocked private-package changes.

In release-it, DriftFence blocked 3 of 6 model-written private-package patches while the same standard CI checks still passed. The blocked patches changed release-it's private-package behavior before merge.

  • 6 / 6 standard CI checks passed
  • 3 / 6 blocked
  • 3 / 3 blocked and upheld on review
Blocked case

Private-package publishing behavior

DriftFence blocked three test-passing patches here, and a separate blinded follow-up review judged all three meaningful.

What changed

Release behavior changed while the usual checks stayed green.

Here is the clearest example. AI-written code changed approved release behavior without tripping the repo's usual checks first.

Full case

See the full release-it case.

In release-it, one representative patch changed approved release behavior while the benchmark command still passed; DriftFence reported the first difference.

verdaccio blocked deprecation-merge changes.

In verdaccio, DriftFence blocked 5 of 5 test-passing deprecation-merge patches while the same standard CI checks still passed. This second repo showed the same behavior shape on package metadata.

  • 5 / 5 standard CI checks passed
  • 5 / 5 blocked
  • 5 / 5 blocked and upheld on review
Blocked case

Deprecation metadata merge

DriftFence blocked every patch in this case while the same repo checks stayed green, and a separate blinded follow-up review judged all five meaningful.

What changed

Package metadata changed without a normal CI failure.

The pattern was not limited to one release-it workflow: in verdaccio, package metadata changed while the usual repo checks still passed.
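The deprecation-merge failure shape can be illustrated with a minimal sketch. This is not verdaccio's actual code; the merge helpers and metadata fields are assumptions chosen to show why a test that checks only the new version passes in both the approved and the drifted case.

```python
def approved_merge(existing: dict, incoming: dict) -> dict:
    # Approved behavior (assumed): a new publish preserves deprecation
    # messages already recorded for other versions.
    merged = dict(existing)
    merged.update(incoming)
    return merged

def drifted_merge(existing: dict, incoming: dict) -> dict:
    # Drifted behavior (assumed): the patch rebuilds metadata from the
    # incoming payload alone, silently dropping earlier deprecations.
    return dict(incoming)

existing = {"1.0.0": "deprecated: security issue"}
incoming = {"1.1.0": ""}

ok = approved_merge(existing, incoming)
drift = drifted_merge(existing, incoming)

# A CI test that only inspects the newly published version passes either way:
assert ok["1.1.0"] == "" and drift["1.1.0"] == ""
# A behavior snapshot of the full version mapping catches the drift:
assert ok != drift
```

The point is the gap between the two assertions: the usual check looks at the new version, while the approved-behavior comparison looks at the whole mapping.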

Similar measured cases did not trigger DriftFence.

The same measured set includes release automation and package-publishing cases DriftFence did not flag, so the evidence does not rely only on successful catches.

release-it similar cases

Two other release-it cases did not trigger DriftFence.

In the prerelease publish case, DriftFence blocked 0 of 5 test-passing patches; in the subdirectory version case, it produced no standalone block before one normal CI failure occurred.

verdaccio similar case

Proxy selection did not trigger DriftFence.

In the proxy-selection case, DriftFence blocked 0 of 5 test-passing patches while the same repo checks still passed.

Measured boundaries

The full measured set stays visible.

The blocked cases were measured alongside similar cases that did not trigger DriftFence.

Evidence boundary and supporting material.

These results focus on cases where DriftFence caught behavior changes while the usual repo checks stayed green, and they keep the surrounding boundaries visible too.

Measured examples

Specific pre-merge release automation and package-publishing results.

AI-written package-release and package-metadata changes altered approved behavior while the repo's own test checks still passed.

Buyer takeaway

Use the results to choose one workflow.

These cases show the workflow-firewall loop on measured public examples. Use them to decide whether one private workflow has enough risk and CI coverage for a paid pilot.

Supporting material

For the release-it deep dive, open the release-it experiment. For repo-history examples, use the replay pages linked above and the public benchmark results.

Next step

Move from result review to one real workflow.

If your team has one critical workflow whose behavior must be preserved, start with a workflow fit review. It captures the workflow, CI gap, and stack context in one private email so the first reply can be specific.

  • Start with your current workflow, the risky behavior, and the CI slice you already trust.
  • Share a repo, PR, or release path and keep the discussion in the tools your team already uses.
  • The fastest next step is one workflow, one CI gap, and one fit review.
Bring this context

One risky workflow and one CI gap.

That is enough to tell whether DriftFence fits your repo or whether the workflow needs a different approach first.

  • Critical workflow
  • Relevant CI slice
  • Why it matters
Commercial start

Team from $750/month. Pilots from $15,000.

DriftFence pricing is built around the first protected workflow, not per-seat sprawl or metered standard CI.

Private path

Start with a workflow fit review.

Use the fit-review page when you want to discuss a real repo, pilot scoping, or a private rollout path.