Passing CI did not prove the release operation still behaved the same.
We tested DriftFence on four fixed release-it coding tasks with
GPT-5.4. release-it is an
open-source release automation tool for versioning, tagging, and
publishing npm packages. Each task stands in for one concrete
release operation inside release-it. In the two
publish-behavior operations, DriftFence would have blocked 5 of 12
model-written code changes before merge even though the relevant
tests still passed. Independent review later judged 4 of those 5
blocked patches worth review or rejection. In the other two
operations from the same experiment, DriftFence raised no flags
on any of the 11 test-passing patches.
This is evidence for one release-it operation use case. It does not yet show
broad efficacy across all repos or all operations. It shows what
happened when the same model was run from the same starting state
on four fixed tasks.