Audit & measurement

The load-bearing bug

2026-07-03 · ~4 min read

Sometimes a bug isn't a mistake you should fix, but a decision nobody made on purpose. Today we put the entire source code – engine and bot layer – through a fresh-eyes audit and shipped two improvements. The more interesting of the two started out as a supposed error and ended with a clear lesson: switching it back on would have hurt. Switching it off gained +20.9 Elo.

An audit with fresh eyes

We had the whole codebase read once as if seeing it for the first time – an AI-assisted audit across the entire source, carried out with Anthropic's Claude model Fable while it was available to us. Fresh eyes catch the things your own gaze has long stopped noticing: assumptions never questioned, and switches that are set differently than you think. Exactly two such findings made it into the engine.

The forfeit bug

The first finding was a pure correctness matter. The check for the 50-move rule was guarded incorrectly at the root of the search. In the rare window where a game runs past 50 reversible moves without anyone claiming a draw, the search could come back from the root without any move to play. In practice that means: clock runs out, game lost – to a self-inflicted edge case, not to anything on the board. Rare, but real. Fixed.
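
To make the failure mode concrete, here is a minimal sketch, not the engine's actual code. The names Position, halfmove_clock, search and the guard_root switch are all hypothetical, and the "search" simply picks the first legal move. It only illustrates the assumption that a fifty-move draw cutoff is harmless inside the tree, where only a score is needed, but must not fire at the root, where a move is needed.

```python
from dataclasses import dataclass, field

DRAW_SCORE = 0


@dataclass
class Position:
    """Toy stand-in for a board; every field here is hypothetical."""
    halfmove_clock: int                      # plies since the last capture or pawn move
    legal_moves: list = field(default_factory=list)


def is_fifty_move_draw(pos):
    # 50 moves by each side = 100 plies without a capture or pawn move.
    return pos.halfmove_clock >= 100


def search(pos, ply, guard_root=True):
    """Return (score, move). With guard_root=True the draw cutoff is skipped at ply 0."""
    at_root = ply == 0
    if is_fifty_move_draw(pos) and not (at_root and guard_root):
        # Inside the tree a bare draw score is enough; at the root it would
        # leave the engine with nothing to play.
        return DRAW_SCORE, None
    if not pos.legal_moves:
        return DRAW_SCORE, None
    # Placeholder for real move ordering and recursion: take the first legal move.
    return DRAW_SCORE, pos.legal_moves[0]
```

With guard_root=False the sketch reproduces the symptom: a position past the 100-ply mark yields no move at the root even though legal moves exist.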

The load-bearing bug

The second finding is the real story. A search technique called singular extensions – it searches a move deeper when that move looks clearly better than every alternative in the position – was effectively switched off on the main line by a flag error. The intuitive reflex: found the error, so switch it back on.
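
For readers who have not met the technique, a rough sketch of the usual decision rule follows. It is a simplification under stated assumptions, not our implementation: the function name, the margin_per_ply value and the enabled flag are hypothetical, and the alternatives' scores are passed in directly where a real engine would run a reduced-depth search that excludes the candidate move.

```python
def singular_extension(candidate_score, alternative_scores, depth,
                       margin_per_ply=2, enabled=True):
    """Return 1 extra ply if the candidate move looks singular, else 0.

    candidate_score: score of the move under test (typically the hash move).
    alternative_scores: scores of all other moves; stands in for the
        reduced-depth verification search a real engine would run.
    margin_per_ply: hypothetical tuning constant, in centipawns per ply.
    """
    if not enabled or not alternative_scores:
        return 0
    # The bar every alternative has to stay below for the candidate to count
    # as "the only good move".
    singular_beta = candidate_score - margin_per_ply * depth
    if max(alternative_scores) < singular_beta:
        return 1
    return 0
```

The verification step is what makes the technique expensive: it costs search time at every node where it is tried, whether or not an extension follows.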

But for things like this we don't trust intuition – we trust measurement. An SPRT test – two versions play thousands of games against each other until the result is statistically unambiguous – showed something unexpected:

Switching the technique back on cost about 56 Elo. The "bugfix" made the engine noticeably weaker.
The counter-test – switching the technique off entirely – gained +20.9 Elo, statistically crystal-clear over 2133 test games.
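
To put those numbers in perspective, the standard logistic Elo model converts a rating difference into an expected score per game. The sketch below uses that textbook formula together with Wald's SPRT stopping bounds. The error rates alpha = beta = 0.05 are an assumption for illustration, since the settings of our test are not stated here, and the function names are made up for this example.

```python
import math


def expected_score(elo_diff):
    """Expected score per game (1 = win, 0.5 = draw) under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def elo_from_score(score):
    """Inverse of expected_score; score must lie strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def sprt_bounds(alpha=0.05, beta=0.05):
    """Wald's log-likelihood-ratio bounds: stop and accept H1 above the upper
    bound, stop and accept H0 below the lower bound, otherwise keep playing."""
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    return lower, upper
```

Under this model expected_score(20.9) comes out at roughly 53 percent of the points, and expected_score(-56) at roughly 42 percent. Neither edge is visible in a handful of games, which is why the test has to run into the thousands.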

So the bug was load-bearing: what looked like an oversight was in truth holding the engine up. At clrsrc's speed class this bit of extra cleverness is net harmful – the engine spends the saved time better on a uniformly deeper search across all moves. Less special-casing, more substance. Which fits the idea behind the name remarkably well: clear source – the lean, pure source.

A more robust bot

In parallel we hardened the bot layer – eleven small reliability fixes that add no Elo but keep games from being lost to infrastructure hiccups:

Crash recovery in the game loop, so a single error doesn't drag the whole game down with it.
A safety net that plays a legal fallback move on a malformed move instead of handing the game over without a fight.
Connection watchdogs that spot hanging reconnects and half-open streams faster.
Restart-proof daily counters and more small items from the same family.
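
The first two items can be sketched in a few lines. This is an illustration under assumptions, not the bot's code: safe_move and play_turn are hypothetical names, moves are plain UCI strings, and the fallback policy (first legal move) is chosen only to keep the example short.

```python
def safe_move(candidate, legal_moves):
    """Return candidate if it is legal, otherwise a legal fallback.

    Returns None only when there are no legal moves at all.
    """
    if candidate in legal_moves:
        return candidate
    if not legal_moves:
        return None
    # Hypothetical fallback policy: any legal move beats forfeiting the game.
    return legal_moves[0]


def play_turn(get_engine_move, legal_moves, log=print):
    """Ask the engine for a move; survive both a crash and a malformed answer."""
    try:
        candidate = get_engine_move()
    except Exception as exc:  # broad on purpose: one error must not end the game
        log(f"engine error, using fallback move: {exc!r}")
        candidate = None
    return safe_move(candidate, legal_moves)
```

A call such as play_turn(lambda: "zzzz", ["e2e4", "d2d4"]) would return "e2e4" instead of sending a malformed move. The fallback move may well be bad, but the game continues and the next search gets a chance to recover.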

What remains

At the end of the day: +20.9 Elo, a forfeit risk gone, and a noticeably more robust bot – all live. But the nicest takeaway isn't a number, it's an attitude: measure, don't assume. The obvious fix would have been the wrong one; only the test exposed the load-bearing bug for what it was. Sometimes less search cleverness really is more playing strength.

📅 Save the date: On 2026-07-20 at 14:00 (CEST), clrsrc_lc0 plays the International Chess Day Team Battle (3+0, rated). Feel free to watch.

You can watch the engine live on the Live page.
