The story behind clrsrc

From an afternoon hack to a chess AI of its own.

clrsrc didn't appear overnight - but almost. In about three months the path led from a Python engine written in one afternoon, through a Rust rewrite, to a standalone engine with a self-trained neural network that plays live on Lichess today. Here is the whole story: the tech, the hardware, the method - and why none of it would have been possible without a human-AI partnership.

March–June 2026 · Rust · own NNUE · home cluster · vibe coding

Timeline

Three months, one sprint

The public engine matured from version 1.0.0 (May 25) to 1.1.1 (June 9) in just over two weeks, preceded by weeks of prototyping with its predecessor Jugernaut.

Timeline: Jugernaut (Python) in March, Rust Jugernaut, clrsrc 1.0.0 on May 25, 1.1.0 and LiRu-Bot on May 31, 1.1.1 on June 9, website on June 10, 2026. — Origins March–June 2026 · self-made graphic

The genesis

From Jugernaut to clrsrc

Jugernaut 1.0 (Python). Written in one afternoon: a classic alpha-beta search with a hand-written evaluation. The training ground for understanding chess programming.
Jugernaut (Rust). The same idea, rebuilt in Rust - considerably faster, with first experiments toward a neural evaluation.
clrsrc. The clean restart: a standalone engine with its own search and a self-trained NNUE. Not a Stockfish clone: own code, own training data, open under GPL-3.0.

About the name

Why "clrsrc"?

A small retro nod to the craft: in the early C days (Turbo C / Borland, conio.h) almost every program started with the call clrscr() - "clear screen", to wipe the console. That became the project name: clr (clear) + src (source) = "clear source" - the pure source.

The brain

How a neural network learns chess

An NNUE (Efficiently Updatable Neural Network) evaluates positions - and is trained in four steps:

Self-play → teacher labeling → training → embedding · self-made graphic

Self-play. clrsrc plays itself millions of times and collects about 200M positions across several runs.
Teacher. Stockfish 17.1 re-evaluates every position ("knowledge distillation") - as a separate process over UCI. Stockfish itself is neither shipped nor linked into clrsrc; only the computed evaluation numbers flow in as training labels.
Training. The free trainer bullet learns a compact net from this (king buckets, SCReLU).
Embedding. The finished net sits directly in the binary - a single .exe plays at full strength.
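
The teacher step can be sketched in a few lines of Python. Under the UCI protocol an engine reports its evaluation in "info" lines as either "score cp <n>" (centipawns) or "score mate <n>". The function names and the mate cap of 30,000 centipawns are assumptions for illustration, not clrsrc's actual labeling code, and starting the Stockfish process is left out.

```python
from typing import Optional, Tuple


def parse_uci_score(info_line: str) -> Optional[Tuple[str, int]]:
    """Extract ('cp', n) or ('mate', n) from a UCI 'info' line, else None."""
    tokens = info_line.split()
    if not tokens or tokens[0] != "info" or "score" not in tokens:
        return None
    i = tokens.index("score")
    # The score keyword must be followed by a kind and a number.
    if i + 2 >= len(tokens) or tokens[i + 1] not in ("cp", "mate"):
        return None
    try:
        return tokens[i + 1], int(tokens[i + 2])
    except ValueError:
        return None


def label_from_score(kind: str, value: int, mate_cp: int = 30_000) -> int:
    """Turn a score into a centipawn training label.

    mate_cp is an assumed cap: mate scores are mapped to +/- mate_cp and
    centipawn scores are clamped to the same range.
    """
    if kind == "cp":
        return max(-mate_cp, min(mate_cp, value))
    return mate_cp if value > 0 else -mate_cp
```

Only the resulting number would be stored next to the position, which matches the separation described above: the teacher runs as its own process and contributes labels, not code.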

Staying honest: a distilled net can approach its teacher, but on the learned distribution it cannot surpass it. That is exactly where the ongoing research starts - bigger nets, deeper training labels.
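
For readers new to NNUE, the following Python sketch shows what "efficiently updatable" and SCReLU mean in practice. The layer sizes, the random weights and all names (Accumulator, feature_index, N_FEATURES, HIDDEN) are assumptions chosen for readability. King buckets and quantization are left out, so this is not clrsrc's actual architecture.

```python
import random

N_FEATURES = 768   # assumption: 12 piece types x 64 squares, no king buckets
HIDDEN = 8         # assumption: tiny hidden layer so the example stays readable

rng = random.Random(0)
# Hypothetical weights; a real net would load trained values.
W_IN = [[rng.uniform(-0.5, 0.5) for _ in range(HIDDEN)] for _ in range(N_FEATURES)]
B_IN = [rng.uniform(-0.5, 0.5) for _ in range(HIDDEN)]
W_OUT = [rng.uniform(-0.5, 0.5) for _ in range(HIDDEN)]


def screlu(x: float) -> float:
    """Squared clipped ReLU: clamp to [0, 1], then square."""
    clipped = min(max(x, 0.0), 1.0)
    return clipped * clipped


def feature_index(piece: int, square: int) -> int:
    """Map piece (0..11) and square (0..63) to an input feature index."""
    return piece * 64 + square


class Accumulator:
    """Hidden-layer pre-activations, kept in sync with the board incrementally."""

    def __init__(self, active_features):
        self.values = list(B_IN)
        for f in active_features:
            self.add(f)

    def add(self, feature: int) -> None:
        """A piece appeared on a square: add its weight column."""
        for i in range(HIDDEN):
            self.values[i] += W_IN[feature][i]

    def remove(self, feature: int) -> None:
        """A piece left a square: subtract its weight column."""
        for i in range(HIDDEN):
            self.values[i] -= W_IN[feature][i]

    def evaluate(self) -> float:
        """Apply SCReLU to the accumulator and take the output dot product."""
        return sum(w * screlu(v) for w, v in zip(W_OUT, self.values))
```

A quiet move then costs one remove and one add instead of a full recomputation of the first layer, which is the reason this kind of network is fast enough to call at every node of the search.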

Quality

Observe, analyze, fix

An engine doesn't get stronger because you claim it does - but because you measure every change, find weaknesses from real games and eliminate bugs consistently.

Quality loop: generate data, train net, SPRT test, deploy to the live bot, analyze games, fix bugs and maintain the book - and back to data generation. — Every insight from real games feeds back into data, net and search · self-made graphic

Monitoring - distributed, not central

There is no central monitor: each AI instance watches its own domain, and the findings converge over the message bus.

Bot side. Watches the live games - connectivity audit (detects missed or aborted moves), post-game analysis (loss triage) and matchmaking pool logging.
Engine/data side. Datagen throughput and progress per device, plus the SPRT pipeline: every engine change is tested sequentially against the previous version before it goes live.
Training side. The val loss as an early warning for divergence/overfitting - a sanity check, explicitly not a strength metric.
Example. The bot suddenly played almost no games. The matchmaking pool logging showed at once that the opponent fetch was healthy (100 online) and that an overly tight filter gate let only 32 of them through, so this was not a network bug. Diagnosis in minutes instead of hours.
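
The sequential test in the SPRT pipeline can be sketched as follows. This is a minimal version that uses the common normal approximation of the log-likelihood ratio on win/draw/loss counts. The hypotheses (elo0 = 0, elo1 = 5) and the error rates (alpha = beta = 0.05) are assumed defaults for illustration. The bounds clrsrc actually uses are not stated here.

```python
import math


def expected_score(elo: float) -> float:
    """Logistic Elo model: expected score for a given Elo advantage."""
    return 1.0 / (1.0 + 10.0 ** (-elo / 400.0))


def sprt_llr(wins: int, draws: int, losses: int, elo0: float, elo1: float) -> float:
    """Approximate log-likelihood ratio of H1 (elo1) against H0 (elo0)."""
    n = wins + draws + losses
    if n == 0:
        return 0.0
    score = (wins + 0.5 * draws) / n
    variance = (wins + 0.25 * draws) / n - score * score
    if variance <= 0.0:
        return 0.0  # degenerate sample, no information yet
    s0, s1 = expected_score(elo0), expected_score(elo1)
    return (s1 - s0) * (2.0 * score - s0 - s1) / (2.0 * variance / n)


def sprt_decision(wins: int, draws: int, losses: int,
                  elo0: float = 0.0, elo1: float = 5.0,
                  alpha: float = 0.05, beta: float = 0.05) -> str:
    """Stop as soon as the LLR leaves the interval (lower, upper)."""
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    llr = sprt_llr(wins, draws, losses, elo0, elo1)
    if llr >= upper:
        return "accept H1"
    if llr <= lower:
        return "accept H0"
    return "continue"
```

The decision is re-evaluated after every batch of games, so a clearly good or clearly bad change stops early and only borderline changes need long runs.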

Game & bug analysis

Cold probe. A suspicious position is analyzed in a fresh engine process (reproducible, no carry-over from the hash table) and compared against a stronger "referee" (Stockfish).
Blind-spot scan. Targeted search for position types the NNUE systematically misjudges - e.g. an exposed own king or endgame optimism.
Eval-bias diagnosis. It cleanly separates whether a blunder is a search problem (horizon/time) or a data problem (NNUE evaluation) - which decides whether the fix belongs in the search code or in training.
Example. In won endgames the engine sometimes sacrificed queen or rook against the bare king - the win was never in danger, but it looked absurd. The cold probe showed: without a tablebase the search finds the mate cleanly; the tablebase's 50-move-rule metric had overwritten the mating move.

Bug fixing

Four examples following the pattern symptom → cause → fix → effect:

Mate thrown away into a draw. A mate entry in the warm hash table returned a non-progressing move. Fix: accept mate only after checking the winning line, plus a draw check at the leaf nodes. → +48.4 Elo.
Tactics seen too late. A search reduction ran before the pruning gates. Fix: one line reordered. → +55.9 Elo (100% LOS).
Opening book never hit. The en-passant bit always went into the Polyglot hash, even without a possible capture. Fix: mix it in only on a real ep capture. → book hits instead of none.
Material sacrifice against the bare king (see above). Fix: in winning positions, first search briefly for a forced mate, otherwise take the tablebase move. → no more sacrifices, verified strength-neutral.
Honesty included: a "TT aging" experiment was discarded again after 2,409 games via SPRT (−15 Elo). Not every good idea survives the statistics - and that's exactly what it's for.
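
The opening-book bug above comes down to one condition. In the Polyglot book format the en-passant key is hashed only when a pawn of the side to move stands next to the pawn that just made a double push, whether or not the capture would be legal. FEN, by contrast, records the target square unconditionally. Below is a minimal sketch of that check. The board representation and the function names are assumptions, and the keys passed to hash_with_ep are placeholders rather than the real Polyglot random table.

```python
from typing import Dict, Optional, Sequence


def square(file: int, rank: int) -> int:
    """0-based file and rank to a square index, a1 = 0, h8 = 63."""
    return rank * 8 + file


def polyglot_ep_file(pieces: Dict[int, str], ep_square: Optional[int],
                     white_to_move: bool) -> Optional[int]:
    """Return the file to hash for en passant, or None if it must be skipped.

    pieces maps square index -> piece letter ('P' white pawn, 'p' black pawn).
    ep_square is the FEN en-passant target square, or None.
    """
    if ep_square is None:
        return None
    file = ep_square % 8
    # The capturing pawn stands on the same rank as the pawn that just moved.
    capturer_rank = 4 if white_to_move else 3
    capturer = "P" if white_to_move else "p"
    for neighbour in (file - 1, file + 1):
        if 0 <= neighbour <= 7 and pieces.get(square(neighbour, capturer_rank)) == capturer:
            return file
    return None


def hash_with_ep(base_key: int, ep_keys: Sequence[int], pieces: Dict[int, str],
                 ep_square: Optional[int], white_to_move: bool) -> int:
    """XOR the en-passant key into base_key only when a capturer exists."""
    file = polyglot_ep_file(pieces, ep_square, white_to_move)
    return base_key if file is None else base_key ^ ep_keys[file]
```

After 1.e4 no black pawn stands on d4 or f4, so the key stays out of the hash. A generator that always mixes it in produces a different key for the most common opening positions and never finds them in the book.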

Experience

The curated experience book

So the engine doesn't have to recompute every known position, an "experience book" collects deep search verdicts - and it is maintained as a small data format in its own right.

What it is. An opening/experience book in the compact JBK2 format (32 bytes per entry: move, evaluation, win/draw/loss stats, source) - fed from engine self-play and real live-bot games (WDL "harvest").
How it's curated. An overlay collects new games; via expmerge they move into the main book (priority configurable, home book first). A mandatory step removes provably bad opening moves ("poison entries") that self-learning would otherwise pick up again.
Clean across sources. Source bits record the origin per entry, separate evaluation fields prevent overwriting, and a "golden fixture" test guarantees byte-identical merges.
Key figures. Base book about 192,000 entries, generated from self-play and freely shareable; the live book grows continuously via harvest + merge.
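
To illustrate how a fixed 32-byte entry and a deterministic merge fit together, here is a sketch in Python. The field layout is hypothetical and is not the real JBK2 specification. The source-bit values, the "first evaluation wins" rule and the name merge are likewise assumptions made for the example.

```python
import struct
from typing import Dict, Iterable, Set, Tuple

# Hypothetical layout, NOT the real JBK2 spec: 8-byte position key, 2-byte move,
# 2-byte signed evaluation, three 4-byte W/D/L counters, 1 byte of source bits,
# 7 padding bytes. Little-endian, 32 bytes in total.
ENTRY = struct.Struct("<QHhIIIB7x")

SRC_SELFPLAY = 0b01  # assumed source bit: engine self-play
SRC_LIVE = 0b10      # assumed source bit: live-bot games

Entry = Tuple[int, int, int, int, int, int, int]  # key, move, eval, w, d, l, source


def merge(main: Iterable[Entry], overlay: Iterable[Entry],
          poison: Set[Tuple[int, int]]) -> bytes:
    """Merge overlay into main; main keeps its evaluation, W/D/L stats add up."""
    book: Dict[Tuple[int, int], Entry] = {}
    for entry in list(main) + list(overlay):
        slot = (entry[0], entry[1])           # (position key, move)
        if slot in poison:
            continue                          # drop known-bad opening moves
        old = book.get(slot)
        if old is None:
            book[slot] = entry
        else:
            book[slot] = (old[0], old[1], old[2],   # keep the first evaluation seen
                          old[3] + entry[3], old[4] + entry[4], old[5] + entry[5],
                          old[6] | entry[6])        # union of source bits
    # Sorting by slot makes the output independent of dictionary insertion order.
    return b"".join(ENTRY.pack(*book[slot]) for slot in sorted(book))
```

Because the output is sorted and every field has a fixed width, running the same merge twice yields identical bytes, which is the property a golden-fixture test can pin down.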

The hardware

A data center built from leftovers

The millions of training positions were not created in the cloud, but on a patched-together home cluster - desktop, old laptops, even smartphones under Termux, orchestrated over SSH - complemented by a rented VPS, the only node running all year round.

Compute nodes (incl. VPS): 10
Positions per second: ~250–300
Positions in 2-3 days: ~60M
GPU for net training: RTX 5060 Ti

Node	Processor	System
Desktop	Ryzen 9 9950X + RTX 5060 Ti	Windows 11
Laptop Yoga	Core i7-1165G7 · AVX-512	Debian
Laptop Device33	Core i5-7200U	Debian
Laptop X230i	Core i3-3110M	Debian
Smartphone X1	Dimensity 9300+	Android · Termux
Smartphone X2	Snapdragon 778G	Android · Termux
Smartphone X3	Snapdragon 888	Android · Termux
Smartphone Samsung	Snapdragon 888	Android · Termux
TV box	ARMv8 (aarch64)	Android · Termux
VPS (rented)	x86-64 · AVX-512	Linux · 24/7/365
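
The throughput figures above can be cross-checked with a few lines of arithmetic. The helper name positions is an assumption, and the calculation takes a constant rate over the whole run, which a mixed cluster will only approximate.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def positions(rate_per_second: float, days: float) -> int:
    """Positions generated at a constant rate over a number of days."""
    return round(rate_per_second * days * SECONDS_PER_DAY)


low = positions(250, 2)     # slowest rate, shortest run
mid = positions(275, 2.5)   # middle of both ranges
high = positions(300, 3)    # fastest rate, longest run
```

The three values come out at 43.2M, 59.4M and 77.76M positions, so the quoted figure of about 60M in 2-3 days sits in the middle of the range that 250-300 positions per second implies.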

Inter-instance communication

When AIs talk to each other

Each subproject - engine, bot, training, website - has its own AI instance. To work in sync, they exchange facts over a lean message bus. clrsrc is the hub.

Bus topology: clrsrc as the central hub, connected to the instances nnue_train, website, chess_engines and bot via pairwise buses; above them a shared, read-only fact log. — clrsrc as hub · pairwise buses + shared fact log · self-made graphic

Some instances even respond autonomously to incoming messages - safeguarded by several brakes: limited reply depth, cooldowns, a daily limit, a budget cap and a kill switch. A human message resets the chain at any time.
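
The brakes listed above can be combined into a single guard that every automatic reply has to pass. Here is a minimal sketch. The class name, the field names and all default values are assumptions, not the bus's actual configuration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ReplyGuard:
    """Decides whether an instance may answer a bus message on its own."""
    max_depth: int = 3            # assumed limit on consecutive automatic replies
    cooldown_s: float = 60.0      # assumed minimum gap between automatic replies
    daily_limit: int = 50         # assumed number of automatic replies per 24 hours
    budget_cap: float = 5.0       # assumed total spending cap
    kill_switch: bool = False
    depth: int = 0
    spent: float = 0.0
    sent_at: List[float] = field(default_factory=list)

    def on_human_message(self) -> None:
        """A human message resets the automatic reply chain."""
        self.depth = 0

    def allow(self, now: float, cost: float) -> bool:
        """Return True and record the reply only if every brake permits it."""
        day_ago = now - 86_400
        self.sent_at = [t for t in self.sent_at if t > day_ago]
        if self.kill_switch or self.depth >= self.max_depth:
            return False
        if self.sent_at and now - self.sent_at[-1] < self.cooldown_s:
            return False
        if len(self.sent_at) >= self.daily_limit or self.spent + cost > self.budget_cap:
            return False
        self.depth += 1
        self.spent += cost
        self.sent_at.append(now)
        return True
```

Each brake fails closed: if any single check refuses, no reply is sent, and only a human message clears the depth counter again.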

Instances & skills

Five specialists, one toolbox

Each instance has its own remit - and self-built tools for it, called skills in Claude Code jargon: small, recurring workflows that launch with a single command. What the four working instances can do:

clrsrc - engine & hub

The actual chess engine and at the same time the central instance through which all the others are coordinated.

Skills: expmerge-deploy (curate-merge the opening book from real bot games & ship it live), sprt (statistically A/B-test new versions), cold-probe (reproduce reported blunders in a fresh process), fleet-status (control the compute cluster). Rust hardening with Kani, Miri and Clippy.

bot - Lichess bot

Runs @clrsrc_lc0 live on Lichess and analyzes every game.

Skills: game-review (triage lost games: rating drift + eval trajectory to the tipping point), cold-probe (prove a finding move-by-move), report-finding (report & file findings uniformly), bot-health (live status, read-only). Plus occasional code-review and an hourly tournament poll.

nnue_train - net training

Trains the neural network and decides which candidate may proceed.

Skills: coverage-round (a complete training round), eval-net (check a candidate against the baseline), sprt-handoff (hand a net over for strength testing), build-book (a targeted opening book for data generation). Principle: selection by playing strength, not by training loss.

chess_engines - the yardstick

A curated collection of 57 reference engines with uniform technical profiles. Measures clrsrc's strength from real engine-vs-engine games - measured, not estimated.

Work: profile analysis (reading engine internals faithfully from source), cutechess tournaments (round-robin & gauntlet), source verification and an independent fact & code review of this website.

And the fifth instance? It built this site - static, no framework - and draws its numbers exclusively from the verified facts of the other four. Above them all sits shared tooling: the message bus with its shared fact log, an automatic check for new mail at the end of every reply, an autopilot for headless operation and a crash recovery that can resume any session with full context.

Human & AI

Who actually does what?

The human - architect & decision-maker

Sets the goals and quality bars, runs the hardware, keeps the legal guardrails and judges results with chess understanding.

Claude Code - the executing developer

Writes and refactors the code, builds tools, researches and documents - and runs partly autonomously within clear limits.

Thinking together

The biggest progress comes not from plain task processing, but from thinking together. The human brings a question or an observation; the instance digs in, checks hunches against the actual data and discards what doesn't hold. Often a long-held assumption flips - because someone measured instead of guessing. Human and AI here are not a command chain, but conversation partners: the human sets direction and limits, the instance delivers depth, evidence and sometimes pushback.

Working together and talking about it

The work happens in conversation: active brainstorming, open exchange, collecting ideas and sharpening them together. The special part is the short path: over the inter-instance bus the instances talk to each other directly, flexibly and fast. Questions, answers and evidence travel back and forth automatically, without anyone copying text from one chat into the next.

This is vibe coding in the serious variant: not "let it generate blindly", but stating intent in natural language - and then proving every engine change statistically via SPRT. Vibe meets the measuring stick. Without this partnership the project wouldn't exist.

Why chess is hard

The endless positions

Chess can't be "brute-forced". There is a perfect solution - but it's practically unreachable.

~10⁴⁴ legal positions
~10¹²⁰ possible games (Shannon number)
~10⁸⁰ atoms in the visible universe

The way out is two levers that multiply: search smarter (prune the search tree aggressively) and evaluate better (the NNUE). These two levers are the entire story of this project.
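
The first lever can be shown on a toy tree. The sketch below compares plain negamax with alpha-beta pruning, the classic technique the first Jugernaut was built on. The tree encoding (nested lists, integer leaves scored from the side to move) and the leaf counter are assumptions made for the example. A real engine adds move ordering, a transposition table and many further reductions.

```python
from typing import List, Union

Tree = Union[int, List["Tree"]]  # leaf = static evaluation from the side to move
INF = 10**9


def negamax(node: Tree, counter: List[int]) -> int:
    """Plain negamax: visits every leaf."""
    if isinstance(node, int):
        counter[0] += 1
        return node
    return max(-negamax(child, counter) for child in node)


def alphabeta(node: Tree, alpha: int, beta: int, counter: List[int]) -> int:
    """Negamax with alpha-beta pruning: same result, fewer leaves."""
    if isinstance(node, int):
        counter[0] += 1
        return node
    best = -INF
    for child in node:
        best = max(best, -alphabeta(child, -beta, -alpha, counter))
        alpha = max(alpha, best)
        if alpha >= beta:
            break  # the opponent already has a better option elsewhere
    return best


tree = [[3, 5], [2, 9], [0, 1]]
plain_count, pruned_count = [0], [0]
plain_value = negamax(tree, plain_count)
pruned_value = alphabeta(tree, -INF, INF, pruned_count)
```

Both searches return the same value for the root, but the pruned search looks at 4 of the 6 leaves. On real chess trees the saving grows with depth, which is why better pruning and better evaluation multiply rather than add.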

Law & fairness

Open and clean

A standalone engine, open. clrsrc is not a fork (own search, own net); it is licensed under GPL-3.0 and its full source is on GitHub.
Built on the open-source community. Individual building blocks (e.g. the time management and the SIMD kernel of the evaluation) are adapted from other GPL engines like Stockfish, Stash and Viridithas - every source openly credited. That is exactly why clrsrc itself is under GPL-3.0.
Own training data. The positions are generated by self-play - no third-party bulk databases.
Teacher cleanly separated. During training Stockfish labels the positions as an external process; its code is neither shipped nor linked.
The bot licensed in its own right. The Lichess bot LiRu-Bot is a Rust port of the official lichess-bot (lichess-bot-devs) and is under AGPL-3.0; the engine is under GPL-3.0. In the standard build both run as separate programs; in the embedded live build they form a combined work. There is no conflict, since GPL-3.0 and AGPL-3.0 each explicitly permit being combined with the other (section 13 of both licenses). Upstream credited.
Fair use of free data. Where Lichess data flows in, it is explicitly free (CC0).

Full acknowledgement of all projects from which code, algorithms or data formats originate is in the CREDITS file on GitHub.

Outlook

Where it's heading

Break through the NNUE plateau - larger net architectures and deeper training labels, each measured via SPRT.
Prepare an entry into the CCRL rating lists - only then will there be an official Elo number here.
Keep the bot @clrsrc_lc0 in stable continuous operation.
A blog series out of this story: NNUE training, the smartphone cluster, AIs that talk, SPRT as a discipline.
