Every number here comes from the program's results and is reproducible from the open release. The headline: a systematic search found heavy-hex-native codes that beat IBM's Gross code on the figure of merit — and we are exact about which distances are proven and which are bounded.
| Code [[n,k,d]] | Torus | k·d²/n | vs Gross | Distance | qubits/log. | Novelty |
|---|---|---|---|---|---|---|
| [[144, 12, 12]] | 12×6 | 12.0 | 1.00× | baseline | 12.0 | IBM Gross |
| [[196, 18, 24]] | 14×7 | 52.9 | 4.41× | bound | 10.9 | novel |
| [[192, 16, 20]] | 12×8 | 33.3 | 2.78× | bound | 12.0 | novel |
| [[196, 8, 26]] | 14×7 | 27.6 | 2.30× | bound | 24.5 | novel |
| [[192, 8, 24]] | 12×8 | 24.0 | 2.00× | exact | 24.0 | novel |
| [[168, 8, 22]] | 12×7 | 23.1 | 1.92× | bound | 21.0 | cyclic risk |
| [[180, 6, 26]] | 10×9 | 22.5 | 1.88× | bound | 30.0 | cyclic risk |
| [[198, 6, 27]] | 11×9 | 22.1 | 1.84× | bound | 33.0 | cyclic risk |
Distances: exact = certified by a MIP solver over every coset; bound = BP+OSD upper bound at partial coverage. Cyclic-torus codes carry residual prior-art risk pending a live-literature check.
For the lead code, we didn't settle for a fast-decoder estimate. An exact solver checked every logical coset — and revised the distance down.
k·d²/n = 24.0 = exactly 2.00× Gross. Every one of 255 cosets solved by an integer-programming solver. The asymmetric distance makes it especially strong under biased noise. This is the result we'd stake the paper on.
The figure-of-merit gain comes at a cost: the highest-scoring codes currently have a circuit-level threshold below today's hardware. This is the central result of the study, not a caveat buried in an appendix.
The whole credibility of a QEC result rests on this distinction. So we publish it line by line — including the negative results and the work still to do.
14 of the top 50 candidates beat Gross after rigorous BP+OSD re-verification; reproducible from the open release.
Exact MIP solver, every one of 255 cosets solved. It corrected the fast-decoder estimate downward — the value of certification.
BP+OSD upper bounds at partial coverage (≈12–40% of logical classes). Strong, but not yet exactly certified — k is too large for the MIP solver.
Enforced via a toroidal L1 ≤ 4 locality surrogate; an explicit Eagle/Heron graph embedding has not yet been produced.
Currently below IBM gate-error rates — the figure-of-merit gain trades against threshold. Full-scale threshold runs and a fault-tolerant schedule are the next phase.
Smith-Normal-Form analysis rules out equivalence to IBM's published codes; a live arXiv / code-tables dedupe is still required before any novelty claim is final.
Tested honestly across three rounds: language models are good constraint-respecting proposers but did not beat random search at finding frontier codes. Reported as a negative result.
An IBM memory-experiment protocol is designed and preflighted, but has not been run. No experimental logical-error claim is made.
[[192, 20, 20]] topped the raw leaderboard at a claimed score of 41.67 (d=20). Rigorous verification dropped it to d=8, score 6.67 (Δd=-12). We publish failures like this on purpose.
A fault-tolerant schedule, full threshold runs, and a memory demo on real hardware — that is where backing makes the difference.