Show HN: Sameshi – a ~1200 Elo chess engine that fits within 2KB
225 points by datavorous_ 2 days ago | 68 comments
I made a chess engine today, and made it fit within 2KB. I used a variant of MinMax called Negamax, with alpha beta pruning. For the board representation I have used a 120-cell "mailbox". I managed to squeeze in checkmate/stalemate in there, after trimming out some edge cases.

I am a great fan of demoscene (computer art subculture) since middle school, and hence it was a ritual i had to perform.

For estimating the Elo, I measured 240 automated games against Stockfish Elo levels (1320 to 1600) under fixed depth-5 and some constrained rules, using equal color distribution.

Then converted pooled win/draw/loss scores to Elo through some standard logistic formula with binomial 95% confidence interval.


sireat 2 days ago
This is very cool and having stalemate is nice, however how much space would it take to implement the full ruleset?

As you write: not implemented: castling, en passant, promotion, repetition, 50-move rule - those are all required to call the game being played modern chess.

I could see an argument for skipping repetition and 50-move rule for tiny engines, but you do need castling, en pessant and promotion for pretty much any serious play.

https://en.wikipedia.org/wiki/Video_Chess fit in 4k and supported fuller ruleset in 1980 did it not?

So I would ask what is the smallest fully UCI (https://www.chessprogramming.org/UCI) compliant engine available currently?

This would be a fun goal to beat - make something tiny that supports full ruleset.

PS my first chess computer in early 1980s was this: https://www.ismenio.com/chess_fidelity_cc3.html - it also supported castling, en pessant, not sure about 50 move rule.

reply
dmurray 2 days ago
ToledoChess [0] has a few implementations of this in different languages. Some highlights:

2KB of JavaScript with castling, en passant, promotion, search and even a GUI

326 bytes of assembly, without the special rules

I don't think the author has a UCI-compliant one, but it should be easier than the GUI. There are forks of the JS one that might do it.

[0] https://nanochess.org/chess6.html

reply
jll29 2 days ago
Cool project. You could also use the front-end of GNU chess to save some lines, and implement only a back-end.

Bug report:

    a b c d e f g h
  8 r n b q k b n r 8
  7 . . p p p p p p 7
  6 . p . . . . . . 6
  5 p . . . . . . . 5
  4 P . . P P . . . 4
  3 . . . . . . . . 3
  2 . P P . . P P P 2
  1 R N B Q K B N R 1
    a b c d e f g h
  move: b2b3
  ai: b6b4
The pawn is not permitted to move two fields after it has already beeen moved once before: b6b4 isn't a valid move after b7b6. (First moving two fields, and then one would have been okay, in contrast.)
reply
datavorous_ 2 days ago
Thanks for pointing it out! I will try to patch it.

Appreciate you taking the time to test it.

reply
l674 2 days ago
If anyone is curious, the most common tool I've seen for ELO estimation among engine developers is cutechess [1], which uses SPRT [2]. Or ordo [3], haven't used this myself though

[1] https://cutechess.com/

[2] https://www.chessprogramming.org/Sequential_Probability_Rati...

[3] https://github.com/michiguel/Ordo

reply
lekevicius 2 days ago
Do you think it would be possible to achieve 1:1 ELO:bytes? Even smaller, but can be less smart.
reply
esafak 2 days ago
That's an awesome code golf challenge
reply
datavorous_ 2 days ago
maybe for very low ratings it's plausible? 1 elo per byte might happen in a tiny range but at a useful strength it would break fast, that's what i think
reply
iterance 2 days ago
What's the snallest possible program that accepts a chess board state and prints any legal move? True randomness may only have a couple hundred ELO, but then, that's pretty big for golf
reply
dmurray 2 days ago
The program that resigns every time unfortunately does a lot worse than random. But it depends on the population it's pitted against - it should at least pick up a few points against copies of itself.
reply
contravariant 24 hours ago
Don't resign, just offer a remise after moving a pawn. Only resign if no pawns are left.

I'd claim it would work on human opponents, but I think it would get banned from chess tournaments.

reply
dmurray 15 hours ago
Perhaps playing 1. e4 2. Bc4 3. Qh5 4. Qf7 (and resigning or offering a draw if some move isn't legal) would minmax this further

The problem isn't really well defined. Elo rating is assumed to be determinable independent of what opponents you face, so scoring 50% against opponents rated 1800 gives you the same information as scoring 26% against opponents rated 2000. In practice that's obviously not completely true, and for degenerate examples like the ones we are discussing it completely falls apart.

reply
galkk 20 hours ago
This is not chess, but something that allows to move chess pieces.

> Not implemented: castling, en passant, promotion, repetition, 50-move rule.

reply
Mr_Minderbinder 16 hours ago
I have been into computer chess for many years and I was fully expecting those concessionary statements. I have seen enough programs in this lucrative genre where a lot of attention can be gained by fraudulently claiming you implemented chess in a seemingly impossibly small size. When confronted, the charlatans will often claim senselessly that those omissions were in fact superfluous. This is a behaviour I have unfortunately also observed in other areas of computing.

If anyone reading this is interested in small and efficient chess programs that are still reasonably strong, there was a x86 assembly port of Stockfish called asmFish from a couple of years ago (the Win64 release binary was about 130KiB). Also see OliThink (~1000 LOC) and Xiphos which has some of the simplest C code for an engine of its strength that I have seen. I have not investigated the supposedly 4K sized engines that participated in TCEC too closely but from what I have seen so far it would seem that there are a few asterisks to be attached to those claims.

reply
tromp 2 days ago
https://www.chessprogramming.org/Toledo is a family a moderately strong tiny chess programs.
reply
thomasmg 2 days ago
Cool! I just recently implemented a chess engine in ~400 (readable) lines, with all rules, first in Java and then ported to my own programming language "Bau" [1]. This is including a terminal UI. I'll measure the ELO, but I was never able to beat it :-) The castling moves are specially tricky to implement I think. I enjoyed the challenge as well.

[1] https://github.com/thomasmueller/bau-lang/blob/main/src/test...

reply
ycombinatrix 22 hours ago
How come there's no unsigned numeric types in Bau?
reply
thomasmg 16 hours ago
I tried to describe this in [1]: "Unsigned integer are intentionally not supported to simplify learning and using the language, to avoid surprising behavior and edge cases, and to reduce security issues and error-handling pitfalls. When needed, unsigned behavior is available through explicit operations. This design does not affect performance or memory usage."

I understand this may not sound very convincing yet... it is hard to describe... basically, unsigned types sound simple, but they are not.

[1] https://thomasmueller.github.io/bau-lang/features.html

reply
dfc 2 days ago
How many games did you have to throw away because stockfish wanted to castle? Or did you force stockfish to not castle? Castling seems like such a frequent move it is hard to draw any conclusions about the strength of an engine that does not support it.
reply
datavorous_ 2 days ago
zero games were thrown away for castling, because i forced stockfish not to castle (and not to play en passant/promotion) by filtering legal moves and only giving those filtered moves via root_moves

so every game stayed in the same no castling variant

and you're right, this rating is for that constrained variant, not full chess.

reply
jsmith99 2 days ago
Wouldn't stockfish's position evaluation be incorrect in that case? (If it evaluated the position based on a formula that assumed normal rules)
reply
YawningAngel 24 hours ago
I'm not quite clear on the how of it, but Stockfish works pretty well outside the normal bounds of chess. There are toy chess variants on chess.com with "dragons" (knight + bishop) and stockfish can use those very effectively
reply
tzs 19 hours ago
Wouldn't that just stop it from considering castling, en passant, and promotion on the first move of the position you are analyzing? It's still going to consider those in the tree search, and the static evaluation neural net was trained on positions where they are allowed.

For castling you should be able to fix this by specifying that castling rights have been lost in the positions you give it.

For the others I think you would need to filter not just when giving it the list of allowed first moves. You'd also have to make it not consider en passant and promotion in the search, by modifying its move generation.

The static evaluation would still be off a bit, but that probably would not have much effect most of the time.

From what I've read it is feasible to train a new neural net on a decent home computer in maybe around a week, but that's probably overkill for your use of figuring out how strong your engine is at no castling/no promotion/no en passant chess.

reply
yuppiepuppie 15 hours ago
Very cool! I’ve added this to the HN Arcade https://hnarcade.com/games/games/sameshi
reply
drev 14 hours ago
So small it fits on a gameboy https://github.com/odrevet/sameshi-gbdk
reply
chvid 2 days ago
Cool that you could keep it under 2k but it would nice to have a readable version of the source code.

Do you work with it like this or do you have some sort of script you apply to get it down to a single line, single letter variable names?

reply
mackeye 20 hours ago
a readable version seems to have just been added! https://github.com/datavorous/sameshi/tree/master/readable
reply
noutella 2 days ago
What you’re describing is the typical output / function of a minifier
reply
alansaber 2 days ago
The real fun would be reverse-engineering the minified code (there are loads of tools to do this for chrome extensions)
reply
TZubiri 2 days ago
not lossless
reply
GeertB 2 days ago
How did you handle games where Stockfish would castle or promote?
reply
datavorous_ 2 days ago
i forced stockfish to play only non castling, non en passant, non promotion moves by filtering legal moves and passing only those as root_moves

also removed castling/EP rights from FEN

reply
comboy 2 days ago
I'd call that cheating but the size and capability is impressive nonetheless.
reply
burstw0w 2 days ago
Good job! I love how you obfuscated your code, really in a spirit of FOSS!
reply
datavorous_ 2 days ago
Oh well, the file initially looked like https://github.com/datavorous/sameshi/blob/7ab4e47144f96becd...

It is hideous now!

reply
burstw0w 2 days ago
It's not about being hideous, it's about being useless.

Your code is useless to anyone that wants to contribute, or maybe make something better by improving on the idea.

reply
y-curious 2 days ago
Coworker: “hey if you have a second, I have a one-liner PR open”

The PR:

reply
recursive 21 hours ago
You might have misunderstood the purpose of this.
reply
bstsb 23 hours ago
it’s minification, not obfuscation. the whole point of the engine is its small footprint
reply
haute_cuisine 2 days ago
This is amazing! Thanks for sharing. What would be the elo gain for 4KB engine?

P.S. I assume 1200 elo in chess com scale (not lichess / fide elo) and bullet chess variant?

reply
grumpopotamus 2 days ago
There is a TCEC category for 4k engines. The top ones are ~3000 Elo.
reply
sigmoid10 2 days ago
It's wild to think that 4096 bytes are sufficient to play chess on a level beyond anything humans ever achieved. Makes you think what other difficult tasks are out there that take even highly gifted humans years or decades to master, but a superior algorithm would more or less fit into one of those big QR code formats.

These things always make me think back to Westworld season 2, where the finale revealed that human minds are much simpler than they themselves believe and fit completely into an algorithm that could be printed in an average book.

reply
vunderba 2 days ago
Well, one of the most fundamental algorithms for building a chess AI is minimax [1] (or variants like negamax), and that’s been around for close to a century. The key difference is that as compute power and available RAM have grown, it’s become possible to search much deeper and evaluate far more plies.

So while 4k is still very impressive for the code base, it comes with a significantly larger runtime footprint.

[1] - https://en.wikipedia.org/wiki/Minimax

reply
senfiaj 2 days ago
Min-max + alpha-beta pruning is the backbone of the chess engines. I think 2KiB or even 1KiB might be enough (I guess the last one would be a very challenging squeeze). But what separates the best engines from average ones, is the heuristics. Heuristics is the most complicated one, and I doubt it's possible to fit it into a single-digit kilobyte memory (even 2-digit). For heuristics, engines like Stockfish also use neural networks, in addition to hand crafted algorithms. Also huge tables are used for endgames, etc.
reply
kevmo314 2 days ago
The core search algorithm is very simple though. 4KB engines may not run that fast if they do exhaustive search, but they’ll be quite accurate.

According to TCEC the time control is 30 mins + 3 sec, that’s a lot of compute!

reply
sigmoid10 2 days ago
If you look at the current winner [1], it does a lot more than just brute force tree search. The space state for chess is simply too big to cover without good heuristics. Deep Blue may have been a pure brute force approach to beat Kasparov after Deep Thought failed using the same core algorithm, but modern chess engines search far deeper on the tree with far fewer nodes than Deep Blue ever could thanks to better heuristics.

[1] https://github.com/MinusKelvin/ice4

reply
kevmo314 2 days ago
I'm not suggesting that it's only brute force tree search, just that it's not very complicated to develop a theoretically perfect chess engine in direct response to the parent

> It's wild to think that 4096 bytes are sufficient to play chess on a level beyond anything humans ever achieved.

reply
gnramires 2 days ago
It's not just about the base algorithm. It's also about the memory needed to run it, and the clockspeed. For example, even the hardest problem you can imagine, if it has a verifier algorithm that fits in 4k (which means the solution itself can be much larger than 4k), then you can simply do a basic brute force search over the solution space. That doesn't mean this algorithm is very intelligent; it's only very capable if you have a sufficiently fast computer; although indeed brute force is only feasible for the simplest tasks in practice, so the idea that algorithms (of increasing sizes) enable (greater) intelligence is definitely a part of the story, but not the whole story. You can also think of DNA, which represents a recipe for our bodies and brain, which we then use (essentially as an "algorithm") to learn things, with degrees of freedom (memory) far surpassing what DNA stores.

Now if you had a very good chess program running in very constrained (dynamic/RAM) memory, then that'd be partially more revealing. From a cursory search there's a 1800 ELO engine for the C64, which seems very impressive but very far from the best human players.

I'd be interested to see a curve of ELO x Avaliable RAM for the best chess engines (up to given RAM), and how that compares to other games and activities.

On RAM vs ROM (program size) memory, I think at a high level dynamic memory helps you keep track of search paths in a large tree search, saving you some computation. Program size tends to enable improving the effectiveness of your search heuristic, as well as pre-computing e.g. initial and final game optimal moves (potentially saving arbitrarily much compute). I like thinking about those things because I think the search paradigm is pretty informative of computation (and even intelligence) in general. Almost every problem is basically some kind of heuristic search in some kind of space. And you tend to get better at things by refining your heuristics (usually through some experimental training process or theoretical insight), considering more options, exploring deeper consequences, etc..

I think what really defines humans isn't really our ability to solve problems or play chess well etc. (although that's extremely useful and also enjoyable most of the time), it's really our emotions and inner world. We are not really Thinking Machines in essence, we're Feeling Machines most significantly. The thinking part is a neat instrumental part :) We can delegate thinking to machines but what we cannot extinguish is feeling or the human "soul", because that is the source of all meaning.

reply
falsaberN1 2 days ago
Oh my god the source is so tiny! It's really hard to parse because of it being minified but I love it to bits.
reply
kachapopopow 2 days ago
need to start measuring these things in the size of compiled functions so we can stop looking at oneliners (maybe wasm since it has an easy to read text representation)
reply
dxxvi 2 days ago
I wonder how big 1300, 1400, ..., 2200 Elo chess engines are.
reply
oh_my_goodness 2 days ago
If you ever spent much time at a chess club, you've seen why 2kB is a really disturbing number.
reply
jqr- 2 days ago
I have not. Can you please tell me why?
reply
vardump 2 days ago
He's just trying to trick HN readers to join chess clubs.
reply
oh_my_goodness 2 days ago
Not really. You have to see it for yourself.

(Partial answer, 2kB is a very small fraction of what we'd like to think counts as human.)

reply
AlexCoventry 2 days ago
Humans don't have much capacity for systematic tree search. It's sort of amazing that humans can do as well as they can, given that limitation.
reply
CyberDildonics 2 days ago
I don't think what you're saying has any connection to chess or chess clubs.

2kB is a very small fraction of what we'd like to think counts as human

This doesn't seem to mean anything. Why would 2KB have any relation to "counting as human". It's the data of about 10 comments.

reply
oh_my_goodness 22 hours ago
TFA describes a 2kB program can play a human game against humans, and sometimes win.
reply
CyberDildonics 21 hours ago
We know that, that's what the article is about. The things you said are vague and contain no information. What does 2KB have to do with "what we think counts as human" ?
reply
oh_my_goodness 21 hours ago
I said something very specific. You can refuse to compare 2kB to the size of the human brain, but that has nothing to do with me.
reply
sevg 17 hours ago
> that has nothing to do with me.

Well, you left a cryptic comment that multiple people couldn’t understand. And then you refused to explain it.

reply
charles_f 17 hours ago
"I am in, you are out, you couldn't understand"

Thanks for your contribution. The goal here is to enlighten each other, not expose ownership of knowledge that one has but doesn't want to share.

reply
CyberDildonics 20 hours ago
I'm completely lost and have no idea what you are trying to say or what point you are making.
reply
newzino 2 days ago
The mailbox board representation is a good call for size-constrained engines. Bitboards give faster move generation but the manipulation code (shifts, masks, magic numbers for sliding pieces) eats a lot of bytes. With mailbox you just need offset tables and a sentinel check for board edges. Curious what your evaluation function looks like though. At 2KB you can't fit piece-square tables (that's 384 values minimum for both colors), so are you doing material-only eval or did you squeeze in some positional heuristics?

The gap between your 1200 Elo in 2KB and the TCEC 4K engines at ~3000 Elo is interesting. That extra 2KB buys a lot when it goes to better evaluation and move ordering. Even a simple captures-first sort in alpha-beta pruning costs only a few bytes of code but can roughly double your effective search depth.

reply
raphaelmolly8 2 days ago
[dead]
reply
wittlesus 19 hours ago
[dead]
reply
genie3io 2 days ago
[dead]
reply
TZubiri 2 days ago
Codex or Claude Code?
reply
datavorous_ 2 days ago
none.

scribbling long enough on a piece of paper is more enjoyable than prompting.

reply
semi-extrinsic 2 days ago
a thousand times this.
reply
lyu07282 2 days ago
Isn't it bad enough they beat us at chess, do you have to make it even worse? ;p
reply