* https://www.nayuki.io/page/deflate-specification-v1-3-html
fn bits(&mut self, need: i32) -> i32 { ....
Put me in mind of one of my early experiments in Rust. It would be interesting to compare an iterator-based form that just called .take(need).
I haven't written a lot of Rust, but one thing I did was write an iterator that took an iterator of bytes as input and provided bits as output. Then I fed it with an iterator that gave bytes from a block of memory.
It was mostly as a test to see how much high level abstraction left an imprint on the compiled code.
The disassembly showed it pulling in 32 bits at a time and shifting out the bits pretty much the same way I would have written it in ASM.
I was quite impressed. Although I tested it was working by counting the bits, and someone criticized it for not using popcount, so I guess you can't have everything.
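For the curious, a minimal sketch of that kind of adapter might look like this (names are invented; bits come out LSB-first, the order DEFLATE reads them):

```rust
// Hypothetical byte-to-bit iterator adapter, LSB-first.
struct Bits<I> {
    bytes: I,
    cur: u8,  // bits remaining in the current byte
    left: u8, // how many bits of `cur` are still unread
}

impl<I: Iterator<Item = u8>> Bits<I> {
    fn new(bytes: I) -> Self {
        Bits { bytes, cur: 0, left: 0 }
    }
}

impl<I: Iterator<Item = u8>> Iterator for Bits<I> {
    type Item = u8; // each item is a single bit, 0 or 1

    fn next(&mut self) -> Option<u8> {
        if self.left == 0 {
            // Refill from the underlying byte iterator.
            self.cur = self.bytes.next()?;
            self.left = 8;
        }
        let bit = self.cur & 1;
        self.cur >>= 1;
        self.left -= 1;
        Some(bit)
    }
}

fn main() {
    // 0b1010_0001 yields, LSB-first: 1, 0, 0, 0, 0, 1, 0, 1
    let bits: Vec<u8> = Bits::new(vec![0b1010_0001u8].into_iter()).collect();
    println!("{:?}", bits);
}
```

Feeding it from a slice is just `Bits::new(buf.iter().copied())`, and `.take(need)` gives the fixed-width reads the article's `bits()` function performs by hand.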
PSA: Rust exposes the popcnt intrinsic via the `count_ones` method on integer types: https://doc.rust-lang.org/std/primitive.u32.html#method.coun...
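For example (on targets with the instruction, this compiles down to a single popcnt):

```rust
fn main() {
    let x: u32 = 0b1011_0000;
    // count_ones counts the set bits of the integer
    println!("{}", x.count_ones()); // prints 3
}
```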
Keep in mind this is also 31 years of cruft and lord knows what.
Plan 9 gzip is 738 lines total:
gzip.c 217 lines
gzip.h 40 lines
zip.c 398 lines
zip.h 83 lines
Even the zipfs file server that mounts zip files as file systems is 391 lines.
edit: a link to said code: https://github.com/9front/9front/tree/front/sys/src/cmd/gzip
> ... (and whenever working with C always keep in mind that C stands for CVE).
Sigh.
int crc32(byte[] data) {
    int crc = ~0;
    for (byte b : data) {
        crc ^= b & 0xFF;
        for (int i = 0; i < 8; i++)
            crc = (crc >>> 1) ^ ((crc & 1) * 0xEDB88320);  // reflected CRC-32 polynomial
    }
    return ~crc;
}
Or smooshed down slightly (with caveats):

int crc32(byte[] data) {
    int crc = ~0;
    for (int i = 0; i < data.length * 8; i++) {
        crc ^= (data[i / 8] >> (i % 8)) & 1;
        crc = (crc >>> 1) ^ ((crc & 1) * 0xEDB88320);
    }
    return ~crc;
}
But one reason that many CRC implementations are large is that they include a pre-computed table of 256× 32-bit constants so that one byte can be processed at a time. For example: https://github.com/madler/zlib/blob/7cdaaa09095e9266dee21314...

The Huffman decoding implementation is also bigger in production implementations, for both speed and error checking. The two Huffman trees need to be exactly complete except in the special case of a single code, and in most cases they are flattened to two-level tables for speed (though the latest desktop CPUs have enough L1 cache to use single-level).
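A table-driven variant, sketched here in Rust to match the thread's topic (building the 256 entries at runtime instead of embedding the constants, so it stays short):

```rust
// Sketch: byte-at-a-time CRC-32 using a generated table.
// A real implementation would build the table once, not per call.
fn make_table() -> [u32; 256] {
    let mut table = [0u32; 256];
    for n in 0..256u32 {
        let mut c = n;
        for _ in 0..8 {
            c = (c >> 1) ^ ((c & 1) * 0xEDB88320);
        }
        table[n as usize] = c;
    }
    table
}

fn crc32(data: &[u8]) -> u32 {
    let table = make_table();
    let mut crc = !0u32;
    for &b in data {
        // One table lookup consumes a whole byte per step.
        crc = (crc >> 8) ^ table[((crc ^ b as u32) & 0xFF) as usize];
    }
    !crc
}

fn main() {
    // "123456789" is the standard CRC-32 check input
    println!("{:08x}", crc32(b"123456789")); // prints cbf43926
}
```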
Finally, the LZ copy typically has special cases added for using wider than byte copies for non-overlapping, non-wrapping runs. This is a significant decoding speed optimization.
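A rough sketch of that split, assuming a `Vec<u8>` output buffer (the function name and shape are invented for illustration):

```rust
// Hypothetical LZ77 back-reference copy. Assumes 1 <= dist <= out.len().
fn lz_copy(out: &mut Vec<u8>, dist: usize, len: usize) {
    let start = out.len() - dist;
    if dist >= len {
        // Non-overlapping: the whole match already exists in the buffer,
        // so one wide slice copy (memcpy under the hood) is safe.
        out.extend_from_within(start..start + len);
    } else {
        // Overlapping: e.g. dist = 1 means "repeat the last byte len times",
        // so bytes just written must feed later copies; go one at a time.
        for i in 0..len {
            let b = out[start + i];
            out.push(b);
        }
    }
}

fn main() {
    let mut out = b"abc".to_vec();
    lz_copy(&mut out, 3, 3); // non-overlapping: append "abc"
    lz_copy(&mut out, 1, 2); // overlapping: repeat last byte twice
    println!("{}", String::from_utf8(out).unwrap()); // prints abcabccc
}
```

Production decoders go further (chunked copies past the overlap point, wrap handling for window boundaries), but the overlap test above is the core of the optimization.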
Feels like Rust culture inherited "throw and forget" as an error handling "strategy" from Java
Sigh.
Anyway, I skimmed the file for you this time, and basically you're either correct or wrong, depending on your definition of "error checking." The code handles error conditions by aborting the process. Seeing as it's a standalone CLI program and not a library meant for reuse, safely shutting down with a meaningful message sounds like fair game to me.
You can leave the snide comments about “Rust culture” (whatever that is) out next time.
The way a language's community handles errors and how the language itself handles errors are different things, sure, but they're not independent of each other.
That said, OP's snark against Rust is completely unmerited, and they can take my `impl From<OtherErr> for MyErr` from my cold dead hands.
After skimming through the author's Rust code, it appears to be a fairly straightforward port of puff.c (included in the zlib source): https://github.com/madler/zlib/blob/develop/contrib/puff/puf...
It makes me wonder if there was some LLM help, based on how similar the fn structure and identifier names are.
I would bet there was
With an entire section complaining about how many lines of code existing implementations are, it looks like they found a good, simple implementation to clone in Rust and then deliberately didn't mention it.