Zigzag Decoding with AVX-512
56 points by luu 4 days ago | 3 comments
londons_explore 13 minutes ago
This sort of analysis is great.
Now why can't compilers do this sort of thing automatically?
Almost any problem seems to allow a ~1000x speedup from AVX-512 plus days of thought, compared with a naive version written as a Python loop. If we could automate that whole process for big codebases, the performance gains could be huge.
diamondlovesyou 4 minutes ago
> Now why can't compilers do this sort of thing automatically?
They do - they just can't assume GFNI instructions are present unless you explicitly say so: https://godbolt.org/z/eYasbKsse
  // One-byte case for SLEB128 (bit 6 of the 7-bit payload is the sign bit)
  int64_t from_signext(uint64_t v) { return v >= 64 ? v - 128 : v; }
  // One-byte case for ULEB128 with zig-zag encoding
  int64_t from_zigzag(uint64_t z) { return (z >> 1) ^ -(z & 1); }
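
For anyone who wants to sanity-check the two one-byte decoders, here is a minimal C sketch. It assumes the usual LEB128 convention (7 payload bits per byte, with bit 6 acting as the sign bit for SLEB128). The names `decode_zigzag_bytes` and `check_examples` are hypothetical helpers added for illustration and are not taken from the linked godbolt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One-byte SLEB128: payloads 0..63 are non-negative, 64..127 map to -64..-1. */
static int64_t from_signext(uint64_t v) {
    return v >= 64 ? (int64_t)v - 128 : (int64_t)v;
}

/* One-byte ULEB128 with zig-zag: even inputs decode to z/2,
   odd inputs decode to -(z+1)/2. */
static int64_t from_zigzag(uint64_t z) {
    return (int64_t)((z >> 1) ^ (0 - (z & 1)));
}

/* Hypothetical helper: decode n one-byte zig-zag values in a plain loop. */
static void decode_zigzag_bytes(const uint8_t *in, int64_t *out, size_t n) {
    for (size_t i = 0; i < n; i++)
        out[i] = from_zigzag(in[i]);
}

/* Hypothetical self-check covering the boundary values of both decoders. */
static void check_examples(void) {
    assert(from_signext(63) == 63);
    assert(from_signext(64) == -64);
    assert(from_signext(127) == -1);
    assert(from_zigzag(0) == 0);
    assert(from_zigzag(1) == -1);
    assert(from_zigzag(2) == 1);
    assert(from_zigzag(127) == -64);
}
```

The loop in `decode_zigzag_bytes` is the kind of scalar code a compiler may auto-vectorize at higher optimization levels. Whether it does, and with which instructions, depends on the compiler and the target flags you pass.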