mirror of
https://github.com/dolthub/dolt.git
synced 2026-05-07 19:30:22 -05:00
338de4e583
Previously the buzhash boundary checker used a single value for the window size: it served both as the buzhash buffer size when constructing the hash object and as the window size reported through the boundary checker interface. This was wrong because the values passed to the hasher are not always single bytes; refs, for example, are 20 bytes.

Compound list chunking compensated by passing only the first byte of each list leaf's ref rather than the full ref, so the hash saw 1 byte of entropy per ref instead of 20. Meta sequence chunking compensated by multiplying the chunking window size by 20, which also made it consider 20 times more chunked elements than would fit in the buzhash buffer.
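To make the distinction concrete, here is a minimal sketch of a boundary checker that keeps the two sizes apart: the hash buffer is sized in bytes, while the reported window is counted in items. This is not the code from this repository. The names `boundaryChecker`, `newBoundaryChecker`, `fakeRef`, the xorshift-generated table, and the bit-mask boundary test are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"math/bits"
)

// table maps each byte value to a pseudo-random 32-bit word.
// It is filled by xorshift32 from a fixed seed (an illustrative choice).
var table = func() [256]uint32 {
	var t [256]uint32
	x := uint32(2463534242)
	for i := range t {
		x ^= x << 13
		x ^= x >> 17
		x ^= x << 5
		t[i] = x
	}
	return t
}()

// boundaryChecker is a hypothetical type, not the repository's interface.
type boundaryChecker struct {
	buf      []byte // ring buffer holding the last len(buf) bytes hashed
	pos      int    // next slot to overwrite in buf
	filled   int    // bytes written so far, capped at len(buf)
	hash     uint32 // rolling buzhash over the bytes in buf
	itemSize int    // bytes per item, e.g. 20 for a ref
	pattern  uint32 // a boundary is hit when these low bits are all set
}

// newBoundaryChecker sizes the hash buffer in bytes (windowItems*itemSize)
// so that full items can be hashed.
func newBoundaryChecker(windowItems, itemSize int, pattern uint32) *boundaryChecker {
	return &boundaryChecker{
		buf:      make([]byte, windowItems*itemSize),
		itemSize: itemSize,
		pattern:  pattern,
	}
}

// WindowSize reports the window in items, not bytes.
func (b *boundaryChecker) WindowSize() int { return len(b.buf) / b.itemSize }

// Write hashes every byte of item and reports whether the rolling hash
// now matches the boundary pattern.
func (b *boundaryChecker) Write(item []byte) bool {
	for _, in := range item {
		out := b.buf[b.pos]
		b.buf[b.pos] = in
		b.pos = (b.pos + 1) % len(b.buf)
		b.hash = bits.RotateLeft32(b.hash, 1) ^ table[in]
		if b.filled == len(b.buf) {
			// Remove the byte that just left the window.
			b.hash ^= bits.RotateLeft32(table[out], len(b.buf))
		} else {
			b.filled++
		}
	}
	return b.hash&b.pattern == b.pattern
}

// fakeRef builds a deterministic 20-byte stand-in for a ref (hypothetical helper).
func fakeRef(seed byte) []byte {
	r := make([]byte, 20)
	for i := range r {
		r[i] = seed*31 + byte(i)*7
	}
	return r
}

func main() {
	c := newBoundaryChecker(4, 20, 0x7)
	boundaries := 0
	for i := 0; i < 64; i++ {
		if c.Write(fakeRef(byte(i))) {
			boundaries++
		}
	}
	fmt.Println("window (items):", c.WindowSize(), "buffer (bytes):", len(c.buf), "boundaries:", boundaries)
}
```

With this split, a caller can hash all 20 bytes of each ref, and the hash still depends only on the last `WindowSize()` items, so neither of the workarounds described above is needed.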