Commit Graph

84 Commits

Author SHA1 Message Date
Aaron Son
f1eda2b3e8 go/store/nbs: table_reader: Pick up binary search for index lookup in the getMany path as well. 2023-05-26 11:41:03 -07:00
Aaron Son
2b012c943f go/store/nbs: table_reader.go: Optimize hasMany to use a binary search instead of a linear search.
With the addition of `errorIfDangling`, hasMany is on the critical path for
COMMIT and dataset HEAD operations. The other places it's heavily used are GC and
push/pull. The linear scan can be the right tradeoff for small table files and
for large table files with a lot of addresses to look for, but it's a bad
tradeoff for most errorIfDangling checks in particular, especially on large
databases with large table files.

For now we will always use the binary search, but a heuristic-based approach
which could pick between the two would be better.
2023-05-24 09:32:11 -07:00
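The tradeoff described in the commit message above can be sketched as follows. This is a minimal illustration, not the code in table_reader.go: the index is modeled as a sorted slice of uint64 prefixes, and the names hasManyLinear and hasManyBinary are hypothetical. The linear variant walks the index and the sorted queries together, so its cost grows with the size of the index; the binary variant pays one search per query, which is cheaper when a few addresses are checked against a large table file.

```go
package main

import (
	"fmt"
	"sort"
)

// hasManyLinear (hypothetical name) walks the sorted index and the sorted
// queries together, like a merge. Cost: O(len(index) + len(queries)).
func hasManyLinear(index, queries []uint64) []bool {
	found := make([]bool, len(queries))
	i := 0
	for q, want := range queries {
		for i < len(index) && index[i] < want {
			i++
		}
		found[q] = i < len(index) && index[i] == want
	}
	return found
}

// hasManyBinary (hypothetical name) runs one binary search per query.
// Cost: O(len(queries) * log(len(index))).
func hasManyBinary(index, queries []uint64) []bool {
	found := make([]bool, len(queries))
	for q, want := range queries {
		i := sort.Search(len(index), func(i int) bool { return index[i] >= want })
		found[q] = i < len(index) && index[i] == want
	}
	return found
}

func main() {
	index := []uint64{2, 3, 5, 8, 13, 21, 34} // sorted, as a table index would be
	queries := []uint64{3, 4, 21}             // sorted queries
	fmt.Println(hasManyLinear(index, queries))
	fmt.Println(hasManyBinary(index, queries))
}
```

Both functions return the same answers; a heuristic like the one the commit message mentions would choose between them based on the ratio of queries to index entries.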
Aaron Son
484e042b68 go/store/nbs: Remove unnecessary fdCache. Just use clone() and Close() on the tableReaderAt. 2023-02-27 12:03:46 -08:00
Aaron Son
508f67554d go/store/nbs: Make a chunkSource able to return a real io.Reader.
With TableFileStore.WriteTableFile() being used on the clone path, we want to
turn chunk sources into an io.ReadCloser. In the past, store/nbs always did
this by using the ReaderAt to emulate an io.Reader. For files this may have OK
performance, but for AWS/Blobstore remotes it causes a serial sequence of lots
of small reads against blob storage as the io.Copy() implementation copies a
few dozen KBs at a time.

This change makes it so a chunkSource can return an actual Reader.
file_table_reader leaves the ReaderAt emulation in place for now. One reason
to potentially get rid of even that layer is to enable something like
.WriteTo() -> sendfile() translations. Those are not currently directly in
place because we currently instrument the returned Reader with byte counting,
for example, so we can report on Clone progress. One reason to leave the
ReaderAt implementation in place, on the other hand, is interactions with file
descriptors and the file descriptor cache. This is an investigation for later;
it seemed most prudent to leave it for now since it's not currently causing
pain.
2023-02-03 14:56:15 -08:00
Aaron Son
46e87a7e2b go/store/nbs: tableReader, *NomsBlockStore: Use errgroup.SetLimit to enforce limit on io parallelism across the GetMany call, instead of just at the chunk source. 2023-02-02 12:37:53 -08:00
Andy Arthur
d3e960e927 refactored chunkSource interface 2022-12-07 18:07:42 -08:00
Andy Arthur
e997e02c8a added getRecordRanges to chunkSource interface 2022-12-01 14:22:48 -08:00
Andy Arthur
edcc243c13 added chunkJournal, journalWriter, and journalChunkSource 2022-11-28 14:54:11 -08:00
Andy Arthur
90fd4e6788 refactored MemoryQuotaProvider to manage buffers more directly 2022-11-03 17:03:51 -07:00
Andy Arthur
36c83f1b85 removed chunkReader.extract(), converted public method to private for chunkReader, chunkSource and tableSet 2022-10-26 14:27:12 -07:00
Andy Arthur
7f66f4c7e3 remove chunkReadPlanner interface 2022-10-25 13:43:12 -07:00
Aaron Son
796a9e5a0d Merge pull request #3307 from dolthub/aaron/nbs-table-reader-findOffsets-remaining-fix
[no-release-notes] go/store/nbs/table_reader.go: Fix a bug in findOffsets where we were sometimes returning remaining == false.
2022-04-27 09:29:05 -07:00
Dhruv Sringari
587142caf2 revert table_reader changes
This reverts the changes made to table_reader.go in commit d3166e88710f9968c98a0613af5f9ab4a43d7d8e
2022-04-26 17:12:42 -07:00
Aaron Son
775f9504d2 go/store/nbs/table_reader.go: Fix a bug in findOffsets where we were sometimes returning remaining == false.
In the case where we find a matching prefix but no matching suffixes, we were
failing to appropriately set remaining = true.
2022-04-26 17:03:52 -07:00
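The invariant behind the fix above can be sketched as follows. This is a simplified model, not the real findOffsets: the request and indexEntry types and the findMatches function are hypothetical, and a plain loop replaces the real index lookup. The point is only that a request whose prefix matches an index entry but whose suffix matches none must still be reported as remaining.

```go
package main

import "fmt"

// request and indexEntry are hypothetical stand-ins for the real types.
type request struct {
	prefix uint64
	suffix string
	found  bool
}

type indexEntry struct {
	prefix uint64
	suffix string
}

// findMatches marks every request present in idx as found. It returns
// remaining == true if at least one request is still unresolved.
func findMatches(idx []indexEntry, reqs []request) (remaining bool) {
	for i := range reqs {
		if reqs[i].found {
			continue
		}
		matched := false
		for _, e := range idx {
			if e.prefix == reqs[i].prefix && e.suffix == reqs[i].suffix {
				matched = true
				break
			}
		}
		if matched {
			reqs[i].found = true
		} else {
			// Covers both "no such prefix" and "prefix matched, but no
			// suffix did". The second case is the one the commit describes.
			remaining = true
		}
	}
	return remaining
}

func main() {
	idx := []indexEntry{{prefix: 1, suffix: "aa"}}
	reqs := []request{{prefix: 1, suffix: "bb"}} // same prefix, different suffix
	fmt.Println(findMatches(idx, reqs), reqs[0].found)
}
```

A caller that sees remaining == false stops consulting further chunk sources, so under-reporting it can make a chunk that lives in another table file look absent.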
Dhruv Sringari
783a6f2fe9 optimize index usage and fix acquire quota underestimate 2022-04-07 13:22:07 -07:00
Dhruv Sringari
2e38b25dbc thread memory quota provider through tableSet 2022-03-24 12:27:41 -07:00
Dhruv Sringari
e7a342fed2 report chunkSource size 2022-02-17 14:58:40 -08:00
Dhruv Sringari
03536159c5 better onHeapTableIndex 2022-02-14 12:54:20 -08:00
Dhruv Sringari
c78e02b2b3 ordinals and prefixes should err 2022-02-09 19:03:57 -08:00
Dhruv Sringari
521d4f2728 change tableIndex interface to return errors 2022-02-09 19:03:12 -08:00
Aaron Son
550075425a go/store/nbs: Fix bug in table_reader where max read size was 128GB instead of 128MB. 2021-11-15 12:06:13 -08:00
Aaron Son
abffd995df go/libraries/doltcore/remotestorage: Interface for cache returns true if cache is full / truncated in size. 2021-10-04 16:51:58 -07:00
reltuk
8d5e58a085 [ga-format-pr] Run go/utils/repofmt/format_repo.sh and go/Godeps/update.sh 2021-09-30 19:12:08 +00:00
Aaron Son
2cd16b0a46 go/store/nbs: table_reader.go: canReadAhead: Limit how large a read can get from coalescing individual chunk reads. 2021-09-30 12:10:33 -07:00
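Read coalescing with a size cap, as named in the commit above, can be sketched like this. The span type, the coalesce function and the maxGap and maxRead parameters are assumptions made for illustration; the real canReadAhead logic in table_reader.go may differ in its details.

```go
package main

import "fmt"

// span (hypothetical) is one chunk's byte range inside a table file.
type span struct{ offset, length uint64 }

// coalesce merges sorted, non-overlapping spans into batched reads.
// A span joins the previous batch only if the gap between them is at most
// maxGap and the merged read stays within maxRead bytes.
func coalesce(spans []span, maxGap, maxRead uint64) []span {
	var out []span
	for _, s := range spans {
		if n := len(out); n > 0 {
			last := &out[n-1]
			end := last.offset + last.length
			merged := s.offset + s.length - last.offset
			if s.offset >= end && s.offset-end <= maxGap && merged <= maxRead {
				last.length = merged
				continue
			}
		}
		out = append(out, s)
	}
	return out
}

func main() {
	spans := []span{{0, 10}, {12, 10}, {24, 10}, {100, 10}}
	fmt.Println(coalesce(spans, 4, 25))
}
```

With these inputs the first two spans merge into one 22-byte read. The third is close enough to merge but would push the read to 34 bytes, past the 25-byte cap, so it starts a new batch; without the cap, a long run of nearby chunks could grow into a single arbitrarily large read.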
Brian Hendriks
86bd20a4cd dolt roots command (#1891) 2021-07-04 07:51:07 -07:00
Zach Musgrave
dd39692543 Formatting
Signed-off-by: Zach Musgrave <zach@dolthub.com>
2020-11-06 17:10:04 -08:00
Zach Musgrave
9d2b8ff3a7 Name change for mmap-go
Signed-off-by: Zach Musgrave <zach@dolthub.com>
2020-11-06 17:08:57 -08:00
Aaron Son
84c3066348 go/**/*.go: Update copyright headers for company name change. 2020-11-02 10:17:02 -08:00
Aaron Son
623606c07c format_repo.sh 2020-10-14 17:06:32 -07:00
Aaron Son
8e6815dd21 go/store/nbs: table_reader.go: Refactor readAtOffsets and read batch handling a little bit. 2020-10-14 17:04:10 -07:00
Aaron Son
91b6168c1d go/store/nbs: table_reader.go: canReadAhead: Small parameters cleanup. 2020-10-14 16:32:54 -07:00
Aaron Son
cbda6e0004 go/store/nbs: table_reader.go: Rework how work is built and communicated in getManyAtOffsetsWithReadFunc a bit. 2020-10-14 16:29:17 -07:00
Aaron Son
c713363972 go/store/nbs: Convert chunkReader interfaces from atomicerr/waitgroup to errgroup. 2020-10-14 16:17:08 -07:00
Aaron Son
743539113a go/store/nbs: table_reader.go: Small cleanup on unused reqs param in some read functions. 2020-10-14 15:19:00 -07:00
Aaron Son
6963134640 go/store/nbs: Convert GetManyCompressed to callback instead of output channel. 2020-10-14 14:46:50 -07:00
Aaron Son
cbc28198b9 go/store/chunks: Convert ChunkStore GetMany interface to send results to callback instead of output channel. 2020-10-14 14:28:18 -07:00
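The two API shapes named in the commits above can be contrasted in a short sketch. The chunk type, the map-backed store and the function names getManyChan and getManyCallback are hypothetical and only illustrate the difference between the styles; they are not the ChunkStore interface itself.

```go
package main

import "fmt"

// chunk (hypothetical) pairs a hash with its data.
type chunk struct {
	hash string
	data []byte
}

// getManyChan delivers results on an output channel. The caller has to
// drain the channel and arrange for it to be closed.
func getManyChan(store map[string][]byte, hashes []string, out chan<- chunk) {
	for _, h := range hashes {
		if d, ok := store[h]; ok {
			out <- chunk{hash: h, data: d}
		}
	}
}

// getManyCallback delivers each result by calling found directly, so no
// extra goroutine or channel lifecycle is needed at the call site.
func getManyCallback(store map[string][]byte, hashes []string, found func(chunk)) {
	for _, h := range hashes {
		if d, ok := store[h]; ok {
			found(chunk{hash: h, data: d})
		}
	}
}

func main() {
	store := map[string][]byte{"a": []byte("1"), "b": []byte("2")}
	hashes := []string{"a", "missing", "b"}

	// Channel style: producer goroutine plus a draining loop.
	out := make(chan chunk)
	go func() {
		defer close(out)
		getManyChan(store, hashes, out)
	}()
	for c := range out {
		fmt.Println("channel:", c.hash)
	}

	// Callback style: a single call.
	getManyCallback(store, hashes, func(c chunk) {
		fmt.Println("callback:", c.hash)
	})
}
```

In the callback form the caller decides how results are collected; if the implementation invokes the callback from several goroutines, the callback must be safe for concurrent use.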
Daylon Wilkins
b5bb663233 Reference new org name and updated trigger logic 2020-09-25 15:35:04 -07:00
Aaron Son
36988fb9ee go/store/nbs: mmapTableIndex: PR feedback. 2020-08-21 14:26:55 -07:00
Aaron Son
0ae16bf290 go/store/nbs: table_reader_test: Add some quick smoke tests for on heap and mmap table index. 2020-08-19 15:08:49 -07:00
Aaron Son
781337d468 go/store/nbs: Add Clone() methods to tableReader, chunkSource. Make store Rebase clone chunkSources. 2020-08-17 13:10:07 -07:00
Aaron Son
7bdc8f1905 go/store/nbs: tableIndex gets Clone and Close. 2020-08-17 11:33:29 -07:00
Aaron Son
42660515fa format_repo.sh 2020-08-07 10:30:09 -07:00
Aaron Son
fdf28fd5cb go/store/nbs: Use liquidata-inc/mmap-go. 2020-08-07 10:26:38 -07:00
Aaron Son
fb22193c4f format_repo.sh. 2020-08-06 15:08:06 -07:00
Aaron Son
58715fc957 go/store/nbs: Improve names of tableIndex methods. No underscores. 2020-08-06 12:55:35 -07:00
Aaron Son
a0bced3ec6 go/store/nbs: Convert tableReader completely to tableIndex interface. Implement compaction based on it. 2020-08-06 12:39:32 -07:00
Aaron Son
d370876e3b go/store/nbs: Iterate on table reader conversion to using table index interface. 2020-08-05 17:58:37 -07:00
Aaron Son
6cff5ee657 go/store/nbs: Add some index interface method implementations to mmapTableIndex. 2020-08-05 17:33:53 -07:00
Aaron Son
cf288f14d6 go/store/nbs: Access ordinals from method in store compaction. 2020-08-05 17:17:34 -07:00
Aaron Son
36be60c504 go/store/nbs: table_reader: addr -> *addr in places. 2020-08-05 15:21:54 -07:00