Commit Graph

103 Commits

Author SHA1 Message Date
Aaron Son
27d1f6e481 go/libraries/doltcore/remotestorage: Some cleanup around chunk fetcher. 2024-05-06 11:40:26 -07:00
Aaron Son
38b16880a2 go/store/datas/pull: Let ChunkStore implementation bring their own ChunkFetch implementation. 2024-05-06 11:34:09 -07:00
Aaron Son
b05d3d9a2d go/store/datas/pull: pull_chunk_fetcher: Move chunk fetching to a streaming interface instead of batch.
We want to better pipeline I/O when pulling from a remote, and moving to a
streaming interface where storage can see more of the addresses that are needed
at once will allow us to achieve it.

For now, we implement the streaming interface just by calling the existing
batch get interface.
2024-03-25 11:00:21 -07:00
Aaron Son
43bc4cd641 go/store/datas/pull: pull_table_file_writer.go: Further experimentation with thread structure.
Use a buffered channel for pending table file writes. Make the upload
management thread more transparently responsible for just accumulating manifest
updates and calling AddTableFilesToManifest.
2024-03-19 11:58:25 -07:00
Aaron Son
346f072704 go/store/datas/pull: pull_table_file_writer.go: Experiment with different thread structure. 2024-03-19 11:37:18 -07:00
Aaron Son
5cbcb7b09b go/store/datas/pull: pull_table_file_writer_test.go: Make sure to close readers some more. 2024-03-15 11:53:21 -07:00
Aaron Son
deaa3f053c go/store/datas/pull: pull_table_file_writer_test.go: Make sure to close readers. 2024-03-15 11:27:51 -07:00
Aaron Son
3fa1a008d8 go/store/datas/pull: pull_table_file_writer.go: Fix error returned by pull table file writer to include the source of the issue. 2024-03-14 16:31:46 -07:00
Aaron Son
9e99878f90 Merge remote-tracking branch 'origin/main' into aaron/puller-table-file-writer 2024-03-14 13:46:01 -07:00
reltuk
d362e7f206 [ga-format-pr] Run go/utils/repofmt/format_repo.sh and go/Godeps/update.sh 2024-03-11 17:08:16 +00:00
Aaron Son
2c46b380e6 go/store/datas/pull: Fix crash in pull_chunk_tracker. 2024-03-11 10:00:17 -07:00
Aaron Son
0357a43590 Revert "go/store/datas/pull: Revert puller changes from 1.34.0 until we investigate changes in resource utilization."
This reverts commit e98757c136.
2024-03-11 09:48:11 -07:00
Aaron Son
e98757c136 go/store/datas/pull: Revert puller changes from 1.34.0 until we investigate changes in resource utilization. 2024-03-10 18:09:11 -07:00
Aaron Son
edbed08399 go/store/datas/pull: pull_table_file_writer.go: Optimize pull/push to allow for concurrent table file uploads.
Move management of the upload workers and the table file writer to its own
struct which is testable in isolation.
2024-03-06 11:51:06 -08:00
reltuk
ccb3dd3e17 [ga-format-pr] Run go/utils/repofmt/format_repo.sh and go/Godeps/update.sh 2024-03-05 20:13:26 +00:00
Aaron Son
dce814d749 go/store/datas/pull: pull_chunk_tracker.go: PR feedback: Round out the last batch returned from GetChunksToFetch. 2024-03-05 12:04:06 -08:00
Aaron Son
a19717a975 go/store/datas/pull: pull_chunk_tracker.go: PR feedback: Add a test for when HasMany returns a subset of queried chunks. 2024-03-05 11:38:33 -08:00
Aaron Son
7b7c3b2679 go/store/datas/pull: pull_chunk_tracker.go: PR feedback: Add a comment about PullChunkTracker. 2024-03-05 11:11:42 -08:00
Aaron Son
b3e3082bb0 go/store/datas/pull: pull_chunk_tracker.go: PR feedback: Make HasManyThreadCount a constant. 2024-03-05 11:01:39 -08:00
Aaron Son
383a196d53 go/store/datas/pull: puller.go: Simplify diff to relevant portion of this change. 2024-03-04 12:13:46 -08:00
Aaron Son
1a6bf25c2d go/store/datas/pull: PullChunkTracker: Make sure that the initial set of hashes to pull are also seen. 2024-02-29 10:27:41 -08:00
Aaron Son
c5afe61300 go/store/datas/pull: Create a PullChunkTracker for keeping track of what to pull.
The PullChunkTracker is an optimization which can concurrently call HasMany on
the destination database while chunks are being pulled from the source
database.
2024-02-28 15:57:43 -08:00
Neil Macneale IV
ce26453382 Enable sql-server to receive push updates via remoteapi 2023-12-12 16:47:09 -08:00
zachmu
9c57399740 [ga-format-pr] Run go/utils/repofmt/format_repo.sh and go/Godeps/update.sh 2023-09-27 22:14:05 +00:00
Zach Musgrave
4d9d2c9446 Moved all env constants to a new dconfig package 2023-09-26 12:00:00 -07:00
Zach Musgrave
8886f5bff2 Moving all environment variables to consts in the same file 2023-09-26 12:00:00 -07:00
Aaron Son
c0b0cc42d1 dolt clone: Fix dolt clone run against a sql-server where the database has been GCd.
A long-standing bug in the remotesapi which the sql-server exposes could cause
a `dolt clone` to fail when running against a database which had been garbage
collected. This change fixes the bug in the server. It also patches the client
behavior so that it will tolerate responses from older versions of dolt.
2023-06-27 16:41:22 -07:00
Aaron Son
6f646a7f01 go/store/datas/pull/clone.go: Fix clone to interact better with sql-server remotes.
Before this fix, |dolt clone| against a sql-server remote can fail with a
confusing error message if the sql-server has a chunk journal. Adding the chunk
journal to our destination ChunkStore causes us to update the root hash, which
the Clone code was not expecting.

This change updates the Clone code to look for the possible update to the
destination's root hash and to not try to set the root hash on the ChunkStore
if it has already been done by the code handling the chunk journal.
2023-05-09 15:25:07 -07:00
Aaron Son
6edc6fd54a go: env/actions: remotes.go: In SyncRoots, if we are syncing to an empty destination repository, use pull.Clone instead of pull.Pull.
pull.Clone uses the TableFileStore interface to transit whole table files
without needing to follow chunk references or do any reconciliation with the
destination database regarding what it already has. It is much faster against
every type of remote.
2023-05-02 17:12:03 -07:00
Aaron Son
7406c4658a go/store/nbs: Fix some quota leaks in conjoin, GC.
Adds a paranoid mode where we noisily detect unclosed table files. The mode can
be enabled by setting an environment variable.

Fixes some unit tests, including all of go/store/... to run cleanly under the
paranoid mode.

Changes the quota interface to:
* Release |sz int| bytes instead of requiring a []byte with the correct length
  to show up.
* Work with |int| instead of |uint64|, since MaxUint64 is never allocatable and
  MaxInt32+z is only allocatable on 64-bit platforms.
* Not return an error on Release(). Implementations should not fail to release
  quota.
2023-02-16 16:01:20 -08:00
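The three quota interface changes listed in the commit above could look roughly like this. This is a hedged sketch, not the interface in go/store/nbs: `QuotaProvider`, `FixedQuota`, `NewFixedQuota` and `ErrQuotaExceeded` are hypothetical names chosen for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrQuotaExceeded is a hypothetical error for a failed acquisition.
var ErrQuotaExceeded = errors.New("quota exceeded")

// QuotaProvider sketches the shape described in the commit message:
// sizes are plain ints, Release takes a byte count instead of the
// original slice, and Release cannot fail.
type QuotaProvider interface {
	Acquire(sz int) ([]byte, error)
	Release(sz int)
}

// FixedQuota is a toy provider with a fixed byte budget.
type FixedQuota struct {
	mu    sync.Mutex
	limit int
	used  int
}

func NewFixedQuota(limit int) *FixedQuota {
	return &FixedQuota{limit: limit}
}

// Acquire reserves sz bytes and returns a buffer of that length.
func (q *FixedQuota) Acquire(sz int) ([]byte, error) {
	if sz < 0 {
		return nil, errors.New("negative size")
	}
	q.mu.Lock()
	defer q.mu.Unlock()
	if sz > q.limit-q.used {
		return nil, ErrQuotaExceeded
	}
	q.used += sz
	return make([]byte, sz), nil
}

// Release returns sz bytes to the budget. It never returns an error.
func (q *FixedQuota) Release(sz int) {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.used -= sz
	if q.used < 0 {
		q.used = 0 // clamp rather than fail on an over-release
	}
}

func main() {
	var q QuotaProvider = NewFixedQuota(100)
	buf, err := q.Acquire(60)
	fmt.Println(len(buf), err)
	q.Release(len(buf))
}
```

Releasing by size means a caller that has sliced or handed off the buffer can still return the quota, which is one plausible reason for dropping the []byte requirement.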
AndyA
c113089de2 Merge pull request #5282 from dolthub/andy/batch-stack
go/store/datas/pull: try to avoid unbounded growth of outstanding abs…
2023-02-02 13:00:12 -08:00
Aaron Son
f41cd010aa Merge pull request #5270 from dolthub/aaron/online-gc-prune-table-files-change
go/store/types: value_store.go: Change GC implementation to call TableFileStore.PruneTableFiles after the copy is complete.
2023-02-02 12:19:05 -08:00
Andy Arthur
7ae021dd6a go/store/datas/pull: refactored puller code again 2023-02-02 10:22:29 -08:00
Andy Arthur
002641e5ab go/store/datas/pull: fix bug in visited set update 2023-02-01 17:36:27 -08:00
Andy Arthur
0bd07d8252 go/store/datas/pull: unref visited set in pull earlier 2023-02-01 16:50:21 -08:00
Andy Arthur
83c369e47b go/store/datas/pull: some renaming in puller 2023-02-01 16:47:30 -08:00
Andy Arthur
d7b58abc8b go/store/datas/pull: try to avoid unbounded growth of outstanding absent set 2023-02-01 16:45:07 -08:00
Andy Arthur
c782af7a4a go/store/datas/pull: also reduce nextAbsent footprint 2023-02-01 15:29:16 -08:00
Andy Arthur
1c39fea3c1 go/store/datas/pull: trim memory footprint for puller hash sets 2023-02-01 15:13:01 -08:00
Brian Hendriks
56706c0826 add another layer of batching 2023-01-31 15:51:11 -08:00
Aaron Son
ff63732b49 go/store/types: value_store.go: Change GC implementation to call TableFileStore.PruneTableFiles after the copy is complete. 2023-01-31 15:48:39 -08:00
Taylor Bantle
cee7e15eb7 Remove datas/pull 2023-01-12 10:23:58 -08:00
Taylor Bantle
546cca8f0c Fix gc unit tests 2023-01-12 09:26:25 -08:00
Taylor Bantle
f7d2a767f9 Use blobstore for MemFactory 2023-01-12 09:26:25 -08:00
Taylor Bantle
3339cf34c2 Remove PutMany 2023-01-12 09:26:02 -08:00
Taylor Bantle
7b759f2c75 More fixes 2023-01-12 09:26:02 -08:00
Taylor Bantle
98b073070c Fix some test failures 2023-01-12 09:26:02 -08:00
Taylor Bantle
98871db953 First attempt at PutMany 2023-01-12 09:26:02 -08:00
Taylor Bantle
3b06cb373b Uncomment getAddrs code that breaks tests 2023-01-12 09:26:02 -08:00
Taylor Bantle
18fedd79ad Add sanity check to Put
This reverts commit b1de143a16.
2023-01-12 09:26:02 -08:00