This patch uses process-wide per-store locking to ensure that only one
NomsBlockStore instance is ever trying to update the upstream NBS
manifest at a time. It also locks out attempts to fetch the manifest
contents during that window.
Conjoining is now much simpler. Since only one instance can ever be in
the critical path of Commit at a time, and conjoining is triggered on
that critical path, we now simply perform the conjoin while excluding
all other in-process NBS instances. Hopefully, locking out instances
that want to fetch the manifest contents during a conjoin won't
cripple performance.
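The process-wide per-store locking described above can be sketched as a registry of locks keyed by store; the names here (`lockFor`, `storeLocks`) are hypothetical, not the actual identifiers in the patch.

```go
package main

import (
	"fmt"
	"sync"
)

// A process-wide registry of per-store locks, keyed by the store's path.
// Every NomsBlockStore instance backed by the same store must acquire the
// same lock before updating (or fetching) the upstream manifest.
var (
	registryMu sync.Mutex
	storeLocks = map[string]*sync.Mutex{}
)

// lockFor returns the single process-wide lock for |storePath|, creating it
// on first use.
func lockFor(storePath string) *sync.Mutex {
	registryMu.Lock()
	defer registryMu.Unlock()
	mu, ok := storeLocks[storePath]
	if !ok {
		mu = &sync.Mutex{}
		storeLocks[storePath] = mu
	}
	return mu
}

func main() {
	mu := lockFor("/path/to/store")
	mu.Lock() // excludes every other in-process instance of this store
	// ... update the upstream manifest ...
	mu.Unlock()
	fmt.Println("manifest updated under the store-wide lock")
}
```

Two instances opened against the same path get the very same `*sync.Mutex`, which is what makes the exclusion process-wide rather than per-instance.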
Fixes #3583
If NomsBlockStore can assume that its manifest is a cachingManifest,
it can pre-emptively check to see if someone else in-process has
already moved the manifest forward and, if so, fail early.
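The early-out can be sketched as below; `cachedManifest` and `staleFor` are illustrative names, not the real API.

```go
package main

import (
	"fmt"
	"sync"
)

// cachedManifest stands in for the in-process manifest cache that a
// cachingManifest provides.
type cachedManifest struct {
	mu   sync.Mutex
	root string // latest root hash observed in-process
}

func (m *cachedManifest) set(root string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.root = root
}

// staleFor reports whether some other in-process instance has already moved
// the manifest past |last|, letting Commit fail early without ever touching
// persistent storage.
func (m *cachedManifest) staleFor(last string) bool {
	m.mu.Lock()
	defer m.mu.Unlock()
	return m.root != last
}

func main() {
	m := &cachedManifest{root: "r1"}
	fmt.Println(m.staleFor("r1")) // false: we're current, proceed with Commit
	m.set("r2")                   // another in-process instance commits
	fmt.Println(m.staleFor("r1")) // true: fail early
}
```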
Fixes #3574
Previously, every NomsBlockStore instance decided when to conjoin
tables (and which to conjoin) entirely on its own, which led to A LOT
of concurrent conjoining that would mostly be wasted effort, as one
instance would win the race and then all the rest would drop their
work on the floor, rebase, and continue. This patch introduces a
'conjoiner' that is either process-global, or owned by one of the NBS
factory objects you can create. Now, NBS instances vended by a given
factory call this single conjoiner during Commit(), asking it to
perform a conjoin if necessary. If a conjoin is already underway, the
conjoiner blocks the caller until it's finished and then
returns. Whether the conjoin was triggered at the caller's request, or
the caller got to opportunistically piggyback on a conjoin already in
progress, the caller must rebase after Conjoin() returns.
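The block-and-piggyback behavior can be sketched like this, assuming a shared conjoiner object; the field names and the `doConjoin` callback are illustrative only.

```go
package main

import (
	"fmt"
	"sync"
)

// conjoiner sketch: one instance per process (or per NBS factory). Callers
// that arrive while a conjoin is underway block until it finishes,
// piggybacking on the in-flight work; either way, the caller must rebase
// after Conjoin returns.
type conjoiner struct {
	mu     sync.Mutex
	inProg *sync.WaitGroup // non-nil while a conjoin is running
	runs   int             // number of conjoins actually executed
}

func (c *conjoiner) Conjoin(doConjoin func()) {
	c.mu.Lock()
	if wg := c.inProg; wg != nil {
		c.mu.Unlock()
		wg.Wait() // piggyback: wait for the conjoin already in progress
		return
	}
	wg := &sync.WaitGroup{}
	wg.Add(1)
	c.inProg = wg
	c.mu.Unlock()

	doConjoin() // we are the one caller actually doing the work

	c.mu.Lock()
	c.runs++
	c.inProg = nil
	c.mu.Unlock()
	wg.Done() // release any piggybacking callers
}

func main() {
	c := &conjoiner{}
	c.Conjoin(func() { fmt.Println("conjoining upstream tables") })
	// after Conjoin returns, the caller rebases against the new manifest
}
```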
Fixes #3422
Clean up NBS cruft standing in the way of improvements:
Unmap buffer in newMmapTableReader()
By the time this function exits, we're done with this buffer.
Hanging on to it complicates lifetime management for the file
backing the mmapTableReader, which is something I'm trying to
make simpler. So...ditch it!
remove compactSourcesToBuffer
replace with simpler test-focused version
The old compaction code loaded all chunks to be compacted into memory,
assembled a compacted table, and then persisted it to backing storage.
The nice thing about this was that we could de-dup chunks across the
compacted tables. The bad thing was that we needed to hold all the
chunks in memory at once.

That turned out to be a problem, so we've moved to a new strategy that
calculates only the merged index for the compacted table in memory,
but streams chunk data directly from old tables to the new, big
table. This should be a big win on S3 at least, because it turns out
that for tables with > 5MB and < 5GB of chunk data, we can actually
just tell S3 to reference a range of the existing object when building
a compacted table.
Fixes #3411
BatchStore is dead, long live ChunkStore! Merging these two required
some modification of the old ChunkStore contract to make it more
BatchStore-like in places, most specifically around Root(), Put() and
PutMany().
The first big change is that Root() now returns a cached value for the
root hash of the Store. This is how NBS worked already, so the more
interesting change here is the addition of Rebase(), which loads the
latest persistent root. Any chunks that appeared in backing storage
since the ChunkStore was opened (or last rebased) also become
visible.
UpdateRoot() has been replaced with Commit(), because UpdateRoot() was
ALREADY doing the work of persisting novel chunks as well as moving
the persisted root hash of the ChunkStore in both NBS and
httpBatchStore. This name, and the new contract (essentially Flush() +
UpdateRoot()), is a more accurate representation of what's going on.
As for Put(), the former contract claimed to block until the chunk
was durable. That's no longer the case. Indeed, NBS was already not
fulfilling this contract. The new contract reflects this, asserting
that novel chunks aren't persisted until a Flush() or Commit() --
which has replaced UpdateRoot(). Novel chunks are immediately visible
to Get and Has calls, however.
In addition to this larger change, there are also some tweaks to
ValueStore and Database. ValueStore.Flush() no longer takes a hash,
and instead just persists any and all Chunks it has buffered since the
last time anyone called Flush(). Database.Close() used to have some
side effects where it persisted Chunks belonging to any Values the
caller had written -- that is no longer so. Values written to a
Database only become persistent upon a Commit-like operation (Commit,
CommitValue, FastForward, SetHead, or Delete).
/******** New ChunkStore interface ********/
type ChunkStore interface {
	ChunkSource
	RootTracker
}
// RootTracker allows querying and management of the root of an entire tree of
// references. The "root" is the single mutable variable in a ChunkStore. It
// can store any hash, but it is typically used by higher layers (such as
// Database) to store a hash to a value that represents the current state and
// entire history of a database.
type RootTracker interface {
	// Rebase brings this RootTracker into sync with the persistent storage's
	// current root.
	Rebase()

	// Root returns the currently cached root value.
	Root() hash.Hash

	// Commit atomically attempts to persist all novel Chunks and update the
	// persisted root hash from last to current. If last doesn't match the
	// root in persistent storage, returns false.
	// TODO: is last now redundant? Maybe this should just try to update from
	// the cached root to current?
	// TODO: Does having a separate RootTracker make sense anymore? BUG 3402
	Commit(current, last hash.Hash) bool
}
// ChunkSource is a place chunks live.
type ChunkSource interface {
	// Get the Chunk for the value of the hash in the store. If the hash is
	// absent from the store, nil is returned.
	Get(h hash.Hash) Chunk

	// GetMany gets the Chunks with |hashes| from the store. On return, all
	// chunks that were found will have been sent to |foundChunks|. Any
	// absent chunks are silently ignored.
	GetMany(hashes hash.HashSet, foundChunks chan *Chunk)

	// Has returns true iff the value at the address |h| is contained in the
	// source.
	Has(h hash.Hash) bool

	// HasMany returns a new HashSet containing any members of |hashes| that
	// are present in the source.
	HasMany(hashes hash.HashSet) (present hash.HashSet)

	// Put caches c in the ChunkSink. Upon return, c must be visible to
	// subsequent Get and Has calls, but must not be persistent until a call
	// to Flush(). Put may be called concurrently with other calls to Put(),
	// PutMany(), Get(), GetMany(), Has() and HasMany().
	Put(c Chunk)

	// PutMany caches chunks in the ChunkSink. Upon return, all members of
	// chunks must be visible to subsequent Get and Has calls, but must not
	// be persistent until a call to Flush(). PutMany may be called
	// concurrently with other calls to Put(), PutMany(), Get(), GetMany(),
	// Has() and HasMany().
	PutMany(chunks []Chunk)

	// Version returns the NomsVersion with which this ChunkSource is
	// compatible.
	Version() string

	// Flush ensures that, on return, any previously Put chunks are durable.
	// It is not safe to call Flush() concurrently with Put() or PutMany().
	Flush()

	io.Closer
}
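The intended calling pattern under the new contract (Commit, and on optimistic-lock failure Rebase and retry) can be shown with a tiny in-memory stand-in. `fakeStore` and `commitWithRetry` are illustrative only; the real interfaces live in the chunks package and use hash.Hash rather than strings.

```go
package main

import "fmt"

// fakeStore is a minimal in-memory stand-in for the new ChunkStore contract,
// tracking a "persisted" root and a locally cached root.
type fakeStore struct {
	persisted string
	cached    string
}

func (s *fakeStore) Root() string { return s.cached }
func (s *fakeStore) Rebase()      { s.cached = s.persisted }
func (s *fakeStore) Commit(current, last string) bool {
	if s.persisted != last {
		return false // optimistic lock failure: someone else committed
	}
	s.persisted, s.cached = current, current
	return true
}

// commitWithRetry attempts Commit; on failure it Rebases to pick up the
// winner's changes, then tries again from the new cached root.
func commitWithRetry(s *fakeStore, newRoot string) {
	for !s.Commit(newRoot, s.Root()) {
		s.Rebase()
	}
}

func main() {
	s := &fakeStore{persisted: "r0", cached: "r0"}
	s.persisted = "r1" // another client commits behind our back
	commitWithRetry(s, "r2")
	fmt.Println(s.persisted) // r2
}
```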
Fixes #2945
Introduce a "lock" hash into NBS manifests to address the bad
interaction between Flush() and optimistic locking. Our original
design didn't include Flush(), which changes the set of tables without
updating the root. Thus... an optimistic locking strategy predicated
on checking the currently-persisted root hash is not robust to
interleaved Flush() calls from multiple clients.
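The lock-hash idea can be sketched like this; the struct layout and `tryUpdate` name are assumptions, not the real manifest code.

```go
package main

import "fmt"

// manifest sketch: updates are guarded by a dedicated lock hash that changes
// on every manifest write, including Flush()es that add tables without
// moving the root. Checking the root alone would miss those writes.
type manifest struct {
	lock, root string
	tables     []string
}

// tryUpdate succeeds only if the caller read the latest lock hash.
func (m *manifest) tryUpdate(lastLock, newLock, newRoot string, tables []string) bool {
	if m.lock != lastLock {
		return false // somebody (perhaps a Flush) wrote the manifest since our read
	}
	m.lock, m.root, m.tables = newLock, newRoot, tables
	return true
}

func main() {
	m := &manifest{lock: "l0", root: "r0"}
	m.tryUpdate("l0", "l1", "r0", []string{"t1"}) // a Flush: tables change, root doesn't
	ok := m.tryUpdate("l0", "l2", "r1", nil)      // stale lock: must fail
	fmt.Println(ok, m.root) // false r0
}
```

Note that the second update fails even though the root it read ("r0") is still current, which is exactly the case a root-only check would get wrong.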
Fixes #3349
Though Raf and I can't figure out how, it's clear that the method we
initially used for calculating the max amount of space for
snappy-compressed chunk data was incorrect. That's the root cause of
this bug. The bound is now computed by walking all the chunks to be
written and summing the snappy.MaxEncodedLen() for each.
Fixes #3156
There's some case that causes chunks that compress to more than about
55k (we think these are quite big chunks, many hundreds of KB in size)
not to wind up correctly inserted into tables. It looks like the
snappy library believes the buffer we've allocated may not be large
enough, so it allocates its own space, and this screws us up.
This patch changes two things:
1) The CRC in the NBS format is now the CRC of the _compressed_ data
2) Such chunks will be manually copied into the table, so they won't
be missing anymore
Also, when the code detects a case where the snappy library decided to
allocate its own storage, it saves the uncompressed data off to the
side, so that it can be pushed to durable storage. Such chunks are
stored on disk or in S3 named like "<chunk-hash>-errata", and logging
is dumped out so we can figure out which tables were supposed to
contain these chunks.
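The detection can be sketched as below: `snappy.Encode(dst, src)` returns a slice backed by dst when dst was large enough, and a freshly allocated slice otherwise, so comparing the backing arrays reveals which happened. `usedOwnBuffer` is an illustrative helper, not code from the patch.

```go
package main

import "fmt"

// usedOwnBuffer reports whether |out| (the slice returned by an
// Encode-style call) is backed by the caller-supplied |dst|, or by a buffer
// the library allocated on its own.
func usedOwnBuffer(dst, out []byte) bool {
	if len(dst) == 0 || len(out) == 0 {
		return true
	}
	return &dst[0] != &out[0]
}

func main() {
	dst := make([]byte, 8)
	fmt.Println(usedOwnBuffer(dst, dst[:4]))         // false: our buffer was used
	fmt.Println(usedOwnBuffer(dst, make([]byte, 4))) // true: library allocated its own
}
```

When the check fires, the patch saves the uncompressed data aside (the "<chunk-hash>-errata" objects) instead of silently losing the chunk.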
Towards #3156
* First pass at compaction
The first cut at compaction blocks UpdateRoot() while it compacts n/2
tables down into a single, large table (where n == number of tables
named in the NBS manifest). It then attempts to update the manifest
with one referencing the compacted table, the novel tables from the
client, and the remaining upstream tables that were not compacted.
If the update fails, probably due to an optimistic lock failure, the
client drops the compacted table it just created, pulls in the tables
from the newly-discovered upstream manifest, and tries again.
Known flaws:
- may explode RAM (#3130)
- doesn't handle novel tables > max tables (#3142)
- may handle optimistic-lock-failures suboptimally (#3141)
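The table-selection step can be sketched as follows. Choosing the n/2 smallest tables is an assumption made for illustration; the commit text says only that n/2 of the upstream tables are compacted.

```go
package main

import (
	"fmt"
	"sort"
)

// chooseToCompact splits the upstream tables (by size) into a set to merge
// into one large table and a set to leave alone. Compacting the smallest
// half is a plausible heuristic, not necessarily the real one.
func chooseToCompact(tableSizes []int) (compact, keep []int) {
	sorted := append([]int(nil), tableSizes...)
	sort.Ints(sorted)
	n := len(sorted) / 2
	return sorted[:n], sorted[n:]
}

func main() {
	compact, keep := chooseToCompact([]int{5, 1, 4, 2})
	fmt.Println(compact, keep) // [1 2] [4 5]
}
```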
Fixes #3132
Also fixes #2944, because doing so simplifies some code.
Adds the ability to stream individual chunks requested via GetMany() back to caller.
Removes readAmpThresh and maxReadSize. Lowers the S3ReadBlockSize to 512k.
GetMany() calls can now be serviced by <= N goroutines, where N is the number of physical reads the request is broken down into.
This patch also adds a maxReadSize param to the code which decides how to break chunk reads into physical reads, and sets the S3 blockSize to 5MB, which experimentally resulted in lower total latency.
Lastly, some small refactors.
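The read-planning step can be sketched like this: walk the (sorted) chunk spans and merge each into the previous physical read unless doing so would push that read past maxReadSize. `planReads` and the exact merge policy are assumptions for illustration.

```go
package main

import "fmt"

// span is a byte range within a table file.
type span struct{ off, length int }

// planReads coalesces per-chunk reads into larger physical reads, capping
// each physical read at maxReadSize. Input spans are assumed sorted by
// offset.
func planReads(chunks []span, maxReadSize int) []span {
	var reads []span
	for _, c := range chunks {
		if n := len(reads); n > 0 {
			last := &reads[n-1]
			if c.off+c.length-last.off <= maxReadSize {
				last.length = c.off + c.length - last.off // extend the previous read
				continue
			}
		}
		reads = append(reads, c) // start a new physical read
	}
	return reads
}

func main() {
	reads := planReads([]span{{0, 10}, {10, 10}, {100, 10}}, 64)
	fmt.Println(len(reads), reads[0].length) // 2 20
}
```

Each resulting physical read can then be handled by its own goroutine, which is where the "<= N goroutines" bound comes from.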
Before we can defragment NBS stores, we need to understand how
fragmented they are. This tool provides a measure of fragmentation in
which optimal chunk-graph layout implies that ALL children of a given
parent can be read in one storage-layer operation (e.g. disk read, S3
transaction, etc).
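One way to sketch the measure: for each parent, count how many storage-layer reads its children require under the current layout; an optimal layout needs exactly one. The helper below is a simplification of that idea, not the tool itself.

```go
package main

import "fmt"

// readsForChildren counts the distinct storage-layer reads needed to fetch
// all of a parent's children, given their byte offsets and the size of one
// storage-layer operation. An optimal layout yields 1.
func readsForChildren(childOffsets []int, readSize int) int {
	blocks := map[int]bool{}
	for _, off := range childOffsets {
		blocks[off/readSize] = true
	}
	return len(blocks)
}

func main() {
	fmt.Println(readsForChildren([]int{0, 100, 200}, 4096))   // 1: optimal
	fmt.Println(readsForChildren([]int{0, 5000, 9000}, 4096)) // 3: fragmented
}
```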
Introduce a 'compactingChunkStore', which knows how to compact itself
in the background. It satisfies get/has requests from an in-memory
table until compaction is complete. Once compaction is done, it
destroys the in-memory table and switches over to using solely the
persistent table.
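The switch-over can be sketched as below; the struct and method names are illustrative stand-ins for the real compactingChunkStore.

```go
package main

import (
	"fmt"
	"sync"
)

// compactingStore serves get requests from an in-memory table until the
// background compaction lands, after which the memory table is destroyed
// and only the persistent table is consulted.
type compactingStore struct {
	mu        sync.Mutex
	mem       map[string][]byte // nil once compaction has completed
	persisted map[string][]byte
}

func (s *compactingStore) get(h string) []byte {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.mem != nil {
		if c, ok := s.mem[h]; ok {
			return c
		}
	}
	return s.persisted[h]
}

// compactionDone is called by the background goroutine once the persistent
// table is ready; the in-memory table is dropped.
func (s *compactingStore) compactionDone(persisted map[string][]byte) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.persisted = persisted
	s.mem = nil
}

func main() {
	s := &compactingStore{mem: map[string][]byte{"h1": []byte("chunk")}}
	fmt.Println(string(s.get("h1"))) // served from memory during compaction
	s.compactionDone(map[string][]byte{"h1": []byte("chunk")})
	fmt.Println(string(s.get("h1"))) // served from the persistent table
}
```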
Fixes #2879
This patch introduces/expands the 'manifest' and 'tableSet'
abstractions, so that NomsBlockStore is no longer explicitly using any
file system operations.
Towards issue #2877