stretchr has fixed a bug with the -count flag. I could merge these
changes into attic-labs, but it's easier to just use stretchr.
We forked stretchr a long time ago so that we wouldn't link the HTTP
testing libraries into the noms binaries (because we were using d.Chk in
production code). The HTTP issue doesn't seem to happen anymore, even
though we're still using d.Chk.
In httpChunkStore, calls to Get/Has and friends put a request object
with a 'return channel' onto a queue channel, and then block on the
return channel. The queue channel was buffered, which made it
impossible to cause Get, Has et al to terminate reliably when the
store was closed.
This patch removes the buffering on the channel so we can
deterministically bail from Get/Has et al when closing the store. I
don't think we were actually seeing any benefit from the buffer on the
queue channels, because everywhere we write to one of them we
immediately block on another channel, waiting for the result of the
request.
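The shape of the fix, as a minimal sketch (the type and field names here are illustrative, not the actual httpChunkStore internals):

type request struct {
	h  hash.Hash
	ch chan Chunk // the 'return channel'; the caller blocks on this
}

type httpStore struct {
	queue   chan request  // unbuffered: a send succeeds only when a worker is ready
	closeCh chan struct{} // closed by Close()
}

func (s *httpStore) Get(h hash.Hash) Chunk {
	req := request{h: h, ch: make(chan Chunk)}
	// With an unbuffered queue, exactly one of these cases wins: either a
	// worker accepts the request, or we observe the store closing. A
	// buffered queue could accept the request and then never service it.
	select {
	case s.queue <- req:
		return <-req.ch
	case <-s.closeCh:
		return EmptyChunk // illustrative 'store is closed' result
	}
}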
Fixes #3566
Add a method to chunks.Factory that allows the caller to
signal that it's willing to try to make forward progress
using an out-of-date ChunkStore. This allows AWSStoreFactory
and LocalStoreFactory to vend NBS instances without hammering
persistent storage every time.
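Roughly, the new shape of the interface (the added method's name below is illustrative, not necessarily what landed):

type Factory interface {
	// CreateStore returns a ChunkStore reflecting the latest persisted
	// state, which generally costs a round trip to storage.
	CreateStore(ns string) ChunkStore

	// CreateStoreFromCache (name illustrative) signals that the caller can
	// tolerate an out-of-date view, so the factory may vend a cached NBS
	// instance without touching persistent storage.
	CreateStoreFromCache(ns string) ChunkStore

	Shutter()
}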
Towards #3491
In an NBS world, bulk 'has' checks are waaaay cheaper than they used
to be. In light of this, we can toss out the complex logic we were
using in Pull() -- which basically existed for no reason other than to
avoid doing 'has' checks. Now, the code just descends a
tree of chunks breadth-first, using HasMany() at each level to figure
out which chunks are not yet in the sink all at once, and GetMany() to
pull them from the source in bulk.
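In sketch form (walkRefs below is illustrative; it stands in for whatever extracts the ref hashes from a decoded chunk):

// pull copies the chunk graph under |root| from src to sink, one level
// at a time, batching the 'has' and 'get' work for each level.
func pull(src, sink ChunkStore, root hash.Hash) {
	level := hash.HashSet{}
	level.Insert(root)
	for len(level) > 0 {
		// One bulk 'has' check prunes everything the sink already holds.
		present := sink.HasMany(level)
		absent := hash.HashSet{}
		for h := range level {
			if !present.Has(h) {
				absent.Insert(h)
			}
		}
		next := hash.HashSet{}
		if len(absent) > 0 {
			found := make(chan *Chunk, len(absent))
			src.GetMany(absent, found) // bulk read; all chunks sent on return
			close(found)
			for c := range found {
				sink.Put(*c)
				for _, r := range walkRefs(c) { // descend breadth-first
					next.Insert(r)
				}
			}
		}
		level = next
	}
}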
Fixes #3182, Towards #3384
ValueStore.Flush() now Puts all Chunks buffered in the ValueStore
layer into the underlying ChunkStore. The Chunks are not persistent
at that point; they become so only when the caller calls Commit() on
the ChunkStore.
This patch also removes ChunkStore.Flush(). The same effect can be
achieved by calling ChunkStore.Commit() with the current Root for both
last and current.
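Concretely, code that used to call cs.Flush() can now do something like:

// Persist buffered Chunks without moving the root: pass the current
// root as both |current| and |last|.
root := cs.Root()
if !cs.Commit(root, root) {
	// Someone else moved the root since we last looked; Rebase() picks
	// up their root (and their chunks), after which we could retry.
	cs.Rebase()
}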
NB: newTestValueStore is now private to the types package.
The reasoning: outside the types package, callers now need
to hold onto the underlying ChunkStore if they want to
persist Chunks.
Toward #3404
It's important that MemoryStore (and, by extension TestStore)
correctly implement the new ChunkStore semantics before we go
shifting around the Flush semantics like we want to do in #3404
In order to make this a reality, I introduced a "persistence"
layer for MemoryStore called MemoryStorage, which can vend
MemoryStoreView objects that represent a snapshot of the
persistent storage and implement the ChunkStore contract.
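A sketch of the split (simplified; the real types also need locking and the Commit logic):

// MemoryStorage plays the role of persistent storage in tests.
type MemoryStorage struct {
	data map[hash.Hash]Chunk
	root hash.Hash
}

// NewView vends a 'connection' to the storage: a snapshot of the root
// as of now, plus a private buffer for novel Chunks.
func (ms *MemoryStorage) NewView() *MemoryStoreView {
	// Put() buffers into pending: visible to Get/Has on this view, but
	// persistent only when Commit() succeeds against the shared storage.
	return &MemoryStoreView{
		storage: ms,
		root:    ms.root, // cached; refreshed by Rebase()
		pending: map[hash.Hash]Chunk{},
	}
}

type MemoryStoreView struct {
	storage *MemoryStorage
	root    hash.Hash
	pending map[hash.Hash]Chunk
}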
Fixes #3400
Removed Rebase() in HandleRootGet, and added ChunkStore
tests to validate the new Put behavior more fully
BatchStore is dead, long live ChunkStore! Merging these two required
some modification of the old ChunkStore contract to make it more
BatchStore-like in places, most specifically around Root(), Put() and
PutMany().
The first big change is that Root() now returns a cached value for the
root hash of the Store. This is how NBS worked already, so the more
interesting change here is the addition of Rebase(), which loads the
latest persistent root. Any chunks that appeared in backing storage
since the ChunkStore was opened (or last rebased) also become
visible.
UpdateRoot() has been replaced with Commit(), because UpdateRoot() was
ALREADY doing the work of persisting novel chunks as well as moving
the persisted root hash of the ChunkStore in both NBS and
httpBatchStore. This name, and the new contract (essentially Flush() +
UpdateRoot()), is a more accurate representation of what's going on.
As for Put(), the former contract claimed to block until the chunk
was durable. That's no longer the case. Indeed, NBS was already not
fulfilling this contract. The new contract reflects this, asserting
that novel chunks aren't persisted until a Flush() or Commit() --
which has replaced UpdateRoot(). Novel chunks are immediately visible
to Get and Has calls, however.
In addition to this larger change, there are also some tweaks to
ValueStore and Database. ValueStore.Flush() no longer takes a hash,
and instead just persists any and all Chunks it has buffered since the
last time anyone called Flush(). Database.Close() used to have some
side effects where it persisted Chunks belonging to any Values the
caller had written -- that is no longer so. Values written to a
Database only become persistent upon a Commit-like operation (Commit,
CommitValue, FastForward, SetHead, or Delete).
/******** New ChunkStore interface ********/
type ChunkStore interface {
	ChunkSource
	RootTracker
}
// RootTracker allows querying and management of the root of an entire tree of
// references. The "root" is the single mutable variable in a ChunkStore. It
// can store any hash, but it is typically used by higher layers (such as
// Database) to store a hash to a value that represents the current state and
// entire history of a database.
type RootTracker interface {
	// Rebase brings this RootTracker into sync with the persistent storage's
	// current root.
	Rebase()

	// Root returns the currently cached root value.
	Root() hash.Hash

	// Commit atomically attempts to persist all novel Chunks and update the
	// persisted root hash from last to current. If last doesn't match the
	// root in persistent storage, returns false.
	// TODO: is last now redundant? Maybe this should just try to update from
	// the cached root to current?
	// TODO: Does having a separate RootTracker make sense anymore? BUG 3402
	Commit(current, last hash.Hash) bool
}
// ChunkSource is a place chunks live.
type ChunkSource interface {
	// Get the Chunk for the value of the hash in the store. If the hash is
	// absent from the store nil is returned.
	Get(h hash.Hash) Chunk

	// GetMany gets the Chunks with |hashes| from the store. On return,
	// |foundChunks| will have been fully sent all chunks which have been
	// found. Any non-present chunks will silently be ignored.
	GetMany(hashes hash.HashSet, foundChunks chan *Chunk)

	// Returns true iff the value at the address |h| is contained in the
	// source.
	Has(h hash.Hash) bool

	// Returns a new HashSet containing any members of |hashes| that are
	// present in the source.
	HasMany(hashes hash.HashSet) (present hash.HashSet)

	// Put caches c in the ChunkSink. Upon return, c must be visible to
	// subsequent Get and Has calls, but must not be persistent until a call
	// to Flush(). Put may be called concurrently with other calls to Put(),
	// PutMany(), Get(), GetMany(), Has() and HasMany().
	Put(c Chunk)

	// PutMany caches chunks in the ChunkSink. Upon return, all members of
	// chunks must be visible to subsequent Get and Has calls, but must not be
	// persistent until a call to Flush(). PutMany may be called concurrently
	// with other calls to Put(), PutMany(), Get(), GetMany(), Has() and
	// HasMany().
	PutMany(chunks []Chunk)

	// Returns the NomsVersion with which this ChunkSource is compatible.
	Version() string

	// On return, any previously Put chunks must be durable. It is not safe to
	// call Flush() concurrently with Put() or PutMany().
	Flush()

	io.Closer
}
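Under this contract, a writer races other writers with an optimistic loop along these lines (doWork is illustrative):

for {
	last := cs.Root()            // cached root from open or the last Rebase()
	proposed := doWork(cs, last) // Put()s novel chunks, returns the new root
	if cs.Commit(proposed, last) {
		break // novel chunks persisted and root moved, atomically
	}
	// Lost the race: pick up the winner's root and chunks, then retry.
	cs.Rebase()
}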
Fixes #2945
* Add HasMany() to the ChunkStore interface
We'll need this as a part of #3180
* Rip out hinting
The hinting mechanism used to assist in server-side validation
of values served us well, but now it's in the way of building a
more suitable validation strategy. Tear it out and go without
validation for a hot minute until #3180 gets done.
Fixes #3178
* Implement server-side lazy ref validation
The server, when handling writeValue, now just keeps track of all the
refs it sees in the novel chunks coming from the client. Once it's
processed all the incoming chunks, it does a single bulk HasMany to
determine whether any of them are missing from the storage backend.
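Roughly (refsIn is an illustrative stand-in for whatever extracts the refs a chunk's value mentions):

func validateAndPut(cs ChunkStore, incoming <-chan Chunk) error {
	// Gather every ref mentioned by incoming chunks; defer the existence
	// check to a single bulk HasMany at the end.
	seen := hash.HashSet{}
	for c := range incoming {
		for _, r := range refsIn(c) {
			seen.Insert(r)
		}
		cs.Put(c)
	}
	// Puts are visible to HasMany immediately, so refs satisfied by the
	// batch itself pass; only truly dangling refs come back absent.
	if present := cs.HasMany(seen); len(present) != len(seen) {
		return errors.New("dangling ref in writeValue payload")
	}
	return nil
}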
Fixes #3180
* Remove chunk-write-order requirements
With our old validation strategy, it was critical that
chunk graphs be written bottom-up, during both novel value
creation and sync. With the strategy implemented in #3180,
this is no longer required, which lets us get rid of a bunch
of machinery:
1) The reverse-order hack in httpBatchStore
2) The EnumerationOrder stuff in NomsBlockCache
3) The orderedPutCache in datas/
4) The refHeight arg on SchedulePut()
Fixes #2982
This was something that evolved from the way that Dynamo stores
data, and from a desire to allow clients to make incremental write
progress. We never actually made the clients handle it
properly, though, and so much has changed since we wrote it
that it's only going to be in the way of building something
better.
Fixes #3234
Chunk deserialization can sometimes run into errors if, e.g., the
client hangs up during a writeValue request. The old error strategy
worked by throwing a "catchable" error and recovering. That's OK if
you've only got one goroutine, but since the writeValue handler starts
so many goroutines, architecting the code to deal with error handling
by panic/recover is dicey.
Instead, make DeserializeToChan return an error in the more common
failure cases and handle it by passing it over a channel and raising
it from a central place.
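The pattern looks like this (deserializeToChan and process are illustrative stand-ins):

func readChunks(r io.Reader) error {
	chunkCh := make(chan *Chunk)
	errCh := make(chan error, 1)
	go func() {
		defer close(chunkCh)
		// Returns an error (e.g. on a truncated stream) instead of panicking.
		if err := deserializeToChan(r, chunkCh); err != nil {
			errCh <- err
		}
	}()
	for c := range chunkCh {
		process(c)
	}
	// Raise any failure from this one central place, on the caller's goroutine.
	select {
	case err := <-errCh:
		return err
	default:
		return nil
	}
}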
The more code can use GetMany(), the better performance gets on top of
NBS. To this end, add a call to ValueStore that allows code to read
many values concurrently. This can be used e.g. by read-ahead code
that's navigating prolly trees to increase performance.
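From a hypothetical read-ahead path, usage might look like the following; the ReadManyValues name and channel shape are assumptions, mirroring chunks.GetMany:

// Prefetch all children of a prolly-tree node in one bulk read instead
// of one ReadValue round trip per child.
hashes := hash.HashSet{}
for _, r := range childRefs { // childRefs: the node's Refs (illustrative)
	hashes.Insert(r.TargetHash())
}
found := make(chan Value, len(hashes))
vs.ReadManyValues(hashes, found) // assumed API shape
close(found)
for v := range found {
	prefetch(v) // hand each decoded Value to the tree-walking code
}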
Fixes #3019
Now that we have GetMany, the server can use it directly to let the
chunk-fetching layer figure out the best way to batch up requests. A
small refactor allows ValidatingBatchingSink to directly update the
hint cache instead of relying on logic inside ReadValue to do it. I
made that change because ReadValue now also does a bunch of other
things around caching read values and checking a cache of 'pending
Puts' that will never have anything in it when used from the server.
Toward issue #3019
Adds the ability to stream individual chunks requested via GetMany() back to the caller.
Removes readAmpThresh and maxReadSize. Lowers the S3ReadBlockSize to 512k.
The new spec is a URI, akin to what we use for HTTP. It allows the
specification of a DynamoDB table, an S3 bucket, a database ID, and a
dataset ID: aws://table-name:bucket-name/database::dataset
The bucket name is optional and, if not provided, Noms will use a
ChunkStore implementation backed only by DynamoDB.
* Add GetMany(), which most ChunkStores implement via repeated calls to their own Get(), but which creates the opportunity for stores (e.g. NBS) to optimize reads of larger blocks of potentially sequential chunks; see the sketch below.
* Add RemoteBatchStore getRefs endpoint support for calling GetMany() rather than Get().
* Remove ReadThroughChunkStore, which was dead code.
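The generic fallback is just a loop over Get(); something like:

// GetMany as most ChunkStores implement it. NBS overrides this to plan
// reads so that nearby chunks come back in larger, fewer I/Os.
func (s *basicStore) GetMany(hashes hash.HashSet, foundChunks chan *Chunk) {
	for h := range hashes {
		c := s.Get(h)
		if !c.IsEmpty() { // absent chunks are silently skipped
			foundChunks <- &c
		}
	}
}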
This patch introduces optional debug logging in util/verbose, and adds
some usage of it to HandleWriteValue and the httpBatchStore
SchedulePut code path. It also modifies chunks.DeserializeToChan() so
that callers can better recover from panics in there.
https://github.com/attic-labs/attic/issues/103
There are two places where ValidatingBatchingSink could be more
concurrent: Prepare(), where it's reading in hints, and Enqueue().
Making Prepare() handle many hints concurrently is easy because the
hints don't depend on one another, so that method now just spins up
a number of goroutines and runs them all at once.
Enqueue() is more complex, because while Chunk decoding and validation
of its hash can proceed concurrently, validating that a given Chunk is
'ref-complete' requires that the chunks in the writeValue payload all
be processed in order. So, this patch uses orderedparallel to run the
new Decode() method on chunks in parallel, but then return to serial
operation before calling the modified Enqueue() method.
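The pipeline's shape, simplified (the real code uses the orderedparallel helper and bounds its concurrency):

func decodeThenEnqueue(raw <-chan []byte, vbs *ValidatingBatchingSink) {
	// Send each item's result channel downstream before kicking off its
	// parallel decode; reading them in order restores input order.
	results := make(chan chan Chunk, 16)
	go func() {
		defer close(results)
		for payload := range raw {
			ch := make(chan Chunk, 1)
			results <- ch
			go func(p []byte) {
				ch <- decodeAndValidateHash(p) // parallel per chunk (illustrative)
			}(payload)
		}
	}()
	// Serial section: ref-completeness validation needs chunks in order.
	for ch := range results {
		vbs.Enqueue(<-ch) // assumed signature
	}
}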
Fixes #1935
* Use sampling for a better bytes-written estimate for noms sync
* Confirmed that the remaining overestimate of data written is consistent with leveldb stats, and opened #2567 to track it
Fix misspellings; gofmt code that wasn't gofmt'd (taking advantage of gofmt -s while at it); fix a couple of unreachable-code issues reported by golint; reference Go Report Card results and tests.
Under load, our server can exhaust the number of file descriptors it's
allowed to have open at one time. Part of this is because of how many
incoming connections it's handling, but we believe that servicing lots
of simultaneous leveldb reads is the larger part of the issue.
This patch applies the rate limit we were using for writing to both
read and write operations.
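The limiter itself is the usual buffered-channel semaphore, now acquired on the read path too; a sketch:

// Shared by readers and writers: channel capacity bounds the number of
// simultaneous leveldb operations, and with them, open file descriptors.
var rateLimit = make(chan struct{}, 32) // capacity illustrative

func (s *ldbStore) Get(h hash.Hash) Chunk {
	rateLimit <- struct{}{}        // acquire a slot
	defer func() { <-rateLimit }() // release it
	return s.getUnlimited(h)       // illustrative unthrottled inner read
}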
Fixes #2227
This patch creates a new kind of chunks.Factory that demo-server
uses to vend ChunkStore instances that all share the same
MemoryStore-based Chunk cache. This cache _will_ grow without bound,
but the current RAM/data ratio on demo.noms.io means that, in practice,
we will be fine for a bit.
This will need to be removed in favor of a real solution in Issue #2227.
Fixes #2009
NewSerializer spun up a goroutine within itself. We've decided
this is an anti-pattern. Furthermore, we were using this inside
our remote database handler code, and a panic inside that goroutine
could take down the server. The callsites now use Serialize() directly.
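Sketched, the before and after (names and signatures illustrative):

// Before: the constructor hid a goroutine, so a panic in the send loop
// escaped the HTTP handler's recover and could kill the process.
func NewSerializer(w io.Writer) chan<- *Chunk {
	ch := make(chan *Chunk)
	go func() {
		for c := range ch {
			Serialize(*c, w)
		}
	}()
	return ch
}

// After: the call site owns the loop (and the goroutine, if it wants
// one), so panics surface where they can be handled.
for c := range toSend {
	Serialize(*c, w)
}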
Fixes #2169
The convenience constructor changed in this patch takes an aws.Config
object directly. This allows any implementation of the mentioned interface
to be passed into Noms's Dynamo store -- giving client code the
flexibility to add its own credential acquisition mechanisms, for instance.
[fixes #2151]