This patch adds a static function which can walk graphs looking for (and diffing) two structs. It uses type information to avoid traversing sub-values which can't contain structs. It also uses an approach similar to sync's to avoid visiting common sub-chunk-graphs.
The only consumers of what we used to call the "best" diff
algorithm are the command-line tools. Non-interactive programs all want
the algorithm that finishes fastest, which is top-down.
Fixes https://github.com/attic-labs/attic/issues/627
Readahead + NBS benefit greatly when "related" Chunks are close to
each other. The current code does a good job of writing siblings in the
Chunk graph next to each other, but "cousins" (that is, children whose
parents are siblings) might wind up spread quite far apart. This
patch makes WriteValue hold onto novel Chunks until it sees a
_grandparent_ come through the pipeline. All of that Chunk's queued
grandchildren will be Put at that time.
Additionally, ValueStore.Flush() now takes a Hash and flushes all
Chunks that are reachable from the Chunk with that Hash, as opposed
to simply flushing all Chunks to the BatchStore. This means that
there's now no supported way to write orphaned Chunks/Values to a
Database.
Fixes #3051
* More logging for TestStreamingMap2
Since the head of each dataset can have an arbitrarily complex
type, type accretion leads the Datasets map at the root of the
DB to become very large. This type info isn't really very useful
at that level either. So, get rid of it by making this map be
from String -> Ref<Value>.
Fixes #2869
When constructing a recursive decoder (e.g., mapDecoder, arrayDecoder, setDecoder),
the new decoder is placed in the cache before its element decoders are created to
avoid a cycle.
If another goroutine finds the new decoder in the cache, it can use it before it's
fully initialized.
Use a RWMutex to guard against this. Take the write lock out before adding the decoder
to the cache and release after the element decoders have been initialized.
Use a read lock in the decoder function to ensure that it blocks until initialization
is complete.
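
The locking scheme described above could look roughly like the sketch below. It assumes one RWMutex per decoder plus a separate mutex for the cache map; the names `listDecoder`, `decoderCache`, and `get` are hypothetical and not taken from the actual code.

```go
package main

import "sync"

// listDecoder is a hypothetical recursive decoder. Its elem field is only
// filled in after the decoder has been published to the cache.
type listDecoder struct {
	mu   sync.RWMutex
	elem func(int) int
}

// decode takes the read lock, so it blocks while another goroutine still
// holds the write lock during initialization.
func (d *listDecoder) decode(xs []int) []int {
	d.mu.RLock()
	defer d.mu.RUnlock()
	out := make([]int, len(xs))
	for i, x := range xs {
		out[i] = d.elem(x)
	}
	return out
}

// decoderCache is a hypothetical cache keyed by type name.
type decoderCache struct {
	mu sync.Mutex
	m  map[string]*listDecoder
}

// get returns the cached decoder for key, creating it if needed.
// buildElem may look up the same key again (the recursive case); it gets the
// cached, still-locked decoder back but must not call decode on it.
func (c *decoderCache) get(key string, buildElem func() func(int) int) *listDecoder {
	c.mu.Lock()
	if d, ok := c.m[key]; ok {
		c.mu.Unlock()
		return d
	}
	d := &listDecoder{}
	d.mu.Lock() // take the write lock before the decoder becomes visible
	c.m[key] = d
	c.mu.Unlock()

	d.elem = buildElem() // initialize element decoders
	d.mu.Unlock()        // readers blocked in decode may now proceed
	return d
}
```

A goroutine that finds the decoder in the cache mid-initialization gets the pointer back immediately, but its first `decode` call waits on the read lock until the element decoder is in place.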
toward: #3071
In httpBatchStore.GetMany(), we check our unwritten
puts to see if any of the requested chunks already
exist locally. If any do, we're _supposed_ to remove
their hashes from the set slated to be requested from
the server. That logic was borked.
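
As an illustration of the intended behavior (not the actual httpBatchStore code), here is a minimal sketch. `Hash`, `splitPending`, and the map-based representation of unwritten puts are assumptions made for this example.

```go
package main

// Hash is a stand-in for a chunk hash; the real type is not shown here.
type Hash string

// splitPending partitions the requested hashes into chunks that can be served
// from the unwritten puts and hashes that still have to be requested from the
// server. Hashes found locally are left out of the remaining set.
func splitPending(requested map[Hash]struct{}, unwritten map[Hash][]byte) (map[Hash][]byte, map[Hash]struct{}) {
	local := map[Hash][]byte{}
	remaining := map[Hash]struct{}{}
	for h := range requested {
		if data, ok := unwritten[h]; ok {
			local[h] = data // already present locally, no server round trip
			continue
		}
		remaining[h] = struct{}{}
	}
	return local, remaining
}
```

Building a fresh `remaining` set, rather than deleting from `requested` while iterating, keeps the caller's set untouched.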
Towards https://github.com/attic-labs/attic/issues/503
* Add zero check
Also fixes #3063
We are using babel-plugin-transform-inline-environment-variables which
replaces `process.env.FOO` with the value of the `FOO` environment
variable at compile time.
However, due to our pipeline we end up with something like:
```js
var NAME = 'NOMS_VERSION_NEXT';
process.env[NAME]
```
which does not get replaced in development mode. If it were a const, the
transformer could replace it, but var bindings can change.
Compaction not only persists the contents of a memTable, it also
filters out duplicate chunks. This means that calling count() on a
compactingChunkStore before and after compaction completes could lead
to different results. In the case where the memTable contains only
duplicate chunks, this is Very Bad because it leads to a non-existent
table winding up in the NBS manifest.
Fixes #3044
Chunk deserialization can run into errors sometimes if, e.g., the
client hangs up during a writeValue request. The old error strategy
worked by throwing a "catchable" error and recovering. That's OK if
you've only got one goroutine, but since the writeValue handler starts
so many goroutines, architecting the code to deal with error handling
by panic/recover is dicey.
Instead, make DeserializeToChan return an error in the more common
failure cases and handle it by passing it over a channel and raising
it from a central place.
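
A minimal sketch of this pattern, under stated assumptions: `deserializeToChan` below is a toy stand-in (it treats each non-zero byte as a "chunk" and a zero byte as a truncated stream), and `processAll` is a hypothetical handler, not the real writeValue code.

```go
package main

import (
	"errors"
	"sync"
)

// errTruncated is a hypothetical error for a stream that ends early.
var errTruncated = errors.New("truncated chunk stream")

// deserializeToChan is a toy stand-in: it returns an error instead of
// panicking when it hits bad input (a zero byte in this sketch).
func deserializeToChan(input []byte, out chan<- byte) error {
	for _, b := range input {
		if b == 0 {
			return errTruncated
		}
		out <- b
	}
	return nil
}

// processAll runs one goroutine per input, collects errors over a channel,
// and reports the first one from a single central place.
func processAll(inputs [][]byte) (int, error) {
	out := make(chan byte)
	errs := make(chan error, len(inputs)) // buffered: workers never block on reporting
	var wg sync.WaitGroup
	for _, in := range inputs {
		wg.Add(1)
		go func(in []byte) {
			defer wg.Done()
			if err := deserializeToChan(in, out); err != nil {
				errs <- err
			}
		}(in)
	}
	go func() {
		wg.Wait()
		close(out)
		close(errs)
	}()

	count := 0
	for range out {
		count++
	}
	var first error
	for err := range errs {
		if first == nil {
			first = err
		}
	}
	return count, first
}
```

No goroutine needs `recover` here; a failure is just a value that travels over `errs` to the caller.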
The more code can use GetMany(), the better performance gets on top of
NBS. To this end, add a call to ValueStore that allows code to read
many values concurrently. This can be used e.g. by read-ahead code
that's navigating prolly trees to increase performance.
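
One way such a call might be shaped is sketched below. The `batchSource` interface, its `GetMany` signature, `memSource`, and `readManyValues` are all assumptions for illustration; the real ValueStore API may differ.

```go
package main

// Hash and Chunk are stand-ins for the real types.
type Hash string

type Chunk struct {
	Hash Hash
	Data []byte
}

// batchSource is a hypothetical interface for a store that can fetch many
// chunks in one call and stream them back as they are found.
type batchSource interface {
	GetMany(hashes []Hash, found chan<- Chunk)
}

// memSource is a toy in-memory implementation used for the example.
type memSource map[Hash][]byte

func (m memSource) GetMany(hashes []Hash, found chan<- Chunk) {
	for _, h := range hashes {
		if d, ok := m[h]; ok {
			found <- Chunk{Hash: h, Data: d}
		}
	}
}

// readManyValues issues a single GetMany for all hashes and decodes chunks as
// they arrive, so fetching and decoding overlap. Missing hashes are skipped.
func readManyValues(src batchSource, hashes []Hash, decode func(Chunk) string) map[Hash]string {
	found := make(chan Chunk, 16)
	go func() {
		src.GetMany(hashes, found)
		close(found)
	}()
	values := make(map[Hash]string, len(hashes))
	for c := range found {
		values[c.Hash] = decode(c)
	}
	return values
}
```

Read-ahead code walking a tree could collect the hashes of a node's children and pass them through one such call instead of reading each child separately.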
Fixes #3019