This is because:
* All types.Ref are now typed, so Typed was a tautology.
* The only way to construct a types.Ref is with a Value, so FromValue was
a tautology (with a small amount of work to remove callers of NewRef).
* Go: Back datas.unwrittenPutCache with a LevelDB
httpBatchStore was caching as-yet-unwritten Chunks both in memory and
on disk. To avoid this, the in-memory cache is now backed by a
LevelDB, which handles spilling to disk when it needs to. When it's
time to send Chunks to the server, the cache is enumerated in
insert-order so that the payload of the write request is properly
structured.
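
A minimal sketch of the insert-order enumeration described above. It is an in-memory stand-in only: it does not use LevelDB, and the names orderedCache, Insert and Each are hypothetical, not the actual datas.unwrittenPutCache API.

```go
package main

import "fmt"

// orderedCache is a hypothetical, in-memory stand-in for the LevelDB-backed
// cache: it stores chunks by hash and replays them in insertion order.
type orderedCache struct {
	order []string          // hashes, in the order they were first inserted
	data  map[string][]byte // hash -> chunk bytes
}

func newOrderedCache() *orderedCache {
	return &orderedCache{data: map[string][]byte{}}
}

// Insert stores a chunk. Re-inserting a known hash is a no-op and returns false.
func (c *orderedCache) Insert(hash string, chunk []byte) bool {
	if _, ok := c.data[hash]; ok {
		return false
	}
	c.order = append(c.order, hash)
	c.data[hash] = chunk
	return true
}

// Each visits every cached chunk in insertion order.
func (c *orderedCache) Each(visit func(hash string, chunk []byte)) {
	for _, h := range c.order {
		visit(h, c.data[h])
	}
}

func main() {
	c := newOrderedCache()
	c.Insert("leaf", []byte("a"))
	c.Insert("parent", []byte("bc"))
	c.Insert("leaf", []byte("a")) // duplicate, ignored
	c.Each(func(hash string, chunk []byte) {
		fmt.Println(hash, len(chunk))
	})
}
```

Keeping a separate order slice is one simple way to preserve insertion order; a disk-backed store would need some other ordering key, which this sketch does not attempt to model.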
Fixes #1348
This is to support efficient chunk diff. Most of the churn in this patch
comes from updating test expectations, and updating every Ref construction
to include a height and every MetaTuple to include a Ref.
While this lets the cache grow without bound, having a DataStore
keep track of every value it has read or written makes it simpler
to program with. Once a types.Ref can only be obtained by reading
it out of a DataStore, we can do away with this altogether.
The goal is to remove places where we construct a types.Ref from a
ref.Ref, so that we can attach a height to it later. This should also
make it unnecessary to read in the Commit values at all (which is the
case in the JS SDK) but commit validation prevents that for now.
We're seeing some raciness in RemoteDataStore tests that don't
repro locally. This adds the hashes of all chunks seen prior to the
failure to the error response sent back by the DataStoreServer.
Hopefully, this will help debug the raciness.
This replaces the HTTP ChunkStore implementation with an implementation of
our new DataStore client protocol. It migrates much of the batching logic
from RemoteStore into the new BatchStore, which is analogous to a class we
have on the Go side, but continues to use a Delegate to handle all the HTTP
work.
This patch also introduces ValueStore, which handles validating Values as
they're written. Instead of handling Value reading and writing itself,
DataStore now extends ValueStore.
Towards #1280
Struct type definition is now inlined into the chunk. To break
cycles we use back references.
- Removes unresolved type refs
- Removes packages
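
To illustrate the idea of breaking cycles with back references, here is a rough sketch. Everything in it is assumed for illustration: structType, field, encode and the textual "BackRef(n)" output are hypothetical and do not reflect the actual Noms serialization format.

```go
package main

import (
	"fmt"
	"strings"
)

// structType and field are hypothetical, simplified type descriptions.
// A nil field Type stands for a primitive (rendered as String here).
type structType struct {
	Name   string
	Fields []field
}

type field struct {
	Name string
	Type *structType
}

// encode inlines a struct definition. If t is already being encoded further
// up the stack, it emits BackRef(n) instead of recursing forever, where n
// counts enclosing structs from the innermost (0) outwards.
func encode(t *structType, parents []*structType) string {
	for i := len(parents) - 1; i >= 0; i-- {
		if parents[i] == t {
			return fmt.Sprintf("BackRef(%d)", len(parents)-1-i)
		}
	}
	parents = append(parents, t)
	var b strings.Builder
	b.WriteString("struct " + t.Name + " {")
	for _, f := range t.Fields {
		b.WriteString(" " + f.Name + ": ")
		if f.Type == nil {
			b.WriteString("String")
		} else {
			b.WriteString(encode(f.Type, parents))
		}
		b.WriteString(";")
	}
	b.WriteString(" }")
	return b.String()
}

func main() {
	// A self-referential linked-list node.
	node := &structType{Name: "Node"}
	node.Fields = []field{{Name: "value"}, {Name: "next", Type: node}}
	fmt.Println(encode(node, nil))
}
```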
Fixes #1164
Fixes #1165
Using ChunkStore.PutMany() means that the DataStore server code
can detect when the ChunkStore it's writing to can't handle
the amount of data being pushed. This patch reports that
status back across the wire to the client that's attempting
to write a Value graph. Due to Issue #1259, the only thing the
client can currently do is retry the entire batch, but we hope
to do better in the future.
The initial refactor had some pretty confusing struct and method
names, so this patch renames a number of things and migrates a bunch
of code from datas/ to types/, where it seems to be a better
logical fit.
datas.cachingValueStore -> types.ValueStore
datas.hintedChunkStore interface -> types.BatchStore
datas.naiveHintedChunkSink -> types.BatchStoreAdaptor
datas.httpHintedChunkStore -> datas.httpBatchStore
datas.notAHintedChunkSink -> datas.notABatchStore
Also, types now exports a ValidatingBatchingSink, which is used by
datas.HandleWriteValue to process incoming hints and validate incoming
Chunks before putting them into a ChunkStore.
Towards Issue #1250
For performance reasons, Package objects for generated Noms Types are
side-loaded when reading Values. This means that the
opportunistically-populated chunk->Type map used by DataStore when
validating writes won't see these chunks in a number of cases. This
can lead to false negatives and erroneous validation failures. This
patch special-cases RefOfPackage when caching the Chunks reachable
from a newly-read Value, manually fetching them from the
types.PackageRegistry and crawling their reachable Chunks.
Fixes #1229
A novel chunk may contain references to any other novel chunk, as long
as there are no cycles. This means that breaking up the stream of
novel chunks being written to the server into batches risks creating
races -- chunks in one batch might reference chunks in another,
meaning that the server would somehow need to be able to
cross-reference batches. This seems super hard, so we've just forced
the code to write in one massive batch upon Commit(). We'll evaluate
the performance of this solution and see what we need to change.
Also, there's a terrible hack in HandleWriteValue to make it so that
pulls can work by back-channeling all their chunks via postRefs/ and
then writing the final Commit object via writeValue/. This can be
fixed once we fix issue 822.
This patch is unfortunately large, but it seemed necessary to make all
these changes at once to transition away from having an HTTP
ChunkStore that could allow for invalid state in the DB. Now, we have
a RemoteDataStoreClient that allows for reading and writing of Values,
and performs validation on the server side before persisting chunks.
The semantics of DataStore are that written values can be read back
out immediately, but are not guaranteed to be persistent until after
Commit(). Put() now blocks until the Chunk is persisted, and the new
PutMany() can be used to write a number of Chunks all at once.
From a command-line tool point of view, -h and -h-auth still work as
expected.
This patch removes the special RemoteDataStore implementation of
CopyReachableChunksP, as this is seldom used and adds complexity
that stands in the way of Issue 654.
To facilitate validation, DataStore needs to remember which chunks
it's seen, what their refs are, and the Noms type of the Values they
encode. Then, DataStore can look at each Value that comes in via
WriteValue() and validate it by checking every embedded ref (if any)
against this cache.
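
As a rough illustration of that check, assuming a cache keyed by ref hash that records the type name of each chunk seen so far. The seenCache and typedRef types are hypothetical and far simpler than the real DataStore bookkeeping.

```go
package main

import "fmt"

// seenCache is a hypothetical model of the cache: ref hash -> name of the
// Noms type of the Value that chunk encodes.
type seenCache map[string]string

// typedRef is a hypothetical embedded reference: the target's hash plus the
// type the referring Value claims the target has.
type typedRef struct {
	Target     string
	TargetType string
}

// validate checks every embedded ref against the cache. A ref to an unseen
// chunk, or to a chunk of a different type, fails validation.
func (s seenCache) validate(refs []typedRef) error {
	for _, r := range refs {
		got, ok := s[r.Target]
		if !ok {
			return fmt.Errorf("ref %s points to an unknown chunk", r.Target)
		}
		if got != r.TargetType {
			return fmt.Errorf("ref %s expects %s, but the chunk holds %s",
				r.Target, r.TargetType, got)
		}
	}
	return nil
}

func main() {
	seen := seenCache{"abc": "String"}
	fmt.Println(seen.validate([]typedRef{{Target: "abc", TargetType: "String"}}))
	fmt.Println(seen.validate([]typedRef{{Target: "def", TargetType: "String"}}))
}
```

A Value with no embedded refs passes trivially, which matches the "(if any)" in the description above.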
Towards #654
In pursuit of issue #654, we want to be able to figure out all the
refs contained in a given Value, along with the Types of the Values to
which those refs point. Value.Chunks() _almost_ met those needs, but
it returned a slice of ref.Ref, which doesn't convey any type info.
To address this, this patch does two things:
1) RefBase embeds the Value interface, and
2) Chunks() now returns []types.RefBase
RefBase now provides Type() as well, by virtue of embedding Value, so
callers can just iterate through the slice returned from Chunks() and
gather type info for all the refs embedded in a given Value.
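
The shape of that change can be sketched as below. These are simplified, assumed versions of the interfaces: types are plain strings, TargetRef() returning a string is a stand-in for a ref.Ref accessor, and stringRef, list and refTypes are hypothetical.

```go
package main

import "fmt"

// Value is a simplified stand-in for types.Value.
type Value interface {
	Type() string
	Chunks() []RefBase
}

// RefBase embeds Value, so every ref exposes Type() as well.
type RefBase interface {
	Value
	TargetRef() string // assumed accessor for the hash being pointed at
}

// stringRef is a hypothetical ref to a String value.
type stringRef struct{ target string }

func (r stringRef) Type() string      { return "Ref<String>" }
func (r stringRef) Chunks() []RefBase { return []RefBase{r} }
func (r stringRef) TargetRef() string { return r.target }

// list is a hypothetical collection that embeds some refs.
type list struct{ refs []RefBase }

func (l list) Type() string      { return "List" }
func (l list) Chunks() []RefBase { return l.refs }

// refTypes gathers type info for every ref embedded in v.
func refTypes(v Value) map[string]string {
	out := map[string]string{}
	for _, r := range v.Chunks() {
		out[r.TargetRef()] = r.Type()
	}
	return out
}

func main() {
	l := list{refs: []RefBase{stringRef{"abc"}, stringRef{"def"}}}
	fmt.Println(refTypes(l))
}
```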
I went all the way and made RefBase a Value instead of just adding the
Type() method because both types.Ref and the generated Ref types are
actually all Values, and doing so allowed me to change the definition of
refBuilderFunc in package_registry.go to be more precise. It now returns
RefBase instead of just Value.
This patch is the first step in moving all reading and writing to the
DataStore API, so that we can validate data committed to Noms.
The big change here is that types.ReadValue() no longer exists and is
replaced with a ReadValue() method on DataStore. A similar
WriteValue() method deprecates types.WriteValue(), but fully removing
that is left for a later patch. Since a lot of code in the types
package needs to read and write values, but cannot import the datas
package without creating an import cycle, the types package exports
ValueReader and ValueWriter interfaces, which DataStore implements.
Thus, a DataStore can be passed to anything in the types package that
needs to read or write values (e.g. a collection constructor or
typed-ref).
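
A single-file sketch of that arrangement follows. In the real layout the two interfaces would live in the types package and the implementation in datas; the string-based signatures, memStore and copyValue are hypothetical simplifications.

```go
package main

import "fmt"

// ValueReader and ValueWriter stand in for the interfaces exported by the
// types package. Values and refs are plain strings here for brevity.
type ValueReader interface {
	ReadValue(ref string) string
}

type ValueWriter interface {
	WriteValue(v string) string
}

// memStore plays the role of DataStore. It would live in the datas package
// and import types, never the other way around.
type memStore struct {
	values map[string]string
}

func newMemStore() *memStore {
	return &memStore{values: map[string]string{}}
}

// WriteValue stores v under a made-up sequential ref and returns that ref.
func (m *memStore) WriteValue(v string) string {
	ref := fmt.Sprintf("ref-%d", len(m.values))
	m.values[ref] = v
	return ref
}

func (m *memStore) ReadValue(ref string) string {
	return m.values[ref]
}

// copyValue stands in for code inside types that needs to read and write
// values: it depends only on the interfaces, so no import cycle arises.
func copyValue(ref string, from ValueReader, to ValueWriter) string {
	return to.WriteValue(from.ReadValue(ref))
}

func main() {
	src, dst := newMemStore(), newMemStore()
	ref := src.WriteValue("hello")
	fmt.Println(dst.ReadValue(copyValue(ref, src, dst)))
}
```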
Relatedly, this patch also introduces the DataSink interface, so that
some public-facing APIs no longer need to provide a ChunkSink.
Towards #654