This is a side-by-side port, taking inspiration from the old dataspec.go
code. Notably:
- LDB support has been added in Go. It wasn't needed in JS.
- There is an Href() method on Spec now.
- Go now handles IPv6.
- Go no longer treats access_token specially.
- Go now has Pin.
- I found some issues in the JS while doing this; I'll fix them later.
I've also updated the config code to use the new API so that basically
all the Go samples use the code, even if they don't really change.
There were several tests in the Database suites that were failing to
close test Databases that had orderedChunkCaches in them (backed by
levelDBs). Close them.
I was ALSO failing to destroy the cache used in LocalDatabase
instances only while testing Pull(). That's cleared up now as well.
ValueStore caches Values that are read out of it, but it doesn't
do the same for Values that are written. This is because we expect
that reading Values shortly after writing them is an uncommon usage
pattern, and because the Chunks that make up novel Values are
generally efficiently retrievable from the BatchStore that backs
a ValueStore. The problem discovered in issue #2802 is that ValueStore
caches non-existence as well as existence of read Values. So, reading
a Value that doesn't exist in the DB would result in the ValueStore
permanently returning nil for that Value -- even if you then go and
write it to the DB.
This patch drops the cache entry for a Value whenever it's written.
Fixes #2802
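
A minimal sketch of the caching behavior described above. The store type,
its fields, and the string-keyed maps are hypothetical stand-ins, not the
real ValueStore or BatchStore; the point is only that a cached miss must be
dropped when the same key is written.

```go
package main

import "fmt"

// store is a hypothetical stand-in for a ValueStore: a read cache in front
// of a backing map that plays the role of the BatchStore.
type store struct {
	backing map[string]string
	cache   map[string]*string // a nil entry records "known not to exist"
}

func newStore() *store {
	return &store{backing: map[string]string{}, cache: map[string]*string{}}
}

// Read returns the value stored under h, or nil if there is none.
// Both hits and misses are cached.
func (s *store) Read(h string) *string {
	if v, ok := s.cache[h]; ok {
		return v
	}
	var res *string
	if v, ok := s.backing[h]; ok {
		res = &v
	}
	s.cache[h] = res
	return res
}

// Write stores v under h and drops any cache entry for h, so that a
// previously cached miss cannot mask the newly written value.
func (s *store) Write(h, v string) {
	s.backing[h] = v
	delete(s.cache, h)
}

func main() {
	s := newStore()
	fmt.Println(s.Read("abc") == nil) // the miss is now cached
	s.Write("abc", "hello")
	fmt.Println(*s.Read("abc")) // the write invalidated the cached miss
}
```

Without the delete in Write, the second Read would keep returning nil,
which is the behavior reported in the issue.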
The httpBatchStore test TestVersionMismatch() expects a panic,
but the test could actually cause multiple panics: one due to
the version mismatch, and another due to using an invalid root.
Private databases begin with "/p/" - for example, "/kalman" is not
private, but "/p/kalman" is private. They are not the same database.
The bulk of this work is the receipt infrastructure.
A receipt is form data that gives access to a database, encrypted using
secretbox. For example, "Database=/p/kalman&Date=12345678" might encrypt
to "SFH5bcIJ3_XgEbtmi_AdCKTItW20fl90czVl5_pF5PAXhNQ366U1yOpYGAjT".
* A new tool receiptkey generates random receipt (secretbox) keys.
* A new tool receipttool generates receipts for databases.
* demo-server has been updated to check for a receipt in the
Authorization header to access private databases.
receipttool and demo-server must be given the same receipt key.
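
A rough sketch of the two checks described above, covering only the
plaintext side. The helper names isPrivate and receiptGrants are
hypothetical, and the secretbox encryption and decryption steps are
deliberately left out; this assumes the receipt has already been decrypted
to form data such as "Database=/p/kalman&Date=12345678".

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// isPrivate reports whether a database path names a private database.
func isPrivate(dbPath string) bool {
	return strings.HasPrefix(dbPath, "/p/")
}

// receiptGrants reports whether an already-decrypted receipt (plain form
// data) names exactly the requested database. How the Date field is used
// is not covered here.
func receiptGrants(plaintext, dbPath string) bool {
	form, err := url.ParseQuery(plaintext)
	if err != nil {
		return false
	}
	return form.Get("Database") == dbPath
}

func main() {
	receipt := "Database=/p/kalman&Date=12345678"
	fmt.Println(isPrivate("/kalman"), isPrivate("/p/kalman"))
	fmt.Println(receiptGrants(receipt, "/p/kalman"), receiptGrants(receipt, "/kalman"))
}
```

Because "/kalman" and "/p/kalman" are different databases, the comparison
is an exact match on the full path rather than on the trailing name.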
Add optional merging functionality to noms commit.
noms merge <database> <left-dataset-name> <right-dataset-name> <output-dataset-name>
The command above will look in the given Database for the two named
Datasets and, if possible, merge their HeadValue()s and commit the
result back to <output-dataset-name>.
Fixes #2535
This patch adds an optional MergePolicy field to CommitOptions. It's a
callback. If the caller sets it, then the commit code will look for a
common ancestor between the Dataset HEAD and the provided Commit. If
the caller-provided Commit descends from HEAD, then Commit proceeds as
normal.
If it does not, but there is a common ancestor, the code runs
merge.ThreeWay() on the values of the provided Commit, HEAD, and the
common ancestor, invoking the MergePolicy callback to resolve
conflicts. If merge succeeds, a merge Commit is created that descends
from both HEAD and the caller-provided Commit. This becomes the new
HEAD of the Dataset.
Fixes #2534
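
The control flow above can be sketched roughly as follows. Everything here
is a simplified assumption for illustration: the commit type, the
map[string]string values, the mergePolicy signature, and the helper names
are hypothetical and do not reflect the real CommitOptions or
merge.ThreeWay() signatures.

```go
package main

import "fmt"

// commit is a hypothetical, simplified commit: a map value plus parent links.
type commit struct {
	value   map[string]string
	parents []*commit
}

// ancestors adds c and everything reachable via parent links to seen.
func ancestors(c *commit, seen map[*commit]bool) map[*commit]bool {
	if !seen[c] {
		seen[c] = true
		for _, p := range c.parents {
			ancestors(p, seen)
		}
	}
	return seen
}

// commonAncestor searches breadth-first from b for a commit a also reaches.
func commonAncestor(a, b *commit) *commit {
	fromA := ancestors(a, map[*commit]bool{})
	for queue := []*commit{b}; len(queue) > 0; queue = queue[1:] {
		if fromA[queue[0]] {
			return queue[0]
		}
		queue = append(queue, queue[0].parents...)
	}
	return nil
}

// mergePolicy resolves a conflict on one key; ok=false aborts the merge.
type mergePolicy func(key, left, right string) (merged string, ok bool)

// threeWay merges left and right relative to base, key by key.
func threeWay(left, right, base map[string]string, resolve mergePolicy) (map[string]string, error) {
	out, keys := map[string]string{}, map[string]bool{}
	for k := range left {
		keys[k] = true
	}
	for k := range right {
		keys[k] = true
	}
	for k := range keys {
		l, r, b := left[k], right[k], base[k] // "" doubles as "absent"
		v := l
		switch {
		case l == r || r == b: // same on both sides, or only left changed
		case l == b: // only right changed
			v = r
		default: // both sides changed differently: ask the policy
			var ok bool
			if v, ok = resolve(k, l, r); !ok {
				return nil, fmt.Errorf("unresolved conflict on %q", k)
			}
		}
		if v != "" {
			out[k] = v
		}
	}
	return out, nil
}

// commitWithPolicy returns the new HEAD: c itself if it descends from head,
// otherwise a merge commit whose parents are head and c.
func commitWithPolicy(head, c *commit, resolve mergePolicy) (*commit, error) {
	if ancestors(c, map[*commit]bool{})[head] {
		return c, nil
	}
	base := commonAncestor(head, c)
	if base == nil {
		return nil, fmt.Errorf("no common ancestor")
	}
	merged, err := threeWay(c.value, head.value, base.value, resolve)
	if err != nil {
		return nil, err
	}
	return &commit{value: merged, parents: []*commit{head, c}}, nil
}

func main() {
	root := &commit{value: map[string]string{"a": "1", "b": "1"}}
	head := &commit{value: map[string]string{"a": "2", "b": "1"}, parents: []*commit{root}}
	mine := &commit{value: map[string]string{"a": "1", "b": "3"}, parents: []*commit{root}}
	keepLeft := func(key, l, r string) (string, bool) { return l, true }
	newHead, err := commitWithPolicy(head, mine, keepLeft)
	fmt.Println(newHead.value, len(newHead.parents), err)
}
```

In this example the two commits changed different keys, so the policy
callback is never invoked; it only runs when both sides changed the same
key to different values.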
Previously we would clone them from the original cursor, to (a) not
modify the original cursor, and (b) have initialization and finalization
not interfere with each other.
However, this isn't necessary and just creates churn. For example,
when we read ahead, it would be wasteful to re-read the read-ahead
chunks from initialization.
This puts the flow header after the copyright header.
It also:
* fixes the existing files to have valid headers
* makes sure the script can handle doctype
In some cases where the same chunk appears more than once in a given
writeValue request, the handleWriteValue code is able to recognize
this and skip re-decoding and re-hashing it. In that case an empty
result winds up percolating through the code, and I wasn't handling
this correctly. Fixed and added a unit test to catch this.
Fixes #2695
There are two places where ValidatingBatchingSink could be more
concurrent: Prepare(), where it's reading in hints, and Enqueue().
Making Prepare() handle many hints concurrently is easy because the
hints don't depend on one another, so that method now just spins up
a number of goroutines and runs them all at once.
Enqueue() is more complex, because while Chunk decoding and validation
of its hash can proceed concurrently, validating that a given Chunk is
'ref-complete' requires that the chunks in the writeValue payload all
be processed in order. So, this patch uses orderedparallel to run the
new Decode() method on chunks in parallel, but then return to serial
operation before calling the modified Enqueue() method.
Fixes #1935
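
A minimal sketch of the "decode in parallel, enqueue in order" pattern
described above. It does not use the orderedparallel package; the decoded
type, the decode() and processInOrder() helpers, and the choice of SHA-256
are assumptions made for illustration. A real implementation would also
bound the number of goroutines.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// decoded is a hypothetical result of the parallel stage.
type decoded struct {
	data []byte
	hash [sha256.Size]byte
}

// decode stands in for the order-independent work done per chunk
// (decoding it and computing its hash).
func decode(chunk []byte) decoded {
	return decoded{data: chunk, hash: sha256.Sum256(chunk)}
}

// processInOrder decodes all chunks concurrently, then calls enqueue
// serially in the original input order.
func processInOrder(chunks [][]byte, enqueue func(decoded)) {
	results := make([]chan decoded, len(chunks))
	for i, c := range chunks {
		ch := make(chan decoded, 1) // buffered so workers never block
		results[i] = ch
		go func(c []byte) { ch <- decode(c) }(c)
	}
	for _, ch := range results {
		enqueue(<-ch) // waits for chunk i even if later chunks finish first
	}
}

func main() {
	chunks := [][]byte{[]byte("first"), []byte("second"), []byte("third")}
	processInOrder(chunks, func(d decoded) {
		fmt.Printf("%s %x\n", d.data, d.hash[:4])
	})
}
```

Keeping one result channel per input position is what restores ordering:
the serial loop reads the channels in input order regardless of which
goroutine finishes first.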
This allows setting a field in a struct to a value of a new type, or
setting a field that does not yet exist in the struct.
In JS this is done through StructMirror.p.set, and in Go through
Struct Set.
Fixes #2181
This allows setting the meta field when you commit.
This is a version change because the signature of Database commit
changed in a non backwards compatible way. It now takes an optional
third options object instead of an optional array.
The subsequent runs of url-fetch on jenkins are way faster, and this
appears to be because committing is much faster on subsequent runs. The
perf tests now use a new database each time.
This patch implements evolving support for configuring aliases and defaults for the noms cli (started with #2131).
For an introduction, please take a look at the sample code here: https://github.com/attic-labs/noms/blob/master/samples/cli/nomsconfig/README.md
Improvements include:
- All go samples now work with .nomsconfig
- Absolute paths in ldb specs are now properly handled
- Add -v|--verbose flag to commands to debug expansion
- Make default just another alias and change [default] section to [db.default]
- Introduce the `.` shorthand to refer to a previously mentioned dataset/object
This patch modifies LocalDatabase so that it no longer swaps out
its embedded ValueStore during Pull(). The reason it was doing this
is that Pull() injects chunks directly into a Database, without
doing any work on its own to ensure correctness. For LocalDatabase,
WriteValue() performs de-facto validation as it goes, so it does not
need this additional validation in the general case. To address the
former without impacting the latter, we were making LocalDatabase
swap out its ValueStore() during Pull(), replacing it with one that
performs validation.
This led to inconsistencies, seen in issue #2598. Collections read
from the DB _before_ this swap could have weird behavior if they
were modified and written after this swap.
The new code just lazily creates a BatchStore for use during Pull().
Fixes #2598
Blob.Concat is a simple use of the sequence concat code that List.Concat uses.
NewBlob uses Blob.Concat to construct a Blob in parallel.
Perf tests for parallel NewBlob write N temporary files, then construct a Blob
from them, so there is some I/O, but it appears to be mostly CPU bound. NewBlob
doesn't get much more than 50% faster with any P >= 2.
Noms SDK users frequently shoot themselves in the foot because they're
holding onto an "old" Database object. That is, they have a Database
tucked away in some internal state, they call Commit() on it, and
don't replace the object in their internal state with the new Database
returned from Commit.
This PR changes the Database and Dataset Go API to be in line with the
proposal in Issue #2589. JS follows in a separate patch.
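
The foot-gun described above can be reproduced with a small, purely
hypothetical model in which Commit() returns a new object and leaves the
receiver untouched. The database type and its methods are illustrative
only; they are neither the old nor the new Noms API.

```go
package main

import "fmt"

// database is a hypothetical immutable snapshot: Commit never mutates
// the receiver, it returns a new snapshot instead.
type database struct {
	heads map[string]string
}

// Commit returns a new database in which dataset points at value.
func (db database) Commit(dataset, value string) database {
	next := make(map[string]string, len(db.heads)+1)
	for k, v := range db.heads {
		next[k] = v
	}
	next[dataset] = value
	return database{heads: next}
}

// Head returns the current value of dataset, if it has one.
func (db database) Head(dataset string) (string, bool) {
	v, ok := db.heads[dataset]
	return v, ok
}

func main() {
	old := database{heads: map[string]string{}}
	fresh := old.Commit("ds", "v1")

	_, okOld := old.Head("ds") // the retained "old" object sees nothing
	v, okNew := fresh.Head("ds")
	fmt.Println(okOld, v, okNew)
}
```

Any caller that keeps using old after the commit silently works against
stale state, which is the mistake this API change aims to make harder.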
* Implement noms cli configuration support
- Introduce .nomsconfig
- Supports a default db to use when no explicit db is given
- Supports defining db aliases to use as short hand for db urls
- See samples/cli/nomsconfig for more info
fix: #2131
* Use sampling for a better bytes-written estimate for noms sync
* Confirmed that remaining overestimate of data written is consistent with leveldb stats and opened #2567 to track
These were previously intertwined into one writer that was
embedded in and only usable by the 'noms' command.
This commit separates them into two separate writers that
can be used independently or combined. I also moved them
into go/utils/writers so they can be used by other code.
The main impetus to do this was to fix Bug #2593.