Print Ref values as #123 instead of 123
Since our hashes are SHA-512 and we write them using Base32, there is a lot of overlap with other parts of NomDL. Prefixing Ref values with # makes them unambiguous.
Towards #1466
This removes the type-tagged version of the human readable encoding.
Motivation: simplify the encoding in preparation for making the HRS unambiguous so that we can write a parser.
Towards #1466
Introduce Sloppy, an estimating compression function for snappy, which allows the rolling hash to better hit a given target chunk size after compression.
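Below is a rough sketch of the estimating idea only, not the actual Sloppy code: a hypothetical estimator that tracks the observed snappy compression ratio and converts a target compressed chunk size into the raw-byte target the rolling hash should aim for (the real function estimates without invoking snappy on every chunk).

```go
// Hypothetical sketch of the idea behind Sloppy: track the ratio between raw
// and compressed bytes seen so far and use it to estimate how many raw bytes
// correspond to a target compressed size.
package main

import (
	"fmt"

	"github.com/golang/snappy"
)

type compressionEstimator struct {
	rawBytes        uint64
	compressedBytes uint64
}

// observe updates the running ratio with an actual compression result.
func (e *compressionEstimator) observe(raw []byte) {
	e.rawBytes += uint64(len(raw))
	e.compressedBytes += uint64(len(snappy.Encode(nil, raw)))
}

// rawTarget estimates how many uncompressed bytes the rolling hash should
// aim for so that the chunk lands near targetCompressed after compression.
func (e *compressionEstimator) rawTarget(targetCompressed uint64) uint64 {
	if e.compressedBytes == 0 {
		return targetCompressed // no data yet; assume incompressible
	}
	ratio := float64(e.rawBytes) / float64(e.compressedBytes)
	return uint64(float64(targetCompressed) * ratio)
}

func main() {
	var e compressionEstimator
	e.observe([]byte("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa")) // highly compressible
	fmt.Println(e.rawTarget(4096))
}
```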
When we added GetMany and HasMany, we didn't realize that requests could then be larger than the allowable HTTP form size. This patch serializes the bodies of getRefs and hasRefs as binary instead, which fixes the size issue and also makes the request bodies more compact.
Fixes #3589
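For illustration only, here is a hedged sketch of what a binary request body of refs might look like; the length-prefixed framing and helper names below are assumptions, not the actual noms wire format.

```go
// Sketch: serialize a batch of 20-byte hashes as a compact binary request
// body instead of an HTTP form. The framing here (uvarint count followed by
// raw digests) is an assumption for illustration.
package sketch

import (
	"bytes"
	"encoding/binary"
	"net/http"
)

const hashLen = 20 // noms hashes are truncated SHA-512 digests

func buildHasRefsBody(hashes [][hashLen]byte) *bytes.Buffer {
	body := &bytes.Buffer{}
	var count [binary.MaxVarintLen64]byte
	n := binary.PutUvarint(count[:], uint64(len(hashes)))
	body.Write(count[:n])
	for _, h := range hashes {
		body.Write(h[:])
	}
	return body
}

func postHasRefs(url string, hashes [][hashLen]byte) (*http.Response, error) {
	// application/octet-stream instead of application/x-www-form-urlencoded
	return http.Post(url, "application/octet-stream", buildHasRefsBody(hashes))
}
```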
A lot of the JS code is taken from the old splore sample, but in
particular main.js is completely different - much simpler, because the
architecture of noms-splore uses a specialised {path => node} HTTP API,
implemented in Go, which does the noms graph traversal.
noms-splore also improves on the old splore sample by making it more
obvious what the key/value pairs are for maps and structs, but regresses
slightly in what it can say about prollytrees.
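As a hedged sketch of the {path => node} idea (not noms-splore's actual Go code), a handler might accept a noms path and return a single JSON node plus the paths of its children, keeping all graph traversal on the server; the node shape and resolveNode helper below are hypothetical.

```go
// Sketch of a {path => node} HTTP API: the client asks for one node by noms
// path and the server answers with just that node and its child paths, so
// the graph traversal happens in Go. Names here are hypothetical.
package sketch

import (
	"encoding/json"
	"net/http"
)

// node is a hypothetical wire shape for one value in the graph.
type node struct {
	Path     string   `json:"path"`
	Kind     string   `json:"kind"`     // e.g. "Map", "Struct", "Blob"
	Value    string   `json:"value"`    // short human-readable rendering
	Children []string `json:"children"` // paths of child nodes, if any
}

// resolveNode is assumed to look the path up in a noms database; its
// implementation is out of scope for this sketch.
var resolveNode func(path string) (node, error)

func getNode(w http.ResponseWriter, r *http.Request) {
	n, err := resolveNode(r.URL.Query().Get("path"))
	if err != nil {
		http.Error(w, err.Error(), http.StatusNotFound)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(n)
}
```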
* use kingpin for help and new commands, set up dummy command for noms blob (a kingpin sketch follows this list)
* document existing commands using kingpin
* remove noms-get and noms-set in favor of new noms blob command
* normalize bool flags in tests, remove redundant cases that kingpin now handles
* add kingpin to vendor files
* make profile flags global
* move --verbose and --quiet to global flags
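The sketch below shows the kind of kingpin wiring described in this list; the real noms command tree is larger, and the specific flags and subcommands here are illustrative assumptions.

```go
// Sketch of a kingpin-based CLI: global --verbose/--quiet and profiling
// flags, plus a "blob" command with a subcommand. Command names beyond
// "blob" and the flag details are assumptions for illustration.
package main

import (
	"fmt"
	"os"

	"gopkg.in/alecthomas/kingpin.v2"
)

func main() {
	app := kingpin.New("noms", "Noms is a content-addressed, versioned database.")

	// Global flags replace the per-command copies.
	verbose := app.Flag("verbose", "Show more output.").Short('v').Bool()
	quiet := app.Flag("quiet", "Show less output.").Short('q').Bool()
	cpuProfile := app.Flag("cpuprofile", "Write CPU profile to this file.").String()

	blob := app.Command("blob", "Interact with blobs.")
	blobGet := blob.Command("get", "Read a blob from the database.")
	blobGetSpec := blobGet.Arg("spec", "Path to the blob.").Required().String()

	switch kingpin.MustParse(app.Parse(os.Args[1:])) {
	case blobGet.FullCommand():
		fmt.Println(*verbose, *quiet, *cpuProfile, *blobGetSpec)
	}
}
```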
Prior to this patch, whenever we created a chunkSource for a table persisted to AWS, awsTablePersister::Open() would hit DynamoDB to check whether the table data was stored there. That's how it knew whether to create a dynamoTableReader or an s3TableReader. This meant consulting Dynamo (or the in-memory small-table cache) every time we went to open a table. Most of the time that isn't necessary: we separately cache table indices, and that index data is all we really need at Open() time.
This patch defers reading table data if possible.
Fixes #3607
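A hedged sketch of the deferral idea under assumed names: rather than asking DynamoDB at Open() time whether the table lives there, construct the reader lazily, on the first actual read of table data.

```go
// Sketch of deferring the "is this table in DynamoDB or S3?" decision.
// Type and method names here are assumptions, not the real nbs code.
package sketch

import "sync"

type tableReader interface {
	ReadAt(p []byte, off int64) (int, error)
}

type lazyTableSource struct {
	once   sync.Once
	reader tableReader
	// open performs the DynamoDB lookup (falling back to S3) only when
	// someone actually needs table data; index lookups don't trigger it.
	open func() tableReader
}

func (s *lazyTableSource) data() tableReader {
	s.once.Do(func() { s.reader = s.open() })
	return s.reader
}

func (s *lazyTableSource) ReadAt(p []byte, off int64) (int, error) {
	return s.data().ReadAt(p, off)
}
```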
This will likely bloat the cache with tables that no one is going to read data from, but it also means that most checks to see whether a table is in Dynamo at all can proceed locally. Stopgap until #3607 lands.
Fail over to fully-consistent reads when the eventually-consistent read returns no result. This means that misses get more expensive, but hits cost us half what they were costing in the initial version of the code.
Fixes #3604
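A sketch of that read path using the AWS SDK for Go; the table name, key shape, and attribute names are assumptions for illustration.

```go
// Sketch: try a cheaper eventually-consistent GetItem first and only fall
// back to a fully-consistent read on a miss. Table and attribute names are
// hypothetical.
package sketch

import (
	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/service/dynamodb"
	"github.com/aws/aws-sdk-go/service/dynamodb/dynamodbiface"
)

func getChunk(ddb dynamodbiface.DynamoDBAPI, table, key string) ([]byte, error) {
	read := func(consistent bool) (*dynamodb.GetItemOutput, error) {
		return ddb.GetItem(&dynamodb.GetItemInput{
			TableName:      aws.String(table),
			ConsistentRead: aws.Bool(consistent),
			Key: map[string]*dynamodb.AttributeValue{
				"name": {S: aws.String(key)},
			},
		})
	}

	out, err := read(false) // eventually consistent: half the read cost
	if err == nil && out.Item == nil {
		out, err = read(true) // miss: retry fully consistent
	}
	if err != nil || out.Item == nil {
		return nil, err
	}
	if v, ok := out.Item["data"]; ok {
		return v.B, nil
	}
	return nil, nil
}
```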
Looking at metrics on staging today, there are frequent spikes of
tens of thousands of throttled DynamoDB reads. One explanation is
that we're constantly evicting 'hot' tables from the in-memory
cache because the working set is larger than the space we've
allotted for the cache.
There seems to be a floor on the amount of time required to persist
small objects to S3. For workloads that generate lots of small tables,
this can really add up. DynamoDB is much faster to read/write, and can
hold items of up to 400KB. This patch stores tables with < 64 chunks that are < 400KB in DynamoDB, caching them in memory on persist and open to reduce load on the back end.
Fixes #3559
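A minimal sketch of the persist decision described above; the 64-chunk and 400KB thresholds come from the description, while the surrounding types and helpers are assumptions.

```go
// Sketch: route small tables (< 64 chunks and < 400KB) to DynamoDB,
// everything else to S3, caching the small ones in memory. Helper types
// here are hypothetical.
package sketch

const (
	maxDynamoChunks = 64
	maxDynamoBytes  = 400 * 1024 // DynamoDB's per-item size limit
)

type tableCache interface {
	put(name string, data []byte)
}

type persister struct {
	cache    tableCache
	toDynamo func(name string, data []byte) error
	toS3     func(name string, data []byte) error
}

func (p *persister) persist(name string, data []byte, chunkCount int) error {
	if chunkCount < maxDynamoChunks && len(data) < maxDynamoBytes {
		// Small table: DynamoDB is much faster than S3 for tiny objects.
		if err := p.toDynamo(name, data); err != nil {
			return err
		}
		p.cache.put(name, data) // keep a hot copy to reduce Dynamo reads
		return nil
	}
	return p.toS3(name, data)
}
```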