The library that we were using multiplied (and divided) by 2 to
do the zigzag encoding. However, if we are already close to the
precision limit then we lose precision and the encoding/decoding
does not round trip.
Instead, we split the float64 into two Uint32 numbers and do the
operations on those.
Also, the code that split a float64 into base and exp was not shifting
enough: it shifted until float64(maxInt64), when we need to shift
until the value is less than the max safe integer (according to float64).
Fixes #2104
Fixes #2234
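To illustrate the precision problem, here is a minimal Go sketch (not the patched code). Go's float64 stands in for the number type of the original library, and all function names are hypothetical. The arithmetic version computes -n*2 - 1 for negative inputs, which is no longer exactly representable once the magnitude approaches 2^53; the bitwise version on integers round-trips.

```go
package main

import (
	"fmt"
	"math"
)

// zigzagEncode maps signed integers to unsigned ones so that values of
// small magnitude get small encodings: 0->0, -1->1, 1->2, -2->3, ...
func zigzagEncode(n int64) uint64 {
	return uint64(n<<1) ^ uint64(n>>63)
}

// zigzagDecode is the inverse of zigzagEncode.
func zigzagDecode(u uint64) int64 {
	return int64(u>>1) ^ -int64(u&1)
}

// floatZigzagEncode does the same mapping with float64 arithmetic
// (multiply by 2), the approach that loses precision for large values.
func floatZigzagEncode(n float64) float64 {
	if n < 0 {
		return -n*2 - 1
	}
	return n * 2
}

// floatZigzagDecode reverses floatZigzagEncode (divide by 2).
func floatZigzagDecode(u float64) float64 {
	if math.Mod(u, 2) == 1 {
		return -(u + 1) / 2
	}
	return u / 2
}

// intRoundTrips reports whether n survives the bitwise encode/decode.
func intRoundTrips(n int64) bool {
	return zigzagDecode(zigzagEncode(n)) == n
}

// floatRoundTrips reports whether n survives the arithmetic encode/decode.
func floatRoundTrips(n float64) bool {
	return floatZigzagDecode(floatZigzagEncode(n)) == n
}

func main() {
	n := int64(-(1<<53 - 1)) // largest-magnitude negative "safe" integer
	fmt.Println(intRoundTrips(n))            // true
	fmt.Println(floatRoundTrips(float64(n))) // false: the encoded value was rounded
}
```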
Under load, our server can exhaust the number of file descriptors it's
allowed to have open at one time. Part of this is because of how many
incoming connections it's handling, but we believe that handling lots
of simultaneous reads to leveldb is the larger part of the issue.
This patch applies the rate limit we were using for writing to both
read and write operations.
Fixes #2227
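One common way to cap concurrent operations in Go is a buffered channel used as a counting semaphore. The sketch below is an assumption about the general shape of such a limit, not the code in this patch; `limiter`, `do`, and `maxInFlight` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// limiter caps the number of operations in flight. Each slot in the
// buffered channel represents one permitted concurrent operation.
type limiter chan struct{}

func newLimiter(n int) limiter { return make(limiter, n) }

// do blocks until a slot is free, runs op, then releases the slot.
// Reads and writes would both go through the same limiter.
func (l limiter) do(op func()) {
	l <- struct{}{}
	defer func() { <-l }()
	op()
}

// maxInFlight pushes total operations through a limiter of the given
// size and returns the highest concurrency it observed.
func maxInFlight(limit, total int) int32 {
	l := newLimiter(limit)
	var cur, peak int32
	var wg sync.WaitGroup
	for i := 0; i < total; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			l.do(func() {
				n := atomic.AddInt32(&cur, 1)
				for { // record a new peak if this is the highest count so far
					p := atomic.LoadInt32(&peak)
					if n <= p || atomic.CompareAndSwapInt32(&peak, p, n) {
						break
					}
				}
				atomic.AddInt32(&cur, -1)
			})
		}()
	}
	wg.Wait()
	return peak
}

func main() {
	fmt.Println(maxInFlight(4, 100) <= 4) // true
}
```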
1) ValueStore now maintains validation cache info across Commit() calls,
so we can remove a workaround in Dataset.Commit()
2) Update comment for setNewHead
This fixes https://github.com/attic-labs/noms/issues/2235 where set/map
diff was still running after noms log had finished and had already closed
its database. Commonly this would crash.
I've removed panic-based control flow to make error handling explicit,
and added a "DiffLeftRight" method to set/map so that noms log completes
ASAP. See the issue for details.
I need this in the short term because CSV raw files are too large to
check into github, so they need to be split up. It's easier to just do
"cat * | url-fetch -stdin" than recombining them on the file system.
In the server side of the Remote Database, the handler for
UpdateRoot now verifies that the new proposed Root is of a
legal type: empty map OR Map<String, Ref<Commit-like>>
Fixes #2116
This patch creates a new kind of chunks.Factory that demo-server
uses to vend ChunkStore instances that all share the same
MemoryStore-based Chunk cache. This cache _will_ grow without bound,
but the current RAM/data ratio on demo.noms.io means that, in practice,
we will be fine for a bit.
This will need to be removed in favor of a real solution in Issue #2227.
Fixes #2009
It turns out Pull() was making some bad assumptions about how the Go
heap package used its backing storage. Since it wasn't really relying
on heap guarantees anyway, this changes the code to use a slice of Ref
that's sorted in order of increasing ref-height: RefByHeight.
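A slice ordered by ref-height can be expressed with the standard sort.Interface. The sketch below is illustrative only: `Ref` is a stand-in with just the fields needed here, and `sortedByHeight` and `popTallest` are hypothetical helpers, not necessarily what Pull() uses.

```go
package main

import (
	"fmt"
	"sort"
)

// Ref is a simplified stand-in for the real type.
type Ref struct {
	Hash   string
	Height uint64
}

// RefByHeight implements sort.Interface, ordering by increasing height.
type RefByHeight []Ref

func (h RefByHeight) Len() int           { return len(h) }
func (h RefByHeight) Less(i, j int) bool { return h[i].Height < h[j].Height }
func (h RefByHeight) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }

// sortedByHeight returns a sorted copy of refs.
func sortedByHeight(refs []Ref) RefByHeight {
	out := make(RefByHeight, len(refs))
	copy(out, refs)
	sort.Sort(out)
	return out
}

// popTallest removes and returns the last element, which is the tallest
// ref when the slice is sorted. It assumes the slice is non-empty.
func (h *RefByHeight) popTallest() Ref {
	old := *h
	r := old[len(old)-1]
	*h = old[:len(old)-1]
	return r
}

func main() {
	refs := sortedByHeight([]Ref{{"a", 3}, {"b", 1}, {"c", 2}})
	fmt.Println(refs.popTallest().Hash, len(refs)) // a 2
}
```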
sequence.getOffset was problematic and didn't have a clear meaning. In addition it was causing a bunch of +1 code at call sites. This patch replaces it with cumulativeNumLeaves which has a clearer meaning.
NewSerializer spun up a goroutine within itself. We've decided
this is an anti-pattern. Furthermore, we were using this inside
our remote database handler code, and a panic inside that goroutine
could take down the server. The callsites now use Serialize() directly.
Fixes #2169
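The reason a synchronous call is safer can be sketched as follows: a panic on the caller's own goroutine can be recovered by the caller, whereas a panic in a goroutine started inside a constructor cannot. The `serialize` and `handle` functions are hypothetical stand-ins, not the actual Serialize() API.

```go
package main

import (
	"errors"
	"fmt"
)

// serialize stands in for a synchronous serialization call that may panic.
func serialize(data []byte) []byte {
	if len(data) == 0 {
		panic(errors.New("nothing to serialize"))
	}
	return append([]byte("chunk:"), data...)
}

// handle runs serialize on the caller's goroutine, so the deferred
// recover can turn a panic into an ordinary error for this request.
func handle(data []byte) (out []byte, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("serialize failed: %v", r)
		}
	}()
	return serialize(data), nil
}

func main() {
	_, err := handle(nil)
	fmt.Println(err) // serialize failed: nothing to serialize

	out, _ := handle([]byte("abc"))
	fmt.Println(string(out)) // chunk:abc
}
```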
Shows number of changes between two top level values.
```
noms diff -summarize $l::#t5p4im6uug7n5m72frr0dnjnkm04e4ph.value $l::#ueo0utduuqsf9vrntrhn25lnc19m848l.value
```
Prints:
```
13,636 insertions, 0 deletions, 107 changes, (1,919,894 values vs 1,933,530 values)
```
Where the numbers are updated as more data is computed from the diff.
Towards #2031
Leaving them running will cause a few database reads after
orderedSequenceDiffBest returns, and it looks like this is causing a
deadlock (see https://github.com/attic-labs/noms/issues/2165). I've also
seen a log related crash which goes away after this patch.
- To have the same API as all the other diff methods
- Send changes to channel without intermediary slices and without the
need to union and sort the fields
First: diff wasn't checking whether it had been stopped before sending the
final set of changes. If the caller had stopped - and was therefore no
longer reading from the changes channel - diff would hang.
Second: the same close channel was being used on 2 threads and it was
possible for both to read the close signal - causing the other to hang.
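Both problems map onto common Go channel idioms, sketched here under assumptions (the names are hypothetical, and this is not necessarily how the patch fixes them). A send guarded by `select` gives up when the stop channel fires, and closing a channel, unlike sending one value on it, is observed by every goroutine waiting on it.

```go
package main

import "fmt"

// sendChange tries to deliver change on changes. It returns false if
// stop is closed before a reader takes the value, so the sender never
// hangs once the caller has gone away.
func sendChange(changes chan<- int, stop <-chan struct{}, change int) bool {
	select {
	case changes <- change:
		return true
	case <-stop:
		return false
	}
}

// stoppedCount starts n goroutines that all wait on one stop channel,
// closes it, and returns how many of them observed the stop.
func stoppedCount(n int) int {
	stop := make(chan struct{})
	done := make(chan struct{})
	for i := 0; i < n; i++ {
		go func() {
			<-stop
			done <- struct{}{}
		}()
	}
	close(stop) // a close is seen by every waiter, not just one
	count := 0
	for i := 0; i < n; i++ {
		<-done
		count++
	}
	return count
}

func main() {
	changes := make(chan int) // unbuffered and nobody is reading
	stop := make(chan struct{})
	close(stop)
	fmt.Println(sendChange(changes, stop, 1)) // false
	fmt.Println(stoppedCount(2))              // 2
}
```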
The convenience constructor changed in this patch takes in an aws.Config
object directly. This allows any implementation of the mentioned interface
to be passed in to Noms's Dynamo store -- giving flexibility for client
code to add their own credential acquisition mechanisms, for instance.
[fixes #2151]
We need to also pull over the meta field. The references in meta are
already getting pulled over by `Chunks()`. If we do not pull over the
meta field we get a different commit value, with a different hash
which leads to newer pulls not being able to merge cleanly.
Fixes #2112
This change ensures that all commit struct types have a meta field
(which might be an empty struct).
Increment the serialization version since the old data does not
necessarily have the meta field.
Fixes #2112
The left-right diff is expected to return results earlier, whereas the
top-down approach is faster overall. This new diff algorithm runs both:
- early results are returned from the left-right diff.
- if/when top-down catches up, left-right is stopped and the rest of the
changes are returned from top-down.