Commit Graph

3504 Commits

Author SHA1 Message Date
cmasone-attic 1d52617eb5 NBS table names now just hash of suffix block (#3421)
Used to be that an NBS table was named by hashing the hashes
of every chunk present in the table, in hash order. That means
that to generate the name of a table you'd need to iterate
the prefix map and load every associated suffix. That would
be expensive when e.g. compacting multiple tables. This is
waaay cheaper and only slightly more likely to wind up with a
name collision.

Toward #3411
2017-04-24 14:45:54 -07:00
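Note on 1d52617eb5: a minimal sketch of the naming idea, assuming the suffix block is available as one contiguous byte slice. The helper name `nameFromSuffixes`, the choice of SHA-512 truncated to 20 bytes, and the hex encoding are all assumptions made for illustration, not the actual NBS code.

```go
package main

import (
	"crypto/sha512"
	"encoding/hex"
	"fmt"
)

// nameFromSuffixes is a hypothetical helper: it derives a table name by
// hashing the table's suffix block as one opaque byte slice, without
// walking the prefix map or loading individual chunk hashes.
func nameFromSuffixes(suffixBlock []byte) string {
	sum := sha512.Sum512(suffixBlock)
	// Truncating to 20 bytes and hex-encoding are assumptions of this sketch.
	return hex.EncodeToString(sum[:20])
}

func main() {
	suffixes := []byte{0x01, 0x02, 0x03, 0x04}
	fmt.Println(nameFromSuffixes(suffixes))
}
```

The cost is a single hash over bytes that are already in hand, which is why this is cheaper than hashing every chunk hash in order.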
cmasone-attic 98e408a0d0 Add Database.Rebase() (#3420)
All this really does is tell the underlying ChunkStore
to go fetch the current root from persistent storage
and drop the now-out-of-date Dataset map on the floor.

Fixes https://github.com/attic-labs/attic/issues/1157
2017-04-24 14:17:08 -07:00
Rafael Weinstein faa9ef03c0 Trigger compaction probabilistically to avoid lots of concurrent attempts (#3419) 2017-04-24 14:08:15 -07:00
cmasone-attic 27556b6148 Rev Noms version (#3418)
Should've done this last week when I tweaked the HTTP
protocol to return the current root hash when a POST
to the root/ endpoint fails.
2017-04-24 12:22:55 -07:00
Rafael Weinstein 6e47be3899 make MemoryStoreFactory public (#3417) 2017-04-21 17:46:11 -07:00
cmasone-attic fef871c1a7 ValueStore.Flush() no longer persists Chunks (#3416)
ValueStore.Flush() now Puts all Chunks buffered in the ValueStore
layer into the underlying ChunkStore. The Chunks are not persistent
at this point, not until and unless the caller calls Commit() on
the ChunkStore.

This patch also removes ChunkStore.Flush(). The same effect can be
achieved by calling ChunkStore.Commit() with the current Root for both
last and current.

NB: newTestValueStore is now private to the types package.
The logic is that, now, outside the types package, callers
need to hold onto the underlying ChunkStore if they want to
persist Chunks.

Toward #3404
2017-04-21 17:30:56 -07:00
Erik Arvidsson fff2d75481 Make Marshal use StructTemplate (#3413) 2017-04-21 16:27:37 -07:00
cmasone-attic 16ef8884a7 Make MemoryStore come correct (#3406)
It's important that MemoryStore (and, by extension TestStore)
correctly implement the new ChunkStore semantics before we go
shifting around the Flush semantics like we want to do in #3404

In order to make this a reality, I introduced a "persistence"
layer for MemoryStore called MemoryStorage, which can vend
MemoryStoreView objects that represent a snapshot of the
persistent storage and implement the ChunkStore contract.

Fixes #3400

Removed Rebase() in HandleRootGet, and added ChunkStore
tests to validate the new Put behavior more fully
2017-04-21 14:13:52 -07:00
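Note on 16ef8884a7: the storage/view split described above can be sketched with a toy model. The types `storage` and `view` below are stand-ins for MemoryStorage and MemoryStoreView, hashes are plain strings, and none of the method bodies are taken from the real code.

```go
package main

import (
	"fmt"
	"sync"
)

// storage is a stand-in for the shared "persistent" layer.
type storage struct {
	mu     sync.Mutex
	root   string
	chunks map[string][]byte
}

// view is a stand-in for a snapshot vended by storage. Puts are buffered
// in pending and only reach storage when Commit succeeds.
type view struct {
	s       *storage
	root    string
	snap    map[string][]byte
	pending map[string][]byte
}

func newStorage() *storage { return &storage{chunks: map[string][]byte{}} }

// refresh copies the persistent state into the view. Callers hold s.mu.
func (v *view) refresh() {
	v.root = v.s.root
	v.snap = make(map[string][]byte, len(v.s.chunks))
	for h, d := range v.s.chunks {
		v.snap[h] = d
	}
}

func (s *storage) newView() *view {
	s.mu.Lock()
	defer s.mu.Unlock()
	v := &view{s: s, pending: map[string][]byte{}}
	v.refresh()
	return v
}

func (v *view) Put(h string, d []byte) { v.pending[h] = d }

// Has sees this view's buffered chunks plus its snapshot of storage.
func (v *view) Has(h string) bool {
	if _, ok := v.pending[h]; ok {
		return true
	}
	_, ok := v.snap[h]
	return ok
}

func (v *view) Rebase() {
	v.s.mu.Lock()
	defer v.s.mu.Unlock()
	v.refresh()
}

// Commit persists pending chunks and moves the root from last to current.
// It fails if last is not the root currently held by storage.
func (v *view) Commit(current, last string) bool {
	v.s.mu.Lock()
	defer v.s.mu.Unlock()
	if v.s.root != last {
		return false
	}
	for h, d := range v.pending {
		v.s.chunks[h] = d
	}
	v.s.root = current
	v.pending = map[string][]byte{}
	v.refresh()
	return true
}

func main() {
	s := newStorage()
	a, b := s.newView(), s.newView()
	a.Put("h1", []byte("chunk"))
	fmt.Println(a.Has("h1"), b.Has("h1")) // true false
	a.Commit("r1", "")
	b.Rebase()
	fmt.Println(b.Has("h1"), b.root) // true r1
}
```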
Rafael Weinstein 5d6032a9aa Add types.MakeStructTemplate (#3412)
Add types.MakeStructTemplate
2017-04-20 14:41:57 -07:00
Rafael Weinstein 2bcf85af0b implement "rollUp" minor compaction (#3409)
Change the strategy for choosing which tables to compact. Choose 2 or more of the N smallest tables such that the resulting table will be the smallest (or tied for smallest) table in the store.
2017-04-20 11:32:18 -07:00
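Note on 2bcf85af0b: one possible reading of the selection rule, sketched below. It assumes that table size is a chunk count and that the compacted table's size is simply the sum of its inputs (no de-duplication). The function name `chooseRollUp` is hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// chooseRollUp returns how many of the smallest tables to compact together.
// It starts with the two smallest and keeps absorbing the next-smallest
// table while the combined size would still exceed it, so the result ends
// up no larger than any table left in the store. Returns 0 when there are
// fewer than two tables.
func chooseRollUp(sizes []int) int {
	if len(sizes) < 2 {
		return 0
	}
	sorted := append([]int(nil), sizes...) // leave the caller's slice alone
	sort.Ints(sorted)
	n, total := 2, sorted[0]+sorted[1]
	for n < len(sorted) && total > sorted[n] {
		total += sorted[n]
		n++
	}
	return n
}

func main() {
	fmt.Println(chooseRollUp([]int{1, 1, 5, 10}))  // 2
	fmt.Println(chooseRollUp([]int{4, 5, 6, 100})) // 3
}
```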
cmasone-attic ff7cae6d34 Merge chunks.RootTracker interface into chunks.ChunkStore (#3408)
You can't fully specify RootTracker without referring to the
ChunkStore interface, so they should just merge.

Fixes #3402
2017-04-19 21:34:20 -07:00
cmasone-attic dc41f18498 Code review changes from #3403 (#3405) 2017-04-19 13:34:12 -07:00
cmasone-attic cb930dee81 Merge BatchStore into ChunkStore (#3403)
BatchStore is dead, long live ChunkStore! Merging these two required
some modification of the old ChunkStore contract to make it more
BatchStore-like in places, most specifically around Root(), Put() and
PutMany().

The first big change is that Root() now returns a cached value for the
root hash of the Store. This is how NBS worked already, so the more
interesting change here is the addition of Rebase(), which loads the
latest persistent root. Any chunks that appeared in backing storage
since the ChunkStore was opened (or last rebased) also become
visible.

UpdateRoot() has been replaced with Commit(), because UpdateRoot() was
ALREADY doing the work of persisting novel chunks as well as moving
the persisted root hash of the ChunkStore in both NBS and
httpBatchStore. This name, and the new contract (essentially Flush() +
UpdateRoot()), is a more accurate representation of what's going on.

As for Put(), the former contract claimed to block until the chunk
was durable. That's no longer the case. Indeed, NBS was already not
fulfilling this contract. The new contract reflects this, asserting
that novel chunks aren't persisted until a Flush() or Commit() --
which has replaced UpdateRoot(). Novel chunks are immediately visible
to Get and Has calls, however.

In addition to this larger change, there are also some tweaks to
ValueStore and Database. ValueStore.Flush() no longer takes a hash,
and instead just persists any and all Chunks it has buffered since the
last time anyone called Flush(). Database.Close() used to have some
side effects where it persisted Chunks belonging to any Values the
caller had written -- that is no longer so. Values written to a
Database only become persistent upon a Commit-like operation (Commit,
CommitValue, FastForward, SetHead, or Delete).

/******** New ChunkStore interface ********/

type ChunkStore interface {
     ChunkSource
     RootTracker
}

// RootTracker allows querying and management of the root of an entire tree of
// references. The "root" is the single mutable variable in a ChunkStore. It
// can store any hash, but it is typically used by higher layers (such as
// Database) to store a hash to a value that represents the current state and
// entire history of a database.
type RootTracker interface {
     // Rebase brings this RootTracker into sync with the persistent storage's
     // current root.
     Rebase()

     // Root returns the currently cached root value.
     Root() hash.Hash

     // Commit atomically attempts to persist all novel Chunks and update the
     // persisted root hash from last to current. If last doesn't match the
     // root in persistent storage, returns false.
     // TODO: is last now redundant? Maybe this should just try to update from
     // the cached root to current?
     // TODO: Does having a separate RootTracker make sense anymore? BUG 3402
     Commit(current, last hash.Hash) bool
}

// ChunkSource is a place chunks live.
type ChunkSource interface {
     // Get the Chunk for the value of the hash in the store. If the hash is
     // absent from the store nil is returned.
     Get(h hash.Hash) Chunk

     // GetMany gets the Chunks with |hashes| from the store. On return,
     // |foundChunks| will have been fully sent all chunks which have been
     // found. Any non-present chunks will silently be ignored.
     GetMany(hashes hash.HashSet, foundChunks chan *Chunk)

     // Returns true iff the value at the address |h| is contained in the
     // source
     Has(h hash.Hash) bool

     // Returns a new HashSet containing any members of |hashes| that are
     // present in the source.
     HasMany(hashes hash.HashSet) (present hash.HashSet)

     // Put caches c in the ChunkSink. Upon return, c must be visible to
     // subsequent Get and Has calls, but must not be persistent until a call
     // to Flush(). Put may be called concurrently with other calls to Put(),
     // PutMany(), Get(), GetMany(), Has() and HasMany().
     Put(c Chunk)

     // PutMany caches chunks in the ChunkSink. Upon return, all members of
     // chunks must be visible to subsequent Get and Has calls, but must not be
     // persistent until a call to Flush(). PutMany may be called concurrently
     // with other calls to Put(), PutMany(), Get(), GetMany(), Has() and
     // HasMany().
     PutMany(chunks []Chunk)

     // Returns the NomsVersion with which this ChunkSource is compatible.
     Version() string

     // On return, any previously Put chunks must be durable. It is not safe to
     // call Flush() concurrently with Put() or PutMany().
     Flush()

     io.Closer
}

Fixes #2945
2017-04-19 13:31:58 -07:00
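Note on cb930dee81: given the Rebase()/Root()/Commit() contract quoted above, a caller would plausibly retry a failed Commit after rebasing. The sketch below assumes that usage pattern. The `rootStore` interface is a cut-down, string-based stand-in for RootTracker, and `fakeStore` and `commitWithRetry` are hypothetical names.

```go
package main

import "fmt"

// rootStore is a simplified stand-in for the RootTracker methods, with
// hashes modelled as strings.
type rootStore interface {
	Rebase()
	Root() string
	Commit(current, last string) bool
}

// fakeStore keeps a cached root separate from the persisted one.
type fakeStore struct {
	persisted string
	cached    string
}

func (f *fakeStore) Rebase()      { f.cached = f.persisted }
func (f *fakeStore) Root() string { return f.cached }

func (f *fakeStore) Commit(current, last string) bool {
	if f.persisted != last {
		return false
	}
	f.persisted, f.cached = current, current
	return true
}

// commitWithRetry computes a new root from the cached one and tries to
// commit it. On failure it rebases and tries again, up to attempts times.
func commitWithRetry(s rootStore, next func(last string) string, attempts int) bool {
	for i := 0; i < attempts; i++ {
		last := s.Root()
		if s.Commit(next(last), last) {
			return true
		}
		s.Rebase()
	}
	return false
}

func main() {
	// The cached root is stale: someone else already moved r0 to r1.
	s := &fakeStore{persisted: "r1", cached: "r0"}
	ok := commitWithRetry(s, func(last string) string { return last + "+x" }, 3)
	fmt.Println(ok, s.Root()) // true r1+x
}
```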
Dan Willhite 6ef3a0ddec Export MockExit so that all tests can use it (#3401) 2017-04-18 15:06:00 -07:00
cmasone-attic 40af2a3dab Split decode vs completeness validation out of ValidatingBatchingSink (#3399)
ValidatingBatchingSink stopped batching a while ago, and has generally
made less and less sense over time. Splitting out decoding from
validation allows for clearer code in the server-side writeValue
handler.

Additionally, since we intend to make new chunks that are Put into a
ChunkStore not persist until Flush() or UpdateRoot() as a part of
now the only CS implementation -- splitting out these concerns allows
localBatchStore to stop caching all new chunks as it goes.

Fixes #3343
2017-04-18 12:07:10 -07:00
Eric Halpern f234913943 Remove code coverage (#3395)
* Remove code coverage

* Remove node --version
2017-04-17 18:17:24 -07:00
Rafael Weinstein ae9f14e9fb Disable read ahead for iter/iterator prolly tree methods. (#3394)
Disable read ahead for iter/iterator prolly tree methods.
2017-04-17 17:55:38 -07:00
Rafael Weinstein 7d914e1842 flush no chunks is noop (#3392) 2017-04-17 14:24:56 -07:00
Aaron Boodman cdc657f5e4 Define DateTime using struct embedding rather than type renaming. (#3386)
Define DateTime using struct embedding rather than type renaming.

This results in us inheriting all the methods of DateTime automatically.
2017-04-17 13:53:05 -07:00
Rafael Weinstein a626827d16 remove warning (#3389) 2017-04-16 16:50:54 -07:00
cmasone-attic 192bdf6801 Remove DynamoStore (#3388)
We're no longer using this, and forthcoming changes to ChunkStore
mean that we'd have to do work to continue supporting it.
2017-04-14 13:41:38 -07:00
Aaron Boodman 9eda8eeb89 Fix broken test (#3387) 2017-04-14 10:23:16 -07:00
Aaron Boodman 06cebbe346 Introduce @target annotation for paths (#3352)
Introduce @target annotation for paths

Fixes #2172
2017-04-13 14:56:31 -07:00
Erik Arvidsson fd997f7bfa Add IsValueSubtypeOf and IsCommit (#3375)
This adds IsValueSubtypeOf which skips computing the type of the value.

Use IsValueSubtypeOf to implement IsCommit which checks if a value is a
commit.

Replace usages of IsSubtype(t, TypeOf(v)) with IsValueSubtypeOf(v, t).

Fixes #3326
Fixes #3348
2017-04-13 10:49:17 -07:00
cmasone-attic 44c6e1c733 NBS: clean up chunk extract error reporting (#3376)
At one point, we had some hard-to-diagnose failures
decoding chunks. We tracked the problem down and fixed
it, but it's good to keep the error reporting. It's done
more cleanly and efficiently, now.

Fixes #3148
2017-04-13 09:42:34 -07:00
cmasone-attic 0105a16916 Don't read chunk during ValueStore.bufferChunk() (#3383)
The last patch did this in order to allow bad-behavers
to still have a chance of succeeding if they write Values
top down. This ensures that they won't, and therefore
will run afoul of lazy completeness checking.

Follow on for #3371
2017-04-12 14:47:37 -07:00
cmasone-attic 67f3da3e57 Loosen checks in ValueStore.bufferChunks (#3382)
Before ripping out all the hinting and associated proactive
validation checking, it was impossible to write Values through
the ValueStore API in any way other than bottom-up. That meant
that we could enforce the invariant that a chunk could not
appear in pendingParents unless it was also in pendingPuts.
Now, it's possible to construct legal sequences of calls to
ValueStore that result in this check being violated. Raf and
I don't think that violating this check can actually lead to
an invalid underlying database, as the lazy validation that
replaced the proactive validation should still catch any such
issues.

Fixes #3371
2017-04-12 14:30:08 -07:00
Rafael Weinstein 07595ccfe0 restore some simplify tests (#3373) 2017-04-10 13:01:03 -07:00
Erik Arvidsson 7c4e2385ab Normalize our number encoding (#3370)
Our Number encoding consists of two parts. First, we convert the float
into f * 2**exp, then we uvarint encode f and exp. However, we didn't
normalize f so in theory we could end up with multiple representations
of the same number.

This changes the representation to make f the smallest possible
integer that fulfills the formula above.

For example we used to encode 256 as (0x100, 0) but with this we instead
encode it as (0x01, 8).

Fixes #2307
2017-04-10 12:20:52 -07:00
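Note on 7c4e2385ab: the normalization step can be sketched as repeatedly halving f while it is even and incrementing exp to compensate. The function below is an illustration that derives (f, exp) from a float64 using the standard math package; the name `normalize` is hypothetical, NaN and infinities are out of scope, and the varint encoding step is not shown.

```go
package main

import (
	"fmt"
	"math"
)

// normalize returns the smallest-magnitude integer f and the exponent exp
// such that v == f * 2**exp.
func normalize(v float64) (f int64, exp int) {
	if v == 0 {
		return 0, 0
	}
	frac, e := math.Frexp(v)        // v == frac * 2**e with 0.5 <= |frac| < 1
	f = int64(math.Ldexp(frac, 53)) // exact: a float64 carries 53 mantissa bits
	exp = e - 53
	// Strip trailing zero bits from f so each number has one representation.
	for f%2 == 0 {
		f /= 2
		exp++
	}
	return f, exp
}

func main() {
	fmt.Println(normalize(256)) // 1 8
	fmt.Println(normalize(3))   // 3 0
}
```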
Rafael Weinstein 0b10350af3 cleanup (#3372) 2017-04-10 11:45:26 -07:00
Rafael Weinstein fbfdd317fc Encode all noms quantities as varint (#3368) 2017-04-08 22:48:03 -07:00
Rafael Weinstein 34ae4262c8 Types refactor (#3367)
1. Decoder no longer needed to remove struct cycles. That happened as we decoded
2. Remove no-op tests.
3. Remove dead code
4. Refactor and add test.
5. Just moving code around (mostly type_cache -> make_type)
6. Remove (no longer necessary) typeDepth counter in ValueDecoder
2017-04-08 12:14:39 -07:00
Rafael Weinstein a2bb7d7a75 Remove to unresolved type (#3366) 2017-04-08 11:12:54 -07:00
Rafael Weinstein d8b5d03520 Zero-tolerance for unnamed struct cycles (#3365) 2017-04-08 10:39:03 -07:00
Rafael Weinstein ad6ffaec9b remove noms migration (#3362) 2017-04-08 10:07:44 -07:00
Rafael Weinstein cb06559428 correctly encode struct cycles (#3360) 2017-04-08 10:01:15 -07:00
cmasone-attic d95b367179 Rename 'lock' attribute in dynamo_manifest.go (#3358)
Turns out 'lock' is a reserved word for DynamoDB attributes,
so change it.
2017-04-07 20:27:00 -07:00
Erik Arvidsson f40e5ae7bf Compare target hash instead (#3356)
Also handle cycles of unnamed structs.
2017-04-07 17:56:46 -07:00
cmasone-attic fe2c476469 Fix NBS optimistic locking (#3353)
Introduce a "lock" hash into NBS manifests to address the bad
interaction between Flush() and optimistic locking. Our original
design didn't include Flush(), which changes the set of tables without
updating the root. Thus... an optimistic locking strategy predicated
on checking the currently-persisted root hash is not robust to
interleaved Flush() calls from multiple clients.

Fixes #3349
2017-04-07 16:55:39 -07:00
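Note on fe2c476469: a toy model of why a root-only check is not enough once Flush() exists. The `manifest` struct, its field names, and the `update` method are assumptions made for this sketch and do not mirror the real NBS manifest format.

```go
package main

import "fmt"

// manifest is a simplified stand-in for an NBS manifest.
type manifest struct {
	lock   string // changes on every update, including a Flush
	root   string // changes only on Commit
	tables []string
}

// update replaces the manifest only if the caller last saw the current lock.
func (m *manifest) update(lastLock string, next manifest) bool {
	if m.lock != lastLock {
		return false
	}
	*m = next
	return true
}

func main() {
	m := &manifest{lock: "l0", root: "r0", tables: []string{"t0"}}
	seen := *m // client B reads the manifest

	// Client A flushes: the table set changes, the root does not.
	m.update("l0", manifest{lock: "l1", root: "r0", tables: []string{"t0", "t1"}})

	// A check on the root alone would let B overwrite A's table t1.
	fmt.Println("root check passes:", m.root == seen.root) // true

	// The lock check rejects B's stale update.
	ok := m.update(seen.lock, manifest{lock: "l2", root: "r1", tables: []string{"t0", "t2"}})
	fmt.Println("lock check passes:", ok) // false
}
```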
Erik Arvidsson 3cda20e251 Add types.HasStructCycle and cleanup MakeStructType (#3354) 2017-04-07 16:27:17 -07:00
Erik Arvidsson 4e913dc210 Fix issue with ilooping with unnamed structs (#3350)
When we clone it is possible to run into loops between named and
unnamed structs. We need to stop cloning the type tree when we see
a cycle.

Also update Describe to print the actual type tree, even if it is
invalid.
2017-04-07 09:50:24 -07:00
Aaron Boodman 5ed5b764b6 NewStreamingBlob: print out err details from reader (#3351)
We're getting an error here but can't see what it is in the logs.
2017-04-07 09:39:46 -07:00
Erik Arvidsson fd815b10ad Compute type based on value (#3338)
This moves the type off of the value; instead, we compute it when it is asked for.

This also changes how we detect cycles. If a named struct contains a struct with the
same name we now create a cycle between them. This also means that cycle types
now take a string and not a number.

For encoding we no longer write the type with the value (unless it is a types.Ref).

This is a format change so this takes us to 7.6

Fixes #3328
Fixes #3325
Fixes #3324
Fixes #3323
2017-04-06 17:43:49 -07:00
cmasone-attic de76d37f09 Rip out hinting, reverse-order hack; make validation lazy (#3340)
* Add HasMany() to the ChunkStore interface

We'll need this as a part of #3180

* Rip out hinting

The hinting mechanism used to assist in server-side validation
of values served us well, but now it's in the way of building a
more suitable validation strategy. Tear it out and go without
validation for a hot minute until #3180 gets done.

Fixes #3178

* Implement server-side lazy ref validation

The server, when handling writeValue, now just keeps track of all the
refs it sees in the novel chunks coming from the client. Once it's
processed all the incoming chunks, it just does a big bulk HasMany to
determine if any of them aren't present in the storage backend.

Fixes #3180

* Remove chunk-write-order requirements

With our old validation strategy, it was critical that
chunk graphs be written bottom-up, during both novel value
creation and sync. With the strategy implemented in #3180,
this is no longer required, which lets us get rid of a bunch
of machinery:

1) The reverse-order hack in httpBatchStore
2) the EnumerationOrder stuff in NomsBlockCache
3) the orderedPutCache in datas/
4) the refHeight arg on SchedulePut()

Fixes #2982
2017-04-06 16:54:40 -07:00
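Note on de76d37f09: the lazy validation described above can be sketched as "collect every ref, then ask the backend once". The `chunk` and `store` types and the `missingRefs` helper are hypothetical. As a simplification, this sketch treats refs that point at other chunks in the same batch as satisfied without consulting the store.

```go
package main

import (
	"fmt"
	"sort"
)

// chunk is a stand-in for a novel chunk and the refs found inside it.
type chunk struct {
	hash string
	refs []string
}

// store is a stand-in for the storage backend.
type store map[string]bool

// hasMany returns the subset of hashes that are present in the store.
func (s store) hasMany(hashes map[string]bool) map[string]bool {
	present := map[string]bool{}
	for h := range hashes {
		if s[h] {
			present[h] = true
		}
	}
	return present
}

// missingRefs gathers all refs from the incoming chunks, in whatever order
// they arrive, and performs one bulk lookup at the end.
func missingRefs(s store, incoming []chunk) []string {
	novel := map[string]bool{}
	refs := map[string]bool{}
	for _, c := range incoming {
		novel[c.hash] = true
		for _, r := range c.refs {
			refs[r] = true
		}
	}
	for h := range novel {
		delete(refs, h) // satisfied by the batch itself
	}
	present := s.hasMany(refs)
	var missing []string
	for r := range refs {
		if !present[r] {
			missing = append(missing, r)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	backend := store{"a": true}
	// The parent arrives before its child; that is fine with lazy checking.
	batch := []chunk{{hash: "p", refs: []string{"c", "a"}}, {hash: "c"}}
	fmt.Println(missingRefs(backend, batch))                                     // []
	fmt.Println(missingRefs(backend, []chunk{{hash: "q", refs: []string{"x"}}})) // [x]
}
```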
Dan Willhite ecc39d1c76 Fix datetime nanoseconds (#3342)
We were not correctly marshalling the nanoseconds, which led to issues with round tripping.
2017-04-06 10:43:21 -07:00
Eric Halpern 9ff2c1d855 Go slice to Noms Set marshalling (#3339)
* Go slice to Noms Set marshalling

fixes: #3335
2017-04-04 11:33:30 -07:00
Erik Arvidsson c964aff0af Remove types.Value Type() in favor of types.TypeOf() (#3337)
BREAKING CHANGE

This removes the `Type()` method from the `types.Value` interface.
Instead use the `types.TypeOf(v types.Value) *types.Type` function.

Fixes #3324
2017-04-03 14:04:13 -07:00
cmasone-attic 344d859775 Fix dynamoManifest to properly handle an empty set of tables (#3336)
DynamoDB doesn't allow empty strings in records. We were sending an
empty string in the case where a store had no tables in it. Instead,
the right thing is to leave this attribute out of the record, and then
detect the case where the attribute is empty when reading the record.
2017-04-03 13:48:18 -07:00
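Note on 344d859775: the "omit on write, tolerate absence on read" approach can be sketched without any AWS dependency by modelling a record as a string map. The attribute names, the ":" separator, and both helper functions are assumptions for illustration. The sketch also assumes the root value itself is never empty.

```go
package main

import (
	"fmt"
	"strings"
)

// tablesAttr is a hypothetical attribute name for the table list.
const tablesAttr = "tables"

// toRecord builds the record to store. When there are no tables, the
// attribute is left out entirely rather than written as an empty string.
func toRecord(root string, tables []string) map[string]string {
	rec := map[string]string{"root": root}
	if len(tables) > 0 {
		rec[tablesAttr] = strings.Join(tables, ":")
	}
	return rec
}

// fromRecord reads a record back, treating a missing or empty attribute
// as "this store has no tables".
func fromRecord(rec map[string]string) (root string, tables []string) {
	v, ok := rec[tablesAttr]
	if !ok || v == "" {
		return rec["root"], nil
	}
	return rec["root"], strings.Split(v, ":")
}

func main() {
	rec := toRecord("r0", nil)
	_, hasAttr := rec[tablesAttr]
	fmt.Println(hasAttr) // false

	root, tables := fromRecord(rec)
	fmt.Println(root, len(tables)) // r0 0
}
```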
Erik Arvidsson ebbefcc511 Use Struct fields directly in HRS (#3334) 2017-04-02 19:31:20 -07:00
Erik Arvidsson f1d7214af6 Update struct in memory representation (#3333)
* Update struct in memory representation

To contain struct name and field names.

This is in preparation for removing types from the value.

Towards #3324

* Rename structFields to structTypeFields

* Undo change in vendoring

* searchField and copy
2017-04-02 07:46:39 -07:00