This patch introduces/expands the 'manifest' and 'tableSet'
abstractions, so that NomsBlockStore is no longer explicitly using any
file system operations.
Towards issue #2877
These get the set/map element at a specific index.
I haven't implemented it in JS yet because the JS code has no method to
create a cursor at an index. This exists in Go because a refactor was
done a few months ago to add it, but it hasn't been ported to JS.
* Support marshaling from and unmarshaling to *types.Type
* Incorporate code review suggestions.
- Use fallthrough in switch
- Update decode godoc to spec *types.Type handling. Godoc for encode is fine as is.
toward: #2889
Encode chunk counts consistently as uint32 until #2873 is addressed. This also fixes an error where the chunkCounts passed along after compaction didn't account for dropped (duplicate) chunks.
Remove validation/normalization of union order and struct field order as we decode a chunk into a type.
Instead the validation happens in ValidatingBatchSink.
We still normalize the union order when a struct type is created directly (not from a chunk) using makeStructType.
The motivation for this change is that computing the OID (order ID) is expensive, and it used to be O(n^2) because we kept recomputing it as we traversed the type hierarchy.
Towards #2836
This patch introduces optional debug logging in util/verbose, and adds
some usage of it to HandleWriteValue and the httpBatchStore
SchedulePut code path. It also modifies chunks.DeserializeToChan() so
that callers can better recover from panics in there.
https://github.com/attic-labs/attic/issues/103
Introduce photo-dedup-by-date
This program deduplicates photos by the date they were taken. It treats two photos as part of the same group if they were taken less than 5 seconds apart.
* Implement read-ahead in sequence_cursor
For each meta-sequence that contains leaf sequences, start reading ahead in
parallel and deliver in order to a buffered channel. Each advance of the cursor gets
the next sequence in the read-ahead channel.
toward: #2079
* Address code review comments:
- Use // for all comments
- Fix label format
- Increase channel read timeout
* Rework read-ahead to use map[int]channel sequence instead of a channel of sequences
* Rework sequence cursor read-ahead for better throughput
- Guts of read-ahead now encapsulated in sequenceReadAhead
- New implementation uses a cursor to iterate across the leaves ahead
of the current cursor
- It reads ahead using short-lived goroutines that place each read-ahead
sequence in a channel that is then stored by hash in a map
- When the sequence is needed, the cursor first looks in the map. If found,
it reads the sequence from the channel stored in the map. If not, it reads
it normally.
- This approach allows for reading ahead in parallel without requiring a
long-running pool of goroutines
- Introduce sequenceIterator to encapsulate read-ahead behind an abstraction that
always reads forward. This is currently used narrowly but could be used more
widely as the core implementation for all sequence iterators
* Address review comments
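The read-ahead scheme described in the bullets above can be sketched roughly as follows. This is a simplified illustration, not the sequenceReadAhead code itself: the sequence type, readFn, and the prefetch/get method names are hypothetical stand-ins, and hashes are plain strings.

```go
package main

import "fmt"

// sequence is a stand-in for a leaf sequence.
type sequence []int

// readFn loads the sequence stored under a hash. It is called from multiple
// goroutines, so it must be safe for concurrent use.
type readFn func(hash string) sequence

// readAhead keeps one channel per prefetched hash. The map itself is only
// touched by the goroutine that owns the cursor.
type readAhead struct {
	read    readFn
	pending map[string]chan sequence
}

func newReadAhead(read readFn) *readAhead {
	return &readAhead{read: read, pending: map[string]chan sequence{}}
}

// prefetch starts one short-lived goroutine per hash. Each goroutine delivers
// its result into a channel stored in the map and then exits.
func (ra *readAhead) prefetch(hashes ...string) {
	for _, h := range hashes {
		if _, ok := ra.pending[h]; ok {
			continue
		}
		// Buffered so the goroutine can finish even if the result is never read.
		ch := make(chan sequence, 1)
		ra.pending[h] = ch
		go func(h string) { ch <- ra.read(h) }(h)
	}
}

// get looks in the map first; on a miss it reads the sequence normally.
func (ra *readAhead) get(hash string) sequence {
	if ch, ok := ra.pending[hash]; ok {
		delete(ra.pending, hash)
		return <-ch
	}
	return ra.read(hash)
}

func main() {
	leaves := map[string]sequence{"a": {1, 2}, "b": {3, 4}, "c": {5}}
	ra := newReadAhead(func(h string) sequence { return leaves[h] })

	ra.prefetch("a", "b") // read ahead of the current position
	for _, h := range []string{"a", "b", "c"} {
		fmt.Println(h, ra.get(h)) // "c" was not prefetched and is read directly
	}
}
```

Because every prefetch goroutine exits as soon as it has delivered its one result, no worker pool has to be started or shut down.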
If you want to roll just the go/ directory of noms, you can do:
$ roll.py https://github.com/attic-labs/noms --incl go
If you want to roll the AWS SDK without the tests, you can do:
$ roll.py https://github.com/aws/aws-sdk-go --excl awstesting
This works with nested directories too, for example --incl go/util
--excl is evaluated after --incl, so you could exclude the perf/
directory of go/ if you really wanted:
$ roll.py https://github.com/attic-labs/noms --incl go --excl go/perf
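The evaluation order can be illustrated with a small sketch. It is written in Go for illustration only and is not the roll.py implementation; the keep and under helpers are hypothetical, and it assumes that omitting --incl means everything is included.

```go
package main

import (
	"fmt"
	"strings"
)

// under reports whether path is dir itself or lies somewhere below it.
func under(path, dir string) bool {
	return path == dir || strings.HasPrefix(path, dir+"/")
}

// keep applies the include list first and the exclude list second.
// An empty include list is assumed to mean "include everything".
func keep(path string, incl, excl []string) bool {
	included := len(incl) == 0
	for _, d := range incl {
		if under(path, d) {
			included = true
			break
		}
	}
	if !included {
		return false
	}
	for _, d := range excl {
		if under(path, d) {
			return false
		}
	}
	return true
}

func main() {
	incl, excl := []string{"go"}, []string{"go/perf"}
	for _, p := range []string{"go/types/type.go", "go/perf/suite.go", "js/noms.js"} {
		fmt.Printf("%-20s keep=%v\n", p, keep(p, incl, excl))
	}
}
```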