Commit Graph

299 Commits

Author SHA1 Message Date
Erik Arvidsson
b8be6908f8 Implement Set Union
This is done by creating a cursor for each set. This is a cursor for
the actual values in the sets. We then pick the "smallest" value from
the cursors and advance that cursor. This continues until we have
exhausted all the cursors.

  setA.Union(set0, ..., setN)

The time complexity is O(len(setA) + len(set0) + ... + len(setN))
2015-12-17 10:18:04 -05:00
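The cursor-based union described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: it assumes each set is a sorted, duplicate-free slice of ints, and the names `cursor` and `union` are hypothetical.

```go
package main

import "fmt"

// cursor walks the values of one sorted, duplicate-free set.
// (Hypothetical type, used only for this illustration.)
type cursor struct {
	vals []int
	pos  int
}

func (c *cursor) valid() bool  { return c.pos < len(c.vals) }
func (c *cursor) current() int { return c.vals[c.pos] }
func (c *cursor) advance()     { c.pos++ }

// union merges any number of sorted sets by repeatedly taking the
// smallest current value among the cursors and advancing that cursor.
func union(sets ...[]int) []int {
	cursors := make([]*cursor, 0, len(sets))
	for _, s := range sets {
		if len(s) > 0 {
			cursors = append(cursors, &cursor{vals: s})
		}
	}

	var out []int
	for len(cursors) > 0 {
		// Find the cursor that points at the smallest value.
		smallest := 0
		for i := 1; i < len(cursors); i++ {
			if cursors[i].current() < cursors[smallest].current() {
				smallest = i
			}
		}

		// Emit the value unless it repeats the last one emitted.
		v := cursors[smallest].current()
		if len(out) == 0 || out[len(out)-1] != v {
			out = append(out, v)
		}

		// Advance, and drop the cursor once it is exhausted.
		cursors[smallest].advance()
		if !cursors[smallest].valid() {
			cursors = append(cursors[:smallest], cursors[smallest+1:]...)
		}
	}
	return out
}

func main() {
	fmt.Println(union([]int{1, 4, 9}, []int{2, 4}, []int{0, 9, 10}))
	// Prints: [0 1 2 4 9 10]
}
```

Each value is visited once, which matches the linear bound stated in the commit when the number of sets is treated as a constant. The scan for the smallest cursor in this sketch costs one comparison per live cursor on every step.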
Rafael Weinstein
27d5f0d240 Ensure sequenceChunker.Done() returns an internal type so that callers don't have to 2015-12-17 06:20:28 -08:00
Chris Masone
a70c21116a Some compound{Set,Map} cleanup
Remove a bad comment, one-line a few things in Filter()
2015-12-16 15:57:22 -08:00
cmasone-attic
3163ff00dc Merge pull request #781 from cmasone-attic/mapfilter
Implement compoundMap Filter()
2015-12-16 15:50:37 -08:00
Dan Willhite
19228ba9d8 Merge pull request #784 from willhite/panics
Implement Filter on compoundList.
2015-12-16 14:06:57 -08:00
Dan Willhite
20f22e1020 Implement Filter on compoundList. 2015-12-16 13:32:00 -08:00
Chris Masone
a30209bcc7 Implement compoundMap Filter() 2015-12-16 12:57:41 -08:00
Chris Masone
57e2303a62 Re-land "Implement compoundSet.Filter()"
This reverts commit 60ab9c7f0c.

Fixes initial patch to correctly use test harness.  Implementation is
based on newTypedSet(), so hopefully has similar performance
characteristics.
2015-12-16 12:53:13 -08:00
cmasone-attic
60ab9c7f0c Revert "Implement compoundSet.Filter()" 2015-12-16 12:42:33 -08:00
cmasone-attic
f7f7cfaab0 Merge pull request #777 from cmasone-attic/depanic
Implement compoundSet.Filter()
2015-12-16 12:32:43 -08:00
Benjamin Kalman
2f22372c86 Implement chunked insert/remove for sets and maps.
Some corner-case tests still fail, but this may be due to an existing bug
in the sequence chunker.
2015-12-16 11:02:41 -08:00
Dan Willhite
4e4cff2bd2 Merge pull request #779 from willhite/panics
Implement Map and MapP on compoundList.
2015-12-16 09:55:42 -08:00
Rafael Weinstein
9b107c145f Fix xml-importer & pitchmap/indexer 2015-12-15 21:33:44 -08:00
Chris Masone
18c446e934 Add set-equivalence check to unit test 2015-12-15 20:08:51 -08:00
Dan Willhite
4268ded39c Implement Map and MapP on compoundList. 2015-12-15 17:53:51 -08:00
Dan Willhite
eeea1e2056 Merge pull request #773 from willhite/panics
Implement IterAllP on compoundList.
2015-12-15 17:52:32 -08:00
Dan Willhite
efd9a2e00c Implement IterAllP on compoundList. 2015-12-15 17:42:05 -08:00
Chris Masone
8171014b20 remove comment 2015-12-15 17:37:25 -08:00
Chris Masone
0d54e01dda Implement compoundSet.Filter()
Implementation is based on newTypedSet(), so hopefully has similar
performance characteristics.
2015-12-15 17:14:11 -08:00
Chris Masone
31bf320370 Implement compoundMap.MaybeGet() and compoundSet.First()
A couple more simple ones before the hard stuff.
2015-12-15 16:34:04 -08:00
Chris Masone
55529946b7 Implement compoundMap.First()
Also, add TestCompoundMapFirst.
2015-12-15 15:31:41 -08:00
Benjamin Kalman
c5a6382d25 Implement compoundList Set/Insert/Remove/RemoveAt. 2015-12-14 13:23:48 -08:00
Benjamin Kalman
9a3e73779d Make types.Ref implement the OrderedValue interface.
This fixes the bug where compoundSets/Maps of refs are ordered by their
types.Ref's Ref, rather than their types.Ref's TargetRef.
2015-12-14 11:28:38 -08:00
Erik Arvidsson
7c43a2b49d Merge pull request #751 from arv/encode-numbers-as-strings
Go: Encode numbers as strings
2015-12-14 09:49:34 -05:00
Erik Arvidsson
796c11b0f7 Make sure floats are encoded in a more restricted format 2015-12-12 17:34:47 -05:00
Erik Arvidsson
40d164fe47 Go: Encode numbers as strings
Because JSON encoders encode numbers differently we cannot just use
numbers in the output.

This still encodes the NomsKind as numbers.

Towards #749
2015-12-11 16:30:07 -05:00
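One way to picture the change in the commit above: the numeric value is written as a JSON string, while the kind tag stays a JSON number. The sketch below is only an assumption about the general shape. The helper names, the two-element array layout, and the kind values 6 and 7 are hypothetical, and the float format here is plain `strconv` output rather than the restricted format mentioned in the later commit.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// encodeInt emits [kind, "value"]: the kind stays a JSON number and the
// value becomes a decimal string, so no JSON implementation gets to
// choose its own number formatting for it.
func encodeInt(kind int, v int64) (string, error) {
	b, err := json.Marshal([]interface{}{kind, strconv.FormatInt(v, 10)})
	return string(b), err
}

// encodeFloat does the same for floats, using the shortest decimal
// representation that round-trips through float64.
func encodeFloat(kind int, v float64) (string, error) {
	b, err := json.Marshal([]interface{}{kind, strconv.FormatFloat(v, 'g', -1, 64)})
	return string(b), err
}

func main() {
	// 2^53 + 1 cannot be represented exactly as a float64, which is how
	// many JSON decoders store numbers; as a string it survives intact.
	s, err := encodeInt(7, 9007199254740993)
	if err != nil {
		panic(err)
	}
	fmt.Println(s) // [7,"9007199254740993"]

	s, err = encodeFloat(6, 0.5)
	if err != nil {
		panic(err)
	}
	fmt.Println(s) // [6,"0.5"]
}
```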
Benjamin Kalman
26848de0e7 Implement compoundList.Slice. 2015-12-10 15:46:24 -08:00
Rafael Weinstein
dc58008226 Fix crunchbase importer 2015-12-09 15:16:00 -08:00
Benjamin Kalman
5ab90e8ae1 Change list/blob chunk values to be their length, not cumulative length.
This is simpler for chunking, since it no longer needs to "normalize"
the values when re-chunking. It's a bit less efficient because instead
of binary searching we need to linear search through chunk values.
2015-12-09 10:38:06 -08:00
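The trade-off described in the commit above can be illustrated with two lookups that map an element index to a chunk and an offset within it. Both functions are hypothetical sketches: one assumes chunk values hold cumulative lengths (the old scheme), the other assumes they hold per-chunk lengths (the new one).

```go
package main

import (
	"fmt"
	"sort"
)

// findCumulative locates idx when each chunk value is the cumulative
// length up to and including that chunk. Binary search works because
// cumulative lengths are sorted.
func findCumulative(cum []uint64, idx uint64) (chunk int, offset uint64) {
	chunk = sort.Search(len(cum), func(i int) bool { return idx < cum[i] })
	offset = idx
	if chunk > 0 {
		offset = idx - cum[chunk-1]
	}
	return chunk, offset
}

// findByLength locates idx when each chunk value is just that chunk's
// own length. This needs a linear scan, subtracting lengths as it goes.
func findByLength(lengths []uint64, idx uint64) (chunk int, offset uint64) {
	for i, n := range lengths {
		if idx < n {
			return i, idx
		}
		idx -= n
	}
	return len(lengths), idx // idx is past the end of the sequence
}

func main() {
	lengths := []uint64{3, 5, 2}
	cumulative := []uint64{3, 8, 10}

	fmt.Println(findByLength(lengths, 6))      // 1 3
	fmt.Println(findCumulative(cumulative, 6)) // 1 3
}
```

With per-chunk lengths, inserting an element changes only the length of the chunk it lands in. With cumulative lengths, every later value would have to be adjusted, which is presumably the normalization the commit message refers to.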
Rafael Weinstein
d198036618 Compound Map & Set 2015-12-08 16:25:02 -08:00
Benjamin Kalman
71072e81ed Combine metaSequenceCursor/sequenceChunkerCursor as sequenceCursor.
Then add a bunch of tests for sequenceCursor. This ends up changing the
behavior of sequenceCursor to make it more consistent, but the behavior
of sequenceChunker and compoundList (etc) shouldn't change.
2015-12-04 17:01:02 -08:00
Rafael Weinstein
e5409f2698 Remove MetaSequenceKind from serialization 2015-12-04 13:34:30 -08:00
Ben Kalman
27c498e032 Merge pull request #716 from kalman/buz-window-size
Correctly distinguish between chunking window size and buzhash window…
2015-12-03 15:26:56 -08:00
Benjamin Kalman
338de4e583 Correctly distinguish between chunking window size and buzhash window size.
Previously the buzhash boundary checker used a single value for the
window size, both as the buzhash buffer size when constructing a hash
object, and reported as its window size to the boundary checker
interface. This was wrong because we don't always pass single byte
values to the hasher, for example refs are 20 bytes.

The compound list chunking compensated for this by only passing the
first byte of each list leaf's ref rather than the full ref. This is bad
because there is obviously less entropy in 1 byte vs 20 bytes.

The meta sequence chunking compensated for this by multiplying the
chunking window size by 20, but this also had the effect of
unnecessarily considering 20 times more chunked elements than would fit
in the buzhash buffer.
2015-12-03 14:58:35 -08:00
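The distinction drawn in the commit above can be sketched with a toy buzhash whose window is measured in bytes, plus a separate calculation of how many whole items that window spans. Everything here is illustrative: the type and function names, the 64-byte window, the table seed, and the boundary pattern are assumptions, not values from the repository.

```go
package main

import (
	"fmt"
	"math/bits"
)

// rollingHash is a minimal buzhash over the last len(window) bytes.
type rollingHash struct {
	table  [256]uint32
	window []byte // circular buffer holding the bytes inside the window
	pos    int    // next slot to overwrite
	filled int    // number of bytes currently in the window
	sum    uint32
}

func newRollingHash(windowBytes int) *rollingHash {
	h := &rollingHash{window: make([]byte, windowBytes)}
	x := uint32(2463534242) // arbitrary xorshift32 seed for the byte table
	for i := range h.table {
		x ^= x << 13
		x ^= x >> 17
		x ^= x << 5
		h.table[i] = x
	}
	return h
}

// writeByte rolls one byte in and, once the window is full, the oldest out.
func (h *rollingHash) writeByte(b byte) {
	h.sum = bits.RotateLeft32(h.sum, 1)
	if h.filled == len(h.window) {
		out := h.window[h.pos]
		h.sum ^= bits.RotateLeft32(h.table[out], len(h.window))
	} else {
		h.filled++
	}
	h.sum ^= h.table[b]
	h.window[h.pos] = b
	h.pos = (h.pos + 1) % len(h.window)
}

// writeItem feeds every byte of an item (for example a 20-byte ref) to
// the hash, rather than only its first byte, and reports a boundary.
func writeItem(h *rollingHash, item []byte, pattern uint32) bool {
	for _, b := range item {
		h.writeByte(b)
	}
	return h.sum&pattern == pattern
}

// itemsInWindow is the chunking window: how many items can influence the
// hash when the buzhash window is hashWindowBytes and items are itemBytes.
func itemsInWindow(hashWindowBytes, itemBytes int) int {
	return (hashWindowBytes + itemBytes - 1) / itemBytes
}

func main() {
	const hashWindowBytes = 64
	h := newRollingHash(hashWindowBytes)

	ref := make([]byte, 20) // stand-in for a 20-byte ref
	boundaries := 0
	for i := 0; i < 1000; i++ {
		ref[0], ref[1], ref[19] = byte(i), byte(i>>8), byte(i*31)
		if writeItem(h, ref, 0x0f) {
			boundaries++
		}
	}
	fmt.Println("boundaries found:", boundaries)

	fmt.Println(itemsInWindow(hashWindowBytes, 1))  // 64
	fmt.Println(itemsInWindow(hashWindowBytes, 20)) // 4
}
```

Keeping the byte window and the item window as separate quantities means full 20-byte items can be hashed without either truncating them to one byte or scaling the item window by 20.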
Rafael Weinstein
e0b368302d listLeaf & compoundList implement List interface 2015-12-02 13:54:25 -08:00
Rafael Weinstein
4c1f4464af Compound & Leaf values now have same Type() 2015-12-02 10:19:27 -08:00
Erik Arvidsson
698c21bc67 NomDL: Change type syntax to use <> instead of ()
Fixes #678
2015-12-02 12:30:00 -05:00
Erik Arvidsson
61f14f8c9a Rename noms UInt* to Uint*
Fixes #673
2015-12-02 12:01:42 -05:00
Rafael Weinstein
9cf1896940 Simplify retrieval of tuples from metaSequences 2015-12-01 12:44:27 -08:00
Rafael Weinstein
7caa08bc5c Complex Types embed a ChunkStore 2015-12-01 10:40:47 -08:00
Dan Willhite
7dd540d5d5 Make generated files stable and consistent with go fmt. 2015-11-24 16:15:35 -08:00
Rafael Weinstein
7ad26393a0 fix TestEnsureRef 2015-11-21 14:27:28 -08:00
Benjamin Kalman
4b901cdc84 Support compoundList.Append. 2015-11-20 16:40:19 -08:00
Rafael Weinstein
710fcc3708 Minor fix for meta_sequence_cursor_test 2015-11-18 17:13:43 -08:00
Rafael Weinstein
12d88c769d Correctly construct metaSequenceCursor 2015-11-18 14:05:54 -08:00
Erik Arvidsson
eb4b3cbece Update encode/decode tests to use typed constructors 2015-11-16 18:38:43 -05:00
Rafael Weinstein
e75f5097c7 Allow Map & Set to order by natural ordering of element type if available. 2015-11-16 14:30:30 -08:00
Rafael Weinstein
da2c7461df CompoundList lives again 2015-11-16 11:27:47 -08:00
Erik Arvidsson
b6e034c77e Go: Rename type ref files too 2015-11-13 18:04:05 -05:00
Erik Arvidsson
a72ce41a1d Go: TypeRef -> Type
Remaining identifiers
2015-11-13 17:54:53 -05:00