125 Commits

Author SHA1 Message Date
Rafael Weinstein dddf81b095 Add List.Map & List.MapP 2015-09-08 15:12:24 -07:00
Erik Arvidsson 7966384aeb Get rid of alice-short.txt
Use a deterministic pseudo random list instead
2015-09-08 11:20:12 -04:00
Aaron Boodman 47953557d8 fix build 2015-09-04 16:18:09 -07:00
Aaron Boodman dc2ef0274d remove drone.yml and purposely break build to test 2015-09-04 16:16:21 -07:00
Aaron Boodman f58670bc83 NewBlob(): Reader.Read() can return both data and error.
Fixes #264
2015-09-04 15:02:29 -07:00
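
The point behind f58670bc83 is part of the documented io.Reader contract: a single Read call may return n > 0 together with a non-nil error (including io.EOF), so callers should consume the n bytes before looking at the error. The following is a minimal sketch of that pattern, not the actual NewBlob() code; eofWithData, drain and drainString are hypothetical names introduced only for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// eofWithData is a hypothetical reader that hands back its final bytes
// together with io.EOF in the same call, which the io.Reader contract allows.
type eofWithData struct {
	data []byte
}

func (r *eofWithData) Read(p []byte) (int, error) {
	n := copy(p, r.data)
	r.data = r.data[n:]
	if len(r.data) == 0 {
		return n, io.EOF
	}
	return n, nil
}

// drain copies everything from r, consuming the n bytes of each call
// before inspecting err.
func drain(r io.Reader) ([]byte, error) {
	var out bytes.Buffer
	buf := make([]byte, 4)
	for {
		n, err := r.Read(buf)
		out.Write(buf[:n]) // use the data first, even when err != nil
		if err == io.EOF {
			return out.Bytes(), nil
		}
		if err != nil {
			return out.Bytes(), err
		}
	}
}

// drainString is a small convenience wrapper for the example.
func drainString(s string) (string, error) {
	b, err := drain(&eofWithData{data: []byte(s)})
	return string(b), err
}

func main() {
	s, err := drainString("hello world")
	fmt.Printf("%q %v\n", s, err)
}
```

A loop that checks err before using buf[:n] would silently drop the last bytes of such a reader.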
Erik Arvidsson 58e6666f83 Minor cleanup of compound list append 2015-09-04 14:28:29 -04:00
Erik Arvidsson d06da3ca0a Chunking: Multi level chunking for blobs
After a compound blob is created we try to chunk it again in a similar
way to how we chunk Lists. We use the refs of the sub blob and compute
a rolling hash over these. If the hash matches a pattern then we split
the existing compound blob into a new compound blob with sub blobs
which are slices of the original compound blob.

Issue #17
2015-09-03 19:47:17 -04:00
Erik Arvidsson 57c5fd9eeb Introduce a list iterator to get rid of O(log n) lookups in loops
When building the chunked lists we had a lot of loops that did
O(log n) Get operations. Since we are just getting consecutive elements
from the list, we can make getting the next one O(1), making these loops
go from O(n*log(n)) to O(n).

Issue #215
2015-08-31 18:00:41 -04:00
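
The idea in 57c5fd9eeb can be sketched roughly as follows. This is a simplified two-level model and an assumption, not the repository's implementation: compoundList, newCompoundList and listIterator are hypothetical names, and Get here does a binary search over sublist end offsets, whereas the iterator only remembers its current sublist and position.

```go
package main

import (
	"fmt"
	"sort"
)

// compoundList is a hypothetical two-level list: sublists plus the end
// offset of each sublist.
type compoundList struct {
	offsets []int
	leaves  [][]int
}

func newCompoundList(leaves ...[]int) compoundList {
	cl := compoundList{leaves: leaves}
	end := 0
	for _, l := range leaves {
		end += len(l)
		cl.offsets = append(cl.offsets, end)
	}
	return cl
}

// Len returns the total number of elements (the last end offset).
func (cl compoundList) Len() int {
	if len(cl.offsets) == 0 {
		return 0
	}
	return cl.offsets[len(cl.offsets)-1]
}

// Get finds the sublist holding index i with a binary search.
func (cl compoundList) Get(i int) int {
	li := sort.Search(len(cl.offsets), func(j int) bool { return cl.offsets[j] > i })
	start := 0
	if li > 0 {
		start = cl.offsets[li-1]
	}
	return cl.leaves[li][i-start]
}

// listIterator walks consecutive elements without searching again.
type listIterator struct {
	cl        compoundList
	leaf, pos int
}

// Next returns the next element and false once the list is exhausted.
func (it *listIterator) Next() (int, bool) {
	// Skip past finished (or empty) sublists.
	for it.leaf < len(it.cl.leaves) && it.pos >= len(it.cl.leaves[it.leaf]) {
		it.leaf++
		it.pos = 0
	}
	if it.leaf >= len(it.cl.leaves) {
		return 0, false
	}
	v := it.cl.leaves[it.leaf][it.pos]
	it.pos++
	return v, true
}

func main() {
	cl := newCompoundList([]int{1, 2}, []int{3, 4, 5})
	it := &listIterator{cl: cl}
	for v, ok := it.Next(); ok; v, ok = it.Next() {
		fmt.Println(v)
	}
}
```

A loop over Get(0) .. Get(n-1) pays for a search on every element, while the iterator does amortized constant work per element.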
Erik Arvidsson 8d85cc4625 Optimize compound list - tail part
When we write the part after the change and we hit a chunk split, we
check whether the list also had a split at the same index (adjusted
for additions/removals). If it did, then we know that the rest of the
sublists are the same.

Issue #215
2015-08-31 10:50:01 -04:00
Aaron Boodman 6cb4d613b7 Update godeps 2015-08-30 17:11:04 -07:00
Erik Arvidsson c86bc4aee2 Optimize compound lists
This reuses the head of a compound list.

Issue #215
2015-08-28 17:12:46 -04:00
Erik Arvidsson b6197aadc4 Add support for compound lists.
Lists are now either a leafList or a compoundList. The compoundList
consists of sublists that are the chunks of the whole list.
2015-08-28 15:41:51 -04:00
Erik Arvidsson 2d96186090 Add encoding/decoding of compound lists 2015-08-28 15:27:35 -04:00
Dan Willhite ab34143ba5 Pin dependencies using godep tool. Rewrite dep urls. 2015-08-26 14:05:40 -07:00
Chris Masone 24e0e436e4 Revert change in blob.go 2015-08-26 09:28:04 -07:00
Chris Masone ccd70d7c65 Changed error handling in Marshal and Unmarshal
Instead of returning errors, these now use d.Exp to raise catchable
errors.

Also, added commit hash at which code was pulled from encoding/json

Marshal io.Reader into a Blob, unmarshal Blob into io.Writer
2015-08-26 09:28:04 -07:00
Chris Masone 5de698b8f1 Add Unmarshal and Marshal
Unmarshal and Marshal are tools for moving data from Noms into native Go and
back. The rules are described in the documentation of the two functions, but
the behavior is broadly similar to encoding/json.

Towards issue #160
2015-08-26 09:27:58 -07:00
Rafael Weinstein 06c5bc6c1b Abstract ChunkStoreWriter 2015-08-20 10:58:41 -07:00
Rafael Weinstein 0555d7a3c1 Remove errors from read/write/encode/decode 2015-08-18 16:37:04 -07:00
Rafael Weinstein 0e7d61efc6 Remove errors from ChunkStore and Ref 2015-08-18 16:24:26 -07:00
Chris Masone 1bd5af910a Add support for int8/uint8
It turns out that having these makes marshaling native Go to and from Noms
cleaner.

Towards issue #160
2015-08-17 13:44:50 -07:00
Aaron Boodman 214b37eccf Remove global imports of dbg package
Fixes #179
2015-08-08 23:57:37 -07:00
Aaron Boodman f407029526 Merge pull request #181 from aboodman/tagdex
Tagdex
2015-08-07 09:52:15 -07:00
Erik Arvidsson ddebdcaefd Slight modification to compound blob encoding
The json serialization now only contains the length of each individual
blob child.

The Go representation of this still uses offsets, but the offsets are
for the end delimiter.

For "hi" "bye" we get

{"cb", [{"ref": "sha1-hi"}, 2, {"ref": "sha1-bye"}, 3]}

compoundBlob{[2, 5], [sha1-hi, sha1-bye]}

Keeping the length in the serialization leads to smaller serializations.

Using the end offset leads to simpler binary search and allows us to
use the last entry as the length.

Issue #17
2015-08-07 11:24:27 -04:00
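
The trade-off described in ddebdcaefd (lengths on the wire, end offsets in memory) can be illustrated with a short sketch. It is an assumption about the shape of the data only, not the repository's code; endOffsets, totalLen and locate are hypothetical helper names.

```go
package main

import (
	"fmt"
	"sort"
)

// endOffsets turns per-child lengths (as serialized) into cumulative
// end offsets (as kept in memory): [2, 3] becomes [2, 5].
func endOffsets(lengths []uint64) []uint64 {
	offsets := make([]uint64, len(lengths))
	var end uint64
	for i, l := range lengths {
		end += l
		offsets[i] = end
	}
	return offsets
}

// totalLen is simply the last end offset.
func totalLen(offsets []uint64) uint64 {
	if len(offsets) == 0 {
		return 0
	}
	return offsets[len(offsets)-1]
}

// locate returns the index of the child that contains byte position pos
// and the position within that child. The caller must ensure
// pos < totalLen(offsets).
func locate(offsets []uint64, pos uint64) (child int, within uint64) {
	child = sort.Search(len(offsets), func(i int) bool { return offsets[i] > pos })
	var start uint64
	if child > 0 {
		start = offsets[child-1]
	}
	return child, pos - start
}

func main() {
	offsets := endOffsets([]uint64{2, 3}) // "hi", "bye"
	child, within := locate(offsets, 3)
	fmt.Println(offsets, totalLen(offsets), child, within)
}
```

With end offsets the search predicate is a single comparison, and no separate length field is needed.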
Aaron Boodman 673180c2c9 add values/walk.go 2015-08-07 08:04:42 -07:00
Chris Masone b207d7e7ca Remove ToItems(), fix error reporting in ReadValue 2015-08-06 17:02:26 -07:00
Chris Masone f4ef0d5cbc Address comments
Ensure reader gets closed in all cases in ReadValue, clean up BUG references,
delete NewCompoundBlob, and switch an io.Copy -> ioutil.ReadAll
2015-08-06 16:46:00 -07:00
Chris Masone bec1b344be Migrate the types package to JSON decode using the enc package
This change removes the json decoding code from the types package and ports
it onto the enc package's encoding API.

Fixes issue #159
2015-08-06 16:09:35 -07:00
Chris Masone 07046ce567 Address aa comments 2015-08-06 12:48:39 -07:00
Chris Masone 1a3c3e2c41 Migrate the types package to JSON encode using the enc package
This change removes the json encoding code from the types package and ports
it onto the enc package's encoding API.

Towards issue #159
2015-08-06 10:30:16 -07:00
Erik Arvidsson ea52c4ac7c Implement Seek for Blob.Reader()
This allows us to only read the relevant chunks

Issue #17, #155
2015-08-06 12:22:41 -04:00
Erik Arvidsson c9d928f50b Code review cleanup 2015-08-05 13:37:28 -04:00
Erik Arvidsson d834f7d546 Change JSON serialization format for compound blobs
- Put the length last
- Skip the initial 0 since first blob is always at 0

Issue #17
2015-08-05 11:38:56 -04:00
Erik Arvidsson a8db1242a4 Merge pull request #167 from arv/blob-use-offset
Switch to use offsets in compound blobs
2015-08-05 09:52:39 -04:00
Aaron Boodman 26c28e7158 Merge pull request #162 from aboodman/ref
Make ref.Ref a type of Value too.
2015-08-04 16:38:12 -07:00
Aaron Boodman b79d7987c4 Make ref.Ref a type of Value too.
"Fixes" #141.
2015-08-04 16:37:33 -07:00
Erik Arvidsson 1e7db8e341 Switch to use offsets in compound blobs
This is in preparation for Seek

Issue #155, #17
2015-08-04 19:11:44 -04:00
cmasone-attic 029909a9f2 Merge pull request #163 from cmasone-attic/cleanup
Remove entrySlice from json_encode.go
2015-08-04 14:10:28 -07:00
Chris Masone ad968e9a49 Remove entrySlice from json_encode.go
This is no longer used.
2015-08-04 14:07:38 -07:00
Erik Arvidsson 72b8c872f4 Make Blob Reader lazy
This is similar to io.MultiReader but it does not deref the Future
until needed.

Issue #17
2015-08-04 16:45:17 -04:00
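
As a rough illustration of 72b8c872f4, the sketch below chains readers like io.MultiReader but only opens each part when the previous one is exhausted. The Future type from the repository is not modelled here; a plain func() io.Reader stands in for "dereference on demand", and lazyMultiReader and countingParts are hypothetical names.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// lazyMultiReader reads its parts in order, opening each one only when
// it is first needed.
type lazyMultiReader struct {
	open []func() io.Reader // stand-in for dereferencing a Future
	cur  io.Reader
}

func (r *lazyMultiReader) Read(p []byte) (int, error) {
	for {
		if r.cur == nil {
			if len(r.open) == 0 {
				return 0, io.EOF
			}
			r.cur = r.open[0]()
			r.open = r.open[1:]
		}
		n, err := r.cur.Read(p)
		if err != io.EOF {
			return n, err
		}
		// The current part is done; move on, but hand back any data first.
		r.cur = nil
		if n > 0 {
			return n, nil
		}
	}
}

// countingParts builds a lazy reader over parts and increments *opened
// each time a part is actually opened.
func countingParts(opened *int, parts ...string) *lazyMultiReader {
	r := &lazyMultiReader{}
	for _, s := range parts {
		s := s
		r.open = append(r.open, func() io.Reader {
			*opened++
			return strings.NewReader(s)
		})
	}
	return r
}

func main() {
	opened := 0
	r := countingParts(&opened, "hi", "bye")
	buf := make([]byte, 2)
	n, _ := r.Read(buf)
	fmt.Println(string(buf[:n]), "parts opened:", opened)
	rest, _ := io.ReadAll(r)
	fmt.Println(string(rest), "parts opened:", opened)
}
```

Reading only the first few bytes leaves the later parts untouched, which is the property the commit message describes.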
Erik Arvidsson c9f56a6094 Blob chunking: Test that we generate same blob leafs
This adds a test to ensure that we generate the same blob leafs when
we prepend and append to the data.

Issue #17
2015-08-04 15:09:16 -04:00
Erik Arvidsson 4e69837ef0 This introduces two new internal values, blobLeaf and compoundBlob. At
this point the compoundBlob only contains blob leafs, but a future
change will create multiple tiers. Both of these implement the new Blob
interface.

The splitting is done by using a rolling hash over the last 64 bytes;
when that hash ends with 13 consecutive ones, we split the data.

Issue #17
2015-08-03 20:09:42 -04:00
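
The splitting rule in 4e69837ef0 can be sketched as below. The window size (64 bytes) and the pattern (13 trailing one bits) come from the commit message; everything else is an assumption. In particular the hash is a simple polynomial rolling hash chosen for brevity, not necessarily the one the repository uses, and splitPoints, pseudoRandom and base are hypothetical names.

```go
package main

import "fmt"

const (
	windowSize = 64                // bytes covered by the rolling hash
	pattern    = uint32(1<<13 - 1) // 13 consecutive one bits
	base       = uint32(16777619)  // assumed multiplier, for illustration only
)

// splitPoints returns the exclusive end offset of every chunk in data.
// A chunk ends wherever the hash of the preceding windowSize bytes ends
// with the pattern; the final chunk always ends at len(data).
func splitPoints(data []byte) []int {
	if len(data) == 0 {
		return nil
	}
	// basePow = base^windowSize, used to drop the byte leaving the window.
	basePow := uint32(1)
	for i := 0; i < windowSize; i++ {
		basePow *= base
	}
	var h uint32
	var ends []int
	for i, b := range data {
		h = h*base + uint32(b)
		if i >= windowSize {
			h -= basePow * uint32(data[i-windowSize])
		}
		if h&pattern == pattern {
			ends = append(ends, i+1)
		}
	}
	if len(ends) == 0 || ends[len(ends)-1] != len(data) {
		ends = append(ends, len(data))
	}
	return ends
}

// pseudoRandom produces deterministic test data with a small LCG.
func pseudoRandom(n int, seed uint32) []byte {
	out := make([]byte, n)
	for i := range out {
		seed = seed*1664525 + 1013904223
		out[i] = byte(seed >> 24)
	}
	return out
}

func main() {
	data := pseudoRandom(1<<17, 1)
	fmt.Println("chunks:", len(splitPoints(data)))
}
```

Because each split decision depends only on the last 64 bytes, prepending data shifts the later split points without changing which content they fall on, so the chunks after the first full window can be reused.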
Erik Arvidsson c5964aadcd Make Blob take a Reader instead of byte array
This is in preparation for chunking
2015-07-30 18:53:22 -04:00
Aaron Boodman 7944c1b3af Revert "Make WriteValue return a "skinny" copy of input value" 2015-07-30 09:23:35 -07:00
Aaron Boodman a84893c0d8 Make WriteValue return a "skinny" copy of input value
Fixes #141
2015-07-29 16:06:54 -07:00
Rafael Weinstein 1369ac9e6b Allow xml_importer & pitchmap/index to be more streamy 2015-07-28 14:13:59 -07:00
Chris Masone 0591e0b32a Add unit test, return specific error for reader == nil 2015-07-24 15:33:28 -07:00
Chris Masone 47e55d591f ReadValue() should return early if it can't Get() a Ref
ReadValue() tries to Get() the ref it's given from the ChunkSource it's given.
We recently changed ChunkSource to return nil with no error if the ref is not
in the ChunkSource. ReadValue, though, soldiers on in the case of a nil
return value from Get, calling Close() on it and other things. This is, I
think, bad.
2015-07-24 15:18:51 -07:00
Chris Masone 5bceb60e88 Update comment on Future 2015-07-23 15:34:40 -07:00
Chris Masone 4fe00d4f81 Address aa's comments
- Return factory methods to privacy
- use tighter syntax inside Chunks() methods
- Rename Futures() -> Chunks()
2015-07-23 15:32:38 -07:00