Commit Graph

697 Commits

Author SHA1 Message Date
Erik Arvidsson
6fcfeeb215 Optimize CSV Export (#3721)
This optimizes CSV export after the change in types.Value being backed
by byte slices.

Towards #3710
2017-09-19 00:48:21 -07:00
Erik Arvidsson
865f2498bb Update README.md 2017-09-18 16:20:15 -07:00
Benjamin Kalman
a03416acba Reuse the "current" buffer in sequenceChunker (#3702)
Avoids memory reallocation.
2017-09-18 12:23:20 -07:00
Erik Arvidsson
b497bcc974 Make values be backed by []byte (#3694)
This makes all but types.Type be backed by a []byte.

The motivation is to reduce the allocations and the work needed to be
done when we read parts of a value (especially prolly trees).

Towards #2270
2017-09-14 17:45:08 -07:00
Dan Willhite
10ec10dc00 Add ability to register HRSCommenters on Structs. (#3609)
Clients can register HRSCommenters to cause additional info
to be included as comments when generating the human readable
encoding for Noms Structs.
2017-09-13 17:21:08 -07:00
Benjamin Kalman
26eb9e3713 Don't write a sequence chunk if there is no parent (#3699)
In most cases this will avoid writing the root chunk of a prolly tree,
which is the behavior we're aiming for: a prolly tree might be used
inline in which case the root never needs to be written.

The solution in this patch is imperfect because it may unnecessarily
write chunks, but this is rare.

Fixes https://github.com/attic-labs/noms/issues/3645
2017-09-13 15:52:31 -07:00
Erik Arvidsson
3db7be5062 Clean up Type WalkValues (#3700)
Use good practice OO design 😝
2017-09-13 15:45:52 -07:00
Erik Arvidsson
8f95c25403 Remove some printf debugging from tests (#3701) 2017-09-13 15:36:42 -07:00
Erik Arvidsson
5ff6432c7b Add support for parsing values (#3688)
This allows parsing all Noms values from the string representation
used by human readable encoding:

```
v, err := nomdl.Parse(vrw, `map {"abc": 42}`)
```

Fixes #1466
2017-09-13 15:02:01 -07:00
cmasone-attic
41f63a5a6a Stop noms sync from destroying locality (#3659)
This patch implements a new strategy for Pull() that pulls the chunks
from a given level of the graph over in the order they'll be
encountered by clients reading the graph.

Fixes #2968
2017-09-11 16:04:13 -07:00
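The level-by-level ordering described in 41f63a5a6a can be illustrated with a minimal sketch. This is not the actual Pull() implementation: the string hashes, the `children` map, and the `levelOrder` function are hypothetical names, and the sketch only shows the idea of visiting one level of the graph at a time, in the order a reader walking down from the root would meet the chunks.

```go
package main

import "fmt"

// levelOrder returns the hashes reachable from root, grouped by level.
// Within a level, hashes appear in the order a reader would encounter them
// while scanning the previous level left to right. Each hash is listed once.
// The map-of-children graph is a stand-in for real chunk references.
func levelOrder(root string, children map[string][]string) [][]string {
	seen := map[string]bool{root: true}
	current := []string{root}
	var levels [][]string
	for len(current) > 0 {
		levels = append(levels, current)
		var next []string
		for _, h := range current {
			for _, c := range children[h] {
				if !seen[c] {
					seen[c] = true
					next = append(next, c)
				}
			}
		}
		current = next
	}
	return levels
}

func main() {
	children := map[string][]string{
		"root": {"a", "b"},
		"a":    {"c"},
		"b":    {"c", "d"},
	}
	for i, level := range levelOrder("root", children) {
		fmt.Println(i, level)
	}
}
```

Copying each returned level in order keeps chunks that are read together close to each other in the destination, which is the property the commit message is concerned with.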
cmasone-attic
14e95379af NBS: Fragmentation tool using new estimate of locality (#3658)
The new version of this tool now estimates the locality of a DB
written using the "grandchild" strategy implemented by
types.ValueStore. It does so by dividing each level of the graph
up into groups that are roughly the size of the branching factor
of that level, and then calculating how many physical reads are
needed to read each group.

In the case of perfect locality, each group could be read in a
single physical read, so that's what the tool uses as its estimate
of the optimal case.

Toward #2968
2017-09-11 15:34:17 -07:00
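The estimate described in 14e95379af can be sketched roughly as follows. The `chunkLoc` type and both functions are hypothetical, and the sketch assumes that two chunks can share a physical read only when one ends exactly where the next begins; the real tool may define a physical read differently.

```go
package main

import (
	"fmt"
	"sort"
)

// chunkLoc is a hypothetical record of where a chunk sits in storage.
type chunkLoc struct {
	offset, length uint64
}

// readsForGroup counts the contiguous runs in a group of chunks.
// Each run is assumed to cost one physical read.
func readsForGroup(group []chunkLoc) int {
	if len(group) == 0 {
		return 0
	}
	sorted := append([]chunkLoc(nil), group...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].offset < sorted[j].offset })
	reads := 1
	for i := 1; i < len(sorted); i++ {
		if sorted[i].offset != sorted[i-1].offset+sorted[i-1].length {
			reads++
		}
	}
	return reads
}

// estimateLevel splits one level of the graph into groups of groupSize
// chunks (roughly the branching factor) and returns the reads actually
// needed alongside the optimal count of one read per group.
func estimateLevel(level []chunkLoc, groupSize int) (actual, optimal int) {
	if groupSize < 1 {
		groupSize = 1
	}
	for start := 0; start < len(level); start += groupSize {
		end := start + groupSize
		if end > len(level) {
			end = len(level)
		}
		actual += readsForGroup(level[start:end])
		optimal++
	}
	return actual, optimal
}

func main() {
	// Logical order differs from physical order, so locality is poor.
	level := []chunkLoc{{0, 10}, {50, 10}, {10, 10}, {60, 10}}
	actual, optimal := estimateLevel(level, 2)
	fmt.Printf("actual=%d optimal=%d\n", actual, optimal)
}
```

The ratio of actual to optimal reads gives a single number for how fragmented a level is under these assumptions.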
wardn
79e285e5d5 explicit collection types (#3683) 2017-09-09 19:56:30 -07:00
phritz
025609828e request set & list elements in batch (#3660)
When requesting a range of values, read all the chunks ahead of time.

This works for indexed sequences. Does not include support for ordered sequences.

Work towards https://github.com/attic-labs/noms/issues/3619
2017-09-07 16:23:22 -10:00
Ben Kalman
03b7221c36 Use stretchr/testify not attic-labs/testify (#3677)
stretchr has fixed a bug with the -count flag. I could merge these
changes into attic-labs, but it's easier to just use stretchr.

We forked stretchr a long time ago so that we didn't link in the HTTP
testing libraries into the noms binaries (because we were using d.Chk in
production code). The HTTP issue doesn't seem to happen anymore, even
though we're still using d.Chk.
2017-09-07 15:01:03 -07:00
Dan Willhite
071ba838d2 Modifications to ipfs-chat and ipfs chunkstore (#3666) 2017-09-05 18:35:50 -07:00
Ian Davis
8c4fa02c6e Fix godoc strings for several functions (#3665) 2017-09-05 11:50:30 -07:00
Ian Davis
052d3d73c8 Don't ignore spec split error in ForDatasetOpts 2017-09-03 09:06:21 -07:00
Erik Arvidsson
84b4aba5a6 HRS: Prefix map/set values with map/set (#3655)
We now print map and set values as:

```
map {
  "string": 42,
  "set": set {
    true,
    false,
  },
}
```

Towards #1466
2017-08-31 13:31:21 -07:00
Dan Willhite
aa65868741 Changes to accommodate new version of ipfs 2017-08-31 10:48:27 -07:00
Dan Willhite
b1cb8a0fff Add rate limit to ipfs chunkstore, increase thread limit 2017-08-30 17:09:54 -07:00
Erik Arvidsson
e5bcde644a HRS: Write blob values as blob { ... } (#3654)
This prints blob values as:

```
blob {  // 17 B
  00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f
  10
}
```

Towards #1466
2017-08-30 15:16:22 -07:00
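The blob layout shown in e5bcde644a can be reproduced with a small formatter. This is only a sketch of the output shape, not the encoder from the commit: `formatBlob` is a hypothetical name, 16 bytes per row is inferred from the example, and the size comment is written as a plain byte count, which is an assumption for larger blobs.

```go
package main

import (
	"fmt"
	"strings"
)

// formatBlob renders data as lowercase hex, 16 bytes per row, wrapped in
// "blob { ... }" with the byte count as a trailing comment.
func formatBlob(data []byte) string {
	var sb strings.Builder
	fmt.Fprintf(&sb, "blob {  // %d B\n", len(data))
	for i := 0; i < len(data); i += 16 {
		end := i + 16
		if end > len(data) {
			end = len(data)
		}
		// One leading space here plus the space before each byte
		// yields the two-space indent seen in the example.
		sb.WriteString(" ")
		for _, b := range data[i:end] {
			fmt.Fprintf(&sb, " %02x", b)
		}
		sb.WriteString("\n")
	}
	sb.WriteString("}")
	return sb.String()
}

func main() {
	data := make([]byte, 17)
	for i := range data {
		data[i] = byte(i)
	}
	fmt.Println(formatBlob(data))
}
```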
Dan Willhite
04e837bad9 Increase rlimit and auto-initialize IPFS repos 2017-08-30 11:52:50 -07:00
Rafael Weinstein
86dfd49efc Histogram: allow for samples > 1 << 63 (#3650) 2017-08-30 11:20:20 -07:00
Erik Arvidsson
864f1a8fae HRS: Struct printing cleanup (#3653)
We always write the struct name now so no point in passing the
param.
2017-08-30 11:05:42 -07:00
Erik Arvidsson
eac2264193 Skip more perf suite tests (#3652)
Since these are flaky
2017-08-30 11:03:06 -07:00
Erik Arvidsson
a8d2ade6ac Skip TestRunFlag since it is flaky (#3651) 2017-08-30 10:32:32 -07:00
Erik Arvidsson
a93777d93a HRS: Cleanup struct head (#3647)
For struct values we always prefix with struct.

For struct types we always prefix with Struct.

Towards #1466
2017-08-29 17:47:54 -07:00
Erik Arvidsson
9cd2aae786 Print Refs with # (#3644)
Print Ref values as #123 instead of 123

Since our hashes are SHA-512 and we write them using Base32, there are a lot of overlaps with other parts of NomDL. Prefixing them with # makes them unambiguous.

Towards #1466
2017-08-29 17:43:29 -07:00
Erik Arvidsson
9200aabaa7 Remove HRS tagged versions (#3643)
This removes the type tagged version of the human readable encoding.

Motivation: Simplify this in preparation to make the HRS unambiguous so that we can write a parser.

Towards #1466
2017-08-29 17:38:46 -07:00
Rafael Weinstein
c3f98d1631 Remove in mem graphs (#3635)
This patch removes the ability to keep alive uncommitted prolly-tree sequences.
2017-08-29 13:12:10 -07:00
Rafael Weinstein
61f3d87dcf Introduce Sloppy (#3631)
Introduce Sloppy - an estimating compression function for snappy - which allows for the rolling hash to better produce a given target chunk size after compression.
2017-08-28 13:23:00 -07:00
Ben Kalman
f23cbe5344 Make field and index paths work for types (#3639)
Struct types support field paths, like `.fieldName`
Compound types (List/Set/Map/Union) support index paths, like `[0]`

Fixes https://github.com/attic-labs/noms/issues/3622
2017-08-25 12:23:59 -07:00
Dan Willhite
95aac7ca1e Merge modifications made by Aaron. (#3638)
* Add kingpin library for argument handling.
* Update hard-coded Version number in chunkstore.
* Store noms repo information in ipfs home directory.
2017-08-24 17:50:15 -07:00
Erik Arvidsson
06d0696860 Fix ngql test (#3634) 2017-08-23 13:33:11 -07:00
Erik Arvidsson
3e4fc565ff Pipe through ValueReader in ngql (#3633)
This is in preparation for passing in a ValueReader to collection
constructors.
2017-08-23 13:12:27 -07:00
Aaron Boodman
1a7e006862 Landing beginning of IPFS experiment on trunk (#3625)
* add ipfs dependencies

* Add initial ipfs chunkstore
2017-08-23 11:42:06 -07:00
cmasone-attic
8b5cd66cce NBS: Expose hashes request builder for use in tests (#3630) 2017-08-22 15:53:24 -07:00
cmasone-attic
461ff64579 NBS: Fix large HTTP {get,has}Refs/ requests (#3629)
When we added GetMany and HasMany, we didn't realize that requests
could then be larger than the allowable HTTP form size. This patch
makes the body of getRefs and hasRefs be serialized as binary instead,
which addresses this issue and actually makes the request body more
compact.

Fixes #3589
2017-08-22 14:24:13 -07:00
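To illustrate why a binary body is more compact than form encoding, as 461ff64579 describes, here is a minimal sketch of one possible layout: a 4-byte big-endian count followed by the raw hash bytes. The layout, the 20-byte hash length, and the function names are assumptions for illustration and are not taken from the patch.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// hashLen is an assumed fixed hash size in bytes.
const hashLen = 20

type hash [hashLen]byte

// encodeHashes writes a count prefix followed by each hash's raw bytes.
func encodeHashes(hashes []hash) []byte {
	body := make([]byte, 4, 4+len(hashes)*hashLen)
	binary.BigEndian.PutUint32(body, uint32(len(hashes)))
	for _, h := range hashes {
		body = append(body, h[:]...)
	}
	return body
}

// decodeHashes reverses encodeHashes and rejects bodies whose length
// does not match the count prefix.
func decodeHashes(body []byte) ([]hash, error) {
	if len(body) < 4 {
		return nil, errors.New("body too short for count prefix")
	}
	count := binary.BigEndian.Uint32(body)
	payload := body[4:]
	if uint64(len(payload)) != uint64(count)*hashLen {
		return nil, fmt.Errorf("expected %d hashes, got %d payload bytes", count, len(payload))
	}
	hashes := make([]hash, count)
	for i := range hashes {
		copy(hashes[i][:], payload[i*hashLen:(i+1)*hashLen])
	}
	return hashes, nil
}

func main() {
	in := []hash{{1, 2, 3}, {4, 5, 6}}
	body := encodeHashes(in)
	out, err := decodeHashes(body)
	fmt.Println(len(body), len(out), err)
}
```

Under this layout each hash costs exactly its raw length on the wire, with no text encoding or field-name overhead per entry.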
Rafael Weinstein
e43f50c52c Make Histogram keep a precise sum (#3628) 2017-08-22 10:55:25 -07:00
Rafael Weinstein
a95b2ba9ff Collection test (#3624)
* loadLeafSeqs => loadLeafCollections

* Collection tests examine leaf node counts
2017-08-18 13:40:54 -07:00
Jesse Ditson
5db3cf1679 kingpin docs, noms blob [put | get] (#3621)
* use kingpin for help and new commands, set up dummy command for noms blob

* document existing commands using kingpin

* remove noms-get and noms-set in favor of new noms blob command

* normalize bool flags in tests, remove redundant cases that kingpin now handles

* add kingpin to vendor files

* make profile flags global

* move --verbose and --quiet to global flags
2017-08-16 16:35:22 -07:00
Dan Willhite
0c14fbad05 Ensure NewStreamingMap and NewStreamingSet panic when input is not ordered (#3570)
Fixes #3560
2017-08-16 11:37:40 -07:00
Rafael Weinstein
40be27c23f further nbs stats cleanup (#3614) 2017-08-06 22:06:47 -07:00
Rafael Weinstein
15bf51cf28 use real uncompressed count (#3613) 2017-08-06 21:13:41 -07:00
Rafael Weinstein
2ab75f4233 record logical (uncompressed) bytes per persist in metrics (#3612) 2017-08-06 18:04:26 -07:00
cmasone-attic
89bdfbd981 NBS: Defer AWS reads when opening a table with cached index (#3610)
Prior to this patch, whenever we created a chunkSource for a table
persisted to AWS, awsTablePersister::Open() would hit DynamoDB to
check whether the table data was stored there. That's how it knew
whether to create a dynamoTableReader or an s3TableReader. This
results in consulting Dynamo (or the in-memory small-table cache)
every time we go to open a table. Most of the time, this isn't
necessary, as we separately cache table indices and really only
need that data at Open() time.

This patch defers reading table data if possible.

Fixes #3607
2017-08-04 13:14:53 -07:00
cmasone-attic
0d5fb82197 NBS: populate small table cache on Open (#3608)
This will likely bloat the cache with tables no one's going to read
data from, BUT doing this also means that most checks to see if a table
is in Dynamo at all can proceed locally. Stopgap until #3607 lands
2017-07-28 13:26:38 -07:00
cmasone-attic
ac0e65a782 NBS: Read small tables with eventually consistent reads first (#3606)
Fail over to fully-consistent reads if no result. This means that
misses will get more expensive, but hits will cost us half what they
were costing in the initial version of the code.

Fixes #3604
2017-07-28 11:41:44 -07:00
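The read strategy in ac0e65a782 amounts to a two-step lookup. The sketch below uses a hypothetical `tableReader` function type in place of a real DynamoDB client, so it shows only the control flow: try the cheaper eventually consistent read, and fall back to a fully consistent read on a miss.

```go
package main

import "fmt"

// tableReader is a hypothetical stand-in for a store lookup. The consistent
// flag selects a fully consistent read when true.
type tableReader func(name string, consistent bool) (data []byte, found bool)

// readSmallTable tries an eventually consistent read first and only pays
// for a fully consistent read when the first attempt finds nothing.
func readSmallTable(read tableReader, name string) ([]byte, bool) {
	if data, found := read(name, false); found {
		return data, true
	}
	return read(name, true)
}

func main() {
	// Fake store: "new" has not yet reached the eventually consistent view.
	store := func(name string, consistent bool) ([]byte, bool) {
		if name == "old" || (name == "new" && consistent) {
			return []byte(name), true
		}
		return nil, false
	}
	for _, name := range []string{"old", "new", "missing"} {
		_, found := readSmallTable(store, name)
		fmt.Println(name, found)
	}
}
```

A hit on the first attempt costs one cheap read, while a miss costs both reads, which matches the trade-off the commit message states.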
cmasone-attic
497a97e9d2 NBS: increase size of small-table memory cache (#3605)
Looking at metrics on staging today, there are frequent spikes of
tens of thousands of throttled DynamoDB reads. One explanation is
that we're constantly evicting 'hot' tables from the in-memory
cache because the working set is larger than the space we've
allotted for the cache.
2017-07-28 09:56:27 -07:00
cmasone-attic
1145546a64 NBS: Store small tables in DynamoDB when persisting to AWS (#3603)
There seems to be a floor on the amount of time required to persist
small objects to S3. For workloads that generate lots of small tables,
this can really add up. DynamoDB is much faster to read/write, and can
hold items of up to 400k. This patch stores tables with < 64 chunks
that are < 400k in DynamoDB, caching them in memory on persist and
open to reduce load on the back end.

Fixes #3559
2017-07-27 13:08:43 -07:00
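The size rule from 1145546a64 can be written as a single predicate. The names below are hypothetical, and reading "400k" as 400 * 1024 bytes is an assumption; the patch may use a different exact limit.

```go
package main

import "fmt"

const (
	maxDynamoChunks = 64         // tables must have fewer chunks than this
	maxDynamoBytes  = 400 * 1024 // assumed interpretation of "400k"
)

// fitsInDynamo reports whether a table is small enough, by both chunk
// count and byte size, to be stored in DynamoDB rather than S3.
func fitsInDynamo(chunkCount, byteLen int) bool {
	return chunkCount < maxDynamoChunks && byteLen < maxDynamoBytes
}

func main() {
	fmt.Println(fitsInDynamo(10, 4096))
	fmt.Println(fitsInDynamo(64, 4096))
	fmt.Println(fitsInDynamo(10, 500*1024))
}
```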