188 Commits

Author SHA1 Message Date
Erik Arvidsson a63b588f80 Fix lint errors (#2650) 2016-09-29 10:44:43 -07:00
Aaron Boodman 48e7ed9850 fb/find-photos: add in datePublished, dateUpdated, faces (#2634)
fb/find-photos: add in publishDate, updateDate, along with basic face info.
2016-09-28 17:18:46 -07:00
cmasone-attic dd92a06559 JS: Make Database a mutable API that vends immutable Datasets (#2636)
Noms SDK users frequently shoot themselves in the foot because they're
holding onto an "old" Database object. That is, they have a Database
tucked away in some internal state, they call Commit() on it, and
don't replace the object in their internal state with the new Database
returned from Commit.

This PR fixes #2589 by changing the Database and Dataset JS API to be
in line with the proposal there.
2016-09-28 16:50:57 -07:00
Ben Kalman a738ad2d85 flickr correct nsInSecond/nsInMillisecond (#2645) 2016-09-28 16:45:26 -07:00
Aaron Boodman 3efc6c5f7d You're the computer, you sort the fields (#2641)
Change makeStructType() API for sanity. Update callers.
2016-09-28 15:39:52 -07:00
Dan Willhite 4fd84d5141 Add dummy files to js samples with integration tests (#2643) 2016-09-28 13:59:22 -07:00
Dan Willhite 82b25f370f Remove pitch-index from samples/js (#2642) 2016-09-28 13:42:50 -07:00
Dan Willhite 7d81c9b96b Add integration test for sample: url_fetch (#2639)
Towards #1888
2016-09-28 13:36:19 -07:00
Dan Willhite cd8f995b2c Add integration test for sample: aggregate. (#2633)
Towards #1888
2016-09-28 12:52:55 -07:00
Eric Halpern d9715dba0e Support db aliases and default db for noms cli
This patch implements evolving support for configuring aliases and defaults for the noms cli (started with #2131)

For an introduction, please take a look at the sample code here: https://github.com/attic-labs/noms/blob/master/samples/cli/nomsconfig/README.md

Improvements include: 

 - All go samples now work with .nomsconfig
 - Absolute paths in ldb specs are now properly handled 
 - Add -v|--verbose flag to commands to debug expansion
 - Make default just another alias and change [default] section to [db.default]
 - Introduce the `.` shorthand to refer to a previously mentioned dataset/object
2016-09-27 22:21:32 -07:00
Ben Kalman 81673c2591 Add perf test for url-fetch 2016-09-27 16:52:54 -07:00
Aaron Boodman 577c99ff38 Factor out datas.ReadAbsolutePaths() (#2623) 2016-09-27 14:21:54 -07:00
Aaron Boodman e52775f838 Refactor exit mockery into go/util/exit (#2622) 2016-09-27 13:51:27 -07:00
Ben Kalman 097863ea6f Use parallel NewBlob in the csv perf tests (#2625)
This just involves changing types.NewBlob(io.MultiReader(files...)) to
types.NewBlob(files...). On my laptop it improves
Test01ImportSfCrimeBlobFromTestdata from 21s to 16s - though much of
this is dominated by commit, which wouldn't be affected by this change.
2016-09-27 12:22:25 -07:00
Ben Kalman 35d88dd3c6 Implement Blob.Concat and make NewBlob parallel
Blob.Concat is a simple use of the sequence concat code that List.Concat uses.
NewBlob uses Blob.Concat to construct a Blob in parallel.

Perf tests for parallel NewBlob write N temporary files, then construct a Blob
from them, so there is some I/O, but it appears to be mostly CPU bound.  NewBlob
doesn't get much more than 50% faster with any P >= 2.
2016-09-27 11:08:31 -07:00
Aaron Boodman 362a5630d9 Add photo-index: a simple photo indexer. For now only indexes by tag. (#2610)
Will add indexing by face/geo in subsequent patches.
2016-09-27 10:50:37 -07:00
Mike Gray 181f549179 Update FB and Flickr importers to be similar in names and descriptions (#2615) 2016-09-27 13:28:11 -04:00
Aaron Boodman 429784dd00 flickr/find-photos: capture dateTaken, datePublished, dateUpdated (#2614)
TBR
2016-09-27 00:04:58 -07:00
Dan Willhite 5de36728e8 Make nomdex_update use GraphBuilder (#2619) 2016-09-26 17:00:12 -07:00
cmasone-attic 2e462b11a5 Make Database a mutable API that vends immutable Datasets (#2617)
Noms SDK users frequently shoot themselves in the foot because they're
holding onto an "old" Database object. That is, they have a Database
tucked away in some internal state, they call Commit() on it, and
don't replace the object in their internal state with the new Database
returned from Commit.

This PR changes the Database and Dataset Go API to be in line with the
proposal in Issue #2589. JS follows in a separate patch.
2016-09-26 12:18:14 -07:00
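
The stale-object pitfall described in this commit message can be sketched roughly as below. All types here (valueDB, handleDB) are hypothetical and exist only to contrast the two API shapes; they are not the Noms Database or Dataset types, and the real signatures may differ.

```go
package main

import "fmt"

// valueDB models the older style (hypothetical, not the Noms API):
// Commit returns an updated copy and leaves the receiver unchanged.
type valueDB struct{ head string }

func (db valueDB) Commit(v string) valueDB { return valueDB{head: v} }

// handleDB models the newer style (also hypothetical): the handle itself is
// mutable, and Head hands out an immutable snapshot of the current state.
type handleDB struct{ head string }

func (db *handleDB) Commit(v string) { db.head = v }

func (db *handleDB) Head() string { return db.head }

// staleHead shows the bug: the caller keeps using the old object after Commit.
func staleHead() string {
	db := valueDB{head: "v1"}
	db.Commit("v2") // the updated copy is discarded
	return db.head  // still "v1"
}

// freshHead shows that a mutable handle cannot go stale in the same way.
func freshHead() string {
	db := &handleDB{head: "v1"}
	db.Commit("v2")
	return db.Head() // "v2"
}

func main() {
	fmt.Println(staleHead(), freshHead()) // prints "v1 v2"
}
```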
Dan Willhite 7bb7a068d6 Fix hyperlink in nomdex Readme file (#2618) 2016-09-26 11:00:33 -07:00
Dan Willhite 403bfa6560 Create Readme.md (#2616) 2016-09-26 10:42:15 -07:00
Dan Willhite e351f718e4 Use smaller dataset for testing csv-import/multi-map (#2609)
Also reuse data already imported as blob by another test.
2016-09-23 10:59:30 -07:00
Erik Arvidsson 21b8fc2b4f Update Flow to 0.32 (#2606)
Motivation: Better libs needed in an upcoming PR.
2016-09-22 15:22:20 -07:00
Dan Willhite e5541f9343 Make csv importer use GraphBuilder (#2600) 2016-09-22 15:19:37 -07:00
Erik Arvidsson e28cda9ba7 Update Babel dependencies (#2604)
The motivation is to get babylon@6.11.1 which has a fix for a bug where it
treated toString as a duplicate export.
2016-09-22 13:54:46 -07:00
Dan Willhite 3b17956907 Add perf test for multi-key maps. (#2605) 2016-09-22 13:34:37 -07:00
Eric Halpern e3e9b29d2c Noms configuration for default and aliases (#2597)
* Implement noms cli configuration support

- Introduce .nomsconfig
- Supports a default db to use when no explicit db is given
- Supports defining db aliases to use as shorthand for db urls
- See samples/cli/nomsconfig for more info

fix: #2131
2016-09-21 19:43:51 -07:00
Erik Arvidsson 5edf89cf3d Replace d.Chk.True with d.PanicIfFalse (#2563)
And same for d.Chk.False
2016-09-14 13:11:28 -07:00
Erik Arvidsson d9ebf6ac90 flickr/slurp: Treat fail as an error (#2544)
Reject the promise when we get a fail status.
2016-09-12 13:57:33 -07:00
Mike Gray 1996e0a3d8 Add Noms commit command (#2474)
* Add "noms commit" command
* Updated csv-import, json-import, xml-import and url-fetch to (optionally) not commit results
* Added helpers for creating commit meta-data struct through command line or function calls
2016-09-09 12:42:27 -04:00
Erik Arvidsson 70004e1699 Let splore use versioned URLs too (#2530) 2016-09-07 18:05:14 -07:00
cmasone-attic 1c69c6b891 Update merge.ThreeWay() to allow very basic custom conflict resolution (#2505)
This patch modifies merge.ThreeWay() to take a callback that allows
for custom conflict resolution. The noms-merge command-line tool uses
this to inject a callback that accepts input from the console
dictating whether to accept the value from the 'left' or 'right' merge
candidates.

Toward #2445
2016-09-07 13:21:22 -07:00
zcstarr ef817db179 Adds error for invalid skip records argument and exits csv importer (#2522) 2016-09-06 17:37:05 -07:00
zcstarr b1c0aeb9c5 Adds checks for bad column-type and header csv-import flag values (#2525) 2016-09-06 17:12:21 -07:00
zcstarr ef11062cab Fixes bug where delimiter applied to header arguments (#2523) 2016-09-06 16:50:05 -07:00
Mike Gray 47565f39d1 Improve code based on tool analysis feedback (#2521)
Fixes are based on Go report card output:
- `gofmt -s` eliminates some duplication in struct/slice initialization
- `golint` found some issues like: `warning: should drop = nil from declaration of var XXX; it is the zero value`
- `golint` found some issues like: `warning: receiver name XXX should be consistent with previous receiver name YYY for ZZZ`
- `golint` says not to use underscores for function/variable names
- `golint` found several issues like: `warning: if block ends with a return statement, so drop this else and outdent its block`

No functional changes are included - just source code quality improvements.
2016-09-06 16:35:25 -04:00
zcstarr 3cdebb7e77 Add run safe method that reads stderr and stdout regardless of panic (#2475)
The Run method now always returns stdout, stderr, and a recoveredErr
on Exit or Panic. MustRun panics with recoveredErr.
2016-09-06 11:30:57 -07:00
Erik Arvidsson fd4c52acef stage.py: import os too (#2514) 2016-09-02 14:00:15 -07:00
Erik Arvidsson 893d0fa360 stage.py needs to import sys before using it (#2513)
Also changes the tabs to spaces. Thanks pylint.
2016-09-02 13:38:43 -07:00
Erik Arvidsson f465b0bd3c Fix import of noms.staging in stage.py (#2510)
Also fix unit test that was not updated when the functions were
renamed.
2016-09-02 13:09:07 -07:00
cmasone-attic e5fcfd6ebf Make poke hang on to parent's Commit metadata (#2504)
Before this, poke would drop any commit metadata from the dataset being modified. Now, it just pulls it forward.
2016-09-01 17:21:31 -07:00
cmasone-attic 9f080a2fa7 Fix test assertion in noms-merge::TestLose() (#2498)
This was caused by me changing an error string and failing
to update the test.
2016-09-01 11:34:04 -07:00
cmasone-attic 771eb092da Add failure tests for noms-merge (#2484)
Tests command line validation for noms-merge

Toward #2445
2016-09-01 11:31:29 -07:00
Aaron Boodman 42fd80be2e Add a super quick indeterminate progress meter to noms-merge (#2488) 2016-09-01 10:24:30 -07:00
Daniel Krech 01bdeab025 Add progress reporting to json_importer (#2494)
Fixes #2494
2016-09-01 00:04:26 -07:00
Ben Kalman 9c694f024b Add a perf test for CSV map import (#2461)
Currently we only have a perf test for CSV list import, which uses the
sf-crime dataset. This test uses the 43MB sf-registered-businesses
dataset instead, since sf-crime is too slow. Which is ironic, since we
normally parse sf-crime into a map.

I've also tightened up some of the other perf tests.
- Fixed a bug where Database was shared between runs.
- Make the pure CSV parsing test use a smaller dataset, it doesn't need
  to use something as large as ny-vehicle-registrations.
2016-08-31 17:05:00 -07:00
Ben Kalman 815ca1586f Add perf build.py and util (#2487) 2016-08-31 16:50:34 -07:00
cmasone-attic 49ea5ec3c0 Introduce noms-merge, a standalone noms tool for merging datasets (#2470)
This is a first pass at exposing the new merge package to users.  The
tool is very basic, and currently only works on datasets in the same
database. It requires the 'parent' (i.e. a common ancestor of the two
datasets being merged) to be provided by a commandline option; a
follow-on patch will make the code discover this ancestor automatically.

Toward #2445
2016-08-31 14:20:21 -07:00
Erik Arvidsson f5ce7e056b Fix path in splore build script (#2471)
The path to the noms sdk in the build script was not updated when the
js sdk was moved.
2016-08-31 12:01:10 -04:00