Removed portions of Noms we don't need and won't maintain, in preparation for moving to our repo
@@ -1,7 +0,0 @@
.git
doc
codecov.yml
CONTRIBUTING.md
LICENSE
README.md
samples
@@ -1,4 +0,0 @@
sudo: required
services:
- docker
script: docker build .
@@ -1,97 +0,0 @@
Contributing to Noms
====================

## Install Go

First, set up Go on your machine per https://golang.org/doc/install.

Don't forget to [set up your `$GOPATH` and `$BIN` environment variables](https://golang.org/doc/install) correctly. Everybody forgets that.

You can test your setup like so:

```shell
# This should print something
echo $GOPATH

# We need at least version 1.7
go version
```

## Setup Noms Environment

Add `NOMS_VERSION_NEXT=1` to your environment. The current trunk codebase is a development version of the format, and this environment variable is a safety check to ensure people aren't accidentally using the development format against production servers.
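For example, assuming a bash-like shell, you might add the variable to your shell profile and verify it:

```shell
# Add to ~/.bashrc or ~/.bash_profile:
export NOMS_VERSION_NEXT=1

# Verify it's set:
echo $NOMS_VERSION_NEXT
```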
## Get and build Noms

```shell
go get github.com/attic-labs/noms/cmd/noms
cd $GOPATH/src/github.com/attic-labs/noms/cmd/noms
go build
go test
```

## License

Noms is open source software, licensed under the [Apache License, Version 2.0](LICENSE).

## Contributing code

For legal reasons, all contributors must sign a contributor agreement, either as an [individual](https://attic-labs.github.io/ca/individual.html) or a [corporation](https://attic-labs.github.io/ca/corporation.html), before a pull request can be accepted.

## Languages

* Use Go, JS, or Python.
* Shell script is not allowed.

## Coding style

* Go code is formatted with `gofmt`; it's advisable to hook it into your editor.
* JS follows the [Airbnb Style Guide](https://github.com/airbnb/javascript).
* Tag PRs with either `toward: #<bug>` or `fixes: #<bug>` to help establish context for why a change is happening.
* Commit messages follow [Chris Beams' awesome commit message style guide](http://chris.beams.io/posts/git-commit/).

### Go error reporting

In general, the public API in Noms uses the Go style of returning errors by default.

For non-exposed code, we provide, and use, some wrappers for exception-style error handling. There *must* be an overriding rationale for using this style, however. One reason to use the exception style is that the current code doesn't know how to proceed and needs to panic, but you want to signal that a calling function somewhere up the stack might be able to recover from the failure and continue.

For these cases, please use the following family of functions to 'raise' a 'catchable' error (see [go/d/try.go](https://godoc.org/github.com/attic-labs/noms/go/d)):

* d.PanicIfError()
* d.PanicIfTrue()
* d.PanicIfFalse()

You might see some old code that uses similar-looking functions starting with `d.Chk`; however, we are going to remove those and don't want them used in new code. See #3258 for details.
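As a rough sketch of the pattern (using a hypothetical minimal stand-in for the real helpers in `go/d`, not the actual package): deep code raises a catchable failure by panicking with an error, and a boundary function up the stack recovers it back into a Go-style error.

```go
package main

import (
	"errors"
	"fmt"
)

// PanicIfError is a minimal stand-in for d.PanicIfError: it 'raises' a
// catchable error by panicking with it.
func PanicIfError(err error) {
	if err != nil {
		panic(err)
	}
}

// parse stands in for deep library code that cannot proceed on bad input.
func parse(s string) int {
	if s == "" {
		PanicIfError(errors.New("empty input"))
	}
	return len(s)
}

// tryParse sits at a level that knows how to continue: it recovers the
// panic and converts it back into an ordinary returned error.
func tryParse(s string) (n int, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = r.(error)
		}
	}()
	return parse(s), nil
}

func main() {
	n, err := tryParse("noms")
	fmt.Println(n, err)
	_, err = tryParse("")
	fmt.Println(err)
}
```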
## Submitting PRs

We follow a code review protocol derived from the one the [Chromium team](https://www.chromium.org/) uses:

1. Create a GitHub fork of the repo you want to modify (e.g., fork `https://github.com/attic-labs/noms` to `https://github.com/<username>/noms`).
2. Add your fork as a remote to your local repo: `git remote add <username> https://github.com/<username>/noms`.
3. Push your changes to a branch at your fork: `git push <username> <branch>`.
4. Create a PR using the branch you just created. Usually you can do this by navigating to https://github.com/attic-labs/noms in a browser; GitHub recognizes the new branch and offers to create a PR for you.
5. When you're ready for review, make a comment in the issue asking for a review. Sometimes people won't review until you do this, because they aren't sure whether you think the PR is ready.
6. Iterate with your reviewer using the normal GitHub review flow.
7. Once the reviewer is happy with the changes, they will submit them.

## Running the tests

You can use the `go test` command, e.g.:

* `go test $(go list ./... | grep -v /vendor/)` runs every test except those in vendor packages.

If you have commit rights, Jenkins automatically runs the Go tests on every PR and on every subsequent patch. To ask Jenkins to run immediately, any committer can reply (no quotes) "Jenkins: test this" to your PR.

### Perf tests

By default, neither `go test` nor Jenkins runs the perf tests, because they take a while.

To run them yourself, pass the `-perf` and `-v` flags to `go test`, e.g.:

* `go test -v ./samples/go/csv/... -perf mem`

See https://godoc.org/github.com/attic-labs/noms/go/perf/suite for full documentation and flags.

To ask Jenkins to run the perf tests for you, reply (no quotes) "Jenkins: perf this" to your PR. Your results will be viewable at http://perf.noms.io/?ds=http://demo.noms.io/perf::pr_$your-pull-request-number/csv-import. Again, only a committer can do this.
@@ -1,24 +0,0 @@
FROM golang:latest AS build

ENV NOMS_SRC=$GOPATH/src/github.com/attic-labs/noms
ENV CGO_ENABLED=0
ENV GOOS=linux
ENV NOMS_VERSION_NEXT=1

RUN mkdir -pv $NOMS_SRC
COPY . ${NOMS_SRC}
RUN go test github.com/attic-labs/noms/...
RUN go install -v github.com/attic-labs/noms/cmd/noms
RUN cp $GOPATH/bin/noms /bin/noms

FROM alpine:latest

COPY --from=build /bin/noms /bin/noms

VOLUME /data
EXPOSE 8000

ENV NOMS_VERSION_NEXT=1
ENTRYPOINT [ "noms" ]

CMD ["serve", "/data"]
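For reference, an image built from this Dockerfile could be used roughly like so (the `noms` image tag and the host data path are arbitrary choices, and this requires a running Docker daemon); per the `EXPOSE`, `VOLUME`, and `CMD` lines above, it serves a database out of `/data` on port 8000:

```shell
docker build -t noms .
docker run -p 8000:8000 -v /path/to/local/data:/data noms
```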
@@ -1,696 +0,0 @@
# This file is autogenerated, do not edit; changes may be undone by the next 'dep ensure'.


[[projects]]
  digest = "1:08636edd4ac1b095a9689b7a07763aa70e035068e0ff0e9dbfe2b6299b98e498"
  name = "cloud.google.com/go"
  packages = [
    "compute/metadata",
    "iam",
    "internal",
    "internal/optional",
    "internal/trace",
    "internal/version",
    "storage",
  ]
  pruneopts = "UT"
  revision = "0ebda48a7f143b1cce9eb37a8c1106ac762a3430"
  version = "v0.34.0"

[[projects]]
  digest = "1:9f3b30d9f8e0d7040f729b82dcbc8f0dead820a133b3147ce355fc451f32d761"
  name = "github.com/BurntSushi/toml"
  packages = ["."]
  pruneopts = "UT"
  revision = "3012a1dbe2e4bd1391d42b32f0577cb7bbc7f005"
  version = "v0.3.1"

[[projects]]
  digest = "1:e92f5581902c345eb4ceffdcd4a854fb8f73cf436d47d837d1ec98ef1fe0a214"
  name = "github.com/StackExchange/wmi"
  packages = ["."]
  pruneopts = "UT"
  revision = "5d049714c4a64225c3c79a7cf7d02f7fb5b96338"
  version = "1.0.0"

[[projects]]
  branch = "master"
  digest = "1:315c5f2f60c76d89b871c73f9bd5fe689cad96597afd50fb9992228ef80bdd34"
  name = "github.com/alecthomas/template"
  packages = [
    ".",
    "parse",
  ]
  pruneopts = "UT"
  revision = "a0175ee3bccc567396460bf5acd36800cb10c49c"

[[projects]]
  branch = "master"
  digest = "1:c198fdc381e898e8fb62b8eb62758195091c313ad18e52a3067366e1dda2fb3c"
  name = "github.com/alecthomas/units"
  packages = ["."]
  pruneopts = "UT"
  revision = "2efee857e7cfd4f3d0138cc3cbb1b4966962b93a"

[[projects]]
  branch = "master"
  digest = "1:f6e569e4a0c5d9c7fab4a9613cf55ac0e2160c17cc1eae1c96b78b842619c64a"
  name = "github.com/attic-labs/graphql"
  packages = [
    ".",
    "gqlerrors",
    "language/ast",
    "language/kinds",
    "language/lexer",
    "language/location",
    "language/parser",
    "language/printer",
    "language/source",
    "language/typeInfo",
    "language/visitor",
  ]
  pruneopts = "UT"
  revision = "917f92ca24a759a0e3bfd1b135850f9b0c04682e"

[[projects]]
  branch = "master"
  digest = "1:c34bc967eedd84e16c16905429ad84c0de1355c0d16126b35b0eca8eb6581056"
  name = "github.com/attic-labs/kingpin"
  packages = ["."]
  pruneopts = "UT"
  revision = "442efcfac769eef3072317c696afe5861c6f7a15"

[[projects]]
  digest = "1:7dd0b657dc55cd8ff3e6adfa2f74056d6b840e11897c3be7384ef3e294bf0241"
  name = "github.com/aws/aws-sdk-go"
  packages = [
    "aws",
    "aws/awserr",
    "aws/awsutil",
    "aws/client",
    "aws/client/metadata",
    "aws/corehandlers",
    "aws/credentials",
    "aws/credentials/ec2rolecreds",
    "aws/credentials/endpointcreds",
    "aws/credentials/processcreds",
    "aws/credentials/stscreds",
    "aws/crr",
    "aws/csm",
    "aws/defaults",
    "aws/ec2metadata",
    "aws/endpoints",
    "aws/request",
    "aws/session",
    "aws/signer/v4",
    "internal/ini",
    "internal/s3err",
    "internal/sdkio",
    "internal/sdkrand",
    "internal/sdkuri",
    "internal/shareddefaults",
    "private/protocol",
    "private/protocol/eventstream",
    "private/protocol/eventstream/eventstreamapi",
    "private/protocol/json/jsonutil",
    "private/protocol/jsonrpc",
    "private/protocol/query",
    "private/protocol/query/queryutil",
    "private/protocol/rest",
    "private/protocol/restxml",
    "private/protocol/xml/xmlutil",
    "service/dynamodb",
    "service/s3",
    "service/sts",
  ]
  pruneopts = "UT"
  revision = "62936e15518acb527a1a9cb4a39d96d94d0fd9a2"
  version = "v1.16.15"

[[projects]]
  digest = "1:8e18047715934056ed06c61d4d512ffd80787671d95bb1808cebb56adac56d34"
  name = "github.com/clbanning/mxj"
  packages = ["."]
  pruneopts = "UT"
  revision = "79cfe7d36986ce108bd1e2c1d0a2a85c895237a2"
  version = "v1.8.3"

[[projects]]
  branch = "master"
  digest = "1:3651a7691180a385540fadaf7ebcc8708c061d3b1a9777312a23f7ba10ff6025"
  name = "github.com/codahale/blake2"
  packages = ["."]
  pruneopts = "UT"
  revision = "8d10d0420cbfbdc9c1164c0c4ad3457a6c3771b9"

[[projects]]
  digest = "1:ffe9824d294da03b391f44e1ae8281281b4afc1bdaa9588c9097785e3af10cec"
  name = "github.com/davecgh/go-spew"
  packages = ["spew"]
  pruneopts = "UT"
  revision = "8991bc29aa16c548c550c7ff78260e27b9ab7c73"
  version = "v1.1.1"

[[projects]]
  digest = "1:6f9339c912bbdda81302633ad7e99a28dfa5a639c864061f1929510a9a64aa74"
  name = "github.com/dustin/go-humanize"
  packages = ["."]
  pruneopts = "UT"
  revision = "9f541cc9db5d55bce703bd99987c9d5cb8eea45e"
  version = "v1.0.0"

[[projects]]
  digest = "1:edb569dd02419a41ddd98768cc0e7aec922ef19dae139731e5ca750afcf6f4c5"
  name = "github.com/edsrzf/mmap-go"
  packages = ["."]
  pruneopts = "UT"
  revision = "188cc3b666ba704534fa4f96e9e61f21f1e1ba7c"
  version = "v1.0.0"

[[projects]]
  digest = "1:c96d16a4451e48e2c44b2c3531fd8ec9248d822637f1911a88959ca0bcae4a64"
  name = "github.com/go-ole/go-ole"
  packages = [
    ".",
    "oleutil",
  ]
  pruneopts = "UT"
  revision = "39dc8486bd0952279431257138bc428275b86797"
  version = "v1.2.2"

[[projects]]
  digest = "1:5d1b5a25486fc7d4e133646d834f6fca7ba1cef9903d40e7aa786c41b89e9e91"
  name = "github.com/golang/protobuf"
  packages = [
    "proto",
    "protoc-gen-go/descriptor",
    "ptypes",
    "ptypes/any",
    "ptypes/duration",
    "ptypes/timestamp",
  ]
  pruneopts = "UT"
  revision = "aa810b61a9c79d51363740d207bb46cf8e620ed5"
  version = "v1.2.0"

[[projects]]
  branch = "master"
  digest = "1:4a0c6bb4805508a6287675fac876be2ac1182539ca8a32468d8128882e9d5009"
  name = "github.com/golang/snappy"
  packages = ["."]
  pruneopts = "UT"
  revision = "2e65f85255dbc3072edf28d6b5b8efc472979f5a"

[[projects]]
  digest = "1:236d7e1bdb50d8f68559af37dbcf9d142d56b431c9b2176d41e2a009b664cda8"
  name = "github.com/google/uuid"
  packages = ["."]
  pruneopts = "UT"
  revision = "9b3b1e0f5f99ae461456d768e7d301a7acdaa2d8"
  version = "v1.1.0"

[[projects]]
  digest = "1:cd9864c6366515827a759931746738ede6079faa08df9c584596370d6add135c"
  name = "github.com/googleapis/gax-go"
  packages = [
    ".",
    "v2",
  ]
  pruneopts = "UT"
  revision = "c8a15bac9b9fe955bd9f900272f9a306465d28cf"
  version = "v2.0.3"

[[projects]]
  digest = "1:b934afd6ff135f3f1c9a5c573247aaa7c81d9653fb52f59b802ebc7ab809b79f"
  name = "github.com/hanwen/go-fuse"
  packages = [
    "fuse",
    "fuse/nodefs",
    "fuse/pathfs",
    "splice",
  ]
  pruneopts = "UT"
  revision = "5690be47d614355a22931c129e1075c25a62e9ac"
  version = "v20170619"

[[projects]]
  digest = "1:d2d38625c95af7eb19435356d15af129e11869ceaff17150cda8d28e3b25bb8d"
  name = "github.com/ipfs/go-ipfs"
  packages = [
    ".",
    "core",
    "core/coreapi/interface",
    "core/coreapi/interface/options",
    "dagutils",
    "exchange/reprovide",
    "filestore",
    "filestore/pb",
    "fuse/mount",
    "keystore",
    "namesys",
    "namesys/opts",
    "namesys/republisher",
    "p2p",
    "pin",
    "pin/internal/pb",
    "repo",
    "thirdparty/cidv0v1",
    "thirdparty/math2",
    "thirdparty/verifbs",
  ]
  pruneopts = "UT"
  revision = "aefc746f34e5ffdee5fba1915c6603b65a0ebf81"
  version = "v0.4.18"

[[projects]]
  digest = "1:fa3a7dc0a780fb19838fb96d4e18c0b9a019d9bb618798308d7b6ca48fcb9876"
  name = "github.com/jbenet/go-base58"
  packages = ["."]
  pruneopts = "UT"
  revision = "6237cf65f3a6f7111cd8a42be3590df99a66bc7d"
  version = "1.0.0"

[[projects]]
  digest = "1:bb81097a5b62634f3e9fec1014657855610c82d19b9a40c17612e32651e35dca"
  name = "github.com/jmespath/go-jmespath"
  packages = ["."]
  pruneopts = "UT"
  revision = "c2b33e84"

[[projects]]
  digest = "1:b6bbd2f9e0724bd81890c8644259f920c6d61c08453978faff0bebd25f3e7d3e"
  name = "github.com/jpillora/backoff"
  packages = ["."]
  pruneopts = "UT"
  revision = "8eab2debe79d12b7bd3d10653910df25fa9552ba"
  version = "1.0.0"

[[projects]]
  digest = "1:114ecad51af93a73ae6781fd0d0bc28e52b433c852b84ab4b4c109c15e6c6b6d"
  name = "github.com/jroimartin/gocui"
  packages = ["."]
  pruneopts = "UT"
  revision = "c055c87ae801372cd74a0839b972db4f7697ae5f"
  version = "v0.4.0"

[[projects]]
  branch = "master"
  digest = "1:a34ff13c37101cd363a34dec05ef3ca896c91162cc7e612d9e4768caba9910b3"
  name = "github.com/juju/fslock"
  packages = ["."]
  pruneopts = "UT"
  revision = "4d5c94c67b4b207e1ab4ebca6b4e47f174618b86"

[[projects]]
  branch = "master"
  digest = "1:b8d72d48e77c5a93e09f82d57cd05a30c302ff0835388b0b7745f4f9cf3e0652"
  name = "github.com/juju/gnuflag"
  packages = ["."]
  pruneopts = "UT"
  revision = "2ce1bb71843d6d179b3f1c1c9cb4a72cd067fc65"

[[projects]]
  digest = "1:f97285a3b0a496dcf8801072622230d513f69175665d94de60eb042d03387f6c"
  name = "github.com/julienschmidt/httprouter"
  packages = ["."]
  pruneopts = "UT"
  revision = "348b672cd90d8190f8240323e372ecd1e66b59dc"
  version = "v1.2.0"

[[projects]]
  branch = "master"
  digest = "1:975079ef1a4b94c23122af1c18891ef9518b47f9fa30e8905b34802c5d7c7adc"
  name = "github.com/kch42/buzhash"
  packages = ["."]
  pruneopts = "UT"
  revision = "9bdec3dec7c611fa97beadc374d75bdf02cd880e"

[[projects]]
  branch = "add-daylon"
  digest = "1:2d0f8845c6bb182b7f2d6d5d9f6d2e80569412d19a0470c92183f23adf8aa175"
  name = "github.com/liquidata-inc/ld"
  packages = [
    "go/libraries/ldio",
    "go/libraries/ldset",
    "go/libraries/textdb",
  ]
  pruneopts = "UT"
  revision = "5276363e6eea62b858a43d872b41969e2fbee0f3"

[[projects]]
  digest = "1:c658e84ad3916da105a761660dcaeb01e63416c8ec7bc62256a9b411a05fcd67"
  name = "github.com/mattn/go-colorable"
  packages = ["."]
  pruneopts = "UT"
  revision = "167de6bfdfba052fa6b2d3664c8f5272e23c9072"
  version = "v0.0.9"

[[projects]]
  digest = "1:0981502f9816113c9c8c4ac301583841855c8cf4da8c72f696b3ebedf6d0e4e5"
  name = "github.com/mattn/go-isatty"
  packages = ["."]
  pruneopts = "UT"
  revision = "6ca4dbf54d38eea1a992b3c722a76a5d1c4cb25c"
  version = "v0.0.4"

[[projects]]
  digest = "1:0356f3312c9bd1cbeda81505b7fd437501d8e778ab66998ef69f00d7f9b3a0d7"
  name = "github.com/mattn/go-runewidth"
  packages = ["."]
  pruneopts = "UT"
  revision = "3ee7d812e62a0804a7d0a324e0249ca2db3476d3"
  version = "v0.0.4"

[[projects]]
  branch = "master"
  digest = "1:2b32af4d2a529083275afc192d1067d8126b578c7a9613b26600e4df9c735155"
  name = "github.com/mgutz/ansi"
  packages = ["."]
  pruneopts = "UT"
  revision = "9520e82c474b0a04dd04f8a40959027271bab992"

[[projects]]
  branch = "master"
  digest = "1:f3fc7efada7606d5abc88372e1f838ed897fa522077957070fbc2207a50d6faa"
  name = "github.com/nsf/termbox-go"
  packages = ["."]
  pruneopts = "UT"
  revision = "0938b5187e61bb8c4dcac2b0a9cf4047d83784fc"

[[projects]]
  digest = "1:0028cb19b2e4c3112225cd871870f2d9cf49b9b4276531f03438a88e94be86fe"
  name = "github.com/pmezard/go-difflib"
  packages = ["difflib"]
  pruneopts = "UT"
  revision = "792786c7400a136282c1664665ae0a8db921c6c2"
  version = "v1.0.0"

[[projects]]
  digest = "1:5331094ce2c687a921af5ec1367fe96e894e5b6866c2c3b8d415e86b65e69bce"
  name = "github.com/shirou/gopsutil"
  packages = [
    "cpu",
    "disk",
    "host",
    "internal/common",
    "mem",
    "net",
    "process",
  ]
  pruneopts = "UT"
  revision = "ccc1c1016bc5d10e803189ee43417c50cdde7f1b"
  version = "v2.18.12"

[[projects]]
  branch = "master"
  digest = "1:99c6a6dab47067c9b898e8c8b13d130c6ab4ffbcc4b7cc6236c2cd0b1e344f5b"
  name = "github.com/shirou/w32"
  packages = ["."]
  pruneopts = "UT"
  revision = "bb4de0191aa41b5507caa14b0650cdbddcd9280b"

[[projects]]
  branch = "master"
  digest = "1:e564a9e23c65422754afbc07ec84252048a83b5c9f0a2e76a761cd35472216e5"
  name = "github.com/skratchdot/open-golang"
  packages = ["open"]
  pruneopts = "UT"
  revision = "a2dfa6d0dab6634ecf39251031a3d52db73b5c7e"

[[projects]]
  digest = "1:8ff03ccc603abb0d7cce94d34b613f5f6251a9e1931eba1a3f9888a9029b055c"
  name = "github.com/stretchr/testify"
  packages = [
    "assert",
    "require",
    "suite",
  ]
  pruneopts = "UT"
  revision = "ffdc059bfe9ce6a4e144ba849dbedead332c6053"
  version = "v1.3.0"

[[projects]]
  branch = "master"
  digest = "1:685fdfea42d825ebd39ee0994354b46c374cf2c2b2d97a41a8dee1807c6a9b62"
  name = "github.com/syndtr/goleveldb"
  packages = [
    "leveldb",
    "leveldb/cache",
    "leveldb/comparer",
    "leveldb/errors",
    "leveldb/filter",
    "leveldb/iterator",
    "leveldb/journal",
    "leveldb/memdb",
    "leveldb/opt",
    "leveldb/storage",
    "leveldb/table",
    "leveldb/util",
  ]
  pruneopts = "UT"
  revision = "b001fa50d6b27f3f0bb175a87d0cb55426d0a0ae"

[[projects]]
  digest = "1:3b5a3bc35810830ded5e26ef9516e933083a2380d8e57371fdfde3c70d7c6952"
  name = "go.opencensus.io"
  packages = [
    ".",
    "exemplar",
    "internal",
    "internal/tagencoding",
    "plugin/ochttp",
    "plugin/ochttp/propagation/b3",
    "stats",
    "stats/internal",
    "stats/view",
    "tag",
    "trace",
    "trace/internal",
    "trace/propagation",
    "trace/tracestate",
  ]
  pruneopts = "UT"
  revision = "b7bf3cdb64150a8c8c53b769fdeb2ba581bd4d4b"
  version = "v0.18.0"

[[projects]]
  branch = "master"
  digest = "1:0303de617dda42e24d7c55ce621bfeb982320396bcacbc5e22966f3552205808"
  name = "golang.org/x/net"
  packages = [
    "context",
    "context/ctxhttp",
    "html",
    "html/atom",
    "http/httpguts",
    "http2",
    "http2/hpack",
    "idna",
    "internal/timeseries",
    "trace",
  ]
  pruneopts = "UT"
  revision = "45ffb0cd1ba084b73e26dee67e667e1be5acce83"

[[projects]]
  branch = "master"
  digest = "1:23443edff0740e348959763085df98400dcfd85528d7771bb0ce9f5f2754ff4a"
  name = "golang.org/x/oauth2"
  packages = [
    ".",
    "google",
    "internal",
    "jws",
    "jwt",
  ]
  pruneopts = "UT"
  revision = "d668ce993890a79bda886613ee587a69dd5da7a6"

[[projects]]
  branch = "master"
  digest = "1:c4f2af053602f247b8625846cd88dfbf9295e3d02e82c58c27ebe3be06bef80c"
  name = "golang.org/x/sys"
  packages = [
    "unix",
    "windows",
  ]
  pruneopts = "UT"
  revision = "20be8e55dc7b4b7a1b1660728164a8509d8c9209"

[[projects]]
  digest = "1:a2ab62866c75542dd18d2b069fec854577a20211d7c0ea6ae746072a1dccdd18"
  name = "golang.org/x/text"
  packages = [
    "collate",
    "collate/build",
    "internal/colltab",
    "internal/gen",
    "internal/tag",
    "internal/triegen",
    "internal/ucd",
    "language",
    "secure/bidirule",
    "transform",
    "unicode/bidi",
    "unicode/cldr",
    "unicode/norm",
    "unicode/rangetable",
  ]
  pruneopts = "UT"
  revision = "f21a4dfb5e38f5895301dc265a8def02365cc3d0"
  version = "v0.3.0"

[[projects]]
  digest = "1:768c35ec83dd17029060ea581d6ca9fdcaef473ec87e93e4bb750949035f6070"
  name = "google.golang.org/api"
  packages = [
    "gensupport",
    "googleapi",
    "googleapi/internal/uritemplates",
    "googleapi/transport",
    "internal",
    "iterator",
    "option",
    "storage/v1",
    "transport/http",
    "transport/http/internal/propagation",
  ]
  pruneopts = "UT"
  revision = "19e022d8cf43ce81f046bae8cc18c5397cc7732f"
  version = "v0.1.0"

[[projects]]
  digest = "1:fa026a5c59bd2df343ec4a3538e6288dcf4e2ec5281d743ae82c120affe6926a"
  name = "google.golang.org/appengine"
  packages = [
    ".",
    "internal",
    "internal/app_identity",
    "internal/base",
    "internal/datastore",
    "internal/log",
    "internal/modules",
    "internal/remote_api",
    "internal/urlfetch",
    "urlfetch",
  ]
  pruneopts = "UT"
  revision = "e9657d882bb81064595ca3b56cbe2546bbabf7b1"
  version = "v1.4.0"

[[projects]]
  branch = "master"
  digest = "1:a7d48ca460ca1b4f6ccd8c95502443afa05df88aee84de7dbeb667a8754e8fa6"
  name = "google.golang.org/genproto"
  packages = [
    "googleapis/api/annotations",
    "googleapis/iam/v1",
    "googleapis/rpc/code",
    "googleapis/rpc/status",
  ]
  pruneopts = "UT"
  revision = "bd9b4fb69e2ffd37621a6caa54dcbead29b546f2"

[[projects]]
  digest = "1:9edd250a3c46675d0679d87540b30c9ed253b19bd1fd1af08f4f5fb3c79fc487"
  name = "google.golang.org/grpc"
  packages = [
    ".",
    "balancer",
    "balancer/base",
    "balancer/roundrobin",
    "binarylog/grpc_binarylog_v1",
    "codes",
    "connectivity",
    "credentials",
    "credentials/internal",
    "encoding",
    "encoding/proto",
    "grpclog",
    "internal",
    "internal/backoff",
    "internal/binarylog",
    "internal/channelz",
    "internal/envconfig",
    "internal/grpcrand",
    "internal/grpcsync",
    "internal/syscall",
    "internal/transport",
    "keepalive",
    "metadata",
    "naming",
    "peer",
    "resolver",
    "resolver/dns",
    "resolver/passthrough",
    "stats",
    "status",
    "tap",
  ]
  pruneopts = "UT"
  revision = "df014850f6dee74ba2fc94874043a9f3f75fbfd8"
  version = "v1.17.0"

[[projects]]
  digest = "1:c06d9e11d955af78ac3bbb26bd02e01d2f61f689e1a3bce2ef6fb683ef8a7f2d"
  name = "gopkg.in/alecthomas/kingpin.v2"
  packages = ["."]
  pruneopts = "UT"
  revision = "947dcec5ba9c011838740e680966fd7087a71d0d"
  version = "v2.2.6"

[solve-meta]
  analyzer-name = "dep"
  analyzer-version = 1
  input-imports = [
    "cloud.google.com/go/storage",
    "github.com/BurntSushi/toml",
    "github.com/attic-labs/graphql",
    "github.com/attic-labs/graphql/gqlerrors",
    "github.com/attic-labs/kingpin",
    "github.com/aws/aws-sdk-go/aws",
    "github.com/aws/aws-sdk-go/aws/awserr",
    "github.com/aws/aws-sdk-go/aws/credentials",
    "github.com/aws/aws-sdk-go/aws/session",
    "github.com/aws/aws-sdk-go/service/dynamodb",
    "github.com/aws/aws-sdk-go/service/s3",
    "github.com/clbanning/mxj",
    "github.com/codahale/blake2",
    "github.com/dustin/go-humanize",
    "github.com/edsrzf/mmap-go",
    "github.com/golang/snappy",
    "github.com/google/uuid",
    "github.com/hanwen/go-fuse/fuse",
    "github.com/hanwen/go-fuse/fuse/nodefs",
    "github.com/hanwen/go-fuse/fuse/pathfs",
    "github.com/ipfs/go-ipfs/core",
    "github.com/jbenet/go-base58",
    "github.com/jpillora/backoff",
    "github.com/jroimartin/gocui",
    "github.com/juju/fslock",
    "github.com/juju/gnuflag",
    "github.com/julienschmidt/httprouter",
    "github.com/kch42/buzhash",
    "github.com/liquidata-inc/ld/go/libraries/ldio",
    "github.com/liquidata-inc/ld/go/libraries/textdb",
    "github.com/mattn/go-isatty",
    "github.com/mgutz/ansi",
    "github.com/shirou/gopsutil/cpu",
    "github.com/shirou/gopsutil/disk",
    "github.com/shirou/gopsutil/host",
    "github.com/shirou/gopsutil/mem",
    "github.com/skratchdot/open-golang/open",
    "github.com/stretchr/testify/assert",
    "github.com/stretchr/testify/suite",
    "github.com/syndtr/goleveldb/leveldb",
    "github.com/syndtr/goleveldb/leveldb/iterator",
    "github.com/syndtr/goleveldb/leveldb/opt",
    "github.com/syndtr/goleveldb/leveldb/util",
    "golang.org/x/net/context",
    "golang.org/x/net/html",
    "golang.org/x/oauth2",
    "google.golang.org/api/googleapi",
    "gopkg.in/alecthomas/kingpin.v2",
  ]
  solver-name = "gps-cdcl"
  solver-version = 1
@@ -1,158 +0,0 @@
# Gopkg.toml example
#
# Refer to https://golang.github.io/dep/docs/Gopkg.toml.html
# for detailed Gopkg.toml documentation.
#
# required = ["github.com/user/thing/cmd/thing"]
# ignored = ["github.com/user/project/pkgX", "bitbucket.org/user/project/pkgA/pkgY"]
#
# [[constraint]]
#   name = "github.com/user/project"
#   version = "1.0.0"
#
# [[constraint]]
#   name = "github.com/user/project2"
#   branch = "dev"
#   source = "github.com/myfork/project2"
#
# [[override]]
#   name = "github.com/x/y"
#   version = "2.4.0"
#
# [prune]
#   non-go = false
#   go-tests = true
#   unused-packages = true


[[constraint]]
  name = "cloud.google.com/go"
  version = "0.34.0"

[[constraint]]
  name = "github.com/BurntSushi/toml"
  version = "0.3.1"

[[override]]
  name = "github.com/attic-labs/graphql"
  branch = "master"

[[constraint]]
  name = "github.com/attic-labs/kingpin"
  branch = "master"

[[constraint]]
  name = "github.com/aws/aws-sdk-go"
  version = "1.16.15"

[[constraint]]
  name = "github.com/clbanning/mxj"
  version = "1.8.3"

[[constraint]]
  branch = "master"
  name = "github.com/codahale/blake2"

[[constraint]]
  name = "github.com/dustin/go-humanize"
  version = "1.0.0"

[[constraint]]
  name = "github.com/edsrzf/mmap-go"
  version = "1.0.0"

[[constraint]]
  branch = "master"
  name = "github.com/golang/snappy"

[[constraint]]
  name = "github.com/google/uuid"
  version = "1.1.0"

[[constraint]]
  name = "github.com/hanwen/go-fuse"
  version = "20170619.0.0"

[[constraint]]
  name = "github.com/ipfs/go-ipfs"
  version = "0.4.18"

[[constraint]]
  name = "github.com/jbenet/go-base58"
  version = "1.0.0"

[[constraint]]
  name = "github.com/jpillora/backoff"
  version = "1.0.0"

[[constraint]]
  name = "github.com/jroimartin/gocui"
  version = "0.4.0"

[[constraint]]
  branch = "master"
  name = "github.com/juju/fslock"

[[constraint]]
  branch = "master"
  name = "github.com/juju/gnuflag"

[[constraint]]
  name = "github.com/julienschmidt/httprouter"
  version = "1.2.0"

[[constraint]]
  branch = "master"
  name = "github.com/kch42/buzhash"

[[constraint]]
  branch = "add-daylon"
  name = "github.com/liquidata-inc/ld"

[[constraint]]
  name = "github.com/mattn/go-isatty"
  version = "0.0.4"

[[constraint]]
  branch = "master"
  name = "github.com/mgutz/ansi"

[[constraint]]
  name = "github.com/shirou/gopsutil"
  version = "2.18.12"

[[constraint]]
  branch = "master"
  name = "github.com/skratchdot/open-golang"

[[constraint]]
  name = "github.com/stretchr/testify"
  version = "1.3.0"

[[constraint]]
  branch = "master"
  name = "github.com/syndtr/goleveldb"

[[constraint]]
  branch = "master"
  name = "golang.org/x/net"

[[constraint]]
  branch = "master"
  name = "golang.org/x/oauth2"

[[constraint]]
  branch = "master"
  name = "golang.org/x/sys"

[[constraint]]
  name = "google.golang.org/api"
  version = "0.1.0"

[[constraint]]
  name = "gopkg.in/alecthomas/kingpin.v2"
  version = "2.2.6"

[prune]
  go-tests = true
  unused-packages = true
@@ -1,135 +0,0 @@
<img src='doc/nommy_cropped_smaller.png' width='350' title='Nommy, the snacky otter'>

[Use Cases](#use-cases) | [Setup](#setup) | [Status](#status) | [Documentation](./doc/intro.md) | [Contact](#contact-us)
<br><br>

[](https://hub.docker.com/r/noms/noms/)
[](https://godoc.org/github.com/attic-labs/noms)

# Welcome

*Noms* is a decentralized database philosophically descended from the Git version control system.

Like Git, Noms is:

* **Versioned:** By default, all previous versions of the database are retained. You can trivially track how the database evolved to its current state, easily and efficiently compare any two versions, or even rewind and branch from any previous version.
* **Synchronizable:** Instances of a single Noms database can be disconnected from each other for any amount of time, then later reconcile their changes efficiently and correctly.

Unlike Git, Noms is a database, so it also:

* Primarily **stores structured data**, not files and directories (see: [the Noms type system](https://github.com/attic-labs/noms/blob/master/doc/intro.md#types))
* **Scales well** to large amounts of data and concurrent clients
* Supports **atomic transactions** (a single instance of Noms is CP, but Noms is typically run in production backed by S3, in which case it is "[effectively CA](https://cloud.google.com/spanner/docs/whitepapers/SpannerAndCap.pdf)")
* Supports **efficient indexes** (see: [Noms prolly-trees](https://github.com/attic-labs/noms/blob/master/doc/intro.md#prolly-trees-probabilistic-b-trees))
* Features a **flexible query model** (see: [GraphQL](./go/ngql/README.md))

A Noms database can reside within a file system or in the cloud:

* The (built-in) [NBS](./go/nbs) `ChunkStore` implementation provides two back-ends for persisting Noms databases: one backed by a file system and one backed by an S3 bucket.

Finally, because Noms is content-addressed, it yields a very pleasant programming model.

Working with Noms is ***declarative***. You don't `INSERT` new data, `UPDATE` existing data, or `DELETE` old data. You simply *declare* what the data ought to be right now. If you commit the same data twice, it will be deduplicated because of content-addressing. If you commit _almost_ the same data, only the part that is different will be written.
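The deduplication that content-addressing buys can be sketched in a few lines of Go. This is a simplification, not the Noms implementation (Noms truncates a SHA-512 digest and uses a different encoding), but it shows why writing identical data twice costs nothing: the storage key *is* the data's hash.

```go
package main

import (
	"crypto/sha512"
	"fmt"
)

// chunkAddress returns a storage key derived from the bytes themselves.
// Identical values map to the same key, so they are stored only once.
// (Illustrative: Noms actually uses the first 20 bytes of SHA-512.)
func chunkAddress(data []byte) string {
	h := sha512.Sum512(data)
	return fmt.Sprintf("%x", h[:20])
}

func main() {
	a := chunkAddress([]byte(`{"name":"Arya"}`))
	b := chunkAddress([]byte(`{"name":"Arya"}`)) // same value → same address
	c := chunkAddress([]byte(`{"name":"Sansa"}`))
	fmt.Println(a == b) // identical data deduplicates
	fmt.Println(a == c) // different data gets a different address
}
```

Committing "almost the same" data works the same way at a finer grain: values are broken into chunks, and only chunks whose addresses are new get written.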
<br>

## Use Cases

#### [Decentralization](./doc/decent/about.md)

Because Noms is very good at sync, it makes a decent basis for rich, collaborative, fully-decentralized applications.

#### ClientDB (coming someday)

Embed Noms into mobile applications, making it easier to build offline-first, fully synchronizing mobile applications.

<br>

## Setup

```shell
# You probably want to add this to your environment
export NOMS_VERSION_NEXT=1

go get github.com/attic-labs/noms/cmd/noms
go install github.com/attic-labs/noms/cmd/noms
```

<br>

## Run

Import some data:

```shell
go install github.com/attic-labs/noms/samples/go/csv/csv-import
curl 'https://data.cityofnewyork.us/api/views/kku6-nxdu/rows.csv?accessType=DOWNLOAD' > /tmp/data.csv
csv-import /tmp/data.csv /tmp/noms::nycdemo
```

Explore:

```shell
noms show /tmp/noms::nycdemo
```

Should show:

```go
struct Commit {
  meta: struct Meta {
    date: "2017-09-19T19:33:01Z",
    inputFile: "/tmp/data.csv",
  },
  parents: set {},
  value: [ // 236 items
    struct Row {
      countAmericanIndian: "0",
      countAsianNonHispanic: "3",
      countBlackNonHispanic: "21",
      countCitizenStatusTotal: "44",
      countCitizenStatusUnknown: "0",
      countEthnicityTotal: "44",
...
```

<br>

## Status

### Data Format

We are fairly confident in the core data format, and plan to support Noms database [version `7`](https://github.com/attic-labs/noms/blob/v7/go/constants/version.go#L9) and forward. If you create a database with Noms today, future versions will have migration tools to pull your databases forward.

### Roadmap

We plan to implement the following for Noms version 8:

- [x] Horizontal scalability (Done! See: [nbs](./go/nbs/README.md))
- [x] Automatic merge (Done! See: [CommitOptions.Policy](https://godoc.org/github.com/attic-labs/noms/go/datas#CommitOptions) and the `noms merge` subcommand)
- [x] Query language (Done! See: [ngql](./go/ngql/README.md))
- [ ] Garbage collection (https://github.com/attic-labs/noms/issues/3374)
- [x] Optional fields (https://github.com/attic-labs/noms/issues/2327)
- [ ] Implement migration (https://github.com/attic-labs/noms/issues/3363)
- [ ] Fix sync performance with long commit chains (https://github.com/attic-labs/noms/issues/2233)
- [ ] [Various other smaller bugs and improvements](https://github.com/attic-labs/noms/issues?q=is%3Aissue+is%3Aopen+label%3AP0)

<br>

## Learn More About Noms

For the decentralized web: [The Decentralized Database](doc/decent/about.md)

Learn the basics: [Technical Overview](doc/intro.md)

Tour the CLI: [Command-Line Interface Tour](doc/cli-tour.md)

Tour the Go API: [Go SDK Tour](doc/go-tour.md)

<br>

## Contact Us

Interested in using Noms? Awesome! We would be happy to work with you to figure out whether Noms is a fit for your problem. Reach out at:

- [Mailing List](https://groups.google.com/forum/#!forum/nomsdb)
- [Twitter](https://twitter.com/nomsdb)
@@ -1,52 +0,0 @@
codecov:
  branch: master
  bot: "mikegray"
  ci:
    - "jenkins3.noms.io"

coverage:
  precision: 2        # how many decimal places to display in the UI: 0 <= value <= 4
  round: down         # how coverage is rounded: down/up/nearest
  range: 70...100     # custom range of coverage colors from red -> yellow -> green

  notify:
    slack:
      default:
        url: "secret:n+BYhIXTXsaCiMKB3vOf6yP68ytdKd3WpXtJFWPEUsEWXDiGnU5dTB5DO2yv8tR0COdxvs7K31hVpEfHEXdoXOaQhUw3FKf3fh8KZDLN7CGTbeDhw1uNGGyBr2d2TWnopzYtcXomdwMmuckARtiWQx0YXJiZY9YyCrIoDK9HIJQ="
        branches: null
        threshold: 5.0
        attachments: "tree, diff"

  status:
    project:
      default:
        enabled: yes
        target: auto
        branches: null
        threshold: null
        if_no_uploads: error
        if_not_found: success
        if_ci_failed: error

    patch:
      default:
        enabled: yes
        target: auto
        branches: null
        threshold: null
        if_no_uploads: error
        if_not_found: success
        if_ci_failed: error

    changes:
      default:
        enabled: yes
        branches: null
        if_no_uploads: error
        if_not_found: success
        if_ci_failed: error

comment:
  layout: "tree"
  branches: null
  behavior: default
@@ -1,161 +0,0 @@
[Home](../README.md) »

[Technical Overview](intro.md) | [Use Cases](../README.md#use-cases) | **Command-Line Interface** | [Go bindings Tour](go-tour.md) | [Path Syntax](spelling.md) | [FAQ](faq.md)
<br><br>
# A Short Tour of the Noms CLI

This is a quick introduction to the Noms command-line interface. It should only take a few minutes to read, but there's also a screencast if you prefer:

[<img src="cli-screencast.png" width="500">](https://www.youtube.com/watch?v=NeBsaNdAn68)

## Install Noms

... if you haven't already. Follow the instructions [here](https://github.com/attic-labs/noms#setup).

## The `noms` command

Now you should be able to run `noms`:

```shell
> noms
Noms is a tool for goofing with Noms data.

Usage:

  noms command [arguments]

The commands are:

  diff        Shows the difference between two objects
  ds          Noms dataset management
  log         Displays the history of a Noms dataset
  serve       Serves a Noms database over HTTP
  show        Shows a serialization of a Noms object
  sync        Moves datasets between or within databases
  version     Display noms version

Use "noms help [command]" for more information about a command.
```

Without any arguments, `noms` lists all available commands. To get information on a specific command, use `noms help [command]`:

```shell
> noms help sync
usage: noms sync [options] <source-object> <dest-dataset>

See Spelling Objects at https://github.com/attic-labs/noms/blob/master/doc/spelling.md for details on the object and dataset arguments.

...
```

## noms ds

The `noms ds` command lists the _datasets_ within a particular database:

```shell
> noms ds http://demo.noms.io
...
sf-film-locations/raw
sf-film-locations
...
```

## noms log

Noms datasets are versioned. You can see the history with `log`:

```shell
> noms log http://demo.noms.io::sf-film-locations
commit aprsmg0j2eegk8eehbgj7cd3tmmd1be8
Parent:    None
Date:      "2017-09-19T21:42:46Z"
InputPath: "http://localhost:8000::#dksek6tuf8ens06bi4culq85tfp5q4cg.value"

...
```

Note that Noms is a typed system. What is shown here for each entry is not text, but a serialization of the diff between two versions of the dataset.

## noms show

You can see the entire serialization of any object in the database with `noms show`:

```shell
> noms show 'http://demo.noms.io::#aprsmg0j2eegk8eehbgj7cd3tmmd1be8'

struct Commit {
  meta: struct {},
  parents: Set<Ref<Cycle<Commit>>>,
  value: List<struct Row {
    Actor1: String,
    Actor2: String,
    Actor3: String,
    Director: String,
    Distributor: String,
    FunFacts: String,
    Locations: String,
    ProductionCompany: String,
    ReleaseYear: Number,
    Title: String,
    Writer: String,
  }>,
}({
  meta: Meta {
    date: "2016-07-25T18:34:00+0000",
    inputPath: "http://localhost:8000::sf-film-locations/raw.value",
  },
  parents: {
    c506ta03786j48a07he83ju669u78qa2,
  },
  value: [ // 1,241 items
    Row {
      Actor1: "Siddarth",
...
```

## noms sync

You can work with remote Noms databases exactly the same way you work with local ones. But it's frequently useful to move data to a local machine, for example to make a private fork or to work with the data disconnected from the source database.

Moving data in Noms is done with the `sync` command. Note that unlike Git, Noms does not distinguish between _push_ and _pull_. It's the same operation in both directions:

```shell
> noms sync http://demo.noms.io::sf-film-locations /tmp/noms::films
> noms ds /tmp/noms
films
```

We can now make an edit locally:

```shell
> go install github.com/attic-labs/noms/samples/go/csv/...
> csv-export /tmp/noms::films > /tmp/film-locations.csv
```

Open /tmp/film-locations.csv and edit it, then:

```shell
> csv-import --column-types=String,String,String,String,String,String,String,String,Number,String,String \
    /tmp/film-locations.csv /tmp/noms::films
```

## noms diff

The `noms diff` command can show you the differences between any two values. Let's see our change:

```shell
> noms diff http://demo.noms.io::sf-film-locations /tmp/noms::films

./.meta {
-   "date": "2016-07-25T18:51:23+0000"
+   "date": "2016-07-25T22:51:14+0000"
+   "inputFile": "/tmp/film-locations.csv"
-   "inputPath": "http://demo.noms.io::sf-film-locations/raw.value"
}
./.parents {
-   pckdvpvr9br1fie6c3pjudrlthe7na18
+   q4jcc2i7kntkjiipvjgpr5r02ldroj0g
}
./.value[0] {
-   "Locations": "Epic Roasthouse (399 Embarcadero)"
+   "Locations": "Epic Roadhouse (399 Embarcadero)"
}
```
@@ -1,77 +0,0 @@
[Home](../../README.md) » [Use Cases](../../README.md#use-cases) » **Decentralized** »

**About** | [Quickstart](quickstart.md) | [Architectures](architectures.md) | [P2P Chat Demo](demo-p2p-chat.md) | [IPFS Chat Demo](demo-ipfs-chat.md)
<br><br>
# Noms — The Decentralized Database

[Noms](http://noms.io) makes it ~~easy~~ tractable to create rich, multiuser, collaborative, fully-decentralized applications.

Like most databases, Noms features a rich data model, atomic transactions, support for large-scale data, and efficient searches, scans, reads, and updates.

Unlike any other database, Noms has built-in multiparty sync and conflict resolution. This feature makes Noms a very good fit for P2P decentralized applications.

Any number of dapp peers in a P2P network can concurrently modify the same logical Noms database, and continuously and efficiently sync their changes with each other. All peers will converge to the same state.

For many applications, peers can store an entire local copy of the data they are interested in. For larger applications, it should be possible to back Noms by a decentralized blockstore like IPFS, Swarm, or Sia (or in the future, Filecoin), and store large-scale data in a completely decentralized way, without replicating it on every node. Noms also has a blockstore for S3, which is ideal for applications that have some centralized components.

**We'd love to talk to you about the possibility of using noms in your project**, so please don't hesitate to contact us at [noms@attic.io](mailto:noms@attic.io).

## How it Works

Think of Noms like a programmable Git: changes are bundled as commits which reference previous states of the database. Apps pull changes from peers and merge them using a principled set of APIs and strategies. But rather than users manually pulling and merging, applications typically do this continuously, automatically converging to a shared state.

Your application uses a [Go client library](https://github.com/attic-labs/noms/blob/master/doc/go-tour.md) to interact with Noms data. There is also a [command-line interface](https://github.com/attic-labs/noms/blob/master/doc/cli-tour.md) for working with data, and initial support for a [GraphQL-based query language](https://github.com/attic-labs/noms/blob/master/go/ngql/README.md).

Some additional features include:

* **Versioning**: It’s easy to use, compare, or revert to older database versions
* **Efficient diffs**: Diffing even huge datasets is efficient due to noms’ use of a novel BTree-like data structure called a [Prolly Tree](https://github.com/attic-labs/noms/blob/master/doc/intro.md#prolly-trees-probabilistic-b-trees)
* **Efficient storage**: Data are chunked and content-addressed, so there is exactly one copy of each chunk in the database, shared by all data that reference it. Small changes to massive data structures always result in small operations.
* **Verifiable**: The entire database rolls up to a single 20-byte hash that uniquely represents the database at that moment; anyone can verify that a particular database hashes to the same value

Read the [Noms design overview](https://github.com/attic-labs/noms/blob/master/doc/decent/intro.md).
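The "rolls up to a single hash" property can be sketched as a Merkle-style roll-up: each node's hash covers its own bytes plus the hashes of the chunks it references, so the root hash transitively commits to the entire database. The function names and encoding below are illustrative, not the Noms API.

```go
package main

import (
	"crypto/sha512"
	"fmt"
	"sort"
)

// hashNode computes a node's address from its payload and the addresses
// of its children, Merkle-DAG style. Sorting makes the roll-up
// independent of child order (for set-like children).
func hashNode(payload []byte, childHashes []string) string {
	sorted := append([]string(nil), childHashes...)
	sort.Strings(sorted)
	h := sha512.New()
	h.Write(payload)
	for _, c := range sorted {
		h.Write([]byte(c))
	}
	return fmt.Sprintf("%x", h.Sum(nil)[:20])
}

func main() {
	leaf1 := hashNode([]byte("row 1"), nil)
	leaf2 := hashNode([]byte("row 2"), nil)
	root := hashNode([]byte("commit"), []string{leaf1, leaf2})
	// Any change to any leaf changes the root, so two parties can verify
	// they hold the same database state by comparing one 20-byte hash.
	fmt.Println(len(root) == 40)
}
```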

## Status

For overall status of the database, see [Noms Status](../../README.md#status).

For the decentralized use case in particular: we are fairly confident in this approach and are actively looking for partners to work with to build it out.

- [x] Demonstrate core concept of using Noms to continuously sync across many users (Done! See the noms-chat demos)
- [ ] Demonstrate using libp2p or similar to traverse NATs
- [ ] Investigate backing IPFS with Noms rather than the reverse; this should improve stability and dramatically improve local performance
- [ ] Demonstrate using IPFS with a schema that permits nodes to disappear

**_If you would like to use noms in your project we’d love to hear from you_**: drop us an email ([noms@attic.io](mailto:noms@attic.io)) or send us a message in slack ([slack.noms.io](http://slack.noms.io)).
@@ -1,71 +0,0 @@
[Home](../../README.md) » [Use Cases](../../README.md#use-cases) » **Decentralized** »

[About](about.md) | [Quickstart](quickstart.md) | **Architectures** | [P2P Chat Demo](demo-p2p-chat.md) | [IPFS Chat Demo](demo-ipfs-chat.md)
<br><br>

# Architectures

There are many possible ways to use Noms as part of a decentralized application. Noms can naturally be mixed and matched with other decentralized tools like blockchains, IPFS, etc. This page lists a few approaches we find promising.

## Classic P2P Architecture

Noms can be used to implement apps in a peer-to-peer configuration. Each instance of the application (i.e., each "node") maintains a local database with the data that is relevant to it. When a node creates new data, it commits that data to its database and broadcasts a message to its peers that contains the hash of its latest commit.

Peers that are listening for these messages can decide whether that data is relevant to them. Those that are interested can pull the new data from the publisher. The two clients communicate efficiently so that only data that isn't present in the requesting client is transmitted (much the same way that one git client sends source changes to another).

Peers can use a flow similar to the following to sync changes with one another:

```nohighlight
for {
  listen for new message
  if new msg is relevant {
    if new msg is ancestor of current commit {
      // nothing to do
      continue
    }
    pull new data from sender of msg
    if current head is ancestor of new msg {
      // fast forward to the new commit
      set head of dataset to new commit
      continue
    }
    merge new with current head and commit
    publish new commit
  }
}
```
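The decision logic in that loop can be sketched concretely over a toy commit DAG. This models the control flow only (ignore / fast-forward / merge); the `commit` type and function names are illustrative, not the Noms API.

```go
package main

import "fmt"

// commit is a toy linear-history commit; real Noms commits can have
// multiple parents.
type commit struct {
	id     string
	parent *commit
}

// isAncestor reports whether a is an ancestor of (or equal to) b.
func isAncestor(a, b *commit) bool {
	for c := b; c != nil; c = c.parent {
		if c == a {
			return true
		}
	}
	return false
}

// onNewHead decides what to do when a peer announces a new head.
func onNewHead(local, incoming *commit) string {
	switch {
	case isAncestor(incoming, local):
		return "ignore" // we already have it
	case isAncestor(local, incoming):
		return "fast-forward" // just move the dataset head
	default:
		return "merge" // concurrent change: three-way merge, then publish
	}
}

func main() {
	root := &commit{id: "r"}
	a := &commit{id: "a", parent: root}
	b := &commit{id: "b", parent: root}
	fmt.Println(onNewHead(a, root)) // stale announcement → ignore
	fmt.Println(onNewHead(root, a)) // descendant → fast-forward
	fmt.Println(onNewHead(a, b))    // concurrent heads → merge
}
```

The same three-way split applies in the decentralized-chunkstore variant below; only the "pull" step disappears.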

Noms has a default [merge policy](https://github.com/attic-labs/noms/blob/2d0e9e738370d49cc09e8fa6e290ceca1c3e2005/go/merge/three_way.go#L14) that covers many classes of concurrent operations. If the application restricts itself to operations that are mergeable by this policy, then Noms can automatically merge all concurrent changes. In this case, the entire database is effectively a CRDT.

If this is not sufficient, applications can create their own merge policies, implementing whatever merge is appropriate for their use case.

## Decentralized Chunkstore Architecture

Another potential architecture for decentralized apps uses a decentralized chunkstore (such as IPFS, Swarm, or Sia) rather than local databases. In this case, rather than each node maintaining a local datastore, Noms chunks are stored in a decentralized chunkstore. The underlying chunkstore is responsible for making chunks available when needed.

The flow used by peers to sync with one another is similar to the peer-to-peer architecture. The main difference is that data is not duplicated on local machines and doesn't have to be pulled during sync. Each app keeps track of its latest commit in the chunk store.

```nohighlight
for {
  listen for new message
  if new msg is relevant {
    if new msg is ancestor of current commit {
      // nothing to do
      continue
    }
    // No pull necessary
    if current head is ancestor of new msg {
      // fast forward to the new commit
      set head of dataset to new commit
      continue
    }
    merge new with current head and commit
    publish new commit
  }
}
```

We have a prototype implementation of an IPFS-based chunkstore. If you are interested in pursuing this direction, let us know!
@@ -1,49 +0,0 @@
[Home](../../README.md) » [Use Cases](../../README.md#use-cases) » **Decentralized** »

[About](about.md) | [Quickstart](quickstart.md) | [Architectures](architectures.md) | [P2P Chat Demo](demo-p2p-chat.md) | **IPFS Chat Demo**
<br><br>
# Demo App: IPFS-based Decentralized Chat

This sample app demonstrates backing a P2P noms app with a decentralized blockstore (in this case, IPFS). Data is pulled off the network dynamically as needed; each client doesn't need a complete copy.

# Build and Run

Demo app code is in the [ipfs-chat](https://github.com/attic-labs/noms/tree/master/samples/go/decent/ipfs-chat/) directory. To get it up and running, take the following steps:

* Fetch the noms repository (`go get` clones it into your `$GOPATH`):

  ```shell
  go get github.com/attic-labs/noms/samples/go/decent/ipfs-chat
  ```

* From the noms/samples/go/decent/ipfs-chat directory, build the program:

  ```shell
  go build
  ```

* Run the ipfs-chat client:

  ```shell
  ./ipfs-chat client --username <aname1> --node-idx=1 ipfs:/tmp/ipfs1::chat >& /tmp/err1
  ```

* Run a second ipfs-chat client:

  ```shell
  ./ipfs-chat client --username <aname2> --node-idx=2 ipfs:/tmp/ipfs2::chat >& /tmp/err2
  ```

If desired, ipfs-chat can be run as a daemon, which replicates all chat content in a local store so that clients can go offline without making data unavailable to other clients:

```shell
./ipfs-chat daemon --node-idx=3 ipfs:/tmp/ipfs3::chat
```

Note: the 'node-idx' argument ensures that each IPFS-based program uses a distinct set of ports. This is useful when running multiple IPFS-based programs on the same machine.
@@ -1,46 +0,0 @@
[Home](../../README.md) » [Use Cases](../../README.md#use-cases) » **Decentralized** »

[About](about.md) | [Quickstart](quickstart.md) | [Architectures](architectures.md) | **P2P Chat Demo** | [IPFS Chat Demo](demo-ipfs-chat.md)
<br><br>
# Demo App: P2P Decentralized Chat

This sample demonstrates the simplest possible case of building a p2p app on top of Noms. Each node stores a complete copy of the data it is interested in, and peers find each other using [IPFS pubsub](https://ipfs.io/blog/25-pubsub/).

Currently, nodes have to have a publicly routable IP, but it should be possible to use [libP2P](https://github.com/libp2p) or similar to connect to most nodes.

# Build and Run

Demo app code is in the [p2p](https://github.com/attic-labs/noms/tree/master/samples/go/decent/p2p-chat) directory. To get it up and running, take the following steps:

* Fetch the noms repository (`go get` clones it into your `$GOPATH`):

  ```shell
  go get github.com/attic-labs/noms/samples/go/decent/p2p-chat
  ```

* From the noms/samples/go/decent/p2p-chat directory, build the program:

  ```shell
  go build
  ```

* Run the p2p client:

  ```shell
  mkdir /tmp/noms1
  ./p2p-chat client --username=<aname1> --node-idx=1 /tmp/noms1 >& /tmp/err1
  ```

* Run a second p2p client:

  ```shell
  mkdir /tmp/noms2
  ./p2p-chat client --username=<aname2> --node-idx=2 /tmp/noms2 >& /tmp/err2
  ```

Note: the p2p client relies on IPFS for its pub/sub implementation. The 'node-idx' argument ensures that each IPFS-based node uses a distinct set of ports. This is useful when running multiple IPFS-based programs on the same machine.
@@ -1,133 +0,0 @@
[Home](../../README.md) » [Use Cases](../../README.md#use-cases) » **Decentralized** »

[About](about.md) | **Quickstart** | [Architectures](architectures.md) | [P2P Chat Demo](demo-p2p-chat.md) | [IPFS Chat Demo](demo-ipfs-chat.md)
<br><br>
# How to Use Noms in a Decentralized App

If you’d like to use noms in your project we’d love to hear from you: drop us an email ([noms@attic.io](mailto:noms@attic.io)) or send us a message in slack ([slack.noms.io](http://slack.noms.io)).

The steps you’ll need to take are:

1. Decide how you’ll model your problem using noms’ datatypes: boolean, number, string, blob, map, list, set, struct, ref, and union. (Note: if you are interested in using CRDTs as an alternative to classic datatypes, please let us know.)
2. Consider...
   * How peers will discover each other
   * How peers will notify each other of changes
   * How and when they will pull changes, and
   * What potential there is for conflicting changes. Consider modeling your problem so that changes commute in order to make merging easier.

   In our [p2p sample](https://github.com/attic-labs/noms/blob/master/doc/decent/demo-p2p-chat.md) application, all peers periodically broadcast their HEAD on a known channel using [IPFS pubsub](https://ipfs.io/blog/25-pubsub/), pull each others' changes immediately, and avoid conflicts by using operations that can be resolved with Noms' built-in merge policies.

   This is basically the simplest possible approach, but lots of options are possible. For example, an alternate approach to discoverability could be to keep a registry of all participating nodes in a blockchain (e.g., by storing them in an Ethereum smart contract). One could store either the current HEAD of each node (updated whenever the node changes state), or just an IPNS name that the node is writing to.

   As an example of changes that commute, consider modeling a stream of chat messages. Appending messages from both parties to a list is not commutative; the result depends on the order in which messages are added to the list. An example of a commutative strategy is adding the messages to a `Map` keyed by `Struct{sender, ordinal}`: the resulting `Map` is the same no matter what order messages are added.

3. Vendor the code into your project.
4. Set `NOMS_VERSION_NEXT=1` in your environment.
5. Decide which type of storage you'd like to use: memory (convenient for playing around), disk, IPFS, or S3. (If you want to implement a store on top of another type of storage that's possible too; email us or reach out on slack and we can help.)
6. Set up and instantiate a database for your storage. Generally, you use the spec package to parse a [dataset spec](https://github.com/attic-labs/noms/blob/master/doc/spelling.md) like `mem::mydataset`, which you can then ask for a [`Database`](https://github.com/attic-labs/noms/blob/master/go/datas/database.go) and [`Dataset`](https://github.com/attic-labs/noms/blob/master/go/datas/dataset.go).
   * **Memory**: no setup required, just instantiate it:

     ```go
     sp := spec.ForDataset("mem::test") // Dataset name is "test"
     ```

   * **Disk**: identify a directory for storage, say `/path/to/chunks`, and then instantiate:

     ```go
     sp := spec.ForDataset("/path/to/chunks::test") // Dataset name is "test"
     ```

   * **IPFS**: identify an IPFS node by directory. If an IPFS node doesn't exist at that directory, one will be created:

     ```go
     sp := spec.ForDataset("ipfs:/path/to/ipfs_repo::test") // Dataset name is "test"
     ```

   * **S3**: Follow the [S3 setup instructions](https://github.com/attic-labs/noms/blob/master/go/nbs/NBS-on-AWS.md) then instantiate a database and dataset:

     ```go
     sess := session.Must(session.NewSession(aws.NewConfig().WithRegion("us-west-2")))
     store := nbs.NewAWSStore("dynamo-table", "store-name", "s3-bucket", s3.New(sess), dynamodb.New(sess), 1<<28)
     database := datas.NewDatabase(store)
     dataset := database.GetDataset("aws://dynamo-table:s3-bucket/store-name::test") // Dataset name is "test"
     ```

7. Implement using the [Go API](https://github.com/attic-labs/noms/blob/master/doc/go-tour.md). If you're just playing around you could try something like this:

   ```go
   package main

   import (
       "fmt"
       "os"

       "github.com/attic-labs/noms/go/spec"
       "github.com/attic-labs/noms/go/types"
   )

   // Usage: quickstart /path/to/store::ds
   func main() {
       sp, err := spec.ForDataset(os.Args[1])
       if err != nil {
           fmt.Fprintf(os.Stderr, "Unable to parse spec: %s, error: %s\n", sp, err)
           os.Exit(1)
       }
       defer sp.Close()

       db := sp.GetDatabase()
       if headValue, ok := sp.GetDataset().MaybeHeadValue(); !ok {
           data := types.NewList(sp.GetDatabase(),
               newPerson("Rickon", true),
               newPerson("Bran", true),
               newPerson("Arya", false),
               newPerson("Sansa", false),
           )

           fmt.Fprintf(os.Stdout, "data type: %v\n", types.TypeOf(data).Describe())
           _, err = db.CommitValue(sp.GetDataset(), data)
           if err != nil {
               fmt.Fprintf(os.Stderr, "Error committing: %s\n", err)
               os.Exit(1)
           }
       } else {
           // type assertion to convert Head to List
           personList := headValue.(types.List)
           // type assertion to convert List Value to Struct
           personStruct := personList.Get(0).(types.Struct)
           // prints: Rickon
           fmt.Fprintf(os.Stdout, "given: %v\n", personStruct.Get("given"))
       }
   }

   func newPerson(givenName string, male bool) types.Struct {
       return types.NewStruct("Person", types.StructData{
           "given": types.String(givenName),
           "male":  types.Bool(male),
       })
   }
   ```

8. You can inspect data that you've committed via the [noms command-line interface](https://github.com/attic-labs/noms/blob/master/doc/cli-tour.md). For example:

   ```shell
   noms log /path/to/store::ds
   noms show /path/to/store::ds
   ```

   > Note that Memory stores won't be inspectable because they exist only in the memory of the process that created them.

9. Implement pull and merge. The [pull API](../../go/datas/pull.go) is used to pull changes from a peer and the [merge API](../../go/merge/) is used to merge changes before commit. There's an [example of merging in the IPFS-based-chat sample app](https://github.com/attic-labs/noms/blob/master/samples/go/ipfs-chat/pubsub.go).
|
||||
[Home](../README.md) »

[Technical Overview](intro.md) | [Use Cases](../README.md#use-cases) | [Command-Line Interface](cli-tour.md) | [Go bindings Tour](go-tour.md) | [Path Syntax](spelling.md) | **FAQ**
<br><br>
# Frequently Asked Questions

### Decentralized like BitTorrent?

No, decentralized like Git.

Specifically, Noms isn't itself a peer-to-peer network. If you can get two instances to share data somehow, then they can synchronize. Noms doesn't define how this should happen, though.

Currently, instances mainly share data via either HTTP/DNS or a filesystem, but it should be easy to add other mechanisms. For example, it seems like Noms could run well on top of BitTorrent or IPFS. You should [look into it](https://github.com/attic-labs/noms/issues/2123).

### Isn't it wasteful to store every version?

Noms deduplicates chunks of data that are identical within one database. So if multiple versions of one dataset share a lot of data, or if the same data is present in multiple datasets, Noms only stores one copy.

That said, it is definitely possible to have write patterns that defeat this. Deduplication is done at the chunk level, and chunks currently have an average size of 4KB. So if you change about 1 byte in every 4096 in a single commit, and those changed bytes are well distributed throughout the dataset, then we will end up making a complete copy of the dataset.
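To see why well-distributed writes defeat chunk-level deduplication, here is a hedged sketch in plain Go. It hashes data in fixed 4KB chunks (a simplification; Noms's real chunk boundaries are content-defined) and counts how many chunk hashes two versions share:

```go
package main

import (
	"crypto/sha512"
	"fmt"
)

// chunkHashes splits data into fixed 4KB chunks and hashes each one.
// (Noms actually uses content-defined chunk boundaries; fixed-size
// chunks are a simplification for illustration.)
func chunkHashes(data []byte) map[[sha512.Size]byte]bool {
	hashes := map[[sha512.Size]byte]bool{}
	for i := 0; i < len(data); i += 4096 {
		end := i + 4096
		if end > len(data) {
			end = len(data)
		}
		hashes[sha512.Sum512(data[i:end])] = true
	}
	return hashes
}

// sharedChunks counts distinct chunk hashes present in both versions.
func sharedChunks(a, b []byte) int {
	ha, hb := chunkHashes(a), chunkHashes(b)
	n := 0
	for h := range ha {
		if hb[h] {
			n++
		}
	}
	return n
}

func main() {
	orig := make([]byte, 1<<20) // 1 MiB => 256 chunks
	edited := make([]byte, len(orig))
	copy(edited, orig)
	for i := 0; i < len(edited); i += 4096 {
		edited[i] ^= 1 // flip one byte in every 4096
	}
	// Every chunk changed, so nothing deduplicates.
	fmt.Println("shared chunks:", sharedChunks(orig, edited))
}
```

Flipping one byte per 4096 touches every chunk, so no chunk hash survives and the "new version" shares nothing with the old one.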
### Is there a way to not store the entire history?

Theoretically, definitely. Git, for example, has the concept of "shallow clones", and we could do something similar in Noms. This has not been implemented yet.

### How does Noms handle conflicts?

Noms provides several built-in policies that can automatically merge common cases of conflicts. For example, concurrent edits to sets are always mergeable, and concurrent edits to different keys in a map or struct are also mergeable.

The conflict resolution system is pluggable, so new application-specific policies can be added. However, it's possible to build surprisingly complex applications with just the built-in policies.

### Why don't you just use CRDTs?

[Convergent (or Commutative) Replicated Data Types (CRDTs)](http://hal.upmc.fr/inria-00555588/document) are a class of distributed data structures that provably converge to some agreed-upon state with no synchronization. Stated differently: CRDTs define a merge policy that is commutative over all their operations.

CRDTs are nice because they require no custom conflict/merge code from the developer.

Noms defines a set of intuitive built-in merge policies for its core datatypes. For example, the default policy makes all operations on Noms Sets commute (add wins in the case of a concurrent remove/add). This means that with the default policy, Noms Sets are a CRDT.
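To illustrate why such set merges commute, here is a hedged toy sketch in plain Go (not the Noms merge API): a three-way merge over string sets where an element survives if both sides kept it or either side newly added it, so a concurrent add beats a remove the other side never saw, and swapping the two sides doesn't change the result:

```go
package main

import (
	"fmt"
	"sort"
)

type set map[string]bool

// merge3 is a toy three-way set merge: keep an element if both sides
// have it, or if one side newly added it (absent from the base).
// Removals only stick when the other side didn't independently add
// the element, which is the "add wins" behavior described above.
func merge3(base, a, b set) set {
	out := set{}
	for _, s := range []set{a, b} {
		for x := range s {
			if !base[x] || (a[x] && b[x]) {
				out[x] = true
			}
		}
	}
	return out
}

// keys returns the sorted elements, for stable printing.
func keys(s set) []string {
	out := []string{}
	for x := range s {
		out = append(out, x)
	}
	sort.Strings(out)
	return out
}

func main() {
	base := set{"Arya": true, "Bran": true}
	a := set{"Arya": true, "Bran": true, "Sansa": true} // adds Sansa
	b := set{"Arya": true}                              // removes Bran
	fmt.Println(keys(merge3(base, a, b))) // [Arya Sansa]
	fmt.Println(keys(merge3(base, b, a))) // [Arya Sansa]
}
```

Because `merge3` is symmetric in its two sides, the two peers can apply the merge in either order and converge on the same value.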
If your application uses only operations on Noms datatypes that can be merged with whatever merge policy you are using, then your schema is a CRDT. It's possible to build surprisingly complex applications this way with just the default policy.

Noms also allows you to provide your own custom policy. If your policy commutes, then the resulting datatype will be a CRDT.

However, it would be nice if application developers could more easily opt in to using only mergeable operations, thereby enforcing that their schema is a CRDT and providing confidence that custom merge logic doesn't need to be implemented.

More generally, perhaps there could be a way to test that all possible conflict cases have been handled by the developer. This would allow developers to implement their own custom CRDTs. This is something we'd like to research in the future.

### Why don't you support Windows?

We are a tiny team, we all personally use Macs as our development machines, and we use Linux in production. These two platforms are close enough to identical that we can generally test on Mac and assume it will work on Linux. Adding Windows would add significant complexity to our code and build processes, which we're not willing to take on.

### But you'll accept patches for Windows, right?

No, because then we'd have to maintain those patches.

### Are there any workarounds for Windows?

You can use Noms in a virtual machine. We have also heard that Noms works OK under Git Bash or Cygwin, but that's coincidental.

### Why is it called Noms?

1. It's insert-only. OMNOMNOM.
2. It's content-addressed. Every value has its own hash, or [name](http://dictionary.reverso.net/french-english/nom).

### Are you sure Noms doesn't stand for something?

Pretty sure. But if you like, you can pretend it stands for Non-Mutable Store.
[Home](../README.md) »

[Technical Overview](intro.md) | [Use Cases](../README.md#use-cases) | [Command-Line Interface](cli-tour.md) | **Go bindings Tour** | [Path Syntax](spelling.md) | [FAQ](faq.md)
<br><br>
# A Short Tour of Noms for Go

This is a short introduction to using Noms from Go. It should only take a few minutes if you have some familiarity with Go.

During the tour, you can refer to the complete [Go SDK Reference](https://godoc.org/github.com/attic-labs/noms) for more information on anything you see.

## Requirements

* [Noms command-line tools](https://github.com/attic-labs/noms#setup)
* [Go v1.6+](https://golang.org/dl/)
* A correctly configured [$GOPATH](https://github.com/golang/go/wiki/GOPATH)

## Start a Local Database

Let's create a local database to play with:

```sh
> mkdir /tmp/noms-go-tour
> noms serve /tmp/noms-go-tour
```

## [Database](https://github.com/attic-labs/noms/blob/master/go/datas/database.go)

Leave the server running, and in a separate terminal:

```sh
> mkdir noms-tour
> cd noms-tour
```

Then, in your favorite editor, create a file named `noms-tour.go`. To get started with Noms, first create a Database:

```go
package main

import (
	"fmt"
	"os"

	"github.com/attic-labs/noms/go/spec"
)

func main() {
	sp, err := spec.ForDatabase("http://localhost:8000")
	if err != nil {
		fmt.Fprintf(os.Stderr, "Could not access database: %s\n", err)
		return
	}
	defer sp.Close()
}
```

Now let's run it:

```sh
> go run noms-tour.go
```

If you did not leave the server running, you will see `Could not access database` here; otherwise your program should exit cleanly.

See [Spelling in Noms](https://github.com/attic-labs/noms/blob/master/doc/spelling.md) for more information on database spec strings.

## [Dataset](https://github.com/attic-labs/noms/blob/master/go/dataset/dataset.go)

Datasets are the main interface you'll use to work with Noms. Let's update our example to use a Dataset spec string:
```go
package main

import (
	"fmt"
	"os"

	"github.com/attic-labs/noms/go/spec"
)

func main() {
	sp, err := spec.ForDataset("http://localhost:8000::people")
	if err != nil {
		fmt.Fprintf(os.Stderr, "Could not create dataset: %s\n", err)
		return
	}
	defer sp.Close()

	if _, ok := sp.GetDataset().MaybeHeadValue(); !ok {
		fmt.Fprintf(os.Stdout, "head is empty\n")
	}
}
```

Now let's run it:

```sh
> go run noms-tour.go
head is empty
```

Since the dataset does not yet have any values, you see `head is empty`. Let's add some data to make it more interesting:

```go
package main

import (
	"fmt"
	"os"

	"github.com/attic-labs/noms/go/spec"
	"github.com/attic-labs/noms/go/types"
)

func newPerson(givenName string, male bool) types.Struct {
	return types.NewStruct("Person", types.StructData{
		"given": types.String(givenName),
		"male":  types.Bool(male),
	})
}

func main() {
	sp, err := spec.ForDataset("http://localhost:8000::people")
	if err != nil {
		fmt.Fprintf(os.Stderr, "Could not create dataset: %s\n", err)
		return
	}
	defer sp.Close()

	db := sp.GetDatabase()

	data := types.NewList(db,
		newPerson("Rickon", true),
		newPerson("Bran", true),
		newPerson("Arya", false),
		newPerson("Sansa", false),
	)

	fmt.Fprintf(os.Stdout, "data type: %v\n", types.TypeOf(data).Describe())

	_, err = db.CommitValue(sp.GetDataset(), data)
	if err != nil {
		fmt.Fprintf(os.Stderr, "Error committing: %s\n", err)
	}
}
```

Now you will get output of the data type of our Dataset value:

```shell
> go run noms-tour.go
data type: List<struct {
  given: String
  male: Bool
}>
```
Now you can access the data via your program:

```go
package main

import (
	"fmt"
	"os"

	"github.com/attic-labs/noms/go/spec"
	"github.com/attic-labs/noms/go/types"
)

func main() {
	sp, err := spec.ForDataset("http://localhost:8000::people")
	if err != nil {
		fmt.Fprintf(os.Stderr, "Could not create dataset: %s\n", err)
		return
	}
	defer sp.Close()

	if headValue, ok := sp.GetDataset().MaybeHeadValue(); !ok {
		fmt.Fprintf(os.Stdout, "head is empty\n")
	} else {
		// Type assertion to convert the head value to a List.
		personList := headValue.(types.List)
		// Type assertion to convert the first List element to a Struct.
		personStruct := personList.Get(0).(types.Struct)
		// Prints: Rickon
		fmt.Fprintf(os.Stdout, "given: %v\n", personStruct.Get("given"))
	}
}
```

Running it now:

```sh
> go run noms-tour.go
given: Rickon
```

You can see this data using the command line too:

```sh
> noms ds http://localhost:8000
people

> noms show http://localhost:8000::people
struct Commit {
  meta: struct {},
  parents: set {},
  value: [ // 4 items
    struct Person {
      given: "Rickon",
      male: true,
    },
    struct Person {
      given: "Bran",
      male: true,
    },
    struct Person {
      given: "Arya",
      male: false,
    },
    struct Person {
      given: "Sansa",
      male: false,
    },
  ],
}
```
Let's add some more data:

```go
package main

import (
	"fmt"
	"os"

	"github.com/attic-labs/noms/go/spec"
	"github.com/attic-labs/noms/go/types"
)

func main() {
	sp, err := spec.ForDataset("http://localhost:8000::people")
	if err != nil {
		fmt.Fprintf(os.Stderr, "Could not create dataset: %s\n", err)
		return
	}
	defer sp.Close()

	if headValue, ok := sp.GetDataset().MaybeHeadValue(); !ok {
		fmt.Fprintf(os.Stdout, "head is empty\n")
	} else {
		// Type assertion to convert the head value to a List.
		personList := headValue.(types.List)
		personEditor := personList.Edit()
		data := personEditor.Append(
			types.NewStruct("Person", types.StructData{
				"given":  types.String("Jon"),
				"family": types.String("Snow"),
				"male":   types.Bool(true),
			}),
		).List()

		fmt.Fprintf(os.Stdout, "data type: %v\n", types.TypeOf(data).Describe())

		_, err = sp.GetDatabase().CommitValue(sp.GetDataset(), data)
		if err != nil {
			fmt.Fprintf(os.Stderr, "Error committing: %s\n", err)
		}
	}
}
```

Running this:

```sh
> go run noms-tour.go
data type: List<Struct Person {
  family?: String,
  given: String,
  male: Bool,
}>
```

Datasets are versioned. When you *commit* a new value, you aren't overwriting the old value, but adding to a historical log of values:

```sh
> noms log http://localhost:8000::people
commit ba3lvopbgcqqnofm3qk7sk4j2doroj1l
Parent: f0b1befu9jp82r1vcd4gmuhdno27uobi
(root) {
+   struct Person {
+     family: "Snow",
+     given: "Jon",
+     male: true,
+   }
  }

commit f0b1befu9jp82r1vcd4gmuhdno27uobi
Parent: hshltip9kss28uu910qadq04mhk9kuko

commit hshltip9kss28uu910qadq04mhk9kuko
Parent: None
```

## Values

Noms supports a [variety of datatypes](https://github.com/attic-labs/noms/blob/master/doc/intro.md#types) beyond the List, Struct, String, and Bool we used above.

## Samples

You can continue learning about the Noms Go SDK by looking at the documentation and by reviewing the [samples](https://github.com/attic-labs/noms/blob/master/samples/go). The [hr sample](https://github.com/attic-labs/noms/blob/master/samples/go/hr) is a more complete implementation of our example above and demonstrates further usage of the other datatypes.
[Home](../README.md) »

**Technical Overview** | [Use Cases](../README.md#use-cases) | [Command-Line Interface](cli-tour.md) | [Go bindings Tour](go-tour.md) | [Path Syntax](spelling.md) | [FAQ](faq.md)
<br><br>
# Noms Technical Overview

Most conventional database systems share two central properties:

1. Data is modeled as a single point in time. Once a transaction commits, the previous state of the database is either lost, or available only as a fallback by reconstructing from transaction logs.

2. Data is modeled as a single source of truth. Even large-scale distributed databases, which are internally a fault-tolerant network of nodes, present the abstraction to clients of being a single logical master, with which clients must coordinate in order to change state.

Noms blends the properties of decentralized systems, such as [Git](https://git-scm.com/), with properties of traditional databases in order to create a general-purpose decentralized database, in which:

1. Any peer's state is as valid as any other.

2. All commits of the database are retained and available at any time.

3. Any peer is free to move forward independently of communication from any other, while retaining the ability to reconcile changes at some point in the future.

4. The basic properties of structured databases (efficient queries, updates, and range scans) are retained.

5. Diffs between any two sets of data can be computed efficiently.

6. Synchronization between disconnected copies of the database can be performed efficiently and correctly.

## Basics

As in Git, [Bitcoin](https://bitcoin.org/en/), [Ethereum](https://www.ethereum.org/), [IPFS](https://ipfs.io/), [Camlistore](https://camlistore.org/), [bup](https://bup.github.io/), and other systems, Noms models data as a [directed acyclic graph](https://en.wikipedia.org/wiki/Directed_acyclic_graph) of nodes in which every node has a _hash_. A node's hash is derived from the values encoded in the node and (transitively) from the values encoded in all nodes which are reachable from that node.

In other words, a Noms database is a single large [Merkle DAG](https://github.com/jbenet/random-ideas/issues/20).

When two nodes have the same hash, they represent identical logical values, and the respective subgraphs of nodes reachable from each are topologically equivalent. Importantly, in Noms, the reverse is also true: a single logical value has one and only one hash. When two nodes have different hashes, they represent different logical values.
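The hashing scheme can be sketched in a few lines of plain Go (a simplification that treats a node as just a value plus child nodes; Noms's real serialization format is more involved): a node's hash covers its own value and, transitively, everything reachable from it.

```go
package main

import (
	"crypto/sha512"
	"fmt"
)

// node is a simplified Merkle DAG node: a value plus child nodes.
type node struct {
	value    string
	children []*node
}

// hash derives a node's hash from its value and its children's hashes,
// so it transitively covers everything reachable from the node.
func (n *node) hash() [sha512.Size]byte {
	h := sha512.New()
	h.Write([]byte(n.value))
	for _, c := range n.children {
		ch := c.hash()
		h.Write(ch[:])
	}
	var out [sha512.Size]byte
	copy(out[:], h.Sum(nil))
	return out
}

func main() {
	a := &node{value: "leaf"}
	t1 := &node{value: "root", children: []*node{a}}
	t2 := &node{value: "root", children: []*node{{value: "leaf"}}}
	// Structurally identical graphs hash identically...
	fmt.Println(t1.hash() == t2.hash()) // true
	// ...and changing any reachable value changes the root hash.
	a.value = "changed"
	fmt.Println(t1.hash() == t2.hash()) // false
}
```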
Noms extends the ideas of prior systems to enable efficiently computing and reconciling differences, synchronizing state, and building indexes over large-scale, structured data.

## Databases and Datasets

A _database_ is the top-level abstraction in Noms.

A database has two responsibilities: it provides storage of [content-addressed](https://en.wikipedia.org/wiki/Content-addressable_storage) chunks of data, and it keeps track of zero or more _datasets_.

A Noms database can be implemented on top of any underlying storage system that provides key/value storage with at least optimistic concurrency. Optimistic concurrency is only used to store the current value of each dataset; chunks themselves are immutable.

We have implementations of Noms databases on top of our own file-backed store, [Noms Block Store (NBS)](https://github.com/attic-labs/noms/tree/master/go/nbs) (usually used locally), our own [HTTP protocol](https://github.com/attic-labs/noms/blob/master/go/datas/database_server.go) (used for working with a remote database), [Amazon DynamoDB](https://aws.amazon.com/dynamodb/), and [memory](https://github.com/attic-labs/noms/blob/master/go/chunks/memory_store.go) (mainly used for testing).

Here's an example of creating an HTTP-backed database using the [Go Noms SDK](go-tour.md):

```go
package main

import (
	"fmt"
	"os"

	"github.com/attic-labs/noms/go/spec"
)

func main() {
	sp, err := spec.ForDatabase("http://localhost:8000")
	if err != nil {
		fmt.Fprintf(os.Stderr, "Could not access database: %s\n", err)
		return
	}
	defer sp.Close()
}
```

A dataset is nothing more than a named pointer into the DAG. Consider the following command, which copies the dataset named `foo` to the dataset named `bar` within a database:

```shell
noms sync http://localhost:8000::foo http://localhost:8000::bar
```

This command is trivial and causes essentially zero IO. Noms first resolves the dataset name `foo` in `http://localhost:8000`. This results in a hash. Noms then checks whether that hash exists in the destination database (which in this case is the same as the source database), finds that it does, and then adds a new dataset pointing at that chunk.

Syncs across databases can be efficient by the same logic if the destination database already has all or most of the required chunks.
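The "dataset is a named pointer" model can be sketched in plain Go (a toy in-memory store with hypothetical names, not the Noms API): syncing within a store just copies a hash, and a cross-store sync only transfers chunks the destination is missing.

```go
package main

import (
	"crypto/sha512"
	"fmt"
)

type hash [sha512.Size]byte

// store is a toy content-addressed chunk store plus named dataset pointers.
type store struct {
	chunks   map[hash][]byte
	datasets map[string]hash
}

func newStore() *store {
	return &store{chunks: map[hash][]byte{}, datasets: map[string]hash{}}
}

func (s *store) put(data []byte) hash {
	h := sha512.Sum512(data)
	s.chunks[h] = data
	return h
}

// sync points dst's dataset at src's root, copying only missing chunks.
// (Real Noms walks the DAG from the root hash; this toy has flat chunks.)
func sync(src *store, srcDS string, dst *store, dstDS string) (copied int) {
	root := src.datasets[srcDS]
	for h, data := range src.chunks {
		if _, ok := dst.chunks[h]; !ok {
			dst.chunks[h] = data
			copied++
		}
	}
	dst.datasets[dstDS] = root
	return copied
}

func main() {
	db := newStore()
	db.datasets["foo"] = db.put([]byte("some value"))

	// Within one store, "sync" is just a new pointer: zero chunks move.
	fmt.Println("copied:", sync(db, "foo", db, "bar"))
	fmt.Println("same root:", db.datasets["foo"] == db.datasets["bar"])
}
```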
## Time

All data in Noms is immutable. Once a piece of data is stored, it is never changed. To represent state changes, Noms uses a progression of `Commit` structures.

[TODO - diagram]

As in Git, commits typically have one _parent_, which is the previous commit in time. But in the case of merges, a Noms commit can have multiple parents.

### Chunks

When a value is stored in Noms, it is stored as one or more chunks of data. Chunk boundaries are typically created implicitly, as a way to store large collections efficiently (see [Prolly Trees](#prolly-trees-probabilistic-b-trees)). Programmers can also create explicit chunk boundaries using the `Ref` type (see [Types](#types)).

[TODO - diagram]

Every chunk encodes a single logical value (which may be a component of another value and/or be composed of sub-values). Chunks are [addressed](https://en.wikipedia.org/wiki/Content-addressable_storage) in the Noms persistence layer by the hash of the value they encode.

## Types

Noms is a typed system, meaning that every Noms value is classified into one of the following _types_:

* `Boolean`
* `Number` (arbitrary-precision binary)
* `String` (UTF-8 encoded)
* `Blob` (raw binary data)
* `Set<T>`
* `List<T>`
* `Map<K,V>`
* Unions: `T|U|V|...`
* `Ref<T>` (explicit out-of-line references)
* `Struct` (user-defined record types, e.g., `Struct Person { name: String, age?: Number }`)
* `Type` (a value that stores a Noms type)

Blobs, sets, lists, and maps can be gigantic: Noms will _chunk_ these types into reasonably sized parts internally for efficient storage, searching, and updating (see [Prolly Trees](#prolly-trees-probabilistic-b-trees) below for more on this).

Strings, numbers, unions, and structs are not chunked, and should be used for reasonably sized values. Use `Ref` if you need to force a particular value into a different chunk for some reason.

Types serve several purposes in Noms:

1. Most importantly, types make Noms data self-describing. You can call the `types.TypeOf` function on any Noms `Value`, no matter how large, and get a very precise description of the entire value and all values reachable from it. This allows software to interoperate without prior agreement or planning.

2. Users of Noms can define their own structures and publish data that uses them. This allows for ad-hoc standardization of types within communities working on similar data.

3. Types can be used _structurally_. A program can check incoming data against a required type. If the incoming root chunk matches the type, or is a superset of it, then the program can proceed with certainty about the shape of all accessible data. This enables richer interoperability between software, since schemas can be expanded over time as long as a compatible subset remains.

4. Eventually, we plan to add type restrictions to datasets, which would enforce the allowed types that can be committed to a dataset. This would allow something akin to schema validation in traditional databases.

### Refs vs Hashes

A _hash_ in Noms is just like the hashes used elsewhere in computing: a short string of bytes that uniquely identifies a larger value. Every value in Noms has a hash. Noms currently uses the [sha2-512](https://github.com/attic-labs/noms/blob/master/go/hash/hash.go#L7) hash function, but that can change in future versions of the system.

A _ref_ is different in subtle but important ways. A `Ref` is part of the type system: a `Ref` is a value. Anywhere you can find a Noms value, you can find a `Ref`. For example, you can commit a `Ref<T>` to a dataset, but you can't commit a bare hash.

The difference is that a `Ref` carries the type of its target along with the hash. This allows us to efficiently validate commits that include `Ref`s, among other things.

### Type Accretion

Noms is an immutable database, which leads to the question: how do you change the schema? If I have a dataset containing `Set<Number>`, and I later decide that it should be `Set<String>`, what do I do?

You might say that you just commit the new type, but that would mean that users couldn't look at a dataset and understand what types previous versions contained without manually exploring every one of those commits.

We call our solution to this problem _Type Accretion_.

If you construct a `Set` containing only `Number`s, its type will be `Set<Number>`. If you then insert a string into this set, the type of the resulting value is `Set<Number|String>`.

This is usually completely implicit, determined by the data you store (though you can also set types explicitly, which is useful in some cases).
We do the same thing for datasets. If you commit a `Set<Number>`, the type of the commit we create for you is:

```go
Struct Commit {
	Value: Set<Number>
	Parents: Set<Ref<Cycle<Commit>>>
}
```

This tells you that the current and all previous commits have values of type `Set<Number>`.

But if you then commit a `Set<String>` to this same dataset, the type of that commit will be:

```go
Struct Commit {
	Value: Set<String>
	Parents: Set<Ref<Cycle<Commit>> |
	         Ref<Struct Commit {
	         	Value: Set<Number>
	         	Parents: Set<Ref<Cycle<Commit>>>
	         }>>
}
```

This tells you that the dataset's current commit has a value of type `Set<String>`, and that previous commits either are the same or have a value of type `Set<Number>`.

Type accretion has a number of benefits related to schema changes:

1. You can widen the type of any container (list, set, map) without rewriting any existing data. `Set<Struct { name: String }>` becomes `Set<Struct { name: String } | Struct { name: String, age: Number }>` and all existing data is reused.
2. You can widen containers in ways that other databases wouldn't allow. For example, you can go from `Set<Number>` to `Set<Number|String>`. Existing data is still reused.
3. You can change the type of a dataset in either direction, widening or narrowing it, and the dataset remains self-documenting as to its current and previous types.

## Prolly Trees: Probabilistic B-Trees

A critical invariant of Noms is that the same value is represented by the same graph, having the same chunk boundaries, regardless of what past sequence of logical mutations produced the value. This is the essence of content addressing, and it is what makes deduplication, efficient sync, indexing, and other features of Noms possible.

But this invariant also rules out the use of classical B-Trees, because a B-Tree's internal state depends upon its mutation history. In order to model large mutable collections in Noms, of the type where B-Trees would typically be used, Noms instead introduces _Prolly Trees_.

A Prolly Tree is a [search tree](https://en.wikipedia.org/wiki/Search_tree) where the number of values stored in each node is determined probabilistically, based on the data which is stored in the tree.

A Prolly Tree is similar in many ways to a B-Tree, except that the number of values in each node has a probabilistic average rather than an enforced upper and lower bound, and the set of values in each node is determined by the output of a rolling hash function over the values, rather than via split and join operations when upper and lower bounds are exceeded.
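A hedged sketch of the boundary rule in plain Go (a toy condition hashing one value at a time with FNV, not Noms's actual rolling hash or window): a value ends a node whenever its hash falls below a threshold, so node boundaries depend only on content, never on mutation history.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// boundary reports whether a value should end the current tree node.
// Toy rule: end the node on roughly 1 of every 4 values, giving a
// probabilistic average node size of 4.
func boundary(value string) bool {
	h := fnv.New32a()
	h.Write([]byte(value))
	return h.Sum32()%4 == 0
}

// chunk groups an ordered sequence of values into nodes.
func chunk(values []string) [][]string {
	nodes := [][]string{}
	cur := []string{}
	for _, v := range values {
		cur = append(cur, v)
		if boundary(v) {
			nodes = append(nodes, cur)
			cur = []string{}
		}
	}
	if len(cur) > 0 {
		nodes = append(nodes, cur)
	}
	return nodes
}

func main() {
	values := []string{"a", "b", "c", "d", "e", "f", "g", "h"}
	fmt.Println(chunk(values))
	// Because each boundary here depends only on the value at that
	// position, an insertion at the front only perturbs chunking near
	// the insertion point; nodes after the next boundary are unchanged.
	fmt.Println(chunk(append([]string{"x"}, values...)))
}
```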
### Indexing and Searching with Prolly Trees

Like B-Trees, Prolly Trees are sorted. Keys of type Boolean, Number, and String sort in their natural order. Other types sort by their hash.

Because of this sorting, Noms collections can be used as efficient indexes, in the same manner as primary and secondary indexes in traditional databases.

For example, say you want to quickly find `Person` structs by their age. You could build a map of type `Map<Number, Set<Person>>`. This would allow you to quickly (~log<sub>k</sub>(n) seeks, where `k` is the average Prolly Tree width, which is currently 64) find all the people of an exact age. But it would _also_ allow you to find all people within a range of ages efficiently (~num_results/log<sub>k</sub>(n) seeks), even if the ages are non-integral.
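The same index pattern works over any sorted structure. Here's a hedged sketch in plain Go using a sorted slice and binary search in place of a Prolly Tree (`sort.Search` standing in for the tree seek), with hypothetical field names:

```go
package main

import (
	"fmt"
	"sort"
)

type person struct {
	given string
	age   float64
}

// byAge is a toy age index: entries kept sorted by age, so a range
// lookup is one binary search plus a sequential scan.
type byAge []person

func (ix byAge) inRange(lo, hi float64) []string {
	// Seek to the first entry with age >= lo (the "tree seek").
	i := sort.Search(len(ix), func(i int) bool { return ix[i].age >= lo })
	names := []string{}
	for ; i < len(ix) && ix[i].age <= hi; i++ {
		names = append(names, ix[i].given)
	}
	return names
}

func main() {
	// Already sorted by age; note the non-integral key works fine.
	ix := byAge{
		{"Rickon", 6}, {"Bran", 10}, {"Arya", 11.5}, {"Sansa", 13},
	}
	fmt.Println(ix.inRange(10, 12)) // [Bran Arya]
}
```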
Also, because Noms collections are ordered search trees, it is possible to implement set operations like union and intersection efficiently on them.

So, for example, if you wanted to find all the people of a particular age AND having a particular hair color, you could construct a second map of type `Map<String, Set<Person>>` and intersect the two sets.

Over time, we plan to develop this basic capability into support for some kind of generalized query system.
[Home](../README.md) »

[Technical Overview](intro.md) | [Use Cases](../README.md#use-cases) | [Command-Line Interface](cli-tour.md) | [Go bindings Tour](go-tour.md) | **Path Syntax** | [FAQ](faq.md)
<br><br>
# Spelling in Noms

Many commands and APIs in Noms accept database, dataset, or value specifications as arguments. This document describes how to construct these specifications.

## Spelling Databases

Database specifications take the form:

```nohighlight
<protocol>[:<path>]
```

The `path` part of the name is interpreted differently depending on the protocol:

- **http(s)** specs describe a remote database to be accessed over HTTP. In this case, the entire database spec is a normal http(s) URL. For example: `https://dev.noms.io/aa`.
- **mem** specs describe an ephemeral memory-backed database. In this case, the path component is not used and must be empty.
- **nbs** specs describe a local [Noms Block Store (NBS)](https://github.com/attic-labs/noms/tree/master/go/nbs)-backed database. In this case, the path component should be a relative or absolute path on disk to a directory in which to store the data, e.g. `nbs:/tmp/noms-data`.
  - In Go, `nbs:` can be omitted (just `/tmp/noms-data` will work).
- **aws** specs describe a remote Noms Block Store backed directly by Amazon Web Services, specifically DynamoDB and S3. The format is a URI containing the names of the DynamoDB table to use, the S3 bucket to use, and the database to serve. For example: `aws://dynamo-table:s3-bucket/database`.

## Spelling Datasets

Dataset specifications take the form:

```nohighlight
<database>::<dataset>
```

See [spelling databases](#spelling-databases) for how to build the `database` part of the name. The `dataset` part is any string matching the regex `^[a-zA-Z0-9\-_/]+$`.
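A hedged sketch in Go of how a dataset spec could be split and validated, using the `::` separator and the dataset-name regex above (a toy parser with a hypothetical function name, not the real `spec` package):

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var datasetRe = regexp.MustCompile(`^[a-zA-Z0-9\-_/]+$`)

// splitDatasetSpec splits "<database>::<dataset>" and validates the
// dataset name against the documented regex.
func splitDatasetSpec(s string) (db, ds string, err error) {
	i := strings.LastIndex(s, "::")
	if i < 0 {
		return "", "", fmt.Errorf("missing '::' in dataset spec %q", s)
	}
	db, ds = s[:i], s[i+2:]
	if !datasetRe.MatchString(ds) {
		return "", "", fmt.Errorf("invalid dataset name %q", ds)
	}
	return db, ds, nil
}

func main() {
	db, ds, err := splitDatasetSpec("http://localhost:8000::registered-businesses")
	fmt.Println(db, ds, err) // http://localhost:8000 registered-businesses <nil>
}
```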
Example datasets:

```nohighlight
/tmp/test-db::my-dataset
nbs:/tmp/test-db::my-dataset
http://localhost:8000::registered-businesses
https://demo.noms.io/aa::music
```

## Spelling Values

Value specifications take the form:

```nohighlight
<database>::<root><path>
```

See [spelling databases](#spelling-databases) for how to build the `database` part of the name.

The `root` part can be either a hash or a dataset name. If `root` begins with `#`, it is interpreted as a hash; otherwise it is used as a dataset name. See [spelling datasets](#spelling-datasets) for how to build the `dataset` part of the name.

The `path` part is relative to the `root` provided.
|
||||
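The `<database>::<root><path>` shape can be sketched as a simple split. This is a hypothetical illustration only (the real parser lives in noms' `go/spec` package); `parseValueSpec` assumes the database portion ends at the last `::` and the path begins at the first `.` or `[` after the root:

```go
package main

import (
	"fmt"
	"strings"
)

// parseValueSpec splits a value spec into database, root, and path,
// per the shape described above. Illustrative sketch only.
func parseValueSpec(s string) (db, root, path string) {
	// The database is everything before the last "::".
	i := strings.LastIndex(s, "::")
	if i < 0 {
		return "", "", ""
	}
	db, rest := s[:i], s[i+2:]
	// The root ends where the path starts: at the first "." or "[".
	j := strings.IndexAny(rest, ".[")
	if j < 0 {
		return db, rest, ""
	}
	return db, rest[:j], rest[j:]
}

func main() {
	db, root, path := parseValueSpec("http://localhost:8000::dataset.value[42]")
	fmt.Println(db)   // http://localhost:8000
	fmt.Println(root) // dataset
	fmt.Println(path) // .value[42]
}
```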
### Specifying Struct Fields

Elements of a Noms struct can be referenced using a period (`.`).

For example, if the `root` is a dataset, one can use `.value` to get the root of the data in the dataset. In this case `.value` selects the `value` field from the `Commit` struct at the top of the dataset. One could instead use `.meta` to select the `meta` struct from the `Commit` struct. The `root` does not need to be a dataset, though: if it is a hash that references a struct, the same notation still works: `#o38hugtf3l1e8rqtj89mijj1dq57eh4m.field`.
### Specifying Collection Values

Elements of a Noms list, map, or set can be retrieved using brackets (`[...]`).

For example, if the dataset is a Noms map of number to struct, then one could use `.value[42]` to get the Noms struct associated with the key 42. Similarly, selecting the first element from a Noms list would be `.value[0]`. If the Noms map were keyed by string, then `.value["0000024-02-999"]` would reference the Noms struct associated with the key "0000024-02-999".

Noms lists also support indexing from the back: `.value[-1]` means the last element of a list, `.value[-2]` the second-to-last, and so on.
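Negative indices reduce to ordinary offsets. A minimal sketch (the `normalizeIndex` helper is illustrative, not part of noms):

```go
package main

import "fmt"

// normalizeIndex maps a possibly-negative list index to an absolute
// offset: -1 is the last element, -2 the second-to-last, and so on.
// The second return value is false if the index is out of range.
func normalizeIndex(i, length int) (int, bool) {
	if i < 0 {
		i += length
	}
	if i < 0 || i >= length {
		return 0, false
	}
	return i, true
}

func main() {
	list := []string{"a", "b", "c"}
	if i, ok := normalizeIndex(-1, len(list)); ok {
		fmt.Println(list[i]) // c
	}
}
```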
If the key of a Noms map or set is a Noms struct or another complex value, then you can index into the collection using the hash of that value. For example, if the `root` of our dataset is a Noms set of Noms structs, you can provide the hash of a struct element between the brackets described above, e.g. `http://localhost:8000::dataset.value[#o38hugtf3l1e8rqtj89mijj1dq57eh4m].field`.

Similarly, the key is addressable using `@key` syntax. One use for this is when you have the hash of a complex value but need to retrieve the key (rather than, or in addition to, the value) in a Noms map. The syntax is to append `@key` after the closing bracket of the index specifier, e.g. `http://localhost:8000::dataset.value[#o38hugtf3l1e8rqtj89mijj1dq57eh4m]@key` would retrieve the key identified by the hash `#o38hugtf3l1e8rqtj89mijj1dq57eh4m` from the `dataset.value` collection.

### Specifying Collection Positions

Elements of a Noms list, map, or set can be retrieved _by their position_ using the `@at(index)` annotation.

For lists, this is exactly equivalent to `[index]`. For sets and maps, note that Noms has a stable ordering, so `@at(0)` will always return the smallest element, `@at(1)` the second smallest, and so on. `@at(-1)` will return the largest. For maps, adding the `@key` annotation retrieves the key of the map entry instead of the value.
### Examples

```sh
# “sf-registered-business” dataset at https://demo.noms.io/cli-tour
https://demo.noms.io/cli-tour::sf-registered-business

# value o38hugtf3l1e8rqtj89mijj1dq57eh4m at https://localhost:8000
https://localhost:8000/monkey::#o38hugtf3l1e8rqtj89mijj1dq57eh4m

# “bonk” dataset at /foo/bar
/foo/bar::bonk

# from https://demo.noms.io/cli-tour, select the "sf-registered-business" dataset,
# the root value is a Noms map, select the value of the Noms map identified by string
# key "0000024-02-999", then from that resulting struct select the Ownership_Name field
https://demo.noms.io/cli-tour::sf-registered-business.value["0000024-02-999"].Ownership_Name
```

Be careful with shell escaping. Your shell might require escaping the double quotes and other characters, or using single quotes around the entire argument, e.g.:

```sh
> noms show https://demo.noms.io/cli-tour::sf-registered-business.value["0000024-02-999"].Ownership_Name
error: Invalid index: 0000024-02-999

> noms show https://demo.noms.io/cli-tour::sf-registered-business.value[\"0000024-02-999\"].Ownership_Name
"EASTMAN KODAK CO"

> noms show 'https://demo.noms.io/cli-tour::sf-registered-business.value["0000024-02-999"].Ownership_Name'
"EASTMAN KODAK CO"
```
@@ -1,15 +0,0 @@
# Default database URL to be used whenever a database is not explicitly provided
[db.default]
url = "ldb:.noms/tour" # This path is relative to the location of .nomsconfig

# DB alias named `origin` that refers to the remote cli-tour db
[db.origin]
url = "http://demo.noms.io/cli-tour"

# DB alias named `temp` that refers to a noms db stored under /tmp
[db.temp]
url = "ldb:/tmp/noms/shared"

# DB alias named `http` that refers to the local http db
[db.http]
url = "http://localhost:8000"
@@ -1,75 +0,0 @@
# nomsconfig

The noms cli now provides experimental support for configuring a convenient default database and database aliases.

You can enable this support by placing a *.nomsconfig* config file (like the [one](.nomsconfig) in this sample) in the directory where you'd like to use the configuration. Like git, any noms command issued from that directory or below will use it.

# Features

- *Database Aliases* - Define simple names to be used in place of database URLs
- *Default Database* - Define one database to be used by default when no database is mentioned
- *Dot (`.`) Shorthand* - Use `.` instead of repeating the dataset/object name in the destination

# Example

This example defines a simple [.nomsconfig](.nomsconfig) to try:

```shell
# Default database URL to be used whenever a database is not explicitly provided
[db.default]
url = "ldb:.noms/tour"

# DB alias named `origin` that refers to the remote cli-tour db
[db.origin]
url = "http://demo.noms.io/cli-tour"

# DB alias named `temp` that refers to a noms db stored under /tmp
[db.temp]
url = "ldb:/tmp/noms/shared"
```

The *[db.default]* section:

- Defines a default database
- It will be used implicitly whenever a database url is omitted in a command

The *[db.origin]* and *[db.temp]* sections:

- Define aliases that can be used wherever a db url is required
- You can define additional aliases by adding *[db.**alias**]* sections using any **alias** you prefer

Dot (`.`) shorthand:

- When issuing a command that requires a source and destination (like `noms sync`),
  you can use `.` in place of the dataset/object in the destination. This is shorthand
  that repeats whatever was used in the source (see below).
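The alias lookup behind this can be sketched as a simple map resolution. This is a hypothetical illustration (the real logic lives in noms' `go/config` package): a name defined in *.nomsconfig* expands to its url, an empty spec falls back to the default, and anything else is assumed to already be a database URL:

```go
package main

import "fmt"

// resolveDB is an illustrative sketch of alias resolution, not the
// actual config package: known aliases expand to their urls, an empty
// name falls back to db.default, and unknown names pass through.
func resolveDB(aliases map[string]string, name string) string {
	if name == "" {
		return aliases["default"]
	}
	if url, ok := aliases[name]; ok {
		return url
	}
	return name
}

func main() {
	aliases := map[string]string{
		"default": "ldb:.noms/tour",
		"origin":  "http://demo.noms.io/cli-tour",
	}
	fmt.Println(resolveDB(aliases, ""))           // ldb:.noms/tour
	fmt.Println(resolveDB(aliases, "origin"))     // http://demo.noms.io/cli-tour
	fmt.Println(resolveDB(aliases, "ldb:/tmp/x")) // ldb:/tmp/x
}
```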
You can kick the tires by running noms commands from this directory. Here are some examples and what to expect:

```shell
noms ds          # -> noms ds ldb:.noms/tour
noms ds default  # -> noms ds ldb:.noms/tour
noms ds origin   # -> noms ds http://demo.noms.io/cli-tour

noms sync origin::sf-film-locations sf-films  # sync ds from origin to default

noms log sf-films                   # -> noms log ldb:.noms/tour::sf-films
noms log origin::sf-film-locations  # -> noms log http://demo.noms.io/cli-tour::sf-film-locations

noms show '#1a2aj8svslsu7g8hplsva6oq6iq3ib6c'          # -> noms show ldb:.noms/tour::'#1a2a...'
noms show origin::'#1a2aj8svslsu7g8hplsva6oq6iq3ib6c'  # -> noms show http://demo.noms.io/cli-tour::'#1a2a...'

noms diff '#1a2aj8svslsu7g8hplsva6oq6iq3ib6c' origin::.  # diff default::object with origin::object

noms sync origin::sf-bike-parking .  # sync origin::sf-bike-parking to default::sf-bike-parking
```

A few more things to note:

- Relative paths will be expanded relative to the directory where the *.nomsconfig* is defined
- Use `noms config` to see the current alias definitions with expanded paths
- Use `-v` or `--verbose` on any command to see how the command arguments are being resolved
- Explicit DB urls are still fully supported
@@ -1 +0,0 @@
counter
@@ -1,52 +0,0 @@
// Copyright 2016 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package main

import (
	"fmt"
	"os"

	"github.com/attic-labs/noms/go/config"
	"github.com/attic-labs/noms/go/types"
	"github.com/attic-labs/noms/go/util/verbose"
	flag "github.com/juju/gnuflag"
)

func main() {
	flag.Usage = func() {
		fmt.Fprintf(os.Stderr, "usage: %s [options] <dataset>\n", os.Args[0])
		flag.PrintDefaults()
	}

	verbose.RegisterVerboseFlags(flag.CommandLine)

	flag.Parse(true)

	if flag.NArg() != 1 {
		fmt.Fprintln(os.Stderr, "Missing required dataset argument")
		return
	}

	cfg := config.NewResolver()
	db, ds, err := cfg.GetDataset(flag.Arg(0))
	if err != nil {
		fmt.Fprintf(os.Stderr, "Could not create dataset: %s\n", err)
		return
	}
	defer db.Close()

	newVal := uint64(1)
	if lastVal, ok := ds.MaybeHeadValue(); ok {
		newVal = uint64(lastVal.(types.Float)) + 1
	}

	_, err = db.CommitValue(ds, types.Float(newVal))
	if err != nil {
		fmt.Fprintf(os.Stderr, "Error committing: %s\n", err)
		return
	}

	fmt.Println(newVal)
}
@@ -1,35 +0,0 @@
// Copyright 2016 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package main

import (
	"testing"

	"github.com/attic-labs/noms/go/spec"
	"github.com/attic-labs/noms/go/util/clienttest"
	"github.com/stretchr/testify/suite"
)

func TestCounter(t *testing.T) {
	suite.Run(t, &counterTestSuite{})
}

type counterTestSuite struct {
	clienttest.ClientTestSuite
}

func (s *counterTestSuite) TestCounter() {
	spec := spec.CreateValueSpecString("nbs", s.DBDir, "counter")
	args := []string{spec}
	stdout, stderr := s.MustRun(main, args)
	s.Equal("1\n", stdout)
	s.Equal("", stderr)
	stdout, stderr = s.MustRun(main, args)
	s.Equal("2\n", stdout)
	s.Equal("", stderr)
	stdout, stderr = s.MustRun(main, args)
	s.Equal("3\n", stdout)
	s.Equal("", stderr)
}
@@ -1,28 +0,0 @@
# CSV Importer

Imports a CSV file as `List<T>` where `T` is a struct with fields corresponding to the CSV's column headers. The struct spec can also be set manually with the `-header` flag.

## Usage

```shell
$ cd csv-import
$ go build
$ ./csv-import <PATH> http://localhost:8000::foo
```

## Some places for CSV files

- https://data.cityofnewyork.us/api/views/kku6-nxdu/rows.csv?accessType=DOWNLOAD
- http://www.opendatacache.com/

# CSV Exporter

Export a dataset in CSV format to stdout with column headers.

## Usage

```shell
$ cd csv-export
$ go build
$ ./csv-export http://localhost:8000::foo
```
@@ -1,27 +0,0 @@
// Copyright 2016 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package csv

import (
	"fmt"
	"unicode/utf8"
)

// StringToRune returns the rune contained in delimiter or an error.
func StringToRune(delimiter string) (rune, error) {
	dlimLen := len(delimiter)
	if dlimLen == 0 {
		return 0, fmt.Errorf("delimiter flag must contain exactly one character (rune), not an empty string")
	}

	d, runeSize := utf8.DecodeRuneInString(delimiter)
	if d == utf8.RuneError {
		return 0, fmt.Errorf("Invalid utf8 string in delimiter flag: %s", delimiter)
	}
	if dlimLen != runeSize {
		return 0, fmt.Errorf("delimiter flag is too long. It must contain exactly one character (rune), but instead it is: %s", delimiter)
	}
	return d, nil
}
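The validation above can be exercised standalone. A minimal sketch that inlines the same checks (`singleRune` is a stand-in for `StringToRune`, not the csv package itself): the string must decode to exactly one valid UTF-8 rune that spans the whole string, so a multi-byte character like `§` is accepted while two ASCII characters are not.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// singleRune mirrors StringToRune's checks: reject empty strings,
// invalid UTF-8, and strings longer than one rune.
func singleRune(s string) (rune, bool) {
	if len(s) == 0 {
		return 0, false
	}
	r, size := utf8.DecodeRuneInString(s)
	if r == utf8.RuneError || size != len(s) {
		return 0, false
	}
	return r, true
}

func main() {
	r, ok := singleRune("§") // multi-byte, but a single rune
	fmt.Println(r, ok)
	_, ok = singleRune(",,") // two runes: rejected
	fmt.Println(ok)
}
```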
@@ -1 +0,0 @@
csv-analyze
@@ -1,82 +0,0 @@
// Copyright 2016 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package main

import (
	"fmt"
	"io"
	"os"
	"strings"

	"github.com/attic-labs/noms/go/d"
	"github.com/attic-labs/noms/go/types"
	"github.com/attic-labs/noms/go/util/profile"
	"github.com/attic-labs/noms/samples/go/csv"
	flag "github.com/juju/gnuflag"
)

func main() {
	// Actually the delimiter uses runes, which can be multiple characters long.
	// https://blog.golang.org/strings
	delimiter := flag.String("delimiter", ",", "field delimiter for csv file, must be exactly one character long.")
	header := flag.String("header", "", "header row. If empty, we'll use the first row of the file")
	skipRecords := flag.Uint("skip-records", 0, "number of records to skip at beginning of file")
	detectColumnTypes := flag.Bool("detect-column-types", false, "detect column types by analyzing a portion of csv file")
	detectPrimaryKeys := flag.Bool("detect-pk", false, "detect primary key candidates by analyzing a portion of csv file")
	numSamples := flag.Int("num-samples", 1000000, "number of records to use for samples")
	numFieldsInPK := flag.Int("num-fields-pk", 3, "maximum number of columns to consider when detecting PKs")

	profile.RegisterProfileFlags(flag.CommandLine)

	flag.Usage = func() {
		fmt.Fprintf(os.Stderr, "Usage: csv-analyze [options] <csvfile>\n\n")
		flag.PrintDefaults()
	}

	flag.Parse(true)

	if flag.NArg() != 1 {
		flag.Usage()
		return
	}

	defer profile.MaybeStartProfile().Stop()

	var r io.Reader
	var filePath string

	filePath = flag.Arg(0)
	res, err := os.Open(filePath)
	d.CheckError(err)
	defer res.Close()
	r = res

	comma, err := csv.StringToRune(*delimiter)
	d.CheckErrorNoUsage(err)

	cr := csv.NewCSVReader(r, comma)
	csv.SkipRecords(cr, *skipRecords)

	var headers []string
	if *header == "" {
		headers, err = cr.Read()
		d.PanicIfError(err)
	} else {
		headers = strings.Split(*header, string(comma))
	}

	kinds := []types.NomsKind{}
	if *detectColumnTypes {
		kinds = csv.GetSchema(cr, *numSamples, len(headers))
		fmt.Fprintf(os.Stdout, "%s\n", strings.Join(csv.KindsToStrings(kinds), ","))
	}

	if *detectPrimaryKeys {
		pks := csv.FindPrimaryKeys(cr, *numSamples, *numFieldsInPK, len(headers))
		for _, pk := range pks {
			fmt.Fprintf(os.Stdout, "%s\n", strings.Join(csv.GetFieldNamesFromIndices(headers, pk), ","))
		}
	}
}
@@ -1,93 +0,0 @@
// Copyright 2016 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package main

import (
	"fmt"
	"io"
	"io/ioutil"
	"os"
	"testing"

	"github.com/attic-labs/noms/go/d"
	"github.com/attic-labs/noms/go/util/clienttest"
	"github.com/stretchr/testify/suite"
)

func TestCSVAnalyze(t *testing.T) {
	suite.Run(t, &csvAnalyzeTestSuite{})
}

type csvAnalyzeTestSuite struct {
	clienttest.ClientTestSuite
	tmpFileName string
}

func writeSampleData(w io.Writer) {
	_, err := io.WriteString(w, "Date,Time,Temperature\n")
	d.Chk.NoError(err)

	// 30 samples of String,String,Number
	for i := 0; i < 30; i++ {
		_, err = io.WriteString(w, fmt.Sprintf("08/14/2016,12:%d,73.4%d\n", i, i))
		d.Chk.NoError(err)
	}

	// an extra sample of String,String,String, so that type detection with smaller samples only finds Number
	_, err = io.WriteString(w, fmt.Sprintf("08/14/2016,13:01,none\n"))
	d.Chk.NoError(err)

	// an extra sample with a duplicate Date,Temperature, so that pk detection rules that pair out (with smaller samples)
	_, err = io.WriteString(w, fmt.Sprintf("08/14/2016,13:02,73.42\n"))
	d.Chk.NoError(err)
}

func (s *csvAnalyzeTestSuite) SetupTest() {
	input, err := ioutil.TempFile(s.TempDir, "")
	d.Chk.NoError(err)
	s.tmpFileName = input.Name()
	writeSampleData(input)
	defer input.Close()
}

func (s *csvAnalyzeTestSuite) TearDownTest() {
	os.Remove(s.tmpFileName)
}

func (s *csvAnalyzeTestSuite) TestCSVAnalyzeDetectColumnTypes() {
	stdout, stderr := s.MustRun(main, []string{"--detect-column-types=1", s.tmpFileName})
	s.Equal("String,String,String\n", stdout)
	s.Equal("", stderr)
}

func (s *csvAnalyzeTestSuite) TestCSVAnalyzeDetectColumnTypesSamples20() {
	stdout, stderr := s.MustRun(main, []string{"--detect-column-types=1", "--num-samples=20", s.tmpFileName})
	s.Equal("String,String,Number\n", stdout)
	s.Equal("", stderr)
}

func (s *csvAnalyzeTestSuite) TestCSVAnalyzeDetectPrimaryKeys() {
	stdout, stderr := s.MustRun(main, []string{"--detect-pk=1", s.tmpFileName})
	s.Equal("Time\nDate,Time\nTime,Temperature\nDate,Time,Temperature\n", stdout)
	s.Equal("", stderr)
}

func (s *csvAnalyzeTestSuite) TestCSVAnalyzeDetectPrimaryKeysSamples20() {
	stdout, stderr := s.MustRun(main, []string{"--detect-pk=1", "--num-samples=20", s.tmpFileName})
	s.Equal("Time\nTemperature\nDate,Time\nDate,Temperature\nTime,Temperature\nDate,Time,Temperature\n", stdout)
	s.Equal("", stderr)
}

func (s *csvAnalyzeTestSuite) TestCSVAnalyzeDetectPrimaryKeysSingleField() {
	stdout, stderr := s.MustRun(main, []string{"--detect-pk=1", "--num-fields-pk=1", s.tmpFileName})
	s.Equal("Time\n", stdout)
	s.Equal("", stderr)
}

func (s *csvAnalyzeTestSuite) TestCSVAnalyzeDetectPrimaryKeysTwoFields() {
	stdout, stderr := s.MustRun(main, []string{"--detect-pk=1", "--num-fields-pk=2", s.tmpFileName})
	s.Equal("Time\nDate,Time\nTime,Temperature\n", stdout)
	s.Equal("", stderr)
}
@@ -1 +0,0 @@
csv-export
@@ -1,67 +0,0 @@
// Copyright 2016 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package main

import (
	"errors"
	"fmt"
	"os"

	"github.com/attic-labs/noms/go/config"
	"github.com/attic-labs/noms/go/d"
	"github.com/attic-labs/noms/go/types"
	"github.com/attic-labs/noms/go/util/profile"
	"github.com/attic-labs/noms/go/util/verbose"
	"github.com/attic-labs/noms/samples/go/csv"
	flag "github.com/juju/gnuflag"
)

func main() {
	// Actually the delimiter uses runes, which can be multiple characters long.
	// https://blog.golang.org/strings
	delimiter := flag.String("delimiter", ",", "field delimiter for csv file, must be exactly one character long.")

	verbose.RegisterVerboseFlags(flag.CommandLine)
	profile.RegisterProfileFlags(flag.CommandLine)

	flag.Usage = func() {
		fmt.Fprintln(os.Stderr, "Usage: csv-export [options] dataset > filename")
		flag.PrintDefaults()
	}

	flag.Parse(true)

	if flag.NArg() != 1 {
		d.CheckError(errors.New("expected dataset arg"))
	}

	cfg := config.NewResolver()
	db, ds, err := cfg.GetDataset(flag.Arg(0))
	d.CheckError(err)

	defer db.Close()

	comma, err := csv.StringToRune(*delimiter)
	d.CheckError(err)

	err = d.Try(func() {
		defer profile.MaybeStartProfile().Stop()

		hv := ds.HeadValue()
		if l, ok := hv.(types.List); ok {
			structDesc := csv.GetListElemDesc(l, db)
			csv.WriteList(l, structDesc, comma, os.Stdout)
		} else if m, ok := hv.(types.Map); ok {
			structDesc := csv.GetMapElemDesc(m, db)
			csv.WriteMap(m, structDesc, comma, os.Stdout)
		} else {
			panic(fmt.Sprintf("Expected ListKind or MapKind, found %s", hv.Kind()))
		}
	})
	if err != nil {
		fmt.Println("Failed to export dataset as CSV:")
		fmt.Println(err)
	}
}
@@ -1,126 +0,0 @@
// Copyright 2016 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package main

import (
	"encoding/csv"
	"io"
	"strings"
	"testing"

	"github.com/attic-labs/noms/go/d"
	"github.com/attic-labs/noms/go/datas"
	"github.com/attic-labs/noms/go/nbs"
	"github.com/attic-labs/noms/go/spec"
	"github.com/attic-labs/noms/go/types"
	"github.com/attic-labs/noms/go/util/clienttest"
	"github.com/stretchr/testify/suite"
)

func TestCSVExporter(t *testing.T) {
	suite.Run(t, &testSuite{})
}

type testSuite struct {
	clienttest.ClientTestSuite
	header  []string
	payload [][]string
}

func (s *testSuite) createTestData(buildAsMap bool) []types.Value {
	s.header = []string{"a", "b", "c"}
	structName := "SomeStruct"
	s.payload = [][]string{
		{"4", "10", "255"},
		{"5", "7", "100"},
		{"512", "12", "55"},
	}

	sliceLen := len(s.payload)
	if buildAsMap {
		sliceLen *= 2
	}

	structs := make([]types.Value, sliceLen)
	for i, row := range s.payload {
		fields := make(types.ValueSlice, len(s.header))
		for j, v := range row {
			fields[j] = types.String(v)
		}
		if buildAsMap {
			structs[i*2] = fields[0]
			structs[i*2+1] = types.NewStruct(structName, types.StructData{
				"a": fields[0],
				"b": fields[1],
				"c": fields[2],
			})
		} else {
			structs[i] = types.NewStruct(structName, types.StructData{
				"a": fields[0],
				"b": fields[1],
				"c": fields[2],
			})
		}
	}
	return structs
}

func verifyOutput(s *testSuite, stdout string) {
	csvReader := csv.NewReader(strings.NewReader(stdout))

	row, err := csvReader.Read()
	d.Chk.NoError(err)
	s.Equal(s.header, row)

	for i := 0; i < len(s.payload); i++ {
		row, err := csvReader.Read()
		d.Chk.NoError(err)
		s.Equal(s.payload[i], row)
	}

	_, err = csvReader.Read()
	s.Equal(io.EOF, err)
}

// FIXME: run with pipe
func (s *testSuite) TestCSVExportFromList() {
	setName := "csvlist"

	// Setup data store
	db := datas.NewDatabase(nbs.NewLocalStore(s.DBDir, clienttest.DefaultMemTableSize))
	ds := db.GetDataset(setName)

	// Build data rows
	structs := s.createTestData(false)
	db.CommitValue(ds, types.NewList(db, structs...))
	db.Close()

	// Run exporter
	dataspec := spec.CreateValueSpecString("nbs", s.DBDir, setName)
	stdout, stderr := s.MustRun(main, []string{dataspec})
	s.Equal("", stderr)

	verifyOutput(s, stdout)
}

func (s *testSuite) TestCSVExportFromMap() {
	setName := "csvmap"

	// Setup data store
	db := datas.NewDatabase(nbs.NewLocalStore(s.DBDir, clienttest.DefaultMemTableSize))
	ds := db.GetDataset(setName)

	// Build data rows
	structs := s.createTestData(true)
	db.CommitValue(ds, types.NewMap(db, structs...))
	db.Close()

	// Run exporter
	dataspec := spec.CreateValueSpecString("nbs", s.DBDir, setName)
	stdout, stderr := s.MustRun(main, []string{dataspec})
	s.Equal("", stderr)

	verifyOutput(s, stdout)
}
@@ -1 +0,0 @@
csv-import
@@ -1,258 +0,0 @@
// Copyright 2016 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package main

import (
	"errors"
	"fmt"
	"io"
	"math"
	"os"
	"strings"
	"time"

	"github.com/attic-labs/noms/go/config"
	"github.com/attic-labs/noms/go/d"
	"github.com/attic-labs/noms/go/datas"
	"github.com/attic-labs/noms/go/spec"
	"github.com/attic-labs/noms/go/types"
	"github.com/attic-labs/noms/go/util/profile"
	"github.com/attic-labs/noms/go/util/progressreader"
	"github.com/attic-labs/noms/go/util/status"
	"github.com/attic-labs/noms/go/util/verbose"
	"github.com/attic-labs/noms/samples/go/csv"
	humanize "github.com/dustin/go-humanize"
	flag "github.com/juju/gnuflag"
)

const (
	destList = iota
	destMap  = iota
)

func main() {
	// Actually the delimiter uses runes, which can be multiple characters long.
	// https://blog.golang.org/strings
	delimiter := flag.String("delimiter", ",", "field delimiter for csv file, must be exactly one character long.")
	header := flag.String("header", "", "header row. If empty, we'll use the first row of the file")
	lowercase := flag.Bool("lowercase", false, "convert column names to lowercase (otherwise preserve the case in the resulting struct fields)")
	name := flag.String("name", "Row", "struct name. The user-visible name to give to the struct type that will hold each row of data.")
	columnTypes := flag.String("column-types", "", "a comma-separated list of types representing the desired type of each column. if absent all types default to be String")
	pathDescription := "noms path to blob to import"
	path := flag.String("path", "", pathDescription)
	flag.StringVar(path, "p", "", pathDescription)
	noProgress := flag.Bool("no-progress", false, "prevents progress from being output if true")
	destType := flag.String("dest-type", "list", "the destination type to import to. can be 'list' or 'map:<pk>', where <pk> is a list of comma-delimited column headers or indexes (0-based) used to uniquely identify a row")
	skipRecords := flag.Uint("skip-records", 0, "number of records to skip at beginning of file")
	limit := flag.Uint64("limit-records", math.MaxUint64, "maximum number of records to process")
	performCommit := flag.Bool("commit", true, "commit the data to head of the dataset (otherwise only write the data to the dataset)")
	append := flag.Bool("append", false, "append new data to list at head of specified dataset.")
	invert := flag.Bool("invert", false, "import rows in column major format rather than row major")
	spec.RegisterCommitMetaFlags(flag.CommandLine)
	verbose.RegisterVerboseFlags(flag.CommandLine)
	profile.RegisterProfileFlags(flag.CommandLine)

	flag.Usage = func() {
		fmt.Fprintf(os.Stderr, "Usage: csv-import [options] <csvfile> <dataset>\n\n")
		flag.PrintDefaults()
	}

	flag.Parse(true)

	var err error
	switch {
	case flag.NArg() == 0:
		err = errors.New("Maybe you put options after the dataset?")
	case flag.NArg() == 1 && *path == "":
		err = errors.New("If <csvfile> isn't specified, you must specify a noms path with -p")
	case flag.NArg() == 2 && *path != "":
		err = errors.New("Cannot specify both <csvfile> and a noms path with -p")
	case flag.NArg() > 2:
		err = errors.New("Too many arguments")
	case strings.HasPrefix(*destType, "map") && *append:
		err = errors.New("--append is only compatible with list imports")
	case strings.HasPrefix(*destType, "map") && *invert:
		err = errors.New("--invert is only compatible with list imports")
	}
	d.CheckError(err)

	defer profile.MaybeStartProfile().Stop()

	var r io.Reader
	var size uint64
	var filePath string
	var dataSetArgN int

	cfg := config.NewResolver()
	if *path != "" {
		db, val, err := cfg.GetPath(*path)
		d.CheckError(err)
		if val == nil {
			d.CheckError(fmt.Errorf("Path %s not found\n", *path))
		}
		blob, ok := val.(types.Blob)
		if !ok {
			d.CheckError(fmt.Errorf("Path %s not a Blob: %s\n", *path, types.EncodedValue(types.TypeOf(val))))
		}
		defer db.Close()
		preader, pwriter := io.Pipe()
		go func() {
			blob.Copy(pwriter)
			pwriter.Close()
		}()
		r = preader
		size = blob.Len()
		dataSetArgN = 0
	} else {
		filePath = flag.Arg(0)
		res, err := os.Open(filePath)
		d.CheckError(err)
		defer res.Close()
		fi, err := res.Stat()
		d.CheckError(err)
		r = res
		size = uint64(fi.Size())
		dataSetArgN = 1
	}

	if !*noProgress {
		r = progressreader.New(r, getStatusPrinter(size))
	}

	delim, err := csv.StringToRune(*delimiter)
	d.CheckErrorNoUsage(err)

	var dest int
	var strPks []string
	if *destType == "list" {
		dest = destList
	} else if strings.HasPrefix(*destType, "map:") {
		dest = destMap
		strPks = strings.Split(strings.TrimPrefix(*destType, "map:"), ",")
		if len(strPks) == 0 {
			fmt.Println("Invalid dest-type map: ", *destType)
			return
		}
	} else {
		fmt.Println("Invalid dest-type: ", *destType)
		return
	}

	cr := csv.NewCSVReader(r, delim)
	err = csv.SkipRecords(cr, *skipRecords)

	if err == io.EOF {
		err = fmt.Errorf("skip-records skipped past EOF")
	}
	d.CheckErrorNoUsage(err)

	var headers []string
	if *header == "" {
		headers, err = cr.Read()
		d.PanicIfError(err)
	} else {
		headers = strings.Split(*header, ",")
	}
	if *lowercase {
		for i := range headers {
			headers[i] = strings.ToLower(headers[i])
		}
	}

	uniqueHeaders := make(map[string]bool)
	for _, header := range headers {
		uniqueHeaders[header] = true
	}
	if len(uniqueHeaders) != len(headers) {
		d.CheckErrorNoUsage(fmt.Errorf("Invalid headers specified, headers must be unique"))
	}

	kinds := []types.NomsKind{}
	if *columnTypes != "" {
		kinds = csv.StringsToKinds(strings.Split(*columnTypes, ","))
		if len(kinds) != len(uniqueHeaders) {
			d.CheckErrorNoUsage(fmt.Errorf("Invalid column-types specified, column types do not correspond to number of headers"))
		}
	}

	db, ds, err := cfg.GetDataset(flag.Arg(dataSetArgN))
	d.CheckError(err)
	defer db.Close()

	var value types.Value
	if dest == destMap {
		value = csv.ReadToMap(cr, *name, headers, strPks, kinds, db, *limit)
	} else if *invert {
		value = csv.ReadToColumnar(cr, *name, headers, kinds, db, *limit)
	} else {
		value = csv.ReadToList(cr, *name, headers, kinds, db, *limit)
	}

	if *performCommit {
		meta, err := spec.CreateCommitMetaStruct(ds.Database(), "", "", additionalMetaInfo(filePath, *path), nil)
		d.CheckErrorNoUsage(err)
		if *append {
			if headVal, present := ds.MaybeHeadValue(); present {
				switch headVal.Kind() {
				case types.ListKind:
					l, isList := headVal.(types.List)
					d.PanicIfFalse(isList)
					ref := db.WriteValue(value)
					value = l.Concat(ref.TargetValue(db).(types.List))
				case types.StructKind:
					hstr, isStruct := headVal.(types.Struct)
					d.PanicIfFalse(isStruct)
					d.PanicIfFalse(hstr.Name() == "Columnar")
					str := value.(types.Struct)
					hstr.IterFields(func(fieldname string, v types.Value) {
						hl := v.(types.Ref).TargetValue(db).(types.List)
						nl := str.Get(fieldname).(types.Ref).TargetValue(db).(types.List)
						l := hl.Concat(nl)
						r := db.WriteValue(l)
						str = str.Set(fieldname, r)
					})
					value = str
				default:
					d.Panic("append can only be used with list or columnar")
				}
			}
		}
		_, err = db.Commit(ds, value, datas.CommitOptions{Meta: meta})
		if !*noProgress {
			status.Clear()
		}
		d.PanicIfError(err)
	} else {
		ref := db.WriteValue(value)
		if !*noProgress {
			status.Clear()
		}
		fmt.Fprintf(os.Stdout, "#%s\n", ref.TargetHash().String())
	}
}

func additionalMetaInfo(filePath, nomsPath string) map[string]string {
	fileOrNomsPath := "inputPath"
|
||||
path := nomsPath
|
||||
if path == "" {
|
||||
path = filePath
|
||||
fileOrNomsPath = "inputFile"
|
||||
}
|
||||
return map[string]string{fileOrNomsPath: path}
|
||||
}
|
||||
|
||||
func getStatusPrinter(expected uint64) progressreader.Callback {
|
||||
startTime := time.Now()
|
||||
return func(seen uint64) {
|
||||
percent := float64(seen) / float64(expected) * 100
|
||||
elapsed := time.Since(startTime)
|
||||
rate := float64(seen) / elapsed.Seconds()
|
||||
|
||||
status.Printf("%.2f%% of %s (%s/s)...",
|
||||
percent,
|
||||
humanize.Bytes(expected),
|
||||
humanize.Bytes(uint64(rate)))
|
||||
}
|
||||
}
|
||||
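The `--dest-type` handling above (a bare `list`, or `map:` followed by comma-separated key columns) can be sketched standalone. `parseDestType` is a hypothetical helper for illustration, not part of noms; note it also rejects an empty key list after `map:`, which the original's `len(strPks) == 0` check never catches because `strings.Split` returns a one-element slice for an empty input.

```go
package main

import (
	"fmt"
	"strings"
)

// parseDestType mirrors the flag handling above: "list" selects a list
// destination; "map:k1,k2" selects a map keyed by the named (or numbered)
// columns. It returns the destination kind and the key columns.
func parseDestType(destType string) (kind string, keys []string, err error) {
	switch {
	case destType == "list":
		return "list", nil, nil
	case strings.HasPrefix(destType, "map:"):
		keys = strings.Split(strings.TrimPrefix(destType, "map:"), ",")
		if len(keys) == 0 || keys[0] == "" {
			return "", nil, fmt.Errorf("invalid dest-type map: %s", destType)
		}
		return "map", keys, nil
	default:
		return "", nil, fmt.Errorf("invalid dest-type: %s", destType)
	}
}

func main() {
	kind, keys, _ := parseDestType("map:year,a")
	fmt.Println(kind, keys) // map [year a]
}
```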
@@ -1,478 +0,0 @@
// Copyright 2016 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package main

import (
	"bytes"
	"fmt"
	"io"
	"io/ioutil"
	"os"
	"testing"

	"github.com/attic-labs/noms/go/d"
	"github.com/attic-labs/noms/go/datas"
	"github.com/attic-labs/noms/go/nbs"
	"github.com/attic-labs/noms/go/spec"
	"github.com/attic-labs/noms/go/types"
	"github.com/attic-labs/noms/go/util/clienttest"
	"github.com/stretchr/testify/suite"
)

const (
	TEST_DATA_SIZE = 100
	TEST_YEAR      = 2012
	TEST_FIELDS    = "Float,String,Float,Float"
)

func TestCSVImporter(t *testing.T) {
	suite.Run(t, &testSuite{})
}

type testSuite struct {
	clienttest.ClientTestSuite
	tmpFileName string
}

func (s *testSuite) SetupTest() {
	input, err := ioutil.TempFile(s.TempDir, "")
	d.Chk.NoError(err)
	defer input.Close()
	s.tmpFileName = input.Name()
	writeCSV(input)
}

func (s *testSuite) TearDownTest() {
	os.Remove(s.tmpFileName)
}

func writeCSV(w io.Writer) {
	writeCSVWithHeader(w, "year,a,b,c\n", 0)
}

func writeCSVWithHeader(w io.Writer, header string, startingValue int) {
	_, err := io.WriteString(w, header)
	d.Chk.NoError(err)
	for i := 0; i < TEST_DATA_SIZE; i++ {
		j := i + startingValue
		_, err = io.WriteString(w, fmt.Sprintf("%d,a%d,%d,%d\n", TEST_YEAR+j%3, j, j, j*2))
		d.Chk.NoError(err)
	}
}

func (s *testSuite) validateList(l types.List) {
	s.Equal(uint64(TEST_DATA_SIZE), l.Len())

	i := uint64(0)
	l.IterAll(func(v types.Value, j uint64) {
		s.Equal(i, j)
		st := v.(types.Struct)
		s.Equal(types.Float(TEST_YEAR+i%3), st.Get("year"))
		s.Equal(types.String(fmt.Sprintf("a%d", i)), st.Get("a"))
		s.Equal(types.Float(i), st.Get("b"))
		s.Equal(types.Float(i*2), st.Get("c"))
		i++
	})
}

func (s *testSuite) validateMap(vrw types.ValueReadWriter, m types.Map) {
	// --dest-type=map:1 so key is field "a"
	s.Equal(uint64(TEST_DATA_SIZE), m.Len())

	for i := 0; i < TEST_DATA_SIZE; i++ {
		v := m.Get(types.String(fmt.Sprintf("a%d", i))).(types.Struct)
		s.True(v.Equals(
			types.NewStruct("Row", types.StructData{
				"year": types.Float(TEST_YEAR + i%3),
				"a":    types.String(fmt.Sprintf("a%d", i)),
				"b":    types.Float(i),
				"c":    types.Float(i * 2),
			})))
	}
}

func (s *testSuite) validateNestedMap(vrw types.ValueReadWriter, m types.Map) {
	// --dest-type=map:0,1 so keys are field "year", then field "a"
	s.Equal(uint64(3), m.Len())

	for i := 0; i < TEST_DATA_SIZE; i++ {
		n := m.Get(types.Float(TEST_YEAR + i%3)).(types.Map)
		o := n.Get(types.String(fmt.Sprintf("a%d", i))).(types.Struct)
		s.True(o.Equals(types.NewStruct("Row", types.StructData{
			"year": types.Float(TEST_YEAR + i%3),
			"a":    types.String(fmt.Sprintf("a%d", i)),
			"b":    types.Float(i),
			"c":    types.Float(i * 2),
		})))
	}
}

func (s *testSuite) validateColumnar(vrw types.ValueReadWriter, str types.Struct, reps int) {
	s.Equal("Columnar", str.Name())

	lists := map[string]types.List{}
	for _, nm := range []string{"year", "a", "b", "c"} {
		l := str.Get(nm).(types.Ref).TargetValue(vrw).(types.List)
		s.Equal(uint64(reps*TEST_DATA_SIZE), l.Len())
		lists[nm] = l
	}

	for i := 0; i < reps*TEST_DATA_SIZE; i++ {
		s.Equal(types.Float(TEST_YEAR+i%3), lists["year"].Get(uint64(i)))
		s.Equal(types.String(fmt.Sprintf("a%d", i)), lists["a"].Get(uint64(i)))
		s.Equal(types.Float(i), lists["b"].Get(uint64(i)))
		s.Equal(types.Float(i*2), lists["c"].Get(uint64(i)))
	}
}

func (s *testSuite) TestCSVImporter() {
	setName := "csv"
	dataspec := spec.CreateValueSpecString("nbs", s.DBDir, setName)
	stdout, stderr := s.MustRun(main, []string{"--no-progress", "--column-types", TEST_FIELDS, s.tmpFileName, dataspec})
	s.Equal("", stdout)
	s.Equal("", stderr)

	db := datas.NewDatabase(nbs.NewLocalStore(s.DBDir, clienttest.DefaultMemTableSize))
	defer os.RemoveAll(s.DBDir)
	defer db.Close()
	ds := db.GetDataset(setName)

	s.validateList(ds.HeadValue().(types.List))
}

func (s *testSuite) TestCSVImporterLowercase() {
	input, err := ioutil.TempFile(s.TempDir, "")
	d.Chk.NoError(err)
	defer input.Close()
	writeCSVWithHeader(input, "YeAr,a,B,c\n", 0)
	defer os.Remove(input.Name())

	setName := "csv"
	dataspec := spec.CreateValueSpecString("nbs", s.DBDir, setName)
	stdout, stderr := s.MustRun(main, []string{"--no-progress", "--lowercase", "--column-types", TEST_FIELDS, input.Name(), dataspec})
	s.Equal("", stdout)
	s.Equal("", stderr)

	db := datas.NewDatabase(nbs.NewLocalStore(s.DBDir, clienttest.DefaultMemTableSize))
	defer os.RemoveAll(s.DBDir)
	defer db.Close()
	ds := db.GetDataset(setName)

	s.validateList(ds.HeadValue().(types.List))
}

func (s *testSuite) TestCSVImporterLowercaseDuplicate() {
	input, err := ioutil.TempFile(s.TempDir, "")
	d.Chk.NoError(err)
	defer input.Close()
	writeCSVWithHeader(input, "YeAr,a,B,year\n", 0)
	defer os.Remove(input.Name())

	setName := "csv"
	dataspec := spec.CreateValueSpecString("nbs", s.DBDir, setName)
	_, stderr, _ := s.Run(main, []string{"--no-progress", "--lowercase", "--column-types", TEST_FIELDS, input.Name(), dataspec})
	s.Contains(stderr, "must be unique")
}

func (s *testSuite) TestCSVImporterFromBlob() {
	test := func(pathFlag string) {
		defer os.RemoveAll(s.DBDir)

		newDB := func() datas.Database {
			os.Mkdir(s.DBDir, 0777)
			cs := nbs.NewLocalStore(s.DBDir, clienttest.DefaultMemTableSize)
			return datas.NewDatabase(cs)
		}

		db := newDB()
		rawDS := db.GetDataset("raw")
		csv := &bytes.Buffer{}
		writeCSV(csv)
		db.CommitValue(rawDS, types.NewBlob(db, csv))
		db.Close()

		stdout, stderr := s.MustRun(main, []string{
			"--no-progress", "--column-types", TEST_FIELDS,
			pathFlag, spec.CreateValueSpecString("nbs", s.DBDir, "raw.value"),
			spec.CreateValueSpecString("nbs", s.DBDir, "csv"),
		})
		s.Equal("", stdout)
		s.Equal("", stderr)

		db = newDB()
		defer db.Close()
		csvDS := db.GetDataset("csv")
		s.validateList(csvDS.HeadValue().(types.List))
	}
	test("--path")
	test("-p")
}

func (s *testSuite) TestCSVImporterToMap() {
	setName := "csv"
	dataspec := spec.CreateValueSpecString("nbs", s.DBDir, setName)
	stdout, stderr := s.MustRun(main, []string{"--no-progress", "--column-types", TEST_FIELDS, "--dest-type", "map:1", s.tmpFileName, dataspec})
	s.Equal("", stdout)
	s.Equal("", stderr)

	db := datas.NewDatabase(nbs.NewLocalStore(s.DBDir, clienttest.DefaultMemTableSize))
	defer os.RemoveAll(s.DBDir)
	defer db.Close()
	ds := db.GetDataset(setName)

	m := ds.HeadValue().(types.Map)
	s.validateMap(db, m)
}

func (s *testSuite) TestCSVImporterToNestedMap() {
	setName := "csv"
	dataspec := spec.CreateValueSpecString("nbs", s.DBDir, setName)
	stdout, stderr := s.MustRun(main, []string{"--no-progress", "--column-types", TEST_FIELDS, "--dest-type", "map:0,1", s.tmpFileName, dataspec})
	s.Equal("", stdout)
	s.Equal("", stderr)

	db := datas.NewDatabase(nbs.NewLocalStore(s.DBDir, clienttest.DefaultMemTableSize))
	defer os.RemoveAll(s.DBDir)
	defer db.Close()
	ds := db.GetDataset(setName)

	m := ds.HeadValue().(types.Map)
	s.validateNestedMap(db, m)
}

func (s *testSuite) TestCSVImporterToNestedMapByName() {
	setName := "csv"
	dataspec := spec.CreateValueSpecString("nbs", s.DBDir, setName)
	stdout, stderr := s.MustRun(main, []string{"--no-progress", "--column-types", TEST_FIELDS, "--dest-type", "map:year,a", s.tmpFileName, dataspec})
	s.Equal("", stdout)
	s.Equal("", stderr)

	db := datas.NewDatabase(nbs.NewLocalStore(s.DBDir, clienttest.DefaultMemTableSize))
	defer os.RemoveAll(s.DBDir)
	defer db.Close()
	ds := db.GetDataset(setName)

	m := ds.HeadValue().(types.Map)
	s.validateNestedMap(db, m)
}

func (s *testSuite) TestCSVImporterToColumnar() {
	setName := "csv"
	dataspec := spec.CreateValueSpecString("nbs", s.DBDir, setName)
	stdout, stderr := s.MustRun(main, []string{"--no-progress", "--invert", "--column-types", TEST_FIELDS, s.tmpFileName, dataspec})
	s.Equal("", stdout)
	s.Equal("", stderr)

	db := datas.NewDatabase(nbs.NewLocalStore(s.DBDir, clienttest.DefaultMemTableSize))
	defer os.RemoveAll(s.DBDir)
	defer db.Close()
	ds := db.GetDataset(setName)

	str := ds.HeadValue().(types.Struct)
	s.validateColumnar(db, str, 1)
}

func (s *testSuite) TestCSVImporterToColumnarAppend() {
	setName := "csv"
	dataspec := spec.CreateValueSpecString("nbs", s.DBDir, setName)
	stdout, stderr := s.MustRun(main, []string{"--no-progress", "--invert", "--column-types", TEST_FIELDS, s.tmpFileName, dataspec})
	s.Equal("", stdout)
	s.Equal("", stderr)

	input, err := ioutil.TempFile(s.TempDir, "")
	d.Chk.NoError(err)
	defer input.Close()
	writeCSVWithHeader(input, "year,a,b,c\n", 100)
	defer os.Remove(input.Name())

	stdout, stderr = s.MustRun(main, []string{"--no-progress", "--invert", "--append", "--column-types", TEST_FIELDS, input.Name(), dataspec})
	s.Equal("", stdout)
	s.Equal("", stderr)

	db := datas.NewDatabase(nbs.NewLocalStore(s.DBDir, clienttest.DefaultMemTableSize))
	defer os.RemoveAll(s.DBDir)
	defer db.Close()
	ds := db.GetDataset(setName)

	str := ds.HeadValue().(types.Struct)
	s.validateColumnar(db, str, 2)
}

func (s *testSuite) TestCSVImporterWithPipe() {
	input, err := ioutil.TempFile(s.TempDir, "")
	d.Chk.NoError(err)
	defer input.Close()
	defer os.Remove(input.Name())

	_, err = input.WriteString("a|b\n1|2\n")
	d.Chk.NoError(err)

	setName := "csv"
	dataspec := spec.CreateValueSpecString("nbs", s.DBDir, setName)
	stdout, stderr := s.MustRun(main, []string{"--no-progress", "--column-types", "String,Float", "--delimiter", "|", input.Name(), dataspec})
	s.Equal("", stdout)
	s.Equal("", stderr)

	db := datas.NewDatabase(nbs.NewLocalStore(s.DBDir, clienttest.DefaultMemTableSize))
	defer os.RemoveAll(s.DBDir)
	defer db.Close()
	ds := db.GetDataset(setName)

	l := ds.HeadValue().(types.List)
	s.Equal(uint64(1), l.Len())
	v := l.Get(0)
	st := v.(types.Struct)
	s.Equal(types.String("1"), st.Get("a"))
	s.Equal(types.Float(2), st.Get("b"))
}

func (s *testSuite) TestCSVImporterWithExternalHeader() {
	input, err := ioutil.TempFile(s.TempDir, "")
	d.Chk.NoError(err)
	defer input.Close()
	defer os.Remove(input.Name())

	_, err = input.WriteString("7,8\n")
	d.Chk.NoError(err)

	setName := "csv"
	dataspec := spec.CreateValueSpecString("nbs", s.DBDir, setName)
	stdout, stderr := s.MustRun(main, []string{"--no-progress", "--column-types", "String,Float", "--header", "x,y", input.Name(), dataspec})
	s.Equal("", stdout)
	s.Equal("", stderr)

	db := datas.NewDatabase(nbs.NewLocalStore(s.DBDir, clienttest.DefaultMemTableSize))
	defer os.RemoveAll(s.DBDir)
	defer db.Close()
	ds := db.GetDataset(setName)

	l := ds.HeadValue().(types.List)
	s.Equal(uint64(1), l.Len())
	v := l.Get(0)
	st := v.(types.Struct)
	s.Equal(types.String("7"), st.Get("x"))
	s.Equal(types.Float(8), st.Get("y"))
}

func (s *testSuite) TestCSVImporterWithInvalidExternalHeader() {
	input, err := ioutil.TempFile(s.TempDir, "")
	d.Chk.NoError(err)
	defer input.Close()
	defer os.Remove(input.Name())

	_, err = input.WriteString("7#8\n")
	d.Chk.NoError(err)

	setName := "csv"
	dataspec := spec.CreateValueSpecString("nbs", s.DBDir, setName)
	stdout, stderr, exitErr := s.Run(main, []string{"--no-progress", "--column-types", "String,Float", "--header", "x,x", input.Name(), dataspec})
	s.Equal("", stdout)
	s.Equal("error: Invalid headers specified, headers must be unique\n", stderr)
	s.Equal(clienttest.ExitError{1}, exitErr)
}

func (s *testSuite) TestCSVImporterWithInvalidNumColumnTypeSpec() {
	input, err := ioutil.TempFile(s.TempDir, "")
	d.Chk.NoError(err)
	defer input.Close()
	defer os.Remove(input.Name())

	_, err = input.WriteString("7,8\n")
	d.Chk.NoError(err)

	setName := "csv"
	dataspec := spec.CreateValueSpecString("nbs", s.DBDir, setName)
	stdout, stderr, exitErr := s.Run(main, []string{"--no-progress", "--column-types", "String", "--header", "x,y", input.Name(), dataspec})
	s.Equal("", stdout)
	s.Equal("error: Invalid column-types specified, column types do not correspond to number of headers\n", stderr)
	s.Equal(clienttest.ExitError{1}, exitErr)
}

func (s *testSuite) TestCSVImportSkipRecords() {
	input, err := ioutil.TempFile(s.TempDir, "")
	d.Chk.NoError(err)
	defer input.Close()
	defer os.Remove(input.Name())

	_, err = input.WriteString("garbage foo\n")
	d.Chk.NoError(err)

	_, err = input.WriteString("garbage bar\n")
	d.Chk.NoError(err)

	_, err = input.WriteString("a,b\n")
	d.Chk.NoError(err)

	_, err = input.WriteString("7,8\n")
	d.Chk.NoError(err)

	setName := "csv"
	dataspec := spec.CreateValueSpecString("nbs", s.DBDir, setName)

	stdout, stderr := s.MustRun(main, []string{"--no-progress", "--skip-records", "2", input.Name(), dataspec})
	s.Equal("", stdout)
	s.Equal("", stderr)

	db := datas.NewDatabase(nbs.NewLocalStore(s.DBDir, clienttest.DefaultMemTableSize))
	defer os.RemoveAll(s.DBDir)
	defer db.Close()
	ds := db.GetDataset(setName)

	l := ds.HeadValue().(types.List)
	s.Equal(uint64(1), l.Len())
	v := l.Get(0)
	st := v.(types.Struct)
	s.Equal(types.String("7"), st.Get("a"))
	s.Equal(types.String("8"), st.Get("b"))
}

func (s *testSuite) TestCSVImportSkipRecordsTooMany() {
	input, err := ioutil.TempFile(s.TempDir, "")
	d.Chk.NoError(err)
	defer input.Close()
	defer os.Remove(input.Name())

	_, err = input.WriteString("a,b\n")
	d.Chk.NoError(err)

	setName := "csv"
	dataspec := spec.CreateValueSpecString("nbs", s.DBDir, setName)

	stdout, stderr, recoveredErr := s.Run(main, []string{"--no-progress", "--skip-records", "100", input.Name(), dataspec})
	s.Equal("", stdout)
	s.Equal("error: skip-records skipped past EOF\n", stderr)
	s.Equal(clienttest.ExitError{1}, recoveredErr)
}

func (s *testSuite) TestCSVImportSkipRecordsCustomHeader() {
	input, err := ioutil.TempFile(s.TempDir, "")
	d.Chk.NoError(err)
	defer input.Close()
	defer os.Remove(input.Name())

	_, err = input.WriteString("a,b\n")
	d.Chk.NoError(err)

	_, err = input.WriteString("7,8\n")
	d.Chk.NoError(err)

	setName := "csv"
	dataspec := spec.CreateValueSpecString("nbs", s.DBDir, setName)
	stdout, stderr := s.MustRun(main, []string{"--no-progress", "--skip-records", "1", "--header", "x,y", input.Name(), dataspec})
	s.Equal("", stdout)
	s.Equal("", stderr)

	db := datas.NewDatabase(nbs.NewLocalStore(s.DBDir, clienttest.DefaultMemTableSize))
	defer os.RemoveAll(s.DBDir)
	defer db.Close()
	ds := db.GetDataset(setName)

	l := ds.HeadValue().(types.List)
	s.Equal(uint64(1), l.Len())
	v := l.Get(0)
	st := v.(types.Struct)
	s.Equal(types.String("7"), st.Get("x"))
	s.Equal(types.String("8"), st.Get("y"))
}
@@ -1,109 +0,0 @@
// Copyright 2016 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package main

import (
	"fmt"
	"io"
	"os"
	"os/exec"
	"path"
	"testing"

	"github.com/attic-labs/noms/go/perf/suite"
	"github.com/attic-labs/noms/go/types"
	"github.com/attic-labs/noms/samples/go/csv"
	humanize "github.com/dustin/go-humanize"
	"github.com/stretchr/testify/assert"
)

// CSV perf suites require the testdata directory to be checked out at $GOPATH/src/github.com/attic-labs/testdata (i.e. ../testdata relative to the noms directory).

type perfSuite struct {
	suite.PerfSuite
	csvImportExe string
}

func (s *perfSuite) SetupSuite() {
	// Trick the temp file logic into creating a unique path for the csv-import binary.
	f := s.TempFile()
	f.Close()
	os.Remove(f.Name())

	s.csvImportExe = f.Name()
	err := exec.Command("go", "build", "-o", s.csvImportExe, "github.com/attic-labs/noms/samples/go/csv/csv-import").Run()
	assert.NoError(s.T, err)
}

func (s *perfSuite) Test01ImportSfCrimeBlobFromTestdata() {
	assert := s.NewAssert()

	files := s.OpenGlob(s.Testdata, "sf-crime", "2016-07-28.*")
	defer s.CloseGlob(files)

	blob := types.NewBlob(s.Database, files...)
	fmt.Fprintf(s.W, "\tsf-crime is %s\n", humanize.Bytes(blob.Len()))

	ds := s.Database.GetDataset("sf-crime/raw")
	_, err := s.Database.CommitValue(ds, blob)
	assert.NoError(err)
}

func (s *perfSuite) Test02ImportSfCrimeCSVFromBlob() {
	s.execCsvImportExe("sf-crime")
}

func (s *perfSuite) Test03ImportSfRegisteredBusinessesFromBlobAsMap() {
	assert := s.NewAssert()

	files := s.OpenGlob(s.Testdata, "sf-registered-businesses", "2016-07-25.csv")
	defer s.CloseGlob(files)

	blob := types.NewBlob(s.Database, files...)
	fmt.Fprintf(s.W, "\tsf-reg-bus is %s\n", humanize.Bytes(blob.Len()))

	ds := s.Database.GetDataset("sf-reg-bus/raw")
	_, err := s.Database.CommitValue(ds, blob)
	assert.NoError(err)

	s.execCsvImportExe("sf-reg-bus", "--dest-type", "map:0")
}

func (s *perfSuite) Test04ImportSfRegisteredBusinessesFromBlobAsMultiKeyMap() {
	s.execCsvImportExe("sf-reg-bus", "--dest-type", "map:Zip_Code,Business_Start_Date")
}

func (s *perfSuite) execCsvImportExe(dsName string, args ...string) {
	assert := s.NewAssert()

	blobSpec := fmt.Sprintf("%s::%s/raw.value", s.DatabaseSpec, dsName)
	destSpec := fmt.Sprintf("%s::%s", s.DatabaseSpec, dsName)
	args = append(args, "-p", blobSpec, destSpec)
	importCmd := exec.Command(s.csvImportExe, args...)
	importCmd.Stdout = s.W
	importCmd.Stderr = os.Stderr

	assert.NoError(importCmd.Run())
}

func (s *perfSuite) TestParseSfCrime() {
	assert := s.NewAssert()

	files := s.OpenGlob(path.Join(s.Testdata, "sf-crime", "2016-07-28.*"))
	defer s.CloseGlob(files)

	reader := csv.NewCSVReader(io.MultiReader(files...), ',')
	for {
		_, err := reader.Read()
		if err != nil {
			assert.Equal(io.EOF, err)
			break
		}
	}
}

func TestPerf(t *testing.T) {
	suite.Run("csv-import", t, &perfSuite{})
}
@@ -1 +0,0 @@
csv-invert
@@ -1,119 +0,0 @@
// Copyright 2017 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package main

import (
	"fmt"
	"os"
	"strings"

	"github.com/attic-labs/noms/go/config"
	"github.com/attic-labs/noms/go/d"
	"github.com/attic-labs/noms/go/datas"
	"github.com/attic-labs/noms/go/types"
	"github.com/attic-labs/noms/go/util/profile"
	flag "github.com/juju/gnuflag"
)

func main() {
	flag.Usage = func() {
		fmt.Fprintf(os.Stderr, "Usage: %s [options] <dataset-to-invert> <output-dataset>\n", os.Args[0])
		flag.PrintDefaults()
	}

	profile.RegisterProfileFlags(flag.CommandLine)
	flag.Parse(true)

	if flag.NArg() != 2 {
		flag.Usage()
		return
	}

	cfg := config.NewResolver()
	inDB, inDS, err := cfg.GetDataset(flag.Arg(0))
	d.CheckError(err)
	defer inDB.Close()

	head, present := inDS.MaybeHead()
	if !present {
		d.CheckErrorNoUsage(fmt.Errorf("The dataset %s has no head", flag.Arg(0)))
	}
	v := head.Get(datas.ValueField)
	l, isList := v.(types.List)
	if !isList {
		d.CheckErrorNoUsage(fmt.Errorf("The head value of %s is not a list, but rather %s", flag.Arg(0), types.TypeOf(v).Describe()))
	}

	outDB, outDS, err := cfg.GetDataset(flag.Arg(1))
	d.CheckError(err) // this error was previously unchecked
	defer outDB.Close()

	// I don't want to allocate a new types.Value every time someone calls zeroVal(), so instead have a map of canned Values to reference.
	zeroVals := map[types.NomsKind]types.Value{
		types.BoolKind:   types.Bool(false),
		types.FloatKind:  types.Float(0),
		types.StringKind: types.String(""),
	}

	zeroVal := func(t *types.Type) types.Value {
		v, present := zeroVals[t.TargetKind()]
		if !present {
			d.CheckErrorNoUsage(fmt.Errorf("csv-invert doesn't support values of type %s", t.Describe()))
		}
		return v
	}

	defer profile.MaybeStartProfile().Stop()
	type stream struct {
		ch      chan types.Value
		zeroVal types.Value
	}
	streams := map[string]*stream{}
	lists := map[string]<-chan types.List{}
	lowers := map[string]string{}

	sDesc := types.TypeOf(l).Desc.(types.CompoundDesc).ElemTypes[0].Desc.(types.StructDesc)
	sDesc.IterFields(func(name string, t *types.Type, optional bool) {
		lowerName := strings.ToLower(name)
		if _, present := streams[lowerName]; !present {
			s := &stream{make(chan types.Value, 1024), zeroVal(t)}
			streams[lowerName] = s
			lists[lowerName] = types.NewStreamingList(outDB, s.ch)
		}
		lowers[name] = lowerName
	})

	filledCols := make(map[string]struct{}, len(streams))
	l.IterAll(func(v types.Value, index uint64) {
		// First, iterate the fields that are present in |v| and append values to the correct lists.
		v.(types.Struct).IterFields(func(name string, value types.Value) {
			ln := lowers[name]
			filledCols[ln] = struct{}{}
			streams[ln].ch <- value
		})
		// Second, iterate all the streams, skipping the ones we already sent a value for, and send that column's zero value for the remaining ones.
		for lowerName, stream := range streams {
			if _, present := filledCols[lowerName]; present {
				delete(filledCols, lowerName)
				continue
			}
			stream.ch <- stream.zeroVal
		}
	})

	invertedStructData := types.StructData{}
	for lowerName, stream := range streams {
		close(stream.ch)
		invertedStructData[lowerName] = <-lists[lowerName]
	}
	str := types.NewStruct("Columnar", invertedStructData)

	parents := types.NewSet(outDB)
	if headRef, present := outDS.MaybeHeadRef(); present {
		parents = types.NewSet(outDB, headRef)
	}

	_, err = outDB.Commit(outDS, str, datas.CommitOptions{Parents: parents, Meta: head.Get(datas.MetaField).(types.Struct)})
	d.PanicIfError(err)
}
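Stripped of the noms types and the channel-fed streaming lists, the core row-to-column transformation that csv-invert performs can be sketched in plain Go. `invert` and its string-only rows are illustrative, not part of noms; the zero-value fill for missing fields mirrors the `zeroVal` handling above.

```go
package main

import "fmt"

// invert fans a slice of rows (column name -> value) out into per-column
// slices, filling a zero value where a row lacks a column -- the same shape
// of transformation csv-invert performs with streaming noms lists.
func invert(rows []map[string]string) map[string][]string {
	cols := map[string][]string{}
	// Discover the full column set first.
	for _, row := range rows {
		for name := range row {
			if _, ok := cols[name]; !ok {
				cols[name] = nil
			}
		}
	}
	// Append each row's value (or the zero value "") to every column,
	// so all column slices stay the same length.
	for _, row := range rows {
		for name := range cols {
			cols[name] = append(cols[name], row[name])
		}
	}
	return cols
}

func main() {
	rows := []map[string]string{
		{"year": "2012", "a": "a0"},
		{"year": "2013"}, // no "a" field; filled with ""
	}
	fmt.Println(invert(rows)["a"]) // [a0 ]
}
```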
@@ -1,53 +0,0 @@
// Copyright 2016 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package csv

import (
	"bufio"
	"encoding/csv"
	"io"
)

var (
	rByte byte = 13 // the byte that corresponds to the '\r' rune.
	nByte byte = 10 // the byte that corresponds to the '\n' rune.
)

type reader struct {
	r *bufio.Reader
}

// Read replaces CR line endings in the source reader with LF line endings if the CR is not followed by an LF.
func (r reader) Read(p []byte) (n int, err error) {
	n, err = r.r.Read(p)
	// Peek at the byte after p[:n] so a CR at the end of the buffer can be
	// classified; ignore the peek error so it doesn't clobber Read's error.
	bn, _ := r.r.Peek(1)
	for i, b := range p[:n] {
		if b != rByte {
			continue
		}
		// Replace the CR with an LF unless the next byte (in p, or peeked
		// from the underlying reader) is already an LF.
		j := i + 1
		nextIsLF := (j < n && p[j] == nByte) || (j == n && len(bn) > 0 && bn[0] == nByte)
		if !nextIsLF {
			p[i] = nByte
		}
	}
	return
}

// SkipRecords reads and discards n records from r, returning the first error encountered.
func SkipRecords(r *csv.Reader, n uint) error {
	for i := uint(0); i < n; i++ {
		if _, err := r.Read(); err != nil {
			return err
		}
	}
	return nil
}

// NewCSVReader returns a new csv.Reader that splits on |comma| and normalizes bare CR line endings to LF.
func NewCSVReader(res io.Reader, comma rune) *csv.Reader {
	bufRes := bufio.NewReader(res)
	r := csv.NewReader(reader{r: bufRes})
	r.Comma = comma
	r.FieldsPerRecord = -1 // Don't enforce number of fields.
	return r
}
@@ -1,83 +0,0 @@
// Copyright 2016 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package csv

import (
	"bytes"
	"strings"
	"testing"

	"github.com/stretchr/testify/assert"
)

func TestCR(t *testing.T) {
	testFile := []byte("a,b,c\r1,2,3\r")
	delimiter, err := StringToRune(",")
	assert.NoError(t, err)

	r := NewCSVReader(bytes.NewReader(testFile), delimiter)
	lines, err := r.ReadAll()

	assert.NoError(t, err, "An error occurred while reading the data: %v", err)
	if len(lines) != 2 {
		t.Errorf("Wrong number of lines. Expected 2, got %d", len(lines))
	}
}

func TestLF(t *testing.T) {
	testFile := []byte("a,b,c\n1,2,3\n")
	delimiter, err := StringToRune(",")
	assert.NoError(t, err)

	r := NewCSVReader(bytes.NewReader(testFile), delimiter)
	lines, err := r.ReadAll()

	assert.NoError(t, err, "An error occurred while reading the data: %v", err)
	if len(lines) != 2 {
		t.Errorf("Wrong number of lines. Expected 2, got %d", len(lines))
	}
}

func TestCRLF(t *testing.T) {
	testFile := []byte("a,b,c\r\n1,2,3\r\n")
	delimiter, err := StringToRune(",")
	assert.NoError(t, err)

	r := NewCSVReader(bytes.NewReader(testFile), delimiter)
	lines, err := r.ReadAll()

	assert.NoError(t, err, "An error occurred while reading the data: %v", err)
	if len(lines) != 2 {
		t.Errorf("Wrong number of lines. Expected 2, got %d", len(lines))
	}
}

func TestCRInQuote(t *testing.T) {
	testFile := []byte("a,\"foo,\rbar\",c\r1,\"2\r\n2\",3\r")
	delimiter, err := StringToRune(",")
	assert.NoError(t, err)

	r := NewCSVReader(bytes.NewReader(testFile), delimiter)
	lines, err := r.ReadAll()

	assert.NoError(t, err, "An error occurred while reading the data: %v", err)
	if len(lines) != 2 {
		t.Errorf("Wrong number of lines. Expected 2, got %d", len(lines))
	}
	if strings.Contains(lines[1][1], "\n\n") {
		t.Error("The CRLF was converted to a LFLF")
	}
}

func TestCRLFEndOfBufferLength(t *testing.T) {
	testFile := make([]byte, 4096*2)
	testFile[4095] = 13 // \r byte
	testFile[4096] = 10 // \n byte
	delimiter, err := StringToRune(",")
	assert.NoError(t, err)

	r := NewCSVReader(bytes.NewReader(testFile), delimiter)
	lines, err := r.ReadAll()

	assert.NoError(t, err, "An error occurred while reading the data: %v", err)
	if len(lines) != 2 {
		t.Errorf("Wrong number of lines. Expected 2, got %d", len(lines))
	}
}
@@ -1,37 +0,0 @@
// Copyright 2016 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package csv

import (
	"fmt"
	"strconv"
	"strings"

	"github.com/attic-labs/noms/go/types"
)

// KindSlice is an alias for []types.NomsKind. It's needed because types.NomsKind values are really just 8-bit unsigned ints -- the same underlying type Go uses for 'byte' -- which confuses the Go JSON marshal/unmarshal code: it treats such slices as byte slices and base64-encodes them!
type KindSlice []types.NomsKind

func (ks KindSlice) MarshalJSON() ([]byte, error) {
	elems := make([]string, len(ks))
	for i, k := range ks {
		elems[i] = fmt.Sprintf("%d", k)
	}
	return []byte("[" + strings.Join(elems, ",") + "]"), nil
}

func (ks *KindSlice) UnmarshalJSON(value []byte) error {
	elems := strings.Split(string(value[1:len(value)-1]), ",")
	*ks = make(KindSlice, len(elems))
	for i, e := range elems {
		ival, err := strconv.ParseUint(e, 10, 8)
		if err != nil {
			return err
		}
		(*ks)[i] = types.NomsKind(ival)
	}
	return nil
}
@@ -1,29 +0,0 @@
// Copyright 2016 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package csv

import (
	"encoding/json"
	"fmt"
	"testing"

	"github.com/attic-labs/noms/go/types"
	"github.com/stretchr/testify/assert"
)

func TestKindSliceJSON(t *testing.T) {
	assert := assert.New(t)

	ks := KindSlice{types.FloatKind, types.StringKind, types.BoolKind}
	b, err := json.Marshal(&ks)
	assert.NoError(err)

	assert.Equal(fmt.Sprintf("[%d,%d,%d]", ks[0], ks[1], ks[2]), string(b))

	var uks KindSlice
	err = json.Unmarshal(b, &uks)
	assert.NoError(err, "error with json.Unmarshal")
	assert.Equal(ks, uks)
}
@@ -1,280 +0,0 @@
// Copyright 2016 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package csv

import (
	"encoding/csv"
	"fmt"
	"io"
	"sort"
	"strconv"

	"github.com/attic-labs/noms/go/d"
	"github.com/attic-labs/noms/go/types"
)

// StringToKind maps names of valid NomsKinds (e.g. Bool, Float, etc) to their associated types.NomsKind
var StringToKind = func(kindMap map[types.NomsKind]string) map[string]types.NomsKind {
	m := map[string]types.NomsKind{}
	for k, v := range kindMap {
		m[v] = k
	}
	return m
}(types.KindToString)

// StringsToKinds looks up each element of strs in the StringToKind map and returns a slice of the corresponding kinds
func StringsToKinds(strs []string) KindSlice {
	kinds := make(KindSlice, len(strs))
	for i, str := range strs {
		k, ok := StringToKind[str]
		if !ok {
			d.Panic("StringToKind[%s] failed", str)
		}
		kinds[i] = k
	}
	return kinds
}

// KindsToStrings looks up each element of kinds in the types.KindToString map and returns a slice of the corresponding names
func KindsToStrings(kinds KindSlice) []string {
	strs := make([]string, len(kinds))
	for i, k := range kinds {
		strs[i] = k.String()
	}
	return strs
}

// EscapeStructFieldFromCSV removes special characters and camelCases multi-word names (e.g. "camel case" becomes "camelCase")
func EscapeStructFieldFromCSV(input string) string {
	if types.IsValidStructFieldName(input) {
		return input
	}
	return types.CamelCaseFieldName(input)
}

// MakeStructTemplateFromHeaders creates a struct type from the headers using |kinds| as the type of each field. If |kinds| is empty, default to strings.
func MakeStructTemplateFromHeaders(headers []string, structName string, kinds KindSlice) (temp types.StructTemplate, fieldOrder []int, kindMap []types.NomsKind) {
	useStringType := len(kinds) == 0
	d.PanicIfFalse(useStringType || len(headers) == len(kinds))

	fieldMap := make(map[string]types.NomsKind, len(headers))
	origOrder := make(map[string]int, len(headers))
	fieldNames := make(sort.StringSlice, len(headers))

	for i, key := range headers {
		fn := EscapeStructFieldFromCSV(key)
		origOrder[fn] = i
		kind := types.StringKind
		if !useStringType {
			kind = kinds[i]
		}
		_, ok := fieldMap[fn]
		if ok {
			d.Panic(`Duplicate field name "%s"`, key)
		}
		fieldMap[fn] = kind
		fieldNames[i] = fn
	}

	sort.Sort(fieldNames)

	kindMap = make([]types.NomsKind, len(fieldMap))
	fieldOrder = make([]int, len(fieldMap))

	for i, fn := range fieldNames {
		kindMap[i] = fieldMap[fn]
		fieldOrder[origOrder[fn]] = i
	}

	temp = types.MakeStructTemplate(structName, fieldNames)
	return
}

// ReadToList takes a CSV reader and reads data into a typed List of structs.
// Each row gets read into a struct named structName, described by headers. If
// the original data contained headers it is expected that the input reader has
// already read those and is pointing at the first data row.
// If kinds is non-empty, it will be used to type the fields in the generated
// structs; otherwise, they will be left as string-fields.
func ReadToList(r *csv.Reader, structName string, headers []string, kinds KindSlice, vrw types.ValueReadWriter, limit uint64) (l types.List) {
	temp, fieldOrder, kindMap := MakeStructTemplateFromHeaders(headers, structName, kinds)
	valueChan := make(chan types.Value, 128) // TODO: Make this a function param?
	listChan := types.NewStreamingList(vrw, valueChan)

	cnt := uint64(0)
	for {
		row, err := r.Read()
		if cnt >= limit || err == io.EOF {
			close(valueChan)
			break
		} else if err != nil {
			panic(err)
		}
		cnt++

		fields := readFieldsFromRow(row, headers, fieldOrder, kindMap)
		valueChan <- temp.NewStruct(fields)
	}

	return <-listChan
}

type column struct {
	ch        chan types.Value
	list      <-chan types.List
	zeroValue types.Value
	hdr       string
}

// ReadToColumnar takes a CSV reader and reads data from each column into a
// separate list. Values from columns in each successive row are appended to the
// column-specific lists whose type is described by headers. Finally, a new
// "Columnar" struct is created that consists of one field for each column and
// each field contains a list of values.
// If the original data contained headers it is expected that the input reader
// has already read those and is pointing at the first data row.
// If kinds is non-empty, it will be used to type the fields in the generated
// structs; otherwise, they will be left as string-fields.
func ReadToColumnar(r *csv.Reader, structName string, headers []string, kinds KindSlice, vrw types.ValueReadWriter, limit uint64) (s types.Struct) {
	valueChan := make(chan types.Value, 128) // TODO: Make this a function param?
	cols := []column{}
	fieldOrder := []int{}
	for i, hdr := range headers {
		ch := make(chan types.Value, 1024)
		cols = append(cols, column{
			ch:   ch,
			list: types.NewStreamingList(vrw, ch),
			hdr:  hdr,
		})
		fieldOrder = append(fieldOrder, i)
	}

	cnt := uint64(0)
	for {
		row, err := r.Read()
		if cnt >= limit || err == io.EOF {
			close(valueChan)
			break
		} else if err != nil {
			panic(err)
		}
		cnt++

		fields := readFieldsFromRow(row, headers, fieldOrder, kinds)
		for i, v := range fields {
			cols[i].ch <- v
		}
	}

	sd := types.StructData{}
	for _, col := range cols {
		close(col.ch)
		r := vrw.WriteValue(<-col.list)
		sd[col.hdr] = r
	}
	return types.NewStruct("Columnar", sd)
}

// getFieldIndexByHeaderName returns the index of name within headers, or -1 if not found
func getFieldIndexByHeaderName(headers []string, name string) int {
	for i, header := range headers {
		if header == name {
			return i
		}
	}
	return -1
}

// getPkIndices resolves the primary keys given as strings. A key that parses
// as an integer is used directly as a column index; otherwise it is looked up
// in headers. The returned indices preserve the order of strPks.
func getPkIndices(strPks []string, headers []string) []int {
	result := make([]int, len(strPks))
	for i, pk := range strPks {
		pkIdx, err := strconv.Atoi(pk)
		if err == nil {
			result[i] = pkIdx
		} else {
			result[i] = getFieldIndexByHeaderName(headers, pk)
		}
		if result[i] < 0 {
			d.Chk.Fail(fmt.Sprintf("Invalid pk: %v", pk))
		}
	}
	return result
}

func readFieldsFromRow(row []string, headers []string, fieldOrder []int, kindMap []types.NomsKind) types.ValueSlice {
	fields := make(types.ValueSlice, len(headers))
	for i, v := range row {
		if i < len(headers) {
			fieldOrigIndex := fieldOrder[i]
			val, err := StringToValue(v, kindMap[fieldOrigIndex])
			if err != nil {
				d.Chk.Fail(fmt.Sprintf("Error parsing value for column '%s': %s", headers[i], err))
			}
			fields[fieldOrigIndex] = val
		}
	}
	return fields
}

// primaryKeyValuesFromFields extracts the values of the primaryKey fields into
// an array. The values are in the user-specified order. This function returns 2
// objects:
// 1) a ValueSlice containing the first n-1 keys.
// 2) a single Value which will be used as the key in the leaf map created by
// GraphBuilder
func primaryKeyValuesFromFields(fields types.ValueSlice, fieldOrder, pkIndices []int) (types.ValueSlice, types.Value) {
	numPrimaryKeys := len(pkIndices)

	if numPrimaryKeys == 1 {
		return nil, fields[fieldOrder[pkIndices[0]]]
	}

	keys := make(types.ValueSlice, numPrimaryKeys-1)
	var value types.Value
	for i, idx := range pkIndices {
		k := fields[fieldOrder[idx]]
		if i < numPrimaryKeys-1 {
			keys[i] = k
		} else {
			value = k
		}
	}
	return keys, value
}

// ReadToMap takes a CSV reader and reads data into a typed Map of structs. Each
// row gets read into a struct named structName, described by headers. If the
// original data contained headers it is expected that the input reader has
// already read those and is pointing at the first data row.
// If kinds is non-empty, it will be used to type the fields in the generated
// structs; otherwise, they will be left as string-fields.
func ReadToMap(r *csv.Reader, structName string, headersRaw []string, primaryKeys []string, kinds KindSlice, vrw types.ValueReadWriter, limit uint64) types.Map {
	temp, fieldOrder, kindMap := MakeStructTemplateFromHeaders(headersRaw, structName, kinds)
	pkIndices := getPkIndices(primaryKeys, headersRaw)
	d.Chk.True(len(pkIndices) >= 1, "No primary key defined when reading into map")
	gb := types.NewGraphBuilder(vrw, types.MapKind)

	cnt := uint64(0)
	for {
		row, err := r.Read()
		if cnt >= limit || err == io.EOF {
			break
		} else if err != nil {
			panic(err)
		}
		cnt++

		fields := readFieldsFromRow(row, headersRaw, fieldOrder, kindMap)
		graphKeys, mapKey := primaryKeyValuesFromFields(fields, fieldOrder, pkIndices)
		st := temp.NewStruct(fields)
		gb.MapSet(graphKeys, mapKey, st)
	}
	return gb.Build().(types.Map)
}
@@ -1,236 +0,0 @@
// Copyright 2016 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package csv

import (
	"bytes"
	"encoding/csv"
	"math"
	"testing"

	"github.com/attic-labs/noms/go/chunks"
	"github.com/attic-labs/noms/go/datas"
	"github.com/attic-labs/noms/go/types"
	"github.com/stretchr/testify/assert"
)

var LIMIT = uint64(math.MaxUint64)

func TestReadToList(t *testing.T) {
	assert := assert.New(t)
	storage := &chunks.MemoryStorage{}
	db := datas.NewDatabase(storage.NewView())

	dataString := `a,1,true
b,2,false
`
	r := NewCSVReader(bytes.NewBufferString(dataString), ',')

	headers := []string{"A", "B", "C"}
	kinds := KindSlice{types.StringKind, types.FloatKind, types.BoolKind}
	l := ReadToList(r, "test", headers, kinds, db, LIMIT)

	assert.Equal(uint64(2), l.Len())

	assert.True(l.Get(0).(types.Struct).Get("A").Equals(types.String("a")))
	assert.True(l.Get(1).(types.Struct).Get("A").Equals(types.String("b")))

	assert.True(l.Get(0).(types.Struct).Get("B").Equals(types.Float(1)))
	assert.True(l.Get(1).(types.Struct).Get("B").Equals(types.Float(2)))

	assert.True(l.Get(0).(types.Struct).Get("C").Equals(types.Bool(true)))
	assert.True(l.Get(1).(types.Struct).Get("C").Equals(types.Bool(false)))
}

func TestReadToMap(t *testing.T) {
	assert := assert.New(t)
	storage := &chunks.MemoryStorage{}
	db := datas.NewDatabase(storage.NewView())

	dataString := `a,1,true
b,2,false
`
	r := NewCSVReader(bytes.NewBufferString(dataString), ',')

	headers := []string{"A", "B", "C"}
	kinds := KindSlice{types.StringKind, types.FloatKind, types.BoolKind}
	m := ReadToMap(r, "test", headers, []string{"0"}, kinds, db, LIMIT)

	assert.Equal(uint64(2), m.Len())
	assert.True(types.TypeOf(m).Equals(
		types.MakeMapType(types.StringType, types.MakeStructType("test",
			types.StructField{"A", types.StringType, false},
			types.StructField{"B", types.FloatType, false},
			types.StructField{"C", types.BoolType, false},
		))))

	assert.True(m.Get(types.String("a")).Equals(types.NewStruct("test", types.StructData{
		"A": types.String("a"),
		"B": types.Float(1),
		"C": types.Bool(true),
	})))
	assert.True(m.Get(types.String("b")).Equals(types.NewStruct("test", types.StructData{
		"A": types.String("b"),
		"B": types.Float(2),
		"C": types.Bool(false),
	})))
}

func testTrailingHelper(t *testing.T, dataString string) {
	assert := assert.New(t)
	storage := &chunks.MemoryStorage{}
	db1 := datas.NewDatabase(storage.NewView())
	defer db1.Close()

	r := NewCSVReader(bytes.NewBufferString(dataString), ',')

	headers := []string{"A", "B"}
	kinds := KindSlice{types.StringKind, types.StringKind}
	l := ReadToList(r, "test", headers, kinds, db1, LIMIT)
	assert.Equal(uint64(3), l.Len())

	storage = &chunks.MemoryStorage{}
	db2 := datas.NewDatabase(storage.NewView())
	defer db2.Close()
	r = NewCSVReader(bytes.NewBufferString(dataString), ',')
	m := ReadToMap(r, "test", headers, []string{"0"}, kinds, db2, LIMIT)
	assert.Equal(uint64(3), m.Len())
}

func TestReadTrailingHole(t *testing.T) {
	dataString := `a,b,
d,e,
g,h,
`
	testTrailingHelper(t, dataString)
}

func TestReadTrailingHoles(t *testing.T) {
	dataString := `a,b,,
d,e
g,h
`
	testTrailingHelper(t, dataString)
}

func TestReadTrailingValues(t *testing.T) {
	dataString := `a,b
d,e,f
g,h,i,j
`
	testTrailingHelper(t, dataString)
}

func TestEscapeStructFieldFromCSV(t *testing.T) {
	assert := assert.New(t)
	cases := []string{
		"a", "a",
		"1a", "a",
		"AaZz19_", "AaZz19_",
		"Q", "Q",
		"AQ", "AQ",
		"_content", "content",
		"Few ¢ents Short", "fewEntsShort",
		"CAMEL💩case letTerS", "camelcaseLetters",
		"https://picasaweb.google.com/data", "httpspicasawebgooglecomdata",
		"💩", "",
		"11 1💩", "",
		"-- A B", "aB",
		"-- A --", "a",
		"-- A -- B", "aB",
	}

	for i := 0; i < len(cases); i += 2 {
		orig, expected := cases[i], cases[i+1]
		assert.Equal(expected, EscapeStructFieldFromCSV(orig))
	}
}

func TestReadParseError(t *testing.T) {
	assert := assert.New(t)
	storage := &chunks.MemoryStorage{}
	db := datas.NewDatabase(storage.NewView())

	dataString := `a,"b`
	r := NewCSVReader(bytes.NewBufferString(dataString), ',')

	headers := []string{"A", "B"}
	kinds := KindSlice{types.StringKind, types.StringKind}
	func() {
		defer func() {
			r := recover()
			assert.NotNil(r)
			_, ok := r.(*csv.ParseError)
			assert.True(ok, "Should be a ParseError")
		}()
		ReadToList(r, "test", headers, kinds, db, LIMIT)
	}()
}

func TestDuplicateHeaderName(t *testing.T) {
	assert := assert.New(t)
	storage := &chunks.MemoryStorage{}
	db := datas.NewDatabase(storage.NewView())
	dataString := "1,2\n3,4\n"
	r := NewCSVReader(bytes.NewBufferString(dataString), ',')
	headers := []string{"A", "A"}
	kinds := KindSlice{types.StringKind, types.StringKind}
	assert.Panics(func() { ReadToList(r, "test", headers, kinds, db, LIMIT) })
}

func TestEscapeFieldNames(t *testing.T) {
	assert := assert.New(t)
	storage := &chunks.MemoryStorage{}
	db := datas.NewDatabase(storage.NewView())
	dataString := "1,2\n"
	r := NewCSVReader(bytes.NewBufferString(dataString), ',')
	headers := []string{"A A", "B"}
	kinds := KindSlice{types.FloatKind, types.FloatKind}

	l := ReadToList(r, "test", headers, kinds, db, LIMIT)
	assert.Equal(uint64(1), l.Len())
	assert.Equal(types.Float(1), l.Get(0).(types.Struct).Get(EscapeStructFieldFromCSV("A A")))

	r = NewCSVReader(bytes.NewBufferString(dataString), ',')
	m := ReadToMap(r, "test", headers, []string{"1"}, kinds, db, LIMIT)
	assert.Equal(uint64(1), m.Len())
	assert.Equal(types.Float(1), m.Get(types.Float(2)).(types.Struct).Get(EscapeStructFieldFromCSV("A A")))
}

func TestDefaults(t *testing.T) {
	assert := assert.New(t)
	storage := &chunks.MemoryStorage{}
	db := datas.NewDatabase(storage.NewView())
	dataString := "42,,,\n"
	r := NewCSVReader(bytes.NewBufferString(dataString), ',')
	headers := []string{"A", "B", "C", "D"}
	kinds := KindSlice{types.FloatKind, types.FloatKind, types.BoolKind, types.StringKind}

	l := ReadToList(r, "test", headers, kinds, db, LIMIT)
	assert.Equal(uint64(1), l.Len())
	row := l.Get(0).(types.Struct)
	assert.Equal(types.Float(42), row.Get("A"))
	assert.Equal(types.Float(0), row.Get("B"))
	assert.Equal(types.Bool(false), row.Get("C"))
	assert.Equal(types.String(""), row.Get("D"))
}

func TestBooleanStrings(t *testing.T) {
	assert := assert.New(t)
	storage := &chunks.MemoryStorage{}
	db := datas.NewDatabase(storage.NewView())
	dataString := "true,false\n1,0\ny,n\nY,N\nY,\n"
	r := NewCSVReader(bytes.NewBufferString(dataString), ',')
	headers := []string{"T", "F"}
	kinds := KindSlice{types.BoolKind, types.BoolKind}

	l := ReadToList(r, "test", headers, kinds, db, LIMIT)
	assert.Equal(uint64(5), l.Len())
	for i := uint64(0); i < l.Len(); i++ {
		row := l.Get(i).(types.Struct)
		assert.True(types.Bool(true).Equals(row.Get("T")))
		assert.True(types.Bool(false).Equals(row.Get("F")))
	}
}
@@ -1,244 +0,0 @@
// Copyright 2016 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package csv

import (
	"encoding/csv"
	"fmt"
	"io"
	"math"
	"strconv"

	"github.com/attic-labs/noms/go/d"
	"github.com/attic-labs/noms/go/types"
)

type schemaOptions []*typeCanFit

func newSchemaOptions(fieldCount int) schemaOptions {
	options := make([]*typeCanFit, fieldCount)
	for i := 0; i < fieldCount; i++ {
		options[i] = &typeCanFit{true, true, true}
	}
	return options
}

func (so schemaOptions) Test(fields []string) {
	for i, t := range so {
		if i < len(fields) {
			t.Test(fields[i])
		}
	}
}

func (so schemaOptions) MostSpecificKinds() KindSlice {
	kinds := make(KindSlice, len(so))
	for i, t := range so {
		kinds[i] = t.MostSpecificKind()
	}
	return kinds
}

func (so schemaOptions) ValidKinds() []KindSlice {
	kinds := make([]KindSlice, len(so))
	for i, t := range so {
		kinds[i] = t.ValidKinds()
	}
	return kinds
}

type typeCanFit struct {
	boolType   bool
	numberType bool
	stringType bool
}

func (tc *typeCanFit) MostSpecificKind() types.NomsKind {
	if tc.boolType {
		return types.BoolKind
	} else if tc.numberType {
		return types.FloatKind
	} else {
		return types.StringKind
	}
}

func (tc *typeCanFit) ValidKinds() (kinds KindSlice) {
	if tc.numberType {
		kinds = append(kinds, types.FloatKind)
	}
	if tc.boolType {
		kinds = append(kinds, types.BoolKind)
	}
	kinds = append(kinds, types.StringKind)
	return kinds
}

func (tc *typeCanFit) Test(value string) {
	tc.testNumbers(value)
	tc.testBool(value)
}

func (tc *typeCanFit) testNumbers(value string) {
	if !tc.numberType {
		return
	}

	fval, err := strconv.ParseFloat(value, 64)
	if err != nil {
		tc.numberType = false
		return
	}

	// ParseFloat returns +Inf for "Inf"-like input without an error;
	// treat that as non-numeric.
	if fval > math.MaxFloat64 {
		tc.numberType = false
	}
}

func (tc *typeCanFit) testBool(value string) {
	if !tc.boolType {
		return
	}
	_, err := strconv.ParseBool(value)
	tc.boolType = err == nil
}

// GetSchema reads up to numSamples rows from r and returns the most specific kind that fits each of the first numFields columns
func GetSchema(r *csv.Reader, numSamples int, numFields int) KindSlice {
	so := newSchemaOptions(numFields)
	for i := 0; i < numSamples; i++ {
		row, err := r.Read()
		if err == io.EOF {
			break
		}
		so.Test(row)
	}
	return so.MostSpecificKinds()
}

func GetFieldNamesFromIndices(headers []string, indices []int) []string {
	result := make([]string, len(indices))
	for i, idx := range indices {
		result[i] = headers[idx]
	}
	return result
}

// combinationsWithLength - n choose m combination without repetition - emits every possible `length`-element combination of values
func combinationsWithLength(values []int, length int, emit func([]int)) {
	n := len(values)

	if length > n {
		return
	}

	indices := make([]int, length)
	for i := range indices {
		indices[i] = i
	}

	result := make([]int, length)
	for i, l := range indices {
		result[i] = values[l]
	}
	emit(result)

	for {
		i := length - 1
		for ; i >= 0 && indices[i] == i+n-length; i-- {
		}

		if i < 0 {
			return
		}

		indices[i]++
		for j := i + 1; j < length; j++ {
			indices[j] = indices[j-1] + 1
		}

		for ; i < len(indices); i++ {
			result[i] = values[indices[i]]
		}
		emit(result)
	}
}

// combinationsLengthsFromTo - n choose m combination without repetition - emits all possible combinations of all lengths from smallestLength to largestLength (inclusive)
func combinationsLengthsFromTo(values []int, smallestLength, largestLength int, emit func([]int)) {
	for i := smallestLength; i <= largestLength; i++ {
		combinationsWithLength(values, i, emit)
	}
}

func makeKeyString(row []string, indices []int, separator string) string {
	var result string
	for _, i := range indices {
		result += separator
		result += row[i]
	}
	return result
}

// FindPrimaryKeys reads up to numSamples rows from r, considering the first numFields columns, and returns the slices of []int indices that are primary keys for those samples
func FindPrimaryKeys(r *csv.Reader, numSamples, maxLenPrimaryKeyList, numFields int) [][]int {
	dataToTest := make([][]string, 0, numSamples)
	for i := 0; i < numSamples; i++ {
		row, err := r.Read()
		if err == io.EOF {
			break
		}
		dataToTest = append(dataToTest, row)
	}

	indices := make([]int, numFields)
	for i := 0; i < numFields; i++ {
		indices[i] = i
	}

	pksFound := make([][]int, 0)
	combinationsLengthsFromTo(indices, 1, maxLenPrimaryKeyList, func(combination []int) {
		keys := make(map[string]bool, numSamples)
		for _, row := range dataToTest {
			key := makeKeyString(row, combination, "$&$")
			if _, ok := keys[key]; ok {
				return
			}
			keys[key] = true
		}
		// need to copy the combination because it will be changed by the caller
		pksFound = append(pksFound, append([]int{}, combination...))
	})
	return pksFound
}

// StringToValue takes a piece of data as a string and attempts to convert it to a types.Value of the appropriate types.NomsKind.
func StringToValue(s string, k types.NomsKind) (types.Value, error) {
	switch k {
	case types.FloatKind:
		if s == "" {
			return types.Float(float64(0)), nil
		}
		fval, err := strconv.ParseFloat(s, 64)
		if err != nil {
			return nil, fmt.Errorf("Could not parse '%s' into number (%s)", s, err)
		}
		return types.Float(fval), nil
	case types.BoolKind:
		// TODO: This should probably be configurable.
		switch s {
		case "true", "1", "y", "yes", "Y", "YES":
			return types.Bool(true), nil
		case "false", "0", "n", "no", "N", "NO", "":
			return types.Bool(false), nil
		default:
			return nil, fmt.Errorf("Could not parse '%s' into bool", s)
		}
	case types.StringKind:
		return types.String(s), nil
	default:
		d.Panic("Invalid column type kind:", k)
	}
	panic("not reached")
}
@@ -1,351 +0,0 @@
// Copyright 2016 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package csv

import (
	"fmt"
	"testing"

	"github.com/attic-labs/noms/go/types"
	"github.com/stretchr/testify/assert"
)

func TestSchemaDetection(t *testing.T) {
	assert := assert.New(t)
	test := func(input [][]string, expect []KindSlice) {
		options := newSchemaOptions(len(input[0]))
		for _, values := range input {
			options.Test(values)
		}

		assert.Equal(expect, options.ValidKinds())
	}
	test(
		[][]string{
			{"foo", "1", "5"},
			{"bar", "0", "10"},
			{"true", "1", "23"},
			{"1", "1", "60"},
			{"1.1", "false", "75"},
		},
		[]KindSlice{
			{types.StringKind},
			{types.BoolKind, types.StringKind},
			{types.FloatKind, types.StringKind},
		},
	)
	test(
		[][]string{
			{"foo"},
			{"bar"},
			{"true"},
			{"1"},
			{"1.1"},
		},
		[]KindSlice{
			{types.StringKind},
		},
	)
	test(
		[][]string{
			{"true"},
			{"1"},
			{"1.1"},
		},
		[]KindSlice{
			{types.StringKind},
		},
	)
	test(
		[][]string{
			{"true"},
			{"false"},
			{"True"},
			{"False"},
			{"TRUE"},
			{"FALSE"},
			{"1"},
			{"0"},
		},
		[]KindSlice{
			{types.BoolKind, types.StringKind},
		},
	)
	test(
		[][]string{
			{"1"},
			{"1.1"},
		},
		[]KindSlice{
			{types.FloatKind, types.StringKind},
		},
	)
	test(
		[][]string{
			{"1"},
			{"1.1"},
			{"4.940656458412465441765687928682213723651e-50"},
			{"-4.940656458412465441765687928682213723651e-50"},
		},
		[]KindSlice{
			{types.FloatKind, types.StringKind},
		},
	)

	test(
		[][]string{
			{"1"},
			{"1.1"},
			{"1.797693134862315708145274237317043567981e+102"},
			{"-1.797693134862315708145274237317043567981e+102"},
		},
		[]KindSlice{
			{types.FloatKind, types.StringKind},
		},
	)
	test(
		[][]string{
			{"1"},
			{"1.1"},
			{"1.797693134862315708145274237317043567981e+309"},
			{"-1.797693134862315708145274237317043567981e+309"},
		},
		[]KindSlice{
			{types.StringKind},
		},
	)
	test(
		[][]string{
			{"1"},
			{"0"},
		},
		[]KindSlice{
			{types.FloatKind, types.BoolKind, types.StringKind},
		},
	)
	test(
		[][]string{
			{"1"},
			{"0"},
			{"-1"},
		},
		[]KindSlice{
			{types.FloatKind, types.StringKind},
		},
	)
	test(
		[][]string{
			{"0"},
			{"-0"},
		},
		[]KindSlice{
			{types.FloatKind, types.StringKind},
		},
	)
	test(
		[][]string{
			{"1"},
			{"280"},
			{"0"},
			{"-1"},
		},
		[]KindSlice{
			{types.FloatKind, types.StringKind},
		},
	)
	test(
		[][]string{
			{"1"},
			{"-180"},
			{"0"},
			{"-1"},
		},
		[]KindSlice{
			{types.FloatKind, types.StringKind},
		},
	)
	test(
		[][]string{
			{"1"},
			{"33000"},
			{"0"},
			{"-1"},
		},
		[]KindSlice{
			{types.FloatKind, types.StringKind},
		},
	)
	test(
		[][]string{
			{"1"},
			{"-44000"},
			{"0"},
			{"-1"},
		},
		[]KindSlice{
			{types.FloatKind, types.StringKind},
		},
	)
	test(
		[][]string{
			{"1"},
			{"2547483648"},
			{"0"},
			{"-1"},
		},
		[]KindSlice{
			{types.FloatKind, types.StringKind},
		},
	)
	test(
		[][]string{
			{"1"},
			{"-4347483648"},
			{"0"},
			{"-1"},
		},
		[]KindSlice{
			{types.FloatKind, types.StringKind},
		},
	)
	test(
		[][]string{
			{fmt.Sprintf("%d", uint64(1<<63))},
			{fmt.Sprintf("%d", uint64(1<<63)+1)},
		},
		[]KindSlice{
			{types.FloatKind, types.StringKind},
		},
	)
	test(
		[][]string{
			{fmt.Sprintf("%d", uint64(1<<32))},
			{fmt.Sprintf("%d", uint64(1<<32)+1)},
		},
		[]KindSlice{
			{types.FloatKind, types.StringKind},
		},
	)
}

func TestCombinationsWithLength(t *testing.T) {
	assert := assert.New(t)
	test := func(input []int, length int, expect [][]int) {
		combinations := make([][]int, 0)
		combinationsWithLength(input, length, func(combination []int) {
			combinations = append(combinations, append([]int{}, combination...))
|
||||
})
|
||||
|
||||
assert.Equal(expect, combinations)
|
||||
}
|
||||
test([]int{0}, 1, [][]int{
|
||||
{0},
|
||||
})
|
||||
test([]int{1}, 1, [][]int{
|
||||
{1},
|
||||
})
|
||||
test([]int{0, 1}, 1, [][]int{
|
||||
{0},
|
||||
{1},
|
||||
})
|
||||
test([]int{0, 1}, 2, [][]int{
|
||||
{0, 1},
|
||||
})
|
||||
test([]int{70, 80, 90, 100}, 1, [][]int{
|
||||
{70},
|
||||
{80},
|
||||
{90},
|
||||
{100},
|
||||
})
|
||||
test([]int{70, 80, 90, 100}, 2, [][]int{
|
||||
{70, 80},
|
||||
{70, 90},
|
||||
{70, 100},
|
||||
{80, 90},
|
||||
{80, 100},
|
||||
{90, 100},
|
||||
})
|
||||
test([]int{70, 80, 90, 100}, 3, [][]int{
|
||||
{70, 80, 90},
|
||||
{70, 80, 100},
|
||||
{70, 90, 100},
|
||||
{80, 90, 100},
|
||||
})
|
||||
}
|
||||
|
||||
func TestCombinationsWithLengthFromTo(t *testing.T) {
|
||||
assert := assert.New(t)
|
||||
test := func(input []int, smallestLength, largestLength int, expect [][]int) {
|
||||
combinations := make([][]int, 0)
|
||||
combinationsLengthsFromTo(input, smallestLength, largestLength, func(combination []int) {
|
||||
combinations = append(combinations, append([]int{}, combination...))
|
||||
})
|
||||
|
||||
assert.Equal(expect, combinations)
|
||||
}
|
||||
test([]int{0}, 1, 1, [][]int{
|
||||
{0},
|
||||
})
|
||||
test([]int{1}, 1, 1, [][]int{
|
||||
{1},
|
||||
})
|
||||
test([]int{0, 1}, 1, 2, [][]int{
|
||||
{0},
|
||||
{1},
|
||||
{0, 1},
|
||||
})
|
||||
test([]int{0, 1}, 2, 2, [][]int{
|
||||
{0, 1},
|
||||
})
|
||||
test([]int{70, 80, 90, 100}, 1, 3, [][]int{
|
||||
{70},
|
||||
{80},
|
||||
{90},
|
||||
{100},
|
||||
{70, 80},
|
||||
{70, 90},
|
||||
{70, 100},
|
||||
{80, 90},
|
||||
{80, 100},
|
||||
{90, 100},
|
||||
{70, 80, 90},
|
||||
{70, 80, 100},
|
||||
{70, 90, 100},
|
||||
{80, 90, 100},
|
||||
})
|
||||
}
|
||||
@@ -1,107 +0,0 @@
// Copyright 2016 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package csv

import (
	"encoding/csv"
	"fmt"
	"io"

	"github.com/attic-labs/noms/go/d"
	"github.com/attic-labs/noms/go/types"
)

func getElemDesc(s types.Collection, index int) types.StructDesc {
	t := types.TypeOf(s).Desc.(types.CompoundDesc).ElemTypes[index]
	if types.StructKind != t.TargetKind() {
		d.Panic("Expected StructKind, found %s", t.Kind())
	}
	return t.Desc.(types.StructDesc)
}

// GetListElemDesc ensures that l is a types.List of structs, pulls the types.StructDesc that describes the elements of l out of vr, and returns the StructDesc.
func GetListElemDesc(l types.List, vr types.ValueReader) types.StructDesc {
	return getElemDesc(l, 0)
}

// GetMapElemDesc ensures that m is a types.Map of structs, pulls the types.StructDesc that describes the elements of m out of vr, and returns the StructDesc.
// If m is a nested types.Map of types.Maps, GetMapElemDesc descends the levels of the enclosed types.Maps to get to a types.Struct.
func GetMapElemDesc(m types.Map, vr types.ValueReader) types.StructDesc {
	t := types.TypeOf(m).Desc.(types.CompoundDesc).ElemTypes[1]
	if types.StructKind == t.TargetKind() {
		return t.Desc.(types.StructDesc)
	} else if types.MapKind == t.TargetKind() {
		_, v := m.First()
		return GetMapElemDesc(v.(types.Map), vr)
	}
	panic(fmt.Sprintf("Expected StructKind or MapKind, found %s", t.Kind().String()))
}

func writeValuesFromChan(structChan chan types.Struct, sd types.StructDesc, comma rune, output io.Writer) {
	fieldNames := getFieldNamesFromStruct(sd)
	csvWriter := csv.NewWriter(output)
	csvWriter.Comma = comma
	if csvWriter.Write(fieldNames) != nil {
		d.Panic("Failed to write header %v", fieldNames)
	}
	record := make([]string, len(fieldNames))
	for s := range structChan {
		i := 0
		s.WalkValues(func(v types.Value) {
			record[i] = fmt.Sprintf("%v", v)
			i++
		})
		if csvWriter.Write(record) != nil {
			d.Panic("Failed to write record %v", record)
		}
	}

	csvWriter.Flush()
	if csvWriter.Error() != nil {
		d.Panic("error flushing csv")
	}
}

// WriteList takes a types.List l of structs (described by sd) and writes it to output as comma-delimited values.
func WriteList(l types.List, sd types.StructDesc, comma rune, output io.Writer) {
	structChan := make(chan types.Struct, 1024)
	go func() {
		l.IterAll(func(v types.Value, index uint64) {
			structChan <- v.(types.Struct)
		})
		close(structChan)
	}()
	writeValuesFromChan(structChan, sd, comma, output)
}

func sendMapValuesToChan(m types.Map, structChan chan<- types.Struct) {
	m.IterAll(func(k, v types.Value) {
		if subMap, ok := v.(types.Map); ok {
			sendMapValuesToChan(subMap, structChan)
		} else {
			structChan <- v.(types.Struct)
		}
	})
}

// WriteMap takes a types.Map m of structs (described by sd) and writes it to output as comma-delimited values.
func WriteMap(m types.Map, sd types.StructDesc, comma rune, output io.Writer) {
	structChan := make(chan types.Struct, 1024)
	go func() {
		sendMapValuesToChan(m, structChan)
		close(structChan)
	}()
	writeValuesFromChan(structChan, sd, comma, output)
}

func getFieldNamesFromStruct(structDesc types.StructDesc) (fieldNames []string) {
	structDesc.IterFields(func(name string, t *types.Type, optional bool) {
		if !types.IsPrimitiveKind(t.TargetKind()) {
			d.Panic("Expected primitive kind, found %s", t.TargetKind().String())
		}
		fieldNames = append(fieldNames, name)
	})
	return
}
@@ -1,151 +0,0 @@
// Copyright 2016 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package csv

import (
	"bytes"
	"encoding/csv"
	"fmt"
	"io"
	"io/ioutil"
	"os"
	"strings"
	"testing"

	"github.com/attic-labs/noms/go/chunks"
	"github.com/attic-labs/noms/go/d"
	"github.com/attic-labs/noms/go/datas"
	"github.com/attic-labs/noms/go/types"
	"github.com/attic-labs/noms/go/util/clienttest"
	"github.com/stretchr/testify/suite"
)

const (
	TEST_ROW_STRUCT_NAME = "row"
	TEST_ROW_FIELDS      = "anid,month,rainfall,year"
	TEST_DATA_SIZE       = 200
	TEST_YEAR            = 2012
)

func TestCSVWrite(t *testing.T) {
	suite.Run(t, &csvWriteTestSuite{})
}

type csvWriteTestSuite struct {
	clienttest.ClientTestSuite
	fieldTypes    []*types.Type
	rowStructDesc types.StructDesc
	comma         rune
	tmpFileName   string
}

func typesToKinds(ts []*types.Type) KindSlice {
	kinds := make(KindSlice, len(ts))
	for i, t := range ts {
		kinds[i] = t.TargetKind()
	}
	return kinds
}

func (s *csvWriteTestSuite) SetupTest() {
	input, err := ioutil.TempFile(s.TempDir, "")
	d.Chk.NoError(err)
	s.tmpFileName = input.Name()
	defer input.Close()

	fieldNames := strings.Split(TEST_ROW_FIELDS, ",")
	s.fieldTypes = []*types.Type{types.StringType, types.FloaTType, types.FloaTType, types.FloaTType}
	fields := make([]types.StructField, len(fieldNames))
	for i, name := range fieldNames {
		fields[i] = types.StructField{
			Name: name,
			Type: s.fieldTypes[i],
		}
	}
	rowStructType := types.MakeStructType(TEST_ROW_STRUCT_NAME, fields...)
	s.rowStructDesc = rowStructType.Desc.(types.StructDesc)
	s.comma, _ = StringToRune(",")
	createCsvTestExpectationFile(input)
}

func (s *csvWriteTestSuite) TearDownTest() {
	os.Remove(s.tmpFileName)
}

func createCsvTestExpectationFile(w io.Writer) {
	_, err := io.WriteString(w, TEST_ROW_FIELDS)
	d.Chk.NoError(err)
	_, err = io.WriteString(w, "\n")
	d.Chk.NoError(err)
	for i := 0; i < TEST_DATA_SIZE; i++ {
		_, err = io.WriteString(w, fmt.Sprintf("a - %3d,%d,%d,%d\n", i, i%12, i%32, TEST_YEAR+i%4))
		d.Chk.NoError(err)
	}
}

func startReadingCsvTestExpectationFile(s *csvWriteTestSuite) (cr *csv.Reader, headers []string) {
	res, err := os.Open(s.tmpFileName)
	d.PanicIfError(err)
	cr = NewCSVReader(res, s.comma)
	headers, err = cr.Read()
	d.PanicIfError(err)
	return
}

func createTestList(s *csvWriteTestSuite) types.List {
	storage := &chunks.MemoryStorage{}
	db := datas.NewDatabase(storage.NewView())
	cr, headers := startReadingCsvTestExpectationFile(s)
	l := ReadToList(cr, TEST_ROW_STRUCT_NAME, headers, typesToKinds(s.fieldTypes), db, LIMIT)
	return l
}

func createTestMap(s *csvWriteTestSuite) types.Map {
	storage := &chunks.MemoryStorage{}
	db := datas.NewDatabase(storage.NewView())
	cr, headers := startReadingCsvTestExpectationFile(s)
	return ReadToMap(cr, TEST_ROW_STRUCT_NAME, headers, []string{"anid"}, typesToKinds(s.fieldTypes), db, LIMIT)
}

func createTestNestedMap(s *csvWriteTestSuite) types.Map {
	storage := &chunks.MemoryStorage{}
	db := datas.NewDatabase(storage.NewView())
	cr, headers := startReadingCsvTestExpectationFile(s)
	return ReadToMap(cr, TEST_ROW_STRUCT_NAME, headers, []string{"anid", "year"}, typesToKinds(s.fieldTypes), db, LIMIT)
}

func verifyOutput(s *csvWriteTestSuite, r io.Reader) {
	res, err := os.Open(s.tmpFileName)
	d.PanicIfError(err)
	actual, err := ioutil.ReadAll(r)
	d.Chk.NoError(err)
	expected, err := ioutil.ReadAll(res)
	d.Chk.NoError(err)
	s.True(string(expected) == string(actual), "csv files are different")
}

func (s *csvWriteTestSuite) TestCSVWriteList() {
	l := createTestList(s)
	w := new(bytes.Buffer)
	s.True(TEST_DATA_SIZE == l.Len(), "list length")
	WriteList(l, s.rowStructDesc, s.comma, w)
	verifyOutput(s, w)
}

func (s *csvWriteTestSuite) TestCSVWriteMap() {
	m := createTestMap(s)
	w := new(bytes.Buffer)
	s.True(TEST_DATA_SIZE == m.Len(), "map length")
	WriteMap(m, s.rowStructDesc, s.comma, w)
	verifyOutput(s, w)
}

func (s *csvWriteTestSuite) TestCSVWriteNestedMap() {
	m := createTestNestedMap(s)
	w := new(bytes.Buffer)
	s.True(TEST_DATA_SIZE == m.Len(), "nested map length")
	WriteMap(m, s.rowStructDesc, s.comma, w)
	verifyOutput(s, w)
}
@@ -1,9 +0,0 @@
# About

This directory contains two sample applications that demonstrate using Noms in a decentralized environment.

Both applications implement multiuser chat, using different strategies.

`p2p-chat` is the simplest possible example: a fully local Noms replica is run on each node, and all nodes synchronize continuously with each other over HTTP.

`ipfs-chat` backs Noms with the [IPFS](https://ipfs.io/) network, so that nodes don't have to keep a full local replica of all data. However, because [Filecoin](http://filecoin.io/) doesn't yet exist, *some node* does have to keep a full replica, so `ipfs-chat` has a `daemon` mode that lets you run a persistent node somewhere to act as the replica of last resort.
@@ -1,47 +0,0 @@
// Copyright 2017 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package dbg

import (
	"fmt"
	"log"
	"os"
	"strconv"

	"github.com/attic-labs/noms/go/d"
)

var (
	Filepath = "/tmp/noms-dbg.log"
	lg       = NewLogger(Filepath)
)

func NewLogger(fp string) *log.Logger {
	f, err := os.OpenFile(fp, os.O_RDWR|os.O_CREATE|os.O_APPEND, 0644)
	d.PanicIfError(err)
	pid := strconv.FormatInt(int64(os.Getpid()), 10)
	// No log flags; the 0644 previously passed here was a file mode
	// mistakenly reused as the flag set.
	return log.New(f, pid+": ", 0)
}

func GetLogger() *log.Logger {
	return lg
}

func SetLogger(newLg *log.Logger) {
	lg = newLg
}

func Debug(s string, args ...interface{}) {
	s1 := fmt.Sprintf(s, args...)
	lg.Println(s1)
}

func BoxF(s string, args ...interface{}) func() {
	s1 := fmt.Sprintf(s, args...)
	Debug("starting %s", s1)
	f := func() {
		Debug("finished %s", s1)
	}
	return f
}
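BoxF returns a closure meant to be invoked via `defer`, so a single line brackets a region with "starting"/"finished" log lines (the `defer dbg.BoxF(...)()` pattern used throughout ipfs-chat). A standalone sketch of the pattern, with an in-memory log substituted for the file-backed logger:

```go
package main

import "fmt"

// A trimmed stand-in for the dbg package: Debug appends to an in-memory log.
var logLines []string

func Debug(s string, args ...interface{}) {
	logLines = append(logLines, fmt.Sprintf(s, args...))
}

// BoxF logs "starting ..." immediately and returns a closure that logs
// "finished ..."; deferring the returned closure brackets the caller's body.
func BoxF(s string, args ...interface{}) func() {
	s1 := fmt.Sprintf(s, args...)
	Debug("starting %s", s1)
	return func() { Debug("finished %s", s1) }
}

func processInput(msg string) {
	defer BoxF("processInput, msg: %s", msg)() // note the trailing (): defer the returned closure
	Debug("doing work")
}

func main() {
	processInput("hi")
	for _, l := range logLines {
		fmt.Println(l)
	}
}
```

Forgetting the trailing `()` defers BoxF itself rather than the closure it returns, so the "finished" line is logged immediately instead of at function exit.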
@@ -1,189 +0,0 @@
// Copyright 2017 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package main

import (
	"fmt"
	"log"
	"os"
	"os/signal"
	"runtime"
	"syscall"

	"github.com/attic-labs/noms/go/chunks"
	"github.com/attic-labs/noms/go/d"
	"github.com/attic-labs/noms/go/datas"
	"github.com/attic-labs/noms/go/ipfs"
	"github.com/attic-labs/noms/go/spec"
	"github.com/attic-labs/noms/samples/go/decent/dbg"
	"github.com/attic-labs/noms/samples/go/decent/lib"
	"github.com/ipfs/go-ipfs/core"
	"github.com/jroimartin/gocui"
	"gopkg.in/alecthomas/kingpin.v2"
)

func main() {
	// allow short (-h) help
	kingpin.CommandLine.HelpFlag.Short('h')

	clientCmd := kingpin.Command("client", "runs the ipfs-chat client UI")
	clientTopic := clientCmd.Flag("topic", "IPFS pubsub topic to publish and subscribe to").Default("ipfs-chat").String()
	username := clientCmd.Flag("username", "username to sign in as").String()
	nodeIdx := clientCmd.Flag("node-idx", "a single digit to be used as the last digit in all port values: api, gateway and swarm (must be 0-9 inclusive)").Default("-1").Int()
	clientDS := clientCmd.Arg("dataset", "the dataset spec to store chat data in").Required().String()

	importCmd := kingpin.Command("import", "imports data into a chat")
	importDir := importCmd.Flag("dir", "directory that contains data to import").Default("./data").ExistingDir()
	importDS := importCmd.Arg("dataset", "the dataset spec to import chat data to").Required().String()

	daemonCmd := kingpin.Command("daemon", "runs a daemon that simulates filecoin, eagerly storing all chunks for a chat")
	daemonTopic := daemonCmd.Flag("topic", "IPFS pubsub topic to publish and subscribe to").Default("ipfs-chat").String()
	daemonInterval := daemonCmd.Flag("interval", "amount of time to wait before publishing state to network").Default("5s").Duration()
	daemonNodeIdx := daemonCmd.Flag("node-idx", "a single digit to be used as the last digit in all port values: api, gateway and swarm (must be 0-9 inclusive)").Default("-1").Int()
	daemonDS := daemonCmd.Arg("dataset", "the dataset spec indicating the ipfs repo to use").Required().String()

	kingpin.CommandLine.Help = "A demonstration of using Noms to build a scalable multiuser collaborative application."

	expandRLimit()
	switch kingpin.Parse() {
	case "client":
		cInfo := lib.ClientInfo{
			Topic:    *clientTopic,
			Username: *username,
			Idx:      *nodeIdx,
			IsDaemon: false,
			Delegate: lib.IPFSEventDelegate{},
		}
		runClient(*clientDS, cInfo)
	case "import":
		lib.RunImport(*importDir, *importDS)
	case "daemon":
		cInfo := lib.ClientInfo{
			Topic:    *daemonTopic,
			Username: "daemon",
			Interval: *daemonInterval,
			Idx:      *daemonNodeIdx,
			IsDaemon: true,
			Delegate: lib.IPFSEventDelegate{},
		}
		runDaemon(*daemonDS, cInfo)
	}
}

func runClient(ipfsSpec string, cInfo lib.ClientInfo) {
	dbg.SetLogger(lib.NewLogger(cInfo.Username))

	sp, err := spec.ForDataset(ipfsSpec)
	d.CheckErrorNoUsage(err)

	if !isIPFS(sp.Protocol) {
		fmt.Println("ipfs-chat requires an 'ipfs' dataset")
		os.Exit(1)
	}

	node, cs := initIPFSChunkStore(sp, cInfo.Idx)
	db := datas.NewDatabase(cs)

	// Get the head of the specified dataset.
	ds := db.GetDataset(sp.Path.Dataset)
	ds, err = lib.InitDatabase(ds)
	d.PanicIfError(err)

	events := make(chan lib.ChatEvent, 1024)
	t := lib.CreateTermUI(events)
	defer t.Close()

	d.PanicIfError(t.Layout())
	t.ResetAuthors(ds)
	t.UpdateMessages(ds, nil, nil)

	go lib.ProcessChatEvents(node, ds, events, t, cInfo)
	go lib.ReceiveMessages(node, events, cInfo)

	if err := t.Gui.MainLoop(); err != nil && err != gocui.ErrQuit {
		dbg.Debug("mainloop has exited, err: %v", err)
		log.Panicln(err)
	}
}

func runDaemon(ipfsSpec string, cInfo lib.ClientInfo) {
	dbg.SetLogger(log.New(os.Stdout, "", 0))

	sp, err := spec.ForDataset(ipfsSpec)
	d.CheckErrorNoUsage(err)

	if !isIPFS(sp.Protocol) {
		fmt.Println("ipfs-chat requires an 'ipfs' dataset")
		os.Exit(1)
	}

	// Create/open a new network chunkstore.
	node, cs := initIPFSChunkStore(sp, cInfo.Idx)
	db := datas.NewDatabase(cs)

	// Get the head of the specified dataset.
	ds := db.GetDataset(sp.Path.Dataset)
	ds, err = lib.InitDatabase(ds)
	d.PanicIfError(err)

	events := make(chan lib.ChatEvent, 1024)
	handleSIGQUIT(events)

	go lib.ReceiveMessages(node, events, cInfo)
	lib.ProcessChatEvents(node, ds, events, nil, cInfo)
}

func handleSIGQUIT(events chan<- lib.ChatEvent) {
	// Buffered: signal.Notify does not block sending to this channel.
	sigChan := make(chan os.Signal, 1)
	go func() {
		for range sigChan {
			stacktrace := make([]byte, 1024*1024)
			length := runtime.Stack(stacktrace, true)
			dbg.Debug(string(stacktrace[:length]))
			events <- lib.ChatEvent{EventType: lib.QuitEvent}
		}
	}()
	signal.Notify(sigChan, os.Interrupt)
	signal.Notify(sigChan, syscall.SIGQUIT)
}

// IPFS can use a lot of file descriptors. There are several bugs in the IPFS
// repo about this and plans to improve. For the time being, we bump the limits
// for this process.
func expandRLimit() {
	var rLimit syscall.Rlimit
	err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &rLimit)
	d.Chk.NoError(err, "Unable to query file rlimit: %s", err)
	if rLimit.Cur < rLimit.Max {
		rLimit.Max = 64000
		rLimit.Cur = 64000
		err = syscall.Setrlimit(syscall.RLIMIT_NOFILE, &rLimit)
		d.Chk.NoError(err, "Unable to increase number of open files limit: %s", err)
	}
	err = syscall.Getrlimit(syscall.RLIMIT_NOFILE, &rLimit)
	d.Chk.NoError(err)

	err = syscall.Getrlimit(8, &rLimit)
	d.Chk.NoError(err, "Unable to query thread rlimit: %s", err)
	if rLimit.Cur < rLimit.Max {
		rLimit.Max = 64000
		rLimit.Cur = 64000
		err = syscall.Setrlimit(8, &rLimit)
		d.Chk.NoError(err, "Unable to increase number of threads limit: %s", err)
	}
	err = syscall.Getrlimit(8, &rLimit)
	d.Chk.NoError(err)
}

func initIPFSChunkStore(sp spec.Spec, nodeIdx int) (*core.IpfsNode, chunks.ChunkStore) {
	// recreate the database so that we have control of the chunkstore's ipfs node
	node := ipfs.OpenIPFSRepo(sp.DatabaseName, nodeIdx)
	cs := ipfs.ChunkStoreFromIPFSNode(sp.DatabaseName, sp.Protocol == "ipfs-local", node, 1)
	return node, cs
}

func isIPFS(protocol string) bool {
	return protocol == "ipfs" || protocol == "ipfs-local"
}
@@ -1,67 +0,0 @@
// Copyright 2017 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package lib

import (
	"fmt"
	"strings"

	"github.com/attic-labs/noms/go/datas"
	"github.com/attic-labs/noms/go/marshal"
	"github.com/attic-labs/noms/go/types"
)

type dataPager struct {
	dataset    datas.Dataset
	msgKeyChan chan types.String
	doneChan   chan struct{}
	msgMap     types.Map
	terms      []string
}

func NewDataPager(ds datas.Dataset, mkChan chan types.String, doneChan chan struct{}, msgs types.Map, terms []string) *dataPager {
	return &dataPager{
		dataset:    ds,
		msgKeyChan: mkChan,
		doneChan:   doneChan,
		msgMap:     msgs,
		terms:      terms,
	}
}

func (dp *dataPager) Close() {
	dp.doneChan <- struct{}{}
}

func (dp *dataPager) Next() (string, bool) {
	msgKey := <-dp.msgKeyChan
	if msgKey == "" {
		return "", false
	}
	nm := dp.msgMap.Get(msgKey)

	var m Message
	err := marshal.Unmarshal(nm, &m)
	if err != nil {
		return fmt.Sprintf("ERROR: %s", err.Error()), true
	}

	s1 := fmt.Sprintf("%s: %s", m.Author, m.Body)
	s2 := highlightTerms(s1, dp.terms)
	return s2, true
}

func (dp *dataPager) Prepend(lines []string, target int) ([]string, bool) {
	new := []string{}
	m, ok := dp.Next()
	if !ok {
		return lines, false
	}
	for ; ok && len(new) < target; m, ok = dp.Next() {
		new1 := strings.Split(m, "\n")
		new = append(new1, new...)
	}
	return append(new, lines...), true
}
@@ -1,281 +0,0 @@
// Copyright 2017 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package lib

import (
	"context"
	"fmt"
	"time"

	"github.com/attic-labs/noms/go/d"
	"github.com/attic-labs/noms/go/datas"
	"github.com/attic-labs/noms/go/hash"
	"github.com/attic-labs/noms/go/ipfs"
	"github.com/attic-labs/noms/go/merge"
	"github.com/attic-labs/noms/go/spec"
	"github.com/attic-labs/noms/go/types"
	"github.com/attic-labs/noms/go/util/math"
	"github.com/attic-labs/noms/samples/go/decent/dbg"
	"github.com/ipfs/go-ipfs/core"
)

const (
	InputEvent  ChatEventType = "input"
	SearchEvent ChatEventType = "search"
	SyncEvent   ChatEventType = "sync"
	QuitEvent   ChatEventType = "quit"
)

type ClientInfo struct {
	Topic    string
	Username string
	Interval time.Duration
	Idx      int
	IsDaemon bool
	Dir      string
	Spec     spec.Spec
	Delegate EventDelegate
}

type ChatEventType string

type ChatEvent struct {
	EventType ChatEventType
	Event     string
}

type EventDelegate interface {
	PinBlocks(node *core.IpfsNode, sourceDB, sinkDB datas.Database, sourceCommit types.Value)
	SourceCommitFromMsgData(db datas.Database, msgData string) (datas.Database, types.Value)
	HashFromMsgData(msgData string) (hash.Hash, error)
	GenMessageData(cInfo ClientInfo, h hash.Hash) string
}

// ProcessChatEvents reads events from the event channel and processes them
// sequentially. If ClientInfo.IsDaemon is true, it also publishes the current
// head of the dataset continuously.
func ProcessChatEvents(node *core.IpfsNode, ds datas.Dataset, events chan ChatEvent, t *TermUI, cInfo ClientInfo) {
	stopChan := make(chan struct{})
	if cInfo.IsDaemon {
		go func() {
			tickChan := time.NewTicker(cInfo.Interval).C
			for {
				select {
				case <-stopChan:
					// return, not break: break would only exit the select
					// and leave this goroutine publishing forever.
					return
				case <-tickChan:
					Publish(node, cInfo, ds.HeadRef().TargetHash())
				}
			}
		}()
	}

	for event := range events {
		switch event.EventType {
		case SyncEvent:
			ds = processHash(t, node, ds, event.Event, cInfo)
			Publish(node, cInfo, ds.HeadRef().TargetHash())
		case InputEvent:
			ds = processInput(t, node, ds, event.Event, cInfo)
			Publish(node, cInfo, ds.HeadRef().TargetHash())
		case SearchEvent:
			processSearch(t, node, ds, event.Event, cInfo)
		case QuitEvent:
			dbg.Debug("QuitEvent received, stopping program")
			// close rather than send: there is no receiver when the
			// publish goroutine wasn't started (non-daemon mode).
			close(stopChan)
			return
		}
	}
}

// processHash processes msgs published by other chat nodes and does the work to
// integrate new data into this node's local database and display it as needed.
func processHash(t *TermUI, node *core.IpfsNode, ds datas.Dataset, msgData string, cInfo ClientInfo) datas.Dataset {
	h, err := cInfo.Delegate.HashFromMsgData(msgData)
	d.PanicIfError(err)
	defer dbg.BoxF("processHash, msgData: %s, hash: %s, cid: %s", msgData, h, ipfs.NomsHashToCID(h))()

	sinkDB := ds.Database()
	d.PanicIfFalse(ds.HasHead())

	headRef := ds.HeadRef()
	if h == headRef.TargetHash() {
		dbg.Debug("received hash same as current head, nothing to do")
		return ds
	}

	dbg.Debug("reading value for hash: %s", h)
	sourceDB, sourceCommit := cInfo.Delegate.SourceCommitFromMsgData(sinkDB, msgData)
	if sourceCommit == nil {
		dbg.Debug("FAILED to read value for hash: %s", h)
		return ds
	}

	sourceRef := types.NewRef(sourceCommit)

	_, isP2P := cInfo.Delegate.(P2PEventDelegate)
	if cInfo.IsDaemon || isP2P {
		cInfo.Delegate.PinBlocks(node, sourceDB, sinkDB, sourceCommit)
	}

	dbg.Debug("Finding common ancestor for merge, sourceRef: %s, headRef: %s", sourceRef.TargetHash(), headRef.TargetHash())
	a, ok := datas.FindCommonAncestor(sourceRef, headRef, sinkDB)
	if !ok {
		dbg.Debug("no common ancestor, cannot merge update!")
		return ds
	}
	dbg.Debug("Checking if source commit is ancestor")
	if a.Equals(sourceRef) {
		dbg.Debug("source commit was ancestor, nothing to do")
		return ds
	}
	if a.Equals(headRef) {
		dbg.Debug("fast-forward to source commit")
		ds, err := sinkDB.SetHead(ds, sourceRef)
		d.Chk.NoError(err)
		if !cInfo.IsDaemon {
			t.UpdateMessagesFromSync(ds)
		}
		return ds
	}

	dbg.Debug("We have a mergeable commit")
	left := ds.HeadValue()
	right := sourceCommit.(types.Struct).Get("value")
	parent := a.TargetValue(sinkDB).(types.Struct).Get("value")

	dbg.Debug("Starting three-way merge")
	merged, err := merge.ThreeWay(left, right, parent, sinkDB, nil, nil)
	if err != nil {
		dbg.Debug("could not merge received data: " + err.Error())
		return ds
	}

	dbg.Debug("setting new datasetHead on localDB")
	newCommit := datas.NewCommit(merged, types.NewSet(sinkDB, ds.HeadRef(), sourceRef), types.EmptyStruct)
	commitRef := sinkDB.WriteValue(newCommit)
	dbg.Debug("wrote new commit: %s", commitRef.TargetHash())
	ds, err = sinkDB.SetHead(ds, commitRef)
	if err != nil {
		dbg.Debug("call to db.SetHead failed, err: %s", err)
	}
	dbg.Debug("set new head ref: %s on ds.ID: %s", commitRef.TargetHash(), ds.ID())
	newH := ds.HeadRef().TargetHash()
	dbg.Debug("merged commit, dataset: %s, head: %s, cid: %s", ds.ID(), newH, ipfs.NomsHashToCID(newH))
	if cInfo.IsDaemon {
		cInfo.Delegate.PinBlocks(node, sourceDB, sinkDB, newCommit)
	} else {
		t.UpdateMessagesFromSync(ds)
	}
	return ds
}

// processInput adds a new msg (entered through the UI) and updates its dataset.
func processInput(t *TermUI, node *core.IpfsNode, ds datas.Dataset, msg string, cInfo ClientInfo) datas.Dataset {
	defer dbg.BoxF("processInput, msg: %s", msg)()
	t.InSearch = false
	if msg != "" {
		var err error
		ds, err = AddMessage(msg, cInfo.Username, time.Now(), ds)
		d.PanicIfError(err)
	}
	t.UpdateMessagesAsync(ds, nil, nil)
	return ds
}

// processSearch updates the UI to display search results.
func processSearch(t *TermUI, node *core.IpfsNode, ds datas.Dataset, terms string, cInfo ClientInfo) {
	defer dbg.BoxF("processSearch")()
	if terms == "" {
		return
	}
	t.InSearch = true
	searchTerms := TermsFromString(terms)
	searchIds := SearchIndex(ds, searchTerms)
	t.UpdateMessagesAsync(ds, &searchIds, searchTerms)
}

// pinBlocks recurses over the chunks originating at h and pins them to the IPFS repo.
func pinBlocks(node *core.IpfsNode, h hash.Hash, db datas.Database, depth, cnt int) (maxDepth, newCnt int) {
	maxDepth, newCnt = depth, cnt

	cid := ipfs.NomsHashToCID(h)
	_, pinned, err := node.Pinning.IsPinned(cid)
	d.Chk.NoError(err)
	if pinned {
		return
	}

	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	v := db.ReadValue(h)
	d.Chk.NotNil(v)

	v.WalkRefs(func(r types.Ref) {
		var newDepth int
		newDepth, newCnt = pinBlocks(node, r.TargetHash(), db, depth+1, newCnt)
		maxDepth = math.MaxInt(newDepth, maxDepth)
	})

	n, err := node.DAG.Get(ctx, cid)
	d.Chk.NoError(err)
	err = node.Pinning.Pin(ctx, n, false)
	d.Chk.NoError(err)
	newCnt++
	return
}

type IPFSEventDelegate struct{}

func (d IPFSEventDelegate) PinBlocks(node *core.IpfsNode, sourceDB, sinkDB datas.Database, sourceCommit types.Value) {
	h := sourceCommit.Hash()
	dbg.Debug("Starting pinBlocks")
	depth, cnt := pinBlocks(node, h, sinkDB, 0, 0)
	dbg.Debug("Finished pinBlocks, depth: %d, cnt: %d", depth, cnt)
	node.Pinning.Flush()
}

func (d IPFSEventDelegate) SourceCommitFromMsgData(db datas.Database, msgData string) (datas.Database, types.Value) {
	h := hash.Parse(msgData)
	v := db.ReadValue(h)
	return db, v
}

func (d IPFSEventDelegate) HashFromMsgData(msgData string) (hash.Hash, error) {
	var err error
	h, ok := hash.MaybeParse(msgData)
	if !ok {
		err = fmt.Errorf("failed to parse hash from msgData: %s", msgData)
	}
	return h, err
}

func (d IPFSEventDelegate) GenMessageData(cInfo ClientInfo, h hash.Hash) string {
	return h.String()
}

type P2PEventDelegate struct{}

func (d P2PEventDelegate) PinBlocks(node *core.IpfsNode, sourceDB, sinkDB datas.Database, sourceCommit types.Value) {
	sourceRef := types.NewRef(sourceCommit)
	datas.Pull(sourceDB, sinkDB, sourceRef, nil)
}

func (d P2PEventDelegate) SourceCommitFromMsgData(db datas.Database, msgData string) (datas.Database, types.Value) {
	sp, _ := spec.ForPath(msgData)
	v := sp.GetValue()
	return sp.GetDatabase(), v
|
||||
}
|
||||
|
||||
func (d P2PEventDelegate) HashFromMsgData(msgData string) (hash.Hash, error) {
|
||||
sp, err := spec.ForPath(msgData)
|
||||
return sp.Path.Hash, err
|
||||
}
|
||||
|
||||
func (d P2PEventDelegate) GenMessageData(cInfo ClientInfo, h hash.Hash) string {
|
||||
return fmt.Sprintf("%s::#%s", cInfo.Spec, h)
|
||||
}
|
||||
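The pinBlocks walk above is a plain depth-first traversal of the chunk DAG that uses the pin state itself as the visited set, so subtrees shared by several commits are only walked and counted once. A minimal self-contained sketch of that traversal, with a plain map standing in for the IPFS node and Noms database (these stand-ins are assumptions, not the real APIs):

```go
package main

import "fmt"

// pinBlocks walks the DAG rooted at h depth-first. dag maps a hash to its
// child hashes; pinned doubles as the visited set, so a hash that is
// already pinned stops the recursion (mirroring the IsPinned check above).
// It returns the maximum depth reached and the number of newly pinned nodes.
func pinBlocks(dag map[string][]string, h string, pinned map[string]bool, depth, cnt int) (maxDepth, newCnt int) {
	maxDepth, newCnt = depth, cnt
	if pinned[h] {
		return
	}
	for _, child := range dag[h] {
		var childDepth int
		childDepth, newCnt = pinBlocks(dag, child, pinned, depth+1, newCnt)
		if childDepth > maxDepth {
			maxDepth = childDepth
		}
	}
	pinned[h] = true
	newCnt++
	return
}

func main() {
	// "c" is referenced by both "a" and "b" but pinned (and counted) only once.
	dag := map[string][]string{"root": {"a", "b"}, "a": {"c"}, "b": {"c"}, "c": nil}
	depth, cnt := pinBlocks(dag, "root", map[string]bool{}, 0, 0)
	fmt.Println(depth, cnt) // prints "2 4"
}
```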
@@ -1,164 +0,0 @@
// Copyright 2017 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package lib

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"regexp"
	"sort"
	"strings"

	"github.com/attic-labs/noms/go/d"
	"github.com/attic-labs/noms/go/marshal"
	"github.com/attic-labs/noms/go/merge"
	"github.com/attic-labs/noms/go/spec"
	"github.com/attic-labs/noms/go/types"
	"github.com/attic-labs/noms/go/util/datetime"
	"golang.org/x/net/html"
)

var (
	character = ""
	msgs      = []Message{}
)

func RunImport(dir, dsSpec string) error {
	filepath.Walk(dir, func(path string, info os.FileInfo, err error) error {
		if path == dir {
			return nil
		}
		if !strings.HasSuffix(info.Name(), ".html") {
			return nil
		}
		fmt.Println("importing:", path)
		f, err := os.Open(path)
		d.Chk.NoError(err)
		n, err := html.Parse(f)
		d.Chk.NoError(err)
		extractDialog(n)
		return nil
	})

	if len(msgs) == 0 {
		return errors.New("Failed to import any data")
	}
	fmt.Println("Imported", len(msgs), "messages")

	sp, err := spec.ForDataset(dsSpec)
	d.CheckErrorNoUsage(err)
	ds := sp.GetDataset()
	ds, err = InitDatabase(ds)
	d.PanicIfError(err)
	db := ds.Database()

	fmt.Println("Creating msg map")
	kvPairs := []types.Value{}
	for _, msg := range msgs {
		kvPairs = append(kvPairs, types.String(msg.ID()), marshal.MustMarshal(db, msg))
	}
	m := types.NewMap(db, kvPairs...)

	fmt.Println("Creating index")
	ti := NewTermIndex(db, types.NewMap(db)).Edit()
	for _, msg := range msgs {
		terms := GetTerms(msg)
		ti.InsertAll(terms, types.String(msg.ID()))
	}
	termDocs := ti.Value().TermDocs

	fmt.Println("Creating users")
	users := topUsers(msgs)

	fmt.Println("Docs:", termDocs.Len(), "Users:", len(users))
	root := Root{Messages: m, Index: termDocs, Users: users}
	nroot := marshal.MustMarshal(db, root)
	if ds.HasHead() {
		left := ds.HeadValue()
		parent := marshal.MustMarshal(db, Root{
			Index:    types.NewMap(db),
			Messages: types.NewMap(db),
		})
		fmt.Println("Merging data")
		nroot, err = merge.ThreeWay(left, nroot, parent, db, nil, nil)
		fmt.Println("Merging complete")
		d.Chk.NoError(err)
	}
	fmt.Println("Committing data")
	_, err = db.CommitValue(ds, nroot)
	return err
}

func extractDialog(n *html.Node) {
	if c := characterName(n); c != "" {
		//fmt.Println("Character:", character)
		character = c
		return
	}
	if character != "" && n.Type == html.TextNode {
		//fmt.Println("Dialog:", strings.TrimSpace(n.Data))
		msg := Message{
			Ordinal:    uint64(len(msgs)),
			Author:     character,
			Body:       strings.TrimSpace(n.Data),
			ClientTime: datetime.Now(),
		}
		msgs = append(msgs, msg)
		character = ""
	}
	for c := n.FirstChild; c != nil; c = c.NextSibling {
		extractDialog(c)
	}
}

func characterName(n *html.Node) string {
	if n.Type != html.ElementNode ||
		n.Data != "b" ||
		n.FirstChild == nil {
		return ""
	}

	if hasSpaces, _ := regexp.MatchString(`^\s+[^\s]`, n.FirstChild.Data); !hasSpaces {
		return ""
	}
	return strings.TrimSpace(n.FirstChild.Data)
}

type cpair struct {
	character string
	cnt       int
}

func topUsers(msgs []Message) []string {
	userpat := regexp.MustCompile(`^[a-zA-Z][a-zA-Z\s]*\d*$`)
	usermap := map[string]int{}
	for _, msg := range msgs {
		name := strings.TrimSpace(msg.Author)
		if userpat.MatchString(name) {
			usermap[name] += 1
		}
	}
	pairs := []cpair{}
	for name, cnt := range usermap {
		if len(name) > 1 && !strings.HasPrefix(name, "ANOTHER") {
			pairs = append(pairs, cpair{character: strings.ToLower(name), cnt: cnt})
		}
	}
	// sort descending by cnt
	sort.Slice(pairs, func(i, j int) bool {
		return pairs[j].cnt < pairs[i].cnt
	})
	users := []string{}
	for i, p := range pairs {
		if i >= 30 {
			break
		}
		users = append(users, p.character)
	}
	sort.Strings(users)
	return users
}
@@ -1,21 +0,0 @@
// Copyright 2017 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package lib

import (
	"fmt"
	"log"
	"os"

	"github.com/attic-labs/noms/go/d"
	"github.com/attic-labs/noms/samples/go/decent/dbg"
)

func NewLogger(username string) *log.Logger {
	f, err := os.OpenFile(dbg.Filepath, os.O_RDWR|os.O_CREATE|os.O_APPEND, 0644)
	d.PanicIfError(err)
	prefix := fmt.Sprintf("%d-%s: ", os.Getpid(), username)
	return log.New(f, prefix, 0) // third arg is the log flag bitmask, not a file mode
}
@@ -1,181 +0,0 @@
// Copyright 2017 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package lib

import (
	"fmt"
	"regexp"
	"strings"
	"time"

	"github.com/attic-labs/noms/go/d"
	"github.com/attic-labs/noms/go/datas"
	"github.com/attic-labs/noms/go/marshal"
	"github.com/attic-labs/noms/go/types"
	"github.com/attic-labs/noms/go/util/datetime"
	"github.com/attic-labs/noms/samples/go/decent/dbg"
)

type Root struct {
	// Map<Key, Message>
	// Keys are strings like: <Ordinal>,<Author>
	// This scheme allows:
	//   - map is naturally sorted in the right order
	//   - conflicts will generally be avoided
	//   - messages are editable
	Messages types.Map
	Index    types.Map
	Users    []string `noms:",set"`
}

type Message struct {
	Ordinal    uint64
	Author     string
	Body       string
	ClientTime datetime.DateTime
}

func (m Message) ID() string {
	return fmt.Sprintf("%020x/%s", m.ClientTime.UnixNano(), m.Author)
}

func AddMessage(body string, author string, clientTime time.Time, ds datas.Dataset) (datas.Dataset, error) {
	defer dbg.BoxF("AddMessage, body: %s", body)()
	root, err := getRoot(ds)
	if err != nil {
		return datas.Dataset{}, err
	}

	db := ds.Database()

	nm := Message{
		Author:     author,
		Body:       body,
		ClientTime: datetime.DateTime{clientTime},
		Ordinal:    root.Messages.Len(),
	}
	root.Messages = root.Messages.Edit().Set(types.String(nm.ID()), marshal.MustMarshal(db, nm)).Map()
	IndexNewMessage(db, &root, nm)
	newRoot := marshal.MustMarshal(db, root)
	ds, err = db.CommitValue(ds, newRoot)
	return ds, err
}

func InitDatabase(ds datas.Dataset) (datas.Dataset, error) {
	if ds.HasHead() {
		return ds, nil
	}
	db := ds.Database()
	root := Root{
		Index:    types.NewMap(db),
		Messages: types.NewMap(db),
	}
	return db.CommitValue(ds, marshal.MustMarshal(db, root))
}

func GetAuthors(ds datas.Dataset) []string {
	r, err := getRoot(ds)
	d.PanicIfError(err)
	return r.Users
}

func IndexNewMessage(vrw types.ValueReadWriter, root *Root, m Message) {
	defer dbg.BoxF("IndexNewMessage")()

	ti := NewTermIndex(vrw, root.Index)
	id := types.String(m.ID())
	root.Index = ti.Edit().InsertAll(GetTerms(m), id).Value().TermDocs
	root.Users = append(root.Users, m.Author)
}

func SearchIndex(ds datas.Dataset, search []string) types.Map {
	root, err := getRoot(ds)
	d.PanicIfError(err)
	idx := root.Index
	ti := NewTermIndex(ds.Database(), idx)
	ids := ti.Search(search)
	dbg.Debug("search for: %s, returned: %d", strings.Join(search, " "), ids.Len())
	return ids
}

var (
	punctPat = regexp.MustCompile("[[:punct:]]+")
	wsPat    = regexp.MustCompile(`\s+`)
)

func TermsFromString(s string) []string {
	s1 := punctPat.ReplaceAllString(strings.TrimSpace(s), " ")
	terms := wsPat.Split(s1, -1)
	clean := []string{}
	for _, t := range terms {
		if t == "" {
			continue
		}
		clean = append(clean, strings.ToLower(t))
	}
	return clean
}

func GetTerms(m Message) []string {
	terms := TermsFromString(m.Body)
	terms = append(terms, TermsFromString(m.Author)...)
	return terms
}

func ListMessages(ds datas.Dataset, searchIds *types.Map, doneChan chan struct{}) (msgMap types.Map, mc chan types.String, err error) {
	//dbg.Debug("##### listMessages: entered")

	root, err := getRoot(ds)
	db := ds.Database()
	if err != nil {
		return types.NewMap(db), nil, err
	}
	msgMap = root.Messages

	mc = make(chan types.String)
	done := false
	go func() {
		<-doneChan
		done = true
		<-mc
		//dbg.Debug("##### listMessages: exiting 'done' goroutine")
	}()

	go func() {
		keyMap := msgMap
		if searchIds != nil {
			keyMap = *searchIds
		}
		i := uint64(0)
		for ; i < keyMap.Len() && !done; i++ {
			key, _ := keyMap.At(keyMap.Len() - i - 1)
			mc <- key.(types.String)
		}
		//dbg.Debug("##### listMessages: exiting 'for loop' goroutine, examined: %d", i)
		close(mc)
	}()
	return
}

func getRoot(ds datas.Dataset) (Root, error) {
	defer dbg.BoxF("getRoot")()

	db := ds.Database()
	root := Root{
		Messages: types.NewMap(db),
		Index:    types.NewMap(db),
	}
	// TODO: It would be nice if Dataset.MaybeHeadValue() or HeadValue()
	// would return just <value>, and it would be nil if not there, so you
	// could chain calls.
	if !ds.HasHead() {
		return root, nil
	}
	err := marshal.Unmarshal(ds.HeadValue(), &root)
	if err != nil {
		return Root{}, err
	}
	return root, nil
}
@@ -1,69 +0,0 @@
// Copyright 2017 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package lib

import (
	"testing"
	"time"

	"github.com/attic-labs/noms/go/chunks"
	"github.com/attic-labs/noms/go/datas"
	"github.com/attic-labs/noms/go/marshal"
	"github.com/attic-labs/noms/go/util/datetime"
	"github.com/stretchr/testify/assert"
)

func TestBasics(t *testing.T) {
	a := assert.New(t)
	db := datas.NewDatabase(chunks.NewMemoryStoreFactory().CreateStore(""))
	ds := db.GetDataset("foo")
	ml, err := getAllMessages(ds)
	a.NoError(err)
	a.Equal(0, len(ml))

	ds, err = AddMessage("body1", "aa", time.Unix(0, 0), ds)
	a.NoError(err)
	ml, err = getAllMessages(ds)
	a.NoError(err)
	expected := []Message{
		Message{
			Author:     "aa",
			Body:       "body1",
			ClientTime: datetime.DateTime{time.Unix(0, 0)},
			Ordinal:    0,
		},
	}
	a.Equal(expected, ml)

	ds, err = AddMessage("body2", "bob", time.Unix(1, 0), ds)
	a.NoError(err)
	ml, err = getAllMessages(ds)
	expected = append(
		[]Message{
			Message{
				Author:     "bob",
				Body:       "body2",
				ClientTime: datetime.DateTime{time.Unix(1, 0)},
				Ordinal:    1,
			},
		},
		expected...,
	)
	a.NoError(err)
	a.Equal(expected, ml)
}

func getAllMessages(ds datas.Dataset) (r []Message, err error) {
	doneChan := make(chan struct{})
	mm, keys, _ := ListMessages(ds, nil, doneChan)
	for k := range keys {
		mv := mm.Get(k)
		var m Message
		marshal.MustUnmarshal(mv, &m)
		r = append(r, m)
	}
	doneChan <- struct{}{}
	return r, nil
}
@@ -1,90 +0,0 @@
// Copyright 2017 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package lib

import (
	"context"
	"encoding/json"
	"sync"

	"github.com/attic-labs/noms/go/d"
	"github.com/attic-labs/noms/go/hash"
	"github.com/attic-labs/noms/samples/go/decent/dbg"
	"github.com/ipfs/go-ipfs/core"
	"github.com/jbenet/go-base58"
)

var (
	PubsubUser    = "default"
	seenHash      = map[hash.Hash]bool{}
	seenHashMutex = sync.Mutex{}
)

func lockSeenF() func() {
	seenHashMutex.Lock()
	return func() {
		seenHashMutex.Unlock()
	}
}

// ReceiveMessages listens for messages sent by other chat nodes. It filters out
// any msgs that have already been received and adds events to the events channel
// for any msgs that it hasn't seen yet.
func ReceiveMessages(node *core.IpfsNode, events chan ChatEvent, cInfo ClientInfo) {
	sub, err := node.Floodsub.Subscribe(cInfo.Topic)
	d.Chk.NoError(err)

	listenForAndHandleMessage := func() {
		msg, err := sub.Next(context.Background())
		d.PanicIfError(err)
		sender := base58.Encode(msg.From)
		msgMap := map[string]string{}
		err = json.Unmarshal(msg.Data, &msgMap)
		if err != nil {
			dbg.Debug("ReceiveMessages: received non-json msg: %s from: %s, error: %s", msg.Data, sender, err)
			return
		}
		msgData := msgMap["data"]
		h, err := cInfo.Delegate.HashFromMsgData(msgData)
		if err != nil {
			dbg.Debug("ReceiveMessages: received unknown msg: %s from: %s", msgData, sender)
			return
		}

		defer lockSeenF()()
		if !seenHash[h] {
			events <- ChatEvent{EventType: SyncEvent, Event: msgData}
			seenHash[h] = true
			dbg.Debug("got msgData: %s from: %s(%s)", msgData, sender, msgMap["user"])
		}
	}

	dbg.Debug("start listening for msgs on channel: %s", cInfo.Topic)
	for {
		listenForAndHandleMessage()
	}
}

// Publish asks the delegate to format a hash/ClientInfo into a suitable msg
// and publishes that using IPFS pubsub.
func Publish(node *core.IpfsNode, cInfo ClientInfo, h hash.Hash) {
	defer func() {
		if r := recover(); r != nil {
			dbg.Debug("Publish failed, error: %s", r)
		}
	}()
	msgData := cInfo.Delegate.GenMessageData(cInfo, h)
	m, err := json.Marshal(map[string]string{"user": cInfo.Username, "data": msgData})
	d.PanicIfError(err)
	dbg.Debug("publishing to topic: %s, msg: %s", cInfo.Topic, m)
	node.Floodsub.Publish(cInfo.Topic, append(m, []byte("\r\n")...))

	defer lockSeenF()()
	seenHash[h] = true
}
@@ -1,120 +0,0 @@
// Copyright 2017 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package lib

import (
	"sync"

	"github.com/attic-labs/noms/go/types"
)

type TermIndex struct {
	TermDocs types.Map
	vrw      types.ValueReadWriter
}

func NewTermIndex(vrw types.ValueReadWriter, TermDocs types.Map) TermIndex {
	return TermIndex{TermDocs, vrw}
}

func (ti TermIndex) Edit() *TermIndexEditor {
	return &TermIndexEditor{ti.TermDocs.Edit(), ti.vrw}
}

func (ti TermIndex) Search(terms []string) types.Map {
	seen := make(map[string]struct{}, len(terms))
	iters := make([]types.SetIterator, 0, len(terms))

	wg := sync.WaitGroup{}
	idx := 0
	for _, t := range terms {
		if _, ok := seen[t]; ok {
			continue
		}
		seen[t] = struct{}{}

		iters = append(iters, nil)
		i := idx
		t := t
		wg.Add(1)
		go func() {
			ts := ti.TermDocs.Get(types.String(t))
			if ts != nil {
				iter := ts.(types.Set).Iterator()
				iters[i] = iter
			}
			wg.Done()
		}()

		idx++
	}
	wg.Wait()

	var si types.SetIterator
	for _, iter := range iters {
		if iter == nil {
			return types.NewMap(ti.vrw) // at least one term had no hits
		}

		if si == nil {
			si = iter // first iter
			continue
		}

		si = types.NewIntersectionIterator(si, iter)
	}
	if si == nil {
		return types.NewMap(ti.vrw) // no search terms
	}

	ch := make(chan types.Value)
	rch := types.NewStreamingMap(ti.vrw, ch)
	for next := si.Next(); next != nil; next = si.Next() {
		ch <- next
		ch <- types.Bool(true)
	}
	close(ch)

	return <-rch
}

type TermIndexEditor struct {
	terms *types.MapEditor
	vrw   types.ValueReadWriter
}

// Builds a new TermIndex
func (te *TermIndexEditor) Value() TermIndex {
	return TermIndex{te.terms.Map(), te.vrw}
}

// Indexes |v| by |term|
func (te *TermIndexEditor) Insert(term string, v types.Value) *TermIndexEditor {
	tv := types.String(term)
	hitSet := te.terms.Get(tv)
	if hitSet == nil {
		hitSet = types.NewSet(te.vrw)
	}
	hsEd, ok := hitSet.(*types.SetEditor)
	if !ok {
		hsEd = hitSet.(types.Set).Edit()
		te.terms.Set(tv, hsEd)
	}

	hsEd.Insert(v)
	return te
}

// Indexes |v| by each unique term in |terms| (tolerates duplicate terms)
func (te *TermIndexEditor) InsertAll(terms []string, v types.Value) *TermIndexEditor {
	visited := map[string]struct{}{}
	for _, term := range terms {
		if _, ok := visited[term]; ok {
			continue
		}
		visited[term] = struct{}{}
		te.Insert(term, v)
	}
	return te
}

// TODO: te.Remove
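TermIndex.Search above is a standard inverted-index lookup: fetch the posting set for each unique search term and intersect them, short-circuiting to an empty result as soon as any term has no hits. A self-contained sketch of that lookup with plain Go maps standing in for the Noms Sets and iterators (illustrative only, not the Noms API):

```go
package main

import "fmt"

// search intersects the posting sets of the unique terms in the query.
// termDocs maps a term to the set of document ids containing it; a term
// absent from the index makes the whole result empty, matching the
// "at least one term had no hits" case above.
func search(termDocs map[string]map[int]bool, terms []string) map[int]bool {
	var result map[int]bool
	seen := map[string]bool{}
	for _, t := range terms {
		if seen[t] {
			continue // tolerate duplicate terms, like Search above
		}
		seen[t] = true
		hits, ok := termDocs[t]
		if !ok {
			return map[int]bool{}
		}
		if result == nil {
			result = map[int]bool{}
			for id := range hits {
				result[id] = true
			}
			continue
		}
		for id := range result {
			if !hits[id] {
				delete(result, id)
			}
		}
	}
	if result == nil {
		return map[int]bool{} // empty query
	}
	return result
}

func main() {
	idx := map[string]map[int]bool{
		"foo": {1: true, 2: true},
		"bar": {1: true},
		"baz": {1: true, 2: true, 3: true},
	}
	fmt.Println(len(search(idx, []string{"bar", "baz"}))) // prints "1"
}
```

The real implementation fetches posting sets concurrently and intersects lazily via set iterators; this sketch keeps only the intersection logic.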
@@ -1,57 +0,0 @@
// Copyright 2017 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package lib

import (
	"strings"
	"testing"

	"github.com/attic-labs/noms/go/chunks"
	"github.com/attic-labs/noms/go/types"
	"github.com/stretchr/testify/assert"
)

func TestRun(t *testing.T) {
	a := assert.New(t)

	storage := &chunks.MemoryStorage{}
	vs := types.NewValueStore(storage.NewView())
	defer vs.Close()

	docs := []struct {
		terms string
		id    int
	}{
		{"foo bar baz", 1},
		{"foo baz", 2},
		{"baz bat boo", 3},
	}

	indexEditor := NewTermIndex(vs, types.NewMap(vs)).Edit()
	for _, doc := range docs {
		indexEditor.InsertAll(strings.Split(doc.terms, " "), types.Float(doc.id))
	}

	index := indexEditor.Value()

	getMap := func(keys ...int) types.Map {
		m := types.NewMap(vs).Edit()
		for _, k := range keys {
			m.Set(types.Float(k), types.Bool(true))
		}
		return m.Map()
	}

	test := func(search string, expect types.Map) {
		actual := index.Search(strings.Split(search, " "))
		a.True(expect.Equals(actual))
	}

	test("foo", getMap(1, 2))
	test("baz", getMap(1, 2, 3))
	test("bar baz", getMap(1))
	test("boo", getMap(3))
	test("blarg", getMap())
}
@@ -1,356 +0,0 @@
// Copyright 2017 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package lib

import (
	"fmt"
	"regexp"
	"runtime"
	"strings"

	"github.com/attic-labs/noms/go/d"
	"github.com/attic-labs/noms/go/datas"
	"github.com/attic-labs/noms/go/types"
	"github.com/attic-labs/noms/go/util/math"
	"github.com/attic-labs/noms/samples/go/decent/dbg"
	"github.com/jroimartin/gocui"
)

const (
	allViews     = ""
	usersView    = "users"
	messageView  = "messages"
	inputView    = "input"
	linestofetch = 50

	searchPrefix = "/s"
	quitPrefix   = "/q"
)

type TermUI struct {
	Gui      *gocui.Gui
	InSearch bool
	lines    []string
	dp       *dataPager
}

var (
	viewNames   = []string{usersView, messageView, inputView}
	firstLayout = true
)

func CreateTermUI(events chan ChatEvent) *TermUI {
	g, err := gocui.NewGui(gocui.Output256)
	d.PanicIfError(err)

	g.Highlight = true
	g.SelFgColor = gocui.ColorGreen
	g.Cursor = true

	relayout := func(g *gocui.Gui) error {
		return layout(g)
	}
	g.SetManagerFunc(relayout)

	termUI := new(TermUI)
	termUI.Gui = g

	d.PanicIfError(g.SetKeybinding(allViews, gocui.KeyF1, gocui.ModNone, debugInfo(termUI)))
	d.PanicIfError(g.SetKeybinding(allViews, gocui.KeyCtrlC, gocui.ModNone, quit))
	d.PanicIfError(g.SetKeybinding(allViews, gocui.KeyCtrlC, gocui.ModAlt, quitWithStack))
	d.PanicIfError(g.SetKeybinding(allViews, gocui.KeyTab, gocui.ModNone, nextView))
	d.PanicIfError(g.SetKeybinding(messageView, gocui.KeyArrowUp, gocui.ModNone, arrowUp(termUI)))
	d.PanicIfError(g.SetKeybinding(messageView, gocui.KeyArrowDown, gocui.ModNone, arrowDown(termUI)))
	d.PanicIfError(g.SetKeybinding(inputView, gocui.KeyEnter, gocui.ModNone, func(g *gocui.Gui, v *gocui.View) (err error) {
		defer func() {
			v.Clear()
			v.SetCursor(0, 0)
			msgView, err := g.View(messageView)
			d.PanicIfError(err)
			msgView.Title = "messages"
			msgView.Autoscroll = true
		}()
		buf := strings.TrimSpace(v.Buffer())
		if strings.HasPrefix(buf, searchPrefix) {
			events <- ChatEvent{EventType: SearchEvent, Event: strings.TrimSpace(buf[len(searchPrefix):])}
			return
		}
		if strings.HasPrefix(buf, quitPrefix) {
			err = gocui.ErrQuit
			return
		}
		events <- ChatEvent{EventType: InputEvent, Event: buf}
		return
	}))

	return termUI
}

func (t *TermUI) Close() {
	dbg.Debug("Closing gui")
	t.Gui.Close()
}

func (t *TermUI) UpdateMessagesFromSync(ds datas.Dataset) {
	if t.InSearch || !t.textScrolledToEnd() {
		t.Gui.Execute(func(g *gocui.Gui) (err error) {
			updateViewTitle(g, messageView, "messages (NEW!)")
			return
		})
	} else {
		t.UpdateMessagesAsync(ds, nil, nil)
	}
}

func (t *TermUI) Layout() error {
	return layout(t.Gui)
}

func layout(g *gocui.Gui) error {
	maxX, maxY := g.Size()
	if v, err := g.SetView(usersView, 0, 0, 25, maxY-1); err != nil {
		if err != gocui.ErrUnknownView {
			return err
		}
		v.Title = usersView
		v.Wrap = false
		v.Editable = false
	}
	if v, err := g.SetView(messageView, 25, 0, maxX-1, maxY-2-1); err != nil {
		if err != gocui.ErrUnknownView {
			return err
		}
		v.Title = messageView
		v.Editable = false
		v.Wrap = true
		v.Autoscroll = true
		return nil
	}
	if v, err := g.SetView(inputView, 25, maxY-2-1, maxX-1, maxY-1); err != nil {
		if err != gocui.ErrUnknownView {
			return err
		}
		v.Wrap = true
		v.Editable = true
		v.Autoscroll = true
	}
	if firstLayout {
		firstLayout = false
		g.SetCurrentView(inputView)
		dbg.Debug("started up")
	}
	return nil
}

func (t *TermUI) UpdateMessages(ds datas.Dataset, filterIds *types.Map, terms []string) error {
	defer dbg.BoxF("updateMessages")()

	t.ResetAuthors(ds)
	v, err := t.Gui.View(messageView)
	d.PanicIfError(err)
	v.Clear()
	t.lines = []string{}
	v.SetOrigin(0, 0)
	_, winHeight := v.Size()

	if t.dp != nil {
		t.dp.Close()
	}

	doneChan := make(chan struct{})
	msgMap, msgKeyChan, err := ListMessages(ds, filterIds, doneChan)
	d.PanicIfError(err)
	t.dp = NewDataPager(ds, msgKeyChan, doneChan, msgMap, terms)
	t.lines, _ = t.dp.Prepend(t.lines, math.MaxInt(linestofetch, winHeight+10))

	for _, s := range t.lines {
		fmt.Fprintf(v, "%s\n", s)
	}
	return nil
}

func (t *TermUI) ResetAuthors(ds datas.Dataset) {
	v, err := t.Gui.View(usersView)
	d.PanicIfError(err)
	v.Clear()
	for _, u := range GetAuthors(ds) {
		fmt.Fprintln(v, u)
	}
}

func (t *TermUI) UpdateMessagesAsync(ds datas.Dataset, sids *types.Map, terms []string) {
	t.Gui.Execute(func(_ *gocui.Gui) error {
		err := t.UpdateMessages(ds, sids, terms)
		d.PanicIfError(err)
		return nil
	})
}

func (t *TermUI) scrollView(v *gocui.View, dy int) {
	// Get the size and position of the view.
	lineCnt := len(t.lines)
	_, windowHeight := v.Size()
	ox, oy := v.Origin()
	cx, cy := v.Cursor()

	// maxCy will either be the height of the screen - 1, or, in the case
	// that there aren't enough lines to fill the screen, lineCnt - origin.
	newCy := cy + dy
	maxCy := math.MinInt(lineCnt-oy, windowHeight-1)

	// If the newCy doesn't require scrolling, then just move the cursor.
	if newCy >= 0 && newCy < maxCy {
		v.MoveCursor(cx, dy, false)
		return
	}

	// If the cursor is already at the bottom of the screen and there are no
	// lines left to scroll up, then we're at the bottom.
	if newCy >= maxCy && oy >= lineCnt-windowHeight {
		// Set autoscroll to normal again.
		v.Autoscroll = true
	} else {
		// The cursor is already at the bottom or top of the screen, so
		// scroll the text.
		v.Autoscroll = false
		v.SetOrigin(ox, oy+dy)
	}
}

func quit(_ *gocui.Gui, _ *gocui.View) error {
	dbg.Debug("QUITTING #####")
	return gocui.ErrQuit
}

func quitWithStack(_ *gocui.Gui, _ *gocui.View) error {
	dbg.Debug("QUITTING WITH STACK")
	stacktrace := make([]byte, 1024*1024)
	length := runtime.Stack(stacktrace, true)
	dbg.Debug(string(stacktrace[:length]))
	return gocui.ErrQuit
}

func arrowUp(t *TermUI) func(*gocui.Gui, *gocui.View) error {
	return func(_ *gocui.Gui, v *gocui.View) error {
		lineCnt := len(t.lines)
		ox, oy := v.Origin()
		if oy == 0 {
			var ok bool
			t.lines, ok = t.dp.Prepend(t.lines, linestofetch)
			if ok {
				v.Clear()
				for _, s := range t.lines {
					fmt.Fprintf(v, "%s\n", s)
				}
				c1 := len(t.lines)
				v.SetOrigin(ox, c1-lineCnt)
			}
		}
		t.scrollView(v, -1)
		return nil
	}
}

func arrowDown(t *TermUI) func(*gocui.Gui, *gocui.View) error {
	return func(_ *gocui.Gui, v *gocui.View) error {
		t.scrollView(v, 1)
		return nil
	}
}

func debugInfo(t *TermUI) func(*gocui.Gui, *gocui.View) error {
	return func(g *gocui.Gui, _ *gocui.View) error {
		msgView, _ := g.View(messageView)
		w, h := msgView.Size()
		dbg.Debug("info, window size:(%d, %d), lineCnt: %d", w, h, len(t.lines))
		cx, cy := msgView.Cursor()
		ox, oy := msgView.Origin()
		dbg.Debug("info, origin: (%d,%d), cursor: (%d,%d)", ox, oy, cx, cy)
		dbg.Debug("info, view buffer:\n%s", highlightTerms(viewBuffer(msgView), t.dp.terms))
		return nil
	}
}

func viewBuffer(v *gocui.View) string {
	buf := strings.TrimSpace(v.ViewBuffer())
	if len(buf) > 0 && buf[len(buf)-1] != byte('\n') {
		buf = buf + "\n"
	}
	return buf
}

func nextView(g *gocui.Gui, v *gocui.View) (err error) {
	nextName := nextViewName(v.Name())
	if _, err = g.SetCurrentView(nextName); err != nil {
		return
	}
	_, err = g.SetViewOnTop(nextName)
	return
}

func nextViewName(currentView string) string {
	for i, viewname := range viewNames {
		if currentView == viewname {
			return viewNames[(i+1)%len(viewNames)]
		}
	}
	return viewNames[0]
}

func (t *TermUI) textScrolledToEnd() bool {
	v, err := t.Gui.View(messageView)
	if err != nil {
		// doubt this will ever happen; if it does, just assume we're at the bottom
		return true
	}
	_, oy := v.Origin()
	_, h := v.Size()
	lc := len(t.lines)
	dbg.Debug("textScrolledToEnd, oy: %d, h: %d, lc: %d, lc-oy: %d, res: %t", oy, h, lc, lc-oy, lc-oy <= h)
	return lc-oy <= h
}

func updateViewTitle(g *gocui.Gui, viewname, title string) (err error) {
	v, err := g.View(viewname)
	if err != nil {
		return
	}
	v.Title = title
	return
}

var bgColors, fgColors = genColors()

func genColors() ([]string, []string) {
	bg, fg := []string{}, []string{}
	for i := 1; i <= 9; i++ {
		// skip dark blue & white
		if i != 4 && i != 7 {
			bg = append(bg, fmt.Sprintf("\x1b[48;5;%dm\x1b[30m%%s\x1b[0m", i))
			fg = append(fg, fmt.Sprintf("\x1b[38;5;%dm%%s\x1b[0m", i))
		}
	}
	return bg, fg
}

func colorTerm(color int, s string, background bool) string {
	c := fgColors[color]
	if background {
		c = bgColors[color]
	}
	return fmt.Sprintf(c, s)
}

func highlightTerms(s string, terms []string) string {
	for i, t := range terms {
		color := i % len(fgColors)
		re := regexp.MustCompile(fmt.Sprintf("(?i)%s", regexp.QuoteMeta(t)))
		s = re.ReplaceAllStringFunc(s, func(s string) string {
			return colorTerm(color, s, false)
		})
	}
	return s
}
@@ -1,10 +0,0 @@
This demo application is the simplest p2p chat app you could build using Noms.

Basic idea:

- Every node runs a Noms HTTP server (port controlled by the --port flag)
- Every node broadcasts its current commit and IP/port continuously
- Every node continuously syncs/merges with every other node
  (note that due to content addressing, most of these syncs will immediately exit)
@@ -1,144 +0,0 @@
// Copyright 2017 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package main

import (
	"fmt"
	"log"
	"net"
	"os"
	"os/signal"
	"path"
	"syscall"

	"github.com/attic-labs/noms/go/config"
	"github.com/attic-labs/noms/go/d"
	"github.com/attic-labs/noms/go/datas"
	"github.com/attic-labs/noms/go/ipfs"
	"github.com/attic-labs/noms/go/spec"
	"github.com/attic-labs/noms/go/util/profile"
	"github.com/attic-labs/noms/samples/go/decent/dbg"
	"github.com/attic-labs/noms/samples/go/decent/lib"
	"github.com/jroimartin/gocui"
	"gopkg.in/alecthomas/kingpin.v2"
)

func main() {
	// allow short (-h) help
	kingpin.CommandLine.HelpFlag.Short('h')

	clientCmd := kingpin.Command("client", "runs the ipfs-chat client UI")
	clientTopic := clientCmd.Flag("topic", "IPFS pubsub topic to publish and subscribe to").Default("noms-chat-p2p").String()
	username := clientCmd.Flag("username", "username to sign in as").Required().String()
	nodeIdx := clientCmd.Flag("node-idx", "a single digit to be used as last digit in all port values: api, gateway and swarm (must be 0-9 inclusive)").Default("-1").Int()
	clientDir := clientCmd.Arg("path", "local directory to store data in").Required().ExistingDir()

	importCmd := kingpin.Command("import", "imports data into a chat")
	importSrc := importCmd.Flag("dir", "directory that contains data to import").Default("../data").ExistingDir()
	importDir := importCmd.Arg("path", "local directory to store data in").Required().ExistingDir()

	kingpin.CommandLine.Help = "A demonstration of using Noms to build a scalable multiuser collaborative application."

	switch kingpin.Parse() {
	case "client":
		cInfo := lib.ClientInfo{
			Topic:    *clientTopic,
			Username: *username,
			Idx:      *nodeIdx,
			IsDaemon: false,
			Dir:      *clientDir,
			Delegate: lib.P2PEventDelegate{},
		}
		runClient(cInfo)
	case "import":
		err := lib.RunImport(*importSrc, fmt.Sprintf("%s/noms::chat", *importDir))
		d.PanicIfError(err)
	}
}

func runClient(cInfo lib.ClientInfo) {
	dbg.SetLogger(lib.NewLogger(cInfo.Username))

	var err error
	httpPort := 8000 + cInfo.Idx
	sp, err := spec.ForDatabase(fmt.Sprintf("http://%s:%d", getIP(), httpPort))
	d.PanicIfError(err)
	cInfo.Spec = sp

	<-runServer(path.Join(cInfo.Dir, "noms"), httpPort)

	db := cInfo.Spec.GetDatabase()
	ds := db.GetDataset("chat")
	ds, err = lib.InitDatabase(ds)
	d.PanicIfError(err)

	node := ipfs.OpenIPFSRepo(path.Join(cInfo.Dir, "ipfs"), cInfo.Idx)
	events := make(chan lib.ChatEvent, 1024)
	t := lib.CreateTermUI(events)
	defer t.Close()

	d.PanicIfError(t.Layout())
	t.ResetAuthors(ds)
	t.UpdateMessages(ds, nil, nil)

	go lib.ProcessChatEvents(node, ds, events, t, cInfo)
	go lib.ReceiveMessages(node, events, cInfo)

	if err := t.Gui.MainLoop(); err != nil && err != gocui.ErrQuit {
		dbg.Debug("mainloop has exited, err:", err)
		log.Panicln(err)
	}
}

func getIP() string {
	ifaces, err := net.Interfaces()
	d.PanicIfError(err)
	for _, i := range ifaces {
		addrs, err := i.Addrs()
		d.PanicIfError(err)
		for _, addr := range addrs {
			switch v := addr.(type) {
			case *net.IPNet:
				if !v.IP.IsLoopback() {
					ip := v.IP.To4()
					if ip != nil {
						return v.IP.String()
					}
				}
			}
		}
	}
	d.Panic("notreached")
	return ""
}

func runServer(atPath string, port int) (ready chan struct{}) {
	ready = make(chan struct{})
	_ = os.Mkdir(atPath, 0755)
	cfg := config.NewResolver()
	cs, err := cfg.GetChunkStore(atPath)
	d.CheckError(err)
	server := datas.NewRemoteDatabaseServer(cs, port)
	server.Ready = func() {
		ready <- struct{}{}
	}

	// Shutdown server gracefully so that profile may be written
	c := make(chan os.Signal, 1)
	signal.Notify(c, os.Interrupt)
	signal.Notify(c, syscall.SIGTERM)
	go func() {
		<-c
		server.Stop()
	}()

	go func() {
		d.Try(func() {
			defer profile.MaybeStartProfile().Stop()
			server.Run()
		})
	}()
	return
}
@@ -1 +0,0 @@
hr
@@ -1,12 +0,0 @@
# HR

This is a small command-line application that manages a very simple, hypothetical HR database.

## Usage

```shell
go build
./hr --ds /tmp/my-noms::hr add-person 42 Abigail Architect
./hr --ds /tmp/my-noms::hr add-person 43 Samuel "Chief Laser Operator"
./hr --ds /tmp/my-noms::hr list-persons
```
@@ -1,8 +0,0 @@
#!/bin/sh

if [ -d test-data ]; then
  mv test-data test-data.bak
fi

./hr --ds test-data::hr add-person 7 "Aaron Boodman" "Chief Evangelism Officer"
./hr --ds test-data::hr add-person 13 "Samuel Boodman" "VP, Culture"
@@ -1,117 +0,0 @@
// Copyright 2016 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package main

import (
	"fmt"
	"os"
	"strconv"

	"github.com/attic-labs/noms/go/config"
	"github.com/attic-labs/noms/go/datas"
	"github.com/attic-labs/noms/go/marshal"
	"github.com/attic-labs/noms/go/types"
	"github.com/attic-labs/noms/go/util/verbose"
	flag "github.com/juju/gnuflag"
)

func main() {
	var dsStr = flag.String("ds", "", "noms dataset to read/write from")

	flag.Usage = func() {
		fmt.Fprintf(os.Stderr, "Usage: %s [flags] [command] [command-args]\n\n", os.Args[0])
		fmt.Fprintln(os.Stderr, "Flags:")
		flag.PrintDefaults()
		fmt.Fprintln(os.Stderr, "\nCommands:")
		fmt.Fprintln(os.Stderr, "\tadd-person <id> <name> <title>")
		fmt.Fprintln(os.Stderr, "\tlist-persons")
	}

	verbose.RegisterVerboseFlags(flag.CommandLine)
	flag.Parse(true)

	if flag.NArg() == 0 {
		fmt.Fprintln(os.Stderr, "Not enough arguments")
		return
	}

	if *dsStr == "" {
		fmt.Fprintln(os.Stderr, "Required flag '--ds' not set")
		return
	}

	cfg := config.NewResolver()
	db, ds, err := cfg.GetDataset(*dsStr)
	if err != nil {
		fmt.Fprintf(os.Stderr, "Could not create dataset: %s\n", err)
		return
	}
	defer db.Close()

	switch flag.Arg(0) {
	case "add-person":
		addPerson(db, ds)
	case "list-persons":
		listPersons(ds)
	default:
		fmt.Fprintf(os.Stderr, "Unknown command: %s\n", flag.Arg(0))
	}
}

type Person struct {
	Name, Title string
	Id          uint64
}

func addPerson(db datas.Database, ds datas.Dataset) {
	if flag.NArg() != 4 {
		fmt.Fprintln(os.Stderr, "Not enough arguments for command add-person")
		return
	}

	id, err := strconv.ParseUint(flag.Arg(1), 10, 64)
	if err != nil {
		fmt.Fprintf(os.Stderr, "Invalid person-id: %s", flag.Arg(1))
		return
	}

	np, err := marshal.Marshal(db, Person{flag.Arg(2), flag.Arg(3), id})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}

	_, err = db.CommitValue(ds, getPersons(ds).Edit().Set(types.Float(id), np).Map())
	if err != nil {
		fmt.Fprintf(os.Stderr, "Error committing: %s\n", err)
		return
	}
}

func listPersons(ds datas.Dataset) {
	d := getPersons(ds)
	if d.Empty() {
		fmt.Println("No people found")
		return
	}

	d.IterAll(func(k, v types.Value) {
		var p Person
		err := marshal.Unmarshal(v, &p)
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			return
		}
		fmt.Printf("%s (id: %d, title: %s)\n", p.Name, p.Id, p.Title)
	})
}

func getPersons(ds datas.Dataset) types.Map {
	hv, ok := ds.MaybeHeadValue()
	if ok {
		return hv.(types.Map)
	}
	return types.NewMap(ds.Database())
}
@@ -1,61 +0,0 @@
// Copyright 2016 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package main

import (
	"path"
	"runtime"
	"testing"

	"github.com/attic-labs/noms/go/spec"
	"github.com/attic-labs/noms/go/util/clienttest"
	"github.com/stretchr/testify/suite"
)

func TestBasics(t *testing.T) {
	suite.Run(t, &testSuite{})
}

type testSuite struct {
	clienttest.ClientTestSuite
}

func (s *testSuite) TestRoundTrip() {
	spec := spec.CreateValueSpecString("nbs", s.DBDir, "hr")
	stdout, stderr := s.MustRun(main, []string{"--ds", spec, "list-persons"})
	s.Equal("No people found\n", stdout)
	s.Equal("", stderr)

	stdout, stderr = s.MustRun(main, []string{"--ds", spec, "add-person", "42", "Benjamin Kalman", "Programmer, Barista"})
	s.Equal("", stdout)
	s.Equal("", stderr)

	stdout, stderr = s.MustRun(main, []string{"--ds", spec, "add-person", "43", "Abigail Boodman", "Chief Architect"})
	s.Equal("", stdout)
	s.Equal("", stderr)

	stdout, stderr = s.MustRun(main, []string{"--ds", spec, "list-persons"})
	s.Equal(`Benjamin Kalman (id: 42, title: Programmer, Barista)
Abigail Boodman (id: 43, title: Chief Architect)
`, stdout)
	s.Equal("", stderr)
}

func (s *testSuite) TestReadCanned() {
	_, p, _, _ := runtime.Caller(0)
	p = path.Join(path.Dir(p), "test-data")

	stdout, stderr := s.MustRun(main, []string{"--ds", spec.CreateValueSpecString("nbs", p, "hr"), "list-persons"})
	s.Equal(`Aaron Boodman (id: 7, title: Chief Evangelism Officer)
Samuel Boodman (id: 13, title: VP, Culture)
`, stdout)
	s.Equal("", stderr)
}

func (s *testSuite) TestInvalidDatasetSpec() {
	// Should not crash
	_, _ = s.MustRun(main, []string{"--ds", "invalid-dataset", "list-persons"})
}
@@ -1 +0,0 @@
4:7.18:8s92pdafhd4hkhav6r4748u1rjlosh1k:5b1e9knhol2orv0a8ej6tvelc46jp92l:bsvid54jt8pjto211lcdl14tbfd39jmn:2:998se5i5mf15fld7f318818i6ie0c8rr:2
@@ -1 +0,0 @@
json-import
@@ -1,98 +0,0 @@
// Copyright 2016 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"log"
	"net/http"
	"os"
	"strings"
	"time"

	"github.com/attic-labs/noms/go/config"
	"github.com/attic-labs/noms/go/d"
	"github.com/attic-labs/noms/go/datas"
	"github.com/attic-labs/noms/go/spec"
	"github.com/attic-labs/noms/go/util/jsontonoms"
	"github.com/attic-labs/noms/go/util/progressreader"
	"github.com/attic-labs/noms/go/util/status"
	"github.com/attic-labs/noms/go/util/verbose"
	"github.com/dustin/go-humanize"
	flag "github.com/juju/gnuflag"
)

func main() {
	performCommit := flag.Bool("commit", true, "commit the data to head of the dataset (otherwise only write the data to the dataset)")
	flag.Usage = func() {
		fmt.Fprintf(os.Stderr, "usage: %s <url> <dataset>\n", os.Args[0])
		flag.PrintDefaults()
	}

	spec.RegisterCommitMetaFlags(flag.CommandLine)
	verbose.RegisterVerboseFlags(flag.CommandLine)
	flag.Parse(true)

	if len(flag.Args()) != 2 {
		d.CheckError(errors.New("expected url and dataset flags"))
	}

	cfg := config.NewResolver()
	db, ds, err := cfg.GetDataset(flag.Arg(1))
	d.CheckError(err)
	defer db.Close()

	url := flag.Arg(0)
	if url == "" {
		flag.Usage()
	}

	var r io.Reader
	if strings.HasPrefix(url, "http") {
		res, err := http.Get(url)
		if err != nil {
			log.Fatalf("Error fetching %s: %+v\n", url, err)
		} else if res.StatusCode != 200 {
			log.Fatalf("Error fetching %s: %s\n", url, res.Status)
		}
		defer res.Body.Close()
		r = res.Body
	} else {
		// assume it's a file
		f, err := os.Open(url)
		if err != nil {
			log.Fatalf("Invalid URL %s - does not start with 'http' and isn't local file either. fopen error: %s", url, err)
		}

		r = f
	}

	var jsonObject interface{}
	start := time.Now()
	r = progressreader.New(r, func(seen uint64) {
		elapsed := time.Since(start).Seconds()
		rate := uint64(float64(seen) / elapsed)
		status.Printf("%s decoded in %ds (%s/s)...", humanize.Bytes(seen), int(elapsed), humanize.Bytes(rate))
	})
	err = json.NewDecoder(r).Decode(&jsonObject)
	if err != nil {
		log.Fatalln("Error decoding JSON: ", err)
	}
	status.Done()

	if *performCommit {
		additionalMetaInfo := map[string]string{"url": url}
		meta, err := spec.CreateCommitMetaStruct(ds.Database(), "", "", additionalMetaInfo, nil)
		d.CheckErrorNoUsage(err)
		_, err = db.Commit(ds, jsontonoms.NomsValueFromDecodedJSON(db, jsonObject, true), datas.CommitOptions{Meta: meta})
		d.PanicIfError(err)
	} else {
		ref := db.WriteValue(jsontonoms.NomsValueFromDecodedJSON(db, jsonObject, true))
		fmt.Fprintf(os.Stdout, "#%s\n", ref.TargetHash().String())
	}
}
@@ -1,55 +0,0 @@
# Nomdex

Nomdex demonstrates how Noms maps can be used to index values in a database and provides a simple query language to search for objects.

## Description
This program experiments with using ordinary Noms Maps as indexes. It leverages the fact that Maps in Noms are implemented as prolly-trees, which are similar to B-Trees in the ways that matter for indexing: they are balanced and sorted, require relatively few accesses to reach any leaf node, and are efficient to update.

### Building Indexes
Nomdex constructs indexes as Maps that are keyed by either Strings or Numbers. The values in the index are sets of objects. The following command can be used to build an index:
```shell
nomdex up --in-path <absolute noms path> --by <relative noms path> --out-ds <dataset name>
```
The ***'in-path'*** argument must be a ValueSpec (see [Spelling In Noms](../../../doc/spelling.md#spelling-values)) that designates the root of an object hierarchy to be scanned for "indexable" objects.

The ***'by'*** argument must be a relative path. Nomdex traverses every value reachable from 'in-path' and attempts to resolve this relative ***'by'*** path from it. Any value whose 'by' path resolves to a String, Number, or Bool is added to the index, using the resolved attribute as its key.

The ***'out-ds'*** argument specifies a dataset name that will be used to store the new index.

In addition, there are arguments that allow values to be transformed before they are used as keys in the index by applying regular expressions. Consult the help text and code to see how those can be used.

### Queries in Nomdex
Once an index is built, it can be queried using the nomdex find command. For example, given a database that contains structs of the following type representing cities:
```go
struct Row {
    City: String,
    State: String,
    GeoPos: struct {
        Latitude: Number,
        Longitude: Number,
    }
}
```
The following commands could be used to build indexes on the City, State, Latitude and Longitude attributes.
```shell
nomdex up --in-path http://localhost:8000::cities --by .City --out-ds by-name
nomdex up --in-path http://localhost:8000::cities --by .State --out-ds by-state
nomdex up --in-path http://localhost:8000::cities --by .GeoPos.Latitude --out-ds by-lat
nomdex up --in-path http://localhost:8000::cities --by .GeoPos.Longitude --out-ds by-lon
```
Once these indexes are created, the following queries could be made using the find command:
```shell
// find all cities in California
nomdex find --db http://localhost:8000 'by-state = "California"'

// find all cities whose name begins with A, B, or C
nomdex find --db http://localhost:8000 'by-name >= "A" and by-name < "D"'

// find all tropical cities whose name begins with A, B, or C
nomdex find --db http://localhost:8000 '(by-name >= "A" and by-name < "D") and (by-lat >= -23.5 and by-lat <= 23.5)'
```
The nomdex query language is simple: it consists of comparison expressions of the form '*indexName comparisonOperator constantValue*'. Index names are the dataset names given as the ***'out-ds'*** argument to the 'nomdex up' command. Comparison operators can be one of: <, <=, >, >=, =, !=. Constants are either String values, which are quoted ("hi, I'm a string constant"), or Numbers, which consist of digits with an optional decimal point and minus sign: 1, -1, 2.3, -3.2.

In addition, comparison expressions can be combined using "and" and "or". Parentheses can, and should, be used to make the order of evaluation explicit.

Note: nomdex is not a complete query system. Its purpose is only to illustrate that Noms maps have all the necessary properties to be used as indexes. A complete query system would have many additional features and the ability to optimize queries in an intelligent way.
@@ -1,205 +0,0 @@
// Copyright 2016 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package main

import (
	"bytes"
	"fmt"
	"io"
	"sort"

	"github.com/attic-labs/noms/go/types"
)

type expr interface {
	ranges() queryRangeSlice
	dbgPrintTree(w io.Writer, level int)
	indexName() string
	iterator(im *indexManager) types.SetIterator
}

// logExpr represents a logical 'and' or 'or' expression between two other expressions.
// e.g. logExpr would represent the and/or expressions in this query:
// (index1 > 0 and index1 < 9) or (index1 > 100 and index < 109)
type logExpr struct {
	op      boolOp
	expr1   expr
	expr2   expr
	idxName string
}

type compExpr struct {
	idxName string
	op      compOp
	v1      types.Value
}

func (le logExpr) indexName() string {
	return le.idxName
}

func (le logExpr) iterator(im *indexManager) types.SetIterator {
	if le.idxName != "" {
		return unionizeIters(iteratorsFromRanges(im.indexes[le.idxName], le.ranges()))
	}

	i1 := le.expr1.iterator(im)
	i2 := le.expr2.iterator(im)
	var iter types.SetIterator
	switch le.op {
	case and:
		if i1 == nil || i2 == nil {
			return nil
		}
		iter = types.NewIntersectionIterator(le.expr1.iterator(im), le.expr2.iterator(im))
	case or:
		if i1 == nil {
			return i2
		}
		if i2 == nil {
			return i1
		}
		iter = types.NewUnionIterator(le.expr1.iterator(im), le.expr2.iterator(im))
	}
	return iter
}

func (le logExpr) ranges() (ranges queryRangeSlice) {
	rslice1 := le.expr1.ranges()
	rslice2 := le.expr2.ranges()
	rslice := queryRangeSlice{}

	switch le.op {
	case and:
		if len(rslice1) == 0 || len(rslice2) == 0 {
			return rslice
		}
		for _, r1 := range rslice1 {
			for _, r2 := range rslice2 {
				rslice = append(rslice, r1.and(r2)...)
			}
		}
		sort.Sort(rslice)
		return rslice
	case or:
		if len(rslice1) == 0 {
			return rslice2
		}
		if len(rslice2) == 0 {
			return rslice1
		}
		for _, r1 := range rslice1 {
			for _, r2 := range rslice2 {
				rslice = append(rslice, r1.or(r2)...)
			}
		}
		sort.Sort(rslice)
		return rslice
	}
	return queryRangeSlice{}
}

func (le logExpr) dbgPrintTree(w io.Writer, level int) {
	fmt.Fprintf(w, "%*s%s\n", 2*level, "", le.op)
	if le.expr1 != nil {
		le.expr1.dbgPrintTree(w, level+1)
	}
	if le.expr2 != nil {
		le.expr2.dbgPrintTree(w, level+1)
	}
}

func (re compExpr) indexName() string {
	return re.idxName
}

func iteratorsFromRange(index types.Map, rd queryRange) []types.SetIterator {
	first := true
	iterators := []types.SetIterator{}
	index.IterFrom(rd.lower.value, func(k, v types.Value) bool {
		if first && rd.lower.value != nil && !rd.lower.include && rd.lower.value.Equals(k) {
			return false
		}
		if rd.upper.value != nil {
			if !rd.upper.include && rd.upper.value.Equals(k) {
				return true
			}
			if rd.upper.value.Less(k) {
				return true
			}
		}
		s := v.(types.Set)
		iterators = append(iterators, s.Iterator())
		return false
	})
	return iterators
}

func iteratorsFromRanges(index types.Map, ranges queryRangeSlice) []types.SetIterator {
	iterators := []types.SetIterator{}
	for _, r := range ranges {
		iterators = append(iterators, iteratorsFromRange(index, r)...)
	}
	return iterators
}

func unionizeIters(iters []types.SetIterator) types.SetIterator {
	if len(iters) == 0 {
		return nil
	}
	if len(iters) <= 1 {
		return iters[0]
	}

	unionIters := []types.SetIterator{}
	var iter0 types.SetIterator
	for i, iter := range iters {
		if i%2 == 0 {
			iter0 = iter
		} else {
			unionIters = append(unionIters, types.NewUnionIterator(iter0, iter))
			iter0 = nil
		}
	}
	if iter0 != nil {
		unionIters = append(unionIters, iter0)
	}
	return unionizeIters(unionIters)
}

func (re compExpr) iterator(im *indexManager) types.SetIterator {
	index := im.indexes[re.idxName]
	iters := iteratorsFromRanges(index, re.ranges())
	return unionizeIters(iters)
}

func (re compExpr) ranges() (ranges queryRangeSlice) {
	var r queryRange
	switch re.op {
	case equals:
		e := bound{value: re.v1, include: true}
		r = queryRange{lower: e, upper: e}
	case gt:
		r = queryRange{lower: bound{re.v1, false, 0}, upper: bound{nil, true, 1}}
	case gte:
		r = queryRange{lower: bound{re.v1, true, 0}, upper: bound{nil, true, 1}}
	case lt:
		r = queryRange{lower: bound{nil, true, -1}, upper: bound{re.v1, false, 0}}
	case lte:
		r = queryRange{lower: bound{nil, true, -1}, upper: bound{re.v1, true, 0}}
	case ne:
		return queryRangeSlice{
			{lower: bound{nil, true, -1}, upper: bound{re.v1, false, 0}},
			{lower: bound{re.v1, false, 0}, upper: bound{nil, true, 1}},
		}
	}
	return queryRangeSlice{r}
}

func (re compExpr) dbgPrintTree(w io.Writer, level int) {
	buf := bytes.Buffer{}
	types.WriteEncodedValue(&buf, re.v1)
	fmt.Fprintf(w, "%*s%s %s %s\n", 2*level, "", re.idxName, re.op, buf.String())
}
@@ -1,83 +0,0 @@
// Copyright 2016 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package main

import (
	"fmt"
	"os"
	"path"

	"github.com/attic-labs/noms/cmd/util"
	"github.com/attic-labs/noms/go/d"
	"github.com/attic-labs/noms/go/util/exit"
	flag "github.com/juju/gnuflag"
)

var commands = []*util.Command{
	update,
	find,
}

var usageLine = `Nomdex builds indexes to support fast data access.`

func main() {
	progName := path.Base(os.Args[0])
	util.InitHelp(progName, commands, usageLine)
	flag.Usage = util.Usage
	flag.Parse(false)

	args := flag.Args()
	if len(args) < 1 {
		util.Usage()
		return
	}

	if args[0] == "help" {
		util.Help(args[1:])
		return
	}

	for _, cmd := range commands {
		if cmd.Name() == args[0] {
			flags := cmd.Flags()
			flags.Usage = cmd.Usage

			flags.Parse(true, args[1:])
			args = flags.Args()
			if cmd.Nargs != 0 && len(args) < cmd.Nargs {
				cmd.Usage()
			}
			exitCode := cmd.Run(args)
			if exitCode != 0 {
				exit.Exit(exitCode)
			}
			return
		}
	}

	fmt.Fprintf(os.Stderr, "noms: unknown command %q\n", args[0])
	util.Usage()
}

func printError(err error, msgAndArgs ...interface{}) bool {
	if err != nil {
		err := d.Unwrap(err)
		switch len(msgAndArgs) {
		case 0:
			fmt.Fprintf(os.Stderr, "error: %s\n", err)
		case 1:
			fmt.Fprintf(os.Stderr, "%s%s\n", msgAndArgs[0], err)
		default:
			format, ok := msgAndArgs[0].(string)
			if ok {
				s1 := fmt.Sprintf(format, msgAndArgs[1:]...)
				fmt.Fprintf(os.Stderr, "%s%s\n", s1, err)
			} else {
				fmt.Fprintf(os.Stderr, "error: %s\n", err)
			}
		}
	}
	return err != nil
}
@@ -1,181 +0,0 @@
|
||||
// Copyright 2016 Attic Labs, Inc. All rights reserved.
|
||||
// Licensed under the Apache License, version 2.0:
|
||||
// http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
package main
|
||||
|
||||
import (
|
||||
"fmt"
|
||||
"io"
|
||||
"os"
|
||||
|
||||
"github.com/attic-labs/noms/cmd/util"
|
||||
"github.com/attic-labs/noms/go/config"
|
||||
"github.com/attic-labs/noms/go/datas"
|
||||
"github.com/attic-labs/noms/go/types"
|
||||
"github.com/attic-labs/noms/go/util/outputpager"
|
||||
"github.com/attic-labs/noms/go/util/verbose"
|
||||
flag "github.com/juju/gnuflag"
|
||||
)
|
||||
|
||||
var longFindHelp = `'nomdex find' retrieves and prints objects that satisfy the 'query' argument.
|
||||
|
||||
Indexes are built using the 'nomdex up' command. For information about building
|
||||
indexes, see: nomdex up -h
|
||||
|
||||
Objects that have been indexed can be quickly found using the nomdex query
|
||||
language. For example, consider objects with the following type:
|
||||
|
||||
struct Person {
|
||||
name String,
|
||||
geopos struct GeoPos {
|
||||
latitude Float,
|
||||
longitude Float,
|
||||
}
|
||||
}
|
||||
|
||||
Objects of this type can be indexed on the name, latitude and longitude fields
|
||||
with the following commands:
|
||||
nomdex up --in-path ~/nomsdb::people.value --by .name --out-ds by-name
|
||||
nomdex up --in-path ~/nomsdb::people.value --by .geopos.latitude --out-ds by-lat
|
||||
nomdex up --in-path ~/nomsdb::people.value --by .geopos.longitude --out-ds by-lng
|
||||
|
||||
The following query could be used to find all people with an address near the
|
||||
equator:
|
||||
nomdex find 'by-lat >= -1.0 and by-lat <= 1.0'
|
||||
|
||||
We could also get a list of all people who live near the equator whose name begins with "A":
|
||||
nomdex find '(by-name >= "A" and by-name < "B") and (by-lat >= -1.0 and by-lat <= 1.0)'
|
||||
|
||||
The query language is simple. It currently supports the following relational operators:
|
||||
<, <=, >, >=, =, !=
|
||||
Relational expressions are always of the form:
|
||||
<index> <relational operator> <constant> e.g. personId >= 2000.
|
||||
|
||||
Indexes are the name given by the --out-ds argument in the 'nomdex up' command.
|
||||
Constants are either "strings" (in quotes) or numbers (e.g. 3, 3000, -2, -2.5,
|
||||
3.147, etc).
|
||||
|
||||
Relational expressions can be combined using the "and" and "or" operators.
|
||||
Parentheses can (and should) be used to ensure that the evaluation is done in
|
||||
the desired order.
|
||||
`

var find = &util.Command{
	Run:       runFind,
	UsageLine: "find --db <database spec> <query>",
	Short:     "Print objects in index that satisfy 'query'",
	Long:      longFindHelp,
	Flags:     setupFindFlags,
	Nargs:     1,
}

var dbPath = ""

func setupFindFlags() *flag.FlagSet {
	flagSet := flag.NewFlagSet("find", flag.ExitOnError)
	flagSet.StringVar(&dbPath, "db", "", "database containing index")
	outputpager.RegisterOutputpagerFlags(flagSet)
	verbose.RegisterVerboseFlags(flagSet)
	return flagSet
}

func runFind(args []string) int {
	query := args[0]
	if dbPath == "" {
		fmt.Fprintf(os.Stderr, "Missing required 'db' arg\n")
		flag.Usage()
		return 1
	}

	cfg := config.NewResolver()
	db, err := cfg.GetDatabase(dbPath)
	if printError(err, "Unable to open database\n\terror: ") {
		return 1
	}
	defer db.Close()

	im := &indexManager{db: db, indexes: map[string]types.Map{}}
	expr, err := parseQuery(query, im)
	if err != nil {
		fmt.Printf("err: %s\n", err)
		return 1
	}

	pgr := outputpager.Start()
	defer pgr.Stop()

	iter := expr.iterator(im)
	cnt := 0
	if iter != nil {
		for v := iter.Next(); v != nil; v = iter.Next() {
			types.WriteEncodedValue(pgr.Writer, v)
			fmt.Fprintf(pgr.Writer, "\n")
			cnt++
		}
	}
	fmt.Fprintf(pgr.Writer, "Found %d objects\n", cnt)

	return 0
}

func printObjects(w io.Writer, index types.Map, ranges queryRangeSlice) {
	cnt := 0
	first := true
	printObjectForRange := func(index types.Map, r queryRange) {
		index.IterFrom(r.lower.value, func(k, v types.Value) bool {
			if first && r.lower.value != nil && !r.lower.include && r.lower.value.Equals(k) {
				return false
			}
			if r.upper.value != nil {
				if !r.upper.include && r.upper.value.Equals(k) {
					return true
				}
				if r.upper.value.Less(k) {
					return true
				}
			}
			s := v.(types.Set)
			s.IterAll(func(v types.Value) {
				types.WriteEncodedValue(w, v)
				fmt.Fprintf(w, "\n")
				cnt++
			})
			return false
		})
	}
	for _, r := range ranges {
		printObjectForRange(index, r)
	}
	fmt.Fprintf(w, "Found %d objects\n", cnt)
}

func openIndex(idxName string, im *indexManager) error {
	if _, hasIndex := im.indexes[idxName]; hasIndex {
		return nil // already open
	}

	ds := im.db.GetDataset(idxName)
	commit, ok := ds.MaybeHead()
	if !ok {
		return fmt.Errorf("index '%s' not found", idxName)
	}

	index, ok := commit.Get(datas.ValueField).(types.Map)
	if !ok {
		return fmt.Errorf("value of commit at '%s' is not a valid index", idxName)
	}

	// TODO: make this type be Map<String | Float, Set<Value>> once Issue #2326 gets
	// resolved and IsSubtype() returns the correct value.
	typ := types.MakeMapType(
		types.MakeUnionType(types.StringType, types.FloaTType),
		types.ValueType)

	if !types.IsValueSubtypeOf(index, typ) {
		return fmt.Errorf("%s does not point to a suitable index type", idxName)
	}

	im.indexes[idxName] = index
	return nil
}

@@ -1,146 +0,0 @@
// Copyright 2016 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package main

import (
	"regexp"
	"testing"

	"github.com/attic-labs/noms/go/datas"
	"github.com/attic-labs/noms/go/marshal"
	"github.com/attic-labs/noms/go/nbs"
	"github.com/attic-labs/noms/go/spec"
	"github.com/attic-labs/noms/go/util/clienttest"
	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/suite"
)

type TestObj struct {
	Key    int
	Fname  string
	Lname  string
	Gender string
	Age    int
}

type testSuite struct {
	clienttest.ClientTestSuite
}

func TestNomdex(t *testing.T) {
	suite.Run(t, &testSuite{})
}

func makeTestDb(s *testSuite, dsId string) datas.Database {
	db := datas.NewDatabase(nbs.NewLocalStore(s.DBDir, clienttest.DefaultMemTableSize))
	l1 := []TestObj{
		{1, "will", "smith", "m", 40},
		{2, "lana", "turner", "f", 91},
		{3, "john", "wayne", "m", 86},
		{4, "johnny", "depp", "m", 50},
		{5, "merrill", "streep", "f", 60},
		{6, "rob", "courdry", "m", 45},
		{7, "bruce", "lee", "m", 72},
		{8, "bruce", "willis", "m", 36},
		{9, "luis", "bunuel", "m", 100},
		{10, "andy", "sandberg", "m", 32},
		{11, "walter", "coggins", "m", 28},
		{12, "seth", "rogan", "m", 29},
	}

	m1 := map[string]TestObj{
		"lg": {13, "lady", "gaga", "f", 39},
		"ss": {14, "sam", "smith", "m", 28},
		"rp": {15, "robert", "plant", "m", 69},
		"ml": {16, "meat", "loaf", "m", 65},
		"gf": {17, "glenn", "frey", "m", 60},
		"jr": {18, "joey", "ramone", "m", 55},
		"rc": {19, "ray", "charles", "m", 72},
		"bk": {20, "bb", "king", "m", 77},
		"b":  {21, "beck", "", "m", 38},
		"md": {22, "miles", "davis", "m", 82},
		"rd": {23, "roger", "daltry", "m", 62},
		"jf": {24, "john", "fogerty", "m", 60},
	}

	m := map[string]interface{}{"actors": l1, "musicians": m1}
	v, err := marshal.Marshal(db, m)
	s.NoError(err)
	_, err = db.CommitValue(db.GetDataset(dsId), v)
	s.NoError(err)
	return db
}

func (s *testSuite) TestNomdex() {
	dsId := "data"
	db := makeTestDb(s, dsId)
	s.NotNil(db)
	db.Close()

	fnameIdx := "fname-idx"
	dataSpec := spec.CreateValueSpecString("nbs", s.DBDir, dsId)
	dbSpec := spec.CreateDatabaseSpecString("nbs", s.DBDir)
	stdout, stderr := s.MustRun(main, []string{"up", "--out-ds", fnameIdx, "--in-path", dataSpec, "--by", ".fname"})
	s.Contains(stdout, "Indexed 24 objects")
	s.Equal("", stderr)

	genderIdx := "gender-idx"
	stdout, stderr = s.MustRun(main, []string{"up", "--out-ds", genderIdx, "--in-path", dataSpec, "--by", ".gender"})
	s.Contains(stdout, "Indexed 24 objects")
	s.Equal("", stderr)

	stdout, stderr = s.MustRun(main, []string{"find", "--db", dbSpec, `fname-idx = "lady"`})
	s.Contains(stdout, "Found 1 objects")
	s.Equal("", stderr)

	stdout, stderr = s.MustRun(main, []string{"find", "--db", dbSpec, `fname-idx = "lady" and gender-idx = "f"`})
	s.Contains(stdout, "Found 1 objects")
	s.Equal("", stderr)

	stdout, stderr = s.MustRun(main, []string{"find", "--db", dbSpec, `fname-idx != "lady" and gender-idx != "m"`})
	s.Contains(stdout, "Found 2 objects")
	s.Equal("", stderr)

	stdout, stderr = s.MustRun(main, []string{"find", "--db", dbSpec, `fname-idx != "lady" and fname-idx != "john"`})
	s.Contains(stdout, "Found 21 objects")
	s.Equal("", stderr)

	stdout, stderr = s.MustRun(main, []string{"find", "--db", dbSpec, `fname-idx != "lady" or gender-idx != "f"`})
	s.Contains(stdout, "Found 23 objects")
	s.Equal("", stderr)
}

func TestTransform(t *testing.T) {
	assert := assert.New(t)

	tcs := [][]string{
		{`"01/02/2003"`, "\"(\\d{2})/(\\d{2})/(\\d{4})\"", "$3/$2/$1", "2003/02/01"},
	}

	for _, tc := range tcs {
		base, regex, replace, expected := tc[0], tc[1], tc[2], tc[3]

		testRe := regexp.MustCompile(regex)
		result := testRe.ReplaceAllString(base, replace)
		assert.Equal(expected, result)
	}

	tcs = [][]string{
		{"343 STATE ST\nROCHESTER, NY 14650\n(43.161276, -77.619386)", "43.161276", "-77.619386"},
		{"TWO EMBARCADERO CENTER\nPROMENADE LEVEL SAN FRANCISCO, CA 94111\n", "", ""},
	}

	findLatRe := regexp.MustCompile("(?s)\\(([\\d.]+)")
	findLngRe := regexp.MustCompile("(?s)(-?[\\d.]+)\\)")
	for _, tc := range tcs {
		base, expectedLat, expectedLng := tc[0], tc[1], tc[2]

		lat := findLatRe.FindStringSubmatch(base)
		assert.True(len(lat) == 0 && expectedLat == "" || (len(lat) == 2 && expectedLat == lat[1]))

		lng := findLngRe.FindStringSubmatch(base)
		assert.True(len(lng) == 0 && expectedLng == "" || (len(lng) == 2 && expectedLng == lng[1]))
	}
}

@@ -1,221 +0,0 @@
// Copyright 2016 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package main

import (
	"fmt"
	"os"
	"regexp"
	"strconv"
	"sync"
	"sync/atomic"

	"github.com/attic-labs/noms/cmd/util"
	"github.com/attic-labs/noms/go/config"
	"github.com/attic-labs/noms/go/d"
	"github.com/attic-labs/noms/go/datas"
	"github.com/attic-labs/noms/go/hash"
	"github.com/attic-labs/noms/go/types"
	"github.com/attic-labs/noms/go/util/profile"
	"github.com/attic-labs/noms/go/util/status"
	"github.com/attic-labs/noms/go/util/verbose"
	humanize "github.com/dustin/go-humanize"
	flag "github.com/juju/gnuflag"
)

var (
	inPathArg    = ""
	outDsArg     = ""
	relPathArg   = ""
	txRegexArg   = ""
	txReplaceArg = ""
	txConvertArg = ""
)

var longUpHelp = `'nomdex up' builds indexes that are useful for rapidly accessing objects.

This sample tool can index objects based on any string or number attribute of an
object. The 'up' command works by scanning all the objects reachable from the --in-path
command line argument. It tests each object to determine whether a string or number
value is reachable by applying the --by path argument to the object. If so, the object is
added to the index under that value.

For example, if there are objects in the database that contain a gender and an
address field, 'nomdex up' can scan all the objects in a given dataset and build
an index on the specified field with the following commands:
    nomdex up --in-path <dsSpec>.value --by .gender --out-ds gender-index
    nomdex up --in-path <dsSpec>.value --by .address.city --out-ds city-index

The previous commands can be understood as follows. Each command updates or
builds an index by scanning all the objects reachable from |in-path| that
have a string or number value reachable using |by|, and stores the root of the
resulting index in a dataset specified by |out-ds|.

Notice that the --in-path argument has a value of '<dsSpec>.value'. The '.value'
is not strictly necessary, but it's normally useful when indexing. Since datasets
generally point to Commit objects in Noms, they usually have parents which are
previous versions of the data. If you add '.value' to the end of the dataset, only
the most recent version of the data will be indexed. Without the '.value', all
objects in all previous commits will also be indexed, which is most often not what
is expected.

There are three additional arguments that can be useful for transforming the value
being indexed:
    * tx-replace: used to modify the behavior of tx-regex, see below
    * tx-regex: the behavior of this argument depends on whether a tx-replace
      argument is present. If so, the Go function regexp.ReplaceAllString() is called:
          txRe := regexp.MustCompile(|tx-regex|)
          txRe.ReplaceAllString(|index value|, |tx-replace|)
      If tx-replace is not present, then the following call is made on each value:
          txRe := regexp.MustCompile(|tx-regex|)
          txRe.FindStringSubmatch(|index value|)
    * tx-convert: attempts to convert the index value to the type specified.
      Currently the only value accepted for this arg is 'number'.

The resulting indexes can be used by the 'nomdex find' command. For help on that
see: nomdex find -h
`
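The tx-regex/tx-replace semantics described above can be demonstrated standalone. A hedged sketch (the `transform` helper is illustrative, not part of nomdex; the "keep the last submatch" rule matches what addToGraphBuilder below does):

```go
package main

import (
	"fmt"
	"regexp"
)

// transform mimics the tx-regex/tx-replace behavior: with a replacement
// string it calls ReplaceAllString; without one it keeps the last submatch
// returned by FindStringSubmatch, or "" when nothing matches.
func transform(value, txRegex, txReplace string) string {
	txRe := regexp.MustCompile(txRegex)
	if txReplace != "" {
		return txRe.ReplaceAllString(value, txReplace)
	}
	matches := txRe.FindStringSubmatch(value)
	if len(matches) == 0 {
		return ""
	}
	return matches[len(matches)-1]
}

func main() {
	// Reorder a quoted US-style date into year/month/day, as in the tests.
	fmt.Println(transform(`"01/02/2003"`, `"(\d{2})/(\d{2})/(\d{4})"`, "$3/$2/$1"))
	// Extract a latitude from an address blob (no replacement given).
	fmt.Println(transform("(43.161276, -77.619386)", `\(([\d.]+)`, ""))
}
```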

var update = &util.Command{
	Run:       runUpdate,
	UsageLine: "up --in-path <path> --out-ds <dspath> --by <relativepath>",
	Short:     "Build/Update an index",
	Long:      longUpHelp,
	Flags:     setupUpdateFlags,
	Nargs:     0,
}

func setupUpdateFlags() *flag.FlagSet {
	flagSet := flag.NewFlagSet("up", flag.ExitOnError)
	flagSet.StringVar(&inPathArg, "in-path", "", "a value to search for items to index within")
	flagSet.StringVar(&outDsArg, "out-ds", "", "name of dataset to save the results to")
	flagSet.StringVar(&relPathArg, "by", "", "a path relative to all the items in <in-path> to index by")
	flagSet.StringVar(&txRegexArg, "tx-regex", "", "perform a string transformation on value before putting it in index")
	flagSet.StringVar(&txReplaceArg, "tx-replace", "", "replace values matched by tx-regex")
	flagSet.StringVar(&txConvertArg, "tx-convert", "", "convert the result of a tx regex/replace to this type (only does 'number' currently)")
	verbose.RegisterVerboseFlags(flagSet)
	profile.RegisterProfileFlags(flagSet)
	return flagSet
}

type StreamingSetEntry struct {
	valChan chan<- types.Value
	setChan <-chan types.Set
}

type IndexMap map[types.Value]StreamingSetEntry

type Index struct {
	m          IndexMap
	indexedCnt int64
	seenCnt    int64
	mutex      sync.Mutex
}

func runUpdate(args []string) int {
	requiredArgs := map[string]string{"in-path": inPathArg, "out-ds": outDsArg, "by": relPathArg}
	for argName, argValue := range requiredArgs {
		if argValue == "" {
			fmt.Fprintf(os.Stderr, "Missing required '%s' arg\n", argName)
			flag.Usage()
			return 1
		}
	}

	defer profile.MaybeStartProfile().Stop()

	cfg := config.NewResolver()
	db, rootObject, err := cfg.GetPath(inPathArg)
	d.Chk.NoError(err)

	if rootObject == nil {
		fmt.Printf("Object not found: %s\n", inPathArg)
		return 1
	}

	outDs := db.GetDataset(outDsArg)
	relPath, err := types.ParsePath(relPathArg)
	if printError(err, "Error parsing --by value\n\t") {
		return 1
	}

	gb := types.NewGraphBuilder(db, types.MapKind)
	addElementsToGraphBuilder(gb, db, rootObject, relPath)
	indexMap := gb.Build().(types.Map)

	outDs, err = db.Commit(outDs, indexMap, datas.CommitOptions{})
	d.Chk.NoError(err)
	fmt.Printf("Committed index with %d entries to dataset: %s\n", indexMap.Len(), outDsArg)

	return 0
}

func addElementsToGraphBuilder(gb *types.GraphBuilder, db datas.Database, rootObject types.Value, relPath types.Path) {
	typeCacheMutex := sync.Mutex{}
	typeCache := map[hash.Hash]bool{}

	var txRe *regexp.Regexp
	if txRegexArg != "" {
		var err error
		txRe, err = regexp.Compile(txRegexArg)
		d.CheckError(err)
	}

	index := Index{m: IndexMap{}}
	types.WalkValues(rootObject, db, func(v types.Value) bool {
		typ := types.TypeOf(v)
		typeCacheMutex.Lock()
		hasPath, ok := typeCache[typ.Hash()]
		typeCacheMutex.Unlock()
		if !ok || hasPath {
			pathResolved := false
			tv := relPath.Resolve(v, db)
			if tv != nil {
				index.addToGraphBuilder(gb, tv, v, txRe)
				pathResolved = true
			}
			if !ok {
				typeCacheMutex.Lock()
				typeCache[typ.Hash()] = pathResolved
				typeCacheMutex.Unlock()
			}
		}
		return false
	})

	status.Done()
}

func (idx *Index) addToGraphBuilder(gb *types.GraphBuilder, k, v types.Value, txRe *regexp.Regexp) {
	atomic.AddInt64(&idx.seenCnt, 1)
	if txRe != nil {
		k1 := types.EncodedValue(k)
		k2 := ""
		if txReplaceArg != "" {
			k2 = txRe.ReplaceAllString(string(k1), txReplaceArg)
		} else {
			matches := txRe.FindStringSubmatch(string(k1))
			if len(matches) > 0 {
				k2 = matches[len(matches)-1]
			}
		}
		if txConvertArg == "number" {
			if k2 == "" {
				return
			}
			n, err := strconv.ParseFloat(k2, 64)
			if err != nil {
				fmt.Println("error converting to number: ", err)
				return
			}
			k = types.Float(n)
		} else {
			k = types.String(k2)
		}
	}
	atomic.AddInt64(&idx.indexedCnt, 1)
	gb.SetInsert(types.ValueSlice{k}, v)
	status.Printf("Found %s objects, Indexed %s objects", humanize.Comma(idx.seenCnt), humanize.Comma(idx.indexedCnt))
}

@@ -1,263 +0,0 @@
// Copyright 2016 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package main

import (
	"fmt"
	"strconv"
	"strings"
	"text/scanner"
	"unicode"

	"github.com/attic-labs/noms/go/d"
	"github.com/attic-labs/noms/go/datas"
	"github.com/attic-labs/noms/go/types"
)

/**** Query language BNF
query    := expr
expr     := expr boolOp compExpr | group
compExpr := indexToken compOp value
group    := '(' expr ')' | compExpr
boolOp   := 'and' | 'or'
compOp   := '=' | '<' | '<=' | '>' | '>=' | '!='
value    := "<string>" | number
number   := '-' digits | digits
digits   := int | float
*/

type compOp string
type boolOp string

type indexManager struct {
	db      datas.Database
	indexes map[string]types.Map
}

const (
	equals compOp = "="
	gt     compOp = ">"
	gte    compOp = ">="
	lt     compOp = "<"
	lte    compOp = "<="
	ne     compOp = "!="
	openP         = "("
	closeP        = ")"
	and    boolOp = "and"
	or     boolOp = "or"
)

var (
	compOps = []compOp{equals, gt, gte, lt, lte, ne}
	boolOps = []boolOp{and, or}
)

type qScanner struct {
	s           scanner.Scanner
	peekedToken rune
	peekedText  string
	peeked      bool
}

func (qs *qScanner) Scan() rune {
	var r rune
	if qs.peeked {
		r = qs.peekedToken
		qs.peeked = false
	} else {
		r = qs.s.Scan()
	}
	return r
}

func (qs *qScanner) Peek() rune {
	if !qs.peeked {
		qs.peekedToken = qs.s.Scan()
		qs.peekedText = qs.s.TokenText()
		qs.peeked = true
	}
	return qs.peekedToken
}

func (qs *qScanner) TokenText() string {
	if qs.peeked {
		return qs.peekedText
	}
	return qs.s.TokenText()
}

func (qs *qScanner) Pos() scanner.Position {
	return qs.s.Pos()
}

func parseQuery(q string, im *indexManager) (expr, error) {
	s := NewQueryScanner(q)
	var expr expr
	err := d.Try(func() {
		expr = s.parseExpr(0, im)
	})
	return expr, err
}

func NewQueryScanner(query string) *qScanner {
	isIdentRune := func(r rune, i int) bool {
		identChars := ":/.>=-"
		startIdentChars := "!><"
		if i == 0 {
			return unicode.IsLetter(r) || strings.ContainsRune(startIdentChars, r)
		}
		return unicode.IsLetter(r) || unicode.IsDigit(r) || strings.ContainsRune(identChars, r)
	}

	errorFunc := func(s *scanner.Scanner, msg string) {
		d.PanicIfError(fmt.Errorf("%s, pos: %s", msg, s.Pos()))
	}

	var s scanner.Scanner
	// Init must run before the other fields are set: it resets Mode (and
	// Whitespace) to the Go token defaults.
	s.Init(strings.NewReader(query))
	s.Mode = scanner.ScanIdents | scanner.ScanFloats | scanner.ScanStrings | scanner.SkipComments
	s.IsIdentRune = isIdentRune
	s.Error = errorFunc
	qs := qScanner{s: s}
	return &qs
}

func (qs *qScanner) parseExpr(level int, im *indexManager) expr {
	tok := qs.Scan()
	switch tok {
	case '(':
		expr1 := qs.parseExpr(level+1, im)
		tok := qs.Scan()
		if tok != ')' {
			d.PanicIfError(fmt.Errorf("missing ending paren for expr"))
		}
		tok = qs.Peek()
		if tok == ')' {
			return expr1
		}
		tok = qs.Scan()
		text := qs.TokenText()
		switch {
		case tok == scanner.Ident && isBoolOp(text):
			op := boolOp(text)
			expr2 := qs.parseExpr(level+1, im)
			return logExpr{op: op, expr1: expr1, expr2: expr2, idxName: idxNameIfSame(expr1, expr2)}
		case tok == scanner.EOF:
			return expr1
		default:
			d.PanicIfError(fmt.Errorf("extra text found at end of expr, tok: %d, text: %s", int(tok), qs.TokenText()))
		}
	case scanner.Ident:
		err := openIndex(qs.TokenText(), im)
		d.PanicIfError(err)
		expr1 := qs.parseCompExpr(level+1, qs.TokenText(), im)
		tok := qs.Peek()
		switch tok {
		case ')':
			return expr1
		case rune(scanner.Ident):
			_ = qs.Scan()
			text := qs.TokenText()
			if isBoolOp(text) {
				op := boolOp(text)
				expr2 := qs.parseExpr(level+1, im)
				return logExpr{op: op, expr1: expr1, expr2: expr2, idxName: idxNameIfSame(expr1, expr2)}
			}
			d.PanicIfError(fmt.Errorf("expected boolean op, found: %s, level: %d", text, level))
		case rune(scanner.EOF):
			return expr1
		default:
			_ = qs.Scan()
		}
	default:
		d.PanicIfError(fmt.Errorf("unexpected token in expr: %s, %d", qs.TokenText(), tok))
	}
	return logExpr{}
}

func (qs *qScanner) parseCompExpr(level int, indexName string, im *indexManager) compExpr {
	qs.Scan()
	text := qs.TokenText()
	if !isCompOp(text) {
		d.PanicIfError(fmt.Errorf("expected relop token but found: '%s'", text))
	}
	op := compOp(text)
	value := qs.parseValExpr()
	return compExpr{indexName, op, value}
}

func (qs *qScanner) parseValExpr() types.Value {
	tok := qs.Scan()
	text := qs.TokenText()
	isNeg := false
	if tok == '-' {
		isNeg = true
		tok = qs.Scan()
		text = qs.TokenText()
	}
	switch tok {
	case scanner.String:
		if isNeg {
			d.PanicIfError(fmt.Errorf("expected number after '-', found string: %s", text))
		}
		return valueFromString(text)
	case scanner.Float:
		f, _ := strconv.ParseFloat(text, 64)
		if isNeg {
			f = -f
		}
		return types.Float(f)
	case scanner.Int:
		i, _ := strconv.ParseInt(text, 10, 64)
		if isNeg {
			i = -i
		}
		return types.Float(i)
	}
	d.PanicIfError(fmt.Errorf("expected value token, found: '%s'", text))
	return nil // for compiler
}

func valueFromString(t string) types.Value {
	l := len(t)
	if l < 2 || t[0] != '"' || t[l-1] != '"' {
		d.PanicIfError(fmt.Errorf("unable to get value from token: %s", t))
	}
	return types.String(t[1 : l-1])
}

func isCompOp(s string) bool {
	for _, op := range compOps {
		if s == string(op) {
			return true
		}
	}
	return false
}

func isBoolOp(s string) bool {
	for _, op := range boolOps {
		if s == string(op) {
			return true
		}
	}
	return false
}

func idxNameIfSame(expr1, expr2 expr) string {
	if expr1.indexName() == expr2.indexName() {
		return expr1.indexName()
	}
	return ""
}

@@ -1,139 +0,0 @@
// Copyright 2016 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package main

import (
	"testing"
	"text/scanner"

	"github.com/attic-labs/noms/go/chunks"
	"github.com/attic-labs/noms/go/datas"
	"github.com/attic-labs/noms/go/types"
	"github.com/stretchr/testify/assert"
)

type scannerResult struct {
	tok  int
	text string
}

type parseResult struct {
	query string
	ex    expr
}

func TestQueryScanner(t *testing.T) {
	assert := assert.New(t)

	s := NewQueryScanner(`9 (99.9) -9 0x7F "99.9" and or http://localhost:8000/cli-tour::yo <= >= < > = _ !=`)

	scannerResults := []scannerResult{
		{tok: scanner.Int, text: "9"},
		{tok: int('('), text: "("},
		{tok: scanner.Float, text: "99.9"},
		{tok: int(')'), text: ")"},
		{tok: '-', text: "-"},
		{tok: scanner.Int, text: "9"},
		{tok: scanner.Int, text: "0x7F"},
		{tok: scanner.String, text: `"99.9"`},
		{tok: scanner.Ident, text: "and"},
		{tok: scanner.Ident, text: "or"},
		{tok: scanner.Ident, text: "http://localhost:8000/cli-tour::yo"},
		{tok: scanner.Ident, text: "<="},
		{tok: scanner.Ident, text: ">="},
		{tok: scanner.Ident, text: "<"},
		{tok: scanner.Ident, text: ">"},
		{tok: int('='), text: "="},
		{tok: int('_'), text: "_"},
		{tok: scanner.Ident, text: "!="},
	}

	for _, sr := range scannerResults {
		tok := s.Scan()
		assert.Equal(sr.tok, int(tok), "expected text: %s, found: %s, pos: %s", sr.text, s.TokenText(), s.Pos())
		assert.Equal(sr.text, s.TokenText())
	}
	tok := s.Scan()
	assert.Equal(scanner.EOF, int(tok))
}

func TestPeek(t *testing.T) {
	assert := assert.New(t)

	s := NewQueryScanner(`_ < "one"`)
	scannerResults := []scannerResult{
		{tok: int('_'), text: "_"},
		{tok: scanner.Ident, text: "<"},
		{tok: scanner.String, text: `"one"`},
		{tok: scanner.EOF, text: ""},
	}

	for _, sr := range scannerResults {
		assert.Equal(sr.tok, int(s.Peek()))
		assert.Equal(sr.text, s.TokenText())
		assert.Equal(sr.tok, int(s.Scan()))
		assert.Equal(sr.text, s.TokenText())
	}
}

func TestParsing(t *testing.T) {
	assert := assert.New(t)

	re1 := compExpr{"index1", equals, types.Float(2015)}
	re2 := compExpr{"index1", gte, types.Float(2020)}
	re3 := compExpr{"index1", lte, types.Float(2022)}
	re4 := compExpr{"index1", lt, types.Float(-2030)}
	re5 := compExpr{"index1", ne, types.Float(3.5)}
	re6 := compExpr{"index1", ne, types.Float(-3500.4536632)}
	re7 := compExpr{"index1", ne, types.String("whassup")}

	queries := []parseResult{
		{`index1 = 2015`, re1},
		{`(index1 = 2015 )`, re1},
		{`(((index1 = 2015 ) ))`, re1},
		{`index1 = 2015 or index1 >= 2020`, logExpr{or, re1, re2, "index1"}},
		{`(index1 = 2015) or index1 >= 2020`, logExpr{or, re1, re2, "index1"}},
		{`index1 = 2015 or (index1 >= 2020)`, logExpr{or, re1, re2, "index1"}},
		{`(index1 = 2015 or index1 >= 2020)`, logExpr{or, re1, re2, "index1"}},
		{`(index1 = 2015 or index1 >= 2020) and index1 <= 2022`, logExpr{and, logExpr{or, re1, re2, "index1"}, re3, "index1"}},
		{`index1 = 2015 or index1 >= 2020 and index1 <= 2022`, logExpr{or, re1, logExpr{and, re2, re3, "index1"}, "index1"}},
		{`index1 = 2015 or index1 >= 2020 and index1 <= 2022 or index1 < -2030`, logExpr{or, re1, logExpr{and, re2, logExpr{or, re3, re4, "index1"}, "index1"}, "index1"}},
		{`(index1 = 2015 or index1 >= 2020) and (index1 <= 2022 or index1 < -2030)`, logExpr{and, logExpr{or, re1, re2, "index1"}, logExpr{or, re3, re4, "index1"}, "index1"}},
		{`index1 != 3.5`, re5},
		{`index1 != -3500.4536632`, re6},
		{`index1 != "whassup"`, re7},
	}

	storage := &chunks.MemoryStorage{}
	db := datas.NewDatabase(storage.NewView())
	_, err := db.CommitValue(db.GetDataset("index1"), types.NewMap(db, types.String("one"), types.NewSet(db, types.String("two"))))
	assert.NoError(err)

	im := &indexManager{db: db, indexes: map[string]types.Map{}}
	for _, pr := range queries {
		expr, err := parseQuery(pr.query, im)
		assert.NoError(err)
		assert.Equal(pr.ex, expr, "bad query: %s", pr.query)
	}

	badQueries := []string{
		`sdfsd = 2015`,
		`index1 = "unfinished string`,
		`index1 and 2015`,
		`index1 < `,
		`index1 < 2015 and ()`,
		`index1 < 2015 an index1 > 2016`,
		`(index1 < 2015) what`,
		`(index1< 2015`,
		`(badIndexName < 2015)`,
	}

	im1 := &indexManager{db: db, indexes: map[string]types.Map{}}
	for _, q := range badQueries {
		expr, err := parseQuery(q, im1)
		assert.Error(err)
		assert.Nil(expr)
	}
}
|
||||
@@ -1,162 +0,0 @@
|
||||
// Copyright 2016 Attic Labs, Inc. All rights reserved.
|
||||
// Licensed under the Apache License, version 2.0:
|
||||
// http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
package main
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"fmt"
|
||||
"io"
|
||||
"sort"
|
||||
|
||||
"github.com/attic-labs/noms/go/types"
|
||||
)
|
||||
|
||||
type bound struct {
|
||||
value types.Value
|
||||
include bool
|
||||
infinity int8
|
||||
}
|
||||
|
||||
func (b bound) isLessThanOrEqual(o bound) (res bool) {
	return b.equals(o) || b.isLessThan(o)
}

func (b bound) isLessThan(o bound) (res bool) {
	if b.infinity < o.infinity {
		return true
	}

	if b.infinity > o.infinity {
		return false
	}

	if b.infinity == o.infinity && b.infinity != 0 {
		return false
	}

	if b.value.Less(o.value) {
		return true
	}

	if b.value.Equals(o.value) {
		if !b.include && o.include {
			return true
		}
	}
	return false
}

func (b bound) isGreaterThanOrEqual(o bound) (res bool) {
	return !b.isLessThan(o)
}

func (b bound) isGreaterThan(o bound) (res bool) {
	// strictly greater: neither equal to nor less than o
	return !b.equals(o) && !b.isLessThan(o)
}

func (b bound) equals(o bound) bool {
	return b.infinity == o.infinity && b.include == o.include &&
		(b.value == nil && o.value == nil || (b.value != nil && o.value != nil && b.value.Equals(o.value)))
}

func (b bound) String() string {
	var s1 string
	if b.value == nil {
		s1 = "<nil>"
	} else {
		buf := bytes.Buffer{}
		types.WriteEncodedValue(&buf, b.value)
		s1 = buf.String()
	}
	return fmt.Sprintf("bound{v: %s, include: %t, infinity: %d}", s1, b.include, b.infinity)
}

func (b bound) minValue(o bound) (res bound) {
	if b.isLessThan(o) {
		return b
	}
	return o
}

func (b bound) maxValue(o bound) (res bound) {
	if b.isLessThan(o) {
		return o
	}
	return b
}

type queryRange struct {
	lower bound
	upper bound
}

func (r queryRange) and(o queryRange) (rangeDescs queryRangeSlice) {
	if !r.intersects(o) {
		return []queryRange{}
	}

	lower := r.lower.maxValue(o.lower)
	upper := r.upper.minValue(o.upper)
	return []queryRange{{lower, upper}}
}

func (r queryRange) or(o queryRange) (rSlice queryRangeSlice) {
	if r.intersects(o) {
		v1 := r.lower.minValue(o.lower)
		v2 := r.upper.maxValue(o.upper)
		return queryRangeSlice{queryRange{v1, v2}}
	}
	rSlice = queryRangeSlice{r, o}
	sort.Sort(rSlice)
	return rSlice
}

func (r queryRange) intersects(o queryRange) (res bool) {
	if r.lower.isGreaterThanOrEqual(o.lower) && r.lower.isLessThanOrEqual(o.upper) {
		return true
	}
	if r.upper.isGreaterThanOrEqual(o.lower) && r.upper.isLessThanOrEqual(o.upper) {
		return true
	}
	if o.lower.isGreaterThanOrEqual(r.lower) && o.lower.isLessThanOrEqual(r.upper) {
		return true
	}
	if o.upper.isGreaterThanOrEqual(r.lower) && o.upper.isLessThanOrEqual(r.upper) {
		return true
	}
	return false
}

func (r queryRange) String() string {
	return fmt.Sprintf("queryRange{lower: %s, upper: %s}", r.lower, r.upper)
}

// queryRangeSlice defines the sort.Interface. This implementation sorts queryRanges by
// the lower bound in ascending order.
type queryRangeSlice []queryRange

func (rSlice queryRangeSlice) Len() int {
	return len(rSlice)
}

func (rSlice queryRangeSlice) Swap(i, j int) {
	rSlice[i], rSlice[j] = rSlice[j], rSlice[i]
}

func (rSlice queryRangeSlice) Less(i, j int) bool {
	return !rSlice[i].lower.equals(rSlice[j].lower) && rSlice[i].lower.isLessThanOrEqual(rSlice[j].lower)
}
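The sort.Interface implementation above orders ranges by their lower bound, ascending. The same ordering can be sketched with plain ints and `sort.Slice` (a minimal illustration; `pair` and `sortByLower` are our names, not part of noms):

```go
package main

import (
	"fmt"
	"sort"
)

// pair is an illustrative stand-in for queryRange, with plain int bounds.
type pair struct{ lo, hi int }

// sortByLower orders ranges by lower bound ascending, as
// queryRangeSlice.Less does for bound values.
func sortByLower(rs []pair) []pair {
	sort.Slice(rs, func(i, j int) bool { return rs[i].lo < rs[j].lo })
	return rs
}

func main() {
	fmt.Println(sortByLower([]pair{{6, 10}, {0, 3}, {2, 5}}))
	// prints [{0 3} {2 5} {6 10}]
}
```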

func (rSlice queryRangeSlice) dbgPrint(w io.Writer) {
	for i, rd := range rSlice {
		if i == 0 {
			fmt.Fprintf(w, "\n#################\n")
		}
		fmt.Fprintf(w, "queryRange %d: %s\n", i, rd)
	}
	if len(rSlice) > 0 {
		fmt.Fprintf(w, "\n")
	}
}
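For intuition, the four bound checks in queryRange.intersects collapse, for simple closed intervals without infinities or exclusive bounds, to one pairwise comparison: two intervals overlap iff each starts no later than the other ends. A minimal sketch with plain ints (`interval` is illustrative, not a noms type):

```go
package main

import "fmt"

// interval is a simplified stand-in for queryRange: a closed interval
// [lo, hi] over ints, with no infinities and no exclusive bounds.
type interval struct{ lo, hi int }

// intersects reports whether two closed intervals overlap: each must
// start no later than the other ends.
func (a interval) intersects(b interval) bool {
	return a.lo <= b.hi && b.lo <= a.hi
}

func main() {
	fmt.Println(interval{2, 5}.intersects(interval{0, 8}))  // true: [2,5] lies inside [0,8]
	fmt.Println(interval{2, 5}.intersects(interval{6, 10})) // false: disjoint
}
```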
@@ -1,150 +0,0 @@
// Copyright 2016 Attic Labs, Inc. All rights reserved.
// Licensed under the Apache License, version 2.0:
// http://www.apache.org/licenses/LICENSE-2.0

package main

import (
	"testing"

	"github.com/attic-labs/noms/go/types"
	"github.com/stretchr/testify/assert"
)

const nilHolder = -1000000

var (
	r1  = qr(2, true, 5, true)
	r2  = qr(0, true, 8, true)
	r3  = qr(0, true, 3, true)
	r4  = qr(3, true, 8, true)
	r5  = qr(0, true, 1, true)
	r6  = qr(6, true, 10, true)
	r7  = qr(nilHolder, true, 10, true)
	r8  = qr(3, true, nilHolder, true)
	r10 = qr(2, true, 5, false)
	r11 = qr(5, true, 10, true)
)

func newBound(i int, include bool, infinity int) bound {
	var v types.Value
	if i != nilHolder {
		v = types.Float(i)
	}
	return bound{value: v, include: include, infinity: int8(infinity)}
}

func qr(lower int, lowerIncl bool, upper int, upperIncl bool) queryRange {
	lowerInf := 0
	if lower == nilHolder {
		lowerInf = -1
	}
	upperInf := 0
	if upper == nilHolder {
		upperInf = 1
	}
	return queryRange{newBound(lower, lowerIncl, lowerInf), newBound(upper, upperIncl, upperInf)}
}

func TestRangeIntersects(t *testing.T) {
	assert := assert.New(t)

	assert.True(r1.intersects(r2))
	assert.True(r1.intersects(r3))
	assert.True(r1.intersects(r4))
	assert.True(r2.intersects(r1))
	assert.True(r1.intersects(r7))
	assert.True(r1.intersects(r8))
	assert.True(r3.intersects(r4))
	assert.True(r4.intersects(r3))

	assert.False(r1.intersects(r5))
	assert.False(r1.intersects(r6))
	assert.False(r10.intersects(r11))
}

func TestRangeAnd(t *testing.T) {
	assert := assert.New(t)

	assert.Empty(r1.and(r5))
	assert.Empty(r1.and(r6))

	assert.Equal(r1, r1.and(r2)[0])
	assert.Equal(r1, r2.and(r1)[0])

	expected := qr(3, true, 5, true)
	assert.Equal(expected, r1.and(r4)[0])
}

func TestRangeOr(t *testing.T) {
	assert := assert.New(t)

	assert.Equal(r2, r1.or(r2)[0])

	expected := qr(0, true, 5, true)
	assert.Equal(expected, r1.or(r3)[0])

	expectedSlice := queryRangeSlice{r5, r1}
	assert.Equal(expectedSlice, r1.or(r5))
	assert.Equal(expectedSlice, r5.or(r1))
}

func TestIsLessThan(t *testing.T) {
	assert := assert.New(t)

	assert.True(newBound(1, true, 0).isLessThanOrEqual(newBound(2, true, 0)))
	assert.False(newBound(2, true, 0).isLessThanOrEqual(newBound(1, true, 0)))
	assert.True(newBound(1, true, 0).isLessThanOrEqual(newBound(1, true, 0)))

	assert.True(newBound(1, false, 0).isLessThanOrEqual(newBound(2, false, 0)))
	assert.False(newBound(2, false, 0).isLessThanOrEqual(newBound(1, false, 0)))
	assert.True(newBound(1, false, 0).isLessThanOrEqual(newBound(1, false, 0)))

	assert.False(newBound(1, true, 0).isLessThanOrEqual(newBound(1, false, 0)))
	assert.True(newBound(1, false, 0).isLessThanOrEqual(newBound(1, true, 0)))

	assert.True(newBound(nilHolder, true, -1).isLessThanOrEqual(newBound(1, true, 0)))
	assert.False(newBound(1, false, 0).isLessThanOrEqual(newBound(nilHolder, true, -1)))
}

func TestIsGreaterThan(t *testing.T) {
	assert := assert.New(t)

	assert.True(newBound(2, true, 0).isGreaterThanOrEqual(newBound(1, true, 0)))
	assert.False(newBound(1, true, 0).isGreaterThanOrEqual(newBound(2, true, 0)))
	assert.True(newBound(1, true, 0).isGreaterThanOrEqual(newBound(1, true, 0)))

	assert.True(newBound(2, false, 0).isGreaterThanOrEqual(newBound(1, false, 0)))
	assert.False(newBound(1, false, 0).isGreaterThanOrEqual(newBound(2, false, 0)))
	assert.True(newBound(1, false, 0).isGreaterThanOrEqual(newBound(1, false, 0)))

	assert.True(newBound(1, true, 0).isGreaterThanOrEqual(newBound(1, false, 0)))
	assert.False(newBound(1, false, 0).isGreaterThanOrEqual(newBound(2, true, 0)))

	assert.True(newBound(nilHolder, true, 1).isGreaterThanOrEqual(newBound(1, true, 0)))
	assert.False(newBound(1, true, 0).isGreaterThanOrEqual(newBound(nilHolder, true, 1)))
}

func TestMinValue(t *testing.T) {
	assert := assert.New(t)
	ve1 := newBound(5, false, 0)
	ve2 := newBound(5, true, 0)
	ve3 := newBound(nilHolder, true, -1)
	ve4 := newBound(nilHolder, true, 1)

	assert.Equal(ve1, ve1.minValue(ve2))
	assert.Equal(ve3, ve1.minValue(ve3))
	assert.Equal(ve1, ve1.minValue(ve4))
}

func TestMaxValue(t *testing.T) {
	assert := assert.New(t)
	ve1 := newBound(5, false, 0)
	ve2 := newBound(5, true, 0)
	ve3 := newBound(nilHolder, true, -1)
	ve4 := newBound(nilHolder, true, 1)

	assert.Equal(ve2, ve1.maxValue(ve2))
	assert.Equal(ve1, ve1.maxValue(ve3))
	assert.Equal(ve4, ve1.maxValue(ve4))
}
@@ -1,81 +0,0 @@
package main

import (
	"flag"
	"fmt"
	"log"
	"time"

	"github.com/attic-labs/noms/go/types"
	"github.com/pkg/profile"
)

type NextEdit func() (types.Value, types.Value)

type MEBenchmark interface {
	GetName() string
	AddEdits(nextEdit NextEdit)
	//SortEdits()
	Map()
}

func main() {
	profPath := flag.String("profpath", "./", "directory to write profiles to")
	cpuProf := flag.Bool("cpuprof", false, "enable cpu profiling")
	memProf := flag.Bool("memprof", false, "enable memory profiling")
	meBench := flag.Bool("me-bench", false, "benchmark the noms map editor")
	count := flag.Int("n", 1000000, "number of edits per benchmark")
	flag.Parse()

	if *cpuProf {
		fmt.Println("cpu profiling enabled.")
		fmt.Println("writing cpu prof to", *profPath)
		defer profile.Start(profile.CPUProfile, profile.ProfilePath(*profPath)).Stop()
	}

	if *memProf {
		fmt.Println("mem profiling enabled.")
		fmt.Println("writing mem prof to", *profPath)
		defer profile.Start(profile.MemProfile, profile.ProfilePath(*profPath)).Stop()
	}

	var toBench []MEBenchmark
	if *meBench {
		toBench = append(toBench, NewNomsMEBench())
	}

	log.Printf("Running each benchmark for %d items\n", *count)
	tg := NewTupleGen(*count)
	run(tg, toBench)
}

func benchmark(meb MEBenchmark, nextKVP NextEdit) {
	startAdd := time.Now()
	meb.AddEdits(nextKVP)
	endAdd := time.Now()
	addDelta := endAdd.Sub(startAdd)

	log.Printf("%s - add time: %f\n", meb.GetName(), addDelta.Seconds())

	/*startSort := time.Now()
	meb.SortEdits()
	endSort := time.Now()
	sortDelta := endSort.Sub(startSort)

	log.Printf("%s - sort time: %f\n", meb.GetName(), sortDelta.Seconds())*/

	startMap := time.Now()
	meb.Map()
	endMap := time.Now()
	mapDelta := endMap.Sub(startMap)

	log.Printf("%s - map time: %f\n", meb.GetName(), mapDelta.Seconds())
}

func run(tg *TupleGen, toBench []MEBenchmark) {
	for _, currBench := range toBench {
		log.Println("Starting", currBench.GetName())
		tg.Reset()
		benchmark(currBench, tg.NextKVP)
		log.Println(currBench.GetName(), "completed")
	}
}
@@ -1,36 +0,0 @@
package main

import (
	"context"

	"github.com/attic-labs/noms/go/chunks"
	"github.com/attic-labs/noms/go/types"
)

type NomsMEBench struct {
	me *types.MapEditor
}

func NewNomsMEBench() *NomsMEBench {
	ts := &chunks.TestStorage{}
	vrw := types.NewValueStore(ts.NewView())
	me := types.NewMap(context.Background(), vrw).Edit()

	return &NomsMEBench{me}
}

func (nmeb *NomsMEBench) GetName() string {
	return "noms map editor"
}

func (nmeb *NomsMEBench) AddEdits(nextEdit NextEdit) {
	k, v := nextEdit()

	for k != nil {
		nmeb.me = nmeb.me.Set(k, v)
		k, v = nextEdit()
	}
}

func (nmeb *NomsMEBench) Map() {
	nmeb.me.Map(context.Background())
}
@@ -1,51 +0,0 @@
package main

import (
	"math/rand"

	"github.com/attic-labs/noms/go/types"
	"github.com/google/uuid"
)

type TupleGen struct {
	keys []uint64
	pos  int
	rng  *rand.Rand
}

func NewTupleGen(count int) *TupleGen {
	rng := rand.New(rand.NewSource(0))
	keySet := make(map[uint64]struct{}, count)
	for len(keySet) < count {
		keySet[rng.Uint64()] = struct{}{}
	}

	keys := make([]uint64, 0, count)
	for k := range keySet {
		keys = append(keys, k)
	}

	return &TupleGen{keys, 0, rng}
}

func (tg *TupleGen) Reset() {
	tg.pos = 0
}

func (tg *TupleGen) NextKVP() (types.Value, types.Value) {
	if tg.pos >= len(tg.keys) {
		return nil, nil
	}

	key := types.Uint(tg.keys[tg.pos])
	val := types.NewTuple(
		types.UUID(uuid.New()),
		types.Int(tg.rng.Int63()),
		types.Uint(tg.rng.Uint64()),
		types.Float(tg.rng.Float64()),
		types.String("test string"),
		types.Bool(tg.rng.Int()%2 == 0),
		types.NullValue)

	tg.pos++
	return key, val
}
@@ -1 +0,0 @@
nomsfs
@@ -1,125 +0,0 @@
# nomsfs

Nomsfs is a [FUSE](https://en.wikipedia.org/wiki/Filesystem_in_Userspace) filesystem built on Noms. To use it you'll need FUSE:

* *Linux* -- built-in; you should be good to go
* *Mac OS X* -- Install [FUSE for OS X](https://osxfuse.github.io/)

Development and testing have been done exclusively on Mac OS X using FUSE for OS X.
Nomsfs builds on the [Go FUSE implementation](https://github.com/hanwen/go-fuse) from Han-Wen Nienhuys.

## Usage

Make sure FUSE is installed. On Mac OS X remember to run `/Library/Filesystems/osxfuse.fs/Contents/Resources/load_osxfuse`.

Build with `go build` (or just run with `go run nomsfs.go`); test with `go test`.

Mount an existing or new dataset by executing `nomsfs`:

```shell
$ mkdir /var/tmp/mnt
$ go run nomsfs.go /var/tmp/nomsfs::fs /var/tmp/mnt
running...
```

Use ^C to stop `nomsfs`.

### Exploring The Data

1. Once you have a mount point and `nomsfs` is running you can add/delete/rename files and directories using the Finder or the command line as you would with any other file system.
2. Stop `nomsfs` with ^C.
3. Let's look around the dataset:

```shell
> noms ds /var/tmp/nomsfs
fs
> noms show /var/tmp/nomsfs::fs
struct Commit {
  meta: struct {},
  parents: Set<Ref<Cycle<Commit>>>,
  value: struct Filesystem {
    root: struct Inode {
      attr: struct Attr {
        ctime: Number,
        gid: Number,
        mode: Number,
        mtime: Number,
        uid: Number,
        xattr: Map<String, Blob>,
      },
      contents: struct Directory {
        entries: Map<String, Cycle<1>>,
      } | struct Symlink {
        targetPath: String,
      } | struct File {
        data: Ref<Blob>,
      },
    },
  },
}({
  meta: {},
  parents: {
    d6jn389ov693oa4b9vqhe3fmn2g49c2k,
  },
  value: Filesystem {
    root: Inode {
      attr: Attr {
        ctime: 1.4703496225642643e+09,
        gid: 20,
        mode: 511,
        mtime: 1.4703496225642643e+09,
        uid: 501,
        xattr: {},
      },
      contents: Directory {
        entries: {
          "file.txt": Inode {
            attr: Attr {
              ctime: 1.470349669044128e+09,
              gid: 20,
              mode: 420,
              mtime: 1.465233596e+09,
              uid: 501,
              xattr: {
                "com.apple.FinderInfo": 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  // 32 B
                  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00,
              },
            },
            contents: File {
              data: hv6f7d07uajec3mebergu810v12gem83,
            },
          },
          "noms_logo.png": Inode {
            attr: Attr {
              ctime: 1.4703496464136713e+09,
              gid: 20,
              mode: 420,
              mtime: 1.470171468e+09,
              uid: 501,
              xattr: {
                "com.apple.FinderInfo": 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  // 32 B
                  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00,
                "com.apple.quarantine": 30 30 30 32 3b 35 37 61 31 30 39 34 63 3b 50 72  // 22 B
                  65 76 69 65 77 3b,
              },
            },
            contents: File {
              data: higtjmhq7fo5m072vkmmldtmkn2vspkb,
            },
          },
          ...
```

## Limitations

Hard links are not supported at this time, but may be added in the future.
Mounting a dataset in multiple locations is not supported, but may be added in the future.

## Troubleshooting

`Mount failed: no FUSE devices found`

Make sure FUSE is installed. If you're on Mac OS X make sure the kernel module is loaded by executing `/Library/Filesystems/osxfuse.fs/Contents/Resources/load_osxfuse`.

## Contributing

Issues welcome; testing welcome; code welcome. Feel free to pitch in!