Bump github.com/blevesearch/bleve/v2 from 2.3.7 to 2.3.9

Bumps [github.com/blevesearch/bleve/v2](https://github.com/blevesearch/bleve) from 2.3.7 to 2.3.9.
- [Release notes](https://github.com/blevesearch/bleve/releases)
- [Commits](https://github.com/blevesearch/bleve/compare/v2.3.7...v2.3.9)

---
updated-dependencies:
- dependency-name: github.com/blevesearch/bleve/v2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
dependabot[bot]
2023-08-15 06:44:05 +00:00
committed by Ralf Haferkamp
parent 82b600aef5
commit f9b69afa9e
58 changed files with 1138 additions and 330 deletions
-32
@@ -1,32 +0,0 @@
language: go
sudo: false
install:
- go get -t github.com/RoaringBitmap/roaring
- go get -t golang.org/x/tools/cmd/cover
- go get -t github.com/mattn/goveralls
- go get -t github.com/mschoch/smat
notifications:
email: false
go:
- "1.13.x"
- "1.14.x"
- tip
# whitelist
branches:
only:
- master
script:
- goveralls -v -service travis-ci -ignore rle16_gen.go,rle_gen.go,rle.go || go test
- go test -race -run TestConcurrent*
- go build -tags appengine
- go test -tags appengine
- GOARCH=arm64 go build
- GOARCH=386 go build
- GOARCH=386 go test
- GOARCH=arm go build
- GOARCH=arm64 go build
matrix:
allow_failures:
- go: tip
+17 -11
@@ -1,4 +1,4 @@
roaring [![Build Status](https://travis-ci.org/RoaringBitmap/roaring.png)](https://travis-ci.org/RoaringBitmap/roaring) [![GoDoc](https://godoc.org/github.com/RoaringBitmap/roaring/roaring64?status.svg)](https://godoc.org/github.com/RoaringBitmap/roaring/roaring64) [![Go Report Card](https://goreportcard.com/badge/RoaringBitmap/roaring)](https://goreportcard.com/report/github.com/RoaringBitmap/roaring)
roaring [![GoDoc](https://godoc.org/github.com/RoaringBitmap/roaring/roaring64?status.svg)](https://godoc.org/github.com/RoaringBitmap/roaring/roaring64) [![Go Report Card](https://goreportcard.com/badge/RoaringBitmap/roaring)](https://goreportcard.com/report/github.com/RoaringBitmap/roaring)
[![Build Status](https://cloud.drone.io/api/badges/RoaringBitmap/roaring/status.svg)](https://cloud.drone.io/RoaringBitmap/roaring)
![Go-CI](https://github.com/RoaringBitmap/roaring/workflows/Go-CI/badge.svg)
![Go-ARM-CI](https://github.com/RoaringBitmap/roaring/workflows/Go-ARM-CI/badge.svg)
@@ -7,10 +7,8 @@ roaring [![Build Status](https://travis-ci.org/RoaringBitmap/roaring.png)](https
This is a Go version of the Roaring bitmap data structure.
Roaring bitmaps are used by several major systems such as [Apache Lucene][lucene] and derivative systems such as [Solr][solr] and
[Elasticsearch][elasticsearch], [Apache Druid (Incubating)][druid], [LinkedIn Pinot][pinot], [Netflix Atlas][atlas], [Apache Spark][spark], [OpenSearchServer][opensearchserver], [Cloud Torrent][cloudtorrent], [Whoosh][whoosh], [Pilosa][pilosa], [Microsoft Visual Studio Team Services (VSTS)][vsts], and eBay's [Apache Kylin][kylin]. The YouTube SQL Engine, [Google Procella](https://research.google/pubs/pub48388/), uses Roaring bitmaps for indexing.
[Elasticsearch][elasticsearch], [Apache Druid (Incubating)][druid], [LinkedIn Pinot][pinot], [Netflix Atlas][atlas], [Apache Spark][spark], [OpenSearchServer][opensearchserver], [anacrolix/torrent][anacrolix/torrent], [Whoosh][whoosh], [Pilosa][pilosa], [Microsoft Visual Studio Team Services (VSTS)][vsts], and eBay's [Apache Kylin][kylin]. The YouTube SQL Engine, [Google Procella](https://research.google/pubs/pub48388/), uses Roaring bitmaps for indexing.
[lucene]: https://lucene.apache.org/
[solr]: https://lucene.apache.org/solr/
@@ -18,7 +16,7 @@ Roaring bitmaps are used by several major systems such as [Apache Lucene][lucene
[druid]: https://druid.apache.org/
[spark]: https://spark.apache.org/
[opensearchserver]: http://www.opensearchserver.com
[cloudtorrent]: https://github.com/jpillora/cloud-torrent
[anacrolix/torrent]: https://github.com/anacrolix/torrent
[whoosh]: https://bitbucket.org/mchaput/whoosh/wiki/Home
[pilosa]: https://www.pilosa.com/
[kylin]: http://kylin.apache.org/
@@ -32,7 +30,7 @@ Roaring bitmaps are found to work well in many important applications:
The ``roaring`` Go library is used by
* [Cloud Torrent](https://github.com/jpillora/cloud-torrent)
* [anacrolix/torrent]
* [runv](https://github.com/hyperhq/runv)
* [InfluxDB](https://www.influxdata.com)
* [Pilosa](https://www.pilosa.com/)
@@ -42,6 +40,7 @@ The ``roaring`` Go library is used by
* [SourceGraph](https://github.com/sourcegraph/sourcegraph)
* [M3](https://github.com/m3db/m3)
* [trident](https://github.com/NetApp/trident)
* [Husky](https://www.datadoghq.com/blog/engineering/introducing-husky/)
This library is used in production in several systems. It is part of the [Awesome Go collection](https://awesome-go.com).
@@ -148,10 +147,8 @@ formats like WAH, EWAH, Concise... Maybe surprisingly, Roaring also generally of
- Daniel Lemire, Owen Kaser, Nathan Kurz, Luca Deri, Chris O'Hara, François Saint-Jacques, Gregory Ssi-Yan-Kai, Roaring Bitmaps: Implementation of an Optimized Software Library, Software: Practice and Experience 48 (4), 2018 [arXiv:1709.07821](https://arxiv.org/abs/1709.07821)
- Samy Chambi, Daniel Lemire, Owen Kaser, Robert Godin,
Better bitmap performance with Roaring bitmaps,
Software: Practice and Experience 46 (5), 2016.
http://arxiv.org/abs/1402.6407 This paper used data from http://lemire.me/data/realroaring2014.html
- Daniel Lemire, Gregory Ssi-Yan-Kai, Owen Kaser, Consistently faster and smaller compressed bitmaps with Roaring, Software: Practice and Experience 46 (11), 2016. http://arxiv.org/abs/1603.06549
Software: Practice and Experience 46 (5), 2016. [arXiv:1402.6407](http://arxiv.org/abs/1402.6407) This paper used data from http://lemire.me/data/realroaring2014.html
- Daniel Lemire, Gregory Ssi-Yan-Kai, Owen Kaser, Consistently faster and smaller compressed bitmaps with Roaring, Software: Practice and Experience 46 (11), 2016. [arXiv:1603.06549](http://arxiv.org/abs/1603.06549)
### Dependencies
@@ -170,6 +167,15 @@ Note that the smat library requires Go 1.6 or better.
- go get -t github.com/RoaringBitmap/roaring
### Instructions for contributors
Using bash or other common shells:
```
$ git clone git@github.com:RoaringBitmap/roaring.git
$ export GO111MODULE=on
$ go mod tidy
$ go test -v
```
### Example
@@ -325,7 +331,7 @@ Only the 32-bit roaring format is standard and cross-operable between Java, C++,
### Documentation
Current documentation is available at http://godoc.org/github.com/RoaringBitmap/roaring and http://godoc.org/github.com/RoaringBitmap/roaring64
Current documentation is available at https://pkg.go.dev/github.com/RoaringBitmap/roaring and https://pkg.go.dev/github.com/RoaringBitmap/roaring/roaring64
### Goroutine safety
+30 -4
@@ -1007,16 +1007,42 @@ func (ac *arrayContainer) containerType() contype {
return arrayContype
}
func (ac *arrayContainer) addOffset(x uint16) []container {
low := &arrayContainer{}
high := &arrayContainer{}
func (ac *arrayContainer) addOffset(x uint16) (container, container) {
var low, high *arrayContainer
if len(ac.content) == 0 {
return nil, nil
}
if y := uint32(ac.content[0]) + uint32(x); highbits(y) == 0 {
// Some elements will fall into low part, allocate a container.
// Checking the first one is enough because they are ordered.
low = &arrayContainer{}
}
if y := uint32(ac.content[len(ac.content)-1]) + uint32(x); highbits(y) > 0 {
// Some elements will fall into high part, allocate a container.
// Checking the last one is enough because they are ordered.
high = &arrayContainer{}
}
for _, val := range ac.content {
y := uint32(val) + uint32(x)
if highbits(y) > 0 {
// OK, if high == nil then highbits(y) == 0 for all y.
high.content = append(high.content, lowbits(y))
} else {
// OK, if low == nil then highbits(y) > 0 for all y.
low.content = append(low.content, lowbits(y))
}
}
return []container{low, high}
// Ensure proper nil interface.
if low == nil {
return nil, high
}
if high == nil {
return low, nil
}
return low, high
}
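The split performed by the new `addOffset` can be illustrated on a plain sorted slice. This is a minimal sketch, not the library's code: `splitWithOffset` is a hypothetical helper, and it simply leaves a part `nil` when no value lands in it.

```go
package main

import "fmt"

// splitWithOffset (hypothetical) adds x to every value of a sorted uint16
// slice and splits the results at the 16-bit boundary. Values that overflow
// 16 bits go to high (keeping only their low 16 bits); the rest go to low.
// A part that receives no value stays nil, so nothing is allocated for it.
func splitWithOffset(content []uint16, x uint16) (low, high []uint16) {
	for _, v := range content {
		y := uint32(v) + uint32(x)
		if y>>16 > 0 {
			high = append(high, uint16(y)) // conversion keeps the low 16 bits
		} else {
			low = append(low, uint16(y))
		}
	}
	return low, high
}

func main() {
	low, high := splitWithOffset([]uint16{1, 65000, 65535}, 1000)
	fmt.Println(low, high) // [1001] [464 999]
}
```

Because the input is sorted, inspecting only the first and last shifted values is enough to know in advance which of the two parts will be non-empty, which is the check the diff adds before allocating.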
+36 -7
@@ -350,7 +350,6 @@ func (bc *bitmapContainer) getCardinality() int {
return bc.cardinality
}
func (bc *bitmapContainer) isEmpty() bool {
return bc.cardinality == 0
}
@@ -1125,15 +1124,20 @@ func (bc *bitmapContainer) containerType() contype {
return bitmapContype
}
func (bc *bitmapContainer) addOffset(x uint16) []container {
low := newBitmapContainer()
high := newBitmapContainer()
func (bc *bitmapContainer) addOffset(x uint16) (container, container) {
var low, high *bitmapContainer
if bc.cardinality == 0 {
return nil, nil
}
b := uint32(x) >> 6
i := uint32(x) % 64
end := uint32(1024) - b
low = newBitmapContainer()
if i == 0 {
copy(low.bitmap[b:], bc.bitmap[:end])
copy(high.bitmap[:b], bc.bitmap[end:])
} else {
low.bitmap[b] = bc.bitmap[0] << i
for k := uint32(1); k < end; k++ {
@@ -1141,6 +1145,26 @@ func (bc *bitmapContainer) addOffset(x uint16) []container {
newval |= bc.bitmap[k-1] >> (64 - i)
low.bitmap[b+k] = newval
}
}
low.computeCardinality()
if low.cardinality == bc.cardinality {
// All elements from bc ended up in low, meaning high will be empty.
return low, nil
}
if low.cardinality == 0 {
// low is empty, let's reuse the container for high.
high = low
low = nil
} else {
// None of the containers will be empty, so allocate both.
high = newBitmapContainer()
}
if i == 0 {
copy(high.bitmap[:b], bc.bitmap[end:])
} else {
for k := end; k < 1024; k++ {
newval := bc.bitmap[k] << i
newval |= bc.bitmap[k-1] >> (64 - i)
@@ -1148,7 +1172,12 @@ func (bc *bitmapContainer) addOffset(x uint16) []container {
}
high.bitmap[b] = bc.bitmap[1023] >> (64 - i)
}
low.computeCardinality()
high.computeCardinality()
return []container{low, high}
// Ensure proper nil interface.
if low == nil {
return nil, high
}
return low, high
}
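The "Ensure proper nil interface" comments guard against a well-known Go pitfall: a nil pointer of a concrete type, once stored in an interface, no longer compares equal to `nil`. A small sketch, using simplified stand-ins for the `container` interface and `arrayContainer` type (the names `leaky` and `safe` are hypothetical):

```go
package main

import "fmt"

// container and arrayContainer are simplified stand-ins for illustration.
type container interface{ isEmpty() bool }

type arrayContainer struct{ content []uint16 }

func (ac *arrayContainer) isEmpty() bool { return ac == nil || len(ac.content) == 0 }

// leaky returns the typed pointer directly: a nil *arrayContainer becomes a
// non-nil interface value that carries a type but no data.
func leaky(ac *arrayContainer) container { return ac }

// safe returns an untyped nil when the pointer is nil, so callers can rely
// on a plain `!= nil` check.
func safe(ac *arrayContainer) container {
	if ac == nil {
		return nil
	}
	return ac
}

func main() {
	var none *arrayContainer
	fmt.Println(leaky(none) == nil) // false
	fmt.Println(safe(none) == nil)  // true
}
```

This is why the functions in this diff return a literal `nil` for the missing part instead of returning the `low` and `high` pointer variables unconditionally: the caller tests `lo != nil` and `hi != nil`.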
+2
@@ -1,4 +1,6 @@
//go:build go1.9
// +build go1.9
// "go1.9", from Go version 1.9 onward
// See https://golang.org/pkg/go/build/#hdr-Build_Constraints
+1
@@ -1,3 +1,4 @@
//go:build !go1.9
// +build !go1.9
package roaring
+2
@@ -1,4 +1,6 @@
//go:build go1.9
// +build go1.9
// "go1.9", from Go version 1.9 onward
// See https://golang.org/pkg/go/build/#hdr-Build_Constraints
+1
@@ -1,3 +1,4 @@
//go:build !go1.9
// +build !go1.9
package roaring
+4
@@ -121,6 +121,10 @@ func (x1 *Bitmap) repairAfterLazy() {
// FastAnd computes the intersection between many bitmaps quickly
// Compared to the And function, it can take many bitmaps as input, thus saving the trouble
// of manually calling "And" many times.
//
// Performance hints: if you have very large and tiny bitmaps,
// it may be beneficial performance-wise to put a tiny bitmap
// in first position.
func FastAnd(bitmaps ...*Bitmap) *Bitmap {
if len(bitmaps) == 0 {
return NewBitmap()
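The performance hint can be illustrated with plain sorted slices. The sketch below is not how `FastAnd` is implemented; `intersect` and `intersectAll` are hypothetical helpers that only show why starting from the smallest input tends to help: the running result stays small and the loop can stop as soon as it becomes empty.

```go
package main

import (
	"fmt"
	"sort"
)

// intersect returns the values present in both sorted slices.
func intersect(a, b []uint32) []uint32 {
	var out []uint32
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		switch {
		case a[i] < b[j]:
			i++
		case a[i] > b[j]:
			j++
		default:
			out = append(out, a[i])
			i++
			j++
		}
	}
	return out
}

// intersectAll orders the inputs by size so that the smallest set seeds the
// running result, then stops early once the result is empty.
func intersectAll(sets ...[]uint32) []uint32 {
	if len(sets) == 0 {
		return nil
	}
	ordered := append([][]uint32(nil), sets...) // do not reorder the caller's slice
	sort.Slice(ordered, func(i, j int) bool { return len(ordered[i]) < len(ordered[j]) })
	result := ordered[0]
	for _, s := range ordered[1:] {
		if len(result) == 0 {
			break
		}
		result = intersect(result, s)
	}
	return result
}

func main() {
	large := []uint32{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
	tiny := []uint32{3, 7}
	mid := []uint32{1, 3, 5, 7, 9}
	fmt.Println(intersectAll(large, tiny, mid)) // [3 7]
}
```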
+2
@@ -1,4 +1,6 @@
//go:build go1.9
// +build go1.9
// "go1.9", from Go version 1.9 onward
// See https://golang.org/pkg/go/build/#hdr-Build_Constraints
+1
@@ -1,3 +1,4 @@
//go:build amd64 && !appengine && !go1.9
// +build amd64,!appengine,!go1.9
package roaring
+1
@@ -1,3 +1,4 @@
//go:build !go1.9
// +build !go1.9
package roaring
+1
@@ -1,3 +1,4 @@
//go:build !amd64 || appengine || go1.9
// +build !amd64 appengine go1.9
package roaring
+206 -69
@@ -53,6 +53,59 @@ func (rb *Bitmap) ToBytes() ([]byte, error) {
return rb.highlowcontainer.toBytes()
}
// Checksum computes a hash (currently FNV-1a) for a bitmap that is suitable for
// using bitmaps as elements in hash sets or as keys in hash maps, as well as
// generally quicker comparisons.
// The implementation is biased towards efficiency in little endian machines, so
// expect some extra CPU cycles and memory to be used if your machine is big endian.
// Likewise, don't use this to verify integrity unless you're certain you'll load
// the bitmap on a machine with the same endianness used to create it.
func (rb *Bitmap) Checksum() uint64 {
const (
offset = 14695981039346656037
prime = 1099511628211
)
var bytes []byte
hash := uint64(offset)
bytes = uint16SliceAsByteSlice(rb.highlowcontainer.keys)
for _, b := range bytes {
hash ^= uint64(b)
hash *= prime
}
for _, c := range rb.highlowcontainer.containers {
// 0 separator
hash ^= 0
hash *= prime
switch c := c.(type) {
case *bitmapContainer:
bytes = uint64SliceAsByteSlice(c.bitmap)
case *arrayContainer:
bytes = uint16SliceAsByteSlice(c.content)
case *runContainer16:
bytes = interval16SliceAsByteSlice(c.iv)
default:
panic("invalid container type")
}
if len(bytes) == 0 {
panic("empty containers are not supported")
}
for _, b := range bytes {
hash ^= uint64(b)
hash *= prime
}
}
return hash
}
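The hashing loop in `Checksum` is the 64-bit FNV-1a scheme: XOR each byte into the state, then multiply by the FNV prime. A standalone sketch, using the same two constants and cross-checked against the standard library's `hash/fnv` (the helper names `fnv1a` and `stdlibFNV1a` are hypothetical):

```go
package main

import (
	"fmt"
	"hash/fnv"
)

const (
	offset64 = 14695981039346656037 // FNV-1a 64-bit offset basis
	prime64  = 1099511628211        // FNV 64-bit prime
)

// fnv1a folds data into an existing hash state, one byte at a time.
// Passing the returned value back in lets several byte slices be chained.
func fnv1a(hash uint64, data []byte) uint64 {
	for _, b := range data {
		hash ^= uint64(b)
		hash *= prime64
	}
	return hash
}

// stdlibFNV1a computes the same hash with hash/fnv, for comparison.
func stdlibFNV1a(data []byte) uint64 {
	h := fnv.New64a()
	h.Write(data) // Write on a hash.Hash never returns an error
	return h.Sum64()
}

func main() {
	data := []byte("roaring")
	fmt.Println(fnv1a(offset64, data) == stdlibFNV1a(data)) // true
}
```

Note that `Checksum` itself hashes the in-memory byte view of keys and containers, with a zero separator between containers, so its result would not match a plain FNV-1a of the serialized bitmap.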
// ReadFrom reads a serialized version of this bitmap from stream.
// The format is compatible with other RoaringBitmap
// implementations (Java, C) and is documented here:
@@ -218,6 +271,14 @@ type intIterator struct {
hs uint32
iter shortPeekable
highlowcontainer *roaringArray
// These embedded iterators per container type help reduce load on the GC.
// This way, instead of making up to 64k allocations per full iteration,
// we get a single allocation and simply reinitialize the appropriate
// iterator and point to it in the generic `iter` member on each key bound.
shortIter shortIterator
runIter runIterator16
bitmapIter bitmapContainerShortIterator
}
// HasNext returns true if there are more integers to iterate over
@@ -227,8 +288,19 @@ func (ii *intIterator) HasNext() bool {
func (ii *intIterator) init() {
if ii.highlowcontainer.size() > ii.pos {
ii.iter = ii.highlowcontainer.getContainerAtIndex(ii.pos).getShortIterator()
ii.hs = uint32(ii.highlowcontainer.getKeyAtIndex(ii.pos)) << 16
c := ii.highlowcontainer.getContainerAtIndex(ii.pos)
switch t := c.(type) {
case *arrayContainer:
ii.shortIter = shortIterator{t.content, 0}
ii.iter = &ii.shortIter
case *runContainer16:
ii.runIter = runIterator16{rc: t, curIndex: 0, curPosInIndex: 0}
ii.iter = &ii.runIter
case *bitmapContainer:
ii.bitmapIter = bitmapContainerShortIterator{t, t.NextSetBit(0)}
ii.iter = &ii.bitmapIter
}
}
}
@@ -249,14 +321,14 @@ func (ii *intIterator) PeekNext() uint32 {
// AdvanceIfNeeded advances as long as the next value is smaller than minval
func (ii *intIterator) AdvanceIfNeeded(minval uint32) {
to := minval >> 16
to := minval & 0xffff0000
for ii.HasNext() && (ii.hs>>16) < to {
for ii.HasNext() && ii.hs < to {
ii.pos++
ii.init()
}
if ii.HasNext() && (ii.hs>>16) == to {
if ii.HasNext() && ii.hs == to {
ii.iter.advanceIfNeeded(lowbits(minval))
if !ii.iter.hasNext() {
@@ -266,12 +338,17 @@ func (ii *intIterator) AdvanceIfNeeded(minval uint32) {
}
}
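The change in `AdvanceIfNeeded` replaces two shifts per comparison with a single mask computed once. The two forms agree as long as `hs` holds a container key shifted left by 16 bits, so its low 16 bits are zero. A small sketch that checks this (the function names are hypothetical):

```go
package main

import "fmt"

// keyBeforeShift is the earlier form: compare the high 16 bits of both sides.
func keyBeforeShift(hs, minval uint32) bool {
	return (hs >> 16) < (minval >> 16)
}

// keyBeforeMask is the newer form: mask minval once and compare directly.
// It assumes the low 16 bits of hs are zero.
func keyBeforeMask(hs, minval uint32) bool {
	return hs < (minval & 0xffff0000)
}

func main() {
	hs := uint32(1) << 16 // container key 1
	for _, minval := range []uint32{0x0ffff, 0x1abcd, 0x2abcd} {
		fmt.Println(keyBeforeShift(hs, minval), keyBeforeMask(hs, minval))
	}
}
```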
func newIntIterator(a *Bitmap) *intIterator {
p := new(intIterator)
// IntIterator is meant to allow you to iterate through the values of a bitmap, see Initialize(a *Bitmap)
type IntIterator = intIterator
// Initialize configures the existing iterator so that it can iterate through the values of
// the provided bitmap.
// The iteration results are undefined if the bitmap is modified (e.g., with Add or Remove).
func (p *intIterator) Initialize(a *Bitmap) {
p.pos = 0
p.highlowcontainer = &a.highlowcontainer
p.init()
return p
}
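Exposing `Initialize` lets a caller keep one iterator value and point it at different bitmaps without allocating a new iterator each time. The pattern can be sketched on a plain slice; `sliceIterator` below is a hypothetical stand-in, not the library's type.

```go
package main

import "fmt"

// sliceIterator is a hypothetical stand-in for a bitmap iterator.
type sliceIterator struct {
	values []uint32
	pos    int
}

// Initialize points an existing iterator at a new collection and rewinds it.
func (it *sliceIterator) Initialize(values []uint32) {
	it.values = values
	it.pos = 0
}

// HasNext reports whether more values remain.
func (it *sliceIterator) HasNext() bool { return it.pos < len(it.values) }

// Next returns the current value and advances the iterator.
func (it *sliceIterator) Next() uint32 {
	v := it.values[it.pos]
	it.pos++
	return v
}

// sum drains the iterator and adds up its values.
func sum(it *sliceIterator) (total uint64) {
	for it.HasNext() {
		total += uint64(it.Next())
	}
	return total
}

func main() {
	var it sliceIterator // zero value, reused below
	it.Initialize([]uint32{1, 2, 3})
	fmt.Println(sum(&it)) // 6
	it.Initialize([]uint32{10, 20})
	fmt.Println(sum(&it)) // 30
}
```

With the library, the analogous usage would be to declare an iterator value once and call `Initialize(bitmap)` on it for each bitmap, as `IntersectsWithInterval` does elsewhere in this diff.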
type intReverseIterator struct {
@@ -279,6 +356,10 @@ type intReverseIterator struct {
hs uint32
iter shortIterable
highlowcontainer *roaringArray
shortIter reverseIterator
runIter runReverseIterator16
bitmapIter reverseBitmapContainerShortIterator
}
// HasNext returns true if there are more integers to iterate over
@@ -288,8 +369,30 @@ func (ii *intReverseIterator) HasNext() bool {
func (ii *intReverseIterator) init() {
if ii.pos >= 0 {
ii.iter = ii.highlowcontainer.getContainerAtIndex(ii.pos).getReverseIterator()
ii.hs = uint32(ii.highlowcontainer.getKeyAtIndex(ii.pos)) << 16
c := ii.highlowcontainer.getContainerAtIndex(ii.pos)
switch t := c.(type) {
case *arrayContainer:
ii.shortIter = reverseIterator{t.content, len(t.content) - 1}
ii.iter = &ii.shortIter
case *runContainer16:
index := int(len(t.iv)) - 1
pos := uint16(0)
if index >= 0 {
pos = t.iv[index].length
}
ii.runIter = runReverseIterator16{rc: t, curIndex: index, curPosInIndex: pos}
ii.iter = &ii.runIter
case *bitmapContainer:
pos := -1
if t.cardinality > 0 {
pos = int(t.maximum())
}
ii.bitmapIter = reverseBitmapContainerShortIterator{t, pos}
ii.iter = &ii.bitmapIter
}
} else {
ii.iter = nil
}
@@ -305,12 +408,16 @@ func (ii *intReverseIterator) Next() uint32 {
return x
}
func newIntReverseIterator(a *Bitmap) *intReverseIterator {
p := new(intReverseIterator)
// IntReverseIterator is meant to allow you to iterate through the values of a bitmap, see Initialize(a *Bitmap)
type IntReverseIterator = intReverseIterator
// Initialize configures the existing iterator so that it can iterate through the values of
// the provided bitmap.
// The iteration results are undefined if the bitmap is modified (e.g., with Add or Remove).
func (p *intReverseIterator) Initialize(a *Bitmap) {
p.highlowcontainer = &a.highlowcontainer
p.pos = a.highlowcontainer.size() - 1
p.init()
return p
}
// ManyIntIterable allows you to iterate over the values in a Bitmap
@@ -326,12 +433,27 @@ type manyIntIterator struct {
hs uint32
iter manyIterable
highlowcontainer *roaringArray
shortIter shortIterator
runIter runIterator16
bitmapIter bitmapContainerManyIterator
}
func (ii *manyIntIterator) init() {
if ii.highlowcontainer.size() > ii.pos {
ii.iter = ii.highlowcontainer.getContainerAtIndex(ii.pos).getManyIterator()
ii.hs = uint32(ii.highlowcontainer.getKeyAtIndex(ii.pos)) << 16
c := ii.highlowcontainer.getContainerAtIndex(ii.pos)
switch t := c.(type) {
case *arrayContainer:
ii.shortIter = shortIterator{t.content, 0}
ii.iter = &ii.shortIter
case *runContainer16:
ii.runIter = runIterator16{rc: t, curIndex: 0, curPosInIndex: 0}
ii.iter = &ii.runIter
case *bitmapContainer:
ii.bitmapIter = bitmapContainerManyIterator{t, -1, 0}
ii.iter = &ii.bitmapIter
}
} else {
ii.iter = nil
}
@@ -373,12 +495,17 @@ func (ii *manyIntIterator) NextMany64(hs64 uint64, buf []uint64) int {
return n
}
func newManyIntIterator(a *Bitmap) *manyIntIterator {
p := new(manyIntIterator)
// ManyIntIterator is meant to allow you to iterate through the values of a bitmap, see Initialize(a *Bitmap)
type ManyIntIterator = manyIntIterator
// Initialize configures the existing iterator so that it can iterate through the values of
// the provided bitmap.
// The iteration results are undefined if the bitmap is modified (e.g., with Add or Remove).
func (p *manyIntIterator) Initialize(a *Bitmap) {
p.pos = 0
p.highlowcontainer = &a.highlowcontainer
p.init()
return p
}
// String creates a string representation of the Bitmap
@@ -410,7 +537,7 @@ func (rb *Bitmap) String() string {
// Iterate iterates over the bitmap, calling the given callback with each value in the bitmap. If the callback returns
// false, the iteration is halted.
// The iteration results are undefined if the bitmap is modified (e.g., with Add or Remove).
// There is no guarantee as to what order the values will be iterated
// There is no guarantee as to what order the values will be iterated.
func (rb *Bitmap) Iterate(cb func(x uint32) bool) {
for i := 0; i < rb.highlowcontainer.size(); i++ {
hs := uint32(rb.highlowcontainer.getKeyAtIndex(i)) << 16
@@ -442,19 +569,25 @@ func (rb *Bitmap) Iterate(cb func(x uint32) bool) {
// Iterator creates a new IntPeekable to iterate over the integers contained in the bitmap, in sorted order;
// the iterator becomes invalid if the bitmap is modified (e.g., with Add or Remove).
func (rb *Bitmap) Iterator() IntPeekable {
return newIntIterator(rb)
p := new(intIterator)
p.Initialize(rb)
return p
}
// ReverseIterator creates a new IntIterable to iterate over the integers contained in the bitmap, in descending order;
// the iterator becomes invalid if the bitmap is modified (e.g., with Add or Remove).
func (rb *Bitmap) ReverseIterator() IntIterable {
return newIntReverseIterator(rb)
p := new(intReverseIterator)
p.Initialize(rb)
return p
}
// ManyIterator creates a new ManyIntIterable to iterate over the integers contained in the bitmap, in sorted order;
// the iterator becomes invalid if the bitmap is modified (e.g., with Add or Remove).
func (rb *Bitmap) ManyIterator() ManyIntIterable {
return newManyIntIterator(rb)
p := new(manyIntIterator)
p.Initialize(rb)
return p
}
// Clone creates a copy of the Bitmap
@@ -466,11 +599,17 @@ func (rb *Bitmap) Clone() *Bitmap {
// Minimum returns the smallest value stored in this roaring bitmap; it panics if the bitmap is empty
func (rb *Bitmap) Minimum() uint32 {
if len(rb.highlowcontainer.containers) == 0 {
panic("Empty bitmap")
}
return uint32(rb.highlowcontainer.containers[0].minimum()) | (uint32(rb.highlowcontainer.keys[0]) << 16)
}
// Maximum returns the largest value stored in this roaring bitmap; it panics if the bitmap is empty
func (rb *Bitmap) Maximum() uint32 {
if len(rb.highlowcontainer.containers) == 0 {
panic("Empty bitmap")
}
lastindex := len(rb.highlowcontainer.containers) - 1
return uint32(rb.highlowcontainer.containers[lastindex].maximum()) | (uint32(rb.highlowcontainer.keys[lastindex]) << 16)
}
@@ -514,34 +653,38 @@ func AddOffset64(x *Bitmap, offset int64) (answer *Bitmap) {
containerOffset64 = offset >> 16
}
if containerOffset64 >= (1<<16) || containerOffset64 <= -(1<<16) {
return New()
answer = New()
if containerOffset64 >= (1<<16) || containerOffset64 < -(1<<16) {
return answer
}
containerOffset := int32(containerOffset64)
inOffset := (uint16)(offset - containerOffset64*(1<<16))
if inOffset == 0 {
answer = x.Clone()
for pos := 0; pos < answer.highlowcontainer.size(); pos++ {
key := int32(answer.highlowcontainer.getKeyAtIndex(pos))
key += containerOffset
if key >= 0 && key <= MaxUint16 {
answer.highlowcontainer.keys[pos] = uint16(key)
}
}
} else {
answer = New()
for pos := 0; pos < x.highlowcontainer.size(); pos++ {
key := int32(x.highlowcontainer.getKeyAtIndex(pos))
key += containerOffset
c := x.highlowcontainer.getContainerAtIndex(pos)
offsetted := c.addOffset(inOffset)
if key >= 0 && key <= MaxUint16 {
c := x.highlowcontainer.getContainerAtIndex(pos).clone()
answer.highlowcontainer.appendContainer(uint16(key), c, false)
}
}
} else {
for pos := 0; pos < x.highlowcontainer.size(); pos++ {
key := int32(x.highlowcontainer.getKeyAtIndex(pos))
key += containerOffset
if !offsetted[0].isEmpty() && (key >= 0 && key <= MaxUint16) {
if key+1 < 0 || key > MaxUint16 {
continue
}
c := x.highlowcontainer.getContainerAtIndex(pos)
lo, hi := c.addOffset(inOffset)
if lo != nil && key >= 0 {
curSize := answer.highlowcontainer.size()
lastkey := int32(0)
@@ -551,15 +694,15 @@ func AddOffset64(x *Bitmap, offset int64) (answer *Bitmap) {
if curSize > 0 && lastkey == key {
prev := answer.highlowcontainer.getContainerAtIndex(curSize - 1)
orrseult := prev.ior(offsetted[0])
answer.highlowcontainer.setContainerAtIndex(curSize-1, orrseult)
orresult := prev.ior(lo)
answer.highlowcontainer.setContainerAtIndex(curSize-1, orresult)
} else {
answer.highlowcontainer.appendContainer(uint16(key), offsetted[0], false)
answer.highlowcontainer.appendContainer(uint16(key), lo, false)
}
}
if !offsetted[1].isEmpty() && ((key+1) >= 0 && (key+1) <= MaxUint16) {
answer.highlowcontainer.appendContainer(uint16(key+1), offsetted[1], false)
if hi != nil && key+1 <= MaxUint16 {
answer.highlowcontainer.appendContainer(uint16(key+1), hi, false)
}
}
}
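`AddOffset64` works on two pieces of the offset: a whole number of 65536-value containers and a remainder applied inside each container. The sketch below shows one way to compute that decomposition. It is an assumption for illustration (`splitOffset` is a hypothetical helper), and it relies on Go's signed right shift rounding toward negative infinity.

```go
package main

import "fmt"

// splitOffset (hypothetical) decomposes offset so that
// offset == containerOffset*65536 + int64(inOffset), with inOffset in [0, 65535].
func splitOffset(offset int64) (containerOffset int64, inOffset uint16) {
	containerOffset = offset >> 16 // arithmetic shift: floors negative values
	inOffset = uint16(offset - containerOffset*(1<<16))
	return containerOffset, inOffset
}

func main() {
	for _, off := range []int64{65537, 70000, -1, -65536} {
		c, in := splitOffset(off)
		fmt.Println(off, "=", c, "* 65536 +", in)
	}
}
```

When the remainder is zero, containers only need new keys. Otherwise each container is split into a low and a high part, which may land under keys `key` and `key+1`; that is why the loop in the diff skips a container only when both of those keys fall outside the valid range.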
@@ -693,10 +836,6 @@ func (rb *Bitmap) Rank(x uint32) uint64 {
// the smallest element. Note that this function differs in convention from
// the Rank function which returns 1 on the smallest value.
func (rb *Bitmap) Select(x uint32) (uint32, error) {
if rb.GetCardinality() <= uint64(x) {
return 0, fmt.Errorf("can't find %dth integer in a bitmap with only %d items", x, rb.GetCardinality())
}
remaining := x
for i := 0; i < rb.highlowcontainer.size(); i++ {
c := rb.highlowcontainer.getContainerAtIndex(i)
@@ -860,6 +999,28 @@ main:
return answer
}
// IntersectsWithInterval checks whether a bitmap 'rb' and a half-open interval '[x,y)' intersect.
func (rb *Bitmap) IntersectsWithInterval(x, y uint64) bool {
if x >= y {
return false
}
if x > MaxUint32 {
return false
}
it := intIterator{}
it.Initialize(rb)
it.AdvanceIfNeeded(uint32(x))
if !it.HasNext() {
return false
}
if uint64(it.Next()) >= y {
return false
}
return true
}
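The logic of `IntersectsWithInterval` is: jump to the first value not smaller than `x`, then check that it is below `y`. The same idea on a sorted slice, as a sketch (the lowercase `intersectsWithInterval` here is a hypothetical helper that uses binary search in place of the bitmap iterator):

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// intersectsWithInterval reports whether a sorted slice has a value in [x, y).
func intersectsWithInterval(sorted []uint32, x, y uint64) bool {
	if x >= y || x > math.MaxUint32 {
		return false // empty interval, or interval entirely above the uint32 range
	}
	// Index of the first value >= x, or len(sorted) if there is none.
	i := sort.Search(len(sorted), func(i int) bool { return uint64(sorted[i]) >= x })
	return i < len(sorted) && uint64(sorted[i]) < y
}

func main() {
	values := []uint32{5, 10, 20}
	fmt.Println(intersectsWithInterval(values, 6, 10)) // false: 10 is excluded
	fmt.Println(intersectsWithInterval(values, 6, 11)) // true: 10 is included
}
```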
// Intersects checks whether two bitmaps intersect; the bitmaps are not modified
func (rb *Bitmap) Intersects(x2 *Bitmap) bool {
pos1 := 0
@@ -1552,27 +1713,3 @@ func (rb *Bitmap) Stats() Statistics {
}
return stats
}
func (rb *Bitmap) checkValidity() bool {
for _, c := range rb.highlowcontainer.containers {
switch c.(type) {
case *arrayContainer:
if c.getCardinality() > arrayDefaultMaxSize {
fmt.Println("Array containers are limited to size ", arrayDefaultMaxSize)
return false
}
case *bitmapContainer:
if c.getCardinality() <= arrayDefaultMaxSize {
fmt.Println("Bitmaps would be more concise as an array!")
return false
}
case *runContainer16:
if c.getSizeInBytes() > minOfInt(bitmapContainerSizeInBytes(), arrayContainerSizeInBytes(c.getCardinality())) {
fmt.Println("Inefficient run container!")
return false
}
}
}
return true
}
+8 -5
@@ -4,12 +4,15 @@ import (
"bytes"
"encoding/binary"
"fmt"
"io"
"github.com/RoaringBitmap/roaring/internal"
"io"
)
type container interface {
addOffset(uint16) []container
// addOffset returns the (low, high) parts of the shifted container.
// Whenever one of them would be empty, nil will be returned instead to
// avoid unnecessary allocations.
addOffset(uint16) (container, container)
clone() container
and(container) container
@@ -551,9 +554,9 @@ func (ra *roaringArray) toBytes() ([]byte, error) {
}
func (ra *roaringArray) readFrom(stream internal.ByteInput, cookieHeader ...byte) (int64, error) {
var cookie uint32
var cookie uint32
var err error
if len(cookieHeader) > 0 && len(cookieHeader) != 4 {
if len(cookieHeader) > 0 && len(cookieHeader) != 4 {
return int64(len(cookieHeader)), fmt.Errorf("error in roaringArray.readFrom: could not read initial cookie: incorrect size of cookie header")
}
if len(cookieHeader) == 4 {
@@ -645,7 +648,7 @@ func (ra *roaringArray) readFrom(stream internal.ByteInput, cookieHeader ...byte
}
nb := runContainer16{
iv: byteSliceAsInterval16Slice(buf),
iv: byteSliceAsInterval16Slice(buf),
}
ra.containers[i] = &nb
+32 -5
@@ -2281,7 +2281,7 @@ func runArrayUnionToRuns(rc *runContainer16, ac *arrayContainer) ([]interval16,
pos2++
}
}
cardMinusOne += previousInterval.length + 1
cardMinusOne += previousInterval.length
target = append(target, previousInterval)
return target, cardMinusOne
@@ -2582,9 +2582,27 @@ func (rc *runContainer16) serializedSizeInBytes() int {
return 2 + len(rc.iv)*4
}
func (rc *runContainer16) addOffset(x uint16) []container {
low := newRunContainer16()
high := newRunContainer16()
func (rc *runContainer16) addOffset(x uint16) (container, container) {
var low, high *runContainer16
if len(rc.iv) == 0 {
return nil, nil
}
first := uint32(rc.iv[0].start) + uint32(x)
if highbits(first) == 0 {
// Some elements will fall into low part, allocate a container.
// Checking the first one is enough because they are ordered.
low = newRunContainer16()
}
last := uint32(rc.iv[len(rc.iv)-1].start)
last += uint32(rc.iv[len(rc.iv)-1].length)
last += uint32(x)
if highbits(last) > 0 {
// Some elements will fall into high part, allocate a container.
// Checking the last one is enough because they are ordered.
high = newRunContainer16()
}
for _, iv := range rc.iv {
val := int(iv.start) + int(x)
@@ -2600,5 +2618,14 @@ func (rc *runContainer16) addOffset(x uint16) []container {
high.iv = append(high.iv, interval16{uint16(val & 0xffff), iv.length})
}
}
return []container{low, high}
// Ensure proper nil interface.
if low == nil {
return nil, high
}
if high == nil {
return low, nil
}
return low, high
}
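For run containers, a single run may straddle the 16-bit boundary after shifting and must then be cut in two. A sketch for one run, assuming a run is stored as a start plus a length equal to the number of values minus one (the `run` type and `shiftRun` helper are hypothetical):

```go
package main

import "fmt"

// run covers the values start, start+1, ..., start+length (assumed layout).
type run struct{ start, length uint16 }

// shiftRun adds x to a run and returns the part that stays below 65536 (low)
// and the part that overflows (high, rebased to start at 0). A missing part is nil.
func shiftRun(r run, x uint16) (low, high *run) {
	first := int(r.start) + int(x)
	last := first + int(r.length)
	switch {
	case last <= 0xffff: // entirely in the low part
		low = &run{uint16(first), r.length}
	case first > 0xffff: // entirely in the high part
		high = &run{uint16(first & 0xffff), r.length}
	default: // straddles the boundary: cut at 65535 / 65536
		low = &run{uint16(first), uint16(0xffff - first)}
		high = &run{0, uint16(last & 0xffff)}
	}
	return low, high
}

func main() {
	low, high := shiftRun(run{65000, 100}, 500)
	fmt.Println(*low, *high) // {65500 35} {0 64}
}
```

In the example, 101 values starting at 65500 become 36 values in the low part (65500 to 65535) and 65 values in the high part (0 to 64 after rebasing).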
+12
@@ -1,3 +1,4 @@
//go:build (!amd64 && !386 && !arm && !arm64 && !ppc64le && !mipsle && !mips64le && !mips64p32le && !wasm) || appengine
// +build !amd64,!386,!arm,!arm64,!ppc64le,!mipsle,!mips64le,!mips64p32le,!wasm appengine
package roaring
@@ -84,6 +85,17 @@ func uint16SliceAsByteSlice(slice []uint16) []byte {
return by
}
func interval16SliceAsByteSlice(slice []interval16) []byte {
by := make([]byte, len(slice)*4)
for i, v := range slice {
binary.LittleEndian.PutUint16(by[i*4:], v.start)
binary.LittleEndian.PutUint16(by[i*4+2:], v.length)
}
return by
}
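Each interval holds two `uint16` fields, so a portable encoding uses four bytes per interval: two for the start, two for the length. A self-contained sketch of that layout, with a hypothetical `interval` type and `encodeIntervals` helper:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// interval is a stand-in for a (start, length) pair of 16-bit values.
type interval struct{ start, length uint16 }

// encodeIntervals writes each interval as 4 little-endian bytes:
// bytes [4i, 4i+2) hold start, bytes [4i+2, 4i+4) hold length.
func encodeIntervals(ivs []interval) []byte {
	out := make([]byte, len(ivs)*4)
	for i, iv := range ivs {
		binary.LittleEndian.PutUint16(out[i*4:], iv.start)
		binary.LittleEndian.PutUint16(out[i*4+2:], iv.length)
	}
	return out
}

func main() {
	fmt.Println(encodeIntervals([]interval{{1, 2}, {0x0304, 5}}))
	// [1 0 2 0 4 3 5 0]
}
```

The stride matters: with a stride of two bytes, the start of interval `i+1` would overwrite the length of interval `i`, and the second half of the buffer would stay zero.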
func byteSliceAsUint16Slice(slice []byte) []uint16 {
if len(slice)%2 != 0 {
panic("Slice size should be divisible by 2")
+268 -33
@@ -1,3 +1,4 @@
//go:build (386 && !appengine) || (amd64 && !appengine) || (arm && !appengine) || (arm64 && !appengine) || (ppc64le && !appengine) || (mipsle && !appengine) || (mips64le && !appengine) || (mips64p32le && !appengine) || (wasm && !appengine)
// +build 386,!appengine amd64,!appengine arm,!appengine arm64,!appengine ppc64le,!appengine mipsle,!appengine mips64le,!appengine mips64p32le,!appengine wasm,!appengine
package roaring
@@ -56,6 +57,22 @@ func uint16SliceAsByteSlice(slice []uint16) []byte {
return result
}
func interval16SliceAsByteSlice(slice []interval16) []byte {
// make a new slice header
header := *(*reflect.SliceHeader)(unsafe.Pointer(&slice))
// update its capacity and length
header.Len *= 4
header.Cap *= 4
// instantiate result and use KeepAlive so data isn't unmapped.
result := *(*[]byte)(unsafe.Pointer(&header))
runtime.KeepAlive(&slice)
// return it
return result
}
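The zero-copy variant reinterprets the backing array of the interval slice as bytes. On Go 1.17 or newer, the same reinterpretation can be written with `unsafe.Slice` instead of editing a `reflect.SliceHeader` by hand. The following is a sketch with a hypothetical `interval` type and `asBytes` helper, not a drop-in replacement for the vendored function:

```go
package main

import (
	"fmt"
	"unsafe"
)

// interval is a stand-in for a (start, length) pair of 16-bit values.
type interval struct{ start, length uint16 }

// asBytes returns a byte view over the same memory as ivs, without copying.
// The view aliases ivs: writes through one are visible through the other,
// and the byte order is whatever the host machine uses.
func asBytes(ivs []interval) []byte {
	if len(ivs) == 0 {
		return nil
	}
	size := int(unsafe.Sizeof(ivs[0])) // 4 bytes: two uint16 fields
	return unsafe.Slice((*byte)(unsafe.Pointer(&ivs[0])), len(ivs)*size)
}

func main() {
	ivs := []interval{{1, 2}, {3, 4}}
	b := asBytes(ivs)
	fmt.Println(len(b)) // 8

	ivs[0].start = 0x0101 // both bytes are 1, regardless of endianness
	fmt.Println(b[0], b[1]) // 1 1
}
```

Because the view exposes the machine's native byte order, it matches a little-endian serialization only on little-endian hosts, which is why the build constraints select this file for a fixed list of architectures.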
func (bc *bitmapContainer) asLittleEndianByteSlice() []byte {
return uint64SliceAsByteSlice(bc.bitmap)
}
@@ -134,7 +151,124 @@ func byteSliceAsInterval16Slice(slice []byte) (result []interval16) {
return
}
// FromBuffer creates a bitmap from its serialized version stored in buffer.
func byteSliceAsContainerSlice(slice []byte) (result []container) {
var c container
containerSize := int(unsafe.Sizeof(c))
if len(slice)%containerSize != 0 {
panic("Slice size should be divisible by unsafe.Sizeof(container)")
}
// reference: https://go101.org/article/unsafe.html
// make a new slice header
bHeader := (*reflect.SliceHeader)(unsafe.Pointer(&slice))
rHeader := (*reflect.SliceHeader)(unsafe.Pointer(&result))
// transfer the data from the given slice to a new variable (our result)
rHeader.Data = bHeader.Data
rHeader.Len = bHeader.Len / containerSize
rHeader.Cap = bHeader.Cap / containerSize
// instantiate result and use KeepAlive so data isn't unmapped.
runtime.KeepAlive(&slice) // still crucial: otherwise the GC can free it
// return result
return
}
func byteSliceAsBitsetSlice(slice []byte) (result []bitmapContainer) {
bitsetSize := int(unsafe.Sizeof(bitmapContainer{}))
if len(slice)%bitsetSize != 0 {
panic("Slice size should be divisible by unsafe.Sizeof(bitmapContainer)")
}
// reference: https://go101.org/article/unsafe.html
// make a new slice header
bHeader := (*reflect.SliceHeader)(unsafe.Pointer(&slice))
rHeader := (*reflect.SliceHeader)(unsafe.Pointer(&result))
// transfer the data from the given slice to a new variable (our result)
rHeader.Data = bHeader.Data
rHeader.Len = bHeader.Len / bitsetSize
rHeader.Cap = bHeader.Cap / bitsetSize
// instantiate result and use KeepAlive so data isn't unmapped.
runtime.KeepAlive(&slice) // still crucial: otherwise the GC can free it
// return result
return
}
func byteSliceAsArraySlice(slice []byte) (result []arrayContainer) {
arraySize := int(unsafe.Sizeof(arrayContainer{}))
if len(slice)%arraySize != 0 {
panic("Slice size should be divisible by unsafe.Sizeof(arrayContainer)")
}
// reference: https://go101.org/article/unsafe.html
// make a new slice header
bHeader := (*reflect.SliceHeader)(unsafe.Pointer(&slice))
rHeader := (*reflect.SliceHeader)(unsafe.Pointer(&result))
// transfer the data from the given slice to a new variable (our result)
rHeader.Data = bHeader.Data
rHeader.Len = bHeader.Len / arraySize
rHeader.Cap = bHeader.Cap / arraySize
// instantiate result and use KeepAlive so data isn't unmapped.
runtime.KeepAlive(&slice) // still crucial: otherwise the GC can free it
// return result
return
}
func byteSliceAsRun16Slice(slice []byte) (result []runContainer16) {
run16Size := int(unsafe.Sizeof(runContainer16{}))
if len(slice)%run16Size != 0 {
panic("Slice size should be divisible by unsafe.Sizeof(runContainer16)")
}
// reference: https://go101.org/article/unsafe.html
// make a new slice header
bHeader := (*reflect.SliceHeader)(unsafe.Pointer(&slice))
rHeader := (*reflect.SliceHeader)(unsafe.Pointer(&result))
// transfer the data from the given slice to a new variable (our result)
rHeader.Data = bHeader.Data
rHeader.Len = bHeader.Len / run16Size
rHeader.Cap = bHeader.Cap / run16Size
// instantiate result and use KeepAlive so data isn't unmapped.
runtime.KeepAlive(&slice) // still crucial: without it the GC could free the data
// return result
return
}
func byteSliceAsBoolSlice(slice []byte) (result []bool) {
boolSize := int(unsafe.Sizeof(true))
if len(slice)%boolSize != 0 {
panic("Slice size should be divisible by unsafe.Sizeof(bool)")
}
// reference: https://go101.org/article/unsafe.html
// make a new slice header
bHeader := (*reflect.SliceHeader)(unsafe.Pointer(&slice))
rHeader := (*reflect.SliceHeader)(unsafe.Pointer(&result))
// transfer the data from the given slice to a new variable (our result)
rHeader.Data = bHeader.Data
rHeader.Len = bHeader.Len / boolSize
rHeader.Cap = bHeader.Cap / boolSize
// instantiate result and use KeepAlive so data isn't unmapped.
runtime.KeepAlive(&slice) // still crucial: without it the GC could free the data
// return result
return
}
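The byteSliceAs*Slice helpers above all follow one pattern: check that the byte length is a multiple of the element size, then reinterpret the same memory as a slice of that element type. As a rough illustration only, the same idea can be sketched with unsafe.Slice (available since Go 1.17) instead of reflect.SliceHeader. The name bytesAsUint16s is hypothetical and is not part of the vendored package; the sketch also assumes the input buffer is suitably aligned for uint16.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesAsUint16s is a hypothetical helper (not from the roaring package).
// It reinterprets b as a []uint16 that shares b's backing memory.
// Assumption: b is aligned for uint16, which holds for slices from make.
func bytesAsUint16s(b []byte) []uint16 {
	const size = int(unsafe.Sizeof(uint16(0)))
	if len(b)%size != 0 {
		panic("slice size should be divisible by unsafe.Sizeof(uint16)")
	}
	if len(b) == 0 {
		return nil
	}
	// unsafe.Slice keeps a real pointer throughout, so no uintptr round
	// trip (and no runtime.KeepAlive) is needed in this variant.
	return unsafe.Slice((*uint16)(unsafe.Pointer(&b[0])), len(b)/size)
}

func main() {
	buf := make([]byte, 8)
	view := bytesAsUint16s(buf)
	view[1] = 0xFFFF // writes through to buf[2] and buf[3]
	fmt.Println(len(view), buf[2], buf[3]) // prints "4 255 255"
}
```

Because the view aliases the input, writes through either slice are visible through the other, which is the property the frozen-view code relies on.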
// FrozenView creates a static view of a serialized bitmap stored in buf.
// It uses CRoaring's frozen bitmap format.
//
// The format specification is available here:
@@ -198,13 +332,13 @@ func (rb *Bitmap) FrozenView(buf []byte) error {
const FROZEN_COOKIE = 13766
var (
FrozenBitmapInvalidCookie = errors.New("header does not contain the FROZEN_COOKIE")
FrozenBitmapBigEndian = errors.New("loading big endian frozen bitmaps is not supported")
FrozenBitmapIncomplete = errors.New("input buffer too small to contain a frozen bitmap")
FrozenBitmapOverpopulated = errors.New("too many containers")
FrozenBitmapUnexpectedData = errors.New("spurious data in input")
FrozenBitmapInvalidCookie = errors.New("header does not contain the FROZEN_COOKIE")
FrozenBitmapBigEndian = errors.New("loading big endian frozen bitmaps is not supported")
FrozenBitmapIncomplete = errors.New("input buffer too small to contain a frozen bitmap")
FrozenBitmapOverpopulated = errors.New("too many containers")
FrozenBitmapUnexpectedData = errors.New("spurious data in input")
FrozenBitmapInvalidTypecode = errors.New("unrecognized typecode")
FrozenBitmapBufferTooSmall = errors.New("buffer too small")
FrozenBitmapBufferTooSmall = errors.New("buffer too small")
)
func (ra *roaringArray) frozenView(buf []byte) error {
@@ -213,14 +347,14 @@ func (ra *roaringArray) frozenView(buf []byte) error {
}
headerBE := binary.BigEndian.Uint32(buf[len(buf)-4:])
if headerBE & 0x7fff == FROZEN_COOKIE {
if headerBE&0x7fff == FROZEN_COOKIE {
return FrozenBitmapBigEndian
}
header := binary.LittleEndian.Uint32(buf[len(buf)-4:])
buf = buf[:len(buf)-4]
if header & 0x7fff != FROZEN_COOKIE {
if header&0x7fff != FROZEN_COOKIE {
return FrozenBitmapInvalidCookie
}
@@ -243,29 +377,29 @@ func (ra *roaringArray) frozenView(buf []byte) error {
keys := byteSliceAsUint16Slice(buf[len(buf)-2*nCont:])
buf = buf[:len(buf)-2*nCont]
nBitmap, nArray, nRun := uint64(0), uint64(0), uint64(0)
nArrayEl, nRunEl := uint64(0), uint64(0)
nBitmap, nArray, nRun := 0, 0, 0
nArrayEl, nRunEl := 0, 0
for i, t := range types {
switch (t) {
switch t {
case 1:
nBitmap++
case 2:
nArray++
nArrayEl += uint64(counts[i])+1
nArrayEl += int(counts[i]) + 1
case 3:
nRun++
nRunEl += uint64(counts[i])
nRunEl += int(counts[i])
default:
return FrozenBitmapInvalidTypecode
}
}
if uint64(len(buf)) < (1 << 13)*nBitmap + 4*nRunEl + 2*nArrayEl {
if len(buf) < (1<<13)*nBitmap+4*nRunEl+2*nArrayEl {
return FrozenBitmapIncomplete
}
bitsetsArena := byteSliceAsUint64Slice(buf[:(1 << 13)*nBitmap])
buf = buf[(1 << 13)*nBitmap:]
bitsetsArena := byteSliceAsUint64Slice(buf[:(1<<13)*nBitmap])
buf = buf[(1<<13)*nBitmap:]
runsArena := byteSliceAsInterval16Slice(buf[:4*nRunEl])
buf = buf[4*nRunEl:]
@@ -277,27 +411,44 @@ func (ra *roaringArray) frozenView(buf []byte) error {
return FrozenBitmapUnexpectedData
}
// TODO: maybe arena_alloc all this.
containers := make([]container, nCont)
bitsets := make([]bitmapContainer, nBitmap)
arrays := make([]arrayContainer, nArray)
runs := make([]runContainer16, nRun)
needCOW := make([]bool, nCont)
var c container
containersSz := int(unsafe.Sizeof(c))*nCont
bitsetsSz := int(unsafe.Sizeof(bitmapContainer{}))*nBitmap
arraysSz := int(unsafe.Sizeof(arrayContainer{}))*nArray
runsSz := int(unsafe.Sizeof(runContainer16{}))*nRun
needCOWSz := int(unsafe.Sizeof(true))*nCont
iBitset, iArray, iRun := uint64(0), uint64(0), uint64(0)
bitmapArenaSz := containersSz + bitsetsSz + arraysSz + runsSz + needCOWSz
bitmapArena := make([]byte, bitmapArenaSz)
containers := byteSliceAsContainerSlice(bitmapArena[:containersSz])
bitmapArena = bitmapArena[containersSz:]
bitsets := byteSliceAsBitsetSlice(bitmapArena[:bitsetsSz])
bitmapArena = bitmapArena[bitsetsSz:]
arrays := byteSliceAsArraySlice(bitmapArena[:arraysSz])
bitmapArena = bitmapArena[arraysSz:]
runs := byteSliceAsRun16Slice(bitmapArena[:runsSz])
bitmapArena = bitmapArena[runsSz:]
needCOW := byteSliceAsBoolSlice(bitmapArena)
iBitset, iArray, iRun := 0, 0, 0
for i, t := range types {
needCOW[i] = true
switch (t) {
switch t {
case 1:
containers[i] = &bitsets[iBitset]
bitsets[iBitset].cardinality = int(counts[i])+1
bitsets[iBitset].cardinality = int(counts[i]) + 1
bitsets[iBitset].bitmap = bitsetsArena[:1024]
bitsetsArena = bitsetsArena[1024:]
iBitset++
case 2:
containers[i] = &arrays[iArray]
sz := int(counts[i])+1
sz := int(counts[i]) + 1
arrays[iArray].content = arraysArena[:sz]
arraysArena = arraysArena[sz:]
iArray++
@@ -363,13 +514,13 @@ func (bm *Bitmap) FreezeTo(buf []byte) (int, error) {
}
}
serialSize := 4 + 5*nCont + (1 << 13)*nBits + 4*nRunEl + 2*nArrayEl
serialSize := 4 + 5*nCont + (1<<13)*nBits + 4*nRunEl + 2*nArrayEl
if len(buf) < serialSize {
return 0, FrozenBitmapBufferTooSmall
}
bitsArena := byteSliceAsUint64Slice(buf[:(1 << 13)*nBits])
buf = buf[(1 << 13)*nBits:]
bitsArena := byteSliceAsUint64Slice(buf[:(1<<13)*nBits])
buf = buf[(1<<13)*nBits:]
runsArena := byteSliceAsInterval16Slice(buf[:4*nRunEl])
buf = buf[4*nRunEl:]
@@ -386,7 +537,7 @@ func (bm *Bitmap) FreezeTo(buf []byte) (int, error) {
types := buf[:nCont]
buf = buf[nCont:]
header := uint32(FROZEN_COOKIE|(nCont << 15))
header := uint32(FROZEN_COOKIE | (nCont << 15))
binary.LittleEndian.PutUint32(buf[:4], header)
copy(keys, bm.highlowcontainer.keys[:])
@@ -396,13 +547,13 @@ func (bm *Bitmap) FreezeTo(buf []byte) (int, error) {
case *bitmapContainer:
copy(bitsArena, v.bitmap)
bitsArena = bitsArena[1024:]
counts[i] = uint16(v.cardinality-1)
counts[i] = uint16(v.cardinality - 1)
types[i] = 1
case *arrayContainer:
copy(arraysArena, v.content)
arraysArena = arraysArena[len(v.content):]
elems := len(v.content)
counts[i] = uint16(elems-1)
counts[i] = uint16(elems - 1)
types[i] = 2
case *runContainer16:
copy(runsArena, v.iv)
@@ -415,3 +566,87 @@ func (bm *Bitmap) FreezeTo(buf []byte) (int, error) {
return serialSize, nil
}
func (bm *Bitmap) WriteFrozenTo(wr io.Writer) (int, error) {
// FIXME: this is a naive version that iterates 4 times through the
// containers and allocates 3*len(containers) bytes; it's quite likely
// it can be done more efficiently.
containers := bm.highlowcontainer.containers
written := 0
for _, c := range containers {
c, ok := c.(*bitmapContainer)
if !ok {
continue
}
n, err := wr.Write(uint64SliceAsByteSlice(c.bitmap))
written += n
if err != nil {
return written, err
}
}
for _, c := range containers {
c, ok := c.(*runContainer16)
if !ok {
continue
}
n, err := wr.Write(interval16SliceAsByteSlice(c.iv))
written += n
if err != nil {
return written, err
}
}
for _, c := range containers {
c, ok := c.(*arrayContainer)
if !ok {
continue
}
n, err := wr.Write(uint16SliceAsByteSlice(c.content))
written += n
if err != nil {
return written, err
}
}
n, err := wr.Write(uint16SliceAsByteSlice(bm.highlowcontainer.keys))
written += n
if err != nil {
return written, err
}
countTypeBuf := make([]byte, 3*len(containers))
counts := byteSliceAsUint16Slice(countTypeBuf[:2*len(containers)])
types := countTypeBuf[2*len(containers):]
for i, c := range containers {
switch c := c.(type) {
case *bitmapContainer:
counts[i] = uint16(c.cardinality - 1)
types[i] = 1
case *arrayContainer:
elems := len(c.content)
counts[i] = uint16(elems - 1)
types[i] = 2
case *runContainer16:
runs := len(c.iv)
counts[i] = uint16(runs)
types[i] = 3
}
}
n, err = wr.Write(countTypeBuf)
written += n
if err != nil {
return written, err
}
header := uint32(FROZEN_COOKIE | (len(containers) << 15))
if err := binary.Write(wr, binary.LittleEndian, header); err != nil {
return written, err
}
written += 4
return written, nil
}
+1
@@ -1,3 +1,4 @@
//go:build gofuzz
// +build gofuzz
package roaring
+1
@@ -1,3 +1,4 @@
//go:build arm64 && !gccgo && !appengine
// +build arm64,!gccgo,!appengine
package roaring
+1
@@ -1,3 +1,4 @@
//go:build !arm64 || gccgo || appengine
// +build !arm64 gccgo appengine
package roaring
+2 -1
@@ -1,3 +1,4 @@
//go:build gofuzz
// +build gofuzz
/*
@@ -62,8 +63,8 @@ import (
"fmt"
"sort"
"github.com/mschoch/smat"
"github.com/bits-and-blooms/bitset"
"github.com/mschoch/smat"
)
// fuzz test using state machine driven by byte stream.