From c7dc8a597b61ea4e81fc6c61ae7160f405e535d7 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?J=C3=B6rn=20Friedrich=20Dreyer?=
Date: Fri, 16 Feb 2024 16:20:13 +0100
Subject: [PATCH] [docs-only] ADR-0025 Distributed Search Index (#8425)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* ADR-0025 Distributed Search Index

Signed-off-by: Jörn Friedrich Dreyer

* mark the old as accepted and the new as draft

---------

Signed-off-by: Jörn Friedrich Dreyer
---
 docs/ocis/adr/0019-file-search-index.md       |  2 +-
 .../ocis/adr/0025-distributed-search-index.md | 76 +++++++++++++++++++
 2 files changed, 77 insertions(+), 1 deletion(-)
 create mode 100644 docs/ocis/adr/0025-distributed-search-index.md

diff --git a/docs/ocis/adr/0019-file-search-index.md b/docs/ocis/adr/0019-file-search-index.md
index 2e6ecf52a5..d0701bff37 100644
--- a/docs/ocis/adr/0019-file-search-index.md
+++ b/docs/ocis/adr/0019-file-search-index.md
@@ -7,7 +7,7 @@ geekdocEditPath: edit/master/docs/ocis/adr
 geekdocFilePath: 0019-file-search-index.md
 ---
 
-* Status: proposed
+* Status: accepted
 * Deciders: @butonic, @micbar, @dragotin, @C0rby
 * Date: 2022-03-18
 
diff --git a/docs/ocis/adr/0025-distributed-search-index.md b/docs/ocis/adr/0025-distributed-search-index.md
new file mode 100644
index 0000000000..672b7d98d6
--- /dev/null
+++ b/docs/ocis/adr/0025-distributed-search-index.md
@@ -0,0 +1,76 @@
+---
+title: "25. Distributed Search Index"
+date: 2024-02-09T16:27:00+01:00
+weight: 25
+geekdocRepo: https://github.com/owncloud/ocis
+geekdocEditPath: edit/master/docs/ocis/adr
+geekdocFilePath: 0025-distributed-search-index.md
+---
+
+* Status: draft
+* Deciders: @butonic, @fschade, @aduffeck
+* Date: 2024-02-09
+
+## Context and Problem Statement
+
+Search is currently implemented with [blevesearch](https://github.com/blevesearch/bleve), which internally uses bbolt. bbolt writes to a local file, which prevents scaling out the service.
+
+The initial implementation used a single blevesearch index for all spaces. While this makes querying all spaces easy because the results do not need to be aggregated from multiple indexes, the single node becomes a bottleneck when answering search queries. Furthermore, indexing runs inside the search service, so indexing and querying compete for the same resources.
+
+## Decision Drivers
+
+* Indexing should be decoupled from the search service
+* The search service should be able to scale horizontally
+* The solution needs to be embeddable in the single binary
+
+## Considered Options
+
+* one index per space
+* [elasticsearch](https://github.com/elastic/elasticsearch) (Java)
+* [dgraph](https://github.com/dgraph-io/dgraph) (Go)
+* [manticore](https://github.com/manticoresoftware/manticoresearch/) (C++)
+* [meilisearch](https://github.com/meilisearch/meilisearch) (Rust)
+
+## Decision Outcome
+
+Chosen option: *???*
+
+### Positive Consequences:
+
+* TODO
+
+### Negative Consequences:
+
+* TODO
+
+## Pros and Cons of the Options
+
+### one index per space
+
+Instead of using a single index (the current implementation) or a distributed search index like elasticsearch, the search service should send queries to dedicated per-space indexes and aggregate the results. The API of a space index provider should accept multiple space IDs in a request, similar to how a storage provider can handle multiple spaces. If we treat a space and its search index as belonging together, we can also back them up and restore them as a single unit. In federated deployments we can send the search queries to all search providers / spaces that the user has access to.
+
+How a search provider is implemented then depends on the requirements. For a single-node deployment bleve might be fine; for a Kubernetes deployment a dedicated service might be the better fit.
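The "one index per space" option could be sketched in Go roughly as follows. This is only an illustration under stated assumptions: `Hit`, `SpaceIndexProvider`, `Route`, `Aggregate`, and the in-memory provider are hypothetical names invented for this example, not part of oCIS or bleve, and the substring-count scoring is a stand-in for real ranking.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// Hit is a single search result. All types here are hypothetical.
type Hit struct {
	SpaceID string
	Path    string
	Score   float64
}

// SpaceIndexProvider answers a query for several spaces in one request,
// mirroring how a storage provider can serve multiple spaces.
type SpaceIndexProvider interface {
	Search(spaceIDs []string, query string) ([]Hit, error)
}

// Route pairs a provider with the spaces it is responsible for.
type Route struct {
	Provider SpaceIndexProvider
	SpaceIDs []string
}

// memoryProvider is a toy provider that matches substrings in paths.
type memoryProvider struct {
	docs map[string][]string // space ID -> document paths
}

func (p *memoryProvider) Search(spaceIDs []string, query string) ([]Hit, error) {
	var hits []Hit
	for _, id := range spaceIDs {
		for _, path := range p.docs[id] {
			if n := strings.Count(path, query); n > 0 {
				hits = append(hits, Hit{SpaceID: id, Path: path, Score: float64(n)})
			}
		}
	}
	return hits, nil
}

// Aggregate queries all providers concurrently and merges their hits,
// ordered by descending score, then by space ID and path.
func Aggregate(routes []Route, query string) ([]Hit, error) {
	results := make([][]Hit, len(routes))
	errs := make([]error, len(routes))
	var wg sync.WaitGroup
	for i, r := range routes {
		wg.Add(1)
		go func(i int, r Route) {
			defer wg.Done()
			results[i], errs[i] = r.Provider.Search(r.SpaceIDs, query)
		}(i, r)
	}
	wg.Wait()

	var merged []Hit
	for i := range routes {
		if errs[i] != nil {
			return nil, fmt.Errorf("provider %d: %w", i, errs[i])
		}
		merged = append(merged, results[i]...)
	}
	sort.Slice(merged, func(a, b int) bool {
		if merged[a].Score != merged[b].Score {
			return merged[a].Score > merged[b].Score
		}
		if merged[a].SpaceID != merged[b].SpaceID {
			return merged[a].SpaceID < merged[b].SpaceID
		}
		return merged[a].Path < merged[b].Path
	})
	return merged, nil
}

func main() {
	a := &memoryProvider{docs: map[string][]string{
		"space-1": {"/docs/report.md", "/docs/notes.txt"},
		"space-2": {"/reports/q1-report.pdf"},
	}}
	b := &memoryProvider{docs: map[string][]string{
		"space-3": {"/archive/summary.txt"},
	}}
	hits, err := Aggregate([]Route{
		{Provider: a, SpaceIDs: []string{"space-1", "space-2"}},
		{Provider: b, SpaceIDs: []string{"space-3"}},
	}, "report")
	if err != nil {
		panic(err)
	}
	for _, h := range hits {
		fmt.Println(h.SpaceID, h.Path, h.Score)
	}
}
```

In this sketch the aggregating service only needs to know which provider serves which spaces. A single-node deployment could register one bleve-backed provider for all spaces, while a larger deployment could register several remote ones behind the same interface.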
+
+### elasticsearch
+
+* Good, commercial support available at https://www.elastic.co/de/pricing
+* Good, industry standard
+* Bad, nobody seems to like it
+* Bad, not embeddable (Java)
+
+### dgraph
+
+* Good, commercial support available at https://dgraph.io/pricing
+* Good, embeddable? (Go) - TODO verify
+
+### manticore
+* Good, commercial support available at https://manticoresearch.com/services/
+* Bad, not embeddable (C++)
+
+### meilisearch
+* Good, commercial support available at https://www.meilisearch.com/pricing
+* Bad, not embeddable (Rust)
+
+## Links
+
+* supersedes [ADR-0019 File Search Index]({{< ref "0019-file-search-index.md" >}})
\ No newline at end of file