Commit Graph

71 Commits

Author SHA1 Message Date
Francesco Mazzoli
cd86e632e2 Implement RS recovery, although it won't really be used now...
...since it only relies on block service flags, and we don't
set them right now.
2023-06-03 17:27:54 +00:00
Francesco Mazzoli
efb92be31a Use bitmaps for RS recover API
The previous array was sort of silly, and particularly silly now
from the kernel.
2023-06-01 14:41:21 +00:00
Francesco Mazzoli
974500633a Implement stripe prefetching 2023-06-01 13:01:33 +00:00
Francesco Mazzoli
273fa40dda Fix USR2 logging teardown 2023-06-01 10:10:25 +00:00
Francesco Mazzoli
55074b16b4 Implement fs stat
10.97.12.10:10001       29P  208T   29P   1% /home/restechprod/eggs/mnt
2023-05-29 18:49:50 +00:00
Francesco Mazzoli
499bada153 Other silly eggsktools utility 2023-05-29 17:32:15 +00:00
Francesco Mazzoli
7e25c1fd95 Do not crash if we can't find blocks 2023-05-29 09:52:01 +00:00
Francesco Mazzoli
a12a938c40 syslogify logs 2023-05-29 09:52:01 +00:00
Francesco Mazzoli
45471eded4 eggsktools readfile QOL 2023-05-28 22:12:49 +00:00
Francesco Mazzoli
a4bc32a18f Span drop improvements
We could get into situations where async droppings were scheduled
at every read.
2023-05-26 17:22:43 +00:00
Francesco Mazzoli
f95d177c34 WIP commit...
...which I mistakenly left in and I'm too lazy to fix.
2023-05-26 17:22:30 +00:00
Francesco Mazzoli
fc0ae851c8 Typo 2023-05-26 10:14:32 +00:00
Francesco Mazzoli
f98f0f3e95 Move some utilities around, allow deploying kmod easily 2023-05-26 10:05:25 +00:00
Francesco Mazzoli
1458759534 Allow to enable shard/cdc debugging at runtime using USR2 2023-05-26 10:03:59 +00:00
Francesco Mazzoli
1b83a50419 Ensure files actually are on different failure domains...
...in a very lazy way which will probably do for now.
2023-05-25 15:14:02 +00:00
Francesco Mazzoli
a61fce55f8 Simple write test for block service
We can only write 3.2Gbit/s right now, so that's definitely something
to improve.
2023-05-22 13:08:22 +00:00
Francesco Mazzoli
ac25c0aa3d Limit to 64 CPP expansions
Seems a more reasonable number than 256 for some reason.
2023-05-22 08:13:22 +00:00
Francesco Mazzoli
3cccb451d6 Remove unused define 2023-05-22 08:11:40 +00:00
Francesco Mazzoli
1eab8ee6cf Add versions to some RocksDB values
Only the ones where it is needed -- in some cases we can just
modify the keys (e.g. metadata stuff).

Also, come up with a sort of horrifying but more robust way
to specify the RocksDB values with the C preprocessor.
2023-05-22 08:03:01 +00:00
Francesco Mazzoli
6addbdee6a First version of kernel module
Initial version really by Pawel, but many changes in between.

Big outstanding issues:

* span cache reclamation (unbounded memory otherwise...)
* bad block service detection and workarounds
* corrupted blocks detection and workaround

Co-authored-by: Paweł Dziepak <pawel.dziepak@xtxmarkets.com>
2023-05-18 15:29:41 +00:00
Francesco Mazzoli
688059fd60 Vendor in everything
The go deps are vendored as source files, the C++ deps are vendored
as artifactory tarballs rather than internet tarballs.

Good practice, but the immediate motivation was to allow Saulius
to build stuff in Iceland.
2023-04-11 15:13:01 +00:00
Francesco Mazzoli
b771c12763 Static go builds 2023-03-10 16:23:30 +00:00
Francesco Mazzoli
d0100550ca A few fixes/tests 2023-03-10 16:23:30 +00:00
Francesco Mazzoli
4ef819f4e5 Do not duplicate block services when migrating...
...also add checks so that that never happens in ShardDB
2023-03-10 16:23:30 +00:00
Francesco Mazzoli
ffa7c6c5e9 cleanup docker containers after building 2023-03-10 16:23:30 +00:00
Francesco Mazzoli
5bff9b8fae Many, many changes -- tests pass, but FUSE is currently not present
The main thing that's added is full RS support, but a lot of things
were rejigged along the way. The tests are still a bit lacking,
and will be augmented in future commits.
2023-03-03 16:42:22 +00:00
Francesco Mazzoli
ae4ca721ee Simplify RS interface
Let's just care about blocks, and not about how to split the original
data.
2023-02-17 11:06:55 +00:00
Francesco Mazzoli
82cd4a3756 wip 2023-02-17 10:39:31 +00:00
Francesco Mazzoli
a387c6a0c8 Reed-Solomon library
Not optimized in the slightest, but the API should be mostly there.
2023-02-16 23:50:35 +00:00
Francesco Mazzoli
e1b8de02dc More assorted improvements 2023-02-15 14:03:53 +00:00
Francesco Mazzoli
51860fac3a Various improvements, nothing substantial 2023-02-14 22:39:38 +00:00
Francesco Mazzoli
4288189766 Reorganize logs, add req/resp to CLI, add last seen to UI 2023-02-14 12:21:48 +00:00
Francesco Mazzoli
e580cd5fe9 Select right source address in CDC/Shard 2023-02-14 12:20:21 +00:00
Francesco Mazzoli
a88e2aaa01 Add systemd services and utilities to deploy stuff on current cluster 2023-02-02 16:36:13 +00:00
Francesco Mazzoli
5bafbf03f6 Extend protocol to allow for double route for block services 2023-02-02 12:45:13 +00:00
Francesco Mazzoli
507f9b4565 More improvements to the web UI 2023-02-01 15:58:50 +00:00
Francesco Mazzoli
63b5b8d3f9 Start adding web UI, for now integrated with shuckle 2023-02-01 14:47:23 +00:00
Francesco Mazzoli
f42ff2b219 Get rid of backtrace machinery
It is currently very fragile, due to:

* Differing versions of compilers/DWARF version result in a variety
    of breakages in our code that analyzes the DWARF info;

* With musl, libunwind seems to be currently unable to traverse
    beyond signal handlers, due to the DWARF information not
    being present in the signal frame.
    See <https://maskray.me/blog/2022-04-10-unwinding-through-signal-handler>.
    Note that I have not verified that the problem in the blog
    post above is indeed what we're hitting, but it seems plausible.
2023-02-01 12:00:47 +00:00
Francesco Mazzoli
4243ff71e2 Assorted fixes 2023-02-01 12:00:44 +00:00
Francesco Mazzoli
e90c1cadb8 Prompt termination in shard/cdc 2023-01-30 18:27:46 +00:00
Francesco Mazzoli
df9efa481d Keep pinging shuckle 2023-01-30 12:18:46 +00:00
Francesco Mazzoli
85889266b1 Various housekeeping while I get ready to deploy...
...most notably we now produce fully static binaries in an alpine
image.

A few assorted thoughts:

* I really like static binaries, ideally I'd like to run EggsFS
    deployments with just systemd scripts and a few binaries.

* Go already does this, which is great.

* C++ does not, which is less great.

* Linking statically against `glibc` works, but is unsupported.
    Not only stuff like NSS (which `gethostbyname` requires)
    straight up does not work, unless you build `glibc` with
    unsupported and currently apparently broken flags
    (`--enable-static-nss`), but also other stuff is subtly
    broken (I couldn't remember exactly what was broken,
    but see comments such as
    <https://github.com/haskell/haskell-language-server/issues/2431#issuecomment-985880838>).

* So we're left with alternative libcs -- the most popular being
    musl.

* The simplest way to build a C++ application using musl is to just
    build on a system where musl is already the default libc -- such
    as alpine linux.

The backtrace support is in a bit of a bad state. Exception stacktraces
work on musl, but DWARF seems to be broken on the normal release build.

Moreover, libunwind doesn't play well with musl's signal handler:
<https://maskray.me/blog/2022-04-10-unwinding-through-signal-handler>.

Keeping it working seems to be a bit of a chore, and I'm going to revisit
it later.

In the meantime, gdb stack traces do work fine.
2023-01-29 21:41:40 +00:00
Francesco Mazzoli
9adca070ba Convert build system to cmake
Also, produce fully static binaries. This means that `gethostbyname`
does not work (doesn't work with static glibc unless you build it
with `--enable-static-nss`, which no distro builds glibc with).
2023-01-26 23:20:58 +00:00
Francesco Mazzoli
aac9e275d7 Go blockservice, and a bunch of shuckle improvements 2023-01-24 11:44:46 +00:00
Francesco Mazzoli
adfa282dbd Test inline bodies in integration test 2023-01-23 18:22:18 +00:00
Francesco Mazzoli
51d0769cb3 Test block migration 2023-01-19 14:25:47 +00:00
Francesco Mazzoli
5acefed1a7 Implement loop check in rename directory 2023-01-18 10:46:11 +00:00
Francesco Mazzoli
ac99f10f94 Add artificial packet drop to integration tests...
...and fixup many places in the code to allow for such drops to
happen somewhat smoothly.
2023-01-16 22:54:51 +00:00
Francesco Mazzoli
4d03035e00 Forgot to close the door 2023-01-11 14:04:11 +00:00
Francesco Mazzoli
89e640d7dd Remove OpenSSL dependency 2023-01-11 13:53:30 +00:00