The idea is to drain the socket and do a single RocksDB WAL
write/fsync for all the write requests we have found.
The read requests are immediately executed. The reasoning here is
that currently write requests are _a lot_ slower than the read
requests because fsyncing takes ~500us on fsf1. In the future this
might change.
Since we're at it, we also use batch UDP syscalls in the CDC.
Fixes#119.
This is to save on a ton of writes as jobs stat tons of files.
It would maybe be a bit cleaner to do it in the kmod, but this is
much quicker.
Thanks to @sgrusny for the good idea.
The `tuple` was for when I thought it'd be useful to leave slots
for each request, but we don't need this anymore, and now leading
up to #66 I want to be able to keep vectors of reqs/resps.
Fixes#32. This also involves some reworking of the block request machinery
to make it more robust and faster. The scrubbing is done assuming that
the overwhelming majority of block checking will go through.
This was triggered by a server failing hard (fsr13), without any
short term resolution (we've already replaced the mobo, we'll probably
replace the HBA). In this case GC should still run rather than
get stuck.
Fixes#29.
The additions to codegen are unrelated -- I was exploring a different
approach based on request equality and I decided to keep those
changes in since they might be useful anyhow.
See comment in `msgs.go`. This would normally have required
entirely new transactions, but since we're not in production yet
I'm going to just change the schema and wipe the current FS.
This also adds in an unrelated change regarding more flexible
blacklisting, which will be required for some additional testing
I'm preparing.
Not used right now, but this way we can easily start stuffing more
data in responses.
I also split off some arguments in `NewClient`, unrelated change
(I wanted to pair the MTU with a single client, but I then realized
that it's enough to have it as some global property for now).
And hopefully reduce the likelihood of bugs. On the write end, given
that we do things less asynchronously, things might be a bit slower,
but I think the simplification is worth it for now.
Also, fix/improve a bunch of other stuff.
This is one of the two data model/protocol changes I want to perform
before going into production, the other being file atime.
Right now the kernel module does not take advantage of this, but
it's OK since I tested the rest of the code reasonably and the goal
here is to perform the protocol/data changes.
Initial version really by Pawel, but many changes in between.
Big outstanding issues:
* span cache reclamation (unbounded memory otherwise...)
* bad block service detection and workarounds
* corrupted blocks detection and workaround
Co-authored-by: Paweł Dziepak <pawel.dziepak@xtxmarkets.com>
The main thing that's added is full RS support, but a lot of things
were rejigged along the way. The tests are still a bit lacking,
and will be augmented in future commits.
...most notably we now produce fully static binaries in an alpine
image.
A few assorted thoughts:
* I really like static binaries, ideally I'd like to run EggsFS
deployments with just systemd scripts and a few binaries.
* Go already does this, which is great.
* C++ does not, which is less great.
* Linking statically against `glibc` works, but is unsupported.
Not only stuff like NSS (which `gethostbyname` requires)
straight up does not work, unless you build `glibc` with
unsupported and currently apparently broken flags
(`--enable-static-nss`), but also other stuff is subtly
broken (I couldn't remember exactly what was broken,
but see comments such as
<https://github.com/haskell/haskell-language-server/issues/2431#issuecomment-985880838>).
* So we're left with alternative libcs -- the most popular being
musl.
* The simplest way to build a C++ application using musl is to just
build on a system where musl is already the default libc -- such
as alpine linux.
The backtrace support is in a bit of a bad state. Exception stacktraces
work on musl, but DWARF seems to be broken on the normal release build.
Moreover, libunwind doesn't play well with musl's signal handler:
<https://maskray.me/blog/2022-04-10-unwinding-through-signal-handler>.
Keeping it working seems to be a bit of a chore, and I'm going to revisit
it later.
In the meantime, gdb stack traces do work fine.
Also, produce fully static binaries. This means that `gethostname`
does not work (doesn't work with static glibc unless you build it
with `--enable-static-nss`, which no distro builds glibc with).