Initial version really by Pawel, but many changes in between.
Big outstanding issues:
* span cache reclamation (unbounded memory otherwise...)
* bad block service detection and workarounds
* corrupted block detection and workarounds
Co-authored-by: Paweł Dziepak <pawel.dziepak@xtxmarkets.com>
The Go deps are vendored as source files; the C++ deps are vendored
as Artifactory tarballs rather than tarballs fetched from the internet.
Good practice, but the immediate motivation was to allow Saulius
to build stuff in Iceland.
The main thing that's added is full RS support, but a lot of things
were rejigged along the way. The tests are still a bit lacking,
and will be augmented in future commits.
It is currently very fragile, due to:
* Differing compiler and DWARF versions result in a variety
of breakages in our code which analyzes the DWARF info;
* With musl, libunwind seems to be currently unable to traverse
beyond signal handlers, due to the DWARF information not
being present in the signal frame.
See <https://maskray.me/blog/2022-04-10-unwinding-through-signal-handler>.
Note that I have not verified that the problem in the blog
post above is indeed what we're hitting, but it seems plausible.
...most notably we now produce fully static binaries in an alpine
image.
A few assorted thoughts:
* I really like static binaries; ideally I'd like to run EggsFS
deployments with just systemd scripts and a few binaries.
* Go already does this, which is great.
* C++ does not, which is less great.
* Linking statically against `glibc` works, but is unsupported.
Not only does stuff like NSS (which `gethostbyname` requires)
straight up not work unless you build `glibc` with
unsupported and currently apparently broken flags
(`--enable-static-nss`), but other stuff is also subtly
broken (I can't remember exactly what was broken,
but see comments such as
<https://github.com/haskell/haskell-language-server/issues/2431#issuecomment-985880838>).
* So we're left with alternative libcs -- the most popular being
musl.
* The simplest way to build a C++ application using musl is to just
build on a system where musl is already the default libc -- such
as alpine linux.
The backtrace support is in a bit of a bad state. Exception stacktraces
work on musl, but DWARF seems to be broken on the normal release build.
Moreover, libunwind doesn't play well with musl's signal handler:
<https://maskray.me/blog/2022-04-10-unwinding-through-signal-handler>.
Keeping it working seems to be a bit of a chore, and I'm going to revisit
it later.
In the meantime, gdb stack traces do work fine.
Also, produce fully static binaries. This means that `gethostbyname`
does not work (it doesn't work with static glibc unless you build it
with `--enable-static-nss`, which no distro builds glibc with).
While thinking about what to do with the CDC for now, I got thinking
a bit more about how to handle the log entry/state persisting split.
I think this makes more sense, it'll allow the log consensus module
to be bolted on top fairly cleanly.
Most operations apart from span-related ones work. Using this as
a checkpoint -- the Python code is currently not really working;
I'm working to migrate to pretty much a full C++/Go world.
Mostly, we now have ext block ids everywhere, and block service
requests take the block id.
Also, gearing up to explicit blacklists for block services,
in case a client detects failures.