Commit Graph

138 Commits

Author SHA1 Message Date
Miroslav Crnic
ebcdcb650a shard: add support for resetting all data in LogsDB 2024-03-13 11:33:48 +00:00
Francesco Mazzoli
005121bcac Spin block service cache out of ShardDB
This started being a problem since the block service update log
entry does not fit in a UDP packet (it's like 100KB). I think this
approach makes more sense anyway. See comment for `getCache()` for
gotchas.
2024-03-13 11:29:58 +00:00
Francesco Mazzoli
6968c25bc5 Allow : in metrics 2024-03-12 14:04:34 +00:00
Miroslav Crnic
13c5df0131 shard: fix name in xmon and add replica id to tag in metrics 2024-03-12 13:40:35 +00:00
Miroslav Crnic
b240de53b5 shard: distributed log implementation and shard can use it with a flag set 2024-03-12 11:02:04 +00:00
Francesco Mazzoli
0037e8d10b Print some info about block service flags in shard 2024-03-08 09:18:54 +00:00
Miroslav Crnic
712ed8973e core: simplify implementing custom stop for Loop 2024-02-23 13:52:34 +00:00
Francesco Mazzoli
531f989a06 Correct app type for quiet alert creation 2024-02-20 14:16:52 +00:00
Francesco Mazzoli
303421763a Allow to specify rota per alert in C++ 2024-02-20 12:59:42 +00:00
Saulius Grusnys
796e46f466 shuckle to track if blockservices have any files on them (currently t… (#177)
* shuckle to track if blockservices have any files on them (currently there is an issue with transient files)
2024-02-20 08:10:51 +00:00
Miroslav Crnic
83d0469c7f SharedRocksDB: correctly export metrics 2024-02-08 19:39:00 +00:00
Miroslav Crnic
37ba9bc457 shard: support for sharing rocksdb and init LogsDB CFs 2024-02-08 17:44:03 +00:00
Miroslav Crnic
38707535e3 shuckle: support metadata replication 2024-02-07 13:57:00 +00:00
Miroslav Crnic
1dedd7d181 core: SPSC return 0 on timeout in pull 2024-01-29 17:16:05 +00:00
Miroslav Crnic
2ec1304981 core: ppoll, futex don't like negative timeouts 2024-01-29 17:00:14 +00:00
Francesco Mazzoli
9d1a31b482 Fix another signedness mismatch 2024-01-29 16:46:05 +00:00
Miroslav Crnic
e543665f8f core: SPSC support timeout in pull 2024-01-29 16:06:31 +00:00
Francesco Mazzoli
2a326f7c5f Fix usual signedness shenanigans 🥱 2024-01-29 16:05:19 +00:00
Francesco Mazzoli
0a6a0c8f24 Process CDC timeouts in a timely manner 2024-01-29 15:08:06 +00:00
Francesco Mazzoli
2a6feb6df5 Patch RocksDB to make it compile with clang 15. 2024-01-29 14:15:29 +00:00
Francesco Mazzoli
8c0c246348 More robust detection of file vs. device errors
Just check if we're also unable to count the blocks for the disk,
and if yes, assume it's a single file error.

Of course there will be a time period where we will not have detected
the bad disk when counting the blocks (a few minutes at most), but
that's OK -- the scrubber will scrub blocks for that period, and then
stop.

Once <internal-repo/issues/65#issuecomment-24747>
is done, we should use whatever error detection we use for migration
to also distinguish between these errors.
2024-01-22 13:18:53 +00:00
Francesco Mazzoli
b6cf2b67a6 Distribute block services from shuckle
This is in preparation for #44, but more immediately, to better
stop writing to full block services.

The previous strategy of setting a flag was flawed since once
the flag was set it stayed set -- i.e. we would not remove it once
files were deleted. This consideration should just be integrated
in distributing the block services.
2024-01-16 16:17:27 +00:00
Francesco Mazzoli
d569bdb494 Re-introduce thread names (they got lost in a refactor) 2024-01-11 17:32:52 +00:00
Francesco Mazzoli
8d0b97171e Remove dead code 2024-01-11 13:03:26 +00:00
Francesco Mazzoli
c27ba8398a Tear down all threads at once
I had copied the LIFO pattern from the ETD codebase, but it's not needed
here given that the loop terminates gracefully and so we can coordinate
explicitly if needed.
2024-01-09 16:53:23 +00:00
Francesco Mazzoli
c9bf49d387 Fix silly SPSC bug 2024-01-09 11:14:18 +00:00
Francesco Mazzoli
3097752a30 Minor tweak 2024-01-08 16:03:07 +00:00
Francesco Mazzoli
ee9e0ad0af Remove pthread_attr_setsigmask_np, musl does not have it 2024-01-08 15:58:31 +00:00
Francesco Mazzoli
002b2854ec Fix leak in FetchedSpan, and hopefully fix #141. 2024-01-08 15:58:31 +00:00
Francesco Mazzoli
8075e99bb6 Graceful shard teardown
See <https://mazzo.li/posts/stopping-linux-threads.html> for tradeoffs
regarding how to terminate threads gracefully.

The goal of this work was for valgrind to work correctly, which in turn
was to investigate #141. It looks like I have succeeded:

    ==2715080== Warning: unimplemented fcntl command: 1036
    ==2715080== 20,052 bytes in 5,013 blocks are definitely lost in loss record 133 of 135
    ==2715080==    at 0x483F013: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
    ==2715080==    by 0x3B708E: allocate (new_allocator.h:121)
    ==2715080==    by 0x3B708E: allocate (allocator.h:173)
    ==2715080==    by 0x3B708E: allocate (alloc_traits.h:460)
    ==2715080==    by 0x3B708E: _M_allocate (stl_vector.h:346)
    ==2715080==    by 0x3B708E: std::vector<Crc, std::allocator<Crc> >::_M_default_append(unsigned long) (vector.tcc:635)
    ==2715080==    by 0x42BF1C: resize (stl_vector.h:940)
    ==2715080==    by 0x42BF1C: ShardDBImpl::_fileSpans(rocksdb::ReadOptions&, FileSpansReq const&, FileSpansResp&) (shard/ShardDB.cpp:921)
    ==2715080==    by 0x420867: ShardDBImpl::read(ShardReqContainer const&, ShardRespContainer&) (shard/ShardDB.cpp:1034)
    ==2715080==    by 0x3CB3EE: ShardServer::_handleRequest(int, sockaddr_in*, char*, unsigned long) (shard/Shard.cpp:347)
    ==2715080==    by 0x3C8A39: ShardServer::step() (shard/Shard.cpp:405)
    ==2715080==    by 0x40B1E8: run (core/Loop.cpp:67)
    ==2715080==    by 0x40B1E8: startLoop(void*) (core/Loop.cpp:37)
    ==2715080==    by 0x4BEA258: start_thread (in /usr/lib/libpthread-2.33.so)
    ==2715080==    by 0x4D005E2: clone (in /usr/lib/libc-2.33.so)
    ==2715080==
    ==2715080==
    ==2715080== Exit program on first error (--exit-on-first-error=yes)
2024-01-08 15:41:22 +00:00
Francesco Mazzoli
1963714c0f Remove avoidable stat in collect directories 2023-12-15 21:20:05 +00:00
Francesco Mazzoli
898b85ad9c Tweak GC parameters
We're almost in a steady state, no need to overwhelm the shards.
2023-12-11 15:04:41 +00:00
Francesco Mazzoli
8c172fd2e8 Tiny C++ xmon fix 2023-12-10 11:14:19 +00:00
Francesco Mazzoli
788b5eed57 Fill in current block services before applying the log
It makes a lot more sense to pick outside, given that it involves
randomness. Also, this is in preparation for shuckle picking them
in a smarter way.
2023-12-09 15:20:24 +00:00
Francesco Mazzoli
3394328000 Do not try to close xmon fd if we don't have one
Also, ignore errors if we can't close it. Fixes #134.
2023-12-09 14:50:51 +00:00
Francesco Mazzoli
ab1df9137d Fix error logging when inserting stats 2023-12-08 15:57:02 +00:00
Francesco Mazzoli
53049d5779 Shard batch writes, use batch UDP syscalls
The idea is to drain the socket and do a single RocksDB WAL
write/fsync for all the write requests we have found.

The read requests are immediately executed. The reasoning here is
that currently write requests are _a lot_ slower than the read
requests because fsyncing takes ~500us on fsf1. In the future this
might change.

Since we're at it, we also use batch UDP syscalls in the CDC.

Fixes #119.
2023-12-07 14:29:07 +00:00
Francesco Mazzoli
38f3d54ecd Wait forever, rather than having timeouts
The goal here is to not have constant wakeups due to timeout. Do
not attempt to clean things up nicely before termination -- just
terminate instead. We can set up a proper termination system in
the future, I first want to see if this makes a difference.

Also, change xmon to use pipes for communication, so that it can
wait without timers as well.

Also, `write` directly for logging, so that we know the logs will
make it to the file after the logging call returns (since we now
do not have the chance to flush them afterwards).
2023-12-07 10:11:19 +00:00
Francesco Mazzoli
91db9566e1 Remove option to not write out atime which is too recent
This was pretty nasty to begin with, we now do it in the client.
2023-11-23 13:28:23 +00:00
Francesco Mazzoli
bcf75d5308 Shut up sanitizer 2023-11-21 17:03:05 +00:00
Francesco Mazzoli
1fca8b84cd Fix type signature 2023-11-17 22:48:31 +00:00
Francesco Mazzoli
b964d0632a Add option to not write out atime which is too recent
This is to save on a ton of writes as jobs stat tons of files.
It would maybe be a bit cleaner to do it in the kmod, but this is
much quicker.

Thanks to @sgrusny for the good idea.
2023-11-16 14:45:58 +00:00
Saulius Grusnys
2ce5586eb9 Periodically refresh metadata info in kmod, use two IPs for shuckle
Fixes #112.

Co-authored-by: Francesco Mazzoli <francesco.mazzoli@xtxmarkets.com>
2023-11-14 13:49:36 +00:00
Francesco Mazzoli
3bc17301d6 Switch from tuple to variant for req/resp containers
The `tuple` was for when I thought it'd be useful to leave slots
for each request, but we don't need this anymore, and now leading
up to #66 I want to be able to keep vectors of reqs/resps.
2023-11-09 19:03:37 +00:00
Francesco Mazzoli
ad3c969772 Push full RocksDB stats to grafana 2023-11-09 16:48:51 +00:00
Francesco Mazzoli
057be91613 rocksDBStats -> rocksDBMetrics 2023-11-09 13:38:32 +00:00
Francesco Mazzoli
c5979a9d90 Expose some RocksDB stats 2023-11-09 13:23:49 +00:00
Francesco Mazzoli
d0126d0656 Distinguish IO errors in eggsblocks
See #115 for background.
2023-11-06 19:35:05 +00:00
Francesco Mazzoli
1ec63f9710 Implement scrubbing functionality
Fixes #32. This also involves some reworking of the block request machinery
to make it more robust and faster. The scrubbing is done assuming that
the overwhelming majority of block checking will go through.
2023-11-05 18:33:00 +00:00
Francesco Mazzoli
71556ce933 Switch to restech EggsFS rota 2023-11-03 14:23:44 +00:00