118 Commits

Author SHA1 Message Date
Francesco Mazzoli
110705db8d EggsFS -> TernFS rename
Things not done because probably disruptive:

* kmod filesystem string
* sysctl/debugfs/trace
* metrics names
* xmon instance names

Some of these might be renamed too, but starting with a relatively
safe set.
2025-09-03 09:29:53 +01:00
Miroslav Crnic
f3f5b4b0e2 cdc: don't flush on each log entry
We control flushing of the WAL manually.
Changes are persisted outside the loop by a single flush() call with sync.
Since we only send responses after the flush, we never send information
about anything that was not persisted.
2025-08-28 08:26:11 +00:00
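The flush strategy described in the commit above can be illustrated with a minimal sketch. It is not the CDC code: the `Wal` class, `process_batch`, and the length-prefixed record format are hypothetical names and choices made for illustration. It only shows the ordering the message describes: append every entry, sync once after the loop, then send responses.

```python
import os


class Wal:
    """Toy write-ahead log (hypothetical) that counts its syncs."""

    def __init__(self, path):
        self._f = open(path, "ab")
        self.sync_count = 0

    def append(self, entry: bytes) -> None:
        # Length-prefixed record; it stays in the userspace buffer
        # until flush() is called.
        self._f.write(len(entry).to_bytes(4, "little") + entry)

    def flush(self, sync: bool = True) -> None:
        self._f.flush()
        if sync:
            os.fsync(self._f.fileno())
            self.sync_count += 1

    def close(self) -> None:
        self._f.close()


def process_batch(wal, entries, send_response):
    """Append all entries, sync once, and only then respond."""
    pending = []
    for entry in entries:
        wal.append(entry)        # no flush inside the loop
        pending.append(entry)
    wal.flush(sync=True)         # single flush + fsync for the whole batch
    for entry in pending:
        send_response(entry)     # nothing unpersisted is ever acknowledged
```

Calling `process_batch(wal, [b"a", b"bb"], handler)` performs one sync regardless of the batch size, and `handler` runs only after that sync has returned.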
Francesco Mazzoli
786073adbf Do not move entries when printing out error message 2025-08-13 10:45:22 +00:00
Miroslav Crnic
71570f7cdc cdc: remove alert on rare race with gc 2025-06-05 15:04:29 +00:00
Miroslav Crnic
0377c4642e cdc: stop raising alert on MISMATCHING_CREATION_TIME 2024-11-20 18:27:08 +00:00
Miroslav Crnic
1a47089b3d shard: proxy read/write 2024-11-17 16:38:43 +00:00
Miroslav Crnic
5f24b43184 shuckle: support locations 2024-11-14 09:26:44 +00:00
Miroslav Crnic
48c3aa7d4a logsdb: enable partial leader election 2024-10-11 09:52:18 +01:00
Miroslav Crnic
2b738e01c7 shard/cdc: location output in log 2024-09-12 14:06:38 +00:00
Miroslav Crnic
2dec9ec117 cdc: register location 2024-09-12 14:27:55 +01:00
Miroslav Crnic
c7b6a1cbeb stats: stop producing them 2024-07-09 15:57:01 +01:00
Miroslav Crnic
78baed62a5 cdc: request checkpoints from shard and push through log 2024-06-13 16:24:22 +01:00
Miroslav Crnic
9d06deeedc cdc: error part of shard response 2024-06-13 13:00:43 +01:00
Miroslav Crnic
2cd15fc0be core: various protocol changes 2024-06-13 09:13:11 +01:00
Miroslav Crnic
7aac745457 shard/cdc: fetch all replicas quickly unless in do not replicate mode 2024-06-12 16:08:29 +00:00
Miroslav Crnic
71ee6568c5 cdc: correctly name logsdb stats 2024-06-10 15:49:44 +00:00
Miroslav Crnic
170f2fbc61 logsdb: add stats and expose in shard/cdc 2024-06-10 16:24:49 +01:00
Miroslav Crnic
ffb5989692 cdc: increase number of messages received to account for LogsDB msgs 2024-06-02 11:22:20 +00:00
Miroslav Crnic
bb47c8f38d cdc: remove adaptive msg receive 2024-06-02 07:59:16 +00:00
Miroslav Crnic
98d0376fc9 cdc: don't forget requests until they are handed over to LogsDB 2024-05-24 10:16:54 +01:00
Miroslav Crnic
1f145c030e shard/cdc: support snapshotting 2024-05-23 10:17:59 +01:00
Miroslav Crnic
1446e4d0d2 cdc: force log cleanup after crash
The transactional DB that the CDC uses has a slightly
annoying property: it flushes the WAL on transaction
start. As a result, the release point can get moved and
log records persisted even if we crash.
We want to remove them automatically for now.
2024-05-22 16:58:18 +00:00
Miroslav Crnic
a377536b40 cdc: correctly rewind assumed logIndex 2024-05-22 15:28:15 +00:00
Miroslav Crnic
4e574374ca shard/cdc: cleanup logsdb options, hostmon name match service name 2024-05-22 10:21:41 +00:00
Miroslav Crnic
25e8264517 cdc: rewind expected LogIdx on append window full 2024-05-22 09:14:16 +00:00
Miroslav Crnic
b524748210 cdc: use DEFAULT_UDP_MTU for serialization of entries 2024-05-21 14:23:27 +00:00
Miroslav Crnic
121340f1b2 cdc: log line fixes and handle interrupt 2024-05-21 12:56:26 +00:00
Miroslav Crnic
ab4c25e5e3 cdc: use normal buffer size in cdc sockets 2024-05-21 12:55:47 +00:00
Miroslav Crnic
8be746de5b cdc: differentiate replicas in xmon and in metrics 2024-05-21 09:51:37 +00:00
Miroslav Crnic
5d453179ad cdc: don't alert on missing replicas if replication is off 2024-05-20 13:10:31 +00:00
Miroslav Crnic
0b3348b458 SharedRocksDB: (ref) paths in constructor 2024-05-17 14:23:25 +00:00
Miroslav Crnic
f5e17dace5 cdc: add LogsDB
* cdc: pack req/resp into log entries and apply

* shard: drop support for unused incoming packet drop

* cdc: add logsdb
2024-05-14 12:50:17 +01:00
Miroslav Crnic
8a0ea10cde core: UDPSocketPair and use IpPort AddrsInfo everywhere
* core: UDPSocketPair and use IpPort AddrsInfo everywhere

* Refactor UDPSocketPair a bit

* ci: kmod always delete img before create

* shuckle: fix scripts/json marshal

---------

Co-authored-by: Francesco Mazzoli <francesco.mazzoli@xtxmarkets.com>
2024-05-03 11:32:07 +01:00
Francesco Mazzoli
40ed10b2c6 Bump quiet period for metrics alerts
VictoriaMetrics is often acting up
2024-04-29 11:39:54 +00:00
Miroslav Crnic
409b126e4b cdc: use SharedRocksDB 2024-04-05 23:22:39 +01:00
Miroslav Crnic
30ee029f7e shuckle: make requests interruptible and pass timeout to all operations
This means that they'll be interrupted at shutdown, rather than holding everything up when shuckle is overloaded.
We also detect idle connections and connections that transmit data too slowly.
2024-04-02 18:15:29 +01:00
Francesco Mazzoli
7a5fc9f8a9 Allow to disable shuckle stat inserting 2024-03-25 16:08:54 +00:00
Francesco Mazzoli
3a6e498664 Make some Loop methods static 2024-03-20 13:00:18 +00:00
Miroslav Crnic
b240de53b5 shard: distributed log implementation and shard can use it with a flag set 2024-03-12 11:02:04 +00:00
Miroslav Crnic
712ed8973e core: simplify implementing custom stop for Loop 2024-02-23 13:52:34 +00:00
Francesco Mazzoli
beb07dbe6e Silence CDC queue alert 2024-02-21 14:57:00 +00:00
Francesco Mazzoli
303421763a Allow to specify rota per alert in C++ 2024-02-20 12:59:42 +00:00
Francesco Mazzoli
0a6a0c8f24 Process CDC timeouts in a timely manner 2024-01-29 15:08:06 +00:00
Miroslav Crnic
7ce185c219 cdc: remove unnecessary zeroing in shared 2024-01-24 14:24:06 +00:00
Francesco Mazzoli
f8b432eb18 Add metric and alert for CDC update size 2024-01-16 23:22:39 +00:00
Francesco Mazzoli
c80c6269d9 Remove spurious MsgsGen.hpp includes 2024-01-11 16:05:34 +00:00
Francesco Mazzoli
8075e99bb6 Graceful shard teardown
See <https://mazzo.li/posts/stopping-linux-threads.html> for tradeoffs
regarding how to terminate threads gracefully.

The goal of this work was for valgrind to work correctly, which in turn
was to investigate #141. It looks like I have succeeded:

    ==2715080== Warning: unimplemented fcntl command: 1036
    ==2715080== 20,052 bytes in 5,013 blocks are definitely lost in loss record 133 of 135
    ==2715080==    at 0x483F013: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
    ==2715080==    by 0x3B708E: allocate (new_allocator.h:121)
    ==2715080==    by 0x3B708E: allocate (allocator.h:173)
    ==2715080==    by 0x3B708E: allocate (alloc_traits.h:460)
    ==2715080==    by 0x3B708E: _M_allocate (stl_vector.h:346)
    ==2715080==    by 0x3B708E: std::vector<Crc, std::allocator<Crc> >::_M_default_append(unsigned long) (vector.tcc:635)
    ==2715080==    by 0x42BF1C: resize (stl_vector.h:940)
    ==2715080==    by 0x42BF1C: ShardDBImpl::_fileSpans(rocksdb::ReadOptions&, FileSpansReq const&, FileSpansResp&) (shard/ShardDB.cpp:921)
    ==2715080==    by 0x420867: ShardDBImpl::read(ShardReqContainer const&, ShardRespContainer&) (shard/ShardDB.cpp:1034)
    ==2715080==    by 0x3CB3EE: ShardServer::_handleRequest(int, sockaddr_in*, char*, unsigned long) (shard/Shard.cpp:347)
    ==2715080==    by 0x3C8A39: ShardServer::step() (shard/Shard.cpp:405)
    ==2715080==    by 0x40B1E8: run (core/Loop.cpp:67)
    ==2715080==    by 0x40B1E8: startLoop(void*) (core/Loop.cpp:37)
    ==2715080==    by 0x4BEA258: start_thread (in /usr/lib/libpthread-2.33.so)
    ==2715080==    by 0x4D005E2: clone (in /usr/lib/libc-2.33.so)
    ==2715080==
    ==2715080==
    ==2715080== Exit program on first error (--exit-on-first-error=yes)
2024-01-08 15:41:22 +00:00
Francesco Mazzoli
53049d5779 Shard batch writes, use batch UDP syscalls
The idea is to drain the socket and do a single RocksDB WAL
write/fsync for all the write requests we have found.

The read requests are immediately executed. The reasoning here is
that currently write requests are _a lot_ slower than the read
requests because fsyncing takes ~500us on fsf1. In the future this
might change.

Since we're at it, we also use batch UDP syscalls in the CDC.

Fixes #119.
2023-12-07 14:29:07 +00:00
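The batching idea in the commit above can be sketched as follows. This is an illustrative toy, not the shard implementation: `drain`, `ToyDB`, `step`, and the `R key` / `W key=value` message format are all hypothetical, and a loop of non-blocking `recv` calls stands in for the batch UDP syscalls the commit mentions. The point is only the shape: drain the socket, answer reads immediately, and commit all writes found in that pass with a single synced write.

```python
import socket


def drain(sock, max_msgs=128, bufsize=2048):
    """Return every datagram currently queued on a non-blocking socket."""
    msgs = []
    while len(msgs) < max_msgs:
        try:
            msgs.append(sock.recv(bufsize))
        except BlockingIOError:
            break  # the receive queue is empty
    return msgs


class ToyDB:
    """In-memory stand-in for the database (hypothetical)."""

    def __init__(self):
        self.data = {}
        self.sync_count = 0

    def read(self, key):
        return self.data.get(key)

    def write_batch(self, writes):
        self.data.update(writes)
        self.sync_count += 1  # stands in for one WAL write + fsync


def step(sock, db):
    """Process one drained batch and return the responses, in order."""
    responses = []
    writes = []
    for msg in drain(sock):
        kind, _, rest = msg.partition(b" ")
        if kind == b"R":
            # Reads are executed immediately; they see the state from
            # before this batch's writes are committed.
            responses.append(db.read(rest))
        elif kind == b"W":
            key, _, value = rest.partition(b"=")
            writes.append((key, value))
    if writes:
        db.write_batch(writes)  # one sync for all writes in the batch
        responses.extend(b"OK" for _ in writes)
    return responses
```

With this shape, the cost of the sync is paid once per drained batch rather than once per write request, which is where the benefit comes from when syncing dominates request latency.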
Francesco Mazzoli
3eae5bbf9b Use an EMA for the in-flight CDC txns as well 2023-12-07 10:27:32 +00:00
Francesco Mazzoli
38f3d54ecd Wait forever, rather than having timeouts
The goal here is to not have constant wakeups due to timeout. Do
not attempt to clean things up nicely before termination -- just
terminate instead. We can set up a proper termination system in
the future; I first want to see if this makes a difference.

Also, change xmon to use pipes for communication, so that it can
wait without timers as well.

Also, `write` directly for logging, so that we know the logs will
make it to the file after the logging call returns (since we now
do not have the chance to flush them afterwards).
2023-12-07 10:11:19 +00:00
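The two mechanisms in the commit above, waiting on a pipe without a timer and logging with a direct `write`, can be sketched like this. It is a hedged illustration only: `Waker` and `log_line` are hypothetical names, and nothing here reflects how xmon or the logger are actually structured.

```python
import os
import select


class Waker:
    """Pipe-based wakeup: the waiter sleeps with no timeout at all."""

    def __init__(self):
        self._r, self._w = os.pipe()

    def wake(self, payload=b"\x01"):
        # Any thread can write to the pipe to wake the waiter.
        os.write(self._w, payload)

    def wait(self):
        # No timeout argument: no periodic wakeups, the call returns
        # only once there is something to read.
        select.select([self._r], [], [])
        return os.read(self._r, 4096)

    def close(self):
        os.close(self._r)
        os.close(self._w)


def log_line(fd, message):
    """Write a log line straight to the file descriptor, unbuffered."""
    data = (message + "\n").encode()
    while data:
        written = os.write(fd, data)  # may be a partial write
        data = data[written:]
```

Because `log_line` bypasses userspace buffering, the line has been handed to the kernel by the time the call returns, so no later flush is needed if the process terminates abruptly.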