Commit Graph

363 Commits

Author SHA1 Message Date
Miroslav Crnic ffb5989692 cdc: increase number of messages received to account for LogsDB msgs 2024-06-02 11:22:20 +00:00
Miroslav Crnic bb47c8f38d cdc: remove adaptive msg receive 2024-06-02 07:59:16 +00:00
Miroslav Crnic 98d0376fc9 cdc: don't forget requests until they are handed over to LogsDB 2024-05-24 10:16:54 +01:00
Miroslav Crnic 1f145c030e shard/cdc: support snapshotting 2024-05-23 10:17:59 +01:00
Miroslav Crnic 1446e4d0d2 cdc: force log cleanup after crash
The transactional DB that CDC uses has a slightly
annoying property: it flushes the WAL on transaction
start. As a result, the release point can move and
log records can be persisted even if we crash.
We want to remove them automatically for now.
2024-05-22 16:58:18 +00:00
Miroslav Crnic a377536b40 cdc: correctly rewind assumed logIndex 2024-05-22 15:28:15 +00:00
Miroslav Crnic f11b675807 shuckle: add cdc replicas to page 2024-05-22 11:57:34 +00:00
Miroslav Crnic 4e574374ca shard/cdc: cleanup logsdb options, hostmon name match service name 2024-05-22 10:21:41 +00:00
Miroslav Crnic 25e8264517 cdc: rewind expected LogIdx on append window full 2024-05-22 09:14:16 +00:00
Miroslav Crnic b524748210 cdc: use DEFAULT_UDP_MTU for serialization of entries 2024-05-21 14:23:27 +00:00
Miroslav Crnic 121340f1b2 cdc: log line fixes and handle interrupt 2024-05-21 12:56:26 +00:00
Miroslav Crnic ab4c25e5e3 cdc: use normal buffer size in cdc sockets 2024-05-21 12:55:47 +00:00
Miroslav Crnic 8be746de5b cdc: differentiate replicas in xmon and in metrics 2024-05-21 09:51:37 +00:00
Miroslav Crnic 5d453179ad cdc: don't alert on missing replicas if replication is off 2024-05-20 13:10:31 +00:00
Francesco Mazzoli 6faa917c18 Add endpoint and cli util to resurrect files
Only works in the same shard, for now.
2024-05-20 12:06:15 +00:00
Miroslav Crnic 0b3348b458 SharedRocksDB: (ref) paths in constructor 2024-05-17 14:23:25 +00:00
Miroslav Crnic ab337068ad httplib: use poll instead of select 2024-05-17 14:23:01 +00:00
Miroslav Crnic f5e17dace5 cdc: add LogsDB
* cdc: pack req/resp into log entries and apply

* shard: drop support for unused incoming packet drop

* cdc: add logsdb
2024-05-14 12:50:17 +01:00
Miroslav Crnic 91d462ab0e UDPSocketPair: (fix) don't look at unused fds 2024-05-14 08:51:29 +00:00
Miroslav Crnic aa8925adf9 shard: fix stats 2024-05-04 09:17:45 +01:00
Miroslav Crnic 8a0ea10cde core: UDPSocketPair and use IpPort AddrsInfo everywhere
* core: UDPSocketPair and use IpPort AddrsInfo everywhere

* Refactor UDPSocketPair a bit

* ci: kmod always delete img before create

* shuckle: fix scripts/json marshal

---------

Co-authored-by: Francesco Mazzoli <francesco.mazzoli@xtxmarkets.com>
2024-05-03 11:32:07 +01:00
Francesco Mazzoli ded0787c18 Fix no-return function 2024-05-01 08:53:38 +00:00
Francesco Mazzoli cd8e52f8f7 Remove assertions in ShardDB
We got a crash because of them (presumably this can happen if defrag
conflicts with migrate or something like that)
2024-05-01 08:13:19 +00:00
Miroslav Crnic bbe201964d shard: bump quiet time for metric inserter alert to 5 min 2024-04-30 17:54:13 +01:00
Francesco Mazzoli 40ed10b2c6 Bump quiet period for metrics alerts
VictoriaMetrics is often acting up
2024-04-29 11:39:54 +00:00
Francesco Mazzoli d3be7bf53a Remove old-style register block service request 2024-04-22 19:20:04 +00:00
Francesco Mazzoli f109e3542b Have eggsblocks refresh decommissioned block services
So that we can reliably ignore stale block services in GC (done in
a future commit). To enable this and future-proof this kind of
mechanism (e.g. having `eggsblocks` mark something as D itself)
I added a new way to register the block service that lets you mask
which flags you're checking. I'll remove the old way once we've
rolled out everywhere.
2024-04-22 18:47:54 +00:00
Miroslav Crnic 6007192a31 bincode: generate STATIC_SIZE for req/resp 2024-04-22 13:49:49 +01:00
Miroslav Crnic 43f69b1f7e shuckle: support ClearShardInfoReq/Resp 2024-04-16 10:25:24 +01:00
Miroslav Crnic a579b41dfc shuckle: support for MoveLeaderReq 2024-04-15 14:24:15 +01:00
Miroslav Crnic c1cea71f55 shard: don't serve reads if not leader 2024-04-15 14:23:55 +01:00
Miroslav Crnic 5fad8546bd shard: increase block fetch interval on leader to 2min 2024-04-10 13:51:00 +01:00
Francesco Mazzoli 51cda3a98b Clear data in shuckle block service fetch loop for symmetry 2024-04-10 10:39:46 +00:00
Francesco Mazzoli 20e7635d75 Clear data when request fails in Shuckle.cpp 2024-04-10 10:39:30 +00:00
Francesco Mazzoli d8267f18c6 Add more fsck functionality 2024-04-09 17:57:00 +00:00
Miroslav Crnic 2de5c6b5dc shard: delay block service update on leader 2024-04-09 17:59:59 +01:00
Francesco Mazzoli eb766f2fb5 Do not attempt to cross-shard unlink file if the file is a directory 2024-04-09 11:43:03 +00:00
Francesco Mazzoli e42c548777 Make SwapSpans idempotent 2024-04-09 07:53:10 +01:00
Francesco Mazzoli 13bf9a005a Run fsck at the end of tests 2024-04-09 07:53:10 +01:00
Francesco Mazzoli f10c7e0744 Add eggscli functionality to "defrag" files
Fixes #50.
2024-04-09 07:53:10 +01:00
Francesco Mazzoli 4dd929a798 Implement swap spans 2024-04-09 07:53:10 +01:00
Miroslav Crnic 409b126e4b cdc: use SharedRocksDB 2024-04-05 23:22:39 +01:00
Miroslav Crnic fcb8ab79f8 shard: always run with logsdb, disable separate ci 2024-04-05 21:50:33 +01:00
Miroslav Crnic 0a6e4be683 shard: disable double flush and improve kmod vm 2024-04-05 17:34:42 +01:00
Francesco Mazzoli d44b331739 Remove one-off fixup command in db tools 2024-04-05 12:55:35 +00:00
Francesco Mazzoli 498eb0feda Add single-use utility to fixup some bad RocksDB values in shard 0 2024-04-04 15:55:11 +00:00
Miroslav Crnic 9f1dbf06d0 eggsshuckle: detect leader change attempt and return error 2024-04-04 13:39:08 +01:00
Miroslav Crnic de17eee24f core: fix incorrect return in connectHost 2024-04-03 15:08:48 +01:00
Miroslav Crnic 30ee029f7e shuckle: make requests interruptible and pass timeout to all operations
This means that they'll be interrupted at shutdown, rather than holding everything up when shuckle is overloaded.
We also detect idle connections and slow data transmission.
2024-04-02 18:15:29 +01:00
Francesco Mazzoli b38dcc550b fsck fixes/logs 2024-03-28 10:04:15 +00:00