Miroslav Crnic
a579b41dfc
shuckle: support for MoveLeaderReq
2024-04-15 14:24:15 +01:00
Francesco Mazzoli
20e7635d75
Clear data when request fails in Shuckle.cpp
2024-04-10 10:39:30 +00:00
Francesco Mazzoli
e42c548777
Make SwapSpans idempotent
2024-04-09 07:53:10 +01:00
Francesco Mazzoli
4dd929a798
Implement swap spans
2024-04-09 07:53:10 +01:00
Miroslav Crnic
409b126e4b
cdc: use SharedRocksDB
2024-04-05 23:22:39 +01:00
Miroslav Crnic
0a6e4be683
shard: disable double flush and improve kmod vm
2024-04-05 17:34:42 +01:00
Miroslav Crnic
de17eee24f
core: fix incorrect return in connectHost
2024-04-03 15:08:48 +01:00
Miroslav Crnic
30ee029f7e
shuckle: make requests interruptible and pass timeout to all operations
...
This means that they'll be interrupted at shutdown, rather than holding everything up when shuckle is overloaded.
We also detect idle connections and slow data transmission.
2024-04-02 18:15:29 +01:00
Francesco Mazzoli
68c4c03750
Add command to run some checks directly in RocksDB database
2024-03-27 18:45:14 +00:00
Miroslav Crnic
aebcce4017
logsdb: fix assert for last released going backwards
2024-03-25 10:31:58 +00:00
Miroslav Crnic
7df0a5da89
shard: cli options now match migration phases for LogsDB, and support manual failover
2024-03-20 15:34:55 +00:00
Saulius Grusnys
fd9079febf
Rate limited shuckle endpoint to decom blockservices
2024-03-20 15:16:00 +00:00
Francesco Mazzoli
1cf299bfac
Use atomics where appropriate
2024-03-20 13:21:18 +00:00
Francesco Mazzoli
f85714dbba
Use pthread_self() to get pthread thread id
2024-03-20 13:11:14 +00:00
Francesco Mazzoli
3a6e498664
Make some Loop methods static
2024-03-20 13:00:18 +00:00
Francesco Mazzoli
9bc7e209e4
Safer ShuckleSock
2024-03-20 11:33:39 +00:00
Francesco Mazzoli
66fe0a2621
Correct pthread_timedjoin_np handling
2024-03-20 11:13:26 +00:00
Francesco Mazzoli
8f1ba6361b
Resist interruptions when joining threads
2024-03-20 10:32:42 +00:00
Francesco Mazzoli
66ccba6124
Forward termination signal to main thread
2024-03-20 10:32:42 +00:00
Francesco Mazzoli
b12cdf7507
Add replicas info to shuckle web ui
2024-03-19 15:55:18 +00:00
Miroslav Crnic
938c845a30
eggsdbtool: cli for shard db comparison
2024-03-19 15:00:01 +00:00
Miroslav Crnic
a4c091c7b2
logsdb: log state at flush to have consistent view
2024-03-19 12:44:56 +00:00
Miroslav Crnic
096b9cbe6a
logsdb: fix for replication path
2024-03-18 17:29:49 +00:00
Miroslav Crnic
dfcabdba97
LogsDB: tweak catchup timeout
2024-03-18 12:00:27 +00:00
Miroslav Crnic
c8cda7e4db
logsdb: periodically log status
2024-03-18 09:44:47 +00:00
Miroslav Crnic
72c1acaea8
xmon: if too many alerts initialize appType to _parent
2024-03-15 19:39:41 +00:00
Miroslav Crnic
27faaa45ae
ci: add ability to run with LogsDB, shard: add handling of LogsDB messages
2024-03-15 16:49:39 +00:00
Miroslav Crnic
ebcdcb650a
shard: add support for resetting all data in LogsDB
2024-03-13 11:33:48 +00:00
Francesco Mazzoli
005121bcac
Spin block service cache out of ShardDB
...
This started being a problem since the block service update log
entry does not fit in a UDP packet (it's like 100KB). I think this
approach makes more sense anyway. See comment for `getCache()` for
gotchas.
2024-03-13 11:29:58 +00:00
Francesco Mazzoli
6968c25bc5
Allow : in metrics
2024-03-12 14:04:34 +00:00
Miroslav Crnic
13c5df0131
shard: fix name in xmon and add replica id to tag in metrics
2024-03-12 13:40:35 +00:00
Miroslav Crnic
b240de53b5
shard: distributed log implementation and shard can use it with a flag set
2024-03-12 11:02:04 +00:00
Francesco Mazzoli
0037e8d10b
Print some info about block service flags in shard
2024-03-08 09:18:54 +00:00
Miroslav Crnic
712ed8973e
core: simplify implementing custom stop for Loop
2024-02-23 13:52:34 +00:00
Francesco Mazzoli
531f989a06
Correct app type for quiet alert creation
2024-02-20 14:16:52 +00:00
Francesco Mazzoli
303421763a
Allow specifying rota per alert in C++
2024-02-20 12:59:42 +00:00
Saulius Grusnys
796e46f466
shuckle to track if blockservices have any files on them (currently t… ( #177 )
...
* shuckle to track if blockservices have any files on them (currently there is an issue with transient files)
2024-02-20 08:10:51 +00:00
Miroslav Crnic
83d0469c7f
SharedRocksDB: correctly export metrics
2024-02-08 19:39:00 +00:00
Miroslav Crnic
37ba9bc457
shard: support for sharing rocksdb and init LogsDB CFs
2024-02-08 17:44:03 +00:00
Miroslav Crnic
38707535e3
shuckle: support metadata replication
2024-02-07 13:57:00 +00:00
Miroslav Crnic
1dedd7d181
core: SPSC return 0 on timeout in pull
2024-01-29 17:16:05 +00:00
Miroslav Crnic
2ec1304981
core: ppoll, futex don't like negative timeouts
2024-01-29 17:00:14 +00:00
Francesco Mazzoli
9d1a31b482
Fix another signedness mismatch
2024-01-29 16:46:05 +00:00
Miroslav Crnic
e543665f8f
core: SPSC support timeout in pull
2024-01-29 16:06:31 +00:00
Francesco Mazzoli
2a326f7c5f
Fix usual signedness shenanigans 🥱
2024-01-29 16:05:19 +00:00
Francesco Mazzoli
0a6a0c8f24
Process CDC timeouts in a timely manner
2024-01-29 15:08:06 +00:00
Francesco Mazzoli
2a6feb6df5
Patch RocksDB to make it compile with clang 15.
2024-01-29 14:15:29 +00:00
Francesco Mazzoli
8c0c246348
More robust detection of file vs. device errors
...
Just check whether we're also unable to count the blocks for the disk;
if we can still count them, assume it's a single file error.
Of course there will be a time period where we will not have detected
the bad disk when counting the blocks (a few minutes at most), but
that's OK -- the scrubber will scrub blocks for that period, and then
stop.
Once <internal-repo/issues/65#issuecomment-24747>
is done, we should use whatever error detection we use for migration
to also distinguish between these errors.
2024-01-22 13:18:53 +00:00
Francesco Mazzoli
b6cf2b67a6
Distribute block services from shuckle
...
This is in preparation for #44, but more immediately, to better
stop writing to full block services.
The previous strategy of setting a flag was flawed since once
the flag was set it stayed set -- i.e. we would not remove it once
files were deleted. This consideration should just be integrated
into distributing the block services.
2024-01-16 16:17:27 +00:00
Francesco Mazzoli
d569bdb494
Re-introduce thread names (they got lost in a refactor)
2024-01-11 17:32:52 +00:00