Miroslav Crnic
aebcce4017
logsdb: fix assert for last released going backwards
2024-03-25 10:31:58 +00:00
Francesco Mazzoli
6182188511
Split out stale alerts, make them daytime
2024-03-22 11:59:10 +00:00
Francesco Mazzoli
c143e48841
Add command to eggscli to read out kernel metrics
2024-03-22 11:34:28 +00:00
Francesco Mazzoli
b903875459
Set new creation time when renaming
...
I thought this would not be necessary due to the fact that
we'd fill it in after revalidation, but we did encounter some
cases where this does not seem to happen.
2024-03-21 20:12:56 +00:00
Francesco Mazzoli
48f9123a5a
Adjust attempts check, now we start from 1
2024-03-21 17:04:04 +00:00
Francesco Mazzoli
4b0dd25bdc
Correctly record attempts in eggsfs_metadata_request
...
This got lost in the `net.c` refactor, and it caused the recovery
mechanism on repeated requests to fail.
2024-03-21 14:19:54 +00:00
Francesco Mazzoli
6b382c044b
Fix race in async getattr
2024-03-21 14:10:54 +00:00
Francesco Mazzoli
3f4988bb32
Some more metadata debug logging
2024-03-21 14:08:59 +00:00
Saulius Grusnys
2157833680
explain why ftruncate needs to be disabled
2024-03-21 09:24:21 +00:00
Saulius Grusnys
8565726989
sysctl param to disable ftruncate support
2024-03-21 09:24:21 +00:00
Francesco Mazzoli
43e6c940b3
Make sure we have size information wherever we need it
2024-03-20 19:43:31 +00:00
Francesco Mazzoli
be2d604d96
Stable shuckle alerts
2024-03-20 17:08:25 +00:00
Miroslav Crnic
7df0a5da89
shard: cli options now match migration phases for LogsDB, and support manual failover
2024-03-20 15:34:55 +00:00
Saulius Grusnys
6f816fb319
improve logging
2024-03-20 15:20:32 +00:00
Saulius Grusnys
fd9079febf
Rate limited shuckle endpoint to decom blockservices
2024-03-20 15:16:00 +00:00
Francesco Mazzoli
1cf299bfac
Use atomics where appropriate
2024-03-20 13:21:18 +00:00
Francesco Mazzoli
f85714dbba
Use pthread_self() to get pthread thread id
2024-03-20 13:11:14 +00:00
Francesco Mazzoli
d512e8d281
Escape file name in backlinks
2024-03-20 13:00:28 +00:00
Francesco Mazzoli
3a6e498664
Make some Loop methods static
2024-03-20 13:00:18 +00:00
Francesco Mazzoli
9bc7e209e4
Safer ShuckleSock
2024-03-20 11:33:39 +00:00
Francesco Mazzoli
66fe0a2621
Correct pthread_timedjoin_np handling
2024-03-20 11:13:26 +00:00
Francesco Mazzoli
8f1ba6361b
Resist interruptions when joining threads
2024-03-20 10:32:42 +00:00
Francesco Mazzoli
66ccba6124
Forward termination signal to main thread
2024-03-20 10:32:42 +00:00
Francesco Mazzoli
488f096eb9
Stat files/directories speculatively on readdir
...
Also, split the timeouts for dentries and for stats. We generally
don't care if stats are out of date, but dentries should be up
to date.
The code leaves several things to be desired:
* No attempt is made to only send stats when needed -- it is always
done. It might be a good idea to instead wait for the first two
stats to come back.
* There's quite a bit of code duplication.
* It's pretty wasteful to have so many different packets for the
stats. It'd be much better to pack multiple requests and multiple
responses in single packets.
This could be done simply by allowing many requests to come
in the same packet (just one after the other would be fine),
and same for the responses. We can still use the protocol and
request id to keep track of things anyway.
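A minimal sketch of the packing idea above, not the actual wire format. The framing (a u64 request id and a u16 length in front of each payload), the 1472-byte budget, and the names `pack_requests` and `unpack_requests` are all assumptions made for illustration. Requests are laid out one after the other, and the request id is what lets the receiver match each one up again.

```python
import struct

# Hypothetical framing, NOT the real protocol:
# u64 request id, u16 payload length (little-endian), then the payload bytes.
HEADER = struct.Struct("<QH")
MAX_PACKET = 1472  # assumed UDP payload budget for a 1500-byte MTU


def pack_requests(requests, max_packet=MAX_PACKET):
    """Greedily group (request_id, payload) pairs into packets of at most max_packet bytes."""
    packets = []
    current = bytearray()
    for request_id, payload in requests:
        frame = HEADER.pack(request_id, len(payload)) + payload
        if len(frame) > max_packet:
            raise ValueError("single request does not fit in a packet")
        # Start a new packet when this frame would overflow the current one.
        if current and len(current) + len(frame) > max_packet:
            packets.append(bytes(current))
            current = bytearray()
        current += frame
    if current:
        packets.append(bytes(current))
    return packets


def unpack_requests(packet):
    """Split one packet back into its (request_id, payload) pairs."""
    out = []
    offset = 0
    while offset < len(packet):
        if len(packet) - offset < HEADER.size:
            raise ValueError("truncated header")
        request_id, length = HEADER.unpack_from(packet, offset)
        offset += HEADER.size
        if len(packet) - offset < length:
            raise ValueError("truncated payload")
        out.append((request_id, packet[offset:offset + length]))
        offset += length
    return out
```

The same framing would work for responses, since each frame carries its own request id.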
2024-03-19 20:29:23 +00:00
Miroslav Crnic
c25cb696b4
shard: remove protection that only replica 0 can be leader
2024-03-19 16:29:36 +00:00
Francesco Mazzoli
b12cdf7507
Add replicas info to shuckle web ui
2024-03-19 15:55:18 +00:00
Francesco Mazzoli
abd7131e88
Fix BlockServicesCacheDB init
2024-03-19 15:26:19 +00:00
Miroslav Crnic
37539e1c5e
eggsdbtools: reduce logging, output stats
2024-03-19 15:15:49 +00:00
Miroslav Crnic
938c845a30
eggsdbtool: cli for shard db comparison
2024-03-19 15:00:01 +00:00
Francesco Mazzoli
6d9da0e595
Remove all remnants of block service cache in ShardDB
...
The previous code was pretty nasty: it reached into the `ShardDB`
column family from another class. All those keys have been deleted
anyway in production.
2024-03-19 14:27:33 +00:00
Miroslav Crnic
a4c091c7b2
logsdb: log state at flush to have consistent view
2024-03-19 12:44:56 +00:00
Miroslav Crnic
5ce2efb88b
shard: increase number of requests processed in loop when LogsDB is on
2024-03-18 18:06:19 +00:00
Miroslav Crnic
096b9cbe6a
logsdb: fix for replication path
2024-03-18 17:29:49 +00:00
Miroslav Crnic
0b7d1c30d3
shard: turn on replication writes
2024-03-18 14:19:50 +00:00
Miroslav Crnic
dfcabdba97
LogsDB: tweak catchup timeout
2024-03-18 12:00:27 +00:00
Miroslav Crnic
c8cda7e4db
logsdb: periodically log status
2024-03-18 09:44:47 +00:00
Miroslav Crnic
72c1acaea8
xmon: if too many alerts initialize appType to _parent
2024-03-15 19:39:41 +00:00
Miroslav Crnic
27faaa45ae
ci: add ability to run with LogsDB, shard: add handling of LogsDB messages
2024-03-15 16:49:39 +00:00
Saulius Grusnys
74e81ca836
do not hit production shuckle by default from go apps
2024-03-15 08:46:07 +00:00
Saulius Grusnys
e0dc93ded1
additional metrics in eggsblocks ( #222 )
2024-03-15 05:30:44 +00:00
Francesco Mazzoli
3db003a8f6
Fix bug in BlockServicesCacheDB initialization
2024-03-13 12:07:33 +00:00
Francesco Mazzoli
3fc466f197
Fix alert formatting
2024-03-13 12:04:40 +00:00
Miroslav Crnic
ebcdcb650a
shard: add support for resetting all data in LogsDB
2024-03-13 11:33:48 +00:00
Francesco Mazzoli
005121bcac
Spin block service cache out of ShardDB
...
This started being a problem since the block service update log
entry does not fit in a UDP packet (it's around 100KB). I think this
approach makes more sense anyway. See comment for `getCache()` for
gotchas.
2024-03-13 11:29:58 +00:00
Miroslav Crnic
52cc5c01df
tests: ability to run functional tests in docker
2024-03-13 10:21:56 +00:00
Francesco Mazzoli
6968c25bc5
Allow : in metrics
2024-03-12 14:04:34 +00:00
Miroslav Crnic
13c5df0131
shard: fix name in xmon and add replica id to tag in metrics
2024-03-12 13:40:35 +00:00
Miroslav Crnic
b240de53b5
shard: add distributed log implementation, which the shard can use when a flag is set
2024-03-12 11:02:04 +00:00
Francesco Mazzoli
d5fb66b694
Test mmap in CI
2024-03-11 15:35:44 +00:00
Francesco Mazzoli
e96742c711
Implement readpage, and therefore allow mmap
2024-03-11 15:33:57 +00:00