Commit Graph

93 Commits

Author SHA1 Message Date
Francesco Mazzoli
7a7a43ff14 Streamline block reads, do CRC more on the fly.
This is in preparation for a deeper refactor of reading from Go
in general. The big difference from before is that we now have
CRCs for every single page.
2025-10-01 14:14:44 +01:00
Joshua Leahy
7a4e466ac6 Make TernFS open source 2025-09-17 18:20:23 +01:00
Miroslav Crnic
8c75dd0d89 registry: changes to core/messages 2025-09-17 09:07:14 +00:00
Miroslav Crnic
92d25d04da shuckle: rename to registry prepare for replace 2025-09-08 08:59:08 +00:00
Francesco Mazzoli
110705db8d EggsFS -> TernFS rename
Things not done because probably disruptive:

* kmod filesystem string
* sysctl/debugfs/trace
* metrics names
* xmon instance names

Some of these might be renamed too, but starting with a relatively
safe set.
2025-09-03 09:29:53 +01:00
Miroslav Crnic
a70a1c4d6a msgs: prepare for adding location to blockservices 2025-06-26 12:55:00 +00:00
Saulius Grusnys
f6d8fec49a add endpoint to update blockservice path (#444) 2025-05-14 17:20:27 +01:00
Miroslav Crnic
8cedd17e6e msgs: deprecate AllBlockServices to add location 2025-05-14 08:34:21 +00:00
Miroslav Crnic
c53af171e5 make scratch file gc-able on release
* shard: support ScrapTransientFile

* scratch: scrap file on release
2025-03-18 12:49:44 +00:00
Miroslav Crnic
25b2cd965e shard: transient file deadline part of entry 2025-03-18 10:03:08 +00:00
Miroslav Crnic
6948f36bc7 shard: support multiple locations in operations 2024-12-02 09:47:48 +00:00
Miroslav Crnic
f931e3c0d5 msgs: remove ConverBlockReq/Resp 2024-12-02 08:16:44 +00:00
Miroslav Crnic
5726a2e308 shuckle: assign writable services per location + messages cleanup 2024-11-28 15:42:44 +00:00
Miroslav Crnic
637543f0a0 shard: enforce no duplicate failure domains 2024-11-25 17:57:57 +00:00
Miroslav Crnic
1a47089b3d shard: proxy read/write 2024-11-17 16:38:43 +00:00
Miroslav Crnic
5f24b43184 shuckle: support locations 2024-11-14 09:26:44 +00:00
Miroslav Crnic
75dfd723c0 shuckle: fix ClearCdcInfoReq name 2024-09-17 10:05:46 +00:00
Miroslav Crnic
b2ea95091a shuckle: support cdc replica moving across hosts 2024-09-16 17:31:47 +01:00
Miroslav Crnic
59fc480e85 shuckle: remove unused requests 2024-09-16 15:21:06 +01:00
Miroslav Crnic
8ac93a4c54 shuckle: add location for all services 2024-09-11 16:59:19 +01:00
Miroslav Crnic
9cd425d7f3 eggsblocks/kmod: add file_id to FetchBlockWithCrcReq 2024-08-22 14:11:01 +01:00
Miroslav Crnic
49bd2e6a2a eggsblocks: conversion as a separate request 2024-08-21 15:39:11 +01:00
Miroslav Crnic
73622ce637 eggsblocks: write/read from new block format with crc after page 2024-08-20 14:55:45 +01:00
Miroslav Crnic
cf40e318ec shuckle: support BlockServicesWithFlagChangeReq 2024-07-24 10:08:01 +01:00
Miroslav Crnic
a41a4b7482 shuckle: drop BlockServiceInfoWithoutFlagsLastChanged 2024-07-23 15:40:44 +01:00
Miroslav Crnic
49723653f8 shuckle: BlockServiceInfo backward compatibility
* shuckle: rename BlockServiceInfo to BlockServiceInfoWithoutFlagsLastChanged

* shuckle: handle AllBlockServices
2024-07-23 13:10:57 +01:00
Miroslav Crnic
e2bfb15c5f blockservice: add BlockFetchWithCrc 2024-07-12 14:24:37 +01:00
Miroslav Crnic
3195d39d9d stats: fully remove everywhere 2024-07-09 15:22:10 +00:00
Miroslav Crnic
f3b7ef4d94 eggsgc: destroy decommissioned blocks through shuckle 2024-07-02 09:52:20 +00:00
Miroslav Crnic
2cd15fc0be core: various protocol changes 2024-06-13 09:13:11 +01:00
Miroslav Crnic
1f145c030e shard/cdc: support snapshotting 2024-05-23 10:17:59 +01:00
Miroslav Crnic
f11b675807 shuckle: add cdc replicas to page 2024-05-22 11:57:34 +00:00
Francesco Mazzoli
6faa917c18 Add endpoint and cli util to resurrect files
Only works in the same shard, for now.
2024-05-20 12:06:15 +00:00
Miroslav Crnic
8a0ea10cde core: UDPSocketPair and use IpPort AddrsInfo everywhere
* core: UDPSocketPair and use IpPort AddrsInfo everywhere

* Refactor UDPSocketPair a bit

* ci: kmod always delete img before create

* shuckle: fix scripts/json marshal

---------

Co-authored-by: Francesco Mazzoli <francesco.mazzoli@xtxmarkets.com>
2024-05-03 11:32:07 +01:00
Francesco Mazzoli
cd8e52f8f7 Remove assertions in ShardDB
We got a crash because of them (presumably this can happen if defrag
conflicts with migrate or something like that)
2024-05-01 08:13:19 +00:00
Francesco Mazzoli
d3be7bf53a Remove old-style register block service request 2024-04-22 19:20:04 +00:00
Francesco Mazzoli
f109e3542b Have eggsblocks refresh decommissioned block services
So that we can reliably ignore stale block services in GC (done in
a future commit). To enable this and future-proof this kind of
mechanism (e.g. having `eggsblocks` mark something as D itself)
I added a new way to register the block service that lets you mask
which flags you're checking. I'll remove the old way once we've
rolled out everywhere.
2024-04-22 18:47:54 +00:00
Miroslav Crnic
43f69b1f7e shuckle: support ClearShardInfoReq/Resp 2024-04-16 10:25:24 +01:00
Miroslav Crnic
a579b41dfc shuckle: support for MoveLeaderReq 2024-04-15 14:24:15 +01:00
Francesco Mazzoli
e42c548777 Make SwapSpans idempotent 2024-04-09 07:53:10 +01:00
Francesco Mazzoli
4dd929a798 Implement swap spans 2024-04-09 07:53:10 +01:00
Saulius Grusnys
fd9079febf Rate limited shuckle endpoint to decom blockservices 2024-03-20 15:16:00 +00:00
Francesco Mazzoli
b12cdf7507 Add replicas info to shuckle web ui 2024-03-19 15:55:18 +00:00
Francesco Mazzoli
005121bcac Spin block service cache out of ShardDB
This started being a problem since the block service update log
entry does not fit in a UDP packet (it's like 100KB). I think this
approach makes more sense anyway. See comment for `getCache()` for
gotchas.
2024-03-13 11:29:58 +00:00
Miroslav Crnic
b240de53b5 shard: distributed log implementation and shard can use it with a flag set 2024-03-12 11:02:04 +00:00
Saulius Grusnys
796e46f466 shuckle to track if blockservices have any files on them (currently t… (#177)
* shuckle to track if blockservices have any files on them (currently there is an issue with transient files)
2024-02-20 08:10:51 +00:00
Miroslav Crnic
38707535e3 shuckle: support metadata replication 2024-02-07 13:57:00 +00:00
Francesco Mazzoli
8c0c246348 More robust detection of file vs. device errors
Just check if we're also unable to count the blocks for the disk,
and if yes, assume it's a single file error.

Of course there will be a time period where we will not have detected
the bad disk when counting the blocks (a few minutes at most), but
that's OK -- the scrubber will scrub blocks for that period, and then
stop.

Once <internal-repo/issues/65#issuecomment-24747>
is done, we should use whatever error detection we use for migration
to also distinguish between these errors.
2024-01-22 13:18:53 +00:00
Francesco Mazzoli
b6cf2b67a6 Distribute block services from shuckle
This is in preparation for #44, but more immediately, to better
stop writing to full block services.

The previous strategy of setting a flag was flawed since once
the flag was set it stayed set -- i.e. we would not remove it once
files were deleted. This consideration should just be integrated
into distributing the block services.
2024-01-16 16:17:27 +00:00
Francesco Mazzoli
788b5eed57 Fill in current block services before applying the log
It makes a lot more sense to pick them outside of log application,
given that picking involves randomness. Also, this is in preparation
for shuckle picking them in a smarter way.
2023-12-09 15:20:24 +00:00