Commit Graph

92 Commits

Author SHA1 Message Date
Joshua Leahy
7a4e466ac6 Make TernFS open source 2025-09-17 18:20:23 +01:00
Miroslav Crnic
8c75dd0d89 registry: changes to core/messages 2025-09-17 09:07:14 +00:00
Miroslav Crnic
92d25d04da shuckle: rename to registry prepare for replace 2025-09-08 08:59:08 +00:00
Francesco Mazzoli
110705db8d EggsFS -> TernFS rename
Things not done because probably disruptive:

* kmod filesystem string
* sysctl/debugfs/trace
* metrics names
* xmon instance names

Some of these might be renamed too, but starting with a relatively
safe set.
2025-09-03 09:29:53 +01:00
Miroslav Crnic
a70a1c4d6a msgs: prepare for adding location to blockservices 2025-06-26 12:55:00 +00:00
Saulius Grusnys
f6d8fec49a add endpoint to update blockservice path (#444) 2025-05-14 17:20:27 +01:00
Miroslav Crnic
8cedd17e6e msgs: deprecate AllBlockServices to add location 2025-05-14 08:34:21 +00:00
Miroslav Crnic
c53af171e5 make scratch file gc-able on release
* shard: support ScrapTransientFile

* scratch: scrap file on release
2025-03-18 12:49:44 +00:00
Miroslav Crnic
25b2cd965e shard: transient file deadline part of entry 2025-03-18 10:03:08 +00:00
Miroslav Crnic
6948f36bc7 shard: support multiple locations in operations 2024-12-02 09:47:48 +00:00
Miroslav Crnic
f931e3c0d5 msgs: remove ConverBlockReq/Resp 2024-12-02 08:16:44 +00:00
Miroslav Crnic
5726a2e308 shuckle: assign writable services per location + messages cleanup 2024-11-28 15:42:44 +00:00
Miroslav Crnic
637543f0a0 shard: enforce no duplicate failure domains 2024-11-25 17:57:57 +00:00
Miroslav Crnic
1a47089b3d shard: proxy read/write 2024-11-17 16:38:43 +00:00
Miroslav Crnic
5f24b43184 shuckle: support locations 2024-11-14 09:26:44 +00:00
Miroslav Crnic
75dfd723c0 shuckle: fix ClearCdcInfoReq name 2024-09-17 10:05:46 +00:00
Miroslav Crnic
b2ea95091a shuckle: support cdc replica moving across hosts 2024-09-16 17:31:47 +01:00
Miroslav Crnic
59fc480e85 shuckle: remove unused requests 2024-09-16 15:21:06 +01:00
Miroslav Crnic
8ac93a4c54 shuckle: add location for all services 2024-09-11 16:59:19 +01:00
Miroslav Crnic
9cd425d7f3 eggsblocks/kmod: add file_id to FetchBlockWithCrcReq 2024-08-22 14:11:01 +01:00
Miroslav Crnic
49bd2e6a2a eggsblocks: conversion as a separate request 2024-08-21 15:39:11 +01:00
Miroslav Crnic
73622ce637 eggsblocks: write/read from new block format with crc after page 2024-08-20 14:55:45 +01:00
Miroslav Crnic
cf40e318ec shuckle: support BlockServicesWithFlagChangeReq 2024-07-24 10:08:01 +01:00
Miroslav Crnic
a41a4b7482 shuckle: drop BlockServiceInfoWithoutFlagsLastChanged 2024-07-23 15:40:44 +01:00
Miroslav Crnic
49723653f8 shuckle: BlockServiceInfo backward compatibility
* shuckle: rename BlockServiceInfo to BlockServiceInfoWithoutFlagsLastChanged

* shuckle: handle AllBlockServices
2024-07-23 13:10:57 +01:00
Miroslav Crnic
e2bfb15c5f blockservice: add BlockFetchWithCrc 2024-07-12 14:24:37 +01:00
Miroslav Crnic
3195d39d9d stats: fully remove everywhere 2024-07-09 15:22:10 +00:00
Miroslav Crnic
f3b7ef4d94 eggsgc: destroy decommissioned blocks through shuckle 2024-07-02 09:52:20 +00:00
Miroslav Crnic
2cd15fc0be core: various protocol changes 2024-06-13 09:13:11 +01:00
Miroslav Crnic
1f145c030e shard/cdc: support snapshoting 2024-05-23 10:17:59 +01:00
Miroslav Crnic
f11b675807 shuckle: add cdc replicas to page 2024-05-22 11:57:34 +00:00
Francesco Mazzoli
6faa917c18 Add endpoint and cli util to resurrect files
Only works in the same shard, for now.
2024-05-20 12:06:15 +00:00
Miroslav Crnic
8a0ea10cde core: UDPSocketPair and use IpPort AddrsInfo everywhere
* core: UDPSocketPair and use IpPort AddrsInfo everywhere

* Refactor UDPSocketPair a bit

* ci: kmod always delete img before create

* shuckle: fix scripts/json marshal

---------

Co-authored-by: Francesco Mazzoli <francesco.mazzoli@xtxmarkets.com>
2024-05-03 11:32:07 +01:00
Francesco Mazzoli
cd8e52f8f7 Remove assertions in ShardDB
We got a crash because of it (presumably can happen if defrag
conflicts with migrate or something like that)
2024-05-01 08:13:19 +00:00
Francesco Mazzoli
d3be7bf53a Remove old-style register block service request 2024-04-22 19:20:04 +00:00
Francesco Mazzoli
f109e3542b Have eggsblocks to refresh decommissioned block services
So that we can reliably ignore stale block services in GC (done in
a future commit). To enable this and future-proof this kind of
mechanism (e.g. having `eggsblocks` to mark something as D itself)
I added a new way to register the block service that lets you mask
which flags you're checking. I'll remove the old way once we've
rolled out everywhere.
2024-04-22 18:47:54 +00:00
Miroslav Crnic
43f69b1f7e shuckle: support ClearShardInfoReq/Resp 2024-04-16 10:25:24 +01:00
Miroslav Crnic
a579b41dfc shuckle: support for MoveLeaderReq 2024-04-15 14:24:15 +01:00
Francesco Mazzoli
e42c548777 Make SwapSpans idempotent 2024-04-09 07:53:10 +01:00
Francesco Mazzoli
4dd929a798 Implement swap spans 2024-04-09 07:53:10 +01:00
Saulius Grusnys
fd9079febf Rate limited shuckle endpoint to decom blockservices 2024-03-20 15:16:00 +00:00
Francesco Mazzoli
b12cdf7507 Add replicas info to shuckle web ui 2024-03-19 15:55:18 +00:00
Francesco Mazzoli
005121bcac Spin block service cache out of ShardDB
This started being a problem since the block service update log
entry does not fit in a UDP packet (it's like 100KB). I think this
approach makes more sense anyway. See comment for `getCache()` for
gotchas.
2024-03-13 11:29:58 +00:00
Miroslav Crnic
b240de53b5 shard: distributed log implementation and shard can use it with a flag set 2024-03-12 11:02:04 +00:00
Saulius Grusnys
796e46f466 shuckle to track if blockservices have any files on them (currently t… (#177)
* shuckle to track if blockservices have any files on them (currently there is issue with transient files)
2024-02-20 08:10:51 +00:00
Miroslav Crnic
38707535e3 shuckle: support metadata replication 2024-02-07 13:57:00 +00:00
Francesco Mazzoli
8c0c246348 More robust detection of file vs. device errors
Just check if we're also unable to count the blocks for the disk,
and if yes, assume it's a single file error.

Of course there will be a time period where we will not have detected
the bad disk when counting the blocks (a few minutes at most), but
that's OK -- the scrubber will scrub blocks for that period, and then
stop.

Once <internal-repo/issues/65#issuecomment-24747>
is done, we should use whatever error detection we use for migration
to also distinguish between these errors.
2024-01-22 13:18:53 +00:00
Francesco Mazzoli
b6cf2b67a6 Distribute block services from shuckle
This is in preparation for #44, but more immediately, to better
stop writing to full block services.

The previous strategy of setting a flag was flawed since once
the flag was set it stayed set -- i.e. we would not remove it once
files would be deleted.  This consideration should just be integrated
in distributing the block services.
2024-01-16 16:17:27 +00:00
Francesco Mazzoli
788b5eed57 Fill in current block services before applying the log
It makes a lot more sense to pick outside, given that it involves
randomness. Also, this is in preparation for shuckle picking them
in a smarter way.
2023-12-09 15:20:24 +00:00
Francesco Mazzoli
53049d5779 Shard batch writes, use batch UDP syscalls
The idea is to drain the socket and do a single RocksDB WAL
write/fsync for all the write requests we have found.

The read requests are immediately executed. The reasoning here is
that currently write requests are _a lot_ slower than the read
requests because fsyncing takes ~500us on fsf1. In the future this
might change.

Since we're at it, we also use batch UDP syscalls in the CDC.

Fixes #119.
2023-12-07 14:29:07 +00:00