1685 Commits

Author SHA1 Message Date
Isabella Bosia
b70fb8073d terncli: make help message more legible 2025-12-16 13:45:16 +00:00
Miroslav Crnic
d6db8c6cc8 shard: cdc lookup strongly consistent 2025-12-15 13:17:42 +00:00
Miroslav Crnic
f4ca4d226f fix build for older compiler 2025-12-15 12:56:41 +00:00
Miroslav Crnic
89b3038448 registry: split out writer thread 2025-12-15 11:35:25 +00:00
Miroslav Crnic
bfce6cbed5 registry: split out readers 2025-12-15 11:35:23 +00:00
Miroslav Crnic
768072e054 kmod: remove block service cache
With the reduced span cache time, the block service cache
is no longer needed. We also don't need to fetch
changed block services from the registry, as we'll get
them as part of span fetches.
2025-12-09 15:34:09 +00:00
Miroslav Crnic
d05a360e8b kmod: fix socket leak 2025-12-09 14:34:37 +00:00
Miroslav Crnic
24adfa4259 kmod: fix socket ref count 2025-12-09 12:48:11 +00:00
Miroslav Crnic
9960d8f0b4 ternblocks: support default DSCP and client DSCP override 2025-12-08 12:25:17 +00:00
Miroslav Crnic
6157fec043 kmod: block.c fix queue races 2025-12-08 11:06:04 +00:00
Miroslav Crnic
ae15c2dcda eggsblocks: certificate/crc/too old as metrics not alerts 2025-12-05 14:34:28 +00:00
Miroslav Crnic
abb1580708 shard/cdc: fix atomic shared_ptr usage 2025-12-02 21:24:24 +00:00
Miroslav Crnic
9164876cb6 shard: x-location wait only on CreateDirectoryInode
Waiting on each CDC operation is too expensive.
The only race we have a problem with is during directory creation,
where an observer must not see the directory before it is fully created.
This is a targeted fix until we can implement a more general solution.
2025-12-02 16:10:19 +00:00
Miroslav Crnic
ee23f44d42 shard: update leaders at other location lastSeen 2025-12-01 23:37:06 +00:00
Miroslav Crnic
de0a23090d shard: fix use after free 2025-11-28 19:07:34 +00:00
Isabella Bosia
c0039428e3 unify send recv loops (#76)
* kmod: unify recv loops

* kmod: unify send loops

* kmod: make control flow explicit
2025-11-27 11:16:33 +00:00
Miroslav Crnic
b262674e9c shard: wait other locations on CDC writes
CDC coordinates cross shard transactions.
State machines in it guarantee ordering of events.
For that ordering to be guaranteed in all locations,
we need to ensure that all locations have applied the state.
2025-11-26 20:05:36 +00:00
Miroslav Crnic
59e1f291af kmod: fix some resource leaks 2025-11-26 20:05:16 +00:00
Miroslav Crnic
5b9e1b0439 kmod: remove GFP_ATOMIC from socket
There were other issues writing from sk_write_space callback, but this
was not cleaned up when writing from sk_write_space was removed.
2025-11-18 09:17:12 +00:00
Copilot
c8cf261bc9 ternblocks: Decide read-ahead based on storage class, skip fadvise for FLASH
Remove -read-whole-file flag and decide read-ahead behavior based on
storage class: always read ahead for HDD, never for FLASH. Only issue
fadvise syscall when reading ahead to save unnecessary syscalls for FLASH
storage.

Fixes #47

Co-authored-by: mcrnic <miroslav.crnic@xtxmarkets.com>
2025-11-17 17:12:40 +00:00
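The read-ahead rule described in this commit can be sketched as a small decision function. This is a minimal illustration, not the ternblocks code: the names `StorageClass`, `HDD`, `Flash`, `readPlan`, and `planRead` are hypothetical, and treating any other storage class like FLASH is an assumption.

```go
package main

// StorageClass is a hypothetical stand-in for the block service storage class.
type StorageClass int

const (
	HDD StorageClass = iota
	Flash
)

// readPlan records what a read should do; both fields are illustrative.
type readPlan struct {
	ReadAhead    bool // read beyond the requested range
	IssueFadvise bool // issue the fadvise syscall before reading
}

// planRead applies the rule from the commit message: always read ahead
// for HDD, never for FLASH, and only issue fadvise when reading ahead.
// Assumption: unknown storage classes are handled like FLASH.
func planRead(sc StorageClass) readPlan {
	readAhead := sc == HDD
	return readPlan{
		ReadAhead:    readAhead,
		IssueFadvise: readAhead, // skipping it avoids a pointless syscall on FLASH
	}
}
```

Tying the fadvise decision to the read-ahead decision is what removes the need for a separate flag.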
Miroslav Crnic
8db4389872 terncli: estimate-file-age 2025-11-17 16:40:00 +00:00
Miroslav Crnic
d5c40f7e74 span.go write blocks in parallel 2025-11-17 10:43:28 +00:00
Copilot
045e9adb8a cdc: Fix various RenameDirectory issues
The RenameDirectory state machine was not handling "target not found" correctly.
This would have caused asserts (which result in crashes in production builds).
There was also a bug in the rollback logic which would have left a lingering
lock on the source link. Although it broke assumptions, this bug was benign:
any operation on that directory would try to acquire this lock again, and it
would succeed because lock requests are idempotent.
2025-11-13 15:09:34 +00:00
Miroslav Crnic
b110a7cb38 cdc: ignore unknown tags 2025-11-12 13:18:53 +00:00
Miroslav Crnic
2d7abe35b4 shard: support wait for state applied req 2025-11-12 09:12:07 +00:00
Miroslav Crnic
01cee15980 kmod: fix unsafe span rb tree erase 2025-11-10 13:41:36 +00:00
Miroslav Crnic
5b1a1351e2 kmod: cache inline spans indefinitely 2025-11-07 13:37:47 +00:00
Miroslav Crnic
844bc9adcc client: fix deadlock in fetchRsSpan 2025-11-07 12:26:40 +00:00
Miroslav Crnic
0436fe878c kmod: configurable span cache retention 2025-11-07 10:26:10 +00:00
Miroslav Crnic
9faa523871 ternclient: fix fetching mirrored span 2025-10-31 11:22:35 +00:00
Miroslav Crnic
2ed670a907 udpSocketPair: spread across sockets better
We were using the last 2 bits of the time, but the clock could be low precision.
Hashing the time is better here.
2025-10-27 15:30:16 +00:00
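A sketch of why hashing helps here, under stated assumptions: the commit does not say which hash is used, so the splitmix64 finalizer below is a swapped-in choice, and `mix64`, `pickSocket`, and `pickSocketNow` are hypothetical names. With a coarse clock the low 2 bits of the timestamp can be constant, so `t & 3` always lands on the same socket, while a mixing hash lets the higher bits influence the choice.

```go
package main

import "time"

// mix64 is the splitmix64 finalizer, used here as an example mixing hash.
// Assumption: the real code may use a different hash.
func mix64(x uint64) uint64 {
	x += 0x9e3779b97f4a7c15
	x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9
	x = (x ^ (x >> 27)) * 0x94d049bb133111eb
	return x ^ (x >> 31)
}

// pickSocket maps a timestamp onto one of n sockets by hashing it first,
// instead of taking its low bits directly.
func pickSocket(nowNanos uint64, n int) int {
	if n <= 0 {
		return 0
	}
	return int(mix64(nowNanos) % uint64(n))
}

// pickSocketNow picks a socket index based on the current time.
func pickSocketNow(n int) int {
	return pickSocket(uint64(time.Now().UnixNano()), n)
}
```

For example, timestamps that only advance in steps of 1024 ns all have zero low bits, yet `pickSocket` still distributes them over several sockets.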
Miroslav Crnic
398af8d3cd kmod: memcmp ternfs_block_service 2025-10-27 10:50:10 +00:00
Francesco Mazzoli
02891b6863 Use mimalloc in release and alpine builds
This should make the alpine build usable in production.
2025-10-27 10:20:15 +00:00
Miroslav Crnic
b744242b5a kmod: don't compare block service padding when upserting 2025-10-27 10:19:37 +00:00
Miroslav Crnic
f95775e614 shard: metrics as simple counters 2025-10-24 16:57:23 +01:00
Miroslav Crnic
3e4652eec3 migrate: num-migrations-per-shard -> num-file-migrators 2025-10-24 16:06:58 +01:00
Miroslav Crnic
d4cb2d50cb kmod: need to check if ok to splice page 2025-10-21 22:53:59 +01:00
Miroslav Crnic
7758eb8938 kmod: synchronously fetch policy on dir inode lookup 2025-10-19 12:14:46 +01:00
Miroslav Crnic
d96abd3083 minor fixes 2025-10-16 17:12:37 +00:00
Miroslav Crnic
6bfd89dec7 options: parse -syslog 2025-10-16 14:32:37 +00:00
Francesco Mazzoli
c1e3fa9807 Add way to specify ubuntu build image 2025-10-16 14:01:06 +00:00
Miroslav Crnic
924e75674f shard: support multiple reader threads 2025-10-16 12:39:11 +01:00
Miroslav Crnic
8cec8bcf6b kmod: delete files immediately if policy allows 2025-10-15 22:58:58 +01:00
Francesco Mazzoli
7d92031472 Ignore errors when deleting files immediately (see comment) 2025-10-15 09:41:52 +01:00
Francesco Mazzoli
ee672c0d17 Delete files immediately when policy allows it
We should do the same on edge overwrite, but it's a bit more annoying
2025-10-15 09:41:52 +01:00
Miroslav Crnic
ffe3416f16 kmod: minor write path fixes
* kmod: minor write path fixes

We didn't actually see these happen in production.

Fix 1:
From the kernel code it looks like copy_page_from_iter cannot return 0 in
normal cases, but our code should still cover the case if this changes in
the future.

Fix 2:
-ENOMEM was another error where we could write things partially, in which
case we would not return written and would end up at the wrong offset.
It's simpler to just return written if we managed to write anything
and surface the error on a subsequent call, in which we will fail early.

* kmod: add BUG_ON for unexpected span pages
2025-10-10 16:03:43 +01:00
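The two write-path fixes above follow the usual short-write convention, which can be sketched in user-space Go. This is an illustration only, not the kernel module code: `writeChunks`, `writeOne`, and `errNoMem` are hypothetical names, and stopping on a short copy is an assumption about how the zero-return case is handled.

```go
package main

import "errors"

// errNoMem stands in for -ENOMEM in this sketch.
var errNoMem = errors.New("out of memory")

// writeChunks writes chunks in order using writeOne and returns the total
// number of bytes written. If an error occurs after some bytes were written,
// it reports the progress and no error, so the caller's offset stays correct;
// the caller's next call then fails early and surfaces the error.
func writeChunks(chunks [][]byte, writeOne func([]byte) (int, error)) (int, error) {
	written := 0
	for _, c := range chunks {
		n, err := writeOne(c)
		written += n
		if err != nil {
			if written > 0 {
				return written, nil // partial progress wins over the error
			}
			return 0, err // nothing written: surface the error now
		}
		if n < len(c) {
			break // short (possibly zero) copy: stop instead of looping
		}
	}
	return written, nil
}
```

Returning the partial count first keeps the file offset consistent with what was actually written.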
Francesco Mazzoli
29af79e8a8 Fix FUSE options 2025-10-09 22:53:25 +01:00
Francesco Mazzoli
e724223228 Do not ignore changes to the workflow file itself when doing CI 2025-10-09 21:05:28 +01:00
Francesco Mazzoli
641cc89d12 bpf building, take 2 2025-10-09 21:04:33 +01:00
Francesco Mazzoli
3a11d19b4e Allow to direct mount in FUSE 2025-10-09 10:26:51 +00:00