102 Commits

Author SHA1 Message Date
Francesco Mazzoli 283f3508b9 Add binary /api endpoint, use it to draw histograms
This makes /stats _a lot_ faster.
2023-07-18 12:34:57 +00:00
Francesco Mazzoli 2b1b1a1c15 Insert stats when shutting down 2023-07-17 12:27:07 +00:00
Francesco Mazzoli dcb76a86c2 Fix _hours operator 2023-07-17 12:26:49 +00:00
Francesco Mazzoli 3cc7310a6e Add histograms for all components in /stats 2023-07-17 08:56:09 +00:00
Francesco Mazzoli 2f7be11e29 Add query for single block service in shuckle
I thought I might need it for some upcoming migration improvements,
I probably don't, but still kinda nice to have.
2023-07-13 09:46:37 +00:00
Francesco Mazzoli 2f1385445b Tighten up the mtime story for transient files 2023-07-12 12:52:50 +00:00
Francesco Mazzoli d93df7ef42 Make tests pass for now 2023-07-12 12:22:40 +01:00
Francesco Mazzoli 53598c2fe9 Allow to re-open files as writing if we're already writing them
This makes `cp` work
2023-07-12 12:22:40 +01:00
Francesco Mazzoli 65174341a0 Drop MM after flushing out a transient file 2023-07-12 12:22:40 +01:00
Francesco Mazzoli fe88efb1ce Remove UB in xmon code 2023-07-11 14:15:33 +00:00
Francesco Mazzoli ff9306f6e3 Add Xmon support to C++ code 2023-07-11 12:13:22 +00:00
Francesco Mazzoli d5fea6c08c Retry when block services are unavailable in kmod 2023-07-06 19:39:12 +01:00
Saulius Grusnys 0360ec85cf Switch cutoff time to blockservice to 1h and set the deadline in shard to 2 2023-07-06 13:28:12 +01:00
Francesco Mazzoli 1a4301a499 Simplify go span read/write code, make it work with broken block services
And some other assorted changes.
2023-07-04 08:05:42 +00:00
Francesco Mazzoli 4e0e6fe8a8 Configurable CDC shard timeout
Running in valgrind seems to just not be able to process a small
FullReadDirReq in 100ms, which is a bit concerning, but I'll let
it slide for now.
2023-07-04 08:05:42 +00:00
Francesco Mazzoli 87d0e69f85 Port kmod to new FullReadDir request 2023-07-04 08:05:42 +00:00
Francesco Mazzoli f0add4d926 Remove C++ varint code, we don't use varints anymore 2023-07-04 08:05:42 +00:00
Francesco Mazzoli e2dcd43fea Fix bug in CreateLockedCurrentEdge logic
See comment in `msgs.go`. This would normally have required
entirely new transactions, but since we're not in production yet
I'm going to just change the schema and wipe the current FS.

This also adds in an unrelated change regarding more flexible
blacklisting, which will be required for some additional testing
I'm preparing.
2023-07-04 08:05:42 +00:00
Francesco Mazzoli 0f114623f3 Just use unix nanos for eggs times
This was bugging me for a while, but the final straw was that if
one wants to use the max time (for example to look backwards when
traversing edges), you cannot trivially convert from one to the
other, since you'd overflow. So you can't (for instance) trivially
convert from eggs time to `time.Time` in go.

The main disadvantage is that we lose ~50 of the ~600 years
representable with nanoseconds. But I think that's fine.
2023-07-04 08:05:42 +00:00
Francesco Mazzoli dd78912c0c More stuff as debug 2023-06-18 12:50:05 +00:00
Francesco Mazzoli c328cca75b Fix shard bug when returning from idempotent locked edge creation 2023-06-16 15:20:40 +00:00
Francesco Mazzoli 016c4bf162 First GH workflow attempt 2023-06-15 15:56:34 +00:00
Francesco Mazzoli 444ffba63f Propagate BS flags 2023-06-15 13:53:40 +00:00
Francesco Mazzoli e26eeaede1 Add "mtu" field to requests that benefit from it
Not used right now, but this way we can easily start stuffing more
data in responses.

I also split off some arguments in `NewClient`, unrelated change
(I wanted to pair the MTU with a single client, but I then realized
that it's enough to have it as some global property for now).
2023-06-15 11:57:05 +00:00
Francesco Mazzoli d4715ea11d Add flags to block services in shards 2023-06-14 14:10:16 +00:00
Francesco Mazzoli d1e02e261b Various QOL improvements
Also, try to avoid thundering herds on shuckle from CDC/shards too.
2023-06-08 11:59:09 +00:00
Francesco Mazzoli d076941ce8 Simplify block write/fetch
And hopefully reduce the likelihood of bugs. On the write end, given
that we do things less asynchronously, things might be a bit slower,
but I think the simplification is worth it for now.

Also, fix/improve a bunch of other stuff.
2023-06-08 11:59:09 +00:00
Francesco Mazzoli 90e8500722 Add atime field to file
Right now it's always the same as mtime, but we'll add an endpoint
to modify it.
2023-06-05 12:19:09 +00:00
Francesco Mazzoli b041d14860 Add second ip/addr for CDC/shards too
This is one of the two data model/protocol changes I want to perform
before going into production, the other being file atime.

Right now the kernel module does not take advantage of this, but
it's OK since I tested the rest of the code reasonably and the goal
here is to perform the protocol/data changes.
2023-06-05 12:14:14 +00:00
Saulius Grusnys 3b503861e9 Switch shuckle to store data in sqlite db, add block services flags
Co-authored-by: Francesco Mazzoli <francesco.mazzoli@xtxmarkets.com>
2023-06-04 16:10:14 +00:00
Francesco Mazzoli f54727418f CPP hygiene, debug leftovers 2023-06-03 18:03:49 +00:00
Francesco Mazzoli cd86e632e2 Implement RS recovery, although it won't really be used now...
...since it only relies on block service flags, and we don't
set them right now.
2023-06-03 17:27:54 +00:00
Francesco Mazzoli efb92be31a Use bitmaps for RS recover API
The previous array was sort of silly, and particularly silly now
from the kernel.
2023-06-01 14:41:21 +00:00
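
A minimal Go sketch of the idea in the commit above: tracking which blocks of a span are available as bits in one integer rather than as an array. It assumes "RS" means Reed-Solomon, where any D of the D+P blocks are enough to recover the data. `BlockBitmap` and its methods are hypothetical names, not the actual API.

```go
package main

import (
	"fmt"
	"math/bits"
)

// BlockBitmap is a hypothetical type: bit i is set when block i is available.
type BlockBitmap uint32

// With returns a copy of the bitmap with block i marked as available.
func (b BlockBitmap) With(i uint) BlockBitmap { return b | 1<<i }

// Has reports whether block i is marked as available.
func (b BlockBitmap) Has(i uint) bool { return b&(1<<i) != 0 }

// Count returns the number of available blocks.
func (b BlockBitmap) Count() int { return bits.OnesCount32(uint32(b)) }

// CanRecover reports whether enough blocks are available to reconstruct
// the data, given the number of data blocks in the code.
func (b BlockBitmap) CanRecover(dataBlocks int) bool {
	return b.Count() >= dataBlocks
}

func main() {
	// Blocks 0, 2 and 3 are available; block 1 is missing.
	have := BlockBitmap(0).With(0).With(2).With(3)
	fmt.Printf("bitmap=%04b count=%d has1=%v recoverable=%v\n",
		uint32(have), have.Count(), have.Has(1), have.CanRecover(3))
}
```

A single integer is cheap to pass across an API boundary, which is presumably why the array looked "particularly silly" from the kernel side.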
Francesco Mazzoli 974500633a Implement stripe prefetching 2023-06-01 13:01:33 +00:00
Francesco Mazzoli 273fa40dda Fix USR2 logging teardown 2023-06-01 10:10:25 +00:00
Francesco Mazzoli 55074b16b4 Implement fs stat
10.97.12.10:10001       29P  208T   29P   1% /home/restechprod/eggs/mnt
2023-05-29 18:49:50 +00:00
Francesco Mazzoli 499bada153 Other silly eggsktools utility 2023-05-29 17:32:15 +00:00
Francesco Mazzoli 7e25c1fd95 Do not crash if we can't find blocks 2023-05-29 09:52:01 +00:00
Francesco Mazzoli a12a938c40 syslogify logs 2023-05-29 09:52:01 +00:00
Francesco Mazzoli 45471eded4 eggsktools readfile QOL 2023-05-28 22:12:49 +00:00
Francesco Mazzoli a4bc32a18f Span drop improvements
We could get into situations where async droppings were scheduled
at every read.
2023-05-26 17:22:43 +00:00
Francesco Mazzoli f95d177c34 WIP commit...
...which I mistakenly left in and I'm too lazy to fix.
2023-05-26 17:22:30 +00:00
Francesco Mazzoli fc0ae851c8 Typo 2023-05-26 10:14:32 +00:00
Francesco Mazzoli f98f0f3e95 Move some utility around, allow to deploy kmod easily 2023-05-26 10:05:25 +00:00
Francesco Mazzoli 1458759534 Allow to enable shard/cdc debugging at runtime using USR2 2023-05-26 10:03:59 +00:00
Francesco Mazzoli 1b83a50419 Ensure files actually are on different failure domains...
...in a very lazy way which will probably do for now.
2023-05-25 15:14:02 +00:00
Francesco Mazzoli a61fce55f8 Simple write test for block service
We can only write 3.2Gbit/s right now, so that's definitely something
to improve.
2023-05-22 13:08:22 +00:00
Francesco Mazzoli ac25c0aa3d Limit to 64 CPP expansions
Seems a more reasonable number than 256 for some reason.
2023-05-22 08:13:22 +00:00
Francesco Mazzoli 3cccb451d6 Remove unused define 2023-05-22 08:11:40 +00:00
Francesco Mazzoli 1eab8ee6cf Add versions to some RocksDB values
Only the ones where it is needed -- in some cases we can just
modify the keys (e.g. metadata stuff).

Also, come up with a sort of horrifying but more robust way
to specify the RocksDB values with the C preprocessor.
2023-05-22 08:03:01 +00:00