See <https://mazzo.li/posts/stopping-linux-threads.html> for tradeoffs
regarding how to terminate threads gracefully.
The goal of this work was to get valgrind working correctly, in
order to investigate #141. It looks like I have succeeded:
==2715080== Warning: unimplemented fcntl command: 1036
==2715080== 20,052 bytes in 5,013 blocks are definitely lost in loss record 133 of 135
==2715080== at 0x483F013: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2715080== by 0x3B708E: allocate (new_allocator.h:121)
==2715080== by 0x3B708E: allocate (allocator.h:173)
==2715080== by 0x3B708E: allocate (alloc_traits.h:460)
==2715080== by 0x3B708E: _M_allocate (stl_vector.h:346)
==2715080== by 0x3B708E: std::vector<Crc, std::allocator<Crc> >::_M_default_append(unsigned long) (vector.tcc:635)
==2715080== by 0x42BF1C: resize (stl_vector.h:940)
==2715080== by 0x42BF1C: ShardDBImpl::_fileSpans(rocksdb::ReadOptions&, FileSpansReq const&, FileSpansResp&) (shard/ShardDB.cpp:921)
==2715080== by 0x420867: ShardDBImpl::read(ShardReqContainer const&, ShardRespContainer&) (shard/ShardDB.cpp:1034)
==2715080== by 0x3CB3EE: ShardServer::_handleRequest(int, sockaddr_in*, char*, unsigned long) (shard/Shard.cpp:347)
==2715080== by 0x3C8A39: ShardServer::step() (shard/Shard.cpp:405)
==2715080== by 0x40B1E8: run (core/Loop.cpp:67)
==2715080== by 0x40B1E8: startLoop(void*) (core/Loop.cpp:37)
==2715080== by 0x4BEA258: start_thread (in /usr/lib/libpthread-2.33.so)
==2715080== by 0x4D005E2: clone (in /usr/lib/libc-2.33.so)
==2715080==
==2715080==
==2715080== Exit program on first error (--exit-on-first-error=yes)
The idea is to drain the socket and perform a single RocksDB WAL
write/fsync for all the write requests we have collected. Read
requests are executed immediately. The reasoning is that write
requests are currently _a lot_ slower than read requests, because
an fsync takes ~500us on fsf1. This might change in the future.
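As a minimal sketch of the batching described above, using the stock
RocksDB C++ API (the `WriteReq` type and `flushWrites` helper are
hypothetical stand-ins for the actual shard code):

    #include <rocksdb/db.h>
    #include <rocksdb/write_batch.h>
    #include <string>
    #include <vector>

    // Hypothetical stand-in for the shard's drained write requests.
    struct WriteReq { std::string key, value; };

    // Apply all drained write requests in one WriteBatch, so the WAL is
    // written and fsynced once per batch instead of once per request.
    rocksdb::Status flushWrites(rocksdb::DB* db, const std::vector<WriteReq>& reqs) {
        rocksdb::WriteBatch batch;
        for (const auto& req : reqs) {
            batch.Put(req.key, req.value);
        }
        rocksdb::WriteOptions opts;
        opts.sync = true; // one WAL write + fsync covering the whole batch
        return db->Write(opts, &batch);
    }
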
While we're at it, we also use batched UDP syscalls in the CDC
(sketched below). Fixes #119.
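The batched UDP receive boils down to the standard Linux recvmmsg(2)
call (sendmmsg(2) on the send side); a rough sketch, with arbitrary
batch and buffer sizes:

    #include <sys/socket.h>
    #include <cstring>

    constexpr int BATCH = 16;          // arbitrary batch size
    constexpr size_t MAX_DGRAM = 1500; // arbitrary per-datagram buffer size

    // Receive up to BATCH datagrams with a single syscall instead of
    // calling recvfrom() once per datagram. Returns the number of
    // datagrams received (or -1 on error); msgs[i].msg_len holds the
    // length of the i-th datagram, whose payload lands in bufs[i].
    int recvBatch(int sock, char bufs[BATCH][MAX_DGRAM], mmsghdr msgs[BATCH]) {
        iovec iovs[BATCH];
        for (int i = 0; i < BATCH; i++) {
            iovs[i] = { bufs[i], MAX_DGRAM };
            std::memset(&msgs[i], 0, sizeof(msgs[i]));
            msgs[i].msg_hdr.msg_iov = &iovs[i];
            msgs[i].msg_hdr.msg_iovlen = 1;
        }
        return recvmmsg(sock, msgs, BATCH, MSG_DONTWAIT, nullptr);
    }
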
The goal here is to avoid constant wakeups due to timeouts. Do
not attempt to clean things up nicely before termination -- just
terminate instead. We can set up a proper termination system in
the future; first I want to see if this makes a difference.
Also, change xmon to use pipes for communication, so that it can
wait without timers as well.
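A sketch of the pipe-based wakeup (assuming nothing about xmon's
internals): the waiting side blocks in poll(2) with an infinite
timeout, and whoever needs to wake it up just writes a byte to the
pipe, so no periodic timer is needed:

    #include <poll.h>
    #include <unistd.h>

    // Set up once with pipe(wakeupPipe): [0] is the read end handed to
    // the waiting loop, [1] is the write end handed to whoever wakes it.
    int wakeupPipe[2];

    void waitForWakeup(int readFd) {
        pollfd pfd = { readFd, POLLIN, 0 };
        poll(&pfd, 1, -1); // -1 timeout: block until woken, no periodic wakeups
        char buf[64];
        (void)read(readFd, buf, sizeof(buf)); // drain the notification byte(s)
    }

    void wakeUp(int writeFd) {
        char b = 1;
        (void)write(writeFd, &b, 1);
    }
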
Also, use `write` directly for logging, so that we know the logs
will have made it to the file by the time the logging call returns
(since we no longer get a chance to flush them afterwards).
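A sketch of what the direct-write logging amounts to, assuming the
log file is opened with O_APPEND; the point is that once write(2)
returns, the line is in the kernel rather than sitting in a
userspace stdio buffer that will never be flushed:

    #include <fcntl.h>
    #include <string>
    #include <unistd.h>

    // Open once at startup; O_APPEND makes each write land at the end of the file.
    int openLog(const char* path) {
        return open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    }

    // No stdio buffering: by the time this returns, the line has been handed
    // to the kernel, so abrupt termination cannot lose it in a userspace buffer.
    void logLine(int fd, const std::string& line) {
        (void)write(fd, line.data(), line.size());
    }
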
This saves a ton of writes, since jobs stat tons of files. It
would arguably be a bit cleaner to do this in the kmod, but this
is much quicker.
Thanks to @sgrusny for the good idea.
Fixes #32. This also involves some reworking of the block request
machinery to make it more robust and faster. The scrubbing is done
assuming that the overwhelming majority of block checks will pass.
This was triggered by a server failing hard (fsr13) without any
short-term resolution (we've already replaced the mobo; we'll
probably replace the HBA). In such cases GC should still run
rather than get stuck.
...and also update them quickly, by indexing them by (inode, tag).
Currently they only get updated on local renames, though; we should
also update them when things are moved around remotely.
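Concretely, the index can be as simple as an ordered map keyed on
the (inode, tag) pair, so a local rename finds the record it needs
to update with a single lookup instead of a scan; the record type
below is a hypothetical placeholder for whatever is actually tracked:

    #include <cstdint>
    #include <map>
    #include <utility>

    using InodeId = uint64_t;
    using Tag = uint64_t;

    struct EdgeInfo; // hypothetical placeholder for the tracked record

    // Keyed on (inode, tag): a rename can find and update the matching
    // entry with one lookup rather than scanning everything.
    std::map<std::pair<InodeId, Tag>, EdgeInfo*> edgeIndex;
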
This fixes a bona fide bug -- we didn't update the mtime when an
edge was unlocked + moved. However, we might as well always update
the mtime blindly, even if there is no POSIX-visible change, to be
on the safe side.
See the comment in `msgs.go`. This would normally have required
entirely new transactions, but since we're not in production yet,
I'm just going to change the schema and wipe the current FS.
This also adds an unrelated change: more flexible blacklisting,
which will be required for some additional testing I'm preparing.
Not used right now, but this way we can easily start stuffing more
data into responses.
I also split off some arguments in `NewClient` -- an unrelated
change (I wanted to pair the MTU with a single client, but then
realized it's enough to keep it as a global property for now).