The transactional DB that the CDC uses has a slightly
annoying property: it flushes the WAL on transaction
start. As a result, the release point can get moved and
log records persisted even if we crash.
For now, we want to remove them automatically.
This means that they'll be interrupted at shutdown, rather than holding everything up when shuckle is overloaded.
We also detect idle connections and connections that transmit data slowly.
See <https://mazzo.li/posts/stopping-linux-threads.html> for tradeoffs
regarding how to terminate threads gracefully.
The goal of this work was to get valgrind working correctly, which in turn
was needed to investigate #141. It looks like I have succeeded:
==2715080== Warning: unimplemented fcntl command: 1036
==2715080== 20,052 bytes in 5,013 blocks are definitely lost in loss record 133 of 135
==2715080== at 0x483F013: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2715080== by 0x3B708E: allocate (new_allocator.h:121)
==2715080== by 0x3B708E: allocate (allocator.h:173)
==2715080== by 0x3B708E: allocate (alloc_traits.h:460)
==2715080== by 0x3B708E: _M_allocate (stl_vector.h:346)
==2715080== by 0x3B708E: std::vector<Crc, std::allocator<Crc> >::_M_default_append(unsigned long) (vector.tcc:635)
==2715080== by 0x42BF1C: resize (stl_vector.h:940)
==2715080== by 0x42BF1C: ShardDBImpl::_fileSpans(rocksdb::ReadOptions&, FileSpansReq const&, FileSpansResp&) (shard/ShardDB.cpp:921)
==2715080== by 0x420867: ShardDBImpl::read(ShardReqContainer const&, ShardRespContainer&) (shard/ShardDB.cpp:1034)
==2715080== by 0x3CB3EE: ShardServer::_handleRequest(int, sockaddr_in*, char*, unsigned long) (shard/Shard.cpp:347)
==2715080== by 0x3C8A39: ShardServer::step() (shard/Shard.cpp:405)
==2715080== by 0x40B1E8: run (core/Loop.cpp:67)
==2715080== by 0x40B1E8: startLoop(void*) (core/Loop.cpp:37)
==2715080== by 0x4BEA258: start_thread (in /usr/lib/libpthread-2.33.so)
==2715080== by 0x4D005E2: clone (in /usr/lib/libc-2.33.so)
==2715080==
==2715080==
==2715080== Exit program on first error (--exit-on-first-error=yes)
The idea is to drain the socket and do a single RocksDB WAL
write/fsync for all the write requests we have found.
The read requests are immediately executed. The reasoning here is
that currently write requests are _a lot_ slower than the read
requests because fsyncing takes ~500us on fsf1. In the future this
might change.
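A minimal sketch of the batching idea, not the actual shard code: `Request`, `FakeDB`, and `processDrained` are hypothetical names, and the database is replaced by a stand-in that only counts how many times we would pay for a sync. Reads are served as soon as they are seen; writes are collected and committed once per drain.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical request type; the real request containers are not shown here.
struct Request {
    bool isWrite;
    std::string payload;
};

// Hypothetical stand-in for the database. It only records what happened,
// so that the number of syncs per drain is easy to observe.
struct FakeDB {
    int syncs = 0;
    std::vector<std::string> applied;
    std::vector<std::string> readsServed;

    // Reads do not touch the WAL, so they are executed right away.
    void read(const Request& req) { readsServed.push_back(req.payload); }

    // Stands in for a single WAL write + fsync covering the whole batch.
    void writeBatch(const std::vector<Request>& batch) {
        assert(!batch.empty());
        for (const auto& req : batch) { applied.push_back(req.payload); }
        syncs++;
    }
};

// Process everything that was drained from the socket in one go.
void processDrained(FakeDB& db, const std::vector<Request>& drained) {
    std::vector<Request> writes;
    for (const auto& req : drained) {
        if (req.isWrite) {
            writes.push_back(req);  // defer: pay for the sync only once
        } else {
            db.read(req);           // cheap: execute immediately
        }
    }
    if (!writes.empty()) { db.writeBatch(writes); }
}
```

With N write requests in a drain, the sync cost is paid once rather than N times, which is where the gain comes from as long as the sync dominates.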
Since we're at it, we also use batch UDP syscalls in the CDC.
Fixes #119.
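For reference, a rough sketch of draining a socket with one batched syscall, assuming "batch UDP syscalls" means Linux's `recvmmsg`/`sendmmsg`. `DatagramBatch`, `drainSocket`, `MAX_BATCH`, and `MAX_DGRAM` are made-up names and sizes for illustration, not what the CDC uses.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // recvmmsg is a Linux-specific extension
#endif
#include <sys/socket.h>
#include <sys/uio.h>
#include <netinet/in.h>
#include <array>
#include <cerrno>
#include <cstddef>
#include <cstring>

constexpr unsigned MAX_BATCH = 16;   // assumption: datagrams per syscall
constexpr size_t MAX_DGRAM = 1500;   // assumption: max datagram size

// Buffers and headers for one batched receive.
struct DatagramBatch {
    std::array<std::array<char, MAX_DGRAM>, MAX_BATCH> bufs;
    std::array<iovec, MAX_BATCH> iovs;
    std::array<mmsghdr, MAX_BATCH> hdrs;
    std::array<sockaddr_in, MAX_BATCH> addrs;
};

// Receives up to MAX_BATCH datagrams with a single syscall, without blocking.
// Returns the number of datagrams received, 0 if none were ready, -1 on error.
// After the call, hdrs[i].msg_len holds the length of datagram i.
int drainSocket(int sock, DatagramBatch& batch) {
    for (unsigned i = 0; i < MAX_BATCH; i++) {
        batch.iovs[i].iov_base = batch.bufs[i].data();
        batch.iovs[i].iov_len = batch.bufs[i].size();
        std::memset(&batch.hdrs[i], 0, sizeof(mmsghdr));
        batch.hdrs[i].msg_hdr.msg_iov = &batch.iovs[i];
        batch.hdrs[i].msg_hdr.msg_iovlen = 1;
        batch.hdrs[i].msg_hdr.msg_name = &batch.addrs[i];
        batch.hdrs[i].msg_hdr.msg_namelen = sizeof(sockaddr_in);
    }
    int n = recvmmsg(sock, batch.hdrs.data(), MAX_BATCH, MSG_DONTWAIT, nullptr);
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) { return 0; }
    return n;
}
```

A caller would loop on `drainSocket` until it returns 0 and then process everything it collected.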
The goal here is to not have constant wakeups due to timeout. Do
not attempt to clean things up nicely before termination -- just
terminate instead. We can set up a proper termination system in
the future; I first want to see if this makes a difference.
Also, change xmon to use pipes for communication, so that it can
wait without timers as well.
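A minimal sketch of waiting on a pipe with no timeout, to illustrate the idea only. `waitForWakeup` and `wakeUp` are hypothetical helpers, not the xmon code: the waiter sleeps in `poll` until the other side writes a byte, so there are no periodic wakeups.

```cpp
#include <poll.h>
#include <unistd.h>
#include <cerrno>

// Blocks until a byte arrives on the read end of the pipe. There is no
// timeout: the thread sleeps until somebody writes or closes the pipe.
// Returns true if woken by data, false if the write end was closed or
// an error occurred.
bool waitForWakeup(int readFd) {
    pollfd pfd{};
    pfd.fd = readFd;
    pfd.events = POLLIN;
    for (;;) {
        int ret = poll(&pfd, 1, -1);  // -1: wait indefinitely
        if (ret < 0) {
            if (errno == EINTR) { continue; }  // interrupted by a signal
            return false;
        }
        if (pfd.revents & POLLIN) {
            char byte;
            return read(readFd, &byte, 1) == 1;  // consume the wakeup
        }
        return false;  // POLLHUP or POLLERR with nothing to read
    }
}

// Wakes up a thread blocked in waitForWakeup on the other end of the pipe.
bool wakeUp(int writeFd) {
    char byte = 1;
    ssize_t n;
    do {
        n = write(writeFd, &byte, 1);
    } while (n < 0 && errno == EINTR);
    return n == 1;
}
```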
Also, `write` directly for logging, so that we know the logs will
make it to the file after the logging call returns (since we now
do not have the chance to flush them afterwards).
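A rough sketch of logging via `write(2)` with no userspace buffering. `writeAll` and `logLine` are hypothetical names, not the actual logger. Once `write` returns, the bytes are in the kernel, so they survive the process terminating abruptly. This does not imply an `fsync`, so it says nothing about surviving a machine crash.

```cpp
#include <unistd.h>
#include <cerrno>
#include <cstddef>
#include <string>

// Writes the whole buffer with write(2), retrying on partial writes
// and on EINTR. Returns false on any other error.
bool writeAll(int fd, const char* data, size_t len) {
    while (len > 0) {
        ssize_t n = write(fd, data, len);
        if (n < 0) {
            if (errno == EINTR) { continue; }
            return false;
        }
        data += n;
        len -= static_cast<size_t>(n);
    }
    return true;
}

// Emits one log line. Nothing is buffered in the process, so there is
// nothing left to flush if we terminate right after this returns.
bool logLine(int fd, const std::string& msg) {
    std::string line = msg;
    line.push_back('\n');
    return writeAll(fd, line.data(), line.size());
}
```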