c-s load failed during cluster rolling restart - failed to get QUORUM, not enough replicas available #18647
Comments
That reactor stall is not new (see #13758 (comment) and https://github.com/scylladb/scylla-enterprise/issues/3963#issue-2161024203 - and I remember (but can't find right now!) more). |
@juliayakovlev - where's the kernel stack? |
@juliayakovlev - what encryption was configured here, btw? Client <-> server? server <-> server? both? |
If you mean the original file - it's in the node logs (https://cloudius-jenkins-test.s3.amazonaws.com/a6bbb535-3cf6-4f8b-b742-40ef856170ea/20240512_082401/db-cluster-a6bbb535.tar.gz) |
both |
I couldn't find a single kernel stack in the logs. All empty? |
The file is named "kallsyms_20240512_075635". In the node log I see only:
Not sure what it means |
this is exactly what I mean - I don't see any kernel stack. |
A run from last week passed this nemesis with success. @juliayakovlev, let's give it another run to see if it's reproducible. |
I hate this, I hate this. This isn't the first (or 100th) time we are debugging why queries failed for unclear reasons. But Scylla knows very well which replicas were available, which were queried, which failed, and what reasons they presented. Why can't we just make it tell us? |
Do you expect a log on the coordinator for every drop? |
Log? No, I would like more information to be added to the error returned to the client, more than just the number of replicas which failed. |
Also, if cassandra-stress retries each operation 10 times, it should print all 10 errors, not just the last one. It should also report the coordinator - the client knows which coordinator it picked. All we know from these errors is:

Wouldn't it be better if the error reports narrowed down the problem more? With this, we don't even know if the restarted node was a coordinator or a replica, let alone why the replica failed, or what's up with the third, uncontacted replica. |
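Purely to illustrate the kind of reporting asked for above (this is not cassandra-stress code; the contact point, keyspace, statement and retry count are placeholders), a Python sketch of a retry loop that keeps every error from every attempt and reports them all, instead of surfacing only the last one:

```python
# Hypothetical sketch, not cassandra-stress: collect the error from every
# retry attempt so that a final failure reports all of them, not just the last.
import time

from cassandra.cluster import Cluster

RETRIES = 10  # matches the retry count mentioned above

cluster = Cluster(["127.0.0.1"])          # placeholder contact point
session = cluster.connect("keyspace1")    # placeholder keyspace

def execute_with_full_error_report(statement):
    errors = []
    for attempt in range(RETRIES):
        try:
            return session.execute(statement)
        except Exception as exc:          # Unavailable, read/write timeouts, ...
            # Keep every failure instead of dropping the earlier ones.
            errors.append(f"attempt {attempt + 1}: {exc!r}")
            time.sleep(0.1)
    raise RuntimeError("all retries failed:\n" + "\n".join(errors))
```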
The current protocol does not provide more information - https://github.com/apache/cassandra/blob/6bae4f76fb043b4c3a3886178b5650b280e9a50b/doc/native_protocol_v4.spec#L1076 |
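For reference, a minimal sketch (assuming the DataStax Python driver; the contact point, keyspace and table are placeholders) of everything a client can actually extract from a v4 Unavailable error today - the consistency level and the required/alive replica counts, with no coordinator identity and no per-replica failure reasons:

```python
# Minimal sketch of the information carried by the native-protocol v4
# Unavailable error, as surfaced by the Python driver's Unavailable exception.
from cassandra import ConsistencyLevel, Unavailable
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

cluster = Cluster(["127.0.0.1"])          # placeholder contact point
session = cluster.connect("keyspace1")    # placeholder keyspace

stmt = SimpleStatement(
    "SELECT * FROM standard1 LIMIT 1",    # placeholder table
    consistency_level=ConsistencyLevel.QUORUM,
)

try:
    session.execute(stmt)
except Unavailable as exc:
    # These three fields are all the protocol exposes.
    print(f"consistency={exc.consistency}, "
          f"required_replicas={exc.required_replicas}, "
          f"alive_replicas={exc.alive_replicas}")
```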
The issue was not reproduced in https://argus.scylladb.com/test/98050732-dfe3-464c-a66a-f235bad30829/runs?additionalRuns[]=9adcc62d-4f9f-4b92-9316-87279f4c1b92 run |
One day we can improve the tools to provide more information; anyway, it's a side track to this issue. |
Try to reproduce scylladb/scylladb#18647 issue, run the test with ClusterRollingRestart nemesis only
Reproducer with rolling restart cluster nemesis only.

Packages
Scylla version:
Kernel Version:

Installation details
Cluster size: 6 nodes (i4i.4xlarge)
Scylla Nodes used in this run:
OS / Image:
Test:
Logs:
|
@juliayakovlev - anything relevant in the replica logs at the time of failure? |
@roydahan - please open a tracking issue for this. Sounds like an easy additional log in the stress tool (c-s?) that could help us. |
@kbr-scylla suspects it's a duplicate of #15899, let's fix #15899 and re-test this |
Issue was not reproduced with Scylla version |
I did not find anything new |
@juliayakovlev could you please also check if it reproduces on 5.4? |
@juliayakovlev wrote 2 days ago:
|
Sorry I missed it. In this case this is a regression and it is not a duplicate of #15899 (which, according to the report, happened way back in 5.1)! I think it's a major issue -- availability disruption during rolling restart. Giving it P1 priority and release blocker status.

Actually, I already have a suspicion about what the cause could be: removing wait-for-gossip-to-settle on node restart before "completing initialization" :( (cc @kostja @gleb-cloudius) 65cfb9b

We should retest with that final wait-for-gossip-to-settle restored (65cfb9b removed two waits -- I believe we only need the second one for preserving availability). If so, we should consider:
|
Modified original post (this is a regression) |
I think that servers that were considered restarted and joined the cluster do not
The gap grows and grows, until eventually there isn't enough capacity and we reach a timeout. It's probably not a regression, just an issue we may have in general; we need to research it.
|
Packages
Scylla version:
Kernel Version:

Installation details
Cluster size: 6 nodes (i4i.4xlarge)
Scylla Nodes used in this run:
OS / Image:
Test:
Logs:
|
@juliayakovlev @roydahan how does this rolling restart nemesis decide that it can restart the next node? |
once a node is listening on CQL, it moves to the next one |
Since a couple of days ago we verify the CQL port much more often (it was every 60 seconds, now every 5), so the issue could be emphasized. |
There was a long thread about it not being enough and the need for additional checks (scylladb/scylla-ccm#564 implemented this on CCM, and I believe there was a similar issue for dtest?). Specifically, ensure all OTHER nodes see that node as alive and owning its share of the ring? |
Yes, just checking CQL port is not enough. But it worked. We still need to figure out what changed. |
Before 65cfb9b, an open CQL port meant that gossip had settled. After this commit, it no longer does. |
I know :) But we did not confirm it yet. Also, why does gossip settling guarantee that all nodes see all other nodes as alive? Maybe it is just because it takes time, and it does not guarantee it in reality. |
That's my guess too -- there was no guarantee, but since wait-for-gossip-to-settle always took at least a few seconds, in practice the observable result was that all nodes saw this one as UP before continuing rolling restart on the next node. |
A strong sense of déjà vu here, around this question. But what's the next step? How can a user do a safe rolling restart with this version? |
The proper procedure for rolling restart was always to wait for the CQL port and wait for all nodes to see the restarted node as UP. |
BTW our docs are vague about it. Step 5 says
but it doesn't say that
And we have to admit that it's pretty inconvenient to have to connect to every node and execute |
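To make the missing check concrete, here is a sketch (a hypothetical helper, not CCM or SCT code; it assumes passwordless SSH and nodetool on every node) that polls nodetool status on all the other nodes until the restarted node is reported as UN everywhere, in addition to the CQL-port wait:

```python
# Sketch: after restarting a node, wait until every OTHER node reports the
# restarted node's IP as UN (Up/Normal) before restarting the next one.
import subprocess
import time

def node_is_un_everywhere(restarted_ip, other_nodes, timeout=300):
    deadline = time.time() + timeout
    while time.time() < deadline:
        views = []
        for host in other_nodes:
            # Assumes passwordless SSH to the cluster nodes.
            out = subprocess.run(
                ["ssh", host, "nodetool", "status"],
                capture_output=True, text=True, check=True,
            ).stdout
            views.append(any(
                line.startswith("UN") and restarted_ip in line
                for line in out.splitlines()
            ))
        if all(views):
            return True
        time.sleep(5)
    return False
```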
The node may do it itself before opening the CQL port, like it does with the shutdown notification, but this is not what "waiting for gossiper to settle" was doing, so this is a different feature request. |
Anyway, SCT in this case doesn't even do it on a single node. |
We can ask our Field engineers. @tarzanek could you help answer this? But I suspect the manual drain is redundant -- graceful shutdown should already drain automatically before stopping the process. |
It's in Siren's code. But this is an OSS issue, so I won't paste the link. Generally, we do. And we have a timeout between drain and restart too, btw. |
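For illustration only (this is not Siren's code; the host access, commands and pause length are assumptions), a sketch of the drain-then-restart sequence with a pause between the drain and the restart, as described above:

```python
# Sketch of a per-node restart with an explicit drain and a pause in between.
import subprocess
import time

def restart_node(host, pause_seconds=60):     # pause length is an arbitrary placeholder
    def run(*cmd):
        # Assumes passwordless SSH and sudo on the target node.
        subprocess.run(["ssh", host, *cmd], check=True)

    run("nodetool", "drain")                  # flush memtables, stop accepting traffic
    time.sleep(pause_seconds)                 # the "timeout between drain and restart"
    run("sudo", "systemctl", "stop", "scylla-server.service")
    run("sudo", "systemctl", "start", "scylla-server.service")
```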
Packages

Scylla version: 5.5.0~dev-20240510.28791aa2c1d3 with build-id 893c2a68becf3d3bcbbf076980b1b831b9b76e29
Kernel Version: 5.15.0-1060-aws
Issue description

Cassandra-stress load (writes and reads) failed while disrupt_rolling_restart_cluster - failed to get QUORUM, not enough replicas available.

This nemesis restarts Scylla on all nodes (one by one) by running sudo systemctl stop scylla-server.service and then sudo systemctl start scylla-server.service.

Nodes order to restart:

The load failures happened after longevity-tls-50gb-3d-master-db-node-a6bbb535-6 was restarted and initialisation was completed.

During Scylla start, very high foreground writes are observed on the longevity-tls-50gb-3d-master-db-node-a6bbb535-6 node. Writes started to fail while Scylla was stopping.

where the red line is the longevity-tls-50gb-3d-master-db-node-a6bbb535-6 node.

Reactor stalls (32ms) and kernel callstacks: kallsyms_20240512_075635_result.log
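For context, a minimal sketch (helper names and SSH access are assumptions; this is not the SCT nemesis code) of the restart sequence described above - stop and start scylla-server on each node in turn, with an open CQL port as the only readiness gate before moving to the next node:

```python
# Sketch of a node-by-node restart that waits only for the CQL port.
import socket
import subprocess
import time

def wait_for_cql(host, port=9042, timeout=600):
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            with socket.create_connection((host, port), timeout=5):
                return
        except OSError:
            time.sleep(5)
    raise TimeoutError(f"{host}:{port} did not open within {timeout}s")

def rolling_restart(nodes):
    for host in nodes:
        # Assumes passwordless SSH and sudo on the nodes.
        subprocess.run(["ssh", host, "sudo", "systemctl", "stop", "scylla-server.service"], check=True)
        subprocess.run(["ssh", host, "sudo", "systemctl", "start", "scylla-server.service"], check=True)
        wait_for_cql(host)   # the only readiness check before the next node
```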
Impact

Load failed.

How frequently does it reproduce?
Installation details

Cluster size: 6 nodes (i4i.4xlarge)
Scylla Nodes used in this run:

OS / Image: ami-0b7480423a402aa95 (aws: undefined_region)

Test: longevity-50gb-3days-test
Test id: a6bbb535-3cf6-4f8b-b742-40ef856170ea
Test name: scylla-master/tier1/longevity-50gb-3days-test
Test config file(s):

Logs and commands
$ hydra investigate show-monitor a6bbb535-3cf6-4f8b-b742-40ef856170ea
$ hydra investigate show-logs a6bbb535-3cf6-4f8b-b742-40ef856170ea
Logs:
Jenkins job URL
Argus