
c-s load failed during cluster rolling restart - failed to get QUORUM, not enough replicas available #18647

Open
juliayakovlev opened this issue May 13, 2024 · 43 comments


juliayakovlev commented May 13, 2024

Packages

Scylla version: 5.5.0~dev-20240510.28791aa2c1d3 with build-id 893c2a68becf3d3bcbbf076980b1b831b9b76e29
Kernel Version: 5.15.0-1060-aws

Issue description

  • This issue is a regression.
  • It is unknown if this issue is a regression.

Cassandra-stress load (writes and reads) failed during disrupt_rolling_restart_cluster - failed to get QUORUM, not enough replicas available

2024-05-12 08:10:15.584: (CassandraStressLogEvent Severity.CRITICAL) period_type=one-time event_id=ecab70e9-3b78-4097-88c6-ad7618d462e6 during_nemesis=RollingRestartCluster: type=OperationOnKey regex=Operation x10 on key\(s\) \[ line_number=20651 node=Node longevity-tls-50gb-3d-master-loader-node-a6bbb535-2 [18.201.34.63 | 10.4.8.239]
java.io.IOException: Operation x10 on key(s) [4e354f50393938333930]: Error executing: (UnavailableException): Not enough replicas available for query at consistency QUORUM (2 required but only 1 alive)
2024-05-12 08:10:15.652: (CassandraStressLogEvent Severity.CRITICAL) period_type=one-time event_id=b10af2ee-103b-42b3-aa86-9f96e247611c during_nemesis=RollingRestartCluster: type=OperationOnKey regex=Operation x10 on key\(s\) \[ line_number=20691 node=Node longevity-tls-50gb-3d-master-loader-node-a6bbb535-3 [3.248.184.200 | 10.4.10.124]
java.io.IOException: Operation x10 on key(s) [4c304d4e313534343730]: Error executing: (ReadFailureException): Cassandra failure during read query at consistency QUORUM (2 responses were required but only 1 replica responded, 1 failed)
2024-05-12 08:10:15.784: (CassandraStressLogEvent Severity.CRITICAL) period_type=one-time event_id=b10af2ee-103b-42b3-aa86-9f96e247611c during_nemesis=RollingRestartCluster: type=OperationOnKey regex=Operation x10 on key\(s\) \[ line_number=20713 node=Node longevity-tls-50gb-3d-master-loader-node-a6bbb535-3 [3.248.184.200 | 10.4.10.124]
java.io.IOException: Operation x10 on key(s) [304b4e3436304d323131]: Error executing: (WriteFailureException): Cassandra failure during write query at consistency QUORUM (2 responses were required but only 1 replica responded, 1 failed)
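
For context, these failures match the usual QUORUM arithmetic: QUORUM = floor(RF/2) + 1, so with a replication factor of 3 (an assumption - the keyspace RF is not stated in this report) two replicas must respond, and a single healthy replica is not enough. A minimal sketch of that calculation:

# Sketch of the QUORUM requirement behind the "2 required but only 1 alive" errors.
# RF=3 is an assumption; the keyspace replication factor is not given in this report.
def quorum(replication_factor: int) -> int:
    return replication_factor // 2 + 1

print(quorum(3))  # 2 -> one replica restarting plus one failing leaves only 1 responder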

This nemesis restarts Scylla on all nodes (one by one) by running sudo systemctl stop scylla-server.service and then sudo systemctl start scylla-server.service.
Node restart order:

'longevity-tls-50gb-3d-master-db-node-a6bbb535-3', 
'longevity-tls-50gb-3d-master-db-node-a6bbb535-4', 
'longevity-tls-50gb-3d-master-db-node-a6bbb535-5', 
'longevity-tls-50gb-3d-master-db-node-a6bbb535-6', 
'longevity-tls-50gb-3d-master-db-node-a6bbb535-8', 
'longevity-tls-50gb-3d-master-db-node-a6bbb535-9'
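
As a rough illustration of that flow (a sketch only, not the actual SCT nemesis code; the ssh helper below is a hypothetical stand-in for SCT's remoter abstraction):

import subprocess

NODES = [
    "longevity-tls-50gb-3d-master-db-node-a6bbb535-3",
    "longevity-tls-50gb-3d-master-db-node-a6bbb535-4",
    "longevity-tls-50gb-3d-master-db-node-a6bbb535-5",
    "longevity-tls-50gb-3d-master-db-node-a6bbb535-6",
    "longevity-tls-50gb-3d-master-db-node-a6bbb535-8",
    "longevity-tls-50gb-3d-master-db-node-a6bbb535-9",
]

def ssh(node: str, command: str) -> None:
    # Hypothetical helper; the real nemesis goes through SCT's remoter abstraction.
    subprocess.run(["ssh", node, command], check=True)

for node in NODES:  # one node at a time
    ssh(node, "sudo systemctl stop scylla-server.service")
    ssh(node, "sudo systemctl start scylla-server.service")
    # SCT then waits for the node to serve CQL again before moving to the next node
    # (see the discussion about this gate further down in the thread).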

The load failures happened after longevity-tls-50gb-3d-master-db-node-a6bbb535-6 was restarted and its initialisation completed.
During Scylla start, very high foreground writes are observed on longevity-tls-50gb-3d-master-db-node-a6bbb535-6. Writes started to fail while Scylla was stopping.

Screenshot from 2024-05-13 11-55-02 (the red line is the longevity-tls-50gb-3d-master-db-node-a6bbb535-6 node).

Reactor stalls (32ms) and kernel callstacks

May 12 08:10:09.301420 longevity-tls-50gb-3d-master-db-node-a6bbb535-6 scylla[1465]: Reactor stalled for 32 ms on shard 12. Backtrace: 0x5e785fa 0x5e77a05 0x5e78dbf 0x3dbaf 0x32c21 0x21e52 0x68484 0xe059 0x14976 0x14bfd 0x14d09 0x169fb7 0x85d22 0x86121 0x6de02 0x7da2c 0x6332334 0x632ab1b 0x632ae5f 0x632b3a6 0x20fe8b1 0x290118d 0x2900905 0x5e8a29f 0x5e8b587 0x5eaf4c0 0x5e4abda 0x8c946 0x11296f

void seastar::backtrace<seastar::backtrace_buffer::append_backtrace_oneline()::{lambda(seastar::frame)#1}>(seastar::backtrace_buffer::append_backtrace_oneline()::{lambda(seastar::frame)#1}&&) at ./build/release/seastar/./seastar/include/seastar/util/backtrace.hh:68
 (inlined by) seastar::backtrace_buffer::append_backtrace_oneline() at ./build/release/seastar/./seastar/src/core/reactor.cc:839
 (inlined by) seastar::print_with_backtrace(seastar::backtrace_buffer&, bool) at ./build/release/seastar/./seastar/src/core/reactor.cc:858
seastar::internal::cpu_stall_detector::generate_trace() at ./build/release/seastar/./seastar/src/core/reactor.cc:1482
seastar::internal::cpu_stall_detector::maybe_report() at ./build/release/seastar/./seastar/src/core/reactor.cc:1219
 (inlined by) seastar::internal::cpu_stall_detector::on_signal() at ./build/release/seastar/./seastar/src/core/reactor.cc:1239
 (inlined by) seastar::reactor::block_notifier(int) at ./build/release/seastar/./seastar/src/core/reactor.cc:1520
?? ??:0
?? ??:0
?? ??:0
?? ??:0
?? ??:0
?? ??:0
?? ??:0
?? ??:0
?? ??:0
?? ??:0
?? ??:0
?? ??:0
?? ??:0
seastar::tls::certificate_credentials::impl::set_x509_key(std::basic_string_view<char, std::char_traits<char> > const&, std::basic_string_view<char, std::char_traits<char> > const&, seastar::tls::x509_crt_format) at ./build/release/seastar/./seastar/src/net/tls.cc:399
operator() at ./build/release/seastar/./seastar/src/net/tls.cc:725
 (inlined by) operator()<seastar::x509_key> at ./build/release/seastar/./seastar/src/net/tls.cc:705
 (inlined by) void seastar::visit_blobs<std::multimap<seastar::basic_sstring<char, unsigned int, 15u, true>, boost::any, std::less<seastar::basic_sstring<char, unsigned int, 15u, true> >, std::allocator<std::pair<seastar::basic_sstring<char, unsigned int, 15u, true> const, boost::any> > > const, seastar::internal::variant_visitor<seastar::tls::credentials_builder::apply_to(seastar::tls::certificate_credentials&) const::$_0, seastar::tls::credentials_builder::apply_to(seastar::tls::certificate_credentials&) const::$_1, seastar::tls::credentials_builder::apply_to(seastar::tls::certificate_credentials&) const::$_2> >(std::multimap<seastar::basic_sstring<char, unsigned int, 15u, true>, boost::any, std::less<seastar::basic_sstring<char, unsigned int, 15u, true> >, std::allocator<std::pair<seastar::basic_sstring<char, unsigned int, 15u, true> const, boost::any> > > const&, seastar::internal::variant_visitor<seastar::tls::credentials_builder::apply_to(seastar::tls::certificate_credentials&) const::$_0, seastar::tls::credentials_builder::apply_to(seastar::tls::certificate_credentials&) const::$_1, seastar::tls::credentials_builder::apply_to(seastar::tls::certificate_credentials&) const::$_2>&&) at ./build/release/seastar/./seastar/src/net/tls.cc:710
 (inlined by) seastar::tls::credentials_builder::apply_to(seastar::tls::certificate_credentials&) const at ./build/release/seastar/./seastar/src/net/tls.cc:716
seastar::tls::credentials_builder::build_server_credentials() const at ./build/release/seastar/./seastar/src/net/tls.cc:768
seastar::tls::credentials_builder::build_reloadable_server_credentials(std::function<void (std::unordered_set<seastar::basic_sstring<char, unsigned int, 15u, true>, std::hash<seastar::basic_sstring<char, unsigned int, 15u, true> >, std::equal_to<seastar::basic_sstring<char, unsigned int, 15u, true> >, std::allocator<seastar::basic_sstring<char, unsigned int, 15u, true> > > const&, std::__exception_ptr::exception_ptr)>, std::optional<std::chrono::duration<long, std::ratio<1l, 1000l> > >) const at ./build/release/seastar/./seastar/src/net/tls.cc:1033
generic_server::server::listen(seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>) at ./generic_server.cc:151
operator() at ./transport/controller.cc:72
 (inlined by) seastar::future<void> std::__invoke_impl<seastar::future<void>, cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0&, cql_transport::cql_server&>(std::__invoke_other, cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0&, cql_transport::cql_server&) at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/invoke.h:61
 (inlined by) std::__invoke_result<cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0&, cql_transport::cql_server&>::type std::__invoke<cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0&, cql_transport::cql_server&>(cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0&, cql_transport::cql_server&) at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/invoke.h:96
 (inlined by) decltype(auto) std::__apply_impl<cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0&, std::tuple<cql_transport::cql_server&>, 0ul>(cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0&, std::tuple<cql_transport::cql_server&>&&, std::integer_sequence<unsigned long, 0ul>) at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/tuple:2288
 (inlined by) decltype(auto) std::apply<cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0&, std::tuple<cql_transport::cql_server&> >(cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0&, std::tuple<cql_transport::cql_server&>&&) at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/tuple:2299
 (inlined by) seastar::future<void> seastar::futurize<seastar::future<void> >::apply<cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0&, cql_transport::cql_server&>(cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0&, std::tuple<cql_transport::cql_server&>&&) at ././seastar/include/seastar/core/future.hh:2003
 (inlined by) auto seastar::futurize_apply<cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0&, cql_transport::cql_server&>(cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0&, std::tuple<cql_transport::cql_server&>&&) at ././seastar/include/seastar/core/future.hh:2078
 (inlined by) operator() at ././seastar/include/seastar/core/sharded.hh:766
 (inlined by) seastar::future<void> std::__invoke_impl<seastar::future<void>, seastar::sharded<cql_transport::cql_server>::invoke_on_all<cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0>(seastar::smp_submit_to_options, cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0)::{lambda(cql_transport::cql_server&)#1}::operator()(cql_transport::cql_server&)::{lambda()#1}>(std::__invoke_other, seastar::sharded<cql_transport::cql_server>::invoke_on_all<cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0>(seastar::smp_submit_to_options, cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0)::{lambda(cql_transport::cql_server&)#1}::operator()(cql_transport::cql_server&)::{lambda()#1}&&) at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/invoke.h:61
 (inlined by) std::__invoke_result<seastar::sharded<cql_transport::cql_server>::invoke_on_all<cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0>(seastar::smp_submit_to_options, cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0)::{lambda(cql_transport::cql_server&)#1}::operator()(cql_transport::cql_server&)::{lambda()#1}>::type std::__invoke<seastar::sharded<cql_transport::cql_server>::invoke_on_all<cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0>(seastar::smp_submit_to_options, cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0)::{lambda(cql_transport::cql_server&)#1}::operator()(cql_transport::cql_server&)::{lambda()#1}>(seastar::sharded<cql_transport::cql_server>::invoke_on_all<cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0>(seastar::smp_submit_to_options, cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0)::{lambda(cql_transport::cql_server&)#1}::operator()(cql_transport::cql_server&)::{lambda()#1}&&) at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/invoke.h:96
 (inlined by) decltype(auto) std::__apply_impl<seastar::sharded<cql_transport::cql_server>::invoke_on_all<cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0>(seastar::smp_submit_to_options, cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0)::{lambda(cql_transport::cql_server&)#1}::operator()(cql_transport::cql_server&)::{lambda()#1}, std::tuple<>>(seastar::sharded<cql_transport::cql_server>::invoke_on_all<cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0>(seastar::smp_submit_to_options, cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0)::{lambda(cql_transport::cql_server&)#1}::operator()(cql_transport::cql_server&)::{lambda()#1}&&, std::tuple<>&&, std::integer_sequence<unsigned long>) at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/tuple:2288
 (inlined by) decltype(auto) std::apply<seastar::sharded<cql_transport::cql_server>::invoke_on_all<cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0>(seastar::smp_submit_to_options, cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0)::{lambda(cql_transport::cql_server&)#1}::operator()(cql_transport::cql_server&)::{lambda()#1}, std::tuple<> >(seastar::sharded<cql_transport::cql_server>::invoke_on_all<cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0>(seastar::smp_submit_to_options, cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0)::{lambda(cql_transport::cql_server&)#1}::operator()(cql_transport::cql_server&)::{lambda()#1}&&, std::tuple<>&&) at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/tuple:2299
 (inlined by) operator() at ././seastar/include/seastar/core/sharded.hh:765
 (inlined by) seastar::future<void> std::__invoke_impl<seastar::future<void>, seastar::sharded<cql_transport::cql_server>::invoke_on_all<cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0>(seastar::smp_submit_to_options, cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0)::{lambda(cql_transport::cql_server&)#1}&, cql_transport::cql_server&>(std::__invoke_other, seastar::sharded<cql_transport::cql_server>::invoke_on_all<cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0>(seastar::smp_submit_to_options, cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0)::{lambda(cql_transport::cql_server&)#1}&, cql_transport::cql_server&) at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/invoke.h:61
 (inlined by) std::enable_if<is_invocable_r_v<seastar::future<void>, seastar::sharded<cql_transport::cql_server>::invoke_on_all<cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0>(seastar::smp_submit_to_options, cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0)::{lambda(cql_transport::cql_server&)#1}&, cql_transport::cql_server&>, seastar::future<void> >::type std::__invoke_r<seastar::future<void>, seastar::sharded<cql_transport::cql_server>::invoke_on_all<cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0>(seastar::smp_submit_to_options, cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0)::{lambda(cql_transport::cql_server&)#1}&, cql_transport::cql_server&>(std::enable_if&&, (seastar::sharded<cql_transport::cql_server>::invoke_on_all<cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0>(seastar::smp_submit_to_options, cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0)::{lambda(cql_transport::cql_server&)#1}&)...) at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/invoke.h:114
 (inlined by) std::_Function_handler<seastar::future<void> (cql_transport::cql_server&), seastar::sharded<cql_transport::cql_server>::invoke_on_all<cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0>(seastar::smp_submit_to_options, cql_transport::listen_on_all_shards(seastar::sharded<cql_transport::cql_server>&, seastar::socket_address, std::shared_ptr<seastar::tls::credentials_builder>, bool, bool, std::optional<seastar::file_permissions>)::$_0)::{lambda(cql_transport::cql_server&)#1}>::_M_invoke(std::_Any_data const&, cql_transport::cql_server&) at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/std_function.h:290
std::function<seastar::future<void> (cql_transport::cql_server&)>::operator()(cql_transport::cql_server&) const at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/std_function.h:591
 (inlined by) operator() at ././seastar/include/seastar/core/sharded.hh:747
 (inlined by) seastar::future<void> seastar::futurize<seastar::future<void> >::invoke<seastar::sharded<cql_transport::cql_server>::invoke_on_all(seastar::smp_submit_to_options, std::function<seastar::future<void> (cql_transport::cql_server&)>)::{lambda(unsigned int)#1}::operator()(unsigned int) const::{lambda()#1}&>(seastar::sharded<cql_transport::cql_server>::invoke_on_all(seastar::smp_submit_to_options, std::function<seastar::future<void> (cql_transport::cql_server&)>)::{lambda(unsigned int)#1}::operator()(unsigned int) const::{lambda()#1}&) at ././seastar/include/seastar/core/future.hh:2035
 (inlined by) seastar::smp_message_queue::async_work_item<seastar::sharded<cql_transport::cql_server>::invoke_on_all(seastar::smp_submit_to_options, std::function<seastar::future<void> (cql_transport::cql_server&)>)::{lambda(unsigned int)#1}::operator()(unsigned int) const::{lambda()#1}>::run_and_dispose() at ././seastar/include/seastar/core/smp.hh:249
seastar::reactor::run_tasks(seastar::reactor::task_queue&) at ./build/release/seastar/./seastar/src/core/reactor.cc:2690
 (inlined by) seastar::reactor::run_some_tasks() at ./build/release/seastar/./seastar/src/core/reactor.cc:3152
seastar::reactor::do_run() at ./build/release/seastar/./seastar/src/core/reactor.cc:3320
operator() at ./build/release/seastar/./seastar/src/core/reactor.cc:4563
 (inlined by) void std::__invoke_impl<void, seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_0&>(std::__invoke_other, seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_0&) at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/invoke.h:61
 (inlined by) std::enable_if<is_invocable_r_v<void, seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_0&>, void>::type std::__invoke_r<void, seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_0&>(seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_0&) at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/invoke.h:111
 (inlined by) std::_Function_handler<void (), seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_0>::_M_invoke(std::_Any_data const&) at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/std_function.h:290
std::function<void ()>::operator()() const at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/std_function.h:591
 (inlined by) seastar::posix_thread::start_routine(void*) at ./build/release/seastar/./seastar/src/core/posix.cc:90
?? ??:0
?? ??:0

kallsyms_20240512_075635_result.log

Impact

Load failed

How frequently does it reproduce?


Installation details

Cluster size: 6 nodes (i4i.4xlarge)

Scylla Nodes used in this run:

  • longevity-tls-50gb-3d-master-db-node-a6bbb535-9 (3.255.115.235 | 10.4.8.139) (shards: 14)
  • longevity-tls-50gb-3d-master-db-node-a6bbb535-8 (34.244.47.138 | 10.4.9.0) (shards: 14)
  • longevity-tls-50gb-3d-master-db-node-a6bbb535-7 (34.242.230.228 | 10.4.9.166) (shards: 14)
  • longevity-tls-50gb-3d-master-db-node-a6bbb535-6 (52.51.27.26 | 10.4.8.53) (shards: 14)
  • longevity-tls-50gb-3d-master-db-node-a6bbb535-5 (52.16.209.2 | 10.4.8.183) (shards: 14)
  • longevity-tls-50gb-3d-master-db-node-a6bbb535-4 (18.201.155.139 | 10.4.10.52) (shards: 14)
  • longevity-tls-50gb-3d-master-db-node-a6bbb535-3 (3.249.199.72 | 10.4.11.206) (shards: 14)
  • longevity-tls-50gb-3d-master-db-node-a6bbb535-2 (34.255.217.93 | 10.4.11.113) (shards: 14)
  • longevity-tls-50gb-3d-master-db-node-a6bbb535-1 (3.254.116.202 | 10.4.8.49) (shards: 14)

OS / Image: ami-0b7480423a402aa95 (aws: undefined_region)

Test: longevity-50gb-3days-test
Test id: a6bbb535-3cf6-4f8b-b742-40ef856170ea
Test name: scylla-master/tier1/longevity-50gb-3days-test
Test config file(s):

Logs and commands
  • Restore Monitor Stack command: $ hydra investigate show-monitor a6bbb535-3cf6-4f8b-b742-40ef856170ea
  • Restore monitor on AWS instance using Jenkins job
  • Show all stored logs command: $ hydra investigate show-logs a6bbb535-3cf6-4f8b-b742-40ef856170ea

Logs:

Jenkins job URL
Argus


mykaul commented May 13, 2024

That reactor stall is not new (see #13758 (comment) and https://github.com/scylladb/scylla-enterprise/issues/3963#issue-2161024203) - and I remember more (but can't find them right now!).


mykaul commented May 13, 2024

Reactor stalls (32ms) and kernel callstacks

@juliayakovlev - where's the kernel stack?


mykaul commented May 13, 2024

@juliayakovlev - what encryption was configured here, btw? Client <-> server? server <-> server? both?

@juliayakovlev (Author)

Reactor stalls (32ms) and kernel callstacks

@juliayakovlev - where's the kernel stack?

If you mean the original file - it's in the node logs (https://cloudius-jenkins-test.s3.amazonaws.com/a6bbb535-3cf6-4f8b-b742-40ef856170ea/20240512_082401/db-cluster-a6bbb535.tar.gz)
If you are searching for the decoded file - I attached it to the issue description

@juliayakovlev (Author)

@juliayakovlev - what encryption was configured here, btw? Client <-> server? server <-> server? both?

both


mykaul commented May 13, 2024

Reactor stalls (32ms) and kernel callstacks

@juliayakovlev - where's the kernel stack?

If you mean the original file - it's in the node logs (https://cloudius-jenkins-test.s3.amazonaws.com/a6bbb535-3cf6-4f8b-b742-40ef856170ea/20240512_082401/db-cluster-a6bbb535.tar.gz) If you are searching for the decoded file - I attached it to the issue description

I couldn't find a single kernel stack in the logs. All empty?

@juliayakovlev (Author)

Reactor stalls (32ms) and kernel callstacks

@juliayakovlev - where's the kernel stack?

If you mean the original file - it's in the node logs (https://cloudius-jenkins-test.s3.amazonaws.com/a6bbb535-3cf6-4f8b-b742-40ef856170ea/20240512_082401/db-cluster-a6bbb535.tar.gz) If you are searching for the decoded file - I attached it to the issue description

I couldn't find a single kernel stack in the logs. All empty?

The file is named "kallsyms_20240512_075635" in the longevity-tls-50gb-3d-master-db-node-a6bbb535-6 folder.

In the node log I see only:

May 12 08:10:09.293433 longevity-tls-50gb-3d-master-db-node-a6bbb535-6 scylla[1465]: Reactor stalled for 32 ms on shard 2. Backtrace: 0x5e785fa 0x5e77a05 0x5e78dbf 0x3dbaf 0x6f1b1
May 12 08:10:09.293433 longevity-tls-50gb-3d-master-db-node-a6bbb535-6 scylla[1465]: kernel callstack:
May 12 08:10:09.297418 longevity-tls-50gb-3d-master-db-node-a6bbb535-6 scylla[1465]: Reactor stalled for 32 ms on shard 1. Backtrace: 0x5e785fa 0x5e77a05 0x5e78dbf 0x3dbaf 0x6f18b
May 12 08:10:09.297418 longevity-tls-50gb-3d-master-db-node-a6bbb535-6 scylla[1465]: kernel callstack:
May 12 08:10:09.301420 longevity-tls-50gb-3d-master-db-node-a6bbb535-6 scylla[1465]: Reactor stalled for 32 ms on shard 12. Backtrace: 0x5e785fa 0x5e77a05 0x5e78dbf 0x3dbaf 0x32c21 0x21e52 0x68484 0xe059 0x14976 0x14bfd 0x14d09 0x169fb7 0x85d22 0x86121 0x6de02 0x7da2c 0x6332334 0x632ab1b 0x632ae5f 0x632b3a6 0x20fe8b1 0x290118d 0x2900905 0x5e8a29f 0x5e8b587 0x5eaf4c0 0x5e4abda 0x8c946 0x11296f
May 12 08:10:09.301420 longevity-tls-50gb-3d-master-db-node-a6bbb535-6 scylla[1465]: kernel callstack:
May 12 08:10:09.301959 longevity-tls-50gb-3d-master-db-node-a6bbb535-6 scylla[1465]: Reactor stalled for 32 ms on shard 10. Backtrace: 0x5e785fa 0x5e77a05 0x5e78dbf 0x3dbaf 0x6f16f
May 12 08:10:09.301959 longevity-tls-50gb-3d-master-db-node-a6bbb535-6 scylla[1465]: kernel callstack:
May 12 08:10:09.302602 longevity-tls-50gb-3d-master-db-node-a6bbb535-6 scylla[1465]: Reactor stalled for 32 ms on shard 8. Backtrace: 0x5e785fa 0x5e77a05 0x5e78dbf 0x3dbaf 0x6f1ac
May 12 08:10:09.302602 longevity-tls-50gb-3d-master-db-node-a6bbb535-6 scylla[1465]: kernel callstack:
May 12 08:10:09.302994 longevity-tls-50gb-3d-master-db-node-a6bbb535-6 scylla[1465]: Reactor stalled for 32 ms on shard 11. Backtrace: 0x5e785fa 0x5e77a05 0x5e78dbf 0x3dbaf 0x6f19e
May 12 08:10:09.302994 longevity-tls-50gb-3d-master-db-node-a6bbb535-6 scylla[1465]: kernel callstack:
May 12 08:10:09.303774 longevity-tls-50gb-3d-master-db-node-a6bbb535-6 scylla[1465]: Reactor stalled for 32 ms on shard 7. Backtrace: 0x5e785fa 0x5e77a05 0x5e78dbf 0x3dbaf 0x6f1ac
May 12 08:10:09.303774 longevity-tls-50gb-3d-master-db-node-a6bbb535-6 scylla[1465]: kernel callstack:
May 12 08:10:09.304526 longevity-tls-50gb-3d-master-db-node-a6bbb535-6 scylla[1465]: Reactor stalled for 32 ms on shard 4. Backtrace: 0x5e785fa 0x5e77a05 0x5e78dbf 0x3dbaf 0x6f19e
May 12 08:10:09.304526 longevity-tls-50gb-3d-master-db-node-a6bbb535-6 scylla[1465]: kernel callstack:
May 12 08:10:09.304678 longevity-tls-50gb-3d-master-db-node-a6bbb535-6 scylla[1465]: Reactor stalled for 32 ms on shard 6. Backtrace: 0x5e785fa 0x5e77a05 0x5e78dbf 0x3dbaf 0x6f1ac
May 12 08:10:09.304678 longevity-tls-50gb-3d-master-db-node-a6bbb535-6 scylla[1465]: kernel callstack:
May 12 08:10:09.305453 longevity-tls-50gb-3d-master-db-node-a6bbb535-6 scylla[1465]: Reactor stalled for 32 ms on shard 13. Backtrace: 0x5e785fa 0x5e77a05 0x5e78dbf 0x3dbaf 0x6f1c2
May 12 08:10:09.305453 longevity-tls-50gb-3d-master-db-node-a6bbb535-6 scylla[1465]: kernel callstack:
May 12 08:10:09.305529 longevity-tls-50gb-3d-master-db-node-a6bbb535-6 scylla[1465]: Reactor stalled for 32 ms on shard 9. Backtrace: 0x5e785fa 0x5e77a05 0x5e78dbf 0x3dbaf 0x6f19e
May 12 08:10:09.305529 longevity-tls-50gb-3d-master-db-node-a6bbb535-6 scylla[1465]: kernel callstack:
May 12 08:10:09.305529 longevity-tls-50gb-3d-master-db-node-a6bbb535-6 scylla[1465]: Reactor stalled for 33 ms on shard 5. Backtrace: 0x5e785fa 0x5e77a05 0x5e78dbf 0x3dbaf 0x6f1c2
May 12 08:10:09.305529 longevity-tls-50gb-3d-master-db-node-a6bbb535-6 scylla[1465]: kernel callstack:

Not sure what it means


mykaul commented May 13, 2024

this is exactly what I mean - I don't see any kernel stack.

fruch added the triage/master (Looking for assignee) label May 13, 2024

fruch commented May 13, 2024

run from last week (5.5.0~dev-20240501.af5674211dd4):
https://argus.scylladb.com/test/98050732-dfe3-464c-a66a-f235bad30829/runs?additionalRuns[]=16ad5b7e-ab08-4d63-bfb3-ca368a4433f5

passed this nemesis successfully

@juliayakovlev, let's give it another run to see if it's reproducible

@michoecho (Contributor)

java.io.IOException: Operation x10 on key(s) [4c304d4e313534343730]: Error executing: (ReadFailureException): Cassandra failure during read query at consistency QUORUM (2 responses were required but only 1 replica responded, 1 failed)

I hate this, I hate this.

This isn't the first (or 100th) time we are debugging why queries failed for unclear reasons.

But Scylla knows very well which replicas were available, which were queried, which failed, and what reasons they presented. Why can't we just make it tell us?


mykaul commented May 14, 2024

java.io.IOException: Operation x10 on key(s) [4c304d4e313534343730]: Error executing: (ReadFailureException): Cassandra failure during read query at consistency QUORUM (2 responses were required but only 1 replica responded, 1 failed)

I hate this, I hate this.

This isn't the first (or 100th) time we are debugging why queries failed for unclear reasons.

But Scylla knows very well which replicas were available, which were queried, which failed, and what reasons they presented. Why can't we just make it tell us?

Do you expect a log on the coordinator for every drop?

@michoecho (Contributor)

java.io.IOException: Operation x10 on key(s) [4c304d4e313534343730]: Error executing: (ReadFailureException): Cassandra failure during read query at consistency QUORUM (2 responses were required but only 1 replica responded, 1 failed)

I hate this, I hate this.
This isn't the first (or 100th) time we are debugging why queries failed for unclear reasons.
But Scylla knows very well which replicas were available, which were queried, which failed, and what reasons they presented. Why can't we just make it tell us?

Do you expect a log on the coordinator for every drop?

Log? No, I would like more information to be added to the error returned to the client, more than just the number of replicas which failed.


michoecho commented May 14, 2024

Also, if cassandra-stress retries each operation 10 times, it should print all 10 errors, not just the last one.

Also, it should report the coordinator. The client knows which coordinator it picked.

All we know from these errors is:
"A write failed 10 times. The last time it failed, 1 replica succeeded and 1 replica responded with an error."

Wouldn't it be better if the error reports narrowed down the problem more? With this, we don't even know if the restarted node was a coordinator or a replica, let alone why the replica failed, or what's up with the third, uncontacted replica.


mykaul commented May 14, 2024

The current protocol does not provide more information - https://github.com/apache/cassandra/blob/6bae4f76fb043b4c3a3886178b5650b280e9a50b/doc/native_protocol_v4.spec#L1076
We can extend it of course. And we can probably extend client-side error messages. CC @roydahan

@juliayakovlev (Author)

run from last week (5.5.0~dev-20240501.af5674211dd4): https://argus.scylladb.com/test/98050732-dfe3-464c-a66a-f235bad30829/runs?additionalRuns[]=16ad5b7e-ab08-4d63-bfb3-ca368a4433f5

passed this nemesis successfully

@juliayakovlev, let's give it another run to see if it's reproducible

The issue was not reproduced in https://argus.scylladb.com/test/98050732-dfe3-464c-a66a-f235bad30829/runs?additionalRuns[]=9adcc62d-4f9f-4b92-9316-87279f4c1b92 run

@roydahan

We can extend it of course. And we can probably extend client-side error messages. CC @roydahan

One day we can improve the tools to provide more information; anyway, it's a side track to this issue.
We have the key, so we should be able to tell what the replicas are, and we know which replica is down.

juliayakovlev added a commit to juliayakovlev/scylla-cluster-tests that referenced this issue May 16, 2024
Try to reproduce scylladb/scylladb#18647 issue, run the test with ClusterRollingRestart nemesis only
@juliayakovlev (Author)

Reproducer with rolling restart cluster nemesis only.
The issue was reproduced during the first nemesis run.

Screenshot from 2024-05-16 13-33-26
Screenshot from 2024-05-16 13-30-59

Packages

Scylla version: 5.5.0~dev-20240510.28791aa2c1d3 with build-id 893c2a68becf3d3bcbbf076980b1b831b9b76e29

Kernel Version: 5.15.0-1060-aws

Issue description

  • This issue is a regression.
  • It is unknown if this issue is a regression.


Impact


How frequently does it reproduce?


Installation details

Cluster size: 6 nodes (i4i.4xlarge)

Scylla Nodes used in this run:

  • longevity-tls-50gb-3d-repr-iss-db-node-0804442d-6 (3.254.157.122 | 10.4.3.161) (shards: 14)
  • longevity-tls-50gb-3d-repr-iss-db-node-0804442d-5 (3.248.195.69 | 10.4.1.248) (shards: 14)
  • longevity-tls-50gb-3d-repr-iss-db-node-0804442d-4 (54.75.96.131 | 10.4.1.55) (shards: 14)
  • longevity-tls-50gb-3d-repr-iss-db-node-0804442d-3 (52.209.38.69 | 10.4.0.23) (shards: 14)
  • longevity-tls-50gb-3d-repr-iss-db-node-0804442d-2 (3.250.42.65 | 10.4.0.85) (shards: 14)
  • longevity-tls-50gb-3d-repr-iss-db-node-0804442d-1 (54.247.198.242 | 10.4.1.187) (shards: 14)

OS / Image: ami-0b7480423a402aa95 (aws: undefined_region)

Test: longevity-50gb-3days-test
Test id: 0804442d-781a-4233-8168-7dd3e8896011
Test name: scylla-master/reproducers/longevity-50gb-3days-test
Test config file(s):

Logs and commands
  • Restore Monitor Stack command: $ hydra investigate show-monitor 0804442d-781a-4233-8168-7dd3e8896011
  • Restore monitor on AWS instance using Jenkins job
  • Show all stored logs command: $ hydra investigate show-logs 0804442d-781a-4233-8168-7dd3e8896011

Logs:

Jenkins job URL
Argus


mykaul commented May 16, 2024

@juliayakovlev - anything relevant in the replica logs at the time of failure?


mykaul commented May 16, 2024

We can extend it of course. And we can probably extend client-side error messages. CC @roydahan

One day we can improve the tools to provide more information; anyway, it's a side track to this issue. We have the key, so we should be able to tell what the replicas are, and we know which replica is down.

@roydahan - please open a tracking issue for this. Sounds like an easy additional log in the stress tool (c-s?) that could help us.


kostja commented May 16, 2024

@kbr-scylla suspects it's a duplicate of #15899; let's fix #15899 and re-test this

@juliayakovlev (Author)

@juliayakovlev - anything relevant in the replica logs at the time of failure?

I did not find anything new

@kbr-scylla (Contributor)

#18647 (comment)

Reproducer with rolling restart cluster nemesis only.

@juliayakovlev could you please also check if it reproduces on 5.4?


fruch commented May 21, 2024

#18647 (comment)

Reproducer with rolling restart cluster nemesis only.

@juliayakovlev could you please also check if it reproduces on 5.4?

@juliayakovlev wrote 2 days ago:

Issue was not reproduced with Scylla version 5.4.6
https://argus.scylladb.com/test/a1c2befc-bd68-457a-ba19-913607256e6f/runs?additionalRuns[]=e0f3aa44-fb22-40a4-b406-91e16ada6c1b

@kbr-scylla (Contributor)

Sorry I missed it.

In this case this is a regression and it is not a duplicate of #15899 (which according to the report, happened way back in 5.1)!

I think it's a major issue -- availability disruption during rolling restart.

Giving it P1 priority and release blocker status.

Actually I already have a suspicion about what could be the cause: removing wait-for-gossip-to-settle on node restart before "completing initialization" :( (cc @kostja @gleb-cloudius) 65cfb9b

We should retest with that final wait-for-gossip-to-settle restored (65cfb9b removed two waits -- I believe we only need the second one for preserving availability)

If so, we should consider:

  • restoring it
  • or implementing another mechanism (perhaps a faster one) for availability-preserving-rolling-restart. When performing a rolling restart, first node A then node B, we need to make sure that node A is seen as UP and NORMAL by all nodes before we shut down B. There are perhaps smarter ways to do it other than "wait for gossip to settle".

@kbr-scylla (Contributor)

Modified original post (this is a regression)

kbr-scylla added the status/release blocker (Preventing a release from being promoted) label and removed the triage/master (Looking for assignee) label May 21, 2024
kbr-scylla added this to the 6.0 milestone May 21, 2024

dorlaor commented May 22, 2024

Reproducer with rolling restart cluster nemesis only. The issue was reproduced during the first nemesis run.

Screenshot from 2024-05-16 13-33-26 Screenshot from 2024-05-16 13-30-59

I think that servers that were considered restarted and joined the cluster do not have the capacity of the other servers; see the amount of background writes - there are big differences between the servers.

The gap grows and grows, until eventually there isn't enough capacity and we reach a timeout.
We need to figure out what influences server performance after they were rebooted. Things like the cache affect them. This shouldn't affect writes, but it might require a read to cache a write too.

It's probably not a regression, just an issue we may have in general; it needs research.



mykaul commented May 26, 2024

Packages

Scylla version: 6.1.0~dev-20240523.9adf74ae6c7a with build-id 0e61ad9ecb33913aa59e185d2453859c9ed0fd1a

Kernel Version: 5.15.0-1062-aws

Issue description

  • This issue is a regression.
  • It is unknown if this issue is a regression.


Impact


How frequently does it reproduce?


Installation details

Cluster size: 6 nodes (i4i.4xlarge)

Scylla Nodes used in this run:

  • longevity-tls-50gb-3d-6-0-db-node-d6d9eca1-9 (34.245.159.154 | 10.4.10.101) (shards: 14)
  • longevity-tls-50gb-3d-6-0-db-node-d6d9eca1-8 (34.240.42.206 | 10.4.11.134) (shards: 14)
  • longevity-tls-50gb-3d-6-0-db-node-d6d9eca1-7 (34.255.116.175 | 10.4.10.227) (shards: 14)
  • longevity-tls-50gb-3d-6-0-db-node-d6d9eca1-6 (3.250.46.23 | 10.4.9.1) (shards: 14)
  • longevity-tls-50gb-3d-6-0-db-node-d6d9eca1-5 (18.200.252.30 | 10.4.8.20) (shards: 14)
  • longevity-tls-50gb-3d-6-0-db-node-d6d9eca1-4 (34.253.186.166 | 10.4.11.10) (shards: 14)
  • longevity-tls-50gb-3d-6-0-db-node-d6d9eca1-3 (54.78.205.8 | 10.4.8.206) (shards: 14)
  • longevity-tls-50gb-3d-6-0-db-node-d6d9eca1-2 (3.249.117.229 | 10.4.10.104) (shards: 14)
  • longevity-tls-50gb-3d-6-0-db-node-d6d9eca1-1 (3.255.179.210 | 10.4.10.0) (shards: 14)

OS / Image: ami-0927fb8b03edc430c (aws: undefined_region)

Test: longevity-50gb-3days-test
Test id: d6d9eca1-5327-4f35-9588-2b36c644401f
Test name: scylla-6.0/tier1/longevity-50gb-3days-test
Test config file(s):

Logs and commands
  • Restore Monitor Stack command: $ hydra investigate show-monitor d6d9eca1-5327-4f35-9588-2b36c644401f
  • Restore monitor on AWS instance using Jenkins job
  • Show all stored logs command: $ hydra investigate show-logs d6d9eca1-5327-4f35-9588-2b36c644401f

Logs:

Jenkins job URL
Argus

@gleb-cloudius (Contributor)

@juliayakovlev @roydahan how does this rolling restart nemesis decide that it can restart the next node?


fruch commented May 27, 2024

@juliayakovlev @roydahan how does this rolling restart nemesis decide that it can restart the next node?

once the node is listening on CQL, it moves to the next one
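
As a rough sketch of what that gate amounts to (not the actual SCT check; the port and timeouts below are assumptions):

import socket, time

def wait_for_cql(host: str, port: int = 9042, poll: float = 5.0, timeout: float = 600.0) -> None:
    # Returns once something accepts TCP connections on the CQL port.
    # Note: this says nothing about whether the other nodes already see this node as UP.
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            with socket.create_connection((host, port), timeout=poll):
                return
        except OSError:
            time.sleep(poll)
    raise TimeoutError(f"{host} did not open CQL port {port} within {timeout}s")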


soyacz commented May 27, 2024

Since a couple of days ago we verify the CQL port much more often (it was every 60 seconds, now every 5), so the issue could be emphasized.


mykaul commented May 27, 2024

@juliayakovlev @roydahan how does this rolling restart nemesis decide that it can restart the next node?

once the node is listening on CQL, it moves to the next one

There was a long thread about it not being enough and the need for additional checks (scylladb/scylla-ccm#564 implemented this on CCM, and I believe there was a similar issue for dtest?). Specifically, ensure all OTHER nodes see that node as alive and owning its share of the ring?

@gleb-cloudius (Contributor)

@juliayakovlev @roydahan how does this rolling restart nemesis decide that it can restart the next node?

once the node is listening on CQL, it moves to the next one

There was a long thread about it not being enough and the need for additional checks (scylladb/scylla-ccm#564 implemented this on CCM, and I believe there was a similar issue for dtest?). Specifically, ensure all OTHER nodes see that node as alive and owning its share of the ring?

Yes, just checking CQL port is not enough. But it worked. We still need to figure out what changed.

@kbr-scylla (Contributor)

Yes, just checking CQL port is not enough. But it worked. We still need to figure out what changed.

Before 65cfb9b, the CQL port being open meant that gossip had settled. After this commit, it no longer does.

@gleb-cloudius (Contributor)

Yes, just checking CQL port is not enough. But it worked. We still need to figure out what changed.

Before 65cfb9b, the CQL port being open meant that gossip had settled. After this commit, it no longer does.

I know :) But we did not confirm it yet. Also, why does gossip settling guarantee that all nodes see all other nodes as alive? Maybe it is just because it takes time, and it does not guarantee it in reality.

@kbr-scylla (Contributor)

Also, why does gossip settling guarantee that all nodes see all other nodes as alive? Maybe it is just because it takes time, and it does not guarantee it in reality.

That's my guess too -- there was no guarantee, but since wait-for-gossip-to-settle always took at least a few seconds, in practice the observable result was that all nodes saw this one as UP before continuing rolling restart on the next node.


fruch commented May 27, 2024

A strong sense of déjà vu here, around this question.

But what's the next step? How can a user do a safe rolling restart with this version?

@gleb-cloudius (Contributor)

A strong sense of déjà vu here, around this question.

But what's the next step? How can a user do a safe rolling restart with this version?

The proper procedure for rolling restart was always to wait for the CQL port and wait for all nodes to see the restarted node as UP.
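
A minimal sketch of that second check, assuming SSH access to the nodes and plain nodetool status output (the helper names are illustrative, not an existing SCT or Scylla API):

import subprocess

def seen_as_un(viewer: str, restarted_ip: str) -> bool:
    # True if `viewer` reports the restarted node as UN (Up/Normal) in its nodetool status.
    out = subprocess.run(["ssh", viewer, "nodetool", "status"],
                         capture_output=True, text=True, check=True).stdout
    for line in out.splitlines():
        if restarted_ip in line:
            return line.strip().startswith("UN")
    return False  # the restarted node is not listed at all in this node's view

def safe_to_restart_next(restarted_ip: str, all_nodes: list[str]) -> bool:
    # Proceed to the next node only when *every* node sees the restarted one as UN.
    return all(seen_as_un(node, restarted_ip) for node in all_nodes)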

@kbr-scylla (Contributor)

The proper procedure for rolling restart was always to wait for the CQL port and wait for all nodes to see the restarted node as UP.

BTW our docs are vague about it

https://enterprise.docs.scylladb.com/stable/operating-scylla/procedures/config-change/rolling-restart.html

Step 5 says

Verify the node is up and has returned to the Scylla cluster using nodetool status.

but it doesn't say that nodetool status should show UN for this node on every node (so we need to execute nodetool status on every node)

And we have to admit that it's pretty inconvenient to have to connect to every node and execute status there. It's just bad UX.

@gleb-cloudius (Contributor)

The proper procedure for rolling restart was always to wait for the CQL port and wait for all nodes to see the restarted node as UP.

BTW our docs are vague about it

https://enterprise.docs.scylladb.com/stable/operating-scylla/procedures/config-change/rolling-restart.html

Step 5 says

Verify the node is up and has returned to the Scylla cluster using nodetool status.

but it doesn't say that nodetool status should show UN for this node on every node (so we need to execute nodetool status on every node)

And we have to admit that it's pretty inconvenient to have to connect to every node and execute status there. It's just bad UX.

The node may do it itself before opening the CQL port, like it does with the shutdown notification, but this is not what "waiting for gossiper to settle" was doing, so this is a different feature request.


soyacz commented May 28, 2024

The proper procedure for rolling restart was always to wait for the CQL port and wait for all nodes to see the restarted node as UP.

BTW our docs are vague about it

https://enterprise.docs.scylladb.com/stable/operating-scylla/procedures/config-change/rolling-restart.html

Step 5 says

Verify the node is up and has returned to the Scylla cluster using nodetool status.

but it doesn't say that nodetool status should show UN for this node on every node (so we need to execute nodetool status on every node)

And we have to admit that it's pretty inconvenient to have to connect to every node and execute status there. It's just bad UX.

Anyway, SCT in this case doesn't even do it on a single node.
@gleb-cloudius Do we drain nodes before stopping in rolling restart procedure on prod clusters?

@kbr-scylla (Contributor)

@gleb-cloudius Do we drain nodes before stopping in rolling restart procedure on prod clusters?

We can ask our Field engineers. @tarzanek could you help answer this?

But I suspect the manual drain is redundant -- graceful shutdown should already drain automatically before stopping the process.


mykaul commented May 28, 2024

@gleb-cloudius Do we drain nodes before stopping in rolling restart procedure on prod clusters?

We can ask our Field engineers. @tarzanek could you help answering this?

It's in Siren's code. But this is an OSS issue, so I won't paste the link. Generally, we do. And we have a timeout between drain and restart too, btw.
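
For reference, such a drain-then-restart step could look roughly like this (a sketch under assumptions; the pause length is arbitrary and the actual Siren implementation is not public):

import subprocess, time

def drain_and_restart(node: str, pause_seconds: int = 60) -> None:
    # Flush memtables and stop accepting new requests before the restart.
    subprocess.run(["ssh", node, "nodetool", "drain"], check=True)
    time.sleep(pause_seconds)  # assumed timeout between drain and restart
    subprocess.run(["ssh", node, "sudo", "systemctl", "restart", "scylla-server.service"], check=True)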
