Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Snap Versions 6.3-6.7 Include Wrong Version of NodeJS. 6.7 Keeps crashing. #32308

Closed
maurice-does-software opened this issue Apr 24, 2024 · 3 comments
Labels
stat: need more info stat: no response Issue with no response after a given time

Comments

@maurice-does-software
Copy link

Description:

I can't 100% say the crashing is related to mismatched Node versions, only that the release notes ( https://github.com/RocketChat/Rocket.Chat/releases/tag/6.3.12 thru https://github.com/RocketChat/Rocket.Chat/releases/tag/6.7.0 ) say "Node: 14.21.3", and the Administration > Workspace section of our install says "Node Version v14.21.2". RocketChat and the version of Node it is using are installed via snap.

Steps to reproduce:

Start with snap version 6.1.8, wait for it to auto-update to later version, notice it crashing during peak times, which are still relatively low traffic for our org, but produce significant spikes in CPU and RAM usage.

Expected behavior:

Newer versions continuing to run as smoothly as they were previously.

Actual behavior:

Intermittent "502 Bad Gateway" responses in web gui and official mobile app, which continue until the host running RocketChat is restarted.

Server Setup Information:

  • Version of Rocket.Chat Server: 6.7.0
  • Operating System: Debian 12
  • Deployment Method: Snap
  • Number of Running Instances: 1
  • DB Replicaset Oplog: Enabled
  • NodeJS Version: 14.21.2
  • MongoDB Version: 6.0.11 / wiredTiger (oplog Enabled)

Client Setup Information

  • Desktop App or Browser Version: Chrome 123.0.6312.122
  • Operating System: Windows 11, Ubuntu 20, various Android and iOS versions (The issue here is NOT with the clients.)

Additional context

Crashes started this past week. After attempting to diagnose the issue from the logs, but not knowing what was relevant, or how any of it tied together, and seeing the spikes in CPU and RAM usage, we doubled the capacity of our virtual server. Intermittent crashes are still occurring tho.

Relevant logs:

MongoDB logs scraped with command sudo grep -n -B1 -A1 'error' /var/log/mongodb/mongod.log | tail -72:

112117-{"t":{"$date":"2024-04-22T15:01:04.583+00:00"},"s":"I", "c":"NETWORK", "id":23016, "ctx":"listener","msg":"Waiting for connections","attr":{"port":27017,"ssl":"off"}}
112118:{"t":{"$date":"2024-04-22T15:01:04.583+00:00"},"s":"I", "c":"-", "id":4939300, "ctx":"monitoring-keys-for-HMAC","msg":"Failed to refresh key cache","attr":{"error":"ReadConcernMajorityNotAvailableYet: Read concern majority reads are currently not possible.","nextWakeupMillis":400}}
112119-{"t":{"$date":"2024-04-22T15:01:04.585+00:00"},"s":"I", "c":"REPL", "id":5123005, "ctx":"ShardSplitDonorService-0","msg":"Rebuilding PrimaryOnlyService due to stepUp","attr":{"service":"ShardSplitDonorService"}}

112310-{"t":{"$date":"2024-04-22T16:06:03.627+00:00"},"s":"I", "c":"STORAGE", "id":22297, "ctx":"initandlisten","msg":"Using the XFS filesystem is strongly recommended with the WiredTiger storage engine. See http://dochub.mongodb.org/core/prodnotes-filesystem","tags":["startupWarnings"]}
112311:{"t":{"$date":"2024-04-22T16:06:03.628+00:00"},"s":"I", "c":"STORAGE", "id":22315, "ctx":"initandlisten","msg":"Opening WiredTiger","attr":{"config":"create,cache_size=1445M,session_max=33000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,remove=true,path=journal,compressor=snappy),builtin_extension_config=(zstd=(compression_level=6)),file_manager=(close_idle_time=600,close_scan_interval=10,close_handle_minimum=2000),statistics_log=(wait=0),json_output=(error,message),verbose=[recovery_progress:1,checkpoint_progress:1,compact_progress:1,backup:0,checkpoint:0,compact:0,evict:0,history_store:0,recovery:0,rts:0,salvage:0,tiered:0,timestamp:0,transaction:0,verify:0,log:0],"}}
112312-{"t":{"$date":"2024-04-22T16:06:04.507+00:00"},"s":"I", "c":"STORAGE", "id":4795906, "ctx":"initandlisten","msg":"WiredTiger opened","attr":{"durationMillis":879}}

112349-{"t":{"$date":"2024-04-22T16:06:08.307+00:00"},"s":"I", "c":"REPL", "id":4280506, "ctx":"initandlisten","msg":"Reconstructing prepared transactions"}
112350:{"t":{"$date":"2024-04-22T16:06:08.309+00:00"},"s":"I", "c":"-", "id":4939300, "ctx":"monitoring-keys-for-HMAC","msg":"Failed to refresh key cache","attr":{"error":"ReadConcernMajorityNotAvailableYet: Read concern majority reads are currently not possible.","nextWakeupMillis":200}}
112351-{"t":{"$date":"2024-04-22T16:06:08.310+00:00"},"s":"I", "c":"REPL", "id":4280507, "ctx":"initandlisten","msg":"Loaded replica set config, scheduled callback to set local config"}

112380-{"t":{"$date":"2024-04-22T16:06:08.334+00:00"},"s":"I", "c":"CONNPOOL", "id":22576, "ctx":"ReplicaSetMonitor-TaskExecutor","msg":"Connecting","attr":{"hostAndPort":"127.0.0.1:27017"}}
112381:{"t":{"$date":"2024-04-22T16:06:08.334+00:00"},"s":"I", "c":"-", "id":4333222, "ctx":"ReplicaSetMonitor-TaskExecutor","msg":"RSM received error response","attr":{"host":"127.0.0.1:27017","error":"HostUnreachable: Error connecting to 127.0.0.1:27017 :: caused by :: Connection refused","replicaSet":"rs01","response":{}}}
112382:{"t":{"$date":"2024-04-22T16:06:08.334+00:00"},"s":"I", "c":"NETWORK", "id":4712102, "ctx":"ReplicaSetMonitor-TaskExecutor","msg":"Host failed in replica set","attr":{"replicaSet":"rs01","host":"127.0.0.1:27017","error":{"code":6,"codeName":"HostUnreachable","errmsg":"Error connecting to 127.0.0.1:27017 :: caused by :: Connection refused"},"action":{"dropConnections":true,"requestImmediateCheck":true}}}
112383-{"t":{"$date":"2024-04-22T16:06:08.334+00:00"},"s":"I", "c":"REPL", "id":21106, "ctx":"ReplCoord-0","msg":"Resetting sync source to empty","attr":{"previousSyncSource":":27017"}}

112410-{"t":{"$date":"2024-04-22T16:06:08.337+00:00"},"s":"I", "c":"NETWORK", "id":23016, "ctx":"listener","msg":"Waiting for connections","attr":{"port":27017,"ssl":"off"}}
112411:{"t":{"$date":"2024-04-22T16:06:08.338+00:00"},"s":"I", "c":"-", "id":4939300, "ctx":"monitoring-keys-for-HMAC","msg":"Failed to refresh key cache","attr":{"error":"ReadConcernMajorityNotAvailableYet: Read concern majority reads are currently not possible.","nextWakeupMillis":400}}
112412-{"t":{"$date":"2024-04-22T16:06:08.341+00:00"},"s":"I", "c":"REPL", "id":5123005, "ctx":"TenantMigrationDonorService-0","msg":"Rebuilding PrimaryOnlyService due to stepUp","attr":{"service":"TenantMigrationDonorService"}}

112427-{"t":{"$date":"2024-04-22T16:06:08.838+00:00"},"s":"I", "c":"NETWORK", "id":20113, "ctx":"LogicalSessionCacheReap","msg":"Successfully connected to host","attr":{"connString":"127.0.0.1:27017","numOpenConns":2,"socketTimeoutSecs":0}}
112428:{"t":{"$date":"2024-04-22T16:06:09.029+00:00"},"s":"I", "c":"FTDC", "id":20631, "ctx":"ftdc","msg":"Unclean full-time diagnostic data capture shutdown detected, found interim file, some metrics may have been lost","attr":{"error":{"code":0,"codeName":"OK"}}}
112429-{"t":{"$date":"2024-04-22T16:06:11.023+00:00"},"s":"I", "c":"NETWORK", "id":22943, "ctx":"listener","msg":"Connection accepted","attr":{"remote":"127.0.0.1:38952","uuid":"78b6b43b-7cf5-4a2b-997a-9f9285931ebf","connectionId":9,"connectionCount":5}}

112556-{"t":{"$date":"2024-04-22T16:11:49.754+00:00"},"s":"I", "c":"COMMAND", "id":51803, "ctx":"conn4","msg":"Slow query","attr":{"type":"command","ns":"admin.$cmd","command":{"ping":1,"maxTimeMSOpOnly":10000,"$db":"admin"},"numYields":0,"reslen":163,"locks":{},"remote":"127.0.0.1:35092","protocol":"op_msg","durationMillis":141}}
112557:{"t":{"$date":"2024-04-22T16:11:56.792+00:00"},"s":"I", "c":"CONTROL", "id":23377, "ctx":"SignalHandler","msg":"Received signal","attr":{"signal":15,"error":"Terminated"}}
112558-{"t":{"$date":"2024-04-22T16:11:56.803+00:00"},"s":"I", "c":"CONTROL", "id":23378, "ctx":"SignalHandler","msg":"Signal was sent by kill(2)","attr":{"pid":1,"uid":0}}

112609-{"t":{"$date":"2024-04-22T16:11:57.704+00:00"},"s":"I", "c":"ASIO", "id":22582, "ctx":"ReplicaSetMonitor-TaskExecutor","msg":"Killing all outstanding egress activity."}
112610:{"t":{"$date":"2024-04-22T16:11:57.705+00:00"},"s":"I", "c":"CONNPOOL", "id":22572, "ctx":"ReplicaSetMonitor-TaskExecutor","msg":"Dropping all pooled connections","attr":{"hostAndPort":"127.0.0.1:27017","error":"ShutdownInProgress: Shutting down the connection pool"}}
112611-{"t":{"$date":"2024-04-22T16:11:57.706+00:00"},"s":"I", "c":"REPL", "id":4784920, "ctx":"SignalHandler","msg":"Shutting down the LogicalTimeValidator"}

112636-{"t":{"$date":"2024-04-22T16:11:57.825+00:00"},"s":"I", "c":"STORAGE", "id":4795902, "ctx":"SignalHandler","msg":"Closing WiredTiger","attr":{"closeConfig":"leak_memory=true,"}}
112637:{"t":{"$date":"2024-04-22T16:11:58.056+00:00"},"s":"W", "c":"REPL", "id":6100702, "ctx":"ftdc","msg":"Failed to get last stable recovery timestamp due to {error}","attr":{"error":"lock acquire timeout"}}
112638-{"t":{"$date":"2024-04-22T16:11:58.989+00:00"},"s":"I", "c":"STORAGE", "id":4795901, "ctx":"SignalHandler","msg":"WiredTiger closed","attr":{"durationMillis":1154}}

112641-{"t":{"$date":"2024-04-22T16:11:59.021+00:00"},"s":"I", "c":"FTDC", "id":20626, "ctx":"SignalHandler","msg":"Shutting down full-time diagnostic data capture"}
112642:{"t":{"$date":"2024-04-22T16:11:59.305+00:00"},"s":"W", "c":"REPL", "id":6100702, "ctx":"ftdc","msg":"Failed to get last stable recovery timestamp due to {error}","attr":{"error":"lock acquire timeout"}}
112643-{"t":{"$date":"2024-04-22T16:12:00.066+00:00"},"s":"I", "c":"CONTROL", "id":20565, "ctx":"SignalHandler","msg":"Now exiting"}

112658-{"t":{"$date":"2024-04-22T16:12:29.900+00:00"},"s":"I", "c":"STORAGE", "id":22297, "ctx":"initandlisten","msg":"Using the XFS filesystem is strongly recommended with the WiredTiger storage engine. See http://dochub.mongodb.org/core/prodnotes-filesystem","tags":["startupWarnings"]}
112659:{"t":{"$date":"2024-04-22T16:12:29.900+00:00"},"s":"I", "c":"STORAGE", "id":22315, "ctx":"initandlisten","msg":"Opening WiredTiger","attr":{"config":"create,cache_size=1445M,session_max=33000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,remove=true,path=journal,compressor=snappy),builtin_extension_config=(zstd=(compression_level=6)),file_manager=(close_idle_time=600,close_scan_interval=10,close_handle_minimum=2000),statistics_log=(wait=0),json_output=(error,message),verbose=[recovery_progress:1,checkpoint_progress:1,compact_progress:1,backup:0,checkpoint:0,compact:0,evict:0,history_store:0,recovery:0,rts:0,salvage:0,tiered:0,timestamp:0,transaction:0,verify:0,log:0],"}}
112660-{"t":{"$date":"2024-04-22T16:12:30.916+00:00"},"s":"I", "c":"STORAGE", "id":4795906, "ctx":"initandlisten","msg":"WiredTiger opened","attr":{"durationMillis":1016}}

112685-{"t":{"$date":"2024-04-22T16:12:35.294+00:00"},"s":"I", "c":"CONTROL", "id":6608200, "ctx":"initandlisten","msg":"Initializing cluster server parameters from disk"}
112686:{"t":{"$date":"2024-04-22T16:12:35.294+00:00"},"s":"I", "c":"-", "id":4939300, "ctx":"monitoring-keys-for-HMAC","msg":"Failed to refresh key cache","attr":{"error":"ReadConcernMajorityNotAvailableYet: Read concern majority reads are currently not possible.","nextWakeupMillis":200}}
112687-{"t":{"$date":"2024-04-22T16:12:35.295+00:00"},"s":"I", "c":"REPL", "id":21544, "ctx":"initandlisten","msg":"Recovering from stable timestamp","attr":{"stableTimestamp":{"$timestamp":{"t":1713802313,"i":1}},"topOfOplog":{"ts":{"$timestamp":{"t":1713802313,"i":1}},"t":7},"appliedThrough":{"ts":{"$timestamp":{"t":0,"i":0}},"t":-1}}}

112740-{"t":{"$date":"2024-04-22T16:12:35.333+00:00"},"s":"I", "c":"REPL", "id":6015306, "ctx":"OplogApplier-0","msg":"Applier already left draining state, exiting."}
112741:{"t":{"$date":"2024-04-22T16:12:35.334+00:00"},"s":"I", "c":"-", "id":4939300, "ctx":"monitoring-keys-for-HMAC","msg":"Failed to refresh key cache","attr":{"error":"ReadConcernMajorityNotAvailableYet: Read concern majority reads are currently not possible.","nextWakeupMillis":400}}
112742-{"t":{"$date":"2024-04-22T16:12:35.335+00:00"},"s":"I", "c":"REPL", "id":40440, "ctx":"initandlisten","msg":"Starting the TopologyVersionObserver"}

112882-{"t":{"$date":"2024-04-22T16:24:29.668+00:00"},"s":"I", "c":"COMMAND", "id":20499, "ctx":"ftdc","msg":"serverStatus was very slow","attr":{"timeStats":{"after basic":21,"after activeIndexBuilds":55,"after asserts":180,"after batchedDeletes":194,"after bucketCatalog":194,"after catalogStats":234,"after connections":266,"after electionMetrics":282,"after extra_info":282,"after featureCompatibilityVersion":282,"after flowControl":290,"after globalLock":290,"after indexBulkBuilder":308,"after indexStats":355,"after locks":377,"after logicalSessionRecordCache":437,"after mirroredReads":454,"after network":534,"after opLatencies":534,"after opcounters":583,"after opcountersRepl":644,"after oplog":709,"after oplogTruncation":767,"after readConcernCounters":767,"after readPreferenceCounters":813,"after repl":887,"after scramCache":887,"after security":920,"after storageEngine":984,"after tcmalloc":1042,"after tenantMigrations":1095,"after trafficRecording":1116,"after transactions":1133,"after transportSecurity":1151,"after twoPhaseCommitCoordinator":1151,"after wiredTiger":1369,"at end":1369}}}
112883:{"t":{"$date":"2024-04-22T16:24:43.390+00:00"},"s":"I", "c":"CONTROL", "id":23377, "ctx":"SignalHandler","msg":"Received signal","attr":{"signal":15,"error":"Terminated"}}
112884-{"t":{"$date":"2024-04-22T16:24:43.400+00:00"},"s":"I", "c":"CONTROL", "id":23378, "ctx":"SignalHandler","msg":"Signal was sent by kill(2)","attr":{"pid":1,"uid":0}}

112932-{"t":{"$date":"2024-04-22T16:24:44.250+00:00"},"s":"I", "c":"REPL", "id":4784920, "ctx":"SignalHandler","msg":"Shutting down the LogicalTimeValidator"}
112933:{"t":{"$date":"2024-04-22T16:24:44.294+00:00"},"s":"I", "c":"STORAGE", "id":22263, "ctx":"TimestampMonitor","msg":"Timestamp monitor is stopping","attr":{"error":{"code":11600,"codeName":"InterruptedAtShutdown","errmsg":"interrupted at shutdown"}}}
112934-{"t":{"$date":"2024-04-22T16:24:44.299+00:00"},"s":"I", "c":"SHARDING", "id":4784921, "ctx":"SignalHandler","msg":"Shutting down the MigrationUtilExecutor"}

112976-{"t":{"$date":"2024-04-22T16:24:45.384+00:00"},"s":"I", "c":"STORAGE", "id":22297, "ctx":"initandlisten","msg":"Using the XFS filesystem is strongly recommended with the WiredTiger storage engine. See http://dochub.mongodb.org/core/prodnotes-filesystem","tags":["startupWarnings"]}
112977:{"t":{"$date":"2024-04-22T16:24:45.384+00:00"},"s":"I", "c":"STORAGE", "id":22315, "ctx":"initandlisten","msg":"Opening WiredTiger","attr":{"config":"create,cache_size=1445M,session_max=33000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,remove=true,path=journal,compressor=snappy),builtin_extension_config=(zstd=(compression_level=6)),file_manager=(close_idle_time=600,close_scan_interval=10,close_handle_minimum=2000),statistics_log=(wait=0),json_output=(error,message),verbose=[recovery_progress:1,checkpoint_progress:1,compact_progress:1,backup:0,checkpoint:0,compact:0,evict:0,history_store:0,recovery:0,rts:0,salvage:0,tiered:0,timestamp:0,transaction:0,verify:0,log:0],"}}
112978-{"t":{"$date":"2024-04-22T16:24:46.226+00:00"},"s":"I", "c":"STORAGE", "id":4795906, "ctx":"initandlisten","msg":"WiredTiger opened","attr":{"durationMillis":842}}

112998-{"t":{"$date":"2024-04-22T16:24:47.381+00:00"},"s":"I", "c":"REPL", "id":4280500, "ctx":"initandlisten","msg":"Attempting to create internal replication collections"}
112999:{"t":{"$date":"2024-04-22T16:24:47.381+00:00"},"s":"I", "c":"-", "id":4939300, "ctx":"monitoring-keys-for-HMAC","msg":"Failed to refresh key cache","attr":{"error":"ReadConcernMajorityNotAvailableYet: Read concern majority reads are currently not possible.","nextWakeupMillis":200}}
113000-{"t":{"$date":"2024-04-22T16:24:47.383+00:00"},"s":"I", "c":"REPL", "id":4280501, "ctx":"initandlisten","msg":"Attempting to load local voted for document"}

113084-{"t":{"$date":"2024-04-22T16:24:47.398+00:00"},"s":"I", "c":"NETWORK", "id":20113, "ctx":"LogicalSessionCacheReap","msg":"Successfully connected to host","attr":{"connString":"127.0.0.1:27017","numOpenConns":2,"socketTimeoutSecs":0}}
113085:{"t":{"$date":"2024-04-22T16:24:47.399+00:00"},"s":"I", "c":"-", "id":4939300, "ctx":"monitoring-keys-for-HMAC","msg":"Failed to refresh key cache","attr":{"error":"ReadConcernMajorityNotAvailableYet: Read concern majority reads are currently not possible.","nextWakeupMillis":400}}
113086-{"t":{"$date":"2024-04-22T16:24:47.399+00:00"},"s":"I", "c":"REPL", "id":5123005, "ctx":"ShardSplitDonorService-0","msg":"Rebuilding PrimaryOnlyService due to stepUp","attr":{"service":"ShardSplitDonorService"}}

@reetp
Copy link

reetp commented May 9, 2024

Snaps are slow to release as they usually do more extensive testing on it. So it is unlikely due to the node mismatch you noted.

I can't see anything in your log to show a node crash? Looks more like a mongo error. (Please check formatting of code when you paste as it is very difficult to read as it is)

What hardware are you running this on?

How many users?

Which version? CE, Starter, Pro etc?

Copy link
Contributor

This issue has been marked as stale because there has been no further activity in the last 10 days. If the issue remains stale for the next 4 days (a total of 14 days with no activity), then it will be assumed that the question has been resolved and the issue will be automatically closed.

@github-actions github-actions bot added the stat: no response Issue with no response after a given time label May 19, 2024
Copy link
Contributor

This issue was closed because it has been inactive for 14 days since being marked as stale.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stat: need more info stat: no response Issue with no response after a given time
Projects
None yet
Development

No branches or pull requests

2 participants