[CI] elasticsearch-ci/7.17.22 / bwc-snapshots-windows fails #108474

Open
ldematte opened this issue May 9, 2024 · 3 comments
Labels
:Core/Infra/Core Core issues without another label :Distributed/Distributed A catch all label for anything in the Distributed Area. If you aren't sure, use this one. low-risk An open issue or test failure that is a low risk to future releases Team:Core/Infra Meta label for core/infra team Team:Distributed Meta label for distributed team >test-failure Triaged test failures from CI

Comments

@ldematte
Contributor

ldematte commented May 9, 2024

CI Link

https://gradle-enterprise.elastic.co/s/au3yen3ihluxs

Repro line

N/A

Does it reproduce?

Didn't try locally, but it seems to fail pretty reliably (at least on my PR).

Applicable branches

7.17

Failure history

No response

Failure excerpt

The failure message is:

Execution failed for task ':qa:ccs-rolling-upgrade-remote-cluster:v7.17.22#oldClusterTest'.
> process was found dead while waiting for ports files, node{:qa:ccs-rolling-upgrade-remote-cluster:v7.17.22-local-0}

The root cause seems to be this:

[2024-05-09T16:15:33.909848600Z] [BUILD] Starting Elasticsearch process	
»  May 09, 2024 4:15:40 PM sun.util.locale.provider.LocaleProviderAdapter <clinit>	
»  WARNING: COMPAT locale provider will be removed in a future release	
»   ↑ repeated 2 times ↑	
» [2024-05-09T16:15:58,077][ERROR][o.e.b.Elasticsearch      ] [v7.17.22-local-0] fatal exception while booting Elasticsearch org.elasticsearch.ElasticsearchException: Failed to bind service	
»  	at org.elasticsearch.server@8.15.0-SNAPSHOT/org.elasticsearch.node.NodeConstruction.prepareConstruction(NodeConstruction.java:283)	
»  	at org.elasticsearch.server@8.15.0-SNAPSHOT/org.elasticsearch.node.Node.<init>(Node.java:192)	
»  	at org.elasticsearch.server@8.15.0-SNAPSHOT/org.elasticsearch.bootstrap.Elasticsearch$2.<init>(Elasticsearch.java:240)	
»  	at org.elasticsearch.server@8.15.0-SNAPSHOT/org.elasticsearch.bootstrap.Elasticsearch.initPhase3(Elasticsearch.java:240)	
»  	at org.elasticsearch.server@8.15.0-SNAPSHOT/org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:75)	
»  Caused by: org.elasticsearch.gateway.CorruptStateException: Format version is not supported. Upgrading to [8.15.0] is only supported from version [7.17.0].	
»  	at org.elasticsearch.server@8.15.0-SNAPSHOT/org.elasticsearch.env.NodeEnvironment.checkForIndexCompatibility(NodeEnvironment.java:517)	
»  	at org.elasticsearch.server@8.15.0-SNAPSHOT/org.elasticsearch.env.NodeEnvironment.upgradeLegacyNodeFolders(NodeEnvironment.java:416)	
»  	at org.elasticsearch.server@8.15.0-SNAPSHOT/org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:309)	
»  	at org.elasticsearch.server@8.15.0-SNAPSHOT/org.elasticsearch.node.NodeConstruction.validateSettings(NodeConstruction.java:511)	
»  	at org.elasticsearch.server@8.15.0-SNAPSHOT/org.elasticsearch.node.NodeConstruction.prepareConstruction(NodeConstruction.java:258)	
»  	... 4 more	
»  	
»  ERROR: Elasticsearch did not exit normally - check the logs at C:\bk\qa\ccs-rolling-upgrade-remote-cluster\build\testclusters\v7.17.22-local-0\logs\v7.17.22-local.log

Tagging both Core/Infra and Distributed, since this could be either a version compatibility issue or a persisted cluster state issue. The code comment just above the line that throws the error says:

// We are upgrading the cluster, but we didn't find any previous metadata. Corrupted state or incompatible version.

Curiously, this seems to happen on Windows only?
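
To make the two possibilities named in that comment concrete, here is a minimal sketch of such a check, assuming it boils down to "no previous metadata found" versus "persisted version older than the minimum supported one". This is not the actual `NodeEnvironment.checkForIndexCompatibility` code; `UpgradeCheckSketch`, `SimpleVersion`, `Outcome` and `check` are hypothetical names used only for illustration.

```java
import java.util.Optional;

// Illustrative sketch only; NOT the Elasticsearch implementation.
final class UpgradeCheckSketch {

    /** Hypothetical minimal version type (not org.elasticsearch.Version). */
    static final class SimpleVersion implements Comparable<SimpleVersion> {
        final int major;
        final int minor;
        final int patch;

        SimpleVersion(int major, int minor, int patch) {
            this.major = major;
            this.minor = minor;
            this.patch = patch;
        }

        @Override
        public int compareTo(SimpleVersion other) {
            if (major != other.major) return Integer.compare(major, other.major);
            if (minor != other.minor) return Integer.compare(minor, other.minor);
            return Integer.compare(patch, other.patch);
        }
    }

    /** The outcomes this sketch distinguishes (assumed, for illustration). */
    enum Outcome { OK, NO_PREVIOUS_METADATA, VERSION_TOO_OLD }

    /**
     * @param persisted     version read from the data folder, empty if no metadata was found
     * @param minCompatible oldest version the new node accepts an upgrade from
     */
    static Outcome check(Optional<SimpleVersion> persisted, SimpleVersion minCompatible) {
        if (!persisted.isPresent()) {
            // "Corrupted state": the data folder exists but no metadata could be read.
            return Outcome.NO_PREVIOUS_METADATA;
        }
        if (persisted.get().compareTo(minCompatible) < 0) {
            // "Incompatible version": metadata was written by a node that is too old.
            return Outcome.VERSION_TOO_OLD;
        }
        return Outcome.OK;
    }

    public static void main(String[] args) {
        SimpleVersion min = new SimpleVersion(7, 17, 0);
        System.out.println(check(Optional.of(new SimpleVersion(7, 17, 22)), min));
        System.out.println(check(Optional.empty(), min));
    }
}
```

In this toy model a data folder written by 7.17.22 passes the version comparison against 7.17.0, so only the missing-metadata branch would reject it. Whether that is what happens on the Windows workers is still an open question.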

@ldematte ldematte added :Core/Infra/Core Core issues without another label >test-failure Triaged test failures from CI :Distributed/Distributed A catch all label for anything in the Distributed Area. If you aren't sure, use this one. labels May 9, 2024
@elasticsearchmachine elasticsearchmachine added Team:Core/Infra Meta label for core/infra team Team:Distributed Meta label for distributed team needs:risk Requires assignment of a risk label (low, medium, blocker) labels May 9, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@ldematte
Contributor Author

Lacking a better alternative, here is an "empty" PR on main that shows the issue: #108490

@rjernst rjernst added low-risk An open issue or test failure that is a low risk to future releases and removed needs:risk Requires assignment of a risk label (low, medium, blocker) labels May 14, 2024

4 participants