My helm operator hangs after I upgraded helm-operator from v1.33.0 to v1.34.0 #6690

Open
kschanrtp opened this issue Mar 4, 2024 · 18 comments
Labels
triage/needs-information Indicates an issue needs more information in order to work on it.

@kschanrtp

Bug Report

What did you do?

I upgraded helm-operator from v1.33.0 to v1.34.0.

What did you expect to see?

My helm operator deploys the Helm chart successfully.

What did you see instead? Under which circumstances?

My helm operator hangs while doing a new install.

I noticed a large version jump for helm-operator-plugins. I am not sure whether this is related:

- github.com/operator-framework/helm-operator-plugins v0.0.12-0.20231013185714-215d1f8a3e7d
+ github.com/operator-framework/helm-operator-plugins v0.1.3  

I have anonymized the log output below.

Working helm operator log running v1.33.0

{"level":"info","ts":"2024-02-29T19:16:11Z","logger":"cmd","msg":"Version","Go Version":"go1.21.5","GOOS":"linux","GOARCH":"amd64","helm-operator":"v1.33.0","commit":"542966812906456a8d67cf7284fc6410b104e118"}
...
{"level":"info","ts":"2024-02-29T19:17:01Z","msg":"Starting EventSource","controller":"myhelm-controller","source":"kind source: *unstructured.Unstructured"}
{"level":"info","ts":"2024-02-29T19:17:01Z","msg":"Starting Controller","controller":"myhelm-controller"}
{"level":"info","ts":"2024-02-29T19:17:01Z","msg":"Starting workers","controller":"myhelm-controller","worker count":16}
{"level":"info","ts":"2024-02-29T19:17:12Z","msg":"Starting EventSource","controller":"myhelm-controller","source":"kind source: *unstructured.Unstructured"}
{"level":"info","ts":"2024-02-29T19:17:12Z","logger":"helm.controller","msg":"Watching dependent resource","ownerApiVersion":"my.example.com/v1alpha1","ownerKind":"myKind","apiVersion":"v1","kind":"Service"}
...
myhelm chart is deployed

helm operator log running v1.34.0

{"level":"info","ts":"2024-03-04T20:12:52Z","logger":"cmd","msg":"Version","Go Version":"go1.21.7","GOOS":"linux","GOARCH":"amd64","helm-operator":"v1.34.0","commit":"4e01bcd726aa8b0e092fcd3ab874961e276f3db3"}
...
{"level":"info","ts":"2024-03-04T20:13:43Z","msg":"Starting EventSource","controller":"myhelm-controller","source":"kind source: *unstructured.Unstructured"}
{"level":"info","ts":"2024-03-04T20:13:43Z","msg":"Starting Controller","controller":"myhelm-controller"}
{"level":"info","ts":"2024-03-04T20:13:44Z","msg":"Starting workers","controller":"myhelm-controller","worker count":16}
NO More Output
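
The difference between the two logs can be checked mechanically. The following is a minimal sketch, assuming the operator log is JSON lines with a "msg" field as in the excerpts above. The function names are hypothetical, and the heuristic (the last logged message is "Starting workers") is only an assumption drawn from these two logs, not a confirmed diagnosis of the hang.

```python
import json


def last_messages(log_text, count=3):
    """Return the last `count` "msg" values found in JSON-lines log output."""
    messages = []
    for line in log_text.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip elisions such as "..." and free-text notes
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or otherwise malformed lines
        if isinstance(entry, dict) and "msg" in entry:
            messages.append(entry["msg"])
    return messages[-count:]


def stalled_after_workers(log_text):
    """Heuristic (assumption): True if the final logged message is "Starting workers"."""
    tail = last_messages(log_text, count=1)
    return bool(tail) and tail[0] == "Starting workers"
```

Saving the operator pod's log to a file and passing its contents to stalled_after_workers would return True for the v1.34.0 excerpt and False for the v1.33.0 excerpt, where "Watching dependent resource" follows "Starting workers".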

Environment

Operator type:

Kubernetes cluster type:

$ operator-sdk version
operator-sdk-v1.12.0+git

$ go version (if language is Go)
go: 1.21.1

$ kubectl version
Server Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.10+28ed2d7", GitCommit:"c725f2ce5164bf4165b22d6c28dd0ace4b3b7e9b", GitTreeState:"clean", BuildDate:"2024-01-23T03:16:21Z", GoVersion:"go1.20.12 X:strictfipsruntime", Compiler:"gc", Platform:"linux/amd64"}

Possible Solution

Additional context

@acornett21
Contributor

@kschanrtp 1.34.0's release did not complete fully. Can you try updating to 1.34.1 to see if this resolves your issue?

@kschanrtp
Author

@acornett21 Same problem with 1.34.1.
{"level":"info","ts":"2024-03-05T20:04:17Z","logger":"cmd","msg":"Version","Go Version":"go1.21.7","GOOS":"linux","GOARCH":"amd64","helm-operator":"v1.34.1","commit":"edaed1e5057db0349568e0b02df3743051b54e68"}
...
{"level":"info","ts":"2024-03-05T20:05:06Z","msg":"Starting EventSource","controller":"myhelm-controller","source":"kind source: *unstructured.Unstructured"}
{"level":"info","ts":"2024-03-05T20:05:06Z","msg":"Starting Controller","controller":"myhelm-controller"}
{"level":"info","ts":"2024-03-05T20:05:06Z","msg":"Starting workers","controller":"myhelm-controller","worker count":16}
NO MORE OUTPUT

@sudhir-kelkar

@acornett21
Any update on this?
We need this fix to address the security vulnerability GHSA-r53h-jv2g-vpx6.

@acornett21
Contributor

@sudhir-kelkar I have not looked at this. I was just relating all the issues that came in and asking whether this still existed in 1.34.1, since the 1.34.0 release was incomplete. I personally will not have time to look at this for a few weeks; I'm only a contributor to this project, not a dedicated maintainer.

@jberkhahn
Contributor

Could you please share the structure of your CR? The most likely cause of something like this is incorrect RBAC, where the controller doesn't have permission to see all the resources it needs. Could you also please post the output of your subscription (if you're using OLM)?

@jberkhahn jberkhahn added the triage/needs-information Indicates an issue needs more information in order to work on it. label Apr 8, 2024
@jberkhahn jberkhahn added this to the Backlog milestone Apr 8, 2024
@jberkhahn
Contributor

relates #6651

@malli31

malli31 commented Apr 21, 2024

Any update on this? Even after using 1.34.1, I am not able to see any pods after CR deployment.
Switching back to 1.33.0 works perfectly.

@malli31

malli31 commented Apr 22, 2024

@acornett21 Any input on why 1.34.1 is not working? Going back to 1.33.0 works perfectly fine.
Is there a suggested workaround? I am not seeing any logs, events, or anything at all.

@jberkhahn
Contributor

Something broke when we cut 1.34. We're not sure what exactly but are currently investigating.

@kmcdon83

+1 for this issue, moving from 1.33 to 1.34.1 has stopped any process of reconciliation

@kmcdon83

+1 for this issue, moving from 1.33 to 1.34.1 has stopped any process of reconciliation

I have verified that 1.34.2 has resolved my issue.

@kschanrtp
Author

kschanrtp commented May 16, 2024

I am still having the problem with 1.34.2. It is the same problem: the operator does not do the reconciliation.

{"level":"info","ts":"2024-05-16T16:43:04Z","logger":"cmd","msg":"Version","Go Version":"go1.21.10","GOOS":"linux","GOARCH":"amd64","helm-operator":"v1.34.2","commit":"81dd3cb24b8744de03d312c1ba23bfc617044005"}
...
{"level":"info","ts":"2024-05-16T16:43:55Z","msg":"Starting EventSource","controller":"manageservice-controller","source":"kind source: *unstructured.Unstructured"}
{"level":"info","ts":"2024-05-16T16:43:55Z","msg":"Starting Controller","controller":"manageservice-controller"}
{"level":"info","ts":"2024-05-16T16:43:55Z","msg":"Starting workers","controller":"manageservice-controller","worker count":16}

@kmcdon83

+1 for this issue, moving from 1.33 to 1.34.1 has stopped any process of reconciliation

I have verified that 1.34.2 has resolved my issue.

Sorry, I spoke too soon. It does appear there is no reconciliation occurring.

@kschanrtp
Author

@jberkhahn Is it possible to create a 1.33.1 release based on 1.33.0, but built with the latest UBI 8 image to pick up the security fixes in that image?

@acornett21
Contributor

Hi @kschanrtp, you're in control of your operator controller image and its updates. If you want or need to update, you can change the Dockerfile in your own operator project to do so. Something like:

USER root
RUN microdnf update && microdnf clean all
# Switch back to whatever user your container uses at runtime.

Or, if you only want to update the libraries with CVEs, you can update those individually.

@kschanrtp
Author

kschanrtp commented May 21, 2024

@acornett21 I thought I had done that and it did not work. I will try again. Maybe the order of my updates is not correct.

@kschanrtp
Author

The CVEs are on the Go module side of the helm-operator.
