
Getting "Cannot load the graph: cluster [kubernetes] is not found or is not accessible for Kiali" with certain prometheus configurations. #7305

Closed
deuxailes opened this issue Apr 24, 2024 · 4 comments · Fixed by #7376
Labels: bug Something isn't working

@deuxailes

Describe the bug

Trying to get Kiali to work with my AKS cluster with the kube-prometheus-stack Helm chart.

Istio-related features currently work; it seems Kiali is having a hard time picking up the Prometheus data.

Getting "Cannot load the graph: cluster [kubernetes] is not found or is not accessible for Kiali" when attempting to open the traffic graph of Kiali.

[screenshot: Kiali traffic graph page showing the error above]

Getting this unhelpful error message in the logs of Kiali:

2024-04-24T23:07:09Z DBG Found controlplane [istiod-asm-1-20/aks-istio-system] on cluster [aks-dev-eastus-152].
2024-04-24T23:07:11Z ERR K8s Cache [kubernetes] is not found or is not accessible for Kiali: goroutine 801465 [running]:
runtime/debug.Stack()
/opt/hostedtoolcache/go/1.20.10/x64/src/runtime/debug/stack.go:24 +0x65
github.com/kiali/kiali/handlers.handlePanic({0x253b1d0, 0xc0003bfea8})
/home/runner/work/kiali/kiali/handlers/graph.go:86 +0x219
panic({0x1dc38c0, 0xc000694bf0})
/opt/hostedtoolcache/go/1.20.10/x64/src/runtime/panic.go:884 +0x213
github.com/kiali/kiali/graph.CheckError(...)
/home/runner/work/kiali/kiali/graph/util.go:38
github.com/kiali/kiali/graph/telemetry/istio/appender.ServiceEntryAppender.loadServiceEntryHosts({0xc002168c30?, {0xc001cfb3e0?, 0x220f631?}}, {0xc000ddafa0, 0xa}, {0xc0019d4990, 0x11}, 0xc0015b6630)
/home/runner/work/kiali/kiali/graph/telemetry/istio/appender/service_entry.go:115 +0x918
github.com/kiali/kiali/graph/telemetry/istio/appender.ServiceEntryAppender.AppendGraph({0xc0015b64e0?, {0xc0010a807b?, 0xc001e1d6c0?}}, 0xc0015b6690, 0x7f559f849c00?, 0xc001b7a9f0?)
/home/runner/work/kiali/kiali/graph/telemetry/istio/appender/service_entry.go:97 +0x405
github.com/kiali/kiali/graph/telemetry/istio.BuildNodeTrafficMap({0xc0015b64e0, {0x0, {0xc00065b830, 0x9, 0x9}}, 0x0, 0x1, 0xc0015b63f0, {{0x220c92e, 0x8}, ...}, ...}, ...)
/home/runner/work/kiali/kiali/graph/telemetry/istio/istio.go:486 +0x3c2
github.com/kiali/kiali/graph/api.graphNodeIstio({_, _}, _, _, {{0x220d9e6, 0x9}, {0x2207c58, 0x5}, {{0xc0010a80bd, 0x3}, ...}, ...})
/home/runner/work/kiali/kiali/graph/api/api.go:90 +0x125
github.com/kiali/kiali/graph/api.GraphNode({_, _}, _, {{0x220d9e6, 0x9}, {0x2207c58, 0x5}, {{0xc0010a80bd, 0x3}, {0x13a52453c000, ...}}, ...})
/home/runner/work/kiali/kiali/graph/api/api.go:72 +0x1d5
github.com/kiali/kiali/handlers.GraphNode({0x253b1d0, 0xc0003bfea8}, 0xc0000c2400)
/home/runner/work/kiali/kiali/handlers/graph.go:64 +0x165
net/http.HandlerFunc.ServeHTTP(0x220ba66?, {0x253b1d0?, 0xc0003bfea8?}, 0x7f559f6a6120?)
/opt/hostedtoolcache/go/1.20.10/x64/src/net/http/server.go:2122 +0x2f
github.com/kiali/kiali/routing.metricHandler.func1({0x254ada0?, 0xc000d9bea0}, 0xc000846c01?)
/home/runner/work/kiali/kiali/routing/router.go:206 +0x13b
net/http.HandlerFunc.ServeHTTP(0x254c140?, {0x254ada0?, 0xc000d9bea0?}, 0xc000846ca0?)
/opt/hostedtoolcache/go/1.20.10/x64/src/net/http/server.go:2122 +0x2f
github.com/kiali/kiali/handlers.AuthenticationHandler.Handle.func1({0x254ada0, 0xc000d9bea0}, 0xc0000c2200)
/home/runner/work/kiali/kiali/handlers/authentication.go:79 +0x3fd
net/http.HandlerFunc.ServeHTTP(0x254c140?, {0x254ada0?, 0xc000d9bea0?}, 0x251c908?)
/opt/hostedtoolcache/go/1.20.10/x64/src/net/http/server.go:2122 +0x2f
github.com/kiali/kiali/server.plainHttpMiddleware.func1({0x254ada0?, 0xc000d9bea0?}, 0xc0015b6300?)
/home/runner/work/kiali/kiali/server/server.go:178 +0x65
net/http.HandlerFunc.ServeHTTP(0xc0000c2100?, {0x254ada0?, 0xc000d9bea0?}, 0x2?)
/opt/hostedtoolcache/go/1.20.10/x64/src/net/http/server.go:2122 +0x2f
github.com/gorilla/mux.(*Router).ServeHTTP(0xc000aae000, {0x254ada0, 0xc000d9bea0}, 0xc0000c2000)
/home/runner/go/pkg/mod/github.com/gorilla/mux@v1.8.1/mux.go:212 +0x1cf
github.com/NYTimes/gziphandler.GzipHandlerWithOpts.func1.1({0x254abf0, 0xc001034ee0}, 0x0?)
/home/runner/go/pkg/mod/github.com/!n!y!times/gziphandler@v1.1.1/gzip.go:336 +0x24e
net/http.HandlerFunc.ServeHTTP(0xc0002fc280?, {0x254abf0?, 0xc001034ee0?}, 0x40dc4a?)
/opt/hostedtoolcache/go/1.20.10/x64/src/net/http/server.go:2122 +0x2f
net/http.(*ServeMux).ServeHTTP(0xc0010a8061?, {0x254abf0, 0xc001034ee0}, 0xc0000c2000)
/opt/hostedtoolcache/go/1.20.10/x64/src/net/http/server.go:2500 +0x149
net/http.serverHandler.ServeHTTP({0xc002376930?}, {0x254abf0, 0xc001034ee0}, 0xc0000c2000)
/opt/hostedtoolcache/go/1.20.10/x64/src/net/http/server.go:2936 +0x316
net/http.(*conn).serve(0xc000e66090, {0x254c140, 0xc0015af5f0})
/opt/hostedtoolcache/go/1.20.10/x64/src/net/http/server.go:1995 +0x612
created by net/http.(*Server).Serve
/opt/hostedtoolcache/go/1.20.10/x64/src/net/http/server.go:3089 +0x5ed

Expected Behavior

A traffic graph, showing applications, etc.

What are the steps to reproduce this bug?

  1. Install the 'kube-prometheus-stack' Helm chart, version 48.3.1.
  2. Use the AKS managed service mesh add-on (Istio), installed via Terraform.
  3. Install the Kiali CR with the values below.

Environment


  • Kiali version: 1.82
  • Istio version: 1.20
  • Kubernetes impl: AKS
  • Kubernetes version: 1.28.5
  • Other notable environmental factors:

Kiali CR values (via the kiali-operator Helm chart):

nameOverride: ""
fullnameOverride: ""

image: # see: https://quay.io/repository/kiali/kiali-operator?tab=tags
  repo: quay.io/kiali/kiali-operator # quay.io/kiali/kiali-operator
  tag: v1.82 # version string like v1.39.0 or a digest hash
  digest: "" # use "sha256" if tag is a sha256 hash (do NOT prefix this value with a "@")
  pullPolicy: Always
  pullSecrets: []

# Deployment options for the operator pod.
nodeSelector: {}
podAnnotations: {}
podLabels: {}
env: []
tolerations: []
resources:
  requests:
    cpu: "10m"
    memory: "64Mi"
affinity: {}
replicaCount: 1
priorityClassName: ""
securityContext: {}

# metrics.enabled: set to true if you want Prometheus to collect metrics from the operator
metrics:
  enabled: true

# debug.enabled: when true the full ansible logs are dumped after each reconciliation run
# debug.verbosity: defines the amount of details the operator will log (higher numbers are more noisy)
# debug.enableProfiler: when true (regardless of debug.enabled), timings for the most expensive tasks will be logged after each reconciliation loop
debug:
  enabled: true
  verbosity: "1"
  enableProfiler: false

# Defines where the operator will look for Kiali CR resources. "" means "all namespaces".
watchNamespace: ""

# Set to true if you want the operator to be able to create cluster roles. This is necessary
# if you want to support Kiali CRs with spec.deployment.accessible_namespaces of '**'.
# Setting this to "true" requires allowAllAccessibleNamespaces to be "true" also.
# Note that this will be overridden to "true" if cr.create is true and cr.spec.deployment.accessible_namespaces is ['**'].
clusterRoleCreator: true

# Set to a list of secrets in the cluster that the operator will be allowed to read. This is necessary if you want to
# support Kiali CRs with spec.kiali_feature_flags.certificates_information_indicators.enabled=true.
# The secrets in this list will be the only ones allowed to be specified in any Kiali CR (in the setting
# spec.kiali_feature_flags.certificates_information_indicators.secrets).
# If you set this to an empty list, the operator will not be given permission to read any additional secrets
# found in the cluster, and thus will only support a value of "false" in the Kiali CR setting
# spec.kiali_feature_flags.certificates_information_indicators.enabled.
secretReader: ['cacerts', 'istio-ca-secret']

# Set to true if you want to allow the operator to only be able to install Kiali in view-only-mode.
# The purpose for this setting is to allow you to restrict the permissions given to the operator itself.
onlyViewOnlyMode: false

# allowAdHocKialiNamespace tells the operator to allow a user to be able to install a Kiali CR in one namespace but
# be able to install Kiali in another namespace. In other words, it will allow the Kiali CR spec.deployment.namespace
# to be something other than the namespace where the CR is installed. You may want to disable this if you are
# running in a multi-tenant scenario in which you only want a user to be able to install Kiali in the same namespace
# where the user has permissions to install a Kiali CR.
allowAdHocKialiNamespace: true

# allowAdHocKialiImage tells the operator to allow a user to be able to install a custom Kiali image as opposed
# to the image the operator will install by default. In other words, it will allow the
# Kiali CR spec.deployment.image_name and spec.deployment.image_version to be configured by the user.
# You may want to disable this if you do not want users to install their own Kiali images.
allowAdHocKialiImage: false

# allowAdHocOSSMConsoleImage tells the operator to allow a user to be able to install a custom OSSMC image as opposed
# to the image the operator will install by default. In other words, it will allow the
# OSSMConsole CR spec.deployment.imageName and spec.deployment.imageVersion to be configured by the user.
# You may want to disable this if you do not want users to install their own OSSMC images.
# This is only applicable when running on OpenShift.
allowAdHocOSSMConsoleImage: false

# allowSecurityContextOverride tells the operator to allow a user to be able to fully override the Kiali
# container securityContext. If this is false, certain securityContext settings must exist on the Kiali
# container and any attempt to override them will be ignored.
allowSecurityContextOverride: false

# allowAllAccessibleNamespaces tells the operator to allow a user to be able to configure Kiali
# to access all namespaces in the cluster via spec.deployment.accessible_namespaces=['**'].
# If this is false, the user must specify an explicit list of namespaces in the Kiali CR.
# Setting this to "true" requires clusterRoleCreator to be "true" also.
# Note that this will be overridden to "true" if cr.create is true and cr.spec.deployment.accessible_namespaces is ['**'].
allowAllAccessibleNamespaces: true

# accessibleNamespacesLabel restricts the namespaces that a user can add to the Kiali CR spec.deployment.accessible_namespaces.
# This value is either an empty string (which disables this feature) or a label name with an optional label value
# (e.g. "mylabel" or "mylabel=myvalue"). Only namespaces that have that label will be permitted in
# spec.deployment.accessible_namespaces. Any namespace not labeled properly but specified in accessible_namespaces will cause
# the operator to abort the Kiali installation.
# If just a label name (but no label value) is specified, the label value the operator will look for is the value of
# the Kiali CR's spec.istio_namespace. In other words, the operator will look for the named label whose value must be the name
# of the Istio control plane namespace (which is typically, but not necessarily, "istio-system").
accessibleNamespacesLabel: ""

# For what a Kiali CR spec can look like, see:
# https://github.com/kiali/kiali-operator/blob/master/deploy/kiali/kiali_cr.yaml
cr:
  create: true
  name: kiali
  # If you elect to create a Kiali CR (--set cr.create=true)
  # and the operator is watching all namespaces (--set watchNamespace="")
  # then this is the namespace where the CR will be created (the default will be the operator namespace).
  namespace: ""

  # Annotations to place in the Kiali CR metadata.
  annotations: {}

  spec:
    kubernetes_config:
      cluster_name: "aks-dev-eastus-152"
    auth:
      strategy: "anonymous"
    external_services:
      istio:
        istio_injection_annotation: "istio.io/rev"
        istiod_deployment_name: "istiod-asm-1-20"
        config_map_name: "istio-asm-1-20"
        istio_sidecar_injector_config_map_name: "istio-sidecar-injector-asm-1-20"
        root_namespace: aks-istio-system
        component_status:
          enabled: true
          components:
          - app_label: istiod
            is_core: true
          - app_label: aks-istio-ingressgateway-internal
            is_core: true
            is_proxy: true
            namespace: aks-istio-ingress
      prometheus:
        url: <private apim proxy URI we use>
        # url: http://prometheus-kube-prometheus-prometheus.prometheus.svc.cluster.local:9090
      tracing:
        #url: http://jaeger-deployment-query.istio-system.svc.cluster.local:16686
        in_cluster_url: http://jaeger-deploy-query.aks-istio-system.svc.cluster.local:16685/jaeger
        use_grpc: true
        sampling: 100
      grafana:
        enabled: false
    deployment:
      logger:
        log_level: "debug"
      accessible_namespaces:
      - '**'
@deuxailes added the bug label on Apr 24, 2024
@hhovsepy
Contributor

Hi @deuxailes, is there a second cluster named "kubernetes" configured in your environment?
From the logs it seems there is traffic for a cluster "kubernetes" that Kiali is trying to show on the graph, but the "kubernetes" cluster is not accessible to Kiali.
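
One way to check this from the Prometheus side is to list the distinct cluster label values present on Istio telemetry. Below is a minimal Go sketch, assuming the in-cluster kube-prometheus-stack URL that is commented out in the CR above (substitute your own endpoint); it uses the standard Prometheus /api/v1/label/<name>/values API:

// cluster_label_check.go: print the distinct "cluster" label values that
// Prometheus reports for Istio request telemetry. If a value such as
// "kubernetes" shows up here while Kiali's kubernetes_config.cluster_name is
// "aks-dev-eastus-152", the graph fails with the error in this issue.
package main

import (
    "encoding/json"
    "fmt"
    "net/http"
    "net/url"
)

func main() {
    // Assumed endpoint; replace with your Prometheus (or proxy) base URL.
    promURL := "http://prometheus-kube-prometheus-prometheus.prometheus.svc.cluster.local:9090"

    q := url.Values{}
    q.Set("match[]", "istio_requests_total") // restrict to Istio telemetry series

    resp, err := http.Get(promURL + "/api/v1/label/cluster/values?" + q.Encode())
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()

    var body struct {
        Status string   `json:"status"`
        Data   []string `json:"data"`
    }
    if err := json.NewDecoder(resp.Body).Decode(&body); err != nil {
        panic(err)
    }
    fmt.Println("cluster label values in telemetry:", body.Data)
}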

@nrfox
Contributor

nrfox commented Apr 25, 2024

There may still be a bug here. It looks like the service entry appender tries to use the cluster taken directly from the node:

// get the cached hosts for this cluster:namespace, otherwise add to the cache
serviceEntryHosts, found := getServiceEntryHosts(cluster, namespace, globalInfo)
if !found {
    istioCfg, err := globalInfo.Business.IstioConfig.GetIstioConfigList(context.TODO(), cluster, business.IstioConfigCriteria{
        IncludeServiceEntries: true,
    })
    graph.CheckError(err)

and
// Otherwise, if there are SE hosts defined for the cluster:namespace, check to see if they apply to the node
nodesToCheck := []*graph.Node{}
for _, n := range candidates {
    if a.loadServiceEntryHosts(n.Cluster, n.Namespace, globalInfo) {
        nodesToCheck = append(nodesToCheck, n)
    }
}

I think most appenders check whether the node is accessible before trying to use the cluster, right @jshaughn? It would be good not to error out when a node's cluster from telemetry doesn't match what Kiali is configured with, but rather to log a warning or something.
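
For illustration, a standalone sketch of that guard, with stubbed types standing in for graph.Node and the business-layer cluster lookup (the real Kiali accessors may differ); the point is to warn and skip rather than panic:

package main

import "log"

// Node mirrors the graph.Node fields that matter here (illustrative stub).
type Node struct {
    Cluster   string
    Namespace string
}

// accessibleClusters stands in for whatever the business layer exposes for
// cluster accessibility; hypothetical, not the actual Kiali API.
var accessibleClusters = map[string]bool{"aks-dev-eastus-152": true}

// filterAccessible keeps only nodes whose cluster Kiali can reach, logging a
// warning for the rest instead of erroring out the whole graph request.
func filterAccessible(candidates []*Node) []*Node {
    nodesToCheck := []*Node{}
    for _, n := range candidates {
        if !accessibleClusters[n.Cluster] {
            // Telemetry can report clusters (e.g. "kubernetes") that Kiali is
            // not configured with; skip them instead of panicking.
            log.Printf("WARN: skipping node in inaccessible cluster [%s]", n.Cluster)
            continue
        }
        nodesToCheck = append(nodesToCheck, n)
    }
    return nodesToCheck
}

func main() {
    candidates := []*Node{
        {Cluster: "aks-dev-eastus-152", Namespace: "default"},
        {Cluster: "kubernetes", Namespace: "default"},
    }
    log.Printf("kept %d of %d candidate nodes", len(filterAccessible(candidates)), len(candidates))
}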

@hhovsepy
Contributor

Right, there is a bug in the appenders; the same pattern exists for workload entries:
https://github.com/kiali/kiali/blob/master/graph/telemetry/istio/appender/workload_entry.go#L44-L60

@deuxailes
Author

@hhovsepy @nrfox Thanks for the speedy replies.

To add more context, we use Azure Monitor Workspace (AMW) as our hosted Prometheus server, and Azure APIM to forward requests to the AMW query endpoint with the proper /api/v1 path appended.

All of our environments send Prometheus metrics to that AMW instance.
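
A quick sanity check for a proxy setup like that: issue a trivial instant query through it and confirm a standard Prometheus API response comes back. A sketch, assuming a hypothetical APIM URI (the real one is redacted above); vector(1) is valid PromQL that any compliant /api/v1/query endpoint will answer:

package main

import (
    "fmt"
    "io"
    "net/http"
    "net/url"
)

func main() {
    // Hypothetical proxy base URL; substitute your private APIM URI.
    proxyURL := "https://example-apim.azure-api.net/prom"

    q := url.Values{"query": {"vector(1)"}}
    resp, err := http.Get(proxyURL + "/api/v1/query?" + q.Encode())
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()

    body, err := io.ReadAll(resp.Body)
    if err != nil {
        panic(err)
    }
    // Expect HTTP 200 and a JSON body starting with {"status":"success", ...};
    // anything else means the proxy is not exposing the Prometheus API the
    // way Kiali expects.
    fmt.Println(resp.Status)
    fmt.Println(string(body))
}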
