Pod init:Error in EKS IPv6 with istio cni #48677

Closed
2 tasks done
lwj5 opened this issue Jan 6, 2024 · 8 comments
Labels
area/networking lifecycle/automatically-closed Indicates a PR or issue that has been closed automatically. lifecycle/stale Indicates a PR or issue hasn't been manipulated by an Istio team member for a while

Comments

lwj5 commented Jan 6, 2024

Is this the right place to submit this?

  • This is not a security vulnerability or a crashing bug
  • This is not a question about how to use Istio

Bug Description

Pods are stuck in a restart loop after installing Istio CNI.

This is running in an IPv6 EKS cluster, with dual-stack on the node.

2024-01-06T08:02:18.250326Z	info	cni	============= Start iptables configuration for app-57876d7c9d-f58xq =============
2024-01-06T08:02:18.250349Z	info	cni	Istio iptables environment:
ENVOY_PORT=
INBOUND_CAPTURE_PORT=
ISTIO_INBOUND_INTERCEPTION_MODE=
ISTIO_INBOUND_TPROXY_ROUTE_TABLE=
ISTIO_INBOUND_PORTS=
ISTIO_OUTBOUND_PORTS=
ISTIO_LOCAL_EXCLUDE_PORTS=
ISTIO_EXCLUDE_INTERFACES=
ISTIO_SERVICE_CIDR=
ISTIO_SERVICE_EXCLUDE_CIDR=
ISTIO_META_DNS_CAPTURE=
INVALID_DROP=
2024-01-06T08:02:18.250355Z	info	cni	Istio iptables variables:
IPTABLES_VERSION=
PROXY_PORT=15001
PROXY_INBOUND_CAPTURE_PORT=15006
PROXY_TUNNEL_PORT=15008
PROXY_UID=1337
PROXY_GID=1337
INBOUND_INTERCEPTION_MODE=REDIRECT
INBOUND_TPROXY_MARK=1337
INBOUND_TPROXY_ROUTE_TABLE=133
INBOUND_PORTS_INCLUDE=*
INBOUND_PORTS_EXCLUDE=15020,15021,15090
OUTBOUND_OWNER_GROUPS_INCLUDE=*
OUTBOUND_OWNER_GROUPS_EXCLUDE=
OUTBOUND_IP_RANGES_INCLUDE=*
OUTBOUND_IP_RANGES_EXCLUDE=
OUTBOUND_PORTS_INCLUDE=
OUTBOUND_PORTS_EXCLUDE=15020
KUBE_VIRT_INTERFACES=
ENABLE_INBOUND_IPV6=false
DUAL_STACK=false
DNS_CAPTURE=false
DROP_INVALID=false
CAPTURE_ALL_DNS=false
DNS_SERVERS=[],[]
NETWORK_NAMESPACE=/var/run/netns/cni-173e67ff-4627-97ac-6aca-ca6cf68023a8
CNI_MODE=true
EXCLUDE_INTERFACES=
2024-01-06T08:02:18.250359Z	info	cni	Running iptables-restore with the following input:
* nat
-N ISTIO_INBOUND
-N ISTIO_REDIRECT
-N ISTIO_IN_REDIRECT
-N ISTIO_OUTPUT
-A ISTIO_INBOUND -p tcp --dport 15008 -j RETURN
-A ISTIO_REDIRECT -p tcp -j REDIRECT --to-ports 15001
-A ISTIO_IN_REDIRECT -p tcp -j REDIRECT --to-ports 15006
-A PREROUTING -p tcp -j ISTIO_INBOUND
-A ISTIO_INBOUND -p tcp --dport 15020 -j RETURN
-A ISTIO_INBOUND -p tcp --dport 15021 -j RETURN
-A ISTIO_INBOUND -p tcp --dport 15090 -j RETURN
-A ISTIO_INBOUND -p tcp -j ISTIO_IN_REDIRECT
-A OUTPUT -p tcp -j ISTIO_OUTPUT
-A ISTIO_OUTPUT -p tcp --dport 15020 -j RETURN
-A ISTIO_OUTPUT -o lo -s 127.0.0.6/32 -j RETURN
-A ISTIO_OUTPUT -o lo ! -d 127.0.0.1/32 -p tcp ! --dport 15008 -m owner --uid-owner 1337 -j ISTIO_IN_REDIRECT
-A ISTIO_OUTPUT -o lo -m owner ! --uid-owner 1337 -j RETURN
-A ISTIO_OUTPUT -m owner --uid-owner 1337 -j RETURN
-A ISTIO_OUTPUT -o lo ! -d 127.0.0.1/32 -p tcp ! --dport 15008 -m owner --gid-owner 1337 -j ISTIO_IN_REDIRECT
-A ISTIO_OUTPUT -o lo -m owner ! --gid-owner 1337 -j RETURN
-A ISTIO_OUTPUT -m owner --gid-owner 1337 -j RETURN
-A ISTIO_OUTPUT -d 127.0.0.1/32 -j RETURN
-A ISTIO_OUTPUT -j ISTIO_REDIRECT
COMMIT
2024-01-06T08:02:18.250364Z	info	cni	Running command (without lock by mount): iptables-restore --noflush
2024-01-06T08:02:18.250370Z	info	cni	Running ip6tables-restore with the following input:
2024-01-06T08:02:18.250372Z	info	cni	Running command (without lock by mount): ip6tables-restore --noflush
2024-01-06T08:02:18.250375Z	info	cni	Running command (without lock): iptables-save
2024-01-06T08:02:18.250378Z	info	cni	Command output:
# Generated by iptables-save v1.8.4 on Sat Jan  6 08:02:18 2024
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:ISTIO_INBOUND - [0:0]
:ISTIO_IN_REDIRECT - [0:0]
:ISTIO_OUTPUT - [0:0]
:ISTIO_REDIRECT - [0:0]
-A PREROUTING -p tcp -j ISTIO_INBOUND
-A OUTPUT -p tcp -j ISTIO_OUTPUT
-A ISTIO_INBOUND -p tcp -m tcp --dport 15008 -j RETURN
-A ISTIO_INBOUND -p tcp -m tcp --dport 15020 -j RETURN
-A ISTIO_INBOUND -p tcp -m tcp --dport 15021 -j RETURN
-A ISTIO_INBOUND -p tcp -m tcp --dport 15090 -j RETURN
-A ISTIO_INBOUND -p tcp -j ISTIO_IN_REDIRECT
-A ISTIO_IN_REDIRECT -p tcp -j REDIRECT --to-ports 15006
-A ISTIO_OUTPUT -p tcp -m tcp --dport 15020 -j RETURN
-A ISTIO_OUTPUT -s 127.0.0.6/32 -o lo -j RETURN
-A ISTIO_OUTPUT ! -d 127.0.0.1/32 -o lo -p tcp -m tcp ! --dport 15008 -m owner --uid-owner 1337 -j ISTIO_IN_REDIRECT
-A ISTIO_OUTPUT -o lo -m owner ! --uid-owner 1337 -j RETURN
-A ISTIO_OUTPUT -m owner --uid-owner 1337 -j RETURN
-A ISTIO_OUTPUT ! -d 127.0.0.1/32 -o lo -p tcp -m tcp ! --dport 15008 -m owner --gid-owner 1337 -j ISTIO_IN_REDIRECT
-A ISTIO_OUTPUT -o lo -m owner ! --gid-owner 1337 -j RETURN
-A ISTIO_OUTPUT -m owner --gid-owner 1337 -j RETURN
-A ISTIO_OUTPUT -d 127.0.0.1/32 -j RETURN
-A ISTIO_OUTPUT -j ISTIO_REDIRECT
-A ISTIO_REDIRECT -p tcp -j REDIRECT --to-ports 15001
COMMIT
# Completed on Sat Jan  6 08:02:18 2024
2024-01-06T08:02:18.250382Z	info	cni	============= End iptables configuration for app-57876d7c9d-f58xq =============
2024-01-06T08:02:25.021561Z	info	repair	Pod detected as broken, deleting: ns/app-57876d7c9d-f58xq
2024-01-06T08:02:25.033859Z	info	repair	Pod detected as broken, deleting: ns/app-57876d7c9d-f58xq
2024-01-06T08:02:26.316370Z	info	repair	Pod detected as broken, deleting: ns/app-57876d7c9d-f58xq
2024-01-06T08:02:27.024030Z	info	repair	Pod detected as broken, deleting: ns/app-57876d7c9d-f58xq
2024-01-06T08:02:27.046487Z	info	repair	Pod detected as broken, deleting: ns/app-57876d7c9d-f58xq
2024-01-06T08:02:27.065019Z	info	repair	Pod detected as broken, deleting: ns/app-57876d7c9d-f58xq
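
The log above reports `ENABLE_INBOUND_IPV6=false` and `DUAL_STACK=false`, and the line after "Running ip6tables-restore with the following input:" is not a rule. Whether that explains the restart loop is not established here. As a minimal sketch for checking other istio-cni logs for the same pattern, the following Python pulls out those values. The function names and the shortened `SAMPLE` excerpt are illustrative, not part of Istio, and only the first "Istio iptables variables:" section in the text is read.

```python
import re

# Matches lines such as ENABLE_INBOUND_IPV6=false (upper-case key, any value).
KEY_VALUE = re.compile(r"^([A-Z0-9_]+)=(.*)$")


def parse_iptables_variables(log_text):
    """Return KEY=VALUE pairs from the first 'Istio iptables variables:' section."""
    variables = {}
    in_section = False
    for line in log_text.splitlines():
        if "Istio iptables variables:" in line:
            in_section = True
            continue
        if in_section:
            match = KEY_VALUE.match(line.strip())
            if match is None:
                break  # the first line that is not KEY=VALUE ends the section
            variables[match.group(1)] = match.group(2)
    return variables


def ip6tables_input_is_empty(log_text):
    """True if no rule text follows the ip6tables-restore input header.

    Returns None when the header line is not present at all.
    """
    lines = log_text.splitlines()
    for index, line in enumerate(lines):
        if "Running ip6tables-restore with the following input:" in line:
            following = lines[index + 1].strip() if index + 1 < len(lines) else ""
            # Rule input starts with a table ("*"), chain (":") or rule ("-") line.
            return not following.startswith(("*", ":", "-"))
    return None


# Shortened excerpt of the log in this report (illustrative subset of lines).
SAMPLE = "\n".join([
    "2024-01-06T08:02:18.250355Z\tinfo\tcni\tIstio iptables variables:",
    "PROXY_PORT=15001",
    "ENABLE_INBOUND_IPV6=false",
    "DUAL_STACK=false",
    "CNI_MODE=true",
    "2024-01-06T08:02:18.250370Z\tinfo\tcni\tRunning ip6tables-restore with the following input:",
    "2024-01-06T08:02:18.250372Z\tinfo\tcni\tRunning command (without lock by mount): ip6tables-restore --noflush",
])

if __name__ == "__main__":
    found = parse_iptables_variables(SAMPLE)
    print("ENABLE_INBOUND_IPV6 =", found.get("ENABLE_INBOUND_IPV6"))
    print("DUAL_STACK =", found.get("DUAL_STACK"))
    print("ip6tables input empty:", ip6tables_input_is_empty(SAMPLE))
```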

Version

% istioctl version
client version: 1.20.1
control plane version: 1.20.1
data plane version: 1.20.1 (8 proxies)
% kubectl version        
Client Version: v1.28.3
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.28.4-eks-8cb36c9

Additional Information

No response

@ajaykumarmandapati

Hi, we are seeing similar behavior after upgrading from 1.17.8 to 1.19.5. The CNI log shows the messages below.

istioctl version
client version: 1.20.2
control plane version: 1.19.5
data plane version: 1.16.2 (1 proxies), 1.19.5 (3 proxies)
kubectl version --short
Flag --short has been deprecated, and will be removed in the future. The --short output will become the default.
Client Version: v1.25.4
Kustomize Version: v4.5.7
Server Version: v1.26.12-eks-5e0fdde
# Completed on Thu Jan 18 10:50:08 2024
2024-01-18T10:50:08.589561Z	info	cni	============= End iptables configuration for echoserver-5cb5cd9bc4-bc578 =============
2024-01-18T10:50:09.899425Z	info	repair	Pod detected as broken, deleting: victoria/vmagent-victoria-stack-564dc96bc4-9sc7f
2024-01-18T10:50:10.272651Z	info	repair	Pod detected as broken, deleting: victoria/vmagent-victoria-stack-564dc96bc4-9sc7f
2024-01-18T10:50:11.889020Z	info	repair	Pod detected as broken, deleting: victoria/vmagent-victoria-stack-564dc96bc4-9sc7f
2024-01-18T10:50:11.924535Z	info	repair	Pod detected as broken, deleting: victoria/vmagent-victoria-stack-564dc96bc4-9sc7f
2024-01-18T10:50:11.948501Z	error	controllers	error handling victoria/vmagent-victoria-stack-564dc96bc4-9sc7f, retrying (retry count: 1): pods "vmagent-victoria-stack-564dc96bc4-9sc7f" not found	controller=repair pods
2024-01-18T10:50:14.451337Z	info	cni	============= Start iptables configuration for vmagent-victoria-stack-564dc96bc4-mjmqb =============
2024-01-18T10:50:14.451364Z	info	cni	Istio iptables environment:
ENVOY_PORT=
INBOUND_CAPTURE_PORT=
ISTIO_INBOUND_INTERCEPTION_MODE=
ISTIO_INBOUND_TPROXY_ROUTE_TABLE=
ISTIO_INBOUND_PORTS=
ISTIO_OUTBOUND_PORTS=
ISTIO_LOCAL_EXCLUDE_PORTS=
ISTIO_EXCLUDE_INTERFACES=
ISTIO_SERVICE_CIDR=
ISTIO_SERVICE_EXCLUDE_CIDR=
ISTIO_META_DNS_CAPTURE=
INVALID_DROP=
2024-01-18T10:50:14.451368Z	info	cni	Istio iptables variables:
IPTABLES_VERSION=
PROXY_PORT=15001
PROXY_INBOUND_CAPTURE_PORT=15006
PROXY_TUNNEL_PORT=15008
PROXY_UID=1337
PROXY_GID=1337
INBOUND_INTERCEPTION_MODE=REDIRECT
INBOUND_TPROXY_MARK=1337
INBOUND_TPROXY_ROUTE_TABLE=133
INBOUND_PORTS_INCLUDE=
INBOUND_PORTS_EXCLUDE=15020,15021,15090
OUTBOUND_OWNER_GROUPS_INCLUDE=*
OUTBOUND_OWNER_GROUPS_EXCLUDE=
OUTBOUND_IP_RANGES_INCLUDE=
OUTBOUND_IP_RANGES_EXCLUDE=0.0.0.0/0
OUTBOUND_PORTS_INCLUDE=
OUTBOUND_PORTS_EXCLUDE=15020
KUBE_VIRT_INTERFACES=
ENABLE_INBOUND_IPV6=false
DUAL_STACK=false
DNS_CAPTURE=false
DROP_INVALID=false
CAPTURE_ALL_DNS=false
DNS_SERVERS=[],[]
NETWORK_NAMESPACE=/var/run/netns/cni-f6ae7dda-6ead-c508-47fb-75ff6f2e66ad
CNI_MODE=true
EXCLUDE_INTERFACES=
2024-01-18T10:50:14.451370Z	info	cni	Running iptables-restore with the following input:
* nat
-N ISTIO_INBOUND
-N ISTIO_REDIRECT
-N ISTIO_IN_REDIRECT
-N ISTIO_OUTPUT
-A ISTIO_INBOUND -p tcp --dport 15008 -j RETURN
-A ISTIO_REDIRECT -p tcp -j REDIRECT --to-ports 15001
-A ISTIO_IN_REDIRECT -p tcp -j REDIRECT --to-ports 15006
-A OUTPUT -p tcp -j ISTIO_OUTPUT
-A ISTIO_OUTPUT -p tcp --dport 15020 -j RETURN
-A ISTIO_OUTPUT -o lo -s 127.0.0.6/32 -j RETURN
-A ISTIO_OUTPUT -o lo ! -d 127.0.0.1/32 -p tcp ! --dport 15008 -m owner --uid-owner 1337 -j ISTIO_IN_REDIRECT
-A ISTIO_OUTPUT -o lo -m owner ! --uid-owner 1337 -j RETURN
-A ISTIO_OUTPUT -m owner --uid-owner 1337 -j RETURN
-A ISTIO_OUTPUT -o lo ! -d 127.0.0.1/32 -p tcp ! --dport 15008 -m owner --gid-owner 1337 -j ISTIO_IN_REDIRECT
-A ISTIO_OUTPUT -o lo -m owner ! --gid-owner 1337 -j RETURN
-A ISTIO_OUTPUT -m owner --gid-owner 1337 -j RETURN
-A ISTIO_OUTPUT -d 127.0.0.1/32 -j RETURN
-A ISTIO_OUTPUT -d 0.0.0.0/0 -j RETURN
COMMIT
2024-01-18T10:50:14.451372Z	info	cni	Running command (without lock by environment): iptables-restore --noflush
2024-01-18T10:50:14.451373Z	info	cni	Running ip6tables-restore with the following input:
2024-01-18T10:50:14.451375Z	info	cni	Running command (without lock by environment): ip6tables-restore --noflush
2024-01-18T10:50:14.451378Z	info	cni	Running command (without lock): iptables-save
2024-01-18T10:50:14.451380Z	info	cni	Command output: 
# Generated by iptables-save v1.8.9 on Thu Jan 18 10:50:14 2024
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:ISTIO_INBOUND - [0:0]
:ISTIO_IN_REDIRECT - [0:0]
:ISTIO_OUTPUT - [0:0]
:ISTIO_REDIRECT - [0:0]
-A OUTPUT -p tcp -j ISTIO_OUTPUT
-A ISTIO_INBOUND -p tcp -m tcp --dport 15008 -j RETURN
-A ISTIO_IN_REDIRECT -p tcp -j REDIRECT --to-ports 15006
-A ISTIO_OUTPUT -p tcp -m tcp --dport 15020 -j RETURN
-A ISTIO_OUTPUT -s 127.0.0.6/32 -o lo -j RETURN
-A ISTIO_OUTPUT ! -d 127.0.0.1/32 -o lo -p tcp -m tcp ! --dport 15008 -m owner --uid-owner 1337 -j ISTIO_IN_REDIRECT
-A ISTIO_OUTPUT -o lo -m owner ! --uid-owner 1337 -j RETURN
-A ISTIO_OUTPUT -m owner --uid-owner 1337 -j RETURN
-A ISTIO_OUTPUT ! -d 127.0.0.1/32 -o lo -p tcp -m tcp ! --dport 15008 -m owner --gid-owner 1337 -j ISTIO_IN_REDIRECT
-A ISTIO_OUTPUT -o lo -m owner ! --gid-owner 1337 -j RETURN
-A ISTIO_OUTPUT -m owner --gid-owner 1337 -j RETURN
-A ISTIO_OUTPUT -d 127.0.0.1/32 -j RETURN
-A ISTIO_OUTPUT -j RETURN
-A ISTIO_REDIRECT -p tcp -j REDIRECT --to-ports 15001
COMMIT
# Completed on Thu Jan 18 10:50:14 2024
2024-01-18T10:50:14.451383Z	info	cni	============= End iptables configuration for vmagent-victoria-stack-564dc96bc4-mjmqb =============
2024-01-18T10:50:14.943948Z	info	repair	Pod detected as broken, deleting: default/echoserver-5cb5cd9bc4-bc578
2024-01-18T10:50:15.264495Z	info	repair	Pod detected as broken, deleting: default/echoserver-5cb5cd9bc4-bc578
2024-01-18T10:50:16.907906Z	info	repair	Pod detected as broken, deleting: default/echoserver-5cb5cd9bc4-bc578
2024-01-18T10:50:16.930428Z	info	repair	Pod detected as broken, deleting: default/echoserver-5cb5cd9bc4-bc578
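
To see at a glance which pods the repair controller keeps deleting, the "Pod detected as broken, deleting" lines can be tallied per pod. This is only a sketch that assumes the log format shown above (timestamp, level, `repair` scope, message). The function name and the `SAMPLE` excerpt are illustrative.

```python
import re
from collections import Counter

# Captures the namespace/pod token that follows the repair message.
REPAIR_LINE = re.compile(r"\srepair\s+Pod detected as broken, deleting: (\S+)")


def count_repair_deletions(log_text):
    """Count 'Pod detected as broken, deleting' log lines per namespace/pod."""
    counts = Counter()
    for line in log_text.splitlines():
        match = REPAIR_LINE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# A few lines copied from the log above (illustrative subset).
SAMPLE = "\n".join([
    "2024-01-18T10:50:09.899425Z\tinfo\trepair\tPod detected as broken, deleting: victoria/vmagent-victoria-stack-564dc96bc4-9sc7f",
    "2024-01-18T10:50:10.272651Z\tinfo\trepair\tPod detected as broken, deleting: victoria/vmagent-victoria-stack-564dc96bc4-9sc7f",
    "2024-01-18T10:50:14.451383Z\tinfo\tcni\t============= End iptables configuration for vmagent-victoria-stack-564dc96bc4-mjmqb =============",
    "2024-01-18T10:50:14.943948Z\tinfo\trepair\tPod detected as broken, deleting: default/echoserver-5cb5cd9bc4-bc578",
])

if __name__ == "__main__":
    for pod, count in count_repair_deletions(SAMPLE).most_common():
        print(pod, count)
```

Several deletions of the same pod name within a few seconds, as in the log above, suggest the controller is retrying while the pod is still terminating rather than deleting distinct replacement pods.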

linsun (Member) commented Jan 29, 2024

@codeverifier any thoughts? I think you got this working recently. Was there a special env var to enable?

linsun (Member) commented Jan 29, 2024

cc @bleggett as well, in case he has any ideas.

day0ops (Contributor) commented Jan 29, 2024

@lwj5 is this on an EKS IPv6-only cluster with VPC CNI?

lwj5 (Author) commented Jan 29, 2024

@codeverifier yes, that is correct.

@bleggett (Contributor)

Something is likely fighting with istio-cni's sidecar repair feature over the pod.

What are the logs/status outputs of the failed pod?
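
One way to collect that information is sketched below. The helper `build_debug_commands` is hypothetical and only builds `kubectl` command lines without running them. It assumes the failing init container is named `istio-validation`, which should be confirmed against the `kubectl describe pod` output first. Because the repair controller deletes the pod, the commands may need to be run quickly against the current pod name.

```python
import shlex


def build_debug_commands(namespace, pod, init_container="istio-validation"):
    """Return kubectl argv lists for the status and init-container logs of one pod.

    The init container name is an assumption; pass the real one if it differs.
    """
    base = ["kubectl", "--namespace", namespace]
    return [
        # Events and container states, including the init container exit code.
        base + ["describe", "pod", pod],
        # Raw init container statuses as reported by the API server.
        base + ["get", "pod", pod, "-o", "jsonpath={.status.initContainerStatuses}"],
        # Logs of the current and the previous run of the init container.
        base + ["logs", pod, "-c", init_container],
        base + ["logs", pod, "-c", init_container, "--previous"],
    ]


if __name__ == "__main__":
    # Pod name taken from the log earlier in this thread.
    for argv in build_debug_commands("default", "echoserver-5cb5cd9bc4-bc578"):
        print(shlex.join(argv))
```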

@istio-policy-bot istio-policy-bot added the lifecycle/stale Indicates a PR or issue hasn't been manipulated by an Istio team member for a while label Apr 29, 2024
@ajaykumarmandapati

> Something is likely fighting with istio-cni's sidecar repair feature over the pod.
>
> What are the logs/status outputs of the failed pod?

Here is more information on the issue - #50660 (comment)

@istio-policy-bot

🚧 This issue or pull request has been closed due to not having had activity from an Istio team member since 2024-01-29. If you feel this issue or pull request deserves attention, please reopen the issue. Please see this wiki page for more information. Thank you for your contributions.

Created by the issue and PR lifecycle manager.

@istio-policy-bot istio-policy-bot added the lifecycle/automatically-closed Indicates a PR or issue that has been closed automatically. label May 14, 2024