
Trace exec with filter has blindspots when configured with podinformer container collection #2614

Closed
danielpacak opened this issue Mar 15, 2024 · 11 comments
Labels
lifecycle/staleproof Avoid this issue / PR being marked as stale by the bot.

Comments

@danielpacak
Contributor

danielpacak commented Mar 15, 2024

Description

We cannot always use the runc+fanotify or fanotify+ebpf container collection strategies, as shown in this example. When I checked the code, it seems that if these two strategies are not supported, IG falls back to the pod informer collection. However, in my setup the pod informer collection caused the loss of exec events that happen very early in the container lifecycle, e.g. the container entrypoint execution.

Impact

Lost exec events. I can imagine this might be acceptable for a tool such as IG; I'm just trying to figure out whether this is expected behavior.

I assume the container event is delayed and the raw exec event is ignored before the mntns eBPF map is updated.

Environment and steps to reproduce

  1. Set-up:
    Bottlerocket OS 1.15.1 (aws-k8s-1.26); Kernel 5.15.128; IG v0.23.1 .. v0.26.0
  2. Action(s):
    kubectl run -ti --rm -n default dpacak-test --image=ubuntu -- bash
    
  3. Error:
    kubectl gadget deploy
    
    $ kubectl gadget trace exec -n default -p dpacak-test
    [... the very first bash exec is not shown here ...]

Expected behavior

The entry point bash exec with container metadata is captured by IG.

Additional information

I bumped into this issue while working on capturing container runtime profiles (aka RAD fingerprints). We've been using IG for some parts of that work, but we are hitting more and more limitations. Therefore, I'd appreciate your feedback on the observed behavior and whether we are pushing IG too hard to its limits. To give you a real example, check the fingerprints of the ubuntu image built with the fanotify and podinformer collectors respectively. The latter is incomplete.

fanotify+ebpf
podinformer

@eiffel-fl
Member

Hi!

However, in my setup the pod informer collection caused the loss of exec events that happen very early in the container lifecycle, e.g. the container entrypoint execution.

If I remember correctly, this is a limitation of podinformer, but I cannot find any documentation where we state this or explain why.
@alban Do you have any light to shed on this?
In the meantime, I will also take a deeper look to try to find the reason!

Best regards.

@alban
Member

alban commented Mar 18, 2024

The podinformer will inherently take a bit of time before announcing a new container, because the message comes from the Kubernetes API server (on another node), and the startup of the container is not paused during that process.

Why is the fanotify-ebpf not working for you? Because the kernel is compiled without fanotify?

@danielpacak
Contributor Author

danielpacak commented Mar 18, 2024

Why is the fanotify-ebpf not working for you? Because the kernel is compiled without fanotify?

That's a good question. We run it on Bottlerocket, which is pretty confined and hardened with various LSMs, but I'll try to figure out exactly where it fails and come back with details.

@danielpacak
Contributor Author

time="2024-03-18T12:40:51Z" level=warning msg="container-hook: failed to fanotify mark: fanotify: mark error, permission denied"
time="2024-03-18T12:40:51Z" level=warning msg="container-hook: failed to fanotify mark: fanotify: mark error, permission denied"
time="2024-03-18T12:40:51Z" level=warning msg="ContainerNotifier: not supported: no container runtime can be monitored with fanotify. The following paths were tested: /bin/runc,/usr/bin/runc,/usr/sbin/runc,/usr/local/bin/runc,/usr/local/sbin/runc,/usr/lib/cri-o-runc/sbin/runc,/run/torcx/unpack/docker/bin/runc,/usr/bin/crun,/usr/bin/conmon. You can use the RUNTIME_PATH env variable to specify a custom path. If you are successful doing so, please open a PR to add your custom path to runtimePaths"

@eiffel-fl
Member

eiffel-fl commented Mar 18, 2024

Why is the fanotify-ebpf not working for you? Because the kernel is compiled without fanotify?

That's a good question. We run it on Bottlerocket, which is pretty confined and hardened with various LSMs, but I'll try to figure out exactly where it fails and come back with details.

I had never heard about Bottlerocket, so it is highly possible we do not support this distribution.

time="2024-03-18T12:40:51Z" level=warning msg="container-hook: failed to fanotify mark: fanotify: mark error, permission denied"
time="2024-03-18T12:40:51Z" level=warning msg="container-hook: failed to fanotify mark: fanotify: mark error, permission denied"
time="2024-03-18T12:40:51Z" level=warning msg="ContainerNotifier: not supported: no container runtime can be monitored with fanotify. The following paths were tested: /bin/runc,/usr/bin/runc,/usr/sbin/runc,/usr/local/bin/runc,/usr/local/sbin/runc,/usr/lib/cri-o-runc/sbin/runc,/run/torcx/unpack/docker/bin/runc,/usr/bin/crun,/usr/bin/conmon. You can use the RUNTIME_PATH env variable to specify a custom path. If you are successful doing so, please open a PR to add your custom path to runtimePaths"

Interesting, where is runc in Bottlerocket? It should normally be under /bin/runc, as pointed out by:
https://github.com/bottlerocket-os/bottlerocket/blob/d533ff69c9349ecaf315b787b8524a78e2ef0719/packages/runc/runc.spec#L42
Can you try to find it?

Also, do you have access to their kernel config (e.g. under /proc/config.gz)? If so, you can search for FANOTIFY.
I would say no, as nothing in their config mentions FANOTIFY, but it is still better to test in situ:
https://github.com/bottlerocket-os/bottlerocket/blob/d533ff69c9349ecaf315b787b8524a78e2ef0719/packages/kernel-6.1/kernel-6.1.spec#L120

$ grep FANOTIFY ./arch/x86/configs/x86_64_defconfig
$ echo $?
1

https://github.com/bottlerocket-os/bottlerocket/blob/develop/packages/kernel-6.1/config-bottlerocket
https://github.com/bottlerocket-os/bottlerocket/blob/develop/packages/kernel-6.1/config-bottlerocket-aws

@danielpacak
Contributor Author

danielpacak commented Mar 18, 2024

I believe runc is in /bin/runc, and the problem is not finding it but the permission denied errors I saw after enabling the debug logger:

time="2024-03-18T13:30:29Z" level=debug msg="bpf already mounted"
time="2024-03-18T13:30:29Z" level=debug msg="debugfs already mounted"
time="2024-03-18T13:30:29Z" level=debug msg="tracefs already mounted"
time="2024-03-18T13:30:30Z" level=debug msg="container-hook: trying runtime at /host/bin/runc"
time="2024-03-18T13:30:30Z" level=warning msg="container-hook: failed to fanotify mark: fanotify: mark error, permission denied"
time="2024-03-18T13:30:30Z" level=debug msg="container-hook: trying runtime at /host/usr/bin/runc"
time="2024-03-18T13:30:30Z" level=warning msg="container-hook: failed to fanotify mark: fanotify: mark error, permission denied"
time="2024-03-18T13:30:30Z" level=debug msg="container-hook: trying runtime at /host/usr/sbin/runc"
time="2024-03-18T13:30:30Z" level=debug msg="container-hook: runc at /host/usr/sbin/runc not found"
time="2024-03-18T13:30:30Z" level=debug msg="container-hook: trying runtime at /host/usr/local/bin/runc"
time="2024-03-18T13:30:30Z" level=debug msg="container-hook: runc at /host/usr/local/bin/runc not found"
time="2024-03-18T13:30:30Z" level=debug msg="container-hook: trying runtime at /host/usr/local/sbin/runc"
time="2024-03-18T13:30:30Z" level=debug msg="container-hook: runc at /host/usr/local/sbin/runc not found"
time="2024-03-18T13:30:30Z" level=debug msg="container-hook: trying runtime at /host/usr/lib/cri-o-runc/sbin/runc"
time="2024-03-18T13:30:30Z" level=debug msg="container-hook: runc at /host/usr/lib/cri-o-runc/sbin/runc not found"
time="2024-03-18T13:30:30Z" level=debug msg="container-hook: trying runtime at /host/run/torcx/unpack/docker/bin/runc"
time="2024-03-18T13:30:30Z" level=debug msg="container-hook: runc at /host/run/torcx/unpack/docker/bin/runc not found"
time="2024-03-18T13:30:30Z" level=debug msg="container-hook: trying runtime at /host/usr/bin/crun"
time="2024-03-18T13:30:30Z" level=debug msg="container-hook: runc at /host/usr/bin/crun not found"
time="2024-03-18T13:30:30Z" level=debug msg="container-hook: trying runtime at /host/usr/bin/conmon"
time="2024-03-18T13:30:30Z" level=debug msg="container-hook: runc at /host/usr/bin/conmon not found"
time="2024-03-18T13:30:30Z" level=warning msg="ContainerNotifier: not supported: no container runtime can be monitored with fanotify. The following paths were tested: /bin/runc,/usr/bin/runc,/usr/sbin/runc,/usr/local/bin/runc,/usr/local/sbin/runc,/usr/lib/cri-o-runc/sbin/runc,/run/torcx/unpack/docker/bin/runc,/usr/bin/crun,/usr/bin/conmon. You can use the RUNTIME_PATH env variable to specify a custom path. If you are successful doing so, please open a PR to add your custom path to runtimePaths"
time="2024-03-18T13:30:30Z" level=debug msg="Runcfanotify: trying runc at /host/bin/runc"
time="2024-03-18T13:30:30Z" level=warning msg="Runcfanotify: failed to fanotify mark: fanotify: mark error, permission denied"
time="2024-03-18T13:30:30Z" level=debug msg="Runcfanotify: trying runc at /host/usr/bin/runc"
time="2024-03-18T13:30:30Z" level=warning msg="Runcfanotify: failed to fanotify mark: fanotify: mark error, permission denied"
time="2024-03-18T13:30:30Z" level=debug msg="Runcfanotify: trying runc at /host/usr/sbin/runc"
time="2024-03-18T13:30:30Z" level=debug msg="Runcfanotify: runc at /host/usr/sbin/runc not found"
time="2024-03-18T13:30:30Z" level=debug msg="Runcfanotify: trying runc at /host/usr/local/bin/runc"
time="2024-03-18T13:30:30Z" level=debug msg="Runcfanotify: runc at /host/usr/local/bin/runc not found"
time="2024-03-18T13:30:30Z" level=debug msg="Runcfanotify: trying runc at /host/usr/local/sbin/runc"
time="2024-03-18T13:30:30Z" level=debug msg="Runcfanotify: runc at /host/usr/local/sbin/runc not found"
time="2024-03-18T13:30:30Z" level=debug msg="Runcfanotify: trying runc at /host/usr/lib/cri-o-runc/sbin/runc"
time="2024-03-18T13:30:30Z" level=debug msg="Runcfanotify: runc at /host/usr/lib/cri-o-runc/sbin/runc not found"
time="2024-03-18T13:30:30Z" level=debug msg="Runcfanotify: trying runc at /host/run/torcx/unpack/docker/bin/runc"
time="2024-03-18T13:30:30Z" level=debug msg="Runcfanotify: runc at /host/run/torcx/unpack/docker/bin/runc not found"
time="2024-03-18T13:30:30Z" level=debug msg="Runcfanotify: trying runc at /host/usr/bin/crun"
time="2024-03-18T13:30:30Z" level=debug msg="Runcfanotify: runc at /host/usr/bin/crun not found"
time="2024-03-18T13:30:30Z" level=warning msg="checking if current pid namespace is host pid namespace no runc instance can be monitored with fanotify. The following paths were tested: /bin/runc,/usr/bin/runc,/usr/sbin/runc,/usr/local/bin/runc,/usr/local/sbin/runc,/usr/lib/cri-o-runc/sbin/runc,/run/torcx/unpack/docker/bin/runc,/usr/bin/crun. You can use the RUNC_PATH env variable to specify a custom path. If you are successful doing so, please open a PR to add your custom path to runcPaths"

I'm going to check the kernel config as suggested.

@danielpacak
Contributor Author

I've checked again, and it seems that Bottlerocket OS is compiled with fanotify support:

cat /boot/config | grep FANOTIFY
CONFIG_FANOTIFY=y
CONFIG_FANOTIFY_ACCESS_PERMISSIONS=y

However, it seems that Bottlerocket's default SELinux policy blocks the fanotify_mark syscall with the FAN_OPEN_EXEC_PERM flag. As a naive test, I used the FAN_OPEN_EXEC flag instead, and that one worked without being blocked.

Do you happen to know if using FAN_OPEN_EXEC is reasonable without breaking the functionality of the container collector? Attached is the code snippet I used as a minimal reproducible example:

fanotify_mark_FAN_OPEN_EXEC_PERM
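For reference, a hypothetical minimal reproducer along those lines (not the attached snippet itself): it tries fanotify_mark with each flag on a scratch file and reports the errno. This is a raw-syscall sketch for linux/amd64 with no external dependencies; the constant values are taken from the kernel's include/uapi/linux/fanotify.h.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
	"unsafe"
)

// Values from include/uapi/linux/fanotify.h.
const (
	fanClassContent = 0x04       // FAN_CLASS_CONTENT, required for _PERM events
	fanMarkAdd      = 0x01       // FAN_MARK_ADD
	fanOpenExec     = 0x00001000 // FAN_OPEN_EXEC (asynchronous notification)
	fanOpenExecPerm = 0x00040000 // FAN_OPEN_EXEC_PERM (blocking permission event)
	atFdcwd         = -0x64      // AT_FDCWD
)

// tryMark initializes a fanotify group and marks path with the given
// mask, returning a human-readable status string.
func tryMark(path string, mask uint64) string {
	fd, _, errno := syscall.Syscall(syscall.SYS_FANOTIFY_INIT,
		fanClassContent, uintptr(syscall.O_RDONLY), 0)
	if errno != 0 {
		// e.g. EPERM when run without CAP_SYS_ADMIN
		return "fanotify_init failed: " + errno.Error()
	}
	defer syscall.Close(int(fd))

	p, err := syscall.BytePtrFromString(path)
	if err != nil {
		return err.Error()
	}
	dirfd := atFdcwd // variable, so the negative value converts to uintptr
	_, _, errno = syscall.Syscall6(syscall.SYS_FANOTIFY_MARK, fd, fanMarkAdd,
		uintptr(mask), uintptr(dirfd), uintptr(unsafe.Pointer(p)), 0)
	if errno != 0 {
		// an LSM denial would show up here, e.g. as EACCES
		return "fanotify_mark failed: " + errno.Error()
	}
	return "fanotify_mark succeeded"
}

func main() {
	// Mark a private scratch file rather than a system binary, so the
	// _PERM mark cannot block anything else on the host.
	f, err := os.CreateTemp("", "fanotify-mark-test")
	if err != nil {
		fmt.Println(err)
		return
	}
	defer os.Remove(f.Name())
	f.Close()

	fmt.Println("FAN_OPEN_EXEC:     ", tryMark(f.Name(), fanOpenExec))
	fmt.Println("FAN_OPEN_EXEC_PERM:", tryMark(f.Name(), fanOpenExecPerm))
}
```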

@eiffel-fl
Member

eiffel-fl commented Mar 22, 2024

I've checked again, and it seems that Bottlerocket OS is compiled with fanotify support:

cat /boot/config | grep FANOTIFY
CONFIG_FANOTIFY=y
CONFIG_FANOTIFY_ACCESS_PERMISSIONS=y

Good to know! This should be able to work then.

However, it seems that Bottlerocket's default SELinux policy blocks the fanotify_mark syscall with the FAN_OPEN_EXEC_PERM flag. As a naive test, I used the FAN_OPEN_EXEC flag instead, and that one worked without being blocked.

What is the return value you get when using FAN_OPEN_EXEC_PERM?
From upstream kernel, EINVAL is returned in some cases:
https://elixir.bootlin.com/linux/v6.8/source/include/linux/fanotify.h#L107
https://elixir.bootlin.com/linux/v6.8/source/fs/notify/fanotify/fanotify_user.c#L1711
https://elixir.bootlin.com/linux/v6.8/source/fs/notify/fanotify/fanotify_user.c#L1839

Do you happen to know if using FAN_OPEN_EXEC is reasonable without breaking the functionality of a container collector? Attached is the code snippet, which I used as minimal reproducible example:

After taking a quick look, I would say these two flags are quite different:

fanotify_mark_FAN_OPEN_EXEC_PERM

Can you please share this code somewhere, so I can pull it and test it?

@danielpacak
Contributor Author

Thanks @eiffel-fl for the prompt reply. I'll share a minimal example I've been using for tests as soon as I have some spare CPU/time cycles.

@alban
Member

alban commented Mar 25, 2024

The difference between FAN_OPEN_EXEC and FAN_OPEN_EXEC_PERM is that the first is unfortunately not synchronous: the process execution (e.g. runc) will not be paused while the fanotify code in ig is running.

All flags with the _PERM suffix mean that the process using the fanotify API (e.g. ig) must use ResponseAllow or ResponseDeny to let the kernel know the target process (e.g. runc) can continue.

If you use FAN_OPEN_EXEC, the following calls to ResponseAllow will not do anything because the execution is not paused:

        // This unblocks the execution
        defer n.runtimeBinaryNotify.ResponseAllow(data)
        // This unblocks whoever is accessing the pidfile
        defer pidFileDirNotify.ResponseAllow(data)

So it would be racy: it might work, but it is not guaranteed, and you might lose the first few events.
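For illustration, this is roughly what answering a _PERM event looks like on the wire: the listener writes a struct fanotify_response { __s32 fd; __u32 response; } back to the fanotify fd (a toy sketch assuming a little-endian host; FAN_ALLOW/FAN_DENY values are from the kernel's fanotify.h). With FAN_OPEN_EXEC there is no event fd to answer, so such a write accomplishes nothing.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Values from include/uapi/linux/fanotify.h.
const (
	fanAllow = 0x01 // FAN_ALLOW: let the paused process continue
	fanDeny  = 0x02 // FAN_DENY: block it
)

// encodeResponse builds the 8-byte fanotify_response buffer that is
// written back to the fanotify fd to answer a _PERM event.
func encodeResponse(eventFd int32, response uint32) []byte {
	buf := make([]byte, 8)
	binary.LittleEndian.PutUint32(buf[0:4], uint32(eventFd))
	binary.LittleEndian.PutUint32(buf[4:8], response)
	return buf
}

func main() {
	// Allow the process behind event fd 42 (e.g. runc) to continue.
	fmt.Printf("% x\n", encodeResponse(42, fanAllow)) // prints "2a 00 00 00 01 00 00 00"
}
```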


This issue has been automatically marked as stale because it has not had recent activity.
It will be closed in 14 days if no further activity occurs.

@github-actions github-actions bot added the lifecycle/stale Marked to be closed in next 14 days because of inactivity. label May 25, 2024
@mauriciovasquezbernal mauriciovasquezbernal added lifecycle/staleproof Avoid this issue / PR being marked as stale by the bot. and removed lifecycle/stale Marked to be closed in next 14 days because of inactivity. labels May 27, 2024