[Bug] Booting the VM kernel gives "Internal error: undefined instruction: 0 [#1] SMP" on Volterra #4508

Open
1 of 3 tasks
jglathe opened this issue Mar 15, 2024 · 2 comments

jglathe commented Mar 15, 2024

Describe the bug

Hi,

I was checking whether I could run Firecracker on the Volterra box (Windows Dev Kit 2023). The host is booted into EL2, has /dev/kvm, and already runs an lxc VM, which I set up to confirm that VMs are generally available and working.

[Image: Screenshot from 2024-02-29 14-40-00]

While booting, the guest kernel halts with an undefined instruction error. I'm not sure whether a kernel config option needs to be changed.


[    0.005519] ASID allocator initialised with 32768 entries
[    0.005920] Hierarchical SRCU implementation.
[    0.006492] EFI services will not be available.
[    0.006825] smp: Bringing up secondary CPUs ...
[    0.007143] smp: Brought up 1 node, 1 CPU
[    0.007417] SMP: Total of 1 processors activated.
[    0.007714] CPU features: detected: GIC system register CPU interface
[    0.008131] CPU features: detected: Privileged Access Never
[    0.008494] CPU features: detected: User Access Override
[    0.008840] CPU features: detected: 32-bit EL0 Support
[    0.009208] Internal error: undefined instruction: 0 [#1] SMP
[    0.009578] Process migration/0 (pid: 10, stack limit = 0xffffff8008b78000)
[    0.010030] CPU: 0 PID: 10 Comm: migration/0 Not tainted 4.14.174+ #14
[    0.010443] Hardware name: linux,dummy-virt (DT)
[    0.010739] task: ffffffc0068a0000 task.stack: ffffff8008b78000
[    0.011147] PC is at arm64_set_ssbd_mitigation+0x64/0xa0
[    0.011492] LR is at arm64_set_ssbd_mitigation+0x18/0xa0
[    0.011851] pc : [<ffffff800808e154>] lr : [<ffffff800808e108>] pstate: 004000c5
[    0.012346] sp : ffffff8008b7bd20
[    0.012569] x29: ffffff8008b7bd20 x28: 0000000000000000 
[    0.012927] x27: ffffff800803bbd0 x26: 0000000000000001 
[    0.013282] x25: ffffff800871ee20 x24: ffffff8008854e66 
[    0.013637] x23: 0000000000000001 x22: 0000000000000040 
[    0.013997] x21: ffffff800803bc94 x20: ffffff800887c4a8 
[    0.014353] x19: 0000000000000001 x18: ffffffffffffffff 
[    0.014710] x17: 0000000000000007 x16: 0000000000000001 
[    0.015072] x15: ffffff80087fad08 x14: ffffff808888af37 
[    0.015416] x13: 0000000000000000 x12: 0000000000000001 
[    0.015765] x11: 0000000000000000 x10: 0000000000000a00 
[    0.016110] x9 : ffffff8008b7bd70 x8 : ffffffc0068a0a60 
[    0.016455] x7 : 0000000000000000 x6 : 00000000ffffffff 
[    0.016793] x5 : 0000003fff6eb000 x4 : 0000000000000004 
[    0.017089] x3 : 0000000000000000 x2 : 0000000000000001 
[    0.017432] x1 : 0000000000000001 x0 : 0000000000000001 
[    0.017776] Call trace:
[    0.017941] Exception stack(0xffffff8008b7bbe0 to 0xffffff8008b7bd20)
[    0.018363] bbe0: 0000000000000001 0000000000000001 0000000000000001 0000000000000000
[    0.018878] bc00: 0000000000000004 0000003fff6eb000 00000000ffffffff 0000000000000000
[    0.019342] bc20: ffffffc0068a0a60 ffffff8008b7bd70 0000000000000a00 0000000000000000
[    0.019845] bc40: 0000000000000001 0000000000000000 ffffff808888af37 ffffff80087fad08
[    0.020325] bc60: 0000000000000001 0000000000000007 ffffffffffffffff 0000000000000001
[    0.020750] bc80: ffffff800887c4a8 ffffff800803bc94 0000000000000040 0000000000000001
[    0.021178] bca0: ffffff8008854e66 ffffff800871ee20 0000000000000001 ffffff800803bbd0
[    0.021646] bcc0: 0000000000000000 ffffff8008b7bd20 ffffff800808e108 ffffff8008b7bd20
[    0.022099] bce0: ffffff800808e154 00000000004000c5 ffffff8008652778 ffffff800808a95c
[    0.022592] bd00: ffffffffffffffff ffffff800808e108 ffffff8008b7bd20 ffffff800808e154
[    0.023098] [<ffffff800808e154>] arm64_set_ssbd_mitigation+0x64/0xa0
[    0.023505] [<ffffff800808ef14>] cpu_enable_ssbs+0x74/0xa0
[    0.023846] [<ffffff800808e6b0>] __enable_cpu_capability+0x10/0x20
[    0.024233] [<ffffff800813535c>] multi_cpu_stop+0x8c/0x110
[    0.024572] [<ffffff8008135634>] cpu_stopper_thread+0xc4/0x148
[    0.024893] [<ffffff80080c3d30>] smpboot_thread_fn+0x1a0/0x1d0
[    0.025261] [<ffffff80080bf80c>] kthread+0x12c/0x130
[    0.025567] [<ffffff8008084c50>] ret_from_fork+0x10/0x18
[    0.025898] Code: d4000002 f9400bf3 a8c27bfd d65f03c0 (d503403f) 
[    0.026279] ---[ end trace 66dc7e40a2c28e42 ]---
[    0.026574] note: migration/0[10] exited with preempt_count 1

Attached is the complete log. The VM starts to boot, then balks at the code shown above.

start.txt

To get the vmlinux kernel I set export arch=aarch64. Maybe it needs to be more specific?

To Reproduce

I used https://github.com/alexellis/firecracker-init-lab in addition to the Firecracker how-to. Everything went smoothly until the VM crashed.

Expected behaviour

It just works ™️

Environment

  • Firecracker version: 1.6.0
  • Host and guest kernel versions: host: 6.8.0, guest: 5.10.209
  • Rootfs used: ubuntu-22.04.ext4
  • Architecture: aarch64 on Qualcomm sc8280xp, booted to EL2 with slbounce
  • Any other relevant software versions: Host distribution is Ubuntu 23.10 (Desktop Image for Raspberry Pi, assimilated); very stable.

Additional context

I assume the guest kernel config expects an ARMv8 feature that this chip does not provide?
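One way to narrow this down is to decode the faulting instruction word, which the oops prints in parentheses on the Code: line. The sketch below is a minimal decoder for the A64 "MSR (immediate)" encoding class only; the helper names (faulting_word, decode_msr_imm, describe) are made up for illustration, and the field table covers just a handful of PSTATE fields.

```python
import re

# (op1, op2) -> PSTATE field name for the A64 "MSR (immediate)" encoding.
# Only a few fields are listed; this is not a complete table.
PSTATE_FIELDS = {
    (0, 3): "UAO",
    (0, 4): "PAN",
    (0, 5): "SPSel",
    (3, 1): "SSBS",
    (3, 6): "DAIFSet",
    (3, 7): "DAIFClr",
}


def faulting_word(code_line):
    """Return the parenthesised (faulting) word of an oops 'Code:' line, or None."""
    match = re.search(r"\(([0-9a-fA-F]{8})\)", code_line)
    return int(match.group(1), 16) if match else None


def decode_msr_imm(word):
    """Return (op1, op2, crm) if word is an MSR (immediate), else None."""
    # Fixed bits: 1101 0101 0000 0 | op1 | 0100 | CRm | op2 | 11111
    if (word & 0xFFF8F01F) != 0xD500401F:
        return None
    op1 = (word >> 16) & 0x7
    crm = (word >> 8) & 0xF
    op2 = (word >> 5) & 0x7
    return op1, op2, crm


def describe(word):
    """Render a word as 'msr <field>, #imm', or None if it is not MSR (immediate)."""
    fields = decode_msr_imm(word)
    if fields is None:
        return None
    op1, op2, crm = fields
    name = PSTATE_FIELDS.get((op1, op2), f"<op1={op1}, op2={op2}>")
    return f"msr {name}, #{crm}"


# Code: line copied from the trace in this issue.
code_line = "Code: d4000002 f9400bf3 a8c27bfd d65f03c0 (d503403f)"
print(describe(faulting_word(code_line)))  # msr SSBS, #0
```

The faulting word decodes as a write to the SSBS PSTATE field, which would be consistent with the PC being inside arm64_set_ssbd_mitigation. Architecturally, that instruction is only defined when the SSBS MSR/MRS instructions are implemented, so it may be worth checking what the vCPU actually reports in ID_AA64PFR1_EL1.SSBS.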

Checks

  • Have you searched the Firecracker Issues database for similar problems?
  • Have you read the existing relevant Firecracker documentation?
  • Are you certain the bug being reported is a Firecracker issue?

jglathe commented Mar 16, 2024

Additional info: these are the CPU features that the host and the lxc VM present to the kernel:

jglathe@sdbox2:~/firecracker$ journalctl -b -1 -g  "features: detected"
Okt 26 15:55:41 sdbox2 kernel: CPU features: detected: GIC system register CPU interface
Okt 26 15:55:41 sdbox2 kernel: CPU features: detected: Spectre-v4
Okt 26 15:55:41 sdbox2 kernel: CPU features: detected: Spectre-BHB
Okt 26 15:55:41 sdbox2 kernel: CPU features: detected: Kernel page table isolation (KPTI)
Okt 26 15:55:41 sdbox2 kernel: CPU features: detected: 32-bit EL0 Support
Okt 26 15:55:41 sdbox2 kernel: CPU features: detected: Data cache clean to the PoU not required for I/D coherence
Okt 26 15:55:41 sdbox2 kernel: CPU features: detected: Common not Private translations
Okt 26 15:55:41 sdbox2 kernel: CPU features: detected: CRC32 instructions
Okt 26 15:55:41 sdbox2 kernel: CPU features: detected: Data cache clean to Point of Persistence
Okt 26 15:55:41 sdbox2 kernel: CPU features: detected: RCpc load-acquire (LDAPR)
Okt 26 15:55:41 sdbox2 kernel: CPU features: detected: LSE atomic instructions
Okt 26 15:55:41 sdbox2 kernel: CPU features: detected: Privileged Access Never
Okt 26 15:55:41 sdbox2 kernel: CPU features: detected: RAS Extension Support
Okt 26 15:55:41 sdbox2 kernel: CPU features: detected: Speculative Store Bypassing Safe (SSBS)
Okt 26 15:55:41 sdbox2 kernel: CPU features: detected: Hardware dirty bit management on CPU0-7
jglathe@sdbox2:~/firecracker$ lxc exec ubuntu -- bash
root@ubuntu:~# sudo dmesg|grep "features: detected"
[    0.000000] CPU features: detected: GIC system register CPU interface
[    0.000000] CPU features: detected: Spectre-v4
[    0.000000] CPU features: detected: Spectre-BHB
[    0.000000] CPU features: detected: Kernel page table isolation (KPTI)
[    0.026448] CPU features: detected: 32-bit EL0 Support
[    0.026450] CPU features: detected: Data cache clean to the PoU not required for I/D coherence
[    0.026453] CPU features: detected: Common not Private translations
[    0.026454] CPU features: detected: CRC32 instructions
[    0.026456] CPU features: detected: Data cache clean to Point of Persistence
[    0.026459] CPU features: detected: RCpc load-acquire (LDAPR)
[    0.026461] CPU features: detected: LSE atomic instructions
[    0.026463] CPU features: detected: Privileged Access Never
[    0.026465] CPU features: detected: RAS Extension Support
[    0.026467] CPU features: detected: Speculative Store Bypassing Safe (SSBS)
[    0.028328] CPU features: detected: Hardware dirty bit management on CPU0-3
root@ubuntu:~# 

Does the VM just want to do something odd, like using SSBD even if it's not there?
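To compare the logs more systematically, the "CPU features: detected" lines can be diffed as sets. This is a rough sketch; the function name detected_features is hypothetical, and the sample text is an abbreviated copy of the host log above and of the crashing Firecracker guest log from the first comment.

```python
import re

# Captures the feature name; an optional " on CPU0-7" style suffix is dropped.
FEATURE_RE = re.compile(r"CPU features: detected: (.+?)(?: on CPU[\d-]+)?\s*$")


def detected_features(log_text):
    """Collect the feature names from 'CPU features: detected:' log lines."""
    features = set()
    for line in log_text.splitlines():
        match = FEATURE_RE.search(line)
        if match:
            features.add(match.group(1))
    return features


# Abbreviated host log (journalctl output quoted in this issue).
host_log = """\
Okt 26 15:55:41 sdbox2 kernel: CPU features: detected: GIC system register CPU interface
Okt 26 15:55:41 sdbox2 kernel: CPU features: detected: 32-bit EL0 Support
Okt 26 15:55:41 sdbox2 kernel: CPU features: detected: Privileged Access Never
Okt 26 15:55:41 sdbox2 kernel: CPU features: detected: Speculative Store Bypassing Safe (SSBS)
Okt 26 15:55:41 sdbox2 kernel: CPU features: detected: Hardware dirty bit management on CPU0-7
"""

# Firecracker guest log up to the crash (from the first comment).
guest_log = """\
[    0.007714] CPU features: detected: GIC system register CPU interface
[    0.008131] CPU features: detected: Privileged Access Never
[    0.008494] CPU features: detected: User Access Override
[    0.008840] CPU features: detected: 32-bit EL0 Support
"""

host = detected_features(host_log)
guest = detected_features(guest_log)

print("host only: ", sorted(host - guest))
print("guest only:", sorted(guest - host))
```

A feature missing from the guest list does not prove the vCPU lacks it: the guest crashes partway through feature detection, and the trace reports a 4.14.174+ kernel, which prints a different set of messages than the 6.8.0 host.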

sudanl0 (Contributor) commented Mar 18, 2024

Hi @jglathe,
Thank you for your query.
Sorry, but since we haven't tested this specific kernel/CPU configuration, we can't be of much help here.
One suggestion would be to try one of our tested kernel configurations, listed here, to see whether it works.
