Quick and dirty suspend feature #1493

Draft: wants to merge 1 commit into base: main


@fdr (Collaborator) commented Apr 23, 2024

The case of re-allocating when CPU and/or memory resources have since been allocated is left unsolved.

Also seen are grotesque copies of the bookkeeping for VmHost core and memory counts, as well as some tests I declined to copy and paste for "systemctl stop" idempotency.

@furkansahin (Contributor)

We can either exclude the suspending and suspended states from before_run and handle the destroy path in those labels, or we need to carefully transition the VMs back to a non-suspended state and then destroy them. Otherwise, the used_cores value will drift from reality.
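To make the drift concern concrete, here is a minimal sketch, not the project's actual code. It assumes that suspending a VM hands its cores back to the host, so a destroy path that subtracts the cores unconditionally would release them twice. `Host`, `Vm`, and the state names are hypothetical stand-ins; only `used_cores` comes from the comment above.

```ruby
# Hypothetical, simplified stand-ins; not the project's actual models.
Host = Struct.new(:total_cores, :used_cores)

class Vm
  attr_reader :state

  def initialize(host, cores)
    @host = host
    @cores = cores
    @state = :running
    @host.used_cores += cores
  end

  # Assumption: suspending returns the VM's cores to the host.
  def suspend
    return unless @state == :running
    @host.used_cores -= @cores
    @state = :suspended
  end

  # Release cores only if the VM still holds them; a suspended VM has
  # already given them back, so subtracting again would cause drift.
  def destroy
    return if @state == :destroyed
    @host.used_cores -= @cores unless @state == :suspended
    @state = :destroyed
  end
end

host = Host.new(8, 0)
vm = Vm.new(host, 2)
vm.suspend
vm.destroy
puts host.used_cores
```

The alternative mentioned above, transitioning back to a non-suspended state before destroying, would instead re-acquire the cores first and then release them once.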

hop_wait
end

nap 2**35
Contributor

Do we have to nap this long? It effectively means unsuspending will only happen manually. Since the operator will be the one incrementing the semaphore, I'm OK with this, but maybe we can reduce it?
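For a sense of scale, a quick back-of-the-envelope calculation, assuming the argument to nap is a duration in seconds (the snippet above does not say so explicitly):

```ruby
# Assumption: the nap argument is a duration in seconds.
NAP_SECONDS = 2**35
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60

nap_years = NAP_SECONDS / SECONDS_PER_YEAR
puts format("%d seconds is about %.0f years", NAP_SECONDS, nap_years)
```

Under that assumption the nap is on the order of a thousand years, so in practice the strand only moves again when something else wakes it.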

Collaborator Author

Well, strange thing: the last system routinely napped for a thousand years and was more event-driven. This system is actually a regression relative to its predecessor when it comes to reducing spurious polling-like traffic.

So it's odd that this shows up in only one place; we're kind of doing something wrong everywhere else we could fall asleep for a long time.

Member

Maybe we should schedule the strand when a semaphore is incremented. It would definitely push things more towards being event-driven and would encourage longer naps.

Contributor

I like the idea of being more event-driven, and incrementing semaphores could be one of these events. In this case, someone already needs to log into the console and increment the unsuspend semaphore. It's probably OK to then also run the strand so the unsuspend action happens immediately.
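The idea discussed in this thread could be sketched roughly as follows. This is an illustration under stated assumptions, not the project's implementation: the `Strand` class, its `schedule` field, and the `incr` and `due?` methods are hypothetical, and time is modeled as plain integer seconds to keep the example self-contained.

```ruby
# Hypothetical, simplified stand-in; not the project's actual Strand.
class Strand
  attr_reader :schedule

  def initialize(now)
    @schedule = now
    @semaphores = Hash.new(0)
  end

  # Push the next run far into the future.
  def nap(seconds, now)
    @schedule = now + seconds
  end

  # Incrementing a semaphore also pulls the schedule forward to now, so a
  # dispatcher that picks strands with schedule <= now runs it promptly.
  def incr(name, now)
    @semaphores[name] += 1
    @schedule = now if @schedule > now
  end

  def due?(now)
    @schedule <= now
  end
end

strand = Strand.new(0)
strand.nap(2**35, 0)
puts strand.due?(60)        # false: still napping
strand.incr(:unsuspend, 60) # operator increments the semaphore at t=60
puts strand.due?(60)        # true: the increment rescheduled the strand
```

With this shape, a long nap costs nothing in polling traffic, and the operator's semaphore increment is the event that wakes the strand.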


4 participants