-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
smp: allocate hugepages eagerly when kernel support is available #2146
Conversation
52a65cb
to
ff55fa7
Compare
@xemul please review |
Side note -- https://lwn.net/Articles/846501/ |
src/core/smp.cc
Outdated
auto aligned = addr & ~(page_size - 1); | ||
auto offset_in_hugepage = aligned & (*huge_page_size_opt - 1); | ||
auto first_page_offset = forwards ? 0 : *huge_page_size_opt - page_size; | ||
auto hugepage_addr = aligned - offset_in_hugepage; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just hugepage_addr = addr & ~(huge_page_size_opt - 1)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just hugepage_addr = addr & ~(huge_page_size_opt - 1) ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That just happened to be the way I wrote it. I'll reconsider, but more important is to use MAP_POPULATE_WRITE instead.
MADV_POPULATE_WRITE is probably better than MADV_COLLAPSE. We could just issue both.
|
Instead of deferring merging pages into hugepages to the transparent hugepage scanner, advise the kernel to do so immediately using the new MADV_POPULATE_WRITE and MADV_COLLAPSE advices. Refactor the prefaulter to attempt first to use MAP_POPULATE_WRITE to fault in a whole hugepage's worth of memory. This should fault the range as a hugepage but for good measure use MADV_COLLAPSE too (which would be a no-op if the work was done in MADV_POPULATE_WRITE).
ff55fa7
to
99d2465
Compare
v2: rewrite; use MADV_POPULATE_WRITE if available to have the kernel do the faulting. Add new helper populate_memory() to work on a hugepage's worth of memory |
static std::regex re(R"(Hugepagesize:\s*(\d+) kB)"); | ||
auto m = std::smatch{}; | ||
if (std::regex_search(meminfo, m, re)) { | ||
return std::stoi(m[1]) * size_t(1024); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it m[1] and not m[0] because of that extra outer parentheses in the regular expression?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So why is this not m[0]?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m[0] is the entire match, not a subexpression.
https://en.cppreference.com/w/cpp/regex/match_results/operator_at
populate_memory(batch_start, range.end); | ||
range.end = batch_start; | ||
|
||
if (range.start >= range.end) { | ||
ranges.erase(ranges.begin() + current_range); | ||
current_range = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is pre-existing code, but I still wonder - why do we have this current_range at all? It seems it's set to 0 initially, and then always remains 0 or set to 0 again. Where is it set to anything else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's incremented below.
Instead of deferring merging pages into hugepages to the transparent
hugepage scanner, advise the kernel to do so immediately using the new
MADV_POPULATE_WRITE and MADV_COLLAPSE advices.
Refactor the prefaulter to attempt first to use MAP_POPULATE_WRITE
to fault in a whole hugepage's worth of memory. This should fault
the range as a hugepage but for good measure use MADV_COLLAPSE too
(which would be a no-op if the work was done in MADV_POPULATE_WRITE).