New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Memory leak due to MemoryContext not being cleaned up #22328
Labels
2.20.4_blocker
2.20.5_blocker
2024.1.0_blocker
area/cdcsdk
CDC SDK
jira-originated
kind/bug
This issue is a bug
priority/highest
Highest priority issue
Comments
yugabyte-ci
added
area/cdcsdk
CDC SDK
jira-originated
kind/bug
This issue is a bug
priority/low
Low priority
status/awaiting-triage
Issue awaiting triage
and removed
status/awaiting-triage
Issue awaiting triage
labels
May 9, 2024
yugabyte-ci
added
priority/highest
Highest priority issue
2.20.4_blocker
2024.1.0_blocker
and removed
priority/low
Low priority
labels
May 9, 2024
suranjan
added a commit
that referenced
this issue
May 14, 2024
Summary: This diff ensures that the GetChanges RPC, upon its ScopeExit, will ensure deletion of its MemoryContext to prevent leaks caused by other RPCs setting their own MemoryContext without resetting it back to the old value. CDC uses Memory Context to allocate objects that are decoded from binary format. These memory contexts are thread-local objects. After each decode operation, this context is reset but not deleted to reduce allocation/deallocation. This implies that each thread processing the GetChanges RPC would have one memory context saved in its thread-local storage. This is fine as long as some other RPCs don't reset this thread-local object without either deleting it or resetting it back to the old value. It seems there are other RPCs that could get executed by the same thread and they allocate a new memory context and set it to the same thread-local, leading to this leak. As a good practice, each RPC/task that gets executed by a thread should do the following: ``` Get the old memory context Create and Set the new memory context Delete its memory context Reset the old on completion of its task ``` As a safety precaution, we should also look for other instances where we are not setting the old memory context back to the thread_local. Jira: DB-11236 Test Plan: Manual Test Reviewers: stiwary, amartsinchyk, sergei Reviewed By: amartsinchyk Subscribers: ybase, ycdcxcluster Differential Revision: https://phorge.dev.yugabyte.com/D34900
suranjan
added a commit
that referenced
this issue
May 15, 2024
…hanges call Summary: Original commit: e71246c / D34900 This diff ensures that the GetChanges RPC, upon its ScopeExit, will ensure deletion of its MemoryContext to prevent leaks caused by other RPCs setting their own MemoryContext without resetting it back to the old value. CDC uses Memory Context to allocate objects that are decoded from binary format. These memory contexts are thread-local objects. After each decode operation, this context is reset but not deleted to reduce allocation/deallocation. This implies that each thread processing the GetChanges RPC would have one memory context saved in its thread-local storage. This is fine as long as some other RPCs don't reset this thread-local object without either deleting it or resetting it back to the old value. It seems there are other RPCs that could get executed by the same thread and they allocate a new memory context and set it to the same thread-local, leading to this leak. As a good practice, each RPC/task that gets executed by a thread should do the following: ``` Get the old memory context Create and Set the new memory context Delete its memory context Reset the old on completion of its task ``` As a safety precaution, we should also look for other instances where we are not setting the old memory context back to the thread_local. Jira: DB-11236 Test Plan: Manual Test Reviewers: stiwary, amartsinchyk, sergei Reviewed By: amartsinchyk Subscribers: ycdcxcluster, ybase Tags: #jenkins-ready Differential Revision: https://phorge.dev.yugabyte.com/D35088
suranjan
added a commit
that referenced
this issue
May 15, 2024
…tChanges call Summary: Original commit: e71246c / D34900 This diff ensures that the GetChanges RPC, upon its ScopeExit, will ensure deletion of its MemoryContext to prevent leaks caused by other RPCs setting their own MemoryContext without resetting it back to the old value. CDC uses Memory Context to allocate objects that are decoded from binary format. These memory contexts are thread-local objects. After each decode operation, this context is reset but not deleted to reduce allocation/deallocation. This implies that each thread processing the GetChanges RPC would have one memory context saved in its thread-local storage. This is fine as long as some other RPCs don't reset this thread-local object without either deleting it or resetting it back to the old value. It seems there are other RPCs that could get executed by the same thread and they allocate a new memory context and set it to the same thread-local, leading to this leak. As a good practice, each RPC/task that gets executed by a thread should do the following: ``` Get the old memory context Create and Set the new memory context Delete its memory context Reset the old on completion of its task ``` As a safety precaution, we should also look for other instances where we are not setting the old memory context back to the thread_local. Jira: DB-11236 Test Plan: Manual Test Reviewers: stiwary, amartsinchyk, sergei Reviewed By: amartsinchyk Subscribers: ybase, ycdcxcluster Tags: #jenkins-ready Differential Revision: https://phorge.dev.yugabyte.com/D35087
suranjan
added a commit
that referenced
this issue
May 15, 2024
…GetChanges call Summary: Original commit: e71246c / D34900 This diff ensures that the GetChanges RPC, upon its ScopeExit, will ensure deletion of its MemoryContext to prevent leaks caused by other RPCs setting their own MemoryContext without resetting it back to the old value. CDC uses Memory Context to allocate objects that are decoded from binary format. These memory contexts are thread-local objects. After each decode operation, this context is reset but not deleted to reduce allocation/deallocation. This implies that each thread processing the GetChanges RPC would have one memory context saved in its thread-local storage. This is fine as long as some other RPCs don't reset this thread-local object without either deleting it or resetting it back to the old value. It seems there are other RPCs that could get executed by the same thread and they allocate a new memory context and set it to the same thread-local, leading to this leak. As a good practice, each RPC/task that gets executed by a thread should do the following: ``` Get the old memory context Create and Set the new memory context Delete its memory context Reset the old on completion of its task ``` As a safety precaution, we should also look for other instances where we are not setting the old memory context back to the thread_local. Jira: DB-11236 Test Plan: Manual Test Reviewers: stiwary, amartsinchyk, sergei Reviewed By: amartsinchyk Subscribers: ycdcxcluster, ybase Tags: #jenkins-ready Differential Revision: https://phorge.dev.yugabyte.com/D35086
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
2.20.4_blocker
2.20.5_blocker
2024.1.0_blocker
area/cdcsdk
CDC SDK
jira-originated
kind/bug
This issue is a bug
priority/highest
Highest priority issue
Jira Link: DB-11236
The text was updated successfully, but these errors were encountered: