Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memory leak due to MemoryContext not being cleaned up #22328

Closed
yugabyte-ci opened this issue May 9, 2024 · 0 comments
Closed

Memory leak due to MemoryContext not being cleaned up #22328

yugabyte-ci opened this issue May 9, 2024 · 0 comments

Comments

@yugabyte-ci
Copy link
Contributor

yugabyte-ci commented May 9, 2024

Jira Link: DB-11236

@yugabyte-ci yugabyte-ci added area/cdcsdk CDC SDK jira-originated kind/bug This issue is a bug priority/low Low priority status/awaiting-triage Issue awaiting triage and removed status/awaiting-triage Issue awaiting triage labels May 9, 2024
@yugabyte-ci yugabyte-ci added priority/highest Highest priority issue 2.20.4_blocker 2024.1.0_blocker and removed priority/low Low priority labels May 9, 2024
suranjan added a commit that referenced this issue May 14, 2024
Summary:
This diff ensures that the GetChanges RPC, upon its ScopeExit, will ensure deletion of its MemoryContext to prevent leaks caused by other RPCs setting their own MemoryContext without resetting it back to the old value.

CDC uses Memory Context to allocate objects that are decoded from binary format. These memory contexts are thread-local objects. After each decode operation, this context is reset but not deleted to reduce allocation/deallocation.

This implies that each thread processing the GetChanges RPC would have one memory context saved in its thread-local storage. This is fine as long as some other RPCs don't reset this thread-local object without either deleting it or resetting it back to the old value. It seems there are other RPCs that could get executed by the same thread and they allocate a new memory context and set it to the same thread-local, leading to this leak.

As a good practice, each RPC/task that gets executed by a thread should do the following:

```
Get the old memory context
Create and Set the new memory context
Delete its memory context
Reset the old on completion of its task
```

As a safety precaution, we should also look for other instances where we are not setting the old memory context back to the thread_local.
Jira: DB-11236

Test Plan: Manual Test

Reviewers: stiwary, amartsinchyk, sergei

Reviewed By: amartsinchyk

Subscribers: ybase, ycdcxcluster

Differential Revision: https://phorge.dev.yugabyte.com/D34900
suranjan added a commit that referenced this issue May 15, 2024
…hanges call

Summary:
Original commit: e71246c / D34900
This diff ensures that the GetChanges RPC, upon its ScopeExit, will ensure deletion of its MemoryContext to prevent leaks caused by other RPCs setting their own MemoryContext without resetting it back to the old value.

CDC uses Memory Context to allocate objects that are decoded from binary format. These memory contexts are thread-local objects. After each decode operation, this context is reset but not deleted to reduce allocation/deallocation.

This implies that each thread processing the GetChanges RPC would have one memory context saved in its thread-local storage. This is fine as long as some other RPCs don't reset this thread-local object without either deleting it or resetting it back to the old value. It seems there are other RPCs that could get executed by the same thread and they allocate a new memory context and set it to the same thread-local, leading to this leak.

As a good practice, each RPC/task that gets executed by a thread should do the following:

```
Get the old memory context
Create and Set the new memory context
Delete its memory context
Reset the old on completion of its task
```

As a safety precaution, we should also look for other instances where we are not setting the old memory context back to the thread_local.
Jira: DB-11236

Test Plan: Manual Test

Reviewers: stiwary, amartsinchyk, sergei

Reviewed By: amartsinchyk

Subscribers: ycdcxcluster, ybase

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D35088
suranjan added a commit that referenced this issue May 15, 2024
…tChanges call

Summary:
Original commit: e71246c / D34900
This diff ensures that the GetChanges RPC, upon its ScopeExit, will ensure deletion of its MemoryContext to prevent leaks caused by other RPCs setting their own MemoryContext without resetting it back to the old value.

CDC uses Memory Context to allocate objects that are decoded from binary format. These memory contexts are thread-local objects. After each decode operation, this context is reset but not deleted to reduce allocation/deallocation.

This implies that each thread processing the GetChanges RPC would have one memory context saved in its thread-local storage. This is fine as long as some other RPCs don't reset this thread-local object without either deleting it or resetting it back to the old value. It seems there are other RPCs that could get executed by the same thread and they allocate a new memory context and set it to the same thread-local, leading to this leak.

As a good practice, each RPC/task that gets executed by a thread should do the following:

```
Get the old memory context
Create and Set the new memory context
Delete its memory context
Reset the old on completion of its task
```

As a safety precaution, we should also look for other instances where we are not setting the old memory context back to the thread_local.
Jira: DB-11236

Test Plan: Manual Test

Reviewers: stiwary, amartsinchyk, sergei

Reviewed By: amartsinchyk

Subscribers: ybase, ycdcxcluster

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D35087
@yugabyte-ci yugabyte-ci reopened this May 15, 2024
suranjan added a commit that referenced this issue May 15, 2024
…GetChanges call

Summary:
Original commit: e71246c / D34900
This diff ensures that the GetChanges RPC, upon its ScopeExit, will ensure deletion of its MemoryContext to prevent leaks caused by other RPCs setting their own MemoryContext without resetting it back to the old value.

CDC uses Memory Context to allocate objects that are decoded from binary format. These memory contexts are thread-local objects. After each decode operation, this context is reset but not deleted to reduce allocation/deallocation.

This implies that each thread processing the GetChanges RPC would have one memory context saved in its thread-local storage. This is fine as long as some other RPCs don't reset this thread-local object without either deleting it or resetting it back to the old value. It seems there are other RPCs that could get executed by the same thread and they allocate a new memory context and set it to the same thread-local, leading to this leak.

As a good practice, each RPC/task that gets executed by a thread should do the following:

```
Get the old memory context
Create and Set the new memory context
Delete its memory context
Reset the old on completion of its task
```

As a safety precaution, we should also look for other instances where we are not setting the old memory context back to the thread_local.
Jira: DB-11236

Test Plan: Manual Test

Reviewers: stiwary, amartsinchyk, sergei

Reviewed By: amartsinchyk

Subscribers: ycdcxcluster, ybase

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D35086
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants