Memory Leak Issue #1544

Open
sanwark11 opened this issue Aug 16, 2023 · 1 comment

Comments


sanwark11 commented Aug 16, 2023

Issue Summary:
Hello everyone,

I'm running into a memory leak while serving a document-classification model with Flask, Gunicorn, and the SimpleTransformers library. The model, trained to identify resumes, works well initially, but the application's memory consumption rises gradually as requests come in; after roughly 100 requests, memory usage stays elevated and never drops back down.
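
For context, the request path looks roughly like the sketch below. The route name, app structure, and model path are illustrative rather than the exact production code; the exact ClassificationModel arguments are in "Code I am using" further down.

# Simplified sketch of the Flask app served by Gunicorn (names are illustrative).
from flask import Flask, request, jsonify
from simpletransformers.classification import ClassificationModel

app = Flask(__name__)

# The model is loaded once at import time and reused for every request.
trained_model = ClassificationModel(
    "roberta", "path/to/model", num_labels=2, use_cuda=False
)

@app.route("/classify", methods=["POST"])
def classify():
    text = request.get_json()["text"]
    prediction, raw_outputs = trained_model.predict([text])
    return jsonify({"is_resume": int(prediction[0])})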

Problem Details:
Upon investigating with a memory profiler, I identified a few lines in the classification_model.py file of the SimpleTransformers library that cause significant memory increments. The key lines and their respective increments are listed below (a sketch of how such a line-by-line profile can be collected follows the list):

Line 2199: A memory spike of 11.5 MiB occurs during outputs = self._calculate_loss(model, inputs, loss_fct=self.loss_fct).
Line 2181: An increase of 0.5 MiB is observed during the loop for i, batch in enumerate(tqdm(eval_dataloader, disable=args.silent)).
Line 2182: There's a memory bump of 0.4 MiB when executing model.eval().
Line 2089: Memory usage grows by 0.9 MiB during eval_dataset = self.load_and_cache_examples(...).
Line 2049: An additional 0.4 MiB is used while running self._move_model_to_device().
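
For reference, one way to obtain per-line increments inside classification_model.py is to wrap the library method with the memory_profiler package's profile decorator, as in this sketch (illustrative: the model path and sample text are placeholders, and this is not necessarily the exact profiling script I used):

# Sketch: line-by-line memory profile of the SimpleTransformers predict path.
from memory_profiler import profile
from simpletransformers.classification import ClassificationModel

# Wrap the library method so its own source lines are profiled.
ClassificationModel.predict = profile(ClassificationModel.predict)

model = ClassificationModel("roberta", "path/to/model", num_labels=2, use_cuda=False)

for _ in range(5):
    # Each call prints a per-line "Mem usage" / "Increment" table,
    # which is where the figures above come from.
    model.predict(["sample resume text"])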

Attempted Solutions:
To tackle this, I disabled multiprocessing, which had been suggested as a remedy. This reduced the leak but didn't fully resolve it: memory consumption still creeps up after a certain number of requests.

Code I am using:

from simpletransformers.classification import ClassificationModel

# Multiprocessing disabled to limit worker-related memory growth
args = {
    "use_multiprocessing": False,
    "use_multiprocessing_for_evaluation": False,
    "process_count": 1,
}

# Load the trained RoBERTa classifier once (CPU-only)
trained_model = ClassificationModel(
    "roberta",
    model_path,  # path to the trained model directory
    num_labels=2,
    use_cuda=False,
    args=args,
)

# Run inference on a single document
prediction, raw_outputs = trained_model.predict([text])
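
To quantify the per-request growth, the process RSS can be watched across repeated predict() calls with a loop like this sketch (psutil-based; the iteration count and reporting interval are arbitrary, and it reuses trained_model and text from the code above):

# Sketch: tracking resident memory across repeated predictions with psutil.
import os
import psutil

process = psutil.Process(os.getpid())

def rss_mib():
    return process.memory_info().rss / (1024 * 1024)

print(f"baseline RSS: {rss_mib():.1f} MiB")
for i in range(1, 101):
    trained_model.predict([text])
    if i % 10 == 0:
        # This is where I see RSS climb and stay elevated after ~100 calls.
        print(f"after {i} requests: {rss_mib():.1f} MiB")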

Environment Details:

OS: Ubuntu 20
System: 16GB RAM, 8-core CPU
Libraries:
simpletransformers==0.63.9
transformers==4.21.3
torch==1.13.1 (CPU)

Note:
Our testing environment is a VM instance, while production runs on Kubernetes; the memory problem occurs in both setups.
I'm keen to gather insights and solutions from the community to address this memory issue. Your input and suggestions would be immensely helpful.

sanwark11 (Author) commented

Please suggest a solution; I am facing this issue in production.
