
RoBERTa model prediction taking infinite time with no error, but multiprocessing disabled warning #1558

Open
etqadkhan opened this issue Nov 24, 2023 · 0 comments

Describe the bug
After loading the ClassificationModel() class for an XLM-RoBERTa model that I fine-tuned on a custom dataset, calling model.predict() on the text data never returns a result and keeps running indefinitely, even for a small dataset (as few as 20 data points). It appears to be a multiprocessing problem, because loading the model prints the warning:
UserWarning: use_multiprocessing automatically disabled as xlmroberta fails when using multiprocessing for feature conversion.

To Reproduce

from simpletransformers.classification import ClassificationModel
import pandas as pd

# Load the fine-tuned model from its output directory
model_name = "xlmroberta"
model_destination = "model_directory"
model = ClassificationModel(model_name, model_destination)

# Predict on the first 100 text samples
df = pd.read_pickle(r'pickle_file.pkl')
predictions, raw_outputs = model.predict(df.text[0:100].to_list())
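
For reference, here is a minimal sketch of the same call with multiprocessing disabled explicitly through ClassificationArgs, in case the automatic disabling mentioned in the warning is not taking effect. Setting these arguments is my own assumption about a possible workaround, not something the library documents as a fix:

from simpletransformers.classification import ClassificationModel, ClassificationArgs
import pandas as pd

# Turn off multiprocessing for feature conversion explicitly,
# mirroring what the warning says happens automatically for xlmroberta.
model_args = ClassificationArgs()
model_args.use_multiprocessing = False
model_args.use_multiprocessing_for_evaluation = False

model = ClassificationModel("xlmroberta", "model_directory", args=model_args)

df = pd.read_pickle(r'pickle_file.pkl')
predictions, raw_outputs = model.predict(df.text[0:100].to_list())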

Expected behavior
The model should return predictions for these data points in a short amount of time rather than running indefinitely. The problem did not occur with fewer data points, but as soon as I tried 100 data points (see the screenshots), the call never finished. The individual sentences are not long, with a mean length of 141 characters per sentence.
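
To make the scaling behaviour easier to pin down, a small timing loop like the one below can show roughly where the hang begins; the slice sizes are illustrative assumptions, and the loop reuses model and df from the snippet above:

import time

# Time predictions on progressively larger slices to see where the hang starts.
for n in [10, 20, 50, 100]:
    start = time.time()
    predictions, raw_outputs = model.predict(df.text[0:n].to_list())
    print(f"{n} samples took {time.time() - start:.1f}s")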

Screenshots

  • error_for_more_data (predict never returns for 100 data points)
  • ran_for_less_data (predict completes for a smaller number of data points)
Desktop (please complete the following information):

  • OS - Debian 11
  • Environment - PyTorch 2.0 (with Intel MKL-DNN/MKL) [Vertex AI Workbench env]
  • GPU - NVIDIA T4
  • Machine Type - 8 vCPUs, 52 GB RAM
  • simpletransformers==0.64.3
  • transformers==4.35.2
  • torch==2.0.0+cu118
  • CUDA version - 11.8