
Encoding using multiple-GPUs #541

Open
DenuwanClouda opened this issue Sep 1, 2023 · 6 comments
Comments

@DenuwanClouda

I am working on a task where I recommend products based on their reviews. I am using your library, and when creating the index I would like to know whether there is a way to utilize several GPUs, because encoding the review text takes a long time and I have a multi-GPU environment.
The task also requires explanations, so I would also like to know whether the same can be done when I execute the explain method.

Thank you.

@davidmezzetti
Member

This has been brought up from time to time and is a good reminder that better integrated support should be added.

One less-than-pretty workaround that has been used in the past:

import torch

embeddings.model.model.model = torch.nn.DataParallel(embeddings.model.model.model)

Might be worth trying that to see if it improves performance at all. Below are a couple more references to investigate.

References:

Let's keep this open and I'll take a look at better multi-GPU support in an upcoming release. Ultimately, the best solution given Python's GIL is to spawn a pool of encoding processes, one per GPU.
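To make the "pool of encoding processes, one per GPU" idea concrete, here is a minimal sketch that has nothing to do with txtai internals: it splits the corpus into one contiguous slice per device and encodes each slice in its own process. All names (`split`, `worker`, `encode_parallel`) are hypothetical, and the worker returns text lengths as dummy vectors where a real one would load the model on `f"cuda:{rank}"` and call `model.encode(texts)`.

```python
import multiprocessing as mp

def split(texts, n):
    # n roughly equal contiguous slices, one per worker
    k, r = divmod(len(texts), n)
    slices, start = [], 0
    for i in range(n):
        end = start + k + (1 if i < r else 0)
        slices.append(texts[start:end])
        start = end
    return slices

def worker(args):
    rank, texts = args
    # a real worker would load the model once on f"cuda:{rank}" and call
    # model.encode(texts); this dummy returns text lengths instead
    return [len(t) for t in texts]

def encode_parallel(texts, gpus=2):
    # one process per GPU, each handling one contiguous slice of the corpus
    with mp.Pool(gpus) as pool:
        parts = pool.map(worker, list(enumerate(split(texts, gpus))))
    # flatten the per-worker results back into corpus order
    return [vector for part in parts for vector in part]
```

Because each worker owns its own CUDA context, this sidesteps the GIL entirely; the cost is loading one model copy per process.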

@DenuwanClouda
Author

Thank you for the response

@DenuwanClouda
Author

@davidmezzetti

  • The suggested workaround did not drastically improve performance in my case.
  • For everyone's information: I am using the subindexes feature, so the suggested workaround had to be slightly changed in my case to
    embeddings.models['sentence-transformers/all-MiniLM-L6-v2'].model.model = torch.nn.DataParallel(embeddings.models['sentence-transformers/all-MiniLM-L6-v2'].model.model)
    Hope this is correct.
  • Since this method did not work, I tried using a separate function to take control of the encoding and speed it up with multiprocessing, as suggested in the link provided. See the image below.
[screenshots: external encoding function using multiprocessing]

But this seems to underutilise the 2 GPUs in the Kaggle free tier.

Any suggestions?

@davidmezzetti
Member

If you're going to use the multi-process pool, make sure to undo the torch DataParallel wrapper first.

This most likely needs a dedicated effort to optimize encoding for 2 GPUs. I only develop with 1 GPU so it's not a use case I've prioritized.
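Undoing the wrapper means restoring the module that DataParallel stores internally under `.module`. A minimal sketch with a toy model (in the workaround above, the real attribute to restore would be the one that was wrapped, e.g. embeddings.model.model.model):

```python
import torch.nn as nn

# nn.DataParallel keeps the wrapped model under .module, so undoing the
# wrapper is just a matter of restoring that attribute. nn.Linear is a toy
# stand-in for the real transformer model.
model = nn.Linear(4, 2)
wrapped = nn.DataParallel(model)

# restore the plain module if (and only if) it was wrapped
restored = wrapped.module if isinstance(wrapped, nn.DataParallel) else wrapped
```

This matters because a DataParallel-wrapped model cannot be cleanly pickled into the worker processes that the multi-process pool spawns.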

@DenuwanClouda
Author

DenuwanClouda commented Sep 15, 2023

I was able to significantly speed up the process using this external encoding handling feature. An encoding which took around 1 hr in a single-GPU environment was reduced to 18 mins. But it would be nice if this were a built-in feature of your library.
The external function used is as below:

# start the pool outside the function so it is created only once
pool = model.start_multi_process_pool()

def transform(data):
    # compute the embeddings using the multi-process pool
    return model.encode_multi_process(data, pool, batch_size=64)
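For anyone adapting this, one way to keep the pool lifecycle tidy is to wrap it in a small callable. This is just a sketch assuming the sentence-transformers pool API (`start_multi_process_pool` / `encode_multi_process` / `stop_multi_process_pool`); `PooledEncoder` is a hypothetical name, not part of txtai.

```python
class PooledEncoder:
    """Callable transform that lazily starts a multi-process encoding pool
    on first use and reuses it for every subsequent call."""

    def __init__(self, model, batch_size=64):
        self.model = model
        self.batch_size = batch_size
        self.pool = None

    def __call__(self, data):
        # lazily start one worker process per GPU on first use
        if self.pool is None:
            self.pool = self.model.start_multi_process_pool()
        return self.model.encode_multi_process(data, self.pool, batch_size=self.batch_size)

    def close(self):
        # shut the worker processes down cleanly when encoding is finished
        if self.pool is not None:
            self.model.stop_multi_process_pool(self.pool)
            self.pool = None
```

An instance can then be passed wherever the plain `transform` function was used, with `close()` called once indexing is done.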

@davidmezzetti
Member

Glad to hear it! Nice to see you were able to use an external transform function to solve this.

Will keep this open to add a similar method directly to txtai.
