Inference using TensorRT #81

Open
mon28 opened this issue Aug 24, 2023 · 1 comment

Comments

mon28 commented Aug 24, 2023

Hi,

I have been exploring models that I can fine-tune on my own data to produce embeddings for pairwise similarity calculation.
My data looks like: [title][space][url]. I do not have domain-specific information. I have two questions:

  1. What would the instruction look like in this case for training and inference scenarios?
  2. I wish to deploy this in production and use TensorRT for inference. Could you help me with an example of how that would work out?

Thanks,
Mon

@hongjin-su
Collaborator

Hi, thanks a lot for your interest in INSTRUCTOR!

  1. Section 2.3 of our paper provides the template for writing instructions.
  2. There are several good tutorials that talk about TensorRT:
    https://developer.nvidia.com/tensorrt-getting-started
    https://medium.com/ching-i/tensorrt-%E4%BB%8B%E7%B4%B9%E8%88%87%E5%AE%89%E8%A3%9D%E6%95%99%E5%AD%B8-45e44f73b25e
    In general, techniques that apply to transformer models should also be applicable to the INSTRUCTOR model, as its architecture is very similar to the T5 encoder.
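
For the first point, here is a minimal sketch of how the template could be filled in for `[title][space][url]` data. It assumes the template reads roughly "Represent the [domain] [text type] for [task objective]:" with the domain left out, since none is available. The helper names (`build_instruction`, `to_instructor_inputs`), the chosen wording of the text type and task objective, and the sample records are all hypothetical, not something prescribed by the paper. Because pairwise similarity is symmetric, the sketch also assumes the same instruction is used for both sides of a pair and for both training and inference.

```python
def build_instruction(text_type, task_objective, domain=""):
    """Fill the assumed pattern 'Represent the [domain] [text type] for [task objective]:'."""
    parts = ["Represent the"]
    if domain:  # the domain slot is skipped when no domain information exists
        parts.append(domain)
    parts.extend([text_type, "for", task_objective])
    return " ".join(parts) + ":"


def to_instructor_inputs(records, instruction):
    """Pair every record with the instruction, giving [instruction, text] lists."""
    return [[instruction, record] for record in records]


# Hypothetical wording; adjust the text type and objective to your data.
instruction = build_instruction(
    text_type="webpage title and URL",
    task_objective="measuring pairwise similarity",
)

# Hypothetical records in the [title][space][url] layout.
records = [
    "Getting started with embeddings https://example.com/embeddings",
    "Introduction to text embeddings https://example.org/intro",
]

pairs = to_instructor_inputs(records, instruction)
print(pairs[0])

# With the InstructorEmbedding package installed, the pairs could then be encoded:
# from InstructorEmbedding import INSTRUCTOR
# embeddings = INSTRUCTOR("hkunlp/instructor-large").encode(pairs)
```

Keeping the instruction string identical between fine-tuning and serving matters more than its exact wording, since the model conditions its embeddings on that text.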

Hope this helps!
