
How to perform inference on large datasets? #17

Open
abdulfatir opened this issue Mar 18, 2024 · 0 comments
Labels
FAQ Frequently asked question

Comments

@abdulfatir
Contributor

abdulfatir commented Mar 18, 2024

Opening this as a FAQ.

The `pipeline.predict` interface accepts either a 1D/2D tensor or a list of tensors. If you want to perform inference on a large dataset, you can do either of the following:

  • Send batches of shape `[batch_size, context_length]` to the `predict` function in a loop over batches of your dataset. Note: if the time series don't all have the same length, you would need to pad them with `torch.nan` on the left.
  • (Easier) Send lists of tensors of length `batch_size` to the `predict` function in a loop over batches of your dataset. No padding is needed here; it is done internally.

If you run out of memory (OOM), decrease the `batch_size`.
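As a minimal sketch of the two options above (the pipeline construction and model name are illustrative assumptions; only `pipeline.predict` and its accepted inputs come from this issue):

```python
import torch


def left_pad(batch, context_length=None):
    """Left-pad a list of 1D tensors with torch.nan to a common length.

    This is only needed for the first option (a [batch_size, context_length]
    tensor); the list-of-tensors option pads internally.
    """
    if context_length is None:
        context_length = max(t.shape[-1] for t in batch)
    padded = torch.full((len(batch), context_length), torch.nan)
    for i, series in enumerate(batch):
        # Keep at most the last `context_length` observations,
        # placed at the right edge so the nan padding sits on the left.
        padded[i, -min(series.shape[-1], context_length):] = series[-context_length:]
    return padded


def batched(items, batch_size):
    """Yield successive slices of `items` of length at most `batch_size`."""
    for i in range(0, len(items), batch_size):
        yield items[i : i + batch_size]


# Hypothetical usage sketch -- model name and ChronosPipeline import are
# assumptions, not taken from this issue:
#
# from chronos import ChronosPipeline
# pipeline = ChronosPipeline.from_pretrained("amazon/chronos-t5-small")
#
# forecasts = []
# for batch in batched(all_series, batch_size=32):
#     # Option 2 (easier): pass the list of tensors directly.
#     forecasts.append(pipeline.predict(batch))
#     # Option 1: forecasts.append(pipeline.predict(left_pad(batch)))
```

If you hit OOM, shrinking `batch_size` in the loop above is the only change needed; the padding logic is unaffected.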

@abdulfatir abdulfatir added the FAQ Frequently asked question label Mar 18, 2024
@abdulfatir abdulfatir changed the title How to perform inference for large datasets? How to perform inference on large datasets? Mar 18, 2024
@lostella lostella pinned this issue Mar 18, 2024
@abdulfatir abdulfatir unpinned this issue Mar 26, 2024