add parallel sampling using vllm #409

Draft: daspartho wants to merge 2 commits into main
Conversation

@daspartho (Collaborator) commented Nov 29, 2023

Closes #370.

Adds support for parallel sampling using the vllm library when num_return_sequences in the generation kwargs is greater than 1 and the model is supported by vLLM (currently, all HF models in llm-vm are). A sketch of the vLLM call path is below.

TODO: handle dependencies
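For context, a minimal sketch of the kind of vLLM call this routes to (the model name, prompt, and sampling values are illustrative, not the PR's actual code):

```python
from vllm import LLM, SamplingParams

# Illustrative model name; any HF model supported by vLLM would work here.
llm = LLM(model="facebook/opt-125m")

# n > 1 asks vLLM to sample several completions for the same prompt in
# parallel; this is the case the PR routes through vLLM instead of the
# ordinary HF generate path.
sampling_params = SamplingParams(n=4, temperature=0.8, max_tokens=64)

outputs = llm.generate(["Hello, my name is"], sampling_params)
for request_output in outputs:
    for completion in request_output.outputs:
        print(completion.text)
```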

@VictorOdede (Collaborator) left a comment


Let's make vllm_support a property of the BaseOnsiteLLM class. We can set it to true by default; if a model isn't supported by vLLM, that model can set the property to false.
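A minimal sketch of that shape (UnsupportedModelLLM is a hypothetical name; only BaseOnsiteLLM and vllm_support come from the discussion):

```python
class BaseOnsiteLLM:
    # Default assumption: the model can be served by vLLM.
    vllm_support = True


class UnsupportedModelLLM(BaseOnsiteLLM):
    # Hypothetical subclass for a model vLLM cannot serve;
    # it opts out by overriding the class attribute.
    vllm_support = False
```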

@daspartho (Collaborator, Author) commented:

Made the suggested changes: vllm_support is set to true by default and needs to be set to false explicitly for unsupported models.
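A rough sketch of how the resulting dispatch could look (_generate_with_vllm and _generate_hf are hypothetical helper names, not the PR's actual methods):

```python
def generate(self, prompt, **generation_kwargs):
    n = generation_kwargs.get("num_return_sequences", 1)
    if n > 1 and self.vllm_support:
        # Multi-sample requests go through vLLM's parallel sampling.
        return self._generate_with_vllm(prompt, n, **generation_kwargs)
    # Everything else keeps using the regular HF generation path.
    return self._generate_hf(prompt, **generation_kwargs)
```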

Successfully merging this pull request may close issue: parallel sampling with vLLM (#370).