Integrate Infinity Framework for Enhanced Embedding Inference Speed #1270

michaelfeil · 2024-02-17T06:54:59Z

🚀 The feature

I propose integrating the Infinity framework into embedchain to significantly speed up embedding inference. Infinity is a pure-Python framework designed to make embedding computation more efficient.

Motivation, pitch

Infinity uses techniques such as dynamic batching, FlashAttention-2, faster, parallel tokenization, torch.compile, and fp16 precision. Integrating it should substantially improve embedding inference speed and efficiency.
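
To illustrate one ingredient of the batching idea, here is a minimal sketch of length-sorted batching. It is not Infinity's implementation and does not use its API: `embed_in_batches`, `toy_embed_batch`, and `max_batch_size` are hypothetical names, and the toy function stands in for a real model forward pass. Real dynamic batching also groups requests as they arrive at runtime, which this sketch leaves out.

```python
from typing import Callable, List, Sequence


def embed_in_batches(
    texts: Sequence[str],
    embed_batch: Callable[[List[str]], List[List[float]]],
    max_batch_size: int = 4,
) -> List[List[float]]:
    """Embed texts in length-sorted batches and return vectors in input order."""
    # Sort indices by text length so each batch holds inputs of similar length,
    # which reduces padding in a real tokenizer/model pipeline.
    order = sorted(range(len(texts)), key=lambda i: len(texts[i]))
    results: List[List[float]] = [[] for _ in texts]
    for start in range(0, len(order), max_batch_size):
        idx = order[start:start + max_batch_size]
        vectors = embed_batch([texts[i] for i in idx])
        # Write each vector back to the position of its original text.
        for i, vec in zip(idx, vectors):
            results[i] = vec
    return results


def toy_embed_batch(batch: List[str]) -> List[List[float]]:
    """Hypothetical stand-in for a model call: one 1-d 'embedding' per text."""
    return [[float(len(text))] for text in batch]


if __name__ == "__main__":
    docs = ["aaaa", "a", "aaa", "aa", "aaaaa"]
    print(embed_in_batches(docs, toy_embed_batch, max_batch_size=2))
```

The caller gets results back in the original input order even though the batches were formed by length, so a function like this could sit behind an existing embedder interface without changing callers.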

langchain-ai/langchain#17671
langchain-ai/langchain#13928

