🚀 The feature
I propose integrating the Infinity framework into embedchain to significantly speed up embedding inference. Infinity is a pure Python framework designed to make embedding computation more efficient.
Motivation, pitch
Infinity uses techniques such as dynamic batching, FlashAttention-2, faster parallel tokenization, `torch.compile`, and fp16 precision. Integrating it should substantially improve embedding inference speed and efficiency.
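To make "dynamic batching" concrete, here is a minimal sketch of the general idea: concurrent requests are collected for a short window and embedded in a single call. This is not Infinity's implementation or API. `DynamicBatcher`, `toy_embed`, `max_batch_size`, and `max_wait_s` are hypothetical names chosen for illustration, and `toy_embed` stands in for a real model.

```python
import asyncio
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def toy_embed(texts: Sequence[str]) -> List[Vector]:
    # Hypothetical stand-in for a real embedding model: one vector per text.
    return [[float(len(t)), float(sum(map(ord, t)) % 97)] for t in texts]


class DynamicBatcher:
    """Groups concurrent single-text requests into batched embed calls."""

    def __init__(
        self,
        embed_fn: Callable[[Sequence[str]], List[Vector]],
        max_batch_size: int = 8,
        max_wait_s: float = 0.01,
    ) -> None:
        self.embed_fn = embed_fn
        self.max_batch_size = max_batch_size
        self.max_wait_s = max_wait_s
        self.queue: asyncio.Queue = asyncio.Queue()
        self.batch_sizes: List[int] = []  # recorded for inspection only

    async def embed(self, text: str) -> Vector:
        # Enqueue the request and wait until the worker resolves its future.
        fut = asyncio.get_running_loop().create_future()
        await self.queue.put((text, fut))
        return await fut

    async def run(self) -> None:
        loop = asyncio.get_running_loop()
        while True:
            # Block for the first request, then collect more until the
            # batch is full or the wait window has elapsed.
            batch = [await self.queue.get()]
            deadline = loop.time() + self.max_wait_s
            while len(batch) < self.max_batch_size:
                timeout = deadline - loop.time()
                if timeout <= 0:
                    break
                try:
                    batch.append(await asyncio.wait_for(self.queue.get(), timeout))
                except asyncio.TimeoutError:
                    break
            # One model call for the whole batch.
            vectors = self.embed_fn([text for text, _ in batch])
            self.batch_sizes.append(len(batch))
            for (_, fut), vec in zip(batch, vectors):
                if not fut.done():
                    fut.set_result(vec)


async def demo(texts: Sequence[str]) -> Tuple[List[Vector], List[int]]:
    # Build the batcher inside the running loop, start the worker,
    # and issue all requests concurrently.
    batcher = DynamicBatcher(toy_embed, max_batch_size=4)
    worker = asyncio.create_task(batcher.run())
    try:
        vectors = await asyncio.gather(*(batcher.embed(t) for t in texts))
    finally:
        worker.cancel()
    return list(vectors), batcher.batch_sizes


if __name__ == "__main__":
    vecs, sizes = asyncio.run(demo(["a", "bb", "ccc", "dddd", "eeeee"]))
    print(vecs)
    print(sizes)
```

The trade-off sits in the two knobs: a longer `max_wait_s` or a larger `max_batch_size` improves throughput per model call at the cost of added latency for the first request in each batch.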
langchain-ai/langchain#17671
langchain-ai/langchain#13928