
Please do not abandon this project! #126

Open
oobabooga opened this issue Nov 27, 2023 · 3 comments

Comments

@oobabooga

Earlier this year I was impressed with the offloading performance of FlexGen, and I wonder how it would compare with the performance currently provided by llama.cpp for Llama and Llama-2 models in a CPU offloading scenario.

Any chance Llama support could be added to FlexGen @Ying1123 @keroro824?
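For context, here is a minimal sketch of how the CPU side of such a comparison could be timed with the llama-cpp-python bindings (which text-generation-webui also uses). The model path, prompt, and parameters below are illustrative assumptions, not from this thread:

```python
import time
from llama_cpp import Llama  # pip install llama-cpp-python

# n_gpu_layers=0 keeps every layer in system RAM, i.e. a pure CPU run;
# raising it splits layers between GPU and CPU.
llm = Llama(model_path="./llama-2-7b.Q4_K_M.gguf",  # hypothetical path
            n_gpu_layers=0)

prompt = "Explain CPU offloading in one paragraph."
start = time.time()
out = llm(prompt, max_tokens=128)
elapsed = time.time() - start

# The completion dict is OpenAI-style, with token counts under "usage".
tokens = out["usage"]["completion_tokens"]
print(f"{tokens / elapsed:.2f} tokens/s")
```

Running the same prompt through FlexGen with a comparable weight-placement policy would give the other half of the comparison.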

@BinhangYuan
Collaborator

We are working on a refactoring of the current implementation to support most HF models. We will release it soon under a fork of this repo and will keep you informed.

@oobabooga
Author

oobabooga commented Nov 27, 2023

That's exciting news @BinhangYuan! I look forward to testing the new release and incorporating it into my text-generation-webui project. Cheers :)

@arnfaldur

Is there any news on this fork?
