[Feature] Adding support for Mixtral and Gemma models #1247

martinakaduc · 2024-03-08T09:28:17Z

This pull request adds support for Mixtral and Gemma as LLM backbones.

muhark · 2024-03-11T13:44:20Z

@martinakaduc I see some of my comments in here--did you figure out the necessary changes to the preprocess_gemma function? :)
FWIW yours looks right, I realize that I may have been running an old version of the chat template that didn't include the BOS token when I was developing.

martinakaduc · 2024-03-11T13:51:56Z

@muhark Yes, I referenced your code. I also figured out the problem with `preprocess_gemma`: the separator `<end_of_turn>\n` is tokenized into 2 tokens, not 1.

I have successfully trained my llava_gemma for Vietnamese! 😄 (https://huggingface.co/ura-hcmut/GemSUraV-7B)
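For readers following the `<end_of_turn>\n` point, here is a minimal sketch of why the separator's token count matters when per-round lengths are summed for label masking. It is not the code from this PR: `toy_tokenize`, `round_lengths`, and the sample conversation are hypothetical, and the toy tokenizer only assumes that each special marker and each newline is a single token. The real count should be checked against the actual Gemma tokenizer.

```python
import re

# Hypothetical stand-in for a real tokenizer (assumption): each special marker
# and each newline is one token; all other text splits on whitespace.
_TOKEN_RE = re.compile(r"<bos>|<start_of_turn>|<end_of_turn>|\n|[^\s<]+")

SEP = "<end_of_turn>\n"  # turn separator discussed in this thread


def toy_tokenize(text):
    """Return the list of toy tokens for `text`."""
    return _TOKEN_RE.findall(text)


def round_lengths(conversation, sep_token_count):
    """Split on SEP and return each round's token length, adding back
    `sep_token_count` tokens for the separator removed by the split."""
    rounds = [r for r in conversation.split(SEP) if r]
    return [len(toy_tokenize(r)) + sep_token_count for r in rounds]


conversation = (
    "<bos><start_of_turn>user\nhello" + SEP
    + "<start_of_turn>model\nhi there" + SEP
)
total = len(toy_tokenize(conversation))

# Counting the separator as 2 tokens makes the per-round sum match the total.
matches = sum(round_lengths(conversation, 2)) == total
# Counting it as 1 token leaves the running length short by one per round.
shortfall = total - sum(round_lengths(conversation, 1))
print(matches, shortfall)
```

If the running length and the total disagree, the masked span drifts by one position per round, so it is worth asserting that the two agree before training.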

ShawnAn-WHU · 2024-04-18T02:30:37Z

@martinakaduc Sorry to bother you. I see some warnings when pretraining Mixtral 8x7B, just like in #1417. Did you face the same problem? Thanks in advance!

martinakaduc added 5 commits March 6, 2024 19:02

Add Mixtral

5092fd8

Add Gemma

df93321

Fix for Gemma's preprocessing

1c4b878

Fix llava_gemma loading

f9aaa10

Fix bug for deployment

0e8f3a6

Fix deployment

02b89a6
