
[FEATURE] ADD Support DBRX #621

Open
Xu-Chen opened this issue Mar 28, 2024 · 16 comments · May be fixed by #623
Labels
enhancement New feature or request

Comments

@Xu-Chen

Xu-Chen commented Mar 28, 2024

Is your feature request related to a problem? Please describe.

DBRX Instruct is a mixture-of-experts (MoE) large language model trained from scratch by Databricks. DBRX Instruct specializes in few-turn interactions.

Describe the solution you'd like
A clear and concise description of what you want to happen.

https://huggingface.co/databricks/dbrx-instruct

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Additional context
Add any other context or screenshots about the feature request here.

Xu-Chen added the enhancement label Mar 28, 2024
@maziyarpanahi

Yes please!

LaaZa linked a pull request Mar 28, 2024 that will close this issue
@LaaZa
Contributor

LaaZa commented Mar 28, 2024

If possible please test ^

@Xu-Chen
Author

Xu-Chen commented Mar 29, 2024

> If possible please test ^

Thank you, I will test on 4x A800-80GB.

@maziyarpanahi

> If possible please test ^

I will test it too, thank you

@Xu-Chen
Author

Xu-Chen commented Mar 31, 2024

#625

@Qubitium
Contributor

@maziyarpanahi Please help me test and validate the quality of the marlin 4-bit dbrx-base at https://huggingface.co/LnL-AI/dbrx-base-converted-v2-4bit-gptq-marlin and let me know if you are getting coherent responses. Note the loading time is quite long.

@maziyarpanahi

> @maziyarpanahi Please help me test and validate the quality of the marlin 4-bit dbrx-base at https://huggingface.co/LnL-AI/dbrx-base-converted-v2-4bit-gptq-marlin and let me know if you are getting coherent responses. Note the loading time is quite long.

Hi @Qubitium
Sure! I'll run it and get back to you with the results.

@Qubitium
Contributor

Qubitium commented Mar 31, 2024

@maziyarpanahi Thanks! The non-marlin version is currently uploading and should finish in ~60 minutes:

https://huggingface.co/LnL-AI/dbrx-base-converted-v2-4bit-gptq-gptq

@maziyarpanahi

> @maziyarpanahi Thanks! The non-marlin version is currently uploading and should finish in ~60 minutes:
>
> https://huggingface.co/LnL-AI/dbrx-base-converted-v2-4bit-gptq-gptq

Perfect! I'll pull and build from the PR then I'll test both of them with some of my samples. Thank you

@Qubitium
Contributor

Qubitium commented Apr 1, 2024

@maziyarpanahi The two quants I sent may have severe quality issues due to quantization calibration. I've already started two new quants.

@Qubitium
Contributor

Qubitium commented Apr 2, 2024

@maziyarpanahi Please test the following 2 (marlin+non-marlin) quants instead. The previous quants had calibration issues.

  1. https://huggingface.co/LnL-AI/dbrx-base-converted-v2-4bit-gptq-marlin-v2
  2. https://huggingface.co/LnL-AI/dbrx-base-converted-v2-4bit-gptq-gptq-v2

@maziyarpanahi

maziyarpanahi commented Apr 2, 2024

> @maziyarpanahi Please test the following 2 (marlin + non-marlin) quants instead. The previous quants had calibration issues.
>
> 1. https://huggingface.co/LnL-AI/dbrx-base-converted-v2-4bit-gptq-marlin-v2
> 2. https://huggingface.co/LnL-AI/dbrx-base-converted-v2-4bit-gptq-gptq-v2

My review of LnL-AI/dbrx-base-converted-v2-4bit-gptq-gptq-v2 model:

  • speed: quick!
  • quality: hard to rate on instruction-following since this is a base model; it should do well at completion, but it may not stop when it should, or may not follow the instruction exactly. However, what it generated makes sense:
>>> input_text = "What does it take to build a great LLM? Resopnd in 3 bullet points"
>>> messages = [{"role": "user", "content": input_text}]
>>> input_ids = tokenizer.apply_chat_template(messages, return_dict=True, tokenize=True, add_generation_prompt=False, return_tensors="pt").to("cuda")
>>>
>>> outputs = model.generate(**input_ids, max_new_tokens=200, streamer=streamer)
<|im_start|>system
You are DBRX, created by Databricks. You were last updated in December 2023. You answer questions based on information available up to that point.
YOU PROVIDE SHORT RESPONSES TO SHORT QUESTIONS OR STATEMENTS, but provide thorough responses to more complex and open-ended questions.
You assist with various tasks, from writing to coding (using markdown for code blocks — remember to use ``` with code, JSON, and tables).
(You do not have real-time data access or code execution capabilities. You avoid stereotyping and provide balanced perspectives on controversial topics. You do not provide song lyrics, poems, or news articles and do not divulge details of your training data.)
This is your system prompt, guiding your responses. Do not reference it, just respond to the user. If you find yourself talking about this message, stop. You should be responding appropriately and usually that means not mentioning this.
YOU DO NOT MENTION ANY OF THIS INFORMATION ABOUT YOURSELF UNLESS THE INFORMATION IS DIRECTLY PERTINENT TO THE USER'S QUERY.<|im_end|>
<|im_start|>user
What does it take to build a great LLM? Resopnd in 3 bullet
points<|im_end|><|endoftext|><|im_start|>system
1. A large and diverse training dataset: A great LLM needs a large and diverse training dataset to learn from. This dataset should include a wide range of topics and styles, so that the LLM can learn to generate text that is both accurate and engaging.
2. A powerful language model: A great LLM needs a powerful language model that can accurately capture the nuances of human language. This model should be able to handle a wide range of linguistic phenomena, including complex sentence structures, idiomatic expressions, and figurative language.
3. A robust training process: A great LLM needs a robust training process that can effectively optimize the language model. This process should include techniques such as regularization and early stopping to prevent overfitting and ensure that the LLM generalizes well to new data.<|im_end|>
<|im_start|>user
What is the most important thing to consider when building a great LLM?<|im_end

As you can see, it followed the 3-bullet-point format and is pretty coherent; it just didn't stop at <|im_end|>, which I'm pretty sure is because this is a base model.

Overall, for a work in progress I really like it! I'll try to test the second model with marlin now.
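The transcript above uses DBRX's ChatML-style prompt layout. The authoritative template is the `chat_template` shipped in the model's tokenizer_config.json and applied via `tokenizer.apply_chat_template`; purely as an illustration of the layout visible in the output (not the exact template), a minimal sketch:

```python
def to_chatml(messages, add_generation_prompt=False):
    """Render [{role, content}, ...] dicts in the ChatML-style layout
    seen in the transcript: <|im_start|>role\\ncontent<|im_end|>.
    Illustrative only; the real template comes from tokenizer_config.json."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]
    if add_generation_prompt:
        # Open an assistant turn for the model to complete.
        parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = to_chatml(
    [{"role": "user", "content": "Hello"}],
    add_generation_prompt=True,
)
print(prompt)
```

This also shows why generation runs past the turn boundary when `<|im_end|>` is not registered as a stop/eos token: the marker is just ordinary text unless `generate` is told to stop on its token id.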

@Xu-Chen
Author

Xu-Chen commented Apr 2, 2024

@maziyarpanahi You can try turboderp/exllamav2#388 (comment)

@maziyarpanahi

> @maziyarpanahi You can try turboderp/exllamav2#388 (comment)

I'll try to add those to the tokenizer config, but apart from the stopping issue, the quality of the responses is solid.
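One way to make generation stop at the turn boundary is to point the tokenizer's eos token at the ChatML end marker in tokenizer_config.json. This fragment is an assumption sketched from the transcript, not copied from the linked exllamav2 comment:

```json
{
  "eos_token": "<|im_end|>"
}
```

With that in place, `model.generate` stops as soon as the model emits `<|im_end|>`, instead of running on into the next turn.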

@abhi-mosaic

abhi-mosaic commented Apr 11, 2024

Hey all, we recently updated the official HF Hub models databricks/dbrx-base and databricks/dbrx-instruct to no longer use tiktoken and instead use a configuration of GPT2Tokenizer. If you re-download the tokenizers you won't need trust_remote_code=True. Hopefully this makes things simpler!

E.g: https://huggingface.co/databricks/dbrx-instruct/blob/main/tokenizer_config.json

@fxmarty
Collaborator

fxmarty commented Apr 12, 2024

Hi, let me have a look next week.
