Issue with Fusing Models - Output is Bad #757

Open
GusLovesMath opened this issue May 6, 2024 · 2 comments
GusLovesMath commented May 6, 2024

Hi,

When I train my model using the command below (note: this was all done in a Jupyter notebook),

!python -m mlx_lm.lora \
    --model mlx-community/Meta-Llama-3-8B-Instruct-4bit \
    --train \
    --batch-size 1 \
    --lora-layers 1 \
    --iters 1000 \
    --data Data \
    --seed 0

and then load the model with the load function, passing the adapter path,

# Load the base model and apply the fine-tuned LoRA adapter weights
from mlx_lm import load

model_lora, tokenizer_lora = load(
    path_or_hf_repo="mlx-community/Meta-Llama-3-8B-Instruct-4bit",
    adapter_path="adapters"
)

I get nicely generated outputs that solve annoyingly verbose math problems, like so (I call my model LlaMATH):

[Screenshot: LlaMATH correctly solving a test math problem, May 6, 2024]

However, when I fuse the model with the command below and then load the fused model, either from Hugging Face or locally, the outputs are bad.

!python -m mlx_lm.fuse \
    --model mlx-community/Meta-Llama-3-8B-Instruct-4bit \
    --adapter-path adapters \
    --upload-repo MyRepo/LlaMATH-3-8B-Instruct-4bit \
    --hf-path mlx-community/Meta-Llama-3-8B-Instruct-4bit

Then, using the fused model:

# The fused model can also be loaded locally
from mlx_lm import load, generate

fused_model, fused_tokenizer = load("./lora_fused_model/")

# Run the fused model on a test question (prompt is defined earlier in the notebook)
response = generate(
    fused_model,
    fused_tokenizer,
    prompt=prompt,
    max_tokens=100,
    temp=0.0,
    verbose=False
)

This gives me nonsense. Any help would be great. I am using Python 3.11 and mlx_lm version 0.12.1. Thank you, I appreciate any advice or help! :D
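
Edit: in case it helps to reproduce, here is a minimal side-by-side sketch of the two load paths; the prompt string is just a placeholder for one of my test questions.

# Minimal repro sketch: run the same prompt through the adapter-loaded and fused models
from mlx_lm import load, generate

prompt = "..."  # placeholder for a test math problem

adapter_model, adapter_tokenizer = load(
    "mlx-community/Meta-Llama-3-8B-Instruct-4bit",
    adapter_path="adapters",
)
fused_model, fused_tokenizer = load("./lora_fused_model/")

for name, (model, tokenizer) in [
    ("adapter", (adapter_model, adapter_tokenizer)),
    ("fused", (fused_model, fused_tokenizer)),
]:
    response = generate(model, tokenizer, prompt=prompt, max_tokens=100, temp=0.0)
    print(f"--- {name} ---\n{response}\n")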


mzbac commented May 11, 2024

Try increasing the scale/alpha factor in LoRA; it may help.
https://github.com/ml-explore/mlx-examples/blob/main/llms/mlx_lm/tuner/lora.py#L88
https://github.com/ml-explore/mlx-examples/blob/main/llms/mlx_lm/tuner/lora.py#L57C31-L57C37
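
For context, that scale is the multiplier applied to the low-rank update in the lines linked above, so raising it amplifies the adapter's contribution. A rough illustrative sketch of where it enters both the adapter path and the fused weights (plain NumPy, not mlx_lm's actual code):

# Illustrative NumPy sketch of the LoRA "scale" factor (not mlx_lm's actual code)
import numpy as np

d, r = 16, 4
rng = np.random.default_rng(0)

W = rng.normal(size=(d, d))           # base (dequantized) weight
A = rng.normal(size=(r, d)) * 0.01    # lora_a
B = rng.normal(size=(d, r)) * 0.01    # lora_b
scale = 20.0                          # the factor multiplying the low-rank update

x = rng.normal(size=d)

y_adapter = W @ x + scale * (B @ (A @ x))   # what the adapter computes at runtime
W_fused = W + scale * (B @ A)               # what fusing bakes into the weights
y_fused = W_fused @ x

assert np.allclose(y_adapter, y_fused)

In exact arithmetic the two paths are identical; the scale just controls how strongly the learned update shows up in the fused weights.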

GusLovesMath (Author) commented:

I'll try this in a few days! Thank you!
