
Model output is different when using default optimize_model #10782

Open

vishnumadhu365 opened this issue Apr 17, 2024 · 1 comment

@vishnumadhu365
While testing ipex-llm, I observed a difference in the model output after calling optimize_model(), which defaulted to sym_int4 quantization (a sketch of selecting a different precision explicitly is included below).
Please help clarify the following:

  1. What is causing this variation in output?
  2. Does calling optimize_model() ensure that model accuracy stays the same across eval benchmarks like HumanEval, MMLU, etc.?

Thanks!
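For context: if the sym_int4 default is the concern, optimize_model appears to accept a low_bit argument for choosing the quantization format. A minimal sketch; the low_bit keyword and the "sym_int8"/"fp16" format names are assumptions and may differ across ipex-llm versions:

from transformers import AutoModelForCausalLM
from ipex_llm import optimize_model

model = AutoModelForCausalLM.from_pretrained('meta-llama/Llama-2-7b-chat-hf',
                                             trust_remote_code=True,
                                             use_cache=True)

# Assumed usage: pick a less aggressive precision than the default sym_int4
model_int8 = optimize_model(model, low_bit="sym_int8")   # 8-bit weight quantization
# model_fp16 = optimize_model(model, low_bit="fp16")     # keep fp16 weights, no int quantization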

env:
Python 3.9
ipex-llm 2.1.0b20240416
torch 2.2.2
transformers 4.31.0

reproducer:

import os
os.environ["TOKENIZERS_PARALLELISM"] = "false"
os.environ["TRANSFORMERS_VERBOSITY"] = "error"

import sys
import warnings
warnings.filterwarnings("ignore")

import torch
torch.manual_seed(100)

from transformers import AutoTokenizer, AutoModelForCausalLM

model_path = 'meta-llama/Llama-2-7b-chat-hf'

# Load the original (unquantized) model and tokenizer
model = AutoModelForCausalLM.from_pretrained(model_path,
                                             trust_remote_code=True,
                                             use_cache=True)
tokenizer = AutoTokenizer.from_pretrained(model_path,
                                          trust_remote_code=True)

system_prompt = "You are a creative poet. Write a poem about the given topic. Use only 100 words"
user_prompt = "Write a poem about owls and starry nights"
prompt_template = f"<s>[INST] <<SYS>>\n {system_prompt} \n<</SYS>>\n\n {user_prompt}  [/INST]"

print("*"*10 + "Original model output" + "*"*10)
print(tokenizer.decode(model.generate(tokenizer.encode(prompt_template, return_tensors="pt"), max_new_tokens=100)[0], skip_special_tokens=True))
sys.stdout.flush()

# Apply ipex-llm optimization; defaults to sym_int4 weight quantization
from ipex_llm import optimize_model
model = optimize_model(model)

print("*"*10 + "IPEX-LLM Optimized model output" + "*"*10)
print(tokenizer.decode(model.generate(tokenizer.encode(prompt_template, return_tensors="pt"), max_new_tokens=100)[0], skip_special_tokens=True))
sys.stdout.flush()

output:

**********Original model output**********
[INST] <<SYS>>
 You are a creative poet. Write a poem about the given topic. Use only 100 words 
<</SYS>>

 Write a poem about owls and starry nights  [/INST]  Sure! Here is a 100-word poem about owls and starry nights:

Silent sentinels of the night,
Owls perch on boughs, their eyes alight.
Glittering stars above, a twinkling sight,
A magical night, pure delight.
Converting the current model to sym_int4 format......
**********IPEX-LLM Optimized model output**********
[INST] <<SYS>>
 You are a creative poet. Write a poem about the given topic. Use only 100 words 
<</SYS>>

 Write a poem about owls and starry nights  [/INST]  Sure, here is a poem about owls and starry nights in exactly 100 words:

Owls hoot in the night's embrace
Their soft coos echo through space
While stars twinkle bright and slow
A celestial show to know
Nature's symphony so grand
In this peaceful night's command
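To quantify how much the sym_int4 weights actually shift the model, one option is to compare next-token logits on the same prompt before and after optimization, instead of eyeballing the generated text. A minimal sketch that would replace the optimize_model step in the reproducer above, reusing model, tokenizer and prompt_template (the metrics below are illustrative, not part of the ipex-llm API):

import torch
import torch.nn.functional as F
from ipex_llm import optimize_model

inputs = tokenizer(prompt_template, return_tensors="pt")

# Next-token logits from the original model (before optimization)
with torch.no_grad():
    orig_logits = model(**inputs).logits[:, -1, :].float()

model = optimize_model(model)  # defaults to sym_int4

# Next-token logits from the quantized model on the same prompt
with torch.no_grad():
    quant_logits = model(**inputs).logits[:, -1, :].float()

print("max abs logit diff:", (orig_logits - quant_logits).abs().max().item())
print("cosine similarity :", F.cosine_similarity(orig_logits, quant_logits, dim=-1).item())
print("same argmax token :", bool((orig_logits.argmax(-1) == quant_logits.argmax(-1)).all()))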
@hkvision
Contributor

hkvision commented Apr 22, 2024

Hi,

We are doing further optimizations in ipex-llm for better performance, which may change some logits and outputs; this is expected.
At the same time, we run accuracy benchmarks (e.g. the tasks in https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard) to make sure our optimizations do not have any obvious negative impact on accuracy.
If you observe any incorrect output from the ipex-llm optimized model, feel free to let us know and we will look into it. Thanks!
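For a quick local sanity check before (or alongside) full benchmarks, a teacher-forced perplexity probe on a short passage can serve as a rough accuracy proxy. A minimal sketch using only the standard transformers loss output, reusing the model and tokenizer from the reproducer; the sample text is illustrative and this is not part of ipex-llm's benchmark suite:

import torch

def perplexity(model, tokenizer, text):
    # Teacher-forced perplexity over a short passage; a rough sanity check,
    # not a replacement for benchmarks such as MMLU or HumanEval.
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        loss = model(**enc, labels=enc["input_ids"]).loss
    return torch.exp(loss).item()

sample = "Owls are nocturnal birds of prey known for their nearly silent flight."
print("perplexity after optimize_model:", perplexity(model, tokenizer, sample))
# Running the same check before optimize_model gives a baseline to compare against.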
