Disable fast_init_ in load_low_bit #10945

Merged
merged 2 commits into from May 8, 2024

leonardozcm
Contributor

Description

related issue: #10826
All saved low-bit weights are already processed and need no initialization, so disabling fast init in load_low_bit does not affect current functionality.

@leonardozcm
Contributor Author

Verified accuracy and load speed:
before:

load time 1.1827845573425293s
-------------------- Prompt --------------------
<s>[INST] <<SYS>>

<</SYS>>

What is AI? [/INST]
-------------------- Output --------------------
[INST] <<SYS>>

<</SYS>>

What is AI? [/INST]  Artificial intelligence (AI) is the broader field of research and development aimed at creating machines that can perform tasks that typically require human intelligence,

after:

load time 1.1476478576660156s
-------------------- Prompt --------------------
<s>[INST] <<SYS>>

<</SYS>>

What is AI? [/INST]
-------------------- Output --------------------
[INST] <<SYS>>

<</SYS>>

What is AI? [/INST]  Artificial intelligence (AI) is the broader field of research and development aimed at creating machines that can perform tasks that typically require human intelligence,
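
The load times above can be reproduced with a small timing wrapper. This is only a minimal sketch and not the script used for this PR: `timed_load` and `fake_loader` are hypothetical names, and `fake_loader` is a stand-in for the real low-bit loading call, which is not shown in this thread.

```python
import time


def timed_load(loader, *args, **kwargs):
    """Call `loader` and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = loader(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def fake_loader(path):
    # Hypothetical stand-in for the real low-bit loading call (assumption).
    time.sleep(0.01)
    return {"path": path}


model, seconds = timed_load(fake_loader, "path/to/saved/low_bit_model")
print(f"load time {seconds}s")
```

Running the same wrapper before and after the change, with the same saved model, gives the two numbers to compare. A single run is noisy, so averaging several runs would make the comparison more reliable.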

@leonardozcm leonardozcm merged commit 0d6e120 into intel-analytics:main May 8, 2024
18 checks passed
@leonardozcm leonardozcm deleted the fast_init branch May 8, 2024 02:46