
question about quantization #119

Open
xinhaoc opened this issue Jun 4, 2023 · 0 comments
xinhaoc commented Jun 4, 2023

Hi FlexGen team! I have a question about your quantization algorithm. Are you using the function run_float_quantization for int4/int8 compression? When I run its test (test_float_quantize), it fails: the returned params differ from the DeepSpeed version, even though the ref_out_tensor is the same. The DeepSpeed params can recover the float16 tensor, but the params from run_float_quantize cannot. Thanks!
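For context on what "recover the float16 tensor" means here: in group-wise quantization, the quantized ints are only useful together with per-group params (e.g. a scale and a minimum), which are what dequantization uses to map back to approximate float16 values. Below is a minimal sketch of an asymmetric group-wise quantize/dequantize round trip; the function names and group size are hypothetical, and this is not FlexGen's or DeepSpeed's actual kernel, just an illustration of the params-based recovery being discussed.

```python
import torch

def quantize_groupwise(x: torch.Tensor, num_bits: int = 8, group_size: int = 64):
    # Split the tensor into fixed-size groups and quantize each group
    # asymmetrically. Each group keeps (scale, mn) as its params so the
    # original values can be approximately recovered later.
    orig_shape = x.shape
    groups = x.reshape(-1, group_size).float()
    mn = groups.min(dim=1, keepdim=True).values
    mx = groups.max(dim=1, keepdim=True).values
    scale = (2 ** num_bits - 1) / (mx - mn + 1e-8)
    q = ((groups - mn) * scale).round().clamp(0, 2 ** num_bits - 1).to(torch.uint8)
    return q, scale, mn, orig_shape

def dequantize_groupwise(q, scale, mn, orig_shape):
    # Recovery depends entirely on the stored params: if (scale, mn) are
    # wrong or laid out differently, the reconstructed tensor is wrong
    # even when the quantized ints themselves match.
    return ((q.float() / scale) + mn).reshape(orig_shape).half()

x = torch.randn(4, 128, dtype=torch.float16)
q, scale, mn, shape = quantize_groupwise(x)
x_rec = dequantize_groupwise(q, scale, mn, shape)
assert torch.allclose(x.float(), x_rec.float(), atol=0.05)
```

This is consistent with the symptom described above: two implementations can agree on the quantized output tensor yet store their params in different formats or orders, in which case only the matching dequantizer can reconstruct the float16 input.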
