
Support save/load API for WOQ #1786

Merged
merged 34 commits into master on May 17, 2024

Conversation

@Kaihui-intel (Collaborator) commented May 11, 2024

Type of Change

feature

Description

Support save/load API for WOQ
remove export_compressed_model from config
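The key idea behind a dedicated save/load API for weight-only quantization is that the packed low-bit weights are useless without their quantization metadata (algorithm, bits, group size), so both must be persisted and restored together. The following is a minimal, library-free sketch of that pattern; the function names, file layout, and config keys here are illustrative assumptions, not the actual neural_compressor API:

```python
import json
import os

def save_woq_model(packed_weights, qconfig, output_dir):
    """Write packed low-bit weights and their quantization config side by side.

    A plain state-dict save would lose the metadata needed to rebuild the
    compressed modules, which is why a dedicated save API is useful.
    """
    os.makedirs(output_dir, exist_ok=True)
    with open(os.path.join(output_dir, "quantized_model.json"), "w") as f:
        json.dump({"qconfig": qconfig, "weights": packed_weights}, f)

def load_woq_model(output_dir):
    """Rebuild the compressed model state from the saved directory."""
    with open(os.path.join(output_dir, "quantized_model.json")) as f:
        state = json.load(f)
    return state["weights"], state["qconfig"]

if __name__ == "__main__":
    import tempfile
    # Illustrative values only; real WOQ configs carry scales/zero-points too.
    qconfig = {"algorithm": "rtn", "bits": 4, "group_size": 32}
    weights = {"fc1": [23, 145, 7]}  # pretend these are packed int4 bytes
    with tempfile.TemporaryDirectory() as d:
        save_woq_model(weights, qconfig, d)
        loaded_weights, loaded_qconfig = load_woq_model(d)
        assert loaded_weights == weights and loaded_qconfig == qconfig
```

In the actual PR, saving and loading additionally have to reconstruct the quantized module classes themselves, not just the raw tensors, which is why the change touches `save_load.py`, `modules.py`, and `load_entry.py`.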

Expected Behavior & Potential Risk

Users can save a quantized WOQ model to disk and load it back through the new save/load API; `export_compressed_model` is removed from the quantization config.

How has this PR been tested?

UT

Local test (fp32 & RTN):

fp32:

| Tasks | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|---|---|
| lambada_openai | 1 | none | 0 | perplexity | 26.0209 | ± 0.9382 |
| | | none | 0 | acc | 0.3790 | ± 0.0068 |

Accuracy: 0.37900
Batch size = 1

RTN:

| Tasks | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|---|---|
| lambada_openai | 1 | none | 0 | perplexity | 29.1191 | ± 1.1134 |
| | | none | 0 | acc | 0.3679 | ± 0.0067 |

Accuracy: 0.36794
Batch size = 1

opt_125m_woq_gptq_int4_dq_bnb

| Tasks | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|---|---|
| lambada_openai | 1 | none | 0 | perplexity | 26.9172 | ± 1.0165 |
| | | none | 0 | acc | 0.3701 | ± 0.0067 |

Accuracy: 0.37008
Batch size = 1
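The `opt_125m_woq_gptq_int4*` model tests above all exercise int4 weights, which are typically stored two values per byte. A save/load API must round-trip that packing exactly, including the original element count, since an odd number of nibbles is padded on save. A small illustrative sketch (the actual packing layout inside neural_compressor's compressed linear modules may differ):

```python
def pack_int4(values):
    """Pack unsigned 4-bit integers (0..15) two per byte, low nibble first."""
    if len(values) % 2:
        values = values + [0]  # pad to an even nibble count
    return bytes(values[i] | (values[i + 1] << 4) for i in range(0, len(values), 2))

def unpack_int4(data, count):
    """Recover `count` 4-bit values; `count` must be saved alongside the
    bytes, because padding makes the byte length alone ambiguous."""
    out = []
    for b in data:
        out.append(b & 0x0F)
        out.append(b >> 4)
    return out[:count]

if __name__ == "__main__":
    vals = [3, 15, 0, 7, 9]
    packed = pack_int4(vals)
    assert len(packed) == 3  # 5 nibbles fit in 3 bytes
    assert unpack_int4(packed, len(vals)) == vals
```

This is why the saved artifact needs shape/count metadata in addition to the raw packed buffer, and why a generic checkpoint format is not enough on its own.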

Dependency Change?

No library dependencies are introduced or removed.

Signed-off-by: Kaihui-intel <kaihui.tang@intel.com>

github-actions bot commented May 15, 2024

⛈️ Required checks status: Has failure 🔴

Warning
If you do not have access to re-run the Probot, please contact XuehaoSun for help. If you push a new commit, all of the workflows will be re-triggered.

Groups summary

🟢 Code Scan Tests workflow
Check ID Status Error details
Code-Scan success
Code-Scan (Bandit Code Scan Bandit) success
Code-Scan (DocStyle Code Scan DocStyle) success
Code-Scan (Pylint Code Scan Pylint) success

These checks are required after the changes to neural_compressor/torch/algorithms/weight_only/gptq.py, neural_compressor/torch/algorithms/weight_only/modules.py, neural_compressor/torch/algorithms/weight_only/rtn.py, neural_compressor/torch/algorithms/weight_only/save_load.py, neural_compressor/torch/algorithms/weight_only/utility.py, neural_compressor/torch/quantization/algorithm_entry.py, neural_compressor/torch/quantization/config.py, neural_compressor/torch/quantization/load_entry.py.

🟢 Model Tests workflow
Check ID Status Error details
Model-Test success
Model-Test (Generate Report GenerateReport) success
Model-Test (Run ONNX Model resnet50-v1-12) success
Model-Test (Run PyTorch Model resnet18) success
Model-Test (Run PyTorch Model resnet18_fx) success
Model-Test (Run TensorFlow Model darknet19) success
Model-Test (Run TensorFlow Model inception_v1) success
Model-Test (Run TensorFlow Model resnet-101) success
Model-Test (Run TensorFlow Model resnet50v1.5) success
Model-Test (Run TensorFlow Model ssd_mobilenet_v1_ckpt) success
Model-Test (Run TensorFlow Model ssd_resnet50_v1) success

These checks are required after the changes to .azure-pipelines/scripts/models/run_model_trigger_common.sh.

🔴 Model Tests 3x workflow
Check ID Status Error details
Model-Test-3x failure
Model-Test-3x (Generate Report GenerateReport) failure download
Model-Test-3x (Run PyTorch Model opt_125m_woq_gptq_int4) success
Model-Test-3x (Run PyTorch Model opt_125m_woq_gptq_int4_dq_bnb) success
Model-Test-3x (Run PyTorch Model opt_125m_woq_gptq_int4_dq_ggml) success

These checks are required after the changes to neural_compressor/torch/algorithms/weight_only/gptq.py, neural_compressor/torch/algorithms/weight_only/modules.py, neural_compressor/torch/algorithms/weight_only/rtn.py, neural_compressor/torch/algorithms/weight_only/save_load.py, neural_compressor/torch/algorithms/weight_only/utility.py, neural_compressor/torch/quantization/algorithm_entry.py, neural_compressor/torch/quantization/config.py, neural_compressor/torch/quantization/load_entry.py, examples/3.x_api/pytorch/nlp/huggingface_models/language-modeling/quantization/llm/run_benchmark.sh, examples/3.x_api/pytorch/nlp/huggingface_models/language-modeling/quantization/llm/run_clm_no_trainer.py, examples/3.x_api/pytorch/nlp/huggingface_models/language-modeling/quantization/llm/run_quant.sh, .azure-pipelines/scripts/models/run_model_trigger_common.sh.

🟢 Unit Tests 3x-PyTorch workflow
Check ID Status Error details
UT-3x-Torch success
UT-3x-Torch (Coverage Compare CollectDatafiles) success
UT-3x-Torch (Unit Test 3x Torch Unit Test 3x Torch) success
UT-3x-Torch (Unit Test 3x Torch baseline Unit Test 3x Torch baseline) success

These checks are required after the changes to neural_compressor/torch/algorithms/weight_only/gptq.py, neural_compressor/torch/algorithms/weight_only/modules.py, neural_compressor/torch/algorithms/weight_only/rtn.py, neural_compressor/torch/algorithms/weight_only/save_load.py, neural_compressor/torch/algorithms/weight_only/utility.py, neural_compressor/torch/quantization/algorithm_entry.py, neural_compressor/torch/quantization/config.py, neural_compressor/torch/quantization/load_entry.py, test/3x/torch/quantization/test_smooth_quant.py, test/3x/torch/quantization/test_static_quant.py, test/3x/torch/quantization/weight_only/test_autoround.py, test/3x/torch/quantization/weight_only/test_awq.py, test/3x/torch/quantization/weight_only/test_gptq.py, test/3x/torch/quantization/weight_only/test_rtn.py, test/3x/torch/quantization/weight_only/test_teq.py.


Thank you for your contribution! 💜

Note
This comment is automatically generated and will be updated every 180 seconds for the next 6 hours. If you have any other questions, contact chensuyue or XuehaoSun for help.

pre-commit-ci bot and others added 11 commits May 15, 2024 08:14
Signed-off-by: Kaihui-intel <kaihui.tang@intel.com>
Signed-off-by: chensuyue <suyue.chen@intel.com>
pre-commit-ci bot and others added 7 commits May 16, 2024 08:21
Signed-off-by: Kaihui-intel <kaihui.tang@intel.com>
Signed-off-by: chensuyue <suyue.chen@intel.com>
@xin3he (Collaborator) left a comment:

@Kaihui-intel, we should add a UT for act_order; you can raise another PR.

@chensuyue chensuyue merged commit bacc164 into master May 17, 2024
38 of 40 checks passed
@chensuyue chensuyue deleted the kaihui/save_and_load branch May 17, 2024 09:29