GPTQ for RWKV #98

3outeille · 2023-04-19T09:25:29Z

This is a work in progress and serves as the main thread for any questions related to this topic.

3outeille · 2023-04-19T14:43:56Z

@BlinkDL Do I have to quantize blocks.1.att.* as well? (I am thinking of the key, value, and receptance weights.)

BlinkDL · 2023-04-20T07:18:55Z

@3outeille yes, do it for all matrix weights (ignore time_xxx)
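A minimal sketch of that selection rule, for illustration only (it is not code from this PR). It assumes parameters are described by a name-to-shape mapping; the function name `select_quantizable` and the example parameter names and shapes are hypothetical.

```python
def select_quantizable(param_shapes):
    """Return the names of 2D weight matrices, skipping time_* parameters.

    `param_shapes` maps a parameter name to its shape tuple. This input
    format is an assumption; with PyTorch it could be built from
    `{k: tuple(v.shape) for k, v in model.state_dict().items()}`.
    """
    selected = []
    for name, shape in param_shapes.items():
        # Skip anything under a time_xxx component (e.g. time_decay).
        if any(part.startswith("time_") for part in name.split(".")):
            continue
        # Keep only matrices; 1D tensors such as layer-norm weights are skipped.
        if len(shape) != 2:
            continue
        selected.append(name)
    return selected


# Illustrative names and shapes only, not read from a real checkpoint.
example_shapes = {
    "blocks.1.att.key.weight": (768, 768),
    "blocks.1.att.value.weight": (768, 768),
    "blocks.1.att.receptance.weight": (768, 768),
    "blocks.1.att.time_decay": (768,),
    "blocks.1.att.time_mix_k": (1, 1, 768),
    "blocks.1.ln1.weight": (768,),
}
print(select_quantizable(example_shapes))
```

With these example entries, only the key, value, and receptance matrices are kept.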

3outeille · 2023-04-25T10:05:47Z

@BlinkDL Do you happen to have a reference perplexity measure (or any other metric) I can use as a baseline?

BlinkDL · 2023-04-25T14:48:15Z

> @BlinkDL Do you happen to have a reference perplexity measure (or any other metric) I can use as a baseline?

Use the LAMBADA ppl from https://github.com/BlinkDL/ChatRWKV/blob/main/v2/benchmark.py
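For readers unfamiliar with the metric, here is a minimal sketch of how perplexity is generally derived from per-token log-probabilities. It shows the standard definition and is not the code in benchmark.py; the function name `perplexity` and its input format are assumptions.

```python
import math


def perplexity(token_logprobs):
    """Perplexity = exp(mean negative log-likelihood) over the scored tokens.

    `token_logprobs` is a list of natural-log probabilities that the model
    assigned to each target token (hypothetical input format).
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    avg_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(avg_nll)


# A model that gives every target token probability 0.25 has perplexity ~4.
print(perplexity([math.log(0.25)] * 4))
```

Comparing this number before and after quantization on the same evaluation set gives a baseline for how much quality is lost.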

meditans · 2023-04-26T23:51:24Z

Question: would we expect a huge improvement wrt perplexity if we did quantization-aware training?

3outeille · 2023-04-27T08:17:03Z

@meditans QAT will probably yield a large improvement, but it implies re-training the model, whereas GPTQ is a post-training quantization strategy (no re-training involved).
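To make "post-training" concrete, here is a minimal sketch of the simplest post-training scheme, uniform round-to-nearest quantization of already-trained weights. This is deliberately not GPTQ, which additionally uses approximate second-order information to compensate for the rounding error. The function name `quantize_rtn` and its parameters are hypothetical.

```python
def quantize_rtn(weights, bits=4):
    """Asymmetric uniform round-to-nearest quantization of a list of floats.

    Returns the integer codes and the dequantized values. No training is
    involved: only the min/max of the existing weights is used.
    """
    qmax = 2 ** bits - 1
    lo, hi = min(weights), max(weights)
    # Guard against a constant row, where hi == lo would give a zero scale.
    scale = (hi - lo) / qmax if hi > lo else 1.0
    codes = [min(qmax, max(0, round((w - lo) / scale))) for w in weights]
    dequant = [lo + c * scale for c in codes]
    return codes, dequant


# Toy weights, quantized to 1 bit so the effect is easy to see.
codes, dequant = quantize_rtn([0.0, 0.1, 0.9, 1.0], bits=1)
print(codes, dequant)
```

The gap between the original and dequantized values is the error that a post-training method such as GPTQ tries to reduce without re-training.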

BlinkDL · 2023-05-08T04:18:28Z

How's it going? :) Are you on Discord?

3outeille · 2023-05-09T08:08:16Z

Yep, I sent a message on Discord in the quantization channel.

Evilran · 2023-05-19T06:47:06Z

Hi. Is it available now?

3outeille · 2023-06-03T08:27:31Z

@Evilran Hi, making it work with ChatRWKV is too much of a hassle because it requires changing the RWKV class too much, so the PR will not be accepted. However, I made it work with the HuggingFace version of RWKV, if you want it: https://github.com/3outeille/GPTQ-for-RWKV

3outeille added 3 commits April 18, 2023 14:53

feat(quantize): measure perplexity on wikitext2

6e556e5

feat(quantize): add gptq files

bde6374

feat(quantize): begin to readapt with RWKV

943af70

3outeille mentioned this pull request Apr 19, 2023

Implement GPTQ for RWKV BlinkDL/RWKV-LM#88

Closed

3outeille added 4 commits April 23, 2023 12:50

breaking(quantize): draft gptq rwkv

629fc9b

fix(quantize): GPTQ hooks now work with RWKV

4a19476

feat(quantize): link fasterquant with RWKV + remove 1D tensor quantization for now

dba2670

feat(quantize): full gptq pipeline now integrated with RKWV (quite slow for some layer + need tests)

57079e7

fix(quantize): add missing part in forward block + support head.weight quantization

8e78f2d

3outeille force-pushed the quantize branch from c92f72f to 8e78f2d Compare April 26, 2023 08:14

3outeille force-pushed the quantize branch 3 times, most recently from f4584b4 to 76d937b Compare April 28, 2023 20:29

3outeille added 6 commits April 28, 2023 20:30

feat(sanity-check): begin sanity check for GPTQ on MNIST

f87df05

breaking(sanity-check): add save & load option for reference gptq

b77715d

breaking(sanity-check): enhance with dummy model

816def4

fix(sanity-check): dont quantize last layer for dummy example

f141e52

breaking(sanity-check): adding my implem gptq

a1ea882

fix(sanity-check): training ref and implem now yield same outputs

8a37fb4

3outeille force-pushed the quantize branch from 76d937b to 5278821 Compare April 28, 2023 20:30

feat(sanity-check): implem version of gptq now added

4233522

3outeille force-pushed the quantize branch from fc9a065 to 4233522 Compare May 2, 2023 07:05

3outeille added 2 commits May 3, 2023 12:30

fix(sanity-check): ref and implem now yield the same results at every step Date: Tue May 2 18:17:57 2023 +0000

e74d72a

feat(quantize): readapt GPTQ for rwkv

cf14124

3outeille force-pushed the quantize branch from bb43465 to cf14124 Compare May 3, 2023 12:32

breaking(gptq): quantizing only 1 layer yield high perplexity

c2bbe64

3outeille force-pushed the quantize branch from 5d2ad0c to c2bbe64 Compare May 7, 2023 20:00

fix(ppl): measure ppl using sliding window

9b9c714

update

3399ef0
