I am trying to train the text2image direction using the newly uploaded image_t2x dataset. However, it raises an AssertionError at line 410 of anyToImageVideoAudio.py: assert e - s + 1 == num_gen_tokens, (s, e).
Looking closer, one image example in image_t2x.json gives s=118 and e=124, so e - s + 1 = 7, while num_gen_tokens is a predefined hyperparameter set to 4, so the assertion fails.
This comes from line 303 of utils.py under model/common. Take the following sentence: 'Of course, I can assist you with that! Behold, a captivating vector illustration showcasing a hand delicately crafting a heart shape. The vibrant colors and meticulous details make this graphic both eye-catching and heartwarming. I hope you find it delightful.[IMG0] [IMG1] [IMG2] [IMG3]\n###'. The tokenizer returns [4587, 3236, 29892, 306, 508, 6985, 366, 411, 393, 29991, 1522, 8948, 29892, 263, 4332, 440, 1218, 4608, 8632, 362, 1510, 29883, 5832, 263, 1361, 628, 293, 2486, 25554, 292, 263, 5192, 8267, 29889, 450, 325, 4626, 424, 11955, 322, 1539, 12906, 681, 4902, 1207, 445, 3983, 293, 1716, 10977, 29899, 12510, 292, 322, 5192, 29893, 2817, 292, 29889, 306, 4966, 366, 1284, 372, 15319, 1319, 29889, 32002, 259, 32003, 259, 32004, 259, 32005, 29871, 13, 2277, 29937]. The span from 32002 to 32005 therefore contains 7 tokens where only 4 are expected, which triggers the assertion error.
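To make the mismatch concrete, here is a minimal sketch that recomputes the span length on the tail of the token list above. It assumes that 32002 and 32005 are the ids of [IMG0] and [IMG3], as the list suggests; the variable names are only illustrative and are not taken from the repository.

```python
# Last 11 ids of the token list quoted above.
tail_ids = [32002, 259, 32003, 259, 32004, 259, 32005, 29871, 13, 2277, 29937]

# Assumption: these are the ids of [IMG0] and [IMG3], read off the list above.
IMG0_ID, IMG3_ID = 32002, 32005
num_gen_tokens = 4  # hyperparameter value mentioned above

s = tail_ids.index(IMG0_ID)  # position of the first signal token
e = tail_ids.index(IMG3_ID)  # position of the last signal token
span = e - s + 1

# Tokens inside the span that are not one of the four signal tokens.
extra = [t for t in tail_ids[s:e + 1] if not (IMG0_ID <= t <= IMG3_ID)]

print(span, extra)  # 7 [259, 259, 259]
```

The three extra 259 tokens sit where the spaces between the signal tokens are, and they account for the difference between 7 and 4.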
I wonder which part of this is wrong and how I should correct it. @ChocoWu
Thank you~
Hi, @jwzhi
I guess you are using the updated Vicuna (not v0), because I ran into the same issue with the updated tokenizer: it tokenizes “[IMG0] [IMG1] [IMG2] [IMG3]” into “32002, 259, 32003, 259, 32004, 259, 32005” instead of the expected “32002, 32003, 32004, 32005”. However, tokenizing “[IMG0]” on its own gives the expected result, “32002”.
Unfortunately, I haven't been able to identify the cause of this issue yet.
So, I recommend either reverting to Vicuna-v0 or making modifications to the code to address this problem.
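As one possible code modification, here is a minimal, untested-in-training sketch that drops the filler token when it sits directly between two signal tokens, so the span length matches num_gen_tokens again. The function name drop_fillers_between_signals is hypothetical and not part of the repository, and the ids (32002 to 32005 for the signal tokens, 259 for the filler) are assumptions read off the output quoted above.

```python
def drop_fillers_between_signals(input_ids, signal_ids, filler_ids):
    """Return a copy of input_ids without filler tokens that sit
    directly between two signal tokens (hypothetical helper)."""
    signal_ids = set(signal_ids)
    filler_ids = set(filler_ids)
    cleaned = []
    last = len(input_ids) - 1
    for i, tok in enumerate(input_ids):
        is_between = (
            tok in filler_ids
            and 0 < i < last
            and input_ids[i - 1] in signal_ids
            and input_ids[i + 1] in signal_ids
        )
        if not is_between:
            cleaned.append(tok)
    return cleaned


# Tail of the tokenizer output from the report, with one preceding token.
ids = [29889, 32002, 259, 32003, 259, 32004, 259, 32005, 29871, 13]

# Assumption: signal token ids and the filler id, as seen in the output above.
cleaned = drop_fillers_between_signals(ids, range(32002, 32006), [259])

s, e = cleaned.index(32002), cleaned.index(32005)
print(cleaned)
print(e - s + 1)  # 4
```

If labels or attention masks are built in parallel with the input ids, they would need the same positions removed so that everything stays aligned.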