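Hello authors, in VideoChat2 training, stage 2 trains the parameters of the visual encoder and the QFormer, so their weights change. In stage 3, is the input vit_blip_model the one whose weights were updated in stage 2, or is the original vit_blip_model loaded again?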
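Hello authors, please disregard my previous question. I trained stage 3 of VideoChat2 on the training set you provided, with a very small amount of that data missing, using a batch size of 32. The best performance so far is about 50.15%. On some datasets the gap is large, for example Action Sequence (-7pp) and Scene Transition (-12pp). I would like to ask about this gap.
Thanks in advance for your help.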
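Batch size may have some effect, but I don't think it is large. When you reduce the batch size, you can lower the learning rate accordingly. Also, later experiments found that the smaller-data versions of COCO and WebVid occasionally give better results. Personally, I consider fluctuations within 0.5% fairly normal.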
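The reply above suggests lowering the learning rate when the batch size is reduced, without giving a formula. One common heuristic for this is linear scaling, which keeps the ratio of learning rate to batch size constant. The sketch below illustrates that heuristic only; it is not taken from the VideoChat2 code. The function name `scale_learning_rate` and the reference values are hypothetical and should be replaced with the values from your own config.

```python
def scale_learning_rate(base_lr: float, base_batch_size: int, new_batch_size: int) -> float:
    """Linear scaling heuristic: keep lr / batch_size constant.

    This is a generic rule of thumb, not the VideoChat2 authors' procedure.
    """
    if base_batch_size <= 0 or new_batch_size <= 0:
        raise ValueError("batch sizes must be positive")
    return base_lr * new_batch_size / base_batch_size


# Hypothetical reference values for illustration only;
# they are NOT read from the VideoChat2 stage 3 config.
reference_lr = 2e-5
reference_batch_size = 128

# Learning rate to try when training with a batch size of 32.
scaled_lr = scale_learning_rate(reference_lr, reference_batch_size, 32)
print(f"scaled learning rate: {scaled_lr:.2e}")
```

Linear scaling is only a starting point. Since the reply expects the batch size effect to be small, a short sweep around the scaled value may be more informative than relying on the formula alone.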
Thank you very much!