Hello author, thanks for the great work! I want to use ALBEF to train another language-image multimodal model, and I am a little confused about the fine-tuning procedure.
Here are the steps I followed:
1. Load the .pth file from your repo and iterate over its parameters.
2. Load the parameters of the BERT model bert-base-chinese into the ALBEF tensors whose names contain text_encoder.
3. Freeze the ALBEF parameters whose names contain visual_encoder.
The code looks like this:
`
import torch
# ALBEF and BertTokenizer are assumed to be imported from the ALBEF repo

# Load the Chinese BERT tokenizer and build the model
tokenizer = BertTokenizer.from_pretrained(args.text_encoder)  # e.g. bert-base-chinese
model = ALBEF(config=config, text_encoder=args.text_encoder, tokenizer=tokenizer)
model_dict = model.state_dict()

# Take tensors from the ALBEF checkpoint, skipping text_encoder tensors,
# keys the model does not have, and tensors whose shape does not match
pretrained_dict = torch.load(args.checkpoint, map_location='cpu')['model']
temp = {}
for k, v in pretrained_dict.items():
    if "text_encoder" not in k and k in model_dict and model_dict[k].shape == v.shape:
        temp[k] = v

# Overwrite the matching entries; text_encoder keeps the weights the model was built with
model_dict.update(temp)
model.load_state_dict(model_dict)

# Freeze the visual encoder. requires_grad has to be set on the model's own parameters:
# load_state_dict only copies values, so a flag set on state_dict tensors is ignored.
for name, param in model.named_parameters():
    if "visual_encoder" in name:
        param.requires_grad = False
`
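The load-and-freeze steps above can be checked on a small stand-in model before running them on a real checkpoint. The sketch below is only an illustration: `ToyModel` and `load_partial_and_freeze` are hypothetical names, not part of the ALBEF repo, and the two sub-modules merely mimic the `visual_encoder` / `text_encoder` naming. It assumes PyTorch is installed.

```python
import torch
import torch.nn as nn


class ToyModel(nn.Module):
    """Hypothetical stand-in for ALBEF with two sub-modules."""

    def __init__(self, vocab_size):
        super().__init__()
        self.visual_encoder = nn.Linear(4, 4)
        self.text_encoder = nn.Embedding(vocab_size, 4)


def load_partial_and_freeze(model, pretrained_dict, skip="text_encoder", freeze="visual_encoder"):
    """Copy compatible tensors into `model`, then freeze parameters whose name contains `freeze`."""
    model_dict = model.state_dict()
    compatible = {
        k: v
        for k, v in pretrained_dict.items()
        if skip not in k and k in model_dict and model_dict[k].shape == v.shape
    }
    # strict=False leaves every key that is absent from `compatible` untouched
    result = model.load_state_dict(compatible, strict=False)
    for name, param in model.named_parameters():
        if freeze in name:
            param.requires_grad = False
    return sorted(compatible), result.missing_keys


source = ToyModel(vocab_size=10)  # plays the role of the checkpoint
target = ToyModel(vocab_size=20)  # plays the role of the new model with a different vocabulary
loaded, missing = load_partial_and_freeze(target, source.state_dict())
trainable = [n for n, p in target.named_parameters() if p.requires_grad]
print("loaded:", loaded)
print("not loaded:", missing)
print("trainable:", trainable)
```

Printing the loaded, skipped, and trainable names makes it easy to confirm that the text encoder was left alone and that the visual encoder is actually frozen.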
In the end I got a poor recall score on the Flickr-CN dataset. Could you give me some advice?
Hi, it won't work if you directly replace bert-en with bert-cn, because the parameters of the two models are different. ALBEF is pre-trained with bert-en and cannot be applied directly to bert-cn.
Hey @LiJunnan1992, what is your opinion on using adapters to update the pretrained model, or something like low-rank adaptation (LoRA)? I am wondering whether this kind of partial training can be applied to aligning these models. It seems possible, but I would like an expert opinion, since the setup may be costly in time and compute. Thank you!
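For readers unfamiliar with the idea, here is a minimal sketch of a LoRA-style layer in plain PyTorch. It only illustrates the mechanism (a frozen linear layer plus a trainable low-rank update) and says nothing about whether it would resolve the bert-en / bert-cn mismatch discussed above. `LoRALinear`, `rank`, and `alpha` are illustrative names chosen here, not part of ALBEF or of any particular library.

```python
import math

import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Wraps a frozen nn.Linear and adds a trainable low-rank update B @ A."""

    def __init__(self, base: nn.Linear, rank: int = 4, alpha: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # the pretrained weights stay fixed
        self.lora_a = nn.Parameter(torch.empty(rank, base.in_features))
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))
        nn.init.kaiming_uniform_(self.lora_a, a=math.sqrt(5))
        self.scale = alpha / rank

    def forward(self, x):
        # B starts at zero, so the wrapped layer initially matches the base layer
        return self.base(x) + self.scale * (x @ self.lora_a.t() @ self.lora_b.t())


layer = LoRALinear(nn.Linear(16, 16), rank=4)
x = torch.randn(2, 16)
same_at_init = torch.allclose(layer(x), layer.base(x))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
frozen = sum(p.numel() for p in layer.parameters() if not p.requires_grad)
print(same_at_init, trainable, frozen)
```

In this toy case only the two low-rank matrices are trained while the 16x16 weight and its bias stay frozen, which is where the savings in compute and memory would come from.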