Releases: lucidrains/PaLM-rlhf-pytorch
0.2.1
0.2.0
address https://github.com/lucidrains/PaLM-rlhf-pytorch/issues/41, be faithful to the paper
0.1.4
old action log probs should be the true distribution in the kl div loss, addressing https://github.com/lucidrains/PaLM-rlhf-pytorch/issues/43
0.1.2
flash attention sdp context config only needs to be done once
0.1.1
fix assert
0.1.0
add ability to use flash attention if using pytorch 2.0, thanks to @conceptofmind for the initial PR!
0.0.68
0.0.67
fix silly error in masked kl div loss, thanks to @taynoel84
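For readers unfamiliar with the term, a masked KL divergence averages the per-position KL term only over non-padding positions. The sketch below is a framework-free illustration, not the repository's code: the function name `masked_kl_div`, its argument layout, and the `eps` clamp are assumptions made for this example. It treats the old distribution as the reference, i.e. it computes KL(old || new).

```python
import math

def masked_kl_div(old_probs, new_probs, mask, eps=1e-20):
    """Mean KL(old || new) over positions where mask is True.

    old_probs, new_probs: one probability distribution (list of floats) per position.
    mask: one bool per position; False marks padding that must be ignored.
    All names here are hypothetical and chosen for illustration only.
    """
    total, count = 0.0, 0
    for p_old, p_new, keep in zip(old_probs, new_probs, mask):
        if not keep:
            continue  # padded positions contribute nothing to the loss
        # KL(p || q) = sum_i p_i * (log p_i - log q_i), clamped to avoid log(0)
        total += sum(
            p * (math.log(max(p, eps)) - math.log(max(q, eps)))
            for p, q in zip(p_old, p_new)
        )
        count += 1
    # Divide by the number of unmasked positions, not the full sequence length
    return total / max(count, 1)
```

Dividing by the count of unmasked positions, rather than by the sequence length, keeps padding from diluting the penalty on short sequences.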
0.0.66
allow for setting critic palm from rlhftrainer
0.0.65
fix an error with the way action log prob is collected during the episode rollouts, addressing https://github.com/lucidrains/PaLM-rlhf-pytorch/issues/31 and thanks to @kisseternity
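As background for this entry, the action log prob stored during a rollout is usually the log-probability that the policy assigned to the token it actually sampled at each step. The sketch below shows that idea with the standard library only. It is not the repository's implementation, and the helper names `log_softmax` and `collect_action_log_probs` are hypothetical.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over one list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def collect_action_log_probs(logits_per_step, actions):
    """Return, for each step, the log-prob of the action that was sampled.

    logits_per_step: one list of vocabulary logits per generation step.
    actions: the sampled token index at each step.
    """
    return [log_softmax(logits)[a] for logits, a in zip(logits_per_step, actions)]
```

These stored values typically serve later as the "old" log probs that a PPO-style update compares against the current policy.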