Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models
deep-learning
optimizer
pytorch
artificial-intelligence
resnet
vit
diffusion
mae
fairseq
cuda-programming
bert-model
gpt2
transformer-xl
timm
convnext
adan
dreamfusion
Updated Apr 4, 2024 - Python