Pull requests: aws-samples/awsome-distributed-training

Pull requests list

Add in lt for p5
#339 opened May 17, 2024 by sean-smith
Feature/ldap server
#338 opened May 17, 2024 by mhuguesaws
Llama training with FP8
#331 opened May 15, 2024 by pbelevich Draft
nccl-tests container: fix cuda driver mismatch
#314 opened May 7, 2024 by verdimrc
smhp: quality-of-live improvements
#300 opened May 3, 2024 by verdimrc
Extra containerized nccl tests
#298 opened May 3, 2024 by verdimrc Draft
Add draft gpu troubles
#290 opened Apr 30, 2024 by mhuguesaws Draft
[Draft] add torchtitan test case
#286 opened Apr 26, 2024 by KeitaW Draft
Add llama-repices llama3 example (label: enhancement)
#276 opened Apr 19, 2024 by KeitaW
Improve upon EFA versions script
#266 opened Apr 15, 2024 by sean-smith
[WIP] torchtune usecase
#260 opened Apr 12, 2024 by pbelevich Draft
Bump pytorch dockerfile template
#211 opened Mar 12, 2024 by verdimrc
SMHP: slurm exporter to report gpu metrics
#181 opened Mar 6, 2024 by verdimrc
Update organization and tag to V1
#150 opened Feb 22, 2024 by perifaws
megatron-lm test case: update README
#114 opened Jan 25, 2024 by verdimrc Draft
Prepare DLAMI for ParallelCluster using pcluster build-image (label: enhancement)
#92 opened Jan 5, 2024 by verdimrc