
Prepare DLAMI for ParallelCluster using pcluster build-image #92

Open · wants to merge 6 commits into base: main
Conversation

@verdimrc (Contributor) commented Jan 5, 2024

Issue #, if available: N/A

Description of changes: Example to prepare a DLAMI using pcluster build-image, which does not require additional community tools (Ansible and Packer).
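For concreteness, a minimal sketch of what such an image configuration might look like, mirroring the fragment discussed in the review below. The ParentImage value and the comments are placeholders, not part of the actual PR: the real AMI ID must be looked up per region for the Deep Learning Base OSS Nvidia Driver GPU AMI.

```yaml
# Hypothetical sketch of a pcluster build-image configuration.
# ParentImage is a placeholder AMI ID, not a real one.
Build:
  InstanceType: g4dn.4xlarge
  ParentImage: ami-0123456789abcdef0  # DLAMI, looked up for the target region
```

It would then be built with something like `pcluster build-image --image-configuration image.yaml --image-id dlami-pcluster --region us-west-2`.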

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@verdimrc verdimrc added the enhancement New feature or request label Jan 5, 2024
@verdimrc verdimrc changed the title Prepare DLAMI using pcluster build-image Prepare DLAMI for ParallelCluster using pcluster build-image Jan 5, 2024
```yaml
# Estimated build time: ~1h
InstanceType: g4dn.4xlarge

# Deep Learning Base OSS Nvidia Driver GPU AMI (Ubuntu 20.04) 20240101 / us-west-2
```
@mhuguesaws (Contributor) commented Jan 5, 2024

Feels very hardcoded. Why the OSS NVIDIA driver and not the closed-source one?

I have a hard time understanding what the DLAMI brings compared to the ParallelCluster AMI.

@verdimrc (Contributor, Author) commented Jan 8, 2024

Feels very hardcoded.

Specifying an AMI ID is much the same as in the rest of pcluster or Packer. The example purposely includes a comment with the AMI name and region to quickly communicate what exactly the parent AMI is, admittedly at the cost of a (minor) maintainability overhead.

Why the OSS NVIDIA driver and not the closed-source one?

DLAMI added 'OSS' to the AMI name in Dec '23. The Jan '24 build uses https://github.com/NVIDIA/open-gpu-kernel-modules.git

... what the DLAMI brings compared to the ParallelCluster AMI?

  • A different release cycle, decoupled from ParallelCluster, giving flexibility in which release cycle to follow.
  • An approximation (though not exact) of HyperPod, which is based on DLAMI.
  • It's for users who come from DLAMI and want to continue doing so: prebuilt NCCL, multiple CUDA versions, and the other idiosyncrasies of DLAMI.

@mhuguesaws (Contributor)

  1. Packer uses the latest AMI name, or one based on the ParallelCluster version. It is not bound to an AMI ID specific to a region.
  2. The OSS driver was added to address Linux kernel changes impacting EFA, https://docs.aws.amazon.com/dlami/latest/devguide/important-changes.html. It is fixed now.
  3. Based on DLAMI to use prebuilt NCCL etc. Agreed.

@verdimrc (Contributor, Author)

1/ Packer uses the latest AMI name, or one based on the ParallelCluster version. It is not bound to an AMI ID specific to a region.

This is definitely a plus point.

@@ -0,0 +1,30 @@
```yaml
Build:
# Estimated build time: ~1h
```
@mhuguesaws (Contributor)

It seems long. Why not stick to Packer?

@verdimrc (Contributor, Author) commented Jan 8, 2024

It seems long.

It's about 40-50 minutes, and Packer would take the same time with the default EBS setting (125 MB/s). Our Packer example is faster because we raise the throughput very high (1000 MB/s), which also locks the resulting AMI to the build-time EBS throughput.

pcluster build-image depends on EC2 Image Builder, and Image Builder seems to support only the default EBS throughput (125 MB/s), so at the moment that 40-50 min is the build time (hence being upfront about it in the comment).

Why not stick to Packer?

Because all the GPU DL stacks are already installed, and the image only needs to be enriched with Slurm and the other ParallelCluster requirements (which is exactly what pcluster build-image is about).

Also, this method supports alinux2 or ub2004 without having to write custom Packer and Ansible recipes. Whereas right now, our Packer example is written for alinux2 and does not work out of the box with ub2004.

Lastly, installing the pcluster CLI is straightforward on multiple platforms, as it's a standard Python package installation. Packer (+Ansible) may have paper cuts on different platforms (e.g., on macOS, scp needs its own flag compared to using Cloud9 as the Packer client).

@mhuguesaws (Contributor)

It's because Packer+Ansible was not supposed to SSH remotely; it should run locally.
The Ansible roles in this repo are completely deficient. I am working on a complete Ansible role rewrite, plus tests. ETA Q1.

@mhuguesaws (Contributor) left a comment

Left comments

Verdi March and others added 3 commits January 8, 2024 15:55
@KeitaW force-pushed the pcluster-build-image-dlami branch 2 times, most recently from fa037c7 to 6073a5b on June 4, 2024 02:26
@KeitaW force-pushed the main branch 3 times, most recently from 44e448e to 1209815 on June 4, 2024 02:30
Labels
enhancement New feature or request

3 participants