Nemo readme revisions #9129
base: main
Co-authored-by: Eric Harper <complex451@gmail.com> Signed-off-by: jgerh <163925524+jgerh@users.noreply.github.com>
Overall, the text changes are nice, but they leave many ambiguities about which features are available for each domain, so please correct those.
Separately, at a recent conference, researchers told me they could not find the ASR models and features supported in NeMo after 1.23, when the previous refactor pushed all the domain docs inside the nemo repo and left them effectively invisible to the world.
Most people will not click links nested in a wall of text to hunt down domain features and docs, so I request that the domain docs be added to the end of the main NeMo readme.
README.rst
Outdated
and text-to-speech synthesis (TTS).
The primary objective of NeMo is to provide a scalable framework for researchers and developers from industry and academia
to more easily implement and design new generative AI models by being able to leverage existing code and pretrained models.
NVIDIA NeMo Framework is a scalable and cloud-native generative AI framework built for researchers and PyTorch developers working on `Large Language Models <nemo/collections/nlp/README.md>`_ (LLMs), `Multimodal Models <nemo/collections/multimodal/README.md>`_ (MMs), `Automatic Speech Recognition <nemo/collections/asr/README.md>`_ (ASR), `Text to Speech <nemo/collections/tts/README.md>`_ (TTS), and `Computer Vision <nemo/collections/vision/README.md>`_ (CV). It is designed to help you efficiently create, customize, and deploy new generative AI models by leveraging existing code and pre-trained model checkpoints.
Please pull out the domain-specific readmes at the end of the main readme. We have already had questions at ICASSP this year about the ASR features and models we support, because the ASR domain features are hidden inside of nemo/collections/asr/README.md.
Revert the Key Features link. In addition, this issue will be addressed in a separate PR.
For technical documentation, please see the `NeMo Framework User Guide <https://docs.nvidia.com/nemo-framework/user-guide/latest/playbooks/index.html>`_.
Please add back the link to the documentation at the very top - it is impossible to find documentation in a wall of text somewhere in the middle.
Create a documentation heading and add user guide link here.
When applicable, NeMo models take advantage of the latest possible distributed training techniques,
including parallelism strategies such as
When applicable, NeMo models leverage cutting-edge distributed training techniques, incorporating `parallelism strategies <https://docs.nvidia.com/nemo-framework/user-guide/latest/modeloverview.html>`_ to enable efficient training of very large models. These techniques include Tensor Parallelism (TP), Pipeline Parallelism (PP), Fully Sharded Data Parallelism (FSDP), Mixture-of-Experts (MoE), and Mixed Precision Training with BFloat16 and FP8, as well as others.
Be explicit - only NeMo LLM and Multimodal Models can leverage parallelism strategies like those above.
README.rst
Outdated
For technical documentation, please see the `NeMo Framework User Guide <https://docs.nvidia.com/nemo-framework/user-guide/latest/playbooks/index.html>`_.
Model Training, Alignment, and Customization
Be explicit - LLM Model Training, Alignment, and Customization
README.rst
Outdated
NeMo LLMs can be aligned with state of the art methods such as SteerLM, DPO and Reinforcement Learning from Human Feedback (RLHF),
see `NVIDIA NeMo Aligner <https://github.com/NVIDIA/NeMo-Aligner>`_ for more details.
Model Deployment and Optimization
LLM Deployment and Optimization
Change to LLM and MM Model . . .
@@ -408,35 +363,32 @@ To install Apex, run
git checkout $apex_commit
pip install . -v --no-build-isolation --disable-pip-version-check --no-cache-dir --config-settings "--build-option=--cpp_ext --cuda_ext --fast_layer_norm --distributed_adam --deprecated_fused_adam --group_norm"
When attempting to install Apex separately from the NVIDIA PyTorch container, you might encounter an error if the CUDA version on your system is different from the one used to compile PyTorch. To bypass this error, you can comment out the relevant line in the setup file located in the Apex repository on GitHub here: https://github.com/NVIDIA/apex/blob/master/setup.py#L32.
Side note: @ericharper, can we ask the Apex folks to remove this hardcoded check? We almost always have to comment it out anyway, and it's a pain to have to clone the repo and manually edit files to get something to work.
To use a pre-built container, please run
NeMo containers are launched concurrently with NeMo version updates. For example, the release of NeMo ``r1.23.0`` comes with the container ``nemo:24.01.speech``. The latest containers are:
* NeMo LLM and MM container - `nvcr.io/nvidia/nemo:24.03.framework`
This should be updated to the unified container if it's out already.
Submit a PR to update the container version. Stet as is.
README.rst
Outdated
Get Help
Revert - it sounds very wrong to say "Get Help" - just keep it as Contributing & Discussion
OK
We welcome community contributions! Please refer to `CONTRIBUTING.md <https://github.com/NVIDIA/NeMo/blob/stable/CONTRIBUTING.md>`_ for the process.
Publications
Add back publications, we have a new page for research publications
OK
If you would like to add your own article to the list, you are welcome to do so via a pull request to this repository's ``gh-pages-src`` branch.
Please refer to the instructions in the `README of that branch <https://github.com/NVIDIA/NeMo/tree/gh-pages-src#readme>`_.
To contribute an article to the collection, please submit a pull request to the ``gh-pages-src`` branch of this repository. For detailed information, please consult the README located at the `gh-pages-src branch <https://github.com/NVIDIA/NeMo/tree/gh-pages-src#readme>`_.
TODO: @erastorgueva-nv update this part after the PR is refactored and merged
What does this PR do ?
Add a one line overview of what this PR aims to accomplish.
Collection: [Note which collection this PR will affect]
Changelog
Usage
# Add a code snippet demonstrating how to use this
GitHub Actions CI
The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.
The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI, remove the label and add it again.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".
Before your PR is "Ready for review"
Pre checks:
PR Type:
If you haven't finished some of the above items, you can still open a "Draft" PR.
Who can review?
Anyone in the community is free to review the PR once the checks have passed.
The contributor guidelines list specific people who can review PRs in various areas.
Additional Information