
[QUESTION] Does Megatron-Core support LLaMA models? #803

Open
noob-ctrl opened this issue May 3, 2024 · 5 comments

@noob-ctrl

Does Megatron-Core support LLaMA models?

@ethanhe42
Member

Yes.

@noob-ctrl
Author

@ethanhe42 When `--transformer-impl` is `local`, it reports the following error:

AssertionError: (RMSNorm) is not supported in FusedLayerNorm when instantiating FusedLayerNorm when instantiating TransformerLayer

When `--transformer-impl` is `transformer_engine`, the code below does not seem to define RMSNorm:

[screenshot of the layer-spec code]

So do I need to make any changes when I want to use LLaMA?
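
(For context: the assertion originates in Megatron-Core's `FusedLayerNorm` wrapper, which only supports plain LayerNorm, so a config with `normalization="RMSNorm"` fails under the `local` implementation. Below is a paraphrased sketch of the check, assuming the layout of `megatron/core/fusions/fused_layer_norm.py`; exact wording varies by version.)

```python
import torch

# Paraphrased sketch of the guard behind the reported AssertionError
# (roughly megatron/core/fusions/fused_layer_norm.py; version-dependent).
class FusedLayerNorm(torch.nn.Module):
    def __init__(self, config, hidden_size: int, eps: float = 1e-5):
        super().__init__()
        # The local implementation fuses only plain LayerNorm, so a config
        # with normalization="RMSNorm" trips this check.
        assert config.normalization == "LayerNorm", (
            f"({config.normalization}) is not supported in FusedLayerNorm"
        )
        # ... fused LayerNorm parameters and kernels are set up here.
```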

@ethanhe42
Member

You need to use mcore models; `local` is being deprecated.
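
For example, a minimal sketch of building a LLaMA-style model with the mcore `GPTModel` and the Transformer Engine layer spec. The hyperparameters are illustrative, constructor signatures may differ across Megatron-Core versions, and instantiating also requires Megatron's model-parallel state to be initialized:

```python
import torch.nn.functional as F
from megatron.core.transformer.transformer_config import TransformerConfig
from megatron.core.models.gpt.gpt_layer_specs import (
    get_gpt_layer_with_transformer_engine_spec,
)
from megatron.core.models.gpt.gpt_model import GPTModel

# LLaMA-7B-like settings (illustrative values).
config = TransformerConfig(
    num_layers=32,
    hidden_size=4096,
    num_attention_heads=32,
    normalization="RMSNorm",   # LLaMA uses RMSNorm instead of LayerNorm
    activation_func=F.silu,    # with gated_linear_unit=True this gives SwiGLU
    gated_linear_unit=True,
    add_bias_linear=False,     # LLaMA's linear layers carry no bias
)

model = GPTModel(
    config=config,
    transformer_layer_spec=get_gpt_layer_with_transformer_engine_spec(),
    vocab_size=32000,
    max_sequence_length=4096,
    position_embedding_type="rope",  # LLaMA uses rotary embeddings
)
```

With the pretraining scripts, the rough equivalent is passing flags such as `--use-mcore-models --transformer-impl transformer_engine --normalization RMSNorm --swiglu` (flag names as of mid-2024 Megatron-LM).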

@noob-ctrl
Author

@ethanhe42 And when `--transformer-impl` is set to `transformer_engine`, the code below does not seem to define RMSNorm?

[screenshot of the layer-spec code]

@ethanhe42
Member

It's handled by `TENorm`.
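
(Concretely: `TENorm` in `megatron/core/transformer/custom_layers/transformer_engine.py` is a small factory that returns Transformer Engine's `LayerNorm` or `RMSNorm` depending on `config.normalization`, which is why no separate RMSNorm class appears in the spec. A paraphrased sketch follows; the real signatures take extra arguments and vary by version.)

```python
import transformer_engine as te

# Paraphrased sketch of Megatron-Core's TENorm factory (version-dependent).
class TENorm:
    def __new__(cls, config, hidden_size: int, eps: float = 1e-5):
        # Dispatch to the matching Transformer Engine norm at construction
        # time, so the layer spec only ever needs to name TENorm.
        if config.normalization == "LayerNorm":
            return te.pytorch.LayerNorm(hidden_size, eps=eps)
        elif config.normalization == "RMSNorm":
            return te.pytorch.RMSNorm(hidden_size, eps=eps)
        else:
            raise Exception("Only LayerNorm and RMSNorm are currently supported")
```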
