Unknown header keys while converting Llama 3 70B to the distributed format #40

Open
DifferentialityDevelopment opened this issue May 8, 2024 · 1 comment


@DifferentialityDevelopment
Contributor

Hi there

I'm in the middle of converting Llama 3 70B to the distributed format, but the converter prints the following output:

```
Target float type: q40
Target file: D:\Meta-Llama-3-70B-Instruct-Distributed\dllama_original_q40.bin
💿 Chunking model 1/16...
Unknown header key: ffn_dim_multiplier
Unknown header key: multiple_of
Unknown header key: norm_eps
Unknown header key: head_size
{'dim': 8192, 'ffn_dim_multiplier': 1.3, 'multiple_of': 4096, 'n_heads': 64, 'n_kv_heads': 8, 'n_layers': 80, 'norm_eps': 1e-05, 'vocab_size': 128256, 'rope_theta': 500000, 'head_size': 128.0, 'max_seq_len': 2048, 'arch_type': 11259136, 'n_experts': 0, 'n_active_experts': 0, 'hidden_dim': 28672}
🔹 Exporting tok_embeddings.weight torch.Size([16032, 65536])...
Saved f32 tensor in 72.36s, 4202692608 bytes
🔹 Exporting layers.0.attention.wq.weight torch.Size([8192, 8192])...
Saved q40 tensor in 15.90s, 37748736 bytes
🔹 Exporting layers.0.attention.wk.weight torch.Size([1024, 8192])...
Saved q40 tensor in 1.99s, 4718592 bytes
```

Would the converted model still work fine despite these warnings?
The conversion so far is really slow on my machine; it should be done in a couple of hours.
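
For what it's worth, my read of the warnings is that the converter just skips the params.json keys it doesn't map into the binary header and carries on, with roughly this kind of logic (my own sketch of the behaviour, not the actual convert script; the known-key list and function name are illustrative):

```python
# Illustrative sketch only: warn-and-skip handling of extra params.json keys.
KNOWN_HEADER_KEYS = {
    'dim', 'n_layers', 'n_heads', 'n_kv_heads', 'vocab_size', 'max_seq_len',
    'hidden_dim', 'rope_theta', 'arch_type', 'n_experts', 'n_active_experts',
}

def build_header(params: dict) -> dict:
    header = {}
    for key, value in params.items():
        if key not in KNOWN_HEADER_KEYS:
            print(f'Unknown header key: {key}')  # warning only; the key is ignored
            continue
        header[key] = value
    return header
```

So ffn_dim_multiplier, multiple_of, norm_eps and head_size would simply not end up in the header, which is why I'm asking whether the converted model is still usable.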

@b4rtaz
Owner

b4rtaz commented May 8, 2024

Hello @DifferentialityDevelopment, yes, it should be fine. The converter is slow; that part is not optimized at all yet.
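
For context on the speed: each weight tensor gets quantized block by block to the Q40 format (18 bytes per 32 weights, which lines up with the 37748736 bytes reported above for the 8192×8192 layer), and doing that in plain Python/NumPy is just slow. The per-block work looks roughly like the sketch below; this is illustrative rather than the actual converter code, and the exact scale choice and nibble packing order may differ:

```python
import numpy as np

def quantize_q40_block(block: np.ndarray) -> bytes:
    """Quantize 32 float32 weights into 18 bytes: a float16 scale + 16 packed 4-bit values."""
    assert block.shape == (32,)
    max_val = block[np.argmax(np.abs(block))]      # value with the largest magnitude
    d = max_val / -8.0 if max_val != 0 else 1.0    # scale so quantized values land in [0, 15]
    q = np.clip(np.round(block / d) + 8, 0, 15).astype(np.uint8)
    packed = q[0::2] | (q[1::2] << 4)              # two 4-bit values per byte (packing order assumed)
    return np.float16(d).tobytes() + packed.astype(np.uint8).tobytes()
```

An 8192×8192 matrix alone is about 2 million such blocks, so interpreted per-block work adds up quickly.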
