Bo仔很忙 edited this page Apr 26, 2024 · 1 revision

Pretrained Weights

  • Pretrained models can be loaded in several ways:
from bert4torch.models import build_transformer_model

# 1. config_path only: initialize the model structure from scratch, without loading pretrained weights
model = build_transformer_model('./model/bert4torch_config.json')

# 2. checkpoint_path only:
## 2.1 folder path: automatically finds the *.bin/*.safetensors weight files plus the bert4torch_config.json/config.json file in that folder
model = build_transformer_model(checkpoint_path='./model')

## 2.2 file path / list of paths: the path(s) are the weight file(s); the config is looked up in the same directory
model = build_transformer_model(checkpoint_path='./pytorch_model.bin')

## 2.3 model_name: name of a pretrained weight on the Hugging Face hub; the weights and the bert4torch_config.json file are downloaded automatically
model = build_transformer_model(checkpoint_path='bert-base-chinese')

# 3. both config_path and checkpoint_path (any combination of local paths and model_name):
config_path = './model/bert4torch_config.json'  # or 'bert-base-chinese'
checkpoint_path = './model/pytorch_model.bin'  # or 'bert-base-chinese'
model = build_transformer_model(config_path, checkpoint_path)
| Category | Model | Source | Weight link / checkpoint_path | config_path |
| --- | --- | --- | --- | --- |
| bert | bert-base-chinese | google-bert | bert-base-chinese | bert-base-chinese |
| bert | chinese_L-12_H-768_A-12 | Google | tf, Tongjilibo/bert-chinese_L-12_H-768_A-12 | |
| bert | chinese-bert-wwm-ext | HFL | hfl/chinese-bert-wwm-ext | chinese-bert-wwm-ext |
| bert | bert-base-multilingual-cased | google-bert | bert-base-multilingual-cased | bert-base-multilingual-cased |
| bert | MacBERT | HFL | hfl/chinese-macbert-base, hfl/chinese-macbert-large | chinese-macbert-base, chinese-macbert-large |
| bert | WoBERT | Zhuiyi Technology | junnyu/wobert_chinese_base, junnyu/wobert_chinese_plus_base | wobert_chinese_base, wobert_chinese_plus_base |
| roberta | chinese-roberta-wwm-ext | HFL | hfl/chinese-roberta-wwm-ext, hfl/chinese-roberta-wwm-ext-large | chinese-roberta-wwm-ext, chinese-roberta-wwm-ext-large |
| roberta | roberta-small/tiny | Zhuiyi Technology | Tongjilibo/chinese_roberta_L-4_H-312_A-12, Tongjilibo/chinese_roberta_L-6_H-384_A-12 | |
| roberta | roberta-base | FacebookAI | roberta-base | roberta-base |
| roberta | guwenbert | ethanyt | ethanyt/guwenbert-base | guwenbert-base |
| albert | albert_zh | brightmart | torch, voidful/albert_chinese_tiny, voidful/albert_chinese_small, voidful/albert_chinese_base, voidful/albert_chinese_large, voidful/albert_chinese_xlarge, voidful/albert_chinese_xxlarge | albert_chinese_tiny, albert_chinese_small, albert_chinese_base, albert_chinese_large, albert_chinese_xlarge, albert_chinese_xxlarge |
| nezha | NEZHA | huawei_noah | torch, sijunhe/nezha-cn-base, sijunhe/nezha-cn-large, sijunhe/nezha-base-wwm, sijunhe/nezha-large-wwm | nezha-cn-base, nezha-cn-large, nezha-base-wwm, nezha-large-wwm |
| nezha | nezha_gpt_dialog | bojone | Tongjilibo/nezha_gpt_dialog | |
| xlnet | Chinese-XLNet | HFL | hfl/chinese-xlnet-base | chinese-xlnet-base |
| xlnet | transformer_xl | huggingface | transfo-xl/transfo-xl-wt103 | transfo-xl-wt103 |
| deberta | Erlangshen-DeBERTa-v2 | IDEA | IDEA-CCNL/Erlangshen-DeBERTa-v2-97M-Chinese, IDEA-CCNL/Erlangshen-DeBERTa-v2-320M-Chinese, IDEA-CCNL/Erlangshen-DeBERTa-v2-710M-Chinese | Erlangshen-DeBERTa-v2-97M-Chinese, Erlangshen-DeBERTa-v2-320M-Chinese, Erlangshen-DeBERTa-v2-710M-Chinese |
| electra | Chinese-ELECTRA | HFL | hfl/chinese-electra-base-discriminator | chinese-electra-base-discriminator |
| ernie | ernie | Baidu ERNIE | nghuyong/ernie-1.0-base-zh, nghuyong/ernie-3.0-base-zh | ernie-1.0-base-zh, ernie-3.0-base-zh |
| roformer | roformer | Zhuiyi Technology | junnyu/roformer_chinese_base | roformer_chinese_base |
| roformer | roformer_v2 | Zhuiyi Technology | junnyu/roformer_v2_chinese_char_base | roformer_v2_chinese_char_base |
| simbert | simbert | Zhuiyi Technology | Tongjilibo/simbert-chinese-base, Tongjilibo/simbert-chinese-small, Tongjilibo/simbert-chinese-tiny | |
| simbert | simbert_v2/roformer-sim | Zhuiyi Technology | junnyu/roformer_chinese_sim_char_base, junnyu/roformer_chinese_sim_char_ft_base, junnyu/roformer_chinese_sim_char_small, junnyu/roformer_chinese_sim_char_ft_small | roformer_chinese_sim_char_base, roformer_chinese_sim_char_ft_base, roformer_chinese_sim_char_small, roformer_chinese_sim_char_ft_small |
| gau | GAU-alpha | Zhuiyi Technology | Tongjilibo/chinese_GAU-alpha-char_L-24_H-768 | |
| uie | uie | Baidu | torch, Tongjilibo/uie-base | |
| gpt | CDial-GPT | thu-coai | thu-coai/CDial-GPT_LCCC-base, thu-coai/CDial-GPT_LCCC-large | CDial-GPT_LCCC-base, CDial-GPT_LCCC-large |
| gpt | cpm_lm (2.6B) | Tsinghua | TsinghuaAI/CPM-Generate | CPM-Generate |
| gpt | nezha_gen | huawei_noah | Tongjilibo/chinese_nezha_gpt_L-12_H-768_A-12 | |
| gpt | gpt2-chinese-cluecorpussmall | UER | uer/gpt2-chinese-cluecorpussmall | gpt2-chinese-cluecorpussmall |
| gpt | gpt2-ml | imcaspar | torch, BaiduYun(84dh) | gpt2-ml_15g_corpus, gpt2-ml_30g_corpus |
| bart | bart_base_chinese | Fudan fnlp | v1.0, fnlp/bart-base-chinese | bart-base-chinese, bart-base-chinese-v1.0 |
| t5 | t5 | UER | uer/t5-small-chinese-cluecorpussmall, uer/t5-base-chinese-cluecorpussmall | t5-base-chinese-cluecorpussmall, t5-small-chinese-cluecorpussmall |
| t5 | mt5 | Google | google/mt5-base | mt5-base |
| t5 | t5_pegasus | Zhuiyi Technology | Tongjilibo/chinese_t5_pegasus_small, Tongjilibo/chinese_t5_pegasus_base | |
| t5 | chatyuan | clue-ai | ClueAI/ChatYuan-large-v1, ClueAI/ChatYuan-large-v2 | ChatYuan-large-v1, ChatYuan-large-v2 |
| t5 | PromptCLUE | clue-ai | ClueAI/PromptCLUE-base | PromptCLUE-base |
| chatglm | chatglm-6b | THUDM | THUDM/chatglm-6b, THUDM/chatglm-6b-int8, THUDM/chatglm-6b-int4, v0.1.0 | chatglm-6b, chatglm-6b-int8, chatglm-6b-int4, chatglm-6b-v0.1.0 |
| chatglm | chatglm2-6b | THUDM | THUDM/chatglm2-6b, THUDM/chatglm2-6b-int4, THUDM/chatglm2-6b-32k | chatglm2-6b, chatglm2-6b-int4, chatglm2-6b-32k |
| chatglm | chatglm3-6b | THUDM | THUDM/chatglm3-6b, THUDM/chatglm3-6b-32k | chatglm3-6b, chatglm3-6b-32k |
| llama | llama | meta | | llama-7b, llama-13b |
| llama | llama-2 | meta | meta-llama/Llama-2-7b-hf, meta-llama/Llama-2-7b-chat-hf, meta-llama/Llama-2-13b-hf, meta-llama/Llama-2-13b-chat-hf | Llama-2-7b-hf, Llama-2-7b-chat-hf, Llama-2-13b-hf, Llama-2-13b-chat-hf |
| llama | llama-3 | meta | meta-llama/Meta-Llama-3-8B, meta-llama/Meta-Llama-3-8B-Instruct | Meta-Llama-3-8B, Meta-Llama-3-8B-Instruct |
| llama | Chinese-LLaMA-Alpaca | HFL | | chinese_alpaca_plus_7b, chinese_llama_plus_7b |
| llama | Belle_llama | LianjiaTech | BelleGroup/BELLE-LLaMA-7B-2M-enc (merge instructions) | BELLE-LLaMA-7B-2M-enc |
| llama | Ziya | IDEA-CCNL | IDEA-CCNL/Ziya-LLaMA-13B-v1, IDEA-CCNL/Ziya-LLaMA-13B-v1.1, IDEA-CCNL/Ziya-LLaMA-13B-Pretrain-v1 | Ziya-LLaMA-13B-v1, Ziya-LLaMA-13B-v1.1 |
| llama | Baichuan | baichuan-inc | baichuan-inc/Baichuan-7B, baichuan-inc/Baichuan-13B-Base, baichuan-inc/Baichuan-13B-Chat | Baichuan-7B, Baichuan-13B-Base, Baichuan-13B-Chat |
| llama | Baichuan2 | baichuan-inc | baichuan-inc/Baichuan2-7B-Base, baichuan-inc/Baichuan2-7B-Chat, baichuan-inc/Baichuan2-13B-Base, baichuan-inc/Baichuan2-13B-Chat | Baichuan2-7B-Base, Baichuan2-7B-Chat, Baichuan2-13B-Base, Baichuan2-13B-Chat |
| llama | vicuna | lmsys | lmsys/vicuna-7b-v1.5 | vicuna-7b-v1.5 |
| llama | Yi | 01-ai | 01-ai/Yi-6B, 01-ai/Yi-6B-200K | Yi-6B, Yi-6B-200K |
| bloom | bloom | bigscience | bigscience/bloom-560m, bigscience/bloomz-560m | bloom-560m, bloomz-560m |
| Qwen | Qwen | Alibaba Cloud | Qwen/Qwen-1_8B, Qwen/Qwen-1_8B-Chat, Qwen/Qwen-7B, Qwen/Qwen-7B-Chat, Qwen/Qwen-14B, Qwen/Qwen-14B-Chat | Qwen-1_8B, Qwen-1_8B-Chat, Qwen-7B, Qwen-7B-Chat, Qwen-14B, Qwen-14B-Chat |
| Qwen | Qwen1.5 | Alibaba Cloud | | |
| InternLM | InternLM | Shanghai AI Laboratory | internlm/internlm-chat-7b, internlm/internlm-7b | internlm-7b, internlm-chat-7b |
| Falcon | Falcon | tiiuae | tiiuae/falcon-rw-1b, tiiuae/falcon-7b, tiiuae/falcon-7b-instruct | falcon-rw-1b, falcon-7b, falcon-7b-instruct |
| moe | deepseek-moe | deepseek | deepseek-ai/deepseek-moe-16b-base, deepseek-ai/deepseek-moe-16b-chat | deepseek-moe-16b-base, deepseek-moe-16b-chat |
| embedding | text2vec-base-chinese | shibing624 | shibing624/text2vec-base-chinese | text2vec-base-chinese |
| embedding | m3e | moka-ai | moka-ai/m3e-base | m3e-base |
| embedding | bge | BAAI | BAAI/bge-large-en-v1.5, BAAI/bge-large-zh-v1.5 | bge-large-en-v1.5, bge-large-zh-v1.5 |
| embedding | gte | thenlper | thenlper/gte-large-zh, thenlper/gte-base-zh | gte-base-zh, gte-large-zh |

*Note:

  1. Entries shown in highlighted format (e.g. bert-base-chinese) can be downloaded directly via build_transformer_model()
  2. To speed up downloads through a mirror site (for users in mainland China):
    • HF_ENDPOINT=https://hf-mirror.com python your_script.py
    • run export HF_ENDPOINT=https://hf-mirror.com first, then execute the python code
    • set it at the top of the python code:
    import os
    os.environ['HF_ENDPOINT'] = "https://hf-mirror.com"
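One caveat with the in-code variant: huggingface_hub reads HF_ENDPOINT from the environment when it is first imported, so the assignment should come before any import that can trigger a download. A minimal sketch of the required ordering:

```python
import os

# Must run before importing bert4torch / transformers / huggingface_hub:
# the mirror endpoint is read from the environment when those modules load.
os.environ['HF_ENDPOINT'] = "https://hf-mirror.com"

# Imports and downloads come afterwards, e.g.:
# from bert4torch.models import build_transformer_model
# model = build_transformer_model(checkpoint_path='bert-base-chinese')
```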