Bo仔很忙 edited this page Apr 26, 2024 · 1 revision

Pretrained Weights

  • Pretrained models can be loaded in several ways:
from bert4torch.models import build_transformer_model

# 1. config_path only: initialize the model structure from scratch, without loading pretrained weights
model = build_transformer_model('./model/bert4torch_config.json')

# 2. checkpoint_path only:
## 2.1 folder path: automatically finds the *.bin/*.safetensors weight files plus the bert4torch_config.json/config.json file in that folder
model = build_transformer_model(checkpoint_path='./model')

## 2.2 file path / list of paths: the path(s) are the weight file(s); the config is looked up in the same directory
model = build_transformer_model(checkpoint_path='./pytorch_model.bin')

## 2.3 model_name: name of a pretrained weight on the Hugging Face hub; the weights and the bert4torch_config.json file are downloaded automatically
model = build_transformer_model(checkpoint_path='bert-base-chinese')

# 3. both config_path and checkpoint_path (any combination of local paths and model_name):
config_path = './model/bert4torch_config.json'  # or 'bert-base-chinese'
checkpoint_path = './model/pytorch_model.bin'  # or 'bert-base-chinese'
model = build_transformer_model(config_path, checkpoint_path)
| Category | Model | Source | Weight link / checkpoint_path | config_path |
| --- | --- | --- | --- | --- |
| bert | bert-base-chinese | google-bert | bert-base-chinese | bert-base-chinese |
| bert | chinese_L-12_H-768_A-12 | Google | tf, Tongjilibo/bert-chinese_L-12_H-768_A-12 | |
| bert | chinese-bert-wwm-ext | HFL | hfl/chinese-bert-wwm-ext | chinese-bert-wwm-ext |
| bert | bert-base-multilingual-cased | google-bert | bert-base-multilingual-cased | bert-base-multilingual-cased |
| bert | MacBERT | HFL | hfl/chinese-macbert-base, hfl/chinese-macbert-large | chinese-macbert-base, chinese-macbert-large |
| bert | WoBERT | Zhuiyi Technology | junnyu/wobert_chinese_base, junnyu/wobert_chinese_plus_base | wobert_chinese_base, wobert_chinese_plus_base |
| roberta | chinese-roberta-wwm-ext | HFL | hfl/chinese-roberta-wwm-ext, hfl/chinese-roberta-wwm-ext-large | chinese-roberta-wwm-ext, chinese-roberta-wwm-ext-large |
| roberta | roberta-small/tiny | Zhuiyi Technology | Tongjilibo/chinese_roberta_L-4_H-312_A-12, Tongjilibo/chinese_roberta_L-6_H-384_A-12 | |
| roberta | roberta-base | FacebookAI | roberta-base | roberta-base |
| roberta | guwenbert | ethanyt | ethanyt/guwenbert-base | guwenbert-base |
| albert | albert_zh | brightmart | torch, voidful/albert_chinese_tiny, voidful/albert_chinese_small, voidful/albert_chinese_base, voidful/albert_chinese_large, voidful/albert_chinese_xlarge, voidful/albert_chinese_xxlarge | albert_chinese_tiny, albert_chinese_small, albert_chinese_base, albert_chinese_large, albert_chinese_xlarge, albert_chinese_xxlarge |
| nezha | NEZHA | huawei_noah | torch, sijunhe/nezha-cn-base, sijunhe/nezha-cn-large, sijunhe/nezha-base-wwm, sijunhe/nezha-large-wwm | nezha-cn-base, nezha-cn-large, nezha-base-wwm, nezha-large-wwm |
| nezha | nezha_gpt_dialog | bojone | Tongjilibo/nezha_gpt_dialog | |
| xlnet | Chinese-XLNet | HFL | hfl/chinese-xlnet-base | chinese-xlnet-base |
| xlnet | transformer_xl | huggingface | transfo-xl/transfo-xl-wt103 | transfo-xl-wt103 |
| deberta | Erlangshen-DeBERTa-v2 | IDEA | IDEA-CCNL/Erlangshen-DeBERTa-v2-97M-Chinese, IDEA-CCNL/Erlangshen-DeBERTa-v2-320M-Chinese, IDEA-CCNL/Erlangshen-DeBERTa-v2-710M-Chinese | Erlangshen-DeBERTa-v2-97M-Chinese, Erlangshen-DeBERTa-v2-320M-Chinese, Erlangshen-DeBERTa-v2-710M-Chinese |
| electra | Chinese-ELECTRA | HFL | hfl/chinese-electra-base-discriminator | chinese-electra-base-discriminator |
| ernie | ernie | Baidu ERNIE | nghuyong/ernie-1.0-base-zh, nghuyong/ernie-3.0-base-zh | ernie-1.0-base-zh, ernie-3.0-base-zh |
| roformer | roformer | Zhuiyi Technology | junnyu/roformer_chinese_base | roformer_chinese_base |
| roformer | roformer_v2 | Zhuiyi Technology | junnyu/roformer_v2_chinese_char_base | roformer_v2_chinese_char_base |
| simbert | simbert | Zhuiyi Technology | Tongjilibo/simbert-chinese-base, Tongjilibo/simbert-chinese-small, Tongjilibo/simbert-chinese-tiny | |
| simbert | simbert_v2/roformer-sim | Zhuiyi Technology | junnyu/roformer_chinese_sim_char_base, junnyu/roformer_chinese_sim_char_ft_base, junnyu/roformer_chinese_sim_char_small, junnyu/roformer_chinese_sim_char_ft_small | roformer_chinese_sim_char_base, roformer_chinese_sim_char_ft_base, roformer_chinese_sim_char_small, roformer_chinese_sim_char_ft_small |
| gau | GAU-alpha | Zhuiyi Technology | Tongjilibo/chinese_GAU-alpha-char_L-24_H-768 | |
| uie | uie | Baidu | torch, Tongjilibo/uie-base | |
| gpt | CDial-GPT | thu-coai | thu-coai/CDial-GPT_LCCC-base, thu-coai/CDial-GPT_LCCC-large | CDial-GPT_LCCC-base, CDial-GPT_LCCC-large |
| gpt | cpm_lm (2.6B) | Tsinghua | TsinghuaAI/CPM-Generate | CPM-Generate |
| gpt | nezha_gen | huawei_noah | Tongjilibo/chinese_nezha_gpt_L-12_H-768_A-12 | |
| gpt | gpt2-chinese-cluecorpussmall | UER | uer/gpt2-chinese-cluecorpussmall | gpt2-chinese-cluecorpussmall |
| gpt | gpt2-ml | imcaspar | torch, BaiduYun(84dh) | gpt2-ml_15g_corpus, gpt2-ml_30g_corpus |
| bart | bart_base_chinese | Fudan fnlp | v1.0, fnlp/bart-base-chinese | bart-base-chinese, bart-base-chinese-v1.0 |
| t5 | t5 | UER | uer/t5-small-chinese-cluecorpussmall, uer/t5-base-chinese-cluecorpussmall | t5-base-chinese-cluecorpussmall, t5-small-chinese-cluecorpussmall |
| t5 | mt5 | Google | google/mt5-base | mt5-base |
| t5 | t5_pegasus | Zhuiyi Technology | Tongjilibo/chinese_t5_pegasus_small, Tongjilibo/chinese_t5_pegasus_base | |
| t5 | chatyuan | clue-ai | ClueAI/ChatYuan-large-v1, ClueAI/ChatYuan-large-v2 | ChatYuan-large-v1, ChatYuan-large-v2 |
| t5 | PromptCLUE | clue-ai | ClueAI/PromptCLUE-base | PromptCLUE-base |
| chatglm | chatglm-6b | THUDM | THUDM/chatglm-6b, THUDM/chatglm-6b-int8, THUDM/chatglm-6b-int4, v0.1.0 | chatglm-6b, chatglm-6b-int8, chatglm-6b-int4, chatglm-6b-v0.1.0 |
| chatglm | chatglm2-6b | THUDM | THUDM/chatglm2-6b, THUDM/chatglm2-6b-int4, THUDM/chatglm2-6b-32k | chatglm2-6b, chatglm2-6b-int4, chatglm2-6b-32k |
| chatglm | chatglm3-6b | THUDM | THUDM/chatglm3-6b, THUDM/chatglm3-6b-32k | chatglm3-6b, chatglm3-6b-32k |
| llama | llama | meta | | llama-7b, llama-13b |
| llama | llama-2 | meta | meta-llama/Llama-2-7b-hf, meta-llama/Llama-2-7b-chat-hf, meta-llama/Llama-2-13b-hf, meta-llama/Llama-2-13b-chat-hf | Llama-2-7b-hf, Llama-2-7b-chat-hf, Llama-2-13b-hf, Llama-2-13b-chat-hf |
| llama | llama-3 | meta | meta-llama/Meta-Llama-3-8B, meta-llama/Meta-Llama-3-8B-Instruct | Meta-Llama-3-8B, Meta-Llama-3-8B-Instruct |
| llama | Chinese-LLaMA-Alpaca | HFL | | chinese_alpaca_plus_7b, chinese_llama_plus_7b |
| llama | Belle_llama | LianjiaTech | BelleGroup/BELLE-LLaMA-7B-2M-enc (merge instructions) | BELLE-LLaMA-7B-2M-enc |
| llama | Ziya | IDEA-CCNL | IDEA-CCNL/Ziya-LLaMA-13B-v1, IDEA-CCNL/Ziya-LLaMA-13B-v1.1, IDEA-CCNL/Ziya-LLaMA-13B-Pretrain-v1 | Ziya-LLaMA-13B-v1, Ziya-LLaMA-13B-v1.1 |
| llama | Baichuan | baichuan-inc | baichuan-inc/Baichuan-7B, baichuan-inc/Baichuan-13B-Base, baichuan-inc/Baichuan-13B-Chat | Baichuan-7B, Baichuan-13B-Base, Baichuan-13B-Chat |
| llama | Baichuan2 | baichuan-inc | baichuan-inc/Baichuan2-7B-Base, baichuan-inc/Baichuan2-7B-Chat, baichuan-inc/Baichuan2-13B-Base, baichuan-inc/Baichuan2-13B-Chat | Baichuan2-7B-Base, Baichuan2-7B-Chat, Baichuan2-13B-Base, Baichuan2-13B-Chat |
| llama | vicuna | lmsys | lmsys/vicuna-7b-v1.5 | vicuna-7b-v1.5 |
| llama | Yi | 01-ai | 01-ai/Yi-6B, 01-ai/Yi-6B-200K | Yi-6B, Yi-6B-200K |
| bloom | bloom | bigscience | bigscience/bloom-560m, bigscience/bloomz-560m | bloom-560m, bloomz-560m |
| Qwen | Qwen | Alibaba Cloud | Qwen/Qwen-1_8B, Qwen/Qwen-1_8B-Chat, Qwen/Qwen-7B, Qwen/Qwen-7B-Chat, Qwen/Qwen-14B, Qwen/Qwen-14B-Chat | Qwen-1_8B, Qwen-1_8B-Chat, Qwen-7B, Qwen-7B-Chat, Qwen-14B, Qwen-14B-Chat |
| Qwen | Qwen1.5 | Alibaba Cloud | | |
| InternLM | InternLM | Shanghai AI Laboratory | internlm/internlm-chat-7b, internlm/internlm-7b | internlm-7b, internlm-chat-7b |
| Falcon | Falcon | tiiuae | tiiuae/falcon-rw-1b, tiiuae/falcon-7b, tiiuae/falcon-7b-instruct | falcon-rw-1b, falcon-7b, falcon-7b-instruct |
| moe | deepseek-moe | deepseek | deepseek-ai/deepseek-moe-16b-base, deepseek-ai/deepseek-moe-16b-chat | deepseek-moe-16b-base, deepseek-moe-16b-chat |
| embedding | text2vec-base-chinese | shibing624 | shibing624/text2vec-base-chinese | text2vec-base-chinese |
| embedding | m3e | moka-ai | moka-ai/m3e-base | m3e-base |
| embedding | bge | BAAI | BAAI/bge-large-en-v1.5, BAAI/bge-large-zh-v1.5 | bge-large-en-v1.5, bge-large-zh-v1.5 |
| embedding | gte | thenlper | thenlper/gte-large-zh, thenlper/gte-base-zh | gte-base-zh, gte-large-zh |

*Note:

  1. Entries shown in highlighted format (e.g. bert-base-chinese) can be downloaded directly via build_transformer_model()
  2. To speed up downloads through a mirror site (for users in mainland China):
    • HF_ENDPOINT=https://hf-mirror.com python your_script.py
    • run export HF_ENDPOINT=https://hf-mirror.com first, then execute the python code
    • set it at the top of the python code:
    import os
    os.environ['HF_ENDPOINT'] = "https://hf-mirror.com"
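One caveat with the in-code variant: huggingface_hub reads HF_ENDPOINT from the environment when it is first imported, so the assignment should come before any import that can trigger a download. A minimal sketch of the required ordering:

```python
import os

# Must run before importing bert4torch / transformers / huggingface_hub:
# the mirror endpoint is read from the environment when those modules load.
os.environ['HF_ENDPOINT'] = "https://hf-mirror.com"

# Imports and downloads come afterwards, e.g.:
# from bert4torch.models import build_transformer_model
# model = build_transformer_model(checkpoint_path='bert-base-chinese')
```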