BiLLM

Tool for converting LLMs from uni-directional to bi-directional for tasks like classification and sentence embeddings. Compatible with 🤗 transformers.

Supported Models

LLaMA
Mistral
Qwen2
OpenELM

Usage

python -m pip install -U billm
Specify start index for bi-directional layers via export BiLLM_START_INDEX={layer_index}. if not specified, default is 0, i.e., all layers are bi-directional. If set to -1, BiLLM is disabled.
Import LLMs from BiLLM and initialize them as usual with transformers.

- from transformers import (
-    LLamaModel,
-    LLamaForCausalLM,
-    LLamaForSequenceClassification,
-    MistralModel,
-    MistralForCausalLM,
-    MistralForSequenceClassification
-    Qwen2Model,
-    Qwen2ForCausalLM,
-    Qwen2ForSequenceClassification
- )

+ from billm import (
+    LLamaModel,
+    LLamaForCausalLM,
+    LLamaForSequenceClassification,
+    LLamaForTokenClassification,
+    MistralModel,
+    MistralForCausalLM,
+    MistralForSequenceClassification,
+    MistralForTokenClassification,
+    Qwen2Model,
+    Qwen2ForCausalLM,
+    Qwen2ForSequenceClassification,
+    Qwen2ForTokenClassification
+    OpenELMModel,
+    OpenELMForCausalLM,
+    OpenELMForSequenceClassification,
+    OpenELMForTokenClassification
+ )

Examples

NER

training:

$ cd examples
$ WANDB_MODE=disabled BiLLM_START_INDEX=0 CUDA_VISIBLE_DEVICES=3 python billm_ner.py \
--model_name_or_path mistralai/Mistral-7B-v0.1 \
--dataset_name_or_path conll2003 \
--push_to_hub 0

inference:

from transformers import AutoTokenizer, pipeline
from peft import PeftModel, PeftConfig
from billm import MistralForTokenClassification


label2id = {'O': 0, 'B-PER': 1, 'I-PER': 2, 'B-ORG': 3, 'I-ORG': 4, 'B-LOC': 5, 'I-LOC': 6, 'B-MISC': 7, 'I-MISC': 8}
id2label = {v: k for k, v in label2id.items()}
model_id = 'WhereIsAI/billm-mistral-7b-conll03-ner'
tokenizer = AutoTokenizer.from_pretrained(model_id)
peft_config = PeftConfig.from_pretrained(model_id)
model = MistralForTokenClassification.from_pretrained(
    peft_config.base_model_name_or_path,
    num_labels=len(label2id), id2label=id2label, label2id=label2id
)
model = PeftModel.from_pretrained(model, model_id)
# merge and unload is necessary for inference
model = model.merge_and_unload()

token_classifier = pipeline("token-classification", model=model, tokenizer=tokenizer, aggregation_strategy="simple")
sentence = "I live in Hong Kong. I am a student at Hong Kong PolyU."
tokens = token_classifier(sentence)
print(tokens)

Sentence Embeddings

refer to AnglE: https://github.com/SeanLee97/AnglE

Citation

If you use this toolkit in your work, please cite the following paper:

For sentence embeddings modeling:

@inproceedings{li2024bellm,
    title = "BeLLM: Backward Dependency Enhanced Large Language Model for Sentence Embeddings",
    author = "Li, Xianming and Li, Jing",
    booktitle = "Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics",
    year = "2024",
    publisher = "Association for Computational Linguistics"
}

For other tasks:

@article{li2023label,
  title={Label supervised llama finetuning},
  author={Li, Zongxi and Li, Xianming and Liu, Yuzhang and Xie, Haoran and Li, Jing and Wang, Fu-lee and Li, Qing and Zhong, Xiaoqin},
  journal={arXiv preprint arXiv:2310.01208},
  year={2023}
}

Name		Name	Last commit message	Last commit date
Latest commit History 44 Commits
.github/workflows		.github/workflows
examples		examples
src/billm		src/billm
tests		tests
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
pdm.lock		pdm.lock
pyproject.toml		pyproject.toml
setup.cfg		setup.cfg

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

.github/workflows

.github/workflows

examples

examples

src/billm

src/billm

tests

tests

.gitignore

.gitignore

LICENSE

LICENSE

README.md

README.md

pdm.lock

pdm.lock

pyproject.toml

pyproject.toml

setup.cfg

setup.cfg

Repository files navigation

BiLLM

Supported Models

Usage

Examples

NER

Sentence Embeddings

Citation

About

Releases

Packages

Languages

License

WhereIsAI/BiLLM

Folders and files

Latest commit

History

Repository files navigation

BiLLM

Supported Models

Usage

Examples

NER

Sentence Embeddings

Citation

About

Topics

Resources

License

Stars

Watchers

Forks

Languages