Adapter-LoRa Beta Version for Quantization

LoRa-Logo


Coming Features

To support LLM models efficiently, the upcoming features will make the library friendly and easy to use, similar to Hugging Face PEFT:

  • LoRaConfig
  • Quantize specific layers
  • Adjust the hyper-parameters already instantiated by the original model

Usage

LoRaConfig = AdapterLoRA.LoRaConfig(
            method = "LoRa",
            Rank = 4,
            Instance_Layer = "auto",
            layertyep = ["nn.Lieaner","nn.Embedding"],
            LORA = True,
            BITSAND = False,
            bit8_int = True
)

Adpate_model = AdapterLoRa(model , Config=LoRaConfig, device="cuda")

Features

  • LoRALib Approach: This approach computes $xW_0^T$ and $x(BA)^T$ separately and then sums them. It is particularly suitable for linear layers and offers accurate computation of LoRA-enhanced layers.

  • LoRATorch Approach: Here the pre-trained weight $W_0$ is merged with its LoRA update $BA$, giving the combined weight matrix $W_0 + \frac{\alpha}{r} BA$. This makes it straightforward to extend LoRA to more complex and non-linear layers within the PyTorch ecosystem.

Mathematical Formulation

  1. LoRALib Approach:

    The computation is defined as:

    $h = xW_0^T + \frac{\alpha}{r} x(BA)^T$

    where:

    • $x$ is the input matrix of dimensions $k \times n$,
    • $W_0$ is a pre-trained weight matrix of dimensions $m \times n$,
    • $r$ is a predefined LoRA rank,
    • $B$ and $A$ are LoRA matrices of dimensions $m \times r$ and $r \times n$ respectively,
    • $\alpha$ is a hyper-parameter.

  2. LoRATorch Approach:

    The computation is defined as:

    $h = x\left(W_0 + \frac{\alpha}{r} BA\right)^T$

    where $x$, $W_0$, $r$, $B$, $A$, and $\alpha$ are defined as above. A numerical check of both formulas follows below.
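
The two formulas can be checked numerically with plain PyTorch tensors. The sketch below is illustrative only and does not use the library's classes; the names x, W0, A, B, alpha, and r follow the definitions above, and the concrete dimensions are arbitrary.

import torch

k, n, m, r, alpha = 2, 16, 32, 4, 8

x  = torch.randn(k, n)   # input matrix, k x n
W0 = torch.randn(m, n)   # pre-trained weight, m x n
B  = torch.randn(m, r)   # LoRA matrix B, m x r
A  = torch.randn(r, n)   # LoRA matrix A, r x n

# LoRALib approach: compute the two products separately, then sum
h_lib = x @ W0.T + (alpha / r) * (x @ (B @ A).T)

# LoRATorch approach: merge the LoRA update into the weight first
h_torch = x @ (W0 + (alpha / r) * (B @ A)).T

print(torch.allclose(h_lib, h_torch, atol=1e-5))  # True: both yield the same h

Both approaches produce the same output; they differ only in where the merge happens, which is what makes the merged (loratorch) form easier to extend to non-linear layers.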

Usage

  1. AdapterLoRa Class: The AdapterLoRa class provides a versatile interface for applying LoRA adaptation to neural networks. It supports both the loralib and loratorch approaches and can reconstruct and implement LoRA-adapted models.

  2. Adapting Layers: The add_layer_and_Instance_Layer method lets you specify the layers to adapt through the layertyep and layer parameters, tailoring the LoRA application to specific layers of your model.

  3. Freezing Weights: The freeze_weights method freezes the original model weights, improving stability and allowing safer adaptation.

  4. Reconstructing and Implementing LoRA: The reconstruct_model method applies the LoRA adaptation to the model, and the implement_lora method then implements LoRA and manages the trainable parameters (a combined sketch follows this list).
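
Taken together, these methods suggest the following workflow. This is a sketch based only on the method names described in this README (add_layer_and_Instance_Layer, freeze_weights, reconstruct_model, implement_lora); the exact signatures, in particular the keyword arguments of add_layer_and_Instance_Layer, are assumptions and may differ in the released package.

import torch.nn as nn
from core.Quantized import AdapterLoRa

model = nn.TransformerEncoderLayer(d_model=512, nhead=8)
adapted_model = AdapterLoRa(model, method="LoRa", Rank=4)

# Specify which layer types / named layers to adapt
# (keyword names assumed from the description of add_layer_and_Instance_Layer above)
adapted_model.add_layer_and_Instance_Layer(layertyep=["nn.Linear"], layer="self_attn")

# Optionally freeze the original model weights before adapting
adapted_model.freeze_weights()

# Apply the LoRA adaptation and obtain the model with trainable LoRA parameters
adapted_model.reconstruct_model()
model = adapted_model.implement_lora(verbose=True)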

Supported Layers

Layer                   loralib          loratorch        Example
nn.Linear               ✓                ✓                linear.ipynb
nn.Embedding            ✓                ✓                embedding.ipynb
nn.Conv1d               ✓                ✓
nn.Conv2d               ✓                ✓
nn.Conv3d               ✓                ✓
nn.MultiheadAttention   ✗                ✓
MergedLinear            ✓ (Error)        ✓                mergedlinear.ipynb
⋯                       hard to extend   easy to extend

Quick Start

How to use AdapterLoRa:

  1. Install AdapterLoRa.

pip install git+https://github.com/Baijiong-Lin/LoRA-Torch
pip install AdapterLoRa

Using the AdapterLoRa Tool

import torch
import torch.nn as nn
from core.Quantized import AdapterLoRa

model = nn.TransformerEncoderLayer(d_model=512, nhead=8)

adapted_model = AdapterLoRa(model, method="LoRa", Rank=4)

"""
Add the linear layers built into self-attention.
Replace the layers where you would like to use AdapterLoRa by using the add_layer function.
"""
adapted_model.add_layer("self_attn")
adapted_model.add_layer("linear1")
adapted_model.add_layer("linear2")

# Reconstruct the model with the quantized/adapted layers
adapted_model.reconstruct_model()

# Implement the LoRA method
model = adapted_model.implement_lora(verbose=True)
# Total trainable parameters before LoRA: 3176960
# Total trainable parameters after LoRA: 24576

# This sets requires_grad to False for all parameters without the string "lora_" in their names

# Training loop (see the fuller sketch below)
for batch in dataloader:
    model.train()
    # ... forward pass, loss computation, and optimizer step go here
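
The training loop above is only a stub. A fuller training step might look like the following sketch; the dataloader, loss function, and optimizer are hypothetical placeholders and are not part of AdapterLoRa.

import torch

# After implement_lora, only parameters with "lora_" in their names require
# gradients, so this optimizer effectively updates the LoRA matrices only.
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)
loss_fn = torch.nn.MSELoss()  # hypothetical objective

model.train()
for inputs, targets in dataloader:   # hypothetical dataloader
    optimizer.zero_grad()
    outputs = model(inputs)
    loss = loss_fn(outputs, targets)
    loss.backward()
    optimizer.step()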

Saving Model Weights

  • Save the LoRA model (only the LoRA matrices will be saved).
import loralib as lora 
# ===== Before =====
# torch.save(model.state_dict(), checkpoint_path)
# ===== After =====
torch.save(lora.lora_state_dict(model), checkpoint_path)

Loading the Pre-Trained Model

  • Load the LoRA model (the pre-trained model must be loaded first).
import loralib as lora 
# Load the pre-trained checkpoint first
model.load_state_dict(torch.load('ckpt_pretrained.pt'), strict=False)
# Then load the LoRA checkpoint
model.load_state_dict(torch.load('ckpt_lora.pt'), strict=False)

  • Quantized Model

  • Time to Train

  • Cost to Train

What's in it for you?

For each of the pillars above, we are sharing our codebase and insights to:

  • Help you leverage Transformer-based models for your machine-learning needs and challenges

  • Boost reproducibility efforts, which are becoming increasingly difficult with Transformers

I am providing tools that are ready to use for quantizing the model:

  • Fine-tuning Transformer-based models on your proprietary dataset via PEFT methodologies such as LoRA and QLoRA

  • Performing hyperparameter optimization to get the maximum performance out of these models

What's the best way to use this repository?

Go to the directory for the Transformer-based model you are interested in and open its README.md. We have included details about the LLMs, followed by performance results on open-source datasets!

Methods Supporting Quantization

The supported methods for quantizing Transformer-based models (a configuration sketch follows the list):

  • LoRa
  • LoRaTorch
  • QLoRA
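
Based on the LoRaConfig fields shown earlier (LORA, BITSAND, bit8_int), selecting a quantization-aware setup presumably looks like the sketch below. The mapping of these flags to the QLoRA method is an assumption, not a documented API, so verify it against the package before relying on it; AdapterLoRa and model are as in the Quick Start example above.

# Assumed flag mapping (not verified against the released package):
#   plain LoRa / LoRaTorch : LORA=True, BITSAND=False
#   QLoRA-style 8-bit      : LORA=True, BITSAND=True, bit8_int=True
qlora_config = AdapterLoRa.LoRaConfig(
    method = "LoRa",
    Rank = 4,
    Instance_Layer = "auto",
    layertyep = ["nn.Linear"],
    LORA = True,
    BITSAND = True,
    bit8_int = True
)

adapted_model = AdapterLoRa(model, Config=qlora_config, device="cuda")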

Roadmap

We plan to run these experiments on all of the Transformer-based models below. To that end, this is a tentative roadmap of the models we aim to cover:

  • TransformerEncoder
  • TransformerDecoder
  • Vision Transformer
  • minGPT
  • OpenAI GPT-2
  • Inflection Pi (in progress)

Correspondence

Contributor

AdapterLoRa is developed and maintained by Youness ELbrag (Email | LinkedIn).
