
Sync Fork #206 (Draft)

mikecovlee wants to merge 170 commits into base: mikecovlee_dev
Conversation

mikecovlee (Member)

No description provided.

* support classification tasks, add glue config example, support generate without cache

* replace sts-b with mrpc, try fix error

* add mrpc eval

* separate tasks

* replace len(tensor) with shape

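For context, a minimal illustration of the difference this commit targets (the shapes here are made up): `len()` on a tensor only reports the size of the first dimension, so reading `.shape` states the intended dimension explicitly.

```python
import torch

batch = torch.zeros(4, 128, 768)  # (batch_size, seq_len, hidden_dim)

# len(tensor) only reports the first dimension, which is easy to misread:
assert len(batch) == 4

# reading .shape makes the intended dimension explicit:
batch_size, seq_len, _ = batch.shape
assert (batch_size, seq_len) == (4, 128)
```
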
* fix issue

* add evaluate

* fix CausalLM

* support lr scheduler and other stuff

* rearrange codes and APIs

* update requirements

* update LLMModel

* update pyproject

* fix lint error

* fix dataloader of glue tasks

* fix pytest

* now can run train

* fix inference

* fix evaluate and lint error

* fix mixlora configs

* fix error, add hint of max tokens len

* add mmlu

* fix gradient accumulation

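The commit message doesn't say what was broken, but for reference this is the standard gradient-accumulation pattern (toy model and data, not the project's actual trainer); the classic bugs are forgetting to scale the loss or forgetting to reset gradients after the step.

```python
import torch

model = torch.nn.Linear(16, 2)                 # stand-in for the real model
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
loader = [(torch.randn(4, 16), torch.randint(0, 2, (4,))) for _ in range(16)]

accum_steps = 8
optimizer.zero_grad()
for step, (x, y) in enumerate(loader):
    loss = torch.nn.functional.cross_entropy(model(x), y)
    # scale before backward so the accumulated gradient averages micro-batches
    (loss / accum_steps).backward()
    if (step + 1) % accum_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
```
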
* add integration tests

* remove print

* fix bug

* update auto eval

* support config as json

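A minimal sketch of what "config as JSON" typically amounts to (the file name and keys here are hypothetical, not the project's actual schema):

```python
import json

# file name and keys are hypothetical
with open("mixlora_config.json") as f:
    config = json.load(f)

cutoff_len = config.get("cutoff_len", 512)  # fall back to a default
```
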
* add categories

* support max sequence length

* update to match original implementation

* fix model output precision

* rearrange codes

* remove useless assert

* fix inconsistent behaviours for llama-2-hf models

* read max seq len automatically

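One plausible way to read the maximum sequence length automatically via the Hugging Face transformers API, assuming the value is taken from the model config rather than a CLI flag (the checkpoint name is just an example):

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("meta-llama/Llama-2-7b-hf")
# LLaMA-family configs expose the trained context window here:
max_seq_len = config.max_position_embeddings  # 4096 for Llama-2
```
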
* move rope angles into each layer to avoid useless arguments

* remove cache

* add requires_grad

* fix trainer bug

* reduce cutoff_len

* remove regression tasks

* make lint happy
* Integrate evaluation methods and support for LLaMA-compatible models

* performance: add batch lora function
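
A hedged sketch of what a "batch lora function" could look like; one common design routes each batch row through its own adapter in a single forward pass (all names and shapes here are assumptions, not the project's API):

```python
import torch

def batch_lora(x, weight, loras, adapter_idx, scaling=2.0):
    # x: (batch, seq_len, in_dim); weight: (out_dim, in_dim), frozen base.
    # loras: list of (A, B) with A: (rank, in_dim), B: (out_dim, rank).
    # adapter_idx[i] selects which adapter handles batch row i.
    out = x @ weight.T
    for j, (a, b) in enumerate(loras):
        rows = adapter_idx == j
        if rows.any():
            out[rows] += scaling * ((x[rows] @ a.T) @ b.T)
    return out
```
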
* fix ci scripts

* update ci script

* remove setup python

* remove torch specific version

* change to local image

* update

* fix image

* fix flash attn usage on rtx20 cards
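
FlashAttention-2 only ships kernels for Ampere (sm80) and newer, so Turing-era RTX 20 cards need a fallback; a plausible guard looks like this (the "eager" fallback name is an assumption):

```python
import torch

def flash_attention_supported() -> bool:
    # FlashAttention-2 kernels require compute capability >= 8.0 (Ampere);
    # RTX 20-series cards are Turing (7.5), so they need a fallback path.
    if not torch.cuda.is_available():
        return False
    major, _ = torch.cuda.get_device_capability()
    return major >= 8

attn_impl = "flash_attn" if flash_attention_supported() else "eager"
```
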
* support phi

* rename

* rename

* rename

* refactor entire framework

* fix inference

* fix bug

* fix phi model

* fix compatibility with phi model

* fix llama

* support qwen2 and mistral

* support google gemma model

* fix llama flash attention

* support flash attention for phi models

* fix gemma model

* support xformers attention for phi models

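For reference, the core xformers call that such support typically wraps (shapes chosen to resemble phi-2's 32 heads with head_dim 80; this is not the project's exact integration):

```python
import torch
import xformers.ops as xops

# xformers expects (batch, seq_len, num_heads, head_dim)
q = torch.randn(2, 128, 32, 80, device="cuda", dtype=torch.float16)
k, v = torch.randn_like(q), torch.randn_like(q)

out = xops.memory_efficient_attention(
    q, k, v, attn_bias=xops.LowerTriangularMask()  # causal mask
)
```
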
* add phi dummy example

* fix phi mixlora

* improve efficiency

* update docs

* fix router_profile

* rearrange codes

* fix bugs

* replace deprecated apis

* fix launcher of mlora

* support cpu as backend

* update README

* read hidden act from configuration

* fix lint error

* replace encode with official impl

* update ci script

* fix ci script

* add device constraint

* fix ci script

* fix ci script

* add QuestionAnswerTask to global namespaces

* fix mixlora ffn act_fn

* fix config

* fix docs

* support MoRAL (without load balance)

* add intermediate_size

* support hellaswag dataset

* support WinoGrande

* support SIQA

* fix lint error

* improve efficiency of mixlora

* ignore adapters

* replace checkpoint impl with torch

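"Replace checkpoint impl with torch" presumably means delegating to torch.utils.checkpoint rather than a hand-rolled version; a minimal sketch:

```python
import torch
from torch.utils.checkpoint import checkpoint

layer = torch.nn.Linear(64, 64)
x = torch.randn(4, 64, requires_grad=True)

# intermediate activations inside `layer` are recomputed during backward
# instead of being stored, trading compute for memory:
y = checkpoint(layer, x, use_reentrant=False)
y.sum().backward()
```
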
* support router outputs from tail

* fix router loss

* fix model compatibility

* fix bugs

* fix lint error

* fix bugs

* fix ci script

* fix ci script

* update docs

* support medical qa

* use long context when evaluating