Welcome to a friendly neighborhood repository of experiments and adventures in the world of Large Language Models (LLMs). It is no ordinary collection: an alchemical blend of scripts, notebooks, and experiments covering fine-tuning, quantization, data preparation, and retrieval-augmented generation.
Projects | GitHub Link | Colab Link | Blog Link | Description |
---|---|---|---|---|
YouTube Cloner | Folder | Fireship GPT | Blog coming soon | An attempt at cloning YouTubers by fine-tuning LLMs |
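The cloning idea starts from channel transcripts. Below is a rough sketch of turning one video transcript into an instruction-style training example that the fine-tuning notebooks further down could consume; the library call, video ID, and record format are illustrative assumptions, not the exact logic in `youtube_channel_scraper.py`.

```python
# Sketch only: collect a transcript and frame it as an instruction -> response pair.
import json
from youtube_transcript_api import YouTubeTranscriptApi

def transcript_to_example(video_id: str, title: str) -> dict:
    # Fetch the transcript as a list of {"text", "start", "duration"} segments
    segments = YouTubeTranscriptApi.get_transcript(video_id)
    script = " ".join(seg["text"] for seg in segments)
    # Pair a prompt built from the title with the full script as the target output
    return {"instruction": f"Write a Fireship-style video script titled: {title}",
            "output": script}

if __name__ == "__main__":
    example = transcript_to_example("VIDEO_ID_HERE", "Example video")  # hypothetical ID
    with open("yt_clone_dataset.jsonl", "a") as f:
        f.write(json.dumps(example) + "\n")
```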
Finetuning | GitHub Link | Colab Link | Blog Link | Description |
---|---|---|---|---|
Gemma Finetuning | GitHub | Colab | A Beginner’s Guide to Fine-Tuning Gemma | Notebook to fine-tune Gemma models |
Mistral-7b Finetuning | GitHub | Colab | A Beginner’s Guide to Fine-Tuning Mistral 7B Instruct Model | Notebook to fine-tune the Mistral-7B Instruct model |
Mixtral Finetuning | GitHub | Colab | A Beginner’s Guide to Fine-Tuning Mixtral Instruct Model | Notebook to fine-tune Mixtral Instruct models |
Llama2 Finetuning | GitHub | Colab | - | Notebook to fine-tune the Llama2-7B model |
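A common recipe for fine-tuning models like these on a single GPU is QLoRA: load the base model in 4-bit, attach small LoRA adapters, and train only the adapters on an instruction dataset. The sketch below shows that pattern with Hugging Face `transformers` and `peft`; the base model, dataset, and hyperparameters are placeholder assumptions and may differ from what the notebooks actually use.

```python
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import (AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base = "mistralai/Mistral-7B-Instruct-v0.2"  # placeholder; the other models follow the same pattern

tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token

# Load the base model in 4-bit to keep VRAM usage low (QLoRA-style)
bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4",
                         bnb_4bit_compute_dtype=torch.bfloat16)
model = AutoModelForCausalLM.from_pretrained(base, quantization_config=bnb, device_map="auto")
model = prepare_model_for_kbit_training(model)

# Train small LoRA adapters instead of all 7B base parameters
lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora)

# Tokenize an instruction dataset; the "text" column holds the full formatted prompt
data = load_dataset("tatsu-lab/alpaca", split="train[:1%]")
data = data.map(lambda row: tokenizer(row["text"], truncation=True, max_length=512),
                remove_columns=data.column_names)

trainer = Trainer(
    model=model,
    train_dataset=data,
    args=TrainingArguments(output_dir="out", per_device_train_batch_size=1,
                           gradient_accumulation_steps=8, num_train_epochs=1,
                           learning_rate=2e-4, logging_steps=10),
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("out/lora-adapter")  # saves only the small adapter weights
```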
Quantization | GitHub Link | Colab Link | Blog Link | Description |
---|---|---|---|---|
AWQ Quantization | GitHub | Colab | Squeeze Every Drop of Performance from Your LLM with AWQ | Quantize an LLM using AWQ |
GGUF Quantization | GitHub | Colab | Run any Huggingface model locally | Quantize an LLM to the GGUF format |
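As a rough illustration of the AWQ workflow, the sketch below quantizes a model to 4-bit with the AutoAWQ library and saves the result; the model name and quantization config are assumptions, not the notebook's exact settings. (The GGUF notebook instead goes through llama.cpp's conversion tooling.)

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "mistralai/Mistral-7B-Instruct-v0.2"  # placeholder; any HF causal LM
quant_path = "mistral-7b-instruct-awq"
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

# Load the full-precision model, then calibrate and quantize its weights to 4-bit AWQ
model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)
model.quantize(tokenizer, quant_config=quant_config)

# Save the quantized weights so they can be reloaded later with from_quantized(...)
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```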
Data Prep | GitHub Link | Colab Link | Description |
---|---|---|---|
Documents -> Dataset | GitHub | Colab | Given documents, generate an instruction/QA dataset for fine-tuning LLMs |
Topic -> Dataset | GitHub | Colab | Given a topic, generate a dataset to fine-tune LLMs |
Alpaca Dataset Generation | GitHub | Colab | The original instruction-dataset generation pipeline from the Alpaca paper |
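The documents-to-dataset idea boils down to: chunk the source text, ask a strong LLM to write a question/answer pair for each chunk, and collect the pairs as training rows. Here is a minimal sketch of that loop using the OpenAI client; the model name, prompt, and JSONL schema are assumptions, not the notebook's exact implementation.

```python
import json
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

PROMPT = (
    "Read the passage below and write one question a user might ask about it, "
    "followed by a faithful answer. Return JSON with keys 'question' and 'answer'.\n\n"
    "Passage:\n{chunk}"
)

def chunk_text(text, size=1000):
    """Naive fixed-size chunking; the notebooks may use smarter splitting."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def build_dataset(documents):
    rows = []
    for doc in documents:
        for chunk in chunk_text(doc):
            # Ask the model for one QA pair per chunk, as JSON
            resp = client.chat.completions.create(
                model="gpt-4o-mini",
                messages=[{"role": "user", "content": PROMPT.format(chunk=chunk)}],
                response_format={"type": "json_object"},
            )
            qa = json.loads(resp.choices[0].message.content)
            rows.append({"instruction": qa["question"], "output": qa["answer"]})
    return rows

if __name__ == "__main__":
    dataset = build_dataset([open("my_document.txt").read()])
    with open("qa_dataset.jsonl", "w") as f:
        for row in dataset:
            f.write(json.dumps(row) + "\n")
```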
├── DataPrep (Notebook to generate synthetic data)
│ ├── dataset_prep.ipynb
│ └── ...
├── Deployment (TGI/vLLM scripts for testing)
│ └── ...
├── Finetuning (Finalized Finetuning Scripts)
│ ├── Gemma_finetuning_notebook.ipynb
│ ├── Llama2_finetuning_notebook.ipynb
│ ├── Mistral_finetuning_notebook.ipynb
│ ├── Mixtral_finetuning_notebook.ipynb
│ └── ...
├── LLMS (LLM experiments)
│ ├── ambari
│ │ └── ...
│ ├── CodeLLama
│ │ └── ...
│ ├── Gemma
│ │ ├── finetune-gemma.ipynb
│ │ └── gemma-sft.py
│ ├── Llama2
│ │ └── ...
│ ├── Mistral-7b
│ │ └── ...
│ └── Mixtral
│ └── ...
├── Projects (Upcoming ideas to explore)
│ └── YT_Clones
│ ├── Fireship_clone.ipynb
│ ├── youtube_channel_scraper.py
│ └── ...
├── Quantization
│ └── ...
├── utils
│ └── streaming_inference_hf.ipynb
└── RAG (Retrieval Augmented Generation)
├── 1_Naive_RAG.ipynb
├── 2_Semantic_Chunking_RAG.ipynb
├── 3_Sentence_Window_Retrieval_RAG.ipynb
├── 4_Auto_Merging_Retrieval_RAG.ipynb
├── 5_Agentic_RAG.ipynb
└── 6_Visual_RAG.ipynb
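The RAG notebooks progress from the simplest pattern to more advanced retrieval strategies. For orientation, here is a minimal sketch of the "naive RAG" step in `1_Naive_RAG.ipynb`: embed chunks, retrieve the nearest ones for a query, and stuff them into the prompt. The embedding model and in-memory index are assumptions; the notebook itself may use a framework such as LlamaIndex or LangChain instead.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder embedding model

def build_index(chunks):
    # Normalized embeddings so a dot product equals cosine similarity
    return embedder.encode(chunks, normalize_embeddings=True)

def retrieve(query, chunks, index, k=3):
    q = embedder.encode([query], normalize_embeddings=True)[0]
    scores = index @ q
    return [chunks[i] for i in np.argsort(-scores)[:k]]

chunks = ["Mixtral is a sparse mixture-of-experts model.",
          "AWQ is a 4-bit weight quantization method.",
          "GGUF is the file format used by llama.cpp."]
index = build_index(chunks)

question = "How do I run models with llama.cpp?"
context = "\n".join(retrieve(question, chunks, index))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
# `prompt` would then be sent to any of the fine-tuned or quantized models above.
print(prompt)
```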