Caption-Studio: Unleash the power of cutting-edge language models and image recognition to effortlessly generate captivating captions for your images. Elevate your social media game with expertly crafted, attention-grabbing captions that perfectly complement your visuals.

ayush-vatsal/Caption-Studio

Caption-Studio

About

  • Vision Transformer: The system uses the vision model 'GIT Large' to extract a textual description from each image.
  • Decoder-only Language Model: Captions are generated by a fine-tuned decoder-only LLM, Falcon 7B Instruct, which turns each image description into a creative caption.
  • Social Media-Worthy Captions: The Falcon model was fine-tuned on a subset of the Flickr dataset annotated by a larger LLM, so the generated captions are tailored for sharing on social media platforms.
  • Code and Resources: This repository provides the code and pre-trained models needed to get started quickly, along with a demo notebook for inference with a Gradio UI. The dataset used to fine-tune the model is linked in the Dataset section.
  • The LLM was instruction-tuned in a manner similar to Stanford's Alpaca model, albeit on a much smaller dataset. For description-to-caption generation, its output is qualitatively similar to that of OpenAI's text-davinci-003, despite its much smaller size and training budget.
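The two stages described above (image to description, then description to caption) can be sketched as a small function that takes each stage as a callable. This is only an illustration: the names `caption_image`, `describe`, `generate`, and the default prompt template are hypothetical and are not taken from this repository's notebooks.

```python
from typing import Callable

# Placeholder prompt; the exact wording used for fine-tuning is not stated in this README.
DEFAULT_TEMPLATE = "Description: {description}\nCaption:"


def caption_image(
    image: object,
    describe: Callable[[object], str],
    generate: Callable[[str], str],
    template: str = DEFAULT_TEMPLATE,
) -> str:
    """Run the two stages in order: image -> description -> caption."""
    # Stage 1: the vision model produces a plain-text description.
    description = describe(image).strip()
    # Stage 2: the language model turns that description into a caption.
    prompt = template.format(description=description)
    completion = generate(prompt)
    # Causal LMs often echo the prompt, so keep only the text after it.
    if completion.startswith(prompt):
        completion = completion[len(prompt):]
    return completion.strip()
```

In practice `describe` would wrap the vision model and `generate` would wrap the fine-tuned Falcon model. Passing them in as callables keeps the sketch runnable with simple stand-in functions.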

Installation

  1. Clone this repository: git clone https://github.com/ayush-vatsal/Caption-Studio.git

  2. Install the required dependencies. It is recommended to use a virtual environment to avoid conflicts:

    cd Caption-Studio
    python -m venv venv
    source venv/bin/activate  # On Windows, use "venv\Scripts\activate"
    pip install -r requirements.txt
    

    You can skip installing dependencies if you run any of the .ipynb notebooks; each one installs its requirements in its first code cell.

  3. Run the caption generator on your own images or integrate it into your project.

Usage

There are two ways to use Caption Studio. If you have a local GPU that can handle inference with Falcon 7B, run the AI_Image_captions_for_Social_Media.ipynb notebook locally and scroll to the very end for inference. Otherwise, open the same notebook and click "Open in Colab", then go to Runtime, click "Run all", and scroll to the very end for the UI. There you can upload any image and get captions for it.

Models

The base LLM is Falcon-7b-instruct-sharded, a resharded Falcon 7B Instruct checkpoint (split into smaller safetensors chunks for low-RAM environments). It was fine-tuned with the PEFT library using Quantized Low-Rank Adaptation (QLoRA); the resulting adapter is available here: Falcon-adapter. The vision transformer used for image-to-description is Git Large COCO.
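As a rough sketch of how a QLoRA adapter is typically attached to a quantized base model with `transformers` and `peft`, something like the following could be used. The function name `load_captioner`, the 4-bit settings in `QLORA_4BIT`, and the `trust_remote_code` switch are assumptions made for illustration; the model and adapter IDs are left as parameters because this README does not spell them out, and the notebook may load the model differently.

```python
# Typical QLoRA-style 4-bit settings; assumed, not taken from this repository.
QLORA_4BIT = {
    "load_in_4bit": True,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": True,
}


def load_captioner(base_model_id: str, adapter_id: str, trust_remote_code: bool = False):
    """Load a 4-bit quantized base model and attach a LoRA adapter to it."""
    # Imported lazily so the module can be imported without the GPU stack installed.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(bnb_4bit_compute_dtype=torch.float16, **QLORA_4BIT)
    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    base = AutoModelForCausalLM.from_pretrained(
        base_model_id,
        quantization_config=quant_config,
        device_map="auto",
        # Some older Falcon checkpoints ship custom model code and need this set to True.
        trust_remote_code=trust_remote_code,
    )
    # Wrap the frozen base model with the fine-tuned adapter weights.
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()
    return tokenizer, model
```

Calling `load_captioner` with the sharded base model ID and the adapter ID linked above would return a tokenizer and a model ready for `generate`. It requires a CUDA-capable GPU with `bitsandbytes` installed.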

Dataset

To fine-tune the LLM we needed data with two columns: image descriptions (the expected output of the ViT) and the corresponding captions. No existing dataset had this shape, so we created our own. Image descriptions were extracted from a subset of the Flickr dataset (which is used to train vision transformers), and these descriptions were then annotated with captions using OpenAI's chat models. The dataset is uploaded here: captions dataset.
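To illustrate how a two-column row could be turned into an instruction-tuning example, here is a minimal sketch that uses the prompt layout published with Stanford's Alpaca. The column names `description` and `caption`, the instruction wording, and the helper names are assumptions; the actual dataset and training notebook may use different ones.

```python
# Alpaca-style layout (instruction + input + response).
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n{response}"
)

# Hypothetical instruction text; not taken from this repository.
INSTRUCTION = "Write a social media caption for the image described in the input."


def to_training_text(row: dict) -> str:
    """Format one (description, caption) row as a full training example."""
    return ALPACA_TEMPLATE.format(
        instruction=INSTRUCTION,
        input=row["description"].strip(),
        response=row["caption"].strip(),
    )


def to_inference_prompt(description: str) -> str:
    """Same layout with the response left empty for the model to complete."""
    return ALPACA_TEMPLATE.format(
        instruction=INSTRUCTION, input=description.strip(), response=""
    )
```

Using the same template for training and inference matters here: the fine-tuned model learns to continue after the `### Response:` marker, so the inference prompt should end exactly there.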

License

This project is licensed under Apache License 2.0.
