Single-stream Extractor Network with Contrastive Pre-training for Remote Sensing Change Captioning
Updated May 29, 2024 - Python
Audio Captioning datasets for PyTorch.
Metrics for evaluating Automated Audio Captioning systems, designed for PyTorch.
A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
A tool to streamline AI image captioning
This repository is dedicated to small projects and some theoretical material that I used to get into NLP and LLMs in a practical and efficient way.
A Gradio-based image captioning tool that uses the GPT-4-Vision API to generate detailed descriptions of images.
CapDec: SOTA Zero-Shot Image Captioning Using CLIP and GPT2, EMNLP 2022 (Findings)
Modifying LAVIS' BLIP2 Q-former with models pretrained on Japanese datasets.
Image caption extension for A1111 Webui 👁️📜🖋️
📺 Software concept for summarizing YouTube video captions.
A real-time captioning system with support for large and small screen display.
Using LLMs and pre-trained caption models for super-human performance on image captioning.
Toolkit for supporting the EBU-TT Live specification
[NLPCC'23] PyTorch implementation of ZeroGen: Zero-shot Multimodal Controllable Text Generation with Multiple Oracles
Python program to generate memes.
VisText is a benchmark dataset for semantically rich chart captioning.
With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning. ICCV 2023
A curated list of zero-shot captioning papers
Audio captioning baseline system for DCASE 2020 challenge.