A Comparative Framework for Multimodal Recommender Systems
Updated May 24, 2024 - Python
DANCE: a deep learning library and benchmark platform for single-cell analysis
Multimodal Access and Interactive Data Representation
A Python framework accelerating ML-based discovery in the medical field by encouraging code reuse. Batteries included :)
This repository is used to collect papers and code in the field of AI.
This repo contains evaluation code for the paper "MileBench: Benchmarking MLLMs in Long Context"
Automated modeling and machine learning framework FEDOT
Implementation of PALI3 from the paper "PALI-3 VISION LANGUAGE MODELS: SMALLER, FASTER, STRONGER"
My implementation of "Patch n’ Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution"
A Python package housing a collection of deep-learning multi-modal data fusion method pipelines! From data loading, to training, to evaluation - fusilli's got you covered 🌸
TriDFusion (3DF) Medical Imaging Viewer
🔥🔥🔥 A curated list of papers on LLM-based multimodal generation (image, video, 3D and audio).
This repo contains evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI"
[NAACL 2024] Official implementation of the paper "Self-Adaptive Sampling for Efficient Video Question Answering on Image-Text Models"
GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest
This repo contains evaluation code for the paper "Are We on the Right Way for Evaluating Large Vision-Language Models?"
Collecting awesome papers of RAG for AIGC. We propose a taxonomy of RAG foundations, enhancements, and applications in the paper "Retrieval-Augmented Generation for AI-Generated Content: A Survey".
Swarming algorithms like PSO, Ant Colony, Sakana, and more in PyTorch 😊
The Cradle framework is a first attempt at General Computer Control (GCC). Cradle helps agents ace any computer task by enabling strong reasoning abilities, self-improvement, and skill curation in a standardized general environment with minimal requirements.
An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"