This repo contains evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI"
ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore the latest advancements in acoustics, speech and signal processing. Code included. Star the repository to support the advancement of audio and signal processing!
[ICLR 2023] MultiViz: Towards Visualizing and Understanding Multimodal Models
Knowledge-Aware machine LEarning (KALE): accessible machine learning from multiple sources for interdisciplinary research, part of the 🔥PyTorch ecosystem. ⭐ Star to support our work!
Phi-3-Vision model test - running locally
[arXiv 2023] PyTorch code for "Overcoming Weak Visual-Textual Alignment for Video Moment Retrieval"
[EMNLP 2022] PyTorch code for "Modal-specific Pseudo Query Generation for Video Corpus Moment Retrieval"
Official Code for the ACL 2024 (Findings) paper - ICC: Quantifying Image Caption Concreteness for Multimodal Dataset Curation
A curated list of awesome Multimodal studies.
IISAN: Efficiently Adapting Multimodal Representation for Sequential Recommendation with Decoupled PEFT
Tiny and simple implementation of multimodal models
An open-source framework for training large multimodal models.
A Comparative Framework for Multimodal Recommender Systems
Research Code for Multimodal-Cognition Team in Ant Group
Deep Multimodal Guidance for Medical Image Classification: https://arxiv.org/pdf/2203.05683.pdf
A repository for CS4ML, a general framework for active learning in regression problems. It approximates a target function from general types of data, rather than from pointwise samples.
[ABAW6 (CVPR Workshop)] Second-place solution for the valence-arousal challenge of ABAW6
Corpus of resources for multimodal machine learning with physiological signals
ICCV 2023 Papers: Discover cutting-edge research from ICCV 2023, the leading computer vision conference. Stay updated on the latest in computer vision and deep learning, with code included. ⭐ support visual intelligence development!
Implementation of PaLI-3 from the paper "PaLI-3 Vision Language Models: Smaller, Faster, Stronger"