Automate browser-based workflows with LLMs and Computer Vision
Updated May 20, 2024 - Python
Official code for the paper "Mantis: Multi-Image Instruction Tuning"
My YOLOv8 learning pathway. Just for fun! You only look (live) once
Enhanced ChatGPT Clone: Features OpenAI, Assistants API, Azure, Groq, GPT-4 Vision, Mistral, Bing, Anthropic, OpenRouter, Vertex AI, Gemini, AI model switching, message search, langchain, DALL-E-3, ChatGPT Plugins, OpenAI Functions, Secure Multi-User System, Presets, completely open-source for self-hosting. More features in development
📸 A powerful, high-performance React Native Camera library.
The Freiburg Vision Test (FrACT) assesses visual acuities and contrast thresholds. It runs in any modern browser, or as a web app.
iOS app that integrates state-of-the-art machine learning and computer vision. The application is written in Swift using the CoreML and Vision frameworks.
PaperClub resource site: a continuous stream of small and medium-sized projects, mainly practical engineering projects covering vision algorithms, text algorithms, and front-end and back-end development; the main development languages are Python, Vue, and others.
PhotonVision is the free, fast, and easy-to-use computer vision solution for the FIRST Robotics Competition.
Library for communicating with ChatGPT. It now supports vision questions.
Blog and Portfolio page.
Google Gemini Voice/Vision Assistant with the gemini-1.5-pro / gemini-1.5-flash models!
FreeGenius AI, an advanced AI assistant that can talk and take multi-step actions. Supports numerous open-source LLMs via Llama.cpp, Ollama, or Groq Cloud API, with optional integration with AutoGen agents, OpenAI API, Google Gemini Pro, and unlimited plugins.
A fully-annotated, open-design dataset of autonomous and piloted high-speed flight
In this repo I've built a Vision Transformer using PyTorch