vlm_ros

ROS1 package for open-source visual-language models (VLMs) such as LLaVA and honeybee.

Setup

Prerequisite

This package is built upon:

ROS1 (Noetic)
flask (communication between rosnode and docker inference server)
docker and nvidia-container-toolkit (inference server)
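
To illustrate the flask-based link between the ROS node and the inference server, here is a minimal sketch of how an image and a question could be packed into a JSON request body and unpacked on the server side. The function names and the `image` / `question` field names are assumptions made for this example; they are not taken from the package, whose actual request format may differ.

```python
import base64
import json


def build_vqa_request(image_bytes: bytes, question: str) -> str:
    """Pack raw image bytes and a question into a JSON string.

    The schema ("image", "question") is hypothetical, chosen for illustration.
    """
    payload = {
        # JSON cannot carry raw bytes, so the image is base64-encoded.
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "question": question,
    }
    return json.dumps(payload)


def parse_vqa_request(body: str):
    """Inverse of build_vqa_request: recover the image bytes and the question."""
    data = json.loads(body)
    return base64.b64decode(data["image"]), data["question"]
```

In a setup like this, the ROS node would send the string from `build_vqa_request` as the body of an HTTP POST, and the flask handler would call something like `parse_vqa_request` before running the model.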

Build package

mkdir -p ~/ros/catkin_ws/src && cd ~/ros/catkin_ws/src
git clone https://github.com/ojh6404/vlm_ros.git
cd vlm_ros && docker build -t vlm_ros . # build docker inference server
cd ~/ros/catkin_ws && catkin b

How to use

1. VQA

First, launch the Docker inference server with

./run_docker -p 8888 -m honeybee

where

-p or --port : port the inference server listens on.
-m or --model : model to load, either llava or honeybee. Default is honeybee.

Then launch the VQA node with

roslaunch vlm_ros sample_vqa.launch \
    input_image:=/kinect_head/rgb/image_rect_color \
    gui:=true
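
Because the VQA node depends on the inference server, it can help to confirm that the server is accepting connections before running the roslaunch command above. The following is a minimal sketch using only the Python standard library; `server_is_listening` is a hypothetical helper, not part of this package, and it assumes the server is reachable over TCP on the port passed to `run_docker` (8888 in the example above).

```python
import socket


def server_is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection raises OSError (e.g. connection refused, timeout)
        # when nothing is accepting connections on the port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `server_is_listening("localhost", 8888)` checks the port used in the `run_docker` command above. A successful connection only shows that something is listening on the port, not that the model has finished loading.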
