ROS1 package for open-source vision-language models (VLMs) such as LLaVA and honeybee.
This package is built upon:
- ROS1 (Noetic)
- flask (communication between rosnode and docker inference server)
- docker and nvidia-container-toolkit (inference server)
mkdir -p ~/ros/catkin_ws/src && cd ~/ros/catkin_ws/src
git clone https://github.com/ojh6404/vlm_ros.git
cd vlm_ros && docker build -t vlm_ros . # build docker inference server
cd ~/ros/catkin_ws && catkin b
First, launch the docker inference server with
./run_docker -p 8888 -m honeybee
where
- -p or --port: which port to use.
- -m or --model: which model to use, one of [llava, honeybee]. Default is honeybee.
Then launch the VQA node with
roslaunch vlm_ros sample_vqa.launch \
input_image:=/kinect_head/rgb/image_rect_color \
gui:=true
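
The ROS node and the docker inference server communicate over HTTP via flask. As a rough illustration only, the sketch below shows how a client could package an image and a question into a JSON POST request aimed at the port chosen above. The route name `/vqa`, the JSON field names `image` and `question`, and both helper functions are hypothetical and are not taken from this package; check the server code for the actual interface.

```python
import base64
import json
import urllib.request


def build_vqa_request(image_bytes, question, host="localhost", port=8888, route="/vqa"):
    """Build (but do not send) a JSON POST request.

    Assumption: the route and the payload field names are placeholders,
    not the real vlm_ros server API.
    """
    payload = {
        # Raw image bytes are base64-encoded so they can travel inside JSON.
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "question": question,
    }
    return urllib.request.Request(
        url=f"http://{host}:{port}{route}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_vqa_request(request, timeout=10.0):
    """Send the request and decode a JSON reply (assumes the server answers in JSON)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running on port 8888, a call would look like `send_vqa_request(build_vqa_request(jpeg_bytes, "What is on the table?"))`, where `jpeg_bytes` is an encoded image. Keeping request construction separate from sending makes the payload easy to inspect without a running server.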