
How to download the ready LLaVA-Lightening-7B weights #97

SIGMIND opened this issue Apr 17, 2024 · 5 comments
SIGMIND commented Apr 17, 2024

As mentioned in the offline demo README:
"Alternatively you can download the ready LLaVA-Lightening-7B weights from mmaaz60/LLaVA-Lightening-7B-v1-1."
The Hugging Face repo has files named pytorch_model-00001-of-00002.bin and pytorch_model-00002-of-00002.bin. Should I convert the model to GGUF format to use it with the offline demo?


mmaaz60 commented Apr 17, 2024

Hi @SIGMIND,

No conversion is required; you can clone it directly from Hugging Face as below:

git lfs install
git clone https://huggingface.co/mmaaz60/LLaVA-7B-Lightening-v1-1

Then, download the projection weights:

git clone https://huggingface.co/MBZUAI/Video-ChatGPT-7B

Finally, you should be able to run the demo:

python video_chatgpt/demo/video_demo.py \
        --model-name LLaVA-7B-Lightening-v1-1 \
        --projection_path Video-ChatGPT-7B/video_chatgpt-7B.bin

I hope this helps. Let me know if you have any questions. Thanks.


SIGMIND commented Apr 18, 2024

Thanks, the steps helped me move forward with the models. However, are there any specific GPU requirements for running this locally? I have tried to run it on an RTX 2060 but get the error below:
python video_chatgpt/demo/video_demo.py
2024-04-18 17:52:45 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=8, model_list_mode='once', share=False, moderate=False, embed=False, model_name='LLaVA-7B-Lightening-v1-1', vision_tower_name='openai/clip-vit-large-patch14', conv_mode='video-chatgpt_v1', projection_path='/mnt/sdc1/Video-ChatGPT/Video-ChatGPT-7B/video_chatgpt-7B.bin')
You are using a model of type llava to instantiate a model of type VideoChatGPT. This is not supported for all configurations of models and can yield errors.
Loading checkpoint shards: 100%|██████████| 2/2 [08:46<00:00, 263.40s/it]
preprocessor_config.json: 100%|██████████| 316/316 [00:00<00:00, 1.74MB/s]
2024-04-18 18:01:57 | INFO | stdout | Loading weights from /mnt/sdc1/Video-ChatGPT/Video-ChatGPT-7B/video_chatgpt-7B.bin
2024-04-18 18:02:24 | INFO | stdout | Weights loaded from /mnt/sdc1/Video-ChatGPT/Video-ChatGPT-7B/video_chatgpt-7B.bin
2024-04-18 18:02:24 | ERROR | stderr | Traceback (most recent call last):
2024-04-18 18:02:24 | ERROR | stderr |   File "/mnt/sdc1/Video-ChatGPT/video_chatgpt/demo/video_demo.py", line 264, in <module>
2024-04-18 18:02:24 | ERROR | stderr |     initialize_model(args.model_name, args.projection_path)
2024-04-18 18:02:24 | ERROR | stderr |   File "/mnt/sdc1/Video-ChatGPT/video_chatgpt/eval/model_utils.py", line 131, in initialize_model
2024-04-18 18:02:24 | ERROR | stderr |     model = model.cuda()
2024-04-18 18:02:24 | ERROR | stderr |   File "/home/sig/Downloads/[/mnt/DockerRuntime/miniconda]/envs/video_chatgpt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 905, in cuda
2024-04-18 18:02:24 | ERROR | stderr |     return self._apply(lambda t: t.cuda(device))
2024-04-18 18:02:24 | ERROR | stderr |   File "/home/sig/Downloads/[/mnt/DockerRuntime/miniconda]/envs/video_chatgpt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 797, in _apply
2024-04-18 18:02:24 | ERROR | stderr |     module._apply(fn)
2024-04-18 18:02:24 | ERROR | stderr |   File "/home/sig/Downloads/[/mnt/DockerRuntime/miniconda]/envs/video_chatgpt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 797, in _apply
2024-04-18 18:02:24 | ERROR | stderr |     module._apply(fn)
2024-04-18 18:02:24 | ERROR | stderr |   File "/home/sig/Downloads/[/mnt/DockerRuntime/miniconda]/envs/video_chatgpt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 820, in _apply
2024-04-18 18:02:24 | ERROR | stderr |     param_applied = fn(param)
2024-04-18 18:02:24 | ERROR | stderr |   File "/home/sig/Downloads/[/mnt/DockerRuntime/miniconda]/envs/video_chatgpt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 905, in <lambda>
2024-04-18 18:02:24 | ERROR | stderr |     return self._apply(lambda t: t.cuda(device))
2024-04-18 18:02:24 | ERROR | stderr |   File "/home/sig/Downloads/[/mnt/DockerRuntime/miniconda]/envs/video_chatgpt/lib/python3.10/site-packages/torch/cuda/__init__.py", line 247, in _lazy_init
2024-04-18 18:02:24 | ERROR | stderr |     torch._C._cuda_init()
2024-04-18 18:02:24 | ERROR | stderr | RuntimeError: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error? Error 804: forward compatibility was attempted on non supported HW


biphobe commented Apr 25, 2024

It seems the last sentence of your logs indicates a driver issue.
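
For reference, Error 804 usually means the CUDA runtime PyTorch was built against is newer than what the installed driver supports; "forward compatibility" only works on data-center GPUs, not GeForce cards like the RTX 2060. A quick sanity check (a sketch, assuming the usual NVIDIA tooling and the same conda environment):

nvidia-smi
python -c "import torch; print(torch.version.cuda, torch.cuda.is_available())"

nvidia-smi prints the driver version and the highest CUDA version that driver supports; if torch.version.cuda is newer than that, upgrading the driver (or installing a PyTorch build for an older CUDA) should clear the error.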


SIGMIND commented Apr 28, 2024

Understood, and that is resolved. But how much GPU memory is required to run it offline? I have a 12 GB RTX 2060 and am getting this error:

2024-04-28 15:27:44 | ERROR | stderr |     return self._apply(lambda t: t.cuda(device))
2024-04-28 15:27:44 | ERROR | stderr | torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 32.00 MiB (GPU 0; 11.73 GiB total capacity; 11.26 GiB already allocated; 26.88 MiB free; 11.27 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
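
Following the hint at the end of the message, I could try tuning the allocator before launching, e.g.

export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
python video_chatgpt/demo/video_demo.py \
        --model-name LLaVA-7B-Lightening-v1-1 \
        --projection_path Video-ChatGPT-7B/video_chatgpt-7B.bin

but that only reduces fragmentation; I'm not sure 12 GB leaves enough headroom for the 7B weights plus the vision tower anyway.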


biphobe commented May 1, 2024

I've run the model locally on an RTX 2070 SUPER successfully, and I've also run it in the cloud with no issues.

Your problem seems related to your setup. Try closing every app on your system and then run the model. During my initial local attempt, the browser was reserving GPU memory and caused the same errors you mentioned.
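
To see what is actually holding GPU memory before you launch (assuming the standard NVIDIA tooling is installed):

nvidia-smi --query-gpu=memory.used,memory.total --format=csv
nvidia-smi

The first command prints used vs. total memory; the plain nvidia-smi output also includes a process table at the bottom showing which applications to close.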
