How does InternViT-6B-448px-V1.5 support dynamic resolution? #182

Open
hgdhrt opened this issue May 20, 2024 · 2 comments

Comments

@hgdhrt

hgdhrt commented May 20, 2024

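I followed the example code for InternViT-6B-448px-V1.5. After image preprocessing, I found that an image whose aspect ratio is not 1 is still center-cropped. How can I use dynamic resolution?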

@hgdhrt
Author

hgdhrt commented May 20, 2024

Adding the example code:

import torch
from PIL import Image
from transformers import AutoModel, CLIPImageProcessor

model = AutoModel.from_pretrained(
    'OpenGVLab/InternViT-6B-448px-V1-5',
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    trust_remote_code=True).cuda().eval()

image = Image.open('./examples/image1.jpg').convert('RGB')

image_processor = CLIPImageProcessor.from_pretrained('OpenGVLab/InternViT-6B-448px-V1-5')

pixel_values = image_processor(images=image, return_tensors='pt').pixel_values
pixel_values = pixel_values.to(torch.bfloat16).cuda()

outputs = model(pixel_values)

@czczup
Member

czczup commented May 30, 2024

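Hello, the README is not clear enough on this point; sorry for the confusion.
Please use the load_image function shown here to load the image and split it into tiles, then feed the tiles into the ViT: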

https://huggingface.co/OpenGVLab/InternVL-Chat-V1-5#model-usage
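
For readers who want the idea without following the link, here is a rough sketch of aspect-ratio-aware tiling. It is not the code from the model card: the helper names (`choose_grid`, `tile_boxes`, `split_into_tiles`) are hypothetical, and the defaults (448-pixel tiles, at most 6 tiles, an extra thumbnail) are assumptions made for illustration. The `image` argument is assumed to behave like a `PIL.Image.Image` (`size`, `resize`, `crop`).

```python
def choose_grid(width, height, tile=448, min_tiles=1, max_tiles=6):
    """Pick the (cols, rows) grid whose aspect ratio is closest to the image's."""
    aspect = width / height
    # All grids with an allowed tile count, smallest grids first.
    candidates = sorted(
        {(c, r)
         for c in range(1, max_tiles + 1)
         for r in range(1, max_tiles + 1)
         if min_tiles <= c * r <= max_tiles},
        key=lambda g: (g[0] * g[1], g[0]),
    )
    best, best_diff = (1, 1), float("inf")
    for cols, rows in candidates:
        diff = abs(aspect - cols / rows)
        if diff < best_diff:
            best, best_diff = (cols, rows), diff
        elif diff == best_diff and width * height > 0.5 * tile * tile * cols * rows:
            # On a tie, prefer the larger grid only if the image has enough pixels.
            best = (cols, rows)
    return best


def tile_boxes(cols, rows, tile=448):
    """Crop boxes (left, upper, right, lower) in row-major order."""
    return [
        (c * tile, r * tile, (c + 1) * tile, (r + 1) * tile)
        for r in range(rows)
        for c in range(cols)
    ]


def split_into_tiles(image, tile=448, max_tiles=6, use_thumbnail=True):
    """Resize the image to the chosen grid and cut it into square tiles."""
    width, height = image.size  # PIL reports (width, height)
    cols, rows = choose_grid(width, height, tile=tile, max_tiles=max_tiles)
    resized = image.resize((cols * tile, rows * tile))
    tiles = [resized.crop(box) for box in tile_boxes(cols, rows, tile)]
    if use_thumbnail and len(tiles) > 1:
        # Assumption: a downscaled view of the whole image is appended last.
        tiles.append(image.resize((tile, tile)))
    return tiles
```

Each returned tile is already square, so it could then be normalized and stacked into one batch before calling the model. Whether `CLIPImageProcessor` leaves such tiles uncropped depends on its configured size, so that part is worth checking against the linked model card.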
