Enable passing output_hidden_states #731

Open · wants to merge 17 commits into main
Conversation

thepowerfuldeez

Related to #657
Inspired by the PR above, I made this PR without breaking backward compatibility. In addition, I added support for setting output_hidden_states as an attribute on the VisionTransformer and TextTransformer classes.

Example:

import torch
from PIL import Image  # needed for Image.open below

import open_clip
from open_clip import get_tokenizer

model_type = 'hf-hub:UCSC-VLAA/ViT-L-14-CLIPA-336-datacomp1B'

model, _, preprocess = open_clip.create_model_and_transforms(model_type)
tokenizer = get_tokenizer(model_type)

image = preprocess(Image.open(PATH_TO_IMAGE)).unsqueeze(0)  # add batch dim
text = tokenizer(["a diagram", "a dog", "a cat"])

with torch.no_grad():
    # With output_hidden_states=True, encode_* returns a tuple of
    # (features, hidden_states) instead of just the features tensor.
    image_result = model.encode_image(image, output_hidden_states=True)
    text_result = model.encode_text(text, output_hidden_states=True)

    image_features, image_hidden_states = image_result
    text_features, text_hidden_states = text_result

@rwightman (Collaborator)

@thepowerfuldeez thanks for the PR. I will say that we need to do this one carefully, as it impacts the output interface. I recognize that people want this, but it's been slow to be added because it's a bit of a mess once you consider all the details.

First things first, I feel we should only allow this if dictionary output is enabled; having too many tuple variations as possible outputs is asking for trouble.
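For illustration, something along these lines is what I have in mind, where hidden states only ever show up as extra dict keys (the key names here are hypothetical, not a settled interface; model_type, image, and text are reused from the example above):

import open_clip

# output_dict=True already makes the CLIP forward return a dict
# (image_features / text_features / logit_scale) instead of a tuple.
model = open_clip.create_model(model_type, output_dict=True)

out = model(image, text)
# Hidden states would then be added as extra keys, and only in dict mode,
# e.g. (hypothetical key names):
#   out["image_hidden_states"], out["text_hidden_states"]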

Next, the internal typing has gotchas with TorchScript when you alternate between Tuple and Tensor outputs. I'm not quite sure what combination of type annotations would be needed to make that pass.
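Roughly the kind of thing that trips it up (a toy example, not the actual open_clip code):

from typing import List

import torch

class Tower(torch.nn.Module):
    def forward(self, x: torch.Tensor, output_hidden_states: bool = False):
        hidden: List[torch.Tensor] = []
        # ... transformer blocks would append intermediate outputs to hidden ...
        if output_hidden_states:
            return x, hidden  # Tuple[Tensor, List[Tensor]]
        return x              # plain Tensor

# torch.jit.script(Tower()) fails here: TorchScript requires a single
# consistent return type, and the two branches return different types.
# Annotating the return as a Union comes with its own TorchScript limitations.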

@thepowerfuldeez (Author)

Hi @rwightman! I'm on the same page with you that it should be supported as a dict output. I couldn't decide how best to wire it up, considering dict output appears in only one place, and I needed the VisionTransformer itself to output hidden states (I am not using the CLIP class, hence the logic of setting an attribute on the transformer classes).
What would be a better way to move such logic into the dict output here?
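Concretely, my usage looks roughly like this (a simplified sketch of the attribute-based path in this PR, not the exact internals):

# Working with the vision tower directly, without the CLIP wrapper:
visual = model.visual  # the VisionTransformer
visual.output_hidden_states = True

with torch.no_grad():
    # With the attribute set, the tower returns hidden states alongside
    # its usual output (per this PR's changes).
    features, hidden_states = visual(image)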

@alvaro-stylesage

Hi @thepowerfuldeez, thanks a lot for this PR; it has been really useful. However, I have some doubts about using the image_hidden_states as an embedding for downstream classification tasks. I am doing:

last_hidden_state = image_hidden_states[-1].numpy()  # output of the final transformer block
cls_embedding = last_hidden_state[:, 0, :]           # CLS token, shape (1, 1024)

But using that (1024,)-sized CLS embedding does not give good classification metrics on my task (~50% accuracy), while with the (768,)-sized image_features I get ~85% accuracy. Can you think of an explanation for this? Is the last hidden state taken before the final LayerNorm, and could that be affecting it?
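In case it helps clarify the comparison, this is roughly what I would have expected to need to get something comparable to image_features from the last hidden state (a sketch, assuming the tower exposes ln_post and proj like open_clip's VisionTransformer and uses standard CLS pooling):

import torch

visual = model.visual  # open_clip VisionTransformer

with torch.no_grad():
    last_hidden_state = image_hidden_states[-1]  # (1, seq_len, 1024)
    cls_token = last_hidden_state[:, 0, :]       # pooled CLS token
    # image_features are produced after the final LayerNorm and the
    # output projection, so apply both before comparing:
    projected = visual.ln_post(cls_token) @ visual.proj  # -> (1, 768)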

Thanks!
