
Result of Flexible-input-shape Model is NAN #2166

Open
jmcc113 opened this issue Mar 12, 2024 · 4 comments
Labels
bug Unexpected behaviour that should be corrected (type) Flexible Shape PyTorch (traced)

Comments

@jmcc113

jmcc113 commented Mar 12, 2024

🐞Describing the bug

When I use EnumeratedShapes or RangeDim to generate a flexible-input-shape model for inference, the result is all NaN.
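For reference, the EnumeratedShapes variant is a sketch along these lines (the shape pairs here are illustrative; the reproduction script below uses the RangeDim form):

import numpy as np
import coremltools as ct

# Sketch of the EnumeratedShapes variant; (batch, sequence_length) pairs
# the converted model should accept are illustrative values.
enumerated_shape = ct.EnumeratedShapes(shapes=[[1, 128], [2, 256], [8, 512]],
                                       default=[1, 128])
# Passed to ct.convert the same way as the RangeDim-based ct.Shape in the
# reproduction script below, e.g.:
# ct.TensorType(shape=enumerated_shape, dtype=np.int32, name="input_ids")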

Stack Trace

/opt/homebrew/anaconda3/envs/bce/bin/python /Users/jinmuchuan/projects/BCEmbedding/model.py 
torch.int32
When both 'convert_to' and 'minimum_deployment_target' not specified, 'convert_to' is set to "mlprogram" and 'minimum_deployment_targer' is set to ct.target.iOS15 (which is same as ct.target.macOS12). Note: the model will not run on systems older than iOS15/macOS12/watchOS8/tvOS15. In order to make your model run on older system, please set the 'minimum_deployment_target' to iOS14/iOS13. Details please see the link: https://coremltools.readme.io/docs/unified-conversion-api#target-conversion-formats
Tuple detected at graph output. This will be flattened in the converted model.
Converting PyTorch Frontend ==> MIL Ops:   0%|          | 0/672 [00:00<?, ? ops/s]Core ML embedding (gather) layer does not support any inputs besides the weights and indices. Those given will be ignored.
Core ML embedding (gather) layer does not support any inputs besides the weights and indices. Those given will be ignored.
Converting PyTorch Frontend ==> MIL Ops: 100%|█████████▉| 670/672 [00:00<00:00, 5151.90 ops/s]
Running MIL frontend_pytorch pipeline: 100%|██████████| 5/5 [00:00<00:00, 458.38 passes/s]
Running MIL default pipeline:   0%|          | 0/71 [00:00<?, ? passes/s]/opt/homebrew/anaconda3/envs/bce/lib/python3.10/site-packages/coremltools/converters/mil/mil/passes/defs/preprocess.py:267: UserWarning: Output, '1617', of the source model, has been renamed to 'var_1617' in the Core ML model.
  warnings.warn(msg.format(var.name, new_name))
Running MIL default pipeline:  59%|█████▉    | 42/71 [00:00<00:00, 135.52 passes/s]/opt/homebrew/anaconda3/envs/bce/lib/python3.10/site-packages/coremltools/converters/mil/mil/ops/defs/iOS15/elementwise_unary.py:894: RuntimeWarning: overflow encountered in cast
  return input_var.val.astype(dtype=string_to_nptype(dtype_val))
Running MIL default pipeline: 100%|██████████| 71/71 [00:13<00:00,  5.44 passes/s]
Running MIL backend_mlprogram pipeline: 100%|██████████| 12/12 [00:00<00:00, 480.47 passes/s]
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
{'output': array([[[nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan]],

       [[nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan]]], dtype=float32), 'var_1617': array([[nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan]], dtype=float32)}

Process finished with exit code 0

To Reproduce

import numpy as np
import torch
import coremltools as ct
from transformers import AutoModel, AutoTokenizer

sentences = ['sentence_0', 'sentence_1']

tokenizer = AutoTokenizer.from_pretrained('maidalun1020/bce-embedding-base_v1')
model = AutoModel.from_pretrained('maidalun1020/bce-embedding-base_v1', return_dict=False)

device = 'cpu'  # use 'cpu' if no GPU is available
model.to(device)
example_input = torch.randint(0, 10, size=(1, 128)).type(torch.int32)
print(example_input.dtype)

traced_script_module = torch.jit.trace(model.eval(), (example_input, example_input))


input_shape = ct.Shape(shape=(ct.RangeDim(lower_bound=1, upper_bound=8),
                              ct.RangeDim(lower_bound=1, upper_bound=512)))

mlmodel = ct.convert(traced_script_module,
                     inputs=[ct.TensorType(shape=input_shape, dtype=np.int32, name="input_ids"),
                             ct.TensorType(shape=input_shape, dtype=np.int32, name="attention_mask")],
                     outputs=[ct.TensorType(dtype=np.float32, name="output"),
                              ct.TensorType(dtype=np.float32, name="1617")]
                     )

mlmodel.save('embed.mlpackage')

inputs = tokenizer(sentences, padding=True, truncation=True, max_length=512, return_tensors="np")
inputs_on_device = {k: v.astype(np.int32) for k, v in inputs.items()}

out_dict = mlmodel.predict(inputs_on_device)
print(out_dict)
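A sanity check one could append to the script above (a hypothetical addition, not part of the original): compare the Core ML predictions against the traced PyTorch model on the same inputs.

# Hypothetical sanity check: the Core ML output above is all NaN, while
# the traced PyTorch model is expected to produce finite values.
torch_out = traced_script_module(torch.from_numpy(inputs_on_device['input_ids']),
                                 torch.from_numpy(inputs_on_device['attention_mask']))
print(np.isnan(out_dict['output']).all())      # True -> the bug
print(torch.isnan(torch_out[0]).any().item())  # expected False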

System environment (please complete the following information):

  • coremltools version: 7.1
  • OS (e.g. MacOS version or Linux type): macOS 13.3.1 (a)
  • Any other relevant version information (e.g. PyTorch or TensorFlow version): torch==2.1.0
@jmcc113 jmcc113 added the bug Unexpected behaviour that should be corrected (type) label Mar 12, 2024
@TobyRoseman
Collaborator

Loading an untrusted PyTorch model is a security risk, so I'm unable to reproduce your results. It would be great if you could give us a minimal example (i.e. one which doesn't require loading an external model).

Does the output match if you convert with fixed shapes?
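For example, a sketch reusing the names from your script:

# Sketch: fixed-shape conversion for comparison; the shape matches the
# tracing example in the reproduction script.
mlmodel_fixed = ct.convert(
    traced_script_module,
    inputs=[ct.TensorType(shape=(1, 128), dtype=np.int32, name="input_ids"),
            ct.TensorType(shape=(1, 128), dtype=np.int32, name="attention_mask")],
    outputs=[ct.TensorType(dtype=np.float32, name="output"),
             ct.TensorType(dtype=np.float32, name="1617")],
)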

@jmcc113
Author

jmcc113 commented Mar 13, 2024

Loading an untrusted PyTorch model is a security risk, so I'm unable to reproduce your results. It would be great if you could give us a minimal example (i.e. one which doesn't require loading an external model).

Does the output match if you convert with fixed shapes?

This model is from Hugging Face. I'm not sure which layer causes this bug, so it's difficult for me to construct a minimal example.
But the output of the fixed-shape model is correct.

@TobyRoseman
Collaborator

I'm not sure which layer causes this bug, so it's difficult for me to construct a minimal example.

I completely understand. Unfortunately, without a minimal example, it's difficult for me to help you.

Since the fixed shape works, the issue is almost certainly related to flexible shapes. For debugging purposes, there are a few more things you could try (a sketch follows the list).

1 - Verify that the traced PyTorch model still works for shapes within the range of the flexible shape but different than the shapes it was traced on.

2 - See if the model converts correctly with a fixed input_ids shape but a flexible attention_mask shape.

3 - See if the model converts correctly with a fixed attention_mask shape but a flexible input_ids shape.
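A sketch of what those experiments could look like, reusing the names from the reproduction script (shapes are illustrative):

# 1 - Run the traced model on an in-range shape different from the traced one.
probe = torch.randint(0, 10, size=(2, 64)).type(torch.int32)
print(traced_script_module(probe, probe))

# 2 - Fixed input_ids shape, flexible attention_mask shape.
flexible = ct.Shape(shape=(ct.RangeDim(lower_bound=1, upper_bound=8),
                           ct.RangeDim(lower_bound=1, upper_bound=512)))
mlmodel_mixed = ct.convert(
    traced_script_module,
    inputs=[ct.TensorType(shape=(1, 128), dtype=np.int32, name="input_ids"),
            ct.TensorType(shape=flexible, dtype=np.int32, name="attention_mask")])

# 3 - Swap the two: flexible input_ids shape, fixed attention_mask shape.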

@jmcc113
Author

jmcc113 commented Mar 15, 2024

1 - Verify that the traced PyTorch model still works for shapes within the range of the flexible shape but different than the shapes it was traced on.

2 - See if the model converts correctly with a fixed input_ids shape but a flexible attention_mask shape.

3 - See if the model converts correctly with a fixed attention_mask shape but a flexible input_ids shape.

  1. The traced model works well for shapes different from the ones it was traced on.
  2. When running the model converted with a fixed input_ids shape but a flexible attention_mask shape, I get an error:
Traceback (most recent call last):
  File "/Users/jinmuchuan/projects/BCEmbedding/model.py", line 48, in <module>
    out_dict = mlmodel.predict(inputs_on_device)
  File "/opt/homebrew/anaconda3/envs/bce/lib/python3.10/site-packages/coremltools/models/model.py", line 596, in predict
    return MLModel._get_predictions(self.__proxy__, verify_and_convert_input_dict, data)
  File "/opt/homebrew/anaconda3/envs/bce/lib/python3.10/site-packages/coremltools/models/model.py", line 648, in _get_predictions
    return proxy.predict(data)
RuntimeError: {
    NSLocalizedDescription = "Failed to build the model execution plan using a model architecture file '/private/var/folders/sn/xnnh_7q94y9fx18c0g26rt716qppp1/T/tmpf0e8pl49.mlmodelc/model.mil' with error code: -7.";
}
  3. When running the model converted with a fixed attention_mask shape but a flexible input_ids shape, I get NaN.
