Community contribution - `optimum.exporters.onnx` support for new models! #555

michaelbenayoun · 2022-12-07T13:30:07Z

mszsorondo · 2023-01-01T22:00:52Z

Hi! I'm trying to add support for VisualBERT, which works for VQA, VCR, NLVR and RPG.
Since the guide says that "When inheriting from a middle-end class, look for the one handling the same modality / category of models as the one you are trying to support.", I'm using TextAndVisionOnnxConfig because this is a multimodal model. Then initialized NORMALIZED_CONFIG_CLASS = NormalizedTextAndVisionConfig
I this OK so far?

The problem comes when implementing the inputs property... What is it that this property specifies? In the guide, I see that this inputs are exactly BERT's tokenizer's output keys, and values are the tensor dimensions for each key of the tokenizer's output. This will vary task-wise so I'd have to make a different axis for each task. Is this ok?

Thanks for the help!

EDIT: I see VisualBERT is implemented separately by task, but VisualBertForPreTraining is also provided for customized down-stream tasks. Should I implement a diferent configuration for each task?

EDIT II: I see this issue was previously in the transformers repo, it seems like the docs on how to add the ONNX configuration are written in a way that ignores the current optimum implementation, I have sorted some of the difficulties that arise from this assuming one ONNX config for the whole model. Can I help with an update for this guide?

fxmarty · 2023-01-02T08:22:35Z

Hi @mszsorondo , indeed the page https://huggingface.co/docs/transformers/serialization#export-to-onnx is a bit outdated. I'll do a PR to fix it. In your EDIT II, were you referring to this page?

I'd recommend to refer to: https://huggingface.co/docs/optimum/main/en/exporters/onnx/usage_guides/contribute . If you see any issue / unclear steps in the guide, don't hesitate to open a PR!

As for VisualBERT, I guess you haven't picked the easiest one :) I think you can leave VisualBertForPreTraining aside, it's probably better to support the rest for inference.

Indeed NORMALIZED_CONFIG_CLASS = NormalizedTextAndVisionConfig seems good.

The problem comes when implementing the inputs property... What is it that this property specifies? In the guide, I see that this inputs are exactly BERT's tokenizer's output keys, and values are the tensor dimensions for each key of the tokenizer's output. This will vary task-wise so I'd have to make a different axis for each task. Is this ok?

EDIT: I see VisualBERT is implemented separately by task, but VisualBertForPreTraining is also provided for customized down-stream tasks. Should I implement a diferent configuration for each task?

I don't think you need to implement configs for each tasks. Apparently all tasks take as inputs input_ids, token_type_ids, attention_mask, visual_embeds, visual_token_type_ids, visual_attention_mask. The VisualBertForRegionToPhraseAlignment seem to have an additional region_to_phrase_position input.

To implement the input method, you need to specify which inputs / outputs the model takes, and what are the dynamic axis: for example, for CLIP, that is

optimum/optimum/exporters/onnx/model_configs.py

Lines 523 to 528 in 9ac1703

    
           def inputs(self) -> Mapping[str, Mapping[int, str]]: 
        
               return { 
        
                   "input_ids": {0: "batch_size", 1: "sequence_length"}, 
        
                   "pixel_values": {0: "batch_size", 1: "num_channels", 2: "height", 3: "width"}, 
        
                   "attention_mask": {0: "batch_size", 1: "sequence_length"}, 
        
               }

You can very well do an if/else in the input/output keys (or axis) depending on the task, for example BART:

optimum/optimum/exporters/onnx/model_configs.py

Lines 382 to 389 in 9ac1703

    
           def inputs(self) -> Mapping[str, Mapping[int, str]]: 
        
               inputs_properties = { 
        
                   "default": self.inputs_for_default_and_seq2seq_lm, 
        
                   "seq2seq-lm": self.inputs_for_default_and_seq2seq_lm, 
        
                   "causal-lm": self.inputs_for_causal_lm, 
        
                   "other": self.inputs_for_other_tasks, 
        
               } 
        
               return inputs_properties.get(self.task, inputs_properties["other"])

I think the piece where you will have the most work to do is to extend the dummy inputs generators. They are meant to generate inputs for the model, without using a preprocessor, and help to flexibly generate inputs of various shapes for example (for export validation). You would need to extend an existing one, or create a new input generator to support the visual_embeds, visual_token_type_ids, visual_attention_mask, region_to_phrase_position inputs. Unless you see an existing input generator in here you could reuse the logic of, my guess is that you can create a VisualBertDummyInputGenerator for those four inputs.

mszsorondo · 2023-01-02T17:22:19Z

Thanks for your help @fxmarty

In your EDIT II, were you referring to this page?

I was actually referring to the second guide (https://huggingface.co/docs/optimum/main/en/exporters/onnx/usage_guides/contribute), there are some minor issues with two function calls at the export step + one lacking import. Submitted PR #662

I advanced with the inputs function and did the export step, and indeed got an error regarding visual_embeds (surely this is also a problem for visual_token_type_ids, visual_attention_mask and region_to_phrase_position as you suggest), so I'll go for the new input generator.

bhavnicksm · 2023-01-03T07:09:57Z

Hi @michaelbenayoun!

Is someone working on adding the Pegasus ONNX config?

If not, I would like to look into it 😄(under your guidance, since I haven't done written a ONNXConfig yet)

fxmarty · 2023-01-03T08:22:14Z

Hi @bhavnicksm , @mht-sharma just merged the Pegasus ONNX config yesterday! #620

bhavnicksm · 2023-01-03T08:31:11Z

@fxmarty Still facing an issue

Hi @bhavnicksm , @mht-sharma just
merged the Pegasus ONNX config yesterday! #620

I installed optimum directly from source here using

!pip install --quiet git+https://github.com/huggingface/optimum.git

I tried to use Pegasus with an inference right now using ORTModelforSeq2SeqLM, using the following code:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from optimum.onnxruntime import ORTModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("tuner007/pegasus_paraphrase")
model = AutoModelForSeq2SeqLM.from_pretrained("tuner007/pegasus_paraphrase")

ort_model = ORTModelForSeq2SeqLM.from_pretrained("tuner007/pegasus_paraphrase", from_transformers=True)

and it gives me the following error:

/usr/local/lib/python3.8/dist-packages/transformers/models/pegasus/modeling_pegasus.py:234: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attn_weights.size() != (bsz * self.num_heads, tgt_len, src_len):
/usr/local/lib/python3.8/dist-packages/transformers/models/pegasus/modeling_pegasus.py:241: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attention_mask.size() != (bsz, 1, tgt_len, src_len):
/usr/local/lib/python3.8/dist-packages/transformers/models/pegasus/modeling_pegasus.py:273: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attn_output.size() != (bsz * self.num_heads, tgt_len, self.head_dim):
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
[<ipython-input-7-2e0907dfd025>](https://localhost:8080/#) in <module>
----> 1 ort_model = ORTModelForSeq2SeqLM.from_pretrained("tuner007/pegasus_paraphrase", from_transformers=True)

9 frames
[/usr/local/lib/python3.8/dist-packages/optimum/onnxruntime/modeling_ort.py](https://localhost:8080/#) in from_pretrained(cls, model_id, from_transformers, force_download, use_auth_token, cache_dir, subfolder, config, local_files_only, provider, session_options, provider_options, **kwargs)
    555             `ORTModel`: The loaded ORTModel model.
    556         """
--> 557         return super().from_pretrained(
    558             model_id,
    559             from_transformers=from_transformers,

[/usr/local/lib/python3.8/dist-packages/optimum/modeling_base.py](https://localhost:8080/#) in from_pretrained(cls, model_id, from_transformers, force_download, use_auth_token, cache_dir, subfolder, config, local_files_only, **kwargs)
    323 
    324         from_pretrained_method = cls._from_transformers if from_transformers else cls._from_pretrained
--> 325         return from_pretrained_method(
    326             model_id=model_id,
    327             config=config,

[/usr/local/lib/python3.8/dist-packages/optimum/onnxruntime/modeling_seq2seq.py](https://localhost:8080/#) in _from_transformers(cls, model_id, config, use_auth_token, revision, force_download, cache_dir, subfolder, local_files_only, use_cache, provider, session_options, provider_options, use_io_binding, task)
   1144             output_names.append(ONNX_DECODER_WITH_PAST_NAME)
   1145         models_and_onnx_configs = get_encoder_decoder_models_for_export(model, onnx_config)
-> 1146         export_models(
   1147             models_and_onnx_configs=models_and_onnx_configs,
   1148             opset=onnx_config.DEFAULT_ONNX_OPSET,

[/usr/local/lib/python3.8/dist-packages/optimum/exporters/onnx/convert.py](https://localhost:8080/#) in export_models(models_and_onnx_configs, output_dir, opset, output_names, device, input_shapes)
    534 
    535         outputs.append(
--> 536             export(
    537                 model=submodel,
    538                 config=sub_onnx_config,

[/usr/local/lib/python3.8/dist-packages/optimum/exporters/onnx/convert.py](https://localhost:8080/#) in export(model, config, output, opset, device, input_shapes)
    605                 f" got: {torch.__version__}"
    606             )
--> 607         return export_pytorch(model, config, opset, output, device=device, input_shapes=input_shapes)
    608 
    609     elif is_tf_available() and issubclass(type(model), TFPreTrainedModel):

[/usr/local/lib/python3.8/dist-packages/optimum/exporters/onnx/convert.py](https://localhost:8080/#) in export_pytorch(model, config, opset, output, device, input_shapes)
    368             # Export can work with named args but the dict containing named args has to be the last element of the args
    369             # tuple.
--> 370             onnx_export(
    371                 model,
    372                 (dummy_inputs,),

[/usr/local/lib/python3.8/dist-packages/torch/onnx/utils.py](https://localhost:8080/#) in export(model, args, f, export_params, verbose, training, input_names, output_names, operator_export_type, opset_version, do_constant_folding, dynamic_axes, keep_initializers_as_inputs, custom_opsets, export_modules_as_functions)
    502     """
    503 
--> 504     _export(
    505         model,
    506         args,

[/usr/local/lib/python3.8/dist-packages/torch/onnx/utils.py](https://localhost:8080/#) in _export(model, args, f, export_params, verbose, training, input_names, output_names, operator_export_type, export_type, opset_version, do_constant_folding, dynamic_axes, keep_initializers_as_inputs, fixed_batch_size, custom_opsets, add_node_names, onnx_shape_inference, export_modules_as_functions)
   1527             _validate_dynamic_axes(dynamic_axes, model, input_names, output_names)
   1528 
-> 1529             graph, params_dict, torch_out = _model_to_graph(
   1530                 model,
   1531                 args,

[/usr/local/lib/python3.8/dist-packages/torch/onnx/utils.py](https://localhost:8080/#) in _model_to_graph(model, args, verbose, input_names, output_names, operator_export_type, do_constant_folding, _disable_torch_constant_prop, fixed_batch_size, training, dynamic_axes)
   1113 
   1114     try:
-> 1115         graph = _optimize_graph(
   1116             graph,
   1117             operator_export_type,

[/usr/local/lib/python3.8/dist-packages/torch/onnx/utils.py](https://localhost:8080/#) in _optimize_graph(graph, operator_export_type, _disable_torch_constant_prop, fixed_batch_size, params_dict, dynamic_axes, input_names, module)
    662 
    663     graph = _C._jit_pass_onnx(graph, operator_export_type)
--> 664     _C._jit_pass_onnx_lint(graph)
    665     _C._jit_pass_lint(graph)
    666 

RuntimeError: Unable to cast from non-held to held instance (T& to Holder<T>) (#define PYBIND11_DETAILED_ERROR_MESSAGES or compile in debug mode for type information)

fxmarty · 2023-01-03T09:51:21Z

@bhavnicksm Can you open an issue in Optimum with your environment details? We can track it there!

* Support Splinter exporters (#555) * Added SplintrerModel in PYTORCH_EXPORT_MODELS_TINY dict (rightfully suggested by fxmarty) * Fix alphabetized order for PYTORCH_EXPORT_MODELS_LARGE

chainyo · 2023-02-07T15:06:14Z

@fxmarty Please re-open this. 🤗

fxmarty · 2023-02-07T15:07:04Z

Thanks!

adit299 · 2023-02-14T16:47:18Z

I can look into ImageGPT, if it has not yet been claimed.

fxmarty · 2023-02-14T17:14:10Z

Feel free! Don't hesitate to ask any question if needed.

someshfengde · 2023-02-18T12:50:45Z

Can I take TAPAS if it's not yet been claimed?

asrimanth · 2023-02-18T21:03:34Z

Hello, Can I work on RegNet?

michaelbenayoun · 2023-02-20T10:06:23Z

Yes to both, feel free!
I updated the list saying that you are working on it.

hazrulakmal · 2023-02-21T19:18:13Z

Hi @michaelbenayoun, I went into the codebase recently and I think the list above may not be the latest update. I found that a few models such as

PoolFormer
Hubert
MPnet
wav2vec

already have their own configurations in this file.

fxmarty · 2023-02-22T08:51:01Z

thank you @hazrulakmal , I updated the list!

soma2000-lang · 2023-03-04T15:29:07Z

@fxmarty working on FLAVA

fxmarty · 2023-04-11T15:45:47Z

@rcshubhadeep I moved your issue to #968

gjain7 · 2023-04-28T06:41:54Z

hi , is optimum supports converting Llama (alpaca-lora) to onnx ? It would be great if i get some insights in this

regisss · 2023-04-28T07:01:18Z

hi , is optimum supports converting Llama (alpaca-lora) to onnx ? It would be great if i get some insights in this

Yes, this is supported and was introduced in #975. You'll need to have Optimum v1.8 to do it.

michaelbenayoun · 2023-05-02T09:14:14Z

The TasksManager allows to map model classes to export configuratons, here ONNX ones.
Registering your ONNX config will make it possible for you to use it with the CLI and everything else.

Are you doing a PR that will be merged on optimum?
If so, go to the optimum/exporters/tasks.py file and add an entry in the _SUPPORTED_MODEL_TYPE class attribute:

_SUPPORTED_MODEL_TYPE = {
    ....,
    "custom": supported_task_mapping("text-classification", ...., onnx="CustomOnnxConfig")
}

But if you are not doing a PR that will be merged in optimum, and want to dynamically register your class in your own library you can create a registering method:

register_for_onnx = TasksManager.create_register("onnx")

@register_for_onnx("model_type_here", "text-classification", ...)
class CustomOnnxConfig(TextEncoderOnnxConfig):
...

michaelbenayoun · 2023-05-02T14:37:28Z

If you do it programatically I do not think you need to register anything.
What's your model? You put bert here, but bert is already registered for ONNX so nothing happens.

michaelbenayoun · 2023-05-03T08:45:43Z

Alright, could you open a PR for your issue please?
We will try to help you there.

maiiabocharova · 2023-05-03T16:47:23Z

Thank you for spending time on me! I think PR will be a difficult thing to do, since I am not that proficient and do not think many people will want to use my architecture anyway.

Maybe you can advice how to do it code just for my library?

base_model = CustomBertForTokenClassification.from_pretrained("my-checkpoint")

base_model.config returns BertConfig, which I think I need to overwrite with the custom config I created in the previous step...

michaelbenayoun · 2023-05-04T08:24:05Z

Sorry I meant a separete issue...

maiiabocharova · 2023-05-04T08:38:54Z

Thank you a lot, I'll delete my comments here since they are unrelated to the discussion. I asked on discussion forum

rishabbala · 2023-06-21T17:38:58Z

I can work on CvT, if its open

fxmarty · 2023-06-23T08:49:17Z

Hi @rishabbala , sounds good, let us know if you encounter any help! A good reference is https://huggingface.co/docs/optimum/main/en/exporters/onnx/usage_guides/contribute

ingo-m · 2023-07-06T15:55:48Z

According to the above list, export of BLOOM models to ONNX is already supported, right?

Is export to ONNX already supposed to work for base models that have been finetuned with PEFT / LoRA?

Using the bigscience/bloom-560m base model and finetuning with PEFT / LoRA, I was able to perform inference after exporting to ONNX, but the model predictions are degraded 🤔 Details: huggingface/peft#670

sidistic · 2023-07-12T16:44:35Z

Hello, I would like to add onnx exporter support for Funnel Transformer.

regisss · 2023-07-13T18:03:52Z

Hello, I would like to add onnx exporter support for Funnel Transformer.

Hi @sidistic! Feel free to open a PR here and we'll help you if there is any issue 🙂
This guide may be useful: https://huggingface.co/docs/optimum/main/en/exporters/onnx/usage_guides/contribute

sidistic · 2023-07-26T23:36:30Z

Hello @regisss! I have opened a PR. This is my first ever PR on an open source project so looking forward to hearing your advice and learning from you.

soharas · 2023-08-02T17:27:45Z

Hello, is anyone working to implement this? If not then I might look into it

raise NotImplementedError( NotImplementedError: Tried to use ORTOptimizer for the model type mpnet, but it is not available yet.

manishghop · 2023-12-10T16:22:06Z

Hi, I'm trying to export ChatGLM2 & Qwen models to onnx using hf optimum.

ChatGLM2: model-card-> https://huggingface.co/THUDM/chatglm2-6b
Qwen: model-card-> https://huggingface.co/Qwen/Qwen-7B-Chat

I'm using this code to export chatglm2: https://gist.github.com/manishghop/9be5aee6ed3d7551c751cc5d9f7eb8c3
While running the onnx export I faced: [UnsupportedOperatorError: Exporting the operator 'aten::scaled_dot_product_attention' to ONNX opset version 14 is not supported.](https://github.com/pytorch/pytorch/issues/97262#top) .
Fixed it by adding this code: https://github.com/pytorch/pytorch/issues/97262#issuecomment-1487141914 from the issue PR.

My question is, I do get 2 files(model.onnx & model.onnx_data, but it fails the onnx export validation stage. How do I check if the onnx model works?
Also for exporting qwen model: https://huggingface.co/Qwen/Qwen-7B-Chat, should I just make changes to the model_id to Qwen/Qwen-7B-Chat & hoping it should run the onnx export?

Thanks in advance

mattsthilaire · 2024-01-24T07:51:02Z

Hey all, Wanted to see if I could pick up doing the Canine implementation. I saw @RaghavPrabhakar66 was doing some work on it in the previous issue thread, but didn't see an official PR on it.

fxmarty · 2024-01-26T10:29:28Z

@mattsthilaire For sure, feel free to open a PR!

michaelbenayoun added the good first issue Good for newcomers label Dec 7, 2022

michaelbenayoun pinned this issue Dec 7, 2022

sgugger mentioned this issue Dec 7, 2022

ONNXConfig: Add a configuration for all available models huggingface/transformers#16308

Closed

mszsorondo mentioned this issue Jan 2, 2023

VisualBertOnnx #663

Closed

fxmarty mentioned this issue Jan 3, 2023

Test ORTModel.from_pretrained() for all architectures #664

Closed

Allanbeddouk added a commit to Allanbeddouk/optimum that referenced this issue Feb 1, 2023

Support Splinter exporters (huggingface#555)

87cfc68

Allanbeddouk mentioned this issue Feb 1, 2023

Support Splinter exporters (#555) #736

Merged

sidthekidder mentioned this issue Feb 6, 2023

Add gpt-neo-x support #745

Merged

3 tasks

fxmarty closed this as completed in #745 Feb 7, 2023

fxmarty reopened this Feb 7, 2023

someshfengde mentioned this issue Feb 18, 2023

TAPASonnx optimum.exporters.onnx support #791

Closed

hazrulakmal linked a pull request Feb 21, 2023 that will close this issue

Add onnx support for FNet #802

Open

1 task

adit299 mentioned this issue Feb 24, 2023

Adding ONNX support for ImageGPT #819

Merged

3 tasks

fxmarty reopened this Mar 4, 2023

fxmarty mentioned this issue Mar 31, 2023

Pix2struct convert to ONNX #937

Closed

fxmarty mentioned this issue Apr 11, 2023

Deberta onnx pipeline issue #968

Open

huggingface deleted a comment from rcshubhadeep Apr 11, 2023

fxmarty mentioned this issue Apr 21, 2023

Have optimum supported BLIP-2 model converted to onnx? #987

Closed

This was referenced Jun 23, 2023

Add CvT ONNX Config #1130

Closed

Add CvTONNX Config #1131

Merged

xenova mentioned this issue Nov 11, 2023

[ONNX export] Add depth-estimation w/ DPT+GLPN #1529

Merged

3 tasks

vrdn-23 mentioned this issue Feb 27, 2024

Adds MPNet to NormalizedConfig and ORTConfigManager #1471

Merged

3 tasks

BrightXiaoHan mentioned this issue Mar 12, 2024

Add support for Ernie model #1753

Open

3 tasks

Community contribution - optimum.exporters.onnx support for new models! #555

Community contribution - optimum.exporters.onnx support for new models! #555

Comments

michaelbenayoun commented Dec 7, 2022 • edited by fxmarty

mszsorondo commented Jan 1, 2023 • edited

fxmarty commented Jan 2, 2023 • edited

mszsorondo commented Jan 2, 2023

bhavnicksm commented Jan 3, 2023

fxmarty commented Jan 3, 2023

bhavnicksm commented Jan 3, 2023 • edited

fxmarty commented Jan 3, 2023 • edited

chainyo commented Feb 7, 2023

fxmarty commented Feb 7, 2023

adit299 commented Feb 14, 2023

fxmarty commented Feb 14, 2023

someshfengde commented Feb 18, 2023

asrimanth commented Feb 18, 2023

michaelbenayoun commented Feb 20, 2023

hazrulakmal commented Feb 21, 2023

fxmarty commented Feb 22, 2023

soma2000-lang commented Mar 4, 2023

fxmarty commented Apr 11, 2023

gjain7 commented Apr 28, 2023

regisss commented Apr 28, 2023

michaelbenayoun commented May 2, 2023

michaelbenayoun commented May 2, 2023

michaelbenayoun commented May 3, 2023

maiiabocharova commented May 3, 2023 • edited

michaelbenayoun commented May 4, 2023

maiiabocharova commented May 4, 2023

rishabbala commented Jun 21, 2023

fxmarty commented Jun 23, 2023

ingo-m commented Jul 6, 2023

sidistic commented Jul 12, 2023 • edited

regisss commented Jul 13, 2023

sidistic commented Jul 26, 2023

soharas commented Aug 2, 2023

manishghop commented Dec 10, 2023

mattsthilaire commented Jan 24, 2024

fxmarty commented Jan 26, 2024

Community contribution - `optimum.exporters.onnx` support for new models! #555

Community contribution - `optimum.exporters.onnx` support for new models! #555

michaelbenayoun commented Dec 7, 2022 •

edited by fxmarty

mszsorondo commented Jan 1, 2023 •

edited

fxmarty commented Jan 2, 2023 •

edited

bhavnicksm commented Jan 3, 2023 •

edited

fxmarty commented Jan 3, 2023 •

edited

maiiabocharova commented May 3, 2023 •

edited

sidistic commented Jul 12, 2023 •

edited