You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I am getting the following error when trying to run the distil-whisper-asr.ipynb notebook. The error occurs at the 'Quantize Distil-Whisper encoder and decoder models' block. The error happens if the selected device is GPU (Arc770 or iGPU); no error occurs if the device is CPU.
I am running on Windows
RuntimeError Traceback (most recent call last)
Cell In[21], line 1
----> 1 get_ipython().run_cell_magic('skip', 'not $to_quantize.value', '\nimport gc\nimport shutil\nimport nncf\n\nCALIBRATION_DATASET_SIZE = 50\nquantized_model_path = Path(f"{model_path}_quantized")\n\n\ndef quantize(ov_model: OVModelForSpeechSeq2Seq, calibration_dataset_size: int):\n if not quantized_model_path.exists():\n encoder_calibration_data, decoder_calibration_data = collect_calibration_dataset(\n ov_model, calibration_dataset_size\n )\n print("Quantizing encoder")\n quantized_encoder = nncf.quantize(\n ov_model.encoder.model,\n nncf.Dataset(encoder_calibration_data),\n subset_size=len(encoder_calibration_data),\n model_type=nncf.ModelType.TRANSFORMER,\n # Smooth Quant algorithm reduces activation quantization error; optimal alpha value was obtained through grid search\n advanced_parameters=nncf.AdvancedQuantizationParameters(smooth_quant_alpha=0.50)\n )\n ov.save_model(quantized_encoder, quantized_model_path / "openvino_encoder_model.xml")\n del quantized_encoder\n del encoder_calibration_data\n gc.collect()\n\n print("Quantizing decoder with past")\n quantized_decoder_with_past = nncf.quantize(\n ov_model.decoder_with_past.model,\n nncf.Dataset(decoder_calibration_data),\n subset_size=len(decoder_calibration_data),\n model_type=nncf.ModelType.TRANSFORMER,\n # Smooth Quant algorithm reduces activation quantization error; optimal alpha value was obtained through grid search\n advanced_parameters=nncf.AdvancedQuantizationParameters(smooth_quant_alpha=0.95)\n )\n ov.save_model(quantized_decoder_with_past, quantized_model_path / "openvino_decoder_with_past_model.xml")\n del quantized_decoder_with_past\n del decoder_calibration_data\n gc.collect()\n\n # Copy the config file and the first-step-decoder manually\n shutil.copy(model_path / "config.json", quantized_model_path / "config.json")\n shutil.copy(model_path / "openvino_decoder_model.xml", quantized_model_path / "openvino_decoder_model.xml")\n shutil.copy(model_path / "openvino_decoder_model.bin", 
quantized_model_path / "openvino_decoder_model.bin")\n\n quantized_ov_model = OVModelForSpeechSeq2Seq.from_pretrained(quantized_model_path, ov_config=ov_config, compile=False)\n quantized_ov_model.to(device.value)\n quantized_ov_model.compile()\n return quantized_ov_model\n\n\nov_quantized_model = quantize(ov_model, CALIBRATION_DATASET_SIZE)\n')
File ~\miniconda3\envs\openvino_env\lib\site-packages\IPython\core\interactiveshell.py:2541, in InteractiveShell.run_cell_magic(self, magic_name, line, cell)
2539 with self.builtin_trap:
2540 args = (magic_arg_s, cell)
-> 2541 result = fn(*args, **kwargs)
2543 # The code below prevents the output from being displayed
2544 # when using magics with decorator @output_can_be_silenced
2545 # when the last Python token in the expression is a ';'.
2546 if getattr(fn, magic.MAGIC_OUTPUT_CAN_BE_SILENCED, False):
File ~\miniconda3\envs\openvino_env\lib\site-packages\openvino\runtime\ie_api.py:521, in Core.compile_model(self, model, device_name, config, weights)
516 if device_name is None:
517 return CompiledModel(
518 super().compile_model(model, {} if config is None else config),
519 )
520 return CompiledModel(
--> 521 super().compile_model(model, device_name, {} if config is None else config),
522 )
523 else:
524 if device_name is None:
RuntimeError: Exception from src\inference\src\cpp\core.cpp:109:
Exception from src\inference\src\dev\plugin.cpp:54:
Exception from src\core\src\dimension.cpp:227:
Cannot get length of dynamic dimension
The text was updated successfully, but these errors were encountered:
I am getting the following error when trying to run the distil-whisper-asr.ipynb notebook. The error occurs at the 'Quantize Distil-Whisper encoder and decoder models' block. The error happens if the selected device is GPU (Arc770 or iGPU); no error occurs if the device is CPU.
I am running on Windows
RuntimeError Traceback (most recent call last)
Cell In[21], line 1
----> 1 get_ipython().run_cell_magic('skip', 'not $to_quantize.value', '\nimport gc\nimport shutil\nimport nncf\n\nCALIBRATION_DATASET_SIZE = 50\nquantized_model_path = Path(f"{model_path}_quantized")\n\n\ndef quantize(ov_model: OVModelForSpeechSeq2Seq, calibration_dataset_size: int):\n if not quantized_model_path.exists():\n encoder_calibration_data, decoder_calibration_data = collect_calibration_dataset(\n ov_model, calibration_dataset_size\n )\n print("Quantizing encoder")\n quantized_encoder = nncf.quantize(\n ov_model.encoder.model,\n nncf.Dataset(encoder_calibration_data),\n subset_size=len(encoder_calibration_data),\n model_type=nncf.ModelType.TRANSFORMER,\n # Smooth Quant algorithm reduces activation quantization error; optimal alpha value was obtained through grid search\n advanced_parameters=nncf.AdvancedQuantizationParameters(smooth_quant_alpha=0.50)\n )\n ov.save_model(quantized_encoder, quantized_model_path / "openvino_encoder_model.xml")\n del quantized_encoder\n del encoder_calibration_data\n gc.collect()\n\n print("Quantizing decoder with past")\n quantized_decoder_with_past = nncf.quantize(\n ov_model.decoder_with_past.model,\n nncf.Dataset(decoder_calibration_data),\n subset_size=len(decoder_calibration_data),\n model_type=nncf.ModelType.TRANSFORMER,\n # Smooth Quant algorithm reduces activation quantization error; optimal alpha value was obtained through grid search\n advanced_parameters=nncf.AdvancedQuantizationParameters(smooth_quant_alpha=0.95)\n )\n ov.save_model(quantized_decoder_with_past, quantized_model_path / "openvino_decoder_with_past_model.xml")\n del quantized_decoder_with_past\n del decoder_calibration_data\n gc.collect()\n\n # Copy the config file and the first-step-decoder manually\n shutil.copy(model_path / "config.json", quantized_model_path / "config.json")\n shutil.copy(model_path / "openvino_decoder_model.xml", quantized_model_path / "openvino_decoder_model.xml")\n shutil.copy(model_path / "openvino_decoder_model.bin", 
quantized_model_path / "openvino_decoder_model.bin")\n\n quantized_ov_model = OVModelForSpeechSeq2Seq.from_pretrained(quantized_model_path, ov_config=ov_config, compile=False)\n quantized_ov_model.to(device.value)\n quantized_ov_model.compile()\n return quantized_ov_model\n\n\nov_quantized_model = quantize(ov_model, CALIBRATION_DATASET_SIZE)\n')
File ~\miniconda3\envs\openvino_env\lib\site-packages\IPython\core\interactiveshell.py:2541, in InteractiveShell.run_cell_magic(self, magic_name, line, cell)
2539 with self.builtin_trap:
2540 args = (magic_arg_s, cell)
-> 2541 result = fn(*args, **kwargs)
2543 # The code below prevents the output from being displayed
2544 # when using magics with decorator @output_can_be_silenced
2545 # when the last Python token in the expression is a ';'.
2546 if getattr(fn, magic.MAGIC_OUTPUT_CAN_BE_SILENCED, False):
File ~\Downloads\openvino_notebooks\notebooks\distil-whisper-asr\skip_kernel_extension.py:17, in skip(line, cell)
11 if eval(line):
13 return
---> 17 get_ipython().ex(cell)
File ~\miniconda3\envs\openvino_env\lib\site-packages\IPython\core\interactiveshell.py:2878, in InteractiveShell.ex(self, cmd)
2876 """Execute a normal python statement in user namespace."""
2877 with self.builtin_trap:
-> 2878 exec(cmd, self.user_global_ns, self.user_ns)
File :54
File :50, in quantize(ov_model, calibration_dataset_size)
File ~\miniconda3\envs\openvino_env\lib\site-packages\optimum\intel\openvino\modeling_seq2seq.py:461, in OVModelForSeq2SeqLM.compile(self)
460 def compile(self):
--> 461 self.encoder._compile()
462 self.decoder._compile()
463 if self.use_cache:
File ~\miniconda3\envs\openvino_env\lib\site-packages\optimum\intel\openvino\modeling_seq2seq.py:523, in OVEncoder._compile(self)
521 if self.request is None:
522 logger.info(f"Compiling the encoder to {self._device} ...")
--> 523 self.request = core.compile_model(self.model, self._device, ov_config)
524 # OPENVINO_LOG_LEVEL can be found in https://docs.openvino.ai/2023.2/openvino_docs_OV_UG_supported_plugins_AUTO_debugging.html
525 if "OPENVINO_LOG_LEVEL" in os.environ and int(os.environ["OPENVINO_LOG_LEVEL"]) > 2:
File ~\miniconda3\envs\openvino_env\lib\site-packages\openvino\runtime\ie_api.py:521, in Core.compile_model(self, model, device_name, config, weights)
516 if device_name is None:
517 return CompiledModel(
518 super().compile_model(model, {} if config is None else config),
519 )
520 return CompiledModel(
--> 521 super().compile_model(model, device_name, {} if config is None else config),
522 )
523 else:
524 if device_name is None:
RuntimeError: Exception from src\inference\src\cpp\core.cpp:109:
Exception from src\inference\src\dev\plugin.cpp:54:
Exception from src\core\src\dimension.cpp:227:
Cannot get length of dynamic dimension
The text was updated successfully, but these errors were encountered: