```python
def prepare_inputs_for_generation(self, input_ids, past_key_values=None, **kwargs):
    if past_key_values is not None:
        past_length = past_key_values[0][0].shape[2]

        # Some generation methods already pass only the last input ID
        if input_ids.shape[1] > past_length:
            remove_prefix_length = past_length
        else:
            # Default to old behavior: keep only final ID
            remove_prefix_length = input_ids.shape[1] - 1

        input_ids = input_ids[:, remove_prefix_length:]
```

```python
        # Keep only the unprocessed tokens:
        # 1 - If the length of the attention_mask exceeds the length of input_ids, then we are in a setting where
        # some of the inputs are exclusively passed as part of the cache (e.g. when passing input_embeds as
        # input)
        if attention_mask is not None and attention_mask.shape[1] > input_ids.shape[1]:
            input_ids = input_ids[:, -(attention_mask.shape[1] - past_length):]
        # 2 - If the past_length is smaller than input_ids', then input_ids holds all input tokens. We can discard
        # input_ids based on the past_length.
        elif past_length < input_ids.shape[1]:
            input_ids = input_ids[:, past_length:]
```
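To make the slicing rules above concrete, here is a minimal standalone sketch using plain Python lists in place of the real tensors. The function name `slice_input_ids` is hypothetical (not from either library); it only mirrors the branch logic over the sequence-length dimension.

```python
def slice_input_ids(input_ids, past_length, attention_mask=None):
    """Sketch of the cache-aware slicing rules, with lists standing in
    for the seq-length dimension and past_length the number of tokens
    already stored in the KV cache."""
    # 1 - attention_mask longer than input_ids: some inputs live only in
    # the cache, so keep the tail the mask marks as still unprocessed.
    if attention_mask is not None and len(attention_mask) > len(input_ids):
        return input_ids[-(len(attention_mask) - past_length):]
    # 2 - input_ids holds all tokens: drop the prefix already cached.
    elif past_length < len(input_ids):
        return input_ids[past_length:]
    # Otherwise fall back to keeping only the final ID.
    return input_ids[-1:]

# Full prompt re-passed with 3 tokens already cached -> keep the new tail.
print(slice_input_ids([1, 2, 3, 4, 5], past_length=3))  # [4, 5]
# Only the newest token passed -> it is kept as-is.
print(slice_input_ids([5], past_length=4))  # [5]
```

The branches matter because different generation paths pass either the full sequence or only the newest token on each step; the slicing normalizes both cases so only uncached tokens reach the model.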
optimum/optimum/onnxruntime/modeling_decoder.py, line 649 at c55f882
while in the non-ONNX modeling code it is not:
https://github.com/huggingface/transformers/blob/a98c41798cf6ed99e1ff17e3792d6e06a2ff2ff3/src/transformers/models/mistral/modeling_mistral.py#L1217