The temperature at 0.0001 (or other arbitrarily small float) is still too high #270

monsieurpooh · 2022-01-22T11:37:35Z

When the temperature is set to 0.00001 or a similarly small float, the output is noticeably less chaotic than at significantly larger temperatures. However, it is still very non-deterministic and often answers questions wrong, even ones it gets right the majority of the time. I suspect it would be better to have more freedom over the temperature range, so that 0.00001 actually denotes an extremely low temperature with almost no variation in output, for better question-answering capability.

If anyone knows of a workaround for this, please let me know.

monsieurpooh · 2022-01-22T23:04:40Z

I am trying to fix this bug by delving into the code on my end, but I can't even figure out where the code lives. The first line is "from transformers import GPTNeoForCausalLM, GPT2Tokenizer". I can't find where "GPTNeoForCausalLM" is in transformers. When I do a text search on the whole library folder it turns up empty.

monsieurpooh · 2022-01-23T00:08:35Z

I found out how to browse the source code, but I am now confused about how dividing all the scores by a common value will change the ranking or the final result: https://github.com/huggingface/transformers/blob/87e6e4fe5c7e65cb69e70306f22de6daf16b6e14/src/transformers/generation_logits_process.py#L141
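A minimal sketch of why the division matters, assuming the scaled scores are then passed through a softmax before sampling (the names `softmax_with_temperature` and `logits` are illustrative and not taken from the linked file). Dividing by a common value never changes the ranking, but it does change how much probability mass the top-ranked token receives:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Return softmax(logits / temperature) as a list of probabilities."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max so exp() never sees a large positive value
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens

for t in (1.0, 0.5, 0.1):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 4) for p in probs])
```

The order of the tokens stays the same at every temperature; only the gap between them widens as the temperature shrinks, so a sampler picks the top token more and more often.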

monsieurpooh · 2022-01-23T00:29:45Z

This is not really a bug. I found out I had to go way lower, lower than what one would expect a typical "float" to handle in most programming languages. I specified 0.00000000000001 as the temperature and now the output is pretty consistent.

monsieurpooh · 2022-01-30T04:05:37Z

I would like to reopen this issue because in some situations with long prompts, even 1e-18 is not small enough to produce a totally deterministic response, and at such a small value the script has a chance of throwing an exception: "probability tensor contains either inf, nan or element < 0".
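One possible contributor to that exception, sketched under the assumption that the scores are held in a reduced-precision format (the thread does not say which dtype was in use, and the helper `scaled_logit_overflows` and the example score are hypothetical). Dividing an ordinary score by 1e-18 gives a magnitude far beyond what half precision can represent, and once a value becomes infinite, later arithmetic can turn it into NaN:

```python
import math

FLOAT16_MAX = 65504.0       # largest finite IEEE 754 half-precision value
FLOAT32_MAX = 3.4028235e38  # largest finite IEEE 754 single-precision value

def scaled_logit_overflows(logit, temperature, max_finite):
    """True if |logit / temperature| exceeds the largest finite value of a format."""
    return abs(logit / temperature) > max_finite

logit = 10.0  # hypothetical raw score

print(scaled_logit_overflows(logit, 1e-18, FLOAT16_MAX))  # True
print(scaled_logit_overflows(logit, 1e-18, FLOAT32_MAX))  # False

# Once a value has overflowed to infinity, subtracting two infinities yields NaN.
inf = float("inf")
print(math.isnan(inf - inf))  # True
```

Under this assumption the failure would depend on the precision in use: the same temperature that overflows in half precision still fits in single precision for this example score.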

monsieurpooh · 2022-02-23T15:35:46Z

The workaround is to disable sampling.
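In Hugging Face transformers, sampling is controlled by the `do_sample` argument of `generate`; with it off, the highest-scoring token is taken at each step and temperature plays no role. The toy function below is only a sketch of that difference, not the library's code: `pick_token` and the example scores are hypothetical.

```python
import math
import random

def pick_token(logits, do_sample, temperature=1.0, rng=random):
    """Pick a token index: argmax when sampling is off, a random draw otherwise."""
    if not do_sample:
        # Greedy choice: deterministic, and independent of temperature.
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

logits = [2.0, 1.9, 1.8]  # made-up scores that are close together
rng = random.Random(0)    # seeded so the sampled run is repeatable

greedy_picks = {pick_token(logits, do_sample=False) for _ in range(200)}
sampled_picks = {pick_token(logits, do_sample=True, rng=rng) for _ in range(200)}

print(greedy_picks)   # always the single highest-scoring index
print(sampled_picks)  # several different indices
```

Because the greedy branch never divides by the temperature or builds a probability distribution, it avoids both the residual randomness and the numerical problems of very small temperatures.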

monsieurpooh added the "bug" label (Something isn't working) Jan 22, 2022

monsieurpooh closed this as completed Jan 23, 2022

monsieurpooh reopened this Jan 30, 2022
