Reproducible output with Azure OpenAI GPT-4-Turbo v1106

The attached Jupyter notebook demonstrates how to generate reproducible output in GPT-4-Turbo with the new "seed" parameter. OpenAI announced this feature at its Dev Day on 6 November 2023, and it has been available in Azure OpenAI since 17 November 2023.

The provided code runs on OpenAI Python SDK v1.x. To upgrade the openai Python package to the latest version, use the following pip command:

pip install --upgrade openai
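
To confirm that the upgrade took effect, the installed major version can be checked from Python. The snippet below is only a sketch: the helper name sdk_major_version is hypothetical and not part of the notebook, and it assumes the package's version string starts with a numeric major component.

```python
from importlib.metadata import PackageNotFoundError, version


def sdk_major_version(package="openai"):
    """Return the installed major version of a package, or None if it is absent."""
    try:
        return int(version(package).split(".")[0])
    except PackageNotFoundError:
        return None


major = sdk_major_version()
if major is None:
    print("openai is not installed")
elif major < 1:
    print("openai is older than v1.x; upgrade it first")
else:
    print(f"openai v{major}.x detected")
```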

Pre-requisites

Deploy one of the v1106 GPT models: either GPT-4-Turbo or GPT-35-Turbo.
Set the API endpoint, API version and API key, along with the Azure OpenAI deployment name, in the relevant environment variables. The provided code assumes these variables are named OPENAI_API_BASE, OPENAI_API_VERSION, OPENAI_API_KEY and OPENAI_API_DEPLOY.
To reproduce the same output "almost always", pass the same integer value in the "seed" parameter on every call. In this example, it is set to 42.
Keep all other parameters of the Chat Completions API call (such as "temperature" and "messages") the same as well.
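
One possible way to wire these environment variables into an SDK v1.x client is sketched below. The helper names load_settings and make_client are hypothetical and may differ from what the notebook does; the AzureOpenAI class and its azure_endpoint, api_version and api_key arguments come from the openai package itself.

```python
import os

# Environment variable names that the provided code is assumed to read.
REQUIRED_VARS = (
    "OPENAI_API_BASE",
    "OPENAI_API_VERSION",
    "OPENAI_API_KEY",
    "OPENAI_API_DEPLOY",
)


def load_settings(env=None):
    """Collect the required settings, failing early if any of them is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}


def make_client(settings):
    """Build an Azure OpenAI client from the collected settings."""
    # Imported here so that load_settings works even without the SDK installed.
    from openai import AzureOpenAI

    return AzureOpenAI(
        azure_endpoint=settings["OPENAI_API_BASE"],
        api_version=settings["OPENAI_API_VERSION"],
        api_key=settings["OPENAI_API_KEY"],
    )
```

With the variables set, client = make_client(load_settings()) yields a client, and the OPENAI_API_DEPLOY value from the same settings is what gets passed as the model argument.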

Scenario 1: Testing without seed

To test the model's behaviour when temperature is above 0, we create a simple list with 2 identical prompts:

[
    "Create a story about red panda.",
    "Create a story about red panda."
]

If we submit each prompt to GPT-4-Turbo with a temperature value of 0.1 and without the seed parameter:

completion = client.chat.completions.create(
    model = AOAI_Deployment, # model = "Azure OpenAI deployment name".
    temperature = 0.1,
    messages = [
        {"role": "system", "content": "You always produce 3-sentence answers."},
        {"role": "user", "content": prompt}
    ]        
)

We may get slightly different outputs for two separate submissions of the same prompt:

--------------------
In the lush forests of the Himalayas, a curious red panda named Pabu spent his days frolicking among the trees. One day, Pabu stumbled upon a hidden grove filled with the sweetest bamboo he'd ever tasted, but it was guarded by a mischievous monkey. With cleverness and a dash of bravery, Pabu outwitted the monkey, sharing the grove's bounty with his fellow pandas, becoming a legend in the forest.
--------------------
In the lush forests of the Himalayas, a curious red panda named Pabu spent his days frolicking among the trees. One day, Pabu stumbled upon a hidden grove filled with the sweetest bamboo he'd ever tasted, which he decided to keep as his secret snack spot. Little did he know, his delightful discovery would soon attract a band of fellow pandas, leading to the most enchanting bamboo feasts the forest had ever seen.
--------------------
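
The submission loop for this scenario might look roughly like the sketch below. The function names collect_completions and print_completions are hypothetical, and the client and deployment arguments are assumed to be an already configured SDK v1.x client and your Azure OpenAI deployment name.

```python
PROMPTS = [
    "Create a story about red panda.",
    "Create a story about red panda.",
]


def collect_completions(client, deployment, prompts, temperature=0.1):
    """Submit every prompt as a separate call, without a seed, and return the texts."""
    outputs = []
    for prompt in prompts:
        completion = client.chat.completions.create(
            model=deployment,  # Azure OpenAI deployment name
            temperature=temperature,
            messages=[
                {"role": "system", "content": "You always produce 3-sentence answers."},
                {"role": "user", "content": prompt},
            ],
        )
        outputs.append(completion.choices[0].message.content)
    return outputs


def print_completions(outputs):
    """Print the completions separated by dashed lines."""
    print("-" * 20)
    for text in outputs:
        print(text)
        print("-" * 20)
```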

Scenario 2: Testing with seed

Now we can submit the same pair of prompts to GPT-4-Turbo with our new seed parameter:

completion = client.chat.completions.create(
    model = AOAI_Deployment, # model = "Azure OpenAI deployment name".
    temperature = 0.1,
    seed = 42,
    messages = [
        {"role": "system", "content": "You always produce 3-sentence answers."},
        {"role": "user", "content": prompt}
    ]        
)

With the seed fixed, the model will "almost always" produce the same output:

--------------------
In the lush forests of the Himalayas, a curious red panda named Pabu spent his days frolicking among the trees. One day, Pabu stumbled upon a hidden grove filled with the sweetest bamboo he had ever tasted, which he decided to keep as his secret snack spot. Little did he know, his delightful discovery would soon attract other forest creatures, leading to unexpected friendships and adventures.
--------------------
In the lush forests of the Himalayas, a curious red panda named Pabu spent his days frolicking among the trees. One day, Pabu stumbled upon a hidden grove filled with the sweetest bamboo he had ever tasted, which he decided to keep as his secret snack spot. Little did he know, his delightful discovery would soon attract other forest creatures, leading to unexpected friendships and adventures.
--------------------
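
Reproducibility holds only "almost always" because the serving backend can change between calls. Chat Completions responses in SDK v1.x carry a system_fingerprint field that can help to spot such changes. The check below is a minimal sketch: the helper name fingerprints_match is hypothetical, and it assumes your deployment actually populates the field.

```python
def fingerprints_match(completions):
    """Return True when every completion reports the same, non-empty fingerprint."""
    fingerprints = {getattr(c, "system_fingerprint", None) for c in completions}
    return len(fingerprints) == 1 and None not in fingerprints
```

If this returns False for two seeded calls, differing outputs are more likely to come from a backend change than from the request parameters.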

Verifying reproducible outcome

To verify the outcomes of both scenarios, we'll use Python's difflib module:

import difflib as dl
differenciator = dl.Differ()

For the first scenario, it finds the differences between the two generated completions:

Found these differences between completions: ['but', 'it', 'which', 'he', 'decided', 'to', 'keep', 'was', 'as', 'guarded', 'by', 'his', 'secret', 'snack', 'spot.', 'Little', 'did', 'he', 'know,', 'his', 'delightful', 'discovery', 'would', 'soon', 'attract', 'mischievous', 'monkey.', 'With', 'cleverness', 'and', 'band', 'a', 'dash', 'bravery,', 'Pabu', 'outwitted', 'the', 'monkey,', 'sharing', 'the', "grove's", 'bounty', 'with', 'his', 'leading', 'to', 'becoming', 'a', 'legend', 'in', 'most', 'enchanting', 'bamboo', 'feasts', 'the', 'forest.', 'forest', 'had', 'ever', 'seen.']

For the second scenario, Differ's compare function confirms that the two completions are identical:

No differences found between completions.
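
The word-level comparison could be implemented along the following lines. This is a sketch that assumes the completions are split on whitespace before comparison; the helper names word_differences and report are hypothetical, and the notebook's own code may differ.

```python
import difflib as dl


def word_differences(first, second):
    """Return the words that Differ marks as removed ('- ') or added ('+ ')."""
    differ = dl.Differ()
    delta = differ.compare(first.split(), second.split())
    # Skip unchanged words ('  ') and intraline hint lines ('? ').
    return [token[2:] for token in delta if token.startswith(("- ", "+ "))]


def report(first, second):
    """Print a summary in the same style as the messages shown above."""
    differences = word_differences(first, second)
    if differences:
        print(f"Found these differences between completions: {differences}")
    else:
        print("No differences found between completions.")
```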

LazaUK/AOAI-ReproducibleOutput-SDKv1
