Generating reproducible output in GPT-4-Turbo v1106 with the "seed" parameter in Python SDK v1.x

LazaUK/AOAI-ReproducibleOutput-SDKv1

Reproducible output with Azure OpenAI GPT-4-Turbo v1106

In the attached Jupyter notebook, I demo how to generate reproducible output from GPT-4-Turbo with the new "seed" parameter. OpenAI originally announced this feature at its Dev Day on 6 November 2023, and it has been available in Azure OpenAI since 17 November 2023.

The provided code runs on OpenAI Python SDK v1.x. To use the latest version of the openai Python package, upgrade it with the following pip command:

pip install --upgrade openai
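After upgrading, it can be worth confirming that the installed package really is a v1.x release. The snippet below is only a minimal sketch; the helper names `installed_openai_version` and `major_version` are made up for illustration and are not part of the notebook.

```python
from importlib import metadata


def major_version(version):
    """Return the leading numeric component of a version string such as '1.3.5'."""
    return int(version.split(".")[0])


def installed_openai_version():
    """Return the installed openai package version, or None if it is not installed."""
    try:
        return metadata.version("openai")
    except metadata.PackageNotFoundError:
        return None


def check_sdk():
    """Print a short message describing whether the SDK looks like v1.x or later."""
    version = installed_openai_version()
    if version is None:
        print("openai is not installed")
    elif major_version(version) >= 1:
        print(f"openai {version} is installed (v1.x or later)")
    else:
        print(f"openai {version} is older than v1.x, please upgrade")
```

Calling `check_sdk()` in a notebook cell prints one of the three messages, depending on the local environment.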

Pre-requisites

  1. Ensure that you deploy one of the v1106 GPT models: either GPT-4-Turbo or GPT-35-Turbo.
  2. Set the API endpoint name, version and key, along with the Azure OpenAI deployment name, in the relevant environment variables. The provided code assumes that the environment variables are OPENAI_API_BASE, OPENAI_API_VERSION, OPENAI_API_KEY and OPENAI_API_DEPLOY.
  3. To "almost always" reproduce the same output, your "seed" parameter should always use the same integer value. In this example, it's set to 42.
  4. All the other parameters (like "temperature", "messages", etc.) in the Chat Completions API call should also stay the same.
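The prerequisites above could be wired up roughly as follows. This is a sketch under assumptions rather than the notebook's exact code: the environment variable names come from this README, while the helpers `load_settings` and `make_client` are hypothetical names introduced here. `AzureOpenAI` is the Azure client class shipped with the openai v1.x package.

```python
import os

# Environment variable names taken from the prerequisites above.
REQUIRED_VARS = (
    "OPENAI_API_BASE",
    "OPENAI_API_VERSION",
    "OPENAI_API_KEY",
    "OPENAI_API_DEPLOY",
)


def load_settings(env=os.environ):
    """Read the four required variables and fail early if any is missing or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}


def make_client(settings):
    """Create an Azure OpenAI client from the loaded settings."""
    # Imported lazily so that load_settings can be used without the SDK installed.
    from openai import AzureOpenAI

    return AzureOpenAI(
        azure_endpoint=settings["OPENAI_API_BASE"],
        api_version=settings["OPENAI_API_VERSION"],
        api_key=settings["OPENAI_API_KEY"],
    )
```

With these helpers, `settings = load_settings()` followed by `client = make_client(settings)` and `AOAI_Deployment = settings["OPENAI_API_DEPLOY"]` would provide the `client` and `AOAI_Deployment` names used in the scenarios.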

Scenario 1: Testing without seed

  1. To test the model's behaviour when temperature is above 0, we create a simple list with 2 identical prompts:
[
    "Create a story about red panda.",
    "Create a story about red panda."
]
  2. We then submit it to GPT-4-Turbo with a temperature value of 0.1 and without the seed parameter:
completion = client.chat.completions.create(
    model=AOAI_Deployment,  # model = "Azure OpenAI deployment name".
    temperature=0.1,
    messages=[
        {"role": "system", "content": "You always produce 3-sentence answers."},
        {"role": "user", "content": prompt}
    ]
)
  3. We may get slightly different outputs for two separate submissions of the same prompt:
--------------------
In the lush forests of the Himalayas, a curious red panda named Pabu spent his days frolicking among the trees. One day, Pabu stumbled upon a hidden grove filled with the sweetest bamboo he'd ever tasted, but it was guarded by a mischievous monkey. With cleverness and a dash of bravery, Pabu outwitted the monkey, sharing the grove's bounty with his fellow pandas, becoming a legend in the forest.
--------------------
In the lush forests of the Himalayas, a curious red panda named Pabu spent his days frolicking among the trees. One day, Pabu stumbled upon a hidden grove filled with the sweetest bamboo he'd ever tasted, which he decided to keep as his secret snack spot. Little did he know, his delightful discovery would soon attract a band of fellow pandas, leading to the most enchanting bamboo feasts the forest had ever seen.
--------------------
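The call shown in this scenario handles a single prompt; the loop over the prompt list might look like the sketch below. The helper names `collect_completions` and `print_completions` and the `SYSTEM_MESSAGE` constant are assumptions made for illustration, and `client` is assumed to be an already constructed Azure OpenAI client.

```python
SYSTEM_MESSAGE = "You always produce 3-sentence answers."
SEPARATOR = "-" * 20


def collect_completions(client, deployment, prompts, temperature=0.1):
    """Submit each prompt in its own request, without a seed, and return the texts."""
    outputs = []
    for prompt in prompts:
        completion = client.chat.completions.create(
            model=deployment,  # Azure OpenAI deployment name
            temperature=temperature,
            messages=[
                {"role": "system", "content": SYSTEM_MESSAGE},
                {"role": "user", "content": prompt},
            ],
        )
        outputs.append(completion.choices[0].message.content)
    return outputs


def print_completions(outputs):
    """Print the completions between separator lines, as in the output above."""
    print(SEPARATOR)
    for text in outputs:
        print(text)
        print(SEPARATOR)
```

Each prompt goes out as a separate request, so even identical prompts are sampled independently when no seed is supplied.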

Scenario 2: Testing with seed

  1. Now we can submit the same pair of prompts to GPT-4-Turbo with our new seed parameter:
completion = client.chat.completions.create(
    model=AOAI_Deployment,  # model = "Azure OpenAI deployment name".
    temperature=0.1,
    seed=42,
    messages=[
        {"role": "system", "content": "You always produce 3-sentence answers."},
        {"role": "user", "content": prompt}
    ]
)
  2. With the seed set, the model "almost always" produces the same output:
--------------------
In the lush forests of the Himalayas, a curious red panda named Pabu spent his days frolicking among the trees. One day, Pabu stumbled upon a hidden grove filled with the sweetest bamboo he had ever tasted, which he decided to keep as his secret snack spot. Little did he know, his delightful discovery would soon attract other forest creatures, leading to unexpected friendships and adventures.
--------------------
In the lush forests of the Himalayas, a curious red panda named Pabu spent his days frolicking among the trees. One day, Pabu stumbled upon a hidden grove filled with the sweetest bamboo he had ever tasted, which he decided to keep as his secret snack spot. Little did he know, his delightful discovery would soon attract other forest creatures, leading to unexpected friendships and adventures.
--------------------
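One way to probe the "almost always" caveat is to record the `system_fingerprint` field of each response next to its text. As far as I understand OpenAI's guidance, matching output is only expected when the seed, the other request parameters and the fingerprint all match. The sketch below is illustrative only: `seeded_completions` and `summarize` are hypothetical helpers, `client` is assumed to be an Azure OpenAI client, and the fingerprint is read defensively because it may not be populated.

```python
def seeded_completions(client, deployment, prompt, seed=42, runs=2, temperature=0.1):
    """Send the same seeded request several times; return (text, fingerprint) pairs."""
    results = []
    for _ in range(runs):
        completion = client.chat.completions.create(
            model=deployment,  # Azure OpenAI deployment name
            temperature=temperature,
            seed=seed,
            messages=[
                {"role": "system", "content": "You always produce 3-sentence answers."},
                {"role": "user", "content": prompt},
            ],
        )
        text = completion.choices[0].message.content
        # The fingerprint may be absent or None, so read it defensively.
        fingerprint = getattr(completion, "system_fingerprint", None)
        results.append((text, fingerprint))
    return results


def summarize(results):
    """Report whether all texts and all fingerprints in the results are the same."""
    texts = {text for text, _ in results}
    fingerprints = {fingerprint for _, fingerprint in results}
    return {
        "identical_output": len(texts) == 1,
        "same_fingerprint": len(fingerprints) == 1,
    }
```

If `identical_output` is false while `same_fingerprint` is also false, a change on the service side is one plausible explanation for the mismatch.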

Verifying reproducible outcome

  1. To verify the outcomes of both scenarios, we'll use Python's difflib module:
import difflib as dl
differenciator = dl.Differ()
  2. For the first scenario, it helps us find the differences between the 2 produced completions:
Found these differences between completions: ['but', 'it', 'which', 'he', 'decided', 'to', 'keep', 'was', 'as', 'guarded', 'by', 'his', 'secret', 'snack', 'spot.', 'Little', 'did', 'he', 'know,', 'his', 'delightful', 'discovery', 'would', 'soon', 'attract', 'mischievous', 'monkey.', 'With', 'cleverness', 'and', 'band', 'a', 'dash', 'bravery,', 'Pabu', 'outwitted', 'the', 'monkey,', 'sharing', 'the', "grove's", 'bounty', 'with', 'his', 'leading', 'to', 'becoming', 'a', 'legend', 'in', 'most', 'enchanting', 'bamboo', 'feasts', 'the', 'forest.', 'forest', 'had', 'ever', 'seen.']
  3. For the second scenario, Differ's compare method confirms that the two completions are identical:
No differences found between completions.
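The comparison code itself is not reproduced in this README, so the following is a guess at its shape rather than the notebook's implementation. It assumes the completions are split on whitespace and compared word by word; `find_differences` and `report` are hypothetical names.

```python
import difflib as dl


def find_differences(first, second):
    """Return the words that appear in only one of the two completions."""
    differ = dl.Differ()
    diff = differ.compare(first.split(), second.split())
    # Differ prefixes removed words with "- " and added words with "+ ";
    # unchanged words ("  ") and hint lines ("? ") are skipped.
    return [line[2:] for line in diff if line.startswith(("- ", "+ "))]


def report(first, second):
    """Build a message in the same style as the output shown above."""
    differences = find_differences(first, second)
    if differences:
        return f"Found these differences between completions: {differences}"
    return "No differences found between completions."
```

For two identical strings, `report` returns "No differences found between completions.", which corresponds to the second scenario.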
