aambekar234/genai-text-to-image-langchain
genai-text-to-image-langchain

Introduction

This application provides a user interface for converting text into images using large language models (LLMs) and DALL-E. The user chooses an LLM, sets its temperature, and enters some text. LangChain then converts that text into a DALL-E prompt, and DALL-E generates the image. The prompts for each step of the chain are stored in prompts.py, and the LangChain implementation is in main.py.

What is the purpose of LangChain in this application?

Users often spend significant time on prompt engineering before an image fits a specific narrative. LangChain automates that work here. The chain first cleans the input text and turns it into a story script. It then analyzes the script to identify the key characters, their ages, and their facial expressions, and to estimate the time of day, historical era, and geographical location of the scene. The final step combines all of this information into one comprehensive prompt, which is passed to the DALL-E 3 model to generate the image. This simplifies the process and improves the quality of the generated images.
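
To make the flow above concrete, here is a minimal sketch of the staged pipeline in plain Python. It is not the code in main.py: the template strings, the `build_image_prompt` function, and the `llm` callable are all hypothetical stand-ins, and the real prompts live in prompts.py. The `llm` argument is any function that takes a prompt string and returns the model's reply, so the sketch can be tried without an API key.

```python
from typing import Callable, Dict

# Hypothetical templates for illustration; the project's real ones are in prompts.py.
CLEAN = "Clean up the following text and rewrite it as a short story script:\n{text}"
DETAILS = {
    "characters": "List the key characters, with ages and facial expressions, in:\n{script}",
    "setting": "Estimate the time of day, historical era, and location for:\n{script}",
}
FINAL = (
    "Write one image-generation prompt for this scene.\n"
    "Script: {script}\nCharacters: {characters}\nSetting: {setting}"
)


def build_image_prompt(text: str, llm: Callable[[str], str]) -> Dict[str, str]:
    """Run the stages in order, feeding each result into the next one."""
    # Stage 1: clean the raw text and turn it into a story script.
    script = llm(CLEAN.format(text=text))
    state = {"script": script}
    # Stage 2: extract details from the script, one LLM call per detail.
    for name, template in DETAILS.items():
        state[name] = llm(template.format(script=script))
    # Stage 3: combine everything into the final image prompt.
    state["image_prompt"] = llm(FINAL.format(**state))
    return state
```

Calling `build_image_prompt("A knight rides out at dawn.", my_llm)` returns a dictionary with the intermediate results and the final `image_prompt`, which makes each stage easy to inspect when an image comes out wrong.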

Note

As of 01/01/2024, OpenAI rewrites the image prompt sent to the DALL-E API before generating the image, as a measure against misuse of the image generation model. The image may therefore not match the generated prompt exactly. If you have an upgraded ChatGPT account, consider pasting the generated prompt directly into ChatGPT for better results.
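
One way to see what the API did to a prompt is to read the `revised_prompt` field that the OpenAI Images API returns for DALL-E 3. The helper below is a sketch, not code from this project: `generate_image` is a hypothetical name, and `client` is assumed to be an `openai.OpenAI()` instance from the `openai` Python package, version 1 or later.

```python
from typing import Any, Tuple


def generate_image(client: Any, prompt: str, size: str = "1024x1024") -> Tuple[str, str]:
    """Request one DALL-E 3 image and return (image_url, prompt_actually_used)."""
    # Assumption: client is an openai.OpenAI() instance.
    response = client.images.generate(model="dall-e-3", prompt=prompt, n=1, size=size)
    image = response.data[0]
    # revised_prompt holds the rewritten prompt; fall back to the
    # original prompt if the field is missing or empty.
    used_prompt = getattr(image, "revised_prompt", None) or prompt
    return image.url, used_prompt
```

Comparing the returned prompt with the one the chain produced shows how much of the original wording survived the rewrite.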

How to Run the App

  1. Create a conda environment from the env.yaml file:

    conda env create -f env.yaml --force
    
  2. Activate the newly created conda environment:

    conda activate imagegen-app
    
  3. Go to the OpenAI API page and create an API key. Copy the key into the project's .env file.

  4. Run app.py:

    python app.py
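
If the app fails at startup, a quick check that the API key from step 3 is readable can help. The sketch below uses only the standard library and is separate from the project's own code. The function names are hypothetical, and the variable name `OPENAI_API_KEY` is an assumption: it is the name the `openai` package reads by default, but the project's .env file may use a different one.

```python
import os
from pathlib import Path
from typing import Dict


def read_env_file(path: str = ".env") -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def find_api_key(path: str = ".env", name: str = "OPENAI_API_KEY") -> str:
    """Return the key from the environment or the .env file, or raise."""
    # Assumption: the key is stored under `name`; adjust if the project differs.
    key = os.environ.get(name) or read_env_file(path).get(name, "")
    if not key:
        raise RuntimeError(f"{name} is not set; add it to {path}")
    return key
```

Running `find_api_key()` from the project directory before `python app.py` raises a clear error when the key is missing, which is easier to diagnose than a failed API call inside the app.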
    

Conclusion

This project offers a distinctive approach to turning text into images with current AI models. It is a good platform for exploring what language models, frameworks like LangChain, and image models like DALL-E can do together. Contributions to the project are always welcome. For collaboration on similar future projects, feel free to connect with me on LinkedIn.
