Use natural language processing and machine learning to generate personalized life consultation reports. The project uses the txtai library for semantic search and the GPT-4 engine for dynamic suggestions.
To run the Python script and generate a personalized life consultation report, or to rerun the EDA, follow the steps below:
Ensure Python 3.x is installed. Download it from the official Python website.
Clone the repository:
gh repo clone William-Ger/AI_Therapist
Navigate to the project directory and run:
pip install pandas txtai openai seaborn matplotlib
Open the script file and set your API key:
openai.api_key = 'YOUR-OPENAI-API-KEY-HERE'
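As an alternative to pasting the key into the script, it can be read from an environment variable so it never ends up in version control. This is a minimal sketch, not how the script currently works: the helper name `load_api_key` and the variable name `OPENAI_API_KEY` are assumptions chosen for illustration.

```python
import os


def load_api_key(env_var="OPENAI_API_KEY"):
    """Return the API key stored in the given environment variable.

    Both the function name and the default variable name are hypothetical.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before running the script."
        )
    return key


# In the script, this call would replace the hardcoded string:
# openai.api_key = load_api_key()
```

Set the variable in your shell before running the script, for example with `export OPENAI_API_KEY=...` on macOS or Linux.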
Execute the Python script via terminal or command prompt:
python path/to/your_script.py
Follow the terminal prompts to provide your lifestyle details and improvement areas.
The script will create a personalized report as a markdown file. Find it in the output directory.
Tip: Convert the markdown file to PDF using the 'Markdown PDF' extension by yzane in VSCode.
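For readers who want to see what the report-writing step might look like, here is a rough sketch. It assumes the report is a title plus a set of named sections; the function name `write_report`, the `output` directory default, and the `report.md` filename are all illustrative and may not match the actual script.

```python
from pathlib import Path


def write_report(title, sections, out_dir="output", filename="report.md"):
    """Write a markdown report and return the path of the created file.

    `sections` maps a heading to its body text. All names and defaults
    here are hypothetical.
    """
    out_path = Path(out_dir)
    # Create the output directory if it does not exist yet.
    out_path.mkdir(parents=True, exist_ok=True)

    lines = [f"# {title}", ""]
    for heading, body in sections.items():
        lines += [f"## {heading}", "", body.strip(), ""]

    report_file = out_path / filename
    report_file.write_text("\n".join(lines), encoding="utf-8")
    return report_file
```

Calling `write_report("Life Consultation Report", {"Sleep": "Aim for a regular schedule."})` would produce `output/report.md`, which can then be converted to PDF as described in the tip.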
- Semantic Search: Uses txtai to retrieve the dataset factors most relevant to your input.
- GPT-4 Integration: Leverages GPT-4 for dynamic suggestion creation.
- Report Generation: Creates comprehensive markdown reports that can be converted to PDF.
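The semantic search feature can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code: the function names `build_index` and `top_factors`, the sentence-transformers model path, and the example factor strings are all hypothetical.

```python
def build_index(factors, model="sentence-transformers/all-MiniLM-L6-v2"):
    """Embed each factor description and index it for similarity search."""
    # Imported lazily so top_factors can be used without loading txtai.
    from txtai.embeddings import Embeddings

    embeddings = Embeddings(path=model)
    # txtai accepts (id, text, tags) tuples; the list position is the id.
    embeddings.index([(i, text, None) for i, text in enumerate(factors)])
    return embeddings


def top_factors(embeddings, factors, query, limit=3):
    """Return (factor, score) pairs for the entries most similar to the query."""
    # Without content storage, search() yields (id, score) tuples.
    return [(factors[int(i)], score) for i, score in embeddings.search(query, limit)]


# Example usage (downloads the model on first run):
# factors = ["Smoking", "Regular exercise", "Sleeping 7 to 8 hours"]
# index = build_index(factors)
# print(top_factors(index, factors, "I want to be more active"))
```

The retrieved factors could then be passed to GPT-4 as context for the suggestions, though how the script combines the two steps is not shown here.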
The EDA facilitated a robust understanding of the dataset, identifying pivotal factors affecting life expectancy. Key steps included data cleaning, visualization, and feature engineering, providing a rich foundation for semantic analysis and report generation.
- Bimodal Distribution: Factors exhibit a bimodal distribution of effects on life expectancy.
- Scientific Backing: Factors with strong scientific backing are often the ones that reduce life expectancy.
- Sex-Based Differences: Different factors disproportionately affect various sex categories, highlighting the dataset's depth.
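Checks like the ones behind these findings could be reproduced with a few pandas operations. The sketch below uses made-up toy rows and hypothetical column names (`factor`, `years_gained`, `sex`); the real dataset's columns and values will differ, so treat it only as a starting point.

```python
import pandas as pd

# Toy rows invented for illustration; not taken from the real dataset.
df = pd.DataFrame({
    "factor": ["smoking", "exercise", "heavy drinking", "good sleep"],
    "years_gained": [-10.0, 4.0, -7.0, 3.0],
    "sex": ["both", "both", "male", "female"],
})


def summarize_effects(frame):
    """Summarize effects by direction (negative/positive) and by sex category."""
    # Label each factor by the sign of its effect on life expectancy.
    direction = frame["years_gained"].apply(
        lambda years: "negative" if years < 0 else "positive"
    )
    # Two well-separated groups here would hint at a bimodal distribution.
    by_direction = frame.groupby(direction)["years_gained"].agg(["count", "mean"])
    # Average effect per sex category.
    by_sex = frame.groupby("sex")["years_gained"].mean()
    return by_direction, by_sex


by_direction, by_sex = summarize_effects(df)
```

A histogram of the effect column (for example with seaborn's `histplot`) would show the two modes visually.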
For detailed insights and analysis, refer to the eda folder in the repository.
The EDA shaped the application's development by clarifying the dataset's structure and the relationships between factors. The semantic analysis and report generation features build on it.
We encourage contributors and users to delve into the EDA process to garner a deeper understanding of the data and the initial analysis carried out in this project.
I extend gratitude to:
- Joakim Arvidsson for creating the invaluable dataset that serves as the backbone of this project, aiding in the generation of factors using semantic search. The Kaggle dataset can be found here: https://www.kaggle.com/datasets/joebeachcapital/life-longevity-factors?datasetId=3656167
- The creators of txtai for developing a potent semantic search library that significantly enhanced the project's analytical depth.
- The team at OpenAI for crafting the GPT-4 engine, a cornerstone in the dynamic suggestion generation feature of this project.
Your contributions have been instrumental in bringing this project to fruition.
Use this repo however you want. I would love to see how people can improve the accuracy of the report and functionality of the tool.