Enhance your knowledge in medical research.
AItrika (formerly PubGPT) is a tool that makes it easy to extract relevant information from medical papers:
- Abstract
- Full text (when available)
- Genes
- Diseases
- Mutations
- Associations between genes and diseases
- MeSH terms
- Other terms
- Results
- Bibliography
And so on!
You can try AItrika with the Streamlit app by running:
streamlit run app.py
Or you can use it as a script by running:
python main.py
To install everything, you need poetry.
First, create a virtual environment with the command python3 -m venv venv_name and activate it with source venv_name/bin/activate.
After that, install poetry with the command pip install poetry and then run poetry install.
To set your API keys, insert them into the env.example file and rename it to .env.
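If you want to check that the keys in .env are visible to your own script, a minimal standard-library sketch like the one below can load them. The helper load_env_file is hypothetical and not part of AItrika, which may load the file in a different way; the only name taken from this README is GROQ_API_KEY.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Copy KEY=VALUE pairs from a dotenv-style file into os.environ.

    Hypothetical helper for illustration only; not part of AItrika.
    Returns a dict of the variables found in the file.
    """
    loaded = {}
    env_path = Path(path)
    if not env_path.is_file():
        return loaded
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip('"').strip("'")
        # Keep any value that is already set in the real environment.
        os.environ.setdefault(key, value)
        loaded[key] = os.environ[key]
    return loaded


if __name__ == "__main__":
    load_env_file()
    # Report only whether the key is present, never its value.
    print("GROQ_API_KEY set:", os.getenv("GROQ_API_KEY") is not None)
```

Variables already exported in your shell take precedence over the file, so a key can be overridden temporarily without editing .env.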
You can easily get information about a paper by passing its PubMed ID:
from aitrika.engine.aitrika import OnlineAItrika
aitrika_engine = OnlineAItrika(pubmed_id=pubmed_id)
title = aitrika_engine.get_title()
print(title)
Or you can parse a local PDF:
from aitrika.engine.aitrika import LocalAItrika
aitrika_engine = LocalAItrika(pdf_path=pdf_path)
title = aitrika_engine.get_title()
print(title)
Breast cancer genes: beyond BRCA1 and BRCA2.
You can get other information, such as the associations between genes and diseases:
associations = aitrika_engine.get_associations()
[
{
"gene": "BRIP1",
"disease": "Breast Neoplasms"
},
{
"gene": "PTEN",
"disease": "Breast Neoplasms"
},
{
"gene": "CHEK2",
"disease": "Breast Neoplasms"
},
]
...
Or you can get a nicely formatted DataFrame:
associations = aitrika_engine.associations(dataframe=True)
    gene           disease
0  BRIP1  Breast Neoplasms
1   PTEN  Breast Neoplasms
2  CHEK2  Breast Neoplasms
...
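Once you have the associations as a list of dictionaries, you can post-process them with plain Python. The sketch below assumes the result has the shape shown in the list output above (one dict per pair, with "gene" and "disease" keys); genes_by_disease is a hypothetical helper, not an AItrika function, and the sample data is copied from that output.

```python
from collections import defaultdict

# Sample data copied from the example output in this README.
associations = [
    {"gene": "BRIP1", "disease": "Breast Neoplasms"},
    {"gene": "PTEN", "disease": "Breast Neoplasms"},
    {"gene": "CHEK2", "disease": "Breast Neoplasms"},
]


def genes_by_disease(records):
    """Group gene names under each disease, keeping first-seen order."""
    grouped = defaultdict(list)
    for record in records:
        genes = grouped[record["disease"]]
        # Avoid listing the same gene twice for one disease.
        if record["gene"] not in genes:
            genes.append(record["gene"])
    return dict(grouped)


print(genes_by_disease(associations))
# {'Breast Neoplasms': ['BRIP1', 'PTEN', 'CHEK2']}
```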
With the power of RAG, you can query your document:
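import os
# Assumes generate_documents and GroqLLM are already imported from AItrika
# and that abstract holds the paper's abstract text.
# Prepare the documents
documents = generate_documents(content=abstract)
# Set the LLM
llm = GroqLLM(documents=documents, api_key=os.getenv("GROQ_API_KEY"))
# Query your document
query = "Is BRCA1 associated with breast cancer?"
print(llm.query(query=query))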
The provided text suggests that BRCA1 is associated with breast cancer, as it is listed among the high-penetrance genes identified in family linkage studies as responsible for inherited syndromes of breast cancer.
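If you are new to RAG (retrieval-augmented generation), the toy sketch below illustrates the general idea: pick the text chunks most relevant to the question, then hand them to the LLM as context. This is not how AItrika implements it. The word-overlap scoring is a deliberately simple stand-in for a real retriever, the functions tokenize, retrieve and build_prompt are hypothetical, and the two chunks are made-up illustrative sentences.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, chunks, top_k=1):
    """Return up to top_k chunks that share the most words with the query."""
    query_tokens = tokenize(query)
    scored = [
        (len(query_tokens & tokenize(chunk)), index, chunk)
        for index, chunk in enumerate(chunks)
    ]
    # Highest overlap first; ties keep the original chunk order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [chunk for score, _, chunk in scored[:top_k] if score > 0]


def build_prompt(query, chunks, top_k=1):
    """Assemble the text that would be sent to the LLM."""
    context = "\n".join(retrieve(query, chunks, top_k))
    return f"Context:\n{context}\n\nQuestion: {query}"


# Illustrative chunks, not taken from a real paper.
chunks = [
    "BRCA1 is a high-penetrance gene linked to inherited breast cancer.",
    "Streamlit apps are started from the command line.",
]
print(build_prompt("Is BRCA1 associated with breast cancer?", chunks))
```

Real RAG pipelines usually replace the word-overlap score with embedding similarity, but the retrieve-then-prompt flow is the same.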
AItrika is licensed under the MIT License. See the LICENSE file for more details.
Planned work:
- Create documentation
- Add docstrings
- Create Python package