Scripts and modules utilised for analysing the data referred to in the Genetics in Medicine paper entitled Targeted genetic analysis in a large cohort of familial and sporadic cases of aneurysm or dissection of the thoracic aorta

superDross/TAAD_analysis

TAAD_analysis

Code for the Genetics in Medicine article entitled Targeted genetic analysis in a large cohort of familial and sporadic cases of aneurysm or dissection of the thoracic aorta. The analysis defines the frequency of causative genetic variants and the phenotypic risk factors associated with a genetic aetiology of thoracic aortic aneurysm/dissection (TAAD).

Getting Started

git clone https://github.com/superDross/TAAD_analysis
pip3 install -r TAAD_analysis/requirements.txt
export PYTHONPATH=$PYTHONPATH:/full/path/to/TAAD_analysis
# ensure input_files and output directories exist before execution
python3 TAAD_analysis

Input Files

An input_files directory should exist within the TAAD_analysis directory. It should contain the most damaging data (the most damaging variant per patient) and the all variants data (all variants identified in each patient). The input_files directory should be structured as follows:

input_files/
│   
├── UK_Depth/
│   ├── depth_vs_taadx/
│   └── depth_vs_taadz/
├── Yale_Depth/
│   ├── depth_vs_taadx/
│   └── depth_vs_taadz/
│   
├── UK_All_Variants_Data.csv
├── UK_Most_Damaging_Data.csv
├── UK_Phenotype_Data.csv
├── Yale_All_Variants_Data.csv
├── Yale_Most_Damaging_Data.csv
├── Yale_Phenotype_Data.csv
└── Yale_Survival_Data_Clean.csv
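
Before running the analysis, it can help to confirm that this layout is in place. The following is a minimal sketch, not part of the repository: the function name missing_inputs and the constants EXPECTED_DIRS and EXPECTED_FILES are hypothetical, and the paths are copied from the tree above.

```python
from pathlib import Path

# Directory and file names copied from the tree above.
# These constants and the function below are illustrative, not part of TAAD_analysis.
EXPECTED_DIRS = [
    "UK_Depth/depth_vs_taadx",
    "UK_Depth/depth_vs_taadz",
    "Yale_Depth/depth_vs_taadx",
    "Yale_Depth/depth_vs_taadz",
]
EXPECTED_FILES = [
    "UK_All_Variants_Data.csv",
    "UK_Most_Damaging_Data.csv",
    "UK_Phenotype_Data.csv",
    "Yale_All_Variants_Data.csv",
    "Yale_Most_Damaging_Data.csv",
    "Yale_Phenotype_Data.csv",
    "Yale_Survival_Data_Clean.csv",
]


def missing_inputs(input_dir):
    """Return the expected relative paths that are absent from input_dir."""
    root = Path(input_dir)
    # Sub-directories must exist as directories, data files as regular files.
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing
```

Calling missing_inputs("/full/path/to/TAAD_analysis/input_files") returns an empty list when every expected directory and file is present, and otherwise lists the relative paths that still need to be created.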

Data Cleaning

The most damaging data is cleaned and combined, then used to produce all the plots, tables and most of the data mentioned in the paper. The all variants data is used primarily to help select the next most damaging variant. Each major step in the most damaging data cleaning process, and the sub-package (if any) used to achieve it, is detailed below:
