πŸ‹ The Living Oceans Metagenome Taxanomic Profiling tool (beta) is a metagenomic pipeine built to work on your local ocean


New Atlantis is an open ocean regeneration project that seeks to address biodiversity loss in our oceans by providing a viable business model for Marine Protected Areas (MPAs). We do this by building an open marine biodiversity analytics platform that monitors and forecasts the health of MPAs, and from which marine biocredits and blue carbon credits can be generated.

🧬 Metagenomics

The metagenomic pipeline section of the New Atlantis GitHub: an easy-to-use pipeline for generating metagenomic data from different ocean samples. Currently known as the Living Oceans Metagenome Assembly Pipeline, or LOMAP for short.

Discord Twitter Open in Colab

Main Image

Photo used with permission from Paul Nicklen, co-founder of SeaLegacy.org, New Atlantis Founding Advisor, NatGeo Contributor, Instagram

πŸͺ„ Try it Now!

You can set up and use LOMAP on the cloud by following along with the Google Colab notebook:

Open in Colab

Please note that Google Colab does not provide the computational resources necessary to run LOMAP fully on a real dataset. This notebook demonstrates how to set up and use LOMAP by performing the first steps of the workflow on a toy dataset.

βš™οΈ Installation

You can set up LOMAP on your computer at home in one line!

git clone https://github.com/new-atlantis-dao/Oceanomics.git && cd Oceanomics/Metagenomics

Congratulations, you can now start using LOMAP.

πŸ“― Tutorial

LOMAP can be used to explore the planktonic network of a local section of ocean. A written tutorial on how to use the LOMAP pipeline will be released at a later date.

Tutorial

πŸ—‚ Project Organization


β”œβ”€β”€ LICENSE
β”œβ”€β”€ Makefile           <- Makefile with commands like `make data` or `make train`
β”œβ”€β”€ README.md          <- The top-level README for developers using this project.
β”œβ”€β”€ data
β”‚   β”œβ”€β”€ external       <- Data from third party sources.
β”‚   β”œβ”€β”€ interim        <- Intermediate data that has been transformed.
β”‚   β”œβ”€β”€ processed      <- The final, canonical data sets for modeling.
β”‚   └── raw            <- The original, immutable data dump.
β”‚
β”œβ”€β”€ docs               <- A default Sphinx project; see sphinx-doc.org for details
β”‚
β”œβ”€β”€ models             <- Trained and serialized models, model predictions, or model summaries
β”‚
β”œβ”€β”€ notebooks          <- Jupyter notebooks. Naming convention is a number (for ordering),
β”‚                         the creator's initials, and a short `-` delimited description, e.g.
β”‚                         `1.0-jqp-initial-data-exploration`.
β”‚
β”œβ”€β”€ references         <- Data dictionaries, manuals, and all other explanatory materials.
β”‚
β”œβ”€β”€ reports            <- Generated analysis as HTML, PDF, LaTeX, etc.
β”‚   └── figures        <- Generated graphics and figures to be used in reporting
β”‚
β”œβ”€β”€ requirements.txt   <- The requirements file for reproducing the analysis environment, e.g.
β”‚                         generated with `pip freeze > requirements.txt`
β”‚
β”œβ”€β”€ setup.py           <- Makes project pip installable (`pip install -e .`) so src can be imported
β”œβ”€β”€ src                <- Source code for use in this project.
β”‚   β”œβ”€β”€ __init__.py    <- Makes src a Python module
β”‚   β”‚
β”‚   β”œβ”€β”€ data           <- Scripts to download or generate data
β”‚   β”‚   └── make_dataset.py
β”‚   β”‚
β”‚   β”œβ”€β”€ features       <- Scripts to turn raw data into features for modeling
β”‚   β”‚   └── build_features.py
β”‚   β”‚
β”‚   β”œβ”€β”€ models         <- Scripts to train models and then use trained models to make
β”‚   β”‚   β”‚                 predictions
β”‚   β”‚   β”œβ”€β”€ predict_model.py
β”‚   β”‚   └── train_model.py
β”‚   β”‚
β”‚   └── visualization  <- Scripts to create exploratory and results oriented visualizations
β”‚       └── visualize.py
β”‚
└── tox.ini            <- tox file with settings for running tox; see tox.readthedocs.io

πŸ“œ Publications

Software and marker gene sequences used to build a plankton-specific database for taxonomic profiling derive from the following publications:

Microbial abundance, activity and population genomic profiling with mOTUs2 (2019)

Nature

read_counter: A tool to count the number of reads (from a FASTQ file) that map to a set of nucleotide sequences (in FASTA format).

Github
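As a rough illustration of what read counting means here (this is a conceptual sketch, not read_counter's actual algorithm or interface, which uses proper read mapping rather than exact substring matching):

```python
# Conceptual sketch: count FASTQ reads that occur as exact substrings of
# reference sequences. Real tools (e.g. read_counter) use alignment-based
# mapping; exact matching is only for illustration.

def parse_fasta(text):
    """Parse FASTA text into {sequence_name: sequence}."""
    records, header, seq = {}, None, []
    for line in text.strip().splitlines():
        if line.startswith(">"):
            if header is not None:
                records[header] = "".join(seq)
            header, seq = line[1:].split()[0], []
        else:
            seq.append(line.strip())
    if header is not None:
        records[header] = "".join(seq)
    return records

def parse_fastq(text):
    """Return read sequences (2nd line of each 4-line FASTQ record)."""
    lines = text.strip().splitlines()
    return [lines[i + 1] for i in range(0, len(lines), 4)]

def count_mapped_reads(fastq_text, fasta_text):
    """Count how many reads 'map' (exact substring) to each reference."""
    refs = parse_fasta(fasta_text)
    counts = {name: 0 for name in refs}
    for read in parse_fastq(fastq_text):
        for name, ref in refs.items():
            if read in ref:
                counts[name] += 1
    return counts
```

For example, with two toy marker genes and two reads, each read is tallied against the reference it occurs in.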

A robust approach to estimate relative phytoplankton cell abundances from metagenomes (2022)

DOI

Toward a global reference database of COI barcodes for marine zooplankton (2021)

DOI
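The phytoplankton paper above estimates relative cell abundances from marker-gene read counts. A minimal sketch of the usual normalization step (read counts divided by marker-gene length, then rescaled to fractions); the function name and inputs are illustrative, not LOMAP's API:

```python
def relative_abundance(read_counts, gene_lengths):
    """Length-normalize marker-gene read counts into relative abundances.

    read_counts:  {taxon: reads mapped to its marker gene}   (hypothetical input)
    gene_lengths: {taxon: marker gene length in bp}          (hypothetical input)
    Returns {taxon: fraction}, with fractions summing to 1.0.
    """
    # Per-base coverage corrects for longer genes attracting more reads.
    coverage = {t: read_counts[t] / gene_lengths[t] for t in read_counts}
    total = sum(coverage.values())
    if total == 0:
        return {t: 0.0 for t in coverage}
    return {t: c / total for t, c in coverage.items()}
```

With equal read counts but a marker gene half as long, the second taxon comes out twice as abundant as the first.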

πŸ“ Please Cite

A simple Taxonomic Plankton Profiler Tool (unpublished work).

πŸ“² Contact

Please reach out with any comments, concerns, or discussion regarding LOMAP.

Discord Twitter Email
