neo-chem/awesome-chemical-data

Awesome data in (experimental) chemistry and materials science

We live in a data-driven age. To make use of all the data produced in (experimental) chemistry and materials science, standards for collecting and sharing data are needed. Once the community agrees on standard schemas, they can be implemented in ELNs, and the data can be reused when shared via repositories.

Symbol Meaning
😴 Currently not developed/maintained
🔓 Closed source
📄 Link to a paper

Electronic lab notebooks (ELN) / Laboratory information management systems (LIMS)

Overviews

ELNs

  • c6h6: Developed by the cheminfo organization, with a CouchDB backend and a modular frontend in JavaScript. 📄
  • chemotion ELN: Developed by Nicole Jung's group at KIT, with a focus on organic chemistry. Written in JavaScript/Ruby. 📄
  • openBIS: General-purpose LIMS/ELN developed at ETH Zürich. Supports custom plugins and direct data analysis in Jupyter notebooks. Core written in Java. 📄
  • LabTrove: The ELN used for the Open Source Malaria project.
  • bluesky: More than just an ELN. Bluesky is a Python ecosystem for experiment control and for collecting scientific data and metadata. It is being developed at national labs with synchrotrons, with a focus on streaming data. 📄
  • Materials Data Curation System (MDCS): A framework for capturing, sharing and transforming materials data in structured formats such as XML, based on user-selected templates. Developed at NIST.
  • eLabFTW: An open source lab notebook platform with support for inventory management, scheduling and REST APIs (among other things). Written with PHP/MySQL.

Repositories

  • chemotion repository: Repository for molecules, reactions and research data.

  • Materials Commons: A site for materials scientists to collaborate, store and publish research. 📄

  • nmrshiftdb2: Repository for NMR data. A major rework of the software is pending. Use of NMReDATA is central. Entries can include raw and processed data (currently the case for some entries, mostly peak lists). Integration into workflows is possible (e.g. its prediction is used in chemotion).

Schemas/Ontologies

Overviews

Generic

  • SciData: Scientific data model (SDM) and related ontology (SDMO).

  • BFO: The Basic Formal Ontology (BFO) is an upper-level ontology "designed for use in supporting information retrieval, analysis and integration in scientific and other domains", under development since 2002 (last release 2019).

  • oreChem: Aimed to create an ontology for scientific experiments; funded by Microsoft Research. "The oreChem Experiments Ontology [eo] describes (a) the planned method of a scientific experiment; (b) the enactment of plans and (c) the provenance of objects realised during enactments." 📄 😴

  • elnItemManifest: Describes core metadata for ELNs (like title, keywords, identifiers, contact, license information, related items, contributors, content, source). 📄 😴

  • autoprotocol: Language for specifying experimental protocols. Changes to the standard go through Autoprotocol Standard Changes, a mechanism similar to Python Enhancement Proposals.

  • Chemical Markup Language (CML): XML for most chemistry, especially molecules, compounds, reactions, spectra, crystals and computational chemistry, developed by Peter Murray-Rust and Henry Rzepa. 📄 😴 (Last update: 2013-04-22)

  • European Materials and Modelling Ontology (EMMO): An ontology designed to represent the "complex multiscale nature of chemicals and materials" with varying analytical philosophical interpretations. Available from the emmo-repo organisation on GitHub.

  • Materials Genome Initiative JSON: JSON schema for materials science and engineering. Directly related is Material Schema, which aims to extend schema.org for materials science.

  • ChemAxiom: Ontological framework for chemistry, led by Peter Murray-Rust. 😴

  • Chemical Information Ontology: "aims to establish a standard in representing chemical information. In particular, it aims to produce an ontology to represent chemical structure and to richly describe chemical properties, whether intrinsic or computed." (direct quote from the README) 📄
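
The core metadata that the elnItemManifest entry above lists can be pictured as a simple record check. The following is only a rough sketch: the key names are hypothetical spellings of the fields named in that entry and are not taken from the elnItemManifest specification, and the example record is made up for illustration.

```python
# Hypothetical key names derived from the list entry above,
# NOT from the elnItemManifest specification itself.
CORE_FIELDS = (
    "title", "keywords", "identifiers", "contact", "license",
    "related_items", "contributors", "content", "source",
)


def missing_core_fields(record: dict) -> list[str]:
    """Return the core fields that are absent or empty in a record."""
    return [name for name in CORE_FIELDS if not record.get(name)]


# Made-up example entry, for illustration only.
example = {
    "title": "Suzuki coupling, run 3",
    "keywords": ["cross-coupling", "palladium"],
    "contact": "jane.doe@example.org",
    "license": "CC-BY-4.0",
}

print(missing_core_fields(example))
# → ['identifiers', 'related_items', 'contributors', 'content', 'source']
```

A check like this is one way an ELN could flag incomplete entries before they are shared via a repository.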

Analytical methods

Organic reactions

Biology

  • ISA framework: Data model built around "Investigation" (project context), "Study" (unit of research), "Assay" (analytical measurement) to manage life science/biomedical (*omics) experiments. 📄

  • SD2E/opil: Synthetic biology experiment description effort intended to standardize the interface between human-generated experimental requests and lab-automated protocols.
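
The Investigation/Study/Assay nesting described for the ISA framework above can be sketched with plain Python dataclasses. This is an illustrative sketch only: it does not use the ISA tooling, and all class and field names here are assumptions chosen to mirror the three terms in the entry.

```python
from dataclasses import dataclass, field


@dataclass
class Assay:
    """An analytical measurement (hypothetical minimal form)."""
    measurement: str


@dataclass
class Study:
    """A unit of research that groups assays."""
    title: str
    assays: list[Assay] = field(default_factory=list)


@dataclass
class Investigation:
    """The project context that groups studies."""
    title: str
    studies: list[Study] = field(default_factory=list)

    def all_assays(self) -> list[Assay]:
        """Flatten the hierarchy into a single list of assays."""
        return [assay for study in self.studies for assay in study.assays]


# Made-up example data, for illustration only.
investigation = Investigation(
    title="Metabolite profiling",
    studies=[
        Study("Batch A", [Assay("mass spectrometry"), Assay("NMR spectroscopy")]),
        Study("Batch B", [Assay("mass spectrometry")]),
    ],
)

print(len(investigation.all_assays()))  # → 3
```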

Materials synthesis

Materials properties

  • MatML: XML format for the interchange of materials information. 😴

  • Physical information file (PIF): Schema for information about physical systems, maintained by Citrine Informatics.

Spectral data

Initiatives/Consortia

  • NFDI4Chem: Initiative to build an open and FAIR infrastructure for research data management in chemistry.

  • Blue Obelisk: Internet group promoting reusable chemistry via open source software development. 📄 📄

  • GO FAIR Chemistry Implementation Network: Goals are "to enhance the open, FAIR and effective communication of chemical knowledge within the chemical sciences and between chemistry and other disciplines" and "to enable chemists and chemistry to contribute to the achievement of the UN Global Sustainable Development goals" (direct quotes from the website). 📄

  • Chemistry Research Data IG: Interest Group of the Research Data Alliance (RDA) that aims to foster exchange on chemical data.

  • RDA/CODATA Materials Data, Infrastructure & Interoperability IG: Interest Group of the Research Data Alliance (RDA) that aims to foster exchange on materials data.

Related compilations
