kochgroup/intro_pharma_ai

CC BY-NC-SA 4.0

Welcome to:
"Introduction to Artificial Intelligence for Life Science Students"

This repository contains a collection of Jupyter Notebooks that can be used to teach pharmaceutical and chemistry students the basics of Deep Learning. No prior coding knowledge is required. The article introducing this repository, written by Janosch Menke, Samuel Homberg and Oliver Koch, can be found here: https://doi.org/10.1002/ardp.202200628

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

This work was funded by the "Apotheker Stiftung Westfalen-Lippe"

Usage

  1. Google Colab
    The easiest way to use the notebooks is to open them in Google Colab. All you need is a Google Account. You can open a Jupyter Notebook by clicking the corresponding button in the table below. All notebooks will work out of the box.

  2. Local Installation
    If you do not want to run the notebooks through a Google service, you can also set up your own local Python environment. We provide instructions on how to do this. As with Colab, all notebooks will work straight away once the local installation has been completed.
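
After a local installation, a quick check like the one below can confirm that the main libraries are importable before opening the notebooks. This is only a sketch: the package list is an assumption inferred from the notebook titles (RDKit, PyTorch) plus NumPy and pandas, not the repository's official requirements, and `missing_packages` is a hypothetical helper name. The repository's own installation instructions remain authoritative.

```python
import importlib.util

# Assumed package list, guessed from the notebook titles.
# Replace it with the packages named in the installation instructions.
ASSUMED_PACKAGES = ["numpy", "pandas", "rdkit", "torch"]


def missing_packages(names):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(ASSUMED_PACKAGES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All assumed packages were found.")
```

Run the script with the Python interpreter of the new environment. Any names it lists were not found there.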

Notebook English German
01. Introduction to Jupyter Open In Colab Open In Colab
02. Introduction to Python Open In Colab Open In Colab
03. Cheminformatics & RDKit Open In Colab Open In Colab
04. Linear Regression Open In Colab Open In Colab
05. Data Science Open In Colab Open In Colab
06. Linear Algebra Open In Colab Open In Colab
07. Your first Neural Network Open In Colab Open In Colab
08. PyTorch Open In Colab Open In Colab
09. Convolutional Neural Network Open In Colab Open In Colab
10. Transfer Learning Open In Colab Open In Colab
11. Recurrent Neural Networks Open In Colab Open In Colab
12. Autoencoders Open In Colab Open In Colab
13. Graph Neural Networks Open In Colab Open In Colab
14. Summary Open In Colab Open In Colab

We want to point out that these notebooks are not sufficient on their own to teach students about deep learning. Instructors need to prepare their own accompanying lectures. The notebooks are also not designed to bring students to a level where they can train neural networks without any aid. Rather, they are designed to teach students the theoretical concepts needed to understand neural networks through code completion. We believe, as explained in more detail in the paper, that the theory behind neural networks is easy to understand. Learning about them is nevertheless difficult, because it requires a solid understanding of a programming language. Without such support, students would get stuck on syntactical problems posed by the programming language rather than on the theory behind neural networks.

Contributions or Extensions

We hope that these notebooks can be a starting point for others to expand on or contribute to. Everyone is free to adapt this repository (in accordance with the above-mentioned license).

Data Sources

Name Source
MNIST LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278-2324.
BBBP Martins, I. F., et al. (2012). A Bayesian approach to in silico blood-brain barrier penetration modeling. Journal of Chemical Information and Modeling, 52(6), 1686-1697.
Pneumonia Kermany, D., Zhang, K., & Goldbaum, M. (2018). Large Dataset of Labeled Optical Coherence Tomography (OCT) and Chest X-Ray Images. Mendeley Data, V3. doi: 10.17632/rscbjbr9sj.3
Kermany, D. S., Goldbaum, M., Cai, W., Valentim, C. C., Liang, H., Baxter, S. L., ... & Zhang, K. (2018). Identifying medical diagnoses and treatable diseases by image-based deep learning. Cell, 172(5), 1122-1131.
Cats & Dogs Parkhi, O. M., Vedaldi, A., Zisserman, A., & Jawahar, C. V. (2012). Cats and dogs. In 2012 IEEE conference on computer vision and pattern recognition (pp. 3498-3505). IEEE.
GDB 11 Fink, T., & Reymond, J. L. (2007). Virtual exploration of the chemical universe up to 11 atoms of C, N, O, F: assembly of 26.4 million structures (110.9 million stereoisomers) and analysis for new ring systems, stereochemistry, physicochemical properties, compound classes, and drug discovery. Journal of Chemical Information and Modeling, 47(2), 342-353.

Additional Information

ImageNet Background

Further Instructional Materials

TeachOpenCADD: A collection of notebooks covering a wide range of topics in cheminformatics and data science, from collecting and cleaning molecular data in Python to more advanced topics such as docking.
