FedML-AI/MindAlpha

 
 

MindAlpha

MindAlpha is a machine learning platform integrating PySpark, PyTorch and a parameter server implementation. The platform contains native support for sparse parameters, making it easy for users to develop large-scale models. Together with MindAlpha Serving, the platform provides a one-stop solution for data preprocessing, model training and online prediction.

Features

  • Efficient IO with PySpark. Minibatches read by PySpark as pandas DataFrames can be fed directly to models.

  • APIs similar to those of PyTorch and Spark MLlib, so users familiar with PyTorch and PySpark can get started quickly.

  • Custom sparse layers are wrapped as PyTorch modules, making them easy to use. These sparse layers can contain billions of parameters.

  • Models can be developed interactively in Jupyter Notebook, and periodic model training can be scheduled with Airflow.

  • The trained model can be exported via one method call and loaded by MindAlpha Serving for online prediction.
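
To illustrate what "sparse parameters" means in the features above, here is a minimal, standard-library-only sketch of a sparse embedding table whose rows are created lazily, the first time a key is seen. It is not MindAlpha's API: the class `SparseEmbeddingTable` and its methods `lookup` and `apply_gradients` are hypothetical names chosen for this illustration, and a real parameter server would shard such a table across machines.

```python
import random


class SparseEmbeddingTable:
    """Toy sparse parameter table (hypothetical, not part of MindAlpha).

    Rows are allocated only for keys that actually occur in the data,
    so memory grows with the number of distinct keys seen, not with
    the size of the key space.
    """

    def __init__(self, dim, seed=0):
        self.dim = dim
        self._rows = {}                  # key -> list of floats
        self._rng = random.Random(seed)  # seeded for reproducible init

    def lookup(self, keys):
        """Return a copy of one embedding row per key, creating missing rows."""
        result = []
        for key in keys:
            if key not in self._rows:
                self._rows[key] = [
                    self._rng.uniform(-0.01, 0.01) for _ in range(self.dim)
                ]
            result.append(list(self._rows[key]))
        return result

    def apply_gradients(self, keys, grads, lr=0.1):
        """Plain SGD step applied only to the rows named in `keys`."""
        for key, grad in zip(keys, grads):
            row = self._rows[key]
            for i in range(self.dim):
                row[i] -= lr * grad[i]

    def __len__(self):
        return len(self._rows)


if __name__ == "__main__":
    table = SparseEmbeddingTable(dim=4)
    # Keys can come from a huge ID space; only the distinct ones get rows.
    batch_keys = [10**12 + 7, 42, 42]
    vectors = table.lookup(batch_keys)
    print(len(vectors), "vectors returned,", len(table), "rows allocated")
```

Only the rows touched by a minibatch are read and updated, which is the property that lets sparse layers scale to very large key spaces.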

Build

First, run the build script to create a Docker image:

sh run_build.sh -i

For more details, please refer to docker/ubuntu20.04/Dockerfile and docker/centos7/Dockerfile.

Then run the script to compile the sources (C++ and Python) into shared libraries (*.so) and Python wheel packages (*.whl), which are generated in the build directory by default:

sh run_build.sh -m

Tutorials

Two tutorials are given:
