Computron

Abstract

Many of the most performant deep learning models today in fields like language and image understanding are fine-tuned models with billions of parameters. In anticipation of workloads that involve serving many such large models to handle different tasks, we develop Computron, a system that uses memory swapping to serve multiple distributed models on a shared GPU cluster. Computron implements a model-parallel swapping design that uses the aggregate CPU-GPU link bandwidth of a cluster to speed up model parameter transfers. This design makes swapping large models feasible and can improve resource utilization. We demonstrate that Computron successfully parallelizes model swapping across multiple GPUs, and we test it on randomized workloads to show that it tolerates real-world variability such as burstiness and skewed request rates.
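To make the bandwidth argument concrete, here is a rough back-of-envelope sketch in Python. It is not part of Computron: the function name, the even-sharding assumption, and the example numbers (parameter count, bytes per parameter, link bandwidth) are all hypothetical, and it ignores latency, pinned-memory effects, and any overlap with compute.

```python
def swap_time_seconds(num_params, bytes_per_param, num_gpus, link_gb_per_s):
    """Idealized time to load one model from CPU memory onto GPUs.

    Assumes (hypothetically) that parameters are sharded evenly across
    num_gpus and that every GPU has its own CPU-GPU link that sustains
    link_gb_per_s gigabytes per second, with all links used in parallel.
    """
    total_bytes = num_params * bytes_per_param
    bytes_per_gpu = total_bytes / num_gpus
    return bytes_per_gpu / (link_gb_per_s * 1e9)


if __name__ == "__main__":
    # Hypothetical example: 1 billion fp16 parameters, 10 GB/s per link.
    for gpus in (1, 2, 4):
        t = swap_time_seconds(1_000_000_000, 2, gpus, 10.0)
        print(f"{gpus} GPU(s): {t:.3f} s")
```

Under these assumptions the transfer time falls in proportion to the number of GPUs, which is the effect the model-parallel swapping design aims to exploit.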

Installation for Development

Clone this repository and its submodules:

git clone --recurse-submodules git@github.com:dlzou/computron.git

Create an environment, install torch and Colossal-AI from PyPI with pip, then install Energon-AI and AlpaServe from the included submodules. Finally, install Computron from source.

conda create -n computron python=3.10
conda activate computron
pip install torch==1.13 torchvision colossalai transformers
pip install -e energonai/
pip install -e alpa_serve/
pip install -e .
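After the steps above, a quick sanity check can confirm that the packages are importable in the active environment. This is a minimal sketch, not a script shipped with the repository. The import names `energonai`, `alpa_serve`, and `computron` are assumed from the directory names and may differ from the actual module names.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be found for import."""
    missing = []
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Import names below are assumptions based on the install commands.
    expected = ["torch", "colossalai", "transformers",
                "energonai", "alpa_serve", "computron"]
    print("Missing packages:", missing_packages(expected) or "none")
```

The check only looks up each package without importing it, so it runs quickly and does not require a GPU.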

