Support an aarch64 wheel #2195

Open
4np opened this issue Apr 15, 2024 · 4 comments
Labels
feature request Functionality does not currently exist, would need to be created as a new feature (type)

Comments


4np commented Apr 15, 2024

🌱 Support an aarch64 wheel

As an iOS engineer, my goal is to be able to transform trained models for use on device. As I, and probably most iOS engineers, don't use Python, having to install Anaconda / Miniconda as well as lots of Python dependencies isn't ideal as it will clutter up my machine. Even more so as I will only transform on demand, and then maybe for months won't use anything Python.

As such, I started work on a Docker container that sets up the required Python tools in a more contained fashion. However, you run into several hurdles as soon as you do:

  • Core ML Tools binary distribution doesn't work (Fail to import BlobReader from libmilstoragepython. No module named 'coremltools.libmilstoragepython')
  • Core ML Tools source distribution will only create an x86_64 wheel that you cannot use on aarch64 (Docker on Apple Silicon)

How can this feature be used?

To run Core ML Tools inside an aarch64 Docker container on Apple Silicon.

Describe alternatives you've considered

The obvious alternative is to install everything directly on my machine, but I'd rather keep things separate so I can more easily clean up my machine.

Additional context

It would be great if Core ML Tools could reliably run inside a Docker container on Apple Silicon to better compartmentalize Python dependencies (a Docker container is easily thrown away, whereas a Python dependency tree is harder to clean up and maintain).

Dockerfile compiling source distribution

FROM ubuntu:22.04
# Update distribution
RUN apt -y update && apt -y upgrade
# Install essentials
RUN apt -y install git-lfs nano curl zsh make cmake g++ build-essential uuid-dev neofetch
# Change shell from bash to zsh as some of the tools expect zsh
RUN chsh -s /bin/zsh
# Install Git LFS (Large File Storage, used by Hugging Face)
RUN git lfs install
# Set workdir to /opt
WORKDIR /opt
# Install Oh-My-ZSH for a nicer shell experience
RUN sh -c "$(curl -fsSL https://raw.githubusercontent.com/ohmyzsh/ohmyzsh/master/tools/install.sh)" "" --unattended
# Install Anaconda
#
# Anaconda is a distribution of the Python and R programming languages for scientific computing (data science, machine learning 
# applications, large-scale data processing, predictive analytics, etc.), that aims to simplify package management and deployment.
RUN curl https://repo.anaconda.com/archive/Anaconda3-2024.02-1-Linux-aarch64.sh -o anaconda.sh
RUN zsh anaconda.sh -b -f -p /opt/anaconda || true
RUN eval "$(/opt/anaconda/bin/conda shell.zsh hook)" && conda init zsh && conda init bash
# Clone CoreML Tools and build from source.
RUN git -C /opt clone https://github.com/apple/coremltools.git
RUN eval "$(/opt/anaconda/bin/conda shell.zsh hook)" && zsh -i coremltools/scripts/build.sh --python=3.11 --dist --debug
RUN zsh -i coremltools/scripts/build.sh --python=3.11 --dist
# Installing the built CoreML tools will fail, as there is no aarch64 wheel (only x86_64)...
RUN pip install $(ls coremltools/build/dist/*.whl|head -n1)
# Setup profile.
RUN echo "neofetch" >> ~/.zshrc
RUN echo "conda info" >> ~/.zshrc
# Create a dummy script
RUN echo "import coremltools as ct" > test.py
ENTRYPOINT ["/bin/zsh"]

Build and run container:

docker build -t test .
docker run --rm -it test

The build fails at RUN pip install $(ls coremltools/build/dist/*.whl|head -n1) because the source build produces only an x86_64 wheel, not an aarch64 one. However, if you run Docker on Apple Silicon, your container will be aarch64:

(base) ➜  /opt ls -la /opt/coremltools/build/dist
total 1816
drwxr-xr-x 1 root root    4096 Apr 15 05:31 .
drwxr-xr-x 1 root root    4096 Apr 15 05:32 ..
-rw-r--r-- 1 root root 1841473 Apr 15 05:32 coremltools-7.1.2-cp311-none-manylinux1_x86_64.whl
(base) ➜  /opt
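
The mismatch is visible in the file name itself: the last dash-separated field of a wheel name is its platform tag. A minimal Python sketch of that check, assuming a Linux container where platform.machine() reports values such as aarch64 or x86_64. The helper names are hypothetical, and the suffix comparison is a rough heuristic, not pip's actual compatibility logic:

```python
import platform
from typing import Optional


def wheel_platform_tag(filename: str) -> str:
    """Return the platform tag, the last dash-separated field of a wheel name."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename}")
    stem = filename[: -len(".whl")]
    # Wheel names follow {dist}-{version}[-{build}]-{python}-{abi}-{platform}
    return stem.split("-")[-1]


def wheel_matches_machine(filename: str, machine: Optional[str] = None) -> bool:
    """Heuristic: does the wheel's platform tag end with this machine type?"""
    machine = (machine or platform.machine()).lower()
    tag = wheel_platform_tag(filename).lower()
    # Pure-Python wheels use the tag "any" and install on every architecture.
    return tag == "any" or tag.endswith("_" + machine)


if __name__ == "__main__":
    name = "coremltools-7.1.2-cp311-none-manylinux1_x86_64.whl"
    print(wheel_platform_tag(name), wheel_matches_machine(name))
```

Run inside the container against the wheel listed above, this would be expected to report a mismatch on an aarch64 machine, which is consistent with the failing pip install step.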

Dockerfile using binary distribution

FROM ubuntu:22.04
# Update distribution
RUN apt -y update && apt -y upgrade
# Install essentials
RUN apt -y install git-lfs nano curl zsh make cmake g++ build-essential uuid-dev neofetch
# Change shell from bash to zsh as some of the tools expect zsh
RUN chsh -s /bin/zsh
# Install Git LFS (Large File Storage, used by Hugging Face)
RUN git lfs install
# Set workdir to /opt
WORKDIR /opt
# Install Oh-My-ZSH for a nicer shell experience
RUN sh -c "$(curl -fsSL https://raw.githubusercontent.com/ohmyzsh/ohmyzsh/master/tools/install.sh)" "" --unattended
# Install Anaconda
#
# Anaconda is a distribution of the Python and R programming languages for scientific computing (data science, machine learning 
# applications, large-scale data processing, predictive analytics, etc.), that aims to simplify package management and deployment.
RUN curl https://repo.anaconda.com/archive/Anaconda3-2024.02-1-Linux-aarch64.sh -o anaconda.sh
RUN zsh anaconda.sh -b -f -p /opt/anaconda || true
RUN eval "$(/opt/anaconda/bin/conda shell.zsh hook)" && conda init zsh && conda init bash
# Install CoreML Tools binary distribution
RUN eval "$(/opt/anaconda/bin/conda shell.zsh hook)" && pip install coremltools --pre -U
# Setup profile.
RUN echo "neofetch" >> ~/.zshrc
RUN echo "conda info" >> ~/.zshrc
# Create a dummy script
RUN echo "import coremltools as ct" > test.py
ENTRYPOINT ["/bin/zsh"]

Build and run container:

docker build -t test .
docker run --rm -it test

Inside the container, try the test.py script:

(base) ➜  /opt python test.py
scikit-learn version 1.2.2 is not supported. Minimum required version: 0.17. Maximum required version: 1.1.2. Disabling scikit-learn conversion API.
Fail to import BlobReader from libmilstoragepython. No module named 'coremltools.libmilstoragepython'
Fail to import BlobWriter from libmilstoragepython. No module named 'coremltools.libmilstoragepython'
(base) ➜  /opt

It complains that BlobReader and BlobWriter are not available, and you need both to actually transform models. The advice I found when searching for this error is to build Core ML Tools from source, which, as shown above, also doesn't work.
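
To see which compiled extension modules are missing without reading through warnings, a small probe along these lines may help. This is a sketch only: the module name coremltools.libmilstoragepython is taken from the error messages above, coremltools.libcoremlpython appears in a later error in this thread, and the helper names are hypothetical.

```python
import importlib.util
from typing import Dict, Iterable

# Native module names as they appear in the error messages in this thread.
NATIVE_MODULES = [
    "coremltools.libmilstoragepython",
    "coremltools.libcoremlpython",
]


def module_available(name: str) -> bool:
    """Return True if the import system can locate the named module."""
    try:
        # For a dotted name, find_spec imports the parent package first.
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised, for example, when the parent package is not installed.
        return False


def report(names: Iterable[str]) -> Dict[str, bool]:
    """Map each module name to whether it can be located."""
    return {name: module_available(name) for name in names}


if __name__ == "__main__":
    for name, found in report(NATIVE_MODULES).items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

Note that locating a module is weaker than importing it successfully: a binary built for the wrong architecture could be found and still fail to load.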

@4np 4np added the feature request Functionality does not currently exist, would need to be created as a new feature (type) label Apr 15, 2024
@TobyRoseman
Collaborator

I believe this is the same issue as #1254

Collaborator

junpeiz commented Apr 15, 2024

Thank you for filing the feature request.

+1 on Toby's point.

For your comments about

having to install Anaconda / Miniconda as well as lots of Python dependencies isn't ideal as it will clutter up my machine.

Just want to mention that conda installs all dependencies automatically as a turn-key solution and can remove them all again, so it won't pollute your existing environment.

Author

4np commented Apr 16, 2024

Thanks for the feedback and good to know that Conda is quite contained as well. I don't really use Python, other than for now trying to get trained models transformed to CoreML so I can start using them in Xcode.

Author

4np commented Apr 16, 2024

By the way, Core ML Tools does install when using --platform linux/amd64 and the amd64 / x86_64 sources, rather than arm64 / aarch64:

docker build --platform linux/amd64 -t core-ml-tools .
docker run --rm -it --platform linux/amd64 core-ml-tools

FROM ubuntu:22.04
# Update distribution
RUN apt -y update && apt -y upgrade
# Install essentials
RUN apt -y install git-lfs nano curl zsh make cmake g++ build-essential uuid-dev neofetch
# Change shell from bash to zsh as some of the tools expect zsh
RUN chsh -s /bin/zsh
# Install Git LFS (Large File Storage, used by Hugging Face)
RUN git lfs install
# Set workdir to /opt
WORKDIR /opt
# Install Oh-My-ZSH for a nicer shell experience
RUN sh -c "$(curl -fsSL https://raw.githubusercontent.com/ohmyzsh/ohmyzsh/master/tools/install.sh)" "" --unattended
# Install Anaconda
#
# Anaconda is a distribution of the Python and R programming languages for scientific computing (data science, machine learning 
# applications, large-scale data processing, predictive analytics, etc.), that aims to simplify package management and deployment.
RUN curl https://repo.anaconda.com/archive/Anaconda3-2024.02-1-Linux-x86_64.sh -o anaconda.sh
RUN zsh anaconda.sh -b -f -p /opt/anaconda || true
RUN eval "$(/opt/anaconda/bin/conda shell.zsh hook)" && conda init zsh && conda init bash
# Clone CoreML Tools and build from source.
RUN git -C /opt clone https://github.com/apple/coremltools.git
RUN eval "$(/opt/anaconda/bin/conda shell.zsh hook)" && zsh -i coremltools/scripts/build.sh --python=3.11 --dist --debug
RUN eval "$(/opt/anaconda/bin/conda shell.zsh hook)" && pip install $(find /opt/coremltools/build/dist/* -type f -name '*.whl'|head -n 1)
# Setup profile.
RUN echo "neofetch" >> ~/.zshrc
RUN echo "conda info" >> ~/.zshrc
ENTRYPOINT ["/bin/zsh"]

Unfortunately, importing Core ML Tools still prints errors:

(base) ➜  /models cat test.py
import coremltools
(base) ➜  /models python test.py
scikit-learn version 1.2.2 is not supported. Minimum required version: 0.17. Maximum required version: 1.1.2. Disabling scikit-learn conversion API.
Failed to load _MLModelProxy: No module named 'coremltools.libcoremlpython'
(base) ➜  /models
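
When switching between --platform values it is easy to lose track of which architecture a container is actually running as. A short Python sketch that prints the Docker-style platform string from inside the container; the function name and the alias table are assumptions for illustration, and the check relies on platform.machine() reporting the emulated architecture under --platform linux/amd64:

```python
import platform
from typing import Optional

# Map machine names reported by the OS to the names Docker uses.
ARCH_ALIASES = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "aarch64": "arm64",
    "arm64": "arm64",
}


def docker_platform(machine: Optional[str] = None, system: Optional[str] = None) -> str:
    """Return a Docker-style platform string such as 'linux/arm64'."""
    machine = (machine or platform.machine()).lower()
    system = (system or platform.system()).lower()
    # Fall back to the raw machine name for architectures not in the table.
    arch = ARCH_ALIASES.get(machine, machine)
    return f"{system}/{arch}"


if __name__ == "__main__":
    print(docker_platform())
```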
