
Pythia 2.8b converted by convert.py script will not call ANE #1

Open
adonishong opened this issue Jul 25, 2023 · 1 comment

@adonishong

Appreciate your work on this.

My testing machine is an M2 Max with 64GB of memory.

With the generate.py script, the Pythia 2.8b mlpackage from the GitHub release calls the ANE with either --compute_unit="All" or --compute_unit="CPUAndANE". However, if I convert Pythia 2.8b myself with convert.py, the resulting mlpackage does not call the ANE: with --compute_unit="All", only the CPU and GPU are used; with --compute_unit="CPUAndANE", only the CPU is used. Pythia-410m behaves differently: both the mlpackage downloaded from the GitHub release and the one converted with convert.py call the ANE.

BTW, Pythia-6.9b can also be converted with convert.py, and it works well with generate.py and --compute_unit="CPUAndGPU", but it does not call the ANE either.

@smpanaro (Owner)

👋 Hey!

The 2.8b in the GitHub release is not directly the result you get by running convert.py. For larger models you need to do two more steps:

  1. First, split the model into chunks. You will probably need to edit this line -- it looks like I used 670 for the 2.8b model.

     python -m src.experiments.chunk_model --mlpackage-path pythia-1.4b_2023_04_11-20_54_12.mlpackage -o .

     This should give you multiple files that end in _chunk{1,2,3,...}.mlpackage.
  2. Next, join the chunks into a single pipeline model. The argument can be any of the _chunk{1,2,3,...}.mlpackage files.

     python -m src.experiments.make_pipeline pythia-1.4b_2023_04_11-20_54_12_chunk1.mlpackage

  3. You can tell that it worked by doing Show Package Contents and seeing that the Data > com.apple.CoreML > weights folder has many files (one per chunk).
[Screenshots: package contents showing multiple files in the weights folder]
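If you'd rather check from the command line than via Show Package Contents, a small script can count the weight files directly. This is a hypothetical helper (not part of the repo), assuming the standard .mlpackage layout with weights stored under Data/com.apple.CoreML/weights:

```python
from pathlib import Path
import sys

def count_weight_files(mlpackage_path: str) -> int:
    """Count weight files inside an .mlpackage.

    Assumes the standard Core ML package layout, where weights live
    under Data/com.apple.CoreML/weights. A correctly chunked pipeline
    model should contain one weight file per chunk.
    """
    weights_dir = Path(mlpackage_path) / "Data" / "com.apple.CoreML" / "weights"
    return sum(1 for p in weights_dir.iterdir() if p.is_file())

if __name__ == "__main__":
    n = count_weight_files(sys.argv[1])
    suffix = " -- looks like a chunked pipeline" if n > 1 else ""
    print(f"{n} weight file(s) found{suffix}")
```

A single-chunk (unchunked) model will typically report one weight file, while a correctly assembled pipeline reports several.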

That should allow you to recreate the 2.8b model that runs on ANE. Two more things that might be helpful:

Measuring ANE

I'm not sure how you are checking to see if the model runs on the ANE, but I would recommend using the --wait flag and attaching the CoreML tool from Instruments. Xcode really struggles with these larger models.

python generate.py --model_path gpt2-medium.mlmodelc --compute_unit CPUAndANE --wait
[Screenshots: Instruments timeline with the CoreML tool attached]

For the chunked models you should see one "Neural Engine Prediction" block for each chunk of the model -- it will be obvious if some chunks run on ANE and some do not. (This screenshot is not a chunked model.) There will be a tiny gap between each block that runs on CPU, but it should be very small.
[Screenshot: Instruments timeline showing Neural Engine Prediction blocks]

6.9b Model

I only have an M1, but I think there is a chance you can get the 6.9b running on the M2's ANE. You will definitely need to use the chunk_model and make_pipeline tools. I would start with 670 for the chunk size (like 2.8b) and try smaller if that doesn't work. Let me know if you try it -- I'd be happy to help figure out how to get it working!

Sorry for the slow response and also that all of this is missing from the documentation.
