Is there support to run ANE accelerated loops/while_loop? #2208

0seba · 2024-04-29T14:02:46Z

Hi, I've been experimenting with while_loop (s), but I haven't had any success making them run accelerated on the ANE, and neither the GPU. Is it even possible for them to run with acceleration?

Here's an example code, the loop uses just a simple counter as exit condition

import numpy as np

import coremltools as ct
import coremltools.converters.mil as mil
from coremltools.converters.mil import Builder as mb


bsize = 4
seqlen = 128
dim = 512

qshape = (bsize, seqlen, dim)
kshape = (dim, dim)

@mb.program(
    input_specs=[
        mb.TensorSpec(shape=qshape, dtype=mil.input_types.types.fp16),
        mb.TensorSpec(shape=kshape, dtype=mil.input_types.types.fp16),
        mb.TensorSpec(shape=(1,), dtype=mil.input_types.types.int32),
    ],
    opset_version=mil.builder.AvailableTarget.iOS17,
)
def loop(q, k, l):
    i = mb.fill(shape=np.array([1]), value=np.array(0., dtype=np.int32))
    start = q
    loop_vars = (i, start)
    
    def cond(_i, state):
        return mb.less(x=_i, y=l)

    def body(_i, state):
        _prod = mb.matmul(x=state, y=k, transpose_y=False)
        state = mb.sigmoid(x=_prod)
        _i = mb.add(x=_i, y=np.ones([1], dtype=np.int32))
        return _i, state
        
    loop_vars = mb.while_loop(_cond=cond, _body=body, loop_vars=loop_vars)
    return loop_vars

mlmodel = ct.convert(
    loop,
    compute_units=ct.ComputeUnit.CPU_AND_NE,
    compute_precision=ct.precision.FLOAT16,
    minimum_deployment_target=ct.target.iOS17,
    inputs=[
        ct.TensorType(name='q', shape=ct.Shape(shape=qshape)),
        ct.TensorType(name='k', shape=ct.Shape(shape=kshape)),
        ct.TensorType(name='l', shape=ct.Shape(shape=(seqlen,))),
    ]
)

q = np.random.normal(scale=0.2, size=qshape).astype(np.float16)
k = np.random.normal(scale=0.2, size=kshape).astype(np.float16)
l = np.array([16]).astype(np.int32)

mlmodel.predict({'q': q, 'k': k, 'l': l})

ephemer · 2024-05-21T20:18:03Z

I don't think Apple wants to directly document or comment on this behaviour because it may change in the future etc.

For now, from what I can tell, the ANE cannot deal with any control flow whatsoever (including if/else) – it can take a static set of instructions and compute them quickly but that's it.

0seba added the question Response providing clarification needed. Will not be assigned to a release. (type) label Apr 29, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Is there support to run ANE accelerated loops/while_loop? #2208

Is there support to run ANE accelerated loops/while_loop? #2208

0seba commented Apr 29, 2024

ephemer commented May 21, 2024

Is there support to run ANE accelerated loops/while_loop? #2208

Is there support to run ANE accelerated loops/while_loop? #2208

Comments

0seba commented Apr 29, 2024

ephemer commented May 21, 2024