Rearrange inference OPS and stop using builder.load #5490

oelayan7 · 2024-05-01T09:16:44Z

This PR addresses every place where InferenceBuilder is used to
access an op or a specific implementation of an op.
Instead, an op is defined and picks its proper implementation internally,
so usage is transparent to the user.
What was done in the PR:

Added missing ops (added a py file with a fallback mechanism).
Added missing fallback implementations for existing ops.
Removed all usages of builder.load and replaced them with ops.
Added a workspace op and inferenceContext, which contains all workspace-related functions; inferenceContext is the Python fallback for the inferenceContext in CUDA.
Made a small change to the softmax_context signature to fit the fallback signature.
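
To illustrate the pattern described above, here is a minimal sketch of an op that picks its implementation internally and falls back to pure Python when no compiled kernel can be loaded. It is not the code from this PR: `OpWithFallback`, `load_compiled_module`, `bias_add_fallback`, and the `bias_add` kernel name are hypothetical names chosen for illustration, and plain Python lists stand in for tensors so the sketch runs without DeepSpeed or CUDA.

```python
import types


class OpWithFallback:
    """Hypothetical op wrapper: resolves its implementation once, at construction."""

    def __init__(self, load_module, kernel_name, fallback):
        try:
            # Prefer the compiled kernel when the module loads and exposes it.
            self._impl = getattr(load_module(), kernel_name)
            self.uses_fallback = False
        except (ImportError, AttributeError, RuntimeError):
            # Otherwise use the pure-Python implementation.
            self._impl = fallback
            self.uses_fallback = True

    def __call__(self, *args, **kwargs):
        # Callers never need to know which implementation was selected.
        return self._impl(*args, **kwargs)


def load_compiled_module():
    # Stand-in for a builder-style load; it always fails here to simulate
    # an environment without compiled inference kernels.
    raise ImportError("compiled inference kernels are not available")


def bias_add_fallback(activation, bias):
    # Pure-Python fallback: element-wise addition over two sequences.
    return [a + b for a, b in zip(activation, bias)]


# No compiled module available: the op silently uses the fallback.
bias_add = OpWithFallback(load_compiled_module, "bias_add", bias_add_fallback)
result = bias_add([1.0, 2.0], [0.5, 0.5])

# A fake "compiled" module: the same op class now picks the kernel instead.
fake_module = types.SimpleNamespace(bias_add=lambda activation, bias: "compiled")
compiled_bias_add = OpWithFallback(lambda: fake_module, "bias_add", bias_add_fallback)
```

Call sites use `bias_add(...)` the same way in both cases, which is the sense in which the selection is transparent to the user.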

loadams · 2024-05-22T21:54:19Z

Hi @oelayan7 - thanks for the contribution. Could you take a look at the failing tests?

lekurile · 2024-05-22T22:42:33Z

Kicked off a manual run of the nv-ds-chat GH workflow since this PR modifies the Hybrid Engine:
https://github.com/microsoft/DeepSpeed/actions/runs/9199337454

@loadams, @jomayeri, FYI.

oelayan7 added 3 commits May 1, 2024 10:53

Add missing inference ops

be6e569

Add missing fallback implementation for inference ops

81a11a4

oelayan7 requested review from mrwyattii, tjruwase, awan-10, arashb and loadams as code owners May 1, 2024 09:16

loadams added 2 commits May 22, 2024 11:06

Merge branch 'master' into rearrange_ops

fad09c9

Merge branch 'master' into rearrange_ops

b8faade

Merge branch 'master' into rearrange_ops

3034ce8
