
Fix baddbmm handling of "beta" special-case #1801

Open
wants to merge 1 commit into base: main

Conversation

Birch-san

baddbmm() has a beta coefficient, by which the bias may be multiplied.
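(For reference, per the PyTorch docs, torch.baddbmm computes out = beta * input + alpha * (batch1 @ batch2).)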

In the special case where beta=0, the bias may be ignored entirely (confirmed by the docs):

If beta is 0, then input will be ignored, and nan and inf in it will not be propagated.

We use this in diffusers / stable-diffusion to avoid adding an attention bias when none is specified:
https://github.com/huggingface/diffusers/blob/bbab8553224d12f7fd58b0e65b0daf899769ef0b/src/diffusers/models/cross_attention.py#L237
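
To make the failure mode concrete, here is a small standalone PyTorch sketch (not part of this PR); the NaN-filled bias stands in for the uninitialized attention-bias tensor that diffusers passes when no mask is given:

```python
import torch

# Minimal sketch of why the beta=0 special case matters. The NaN bias plays
# the role of diffusers' uninitialized attention-bias tensor.
bias = torch.full((1, 2, 2), float("nan"))
batch1 = torch.randn(1, 2, 3)
batch2 = torch.randn(1, 3, 2)

# PyTorch ignores `bias` entirely when beta=0, so the result is finite.
out = torch.baddbmm(bias, batch1, batch2, beta=0, alpha=1)
print(torch.isfinite(out).all())  # tensor(True)

# A lowering that merely scales the bias by beta reproduces 0 * NaN = NaN.
naive = 0 * bias + torch.bmm(batch1, batch2)
print(torch.isfinite(naive).all())  # tensor(False)
```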

Currently, the special case is determined by a comparison with 1.0 rather than 0.0; it looks like it was copied from alpha's special case:

```python
if beta.val != 1.0:
    # Apply scaling factor beta to the bias.
    bias = mb.mul(x=beta, y=bias, name=bias.name + "_scaled")
    context.add(bias)
if alpha.val != 1.0:
    # Apply scaling factor alpha to the input.
    batch1 = mb.mul(x=alpha, y=batch1, name=batch1.name + "_scaled")
    context.add(batch1)
```

Changing this fixed compilation of diffusers' UNet2DConditionModel for me (which employs that baddbmm in CrossAttnProcessor).
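
For illustration, a minimal sketch of the kind of change described (the exact one-commit diff may differ): treat beta == 0 as "drop the bias entirely" instead of emitting a multiply by zero, so NaN/inf in the ignored bias cannot leak into the result.

```python
# Sketch only; the actual diff in this PR may differ.
if beta.val == 0.0:
    # Per the PyTorch docs, the bias is ignored entirely when beta is 0,
    # and NaN/inf in it must not be propagated.
    bias = None
elif beta.val != 1.0:
    # Apply scaling factor beta to the bias.
    bias = mb.mul(x=beta, y=bias, name=bias.name + "_scaled")
    context.add(bias)

if alpha.val != 1.0:
    # Apply scaling factor alpha to the input.
    batch1 = mb.mul(x=alpha, y=batch1, name=batch1.name + "_scaled")
    context.add(batch1)

# Downstream, the final add of `bias` would then be skipped when bias is None.
```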

aseemw (Collaborator) commented Mar 12, 2023

Thanks for the PR. Can you please also add a unit test in the torch unit test file, one which fails without this change but passes with it?
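
A rough sketch of such a test (the module, shapes, tolerances, and the direct use of ct.convert / mlmodel.predict rather than coremltools' own torch test harness are assumptions, not the test file's existing conventions); it should fail before this change because 0 * NaN propagates, and pass after it:

```python
import numpy as np
import torch
import coremltools as ct

class BaddbmmBetaZero(torch.nn.Module):
    def forward(self, batch1, batch2):
        # NaN-filled bias must be ignored because beta=0.
        bias = torch.full((1, 2, 2), float("nan"))
        return torch.baddbmm(bias, batch1, batch2, beta=0, alpha=1)

def test_baddbmm_beta_zero():
    batch1 = torch.randn(1, 2, 3)
    batch2 = torch.randn(1, 3, 2)
    traced = torch.jit.trace(BaddbmmBetaZero().eval(), (batch1, batch2))

    mlmodel = ct.convert(
        traced,
        inputs=[
            ct.TensorType(name="batch1", shape=batch1.shape),
            ct.TensorType(name="batch2", shape=batch2.shape),
        ],
    )
    # Prediction requires macOS; compare against the traced torch output.
    coreml_out = list(
        mlmodel.predict({"batch1": batch1.numpy(), "batch2": batch2.numpy()}).values()
    )[0]
    torch_out = traced(batch1, batch2).detach().numpy()
    np.testing.assert_allclose(coreml_out, torch_out, rtol=1e-3, atol=1e-3)
```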
