Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

foo(x::Tuple{Vararg}) specializes much more aggressively than foo(x...) #54390

Open
topolarity opened this issue May 7, 2024 · 8 comments
Open
Labels
domain:types and dispatch Types, subtyping and method dispatch performance Must go faster

Comments

@topolarity
Copy link
Member

topolarity commented May 7, 2024

Consider these two very similar functions:

julia> foo_vararg(x::Union{AbstractString,Integer}...) = Base.:(*)(map(string, x)...)
julia> foo_vararg_tuple(x::Tuple{Vararg{Union{AbstractString,Integer}}}) = Base.:(*)(map(string, x)...)

The foo_vararg function limits the amount it specializes on length (an escape hatch was also added in #49172 to control this per-method, if necessary):

julia> Core.Compiler.get_compileable_sig(only(methods(foo_vararg)), Tuple{typeof(foo_vararg), String,Int,Int,Int,Int,Int,String}, Core.svec())
Tuple{typeof(foo_vararg), String, Vararg{Union{AbstractString, Integer}}}

In contrast, foo_vararg_tuple specializes on every permutation/length of tuple it receives:

julia> Core.Compiler.get_compileable_sig(only(methods(foo_vararg_tuple)), Tuple{typeof(foo_vararg_tuple), Tuple{String,Int,Int,Int,Int,Int,String}}, Core.svec())
Tuple{typeof(foo_vararg_tuple), Tuple{String, Int64, Int64, Int64, Int64, Int64, String}}

This over-specialization can often lead to type-unstable code and dynamic dispatches.

A better alternative I think would be to unroll this Vararg much less aggressively by default. We would still provide full Vararg unwrapping if you provide an explicit where hint, as in foo(::Tuple{Vararg{Union{AbstractString,Integer}, N}}) where N.

We already recommend that escape hatch in the performance tips for the foo(x...) case.

@topolarity topolarity changed the title foo(x::Tuple{Vararg}) specialized much more aggressively than foo(x...) foo(x::Tuple{Vararg}) specializes much more aggressively than foo(x...) May 7, 2024
@topolarity
Copy link
Member Author

topolarity commented May 7, 2024

To make the motivation here a bit more concrete, we have code in Base that uses Tuple{Vararg} arguments as containers:

julia/base/version.jl

Lines 83 to 84 in 1614e11

pre::Tuple{Vararg{Union{Integer,AbstractString}}} = (),
bld::Tuple{Vararg{Union{Integer,AbstractString}}} = ()) =

Ironically, this code ends up iterating over pre and bld, so that any length-based specialization is useless for inference:

julia/base/version.jl

Lines 59 to 68 in 1614e11

for ident in pre
if ident isa Integer
ident >= 0 || throw(ArgumentError("invalid negative pre-release identifier: $ident"))
else
if !occursin(r"^(?:|[0-9a-z-]*[a-z-][0-9a-z-]*)$"i, ident) ||
isempty(ident) && !(length(pre)==1 && isempty(bld))
throw(ArgumentError("invalid pre-release identifier: $(repr(ident))"))
end
end
end

Why is the unrolled Tuple information "useless" here? This loop is not statically unrolled, so a specialized instance of the function (e.g. foo(::Tuple{T,U,V})) ends up operating on something like ident::Union{T,U,V}. That Union is the same regardless of the order/number of its arguments, so from an inference perspective, we may have just as well provided Tuple{Vararg{Union{T,U,V}} and we would have produced code of the same quality.

That argument disregards codegen/ABI details, which could motivate specialization (e.g. to get fixed-size arguments), but for any polymorphic call-sites, i.e. call-sites where we don't know exactly what Tuple they'll pass in, de-specialization would be a clear win for this code.

@nsajko
Copy link
Contributor

nsajko commented May 7, 2024

This would be disruptive for performance-focused packages. And kinda breaking.

It's well known and documented that Julia almost always specializes on argument type. The performance tips document that there are exactly three exceptions. Tuple is not among them.

@topolarity
Copy link
Member Author

The performance tips document that there are exactly three exceptions. Tuple is not among them.

This is specifically about Vararg specialization, which is indeed one of those three exceptions.

@mikmoore
Copy link
Contributor

mikmoore commented May 7, 2024

I would (var)argue that Tuple{Vararg} is not the same thing as Vararg. Vararg is a special object that (as far as the documentation is concerned) can appear only within a Tuple type or within a method signature, with foo(x::Vararg) being equivalent to foo(x...). The parts that are nonspecialized are the method signature versions. When used within a Tuple, it details the structure of a Tuple and Tuple arguments are always specialized.

EDIT: this is to emphasize that foo(x::Tuple{Vararg}) is a distinct signature from foo(x::Vararg)/foo(x...), which is unclear from the title of this issue.

As for the version.jl code in question, it would appear that replacing the outer loop with a foreach might be appropriate. Or, if we prefer despecialization (since the VersionNumber struct has non-concrete fields anyway), use a Vector (or maybe Memory? does Memory support Union?) instead of a Tuple{Vararg}.

@topolarity
Copy link
Member Author

Using a Vector will require a breaking change the VersionNumber API, but we can probably work around a good part of the issue in this case with some @inline annotations.

In any case, the root of the issue is that there's currently no way to ask Julia to specialize foo(::Tuple{Vararg{T,N}}) on T but not on N. Asking for that specialization behavior seems very reasonable to me, so if we don't want to make our specialization of Tuple consistent with our top-level specialization behavior in general, maybe we need another escape hatch instead.

@mikmoore
Copy link
Contributor

mikmoore commented May 7, 2024

there's currently no way to ask Julia to specialize foo(::Tuple{Vararg{T,N}}) on T but not on N

One also can't have an Array{T,N} argument that specializes on T but not N. It could be nice to have this functionality, but it seems like a lot of work and it goes far beyond just Tuple.

Just like NTuple, Tuple{Vararg} isn't a type. It's just a way to write a Tuple type with parametric length. So you also couldn't disambiguate a Tuple{Vararg} method from another Tuple method that wasn't written with Vararg.

julia> Tuple{Vararg{Int64,3}}
Tuple{Int64, Int64, Int64}

Maybe one could go deep into the compiler to make specific exceptions when Tuple{Vararg} appears explicitly in a method signature, but getting that to work properly and non-degenerately seems like a big task (assuming it's even possible to integrate with the rest of Julia's semantics).

If you want collections that don't specialize on length, use Vector or Memory.

@vtjnash
Copy link
Sponsor Member

vtjnash commented May 7, 2024

Why does it need to dispatch on the order of the combination of Int/String types of it? Often we deal with this sort of situation by making it just accept any Tuple, then using a type assert inside (if necessary) to provide diagnostics if the user accidentally passes something else

@topolarity
Copy link
Member Author

Why does it need to dispatch on the order of the combination of Int/String types of it? Often we deal with this sort of situation by making it just accept any Tuple, then using a type assert inside (if necessary) to provide diagnostics if the user accidentally passes something else

Yeah, that's an interesting thought.

We could de-specialize completely on the Tuple{Vararg{Union{Integer,AbstractString}}} argument, but that also means we give up on any chance of resolving the internal Int64(::Integer) and String(::AbstractString) calls statically.

@nsajko nsajko added the domain:types and dispatch Types, subtyping and method dispatch label May 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain:types and dispatch Types, subtyping and method dispatch performance Must go faster
Projects
None yet
Development

No branches or pull requests

4 participants