Scale capability: Use native colour format if the geom prefers it #5033

Open: wants to merge 11 commits into base: main

@zeehio (Contributor) commented Nov 6, 2022

This pull request is on top of:

When a geom maps data to a fill or colour aesthetic, the scale transforms the column values into a character vector of colours ("#ff0000", ...). Some geoms do not use character colours; they use native colours instead (the integer format used by nativeRaster objects), so they must convert the format themselves when rendering (e.g. https://github.com/zeehio/ggmatrix/blob/98445bf28caaca1022c03a542b8b4541034566a2/R/geom_matrix_raster.R#L123).

If the geom can tell the scale that it would rather have colours in native format, and if the scale can tell the same to the palette, the intermediate character representation of colours can be avoided with significant performance benefits.

Pull request #5031 took care of providing a way for geoms to pass parameters to scales. Here we let a geom tell the scale that it prefers to have colours in native format instead of character format. The scale will transform the colours to native format, or will let the palette do the transformation if the palette supports that (as exposed in r-lib/scales#372).

This has significant performance improvements for:

We can extend palettes with attributes to improve the mapping efficiency.

With this commit a palette may set the attribute `may_return_na` to `FALSE`.
If it does, `ScaleContinuous` will assume the palette never returns missing
values, and it will skip checking for those and replacing them.
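
As a rough illustration of the idea (not the code in this pull request), the sketch below shows a scale-like helper that skips the missing-value replacement when the palette carries `may_return_na = FALSE`. The names `map_values`, `safe_pal` and `na_value` are hypothetical and only used for this example.

```r
# Hypothetical palette: clamps its input, so it never returns NA
# for non-missing values.
cols <- c("#000000", "#555555", "#AAAAAA", "#FFFFFF")
safe_pal <- function(x) cols[pmin(pmax(floor(x * 4) + 1, 1), 4)]
attr(safe_pal, "may_return_na") <- FALSE

# Hypothetical stand-in for the mapping step of a continuous scale.
map_values <- function(x, palette, na_value = "grey50") {
  colours <- palette(x)
  # A missing attribute is treated as "may return NA", so the
  # check remains the default behaviour.
  if (isFALSE(attr(palette, "may_return_na"))) {
    return(colours)
  }
  ifelse(is.na(colours), na_value, colours)
}

map_values(c(0, 0.5, 1), safe_pal)
```

Palettes without the attribute still go through the `ifelse()` replacement, so existing palettes would behave as before under this assumption.
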

ScaleContinuous maps values to palette colours as follows:

- unique values are found
- unique values are mapped to colors
- colors are matched to the original vector

If most values are unique, we can be faster by simply mapping all values to colors,
without finding and matching unique values first.
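
The two strategies can be sketched in a few lines of base R. This is a simplified illustration, not the scale's actual code; `map_unique`, `map_raw` and `toy_pal` are hypothetical names, and `x` is assumed to be already rescaled to [0, 1].

```r
# Toy palette used only for this example: grey levels for values in [0, 1].
toy_pal <- function(x) grDevices::grey(x)

# "unique" approach: call the palette once per distinct value,
# then match the colours back to the original vector.
map_unique <- function(x, palette) {
  uniq <- unique(x)
  palette(uniq)[match(x, uniq)]
}

# "raw" approach: call the palette on every value directly.
map_raw <- function(x, palette) {
  palette(x)
}

x <- c(0.2, 0.8, 0.2, 0.5)
identical(map_unique(x, toy_pal), map_raw(x, toy_pal))
```

Both routes return the same colours; they differ only in cost. The `unique()` and `match()` calls pay off when many values repeat, and are pure overhead when nearly every value is distinct.
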

In some scenarios the geom can guess or know if that is going to be the case.

The goal of this commit is to let the geom tell the ScaleContinuous scale
how the mapping from values to colours should be done.

By default the existing "unique" approach is used.

The geom may now specify `scale_params = list(fill=list(mapping_method = "raw"))`
to tell the scale corresponding to the fill aesthetic to use a "raw" approach
of mapping values to colours without finding unique values first.

Besides the default "unique" and the new "raw" mapping methods, we also allow
the geom to ask to use the "binned" approach, where the geom specifies a number
of intervals to use and the mapping process is as follows:

- values are binned in N intervals
- intervals are mapped to colors

This approach is "lossy" (we have a maximum of N different colours), but
it can be much faster, and the result is almost indistinguishable from
the other mapping methods.
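
A minimal sketch of the binned idea follows. It assumes `x` is already rescaled to [0, 1] and that each interval is represented by the colour of its midpoint; the commit message does not say which representative value is used, so the midpoint is an assumption, and `map_binned` and `n_bins` are hypothetical names.

```r
map_binned <- function(x, palette, n_bins = 256L) {
  # Split [0, 1] into n_bins equal-width intervals.
  breaks <- seq(0, 1, length.out = n_bins + 1L)
  # Assumption: one colour per interval, taken at the interval midpoint.
  mids <- (breaks[-1L] + breaks[-length(breaks)]) / 2
  bin_colours <- palette(mids)
  # Assign each value to its interval (1..n_bins) and look up the colour.
  bin <- findInterval(x, breaks, rightmost.closed = TRUE, all.inside = TRUE)
  bin_colours[bin]
}

map_binned(c(0, 0.1, 1), grDevices::grey, n_bins = 4L)
```

The palette is called only `n_bins` times regardless of the length of `x`, which is where the speed-up would come from; the price is that at most `n_bins` distinct colours can appear.
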

Some geoms benefit from using native colour format instead
of the character based colour format.

This commit lets the geoms specify that they prefer native
format for colours, and gives the responsibility of converting
into that format to ScaleContinuous.

Since scales are currently not required to honor scale_params,
a geom that requests native colours still has to check that the
colours it receives are in native format, and do the conversion itself if
the scale has not done it.
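
The defensive check on the geom side could look roughly like the sketch below. It uses only base R and assumes the native layout packs the channels as red + green·2^8 + blue·2^16 + alpha·2^24 into a signed 32-bit integer, which is my understanding of R's internal colour representation; `to_native` is a hypothetical helper, not a function from ggplot2 or scales.

```r
# Hypothetical helper a geom could call on the colours it receives.
to_native <- function(colour) {
  # The scale honoured the request: nothing to do.
  if (is.integer(colour)) {
    return(colour)
  }
  # Fallback: convert character colours ourselves.
  rgba <- grDevices::col2rgb(colour, alpha = TRUE)
  # Pack in double precision to avoid integer overflow.
  packed <- rgba["red", ] + rgba["green", ] * 256 +
    rgba["blue", ] * 65536 + rgba["alpha", ] * 16777216
  # Wrap into the signed 32-bit range used by R integers.
  # (The single bit pattern 0x80000000 coincides with NA_integer_.)
  packed <- ifelse(packed >= 2^31, packed - 2^32, packed)
  as.integer(packed)
}

to_native(c("#FF0000", "#FFFFFF"))
```

The fallback branch is the slow path this pull request tries to avoid: it parses every colour string before packing it.
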

However, by optionally shifting the responsibility for the conversion
to the scale, we open the door to further optimizations.

If a palette has accepts_native_output=TRUE set as an attribute,
ScaleContinuous assumes the palette has an optional argument named
`color_fmt` which can be set to either "character" or "native".

If the geom prefers a native output format and the palette supports
it, we let the palette take care of it.

The conversion goes from value -> native colour, which is much faster
than going through an intermediate character representation of
the colours.
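
To make the handshake concrete, here is a small base R sketch of a palette that advertises `accepts_native_output` and takes a `color_fmt` argument, plus a scale-like helper that uses it. The attribute and argument names come from the description above; `native_grey_pal`, `map_colours` and `prefer_native` are hypothetical, and the native byte layout (red in the lowest byte, alpha in the highest) is an assumption about R's internal format.

```r
# Hypothetical grey palette for values in [0, 1].
native_grey_pal <- function(x, color_fmt = c("character", "native")) {
  color_fmt <- match.arg(color_fmt)
  level <- as.integer(round(x * 255))
  if (color_fmt == "native") {
    # value -> native colour directly: equal R, G and B bytes with
    # opaque alpha, already wrapped into the signed 32-bit range.
    return(as.integer(level * 65793 - 16777216))
  }
  grDevices::rgb(level, level, level, maxColorValue = 255)
}
attr(native_grey_pal, "accepts_native_output") <- TRUE

# Hypothetical stand-in for the scale's mapping step.
map_colours <- function(x, palette, prefer_native = FALSE) {
  if (prefer_native && isTRUE(attr(palette, "accepts_native_output"))) {
    palette(x, color_fmt = "native")
  } else {
    # Character colours; a geom that wanted native must convert them.
    palette(x)
  }
}

map_colours(c(0, 1), native_grey_pal, prefer_native = TRUE)
```

Palettes without the attribute are called exactly as before, so the extra argument is never passed to a palette that does not expect it.
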