Improve LongRange execution for ranges that have min value as Long.MIN_VALUE and max value as Long.MAX_VALUE or min as 0, and max as 2^64-1 #13375

gautamworah96 · 2024-05-16T00:15:24Z

Description

At Amazon Search, we came across a lot of client queries that specified the min value as 0 and the max value as Long.MAX_VALUE.

We don't use the associated ValueSourceQuery internally; we use custom logic for filtering.
An improvement here would be to rewrite ValueSourceQuery to a MatchAllDocsQuery if we detect that the min and max values equal Long.MIN_VALUE and Long.MAX_VALUE.

I think we will have to check for something like (minVal==0 and maxVal==2^64-1) or (minVal==Long.MIN_VALUE and maxVal==Long.MAX_VALUE) since we will have to cover both signed and unsigned cases.
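A minimal sketch of the two checks described above, in plain Java. The class and method names are hypothetical and are not part of Lucene. One detail worth spelling out: 2^64-1 does not fit in a signed long, so under an unsigned interpretation it is the bit pattern of -1L. Note also that the range 0 to Long.MAX_VALUE passes neither check; it only covers every value if the field is known to hold non-negative values.

```java
// Hypothetical helper, not a Lucene class: detects "full" 64-bit ranges.
final class FullRangeCheck {

  /** True when [min, max] covers every signed 64-bit value. */
  static boolean isFullSignedRange(long min, long max) {
    return min == Long.MIN_VALUE && max == Long.MAX_VALUE;
  }

  /**
   * True when [min, max] covers every value under an unsigned interpretation,
   * i.e. 0 .. 2^64-1. As a Java long, 2^64-1 has the same bits as -1L.
   */
  static boolean isFullUnsignedRange(long min, long max) {
    return min == 0L && max == -1L;
  }

  /** Covers both the signed and the unsigned case. */
  static boolean isFullRange(long min, long max) {
    return isFullSignedRange(min, max) || isFullUnsignedRange(min, max);
  }
}
```

A caller holding an unsigned max as text could obtain the long form with Long.parseUnsignedLong("18446744073709551615") before running the check.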

We can implement similar logic for DoubleRange as well.


timgrein · 2024-05-16T15:55:13Z

That sounds interesting. I can take a look at DoubleRange first and do some benchmarking. If it shows promising results, it shouldn't be too hard to extend it to the other data types.

gautamworah96 · 2024-05-16T20:12:42Z

> I can take a look at DoubleRange first and do some benchmarking

I suspect that existing benchmarks won't show needle-moving results if they don't have enough queries that can benefit from this optimization. The change is still a positive one, though. We may need to benchmark it on a query set that specifically has a lot of these "full range from 0 to Long.MAX_VALUE" clauses.

timgrein · 2024-05-17T06:50:54Z

> I suspect that existing benchmarks won't show needle-moving results if they don't have enough queries that can benefit from this optimization.

Yes, I'm gathering some specific numbers for this exact case. I also suspect that this rewrite should have a purely positive effect, especially considering that you have a maximum of 4 dimensions for FloatRange, DoubleRange, etc., which should make detecting a query containing such a range very fast.
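To illustrate why the detection cost is negligible, here is a rough sketch of a per-dimension check for a multi-dimensional long range. It assumes the range is given as parallel min/max arrays with one entry per dimension; the class and method names are hypothetical and not taken from Lucene. With at most 4 dimensions, the loop does at most 8 comparisons.

```java
// Hypothetical helper, not a Lucene class.
final class MultiDimFullRange {

  /**
   * True only when every dimension spans Long.MIN_VALUE .. Long.MAX_VALUE.
   * min[d] and max[d] are the bounds of dimension d.
   */
  static boolean isFullRange(long[] min, long[] max) {
    if (min.length != max.length) {
      throw new IllegalArgumentException("min and max must have the same number of dimensions");
    }
    if (min.length == 0) {
      return false; // no dimensions: nothing to optimize
    }
    for (int d = 0; d < min.length; d++) {
      if (min[d] != Long.MIN_VALUE || max[d] != Long.MAX_VALUE) {
        return false; // one restricted dimension is enough to rule out the rewrite
      }
    }
    return true;
  }
}
```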

msokolov · 2024-05-19T11:48:44Z

Is it valid to rewrite to match all docs though? Some docs may lack a value, and I didn't think we would match in that case.

timgrein · 2024-05-19T12:06:22Z

I took a first pass at this and also noticed that MatchAllDocsQuery is semantically wrong, as some docs might not have the field (this was also caught by the existing test cases). What I think could work is to check whether the valueSource is of type FieldValuesSource and rewrite the query to a FieldExistsQuery if we encounter the maximum possible range (min of Long.MIN_VALUE and max of Long.MAX_VALUE at the same time).
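The semantic difference can be shown with a small in-memory model. This is only an illustration under simplifying assumptions, not the code from the PR: documents are modeled as a Long[] where null means the field is missing, and all method names are hypothetical stand-ins for the corresponding query types.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model, not Lucene: docs[i] == null means doc i has no value for the field.
final class RewriteSketch {

  /** Stand-in for MatchAllDocsQuery: every doc id matches. */
  static List<Integer> matchAll(Long[] docs) {
    List<Integer> hits = new ArrayList<>();
    for (int i = 0; i < docs.length; i++) {
      hits.add(i);
    }
    return hits;
  }

  /** Stand-in for a field-exists query: only docs that have a value match. */
  static List<Integer> fieldExists(Long[] docs) {
    List<Integer> hits = new ArrayList<>();
    for (int i = 0; i < docs.length; i++) {
      if (docs[i] != null) {
        hits.add(i);
      }
    }
    return hits;
  }

  /** Stand-in for the unoptimized range filter: the value must exist and lie in [min, max]. */
  static List<Integer> matchRange(Long[] docs, long min, long max) {
    List<Integer> hits = new ArrayList<>();
    for (int i = 0; i < docs.length; i++) {
      Long v = docs[i];
      if (v != null && v >= min && v <= max) {
        hits.add(i);
      }
    }
    return hits;
  }

  /** The proposed rewrite: a full range degenerates to "field exists", not "match all". */
  static List<Integer> rewrittenRange(Long[] docs, long min, long max) {
    if (min == Long.MIN_VALUE && max == Long.MAX_VALUE) {
      return fieldExists(docs);
    }
    return matchRange(docs, min, max);
  }
}
```

In this model, the full-range filter and the field-exists stand-in return the same doc ids, while the match-all stand-in additionally returns the docs with no value.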

WDYT @msokolov ?

msokolov · 2024-05-19T12:18:05Z

+1, that should work.

timgrein · 2024-05-19T14:38:53Z

@msokolov Opened a PR for LongRange, if you want to take a look (I cannot request a reviewer...): #13383

gautamworah96 added the type:enhancement label May 16, 2024

timgrein mentioned this issue May 19, 2024

Rewrite LongRange.ValueSourceQuery/MultiValueSourceQuery to FieldExistsQuery on max range #13383

Open
