-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf(sql): rewrite trivial expressions over same column in GROUP BY queries #4508
base: master
Are you sure you want to change the base?
Conversation
…or Q35. More refactoring tba
if (model.getSelectModelType() == QueryModel.SELECT_MODEL_VIRTUAL | ||
&& nestedModel.getSelectModelType() == QueryModel.SELECT_MODEL_GROUP_BY) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: maybe this could be safely relaxed
if (model.getSelectModelType() == QueryModel.SELECT_MODEL_VIRTUAL | ||
&& nestedModel.getSelectModelType() == QueryModel.SELECT_MODEL_GROUP_BY) { | ||
|
||
CharSequenceIntHashMap nestedCandidates = new CharSequenceIntHashMap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: is there a lighter option?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could move this map into a field and reuse it between the invocations.
@@ -1447,6 +1447,49 @@ public void testOrderingOfSortsInSingleTimestampCase() throws Exception { | |||
}); | |||
} | |||
|
|||
@Test | |||
public void testRewriteTrivialExpressionsBasic() throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
more tests would be good
…_rewrite_trivial_expr
[PR Coverage check]😍 pass : 49 / 50 (98.00%) file detail
|
We need to mitigate perf degradation on the c6a.metal box (192 cores, 384GB RAM): 0.6s for Java vs 2.2s for Rosti. The reason is that Java code implements hash table sharding while C++ code doesn't. Maybe we could mitigate this by limiting the max number of Rosti tables in use here: questdb/core/src/main/java/io/questdb/griffin/engine/groupby/vect/GroupByRecordCursorFactory.java Line 82 in fe2aeca
|
I have a better idea: we should disable keyed Rosti for all types but SYMBOL. That's because SYMBOL type has low cardinality, while it's not always the case for INT or IPv4. So, for high cardinality columns Java-based GROUP BY will be faster than the Rosti one. @bluestreak01 WDYT? |
Closes #4141
This relates to performance around Clickbench Q35.
For Clickbench Q35 on M2 Mac Mini, this speeds up the query from 1.7s to 0.9s.
The rewritten query runs using a Rosti implementation and an early limit, instead of async group by and a late limit.
With changes but async group by instead of rosti, it runs in 1.18s.
Query:
Before change:
Execute: 1.78s
After change:
Execute: 915.35ms