Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(sql): rewrite trivial expressions over same column in GROUP BY queries #4508

Open
wants to merge 18 commits into
base: master
Choose a base branch
from

Conversation

nwoolmer
Copy link
Contributor

@nwoolmer nwoolmer commented May 15, 2024

Closes #4141

This relates to performance around Clickbench Q35.

For Clickbench Q35 on M2 Mac Mini, this speeds up the query from 1.7s to 0.9s.

The rewritten query runs using a Rosti implementation and an early limit, instead of async group by and a late limit.

With changes but async group by instead of rosti, it runs in 1.18s.

Query:

SELECT ClientIP, ClientIP - 1, ClientIP - 2, ClientIP - 3, COUNT(*) AS c 
FROM hits 
GROUP BY ClientIP, ClientIP - 1, ClientIP - 2, ClientIP - 3 
ORDER BY c DESC LIMIT 10;

Before change:

Sort light lo: 10
  keys: [c desc]
    VirtualRecord
      functions: [ClientIP,column,column1,column2,c]
        Async Group By workers: 8
          keys: [ClientIP,column,column1,column2]
          values: [count(*)]
          filter: null
            DataFrame
                Row forward scan
                Frame forward scan on: hits

Execute: 1.78s

After change:

VirtualRecord
  functions: [ClientIP,ClientIP-1,ClientIP-2,ClientIP-3,c]
    Sort light lo: 10
      keys: [c desc]
        GroupBy vectorized: true workers: 8
          keys: [ClientIP]          values: [count(*)]
            DataFrame
                Row forward scan
                Frame forward scan on: hits

Execute: 915.35ms

@nwoolmer nwoolmer added SQL Issues or changes relating to SQL execution Performance Performance improvements labels May 15, 2024
Comment on lines +5441 to +5442
if (model.getSelectModelType() == QueryModel.SELECT_MODEL_VIRTUAL
&& nestedModel.getSelectModelType() == QueryModel.SELECT_MODEL_GROUP_BY) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe this could be safely relaxed

@nwoolmer nwoolmer requested a review from puzpuzpuz May 15, 2024 15:40
@nwoolmer nwoolmer marked this pull request as ready for review May 15, 2024 15:41
if (model.getSelectModelType() == QueryModel.SELECT_MODEL_VIRTUAL
&& nestedModel.getSelectModelType() == QueryModel.SELECT_MODEL_GROUP_BY) {

CharSequenceIntHashMap nestedCandidates = new CharSequenceIntHashMap();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: is there a lighter option?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could move this map into a field and reuse it between the invocations.

@@ -1447,6 +1447,49 @@ public void testOrderingOfSortsInSingleTimestampCase() throws Exception {
});
}

@Test
public void testRewriteTrivialExpressionsBasic() throws Exception {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

more tests would be good

@ideoma
Copy link
Collaborator

ideoma commented May 28, 2024

[PR Coverage check]

😍 pass : 49 / 50 (98.00%)

file detail

path covered line new line coverage
🔵 io/questdb/griffin/SqlOptimiser.java 48 49 97.96%
🔵 io/questdb/griffin/model/QueryModel.java 1 1 100.00%

@puzpuzpuz
Copy link
Contributor

We need to mitigate perf degradation on the c6a.metal box (192 cores, 384GB RAM): 0.6s for Java vs 2.2s for Rosti. The reason is that Java code implements hash table sharding while C++ code doesn't. Maybe we could mitigate this by limiting the max number of Rosti tables in use here:

@nwoolmer nwoolmer added the DO NOT MERGE These changes should not be merged to main branch label May 28, 2024
@puzpuzpuz
Copy link
Contributor

We need to mitigate perf degradation on the c6a.metal box (192 cores, 384GB RAM): 0.6s for Java vs 2.2s for Rosti. The reason is that Java code implements hash table sharding while C++ code doesn't. Maybe we could mitigate this by limiting the max number of Rosti tables in use here

I have a better idea: we should disable keyed Rosti for all types but SYMBOL. That's because SYMBOL type has low cardinality, while it's not always the case for INT or IPv4. So, for high cardinality columns Java-based GROUP BY will be faster than the Rosti one.

@bluestreak01 WDYT?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
DO NOT MERGE These changes should not be merged to main branch Performance Performance improvements ready for review SQL Issues or changes relating to SQL execution
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Rewrite GROUP BY queries with trivial expressions over the same column
4 participants