[iceberg] Tag columns with "partition key" in DESCRIBE and SHOW COLUMNS output #22675
base: master
Thanks for the fix. One small question for discussion: do you think we should also show non-identity partition transforms? For example, suppose I create an Iceberg table as follows:
create table test_table(a int, b varchar, c timestamp) with (partitioning = ARRAY['a', 'truncate(b, 2)', 'year(c)']);
Then show create table would show all the partition transforms:
presto:default> show create table test_table;
Create Table
-----------------------------------------------------------------------
CREATE TABLE iceberg.default.test_table (
"a" integer,
"b" varchar,
"c" timestamp
)
WITH (
delete_mode = 'merge-on-read',
format = 'PARQUET',
format_version = '2',
location = 'file:/Users/wangd/work/data/iceberg/data/default/test_table',
partitioning = ARRAY['a','truncate(b, 2)','year(c)']
)
Maybe it's better to keep desc consistent with show create table. What's your opinion?
needs tests
@hantangwangd I think that's a valid ask. What would you suggest we mention in the Extra info in this case? Should we mention the transform used for hidden partitioning on the respective columns?
@imjalpreet For non-identity partition columns, I think showing the transform information would be enough, maybe something like
Is that ok? Or do you have a better idea?
@hantangwangd Let's say a date/timestamp column has two hidden partition transforms, year and month. What would be the best way to display that? Should we write partition by year, month, or is there a better way to communicate that there are two hidden partition transforms on this column?
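To make the question concrete, here is a minimal sketch of the grouping being discussed: it takes a partitioning array like the one in the CREATE TABLE example and collects the transforms per column. It is not the code in this PR and not Presto's API. The function name `extra_info`, the regular expression, and the output strings are hypothetical, borrowed only from the "partition key" wording in the PR title and the "partition by year, month" phrasing floated above. Transform arguments such as the 2 in truncate(b, 2) are dropped for brevity.

```python
import re
from collections import OrderedDict

# Matches "transform(column)" or "transform(column, arg)", e.g. "truncate(b, 2)".
# Anything that does not match is treated as an identity partition on that column.
_TRANSFORM = re.compile(r"^(\w+)\(\s*(\w+)\s*(?:,\s*\w+\s*)?\)$")


def extra_info(partitioning):
    """Map each partitioned column to a hypothetical Extra string."""
    transforms = OrderedDict()
    for field in partitioning:
        field = field.strip()
        match = _TRANSFORM.match(field)
        if match:
            name, column = match.group(1), match.group(2)
        else:
            name, column = "identity", field
        # A column may carry several transforms, so collect them in order.
        transforms.setdefault(column, []).append(name)

    result = {}
    for column, names in transforms.items():
        if names == ["identity"]:
            result[column] = "partition key"
        else:
            result[column] = "partition by " + ", ".join(names)
    return result


if __name__ == "__main__":
    print(extra_info(["a", "truncate(b, 2)", "year(c)"]))
    print(extra_info(["year(c)", "month(c)"]))
```

With this grouping, a column that has both year and month transforms yields a single entry listing both, which is one possible answer to the display question. Whether that wording is the right one is exactly what the thread is trying to settle.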
@imjalpreet Thanks for raising this great question so that we can discuss and handle it. Yes, Iceberg allows multiple transforms on a column. After a careful check of the spec and the implementation, I found the following details:
So we can draw some conclusions:
So do you think the following examples are reasonable?
Description
After the change:
Motivation and Context
Fixes #22638
Contributor checklist
Release Notes