[AMORO-2819]Spark cannot execute the "alter table set identifier field" command on tables in Iceberg format in unified catalog #2825

Open: Aireed wants to merge 3 commits into base: master

@Aireed (Contributor) commented May 10, 2024

Why are the changes needed?

Closes #2819.

Brief change log

  • Fix the issue of creating a mixed format table using "create table like".

  • Shade out the Iceberg classes called by the Amoro Spark extension.

  • Cause: UnifiedSessionCatalog is neither Iceberg's SparkCatalog nor its SparkSessionCatalog, so the catalogAndIdentifier match fails and Spark cannot obtain a physical plan for the command.

Solution:

  1. Copy Iceberg's ExtendedDataSourceV2Strategy as AmoroExtendedDataSourceV2Strategy and override IcebergCatalogAndIdentifier in the copy so that it also matches UnifiedSessionCatalog.
  2. Replace ExtendedDataSourceV2Strategy with AmoroExtendedDataSourceV2Strategy in ArcticSparkExtensions.
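
To make the cause and the fix easier to follow, here is a minimal, self-contained Java sketch of the matching problem. It is a toy model, not the code in this patch: `CatalogMatchSketch`, `IcebergCatalog`, `UnifiedCatalog`, `icebergMatch`, and `amoroMatch` are all hypothetical stand-ins, and the rule "accept the unified catalog when the table is in Iceberg format" is an assumption about what the overridden matcher checks.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;

// Toy model only: every type here is a hypothetical stand-in, not the real
// Spark, Iceberg, or Amoro API.
final class CatalogMatchSketch {

    interface Catalog {
        String name();
    }

    // Stand-in for Iceberg's own catalog classes (SparkCatalog / SparkSessionCatalog).
    static final class IcebergCatalog implements Catalog {
        private final String name;
        IcebergCatalog(String name) { this.name = name; }
        public String name() { return name; }
    }

    // Stand-in for UnifiedSessionCatalog: it can serve Iceberg-format tables
    // but is not an instance of either Iceberg catalog class.
    static final class UnifiedCatalog implements Catalog {
        private final String name;
        private final Set<String> icebergTables;
        UnifiedCatalog(String name, String... icebergTables) {
            this.name = name;
            this.icebergTables = new HashSet<>(Arrays.asList(icebergTables));
        }
        public String name() { return name; }
        boolean isIcebergTable(String table) { return icebergTables.contains(table); }
    }

    // Models the type-based check described as the cause: only Iceberg's own
    // catalog classes match, so a unified catalog yields nothing to plan.
    static Optional<String> icebergMatch(Catalog catalog, String table) {
        if (catalog instanceof IcebergCatalog) {
            return Optional.of(catalog.name() + "." + table);
        }
        return Optional.empty();
    }

    // Models the overridden check: try the original rule first, then also
    // accept a unified catalog when the table is in Iceberg format (assumed rule).
    static Optional<String> amoroMatch(Catalog catalog, String table) {
        Optional<String> matched = icebergMatch(catalog, table);
        if (matched.isPresent()) {
            return matched;
        }
        if (catalog instanceof UnifiedCatalog
                && ((UnifiedCatalog) catalog).isIcebergTable(table)) {
            return Optional.of(catalog.name() + "." + table);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Catalog unified = new UnifiedCatalog("unified", "db.orders");
        System.out.println(icebergMatch(unified, "db.orders"));
        System.out.println(amoroMatch(unified, "db.orders"));
    }
}
```

In this model the original check returns an empty result for the unified catalog, which corresponds to the missing physical plan, while the widened check resolves the same table. Catalogs that already matched keep matching, which is why the copied strategy can replace the original one in the extension.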

Tips:

  1. ArcticSparkExtensions injects both the Arctic extension and the Iceberg extension, so the Iceberg extension no longer needs to be configured separately in spark.sql.extensions.

How was this patch tested?

  • Add some test cases that check the changes thoroughly including negative and positive cases if possible

  • Add screenshots for manual tests if appropriate

  • Run test locally before making a pull request

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

@Aireed (Contributor Author) commented May 10, 2024

cc @baiyangtx PTAL

@XBaith (Contributor) left a comment

Hi @Aireed, can you add a unit test for this case? That would help us verify the effect of the change.

@Aireed changed the title on May 14, 2024
  from: [AMORO-2819]Spark cannot execute the "alter table set identifier field" command on tables in Iceberg format.
  to: [AMORO-2819]Spark cannot execute the "alter table set identifier field" command on tables in Iceberg format in unified catalog
Labels: module:mixed-spark (Spark module for Mixed Format), type:build

Linked issue (may be closed by merging this pull request): [Bug][SPARK]: Spark cannot execute the "alter table set identifier field" command on tables in Iceberg format.
2 participants