Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Spark] ALTER TABLE ALTER COLUMN SYNC IDENTITY SQL support #3005

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

c27kwan
Copy link
Contributor

@c27kwan c27kwan commented May 1, 2024

Which Delta project/connector is this regarding?

  • Spark
  • Standalone
  • Flink
  • Kernel
  • Other (fill in here)

Description

This PR is part of #1959
In this PR, we add SQL support for ALTER TABLE ALTER COLUMN SYNC IDENTITY.

This is used for GENERATED BY DEFAULT Identity Columns, where a user may want to manually update the identity column high watermark.

How was this patch tested?

This PR adds a new test suite IdentityColumnSyncSuite.

Does this PR introduce any user-facing changes?

Yes. We introduce the SQL syntax ALTER TABLE (ALTER| CHANGE) COLUMN? <colName> SYNC IDENTITY into Delta. This will update the high watermark stored in the metadata for that specific identity column.
Example Usage

ALTER TABLE ALTER COLUMN id SYNC IDENTITY
ALTER TABLE CHANGE COLUMN id SYNC IDENTITY
ALTER TABLE ALTER id SYNC IDENTITY
ALTER TABLE CHANGE id SYNC IDENTITY

@@ -2461,6 +2461,11 @@ trait DeltaErrorsBase
)
}

def identityColumnAlterNonIdentityColumnError(): Throwable = {
new AnalysisException(
"ALTER TABLE ALTER COLUMN SYNC IDENTITY cannot be called on non IDENTITY columns")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You cannot use new AnalysisException. This doesn't compile against Spark Master.

Please use new DeltaAnalysisException and use a proper error class

withSimpleGeneratedByDefaultTable(start, step) {
// Test empty table.
val oldSchema = DeltaLog.forTable(spark, TableIdentifier(tblName)).snapshot.schema
sql(s"ALTER TABLE $tblName ALTER COLUMN id SYNC IDENTITY")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you test the other variants of the SQL syntax? Right now you only test ALTER TABLE ALTER COLUMN SYNC IDENTITY, but we should validate the SQL syntax for ALTER TABLE CHANGE COLUMN SYNC IDENTITY, etc.

test("alter table sync identity overflow") {
withSimpleGeneratedByDefaultTable(startsWith = 1L, incrementBy = 10L) {
sql(s"INSERT INTO $tblName VALUES (${Long.MaxValue}, 'a')")
intercept[ArithmeticException](sql(s"ALTER TABLE $tblName ALTER COLUMN id SYNC IDENTITY"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a test that we can't call SYNC IDENTITY for non-Delta sources?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants