Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-48162][SQL] Add collation support for MISC expressions #46461

Closed
wants to merge 6 commits into from

Conversation

uros-db
Copy link
Contributor

@uros-db uros-db commented May 8, 2024

What changes were proposed in this pull request?

Introduce collation awareness for misc expressions: raise_error, uuid, version, typeof, aes_encrypt, aes_decrypt.

Why are the changes needed?

Add collation support for misc expressions in Spark.

Does this PR introduce any user-facing change?

Yes, users should now be able to use collated strings within arguments for misc functions: raise_error, uuid, version, typeof, aes_encrypt, aes_decrypt.

How was this patch tested?

E2e sql tests.

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the SQL label May 8, 2024
@uros-db uros-db changed the title [WIP][SQL] Misc expressions [WIP][SQL] Add collation support for MISC expressions May 8, 2024
@uros-db uros-db changed the title [WIP][SQL] Add collation support for MISC expressions [SPARK-48162][SQL] Add collation support for MISC expressions May 8, 2024
@dongjoon-hyun
Copy link
Member

Could you take a look at the UT failures?

[info] *** 14 TESTS FAILED ***
[error] Failed tests:
[error] 	org.apache.spark.sql.connect.planner.SparkConnectServiceSuite
[error] 	org.apache.spark.sql.connect.ProtoToParsedPlanTestSuite

@@ -84,7 +85,7 @@ case class RaiseError(errorClass: Expression, errorParms: Expression, dataType:
override def foldable: Boolean = false
override def nullable: Boolean = true
override def inputTypes: Seq[AbstractDataType] =
Seq(StringType, MapType(StringType, StringType))
Seq(StringTypeAnyCollation, MapType(StringType, StringType))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note here: this particular expression cannot be used with the second parameter (MapType(StringType, StringType)), as it specifically doesn't allow passing that expression when constructing RaiseError

as per the current Spark expression API, user can only specify the first parameter, while the second one is only reserved for internal Spark use (where it's always passed as Literal with StringType(0))

hence, there's no need to use something like AbstractMapType(StringTypeAnyCollation, StringTypeAnyCollation) here

@cloud-fan
Copy link
Contributor

thanks, merging to master!

@cloud-fan cloud-fan closed this in 7233540 May 15, 2024
cloud-fan pushed a commit that referenced this pull request May 22, 2024
…r parameter map to work with collated strings

### What changes were proposed in this pull request?
Following up on the introduction of AbstractMapType (#46458) and changes that introduce collation awareness for RaiseError expression (#46461), this PR should add the appropriate type casting rules for AbstractMapType.

### Why are the changes needed?
Fix the CI failure for the `Support RaiseError misc expression with collation` test when ANSI is off.

### Does this PR introduce _any_ user-facing change?
Yes, type casting is now allowed for map types with collated strings.

### How was this patch tested?
Extended suite `CollationSQLExpressionsANSIOffSuite` with ANSI disabled.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #46661 from uros-db/fix-abstract-map.

Authored-by: Uros Bojanic <157381213+uros-db@users.noreply.github.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
a0x8o added a commit to a0x8o/spark that referenced this pull request May 22, 2024
…r parameter map to work with collated strings

### What changes were proposed in this pull request?
Following up on the introduction of AbstractMapType (apache/spark#46458) and changes that introduce collation awareness for RaiseError expression (apache/spark#46461), this PR should add the appropriate type casting rules for AbstractMapType.

### Why are the changes needed?
Fix the CI failure for the `Support RaiseError misc expression with collation` test when ANSI is off.

### Does this PR introduce _any_ user-facing change?
Yes, type casting is now allowed for map types with collated strings.

### How was this patch tested?
Extended suite `CollationSQLExpressionsANSIOffSuite` with ANSI disabled.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #46661 from uros-db/fix-abstract-map.

Authored-by: Uros Bojanic <157381213+uros-db@users.noreply.github.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants