[SPARK-48241][SQL] CSV parsing failure with char/varchar type columns #46537
Hi @ulysses-you, could you help review?
- case a: AttributeReference => a
+ case a: AttributeReference =>
+   // Keep the metadata in given schema.
+   a.copy(metadata = field.metadata)(exprId = a.exprId, qualifier = a.qualifier)
a.withMetadata(field.metadata)
lgtm if tests pass, cc @yaooqinn @cloud-fan
good catch!
thanks, merging to master/
it has conflicts with 3.5, can you create a new backport PR?
Create a backport PR in #46565.
What changes were proposed in this pull request?
Selecting from a CSV table that contains char and varchar columns fails with the following error:
Why are the changes needed?
For char and varchar types, Spark will convert them to StringType in CharVarcharUtils.replaceCharVarcharWithStringInSchema and record __CHAR_VARCHAR_TYPE_STRING in the metadata.
The reason for the above error is that the StringType columns in the dataSchema and requiredSchema of UnivocityParser are not consistent. The StringType in the dataSchema has metadata, while the metadata in the requiredSchema is empty. We need to retain the metadata when resolving schema.
Does this PR introduce any user-facing change?
No.
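As a rough illustration of the inconsistency described under "Why are the changes needed?", the sketch below models a schema field with plain Scala case classes. Everything in it is hypothetical: Field, MetadataMismatchSketch, and indices are toy stand-ins and not Spark's StructField, StructType, or UnivocityParser. It also assumes that required fields are located in the data schema by value equality, which is an assumption made for illustration rather than something stated in this PR.

```scala
// Toy stand-in for a schema field; NOT Spark's StructField.
final case class Field(
    name: String,
    dataType: String,
    metadata: Map[String, String] = Map.empty)

object MetadataMismatchSketch {
  // Metadata key named in the PR description.
  val CharVarcharKey = "__CHAR_VARCHAR_TYPE_STRING"

  // The data schema keeps the original varchar type in the field metadata.
  val dataSchema: Seq[Field] = Seq(
    Field("id", "int"),
    Field("name", "string", Map(CharVarcharKey -> "varchar(10)")))

  // Hypothetical "before" state: the resolved field has lost its metadata.
  val requiredWithoutMetadata: Seq[Field] = Seq(Field("name", "string"))

  // Hypothetical "after" state: the metadata is retained during resolution.
  val requiredWithMetadata: Seq[Field] =
    Seq(Field("name", "string", Map(CharVarcharKey -> "varchar(10)")))

  // Assumed lookup strategy: find each required field by equality.
  def indices(required: Seq[Field]): Seq[Int] =
    required.map(f => dataSchema.indexOf(f))

  def main(args: Array[String]): Unit = {
    println(indices(requiredWithoutMetadata)) // prints List(-1)
    println(indices(requiredWithMetadata))    // prints List(1)
  }
}
```

Under that assumption, a field whose metadata was dropped no longer equals its counterpart in the data schema, so the lookup returns -1. Keeping the metadata makes the two schemas agree again.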
How was this patch tested?
Add a new test case in CSVSuite.
Was this patch authored or co-authored using generative AI tooling?
No.