Equality delete lost after compact data files #10312
Comments
Is there any error log for equality delete?
No error. If I read directly from snapshot id: 3, the result is correct.
I found the code that drops the equality delete files here: iceberg/core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java Lines 813 to 832 in 2b21020
I think the
@CodingJun: Your analysis seems correct to me. We need to take the @RussellSpitzer and @aokolnychyi might know more.
your process is in
Yes, the default setting is true. Can you debug it to check whether the configuration takes effect?
Do you know if this is a bug? @RussellSpitzer @aokolnychyi
Apache Iceberg version
1.5.1
Query engine
Spark
Please describe the bug 🐞
I have a program that continuously writes streaming data to Iceberg, and I regularly use Spark to compact the data files. However, I found that after compacting the data files, some of the data was not deleted correctly. The following example reproduces the problem:
Original table:
Writing process:
Result:
The correct result should be:
PS:
When I set
use-starting-sequence-number = false
for rewriteDataFiles, the data file compaction in Thread 1 fails at t4. Stacktrace:
Question:
Why are the equality delete files lost? Is this expected behavior or a bug?
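For context on the question, here is a minimal sketch of the sequence number rule from the Iceberg table spec: an equality delete applies to a data file only if the data file's data sequence number is strictly less than that of the delete file. This is a toy model, not the actual MergingSnapshotProducer code. The class name, the helper methods, and the sequence numbers 2, 3 and 4 are all hypothetical and chosen only for illustration.

```java
// Sketch only: models the spec's sequence number rule, not Iceberg internals.
final class SequenceNumberSketch {

    // Spec rule: an equality delete applies to a data file only when the data
    // file's data sequence number is strictly less than the delete file's.
    static boolean equalityDeleteApplies(long dataFileSeq, long deleteFileSeq) {
        return dataFileSeq < deleteFileSeq;
    }

    // Hypothetical helper: the data sequence number assigned to a compacted file.
    // With use-starting-sequence-number=true the file keeps the sequence number
    // of the snapshot the compaction started from; otherwise it takes the
    // sequence number of the snapshot produced by the compaction commit.
    static long compactedFileSeq(boolean useStartingSeq, long startingSeq, long commitSeq) {
        return useStartingSeq ? startingSeq : commitSeq;
    }

    public static void main(String[] args) {
        long startingSeq = 2; // assumed: snapshot the compaction planned against
        long deleteSeq = 3;   // assumed: equality delete committed by the streaming writer meanwhile
        long commitSeq = 4;   // assumed: sequence number of the compaction commit

        // Compacted file keeps sequence number 2, so the delete at 3 still applies.
        System.out.println(equalityDeleteApplies(compactedFileSeq(true, startingSeq, commitSeq), deleteSeq));  // true

        // Compacted file would get sequence number 4, so the delete at 3 would no longer apply.
        System.out.println(equalityDeleteApplies(compactedFileSeq(false, startingSeq, commitSeq), deleteSeq)); // false
    }
}
```

Under this model, a concurrent equality delete stays effective against the compacted file only when the compacted file keeps the starting sequence number. That would be consistent with the compaction commit being rejected at t4 when use-starting-sequence-number is false, rather than silently ignoring the delete. It does not by itself explain why the delete file is dropped in the default case.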