[BUG] [FLINK] Delta log not updated with the parquet files created after the last checkpoint before Flink job completed #3057
Comments
Hi @galadrielwithlaptop - is there a simple coding example you can create that can reproduce this issue?
You can keep only a few data points (say 100) and a checkpoint interval of, say, 5 seconds. I am pretty sure the TMs are freed before DeltaGlobalCommitter can complete.
TM logs:

```
2024-05-09 15:46:34.259 [] Source: delta-source -> Sink: Writer -> Sink: Committer (1/1)#0 INFO flink flink.sink.internal.committer.DeltaCommitter 103 Committing delta committable locally: appId=2ee0131e-257c-4466-8523-3b3df0333e88 checkpointId=13 deltaPendingFile=DeltaPendingFile(fileName=part-b7a731e2-610c-4035-b35d-ee5052c3cae7-12.snappy.parquet lastUpdateTime=1715269592562 fileSize=147467 recordCount=115818 partitionSpec={})
```
Bug
Which Delta project/connector is this regarding?
Describe the problem
I have a Delta Flink program similar to this one: https://learn.microsoft.com/en-us/azure/hdinsight-aks/flink/use-flink-delta-connector . I enabled checkpointing with a 1 s interval; the job takes around 10 checkpoints, and the _delta_log gets updated 10 times. But after the 10th checkpoint, one more parquet file gets created and I can see one more checkpoint is invoked. Then the job finished, but somehow the Delta log was not updated with that file. Please check the logs.
Is there some config I am missing?
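For reference, the setup described above can be sketched roughly as follows. This is a minimal, hedged repro sketch, not the reporter's actual program: the table path, row type, and bounded source are placeholder assumptions; only the 1 s checkpoint interval and the DeltaSink usage come from the report and the linked tutorial.

```java
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.data.GenericRowData;
import org.apache.flink.table.data.RowData;
import org.apache.flink.table.types.logical.IntType;
import org.apache.flink.table.types.logical.RowType;
import org.apache.hadoop.conf.Configuration;

import io.delta.flink.sink.DeltaSink;

import java.util.Collections;

public class DeltaSinkRepro {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // 1 s checkpoint interval, as in the report. Each completed checkpoint
        // should produce a Delta commit via DeltaGlobalCommitter.
        env.enableCheckpointing(1000);

        RowType rowType = new RowType(Collections.singletonList(
                new RowType.RowField("value", new IntType())));

        // Placeholder path; the report uses ABFS storage (abfss://...).
        DeltaSink<RowData> deltaSink = DeltaSink
                .forRowData(new Path("file:///tmp/delta-table"), new Configuration(), rowType)
                .build();

        // A small bounded source (~100 records, as suggested in the comment
        // above) that finishes shortly after a checkpoint. The reported bug is
        // that the parquet file written between the last checkpoint and job
        // completion never makes it into _delta_log.
        env.fromSequence(0, 99)
                .map(new MapFunction<Long, RowData>() {
                    @Override
                    public RowData map(Long i) {
                        return GenericRowData.of(i.intValue());
                    }
                })
                .sinkTo(deltaSink);

        env.execute("delta-sink-repro");
    }
}
```

If the symptom matches, the parquet files for the final records appear in the table directory but the corresponding commit JSON is missing from `_delta_log` after the job reaches FINISHED.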
ABFS storage