[Bug]: "Unique rows (HashSet)" has a bug and drops records #3908

Open
gertwieland opened this issue May 4, 2024 · 3 comments
Labels: bug, Hop Gui, P1 Critical Issue

@gertwieland

Apache Hop version?

2.8

Java version?

openjdk version "11.0.21" 2023-10-17

Operating system

Windows

What happened?

"Unique rows (HashSet)" seems to drop records even if they only appear once.
Steps to reproduce the error:

Generate 60k records, then add a sequence and one column with random fake data.

Then calculate a SHA256 checksum over it. Since it includes the sequence number from 1 - 60k, those checksums must be all unique.

But still, the "Unique rows (HashSet)" seems to consider one row a duplicate, and only returns 59,999 records.
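
To make the expectation in the steps above concrete, here is a minimal standalone sketch outside of Hop. It is not the attached pipeline: the class name, the fixed seed, and the `seq;random` row layout are made up for illustration. It hashes 60,000 rows that each contain a distinct sequence number with SHA-256 and counts the distinct checksums, which should come out at 60,000.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

public class UniqueChecksumExpectation {

    // Hypothetical stand-in for the pipeline: sequence + random value -> SHA-256 -> distinct count.
    static int countDistinctChecksums(int rows) {
        MessageDigest sha256;
        try {
            sha256 = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
        Random random = new Random(42); // fixed seed, only to keep the sketch repeatable
        Set<String> seen = new HashSet<>();
        for (int seq = 1; seq <= rows; seq++) {
            // The sequence number makes every input string different.
            String row = seq + ";" + random.nextInt(1000);
            byte[] digest = sha256.digest(row.getBytes(StandardCharsets.UTF_8));
            // Hex text of the full 256-bit digest (leading zeros are dropped, the value stays unique).
            seen.add(new BigInteger(1, digest).toString(16));
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(countDistinctChecksums(60_000));
    }
}
```

Because the full checksum strings are stored and compared here, no row is lost. Any de-duplication that returns fewer than 60,000 rows for this input is treating two different checksums as equal.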

Test pipeline attached
Unique_Hash_Faulty.zip


Issue Priority

Priority: 3

Issue Component

Component: Hop Gui

@DAJGIT

DAJGIT commented May 5, 2024

I could reproduce this after several runs.
While trying to catch the record that is flagged as a duplicate, I found this option: "Compare using stored row values".
Enabling it seems to solve this case.
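
One possible explanation, offered as an assumption and not as a description of Hop's actual code: if the transform by default remembers only a hash of the key fields, two different rows whose hashes collide look like duplicates, while storing the real values avoids that. The sketch below illustrates this general effect with plain Java collections. The class and method names are hypothetical, and `"Aa"` and `"BB"` are used only because they are a well-known pair of Java strings with the same `hashCode()`.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class HashOnlyVersusStoredValues {

    // Remembers only the int hash of each key: distinct keys with equal hashes are dropped.
    static int uniqueByHashOnly(List<String> keys) {
        Set<Integer> seenHashes = new HashSet<>();
        int passed = 0;
        for (String key : keys) {
            if (seenHashes.add(key.hashCode())) {
                passed++;
            }
        }
        return passed;
    }

    // Remembers the key itself: equality is checked on the real value, so collisions are harmless.
    static int uniqueByStoredValues(List<String> keys) {
        Set<String> seenValues = new HashSet<>();
        int passed = 0;
        for (String key : keys) {
            if (seenValues.add(key)) {
                passed++;
            }
        }
        return passed;
    }

    public static void main(String[] args) {
        // "Aa" and "BB" are different strings that share the hash code 2112.
        List<String> keys = Arrays.asList("Aa", "BB", "Aa");
        System.out.println(uniqueByHashOnly(keys));     // prints 1: "BB" is wrongly treated as a duplicate
        System.out.println(uniqueByStoredValues(keys)); // prints 2: only the repeated "Aa" is removed
    }
}
```

If the comparison really were based on a uniformly distributed 32-bit hash, a birthday-style estimate for 60,000 rows gives roughly a one-in-three chance of at least one collision per run, which would fit the observation that the problem only shows up after several runs. Storing the row values costs more memory but removes that risk.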

@DAJGIT

DAJGIT commented May 6, 2024

I kept digging for a reproduction path, and here it is:

Unique_Hash_Faulty_Sample.zip

@hansva hansva added P1 Critical Issue and removed P3 Nice to have labels May 6, 2024
@hansva
Contributor

hansva commented May 6, 2024

.take-issue

@github-actions github-actions bot added this to the 2.9 milestone May 6, 2024
@hansva hansva modified the milestones: 2.9, 2.10 May 20, 2024