Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some Confusion About ADWIN #1539

Open
9sea opened this issue May 6, 2024 Discussed in #1535 · 3 comments
Open

Some Confusion About ADWIN #1539

9sea opened this issue May 6, 2024 Discussed in #1535 · 3 comments

Comments

@9sea
Copy link

9sea commented May 6, 2024

Discussed in #1535

Originally posted by 9sea April 26, 2024
The code is in line 244 of river.drift.adwin_c.pyx.
I found that the code at line 244 of the file adwin_c.pyx, which reads "for k in range(bucket.current_idx - 1)", differs from the realization in MOA. I believe it should be "for k in range(bucket.current_idx)". Of course, this is a minor difference, but it might also affect the conditional statement at line 272, causing that condition to always be false.
I would appreciate it if someone could clarify this for me. Thank you!

@smastelini
Copy link
Member

Hi @9sea, thanks for reporting that.

Are you referring to this line, in MOA?

https://github.com/Waikato/moa/blob/5474e181c778f8a8cd8a549fa83c08975ff99dde/moa/src/main/java/moa/classifiers/core/driftdetection/ADWIN.java#L504

The loop in MOA uses the <= operator while River's version implicitly uses <. Although I did not do the original port from Java to Python and later Cython, I think you might have a point.

Have you tried to run some benchmarks to assess the implications of such a change?

@9sea
Copy link
Author

9sea commented May 6, 2024

Hi @smastelini , Yes, it is line 504 of the file. If the '<' is used, it will not iterate to the last bucket, which is different from the logic of the algorithm in the original paper. I have not run any benchmarks yet, and I suspect that there might have been a mistake when converting the code from Java to Python, possibly mistakenly writing '<' instead of '<='.

@smastelini
Copy link
Member

It would be nice to run a few simple benchmarks to measure the impact of such a change and then create a PR with the needed changes. @9sea, are you willing to take care of that? I have a few things at the top of my priority list at the moment, so I can't check it right now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants