Extend sys/bitarray API with set-bit counting, xor, nth-bit-set #72901

luqasn · 2024-05-16T20:12:12Z

Hello there!

I am currently working on integrating a memory-optimized forward error correction algorithm for LoRaWAN "natively" in Zephyr instead of relying on the external module that is currently being used.

For that, I need some more bitarray methods than are currently available.
In order to make review more enjoyable, I have added the three additional methods I need in this first PR and will then follow up with a second PR adding the new forward error correction algorithm that uses them.

This is my first Zephyr contribution and I am very much looking forward to your feedback.
Thanks for having a look!

This is part one of several changes to add more methods to the bitarray api so that it can be used for broader usecases, specifically LoRaWAN forward error correction. Signed-off-by: Lucas Romero <luqasn@gmail.com>

github-actions · 2024-05-16T20:12:56Z

Hello @luqasn, and thank you very much for your first pull request to the Zephyr project!
Our Continuous Integration pipeline will execute a series of checks on your Pull Request commit messages and code, and you are expected to address any failures by updating the PR. Please take a look at our commit message guidelines to find out how to format your commit messages, and at our contribution workflow to understand how to update your Pull Request. If you haven't already, please make sure to review the project's Contributor Expectations and update (by amending and force-pushing the commits) your pull request if necessary.
If you are stuck or need help please join us on Discord and ask your question there. Additionally, you can escalate the review when applicable. 😊

cfriedt

Looks like there are still some failing tests

andyross

Seems clean, though sort of weirdly non-orthogonal. The original use case for this code was a synchronized slab-ish allocator. Not sure what xor or popcount are for. But they fit and don't make anything worse, so no reason not to merge.

luqasn · 2024-05-17T05:55:40Z

Thanks for the reviews and sorry for the oversight in the tests.

I added the fix as a new commit as to not have you re-review everything, but please let me know if you want me to fix up the original commit instead.

@andyross I agree that it felt only half-fitting, but still better than having two bitarray things that are slightly different in usage.

luqasn · 2024-05-17T07:44:48Z

I fixed the test that was failing because of a real implementation gap, but now I got a target dependent failure that I don't understand.

Here is my interpretation:

The tests seem to fail only on mobile cpus due to a check when unlocking a spinlock.
It tries to check if the unlocking cpu is the same as the cpu holding the lock.
But while trying to do that it fails on an assert because apparently getting the current cpu is not supported if z_smp_cpu_mobile?
What baffles me is that all other tests pass happily even though they also use spinlocks in the same way.
The only difference I can see is that the method under test for the failure uses two spinlocks instead of one, but that does not seem to be the root of the problem.

Can you help shed some light on this?

andyross · 2024-05-17T13:26:27Z

But while trying to do that it fails on an assert because apparently getting the current cpu is not supported if z_smp_cpu_mobile?

Something got borked. The assert there is to detect common race conditions with the use of the "current CPU" in contexts where the current thread might migrate to another CPU. Basically it wants to see that you're in an ISR or that interrupts are masked locally.

If you're holding a spinlock (because you're trying to unlock one) then interrupts really should be masked. That they aren't implies that something went wrong. Most likely you have a mismatched lock somehow, but if you're sure the locking is correct then something very weird is going on. What platform is this failing on?

luqasn · 2024-05-17T21:01:46Z

Thanks for your help!
It is failing on some platforms but not on others. One that I looked at the GHA run for was Cortex A53.

I think I will start by isolating the problematic commit/test into a new PR to rule out that one of the test that might run before it are the problem.

And then I'll have to stare long and hard at the code to see if my locking is weird somewhere.

Already did that some and my first idea would be to change the order of the double unlock to unlock in reverse order of the locking to see if that helps. Feels more "correct" anyways.

TaiJuWu · 2024-05-18T03:44:34Z

lib/utils/bitarray.c

+	key = k_spin_lock(&bitarray->lock);
+
+	__ASSERT_NO_MSG(bitarray != NULL);
+	__ASSERT_NO_MSG(bitarray->num_bits > 0);


There is a logical issue in my mind.
You should check bitarray is not empty before bitarray->lock.

I know other function is written as you, but I infer it is a mistake.
Maybe we need others to confirm.

There is another reason I think it is a mistake.
Consider follow scenario:
If a thread take a lock and then assert, the lock will be held forever.

Thanks for the review!
Yes, I just copy/pasted the structure from the existing methods.

I see what you mean and agree that it is wrong in theory, but I am not sure how important it is.
I think the process will quit/kernel will panic anyways, either with a SEGFAULT or with an assertion error.
And if that's the case, it doesn't matter that the lock isn't released.

But if the others agree, I can change all locking in this file to be more theoretically correct.

luqasn · 2024-05-18T19:45:12Z

Glad to report that changing the unlock order to be in reverse lock order indeed fixed the issue. 🥳

lib/utils/bitarray.c

cfriedt · 2024-05-20T13:34:51Z

@luqasn - CI is passing \o/

The only remaining thing to do is convert your last two commits to fixups and squash as Andy mentioned.

something like this should do it (please check my commit math though)

git reset HEAD~2
# add just the find-nth changes
git commit --fixup d85be1c
# add just the xor changes
git commit --fixup 52b7daa
git rebase -i --autosquash HEAD~5

This is part one of several changes to add more methods to the bitarray api so that it can be used for broader usecases, specifically LoRaWAN forward error correction. Signed-off-by: Lucas Romero <luqasn@gmail.com>

luqasn · 2024-05-21T08:43:31Z

Thanks for all the feedback and help, I've squashed and rebased the fixes into their respective commits.

Ready to merge from my side, but what do the others think about @TaiJuWu's remarks?
Should I take care of the erroneous "struct access before NULL check" in the whole file or not?
If yes: in this PR or in a new one?

andyross

I'm out of complaints, thanks

cfriedt

Sorry - kind of foolish of me to miss this, but we will definitely need tests for each new function as well.

Ideally they would exercise a meaningful permutation of input parameters for each function, even invalid inputs like NULL, etc, but generally we want to exercise as many code paths as possible for coverage.

Usually the order of commits is important.

feature A
test A
feature B
test B

martinjaeger · 2024-05-21T17:17:41Z

@cfriedt There are tests for each new function, just not added as extra commits, but as part of the commit with the added function itself. I think that's fine.

cfriedt

Ah sorry - now I see the tests. Normally we prefer tests in separate commits, but this should be ok

github-actions · 2024-05-21T20:45:19Z

Hi @luqasn!
Congratulations on getting your very first Zephyr pull request merged 🎉🥳. This is a fantastic achievement, and we're thrilled to have you as part of our community!

To celebrate this milestone and showcase your contribution, we'd love to award you the Zephyr Technical Contributor badge. If you're interested, please claim your badge by filling out this form: Claim Your Zephyr Badge.

Thank you for your valuable input, and we look forward to seeing more of your contributions in the future! 🪁

kartben · 2024-05-24T14:33:29Z

lib/utils/bitarray.c

+		*count += POPCOUNT(bitarray->bundles[bd.sidx] & bd.smask);
+		*count += POPCOUNT(bitarray->bundles[bd.eidx] & bd.emask);


just curious, would this work on all supported toolchains? It looks like the POPCOUNT macro is only defined in toolchain/gcc.h, and not in toolchain/llvm.h, etc.?

lib: bitarray: add method to count bits set in region

46105d5

This is part one of several changes to add more methods to the bitarray api so that it can be used for broader usecases, specifically LoRaWAN forward error correction. Signed-off-by: Lucas Romero <luqasn@gmail.com>

zephyrbot added area: Kernel area: Base OS Base OS Library (lib/os) area: Utilities labels May 16, 2024

zephyrbot requested review from andyross, ceolin, cfriedt, dcpleung, nashif, npitre, peter-mitsis and TaiJuWu May 16, 2024 20:12

zephyrbot assigned andyross May 16, 2024

luqasn force-pushed the bitarray_extend_api branch from 1b2789d to 6e16a70 Compare May 16, 2024 20:14

martinjaeger self-requested a review May 16, 2024 20:23

luqasn force-pushed the bitarray_extend_api branch from 6e16a70 to d85be1c Compare May 16, 2024 20:44

cfriedt approved these changes May 17, 2024

View reviewed changes

cfriedt requested changes May 17, 2024

View reviewed changes

andyross previously approved these changes May 17, 2024

View reviewed changes

luqasn dismissed andyross’s stale review via 27ed739 May 17, 2024 05:53

TaiJuWu reviewed May 18, 2024

View reviewed changes

luqasn mentioned this pull request May 18, 2024

lib: bitarray: add method to xor two bitarrays #72981

Closed

luqasn requested review from cfriedt and andyross May 18, 2024 19:43

andyross reviewed May 19, 2024

View reviewed changes

lib/utils/bitarray.c Show resolved Hide resolved

luqasn added 2 commits May 21, 2024 09:27

lib: bitarray: add method to xor two bitarrays

e802854

This is part one of several changes to add more methods to the bitarray api so that it can be used for broader usecases, specifically LoRaWAN forward error correction. Signed-off-by: Lucas Romero <luqasn@gmail.com>

lib: bitarray: add method to find nth bit set in region

651453a

This is part one of several changes to add more methods to the bitarray api so that it can be used for broader usecases, specifically LoRaWAN forward error correction. Signed-off-by: Lucas Romero <luqasn@gmail.com>

luqasn force-pushed the bitarray_extend_api branch from ac7296c to 651453a Compare May 21, 2024 07:28

andyross approved these changes May 21, 2024

View reviewed changes

cfriedt requested changes May 21, 2024

View reviewed changes

martinjaeger approved these changes May 21, 2024

View reviewed changes

cfriedt approved these changes May 21, 2024

View reviewed changes

nashif merged commit 3f50197 into zephyrproject-rtos:main May 21, 2024
29 checks passed

luqasn mentioned this pull request May 22, 2024

docs: bitarray: fix missing method documentation #73184

Merged

kartben reviewed May 24, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Extend sys/bitarray API with set-bit counting, xor, nth-bit-set #72901

Extend sys/bitarray API with set-bit counting, xor, nth-bit-set #72901

luqasn commented May 16, 2024

github-actions bot commented May 16, 2024

cfriedt left a comment

andyross left a comment

luqasn commented May 17, 2024 •

edited

luqasn commented May 17, 2024

andyross commented May 17, 2024

luqasn commented May 17, 2024 •

edited

TaiJuWu May 18, 2024 •

edited

luqasn May 18, 2024 •

edited

luqasn commented May 18, 2024

cfriedt commented May 20, 2024

luqasn commented May 21, 2024 •

edited

andyross left a comment

cfriedt left a comment

martinjaeger commented May 21, 2024

cfriedt left a comment

github-actions bot commented May 21, 2024

kartben May 24, 2024

		*count += POPCOUNT(bitarray->bundles[bd.sidx] & bd.smask);
		*count += POPCOUNT(bitarray->bundles[bd.eidx] & bd.emask);

Extend sys/bitarray API with set-bit counting, xor, nth-bit-set #72901

Extend sys/bitarray API with set-bit counting, xor, nth-bit-set #72901

Conversation

luqasn commented May 16, 2024

github-actions bot commented May 16, 2024

cfriedt left a comment

Choose a reason for hiding this comment

andyross left a comment

Choose a reason for hiding this comment

luqasn commented May 17, 2024 • edited

luqasn commented May 17, 2024

andyross commented May 17, 2024

luqasn commented May 17, 2024 • edited

TaiJuWu May 18, 2024 • edited

Choose a reason for hiding this comment

luqasn May 18, 2024 • edited

Choose a reason for hiding this comment

luqasn commented May 18, 2024

cfriedt commented May 20, 2024

luqasn commented May 21, 2024 • edited

andyross left a comment

Choose a reason for hiding this comment

cfriedt left a comment

Choose a reason for hiding this comment

martinjaeger commented May 21, 2024

cfriedt left a comment

Choose a reason for hiding this comment

github-actions bot commented May 21, 2024

kartben May 24, 2024

Choose a reason for hiding this comment

luqasn commented May 17, 2024 •

edited

luqasn commented May 17, 2024 •

edited

TaiJuWu May 18, 2024 •

edited

luqasn May 18, 2024 •

edited

luqasn commented May 21, 2024 •

edited