Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[stdlib] String comparisons implemented #2409

Closed
wants to merge 13 commits into from

Conversation

siitron
Copy link

@siitron siitron commented Apr 25, 2024

For issue #2346 (as an alternative to #2378). All four comparisons (__lt__, __le__, __gt__, & __ge__) uses a single __lt__ comparison (instead of checking less/greater than + potentially another "equals to"-check, for __le__ & __ge__). Sorry if this is considered a duplicate PR, I only meant to give an alternative suggestion. This is my first ever PR on GitHub.

StringLiterals also get comparisons.

@siitron siitron requested a review from a team as a code owner April 25, 2024 17:23
@siitron siitron closed this Apr 25, 2024
@siitron siitron deleted the nightly branch April 25, 2024 17:36
@siitron siitron restored the nightly branch April 25, 2024 17:57
@VMois
Copy link

VMois commented Apr 27, 2024

Are there any reasons to close a PR? This looks like quite a nice implementation.

@siitron siitron reopened this Apr 28, 2024
@siitron
Copy link
Author

siitron commented Apr 28, 2024

@VMois Thank you! I closed this PR before because I couldn't pass the tests (because I didn't know that the math package wasn't supported to use because of cyclic dependencies). But now I've replaced min with if/else-statements and that nicely resulted in fewer comparisons.

I can't complete ./stdlib/scripts/run-tests.sh locally due to some issues I have with WSL, but the current version should be fine now.

… added tests.

Signed-off-by: Simon Hellsten <simonhellsten@hotmail.com>
Signed-off-by: Simon Hellsten <simonhellsten@hotmail.com>
…n_operators() for String.

Signed-off-by: Simon Hellsten <simonhellsten@hotmail.com>
Signed-off-by: Simon Hellsten <simonhellsten@hotmail.com>
…al (+tests)

Signed-off-by: Simon Hellsten <simonhellsten@hotmail.com>
siitron and others added 2 commits April 29, 2024 21:32
…g_literal.mojo with if/else-statements. Also fixed the string comparison tests to make them look better when formatted correctly.

Signed-off-by: Simon Hellsten <simonhellsten@hotmail.com>
var len1 = len(self)
var len2 = len(rhs)

if len1 < len2:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: you can consolidate the duplication between this and the implementation in String by implementing the comparison on StringRef

Copy link
Author

@siitron siitron Apr 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean something like return StringRef(self) < StringRef(rhs) as __lt__ for StringLiteral (and return self._strref_dangerous() < rhs._strref_dangerous() for String), etc.?

I didn't even consider StringRef (forgot it existed), and I tried to match __lt__ with their respective __eq__-methods. Right now String, StringLiteral, and StringRef does __eq__ in three separate ways, so I thought there was some reason for keeping them apart ATM(?). Would it still be safe to use memory.memcmp in the StringRef __lt__ implementation (considering it currently looks like StringLiteral has to use it's own local _memcmp, and StringRef's __eq__ also avoids it), or should StringRef's __lt__ be done like its __eq__?

What do you think of:

    @always_inline("nodebug")
    fn __lt__(self, rhs: StringRef) -> Bool:
        """Compare this StringRef to the RHS using LT comparison.

        Args:
            rhs: The other StringRef to compare against.

        Returns:
            True if this StringRef is strictly less than the RHS StringRef and False otherwise.
        """
        var len1 = len(self)
        var len2 = len(rhs)

        if len1 < len2:
            return memcmp(self.data, rhs.data, len1) <= 0
        else:
            return memcmp(self.data, rhs.data, len2) < 0

@JoeLoser JoeLoser requested review from abduld and a team and removed request for gabrieldemarmiesse May 5, 2024 00:37
@ematejska ematejska added the mojo-repo Tag all issues with this label label May 6, 2024
…iteral's __lt__-methods. Also updated the tests to use the dunder methods directly (instead of their operators).

Signed-off-by: Simon Hellsten <simonhellsten@hotmail.com>
Copy link
Contributor

@laszlokindrat laszlokindrat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, thank you! Do you think you could do a followup where you move the core logic to StringRef and delegate to that from both String and StringLiteral?

@modularbot
Copy link
Collaborator

✅🟣 This contribution has been merged 🟣✅

Your pull request has been merged to the internal upstream Mojo sources. It will be reflected here in the Mojo repository on the nightly branch during the next Mojo nightly release, typically within the next 24-48 hours.

We use Copybara to merge external contributions, click here to learn more.

@modularbot modularbot added the merged-internally Indicates that this pull request has been merged internally label May 10, 2024
@siitron
Copy link
Author

siitron commented May 10, 2024

This is great, thank you! Do you think you could do a followup where you move the core logic to StringRef and delegate to that from both String and StringLiteral?

Sure np, I'll create a new PR for the StringRef stuff!

Some thoughts (that I'll probably repeat in the PR):

@laszlokindrat Does this mean that StringLiteral no longer needs its private _memcmp-function (assuming __eq__ and __ne__ are also delegated to StringRef), i.e. StringLiteral (and StringRef) are ok to use memory.memcmp now? If yes, then I can make the 18 comparison methods all just use __eq__ and __lt__ in StringRef. I'm asking since StringLiteral's private _memcmp function has this comment:

# Use a local memcmp rather than memory.memcpy to avoid #31139 and #25100.

(I can't access "#31139 and #25100", I'm guessing its some internal stuff).

Also, for String to be delegated to StringRef I'd have to use String's _strref_dangerous-method which I was a bit hesitant to do

JoeLoser pushed a commit that referenced this pull request May 11, 2024
[External] [stdlib] String comparisons implemented

For issue #2346 (as an alternative to #2378). All four comparisons
(`__lt__`, `__le__`, `__gt__`, & `__ge__`) uses a single `__lt__`
comparison (instead of checking less/greater than + potentially another
"equals to"-check, for `__le__` & `__ge__`). Sorry if this is considered
a duplicate PR, I only meant to give an alternative suggestion. This is
my first ever PR on GitHub.

StringLiterals also get comparisons.

ORIGINAL_AUTHOR=Simon Hellsten
<56205346+siitron@users.noreply.github.com>
PUBLIC_PR_LINK=#2409

---------

Co-authored-by: Simon Hellsten <56205346+siitron@users.noreply.github.com>
Closes #2409
MODULAR_ORIG_COMMIT_REV_ID: b2ed4756c2741fd27387fa295515f4a7222e0ca5
@JoeLoser
Copy link
Collaborator

Landed in today's nightly: #2615. Thanks for the contribution!

@JoeLoser JoeLoser closed this May 11, 2024
@JoeLoser JoeLoser added merged-externally Merged externally in public mojo repo imported-internally Signals that a given pull request has been imported internally. labels May 11, 2024
@laszlokindrat
Copy link
Contributor

@laszlokindrat Does this mean that StringLiteral no longer needs its private _memcmp-function (assuming __eq__ and __ne__ are also delegated to StringRef), i.e. StringLiteral (and StringRef) are ok to use memory.memcmp now? If yes, then I can make the 18 comparison methods all just use __eq__ and __lt__ in StringRef. I'm asking since StringLiteral's private _memcmp function has this comment:

Use a local memcmp rather than memory.memcpy to avoid #31139 and #25100.

(I can't access "#31139 and #25100", I'm guessing its some internal stuff).

Good question. If you don't mind, would you just give it a go, and let's see what breaks? If things don't work, I will take a look at those tickets.

Also, for String to be delegated to StringRef I'd have to use String's _strref_dangerous-method which I was a bit hesitant to do

I think this should be okay, since the lifetimes of self should be guaranteed to outlive method calls, so you can take a dangerous reference in these method implementations.

@siitron
Copy link
Author

siitron commented May 13, 2024

If you don't mind, would you just give it a go, and let's see what breaks? If things don't work, I will take a look at those tickets.

@laszlokindrat You can take a look at #2620 if you'd like :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
imported-internally Signals that a given pull request has been imported internally. merged-externally Merged externally in public mojo repo merged-internally Indicates that this pull request has been merged internally mojo-repo Tag all issues with this label
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants