[stdlib] add b64decode #2364

Closed
wants to merge 1 commit into from

mikowals
Contributor

Followed the decode algorithm from the same paper used for b64encode.

The Llama3 tokenizer.model stores the tokens with base64 encoding so demand for this may increase. 😃
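For readers unfamiliar with what a base64 decoder has to do, here is a minimal scalar sketch in Python. It is not the algorithm from the paper referenced in this PR and not the Mojo stdlib code; it assumes the standard RFC 4648 alphabet and padded input, and the name `b64decode_sketch` is hypothetical.

```python
# Standard base64 alphabet (RFC 4648); '=' is the padding character.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
DECODE_TABLE = {ch: i for i, ch in enumerate(ALPHABET)}


def b64decode_sketch(text: str) -> bytes:
    """Decode padded, standard-alphabet base64 four characters at a time.

    Hypothetical helper for illustration only. Characters outside the
    alphabet raise KeyError; validation is deliberately minimal.
    """
    if len(text) % 4 != 0:
        raise ValueError("input length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(text), 4):
        chunk = text[i:i + 4]
        pad = chunk.count("=")
        if pad > 2:
            raise ValueError("at most two padding characters are allowed")
        # Map each character to its 6-bit value; padding contributes zero bits.
        values = [DECODE_TABLE[c] for c in chunk.rstrip("=")] + [0] * pad
        # Pack four 6-bit values into one 24-bit word.
        word = (values[0] << 18) | (values[1] << 12) | (values[2] << 6) | values[3]
        # Emit three bytes, dropping one byte per padding character.
        out += word.to_bytes(3, "big")[:3 - pad]
    return bytes(out)
```

Each group of four characters carries 24 bits, which become three output bytes; one or two `=` characters at the end mean the last group carries only two or one bytes.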

@mikowals requested a review from a team as a code owner on April 21, 2024, 16:03
@mikowals force-pushed the base64-decode branch 4 times, most recently from 5e7b1c5 to 5ceb47c, on April 24, 2024, 01:19
Signed-off-by: Michael Kowalski <1331470+mikowals@users.noreply.github.com>
Collaborator

@JoeLoser JoeLoser left a comment

Awesome, thank you!

from testing import assert_equal


def test_b64encode():
    print("== test_b64encode")
Collaborator

Suggestion: drop the print here and in the decode function below. The print(...) is only needed in FileCheck-style tests, where it gives the tool a line of output to match its patterns against.

Collaborator

I went ahead and removed these prints when I imported the PR internally since it was trivial to change, so don't feel the need to do anything. 😄

@JoeLoser added the imported-internally (signals that a given pull request has been imported internally) and merged-internally (indicates that this pull request has been merged internally) labels on Apr 27, 2024
@JoeLoser
Collaborator

✅🟣 This contribution has been merged 🟣✅

Hey @mikowals,

Thanks so much for the contribution! 🎉

We're moving to a new infrastructure for merging contributions to Mojo (we're using a tool called Copybara), and your contribution has now been merged into our internal copy of the Mojo Standard Library. I've added the "merged-internally" label on this PR.

The changes in this PR will appear here in the mojo repo nightly branch when we do our next outbound synchronization at the time that the next Mojo nightly is released. That should happen on Monday (tomorrow).

Please let me know if you have any questions or concerns.

JoeLoser pushed a commit to JoeLoser/mojo that referenced this pull request Apr 30, 2024
[External] [stdlib] add b64decode

Followed the decode algorithm from the same paper used for `b64encode`.

The Llama3 `tokenizer.model` stores the tokens with base64 encoding so
demand for this may increase. 😃

ORIGINAL_AUTHOR=Michael Kowalski
<1331470+mikowals@users.noreply.github.com>
PUBLIC_PR_LINK=modularml#2364

---------

Co-authored-by: Michael Kowalski <1331470+mikowals@users.noreply.github.com>
Closes modularml#2364
MODULAR_ORIG_COMMIT_REV_ID: de91cca69272570a52fcbf28a5c51c8d7fe75364
JoeLoser pushed a commit that referenced this pull request Apr 30, 2024
[External] [stdlib] add b64decode
@JoeLoser
Collaborator

Closing as this got merged into the latest nightly during our outbound sync today (4/29/24) - see decdd0c.

@JoeLoser JoeLoser closed this Apr 30, 2024
StandinKP pushed a commit to StandinKP/mojo that referenced this pull request Apr 30, 2024
[External] [stdlib] add b64decode
@JoeLoser added the merged-externally label (merged externally in public mojo repo) on May 3, 2024
@mikowals mikowals deleted the base64-decode branch May 7, 2024 20:53