Slices, exercise 5: why does a Chinese character take 3 bytes? #346

Open
s-tikhomirov opened this issue Dec 26, 2022 · 1 comment

I don't understand the solution to exercise 5 about slices:

fn main() {
    let s = "你好,世界";
    // Modify this line to make the code work
    let slice = &s[0..2];

    assert!(slice == "你");

    println!("Success!");
}

The solution is:

fn main() {
    let s = "你好,世界";
    let slice = &s[0..3];

    assert!(slice == "你");
}

Earlier, a comment in exercise 2 said:

Each of the two chars '中' and '国' occupies 4 bytes, 2 * 4 = 8

so I assumed '你' would also occupy 4 bytes, and the slice would be &s[0..4]. Yet the correct answer is &s[0..3]. Apparently, in the fifth exercise, each Chinese character takes 3 bytes. Why?

Mohammed785 commented Jan 9, 2023

I think this might help you to understand
https://doc.rust-lang.org/book/ch08-02-strings.html#bytes-and-scalar-values-and-grapheme-clusters-oh-my
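
To make the difference concrete, here is a small sketch (not part of the exercise or its official solution). As far as I can tell, the two exercises measure different things: exercise 2 is about values of type `char`, which are always 4 bytes, while exercise 5 slices a `&str`, which is stored as UTF-8, where '你' is encoded in 3 bytes. The helper `first_char_slice` is a hypothetical name introduced only for this illustration.

```rust
use std::mem::size_of;

/// Hypothetical helper: returns the first character of `s` as a string slice,
/// or `None` if `s` is empty. The end of the slice is the UTF-8 length of the
/// first character, so it always lands on a character boundary.
fn first_char_slice(s: &str) -> Option<&str> {
    let c = s.chars().next()?;
    Some(&s[..c.len_utf8()])
}

fn main() {
    // A `char` is a Unicode scalar value and always occupies 4 bytes in memory.
    assert_eq!(size_of::<char>(), 4);

    // Inside a string, the same character is stored as UTF-8 (variable width).
    assert_eq!('你'.len_utf8(), 3);
    assert_eq!('a'.len_utf8(), 1);

    let s = "你好,世界";

    // Byte index 2 falls in the middle of '你', so `&s[0..2]` would panic.
    // `get` returns `None` instead of panicking.
    assert!(!s.is_char_boundary(2));
    assert!(s.get(0..2).is_none());

    // Byte index 3 is where '你' ends, so this slice is valid.
    assert!(s.is_char_boundary(3));
    assert_eq!(&s[0..3], "你");

    // Same result without hard-coding the byte count.
    assert_eq!(first_char_slice(s), Some("你"));

    println!("Success!");
}
```

So `&s[0..3]` is correct because slice ranges on a `&str` are byte offsets into the UTF-8 data, not counts of `char` values.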
