Spell Check on non English language #54

Answered by R1j1t
alvesman asked this question in Q&A

Hi, I think I understand the issue. You passed a language-specific spaCy model (Portuguese, in this case), but the spell checker is still using its default model, the English bert-base-cased. Try searching Hugging Face Models for a Portuguese model and passing that instead, e.g. https://huggingface.co/neuralmind/bert-base-portuguese-cased.

Please see this example for Japanese:

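    import spacy
    import contextualSpellCheck  # importing registers the "contextual spellchecker" factory

    # Load the Japanese spaCy pipeline, then attach the spell checker
    # with a Japanese BERT model instead of the English default.
    nlp = spacy.load("ja_core_news_sm")
    nlp.add_pipe(
        "contextual spellchecker",
        config={
            "model_name": "cl-tohoku/bert-base-japanese-whole-word-masking",
            "max_edit_dist": 2,
        },
    )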
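
The same pattern should carry over to Portuguese. The following is only a sketch under a few assumptions: the spaCy pipeline name `pt_core_news_sm` is a guess at what the question used (it is not stated in this thread), `max_edit_dist` is simply copied from the Japanese example, and the helper names `spellcheck_config` and `build_portuguese_pipeline` are made up for illustration. The BERT model is the one linked above.

```python
# Assumed pairing: a Portuguese spaCy pipeline plus the Portuguese BERT model
# linked in the answer. The spaCy model name is an assumption.
PT_SPACY_MODEL = "pt_core_news_sm"
PT_BERT_MODEL = "neuralmind/bert-base-portuguese-cased"


def spellcheck_config(model_name: str, max_edit_dist: int = 2) -> dict:
    """Build the config dict handed to nlp.add_pipe for the spell checker."""
    return {"model_name": model_name, "max_edit_dist": max_edit_dist}


def build_portuguese_pipeline():
    """Load the spaCy pipeline and attach the spell checker (hypothetical helper)."""
    # Imported lazily so the config helper above works without these packages.
    import spacy
    import contextualSpellCheck  # noqa: F401  (registers the pipe factory)

    nlp = spacy.load(PT_SPACY_MODEL)
    nlp.add_pipe(
        "contextual spellchecker",
        config=spellcheck_config(PT_BERT_MODEL),
    )
    return nlp
```

After `nlp = build_portuguese_pipeline()`, running `doc = nlp(text)` on a Portuguese sentence should let you inspect `doc._.performed_spellCheck` and `doc._.outcome_spellCheck`, provided both models are installed or downloadable.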

I will sti…

Answer selected by alvesman