The h2o.findSynonyms failed if the 'word' parameter is unknown for the word2vec model #16192

Open
dmresearch15 opened this issue May 6, 2024 · 3 comments

@dmresearch15

I received the following error when running print(h2o.findSynonyms(w2v_model, "National", count = 5)):
Error in eval(substitute(expr), data, enclos = parent.frame()) :
object 'score' not found
I'm curious why 'score' is missing.

In contrast, print(h2o.findSynonyms(w2v_model, "national", count = 5)) returns the scores as expected.

@maurever
Contributor

maurever commented May 9, 2024

Hi @dmresearch15. Thanks for reporting this issue.

It looks like there is a bug: for an unseen word, we raise an error instead of returning a result.

We definitely need to fix it.

@maurever maurever added the bug label May 9, 2024
@maurever maurever changed the title from "Seeking clarification while utilising h2o.findSynonyms" to "The h2o.findSynonyms failed if the 'word' parameter is unknown for the word2vec model" May 9, 2024
@maurever
Contributor

maurever commented May 9, 2024

I reproduced the error with this code:

job_titles <- h2o.importFile("https://s3.amazonaws.com/h2o-public-test-data/smalldata/craigslistJobTitles.csv",  col.names = c("category", "jobtitle"), col.types = c("String", "String"), header = TRUE)

words <- h2o.tokenize(job_titles, " ")
vec <- h2o.word2vec(training_frame = words)

# pass: "teacher" is in the model's vocabulary
syn <- h2o.findSynonyms(vec, "teacher", count = 20)
print(syn)

# fail: "Tteacher" is not in the model's vocabulary
syn2 <- h2o.findSynonyms(vec, "Tteacher", count = 20)
print(syn2)
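
Until a fix lands, one possible stopgap is to wrap the call so that an unseen word yields an empty result instead of an error. This is only a sketch: `safe_find_synonyms` and its `finder` argument are hypothetical names, not part of the h2o package, and the empty frame's `synonym` and `score` columns are an assumption about the shape of a successful result.

```r
# Hypothetical helper, not part of the h2o package.
# `finder` defaults to h2o.findSynonyms; it is exposed as an argument only so
# the wrapper can be exercised without a running H2O cluster.
safe_find_synonyms <- function(model, word, count = 20,
                               finder = h2o::h2o.findSynonyms) {
  tryCatch(
    finder(model, word, count = count),
    error = function(e) {
      # Report the underlying error, then fall back to an empty result.
      warning(sprintf("No synonyms returned for '%s': %s",
                      word, conditionMessage(e)))
      # Assumed column names for a normal result: synonym, score.
      data.frame(synonym = character(0), score = numeric(0))
    }
  )
}
```

With the reproduction above, `safe_find_synonyms(vec, "Tteacher", count = 20)` would be expected to return a zero-row data frame with a warning. Note that this catches every error from the call, including unrelated ones such as a lost cluster connection, so the warning text is worth checking.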

@dmresearch15
Author

I'm currently incorporating this into my project, so it would be helpful to have a timeframe for resolving this issue.

@valenad1 valenad1 added this to the 3.46.0.3 milestone May 13, 2024

3 participants