[Learning to rank] Add support for feature variables that do not interact with field data #108404

Open
daixque opened this issue May 8, 2024 · 1 comment
Labels
>enhancement :ml Machine learning Team:ML Meta label for the ML team

daixque commented May 8, 2024

Overview

As of 8.13, the learning to rank functionality of Elasticsearch and Eland only supports feature variables that are associated with field data in an Elasticsearch index.

However, a user may sometimes need to train the model with feature values that are provided directly rather than derived from field data. Elasticsearch and Eland should be able to accept feature values that do not interact with field data.

For example, our notebook shows how to implement a search app for movie data. In that example, all feature values are provided by Elasticsearch, such as a BM25 score or the result of a script score. But sometimes users want to train their model with data that comes from outside Elasticsearch. A typical example is user profile data such as age or gender, because those values are not related to each document (in this case, each movie).

Model training with Eland

At the moment, LTRModelConfig only accepts a list of QueryFeatureExtractor. A new version of Eland should also accept another extractor type that represents a direct feature value not associated with any field data of the index.
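
To make the proposal concrete, here is a rough sketch of how such a config could look. It uses stand-in dataclasses instead of importing Eland, so the class bodies are simplified assumptions. In particular, `ParamFeatureExtractor`, its `param_name` and `default` fields, and the `feature_names` property are hypothetical names invented for illustration and are not part of Eland today.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Union


@dataclass
class QueryFeatureExtractor:
    """Stand-in for the existing extractor: the feature comes from a query on the index."""
    feature_name: str
    query: Dict[str, Any]


@dataclass
class ParamFeatureExtractor:
    """Hypothetical extractor: the feature value is read directly from request params."""
    feature_name: str
    param_name: str
    default: float = 0.0

    def extract(self, params: Dict[str, Any]) -> float:
        # Fall back to the default when the caller did not supply the param.
        return float(params.get(self.param_name, self.default))


@dataclass
class LTRModelConfig:
    """Stand-in config that accepts both kinds of extractor."""
    feature_extractors: List[Union[QueryFeatureExtractor, ParamFeatureExtractor]]

    @property
    def feature_names(self) -> List[str]:
        return [e.feature_name for e in self.feature_extractors]


config = LTRModelConfig(
    feature_extractors=[
        # Feature computed by Elasticsearch from field data.
        QueryFeatureExtractor(
            feature_name="title_bm25",
            query={"match": {"title": "{{query_text}}"}},
        ),
        # Feature supplied by the caller, unrelated to any document field.
        ParamFeatureExtractor(feature_name="user_age", param_name="user_age"),
    ]
)
```

With this shape, the training code could build one feature vector per document by mixing query-derived scores with values taken straight from the request parameters.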

Elasticsearch learning to rank query

When an application issues the query, feature values should be passed directly to Elasticsearch. It may look like rescore.learning_to_rank.params.user_age in the example below:

GET my-index/_search
{
  "query": { 
    "multi_match": {
      "fields": ["title", "content"],
      "query": "the quick brown fox"
    }
  },
  "rescore": {
    "learning_to_rank": {
      "model_id": "ltr-model", 
      "params": { 
        "query_text": "the quick brown fox",
        "user_age": 20
      }
    },
    "window_size": 100 
  }
}
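
On the application side, the request above could be assembled along these lines. This is only a sketch: `build_ltr_search_body` and its arguments are hypothetical helpers, and passing direct feature values such as `user_age` through `params` is the behavior this issue proposes, not something confirmed to work in 8.13.

```python
import json
from typing import Any, Dict


def build_ltr_search_body(
    query_text: str,
    model_id: str,
    direct_features: Dict[str, Any],
    window_size: int = 100,
) -> Dict[str, Any]:
    """Build a search body whose rescorer receives direct feature values as params."""
    params: Dict[str, Any] = {"query_text": query_text}
    # Refuse to silently overwrite params that the query templates already use.
    overlap = params.keys() & direct_features.keys()
    if overlap:
        raise ValueError(f"direct features collide with reserved params: {sorted(overlap)}")
    params.update(direct_features)
    return {
        "query": {
            "multi_match": {
                "fields": ["title", "content"],
                "query": query_text,
            }
        },
        "rescore": {
            "learning_to_rank": {"model_id": model_id, "params": params},
            "window_size": window_size,
        },
    }


body = build_ltr_search_body(
    query_text="the quick brown fox",
    model_id="ltr-model",
    direct_features={"user_age": 20},
)
print(json.dumps(body, indent=2))
```

Keeping the direct features in the same `params` object as `query_text` mirrors the request shown above, so existing templated extractors and the proposed direct values would share one namespace.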
@daixque daixque added >enhancement needs:triage Requires assignment of a team area label labels May 8, 2024
@kderusso kderusso added :ml Machine learning and removed needs:triage Requires assignment of a team area label labels May 8, 2024
@elasticsearchmachine elasticsearchmachine added the Team:ML Meta label for the ML team label May 8, 2024
@elasticsearchmachine

Pinging @elastic/ml-core (Team:ML)
