[Learning to rank] Add support for feature variables that do not interact with field data #108404

daixque · 2024-05-08T09:51:47Z

Overview

As of 8.13, the learning to rank functionality of Elasticsearch and Eland only supports feature variables that are associated with field data in an Elasticsearch index.

But sometimes a user may need to train the model with feature values that are provided directly rather than as field data. Elasticsearch and Eland should be able to accept feature values that do not interact with field data.

For example, our notebook shows how to implement a search app for movie data. In that example, all feature values are provided by Elasticsearch, such as BM25 scores and/or script score results. But sometimes users want to train their model with data that comes from outside Elasticsearch. A typical example is user profile data such as age and/or gender, because these values are not tied to each document (in this case, each movie).

Model training with Eland

At the moment, LTRModelConfig only accepts a list of QueryFeatureExtractor objects. A new version of Eland should also accept another kind of extractor that represents a direct feature value not associated with any field data in the index.
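To make the idea concrete, here is a rough sketch of what such a config could look like. It does not use Eland itself: `QueryFeatureExtractor` and `LTRModelConfig` below are simplified local stand-ins, and `ParamFeatureExtractor` (with its `param_name`, `default` and `extract` members) is a hypothetical name for the proposed extractor, not an existing Eland class.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Union


@dataclass
class QueryFeatureExtractor:
    """Simplified local stand-in for a query-based extractor (not Eland's class)."""
    feature_name: str
    query: Dict[str, Any]


@dataclass
class ParamFeatureExtractor:
    """Hypothetical extractor: the feature value comes straight from a request param."""
    feature_name: str
    param_name: str
    default: float = 0.0

    def extract(self, params: Dict[str, Any]) -> float:
        # Read the value directly from the params, falling back to the default.
        return float(params.get(self.param_name, self.default))


@dataclass
class LTRModelConfig:
    """Simplified local stand-in that accepts both kinds of extractor."""
    feature_extractors: List[Union[QueryFeatureExtractor, ParamFeatureExtractor]]

    @property
    def feature_names(self) -> List[str]:
        return [e.feature_name for e in self.feature_extractors]

    def direct_features(self, params: Dict[str, Any]) -> Dict[str, float]:
        # Only the param-based extractors can be resolved without the index.
        return {
            e.feature_name: e.extract(params)
            for e in self.feature_extractors
            if isinstance(e, ParamFeatureExtractor)
        }


config = LTRModelConfig(
    feature_extractors=[
        QueryFeatureExtractor(
            feature_name="title_bm25",
            query={"match": {"title": "{{query_text}}"}},
        ),
        ParamFeatureExtractor(feature_name="user_age", param_name="user_age"),
    ]
)
```

In this sketch the query-based features would still be computed by Elasticsearch, while `direct_features` shows that the param-based ones need nothing from the index.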

Elasticsearch learning to rank query

When an application issues the query, it should pass these feature values directly to Elasticsearch. It may look like rescore.learning_to_rank.params.user_age in the example below:

GET my-index/_search
{
  "query": { 
    "multi_match": {
      "fields": ["title", "content"],
      "query": "the quick brown fox"
    }
  },
  "rescore": {
    "learning_to_rank": {
      "model_id": "ltr-model", 
      "params": { 
        "query_text": "the quick brown fox",
        "user_age": 20
      }
    },
    "window_size": 100 
  }
}
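On the application side, the request body above could be assembled roughly like this. This is only a sketch: `build_ltr_search_body` and its arguments are hypothetical helper names, and passing `user_age` through `params` as a feature value is the behavior proposed in this issue, not something that works today.

```python
from typing import Any, Dict, List


def build_ltr_search_body(
    query_text: str,
    fields: List[str],
    model_id: str,
    feature_params: Dict[str, Any],
    window_size: int = 100,
) -> Dict[str, Any]:
    """Build a search body whose rescorer carries extra feature values in params."""
    # query_text is written last so a caller-supplied key cannot override it.
    params = {**feature_params, "query_text": query_text}
    return {
        "query": {"multi_match": {"fields": fields, "query": query_text}},
        "rescore": {
            "learning_to_rank": {"model_id": model_id, "params": params},
            "window_size": window_size,
        },
    }


body = build_ltr_search_body(
    query_text="the quick brown fox",
    fields=["title", "content"],
    model_id="ltr-model",
    feature_params={"user_age": 20},  # proposed direct feature value
)
```

The resulting `body` has the same shape as the request above and could be sent as the body of a `_search` request.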


elasticsearchmachine · 2024-05-08T19:46:00Z

Pinging @elastic/ml-core (Team:ML)

daixque added the >enhancement and needs:triage (Requires assignment of a team area label) labels on May 8, 2024

kderusso added the :ml (Machine learning) label and removed the needs:triage (Requires assignment of a team area label) label on May 8, 2024

elasticsearchmachine added the Team:ML (Meta label for the ML team) label on May 8, 2024
