Efficient Full Text Search With Vespa: Customizing Ranking

1 Mar 2024

The goal of ranking is to order the documents retrieved by a query. Ranking is configured through rank profiles defined in schemas.

Goal

Explore the customization of ranking in Vespa. By configuring rank profiles, you will learn how to enhance search results.

Prerequisites

Before proceeding, make sure you have completed Vespa Basic Start.

Adding a Custom Rank Profile

To customize ranking, modify the book schema as follows:

search book {
    document book {
      ...
    }

    rank-profile custom inherits default {
        rank-properties {
            query(title): 0.5
            query(tags): 0.25
            query(description): 0.25
        }

        function textMatchScore () {
            expression {
                query(title) * nativeFieldMatch(title) + query(tags) * nativeFieldMatch(tags) + query(description) * nativeFieldMatch(description)
            }
        }

        first-phase {
            expression: textMatchScore
            rank-score-drop-limit: 0.01
        }
    }
}

In this configuration, the weight of searchable fields is specified as follows: title=0.5, tags=0.25, description=0.25.

The function textMatchScore computes a weighted sum: the nativeFieldMatch score of each field multiplied by that field's weight.

Additionally, rank-score-drop-limit is set to 0.01, which drops matching documents whose first-phase score is at or below that value.
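The arithmetic behind this profile can be sketched in plain Python. This is only an illustration of the expression, not Vespa code: the per-field match values are invented numbers (Vespa computes nativeFieldMatch itself), and text_match_score and passes_drop_limit are hypothetical helper names.

```python
# Illustrative sketch of the textMatchScore expression from the rank profile.
# The match values passed in are made up; Vespa computes nativeFieldMatch itself.

DEFAULT_WEIGHTS = {"title": 0.5, "tags": 0.25, "description": 0.25}
RANK_SCORE_DROP_LIMIT = 0.01


def text_match_score(field_match, weights=DEFAULT_WEIGHTS):
    """Weighted sum of per-field match scores; fields without a match count as 0."""
    return sum(weight * field_match.get(field, 0.0) for field, weight in weights.items())


def passes_drop_limit(score, limit=RANK_SCORE_DROP_LIMIT):
    """Assumed semantics: a hit is kept only if its score is above the limit."""
    return score > limit


# The same hypothetical match quality (0.2) in different fields.
title_hit = text_match_score({"title": 0.2})
tags_hit = text_match_score({"tags": 0.2})
# A weak description-only match that ends up under the drop limit.
weak_hit = text_match_score({"description": 0.02})

print(title_hit, tags_hit, weak_hit, passes_drop_limit(weak_hit))
```

With equal match quality, the title hit scores twice as high as the tags hit because its weight is twice as large, and the weak description hit would be filtered out.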

Next, set this rank profile in the query profile book_v1:

<query-profile id="book_v1">
    ...
    <field name="ranking.profile">custom</field>
    ...
</query-profile>

Now compare the responses to two requests, one with search_term=diamond and one with search_term=science:

curl http://localhost:8080/search/ \
--header 'Content-Type: application/json' \
--data '{
"queryProfile": "book_v1",
"search_term": "diamond"
}'

curl http://localhost:8080/search/ \
--header 'Content-Type: application/json' \
--data '{
"queryProfile": "book_v1",
"search_term": "science"
}'

The first request returns a higher 'relevance' value. In the first case the match is in the 'title' field, whereas in the second it is in the 'tags' field, and the rank profile weights title above tags (0.5 > 0.25).

Modifying Field Weights in a Request

Field weights defined as rank properties can be set or overridden directly in requests. For example:

curl --location 'http://localhost:8080/search/' \
--header 'Content-Type: application/json' \
--data '{
    "queryProfile": "book_v1",
    "search_term": "science",
    "ranking.features.query(tags)": 0.5
}'

Notice that the relevance of the matching document is higher than with the default tags weight of 0.25.
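The effect of such an override can be modeled as merging request values over the profile defaults. The sketch below is a plain-Python illustration of that idea, not how Vespa implements it; effective_weights is a hypothetical helper, and the match value 0.2 is an invented number.

```python
import re

DEFAULT_WEIGHTS = {"title": 0.5, "tags": 0.25, "description": 0.25}

# Matches request keys such as "ranking.features.query(tags)".
_OVERRIDE_KEY = re.compile(r"^ranking\.features\.query\((\w+)\)$")


def effective_weights(request, defaults=DEFAULT_WEIGHTS):
    """Return the profile defaults with any per-request overrides applied."""
    weights = dict(defaults)
    for key, value in request.items():
        match = _OVERRIDE_KEY.match(key)
        if match:
            weights[match.group(1)] = float(value)
    return weights


request = {
    "queryProfile": "book_v1",
    "search_term": "science",
    "ranking.features.query(tags)": 0.5,
}
weights = effective_weights(request)

# With a hypothetical tags match of 0.2, compare the tags contribution.
before = DEFAULT_WEIGHTS["tags"] * 0.2
after = weights["tags"] * 0.2
print(weights, before, after)
```

Only the tags weight changes; title and description keep their defaults, so the contribution of a tags match doubles.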

This can be useful when a particular request needs a specific context. For instance, to keep only matches in the 'title' field, set the weights of the other fields to 0 so that documents matching only those fields score below the rank-score-drop-limit:

curl --location 'http://localhost:8080/search/' \
--header 'Content-Type: application/json' \
--data '{
    "queryProfile": "book_v1",
    "search_term": "science",
    "ranking.features.query(title)": 1,
    "ranking.features.query(tags)": 0,
    "ranking.features.query(description)": 0
}'
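When request bodies like these are written by hand, trailing commas and duplicated keys are easy to introduce. One option is to generate the body programmatically. The sketch below uses only the Python standard library; build_search_body is a hypothetical helper, and the parameter names simply mirror the requests in this article.

```python
import json


def build_search_body(search_term, weights=None, query_profile="book_v1"):
    """Serialize a search request body, adding one override per weighted field."""
    body = {"queryProfile": query_profile, "search_term": search_term}
    for field, weight in (weights or {}).items():
        body[f"ranking.features.query({field})"] = weight
    return json.dumps(body, indent=4)


# Title-only ranking: every other field weight is set to 0.
title_only = build_search_body("science", {"title": 1, "tags": 0, "description": 0})
print(title_only)
```

The printed JSON can be passed to curl via --data or sent with any HTTP client.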

Summary

In these steps, we added a custom rank profile with per-field weights, enabled it through the query profile, and overrode the weights in individual requests.

Next Steps

Explore N-Gram Search to enhance search results further.