Efficient Full Text Search With Vespa: Adding Search Rules
8 Mar 2024
Vespa rules are used for replacing or filtering out insignificant search terms that does not bring any value to search results. A common scenario involves disregarding common words like "the" during searches.
Goal
Implement rules that disregard specific words in search requests.
Prerequisites
Before proceeding, ensure Customizing Ranking is completed.
Performing a Search without Rules:
When sending a request with the search term "the diamond," several documents containing only the word "the" are retrieved:
curl http://localhost:8080/search/ \
--header 'Content-Type: application/json' \
--data '{
"queryProfile": "book_v1",
"search_term": "the diamond",
}'
Let’s modify our application to ignore such terms.
Adding Rule to the VAP:
Add ./rules/stopwords.sr file into the VAP.
@default # Stopwords: replace them by nothing title:[stopword] -> ; description:[stopword] -> ; tags:[stopword] -> ; ngram:[stopword] -> ; [stopword] :- a,am,an,and,are,as,at,be,because,been,but,by,can,com,could,did,do,does,for,from,had,has,have,he,her,him,his,how,i,if,in,is,it,its,me,my,no,not,of,on,or,our,she,should,so,some,someone,than,that,the,their,them,then,there,these,they,this,through,to,too,us,was,way,we,were,what,when,where,which,who,why,will,with,would,www,you,your,can t,doesn t,how s,it s,that s,there s,what s,when s;
Here, we specify:
-
the set of words to be ignored (
[stopword]) -
fields to which the provided rule should be applied (
title,description,tags,ngram)
Enabling the Rule in the Query Profile
Activate the rule in book_v1 as follows:
<query-profile id="book_v1">
...
<field name="rules.off">false</field>
...
</query-profile>
Now, when resubmitting the same request the response contains only the document "The Diamond as Big as the Ritz."
Summary
By following these steps, we have successfully implemented and activated the rule to filter out insignificant words in search requests.
Next Steps
Explore Semantic Search to improve search quality further