Getting your customers to the right search results within your site does not happen by chance. If you are looking to improve your website search performance, the key to success is combining behavioral data with machine learning to automatically increase the relevance of search results.
Understanding the importance of precision and recall metrics
When a customer searches your website for information, the results they get back should be as relevant to what they are looking for as possible. This search relevance can be measured using two primary metrics: precision and recall.
Precision is the share of returned results that are relevant to the customer’s query. For example, if a customer searches for “black boots” on an e-commerce site and the results show ten products, six of which are black boots, two of which are brown boots, one of which is black shoes, and one of which is brown socks, then the precision is 6 out of 10, or 60%.
Recall is the share of all the relevant items on your website that a search actually returns. For example, if there are twenty products relevant to the “black boots” search query but the system returns only sixteen of them, then the recall is 16 out of 20, or 80%.
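The two calculations can be sketched in a few lines of Python. This is only an illustration of the definitions above: the product IDs are made up, and the function names `precision` and `recall` are our own rather than part of any search product.

```python
def precision(returned, relevant):
    """Share of returned results that are relevant."""
    returned, relevant = set(returned), set(relevant)
    return len(returned & relevant) / len(returned) if returned else 0.0


def recall(returned, relevant):
    """Share of all relevant items that were returned."""
    returned, relevant = set(returned), set(relevant)
    return len(returned & relevant) / len(relevant) if relevant else 0.0


# Precision example: ten results, six of which are black boots.
# All product IDs here are hypothetical.
black_boots = {f"black-boot-{i}" for i in range(1, 7)}
ten_results = black_boots | {"brown-boot-1", "brown-boot-2", "black-shoe-1", "brown-sock-1"}
p = precision(ten_results, black_boots)

# Recall example: twenty relevant products, sixteen of them returned.
all_relevant = {f"black-boot-{i}" for i in range(1, 21)}
sixteen_returned = {f"black-boot-{i}" for i in range(1, 17)}
r = recall(sixteen_returned, all_relevant)

print(f"precision = {p:.0%}, recall = {r:.0%}")
```

Working with sets of product IDs keeps the two metrics symmetrical: both count the same overlap and differ only in what they divide by.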
At a small scale, having a little precision and recall “noise” in the search results may not seem so bad. However, when you scale this up to hundreds of results, it could mean dozens or even hundreds of irrelevant results that your customers have to wade through, along with relevant products that never appear at all. This can cost you sales.
To get a baseline for precision and recall, consider manually running 50-100 searches on your site, judging which results are relevant, and calculating both values for each search.
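One way to turn a batch of manually judged searches into a single pair of numbers is to average precision and recall across them. The sketch below assumes you have recorded, for each search, the IDs that were returned and the IDs you judged relevant; the `evaluate` function and the sample data are hypothetical.

```python
from statistics import mean


def evaluate(judged_searches):
    """Average precision and recall over (returned_ids, relevant_ids) pairs."""
    precisions, recalls = [], []
    for returned, relevant in judged_searches:
        returned, relevant = set(returned), set(relevant)
        hits = len(returned & relevant)
        precisions.append(hits / len(returned) if returned else 0.0)
        recalls.append(hits / len(relevant) if relevant else 0.0)
    return mean(precisions), mean(recalls)


# Two judged searches as a stand-in for the 50-100 suggested above.
sample = [
    (["a", "b", "c", "d"], ["a", "b", "e"]),  # precision 2/4, recall 2/3
    (["x", "y"], ["x", "y", "z", "w"]),       # precision 2/2, recall 2/4
]
avg_precision, avg_recall = evaluate(sample)
print(f"average precision = {avg_precision:.2f}, average recall = {avg_recall:.2f}")
```

Re-running the same set of searches after each change to the search configuration gives a like-for-like comparison.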
Improving search relevance, precision, and recall
There are a few ways to improve precision. The easiest is to stop your search engine from indexing fields that contain a lot of “noise.” For instance, in the example above, brown socks might be showing up in a search for “black boots” because their description includes a statement like, “These pair well with many kinds of shoes, from white flats to black boots.” Excluding the “description” field from your search index will prevent this problem from occurring.
Often, however, description fields contain many useful and relevant keywords for an item. In that case, it can be useful to create a separate field that includes the relevant keywords from the description field without the irrelevant terms. This process usually has to be done manually, which is quite time-consuming.
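The effect of excluding a noisy field can be seen with a toy in-memory search. This is a minimal sketch, not how a real search engine matches text: the `search` function, the two products, and the simple all-words matching rule are assumptions made for illustration.

```python
# Hypothetical catalog with a noisy description on the socks.
products = [
    {"id": "boot-1", "title": "Black leather boots",
     "description": "Classic ankle boots."},
    {"id": "sock-1", "title": "Brown wool socks",
     "description": "These pair well with many kinds of shoes, "
                    "from white flats to black boots."},
]


def search(query, docs, fields):
    """Return IDs of docs whose chosen fields contain every query word."""
    terms = query.lower().split()
    results = []
    for doc in docs:
        text = " ".join(doc[f] for f in fields).lower()
        words = set(text.replace(".", " ").replace(",", " ").split())
        if all(term in words for term in terms):
            results.append(doc["id"])
    return results


with_description = search("black boots", products, ["title", "description"])
title_only = search("black boots", products, ["title"])
print(with_description)  # the socks match through their description
print(title_only)        # only the boots match
```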
To improve recall, you can take the opposite approach: add more keywords to your search fields. One simple way to do this is to find synonym lists for common keywords and add those to your search engine so that, for instance, the word “shoe” is added to any item containing the word “sneaker.”
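Synonym expansion at indexing time might look like the following sketch. The synonym list and the `index_terms` function are hypothetical; real search engines usually offer their own synonym configuration, which this does not try to reproduce.

```python
# Hypothetical synonym list: each keyword maps to extra terms to index.
SYNONYMS = {
    "sneaker": {"shoe"},
    "sneakers": {"shoes"},
}


def index_terms(text, synonyms=SYNONYMS):
    """Return the words in text plus any synonyms listed for them."""
    terms = set(text.lower().split())
    for term in list(terms):
        terms |= synonyms.get(term, set())
    return terms


terms = index_terms("White canvas sneaker")
print(sorted(terms))
```

With the extra term indexed, a search for “shoe” can now find this item even though the word never appears in its original text.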
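These two approaches pull in opposite directions: removing indexed text to raise precision can hide relevant items and lower recall, while adding keywords to raise recall can let in irrelevant results and lower precision. For this reason, it’s important to measure each change quantitatively using the definitions above. That way, you can make sure your changes are having a positive overall effect.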
Use machine learning to automatically improve precision and recall
Rather than manually optimizing your website search, machine learning offers a way for your search system to optimize itself automatically over time based on real usage. The system does this by feeding data about which results customers actually click on and use back into its ranking algorithms.
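The feedback loop can be illustrated with a deliberately simple sketch. It is a click-counting heuristic standing in for a learned model, not a machine learning implementation: the `ClickFeedbackRanker` class, the `boost` weight, and the base scores are all assumptions made for the example.

```python
from collections import defaultdict


class ClickFeedbackRanker:
    """Re-ranks results using click counts recorded per query (toy heuristic)."""

    def __init__(self, boost=0.1):
        self.clicks = defaultdict(int)  # (query, product_id) -> click count
        self.boost = boost              # hypothetical weight per click

    def record_click(self, query, product_id):
        self.clicks[(query.lower(), product_id)] += 1

    def rerank(self, query, scored_results):
        """scored_results: list of (product_id, base_score) pairs."""
        q = query.lower()
        adjusted = [
            (pid, score + self.boost * self.clicks.get((q, pid), 0))
            for pid, score in scored_results
        ]
        adjusted.sort(key=lambda item: item[1], reverse=True)
        return [pid for pid, _ in adjusted]


ranker = ClickFeedbackRanker()
results = [("sock-1", 0.9), ("boot-1", 0.8)]  # made-up base scores

before = ranker.rerank("black boots", results)
for _ in range(3):
    ranker.record_click("black boots", "boot-1")  # customers pick the boots
after = ranker.rerank("black boots", results)
print(before, after)
```

A production system would learn from far richer signals, but the shape is the same: usage is recorded, and the ranking shifts toward what customers actually choose.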
Optimizing search results using machine learning can increase conversion rates over time by continually refining search result relevance and improving precision and recall.
Want to learn more about optimizing on-site search to drive real business metrics?
For years, retailers have optimized search solely on “relevance,” hoping the results that users want to see (and the results that drive important business metrics) appear at the top.
This is exactly wrong — and it’s not how companies like Google and Amazon optimize their search.
On October 24th, we’ll be hosting a new webinar discussing the limitations of optimizing search results for “relevance,” and the 3 attributes retailers should focus on to drive real business results from search.