Author: Max Davish, Associate Product Manager
Blog Date: December 2020
FAQs are a hugely important part of Answers. They are the most viewed and most clicked-on entity type across Answers experiences, and they benefit both customers and businesses.
Customers want quick, relevant answers to their questions, without having to click through a complicated website or interact with a chatbot. And businesses want their customers to get this information directly from their website - not by making an expensive call to the support team or, worse, letting a competitor answer on Google.
But searching for FAQs is surprisingly difficult, because there can be so many different ways to ask the same question. Consider these search queries and this FAQ:
A human looking at the FAQ on the right understands intuitively that it's highly relevant to the search queries on the left. But a keyword search algorithm might not return the right result because many of these queries don't actually contain any individual keywords from the FAQ - even though they are still semantically similar.
Sure, you can always solve this by adding synonyms, but this doesn't scale very well because it's impossible to anticipate all the different ways a user could possibly ask a question. Fortunately, there's a better way.
Semantic search is a new way of searching for FAQs. Instead of looking at keywords, it measures the similarity in meaning between two questions. Answers does this using BERT - the same revolutionary natural language processing technology that powers other Answers features like location detection and entity recognition.
It works by using BERT to transform questions into points in a high-dimensional space, called embeddings. In this space, two questions that mean the same thing will be close together, even if they don't contain any of the same words. So instead of looking for overlapping keywords, we search for the FAQs that are "closest" to the user's question.
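To make the idea of "closest" concrete, here is a minimal sketch of a nearest-embedding lookup. It is not the Answers implementation: the three-dimensional vectors are invented for illustration (real BERT embeddings have hundreds of dimensions and come from a trained model), the FAQ names are made up, and cosine similarity is just one common way to measure closeness that we assume here.

```python
import math

# Toy 3-dimensional "embeddings". These values are invented for illustration;
# in a real system they would be produced by a model such as BERT.
FAQ_EMBEDDINGS = {
    "How is the virus transmitted?": [0.9, 0.1, 0.0],
    "What are your opening hours?": [0.0, 0.2, 0.9],
    "How do I reset my password?": [0.1, 0.9, 0.1],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def closest_faqs(query_embedding, faq_embeddings, top_k=1):
    """Rank FAQs by similarity to the query and return the top_k names."""
    ranked = sorted(
        faq_embeddings.items(),
        key=lambda item: cosine_similarity(query_embedding, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_k]]


# Pretend embedding for the query "how does it spread" (also invented).
query = [0.8, 0.2, 0.1]
print(closest_faqs(query, FAQ_EMBEDDINGS))
```

Note that the query shares no keywords with the top FAQ. The match comes entirely from the two vectors pointing in nearly the same direction.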
What does this mean for Administrators? It means that FAQ search will no longer require managing lots of keywords and synonyms. For example, you won't need to add a synonym between "spread" and "transmitted"; BERT innately understands that these words mean the same thing.
This helps make life better for both users and Administrators. Users can ask complex questions in different ways and still get relevant results, and Administrators won't need to do as much manual configuration of their experience. It will just work!
Let's see a few examples of semantic search in action. Here's one from Cox Residential, a US-based telecommunications company:
In this query, BERT understands that "virus protection" is related to "security", so instead of showing a list of links, we show the FAQ the user was looking for. No synonyms necessary!
Even when there are keyword matches, semantic search can help understand which keyword matches are more important. For example:
In this example, there are two FAQs that both contain the word "support", and keyword search has no way of knowing which is more relevant to the user's question. But BERT understands that the user isn't asking about modems. They want to talk to someone, so it brings the second FAQ to the top position.
Here's another example from Cherry Creek Mortgage, a US-based mortgage lender:
In this example, BERT understands that when the user asks about the "loan portal", they're trying to access their personal information online, so it returns the exact right FAQ instead of a handful of irrelevant ones that simply contain the word "loan".
Here's another example from Bucknell University:
Normally, to get this query to work you might add a synonym between "doctor" and "pre-med". But with semantic search, there's no need. BERT understands innately that "doctor" and "pre-med" are related concepts.
Here's one last example from Bliss World, a multi-channel spa and retail product company.
Once again, instead of forcing the user to sift through irrelevant results that matched on the product's name - Clear Genius - semantic search identifies the exact right FAQ to answer the user's question, even though it has the same number of keyword matches as other FAQs.
To use semantic search on your Answers experience, simply navigate to the new Answers Configuration UI and activate Semantic Text Search on the FAQ Name field:
Alternatively, you can set "semanticTextSearch": true under this field's configuration within the JSON editor!
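If you manage configuration with scripts, the same change could be sketched roughly as follows. Only the "semanticTextSearch": true setting comes from this post. The surrounding structure (the "verticals" and "searchableFields" keys, and the "faqs" and "name" identifiers) is an assumption made for illustration, so check it against your own configuration in the JSON editor before relying on it.

```python
import copy
import json


def enable_semantic_text_search(config, vertical, field):
    """Return a copy of config with semanticTextSearch enabled on one field.

    The nesting used here (verticals -> searchableFields -> field) is a
    hypothetical layout, not a confirmed schema.
    """
    updated = copy.deepcopy(config)
    field_config = (
        updated.setdefault("verticals", {})
        .setdefault(vertical, {})
        .setdefault("searchableFields", {})
        .setdefault(field, {})
    )
    field_config["semanticTextSearch"] = True
    return updated


# Hypothetical starting configuration with an FAQ vertical and a name field.
config = {"verticals": {"faqs": {"searchableFields": {"name": {}}}}}
updated = enable_semantic_text_search(config, "faqs", "name")
print(json.dumps(updated, indent=2))
```

The helper works on a copy, so the original configuration stays untouched until you decide to save the updated version.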
Does this only work for FAQs?
Yes, right now Semantic Search only works for FAQs. The reason is that FAQ names and user queries are naturally similar, so we can use the same BERT model to understand them both. This isn't necessarily true for other fields, so this type of semantic search doesn't work as well on them. But in the future we plan on adapting this technology to other entities and fields too!
Can I still use textSearch on other fields for my FAQs?
Yes! You can still search other fields like the keywords field or any custom fields you have. Semantic Search works in conjunction with either text search or NLP filters on other fields. When text search is activated on other fields, the algorithm mixes the signals from both fields to determine which FAQs are most relevant.
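This post doesn't spell out how the signals are mixed, so the following is only an illustrative sketch of one common approach: a weighted sum of a semantic similarity score and a keyword-overlap score. The weight, the scoring function, the example FAQs, and the precomputed "semantic" values are all hypothetical.

```python
def keyword_score(query, text):
    """Fraction of query words found in text (a crude stand-in for text search)."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & text_words) / len(query_words)


def blended_score(semantic, keyword, semantic_weight=0.7):
    """Combine the two signals with a weighted sum (the weight is an assumption)."""
    return semantic_weight * semantic + (1 - semantic_weight) * keyword


def rank_faqs(query, faqs, semantic_weight=0.7):
    """Return FAQ names ordered from most to least relevant."""
    scored = []
    for faq in faqs:
        kw = keyword_score(query, faq["keywords"])
        scored.append((blended_score(faq["semantic"], kw, semantic_weight), faq["name"]))
    scored.sort(reverse=True)
    return [name for _, name in scored]


# "semantic" holds an invented similarity between the query and the FAQ name.
faqs = [
    {"name": "Which modems do you support?", "keywords": "modem router support", "semantic": 0.35},
    {"name": "How do I contact support?", "keywords": "support phone chat agent", "semantic": 0.82},
]
print(rank_faqs("talk to support agent", faqs))
```

In this toy setup both FAQs match the keyword "support", but the semantic signal pushes the contact FAQ to the top, much like the modem example earlier in the post.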
Does semantic search work with synonyms?
Yes! Semantic Search doesn't require synonyms, but it still takes synonyms into account when searching for FAQs. In particular, you will never need synonyms for words that are naturally similar, like "fast" and "quick".
However, you might still want to use synonyms for words that are not naturally similar, or that are only similar in the context of your particular Answers experience.
What if I am using a custom entity type for FAQs? How do I use Semantic Text Search?
Semantic Search only works on the built-in FAQ entity, so you'll need to move data from your custom FAQ entity to the built-in one. This can be accomplished easily by exporting and re-uploading your data.
Semantic Text Search seems to be returning a lot more FAQs than before. Is this by design?
Yes. Because semantic search doesn't require any individual keyword matches, it tends to return more results than textSearch, though not always. You can set a maximum number of results to show on the front end using the limit parameter in the results configuration.