The Danger in Document-Level Sentiment Analysis

Author: Calvin Casalino, Senior Product Manager
Product: Reviews
Date: June 2020

We've all experienced this: you visit a new restaurant and enjoy some 5-star food, but toward the end of the meal the waiter takes an hour to bring the check. Or they forget your side. Or the bathroom wasn't exactly clean. A few hours later, when you open up your favorite review site, you leave a 3-star review.

The content of your review provides much more information than the star rating alone, and your review is only one of many that the business receives. For your feedback to become an actionable item that helps the business provide a better experience, the business needs a way to analyze the granular content of all of its reviews, at scale.

Unbeknownst to you, the business you are reviewing manages its reviews with Yext, and it now has a new Google review on its Google My Business listing. That review triggers a Pub/Sub message from Google to the Yext Reviews system so that the business, a Yext customer, is notified of the review. Yext then goes to work, using the Google Cloud Natural Language API to perform Sentiment Analysis on the review content. The API parses out important Keywords and Modifiers and scores each Keyword, and Yext reports that score as a Sentiment Score on a -100 to +100 scale.

Let's run through the example above:

First, Google helps us parse through the parts of speech of each word within the review. This first step is key to understanding which words in the reviews are Keywords and which words are Modifiers. In general, Keywords are nouns and Modifiers are adjectives describing those nouns. Not every Keyword has a Modifier, and sometimes Keywords can even have multiple Modifiers!
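To make the Keyword and Modifier idea concrete, here is a minimal Python sketch. It assumes every word has already been tagged with a part of speech and with the index of the word it attaches to. The `Token` type, the `pair_modifiers` helper, and the hand-tagged sample are all hypothetical; they are not the response format of the Google API.

```python
from typing import NamedTuple


class Token(NamedTuple):
    text: str
    pos: str   # part of speech, e.g. "NOUN" or "ADJ"
    head: int  # index of the token this word attaches to


def pair_modifiers(tokens: list[Token]) -> dict[str, list[str]]:
    """Map each noun (Keyword) to the adjectives (Modifiers) attached to it."""
    keywords: dict[str, list[str]] = {}
    # First pass: every noun is a Keyword, even if it has no Modifier.
    for tok in tokens:
        if tok.pos == "NOUN":
            keywords.setdefault(tok.text, [])
    # Second pass: attach each adjective to the noun it points at.
    for tok in tokens:
        if tok.pos == "ADJ":
            head = tokens[tok.head]
            if head.pos == "NOUN":
                keywords[head.text].append(tok.text)
    return keywords


# Hand-tagged sample (an assumption for illustration, not real API output).
tokens = [
    Token("expansive", "ADJ", 1),
    Token("menu", "NOUN", 1),
    Token("rude", "ADJ", 3),
    Token("waiter", "NOUN", 3),
    Token("corner", "NOUN", 4),
]

print(pair_modifiers(tokens))
# {'menu': ['expansive'], 'waiter': ['rude'], 'corner': []}
```

Note that `corner` ends up as a Keyword with no Modifier. The sketch only handles adjectives that attach directly to a noun; a phrasing like "the waiter was rude" would need a fuller parse.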

Source: Google Cloud Natural Language API

Now that we understand the structure of the review content, Google can help us understand the Sentiment Score of each Keyword. Sentiment Scores come from Google's pre-trained machine learning models, which parse natural language. The result is a score from -1 to 1 that indicates how negatively or positively the word is used in the context of the review.

Source: Google Cloud Natural Language API

In our example review, we see many neutral Keywords like family, town, or corner. These are neither positive nor negative, so we can mostly ignore them. However, menu has a highly positive Sentiment Score and waiter has a highly negative one. For this restaurant, knowing that its "expansive menu" is a hit with customers while its "rude waiter" may be hurting business is an extremely valuable and actionable insight.

Within the Yext platform, the restaurant would see the following data for these two Keywords within Sentiment Analysis:

Yext multiplies the Sentiment Score by 100, so the menu Keyword shows an average score of +93. The waiter Keyword has a Sentiment Score of -73; this is something the restaurant can improve. If we looked only at the average rating of these reviews, we'd see a 3-star average and lose the additional insight that Sentiment Analysis brings.
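The arithmetic here is small enough to sketch in a few lines of Python. This is only an illustration: the function names are hypothetical, the raw scores are made-up values chosen to land on +93 and -73, and we are assuming the average is taken per Keyword across mentions before scaling.

```python
from collections import defaultdict
from statistics import mean


def to_display_scale(raw_score: float) -> int:
    """Convert a raw score in [-1, 1] to the -100 to +100 scale."""
    return round(raw_score * 100)


def average_by_keyword(mentions: list[tuple[str, float]]) -> dict[str, int]:
    """Average the raw scores for each Keyword, then scale the average by 100."""
    grouped: dict[str, list[float]] = defaultdict(list)
    for keyword, raw_score in mentions:
        grouped[keyword].append(raw_score)
    return {kw: to_display_scale(mean(scores)) for kw, scores in grouped.items()}


# Made-up raw scores (assumption), two mentions per Keyword.
mentions = [("menu", 0.9), ("menu", 0.96), ("waiter", -0.7), ("waiter", -0.76)]

print(average_by_keyword(mentions))
# {'menu': 93, 'waiter': -73}
```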

This whole process may seem complex, but it's worth it. Remember that our review contained both negative and positive feedback. The most common form of Sentiment Analysis out there is Document-Level Sentiment Analysis. This version looks at the entire document of text, in this case the review, and tries to determine one overall sentiment score for the block of text. This works OK for purely positive or purely negative reviews (even though you may not know why a review was positive when it mentions multiple positive Keywords), but it breaks down for reviews with mixed feedback.
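A quick way to see the problem is to collapse the Keyword scores into a single number. The sketch below uses a plain average as a crude stand-in for a document-level score. That is an assumption for illustration only: real document-level models score the full text rather than averaging Keywords, and `crude_document_score` is a hypothetical helper.

```python
from statistics import mean


def crude_document_score(keyword_scores: dict[str, int]) -> float:
    """Stand-in for a document-level score: the plain mean of Keyword scores."""
    return mean(keyword_scores.values())


# Keyword scores from the example review, on the -100 to +100 scale.
review = {"menu": 93, "waiter": -73}

print(crude_document_score(review))  # a single, mildly positive number
print(max(review, key=review.get))   # Keyword with the strongest praise
print(min(review, key=review.get))   # Keyword with the strongest complaint
```

The single number sits close to neutral, while the per-Keyword view still shows that the menu is loved and the waiter is a problem.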

To measure how much entity-level sentiment analysis matters, we randomly selected a set of 100,000 reviews monitored by Yext in 2020 and examined what percentage of them had Mixed Sentiment and Strong Mixed Sentiment. We defined Mixed Sentiment as a review with at least one negative Keyword (Sentiment Score <= -10) and at least one positive Keyword (Sentiment Score >= +10). We defined Strong Mixed Sentiment as a review with at least one strong negative Keyword (Sentiment Score <= -50) and at least one strong positive Keyword (Sentiment Score >= +50). The analysis showed that 29.9% of reviews had Mixed Sentiment and 14.7% of reviews had Strong Mixed Sentiment. These results emphasize the difference between entity-level and document-level sentiment analysis. (Yext Analysis, 2020)
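The two definitions translate directly into code. Here is a minimal Python sketch that takes a review's Keyword scores on the -100 to +100 scale; the function names and sample scores are hypothetical, and this is not the code used for the analysis.

```python
def has_mixed_sentiment(scores: list[int], threshold: int = 10) -> bool:
    """True if at least one score is <= -threshold and one is >= +threshold."""
    has_negative = any(s <= -threshold for s in scores)
    has_positive = any(s >= threshold for s in scores)
    return has_negative and has_positive


def classify(scores: list[int]) -> str:
    """Return the strongest label that applies to a review's Keyword scores."""
    if has_mixed_sentiment(scores, threshold=50):
        return "Strong Mixed Sentiment"
    if has_mixed_sentiment(scores, threshold=10):
        return "Mixed Sentiment"
    return "Not mixed"


print(classify([93, -73]))  # Strong Mixed Sentiment
print(classify([40, -15]))  # Mixed Sentiment
print(classify([93, 60]))   # Not mixed
```

Every review that meets the Strong Mixed Sentiment thresholds also meets the Mixed Sentiment thresholds, so `classify` reports only the stronger label.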

A second kind of Sentiment Analysis, Keyword-Level Sentiment Analysis, considers each Keyword in the document on its own. This allows for much more granular insight. Consider our example review from earlier:

In this review, you as the customer commented that the menu was expansive, which results in a high Sentiment Score for menu. You also commented that the waiter was rude. A rude waiter mentioned in one or many reviews is a very important actionable insight for operations, and it would be lost if the entire document were scored as a whole instead.

With document-level sentiment analysis, this review might be ignored because its overall sentiment would come out neutral. Diving into the individual Keywords reveals the information you need to improve your business. Customers are leaving new reviews every day, and these are free data points to help improve your business. To get the best out of your review content, make sure you are analyzing it properly. Let the star ratings handle the document-level insight. Use Sentiment Analysis for Keyword-level insight.
