
Search Satisfaction

Version 2.3.1

Contents
1. Introduction
1.1. Task Components
1.2. Steps in the Grading Process
1.3. Definitions
1.4. General Guidelines

2. Annotation Process
2.1. Understand the query
2.2. Review the results
• Overview of Result Types
2.3. Validate the result
• Wrong Language
• Content Unavailable
• Inappropriate
2.4. Rate the result
• Satisfaction Principles
• Degrees of Separation
• Think About the Meaning, Not Just Matching Words
• User Effort
• Source Quality
• Grading specific situations and result types
1. Ambiguous Queries (Multiple Interpretations)
2. Locale Sensitivity
3. English Results in Non-English Locales
4. Redirected Pages
5. Apps
6. News
7. Maps
8. Web Video
9. Dictionary, Stocks, Weather, Knowledge/Answers, Sports, and “Learn About” Queries
10. Web Results (also called Suggested Web Sites)
11. Web Images
12. Product Searches
13. Other Query Types
2.5. Review and submit
• Common grading mistakes

3. Additional Examples
3.1 Highly Satisfying
3.2 Satisfying
3.3 Somewhat Satisfying
3.4 Not Satisfying

Overall Preference Rating Task

Version History

Note: There are two platforms using these guidelines, Tag and Try Rating. In the majority of cases, the grading
instructions will be the same. However, there are some situations where the grade will be different depending on what
platform you are working on. These instances are noted in the guidelines when they occur.



1. Introduction

A search service may return many different types of results. How are these graded? What is a satisfying search result? In these
guidelines we talk about what constitutes a search query, the different types of results, and how to grade them. In addition we
describe some typical grading tasks that use the principles learned in satisfaction grading.

Search engine users are trying to accomplish a task (or achieve a goal) that requires some information or quick access to some
other resource, such as an app.

A user’s information need or search need is defined as the information or resource that the user needs in order to accomplish
their task. The user's query is an attempt to express that need to the search engine. If the search results enable the user to
accomplish their task, we say that the search need is satisfied.

We say that a result is satisfying if it satisfies the search need of a query. Results can be more satisfying or less satisfying
depending on how well or how completely they satisfy the need. The purpose of this task is to improve the search results when a
user issues a query.



1.1 Task Components

The grading interface displays each query together with additional information that provides useful context, including the following components:

• The query itself, along with the result


• The search mode used when making the query. This is either Safari (web search) or Spotlight (on-device search).
• The user location. We want to return results appropriate for their area (e.g., locations of businesses).
• The date of the query. We want to return results that are relevant in time.
• Web Search links you will use to research the possible intents and interpretations of the query

Note: The tool interface may look different depending on which platform you are using, but the components of each task will be
the same.



1.2 Steps in the Grading Process

1. Understand the query: Click on the web search link and scan the results to make sure you understand what the query is about.
• Keep in mind queries can have more than one meaning.
• If research links do not work, copy the query into a search engine with the correct locale preference.

2. Review the results: Review the query results.
• Note the type of query result. Is it appropriate for the user query?

3. Validate the result: Identify whether or not the query result is wrong language, content unavailable, or inappropriate.
• Are there any problems that would prevent you from judging the result or create an unsatisfying experience for the user?

4. Rate the result: Decide whether the query result is highly satisfying, satisfying, somewhat satisfying, or not satisfying according to the guidelines.
• Refer to the rating guidelines in Section 2.4.

5. Review and submit: Check your work for errors. Once you have reviewed your work, submit it and go to the next task.
• Ensure you have not made one of the common grading mistakes discussed in Section 2.5 before submitting the task.


1.3 Definitions

Named Entity: A person, place, organization, business, product, service, or event whose name would normally be capitalized in English. (This includes fictional entities.)
Examples: Stephen Curry, Yellowstone National Park, Jupiter, Médecins Sans Frontières, Starbucks, Post-It Notes, Skype, Super Bowl LI, Boxer Rebellion, Frodo Baggins

Knowledge Term: A word or phrase describing a concept or object of study (other than a named entity) that users may wish to learn more about. Knowledge terms may come from any field of study, including science, technology, mathematics, medicine, history, philosophy, literature, art, economics, etc. They are most often noun phrases, but may also be other parts of speech.
Examples: photosynthesis, elephant, ROC curve, linear algebra, cancer, oligarchy, veto, existentialism, metaphor, impressionism, interest rate

Official Site: A website provided by a named entity (or their employer or organization) that represents how they want to be presented to the world online.
Examples: Microsoft (company): www.microsoft.com; U.S. Internal Revenue Service (government organization): www.irs.gov; Taylor Swift (performer): www.taylorswift.com; Henry Louis Gates Jr. (professor at Harvard University): https://aaas.fas.harvard.edu/people/henry-louis-gates-jr


Official Online Presence: A generalization of official site that includes not just official sites but also other online "homes" provided by an entity and existing on commercial services such as social networks. This may include a Twitter feed, Facebook page, YouTube channel, Instagram feed, or other similar platform.
Examples: https://twitter.com/StephenKing; https://www.youtube.com/user/therock; https://www.instagram.com/badbunnypr/

Chain Business: A business (or organization) that consists of many locations that all provide basically the same product or service, AND where its customers' (or users') primary interaction with the business happens in person at those locations.
Examples: Starbucks, Taco Bell, Party City, California Department of Motor Vehicles

Visually Distinctive Entity: Anything whose concept or identity can be usefully conveyed by a visual image. People and places are visually distinctive entities, but so are certain tools, geometric figures, geological or architectural features, and visual artworks.
Examples: Jacinda Ardern, Taj Mahal, ball-peen hammer, dodecahedron, mesa, flying buttress, "The Thinker" (sculpture by Rodin)

1.4 General Guidelines


• You may assume all searches are made on an Apple device or Apple software, such as Safari.
• English results are never considered Wrong Language.
• Use a private browsing window when checking web links.
• Ensure your browser window is expanded. A small browser window causes some results to resize, potentially hiding information that would have been shown to the user, which might affect your rating.
• If research links do not work, copy the query into a search engine. Ensure you have the browser set to the correct locale
preference.
• Ensure all ad blockers are turned off in your preferred internet browser.



2. Annotation Process

2.1 Understand the query


The first step is to click on the web search links and scan the results to make sure you understand what the query is about. Keep
in mind queries can have more than one meaning. If research links do not work, copy the query into a search engine. Ensure you
have the browser set to the correct locale preference.

It is important that you research the query in order to fully understand the user intent. The query may be a common word that
you think you know. But the web search may show that the primary meaning is something entirely different. For example:

• Query is "canada goose"; result is the wikipedia page about that kind of bird. If you had not heard of the Canada Goose
clothing brand, you might assume that the bird page is what almost all users would want to see. But by looking at the web
search results, you can tell that this is not the case.

2.2 Review the results


Next, review the query results. There are many types of search results. Some results, when clicked, take you to a web page. Others are self-contained (not clickable) and answer search needs directly in the information presented, without the need for further user action.



Overview of Result Types

• Answers and Knowledge. Users ask questions (implicit, explicit, or grammatically incorrect) about a concept, a knowledge term, or general knowledge. Knowledge cards can return exact answers or rich experiences about knowledge concepts and entities.
Note: the term "Knowledge" might not always appear.

• Apps. Cards that take the user to the Apple App Store (or open an app on the device). Usually they have an icon of the app and its star ratings.


• Dictionary. This card shows the definition of a word. When the user interacts with this card, it provides detailed usage.

• Flights. This card displays flight status, such as arrival time, departure time, and destinations. When the user taps on this result, detailed information about arrival/departure gates and baggage claim is displayed.

• Maps. These results help the user navigate to a place. Usually they show the address and distance from the user. If the result is a business, it often has hours of operation.


• Movies/TV Shows/Books/Music. Cards that provide the user a very rich experience: for example, watching a movie or TV show, learning about the cast, following social media links or links to media-related sites (e.g., IMDB), listening to music, getting song lyrics, or reading books. From a grader's point of view they are not clickable (nor interactive). They usually show a picture, popularity ratings, etc.

• News. These are web results restricted to news sites (sports, fashion, politics, and so on). They usually have an "age of news" indicator at the bottom. They are designed to be clicked on, taking the user to the destination news site.


• Sports. These cards are meant to display sports scores, or the latest scores for a team (and dates of upcoming matches).

• Stocks. This card provides financial information related to stocks. It should show the ticker symbol, the company name, and the stock price. When the user interacts with this card, detailed stock information such as historic price graphs is displayed.

• Weather. This card shows the temperature at a location (and sometimes other weather conditions). When the user taps this card, they are shown a detailed multi-day weather forecast.


• Web Images. Groups of images clustered together. Usually the user doesn't interact with the images; they provide visual information about the search query.

• Web Results. By far the most common result type. These "cards" usually have an icon and a brief title of the webpage, and are designed to be clicked by the user, taking them to the corresponding website.

• Web Video. The user can click on these results to play a video (usually taken from video channels such as YouTube and Vimeo).


When viewing Web, Web Video, and News results, ensure ad blockers are turned off in your browser. You can do this by
following these steps:

Chrome

1. At the top right, click More > Settings.
2. Click Privacy and security > Site Settings.
3. Click Additional content settings > Ads.
4. Turn off "Block ads on sites that show intrusive or misleading ads."

Safari

1. Open up the Safari browser.
2. Open the "Safari" menu in the top left corner of the screen.
3. Click "Preferences" > "Websites" > "Content Blockers."
4. Any content blocking services currently active will be listed here. Set the service you wish to disable to "Off."

For other browsers, perform a web search for instructions on how to turn off ad blockers.

Note 1: For Web, Web Video, and News results:

• If the page presents a CAPTCHA, answer the CAPTCHA and proceed with grading the result following the guidelines.

• If the page presents a cookie pop-up, close the pop-up and proceed with grading the result following the guidelines.



Here are some things to think about as you review the query result:

All results:
• Note the type of query result. Is it appropriate given the user query?
• Perform web research to ensure the correctness and relevance of the information in the result.

Web, Web Video, and News results:
• Click the result and review the information present on the page.
• Are the results timely, or are they stale and outdated given the user's query?
• Is the information the user requested present on the page?
• Does the user have to do additional scrolling or click an additional link to arrive at the information they've requested?
• Also note any broken links or warnings, or log-ins or pop-ups that might prevent some but not all users from viewing the content on the page.

Direct results (such as Sports, Maps, Weather, Movies, etc.):
• Only review the information present on the card. Do not click on it.
• Is the answer to the query that addresses the user's intent present on the card?
• If location information is relevant to the query, is it present on the card?
• Is the date relevant and present?

Web Image Groups:
• Does the subject in the images match the query?
• Are all the images different, or are some of them the same?
• Is the subject blocked, out of focus, too far away, or otherwise difficult to see clearly in any of the images?

Once you have reviewed the results and noted the relevant information, go to the next step.



2.3 Validate the result

Before you can grade the satisfaction of a result, you’ll be asked to indicate whether there are any problems that would prevent
you from judging it or result in an unsatisfying experience for the user. There are three types of result problems you’ll be asked to
identify: wrong language, content unavailable, and inappropriate.

Wrong Language

A result is in the wrong language if it is neither in English nor in the language of the user’s locale.

However, there are a few exceptions that are NOT considered wrong language results:

1. Result (e.g. amazon.co.jp) is the same country-specific site as requested by the query (“amazon.co.jp”), even if the requested
site is not in your locale.
2. User is visiting another country, query is for a local business or attraction, result is in the language of the visited country (i.e.
where query was submitted), and there is no equivalent result in the user’s own locale language.
3. Query is in a foreign language and result is in locale language, but query is also the name of a popular song, movie, business,
etc. in the current locale (e.g. “viva la vida” query in en-US).
4. The result is in English. English results are never considered Wrong Language.



Content Unavailable

When you flag a result as Content Unavailable, you must leave a comment describing the reason for the flag. Flag result as
content unavailable in any of these situations:

• Result is a blank page, a parked domain, a 404 error, something unavailable in the user's country, or anything else where the content has been removed or is inaccessible.

Note: Refresh the page twice to ensure the link is actually broken/missing/unavailable.
◦ If refreshing the page does not make the content appear (i.e., does not "fix" the page), flag as Content Unavailable.
◦ If refreshing the page makes content appear ("fixes" the page), do not flag as CU. Proceed to Step 2.4 to assign a Satisfaction Rating based on the result.

• At least one image in a web-images group result is not visible.


• Result requires a log-in, password, or subscription to access, specifically where some but not all users would be blocked from viewing content.

For example, a publicly visible Twitter profile can be rated, because you do not need to log in to see it. A page where both you and the user have to log in to see its content cannot be graded and would result in an unsatisfying experience for the user.

• Additional information (the user's location, or the date the query was made) is missing, and its absence affects your ability to otherwise provide a grade.


• The browser presents a warning of a privacy or security issue on the page.

• Required information for this result type is missing (e.g., no distance shown for a Maps result).

• Result is a news story whose timestamp is more than 3 months later than (i.e., newer than) the date of the query (e.g., an article dated more than 3 months after a search made on January 8, 2024).


• Result is a website with a banner or pop-up indicating a limit on the number of visits (even if the limit has not yet been reached).

Using Contextual Information to Determine Content Unavailable

In some grading scenarios, you will need additional information to assign a grade. This additional information is called
Contextual Information, and it is provided in the tool to help you. There are two types of contextual information in this project:
Query Context and Result Context.



Query context is additional information provided about the query: user location and date the query was issued.

Result context is information provided about the result, such as distance from user location on a Maps result, or the date of a
news article on a News result.



• If location or date information is missing from the Query Context, and it is needed to grade the result, the result must be
flagged as Content Unavailable.
◦ For example, the query is, “what is the weather outside” and the result is weather for the US city of Seattle. You
must verify the user is in Seattle when they issue the request to assign a grade. If user location is missing, this
result must be flagged as Content Unavailable.
• However, if you are confident that the missing information would not change the grade (even if present) you can assign a
grade without flagging the result.
◦ For example, the query asks for the definition of a word. The date the query was issued is missing. In this case, the
date the query was issued is not needed to grade the result, and you can assign a grade without flagging the result
as Content Unavailable.
• If the Result Context is missing from the result, you must flag the result as Content Unavailable by this guideline above:
“Required information for this result type is missing (e.g. no distance shown for Maps result).”

Inappropriate

A result is considered inappropriate if it has any of the following: pornography, adult advertising/services, sex toys, illegal drugs,
hate speech, gambling, spam/phishing, pirated content (including those posing as free video streaming services), or gore/shock.
In general, we want to connect users with useful content for their topic of interest while protecting them from being exposed to
harmful information summarized below.

• Hateful: the result should not advocate discriminatory content that intentionally attacks someone’s dignity. This can
include references or commentary about religion, race, sexual orientation, gender, national/ethnic origin, or other targeted
groups.

• Violent or harmful: the result should not intentionally incite imminent violent, physically dangerous, or illegal activities,
nor provide information that leads to immediate harm.

• Sexually explicit: the result should not have overtly sexual or pornographic material, defined by Webster's Dictionary as "explicit descriptions or displays of sexual organs or activities that are principally intended to stimulate erotic rather than aesthetic or emotional feelings."



• Contradicting expert consensus on public interest topics: the result should not contradict well-established or expert consensus on a popular topic or issue. This includes misleading or inaccurate information.

• Spam: results that are malicious, deceptive, or manipulative. Examples: pages that contain phishing schemes, install viruses, or attempt to artificially boost their relevance (e.g., link farming, keyword stuffing, etc.).

• Results that do not contain original and useful content. Examples: pages with content scraped from Wikipedia or otherwise automatically-created content.

• Illegal: we also manually remove reported results in those circumstances that are required by law in the corresponding locale (e.g., images of child abuse, content related to sex trafficking, copyright infringement, etc.) and when action is required to keep people safe (e.g., involuntary posting of sensitive personal information, etc.). Sites posing as free movie-streaming services are also part of this category, as are app sites that promote sideloading (which can lead to unsafe applications being installed on phones).

Note 2: Content that might otherwise be considered inappropriate is acceptable if it occurs in a medical, educational, fine art, or journalistic context, and should not be flagged (e.g., Wikipedia).

Inappropriate Examples

User searched for [tinyzone] and the result is https://tinyzonetv.to/, which contains pirated content.

User searched for [sdc.com] and the result is http://sdc.com/, or user searched [olga 24k gold] and the result is https://www.lelo.com/blog/olga-24k-gold-review/. Both results contain adult advertising and should be flagged.

Irrespective of whether the user was searching for this, these results need to be flagged.



Follow this workflow in order to properly handle a flagged result:

• If working in TAG:
◦ If the result is Wrong Language or Inappropriate, flag the result and select Submit to go to the next query.
◦ If the result is Content Unavailable, flag the result and leave a comment about why. Submit to go to the next query.

• If working in Try Rating:
◦ For Wrong Language, Inappropriate, or Content Unavailable, flag the result and go to Step 2.4: Rate the result.



2.4 Rate the result

When judging how satisfying each result is, you’ll use the following scale:

Highly Satisfying

Definition: Almost all users would want to see this result. It's authoritative, accurate, up-to-date, and addresses the most likely search need(s). If the user is asking a specific question, the result gives the correct answer clearly and concisely.

Note that some types of results can never be Highly Satisfying. Results for advice or recommendation queries (e.g., "how to lose weight", "chicken parmesan recipe", "best beatles song", "thai restaurant") can never be HS. This is because the result would be an opinion, and we don't know if almost all users would agree with the recommendation.

Examples:

• Query: instagram. Result: the Instagram app. Instagram is best known as an app, so the result is what almost all users would want to see.

• Query: microsoft. Result: their official website, microsoft.com. Almost all users searching for a company or organization would want to see its official web site.

• Query: how many stomachs does a cow have. Result: a knowledge card with the answer, which immediately gives the user all the information they asked for.


Satisfying

Definition: Many users would be interested in seeing this result. Satisfying results often provide supplementary information that is "one step away" from the query topic. For example, if the query is a restaurant, it might be a review of the restaurant; if the query is a company, it might be the current stock price, or news about the company.

Examples:

• Query: how many stomachs does a cow have. Result: Wikipedia page about cows. The page contains the answer, but the user has to do some extra work to find it: clicking on the result, reading and scrolling through it.

• Query: qr reader. Result: an app to read QR codes. The link is to a highly rated QR reader app; however, there are other highly rated QR reader apps, and we do not know if the result would entirely meet the user's search needs.

• Query: gilmore girls. Result: a page where the user can buy/rent the show. The query is a product (a movie) and the result allows a user to buy/rent it.


Somewhat Satisfying

Definition: Some users may find this result useful, but it's probably not what most searchers were looking for. It's often only indirectly related to the search need or assumes an uncommon interpretation of the query.

Examples:

• Query: camden county college. Result: home page for the library at the college. Probably not what most users were looking for. (If they had wanted the library, they would have mentioned it in the query.)

• Query: vietnamese restaurant [user is in San Jose, California]. Result: a restaurant in San Francisco, 43 miles from the user's location in San Jose; there are dozens of closer Vietnamese restaurants.

• Query: alica schmidt. Result: https://hotsportsgirls.com/alica-schmidt/. The query is about a German track and field star, so the most satisfying results will be about her competitions, her athletic achievements, etc. In contrast, this result is solely about her physical appearance, which will be of interest to only some searchers.


Not Satisfying

Definition: This result has nothing to do with the query, or provides incorrect information, and should not be shown.

• If working in Try Rating: all results flagged as "Inappropriate", "Content Unavailable", or "Wrong Language" should be rated as Not Satisfying.

Examples:

• Query: harold's kitchen menu [user is in Virginia, US]. Result: home page for Harold's Kitchen & Bar in British Columbia, Canada. Despite the similar name, this result is for a restaurant 3,000 miles away from the user. (And there is a different Harold's Kitchen near the user.)

• Query: how many weeks has it been since march 25th [query issued in April 2021]. Result: https://www.answers.com/Q/How_many_weeks_has_it_been_since_April_27_2009. Despite matching some words in the query, this result is for a totally different year and does not give the user any useful information.

• Query: tour de france stage 1 (queried on 29 July 2022). Result: NBC video of stage 18 of the 2021 Tour de France. The result is for a previous year's Tour de France, and is not even the stage the user asked for.


Note 3: Search engines often correct query spelling errors and/or predict (“autocomplete”) what a partially typed
query was intended to be. If the web search results show results for a corrected or autocompleted version of the
query, you should grade your result as if the user typed the corrected or completed query.

Examples:
• Query is “fac,” result is “facebook.com”. Grade as if the query was “Facebook.”
• Query is “ted cruise,” result is a wikipedia page about U.S. senator Ted Cruz.
Grade as if the query was “ted cruz.”

Satisfaction Principles

There are several factors you should consider when you grade a result.

Degrees of Separation

Results are often associated with concepts in the real world, and different concepts are connected by their relationships.

For example, the concept of the singer “Beyoncé”


• is related to the concept of her album “Lemonade,”
• which in turn is related to a review of the album in Rolling Stone magazine,
• which is related to the author of the review, Rob Sheffield.

Each time we pass through one of these relationships, we increase the distance from the original concept.



For example:

Query: Beyoncé
• Highly Satisfying: Beyoncé's official website.
• Satisfying: Her "Lemonade" album on iTunes.
• Somewhat Satisfying: A Rolling Stone magazine review of the album.
• Not Satisfying: The reviewer Rob Sheffield's Twitter.

Query: Rolling Stone Lemonade album review
• Highly Satisfying: The review of the album.
• Satisfying: The album.
• Somewhat Satisfying: The singer's official site and Rob Sheffield's Twitter.
• Not Satisfying: A random article from the same issue of Rolling Stone.

We can think of these relationships as "degrees of separation": in this example, the review of the Lemonade album is two degrees of separation from Beyoncé.

When grading results, each degree of separation from the concept mentioned in the query (that is, each relationship you have to traverse to get to the result) lowers the satisfaction grade by one level, as the sketch below illustrates. See the tables above.
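To make the counting rule concrete, here is a minimal sketch in Python (the helper name is hypothetical and not part of any grading platform) that starts at Highly Satisfying and steps down one level per degree of separation:

    # Degrees-of-separation sketch: ratings ordered best to worst.
    RATINGS = ["Highly Satisfying", "Satisfying", "Somewhat Satisfying", "Not Satisfying"]

    def rating_for_separation(degrees: int) -> str:
        """Drop one satisfaction level per degree of separation from the query concept."""
        return RATINGS[min(degrees, len(RATINGS) - 1)]

    # Query "Beyoncé": official site = 0 degrees, Lemonade album = 1,
    # Rolling Stone review of the album = 2, the reviewer's Twitter = 3.
    for degrees, result in enumerate(["official site", "Lemonade album on iTunes",
                                      "Rolling Stone review", "Rob Sheffield's Twitter"]):
        print(f"{result}: {rating_for_separation(degrees)}")

This reproduces the Beyoncé column of the table above; it is only an illustration of the counting rule, not a substitute for the judgment described in these guidelines.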

Example Scenarios

Scenario 1. Query: Creative Work. Result: Performer/Creator. Description: The query is the name of a creative work (music album, movie, etc.); the result is a representation of the creator/performer (e.g., the artist's official site). Grade it as: Satisfying.
Example: a. Query is "fleabag," result is https://en.wikipedia.org/wiki/Phoebe_Waller-Bridge, the wikipedia page about the creator and star of that television series.

Scenario 2. Query: Performer/Creator. Result: Performer's/Creator's Work. Description: The query is the name of a performer (singer, actor, etc.) or creator (author, composer, artist, etc.); the result is a representation of their work (album, song, movie, book, etc.), where the user can view/hear/download/stream/learn about it. Grade it as: Satisfying.
Example: a. Query is "taylor swift," result is the Apple Music result for the singer's recent album "Lover," https://music.apple.com/us/album/lover/1468058165

Think About the Meaning, Not Just Matching Words

Note that some highly satisfying results may not contain all (or even any) of the query words; what matters is the meaning. For
example:

• The result www.premierleague.com/home is highly satisfying for the query “english premier league soccer” even
though that result doesn’t contain the words “english” or “soccer.”

• The result https://music.apple.com/us/album/25/1544494115 is satisfying for the query "adele's third album," even though it doesn't contain the word "third."

It's also possible for a result to contain all the query words and not be satisfying:

• The result https://en.wikipedia.org/wiki/My_Girl_Has_Gone (a web page about a song from the 1960s) is not satisfying for the query "gone girl," even though the result contains both query words. Gone Girl is the title of a book and movie from the 2010s, and the song result is clearly not what the user intended.



User Effort

When the user is looking for specific information, a result that displays this information directly is preferable to a regular web
result. For example:
• If the query is “how old is Obama”, then a Knowledge card that directly displays his age without requiring any user action
is better than a web result that the user needs to click on, wait for it to load, and scroll through to find the desired
information.

(Image: example of a card showing Obama's age. Check that the answer is indeed correct.)

Source Quality

Sources of results, including web sites and news providers, can have large differences in quality. If the source of a result is low
quality, you should assign a lower grade than you would have otherwise. Source quality is based on several factors:

1. Writing

• High quality result source: professionally written, clear and understandable.


• Low quality source: unclear, hard to read, filled with grammatical and spelling errors.



2. Motivation

• High quality source: has a neutral point of view, or makes point of view clear.
• Low quality source: has "hidden agendas," such as pretending to offer information while actually trying to sell its
services.

Motivation Examples

• High Quality: For general knowledge, a well-researched page on Wikipedia that provides adequate references; IMDB or Metacritic for movies (these sites rely on facts and have a neutral point of view). For news, a reputable news-gathering organization such as BBC News or AP News.

• Neutral: For information about a media item or personality (book, movie, series, actor, singer, etc.), a popular fan site. A website that aggregates information from other websites.

• Low Quality: A page with any kind of malicious behavior, e.g., trying to trick users into downloading something onto their computer.

3. Reputation and Trustworthiness

When the source is a web URL, this means that the response is taken from that web page. Some of those pages may be from
authoritative sites whose content is carefully curated by experts, while others may be created by people expressing their
uninformed opinion, or worse, promoting conspiracy theories and other misinformation.

Look at the source page and see if you can determine how much you can trust the information provided. Sometimes checking
the “About Us” page or the wikipedia page of the website (if any), or doing a third-party search can help you better determine
the trustworthiness of the Source.



• More trustworthy source: well-known and well-respected among those who provide this kind of service.

◦ This is a well-known, authoritative source created by people who are either experts on the subject, are the
inventors/creators/owners of the subject, or use professional standards of research to create the content.

◦ Includes news articles or news subpages dedicated to certain topics, written by professional journalists or experts on the subject. For example:

• BBC article about the science of snow: https://www.bbcearth.com/news/17-surprising-facts-about-snow

• BBC subpage dedicated to science (written and curated by experts in their respective fields): https://www.bbcearth.com/science

• Less trustworthy source: unknown (or known to be unreliable and untrustworthy).

◦ The content may be created by a random person who knows nothing about the subject, or by an organization with
a particular political or commercial agenda (e.g. pretending to give information when actually trying to sell you a
product or service).

◦ The information it provides is misleading or incorrect, often supporting a conspiracy theory, or has no purpose
other than to get users to click on links or ads.



Reputation and Trustworthiness Examples

More Trustworthy:
• For medical information, a page from the Mayo Clinic website. Medical accuracy should be judged using trusted health websites such as MayoClinic, NHS, NIH, Medline, CDC, Merck Manuals. This and this website have links to trusted health sites.
• For information about government policies, a page from the relevant government agency website.
• For news, a reputable news-gathering organization such as BBC News. For general knowledge, a well-researched page on Wikipedia that provides adequate references; IMDB or Metacritic for movies.

Neutral:
• For medical information, a page from an individual doctor's website or very ad-heavy commercial sites.
• A page where there seem to be more ads than content.
• For information about a specialized topic (cooking, playing guitar, computer programming, etc.), a blog written by someone who does that activity as an occupation or hobby.

Less Trustworthy:
• A page where many different users provide their own opinions as answers to questions, and the answers are often wrong or contradict each other.
• A page that makes claims (such as "the earth is flat") that contradict expert consensus.
• A page intended as parody, humor, or satire whose contents are not meant to be taken seriously.

4. Use of Citations

• High quality source: if offering scientific or medical information, cites sources.

• Low quality source: makes medical or scientific claims without citations or evidence.



Source Quality Examples

Query: instagram.com change password

• Result: official instructions on how to change an Instagram password. Source: facebook.com. Rating: Satisfying. Explanation: The query is asking an implicit question (how to change an Instagram password). This web page has the authoritative answer, but the user has to click on the result to visit the page in order to see the answer.

• Result: low-quality website describing instructions. Rating: Not Satisfying. Explanation: Poorly written website that talks about resetting a password when it has been forgotten (which is a different meaning of the query).

Source quality is meant to be only one point to help you evaluate a query result. For example, the result can still be unsatisfying
even if from a trustworthy site (e.g. does not answer user intent). And even if you consider a site to be neutral, the result can still
be highly satisfying.

Source quality also depends on the query. Some websites meet all the criteria of being a high quality source, but the site is not
an appropriate source given the query intent. For example, The Onion or Punch satire websites are sources of high quality humor,
but they are not trustworthy sources for news.

Always keep in mind the user intent when considering Source Quality.



Grading Specific Situations and Result Types

1. Ambiguous Queries (Multiple Interpretations)

While most queries express several different user intents, some queries are also ambiguous in what they refer to (e.g. “apple”
could be a company or a fruit). In this case you should still grade the result, using the following additional guidelines.

If you're not sure whether there is a dominant interpretation, look at the web search results for the query. If most of the highly
ranked results on the first page are for one interpretation, then you should consider that to be the dominant interpretation.

Dominant Interpretation Exists (one interpretation is much more popular than the others):

• Dominant Interpretation: If a result is for the dominant interpretation, you should grade using the normal guidelines.
1. The query is "allegiant", result is the official website for the airline. Grade as HS, since the dominant interpretation of the query is the airline.
2. The query is "apple", result is a map result for the Apple store near the user, but not the closest. Grade as S, since the dominant interpretation of the query is the technology company.

• Secondary Interpretation: If a result would be relevant (HS/S/SS) for a secondary interpretation, you should grade it as "SS".
1. Query is "michael jordan", result is the IMDB page for actor Michael B. Jordan. Grade as SS, since the dominant interpretation of the query is for a different person, the former NBA basketball player.
2. Query is "american eagle", result is the home page of web developer americaneagle.com. Grade as SS (rather than HS), since the dominant interpretation of the query is the clothing retailer American Eagle Outfitters.
3. Query is "golden retriever", result is a song titled Golden Retriever. Grade as SS (rather than S/HS), since the song is not the dominant interpretation of the query. The dog breed is the dominant interpretation for this query.


Multiple Interpretations, None Dominant (two or more interpretations of similar popularity):

Sometimes there are several reasonable interpretations, but none of them is dominant. In that case you should grade normally for all of them, except that results that would have been HS if there were only one (or one dominant) interpretation should be graded S instead. That's because we can't say which interpretation is the one that nearly all users would want to see.

1. Query is "um athletics" (location is Texas), result is the home page for the University of Miami athletics program. Grade as S (rather than HS), because "um athletics" could equally well refer to the University of Michigan or University of Maryland athletics programs, among others.
2. Query is "um athletics", result is a photo gallery showing some athletic facilities under construction at the University of Michigan. Grade normally: it's SS, because although it relates to the query, it's not what most users doing that search are looking for.

2. Locale Sensitivity

Explicitly Locale-Sensitive. The query explicitly specifies that the user is seeking results from a locale that differs from their current location.
Grade: Results that do not pertain to the locale specified in the query should be automatically graded as "NS".
Example: Query is "amazon france". The user is in the en-GB locale. The result is https://amazon.co.uk. Grade as NS, since the Amazon page in the UK is not what the user is searching for.

Implicitly Locale-Sensitive. The query does not explicitly ask for results in a particular locale, but the user need is inherently locale-specific (e.g., local law information, country-specific merchant sites, a nearby real-world business).
Grade: Any results from a different locale (even if they're in the correct language) should be automatically graded as "NS".
Example: Query is "ticketmaster"; user is located in the US. Result is ticketmaster.co.uk. Grade as NS, since the user did not express any interest in UK events.

Mildly Locale-Sensitive. The query does not explicitly ask for results in a particular locale, but those in other locales may be somewhat less useful.
Grade: Foreign results (as long as they're in the correct language) should be SLIGHTLY penalized by assigning a grade one level lower than you would normally give, as sketched below:
• "HS" results should be downgraded to "S"
• "S" results should be downgraded to "SS"
• "SS" results should be downgraded to "NS"
• "NS" results should remain "NS"
Example: Query is "vaccine recommendations". The user's locale is en-US, and the result is https://www.nhs.uk. The NHS is the UK's National Health Service, which provides health care to all British residents. Since different countries provide different medical advice for their residents, the UK's advice would be less useful to a US resident than advice from a US medical agency. The result should be SLIGHTLY penalized from S down to SS.

Not Locale-Sensitive. Results from any locale would be equally useful for this query.
Grade: Grade the result without regard to locale.
Example: Query is "tennis news." User is in en-US; result is news from the BBC about the latest results from the Wimbledon tennis tournament.
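As an illustration only, the one-level penalty can be written as a tiny Python sketch (hypothetical helper, not a grading-tool API):

    # One-level downgrade for mildly locale-sensitive results.
    RATINGS = ["HS", "S", "SS", "NS"]  # ordered best to worst

    def downgrade_one_level(rating: str) -> str:
        """HS -> S, S -> SS, SS -> NS; NS stays NS."""
        index = RATINGS.index(rating)
        return RATINGS[min(index + 1, len(RATINGS) - 1)]

    assert downgrade_one_level("HS") == "S"
    assert downgrade_one_level("SS") == "NS"
    assert downgrade_one_level("NS") == "NS"  # never drops below Not Satisfying

The same ladder applies wherever these guidelines call for penalizing a result by one level.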

Example Scenarios

Scenario 3. Query: Local Intent Query. Result: Unreasonably Distant Result. Description: The query indicates or assumes a nearby location; the result is so geographically distant that it makes no sense to show it. Grade it as: Not Satisfying.
Examples: a. Query is "starbucks" [in San Francisco, CA], result is a Maps result for a Starbucks in San Diego, CA, 500 miles away. b. Query is "airport" [in Boston, MA], result is the official website for Heathrow Airport in London, UK.

Scenario 4. Query: Explicitly Locale-Sensitive Query. Result: Wrong Locale Result. Description: The query explicitly seeks a result from a specific locale; the result pertains to a locale different from the one specified. Grade it as: Not Satisfying.
Example: a. Locale is en_US, query is "kit kat japan," result is https://www.hersheys.com/kitkat/en_us/home.html

Scenario 5. Query: Implicitly Locale-Sensitive Query. Result: Wrong Locale Result. Description: The query does not mention a locale, but the user need implicitly requires results from the user's locale; the result pertains to a locale different from the user's locale. Grade it as: Not Satisfying.
Examples: a. Locale is en_US, query is "ticketmaster," result is the UK-specific Ticketmaster app. b. Locale is en_IN, query is "do I need a visa to visit japan," result is the US government page https://travel.state.gov/content/travel/en/international-travel/International-Travel-Country-Information-Pages/Japan.html

3. English Results in Non-English Locales

English is a widely-understood second language in many countries, and all our international graders are fluent in it. For this
reason, rather than simply marking an English result in a non-English locale as “wrong language,” graders should go ahead and
grade the result, with the following locale-specific considerations. You will need to use your own knowledge of the locale to
decide which guideline to apply.

Scenario: The user's locale is one where most users understand English fluently (e.g., es-US) and would likely be interested in English-language results.
Grade: Grade the result normally, the same way you would if it were in the locale language.

Scenario: The user's locale is one where many users understand English fluently (e.g., Western Europe) and would possibly be interested in English-language results.
Grade: Grade the result one level lower than you would if it were in the locale language.
⚠ Results that would have been NS should still be graded as NS.

Scenario: The user's locale is one where relatively few users understand English fluently and users would be unlikely to be interested in English-language results.
Grade: Grade the result as NS.
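These three branches amount to a small decision rule. A minimal sketch, assuming the same HS/S/SS/NS scale used throughout these guidelines (the fluency buckets and helper name are hypothetical):

    # English-result rule for non-English locales (sketch only).
    RATINGS = ["HS", "S", "SS", "NS"]

    def grade_english_result(normal_rating: str, fluency: str) -> str:
        if fluency == "most":      # e.g., es-US: grade normally
            return normal_rating
        if fluency == "many":      # e.g., Western Europe: one level lower
            index = RATINGS.index(normal_rating)
            return RATINGS[min(index + 1, len(RATINGS) - 1)]
        return "NS"                # few users fluent: always NS

    assert grade_english_result("HS", "many") == "S"
    assert grade_english_result("NS", "many") == "NS"  # NS stays NS
    assert grade_english_result("HS", "few") == "NS"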

4. Redirected Pages

If the result displayed URL gets redirected to a different URL, then you should grade the page you’re redirected to as if that were
the result.

5. Apps

When a user clicks these results, it takes them to an app store (usually the Apple App Store) or opens the app if present on the device.

• Scenario 6 below refers to cases where the query is the name of a well-known app, that is, a service that is best known as an app. A well-known app is not the same thing as a well-known company! Examples: Instagram, Spotify, and Candy Crush.

• Scenario 10 below refers to cases where the query is a business and the result is an app "regularly used to interact with that business," meaning the app is a common way that customers or clients perform the ordinary tasks they need to do business with that company.



Just because a company has an app does not mean that it’s regularly used to interact with that business.

For example, the query “dell” refers to the name of a computer company. But their app “Dell@Retail 2019” is described as “a
chance for our global retail partners to immerse themselves in the design, performance, and vision driving Dell’s innovation.”
This app is NOT used regularly by Dell’s customers and should NOT be graded HS.

• If the query is the name of a bank, then the app should allow the user to perform mobile banking tasks.
• If the query is the name of a restaurant chain, then the app should allow the user to order food at that restaurant.
• If the query is the name of an airline, then the app should allow the user to make reservations, choose their seat
assignment, and check flight status.
• If the query is the name of a retail chain, then the app should allow the user to browse and purchase items sold by that
chain.

Example Scenarios

Scenario 6. Query: App Query. Result: Official app. Description: The query is the name of a well-known app; the result is the app with that name. Rate it as: Highly Satisfying.
Examples: a. Query is "facebook", result is the Facebook app. b. Query is "calculator," result is the built-in Calculator app.

Scenario 7. Query: App Name. Result: Variant of app. Description: The query is the name of an app; the result is a variant version (e.g., "Pro" or "Lite") of or sequel to that app, or another complementary app from the same vendor. Rate it as: Satisfying.
Example: a. Query is "candy crush saga," result is the app store result for "candy crush friends," a newer game in the same series.

Scenario 8. Query: App Name. Result: Google Play result. Description: The query is the name of an app; the result is that app on the Google Play store website. Since users are conducting their search on an Apple iOS device, we can assume most of them do not want an Android app as a result. Rate it as: Somewhat Satisfying.
Example: a. Query is "slickdeals", result is https://play.google.com/store/apps/developer?id=Slickdeals&hl=en

Scenario 9. Query: App Description. Result: App performing that function. Description: The query is a description of a type of app or a function that an app needs to perform; the result is an app (or web app) that performs that function. Rate it as: Satisfying.
Examples: a. Query is "currency converter," result is the "My Currency Converter" app. b. Query is "time in different countries," result is https://www.timeanddate.com/worldclock/

Scenario 10. Query: Business. Result: App regularly used to interact with business. Description: The query is the name of a business; the result is an app regularly used to interact with that business. Rate it as: Highly Satisfying.
Examples: a. Query is "b of a," result is the Bank of America mobile banking app. b. Query is "dominos," result is the Domino's Pizza app, which allows users to place orders.

Scenario 11. Query: Company/Product/Named Entity. Result: Related site/video/app. Description: The query is the name of the entity; the result is not their official website, but is a site, page, video, or app related to their business. For example, this might be a 3rd-party site about that company or its products, or a site for a competing product or service. Rate it as: Somewhat Satisfying.
Examples: a. Query is "zillow", result is the video "Living Large in a Tiny Home" from Zillow's YouTube channel. b. Query is "sonicare" (brand of electric toothbrush), result is the website for Oral-B (a competing brand of electric toothbrush). c. Query is "billy idol" (singer), result is the wikipedia page for Generation X, a band from the 1970s he was in before he became famous.


6. News

News articles usually have the word “News” prepended to them. They are specific web results that link to news websites.

• The relevance grade for a news article depends in part on the amount of time between the date the search was done and
the date of the article.

• The search date is shown in the result preview itself.

• Keep in mind validity flags (Inappropriate, Wrong language, and Content Unavailable).

• Just because a news story mentions an entity does not mean it's about that entity. If the entity is not a primary topic of the
story, the article is not about the entity.

◦ Example: Query is "starbucks" and result is a news article about a man who died in a traffic accident. The article
mentions the fact that the man worked at Starbucks, but his death had nothing to do with the company or the fact
that he worked there. This is NOT a news article about Starbucks, so Scenario 14 below does not apply.

• News items may be Highly Satisfying, but keep in mind that one news organization (even one reporter) may write several stories about the same event, and one person may only like stories from Fox News while another prefers MSNBC. For these reasons, we can't usually say that a given news story is one that almost everyone wants to see.

However, if the article is timely, accurate, well written, and highly relevant to the query, and comes from a well-known and established (in its respective locale and geographical area) news source, the result may be HS.



Current Event:
• Timely Article (up to 3 months older than the search date): either HS, S, or SS if it's about the query topic.
• Stale Article (more than 3 months older than the search date): may never be graded better than SS, even if it's about the query topic.

Historical Event:
• Time sensitivity does not impact the relevance grade of the results for these types of queries. Examples of historical events are the Notre Dame fire, the Harry and Meghan wedding, the Sandy Hook shooting, Pope Benedict resigning, etc.

Note 4: You might see articles with dates in the future! For these rare occurrences, grade them the same way as a timely article, as long as the date is not more than 3 months newer than the search date. If the date is more than 3 months newer, flag the result as Content Unavailable.
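Taken together, the timeliness rules and Note 4 reduce to a simple date comparison. A minimal sketch in Python (hypothetical helper; the 3-month window is approximated as 90 days):

    from datetime import date, timedelta

    THREE_MONTHS = timedelta(days=90)  # approximation of the 3-month window

    def classify_news_date(article_date: date, search_date: date) -> str:
        """Apply the 3-month timeliness rule for current-event news results."""
        if article_date > search_date + THREE_MONTHS:
            return "Content Unavailable"      # future-dated by more than 3 months
        if article_date < search_date - THREE_MONTHS:
            return "Stale (SS at best)"
        return "Timely (HS/S/SS possible)"

    print(classify_news_date(date(2022, 7, 28), date(2022, 7, 29)))  # Timely
    print(classify_news_date(date(2017, 2, 5), date(2024, 1, 8)))    # Stale

Remember that this applies to current events only; historical-event queries are not penalized for age.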

The following chart contains examples of these news sites for en locales. Note: this list is not exhaustive. There may be news
sources that are considered high quality but are not represented below.

For all locales: refer to Satisfaction Principles: Source Quality to help you judge whether or not a news source would be
considered a high-quality, trusted source.



Well-Known and Established News Source Examples (en Locales)

en_us: wsj.com, cnn.com, foxnews.com, reuters.com, washingtonpost.com, npr.org, bloomberg.com, bbc.com

en_gb: telegraph.co.uk, bbc.co.uk/news, independent.co.uk, theguardian.com, news.sky.com, bbc.com

en_au: businessinsider.com.au, news.com.au, theage.com.au, theguardian.com.au, abc.net.au, 9news.com.au, smh.com.au

en_ca: huffingtonpost.ca, globalnews.ca, thestar.com, ctvnews.ca, cbc.ca/news, theglobeandmail.com, thecanadianpress.com, nationalpost.com

Example Scenarios

Scenario 12
If the query is: News about a named entity or event
And the result is: A timely, relevant news article from a high quality news source
Description: Query is asking for news about an event or named entity; result is a relevant news story from a high quality news source. The result must be timely, relevant, and from a high quality site to receive an HS rating.
Grade as: Highly Satisfying
Examples:
a. Query is "premier league news," searched 29 July 2022. Result is a BBC News article, “Why Premier League teams are flocking back to Asia,” dated 28 July 2022.

Scenario 13
If the query is: Knowledge Term or “Learn About” Query
And the result is: News
Description: Query is a knowledge term or request to learn about a subject; result is relevant and timely news about that subject.
Grade as: Highly Satisfying
Examples:
a. Query is “ebola,” result is New York Times news story “Ebola Outbreak in Congo Is Declared a Global Health Emergency,” published the same day the search was performed.

Scenario 14
If the query is: Named Entity
And the result is: News
Description: Query is a named entity and there is something highly topical about the entity that people are searching for.
Grade as: Highly Satisfying
Examples:
a. Query is "Jeffrey Epstein," searched on January 11, 2024. Result is the news article "Last Batch of Unsealed Jeffrey Epstein Documents Released," dated January 10, 2024 (https://www.nbcnews.com/news/us-news/last-batch-unsealed-jeffrey-epstein-documents-released-rcna132936). In the recent news (as of January 5, 2024), Jeffrey Epstein related documents have been released; hence a high quality, timely news article about the document release might be HS (and comparable to his wikipedia entry). However, if it's not topical, perhaps a month old, the news article might be S.

Scenario 14 (continued)
Description: Query is a named entity; result is an authoritative page (other than the official online presence) providing news about that entity.
Grade as: Satisfying
Examples:
a. Query is “facebook,” result is news story “Facebook agrees to pay FTC $5 billion fine for various privacy violations,” dated the same day the search was performed.

Scenario 15
If the query is: Named Entity or Event
And the result is: Stale but valid news story
Description: Query is the name of an event or named entity; result is a news story about an earlier event or early news about the entity. The news story must still be valid.
Grade as: Somewhat Satisfying
Examples:
a. Query is “super bowl news,” result is a news story “Patriots Come From Behind to Defeat Falcons in Super Bowl LI.” The story is still accurate, but it describes something that happened in 2017, not in the most recent or upcoming Super Bowl.



7. Maps

The relevance of Maps results depends in part on the distance from the user.

• You should check to see if the info card has distance displayed. If not, this result must be flagged as Content Unavailable
(and graded as Not Satisfying if working in Try Rating).

• Queries with a map intent often have a distance qualifier e.g. "nearest", "closest", "near me”.

• Such queries often relate to businesses one must physically go to, e.g., gas stations or cinemas.

1. Grade on what is visible: only use what is in the title and description to grade. Do not grade NS just because clicking the result takes you nowhere or to the wrong place.

• Note: at times a query will return multiple possible Maps results. In these cases, assign the grade based on the first result
only.

2. “Permanently closed”: You might see this phrase in the card for a business. We still surface these results because knowing whether a business is permanently closed or temporarily inactive is important. Lower the rating of a "permanently closed" result by one grade if a similar or same business is open and nearby; otherwise apply no penalty.

3. Distant results are not always NS. For example:

• People looking for expensive, rarely purchased items (cars, furniture, etc.) are generally willing to travel longer distances to find the right one than people looking for inexpensive, common items (e.g., a cup of coffee). So if the query is “Lexus dealer,” a result 30 miles away might be S (or even HS if it's the closest match), while if the query is “donuts,” it would be NS.

• People living in sparsely populated rural areas are generally willing to travel longer distances than people in cities. If the query “restaurants” is issued in Wilsal, MT (population 237), then a result 39 miles away in Bozeman (population 39,860) might be S. But if the same query were issued in New York City, a result 36 miles away in Greenwich, CT would be NS.

4. Keep in mind Intent and Distance! For some queries, users are looking for a Maps result. For other queries, they aren't. If a
Maps result is shown for a non-Maps intent query, then grade it as NS. Use the distance to guide you. If a Maps result is very
far away, that’s often a sign that the user was not looking for a map.

• Query is "prime video" and result description is: "prime time video, 2511 springs rd ne, hickory, nc 28601- distance: 529
mi”

• Query is "Lakers" and result description is: "great lakes brewing company, 2516 market ave, cleveland, oh 44113 - distance: 2,165 miles"

Type: Business
• Maps result is correct and is the closest one: Highly Satisfying
• Maps result is correct and near the user, but is not the closest one: Satisfying
• Maps result is correct, and is still accessible to the user but is not close: Somewhat Satisfying
• Maps result is correct but is too far away: Not Satisfying

Type: Point of Interest (e.g., cities, parks, landmarks, monuments)
• Maps result is correct: Highly Satisfying
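
As a rough illustration of the Business rows above, here is a minimal Python sketch. The distance thresholds are hypothetical stand-ins for "near the user" and "still accessible"; in practice those judgments shift with business type and rural vs. urban context, as described in point 3 above.

def grade_business_maps_result(is_correct: bool, is_closest: bool,
                               distance_miles: float) -> str:
    # Grade a Business-type Maps result per the table above.
    if not is_correct:
        return "NS"
    if is_closest:
        return "HS"              # correct and the closest one
    if distance_miles <= 5:      # assumed stand-in for "near the user"
        return "S"
    if distance_miles <= 45:     # assumed stand-in for "still accessible"
        return "SS"
    return "NS"                  # correct but too far away

# Example: correct chain branch 17 miles away, not the closest one
print(grade_business_maps_result(True, False, 17))  # "SS"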



Example Scenarios

Scenario 16
If the query is: Maps Query
And the result is: Closest Map
Description: Query is looking for a specific location/business/institution/point of interest, or the closest example of a chain business or type of business, and the result showed that location on a map. Queries with a map intent often have a distance qualifier, e.g. "nearest", "closest", "near me". Such queries also often relate to businesses one must physically go to, e.g. gas stations, cinemas.
Grade as: Highly Satisfying
Examples:
a. Query is “1234 market street sf”; result is a Map for that exact address.
b. Query is “new york public library”; result is a Map to that location.
c. Query is “larry and joe’s”; result is a Map to a restaurant with that name in the same town where the user is located.
d. Query is “closest lowe’s”; result is a Map showing the Lowe’s store location closest to the user’s location.
e. Query is “starbucks”; result is a Map showing the closest Starbucks branch.

Scenario 17
If the query is: Chain Business
And the result is: Secondary Maps Result
Description: Query is the name of a chain business; result is a Map showing a nearby branch of the business, but not the closest one.
Grade as: Satisfying
Examples:
a. Query is "dunkin" [in location Sunnyvale, CA]; the map result presents a San Jose, CA location, 6.8 miles from the user.

Scenario 18
If the query is: Chain Business/Type of Business
And the result is: Moderately Distant Maps Result
Description: Query is the name of a chain business or a type of business; result is a Map showing a branch of the business that is not nearby, but still accessible (perhaps up to an hour’s drive away).
Grade as: Somewhat Satisfying
Examples:
a. Query is "starbucks"; user is in San Jose, CA; result is a map result for Starbucks, 17 miles away in Fremont, CA.



Scenario 19
If the query is: Type of Business
And the result is: Maps or Multiple Official Websites
Description: Query is a type of business, or a product or service; result is a map entry or an official website for a business of that type or one that offers that product/service. In the Maps case, the business must be nearby.
Grade as: Satisfying
Examples:
a. Query is “thai food” [in location Cambridge, MA]; result is http://www.thesimilans.com, official site for a local Thai restaurant.
b. Query is “thai restaurant”; result is a nearby thai restaurant.

8. Web Video

• If a query specifically refers to a particular video (e.g., “lemonade official video,” “stepanov elements of programming
lecture”), the desired result should be graded as Highly Satisfying regardless of its popularity.

• For other results, and for more general queries where many different video results could satisfy the user's need (e.g., “guitar
lesson”), then popularity may factor into your decision; you may want to grade a video with millions of views higher than a
similar one with only a handful.

• When deciding on your grade, think about whether video results are what the user is looking for when typing the query.

• You are not required to watch the entire video to arrive at a rating.



9. Dictionary, Stocks, Weather, Knowledge/Answers, Sports, and “Learn About” Queries

Grade these cards based on what is visible. The grader cannot click on them, but the user is provided self-contained snippets of information that can often be interacted with to learn more (e.g., the Stocks card opens up to show historical price graphs).

• Dictionary: is the user seeking a definition or a concept? If the card precisely answers the need, this is Highly Satisfying. In all cases it must be the correct interpretation for that word.

• Stocks: check for correct stock symbol and presence of price.

• Weather: the result’s location should match the location specified in the query (e.g. “weather boston”), or the user’s
location if location is not mentioned in query.

• Answers: grade on what is visible. If the query is an explicit question, see Scenario 20 below.

Note 5: For all these cards, ensure your browser window is expanded. A small browser window causes the cards to resize, potentially hiding information that would have been shown to the user, and this might affect your rating.

Additionally, you must still do web research to ensure correctness and relevance of information shown in the card.

To grade web results such as Scenarios 21 and 23 below, you must click on the web result and verify whether or not the
requested information is available in order to properly grade the result.



Example Scenarios

Scenario 20
If the query is: Exact Question
And the result is: Explicit Correct Answer
Description: Query is asking for a specific piece of information that has a simple right answer, and the result showed that information directly without the need for further user action.
Grade as: Highly Satisfying
Examples:
a. Query is “when did wwi end,” result is a direct answer or info card that says “November 11, 1918”.
b. Query is “dodgers score,” result is a sports info card that shows the current score of the Dodgers’ baseball game in progress, or (if no game is in progress) the final score of the most recent game they played.
c. Query is “msft quote,” result is an info card showing the latest stock price for Microsoft (which has the stock symbol MSFT).
d. Query is “jet blue 334,” result is an info card showing the current status of that airline flight.
e. Query is “define attenuated,” result is an info card showing the definition of that word.
f. Query is “weather boston”, result is an info card showing current weather for that city.

Scenario 21
If the query is: Exact Question
And the result is: Embedded Correct Answer
Description: Query is asking for a specific piece of information with a simple right answer, and the result contains that answer, but the user has to take an action (e.g., follow the link to the destination page and read it) to get the answer.
Grade as: Satisfying
Examples:
a. Query is “barack obama age,” result is https://en.wikipedia.org/wiki/Barack_Obama
b. Query is “cambridge library hours,” result is https://www.cambridgema.gov/cpl/hoursandlocations

Scenario 22
If the query is: Exact Answer Query
And the result is: Missing or Incorrect Answer
Description: Query is asking for a specific answer; result is an info card that correctly identifies what the query is asking, but then fails to give that answer.
Grade as: Not Satisfying
Examples:
a. Query is “dmx real name,” result is an info card that says “dmx birth name: dmx” (which is incorrect).



Scenario 23
If the query is: Knowledge Term or “Learn About” Query
And the result is: Wikipedia or Other Authoritative Reference
Description: Query is a knowledge term or general request to learn about a subject; result is the wikipedia page for that term, a page from another authoritative reference, or a knowledge card. Common for medical queries. Note that if “X” is a knowledge term, queries such as “what is X?” or “tell me about X” still count as knowledge term queries.
Grade as: Highly Satisfying
Examples:
a. Query is “linguistics”; result is https://en.wikipedia.org/wiki/Linguistics
b. Query is “what causes diabetes,” result is a page about that disease from the Mayo Clinic website, https://www.mayoclinic.org/diseases-conditions/diabetes/symptoms-causes/syc-20371444
c. Query is “utilitarianism,” result is a Dictionary info card giving the definition of the term.
d. Query is “challenger disaster” (historical event); result is https://en.wikipedia.org/wiki/Space_Shuttle_Challenger_disaster



10. Web Results (also called Suggested Web Sites)

Please click on the thumbnail and grade the destination page (after redirects).

• If the web search results show results for a corrected or autocompleted version of the query, you should grade your result
as if the user typed the corrected or completed query.

Example Scenarios

Scenario 24
If the query is: Type of Business
And the result is: Maps or Multiple Official Websites
Description: Query is a type of business, or a product or service; result is a map entry or an official website for a business of that type or one that offers that product/service. In the Maps case, the business must be nearby.
Grade as: Satisfying
Examples:
a. Query is “thai food” [in location Cambridge, MA]; result is http://www.thesimilans.com, official site for a local Thai restaurant.
b. Query is “thai restaurant”; result is a nearby thai restaurant.

Scenario 25
If the query is: Type of Business/Organization
And the result is: Official Website of More Distant Instance
Description: Query is a type of business or organization; result is the official website of an instance of this business or organization that is not nearby, but is still accessible.
Grade as: Somewhat Satisfying
Examples:
a. Query is “vietnamese restaurant” [in Cupertino, CA]; result is https://www.slanteddoor.com, the official site of a particular vietnamese restaurant in San Francisco, CA, 50 miles from the user.



Scenario 26
If the query is: Named Entity
And the result is: Official Online Presence
Description: Query is a named entity; result is an official online presence for that entity, if it has one.
Grade as: Highly Satisfying
Examples:
a. Query is “facebook,” result is Facebook’s official website, facebook.com
b. Query is “taylor swift,” result is the singer’s official website, taylorswift.com
c. Query is “charli d’amelio” (social media personality/vlogger), result is her TikTok channel.
d. Query is “joe biden,” result is his Twitter profile https://twitter.com/JoeBiden
e. Query is “empire falls book,” result is the publisher’s official page for the book, https://www.penguinrandomhouse.com/books/159148/empire-falls-by-richard-russo/9780375726408/
f. Query is “captain fantastic,” result is the official web site for the movie, https://bleeckerstreetmedia.com/captainfantastic



Scenario 27
If the query is: Named Entity
And the result is: Wikipedia or Other Authoritative Reference
Description: Query is a named entity; result is the wikipedia page for that entity, a page from another authoritative reference, or a knowledge card about that entity.
Grade as: Highly Satisfying
Examples:
a. Query is “taylor swift” (singer), result is https://en.wikipedia.org/wiki/Taylor_Swift
b. Query is “nope” (2022 movie), result is https://en.wikipedia.org/wiki/Nope_(film)
c. Query is “iliad” (ancient epic poem), result is https://en.wikipedia.org/wiki/Iliad
d. Query is “the school of athens” (Renaissance painting by Raphael), result is https://en.wikipedia.org/wiki/The_School_of_Athens
e. Query is “marie curie” (Nobel-prize-winning scientist); result is https://en.wikipedia.org/wiki/Marie_Curie
f. Query is “angkor wat” (ancient temple complex in Cambodia); result is https://en.wikipedia.org/wiki/Angkor_Wat
g. Query is “aristotle,” result is a page about the philosopher from the Stanford Encyclopedia of Philosophy
h. Query is “jurassic world dominion,” result is https://www.imdb.com/title/tt8041270/, the IMDB page about that movie.
i. Query is “mike trout,” result is the page of this player’s official statistics in Baseball Reference, https://www.baseball-reference.com/players/t/troutmi01.shtml

Scenario 28
If the query is: Company/Product/Named Entity
And the result is: Related Site/Video/App
Description: Query is the name of the entity; result is not their official website, but is a site, page, video, or app related to their business. For example, this might be a 3rd party site about that company or its products, or a site for a competing product or service.
Grade as: Somewhat Satisfying
Examples:
a. Query is “zillow”, result is the video “Living Large in a Tiny Home” from Zillow’s YouTube channel.
b. Query is “sonicare” (brand of electric toothbrush), result is the website for Oral-B (a competing brand of electric toothbrush).
c. Query is “billy idol” (singer), result is the wikipedia page for Generation X, a band from the 1970s he was in before he became famous.



Scenario 29
If the query is: Knowledge Term or “Learn About” Query
And the result is: Wikipedia or Other Authoritative Reference
Description: Query is a knowledge term or general request to learn about a subject; result is the wikipedia page for that term, a page from another authoritative reference, or a knowledge card. Common for medical queries. Note that if “X” is a knowledge term, queries such as “what is X?” or “tell me about X” still count as knowledge term queries.
Grade as: Highly Satisfying
Examples:
a. Query is “linguistics”; result is https://en.wikipedia.org/wiki/Linguistics
b. Query is “what causes diabetes,” result is a page about that disease from the Mayo Clinic website, https://www.mayoclinic.org/diseases-conditions/diabetes/symptoms-causes/syc-20371444
c. Query is “utilitarianism,” result is a Dictionary info card giving the definition of the term.
d. Query is “challenger disaster” (historical event); result is https://en.wikipedia.org/wiki/Space_Shuttle_Challenger_disaster

Scenario 30
If the query is: Exact Question
And the result is: Embedded Correct Answer
Description: Query is asking for a specific piece of information with a simple right answer, and the result contains that answer, but the user has to take an action (e.g., follow the link to the destination page and read it) to get the answer.
Grade as: Satisfying
Examples:
a. Query is “barack obama age,” result is https://en.wikipedia.org/wiki/Barack_Obama
b. Query is “cambridge library hours,” result is https://www.cambridgema.gov/cpl/hoursandlocations



11. Web Images

A group of web images should be graded as a single result. Check to see if all the images have the following properties:

• Image displays correct subject. The image must actually show the subject of the query. For example, if the query is
“dodecahedron,” the image must actually show that geometric figure and not some other one. Missing images (or ones
that do not load) do not have this property.

• Subject clearly shown. All images in the set must clearly show the subject of the query. The subject should not be
blocked, out of focus, too far away, or otherwise difficult to see clearly.

• Subject is focus of image. In cases where the image includes multiple people or objects, it should be clear who or what
is the subject of the query. (For example, if the query is “Joe Biden,” it’s fine to have people in the background of a picture
of President Biden giving a speech, but it’s not fine to have a picture of Presidents Biden and Macron shaking hands.)

• Image shows representative version of subject. For example, if the query is the name of a currently popular actor, the
image should show that person as they look today (or how their character looks in a currently popular movie), not how
they looked many years ago. If the query is the name of a famous person from the past who is no longer alive, the image
should show them as they were best known. For example, if the query is “Richard Nixon,” a picture should show him
during the time he was U.S. president, not 20 years later when he was near the end of his life.

• No duplicates. The images in the set should all be different.

If ALL the images have all of the above properties, grade the result Highly Satisfying. Otherwise, downgrade the results as shown
in the following table.



Rule 1
If: All images exhibit all properties.
Then: Grade as Highly Satisfying.
Example: Query is “David Beckham”, result is the set shown above. It has all the desired properties, so you would grade it Highly Satisfying.

Rule 2
If: All but 1 or 2 images in the set exhibit all properties.
Then: Grade as Satisfying.
Example: Query is “taffy brodesser-akner” (an author); result set is above. Two of the images in the set are problematic; one shows part of a poster for an event featuring the author, and another shows her with another person, both partly cut off. Neither of these violates property #1, because both attempt to represent the author and not something else that would confuse or mislead the user, like a picture of a different author. But each violates at least one of properties 2-4. Overall you would grade this Satisfying because all but two images have all the desired properties.

Rule 3
If: Up to half of the images exhibit all properties.
Then: Grade as Somewhat Satisfying.
Example: Query is “onion”; result set is shown above. Three images are repeated and there is an image of a man chopping onions. He is the subject of the photo, not the onion. Only 4 out of the 8 images exhibit all properties.

Rule 4
If: Any image displays an incorrect subject.
Then: Grade as Not Satisfying. If images are missing, mark as Content Unavailable and grade as Not Satisfying (if you use the Tag grading platform, note that there is no need to rate NS when Content is Unavailable).
Example: Query is “dodecahedron” (a geometric shape); result set is shown above. Neither the second image nor the last image in this set is a dodecahedron. Therefore you would grade this Not Satisfying.
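
The counting rules above can be summarized in a short Python sketch, assuming each image has already been checked against the five properties. Treating every remaining case (more than two problem images) as Somewhat Satisfying is an assumption here; the rules leave that boundary to grader judgment.

def grade_image_set(total: int, num_exhibiting_all: int,
                    any_incorrect_subject: bool, any_missing: bool) -> str:
    # Grade a set of web images as a single result, per Rules 1-4 above.
    if any_missing:
        return "Content Unavailable"  # Rule 4: then NS on Try Rating
    if any_incorrect_subject:
        return "NS"                   # Rule 4
    problems = total - num_exhibiting_all
    if problems == 0:
        return "HS"                   # Rule 1: all images exhibit all properties
    if problems <= 2:
        return "S"                    # Rule 2: all but 1 or 2 images
    return "SS"                       # Rule 3 (assumed to cover remaining cases)

# Example: the "onion" set, where only 4 of 8 images exhibit all properties
print(grade_image_set(8, 4, False, False))  # "SS"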



Example Scenario

Scenario 31
If the query is: Visually Distinctive Entity
And the result is: Web Image
Description: Query is (or asks about) a visually distinctive entity, and the result is a high quality web image set showing that entity.
Grade as: Highly Satisfying
Examples:
a. Query is “nelson mandela,” result is a set of images of him that exhibits all the properties above.

Scenario 32
If the query is: Visually Distinctive Entity
And the result is: Web Image
Description: Query is (or asks about) a visually distinctive entity, and the result is a high quality web image set showing that entity, but the set shows images across a wide span of time.
Grade as: Satisfying
Examples:
a. Query is “José Mourinho,” result is a set of images that meets all the requirements above but shows him across a 10-15 year time span.



12. Product Searches

If the user is searching for a product and the result is a page where the product can be purchased, but the item is unavailable or
out-of-stock, you may want to lower the grade in certain cases:
• If the query describes something very specific, the user usually wants only that item. Showing the product page for the
item is the best you can do, even if the item is out of stock, so that result should not be penalized. Example queries:
◦ “our missing hearts by celeste ng” [a specific book; user doesn’t want any book]
◦ “iPhone 14 pro max 512gb” [a specific model and configuration of a product]
• If the query describes something general, or where there are reasonable substitutes, the user would probably rather
see an in-stock substitute rather than an out-of-stock exact match. So you should lower the grade of the out-of-stock
result. Example queries:
◦ usb to usb-c adapter [there are many different, equally good ones from different brands]
◦ bounty paper towels 12-pack [the user might be just as happy with two 6-packs of the same brand]
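
As a rough sketch of this adjustment, consider the following; the one-step downgrade and the boolean "very specific" flag are illustrative assumptions, since the guidelines leave both calls to grader judgment.

GRADES = ["NS", "SS", "S", "HS"]

def adjust_for_stock(base_grade: str, query_is_very_specific: bool,
                     item_in_stock: bool) -> str:
    # Lower the grade of an out-of-stock product page only when the query
    # is general enough that an in-stock substitute would serve the user.
    if item_in_stock or query_is_very_specific:
        return base_grade            # exact-item queries: no penalty
    i = GRADES.index(base_grade)
    return GRADES[max(0, i - 1)]     # assumed one-step downgrade

# Example: "usb to usb-c adapter" page, otherwise S, but out of stock
print(adjust_for_stock("S", query_is_very_specific=False,
                       item_in_stock=False))  # "SS"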

Example Scenarios

Scenario 32
If the query is: Company/Product/Named Entity
And the result is: Related Site/Video/App
Description: Query is the name of the entity; result is not their official website, but is a site, page, video, or app related to their business. For example, this might be a 3rd party site about that company or its products, or a site for a competing product or service.
Grade as: Somewhat Satisfying
Examples:
a. Query is “zillow”, result is the video “Living Large in a Tiny Home” from Zillow’s YouTube channel.
b. Query is “sonicare” (brand of electric toothbrush), result is the website for Oral-B (a competing brand of electric toothbrush).
c. Query is “billy idol” (singer), result is the wikipedia page for Generation X, a band from the 1970s he was in before he became famous.

Scenario 33
If the query is: Product
And the result is: Reputable Vendor
Description: Query is the name of a product (which may be a media item such as a book, movie, song, etc.); result is a page from a well-known site where the item can be purchased, downloaded, or streamed.
Grade as: Satisfying
Examples:
a. Query is “jbl bluetooth speaker,” result is a page of matching items from electronics retailer Best Buy.
b. Query is “empire falls book,” result is Amazon’s detail page for that book, https://www.amazon.com/Empire-Falls-Richard-Russo/dp/0375726403
c. Query is “captain fantastic,” result is the iTunes store page for that movie, https://itunes.apple.com/us/movie/captain-fantastic/id1127934488
d. Query is “taylor swift lover album,” result is the Spotify page to stream that album, https://open.spotify.com/album/3rYkgtFOo9AlPaeKTtn6pM

13. Other Query Types

Some types of results can never be Highly Satisfying.

Scenario 34
If the query is: General Query
And the result is: Overly Specific Result
Description: Query is the name of a general concept or event (such as a TV show); result is about a specific instance of that concept or event (such as a particular episode of that show).
Grade as: Somewhat Satisfying
Examples:
a. Query is “dogs”, result is the wikipedia page for the dog breed Beagle.
b. Query is “suits” (a TV show that ran for 9 seasons), result is https://www.peacocktv.com/watch-online/tv/suits/8003089882869075112/seasons/5, a page where viewers can stream the 5th season.

Scenario 35
If the query is: Any Query
And the result is: Flagged During Validation Step
Description: For Try Rating: the result was flagged as Wrong Language, Content Unavailable, or Inappropriate during the validation step. For Tag: once one of these flags is chosen, no further grading (for that task) is needed or possible.
Grade as: Not Satisfying
Examples:
a. Query is “uniqlo”; user is in en-US; result is https://www.uniqlo.com/jp/ja/, which is in Japanese and was flagged as Wrong Language.

Scenario 36
If the query is: Any Query
And the result is: Off-Topic Result
Description: Result is not about the query topic. Note that in some cases the URL may appear to be about the query, but clicking through shows that the destination page is not related.
Grade as: Not Satisfying
Examples:
a. Query is “samsung tv”, result is a web page for a Samsung washing machine.
b. Query is “obama age”, result gives the age of Joe Biden.
c. Query is “Messi goals” (Messi is a soccer player), result is total goals by Barcelona (his team).
d. Query is “target stores”, result is about an Ace Hardware store location.

Scenario 37
If the query is: Any Query
And the result is: Result Fails to Load/Inaccessible
Description: Result is a blank page, a parked domain, a 404 error, something unavailable in the user’s country, or anything else where the content has been removed or is inaccessible. This type of result should be flagged as Content Unavailable.
• If you are grading on Try Rating, you should then mark this as Not Satisfying.
• If you are grading on the Tag platform, once a CU flag is chosen no further grading (for that task) is needed or possible.
Grade as: Not Satisfying
Examples:
a. Query is “bisq restaurant cambridge”, result is http://www.bisqcambridge.com
b. Query is “brokerbot”; result is http://brokerbot.com



2.5 Review and submit

After you have assigned a grade, review your work for errors. Ensure you have not made one of the common grading mistakes
discussed below.

Common Grading Mistakes

Failing to Use Web Search

1. Misunderstanding Query Meaning. The query may be a common word that you think you know. But the web search may
show that the primary meaning is something entirely different.

• Example: Query is "canada goose"; result is the wikipedia page about that kind of bird. If you had not heard of the Canada
Goose clothing brand, you might assume that the bird page is what almost all users would want to see. But by looking at
the web search results, you can tell that this is not the case.

2. Misunderstanding Dominant Interpretation. This is a slight variation of the previous error. Based on your personal
experience, you may know that there is more than one interpretation of the query, but you may not realize that one is dominant.

• Example: Query is "jaguar"; result is the home page for the car company. If you believe the animal is the dominant
interpretation, you would downgrade the car company result. But by doing the web search, you can see that the car
company is actually the dominant interpretation, accounting for all but one of the results on the first page of both Google
and Bing results.

3. Falsely Assuming Dominant Interpretation. If you have heard of a result, you may assume that it's the dominant
interpretation. But this is not always true.

• Example: Query is "u of m scholarships," result is a page about scholarships at the University of Michigan. A grader who knew nothing about the subject might conclude that this is a great result, and rate it Highly Satisfying. But looking at the web results shows that the query has no dominant intent. It might be referring to the University of Minnesota, or the University of Manitoba, or many other things. Therefore the grade cannot be HS.

⚠ Do not use web search ranking to determine grade! The only purpose of looking at the web search (Google and Bing)
results is to make sure you understand the possible meaning(s) of the query, and which meaning is dominant. You should never
use the ranking on the search result page to decide your grade. In other words, you should never think (for example) "Google
says this is the #1 result, so it must be Highly Satisfying," or "Bing puts this at the bottom of the page, so it must not be that
good." Once you understand the query, only these guidelines and your judgment should determine the grade.

Failing to Visit Destination Page

Another class of mistakes can occur when the grader fails to visit the destination page of a web/news result, and in particular, if
they try to grade a web/news result based only on the URL and/or snippet.

1. Missing Error Condition. The URL and/or snippet may make this look like a perfect result ‒ perhaps the home page of a
company. But if you actually clicked on it, you'd discover that the page does not load, or redirects to some entirely unrelated
page.

• Example: Query “valco shopping center,” result is www.valcoshoppingcenter.com. If you click on the result, you’ll be
taken to an advertising page that has nothing to do with the shopping center (which is out of business).

2. Incorrect Page Owner Assumption. The URL may be a perfect match for the name of a company or product you're
familiar with. But if you visited the destination page, you'd see that it's actually for an entirely different company with a
similar name.

• Example: Query "american eagle," result is www.americaneagle.com. Since American Eagle is a well-known clothing
brand, you assume the page is the home page of that company. But it isn't. Clicking on the result would have shown that
it's the home page of a web design company, which is not what most searchers are looking for.



Ignoring Time and Place

Many grading mistakes happen when the grader doesn't pay attention to the time or place of the query and/or result.

1. Mismatched Location. Graders usually notice when the user is in one location and the result is a Map to a very distant
location. But they frequently miss the case where the result is a web result for a very distant location.

• Example: User is in Virginia (state in Eastern U.S.), query is "harold's kitchen menu." Result is home page for Harold's
Kitchen and Bar. At first glance, this looks like a Highly Satisfying result. It's a restaurant with a matching name, and the
page shows their menu. But a closer look shows that this restaurant is actually in Richmond, British Columbia, Canada ‒
nearly 3000 miles (5000 km) away from the user. It is extremely unlikely that this was the result the user was looking for
(especially since there is a different restaurant named Harold's Kitchen close to the user's location).

2. Mismatched Date. Graders may notice the date of a news story, but forget to notice the date of the search. Or they may
not notice an implicit date in the content of a web result.

• Example: Query dated 2022 is "presidential election results"; result is a page showing the results of the 2016 U.S.
presidential election. The user was almost certainly looking for the most recent presidential election results, not one from
six years earlier.

Ignoring Conceptual Distance

Some mistakes involve the conceptual distance between the result and what the user was looking for.

1. Too Specific or Too General. Graders sometimes incorrectly give a result a high grade without realizing that it is too
specific or too general.

• Example: Query is "dog," result is wikipedia page about the welsh corgi, a particular breed of dog. This is too specific.

• Example: Query is "new england patriots news," result is home page for a regional sports news network that covers
many different sports teams in New England, not just the New England Patriots. This is too general.



2. Wrong Level of Web Page. Pages on a given web site often form a hierarchy, with a home page for the site, subpages for
different topics, sub-sub-pages, and so on. A common mistake is not to notice that a page is too high or too low in the
hierarchy, compared to what the user is looking for.

• Example: Query is "us passport information"; result is www.state.gov. This page is too high in the hierarchy of this web
site. It is about everything the U.S. State Department does (diplomatic relations, trade policies, etc.), not just passports.

• Example: Query is "us passport information"; result is a page from the U.S. State Department about what to do if your
passport is lost or stolen. This page is too low in the hierarchy of the site. The user never said anything about their
passport being lost or stolen ‒ in fact, we don't even know if the user already has a passport.

3. Ignoring Degrees of Separation. Graders often ignore the principle of degrees of separation. A result that's associated
with the thing the user is looking for is not the same as the thing the user is looking for.

• Example: Query is "chez panisse," result is Yelp's page of reviews for that restaurant. This is a very useful result, but it is
not Highly Satisfying, because it is one degree of separation from what the user was looking for.

Ignoring Relevance Grading Principles

1. Matching Words Instead of Meaning. Graders sometimes forget the principle "Think about meaning, not just matching
words." Just because the query words appear in the result does not mean the result is a good one, and just because the
query words are missing does not mean the result is a bad one.

• Example: Query is "far alone," result is a page containing the inspirational quote "If you want to go quickly, go alone. If
you want to go far, go together." The result contains both query words, but they match only incidentally. It's clear that
this is not what the user was looking for, and in fact the web search results show that "Far Alone" is the name of a song.

2. Ignoring Basic Definitions of Grading Scale. A common mistake is to ignore the basic definitions of each grade and only
look at the specific grading scenarios. The scenarios are meant to illustrate the definitions in different situations, not to
replace them. If you're faced with a grading situation where you don't see a rule that applies, just go back to the definitions:
Is this a result most users would want to see? Etc.



• Example: Query is “el pais” (name of several newspapers, including one in Cali, Colombia and one in Madrid, Spain); user
is in Colombia but result is for a more popular one in Madrid, elpais.com. There’s no rule about matching similarly-named
results in different countries, and the guidance about locale-sensitivity doesn’t exactly address this example. It’s clear
that the Spain result is not what most Colombian users are looking for, but it might be useful to some. By definition, that
means it’s Somewhat Satisfying.

3. Ignoring "Aboutness" in News Stories. When the query is a named entity, news stories about that entity are graded as
satisfying. But just because a news story mentions an entity does not mean it's about that entity. If the entity is not a primary
topic of the story, the article is not about the entity.

• Example: Query is "starbucks" and result is a news article about a man who died in a traffic accident. The article mentions
the fact that the man worked at Starbucks, but his death had nothing to do with the company or the fact that he worked
there. This is NOT a news article about Starbucks, so News: Scenario 2 does not apply.



3. Additional Examples

3.1 Highly Satisfying

1. Query: top english soccer league
Result: Home page of the Premier League, premierleague.com
Rating: HS
Explanation: The Premier League is the top English soccer league. Note that this is a result most users would want to see even though it doesn't use the words "English" or “Soccer.”

2. Query: facebook
Result: facebook.com, Facebook app
Rating: HS
Explanation: Since it's both a company and an app, both of these are "official" results that most users would want to see.

3. Query: olivia rodrigo
Result: Official website for the pop star, oliviarodrigo.com
Rating: HS
Explanation: Almost all users searching for a celebrity would want to see that person's official web site.

4. Query: olivia rodrigo
Result: Wikipedia entry for Olivia Rodrigo
Rating: HS
Explanation: Wikipedia is a high quality source of information about the artist.

5. Query: jane austen
Result: Wikipedia page about the early 1800s author
Rating: HS
Explanation: Wikipedia is a highly satisfying result for any named entity.

6. Query: beat the bomb
Result: Official website, https://beatthebomb.com
Rating: HS
Explanation: Almost all users searching for a business or service would want to see its official web site.

7. Query: french open highlights
Result: https://www.youtube.com/channel/UCF3K1Jf8hjFW8qliei8fQ3A
Rating: HS
Explanation: Result is the official Roland Garros (French Open) YouTube channel. Although there is no specific rule scenario for this case, it clearly satisfies the definition of Highly Satisfying.

8. Query: mountain mike's pizza [user is in Berkeley, California]
Rating: HS
Explanation: Result provides authoritative map information for the closest location of a chain business.

9. Query: how tall is gwen stefani
Rating: HS
Explanation: The info card immediately gives the user all the information they asked for.

10. Query: iphone 11
Rating: HS
Explanation: This info card provides relevant and accurate information, even though it is not the official site for the product.

11. Query: eric stonestreet
Rating: HS
Explanation: All of the images satisfy the properties described in the section on how to grade Web Image results.

12. Query: saw (Note: assume a web search shows the dominant interpretation is the movie)
Rating: HS
Explanation: The official page for the movie. Contains streaming links and descriptions of the movie.

13. Query: saw (Note: assume a web search shows the dominant interpretation is the movie)
Rating: HS
Explanation: A knowledge card for a named entity is Highly Satisfying.

14. Query: wonder woman
Rating: HS
Explanation: The wikipedia page for a named entity is Highly Satisfying.

15. Query: gilmore girls
Rating: HS
Explanation: The wikipedia page for a named entity is Highly Satisfying.

16. Query: mark twain brewery
Rating: HS
Explanation: This is the only business of that name, and the map result is correct and provides useful information about its open status.

17. Query: premier league news [searched on 29 July 2022]
Result: A BBC News article, “Why Premier League teams are flocking back to Asia,” dated 28 July 2022
Rating: HS
Explanation: The news article is timely and about the query topic.

18. Query: armie hammer (user has es-es locale)
Rating: HS
Explanation: Returns information about Armie Hammer. The net worth may be out of date (if the search were made now), but the information has a date attached. If the date 2020 were missing, this would become Satisfying.



3.2 Satisfying

1. Query: beat the bomb
Result: Reviews page for the experience
Rating: S
Explanation: The result is from a trusted website and has a description of the experience and user-submitted reviews. This is a good example of a result that is "one step away": it isn't the official site for the service, but it gives the user helpful information about that service.

2. Query: warriors vs lakers (searched on 06/01/2021)
Result: https://www.youtube.com/watch?v=p478C35sgzA (highlight video of the most recent game on the official NBA channel)
Rating: S
Explanation: We don’t really know what the user wanted. Maybe it’s a video of recent game highlights, but it could also be a schedule of upcoming games between these teams, or an info card with the latest score.

3. Query: instagram.com change pass
Result: Official instructions on how to change an Instagram password
Rating: S
Explanation: The query is asking an implicit question (how to change an Instagram password). This web page has the authoritative answer, but the user has to click on the result to visit the page in order to see the answer.

4. Query: u m sociology (location: Texas)
Result: Home page for the University of Michigan sociology department
Rating: S
Explanation: The web search results from Step 1 show that there is no dominant meaning of the query. The user might have wanted the University of Montana, or the University of Miami, among others (and the user is located far away from both states). So we can't say that almost all users would have wanted this result.

5. Query: indiana tax calculator
Result: Web page containing an Indiana tax calculator, from a financial services company
Rating: S
Explanation: We do not know exactly which taxes the user has in mind, and there are other websites (including an official one from the state government) that offer similar information, so we can't say that almost all users would have wanted this result.

6. Query: bts
Result: Official video of a recent song by the band BTS, https://www.youtube.com/watch?v=WMweEpGlu_U
Rating: S
Explanation: Since there are several possible results for popular BTS songs, and the user didn’t express a preference for a particular song, this is at best Satisfying.

7. Query: plaza suite new york
Result: Official website for the Plaza, a hotel in New York City that has rooms and suites
Rating: S
Explanation: The user could be searching for a suite in the Plaza Hotel, but "Plaza Suite" is also a famous play, often performed on Broadway in New York. There is no dominant meaning.

8. Query: gpa calculator
Result: https://gpacalculator.net
Rating: S
Explanation: There are several GPA calculators, and though this site is credible, users might want to see alternatives. It is impossible to conclude that almost all users would wish to see this result.

9. Query: wonder woman
Rating: S
Explanation: Query is a product (a movie) and the result allows a user to buy/rent the movie. Do not penalize movie/TV show results because they are not clickable.

10. Query: saw
Rating: S
Explanation: Query is a product (a movie) and the result allows a user to buy/rent the movie.

11. Query: 24 hour fitness (user in us/hawaiʻi/honolulu_county/honolulu)
Rating: S
Explanation: There is another 24 Hour Fitness reasonably close (1 mile away) and open.

12. Query: eric stonestreet
Rating: S
Explanation: All of the images satisfy the properties described in the section on how to grade Web Image results, except that the set includes images dating back 10-15 years.



3.3 Somewhat Satisfying

1. Query: steve mcqueen
Result: IMDB page about the director of the 2013 movie 12 Years a Slave
Rating: SS
Explanation: The Google/Bing results from Step 1 show that the dominant meaning of the query is a different person, an actor from the 1960s & 70s with the same name. So this result is not what most users are looking for.

2. Query: bts [searched in 2022]
Result: 2018 video of an interview with the band
Rating: SS
Explanation: A very popular interview with BTS and a TV show host, but not very relevant given that it is several years old, and several newer interviews are available.

3. Query: cao [user is in Florida]
Result: Irish website about applying to undergraduate programs in Ireland
Rating: SS
Explanation: There is a grocery chain in Florida called CAO, so it's unlikely that the user had the Irish website in mind.

4. Query: pitbull
Rating: SS
Explanation: The dominant interpretation is the singer. Furthermore, the dog breed is correctly spelled as two words (“pit bull”), while the singer is spelled as one. So these dog pictures are not likely to be of interest to most searchers.

5. Query: Tim Cook
Rating: SS
Explanation: Most users who do this search are looking for the Apple CEO, not the historian and author.

6. Query: fleeting meaning
Rating: SS
Explanation: Definition of a related word, but not the word the user asked for.



3.4 Not Satisfying

1. Query: nearest subway [user is in Seattle, WA]
Rating: NS
Explanation: The user may either be looking for public transportation or the restaurant. In either case, a result 710 miles away is not satisfying.

2. Query: farmers insurance [user is in Texas]
Result: farmers insurance hawaii
Rating: NS
Explanation: Though the result is from Farmers Insurance, it has information about a different state, so it is not likely what most users would want to see.

3. Query: what year did james watt invent the steam engine
Rating: NS
Explanation: James Watt did not invent the steam engine, which already existed by 1712, before he was born. He did make some important improvements to it in the 1760s and 1770s. This result contains only incorrect or misleading information.

4. Query: tour de france stage 1 (queried on 29 July 2022)
Result: NBC video of stage 18 of the 2021 Tour de France
Rating: NS
Explanation: Result is for a previous year’s Tour de France, and is not even the stage the user asked for.



Overall Preference Rating
Version 2.2

Stop! Overall Preference Rating is a separate task from Search Satisfaction. You will only be working on one of
these tasks at a time.

Do not utilize these guidelines unless you are working on the Overall Preference Rating (Side-By-Side Search
Satisfaction) task. If you are working on the single Search Satisfaction task, do not proceed past this point.

Contents
1. Overall Preference Rating
1.1. OPR Criteria
1.2. When a Side is Missing
1.3. Writing Comments
2. OPR and Comment Examples
2.1. Example 1
2.2. Example 2
2.3. Example 3
2.4. Example 4
2.5. Example 5
2.6. Example 6
2.7. Example 7
2.8. Example 8
2.9. Example 9
2.10. Example 10
2.11. Example 11
2.12. Example 12



1. Overall Preference Rating



In this grading task, you will receive two sets of results presented side by side for the same query, as shown above. Use the
Search Satisfaction guidelines to provide a satisfaction rating for every result. Then choose which side you prefer. This is called
the Overall Preference Rating (OPR). The rating scale is:

• About the Same


• Slightly Better
• Better
• Much Better

1.1 OPR Criteria

Use the following criteria to decide on the OPR:

1. Prefer the side whose results have higher satisfaction grades.


2. If there are multiple results, prefer the side where results with higher satisfaction are ranked higher.
3. If there are multiple results, prefer the side with a more varied result set. This might be a variety of result types (maps, apps,
web pages, etc.), satisfying a variety of meanings of the query.
4. Note that the side with more results is not necessarily better.
5. If you’re having trouble deciding which side is better, choose About the Same.

Note 1: How much these criteria affect OPR also depends on the position of the result. For example, if the satisfaction ratings of the results in position 1 differ, that should have a bigger impact on OPR than if the satisfaction ratings of the results in position 4 differ.
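
To make these criteria concrete, here is a toy Python sketch that scores each side from its per-result satisfaction grades. The 1/position weighting (for Note 1), the tie threshold, and the band boundaries are all illustrative assumptions, and result-set diversity (criteria 3 and 4) is not modeled.

GRADE_VALUE = {"NS": 0, "SS": 1, "S": 2, "HS": 3}

def side_score(grades):
    # Criterion 2 / Note 1: results in higher positions count for more.
    return sum(GRADE_VALUE[g] / (pos + 1) for pos, g in enumerate(grades))

def overall_preference(left, right):
    diff = side_score(right) - side_score(left)
    if abs(diff) < 0.25:             # assumed tie threshold
        return "About the Same"      # criterion 5: when unsure, same
    winner = "Right" if diff > 0 else "Left"
    if abs(diff) < 1.0:
        return winner + " is Slightly Better"
    if abs(diff) < 2.0:
        return winner + " is Better"
    return winner + " is Much Better"

print(overall_preference(["S", "SS", "NS"], ["HS", "S", "SS"]))
# -> "Right is Better"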

1.2 When a Side is Missing


When one side does not have results, the OPR choice has some special guidance. Depending on the product (browser or phone), the following guidelines will automatically be shown in the template:

• Prefer the side WITH results ONLY when the side with results has at least one result graded Somewhat Satisfying, Satisfying or
Highly Satisfying
• Do not choose "About The Same”.

OR

• Prefer the side WITH results ONLY when the side with results has at least one result graded Satisfying or Highly Satisfying
• Do not choose "About The Same”.

In neither case should you choose “About the Same”; in other words, a side with results is never exactly as good as a side without.

1.3 Writing Comments



You might be asked to leave a comment (written in English) explaining why you chose the OPR. These comments are very helpful to the clients of the grading task: they help explain the reasoning behind the rating for complex grading tasks, especially in locales the clients don't understand.

Poor Comment:
“I came to the conclusion that the left side offers more suitable results and therefore should be rated as better.”

Excellent Comment:
“The query intent is Yahoo News and the user is most likely to visit the main page of headlines of the queried website. The 1st and 2nd results are the same on both sides. The rest of the results are similar on both sides, showing some specific pages from the sports, entertainment and weather categories on the Yahoo News website, and there is a slightly better news item among them (R5) on the right than on the left, which is breaking news from the domestic news category. Thus the right side is slightly better due to better relevance and freshness.”

• The comment on the left can be improved by providing reasons why the left is “more suitable”.
• For the comment on the right, the writer states the presumed search need, then describes how the results help meet it and ultimately why they chose one side over the other.

2. OPR and Comment Examples



2.1 Example 1

Query: tdecu
Location: Richwood, TX

LEFT
1. Official TDECU Digital Banking App
2. TDECU Mortgage Simplified App
3. Maps info card with directions to a TDECU branch, 3 miles away
4. Maps info card with directions to a TDECU branch, 4 miles away

RIGHT
1. Official TDECU Digital Banking App
2. TDECU Mortgage Simplified App
3. TDECU.org official website
4. TDECU.org "About Us" page
5. @TDEC twitter page

Rating scale: Much Better | Better | Slightly Better | About the Same | Slightly Better | Better | Much Better

OPR Explanation: The query refers to a credit union (essentially, a bank) with two branches near the user. We can assume the
user wants to either do a bank transaction, go to the bank, or get information about the bank.

The official app, the official website, and the map results for the nearest locations are all Highly Satisfying. The map results
appear on the left but not the right, while the official website appears on the right but not the left.

The left side addresses three search needs (it satisfies people looking for the main app, the mortgage app, and the map) while
the right addresses four (the main app, the mortgage app, the web page, and the Twitter feed). So the right has a slightly more
diverse result set. However, the user gave no indication that they were interested in the Twitter feed, so this is a very unlikely
intent. Since we don’t know whether more people are interested in the map or the official site, the two sides are About the
Same.



2.2 Example 2

Query: diesel
Location: Cambridge, MA

LEFT RIGHT

Diesel Online Store (shop.diesel.com/en/homepage) Diesel Online Store (shop.diesel.com/en/homepage)

DIESEL(ディーゼル)公式オンラインスト (diesel.co.jp) Diesel Fuel - Wikipedia (en.wikipedia.org/wiki/Diesel_fuel)

Diesel Fuel - Wikipedia (en.wikipedia.org/wiki/Diesel_fuel) Diesel [Maps result], 339 Newbury St., Boston (2 miles)
Much Better Better Slightly Better About the Same Slightly Better Better Much Better

OPR Explanation: The query could refer to a clothing store or a kind of fuel.
• Two out of three results are the same on both sides, so they aren’t that different.
• The left side has a wrong language result, which is Not Satisfying to users.
• The right side ranks the diesel fuel result higher, showing both likely interpretations near the top.
• The right side has more diversity of result types (web pages and maps, instead of only web pages).
• Since there are multiple reasons to prefer the right side, that side should be more than Slightly Better. But since the lists aren’t
that different, it’s not Much Better. So we choose Better.



2.3 Example 3

Query: apollo project


Location: Cincinnati, OH on Feb. 13, 2020

LEFT
1. Apollo Space Program wikipedia article (en.wikipedia.org/wiki/Apollo_program)
2. Project Apollo documentary [Movie]
3. Project Apollo — Moonlight Richards 50 songs to the moon, an Apollo 11 space mission tribute [Apple Music result]

RIGHT
1. Apollo Space Program wikipedia article (en.wikipedia.org/wiki/Apollo_program)
2. Project Apollo documentary [Movie]
3. Apollo Global Video Project: Les Twins of Sarcelles by Apollo Theater, Harlem [YouTube video]

Rating scale: Much Better | Better | Slightly Better | About the Same | Slightly Better | Better | Much Better

OPR Explanation: The query refers to the space program from the 1960s that first put a human on the moon.
• The first two results are the same on both sides.
• Both result sets have three types of search results.
• The third result on the left is only vaguely related to the Apollo space program. It seems unlikely that someone searching for
“apollo project” would find an obscure artist’s ambient music useful in satisfying their search need.
• The third result on the right is not at all related to the Apollo space program; it has something to do with a project of the
Apollo Theater.

Based on the web results, it’s extremely unlikely that this was the user’s intended interpretation of the query. Since only the last
result is different, and the last result on the left is less bad than the one on the right, we conclude that the left side is Slightly
Better.



2.4 Example 4

Query: best actor winner


Location: Bellevue, WA on Feb. 13, 2020

LEFT
1. Academy Awards Best Actor and Best Supporting Actor — Winners (filmsite.org/bestactor2.html)
2. Andy Serkis for Best Actor [YouTube video from 2011]
3. The Best Actors Who Won Oscars for Their First Movie (www.ranker.com/list/actors-who-won-oscars-for-their-first-movie/ranker-film)

RIGHT
1. Joaquin Phoenix — Academy Award for Best Actor — Winner [Info card]
2. Academy Awards Best Actor and Best Supporting Actor — Winners (filmsite.org/bestactor2.html)
3. Joaquin Phoenix: Best Actor, Motion Picture, Drama: 2020 Golden Globes [YouTube video]

Rating scale: Much Better | Better | Slightly Better | About the Same | Slightly Better | Better | Much Better

OPR Explanation: The query very likely refers to the winner of the Academy Award (aka “Oscar”) in the best actor category.
Since the query was on Feb. 13, 2020, we assume the user wanted the most recent award winner at the time, announced at the
ceremony on February 9, 2020.
• Result #1 on the left (same as #2 on right) contains the answer, but requires visiting the page and scrolling all the way to the bottom to find it. Result #1 on the right gives us the answer right away, without even having to click on it.
• Result #2 on the left is a YouTube video from a non-authoritative source (a random fan), and it’s very outdated, from 2011.
• Result #3 on the left is related to best actor winners, but doesn’t actually contain the answer the user is looking for.
• Result #3 on the right tells us about another recent best actor award (the Golden Globes, rather than the Oscars) which had the same winner, Joaquin Phoenix. Even though we assume the user was looking for the Oscar winner, they might also be interested in other awards won by the same actor for the same role.

Since all of these observations suggest that the right side is better than the left, you would conclude that the right side is Much
Better than the left.



2.5 Example 5

Query: anthony ramos


Location: Fairfax, VA on April 17, 2021

LEFT
1. Anthony Ramos wikipedia page
2. Official video for Ramos' 2021 song “Lose My Mind"
3. Official video for Ramos' 2021 song “Blessings"
4. Official video for Ramos' 2021 song “Say Less"

RIGHT
1. Anthony Ramos official site
2. Official video for Ramos' 2021 song “Lose My Mind"
3. NBC News article from February 2021
4. Anthony Ramos instagram page

Rating scale: Much Better | Better | Slightly Better | About the Same | Slightly Better | Better | Much Better

OPR Explanation: The query refers to an actor and singer who appeared in the original cast of the musical Hamilton.
• Results L1, R1, and R4 are all Highly Satisfying. All the rest of the results on both sides are Satisfying.
• The set on the right is more diverse, providing more different types of results.

Since the only differences favor the right side, it is Better.



2.6 Example 6

Query: dana
Location: Hampton, VA on 2021-08-17

LEFT
1. Dana (Indonesian digital wallet) app
2. Home page for Nigerian airline Dana Air
3. Video of 2021 song "Dana Dana" by Now United

RIGHT
1. Home page for Dana Inc. (www.dana.com), a company that makes drivetrain parts for passenger vehicles
2. Video of Israeli singer Dana International performing the winning song at the 1998 Eurovision contest
3. Wikipedia page for South Korean singer Dana

Rating scale: Much Better | Better | Slightly Better | About the Same | Slightly Better | Better | Much Better

OPR Explanation: The query can refer to many different things or people, and the web search results make it clear that none of
them is a dominant interpretation. Furthermore, these results all seem to be only Somewhat Satisfying, since it isn’t likely that
most users in the United States were searching for (say) an Indonesian app or an Israeli singer from the 1990s. Therefore the
two sides are About the Same.



2.7 Example 7

Query: tina turner movie


Location: Kansas City, MO on 2021-08-17

LEFT:
1. 1985 movie "Mad Max: Beyond Thunderdome" (which co-starred Tina Turner) on HBO
2. 1993 movie "What's Love Got to Do With It," about the life of Tina Turner
3. Web page for 2021 documentary "Tina" on HBO

RIGHT:
1. Web page for 2021 documentary "Tina"
2. 1993 movie "What's Love Got to Do With It," about the life of Tina Turner
3. 1985 movie "Mad Max: Beyond Thunderdome" (which co-starred Tina Turner)

Rating scale: Much Better | Better | Slightly Better | About the Same | Slightly Better | Better | Much Better

OPR Explanation: Both sides have the same results, but they are ranked differently. Since the search was done in 2021, it's most
likely that the new 2021 documentary about Tina Turner ("Tina") is what the user was looking for. Since the only difference is the
ranking, and the right side's ranking is clearly better than the left side's (it moves the best result into position #1), the right side is Better.



2.8 Example 8

Query: hannah waddingham


Location: Dickinson, TX on 2021-09-22

LEFT:
1. A news article on her winning an Emmy award for her character in the TV series Ted Lasso
2. A website listing the Emmy 2021 winners

RIGHT:
1. The IMDB page for the actor Hannah Waddingham
2. A different news article on her winning an Emmy award for her character in the TV series Ted Lasso

Rating scale: Much Better | Better | Slightly Better | About the Same | Slightly Better | Better | Much Better

OPR Explanation: Both sides have a fresh and relevant news article, but the second result on the left doesn't add any additional
value. The right side has excellent ranking: the first result is a professional page about the actor and her experience, and the
second is a fresh news article.



2.9 Example 9

Query: monster hunter stories 2


Location: Miami, FL on 2021-08-10

LEFT:
1. Wikipedia entry for the video game Monster Hunter Stories

RIGHT:
1. Wikipedia link to Monster Hunter Stories 2: Wings of Ruin

Rating scale: Much Better | Better | Slightly Better | About the Same | Slightly Better | Better | Much Better

OPR Explanation: The user specifically asked for "Monster Hunter Stories 2". The left side has a more general result (it's about
the entire video game series), while the right side is about the exact thing the user asked about, so the right is Better. To be Much
Better, the right side would have needed some additional content that added diversity, such as a link to the official page.



2.10 Example 10

Query: audra mcdonald


Location: Bergen, NJ on 2021-09-22

LEFT:
1. A Knowledge Card describing the singer/actor, including links to her official site and Twitter handle
2. A web video of a lesser-known song, "My Man's Gone Now," from 2007
3. A web video of another song, "Rainbow High"

RIGHT:
1. A Knowledge Card describing the singer/actor, including links to her official site and Twitter handle
2. Official website
3. Twitter handle

Rating scale: Much Better | Better | Slightly Better | About the Same | Slightly Better | Better | Much Better

OPR Explanation: Both sides have the brief Knowledge Card describing the person (with links to her official website and Twitter
feed). The left side additionally has web videos for two of her songs, while the right side additionally has her official website and
Twitter handle. Results R2 and R3 are more valuable than L2 and L3, but the lack of any videos makes the right side only Slightly Better.



2.11 Example 11

Query: sunrise
Location: West Melbourne, FL on 2021-09-01

LEFT:
1. Weather Info card for West Melbourne (with sunrise/sunset times)
2. App store link for sunrise/sunset times
3. Knowledge Info card about the topic Sunrise

RIGHT:
1. A website selling the domain name http://www.sunrise.am
2. Weather Info card for West Melbourne (with sunrise/sunset times)
3. Knowledge Info card about the topic Sunrise

Rating scale: Much Better | Better | Slightly Better | About the Same | Slightly Better | Better | Much Better

OPR Explanation: Both sides have the same third result. Both also have the same Highly Satisfying weather Info card, but it is
ranked higher on the left (position #1 vs. #2). Of the remaining results, the app store link on the left might be useful, while the
domain-for-sale page on the right is Not Satisfying. Both of these differences favor the left side, so it is Better.



2.12 Example 12

Query: huffington post


Location: Paxtonia, PA 2021-09-22

LEFT:
1. Official website
2. Twitter handle

RIGHT:
1. Official UK website
2. Huffington Post News App

Rating scale: Much Better | Better | Slightly Better | About the Same | Slightly Better | Better | Much Better

OPR Explanation: The user is looking for the news site Huffington Post. The official website, the news app, and the Twitter feed
are all Highly Satisfying; the UK site is only Somewhat Satisfying. The left side is Better because its results are more satisfying overall.



Version History
2.3.1 (March 12, 2024)

Fixed a formatting issue in the OPR guidance.

2.3 (March 11, 2024)

• Added a new example to “Content Unavailable: The browser presents warning of a privacy or security issue on the page.”
• “Not Secure” in the search bar for the website.
• Updated Section “11. Web Images.”
• Updated image set for David Beckham so it more closely aligns with the grading rules.
• Added a new grading scenario (31).
• Note: subsequent scenario numbers have been updated accordingly (i.e., previous Scenario 35 is now Scenario 36, etc.). No
other changes to subsequent scenarios.
• Updated image set for “Additional Examples: Highly Satisfying,” Example 11 to more closely align with guidelines.
• Added example 12 to “Additional Examples: Satisfying.”

Overall Preference Rating 2.3

Added guidelines for the different but related Side-by-Side Search Satisfaction task.

2.1 (29th January, 2024)

• Note added to Table of Contents addressing two different grading platforms


• Added to General Guidelines: annotators must open a private window when checking links.
• Updated Content Unavailable list:
• Clarified that the first bullet point includes all of the following: a blank page, a parked domain, a 404 error, something
unavailable in the user's country, or anything else where the content has been removed or is inaccessible.
• Added: result is a website with a banner or pop-up indicating a limit on the number of visits.



• Added new example "Result requires log-in, passwords, or subscription to access, specifically where some but
not all users would be blocked from viewing content," showing an article that is partially but not fully visible.
• Added new section "Using Contextual Information to Determine Content Unavailable" to Step 2.3 Validate the Result.
• Clarifies the difference between query context and result context, and how they affect the CU flag.
• Removed "In general, if location or date information is missing from the query context, the result cannot be graded.
However, if you are confident that the missing information would not change the grade (even if present) you can go
ahead and grade the result" from Section 7. Maps and Common Errors (this is now addressed/clarified in the new
section).
• Added a specific example to Scenario 14.
• Added a note that Scenario 37 is an example for Try Rating annotators; Tag annotators should flag this as Content Unavailable.

2.0 (9th January, 2024)

• Guidelines reformatted.
• Removed "query and result in the same language" as an exception to Wrong Language.
• Clarification on which pop-ups are considered Content Unavailable.
• Instructions added on how to handle ad blockers, CAPTCHAs, and cookie pop-ups.
• Added scenarios for when to grade a News result as Highly Satisfying.
• Added examples for Content Unavailable and Web Image groups.
• Examples of "Trusted News Sources" for en locales added to the News section.

1.62 (30th November, 2023)

• Results for advice or recommendation queries cannot be HS; this rule was accidentally removed in 1.6 (that change was meant
to apply only to news).

1.61 (9th November, 2023)

• Fixed some inconsistencies (the guidelines stated both that news can and can never be HS).
• Added examples of Highly Satisfying news responses.

1.6 (2nd November, 2023)



• Added a note in "When to Grade Not Satisfying" indicating that if the grader is on the Tag platform, they do not need to (and
cannot) continue rating.
• News items can be HS (previously at best S). See Grading Specific Situations (news).
• Added a section on assessing source quality.
• Added guidance on image groups (and advice to Tag graders when Content is Unavailable).
• Added an example for HS; more advice/examples for the Inappropriate-Illegal category (side-loading sites).

1.51 (3rd August, 2023)

• Added two examples of Content Unavailable.

1.5 (3rd February, 2023)

• Added an explanation for OPR when a side is missing (Section 9.1)


• Property 1 for WebImages reworded to handle missing images
• WebImage rating guidance table columns updated
• Labels for web image examples fixed (dodecahedron and author examples)
• Added to Section 2 (regarding the query): if the research links do not work, copy the phrase into a search engine
(e.g., Google/Bing) with the appropriate locale.
• Added some guidance on "permanently closed" maps results. See Maps guidance (2).

1.4 (21st March, 2023)

• Explanation of “Adeles Third Album” (in Think About the Meaning) has been fixed.

1.4 (9th February, 2023)

• If at least one image in a web-images group result is not visible, flag it as Content Unavailable (see the Content Unavailable
section).
• Updated the table of advice in Grading Specific Advice for Web Images to suggest this.

