Essential v2
2023-11-14
IMDb Movie/TV/OTT Data: Documentation & Data Dictionary 2023-11-14
Contents
Getting Support
Key Concepts
IMDb IDs
Changes to Entities and Resolving IDs
JSON Lines File Format
Versioning
Data Structure Conventions
Data Consistency Model
Linking to IMDb
countries
episodeInfo
seriesInfo
episodeTitleIds
officialSiteLinks
genres
image
imdbUrl
isAdult
keywords
languages
locations
movieConnections
plot
plotShort
plotMedium
plotLong
releaseDates
productionStatus
runtimeMinutes
taglines
titleType
imdbRating
year
This data set contains IMDb essential metadata for every movie, TV and OTT series, and video game
title, as well as for performers and creators. It includes the IMDb 1-10 star rating, a daily-computed
average of votes from a global IMDb audience of 250 million visitors.
Release Notes
• 2023-11-13: Deprecation of the region field in akas, certificates, companies, releaseDates and
titleDisplay for the Title Essential v2 dataset (and the corresponding fields for the Title Rating v2 dataset).
• 2023-11-09: Add names field to Title Essential v2 dataset (providing detailed information about
the names of a title’s awards).
• 2023-11-03: Add titles field to Name Essentials dataset (providing detailed information about the
titles of a person’s awards).
• 2023-10-16: Add trademarks field to Name Essentials dataset (providing detailed information
about a person’s trademarks).
• 2023-09-04: Add death details field to Name Essentials dataset (providing detailed information
about a person’s death).
• 2021-06-21: Add ‘country’ field to certificates, releaseDates and titleDisplay fields for titles,
duplicating the existing ‘region’ field.
• 2021-04-07: Add trivia field and goofs field for titles, as an optional enhancement to the Title
Essential data (providing details about trivia and goofs on a title).
• 2021-03-05: Add color field to Title Essential data (providing details about whether a title is in
color or black and white).
• 2021-02-24: Add parents guide field for titles, as an optional enhancement to the Title Essential
data (providing details to parents and movie viewers about the title that cannot be fully
conveyed by certificates).
• 2020-09-21: Add KeywordsV2 field to Title Essential data (providing more detailed information
about keywords).
• 2020-09-01: Add distributionV2 field to Title Essential companies data (providing more detailed
information about distributors).
• 2020-06-04: Use OpenX Serde in Athena examples (better support for wide Unicode characters).
Getting Support
Key Concepts
IMDb IDs
IMDb uses unique identifiers for each of the entities referenced in IMDb data. For example “Name
IDs” identify name entities (people) and “Title IDs” identify title entities (movies, series, episodes and
video games). IMDb identifiers always take the form of two letters, which signify the type of entity
being identified, followed by a sequence of at least seven digits that uniquely identify a specific
entity of that type. For example:
• tt0050083 is the unique identifier for the movie “12 Angry Men (1957)”, where tt signifies that
it’s a title entity and 0050083 uniquely indicates “12 Angry Men (1957)”.
• nm0000020 is the unique identifier for the actor “Henry Fonda”, where nm signifies that it’s a
name entity and 0000020 uniquely indicates “Henry Fonda”.
Within the data set, each entry relates to a single IMDb identifier.
These IDs can be seen in some IMDb URLs. For example, the title page https://www.imdb.com/title/tt0050083/
contains the Title ID tt0050083, which references the movie “12 Angry Men (1957)”.
Changes to Entities and Resolving IDs
Duplicate IDs
IMDb data is constantly being updated, both with the addition of new data and enhancement of the
quality of existing data. Each entity should have exactly one IMDb identifier, but on occasion
duplicate entries exist for the same entity. This can happen, for instance, if multiple users have
contributed data for the same entity (e.g. the same person) under different identifiers
(e.g. different name IDs). In this case IMDb maintains both identifiers in the data set, effectively
duplicating the data. This allows you to continue using any matching you have between IMDb
identifiers and other identifiers. To identify when this is the case we include a remappedTo field in
the bulk data sets. This field gives you the preferred identifier for that entity.
For example, the pilot episode of The Big Bang Theory has multiple Title ID entries referring to the
same episode: tt1044014 (the Title ID that has been remapped) and tt0775431 (the preferred Title ID).
When you retrieve the remappedTo value for Title ID tt1044014, you will receive the preferred Title ID
tt0775431.
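The lookup described above can be sketched in Python. This is a minimal sketch, assuming the records have already been parsed and indexed by ID; the helper name `resolve_id` and the in-memory `records` dictionary are illustrative and not part of the data set. Following a chain of remaps is a defensive assumption, since the documentation only describes a single step.

```python
def resolve_id(imdb_id, records):
    """Return the preferred ID for imdb_id by following remappedTo.

    records: dict mapping an IMDb ID to its parsed JSON document
    (a hypothetical in-memory index, not something the data set provides).
    """
    seen = set()
    current = imdb_id
    # Follow remappedTo defensively in case a preferred ID is itself remapped.
    while current in records and "remappedTo" in records[current]:
        if current in seen:  # guard against an unexpected cycle
            break
        seen.add(current)
        current = records[current]["remappedTo"]
    return current


# Illustrative records based on the example in the text.
records = {
    "tt1044014": {"titleId": "tt1044014", "remappedTo": "tt0775431"},
    "tt0775431": {"titleId": "tt0775431"},
}
print(resolve_id("tt1044014", records))
```

An ID that is not remapped, or that is absent from the index, is returned unchanged.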
Deleted IDs
Sometimes entities are deleted from the data set. The most prominent example of this is the deletion
of titles that have been canceled during development and will therefore never be released. When
an entity is deleted, it is no longer included in the data set. The identifier associated with it is never
reused for a different entity.
JSON Lines File Format
The IMDb data set is provided in JSON Lines file format. The files are UTF-8 encoded text files, where
each line in the file is a valid JSON document. Each JSON document, one per line, relates to a single
entity, uniquely identified by an IMDb ID. A JSON schema is also provided that documents the format
that is used for each JSON document within the file.
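Reading such a file can be sketched as follows. This is a minimal sketch using only the Python standard library; the function name `iter_entities` is illustrative, and an in-memory `io.StringIO` stands in for a real data file so that the snippet is self-contained.

```python
import io
import json


def iter_entities(lines):
    """Yield one parsed JSON document per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


# Stand-in for an open data file; with a real file you might use
# open(path, encoding="utf-8") instead (the path is up to you).
sample = io.StringIO(
    '{"titleId": "tt0050083", "originalTitle": "12 Angry Men"}\n'
    '{"titleId": "tt0775431"}\n'
)
entities = list(iter_entities(sample))
print([e["titleId"] for e in entities])
```

Processing the file line by line keeps memory use flat, which may matter for the larger files.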
Versioning
Every published revision of an IMDb data set contains data file(s), documentation for that data, and
a schema which validates that data. Each of these is associated with a version number, which can be
found at the end of their filenames.
At any time we may change the format of new data set revisions and their accompanying schema,
but previously published data set revisions will remain unchanged. If data from a new revision of the
data set is not compatible with the previous schema (i.e. a breaking change) then we will increment
the version number for the data files, schema, and documentation. In this case we will publish both
formats of the data set for some period of time before we stop publishing the older one. The data set
format and schema may change without incrementing the schema version number if the change is
compatible with previous revisions (i.e. a non-breaking change).
Data Structure Conventions
There are some conventions you should be aware of when using IMDb data sets:
• There are no null values in the data set. If we do not have a value for a particular key we omit
publishing that key. Keys which are required by the schema will never have a null value.
• There are no empty objects in the data set. If an object would have contained no keys we omit
publishing that object.
• There are no empty arrays in the data set. If an array value would have contained no items we
omit publishing the corresponding key.
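In practice these conventions mean that a consumer tests for the presence of a key rather than for null or empty values. A minimal Python sketch, assuming a record already parsed into a dictionary; the helper name `field` and the sample genre values are illustrative only:

```python
def field(record, key, default=None):
    """Read an optional key; the data set omits keys rather than using null."""
    return record.get(key, default)


# Illustrative record: only some optional keys are present.
record = {"titleId": "tt0050083", "genres": ["Crime", "Drama"]}

genres = field(record, "genres", [])      # present in this record
taglines = field(record, "taglines", [])  # omitted here, so the default is used
rating = field(record, "imdbRating", {}).get("rating")  # None when omitted
print(genres, taglines, rating)
```

Supplying an empty list or object as the default lets downstream code iterate without special cases.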
Data Consistency Model
IMDb data is constantly being expanded and updated, and it can take seconds or minutes for a change
to propagate throughout the entire catalog. This means that the snapshot of data published may contain
temporary inconsistencies. For example, it is possible that we report an actor appearing in a title
in their filmography, but it has not yet propagated to that title’s credits. Each individual inconsistency
will be resolved in the next published revision of the data set.
Linking to IMDb
IMDb data contains URLs that you can use to link back to the IMDb website in any experience
you build for your users. Your license may require you to attach a “refmarker” to the end of the
URL. The “refmarker” is a special sequence of characters that we use to identify the source of
our traffic. Add the “refmarker” by appending ?ref_=xx_xxx_x to the URL, where
xx_xxx_x is replaced by the code we have provided to you. A full URL could look something like
https://www.imdb.com/title/tt0050083/?ref_=my_ref_marker.
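Appending the refmarker can be sketched in Python with the standard library. This is a minimal sketch; the function name `with_refmarker` is illustrative, and preserving an existing query string is an assumption that goes beyond what the text above specifies.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def with_refmarker(url, refmarker):
    """Append ref_=<refmarker> to the query string of an IMDb URL."""
    parts = urlsplit(url)
    # Assumption: keep any existing query string and add ref_ to it.
    query = parts.query + ("&" if parts.query else "") + urlencode({"ref_": refmarker})
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))


print(with_refmarker("https://www.imdb.com/title/tt0050083/", "my_ref_marker"))
```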
nameId
The unique IMDb ID for the name in question. Each IMDb ID appears exactly once.
remappedTo
It is possible for two IMDb IDs to be created for a single entity within our system before IMDb identifies
that they actually represent the same person or title. When this happens, we maintain the data
associated with both identifiers in the data set, duplicating the data. If there are duplicate name entities
for a person, remappedTo provides the IMDb ID of the primary name entity for this person.
See “Duplicate IDs” in the “Changes to Entities and Resolving IDs” section of “Key Concepts” for more
information.
name
The primary name by which this person is known, usually the one by which they are most often credited.
For more information about how IMDb defines the primary name see the IMDb help site.
awards
A list of awards that this person has won or been nominated for. This includes the name and category
of the award, the name and year of the award event, the titles they were nominated for, and whether
the person won the award. Note that winner may be false because the person is known not to have
won the award (where the awards event occurred in the past) or because the winner is not yet known
(where the awards event occurs in the future, but the nominations have been announced). If ‘winner’
is true that means the awards event has already occurred, and the person won the award. Please also
note that titles may be missing from the record if the nomination is not related to a specific
title (e.g. “life-time achievement award”).
Example
{
"awards": [
{
"year": 1958,
death
If applicable, details regarding the person’s death. Fields include (where known) deathStatus,
deathDate, deathCause and deathLocation. Note that there is no ‘alive’ status, so absence of death
details may imply that the person is alive, or that death details are unknown.
Example
{
"death": {
"deathCause": "stroke",
"deathDate": "1969-05-27",
"deathLocation": "Los Angeles, California, USA",
"deathStatus": "DEAD"
},
...
}
filmography
The filmography for this name as a list of credits. Each credit is within a “category” such as
“actress”, “director” or “editorial_department”. For cast categories (e.g. “actor”), we include the roles
that the person played and the billing they had in the end credits (if available). For crew categories
(e.g. “writer”) we include the more specific “jobs” that the person was credited with if applicable. Lists
of credits, roles, and jobs are each in on-screen credits order.
Credits can have a list of attributes. At the moment we provide the following attributes:
• “uncredited”: signals that while the person performed this role on the title, they were not
present in the title’s end credits.
• “voice”: signals that this person provided a voice-only performance for this title.
Additional information about these attributes can be found on the IMDb help site.
For episodic credits we only include the series in the list. To get full information on which episodes
were worked on, look at the episode credits in the title file.
Example
A cast credit
{
"filmography": [
{
"titleId": "tt0050083",
"category": "actor",
"billing": 8,
"roles": ["Juror 8"]
},
...
]
}
A crew credit
{
"filmography": [
{
"titleId": "tt0052462",
"category": "producer",
"jobs": ["executive producer"]
},
...
]
}
An uncredited credit
{
"filmography": [
{
"titleId": "tt0050083",
"category": "actor",
"roles": ["Judge"],
"attributes": ["uncredited"]
},
...
]
}
knownFor
A short list of IMDb title IDs for the titles for which this person is best known, together with
the category of job that they had on each title (e.g. “actor” or “director”). This is always a subset
of filmography, but the selection and order are determined by IMDb. For more details see the IMDb help
site. For further details on their involvement see the filmography entry, or the creditsByCategory
entry on the title in question.
Example
{
"knownFor": [
{
"titleId": "tt0050083",
"category": "actor"
},
...
]
}
trademarks
Descriptions of a person’s recognizable traits, usually something repeated over a significant proportion
of their films, or distinguishing information that sets them apart from most other people in the
industry. For more details see the IMDb help site.
Example
{
"trademarks": [
"Deep husky voice",
"Cat-like green eyes",
...
]
}
titleId
The unique IMDb ID for the title in question. Each IMDb ID appears exactly once.
remappedTo
It is possible for two IMDb IDs to be created for a single entity within our system before IMDb identifies
that they actually represent the same person or title. When this happens, we maintain the data
associated with both identifiers in the data set, duplicating the data. If there are duplicate title entities
for a title, remappedTo provides the IMDb ID of the primary title entity for this title.
See “Duplicate IDs” in the “Changes to Entities and Resolving IDs” section of “Key Concepts” for more
information.
originalTitle
The original title text of the title, normally what the title is known as in its original country of release.
akas
A list of all available alternative title texts by which this title is also known, provided to help with
matching the IMDb title to any other title identifier you may have. Each title is listed with additional
information about the usage of that title text, e.g. what country it is from, and what language it is used in.
Example
{
"akas": [
...
{
"country": "TR",
"language": "tr",
titleDisplay
A list of alternative title display texts by which this title is also known. Each display text is listed with
additional information about the usage of that display text, e.g. what country it is from, and what
language it is used in. This is a subset of title akas, curated to only contain the title texts for each
language and country which are best for displaying to customers.
Example
{
"originalTitle": "12 Angry Men",
"titleDisplay": [
{
"country": "AE",
"title": "12 Angry Men"
},
{
"country": "AU",
"title": "12 Angry Men"
},
{
"country": "DE",
"title": "Die zwölf Geschworenen"
},
{
"country": "CSHH",
"language": "cs",
"title": "Dvanáct rozhněvaných mužů"
},
...
]
}
awards
A list of awards that this title has won or been nominated for. This includes the name and category of
the award, the name and year of the award event, the names who have been nominated and whether
the title won the award. Note that winner may be false because the title is known not to have won
the award (where the awards event occurred in the past) or because the winner is not yet known (where
the awards event occurs in the future, but the nominations have been announced). If ‘winner’ is true
that means the awards event has already occurred, and the title won the award. Please also note that
names may be missing from the record if the nomination is not related to a specific person
(e.g. “Best Single Documentary” - Broadcasting Press Guild Award).
Example
{
"awards": [
{
"awardName": "Oscar",
"category": "Best Picture",
"event": "Academy Awards, USA",
"names": [
"nm0000020",
"nm0741627"
],
"winner": false,
"year": 1958
},
...
]
}
creditsByCategory
The credits for this title organized by category. Each entry in this list represents a single category and
gives you a list of credits within that category. For cast credits we include the roles that the person
played, the billing they had in the end credits (if available) and sometimes the creditedAs field
in cases where the on-screen credit used a different name. For crew credits we include the more
specific “jobs” that the person was credited with if applicable. Lists of credits, roles, and jobs are each
in on-screen credits order.
Credits can have a list of attributes that contain additional data about a specific credit (e.g. “uncredited”,
“voice”, etc.).
Additional information about these attributes can be found on the IMDb help site.
For series we include anyone who is credited on any episode. To get full information on which
episodes were worked on by a specific person, look at the episode credits.
Example
A crew category
{
"creditsByCategory": [
{
"category": "sound_department",
"credits": [
{
"jobs": ["sound"],
"nameId": "nm0322302"
},
{
"attributes": ["uncredited"],
"jobs": ["re-recording mixer"],
"nameId": "nm0334505"
}
]
},
...
]
}
A cast category
{
"creditsByCategory": [
{
"category": "cast",
"credits": [
{
"billing": 1,
"category": "actor",
"nameId": "nm0000842",
"roles": ["Juror 1"]
},
{
"billing": 2,
"category": "actor",
"nameId": "nm0275835",
"roles": ["Juror 2"]
},
...
{
"attributes": ["uncredited"],
"category": "actor",
"nameId": "nm0094036",
"roles": ["Judge"]
},
...
]
},
...
]
}
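Walking this nested structure can be sketched in Python. This is a minimal sketch, assuming a title record shaped like the examples above; the helper name `iter_credits` is illustrative, and the sample record is a trimmed copy of the cast example.

```python
def iter_credits(title):
    """Yield (category, credit) pairs from a title's creditsByCategory list."""
    for group in title.get("creditsByCategory", []):
        for credit in group.get("credits", []):
            yield group["category"], credit


# Trimmed copy of the cast example shown above.
title = {
    "creditsByCategory": [
        {
            "category": "cast",
            "credits": [
                {"billing": 1, "category": "actor",
                 "nameId": "nm0000842", "roles": ["Juror 1"]},
                {"attributes": ["uncredited"], "category": "actor",
                 "nameId": "nm0094036", "roles": ["Judge"]},
            ],
        }
    ]
}

# Name IDs of cast members who appear in the on-screen credits.
credited = [
    credit["nameId"]
    for category, credit in iter_credits(title)
    if category == "cast" and "uncredited" not in credit.get("attributes", [])
]
print(credited)
```

Because attributes is omitted when empty, the filter falls back to an empty list rather than testing for null.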
principalCastMembers
A short list of the most important cast credits for this title. This is always a subset of the cast from
the creditsByCategory list, but the selection and order are determined by IMDb. Often it is similar
to the top-billed cast but it can be different, for example if the title credits are in order of appearance or
alphabetical. For more details see the IMDb help site. Also includes the role or roles played (in on-screen
credits order), the billing in the full cast list and sometimes the creditedAs field in cases where the
on-screen credit used a different name.
Example
{
"principalCastMembers": [
{
"billing": 1,
"category": "actor",
"creditedAs": "David Newman",
"nameId": "nm1453374",
"roles": ["Squeeze"]
},
...
]
}
principalCrewMembers
A short list of the most important crew credits for this title. This is always a subset of the crew from
the creditsByCategory list, but the selection and order are determined by IMDb. Also includes the
category and job which qualified the credit for this list and sometimes the creditedAs field in cases
where the on-screen credit used a different name.
Example
{
"principalCrewMembers": [
{
"nameId": "nm0741627",
"category": "writer",
"job": "story"
},
...
]
}
certificates
A list of content rating certifications that have been given to a title, and the country where the rating
applies or applied. For example a title may be given a ‘PG-13’ rating in the ‘US’ (by the MPAA). There
may be additional attributes about the certificate or reasons for the rating provided by the rating
organization (e.g. “Rated PG-13 for sequences of violence and action throughout”).
Example
{
"certificates": [
{
"country": "US",
"rating": "R",
"ratingsBody": "MPAA",
color
Lists whether the title was filmed in black and white or color. Where titles include footage in both we
list both, with the one that makes up the majority of the running time appearing first.
Example
{
"color": [
"Color",
"Black and White"
],
...
}
companies
Lists of the names of distribution, production, special-effects, and other miscellaneous companies associated
with the making or subsequent distribution of this title. This list includes all companies that
have ever been involved with the title, even if their involvement has now ended. These are ordered
by on-screen credit order, or in the case of distribution companies by distribution release date.
Additional information about companies associated with titles can be found on the IMDb help site.
Example
{
"companies": {
"distribution": [
{
"company": {
"country": "US",
"id": "co0226183",
"name": "Walt Disney Studios Motion Pictures"
},
"countries": ["US"],
"endYear": 2019,
"formats": ["theatrical"],
"isUncredited": false,
"startYear": 2019
},
...
],
"miscellaneous": [
{
"company": {
"country": "US",
"id": "co0746914",
"name": "4DX"
}
},
...
],
"production": [
{
"company": {
"country": "US",
"id": "co0051941",
"name": "Marvel Studios"
},
"isUncredited": false
},
...
],
"specialEffects": [
{
"company": {
"country": "GB",
"id": "co0454255",
"name": "Territory Studio"
}
},
...
]
}
}
countries
A list of ISO 3166 country codes for the countries in which the production companies for the title are
based. For more details see the IMDb help site.
episodeInfo
For titles that are episodes, this contains information about the parent series: the series title ID and,
where relevant, the season number and episode number.
Example
{
"episodeInfo":
{
"seriesTitleId": "tt0944947",
"episodeNumber": 1,
"seasonNumber": 8
}
}
seriesInfo
For titles that are series, this contains additional information about the series, such as the year it
started airing, the year it finished airing (if it has finished), and a list of all the episode title IDs in the
series ordered by episode number (e.g. season 1 episode 1, season 1 episode 2, etc.).
Example
{
"seriesInfo": {
"startYear": 2011,
"endYear": 2019,
"episodeTitleIds": [
"tt1480055",
"tt1668746",
"tt1829962",
...
]
}
}
episodeTitleIds
For titles which are series, the IMDb title IDs for all the episodes of that series.
officialSiteLinks
A list of URLs (and optionally their link titles) linking to this title’s official website.
Example
{
"officialSiteLinks": [
{
"url": "www.example.com/official/example-title",
"linkTitle": "Example official website for title"
},
...
]
}
genres
A list of genres to which this title belongs. The full list of allowed genres and guidelines for how titles
should be categorized can be found on the IMDb help site. IMDb defines a limited list of genres but may
add more in the future.
image
A URL linking to the primary image associated with this title, such as a movie poster or still frame.
Additionally, includes the id and the width and height of the image in pixels.
Example
{
"image": {
"height": 1500,
"id": "rm2927108352",
"url": "https://m.media-amazon.com/images/M/MV5BMWU4N2FjNzYtNTVkNC00NzQ0LTg0MjAtYTJlMjFhNGUxZDFmXkEyXkFq...",
"width": 974
}
}
imdbUrl
isAdult
Whether or not this title contains adult content. Useful if you would like to filter out all adult content
from your copy of the data set.
keywords
A keyword is a word (or group of connected words) attached to a title (movie / TV series / TV episode)
to describe any notable object, concept, style or action that takes place during a title. A keyword
can be a single word (e.g. waterfall) or a phrase with words separated by a dash (e.g. world-war-two;
running-away-from-home).
keywords is a list of keywords associated with the title, sorted most relevant first as voted on by IMDb
customers. keywords has the following attributes for each keyword:
• “category”: Contains the category the keyword belongs to. For more information on categorization,
see the note below.
• “keyword”: Contains the keyword itself.
• “votes”: Contains the number of up votes and down votes IMDb customers have given this keyword
when rating it for helpfulness.
Note: each keyword belongs to one of the following categories:
• “plot-detail”: Keywords describing elements of the plot of this title (e.g. humanity-in-jeopardy;
metaverse).
• “subgenre”: Used to specify which subgenres apply to the title (e.g. romantic-comedy;
musical-comedy).
• “plot-timeframe”: Used to specify what timeframe the title’s plot is set in (e.g. 1980s;
20th-century).
• “other”: Used to capture any keywords that do not fit into the above categories (e.g. directed-by-woman;
f-rated).
More information about keywords and guidance for how they are associated with a title can be found
on the IMDb help site.
Example
{
"keywords": [
{
"category": "plot-detail",
"keyword": "dream",
"votes": {
"up": 5,
"down": 7
}
},
...
]
}
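Working with the keyword attributes can be sketched in Python. This is a minimal sketch, assuming every keyword entry carries a category as in the example above; the helper names `keywords_by_category` and `net_votes` are illustrative, and the second sample entry (including its vote counts) is made up for demonstration.

```python
from collections import defaultdict


def keywords_by_category(title):
    """Group keyword strings by category, keeping the published order."""
    grouped = defaultdict(list)
    for entry in title.get("keywords", []):
        grouped[entry["category"]].append(entry["keyword"])
    return dict(grouped)


def net_votes(entry):
    """Up votes minus down votes; a missing count is treated as 0 (an assumption)."""
    votes = entry.get("votes", {})
    return votes.get("up", 0) - votes.get("down", 0)


title = {
    "keywords": [
        {"category": "plot-detail", "keyword": "dream",
         "votes": {"up": 5, "down": 7}},
        # Made-up entry for illustration only.
        {"category": "subgenre", "keyword": "romantic-comedy",
         "votes": {"up": 3, "down": 1}},
    ]
}
grouped = keywords_by_category(title)
print(grouped)
print([net_votes(k) for k in title["keywords"]])
```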
languages
A list of ISO 639 language codes for the languages spoken in this title, in order of frequency that they
are spoken in the title. For more details see the IMDb help site.
locations
A list of locations where scenes from this title were filmed and optionally names or descriptions of the
scenes which used that location.
Example
{
"locations": [
{
"scenes": ["studio"],
"place": "Fox Movietone Studio, New York, USA"
},
...
]
}
movieConnections
A list of IMDb title IDs of other titles which have a connection to this title, and the type of connection,
for example titles which reference or spoof this title. Optionally may include a description of the
connection. A complete list of current connection types can be found on the IMDb help site, although more
may be added in the future.
Example
{
"movieConnections": [
{
"type": "referenced_in",
"titleId": "tt2336547",
"text": "Jack criticizes the film for depicting 11 Americans being swayed by Jane Fonda's father"
},
...
]
}
plot
A plot description of this title. Most plot descriptions are just a couple of sentences long, but
some may be longer. The ‘plot’ field contains the shortest of ‘plotShort’, ‘plotMedium’ or ‘plotLong’ and
may be omitted. If you are displaying these plots you may need to consider truncating longer
plots.
plotShort
A plot outline of this title, no longer than 239 characters. Plot outlines never contain spoilers.
plotMedium
A plot summary of this title. Most plot summaries will be reasonably brief, a paragraph or two. If there
are multiple plot summaries available on this title’s plot page on IMDb.com, then the one provided
here will have been selected to display prominently on the title’s main page by our users or manual
vetting team.
plotLong
A synopsis of this title. A long detailed description of the entire plot of the title.
releaseDates
A list of the release dates (ISO 8601 date format) for this title, together with the country (an ISO 3166
country code) to which each release date applies.
Note that each release date may specify year, month and day (e.g. 1979-08-16), year and month
(e.g. 1979-08) or only year (e.g. 1979).
Example
{
"releaseDates": [
{
"date": "1957-04",
"country": "GB"
},
{
"date": "1957-04-10",
"country": "US"
},
{
"date": "2016-02-24",
"country": "CZ"
},
...
]
}
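Because a release date may be a full date, a year and month, or only a year, a standard date parser may reject some values. A minimal Python sketch of one way to handle the three shapes; the helper name `parse_partial_date` is illustrative, and sorting partial dates ahead of complete ones in the same period is an arbitrary choice made for this example.

```python
def parse_partial_date(value):
    """Split 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD' into (year, month, day).

    Parts that are not present are returned as None.
    """
    parts = [int(p) for p in value.split("-")]
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"unexpected date format: {value!r}")
    year, month, day = parts + [None] * (3 - len(parts))
    return year, month, day


# Entries copied from the example above.
release_dates = [
    {"date": "1957-04", "country": "GB"},
    {"date": "1957-04-10", "country": "US"},
    {"date": "2016-02-24", "country": "CZ"},
]

# Treat a missing month or day as 0 so partial dates sort first (arbitrary choice).
earliest = min(
    release_dates,
    key=lambda r: tuple(p or 0 for p in parse_partial_date(r["date"])),
)
print(earliest["country"], parse_partial_date(earliest["date"]))
```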
productionStatus
A list of production statuses for this title in ascending order by date, with the last status being the
current production status. The available statuses for in-production listings are listed on the IMDb help
site.
Example
{
"productionStatus": [
...
{
"updated": "2008-12-02",
"status": "pre production"
},
{
"updated": "2009-10-24",
"status": "filming"
},
{
"updated": "2011-04-17",
"status": "released"
}
]
}
runtimeMinutes
taglines
A list of taglines for this title. A tagline is a short description or comment on a title that is often displayed
on posters. For additional details see the IMDb help site.
titleType
imdbRating
The IMDb Rating for the title. The rating is between 1 and 10 and given to one decimal place. See
the IMDb help site for more information on how the rating is calculated. We also include the number of
IMDb users who have voted on this title. A single IMDb user can cast a maximum of one vote. This
field can be missing from an entry when we do not yet have an IMDb rating for the title in question.
This can occur either because it does not yet have enough votes, or it has not yet been released. A TV
series rating is not the weighted average of the ratings of individual episodes. Instead, customers vote
separately for the rating of the series as a whole via each title’s series page.
Example
{
"rating": 8.9,
"numberOfVotes": 613399
}
year
Amazon Athena is an interactive query service that makes it easy to analyze data directly in Amazon
Simple Storage Service (Amazon S3) using standard SQL. For more information, and for getting started
with Athena, read the user guide.
Getting Started
You first need to create a database in Athena. This process is documented in the user guide.
When you have a database, you’re ready to create a table that’s based on the dataset. You need to
upload your dataset to Amazon S3. When you specify a location for your table you should use a trailing
slash for your folder or bucket. Do not use file names or glob characters.
Use: s3://S3-BUCKET/S3-KEY/
Athena will query all objects in the specified location so it is important that only one dataset is found
at that path.
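The location rules above can be checked before the DDL statement is run. The following Python sketch is one possible check, not part of Athena or of any AWS library; the function name normalize_table_location is hypothetical. It appends the trailing slash and rejects glob characters, but it cannot tell a file name from a folder name, so that rule still needs a manual check.

```python
def normalize_table_location(location):
    """Return an s3:// folder location with a trailing slash, or raise ValueError."""
    if not location.startswith("s3://"):
        raise ValueError("location must start with s3://")
    # Glob characters are not allowed in a table location.
    if any(ch in location for ch in "*?[]"):
        raise ValueError("location must not contain glob characters")
    # A trailing slash marks the location as a folder or bucket.
    return location if location.endswith("/") else location + "/"


table_location = normalize_table_location("s3://S3-BUCKET/S3-KEY")
```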
To create a table use the create table DDL statement found at the end of this document. Remember
to set the location to the location of your dataset. If you do not need to query all of the columns in the
table you can remove them from the create table DDL statement.
Now that you have a table created in Athena based on the data in Amazon S3, you can run queries on
the table and see the results in Athena.
IMDb user ratings can be found in the title essential dataset as part of the imdbRating structure.
select
tc.titleId,
tc.originalTitle,
tc.imdbRating.rating,
tc.imdbRating.numberOfVotes
from
title_essential_v1 as tc
where
tc.remappedTo is null and tc.titleType = 'movie'
order by
tc.imdbRating.rating desc,
tc.imdbRating.numberOfVotes desc
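For readers working with the JSON Lines files outside Athena, the same filter and ordering can be sketched in Python. This is an illustration under assumptions, not a replacement for the query: the records below are hypothetical, 'tvSeries' is used only as an example of a non-movie title type, and placing unrated titles last is a choice made for this sketch.

```python
# Hypothetical records shaped like entries in the title essential dataset.
titles = [
    {"titleId": "tt0000001", "titleType": "movie",
     "imdbRating": {"rating": 8.9, "numberOfVotes": 100}},
    {"titleId": "tt0000002", "titleType": "movie",
     "imdbRating": {"rating": 8.9, "numberOfVotes": 500}},
    {"titleId": "tt0000003", "titleType": "tvSeries",
     "imdbRating": {"rating": 9.5, "numberOfVotes": 900}},
    {"titleId": "tt0000004", "titleType": "movie"},
    {"titleId": "tt0000005", "titleType": "movie", "remappedTo": "tt0000001",
     "imdbRating": {"rating": 9.9, "numberOfVotes": 10}},
]


def top_rated_movies(titles):
    """Keep non-remapped movies, ordered by rating then vote count, both descending."""
    movies = [
        t for t in titles
        if t.get("remappedTo") is None and t.get("titleType") == "movie"
    ]

    def sort_key(title):
        info = title.get("imdbRating") or {}
        rating = info.get("rating")
        votes = info.get("numberOfVotes")
        # Titles without a rating sort after all rated titles.
        return (rating is None, -(rating or 0), -(votes or 0))

    return sorted(movies, key=sort_key)


ranked_ids = [t["titleId"] for t in top_rated_movies(titles)]
```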
What Are the Title Texts for the Titles That a Person Is Known For?
The title IDs that a person is known for can be found in the name essential dataset as part of the
knownFor array. To query this array it is necessary to flatten it into multiple rows using CROSS JOIN
in conjunction with the UNNEST operator. To include the original title text it is necessary to JOIN the
title essential dataset.
select
nc.nameId,
nc.name,
u_knownFor.titleId,
tc.originalTitle
from
name_essential_v1 as nc
cross join
unnest(nc.knownFor) with ordinality as t(u_knownFor, ordinal)
join
title_essential_v1 as tc
on u_knownFor.titleId = tc.titleId
where
nc.remappedTo is null
order by
nc.nameId, ordinal
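The flatten-then-join pattern used in the query above can also be sketched in plain Python, which may help when checking results against a local copy of the files. The records and the function name known_for_rows are hypothetical. The enumerate call plays the role of WITH ORDINALITY, and the dictionary lookup plays the role of the inner JOIN.

```python
# Hypothetical records shaped like the name and title essential datasets.
names = [
    {"nameId": "nm0000001", "name": "Example Person", "remappedTo": None,
     "knownFor": [{"category": "actor", "titleId": "tt0000002"},
                  {"category": "actor", "titleId": "tt0000001"}]},
    {"nameId": "nm0000002", "name": "Merged Person", "remappedTo": "nm0000001",
     "knownFor": [{"category": "actor", "titleId": "tt0000001"}]},
]
titles = [
    {"titleId": "tt0000001", "originalTitle": "First Example"},
    {"titleId": "tt0000002", "originalTitle": "Second Example"},
]


def known_for_rows(names, titles):
    """Flatten each knownFor array into rows and attach the original title text."""
    titles_by_id = {t["titleId"]: t for t in titles}
    rows = []
    for person in names:
        if person.get("remappedTo") is not None:
            continue  # skip names that have been remapped to another ID
        # enumerate(..., start=1) gives each array element its 1-based position.
        for ordinal, entry in enumerate(person.get("knownFor") or [], start=1):
            title = titles_by_id.get(entry["titleId"])
            if title is None:
                continue  # inner join: drop entries with no matching title
            rows.append((person["nameId"], person["name"],
                         entry["titleId"], title["originalTitle"], ordinal))
    rows.sort(key=lambda row: (row[0], row[4]))
    return rows


rows = known_for_rows(names, titles)
```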
The name IDs for the principal cast for a title can be found in the title essential dataset as part of the
principalCastMembers array. To query this array it is necessary to flatten it into multiple rows
using CROSS JOIN in conjunction with the UNNEST operator. To include the name it is necessary to
JOIN the name essential dataset.
select
tc.titleId,
tc.originalTitle,
nc.name
from
title_essential_v1 as tc
cross join
unnest(tc.principalCastMembers) with ordinality as t(u_pcm, ordinal)
join
name_essential_v1 as nc
on u_pcm.nameId = nc.nameId
where
tc.remappedTo is null
order by
tc.titleId, ordinal
Running this query on the sample dataset returns the following results:
What Awards Was a Title Nominated For, and Who Were the Award Nominees?
select
tc.titleId,
tc.originalTitle,
tc_awards.awardNominationId,
tc_awards.awardName,
nc.nameId,
nc.name
from
name_essential_v1 as nc
cross join
unnest(nc.knownFor) with ordinality as t(u_knownFor, ordinal)
join
title_essential_v1 as tc
on u_knownFor.titleId = tc.titleId
cross join
unnest(tc.awards) as t(tc_awards)
cross join
unnest(nc.awards) as t(nc_awards)
where
tc_awards.awardNominationId = nc_awards.awardNominationId
The title IDs for episodes that are part of a series can be found in the title essential dataset as part
of the episodeTitleIds array. To query this array it is necessary to flatten it into multiple rows
using CROSS JOIN in conjunction with the UNNEST operator. To include the title text it is necessary
to JOIN the title essential dataset.
select
tc_series.titleId,
tc_series.originalTitle,
tc_episode.titleId,
tc_episode.originalTitle,
tc_episode.episodeInfo.seasonNumber,
tc_episode.episodeInfo.episodeNumber
from
title_essential_v1 as tc_series
cross join
unnest(tc_series.seriesInfo.episodeTitleIds) as t(u_eti)
join
title_essential_v1 as tc_episode
on u_eti = tc_episode.titleId
where
tc_series.remappedTo is null
order by
tc_series.titleId,
tc_episode.episodeInfo.seasonNumber,
tc_episode.episodeInfo.episodeNumber
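As a cross-check for the query above, the series-to-episode lookup can be sketched in Python over records loaded from the title essential files. The records and the function name episodes_for_series are hypothetical, and sorting missing season or episode numbers last is a choice made for this sketch.

```python
# Hypothetical records: one series and two of its episodes.
titles = [
    {"titleId": "tt0000010", "originalTitle": "Example Series",
     "seriesInfo": {"episodeTitleIds": ["tt0000012", "tt0000011", "tt0000099"]}},
    {"titleId": "tt0000011", "originalTitle": "Pilot",
     "episodeInfo": {"seasonNumber": 1, "episodeNumber": 1,
                     "seriesTitleId": "tt0000010"}},
    {"titleId": "tt0000012", "originalTitle": "Second Episode",
     "episodeInfo": {"seasonNumber": 1, "episodeNumber": 2,
                     "seriesTitleId": "tt0000010"}},
]


def episodes_for_series(titles):
    """Return (seriesId, episodeId, season, episode) rows in viewing order."""
    titles_by_id = {t["titleId"]: t for t in titles}
    rows = []
    for series in titles:
        if series.get("remappedTo") is not None:
            continue
        episode_ids = (series.get("seriesInfo") or {}).get("episodeTitleIds") or []
        for episode_id in episode_ids:
            episode = titles_by_id.get(episode_id)
            if episode is None:
                continue  # inner join: the episode record is not in this dataset
            info = episode.get("episodeInfo") or {}
            rows.append((series["titleId"], episode["titleId"],
                         info.get("seasonNumber"), info.get("episodeNumber")))

    def sort_key(row):
        series_id, _, season, number = row
        # Missing season or episode numbers sort after known ones.
        return (series_id, season is None, season or 0, number is None, number or 0)

    rows.sort(key=sort_key)
    return rows


episode_rows = episodes_for_series(titles)
```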
The title ID for a series that an episode is part of can be found in the title essential dataset as part of the
episodeInfo structure. To include the title text it is necessary to JOIN the title essential dataset.
select
tc_episode.titleId,
tc_episode.originalTitle,
tc_series.titleId,
tc_series.originalTitle,
tc_episode.episodeInfo.seasonNumber,
tc_episode.episodeInfo.episodeNumber
from
title_essential_v1 as tc_episode
join
title_essential_v1 as tc_series
on tc_episode.episodeInfo.seriesTitleId = tc_series.titleId
where
tc_episode.remappedTo is null
order by
tc_episode.titleId
awardNominationId:string,
category:string,
event:string,
titles:array<
string
>,
winner:boolean,
year:bigint
>
>,
death struct<
deathCause:string,
deathDate:string,
deathLocation:string,
deathStatus:string
>,
filmography array<
struct<
attributes:array<
string
>,
billing:bigint,
category:string,
jobs:array<
string
>,
roles:array<
string
>,
titleId:string
>
>,
imdbUrl string,
knownFor array<
struct<
category:string,
titleId:string
>
>,
name string,
nameId string,
remappedTo string,
trademarks array<
string
>
)
row format serde 'org.openx.data.jsonserde.JsonSerDe'
location 's3://S3-BUCKET/S3-KEY/'
string
>,
country:string,
rating:string,
ratingsBody:string,
reason:string
>
>,
color array<
string
>,
companies struct<
distribution:array<
struct<
company:struct<
country:string,
id:string,
name:string
>,
countries:array<
string
>,
endYear:bigint,
formats:array<
string
>,
isUncredited:boolean,
startYear:bigint
>
>,
miscellaneous:array<
struct<
company:struct<
country:string,
id:string,
name:string
>
>
>,
production:array<
struct<
company:struct<
country:string,
id:string,
name:string
>,
isUncredited:boolean
>
>,
specialEffects:array<
struct<
company:struct<
country:string,
id:string,
name:string
>
>
>
>,
countries array<
string
>,
creditsByCategory array<
struct<
category:string,
credits:array<
struct<
attributes:array<
string
>,
billing:bigint,
category:string,
creditedAs:string,
jobs:array<
string
>,
nameId:string,
roles:array<
string
>
>
>
>
>,
episodeInfo struct<
episodeNumber:bigint,
seasonNumber:bigint,
seriesTitleId:string
>,
genres array<
string
>,
image struct<
height:bigint,
id:string,
url:string,
width:bigint
>,
imdbRating struct<
numberOfVotes:bigint,
rating:double
>,
imdbUrl string,
isAdult boolean,
keywords array<
struct<
category:string,
keyword:string,
votes:struct<
down:bigint,
up:bigint
>
>
>,
languages array<
string
>,
locations array<
struct<
interestScore:struct<
usersInterested:bigint,
usersVoted:bigint
>,
place:string,
scenes:array<
string
>
>
>,
movieConnections array<
struct<
text:string,
titleId:string,
type:string
>
>,
officialSiteLinks array<
struct<
linkTitle:string,
url:string
>
>,
originalTitle string,
plot string,
plotLong string,
plotMedium string,
plotShort string,
principalCastMembers array<
struct<
attributes:array<
string
>,
billing:bigint,
category:string,
creditedAs:string,
nameId:string,
roles:array<
string
>
>
>,
principalCrewMembers array<
struct<
attributes:array<
string
>,
category:string,
creditedAs:string,
job:string,
nameId:string
>
>,
productionStatus array<
struct<
date:string,
status:string
>
>,
releaseDates array<
struct<
country:string,
date:string
>
>,
remappedTo string,
runtimeMinutes bigint,
seriesInfo struct<
endYear:bigint,
episodeTitleIds:array<
string
>,
startYear:bigint
>,
taglines array<
string
>,
titleDisplay array<
struct<
country:string,
language:string,
title:string
>
>,
titleId string,
titleType string,
year bigint
)
row format serde 'org.openx.data.jsonserde.JsonSerDe'
location 's3://S3-BUCKET/S3-KEY/'