Intro

The document outlines a tutorial on building a Product Knowledge Graph, detailing the process from knowledge extraction to ontology mining and applications. It highlights challenges such as sparse data and complex domains while introducing Amazon's AutoKnow system for self-driving product knowledge collection. Key insights include leveraging customer behavior and graph structures to enhance product understanding and categorization.

All You Need to Know to Build a Product Knowledge Graph

Nasser Zalmout (Amazon), Chenwei Zhang (Amazon), Xian Li (Amazon), Yan Liang (Amazon), Xin Luna Dong (Amazon→Facebook)
Outline
Overview and Introduction 20 min
Knowledge Extraction 40 min
Knowledge Cleaning 25 min
Break 20 min
Ontology Mining 25 min
Applications 20 min
Conclusion and Future Directions 10 min
Overview and Introduction 20 min
Knowledge Extraction
Knowledge Cleaning
Q&A
Break
Ontology Mining
Applications
Conclusion and Future Directions
Q&A
Knowledge Graph Example for 2 Songs
[Figure: knowledge graph for two songs, “Shake it off” and “Love Story”. Entity nodes: mid127, mid128, mid129, mid345, mid346. Entity types: Recording, Genre. Relationship types: name, type, genre, artist, song_writer, birth_date. Literal values: “Pop”, “Dance-pop”, “Country pop”, “Taylor Swift”, “Taylor Alison Swift”, 12/13/1989. Legend: entity name, entity type, relationship type.]
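A knowledge graph like the one in this example is commonly stored as (subject, relationship, object) triples. The sketch below is illustrative only: the node and relationship names come from the figure, but which edge attaches to which node is an assumption made from the flattened figure text, and the helper names `build_index` and `recordings_by_artist` are hypothetical.

```python
# Triples (subject, relationship, object) for the two-song example.
# ASSUMPTION: the edge-to-node assignment below is a reading of the
# flattened figure, not something the slide text states explicitly.
TRIPLES = [
    ("mid127", "type", "Recording"),
    ("mid127", "name", "Shake it off"),
    ("mid127", "artist", "mid128"),
    ("mid127", "genre", "mid345"),
    ("mid129", "type", "Recording"),
    ("mid129", "name", "Love Story"),
    ("mid129", "artist", "mid128"),
    ("mid129", "song_writer", "mid128"),
    ("mid129", "genre", "mid346"),
    ("mid128", "name", "Taylor Swift"),
    ("mid128", "name", "Taylor Alison Swift"),
    ("mid128", "birth_date", "12/13/1989"),
    ("mid345", "type", "Genre"),
    ("mid345", "name", "Pop"),
    ("mid345", "name", "Dance-pop"),
    ("mid346", "type", "Genre"),
    ("mid346", "name", "Country pop"),
]


def build_index(triples):
    """Group triples as {subject: {relationship: [objects]}}."""
    index = {}
    for subject, relationship, obj in triples:
        index.setdefault(subject, {}).setdefault(relationship, []).append(obj)
    return index


def recordings_by_artist(index, artist_name):
    """Return sorted names of recordings whose artist entity has this name."""
    result = []
    for props in index.values():
        if "Recording" not in props.get("type", []):
            continue
        for artist in props.get("artist", []):
            if artist_name in index.get(artist, {}).get("name", []):
                result.extend(props.get("name", []))
    return sorted(result)


INDEX = build_index(TRIPLES)
print(recordings_by_artist(INDEX, "Taylor Alison Swift"))
```

Because both names of mid128 are stored as separate name triples, a lookup by either “Taylor Swift” or “Taylor Alison Swift” reaches the same entity.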
Product Graph Example for 2 Songs
[Figure: the same song graph extended with product nodes mid567, mid568, mid569, mid570, and mid571. New relationship types: product, ASIN. Product types: Release, Track. ASIN values: B0035QUXWQ, B0035QUXWR, B0067XLIG8, B0067XLIG4.]
Product Graph Example for 2 Products
Use Case I: Providing Information
Use Case II: Providing Choices
Use Case III: Improving Search
Use Case IV: Improving Recommendation
Product Graph vs. Knowledge Graph
[Figure: three candidate relationships, (A), (B), and (C), between a product graph (PG) and a generic KG; (C) is marked ✅. Labels in the final diagram: Generic KG (Movie, Music, Book, etc.) and Product KG (Hardline, softline, consumables, etc.).]
We focus on retail products in this tutorial.

But, Is The Problem Harder?


Challenges in Building Product Graph I
❑ Sparse and noisy structured data
Challenges in Building Product Graph II
❑ Extremely complex domains
❑ How to identify the millions of product types?
❑ How to organize types into a taxonomy tree?
[Figure: sellers’ view vs. buyers’ view]
Challenges in Building Product Graph III
❑ Big variety across product types
❑ Different attributes apply to different product types
❑ Different value vocabularies and different patterns
Scale Up in 3 Dimensions
Millions of categories
Thousands of attributes
Hundreds of languages
Big challenge: Limited training labels for large-scale, rich data
Can We Build A Self-Driving Product Understanding System?
Our Goal: Self-Driving Product Knowledge Collection
[Figure: AutoKnow takes a taxonomy, a catalog, and user logs as input and produces a product KG.]
Input taxonomy (top to bottom): Grocery; Snacks, Drinks; Candy
Input catalog:
Product    Type    Flavor  Color
Product 1  Snacks  Cherry
Product 2  Candy   ?       ?
Product 3  Candy   Choc.   Gold
Output product KG: taxonomy (top to bottom) Grocery; Snacks, Drinks; Pretzels, Candy. Prod. 1, Prod. 2, and Prod. 3 are attached by hasType, flavor, and color edges; the value nodes are Chocolate, Choc., and Gold, with a synonym edge between values.
Annotations in the second build of the slide: New Type, Corrected Value, New Value.
Dong et al., AutoKnow: Self-driving knowledge collection for products of thousands of types, SigKDD, 2020.
Amazon AutoKnow: Self-Driving Product Knowledge Collection
Results:
• #Types ↑ 3X
• Defect rate ↓ up to 68 percentage points
Architecture:
• Input Data: PT Taxonomy; Catalog; Behavioral Signals (e.g., search logs, reviews, Q&A)
• Ontology Suite: Taxonomy Enrichment; Relation Discovery
• Data Suite: Data Imputation; Data Cleaning; Synonym Discovery
• Broad Graph: Ontology; {product, attribute, value}; {value, synonym, value}
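The two triple shapes in the Broad Graph, {product, attribute, value} and {value, synonym, value}, can be combined so that synonymous values collapse to one canonical form. The following is a minimal sketch under stated assumptions: the toy triples loosely follow the slide's candy example, the function names are hypothetical, and the rule of keeping the longer string as the canonical value is an arbitrary choice for illustration, not AutoKnow's method.

```python
# ASSUMPTION: toy data loosely based on the slide's example.
PRODUCT_TRIPLES = [
    ("Prod. 1", "flavor", "Cherry"),
    ("Prod. 2", "flavor", "Chocolate"),
    ("Prod. 3", "flavor", "Choc."),
    ("Prod. 3", "color", "Gold"),
]
SYNONYM_TRIPLES = [("Chocolate", "synonym", "Choc.")]


def canonical_map(synonym_triples):
    """Map each value seen in a synonym triple to one representative."""
    parent = {}

    def find(value):
        parent.setdefault(value, value)
        while parent[value] != value:
            parent[value] = parent[parent[value]]  # path halving
            value = parent[value]
        return value

    for left, _, right in synonym_triples:
        root_l, root_r = find(left), find(right)
        if root_l == root_r:
            continue
        # Illustrative rule: the longer (then alphabetically later) string wins.
        if (len(root_l), root_l) >= (len(root_r), root_r):
            parent[root_r] = root_l
        else:
            parent[root_l] = root_r
    return {value: find(value) for value in list(parent)}


def normalize(product_triples, synonym_triples):
    """Rewrite every value to its canonical synonym, if it has one."""
    canon = canonical_map(synonym_triples)
    return [(p, a, canon.get(v, v)) for p, a, v in product_triples]


def products_with(triples, attribute, value):
    """Return sorted products that carry the given attribute value."""
    return sorted(p for p, a, v in triples if a == attribute and v == value)


NORMALIZED = normalize(PRODUCT_TRIPLES, SYNONYM_TRIPLES)
print(products_with(NORMALIZED, "flavor", "Chocolate"))
```

After normalization, a query for the flavor “Chocolate” also returns the product whose catalog value was “Choc.”.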
Self Driving to Navigate a Large Space
• Automatic: Fully ML-based
• Annotation free: Weak learning based on existing Catalog data and user behavior
• One-size-fits-all: Few taxonomy-aware models
• Self guidance: Identify important attributes and categories to focus efforts
Key Intuition I. Learning with Limited Labels Generated from Existing Catalog Data
Taxonomy (top to bottom): Grocery; Snacks, Drinks; Candy
Catalog:
Product    Type    Flavor  Color
Product 1  Snacks  Cherry
Product 2  Candy   ?       ?
Product 3  Candy   Choc.   Gold
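One common way to generate labels from existing catalog data is distant supervision: wherever a catalog value appears in a product's text, the matching tokens are tagged with that attribute. The sketch below illustrates the idea with BIO tags. The product title, the attribute values, and the function names are hypothetical, and this is not necessarily how AutoKnow produces its training labels.

```python
def bio_tags(tokens, attribute, value):
    """Tag the first case-insensitive occurrence of `value` in `tokens`."""
    value_tokens = value.lower().split()
    lowered = [t.lower() for t in tokens]
    tags = ["O"] * len(tokens)
    n = len(value_tokens)
    if n == 0:
        return tags
    for i in range(len(tokens) - n + 1):
        if lowered[i:i + n] == value_tokens:
            tags[i] = f"B-{attribute}"
            for j in range(i + 1, i + n):
                tags[j] = f"I-{attribute}"
            break
    return tags


def weak_labels(title, catalog_row):
    """Combine tags for every known attribute value in one catalog row.

    Missing values (None) are skipped; earlier attributes win on overlap.
    """
    tokens = title.split()
    tags = ["O"] * len(tokens)
    for attribute, value in catalog_row.items():
        if not value:
            continue
        for i, tag in enumerate(bio_tags(tokens, attribute, value)):
            if tag != "O" and tags[i] == "O":
                tags[i] = tag
    return list(zip(tokens, tags))


# ASSUMPTION: a made-up title and catalog row, for illustration only.
row = {"Flavor": "Dark Chocolate", "Color": "Gold"}
for token, tag in weak_labels("Gold Foil Dark Chocolate Coins", row):
    print(token, tag)
```

Labels produced this way are noisy (a value can be wrong in the catalog or appear in the text with another meaning), which is why they count as weak supervision rather than ground truth.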
Key Intuition II. Rich Customer Behavior
[Figure: the product KG (taxonomy Grocery; Snacks, Drinks; Pretzels, Candy; Prod. 1, Prod. 2, and Prod. 3 with hasType, flavor, color, and synonym edges; values Chocolate, Choc., Gold) connected to customer behavior signals: mention edges, co-purchase and co-view edges among the products, and purchase edges linking Query 1, Query 2, and Query 3 to the products.]

Key Intuition III. Product Categories as First-Class Citizen in Modeling
[Figure: taxonomy tree (Grocery; Snacks, Drinks; Pretzels, Candy) with hasType edges from Prod. 1, Prod. 2, and Prod. 3 to their types.]
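Treating categories as first-class objects means a model can look up a product's type and every ancestor of that type. The sketch below shows one simple representation, a child-to-parent map. It assumes that Pretzels and Candy sit under Snacks, which the flattened figure suggests but does not state, and it takes the product types from the catalog table earlier in the deck; the function names are hypothetical.

```python
# ASSUMPTION: Pretzels and Candy are children of Snacks.
PARENT = {
    "Snacks": "Grocery",
    "Drinks": "Grocery",
    "Pretzels": "Snacks",
    "Candy": "Snacks",
}

# hasType edges, taken from the catalog table shown earlier in the deck.
HAS_TYPE = {"Product 1": "Snacks", "Product 2": "Candy", "Product 3": "Candy"}


def path_to_root(product_type, parent=PARENT):
    """Return the type followed by all of its ancestors up to the root."""
    path = [product_type]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


def products_under(category, has_type=HAS_TYPE, parent=PARENT):
    """Return sorted products whose type is `category` or a descendant."""
    return sorted(
        product
        for product, product_type in has_type.items()
        if category in path_to_root(product_type, parent)
    )


print(path_to_root("Candy"))
print(products_under("Snacks"))
```

With the ancestor path available as a feature, one model can serve many product types instead of training a separate model per type.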


Key Intuition IV. Leverage Graph Structure
• Taxonomy is a tree
• KG is a graph
• Customer behavior forms a graph
[Figure: the taxonomy tree (Grocery; Snacks, Drinks; Pretzels, Candy), the product KG (Prod. 1, Prod. 2, and Prod. 3 with hasType, flavor, color, and synonym edges; values Chocolate, Choc., Gold), and the behavior graph (co-purchase and co-view edges among the products; purchase edges linking Query 1, Query 2, and Query 3 to the products).]
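To make the value of the behavior graph concrete, the sketch below guesses a missing product type by majority vote over co-purchase and co-view neighbors. This is a deliberately simple stand-in for graph-based learning, not the technique used in AutoKnow; the edges, the known types, the extra product "Prod. 4", and the function names are all hypothetical.

```python
from collections import Counter

# ASSUMPTION: made-up behavior edges; "Prod. 4" is not in the slides.
BEHAVIOR_EDGES = [
    ("Prod. 2", "co-purchase", "Prod. 1"),
    ("Prod. 2", "co-view", "Prod. 3"),
    ("Prod. 2", "co-view", "Prod. 4"),
]
KNOWN_TYPES = {"Prod. 1": "Pretzels", "Prod. 3": "Candy", "Prod. 4": "Candy"}


def neighbors(edges, node):
    """Return nodes connected to `node`, treating edges as undirected."""
    found = []
    for left, _, right in edges:
        if left == node:
            found.append(right)
        elif right == node:
            found.append(left)
    return found


def vote_type(edges, known_types, product):
    """Return the most common type among typed neighbors, or None.

    Ties are broken alphabetically so the result is deterministic.
    """
    votes = Counter(
        known_types[n] for n in neighbors(edges, product) if n in known_types
    )
    if not votes:
        return None
    ranked = sorted(votes.items(), key=lambda item: (-item[1], item[0]))
    return ranked[0][0]


print(vote_type(BEHAVIOR_EDGES, KNOWN_TYPES, "Prod. 2"))
```

Two of the three neighbors of Prod. 2 are Candy, so the vote returns Candy; a product with no typed neighbors gets no guess.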


Key Intuition V. Leverage Different Modals
Key Techniques in AutoKnow
Deliver the Data Business
[Figure: a large number built up across successive slides. The leading 1 is labeled with high-precision models; each later build appends zeros and adds one of the remaining items below.]
• High-precision models
• E2E pipeline + AutoML to reduce modeling cost
• Scale-up to reduce #models: 1000s categories, 100s attributes, 10s languages
• Higher yield from multi-modal models
Tutorial Structure
[Figure: the AutoKnow architecture (Input Data, Ontology Suite, Data Suite, Broad Graph) annotated with tutorial sections.]
• Sec 2: Data Imputation
• Sec 3: Data Cleaning
• Sec 4: Ontology Suite
• Sec 5: Applications
Section Structure
• Problem Definition
What are unique challenges for PG beyond generic KGs?
• Short answer -- key intuition
What are key intuitions for building product KGs?
• Long answer -- details
What are practical tips?
• Reflection/short-answer
Can we apply the techniques to other domains?
Key Questions We Answer in This Tutorial
• Q1. What are the unique challenges in building a product knowledge graph, and what are the solutions?
• Q2. Are these techniques applicable to building knowledge graphs for other domains?
• Q3. What are practical tips for taking this to production?
