Perspectives on Taxonomy, Classification, Structure and Find-ability

Thoughts from the Consortium for Service Innovation, a work in progress by Greg Oxton,
John Chmaj and David Kay.

“We are all in pursuit of relevance.”

– John Chmaj

The goal of using taxonomy, classification schemes and structure is to improve our ability to
find relevant content in a large collection of content and to improve our ability to learn from
the patterns and trends that emerge from a large collection of content without manually
reviewing each piece of content.

The Goal—Complement (Not Replace) the Human Mind

The human mind has an amazing ability to make connections and sense out of information,
observations and experiences that to a machine (or program) would appear to be totally
unrelated. This is in part because of the subtleties of context and our mental ability to infer,
interpolate, extrapolate and interpret our experiences and interactions at multiple, non-linear
levels of conceptualization. In addition to multi-layered conceptualization, the human mind
benefits from all five senses. A smell, a sound (or song), a sensation or feeling can each, or in
combination, trigger a connection to experience. Machines struggle to deal with
multi-dimensional concepts and non-linear relationships, and they are largely limited to a single
“sense” in the form of text. Machines are at a severe disadvantage because of the imprecise
nature of the language we use to represent thoughts and concepts. At the same time, the human
mind does not have complete or perfect recall of all that we know; machines do.

The goal is to create a knowledge practice that complements what individuals know with
the collective experience of the organization or community. To accomplish this, we are
seeking to create content that is good enough to be findable and usable by a target audience.
Taxonomy, classification and structure all play a role in find-ability.

The Difference Between Taxonomy, Classification and Structure:

• Taxonomies attempt to be exhaustive; they position all known things in a
hierarchy and/or relationship map. One key point here is that taxonomies have
evolved from organizing the relationship between physical things (tangibles) like
living things or the earth’s fundamental elements. We are now applying
taxonomies to abstract things (intangibles) like the meaning of words and
phrases or concepts. Abstract things by their very nature have an element of
ambiguity that makes it hard to apply the same structures, definitions or
relationship mapping that works well for the physical world.
• Classification is a way to organize things or objects into categories or buckets
based on similarity. Classifications are generally not exhaustive. In practice, the
categories are often predetermined. Generally, they have one or two dimensions
at best. That is, the categories are exclusive: an object can be in category A or B
but not both.
• Structure (as we talk about it in KCS terms) is a simpler, less rigid and less absolute way
to organize information; it does not require that we anticipate all the potential
things that might fit into a level of the structure. Structure in the KCS model
gives the words and phrases some context or role that improves readability. If
the search engine uses the structure, it can also help find-ability.

Taxonomy (a definition): a structure for classification, nomenclature to describe a
catalogue, a relationship map

A wiki on taxonomy: http://en.wikipedia.org/wiki/Taxonomy

Taxonomy (examples):
• The classification of living things (e.g.
domain/kingdom/phylum/class/order/family/genus/species)
• The classification of the chemical elements: the periodic table
• Computers—network/system/device/module/part/component
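
The "Computers" example above can be held as a nested mapping. The sketch below is only an illustration: the TAXONOMY dictionary and the find_path helper are hypothetical names, not part of any knowledge tool, and the hierarchy is simply the one listed above.

```python
# Hypothetical nested mapping of the "Computers" hierarchy listed above.
TAXONOMY = {
    "network": {
        "system": {
            "device": {
                "module": {"part": {"component": {}}}
            }
        }
    }
}

def find_path(tree, target, trail=()):
    """Return the chain of levels leading to `target`, or None if it is not in the map."""
    for name, children in tree.items():
        here = trail + (name,)
        if name == target:
            return here
        found = find_path(children, target, here)
        if found:
            return found
    return None
```

find_path(TAXONOMY, "module") gives the position of "module" in the hierarchy. A term that was never placed in the map, such as "router", returns None, which hints at why a taxonomy that tries to be exhaustive needs ongoing maintenance.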

Taxonomy and knowledge: taxonomies used in knowledge tools are often very
detailed relationship maps for the meaning of words and sometimes concepts. The
content is tagged programmatically such that it aligns with or fits the map.
When searches are done, the search engine uses the map to identify

Observations on Manual Classification of Knowledge

“A sad note: it seems the smaller the target document set a KB technology is set up for, the more
likely they are to use category filtering as a primary relevance mechanism. Lazy and sad.”
– David Kay

Classification of content at a certain level can be helpful in improving the relevance of search
response:
• The degree to which classification helps is a function of how distinct or separate
the domains of content really are. The more distinct or separate, the more value
in segmenting them.
• This is not exactly the same as helping users find what they need. Rather, we are
excluding content that we anticipate they do not need.
• High level classification of content plays a part, but must be balanced with other
system and content functions in the ‘pursuit of relevance.’
• Experience has shown that in some environments scoping is essential at the first
or second iteration to arrive at a reasonable subset of content to
operate against in more detail. Categories for products, general issues, often
content types, and several other potential dimensions can greatly facilitate the
initial scoping of relevant resources to help address an issue. The
problem people get into is overdoing it: trying to get classification schemes to
make the full bridge to the subtle, slippery ways in which people try to articulate
and resolve problems. So, classification schemes can only go so far, but for
some environments, they are essential.

At the risk of oversimplifying, here are the three key things we do at knowledge discovery
time with categories:

1. Filter—remove results that aren't in the right part of the category tree
2. Browse—navigate through content based on categories
3. Rank—move matching categories higher up a results list

Done deftly, in theory, category-based ranking is hard to disagree with. In practice, because
users will generally only look at a page or two of results (if that), ranking morphs into
filtering, so it has to be used as an extra-credit sort of influence, and gently. As we have
learned the hard way, if a user gives us five words and we find four of them pretty close
together in the document's title, we’d better return that document on top, even if our
category smartness tells us something else is better.
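
One way to read "extra credit" is as a small additive boost on top of the text match. The following is a minimal sketch under that assumption. The scoring function, the boost value of 0.1 and the sample documents are all invented for illustration, and the simple word-overlap measure ignores how close together the words appear.

```python
def text_score(query, title):
    """Fraction of query words that appear in the title (0.0 to 1.0)."""
    q = set(query.lower().split())
    t = set(title.lower().split())
    return len(q & t) / len(q) if q else 0.0

def rank(query, user_category, docs, category_boost=0.1):
    """Order titles by text match, nudged (not overridden) by a category match."""
    scored = []
    for doc in docs:
        score = text_score(query, doc["title"])
        if doc["category"] == user_category:
            score += category_boost  # gentle influence only
        scored.append((score, doc["title"]))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for _, title in scored]

# Hypothetical documents: the first matches the words, the second matches the category.
docs = [
    {"title": "Install Linksys router on Windows XP", "category": "networking"},
    {"title": "Router firmware overview", "category": "routers"},
]
ranked = rank("install linksys router windows xp", "routers", docs)
```

With a small boost, the document whose title matches all five query words stays on top even though the other document sits in the user's category. A large boost would in effect turn ranking into filtering.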

Similarly, browsing may be more or less useful, but at least it is benign. The customer is in
charge, and as long as the categories map onto the user's view of the world at some level
(they have good "scent"), there is nothing but good in this use of categories, subject to the
accuracy of the classification. Users are not going to use this much in huge document
sets, so it is appropriately self-limiting.

The danger really comes with filtering. We need to focus on this point in a statement of best
practices. Using filtering as the key way of driving relevance is incredibly dangerous; this is
why the homegrown parametric search engines built on Access or something similar fail
miserably as tech support or service desk tools. The same goes for decision trees, which
effectively do the same thing. Here again we have to acknowledge that there are some
environments (particularly fixed or static ones) where decision trees have value. (For a great
decision tree application, check out playing 20 questions with Darth Vader at
http://www.sithsense.com/flash.htm.) Unfortunately, most of the support environments we
have encountered are extremely dynamic, and the effort of keeping tree structures relevant
and maintained makes them unsuitable.

The problem is that if there is any mismatch at all between how the user and the
organization view an aspect of the problem, filtering fails. Silently. We have no idea that we
just filtered out a relevant result. At the point of search, we do not know the root cause.
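
The silent failure can be shown with a toy search. In this sketch the documents, category names and the search function are hypothetical. The point is only that a hard category filter drops a matching document without any signal to the user.

```python
# Hypothetical knowledge base: document 2 was filed under a category the user did not expect.
DOCS = [
    {"id": 1, "title": "Router drops wireless connection", "category": "networking"},
    {"id": 2, "title": "Wireless connection lost after driver update", "category": "operating system"},
]

def search(query, docs, category=None):
    """Return ids of documents sharing a word with the query, optionally pre-filtered by category."""
    terms = set(query.lower().split())
    hits = []
    for doc in docs:
        if category is not None and doc["category"] != category:
            continue  # excluded before matching; nothing tells the user this happened
        if terms & set(doc["title"].lower().split()):
            hits.append(doc["id"])
    return hits

unfiltered = search("wireless connection", DOCS)
filtered = search("wireless connection", DOCS, category="networking")
```

The unfiltered search returns both documents. The filtered search returns only the first, and if the driver update is the real cause, the relevant answer has been removed with no indication that anything is missing.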

On the other hand, there are circumstances where filtering is useful. Users are capable of
knowing the primary product line they are dealing with. Users can know, with some past
experience, which content source is most relevant for their search. And, organizations can
know this stuff with confidence as well. (E.g., version numbers to which a fix applies.)

Certainly, a best practice for filtering (except for gross product family) is to make it available
after the search results appear. (Think search, then browse, the way Yahoo! generally
worked in its earliest days.)
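
A rough sketch of "search, then browse": run the search unfiltered, show how the results break down by category, and let the user narrow afterwards. The helper names and sample results are assumptions made for illustration.

```python
from collections import Counter

def facet_counts(results):
    """Count results per category so the categories can be offered as optional filters."""
    return Counter(doc["category"] for doc in results)

def narrow(results, category):
    """Apply a filter the user chose after seeing the full result list."""
    return [doc for doc in results if doc["category"] == category]

# Hypothetical unfiltered search results.
results = [
    {"title": "Router drops wireless connection", "category": "networking"},
    {"title": "Wireless connection lost after driver update", "category": "operating system"},
    {"title": "Wireless printer offline", "category": "networking"},
]
counts = facet_counts(results)
os_only = narrow(results, "operating system")
```

Because the counts are computed from results the user has already seen, choosing a category here is a visible decision by the user and not a silent exclusion by the system.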

So, categories are not bad, but filtering based on categories certainly can be. In a small KB
(1000 docs? 3000 docs?), we simply do not need it. In a huge KB (100,000s), it can be
helpful, but it needs to be used very thoughtfully based on the environment and an
understanding of the context of the users who will be searching. The context of the user is
often hard to anticipate, so we need a process of continuously improving the categories and
filters by paying close attention to users’ experience.

Additional considerations about classification

• The risk of predefining categories…at what level of detail or accuracy can we
anticipate the structure or relationships of things we do not yet know? Wouldn’t
it be better to let the content self-organize?
• Manual classification of incidents or solutions by support agents: if we have
enough categories for the data to be actionable, we have too many for the analyst
to deal with. If we have a number of categories that the support analyst can deal
with (5-7 items), we do not have enough specificity to be actionable.
• Manual classification is not multi-dimensional; it does not support the numerous
ways in which content can be related.

Automation Opportunities:
• Programmatically tagging content (creating metadata) based on a predefined
taxonomy improves search results and allows us to organize the content in
multi-dimensional ways. This is particularly valuable for unstructured content. The
taxonomy requires maintenance as new associations are identified.
• Data mining tools detect patterns in content based on the content itself.
These emerging technologies can organize content based on what it is, without a
predefined taxonomy. This allows the content to self-organize in ways that we
might not have anticipated.
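
Programmatic tagging against a predefined taxonomy can be sketched as a lookup from known terms to taxonomy nodes. The term map and node paths below are invented for illustration. A real knowledge tool would use a far richer map and matching logic.

```python
import re

# Hypothetical map from surface terms to nodes in a predefined taxonomy.
TERM_TO_NODE = {
    "router": "hardware/network/router",
    "linksys": "vendor/linksys",
    "windows xp": "software/os/windows-xp",
    "apple": "vendor/apple",
}

def tag(text):
    """Return the sorted taxonomy nodes whose terms occur as whole words in the text."""
    lowered = text.lower()
    nodes = set()
    for term, node in TERM_TO_NODE.items():
        if re.search(r"\b" + re.escape(term) + r"\b", lowered):
            nodes.add(node)
    return sorted(nodes)

tags = tag("Cannot install Linksys router on Windows XP")
```

A single piece of content picks up several tags at once (hardware, vendor, operating system), which is the multi-dimensional organization that manual classification into one bucket does not give us. Any term missing from the map is simply not tagged, which is why the map needs maintenance.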

Thoughts on Structure (from the KCS practices)

A format to give content a little bit of context:
• Problem/Question
• Environment
• Fix/Answer
• Cause (optional)
• Metadata (date created, last modified, number of times used, life cycle state)
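
As a sketch, the format above could be represented as a simple record. The class name, field names and the "draft" placeholder state are assumptions made for illustration. They are not taken from the KCS practices.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

@dataclass
class Article:
    """One knowledge article in the problem/environment/fix format."""
    problem: str                      # Problem/Question
    environment: List[str]            # Environment
    fix: str                          # Fix/Answer
    cause: Optional[str] = None       # Cause (optional)
    # Metadata
    created: date = field(default_factory=date.today)
    last_modified: date = field(default_factory=date.today)
    times_used: int = 0
    life_cycle_state: str = "draft"   # placeholder state name

example = Article(
    problem="install router",
    environment=["Linksys", "Apple", "Windows XP"],
    fix="(steps that resolved the issue)",
)
```

Keeping the problem, environment and fix in separate fields is what gives each word a role that a search engine can later use.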

Searching and Find-ability

• Google searches for the existence of words in a blob of text. If we search for “install
Linksys router in network with Apple and Windows XP,” Google will respond with the
most frequently referenced documents that include any of those words (e.g. “install
windows XP” or “install windows emulator on Apple PC”) without regard to the role the
words play.
• Rather than searching a blob of text, find-ability is improved if we search problem
statements against problem statements and environment against environment:
  • Problem: install router
  • Environment: Linksys, Apple, Windows XP
The response will only be issues related to installing a router in this environment.
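
The difference between the two kinds of search can be sketched in a few lines. This is a toy comparison under stated assumptions: the two articles, the helper names and the strict "all words must match" rule are invented for illustration and do not describe how any real search engine works.

```python
def words(text):
    """Lowercase a string and split it into a set of words, ignoring commas."""
    return set(text.lower().replace(",", " ").split())

def blob_match(query, article):
    """Blob search: any shared word anywhere in the article counts as a hit."""
    q = words(query["problem"] + " " + query["environment"])
    a = words(article["problem"] + " " + article["environment"])
    return bool(q & a)

def field_match(query, article):
    """Fielded search: problem words must be in the problem, environment words in the environment."""
    return (words(query["problem"]) <= words(article["problem"])
            and words(query["environment"]) <= words(article["environment"]))

# Hypothetical articles in the problem/environment structure.
articles = [
    {"id": "A", "problem": "install router", "environment": "Linksys, Apple, Windows XP"},
    {"id": "B", "problem": "install Windows XP", "environment": "Apple"},
]
query = {"problem": "install router", "environment": "Linksys, Apple, Windows XP"}

blob_hits = [a["id"] for a in articles if blob_match(query, a)]
field_hits = [a["id"] for a in articles if field_match(query, a)]
```

The blob search returns both articles because "install", "Windows", "XP" and "Apple" all occur somewhere in article B. The fielded search returns only article A, because in B "Windows XP" is the thing being installed and not part of the environment.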

A few challenges with this approach

1. Most content is not in the KCS structure, in which case we need one
approach to search well-structured content and a different approach to
search unstructured content.
2. Aligning the search engine with the structure cannot be absolute; it has
to be a balance.
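
One possible way to strike that balance is to blend a field-aware score with a plain text score, and to fall back to the plain text score for content that has no structure. This is a sketch only: the function names, the 0.6 weight and the sample documents are assumptions, not a recommended configuration.

```python
def overlap(query_text, doc_text):
    """Fraction of query words found in the document text (0.0 to 1.0)."""
    q = set(query_text.lower().split())
    d = set(doc_text.lower().split())
    return len(q & d) / len(q) if q else 0.0

def blended_score(query, doc, field_weight=0.6):
    """Mix field-to-field matching with whole-text matching; use whole text alone if unstructured."""
    whole_query = query["problem"] + " " + query["environment"]
    if "problem" in doc:  # structured article
        whole_doc = doc["problem"] + " " + doc["environment"]
        field_score = (overlap(query["problem"], doc["problem"])
                       + overlap(query["environment"], doc["environment"])) / 2
        return field_weight * field_score + (1 - field_weight) * overlap(whole_query, whole_doc)
    return overlap(whole_query, doc["text"])  # unstructured content

query = {"problem": "install router", "environment": "linksys windows xp"}
structured = {"problem": "install router", "environment": "linksys apple windows xp"}
unstructured = {"text": "how to install a linksys router"}

structured_score = blended_score(query, structured)
unstructured_score = blended_score(query, unstructured)
```

Both kinds of content end up on the same 0-to-1 scale, so unstructured documents can still appear in the results, while structured articles that match role for role score higher.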
