Knowledge Management Complete Notes

Knowledge Management (BCA-5001)

Question Bank
UNIT – 1
Q1 Define Data, Information, Knowledge
Ans.
Data: Data is a collection of facts, measurements, observations, or words that can be used to generate
information. Data can include the number of people in a country, the value of sales of a product, or the number
of times a country has won a cricket match.
• Structured Data: This type of data is organized into a specific format, making it easy to search, analyze and process. Structured data is found in relational databases and includes information like numbers, dates and categories.
• Unstructured Data: Unstructured data does not conform to a specific structure or format. It may include text documents, images, videos, and other data that is not easily organized or analyzed without additional processing.

Data can be classified as qualitative or quantitative. Qualitative data captures subjective qualities, while
quantitative data is numerical and can be measured.
Information: Information is data that has been processed, organized, or structured in a way that makes it meaningful, valuable and useful. It is data that has been given context, relevance and purpose. It provides knowledge, understanding and insights that can be used for decision-making, problem-solving, communication and various other purposes.

Knowledge: A mix of contextual information, experiences, rules, and values. Knowledge is richer and deeper
than information and more valuable because someone has thought deeply about that information and added their
own unique experience, judgment, and wisdom.
Knowledge can be gained and accumulated as “information combined with experience, context, interpretation,
reflection and is highly contextual”.
It is a high-value form of information that is ready for application to decision and actions within organizations.
Knowledge is increasingly being viewed as a commodity or an intellectual asset. It possesses some contradictory
characteristics that are radically different from those of other valuable commodities.
Types of Knowledge
1. Tacit knowledge 2. Explicit knowledge
Tacit Knowledge (Implicit Knowledge)
a. The word tacit means understood and implied without being stated.
b. Tacit knowledge is unique and cannot be explained clearly.
c. It is knowledge that people possess but find difficult to express.
d. The cognitive skills of an employee are a classic example of tacit knowledge.
e. Tacit knowledge is personal and varies depending upon the education, attitude and perception of the individual.
f. It can be impossible to articulate because tacit knowledge may even be subconscious.
g. Tacit knowledge is also subjective in character.
Explicit Knowledge
The word explicit means stated clearly and in detail, without any room for confusion. Explicit knowledge is easy to articulate and is not subjective. It is not unique and does not differ between individuals. It is impersonal and easy to share with others.

Data | Information | Knowledge
Is objective | Should be objective | Is subjective
Has no meaning | Has a meaning | Has meaning for a specific purpose
Is unprocessed | Is processed | Is processed and understood
Is quantifiable; there can be data overload | Is quantifiable; there can be information overload | Is not quantifiable; there is no knowledge overload

Q 2- What do you understand by Decision support system? What are the components of DSS? Discuss the
characteristics of DSS.
DSS: A decision support system (DSS) is an interactive computer-based application that combines data and mathematical models to help decision makers solve complex problems faced in managing public and private enterprises and organizations.
A properly designed DSS is an interactive software-based system intended to help decision makers compile useful information from raw data, documents, personal knowledge, and/or business models to identify and solve problems and make decisions.

The decision-making process: It includes five phases:


Intelligence: In the intelligence phase the task of the decision maker
is to identify, circumscribe and explicitly define the problem that
emerges in the system under study.
Design: In the design phase actions aimed at solving the identified
problem should be developed and planned.
Choice: Once the alternative actions have been identified, it is
necessary to evaluate them on the basis of the performance criteria

deemed significant. Mathematical models and the corresponding solution methods usually play a valuable role
during the choice phase.
Implementation: When the best alternative has been selected by the decision maker, it is transformed into actions by means of an implementation plan. This involves assigning responsibilities and roles to all those involved in the action plan.
Control: Once the action has been implemented, it is finally necessary to verify and check that the original
expectations have been satisfied and the effects of the action match the original intentions.
A DSS contains three elements:
Data: contains the database.
Models: a repository (collection) of mathematical models.
Interface: a module for handling the dialogue between the system and the users.

Components of DSS:
Data management: includes the database required to make decisions.
Model management: a collection of mathematical models derived from operations research, statistics and financial analysis.
Interaction: takes input from the user, typically in graphical form through a browser, and presents the information and knowledge generated by the system.
Knowledge management: allows decision makers to draw on various forms of collective knowledge.
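To make these components concrete, here is a minimal, hypothetical Python sketch (all data, figures and names are invented for illustration, not taken from the notes) showing how a small data store, a model repository and an interface module fit together:

# Hypothetical sketch of three DSS components: data, models, interface.

# Data management: a tiny in-memory "database" of product figures.
database = {
    "product_a": {"fixed_cost": 50000.0, "unit_cost": 12.0, "unit_price": 20.0},
}

# Model management: a repository of simple mathematical models.
def break_even_units(fixed_cost, unit_cost, unit_price):
    """Units that must be sold before profit becomes non-negative."""
    return fixed_cost / (unit_price - unit_cost)

models = {"break_even": break_even_units}

# Interface: handles the dialogue between the user and the system.
def ask(product, model_name):
    record = database[product]
    model = models[model_name]
    result = model(record["fixed_cost"], record["unit_cost"], record["unit_price"])
    return f"{product}: {model_name} = {result:.0f} units"

print(ask("product_a", "break_even"))  # product_a: break_even = 6250 units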
Features of Decision Support System:
Effectiveness: It should help knowledge workers to reach more effective decisions.

Mathematical models: mathematical models are applied to the data contained in data marts and the data warehouse.
Integration in the decision-making process: decision makers should be able to integrate a DSS into their own decision-making process, adapting it to their needs rather than passively accepting what comes out of it.
Organizational role: DSS operate at different hierarchical levels within an enterprise.
Flexibility: A DSS must be flexible and adaptable in order to incorporate the changes required to reflect
modifications in the environment or in the decision-making process.
Advantages of a DSS:
• Enable informed decision-making. By taking multiple different data sources into account, DSS can
facilitate better, up-to-date and informed decisions.
• Consider different outcomes. DSS consider different business outcomes, as possible decisions are based
on current and historical company data.
• Increase efficiency. DSS automate the analysis of large data sets.
• Provide better collaboration. DSS tools might also include communication and collaboration features.
• Enable flexibility. DSS can be used by many different industries.
• Handle complexity. DSS can handle complex problems that have multiple interdependencies and
variables.
Disadvantages of a DSS:
• Cost. Expenses for developing, implementing and maintaining DSS can be high, which can limit their
use by smaller organizations.
• Dependence. Developing an over-reliance on a DSS eventually takes away from the subjectivity
involved in decision-making.
• Complexity. DSS must consider all aspects of a given problem, which requires a lot of data. They can
also be complex to design and implement.
• Security. Data that DSS use might involve sensitive or critical data, meaning that an increased focus on
security is required.

Types of decision
Strategic decisions: Decisions are strategic when they affect the entire organization, or at least a substantial part of it, for a long period of time. They strongly influence the general objectives and policies of an enterprise and are taken at a higher organizational level, usually by the company's top management.
Tactical decision: Tactical decisions affect only parts of an enterprise and are usually restricted to a single
department. The time span is limited to a medium-term horizon, typically up to a year. Made by middle
managers.
Operational decision: Operational decisions are framed within the elements and conditions determined by
strategic and tactical decisions. They are usually made at a lower organizational level, by knowledge workers
responsible for a single activity or task such as sub-department heads, workshop foremen, back-office heads.

Q 3- What do you understand by Group Decision support system? What are the components of GDSS?
Discuss the characteristics of GDSS.
Ans. A Group Decision Support System (GDSS) is an information system that is designed to support decisions made by groups in an organization. The main aim of GDSS is to facilitate group communication and foster learning.
learning. A GDSS is helpful in situations involving meeting scheduling and documentation; brainstorming;
group discussions; visioning; planning; team building; etc. It enables users or group members to solve complex
problems, formulate detailed plans and proposals, manage conflicts, and effectively prioritize activities. GDSS helps group members not only make better decisions but also complete tasks more efficiently.

Components of GDSS
There are three fundamental types of components that compose GDSSs:
1. Software
The software part may consist of the following components: databases and database management capabilities,
user/system interface with multi-user access, specific applications to facilitate group decision-makers’ activities,
and modeling
capabilities.
2. Hardware
The hardware part may consist of the following components: I/O devices, PCs or workstations, individual monitors for each participant or a public screen for the group, and a network to link participants to each other.
3. People
The people may include the decision-making participants and/or a facilitator. A facilitator is a person who directs the group through the planning process.
Advantages of GDSS
1. Increased efficiency: Increasing computer processing power and communication/network performance improve the speed and quality of information processing and transmission, creating the opportunity for higher efficiency. Efficiency gains depend on the performance of hardware (e.g., PCs, LAN/WAN) and software.
2. Improved quality: In a GDSS, the outcome of a meeting or decision-making process depends on
communication facilities and decision support facilities. Those facilities can help decision-making
participants avoid the constraints imposed by geography. They also make information sharable and
reduce effort in the decision-making process. Therefore, those facilities contribute to meeting quality
improvement.
3. Leverage that improves the way meetings run: Leverage implies that the system does not merely speed
up the process (say efficiency), but changes it fundamentally. In other words, leverage can be achieved
through providing better ways of meeting, such as providing the ability to execute multiple tasks at the
same time.

Q 4- What is Executive information system? How does decision support system help in business?
Executive support systems are intended to be used by senior managers directly, to provide support for non-programmed decisions in strategic management.

This information is often external, unstructured and even uncertain. The exact scope and context of such information is often not known beforehand.

This information is intelligence based:

• Market intelligence
• Investment intelligence
• Technology intelligence
Examples of Intelligent Information
Following are some examples of intelligent information, which is often the source of an EIS:

• External databases
• Technology reports like patent records etc.
• Technical reports from consultants
• Market reports
• Confidential information about competitors
• Speculative information like market conditions
• Government policies
• Financial reports and information
Advantages of EIS

• Easy for upper-level executives to use


• Ability to analyze trends
• Augmentation of managers' leadership capabilities
• Enhance personal thinking and decision-making
• Contribution to strategic control flexibility
• Enhance organizational competitiveness in the
market place
• Instruments of change
• Increased executive time horizons.
• Better reporting system
• Improved mental model of business executive
• Help improve consensus building and
communication
• Improve office automation
• Reduce time for finding information
• Early identification of company performance
• Detail examination of critical success factor
• Better understanding

• Time management
• Increased communication capacity and quality
Disadvantage of EIS

• Functions are limited


• Hard to quantify benefits
• Executive may encounter information overload
• System may become slow
• Difficult to keep current data
• May lead to less reliable and insecure data
• Excessive cost for small company

Q 5- What is groupware technology? Discuss different groupware technologies. How is Groupware Design
Different from Traditional User Interface Design?

Groupware is technology designed to facilitate the work of groups. This technology may be used to
communicate, cooperate, coordinate, solve problems, compete, or negotiate. While traditional technologies like
the telephone qualify as groupware, the term is ordinarily used to refer to a specific class of technologies relying
on modern computer networks, such as email, newsgroups, videophones, or chat.

An organisation or a team might use groupware to work more efficiently in the following ways:
• Enables collaboration and communication on projects irrespective of employees' location
• Minimises misunderstanding and errors in the workplace
• Helps in the efficient management of workflows and processes
• Ensures efficient information management in the organisation

Types of Groupware
Apart from ensuring transparency and productivity in teams, these tools ensure communication, conferencing
and coordination among team members. Here are two types of groupware:
1. Synchronous groupware
2. Asynchronous groupware

Synchronous groupware
Synchronous groupware consists of tools that allow multiple users to contribute to a project in real time. Examples of synchronous groupware include video conferencing, chat systems, support systems and shared whiteboards. Collaborating in real time helps organisations create and manage tasks. Synchronous groupware also supports group meetings between professionals working on different teams, which can help companies train multiple professionals simultaneously while reducing the resources a team manager uses.

Asynchronous groupware
Asynchronous groupware allows multiple users to contribute to projects at different times. It supports services such as file sharing, structured messages, collaborative writing and email handling. A team might prefer asynchronous groupware because it maximises the time people spend working on different projects. These tools allow team members to access information and data from any location and at any time over an internet connection, collaborating on shared data and making modifications independently.

Advantages of using groupware
Groupware helps people work collaboratively while located in different places. Here are some benefits of using groupware:
• Fosters creativity: Groupware fosters creativity among different users and enables team members to use
new ideas to improve the workflow and process.
• Facilitates communication: Groupware facilitates communication between team members through
chats, video conferencing, instant messaging and emails. Using communication tools, team members can
discuss issues before they become significant.
• Manages multiple tasks: Groupware helps manage multiple tasks, professionals and teams, making it
easier to understand goals at every level of the organisation. Knowing the goals at multiple levels can
increase production because everyone understands the task to complete.
• Provides structure: Groupware provides a structure which allows team members to view the goals and
purpose and set up schedules. This helps ensure team members can acquire important information,
compare notes and exchange ideas with others.
• Helps save documents: Using groupware, teams have an option to save documents like faxes, emails
and spreadsheets. This allows users to access files from anywhere using an internet connection.
• Saves travel costs: With groupware, organisations can save travel costs because employees do not need to travel to different locations to conduct meetings. It also ensures team members can attend meetings while working from home.
Disadvantages of using groupware
Though groupware offers many benefits, it also has the following disadvantages:
• Expensive: Buying a subscription to these tools can be costly. An organisation might also incur training
costs while implementing these tools in the organisation.
• Dependent on a server: While most groupware tools are beneficial, some might be unreliable because
they depend on one server. When the server is down, no one can use the tool.
• Does not allow non-verbal communication: Groupware tools do not allow non-verbal communication between team members. As a result, team members might not form strong professional relationships with each other.
• Promotes overdependence on particular groupware: An organisation might depend upon one tool
because of the security issues involved and the cost of training. This dependence on one groupware might
be problematic in the long term.

Applications of groupware technology:


There are several types of groupware applications. Comparing those design options across applications yields
interesting new perspectives on well-known applications. Also, in many cases, these systems can be used
together, and in fact, are intended to be used in conjunction. For example, group calendars are used to schedule
videoconferencing meetings, multi-player games use live video and chat to communicate, and newsgroup
discussions spawn more highly-involved interactions in any of the other systems.

Email is by far the most common groupware application (besides, of course, the traditional telephone). While
the basic technology is designed to pass simple messages between 2 people, even relatively basic email systems
today typically include interesting features for forwarding messages, filing messages, creating mailing groups,
and attaching files with a message.

Newsgroups and mailing lists are similar in spirit to email systems except that they are intended for messages
among large groups of people instead of 1-to-1 communication. In practice the main difference between
newsgroups and mailing lists is that newsgroups only show messages to a user when they are explicitly requested
(an “on-demand” service), while mailing lists deliver messages as they become available (an “interrupt-driven”
interface).

Workflow systems allow documents to be routed through organizations via a relatively fixed process. Workflow systems may provide features such as routing, development of forms, and support for differing roles and privileges.

Group calendars allow scheduling, project management, and coordination among many people, and may
provide support for scheduling equipment as well. Typical features detect when schedules conflict or find
meeting times that will work for everyone.

Shared whiteboards allow two or more people to view and draw on a shared drawing surface even from
different locations. This can be used, for instance, during a phone call, where each person can jot down notes
(e.g., a name, phone number, or map) or to work collaboratively on a visual problem.

Video communications systems allow two-way or multi-way calling with live video, essentially a telephone
system with an additional visual component. Cost and compatibility issues limited early use of video systems to
scheduled videoconference meeting rooms.

Chat systems permit many people to write messages in real time in a public space. As each person submits a
message, it appears at the bottom of a scrolling screen. Chat groups are usually formed by having a listing of
chat rooms by name, location, number of people, topic of discussion, etc.

Multi-player games have always been reasonably common in arcades, but are becoming quite common on the
internet. Many of the earliest electronic arcade games were multi-user, for example, Pong, Space Wars, and car
racing games.

Electronic questionnaire: It is used by group members for planning meetings, and determining crucial issues
and related information for decision-making. By using electronic questionnaires, groups can acquire the required
information for finding optimal solutions and making effective decisions.

UNIT -II
Q 6- What is an expert system? Explain the different components of an expert system, along with its advantages and disadvantages.

Ans An Expert System (ES) is a computer-based system that mimics the decision-making ability of a human
expert in a specific domain or field. It uses knowledge representation, inference mechanisms, and user interfaces
to provide expert-level advice, solutions, or recommendations.

Components of an Expert System:

1. Knowledge Base (KB): Stores the expertise and knowledge of the domain, represented in a structured format (e.g., rules, frames, semantic networks).

2. Inference Engine (IE): Draws conclusions from the knowledge base, using reasoning mechanisms (e.g., forward chaining, backward chaining).

3. User Interface (UI): Enables users to interact with the ES, providing input, receiving output, and explaining results.

4. Knowledge Acquisition System (KAS): Helps experts transfer their knowledge to the KB.

5. Explanation Facility: Provides transparency into the ES's decision-making process.

6. Working Memory: Temporary storage for data and intermediate results.

Types of Expert Systems:

1. Rule-Based Systems: Use if-then rules to represent knowledge (a small inference sketch follows this list).


2. Frame-Based Systems: Organize knowledge into frames or objects.
3. Case-Based Reasoning Systems: Store and retrieve cases to solve problems.
4. Neural Network-Based Systems: Use artificial neural networks.
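As a rough illustration of the rule-based type above, the following Python sketch shows forward chaining over if-then rules; the rules and facts here are invented, not taken from any real system:

# Minimal forward-chaining inference over if-then rules (hypothetical rules).
rules = [
    ({"fever", "cough"}, "flu_suspected"),        # IF fever AND cough THEN flu_suspected
    ({"flu_suspected", "short_breath"}, "refer_to_doctor"),
]

def forward_chain(facts, rules):
    facts = set(facts)                  # working memory
    changed = True
    while changed:                      # keep firing rules until nothing new is derived
        changed = False
        for conditions, conclusion in rules:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)   # fire the rule: assert its conclusion
                changed = True
    return facts

print(forward_chain({"fever", "cough", "short_breath"}, rules))
# derives flu_suspected first, which then lets refer_to_doctor fire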

Applications of Expert Systems:

1. Medical Diagnosis
2. Financial Planning
3. Engineering Design
4. Troubleshooting
5. Decision Support Systems
6. Robotics
7. Natural Language Processing

Advantages of Expert Systems:

1. Improved decision-making
2. Increased efficiency
3. Enhanced accuracy
4. Consistency
5. Scalability
6. Reduced costs

Limitations of Expert Systems:

1. Knowledge acquisition challenges

2. Complexity
3. Maintenance difficulties
4. Explanation limitations
5. Limited domain expertise
Q7- Define Data Warehouse. Explain characteristics of data warehouse. Why do we need data
warehouse?
Ans Data Warehouse: A Data Warehouse (DW) is a centralized repository that stores data from various sources
in a single location, making it easier to access, analyze, and report data. It is a database specifically designed for
querying and analyzing data, rather than transactional processing.

Characteristics of Data Warehouse:

1. Integrated: Data from multiple sources is integrated into a single repository.


2. Time-variant: Data is stored with a time dimension, allowing for historical analysis.
3. Non-volatile: Once loaded, data is rarely updated or deleted.
4. Subject-oriented: Data is organized around specific business subjects (e.g., customers, products).
5. Structured: Data is organized in a predefined schema.
6. Scalable: Designed to handle large volumes of data.
7. Accessible: Data is easily accessible for querying and analysis.

Need a Data Warehouse:


1. Improved Decision-making: A DW provides a single, unified view of the organization, enabling better
decision-making.
2. Enhanced Business Intelligence: A DW supports business intelligence tools, enabling data analysis,
reporting, and visualization.
3. Increased Efficiency: A DW reduces data redundancy, improves data quality, and simplifies data access.
4. Better Data Management: A DW provides data governance, security, and compliance.
5. Faster Query Performance: A DW optimizes query performance, reducing response times.
6. Historical Analysis: A DW stores historical data, enabling trend analysis and forecasting.
7. Supports Advanced Analytics: A DW supports advanced analytics, such as data mining, predictive analytics,
and machine learning.

Advantages of Data Warehouse:


1. Improved business decision-making
2. Enhanced customer insights
3. Increased operational efficiency
4. Better data governance
5. Improved data quality
6. Faster query performance
7. Support for advanced analytics

Common Data Warehouse Applications:

1. Business Intelligence (BI)


2. Data Mining
3. Predictive Analytics
4. Customer Relationship Management (CRM)
5. Supply Chain Management (SCM)
6. Financial Analysis
7. Marketing Automation

What is Data Warehousing?
Data warehousing is the process of constructing and using a data warehouse. A data warehouse is constructed by
integrating data from multiple heterogeneous sources that support analytical reporting, structured and/or ad hoc
queries, and decision making. Data warehousing involves data cleaning, data integration, and data consolidation.

Using Data Warehouse Information


There are decision support technologies that help utilize the data available in a data warehouse. These
technologies help executives to use the warehouse quickly and effectively. They can gather data, analyze it, and
take decisions based on the information present in the warehouse. The information gathered in a warehouse can
be used in any of the following domains:
• Tuning Production Strategies - The product strategies can be well tuned by repositioning the products and
managing the product portfolios by comparing the sales quarterly or yearly.
• Customer Analysis - Customer analysis is done by analyzing the customer's buying preferences, buying time,
budget cycles, etc.
• Operations Analysis - Data warehousing also helps in customer relationship management, and making
environmental corrections. The information also allows us to analyze business operations.
Process Flow in Data Warehouse
There are four major processes that contribute to a data warehouse:
• Extract and load the data.
• Cleaning and transforming the data.
• Backup and archive the data.
• Managing queries and directing them to the appropriate data sources

Extract and Load Process
Data extraction takes data from the source systems. Data load takes the extracted data and loads it into the data warehouse.
Note: Before loading the data into the data warehouse, the information extracted from the external sources must be reconstructed.
Controlling the Process
Controlling the process involves determining when to start data extraction and the consistency check on data. The controlling process ensures that the tools, the logic modules, and the programs are executed in the correct sequence and at the correct time.
When to Initiate Extract
Data needs to be in a consistent state when it is extracted, i.e., the data warehouse should represent a single,
consistent version of the information to the user.
For example, in a customer profiling data warehouse in the telecommunication sector, it is illogical to merge the list of customers at 8 pm on Wednesday from a customer database with the customer subscription events up to 8 pm on Tuesday. This would mean that we are finding customers for whom there are no associated subscriptions.

Loading the Data
After extracting the data, it is loaded into a temporary data store where it is cleaned up and made consistent.
Note: Consistency checks are executed only when all the data sources have been loaded into the temporary data store.
Clean and Transform Process
Once the data is extracted and loaded into the temporary data store, it is time to perform Cleaning and
Transforming. Here is the list of steps involved in Cleaning and Transforming:
• Clean and transform the loaded data into a structure
• Partition the data
• Aggregation
Clean and Transform the Loaded Data into a Structure
Cleaning and transforming the loaded data helps speed up the queries. It can be done by making the data
consistent:
• within itself.
• with other data within the same data source.
• with the data in other source systems.
• with the existing data present in the warehouse.
Transforming involves converting the source data into a structure. Structuring the data increases the query
performance and decreases the operational cost. The data contained in a data warehouse must be transformed to
support performance requirements and control the ongoing operational costs.
Partition the Data
Partitioning optimizes hardware performance and simplifies the management of the data warehouse. Here we partition each fact table into multiple separate partitions.
Aggregation
Aggregation is required to speed up common queries. Aggregation relies on the fact that most common queries
will analyze a subset or an aggregation of the detailed data.
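As a sketch of this aggregation step, the following pandas snippet (all column names and figures are invented) pre-computes a summary from detailed fact rows, the kind of aggregate that common queries can reuse instead of re-scanning the detail:

import pandas as pd

# Detailed fact rows as they might look after cleaning and transformation.
facts = pd.DataFrame({
    "region":  ["north", "north", "south", "south"],
    "quarter": ["Q1", "Q2", "Q1", "Q2"],
    "sales":   [120.0, 150.0, 90.0, 110.0],
})

# Pre-computed aggregate: total sales per region, stored once and reused
# by common queries instead of re-scanning the detailed data.
agg_sales_by_region = facts.groupby("region", as_index=False)["sales"].sum()
print(agg_sales_by_region)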

Backup and Archive the Data


In order to recover the data in the event of data loss, software failure, or hardware failure, it is necessary to keep
regular backups. Archiving involves removing the old data from the system in a format that allows it to be
quickly restored whenever required.
For example, in a retail sales analysis data warehouse, it may be required to keep data for 3 years, with the latest 6 months of data being kept online. In such a scenario, there is often a requirement to be able to do month-on-month comparisons for this year and last year. In this case, we require some data to be restored from the archive.

Query Management Process


This process performs the following functions:
• manages the queries.
• helps speed up the execution time of queries.
• directs the queries to their most effective data sources.
• ensures that all the system sources are used in the most effective way.
• monitors actual query profiles.
The information generated in this process is used by the warehouse management process to determine which
aggregations to generate. This process does not generally operate during the regular load of information into data
warehouse.

Three-Tier Data Warehouse Architecture


Generally, a data warehouse adopts a three-tier architecture. Following are the three tiers of the data warehouse architecture.

• Bottom Tier - The bottom tier of the architecture is the data warehouse database server. It is the relational
database system. We use the back-end tools and utilities to feed data into the bottom tier. These backend tools
and utilities perform the Extract, Clean, Load, and refresh functions.

• Middle Tier - In the middle tier, we have the OLAP Server that can be implemented in either of the following
ways.
o By Relational OLAP (ROLAP), which is an extended relational database management system. The ROLAP
maps the operations on multidimensional data to standard relational operations.
o By Multidimensional OLAP (MOLAP) model, which directly implements the multidimensional data and
operations.
• Top Tier - This tier is the front-end client layer. It holds the query tools, reporting tools, analysis tools and data mining tools.

Functions of Data Warehouse Tools and Utilities


The following are the functions of data warehouse tools and utilities:
• Data Extraction - Involves gathering data from multiple heterogeneous sources.
• Data Cleaning - Involves finding and correcting the errors in data.
• Data Transformation - Involves converting the data from legacy format to warehouse format.
• Data Loading - Involves sorting, summarizing, consolidating, checking integrity, and building indices and
partitions.
• Refreshing - Involves updating from data sources to warehouse.
Note: Data cleaning and data transformation are important steps in improving the quality of data and data mining
results.

Metadata
Metadata is simply defined as data about data. The data that are used to represent other data is known as
metadata. For example, the index of a book serves as a metadata for the contents in the book. In other words, we
can say that metadata is the summarized data that leads us to the detailed data.
In terms of a data warehouse, we can define metadata as follows:

• Metadata is a roadmap to data warehouse.
• Metadata in data warehouse defines the warehouse objects.
• Metadata acts as a directory. This directory helps the decision support system to locate the contents of a data
warehouse.
Metadata Repository
Metadata repository is an integral part of a data warehouse system. It contains the following metadata:
• Business metadata - It contains the data ownership information, business definition, and changing policies.
• Operational metadata - It includes currency of data and data lineage. Currency of data refers to the data
being active, archived, or purged. Lineage of data means history of data migrated and transformation applied on
it.
• Data for mapping from the operational environment to the data warehouse - This metadata includes the source databases and their contents, data extraction, data partitioning, cleaning, transformation rules, and data refresh and purging rules.
• The algorithms for summarization - It includes dimension algorithms, data on granularity, aggregation,
summarizing, etc.

Data Cube
A data cube helps us represent data in multiple dimensions. It is defined by dimensions and facts. The
dimensions are the entities with respect to which an enterprise preserves the records.

Data Mart
Data marts contain a subset of organization-wide data that is valuable to specific groups of people in an organization. In other words, a data mart contains only the data that is specific to a particular group. For example, the marketing data mart may contain only data related to items, customers, and sales. Data marts are confined to subjects.

Points to Remember About Data Marts


• Windows-based or Unix/Linux-based servers are used to implement data marts. They are implemented on low-
cost servers.
• The implementation cycle of a data mart is measured in short periods of time, i.e., in weeks rather than months
or years.
• The life cycle of data marts may be complex in the long run, if their planning and design are not organization-
wide.
• Data marts are small in size.
• Data marts are customized by department.
• The source of a data mart is a departmentally structured data warehouse.
• Data marts are flexible.

Online Analytical Processing Server (OLAP)
Online Analytical Processing (OLAP) is based on the multidimensional data model. It allows managers and analysts to get an insight into the information through fast, consistent, and interactive access to information. This section covers the types of OLAP servers and the OLAP operations.
Types of OLAP Servers
We have four types of OLAP servers:
• Relational OLAP (ROLAP)
• Multidimensional OLAP (MOLAP)
• Hybrid OLAP (HOLAP)
• Specialized SQL Servers

Relational OLAP
ROLAP servers are placed between relational back-end server and client frontend tools. To store and manage
warehouse data, ROLAP uses relational or extended-relational DBMS.
ROLAP includes the following:
• Implementation of aggregation navigation logic.
• Optimization for each DBMS back-end.
• Additional tools and services.

Points to Remember
• ROLAP servers are highly scalable.
• ROLAP tools analyze large volumes of data across multiple dimensions.
• ROLAP tools store and analyze highly volatile and changeable data.

Relational OLAP Architecture


ROLAP includes the following components:
• Database server
• ROLAP server
• Front-end tool

Advantages
• ROLAP servers can be easily used with existing RDBMS.
• Data can be stored efficiently, since zero facts (empty cells) need not be stored.
• ROLAP tools do not use pre-calculated data cubes.
• The DSS server of MicroStrategy adopts the ROLAP approach.
Disadvantages
• Poor query performance.
• Some limitations of scalability depending on the technology architecture that is utilized.

Multidimensional OLAP
MOLAP uses array-based multidimensional storage engines for multidimensional views of data. With
multidimensional data stores, the storage utilization may be low if the dataset is sparse. Therefore, many
MOLAP servers use two levels of data storage representation to handle dense and sparse datasets.

Points to Remember
• MOLAP tools process information with consistent response time regardless of level of summarizing or
calculations selected.
• MOLAP tools need to avoid many of the complexities of creating a relational database to store data for
analysis.
• MOLAP tools need fastest possible performance.
• A MOLAP server adopts two levels of storage representation to handle dense and sparse datasets.
• Denser sub-cubes are identified and stored as array structures.
• Sparse sub-cubes employ compression technology.

MOLAP Architecture
MOLAP includes the following components:
• Database server
• MOLAP server
• Front-end tool

Advantages
• MOLAP allows the fastest indexing to the pre-computed summarized data.
• Helps users connected to a network who need to analyze larger, less-defined data.
• Easier to use, therefore MOLAP is suitable for inexperienced users.
Disadvantages
• MOLAP is not capable of containing detailed data.
• The storage utilization may be low if the data set is sparse.

Hybrid OLAP
Hybrid OLAP is a combination of both ROLAP and MOLAP. It offers the higher scalability of ROLAP and the faster computation of MOLAP. HOLAP servers allow storing large volumes of detailed data; the aggregations are stored separately in a MOLAP store.

Specialized SQL Servers
Specialized SQL servers provide advanced query language and query processing support for SQL queries over star and snowflake schemas in a read-only environment.
OLAP Operations
Since OLAP servers are based on multidimensional view of data, we will discuss OLAP operations in
multidimensional data.
Here is the list of OLAP operations:
• Roll-up
• Drill-down
• Slice and dice
• Pivot (rotate)
Roll-up
Roll-up performs aggregation on a data cube in any of the following ways:
• By climbing up a concept hierarchy for a dimension
• By dimension reduction

• Roll-up is performed by climbing up a concept hierarchy for the dimension location.
• Initially the concept hierarchy was "street < city < province < country".
• On rolling up, the data is aggregated by ascending the location hierarchy from the level of city to the level of country.
• The data is grouped into countries rather than cities.
• When roll-up is performed, one or more dimensions from the data cube are removed.

Drill-down
Drill-down is the reverse operation of roll-up. It is performed by either of the following ways:
• By stepping down a concept hierarchy for a dimension
• By introducing a new dimension
• Drill-down is performed by stepping down a concept hierarchy for the dimension time.
• Initially the concept hierarchy was "day < month < quarter < year."
• On drilling down, the time dimension is descended from the level of quarter to the level of month.
• When drill-down is performed, one or more dimensions from the data cube are added.
• It navigates the data from less detailed data to highly detailed data.

Slice
The slice operation selects one particular dimension from a given cube and provides a new sub-cube.
• Here, slice is performed for the dimension "time" using the criterion time = "Q1".
• It forms a new sub-cube by fixing a value on the selected dimension.

Dice
Dice selects two or more dimensions from a given cube and provides a new sub-cube. For example, a dice operation based on the following selection criteria involves three dimensions:
• (location = "Toronto" or "Vancouver")
• (time = "Q1" or "Q2")
• (item = "Mobile" or "Modem")

Pivot
The pivot operation is also known as rotation. It rotates the data axes in view in order to provide an alternative presentation of the data.
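A rough pandas sketch of the four operations on a toy cube (the dimension values echo the Toronto/Vancouver example above; the sales figures are invented):

import pandas as pd

# Toy cube: dimensions (location, time, item) and one measure (sales).
cube = pd.DataFrame({
    "location": ["Toronto", "Toronto", "Vancouver", "Vancouver"],
    "time":     ["Q1", "Q2", "Q1", "Q2"],
    "item":     ["Mobile", "Modem", "Mobile", "Modem"],
    "sales":    [605, 825, 1087, 940],
})

# Roll-up: aggregate away the item dimension (dimension reduction).
rollup = cube.groupby(["location", "time"], as_index=False)["sales"].sum()

# Slice: fix one dimension with a single criterion, time = "Q1".
slice_q1 = cube[cube["time"] == "Q1"]

# Dice: select on two or more dimensions at once (mirrors the criteria above).
dice = cube[cube["location"].isin(["Toronto", "Vancouver"])
            & cube["time"].isin(["Q1", "Q2"])]

# Pivot (rotate): swap the axes of the presentation.
pivot = cube.pivot_table(index="location", columns="time",
                         values="sales", aggfunc="sum")
print(pivot)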


1. What do you understand by group decision support system? What are the components of GDSS?
2. What is executive information system? How does decision support system help in business? Explain.
3. Explain the term groupware technologies in detail.
4. Define business expert system in detail along with its benefits.
5. Explain in detail data warehousing tools and utilities functions.
6. What is data warehouse? What are the goals of a data warehouse?
7. How are data marts different from data warehouse?
8. Differentiate Datamart and Warehouse.
9. Define Metadata.

10. Differentiate OLAP and OLTP.
11. Explain the three-tier architecture of a data warehouse.
12. Differentiate MOLAP and ROLAP.
13. Write all the operations of OLAP
14. Define the architecture of ROLAP and MOLAP
15. Write all the Process Flow in Data Warehousing.
UNIT-III
Multi-Dimensional Analysis:
Data mining and knowledge discovery,
Data mining and Techniques,
Data mining of Advanced Databases.
What is knowledge discovery?
What are data mining technologies?
What are data mining functionalities?
What do you understand by frequent pattern mining?
Classification vs Clustering.

Section B or C
What are the various steps of knowledge discovery? Discuss the role of data mining in knowledge discovery.
Explain with a diagrammatic illustration the steps involved in the process of knowledge discovery from a database.
What is the need of data mining? Explain different types of data mining techniques.
What is the need of data mining? Discuss the evolution of database system technologies.
Explain various methods for evaluating the accuracy of classification or prediction.
Describe classification and prediction. Discuss methods regarding classification.
Describe the Apriori algorithm for frequent pattern mining.

1. Data Mining Terminology
Data Mining
Data mining is defined as extracting information from a huge set of data. In other words, data mining is mining knowledge from data. This information can be used for any of the following applications −

• Market Analysis
• Fraud Detection
• Customer Retention
• Production Control
• Science Exploration
Data Mining Engine
The data mining engine is essential to the data mining system. It consists of a set of functional modules that perform the following functions −

• Characterization

• Association and Correlation Analysis
• Classification
• Prediction
• Cluster analysis
• Outlier analysis
• Evolution analysis
Knowledge Base
This is the domain knowledge. This knowledge is used to guide the search or evaluate the interestingness of the
resulting patterns.

Knowledge Discovery
Some people treat data mining the same as knowledge discovery, while others view data mining as an essential step in the process of knowledge discovery. Here is the list of steps involved in the knowledge discovery process −

• Data Cleaning
• Data Integration
• Data Selection
• Data Transformation
• Data Mining
• Pattern Evaluation
• Knowledge Presentation
User interface
The user interface is the module of the data mining system that handles communication between users and the data mining system. The user interface allows the following functionalities −

• Interacting with the system by specifying a data mining query task.
• Providing information to help focus the search.
• Mining based on intermediate data mining results.
• Browsing database and data warehouse schemas or data structures.
• Evaluating mined patterns.
• Visualizing the patterns in different forms.
Data Integration
Data Integration is a data preprocessing technique that merges the data from multiple heterogeneous data sources
into a coherent data store. Data integration may involve inconsistent data and therefore needs data cleaning.

Data Cleaning
Data cleaning is a technique that is applied to remove the noisy data and correct the inconsistencies in data. Data
cleaning involves transformations to correct the wrong data. Data cleaning is performed as a data preprocessing
step while preparing the data for a data warehouse.

Data Selection
Data Selection is the process where data relevant to the analysis task are retrieved from the database. Sometimes
data transformation and consolidation are performed before the data selection process.

Clusters
Cluster refers to a group of similar kind of objects. Cluster analysis refers to forming group of objects that are
very similar to each other but are highly different from the objects in other clusters.

Data Transformation
In this step, data is transformed or consolidated into forms appropriate for mining, by performing summary or
aggregation operations.

2. Data Mining Applications


Here is the list of areas where data mining is widely used −

• Financial Data Analysis


• Retail Industry
• Telecommunication Industry
• Biological Data Analysis
• Other Scientific Applications
• Intrusion Detection
Financial Data Analysis
The financial data in the banking and financial industry is generally reliable and of high quality, which facilitates systematic data analysis and data mining. Some of the typical cases are as follows −

• Design and construction of data warehouses for multidimensional data analysis and data mining.

• Loan payment prediction and customer credit policy analysis.

• Classification and clustering of customers for targeted marketing.

• Detection of money laundering and other financial crimes.

Retail Industry
Data mining has great application in the retail industry because it collects large amounts of data on sales, customer purchasing history, goods transportation, consumption and services. It is natural that the quantity of data collected will continue to expand rapidly because of the increasing ease, availability and popularity of the web.

Data mining in the retail industry helps in identifying customer buying patterns and trends, which leads to improved quality of customer service and good customer retention and satisfaction. Here is a list of examples of data mining in the retail industry −

• Design and Construction of data warehouses based on the benefits of data mining.

• Multidimensional analysis of sales, customers, products, time and region.

• Analysis of effectiveness of sales campaigns.

• Customer Retention.

• Product recommendation and cross-referencing of items.

Telecommunication Industry
Today the telecommunication industry is one of the most emerging industries, providing various services such as fax, pager, cellular phone, internet messenger, images, e-mail, web data transmission, etc. Due to the development of new computer and communication technologies, the telecommunication industry is rapidly expanding. This is the reason why data mining has become very important in helping to understand the business.

Data mining in the telecommunication industry helps in identifying telecommunication patterns, catching fraudulent activities, making better use of resources, and improving quality of service. Here is a list of examples for which data mining improves telecommunication services −

• Multidimensional Analysis of Telecommunication data.

• Fraudulent pattern analysis.

• Identification of unusual patterns.

• Multidimensional association and sequential patterns analysis.

• Mobile Telecommunication services.

• Use of visualization tools in telecommunication data analysis.

Biological Data Analysis
In recent times, we have seen tremendous growth in fields of biology such as genomics, proteomics, functional genomics and biomedical research. Biological data mining is a very important part of bioinformatics. Following are the aspects in which data mining contributes to biological data analysis −

• Semantic integration of heterogeneous, distributed genomic and proteomic databases.

• Alignment, indexing, similarity search and comparative analysis of multiple nucleotide sequences.

• Discovery of structural patterns and analysis of genetic networks and protein pathways.

• Association and path analysis.

• Visualization tools in genetic data analysis.

Other Scientific Applications


The applications discussed above tend to handle relatively small and homogeneous data sets for which statistical techniques are appropriate. Huge amounts of data have been collected from scientific domains such as geosciences, astronomy, etc. Large data sets are being generated by fast numerical simulations in various fields such as climate and ecosystem modeling, chemical engineering, fluid dynamics, etc.
Following are the applications of data mining in the field of Scientific Applications −

• Data Warehouses and data preprocessing.


• Graph-based mining.
• Visualization and domain specific knowledge.
Intrusion Detection
Intrusion refers to any kind of action that threatens the integrity, confidentiality, or availability of network resources. In this world of connectivity, security has become a major issue. The increased usage of the internet and the availability of tools and tricks for intruding into and attacking networks have prompted intrusion detection to become a critical component of network administration. Here is a list of areas in which data mining technology may be applied for intrusion detection −

• Development of data mining algorithm for intrusion detection.

• Association and correlation analysis, aggregation to help select and build discriminating attributes.

• Analysis of Stream data.

• Distributed data mining.

• Visualization and query tools.

3. What is Knowledge Discovery?
Data mining is an essential step in the process of knowledge discovery. Here is the list of steps involved in the knowledge discovery process −

• Data Cleaning − In this step, the noise and inconsistent data is removed.

• Data Integration − In this step, multiple data sources are combined.

• Data Selection − In this step, data relevant to the analysis task are retrieved from the database.

• Data Transformation − In this step, data is transformed or consolidated into forms appropriate for mining
by performing summary or aggregation operations.

• Data Mining − In this step, intelligent methods are applied in order to extract data patterns.

• Pattern Evaluation − In this step, data patterns are evaluated.

• Knowledge Presentation − In this step, knowledge is represented.

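A minimal sketch, assuming pandas and a toy dataset, of how the seven steps might map onto code (all column names and figures are invented for illustration):

import pandas as pd

# One hypothetical pass through the knowledge discovery steps.
sales = pd.DataFrame({"age": [23.0, 35.0, None, 45.0], "spend": [200, 550, 300, 800]})
crm   = pd.DataFrame({"age": [23.0, 35.0, 45.0], "segment": ["low", "mid", "high"]})

clean       = sales.dropna()                      # 1. data cleaning: drop noisy/missing rows
integrated  = clean.merge(crm, on="age")          # 2. data integration: combine sources
selected    = integrated[["age", "spend"]]        # 3. data selection: keep task-relevant columns
transformed = selected.assign(spend_k=selected["spend"] / 1000)   # 4. transformation: rescale
pattern     = transformed["age"].corr(transformed["spend_k"])     # 5. data mining: extract a pattern
interesting = abs(pattern) > 0.5                  # 6. pattern evaluation
print(f"age-spend correlation {pattern:.2f}, interesting: {interesting}")  # 7. presentation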

4. Data Mining Tasks
Data mining deals with the kinds of patterns that can be mined. On the basis of the kind of data to be mined, there are two categories of functions involved in data mining −
1. Descriptive
2. Classification & prediction
Descriptive Function
The descriptive function deals with the general properties of data in the database. Here is the list of descriptive
functions −

• Class/Concept Description
• Mining of Frequent Patterns
• Mining of Associations
• Mining of Correlations
• Mining of Clusters
Class/Concept Description
Class/Concept refers to the data to be associated with classes or concepts. For example, in a company, the classes of items for sale include computers and printers, and concepts of customers include big spenders and budget spenders. Such descriptions of a class or a concept are called class/concept descriptions. These descriptions can be derived in the following two ways −

• Data Characterization − This refers to summarizing the data of the class under study. The class under study is called the Target Class.

• Data Discrimination − It refers to the mapping or classification of a class with some predefined group or
class.

Mining of Frequent Patterns


Frequent patterns are those patterns that occur frequently in transactional data. Here is the list of kind of frequent
patterns −

• Frequent Item Set − It refers to a set of items that frequently appear together, for example, milk and
bread.

• Frequent Subsequence − A sequence of patterns that occur frequently, such as purchasing a camera followed by a memory card.

• Frequent Substructure − Substructure refers to different structural forms, such as graphs, trees, or lattices, which may be combined with itemsets or subsequences.

Mining of Associations
Associations are used in retail sales to identify items that are frequently purchased together. Association mining refers to the process of uncovering the relationships among data and determining association rules.

For example, a retailer generates an association rule showing that 70% of the time milk is sold with bread, and only 30% of the time biscuits are sold with bread.
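The 70%/30% figures above are confidence values. A short sketch of how support and confidence for a rule such as bread => milk would be computed from raw transactions (the transactions here are toy data):

# Computing support and confidence for the rule {bread} -> {milk}.
transactions = [
    {"bread", "milk"}, {"bread", "milk"}, {"bread", "biscuits"},
    {"bread", "milk", "biscuits"}, {"milk"},
]

n = len(transactions)
bread      = sum(1 for t in transactions if "bread" in t)
bread_milk = sum(1 for t in transactions if {"bread", "milk"} <= t)

support    = bread_milk / n      # fraction of all transactions containing both items
confidence = bread_milk / bread  # of the transactions with bread, fraction also having milk
print(f"support={support:.2f}, confidence={confidence:.2f}")  # support=0.60, confidence=0.75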

Mining of Correlations
It is a kind of additional analysis performed to uncover interesting statistical correlations between associated-
attribute−value pairs or between two item sets to analyze that if they have positive, negative or no effect on each
other.

Mining of Clusters
Cluster refers to a group of similar kind of objects. Cluster analysis refers to forming group of objects that are
very similar to each other but are highly different from the objects in other clusters.

Classification & prediction: Classification is the process of finding a model (or function) that describes and
distinguishes data classes or concepts, for the purpose of being able to use the model to predict the class of
objects whose class label is unknown.
The derived model is based on the analysis of a set of training data (i.e. data object whose class label is known.)
The derived model may be represented in various forms, such as classification(if-then) rule, decision tree, neural
network.

There are two forms of data analysis that can be used for extracting models describing important classes or to
predict future data trends. These two forms are as follows −

• Classification
• Prediction
Classification models predict categorical class labels, while prediction models predict continuous-valued functions. For example, we can build a classification model to categorize bank loan applications as either safe or risky, or a prediction model to predict the expenditures in dollars of potential customers on computer equipment given their income and occupation.

What is classification?
Following are the examples of cases where the data analysis task is Classification −

• A bank loan officer wants to analyze the data in order to know which customers (loan applicants) are risky and which are safe.

• A marketing manager at a company needs to analyze whether a customer with a given profile will buy a new computer.

In both of the above examples, a model or classifier is constructed to predict the categorical labels. These labels
are risky or safe for loan application data and yes or no for marketing data.

What is prediction?
Following are the examples of cases where the data analysis task is Prediction −

Suppose the marketing manager needs to predict how much a given customer will spend during a sale at his company. In this example, we need to predict a numeric value. Therefore, the data analysis task is an example of numeric prediction. In this case, a model or predictor is constructed that predicts a continuous-valued function or ordered value.

Note − Regression analysis is a statistical methodology that is most often used for numeric prediction.
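A minimal scikit-learn sketch of numeric prediction by regression, matching the expenditure example above (the features and figures are invented):

from sklearn.linear_model import LinearRegression

# Training data: income (thousands) and occupation code -> past spend (dollars).
X = [[30, 0], [50, 1], [70, 1], [90, 2]]   # features: income, occupation (encoded)
y = [300, 600, 800, 1200]                  # continuous target: expenditure

predictor = LinearRegression().fit(X, y)
print(predictor.predict([[60, 1]]))        # predicted spend for a new customer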

How Does Classification Work?


With the help of the bank loan application that we have discussed above, let us understand the working of
classification. The Data Classification process includes two steps −

• Building the Classifier or Model


• Using Classifier for Classification
Building the Classifier or Model
• This step is the learning step or the learning phase.

• In this step the classification algorithms build the classifier.

• The classifier is built from a training set made up of database tuples and their associated class labels.

• Each tuple in the training set belongs to a predefined class, as determined by its class label. These tuples are also referred to as samples, objects, or data points.

Using Classifier for Classification


In this step, the classifier is used for classification. Here the test data is used to estimate the accuracy of
classification rules. The classification rules can be applied to the new data tuples if the accuracy is considered
acceptable.
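
To make the two steps concrete, here is a hedged sketch using scikit-learn (assuming the library is available); the loan data, attributes, and class labels are invented for illustration:

```python
# Step 1 builds the classifier from a training set; step 2 uses test data
# to estimate the accuracy of the learned classification rules.
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Made-up applicants: [income in thousands, existing debt in thousands].
X = [[50, 5], [80, 2], [20, 15], [90, 1], [30, 20], [60, 4], [25, 18], [70, 3]]
y = ["safe", "safe", "risky", "safe", "risky", "safe", "risky", "safe"]

# Step 1: learning phase -- build the classifier from the training tuples.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
model = DecisionTreeClassifier().fit(X_train, y_train)

# Step 2: use the classifier; the held-out test set estimates accuracy.
print(accuracy_score(y_test, model.predict(X_test)))
```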

Decision tree:
A decision tree is a structure that includes a root node, branches, and leaf nodes. Each internal node denotes a test
on an attribute, each branch denotes the outcome of a test, and each leaf node holds a class label. The topmost
node in the tree is the root node.

A typical decision tree for the concept buy_computer indicates whether a customer at a company is likely to buy a computer or not. Each internal node represents a test on an attribute, and each leaf node represents a class.

The benefits of having a decision tree are as follows −

• It does not require any domain knowledge.


• It is easy to comprehend.
• The learning and classification steps of a decision tree are simple and fast.
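
As an illustrative sketch (not the buy_computer tree itself, which the source shows only as a figure), a decision tree can be learned and then printed as IF-THEN style rules with scikit-learn; the data and feature names below are made up:

```python
# Learn a small tree and print it as nested if-then rules.
from sklearn.tree import DecisionTreeClassifier, export_text

# [age in years, income in thousands] -> buys a computer? (invented labels)
X = [[25, 30], [45, 60], [35, 80], [22, 20], [50, 90], [28, 70]]
y = ["no", "yes", "yes", "no", "yes", "yes"]

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
print(export_text(tree, feature_names=["age", "income"]))
```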

5. Clustering vs Classification

Though clustering and classification appear to be similar processes, there is a difference between them based on
their meaning. In the data mining world, clustering and classification are two types of learning methods. Both these
methods characterize objects into groups by one or more features. The key difference between clustering and
classification is that clustering is an unsupervised learning technique used to group similar instances on the
basis of features whereas classification is a supervised learning technique used to assign predefined tags to
instances on the basis of features.

What is Clustering?

Clustering is a method of grouping objects in such a way that objects with similar features come together, and
objects with dissimilar features go apart. It is a common technique for statistical data analysis used in machine
learning and data mining. Clustering can be used for exploratory data analysis and generalization.

Clustering belongs to unsupervised data mining; it is not a single specific algorithm but a general approach to the task, and it can be achieved by various algorithms. The appropriate clustering algorithm and parameter settings depend on the individual data set. Clustering is not an automatic task but an iterative process of discovery, so it is often necessary to modify the data processing and model parameters until the result achieves the desired properties. K-means clustering and hierarchical clustering are two common clustering algorithms used in data mining.
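
A brief sketch of k-means with scikit-learn on made-up two-dimensional points; note that no class labels are supplied, since clustering is unsupervised:

```python
# Group six points into two clusters purely by similarity of coordinates.
from sklearn.cluster import KMeans

points = [[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]]
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(points)
print(labels)  # e.g. [1 1 1 0 0 0] -- two groups of similar points
```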

What is Classification?

Classification is a process of categorization where objects are recognized, differentiated and understood on the
basis of the training set of data. Classification is a supervised learning technique where a training set and correctly
defined observations are available.

The algorithm which implements classification is often known as the classifier, and the observations are often
known as the instances. K-Nearest Neighbor algorithm and decision tree algorithms are the most famous
classification algorithms used in data mining.

What is the difference between Clustering and Classification?

Clustering: Clustering is an unsupervised learning technique used to group similar instances on the basis of
features.

Classification: Classification is a supervised learning technique used to assign predefined tags to instances on
the basis of features.

Note: whether you choose supervised or unsupervised learning should be based on whether or not you know what the 'categories' of your data are. If you know, use supervised learning; if you do not know, use unsupervised learning.

6. Apriori algorithm

The Apriori algorithm is an association rule mining algorithm used in data mining. It is used to find the frequent item sets among a given number of transactions. It is a classic algorithm for learning association rules. It is nowhere near as complex as it sounds; on the contrary, it is very simple. Let me give you an example to explain it. Suppose you have records of a large number of transactions at a shopping center, as follows:

Transactions Items bought


T1 Item1, item2, item3
T2 Item1, item2
T3 Item2, item5
T4 Item1, item2, item5

Learning association rules basically means finding the items that are purchased together more frequently than
others.
For example in the above table you can see Item1 and item2 are bought together frequently.

What is the use of learning association rules?


· Shopping centers use association rules to place the items next to each other so that users buy more items. If you
are familiar with data mining you would know about the famous beer-diapers-Wal-Mart story. Basically Wal-Mart
studied their data and found that on Friday afternoon young American males who buy diapers also tend to buy
beer. So Wal-Mart placed beer next to diapers and the beer-sales went up. This is famous because no one would
have predicted such a result and that’s the power of data mining.
· Also, if you are familiar with Amazon, they use association mining to recommend items to you based on the item you are currently browsing/buying.
· Another application is Google auto-complete: after you type in a word, it suggests words that users frequently type after that particular word.
So as I said Apriori is the classic and probably the most basic algorithm to do it.
Let’s start with a non-simple example,

Transaction ID    Items Bought
T1 {Mango, Onion, Nintendo, Key-chain, Eggs, Yo-yo}
T2 {Doll, Onion, Nintendo, Key-chain, Eggs, Yo-yo}
T3 {Mango, Apple, Key-chain, Eggs}
T4 {Mango, Umbrella, Corn, Key-chain, Yo-yo}
T5 {Corn, Onion, Onion, Key-chain, Ice-cream, Eggs}

Now, we follow a simple golden rule: we say an item/itemset is frequently bought if it is bought in at least 60% of the transactions. So here it should be bought at least 3 times.

For simplicity
M = Mango
O = Onion
And so on……

So the table becomes

Original table:
Transaction ID    Items Bought
T1 {M, O, N, K, E, Y }
T2 {D, O, N, K, E, Y }
T3 {M, A, K, E}
T4 {M, U, C, K, Y }
T5 {C, O, O, K, I, E}

Step 1: Count the number of transactions in which each item occurs. Note that 'O = Onion' is bought 4 times in total, but it occurs in just 3 transactions.

Item    No. of transactions
M 3
O 3
N 2
K 5
E 4
Y 3
D 1
A 1
U 1
C 2
I 1

Step 2: Remember, we said an item is frequently bought if it is bought at least 3 times. So in this step we remove all the items that are bought less than 3 times from the above table, and we are left with

Item    No. of transactions
M 3
O 3
K 5
E 4
Y 3

These are the single items that are bought frequently. Now let's say we want to find pairs of items that are bought frequently. We continue from the above table (table in step 2).

Step 3: We start making pairs from the first item, like MO, MK, ME, MY, and then we start with the second item, like OK, OE, OY. We do not make OM because we already made MO when pairing with M, and buying a Mango and an Onion together is the same as buying an Onion and a Mango together. After making all the pairs we get,

Item pairs
MO
MK
ME
MY
OK
OE
OY
KE
KY
EY

Step 4: Now we count how many times each pair is bought together. For example, M and O are bought together only in {M,O,N,K,E,Y}, while M and K are bought together 3 times: in {M,O,N,K,E,Y}, {M,A,K,E} and {M,U,C,K,Y}.
After doing that for all the pairs we get

Item Pairs    No. of transactions
MO 1
MK 3
ME 2
MY 2
OK 3
OE 3
OY 2
KE 4
KY 3
EY 2

Step 5: Golden rule to the rescue. Remove all the item pairs with a number of transactions less than three, and we are left with

Item Pairs    No. of transactions
MK 3
OK 3
OE 3
KE 4
KY 3

These are the pairs of items frequently bought together.
Now let's say we want to find a set of three items that are bought together.
We use the above table (table in step 5) and make sets of 3 items.

Step 6: To make the sets of three items we need one more rule (termed self-join).
It simply means that, from the item pairs in the above table, we find two pairs with the same first letter, so we get
· OK and OE, this gives OKE
· KE and KY, this gives KEY

Then we find how many times O,K,E are bought together in the original table, and the same for K,E,Y, and we get the following table

Item Set    No. of transactions
OKE 3
KEY 2

While we are on this, suppose you have sets of 3 items, say ABC, ABD, ACD, ACE, BCD, and you want to generate item sets of 4 items: you look for two sets having the same first two letters.
· ABC and ABD -> ABCD
· ACD and ACE -> ACDE

And so on... In general, you have to look for two sets differing in just the last item.

Step 7: So we again apply the golden rule: the item set must be bought together at least 3 times, which leaves us with just OKE, since K,E,Y are bought together just two times.

Thus the set of three items bought together most frequently is O,K,E.
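
The whole walkthrough can be condensed into a compact Python sketch of the Apriori idea, using the M,O,N,K,E,Y transactions above and a minimum support count of 3; candidate k-itemsets are formed by unioning frequent (k-1)-itemsets (a simplified version of the self-join in step 6) and pruned by the golden rule:

```python
transactions = [
    {"M", "O", "N", "K", "E", "Y"},
    {"D", "O", "N", "K", "E", "Y"},
    {"M", "A", "K", "E"},
    {"M", "U", "C", "K", "Y"},
    {"C", "O", "O", "K", "I", "E"},  # the duplicate O collapses: it counts once per transaction
]
MIN_SUPPORT = 3

def count(itemset):
    # Number of transactions containing the whole itemset.
    return sum(itemset <= t for t in transactions)

# Frequent 1-itemsets.
items = {i for t in transactions for i in t}
frequent = [{frozenset([i]) for i in items if count({i}) >= MIN_SUPPORT}]

# Grow level by level until no itemset survives the support threshold.
k = 2
while frequent[-1]:
    candidates = {a | b for a in frequent[-1] for b in frequent[-1] if len(a | b) == k}
    frequent.append({c for c in candidates if count(c) >= MIN_SUPPORT})
    k += 1

for level in frequent[:-1]:
    print(sorted("".join(sorted(s)) for s in level))
# Prints ['E','K','M','O','Y'], then ['EK','EO','KM','KO','KY'], then ['EKO'].
```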

7. Data mining techniques

Several major data mining techniques have been developed and are used in data mining projects, including association, classification, clustering, prediction, sequential patterns, and decision trees. We will briefly examine these techniques in the following sections.

Association

Association is one of the best-known data mining techniques. In association, a pattern is discovered based on a relationship between items in the same transaction. That is the reason why the association technique is also known as the relation technique. The association technique is used in market basket analysis to identify sets of products that customers frequently purchase together.

Retailers use the association technique to research customers' buying habits. Based on historical sales data, retailers might find that customers often buy crisps when they buy beer; they can therefore put beer and crisps next to each other to save time for customers and increase sales.

Classification

Classification is a classic data mining technique based on machine learning. Basically, classification is used to classify each item in a set of data into one of a predefined set of classes or groups. The classification method makes use of mathematical techniques such as decision trees, linear programming, neural networks, and statistics. In classification, we develop software that can learn how to classify data items into groups. For example, we can apply classification in the application "given all records of employees who left the company, predict who will probably leave the company in a future period." In this case, we divide the records of employees into two groups named "leave" and "stay", and then we can ask our data mining software to classify the employees into the separate groups.

Clustering

Clustering is a data mining technique that makes meaningful or useful clusters of objects which have similar characteristics using an automatic technique. The clustering technique defines the classes and puts objects in each class, while in classification techniques objects are assigned to predefined classes. To make the concept clearer, we can take book management in a library as an example. In a library, there is a wide range of books on various topics. The challenge is how to keep those books in a way that readers can take several books on a particular topic without hassle. By using the clustering technique, we can keep books that have similarities in one cluster or one shelf and label it with a meaningful name. If readers want books on that topic, they only have to go to that shelf instead of searching the entire library.

Prediction

Prediction, as its name implies, is a data mining technique that discovers the relationship between dependent and independent variables. For instance, the prediction technique can be used in sales to predict future profit if we consider sales an independent variable and profit a dependent variable. Then, based on historical sales and profit data, we can draw a fitted regression curve that is used for profit prediction.
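
A small sketch of such numeric prediction with NumPy, using made-up sales and profit figures: a regression line is fitted to the historical data and then used to predict profit for a new sales value:

```python
import numpy as np

sales  = np.array([10, 20, 30, 40, 50])   # independent variable (made-up)
profit = np.array([ 2,  5,  7, 11, 13])   # dependent variable (made-up)

# Fit a straight line: profit = slope * sales + intercept.
slope, intercept = np.polyfit(sales, profit, deg=1)
print(slope * 60 + intercept)  # predicted profit when sales reach 60
```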
Sequential Patterns

Sequential pattern analysis is a data mining technique that seeks to discover similar patterns, regular events, or trends in transaction data over a business period.

In sales, with historical transaction data, businesses can identify sets of items that customers buy together at different times of the year. Businesses can then use this information to offer customers better deals on those items, based on their purchasing frequency in the past.

Decision trees

A decision tree is one of the most commonly used data mining techniques because its model is easy for users to understand. In the decision tree technique, the root of the decision tree is a simple question or condition that has multiple answers. Each answer then leads to a further set of questions or conditions that help us narrow down the data so that we can make the final decision based on it. For example, we can use the following decision tree to determine whether or not to play tennis:

Starting at the root node, if the outlook is overcast then we should definitely play tennis. If it is rainy, we should only play tennis if the wind is weak. And if it is sunny, then we should play tennis if the humidity is normal.

We often combine two or more of those data mining techniques together to form an appropriate process that
meets the business needs.

8. Data mining tools


Data mining is not all about the tools or database software that you are using. You can perform data mining with comparatively modest database systems and simple tools, including creating and writing your own, or using off-the-shelf software packages. Complex data mining benefits from the past experience and algorithms embodied in existing software and packages, with certain tools gaining a greater affinity or reputation for particular techniques. Examples include Oracle Data Miner, Data to Knowledge, SAS, Clementine, and Intelligent Miner.

9. Bayesian and Neural networks


Data mining is the process of extracting nontrivial and potentially useful information, or knowledge, from the
enormous data sets available in experimental sciences (historical records, reanalysis, GCM simulations, etc.),
providing explicit information that has a readable form and can be used to solve diagnosis, classification or
forecasting problems. Traditionally, these problems were solved by direct hands-on data analysis using
standard statistical methods, but the increasing volume of data has motivated the study of automatic data
analysis using more complex and sophisticated tools which can operate directly from data. Thus, data mining
identifies trends within data that go beyond simple analysis. Modern data mining techniques (association
rules, decision trees, Gaussian mixture models, regression algorithms, neural networks, support vector
machines, Bayesian networks, etc.) are used in many domains to solve association, classification,
segmentation, diagnosis and prediction problems.
Among the different data mining algorithms, probabilistic graphical models (in particular Bayesian networks) are a sound and powerful methodology grounded in probability and statistics, which allows building tractable joint probabilistic models that represent the relevant dependencies among a set of variables (hundreds of variables in real-life applications).

Formally, Bayesian networks are directed acyclic graphs whose nodes represent variables, and whose arcs
encode conditional independencies between the variables. The graph provides an intuitive description of the
dependency model and defines a simple factorization of the joint probability distribution leading to a
tractable model which is compatible with the encoded dependencies. Efficient algorithms exist to learn both
the graphical and the probabilistic models from data, thus allowing for the automatic application of this
methodology in complex problems. Bayesian networks that model sequences of variables (such as, for example, time series of historical records) are called dynamic Bayesian networks. Generalizations of Bayesian networks that can represent and solve decision problems under uncertainty are called influence diagrams.
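
To make the factorization idea concrete, here is a minimal pure-Python sketch of a three-node Bayesian network; the structure (Cloudy -> Rain -> WetGrass) and all the probabilities are hypothetical:

```python
# The joint distribution factorizes along the arcs:
# P(C, R, W) = P(C) * P(R | C) * P(W | R).
p_cloudy = {True: 0.5, False: 0.5}
p_rain_given_cloudy = {True: {True: 0.8, False: 0.2},
                       False: {True: 0.1, False: 0.9}}
p_wet_given_rain = {True: {True: 0.9, False: 0.1},
                    False: {True: 0.2, False: 0.8}}

def joint(c, r, w):
    return p_cloudy[c] * p_rain_given_cloudy[c][r] * p_wet_given_rain[r][w]

# P(WetGrass) is obtained by summing the joint over the unobserved variables.
p_wet = sum(joint(c, r, True) for c in (True, False) for r in (True, False))
print(p_wet)  # 0.515 with the numbers above
```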
On the other hand, neural networks are nonlinear models inspired by the functioning of the brain which have been designed to solve different problems. Multi-layer perceptrons are regression-like algorithms that build a deterministic model y = f(x), relating a set of predictors, x, to predictands, y. Self-Organizing Maps (SOM) are competitive networks designed for clustering and visualization purposes.
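
A toy NumPy sketch of a multi-layer perceptron forward pass y = f(x); the weights here are random and purely illustrative, since a real network would learn them from data:

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 2)), np.zeros(3)   # input (2 predictors) -> hidden (3 units)
W2, b2 = rng.normal(size=(1, 3)), np.zeros(1)   # hidden -> output (1 predictand)

def f(x):
    h = np.tanh(W1 @ x + b1)   # nonlinear hidden layer
    return W2 @ h + b2         # linear output layer

print(f(np.array([0.5, -1.0])))
```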

UNIT – IV
KNOWLEDGE MANAGEMENT
Define knowledge management and the knowledge management process.
Explain the types of knowledge.
Write briefly about the failure of knowledge management.
Characteristics of knowledge.
What are the different issues and challenges for knowledge management?
Difference between tacit and explicit knowledge.
Define expert knowledge.
Advantages and disadvantages of knowledge management.
Limitations of knowledge management.
Define knowledge architecture in detail.
What are the different benefits of knowledge management?
Write down the phases of knowledge management.

Define
1. RER
2. CASE STUDY
3. KNOWLEDGE BANK
4. KNOWLEDGE CAFÉ
5. KNOWLEDGE MARKETPLACE
6. COTS
7. BRAINSTORMING
8. ROI

“Knowledge management is really about recognizing that regardless of what business you are in, you are
competing based on the knowledge of your employees”

Introduction of Knowledge Management


Knowledge management is essentially about getting the right knowledge to the right person at the right time. This in itself may not seem so complex, but it implies a strong tie to corporate strategy, understanding of where and in what forms knowledge exists, creating processes that span organizational functions, and ensuring that initiatives are accepted and supported by organizational members. Knowledge management may also include new knowledge creation, or it may solely focus on knowledge sharing, storage, and refinement.
It is important to remember that knowledge management is not about managing knowledge for knowledge's sake.
The overall objective is to create value and leverage and refine the firm's knowledge assets to meet organizational
goals.

Implementing knowledge management thus has several dimensions including:

• Strategy: Knowledge management strategy must be dependent on corporate strategy. The objective is to
manage, share, and create relevant knowledge assets that will help meet tactical and strategic requirements.

• Organizational Culture: The organizational culture influences the way people interact, the context within
which knowledge is created, the resistance they will have towards certain changes, and ultimately the way they
share (or the way they do not share) knowledge.
• Organizational Processes: The right processes, environments, and systems that enable KM to be implemented
in the organization.
• Management & Leadership: KM requires competent and experienced leadership at all levels. There are a wide variety of KM-related roles that an organization may or may not need to implement, including a CKO, knowledge managers, knowledge brokers, and so on.
• Technology: The systems, tools, and technologies that fit the organization's requirements - properly designed
and implemented.
• Politics: The long-term support to implement and sustain initiatives that involve virtually all organizational
functions, which may be costly to implement (both from the perspective of time and money), and which often do
not have a directly visible return on investment.

Why is knowledge management useful? It is useful because it places a focus on knowledge as an actual asset, rather than as something intangible. In so doing, it enables the firm to better protect and exploit what it knows, and to improve and focus its knowledge development efforts to match its needs.

In other words:

• It helps firms learn from past mistakes and successes.


• It better exploits existing knowledge assets by re-deploying them in areas where the firm stands to gain
something, e.g. using knowledge from one department to improve or create a product in another department,
modifying knowledge from a past process to create a new solution, etc.
• It promotes a long term focus on developing the right competencies and skills and removing obsolete knowledge.
• It enhances the firm's ability to innovate.
• It enhances the firm's ability to protect its key knowledge and competencies from being lost or copied.
Unfortunately, KM is an area in which companies are often reluctant to invest because it can be expensive to implement properly, and it is extremely difficult to determine a specific ROI. Moreover, KM is a concept whose definition is not universally accepted; within IT, for example, one often sees a much shallower, information-oriented approach. Particularly in the early days, this led to many "KM" failures, and these have tarnished the reputation of the subject as a whole. Sadly, even today, probably about one in three blogs that I read on this subject have absolutely nothing to do with the KM that I was taught back in business school.

Types of Knowledge
Once knowledge is created, it exists within the organization. However, before it can be reused or shared, it must be properly recognized and categorized. Within business and KM, three types of knowledge are usually defined, namely explicit, tacit, and embedded knowledge.

• Explicit Knowledge: This is largely a process of sorting through documents and other records, as well as
discovering knowledge within existing data and knowledge repositories. For the latter, IT can be used to uncover
hidden knowledge by looking at patterns and relationships within data and text. The main tools/practices in this
case include intelligence gathering, data mining (finding patterns in large bodies of data and information), and
text mining (text analysis to search for knowledge, insights, etc.). Intelligence gathering is closely linked to
expert systems (Bali et al 2009) where the system tries to capture the knowledge of an expert, though the extent
to which they are competent for this task is questionable (Botha et al 2008).
• Tacit knowledge: Discovering and detecting tacit knowledge is a lot more complex and often it is up to the
management in each firm to gain an understanding of what their company's experts actually know. Since tacit
knowledge is considered as the most valuable in relation to sustained competitive advantage, this is a crucial
step, a step that often simply involves observation and awareness. There are several qualitative and quantitative
tools/practices that can help in the process; these include knowledge surveys, questionnaires, individual
interviews, group interviews, focus groups, network analysis, and observation. IT can be used to help identify
experts and communities. Groupware systems and other social/professional networks as well as expert finders
can point to people who are considered experts, and may also give an indication of the knowledge these
people/groups possess.
• Embedded knowledge: This implies an examination and identification of the knowledge trapped inside organizational routines, processes, products, etc., which has not already been made explicit. Management must essentially ask "why do we do something a certain way?" This type of knowledge discovery involves observation and analysis, and the use of reverse engineering and modeling tools.

ROI: Illustrating the return on investment (ROI) for a portal solution or knowledge management (KM) system means measuring the ROI on improved processes and the increased economic value of employee performance. Thus, rather than employing traditional notions of value and assets as used in standard accounting practices, KM solutions are tools managers should use to support opportunities for process improvement and redesign. ROI that measures value from this perspective creates new areas of value from an organization's existing, undervalued assets. A well-developed measurement methodology for implementing a KM system may illustrate ROI, justify expenditures for implementing the system, and provide a format to ensure that process improvement occurs. A well-thought-out KM system has the capability of becoming the "digital nervous system" of an organization, tying all areas to the strategic goals of the organization.

Write brief about failure of knowledge management?

KM Failure Factors
Based on the works of numerous researchers and authors, I arrived at two categories of factors, namely "causal"
and "resultant".

Causal factors refer to fundamental problems within the organization, which lead to conditions that are not suitable
for KM. They are not always easily visible and they lead to a number of symptoms, which I have termed “resultant”
factors.

Causal Failure Factors:
• Lack of performance indicators and measurable benefits
• Inadequate management support
• Improper planning, design, coordination, and evaluation
• Inadequate skill of knowledge managers and workers
• Problems with organizational culture
• Improper organisational structure
Resultant Failure Factors:
• Lack of widespread contribution
• Lack of relevance, quality, and usability
• Overemphasis on formal learning, systematisation, and determinant needs
• Improper implementation of technology
• Improper budgeting and excessive costs
• Lack of responsibility and ownership
• Loss of knowledge from staff defection and retirement

Difference between Knowledge management and information management

Information and IM:

• Focus on data and information


• Deal with unstructured and structured facts and figures.
• Benefit greatly from technology, since the information being conveyed is already codified and in an easily
transferrable form.
• Focus on organizing, analyzing, and retrieving - again due to the codified nature of the information.
• Is largely about know-what, i.e. it offers a fact that you can then use to help create useful knowledge, but in itself that fact does not convey a course of action (e.g. sales of product x are up 25% last quarter).
• Is easy to copy - due to its codified and easily transferrable nature.
Knowledge and KM:

• Focus on knowledge, understanding, and wisdom


• Deal with both codified and uncodified knowledge. Uncodified knowledge - the most valuable type of
knowledge - is found in the minds of practitioners and is unarticulated, context-based, and experience-based.
• Technology is useful, but KM's focus is on people and processes. The most valuable knowledge cannot
effectively be (directly) transferred with technology, it must be passed on directly from person to person.
• Focus on locating, understanding, enabling, and encouraging - by creating environments, cultures, processes, etc.
where knowledge is shared and created.
• Is largely about know-how, know-why, and know-who
• Is hard to copy - at least regarding the tacit elements. The connection to experience and context makes tacit
knowledge extremely difficult to copy. This is why universities cannot produce seasoned practitioners - there are
some things (the most important things) that you simply cannot teach from a textbook (or other codified source
of information/explicit knowledge). These are learnt in the field and understood on an intuitive level. You cannot
easily copy or even understand this intuition without the right experience, context, etc. - and it is this intuition
that represents the most valuable organizational knowledge.

Knowledge management process
The operational processes are the processes of actually carrying out KM, i.e. knowledge collection, sharing, update, etc. Before elaborating on the processes and their sub-processes in the following sections, an overview of the model is provided below. The figure shows the main processes of the model and their basic dependencies.

(Figure: Overview of the Main Processes.)

The co-ordination processes underlie the operational processes; in the figure, this is shown by the rectangle lying behind all other processes. The operational processes are presented as the following main processes: "Identification of Need", "Sharing", "Creation", "Collection and Storage", and "Update". Note that two processes represent the main process "Sharing" in the model: "Knowledge Pull" and "Knowledge Push". The arrows connecting the processes provide an overview of the interaction and knowledge flows. The picture in the middle represents the place where the knowledge is stored. The purpose of this picture, showing a human and a machine, is to express the variety of possible ways of storing knowledge, including both technical (databases, documents, videos) and non-technical (human mind) repositories.
The general concept of the process model is that within the coordinating processes the operational processes are
planned and initiated. Together these make up the KM system. The main processes are described in the
following. “Identification of Need for Knowledge” identifies a need for knowledge and determines it. “Sharing”
is initiated in order to find out whether knowledge that already exists in the system can be used. This covers both
the searching for knowledge by a person who needs the knowledge (“Knowledge Pull”) and the feeding of
knowledge to recipients who are known to be in need of it
(“Knowledge Push”). If the needed knowledge is not available yet, “Creation of Knowledge” is initiated.
Consequently, the new knowledge (the result) has to be collected – this is done in “Knowledge Collection and
Storage”.

Characteristics of knowledge: The most important characteristic of knowledge is non-rivalry, which means that
one person’s use of an idea does not preclude another person using it at the same time.

1. Knowledge is contextual and it can be re-used.
2. Benefits of knowledge are obtained only if it is applied.
3. The value of knowledge may change over time.
4. Knowledge has to be renewed or maintained.
5. It can be difficult to transfer, capture and distribute knowledge.
6. It is developed through learning processes.
7. It depends on memory, past experience, expertise, knowledge transfer mechanisms, and opportunities.
8. It facilitates effectiveness and 'sense-making'.
9. Knowledge enables higher learning.
10. Knowledge creation and utilization is enhanced with technology.

Difference between knowledge and information

Information Knowledge
Static Dynamic
Independent of the individual Dependent on the individual
Explicit Tacit
Digital Analogue
Easy to duplicate Must be re-created
Easy to broadcast Face-to-face mainly
No intrinsic meaning Meaning has to be personally assigned

Difference between Tacit knowledge and Explicit knowledge


The distinction between tacit and explicit knowledge is perhaps the most fundamental concept of knowledge management. The distinction was first made by Michael Polanyi in the 1960s, and it forms one of the central planks of Nonaka and Takeuchi's book The Knowledge-Creating Company (1995).

Tacit knowledge (knowing-how): knowledge embedded in the human mind through experience and jobs. Know-how and learning embedded within the minds of people. Personal wisdom and experience; context-specific; more difficult to extract and codify. Tacit knowledge includes insights and intuitions.

Explicit knowledge (knowing-that): knowledge codified and digitized in books, documents, reports, memos, etc. Documented information that can facilitate action. Knowledge that is easily identified, articulated, shared, and employed.

Thus, explicit (already codified) and tacit (embedded in the mind).

Explicit knowledge                | Tacit (implicit) knowledge
Objective, rational, technical    | Subjective, cognitive, experiential learning
Structured                        | Personal
Fixed content                     | Context sensitive/specific
Context independent               | Dynamically created
Externalized                      | Internalized
Easily documented                 | Difficult to capture and codify
Easy to codify                    | Difficult to share
Easy to share                     | Has high value
Easily transferred/taught/learned | Hard to document
Exists in high volumes            | Hard to transfer/teach/learn
                                  | Involves a lot of human interpretation

Expert system

Expert systems (ES) are one of the prominent research domains of AI, introduced by researchers at the Stanford University Computer Science Department. Expert systems are computer applications developed to solve complex problems in a particular domain, at the level of extraordinary human intelligence and expertise.

An expert system is an artificial intelligence based system that converts the knowledge of an expert in a specific subject into software code. This code can be merged with other such code (based on the knowledge of other experts) and used for answering questions (queries) submitted through a computer. Expert systems typically consist of three parts:
(1) a Knowledge Base, which contains the information acquired by interviewing experts, and the logic rules that govern how that information is applied;
(2) an Inference Engine, which interprets the submitted problem against the rules and logic of information stored in the knowledge base; and
(3) a User Interface, which allows the user to express the problem in a human language such as English.
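
A hedged, minimal Python sketch of how these three parts fit together; the rules and facts are invented, and the user interface is reduced to plain print statements:

```python
# Knowledge base: if-then rules as (conditions, conclusion) pairs.
knowledge_base = [
    ({"fever", "rash"}, "measles_suspected"),
    ({"measles_suspected"}, "refer_to_specialist"),
]

def infer(facts):
    # Inference engine: forward chaining -- fire rules until nothing new.
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in knowledge_base:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

# User interface (simplified): the user states the problem as facts.
print(infer({"fever", "rash"}))  # includes 'refer_to_specialist'
```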


Despite earlier high hopes, expert systems technology has found application only in areas where information can be reduced to a set of computational rules, such as insurance underwriting or some aspects of securities trading. An expert system is also called a rule-based system.

Characteristics of Expert Systems

• High performance
• Understandable
• Reliable
• Highly responsive
Capabilities of Expert Systems
The expert systems are capable of −

• Advising
• Instructing and assisting humans in decision making
• Demonstrating
• Deriving a solution
• Diagnosing
• Explaining
• Interpreting input
• Predicting results
• Justifying the conclusion
• Suggesting alternative options to a problem

Expert Systems Limitations


No technology can offer an easy and complete solution. Large systems are costly and require significant development time and computing resources. Expert systems have their limitations, which include −

• Limitations of the technology


• Difficult knowledge acquisition
• ES are difficult to maintain
• High development costs

Applications of Expert System
The following table shows where ES can be applied.

Application              Description

Design Domain            Camera lens design, automobile design.

Medical Domain           Diagnosis systems to deduce the cause of disease from observed data; conducting medical operations on humans.

Monitoring Systems       Comparing data continuously with an observed system or with prescribed behavior, such as leakage monitoring in a long petroleum pipeline.

Process Control Systems  Controlling a physical process based on monitoring.

Knowledge Domain         Finding out faults in vehicles, computers.

Finance/Commerce         Detection of possible fraud, suspicious transactions, stock market trading, airline scheduling, cargo scheduling.

Challenges of knowledge management


In order to maximize the benefit of knowledge management within your business you may have to overcome the
following challenges:

• Capturing and recording business knowledge - ensure your business has processes in place to capture and
record business knowledge.
• Sharing information and knowledge – develop a culture within your business for sharing knowledge between
employees.
• Business strategy and goals – without clear goals or a business strategy in place for the knowledge gathered the
information will be of no use to your business.
• Knowledge management systems – these systems can be costly and complex to understand but when utilised
properly can provide huge business benefits. It is important that staff are fully trained on these systems so that
they collect and record the right data.

Advantages of knowledge management
Consider the measurable benefits of capturing and using knowledge more effectively in your business. The
following are all possible outcomes:

• An improvement in the goods or services you offer and the processes that you use to sell them. For example,
identifying market trends before they happen might enable you to offer products and services to customers
before your competitors.
• Increased customer satisfaction because you have a greater understanding of their requirements through
feedback from customer communications.
• An increase in the quality of your suppliers, resulting from better awareness of what customers want and what
your staff require.
• Improved staff productivity, because employees are able to benefit from colleagues' knowledge and expertise to
find out the best way to get things done. They'll also feel more appreciated in a business where their ideas are
listened to.
• Increased business efficiency, by making better use of in-house expertise.
• Better recruitment and staffing policies. For instance, if you have increased knowledge of what your customers
are looking for, you're better able to find the right staff to serve them.
• The ability to sell or license your knowledge to others. You may be able to use your knowledge and expertise in
an advisory or consultancy capacity. In order to do so, though, make sure that you protect your intellectual
property.

Benefits of knowledge management


All organizations can benefit from their people sharing, innovating, reusing, collaborating, and learning.

1. Enabling fast and better decision making.


2. Making it easy to find relevant information and resources.
3. Reusing ideas, documents and expertise.
4. Avoiding redundant efforts.
5. Avoiding making the same mistake again.
6. Taking advantage of existing expertise and experience.
7. Communicating important information widely and quickly.
8. Providing methods, tools, templates, techniques and examples.
9. Enabling the organization to leverage its size.
10. Stimulating innovation and growth.

Limitations of Knowledge management


1. Failure to use company knowledge properly can lead to a great loss of time and resources, and even organizational failure.
2. Knowledge sharing is a crucial part of making knowledge management systems work, but most organizations fail to share knowledge properly.
3. Extracting information from workers who possess valuable company knowledge can also be a difficult and lengthy process.
4. Another major disadvantage of knowledge management systems is the lack of a company strategy to fully utilize the information collected.
5. Without an implementation strategy or goal in place for the knowledge, the information is useless.
6. Knowledge management systems are complex and hard to understand for the average worker, and training workers to use a knowledge management system is costly.

Knowledge Management System Architecture

Developing a KMS is a complex task and requires careful planning before selecting the tools for supporting the knowledge processes. The designed system architecture should suit the organizational culture and business needs. A KMS can be as simple as a file folder or as complex as a business intelligence system that uses advanced data visualization and artificial intelligence. We have studied several KMS architectures which aim to support knowledge management processes and collaboration in the organization. We found that even if there are differences between architectures in terms of functions and services, the major components of the architecture are comparable. A general KMS architecture is proposed by Tiwana [Tiwana 02], who pointed out that a KMS should comprise four major components: repository, collaborative platform, network, and culture.

1. Repository holds explicated formal and informal knowledge, such as declarative knowledge, procedural
knowledge, causal knowledge, and context. This component acts as the core of KMS which aims to store
and retrieve knowledge for future use.
2. Collaborative platform supports distributed work and incorporates pointers, skills databases, expert
locators, and informal communications channels.
3. Network means both physical and social networks that support communication and conversation.
A physical network is a 'hard' network such as an intranet, shared space, and backbone. A social network is a
‘soft’ network such as Communities of Practice (CoP), associations, and working groups.
4. Culture is the enabler to encourage sharing and use of the KMS. Research has revealed that the greatest
difficulty in KM is ‘‘changing people’s behavior,’’ and the current biggest impediment to knowledge
transfer is ‘‘culture’’.

These four components are considered the basic elements of each knowledge management system. However, other tools can be integrated to enhance the quality of the system's services. Tiwana also proposed a seven-layer KMS architecture [Tiwana 02], which is the integration of these four components and their supporting information technologies.

Seven layers KMS architecture [Tiwana 02]

The seven-layer KMS architecture is, in effect, a reflection of the OSI model (Open Systems Interconnection basic reference model). This model tries to represent the functions and tools of a KMS in terms of the layers that the knowledge passes through. This architecture might suit complex systems which require network and data manipulation.

Why is it helpful to view the building of a KM system as a life cycle?

It is important to have a life cycle in building knowledge management systems, because the life cycle provides
structure and order to the process. Additionally, the life cycle provides a breakdown of the activities into
manageable steps, good documentation for possible changes in the future, coordination of the project for a timely
completion, and regular management review at each phase of the cycle.

Write down the phases of knowledge management


A winning knowledge management program increases staff productivity, product and service quality, and
deliverable consistency by capitalizing on intellectual and knowledge-based assets.

Many organizations leap into a knowledge management solution (e.g. document management, data mining,
blogging, and community forums) without first considering the purpose or objectives they wish to fulfill or how
the organization will adopt and follow best practices for managing its knowledge assets long term.

A successful knowledge management program will consider more than just technology. An organization should
also consider:

• People. They represent how you increase the ability of individuals within the organization to influence
others with their knowledge.
• Processes. They involve how you establish best practices and governance for the efficient and accurate
identification, management, and dissemination of knowledge.

• Technology. It addresses how you choose, configure, and utilize tools and automation to enable
knowledge management.
• Structure. It directs how you transform organizational structures to facilitate and encourage cross-
discipline awareness and expertise.

• Culture. It embodies how you establish and cultivate a knowledge-sharing, knowledge-driven culture.

8 Steps to Implementation

Implementing a knowledge management program is no easy feat. You will encounter many challenges along the
way including many of the following:

• Inability to recognize or articulate knowledge; turning tacit knowledge into explicit knowledge.

• Geographical distance and/or language barriers in an international company.

• Limitations of information and communication technologies.

• Loosely defined areas of expertise.

• Internal conflicts (e.g. professional territoriality).

• Lack of incentives or performance management goals.

• Poor training or mentoring programs.

• Cultural barriers (e.g. “this is how we've always done it” mentality).

The following eight-step approach will enable you to identify these challenges so you can plan for them, thus
minimizing the risks and maximizing the rewards. This approach was developed based on logical, tried-and-true
activities for implementing any new organizational program. The early steps involve strategy, planning, and
requirements gathering while the later steps focus on execution and continual improvement.

Define
1. RER: Rapid Evidence Review: A Rapid Evidence Review is a way of reviewing research and evidence on a particular issue. It looks at what has been done in a particular area and records the main outcomes. Evidence reviews can be run in several ways; some are more exhaustive in their execution and ambitious in their scope. The RER provides a quicker but still useful way of gathering and consolidating knowledge. It is a useful building block from which to start work on a new project.

2. CASE STUDY: A case study is a written examination of a project, or important part of a project.
It has a clear structure that brings out key qualitative and quantitative information from the
project. Case studies are also published with a broad audience in mind, so it is useful to bring the
most useful and transferable information to the fore.

3. KNOWLEDGE BANK: Knowledge banks are online services and resources which hold information, learning, and support, giving users the power to improve their organization. They are typically used to showcase the work of an organization and provide signposts to documents, articles, and toolkits.

4. KNOWLEDGE CAFÉ: A knowledge café brings people together to have open, creative conversations on topics of mutual interest. It can be organized in a meeting or workshop format, but the emphasis should be on open dialogue that allows people to share ideas and learn from each other. It encourages people to explore issues that require discussion in order to build a consensus around an issue.

Why Use a Knowledge Café?


In an organization, especially in a hierarchical organization, people are not often given the opportunity to 'reflect'
on discussions. People are normally tied to performance pressures. Therefore, much of the value that could be
gained from good discussion, dialogue, and reflection is lost.

5. KNOWLEDGE MARKETPLACE: Modelled on the e-business net market concept, several knowledge-trading places have recently been established. In a knowledge marketplace, a third-party vendor hosts a web site grouping together many suppliers of knowledge services. Suppliers may include expert advisors, vendors providing product support services, KM job placement agencies, procedures for the evaluation of KM and portal software, and research companies providing industry benchmarks and best practice case studies. Two types of knowledge marketplace exist: one provides common information and services to all industries, while the other offers certain services to a specific industry.

6. COTS: Commercial Off-The-Shelf (COTS) software is the traditional and most popular way of deploying application services. Based on organizational needs, candidate applications are identified and then examined against the functional needs of the organization. A short test period may follow to identify the most suitable application. Once an application is acquired, customization of the standard features is usually performed to integrate it into the organization's information system.

7. BRAINSTORMING: Brainstorming is a process where a group of people meet to focus on a problem or idea and explore it with a view to coming up with solutions or further developing the ideas. The participants express or contribute their ideas as they strike them and then build on the ideas raised by others. All the ideas are noted down and are not criticized; only when the brainstorming session is over are the ideas evaluated. Brainstorming helps in problem solving and in creating new knowledge from existing knowledge.
Brainstorming is a simple way of helping a group of people to generate new and unusual ideas. The process is actually split into two phases: divergence and convergence. During the divergent phase, everyone agrees to delay their judgment; in other words, all ideas are treated as valid. During the convergent phase, the participants use their judgment but do so in a 'positive' manner, that is, they look for what they like about the ideas before finding flaws.

When to use brainstorming

Used to bring different perspectives to a problem, find key areas to focus on in a project or test new methods,
brainstorming usually happens in a workshop or meeting with small and large groups working together on ideas.

8. ROI: Return on investment measures the gain or loss generated on an investment relative to the
amount of money invested. ROI is usually expressed as a percentage and is typically used for
personal financial decisions, to compare a company's profitability or to compare the efficiency of
different investments.
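
A one-line worked example with hypothetical figures makes the definition concrete:

```python
def roi(gain_from_investment, cost_of_investment):
    # Gain or loss relative to the amount invested, as a percentage.
    return (gain_from_investment - cost_of_investment) / cost_of_investment * 100

print(roi(120_000, 100_000))  # 20.0 -> a 20% return on the investment
```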

9. COP: Communities of Practice (CoP) are also called knowledge communities, knowledge networks, learning communities, communities of interest, and thematic groups. These consist of groups of people with different skill sets, development histories, and experience backgrounds who work together to achieve commonly shared goals (Ruggles, 1997). These groups are different from teams and task forces. People in a CoP can perform the same job or collaborate on a shared task, e.g. software developers, or work together on a product, e.g. engineers, marketers, and manufacturing specialists.

What are Social Network Services?


A social network is a group of people who share a common area of interest. Social network services are online systems that support social networking. The core services they offer usually include:
1. Finding people who have similar interests or needs;
2. Aggregating people into groups or subgroups, and being able to communicate with those groups; and
3. Sharing content, such as documents, links to relevant websites, or even streaming video.

What is Knowledge Mapping?


Knowledge Mapping is a process by which organizations can identify and categorize knowledge assets within their organization: people, processes, content, and technology. It allows an organization to leverage the existing expertise resident in the organization, as well as identify barriers and constraints to fulfilling strategic goals and objectives. It is about constructing a road map to locate the information needed to make the best use of resources, independent of source or form.
