What Can A DSS Analyze?
The benefits of decision support systems include more informed decision-making, timely problem solving, and improved efficiency in dealing with problems that involve rapidly changing variables.
A DSS can be used by operations management and planning levels in an organization to compile information and data and synthesize it into actionable intelligence. This allows the
end user to make more informed decisions at a quicker pace.
The DSS is an information application that produces comprehensive information. This is different from an operations application, which would be used to collect the data in the
first place. A DSS is primarily used by mid- to upper-level management, and it is key for understanding large amounts of data.
For example, a DSS could be used to project a company's revenue over the upcoming six months based on new assumptions about product sales. Due to the large number of variables that surround the projected revenue figures, this is not a straightforward calculation that can be done by hand. A DSS can integrate multiple variables and generate an outcome and alternate outcomes, all based on the company's past product sales data and current variables.
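As a rough, hypothetical illustration of how such a projection might be computed (the product names, prices and growth assumptions below are invented), a minimal Python sketch could combine past sales data with assumed growth rates to produce a base projection and an alternate scenario:

# Minimal sketch of a DSS-style revenue projection (hypothetical data).
past_monthly_units = {"Product A": 1200, "Product B": 800}   # last month's unit sales
unit_price = {"Product A": 25.0, "Product B": 40.0}

def project_revenue(monthly_growth, months=6):
    """Project total revenue over the next `months`, compounding an assumed growth rate."""
    total = 0.0
    for month in range(1, months + 1):
        for product, units in past_monthly_units.items():
            projected_units = units * (1 + monthly_growth) ** month
            total += projected_units * unit_price[product]
    return total

base_case = project_revenue(monthly_growth=0.02)    # assumption: 2% growth per month
alternate = project_revenue(monthly_growth=-0.01)   # alternate scenario: 1% decline per month
print(f"Base case 6-month revenue:  {base_case:,.2f}")
print(f"Alternate 6-month revenue: {alternate:,.2f}")

Changing the assumed growth rate is all it takes to generate an alternate outcome, which mirrors the what-if analysis described above.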
The primary purpose of using a DSS is to present information to the customer in a way that is easy to understand. A DSS is beneficial because it can be programmed to generate many types of reports, all based on user specifications. A DSS can generate information and output it graphically, such as a bar chart that represents projected revenue, or as a written report.
As technology continues to advance, data analysis is no longer limited to large bulky mainframes. Since a DSS is essentially an application, it can be loaded on most computer
systems, including laptops. Certain DSS applications are also available through mobile devices. The flexibility of the DSS is extremely beneficial for customers who travel
frequently. This gives them the opportunity to be well-informed at all times, which in turn provides them with the ability to make the best decisions for their company and
customers at any time.
Components of a DSS
Database Management System (DBMS): To solve a problem, the necessary data may come from internal or external databases. In an organization, internal data are generated by systems such as TPS and MIS. External data come from a variety of sources such as newspapers, online data services, and databases (financial, marketing, human resources).
Model Management System: It stores and accesses the models that managers use to make decisions. Such models are used for designing a manufacturing facility, analyzing the financial health of an organization, forecasting demand for a product or service, etc.
Support Tools: Support tools such as online help, pull-down menus, user interfaces, graphical analysis, and error-correction mechanisms facilitate the user's interaction with the system.
Classification of DSS
There are several ways to classify DSS. Holsapple and Whinston classify DSS as follows:
Text Oriented DSS: It contains textually represented information that could have a bearing on decisions. It allows documents to be electronically created, revised and viewed as needed.
Database Oriented DSS: Database plays a major role here; it contains organized and highly structured data.
Spreadsheet Oriented DSS: It contains information in spreadsheets that allows the user to create, view, and modify procedural knowledge and also to instruct the system to execute self-contained instructions. The most popular tools are Excel and Lotus 1-2-3.
Solver Oriented DSS: It is based on a solver, which is an algorithm or procedure written to perform certain computations for a particular type of problem.
Rules Oriented DSS: It follows certain procedures adopted as rules. An expert system is an example.
Compound DSS: It is built by using two or more of the five structures explained above.
Types of DSS
Status Inquiry System: It helps in taking operational-level or middle-level management decisions, for example, the daily assignment of jobs to machines or of machines to operators.
Data Analysis System: It performs comparative analysis and makes use of a formula or an algorithm, for example, cash flow analysis, inventory analysis, etc.
Information Analysis System: In this system data is analyzed and an information report is generated, for example, sales analysis, accounts receivable systems, market analysis, etc.
Accounting System: It keeps track of accounting and finance related information, for example, final accounts, accounts receivable, accounts payable, etc., that track the major aspects of the business.
Model Based System: Simulation or optimization models used for decision-making are used infrequently and create general guidelines for operations or management.
Group Decision Support and Groupware Technologies
Globalization has not only expanded product markets; it has also made organizations geographically more dispersed. As a result, the way business is done and decisions are made has changed significantly, and collaborative decision-making has become more valuable than ever.
This is why there is an increased emphasis on developing and implementing communications-driven group decision support systems. Decision making, in the current business
environment, is a collaborative process with participation from in-house and remotely located teams or temporary work groups or task forces. In such a scenario, communications-
driven group DSS makes it easier for every participant to send and receive communication and interact with others in real time, from their respective locations, without
meeting physically.
Fosters collaboration between cross-functional business teams at the same or different locations
Allows geographically separated decision makers to connect face-to-face in real time
Allows data sharing with the rest of the team members, work groups or task forces
Now that we know how a communications-driven group DSS can support decision-making among geographically dispersed teams using web-based tools, it’s time to understand
what exactly it is.
There are a number of tools and technologies that can be incorporated in a GDSS (Group Decision Support System), in order to promote better decision making. These include:
Groupware: A software system to enhance collaboration among participants/decision makers and support groups in completing tasks.
Multimedia Decision Support: An integration of computer, video and decision-support technologies, facilitating information sharing, group decision tasks, collaboration and
coordination. It offers a smart decision support in which decisions are directly affected by the way decision makers interact, review information, make choices and take actions.
Electronic Meeting System: A software system to facilitate creative problem solving and decision making using electronic technologies.
Collaborative Workgroup Software: A web-based team collaboration and project management software facilitating group tasks and live discussions for better decision making.
A group decision support system fosters collaboration and team decision-making in four different situations:
Same time, same place: All decision makers are available at the same time and in the same place. The information is displayed either on a computer projection system or on the individual computers of the participants.
Same time, different place: Individuals participate in decision-making from geographically different locations at the same time. The GDSS provides real-time facilities, such as audio, video and web conferencing, to connect them.
Different time, same place: The GDSS fosters communication for those who work at the same place but in different shifts. It offers numerous facilities, including:
Document sharing
Workstation software for shift work
Email
Different time, different place: In this situation participants are geographically distant and also operate in different time zones. The GDSS fosters communication, collaboration and team decision making through:
Conferencing
Bulletin board
Voice mail
Email
Before deciding whether or not to develop a group decision support system, managers must ask themselves the following questions in order to attain more clarity:
Should there be an audio conferencing facility? If yes, how many people should be able to participate in a conference at a given time?
Will participants be using the technology, like bulletin boards?
What will be the alternative for web conferencing when participants are at different locations and in different time zones?
How frequent will resource sharing be, and how and to what extent will participants access information?
Do you wish to integrate emailing with the GDSS?
How can video conferencing be made comfortable for participants?
A lot of thought and planning go into the design and development of a communications-driven group decision support system.
Contingency Theory
A communications-driven GDSS addresses problems associated with group collaboration, communication and decision making, when participants are geographically dispersed and
operate in different time zones.
This means the effectiveness of a GDSS directly depends upon its design, user-interface, DSS architecture, integrated support tools and technical skills possessed by participants
who use DSS.
Although managers may know that the set of tools they have chosen for a GDSS is good, those tools may not perform equally well in all circumstances. There is no one best way of making decisions or supporting group collaboration; a tool or process may work well in some situations and fail terribly in others.
In such a scenario, the managers must resort to a contingency approach that focuses on three main points:
Task Type: The deciding factors include idea generation, creativity, planning, choosing alternatives and action. For example, computer-mediated communication is a good fit for idea-generation activities, while video and audio conferencing are a good choice when decision-making is a function of human intellect.
Group Size: The bigger the group, the greater the differences in technical abilities, likes and interests, preferences and judgments. Small groups may not require extensive support or communication tools, while large groups require more sophisticated and automated tools.
Group Proximity: A more sophisticated communications-driven GDSS is required when the group of decision makers is dispersed and operates in different time zones, while a simpler system is sufficient for a group operating from the same place and at the same time.
A contingency approach depends on task structure, location of team members and difference in organizational attributes.
Virtual Organizations
A virtual organization is an association of physically and/or professionally detached individuals working together on a project or to achieve a mission. It doesn’t have any physical
existence but the technology (internet technology, more precisely) makes it look real.
Communications-driven group decision support systems are best suited for virtual organizations, which require a lot of technological support to foster communication and collaboration and get the work done. Such a system helps a virtual organization to:
Look real
Work in real time
Establish innovative relationships among task forces
Establish professional alliances among participants
A communications-driven GDSS for a virtual organization makes use of various knowledge management technologies, including:
Personal computers
Intranet and extranet
Wireless technologies
Collaborative technologies
Web conferencing
Groupware
World Wide Web
A communications-driven GDSS offers several benefits. It:
Allows group members to contribute significantly to decision making irrespective of their locations and time zones
Extracts greater participation from team members, given the availability of support technologies
Makes document sharing easier, faster and more secure
Fosters more concentrated and focused decision-making
Saves a lot of money and time by allowing participants to contribute from their own locations (users don't need to spend time and money on traveling)
Helps complete tasks faster
Reduces the chances of forgetfulness by offering facilities like bulletin boards and whiteboards
Encourages input of ideas because of its simplicity of use
Increases information sharing, which ultimately speeds knowledge capture and enhances productivity
Makes results available easily and immediately
Makes information easier to understand by displaying it in the form of graphics
Gives more structure to virtual operations and decision-making
Scalability: A tool’s ability to support the needs of all anticipated users is known as scalability. Plus, it should be easily integrated with existing hardware and software
applications.
Reliability: A group support tool must be able to perform the necessary tasks without failing. Decision makers use different technologies at different times in different situations, so the reliability of a support tool should be evaluated before integrating it with the system.
Ease of Installation and Use: A support tool must be easy to install and use. An ideal tool is the one that requires minimal or no formal training for its users. The decision makers
may consult DSS experts to integrate group support tools that are easy to use.
Versatility: Versatility of a support tool plays a crucial role. As different DSS users prefer different platforms, it must be compatible across all platforms. In addition, it must allow
easy customization of features and capabilities.
Security: As a GDSS fosters resource sharing, a support tool must ensure the security of data transfers carried out across firewalls.
Cost: Given the significant expenditure on a GDSS, a support tool must be affordable enough, so that it doesn’t add much to the basic cost of developing and implementing a DSS.
It’s important to select the right communication and support tools to promote good decision making by a team that is physically dispersed. Moreover, a GDSS must be carefully
aligned to the structure of an organization, in order to get the best results.
Groupware Technologies
Groupware is a class of computer programs that enables individuals to collaborate on projects with a common goal from geographically dispersed locations through shared Internet
interfaces as a means to communicate within the group.
Groupware may also include remote access storage systems to archive frequently used data files. These can be altered, accessed and retrieved by workgroup members.
The first commercial groupware products emerged in the early 1990s, when international giants such as IBM and Boeing began using electronic meeting systems for their internal projects. Lotus Notes then appeared as a major product in this category, further enhancing remote group collaboration.
Groupware is either synchronous or asynchronous in nature. Synchronous groupware is a class of applications that allows a group of individuals who are physically separated to
interact with each other using shared computational objects in real time. The fundamental requirement of synchronous groupware is real-time coordination among clients. The user
interfaces advocate a feeling of togetherness. They require shared audio channels for communication.
Asynchronous groupware uses email, structured messages, agents, workflow, computer conferencing agents, file sharing systems and collaborative writing systems, among others.
Asynchronous collaboration between users is well maintained only if users are allowed to make their contributions without restrictions. This can be accomplished through replicated data management systems with read-any/write-any data access, so that users can execute concurrent updates.
The extensive use of groupware on the Internet helped contribute to the development of Web 2.0, which uses instant messaging, Web conferencing, group calendars, document
sharing, etc.
Expert Systems
What are Expert Systems?
Expert systems are computer applications developed to solve complex problems in a particular domain, at the level of extraordinary human intelligence and expertise. Their characteristics include:
High performance
Understandable
Reliable
Highly responsive
Expert systems are capable of:
Advising
Instructing and assisting human in decision making
Demonstrating
Deriving a solution
Diagnosing
Explaining
Interpreting input
Predicting results
Justifying the conclusion
Suggesting alternative options to a problem
The components of an expert system include:
Knowledge Base
Inference Engine
User Interface
Knowledge Base
It contains domain-specific and high-quality knowledge. Knowledge is required to exhibit intelligence. The success of any ES depends largely upon the collection of highly accurate and precise knowledge.
What is Knowledge?
Data is a collection of facts. Information is data organized as facts about the task domain. Data, information, and past experience combined together are termed knowledge.
Factual Knowledge: It is the information widely accepted by the knowledge engineers and scholars in the task domain.
Heuristic Knowledge: It is about practice, accurate judgement, one's ability to evaluate, and guessing.
Knowledge representation
It is the method used to organize and formalize the knowledge in the knowledge base. It is in the form of IF-THEN-ELSE rules.
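As a minimal sketch of this idea (the rules and facts below are invented for illustration and are not drawn from any real knowledge base), IF-THEN rules can be represented in Python as simple condition/conclusion pairs:

# Hypothetical knowledge base: each rule has IF-conditions and a THEN-conclusion.
rules = [
    {"if": {"fever", "rash"},       "then": "suspect_measles"},
    {"if": {"fever", "stiff_neck"}, "then": "suspect_meningitis"},
    {"if": {"suspect_measles"},     "then": "recommend_isolation"},
]

def matching_rules(facts):
    """Return the conclusions of all rules whose conditions are satisfied by the facts."""
    return [r["then"] for r in rules if r["if"] <= facts]

print(matching_rules({"fever", "rash"}))   # ['suspect_measles']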
Knowledge Acquisition
The success of any expert system majorly depends on the quality, completeness, and accuracy of the information stored in the knowledge base.
The knowledge base is formed by readings from various experts, scholars, and the Knowledge Engineers. The knowledge engineer is a person with the qualities of
empathy, quick learning, and case analyzing skills.
He acquires information from the subject expert by recording, interviewing, and observing him at work. He then categorizes and organizes the information in a meaningful way, in the form of IF-THEN-ELSE rules, to be used by the inference engine. The knowledge engineer also monitors the development of the ES.
Inference Engine
The use of efficient procedures and rules by the Inference Engine is essential for deducing a correct, flawless solution.
In the case of a knowledge-based ES, the Inference Engine acquires and manipulates the knowledge from the knowledge base to arrive at a particular solution. In the case of a rule-based ES, it:
Applies rules repeatedly to the facts, which are obtained from earlier rule applications.
Adds new knowledge into the knowledge base if required.
Resolves rules conflict when multiple rules are applicable to a particular case.
To recommend a solution, the Inference Engine uses the following strategies:
Forward Chaining
Backward Chaining
Forward Chaining
It is a strategy of an expert system to answer the question, “What can happen next?”
Here, the Inference Engine follows the chain of conditions and derivations and finally deduces the outcome. It considers all the facts and rules, and sorts them before
concluding to a solution.
This strategy is followed for working on conclusion, result, or effect. For example, prediction of share market status as an effect of changes in interest rates.
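A minimal forward-chaining sketch in Python, using the same invented rule format as above, repeatedly applies rules to the known facts until no new conclusion can be derived; the share-market rules below are purely illustrative:

# Forward chaining: start from known facts and keep firing rules until nothing new is derived.
rules = [
    {"if": {"interest_rates_rise"},    "then": "borrowing_costs_rise"},
    {"if": {"borrowing_costs_rise"},   "then": "corporate_profits_fall"},
    {"if": {"corporate_profits_fall"}, "then": "share_prices_fall"},
]

def forward_chain(facts):
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            if rule["if"] <= facts and rule["then"] not in facts:
                facts.add(rule["then"])   # new fact deduced from an earlier rule application
                changed = True
    return facts

print(forward_chain({"interest_rates_rise"}))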
Backward Chaining
With this strategy, an expert system finds out the answer to the question, “Why this happened?”
On the basis of what has already happened, the Inference Engine tries to find out which conditions could have happened in the past for this result. This strategy is
followed for finding out cause or reason. For example, diagnosis of blood cancer in humans.
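A complementary backward-chaining sketch starts from the hypothesis (the observed result) and works backwards through the rules to check whether the known facts support it; the medical rules below are again invented for illustration:

# Backward chaining: start from a goal and try to prove it from facts and rules.
rules = [
    {"if": {"abnormal_blood_count", "positive_biopsy"}, "then": "blood_cancer"},
    {"if": {"low_hemoglobin", "high_white_cells"},      "then": "abnormal_blood_count"},
]

def backward_chain(goal, facts):
    if goal in facts:
        return True
    for rule in rules:
        if rule["then"] == goal and all(backward_chain(c, facts) for c in rule["if"]):
            return True
    return False

known = {"low_hemoglobin", "high_white_cells", "positive_biopsy"}
print(backward_chain("blood_cancer", known))   # True: the conditions for the result hold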
User Interface
The user interface provides interaction between the user of the ES and the ES itself. It generally uses natural language processing so that it can be operated by a user who is well-versed in the task domain; the user of the ES need not be an expert in Artificial Intelligence.
The user interface also explains how the ES has arrived at a particular recommendation, which makes it easy to trace the credibility of the deductions.
No technology can offer an easy and complete solution. Large systems are costly and require significant development time and computer resources. ESs also have their limitations, including difficult knowledge acquisition, high development and maintenance costs, and an inability to handle problems outside their narrow domain.
Typical application areas of expert systems include, for example, process control systems, which control a physical process based on monitoring.
There are several levels of ES technologies available. Expert systems technologies include −
Expert System Development Environment− The ES development environment includes hardware and tools. They are −
o Workstations, minicomputers, mainframes.
o High level Symbolic Programming Languages such as LISt Programming (LISP) and PROgrammation en LOGique (PROLOG).
o Large databases.
Tools− They reduce the effort and cost involved in developing an expert system to a large extent.
o Powerful editors and debugging tools with multi-windows.
o They provide rapid prototyping
o Have Inbuilt definitions of model, knowledge representation, and inference design.
Shells− A shell is nothing but an expert system without the knowledge base. A shell provides the developers with knowledge acquisition, an inference engine, a user interface, and an explanation facility. For example, a few shells are given below −
o Java Expert System Shell (JESS) that provides fully developed Java API for creating an expert system.
o Vidwan, a shell developed at the National Centre for Software Technology, Mumbai in 1993. It enables knowledge encoding in the form of IF-THEN rules.
The knowledge engineer uses sample cases to test the prototype for any deficiencies in performance.
End users test the prototypes of the ES.
Test and ensure the interaction of the ES with all elements of its environment, including end users, databases, and other information systems.
Document the ES project well.
Train the user to use ES.
Maintain the ES
SQL is a language to operate databases; it includes database creation, deletion, fetching rows, modifying rows, etc. SQL is an ANSI (American National Standards Institute)
standard language, but there are many different versions of the SQL language.
What is SQL?
SQL is Structured Query Language, which is a computer language for storing, manipulating and retrieving data stored in a relational database.
SQL is the standard language for relational database systems. All Relational Database Management Systems (RDBMS) like MySQL, MS Access, Oracle, Sybase, Informix, Postgres and SQL Server use SQL as their standard database language.
SQL Process
When you are executing an SQL command on any RDBMS, the system determines the best way to carry out your request, and the SQL engine figures out how to interpret the task. The components included in this process are:
Query Dispatcher
Optimization Engines
Classic Query Engine
SQL Query Engine, etc.
A classic query engine handles all the non-SQL queries, but a SQL query engine won’t handle logical files.
System Databases
SQL Server mainly contains four system databases (master, model, msdb, tempdb). Each of them is used by SQL Server for a separate purpose. Of these, the master database is the most important.
(i) Master
The master database contains information about the SQL Server configuration. Without the master database, the server can't be started. It stores metadata about all other objects (databases, stored procedures, tables, views, etc.) created in the SQL Server instance.
If the master database gets corrupted and is not recoverable from a backup, the user has to rebuild the master database. Therefore, it is always recommended to maintain a current backup of the master database. As everything crucial to SQL Server is stored in the master database, it cannot be deleted; it is the heart of SQL Server.
(ii) Model
The model database serves as a template for every newly created database. When we create a new database, the contents of the model database are copied into the new database to create its default objects, which include tables, stored procedures, etc. The model database is not used only for creating new user databases: whenever SQL Server starts, tempdb is also created using the model database as a template. By default it does not contain any data.
(iii) Msdb
The msdb database is used mainly by SQL Server Management Studio and SQL Server Agent to store system activities like SQL Server jobs, mail, Service Broker, maintenance plans, user and system database backup history, replication information, and log shipping. We need to take a backup of this database for the proper functioning of the SQL Server Agent service.
(iv) TempDB
From the name of the database itself, we can identify the purpose of this database. It can be accessed by all the users in the SQL Server Instance.
The tempdb is a temporary location for storing temporary tables(Global and Local) and temporary stored procedure that hold intermediate results during the sorting or query
processing and cursors.
If many temporary objects are created and consume tempdb storage, the performance of SQL Server will be affected, so it is recommended to place tempdb in a location with a sufficient amount of space.
This database is created by the SQL Server instance when the SQL Server service starts, using the model database as a template. We cannot take a backup of the tempdb database.
MySQL
MySQL is an open source SQL database, which is developed by a Swedish company – MySQL AB. MySQL is pronounced as “my ess-que-ell,” in contrast with SQL, pronounced
“sequel.”
MySQL supports many different platforms including Microsoft Windows, the major Linux distributions, UNIX, and Mac OS X.
MySQL has free and paid versions, depending on its usage (non-commercial/commercial) and features. MySQL comes with a very fast, multi-threaded, multi-user and robust SQL
database server.
Features
High Performance.
High Availability.
Scalability and Flexibility Run anything.
Robust Transactional Support.
Web and Data Warehouse Strengths.
Strong Data Protection.
Comprehensive Application Development.
Management Ease.
Open Source Freedom and 24 x 7 Support.
Lowest Total Cost of Ownership.
MS SQL Server
MS SQL Server is a Relational Database Management System developed by Microsoft Inc. Its primary query languages are −
T-SQL
ANSI SQL
Features
High Performance
High Availability
Database mirroring
Database snapshots
CLR integration
Service Broker
DDL triggers
Ranking functions
Row version-based isolation levels
XML integration
TRY…CATCH
Database Mail
ORACLE
It is a very large multi-user based database management system. Oracle is a relational database management system developed by ‘Oracle Corporation’.
Oracle efficiently manages its resources, a database of information, among the multiple clients requesting and sending data across the network.
It is an excellent database server choice for client/server computing. Oracle supports all major operating systems for both clients and servers, including MSDOS, NetWare,
UnixWare, OS/2 and most UNIX flavors.
History
Oracle began in 1977; by 2009, it had completed 32 years in the industry.
1977 – Larry Ellison, Bob Miner and Ed Oates founded Software Development Laboratories to undertake development work.
1979 – Version 2.0 of Oracle was released and it became first commercial relational database and first SQL database. The company changed its name to Relational Software Inc. (RSI).
1981 – RSI started developing tools for Oracle.
1982 – RSI was renamed to Oracle Corporation.
1983 – Oracle released version 3.0, rewritten in C language and ran on multiple platforms.
1984 – Oracle version 4.0 was released. It contained features like concurrency control – multi-version read consistency, etc.
2007 – Oracle released Oracle11g. The new version focused on better partitioning, easy migration, etc.
Features
Concurrency
Read Consistency
Locking Mechanisms
Quiesce Database
Portability
Self-managing database
SQL*Plus
ASM
Scheduler
Resource Manager
Data Warehousing
Materialized views
Bitmap indexes
Table compression
Parallel Execution
Analytic SQL
Data mining
Partitioning
MS ACCESS
This is one of the most popular Microsoft products. Microsoft Access is an entry-level database management software. MS Access database is not only inexpensive but also a
powerful database for small-scale projects.
MS Access uses the Jet database engine, which utilizes a specific SQL language dialect (sometimes referred to as Jet SQL).
MS Access comes with the professional edition of the MS Office package. MS Access has an easy-to-use, intuitive graphical interface.
Users can create tables, queries, forms and reports and connect them together with macros.
Option of importing and exporting the data to many formats including Excel, Outlook, ASCII, dBase, Paradox, FoxPro, SQL Server, Oracle, ODBC, etc.
There is also the Jet Database format (MDB or ACCDB in Access 2007), which can contain the application and data in one file. This makes it very convenient to distribute the entire application to
another user, who can run it in disconnected environments.
Microsoft Access offers parameterized queries. These queries and Access tables can be referenced from other programs like VB6 and .NET through DAO or ADO.
The desktop editions of Microsoft SQL Server can be used with Access as an alternative to the Jet Database Engine.
Microsoft Access is a file server-based database. Unlike the client-server relational database management systems (RDBMS), Microsoft Access does not implement database triggers, stored
procedures or transaction logging.
The SQL CREATE DATABASE statement is used to create a new SQL database.
Syntax
The basic syntax of the CREATE DATABASE statement is as follows −
CREATE DATABASE DatabaseName;
Example
If you want to create a new database <testDB>, then the CREATE DATABASE statement would be as shown below −
SQL> CREATE DATABASE testDB;
Make sure you have the admin privilege before creating any database. Once a database is created, you can check it in the list of databases as follows −
SQL> SHOW DATABASES;
+——————–+
| Database |
+——————–+
| information_schema |
| AMROOD |
| TUTORIALSPOINT |
| mysql |
| orig |
| test |
| testDB |
+——————–+
CREATING TABLES
Creating a basic table involves naming the table and defining its columns and each column’s data type.
Syntax
The basic syntax of the CREATE TABLE statement is as follows −
CREATE TABLE table_name( column1 datatype, column2 datatype, column3 datatype, ….. columnN datatype, PRIMARY KEY( one or more columns ));
CREATE TABLE is the keyword telling the database system what you want to do. In this case, you want to create a new table. The unique name or identifier for the table follows
the CREATE TABLE statement.
Then in brackets comes the list defining each column in the table and what sort of data type it is. The syntax becomes clearer with the following example.
A copy of an existing table can be created using a combination of the CREATE TABLE statement and the SELECT statement. You can check the complete details at Create Table
Using another Table.
Example
The following code block is an example, which creates a CUSTOMERS table with an ID as a primary key and NOT NULL are the constraints showing that these fields cannot be
NULL while creating records in this table −
SQL> CREATE TABLE CUSTOMERS( ID INT NOT NULL, NAME VARCHAR (20) NOT NULL, AGE INT NOT NULL, ADDRESS CHAR (25) , SALARY
DECIMAL (18, 2), PRIMARY KEY (ID));
You can verify if your table has been created successfully by looking at the message displayed by the SQL server, otherwise you can use the DESC command as follows −
SQL> DESC CUSTOMERS;
+---------+---------------+------+-----+---------+-------+
| Field   | Type          | Null | Key | Default | Extra |
+---------+---------------+------+-----+---------+-------+
| ID      | int(11)       | NO   | PRI |         |       |
| NAME    | varchar(20)   | NO   |     |         |       |
| AGE     | int(11)       | NO   |     |         |       |
| ADDRESS | char(25)      | YES  |     | NULL    |       |
| SALARY  | decimal(18,2) | YES  |     | NULL    |       |
+---------+---------------+------+-----+---------+-------+
5 rows in set (0.00 sec)
Now, you have CUSTOMERS table available in your database which you can use to store the required information related to customers.
Constraints
Constraints are the rules enforced on the data columns of a table. These are used to limit the type of data that can go into a table. This ensures the accuracy and reliability of the
data in the database.
Constraints could be either on a column level or a table level. The column level constraints are applied only to one column, whereas the table level constraints are applied to the
whole table.
Following are some of the most commonly used constraints available in SQL. These constraints have already been discussed in the SQL – RDBMS Concepts chapter, but it is worth revising them at this point.
NOT NULL Constraint− Ensures that a column cannot have a NULL value.
DEFAULT Constraint− Provides a default value for a column when none is specified.
UNIQUE Constraint− Ensures that all values in a column are different.
PRIMARY Key− Uniquely identifies each row/record in a database table.
FOREIGN Key− References a uniquely identified row/record in another database table.
CHECK Constraint− Ensures that all values in a column satisfy certain conditions.
INDEX− Used to create and retrieve data from the database very quickly.
Constraints can be specified when a table is created with the CREATE TABLE statement or you can use the ALTER TABLE statement to create constraints even after the table is
created.
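As a small illustration of several of these constraints in one table definition, the following sketch uses Python's built-in sqlite3 module with made-up table and column names; exact constraint syntax varies slightly between database products.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces FOREIGN KEYs only when enabled
conn.execute("""
    CREATE TABLE DEPARTMENTS (
        DEPT_ID INT PRIMARY KEY,
        NAME    VARCHAR(30) NOT NULL UNIQUE
    )""")
conn.execute("""
    CREATE TABLE EMPLOYEES (
        EMP_ID  INT PRIMARY KEY,                        -- PRIMARY KEY constraint
        NAME    VARCHAR(30) NOT NULL,                   -- NOT NULL constraint
        AGE     INT CHECK (AGE >= 18),                  -- CHECK constraint
        SALARY  DECIMAL(10, 2) DEFAULT 0.0,             -- DEFAULT constraint
        DEPT_ID INT REFERENCES DEPARTMENTS(DEPT_ID)     -- FOREIGN KEY constraint
    )""")
conn.execute("INSERT INTO DEPARTMENTS VALUES (1, 'Sales')")
conn.execute("INSERT INTO EMPLOYEES (EMP_ID, NAME, AGE, DEPT_ID) VALUES (1, 'Ravi', 32, 1)")
# Prints the inserted row with the default salary applied.
print(conn.execute("SELECT NAME, SALARY FROM EMPLOYEES").fetchall())
conn.close()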
Dropping Constraints
Any constraint that you have defined can be dropped using the ALTER TABLE command with the DROP CONSTRAINT option.
For example, to drop the primary key constraint in the EMPLOYEES table, you can use the following command (substituting the actual constraint name):
ALTER TABLE EMPLOYEES DROP CONSTRAINT <constraint_name>;
Some implementations may provide shortcuts for dropping certain constraints. For example, to drop the primary key constraint for a table in Oracle, you can use the following command:
ALTER TABLE EMPLOYEES DROP PRIMARY KEY;
Some implementations allow you to disable constraints. Instead of permanently dropping a constraint from the database, you may want to temporarily disable the constraint and
then enable it later.
Integrity Constraints
Integrity constraints are used to ensure accuracy and consistency of the data in a relational database. Data integrity is handled in a relational database through the concept of
referential integrity.
There are many types of integrity constraints that play a role in Referential Integrity (RI). These constraints include Primary Key, Foreign Key, Unique Constraints and other
constraints which are mentioned above.
DML resembles simple English language and enhances efficient user interaction with the system. The functional capability of DML is organized in manipulation commands like
SELECT, UPDATE, INSERT INTO and DELETE FROM, as described below:
SELECT: This command is used to retrieve rows from a table. The syntax is SELECT [column name(s)] from [table name] where [conditions]. SELECT is the most widely used
DML command in SQL.
UPDATE: This command modifies data of one or more records. An update command syntax is UPDATE [table name] SET [column name = value] where [condition]
INSERT: This command adds one or more records to a database table. The insert command syntax is INSERT INTO [table name] [column(s)] VALUES [value(s)].
DELETE: This command removes one or more records from a table according to specified conditions. Delete command syntax is DELETE FROM [table name] WHERE [condition]. A short, runnable sketch of these four commands follows.
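The sketch below runs each of these four DML commands against an in-memory SQLite database from Python; the table and data are invented for illustration.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CUSTOMERS (ID INT PRIMARY KEY, NAME TEXT, AGE INT, SALARY REAL)")

# INSERT INTO: add records
conn.execute("INSERT INTO CUSTOMERS (ID, NAME, AGE, SALARY) VALUES (1, 'Ramesh', 32, 2000.0)")
conn.execute("INSERT INTO CUSTOMERS (ID, NAME, AGE, SALARY) VALUES (2, 'Khilan', 25, 1500.0)")

# UPDATE: modify existing records that satisfy a condition
conn.execute("UPDATE CUSTOMERS SET SALARY = 2500.0 WHERE ID = 1")

# DELETE FROM: remove records that satisfy a condition
conn.execute("DELETE FROM CUSTOMERS WHERE AGE < 30")

# SELECT: retrieve rows matching a condition
for row in conn.execute("SELECT ID, NAME, SALARY FROM CUSTOMERS WHERE SALARY > 1000"):
    print(row)   # (1, 'Ramesh', 2500.0)
conn.close()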
OLTP (On-line Transaction Processing) is characterized by a large number of short on-line transactions (INSERT, UPDATE, DELETE). The main emphasis for OLTP systems
is put on very fast query processing, maintaining data integrity in multi-access environments and an effectiveness measured by number of transactions per second. In OLTP
database there is detailed and current data, and schema used to store transactional databases is the entity model (usually 3NF).
OLAP (On-line Analytical Processing) is characterized by relatively low volume of transactions. Queries are often very complex and involve aggregations. For OLAP systems a
response time is an effectiveness measure. OLAP applications are widely used by Data Mining techniques. In OLAP database there is aggregated, historical data, stored in multi-
dimensional schemas (usually star schema).
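To make the contrast concrete, the following sketch (hypothetical schema and data, using Python's sqlite3) places a typical OLTP-style query, a fast single-record lookup, next to an OLAP-style query that aggregates historical data across dimensions.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (order_id INT, region TEXT, year INT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", [
    (1, "North", 2017, 120.0), (2, "North", 2018, 150.0),
    (3, "South", 2017, 90.0),  (4, "South", 2018, 200.0),
])

# OLTP-style: a short transaction touching one detailed, current record.
print(conn.execute("SELECT amount FROM sales WHERE order_id = 2").fetchone())

# OLAP-style: a complex, aggregating query over historical data (a multi-dimensional view).
for row in conn.execute(
        "SELECT region, year, SUM(amount) FROM sales GROUP BY region, year"):
    print(row)
conn.close()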
The following table summarizes the major differences between OLTP and OLAP system design.
What the data Reveals a snapshot of ongoing business processes Multi-dimensional views of various kinds of business activities
Inserts and
Short and fast inserts and updates initiated by end users Periodic long-running batch jobs refresh the data
Updates
Queries Relatively standardized and simple queries Returning relatively few records Often complex queries involving aggregations
Depends on the amount of data involved; batch data refreshes and complex queries may
Processing Speed Typically very fast
take many hours; query speed can be improved by creating indexes
Space Larger due to the existence of aggregation structures and history data; requires more
Can be relatively small if historical data is archived
Requirements indexes than OLTP
Database Design Highly normalized with many tables Typically de-normalized with fewer tables; use of star and/or snowflake schemas
Backup and Backup religiously; operational data is critical to run the business, data loss Instead of regular backups, some environments may consider simply reloading the
Recovery is likely to entail significant monetary loss and legal liability OLTP data as a recovery method
Data Marts
A data mart is a repository of data that is designed to serve a particular community of knowledge workers.
The difference between a data warehouse and a data mart can be confusing because the two terms are sometimes used incorrectly as synonyms. A data warehouse is a central
repository for all an organization’s data. The goal of a data mart, however, is to meet the particular demands of a specific group of users within the organization, such as human
resource management (HRM). Generally, an organization’s data marts are subsets of the organization’s data warehouse.
Because data marts are optimized to look at data in a unique way, the design process tends to start with an analysis of user needs. In contrast, a data warehouse’s design process
tends to start with an analysis of what data already exists and how it can be collected and managed in such a way that it can be used later on. A data warehouse tends to be a
strategic but somewhat unfinished concept; a data mart tends to be tactical and aimed at meeting an immediate need.
Today, data virtualization software can be used to create virtual data marts, pulling data from disparate sources and combining it with other data as necessary to meet the needs of
specific business users. A virtual data mart provides knowledge workers with access to the data they need while preventing data silos and giving the organization’s data
management team a level of control over the organization’s data throughout its lifecycle.
Data warehouse information supports business activities such as:
Tuning Production Strategies− Product strategies can be well tuned by repositioning products and managing product portfolios by comparing sales quarterly or yearly.
Customer Analysis− Customer analysis is done by analyzing the customer’s buying preferences, buying time, budget cycles, etc.
Operations Analysis− Data warehousing also helps in customer relationship management, and making environmental corrections. The information also allows us to analyze
business operations.
To integrate heterogeneous databases, there are two approaches:
Query-driven Approach
Update-driven Approach
Query-Driven Approach
This is the traditional approach to integrate heterogeneous databases. This approach was used to build wrappers and integrators on top of multiple heterogeneous databases. These
integrators are also known as mediators.
When a query is issued on the client side, a metadata dictionary translates the query into an appropriate form for the individual heterogeneous sites involved.
Now these queries are mapped and sent to the local query processor.
The results from heterogeneous sites are integrated into a global answer set.
Disadvantages
The query-driven approach needs complex integration and filtering processes, and it is inefficient and very expensive for frequent queries and for queries that require aggregations.
Update-Driven Approach
This is an alternative to the traditional approach. Today's data warehouse systems follow the update-driven approach rather than the traditional approach discussed above. In the update-driven approach, the information from multiple heterogeneous sources is integrated in advance and stored in a warehouse. This information is available for direct querying and analysis.
Advantages
This approach provides high performance. The data are copied, processed, integrated, annotated, summarized, and restructured in the semantic data store in advance, so query processing does not require an interface to the local sources.
Note − Data cleaning and data transformation are important steps in improving the quality of data and data mining results.
Since a data warehouse can gather information quickly and efficiently, it can enhance business productivity.
A data warehouse provides us a consistent view of customers and items, hence, it helps us manage customer relationship.
A data warehouse also helps in bringing down the costs by tracking trends, patterns over a long period in a consistent and reliable manner.
To design an effective and efficient data warehouse, we need to understand and analyze the business needs and construct a business analysis framework. Each person has different
views regarding the design of a data warehouse. These views are as follows −
The top-down view− This view allows the selection of relevant information needed for a data warehouse.
The data source view− This view presents the information being captured, stored, and managed by the operational system.
The data warehouse view− This view includes the fact tables and dimension tables. It represents the information stored inside the data warehouse.
The business query view− It is the view of the data from the viewpoint of the end-user.
Bottom Tier− The bottom tier of the architecture is the data warehouse database server. It is the relational database system. We use the back end tools and utilities to feed data into
the bottom tier. These back end tools and utilities perform the Extract, Clean, Load, and refresh functions.
Middle Tier− In the middle tier, we have the OLAP Server that can be implemented in either of the following ways.
o By Relational OLAP (ROLAP), which is an extended relational database management system. The ROLAP maps the operations on multidimensional data to standard relational
operations.
o By Multidimensional OLAP (MOLAP) model, which directly implements the multidimensional data and operations.
Top-Tier− This tier is the front-end client layer. This layer holds the query tools and reporting tools, analysis tools and data mining tools.
From the perspective of data warehouse architecture, we have the following data warehouse models:
Virtual Warehouse
Data mart
Enterprise Warehouse
Virtual Warehouse
The view over an operational data warehouse is known as a virtual warehouse. It is easy to build a virtual warehouse. Building a virtual warehouse requires excess capacity on
operational database servers.
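A minimal sketch of the idea, using Python's sqlite3 with an invented operational table, defines the "virtual warehouse" as a view over the operational data rather than as a separately loaded copy.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INT, customer TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(1, "Acme", 100.0), (2, "Acme", 250.0), (3, "Zen", 80.0)])

# The "virtual warehouse" is just a view over the operational table; queries against it
# execute on the operational server, which is why excess capacity is needed there.
conn.execute("""CREATE VIEW customer_revenue AS
                SELECT customer, SUM(amount) AS total_revenue
                FROM orders GROUP BY customer""")
print(conn.execute("SELECT * FROM customer_revenue").fetchall())
conn.close()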
Data Mart
Data mart contains a subset of organization-wide data. This subset of data is valuable to specific groups of an organization.
In other words, we can claim that data marts contain data specific to a particular group. For example, the marketing data mart may contain data related to items, customers, and
sales. Data marts are confined to subjects.
Window-based or Unix/Linux-based servers are used to implement data marts. They are implemented on low-cost servers.
The implementation cycle of a data mart is measured in short periods of time, i.e., in weeks rather than months or years.
The life cycle of a data mart may be complex in long run, if its planning and design are not organization-wide.
Data marts are small in size.
Data marts are customized by department.
The source of a data mart is a departmentally structured data warehouse.
Data marts are flexible.
Enterprise Warehouse
An enterprise warehouse collects all the information and subjects spanning an entire organization.
It provides us enterprise-wide data integration.
The data is integrated from operational systems and external information providers.
This information can vary from a few gigabytes to hundreds of gigabytes, terabytes or beyond.
Load Manager
This component performs the operations required for the extract and load process.
The size and complexity of the load manager varies between specific solutions, from one data warehouse to another.
Fast Load
In order to minimize the total load window, the data needs to be loaded into the warehouse in the fastest possible time.
Transformations affect the speed of data processing, so it is more effective to load the data into a relational database prior to applying transformations and checks.
Gateway technology proves not to be suitable, since gateways tend not to be performant when large data volumes are involved.
Simple Transformations
While loading, it may be required to perform simple transformations; after this has been completed, we are in a position to do the complex checks. Suppose we are loading the EPOS sales transactions; we then need to perform the following checks:
Strip out all the columns that are not required within the warehouse.
Convert all the values to required data types.
Warehouse Manager
A warehouse manager is responsible for the warehouse management process. It consists of third-party system software, C programs, and shell scripts.
The size and complexity of warehouse managers varies between specific solutions.
A warehouse manager analyzes the data to perform consistency and referential integrity checks. It also:
Creates indexes, business views and partition views against the base data.
Generates new aggregations, updates existing aggregations and generates normalizations.
Transforms and merges the source data into the published data warehouse.
Backs up the data in the data warehouse.
Archives the data that has reached the end of its captured life.
Note − A warehouse manager also analyzes query profiles to determine whether indexes and aggregations are appropriate.
Query Manager
Query manager is responsible for directing the queries to the suitable tables.
By directing the queries to appropriate tables, the speed of querying and response generation can be increased.
Query manager is responsible for scheduling the execution of the queries posed by the user.
Detailed Information
Detailed information is not kept online; rather, it is aggregated to the next level of detail and then archived to tape. The detailed information part of the data warehouse keeps the detailed information in the starflake schema. Detailed information is loaded into the data warehouse to supplement the aggregated data.
Note − If detailed information is held offline to minimize disk storage, we should make sure that the data has been extracted, cleaned up, and transformed into starflake schema
before it is archived.
Summary Information
Summary Information is a part of data warehouse that stores predefined aggregations. These aggregations are generated by the warehouse manager. Summary Information must be
treated as transient. It changes on-the-go in order to respond to the changing query profiles.
IT is often unwilling or afraid to tell the users what they will be getting and when. Users should be kept clearly informed of the project's scope, schedule and deliverables; keeping them fully informed and involved is by far the most successful approach, while keeping them in the dark almost always results in failure.
The best sponsor is from the business side, not from IT. Most importantly, the sponsor should be in serious need of the data warehouse’s capabilities to solve a specific problem or
gain some advantage for his or her department.
Without the right skills dedicated to the team, the project will fail. The emphasis is on “dedicated to the team.”
The most common cause of failure is an unrealistic schedule, usually imposed without the input or the concurrence of the project manager or team members. Most often, the
imposed schedules have no rationale for specific dates, but are only means to “hold the project manager to a schedule.” A realistic schedule will include all the required tasks to
implement the project along with their durations, assigned resources and task dependencies.
The first decisions to be made are the categories of tools: Extract/Transform/Load, data cleansing, OLAP, ROLAP, data modeling, administration, and so on. The tools must match
the requirements of the organization, the users, and the project. The tools should work together without the need to build interfaces or write special code.
In spite of what the vendors tell you, users must be trained and the training should be geared to the level of user and the way they plan to use the data warehouse. All users must
learn about the data, and power users should have additional in-depth training on the data structures.
Data Cleaning− In this step, the noise and inconsistent data is removed.
Data Integration− In this step, multiple data sources are combined.
Data Selection− In this step, data relevant to the analysis task are retrieved from the database.
Data Transformation− In this step, data is transformed or consolidated into forms appropriate for mining by performing summary or aggregation operations.
Data Mining− In this step, intelligent methods are applied in order to extract data patterns.
Pattern Evaluation− In this step, data patterns are evaluated.
Knowledge Presentation− In this step, knowledge is represented.
1. Identify the goal of the KDD process from the customer’s perspective.
2. Understand application domains involved and the knowledge that’s required
3. Select a target data set or subset of data samples on which discovery is to be performed.
4. Cleanse and preprocess data by deciding strategies to handle missing fields and alter the data as per the requirements.
5. Simplify the data sets by removing unwanted variables. Then, analyze useful features that can be used to represent the data, depending on the goal or task.
6. Match KDD goals with data mining methods to suggest hidden patterns.
7. Choose data mining algorithms to discover hidden patterns. This process includes deciding which models and parameters might be appropriate for the overall KDD process.
8. Search for patterns of interest in a particular representational form, which include classification rules or trees, regression and clustering.
9. Interpret essential knowledge from the mined patterns.
10. Use the knowledge and incorporate it into another system for further action.
11. Document it and make reports for interested parties.
Data Mining Techniques
Data Mining is the process of extracting useful information and patterns from enormous amounts of data. It includes the collection, extraction, analysis and statistical treatment of data. It is also known as the knowledge discovery process, knowledge mining from data, or data/pattern analysis. Data Mining is a logical process of finding useful information in data. Once the information and patterns are found, they can be used to make decisions for developing the business. Data mining tools can give answers to various business questions that were previously too difficult to resolve. They also forecast future trends, which lets business people make proactive decisions.
The data mining process involves three major steps:
Exploration– In this step the data is cleaned and converted into another form, and the nature of the data is determined.
Pattern Identification– The next step is to choose the pattern which will make the best prediction.
Deployment– The identified patterns are used to get the desired outcome.
One of the most important tasks in Data Mining is to select the correct data mining technique. The technique has to be chosen based on the type of business and the type of problem the business faces. A generalized approach has to be used to improve the accuracy and cost-effectiveness of using data mining techniques. There are basically seven main Data Mining techniques, which are discussed below. There are also many other Data Mining techniques, but these seven are considered the ones most frequently used by business people.
Statistics
Clustering
Visualization
Decision Tree
Association Rules
Neural Networks
Classification
1. Statistical Techniques
Statistics is a branch of mathematics relating to the collection and description of data. Statistical techniques are not considered data mining techniques by many analysts, but they still help to discover patterns and build predictive models. For this reason, data analysts should possess some knowledge of different statistical techniques. In today's world people have to deal with large amounts of data and derive important patterns from it, and statistics can help to a great extent in answering questions about that data.
Statistics not only answers such questions, it also helps in summarizing and counting the data and in providing information about it with ease. Through statistical reports people can make smart decisions. There are different forms of statistics, but the most important and useful technique is the collection and counting of data. Commonly used statistical measures and tools include the following (a small sketch of some of them follows the list):
Histogram
Mean
Median
Mode
Variance
Max
Min
Linear Regression
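A brief sketch using Python's built-in statistics module computes several of the measures listed above on an invented data set, including a simple least-squares trend line.

import statistics as st

data = [12, 15, 15, 18, 22, 22, 22, 30]   # hypothetical observations
print("mean:", st.mean(data))
print("median:", st.median(data))
print("mode:", st.mode(data))
print("variance:", st.variance(data))
print("max/min:", max(data), min(data))

# Simple linear regression (least squares) of the data against its index, from first principles.
x = list(range(len(data)))
mean_x, mean_y = st.mean(x), st.mean(data)
num = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, data))
den = sum((xi - mean_x) ** 2 for xi in x)
slope = num / den
intercept = mean_y - slope * mean_x
print("trend line: y = %.2f * x + %.2f" % (slope, intercept))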
2. Clustering Technique
Clustering is one among the oldest techniques used in Data Mining. Clustering analysis is the process of identifying data that are similar to each other. This will help to understand
the differences and similarities between the data. This is sometimes called segmentation and helps the users to understand what is going on within the database. For example, an
insurance company can group its customers based on their income, age, nature of policy and type of claims.
The important clustering methods are:
Partitioning Methods
Hierarchical Agglomerative methods
Density Based Methods
Grid Based Methods
Model Based Methods
The most popular clustering-related algorithm is Nearest Neighbour. The nearest neighbour technique is very similar to clustering: it is a prediction technique in which, to predict the value for one record, we look for records with similar attribute values in the historical database and use the prediction value from the record nearest to the unclassified record. The technique simply states that objects which are closer to each other will have similar prediction values, so the values of nearby objects can be predicted easily. Nearest Neighbour is one of the easiest techniques to use because it works much the way people think, and it also works well in terms of automation; it performs complex ROI calculations with ease. The level of accuracy of this technique is as good as that of other Data Mining techniques.
In business, the Nearest Neighbour technique is most often used in text retrieval: it is used to find the documents that share important characteristics with a main document that has been marked as interesting.
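A minimal nearest-neighbour sketch in pure Python (the records and attributes are invented) predicts the value for an unclassified record from the closest record in the historical data.

import math

# Hypothetical historical records: (income_k, age) -> observed claim amount
history = [((55, 30), 1200.0), ((60, 35), 1500.0), ((30, 55), 4000.0), ((28, 60), 4300.0)]

def predict(record):
    """Return the target value of the historical record nearest to `record`."""
    nearest = min(history, key=lambda item: math.dist(item[0], record))
    return nearest[1]

print(predict((58, 33)))   # close to the first two records, so a low claim is predicted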
3. Visualization
Visualization is a very useful technique for discovering data patterns and is used at the beginning of the Data Mining process. Much research is going on these days to produce interesting projections of databases, which is called projection pursuit. Many data mining techniques produce useful patterns only from good data; visualization is a technique which turns poor data into usable data, letting different kinds of Data Mining methods be used in discovering hidden patterns.
4. Decision Tree
A decision tree is a predictive model and, as the name implies, it looks like a tree. In this technique, each branch of the tree is viewed as a classification question and the leaves of the tree are considered as partitions of the dataset related to that particular classification. This technique can be used for exploration analysis, data pre-processing and prediction work.
A decision tree can be considered as a segmentation of the original dataset, where segmentation is done for a particular reason. The data that fall under a segment have some similarity in the information being predicted. Decision trees provide results that can be easily understood by the user.
The decision tree technique is mostly used by statisticians to find out which data are most relevant to the business problem. It can be used for prediction and for data pre-processing.
The first and foremost step in this technique is growing the tree. The basis of growing the tree is finding the best possible question to ask at each branch of the tree. The decision tree stops growing when any one of the following circumstances is reached:
The segment contains only one record.
All the records in the segment have identical characteristics.
The improvement from splitting further is not substantial enough to be worthwhile.
CART, which stands for Classification and Regression Trees, is a data exploration and prediction algorithm that chooses its questions in a more systematic way: it tries all of the candidate questions and then selects the single best one, which is used to split the data into two segments. It then asks questions of each new segment individually, and so on.
Another popular decision tree technology is CHAID (Chi-Square Automatic Interaction Detector). It is similar to CART but differs in how the splits are chosen: CART searches for the best binary split, whereas CHAID uses chi-square tests to choose its splits and can divide a branch into more than two segments.
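The sketch below shows a CART-style tree in practice, using scikit-learn's DecisionTreeClassifier (which implements an optimised version of CART). The toy records and the claim/no-claim labels are assumptions made up for the example.

```python
# Hedged sketch of a CART-style decision tree: the algorithm repeatedly chooses
# the question (feature and threshold) that best splits the data into two segments.
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical records: [age, income] and whether the customer filed a claim
X = [[25, 35000], [30, 40000], [50, 90000], [55, 95000], [28, 38000], [52, 88000]]
y = ["no_claim", "no_claim", "claim", "claim", "no_claim", "claim"]

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# Print the questions asked at each branch and the resulting segments
print(export_text(tree, feature_names=["age", "income"]))
```

Printing the tree makes the branch questions visible, which is why decision trees are considered easy for users to understand.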
5. Neural Network
Neural networks are another important technique and were among the earliest used in data mining; artificial neural networks grew out of the artificial intelligence community.
Neural networks are fairly easy to use because they are automated to a large extent, so the user is not expected to have deep knowledge of the algorithms or of the database. To make a neural network work efficiently, however, you need to understand how it is structured.
There are two main parts to this technique: the node and the link.
The node, which loosely corresponds to a neuron in the human brain.
The link, which loosely corresponds to a connection between neurons in the human brain.
A neural network is a collection of interconnected neurons, which may form a single layer or multiple layers. The arrangement of the neurons and their interconnections is called the architecture of the network. There is a wide variety of neural network models, each with its own advantages and disadvantages, and each with its own architecture and learning procedure.
Neural networks are a very powerful predictive modelling technique, but they are not easy to understand, even for experts: they create very complex models that are practically impossible to interpret in full. To make the technique more approachable, two kinds of solutions have been suggested.
The first is to package the neural network into a complete solution aimed at a single application.
The second is to bundle it with expert consulting services.
Neural networks have been used in many kinds of applications; in business, a common use is detecting fraud.
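As a small, hedged illustration of the fraud-detection use just mentioned, the sketch below trains a tiny neural network with scikit-learn's MLPClassifier. The transaction features, labels and network size are all invented for the example and would be far richer in a real system.

```python
# Minimal sketch of a neural network as a predictive model; the nodes and their
# weighted links are handled internally by MLPClassifier.
from sklearn.neural_network import MLPClassifier

# Hypothetical transactions: [amount, hour_of_day] labelled fraud (1) or not (0)
X = [[20, 14], [35, 10], [900, 3], [15, 16], [1200, 2], [40, 12], [800, 4], [25, 11]]
y = [0, 0, 1, 0, 1, 0, 1, 0]

net = MLPClassifier(solver="lbfgs", hidden_layer_sizes=(4,), max_iter=2000,
                    random_state=0)
net.fit(X, y)

# Should flag the large late-night transaction; exact output depends on training
print(net.predict([[1000, 3], [30, 13]]))
```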
6. Association
This technique finds associations between two or more items and helps reveal the relationships between different variables in a database. It discovers hidden patterns in the data sets, such as combinations of variables that occur together with high frequency.
Association is most often used in the retail industry to find patterns in sales, which helps increase the conversion rate and thus profit.
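A very small sketch of the underlying idea, using only the Python standard library: count how often pairs of items occur together across transactions, since frequently co-occurring pairs are the raw material for association rules. The transactions below are invented.

```python
# Toy sketch of finding associations: count pair co-occurrences across baskets
# and report the most frequent pairs.
from collections import Counter
from itertools import combinations

transactions = [
    {"energy drink", "video game"},
    {"bread", "milk"},
    {"energy drink", "video game", "chips"},
    {"bread", "milk", "eggs"},
    {"energy drink", "chips"},
]

pair_counts = Counter()
for basket in transactions:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Pairs that appear together most often are candidates for association rules
print(pair_counts.most_common(3))
```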
7. Classification
Classification is the most commonly used data mining technique. It employs a set of pre-classified samples to build a model that can then classify a larger set of data, and it helps derive important information about data and metadata (data about data). The technique is closely related to cluster analysis and typically uses decision trees or neural networks. Two main processes are involved: learning, in which a classification model is built from the pre-classified training data, and classification, in which the model is used to assign new records to classes (a minimal sketch of the two phases follows).
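Here is a minimal sketch of those two phases. GaussianNB is simply one convenient classifier for the illustration (the paragraph above mentions decision trees and neural networks as common choices), and the customer data and class labels are invented.

```python
# Hedged sketch of the two classification phases: learn from pre-classified
# samples, then classify new records.
from sklearn.naive_bayes import GaussianNB

# Phase 1: learning from pre-classified samples ([age, income] -> class label)
X_train = [[25, 35000], [30, 40000], [50, 90000], [55, 95000]]
y_train = ["standard", "standard", "premium", "premium"]
model = GaussianNB().fit(X_train, y_train)

# Phase 2: classification of new, unlabelled records
print(model.predict([[27, 37000], [53, 92000]]))   # expected: standard, premium
```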
Market Basket Analysis
Market basket analysis (MBA) only uses transactions with more than one item, as no associations can be made from single purchases. Item association does not necessarily suggest cause and effect, but simply a measure of co-occurrence. The fact that energy drinks and video games are frequently bought together does not mean that one causes the purchase of the other, but it can be inferred that such a purchase is most probably made by (or for) a gamer. Such rules are hypotheses that must be tested against actual sales data and should not be taken as truth on their own.
Predictive MBA is used to classify cliques of item purchases, events and services that largely occur in sequence.
Differential MBA filters out a high volume of insignificant results and can lead to very in-depth insights. It compares information between different stores, demographics, seasons of the year, days of the week and other factors.
MBA is commonly used by online retailers to make purchase suggestions to consumers. For example, when a person buys a particular model of smartphone, the retailer may
suggest other products such as phone cases, screen protectors, memory cards or other accessories for that particular phone. This is due to the frequency with which other consumers
bought these items in the same transaction as the phone.
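The strength of such a suggestion is usually judged with a few simple measures. The sketch below computes support, confidence and lift for a hypothetical rule "smartphone → phone case" over a handful of invented transactions; the item names and numbers are assumptions for illustration only.

```python
# Illustrative sketch of the measures behind a market basket rule.
transactions = [
    {"smartphone", "phone case", "screen protector"},
    {"smartphone", "phone case"},
    {"smartphone", "memory card"},
    {"phone case"},
    {"headphones"},
]

n = len(transactions)
both = sum(1 for t in transactions if {"smartphone", "phone case"} <= t)
phones = sum(1 for t in transactions if "smartphone" in t)
cases = sum(1 for t in transactions if "phone case" in t)

support = both / n              # how often the pair occurs across all baskets
confidence = both / phones      # how often a phone buyer also buys a case
lift = confidence / (cases / n) # >1 means the items co-occur more than by chance

print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

A lift well above 1 is what makes the accessory suggestion worth showing; co-occurrence alone, as noted above, says nothing about cause and effect.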
MBA is also used in physical retail locations. Due to the increasing sophistication of point of sale systems coupled with big data analytics, stores are using purchase data and MBA
to help improve store layouts so that consumers can more easily find items that are frequently purchased together.
Data mining is widely used in diverse areas. A number of commercial data mining systems are available today, and yet many challenges remain in this field. The following sections discuss the main applications and trends of data mining.
Financial Data Analysis
Data mining is widely applied in banking and finance; typical applications include −
Design and construction of data warehouses for multidimensional data analysis and data mining.
Loan payment prediction and customer credit policy analysis.
Classification and clustering of customers for targeted marketing.
Detection of money laundering and other financial crimes.
Retail Industry
Data mining has great application in the retail industry because retailers collect large amounts of data on sales, customer purchasing history, goods transportation, consumption and services. The quantity of data collected will naturally continue to expand rapidly because of the increasing ease, availability and popularity of the web.
Data mining in the retail industry helps identify customer buying patterns and trends, which leads to improved quality of customer service and better customer retention and satisfaction. Here is a list of examples of data mining in the retail industry −
Design and Construction of data warehouses based on the benefits of data mining.
Multidimensional analysis of sales, customers, products, time and region.
Analysis of effectiveness of sales campaigns.
Customer Retention.
Product recommendation and cross-referencing of items.
Telecommunication Industry
Today the telecommunication industry is one of the fastest-growing industries, providing various services such as fax, pager, cellular phone, internet messenger, images, e-mail and web data transmission. With the development of new computer and communication technologies, the telecommunication industry is expanding rapidly, which is why data mining has become very important for helping to understand the business.
Data mining in the telecommunication industry helps identify telecommunication patterns, catch fraudulent activities, make better use of resources and improve quality of service.
Intrusion Detection
Intrusion refers to any kind of action that threatens the integrity, confidentiality or availability of network resources. In today's connected world, security has become a major issue, and the increased use of the internet, combined with the ready availability of tools and tricks for intruding on and attacking networks, has made intrusion detection a critical component of network administration. Data mining technology can be applied to several areas of intrusion detection.
Selecting a Data Mining System
The selection of a data mining system should take the following features into account −
Data Types− The data mining system may handle formatted text, record-based data, and relational data. The data could also be in ASCII text, relational database data or data warehouse data.
Therefore, we should check what exact format the data mining system can handle.
System Issues− We must consider the compatibility of a data mining system with different operating systems. One data mining system may run on only one operating system or on several. There
are also data mining systems that provide web-based user interfaces and allow XML data as input.
Data Sources− Data sources refer to the data formats on which the data mining system will operate. Some data mining systems may work only on ASCII text files, while others work on multiple relational sources. The data mining system should also support ODBC connections or OLE DB for ODBC connections.
Data Mining functions and methodologies− Some data mining systems provide only one data mining function, such as classification, while others provide multiple functions such as concept description, discovery-driven OLAP analysis, association mining, linkage analysis, statistical analysis, classification, prediction, clustering, outlier analysis, similarity search, etc.
Coupling data mining with databases or data warehouse systems− Data mining systems need to be coupled with a database or a data warehouse system. The coupled components are
integrated into a uniform information processing environment. Here are the types of coupling listed below −
o No coupling
o Loose Coupling
o Semi tight Coupling
o Tight Coupling
Visualization Tools− Visualization in data mining can be categorized as follows −
o Data Visualization
o Mining Results Visualization
o Mining Process Visualization
o Visual Data Mining
Data Mining query language and graphical user interface− An easy-to-use graphical user interface is important to promote user-guided, interactive data mining. Unlike relational database systems, data mining systems do not share an underlying data mining query language.
Trends in Data Mining
The following are some of the current trends in data mining −
Application Exploration.
Scalable and interactive data mining methods.
Integration of data mining with database systems, data warehouse systems and web database systems.
Standardization of data mining query language.
Visual data mining.
New methods for mining complex types of data.
Biological data mining.
Data mining and software engineering.
Web mining.
Distributed data mining.
Real time data mining.
Multi database data mining.
Privacy protection and information security in data mining.
Types of knowledge
Knowledge management is an activity practiced by enterprises all over the world. In the process of knowledge management, these enterprises comprehensively gather information
using many methods and tools.
Then, gathered information is organized, stored, shared, and analyzed using defined techniques.
The analysis of such information will be based on resources, documents, people and their skills.
Properly analyzed information will then be stored as ‘knowledge’ of the enterprise. This knowledge is later used for activities such as organizational decision making and training
new staff members.
There have been many approaches to knowledge management since its early days. Most early approaches relied on manual storage and analysis of information. With the introduction of computers, most organizational knowledge and management processes have been automated.
Therefore, information storing, retrieval and sharing have become convenient. Nowadays, most enterprises have their own knowledge management framework in place.
The framework defines the knowledge gathering points, gathering techniques, tools used, data storing tools and techniques and analyzing mechanism.
1. A Priori
A priori and a posteriori are two of the original terms in epistemology (the study of knowledge). A priori literally means “from before” or “from earlier.” This is because a
priori knowledge depends upon what a person can derive from the world without needing to experience it. This is better known as reasoning. Of course, a degree of experience is
necessary upon which a priori knowledge can take shape.
Let’s look at an example. If you were in a closed room with no windows and someone asked you what the weather was like, you would not be able to answer them with any degree
of truth. If you did, then you certainly would not be in possession of a priori knowledge. It would simply be impossible to use reasoning to produce a knowledgeable answer.
On the other hand, if there were a chalkboard in the room and someone wrote the equation 4 + 6 = ? on the board, then you could find the answer without physically finding four
objects and adding six more objects to them and then counting them. You would know the answer is 10 without needing a real world experience to understand it. In fact,
mathematical equations are one of the most popular examples of a priori knowledge.
2. A Posteriori
Naturally, then, a posteriori literally means “from what comes later” or “from what comes after.” This is a reference to experience and using a different kind of reasoning
(inductive) to gain knowledge. This kind of knowledge is gained by first having an experience (and the important idea in philosophy is that it is acquired through the five senses)
and then using logic and reflection to derive understanding from it. In philosophy, this term is sometimes used interchangeably with empirical knowledge, which is knowledge
based on observation.
It is believed that a priori knowledge is more reliable than a posteriori knowledge. This might seem counter-intuitive, since in the former case someone can simply sit inside a room, with no real experiences at all, and still arrive at knowledge, while in the latter case someone is having real experiences in the world. But the problem lies in this very fact: everyone's experiences are subjective and open to interpretation. A mathematical equation, on the other hand, is law.
3. Explicit Knowledge
Now we are entering the realm of explicit and tacit knowledge. As you have noticed by now, types of knowledge tend to come in pairs and are often antitheses of each other.
Explicit knowledge is similar to a priori knowledge in that it is more formal, or perhaps more reliable. Explicit knowledge is knowledge that is recorded and communicated through media: it is our libraries and databases. The specifics of what is contained matter less than how it is contained, and anything from the sciences to the arts can have elements that can be expressed as explicit knowledge.
The defining feature of explicit knowledge is that it can be easily and quickly transmitted from one individual to another, or to another ten-thousand or ten-billion. It also tends to
be organized systematically. For example, a history textbook on the founding of America would take a chronological approach as this would allow knowledge to build upon itself
through a progressive system; in this case, time.
4. Tacit Knowledge
I should note that tacit knowledge is a relatively new theory introduced only as recently as the 1950s. Whereas explicit knowledge is very easy to communicate and transfer from
one individual to another, tacit knowledge is precisely the opposite. It is extremely difficult, if not impossible, to communicate tacit knowledge through any medium.
For example, a textbook on the founding of America can teach facts (or things we believe to be facts), but an expert musician cannot truly communicate their knowledge in the same way; in other words, they cannot simply tell someone how to play the instrument and have that person immediately possess the knowledge. Such knowledge must be acquired to a degree that goes far beyond theory. In this sense, tacit knowledge most closely resembles a posteriori knowledge, as it can only be achieved through experience.
The biggest difficulty with tacit knowledge is knowing when it is useful and figuring out how to make it usable. Tacit knowledge can only be communicated through consistent and extensive relationships or contact (such as taking lessons from a professional musician). But even in these cases there will not be a true transfer of knowledge; usually two forms of knowledge are born, as each person must fill in certain blanks (such as skill, shortcuts, rhythms, etc.).
5. Propositional Knowledge
Our last pair of knowledge theories is propositional and non-propositional knowledge, both of which share similarities with some of the theories already discussed. Propositional knowledge has the oddest definition yet: it is commonly held to be knowledge that can literally be expressed in propositions, that is, in declarative sentences (hence its other name, declarative knowledge) or indicative propositions.
Propositional knowledge is not so different from a priori and explicit knowledge. The key attribute is knowing that something is true. Again, mathematical equations could be an
example of propositional knowledge, because it is knowledge of something, as opposed to knowledge of how to do something.
The best example is one that contrasts propositional knowledge with our next form of knowledge, non-propositional or procedural knowledge. Let’s use a
textbook/manual/instructional pamphlet that has information on how to program a computer as our example. Propositional knowledge is simply knowing something or having
knowledge of something. So if you read and/or memorized the textbook or manual, then you would know the steps on how to program a computer. You could even repeat these
steps to someone else in the form of declarative sentences or indicative propositions. However, you may have memorized every word yet have no idea how to actually program a
computer. That is where non-propositional or procedural knowledge comes in.
6. Non-Propositional (Procedural) Knowledge
Non-propositional knowledge, better known as procedural knowledge ("non-propositional" is used here because it is the more obvious antithesis of "propositional"), is knowledge that can be used and applied, for example to a problem. Procedural knowledge differs from propositional knowledge in that it is acquired "by doing", whereas propositional knowledge is acquired through more conservative forms of learning.
One of the defining characteristics of procedural knowledge is that it can be claimed in a court of law. In other words, companies that develop their own procedures or methods can
protect them as intellectual property. They can then, of course, be sold, protected, leased, etc.
Procedural knowledge has many advantages. Obviously, hands-on experience is extremely valuable; literally so, as it can be used to obtain employment. We are seeing this today as
experience (procedural) is eclipsing education (propositional). Sure, education is great, but experience is what defines what a person is capable of accomplishing. So someone who
“knows” how to write code is not nearly as valuable as someone who “writes” or “has written” code. However, some people believe that this is a double-edged sword, as the degree
of experience required to become proficient limits us to a relatively narrow field of variety.
But nobody can deny the intrinsic and real value of experience. This is often more accurate than propositional knowledge because it is more akin to the scientific method;
hypotheses are tested, observation is used, and progress results.
Knowledge Management Systems (KMS)
A knowledge management system typically draws on components such as −
Intranet
Data warehouses and knowledge repositories
Decision support tools
Groupware for supporting collaboration
Networks of knowledge workers
Internal expertise
Definition of KMS
A knowledge management system comprises a range of practices used in an organization to identify, create, represent, distribute, and enable the adoption of insights and experience. Such insights and experience constitute knowledge, either embodied in individuals or embedded in organizational processes and practices.
Purpose of KMS
Improved performance
Competitive advantage
Innovation
Sharing of knowledge
Integration
Continuous improvement by:
o Driving strategy
o Starting new lines of business
o Solving problems faster
o Developing professional skills
o Recruiting and retaining talent
Steps for implementing a KMS:
Start with the business problem and the business value to be delivered first.
Identify what kind of strategy to pursue to deliver this value and address the KM problem.
Think about the system required from a people and process point of view.
Finally, think about what kind of technical infrastructure is required to support the people and processes.
Implement system and processes with appropriate change management and iterative staged release.
Knowledge Management Technologies also support knowledge management systems and benefit from the knowledge management infrastructure, especially the information
technology infrastructure. KM technologies constitute a key component of KM systems.
Technologies that support KM include artificial intelligence (AI) technologies including those used for knowledge acquisition and case-based reasoning systems, electronic
discussion groups, computer-based simulations, databases, decision support systems, enterprise resource planning systems, expert systems, management information systems,
expertise locator systems, videoconferencing, and information repositories including best practices databases and lessons learned systems. KM technologies also include the
emergent Web 2.0 technologies, such as wikis and blogs (Becerra-Fernandez and Sabherwal, 2010).
Knowledge management mechanisms and technologies work together and affect each other.
There are four main knowledge management processes, and each process comprises two sub-processes:
Knowledge discovery
o Combination
o Socialization
Knowledge capture
o Externalization
o Internalization
Knowledge sharing
o Socialization
o Exchange
Knowledge application
o Direction
o Routines
Emerging Issues in Business Intelligence
Organizations are closely watching emerging technology trends to discover the next great competitive advantage in the use of information. One trend is easy to identify: more
information. Data volumes are growing across the board, with organizations seeking to tap new sources generated by social media and online customer behavior. This trend is
spurring tremendous interest in better access and analysis of the variety of information available in unstructured or semi-structured content sources.
From a macro perspective, it’s easy to identify the biggest long-term trend in business intelligence: providing nontechnical users with the tools and capabilities to access, analyze,
and share data on their own. However, the road to this destination has not been easy. With IT driving application development and deployment, standard approaches to extending
enterprise BI and data analysis capabilities have been difficult and slow. Getting the requirements right for the data, reports, visualization, and drill-down analysis capabilities is
difficult and never fully satisfactory. By the time requirements have been gathered and turned into application features, users will have identified different requirements.
2. Unified Access and Analysis of All Types of Information Improves User Productivity
As the implementation of BI and analytics tools spreads to more users within organizations, a question inevitably arises: What about all the information in text and document
formats, which accounts for the vast majority of what users encounter? Difficulty in finding information, whether structured or unstructured, is a productivity cost to organizations.
If one of the measures of BI’s value is improved productivity, then BI should help users access and analyze unstructured as well as structured information.
Historically, BI systems have developed in technology ecosystems limited to structured, alphanumeric data, leaving unstructured content to document and content management
systems, search engines, and a lot of manual paperwork. With the majority of content increasingly being stored and generated in digital form, users are demanding better integration
between content access and analysis and the structured realm of BI. Integrated views of all types of information can help managers and frontline workers see the context
surrounding the numbers in structured systems. This enables them to uncover business opportunities and find the root causes of problems more quickly.
Now, with Twitter, Facebook, and other sites, we have hit the social media age: customers are using social networks to influence others and express their shopping interests and
experiences. Organizations are hungry to capture and analyze activity by current and potential customers in social networks and comment fields across the Internet marketplace.
4. Text Analytics Enables Organizations to Interpret Social Media Sentiment Trends and Commentary
Rising interest in social media analysis is putting the spotlight on text analytics, which is the critical technology for understanding “sentiment” in social media, as well as customer
reviews and other content sources. Like data mining, the text mining and analytics category stretches to include a range of techniques and software, such as natural language
processing, relationship extraction, visualization, and predictive analysis.
Text analytics falls within the realm of interpretation rather than exact science, which makes it a nice complement to BI and structured data analytics. Sentiment analysis, for
example, employs statistical and linguistic text analysis methods to understand positive and negative comments. While this analysis can provide an early sense of the reception of a
new product or service, the interpretation cannot replace the more exacting analysis of the numbers done with BI or structured analytics tools. Sentiment analysis, however, can
help organizations become more proactive in taking steps to address negative reactions to products and services before they lead to the poor sales that BI and data warehouse users
detect later in the reporting and analysis of sales transaction figures.
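As a toy illustration of the simplest form of sentiment scoring, the sketch below counts positive and negative words using small hand-made lexicons. Real sentiment analysis relies on much richer statistical and linguistic models; the word lists and example comments here are invented.

```python
# Toy lexicon-based sentiment scoring: positive words add to the score,
# negative words subtract from it.
POSITIVE = {"great", "love", "excellent", "fast"}
NEGATIVE = {"poor", "slow", "broken", "disappointed"}

def sentiment_score(text: str) -> int:
    words = [w.strip(".,!?") for w in text.lower().split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

comments = [
    "love the new phone, excellent battery and fast screen",
    "disappointed, the charger arrived broken and support was slow",
]
for c in comments:
    print(sentiment_score(c), c)   # positive score -> favourable comment
```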
When limited to a reactive posture, organizations face delays and confusion in how to respond to events, which can lead to increased costs and missed opportunities. Reactive
organizations lack a well-orchestrated plan and can only respond to events on a case-by-case basis. With speed and complexity rising in many industries, a reactive posture isn’t
good enough. Organizations need business intelligence and analytics applications and services that will help them shift from a reactive to a proactive and predictive posture.
Traditional BI systems are not enough for organizations to make this shift.
Decision management is the term industry experts and vendors use to describe the integration of analytics with business rules and process management systems to achieve a
predictive and proactive posture in a real-time world. Decision management requires several technologies. Business rules, or conditional statements for guiding decision processes,
are common in application code and logic; the challenge is to implement business rules systems that can guide decisions across applications and processes, not just within one
system. Business process management systems help organizations optimize processes that cross applications and use analytics as part of the continuous improvement of those
processes.
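To show what business rules expressed as data (rather than buried in application code) can look like, here is a minimal sketch in which rules are kept in a table of conditions and actions, so the same rules could in principle guide decisions across systems. The rule names, thresholds and order fields are invented for illustration.

```python
# Minimal sketch of a rules table guiding a decision: each rule pairs a
# condition with an action, and the first matching rule decides the outcome.
RULES = [
    ("flag_large_order", lambda o: o["amount"] > 10_000,         "route to manual review"),
    ("vip_fast_track",   lambda o: o["customer_tier"] == "gold", "approve automatically"),
    ("default",          lambda o: True,                         "standard processing"),
]

def decide(order: dict) -> str:
    for name, condition, action in RULES:
        if condition(order):
            return f"{name}: {action}"

print(decide({"amount": 15_000, "customer_tier": "silver"}))  # flag_large_order
print(decide({"amount": 200, "customer_tier": "gold"}))       # vip_fast_track
```

In a full decision management setup, the conditions would typically be fed by predictive models and event streams rather than hand-written thresholds, but the separation of rules from application logic is the point being illustrated.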
Along with business rules and business process management, a third technology important to decision management is complex (or business) event processing. Events are happening
everywhere; they are recorded or “sensed” from online behavior, RFID tags, manufacturing systems, surveillance, financial services trading, and so on. Integrated with analytics
and data visualization, event processing systems can enable organizations to pick out meaningful events from a stream or “cloud” of noise that is not important.
Organizations can use decision management technologies to automate decisions where speed and complexity overwhelm human-centered decision processes, and where there are
competitive advantages to having decisions executed in real time and driven by predictive models. Decision management is an emerging technology area currently focused on
specialized systems, but as demand for greater execution speed and efficiency grows, more organizations will evaluate its potential for mainstream requirements.