Session 1a-Data and Data Management

The document discusses Business Intelligence (BI) as a set of applications and technologies for data analysis to aid decision-making and competitive advantage. It covers database management systems, including relational and non-relational databases, and highlights the importance of data quality and governance. Additionally, it explores analytical tools like OLAP and data mining for uncovering patterns and trends in large datasets.

Uploaded by

Preetha P

Data and Data Management

Prof. Rajiv Kumar


IIM Kashipur

Source: Various sources


What Is Business Intelligence (BI)?

 Business intelligence (BI) is a broad category of applications, technologies, and processes for gathering, storing, accessing, and analyzing data to help business users make better decisions.
 Business intelligence is the set of processes, technologies, and tools that help us turn data into information, information into knowledge, and knowledge into plans that guide the organization.
Why Business Intelligence?

 Collecting and refining information from many sources (internal and external)
 Analyzing and presenting the information in useful ways (dashboards, visualizations)
 So that people can make better decisions
 That help build and retain competitive advantage.
DIKW Pyramid (1 of 4)
Know-why: Wisdom is the ability to increase effectiveness. It uses knowledge to create value through correct, well-informed decisions.
Ex. Stock more of the branded dish detergent than of other dish detergents to increase profit.
Know-how: Knowledge is applied information that actively guides task execution, problem solving, and decision making.
Ex. Which brands of dish detergent sell most rapidly at that store.
Know-what: Information is data shaped into a meaningful and useful form.
Ex. Total number of bottles of dish detergent sold at a store.
Know-nothing: Data are streams of raw facts, that is, discrete facts about events.
Ex. Supermarket or mall checkout counters scan millions of pieces of data from bar codes.
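The first three levels of the pyramid can be illustrated with a minimal Python sketch. The scan records, store name, and brand names are made up for illustration; the point is only that information is an aggregate of raw data and knowledge is a comparison that can guide a decision.

```python
from collections import Counter

# Data: hypothetical raw checkout scans as (store, category, brand) tuples.
scans = [
    ("Store 1", "dish detergent", "Brand A"),
    ("Store 1", "dish detergent", "Brand B"),
    ("Store 1", "dish detergent", "Brand A"),
    ("Store 1", "shampoo", "Brand C"),
    ("Store 1", "dish detergent", "Brand A"),
]

def bottles_sold(scans, store, category):
    """Information: total number of items of one category sold at one store."""
    return sum(1 for s, c, _brand in scans if s == store and c == category)

def fastest_selling_brand(scans, store, category):
    """Knowledge: the brand of that category that sells most at that store."""
    counts = Counter(b for s, c, b in scans if s == store and c == category)
    return counts.most_common(1)[0][0]

total = bottles_sold(scans, "Store 1", "dish detergent")
top_brand = fastest_selling_brand(scans, "Store 1", "dish detergent")
```

Wisdom, the decision to stock more of the fastest-selling brand, stays with the manager; the code only supplies the inputs to that decision.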
Database Management Systems

 Database
• Serves many applications by centralizing data and controlling redundant data
 Database management system (DBMS)
• Interfaces between applications and physical data files
• Separates logical and physical views of data
• Solves problems of traditional file environment
 Controls redundancy
 Eliminates inconsistency
 Uncouples programs and data
 Enables organization to centrally manage data and data security
Human Resources Database with Multiple Views
Relational DBMS

 Represents data as two-dimensional tables
 Each table contains data on an entity and its attributes
 Table: grid of columns and rows
• Rows (tuples): Records for different entities
• Fields (columns): Represent attributes of the entity
• Key field: Field used to uniquely identify each record
• Primary key: Key field chosen as the unique identifier for all records in a table
• Foreign key: Primary key of one table used in a second table as a look-up field to identify records from the original table
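As a rough illustration of primary and foreign keys, the sketch below uses Python's built-in sqlite3 module. The supplier and part tables, their columns, and all values are hypothetical and are not taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE supplier (
        supplier_number INTEGER PRIMARY KEY,   -- primary key of supplier
        supplier_name   TEXT NOT NULL
    );
    CREATE TABLE part (
        part_number     INTEGER PRIMARY KEY,   -- primary key of part
        part_name       TEXT NOT NULL,
        supplier_number INTEGER REFERENCES supplier(supplier_number)  -- foreign key
    );
    INSERT INTO supplier VALUES (8259, 'Supplier A'), (8261, 'Supplier B');
    INSERT INTO part VALUES (137, 'Door latch', 8259), (150, 'Door seal', 8261);
""")

def supplier_of(part_number):
    """Follow the foreign key from a part to the supplier record it identifies."""
    part = conn.execute(
        "SELECT supplier_number FROM part WHERE part_number = ?", (part_number,)
    ).fetchone()
    if part is None:
        return None
    supplier = conn.execute(
        "SELECT supplier_name FROM supplier WHERE supplier_number = ?", (part[0],)
    ).fetchone()
    return supplier[0] if supplier else None

def insert_part(part_number, part_name, supplier_number):
    """Return False when the primary key value is already taken."""
    try:
        conn.execute(
            "INSERT INTO part VALUES (?, ?, ?)",
            (part_number, part_name, supplier_number),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

Calling supplier_of(137) looks up the part by its primary key, reads the foreign key stored in that row, and uses it to find the matching supplier. insert_part(137, ...) fails because a primary key must be unique.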
Relational Database Tables
Operations of a Relational DBMS

 Three basic operations used to develop useful sets of data
• SELECT
 Creates a subset consisting of all records that meet stated criteria
• JOIN
 Combines relational tables to provide the user with more information than is available in individual tables
• PROJECT
 Creates a subset of the columns in a table, producing tables with only the information specified
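The three operations can be sketched with sqlite3 as follows. The tables and values are hypothetical. Note one naming trap: the relational SELECT operation corresponds to the SQL WHERE clause, while PROJECT corresponds to the column list after the SQL keyword SELECT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE supplier (supplier_number INTEGER PRIMARY KEY, supplier_name TEXT);
    CREATE TABLE part (
        part_number INTEGER PRIMARY KEY,
        part_name TEXT,
        unit_price REAL,
        supplier_number INTEGER REFERENCES supplier(supplier_number)
    );
    INSERT INTO supplier VALUES (8259, 'Supplier A'), (8261, 'Supplier B');
    INSERT INTO part VALUES
        (137, 'Door latch', 22.0, 8259),
        (145, 'Side mirror', 12.0, 8261),
        (150, 'Door seal', 6.0, 8259);
""")

# SELECT: keep only the rows (records) that meet a stated criterion.
selected = conn.execute(
    "SELECT * FROM part WHERE unit_price > 10 ORDER BY part_number"
).fetchall()

# PROJECT: keep only the columns that are needed.
projected = conn.execute(
    "SELECT part_number, part_name FROM part ORDER BY part_number"
).fetchall()

# JOIN: combine both tables through the shared supplier_number field.
joined = conn.execute("""
    SELECT p.part_number, p.part_name, s.supplier_name
    FROM part AS p
    JOIN supplier AS s ON s.supplier_number = p.supplier_number
    ORDER BY p.part_number
""").fetchall()
```

In practice the three operations are usually combined in a single query; they are separated here only to show what each one contributes.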
The Three Basic Operations of a Relational DBMS
Capabilities of Database Management Systems

 Data definition capability
 Data dictionary
 Querying and reporting
• Data manipulation language
 Structured Query Language (SQL)
 Many DBMSs have report generation capabilities for creating polished reports (for example, Microsoft Access)
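A small sqlite3 sketch can show the three capabilities side by side. The employee table and its rows are hypothetical, and SQLite's PRAGMA table_info is used here only as a stand-in for a fuller data dictionary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: declare the structure of the data.
conn.execute("""
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        department  TEXT,
        salary      REAL
    )
""")

def data_dictionary(table):
    """List (column name, declared type, is primary key) for a table.

    The table name is interpolated directly, so it must come from trusted code.
    """
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [(name, col_type, bool(pk))
            for _cid, name, col_type, _notnull, _default, pk in rows]

# Data manipulation: add records, then query them for a simple report.
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Asha", "Sales", 50000.0),
     (2, "Ravi", "Sales", 60000.0),
     (3, "Meera", "HR", 55000.0)],
)
report = conn.execute("""
    SELECT department, COUNT(*), AVG(salary)
    FROM employee
    GROUP BY department
    ORDER BY department
""").fetchall()
```

Here report holds one row per department with its headcount and average salary, which is the kind of result a report generator would then format for presentation.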
Access Data Dictionary Features
Example of an SQL Query
An Access Query
Designing Databases

 Conceptual design vs. physical design
 Normalization
• Streamlining complex groupings of data to minimize redundant data elements and awkward many-to-many relationships
 Referential integrity
• Rules used by an RDBMS to ensure relationships between tables remain consistent
 Entity-relationship diagram
 A correct data model is essential for a system to serve the business well
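Normalization and referential integrity can be sketched together with sqlite3. The order data below is hypothetical and is not the relation shown in the slides' figures. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON; other RDBMSs typically enforce them by default.

```python
import sqlite3

# Hypothetical unnormalized rows: order and part details repeat on every line.
# (order_number, order_date, part_number, part_name, quantity)
unnormalized = [
    (1634, "2024-02-02", 152, "Door latch", 2),
    (1634, "2024-02-02", 137, "Door seal", 3),
    (1635, "2024-02-03", 152, "Door latch", 1),
]

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE orders (order_number INTEGER PRIMARY KEY, order_date TEXT);
    CREATE TABLE part (part_number INTEGER PRIMARY KEY, part_name TEXT);
    CREATE TABLE line_item (
        order_number INTEGER REFERENCES orders(order_number),
        part_number  INTEGER REFERENCES part(part_number),
        quantity     INTEGER,
        PRIMARY KEY (order_number, part_number)
    );
""")

# Normalize: store each order and each part once, and link them via line_item.
for order_number, order_date, part_number, part_name, quantity in unnormalized:
    conn.execute("INSERT OR IGNORE INTO orders VALUES (?, ?)", (order_number, order_date))
    conn.execute("INSERT OR IGNORE INTO part VALUES (?, ?)", (part_number, part_name))
    conn.execute("INSERT INTO line_item VALUES (?, ?, ?)",
                 (order_number, part_number, quantity))

def add_line_item(order_number, part_number, quantity):
    """Return False if the row would violate a key or referential integrity rule."""
    try:
        conn.execute("INSERT INTO line_item VALUES (?, ?, ?)",
                     (order_number, part_number, quantity))
        return True
    except sqlite3.IntegrityError:
        return False

def count(table):
    """Row count for one of the three tables defined above."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
```

After loading, the part name "Door latch" is stored once instead of twice, and add_line_item(9999, 152, 1) is rejected because order 9999 does not exist in the orders table.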
An Unnormalized Relation for Order
Normalized Tables Created from Order
An Entity-Relationship Diagram
Non-Relational Databases and Databases in the Cloud

 Non-relational databases: “NoSQL”
• More flexible data model
• Data sets stored across distributed machines
• Easier to scale
• Handle large volumes of unstructured and structured data
 Databases in the cloud
• Appeal to start-ups, smaller businesses
• Amazon Relational Database Service, Microsoft SQL Azure
• Private clouds
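The idea of a more flexible data model can be sketched with plain Python dictionaries standing in for documents. This mimics the shape of a document store only; it is not the API of any real NoSQL product, and the records and the modulo placement rule are assumptions made for illustration.

```python
import json

# Hypothetical documents: records in one collection need not share the same fields.
customers = [
    {"id": 1, "name": "Asha", "email": "asha@example.com"},
    {"id": 2, "name": "Ravi", "phones": ["555-0100", "555-0101"]},
    {"id": 3, "name": "Meera", "address": {"city": "Kashipur"}, "tags": ["vip"]},
]

def find(collection, **criteria):
    """Return documents whose top-level fields match every criterion."""
    return [doc for doc in collection
            if all(doc.get(key) == value for key, value in criteria.items())]

def machine_for(doc_id, n_machines=3):
    """Toy placement rule: spread documents across machines by id."""
    return doc_id % n_machines

# Documents serialize naturally to JSON, a common storage and exchange format.
as_json = json.dumps(customers)
```

No schema change is needed to add the phones or address fields, which is the flexibility the slide refers to. The trade-off is that the application, not the database, must cope with fields that may be absent.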
The Challenge of Big Data

 Big data
• Massive sets of unstructured/semi-structured data from web
traffic, social media, sensors, and so on
 Volumes too great for typical DBMS
• Petabytes, exabytes of data
 Can reveal more patterns, relationships and anomalies
 Requires new tools and technologies to manage and analyze
Contemporary Business Intelligence Infrastructure
Analytical Tools: Relationships, Patterns, Trends

 Tools for consolidating, analyzing, and providing access to vast amounts of data to help users make better business decisions
• Multidimensional data analysis (OLAP)
• Data mining
• Text mining
• Web mining
Online Analytical Processing (OLAP)

 Supports multidimensional data analysis
• Viewing data using multiple dimensions
• Each aspect of information (product, pricing, cost, region, time period) is a different dimension
• Example: How many washers were sold in the East in June compared with other regions?
 OLAP enables rapid, online answers to ad hoc queries
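The washer question above can be answered with a minimal Python sketch of slicing and aggregating along dimensions. The sales figures are invented, and a real OLAP server would precompute and index such aggregates rather than scan a list.

```python
from collections import defaultdict

# Hypothetical sales facts: (product, region, month, units_sold)
facts = [
    ("washer", "East", "June", 120),
    ("washer", "West", "June", 95),
    ("washer", "East", "July", 80),
    ("dryer", "East", "June", 60),
    ("washer", "South", "June", 70),
]
DIMENSIONS = ("product", "region", "month")

def aggregate(facts, group_by, **filters):
    """Sum units over the group_by dimensions after slicing by the filters."""
    totals = defaultdict(int)
    for *dims, units in facts:
        row = dict(zip(DIMENSIONS, dims))
        if all(row[dim] == value for dim, value in filters.items()):
            totals[tuple(row[dim] for dim in group_by)] += units
    return dict(totals)

# Washers sold in June, broken down by region.
washers_in_june = aggregate(facts, ["region"], product="washer", month="June")

# The same facts viewed along different dimensions: product by month.
by_product_and_month = aggregate(facts, ["product", "month"])
```

Changing group_by or the filters gives a different view of the same facts, which is what makes ad hoc questions cheap to ask.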
Multidimensional Data Model
Data Mining

 Finds hidden patterns and relationships in datasets
• Example: customer buying patterns
 Infers rules to predict future behavior
 Types of information obtainable from data mining:
• Associations
• Sequences
• Classification
• Clustering
• Forecasting
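Of the five types listed, associations are the easiest to sketch. The example below computes support and confidence, two standard measures for association rules, over a handful of hypothetical shopping baskets. It is a toy calculation, not a full mining algorithm.

```python
# Hypothetical market baskets, one set of items per purchase.
baskets = [
    {"chips", "cola"},
    {"chips", "cola", "salsa"},
    {"chips", "salsa"},
    {"cola", "bread"},
    {"chips", "cola", "bread"},
]

def occurrences(itemset, baskets):
    """Number of baskets that contain every item in itemset."""
    return sum(1 for basket in baskets if itemset <= basket)

def support(itemset, baskets):
    """Share of all baskets that contain the itemset."""
    return occurrences(itemset, baskets) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Of the baskets containing antecedent, the share that also contain consequent."""
    return (occurrences(antecedent | consequent, baskets)
            / occurrences(antecedent, baskets))

chips_and_cola_support = support({"chips", "cola"}, baskets)
chips_to_cola_confidence = confidence({"chips"}, {"cola"}, baskets)
```

With this data, chips and cola appear together in 3 of 5 baskets, and 3 of the 4 baskets containing chips also contain cola. A rule such as "customers who buy chips also tend to buy cola" is the kind of association the slide refers to.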
Text Mining and Web Mining

 Text mining
• Extracts key elements from large unstructured data sets
• Sentiment analysis software
 Web mining
• Discovery and analysis of useful patterns and information from the web
• Web content mining
• Web structure mining
• Web usage mining
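As a very rough illustration of sentiment analysis, the sketch below scores text against two small word lists. The lists are made up, and commercial sentiment analysis software uses far richer language models; this only shows the basic idea of extracting a signal from unstructured text.

```python
import re

# Hypothetical, deliberately tiny sentiment lexicons.
POSITIVE = {"good", "great", "love", "excellent", "fast"}
NEGATIVE = {"bad", "poor", "hate", "slow", "broken"}

def sentiment(text):
    """Classify text by counting positive and negative lexicon words."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

reviews = [
    "Great washer, I love it",
    "Slow delivery and a broken door",
    "It arrived on Tuesday",
]
labels = [sentiment(review) for review in reviews]
```

A word-count approach like this misses negation and sarcasm ("not good" scores as positive), which is one reason real tools go well beyond simple lexicons.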
Databases and the Web

 Many companies use the web to make some internal databases available
to customers or partners
 Typical configuration includes:
• Web server
• Application server/middleware/CGI scripts
• Database server (hosting DBMS)
 Advantages of using the web for database access:
• Ease of use of browser software
• Web interface requires few or no changes to database
• Inexpensive to add web interface to system
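The middle tier of the configuration above can be sketched as a small WSGI application using only the Python standard library. The product table, the sku parameter, and the JSON response format are all assumptions made for illustration.

```python
import json
import sqlite3
from urllib.parse import parse_qs

def make_database():
    """Stand-in for the database server: an in-memory table of products."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE product (sku TEXT PRIMARY KEY, name TEXT, price REAL)")
    conn.execute("INSERT INTO product VALUES ('W-100', 'Washer', 499.0)")
    conn.commit()
    return conn

DB = make_database()

def app(environ, start_response):
    """Middleware: turn a web request into a database query and a response."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    sku = query.get("sku", [""])[0]
    row = DB.execute(
        "SELECT sku, name, price FROM product WHERE sku = ?", (sku,)
    ).fetchone()
    if row is None:
        status, body = "404 Not Found", {"error": "unknown sku"}
    else:
        status, body = "200 OK", dict(zip(("sku", "name", "price"), row))
    payload = json.dumps(body).encode("utf-8")
    start_response(status, [("Content-Type", "application/json"),
                            ("Content-Length", str(len(payload)))])
    return [payload]
```

To try it in a browser, the app could be served with wsgiref.simple_server.make_server("", 8000, app).serve_forever() and queried at /?sku=W-100. The database itself needs no changes to be exposed this way, which is the advantage the slide points to.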
Linking Internal Databases to the Web
Establishing an Information Policy

 Firm’s rules, procedures, and roles for sharing, managing, and standardizing data
 Data administration
• Establishes policies and procedures to manage data
 Data governance
• Deals with policies and processes for managing availability, usability, integrity, and security of data, especially regarding government regulations
 Database administration
Ensuring Data Quality

 More than 25 percent of critical data in Fortune 1000 company databases is inaccurate or incomplete
 Before a new database is in place, a firm must:
• Identify and correct faulty data
• Establish better routines for editing data once the database is in operation
 Data quality audit
 Data cleansing
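A data quality audit and a cleansing pass can be sketched in a few lines of Python. The customer records and the specific checks (missing or malformed email, inconsistent capitalization, exact duplicates) are hypothetical; a real audit would cover whatever rules matter to the firm.

```python
# Hypothetical customer records with typical quality problems.
records = [
    {"id": 1, "name": "Asha Rao", "email": "asha@example.com", "state": "UK"},
    {"id": 2, "name": "ravi kumar ", "email": "", "state": "uk"},
    {"id": 3, "name": "Asha Rao", "email": "asha@example.com", "state": "UK"},
    {"id": 4, "name": "Meera Shah", "email": "meera.example.com", "state": "DL"},
]

def audit(records):
    """Data quality audit: map record id to the problems found in that record."""
    issues = {}
    for record in records:
        problems = []
        if not record["email"]:
            problems.append("missing email")
        elif "@" not in record["email"]:
            problems.append("malformed email")
        if problems:
            issues[record["id"]] = problems
    return issues

def cleanse(records):
    """Data cleansing: standardize formatting and drop duplicate records."""
    seen, cleaned = set(), []
    for record in records:
        fixed = {**record,
                 "name": record["name"].strip().title(),
                 "state": record["state"].strip().upper()}
        key = (fixed["name"], fixed["email"].lower(), fixed["state"])
        if key not in seen:  # keep only the first copy of each customer
            seen.add(key)
            cleaned.append(fixed)
    return cleaned

issues = audit(records)
cleaned = cleanse(records)
```

The audit reports problems without changing anything, while cleansing fixes what can be fixed automatically. Records 2 and 4 still need a person to supply or correct the email address.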
