Business Intelligence Notes

Business Intelligence

• Business Intelligence (BI) is a technology-driven process that
helps businesses analyze data to make better decisions. BI
is a set of tools and methods that turn raw data into
useful information, such as reports and dashboards, so that
companies can understand their performance and make
informed decisions to improve their operations.
• Example: A retail company uses BI to analyze sales data
across different regions, products, and time periods.
• Example: An e-commerce company uses BI to track
customer behavior on its website, such as what products are
viewed most often or abandoned in carts.
NoSQL
• NoSQL stands for "Not Only SQL" and refers to a type of
database that allows data to be stored and retrieved in
ways other than the traditional table structures found in
relational (SQL) databases.
• NoSQL databases are designed to handle large volumes
of data, especially when that data is unstructured or
doesn't fit neatly into rows and columns. They are
flexible and can store different types of data, such as
documents, graphs, key-value pairs, or wide-column
stores.
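• As a rough illustration, two of these data models sketched with plain Python structures (a real system would be something like MongoDB for documents or Redis for key-value pairs):

# document model: each record is a self-describing, flexible document
customer_doc = {
    "_id": "cust-001",
    "name": "Thandi",
    "orders": [{"item": "laptop", "qty": 1}],   # nested data, no fixed schema
}

# key-value model: an opaque value looked up by a key
kv_store = {"session:42": {"user": "cust-001", "cart": ["laptop"]}}
print(kv_store["session:42"]["cart"])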
Hadoop
• Hadoop is an open-source framework that allows for the
storage and processing of large datasets across clusters
of computers.
• Hadoop makes it possible to manage and analyze vast
amounts of data efficiently by breaking it down into
smaller chunks and processing them in parallel across
multiple computers.
Big Data
• Big Data refers to extremely large and complex
datasets that are difficult to process and analyze using
traditional data processing tools and techniques. The
size of these datasets often exceeds the capabilities of
standard databases, making it necessary to use more
advanced technologies to handle them.
Data Warehouses

• Data Warehouses
Centralized repositories that integrate data from various
sources, providing a comprehensive and consistent view
for analysis and reporting.
Extract, Transform, Load (ETL) Process

• Extract, Transform, Load (ETL) Process
A process used to move data from source systems to a
target database or data warehouse.
Stages:
Extract: Gather data from various sources.
Transform: Clean, format, and integrate the data.
Load: Insert the transformed data into the target system.
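• A minimal Python sketch of the three stages, with the source data simulated inline and an in-memory SQLite table standing in for the warehouse (all names and figures are made up):

import csv
import io
import sqlite3

# Extract: read raw rows from the source system (a CSV export, simulated here inline)
source = io.StringIO("region,amount\n north ,100.50\nSouth,200\nEast,\n")
raw_rows = list(csv.DictReader(source))

# Transform: clean, standardise, and filter the data
cleaned = [
    {"region": row["region"].strip().title(), "amount": float(row["amount"])}
    for row in raw_rows
    if row["amount"]                       # drop rows with a missing amount
]

# Load: insert the transformed rows into the target warehouse table
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE sales (region TEXT, amount REAL)")
warehouse.executemany("INSERT INTO sales VALUES (:region, :amount)", cleaned)
warehouse.commit()
print(warehouse.execute("SELECT * FROM sales").fetchall())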
Data Marts

• Data Marts
Subsets of a data warehouse designed to serve specific
business units or functions.
Data Lakes

• Data Lakes
Storage repositories that hold raw, unstructured, and
structured data in its native format.
NoSQL Databases

• NoSQL Databases
Databases designed to handle a variety of data models
other than the traditional relational model.
Hadoop
• Hadoop
An open-source framework designed for distributed
storage and processing of large data sets.
In-Memory Databases

• In-Memory Databases
Databases that store data primarily in memory (RAM)
rather than on disk.
ACID properties
• ACID properties are a set of principles that ensure reliable
transactions in database systems. They stand for:
1. Atomicity: Transactions are all-or-nothing. Either the entire
transaction is completed, or none of it is. This ensures that partial
transactions do not affect the database.
2. Consistency: A transaction brings the database from one valid state
to another valid state, maintaining database invariants and
constraints.
3. Isolation: Transactions are executed independently of one another.
The intermediate state of a transaction is not visible to other
transactions until it is complete.
4. Durability: Once a transaction is committed, its changes are
permanent, even in the case of a system failure.
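• A small Python illustration of atomicity using the standard sqlite3 module, where the with-block is a single atomic transaction (the account data is made up):

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()

try:
    with conn:  # both updates commit together, or neither does
        conn.execute("UPDATE accounts SET balance = balance - 70 WHERE name = 'alice'")
        conn.execute("UPDATE accounts SET balance = balance + 70 WHERE name = 'bob'")
        # if any statement above raised an error, both updates would be rolled back
except sqlite3.Error:
    pass

print(dict(conn.execute("SELECT name, balance FROM accounts")))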
The Hadoop Distributed File System
• The Hadoop Distributed File System (HDFS) is a key
component of the Apache Hadoop framework, designed
to store and manage large volumes of data across a
distributed computing environment.
• Apache Hadoop is an open-source framework designed
for processing and storing large datasets in a
distributed computing environment.
MapReduce
• MapReduce is a programming model and processing
technique used to handle large-scale data processing
across distributed computing environments, such as
those managed by Hadoop. It breaks down a task into
smaller, manageable pieces, processes them in parallel,
and then combines the results.
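• A minimal word-count sketch of the idea in plain Python; a real MapReduce job would run the map and reduce functions in parallel across a cluster:

from collections import defaultdict

def map_phase(document):
    # emit (word, 1) pairs for every word in the document
    for word in document.lower().split():
        yield word, 1

def shuffle(pairs):
    # group all values by key, as the framework would between map and reduce
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    # sum the counts for one word
    return key, sum(values)

documents = ["big data needs big tools", "hadoop processes big data"]
pairs = [pair for doc in documents for pair in map_phase(doc)]
results = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(results)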
Descriptive Analysis
• Descriptive Analysis is a statistical method that helps
to summarize and describe the main features of a
dataset. It's often the first step in data analysis,
providing an overview of the data through measures
such as mean, median, mode, and standard deviation,
as well as through visualizations like charts and graphs.
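• A small Python example using the standard statistics module (the sales figures are made up):

import statistics

sales = [120, 135, 150, 150, 160, 175, 300]
print("mean:", statistics.mean(sales))
print("median:", statistics.median(sales))
print("mode:", statistics.mode(sales))
print("std dev:", statistics.stdev(sales))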
Visual Analytics
• Visual Analytics is the science of analytical reasoning
facilitated by interactive visual interfaces. It combines
data analysis and visualization to help users understand
complex datasets and extract insights. By turning raw
data into visual representations such as charts, graphs,
and dashboards, visual analytics makes it easier to spot
trends, patterns, outliers, and correlations.
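• A minimal example of turning data into a chart, assuming matplotlib is installed (the regions and figures are made up):

import matplotlib.pyplot as plt

regions = ["North", "South", "East", "West"]
sales = [120, 95, 140, 80]          # made-up monthly sales figures

plt.bar(regions, sales)
plt.title("Monthly sales by region")
plt.ylabel("Units sold")
plt.savefig("sales_by_region.png")  # the kind of chart a BI dashboard would show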
Regression Analysis

• Regression Analysis is a statistical method used to
examine the relationship between one dependent
variable and one or more independent variables. It's
commonly used to predict outcomes, identify trends,
and understand the strength and nature of relationships
between variables.
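• A small sketch using statistics.linear_regression from the Python standard library (requires Python 3.10+; the figures are made up):

from statistics import linear_regression

ad_spend = [10, 20, 30, 40, 50]        # independent variable
sales = [25, 44, 62, 85, 101]          # dependent variable

slope, intercept = linear_regression(ad_spend, sales)
print(f"fitted line: sales = {slope:.2f} * ad_spend + {intercept:.2f}")
print("predicted sales at spend 60:", slope * 60 + intercept)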
Word cloud
• A word cloud is a visual representation of text data
where the size of each word indicates its frequency or
importance. It's often used to quickly convey the most
common terms or themes within a body of text.
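• The counts behind a word cloud can be computed with collections.Counter; word size in the cloud would then be drawn in proportion to these counts (the text is made up):

from collections import Counter

text = "data drives decisions and data drives insight"
frequencies = Counter(text.lower().split())
for word, count in frequencies.most_common(3):
    print(word, count)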
Conversion funnel
• A conversion funnel is a marketing model that illustrates
the journey potential customers take from the initial
awareness of a product or service to the final
conversion (e.g., making a purchase). It helps
businesses understand and optimize each stage of the
customer journey.
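• A small Python sketch that computes the conversion rate at each stage of a hypothetical funnel (the counts are made up):

funnel = [
    ("visited site", 10_000),
    ("viewed product", 4_000),
    ("added to cart", 1_200),
    ("purchased", 300),
]

# conversion rate of each stage relative to the previous one
for (prev_stage, prev_count), (stage, count) in zip(funnel, funnel[1:]):
    print(f"{prev_stage} -> {stage}: {count / prev_count:.1%}")

print(f"overall conversion: {funnel[-1][1] / funnel[0][1]:.1%}")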
Predictive analytics
• Predictive analytics involves using statistical techniques
and machine learning algorithms to analyze historical
data and make predictions about future events or
behaviors. The goal is to identify patterns and trends
that can inform decision-making and help anticipate
outcomes.
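• A minimal sketch, assuming scikit-learn is installed; the customer data and features are made up:

from sklearn.linear_model import LogisticRegression

# historical data: [days since last purchase, number of orders] -> churned (1) or not (0)
X = [[5, 12], [40, 2], [10, 8], [60, 1], [3, 15], [55, 3]]
y = [0, 1, 0, 1, 0, 1]

model = LogisticRegression().fit(X, y)
# predict the churn probability for a new customer
print(model.predict_proba([[30, 4]])[0][1])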
Time series analysis
• Time series analysis involves studying data points
collected or recorded at specific time intervals to
identify patterns, trends, and seasonal effects over
time. This type of analysis helps in understanding the
underlying structure of the data and making forecasts
about future values.
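• A simple moving-average sketch in plain Python that smooths short-term fluctuation to expose the trend (the monthly figures are made up):

monthly_sales = [100, 120, 90, 110, 130, 95, 115, 140, 100, 120, 150, 110]
window = 3

# average each value with the two before it
moving_avg = [
    sum(monthly_sales[i - window + 1 : i + 1]) / window
    for i in range(window - 1, len(monthly_sales))
]
print(moving_avg)

# a naive forecast for the next period: the last smoothed value
print("forecast:", moving_avg[-1])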
Association Analysis

• Association analysis is a data mining technique used
to identify relationships between variables in large
datasets. It aims to discover patterns, correlations, or
associations between different items.
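• A small Python sketch that counts how often pairs of items appear together in shopping baskets (the transactions are made up):

from itertools import combinations
from collections import Counter

transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
]

pair_counts = Counter()
for basket in transactions:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# support = fraction of transactions containing the pair
for pair, count in pair_counts.most_common(3):
    print(pair, "support:", count / len(transactions))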
Neural Computing

• Neural computing refers to computational models
inspired by the human brain's neural networks. It
involves creating algorithms that can learn from and
make decisions based on data.
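• A minimal single-neuron (perceptron) sketch in plain Python that learns the logical AND function from examples:

def predict(weights, bias, inputs):
    # fire (1) if the weighted sum of inputs plus bias is positive
    activation = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if activation > 0 else 0

data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
weights, bias, rate = [0.0, 0.0], 0.0, 0.1

# adjust the weights a little every time the prediction is wrong
for _ in range(20):
    for inputs, target in data:
        error = target - predict(weights, bias, inputs)
        weights = [w + rate * error * x for w, x in zip(weights, inputs)]
        bias += rate * error

print([predict(weights, bias, inputs) for inputs, _ in data])  # [0, 0, 0, 1]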
Case-Based Reasoning (CBR)

• Case-based reasoning is a problem-solving approach
where new problems are solved by finding and adapting
solutions from similar past cases.
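• A small Python sketch that retrieves the most similar past case and reuses its solution (the cases are made up):

past_cases = [
    {"symptoms": {"slow queries", "high cpu"}, "solution": "add an index"},
    {"symptoms": {"disk full", "failed backups"}, "solution": "expand storage"},
    {"symptoms": {"slow queries", "large scans"}, "solution": "rewrite the query"},
]

def similarity(a, b):
    # Jaccard similarity between two symptom sets
    return len(a & b) / len(a | b)

new_problem = {"slow queries", "high cpu", "timeouts"}
best = max(past_cases, key=lambda case: similarity(case["symptoms"], new_problem))
print("reuse solution:", best["solution"])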
Optimization
• Optimization is the process of finding the best solution
or outcome from a set of possible choices, subject to
certain constraints. It involves selecting the most
efficient or effective option among various alternatives
to achieve a specific objective.
Linear programming
• Linear programming (LP) involves creating a
mathematical model that represents a problem where
you want to optimize a specific objective, such as
maximizing profits or minimizing costs, while adhering
to certain limitations or constraints.
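• A minimal sketch, assuming SciPy is installed; the products, profits, and resource limits are made up:

from scipy.optimize import linprog

# maximize profit 40*x1 + 30*x2 subject to resource limits;
# linprog minimizes, so the objective is negated
c = [-40, -30]
A_ub = [[1, 2],   # labour hours:  x1 + 2*x2 <= 40
        [3, 1]]   # machine hours: 3*x1 + x2 <= 60
b_ub = [40, 60]

result = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print("units to produce:", result.x)
print("maximum profit:", -result.fun)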
