BI Endsem Notes
2. Types of Reports –
a. List
● Used to show detailed information from your database.
● E.g. item list, customer list
● Data in list is shown in the form of rows and columns
● Each column shows all the values for a data item in the database.
● Following operations can be performed on the list.
○ Set list properties.
○ Hide columns in list report
○ Create a scrollable list.
○ Use repeaters.
b. Crosstabs Reports
● Also known as matrix reports.
● Used to show relationships between three or more query items
● Data is shown in rows and columns with info summarized at intersection
point.
● Crosstab concepts and operations:
○ Crosstab nodes and node members
○ Set crosstab properties
○ Create a single-edge crosstab
○ Create a nested crosstab
○ Swap columns and rows
○ Change a list into a crosstab
c. Statistics
● Using statistics, collected data can be summarized and represented in such a way that it can be easily understood and actionable insights can be extracted.
● A statistical reporting strategy uses three basic statistical approaches:
○ Descriptive statistics
■ The main objective of descriptive statistics is to summarize a large portion of the collected data through summaries, charts and tables.
○ Inferential statistics
■ Provide a more detailed and effective statistical data
analysis
○ Psychometric tests
■ Psychometric tests analyze the attributes and performance
of the conducted survey to ensure that the survey data is
reliable and valid
● Statistical reporting tools include factor analysis, cluster analysis, gap analysis, Z-test and U-test.
● Types of statistical reporting data
○ Categorical data
○ Ordinal data
○ Interval data
○ Ratio data
d. Chart
● Used to present data in a way that is useful to the end user.
● They are visual representations of all types of data which may or may not
be related.
● It can represent large sets of data which makes it easy to understand.
● Combination charts use more than one chart type to represent the data.
● Many different types of charts are available:-
a. Column charts
b. Line charts
c. Pie charts
d. Bar charts
e. Area charts
f. Point charts
g. Scatter chart
h. Bubble chart
i. Quadrant chart
e. Map
● Another approach to data visualization, used to analyze geographically correlated data and present it in the form of maps.
● Helps to identify insights from data and make proper decisions.
● Different types of maps:-
a. Heat map:- A heat map is used to represent the relationship between two data variables and provides quantity-wise information, such as high, medium, low. It is represented using different colors.
b. Point map
c. Flow map
d. Statistical maps
e. Bubble map
f. Regional map
g. Administrative maps.
f. Financial
● Known as a financial statement; it is a management tool used to communicate key financial information.
● Used by organizations to track their performance and report to investors, and to stay compliant with regulations that require them to follow certain guidelines.
● Three major types of financial reports include the balance sheet, income statement and cash flow statement.
5. Conditional formatting
● Helps users to extract interesting data from reports.
● Basically works by changing the appearance of cells, highlighting them in different colors or formats.
● These conditions are user-defined rules, such as comparison with numerical values, the result of a formula, or text matching.
● Conditional formatting options in Excel -
a. Highlight cell rules
b. Top/ Bottom rules
c. Data bars
d. Color scales
e. Icon sets
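The idea behind a highlight-cell rule can be sketched in Python (the rule, threshold and data below are hypothetical illustrations, not Excel's actual API):

```python
# Minimal sketch of a "highlight cell rule": flag cells whose value
# exceeds a user-defined threshold, mimicking a colour-based highlight.
def highlight_cells(values, threshold):
    return ["HIGH" if v > threshold else "OK" for v in values]

sales = [120, 85, 240, 60]  # hypothetical detail values
print(highlight_cells(sales, 100))  # → ['HIGH', 'OK', 'HIGH', 'OK']
```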
6. Adding Summary Lines to Reports
● Helps to extract quick insights from the dataset and supports further business analysis.
● A number of tools such as Excel, Power BI, Tableau, Power Query Builder etc. are available.
● Summaries can be applied to detail values and summary values.
● Predefined summary functions include:
○ Total
○ Count
○ Minimum
○ Maximum
○ Average
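The predefined summary functions can be sketched in Python (hypothetical detail values):

```python
# Apply the five predefined summary functions to a detail column.
def summarize(values):
    return {
        "total": sum(values),
        "count": len(values),
        "minimum": min(values),
        "maximum": max(values),
        "average": sum(values) / len(values),
    }

print(summarize([10, 20, 30, 40]))
# → {'total': 100, 'count': 4, 'minimum': 10, 'maximum': 40, 'average': 25.0}
```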
7. Drill up
● Performs aggregation by ascending a hierarchy, where one or more dimensions are removed.
● E.g. monthly salary can be converted to yearly salary, or a group of districts can be shown as one state.
Drill-down
● It is a dimension expansion technique that can be applied by adding new
dimensions or expanding existing dimensions.
● E.g. states can be drilled down to districts, or yearly salary can be broken down into monthly salary.
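A drill-up can be sketched as a simple aggregation over hypothetical data; a drill-down would expand the yearly figure back into its months:

```python
# Drill up: aggregate monthly salaries into a single yearly figure.
monthly = {"Jan": 5000, "Feb": 5000, "Mar": 5200}  # hypothetical data

yearly = sum(monthly.values())
print(yearly)  # → 15200
```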
Drill-through capabilities:
● Using this, we can move from one report to another within a session while
maintaining focus on the same data.
● Used to build analytical applications that are bigger than a single report.
● Drill-through operations consist of a network of linked reports that users can
navigate, retaining context and focus to explore and analyze information.
CSV:
PDF:
1. Preserves formatting: PDF files are designed to preserve the formatting and
layout of documents, including images, charts, fonts, and styles. In BI, this
ensures that reports, dashboards, and visualizations are presented consistently,
maintaining their visual integrity across different devices and platforms.
2. Secure and non-editable: PDF files can be encrypted and password-protected,
providing security for sensitive BI data. They are typically non-editable, preventing
unauthorized modifications and preserving the integrity of the information. This
is crucial when sharing reports or distributing information externally.
3. Print-friendly: PDF files are optimized for printing, ensuring that BI reports or
documents can be easily printed without any loss of quality or formatting issues.
This makes it convenient for users who prefer physical copies or need to share
hard copies of the BI output.
4. Cross-platform compatibility: PDF files can be viewed on various devices and
operating systems using free PDF readers. This cross-platform compatibility
ensures that BI reports can be accessed and reviewed by stakeholders,
regardless of the software or hardware they use.
XML:
Excel:
1. Advanced data analysis: Excel provides a rich set of features and functions for
data analysis and manipulation. Users can perform complex calculations, apply
formulas, create charts, and perform statistical analysis directly within Excel,
making it a popular choice for in-depth BI analysis.
2. Data visualization: Excel offers a wide range of charting and graphing options,
allowing users to create visual representations of their BI data. These
visualizations can aid in understanding trends, patterns, and relationships,
enabling stakeholders to derive meaningful insights from the data.
3. Formula-driven calculations: Excel supports powerful formula capabilities,
enabling users to perform calculations on BI data dynamically. This allows for
real-time updates and automatic recalculation of values based on changes in
underlying data, facilitating dynamic reporting and analysis.
4. Collaboration and sharing: Excel files can be easily shared and collaborated on
within teams. Multiple users can work on the same Excel file simultaneously,
making it convenient for collaborative BI efforts. Additionally, Excel files can be
saved in cloud storage platforms, allowing for easy access and sharing across
different locations.
UNIT 4
Data validation:
● The quality of input data may prove unsatisfactory due to incompleteness, noise and inconsistency.
● Hence this data is corrected during data pre-processing by filling in missing values, smoothing out the noise and correcting inconsistencies.
1. Incomplete data (I SIE)
● There is a possibility that some data were not recorded at source in a systematic way or
it may not be available at the time of transaction of record.
● Techniques to partially correct incomplete data are as follows:
○ Elimination: In classification, if a class label is missing for a row, that data row can be eliminated. Likewise, if many attributes within a row are missing, the row can be eliminated.
○ Inspection: Inspect each missing value and determine a plausible value for the attribute. This is time consuming for large datasets but accurate if skillfully exercised.
○ Identification: A conventional value can be used to code and identify missing values, so it is not necessary to delete entire records from the dataset. Example: for a continuous attribute that assumes only positive values, the value -1 can be assigned to all missing data.
○ Substitution: Missing values may be replaced with the mean or median of the observed values. Example: in a database with family income, missing values may be replaced with the average income.
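The substitution technique can be sketched in Python (hypothetical income data, with None marking missing values):

```python
# Replace missing incomes (None) with the mean of the observed values.
incomes = [40000, None, 55000, None, 50000]

observed = [v for v in incomes if v is not None]
mean_income = sum(observed) / len(observed)
filled = [v if v is not None else mean_income for v in incomes]
print(filled)  # missing entries replaced by the average income
```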
2. Data affected by noise
● A random error or variance in a measured variable is known as noise.
● Noise in a data may be introduced due to:
○ Fault in data collection instruments.
○ Error introduced at data entry by a human or a computer.
● Outliers in the dataset must be identified so that they can be corrected and adjusted
subsequently,or entire records containing them can be removed.
● Various ways to identify outliers.
○ Outlier analysis by clustering
■ Partition the dataset into clusters; each cluster can then be represented by a single value (e.g. its centroid), and points that lie far from every cluster can be flagged as outliers.
○ Regression
■ Regression is a statistical measure used to determine the strength of the
relationship between one dependent variable denoted by Y and a series
of independent changing variables.
■ Regression analysis on attribute values can be used to fill in missing values and smooth the data. The two basic types are linear regression and multiple regression; the difference is that the former uses only one independent variable whereas the latter uses two or more independent variables to predict the outcome.
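A minimal least-squares fit for linear regression (one independent variable) can be sketched in pure Python; the hypothetical data below lies exactly on y = 2x + 1, so the fit should recover those coefficients:

```python
# Least-squares fit of y = slope * x + intercept.
def linear_fit(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return slope, intercept

slope, intercept = linear_fit([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)  # → 2.0 1.0
```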
Data transformation:
● Data warehouses integrating data from multiple sources face a problem of inconsistency. To deal with this inconsistency, a data transformation process is employed.
● Data transformation techniques are used to normalize or rescale numerical data to make
it more manageable and comparable.
1.Standardization (mzd) like msd
● Standardization is the process of making the entire dataset values have a particular
property.
● Following methods can be used for standardization /normalization.
○ Min-Max
■ Min-max scaling is a data transformation technique that rescales the
values of a numerical feature to a specific range, typically between 0 and
1.
■ This transformation preserves the relative ordering of the data while
ensuring that all values are within the desired range.
■ It achieves this by subtracting the minimum value from each data point
and then dividing it by the difference between the maximum and minimum
values.
x’ = (x - min(x)) / (max(x) - min(x))
○ Z-score
■ Transforms a numerical feature by subtracting the mean value and
dividing it by the standard deviation.
x’ = (x - mean(x)) / std(x)
○ Decimal scaling
■ Decimal scaling is a data transformation technique that scales down the
values of a numerical feature by shifting the decimal point.
■ The feature is divided by a power of 10 to ensure that the transformed
values lie within a specified range, typically between -1 and 1.
x’ = x / 10^k, where k is the smallest integer such that max(|x’|) < 1
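The three standardization methods can be sketched in Python (hypothetical data; the z-score here uses the population standard deviation):

```python
# Min-max, z-score and decimal-scaling normalization.
def min_max(xs):
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]

def z_score(xs):
    n = len(xs)
    mean = sum(xs) / n
    std = (sum((x - mean) ** 2 for x in xs) / n) ** 0.5
    return [(x - mean) / std for x in xs]

def decimal_scale(xs):
    k = len(str(int(max(abs(x) for x in xs))))  # digits in the largest value
    return [x / 10 ** k for x in xs]

data = [10, 20, 30, 40]
print(min_max(data)[0], min_max(data)[-1])  # → 0.0 1.0
print(decimal_scale(data))                  # → [0.1, 0.2, 0.3, 0.4]
```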
2.Feature extraction.
● Data transformation technique used to reduce the dimensionality of a dataset by
selecting or creating a smaller set of features that capture the most important and
relevant information.
● It helps in removing redundant or irrelevant features and focuses on those that
contribute the most to the analysis or prediction task at hand.
● It converts the raw data into compact representation that still retains the essential
characteristics of the original data.
● The extracted features can then be used as input for various machine learning
algorithms or statistical models to perform tasks such as classification, regression, or
clustering.
Data reduction :
● Data reduction refers to the process of reducing the size or complexity of a dataset while
preserving as much relevant information as possible.
● It aims to overcome challenges such as high dimensionality, computational inefficiency,
or noise in the data.
Criteria to determine whether a data reduction technique should be used:-
1. Efficiency:- data reduction increases -> efficiency increases
2. Accuracy:- data reduction increases -> accuracy decreases
3. Simplicity:- data reduction increases -> simplicity increases
1. Sampling
● Sampling involves selecting a subset of the original data points from a larger
dataset.
● It is done to reduce the computational burden or to obtain a representative
sample that captures the essential characteristics of the entire dataset.
● Types of sampling
○ Simple random sampling: equal probability of selecting any particular item.
○ Sampling without replacement: as each item is selected, it is removed from the population.
○ Sampling with replacement: objects selected for the sample are not removed from the population, so the same item may be selected multiple times.
○ Stratified sampling: data is split into partitions and samples are drawn randomly from each partition.
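Sampling with and without replacement can be sketched with Python's random module (hypothetical population; the seed is fixed only so the sketch is repeatable):

```python
import random

random.seed(0)
population = list(range(100))

without = random.sample(population, 10)       # without replacement: no repeats
with_repl = random.choices(population, k=10)  # with replacement: repeats allowed

print(len(without), len(set(without)))  # → 10 10  (all items distinct)
```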
2. Feature selection
● The objective is to select an optimal number of features to train and build models that generalize well on unseen data and prevent overfitting.
Feature selection can be divided into 3 main areas:- (FWrE) free
● Filter Methods:
Here, features are selected based on correlation.
It checks relationship of each feature with the response variable to be predicted
Types of methods: - Threshold based method, Statistical tests
● Wrapper Methods:-
These methods try to capture the interaction between multiple features by using
a recursive approach to build multiple models using subsets of features and
select the best feature subset.
Types of methods:- Forward selection, Backward elimination.
● Embedded Methods:-
Combine benefits of filter and wrapper methods
Uses ML models to rank and score feature variables based on their
importance.
Types of methods:- Random forest, Decision trees.
3. Histogram Analysis
● It replaces data with an alternative, smaller data representation.
● Divide data into buckets and store average (sum) for each bucket.
● A bucket represents attribute-value / frequency pair
● Types of histogram:-
a. Equal-width histograms - divides the range into N intervals of equal size
b. Equal-depth (frequency) partitioning - Divides the range into N intervals,
each containing approx same no. of samples.
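Equal-width and equal-depth partitioning can be sketched in Python (hypothetical data):

```python
# Equal-width: intervals of equal size; equal-depth: equal counts per bucket.
def equal_width(xs, n):
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / n
    buckets = [[] for _ in range(n)]
    for x in xs:
        i = min(int((x - lo) / width), n - 1)  # clamp the max into the last bucket
        buckets[i].append(x)
    return buckets

def equal_depth(xs, n):
    xs = sorted(xs)
    size = len(xs) // n
    return [xs[i * size:(i + 1) * size] for i in range(n)]

data = [1, 2, 3, 10, 11, 12, 20, 21, 22]
print(equal_depth(data, 3))  # → [[1, 2, 3], [10, 11, 12], [20, 21, 22]]
```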
4. Cluster Analysis
● Clustering is used to group the elements based on their similarity w/o prior
knowledge of their classes
● No target variable is to be predicted
● Categorisation of cluster:- (HOPE)
a. Exclusive clusters - each item belongs to exactly one cluster
b. Overlapping - an item may belong to more than one cluster
c. Probabilistic - cluster membership is expressed as a probability
d. Hierarchical - clusters are nested, with parent clusters containing child clusters
5. Correlation Analysis
● Data redundancy can arise when data from multiple sources is considered for integration.
● The X^2 (chi-square) test can be carried out on nominal data to test how strongly two attributes are related.
● The correlation coefficient and covariance may be used with numeric data; these measure the variation between the attributes.
Data exploration :
● Highlights the relevant features of each attribute using graphical methods.
● 3 phases :-
1. Univariate analysis:- investigate properties of each single attribute in dataset
2. Bivariate analysis:- measure degree of relationship between pairs of attribute
3. Multivariate analysis:- investigate relationship holding within a subset of
attributes.
1.Univariate analysis :
● Univariate analysis is a statistical analysis technique that focuses on examining and
understanding a single variable at a time.
● It involves studying the characteristics, distribution, and summary statistics of a single
variable to gain insights and draw conclusions about it.
● Weight (kg) of five students of a class: - [55, 60, 70, 50, 56]. ->only one variable
● Conclusions can be drawn using
○ central tendency measures (mean, median, mode),
○ dispersion or spread of data (range, min, max, variance, standard deviation).
○ frequency distribution tables, histograms, etc
b. Histograms
● Is a graphical representation of the distribution of single variable
● Provides visual summary of the frequency or count of observations falling into
different intervals or bins
● Provides insights into central tendency, spread, skewness, and outliers
C. Measures of central tendency for numerical attributes
● Central tendency is also known as measure of central location that describes the central
position within set of data.
● Different measures of central tendencies.
a. Mean
i. Mostly used with continuous attributes.
ii. Mean is equal to the sum of all the values in the data set divided by the total no. of observations in it.
iii. Disadvantage - highly susceptible to outliers. Thus, median is preferred in
such cases.
b. Median
i. Suitable when data is skewed as it is not affected by skewed values
ii. It is the middle value of a dataset arranged in ascending order.
iii. Odd values case - central value
iv. Even values case - average of two central values
v. Used in cases where we have extreme large values
c. Mode
i. It is the most frequently occurring value in the dataset.
ii. It is used for categorical data, where the most common category is to be known.
iii. Types - Unimodal (1 mode), Bimodal (2 modes), Trimodal (3 modes)
iv. Empirical relation:
Mean - mode = 3 x (mean - median)
d. Midrange
i. Average of the largest and smallest value in dataset
e. Quartile Deviation -
i. Half the distance between the first and third quartiles: QD = (Q3 - Q1) / 2
ii. The interquartile range is IQR = Q3 - Q1
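The central tendency and spread measures can be computed with Python's statistics module, using the weight example from the univariate section:

```python
import statistics

weights = [55, 60, 70, 50, 56]  # weights (kg) from the example above

print(statistics.mean(weights))    # → 58.2
print(statistics.median(weights))  # → 56
midrange = (max(weights) + min(weights)) / 2
print(midrange)                    # → 60.0
```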
2. Bivariate analysis:
● It is an analysis of two variables to determine the relationship between them.
● Used to test hypothesis of association.
● There are 3 cases in bivariate analysis:-
○ Both attributes are numerical
○ One attribute is numerical and other is categorical
○ Both are categorical
A. Graphical analysis
a. Scatter plot
i. Scatter plots can reveal clusters or groupings of data points, indicating a
potential subgroup or pattern within the data.
ii. They can also highlight outliers
iii. Tells relationship between 2 variables : Response variable (Y), Other
independent variable (X)
iv. Example: plotting students' study hours (X) against exam scores (Y); an upward trend in the points indicates a positive relationship.
b. Loess Plots
i. Loess curve is used for fitting a smooth curve between two variables
ii. Applies nonparametric smoothing techniques to scatter plots
iii. They fit a smooth curve to data points, capturing underlying trends or
relationship between variables
iv. Also called as local regression.
c. Level Curves
i. Level curves, also known as contour lines or isocontours, are curves on a
two-dimensional surface or map that connect points of equal value of a
particular quantity.
ii. This quantity could be a function, such as temperature, elevation,
pressure, etc
iii. By examining the contour lines, one can observe areas of high or low
values
iv. Level curves are commonly used in topographic maps to represent
elevation or height above sea level.
d. Quantile-Quantile plots
i. Useful to compare the quantile of two sets of numbers
ii. Example - comparing the quantiles of exam scores from two classes; if the points fall along the line y = x, the two distributions are similar.
f. Time series -
i. Time series data is a collection of observations for a single entity at
different intervals of time.
ii. Example of time series analysis:- Rainfall measurement, Stock prices
iii. It is plot of time series data on one axis (Y-axis) against time on the other
axis (X-axis)
B. Covariance
3. Multivariate analysis:
a. Graphical analysis
i. Scatter Plot Matrix
1. Scatterplot matrices are a great way to roughly determine if you have a
linear correlation between multiple variables.
2. The variables in the scatterplot matrix are written in a diagonal line from
top left to bottom right.
3. For example, in a scatterplot matrix of tree measurements, Girth and Volume may show a clear linear pattern (a strong correlation), while Height and Girth show a weaker one.
ii. Star Plot
1. Star plots, sometimes called radar charts or web charts, are used to display multivariate data.
2. Multivariate in this sense refers to having multiple characteristics to
observe.
3. Star plots are often used to display several different observations of the same type of data.
iii. Spider Web Chart
1. Also known as a radar chart, it is often used when you want to display data across several unique dimensions.
2. These dimensions are usually quantitative, and typically range from zero to a maximum value.
Bivariate Analysis:
● Explores the relationship between two variables to understand their association.
● Determines the strength, direction, and significance of the relationship.
● Visualizes the relationship using scatter plots, line graphs, or bar charts.
● Helps in hypothesis testing and identifying potential predictors or dependent
variables.
Multivariate Analysis:
● Examines the relationships among multiple variables simultaneously.
● Considers the interdependencies and interactions between variables.
● Provides a deeper understanding of complex relationships and patterns.
● Aids in predicting outcomes, identifying latent factors, and clustering similar
cases.
Unit 5:
Impact of Machine learning in Business Intelligence Process
Classification:
Classification problems
● The Classification algorithm is a Supervised Learning technique that is used to identify
the category of new observations on the basis of training data.
● In Classification, a program learns from the given dataset or observations and then
classifies new observations into a number of classes or groups. Such as, Yes or No, 0 or
1, Spam or Not Spam, cat or dog, etc.
● Classes can be called as targets/labels or categories.
● In the classification algorithm, a discrete output function (y) is mapped to the input variable (x).
3. Cross Validation
● It helps in obtaining a more reliable estimate of how well the model is
likely to perform on unseen data by simulating the process of training and
testing on multiple different subsets of the data.
● First step: data is split into k subsets of equal size.
● Second step: each subset in turn is used for testing and the remainder for training.
● The advantage is that all the examples are used for testing and training.
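The k-fold split described above can be sketched in pure Python (hypothetical data, k = 3; each subset serves once as the test set):

```python
# Split the data into k folds; yield (train, test) pairs, one per fold.
def k_fold_splits(data, k):
    size = len(data) // k
    folds = [data[i * size:(i + 1) * size] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test

for train, test in k_fold_splits(list(range(9)), 3):
    print(test)  # each element appears in exactly one test fold
```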
4. ROC Curve
● ROC curve stands for Receiver Operating Characteristics Curve.
● A trade-off between the true positive rate and false positive rate is shown on ROC curve.
● Vertical axis represents the true positive rate and horizontal axis represents the false
positive rate
● The model with perfect accuracy will have an area of 1.0
5. Bootstrapping
● The bootstrap sampling method is a resampling method that uses random sampling with
replacement.
● This means that it is very much possible for an already chosen observation to be
chosen again.
Bayesian methods
Logistic regression.
● It is a supervised ML algorithm mostly used for classification problems. It is used for predicting a categorical dependent variable using a given set of independent variables.
● Output : probabilistic values which lie between 0 and 1.
● Curve : "S" shaped logistic function, which predicts two maximum values (0 or 1).
● Mathematical Formula - Logistic regression uses the sigmoid function, which can be given as y = 1 / (1 + e^(-x)), where y is the dependent variable and x is the independent variable.
● Sigmoid function is used to convert independent variable into expression
of probability that ranges between 0 and 1.
● Types of logistic regression-Binomial, Multinomial, Ordinal
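The sigmoid function is easy to verify in Python:

```python
import math

# Sigmoid maps any real input into the (0, 1) probability range.
def sigmoid(x):
    return 1 / (1 + math.exp(-x))

print(sigmoid(0))  # → 0.5
print(sigmoid(4) > 0.95, sigmoid(-4) < 0.05)  # → True True
```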
1. Partitioning Clustering
Divides the data into non-hierarchical groups. It is also known as the centroid-based method.
The most common example of partitioning clustering is the K-Means Clustering algorithm.
2. Density-Based Clustering
The density-based clustering method connects the highly dense areas into clusters.
3. Distribution Model-Based Clustering
The data is divided based on the probability of how a dataset belongs to a particular distribution. The grouping is done by assuming some distribution, commonly the Gaussian distribution.
4. Hierarchical Clustering
the dataset is divided into clusters to create a tree-like structure, which is also called a
dendrogram.
5. Fuzzy Clustering
Fuzzy clustering is a type of soft method in which a data object may belong to more than one
group or cluster.
Partition methods
● There are further two types:
○ K-means(Link for example https://youtu.be/CLKW6uWJtTc)
■ K-means clustering is a method of vector quantization
■ It originates from signal processing
■ It aims to partition ‘n’ observations into ‘k’ clusters where each observation belongs to the cluster with the nearest mean, which serves as a prototype of the cluster.
■ Centroid is a point that represents the mean of parameter values of all the
points in the cluster.
■ Steps of K-means clustering.
● First, choose the number of clusters (k).
● Examine each element and assign it to one of the clusters depending upon the minimum Euclidean distance.
● Each time an element is added to a cluster, the centroid position is recalculated. This process goes on until all the elements are grouped into clusters.
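The steps above can be sketched as a minimal one-dimensional k-means (hypothetical data; the initial centroids are chosen by hand for clarity rather than randomly):

```python
# Assign points to the nearest centroid, then recompute centroids as means.
def k_means(points, centroids, iters=10):
    clusters = [[] for _ in centroids]
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

centroids, clusters = k_means([1, 2, 3, 10, 11, 12], [1, 10])
print(centroids)  # → [2.0, 11.0]
```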
Hierarchical methods
● In this technique, the dataset is divided into clusters to create a tree-like
structure, which is also called a dendrogram.
● There are two types of Hierarchical clustering:- Divisive and Agglomerative
Agglomerative Clustering: starts with individual data points as clusters and moves upwards (bottom-up) by grouping similar data points until a single cluster is achieved.
Divisive Clustering: initially, all objects are in one cluster, which is then sub-divided (top-down) until we are left with individual data points.
Support
● The support of an itemset is the percentage of transactions in which the items appear.
● If A=>B,
Then Support (A=>B) = (Tuples containing both A and B) / Total no. of tuples
Confidence
● The confidence or strength of an association rule is the ratio of number of transactions
that contain A and B to the no. of transactions that contain A.
● Confidence (A=>B) = (Tuples containing both A and B) / Tuples containing A
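Support and confidence can be computed directly over a hypothetical transaction list:

```python
# Each transaction is the set of items bought together (hypothetical data).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(a, b):
    both = sum(1 for t in transactions if a in t and b in t)
    return both / len(transactions)

def confidence(a, b):
    both = sum(1 for t in transactions if a in t and b in t)
    return both / sum(1 for t in transactions if a in t)

print(support("bread", "milk"))     # → 0.5  (2 of 4 transactions)
print(confidence("bread", "milk"))  # bread buyers who also bought milk (2 of 3)
```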
Apriori Algorithm
● The Apriori algorithm uses frequent itemsets to generate association rules and is mainly
designed to work on databases that contain transactions.
● Used to determine how strongly or weakly the two objects are connected.
● Uses breadth-first search and a hash tree to generate association rules efficiently.
UNIT 6
Tools for Business Intelligence
● Business Intelligence (BI) tools are software applications that enable organizations to
gather, analyze, and visualize data to gain insights and make informed business
decisions.
● BI tools are used for query and generating reports on data, but they can also combine a
large set of applications for the analysis of data.
● Some of the tools are:-
1. Tableau:-
● One of the most popular and simple BI tools (a Salesforce product, not Microsoft).
● It offers a varied range of graphical representations that are extremely interactive
and pleasing.
● This tool mainly serves two functions, the collection of data and data analysis.
2. Datapine:-
● datapine is a cloud-based BI tool that simplifies data analysis and reporting.
● It offers a drag-and-drop interface to create visually appealing dashboards and
reports.
● datapine supports real-time data processing, collaboration, and provides
advanced features like predictive analytics and data mining.
3. Sisense:-
● The main purpose of Sisense is to gather, analyze, and visualize datasets, irrespective of their size.
● Offers an intuitive, user-friendly drag-and-drop interface, allowing anyone to use and understand it, including those who are not from an IT background.
● Also provides advanced analytics capabilities and supports data integration from multiple sources.
4. Power BI:-
● Microsoft product that provides a suite of tools for data analysis and visualization.
● It allows users to connect to various data sources, create interactive dashboards,
and generate insightful reports.
● Power BI offers features like natural language querying, data modeling, and
collaboration options.
Role of analytical tools in BI (mr pev)
● Data exploration
○ Helps to find trends and insights that were previously concealed.
○ Users can select areas for additional study and obtain a deeper grasp of their data.
● Data visualization
○ View data in a variety of visual representations, including graphs, charts and maps.
○ Makes it simple to find patterns and trends, which can be shared with others.
● Analytics for prediction
○ Users can create predictive models that can predict future outcomes using
information from the past.
○ This helps businesses to make better decisions and plan more effectively.
● Data Modelling
○ Build data models to comprehend the connections between various data pieces.
○ Helps companies to find areas of optimization and development.
● Reporting
○ Offer real time insights which helps companies to identify opportunities for
development and reach data-driven decisions.
WEKA Experimenter
● Experimental Setup and Configuration: specify datasets and algorithms
● Automated Execution of Experiments: perform automated execution of algorithms, cross-validation, etc.
● Statistical Analysis and Reporting: provides statistical analysis features, such as significance testing and confidence intervals, allowing users to assess performance
● Result Visualization and Comparison: charts, graphs, scatterplots
● Experiment Management: allows users to save and load experiment configurations
KNIME
● Workflow
○ Series of interconnected nodes define a workflow.
○ Once the workflow is executed, data in the workflow flows from left to
right.
● Component
○ A component in KNIME is a reusable sub-workflow.
○ It is a way to encapsulate a set of nodes and their connections into
a single node that can be easily reused within the same workflow or
across different workflows.
● Metanode
○ Similar to a component but provides a higher level of abstraction.
○ A metanode allows you to group nodes together and define a dedicated configuration interface for the group.
1. Data analytics
● Data analysis is the process of analyzing and interpreting data, which helps in a greater understanding of corporate performance, consumer behavior, market trends, etc.
a. Descriptive analytics
Summarize, compile and describe historical data
b. Diagnostic analytics
Identify patterns and determine factors that led to particular outcome
c. Predictive analytics
Find patterns in data to predict future events using previous data.
d. Prescriptive analytics
Uses data to suggest precise actions that should be taken to obtain a given
result.
2. Business analytics (UIPI)
● Understand Company Performance:
○ Evaluating data on measures such as revenue, customer happiness and profitability helps organizations to better understand their business.
● Identify trends and performance.
○ BA helps us to find the trends and patterns in data which can be used to
make predictions.
● Improve operations
a. BA helps businesses to cut costs, boost productivity, and boost consumer happiness by pinpointing areas for improvement and optimizing their operations.
● Promote innovations
a. BA assists firms in identifying new business prospects and creative
solutions
b. Studies industry trends, consumer behavior, etc.
7. BI Applications in CRM
a. BI helps in identifying trends as well as opportunities to boost customer
engagement,retention and satisfaction.
b. BI will be used in CRM as follows (SCAM)
i. Sales analytics: Helps to discover patterns and trends in customer behavior.
ii. Client segmentation: Helps to divide clients into several groups based on demographics, past purchasing patterns etc.
iii. Analysis of customer feedback: Helps to analyze customer reviews, comments and customer care contacts.
iv. Marketing analytics: Helps companies in tracking and analyzing the results of their marketing initiatives.
8. BI Applications in Marketing (2C2P)
a. Customer segmentation: Helps to divide customers into groups depending on characteristics like behavior, buying habits and demographics.
b. Predictive analytics: Helps to forecast customer behavior and find trends that are likely to result in sales.
c. Performance metrics: Includes website traffic, conversion rates and social media involvement.
d. Competitive analysis: Helps to monitor and examine the marketing initiatives of rival companies, giving users insightful knowledge of their plans.