
Business Intelligence

Unit 3
Basics of Neural networks

A neural network is a method in artificial intelligence that teaches computers to process data in a
way that is inspired by the human brain. It is a type of machine learning process, called deep
learning, that uses interconnected nodes or neurons in a layered structure that resembles the
human brain.

1. Neurons and Layers:

 Concept: Neurons are the basic building blocks of neural networks. They receive inputs,
perform a computation, and produce an output.
 Example: In image recognition, each neuron might represent a pixel. The first layer
receives raw pixel values, and subsequent layers extract increasingly complex features.

2. Input Layer:

 Concept: The input layer is where the neural network receives its input data.
 Example: In a spam email classifier, the input layer would consist of neurons
representing features like email content, sender information, etc.

3. Hidden Layers:

 Concept: Between the input and output layers, neural networks have one or more hidden
layers where complex representations are learned.
 Example: In a financial fraud detection system, hidden layers might learn patterns
indicating potential fraudulent transactions.

4. Weights — These values express the strength (degree of importance) of the connection
between any two neurons.

Bias — a constant value added to the sum of the products of the input values and their respective
weights. It is used to accelerate or delay the activation of a given node.

5. Activation function — a function used to introduce non-linearity into the NN system. This
property allows the network to learn more complex patterns.

6. Output Layer:

 Concept: The output layer produces the final result of the neural network's computation.
History and timeline of neural networks
The history of neural networks spans several decades and has seen considerable advancements.
The following examines the important milestones and developments in the history of neural
networks:

 1940s. In 1943, mathematicians Warren McCulloch and Walter Pitts built a circuitry
system that ran simple algorithms and was intended to approximate the functioning of the
human brain.
 1950s. In 1958, Frank Rosenblatt, an American psychologist who's also considered the father
of deep learning, created the perceptron, a form of artificial neural network capable of
learning and making judgments by modifying its weights. The perceptron featured a single
layer of computing units and could handle problems that were linearly separable.
 1970s. Paul Werbos, an American scientist, developed the backpropagation method, which
facilitated the training of multilayer neural networks. It made deep learning possible by
enabling weights to be adjusted across the network based on the error calculated at the output
layer.
 1980s. Cognitive psychologist and computer scientist Geoffrey Hinton, along with computer
scientist Yann LeCun, and a group of fellow researchers began investigating the concept of
connectionism, which emphasizes the idea that cognitive processes emerge through
interconnected networks of simple processing units. This period paved the way for modern
neural networks and deep learning.
 1990s. Jürgen Schmidhuber and Sepp Hochreiter, both computer scientists from Germany,
proposed the Long Short-Term Memory recurrent neural network framework in 1997.
 2000s. Geoffrey Hinton and his colleagues pioneered restricted Boltzmann machines (RBMs),
a sort of generative artificial neural network that enables unsupervised learning. RBMs
opened the path for deep belief networks and deep learning algorithms.

Artificial Neuron — Mathematical Operation on one Neuron

An artificial neuron takes input values (it can be several) with weights assigned to them. Inside
the node, the weighted inputs are summed up, and an activation function is applied to get the
result. The output of the node is passed on to the other nodes or, in the case of the last layer of
the network, the output is the overall output of the network.

A single neuron, like the one described above, performs the following mathematical operation:

output = g(w1·x1 + w2·x2 + … + wn·xn + b) = g(w·x + b)    (Equation 1)

In the equation, four things are happening: each input is multiplied by its respective weight, the
weighted inputs are summed, the bias is added to the result, and then an activation function, g, is
applied so that the output of the neuron is g(w·x + b).

Neural Network Design

A Neural Network (NN) is made of several neurons stacked into layers.

 For an n-dimensional input, the first layer (also called the input layer) will have n nodes,
and the t-dimensional final/output layer will have t neural units.

 All intermediate layers are called hidden layers, and the number of layers in a network
determines the depth of the model. The Figure below shows a 3–4–4–1 NN.

Figure 2: A neural network with 3 input features, two hidden layers with 4 nodes each, and a
one-value output.

 The nodes are densely connected: each node is connected to all the neurons in the
immediately previous layer.
 Each connection has weights expressing the strength of the connection between any two
nodes.
 Every node performs the computation described in Equation 1 except the nodes at the
input layer, as sketched below.
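The following sketch is not from the original text; the weight values are random placeholders. It runs a single forward pass through a hypothetical 3–4–4–1 densely connected network in NumPy, applying Equation 1 at every non-input node:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical 3-4-4-1 network: layer sizes follow Figure 2, weights are random placeholders
rng = np.random.default_rng(0)
layer_sizes = [3, 4, 4, 1]
weights = [rng.normal(size=(n_out, n_in))
           for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n_out) for n_out in layer_sizes[1:]]

def forward(x):
    a = x
    for W, b in zip(weights, biases):
        a = sigmoid(W @ a + b)  # Equation 1 applied at every node of the layer
    return a

print(forward(np.array([2.0, 1.0, -4.0])))  # a single output value for a 3-feature input
```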

Simplified Example

Let’s take a simple example of how a single neuron works. In this example, we assume 3 input
values and a bias of 0.

Figure 3: An artificial neuron with 3 input values 2, 1, and -4 and the weights 0.8, 0.12, and 0.3,
respectively. The bias in this case is set to 0.

In this example, we will consider a commonly used activation function called the sigmoid, which
is defined as

f(x) = 1 / (1 + e^(-x)), with derivative f'(x) = f(x)(1 - f(x)).

The sigmoid f(x) pushes any real value x into the range (0, 1). At this moment, don't mind too
much about the derivative. In a plot of the sigmoid, notice that for a value of x less than -5 or
greater than 5, f(x) approaches 0 or 1, respectively.

As said before, four things are happening inside the neuron. First, the input values are weighted
by multiplying the input values with the corresponding weights.

First operation: 2 × 0.8 = 1.6, 1 × 0.12 = 0.12, -4 × 0.3 = -1.2.

Next is to sum the weighted inputs and then add the bias.

Second operation: 1.6 + 0.12 + (-1.2) + 0 = 0.52.

And lastly, the sigmoid activation function is applied to the above result.

Third operation: f(0.52) = 1 / (1 + e^(-0.52)) ≈ 0.627.

That is it. The output of the neuron is 0.627.

If the given neuron is in a hidden layer, then this output becomes the input of the next
neuron(s). On the other hand, if this value is the output of the last layer, then it can be interpreted
as the final prediction of the model.

Important Note: To simplify the mathematical operation done with the neuron, we can use a
more compact matrix form of the first two operations. In this case, a dot product operation
between the vector of input values and the weights vector will come in handy.
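To make that note concrete, here is a minimal sketch (an illustration, not part of the original text) that reproduces the worked example above using a dot product:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([2.0, 1.0, -4.0])      # input values from Figure 3
w = np.array([0.8, 0.12, 0.3])      # corresponding weights
b = 0.0                             # bias

output = sigmoid(np.dot(w, x) + b)  # first two operations in compact matrix form, then activation
print(round(output, 3))             # 0.627
```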

Connection between Biological and Artificial Neuron

The nervous system in the biological brain consists of two categories of cells: neurons and glial
cells. Glial cells provide supportive functions to the nervous system. Specifically, the cells are
tasked to maintain homeostasis, form a myelin sheath that insulates the nerves, and participate in
signal transmission.

 A neuron is made up of the cell body, axon, and dendrites.



 The dendrites are the projections that act as the input to the neuron. They receive electro-
chemical information from other neurons and propagate it to the cell body.

 On the other hand, the axon is a long elongation of the neuron that transports information
from the cell body into the other neurons, glands, and muscles. Axon connects to the cell
body in a conical projection called the axon hillock. The hillock is responsible for
summing the inhibitory and excitatory signals, and if the sum exceeds some threshold, the
neuron fires a signal (called an action potential). Two neurons connect at the synapses.
The synapses are located at the axon terminal of the first neuron and the dendrites of the
second neuron.

Biological (left) and artificial neuron (right).

Artificial neuron

 An artificial neuron (also called a unit or a node) mimics the biological neuron in structure
and function.

 The artificial neuron takes several input values (synonymous to the dendrites in the
biological neuron) with weights assigned to them (analogous to the role of synapses).

 Inside the node, the weighted inputs are summed up, and an activation function is applied
to get the results. This operation matches the role of the cell body and the axon hillock in
the biological neuron.

 The output of the node is passed on to the other units, an operation that mimics the
process of electro-chemical information being passed on from one neuron to another or to
other parts of the nervous system.

Developing the Neural Network


 How neural networks are developed, and why understanding this cycle will become
invaluable as technology progresses.
 The image below shows a workflow for developing a generic neural network. It's
important to remember that this cycle covers only the neural network side of
development.

 If the solution requires an application that utilizes the network, then this flow is in
addition to the standard software development cycle.

 The first step, Data Sourcing, refers to the collection and “normalization” of data to be
fed into neural networks. The process for this step differs based on data readiness, but in
general involves accessing where the data is stored and converting the data to be in the
same format universally.
 Next is the process of Data Labeling, which can be time consuming for certain neural
networks. For example, a network designed to categorize its input will need to have
initial data that has already been categorized manually.
 Data Versioning is as it sounds: each data set needs to be properly documented so that
developers can reference which sets produced the best results.
The next section of steps involves the cycle of creating the actual neural network. The network,
modeled in the image above, goes through three main stages:
 Model Architecture: The first stage is Model Architecture, where the developer decides,
based on purpose and input data, exactly what type of network to create and what layers
the model will consist of.
 Model Training: After the model is defined it will need to be trained. Model Training is
the stage where the model will be exposed to most of the data that was labeled in the
most recent set.
 The process consists of a batch of data points being passed through the model; the
outputs for each data point are then compared to the labels designated during the Data
Labeling stage.
 Depending on how close the outputs are to their respective label, the model’s weights
will be updated via one of numerous methods to attempt closing the difference between
output and label. This stage ends when either the model has passed over the entire data
set a specified number of times or when each pass yields no additional benefit.
 Model Evaluation: After the training stage, the model enters the Model Evaluation
stage. During this stage the model will pass over the data points not used during training.
This method of splitting the data ensures that the model never explicitly “sees” the data it
is being evaluated with. The only way the model could perform well on this evaluation
set would be if it truly identified the correct patterns within the training data.
Finally, after multiple iterations of the model development sub-cycle, the development team will
have a functioning model that is ready to make predictions. Before the model is utilized in a
production environment it must be versioned.

 Model Versioning is essentially documenting the specifics regarding when the model
was trained and on what data. This step is vital for ensuring model quality in the future. If
a new model is trained, it is important to be able to compare its results with previous
iterations as well as allowing the developers to assess why one model performs better
than the other.

After versioning, the model is officially ready for deployment.

 Model Deployment steps differ based on use case. For example, if the network is a
stand-alone entity, this step is mainly just hosting the model somewhere in the cloud or as
a runnable script. But if the model is to be used within custom software, this is where the
neural network development cycle would return to the software development cycle, most
likely within the “integration” phase.

 After a model is successfully deployed to a production environment there are different
“next steps” based on use case.

 For most high-level classification models (models trained to predict what the data being
passed on is), if the model is trained effectively once, then there should be no reason to
re-train on new data unless there is a change of scope.

 In this situation, the development team would enter the Prediction Monitoring phase.
Essentially, development work is done for this use case; the only remaining action is to
ensure the model continues to perform well. If the model begins to perform poorly, the
team would then need to start the cycle over again.
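As a rough, hypothetical illustration of the Model Training and Model Evaluation stages (the synthetic dataset, network size, and 80/20 split are assumptions, not details from the text), a scikit-learn workflow might look like this:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

# Placeholder labeled data standing in for the Data Sourcing / Data Labeling output
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Hold out data the model never "sees" during training (used in the Model Evaluation stage)
X_train, X_eval, y_train, y_eval = train_test_split(X, y, test_size=0.2, random_state=0)

# Model Architecture + Model Training: a small feed-forward network
model = MLPClassifier(hidden_layer_sizes=(16, 16), max_iter=500, random_state=0)
model.fit(X_train, y_train)

# Model Evaluation on the held-out set
print("evaluation accuracy:", accuracy_score(y_eval, model.predict(X_eval)))
```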

Illuminating the black box of an Artificial Neural Network (ANN)


It refers to gaining insight into its internal workings and understanding how it arrives at its
predictions or decisions. One method to achieve this is through sensitivity analysis.

Sensitivity analysis involves examining how changes in input variables affect the output of the
neural network. By systematically varying the inputs and observing the corresponding changes in
output, analysts can infer which inputs have the most significant impact on the model's
predictions. This helps in understanding the model's behavior and identifying which features it
relies on most heavily for decision-making.
In the context of illuminating the black box of an ANN, sensitivity analysis can provide valuable
insights into the features or variables that drive the network's decisions, helping users understand
its inner workings and potentially uncovering areas for improvement or refinement.
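A minimal sketch of one simple form of sensitivity analysis, perturbing each input feature and measuring the change in the network's predicted probability (the model and data here are hypothetical placeholders):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=500, n_features=5, random_state=0)
model = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0).fit(X, y)

# Perturb one feature at a time and record the average change in predicted probability
baseline = model.predict_proba(X)[:, 1]
for j in range(X.shape[1]):
    X_perturbed = X.copy()
    X_perturbed[:, j] += X[:, j].std()          # shift feature j by one standard deviation
    shifted = model.predict_proba(X_perturbed)[:, 1]
    print(f"feature {j}: mean |change in output| = {np.abs(shifted - baseline).mean():.4f}")
```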

Support Vector Machines (SVMs)


They are a type of supervised machine learning algorithm used for classification and regression
tasks. They work by finding the optimal hyperplane that best separates different classes in the
input space.

How SVMs work:

 Basic Concept: SVMs operate by identifying the hyperplane that maximizes the margin
between classes in the feature space. The feature space is the multidimensional space
where each data point is represented by its features or attributes.

 Linear Separability: SVMs are initially designed for linearly separable data. In a binary
classification problem, SVM finds the hyperplane that separates the classes with the
widest possible margin. This hyperplane is determined by support vectors, which are the
data points closest to the decision boundary.

 Kernel Trick: SVMs can be extended to handle non-linearly separable data using a
technique called the kernel trick. This involves mapping the input space into a higher-
dimensional feature space where the data might be linearly separable. Common kernels
include linear, polynomial, radial basis function (RBF), and sigmoid.

 Optimization: The goal of SVMs is to find the hyperplane that maximizes the margin
while minimizing classification errors. This is achieved by solving an optimization
problem, typically a quadratic programming problem, where the objective is to minimize
the classification error and maximize the margin.

 Regularization: SVMs also incorporate regularization parameters to control the trade-off
between maximizing the margin and minimizing classification errors. This helps prevent
overfitting and ensures the model generalizes well to unseen data.

 Classification and Regression: While SVMs are primarily used for classification tasks,
they can also be adapted for regression tasks (Support Vector Regression). In regression,
SVMs aim to fit a hyperplane that predicts continuous output values with minimal error.

 Overall, SVMs are powerful and versatile machine learning algorithms that are widely
used in various applications such as text classification, image recognition, bioinformatics,
and financial forecasting. They are particularly effective when dealing with high-
dimensional data and datasets with a clear margin of separation between classes.
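As a simple illustration (the synthetic data and parameter values are assumptions, not details from the text), training an SVM with an RBF kernel in scikit-learn might look like this:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# RBF kernel handles non-linearly separable data; C is the regularization parameter
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```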

A process based approach to the use of SVM

 A process-based approach to the use of Support Vector Machines (SVMs) involves a
systematic method of applying SVMs to solve specific tasks or problems. Here's an
overview of the steps involved in a process-based approach to using SVMs:

 Problem Definition: Clearly define the problem you want to solve using SVMs. This
could be a classification or regression task. Identify the nature of the data (e.g., structured
or unstructured) and the desired outcome.

 Data Collection and Preprocessing: Gather the data relevant to your problem. This may
involve collecting data from various sources, cleaning, and preprocessing it to ensure it's
suitable for use with SVMs. Preprocessing steps may include feature scaling,
normalization, handling missing values, and encoding categorical variables.

 Feature Selection and Engineering: Identify relevant features that may help the SVM
model make accurate predictions. This may involve feature selection techniques to
choose the most informative features or feature engineering to create new features based
on domain knowledge.

 Model Selection: Choose an appropriate SVM variant and kernel function based on the
problem at hand and the characteristics of the data. Consider factors such as linearity,
separability, and the presence of noise in the data. Common choices include linear SVMs,
polynomial SVMs.

 Model Training: Train the SVM model on a labeled dataset using the chosen kernel
function and parameters. During training, the SVM algorithm learns to find the optimal
hyperplane that separates the different classes or predicts the target variable.

 Model Evaluation: Evaluate the performance of the trained SVM model using appropriate
evaluation metrics such as accuracy, precision, recall, F1-score, or mean squared error
(for regression). Use techniques like cross-validation to ensure the model's
generalizability and robustness.

 Hyperparameter Tuning: Fine-tune the hyperparameters of the SVM model to improve its
performance further. This may involve grid search, random search, or more advanced
optimization techniques to find the optimal combination of hyperparameters.

 Deployment and Monitoring: Once satisfied with the performance of the SVM model,
deploy it into production and monitor its performance over time. Continuously collect
feedback data and retrain or update the model as needed to adapt to changing conditions
or new patterns in the data.

 By following a process-based approach, users can systematically apply SVMs to solve
real-world problems effectively while ensuring robustness, generalizability, and
scalability of the solution.
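A hedged sketch of the model selection, evaluation, and hyperparameter tuning steps combined (the data and parameter grid are placeholders chosen for illustration):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=8, random_state=0)

# Grid search over kernel type and regularization strength, with 5-fold cross-validation
param_grid = {"kernel": ["linear", "rbf"], "C": [0.1, 1, 10], "gamma": ["scale", 0.1]}
search = GridSearchCV(SVC(), param_grid, cv=5, scoring="accuracy")
search.fit(X, y)

print("best parameters:", search.best_params_)
print("cross-validated accuracy:", round(search.best_score_, 3))
```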

Nearest neighbor method for sentiment analysis


The Nearest Neighbor (NN) method for sentiment analysis is a straightforward approach that
relies on the similarity of input data points to classify sentiment.

 Basic Concept: The nearest neighbor method operates on the principle that similar
instances tend to have similar labels. In sentiment analysis, this means that
texts/documents with similar content tend to have similar sentiments.

 Training Phase: During the training phase, the method doesn't actually build a model in
the traditional sense. Instead, it memorizes the labeled instances in the training
dataset. Each instance consists of a text/document and its corresponding sentiment label
(e.g., positive, negative, neutral).

 Prediction Phase: When given a new, unlabeled text/document to classify sentiment, the
nearest neighbor method identifies the labeled instances (neighbors) from the training
dataset that are most similar to the input text/document.

 Similarity Metric: The similarity between the input text/document and each labeled
instance is typically computed using a distance or similarity metric, such as cosine
similarity, Euclidean distance, or Jaccard similarity, depending on the nature of the data
and the text representation used.

 Voting Scheme: Once the nearest neighbors are identified, the sentiment label of the
input text/document is determined using a voting scheme. For example, a simple
approach is to assign the sentiment label that occurs most frequently among the nearest
neighbors.

 Parameter Tuning: The performance of the nearest neighbor method can be influenced by
various factors, including the choice of similarity metric and the number of nearest
neighbors considered. These parameters may need to be tuned to optimize performance
on a given dataset.

 Limitations: While simple and intuitive, the nearest neighbor method for sentiment
analysis has some limitations. It can be computationally expensive, especially when
dealing with large datasets. Additionally, it may not perform well when the feature space
is high-dimensional or when there is noise or irrelevant features in the data.

 Scalability: Techniques such as approximate nearest neighbor search or dimensionality
reduction can be employed to improve the scalability of the nearest neighbor method for
sentiment analysis, enabling it to handle larger datasets more efficiently.

 In summary, the nearest neighbor method for sentiment analysis offers a simple yet
effective approach for classifying sentiment based on the similarity of input
texts/documents to labeled instances in a training dataset.
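A minimal, hypothetical sketch of this method using TF-IDF features and a k-nearest-neighbors classifier with cosine similarity (the example texts and labels are invented for illustration):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# Tiny made-up labeled training set, which the method memorizes rather than models
texts = ["I love this product", "Terrible service, very disappointed",
         "Absolutely fantastic experience", "Worst purchase I have made",
         "It was okay, nothing special"]
labels = ["positive", "negative", "positive", "negative", "neutral"]

# Cosine distance on TF-IDF vectors; the 3 nearest neighbors vote on the label
model = make_pipeline(TfidfVectorizer(),
                      KNeighborsClassifier(n_neighbors=3, metric="cosine"))
model.fit(texts, labels)

print(model.predict(["I really love the service"]))
```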

An overview of sentiment analysis


Sentiment analysis, also known as opinion mining, is a natural language processing (NLP)
technique used to identify, extract, quantify, and analyze subjective information from text data.

The goal of sentiment analysis is to determine the sentiment expressed in a piece of text, whether
it's positive, negative, or neutral.

Applications of Sentiment Analysis:

 Social Media Monitoring: Analyzing sentiment in social media posts, comments, and
reviews to understand public opinion about products, services, events, or brands.
 Customer Feedback Analysis: Assessing sentiment in customer reviews, surveys, and
feedback to identify areas for improvement and measure customer satisfaction.
 Market Research: Analyzing sentiment in market reports, news articles, and financial
data to gauge market sentiment and make informed investment decisions.
 Brand Monitoring and Reputation Management: Monitoring sentiment around a brand or
organization to manage reputation, address customer concerns, and improve brand
perception.
 Product Analysis and Recommendation Systems: Analyzing sentiment in product reviews
and user feedback to improve product features, recommend products, and personalize
user experiences.
 Political Analysis: Analyzing sentiment in political speeches, news articles, and social
media discussions to understand public opinion, election outcomes, and political trends.

Process of Sentiment Analysis:

 Data Collection: Gather text data from various sources such as social media, customer
reviews, surveys, news articles, or any other relevant sources.

 Preprocessing: Clean and preprocess the text data by removing noise, irrelevant
information, special characters, punctuation, and stopwords. Perform tokenization,
stemming, and lemmatization to normalize the text.

 Feature Extraction: Represent the text data as numerical features that can be used by
machine learning algorithms. Common techniques include bag-of-words, TF-IDF (Term
Frequency-Inverse Document Frequency), word embeddings (e.g., Word2Vec, GloVe),
or character n-grams.

 Sentiment Classification: Apply a sentiment classification algorithm to classify the
sentiment of each text document. This can be done using machine learning techniques
such as Naive Bayes, Support Vector Machines (SVM), Logistic Regression, Decision
Trees, Random Forests, or neural network-based approaches such as LSTM (Long Short-
Term Memory) or CNN (Convolutional Neural Network).

 Evaluation: Evaluate the performance of the sentiment analysis model using appropriate
evaluation metrics such as accuracy, precision, recall, F1-score, or confusion matrix. Use
techniques like cross-validation to ensure the model's generalizability and robustness.
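To tie the feature extraction, classification, and evaluation steps together, here is a hedged end-to-end sketch (the mini-corpus is invented; a real project would use a properly labeled dataset):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Invented mini-corpus standing in for collected, labeled text data
texts = ["great product, works perfectly", "awful quality, broke in a day",
         "very happy with my order", "not worth the money at all",
         "fast shipping and friendly support", "completely useless, avoid"]
labels = ["positive", "negative", "positive", "negative", "positive", "negative"]

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.33, stratify=labels, random_state=0)

# TF-IDF feature extraction followed by a logistic regression sentiment classifier
model = make_pipeline(TfidfVectorizer(stop_words="english"), LogisticRegression())
model.fit(X_train, y_train)

print(classification_report(y_test, model.predict(X_test)))
```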

Analysis of Sentiment Analysis Results:


 Overall Sentiment Distribution: Analyze the distribution of sentiments (positive,
negative, neutral) in the dataset to understand the overall sentiment trends.

 Sentiment Trends Over Time: Analyze how sentiments change over time to identify
patterns, trends, or events that may influence sentiment.

 Key Sentiment Drivers: Identify the key topics, themes, or features mentioned in positive
and negative sentiments to understand what aspects are driving sentiment.

 Comparison Across Segments: Compare sentiment across different segments (e.g.,
products, brands, demographics) to identify variations and disparities in sentiment.

 Sentiment Impact Analysis: Analyze the impact of sentiment on business outcomes,
customer behavior, stock prices, or other relevant metrics to derive actionable insights.
 In conclusion, sentiment analysis is a powerful NLP technique with a wide range of
applications across various industries. By following a systematic process and analyzing
the results, organizations can gain valuable insights into public opinion, customer
sentiment, market trends, and other factors influencing decision-making and business
outcomes.

Speech analytics
Speech analytics is the process of analyzing spoken language to extract valuable insights,
patterns, and information. It involves the use of various technologies and techniques to
analyze recorded speech data, typically in the form of audio recordings or transcribed text.

Process of Speech Analytics:

 Data Collection: Gather speech data from various sources such as call recordings,
voicemails, interviews, focus groups, or speech-to-text transcripts.

 Speech Recognition: Convert spoken language into text using automatic speech
recognition (ASR) technology. This step is essential for processing and analyzing the
speech data.

 Transcription and Text Processing: Clean and preprocess the transcribed text data by
removing noise, filler words, and irrelevant information. Perform text normalization,
tokenization, and part-of-speech tagging as needed.

 Feature Extraction: Extract relevant features from the text data, such as sentiment,
emotion, keywords, topics, speaker characteristics, speech rate, or intonation patterns.

 Analysis and Modeling: Apply analytical techniques and machine learning algorithms to
analyze the extracted features and derive actionable insights from the speech data. This
may involve sentiment analysis, topic modeling, clustering, classification, or other NLP
tasks.

 Visualization and Reporting: Visualize the results of the analysis using charts, graphs,
dashboards, or reports to communicate key findings and insights effectively to
stakeholders.
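As a rough, hypothetical sketch of the transcription-processing and feature-extraction steps (the transcript, filler-word list, keywords, and duration are all invented for illustration; a real pipeline would start from ASR output):

```python
import re

# Invented transcript standing in for ASR output of a recorded call
transcript = "um hi yes I am uh calling about my bill it is too high and I want a refund please"
duration_seconds = 20.0          # assumed length of the recording

# Text processing: lowercase, tokenize, drop filler words
fillers = {"um", "uh", "like", "you know"}
tokens = re.findall(r"[a-z']+", transcript.lower())
clean_tokens = [t for t in tokens if t not in fillers]

# Simple feature extraction: speech rate and keyword hits
speech_rate = len(tokens) / (duration_seconds / 60.0)     # words per minute
keywords = {"bill", "refund", "cancel", "complaint"}
keyword_hits = sorted(set(clean_tokens) & keywords)

print("words per minute:", round(speech_rate, 1))
print("keywords mentioned:", keyword_hits)
print("filler-word count:", len(tokens) - len(clean_tokens))
```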

Challenges and Considerations:
 Speech Recognition Accuracy: The accuracy of speech recognition systems can vary
depending on factors such as accent, background noise, speaker variability, and domain-
specific vocabulary.

 Data Privacy and Security: Speech data may contain sensitive information, so it's
essential to ensure compliance with data privacy regulations and implement robust
security measures to protect confidentiality.

 Scalability and Efficiency: Analyzing large volumes of speech data can be
computationally intensive and time-consuming, requiring scalable and efficient
processing and storage solutions.

 Interpretability and Bias: Interpreting speech analytics results can be challenging,
especially when dealing with complex models or ambiguous language. It's essential to
consider potential biases in the data and algorithms used for analysis.

 In summary, speech analytics is a valuable tool for extracting insights from spoken
language data across various domains and applications. By leveraging advanced
technologies and analytical techniques, organizations can gain valuable insights into
customer behavior, employee sentiments, market trends, and business performance,
leading to improved decision-making and outcomes.

Applications of Speech Analytics:

 Customer Service Optimization: Analyzing customer interactions with call center agents
to identify areas for improvement, measure customer satisfaction, and enhance the quality
of service.

 Market Research: Analyzing recorded interviews, focus groups, or survey responses to
understand consumer opinions, preferences, and trends.

 Sales Performance Analysis: Analyzing sales calls to identify successful selling
techniques, assess sales representative performance, and optimize sales strategies.

 Compliance Monitoring: Monitoring calls for regulatory compliance, adherence to
company policies, and legal requirements.

 Quality Assurance: Assessing the quality and effectiveness of training programs, scripts,
and call handling procedures by evaluating interactions between agents and customers.

 Fraud Detection: Identifying suspicious or fraudulent activities by analyzing speech
patterns, keywords, and anomalies in recorded conversations.

 Voice of the Employee: Analyzing employee feedback, interviews, or performance
reviews to understand employee sentiments, engagement levels, and organizational
culture.
