Unit-1 Notes
The data mining tasks can be classified generally into two types based on what a specific task
tries to achieve. Those two categories are descriptive tasks and predictive tasks. The descriptive
data mining tasks characterize the general properties of data whereas predictive data mining
tasks perform inference on the available data set to predict how a new data set will behave.
Performance Issues
There can be performance-related issues, such as the following −
Data mining is a form of artificial intelligence that uses perception models, analytical models, and multiple algorithms to simulate the techniques of the human brain. Data mining supports machines in taking human-like decisions and making human-like choices.
The user of data mining tools must supply the machine with rules, preferences, and even experiences in order to obtain decision support. The data mining metrics are as follows −
Usefulness − Usefulness involves several metrics that tell us whether the model provides useful information. For instance, a data mining model that correlates store location with sales can be both accurate and reliable, yet still not useful, because one cannot generalize that result by opening more stores at the same location.
Furthermore, it does not answer the fundamental business question of why specific locations have more sales. A model that appears successful may also turn out to be meaningless because it depends on cross-correlations in the data.
Return on Investment (ROI) − Data mining tools will find interesting patterns buried inside the data and develop predictive models. These models come with several measures indicating how well they fit the data. However, it is not always clear how to make a decision based on some of the measures reported as part of data mining analyses.
Access to Financial Information during Data Mining − The simplest way to frame decisions in financial terms is to augment the raw information that is typically mined so that it also contains financial data. Many organizations are investing in and developing data warehouses and data marts.
The design of a warehouse or mart involves considerations about the types of analyses and data needed for expected queries. Designing warehouses in a way that allows access to financial information alongside more typical data on product attributes, user profiles, etc. can therefore be useful.
Converting Data Mining Metrics into Financial Terms − A common data mining metric is the measure of "lift". Lift measures what is gained by using a specific model or pattern relative to a base rate in which the model is not used. High values mean that much is gained. It can therefore seem that one can simply base a decision on lift, as illustrated below.
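To make the lift idea concrete, here is a minimal sketch in Python. The transaction counts below are hypothetical, invented only for illustration; lift is computed as the confidence of the pattern divided by the base rate of the outcome.

```python
# Hypothetical transaction counts (illustrative only)
n_total = 10_000          # all transactions
n_promo = 2_000           # transactions where a promotion was applied
n_buy = 500               # transactions that ended in a purchase
n_promo_and_buy = 200     # purchases among the promoted transactions

base_rate = n_buy / n_total             # P(buy) without using the model/pattern
confidence = n_promo_and_buy / n_promo  # P(buy | promo), i.e. using the pattern
lift = confidence / base_rate           # > 1 means the pattern beats the base rate

print(f"base rate = {base_rate:.3f}, confidence = {confidence:.3f}, lift = {lift:.1f}")
```

Here lift is 2.0, meaning purchases are twice as likely under the pattern as under the base rate; translating that difference into revenue is what converts the metric into financial terms.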
Accuracy − Accuracy is a measure of how well the model correlates an outcome with the attributes in the data that has been provided. There are several measures of accuracy, but all of them depend on the data that is used. In reality, values can be missing or approximate, or the data may have been changed by several processes.
Data mining is the process of finding useful new correlations, patterns, and trends by sifting through large amounts of data stored in repositories, using pattern recognition technologies as well as statistical and mathematical techniques. It is the analysis of observational datasets to discover unsuspected relationships and to summarize the records in novel ways that are both understandable and useful to the data owner.
Data mining systems are designed to facilitate the identification and classification of individuals into distinct groups or segments. From the perspective of a commercial firm, and possibly for the industry as a whole, one can interpret the use of data mining as a discriminatory technology in the rational pursuit of profit.
There are various social implications of data mining, which are as follows −
Privacy − Privacy is a loaded issue. In recent years privacy concerns have taken on a more important role in American society as merchants, insurance companies, and government agencies amass warehouses of personal records.
The concerns that people have over the collection of this data will naturally extend to the analytic capabilities applied to that data. Users of data mining should start thinking about how their use of this technology will be affected by legal issues related to privacy.
Profiling − Data mining and profiling is a developing field that attempts to organize, understand, analyze, reason about, and use the explosion of data in this information age. The process involves using algorithms and experience to extract patterns or anomalies that are very complex, difficult, or time-consuming to identify.
The founder of Microsoft's Exploration Team used complex data mining algorithms to solve a problem that had haunted astronomers for years: reviewing, describing, and categorizing two billion sky objects recorded over three decades. The algorithms were able to extract the features that characterized sky objects as stars or galaxies. This developing field of data mining and profiling has many frontiers where it can be applied.
Unauthorized Use − Trends obtained through data mining, though intended for marketing or other ethical purposes, can be misused. Unethical businesses or individuals can use the data obtained through data mining to take advantage of vulnerable people or discriminate against a specific group. Furthermore, data mining techniques are not 100 percent accurate; thus mistakes do occur, which can have serious consequences.
1. Association Analysis
Association analysis is the discovery of association rules showing attribute-value conditions that occur frequently together in a given set of data. Association analysis is widely used for market basket or transaction data analysis. Association rule mining is a significant and exceptionally active area of data mining research. One method of association-based classification, called associative classification, consists of two steps. In the first step, association rules are generated using a modified version of the standard association rule mining algorithm known as Apriori. The second step constructs a classifier based on the association rules discovered.
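The following is a minimal, non-optimized sketch of the support/confidence computation behind association rule mining; the transactions and thresholds are hypothetical, and a real Apriori implementation prunes candidate itemsets level by level rather than enumerating them all.

```python
from collections import defaultdict
from itertools import combinations

# Toy market-basket transactions (hypothetical data)
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
min_support, min_conf = 0.4, 0.6
n = len(transactions)

# Count support for all 1- and 2-itemsets (a simplified Apriori-style pass)
counts = defaultdict(int)
for t in transactions:
    for size in (1, 2):
        for itemset in combinations(sorted(t), size):
            counts[frozenset(itemset)] += 1

frequent = {s: c / n for s, c in counts.items() if c / n >= min_support}

# Generate rules {a} -> {b} from frequent 2-itemsets
for itemset, supp in frequent.items():
    if len(itemset) == 2:
        for a in itemset:
            b = itemset - {a}
            conf = supp / frequent[frozenset({a})]
            if conf >= min_conf:
                print(f"{{{a}}} -> {set(b)}  support={supp:.2f}  confidence={conf:.2f}")
```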
2. Classification
Classification is the process of finding a set of models (or functions) that describe and distinguish data classes or concepts, for the purpose of being able to use the model to predict the class of objects whose class label is unknown. The derived model is based on the analysis of a set of training data (i.e., data objects whose class label is known). The derived model may be represented in various forms, such as classification (if-then) rules, decision trees, and neural networks. Data mining uses several types of classifiers (a small example follows this list):
Decision Tree
SVM(Support Vector Machine)
Bayesian Classification
Classification by Backpropagation
K-NN Classifier
Rule-Based Classification
Fuzzy Logic
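As an illustrative sketch of one of the classifiers listed above, here is a k-NN classifier trained on scikit-learn's built-in Iris dataset, assuming scikit-learn is available; the choice of k = 5 and the 70/30 split are arbitrary.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Classify each test sample by the majority vote of its 5 nearest training neighbors
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
print("test accuracy:", knn.score(X_test, y_test))
```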
3. Prediction
Data prediction is a two-step process, similar to that of data classification. However, for prediction we do not use the term "class label attribute", because the attribute whose values are being predicted is continuous-valued (ordered) rather than categorical (discrete-valued and unordered). The attribute can be referred to simply as the predicted attribute. Prediction can be viewed as the construction and use of a model to assess the class of an unlabeled object, or to assess the value or value ranges of an attribute that a given object is likely to have.
4. Clustering
Unlike classification and prediction, which analyze class-labeled data objects or attributes, clustering analyzes data objects without consulting a known class label. In general, the class labels do not exist in the training data simply because they are not known to begin with. Clustering can be used to generate these labels. The objects are clustered based on the principle of maximizing the intra-class similarity and minimizing the inter-class similarity. That is, clusters of objects are created so that objects within a cluster have high similarity to each other but are very dissimilar to objects in other clusters. Each cluster that is generated can be seen as a class of objects, from which rules can be inferred. Clustering can also facilitate taxonomy formation, that is, the organization of observations into a hierarchy of classes that group similar events together.
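A minimal clustering sketch, assuming scikit-learn is available; the six 2-D points below are made up so that two groups are easy to see.

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy 2-D points forming two loose groups (hypothetical data)
X = np.array([[1, 2], [1, 4], [2, 3], [8, 8], [9, 10], [8, 9]])

# Partition the points into k = 2 clusters using Euclidean distance to the centroids
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("labels:", kmeans.labels_)              # cluster assignment per point
print("centroids:", kmeans.cluster_centers_)  # centre of each cluster
```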
5. Regression
Regression can be defined as a statistical modeling method in which previously obtained data is used to predict a continuous quantity for new observations. This classifier is also known as the Continuous Value Classifier. There are two types of regression models: simple linear regression and multiple linear regression.
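A minimal regression sketch using NumPy's least-squares polynomial fit; the x/y values are invented to roughly follow a straight line.

```python
import numpy as np

# Hypothetical data: advertising spend (x) vs. sales (y)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 6.2, 7.9, 10.1])

# Least-squares fit of y = slope * x + intercept
slope, intercept = np.polyfit(x, y, deg=1)
print(f"y is approximately {slope:.2f} * x + {intercept:.2f}")
print("prediction for x = 6:", slope * 6 + intercept)
```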
6. Neural Network
An artificial neural network (ANN), also referred to simply as a "neural network" (NN), is a computational model based on biological neural networks. It consists of an interconnected collection of artificial neurons. A neural network is a set of connected input/output units where each connection has a weight associated with it. During the learning phase, the network learns by adjusting the weights so as to be able to predict the correct class label of the input samples. Neural network learning is also referred to as connectionist learning due to the connections between units. Neural networks involve long training times and are therefore more appropriate for applications where this is feasible. They require a number of parameters that are typically best determined empirically, such as the network topology or "structure". Neural networks have been criticized for their poor interpretability, since it is difficult for humans to interpret the symbolic meaning behind the learned weights. These features initially made neural networks less desirable for data mining.
The advantages of neural networks, however, include their high tolerance to noisy data as well as their ability to classify patterns on which they have not been trained. In addition, several algorithms have recently been developed for the extraction of rules from trained neural networks. These factors contribute to the usefulness of neural networks for classification in data mining.
An artificial neural network is an adaptive system that changes its structure based on information that flows through the network during a learning phase. The ANN relies on the principle of learning by example. There are two classical types of neural networks, the perceptron and the multilayer perceptron; a minimal example follows.
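As a small illustration of the difference between a single perceptron and a multilayer perceptron, the sketch below (assuming scikit-learn) trains an MLP on the XOR problem, which a single perceptron cannot solve because it is not linearly separable.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# XOR: not linearly separable, so a single perceptron fails, but an MLP can learn it
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])

mlp = MLPClassifier(hidden_layer_sizes=(8,), activation="tanh",
                    solver="lbfgs", max_iter=2000, random_state=1)
mlp.fit(X, y)
print(mlp.predict(X))  # ideally [0 1 1 0]; results can vary with initialization
```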
7. Outlier Detection
A database may contain data objects that do not comply with the general behavior or model of the data. These data objects are outliers. The analysis of outlier data is known as outlier mining. An outlier may be detected using statistical tests that assume a distribution or probability model for the data, or using distance measures where objects having only a small fraction of "close" neighbors in space are considered outliers. Rather than using statistical or distance measures, deviation-based techniques identify exceptions/outliers by inspecting differences in the principal characteristics of objects in a group.
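A minimal statistical-test sketch for outlier mining: flag any value more than two standard deviations from the mean. The measurements and the 2-sigma threshold are arbitrary choices for illustration.

```python
import numpy as np

# Hypothetical 1-D measurements with one obvious outlier
data = np.array([10.1, 9.8, 10.3, 9.9, 10.0, 10.2, 25.0])

# Simple statistical test: flag points more than 2 standard deviations from the mean
z_scores = (data - data.mean()) / data.std()
outliers = data[np.abs(z_scores) > 2]
print("outliers:", outliers)
```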
8. Genetic Algorithm
Genetic algorithms are adaptive heuristic search algorithms that belong to the larger class of evolutionary algorithms. Genetic algorithms are based on the ideas of natural selection and genetics. They are an intelligent exploitation of random search, provided with historical data, to direct the search into the region of better performance in solution space. They are commonly used to generate high-quality solutions for optimization and search problems. Genetic algorithms simulate the process of natural selection, which means that those species that can adapt to changes in their environment are able to survive, reproduce, and go on to the next generation. In simple words, they simulate "survival of the fittest" among individuals of consecutive generations for solving a problem. Each generation consists of a population of individuals, and each individual represents a point in the search space and a possible solution. Each individual is represented as a string of characters/integers/floats/bits. This string is analogous to a chromosome.
Advantages and Disadvantages of Data Mining
The following are some of the main advantages of data mining:
Improved Marketing:
Increased Efficiency:
Fraud Detection:
Customer Retention:
Competitive Advantage:
Improved Healthcare:
While data mining offers many benefits, there are also some disadvantages and challenges associated with
the process. The following are some of the main disadvantages of data mining:
Data Quality:
Ethical Considerations:
Technical Complexity:
Cost:
Data mining refers to extracting or mining knowledge from large amounts of data. In other words, data mining is the science, art, and technology of exploring large and complex bodies of data in order to discover useful patterns. Theoreticians and practitioners are continually seeking improved techniques to make the process more efficient, cost-effective, and accurate. Any situation can be analyzed in two ways in data mining:
Statistical Analysis: In statistics, data is collected, analyzed, explored, and
presented to identify patterns and trends. Alternatively, it is referred to as
quantitative analysis.
Non-statistical Analysis: This analysis provides generalized information and
includes sound, still images, and moving images.
In statistics, there are two main categories:
Descriptive Statistics: The purpose of descriptive statistics is to organize data and identify the main characteristics of that data. Graphs or numbers summarize the data. Average, Mode, SD (Standard Deviation), and Correlation are some of the commonly used descriptive statistical measures (see the short example after this list).
Inferential Statistics: The process of drawing conclusions based on
probability theory and generalizing the data. By analyzing sample statistics, you
can infer parameters about populations and make models of relationships within
data.
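A short sketch of the descriptive measures just listed, computed on a made-up set of exam scores; the numbers are purely illustrative.

```python
import statistics
import numpy as np

# Hypothetical exam scores and study hours for a small class
scores = [72, 85, 90, 85, 60, 78, 85, 92]
hours_studied = [4, 7, 8, 6, 2, 5, 7, 9]

print("average:", statistics.mean(scores))
print("mode:", statistics.mode(scores))
print("standard deviation:", statistics.stdev(scores))
print("correlation (hours vs. score):", np.corrcoef(hours_studied, scores)[0, 1])
```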
There are various statistical terms that one should be aware of while dealing with
statistics. Some of these are:
Population
Sample
Variable
Quantitative Variable
Qualitative Variable
Discrete Variable
Continuous Variable
Now, let’s start discussing statistical methods. This is the analysis of raw data using
mathematical formulas, models, and techniques. Through the use of statistical
methods, information is extracted from research data, and different ways are
available to judge the robustness of research outputs.
As a matter of fact, today's statistical methods used in the data mining field are typically derived from the vast statistical toolkit developed to answer problems arising in other fields. These techniques are taught in science curricula. It is necessary to check and test several hypotheses. Such hypotheses help us assess the validity of our data mining endeavor when attempting to draw any inferences from the data under study. When more complex and sophisticated statistical estimators and tests are used, these issues become more pronounced.
For extracting knowledge from databases containing different types of observations,
a variety of statistical methods are available in Data Mining and some of these are:
Logistic regression analysis
Correlation analysis
Regression analysis
Discriminant analysis
Linear discriminant analysis (LDA)
Classification
Clustering
Outlier detection
Classification and regression trees
Correspondence analysis
Nonparametric regression
Statistical pattern recognition
Categorical data analysis
Time-series methods for trends and periodicity
Artificial neural networks
Now, let’s try to understand some of the important statistical methods which are used
in data mining:
Linear Regression: The linear regression method uses the best linear relationship between the independent and dependent variables to predict the target variable. To achieve the best fit, the distances between the fitted line and the actual observations at each point should be as small as possible. A good fit is one for which no other choice of line would produce a smaller total error. Simple linear regression and multiple linear regression are the two major types of linear regression. Simple linear regression predicts the dependent variable by fitting a linear relationship to a single independent variable. Multiple linear regression fits the best linear relationship with the dependent variable using multiple independent variables. For more details, you can refer to linear regression.
Classification: This is a data mining method in which a collection of data is categorized so that predictions can be made and analyzed with a greater degree of accuracy. An effective way to analyze very large datasets is to classify them. Classification is one of several methods aimed at improving the efficiency of the analysis process. Logistic regression and discriminant analysis stand out as two major classification techniques.
o Logistic Regression: Logistic regression can also be applied to machine learning applications and predictive analytics. In this approach, the dependent variable is either binary (binary logistic regression) or multinomial (multinomial logistic regression), i.e., one of two outcomes or one of several categories. With a logistic regression equation, one can estimate probabilities describing the relationship between the independent variables and the dependent variable. For understanding logistic regression analysis in detail, you can refer to logistic regression; a minimal code sketch also appears after this list.
o Discriminant Analysis: A Discriminant Analysis is a statistical method of
analyzing data based on the measurements of categories or clusters and
categorizing new observations into one or more populations that were
identified a priori. The discriminant analysis models each response class
independently then uses Bayes’s theorem to flip these projections around
to estimate the likelihood of each response category given the value of X.
These models can be either linear or quadratic.
o Linear Discriminant Analysis: In Linear Discriminant Analysis, each observation is assigned a discriminant score that classifies it into a response variable class. These scores are obtained by combining the independent variables in a linear fashion. Under this model, observations are drawn from a Gaussian distribution, and the predictor variables share a common covariance across all k levels of the response variable Y. For further details, refer to linear discriminant analysis.
o Quadratic Discriminant Analysis: An alternative approach is
provided by Quadratic Discriminant Analysis. LDA and QDA
both assume Gaussian distributions for the observations of the
Y classes. Unlike LDA, QDA considers each class to have its
own covariance matrix. As a result, the predictor variables have
different variances across the k levels in Y.
o Correlation Analysis: In statistical terms, correlation analysis captures
the relationship between variables in a pair. The value of such variables
is usually stored in a column or rows of a database table and represents
a property of an object.
o Regression Analysis: Based on a set of numeric data, regression is a
data mining method that predicts a range of numerical values (also
known as continuous values). You could, for instance, use regression to
predict the cost of goods and services based on other variables. A
regression model is used across numerous industries for forecasting
financial data, modeling environmental conditions, and analyzing trends.
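As referenced in the logistic regression item above, here is a minimal sketch comparing logistic regression and linear discriminant analysis as classifiers, assuming scikit-learn and using its built-in breast cancer dataset; the train/test split and scaling choices are arbitrary.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Logistic regression models P(y=1 | x) directly; LDA models each class and applies Bayes' theorem
logit = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)).fit(X_train, y_train)
lda = LinearDiscriminantAnalysis().fit(X_train, y_train)

print("logistic regression accuracy:", round(logit.score(X_test, y_test), 3))
print("LDA accuracy:", round(lda.score(X_test, y_test), 3))
```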
1. Euclidean Distance
Description: Euclidean distance measures the straight-line distance between two points in space. It is one of the most common distance metrics for numeric data.
Applications: It is commonly used in clustering algorithms such as k-means, and in nearest neighbor searches.
2. Cosine Similarity
Description: Cosine similarity measures the cosine of the angle between two vectors. It is particularly useful in high-dimensional spaces, such as text mining, where it measures orientation rather than magnitude, making it scale-invariant.
Applications: Widely used in text mining and information retrieval, for example for document similarity in search engines.
3. Jaccard Similarity
Description: Jaccard similarity measures the similarity between two finite sets by dividing the size of their intersection by the size of their union. It is useful for comparing categorical data.
Applications: Commonly used in clustering and classification tasks involving categorical data, such as market basket analysis.
4. Pearson Correlation
Description: The Pearson correlation coefficient measures the strength and direction of the linear relationship between two numeric variables, ranging from -1 to 1.
Applications: Used in statistical analysis and machine learning to discover and quantify linear relationships between features.
5. Hamming Distance
Description: Hamming distance measures the number of positions at which the corresponding elements of two strings differ. It is especially useful for binary or categorical data.
Applications: Used in error detection and correction algorithms, as well as in comparing binary sequences or categorical variables.
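A small sketch computing the measures above on made-up inputs, using plain Python and NumPy.

```python
import numpy as np

# Numeric vectors (illustrative values)
a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 2.0, 5.0])

euclidean = np.linalg.norm(a - b)                            # straight-line distance
cosine = a.dot(b) / (np.linalg.norm(a) * np.linalg.norm(b))  # angle-based similarity

# Sets for Jaccard similarity (illustrative market-basket items)
set_a, set_b = {"milk", "bread", "beer"}, {"milk", "beer", "diapers"}
jaccard = len(set_a & set_b) / len(set_a | set_b)            # |intersection| / |union|

# Binary strings for Hamming distance
s1, s2 = "10110", "10011"
hamming = sum(c1 != c2 for c1, c2 in zip(s1, s2))            # number of differing positions

print(f"euclidean={euclidean:.3f}, cosine={cosine:.3f}, jaccard={jaccard:.2f}, hamming={hamming}")
```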
1. Clustering
Clustering involves grouping a set of objects such that objects in the same group (or cluster) are more similar to each other than to those in other groups. Similarity measures play a crucial role in defining these groups.
K-Means Clustering: Uses Euclidean distance to partition data into k clusters. Each data point is assigned to the cluster with the nearest centroid.
Hierarchical Clustering: Uses various distance metrics (e.g., Euclidean, Manhattan) to construct a hierarchy of clusters, often visualized as a dendrogram.
Text Clustering: Uses cosine similarity to group documents with similar content. This is especially useful for organizing large text corpora.
2. Classification
Classification assigns a label to a new data point based on the characteristics of known, labeled data points. Similarity measures help determine the label by comparing the new point to existing points.
K-Nearest Neighbors (k-NN): Classifies a data point based on the majority label among its k nearest neighbors, often using Euclidean distance or cosine similarity.
Document Classification: Uses similarity measures like cosine similarity to categorize text documents into predefined classes.
3. Information Retrieval
Information retrieval systems, such as search engines, rely on similarity measures to rank documents based on their relevance to a query.
Search Engines: Use cosine similarity to compare the query vector with document vectors, ranking documents by their similarity to the query.
Content-Based Filtering: In recommendation systems, similarity measures (e.g., cosine similarity, Jaccard similarity) are used to recommend items that are similar to those a user has previously liked.
4. Recommendation Systems
Recommendation systems suggest items to users based on their preferences and behavior, often using similarity measures to find items or users that are alike.
Collaborative Filtering: Uses similarity measures like Pearson correlation or cosine similarity to find users with similar preferences and recommend items they have liked.
Content-Based Filtering: Recommends items similar to those the user has shown interest in, using measures like cosine similarity to compare item features.
5. Anomaly Detection
Anomaly detection identifies outliers or unusual data points that differ significantly from the majority of the data.
Mahalanobis Distance: Takes the correlations of the dataset into account to detect multivariate outliers.
Euclidean Distance: Can be used in simpler contexts to find data points that are far from the mean or median of the dataset.
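A minimal anomaly-detection sketch contrasting Mahalanobis and Euclidean distance on synthetic, correlated 2-D data; the covariance and the "suspect" point are invented for illustration.

```python
import numpy as np

# Synthetic, correlated 2-D data plus one suspicious point (all values invented)
rng = np.random.default_rng(0)
X = rng.multivariate_normal(mean=[0, 0], cov=[[1, 0.8], [0.8, 1]], size=200)
suspect = np.array([2.5, -2.5])  # unusual given the positive correlation

mu = X.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(X, rowvar=False))

def mahalanobis(x):
    d = x - mu
    return float(np.sqrt(d @ cov_inv @ d))

# The Mahalanobis distance flags the point strongly; plain Euclidean distance does not
print("Mahalanobis distance:", round(mahalanobis(suspect), 2))
print("Euclidean distance:  ", round(float(np.linalg.norm(suspect - mu)), 2))
```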
6. Natural Language Processing (NLP)
In NLP, similarity measures are used to compare text data, assisting in tasks such as document clustering, plagiarism detection, and sentiment analysis.
Word Embeddings: Use cosine similarity to compare word vectors in models like Word2Vec or GloVe, enabling the identification of semantically similar words.
Document Similarity: Measures like cosine similarity help in clustering documents or detecting plagiarism by comparing text content.
7. Image Processing
Image processing involves analyzing and manipulating images, where similarity measures are used to compare image features.
Image Retrieval: Uses measures like Euclidean distance on feature vectors (e.g., color histograms, edge descriptors) to find similar images.
Face Recognition: Employs measures like cosine similarity on feature vectors extracted from deep learning models to identify or verify individuals.
8. Bioinformatics
In bioinformatics, similarity measures help compare biological data, such as genetic sequences or protein structures.
Sequence Alignment: Uses Hamming distance to compare DNA, RNA, or protein sequences, identifying similarities and differences that may indicate evolutionary relationships.
Protein Structure Comparison: Employs measures like RMSD (Root Mean Square Deviation) to compare the 3-D structures of proteins, aiding in the study of their functions and interactions.
Decision Tree
1. Root Node: Represents the entire dataset and the first attribute test; it has no incoming branches.
2. Internal Nodes: Represent decisions or tests on attributes. Each internal node has one or more branches.
3. Branches: Represent the outcome of a test, connecting a node to its child nodes.
4. Leaf Nodes: Represent the final decision or prediction. No further splits occur at these nodes.
1. Selecting the Best Attribute: Using a metric like Gini impurity, entropy, or information gain, the best
attribute to split the data is selected.
2. Splitting the Dataset: The dataset is split into subsets based on the selected attribute.
3. Repeating the Process: The process is repeated recursively for each subset, creating a new internal node or
leaf node until a stopping criterion is met (e.g., all instances in a node belong to the same class or a
predefined depth is reached).
Information Gain: Measures the reduction in entropy or Gini impurity after a dataset is split on an
attribute.
o Information Gain = Entropy(parent) − Σ_{i=1..n} ( |D_i| / |D| ) × Entropy(D_i), where D_i is the i-th subset of D after splitting on the attribute.
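A small sketch of the formula above, assuming a toy set of class labels and a hypothetical split into two subsets.

```python
import math
from collections import Counter

def entropy(labels):
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(parent_labels, subsets):
    total = len(parent_labels)
    weighted_child_entropy = sum(len(s) / total * entropy(s) for s in subsets)
    return entropy(parent_labels) - weighted_child_entropy

# Toy labels and a hypothetical split on some attribute
parent = ["yes", "yes", "no", "no", "no"]
split_subsets = [["yes", "yes"], ["no", "no", "no"]]
print(round(information_gain(parent, split_subsets), 3))  # ~0.971: a perfectly separating split
```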
No Need for Feature Scaling: Decision trees do not require normalization or scaling of the data.
Handles Non-linear Relationships: Capable of capturing non-linear relationships between features and
target variables.
Instability: Small variations in the data can result in a completely different tree being generated.
Bias towards Features with More Levels: Features with more levels can dominate the tree structure.
Pruning
To overcome overfitting, pruning techniques are used. Pruning reduces the size of the tree by removing
nodes that provide little power in classifying instances. There are two main types of pruning:
Pre-pruning (Early Stopping): Stops the tree from growing once it meets certain criteria (e.g., maximum
depth, minimum number of samples per leaf).
Post-pruning: Removes branches from a fully grown tree that do not provide significant power.
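A minimal decision-tree sketch, assuming scikit-learn, that combines entropy-based splitting with simple pruning controls: max_depth and min_samples_leaf act as pre-pruning, while ccp_alpha enables cost-complexity post-pruning. All parameter values are arbitrary illustrative choices.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Entropy-based splits with pre-pruning (depth/leaf limits) and post-pruning (ccp_alpha)
tree = DecisionTreeClassifier(criterion="entropy", max_depth=3,
                              min_samples_leaf=5, ccp_alpha=0.01, random_state=0)
tree.fit(X_train, y_train)
print("test accuracy:", tree.score(X_test, y_test))
```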
Neural Network
Neural networks are powerful tools in data mining because of their ability to learn complicated patterns from massive datasets. Their adaptability to various kinds of data and problem domains makes them suitable for a wide range of applications, including:
Pattern Recognition: Neural networks excel at recognizing patterns within data, making them valuable for tasks such as image and speech recognition, fraud detection, and medical diagnosis.
Classification: In classification tasks, neural networks categorize input data into predefined classes. Applications include email spam detection, sentiment analysis, and disease diagnosis.
Regression: Neural networks can perform regression tasks by predicting numerical values. This is useful in scenarios such as predicting stock prices, sales forecasts, and housing prices.
Clustering: Neural networks can be applied to clustering problems, grouping similar data points. This is useful in customer segmentation, anomaly detection, and data compression.
Feature Scaling: Neural networks benefit from feature scaling, which ensures that all input features have a similar scale. Common scaling strategies include normalization and standardization.
Handling Missing Data: Addressing missing data is important for effective neural network training. Techniques like imputation or exclusion of incomplete records help maintain the data's integrity.
Data Splitting: Datasets are generally split into training, validation, and test sets. The training set is used to train the model, the validation set helps in tuning hyperparameters, and the test set evaluates the model's performance on unseen data.
Output Layer: The output layer produces the final predictions or classifications. The number of neurons in this layer depends on the nature of the task: binary classification, multi-class classification, or regression.
Backpropagation: One of the most important algorithms for training neural networks is backpropagation. It iteratively adjusts the weights following the gradient of the error with respect to those weights. This process is critical in ensuring that the difference between the predicted and actual outputs is minimized (a minimal backpropagation sketch appears at the end of this section).
Activation Functions: Activation functions introduce nonlinearity into the neural network, enabling it to learn complex relationships. Typical activation functions are the sigmoid, the hyperbolic tangent (tanh), and the rectified linear unit (ReLU).
Regularization: Regularization techniques such as dropout and weight decay are applied during training to prevent overfitting. These techniques help the model generalize well to new, unseen data.
Hyperparameter Tuning: The selection of appropriate hyperparameters, such as the learning rate, batch size, and the number of hidden layers, drastically influences the performance of a neural network. Hyperparameter tuning often involves the use of grid search or random search methods.
Overfitting: Neural networks are prone to memorizing the training data, which leads to poor generalization to new data. Regularization techniques and appropriate validation strategies mitigate this problem.
Interpretability: Neural networks are often called "black box" models because it is difficult to explain why particular predictions were made. In areas with a requirement for transparency, this lack of interpretability becomes a problem.
Computational Resources: Training large neural networks is a computationally heavy task that requires powerful GPUs or TPUs. This is a limiting factor, especially for small-scale projects or organizations with limited resources.
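As referenced in the backpropagation item above, here is a minimal NumPy sketch of backpropagation for a tiny two-layer network on the XOR problem; the architecture, learning rate, and epoch count are arbitrary illustrative choices.

```python
import numpy as np

# Tiny two-layer network trained with backpropagation on the XOR problem
rng = np.random.default_rng(42)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.normal(size=(2, 4)), np.zeros((1, 4))
W2, b2 = rng.normal(size=(4, 1)), np.zeros((1, 1))
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
lr = 0.5  # learning rate (arbitrary)

for _ in range(5000):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # backward pass: gradients of the squared error with respect to the weights
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * (h.T @ d_out)
    b2 -= lr * d_out.sum(axis=0, keepdims=True)
    W1 -= lr * (X.T @ d_h)
    b1 -= lr * d_h.sum(axis=0, keepdims=True)

print(out.round(3).ravel())  # should approach [0, 1, 1, 0] (may vary with initialization)
```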
Genetic Algorithm
Genetic algorithms are based on the ideas of natural selection and genetics. They are an intelligent exploitation of random search, provided with historical data, to direct the search into the region of better performance in solution space. They are commonly used to generate high-quality solutions for optimization and search problems.
Genetic algorithms are based on an analogy with the genetic structure and behavior of the chromosomes of a population. The following is the foundation of GAs based on this analogy –
1. Individuals in a population compete for resources and mates.
2. Those individuals who are most successful (fittest) then mate to create more offspring than others.
3. Genes from the "fittest" parents propagate throughout the generation; that is, sometimes parents create offspring that are better than either parent.
Search space
The population of individuals is maintained within a search space. Each individual represents a solution in the search space for the given problem. Each individual is coded as a finite-length vector (analogous to a chromosome) of components. These variable components are analogous to genes. Thus a chromosome (individual) is composed of several genes (variable components).
Fitness Score
A fitness score is given to each individual which indicates the ability of that individual to "compete". Individuals having the optimal (or near-optimal) fitness score are sought.
The GA maintains a population of n individuals (chromosomes/solutions) along with their fitness scores. The individuals having better fitness scores are given more chance to reproduce than others. The individuals with better fitness scores are selected to mate and produce better offspring by combining the chromosomes of the parents. The population size is static, so room has to be created for new arrivals. Therefore, some individuals die and get replaced by new arrivals, eventually creating a new generation once all the mating opportunities of the old population are exhausted. It is hoped that over successive generations better solutions will arrive, while the least fit die out.
Each new generation has, on average, more "good genes" than the individuals (solutions) of previous generations. Thus each new generation has better "partial solutions" than previous generations. Once the offspring produced are not significantly different from the offspring produced by previous populations, the population has converged. The algorithm is then said to have converged to a set of solutions for the problem.
Once the initial generation is created, the algorithm evolves the generation using the following operators –
1) Selection Operator: The idea is to give preference to the individuals with good fitness scores and allow
them to pass their genes to successive generations.
2) Crossover Operator: This represents mating between individuals. Two individuals are selected using
selection operator and crossover sites are chosen randomly. Then the genes at these crossover sites are
exchanged thus creating a completely new individual (offspring). For example –
3) Mutation Operator: The key idea is to insert random genes in offspring to maintain the diversity in the
population to avoid premature convergence. For example –
As a concrete example, consider evolving a random string toward a fixed target string:
Characters A-Z, a-z, 0-9, and other special symbols are considered genes.
The fitness score is the number of characters that differ from the characters of the target string at the corresponding index, so an individual with a lower fitness value is given more preference.
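A minimal sketch of this string-evolution example; the target string, gene set, population size, and mutation rate are arbitrary illustrative choices.

```python
import random
import string

TARGET = "HELLO"                                  # hypothetical target string
GENES = string.ascii_letters + string.digits + " !#$%&"
POP_SIZE, MUT_RATE = 100, 0.1                     # arbitrary choices

def fitness(individual):
    # number of characters that differ from the target (lower is better)
    return sum(a != b for a, b in zip(individual, TARGET))

def mate(p1, p2):
    child = []
    for g1, g2 in zip(p1, p2):
        r = random.random()
        if r < MUT_RATE:                          # mutation: insert a random gene
            child.append(random.choice(GENES))
        else:                                     # crossover: take the gene from either parent
            child.append(g1 if r < 0.55 else g2)
    return "".join(child)

population = ["".join(random.choice(GENES) for _ in TARGET) for _ in range(POP_SIZE)]
generation = 0
while True:
    population.sort(key=fitness)                  # selection: rank by fitness
    if fitness(population[0]) == 0:
        break
    survivors = population[: POP_SIZE // 2]       # fittest half survives and reproduces
    population = survivors + [
        mate(random.choice(survivors), random.choice(survivors))
        for _ in range(POP_SIZE - len(survivors))
    ]
    generation += 1

print(f"reached {population[0]!r} after {generation} generations")
```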