Unit 2
Simulated Annealing
• Simulated Annealing is an optimization technique which helps us find the global optimum value (global maximum or global minimum) of a given function. This technique is used to choose the most probable global optimum value when there are multiple local optimum values in the graph of the function.
• Basically, Simulated Annealing is the combination of the hill climbing and pure random walk techniques: the first helps us find the global maximum value, and the second increases the efficiency of finding the global optimum value.
• If we move upwards using the hill climbing algorithm, our solution can get stuck at some point because hill climbing does not allow downhill moves. In this situation we combine it with one more algorithm, the pure random walk, which helps us reach an efficient solution that is the global optimum. The whole algorithm is known as Simulated Annealing.
• Likewise, in the graph above we can see how this algorithm works to find the most probable global maximum value. In the figure there are several local maxima, i.e. A, B and D, but our algorithm helps us find the global optimum value, in this case the global maximum.
Algorithm
1. First, generate a random solution.
2. Calculate its cost using some cost function.
3. Generate a random neighbour solution and calculate its cost.
4. Compare the costs of the old and the new solution.
5. If the new solution is better, accept it; otherwise accept it only with a certain probability (the acceptance probability described below).
6. Repeat steps 3 to 5 until you reach an acceptable optimized solution of the given problem.
Let’s try to understand how this algorithm helps us find the global maximum value.
First, suppose we generate a random solution and get point B. We then generate a random neighbour solution and get point F and compare the costs of both solutions; in this case the cost of the latter is higher, so our temporary solution becomes point F. We then repeat steps 3 to 5, and finally we get point A as the global maximum of the given function.
Example Code:
from random import random

def anneal(sol):
    old_cost = cost(sol)
    T = 1.0            # initial temperature
    T_min = 0.00001    # stop once the temperature gets this low
    alpha = 0.9        # cooling rate
    while T > T_min:
        i = 1
        while i <= 100:                    # 100 trials per temperature level
            new_sol = neighbor(sol)
            new_cost = cost(new_sol)
            ap = acceptance_probability(old_cost, new_cost, T)
            if ap > random():              # accept the neighbour probabilistically
                sol = new_sol
                old_cost = new_cost
            i += 1
        T = T * alpha                      # cool down
    return sol, old_cost
In the above skeleton code you have to fill in some gaps: cost(), which returns the cost of a generated solution; neighbor(), which returns a random neighbour solution; and acceptance_probability(), which compares the new cost with the old one. If the value returned by acceptance_probability() is greater than a randomly generated value between 0 and 1, we update our solution and its cost from old to new; otherwise we do not.
Calculation of acceptance probability
The equation for the acceptance probability is:
a = e^((c_new - c_old) / T)
Here c_new is the new cost, c_old is the old cost and T is the temperature; the temperature T is decreased by a factor alpha (= 0.9) in each iteration of the outer loop.
Some points related to the acceptance probability:
• It is greater than 1 if the new solution is better than the old one.
• It gets smaller as the temperature decreases (when the new solution is worse than the old one).
• It gets smaller as the new solution gets worse compared to the old one.
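As a concrete illustration, the skeleton above can be completed with the three missing helper functions. The following is a minimal sketch, assuming a toy problem in which we maximize the one-dimensional function sin(5x) + sin(2x); the objective function and the neighbourhood step size are illustrative assumptions, not part of the original example.

import math
from random import random, uniform

def cost(sol):
    # Value of the function we want to maximize (an illustrative choice
    # with several local maxima).
    return math.sin(5 * sol) + math.sin(2 * sol)

def neighbor(sol):
    # A random neighbouring solution: a small step around the current point.
    return sol + uniform(-0.1, 0.1)

def acceptance_probability(old_cost, new_cost, T):
    # a = e^((c_new - c_old) / T); improvements give a >= 1 and are always
    # accepted, so we cap the value at 1 to avoid numeric overflow.
    if new_cost >= old_cost:
        return 1.0
    return math.exp((new_cost - old_cost) / T)

# Usage: anneal from a random starting point in [0, 5].
best_sol, best_cost = anneal(uniform(0, 5))
print(best_sol, best_cost)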
Applications:
Deployment of mobile wireless base transceiver stations (MBTS, mounted on vehicles) is expensive, so a wireless provider usually offers only basic BTS coverage for normal communication data flow. However, during a special festival celebration or a popular outdoor concert in a big city, the quality of the wireless connection would be insufficient. In this situation the wireless provider increases the number of MBTS to improve data communication for the public.
Simulated Annealing is used to find the optimal number of MBTS suitable for proper data communication.
Stochastic Processes
An introduction to Stochastic processes and how they are applied every day
in Data Science and Machine Learning.
One of the main applications of Machine Learning is modelling stochastic processes. Some examples of stochastic processes used in Machine Learning are:
Poisson processes: for dealing with waiting times and queues.
Random Walk and Brownian motion processes: used in algorithmic
trading.
Markov decision processes: commonly used in Computational Biology
and Reinforcement Learning.
Gaussian Processes: used in regression and optimisation problems (e.g. hyper-parameter tuning and Automated Machine Learning).
Auto-Regressive and Moving Average processes: employed in time-series analysis (e.g. ARIMA models).
In this article, I will briefly introduce you to each of these processes.
Deterministic and Stochastic processes
In a deterministic process, if we know the initial condition (starting point) of a series of events, we can predict the next step in the series. In a stochastic process, by contrast, even if we know the initial condition we cannot determine with full confidence what the next steps are going to be, because there are many (or infinitely many) different ways the process might evolve.
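The difference can be sketched with a minimal example: a deterministic rule and a simple one-dimensional random walk, both started from the same initial condition. The specific rules below are illustrative assumptions; every run of the deterministic process gives the same trajectory, while every run of the random walk can give a different one.

from random import choice

def deterministic(x0, steps):
    # Deterministic process: the next value is fully determined by the current one.
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] + 1)
    return xs

def random_walk(x0, steps):
    # Stochastic process: each step adds +1 or -1 with equal probability.
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] + choice([-1, 1]))
    return xs

print(deterministic(0, 5))  # always [0, 1, 2, 3, 4, 5]
print(random_walk(0, 5))    # differs on every run, e.g. [0, 1, 0, -1, 0, 1]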
Observation: the result of one trial.
Population: all the possible observations that can be registered from a trial.
Sample: a set of results collected from separate, independent trials.
Poisson processes
Poisson Processes are used to model a series of discrete events in which
we know the average time between the occurrence of different events
but we don’t know exactly when each of these events might take place.
A process can be considered to belong to the class of Poisson processes if it meets the following criteria:
1 The events are independent of each other (if an event happens, this does not alter the probability that another event can take place).
2 Two events can't take place simultaneously.
3 The average rate of event occurrence is constant.
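Because the waiting times between events of a Poisson process are exponentially distributed, a sequence of event times can be simulated by accumulating exponential inter-arrival times. The sketch below assumes an arbitrary rate of 2 events per unit time and a 10-unit observation window.

from random import expovariate

def simulate_poisson_process(rate, horizon):
    # Accumulate exponential inter-arrival times until the horizon is exceeded.
    times, t = [], 0.0
    while True:
        t += expovariate(rate)
        if t > horizon:
            return times
        times.append(t)

events = simulate_poisson_process(rate=2.0, horizon=10.0)
print(len(events), "events at times:", [round(t, 2) for t in events])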
System Architecture
The overall design goal of the multi-model big data analysis and information extraction system is to establish a cross-model query platform connecting multiple subsystems. The execution ability of the cross-model query is obtained by extending the multi-model query engine.
Evaluation of Strategies:
Big Data processing engines have significantly improved the performance of Big Data applications by diminishing the amount of data movement that occurs during the execution of an application. Locality-aware scheduling, introduced by the MapReduce model [1], schedules computation close to where the data resides. The process of performing computations only when invoked was leveraged by Big Data frameworks to enable further optimizations such as regrouping of tasks and computing only what is necessary. Frameworks such as MapReduce and Spark [2] have become mainstream tools for data analytics, although many others, such as Dask [3], are emerging. Meanwhile, several scientific domains, including bioinformatics, physics and astronomy, have entered the Big Data era due to increasing data volumes and variety. Nevertheless, the adoption of Big Data engines for scientific data analysis remains limited, perhaps due to the widespread availability of scientific processing engines such as Pegasus [4], and the adaptations required in Big Data processing engines for scientific computing.
Scientific applications differ from typical Big Data use cases, which might explain the remaining gap between Big Data and scientific engines. While Big Data applications mostly target text processing (e.g. Web search, frequent pattern mining, recommender systems [5]) implemented in consistent software libraries, scientific applications often involve binary data such as images and signals, processed by a sequence of command-line/containerized tools using a mix of programming languages (C, Fortran, Python, shell scripts), referred to as workflows or pipelines. With respect to infrastructure, Big Data applications commonly run on clouds or dedicated commodity clusters with locality-aware file systems such as the Hadoop Distributed File System (HDFS) [6], whereas scientific applications are usually deployed on large, shared clusters where data is transferred between data and compute nodes through shared file systems such as Lustre [7]. Such differences in applications and infrastructure have important consequences; to mention only one, in-memory computing requires instrumentation to be applied to command-line tools. Technological advances of the past decade, in particular page caching in the Linux kernel
Genetic Algorithm
A genetic algorithm is a search technique used in computing to find true or
approximate solutions to optimization and search problems.
Initialize population: genetic algorithms begin by initializing a Population of
candidate solutions. This is typically done randomly to provide even coverage of
the entire search space. A candidate solution is a Chromosome that is
characterized by a set of parameters known as Genes.
Evaluate: next, the population is evaluated by assigning a fitness value to each
individual in the population. In this stage we would often want to take note of
the current fittest solution, and the average fitness of the population.
After evaluation, the algorithm decides whether it should terminate the search
depending on the termination conditions set. Usually this will be because the
algorithm has reached a fixed number of generations or an adequate solution has
been found.
Selection: if the termination condition is not met, the population goes through a selection stage in which individuals from the population are selected based on their fitness score: the higher the fitness, the better the chance an individual has of being selected.
Crossover: the next stage is to apply crossover and mutation to the selected
individuals. This stage is where new individuals (children) are created for the
next generation.
Mutation: mutation applies small random changes to the genes of some of the new individuals, which maintains diversity in the population. At this point the new population goes back to the evaluation step and the process starts again. We call each cycle of this loop a generation.
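The generational loop described above can be sketched as follows. This is a minimal illustrative example, assuming a toy problem where the fitness of a bit-string chromosome is simply the number of 1-genes it contains; the population size, mutation rate and number of generations are arbitrary choices.

import random

GENES, POP_SIZE, GENERATIONS, MUTATION_RATE = 20, 30, 50, 0.01

def fitness(chromosome):
    # Toy fitness function: count the 1-genes in the chromosome.
    return sum(chromosome)

def select(population):
    # Tournament selection: the fitter of two random individuals wins.
    a, b = random.sample(population, 2)
    return a if fitness(a) >= fitness(b) else b

def crossover(p1, p2):
    # Single-point crossover produces one child from two parents.
    point = random.randint(1, GENES - 1)
    return p1[:point] + p2[point:]

def mutate(chromosome):
    # Flip each gene with a small probability to maintain diversity.
    return [1 - g if random.random() < MUTATION_RATE else g for g in chromosome]

# Initialize the population randomly, then evolve it generation by generation.
population = [[random.randint(0, 1) for _ in range(GENES)] for _ in range(POP_SIZE)]
for generation in range(GENERATIONS):
    population = [mutate(crossover(select(population), select(population)))
                  for _ in range(POP_SIZE)]

best = max(population, key=fitness)
print("best fitness:", fitness(best), "chromosome:", best)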
History of Genetic Algorithms
The GA was developed by John Holland and his collaborators in the 1960s and 1970s.
As early as 1962, John Holland’s work on adaptive systems¹ laid the foundation for later developments.
In 1975, Holland, together with his students and colleagues, published the book “Adaptation in Natural and Artificial Systems”².
The GA became popular in the late 1980s as it was applied to a broad range of problems that are not easy to solve using other techniques.
In 1992, John Koza used genetic algorithms to evolve programs to perform certain tasks. He called his method “genetic programming” (GP)³.
Genetic Programming:
THE CHALLENGE
"How can computers learn to solve problems without being explicitly
programmed? In other words, how can computers be made to do what is needed
to be done, without being told exactly how to do it?"
CRITERION FOR SUCCESS
A COMPUTER PROGRAM
Representation
1. CRITERION FOR SUCCESS
"The aim [is] ... to get machines to exhibit behavior, which if done by
humans, would be assumed to involve the use of intelligence."
Main Points
Genetic programming now routinely delivers high-return human-
competitive machine intelligence. Genetic programming is an automated
invention machine. Genetic programming can automatically create a
general solution to a problem in the form of a parameterized topology.
• Decision trees
• If-then production rules (e.g., expert systems)
• Horn clauses
• Neural nets (matrices of numerical weights)
• Bayesian networks
• Frames
• Propositional logic
• Binary decision diagrams
• Formal grammars
• Vectors of numerical coefficients for polynomials (adaptive systems)
• Tables of values (reinforcement learning)
• Conceptual clusters
• Concept sets
• Parallel if-then rules (e.g., genetic classifier systems)
2. A COMPUTER PROGRAM
We present genetic programming, the youngest member of the evolutionary algorithm family. Besides its particular representation (using trees as chromosomes), it differs from other EA strands in its application area. While the EAs discussed so far are typically applied to optimization problems, GP could instead be positioned in machine learning. This, in fact, is the basis of using evolution for such tasks: models are treated as individuals, and their fitness is the model quality to be maximized.
Introductory example: credit scoring
Bank wants to distinguish good from bad loan applicants
Model needed that matches historical data
3. Representation
As the introductory example has shown, the general idea in GP is to use parse
trees as chromosomes. Such parse trees capture expressions in a given formal
syntax. Depending on the problem, and the users' perceptions on what the
solutions must look like, this can be the syntax of arithmetic expressions,
formulas in first-order predicate logic, or code written in a programming
language.
Tree-based representation
Parse trees can encode an arithmetic formula, a logical formula, or a program (example trees omitted).
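As an illustration of such a representation, the sketch below encodes a credit-scoring style rule as a parse tree built from nested tuples and evaluates it against one applicant record. The rule, thresholds and applicant fields are invented for illustration; a real GP system would evolve such trees rather than write them by hand.

# A chromosome is a parse tree: (operator, child, child) or a terminal.
# Terminals are either variable names (looked up in the applicant record)
# or constant values.
tree = ("OR",
        (">", "savings", 100000),
        ("AND", (">", "income", 50000), ("=", "employed", 1)))

def evaluate(node, applicant):
    # Recursively evaluate a parse tree against one applicant record.
    if not isinstance(node, tuple):
        return applicant[node] if isinstance(node, str) else node
    op, *children = node
    values = [evaluate(child, applicant) for child in children]
    if op == "AND":
        return all(values)
    if op == "OR":
        return any(values)
    if op == ">":
        return values[0] > values[1]
    if op == "=":
        return values[0] == values[1]
    raise ValueError("unknown operator: " + op)

applicant = {"income": 60000, "savings": 20000, "employed": 1}
print("good applicant" if evaluate(tree, applicant) else "bad applicant")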
Visualization:
Data visualization is the graphical representation of information and data.
By using visual elements like charts, graphs, and maps, data visualization
tools provide an accessible way to see and understand trends, outliers, and
patterns in data.
In the world of Big Data, data visualization tools and technologies are
essential to analyze massive amounts of information and make data-driven
decisions.
Advantages
1 Easily sharing information
2 Interactively explore opportunities
3 Visualize patterns and relationships
Disadvantages
1 Biased or inaccurate information
2 Correlation doesn’t always mean causation
3 Core messages can get lost in translation
Data visualization and big data
Data visualization is an increasingly key tool to make sense of the trillions of rows of data generated every day. It helps to tell stories by curating data into a form that is easier to understand, highlighting the trends and outliers. A good visualization tells a story, removing the noise from the data and highlighting the useful information.
General Types of Visualizations:
Chart: Information presented in a tabular, graphical form with data displayed
along two axes. Can be in the form of a graph, diagram, or map.
Table: A set of figures displayed in rows and columns.
Graph: A diagram of points, lines, segments, curves, or areas that represents
certain variables in comparison to each other, usually along two axes at a right
angle.
Geospatial: A visualization that shows data in map form using different shapes
and colors to show the relationship between pieces of data and specific
locations.
Infographic: A combination of visuals and words that represent data. Usually
uses charts or diagrams.
Dashboards: A collection of visualizations and data displayed in one place to
help with analyzing and presenting data.
Classification of visual Data Analysis Techniques:
Data Types:
Data can come in many forms, but machine learning models rely on four
primary data types. These include numerical data, categorical data, time series
data, and text data.
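As a small illustrative sketch (assuming pandas is available; the column names and values are invented), the four data types can be represented as columns of a data frame:

import pandas as pd

df = pd.DataFrame({
    "temperature": [21.5, 22.1, 19.8],                      # numerical data
    "weather": pd.Categorical(["sunny", "rainy", "sunny"]),  # categorical data
    "timestamp": pd.to_datetime(
        ["2021-01-01", "2021-01-02", "2021-01-03"]),         # time series data
    "comment": ["warm day", "light rain", "clear sky"],      # text data
})
print(df.dtypes)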
Visualization Techniques:
“The greatest value of a picture is when it forces us to notice what we never expected to see.” And our data visualization team couldn’t agree more. Visualization allows business users to look beyond individual data records and easily identify dependencies and correlations hidden inside large data sets.
Analysis of industrial data
The maintenance team is unlikely to be satisfied with instant alerts only. They
should be proactive, not just reactive in their work, and for that, they need to
know dependencies and trends. Big data visualization helps them get the
required insights. For example, if the maintenance team would like to understand
the connections between machinery failures and certain events that trigger them,
they should look at connectivity charts for insights.
Analysis of social comments
The company’s customer base is 20+ million. It would be impossible for the retailer to browse all over the internet in search of all the comments and reviews and to get insights just by scrolling through and reading them. To automate these tasks, companies resort to sentiment analysis.
Analysis of customer behavior
Ecommerce retailers strive to implement big data solutions that allow gathering detailed data about purchases in brick-and-mortar and online stores, browsing history and engagement, GPS data and data from the customer mobile app, calls to the support center and more. Registering billions of events daily, a company is unable to identify trends in customer behavior if it has only raw records at its disposal. With big data visualization, ecommerce retailers can, for instance, easily notice a change in demand for a particular product based on page views.
Most frequently used big data visualization techniques
We studied practical examples of how companies can benefit from big data visualization; now we'll give an overview of the most widely used data visualization techniques.
Symbol maps
The symbols on such maps differ in size, which makes them easy to compare.
Imagine a US manufacturer who has recently launched a new brand. The manufacturer is interested to know which regions particularly liked the brand.
Line charts
Line charts allow looking at the behavior of one or several variables over time
and identifying the trends. In traditional BI, line charts can show sales, profit
and revenue development for the last 12 months.
When working with big data, companies can use this visualization technique
to track total application clicks by weeks, the average number of complaints to
the call center by months, etc.
Pie charts
Pie charts show the components of the whole. Companies that work with both
traditional and big data may use this technique to look at customer segments
or market shares. The difference lies in the sources from which these
companies take raw data for the analysis.
Bar charts
Bar charts allow comparing the values of different variables. In traditional BI,
companies can analyze their sales by category, the costs of marketing
promotions by channels, etc.
When analyzing big data, companies can look at the visitors’ engagement with
their website’s multiple pages, the most frequent pre-failure cases on the shop
floor and more.
Heat maps
Heat maps use colors to represent data. A user may encounter a heat map in Excel that highlights sales in the best-performing store in green and in the worst-performing store in red.
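As a brief, hedged sketch of how some of these chart types can be produced with a plotting library such as matplotlib (the data values below are invented):

import matplotlib.pyplot as plt

fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(12, 3))

# Line chart: total application clicks by week.
ax1.plot([1, 2, 3, 4, 5], [120, 135, 150, 140, 170])
ax1.set_title("Line chart")

# Bar chart: visitor engagement by page.
ax2.bar(["home", "catalog", "cart"], [500, 320, 90])
ax2.set_title("Bar chart")

# Heat map: a value per store (rows) and month (columns), colored green to red.
ax3.imshow([[10, 20, 30], [25, 15, 5], [30, 35, 40]], cmap="RdYlGn")
ax3.set_title("Heat map")

plt.tight_layout()
plt.show()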
Interaction techniques:
An interaction technique, user interface technique or input technique is a combination of hardware and software elements that provides a way for computer users to accomplish a single task.
• The computing view
• The user's view
• The designer's view
The computing view
• One or several input devices that capture user input,
• One or several output devices that display user feedback,
• A piece of software that:
  • interprets user input into commands the computer can understand,
  • produces user feedback based on user input and the system's state.
The user's view
From the user's perspective, an interaction technique is a way to perform a single computing task and can be informally expressed with user instructions or usage scenarios. For example: "to delete a file, right-click on the file you want to delete, then click on the delete item".
The designer's view
From the user interface designer's perspective, an interaction technique is a well-defined solution to a specific user interface design problem. Interaction techniques as conceptual ideas can be refined, extended, modified and combined.