Article
A Markov Chain Genetic Algorithm Approach for
Non-Parametric Posterior Distribution Sampling of
Regression Parameters
Parag C. Pendharkar †
Abstract: This paper proposes a genetic algorithm-based Markov Chain approach that can be used
for non-parametric estimation of regression coefficients and their statistical confidence bounds.
The proposed approach can generate samples from an unknown probability density function if a
formal functional form of its likelihood is known. The approach is tested in the non-parametric
estimation of regression coefficients, where the least-square minimizing function is considered the
maximum likelihood of a multivariate distribution. This approach has an advantage over traditional
Markov Chain Monte Carlo methods because it is proven to converge and generates unbiased samples
in a computationally efficient manner.
1. Introduction
Let α(x) be some unknown continuous multivariate probability density function. Let
x_i = [x_{i1}, . . ., x_{in}] be the ith iid sample generated from the distribution. Assume that m
such samples are available; then, a maximum likelihood estimator α̂ will maximize the
following expression:

α̂(x) = ∏_{i=1}^{m} α̂(x_i)    (1)

If X and Y are random variables generated from a multivariate distribution (X, Y), where the
dimensionality of the response variable Y is one and an iid sample from the distribution is
represented as (x_1, y_1), . . ., (x_m, y_m), then the least-square non-parametric estimator will
minimize the following expression [1]:

∑_{i=1}^{m} [y_i − α̂(x_i)]²    (2)

In regression problems, the dependence of y on x is considered to be given by some
regression function α̂(x) belonging to some class of functions [2]. In this paper, assuming
that some functional model form α̂(x) = f(x; θ_1, . . ., θ_n) is known or a priori established,
parameters θ_1, . . ., θ_n are computed by minimizing the least-square expression in Equation (2).
When the true density α(x) lies in the parametric class of functions parameterized by the
vector θ = [θ_1, . . ., θ_n], finding the parameters by minimizing Equation (2), i.e., maximizing
the likelihood, satisfies properties such as convergence, consistency, and lack of bias [3].
However, when the actual density does not lie in the class of parametric functions, there is an
acute need for non-parametric estimation [4].
Methods in Bayesian statistics condition the problem of learning parameters (θ) on the
dataset D, which is used to learn the parameters. These methods allow for the specification
of priors on the parameters as p(θ) and require a formal specification of data likelihood
function p(D|θ), which is conditioned on the parameters θ. The learning of parameters and
confidence bounds on the parameters occurs using Markov Chain Monte Carlo (MCMC)
methods to estimate the posterior distribution p(θ|D) via the following Bayesian rule:

p(θ|D) = p(D|θ) p(θ) / p(D) ∝ p(D|θ) p(θ)    (3)

To reduce any bias in the confidence bounds on the parameters, samples from the initial
burn-in period of the MCMC chain are excluded from the computation of the confidence bounds.
The method described in this paper does not require any knowledge of data likeli-
hoods or user-defined priors. The “population”-based method directly generates posterior
distribution samples p(θ | D ) so that confidence bounds on parameters can be generated.
There is no need to exclude any initial burn-in period samples. More specifically, a genetic
algorithm (GA) approach is used to estimate the parameters and their confidence bounds.
This method begins with a random sample (i.e., population) estimate p0 (θ | D ) and iterates
over certain GA generations k so that a steady-state distribution of the sample, pk (θ | D ),
minimizing the average least-square error, is obtained. It is shown that from one iteration
to the next, the GA forms a Markov Chain, whose transition matrix can be determined
using a fitness measure of minimizing the sum of squared error expression (2). Once the
population converges, the final population can be used to establish confidence bounds on
the regression parameters. The transition matrix is shown to be irreducible and aperiodic,
which guarantees both the uniqueness of the steady-state distribution and convergence to
a steady-state, regardless of the initial random population p0 (θ | D ).
The rest of the paper is organized as follows: In Section 2, preliminaries of the GA
Markov Chain framework are described. In Section 3, the approach is applied to a dataset,
and the results of the experiments are reported. Section 4 concludes the paper with a
summary and directions for future work.
2. Preliminaries of the GA Markov Chain Framework

Assume that the regression function takes the following linear form:

y_j = ∑_{i=1}^{n} β_i x_{ji} + β_{n+1}    (4)

where j = {1, . . ., m} is the index representing individual examples in the dataset D, i = {1, . . ., n}
indexes the independent variables, the β_i are regression coefficients, and β_{n+1} is the
regression intercept. When each θ_i ∈ [−1, 1], the equivalent regression equation can be written
as follows:

y_j = λ (∑_{i=1}^{n} θ_i x_{ji} + θ_{n+1})    (5)
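To make the mapping from genes to predictions concrete, the following minimal Python sketch evaluates Equation (5) for one candidate parameter vector; the function name predict and the array layout are illustrative assumptions, not part of the original procedure.

import numpy as np

def predict(theta, X, lam):
    # Equation (5): y_hat_j = lam * (sum_i theta_i * x_ji + theta_{n+1}).
    # theta: (n + 1,) genes in [-1, 1], with the last entry acting as the scaled intercept.
    # X: (m, n) matrix of independent variables; lam: the positive scaling constant lambda.
    return lam * (X @ theta[:-1] + theta[-1])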
The GA population is represented as a matrix Q^g whose number of rows is equal to the size of
the population (Ω) and whose number of columns is equal to z, the number of regression
parameters (z = n + 1). For the sake of illustration, the matrix Q^g is represented as follows:

Q^g = [ θ_11  . . .  θ_1z
          .   . . .   .
        θ_Ω1  . . .  θ_Ωz ]    (6)
The superscript “g” represents the generation number for the population. Each row
represents a regression model described by its row parameters. In GA terminology, each
row (regression model) is called a population member. A population member’s fitness
is computed using the dataset D, applying the regression model using row parameters
and Equation (5), and computing a reciprocal function of the root-mean-square error of
Equation (2) as the fitness value. This approach to computing the fitness value means
that population members with higher fitness values have lower root-mean-square errors.
Each population member is represented with the subscript s = {1,. . ., Ω}, and an individual
population member is defined as θ s = [θ s1 ,. . ., θ sz ]T . Any component of population vector
θ s is called a gene of the population member s. All genes for all population members
take values between −1 and 1, i.e., θ si ∈[−1, 1]. Initially, when g = 0, the values of genes
for all population members are randomly generated over this interval using a uniform
distribution. Next, for each population member, using the dataset D, its fitness is computed.
The population for the next generation, i.e., g = 1, is computed by applying selection,
crossover, and mutation operations. All of these three operations are probabilistic. The
selection operator used in this research is proportional selection. The proportional selection
operation selects two parents with replacement (i.e., two population members from Q0 ),
wherein higher-fitness population members are more likely to be chosen as parents. Once
these two parents are selected, a random crossover point χ ∈ [1, z − 1] is chosen, and
then, with a certain crossover probability pc, a child is created by exchanging the parents'
genes at crossover point χ. As an example, assume a crossover point χ ∈ [1, z − 1] and
two selected parents, P_a = [θ_a1, . . ., θ_az] and P_b = [θ_b1, . . ., θ_bz], with a, b ∈ [1, Ω]; then, the
genes of the child are created by swapping the genes of the parents at the crossover point, which
will be [θ_a1, . . . , θ_aχ, θ_b(χ+1), . . . , θ_bz]. The mutation operation probabilistically changes a
gene of a population member using a low mutation probability pm . If a gene is selected for
mutation, its value is replaced by generating a random number using uniform distribution
in a closed interval [−1, 1]. If Ts , Tc , and Tm are represented as selection, crossover, and
mutation operations, then the next generation population Qg+1 is generated from the
current generation population Qg , as follows:
Q^{g+1} = T_m(T_c(T_s(Q^g)))    (7)
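As a rough illustration of Equation (7), the sketch below composes proportional selection, single-point crossover, and mutation to produce the next population; the helper name next_generation, the loop structure, and the per-child application of the operators are assumptions made for illustration rather than the paper's exact implementation.

import numpy as np

rng = np.random.default_rng()

def next_generation(Q, fitness, pc=0.3, pm=0.05):
    # One application of Q_{g+1} = T_m(T_c(T_s(Q_g))), Equation (7).
    # Q: (Omega, z) array of genes in [-1, 1]; fitness: (Omega,) positive fitness values.
    omega, z = Q.shape
    probs = fitness / fitness.sum()                              # proportional (fitness-weighted) selection
    children = np.empty_like(Q)
    for s in range(omega):
        a, b = rng.choice(omega, size=2, replace=True, p=probs)  # two parents, with replacement
        child = Q[a].copy()
        if rng.random() < pc:                                    # single-point crossover with probability pc
            chi = rng.integers(1, z)                             # crossover point chi in [1, z - 1]
            child[chi:] = Q[b, chi:]
        mutate = rng.random(z) < pm                              # mutation with probability pm per gene
        child[mutate] = rng.uniform(-1.0, 1.0, size=int(mutate.sum()))
        children[s] = child
    return children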
The matrix Q^g in Equation (6) may be written as Q^g = [θ_1^T, . . . , θ_Ω^T]^T, and it is used to
represent one state in a discrete Markov Chain. Each generation “g” is a discrete event.
Given that genes or individual components of θ s take continuous values in the interval
[−1, 1], the interval is discretized into discrete categories using a minimal interval width δ
> 0. The discretized interval will contain categories in a set {−1, −1 + δ, −1 + 2δ, . . ., 0, . . .,
1 − δ, 1}. By choosing a value of δ so that zero is not eliminated from the set, there will be η
= (2/δ) total categories in the interval. Pendharkar [5] showed that the number of unique
states, U, of the population will be given by the following expression:
U = C(η^z + Ω − 1, Ω)    (8)

where C(a, b) denotes the binomial coefficient “a choose b”.
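Treating Equation (8) as a binomial coefficient, the number of unique population states can be computed directly; the values of δ, z, and Ω below are arbitrary and purely illustrative.

from math import comb

delta, z, omega = 0.5, 4, 10           # illustrative values only
eta = int(2 / delta)                   # number of discrete categories in [-1, 1]
U = comb(eta**z + omega - 1, omega)    # Equation (8): unique population states
print(U)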
The state of any population at any generation “g” can be represented by a vector π g of
dimension U, taking values in the interval [0, 1] so that the components of the vector always
sum to 1. As an example, π^0 = [0, 0, 1, . . ., 0]^T means that the initial random
population in generation 0 is in the third state of all possible unique states U. Assuming that
the initial state of the population is given by the state π 0 and the steady-state distribution
(after several generations) is given by π*, the expression in Equation (7) is a Markov Chain,
and iterative operations in Equation (7) can be represented as follows:
[π*]^T = [π^0]^T R    (9)
where R is the Markov Chain transition matrix of dimension U × U. Let rpq be the individual
elements of matrix R; then, as long as pm ≠ 0, rpq > 0, because each population state can
be reached from any other population state, since a mutation can randomly change any gene
of any or all population members with a non-zero low probability. Also, rpq ≠ 1, because
pm ≠ 1 and because there is always some uncertainty that the population from state p
may not go to state q in the next step. Furthermore, since each rpq > 0 and rpq ≠ 1, R is
a stochastic matrix with the magnitude of its maximum eigenvalue equal to one and its
second-largest eigenvalue less than one. The Markov Chain is ergodic and irreducible, with
a unique stationary distribution that can be reached with any initial random population. It
is important to note that steady-state distribution is reached when the average value of the
GA fitness function is maximized for the GA population, which in the case of regression is
equivalent to minimizing the least-square error. Pendharkar [6] provided an upper bound
of Euclidean distance between steady-state distribution π* and population distribution
probability vector π g for mutation rates (i.e., mutation probability) pm ∈(0, 1). This upper
bound is as follows:
‖π* − π^g‖ ≤ [min{1 − (pm)^{Ωz}, 1 − (1 − pm)^{Ωz}}]^g < 1    (10)
The reader may note that Equation (10) gives an upper bound independent of any
discrete interval chosen by selecting a particular value of δ. Neither δ nor η play any
role in the upper bound; the bound is independent of the discretization of the [−1, 1]
interval and is also applicable for the continuous interval [−1, 1]. The drawback of the
bound defined in Equation (10) is that it is a loose bound. Albeit loose, the bound suggests
that the convergence to steady-state distribution is guaranteed and is slower for large
population sizes and many independent variables. Faster convergence can be obtained by
selecting pm values closer to 0.5. In deriving the upper bound, Pendharkar [6] assumed that
the crossover probability is non-zero as well. The theoretical expression in Equation (10)
can certainly be used to compute the total number of generations needed to achieve a
certain level of accuracy for ∥π ∗ − π g ∥. In practice, since the bound from Equation (10) is a
loose bound, steady-state distribution occurs much earlier than the theoretical number of
generations to convergence computed using Equation (10) suggests. In this research, the
total number of generations when a GA run is terminated is represented by the symbol
ϑ. For a fixed value of ϑ, the worst-case computational complexity of the GA procedure
is O(Ω²).
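For illustration, the bound in Equation (10) could be turned into a conservative estimate of the number of generations needed for a target accuracy ε, as in the sketch below; the function name and the chosen parameter values are assumptions, and the resulting figure is deliberately pessimistic because the bound is loose.

import math

def generations_for_accuracy(eps, pm, omega, z):
    # Smallest g with rho**g <= eps, where rho is the bracketed term in Equation (10).
    rho = min(1 - pm**(omega * z), 1 - (1 - pm)**(omega * z))
    return math.ceil(math.log(eps) / math.log(rho))

print(generations_for_accuracy(eps=0.01, pm=0.05, omega=100, z=3))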
One of the benefits of the loose bound from Equation (10) is that it is independent of
the crossover rate or type of crossover operations used in the GA. This means that different
crossover operations can be used, and the mutation-driven loose bound from Equation (10)
will still hold for these different crossover operations. As a result, this research uses two
crossover operations. The first crossover operation is the single-point crossover operation
described earlier. The second crossover operation is the arithmetic crossover [7]. Figure 1
illustrates the two crossover operations used in this research. As explained before, the
single-point crossover picks two parents, a random crossover point, and it creates a child
with genes containing the genes from the head of the first parent until the crossover point
and genes from the tail of the second parent after the crossover point. The critical point to
notice in this crossover is that the child’s genes take values from those appearing in two
parents. The arithmetic crossover selects two parents using the selection operation and
then considers the fitness values of the two parents. The higher-fitness parent becomes the
base parent, whose genes will be retained unless the decision vector suggests a decision to
change its genes. The decision vector, with a dimensionality equal to the number of genes,
is a binary vector that selects a value of 0 or 1 with equal probability. For genes where the
decision vector takes a value of 0, the higher-fitness parent’s genes are retained in the child.
For genes where the decision vector takes a value of 1, the child’s gene (θ_ci) is replaced
by a computed value, which is a weighted average of the respective genes of the two
parents. The weights in this weighted average are assigned by generating a random number
φ ∈ (0, 1) and averaging the respective genes from the parents using the following formula:

θ_ci = θ_ai × φ + [(1 − φ) × θ_bi].    (11)

In Equation (11), the higher-fitness base parent is assumed to be the parent “a” with genes
P_a = [θ_a1, . . ., θ_az]. There are some advantages of arithmetic crossover. First, arithmetic
crossover produces gene values in the child that are weighted averages of the corresponding
gene values of the parents. Second, these values are not random but are generated within the
upper bounds of the respective gene values of both parents. Additionally, there are no biases
in the arithmetic crossover, as all genes have an equal probability of crossover, and the
weighted average improves the precision of parameters when parents have somewhat similar
genes. As an example of bias, the first gene of parent a and the last gene of parent b will
always be present in child c when a single-point crossover is used because the crossover point
will always include the first gene and retain the last gene in the child. This situation is
eliminated in arithmetic crossover, since the decision vector may contain one in the first and
last gene locations.
Figure 1. The two crossover operations (single-point and arithmetic) used in this research.
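A minimal sketch of the arithmetic crossover, following Equation (11) and the decision-vector description above, is given below; the function name and the use of a single random weight φ per crossover (as suggested by Figure 1) are illustrative assumptions.

import numpy as np

rng = np.random.default_rng()

def arithmetic_crossover(parent_a, parent_b):
    # parent_a is the higher-fitness base parent; Equation (11) is applied where the decision vector is 1.
    z = parent_a.size
    decision = rng.integers(0, 2, size=z)     # binary decision vector, 0 or 1 with equal probability
    phi = rng.random()                        # weight phi in (0, 1)
    child = parent_a.copy()
    mix = decision == 1
    child[mix] = parent_a[mix] * phi + (1 - phi) * parent_b[mix]   # Equation (11)
    return child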
The value of λ can be determined in one of two ways. In the first, incremental approach, an initial value of λ = 10 is selected, GA experiments are run, and the population
member with the best fitness is found. Using the genes of this population member, the
value of λ, and Equation (5), the regression function in Equation (4) is estimated, and
its root-mean-square (RMS) error on the dataset is computed and stored for comparison.
Next, the value of λ is incremented by 1 to the next value, and the procedure is repeated
for a new value of λ and a new value of RMS is computed. If the RMS for the current
value (i.e., λ = 11) is better than the previously stored RMS value (i.e., λ = 10), then the
current value of RMS is stored for comparison and λ is incremented by 1 to the next value
(i.e., λ = 12), and a similar procedure is repeated. Otherwise, the previous value of λ is
selected as the final value of λ. The second approach is a direct approach that benchmarks
the value of λ by using the results from traditional parametric statistical regression function
parameters. Let these parameters from the traditional parametric statistical regression
function be represented as a vector [β′_1, . . . , β′_{n+1}]^T; then, the value of λ is computed using
the following expression:

λ = max{β′_1, . . . , β′_{n+1}} + 1.    (12)
The second approach has one additional benefit that helps speed up the convergence
of the GA procedure. Since the regression function parameters from traditional regression
are available, the genes of some members of an initial random population can be seeded
by considering the value of λ and adding minor random noise to the known traditional
regression parameters. This seeding ensures that the initial GA population is no longer
entirely random and has some solutions close to traditional regression parameters and
near-unknown optimum solutions. Longer run times are sometimes necessary to achieve
population convergence without a seeding. Given these added computational efficiency
benefits, this research uses the second approach for determining the value of λ and partial
seeding of an initial random population. The seeding procedure used in this research first
creates a random initial population. Next, for Ω times, it randomly picks a population
member with replacement, randomly selects one of its genes, and assigns it a value using
the following expression:
(β′_i / λ) × rand(0, 1),    (13)
where rand(0,1) is a randomly generated number, taking its value between 0 and 1, and i
is the index of the selected gene. Once this procedure is complete, the initial population
becomes the seeded initial population. The reader may note that some members of the
seeded initial population will contain all random genes because selecting the population
member for seeding is a bootstrap sampling procedure. Furthermore, only one gene of a
chosen population member is seeded, and that too has some random noise inserted in it, as
shown in Equation (13). Seeding procedures are also used in MCMC and are sometimes
called better starting points. The seeding procedure used here is somewhat weaker than
those used in MCMC algorithms, where starting values for all parameters are seeded using
values closer to the mode of the posterior distribution to improve convergence [8].
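The sketch below illustrates the second approach: λ is computed from the OLS coefficients via Equation (12), and a random initial population is then partially seeded following Equation (13). The function name and array handling are assumptions; the OLS values shown are those reported later in Section 3.

import numpy as np

rng = np.random.default_rng()

def seeded_population(beta_ols, omega):
    # beta_ols: OLS coefficients [beta'_1, ..., beta'_{n+1}] used for seeding.
    beta_ols = np.asarray(beta_ols, dtype=float)
    lam = beta_ols.max() + 1.0                        # Equation (12)
    z = beta_ols.size
    Q = rng.uniform(-1.0, 1.0, size=(omega, z))       # fully random initial population
    for _ in range(omega):                            # Omega seeding draws, with replacement
        s = rng.integers(omega)                       # randomly chosen population member
        i = rng.integers(z)                           # randomly chosen gene
        Q[s, i] = (beta_ols[i] / lam) * rng.random()  # Equation (13): seeded gene with random noise
    return Q, lam

Q0, lam = seeded_population([1.61591, 0.0143848, 2.34123], omega=100)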
Using the final generation of the population at convergence, the central tendency parame-
ters (means and medians) and probability intervals for regression parameters can be computed
by first computing the means and probability intervals of θi s for i = {1,. . ., n + 1} and then
multiplying these values by the parameter λ obtained from Equation (12). The mean values
for the θi s are column averages for the final population from the matrix shown in Equation (6).
In the final population at convergence, each column represents a sample from the posterior
distribution of a regression function parameter. A 100 × (1 − γ)% confidence interval can be
estimated by taking 100 × (γ/2)% and 100 × (1 − γ/2)% quantiles of the parameter sample,
representing end points of the interval.
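A sketch of how these posterior summaries could be computed from the final population matrix is shown below; the scaling of each gene column by λ and the quantile-based interval follow the description above, while the function and variable names are illustrative.

import numpy as np

def posterior_summaries(Q_final, lam, gamma=0.05):
    # Each column of Q_final is a posterior sample of one theta_i; scaling by lambda gives beta_i.
    betas = lam * Q_final
    mean = betas.mean(axis=0)
    median = np.median(betas, axis=0)
    lower = np.quantile(betas, gamma / 2, axis=0)       # 100*(gamma/2)% quantile
    upper = np.quantile(betas, 1 - gamma / 2, axis=0)   # 100*(1 - gamma/2)% quantile
    return mean, median, lower, upper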
Once the value of λ is known and the statistical significance level for the confidence
interval, γ, is decided, the initial no-information interval width (IW) can be determined.
For example, if γ = 5%, the initial no-information IWs for parameters in Equation (4) are
[−0.975 × λ, 0.975 × λ]. For the GA procedure, to add value, the final parameter IWs
should be smaller than the initial no-information IWs. A formal method for computing the
reduction in the final GA procedure IWs from the initial no-information IWs is mentioned
in the next section.
When the ordinary least-square (OLS) regression is run on the dataset, it results in
parameter values of β′_1 = 1.61591, β′_2 = 0.0143848, and β′_3 = 2.34123. The overall regression
model is significant at a 95% statistical confidence level, and the adjusted R-squared
value is 95.4%. The root-mean-squared (RMS) error for the OLS model is 3.05766. From
Equation (12), the λ value is 3.34123. The no-information IW for the non-parametric posterior
distribution regression parameters β_1, β_2, and β_3 is [−3.2577, 3.2577] for a value of γ = 5%.
Some implementation aspects of the procedure were not described in Section 2. The first
is the fitness function of the GA procedure. For a given population member, the predicted
outputs are computed from its genes using Equation (5). The RMS error on the dataset
is computed next. Let this RMS error be ξ_s for some population member
s ∈ {1, . . ., Ω}. The fitness value for this population member s in generation g, f_s^g, is
computed using the following expression:

f_s^g = λ / (1 + ξ_s)    (14)
The maximization of Equation (14) can be obtained by minimizing the RMS. Note that
λ is a predetermined constant. The number “one” in the denominator is added to avoid
dividing by zero if ξ s takes a value of zero in the event of a perfect regression model fit.
Using the results of the OLS and plugging them into Equation (14), we obtain a value of
0.823. This value gives a benchmark of what may be the approximate best value of the
best-fitness member in the GA procedure. If the best member fitness function value exceeds
0.823, then the GA model procedure has found a regression model with a lower RMS than
the OLS regression model. Usually, it would be rare to find better results from a heuristic GA
regression model. Thus, the threshold value of 0.823 should be considered an upper bound
on the fitness value of the best population fitness member found using the GA experiments.
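A self-contained sketch of the fitness computation in Equation (14) is shown below; it recomputes the Equation (5) prediction inline, and the final line reproduces the OLS benchmark value of approximately 0.823 quoted above.

import numpy as np

def fitness(theta, X, y, lam):
    # Fitness of one population member: f = lam / (1 + RMS error), Equation (14).
    y_hat = lam * (X @ theta[:-1] + theta[-1])     # Equation (5)
    rms = np.sqrt(np.mean((y - y_hat) ** 2))       # xi_s
    return lam / (1.0 + rms)

print(3.34123 / (1.0 + 3.05766))                   # approximately 0.823, the OLS benchmark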
Second, the no-information IW for the regression parameters was computed as [−0.975 × λ,
0.975 × λ] for the value of γ = 5%. If for the GA population considered for computing
the final posterior distribution sample, the lb = 100 × (γ/2)% and ub = 100 × (1 − γ/2)%
quantiles represent lower and upper bounds for a given regression parameter, then the
percentage improvement in parameter IW from the no-information IW can be computed
using the following expression:
Percentage IW Improvement (%imp.) = 1 − (ub − lb) / [2 × (1 − γ) × λ].    (15)
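The percentage IW improvement can be computed as in the following sketch of Equation (15); lb and ub denote the quantile bounds of one parameter, and the example values are those reported later in Table 2 for the arithmetic crossover β1.

def piwi(lb, ub, lam, gamma=0.05):
    # Percentage IW improvement over the no-information interval, Equation (15).
    return 1.0 - (ub - lb) / (2.0 * (1.0 - gamma) * lam)

print(piwi(lb=0.568, ub=1.604, lam=3.34123))       # approximately 0.8368, i.e., 83.68%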
Finally, in Section 2, it was mentioned that a posterior distribution sample from the
final population at convergence is considered for computing the parameter confidence inter-
vals. Since the mutation operation introduces some randomness, it is not always necessary
to pick the last generation of the population to compute a parameter confidence interval.
From a practical standpoint, a sample is extracted from generation g* for population Q*,
where g* is determined using the following expression:
g* = argmax_{g ∈ {0, . . ., ϑ}} (1/Ω) ∑_{s=1}^{Ω} f_s^g    (16)
Equation (16) implies that a sample is extracted when the average population fitness
value is highest. Generally, this extraction generation is closer to the final population
generation ϑ, but it may or may not be the final population at generation ϑ.
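A sketch of the sample-extraction rule in Equation (16), assuming the fitness values of every generation have been stored during the run, is as follows:

import numpy as np

def extraction_generation(fitness_history):
    # fitness_history: list of (Omega,) fitness arrays, one per generation g = 0, ..., theta.
    # Returns g*, the generation with the highest average population fitness (Equation (16)).
    avg_fitness = np.array([f.mean() for f in fitness_history])
    return int(np.argmax(avg_fitness))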
The objective of using a GA for learning regression parameters is similar to that of ridge
regression: avoiding overfitting of the training dataset. As a heuristic procedure, in most cases, a
GA-based regression model may not outperform OLS in terms of the RMS error. Still, it may
provide better generalizability for unseen future cases, leading to improved predictability
compared to the traditional OLS regression model. Given a small dataset, three regression
parameters, and an initial seeded population, minor initial experiments were conducted
to select GA parameters for the experiments. These parameters were Ω = 100, ϑ = 200,
pc = 30%, and pm = 5%. The reader should note that the initial population in the procedure
used in this research was always random for each run. This randomization was obtained
using a random number generator seeded with the computer clock time. There are random
number generators that allow users to define the seed for
the random number generation procedure. When such a random number generator is
used, the same initial population can be generated for each run by keeping the value of
the seed constant. In such cases, the selection of parameters will be essential because
then the parameters are the only criteria that will govern the quality of the final results.
However, when the initial population for each run is randomly generated using computer
clock times, the impact of parameters on the quality of solutions is hard to ascertain. In
the case of computer clock time-seeded random numbers for the same set of parameters,
slightly different results can be obtained owing to different starting populations. Extensive
experiments on GA parameters in such a case are not necessary. Different runs with different
initial populations can be conducted, and the percentage improvement criterion highlighted
in Equation (15) can be used to separate high-quality results from low-quality results.
Additionally, the average fitness of the GA population can be compared with the benchmark
mentioned earlier (a value of 0.823) to monitor the gap between the average population
fitness and its upper bound value, with a lower gap representing a better solution. The
GA literature advises against using very high mutation rate values because high mutation
rates introduce randomness [10]. Some randomness is necessary for searching for better
solutions, but too much randomness can be detrimental to solution quality [10]. A general
rule of thumb is that the crossover rate value should be less than 50%, and the mutation
rate value should be less than the crossover rate value. Additionally, both the crossover rate
and mutation rate should be non-zero values. In this paper, multiple runs were conducted
with different starting populations and only the best results are reported.
Figures 2 and 3 illustrate the results of the GA experiments for two crossover oper-
ations. In both Figures 2 and 3, the top solid line is the best-fitness population member
and the dotted line is the average population members’ fitness. While the top line appears
straight, minor improvements in the best population member fitness occur. For Figure 2,
the best population member fitness value improves from 0.823015 in the 1st generation
to 0.823325 in the 200th generation. In Figure 3, these numbers are 0.822516 and 0.823437,
respectively. The average fitness values for the posterior distribution sample extracted at
the 192nd generation from Figure 2 and the 194th generation for Figure 3 were 0.631558
and 0.714597, respectively. The higher average fitness value for the sample extracted from
Figure 3 suggests that arithmetic crossover results are better and will provide a greater
percentage of IW improvement (PIWI) from Equation (15). The reader may also visually
observe the gap between the average population fitness values and the best fitness popu-
lation member values. This gap is lower in Figure 3, indicating that arithmetic crossover
results are better than single-point crossover results.
Table 2 illustrates the descriptive statistics of results obtained from the two crossover
operators. Bayesian regression results taken from a text [9] are also reported for comparison.
As expected, the PIWIs are higher for the arithmetic crossover, with all PIWIs being higher
than 46%. This results in a tighter 95% confidence interval of the GA regression parameters
for the arithmetic crossover operator. In all cases, the arithmetic crossover operator PIWIs
are higher than those for the single-point crossover operator. A point of interest for the
reader may be to note the starting average fitness values for both crossover operations
at generation one. Both operators start at an average fitness value of around 0.4. Since
the initial GA population is seeded with values close to the OLS regression parameters,
the starting point illustrates that there is enough diversity in the population for the GA
operators to still evolve the population to a higher overall fitness.
Figure 2. The single-point crossover results: average and best population fitness values by generation number.
Figure 3. The arithmetic crossover results (sample extraction at generation 194): average and best population fitness values by generation number.
Table 2. Posterior distribution summaries.

Parameter    Mean    Std. Dev.    2.5%    Median    97.5%    PIWI
Single-Point Crossover Results
β1    1.548    0.66    −0.702    1.712    1.712    61.98%
β2    0.073    0.64    −0.635    0.015    2.279    54.10%
β3    0.984    1.18    −2.907    1.069    2.416    16.16%
Arithmetic Crossover Results
β1    1.491    0.71    0.568    1.604    1.604    83.68%
β2    0.003    0.62    −1.203    0.015    1.450    58.21%
β3    2.089    0.93    −1.036    2.331    2.331    46.97%
Bayesian Model Results (Taken from [9])
β1    1.610    0.18    1.272    1.609    1.968    —
β2    0.014    0.01    0.007    0.014    0.022    —
β3    2.356    1.19    −0.039    2.372    4.635    —
When viewing the arithmetic crossover results alongside the Bayesian model results, the
Bayesian model confidence bounds are tighter, partly due to the likelihood distribution
assumptions that Bayesian models make. The only exception is the confidence bound for
the regression intercept, which is lower for the GA regression model. The non-parametric
distribution is negatively skewed since the mean values for arithmetic crossover GA models
are always lower than the median values. When tight bounds are desired, it is possible
to use the GA regression model first to understand the underlying properties of the non-
parametric posterior distribution and then select the appropriate data likelihood and prior
distributions in Bayesian regression. This way, Bayesian regression modelers can make an
informed decision and improve confidence in the results of their investigations.
Table 3 illustrates the parameter values for the best-fitness population member found
in all GA generations. Both models provide somewhat similar results in terms of RMS
values, which in turn are identical to the RMS results obtained through OLS regression.
Given that there are three values for a regression parameter (mean, median, and best-fitness
population member genes), the question is, which value should be used as the final set of
regression parameters? A decision maker should use the best member fitness parameter
values from Table 3 if those values fall within the 95% confidence bounds provided in
Table 2. For arithmetic crossover, the value for β1 = 1.616 does not belong to its 95%
confidence bounds of [0.568, 1.604] from Table 2. Thus, it should be rejected. The reason for
this rejection is that the value of β1 = 1.616 may not be a natural outcome of GA population
evolution but was retained due to the initial seeding of the GA population with OLS
regression parameters. Once a value for the best member fitness is rejected, the median
values should be used as the final set of regression parameters. Ideally, the best fitness
values should be chosen if these values fall within their respective 95% confidence bounds.
Otherwise, median values represent the next best solution. In the case of single-point
crossover, the best member fitness values fall within the 95% confidence bounds provided
in Table 2, so those values are taken as the final set of regression parameters.
Table 3. Parameter values of the best-fitness population member found across all GA generations.

GA Model    β1    β2    β3    RMS
Single-Point Crossover    1.605    0.014    2.416    3.058
Arithmetic Crossover    1.616    0.014    2.331    3.057
It may be possible to use an ensemble value, the average of all three values (median,
mean, and best member fitness genes) as the final value for the regression model. The
merits of different approaches in deciding the final set of regression function parameters are
considered to be out of scope for the current study. However, multiple values offer different
selection possibilities, where each possibility may have advantages and disadvantages.
4. Summary and Conclusions

This paper proposed a GA-based Markov Chain approach for non-parametric sampling of the
posterior distribution of regression parameters. Unlike MCMC methods, the approach requires
neither user-defined priors nor distributional assumptions on the data likelihood; only the
knowledge of a maximum likelihood criterion for the posterior distribution is necessary. This
knowledge is directly used for creating the GA fitness function.
There are some areas that could be explored where the proposed approach may be
helpful. One of the advantages of the proposed method is that it does not require a
continuous or differentiable likelihood function. The method may be adapted to generate
truncated posterior distributions by imposing penalties in the GA fitness function. These
penalties may be imposed by using IF–THEN rules. As noted earlier, the proposed method
can also be used to aid in selecting data likelihood density functions for MCMC methods.
While the linear regression problem domain was used in this research due to its widespread
application and simplicity, the proposed method can easily be used for non-linear regression
and linear and non-linear discriminant analysis. This method is likely more efficient than
MCMC methods and will likely converge faster. When both the current method and the
MCMC method can be used on a problem domain (as was the case in this research), both
can be used to gain confidence in the final results. When results vary, it is possible to use
the data-mining literature to devise approaches to combine different values to reduce error
variance and gain confidence in the selected set of parameters. Future research is needed to
explore the additional merits of the proposed GA procedure.
References
1. Geman, S.; Hwang, C.-R. Nonparametric Maximum Likelihood Estimation by the Method of Sieves. Ann. Stat. 1982, 10, 401–414.
[CrossRef]
2. Gasser, T.; Engel, J.; Seifert, B. Nonparametric Function Estimation. In Handbook of Statistics; Elsevier Science Publishers:
Amsterdam, The Netherlands, 1993; Volume 9, pp. 423–465.
3. Agarwal, R.; Chen, Z.; Sarma, S.V. A Novel Nonparametric Maximum Likelihood Estimator for Probability Density Functions.
IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1294–1308. [CrossRef] [PubMed]
4. Ferreira, T.R.; Liska, G.R.; Beijo, L.A. Assessment of Alternative Methods for Analysing Maximum Rainfall Spatial Data Based on
Generalized Extreme Value Distribution. SN Appl. Sci. 2024, 6, 34. [CrossRef]
5. Pendharkar, P.C.; Koehler, G.J. A General Steady State Distribution Based Stopping Criteria for Finite Length Genetic Algorithms.
Eur. J. Oper. Res. 2007, 176, 1436–1451. [CrossRef]
6. Pendharkar, P.C. A Steady State Convergence of Finite Population Floating Point Canonical Genetic Algorithm. Int. J. Comput. Sci.
2008, 2, 184–199.
7. Pendharkar, P.; Rodger, J. An Empirical Study of Impact of Crossover Operators on the Performance of Non-Binary Genetic
Algorithm Based Neural Approaches for Classification. Comput. Oper. Res. 2004, 31, 481–498. [CrossRef]
8. van Ravenzwaaij, D.; Cassey, P.; Brown, S.D. A Simple Introduction to Markov Chain Monte-Carlo Sampling. Psychon. Bull. Rev.
2018, 25, 143–154. [CrossRef] [PubMed]
9. Ntzoufras, I. Bayesian Modeling Using WinBUGS; John Wiley and Sons, Inc.: Hoboken, NJ, USA, 2009.
10. Goldberg, D.E. Genetic Algorithms in Search, Optimization & Machine Learning; Addison-Wesley: Reading, MA, USA, 1989.
11. Tucker, J.D.; Shand, L.; Chowdhary, K. Multimodal Bayesian Registration of Noisy Functions Using Hamiltonian Monte Carlo.
Comput. Stat. Data Anal. 2021, 163, 107298. [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.