Advanced Mathematical Techniques in Engineering Sciences
Science, Technology, and Management Series
Series Editor
J. Paulo Davim
Advanced Mathematical Techniques in Engineering Sciences
Edited by
Mangey Ram and J. Paulo Davim
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been
made to publish reliable data and information, but the author and publisher cannot assume responsibility for the
validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the
copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to
publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let
us know so we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or
utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including
photocopying, microfilming, and recording, or in any information storage or retrieval system, without written
permission from the publishers.
For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://
www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA
01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users.
For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been
arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for
identification and explanation without intent to infringe.
Contents
Preface............................................................................................................................................. vii
Acknowledgments..........................................................................................................................xi
Editors............................................................................................................................................ xiii
Contributors....................................................................................................................................xv
Index.............................................................................................................................................. 327
Preface
Mathematical techniques are the strength of the engineering sciences and form the common
foundation of their novel disciplines. The book Advanced Mathematical Techniques in
Engineering Sciences covers an ample range of mathematical tools and techniques applied
in various fields of the engineering sciences. Through this book, engineers have the
opportunity to gain greater knowledge, which may help them in the applications of
mathematics in engineering sciences.
Chapter 1 presents the rules and methods for applying the Laplace transform. Three
sections of the mathematical investigation of applied questions are distinguished: the rules
for performing operations in the Laplace transform; Laplace transform in research tasks
of the vibrations of a rod; and application of the Laplace transform in engineering technol-
ogy. Specific examples of solving differential equations are presented, applied to prob-
lems of mechanics and the theory of oscillations. The essence of Kondratenko’s method is
described, on the basis of which mathematical modeling of some technological operations
of mechanical engineering was carried out, and features of dynamic phenomena during
the operation of equipment and the interaction of the tool with the workpiece were revealed.
Chapter 2 investigates the history, nature, and importance of the Fourier series. This
chapter describes the periodic function, orthogonal function, Fourier series, Fourier
approximation, Dirichlet’s theorem, Riemann–Lebesgue lemma for Fourier series, dif-
ferentiation of Fourier series, convergence of the Fourier series of the functions, Fourier
transform, Fourier analysis with Fourier transform, and Gibbs phenomenon. Also, some
of the summability methods (Cesàro, Nörlund, Riesz, weighted mean, etc.), absolute sum-
mability methods, strong summability methods, necessary and sufficient conditions for
regularity of matrix of summability methods, uses of summability, norm, modulus of con-
tinuity, Lipschitz condition, various Lipschitz classes in trigonometric Fourier approxima-
tion, and importance of degree of approximation have been explained. The applications of
summability methods in approximation of the signals, the behavior of the Fourier series
of a piecewise smooth function (Gibbs phenomenon), Fourier series of signals of bounded
bandwidth, filtering by Fourier transforms, and applications of summability technique
and Fourier series have been described.
Chapter 3 describes the basics of soft computing and their applications. The key goal
of soft computing is to develop intelligent machines to provide solutions to real-world
problems, which are difficult to model mathematically.
Chapter 4 describes the study on solving transportation problems under a multi-objec-
tive environment. The main focus of this chapter is to introduce a new approach for solv-
ing multi-objective transportation problems in addition to the existing approaches such
as goal programming, fuzzy programming, and revised multi-choice goal programming.
In the proposed approach a procedure to obtain a Pareto-optimal solution of a multi-
objective transportation problem using the Vogel approximation method is incorporated.
The merits and demerits of the approaches goal programming, fuzzy programming, and
revised multi-choice goal programming compared to our new approach to solving a multi-
objective transportation problem are presented.
Chapter 5 provides the study of simultaneous optimization of yield and viscosity of
the pulp cooking process using the dual-response surface methodology. The pulp cook-
ing process is an important step in the manufacturing of rayon grade pulp. The pulp is
the cellulose component of the wood. The cellulose is separated from other components
and impurities of wood by cooking the wood chips in a highly pressurized chamber fol-
lowed by multiple stages of washing and chemical treatments. The study is undertaken
to increase the pulp yield as far as possible without increasing the viscosity beyond the
specified upper limit.
Chapter 6 gives the concept of a time-dependent conflicting bifuzzy set (CBFS), and a
new procedure to construct the membership and nonmembership functions of the fuzzy
reliability function is proposed with the help of time-dependent CBFS. The concept of
triangular CBFS has been developed, and triangular CBFS is used to represent the failure
rate function of the system.
Chapter 7 focuses on the failure time data analysis based on the nonhomogeneous
Poisson process (NHPP) and discusses several statistical estimation methods for a periodic
replacement problem with minimal repair as the simplest application of life data analysis
with NHPP. Not only is the parametric maximum likelihood estimation of the power law
process applied, but also constrained nonparametric maximum likelihood estimation
(CNPMLE) and kernel-based estimation methods are used for estimating the cost-optimal
periodic replacement policy with minimal repair, where single or multiple minimal repair
data are assumed.
Chapter 8 extends the available literature and discusses the important attribute “view-
count” of content dynamically. With the Internet emerging as a rapidly growing new mar-
ket, the netizens are also growing at a fast pace. Making use of this ideology, a modeling
framework whose utility has been highlighted through three models is proposed that
describes the growing Internet market size and repeat viewers. The models have been
validated on a YouTube entertainment video data set.
Chapter 9 analyzes dual-market modeling, an increasingly important concept in marketing.
In order to inculcate heterogeneity and the requirement that some technologies be adopted
differently in different geographical locations, the authors study this behavior through a
mathematical framework that exhibits the dual-market phenomenon.
Chapter 10 presents a uniform methodology for three fundamental problems in data
analysis: identification/detection of atypical elements (outliers), clustering, and classifica-
tion. Such a unification facilitates understanding of the material and adapting it to the
individual needs and preferences of particular users. The investigated material is ready
to use, is practically parameter free, and does not require laborious exploration from the
researcher. This has been illustrated with a number of applications in the fields of engi-
neering, management, medicine and biology, as well as supplemented by a thematic bibli-
ography extending the issues presented.
Chapter 11 analyzes the statistical tolerance limits used today in both production and
research. It is often desirable to have statistical tolerance limits available for the distribu-
tions used to describe time-to-failure data in reliability problems. For example, one might
wish to know if at least a certain proportion of a manufactured product will operate for
at least, say, the warranty period. This question cannot usually be answered exactly, but
it may be possible to determine a lower tolerance limit, based on a preliminary random
sample, such that one can say with a certain confidence that at least a specified proportion
or more of the product will operate longer than the lower tolerance limit. Then reliability
statements can be made based on the lower tolerance limit, or, decisions can be reached by
comparing the tolerance limit to the warranty period.
Chapter 11 also presents a new technique for constructing exact lower and upper tolerance
limits on outcomes (for example, on order statistics) in future samples. The technique
emphasizes pivotal quantities relevant for obtaining tolerance factors and is applicable
whenever the statistical problem is invariant under a group of transformations that acts
transitively on the parameter space. The proposed technique is based on a probability
transformation and pivotal quantity averaging. It is conceptually simple and easy to use.
The discussion is restricted to one-sided tolerance limits.
Chapter 12 deals with the design of torque-based PID controller and tuning its gains
with the help of two algorithms, namely, modified chaotic invasive weed optimization
(MCIWO) and modified chaotic invasive weed optimization-neural network (MCIWO-NN)
algorithms for the biped robot while walking on a staircase. An analytical method has
been developed to generate the gaits and design the torque-based PID controller. The
dynamics of the biped robot utilized in the said controller have been derived after utiliz-
ing the Lagrange–Euler formulation. Alongside, the authors utilized the MCIWO algo-
rithm to optimize the gains of the PID controller. Further, in the MCIWO-NN algorithm,
the MCIWO algorithm is used to evolve the architecture of the NN, which helped in pre-
dicting the gains of the PID controller in an adaptive manner. The developed algorithms
are tested in computer simulations and on a real biped robot.
In Chapter 13, intelligent predictive models for modeling fertility of Murrah bulls
using the various emerging machine learning (ML) algorithms, namely, neural networks
(NNs), support vector regression (SVR), decision trees (DTs), and random forests (RFs), and
a conventional linear model (LM) for regression have been described. These intelligent
ML models would provide decision support to organized dairy farms for selecting good
bulls. Hence, the ML models can be employed as a plausible alternative to linear regres-
sion models to assess more accurately the conception rate in Murrah breeding bulls at the
organized farms.
In Chapter 14, the computational study has been performed on the two-jet vectoring
through the Coanda surface, realizing the future concept of the vertical and short takeoff
and landing (V/STOL) of an aircraft for civil aviation purposes. The set of computations
has been performed from the incompressible flow regime to the onset of the compressible
flow regime.
the flow characteristics using the computational fluid dynamics technique.
Chapter 15 deals with the discussion on the collocation method which is a very well-
known numerical technique. Along with the description of the methodology adopted,
details are given for the used B-spline basis function in the collocation method. The prop-
erties of B-spline basis functions are discussed in this chapter with a description of the
types and degrees of B-spline basis functions.
In Chapter 16, utilizing Rayleigh’s approximation method, an attempt has been
made to study the reflection and refraction patterns in a corrugated interface sand-
wiched between an initially stressed fluid-saturated poroelastic half-space and a highly
anisotropic half-space. The highly anisotropic half-space is considered as triclinic.
Various two-dimensional plots have been drawn to show the effects of some affecting parameters.
Mangey Ram
Dehradun, India
J. Paulo Davim
Aveiro, Portugal
Acknowledgments
The editors acknowledge CRC Press for this opportunity and professional support. Also,
we would like to thank all the chapter authors and reviewers for their availability for this
work.
Editors
Mangey Ram received the PhD in mathematics and minor in computer science from
G. B. Pant University of Agriculture and Technology, Pantnagar, India, in 2008. He has
been a faculty member for around 10 years and has taught several core courses in pure
and applied mathematics at undergraduate, postgraduate, and doctorate levels. He is cur-
rently a professor at Graphic Era (Deemed to be University), Dehradun, India. Before
joining Graphic Era (Deemed to be University), he was a deputy manager (probation-
ary officer) with Syndicate Bank for a short period. He is editor-in-chief of International
Journal of Mathematical, Engineering and Management Sciences; and the guest editor and
member of the editorial boards of many journals. He is a regular reviewer for interna-
tional journals, including those published by the Institute of Electrical and Electronics
Engineers, Elsevier, Springer, Emerald, John Wiley, Taylor & Francis Group, and many
other publishers. He has published 125 research publications (with the Institute of
Electrical and Electronics Engineers, Springer, Emerald, World Scientific, among others)
in national and international journals of repute, and has also presented his work at
national and international conferences. His fields of research are reliability
theory and applied mathematics. Ram is a senior member of the Institute of Electrical
and Electronics Engineers, member of the Operational Research Society of India, the
Society for Reliability Engineering, Quality and Operations Management in India, the
International Association of Engineers in Hong Kong, and the Emerald Literati Network
in the United Kingdom. He has been a member of the organizing committees of a num-
ber of international and national conferences, seminars, and workshops. He has been
conferred with the Young Scientist Award by the Uttarakhand State Council for Science
and Technology, Dehradun, in 2009. He has been awarded the Best Faculty Award in 2011
and recently the Research Excellence Award in 2015 for his significant contributions in
academics and research at Graphic Era (Deemed to be University).
J. Paulo Davim received his PhD in mechanical engineering in 1997, and MSc in mechani-
cal engineering (materials and manufacturing processes) in 1991, the Dipl-Ing engineer’s
degree (5 years) in mechanical engineering in 1986, from the University of Porto (FEUP),
the Aggregate title (Full Habilitation) from the University of Coimbra in 2005, and the
DSc from London Metropolitan University in 2013. He is Eur Ing by FEANI-Brussels and
senior chartered engineer by the Portuguese Institution of Engineers with a MBA and
Specialist title in engineering and industrial management. Currently, he is a professor at
the Department of Mechanical Engineering of the University of Aveiro, Portugal. He has
more than 30 years of teaching and research experience in manufacturing, materials and
mechanical engineering with special emphasis in machining and tribology. He also has
interest in management and industrial engineering and higher education for sustain-
ability and engineering education. He has guided large numbers of postdoc, PhD, and
master’s degree students. He has received several scientific awards. He has worked as
the evaluator of projects for international research agencies as well as examiner of PhD
theses for many universities. He is the editor-in-chief of several international journals,
guest editor of journals, books editor, book series editor, and scientific advisor for many
international journals and conferences. Presently, he is an editorial board member of 25
international journals and acts as a reviewer for more than 80 prestigious Web of Science
journals. In addition, he has published as editor (and co-editor) more than 100 books
and as author (and co-author) more than 10 books, 70 book chapters, and 400 articles in
journals and conferences (more than 200 articles in journals indexed in Web of Science
core collection/h-index 41+/5000+ citations and SCOPUS/h-index 51+/7500+ citations).
Contributors
N. Aggrawal
Department of Computer Science & Information Technology
Jaypee Institute of Information Technology
Noida, Uttar Pradesh, India

R. Aggarwal
Department of Operational Research
University of Delhi
Delhi, India

Dinesh Bisht
Department of Mathematics
Jaypee Institute of Information Technology
Noida, Uttar Pradesh, India

Atish Kumar Chakravarty
Computer Centre & Dairy Economics, Statistics & Management Division
ICAR-National Dairy Research Institute
Karnal, Haryana, India

Anuj Kumar
Department of Mathematics
University of Petroleum & Energy Studies
Dehradun, Uttarakhand, India

Gurupada Maity
Department of Applied Mathematics with Oceanology and Computer Programming
Vidyasagar University
Midnapore, West Bengal, India

Ravinder Malhotra
Computer Centre & Dairy Economics, Statistics & Management Division
ICAR-National Dairy Research Institute
Karnal, Haryana, India

Ravi Kumar Mandava
School of Mechanical Sciences
IIT Bhubaneswar
Bhubaneswar, Odisha, India

Lubov Mironova
Institute of Applied Technology
Russian University of Transport (MIIT)
Moscow, Russia

Sangeeta Pant
Department of Mathematics
University of Petroleum & Energy Studies
Dehradun, Uttarakhand, India

Mangey Ram
Department of Mathematics, Computer Science & Engineering
Graphic Era (Deemed to be University)
Dehradun, Uttarakhand, India

Sankar Kumar Roy
Department of Applied Mathematics with Oceanology and Computer Programming
Vidyasagar University
Midnapore, West Bengal, India

Yasuhiro Saito
Department of Maritime Safety Technology
Japan Coast Guard Academy
Kure, Japan

Adesh Kumar Sharma
Computer Centre & Dairy Economics, Statistics & Management Division
ICAR-National Dairy Research Institute
Karnal, Haryana, India
chapter one
Application of the Laplace transform in problems
Leonid Kondratenko
Moscow Aviation Institute (State National Research University)
Contents
1.1 Designation......................................................................................................................2
1.2 Laplace transform and operations mapping......................................................................2
1.3 Linear substitutions................................................................................................................ 7
1.4 Differentiation and integration.............................................................................................9
1.5 Multiplication and convolution...................................................................................... 11
1.6 The image of a unit function and some other simple functions.................................... 13
1.7 Examples of solving some problems of mechanics......................................................... 18
1.8 Laplace transform in problems of studying oscillation of rods..................................... 23
1.9 Relationship between the velocities of the particles of an elementary volume
of a cylindrical rod with stresses........................................................................................ 24
1.10 An inertial disk rotating at the end of the rod................................................................. 25
1.11 Equations of torsional oscillations of a disk..................................................................... 26
1.12 Equations of longitudinal oscillations of a disk............................................................... 27
1.13 Application of the Laplace transform in engineering technology................................30
1.13.1 Method of studying oscillations of the velocities of motion and stresses in mechanisms containing rod systems................................................... 30
1.13.2 Features of functioning of a drive with a long force line.................................... 31
1.13.3 Investigation of dynamic features of the system in the technologies of deep-hole machining......................................................................................... 32
References........................................................................................................................................ 33
This chapter is written by engineers for engineers. The authors try to convey to the reader
the simplicity and accessibility of the methods in a concise form, illustrated with
calculation schemes. For a more extensive study of the stated problems of mathematical
modeling, literature sources are given at the end of the chapter, from which the reader
can obtain the necessary additional explanations. The list of references includes well-known
scientists in the field of mathematics and mechanics: G. Doetsch, A.I. Lur'e, L.I. Sedov,
V.A. Ivanov, and B.K. Chemodanov. In compiling the theoretical material, we refer to the
authors mentioned. This chapter reflects the experience of lecturing on mathematical
methods of modeling, as well as the authors' personal participation in work in this
technical field.
The material presented can be of interest to students, graduate students, and other
specialists.
1.1 Designation
j — the imaginary unit; e — the base of natural logarithms;
α = σ + jω — a complex number;
Re — the real part, Im — the imaginary part of a complex number;
s — a complex variable; s = x + jy, x = Re s, y = Im s;
L — the transformation (Laplace transform);
F(s) — a function of the complex variable s (the Laplace image);
f(t) — a function of the real variable t (the original);
L[f(t)] — the direct Laplace transform;
L⁻¹[F(s)] — the inverse Laplace transform; and
→ — the sign of correspondence of the transformation:
for the direct transformation, f(t) → F(s); for the inverse transformation, F(s) ← f(t).
In many formulas, fractions are not written in standard notation but with a slash, or with
a factor raised to a negative power (n), and irrational numbers are expressed as numbers
with fractional exponents. For a correct understanding of these symbols, examples are given:

$$a/bc = \frac{a}{bc}, \quad a + b/c = a + \frac{b}{c}, \quad a/(b + c) = \frac{a}{b + c}, \quad a(bc + d)^{-1} = \frac{a}{bc + d}, \quad a^{1/2} = \sqrt{a}, \quad b^{-1/3} = \frac{1}{\sqrt[3]{b}}.$$
1.2 Laplace transform and operations mapping
The Laplace transform is a powerful mathematical method for solving differential, differ-
ence, and integral equations. By means of these equations, one can describe any physical
(technological) process and conduct mathematical modeling of the behavior of the object
and of the reaction of the environment under the influence of force or other factors, inves-
tigate the dynamic properties of the element of construction, and much more.
In many engineering problems, it is important to investigate a function f(t), where real
variable t is time. Such problems in mechanics relate to dynamic problems.
The simplest and most economical solution of such problems is possible with the help
of methods of the theory of operational calculus [1].
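As a preview (a sketch added here, not part of the original text), the operational approach reduces a differential equation to an algebraic one in the image domain. A minimal example in Python, assuming the sympy library, for the initial-value problem y′ + 2y = 0, y(0) = 1:

```python
import sympy as sp

t, s = sp.symbols('t s', positive=True)
Y = sp.symbols('Y')

# Transforming y' + 2y = 0 with y(0) = 1 gives the algebraic equation
#   s*Y(s) - y(0) + 2*Y(s) = 0,
# using the differentiation rule L[y'] = s*Y(s) - y(0).
Ysol = sp.solve(sp.Eq(s*Y - 1 + 2*Y, 0), Y)[0]   # Y(s) = 1/(s + 2)

# Return to the original by the inverse transform
y = sp.inverse_laplace_transform(Ysol, s, t)      # y(t) = exp(-2*t) for t > 0
print(Ysol, y)
```

The algebraic step replaces differentiation, which is the essence of the operational method developed in this chapter.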
An important role in applied mathematical analysis is played by the Laplace integral

$$I = \int_0^{\infty} f(t)\, e^{-st}\, dt. \tag{1.1}$$
If this integral converges, it defines a function of the complex variable s:

$$F(s) = \int_0^{\infty} f(t)\, e^{-st}\, dt. \tag{1.2}$$
$$L[f(t)] = F(s) = \int_0^{\infty} f(t)\, e^{-st}\, dt. \tag{1.3}$$
The record L[f(t)] means L-transformation. The Laplace transform connects the single-
valued function F(s) of the complex variable s (image) with the corresponding function f(t)
of the real variable t (the original). A brief description of the essence of the Laplace trans-
form and the correspondence table of operations can be found in Ref. [2].
As can be seen from (1.2), this transformation consists of multiplying the function f(t)
by the exponential function e^{−st} and integrating the product of these functions with
respect to the argument t in the range from 0 to ∞.
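As an illustration (a sketch added here, assuming Python's sympy library), the defining integral (1.2) can be evaluated symbolically for f(t) = e^{−at}:

```python
import sympy as sp

t, s, a = sp.symbols('t s a', positive=True)
f = sp.exp(-a*t)

# Evaluate the defining integral directly: F(s) = integral of f(t)*e^(-s*t) over [0, oo)
F_direct = sp.integrate(f*sp.exp(-s*t), (t, 0, sp.oo))

# The same image via sympy's built-in Laplace transform
F_builtin = sp.laplace_transform(f, t, s, noconds=True)

print(sp.simplify(F_direct), sp.simplify(F_builtin))  # both equal 1/(a + s)
```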
From the image of F(s), if it exists, one can always find the original f(t). Such a transi-
tion is called the inverse Laplace transform, symbolically denoted by L −1, and corresponds to
$$f(t) = L^{-1}[F(s)] = \frac{1}{2\pi j} \int_{x - j\infty}^{x + j\infty} F(s)\, e^{st}\, ds, \quad t > 0. \tag{1.4}$$
[Figure 1.1: A piecewise-continuous original f(t) with a discontinuity of the first kind at t = t₁; the one-sided limits f(t₁ − 0) and f(t₁ + 0) and the initial value f(0) are marked.]
The right-hand side of Equation (1.4) is called the inverse of the Laplace integral and is a com-
plex integral.
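In practice the inversion integral (1.4) is seldom evaluated directly; one uses correspondence tables or a computer algebra system. A brief sketch (assuming sympy; not from the original text):

```python
import sympy as sp

t, s = sp.symbols('t s', positive=True)

# Recover the original of F(s) = 1/(s**2 + 1); for t > 0 it equals sin(t)
f = sp.inverse_laplace_transform(1/(s**2 + 1), s, t)
print(f)
```

Depending on the sympy version, the result may carry a Heaviside(t) factor, reflecting the convention f(t) = 0 for t < 0.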
If relation (1.4) is rewritten in the form

$$\int_{-\infty}^{+\infty} e^{jyt} F(x + jy)\, dy = 2\pi e^{-xt} f(t) \ \text{ for } t > 0; \qquad \int_{-\infty}^{+\infty} e^{jyt} F(x + jy)\, dy = 0 \ \text{ for } t < 0, \tag{1.5}$$

then formulas (1.2) and (1.5) have a physical meaning. For a constant value of x in the
complex variable s = x + jy, the function F(x + jy) is the spectral density of the damped
time function e^{−xt} f(t), for which the variable y is the circular frequency. Such a change
of the variable s in the complex plane corresponds to the displacement of a point along the
vertical line with abscissa x (Figure 1.2).
From the mathematical point of view, multiplying the function f(t) by e^{−st} makes the
improper integral on the right-hand side of expression (1.2) convergent in the half-plane
Re s > x₀ (Figure 1.3).
The function f(t) can be an original only if the following conditions are satisfied:
[Figure 1.2: A point s = x + jy in the complex plane; varying y moves the point along the vertical line with abscissa x.]
[Figure 1.3: The half-plane of convergence Re s > x₀ in the complex s-plane.]
1. The function f(t) is continuous for all values t ≥ 0. Continuity can be violated only at points of discontinuity of the first kind; the number of these points must be finite in any interval of limited length (Figure 1.1).
2. The function f(t) = 0 for values t < 0.
3. The function f(t) has a limited order of growth, i.e., one can find constants M > 0 and x₀ ≥ 0 such that |f(t)| ≤ M e^{x₀ t} for all t ≥ 0 (x₀ is called the growth index of f(t)).
Theorem 1.1: If the function f(t) is an original, then this function is Laplace trans-
formed and the image of the given function F(s) is defined in the half-plane Re s > x0,
where x0 is the growth index of the function f(t).
Under the condition Re s > x0, the integral (1.2) is an absolutely convergent inte-
gral. The number x0 is called the abscissa of absolute convergence of the integral (1.2).
Theorem 1.2: The image F(s) of the original f(t) in the half-plane, for which Re s > x0,
where x0 is the growth index of the original, is an analytic function.
The continuity of the function F(s) follows from the proof of Theorem 1.1.
This important property makes it possible to use powerful methods of the theory
of functions of a complex variable in calculations, because in the practical application
of the Laplace transform, calculations are performed not on the given functions
themselves but on their images.
The analytic expression of the original through the image is formulated by
Theorem 1.3.
Theorem 1.3: The original f(t) at points of continuity is defined by

$$f(t) = \frac{1}{2\pi j} \int_{x - j\infty}^{x + j\infty} F(s)\, e^{st}\, ds, \tag{1.6}$$
where F(s) is the Laplace representation of the original f(t), and the integral on the
right-hand side of this equation is understood in the sense of the principal value
www.Technicalbookspdf.com
6 Advanced Mathematical Techniques in Engineering Sciences
[Figure 1.4: The path of integration in (1.6), a straight line parallel to the imaginary axis in the half-plane Re s > x₀.]
$$\int_{x - j\infty}^{x + j\infty} F(s)\, e^{st}\, ds = \lim_{y \to \infty} \int_{x - jy}^{x + jy} F(s)\, e^{st}\, ds,$$

and this is taken along a straight line parallel to the imaginary axis and located in the
half-plane Re s > x₀ (Figure 1.4).
Formula (1.6) is called the Laplace inversion formula and establishes a connection
between the image of F(s) and the single-valued corresponding original f(t). The pro-
cess of obtaining the original from a given image is written by the expression (1.4).
The formula (1.6) defines the original only at the points of its continuity. However,
for the piecewise-continuous functions f(t), illustrated in Figure 1.1, the limit of the
right-hand side of (1.6) at the points of discontinuity of the first kind exists and is
defined by
$$\lim_{y \to \infty} \frac{1}{2\pi j} \int_{x - jy}^{x + jy} F(s)\, e^{st}\, ds = \frac{1}{2}\left[ f(t + 0) + f(t - 0) \right].$$
From this follows another important property: the uniqueness of the Laplace transform.
An original always corresponds to a single image, since the values of the original
at points of discontinuity do not change the image. At the same time, the same image
can be associated with a set of originals whose values differ from each other only at
points of discontinuity [4].
Corollary of Theorem 1.3: If the original is a differentiable function everywhere in
the interval 0 < t < ∞, then the original with respect to the given image is uniquely
determined.
It should be noted that not all analytic functions can be images. In particular,
periodic functions of the form e^{αs}, cos s, and sin s are not images, and not all functions
can be originals (for example, 1/t, tan ωt, e^{t²}). The proof of the theorem that gives sufficient
conditions for the function F(s) to be an image can be found in Refs. [4,5].
The Laplace transform is the result of the extension of the Fourier transform to
functions that satisfy the Dirichlet conditions in the interval 0 < t < ∞ but do not sat-
isfy the condition of absolute integrability in this interval. The connection between
the Fourier transform and the Laplace transform is clearly presented in Ref. [3]. The
Fourier and Laplace transforms are widely used in the theory of automatic regulation.
Chapter one: Application of the Laplace transform in problems
The next important property of the Laplace transform is the linearity of the trans-
formation, which is formulated by a theorem that establishes the “original-image”
correspondence.
Theorem 1.4: If the functions f1(t), f2(t), …, fn(t) are originals, and the images of these
functions are, respectively, F1(s), F2(s), …, Fn(s), and if λ1, λ2, …, λn are quantities that do
not depend on t and s, then the following equalities hold:
L[∑_{k=1}^{n} λk fk(t)] = ∑_{k=1}^{n} λk Fk(s); (1.7)

L^{−1}[∑_{k=1}^{n} λk Fk(s)] = ∑_{k=1}^{n} λk fk(t). (1.8)
In the practical application of the Laplace transform, the linearity property allows
calculations to be performed not on the given functions but on their images, using
the table of correspondences between the originals and the images. In this case, one
needs to know not only the images of individual functions, but also the rules for
mapping the operations performed on such functions. Therefore, in what follows we
formulate further properties of the transformation (differentiation, integration, etc.)
in the form of rules, and we invoke these rules later when solving some mathematical
problems.
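The linearity property is easy to check numerically. The sketch below approximates the Laplace integral by the trapezoidal rule and verifies that the image of a linear combination equals the same combination of the images; the originals sin t and e^{−t}, the coefficients 5 and 3, and the point s = 2 are arbitrary choices made for the illustration:

```python
import math

def laplace(f, s, T=40.0, n=100000):
    # Trapezoidal approximation of F(s) = integral of exp(-s*t)*f(t) over [0, T];
    # T is chosen large enough that the neglected tail is negligible for Re s > 0.
    h = T / n
    total = 0.5 * (f(0.0) + math.exp(-s * T) * f(T))
    for k in range(1, n):
        t = k * h
        total += math.exp(-s * t) * f(t)
    return total * h

f1 = math.sin                    # original f1(t) = sin t,  image 1/(s^2 + 1)
f2 = lambda t: math.exp(-t)      # original f2(t) = e^(-t), image 1/(s + 1)

s = 2.0
lhs = laplace(lambda t: 5.0 * f1(t) + 3.0 * f2(t), s)  # image of 5*f1 + 3*f2
rhs = 5.0 * laplace(f1, s) + 3.0 * laplace(f2, s)      # 5*F1(s) + 3*F2(s)
print(abs(lhs - rhs) < 1e-9)                           # linearity (1.7)
print(abs(laplace(f1, s) - 1.0 / (s**2 + 1.0)) < 1e-6) # table entry check
```

Both checks print True; the same helper can be reused to test the other rules below.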
1.3 Linear substitutions
We give the rules for the linear transformation of an argument in the original or in the
image. For clarity, instead of the symbolic designation of the transformation, we introduce
arrows indicating the direct and inverse Laplace transformations.
Rule I. Theorem 1.5: Similarity theorem. Multiplying the argument of the original
(image) by a positive number results in the division of the image (the original)
and of its argument by the same positive number:

f(at) → (1/a)F(s/a); F(as) ← (1/a)f(t/a), a > 0. (1.9)
This operation characterizes the change in the scale of the independent variable.
Rule II. Theorem 1.6: First displacement theorem (the lag theorem). If the function f(t) is
an original and F(s) is its image, then the image of the displaced original f(t − a), where
a is a real number, is determined by the expression

f(t − a) → e^{−as}F(s), where f(t − a) = 0 for t < a.
For t < a the argument t − a is negative, so the function f(t − a) is equal to zero.
The graph of this function is obtained from the graph of the function f(t) by shifting
it to the right by a distance a (Figure 1.5a, b).
The displacement theorem has a wide application in the theory of automatic reg-
ulation, as well as in the study of processes described by piecewise-continuous and
periodic functions.
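A minimal numerical sketch of the lag theorem: the original f(t) = 1 − e^{−t} with image F(s) = 1/(s(s + 1)), and the values a = 1.5, s = 2, are arbitrary choices; the shifted original, set to zero for t < a, must reproduce e^{−as}F(s):

```python
import math

def laplace(f, s, T=40.0, n=100000):
    # trapezoidal approximation of the Laplace integral on [0, T]
    h = T / n
    total = 0.5 * (f(0.0) + math.exp(-s * T) * f(T))
    for k in range(1, n):
        t = k * h
        total += math.exp(-s * t) * f(t)
    return total * h

a, s = 1.5, 2.0
f = lambda t: 1.0 - math.exp(-t)                 # f(0) = 0, F(s) = 1/(s(s+1))
shifted = lambda t: f(t - a) if t >= a else 0.0  # displaced original f(t - a)

F = 1.0 / (s * (s + 1.0))
print(abs(laplace(shifted, s) - math.exp(-a * s) * F) < 1e-6)  # lag theorem
```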
Rule III. Theorem 1.7: Second displacement theorem (the bias theorem). If the function f(t) is the
original and F(s) is the image, then the image of the displaced original f(t + a), where
a is a real number, is determined by the expression

f(t + a) → e^{as}[F(s) − ∫₀^a e^{−st} f(t) dt]; a > 0. (1.12)
The essence of this theorem is that the image F(s) cannot be transformed into the
original f(t + a) by a simple factor, since the right-hand side of Equation (1.12)
contains a finite Laplace integral, which is evaluated over the interval 0 ≤ t < a
of the real variable.
This rule is the converse of Rule II: the graph of the function f(t) is
shifted to the left by a distance a (Figure 1.5a, c).
The bias theorem determines the ratio of the image and the original in the case
when the complex variable is displaced by a [1]:
F(s + a) ← e^{−at} f(t) + a ∫₀^t e^{−aτ} f(τ) dτ. (1.13)
Figure 1.5 (a) The original f(t); (b) the displaced original f(t − a), shifted right by a; (c) the displaced original f(t + a), shifted left by a.
e^{αt} f(t) → F(s − α) (1.14)

or

e^{−αt} f(t) → F(s + α). (1.15)
If we set the initial value f(+0) = 0, then from formula (1.16) we obtain f′(t) → sF(s).
If the derivatives of higher orders f^{(2)}(t), f^{(3)}(t), …, f^{(n)}(t) are originals, then the following
relation holds:

f^{(n)}(t) → sⁿF(s) − ∑_{k=1}^{n} s^{n−k} f^{(k−1)}(+0). (1.18)
The essence of Rule V is as follows. Differentiation, which in the original space
is a transcendent process [3], is replaced in the image space by multiplication of the
image by a power of the argument s with the simultaneous subtraction of a polynomial
whose coefficients are the initial values of the original.
Rule V assumes that the derivative of the highest order f (n) (t) exists at each point t > 0
and has an image. This rule is especially valuable in solving differential equations.
In the operational calculus, instead of the Laplace integral (1.3), we prefer to
consider the function

F(s) = sL[f(t)]; F(s) = s ∫₀^∞ e^{−st} f(t) dt (1.19)

or

F(s)/s = ∫₀^∞ e^{−st} f(t) dt. (1.20)
Taking into account (1.19), we give the most important relations of operational calculus:

f′(t) → s[F(s) − f(0)]; f″(t) → s²[F(s) − f(0) − f′(0)/s];

f^{(n)}(t) → sⁿ[F(s) − f(0) − f′(0)/s − ⋯ − f^{(n−1)}(0)/s^{n−1}]. (1.21)
There are no contradictions between the formula (1.18) for differentiating the original
and the expressions (1.21). According to the rule for calculating the integral,
these expressions differ only in the integration constants: in (1.18) these constants
are real, while in (1.21) they are complex quantities.
The essence of the function (1.19) lies in the fact that in transformations we work
in image space only with analytic functions and their initial values. This important
technique is widely used in mechanics and other technical applications. Then, using
a concrete example, we show the advantages of applying formulas (1.21).
Rule VI. Theorem 1.10: Differentiation theorem for an image.
If the function f(t) is an original and F(s) is its image, then the following equality holds:

L[t f(t)] = −dF(s)/ds. (1.22)
Since the image of F(s) is always an analytic function and possesses all derivatives, on
the basis of (1.22) one can obtain derivatives of any order.
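For instance, for f(t) = e^{−t} the image is F(s) = 1/(s + 1), so (1.22) predicts L[t f(t)] = −dF/ds = 1/(s + 1)². A quick quadrature check (the original and the point s = 2 are arbitrary choices):

```python
import math

def laplace(f, s, T=40.0, n=100000):
    # trapezoidal approximation of the Laplace integral on [0, T]
    h = T / n
    total = 0.5 * (f(0.0) + math.exp(-s * T) * f(T))
    for k in range(1, n):
        t = k * h
        total += math.exp(-s * t) * f(t)
    return total * h

s = 2.0
left = laplace(lambda t: t * math.exp(-t), s)  # L[t f(t)] for f(t) = e^(-t)
right = 1.0 / (s + 1.0)**2                     # -dF/ds for F(s) = 1/(s + 1)
print(abs(left - right) < 1e-6)
```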
Thus, the operation of differentiating the image with respect to s corresponds to the
operation of multiplying the original by the independent variable t taken with the
opposite sign.
Rule VII. Theorem 1.11: If the function f(t) is an original, and F(s) is the image, then
the integral of the original is also an original, and the following equality holds:

L[f^{−1}(t)] = F(s)/s + f^{−1}(+0)/s. (1.24)

Here, f^{−1}(t) = ∫ f(t) dt = ∫₀^t f(τ) dτ + f^{−1}(+0), where f^{−1}(+0) is the integration constant.
Hence, under the condition that f^{−1}(+0) = 0, Rule VII is formulated.
The operation of integrating the original corresponds to the operation of divid-
ing the image of this original by the complex number s:
∫₀^t f(t) dt → (1/s)F(s). (1.25)
Let f^{(−k)}(t) = ∫∫⋯∫ f(t)(dt)^k denote the k-fold integral of f(t); then

L[f^{(−n)}(t)] = F(s)/sⁿ + ∑_{k=1}^{n} f^{(−k)}(+0)/s^{n−k+1}. (1.26)
The rule of integration for an image is rarely used in practice, so we do not give it here.
The expressions for the derivative and integral representations are of primary
importance in operational calculus; therefore, the number s acquires the character of
the operator [1].
Rule IX. Theorem 1.13: Convolution theorem. If the functions f1(t) and f2(t) are originals
and their images are, respectively, F1(s) and F2(s), then the following equality holds:
L[∫₀^t f1(t − τ) f2(τ) dτ] = F1(s)F2(s). (1.28)
Here the integral combination of functions is called the convolution of the originals
f1(t) and f2(t) and is denoted by
f1 ∗ f2 = ∫₀^t f1(t − τ) f2(τ) dτ. (1.29)

F1F2 → ∫₀^t f1(t − τ) f2(τ) dτ, t > 0. (1.30)
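The convolution theorem can be illustrated numerically: for f1(t) = e^{−t} and f2(t) = e^{−2t} (arbitrary choices) the image of their convolution must equal F1(s)F2(s) = 1/((s + 1)(s + 2)). A coarse two-level quadrature sketch:

```python
import math

f1 = lambda t: math.exp(-t)        # F1(s) = 1/(s + 1)
f2 = lambda t: math.exp(-2.0 * t)  # F2(s) = 1/(s + 2)

def convolve(t, m=400):
    # trapezoidal approximation of (f1 * f2)(t) = integral of f1(t-τ) f2(τ), τ in [0, t]
    if t == 0.0:
        return 0.0
    h = t / m
    total = 0.5 * (f1(t) * f2(0.0) + f1(0.0) * f2(t))
    for k in range(1, m):
        tau = k * h
        total += f1(t - tau) * f2(tau)
    return total * h

def laplace(g, s, T=15.0, n=3000):
    # trapezoidal approximation of the Laplace integral of g on [0, T]
    h = T / n
    total = 0.5 * (g(0.0) + math.exp(-s * T) * g(T))
    for k in range(1, n):
        t = k * h
        total += math.exp(-s * t) * g(t)
    return total * h

s = 2.0
product = 1.0 / ((s + 1.0) * (s + 2.0))         # F1(s)F2(s) = 1/12
print(abs(laplace(convolve, s) - product) < 1e-4)
```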
L[f1(t)f2(t)] = (1/(2πj)) ∫_{x−j∞}^{x+j∞} F1(s − w)F2(w) dw. (1.31)

f1(t)f2(t) → (1/(2πj)) ∫_{x−j∞}^{x+j∞} F1(w)F2(s − w) dw, x1 ≤ x < Re s − x2; (1.32)

f1(t)f2(t) → (1/(2πj)) ∫_{x−j∞}^{x+j∞} F1(s − w)F2(w) dw, x2 ≤ x < Re s − x1. (1.33)
The domains of absolute convergence of these integrals are illustrated in Figure 1.6.
Figure 1.6 The domains of absolute convergence of the integrals (1.32) and (1.33) in the s-plane, bounded by the abscissas x1, x, x2.
The reader will find a more detailed exposition of this question in Ref. [3].
The left-hand-side integral (1.34) is called the quadratic quality criterion.
In optimization processes, minimizing this integral is the defining characteristic.
Figure 1.7 (a) The unit function u(t); (b) the shifted unit function u(t − a).
In mechanics, the initial values of a function (i.e., the values of the function at t = 0)
are used in solving problems. We denote the initial function by u0(t) and the shifted
function, respectively, by u0(t − τ). Taking (1.18) into account, we can write for them
u0 (t) → 1; (1.39)
u0 (t − τ ) → e − sτ . (1.40)
The physical meaning of the initial function is that at time t = 0 it takes the value of the
constant C. It follows from (1.39) that any constant C is an image of the same constant. In
the case of the Laplace transform, we always mean that the "constant" (initial function) is
a function of t, which vanishes for t < 0 and is equal to C for t > 0 [1].
The third case. Special functions. The category of special functions includes the Dirac delta
function, also called the first-order impulsive unit function. The delta function is defined by
δ (t) = 0, for t ≠ 0,
(1.41)
δ (t) = ∞, for t = 0.
∫_{−∞}^{+∞} δ(t) dt = 1. (1.42)
Conditions (1.41) and (1.42) are incompatible from the point of view of classical mathemati-
cal analysis, and therefore the delta function does not belong to the “function” in the usual
sense. However, in the class of generalized functions, the delta function occupies an equal
place [2]. The notion of a "delta function" turns out to be significant when extending the
operation of differentiation to discontinuous functions. For example, the sequence of functions

f_δ(t, a) = [u0(t) − u0(t − a)]/a,

characterizing pulses of height 1/a and duration a (Figure 1.8), converges to the
delta function as a → 0. For example, the function
has a physical meaning in mechanics as a force of constant magnitude, acting for a period
of time a. The momentum of this force over its interval of action is equal to one, regardless
of the value of a. Such a function is called a delta function of the first order. This function
is zero for all t except t = 0, where it becomes infinite, so that lim_{a→0} a·u1(t, a) = 1.
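The sifting behavior of the pulse sequence f_δ(t, a) can be observed numerically: its integral against a smooth test function tends to the value of that function at zero as a → 0 (the test function cos t is an arbitrary choice):

```python
import math

def f_delta_smear(a, phi, n=100000):
    # integral of f_δ(t, a)·φ(t) = (1/a) ∫_0^a φ(t) dt, by the midpoint rule,
    # where f_δ(t, a) = [u0(t) - u0(t - a)]/a is 1/a on [0, a) and 0 elsewhere
    h = a / n
    return sum(phi((k + 0.5) * h) for k in range(n)) * h / a

phi = math.cos   # smooth test function, phi(0) = 1

values = [f_delta_smear(a, phi) for a in (0.5, 0.1, 0.01)]
print(values)                        # approaches phi(0) = 1 as a -> 0
print(abs(values[-1] - 1.0) < 1e-3)  # sifting property, delta-like limit
```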
The shifted Dirac delta function δ(t − τ) is defined by

δ(t − τ) = 0 for t ≠ τ,
δ(t − τ) = ∞ for t = τ. (1.44)
Similarly, the displaced unit impulsive force will be denoted as u1 (t − τ), and
u1 (t − τ ) = 0 for t ≠ τ ,
(1.45)
u1 (t − τ ) = ∞ for t = τ .
u1 (t) → s ; u1 (t − τ ) → e − sτ s. (1.46)
The reader will find delta functions of the second order (Figure 1.9) in Refs. [1,2]. We give
only the correspondence between the originals and images
u2 (t) → s 2 ; u2 (t − τ ) → s2 e − sτ . (1.47)
Figure 1.8 A rectangular pulse of height 1/a and duration a.
Figure 1.9 The second-order impulse function: pulses of height 1/a² on the intervals (0, a) and (a, 2a).
The fourth case. The time function is e^{αt}, where α is an arbitrary complex or real number.
We represent such a function in the form
We take the function u(t) as the unit function, which for convenience of calculation we
additionally represent in the form of the multiplier 1(t).
The image of the function (1.48) will be

L[1(t)e^{αt}] = ∫₀^∞ e^{αt} e^{−st} dt = −e^{−(s−α)t}/(s − α) |₀^∞ = 1/(s − α). (1.49)
If α is a complex number, then, depending on the values that the real and imaginary parts
take, function (1.48) characterizes the types of vibrations and motions. Expression (1.48)
takes on an explicit physical meaning. The reader will find a detailed exposition of this
question in Ref. [3].
The graph of the function (1.48), where α is the real negative number (α < 0), is shown
in Figure 1.10.
The fifth case. The function of time is t, and the relation L[t] = 1/s² is valid.
Figure 1.10 The graph of the function 1(t)e^{αt} for α < 0.
We find the image of this function using integration by parts; for Re s > 0 we obtain

L[t] = ∫₀^∞ t e^{−st} dt = −(t e^{−st}/s)|₀^∞ + (1/s)∫₀^∞ e^{−st} dt = −(e^{−st}/s²)|₀^∞ = 1/s². (1.51)
The sixth case. The function of time is tⁿ, where n is a positive integer (1.52).
Using the previous method and repeated integration, we obtain the image of the
function (1.52) in the following form:

L[tⁿ] = n!/s^{n+1}. (1.53)
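Formula (1.53) is easy to confirm by direct quadrature of the Laplace integral; the sketch below checks the case n = 4 at s = 3 (arbitrary choices):

```python
import math

def laplace(f, s, T=40.0, n=100000):
    # trapezoidal approximation of the Laplace integral on [0, T]
    h = T / n
    total = 0.5 * (f(0.0) + math.exp(-s * T) * f(T))
    for k in range(1, n):
        t = k * h
        total += math.exp(-s * t) * f(t)
    return total * h

n_pow, s = 4, 3.0
numeric = laplace(lambda t: t**n_pow, s)
exact = math.factorial(n_pow) / s**(n_pow + 1)   # n!/s^(n+1) = 24/243
print(abs(numeric - exact) < 1e-6)
```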
The seventh case. We represent the image (1/sⁿ)F(s) in the form

(1/sⁿ)F(s) = (1/s)F(s)·(1/s^{n−1}). (1.54)
Using Rule IX (the convolution theorem), we obtain the original corresponding to the
image F(s)/sⁿ:

F(s)/sⁿ → (1/(n − 1)!) ∫₀^t f(τ)(t − τ)^{n−1} dτ. (1.55)
The mechanism of finding the image according to the given original, if this original exists,
reduces to calculating the integral (1.2). For the simplest functions, such an operation does
not present mathematical difficulties.
Therefore, the results of the transformations are given in tables of correspon-
dence between originals and images [1–4]. We give some correspondences in Tables 1.1
and 1.2.
• The Cauchy problem, when all additional conditions are given at one point (as a rule,
the starting point) of the interval
• The boundary value problem, when the additional conditions are given as values of
the function and its derivatives at the boundary of the interval, that is, at the
beginning and at the end of the integration interval
As is known, the solution of such problems is connected with the problem of integrating
partial differential equations under given boundary conditions [6].
We show the advantages of the Laplace transform in solving a differential inhomoge-
neous first-order equation with constant coefficients.
Example: Let some process be described by the following differential equation, which
we call the initial equation:

y′ + c0y = f(t). (1.56)
Equation (1.56) is represented in the original space. In the image space, this equation
corresponds to the depicting equation, which has the form
L[ y ′] + c0 L[ y ] = L[ f (t)]. (1.57)
The symbol L[...] denotes the transformation of the original equation by multiplying
both parts by e^{−st} and integrating from 0 to ∞.
Applying Rule V, we write Equation (1.57) in the images:

sY(s) − y(+0) + c0Y(s) = F(s). (1.58)
Thus, we obtained a linear algebraic equation with the initial value of the function
y(t) corresponding to the value y(+ 0) for the initial point t = 0 (Theorem 1.9).
The solution of this equation is quite simple:

Y(s)(s + c0) = F(s) + y(+0),

Y(s) = F(s)/(s + c0) + y(+0)/(s + c0). (1.59)
For the resulting image Y(s) we find the corresponding original, using the inversion
formula (1.4) and Table 1.1 (item 8):
We note that the first term on the right-hand side of (1.59) is the product of two images,
which, according to Rule IX, corresponds to the convolution of two originals, the first
term in (1.60). Finally, we get
y(t) = ∫₀^t f(τ)e^{−c0(t−τ)} dτ + y(+0)e^{−c0t} = e^{−c0t} ∫₀^t f(τ)e^{c0τ} dτ + y(+0)e^{−c0t}. (1.61)
We note that (1.61) is a solution of the differential equation (1.56) for a given initial value
of the function y(t). This solution can be obtained even easier. We outline the course of
the solution. Let f(t) = 1(t) and y(+0) = 0. Using Tables 1.1 (item 1) and 1.2 (item 6), and
passing from the image to the original, we obtain the solution (1.56) in the following
form:
1(t) → 1/s; Y(s) = 1/[s(s + c0)]; Y(s) ← y(t); y(t) = (1/c0)(1 − e^{−c0t}).
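The result obtained through the images can be cross-checked by integrating (1.56) directly. The sketch below solves y′ = 1(t) − c0·y, y(+0) = 0, with the classical Runge–Kutta method (c0 = 2 is an arbitrary choice) and compares the answer with y(t) = (1 − e^{−c0t})/c0 found above:

```python
import math

c0 = 2.0

def rk4(f, y0, t_end, n=1000):
    # classical fourth-order Runge-Kutta for y' = f(t, y), y(0) = y0
    h = t_end / n
    t, y = 0.0, y0
    for _ in range(n):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h * k1 / 2)
        k3 = f(t + h / 2, y + h * k2 / 2)
        k4 = f(t + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return y

t1 = 1.5
numeric = rk4(lambda t, y: 1.0 - c0 * y, 0.0, t1)  # y' + c0*y = 1(t), y(0) = 0
closed = (1.0 - math.exp(-c0 * t1)) / c0           # result of the Laplace method
print(abs(numeric - closed) < 1e-8)
```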
The solution of a homogeneous second-order differential equation with constant coef-
ficients is given on a concrete example.
Second example. Find the solution of the equation

d²x/dt² + 6 dx/dt + 5x = 0

with the initial conditions at t = 0: x(0) = 0, x′(0) = 1.
We write the depicting equation
L [ x′′] + L [6x′] + L [5 x] = 0.
or

s²X(s) − 1 + 6sX(s) + 5X(s) = 0, or X(s)(s² + 6s + 5) = 1.

From which

X(s) = 1/(s² + 6s + 5) = k1/(s + 1) + k2/(s + 5).
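Evaluating the partial-fraction coefficients gives k1 = 1/(s + 5)|_{s=−1} = 1/4 and k2 = 1/(s + 1)|_{s=−5} = −1/4, so the original is x(t) = (e^{−t} − e^{−5t})/4. A quick check that this function satisfies the equation and the initial conditions:

```python
import math

# candidate original recovered from the partial fractions:
# X(s) = (1/4)/(s + 1) - (1/4)/(s + 5)  ->  x(t) = (e^(-t) - e^(-5t))/4
x   = lambda t: (math.exp(-t) - math.exp(-5 * t)) / 4.0
dx  = lambda t: (-math.exp(-t) + 5 * math.exp(-5 * t)) / 4.0
d2x = lambda t: (math.exp(-t) - 25 * math.exp(-5 * t)) / 4.0

print(abs(x(0.0)) < 1e-12, abs(dx(0.0) - 1.0) < 1e-12)  # x(0) = 0, x'(0) = 1
residual = max(abs(d2x(t) + 6 * dx(t) + 5 * x(t)) for t in (0.3, 1.0, 2.5))
print(residual < 1e-12)                                 # x'' + 6x' + 5x = 0
```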
A detailed solution of differential equations of any order may be found in Ref. [3].
We give examples of the solutions of certain problems of mechanics with the aid of
the Laplace transform.
The first task. The motion of a material point of mass m under the action of a force
that depends on time. The differential equation of motion of a point of mass m has the
form
L[mx′′(t)] = L[ f (t)] or ms 2 X (s) − msx(0) − mx′(0) = F( s). (1.63)
Here x0 and x0′ are the initial values of the function x(t) and its first derivative at t = 0.
Then

X(s) = (1/s)x0 + (1/s²)x0′ + F(s)/(ms²). (1.65)
Turning to the original, using Table 1.2 (items 2, 3) and (1.53), we obtain the solution of
Equation (1.62) in the form
x = x0 + x0′t + (1/m) ∫₀^t f(τ)(t − τ) dτ. (1.66)
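Formula (1.66) can be checked for a concrete driving force. For f(t) = sin t (an arbitrary choice), m·x″ = sin t has the closed-form solution x = x0 + (v0 + 1/m)t − sin(t)/m, and the quadrature of (1.66) should reproduce it:

```python
import math

m, x0, v0 = 2.0, 0.5, 1.0
f = math.sin                       # arbitrary driving force f(t) = sin t

def x_of_t(t, n=20000):
    # x(t) = x0 + v0*t + (1/m) ∫_0^t f(τ)(t - τ) dτ  (formula (1.66), trapezoid)
    h = t / n
    total = 0.5 * (f(0.0) * t + f(t) * 0.0)
    for k in range(1, n):
        tau = k * h
        total += f(tau) * (t - tau)
    return x0 + v0 * t + total * h / m

t1 = 2.0
exact = x0 + (v0 + 1.0 / m) * t1 - math.sin(t1) / m  # solution of m*x'' = sin t
print(abs(x_of_t(t1) - exact) < 1e-7)
```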
The same result can be obtained using the formulas (1.20), Table 1.2 (paragraph 1.7).
Assuming the initial conditions in the form

x = x0, x′ = v0 at t = 0, (1.67)
or

X(s) = x0 + sv0 + (1/(ms²))F(s). (1.69)
Analyzing the two depicting Equations (1.64) and (1.68), we draw the following conclu-
sions: the original of the second derivative x′′(t) corresponds to only one image s2 X(s);
the initial values of the functions for (1.64) are given by real quantities; and in Equation
(1.68), they automatically become complex numbers.
Passing from the image (1.69) to the original, we immediately obtain Equation (1.66).
The depicting Equation (1.68) can also be obtained using the impulsive functions (1.39),
(1.46). This can be done if we assume that, for zero initial values of the coordinate and
velocity, a pulse mx0′u1(t) is applied to the point of mass m at the time t = 0. This action
imparts a velocity x0′ to the point, along with two oppositely directed impulsive shocks,
which impart an instantaneous displacement x0 to the point.
Let us show this by the example of motion of a point of unit mass according to the law
of motion
Let us write down the depicting equation, applying the unit impulsive functions:

x = ∫₀^t f(τ)(t − τ) dτ + v0t + x0. (1.73)
The second task. Vibrations of the simplest vibrator. Let the load P = mg be suddenly
suspended at the end of a stressed spring whose own weight we neglect. At the same
time, the load is given an initial displacement of the spring x0 and an initial velocity x0′.
It is necessary to find the change in the elongation of the spring x(t) under the given
force (Figure 1.11).
Solution. Because the spring is elastic, under the action of the force P and the imparted
initial displacement the spring will deform until the whole system reaches equilibrium
and the spring comes to rest. Therefore, the motion of the particles of the spring can
be regarded as longitudinal oscillations arising at some point in time. Moreover, these
oscillations will at first be forced, and then, from the initial moment of equilibrium,
the oscillations take on the character of free oscillations. We use the method of
introducing unit impulsive functions, following the example considered above. We write
the equation of motion of the load
Figure 1.11 A load P = mg suspended from a spring; x0 and x0′ are the initial displacement and velocity, x the elongation.
Then

X(s) = g/(s² + k²) + x0′s/(s² + k²) + x0s²/(s² + k²). (1.76)
Here k = √(c/m) is the frequency of free oscillations.
We transform the first term on the right-hand side of (1.76) as follows: g/(s² + k²) = g·1/(s² + k²).
The second factor of the resulting expression is considered as

1/(s² + k²) = (1/k²)[1 − s²/(s² + k²)].
Applying the analogous method to transform the remaining terms of (1.76) and
passing to the original space (Table 1.2, items 8, 9), we finally obtain the well-known
solution
x(t) = (g/k²)(1 − cos kt) + (x0′/k)sin kt + x0cos kt.
The reader may find an extensive exposition of this material in the literature [1].
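The solution can be verified directly against the equation of motion implied by (1.76), x″ + k²x = g with x(0) = x0, x′(0) = x0′; the numerical data below are arbitrary:

```python
import math

g, k, x0, v0 = 9.81, 3.0, 0.2, 0.5   # arbitrary data: g, frequency k, x0, x0'

x   = lambda t: g / k**2 * (1 - math.cos(k * t)) + v0 / k * math.sin(k * t) + x0 * math.cos(k * t)
dx  = lambda t: g / k * math.sin(k * t) + v0 * math.cos(k * t) - x0 * k * math.sin(k * t)
d2x = lambda t: g * math.cos(k * t) - v0 * k * math.sin(k * t) - x0 * k**2 * math.cos(k * t)

print(abs(x(0.0) - x0) < 1e-12, abs(dx(0.0) - v0) < 1e-12)  # initial conditions
residual = max(abs(d2x(t) + k**2 * x(t) - g) for t in (0.1, 0.7, 2.0))
print(residual < 1e-9)                                      # x'' + k^2 x = g
```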
∂²u/∂t² = c1 ∂²u/∂x², (1.77)
where c1 is the coefficient characterizing the properties of the rod material.
Taking c1 = E/ρ, we finally obtain
∂²u/∂t² = (E/ρ) ∂²u/∂x². (1.78)
Here, E is the modulus of elasticity of the material; ρ is the density of the material.
2. Only twisting movements appear in the rod (the rod is only exposed to the torsion
pulse), assuming that the set of planar cross sections of the rod rotate sequentially at
a distance dx from each other (Figure 1.13)
∂²φ/∂t² = c2 ∂²φ/∂x². (1.79)
Taking c2 = G/ρ, we finally obtain
Figures 1.12 and 1.13 Elementary segments dx of the rod under longitudinal displacement u and torque M, measured along the axis x.
∂²φ/∂t² = (G/ρ) ∂²φ/∂x². (1.80)
Here G is the shear modulus of the material.
In the case of wave processes propagating in an elastic rod, longitudinal and transverse
oscillations with allowance for (1.77) and (1.79) can be described by the formulas [7]
∂²u/∂x² = (1/a1²) ∂²u/∂t², a1 = √[E(1 − μ)/(ρ(1 + μ)(1 − 2μ))]; (1.81)

∂²φ/∂x² = (1/a2²) ∂²φ/∂t², a2 = √(G/ρ). (1.82)
Here μ is Poisson’s ratio; G = E/[2(1 + μ)].
ρ ∂υ/∂t = −∂σ/∂x; (1.83)

(1/E) ∂σ/∂t = −∂υ/∂x; (1.84)

rρ ∂Ω/∂t = −∂τ/∂x; (1.85)

(1/(rG)) ∂τ/∂t = −∂Ω/∂x. (1.86)
Here υ is the travel speed of an elementary volume of the elastic rod along the axis x,
υ = ∂u/∂t; r is the radius of the rod; and Ω is the angular travel speed of the particles
of the elastic rod in the plane of the section, Ω = ∂φ/∂t.
Equations (1.83) and (1.84) describe the relationship between the velocities of longi-
tudinal displacement of plane sections of the elementary volume of an elastic rod and
changes in normal stresses with the gradients of changes of these variables along the
length of the rod.
Equations (1.85) and (1.86) describe the relationship between the shear rate of flat sec-
tions of the elementary volume of an elastic rod and the rate of change of the maximum
tangential stresses with the gradients of the variations of these variables along the length
of the rod.
For a short rod, Equations (1.83–1.86) can be written in ordinary derivatives. A detailed
exposition of this question is given in Ref. [8].
This approach, proposed by L. Kondratenko, allowed the development of a new
method for studying the dynamics of rotating and longitudinally moving elements of a
construction. The method makes it possible to estimate the magnitude of the oscillations
of stresses in the structural elements, as well as the speed of movement of the functional
element in engineering technologies.
Let us explain the essence of Kondratenko’s method.
ϑk dτ/dt = Ω1 − Ω2, (1.87)
where ϑk is the coefficient of torsional elasticity, ϑk = l/(ρGW); Ω1, Ω2 are the angular velocities
of rotation of the rod section and of the cross section of the rod near the disk, respectively;
Figure 1.14 A rod (1) of length l carrying a disk (2) with moment of inertia J and resistance moment Mr; Ω1 and Ω2 are the angular velocities of the driving end and of the disk.
and W is the geometric moment of resistance of the cross section of the rod near the disk,
W = πd³/16.
We integrate (1.87) and obtain the dependence of the tangential stresses on the variation
of the twist angle:

ϑ1kτ = φ1 − φ2, ϑ1k = l/(ρG). (1.88)
The tangential stresses developed in the rod overcome the moment of the resistance forces
Mr = Mr0 + hkΩ2, as well as the rising inertia forces Md = J dΩ2/dt. Here hk is the loss factor
proportional to the angular velocity of rotation of the disk. Taking into account that
τ = M/W, we finally obtain
τ(t)W = Mr0(t) + hkΩ2(t) + J dΩ2/dt. (1.89)
1.11 Equations of torsional oscillations of a disk
For Equations (1.85) and (1.86), we write the partial differential equations, assuming that
the density and the shear modulus of the rod material are equal and constant along the
length:
rρsΩ(s) = −dτ(s)/dx; (1.90)

sτ(s)/(rG) = −dΩ(s)/dx. (1.91)
Differentiating (1.90) with respect to the coordinate x, eliminating the derivative dΩ(s)/dx
with the help of (1.91), and introducing the new variable θk(s) = ±s(G^{−1}ρ)^{1/2}, we obtain a new
second-order differential equation
∂²τ(s)/∂x² − θk²(s)τ(s) = 0. (1.92)
The solution of this equation has the form

τ(s, x) = C1e^{θk(s)x} + C2e^{−θk(s)x}. (1.93)
The constants of integration C1, C2 are determined by the boundary conditions at x = 0:

τ(s, x) = τ1(s, 0); ∂τ(s, x)/∂x = −(θk²(s)G/s)·Ω1(s, 0). (1.94)
The final solution of (1.90) and (1.91) will be

Ω(s, x) = Ω1(s, 0)ch[θk(s)x] − (1/(Gθk(s)))sτ1(s, 0)sh[θk(s)x]; (1.95)
τ(s, x) = τ1(s, 0)ch[θk(s)x] − (1/s)rGθkΩ1(s, 0)sh[θk(s)x]. (1.96)
The coefficient θ k(s) is the symbolical coefficient of wave propagation. The solution obtained
in the images makes it possible to calculate the frequency characteristics of the driven link
depending on the change in the speed of the leading link, taking into account the emerg-
ing reactive force factors of the medium. The reader may find an extensive exposition of
this question in the literature [8].
ρ dυ/dt = −dσ/dx; (1.97)

(1/E) dσ/dt = −dυ/dx. (1.98)
We perform actions similar to Section 1.10. Integrating (1.97) with respect to the coordinate
x and then differentiating with respect to t, we obtain the relation for the longitudinal
vibrations of the rod
ϑ′n0 dσ/dt = υ1 − υ2, (1.99)
where ϑ′n0 is the coefficient characterizing the longitudinal elasticity, ϑ′n0 = l/E; υ1, υ 2 are the
linear velocities of the displacement of the points of the rod and disk sections, respectively;
and E is the modulus of elasticity of the material. Assuming that the modulus of elasticity
is the same and constant, we integrate (1.99), and we obtain the dependence of the normal
stresses on the displacements of the points of the rod:
Figure 1.15 A rod (1) of length l driving a body (2) along a surface inclined at the angle α; υ1, υ2 are the velocities of the rod and body, F, F1, F2 the acting forces, mg the weight.
ϑ′n0σ = (υ1 − υ2)t, (1.100)

or

(l/E)σ = x1 − x2, (l/E)σ = Δu. (1.101)
Expression (1.101) is the well-known Hooke’s law.
The normal stresses developed in the rod overcome the resistance arising on the disk
(driven link), which is the resultant of two force components. The force F2 is the friction
force, which is determined by the expression F2 = Ff = kf sin α, where kf is the coefficient
of friction. The force F1 is the inertial component, F1 = −Fi = m dυ2/dt.
Without taking into account the direction of the speed of motion and taking F = σf,
where f is the cross-sectional area of the body, we write the following relation [8]:
βσf = F0(t) + k1υ1(t) + hυ2 + m dυ2/dt. (1.102)
Here β is the proportionality coefficient, which depends on the coefficient of friction caused
by the contact pressure and on the direction of motion of the driven link, β = 1 − c sgn υ2;
k1, h are the coefficients of friction loss proportional to the speeds of the driving and
driven links. Solving (1.99) and (1.102) jointly by the symbolical method, taking into
account the direction of motion of the driven link, we finally obtain
account the direction of motion of the driven link, we finally obtain
υ1(t)(1 − c sgn υ2 − k1ϑn0p) − ϑn0pF0(t) = υ2(t)(1 − c sgn υ2 + hϑn0p + mϑn0p²). (1.103)
Here ϑn0 is the elasticity of the mechanical system, ϑn0 = l/fE; p is a differential operator,
p ≡ d/dt.
Further transformations will be based on the energy approach of the deformation
of an elastic body and the rheological representation of the transfer of dynamic energy
through a metallic body (the Zener model) [9].
It is known that the realization of the principle of continuity of deformations in an
elastic body corresponds to the minimum value of the potential deformation energy accu-
mulated by the body [10] (i.e., in deformation processes, the stored energy in the body is
spent on performing work to restore the body shape to its original state after the load is
removed).
Taking into account the phenomenological Zener model, we write the differential
equation characterizing the redistribution of stresses and deformations in the body under
the static load of the body in some time [8]
σ + (η/E2) dσ/dt = E1θ + η dθ/dt. (1.104)
Here η is the coefficient of proportionality, which characterizes the viscosity of the body;
E1, E2 are elastic constants of isothermal and isobaric deformation processes; and θ is linear
deformation of the body.
If we take θ′ = 0, then expression (1.104) is transformed into equation
σ + (η/E2) dσ/dt = E1θ0, τ_ε = η/E2, (1.105)
with the solution

θ(t) = σ0/E1 + (θ0 − σ0/E1)exp(−t/τ_σ); τ_σ = η/E1, (1.107)

where τ_σ is the time of retardation (lag).
We transform Equation (1.104), going over to the operator form:

pσ = E2pθ − (1/τ_ε)(σ − E1θ), p ≡ d/dt. (1.108)
We perform one more transformation:

σ(p + 1/τ_ε) = θE2(p + 1/(kτ_ε)); k = E2/E1. (1.109)
Passing under zero initial conditions to Laplace transforms, we rewrite Equation (1.109) in
the images
σ(s)(s + 1/τ_ε) = θ(s)E2(s + 1/(kτ_ε)). (1.110)
Taking into account the stepwise deformation, the Laplace transform of the stress change
function (1.110) is written in the form
σ(s) = θ0E2 (s + 1/(kτ_ε)) / [s(s + 1/τ_ε)]. (1.111)
σ(t) = θ0E2[1/k + (1 − 1/k)exp(−t/τ_ε)] = θ0E1[1 + (E2/E1 − 1)exp(−t/τ_ε)]. (1.112)
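A numerical sanity check of (1.112): for a step deformation θ = θ0 the stress must satisfy the time-domain form of (1.110), dσ/dt + σ/τ_ε = E2θ0/(kτ_ε), and relax from σ(0) = E2θ0 to σ(∞) = E1θ0. The material data below are hypothetical:

```python
import math

E1, E2, eta, theta0 = 2.0e9, 5.0e9, 1.0e9, 1e-3  # hypothetical material data
k = E2 / E1            # ratio of the elastic constants
tau_e = eta / E2       # relaxation time

sigma  = lambda t: theta0 * E2 * (1 / k + (1 - 1 / k) * math.exp(-t / tau_e))  # (1.112)
dsigma = lambda t: -theta0 * E2 * (1 - 1 / k) / tau_e * math.exp(-t / tau_e)

rhs = E2 * theta0 / (k * tau_e)
residual = max(abs(dsigma(t) + sigma(t) / tau_e - rhs)
               for t in (0.0, 0.5 * tau_e, 3 * tau_e))
print(residual < 1e-6 * rhs)                        # the stress ODE is satisfied
print(abs(sigma(0.0) - E2 * theta0) < 1e-3)         # instantaneous modulus E2
print(abs(sigma(50 * tau_e) - E1 * theta0) < 1e-3)  # relaxed modulus E1
```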
Assuming that the relaxation constant is zero within the elastic range, when the stresses
do not exceed the yield strength, without taking into account the direction of motion in
accordance with (1.102) and (1.103), we write the following equations of motion of the mate-
rial object:
σf(1 − c) = F0(t) + hυ2 + m dυ2/dt (1.113)
and
υ1(t)(1 − c) − ϑn0pF0(t) = υ2(t)(1 − c + hϑn0p + mϑn0p²). (1.114)
In this case, the oscillations of the velocity of motion of a material object can be described
by the following equation [8]:
υ2(t) = [υ1(t) − aϑn0pF0(t)] / [1 + ahϑn0p + amϑn0p²], a = 1/(1 − c). (1.115)
For the difference in the velocities of motion for υ1(t) = const, the following relation is valid:
Δυ(t) = −aϑn0pF0(t) / [1 + ahϑn0p + amϑn0p²]. (1.116)
Integrating (1.116) with respect to t and passing to the differential form, we obtain the
equation of displacement of the material point relative to the leading member:
Δu(t) + ahϑn0 dΔu/dt + amϑn0 d²Δu/dt² = −aϑn0 dF0/dt. (1.117)
From the solution of the system of equations (1.113) and (1.114), the stresses in the rod are
determined by the equality
σ(t) = (a/f)[F0(t) + hυ2(t)(1 + Tp)], (1.118)
where T = m/h is the inertial time constant.
A detailed exposition of this question can be found in Ref. [8].
The equations obtained make it possible to apply the motion transfer scheme and, on
its basis, to investigate the longitudinal and torsional oscillations of the moving techno-
logical object fixed to the end of the rod (the input and output links of the system).
of the rod. Such a model can be taken as an imitation model when studying the process
of hole processing. The structural scheme for the transfer of rotational motion is shown in
Figure 1.16 [8].
Notation: ϑk—coefficient of torsional elasticity; p—differentiation operator (p = d/dt); τ—tangential stresses; τε—relaxation constant; kε—ratio of the adiabatic and isothermal elasticity moduli of the rod material; Mr—resultant moment of the resistance forces; l—length of the rod with the disk; hk—coefficient of friction loss proportional to the rotational speed; 1, 2—leading and driven links.
The transfer function of the effect of the oscillations of the torque on the rotational
speed of the disk is the relation
W_{\Omega}(s) = \frac{\Omega_2(s)}{M_r(s)}. \quad (1.119)
Here Ω2 is the angular velocity of rotation of the disk, and Mr is the moment of resistance
of forces.
The transfer function (1.119) is a Laplace transform of the impulse response k(t) [4]. To
determine k(t), it is necessary to find the roots of the characteristic equation.
The proposed mathematical model allows us to investigate the dynamics of the rotating parts of the structural element, and also to obtain equations describing, in the rod, the relationship between the angular acceleration of elementary sections and the gradient of the tangential stress, and between the rate of change of this stress and the angular velocity gradient.
Figure 1.16 The structural scheme for the transfer of rotational motion.
In the presence of long lines (an elastic system with distributed parameters), it is expe-
dient to carry out an investigation of such processes in the complex domain by means of a
one-dimensional Laplace transform.
Based on the proposed scheme of motion transfer (Figure 1.14), the transfer function
can be described by expression
W_{\Omega}(s) = \frac{\Omega_2(s)}{M_r(s)} = \frac{\vartheta_k s(s + \alpha_1)}{\alpha_2 + s(1 + \vartheta_k h_k \alpha_1) + s^2 \vartheta_k (h_k + J\alpha_1) + s^3 \vartheta_k J}. \quad (1.120)
Here α1, α 2 are quantities that take into account the peculiarities of the rotation of the disk
in the interaction medium (contact interaction, etc.).
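As a numerical illustration (not part of the original text, and with purely hypothetical parameter values), the following Python sketch finds the roots of the characteristic equation formed from the denominator of W_Ω(s) in (1.120); the impulse response k(t) is then a combination of the modes exp(r·t) over those roots, and stability requires every root to lie in the left half-plane.

```python
import numpy as np

# Hypothetical parameter values (for illustration only).
vk, hk, J = 0.5, 0.8, 2.0   # torsional elasticity, friction loss, disk inertia
a1, a2 = 1.5, 3.0           # interaction-medium quantities alpha_1, alpha_2

# Characteristic equation from the denominator of W_Omega(s) in (1.120):
# vk*J*s^3 + vk*(hk + J*a1)*s^2 + (1 + vk*hk*a1)*s + a2 = 0
coeffs = [vk * J, vk * (hk + J * a1), 1 + vk * hk * a1, a2]
roots = np.roots(coeffs)

# k(t) is a combination of exp(r*t) over these roots; stability requires
# every root to have a negative real part.
print(roots)
print(all(r.real < 0 for r in roots))
```

For these particular values the Routh–Hurwitz condition holds only with a small margin, so the oscillatory pair of roots sits close to the imaginary axis.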
\tau_c(s) = \psi(s)\left[ M_r + \frac{\Omega_1(s)(h_k + Js)}{\operatorname{ch}[\theta_k(s)l]} \right];

\psi(s) = \frac{1}{W[1 + h_k \vartheta_k(s) + J \vartheta_k(s) s^2]};

\vartheta_k(s) = \frac{l}{G_r W} Z_k(s);

Z_k(s) = \frac{\operatorname{th}[\theta_k(s)l]}{\theta_k(s)l}.
Here Zk(s) is a function characterizing the degree of distribution of the parameters. For Ω1 = 0 and the Laplacian s = jω, the function Zk(jω) becomes real, i.e., Z_k(j\omega) = \frac{\operatorname{tg}\alpha_k}{\alpha_k}; αk is the parameter characterizing the properties of the structure, \alpha_k = l\omega(\rho G^{-1})^{0.5}; ω is the circular frequency of harmonic oscillations; j = (−1)^{1/2}.
These equations make it possible to calculate the frequency characteristics of the drive
and determine the drive response to the harmonic variation in the speed of the driving
link or the moment of resistance acting on the actuator.
Thus, we obtain two frequency characteristics: one of them, WM(jω), illustrates the influence of the oscillation of the moment of resistance on the angular velocity of rotation of the disk Ω2; the other, WMτ(jω), determines the influence of the oscillations of Mr on the magnitude of the tangential stresses τ appearing in the section adjacent to the disk.
The graph of the function Zk(αk) is shown in Figure 1.17, from which it is clear that as the parameter tends to zero, the function Zk tends to unity: αk → 0, Zk → 1.
Figure 1.17 Change in the function Zk as a function of the dimensionless parameter αk.
In cases of variation of αk in the intervals π/2 + kπ < αk < π + kπ, the function Zk takes negative values, Zk < 0.
The graph in Figure 1.17 clearly defines the zones of stable and unstable operation of
the mechanism.
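The sign pattern behind these zones can be checked directly; a minimal Python sketch (illustrative only, not from the original text), using Zk(α) = tg α / α for Ω1 = 0:

```python
import math

def Zk(alpha):
    """Z_k for Omega_1 = 0 and s = j*omega: Z_k(alpha) = tan(alpha)/alpha."""
    return math.tan(alpha) / alpha

print(Zk(1e-3))  # tends to unity as alpha -> 0
print(Zk(1.0))   # positive for 0 < alpha < pi/2
print(Zk(2.0))   # negative for pi/2 < alpha < pi
```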
A detailed exposition of this material and questions of mathematical modeling of the
rolling of the tube can be found in the literature [11–16].
References
1. Lur’e A.I. [Operacionnoe ischislenie i ego prilozheniya k zadacham mekhaniki] Operational Calculus
and Its Applications to Problems of Mechanics. GITTL, Moscow, 1950. (In Russ.).
2. Korn G., Korn T. Mathematical Handbook for Scientists and Engineers Definitions, Theorems and
Formulas for Reference and Review. McGraw-Hill Book Company, New York, San Francisco,
Toronto, London, Sydney, 1968.
3. Doetsch G. Anleitung zum praktischen Gebrauch der Laplace-Transformation. R. Oldenbourg,
München, 1961.
4. Ivanov V.A., Chemodanov B.K., Medvedev V.S. [Matematicheskie osnovy teorii avtomatichesk-
ogo regulirovaniya] Mathematical Foundations of the Theory of Automatic Control. High School,
Moscow, 1971. (In Russ.).
5. Lavrent’ev M.A., Shabat B.G. [Metody teorii funkcij kompleksnogo peremennogo] Methods of the
Theory of Functions of a Complex Variable. Nauka, Moscow, 1965. (In Russ.).
6. Mironova L.I. [Komp’yuternye tekhnologii v reshenii zadach teorii uprugosti] Computer Technologies
in Solving Problems in the Theory of Elasticity. Palmarium Academic Publishing, ISBN-13: 978-3-
659-72395-7; ISBN-10: 3659723959. (In Russ.).
7. Sedov L.I. [Mekhanika sploshnoj sredy] Continuum Mechanics. Nedra, Moscow. T.1, T.2, 1970. (In
Russ.).
8. Kondratenko L. A. [Raschet kolebanij detalyah i uzlah mashin] Calculation of Velocity Variations and
Stresses in Machine Assemblies and Components. Sputnik, Moscow, 2008. (In Russ.).
9. Eirich Frederik R. (Ed.). Rheology. Academic Press, Inc., New York, VI, 1965.
10. Bezuhov N.I. [Osnovy teorii uprugosti, plastichnosti i polzuchesti] Fundamentals of the Theory of
Elasticity, Plasticity and Creep. High School, Moscow, 1968. (In Russ.).
11. Kondratenko L.A., Terekhov V.M., Mironova L.I. [Ob odnom metode issledovaniya krutil’nyh
kolebanij sterzhnya i ego primenenii v tekhnologiyah mashinostroeniya] About one method
of research torsional vibrations of the core and this application in technologies of mechanical
engineering. Engineering & Automation Problems. 2017, vol. 1, pp. 133–137. (In Russ.).
12. Kondratenko L., Terekhov V., Mironova L. The aspects of roll-forming process dynamics.
Vibroengineering PROCEDIA. At the 22nd International Conference on Vibroengineering,
Moscow, 2016, pp. 460–465. (In Russ.).
13. Kondratenko L.A. [Mekhanika rolikovogo val’cevaniya teploobmennyh trub] Mechanics
Roller Rolling Heat Exchange Tubes. Sputnik, Moscow, 2015. (In Russ.).
14. Kondratenko L.A., Terekhov V.M., Mironova L.I. [K voprosu o vliyanii dinamiki rolikovogo
val’cevaniya na kachestvo izgotovleniya teploobmennyh apparatov v atomnyh ehnerget-
icheskih ustanovkah] On the effect of the dynamics of the roller rolling on the quality of man-
ufacture of heat exchangers of nuclear power units. Heavy Engineering Construction. 2016, vol. 3,
pp. 10–14. (In Russ.).
15. Kondratenko L., Mironova L., Terekhov V. Investigation of vibrations during deep hole
machining. 25th International Conference Vibroengineering, Liberec, Czech Republic. JVE
International LTD. Vibroengineering Procedia. 2017, Vol. 11. ISSN 2345-0533, pp. 7–11. Crossref
DOI link: https://doi.org/10.21595/vp.2017.18285.
16. Kondratenko L., Mironova L., Terekhov V. On the question of the relationship between longi-
tudinal and torsional vibrations in the manufacture of holes in the details. 26th Conference in
St. Petersburg, Russia. JVE International LTD. Vibroengineering Procedia. 2017, Vol. 12. ISSN
2345-0533, pp. 6–11. Crossref DOI link: https://doi.org/10.21595/vp.2017.18461.
chapter two
Fourier series and its applications in engineering
Contents
2.1 Introduction
2.2 Periodic functions
2.3 Orthogonality of sine and cosine functions
2.4 Fourier series
2.5 Dirichlet’s theorem
2.6 Riemann–Lebesgue lemma
2.7 Term-wise differentiation
2.8 Convergence of Fourier series
2.9 Small order
2.10 Big “oh” for functions
2.11 Fourier analysis and Fourier transform
2.12 Fourier transform
2.13 Gibbs phenomenon
 2.13.1 Gibbs phenomenon with an example
 2.13.2 Results related to Gibbs phenomenon
2.14 Trigonometric Fourier approximation
2.15 Summability
 2.15.1 Ordinary summability
 2.15.2 Absolute summability
 2.15.3 Strong summability
2.16 Methods for summability
2.17 Regularity condition
2.18 Norm
2.19 Modulus of continuity
2.20 Lipschitz condition
2.21 Various Lipschitz classes
2.22 Degree of approximation
2.23 Fourier series and music
2.24 Applications and significant uses
References
36 Advanced Mathematical Techniques in Engineering Sciences
2.1 Introduction
Mathematics has its roots embedded within various streams of engineering and sciences.
The concepts of the famous Fourier series originated in the field of physics; two physical problems, vibration and heat transfer, are the reasons for its origin. Jean Baptiste Joseph Fourier (1768–1830) was the first physicist, mathematician, and engineer to develop the concepts of Fourier analysis in dealing with the problems of vibrations and heat transfer. He claimed that any continuous or discontinuous function of t can be expressed as a linear combination of cosine and sine functions.
In mathematical analysis, we do not usually get a full decomposition into simpler things; rather, an approximation of a complex system is usually achieved by a more elementary system. When we truncate the Taylor series expansion of a function, we approximate the function by a polynomial.
The form of a Taylor series is as follows (infinite series):

f(t) = \sum_{n=0}^{\infty} a_n t^n,
where a0, a1, a2, … are called the constant coefficients of the infinite series. A Taylor series
does not include terms with negative powers. The quality of the approximation depends
on the number of terms taken under consideration. Of course, for a function to have a
Taylor series, it must (among other things) be infinitely differentiable in some interval, and
this is a very restrictive condition.
The Fourier series, which is a sum of sines and cosines, can be used for the approximation of any periodic function. Sines and cosines serve as much more versatile “prime elements” than powers of t: they can be used to approximate not only non-analytic functions, but they even do a good job on discontinuous functions.
2.2 Periodic functions
A function satisfying the identity l(t) = l(t + T) for all t, where T > 0, is called periodic or T-periodic, as shown in Figure 2.1.
Chapter two: Fourier series and its applications in engineering 37
Here, nT is also a period for any integer n > 0, and T is called the fundamental period. The definite integral of a T-periodic function is the same over any interval of length T. The following example uses this property to integrate the 2-periodic function shown in Figure 2.2.
Example 2.1: Let f be the 2-periodic function with f(x) = −x + 1 on [0, 2), and let I be a positive integer. Evaluate \int_{-I}^{I} f^2(x)\,dx.

Solution: Since the interval [−I, I] covers exactly I full periods of length 2,

\int_{-I}^{I} f^2(x)\,dx = I \int_0^2 f^2(x)\,dx = I \int_0^2 (-x + 1)^2\,dx = I \left[ -\frac{1}{3}(-x + 1)^3 \right]_0^2 = -\frac{I}{3}[-1 - 1] = \frac{2I}{3}.
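A quick numerical check of this property (not part of the original text), assuming the sawtooth f(x) = 1 − x on [0, 2) extended 2-periodically:

```python
import numpy as np

# Numerical check of Example 2.1: f is the 2-periodic extension of
# f(x) = -x + 1 on [0, 2); the integral of f^2 over [-I, I] equals
# I times the integral over one period, i.e., 2*I/3.
def f(x):
    return 1.0 - np.mod(x, 2.0)

I = 3
x = np.linspace(-I, I, 200001)
y = f(x) ** 2
integral = np.sum((y[1:] + y[:-1]) / 2 * np.diff(x))  # trapezoidal rule
print(integral)  # ~ 2*I/3 = 2.0
```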
The most important periodic functions are those of the 2π-periodic trigonometric system.

2.3 Orthogonality of sine and cosine functions
Two functions f and g are said to be orthogonal on an interval [a, b] if

\int_a^b f(x)g(x)\,dx = 0.

For the trigonometric system on [−π, π],

\int_{-\pi}^{\pi} \cos mx \cos nx\,dx = 0 \text{ for } m \neq n, \qquad = \pi \text{ for } m = n,

\int_{-\pi}^{\pi} \sin mx \sin nx\,dx = 0 \text{ for } m \neq n, \qquad = \pi \text{ for } m = n,

\int_{-\pi}^{\pi} \cos mx \sin nx\,dx = 0 \text{ for all } m \text{ and } n.
Certain sequences of sin nt and cos nt functions are orthogonal on certain intervals. The resulting expansions

f = \sum_{i=1}^{\infty} c_i \varphi_i

in terms of the sin nt and cos nt become the Fourier series expansions of the function f.
First, we consider the functions φn(t) = cos nt. These are orthogonal on the interval 0 < t < π.
Example 2.2: The functions φ0(t) = 1, φ1(t) = cos t, φ2(t) = cos 2t, φ3(t) = cos 3t, …, φn(t) = cos nt, … are orthogonal on the interval 0 < t < π. Furthermore, |φ0|² = π and |φn|² = π/2 for n = 1, 2, ….

Proof: For m ≠ n,

(\varphi_n(t), \varphi_m(t)) = \int_0^{\pi} \cos(nt)\cos(mt)\,dt = \int_0^{\pi} \left[ \frac{1}{2}\cos(n + m)t + \frac{1}{2}\cos(n - m)t \right] dt = \left[ \frac{1}{2(n + m)}\sin(n + m)t + \frac{1}{2(n - m)}\sin(n - m)t \right]_0^{\pi} = 0,

and

|\varphi_n|^2 = \int_0^{\pi} \cos^2(nt)\,dt = \int_0^{\pi} \frac{1}{2}[1 + \cos 2nt]\,dt = \left[ \frac{t}{2} + \frac{1}{4n}\sin 2nt \right]_0^{\pi} = \frac{\pi}{2}.
Next, we consider the functions ψn(t) = sin nt. These are also orthogonal on the interval 0 < t < π. The resulting expansion is called the Fourier sine series expansion of f.
Example 2.3: The functions ψ1(t) = sin t, ψ2(t) = sin 2t, ψ3(t) = sin 3t, …, ψn(t) = sin nt, … are orthogonal on the interval 0 < t < π. Furthermore, |ψn|² = π/2 for n = 1, 2, ….
Proof: For m ≠ n,

(\psi_n(t), \psi_m(t)) = \int_0^{\pi} \sin(nt)\sin(mt)\,dt = \int_0^{\pi} \left[ \frac{1}{2}\cos(n - m)t - \frac{1}{2}\cos(n + m)t \right] dt = \left[ \frac{1}{2(n - m)}\sin(n - m)t - \frac{1}{2(n + m)}\sin(n + m)t \right]_0^{\pi} = 0,

and

|\psi_n|^2 = \int_0^{\pi} \sin^2(nt)\,dt = \left[ \frac{t}{2} - \frac{1}{4n}\sin 2nt \right]_0^{\pi} = \frac{\pi}{2}.
Finally, we consider the functions φn(t) = cos nt and ψn(t) = sin nt. These are orthogonal
on the interval −π < t < π.
Example 2.4: The functions φ0(t) = 1, φ1(t) = cos t, φ2(t) = cos 2t, …, φn(t) = cos nt, … and ψ1(t) = sin t, ψ2(t) = sin 2t, …, ψn(t) = sin nt, … are orthogonal on the interval −π < t < π. Furthermore, |φ0|² = 2π and |φn|² = |ψn|² = π for n = 1, 2, …. Orthogonality within each family is shown as in Examples 2.2 and 2.3. For (φn(t), ψm(t)), the third identity is used:

(\varphi_n(t), \psi_m(t)) = \int_{-\pi}^{\pi} \left[ \frac{1}{2}\sin(m + n)t - \frac{1}{2}\sin(m - n)t \right] dt = \left[ -\frac{1}{2(n + m)}\cos(n + m)t + \frac{1}{2(m - n)}\cos(m - n)t \right]_{-\pi}^{\pi} = 0.

Then |φ0|² = 2π is an easy verification, and |φn|² = |ψn|² = π is shown in the same way (see Examples 2.2 and 2.3).
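The inner products in Examples 2.2 through 2.4 can also be verified numerically; a small illustrative Python sketch (not part of the original text), using a trapezoidal rule:

```python
import numpy as np

def inner(u, v, a, b, n=100001):
    """Trapezoidal approximation of the inner product (u, v) on [a, b]."""
    t = np.linspace(a, b, n)
    y = u(t) * v(t)
    return np.sum((y[1:] + y[:-1]) / 2 * np.diff(t))

# cos(2t) and cos(3t) are orthogonal on (0, pi); |cos(nt)|^2 = pi/2 there.
print(inner(lambda t: np.cos(2 * t), lambda t: np.cos(3 * t), 0, np.pi))       # ~0
print(inner(lambda t: np.cos(2 * t), lambda t: np.cos(2 * t), 0, np.pi))       # ~pi/2
# cos(2t) and sin(3t) are orthogonal on (-pi, pi).
print(inner(lambda t: np.cos(2 * t), lambda t: np.sin(3 * t), -np.pi, np.pi))  # ~0
```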
2.4 Fourier series
A Fourier series is a special representation of a function (signal) of the form

f(x) = a_0 + \sum_{n=1}^{\infty} \left( a_n \cos(nx) + b_n \sin(nx) \right).

The coefficient a0 is determined by integrating both sides over the interval [−π, π]:
\int_{-\pi}^{\pi} f(x)\,dx = \int_{-\pi}^{\pi} a_0\,dx + \sum_{n=1}^{\infty} \int_{-\pi}^{\pi} \left( a_n \cos(nx) + b_n \sin(nx) \right) dx.

Since \int_{-\pi}^{\pi} \cos nx\,dx = \int_{-\pi}^{\pi} \sin nx\,dx = 0 for n = 1, 2, …,

\int_{-\pi}^{\pi} f(x)\,dx = \int_{-\pi}^{\pi} a_0\,dx = 2\pi a_0 \;\Rightarrow\; a_0 = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x)\,dx.
The coefficient an is determined by multiplying both sides with cos mx and integrating the
resulting equation over the interval [−π, π]:
\int_{-\pi}^{\pi} f(x)\cos(mx)\,dx = \int_{-\pi}^{\pi} a_0 \cos(mx)\,dx + \sum_{n=1}^{\infty} \int_{-\pi}^{\pi} a_n \cos(nx)\cos(mx)\,dx + \sum_{n=1}^{\infty} \int_{-\pi}^{\pi} b_n \sin(nx)\cos(mx)\,dx.

Since \int_{-\pi}^{\pi} \cos mx\,dx = 0, \int_{-\pi}^{\pi} \cos mx \sin nx\,dx = 0 for all m and n, and \int_{-\pi}^{\pi} \cos mx \cos nx\,dx = 0 for m ≠ n:

\int_{-\pi}^{\pi} f(x)\cos(mx)\,dx = a_m \int_{-\pi}^{\pi} \cos^2(mx)\,dx = \pi a_m,

so that

a_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x)\cos(nx)\,dx = \frac{1}{\pi} \int_0^{2\pi} f(x)\cos(nx)\,dx.
Similarly, the coefficient bn is determined by multiplying both sides with sin nx and integrating the resulting equation over the interval [−π, π]:

b_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x)\sin(nx)\,dx = \frac{1}{\pi} \int_0^{2\pi} f(x)\sin(nx)\,dx.
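These coefficient formulas are easy to evaluate numerically; the following illustrative Python sketch (not part of the original text) computes a few coefficients of f(x) = x on (−π, π], for which analytically a_n = 0 and b_n = 2(−1)^{n+1}/n.

```python
import numpy as np

# Numerical Fourier coefficients of f(x) = x on (-pi, pi] via the
# trapezoidal rule; analytically a_n = 0 and b_n = 2*(-1)**(n+1)/n.
def trap(y, x):
    return np.sum((y[1:] + y[:-1]) / 2 * np.diff(x))

x = np.linspace(-np.pi, np.pi, 200001)
fx = x
coef = {}
for n in (1, 2, 3):
    an = trap(fx * np.cos(n * x), x) / np.pi
    bn = trap(fx * np.sin(n * x), x) / np.pi
    coef[n] = (an, bn)
    print(n, an, bn)  # b_1 ~ 2, b_2 ~ -1, b_3 ~ 2/3
```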
s_n(f;x) = \frac{a_0}{2} + \sum_{k=1}^{n} \left( a_k \cos kx + b_k \sin kx \right), \quad \forall n \ge 1, \qquad s_0(f;x) = \frac{a_0}{2},

denotes the (n + 1)th partial sum, called a trigonometric polynomial of degree (or order) n, of the Fourier series of f. The conjugate Fourier series of the series of f is defined by
\sum_{n=1}^{\infty} (b_n \cos nx - a_n \sin nx) = \sum_{n=0}^{\infty} v_n,

where a_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x)\cos kx\,dx, k = 0, 1, 2, \ldots, and b_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x)\sin kx\,dx, k = 1, 2, 3, \ldots, are called the Fourier coefficients of f. The sequence of partial sums of the series \sum_{k=0}^{\infty} u_k(x), given by s_n(f;x) = \frac{a_0}{2} + \sum_{k=1}^{n} (a_k \cos kx + b_k \sin kx), is a trigonometric polynomial of order n.
2.5 Dirichlet’s theorem
The Fourier series of a piecewise smooth integrable function f converges at each point x to

\frac{f(x+) + f(x-)}{2}.
Hence, the Fourier series converges to f(x) at points of continuity and to the average of the
limiting values at a jump discontinuity.
2.6 Riemann–Lebesgue lemma
The Fourier coefficients an and bn of any integrable function tend to zero as n tends to infinity—that is,

\lim_{n \to \infty} \frac{1}{\pi} \int_{-\pi}^{\pi} f(x)\cos nx\,dx = 0

and

\lim_{n \to \infty} \frac{1}{\pi} \int_{-\pi}^{\pi} f(x)\sin nx\,dx = 0.
The Riemann–Lebesgue lemma can be used to validate asymptotic approximations of integrals. The method of steepest descent (in rigorous treatments) and the method of stationary phase are based on the Riemann–Lebesgue lemma.
2.7 Term-wise differentiation
Let f be a continuous, piecewise smooth 2π-periodic function on all of R with Fourier series

\frac{a_0}{2} + \sum_{n \in \mathbb{N}} a_n \cos(nt) + \sum_{n \in \mathbb{N}} b_n \sin(nt).
If f´ is piecewise smooth, then the series can be differentiated term by term to yield the
following point-wise convergent series at every point t:
\frac{f'(t+) + f'(t-)}{2} = \sum_{n \in \mathbb{N}} \left( n b_n \cos(nt) - n a_n \sin(nt) \right).
2.8 Convergence of Fourier series
Example 2.5: Let f(x) = x be the 2π-periodic function on [−π, π]. Its second partial sum is

s_2(f;x) = 2\sin x - \sin 2x.

Figure 2.3 The function f(x) = x and its Fourier series partial sum.
2.9 Small order
Function g(n) is of a smaller order than function h(n), or g(n) approaches 0 faster than h(n), if

\lim_{n \to \infty} \frac{g(n)}{h(n)} = 0,

and we write g = o(h).
2.10 Big “oh” for functions
If

\frac{f(t)}{g(t)} \text{ is bounded as } t \to a,

where a could be ±∞, then f = O(g), or f is at most of order g.
2.11 Fourier analysis and Fourier transform
1. Fourier analysis is the study of the Fourier transform, Fourier series, and related concepts.
2. The Fourier transform, Fourier series, and several related concepts are just special cases of constructions from representation theory (writing a conjugacy-invariant function on a group as a linear combination of characters).
3. Fourier analysis is not just a special case of representation theory—not even close.
4. These might at first sound contradictory, but they really are not. Of course there is
some subtlety in “related concepts,” but that is not really the fundamental problem.
Consider the following similar set of true statements:
a. Number theory is the study of the integers (and related concepts).
b. The integers are just a special case of some construction from category theory (the
initial object in some category of rings).
c. The integers are also just a special case of some construction from group theory
(the endomorphism algebra of the free Abelian group on one generator).
d. The integers are also a special case from set theory, model theory, etc.
e. But number theory is not a special case of any of those fields.
Even worse, number theory is not even the only field of mathematics devoted to studying
the integers—much of combinatorics is as well.
The fundamental problem is not that number theorists bring in additional concepts
like number fields, Galois groups, and modular forms. They do, but the issue arises even
when working with purely elementary statements like the four-square theorem. What
does this mean in the case of Fourier analysis? One question you might study in Fourier
analysis is whether the Fourier transform exists from one space of functions (maybe an
Lp space) to another. Now to define the Fourier transform on this space (i.e., to uniquely
characterize it) one just needs to know what it is on some dense set—the smooth com-
pactly supported functions probably work. The Fourier transform on smooth compactly
supported functions is not very hard to set up, and it is a special case of a construction
from representation theory, as well as being a special case of a construction from integral
calculus, and probably many other fields.
In some sense, because of this uniqueness property, everything about the Fourier
transform on R in all its incarnations is determined by just this restriction to smooth com-
pactly supported functions, which is almost a purely algebraic object as one needs very
little analysis to define it. Furthermore, the styles of argument and thought typical in representation theory are not so helpful for reasoning about Lp norms. All of these concepts fall under the umbrella of Fourier analysis. In the Fourier series approximation, periodic functions are represented as a sum of simple sine and cosine waves. The Fourier transform is an extension of the Fourier series; it is used when the period of the given function is lengthened and approaches infinity.
2.12 Fourier transform
The Fourier integral representation of f(x) has the form

f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(t) e^{is(t - x)}\,dt\,ds = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-isx} \left[ \int_{-\infty}^{\infty} f(t) e^{ist}\,dt \right] ds.

If

F(s) = \int_{-\infty}^{\infty} f(t) e^{ist}\,dt,

then

f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} F(s) e^{-isx}\,ds.
−∞
The function F(s) is called the Fourier transform of the function f(x), and the function f(x) itself is called the inverse Fourier transform of F(s). The Fourier transform obeys linearity, the similarity theorem (change of scale property), and shifting and modulation properties. The Fourier transform is applicable to boundary value problems occurring in the mathematical, physical, and engineering sciences, such as heat conduction, the vibration of strings, and so on. In two-dimensional problems it is sometimes necessary to apply the transform twice, and the required solution is obtained by double inversion.
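The transform pair above can be checked numerically; the following illustrative Python sketch (not part of the original text) uses the convention F(s) = ∫ f(t)e^{ist} dt on the Gaussian f(t) = exp(−t²/2), whose transform is known in closed form.

```python
import numpy as np

# Check F(s) = ∫ f(t) e^{ist} dt for the Gaussian f(t) = exp(-t^2/2),
# whose transform is F(s) = sqrt(2*pi) * exp(-s^2/2).
t = np.linspace(-20.0, 20.0, 400001)
ft = np.exp(-t ** 2 / 2)

def F(s):
    y = ft * np.exp(1j * s * t)
    return np.sum((y[1:] + y[:-1]) / 2 * np.diff(t))  # trapezoidal rule

print(F(0.0).real)  # ~ sqrt(2*pi) = 2.5066...
print(F(1.0).real)  # ~ sqrt(2*pi)*exp(-1/2) = 1.5203...
```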
2.13 Gibbs phenomenon
J. Willard Gibbs, an American physicist, studied the peculiar behavior of the Fourier series near a point of discontinuity of the developed function. The following are compared:
i. The function f(t)
ii. The sequence of partial sums sn(t)
iii. The Nörlund mean Np of the partial sums of a Fourier series
iv. The sequence of averages (i.e., the σ_n^1(t)-means or (C, 1) means)
Let the Nörlund mean of the partial sums of a Fourier series be denoted by Np. The behavior of the Nörlund mean is better than that of the sequence of partial sums sn(t). Similarly, the σ_n^1(t)-means or (C, 1) means also behave better than sn(t) for the following function:
f(t) = \begin{cases} -1, & -\pi \le t < 0, \\ \ \ \,1, & \phantom{-}0 \le t \le \pi, \end{cases} \qquad f(t + 2\pi) = f(t) \text{ for all real values of } t.

Its trigonometric Fourier series is

\frac{2}{\pi} \sum_{n=1}^{\infty} \frac{1 - (-1)^n}{n} \sin nt, \quad -\pi \le t \le \pi.
The nth Cesàro sum σ_n^1(t) for the trigonometric Fourier series is given by

\sigma_n^1(t) = \frac{2}{\pi} \sum_{k=1}^{n} \left( 1 - \frac{k}{n} \right) \frac{1 - (-1)^k}{k} \sin(kt),

where δ = 1.
For the trigonometric Fourier series, the Nörlund mean of the partial sums s_n(f; t) (with p_n = n + 1) is given by

t_n^N(f;t) = \frac{2}{(n + 1)(n + 2)} \sum_{k=0}^{n} (n - k + 1)\, s_k(f;t).
In the interval [−π, π], one can observe that sn(t) converges to f(t), but its rate of convergence is very slow. The rate of convergence of σ_n^1(t) and t_n^N(f; t) toward f(t) is higher than that of sn(t). Near the points of discontinuity (−π, 0, and π), as n increases, the peaks of s5 and s10 move closer to the line passing through the points of discontinuity (Gibbs phenomenon), but for n = 5, 10 the peaks of the graphs of σ_n^1(t) and t_n^N(f; t) become flatter. Hence, an overshoot or undershoot of the Fourier series, and of other series of eigenfunctions, at simple points of discontinuity is the Gibbs phenomenon; that is, near the point of discontinuity the rate of convergence of the trigonometric Fourier series is very slow. For the various summable means of the trigonometric Fourier series of the function f(t) (obtained by using various summability methods for approximation), the overshoot or undershoot is smoothed out very effectively. Hence, t_n^N(f; t) and σ_n^1(t) are better approximants than sn(t), and one can observe that the Np method is stronger than the (C, 1) method.
The graph in Figure 2.4 implies that except at the point t0 = 0 (the point of discontinuity of f(t)), the sequence converges to f(t). Gibbs focused on this point and on the behavior of the Fourier partial sums around it.
In the continuous regions (−π < t < 0 and 0 < t < π), the graph looks more and more like that of the original function as the number of Fourier coefficients increases. But the amplitude of the wiggles remains constant near the point of discontinuity (around the origin). Hence, the
Figure 2.4 For n = 5 and n = 10: f(t) (A), s_n(t) (B), σ_n^1(t) (C), and t_n^N(f; t) (D).
partial sums of the trigonometric Fourier series will not smoothly converge to the mean
value at the points of discontinuity.
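This behavior can be reproduced numerically; the following illustrative Python sketch (not part of the original text) compares a partial sum of the square-wave series with its (C, 1) mean near the jump at t = 0.

```python
import numpy as np

# Gibbs phenomenon for the square wave f(t) = -1 on [-pi, 0), +1 on [0, pi]:
# the partial sums overshoot near t = 0 by a fixed amount (~1.179), while
# the (C, 1) (Fejer) means stay below the jump value 1.
t = np.linspace(1e-4, 0.5, 5000)  # points just to the right of the jump

def s_n(n, t):
    """nth partial sum of (2/pi) * sum_k (1-(-1)^k)/k * sin(kt)."""
    k = np.arange(1, n + 1)
    coef = (2 / np.pi) * (1 - (-1.0) ** k) / k
    return np.sin(np.outer(t, k)) @ coef

n = 100
partial = s_n(n, t)
fejer = np.mean([s_n(m, t) for m in range(n + 1)], axis=0)  # (C, 1) mean
print(partial.max())  # ~1.179: the overshoot does not shrink with n
print(fejer.max())    # < 1: the Cesaro mean suppresses the overshoot
```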
2.15 Summability
In 1890, Cesàro dealt with the summation of certain divergent series and defined Cesàro summation (summability methods).
There exist three types of summability:
i. Ordinary summability
ii. Absolute summability
iii. Strong summability
2.15.1 Ordinary summability
Let T = (a_{n,k}) be an infinite matrix. Then

t_n = \sum_{k=0}^{\infty} a_{n,k}\, s_k, \quad n = 0, 1, 2, \ldots,

defines the matrix transform of the sequence \{s_n\}. Here the column vector of the t_n is the product of the matrix T with the column vector of the s_n. The sequence \{s_n\} or the series \sum u_n is said to be matrix summable to s if \lim_{n \to \infty} t_n = s.
2.15.2 Absolute summability
The series is absolutely summable to s if

\lim_{n \to \infty} t_n = s

and the sequence of means \{t_n\} is of bounded variation, i.e.,

\sum_{n=1}^{\infty} |t_n - t_{n-1}| < \infty.
Absolute summability of index q: The infinite series \sum_{n=0}^{\infty} a_n with the sequence of partial sums \{s_n\} is absolutely summable with index q to s if

t_n \to s \text{ as } n \to \infty \quad \text{and} \quad \sum_{k=1}^{\infty} k^{q-1} |t_k - t_{k-1}|^q < \infty.

It is denoted by |A|_q.

2.15.3 Strong summability
The series is strongly summable with index q to s if

\sum_{k=1}^{n} k^{q-1} |t_k - t_{k-1}|^q = O(n) \text{ as } n \to \infty, \quad \text{and} \quad t_n \to s \text{ as } n \to \infty.

It is denoted by [A]_q.
The following inclusion relations hold:

|A|_q \subset [A]_q \subset (A).
2.16 Methods for summability
These are some summability methods:
i. (C, 1) means, when a_{n,k} = \frac{1}{n+1}, \; 0 \le k \le n.
ii. Harmonic means, when a_{n,k} = \frac{1}{(n-k+1)\log n}, \; 0 \le k \le n.
iii. Cesàro (C, δ) means, when a_{n,k} = \frac{E_{n-k}^{\delta-1}}{E_n^{\delta}}, \; 0 \le k \le n.
iv. Nörlund means, when a_{n,k} = \frac{p_{n-k}}{P_n}, \; 0 \le k \le n.
v. Riesz means, when a_{n,k} = \frac{p_k}{P_n}, \; 0 \le k \le n.
vi. General Nörlund (N, p, q) means, when a_{n,k} = \frac{p_{n-k} q_k}{R_n}, where R_n = \sum_{k=0}^{n} p_k q_{n-k}.
vii. Deferred Cesàro means: Agnew defined the deferred Cesàro mean of the sequence x = (x_k) by

(D_{p,q})_n = \frac{1}{q(n) - p(n)} \sum_{k=p(n)+1}^{q(n)} x_k,

where p(n) < q(n) and \lim_{n \to \infty} q(n) = \infty.
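Two of the methods above can be demonstrated on a classical divergent series; an illustrative Python sketch (not part of the original text):

```python
import numpy as np

# The divergent series sum_{n>=0} (-1)^n has partial sums 1, 0, 1, 0, ...;
# the (C, 1) matrix a_{n,k} = 1/(n+1) sums it to 1/2, and the Norlund mean
# with p_n = n + 1 (so a_{n,k} = p_{n-k}/P_n) gives the same limit.
s = np.array([1.0 if k % 2 == 0 else 0.0 for k in range(200)])
n = len(s) - 1

c1 = s.mean()                           # (C, 1) transform t_n

p = np.arange(1.0, n + 2)               # p_0, ..., p_n = 1, 2, ..., n + 1
norlund = float(p[::-1] @ s / p.sum())  # a_{n,k} = p_{n-k} / P_n

print(c1, norlund)  # both ~ 0.5
```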
2.17 Regularity condition
The summability matrix T is regular if

\lim_{n \to \infty} s_n = s \;\Rightarrow\; \lim_{n \to \infty} t_n = s.

Toeplitz and Silverman (1913) obtained the following necessary and sufficient conditions for the regularity of the matrix T:

1. \sum_{k=0}^{n} |a_{n,k}| \le M, where M (a finite constant) is independent of n.
2. \lim_{n \to \infty} a_{n,k} = 0, \ \forall k.
3. \lim_{n \to \infty} \sum_{k=0}^{n} a_{n,k} = 1.
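These three conditions are easy to verify for a concrete matrix; an illustrative Python sketch (not part of the original text) checks them for the (C, 1) matrix:

```python
# Check of the Toeplitz-Silverman regularity conditions for the
# (C, 1) matrix a_{n,k} = 1/(n+1), 0 <= k <= n.
def a(n, k):
    return 1.0 / (n + 1) if 0 <= k <= n else 0.0

row_sums = [sum(abs(a(n, k)) for k in range(n + 1)) for n in range(200)]
print(max(row_sums))   # condition 1: bounded by M = 1
print(a(10 ** 6, 3))   # condition 2: each column entry tends to 0
print(row_sums[-1])    # condition 3: row sums tend to 1
```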
2.18 Norm
A function that assigns a strictly positive length or size to each nonzero vector in a vector space is known as a norm. A function p: V → R is a norm on V if it satisfies the following properties (∀a ∈ R and u, v ∈ V):

1. p(u + v) ≤ p(u) + p(v) (triangle inequality)
2. p(av) = |a| p(v)
3. If p(v) = 0, then v = 0.

In the analysis of the Fourier series, the importance of the Lp norm cannot be ignored, as it is an essential tool. The condition p → ∞ gives the essential upper bound of the Lp
norm, and the Lp behavior represents the Lipschitz behavior as p → ∞. Hence, by replacing the power function with more general classes of functions, the results on Fourier series can be generalized.
3. L2-norm (Euclidean norm): It is the square root of the sum of squared differences,

\|x_1 - x_2\|_2 = \sqrt{\sum_i (x_{1i} - x_{2i})^2},

and has wide applicability in the signal processing field for mean-squared error (MSE) measurement.
4. Lp-norm: It is given by

\|x_1 - x_2\|_p = \left( \sum_i |x_{1i} - x_{2i}|^p \right)^{1/p}, \quad 1 \le p < \infty.
2.19 Modulus of continuity
The modulus of continuity ω(f, δ) of a continuous function f on [a, b] is defined by

\omega(f, \delta) = \sup_{|y - x| \le \delta} \left\{ |f(x) - f(y)|,\; x, y \in [a, b] \right\}.

Let f ∈ Lp[a, b], p ≥ 1; then the function ωp(f, δ), called the integral modulus of continuity, is defined by

\omega_p(f, \delta) = \sup_{0 < t \le \delta} \left( \int_a^b |f(x + t) - f(x)|^p \, dx \right)^{1/p}.
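The first definition can be estimated by brute force on a grid; an illustrative Python sketch (not part of the original text), for f(x) = x² on [0, 1], where analytically ω(f, δ) = 2δ − δ²:

```python
import numpy as np

# Brute-force estimate of omega(f, delta) = sup_{|x-y|<=delta} |f(x)-f(y)|
# for f(x) = x**2 on [0, 1]; analytically omega(f, delta) = 2*delta - delta**2.
def modulus(f, a, b, delta, n=2001):
    x = np.linspace(a, b, n)
    fx = f(x)
    best = 0.0
    for i, xi in enumerate(x):
        mask = (x >= xi) & (x <= xi + delta)  # pairs with y >= x suffice
        best = max(best, float(np.max(np.abs(fx[mask] - fx[i]))))
    return best

print(modulus(lambda x: x ** 2, 0.0, 1.0, 0.1))  # ~ 2*0.1 - 0.01 = 0.19
```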
2.20 Lipschitz condition
Let f(x) be defined on an interval I and suppose we can find two positive constants M and α such that

|f(x_1) - f(x_2)| \le M |x_1 - x_2|^{\alpha}

for all x_1, x_2 \in I. Then f is said to satisfy a Lipschitz condition of order α.
www.Technicalbookspdf.com
Chapter two: Fourier series and its applications in engineering 51
2.21 Various Lipschitz classes
Lip (ξ(t), p): For a positive increasing function ξ(t) and an integer p ≥ 1, a signal f ∈ Lip (ξ(t), p) if

\left( \int_a^b |f(x + t) - f(x)|^p \, dx \right)^{1/p} = O(\xi(t)).

W(Lp, ξ(t)): For a given positive increasing function ξ(t), an integer p ≥ 1, and β ≥ 0, f belongs to the weighted class W(Lp, ξ(t)) if

\left( \int_a^b \left\{ |f(x + t) - f(x)| \sin^{\beta} x \right\}^p \, dx \right)^{1/p} = O(\xi(t)).
Note: If β = 0, W(Lp, ξ(t)) coincides with the class Lip (ξ(t), p); if ξ(t) = t^α, Lip (ξ(t), p) reduces to Lip (α, p); and if p → ∞, then Lip (α, p) reduces to Lip α.

The space of all functions f with \int_E |f(x)|^p \, dx < \infty for 1 ≤ p < ∞ is denoted by Lp or Lp(E). If E = [a, b] is an interval of finite length, then we write Lp[a, b]. The Lp(E)-space (p ≥ 1) is a Banach space under the norms defined by

\|f\|_p = \left( \int_a^b |f(x)|^p \, dx \right)^{1/p} \text{ for } 1 \le p < \infty, \quad \text{and} \quad \|f\|_{\infty} = \sup\left\{ |f(x)| : x \in [a, b] \right\}.
2.22 Degree of approximation
A major portion of the study of the theory of signals (functions) is concerned with the connections between the structural properties of a function and its degree of approximation. The objective is to relate the smoothness of the function (via trigonometric Fourier approximation) to the rate at which the degree of approximation decreases to zero. This chapter discusses the trigonometric approximation of a function (signal) and the concepts needed to find the degree of approximation using Fourier series. Trigonometric approximation is the most classical setting, where the results are the most penetrating and satisfying. One of the basic problems in the theory of Fourier series is to examine the degree of approximation obtained by certain methods. In this sense, one of the important results is the following. Quade [1] solved a problem related to approximation by trigonometric polynomials using Nörlund summability in the Lp norm.
Theorem 2.1 [1]: Let f ∈ Lip(α, p), 0 < α ≤ 1. Then
$$\| f - \sigma_n(f) \|_p = O(n^{-\alpha})$$
for either p > 1 and 0 < α ≤ 1, or p = 1 and 0 < α < 1. And if p = α = 1, then
$$\| f - \sigma_n(f) \|_1 = O(n^{-1} \log(n + 1)).$$
Chandra [2] improved the result of [1] and proved the following:
Theorem 2.2 [2]: Let f ∈ Lip(α, p) and let (pn) be positive such that
$$(n + 1)p_n = O(P_n).$$
If either … or … holds, then
$$\| f - N_n(f) \|_p = O(n^{-\alpha}).$$
These are very important basic results on the degree of approximation and a
motivation for researchers working in this area. Subsequently, several mathematicians
studied the degree of approximation of signals belonging to various classes by using
different summability techniques, for example, Chandra [3,4], Khan [5,6], Mursaleen and
Mohiuddine [7], Mishra and Mishra [8], Chen and Hong [9], Mishra et al. [10–12], Chen
and Jeng [13], Mishra [14,15], and Alexits [16]. Bor [17–21] gave a number of theorems
dealing with summability factors of series and provided many applications.
Recently, Sonker and Munjal [22–29] gave a number of theorems exploring the appli-
cations of summability and absolute summability of Fourier and infinite series.
Many engineering problems can be solved using summability methods. The (C, 1)
and (C, 2) means can be used to increase the rate of convergence and to suppress the Gibbs
phenomenon. Analysis of signals or time functions is of great importance for obtaining
information about a system or process. Psarakis and Moustakides [30]
presented a method for designing finite impulse response (FIR) digital filters.
This process transmits the vibrations of the music to the air and amplifies the vibrations
of the air. The human eardrum senses the air pressure fluctuations, and the human brain
converts them into electrical signals.
For the study of two different musical instruments, (a) the flute and (b) the violin, the graphs
are plotted in Figure 2.5. For the sustained note D (294 vibrations per second), the graphs
of the waveforms show the difference between the flute and the violin: the flute waveform is simpler
than that of the violin.
Fourier series approximation of this music is expressed as
$$P(t) = a_0 + a_1 \cos\frac{\pi t}{L} + b_1 \sin\frac{\pi t}{L} + a_2 \cos\frac{2\pi t}{L} + b_2 \sin\frac{2\pi t}{L} + \cdots$$
The sum of simple pure sounds is used for the expression of the Fourier coefficients, which
have different values corresponding to the different musical instruments.
The nth term,
$$a_n \cos\frac{n\pi t}{L} + b_n \sin\frac{n\pi t}{L},$$
is called the nth harmonic of P. Its amplitude is
$$A_n = \sqrt{a_n^2 + b_n^2},$$
and the energy of the nth harmonic is its square, $A_n^2 = a_n^2 + b_n^2$.
It can be observed again that the flute waveform is quite simple in comparison to the violin
waveform. In the violin, the higher harmonics are very strong, but the energy of the flute's
harmonics decreases very fast.
Hence, the trigonometric Fourier series is very useful in expressing the sounds of musi-
cal instruments: complex musical sounds can be made of a combination of various pure
sounds.
[Figure: harmonic energies $A_n^2$ plotted against n = 0, 2, 4, 6, 8, 10 for (a) the flute and (b) the violin.]
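The coefficients a_n, b_n, and hence the energies A_n^2, can be estimated from samples of a recorded waveform. The sketch below assumes midpoint samples over one period [−L, L] and uses a helper name of ours, `harmonic_energies`; the test signal is a pure 3rd harmonic, so all the energy should land at n = 3:

```python
import math

def harmonic_energies(samples, L, n_max):
    """Estimate a_n, b_n of P(t) on [-L, L] with a midpoint rule and return
    the harmonic energies A_n^2 = a_n^2 + b_n^2 for n = 1..n_max."""
    N = len(samples)
    dt = 2 * L / N
    ts = [-L + (i + 0.5) * dt for i in range(N)]
    energies = []
    for n in range(1, n_max + 1):
        a_n = sum(p * math.cos(n * math.pi * t / L) for t, p in zip(ts, samples)) * dt / L
        b_n = sum(p * math.sin(n * math.pi * t / L) for t, p in zip(ts, samples)) * dt / L
        energies.append(a_n ** 2 + b_n ** 2)
    return energies

# A pure tone at the 3rd harmonic: all the energy sits at n = 3.
L, N = 1.0, 1000
ts = [-L + (i + 0.5) * (2 * L / N) for i in range(N)]
tone = [math.sin(3 * math.pi * t / L) for t in ts]
print([round(e, 3) for e in harmonic_energies(tone, L, 5)])  # [0.0, 0.0, 1.0, 0.0, 0.0]
```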
Summability techniques are employed to minimize the error. With the use of a summa-
bility technique, the output of the signals (found by Fourier approximation) can be
made stable and bounded, and can be used to predict the behavior of the input data, the initial
situation, and the changes in the complete process.
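The stabilizing effect of a summability method on a Fourier approximation can be illustrated with the (C, 1) (Fejér) means of a square wave; since the Fejér kernel is nonnegative, the means never overshoot the jump, while the raw partial sums exhibit the Gibbs overshoot:

```python
import math

def square_partial(x, n):
    """Partial sum s_n of the Fourier series of the square wave
    f(x) = 1 on (0, pi), -1 on (-pi, 0): only odd harmonics appear."""
    return sum(4 / (math.pi * k) * math.sin(k * x) for k in range(1, n + 1, 2))

def fejer_mean(x, n):
    """(C, 1) mean: the average of the partial sums s_0, ..., s_(n-1)."""
    return sum(square_partial(x, m) for m in range(n)) / n

xs = [i * math.pi / 400 for i in range(1, 400)]        # sample points in (0, pi)
overshoot_sn = max(square_partial(x, 49) for x in xs)  # Gibbs overshoot, about 1.18
overshoot_cn = max(fejer_mean(x, 49) for x in xs)      # stays below the jump height 1
print(overshoot_sn > 1.08, overshoot_cn < 1.0)  # True True
```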
References
1. E. S. Quade, “Trigonometric approximation in mean,” Duke Mathematical Journal, vol. 3,
pp. 529–542, 1937.
2. P. Chandra, “Trigonometric approximation of functions in Lp-norm,” Journal of Mathematical
Analysis and Applications, vol. 275, pp. 13–26, 2002.
28. S. Sonker and A. Munjal, “Absolute Nörlund summability |N; pn|k of improper integrals,”
National Conference on Recent Advances in Mechanical Engineering (NCRAME-2017), vol. II, no. 90,
pp. 413–415, ISBN: 978-93-86256-89-8, 2017.
29. S. Sonker, Xh. Z. Krasniqi, and A. Munjal, “A note on absolute Cesàro ϕ − C, 1; δ ; l k summabil-
ity factor,” International Journal of Analysis and Applications, vol. 15, no. 1, pp. 108–113, 2017.
30. E. Z. Psarakis and G. V. Moustakides, “An L2-based method for the design of 1-D zero phase
FIR digital filters,” IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications,
vol. 44, no. 7, pp. 591–601, 1997.
31. M. I. Gil’, “Estimates for entries of matrix valued functions of infinite matrices,” Mathematical
Physics Analysis and Geometry, vol. 11, no. 2, pp. 175–186, 2008.
chapter three
Soft computing techniques and applications
Contents
3.1 Introduction: Soft computing.............................................................................................. 58
3.2 Fuzzy logic............................................................................................................................. 58
3.2.1 Evolution of fuzzy logic........................................................................................... 59
3.3 Fuzzy sets............................................................................................................................... 59
3.3.1 Equal fuzzy sets........................................................................................................ 59
3.3.2 Membership function............................................................................................... 60
3.3.2.1 Z-Shaped membership function.............................................................. 60
3.3.2.2 Triangular membership function............................................................ 60
3.3.2.3 Trapezoidal membership function.......................................................... 60
3.3.2.4 Gaussian membership function............................................................... 60
3.4 Fuzzy rule base system........................................................................................................ 61
3.5 Fuzzy defuzzification........................................................................................................... 61
3.5.1 Center of area (CoA) method.................................................................................. 61
3.5.2 Max-membership function...................................................................................... 61
3.5.3 Weighted average method....................................................................................... 61
3.5.4 Mean-max method................................................................................................... 62
3.5.5 Center of sums........................................................................................................... 62
3.6 Comparison of crisp to fuzzy............................................................................................. 62
3.7 Examples of uses of fuzzy logic..........................................................................................63
3.8 Artificial neural networks...................................................................................................63
3.8.1 Artificial neurons......................................................................................................64
3.8.2 Firing rule..................................................................................................................64
3.8.3 Different types of neural networks........................................................................64
3.8.3.1 Feedback ANN...........................................................................................65
3.8.3.2 Feed-forward ANN...................................................................................65
3.8.3.3 Classification-prediction ANN................................................................65
3.9 Training of neural networks...............................................................................................65
3.9.1 Supervised training..................................................................................................65
3.9.2 Unsupervised training.............................................................................................65
3.9.3 Reinforced training.................................................................................................. 66
3.2 Fuzzy logic
Life is full of uncertainties; information can be vague or imprecise. To deal with such uncertainties,
probability theory, which is based on classical set theory, used to be the mathematician's tool.
In 1965, Zadeh [1] argued that there are some uncertainties which are out of
the scope of probability theory. For example, a company owner needs an honest person
for his company. The available choices may be extremely honest, very hon-
est, honest some of the time, or dishonest, and these cannot be defined using classical logic,
because in this logic there are only two choices: honest and dishonest. Zadeh named this
new concept, based on membership functions, fuzzy set theory. Classical set theory is
about yes-or-no concepts, whereas fuzzy set theory includes the gray part also. Fuzzy set the-
ory deals with approximate reasoning in linguistic terms. Fuzzy logic is the logic that deals mathematically
with the imprecise information usually employed by humans; it is a multivalued
logic that extends the Boolean logic usually employed in computer science. Fuzzy logic is based
on the concept of a logic with many degrees that allows intermediate values to be
defined between conventional opposite evaluations like true or false, high or low, hot or
cold, etc. The fuzzy concept, introduced by Zadeh in 1965, models uncertainty so as to generate
decisions by human reasoning [1–3]. Fuzzy logic is a variety of multivalued logic which is
derived from fuzzy set theory. This logic deals with reasoning that is approximate rather
than exact; a fuzzy logic system works with vague concepts as well. In fuzzy logic,
Chapter three: Soft computing techniques and applications 59
“I think that… ”
“chances are… ”
“it is unlikely that… ”
and so forth.
The fuzzy expression contains a fuzzy proposition with its truth value in the interval [0,1].
It represents a mapping from [0,1] to [0,1].
3.3 Fuzzy sets
A fuzzy set is represented by pairs of two components: the first is the member m and the
second is its membership grade µF(m), which maps any element m of the universe of discourse
M to the membership space [0,1], as given below:
$$\mu_F : M \to [0, 1].$$
3.3.2 Membership function
A function that describes the membership grades of elements in a fuzzy set is said to be
a membership function. A membership function can be discrete or continuous. It needs a
uniform membership function representation for efficiency. Some well-known member-
ship functions are discussed in the following sections.
3.3.2.1 Z-shaped membership function
$$Z(x; p, q) = \begin{cases} 1, & \text{if } x \le p \\ 1 - 2\left(\dfrac{x - p}{q - p}\right)^2, & \text{if } p < x \le (p + q)/2 \\ 2\left(\dfrac{x - q}{q - p}\right)^2, & \text{if } (p + q)/2 < x \le q \\ 0, & \text{otherwise} \end{cases} \quad (3.2)$$

3.3.2.2 Triangular membership function
$$T(x; p, q, r) = \begin{cases} \dfrac{x - p}{q - p}, & \text{if } p < x \le q \\ \dfrac{r - x}{r - q}, & \text{if } q < x \le r \\ 0, & \text{otherwise} \end{cases} \quad (3.3)$$

3.3.2.3 Trapezoidal membership function
$$T(x; p, q, r, s) = \begin{cases} \dfrac{x - p}{q - p}, & \text{if } p < x \le q \\ 1, & \text{if } q < x \le r \\ \dfrac{s - x}{s - r}, & \text{if } r < x \le s \\ 0, & \text{otherwise} \end{cases} \quad (3.4)$$

3.3.2.4 Gaussian membership function
$$G(x; \sigma, m) = e^{-\frac{1}{2}\left(\frac{x - m}{\sigma}\right)^2}. \quad (3.5)$$
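The four membership functions can be sketched directly in Python (parameter names follow equations (3.2)–(3.5); the descending half of the Z-shape is taken in its standard form with (x − q)):

```python
import math

def z_shaped(x, p, q):
    """Z-shaped membership (3.2): 1 up to p, smooth descent to 0 at q."""
    if x <= p:
        return 1.0
    if x <= (p + q) / 2:
        return 1.0 - 2.0 * ((x - p) / (q - p)) ** 2
    if x <= q:
        return 2.0 * ((x - q) / (q - p)) ** 2
    return 0.0

def triangular(x, p, q, r):
    """Triangular membership (3.3), peaking at q."""
    if p < x <= q:
        return (x - p) / (q - p)
    if q < x <= r:
        return (r - x) / (r - q)
    return 0.0

def trapezoidal(x, p, q, r, s):
    """Trapezoidal membership (3.4): flat top between q and r."""
    if p < x <= q:
        return (x - p) / (q - p)
    if q < x <= r:
        return 1.0
    if r < x <= s:
        return (s - x) / (s - r)
    return 0.0

def gaussian(x, sigma, m):
    """Gaussian membership (3.5), centered at m with width sigma."""
    return math.exp(-0.5 * ((x - m) / sigma) ** 2)

print(triangular(0.5, 0.0, 1.0, 2.0))        # 0.5
print(trapezoidal(1.5, 0.0, 1.0, 2.0, 3.0))  # 1.0
print(z_shaped(0.5, 0.0, 1.0))               # 0.5
print(gaussian(0.0, 1.0, 0.0))               # 1.0
```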
3.5 Fuzzy defuzzification
It is quite difficult to make a decision on the basis of a fuzzy output; in that case the fuzzy
output is converted into a crisp value. This process of converting fuzzy output into crisp
output is known as defuzzification [10]. Different methods are available in the literature;
some widely used methods are discussed in the following sections.
3.5.1 Center of area (CoA) method
$$m^* = \frac{\int \mu(m)\, m \, dm}{\int \mu(m)\, dm} \ \text{(continuous membership values)}, \qquad m^* = \frac{\sum \mu(m)\, m}{\sum \mu(m)} \ \text{(discrete membership values)}. \quad (3.6)$$
3.5.2 Max-membership function
This method is also called the height method and is applicable to peaked output functions.
The defuzzified value m* is taken at a point of maximum membership,
$$\mu(m^*) \ge \mu(m) \quad \text{for all } m. \quad (3.7)$$

3.5.3 Weighted average method
This method is applicable to symmetric output membership functions. Its expression is given by
$$m^* = \frac{\sum \mu(m)\, m}{\sum \mu(m)}, \quad (3.8)$$
where m is the centroid of each symmetric membership function. This method is compu-
tationally efficient but less popular.
3.5.4 Mean-max method
This method is similar to the max-membership method; the only difference is that there
can be more than one location of maximum membership. The expression is given by
$$m^* = \frac{m_1 + m_2}{2}, \quad (3.9)$$
where m1 and m2 are the endpoints of the interval of maximum membership.
3.5.5 Center of sums
This method is based on the algebraic sum of the fuzzy subsets and is very fast in
terms of calculation. The defuzzified value is given by
$$m^* = \frac{\displaystyle\sum_{i=1}^{N} m_i \sum_{k=1}^{n} \mu(m_i)}{\displaystyle\sum_{i=1}^{N} \sum_{k=1}^{n} \mu(m_i)}. \quad (3.10)$$
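Two of the defuzzification rules above, the discrete center-of-area rule (3.6) and the mean-max rule (3.9), can be sketched as follows (the function names are ours):

```python
def centroid_defuzzify(values, memberships):
    """Discrete center-of-area rule (3.6): m* = sum(mu*m) / sum(mu)."""
    return (sum(mu * m for m, mu in zip(values, memberships))
            / sum(memberships))

def mean_max_defuzzify(values, memberships):
    """Mean-max rule (3.9): midpoint of the locations of maximum membership."""
    top = max(memberships)
    maximizers = [m for m, mu in zip(values, memberships) if mu == top]
    return (min(maximizers) + max(maximizers)) / 2

vals = [0.0, 1.0, 2.0, 3.0, 4.0]
mus = [0.1, 0.4, 0.9, 0.9, 0.2]
print(round(centroid_defuzzify(vals, mus), 3))  # 2.28
print(mean_max_defuzzify(vals, mus))            # 2.5
```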
3.7 Examples of uses of fuzzy logic
• Foam detection
• Imbalance compensation
• Water level adjustment
• Washing machine
• Food cookers
• Taking blood pressure
• Determination of “socioeconomic class”
• Cars
[Figure: supervised learning scheme — the neural network, including the connections between neurons (called weights), maps the input to an output; the output is compared with the target, and the weights are adjusted.]
3.8.1 Artificial neurons
McCulloch and Pitts gave the concept of the simplest neuron, also known as a threshold
logic unit (TLU). In this model the inputs and outputs are taken as binary values [11]. An input
gets activated with the help of other neurons; the synaptic weights are then summed and
compared with the threshold value. If this value is more than the threshold, the particu-
lar neuron gets fired (i.e., it is activated). If it is less than the threshold, the neuron will not
be activated.
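A minimal sketch of such a TLU (here a tie with the threshold is counted as firing, one common convention):

```python
def tlu(inputs, weights, threshold):
    """McCulloch-Pitts threshold logic unit: the weighted sum of binary
    inputs is compared with the threshold (a tie counts as firing here)."""
    total = sum(i * w for i, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

# With unit weights and threshold 2, the TLU realizes a two-input AND gate.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", tlu((a, b), (1, 1), threshold=2))
```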
3.8.2 Firing rule
One of the main concepts of ANN is the firing rule. A firing rule decides whether the neu-
ron will get activated or not with any input pattern. Hamming distance technique is one
of the basic and simple firing techniques. This technique is widely used due to its simple
calculations. There are two types of training patterns for a node: the first, which are responsible
for the firing of a neuron, are called the 1-taught set of inputs; the second, which oppose
firing, are called the 0-taught set of inputs.
Let K = (k1, k2, …, kn) and L = (l1, l2, …, ln); then the Hamming distance is given by
$$H = \sum_i |k_i - l_i|.$$
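A sketch of the Hamming-distance firing rule under the stated convention (fire when the input is strictly closer to the 1-taught set; a tie is left undecided here — the taught sets below are hypothetical):

```python
def hamming(k, l):
    """H = sum_i |k_i - l_i| for two binary patterns of equal length."""
    return sum(abs(a - b) for a, b in zip(k, l))

def fires(pattern, taught_1, taught_0):
    """Hamming-distance firing rule: fire when the nearest 1-taught pattern
    is strictly closer than the nearest 0-taught one (None on a tie)."""
    d1 = min(hamming(pattern, t) for t in taught_1)
    d0 = min(hamming(pattern, t) for t in taught_0)
    if d1 == d0:
        return None
    return 1 if d1 < d0 else 0

taught_1 = [(1, 1, 1), (1, 0, 1)]  # inputs that should fire the neuron
taught_0 = [(0, 0, 0), (0, 0, 1)]  # inputs that should keep it silent
print(fires((1, 1, 0), taught_1, taught_0))  # 1: closer to the 1-taught set
print(fires((0, 0, 1), taught_1, taught_0))  # 0: it is itself a 0-taught input
```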
3.8.3 Different types of neural networks
3.8.3.1 Feedback ANN
In a feedback network, information can also flow backward through the network. Such
networks are applicable to error correction within the system.
3.8.3.2 Feed-forward ANN
It is a neural network containing
i. An input layer
ii. An output layer
iii. One or more hidden layers of neurons
3.8.3.3 Classification-prediction ANN
Classification-prediction ANN identifies particular patterns and assigns them to spe-
cific groups.
3.9 Training of neural networks
3.9.1 Supervised training
Both inputs and outputs are given in supervised training. The inputs are processed through
the network and the outcome is compared with the desired output; the difference between the
actual and desired outputs is then reduced by adjusting the weights. This process is repeated
until the error is optimized. The data used in the adjustment of weights are
called training data. This learning is similar to classroom learning, where a teacher is always
there to correct the mistakes of students; thus it is sometimes referred to as supervised learn-
ing. If the input data lack some precise knowledge, then training may not be possible for
the particular network. For good learning an appropriately sized data set is required;
otherwise the network may not converge. Under standard conditions we divide the data set
into three parts: one for training, one for testing, and the last part for validation. For a
successful network it is necessary to examine all the basic parts again and again (e.g., the num-
ber of hidden layers, connection weights, etc.). The adjustment by feedback is most often done
by the popular backpropagation technique. Finally, when the network is trained appropriately,
the final weights can be frozen.
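The training loop described above can be sketched with a single perceptron learning the OR function; the delta-rule update shown is one simple instance of adjusting the weights by the error, not the backpropagation algorithm itself:

```python
def train_perceptron(samples, lr=0.1, epochs=50):
    """Supervised-training sketch: present each input, compare the network
    output with the desired target, and adjust the weights by the error."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            out = 1 if w[0] * x1 + w[1] * x2 + b >= 0 else 0
            err = target - out          # difference of actual vs. desired output
            w[0] += lr * err * x1       # weight adjustment (delta rule)
            w[1] += lr * err * x2
            b += lr * err
    return w, b

# Training data: the logical OR function with its desired outputs.
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
w, b = train_perceptron(data)
predict = lambda x1, x2: 1 if w[0] * x1 + w[1] * x2 + b >= 0 else 0
print([predict(x1, x2) for (x1, x2), _ in data])  # [0, 1, 1, 1]
```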
3.9.2 Unsupervised training
In this type of learning only the inputs are given to the network, not the outputs. The
network selects features by grouping the input data; hence, this learn-
ing algorithm is also known as adaptation. In real life there are several situations where exact
training data are not available (e.g., when an army faces new
weapons). Teuvo Kohonen designed a self-organizing neural network that can learn
without a desired output [21]. As this learning algorithm resembles a class of students
where the students learn by themselves, without the help of a teacher, it is sometimes
referred to as learning without a teacher.
3.9.3 Reinforced training
This type of learning is supervised learning with a condition: the teacher works as a guide
who tells whether the output is correct or not, but does not give the actual output
to the network. This learning algorithm is not popular among researchers.
3.11 Genetic algorithms
Optimization is a process of finding the best among the available solutions. Optimization
is needed everywhere, even in our daily routine, when we decide our to-do list or pri-
oritize our tasks for the day [30]. There are many traditional methods available in the
literature to solve optimization problems. These traditional methods have the follow-
ing drawbacks:
The above drawbacks of traditional methods motivate the search for a robust and effi-
cient method. Biological systems are flexible, robust, efficient, and self-guided. Genetic
algorithms are inspired by Darwin's theory of evolution: "survival of the fittest." The
method was developed by John Holland (1975) [30–32].
[Figure: flowchart of the genetic algorithm — Start, Reproduction, Mutation, Gen = Gen + 1, …, End.]
Step 2—Fitness evaluation: To calculate the fitness value of each solution, decoding
is done to get the real value of the solution. This value of the variable is then substituted into
the given objective function to compare the fitness of solutions.
Step 3—Reproduction: Good strings (“fittest”) in a population are selected and
assigned a large number of copies to form a mating pool.
Step 4—Crossover: In this step parents exchange properties.
Step 5—Mutation: The concept of biological mutation is also preserved here. A sud-
den change in the population is made to take the solution out of local optima.
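Steps 2–5 can be sketched as a minimal GA for the illustrative OneMax problem (maximize the number of 1-bits); the binary encoding, tournament reproduction, and parameter values below are our illustrative choices, not the chapter's:

```python
import random

def genetic_algorithm(fitness, n_bits=10, pop_size=20, generations=40,
                      p_cross=0.9, p_mut=0.02, seed=1):
    """Minimal GA following the steps above: binary strings, tournament
    reproduction, one-point crossover, and bit-flip mutation."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        # Reproduction: the fitter of two random strings joins the mating pool.
        pool = [max(rng.choice(pop), rng.choice(pop), key=fitness)
                for _ in range(pop_size)]
        nxt = []
        for i in range(0, pop_size, 2):
            p1, p2 = pool[i][:], pool[i + 1][:]
            if rng.random() < p_cross:          # Crossover: parents swap tails
                cut = rng.randrange(1, n_bits)
                p1[cut:], p2[cut:] = p2[cut:], p1[cut:]
            for child in (p1, p2):              # Mutation: rare bit flips
                for j in range(n_bits):
                    if rng.random() < p_mut:
                        child[j] = 1 - child[j]
                nxt.append(child)
        pop = nxt
        best = max(pop + [best], key=fitness)   # keep the best string found
    return best

# OneMax: fitness is the number of 1-bits; the optimum is the all-ones string.
best = genetic_algorithm(fitness=sum)
print(sum(best))  # close to (typically equal to) n_bits = 10
```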
References
1. L. A. Zadeh, “Fuzzy sets,” Inf. Control, vol. 8, no. 3, pp. 338–353, 1965.
2. L. A. Zadeh, “Outline of a new approach to the analysis of complex systems and decision pro-
cesses,” IEEE Trans. Syst. Man Cybern., vol. 3, no. 1, pp. 28–44, 1973.
3. M. Gr. Voskoglou, “Measuring the uncertainty of human reasoning,” Am. J. Appl. Math. Stat.,
vol. 2, no. 1, pp. 1–6, 2013.
4. C. Lejewski, “Jan Lukasiewicz,” Encycl. Philos., vol. 5, pp. 104–107, 1967.
5. D. C. S. Bisht, M. Raju, and M. Joshi, “Simulation of water table elevation fluctuation using
fuzzy-logic and ANFIS,” Comput. Model. New Tech., vol. 13, no. 2, pp. 16–23, 2009.
6. A. Gupta, and N. Singhal, “Advice generation using fuzzy logic in OMR Pheonix technique,”
Int. J. Comput. Appl., vol. 52, no. 16, pp. 6–10, 2012.
7. E. Egrioglu, U. Yolcu, C. H. Aladag, and C. Kocak, “An ARMA type fuzzy time series forecast-
ing method based on particle swarm optimization,” Math. Probl. Eng., vol. 2013, pp. 1–12, 2013.
8. A. Reigber, My life with Kostas. Unpublished report, Neverending Story Press, 1999.
9. J. M. Mendel, Uncertain Rule-Based Fuzzy Logic Systems: Introduction and New Directions. Prentice
Hall PTR, Upper Saddle River, NJ, 2001.
10. R. R. Yager, and D. P. Filev, Essentials of Fuzzy Modeling and Control. John Wiley & Sons,
New York, 388 pp, 1994.
11. W. S. McCulloch, and W. Pitts, “A logical calculus of the ideas immanent in nervous activity,”
Bull. Math. Biophys., vol. 5, no. 4, pp. 115–133, 1943.
12. D. O. Hebb, “Organization of behavior. New York: Wiley, 1949, pp. 335,” J. Clin. Psychol., vol. 6,
no. 3, pp. 307–307, 1950.
13. M. Dougherty, “A review of neural networks applied to transport,” Transp. Res. Part C Emerg.
Technol., vol. 3, no. 4, pp. 247–260, 1995.
14. A. L. Glass, and K. J. Holyoak, “Alternative conceptions of semantic theory,” Cognition, vol. 3,
no. 4, pp. 313–339, 1974.
15. J. A. Anderson, “A simple neural network generating an interactive memory,” Math. Biosci., vol.
14, no. 3–4, pp. 197–220, 1972.
16. L. Glass, and R. E. Young, “Structure and dynamics of neural network oscillators,” Brain Res.,
vol. 179, no. 2, pp. 207–218, 1979.
17. K. Fukushima, “Cognitron: A self-organizing multilayered neural network,” Biol. Cybern., vol.
20, no. 3–4, pp. 121–136, 1975.
18. I. Jung, L. Koo, and G.-N. Wang, “Two states mapping based neural network model for decreas-
ing of prediction residual error,” Int. J. Ind. Manuf. Eng., vol. 1, no. 7, pp. 322–328, 2007.
19. K. L. Priddy, and P. E. Keller, Artificial Neural Networks: An Introduction, vol. 68. SPIE Press,
Bellingham, WA, 2005.
20. S. Sapna, A. Tamilarasi, and M. P. Kumar, “Backpropagation learning algorithm based on
Levenberg Marquardt Algorithm,” Comp. Sci. Inf. Technol. CS IT, vol. 2, pp. 393–398, 2012.
21. T. Kohonen, ed., Self-Organizing Maps. Springer-Verlag, New York, Secaucus, NJ, 1997.
22. B. Dixon, “Applicability of neuro-fuzzy techniques in predicting ground-water vulnerability:
A GIS-based sensitivity analysis,” J. Hydrol., vol. 309, no. 1, pp. 17–38, 2005.
23. U. Nauck, and R. Kruse, “Design and implementation of a neuro-fuzzy data analysis tool in
Java,” Manual Technical University of Braunschweig Germany, 1999.
24. E. Khan, “Neural fuzzy based intelligent systems and applications,” in L. C. Jain and N. M.
Martin, eds., Fusion of Neural Networks, Fuzzy Sets and Genetic Algorithms: Industrial Applications. CRC
Press, Washington, DC, 1999.
25. J.-S. Jang, “ANFIS: Adaptive-network-based fuzzy inference system,” IEEE Trans. Syst. Man
Cybern., vol. 23, no. 3, pp. 665–685, 1993.
26. J. M. Keller, R. Krishnapuram, and F.-H. Rhee, “Evidence aggregation networks for fuzzy logic
inference,” IEEE Trans. Neural Netw., vol. 3, no. 5, pp. 761–769, 1992.
27. I. N. Aghdam, B. Pradhan, and M. Panahi, “Landslide susceptibility assessment using a novel
hybrid model of statistical bivariate methods (FR and WOE) and adaptive neuro-fuzzy infer-
ence system (ANFIS) at southern Zagros Mountains in Iran,” Environ. Earth Sci., vol. 76, no. 6,
p. 237, 2017.
28. A. M. Ahmed, and S. M. A. Shah, “Application of adaptive neuro-fuzzy inference system
(ANFIS) to estimate the biochemical oxygen demand (BOD) of Surma River,” J. King Saud
Univ.-Eng. Sci., vol. 29, no. 3, pp. 237–243, 2017.
29. A. Karkevandi-Talkhooncheh, S. Hajirezaie, A. Hemmati-Sarapardeh, M. M. Husein, K. Karan,
and M. Sharifi, “Application of adaptive neuro fuzzy interface system optimized with evolu-
tionary algorithms for modeling CO2–crude oil minimum miscibility pressure,” Fuel, vol. 205,
pp. 34–45, 2017.
30. A. Boultif, A. Kabouche, and S. Ladjel, “Application of genetic algorithms (GA) and thresh-
old acceptance (TA) to a ternary liquid–liquid equilibrium system,” Int. Rev. Model. Simul.
IREMOS, vol. 9, no. 1, pp. 29–36, 2016.
31. I. Cruz-Vega, C. A. R. García, P. G. Gil, J. M. R. Cortés, and J. de J. R. Magdaleno, “Genetic algo-
rithms based on a granular surrogate model and fuzzy aptitude functions,” in Evolutionary
Computation (CEC), 2016 IEEE Congress on, 2016, pp. 2122–2128.
32. R. L. Haupt, and S. E. Haupt, Practical Genetic Algorithms. John Wiley & Sons, New York, 2004.
33. D. C. S. Bisht, P. K. Srivastava, and M. Ram, “Role of fuzzy logic in flexible manufacturing
system,” Diagnostic Techniques in Industrial Engineering. Springer, Cham, pp. 233–243, 2018.
34. N. Mathur, P. K. Srivastava, and A. Paul, “Algorithms for solving fuzzy transportation
problem,” Int. J. Math. Oper. Res., vol. 12, no. 2, pp. 190–219, 2018.
chapter four
New approach for solving multi-objective transportation problem
Contents
4.1 Introduction........................................................................................................................... 71
4.2 Preliminaries......................................................................................................................... 73
4.2.1 Concepts of solution................................................................................................. 74
4.3 Mathematical model.............................................................................................................77
4.4 Solution procedure............................................................................................................... 78
4.4.1 Fuzzy programming................................................................................................ 78
4.4.2 Goal programming................................................................................................... 79
4.4.3 Revised multi-choice programming......................................................................80
4.4.4 Vogel approximation method.................................................................................80
4.4.5 Merits and demerits.................................................................................................. 81
4.5 Numerical example..............................................................................................................83
4.5.1 Fuzzy programming................................................................................................84
4.5.2 Goal programming................................................................................................... 85
4.5.3 Revised multi-choice goal programming............................................................. 86
4.5.4 Vogel approximation method................................................................................. 86
4.6 Comparison........................................................................................................................... 87
4.7 Conclusion and future study.............................................................................................. 88
Acknowledgment........................................................................................................................... 88
References........................................................................................................................................ 89
4.1 Introduction
Operations research (OR) is a discipline that encompasses a wide range of methods for solv-
ing real-life decision-making problems. Mathematical methods are applied in pursuit
of improved decision-making and efficiency in the areas of mathematical optimization,
econometric methods, simulation, neural networks, decision analysis, and the analytic
hierarchy process. The study of OR arose during World War II. During this time, OR was
considered a scientific way of providing the respective departments with a quantitative
basis for making decisions corresponding to the operations of the entire system. The
term "optimization" is the root of the study of OR. Optimization is used in different
areas of study, like mathematical optimization, engineering optimization, economics and
business, information technology, etc.
In an optimization problem (OP), we basically treat the objective function, either
maximization or minimization, with or without some prescribed set of constraints.
[Figure: a transportation network with origins O1, O2, O3 and destinations D1, D2, D3, D4; each route from Oi to Dj carries a unit transportation cost Cij.]
Chapter four: New approach for solving multi-objective transportation problem 73
Several approaches are available for solving a MOTP, such as fuzzy programming,
goal programming, revised multi-choice goal programming, etc. Zimmermann [22] intro-
duced the concept of solving a MOO problem using fuzzy programming. Basically, in a
fuzzy programming approach, a MOTP is converted to a single-objective optimization
problem and then its solution is treated as a compromise solution.
Goal programming (GP), an analytical tool, was introduced to address decision-
making problems involving objective functions that are conflicting and noncommensurable
with each other, with targets assigned as goals to the objective functions. The DM
is interested in maximizing the aspiration level of the corresponding goals. The concept of
GP was introduced by Charnes and Cooper [14] and further developed by researchers such
as Hannan [23], Ignizio [24], Tamiz et al. [25], Romero [26], Liao [27], Tabrizi et al. [28], and
many others. However, ambiguity in resources and incomplete information make it
almost impossible for the DM to set specific aspiration levels (goals) and select a better decision.
To tackle this situation, Chang [29] presented the multi-choice goal programming
(MCGP) approach to solve the multi-objective decision-making (MODM) problem. Later,
Chang [30] proposed RMCGP, the revised form of MCGP, for
solving the MODM.
In this chapter, we introduce a new approach for solving the MOTP. Especially, we
intend to solve the MOTP using the Vogel approximation method (VAM). The usefulness
of the algorithm is tested through a numerical example.
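For reference, a minimal sketch of VAM for a single-objective balanced transportation problem (the cost matrix below is a classic textbook instance, not this chapter's numerical example; tie-breaking choices are ours):

```python
def vogel(costs, supply, demand):
    """Vogel approximation sketch for a balanced transportation problem:
    repeatedly allocate in the row or column with the largest penalty
    (the difference between its two smallest remaining costs)."""
    supply, demand = supply[:], demand[:]
    m, n = len(supply), len(demand)
    alloc = [[0] * n for _ in range(m)]
    rows, cols = set(range(m)), set(range(n))

    def penalty(line):
        line = sorted(line)
        return line[1] - line[0] if len(line) > 1 else line[0]

    while rows and cols:
        cand = ([(penalty([costs[i][j] for j in cols]), 'r', i) for i in rows]
                + [(penalty([costs[i][j] for i in rows]), 'c', j) for j in cols])
        _, kind, k = max(cand)
        if kind == 'r':                          # cheapest cell in that row
            i, j = k, min(cols, key=lambda j: costs[k][j])
        else:                                    # cheapest cell in that column
            i, j = min(rows, key=lambda i: costs[i][k]), k
        q = min(supply[i], demand[j])            # allocate as much as possible
        alloc[i][j] = q
        supply[i] -= q
        demand[j] -= q
        if supply[i] == 0:
            rows.discard(i)
        if demand[j] == 0:
            cols.discard(j)
    return alloc

# A classic single-objective instance.
costs = [[19, 30, 50, 10], [70, 30, 40, 60], [40, 8, 70, 20]]
supply, demand = [7, 9, 18], [5, 8, 7, 14]
alloc = vogel(costs, supply, demand)
total = sum(costs[i][j] * alloc[i][j] for i in range(3) for j in range(4))
print(total)  # 779 with this tie-breaking
```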
The rest of this chapter is organized in the following way: Section 4.2 describes the
preliminaries of the proposed chapter. Section 4.3 contains the mathematical model
of TP and MOTP. The solution procedure is presented in Section 4.4 which contains
five subsections. Fuzzy programming, goal programming, and revised multi-choice
goal programming are briefly presented in Sections 4.4.1–4.4.3, respectively. An algo-
rithm for solving the proposed MOTP by VAM is introduced in Section 4.4.4. Section
4.4.5 contains the merits and demerits of the proposed approaches for solving MOTP.
A numerical example is taken into consideration to justify our study; and comparison
among the obtained solutions from the approaches is carried out in Sections 4.5 and
4.6, respectively. The chapter ends with the conclusion and an outlook of the study in
Section 4.7.
4.2 Preliminaries
In an optimization problem, there are mainly two perspectives, namely, formulation of the
model and then finding its solution. Here, we present some useful definitions in connec-
tion of the study.
Definition 4.1: Optimization is a mathematical discipline which is concerned with
finding maximum or minimum of objective functions with or without constraints. In
the study of optimization, basically we need to optimize a real function f ( x1 , x2 , … , xn ),
of n variables x1 , x2 , … , xn with or without constraints.
In an OP, for modeling a physical system, if there is only one objective function and
the task is to obtain the optimal solution, then the problem is referred to as a single-objective
optimization problem. The general form of a single-objective optimization problem
can be depicted as follows:
74 Advanced Mathematical Techniques in Engineering Sciences
g(x1, x2, …, xn) ≤ 0, (4.3)
l(x1, x2, …, xn) = 0, (4.4)
(x1, x2, …, xn) ∈ F ⊂ ℝⁿ. (4.5)

In a multi-objective optimization (MOO) problem, several objective functions are optimized simultaneously:

minimize or maximize f = (f1, f2, …, fk) (4.6)
subject to the constraints (4.2)–(4.5),

where f1, f2, …, fk are the objective functions containing the decision variables x1, x2, …, xn.
Definition 4.4: In a MOO problem, if all the objective functions f1 , f2 , … , f k and the
constraints (4.2)–(4.5) are linear functions in terms of decision variables x1 , x2 , … , xn,
then the MOO problem is called a linear MOO problem. Furthermore, if at least one
of the constraints or one of the objective functions becomes a nonlinear type, then the
MOO problem is called a nonlinear MOO problem.
4.2.1 Concepts of solution
In a single-objective optimization problem, the “best” solution is defined in terms of
“optimal solution” for which the value of the objective function is optimized satisfying the
set of all feasible restrictions. In a MOO problem, the “best” solution is usually referred to
as the “Pareto optimal solution.” Here are some useful definitions related to the solution
of a MOO problem.
Definition 4.5 (Feasible Solution, FS): A solution set X = {x : x ∈ ℝⁿ} is said to be a feasible solution to a MOO problem if it satisfies all the constraints. A set S consisting of all FSs is called a feasible solution set, which lies in the space of action, where S = {x : x ∈ ℝⁿ satisfying the constraints (4.2)–(4.5)}.
Definition 4.6 (Optimal Solution): An optimal solution of a minimization MOO problem is a FS which gives the minimum value of each objective function simultaneously, i.e., x* ∈ S is an optimal solution if fk(x*) ≤ fk(x), k = 1, 2, …, K, for all x ∈ S.
Chapter four: New approach for solving multi-objective transportation problem 75
In the space of the objective functions, the optimal solution is located within the boundary of the feasible space. Here, the optimal solution is also known as the inferior solution. Generally, there is no optimal solution to a MOO problem, because the objective functions are conflicting in nature. In a MOO problem with conflicting objective functions, the optimum solution corresponding to each objective function can be obtained individually, but an optimum solution of the MOO problem that attains all these individual optima simultaneously does not exist in general.
Optimal Compromise Solution: Compromise programming seeks the compro-
mise solution among several objective functions of a MOO problem. The idea is based
on the minimization of the distance between the ideal and the desired solutions.
Definition 4.7: The optimal compromise solution of a MOO is a solution x ∈ X which
is preferred by the DM to all other solutions, taking into consideration all objectives
contained in the several functions of the MOO problem.
It is generally accepted that an optimal compromise solution has to be an efficient
solution according to the definition of an efficient solution. For a real-life practical
problem, the complete solution (set of all efficient solutions) is not always necessary.
We need only a procedure that finds an optimal compromise solution.
Definition 4.8 (Pareto-optimal solution [Efficient solution]): A feasible solution x̄ of a MOO problem is said to be a nondominated (noninferior) solution if there does not exist any other feasible solution x which dominates x̄. Therefore, for a nondominated solution, an improvement in the value of any one objective function is not possible without worsening the value of at least one other objective function. Mathematically, for a minimization problem, a solution x̄ ∈ X is nondominated if there does not exist any x ∈ X such that fk(x) ≤ fk(x̄), k = 1, 2, …, K, and fk(x) ≠ fk(x̄) for at least one k.
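The dominance test in Definition 4.8 is mechanical enough to state in code. The sketch below (Python; the function names are ours, and the minimization convention of the definition is assumed) checks whether one objective vector dominates another and filters a finite list of candidates down to its nondominated set:

```python
def dominates(fa, fb):
    """True if objective vector fa dominates fb under minimization:
    fa is no worse in every objective and strictly better in at least one."""
    return (all(a <= b for a, b in zip(fa, fb))
            and any(a < b for a, b in zip(fa, fb)))

def nondominated(points):
    """Keep only the nondominated (Pareto-optimal) objective vectors."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]
```

For instance, among the vectors (1, 5), (2, 3), (3, 3), and (4, 1), only (3, 3) is dominated (by (2, 3)); the other three form the nondominated set.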
Fuzzy Programming (FP): In real-life uncertain situations, fuzzy set theory is an important and effective tool for analyzing the MODM problem. Although fuzzy set theory is rigorously used in the field of operations research as a tool for solving MOO problems, here we intend to use fuzzy sets to accommodate real-life situations in a MOO problem through fuzzy parameters.
Definition 4.9 (Fuzzy set) [31]: A fuzzy set Ã is a pair (A, μÃ), where A is a crisp set that belongs to the universal set X and μÃ : X → [0, 1] is a function, called a membership function.
Fuzzy membership function: Membership values are used to determine the degree of membership of the elements of the fuzzy set. The evaluation of a membership value is of critical importance in the application of fuzzy set theory in the fields of engineering and science. A linear membership function is defined by two flexible points, such as upper and lower aspiration levels, or the two bounds of a tolerance interval.
Definition 4.10 (Membership function of a triangular fuzzy number): The membership function of a triangular fuzzy number Ã = (a, b, c) is depicted as follows (Figure 4.2):

μÃ(x) = (x − a)/(b − a), if a ≤ x ≤ b;
μÃ(x) = (c − x)/(c − b), if b ≤ x ≤ c;
μÃ(x) = 0, elsewhere.
[Figure 4.2: The triangular membership function μÃ rises linearly from 0 at x = a to 1 at x = b and falls linearly back to 0 at x = c.]
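Definition 4.10 translates directly into a small routine. The following sketch (our own illustration; it assumes a < b < c, so both denominators are nonzero) evaluates the membership value of a point x in the triangular fuzzy number Ã = (a, b, c):

```python
def tri_membership(x, a, b, c):
    """Membership value of x in the triangular fuzzy number (a, b, c),
    assuming a < b < c."""
    if a <= x <= b:
        return (x - a) / (b - a)  # rising edge, reaches 1 at x == b
    if b < x <= c:
        return (c - x) / (c - b)  # falling edge
    return 0.0                    # outside the support [a, c]
```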
Model GP:

minimize ∑_{i=1}^{K} w_i |f_i(x) − g_i|
subject to x ∈ F,

where F is the feasible set, and the w_i are the weights attached to the deviations in the achievement function; f_i(x) is the objective function of the i-th goal, and g_i is the aspiration level of the i-th goal. |f_i(x) − g_i| represents the deviation of the i-th goal.
A modification of GP, denoted as weighted goal programming (WGP), can be displayed in the following form:

Model WGP:

minimize ∑_{i=1}^{K} w_i (d_i^+ + d_i^-)
subject to f_i(x) − d_i^+ + d_i^- = g_i (i = 1, 2, …, K),
x ∈ F,

where d_i^+ and d_i^- are the over- and under-achievements of the i-th goal, respectively.
However, the conflicts of resources and the incompleteness of available information
make it almost impossible for DMs to set the specific aspiration levels and choose a
better decision. To overcome this situation, a revised multi-choice goal programming
(RMCGP) approach was presented by Chang [30].
Definition 4.12: The mathematical model of RMCGP for solving a MOO problem can be defined as follows:

Model RMCGP:

minimize ∑_{i=1}^{K} [w_i (d_i^+ + d_i^-) + α_i (e_i^+ + e_i^-)]
subject to f_i(x) − d_i^+ + d_i^- = y_i (i = 1, 2, …, K),
y_i − e_i^+ + e_i^- = g_{i,max} or g_{i,min} (i = 1, 2, …, K),
x ∈ F.

Here y_i is the continuous variable associated with the i-th goal, restricted between the upper (g_{i,max}) and lower (g_{i,min}) bounds; e_i^+ and e_i^- are the positive and negative deviations attached to the i-th goal of |y_i − g_{i,max}|; α_i is the weight attached to the sum of the deviations of |y_i − g_{i,max}|; and the other variables are defined as in WGP.
4.3 Mathematical model
The mathematical model of a transportation problem is as follows:
Model 1
minimize Z = ∑_{i=1}^{m} ∑_{j=1}^{n} C_ij x_ij, (4.7)
subject to ∑_{j=1}^{n} x_ij ≤ a_i (i = 1, 2, …, m), (4.8)
∑_{i=1}^{m} x_ij ≥ b_j (j = 1, 2, …, n), (4.9)

where x_ij is the decision variable that represents the amount of goods delivered from the i-th origin to the j-th destination, C_ij is the transportation cost per unit commodity, and a_i and b_j are the supply and demand at the i-th origin and j-th destination, respectively; ∑_{i=1}^{m} a_i ≥ ∑_{j=1}^{n} b_j is the feasibility condition.
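Model 1 can be checked numerically for any candidate shipment plan. The helper functions below are a sketch of equations (4.7)–(4.9) (the function names are ours): they evaluate the total cost Z and test the supply, demand, non-negativity, and feasibility conditions.

```python
def transport_cost(C, x):
    """Total cost Z of plan x under unit costs C, equation (4.7)."""
    return sum(C[i][j] * x[i][j]
               for i in range(len(C)) for j in range(len(C[0])))

def is_feasible(x, a, b):
    """Check constraints (4.8) and (4.9), non-negativity, and the
    feasibility condition (total supply covers total demand)."""
    supply_ok = all(sum(row) <= ai for row, ai in zip(x, a))
    demand_ok = all(sum(x[i][j] for i in range(len(x))) >= bj
                    for j, bj in enumerate(b))
    sign_ok = all(v >= 0 for row in x for v in row)
    return supply_ok and demand_ok and sign_ok and sum(a) >= sum(b)
```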
A single-objective transportation problem does not serve to formulate all real-life transportation problems. For example, if it is required to minimize the transportation cost, maximize the profit, and minimize the distance in a single TP, then the problem moves to multi-objective ground. A transportation problem with multiple objective functions is considered a MOTP. Here, we deal with objective functions involved in the TP that are conflicting and noncommensurable with each other. A mathematical model of the MOTP is as follows:

minimize/maximize Z^t = ∑_{i=1}^{m} ∑_{j=1}^{n} C_ij^t x_ij (t = 1, 2, …, K)

Here C_ij^t (t = 1, 2, …, K) is the cost per unit of goods for the t-th objective function when goods are transported from the i-th origin to the j-th destination.
4.4 Solution procedure

In this section, we introduce three well-known techniques for solving the MOTP: fuzzy programming, goal programming, and revised multi-choice goal programming. Thereafter, we propose a new algorithm using VAM to solve the MOTP.
4.4.1 Fuzzy programming
To solve the MOTP, we use the fuzzy programming approach, which reduces a multi-objective problem to a single-objective problem; the single-objective problem is then solved to find the compromise solution of the MOTP. The steps for converting a multi-objective problem to a single-objective problem are as follows:
Step 1: First, we solve each of the objective functions separately under the demand and supply restrictions and obtain the optimal solutions of the K linear objective functions. Let X_1^*, X_2^*, …, X_K^* be the ideal solutions of the K objectives Z^t (t = 1, 2, …, K).
Step 2: Each objective function is evaluated at the ideal solutions obtained in Step 1, and we formulate a pay-off matrix of order K × K as follows:

Table: Pay-off matrix
Z^1(X_1^*)  Z^2(X_1^*)  …  Z^K(X_1^*)
Z^1(X_2^*)  Z^2(X_2^*)  …  Z^K(X_2^*)
Z^1(X_3^*)  Z^2(X_3^*)  …  Z^K(X_3^*)
⋮
Z^1(X_K^*)  Z^2(X_K^*)  …  Z^K(X_K^*)
Step 3: When the t-th objective function is a minimizing type, obtain the lower bound
Lt (best solution) and upper bound Ut (worst solution) corresponding to the t-th objec-
tive function. Then formulate the membership function using Zimmermann’s [22]
approach corresponding to each objective function Zt (t = 1, 2,…, K) as follows:
μ(Z^t(X)) = 0, if Z^t(X) ≥ U_t;
μ(Z^t(X)) = (U_t − Z^t(X)) / (U_t − L_t), if L_t ≤ Z^t(X) ≤ U_t;
μ(Z^t(X)) = 1, if Z^t(X) ≤ L_t.
Again, if the t-th objective function is the maximizing type, obtain the lower bound
Lt (worst solution) and upper bound Ut (best solution) corresponding to the t-th objec-
tive function. Then the membership function for the objective function Zt (t = 1, 2,…, K)
is formulated as follows:
μ(Z^t(X)) = 0, if Z^t(X) ≤ L_t;
μ(Z^t(X)) = (Z^t(X) − L_t) / (U_t − L_t), if L_t ≤ Z^t(X) ≤ U_t;
μ(Z^t(X)) = 1, if Z^t(X) ≥ U_t.
Step 4: Introduce an auxiliary variable λ and formulate an equivalent fuzzy linear programming problem in the following form:

maximize λ
subject to λ ≤ μ(Z^t(X)) (t = 1, 2, …, K),
x ∈ F.

Here, μ(Z^t(X)) is the membership function of the t-th objective function (t = 1, 2, …, K), as given in Step 3.
Step 5: Solve the crisp model obtained in Step 4, and derive the optimal compromise
solution.
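The membership functions of Step 3 are simple piecewise-linear maps, and λ in Step 4 is just the smallest membership value at a given solution. The helpers below are our own sketch of those formulas (L and U are the bounds read off the pay-off matrix):

```python
def mu_min(z, L, U):
    """Membership of a minimizing objective value z: 1 at the best
    value L, 0 at the worst value U, linear in between (Step 3)."""
    if z >= U:
        return 0.0
    if z <= L:
        return 1.0
    return (U - z) / (U - L)

def mu_max(z, L, U):
    """Membership of a maximizing objective value z: 0 at the worst
    value L, 1 at the best value U."""
    if z <= L:
        return 0.0
    if z >= U:
        return 1.0
    return (z - L) / (U - L)
```

At any candidate solution, λ is the minimum of these membership values over all objectives; Step 4 maximizes that minimum over the feasible set.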
4.4.2 Goal programming

Goal programming is a useful approach for solving the MOTP. Here we discuss the goal programming approach for solving the MOTP (see Model 1A).

In this procedure, the DM needs to decide goals corresponding to each of the objective functions. Consider g_{t,max} and g_{t,min} as the maximum and minimum aspiration values of the t-th objective function in the MOTP, and d_t^+ and d_t^- as the positive and negative deviations corresponding to the t-th objective function. Then the mathematical model of GP is formulated as follows:
Model 1A

minimize ∑_{t=1}^{K} w_t (d_t^+ + d_t^-) (4.11)
Model 1B

minimize ∑_{t=1}^{K} [w_t (d_t^+ + d_t^-) + α_t (e_t^+ + e_t^-)] (4.15)
where y_t is a continuous variable that lies between the upper (g_{t,max}) and lower (g_{t,min}) bounds and is denoted as the aspiration level of the t-th objective function. Again, e_t^- and e_t^+ are the negative and positive deviations attached to the t-th goal of |y_t − g_{t,max}|, and α_t is the weight attached to the sum of the deviations of |y_t − g_{t,max}|.
• Step 3: Thereafter, consider the normalized weights w_t for the objective functions Z^t for all t.
• Step 4: Formulate the objective function, i.e., maximize Z = ∑_{i=1}^{m} ∑_{j=1}^{n} (∑_{t∈T′} w_t r_ij^t + ∑_{t∈T″} w_t s_ij^t) x_ij, under the constraints (4.8)–(4.10) described in the MOTP. Here T′ and T″ are the sets of objective functions of maximization and minimization types, respectively.
• Step 5: The model formulated in Step 4 is an LPP; solve the LPP by the simplex algorithm.
• Step 6: Obtain the optimum solution of the LPP from Step 5 and note the allocations made in the cells. Let X_ij be the optimum solution. Then the compromise solution of the t-th objective function is Z^t = ∑_{i=1}^{m} ∑_{j=1}^{n} C_ij^t X_ij for all t.
• Step 7: Stop.
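The reduction above ultimately hands a single cost table to the Vogel approximation method. The following is a sketch of one common form of VAM for a balanced problem (total supply equal to total demand); the function name and tie-breaking conventions are ours, and textbooks differ on ties, so hand computations may allocate slightly differently:

```python
def vogel(cost, supply, demand):
    """Vogel approximation method for a balanced transportation table.
    Returns an allocation matrix (a basic feasible solution)."""
    supply, demand = list(supply), list(demand)
    m, n = len(cost), len(cost[0])
    alloc = [[0] * n for _ in range(m)]
    rows, cols = set(range(m)), set(range(n))

    def penalty(values):
        # Difference between the two smallest costs in the line;
        # for a single remaining cell, the cost itself is used.
        s = sorted(values)
        return s[1] - s[0] if len(s) > 1 else s[0]

    while rows and cols:
        row_pen = {i: penalty([cost[i][j] for j in cols]) for i in rows}
        col_pen = {j: penalty([cost[i][j] for i in rows]) for j in cols}
        i_star = max(row_pen, key=row_pen.get)
        j_star = max(col_pen, key=col_pen.get)
        if row_pen[i_star] >= col_pen[j_star]:
            i = i_star
            j = min(cols, key=lambda c: cost[i][c])   # cheapest cell in row
        else:
            j = j_star
            i = min(rows, key=lambda r: cost[r][j])   # cheapest cell in column
        q = min(supply[i], demand[j])                 # ship as much as possible
        alloc[i][j] = q
        supply[i] -= q
        demand[j] -= q
        if supply[i] == 0:
            rows.discard(i)
        if demand[j] == 0:
            cols.discard(j)
    return alloc
```

For an unbalanced table, such as the Section 4.5 example where total supply exceeds total demand, the usual device is a dummy destination with zero costs that absorbs the surplus.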
The presented algorithm is more effective for solving the MOTP based on the following
reasons:
Theorem 4.1: The solution of a MOTP by RMCGP produces a better result than GP.

Proof: The mathematical model of GP for solving a MOTP is depicted as follows:

GP

minimize ∑_{i=1}^{K} w_i |Z_i(x) − g_i|
subject to x ∈ F. (4.21)

WGP

minimize ∑_{i=1}^{K} w_i (d_i^+ + d_i^-)
subject to Z_i(x) − d_i^+ + d_i^- = g_i,
x ∈ F.
From the above discussion, we observe that the mathematical model GP is a function of the weights and goals along with the decision variables, whereas the objective function in the mathematical model WGP contains the goal deviations d_i^+ and d_i^- as variables. Both models GP and WGP produce the same result, but the WGP model is easier to tackle than GP, as its objective function includes fewer variables.

If the goal g_i for the i-th objective function is not a single real value but is taken as an interval [g_{i,min}, g_{i,max}], then to obtain a better solution, g_i should attain the maximum value of the range for a maximization-type objective function and the minimum value of the specified range for a minimization-type objective function.
Thereafter, we introduce a new variable y_i in the model WGP via Z_i(x) − d_i^+ + d_i^- = y_i, together with two deviation variables e_i^+ and e_i^- (analogous to d_i^+ and d_i^-) and the constraint y_i − e_i^+ + e_i^- = g_{i,min} or g_{i,max}. The objective function is then converted into the form: minimize ∑_{i=1}^{K} [w_i (d_i^+ + d_i^-) + α_i (e_i^+ + e_i^-)], where the α_i are weights corresponding to the goal deviations. Using this objective function, we construct the model RMCGP. The RMCGP minimizes the deviations (d_i^+ + d_i^-) and (e_i^+ + e_i^-), whereas the GP minimizes only the deviation of the value of the objective function, i.e., ∑_{i=1}^{K} w_i (d_i^+ + d_i^-). Thus, in minimizing the objective function of the RMCGP model, the second part of the objective function, ∑_{i=1}^{K} α_i (e_i^+ + e_i^-), is also minimized. This
implies that the value of y_i tends to g_{i,max} for a maximizing-type objective function, and y_i tends to g_{i,min} for a minimizing-type objective function. In WGP or GP, only the goal deviations are minimized, without considering the type of objective function. The additional variables e_i^+ and e_i^- tackle this situation by minimizing the deviations according to the type of the objective function. Hence, we establish that RMCGP produces a better result compared to the WGP or GP models.

Furthermore, if we set the goal deviations e_i^+ and e_i^- to 0 in the mathematical model of RMCGP, it reduces to the form of WGP; and model WGP is a modification of model GP. Therefore, it is clear that the solution of RMCGP is better than the solutions of WGP and GP. These arguments complete the proof of the theorem.
The efficiency of the proposed algorithm is demonstrated in the following sections, which establish its better utility in comparison to FP, GP, and RMCGP.
4.5 Numerical example
A rice merchant has three warehouses at three locations O1, O2, and O3. He delivers rice to three different markets D1, D2, and D3. The warehouses hold different capacities of rice at different prices in different locations. The merchant notes that the maximum supplying capacities of rice at the origins O1, O2, and O3 are 1100, 1250, and 1150 kg, respectively. Furthermore, the minimum demands of rice at the destinations D1, D2, and D3 are 1150, 1100, and 1225 kg, respectively.
The merchant wishes to deliver rice to the destinations keeping in mind that he has to minimize the transportation cost but maximize the profit, with the consideration that the transportation costs are paid by the customers. Basically, every customer wishes to minimize the transportation cost at the time of purchasing rice, whereas the merchant wishes to maximize his profit. A conflicting situation occurs, and the problem becomes a MOTP with two objective functions. We present the data regarding the transportation parameters in Tables 4.1 and 4.2.
Considering the aforementioned data, the following MOTP is formulated:
Model P

minimize Z1 = 3.5x11 + 4.1x12 + 4.5x13 + 5.5x21 + 4.5x22 + 5.0x23 + 4.5x31 + 4.2x32 + 4.0x33
maximize Z2 = 2.5x11 + 1.5x12 + 2.0x13 + 2.1x21 + 1.8x22 + 1.5x23 + 1.5x31 + 2.2x32 + 1.9x33
subject to the constraints (4.24)–(4.30).
4.5.1 Fuzzy programming
The ideal solutions obtained by solving the objective functions Z1 and Z2 separately subject to the constraints (4.24)–(4.30) are [X_1^*] = [1100, 0, 0, 50, 1100, 75, 0, 0, 1150] and [X_2^*] = [0, 0, 1100, 1175, 0, 75, 0, 1100, 50]. Based on the ideal solutions, we formulate the pay-off matrix, which is shown in Table 4.3.
Using Table 4.3, we formulate the following membership functions corresponding to each objective function of the proposed problem:

μ(Z1(X)) = 0, if Z1(X) ≥ 16,607.5;
μ(Z1(X)) = (16,607.5 − Z1(X)) / (16,607.5 − 14,050), if 14,050 ≤ Z1(X) ≤ 16,607.5;
μ(Z1(X)) = 1, if Z1(X) ≤ 14,050;

μ(Z2(X)) = 0, if Z2(X) ≤ 7132.5;
μ(Z2(X)) = (Z2(X) − 7132.5) / (7295 − 7132.5), if 7132.5 ≤ Z2(X) ≤ 7295;
μ(Z2(X)) = 1, if Z2(X) ≥ 7295.
Using the procedure described in Section 4.4.1, we finally design the following model:

Model P1

maximize λ
subject to 1957.5λ ≤ 16,607.5 − (3.5x11 + 4.1x12 + 4.5x13 + 5.5x21 + 4.5x22 + 5.0x23 + 4.5x31 + 4.2x32 + 4.0x33),
162.5λ ≤ (2.5x11 + 1.5x12 + 2.0x13 + 2.1x21 + 1.8x22 + 1.5x23 + 1.5x31 + 2.2x32 + 1.9x33) − 7132.5,
and the constraints (4.24)–(4.30).
Model P1 is an LPP; using the LINGO10 software, we obtain the following compromise optimal solution: the minimum value of the objective function is Z1(X*) = $15,324.02 and the maximum value of the objective function is Z2(X*) = $7239.05, with aspiration level λ = 0.66.
4.5.2 Goal programming
According to the market situation, the DM has some knowledge of the approximate profit connected with the optimum transportation cost. In that situation, the DM wishes to solve the MOTP in such a way that the transportation cost belongs to the interval [15,000, 16,000] (the lesser value is preferred by the DM) and the profit belongs to [7100, 7500] (the greater value is preferred by the DM). So, it is required to schedule the amount of rice to be transported satisfying the predetermined goals assumed by the DM.

To achieve the goals in Model P, we formulate the goal programming model in the following way: assume the maximum deviations of goal 1 and goal 2 are 1000 and 400, respectively, and consider the weights w1 = 1/1000 and w2 = 1/400. Then Model P reduces to Model P2 as follows:
Model P2

minimize (1/1000)(d_1^+ + d_1^-) + (1/400)(d_2^+ + d_2^-)
subject to 3.5x11 + 4.1x12 + 4.5x13 + 5.5x21 + 4.5x22 + 5.0x23 + 4.5x31 + 4.2x32 …
2.5x11 + 1.5x12 + 2.0x13 + 2.1x21 + 1.8x22 + 1.5x23 + 1.5x31 + 2.2x32 …
Model P3

minimize (1/1000)(d_1^+ + d_1^-) + (1/400)(d_2^+ + d_2^-) + (1/1000)(e_1^+ + e_1^-) + (1/400)(e_2^+ + e_2^-)
subject to 3.5x11 + 4.1x12 + 4.5x13 + 5.5x21 + 4.5x22 + 5.0x23 + 4.5x31 + 4.2x32 …
2.5x11 + 1.5x12 + 2.0x13 + 2.1x21 + 1.8x22 + 1.5x23 + 1.5x31 + 2.2x32 + 1.9x33 − d_2^+ + d_2^- = y_2,
7100 ≤ y_2 ≤ 7500,
Table 4.4 Normalized transportation cost per unit kilogram rice (in $)
D1 D2 D3
O1 0.636 0.745 0.816
O2 1 0.818 0.909
O3 0.818 0.764 0.727
The normalized values are obtained by dividing the cost components of Tables 4.1 and 4.2 by the maximum cost component of each table; these maxima are 5.5 and 2.5, respectively.
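The normalization itself is a one-liner: divide every entry of a table by its largest entry. A sketch (rounding to three decimals as in Table 4.4; small last-digit differences from the printed table are possible):

```python
def normalize(table):
    """Divide every entry by the largest entry of the table."""
    biggest = max(max(row) for row in table)
    return [[round(v / biggest, 3) for v in row] for row in table]
```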
According to the proposed algorithm, if we take equal weights for the objective functions, then we find the transportation cost shown in Table 4.6 in normalized form, corresponding to the equivalent single-objective function for solving the proposed problem.
From Table 4.6, we see that some transportation parameters take negative values; we add the magnitude of the most negative value to each of the cost parameters in Table 4.6 and solve the resulting problem by VAM to get the optimal allocation in the transportation cells. The following compromise solution is then obtained: [X*] = [1100, 0, 0, 50, 1100, 75, 0, 0, 1150]. Finally, the values of the objective functions are Z1(X*) = $14,050 and Z2(X*) = $7132.5. This optimal compromise solution is the best solution preferred by the buyer.
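The reported values are easy to verify from the allocation and the unit costs and profits of Model P (the lists below flatten the 3 × 3 tables row by row):

```python
# Compromise allocation for equal weights, flattened as
# [x11, x12, x13, x21, x22, x23, x31, x32, x33]:
X = [1100, 0, 0, 50, 1100, 75, 0, 0, 1150]
cost   = [3.5, 4.1, 4.5, 5.5, 4.5, 5.0, 4.5, 4.2, 4.0]  # unit costs (Z1)
profit = [2.5, 1.5, 2.0, 2.1, 1.8, 1.5, 1.5, 2.2, 1.9]  # unit profits (Z2)

Z1 = sum(c * x for c, x in zip(cost, X))    # total transportation cost
Z2 = sum(p * x for p, x in zip(profit, X))  # total profit
```

This reproduces Z1 = 14,050 and Z2 = 7132.5 (up to floating-point rounding).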
In a similar way, choosing different weights for the objective functions, we prepare Table 4.7, which contains the corresponding optimal compromise solutions.
From Table 4.7, the DM can choose any one of the optimal compromise solutions; we present the best solutions preferred by both merchant and buyer. According to his choice, the DM may take the solution Z1(X*) = $14,300 and Z2(X*) = $7192.5, corresponding to the weights 0.1 and 0.9, respectively, which is better for both merchant and buyer, with neither objective dominated by the other.
4.6 Comparison

According to the solutions of the formulated Model P obtained by FP, GP, RMCGP, and VAM, it is clear that the algorithm using VAM produces a better solution than FP, GP, and RMCGP. We have also seen that no auxiliary variable is needed in VAM, whereas one is necessary in FP, GP, and RMCGP. In this regard, we may say that the proposed algorithm is more effective, with less computational burden, for solving the MOTP. The mathematical model of GP is a special structure of RMCGP, because setting α_i = 0 for all i in RMCGP produces GP. GP tries to optimize the goal values but does not treat the goals properly for maximization or minimization problems, whereas RMCGP treats these goals as the DM's choices. Also, one of the most important drawbacks of GP and RMCGP is how to select the goals: if the goals are not selected in a proper way, the solution may be infeasible. If, for example, the DM selects the cost goal [12,000, 13,000] (the lesser value is preferred by the DM) and the profit goal [7400, 7600] (the greater value is preferred by the DM), then we cannot find any optimal compromise solution from either GP or RMCGP.
Again, in FP, we solved the objective functions separately to form a pay-off matrix, and finally a single-objective function was derived and solved to find the optimal compromise solution. During the process we solved three objective functions altogether to obtain an optimum solution, and we used two additional constraints and one auxiliary variable to solve the MOTP. So, this approach is laborious for solving the MOTP. In addition, it is seen from our example that the solution of the MOTP by FP does not depend on the solution expected by the DM, if there is any such expectation for the optimum values of the objective functions. That is why, in most real-life decision-making problems, FP is less suitable for producing a good optimal compromise solution.
Finally, in the proposed approach through VAM, the obtained optimal compromise solution is better than the solutions of FP, GP, and RMCGP. Furthermore, there is no need to use any auxiliary variable or additional constraints to solve the MOTP by the proposed algorithm. The set of all normalized weights w_i produces a set of optimal compromise solutions of the MOTP, and the preferred optimal compromise solution is the one picked by the DM. Here, we derive a better solution compared to the solutions obtained by FP, GP, and RMCGP.
Acknowledgment

The author Gurupada Maity acknowledges the University Grants Commission of India for providing the financial grant to carry out this research work under the JRF(UGC) scheme: sanction letter number [F.17-130/1998(SA-I)] dated 26/06/2014.
References
1. F.L. Hitchcock, The distribution of a product from several sources to numerous localities,
Journal of Mathematics and Physics 20 (1941) 224–230.
2. T.C. Koopmans, Optimum utilization of the transportation system, Econometrica 17 (1949)
136–146.
3. A. Ebrahimnejad, An improved approach for solving fuzzy transportation problem with
triangular fuzzy numbers, Journal of Intelligent and Fuzzy Systems 29(2) (2015) 963–974.
4. A. Kaur and A. Kumar, A new method for solving fuzzy transportation problems using
ranking function, Applied Mathematical Modelling 35(12) (2011) 5652–5661.
5. D.R. Mahapatra, S.K. Roy and M.P. Biswal, Multi-choice stochastic transportation problem
involving extreme value distribution, Applied Mathematical Modelling 37(4) (2013) 2230–2240.
6. G. Maity and S.K. Roy, Solving multi-objective transportation problem with interval goal
using utility function approach, International Journal of Operational Research 27(4) (2016)
513–529.
7. G. Maity and S.K. Roy, Solving multi-choice multi-objective transportation problem:
A utility function approach, Journal of Uncertainty Analysis and Applications (2014)
DOI: 10.1186/2195-5468-2-11.
8. G. Maity, S.K. Roy and J.L. Verdegay, Multi-objective transportation problem with cost
reliability under uncertain environment, International Journal of Computational Intelligence
Systems 9(5) (2016) 839–849.
9. S. Midya and S.K. Roy, Single-sink, fixed-charge, multi-objective, multi-index stochastic trans-
portation problem, American Journal of Mathematics and Management Sciences 33 (2014) 300–314.
10. S. Midya and S.K. Roy, Analysis of interval programming in different environments and
its application to fixed charge transportation problem, Discrete Mathematics, Algorithms and
Applications 9(3) (2017) 1750040, 17 pages.
11. S.K. Roy, Multi-choice stochastic transportation problem involving Weibull distribution,
International Journal of Operational Research 21(1) (2014) 38–58.
12. S.K. Roy and G. Maity, Minimizing cost and time through single objective function in
multi-choice interval valued transportation problem, Journal of Intelligent and Fuzzy Systems
32(3) (2017) 1697–1709.
13. S.K. Roy, G. Maity and G.W. Weber, Multi-objective two-stage grey transportation problem using
utility function with goals, Central European Journal of Operations Research 25(2) (2017) 417–439.
14. A. Charnes and W.W. Cooper, Management Models and Industrial Applications of Linear Programming, 1, Wiley: New York (1961).
15. G. Maity and S.K. Roy, Solving a multi-objective transportation problem with nonlinear
cost and multi-choice demand, International Journal of Management Science and Engineering
Management 11(1) (2016) 62–70.
16. F.A.E.W. Waiel, A multi-objective transportation problem under fuzziness, Fuzzy Sets and
Systems 117(1) (2001) 27–33.
17. S.K. Roy, G. Maity, G.W. Weber and S.Z. Alparslan Gök, Conic scalarization approach to solve
multi-choice multi-objective transportation problem with interval goal, Annals of Operations
Research 253(1) (2017) 599–620.
18. A. Kumar, S. Pant, M. Ram and S.B. Sing, On solving complex reliability optimization prob-
lem using multi-objective Particle Swarm optimization, Mathematics Applied to Engineering,
Academic Press, (2017) 115–131.
19. A. Kumar, S. Pant and M. Ram, System reliability optimization using grey wolf optimizer algo-
rithm, Quality and Reliability Engineering International, John Wiley & Sons 33 (2017) 1327–1335.
20. S. Pant, A. Kumar, S.B. Sing and M. Ram, A modified Particle Swarm optimization algorithm
for nonlinear optimization, Nonlinear Studies 24(1) (2017) 127–138.
21. S. Pant, A. Kumar and M. Ram, Reliability optimization: A particle swarm approach, In: Ram
M., and Davim J. (eds), Advances in Reliability and System Engineering, Management and Industrial
Engineering, Springer, Cham,
22. H.J. Zimmermann, Fuzzy programming and linear programming with several objective
functions, Fuzzy Sets and Systems 1 (1978) 45–55.
23. E.L. Hannan, An assessment of some of the criticisms of goal programming, Computers &
Operations Research 12 (1985) 525–541.
24. J.P. Ignizio, Goal Programming and Extensions, Lexington Books: Lexington, MA (1976).
25. M. Tamiz, D.F. Jones and C. Romero, Goal programming for decision making: An overview of
the current state-of-the-art, European Journal of Operational Research 111 (1998) 569–581.
26. C. Romero, A general structure of achievement function for a goal programming model,
European Journal of Operational Research 153 (2004) 675–686.
27. C.N. Liao, Formulating the multi-segment goal programming, Computers and Industrial
Engineering 56 (2009) 138–141.
28. B.B. Tabrizi, K. Shahanaghi and M.S. Jabalameli, Fuzzy multi-choice goal programming,
Applied Mathematical Modelling 36 (2012) 1415–1420.
29. C.T. Chang, Multi-choice goal programming, Omega 35 (2007) 389–396.
30. C.T. Chang, Revised multi-choice goal programming, Applied Mathematical Modelling 32
(2008) 2587–2595.
31. L.A. Zadeh, Fuzzy sets, Information and Control 8 (1965) 338–353.
chapter five
Contents
5.1 Introduction........................................................................................................................... 91
5.2 Simultaneous optimization of multiple characteristics.................................................. 96
5.2.1 Derringer’s desirability function method............................................................. 96
5.2.2 Taguchi’s loss function approach........................................................................... 97
5.2.3 Fuzzy logic approach............................................................................................... 98
5.2.4 Dual-response surface methodology..................................................................... 99
5.3 Data collection and modeling........................................................................................... 100
5.4 Optimization....................................................................................................................... 104
5.5 Validation............................................................................................................................. 106
5.6 Conclusion........................................................................................................................... 108
References...................................................................................................................................... 109
5.1 Introduction
Many modern processes are reasonably complex and have multiple output characteristics.
For example, a heat treatment process like induction hardening needs to be executed to
meet the requirements on surface hardness and case depth. Even an agile software devel-
opment process needs to be executed to simultaneously meet the goals set on performance
characteristics like sprint productivity, sprint velocity, defect density, etc. (John et al., 2017).
In such a scenario, the process engineers need to execute the processes in such a way to
meet the customer requirements on multiple characteristics. In other words, the engineers
need to identify an optimum setting of process control factors, which would simultane-
ously optimize multiple output characteristics. This can be done using the application
of simultaneous optimization of multiple characteristics methodology. A lot of research
has been carried out in the past on simultaneous optimization of output characteristics,
and many approaches have been proposed. The important among them are Derringer’s
desirability function approach, Taguchi’s loss function approach, dual-response surface
methodology and fuzzy logic–based approach. In this chapter, the authors present a case
study on simultaneous optimization of multiple output characteristics of the pulp cooking
process. The methodology used for simultaneous optimization is dual-response surface
methodology.
This study is carried out at an organization manufacturing rayon grade pulp. Rayon grade pulp is the raw material for manufacturing viscose staple fiber, which is used for making clothes. The pulp cooking process is an important step in
the manufacturing of rayon grade pulp. The pulp is the cellulose component of the wood.
The cellulose is separated from other components and impurities of wood by cooking the
wood chips in a highly pressurized chamber followed by multiple stages of washing and
chemical treatments.
The company produces approximately 210 tons of pulp daily and sells at a price of
Indian Rupees 28,000 per ton of pulp. Even a small increase in pulp yield can have huge
economic benefits for the organization. The yield of the pulp cooking process is defined as
y = (wp/wc) × 100 (5.1)
where y is the pulp yield, wp is the weight of pulp produced, and wc is the weight of the
wood chips loaded. Unfortunately, the pulp yield cannot be increased indefinitely as it
will adversely affect the pulp viscosity. The pulp with viscosity beyond 52 centipoises (cp)
is graded as low quality. One centipoise is equal to one millipascal-second. To quantify
the current status of the pulp cooking process, the data on yield and viscosity of the past
twenty batches are collected. The collected data are given in Table 5.1.
The descriptive summary of the pulp yield is given in Figure 5.1.
Figure 5.1 shows that the average pulp yield per batch is only 34.027, so there is a lot of scope for improvement. Figure 5.1 also reveals that the yield is normally distributed, as the p-value of the Anderson–Darling normality test is greater than 0.05 (Mathews, 2005).
[Figure 5.1: Descriptive summary of pulp yield — mean 34.027, StDev 0.141, variance 0.020, skewness −0.543, kurtosis 0.103, N = 20; minimum 33.700, 1st quartile 33.925, median 34.015, 3rd quartile 34.157, maximum 34.240; 95% confidence interval for mean 33.962–34.093, for median 34.000–34.138, for StDev 0.107–0.206.]
Similarly, the descriptive summary of the viscosity is given in Figure 5.2.
Figure 5.2 shows that the average viscosity is 50.89 with a standard deviation of 0.973.
The upper specification limit (USL) on pulp viscosity is 52 cp. Hence, it is very likely that
the pulp cooking process is not capable of meeting the customer requirement of producing
pulp with viscosity within 52 cp. Figure 5.2 also shows that the viscosity is normally dis-
tributed as Anderson–Darling normality test p-value > 0.05. Hence, the viscosity data are
subjected to capability analysis. The process capability analysis result is given in Figure 5.3.
Figure 5.3 shows that the Ppk is only 0.38, which is less than 1.0, indicating that the pulp cooking process is not capable of meeting the customer requirements on viscosity. Hence, there is a need to make the pulp cooking process capable of meeting the customer requirement on viscosity while improving the yield of the process as far as possible.
The performance of the pulp cooking process can be unsatisfactory due to the pres-
ence of assignable causes. To check whether the pulp cooking process is in statistical con-
trol, control charts are constructed for the pulp yield and viscosity. Since both yield and
viscosity are normally distributed, the individual x chart (Montgomery, 2002) is used. The
individual x chart of yield is given in Figure 5.4 and that of viscosity is given in Figure 5.5.
Figures 5.4 and 5.5 show that none of the points plotted is beyond the upper or
lower control limits (UCL or LCL). Moreover, none of the out-of-control run rules
(Leavenworth and Grant, 2000) is violated in both the cases. This shows that the pulp
cooking process is under control and free from the influence of assignable causes. In
[Figure 5.3: Process capability of viscosity — performance: PPM > USL observed 150000.00, expected overall 126956.10, expected within 160180.99.]
[Figure 5.4: I chart of yield — UCL = 34.461, center line X̄ = 34.027, LCL = 33.594, observations 1–20.]
[Figure 5.5: I chart of viscosity — UCL = 54.241, center line X̄ = 50.89, LCL = 47.539, observations 1–20.]
other words, it is not just a process control problem but needs process optimization.
Hence, it is decided to carry out the design of experiments to improve and optimize
the pulp cooking process. Moreover, the pulp cooking process needs to be optimized to meet the requirements on two output characteristics, namely, pulp yield and viscosity. So a widely popular methodology for simultaneous optimization of multiple output characteristics, the dual-response surface methodology, is adopted.
d = ((y − USL)/(T − USL))^β, if T ≤ y < USL (5.3)
where y is the characteristic under study, T is the target, LSL is the lower specification
limit, and USL is the upper specification limit of y.
The desirability function for STB characteristics is defined as
d = ((y − USL)/(ymin − USL))^α, if ymin < y < USL (5.5)
d = 0, if y ≥ USL (5.6)
d = 1, if y ≤ y min (5.7)
where y is the characteristic under study, USL is the upper specification limit of y and ymin
is the practically achievable most desirable minimum value or target of y.
Similarly, the desirability function for LTB characteristics is defined as
d = ((y − LSL)/(ymax − LSL))^α, if LSL < y < ymax (5.8)
Chapter five: An application of dual-response surface optimization methodology 97
d = 0, if y ≤ LSL (5.9)
d = 1, if y ≥ y max (5.10)
where y is the characteristic under study, LSL is the lower specification limit of y and ymax
is the practically achievable most desirable maximum value or target of y. The weights α
and β in the desirability function need to be chosen based on the desirability of quality
characteristic y with respect to its target and specification limits.
Equations (5.2)–(5.10) show that the desirability value d will be 1 when the character-
istic y is on the target. The desirability value decreases as y moves away from the target.
The desirability value will be 0 when the characteristic under study y is on or beyond the
specification limits. For simultaneous optimization of multiple characteristics, the desirability function value di, i = 1, 2, …, k is computed for each characteristic yi, and the overall desirability D is computed as the geometric mean of the individual desirability values, as shown in Equation (5.11):

D = (d1 × d2 × ⋯ × dk)^(1/k) (5.11)
Finally, the values of the factors that would simultaneously optimize multiple character-
istics are found out by maximizing the overall desirability value D. Some of the impor-
tant applications of the desirability function approach for simultaneous optimization
of response variables are in CNC turning of AISI P-20 tool steel (Aggarwal et al., 2008),
analytical methods development (Candioti et al., 2014), carbonitriding of bushes (John,
2013), etc.
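As a quick illustration of Equations (5.5)–(5.11), the desirability computations can be sketched in a few lines of Python; the specification limits, targets, and sample readings below are illustrative only, not values from the case study.

```python
import numpy as np

def desirability_stb(y, usl, y_min, alpha=1.0):
    """Smaller-the-better desirability, Equations (5.5)-(5.7)."""
    if y >= usl:
        return 0.0
    if y <= y_min:
        return 1.0
    return ((y - usl) / (y_min - usl)) ** alpha

def desirability_ltb(y, lsl, y_max, alpha=1.0):
    """Larger-the-better desirability, Equations (5.8)-(5.10)."""
    if y <= lsl:
        return 0.0
    if y >= y_max:
        return 1.0
    return ((y - lsl) / (y_max - lsl)) ** alpha

def overall_desirability(ds):
    """Geometric mean of the individual desirabilities, Equation (5.11)."""
    ds = np.asarray(ds, dtype=float)
    return float(ds.prod() ** (1.0 / len(ds)))

# Illustrative readings: yield treated as LTB, viscosity as STB
d_yield = desirability_ltb(36.0, lsl=34.0, y_max=37.0)   # 2/3
d_visc = desirability_stb(50.5, usl=52.0, y_min=49.0)    # 1/2
D = overall_desirability([d_yield, d_visc])
```

Because D is a geometric mean, a single zero desirability drives the overall desirability to zero — exactly the intended behavior, since a setting that violates any one specification is unacceptable.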
5.2.2 Taguchi's loss function approach
In Taguchi's approach, the expected quality loss of a response is computed as

l(y) = (k/n) Σᵢ₌₁ⁿ (yᵢ − T)², for NTB (5.12)

l(y) = (k/n) Σᵢ₌₁ⁿ yᵢ², for STB (5.13)

l(y) = (k/n) Σᵢ₌₁ⁿ (1/yᵢ²), for LTB (5.14)
where T is the target and k is a proportionality constant known as quality loss coeffi-
cient. For the STB-type response variable, the target is taken as zero and for the LTB-type
response variable, the target is generally taken as infinity. From Equations (5.12) to (5.14),
it is clear that the expected loss l(y) will be zero when the responses are on target and the
loss increases as the response variables move away from the respective targets. For simul-
taneous optimization of multiple responses, k is often chosen in such a way that expected
98 Advanced Mathematical Techniques in Engineering Sciences
quality loss l(y) will be equal to 1 when the response variables are either on upper or lower
specification limits. For example, for NTB response variables, k is chosen as
k = (2/(USL − LSL))² (5.15)
where USL is the upper specification limit and LSL is the lower specification limit. To
use Taguchi’s loss function approach for simultaneous optimization, the expected loss is
computed for each response variable yj, j = 1, 2, …, k using Equations (5.12)–(5.14), and the
overall expected loss L(y) is computed as the average of the individual expected losses as
shown in Equation (5.16):
L(y) = (1/k) Σⱼ₌₁ᵏ l(yⱼ) (5.16)
Finally, the values of the factors that would simultaneously optimize multiple responses
are found out by minimizing the overall expected loss L(y). Many applications of Taguchi’s
loss function approach for simultaneous optimization of multiple responses are available
in the literature (Antony, 2001; John, 2012; Lin and Lin, 2002; Nian et al., 1999).
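The loss computations of Equations (5.12)–(5.16) can be sketched as follows; the specification limits and sample data are illustrative only.

```python
import numpy as np

def loss_ntb(y, target, k):
    """Expected loss for an NTB response, Equation (5.12)."""
    y = np.asarray(y, dtype=float)
    return float(k * np.mean((y - target) ** 2))

def loss_stb(y, k):
    """Expected loss for an STB response (target zero), Equation (5.13)."""
    y = np.asarray(y, dtype=float)
    return float(k * np.mean(y ** 2))

def loss_ltb(y, k):
    """Expected loss for an LTB response (target infinity), Equation (5.14)."""
    y = np.asarray(y, dtype=float)
    return float(k * np.mean(1.0 / y ** 2))

def k_ntb(usl, lsl):
    """Loss coefficient making the loss 1 at a specification limit, Equation (5.15)."""
    return (2.0 / (usl - lsl)) ** 2

def overall_loss(losses):
    """Average of the individual expected losses, Equation (5.16)."""
    return float(np.mean(losses))

k = k_ntb(usl=52.0, lsl=48.0)                       # 0.25
l_demo = loss_ntb([50.0, 51.0, 49.0], target=50.0, k=k)
l_limit = loss_ntb([52.0], target=50.0, k=k)        # loss is exactly 1 at the USL
```

The choice of k in Equation (5.15) scales the loss so that a response sitting exactly on a specification limit incurs a loss of 1, which makes losses from different responses comparable before averaging.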
5.2.3 Fuzzy logic approach
In the fuzzy logic approach, each characteristic is converted into a membership value through the function

m(z) = (e^d − e^(dz))/(e^d − 1), if d ≠ 0 (5.17)

m(z) = 1 − z, if d = 0 (5.18)
where d is called the exponential constant, and z measures the deviation of the character-
istic y from its target value. The definitions of z for the three types of characteristics, namely NTB, STB, and LTB, are given in Equations (5.19)–(5.21):
z = (y − T)/(yU − T) or (T − y)/(T − yL), for NTB (5.19)

z = (y − ymin)/(yU − ymin), for STB (5.20)

z = (ymax − y)/(ymax − yL), for LTB (5.21)
where y is the characteristic under study, T is the specified target on y, yU and yL are the upper and lower limits of y, ymin is the best possible minimum value y can achieve in case
of the STB and ymax is the best possible maximum value y can achieve in case of the LTB.
The membership function m(z) is assigned a value of 0 whenever y is beyond the upper
or lower limit. When y = T in Equation (5.19) or y = ymin in Equation (5.20) or y = ymax in
Equation (5.21), z will be 0 and the membership function m(z) achieves the maximum value
of 1. In other words, m(z) achieves the best value of 1 when y is on target or at best possible
value. Moreover, the rate of decrease of m(z) will be high when d < 0, the rate of decrease
will be low when d > 0 and the rate of decrease will be constant when d = 0. The user can
choose the value of d based on the desired rate of decrease of m(z).
For simultaneous optimization of multiple characteristics using fuzzy logic, the mem-
bership function m(z) is computed for all the characteristics, and the optimum values of the
factors are identified by maximizing the minimum of m(z) values (Lin et al., 2000).
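A minimal sketch of the membership computation in Equations (5.17)–(5.20); the exponential constant and the limits below are illustrative.

```python
import math

def membership(z, d):
    """Fuzzy membership m(z), Equations (5.17)-(5.18); z is the scaled deviation."""
    if z < 0.0 or z > 1.0:
        return 0.0              # y beyond its limits
    if d == 0:
        return 1.0 - z          # constant rate of decrease
    return (math.exp(d) - math.exp(d * z)) / (math.exp(d) - 1.0)

def z_stb(y, y_min, y_upper):
    """Scaled deviation for an STB characteristic, Equation (5.20)."""
    return (y - y_min) / (y_upper - y_min)

m_on_target = membership(z_stb(49.0, 49.0, 52.0), d=2.0)   # y at its best value -> 1
m_at_limit = membership(z_stb(52.0, 49.0, 52.0), d=2.0)    # y at the upper limit -> 0
m_linear = membership(0.5, d=0)                            # linear decrease -> 0.5
```

As the text notes, d < 0 gives a fast initial drop in m(z), d > 0 a slow one, and d = 0 the linear case, so d acts as a tuning knob for how severely deviations from target are penalized.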
5.2.4 Dual-response surface methodology
In the dual-response surface methodology, separate second-order models are fitted for the mean and the standard deviation of the response:

ŷµ = a0 + Σᵢ₌₁ᵏ aᵢxᵢ + Σᵢ₌₁ᵏ aᵢᵢxᵢ² + Σᵢ<ⱼ aᵢⱼxᵢxⱼ (5.22)

ŷσ = b0 + Σᵢ₌₁ᵏ bᵢxᵢ + Σᵢ₌₁ᵏ bᵢᵢxᵢ² + Σᵢ<ⱼ bᵢⱼxᵢxⱼ (5.23)
where ŷµ is the estimated mean, ŷσ is the estimated standard deviation of the response
variable y and xi, i = 1, 2, …, k are the explanatory variables. Then the optimum values of
the explanatory variables that would simultaneously optimize the estimated mean and
variance of the response variable (Vining and Myers, 1990) are obtained by formulating
and solving the optimization problem of
Minimize ŷσ (5.24)

Subject to ŷµ = T (5.25)
The aforementioned optimization problem can be solved using Microsoft Excel Solver (Del
Castillo and Montgomery, 1993). The methodology can also be used for simultaneous optimization of multiple responses. For example, an STB-type response variable and an NTB-type response variable can be simultaneously optimized by solving
Minimize ŷ1 (5.26)

Subject to ŷ2 = T (5.27)
where ŷ1 is the estimated value of the STB-type response variable, ŷ2 is the estimated value of the NTB-type response variable, and T is the specified target value of ŷ2.
There are many other approaches also available for simultaneous optimization of mul-
tiple response variables, namely, grey relational analysis (Chiang and Chang, 2006), prin-
cipal component analysis (Tong et al., 2005), artificial neural networks (Noorossana et al.,
2009), genetic algorithm (Ortiz et al., 2004), etc. In this case study, the authors have used
the dual-response surface methodology for simultaneous optimization of pulp yield and
viscosity of the pulp cooking process.
Table 5.4 also shows that the pure quadratic term is not significant at the 5% level as the
corresponding p-value = 0.43478 > 0.05 (Montgomery, 2013). The ANOVA table is again
constructed by dropping the insignificant quadratic term. The modified ANOVA table of
pulp yield is given in Table 5.5.
Table 5.5 shows that the factor sulfidity (x1) and the interaction between sulfidity and black liquor (x1x2) are significant (p-value < 0.05) at the 5% level. Hence, a model
is developed for yield (y1) using sulfidity (x1) and interaction between sulfidity and black
liquor (x1x2) as explanatory variables (Draper and Smith, 2003). The coefficient table for the
pulp yield model is given in Table 5.6.
From Table 5.6, the model for pulp yield (y1) is identified as
Table 5.7 shows that the R² and adjusted R² are very high and the standard error is reasonably close to zero. Hence, it is concluded that the model is accurate. The residual plots of the pulp yield model are given in Figure 5.6.
Figure 5.6 shows that the residuals are more or less normally distributed, and there is
no trend or pattern in the plot of residuals versus fitted values or observation order. So it is
concluded that the model is adequate (Montgomery et al., 2003).
Similarly, the response variable viscosity is also subjected to ANOVA. The ANOVA
table for viscosity is given in Table 5.8.
Table 5.8 shows that the regression is significant at the 5% significance level (p-value = 0.03104 < 0.05). The p-value for interaction is 0.89732 > 0.05, indicating that the interaction is not significant. But the p-value of the pure quadratic term is 0.0927 < 0.1, indicating that the quadratic term is significant at the 10% level. So to develop a full polynomial model for
viscosity, more experiments need to be carried out at factor axial points, which would
make the study costlier. Hence, the possibility of developing a linear model for viscosity
by transforming the response is explored. A linear model is found to be the best fit model
for the logarithm of viscosity ( y 2′ ). The experimental layout with the logarithm of viscosity
( y 2′ ) is given in Table 5.9.
[Figure 5.6: Residual plots of the pulp yield model — normal probability plot, residuals versus fitted values, histogram of residuals, and residuals versus observation order.]
The logarithm of viscosity is subjected to ANOVA. The ANOVA table for the trans-
formed viscosity is given in Table 5.10.
Table 5.10 shows that neither the pure quadratic term nor the interaction is significant
(p-value > 0.05). Hence, the ANOVA table is modified by dropping the insignificant terms.
The modified ANOVA table is given in Table 5.11.
Table 5.11 shows that only the factors sulfidity (x1) and cooking time (x3) have a significant effect on the response variable. So the model is developed for the logarithm of viscosity (y2′) using sulfidity (x1) and cooking time (x3) as explanatory variables. The coefficient table of the logarithm of viscosity (y2′) model is given in Table 5.12.
From Table 5.12, the model for the logarithm of pulp viscosity (y2′) is identified as
5.4 Optimization
The optimum setting of the factors that would increase the pulp yield as much as pos-
sible without increasing the viscosity beyond 52 is identified by formulating the problem
as a constrained optimization problem (Hillier and Lieberman, 2008; Taha, 2014). Since the model is developed for the logarithm of viscosity, the problem is formulated to maximize the pulp yield (y1) subject to the constraint that the logarithm of pulp viscosity (y2′) does not exceed an upper limit. The upper limit k′ is computed as

k′ = k − 1.96s (5.30)
[Residual plots of the logarithm of viscosity model — normal probability plot, residuals versus fitted values, histogram of residuals, and residuals versus observation order.]
where k is the logarithm of 52, the upper specification limit of viscosity, and s is the standard error of the viscosity model. The upper limit k′ is taken 1.96 standard deviations below the logarithm of 52 cp. This is to ensure that even individual values of viscosity are very unlikely to fall outside the specification limit. Substituting the values of s and the logarithm of 52 in Equation (5.30), k′ becomes
−1 ≤ x1 ≤ 1 (5.34)
−1 ≤ x2 ≤ 1 (5.35)
−1 ≤ x3 ≤ 1 (5.36)
The optimization problem given in Equations (5.31)–(5.36) is solved using Microsoft Excel
Solver utility (Fylstra et al., 1999). The solution obtained is given in Table 5.14. The optimum
values of explanatory variables x1, x2, and x3 along with corresponding values of percent-
age sulfidity, percentage black liquor and cooking time are given in Table 5.14.
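For readers without Excel Solver, the structure of the constrained problem in Equations (5.31)–(5.36) can be sketched with a plain grid search over the coded factor cube; the model coefficients and standard error below are placeholders, since the actual fitted models come from Tables 5.6 and 5.12:

```python
import math

# Placeholder models -- the real coefficients come from the fitted models
def yield_model(x1, x2, x3):
    """Pulp yield y1 in terms of the coded factors (illustrative coefficients)."""
    return 36.3 + 0.25 * x1 + 0.10 * x1 * x2

def log_visc_model(x1, x2, x3):
    """Logarithm of viscosity y2' (illustrative coefficients)."""
    return 3.95 + 0.02 * x1 - 0.015 * x3

s = 0.01                                # assumed standard error of the viscosity model
k_prime = math.log(52.0) - 1.96 * s     # upper limit on y2', Equation (5.30)

# Maximize yield over the coded cube [-1, 1]^3 subject to y2' <= k'
grid = [i / 20.0 for i in range(-20, 21)]
best = None
for x1 in grid:
    for x2 in grid:
        for x3 in grid:
            if log_visc_model(x1, x2, x3) <= k_prime:   # viscosity constraint
                y1 = yield_model(x1, x2, x3)
                if best is None or y1 > best[0]:
                    best = (y1, (x1, x2, x3))
```

With these placeholder coefficients, raising sulfidity raises both yield and viscosity, so the viscosity constraint is binding at the optimum — the same trade-off the case study describes.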
Table 5.14 shows that executing the pulp cooking process with a cooking medium having sulfidity 16.28%, black liquor 10%, and a cooking time of 65 minutes would give a yield of 36.38 and a viscosity of 50.21 cp. This is well within the customer-specified upper
limit of 52 cp on viscosity. The validation of the findings of the study is given in the next
section.
5.5 Validation
The results of the study are validated by cooking 14 batches of pulp at the optimum com-
bination of factors, namely, sulfidity at 16.28%, black liquor 10%, and cooking time 65 min-
utes. The pulp yield and viscosity are measured for each batch and are given in Table 5.15.
The individual x control chart comparing the pulp yield performance before and after
the study is given in Figure 5.8 and that of pulp viscosity is given in Figure 5.9.
Figure 5.8 shows that executing the pulp cooking process with the optimum combi-
nation of factors suggested by the study would significantly improve the pulp yield. The
validation data show that, on average, the yield increased from 34% to 36.43%. Figure 5.9
shows that the optimum combination of factors reduced the mean as well as the variation in pulp viscosity.
[Figure 5.8: I chart of yield before and after the study — after-study mean X̄ = 36.431.]
[Figure 5.9: I chart of viscosity before and after the study — UCL = 51.657, X̄ = 50.141, LCL = 48.625.]
When the pulp cooking process is operating under statistical control, it is
very unlikely that the viscosity will be more than the upper specification limit of 52 cp, in
fact, the viscosity will be less than 51.65 cp. The process capability analysis results of pulp
viscosity with the validation data are given in Figure 5.10.
Figure 5.10 shows that running the cooking process with the optimum combination
of factors would improve process capability with respect to viscosity to 1.25. Hence, the
pulp manufacturing company has decided to use the optimum combination of factors sug-
gested by the study for all future batches of pulp cooking process.
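The capability index quoted from Figure 5.10 is the one-sided Ppk against the upper specification limit; a minimal sketch of the standard computation (the readings below are illustrative, not the actual validation batches):

```python
import numpy as np

def ppk_upper(data, usl):
    """One-sided Ppk against an upper specification limit: (USL - mean) / (3 s)."""
    mu = np.mean(data)
    s = np.std(data, ddof=1)   # overall (long-term) sample standard deviation
    return float((usl - mu) / (3.0 * s))

# Illustrative viscosity readings
ppk = ppk_upper([50.0, 50.3, 49.9, 50.2, 50.1], usl=52.0)
```

A Ppk above 1.0 indicates that the process mean sits more than three overall standard deviations inside the specification limit, which is why the improvement from 0.38 to 1.25 in the case study is substantial.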
5.6 Conclusion
In this chapter, the authors presented a case study on optimizing the pulp cooking pro-
cess. The cooking process is an important step in the rayon grade pulp manufacturing
process. Rayon grade pulp is the raw material for manufacturing viscose staple fiber, which in turn is used for cloth making. The challenge in improving the pulp yield of the cooking process is that doing so tends to degrade the pulp viscosity. The pulp yield needs to be improved without increasing viscosity beyond 52 cp. This is achieved by the application of dual-response surface methodology and design of experiments.
Through discussions with technical professionals, three factors, namely, percentage
of sulfidity in the cooking medium, percentage of black liquor in the cooking medium,
and cooking time are selected for the study. The pulp yield and viscosity are taken as the response variables.
[Figure 5.10: Process capability of viscosity with validation data — PPM > USL: observed 0.00, expected overall 84.82, expected within 116.88.]
Since the engineers suspected interaction between the factors, a
full factorial experiment is designed. To explore whether the relationship between the
factors and response variables is nonlinear or not, four center points are also added to
the design. The experiments are carried out as per the design and data on pulp yield and
viscosity are collected. Based on the data, models are fitted for pulp yield and logarithm
of pulp viscosity. Then the optimum combination of the factors that would maximize the
pulp yield subject to the constraint on viscosity is obtained by formulating the problem as a constrained optimization problem and solving it using the Microsoft Excel Solver utility.
Validation of the results showed that the yield of the pulp cooking process has sig-
nificantly improved from 34% to 36.43%. Moreover, the study also reduced pulp viscosity
as well as variation in pulp viscosity. The process capability index Ppk of pulp viscosity
improved to 1.25. Hence, the company decided to execute the pulp cooking process with
the optimum combination of factors suggested by the study for the upcoming batches.
Many of the modern-day processes have multiple output characteristics. The process
manager or engineer needs to find an optimum setting of process control factors that
would result in simultaneously meeting the requirements on all the output characteris-
tics. In this chapter, the authors have demonstrated the dual-response surface methodol-
ogy for the simultaneous optimization of multiple output characteristics. Even though the case study deals with optimizing the pulp cooking process to simultaneously meet the requirements on the output characteristics, namely, pulp yield and viscosity, the methodology can be used for optimizing any process. Moreover, more than
two output characteristics can also be simultaneously optimized using response surface
methodology.
References
Aggarwal, A., Singh, H., Kumar, P., and Singh, M. (2008). Optimization of multiple quality charac-
teristics for CNC turning under cryogenic cutting environment using desirability function.
Journal of Materials Processing Technology, 205(1), 42–50.
Antony, J. (2001). Simultaneous optimisation of multiple quality characteristics in manufacturing
processes using Taguchi’s quality loss function. International Journal of Advanced Manufacturing
Technology, 17(2), 134–138.
Box, G. E. P., and Draper, N. R. (2007). Response Surfaces, Mixtures and Ridge Analysis. 2nd edition,
New Jersey, NJ: John Wiley and Sons.
Candioti, L. V., De Zan, M. M., Cámara, M. S., and Goicoechea, H. C. (2014). Experimental design
and multiple response optimization. Using the desirability function in analytical methods
development. Talanta, 124, 123–138.
Chiang, K. T., and Chang, F. P. (2006). Optimization of the WEDM process of particle-reinforced
material with multiple performance characteristics using grey relational analysis. Journal of
Materials Processing Technology, 180(1), 96–101.
Del Castillo, E., and Montgomery, D. C. (1993). A nonlinear programming solution to the dual
response problem. Journal of Quality Technology, 25(3), 199–204.
Derringer, G. (1994). A balancing act: Optimizing product’s properties. Quality Progress, 27(6): 51–58.
Ding, R., Lin, D. K., and Wei, D. (2004). Dual-response surface optimization: A weighted MSE
approach. Quality Engineering, 16(3), 377–385.
Draper, N. R., and Smith, H. (2003). Applied Regression Analysis. 3rd edition, Singapore: John Wiley
and Sons (Asia) Pte Ltd.
Fylstra, D., Lasdon, L., Watson, J., and Waren, A. (1999). Design and use of the Microsoft Excel Solver.
Interfaces, 28(5): 29–55.
Harrington, E. (1965). The desirability function. Industrial Quality Control, 21(10): 494–498.
Hillier, F. S., and Lieberman, G. J. (2008). Operations Research – Concepts and Cases. 8th edition.
New Delhi: Tata McGraw-Hill Publishing Company Ltd.
Time-dependent conflicting
bifuzzy set and its applications
in reliability evaluation
Shshank Chaube
University of Petroleum and Energy Studies
S.B. Singh
G.B. Pant University of Agriculture and Technology
Sangeeta Pant
University of Petroleum and Energy Studies
Anuj Kumar
University of Petroleum and Energy Studies
Contents
6.1 Introduction......................................................................................................................... 112
6.2 Basic concept of time-dependent CBFS and some definitions..................................... 112
6.2.1 Time-dependent CBFS........................................................................................... 112
6.2.2 Normal CBFS........................................................................................................... 113
6.2.3 Convex CBFS........................................................................................................... 113
6.2.4 Conflicting bifuzzy number.................................................................................. 113
6.2.5 (α, β)-Cut of a time-dependent CBFS.................................................................... 113
6.2.6 Triangular time-dependent CBFS........................................................................ 113
6.3 Problem formulation.......................................................................................................... 113
6.4 Reliability evaluation with time-dependent CBFN....................................................... 115
6.5 Reliability evaluation of series and parallel system having
components following time-dependent conflicting bifuzzy failure rate................... 118
6.5.1 Series system........................................................................................................... 118
6.5.2 Parallel system......................................................................................................... 120
6.5.3 Parallel-series system............................................................................................. 121
6.5.4 Series-parallel system............................................................................................. 123
6.6 Examples.............................................................................................................................. 125
6.6.1 Series system........................................................................................................... 125
6.6.2 Parallel system......................................................................................................... 125
6.6.3 Parallel-series system............................................................................................. 126
6.6.4 Series-parallel system............................................................................................. 126
6.7 Conclusion........................................................................................................................... 127
References...................................................................................................................................... 127
6.1 Introduction
The conventional reliability of a system is defined as the probability that the system will
perform a predefined operation under some specified condition for a fixed time period.
Traditionally, system reliability evaluation relies on the probabilistic approach. But this approach is not always valid, since in reality the available system data often do not represent the realistic situation correctly due to the uncertainties present in them. Therefore, in many cases, reliability assessment of the system becomes a very difficult task. Hence, to evaluate the reliability of a system when the available information is uncertain, the fuzzy approach is applied. Zadeh (1965) laid the foundation for this approach with his work on fuzzy set theory, under the assumption that the nonmembership
approach by his works on fuzzy set theory with the assumption that the nonmembership
degree is equal to one minus the membership degree. Here we can consider member-
ship degree and nonmembership degree as positive and negative aspects of a situation. It
implies if the membership is correct, then the nonmembership is wrong, which is a con-
trary relation. Over this theory Atanassov (1986) introduced the concept of intuitionistic
fuzzy sets. He proposed the condition 0 ≤ µ A ( x) + ν A ( x) ≤ 1, where µ A ( x) and ν A ( x) rep-
resent the degree of membership and the degree of nonmembership, respectively. Many
researchers (Burillo & Bustinces, 1996; Li, Shan & Cheng, 2005; Supriya, Ranjit & Akhil,
2005; Gianpiero & David, 2006) have done work on intuitionistic fuzzy sets. Other theories
like L-fuzzy sets (Goguen, 1967), Ying-Yang bipolar fuzzy logic (Zhang & Zhang, 2004), soft
sets (Basu, Deb & Pattanaik, 1992), vague sets (Gau & Buehrer, 1993), and interval-valued
intuitionistic fuzzy sets (Atanassov, 1999) also were introduced to handle the uncertainty.
Then, Zamali, Lazim, and Osman (2008) introduced the concept of a conflicting bifuzzy
set (CBFS), and proposed that the sum of membership degree and nonmembership degree
can be more than one.
Several authors (Singer, 1990; Cai, Wen & Zhang, 1991; Chen, 1994, 1996; Roy et al.,
2017) proposed and developed fuzzy reliability theory. Extending these works, in this chapter conflicting bifuzzy sets are applied to fuzzy reliability theory.
In this chapter, some basic concepts of triangular CBFS are defined, and a proce-
dure using triangular CBFS is introduced to estimate the fuzzy reliability of the sys-
tem. Here, membership and nonmembership functions of fuzzy reliability of systems
are constructed by considering the failure rate of each component as a time-dependent
triangular CBFN.
A(t) = (m(t) − l(t), m(t), m(t) + n(t); m(t) − l′(t), m(t), m(t) + n′(t))
where m(t) ∈ R is the center, l(t) > 0 and n(t) > 0 are the left and right spreads of the
membership function of A(t), and, l ′(t) > 0 and n ′(t) > 0 are the left and right spreads of the
nonmembership function of A(t), at time t.
F(t) = {⟨x, µF(t)(x), νF(t)(x)⟩ : x ∈ X, t ∈ T}
It is obvious that both Fα(t) and Fβ(t) are crisp sets. Assume that F(t) is a CBFN; then, by the fuzzy-convexity property of the membership function of a CBFN, we have
where f1α(t) is an increasing and f2α(t) a decreasing function of α, while f1β(t) is a decreasing and f2β(t) an increasing function of β; α, β ∈ [0, 1].
Define a bounded, differentiable function ψ from X to Y as
ψ : X → Y such that y = ψ ( x) ∀x ∈ X
If a1α(t) and a2α(t) are invertible, then the left shape function gR(t)(y) and the right shape function hR(t)(y) are obtained as

gR(t)(y) = [a1α]⁻¹ = min u, y1 ≤ y ≤ y2 (6.6)

hR(t)(y) = [a2α]⁻¹ = max u, y2 ≤ y ≤ y3 (6.7)
From Equations (6.6) and (6.7), the membership function can be constructed as

µR(t)(y) = gR(t)(y) for y1 ≤ y ≤ y2; hR(t)(y) for y2 ≤ y ≤ y3; 0 otherwise

and the nonmembership function as

νR(t)(y) = gR(t)(y) for y1′ ≤ y ≤ y2; hR(t)(y) for y2 ≤ y ≤ y3′; 0 otherwise
R(t) = exp(−∫₀ᵗ f(k) dk), t > 0 (6.8)
F(t) = ( m(t) − l(t), m(t), m(t) + n(t); m(t) − l ′(t), m(t), m(t) + n ′(t))
\[
a_{1\alpha}(t) = \min \exp\left( -\int_0^t x(k)\, dk \right) \;\; \text{s.t. } m(t) - l(t) + \alpha l(t) \le x(t) \le m(t) + n(t) - \alpha n(t) \tag{6.9}
\]

\[
a_{2\alpha}(t) = \max \exp\left( -\int_0^t x(k)\, dk \right) \;\; \text{s.t. } m(t) - l(t) + \alpha l(t) \le x(t) \le m(t) + n(t) - \alpha n(t) \tag{6.10}
\]

\[
a_{1\beta}(t) = \min \exp\left( -\int_0^t x(k)\, dk \right) \;\; \text{s.t. } m(t) - \beta l'(t) \le x(t) \le m(t) + \beta n'(t) \tag{6.11}
\]

\[
a_{2\beta}(t) = \max \exp\left( -\int_0^t x(k)\, dk \right) \;\; \text{s.t. } m(t) - \beta l'(t) \le x(t) \le m(t) + \beta n'(t) \tag{6.12}
\]
Here, R(t) attains its extremes at the bounds. Therefore, we have
\[
a_{1\alpha}(t) = \exp\left( -\int_0^t \{ m(k) + n(k) - \alpha n(k) \}\, dk \right), \quad t > 0 \tag{6.13}
\]

\[
a_{2\alpha}(t) = \exp\left( -\int_0^t \{ m(k) - l(k) + \alpha l(k) \}\, dk \right), \quad t > 0 \tag{6.14}
\]

\[
a_{1\beta}(t) = \exp\left( -\int_0^t \{ m(k) + \beta n'(k) \}\, dk \right), \quad t > 0 \tag{6.15}
\]

\[
a_{2\beta}(t) = \exp\left( -\int_0^t \{ m(k) - \beta l'(k) \}\, dk \right), \quad t > 0 \tag{6.16}
\]
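As an illustration of how the cut bounds in Equations (6.13)–(6.16) can be evaluated for general time-dependent spreads, the following Python sketch integrates the bounding rates numerically. The function names and the trapezoidal grid are illustrative choices, not from the chapter:

```python
import math

def cut_bounds(m, l, n, l_dash, n_dash, t, alpha, beta, steps=2000):
    """Evaluate Equations (6.13)-(6.16) by trapezoidal integration on [0, t].

    m, l, n, l_dash, n_dash are callables giving the centre and the four
    spread functions of the triangular CBFN failure rate at time k.
    Returns (a1_alpha, a2_alpha, a1_beta, a2_beta).
    """
    def integral(rate):
        h = t / steps
        total = 0.5 * (rate(0.0) + rate(t))
        for s in range(1, steps):
            total += rate(s * h)
        return total * h

    a1a = math.exp(-integral(lambda k: m(k) + n(k) - alpha * n(k)))   # (6.13)
    a2a = math.exp(-integral(lambda k: m(k) - l(k) + alpha * l(k)))   # (6.14)
    a1b = math.exp(-integral(lambda k: m(k) + beta * n_dash(k)))      # (6.15)
    a2b = math.exp(-integral(lambda k: m(k) - beta * l_dash(k)))      # (6.16)
    return a1a, a2a, a1b, a2b
```

For constant rates the integrals reduce to \(m\,t\), etc., which gives a quick sanity check: at α = 1 both α-cut bounds collapse to \(\exp(-mt)\).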
By taking the inverses of Equations (6.13)–(6.16), µR(t ) and ν R(t ) can be obtained as
\[
\mu_{R(t)}(y) =
\begin{cases}
\dfrac{\ln(y) + \int_0^t (m(k) + n(k))\, dk}{\int_0^t n(k)\, dk}, & \exp\left(-\int_0^t \{m(k)+n(k)\}\, dk\right) \le y \le \exp\left(-\int_0^t m(k)\, dk\right) \\[2ex]
-\dfrac{\ln(y) + \int_0^t (m(k) - l(k))\, dk}{\int_0^t l(k)\, dk}, & \exp\left(-\int_0^t m(k)\, dk\right) \le y \le \exp\left(-\int_0^t \{m(k)-l(k)\}\, dk\right)
\end{cases} \tag{6.17}
\]
Chapter six: Time-dependent conflicting bifuzzy set and its applications 117
\[
\nu_{R(t)}(y) =
\begin{cases}
-\dfrac{\ln(y) + \int_0^t m(k)\, dk}{\int_0^t n'(k)\, dk}, & \exp\left(-\int_0^t \{m(k)+n'(k)\}\, dk\right) \le y \le \exp\left(-\int_0^t m(k)\, dk\right) \\[2ex]
\dfrac{\ln(y) + \int_0^t m(k)\, dk}{\int_0^t l'(k)\, dk}, & \exp\left(-\int_0^t m(k)\, dk\right) \le y \le \exp\left(-\int_0^t \{m(k)-l'(k)\}\, dk\right)
\end{cases} \tag{6.18}
\]
It is very clear that R(t) is a CBFN. Now we can consider the following two models:

Model 1. When the failure rate function is fixed, i.e., F(t) = F, then l(t) = l, m(t) = m, n(t) = n, l′(t) = l′, and n′(t) = n′. Now we have

\[
F_\alpha(t) = [\, m - l + \alpha l,\; m + n - \alpha n \,], \quad \forall \alpha \in [0, 1]
\]

and

\[
F_\beta(t) = [\, m - \beta l',\; m + \beta n' \,], \quad \forall \beta \in [0, 1]
\]
Since R(0) = 1 and R(∞) = 0, from Equations (6.17) and (6.18) we obtain
\[
\mu_{R(t)}(y) =
\begin{cases}
\dfrac{\ln(y) + (m+n)t}{nt}, & \exp[-(m+n)t] \le y \le \exp[-mt], \quad 0 < t < \infty \\[1.5ex]
-\dfrac{\ln(y) + (m-l)t}{lt}, & \exp[-mt] \le y \le \exp[-(m-l)t], \quad 0 < t < \infty
\end{cases} \tag{6.19}
\]
\[
\nu_{R(t)}(y) =
\begin{cases}
-\dfrac{\ln(y) + mt}{n't}, & \exp[-(m+n')t] \le y \le \exp[-mt], \quad 0 < t < \infty \\[1.5ex]
\dfrac{\ln(y) + mt}{l't}, & \exp[-mt] \le y \le \exp[-(m-l')t], \quad 0 < t < \infty
\end{cases} \tag{6.20}
\]
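For Model 1 the two grades can be evaluated directly from Equations (6.19) and (6.20). A minimal Python sketch (the function names are hypothetical; the arguments follow the notation above):

```python
import math

def membership(y, m, l, n, t):
    """Membership grade of system reliability y at time t, Equation (6.19)."""
    if math.exp(-(m + n) * t) <= y <= math.exp(-m * t):
        return (math.log(y) + (m + n) * t) / (n * t)
    if math.exp(-m * t) <= y <= math.exp(-(m - l) * t):
        return -(math.log(y) + (m - l) * t) / (l * t)
    return 0.0

def nonmembership(y, m, l_dash, n_dash, t):
    """Nonmembership grade of system reliability y at time t, Equation (6.20)."""
    if math.exp(-(m + n_dash) * t) <= y <= math.exp(-m * t):
        return -(math.log(y) + m * t) / (n_dash * t)
    if math.exp(-m * t) <= y <= math.exp(-(m - l_dash) * t):
        return (math.log(y) + m * t) / (l_dash * t)
    return 0.0
```

At the modal value \(y = \exp(-mt)\) the membership grade is 1 and the nonmembership grade is 0, as the triangular shape requires.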
Model 2. When the failure rate function is not constant, F(t) depends on m(t), l(t), n(t), l′(t), and n′(t). Let us assume that l(t) = l, n(t) = n, l′(t) = l′, and n′(t) = n′ are constants and \(m(t) = p\,e^{qt}\), where p is a positive constant. Since R(0) = 1 and R(∞) = 0, from Equations (6.17) and (6.18) we get
\[
\mu_{R(t)}(y) =
\begin{cases}
\dfrac{\ln(y) + \frac{p}{q}\left(e^{qt} - 1\right) + nt}{nt}, & \exp\left[-\frac{p}{q}\left(e^{qt}-1\right) - nt\right] \le y \le \exp\left[-\frac{p}{q}\left(e^{qt}-1\right)\right] \\[2ex]
-\dfrac{\ln(y) + \frac{p}{q}\left(e^{qt} - 1\right) - lt}{lt}, & \exp\left[-\frac{p}{q}\left(e^{qt}-1\right)\right] \le y \le \exp\left[-\frac{p}{q}\left(e^{qt}-1\right) + lt\right]
\end{cases}
\]
for 0 < t < ∞. (6.21)
\[
\nu_{R(t)}(y) =
\begin{cases}
-\dfrac{\ln(y) + \frac{p}{q}\left(e^{qt} - 1\right)}{n't}, & \exp\left[-\frac{p}{q}\left(e^{qt}-1\right) - n't\right] \le y \le \exp\left[-\frac{p}{q}\left(e^{qt}-1\right)\right] \\[2ex]
\dfrac{\ln(y) + \frac{p}{q}\left(e^{qt} - 1\right)}{l't}, & \exp\left[-\frac{p}{q}\left(e^{qt}-1\right)\right] \le y \le \exp\left[-\frac{p}{q}\left(e^{qt}-1\right) + l't\right]
\end{cases}
\]
for 0 < t < ∞. (6.22)
6.5 Reliability evaluation of series and parallel systems having components following a time-dependent conflicting bifuzzy failure rate

6.5.1 Series system
Consider a series system having j components. Let the failure rate of the ith component, \(\gamma_i(t)\), be represented as

\[
\gamma_i(t) = \big( m_i(t) - l_i(t),\; m_i(t),\; m_i(t) + n_i(t);\; m_i(t) - l_i'(t),\; m_i(t),\; m_i(t) + n_i'(t) \big)
\]

Then the reliability of the ith component is

\[
R_i(t) = \exp\left( -\int_0^t \gamma_i(k)\, dk \right); \quad i = 1, 2, \ldots, j, \; t > 0
\]
The system reliability is then

\[
R_S(t) = \exp\left( -\int_0^t \sum_{i=1}^{j} \gamma_i(k)\, dk \right) = \exp\left( -\int_0^t \gamma_S(k)\, dk \right) \tag{6.24}
\]

The α-cut of the system failure rate is

\[
\gamma_S(t, \alpha) = \left[ \sum_{i=1}^{j} \big( m_i(t) - l_i(t) + \alpha l_i(t) \big),\; \sum_{i=1}^{j} \big( m_i(t) + n_i(t) - \alpha n_i(t) \big) \right], \quad \forall \alpha \in [0, 1] \tag{6.25}
\]
Similarly,

\[
\gamma_S(t, \beta) = \left[ \sum_{i=1}^{j} \big( m_i(t) - \beta l_i'(t) \big),\; \sum_{i=1}^{j} \big( m_i(t) + \beta n_i'(t) \big) \right], \quad \forall \beta \in [0, 1] \tag{6.26}
\]
Now since \(R_S(t)\) is also a CBFN, from Equations (6.11)–(6.14) we can obtain the α-cut and β-cut of \(R_S(t)\), respectively, for the membership function and the nonmembership function as

\[
R_S(t, \alpha) = \left[ \exp\left( -\int_0^t \sum_{i=1}^{j} \big( m_i(k) + n_i(k) - \alpha n_i(k) \big)\, dk \right),\; \exp\left( -\int_0^t \sum_{i=1}^{j} \big( m_i(k) - l_i(k) + \alpha l_i(k) \big)\, dk \right) \right] \tag{6.27}
\]

\[
R_S(t, \beta) = \left[ \exp\left( -\int_0^t \sum_{i=1}^{j} \big( m_i(k) + \beta n_i'(k) \big)\, dk \right),\; \exp\left( -\int_0^t \sum_{i=1}^{j} \big( m_i(k) - \beta l_i'(k) \big)\, dk \right) \right] \tag{6.28}
\]
From Model 1, considering the failure rate as constant, the α-cut of \(R_S(t)\) for the membership function and the β-cut of \(R_S(t)\) for the nonmembership function are obtained as

\[
R_S(t, \alpha) = \left[ \exp\left( -t \sum_{i=1}^{j} (m_i + n_i - \alpha n_i) \right),\; \exp\left( -t \sum_{i=1}^{j} (m_i - l_i + \alpha l_i) \right) \right] \tag{6.29}
\]

\[
R_S(t, \beta) = \left[ \exp\left( -t \sum_{i=1}^{j} (m_i + \beta n_i') \right),\; \exp\left( -t \sum_{i=1}^{j} (m_i - \beta l_i') \right) \right] \tag{6.30}
\]
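Equations (6.29) and (6.30) are straightforward to compute for a list of components. A sketch, assuming each component is given by its constant centre and spreads (names are illustrative):

```python
import math

def series_alpha_cut(components, t, alpha):
    """alpha-cut [lower, upper] of series-system reliability, Equation (6.29).

    `components` is a list of (m, l, n) triples for the membership part of
    each component's constant conflicting bifuzzy failure rate.
    """
    lo = math.exp(-t * sum(m + n - alpha * n for m, l, n in components))
    hi = math.exp(-t * sum(m - l + alpha * l for m, l, n in components))
    return lo, hi

def series_beta_cut(components, t, beta):
    """beta-cut [lower, upper] of series-system reliability, Equation (6.30).

    `components` is a list of (m, l_dash, n_dash) triples for the
    nonmembership part of each component's failure rate.
    """
    lo = math.exp(-t * sum(m + beta * nd for m, ld, nd in components))
    hi = math.exp(-t * sum(m - beta * ld for m, ld, nd in components))
    return lo, hi
```

At α = 1 (and β = 0) both bounds collapse to the crisp series reliability \(\exp(-t \sum_i m_i)\).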
From Model 2, when the failure rate is not fixed, the α-cut and β-cut of \(R_S(t)\) for the membership function and the nonmembership function are calculated as

\[
R_S(t, \alpha) = \left[ \exp\left( -\int_0^t \sum_{i=1}^{j} \big( p_i e^{q_i k} + n_i - \alpha n_i \big)\, dk \right),\; \exp\left( -\int_0^t \sum_{i=1}^{j} \big( p_i e^{q_i k} - l_i + \alpha l_i \big)\, dk \right) \right] \tag{6.31}
\]

\[
R_S(t, \beta) = \left[ \exp\left( -\int_0^t \sum_{i=1}^{j} \big( p_i e^{q_i k} + \beta n_i' \big)\, dk \right),\; \exp\left( -\int_0^t \sum_{i=1}^{j} \big( p_i e^{q_i k} - \beta l_i' \big)\, dk \right) \right] \tag{6.32}
\]
6.5.2 Parallel system

For a parallel system of j components, the reliability \(R_P(t)\) at time t is

\[
R_P(t) = 1 - \prod_{i=1}^{j} \big( 1 - R_i(t) \big) = 1 - \prod_{i=1}^{j} \left[ 1 - \exp\left( -\int_0^t \gamma_i(k)\, dk \right) \right] \tag{6.33}
\]
Now since \(R_P(t)\) is also a CBFN, from Equations (6.11)–(6.14) the α-cut and β-cut of \(R_P(t)\) for the membership function and the nonmembership function, respectively, are obtained as

\[
R_P(t, \alpha) = \left[ 1 - \prod_{i=1}^{j} \left( 1 - \exp\left( -\int_0^t (m_i(k) + n_i(k) - \alpha n_i(k))\, dk \right) \right),\; 1 - \prod_{i=1}^{j} \left( 1 - \exp\left( -\int_0^t (m_i(k) - l_i(k) + \alpha l_i(k))\, dk \right) \right) \right] \tag{6.34}
\]

\[
R_P(t, \beta) = \left[ 1 - \prod_{i=1}^{j} \left( 1 - \exp\left( -\int_0^t (m_i(k) + \beta n_i'(k))\, dk \right) \right),\; 1 - \prod_{i=1}^{j} \left( 1 - \exp\left( -\int_0^t (m_i(k) - \beta l_i'(k))\, dk \right) \right) \right] \tag{6.35}
\]
From Model 1, if the failure rate is constant, then we have

\[
R_P(t, \alpha) = \left[ 1 - \prod_{i=1}^{j} \big( 1 - \exp\{-t(m_i + n_i - \alpha n_i)\} \big),\; 1 - \prod_{i=1}^{j} \big( 1 - \exp\{-t(m_i - l_i + \alpha l_i)\} \big) \right] \tag{6.36}
\]

\[
R_P(t, \beta) = \left[ 1 - \prod_{i=1}^{j} \big( 1 - \exp\{-t(m_i + \beta n_i')\} \big),\; 1 - \prod_{i=1}^{j} \big( 1 - \exp\{-t(m_i - \beta l_i')\} \big) \right] \tag{6.37}
\]
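The parallel-system cuts in Equations (6.36) and (6.37) can be sketched in the same way (again with illustrative names):

```python
import math

def parallel_alpha_cut(components, t, alpha):
    """alpha-cut [lower, upper] of parallel-system reliability, Equation (6.36)."""
    prod_lo = 1.0
    prod_hi = 1.0
    for m, l, n in components:
        prod_lo *= 1.0 - math.exp(-t * (m + n - alpha * n))  # pessimistic rates
        prod_hi *= 1.0 - math.exp(-t * (m - l + alpha * l))  # optimistic rates
    return 1.0 - prod_lo, 1.0 - prod_hi

def parallel_beta_cut(components, t, beta):
    """beta-cut [lower, upper] of parallel-system reliability, Equation (6.37)."""
    prod_lo = 1.0
    prod_hi = 1.0
    for m, ld, nd in components:
        prod_lo *= 1.0 - math.exp(-t * (m + beta * nd))
        prod_hi *= 1.0 - math.exp(-t * (m - beta * ld))
    return 1.0 - prod_lo, 1.0 - prod_hi
```

The larger failure rate \(m + n - \alpha n\) gives the smaller component reliability, hence the lower bound of the interval.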
Again, from Model 2, if the failure rate is not constant, then we have

\[
R_P(t, \alpha) = \left[ 1 - \prod_{i=1}^{j} \left( 1 - \exp\left( -\int_0^t \big( p_i e^{q_i k} + n_i - \alpha n_i \big)\, dk \right) \right),\; 1 - \prod_{i=1}^{j} \left( 1 - \exp\left( -\int_0^t \big( p_i e^{q_i k} - l_i + \alpha l_i \big)\, dk \right) \right) \right] \tag{6.38}
\]
\[
R_P(t, \beta) = \left[ 1 - \prod_{i=1}^{j} \left( 1 - \exp\left( -\int_0^t \big( p_i e^{q_i k} + \beta n_i' \big)\, dk \right) \right),\; 1 - \prod_{i=1}^{j} \left( 1 - \exp\left( -\int_0^t \big( p_i e^{q_i k} - \beta l_i' \big)\, dk \right) \right) \right] \tag{6.39}
\]
6.5.3 Parallel-series system

Consider a parallel-series system, and let the failure rate of the sth component of the rth branch be represented as

\[
\gamma_{rs}(t) = \big( m_{rs}(t) - l_{rs}(t),\; m_{rs}(t),\; m_{rs}(t) + n_{rs}(t);\; m_{rs}(t) - l_{rs}'(t),\; m_{rs}(t),\; m_{rs}(t) + n_{rs}'(t) \big)
\]

From Equation (6.27), the reliability of the sth component of the rth branch is given by

\[
R_{rs}(t) = \exp\left( -\int_0^t \gamma_{rs}(k)\, dk \right)
\]
It is well known that the reliability \(R_{PS}(t)\) of a parallel-series system at time t is

\[
R_{PS}(t) = 1 - \prod_{s=1}^{j} \left( 1 - \prod_{r=1}^{i} R_{rs}(t) \right) = 1 - \prod_{s=1}^{j} \left[ 1 - \prod_{r=1}^{i} \exp\left( -\int_0^t \gamma_{rs}(k)\, dk \right) \right] \tag{6.40}
\]
Now since \(R_{PS}(t)\) is also a CBFN, from Equations (6.31)–(6.34), the α-cut and β-cut of \(R_{PS}(t)\), respectively, for the membership and the nonmembership functions, are obtained as

\[
R_{PS}(t, \alpha) = \left[ 1 - \prod_{s=1}^{j} \left( 1 - \prod_{r=1}^{i} \exp\left( -\int_0^t (m_{rs}(k) + n_{rs}(k) - \alpha n_{rs}(k))\, dk \right) \right),\; 1 - \prod_{s=1}^{j} \left( 1 - \prod_{r=1}^{i} \exp\left( -\int_0^t (m_{rs}(k) - l_{rs}(k) + \alpha l_{rs}(k))\, dk \right) \right) \right] \tag{6.41}
\]

\[
R_{PS}(t, \beta) = \left[ 1 - \prod_{s=1}^{j} \left( 1 - \prod_{r=1}^{i} \exp\left( -\int_0^t (m_{rs}(k) + \beta n_{rs}'(k))\, dk \right) \right),\; 1 - \prod_{s=1}^{j} \left( 1 - \prod_{r=1}^{i} \exp\left( -\int_0^t (m_{rs}(k) - \beta l_{rs}'(k))\, dk \right) \right) \right] \tag{6.42}
\]
From Model 1, if the failure rate is constant, then we have

\[
R_{PS}(t, \alpha) = \left[ 1 - \prod_{s=1}^{j} \left( 1 - \prod_{r=1}^{i} \exp\{-t(m_{rs} + n_{rs} - \alpha n_{rs})\} \right),\; 1 - \prod_{s=1}^{j} \left( 1 - \prod_{r=1}^{i} \exp\{-t(m_{rs} - l_{rs} + \alpha l_{rs})\} \right) \right] \tag{6.43}
\]

\[
R_{PS}(t, \beta) = \left[ 1 - \prod_{s=1}^{j} \left( 1 - \prod_{r=1}^{i} \exp\{-t(m_{rs} + \beta n_{rs}')\} \right),\; 1 - \prod_{s=1}^{j} \left( 1 - \prod_{r=1}^{i} \exp\{-t(m_{rs} - \beta l_{rs}')\} \right) \right] \tag{6.44}
\]
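Equation (6.43) nests the two products; a sketch for the α-cut, with each branch represented as a list of (m, l, n) triples (the data structure and names are illustrative):

```python
import math

def parallel_series_alpha_cut(branches, t, alpha):
    """alpha-cut [lower, upper] of parallel-series reliability, Equation (6.43).

    `branches` is a list of branches; each branch is a list of (m, l, n)
    triples, one per series component in that branch.
    """
    outer_lo = 1.0
    outer_hi = 1.0
    for branch in branches:
        chain_lo = 1.0   # inner product of component reliabilities (series)
        chain_hi = 1.0
        for m, l, n in branch:
            chain_lo *= math.exp(-t * (m + n - alpha * n))
            chain_hi *= math.exp(-t * (m - l + alpha * l))
        outer_lo *= 1.0 - chain_lo
        outer_hi *= 1.0 - chain_hi
    return 1.0 - outer_lo, 1.0 - outer_hi
```

With a single one-component branch this degenerates to the component reliability itself, which is a convenient check.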
Again, if the failure rate function \(\gamma_{rs}(t)\) of the sth component of the rth branch is not constant and is represented as

\[
\gamma_{rs}(t) = \big( m_{rs}(t) - l_{rs}(t),\; m_{rs}(t),\; m_{rs}(t) + n_{rs}(t);\; m_{rs}(t) - l_{rs}'(t),\; m_{rs}(t),\; m_{rs}(t) + n_{rs}'(t) \big)
\]

where \(l_{rs}(t) = l_{rs}\), \(n_{rs}(t) = n_{rs}\), \(l_{rs}'(t) = l_{rs}'\), and \(n_{rs}'(t) = n_{rs}'\) are constants and \(m_{rs}(t) = p_{rs} e^{q_{rs} t}\), where \(p_{rs}\) is a positive constant, then we have
\[
R_{PS}(t, \alpha) = \left[ 1 - \prod_{s=1}^{j} \left( 1 - \prod_{r=1}^{i} \exp\left( -\int_0^t \big( p_{rs} e^{q_{rs} k} + n_{rs} - \alpha n_{rs} \big)\, dk \right) \right),\; 1 - \prod_{s=1}^{j} \left( 1 - \prod_{r=1}^{i} \exp\left( -\int_0^t \big( p_{rs} e^{q_{rs} k} - l_{rs} + \alpha l_{rs} \big)\, dk \right) \right) \right] \tag{6.45}
\]
\[
R_{PS}(t, \beta) = \left[ 1 - \prod_{s=1}^{j} \left( 1 - \prod_{r=1}^{i} \exp\left( -\int_0^t \big( p_{rs} e^{q_{rs} k} + \beta n_{rs}' \big)\, dk \right) \right),\; 1 - \prod_{s=1}^{j} \left( 1 - \prod_{r=1}^{i} \exp\left( -\int_0^t \big( p_{rs} e^{q_{rs} k} - \beta l_{rs}' \big)\, dk \right) \right) \right] \tag{6.46}
\]
6.5.4 Series-parallel system

Let the failure rate of the rth component of the sth subsystem be represented as

\[
\gamma_{rs}(t) = \big( m_{rs}(t) - l_{rs}(t),\; m_{rs}(t),\; m_{rs}(t) + n_{rs}(t);\; m_{rs}(t) - l_{rs}'(t),\; m_{rs}(t),\; m_{rs}(t) + n_{rs}'(t) \big)
\]

From Equation (6.10), the reliability of the rth component of the sth subsystem is

\[
R_{rs}(t) = \exp\left( -\int_0^t \gamma_{rs}(k)\, dk \right)
\]
\[
R_{SP}(t) = \prod_{s=1}^{i} \left( 1 - \prod_{r=1}^{j} \big( 1 - R_{rs}(t) \big) \right) = \prod_{s=1}^{i} \left[ 1 - \prod_{r=1}^{j} \left( 1 - \exp\left( -\int_0^t \gamma_{rs}(k)\, dk \right) \right) \right] \tag{6.47}
\]
Now since \(R_{SP}(t)\) is also a CBFN, from Equations (6.11)–(6.14), the α-cut and β-cut of \(R_{SP}(t)\) for the membership function and the nonmembership function, respectively, are obtained as

\[
R_{SP}(t, \alpha) = \left[ \prod_{s=1}^{i} \left( 1 - \prod_{r=1}^{j} \left( 1 - \exp\left( -\int_0^t (m_{rs}(k) + n_{rs}(k) - \alpha n_{rs}(k))\, dk \right) \right) \right),\; \prod_{s=1}^{i} \left( 1 - \prod_{r=1}^{j} \left( 1 - \exp\left( -\int_0^t (m_{rs}(k) - l_{rs}(k) + \alpha l_{rs}(k))\, dk \right) \right) \right) \right] \tag{6.48}
\]

\[
R_{SP}(t, \beta) = \left[ \prod_{s=1}^{i} \left( 1 - \prod_{r=1}^{j} \left( 1 - \exp\left( -\int_0^t (m_{rs}(k) + \beta n_{rs}'(k))\, dk \right) \right) \right),\; \prod_{s=1}^{i} \left( 1 - \prod_{r=1}^{j} \left( 1 - \exp\left( -\int_0^t (m_{rs}(k) - \beta l_{rs}'(k))\, dk \right) \right) \right) \right] \tag{6.49}
\]
From Model 1, if the failure rate is constant, then we have

\[
R_{SP}(t, \alpha) = \left[ \prod_{s=1}^{i} \left( 1 - \prod_{r=1}^{j} \big( 1 - \exp(-t(m_{rs} + n_{rs} - \alpha n_{rs})) \big) \right),\; \prod_{s=1}^{i} \left( 1 - \prod_{r=1}^{j} \big( 1 - \exp(-t(m_{rs} - l_{rs} + \alpha l_{rs})) \big) \right) \right] \tag{6.50}
\]

\[
R_{SP}(t, \beta) = \left[ \prod_{s=1}^{i} \left( 1 - \prod_{r=1}^{j} \big( 1 - \exp(-t(m_{rs} + \beta n_{rs}')) \big) \right),\; \prod_{s=1}^{i} \left( 1 - \prod_{r=1}^{j} \big( 1 - \exp(-t(m_{rs} - \beta l_{rs}')) \big) \right) \right] \tag{6.51}
\]
Again, if the failure rate function \(\gamma_{rs}(t)\) is not constant and is represented as

\[
\gamma_{rs}(t) = \big( m_{rs}(t) - l_{rs}(t),\; m_{rs}(t),\; m_{rs}(t) + n_{rs}(t);\; m_{rs}(t) - l_{rs}'(t),\; m_{rs}(t),\; m_{rs}(t) + n_{rs}'(t) \big)
\]

where \(l_{rs}(t) = l_{rs}\), \(n_{rs}(t) = n_{rs}\), \(l_{rs}'(t) = l_{rs}'\), and \(n_{rs}'(t) = n_{rs}'\) are constants and \(m_{rs}(t) = p_{rs} e^{q_{rs} t}\), where \(p_{rs}\) is a positive constant, then we have
\[
R_{SP}(t, \alpha) = \left[ \prod_{s=1}^{i} \left( 1 - \prod_{r=1}^{j} \left( 1 - \exp\left( -\int_0^t \big( p_{rs} e^{q_{rs} k} + n_{rs} - \alpha n_{rs} \big)\, dk \right) \right) \right),\; \prod_{s=1}^{i} \left( 1 - \prod_{r=1}^{j} \left( 1 - \exp\left( -\int_0^t \big( p_{rs} e^{q_{rs} k} - l_{rs} + \alpha l_{rs} \big)\, dk \right) \right) \right) \right] \tag{6.52}
\]

\[
R_{SP}(t, \beta) = \left[ \prod_{s=1}^{i} \left( 1 - \prod_{r=1}^{j} \left( 1 - \exp\left( -\int_0^t \big( p_{rs} e^{q_{rs} k} + \beta n_{rs}' \big)\, dk \right) \right) \right),\; \prod_{s=1}^{i} \left( 1 - \prod_{r=1}^{j} \left( 1 - \exp\left( -\int_0^t \big( p_{rs} e^{q_{rs} k} - \beta l_{rs}' \big)\, dk \right) \right) \right) \right] \tag{6.53}
\]
6.6 Examples

In this section, some numerical examples are discussed to illustrate the new approach.

First, the system reliability of a power plant is calculated using Equations (6.29) and (6.30). The reliabilities are obtained as triangular CBFNs for a conflicting bifuzzy failure rate of the turbines at different values of time t; for instance, the system reliability at time t = 150 is obtained in this form.

Next, the reliability of a system is evaluated using Equations (6.38) and (6.39); the system reliability is a triangular CBFN for the conflicting bifuzzy failure rate of the components at different values of time t.

The reliability of a parallel-series system is then evaluated using Equations (6.43) and (6.44); for example, the system reliability at time t = 10 is obtained in this form.

Finally, the reliability of a series-parallel system is evaluated using Equations (6.50) and (6.51); the system reliability at t = 10 and at t = 20 is obtained in the same form.
6.7 Conclusion

In this study, a procedure is introduced to construct the membership and nonmembership functions of the fuzzy reliability function by considering the failure rates as time-dependent CBFNs. With the introduced approach, the reliability of different systems (series, parallel, parallel-series, and series-parallel) is evaluated in the form of a triangular CBFN. In all of these systems, the failure rate of each component is taken as a time-dependent triangular CBFS. Since the fuzzy set and the intuitionistic fuzzy set are special cases of the CBFS, the proposed method applies equally well to these types of sets. Hence, we can conclude that this approach can easily be applied to assess the reliability of systems whenever there is some uncertainty in the information available about the systems.
References
Atanassov K. Intuitionistic fuzzy sets. Fuzzy Sets and Systems 1986; 20(1):87–96.
Atanassov K. Intuitionistic Fuzzy Sets. New York: Physica-Verlag; 1999.
Basu K, Deb R, Pattanaik PK. Soft sets: An ordinal formulation of vagueness with some application
to the theory of choice. Fuzzy Sets and Systems 1992; 45:45–58.
Burillo P, Bustince H. Construction theorems for intuitionistic fuzzy sets. Fuzzy Sets and Systems
1996; 84:271–281.
Cai KY, Wen CY, Zhang ML. Fuzzy variables as a basis for a theory of fuzzy reliability in the
possibility context. Fuzzy Sets and Systems 1991; 42(2):145–172.
Chen SM. Fuzzy system reliability analysis using fuzzy number arithmetic operations. Fuzzy Sets
and Systems 1994; 64(1):31–38.
Chen SM. New method for fuzzy system reliability analysis. Cybernetics and Systems: An International
Journal 1996; 27:385–401.
Gau WL, Buehrer DJ. Vague sets. IEEE Transactions on Systems, Man, and Cybernetics 1993; 23:610–614.
Gianpiero C, David C. Basic intuitionistic principle in fuzzy set theories and its extension
(A terminological debate on Atanassov IFS). Fuzzy Sets and Systems 2006; 157:3198–3219.
Goguen J. L-fuzzy sets. Journal of Mathematical Analysis and Applications 1967; 18:145–174.
Li DF, Shan F, Cheng CT. On properties of four IFS operators. Fuzzy Sets and Systems 2005; 154:151–155.
Roy SK, Maity G, Weber GW, Gok SZA. Conic scalarization approach to solve multi-choice
multi-objective transportation problem with interval goal. Annals of Operations Research 2017;
253(1):599–620.
Singer D. A fuzzy set approach to fault tree and reliability analysis. Fuzzy Sets and Systems 1990;
34(2):145–155.
Supriya KD, Ranjit B, Akhil RR. Some operations on intuitionistic fuzzy sets. Fuzzy Sets and Systems
2005; 156:492–495.
Zadeh LA. Fuzzy sets. Information and Control 1965; 8(3):338–353.
Zamali T, Lazim MA, Osman MTA. An introduction to conflicting bifuzzy set theory. International
Journal of Mathematics and Statistics 2008; 3(A08):86–95.
Zhang WR, Zhang L. Yin-Yang bipolar logic and bipolar fuzzy logic. Information Sciences 2004;
165:265–287.
chapter seven

Recent progress on failure time data analysis of repairable system

Tadashi Dohi
Hiroshima University
7.1 I ntroduction......................................................................................................................... 129
7.2 Model description............................................................................................................... 131
7.3 Parametric estimation method......................................................................................... 132
7.3.1 Single failure-occurrence time data case............................................................. 133
7.3.2 Multiple failure-occurrence time data case........................................................ 134
7.4 Nonparametric estimation methods................................................................................ 135
7.4.1 Constrained nonparametric ML estimator......................................................... 135
7.4.1.1 Single failure-occurrence time data case.............................................. 135
7.4.1.2 Multiple failure-occurrence time data case.......................................... 137
7.4.2 Kernel-based approach.......................................................................................... 138
7.4.2.1 Single failure-occurrence time data case.............................................. 138
7.4.2.2 Multiple failure-occurrence time data case.......................................... 140
7.5 Numerical examples........................................................................................................... 142
7.5.1 Simulation experiments with single minimal repair data............................... 142
7.5.2 Real example with multiple minimal repair data sets...................................... 144
7.6 Conclusions.......................................................................................................................... 146
References...................................................................................................................................... 147
7.1 Introduction
In recent years, industrial systems have become larger in scale and more complex, and they play a significant role in improving the quality of our daily life. To utilize the ability of such systems, it is important to understand the behavior of their failure phenomena. Operators of repairable systems need to assess the system reliability and/or availability accurately over long periods. In addition, estimating the cost of maintaining operation of repairable systems is regarded as an important issue for practitioners. To describe the stochastic behavior of the cumulative number of failures occurring as the operating time progresses, we can apply stochastic point processes as a powerful mathematical tool. In fact, there are many research results for life data analysis of industrial systems, including production machines, which are based on stochastic modeling of time-to-failure phenomena. By modeling a conditional intensity function that
represents the rate of occurrence of failures at an arbitrary time, various types of failure phenomena can be described by stochastic point processes. These stochastic point processes can be characterized by the failure time (lifetime) distribution and/or the kind of repair operation (Ascher and Feingold 1984; Nakagawa 2005).
For non-repairable systems, the failed system or component is usually replaced by a new one, or repaired appropriately, when a failure occurs. In the situation where the replacement time can be assumed negligible with respect to the failure time scale, the time evolution can be described by a renewal process (RP) with independent and identically distributed (i.i.d.) inter-failure times (renewal distribution) (Cox 1972). We may also describe failure occurrence phenomena in repairable systems by modeling them with other representative stochastic point processes. It is well known that the nonhomogeneous Poisson process (NHPP) is the simplest but most useful tool for modeling such phenomena. In repairable systems, we perform a repair action after a failure occurs in order to return the failed component to the normal condition. Such an activity may, in some cases, restore only the damaged part of the failed component back to a working condition that is only as good as it was just before the failure. This repair action is called minimal repair. A minimal repair process with negligible repair times can be represented by an NHPP. Therefore, the analytical treatment of the minimal repair process is rather easier than that of the RP.
In this chapter we mainly focus on the failure time data analysis based on the NHPP
and discuss several statistical estimation methods for a periodic replacement problem
with minimal repair as the simplest application of life data analysis with NHPP. NHPP is
characterized by the intensity function, which represents the rate at which events occur, or
the corresponding mean value function, which is defined by the integral of the intensity
function and means the expected cumulative number of events by an arbitrary time. Two types of statistical inference approaches for the NHPP are considered, depending on whether the information on the intensity function (or, equivalently, the mean value function) is known in advance. If the form of the intensity function is known in advance, a parametric model with that parametric intensity function is usually applied; if it is unknown, nonparametric models may be applied to avoid mis-specification of the failure occurrence phenomena. In this chapter, we consider the well-known parametric model called the power law process as an example, and summarize two nonparametric methods, the constrained nonparametric maximum likelihood estimation (CNPMLE) and the kernel-based estimation, for the NHPP.
As an application example of statistical inference of failure processes, a periodic
replacement problem with minimal repair is one of the most fundamental, but most
important maintenance solutions (Barlow and Proschan 1996). The original periodic
replacement model with minimal repair has been extended from various points of view
by several authors (Boland 1982; Colosimo et al. 2010; Nakagawa 1986; Park et al. 2000; Sheu
1990, 1991; Valdez-Flores and Feldman 1989) after the seminal contribution by Barlow and
Hunter (1960). Recently, Okamura et al. (2014) developed a dynamic programming algo-
rithm to compute effectively the optimal periodic replacement time in Nakagawa (1986).
We apply not only the parametric maximum likelihood estimation for the power
law process but also CNPMLE and kernel-based estimation methods for estimating the
cost-optimal periodic replacement problem with minimal repair, where single or multiple
minimal repair data are assumed. The former means that a single time series of failure
(minimal repair) time is observed; the latter implies that the multiple time series data are
observed from multiple production machines, where the multiple data involve the single
data case as a special case. In the numerical example, we conduct a simulation experiment
of single minimal repair process and a real data analysis with the multiple field data sets
of minimal repair of diesel engine in Nelson and Doganaksoy (1989).
7.2 Model description
Suppose that more than one failure may occur in each system component of a repairable
system. Usually, two kinds of repair actions are performed to return the failed component
state to the normal condition after each failure. One is called the minimal repair (Barlow
and Proschan 1996). This repair activity restores only the damaged part of the failure com-
ponent back to a working condition that is only as good as it was just before the failure.
Another is called the periodic replacement. This is a preventive maintenance action which is
planned in advance, where the used component is replaced by a new one at a prespecified
time. For describing the failure occurrence phenomena under such repair activities, it is
well known that an NHPP is useful. That is, NHPP can be used to model the stochastic
behavior of a cumulative number of failures under the minimal repair.
More specifically, suppose that the failure time T follows an absolutely continuous probability distribution function \(\Pr(T \le t) = F(t)\) and a probability density function \(dF(t)/dt = f(t)\). Define \(\bar{F}(t) = 1 - F(t)\). If the minimal repair is made at the first failure, then the probability that the system does not fail beyond time t is given by \(\bar{F}(t) + \int_0^t \big(\bar{F}(t)/\bar{F}(x)\big)\, dF(x)\). Continuing similar manipulations up to the n-th failure yields an NHPP, \(\{N(t), t \ge 0\}\), with the mean value function (Baxter 1982):

\[
E[N(t)] = \Lambda(t) = -\log \bar{F}(t) \tag{7.1}
\]
where N(t) represents the cumulative number of minimal repairs by time t. It is well known
that the NHPP { N ( t ) , t ≥ 0} possesses the following properties:
• N(0) = 0
• {N(t), t ≥ 0} has independent increments
• Pr{N(t + Δt) − N(t) ≥ 2} = o(Δt)
• Pr{N(t + Δt) − N(t) = 1} = λ(t) Δt + o(Δt)
where o(Δt) denotes a term of higher order than Δt, and the function λ(t) is called the intensity function of the NHPP. The mean value function in Equation (7.1) is also defined as the integral of the intensity function:

\[
\Lambda(t) = \int_0^t \lambda(x)\, dx \tag{7.2}
\]

and gives the expected cumulative number of failures occurring by time t. Then, the probability mass function (p.m.f.) of the NHPP is given by

\[
\Pr\{ N(t) = n \} = \frac{\{\Lambda(t)\}^n}{n!} \exp\big(-\Lambda(t)\big), \quad n = 0, 1, 2, \ldots \tag{7.3}
\]
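Because an NHPP is fully specified by its mean value function, its failure times can be simulated by mapping a unit-rate Poisson arrival sequence through \(\Lambda^{-1}\). The following sketch assumes the power law mean value function \(\Lambda(t) = (t/\eta)^\beta\) used later in this chapter; the function name is illustrative:

```python
import random

def simulate_nhpp_power_law(eta, beta, t_end, rng):
    """Generate one NHPP failure-time path on (0, t_end] by inversion.

    If S_k = E_1 + ... + E_k are unit-rate Poisson arrivals, then
    Lambda^{-1}(S_k) are NHPP event times.  Here Lambda(t) = (t/eta)**beta,
    so Lambda^{-1}(u) = eta * u**(1/beta).
    """
    times = []
    s = 0.0
    while True:
        s += rng.expovariate(1.0)          # next unit-rate arrival
        t = eta * s ** (1.0 / beta)        # map through Lambda^{-1}
        if t > t_end:
            return times
        times.append(t)
```

Setting η = β = 1 recovers a homogeneous Poisson process with unit rate, so the expected number of events by time t is simply t.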
If the component fails before time τ (> 0), then the failed component is restored to a working condition that is only as good as it was just before the failure in the periodic replacement problem with minimal repair. Here, the minimal repair is made so that the
failure rate remains undisturbed by repair after each failure. Also, after the operational
time reaches τ, we replace the used component by a new one preventively. Therefore,
it is easier to plan the preventive replacement periodically than the age replacement, since the past replacement history need not be recorded in the periodic replacement.
However, an additional cost is necessary for the periodic preventive replacement since
the used component is replaced by a new one at time τ, where the repaired component
before time τ is also used in operation. Define the time length from the beginning of the
operation to the periodic replacement as one cycle. Then, the expected total cost for one
cycle can be represented by cm Λ (τ ) + c p , where cm (> 0) and c p (> 0) represent the fixed cost
of each minimal repair and a periodic replacement, respectively. Dividing the expected
cost for one cycle by time τ leads to the long-run average cost per unit time in the peri-
odic replacement with minimal repair:
\[
C(\tau) = \frac{c_m \Lambda(\tau) + c_p}{\tau} = \frac{c_m \int_0^\tau \lambda(t)\, dt + c_p}{\tau} \tag{7.4}
\]
where τ (> 0) is a decision variable in our problem and denotes the periodic replacement time. Then, the purpose is to derive τ* which minimizes Equation (7.4). Differentiating Equation (7.4) with respect to τ and setting it to zero gives the first-order condition of optimality:

\[
\tau \lambda(\tau) - \int_0^\tau \lambda(t)\, dt = \gamma \tag{7.5}
\]
where γ is called the cost ratio and is defined by \(\gamma = c_p/c_m\). To solve the nonlinear Equation (7.5) with respect to τ, the unknown intensity function has to be estimated in either a parametric or a nonparametric way. This is usually done based on the information on the intensity function available from the past minimal repair record or history. Under a strictly increasing intensity function, i.e., \(d\lambda(t)/dt > 0\), if

\[
\lim_{\tau \to \infty} \left[ \tau \lambda(\tau) - \int_0^\tau \lambda(t)\, dt \right] > \gamma
\]

holds, then a unique and finite optimal periodic replacement time τ* (0 < τ* < ∞) minimizing Equation (7.4) always exists.
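Equation (7.5) can be solved numerically for any strictly increasing intensity. A bisection sketch (names are illustrative; the left side of (7.5) is nondecreasing in τ for an increasing λ, so bisection on a bracketing interval is valid):

```python
def optimal_replacement_time(lam, gamma, tau_hi, steps=4000):
    """Solve Equation (7.5), tau*lam(tau) - int_0^tau lam(t)dt = gamma,
    by bisection, assuming lam is strictly increasing on (0, tau_hi].
    """
    def g(tau):
        # trapezoidal approximation of the integral of lam on [0, tau]
        h = tau / steps
        integral = 0.5 * (lam(0.0) + lam(tau))
        for i in range(1, steps):
            integral += lam(i * h)
        integral *= h
        return tau * lam(tau) - integral - gamma

    lo, hi = 1e-9, tau_hi
    if g(hi) < 0.0:
        raise ValueError("no finite optimum below tau_hi for this cost ratio")
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For the power law intensity \(\lambda(t) = (\beta/\eta)(t/\eta)^{\beta-1}\), Equation (7.5) reduces to \((\beta - 1)(\tau/\eta)^\beta = \gamma\), i.e., \(\tau^* = \eta (\gamma/(\beta - 1))^{1/\beta}\), which gives a closed-form check of the solver.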
It is also known that the power law model represents two different situations in which the repairable system monotonically improves or monotonically deteriorates over time t. The intensity function λ(t) is decreasing in time under the model parameter β < 1, which means that the system is improving, whereas λ(t) is increasing in time under β > 1, which implies that the system is deteriorating. Furthermore, λ(t) is constant when β = 1, which corresponds to the special case where the underlying failure process reduces to a homogeneous Poisson process.
Once the model is selected, the next step is to estimate the model parameters (η, β) from failure-occurrence time data. Suppose that n failure-occurrence times, which are random variables, are given by \(0 < T_1 \le T_2 \le \cdots \le T_n \le T\), where T is the right-censoring time of these observations. That is, it is assumed that n failures occur by time t, which is the realization of T, and the realizations of \(T_i\) (i = 1, 2, …, n), say \(t_i\), are observed, where \(t_n \le t\).
Then the ML estimates \((\hat\eta, \hat\beta)\) are defined as the parameters that maximize the following likelihood function:

\[
LF(\eta, \beta \mid t_i) = \left[ \prod_{i=1}^{n} \lambda(t_i; \eta, \beta) \right] \exp\big( -\Lambda(t; \eta, \beta) \big) \tag{7.7}
\]

or, equivalently, its logarithm:

\[
LLF(\eta, \beta \mid t_i) = \sum_{i=1}^{n} \log \lambda(t_i; \eta, \beta) - \Lambda(t; \eta, \beta) \tag{7.8}
\]
From the first-order conditions of optimality in Equation (7.8) for each parameter, the ML estimates \((\hat\eta, \hat\beta)\) can be derived by solving the following simultaneous equations:

\[
\hat\eta = \frac{t}{n^{1/\hat\beta}} \tag{7.9}
\]

\[
\hat\beta = \frac{n}{\displaystyle \sum_{i=1}^{n} \log\!\big( t/t_i \big)} \tag{7.10}
\]
Following the above procedure, we obtain the ML-based plug-in point estimates, \(\hat\tau^*\) and \(C(\hat\tau^*)\), of the optimal periodic replacement time τ* and its associated minimum long-run average cost per unit time by substituting the resulting intensity function with the estimates \((\hat\eta, \hat\beta)\) into Equation (7.5). From the properties of the intensity function of the power law process, if the parameter β is greater than 1, then a unique and finite optimal periodic replacement time is guaranteed to exist.
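Equations (7.9) and (7.10) and the plug-in rule can be sketched as follows (function names are illustrative; the closed-form root of Equation (7.5) for the power law is used for the replacement time):

```python
import math

def power_law_mle(failure_times, t_censor):
    """ML estimates (eta_hat, beta_hat) from Equations (7.9) and (7.10)."""
    n = len(failure_times)
    beta_hat = n / sum(math.log(t_censor / ti) for ti in failure_times)
    eta_hat = t_censor / n ** (1.0 / beta_hat)
    return eta_hat, beta_hat

def plug_in_tau(eta_hat, beta_hat, gamma):
    """Plug-in optimal periodic replacement time from Equation (7.5).

    For the power law, tau*lam(tau) - Lambda(tau) = (beta-1)*(tau/eta)**beta,
    so Equation (7.5) has the closed-form root below when beta_hat > 1.
    """
    if beta_hat <= 1.0:
        raise ValueError("no finite optimum: beta_hat must exceed 1")
    return eta_hat * (gamma / (beta_hat - 1.0)) ** (1.0 / beta_hat)
```

Note that the estimates satisfy \(\Lambda(t; \hat\eta, \hat\beta) = (t/\hat\eta)^{\hat\beta} = n\): the fitted mean value function reproduces the observed number of failures at the censoring time.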
7.3.2 Multiple failure-occurrence time data case

Suppose that l sequences of failure-occurrence times are observed, where the jth sequence consists of \(n_j\) failure times \(t_{ji}\) (i = 1, 2, …, \(n_j\)) with right-censoring time \(\hat t_j\). Then the likelihood function is

\[
LF(\eta, \beta \mid t_{ji}) = \left[ \prod_{j=1}^{l} \prod_{i=1}^{n_j} \lambda(t_{ji}) \right] \exp\left( -\sum_{j=1}^{l} \int_0^{\hat t_j} \lambda(x)\, dx \right) f(\hat{t}) \tag{7.11}
\]

where \(f(\hat{t})\) is the joint density function of \(\hat T_1, \hat T_2, \ldots, \hat T_l\). Since this density function is independent of the model parameters, the ML estimates \((\hat\eta, \hat\beta)\) can be derived by maximizing the following log-likelihood function, similar to the single failure-occurrence time data case:

\[
LLF(\eta, \beta \mid t_{ji}) = \sum_{j=1}^{l} \left[ \sum_{i=1}^{n_j} \log \lambda(t_{ji}; \eta, \beta) - \Lambda(\hat t_j; \eta, \beta) \right] \tag{7.12}
\]
The first-order conditions of optimality then yield

\[
\hat\eta = \left( \frac{1}{m} \sum_{j=1}^{l} \hat t_j^{\,\hat\beta} \right)^{1/\hat\beta}, \quad m = \sum_{j=1}^{l} n_j \tag{7.13}
\]

\[
0 = \sum_{j=1}^{l} \left[ \frac{n_j}{\hat\beta} + \sum_{i=1}^{n_j} \log t_{ji} - \left( \frac{\hat t_j}{\hat\eta} \right)^{\hat\beta} \log \hat t_j \right] \tag{7.14}
\]
Unfortunately, in this case, β̂ cannot be derived in closed form. For the case of multiple minimal repair data sets, by assuming the form of the intensity function λ(t) or the corresponding mean value function Λ(t), the model parameters included in these functions can be estimated, and once the estimates of the intensity function are obtained, the optimal periodic replacement time follows from Equation (7.5). Therefore, in a similar way as with single minimal repair data, it is also easy to handle the multiple minimal repair data case.
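Equations (7.13) and (7.14) can be solved by eliminating η via (7.13) and bisecting the resulting score in β. A sketch (names are illustrative; it assumes the score crosses zero exactly once on the bracket, which holds in particular in the single-sequence case, where the solution reduces to Equations (7.9) and (7.10)):

```python
import math

def multi_sample_mle(samples, beta_lo=1e-3, beta_hi=50.0, iters=200):
    """Solve Equations (7.13)-(7.14) for (eta_hat, beta_hat) by bisection.

    `samples` is a list of (failure_times, t_censor) pairs, one per
    observed minimal repair sequence.
    """
    m = sum(len(ts) for ts, _ in samples)
    sum_logs = sum(math.log(ti) for ts, _ in samples for ti in ts)

    def score(beta):
        # Equation (7.14) with eta eliminated via Equation (7.13)
        denom = sum(tc ** beta for _, tc in samples)
        weighted = sum(tc ** beta * math.log(tc) for _, tc in samples)
        return m / beta + sum_logs - m * weighted / denom

    lo, hi = beta_lo, beta_hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    beta_hat = 0.5 * (lo + hi)
    eta_hat = (sum(tc ** beta_hat for _, tc in samples) / m) ** (1.0 / beta_hat)
    return eta_hat, beta_hat
```

With a single sequence, the score simplifies to \(n/\beta - \sum_i \log(t/t_i)\), whose root is exactly the single-sample estimate (7.10).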
Though a number of variations have been studied in the literature for the preventive
maintenance problems including the age replacement problem and the periodic replace-
ment problem, it is assumed in many cases that the failure time distribution or equivalently
minimal repair process is completely known. Therefore, once the failure time distribution
is specified, the problem is to estimate the model parameters from the underlying failure
time data, and the point estimate of the optimal preventive maintenance time is derived as
a plug-in estimate with the estimated model parameters. In other words, when the failure
time distribution is unknown in advance, the analytical models in the literature cannot
provide the optimal solution.
where \(t_0 = 0\). By plotting the n failure points and connecting them by line segments, we can obtain the resulting estimate of the mean value function in Equation (7.15). This is called the naïve estimator; it has the property that the mean square error between the cumulative number of failures and the mean value function of the NHPP is always zero. The corresponding naïve estimate of the intensity function can be defined as the slope of the mean value function in each failure time interval:

\[
\hat\lambda(t) = \frac{1}{t_i - t_{i-1}}, \quad t_{i-1} < t \le t_i; \; t_0 = 0 \tag{7.16}
\]
Note that the naïve estimator in Equation (7.16) does not generalize well: although it can fit the past observations (training data) very well, its prediction for unknown (future) data is rather poor. In addition, the naïve estimator tends to fluctuate everywhere with large noise and does not provide stable estimation results.
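The naïve estimator of Equation (7.16) is a one-liner per interval; a sketch with an illustrative name:

```python
def naive_intensity(failure_times):
    """Naive piecewise-constant intensity estimate, Equation (7.16).

    Returns a function t -> lambda_hat(t) equal to 1/(t_i - t_{i-1}) on
    each inter-failure interval (t_{i-1}, t_i], with t_0 = 0.
    """
    ts = [0.0] + sorted(failure_times)

    def lam_hat(t):
        for left, right in zip(ts, ts[1:]):
            if left < t <= right:
                return 1.0 / (right - left)
        return 0.0   # outside the observed range

    return lam_hat
```

The resulting step function jumps wildly whenever two failures happen close together, which is exactly the instability described above.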
Contrary to this, Boswell (1966) introduced the idea of isotonic estimation and gave a CNPMLE. Suppose that the intensity function λ(t) is nondecreasing with respect to time t; i.e., the mean value function is nondecreasing and convex in time. Boswell (1966) proved that the intensity function that maximizes the likelihood function of the NHPP under the nondecreasing property is given by a step function with breakpoints at the realizations \(t_i\). Therefore, we can define the likelihood function as a function of the unknown intensities at the respective time points:

\[
LF(\lambda(t_i), i = 1, 2, \ldots, n) = \exp\left( -\int_0^{\hat t} \lambda(t)\, dt \right) \prod_{i=1}^{n} \lambda(t_i) \tag{7.17}
\]
Then the nonparametric ML estimation method with Equation (7.17) is formulated as a variational problem with respect to λ(·) (Equation (7.18)), whose solution is

\[
\hat\lambda(t) =
\begin{cases}
0, & 0 \le t < t_1, \\[1ex]
\displaystyle \max_{1 \le p \le i} \; \min_{i \le q \le n} \; \frac{q - p}{t_q - t_p}, & t_i \le t < t_{i+1}; \; i = 1, 2, \ldots, n-1
\end{cases} \tag{7.19}
\]
It can easily be checked that Equation (7.19) leads to an upper bound of Equation (7.17) for an arbitrary nondecreasing intensity function, by substituting Equation (7.19) into Equation (7.17). Although the resulting estimator in Equation (7.19) is still discontinuous, it is somewhat smoother than Equation (7.16). The computational cost of the CNPMLE is quite low compared with other representative nonparametric estimation methods. We introduce the following simple algorithm for the CNPMLE of an NHPP:
Chapter seven: Recent progress on failure time data analysis of repairable system 137
• Set h = 1 and i_1 = 1.
• Repeat until i_{h+1} = n: set i_{h+1} to be the index i that minimizes the slope between (t_{i_h}, i_h − 1) and (t_i, i − 1) (i = i_h + 1, …, n), and increment h.
• The CNPMLE is then given by \hat{\lambda}(t) = (i_{j+1} − i_j)/(t_{i_{j+1}} − t_{i_j}) whenever t_{i_j} ≤ t < t_{i_{j+1}}.
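The max–min formula in Equation (7.19) can also be evaluated directly, which is often easier to verify than the incremental algorithm (a minimal sketch with made-up failure times; indices follow the 1-based notation of the text):

```python
# Sketch of the nondecreasing CNPMLE of Equation (7.19): value of the
# estimator on the interval [t_i, t_{i+1}), for 1-indexed failure times.
def cnpmle_nondecreasing(times, i):
    n = len(times)
    t = [0.0] + list(times)            # t[0] = t_0 = 0 for convenience
    best = 0.0
    for p in range(1, i + 1):          # 1 <= p <= i
        inner = min((q - p) / (t[q] - t[p])
                    for q in range(i, n + 1) if q > p)
        best = max(best, inner)
    return best

times = [1.0, 2.0, 2.5, 2.8]           # hypothetical failure times
print([round(cnpmle_nondecreasing(times, i), 3) for i in range(1, 4)])
# -> [1.0, 2.0, 3.333]
```

The returned values are nondecreasing in i, as the isotonic constraint requires.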
Since we assume at the moment that the intensity function is nondecreasing with respect
to time, the resulting CNPMLE is regarded as a specific estimator that represents an
increasing intensity trend. Therefore, if the data have such an increasing trend, then the
nondecreasing CNPMLE is expected to be useful. However, when the assumption on an
increasing intensity trend is violated, it may be less effective.
In contrast to the above discussion, we can also consider a nonincreasing intensity func-
tion λ(t). In this case, the mean value function is nondecreasing but concave in time. By
solving the variational problem in Equation (7.18) under the condition that λ(t) is non-
increasing, the corresponding CNPMLE can be derived as the following min–max solution:
\hat{\lambda}(t) = \begin{cases} \min_{1 \le p \le i} \max_{i \le q \le n} \dfrac{q - p}{t_q - t_p}, & t_{i-1} \le t \le t_i;\ i = 1, 2, \ldots, n, \\ 0, & t \ge t_n \end{cases} \qquad (7.20)
where t0 = 0. By assuming the monotone trend (nondecreasing or nonincreasing) of inten-
sity function in advance, CNPMLE can also represent the degradation or improvement of
systems appropriately.
Since the left side of the first-order condition of optimality in Equation (7.5) with the
CNPMLE is a step function with breakpoints at the realizations of the failure-occurrence
times, the resulting optimal periodic replacement time is always equal to one of the realized
failure-occurrence times. This may be a drawback in applying the CNPMLE to the periodic
replacement problem with minimal repair. If there is no failure-occurrence time datum
close to the true optimal periodic replacement time, then the resulting estimate of the
optimal periodic replacement time can never be close to the true one.
For multiple failure-occurrence time data sets, the nondecreasing CNPMLE in Equation (7.19) generalizes to

\hat{\lambda}(t) = \begin{cases} 0, & 0 \le t < t_1, \\ \max_{1 \le p \le i} \min_{i \le q \le f} \dfrac{\sum_{i=p}^{q} m_i}{\sum_{i=p}^{q} \Delta_i}, & t_i \le t \le t_{i+1};\ i = 1, 2, \ldots, s - 1 \end{cases} \qquad (7.21)
where

\Delta_i = \sum_{j=1}^{l} \left[ \min(T_j, t_{i+1}) - \min(T_j, t_i) \right] \qquad (7.22)

denotes the total observation time of all l processes in the interval (t_i, t_{i+1}], and m_i
denotes the number of failures that occurred at time t_i. Similarly, the CNPMLE under the
nonincreasing condition with multiple failure-occurrence time data sets is derived as
\hat{\lambda}(t) = \begin{cases} \min_{1 \le p \le i} \max_{i \le q \le f} \dfrac{\sum_{i=p}^{q} m_i}{\sum_{i=p}^{q} \Delta_i}, & t_{i-1} \le t \le t_i;\ i = 1, 2, \ldots, s, \\ 0, & t \ge t_s \end{cases} \qquad (7.23)
7.4.2 Kernel-based approach
We introduce an alternative nonparametric estimation technique, which is called the
kernel-based method, for the periodic replacement problem with minimal repair. The
kernel-based approach is well known to be quite useful since the convergence of non-
parametric estimators can be improved (Rinsaka and Dohi 2005). Recently, Gilardoni
and Colosimo (2011) applied a kernel-based estimation method to obtain the opti-
mal periodic replacement time with multiple minimal repair data sets. Similar to the
CNPMLE, they applied the so-called TTT method and transformed the multiple-sample
case into a single-sample one. In this section, we use the well-known Gaussian kernel function, and
apply two bandwidth estimation methods with integrated least squares error (Diggle
and Marron 1989) and log likelihood function (Guan 2007). Diggle and Marron (1989)
proved the equivalence of smoothing parameter selection between the probability den-
sity function with i.i.d. samples and the intensity function estimation with the minimal
repair data.
The kernel-based estimate of the intensity function is then given by

\hat{\lambda}(t) = \frac{1}{h} \sum_{i=1}^{n} K\!\left( \frac{t - t_i}{h} \right) \qquad (7.24)
where K(·) denotes a kernel function and h is a positive constant, called smoothing parameter or
bandwidth. Roughly speaking, the kernel-based method approximates the intensity function
with a superposition of kernel functions with location parameter at each failure-occurrence
time. Since the accuracy of the estimate of λ(t) is more sensitive to the choice of h than to the
choice of kernel function, we deal only with the well-known Gaussian kernel function:
K(t) = \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{t^2}{2} \right) \qquad (7.25)
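Equations (7.24) and (7.25) combine into a few lines of code (a minimal sketch; the failure times and bandwidth below are arbitrary illustrative values):

```python
import math

# Sketch of the Gaussian-kernel intensity estimate of Equations
# (7.24)-(7.25): one kernel centred at each failure time, scaled by h.
def gaussian_kernel(t):
    return math.exp(-t * t / 2.0) / math.sqrt(2.0 * math.pi)

def kernel_intensity(times, h, t):
    return sum(gaussian_kernel((t - ti) / h) for ti in times) / h

times = [1.0, 1.5, 3.0]   # hypothetical failure times
print(kernel_intensity(times, 0.5, 1.2))
```

Unlike the naïve estimator, this estimate is smooth in t; the smaller h is, the closer it gets to a spike train at the failure times.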
In the LSCV method, the bandwidth h is chosen to minimize the integrated squared error between the estimated and true intensity functions:

ISE(h) = \int_0^{\hat{t}} \left\{ \hat{\lambda}(t) - \lambda(t) \right\}^2 dt = \int_0^{\hat{t}} \hat{\lambda}(t)^2\,dt - 2 \int_0^{\hat{t}} \hat{\lambda}(t)\lambda(t)\,dt + \int_0^{\hat{t}} \lambda(t)^2\,dt \qquad (7.26)
where λ(t) is the “true” but unknown intensity function. After omitting the last term,
which is independent of h, and approximating the second term in Equation (7.26), it can
be checked that the optimal bandwidth minimizing ISE (h) is equal to h minimizing the
following function:
CV(h) = \int_0^{\hat{t}} \hat{\lambda}(t)^2\,dt - 2 \sum_{r=1}^{n} \hat{\lambda}_{h,r}(t_r) \qquad (7.27)
where
\hat{\lambda}_{h,r}(t) = \frac{1}{h} \sum_{i=1, i \ne r}^{n} K\!\left( \frac{t - t_i}{h} \right) \qquad (7.28)
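The cross-validation score of Equations (7.27) and (7.28) can be sketched as follows; the integral term is approximated on a midpoint grid, which is an assumption of this sketch rather than part of the method described in the text, and the failure times and candidate bandwidths are made up:

```python
import math

# Sketch of the LSCV score of Equations (7.27)-(7.28).
def _phi(t):
    return math.exp(-t * t / 2.0) / math.sqrt(2.0 * math.pi)

def _lam(times, h, t, skip=None):
    # Kernel intensity estimate; drop the skip-th point for leave-one-out.
    return sum(_phi((t - ti) / h)
               for r, ti in enumerate(times) if r != skip) / h

def lscv_score(times, h, t_hat, grid=400):
    dt = t_hat / grid
    integral = sum(_lam(times, h, (k + 0.5) * dt) ** 2 for k in range(grid)) * dt
    loo = sum(_lam(times, h, tr, skip=r) for r, tr in enumerate(times))
    return integral - 2.0 * loo

times = [1.0, 1.5, 3.0]   # hypothetical failure times
best_h = min([0.2, 0.4, 0.8], key=lambda h: lscv_score(times, h, 5.0))
print(best_h)
```

The optimal bandwidth is then the minimizer of `lscv_score` over a set of candidate values of h.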
In the LLCV method with the same n training data sets, we consider obtaining the optimal
bandwidth h by maximizing the log-likelihood function with an unknown intensity func-
tion. The log-likelihood function based on the cross-validation approach is given by
\ln L(h) = \sum_{k=1}^{n} \sum_{r=1}^{n} \ln \hat{\lambda}_{h,r}(t_k) - \sum_{r=1}^{n} \hat{\Lambda}_{h,r}(\hat{t}) \qquad (7.29)
where
\hat{\Lambda}_{h,r}(t) = \int_0^{t} \hat{\lambda}_{h,r}(t)\,dt \qquad (7.30)
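A corresponding sketch of the LLCV score of Equations (7.29) and (7.30); for the Gaussian kernel, the cumulative intensity Λ̂ has a closed form through the normal CDF (via `math.erf`), which this sketch exploits (failure times, censoring time, and candidate bandwidths are made-up values):

```python
import math

# Sketch of the LLCV score of Equations (7.29)-(7.30).
def _norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def llcv_score(times, h, t_hat):
    n = len(times)

    def lam(t, skip):  # leave-one-out Gaussian-kernel intensity estimate
        return sum(math.exp(-0.5 * ((t - ti) / h) ** 2)
                   for r, ti in enumerate(times)
                   if r != skip) / (h * math.sqrt(2.0 * math.pi))

    def Lam(skip):     # its integral over [0, t_hat], in closed form
        return sum(_norm_cdf((t_hat - ti) / h) - _norm_cdf(-ti / h)
                   for r, ti in enumerate(times) if r != skip)

    log_term = sum(math.log(lam(tk, skip=r)) for tk in times for r in range(n))
    return log_term - sum(Lam(skip=r) for r in range(n))

best_h = max([0.2, 0.4, 0.8], key=lambda h: llcv_score([1.0, 1.5, 3.0], h, 5.0))
print(best_h)
```

Here the optimal bandwidth is the maximizer of the score, in contrast to LSCV, which is minimized.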
For multiple failure-occurrence time data sets, the intensity function of the NHPP for component j (j = 1, 2, …, l) is estimated from only the jth data set:

\hat{\lambda}_j^1(t) = \frac{1}{h} \sum_{i=1}^{n_j} K\!\left( \frac{t_{ji} - t}{h} \right) \qquad (7.31)

where K(·) is defined in Equation (7.25). Then the CV(h) in Equation (7.27) with the LSCV
method can be rewritten, by means of the leave-one-out cross-validation technique, as

CV(h) = \sum_{j=1}^{l} \left\{ \int_0^{\hat{t}_j} \hat{\lambda}_j^1(t)^2\,dt - 2 \sum_{r=1}^{n_j} \hat{\lambda}_{j,h,r}^1(t_{jr}) \right\} \qquad (7.32)
where
\hat{\lambda}_{j,h,r}^1(t) = \frac{1}{h} \sum_{i=1, i \ne r}^{n_j} K\!\left( \frac{t_{ji} - t}{h} \right) \qquad (7.33)
Similarly, the log-likelihood function in Equation (7.29) with the LLCV method can be rewritten as

\ln L(h) = \sum_{j=1}^{l} \left\{ \sum_{k=1}^{n_j} \sum_{r=1}^{n_j} \ln \hat{\lambda}_{j,h,r}^1(t_{jk}) - \sum_{r=1}^{n_j} \hat{\Lambda}_{j,h,r}^1(\hat{t}_j) \right\} \qquad (7.34)
where
\hat{\Lambda}_{j,h,r}^1(t) = \int_0^{t} \hat{\lambda}_{j,h,r}^1(t)\,dt \qquad (7.35)
and λ̂^1_{j,h,r}(t) is already defined in Equation (7.33). Unfortunately, if only one failure occurred
for the jth component, then Equation (7.33) or Equation (7.35) based on leave-one-out cross-
validation cannot work. Therefore, we remove such data sets from the analysis. By minimiz-
ing Equation (7.32) or maximizing Equation (7.34), we estimate the optimal bandwidth h.
In this scheme, the intensity function of the NHPP for component j is based only on the jth set
of failure-occurrence time data, since the intensity function is defined as in Equation (7.31).
Therefore, l-different intensity functions can be derived according to the behavior of each
set of failure-occurrence time data. It may be useful to consider the arithmetic mean of all
intensity functions by taking the average of the failure-occurrence phenomena. That is,
\hat{\lambda}(t) = \frac{1}{m} \sum_{j=1}^{m} \hat{\lambda}_j^1(t) \qquad (7.36)
These bandwidth estimation methods are labeled as LLCV1 and LSCV1, respectively, in
this chapter.
The second approach is based on the idea of superposition of NHPPs (Arkin and
Leemis 1998). Reorder all the failure-occurrence time data as a single-ordered sample
(0 =) t_0 ≤ t_1 ≤ ⋯ ≤ t_{n*} < t*, where n* is the total number of failures including ties, i.e.,
n* = Σ_{j=1}^{l} n_j, and t* is the realization of the maximum of the random censoring
times, T* = max_{1≤j≤l} T_j. For the single-ordered sample, the intensity function of the
superposed NHPP is defined similarly to the single failure-occurrence time data case:
\hat{\lambda}^2(t) = \frac{1}{h} \sum_{i=1}^{n^*} K\!\left( \frac{t - t_i}{h} \right) \qquad (7.37)
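Forming the single-ordered superposed sample used in Equation (7.37) is a simple pooling step (a minimal sketch; the two data sets below are made up):

```python
# Sketch of forming the single-ordered superposed sample of
# Equation (7.37): pool all l realizations and sort, keeping ties.
def superpose(datasets):
    pooled = sorted(t for ds in datasets for t in ds)
    return pooled, len(pooled)  # (t_1 <= ... <= t_{n*}, n*)

sample, n_star = superpose([[1.0, 2.5], [0.7, 2.5, 3.1]])
print(sample, n_star)  # -> [0.7, 1.0, 2.5, 2.5, 3.1] 5
```

The pooled sample is then treated exactly like a single realization in the kernel estimator.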
where K(·) is defined in Equation (7.25). Then the CV (h) in Equation (7.27) with the LSCV
method can be rewritten with a superposed intensity function:
CV(h) = \int_0^{\hat{t}^*} \left( \hat{\lambda}^2(t) \right)^2 dt - 2 \sum_{r=1}^{n^*} \hat{\lambda}_{h,r}^2(t_r) \qquad (7.38)
where
\hat{\lambda}_{h,r}^2(t) = \frac{1}{h} \sum_{i=1, i \ne r}^{n^*} K\!\left( \frac{t - t_i}{h} \right) \qquad (7.39)
Furthermore, the ln L ( h ) in Equation (7.29) can be rewritten with the superposed intensity
function:
\ln L(h) = \sum_{k=1}^{n^*} \sum_{r=1}^{n^*} \ln \hat{\lambda}_{h,r}^2(t_k) - \sum_{r=1}^{n^*} \hat{\Lambda}_{h,r}^2(\hat{t}^*) \qquad (7.40)
where
\hat{\Lambda}_{h,r}^2(t) = \int_0^{t} \hat{\lambda}_{h,r}^2(t)\,dt \qquad (7.41)
Averaging the superposed estimate over the l processes gives

\hat{\lambda}(t) = \frac{1}{l} \frac{1}{h} \sum_{i=1}^{n^*} K\!\left( \frac{t - t_i}{h} \right) \qquad (7.42)
These bandwidth estimation methods are labeled LLCV2 and LSCV2, respectively. In practice,
the estimation results of the four methods, LLCV1, LLCV2, LSCV1, and LSCV2, differ slightly
from each other.
7.5 Numerical examples
7.5.1 Simulation experiments with single minimal repair data
We conduct the Monte Carlo simulation to investigate properties of parametric and non-
parametric methods. Here, we focus on the single minimal repair data. The (true but
unknown) minimal repair process is assumed to follow a power law NHPP model having
the model parameters (β, η) = (3.2, 0.23) of the intensity function λ(t) = (β/η)(t/η)^(β−1). By
applying the thinning algorithm of NHPP (Lewis and Shedler 1979), the original failure
(minimal repair) time data are generated as pseudo random numbers. The “real” opti-
mal periodic replacement time and its minimum long-run average cost per unit time are
calculated numerically as τ* = 0.44 and C(τ*) = 98.8, under the fixed cost ratio γ = 30.
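The thinning algorithm mentioned above can be sketched as follows for the power law intensity (a minimal sketch; the seed is arbitrary, and taking the majorizing constant at t_max assumes an increasing intensity, i.e., β > 1):

```python
import math
import random

# Sketch of Lewis-Shedler thinning for a power law NHPP with
# lambda(t) = (beta/eta) * (t/eta)**(beta - 1).
def simulate_power_law_nhpp(beta, eta, t_max, seed=1):
    rng = random.Random(seed)
    lam = lambda t: (beta / eta) * (t / eta) ** (beta - 1.0)
    lam_max = lam(t_max)                    # majorizing rate on [0, t_max]
    t, events = 0.0, []
    while True:
        t += rng.expovariate(lam_max)       # candidate from HPP(lam_max)
        if t > t_max:
            return events
        if rng.random() <= lam(t) / lam_max:  # accept w.p. lam(t)/lam_max
            events.append(t)

events = simulate_power_law_nhpp(3.2, 0.23, 1.0)
print(len(events))
```

Candidates are drawn from a homogeneous process at the majorizing rate and accepted with probability λ(t)/λ_max, so the accepted points follow the target NHPP.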
The optimal periodic replacement time and its minimum long-run average cost per
unit time are estimated with both parametric and nonparametric methods for n = 160
minimal repair time data. Here, two parametric models are assumed: the power law
(PL) NHPP model with λ(t) = (β/η)(t/η)^(β−1), and the Cox–Lewis (CL) NHPP model with
λ(t) = exp(α + βt). Without any prior knowledge of the underlying system, it is very difficult
to select the correct model (power law model) exactly. To investigate the effect of
mis-specification of the underlying failure process model, we assume the CL model. Also,
we apply the CNPMLE and the kernel-based approaches with LSCV and LLCV. In this
example, we normalize the minimal repair data in order to reduce the computation cost.
Therefore, all data t_i (i = 1, 2, …, 160) are divided by the maximum value t_160.
For four cases with n = 40, 80, 120, and 160, the estimation results of the optimal peri-
odic replacement time and the minimum long-run average cost per unit time are presented
Table 7.2 Estimation results of the minimum long-run average cost per unit time
n PL CL CNPMLE LSCV LLCV
40 99.7 98.8 98.1 102.6 98.6
80 100.5 102.0 98.1 103.0 98.4
120 100.4 104.7 98.1 101.5 98.3
160 100.5 107.9 98.1 102.5 98.2
in Tables 7.1 and 7.2. From Table 7.1, it can be seen that the three nonparametric estimation
methods give results very close to the real optimal periodic replacement time τ* = 0.44. But
the estimation results based on the mis-specification (CL model) are the worst among all cases.
In this way, the influence of mis-specification of the parametric model is significant when
exact information on the underlying failure process is not available from past expe-
rience. Focusing on Table 7.2, the kernel-based approach with LLCV can provide the best
estimation results of the minimum long-run average cost per unit time. Also, CNPMLE
shows similar estimation results, because the original minimal repair data include
a data point, 0.43, close to the optimal periodic replacement time τ* = 0.44. Of course, this is a
quite rare case. If there is no failure data near the optimal solution, CNPMLE cannot work
in the small sample problem. Furthermore, it is observed that the two parametric models and
the kernel-based approach with LSCV tend to give rather pessimistic estimates. This fact
indicates that even if the real model (power law model) is assumed, the estimation
results may be biased. This may be because the minimum long-run average cost per unit
time is very sensitive to the optimal periodic replacement time τ*. For the CL model, the
difference from the real optimal solution tends to get larger as the number of minimal
repair data increases.
Apart from the optimization, we estimate the long-run average cost per unit time at
different periodic replacement times in order to examine the estimation accuracy of our
methods. In addition to the naïve estimator, we compare the estimates of the long-run
average cost per unit time for our six methods at four arbitrary time points, t = 0.25,
0.50, 0.75, and 1.00, where the number of minimal repairs is given by n = 180. In Table 7.3,
“TRUE” represents the value calculated by the power law model with the true
parameters (β, η) = (3.2, 0.23). It can be said that a model that shows good results near
the theoretically optimal solution also provides good accuracy at other periodic replace-
ment time points. In this table, we also calculate the mean squared error between the estima-
tion results and the “true” values at 20 time points from t = 0 to t = 1.00 in steps of 0.05. In this case, the
power law model provides the best accuracy among all models. But the four
nonparametric models, especially the kernel-based approach with LLCV and the naïve
estimator, show similarly good accuracy. It is also evident that mis-specification
of the parametric model leads to the worst results.
Table 7.3 Estimation accuracy of each method for arbitrary periodic replacement time
t TRUE PL CL CNPMLE LSCV LLCV Naïve
0.25 128.8 130.0 148.5 120.7 126.0 128.0 128.1
0.5 100.5 102.5 109.8 101.5 106.2 104.0 104.3
0.75 138.8 138.8 132.4 134.8 140.0 140.0 140.7
1 216.1 210.0 210.0 209.0 200.9 209.5 210.0
MSE 0 3.53 140.8 17.8 19.4 13.4 14.1
The total number of failures is 48 in the entire data set. For this data set, we apply
one parametric model (power law model) and two nonparametric models (CNPMLE and
kernel-based approach). First, we estimate the intensity function and mean value func-
tion with each model in Figures 7.1 and 7.2. From Figure 7.1, it is seen that the intensity
function with CNPMLE is represented by a nondecreasing step function. In other models,
only the intensity function of the power law model increases as time goes by. LLCV1 and
LSCV1 show the unimodal intensity functions. LSCV2 gives the multimodal shape. LLCV2
also indicates a similar trend to the naïve estimator and fluctuates everywhere. Looking
at Figure 7.2, we can see that both the results of the power law model and CNPMLE have
convex shapes. Although the mean value function of the power law model is a smoothed
curve, CNPMLE constitutes the mean value function by several line segments. The mean
value functions with LLCV2 and LSCV2 are not smooth compared with other models.
Especially for LLCV2, this property is remarkable.
[Figure 7.2: Estimated mean value functions (cumulative number of failures versus t, t = 100–600) for CNPMLE, LLCV2, the power law model, LSCV2, LLCV1, and LSCV1.]

[Figure 7.1: Estimated intensity functions versus t (t = 100–600) for the same models.]
Table 7.6 Estimation results on the minimum long-run average cost per unit time
γ PL CNPMLE LLCV1 LLCV2 LSCV1 LSCV2
0.04 19.1 6.6 20.0 7.0 17.6 20.6
0.05 20.4 8.2 20.9 8.8 18.1 22.2
0.06 21.5 9.8 21.7 10.5 18.3 23.5
0.07 22.5 11.5 22.4 12.2 18.5 24.8
0.08 23.3 13.1 23.0 13.9 18.6 25.9
Suppose that the cost ratio γ varies from 0.04 to 0.08 by 0.01. Tables 7.4 and 7.5 sum-
marize the estimation results of the optimal periodic replacement time and the minimum
long-run average cost per unit time.
Focusing on Table 7.5, we can see that the optimal periodic replacement times with
CNPMLE and LLCV2 remain almost constant even as the cost ratio γ increases. But the
results with the other estimators increase as γ increases. In this example, LLCV2 gives
the smallest optimal periodic replacement time and LSCV1 gives
the largest optimal periodic replacement time. The optimization results with the LSCV
method are often influenced by the differences between two approaches (LSCV1, LSCV2),
compared with the LLCV method. It is also observed in Table 7.6 that the minimum long-run
average costs per unit time with the parametric power law model and with LLCV1 take
values close to each other. Furthermore, CNPMLE and LLCV2 tend to show relatively
optimistic results among all models. Conversely, LSCV2 gives the most pessimistic estima-
tion results.
7.6 Conclusions
In this chapter we focused on the failure time data analysis based on the NHPP and dis-
cussed several statistical estimation methods for a periodic replacement problem with min-
imal repair as the simplest application of life data analysis with NHPP. We have applied
not only the parametric ML estimation for the power law process but also CNPMLE and
kernel-based estimation methods for estimating the cost-optimal periodic replacement
problem with minimal repair, where single or multiple minimal repair data are assumed.
Furthermore, we conducted a simulation experiment with a single minimal repair process
and a real data analysis with multiple field data sets of minimal repairs of a diesel engine
in the numerical examples. Throughout the numerical examples, we investigated properties
of parametric and nonparametric methods and showed how to use the parametric meth-
ods and several nonparametric approaches for the periodic replacement problem with
minimal repair.
References
Arkin, B. L. and Leemis, L. M. (1998). Nonparametric estimation of the cumulative intensity function
for a nonhomogeneous Poisson process from overlapping realizations. Management Science 46,
pp. 989–998.
Arunkumar, S. (1972). Nonparametric age replacement policy. Indian Journal of Statistics Series A 34,
pp. 251–256.
Ascher, H. and Feingold, H. (1984). Repairable Systems Reliability: Modeling, Inference, Misconceptions
and Their Causes. New York, Marcel Dekker.
Barlow, R. E. and Hunter, L. C. (1960). Optimum preventive maintenance policies. Operations Research
8, pp. 90–100.
Barlow, R. E. and Proschan, F. (1996). Mathematical Theory of Reliability. Philadelphia, SIAM.
Baxter, L. A. (1982). Reliability applications of the relevation transform. Naval Research Logistics
Quarterly 29, pp. 323–330.
Bergman, B. (1979). On age replacement and the total time on test concept. Scandinavian Journal of
Statistics 6, pp. 161–168.
Boland, P. J. (1982). Periodic replacement when minimal repair costs vary with time. Naval Research
Logistics Quarterly 29, pp. 541–546.
Boswell, M. T. (1966). Estimating and testing trend in a stochastic process of Poisson type. Annals of
Mathematical Statistics 37, pp. 1564–1573.
Colosimo, E. A., Gilardoni, G. L., Santos, W. B. and Motta, S. B. (2010). Optimal maintenance time
for repairable systems under two types of failures. Communications in Statistics—Theory and
Methods 39, pp. 1289–1298.
Cox, D. R. (1972). The statistical analysis of dependencies in point processes. Stochastic Point
Processes: Statistical Analysis, Theory and Applications (Lewis, P. A. W. ed.), pp. 55–66. New
York, Wiley.
Cox, D. R. and Lewis, P. A. W. (1966). The Statistical Analysis of Series of Events. New York, Wiley.
Crow, L. H. (1974). Reliability analysis for complex repairable systems. Reliability and Biometry: Statistical
Analysis of Lifelength (Proschan, F. and Serfling, R. J. eds.), pp. 379–410. Philadelphia, SIAM.
Diggle, P. and Marron, J. S. (1989). Equivalence of smoothing parameter selectors in density and
intensity estimation. Journal of the American Statistical Association 83, pp. 793–800.
Dohi, T., Kaio, N. and Osaki, S. (2007). Optimal (T, S)-policies in a discrete-time opportunity-based age
replacement: An empirical study. International Journal of Industrial Engineering 14, pp. 340–347.
Gilardoni, G. L. and Colosimo, E. A. (2011). On the superposition of overlapping Poisson processes
and nonparametric estimation of their intensity function. Journal of Statistical Planning and
Inference 141, pp. 3075–3083.
Gilardoni, G. L., De Oliveira, M. D. and Colosimo, E. A. (2013). Nonparametric estimation and
bootstrap confidence intervals for the optimal maintenance time of a repairable system.
Computational Statistics & Data Analysis 63, pp. 113–124.
Guan, Y. (2007). A composite likelihood cross-validation approach in selecting bandwidth for the
estimation of the pair correlation function. Scandinavian Journal of Statistics 34, pp. 336–346.
Ingram, C. R. and Scheaffer, R. L. (1976). On consistent estimation of age replacement intervals.
Technometrics 18, pp. 213–219.
Lewis, P. A. W. and Shedler, G. S. (1979). Simulation of nonhomogeneous Poisson processes by thin-
ning. Naval Research Logistics Quarterly 26, pp. 403–413.
Nakagawa, T. (1986). Periodic and sequential preventive maintenance policies. Journal of Applied
Probability 23, pp. 536–542.
Nakagawa, T. (2005). Maintenance Theory of Reliability. Berlin, Springer.
Nelson, W. and Doganaksoy, N. (1989). A computer program for an estimate and confidence limits
for the mean cumulative function for cost or number of repairs of repairable products. TIS
report 89CRD239. New York, General Electric Company Research and Development.
Okamura, H., Dohi, T. and Osaki, S. (2014). A dynamic programming approach for sequential pre-
ventive maintenance policies with two failure modes. Reliability Modeling with Applications:
Essays in Honor of Professor Toshio Nakagawa on His 70th Birthday (Nakamura, S., Qian, C. H. and
Chen, M. eds.), pp. 3–16. Singapore, World Scientific.
Park, D. H., Jung, G. M. and Yum, J. K. (2000). Cost minimization for periodic maintenance policy
of a system subject to slow degradation. Reliability Engineering & System Safety 68, pp. 105–112.
Rinsaka, K. and Dohi, T. (2005). Estimating age replacement policies from small sample data. Recent
Advances in Stochastic Operations Research (Dohi, T., Osaki, S. and Sawaki, K. eds.), pp. 145–158.
Singapore, World Scientific.
Sheu, S. H. (1990). Periodic replacement when minimal repair costs depend on the age and the num-
ber of minimal repairs for a multi-unit system. Microelectronics Reliability 30, pp. 713–718.
Sheu, S. H. (1991). A generalized block replacement policy with minimal repair and general random
repair costs for a multi-unit system. Journal of the Operational Research Society 42, pp. 331–341.
Valdez-Flores, C. and Feldman, R. M. (1989). A survey of preventive maintenance models for sto-
chastically deteriorating single-unit systems. Naval Research Logistics Quarterly 36, pp. 419–446.
Zielinski, J. M., Wolfson, D. B., Nilakantan, L. and Confavreux, C. (1993). Isotonic estimation of the
intensity of a nonhomogeneous Poisson process: The multiple realization setup. Canadian
Journal of Statistics 21, pp. 257–268.
chapter eight
Contents
8.1 Introduction......................................................................................................................... 149
8.2 YouTube view count: A twofold perspective.................................................................. 150
8.3 Literature review................................................................................................................. 152
8.4 Model development............................................................................................................ 153
8.4.1 Model I: Linear growth.......................................................................................... 155
8.4.2 Model II: Exponential growth............................................................................... 155
8.4.3 Model III: Repeat viewing..................................................................................... 156
8.5 Data analysis and model validation................................................................................. 156
8.6 Conclusion........................................................................................................................... 163
References...................................................................................................................................... 164
8.1 Introduction
Online social networks such as Facebook, Twitter, YouTube, Google+, etc., are part of our
daily life now. Social networks have emerged as platforms since they are able to bring people
of varied backgrounds and common interests to the same place. Billions of people inter-
act with each other on such platforms. Social media is also changing the way in which people create,
share and consume information (Khan and Vong 2014). Information is not just shared in
the form of text but also via images, gifs, videos, memes, etc. Social network sites are also
instrumental in creating awareness on some topics, which directly or indirectly unites
people and forces lawmakers and firms to make better decisions keeping the interests of
the user in mind.
Social networking websites are emerging as an important factor in expanding a
product's online market. There are several examples where special privileges are provided
to e-commerce customers over traditional shop customers, such as launching certain
products exclusively on e-commerce sites, namely, Flipkart, Amazon, Snapdeal, and so on.
For the first few weeks such products are sold exclusively through e-commerce. Every shop-
ping cart allows users to share the experience on social media like Facebook, YouTube,
Google+, etc., which works like word-of-mouth publicity in traditional media/markets.
Among various video sharing sites, YouTube has emerged as a leader in video sharing
platforms. It started as a website to share short entertainment videos and has grown into a
massive platform for people to connect freely. One can find videos of different genres like
music, sports, comedy, recreational activities, religious and spiritual content, educational,
etc. Since YouTube is acting more like a marketing forum, it is helping individuals as well
as large production houses to increase their audience. Nowadays every news channel,
production house, celebrity, and organization has their YouTube channels where they post
their recent activities in the form of videos to stay connected to their fans, group mem-
bers or general public. Independent content creators have built grassroots followings num-
bering in the thousands at very little cost or effort. YouTube’s revenue-sharing “Partner
Program” made it possible for people to earn a substantial living as a video producer
alone – with each of its top 500 partners earning more than $100,000 annually and its
10 highest-earning channels grossing from $2.5 to $12 million (Berg 2015).
Since YouTube provides a glimpse of the product’s success (in term of statistics), manu-
facturing companies and production houses prefer to launch previews of their products or
trailers on YouTube first. The video’s popularity would provide good business to not only
YouTube but also the product manufacturer. YouTube’s features, such as the “like,” “dislike,”
“comments,” “share,” “subscribe,” etc., go a long way in helping the customers/audiences to
express themselves. A video's numbers of likes, dislikes, comments in favor, and comments
against clearly show a product's performance and the viewers' satisfaction. It pro-
vides a clear picture as to whether the product is going to be a hit in the market or not. In the
comments section one can also find the basis on which the customers compare two or more
products. It clearly works in favor of the firm as firms get to know what to produce and what
characteristics of the product lead a customer to buy that product. Manufacturers can easily
estimate their potential buyers from available statistics and also know the “X-factor” of their
product by deeply analyzing the comment or the review videos that are again posted by
some YouTube user (uploader) on it. Therefore, it is quite clear that YouTube is not just adver-
tising the product, but it is also giving a glimpse of its future. Features provided by YouTube
help users as well as uploaders to interact better with each other.
According to an Internet survey conducted by Alexa in 2005, YouTube is the fastest
growing website and was ranked second in traffic generation among all the websites sur-
veyed (Cheng et al. 2008). Each minute 400 hours of content are uploaded and approxi-
mately 1 billion hours of content are viewed daily on YouTube (YouTube Press). Further,
Alexa categorized YouTube’s speed as “SLOW” as its average load time is 3.6 seconds and
it is slower than 69% of their surveyed websites (Cheng et al. 2008). The results are almost
similar even today. In October 2017, YouTube’s average load time was 2.38 seconds, and it
is slower than 68% of other websites.
The rest of the chapter is distributed as follows: Section 8.2 describes the twofold
perspective of YouTube view count followed by a literature review in Section 8.3. In
Section 8.4, a methodical approach to frame the view-count based models in a dynamic
environment has been proposed. Section 8.5 contains the model validation carried out on
YouTube video data sets. The conclusion and references are presented last.
8.2 YouTube view count: A twofold perspective
On the one hand, it increases the monetary gains, while on the other it further slows down the platform
due to the additional traffic generated. Therefore, understanding and predicting the view
count is a twofold perspective: one that popular content generates more traffic, and hence
understanding popularity has a direct impact on caching and replication strategy that the
provider should adopt; and the other perspective that popularity has a direct economic
impact (Richier et al. 2014).
A number of researchers have tried to understand and model the popularity and
virality pattern of YouTube using various tools and techniques (Bauckhage et al. 2015;
Richier et al. 2014; Vaish et al. 2012; Yu et al. 2015; Zhou et al. 2010). For predicting a video’s
popularity or the high view count, one must know how YouTube works and when a view
count increases. The flowchart presented in Figure 8.1 explains legitimate view-count
accountability.
It is important to predict a video’s high demand and popularity and make better deci-
sions of which videos should be cached on the limited data space of the proxy servers.
A view is a video playback that was requested by an actual user. A fake view includes
misleading views, misleading titles and thumbnails that attract views. When a video has a
large number of views that last for mere seconds after clicking, the views are not counted
as legitimate. So if a video is viewed in its entirety by someone who clicked on it, it is
counted as one view. But not all views are fully played. Google Ad Sense works only with
videos that are over 30 seconds in length so that the click-through rates get registered. In
fact, some videos are lucky enough to have just 10 seconds of play being considered as
a view. Thereby, it can be understood that the amount of video played should be above
a threshold percentage of the length of the video. The type and genre of the video also
affect video length. YouTube considers views from the same IP address in a time interval
of 6–8 hours. So one person viewing the same video repeatedly would only generate three
to five views a day, even though he/she has viewed it over 300 times. A viewer being
redirected to YouTube upon clicking an embedded video counts as one view. If there is
an embedded video with auto play, it is not counted as a view. In December 2012, 2 bil-
lion views were removed from the view counts of Universal and Sony music videos on
YouTube, provoking a claim by The Daily Dot that the views had been deleted because of
infringement of the site’s terms of service, which banned the use of automated processes
[Figure 8.1: Flowchart of legitimate view-count accounting: a video is uploaded; when a user requests it, it is served from the proxy-server cache if available, otherwise it is requested from the main server; the view count is updated on the main server.]
to inflate view counts. In another incident, on August 5, 2015, YouTube removed the feature that caused a video's view count to freeze at "301" (later "301+") until the actual count was verified to counteract view-count fraud. There may be many more restrictions and rules that go into categorizing a request as a view that have not been disclosed as of now.
A number of attributes have been examined and carried forward for analysis in terms of the number of views. The next section highlights relevant work in the related area.
8.3 Literature review
As far as prediction of view count is concerned, there are several factors that can affect the number of views gathered by a video. There are several research proposals concerning the total view count (Bauckhage et al. 2015; Richier et al. 2014; Vaish et al. 2012; Yu et al. 2015) or predicting the growth rate of the view count (Bauckhage et al. 2015; Richier et al. 2014; Yu et al. 2015). There is also literature showing how recommendation on YouTube helps increase the view count of a video (Yu et al. 2015). In some research,
different caching techniques are given to reduce the caching time. Cheng et al. (2008)
provided a glimpse of YouTube statistics. They found the active life span of the video in
terms of caching, i.e., up to what time it is required to cache the video on the proxy server.
They also showed how videos can be distributed on the basis of video category, the age of
the video, video length, video file size, and video bit rate on YouTube. According to their results, 97.9% of video lengths are within 600 seconds, and 99.1% are within 700 seconds, in their entire data set. Since 22.9% of the videos fall in the music category and 17.8% fall under the entertainment category, their segregation can be understood. If the
rate of growth of view count is very high, then the content is considered viral, i.e., a large
number of views in a very short period of time. Richier et al. (2014) fitted six bio-inspired
models in their research to find the view count dynamics of YouTube videos for a fixed
and growing population. Khan and Vong (2014) captured the effect of external influence
on the virality of content. For this they used webometrics and network diagram and
found a correlation between the various attributes of the video to the cause of virality.
They also state that the external network (other social network sites except for YouTube)
also have a good contribution in making a content viral. Zhou et al. (2010) highlighted the
importance of a recommendation system on view count of a video. They found that the
recommendation system is the cause of 30%-40% of the views of a video. Some researchers, rather, contradict the traditional definition of virality and relate it to broadcasting. Goel
et al. (2015) made an attempt to provide an insight to YouTube and found that there is a
clear difference between broadcasting and structural virality. The traditional approach
of virality largely depends on the total view count of the video, i.e., a video having a
greater view count is more viral than a video having a comparatively less view count.
Since it is very difficult to differentiate between broadcasting and virality, in this chap-
ter we have also taken a traditional definition of virality and considered the number of
views to be increasing rapidly due to virality, not because of broadcasting. As both yield a
large number of views in a short duration, there is a need to understand that content can
be said to be viral only if it is spread by the initial viewers, and Goel et al. (2015) found
that structural virality is typically low, and remains independent of size, suggesting that
popularity is largely driven by the size of the largest broadcast. Several researchers have
also studied the life cycle of videos and found various phases that occur in the lifetime of
a YouTube video. Yu et al. (2015) found phases as a description of the burst popularity of a
video and found the multiple peaks of popularity in its life cycle. They also directly relate
the phases to content type and evolution of popularity on the basis of the power law. Of
late, Aggrawal et al. (2017) have proposed a novel approach of studying the life cycle of a
YouTube video and categorization of viewers. Bauckhage et al. (2015) used a bio-inspired
model in their research to determine “how viral were viral videos.” They utilized con-
volution theory and took a joint effect of exponential infection and recovery rate in the
Markov process to find the probability density function for the epidemic model (virality
of videos). Ding et al. (2011) studied the uploaders of content, demonstrating the positive reinforcement between online social behavior and uploading behavior. They also examined whether YouTube users are truly broadcasting themselves by characterizing and classifying videos as user generated and user copied. Their results claim that most of the content on YouTube is not user generated, and
63% of the most popular uploaders are just uploading the user-copied content. The UCC
(user-copied content) uploaders upload more videos than the UGC (user-generated con-
tent). Further, Vaish et al. (2012) used different factors like share count, number of views, number of likes, and number of dislikes to calculate the virality index of a video, and provided a conventional and hybrid asset valuation technique to demonstrate how virality can fit in to provide accurate results.
In this chapter, we have tried to capture the growth pattern of view count for cer-
tain videos on YouTube. The initial stage of the life cycle of a video starts when it is posted on YouTube (i.e., people become aware of it and then diffuse the information in the Internet market). As the video becomes more popular, it attracts a larger number of views,
likes, dislikes, comments, shares, etc. This is considered as the growing phase of the video.
After attaining maximum popularity or becoming viral or being viewed by most of its
target viewers, the video is said to have matured, and its active life span is considered to
be almost over. From this time onward, the video’s growth will be very slow and steady as
compared to the earlier phase (Cheng et al. 2008). A video is never deleted from YouTube,
so there is no fixed life span, but the video's life is said to be over when the growth in the number of views is negligible over time. YouTube's revenue depends on advertisements, so
it is very important to know the right time to introduce an advertisement on a video. The
high rate of advertisement during the growth stage of the video is likely to have the maxi-
mum impact and yield high profits for the advertiser and YouTube. Therefore, our study of
the view count growth shall prove to be a helpful contribution in this area.
8.4 Model development
In the proposed modeling framework, we have captured the view-count dynamics of
YouTube viewers and have extended this framework for the dynamic Internet model. The
popularity of YouTube can be judged by the number of likes, dislikes, comments, shares,
views, etc. All the aforesaid attributes can be represented as a counting process. Out of these
we have considered view counts as a counting process in the present research. As we know from the literature (Kapur et al. 1999), a counting process {N(t), t > 0} is said to be a nonhomogeneous Poisson process with intensity function λ(t) if it satisfies the following conditions:
i. N(0) = 0
ii. {N(t), t > 0} has independent increments
iii. P{N(t + h) − N(t) ≥ 2} = o(h)
iv. P{N(t + h) − N(t) = 1} = λ(t)h + o(h)
where o(h) denotes a quantity that tends to zero faster than h as h → 0.
Let ν(t) represent the expected number of views by time t, i.e., ν(t) = ∫₀ᵗ λ(x) dx, t > 0; then it can be shown that

Pr[N(t) = k] = ((ν(t))^k e^(−ν(t)))/k!,  k = 0, 1, 2, …  (8.1)
In other words, N(t) has a Poisson distribution with mean value function ν(t). Consider
a case when the time scale of the content diffusion is very large as compared to the size
of the potential population. Hence, we model the case where contents gain popularity
through advertisement and other marketing tools: examples are when advertisement is
done for a large pool of users of a social network and netizens access the content at random
thereafter.
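The counting-process formulation can be sketched numerically. The intensity chosen below, λ(x) = ab·e^(−bx), is purely illustrative (it yields the saturating mean value function ν(t) = a(1 − e^(−bt)) of Equation (8.5)), and the parameter values are hypothetical:

```python
import math

def mean_views(lam, t, steps=10_000):
    """Numerically integrate v(t) = integral of lam over [0, t] (trapezoidal rule)."""
    h = t / steps
    ys = [lam(i * h) for i in range(steps + 1)]
    return h * (sum(ys) - 0.5 * (ys[0] + ys[-1]))

def prob_k_views(lam, t, k):
    """Equation (8.1): Pr[N(t) = k] for an NHPP with intensity lam."""
    v = mean_views(lam, t)
    return v**k * math.exp(-v) / math.factorial(k)

# Hypothetical parameters: saturation level a = 1000 views, rate b = 0.3/day.
a, b = 1000.0, 0.3
lam = lambda x: a * b * math.exp(-b * x)   # gives v(t) = a(1 - e^{-bt})

v10 = mean_views(lam, 10.0)   # close to 1000(1 - e^{-3}), i.e. about 950.2
print(round(v10, 1))
```

The quadrature is deliberately simple; any standard integrator would do, since only the mean value function ν(t) is needed to evaluate the Poisson probabilities.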
Hence, we assume that the expected number of views in (t, t + Δt) is essentially propor-
tional to the expected number of views left from total expectation at time t, i.e.,
dv(t)/dt = b(a − v(t))    (8.3)
Solving Equation (8.3) under the initial condition ν(0) = ν₀, we have

ν(t) = a − (a − ν₀)e^(−bt)    (8.4)

In Equation (8.4), ν(0) is nothing but the number of view counts when the process started; it is assumed that there is already some number of views when the actual recording starts. This model is similar to that proposed by Richier et al. (2014). If in Equation (8.4) we instead assume ν(0) = 0, i.e., there are no views when the system starts, we get the expression given by Equation (8.5):

ν(t) = a(1 − e^(−bt))    (8.5)
where a is the expected number of total views that can be observed in a video.
The model explains how the rate of view count is directly linked with the leftover
views of a video. This model can act as a very strong forecasting tool to predict the level a
video can reach.
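As a sketch of how Equation (8.5) can serve as a forecasting tool, the snippet below fits ν(t) = a(1 − e^(−bt)) to a synthetic view-count series by a coarse grid search over (a, b). The data and parameter values are hypothetical; a real application would use the chapter's data sets and a proper nonlinear least-squares routine:

```python
import math

def model(t, a, b):
    # Equation (8.5): expected cumulative views at time t.
    return a * (1.0 - math.exp(-b * t))

def fit(days, views):
    """Coarse grid search minimizing the sum of squared errors."""
    best = (float("inf"), None, None)
    for a in range(int(max(views)), int(2 * max(views)), 25):
        for bi in range(1, 100):
            b = bi / 100.0
            sse = sum((v - model(t, a, b)) ** 2 for t, v in zip(days, views))
            if sse < best[0]:
                best = (sse, a, b)
    return best[1], best[2]

# Simulated observations from a = 5000, b = 0.25 (a hypothetical video).
days = list(range(1, 31))
views = [model(t, 5000, 0.25) for t in days]

a_hat, b_hat = fit(days, views)
print(a_hat, b_hat)   # recovers values near a = 5000, b = 0.25
```

Once (a, b) are estimated from the early part of a series, a itself is the forecast of the total views the video can reach, which is exactly the use suggested above.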
It should further be noted that in today's dynamic market one cannot really speak of a fixed market size. Especially in the Internet market, the market varies significantly for various reasons; for example, with an increase in the popularity of a video, the number of viewers increases and so does the number of views. To incorporate this dynamic behavior, Equation (8.3) can be redesigned as
dv(t)/dt = b(a(t) − v(t))    (8.6)
where a(t) represents dynamic Internet market size. In the present chapter we have explic-
itly taken three forms of varying market size: linear growth, exponential growth, and
growth because of repeat viewership. The various forms that a(t) can take are systemati-
cally discussed in the following sections.
Under linear market growth (Model I), a(t) = a(1 + αt), and

dv(t)/dt = b(a(1 + αt) − v(t))    (8.8)
Under exponential market growth (Model II), a(t) = ae^(αt), and

dv(t)/dt = b(ae^(αt) − v(t))    (8.10)
Under growth due to repeat viewership (Model III),

dv(t)/dt = b(a + αv(t) − v(t))    (8.12)
Using the initial condition that there are no views at the start, the functional forms of all the aforesaid models can be represented as in Table 8.1.
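Table 8.1 itself is not reproduced in this extract. Solving Equations (8.8), (8.10), and (8.12) under ν(0) = 0 by standard linear-ODE methods gives the closed forms sketched below; as a sanity check, each is compared against a direct Euler integration of its differential equation (parameter values are illustrative):

```python
import math

a, b, alpha = 2000.0, 0.2, 0.05   # illustrative parameters

# Closed-form solutions of the three ODEs under v(0) = 0 (derived here,
# not quoted from Table 8.1; verified numerically below).
def model1(t):  # dv/dt = b(a(1 + alpha*t) - v)   -- Equation (8.8)
    return a * (1 + alpha * t) - (a * alpha / b) * (1 - math.exp(-b * t)) \
           - a * math.exp(-b * t)

def model2(t):  # dv/dt = b(a*exp(alpha*t) - v)   -- Equation (8.10)
    return (a * b / (b + alpha)) * (math.exp(alpha * t) - math.exp(-b * t))

def model3(t):  # dv/dt = b(a + alpha*v - v)      -- Equation (8.12)
    return (a / (1 - alpha)) * (1 - math.exp(-b * (1 - alpha) * t))

def euler(rhs, t_end, dt=1e-4):
    """Explicit Euler integration of dv/dt = rhs(t, v), v(0) = 0."""
    v, t = 0.0, 0.0
    while t < t_end - 1e-12:
        v += dt * rhs(t, v)
        t += dt
    return v

checks = [
    (model1, lambda t, v: b * (a * (1 + alpha * t) - v)),
    (model2, lambda t, v: b * (a * math.exp(alpha * t) - v)),
    (model3, lambda t, v: b * (a + alpha * v - v)),
]
for closed, rhs in checks:
    assert abs(closed(10.0) - euler(rhs, 10.0)) < 1.0
print("closed forms match numerical integration")
```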
Table 8.2 The estimated parameters of the data sets on all three models
Parameter    DS I    DS II    DS III    DS IV    DS V    DS VI    DS VII
Parameter estimation for different dataset using Model I
a 5,499.146 4,038.262 1,402.6 17,684.89 18,936.66 2,032.839 4,609.456
b 0.279 0.549 0.284 0.189 0.36 0.216 0.144
α 0.013 0.003 0.007 0.004 0.02 0.01 0.004
Parameter estimation for different dataset using Model II
a 5,894.747 4,058.969 1,454.733 17,852.3 18,978.54 2,151.82 4,660.231
b 0.241 0.539 0.255 0.186 0.358 0.193 0.142
α 0.008 0.002 0.005 0.003 0.002 0.007 0.003
Parameter estimation for different dataset using Model III
a 8,493.813 2,782.836 1,182.168 20,123.59 1,845.306 2,251.78 4,777.419
b 0.08 0.602 0.187 0.135 0.322 0.109 0.115
α 0.062 0.376 0.362 0.014 0.909 0.259 0.12
Table 8.3 Values of comparison parameters on the proposed models for Data Set I
Comparison parameters    Model I    Model II    Model III
Bias 4.229 1.447 −88.783
Variance 250.337 249.531 749.213
RMPSE 250.372 249.535 754.455
M.S.E 62,686.291 62,267.613 569,202.180
R2 0.983 0.983 0.845
S.S.E 4,513,412.925 4,483,268.110 40,982,556.960
Table 8.4 Values of comparison parameters on the proposed models for Data Set II
Comparison parameters    Model I    Model II    Model III
Bias −3.286 −3.397 −9.024
Variance 103.677 106.259 205.414
RMPSE 103.729 106.313 205.612
M.S.E 10,759.629 11,302.558 42,276.422
R2 0.944 0.941 0.780
S.S.E 774,693.266 813,784.143 3,043,902.371
Table 8.5 Values of comparison parameters on the proposed models for Data Set III
Comparison parameters    Model I    Model II    Model III
Bias −5.525 −6.231 −17.574
Variance 76.304 79.675 130.630
RMPSE 76.504 79.919 131.806
M.S.E 5,852.879 6,386.973 17,372.916
R2 0.938 0.933 0.817
S.S.E 421,407.265 459,862.083 1,250,849.931
Table 8.6 Values of comparison parameters on the proposed models for Data Set IV
Comparison parameters    Model I    Model II    Model III
Bias −33.935 −35.183 −82.531
Variance 363.805 371.675 763.370
RMPSE 365.384 373.336 767.818
M.S.E 133,505.451 139,379.865 589,544.964
R2 0.990 0.989 0.955
S.S.E 9,211,876.145 9,617,210.648 40,678,602.510
Table 8.7 Values of comparison parameters on the proposed models for Data Set V
Comparison parameters    Model I    Model II    Model III
Bias 20.157 19.929 21.208
Variance 133.237 132.567 145.133
RMPSE 134.753 134.056 146.674
M.S.E 18,158.358 17,971.066 21,513.276
R2 0.957 0.957 0.949
S.S.E 1,307,401.808 1,293,916.786 1,548,955.894
Table 8.8 Values of comparison parameters on the proposed models for Data Set VI
Comparison parameters    Model I    Model II    Model III
Bias −4.873 −5.109 −20.301
Variance 347.482 350.105 642.360
RMPSE 347.516 350.142 642.680
M.S.E 120,767.653 122,599.623 413,038.037
R2 0.980 0.980 0.932
S.S.E 8,695,271.028 8,827,172.856 29,738,738.660
Table 8.9 Values of comparison parameters on the proposed models for Data Set VII
Comparison parameters    Model I    Model II    Model III
Bias −2.323 −3.116 −26.352
Variance 56.387 66.752 191.522
RMPSE 56.435 66.825 193.326
M.S.E 3,184.932 4,465.605 37,374.959
R2 0.992 0.989 0.904
S.S.E 226,130.190 317,057.955 2,653,622.061
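The comparison parameters reported in Tables 8.3 through 8.9 can be computed from an actual and a predicted view-count series as sketched below. The exact formulas are not restated in this extract, so standard definitions are assumed; in particular, RMPSE is computed as the square root of bias² plus variance², which is consistent with the tabulated values.

```python
import math

def comparison_parameters(actual, predicted):
    """Comparison criteria as in Tables 8.3-8.9. Definitions are assumed,
    not quoted from the chapter: 'variance' is taken as the standard
    deviation of the prediction errors, so that RMPSE^2 = bias^2 + variance^2."""
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    bias = sum(errors) / n
    variance = math.sqrt(sum((e - bias) ** 2 for e in errors) / n)
    rmpse = math.sqrt(bias ** 2 + variance ** 2)
    sse = sum(e ** 2 for e in errors)
    mse = sse / n
    mean_a = sum(actual) / n
    r2 = 1.0 - sse / sum((a - mean_a) ** 2 for a in actual)
    return {"bias": bias, "variance": variance, "rmpse": rmpse,
            "mse": mse, "r2": r2, "sse": sse}

# Toy series (hypothetical, in thousands of views).
actual = [10.0, 25.0, 42.0, 55.0, 70.0]
predicted = [12.0, 24.0, 40.0, 57.0, 69.0]
stats = comparison_parameters(actual, predicted)
print({k: round(v, 3) for k, v in stats.items()})
```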
The seven data sets have been graphically analyzed as shown in Figures 8.2–8.8. The
plots show the actual data and the estimated values from the three models (Model I, Model
II, and Model III).
In each figure, the view count (in thousands) is plotted against time (in days). All three models give comparably good results; moreover, looking at the graphs, it is very difficult to determine which model is performing the best. To resolve this, we used the weighted criteria approach given by Anand et al. (2014). The weighted criteria approach is a ranking tool that helps us determine the best fit among the various models on the basis of the comparison parameters for each data set.
The algorithm of the approach is as follows:
• In the criteria value matrix, each element aij shows the value of the jth criterion for the ith model.
• For each criterion, compute the attribute values, i.e., the maximum and the minimum value of that criterion.
• The criteria ratings are determined as follows:
Case 1. When a smaller value of the criterion represents a better fit to the actual data, i.e., the best value, then
Wij = 1 − Xij    (8.15)
• The weighted criteria value matrix is computed as the product of the weight of each criterion with the criteria value; the permanent value of each model is then

Zi = Σ(j=1 to m) Aij / Σ(j=1 to m) Wij    (8.17)
The models are ranked on the basis of the expression obtained in Equation (8.17) (i.e., on the permanent value). A smaller permanent value of a model represents a better rank than a bigger permanent value. So all permanent values are compared, and a rank is assigned to each model. The analysis of DS-I is shown in Tables 8.10–8.13, and the algorithm can be carried out for the rest of the data sets.
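A minimal sketch of the weighted criteria ranking described above, under stated assumptions: Xij is taken as the min-max normalized criterion value and Aij = Wij·Xij, since Equations (8.13), (8.14), and (8.16) are not reproduced in this extract.

```python
def rank_models(values, smaller_is_better):
    """values[i][j] is the j-th criterion value of the i-th model.
    Returns 1-based ranks (1 = best). Sketch: X_ij is the min-max
    normalized criterion value (an assumption), the rating is
    W_ij = 1 - X_ij when smaller is better (Equation (8.15)), and the
    permanent value is Z_i = sum(A_ij) / sum(W_ij) with A_ij = W_ij * X_ij."""
    m, n = len(values), len(values[0])
    cols = list(zip(*values))
    Z = []
    for i in range(m):
        num = den = 0.0
        for j in range(n):
            lo, hi = min(cols[j]), max(cols[j])
            x = 0.0 if hi == lo else (values[i][j] - lo) / (hi - lo)
            w = 1.0 - x if smaller_is_better[j] else x
            num += w * x
            den += w
        # A model that is worst on every criterion gets zero total weight;
        # assign it the worst possible permanent value instead.
        Z.append(num / den if den > 0.0 else 1.0)
    order = sorted(range(m), key=lambda i: Z[i])  # smaller Z ranks better
    ranks = [0] * m
    for pos, i in enumerate(order):
        ranks[i] = pos + 1
    return ranks

# Toy criteria matrix: rows = Models I-III, columns = (RMPSE, M.S.E),
# both "smaller is better" (values loosely echo the scale of Table 8.3).
print(rank_models([[250.4, 62686.3], [249.5, 62267.6], [754.5, 569202.2]],
                  [True, True]))   # → [2, 1, 3]
```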
In Data Set I, Model II is performing best. We found that the best performing model
differs from data set to data set. The ranking of each model for different data sets is shown
in Table 8.14.
Table 8.14 shows the models and their ranks corresponding to various data sets under
consideration. We found that Model I (linear increment) is performing best in five data sets
(DS II, DS III, DS V, DS VI, and DS VII), and Model II (exponential increment) is performing
best on two data sets (DS I and DS IV). The relevance of studying these scenarios lies in the
fact that the first two models, Model I and Model II, can be understood in terms of virality
of the video and with what rate the videos are becoming viral. The third model (i.e., Model
III) helps to model a real-life scenario where a video might be getting some repeat views, thereby increasing the view count by the permissible amount. Practically, all of these scenarios can exist in the market, and the analysis is clearly able to represent that.
8.6 Conclusion
The assumption of a fixed population is often a reasonable approximation when the popularity of content rises quickly and then dies out within a short span of time. In this chapter, we considered the case in which the growth of the Internet market and the view-count dynamics of content are intricately linked. To capture such a dependence, different growth scenarios have been considered. Further, the models have been ranked based on a weighted criteria technique to determine which type of video undergoes which type of growth.
In the future, it would be interesting to examine how the concept works in an environment of irregular fluctuations in the diffusion rate, and to study its connectivity based on attributes other than view count.
References
Anand, Adarsh, Parmod Kumar Kapur, Mohini Agarwal, and Deepti Aggrawal. “Generalized
innovation diffusion modeling & weighted criteria based ranking.” In Reliability, Infocom
Technologies and Optimization (ICRITO) (Trends and Future Directions), 2014 3rd International
Conference on, pp. 1–6. IEEE, 2014.
Aggrawal, Niyati, Anuja Arora, and Adarsh Anand. “Modelling and characterizing viewers of
YouTube videos.” International Journal of System Assurance Engineering and Management, 2018.
doi:10.1007/s13198-018-0700-6.
Bauckhage, Christian, Fabian Hadiji, and Kristian Kersting. “How viral are viral videos?” In
ICWSM, pp. 22–30. 2015.
Berg, Madeline. "The World's Top-Earning YouTube Stars 2015." Forbes, November 2015.
Cheng, Xu, Cameron Dale, and Jiangchuan Liu. “Statistics and social network of YouTube videos.”
In Quality of Service, 2008. IWQoS 2008. 16th International Workshop on, pp. 229–238. IEEE, 2008.
Ding, Yuan, Yuan Du, Yingkai Hu, Zhengye Liu, Luqin Wang, Keith Ross, and Anindya Ghose.
“Broadcast yourself: Understanding YouTube uploaders.” In Proceedings of the 2011 ACM
SIGCOMM Internet Measurement Conference, pp. 361–370. ACM, 2011.
Khan, Gohar Feroz, and Sokha Vong. “Virality over YouTube: An empirical analysis.” Internet
Research 24, no. 5 (2014): 629–647.
Goel, Sharad, Ashton Anderson, Jake Hofman, and Duncan J. Watts. “The structural virality of
online diffusion.” Management Science 62, no. 1 (2015): 180–196.
Kapur, Parmod Kumar., R. B. Garg, and Santosh Kumar. Contributions to Hardware and Software
Reliability. Vol. 3. World Scientific, Singapore, 1999.
Richier, Cédric, Eitan Altman, Rachid Elazouzi, Tania Altman, Georges Linares, and Yonathan
Portilla. “Modelling view-count dynamics in YouTube.” arXiv preprint arXiv:1404.2570 (2014).
Vaish, Abhishek, Rajiv Krishna, Akshay Saxena, Mahalingam Dharmaprakash, and Utkarsh Goel.
“Quantifying virality of information in online social networks.” International Journal of Virtual
Communities and Social Networking 4, no. 1 (2012): 32–45.
YouTube Press [https://www.youtube.com/yt/about/press/]
Yu, Honglin, Lexing Xie, and Scott Sanner. “The lifecycle of a YouTube video: Phases, content and
popularity.” In ICWSM, pp. 533–542. 2015.
Zhou, Renjie, Samamon Khemmarat, and Lixin Gao. “The impact of YouTube recommendation
system on video views.” In Proceedings of the 10th ACM SIGCOMM Conference on Internet
Measurement, pp. 404–410. ACM, 2010.
Chapter nine
Market segmentation-based modeling
Contents
9.1 Introduction......................................................................................................................... 165
9.2 Mathematical modeling..................................................................................................... 167
9.2.1 Early market adoption model............................................................................... 169
9.2.2 Main market adoption model............................................................................... 169
9.2.3 Total adoption modeling........................................................................................ 171
9.3 Parameter estimation......................................................................................................... 171
9.4 Discussion and summary.................................................................................................. 175
References...................................................................................................................................... 176
9.1 Introduction
The diffusion of innovation is an essential topic of research in the field of marketing man-
agement. Since the 1960s, plenty of innovation diffusion models have been introduced
to study the diffusion process of a product. A plethora of diffusion models based on the
highly pertinent work of Bass (1969) are available in literature. Bass (1969) has contrib-
uted greatly to the understanding of a variety of diffusion models. The simple structure
of the Bass model has led to its higher number of applications over the last few decades.
The Bass model perceives that an innovation spreads throughout the market by two main
channels: mass media (external influence) and word of mouth (internal influence). His
model assumes the nature of consumers to be homogenous with respect to their response
behavior (Agliari et al. 2009). The inexorable expansion of the market forces the research-
ers to explore alternative diffusion models with high explanatory power. The variation in
customers’ buying behavior requires a renewed focus toward the segmented market struc-
ture, which directly affects the expected profit of the firm (Wedel and Kamakura 2012).
In today’s era of competition to build long-lasting relations and gain trust with consum-
ers, it becomes mandatory for management to take into account different characteristics
and adoption behaviors of customers in various segments of markets. Hence, it becomes
vital for marketers to understand the concept of multisegmented marketing (Singh et al.
2015). However, the launch of a new product would raise awareness about its usage and
may trigger demand among their potential customers (Aggrawal et al. 2014). Furthermore,
the availability of the product/service centers of the technological products would also
impact the number of adopters of the product. Due to the different adoption behavior, the
studies suggest the presence of a dual market: an “early” market corresponding to the high
needs and less price sensitivity and a “main” market corresponding to the relatively less
needs and high price sensitivity.
Main market adopters are different from early market adopters. Recent literature sug-
gests that, at least regarding high-tech products, main market adopters are not opinion
generators; moreover, they do not influence the potential customers of the product (Moore
1991, 1995). In addition, industry studies ascertain different motives for adoption of inno-
vative products among early and main market consumers. Early adopters are mainly
technophiles attracted to a product for its competitive edge over similar products in the
segment; main market consumers are primarily more interested in the product’s enduring
functions (Goldenberg et al. 2002).
Existing dual-market models tend to overlook price sensitivity when it comes to con-
sidering adoption behaviors of early and main market adopters. Early market adopters are
higher risk takers as they endorse a product in spite of unpredictability and possible mis-
givings/imperfections at the initial stage of introduction of the product (Kim et al. 2014).
In comparison, main market adopters are more calculative and are rationalists who weigh
out the benefits offered by the product in the given price bracket before they make the final
purchasing decision (Rogers 1995). This reasoning accounts for the late entrance of main market adopters into the market. However, the existing diffusion models presume their entry
at the earliest stages of the market. Hence, there is essentially some time of consideration
after a main market adopter comes to know about the innovation and before he adopts it.
Another limitation of the above-mentioned models is that they take into account a
single market for a single product. However, potential adopters are not strictly observant
of various factors of the adoption system and may respond differently over time (Anand
et al. 2016b). In addition, as the demographic distribution over population and potential
adopters might be spread across vast regions, this may introduce some time lag. Hence, the
introduction of product to a new customer through a mass-mediated process or personal-
ized interaction is bound to take time. Thus, inclusion of the feature of time lag between
early and main market for diffusion of product is necessary for a comprehensive under-
standing of the diffusion model.
The understanding of the dual-market structure in the initial stages of product life
cycle has invoked marketing practitioners to introduce the concept of time lag between
early and main markets of the product. High-tech executives have increasingly come to
use terms like “early market/main market” or “visionaries/pragmatists” to comprehend
the diffusion process. According to Moore (1991), this difference necessitates change in
marketing strategy including product launch. The main market varies from the early mar-
ket with respect to its magnitude, population distribution, nature, customer expectations,
price sensitivity, and major benefits derived from the product (Gatingon and Robertson
1985). For effective product management, marketing agents should be sensitive to the time
difference of the new and main markets, to the extent of being able to predict the time
at which the mainstream consumers take over early adopters. Accordingly, marketing
strategies can be orchestrated to suit the initiation of early and main market buyers. The
interim period is also significant as it is closely related to other early product life cycle
modifications.
Product life cycle (PLC) is considered to be the trajectory of sales of a product from its
genesis to its final stages (Chandrasekaran and Tellis 2007). Considering it from a macro
perspective, other researchers describe it as the fluctuations in the market during the prod-
uct’s lifetime (Helfat and Peteraf 2003). Hence, PLC can help to determine product-related
strategy decisions for the company (Wong and Ellis 2007). Hofer (1975) studied and reem-
phasized the importance of PLC on business planning. Forrester was a pioneer in study-
ing PLC and its applicability as a tool for management analysis and managerial modeling.
He assumes the industry and products to be homogenous in terms of their characteristics
and customer viewpoint to analyze the PLC stages. Hence, it is quintessential in mapping
the development of innovation and its market opportunities. Rogers divides the diffusion curve, based on potential adopters, into five market segments: innovators, early adopters,
early majority, late majority, and laggards. Subsequently, Moore worked on Rogers’ normal
diffusion curve and adopters’ categories to describe expansion of the new products in the
market. Moore detects a break in the process as the later consumer or mainstream market
does not necessarily depend on the earlier adopters for product information. However,
it can be perceived that Moore’s purported “break” is not as sharp as he would make us
believe. After the initial life span of the innovation, there may be a slump in the market,
yet the other market comes up simultaneously before the previous market has died down.
Similarly, there may be entry of the other market during the decline phase of the foregoing
market. Therefore, assuming a time lag might be misleading, as at some point in time two or more markets exist side by side. The introduction of a multimodal product life cycle curve for the simultaneous multimarket phenomenon is imperative, as it is more realistic. But
in this study, we consider the existence of two simultaneous markets to study the bimodal
structure as a particular case of multimodal curves. The improved curve is going to have
new long-bearing repercussions on marketing strategies for both the dying early market
and the mainstream market.
Although a new trend in marketing literature differentiates between early and main
markets for new products requiring separate treatment by marketers (Mahajan and Muller
1998), the existence of discontinuity in the diffusion process has not been sufficiently
explored. This discontinuity may be due to insufficient transmission of product informa-
tion between early market adopters and mainstream consumers (Moore 1991). Any signifi-
cant difference in the adoption rates of the two markets will inevitably affect the overall
sales. It may result in a temporary decline in the sales of the product at the intermedi-
ate stage (Goldenberg et al. 2002). Inclination or reluctance of consumers of the two seg-
ments of the market may be markedly different (Rogers 1995). This differentiation calls for
bimodal curves for corresponding dual markets. This chapter proposes a new dual-market
innovation diffusion model framework that considers the division of consumers as early
adopters and mainstream consumers. The main adopters are assumed to enter the market
after a certain period of time. Considering different influences of the product over poten-
tial buyers, we study the different adoption behaviors using distribution functions for
main market adopters. We use the new dual-market model to study the pattern of product
life cycles of innovations.
The remainder of this chapter is as follows: in Section 9.2, we present the details and
mathematical framework of our model, followed by empirical analysis and validation of
our proposal in Section 9.3. At last, we discuss the implications of our findings and con-
clude this chapter in Section 9.4.
9.2 Mathematical modeling
The proposed methodology is based on the following set of assumptions:
• The diffusion process is subject to adoption due to the remaining number of adopters
in the market.
• The adoption processes of the early and main markets are disconnected.
• Both markets have their own potential buyers based on their buying behavior.
• One market's adoption is not influenced by the other, i.e., there is no cross-market influence.
• Market size (potential adopters) is fixed during the diffusion process.
• There is a time lag between the two markets.
In this section, we present the dual-market model guided by the above-mentioned assumptions. As available in the literature, according to Bass (1969) the adoption process occurs because of two adopter groups: innovators (external influentials) and imitators (internal influentials). The mathematical representation given by Bass is
n(t) = \frac{dN(t)}{dt} = \left[ p + \frac{q}{M} N(t) \right] \left[ M - N(t) \right]  (9.1)

where p and q are the coefficients of external and internal influence, respectively. The cumulative number of adopters at time t, N(t), is obtained over the remaining adopters of the potential market size M.
Building on the Bass model, Kapur et al. (2004) proposed an alternative formulation by replacing p + \frac{q}{M} N(t) with b(t) in Equation (9.1) to avoid the distinction between innovators and imitators, as an innovator for one product may be an imitator for another. Equation (9.1) can thus be rewritten as
\frac{dN(t)}{dt} = b(t)\left[ M - N(t) \right]  (9.2)
where b(t) defines the rate of adoption of an innovation at time t.
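As a quick illustration, the Bass dynamics (9.1) admit a well-known closed-form solution, sketched below in Python; the parameter values are illustrative, not taken from the chapter.

```python
import numpy as np

def bass_cumulative(t, M, p, q):
    """Closed-form solution of the Bass equation (9.1):
    N(t) = M * (1 - exp(-(p + q) t)) / (1 + (q / p) * exp(-(p + q) t))."""
    e = np.exp(-(p + q) * t)
    return M * (1.0 - e) / (1.0 + (q / p) * e)

# Illustrative (hypothetical) values: M = 100, p = 0.03, q = 0.38
t = np.linspace(0.0, 15.0, 16)
N = bass_cumulative(t, M=100.0, p=0.03, q=0.38)
# N starts at 0, increases monotonically, and saturates toward M
```

One can verify numerically that the closed form satisfies Equation (9.1) by comparing a finite-difference derivative of N(t) with the right-hand side.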
The uniqueness of the adoption behavior of both early and main markets is worthy
of elaboration. Hence, the dual-market innovation diffusion model assumes that adopters
are highly affected by the information that transfers from their own peer group rather
than the same information disseminated throughout the entire population (Goldenberg
et al. 2002). Here we use the index i for the notation of the early market and the index m for the main market segment.
The adoption process in the early market and in the main market segment progresses
as follows:
For early market:
\frac{dI(t)}{dt} = b_i(t)\left[ N_i - I(t) \right]  (9.3)
For main market:
\frac{dM(t)}{dt} = b_m(t)\left[ N_m - M(t) \right]  (9.4)
Here, bi(t) denotes the hazard rate function that an early market consumer will adopt the
product as a result of external and internal forces of marketing, and bm(t) is the rate at
which the main market consumer will adopt the product as a result of external and inter-
nal forces of marketing. Ni describes the market potential of the early market, and Nm
defines the market potential of the main market. I(t) stands for the cumulative number of adopters at time t for the early market population, and M(t) is the cumulative number of adopters of the main market population at time t.
The early market adoption process is similar to the differential equation as defined in
Equation (9.3). But for main market adoption, we employ a new parameter τ for delayed
entry of the main market adopters. It is widely accepted in the literature that the early and
main market adopters differ with respect to their adoption behavior and also have differ-
ent levels of price sensitivity. Hence, the entry of main market adopters after a certain time
τ can be represented in the following way:
\frac{dM(t - \tau)}{dt} = b_m(t - \tau)\left[ N_m - M(t - \tau) \right]  (9.5)
If τ equals 0, Equation (9.5) is equivalent to Equation (9.4). Let t ′ = t − τ , and Equation (9.5)
can be rewritten as
\frac{dM(t')}{dt'} = b_m(t')\left[ N_m - M(t') \right]  (9.6)
Assuming a logistic rate of adoption b_i(t) = \frac{b_i}{1 + \beta_i e^{-b_i t}} for the early market, the solution of Equation (9.3) is

I(t) = N_i \frac{1 - e^{-b_i t}}{1 + \beta_i e^{-b_i t}}.  (9.7)
One should note that when there is only one market, i.e., M(t) = 0, our model converges to
a special case of logistic growth model obtained by Kapur et al. (2004).
Main market adopters then enter the market and the sales curve increases dramatically (Golder
and Tellis 1998). We assume that the main market is fully developed at the time of its intro-
duction. Based on this assumption we can say that adoption here can take more or less
time vis-à-vis the early market depending on the product's availability in the market and its utility. To address the heterogeneity of the main market, we consider different types of S-shaped distribution functions. We also consider the scenario in which the main market adopters enter the market at the fastest pace, due to a major change in the marketing policy of the firm; for that, we use the exponential growth function for main market adopters. The different adoption distribution functions of the main market yield the following expressions:
Case 1: Here we consider b_m(t') as the exponential (constant-rate) function:

b_m(t') = b_m

Using this in Equation (9.6), the total number of main market adopters at time t' is found as

M(t') = N_m \left( 1 - e^{-b_m t'} \right).  (9.8)
Case 2: Using the two-stage Erlang function as the rate of adoption, i.e.,

b_m(t') = \frac{b_m^2 t'}{1 + b_m t'}

the solution of Equation (9.6) corresponding to the above-defined b_m(t') is

M(t') = N_m \left[ 1 - (1 + b_m t') e^{-b_m t'} \right].  (9.9)
Case 3: Considering the logistic rate of adoption for main market adopters, i.e.,

b_m(t') = \frac{b_m}{1 + \beta_m e^{-b_m t'}}

substituting it in Equation (9.6), the corresponding total number of main market adopters is given as

M(t') = N_m \frac{1 - e^{-b_m t'}}{1 + \beta_m e^{-b_m t'}}.  (9.10)
Case 4: Assuming the rate to be the two-stage Erlang logistic function in Equation (9.6), i.e.,

b_m(t') = \frac{b_m \left[ b_m t' + \beta_m \left( 1 - e^{-b_m t'} \right) \right]}{\left( 1 + \beta_m + b_m t' \right)\left( 1 + \beta_m e^{-b_m t'} \right)}

the corresponding solution is

M(t') = N_m \frac{1 - (1 + b_m t') e^{-b_m t'}}{1 + \beta_m e^{-b_m t'}}.  (9.11)
Chapter nine: Market segmentation-based modeling 171
We now define a function L(t) as the cumulative number of adopters of the main market at time t, starting from the initial time point 0, as follows:

L(t) = \begin{cases} M(t - \tau) & \text{for } t \geq \tau, \\ 0 & \text{for } t < \tau. \end{cases}  (9.12)
The different values of function M(t) have been taken from Equations (9.8) to (9.11).
Here it is noted that the market potentials of the early market, N_i, and the main market, N_m, are obtained from the market potential of the total market, M. Assume θ defines the proportion of the early market in the population of the total market, so that N_i = θM and N_m = (1 − θ)M. Then, by substituting these expressions for N_i and N_m and putting in the values of I(t) and L(t), we summarize all the dual-market innovation diffusion models for total sales N(t) in Table 9.1, corresponding to the various early and main market adoption functions.
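To make the composition concrete, the following sketch evaluates total sales N(t) = I(t) + L(t) for one of the combinations summarized in Table 9.1: logistic early-market growth (9.7) with delayed exponential main-market growth (9.8), i.e., Case 1. All parameter values below are hypothetical.

```python
import math

def dual_market_sales(t, M, theta, b_i, beta_i, b_m, tau):
    """Total cumulative sales N(t) = I(t) + L(t), where N_i = theta * M and
    N_m = (1 - theta) * M; the early market follows the logistic form (9.7),
    and the main market the exponential form (9.8) delayed by tau via (9.12)."""
    N_i, N_m = theta * M, (1.0 - theta) * M
    e = math.exp(-b_i * t)
    early = N_i * (1.0 - e) / (1.0 + beta_i * e)                          # I(t), Equation (9.7)
    main = N_m * (1.0 - math.exp(-b_m * (t - tau))) if t >= tau else 0.0  # L(t), (9.8)+(9.12)
    return early + main

# hypothetical parameters: M = 100, theta = 0.3, tau = 5
sales = [dual_market_sales(t, 100, 0.3, 0.8, 4.0, 0.5, 5.0) for t in range(16)]
```

Before t = τ only the early-market component (of size θM) contributes; after τ the main market raises the curve again, producing the bimodal shape discussed below.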
9.3 Parameter estimation
Parameter estimation for all the dual-market models defined above is carried out in this section using the nonlinear least squares method. For practical affirmation, we show the fit of the proposed models on real-life data sets in terms of cumulative distribution functions. The empirical analysis of the proposed models has been done on real-life Data Set I (DS I), which refers to cable TV, and Data Set II (DS II), which refers to clothes dryer sales data taken from Van den Bulte and Lilien (1997). In this study,
statistical software package SPSS (Statistical Package for Social Sciences) nonlinear regres-
sion models have been used to estimate the parameters and their standard errors for the
above-defined four models. SPSS is an interactive and user-friendly software package for applying more sophisticated models to the data. Also, the statistical software R has been used to
draw the box plots of relative errors for all defined models. For the above-mentioned mod-
els in Table 9.1, the estimates of parameters are summarized in Table 9.2. It is assumed that
the delay entry time parameter τ is a fixed number for all the models. In the case of DS I,
the value of the parameter τ is taken as 5, and it is fixed as 8 for DS II.
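The estimation itself was done in SPSS; an analogous nonlinear least squares fit can be sketched in Python with SciPy's `curve_fit`. The model below is the DMIDM-I combination (logistic early market, delayed exponential main market); all names and starting values are illustrative, and the data here are synthetic rather than the chapter's data sets.

```python
import numpy as np
from scipy.optimize import curve_fit

TAU = 5.0  # delayed-entry parameter, fixed in advance as in the chapter (tau = 5 for DS I)

def dmidm1(t, M, theta, b_i, beta_i, b_m):
    """Cumulative sales: logistic early market + exponential main market delayed by TAU."""
    e = np.exp(-b_i * t)
    early = theta * M * (1 - e) / (1 + beta_i * e)
    lag = np.clip(t - TAU, 0.0, None)
    main = np.where(t >= TAU, (1 - theta) * M * (1 - np.exp(-b_m * lag)), 0.0)
    return early + main

t = np.arange(1.0, 16.0)
y = dmidm1(t, 80.0, 0.4, 0.6, 3.0, 0.5)        # synthetic "observed" sales
p0 = (60.0, 0.5, 0.5, 2.0, 0.4)                # illustrative starting values
bounds = ([1.0, 0.01, 0.01, 0.01, 0.01], [500.0, 0.99, 5.0, 50.0, 5.0])
popt, _ = curve_fit(dmidm1, t, y, p0=p0, bounds=bounds)
```

Fixing τ in advance and estimating the remaining parameters by least squares mirrors the procedure described above.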
The value of weight parameter θ is given in the eighth column of Table 9.2 and signifies
that for DS I, the main market is more significant than the early market. But in the case of
DS II, it implies that the early market dominates the main market. Hence, it is not justifi-
able to declare the importance of one segment of the market over the other without using
a mathematical model. The performance analysis of the proposed models is measured by
using the most common goodness-of-fit criteria as MSE (mean square error), R 2 (coefficient
of determination), bias, and variation. The values of these comparison criteria are shown
in Table 9.3, confirming the robustness of the approach.
For practical purposes, it is mandatory to find the better-fitted model to the given data
sets. Hence, we have shown the cumulative sales data and predicted sales of the defined
data sets in Figures 9.1 and 9.2. It can be observed that all the models are indistinguishable
and equally fit to the actual sales data set as all graphs of predicted sales are overlapping
90
80
70
60
Actual sales
50
Sales
DMIDM-I
40
DMIDM-II
30
DMIDM-III
20
DMIDM-IV
10
0
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Time
30
25
20
Actual sales
Sales
15 DMIDM-I
DMIDM-II
10
DMIDM-III
5 DMIDM-IV
0
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Time
DS I DS II
0.20
0.30
Relative errors
Relative errors
0.20
0.10
0.10
0.00
0.00
Figure 9.3 Box plot of the relative errors for DS I and DS II.
174 Advanced Mathematical Techniques in Engineering Sciences
to each other. For each product, the relative error has been carried out to examine the pre-
dictive performance of all dual models. To put it another way, we draw the box plots as
shown in Figure 9.3, which depicts the range of relative errors of all estimated dual models
for both data sets. This figure shows that the proposed dual-market innovation diffusion
model with logistic rate growth of early market composite with two-stage Erlang growth
function in the main market gives the best result in the case of DS I, whereas the same
model gives the worst result for DS II.
In the innovation diffusion literature, most product life cycles follow a bell-shaped structure. But in our study, we have shown that the concept of the dual market brings the
multimodal structure of the diffusion curve of innovation. Figures 9.4 and 9.5 plot the
noncumulative sales of the proposed models. It can be seen that the bimodal structure of
innovation is well captured in these figures. As the early segment market is introduced,
the sales of the product initially increase and reach a peak and afterward decrease with
time, until the main market is introduced in the market. This shape helps us to explain
why it is essential for a firm to choose the introduction of the main market after a certain period of time.

Figure 9.4 Noncumulative sales of the proposed models for DS I.

Figure 9.5 Noncumulative sales of the proposed models for DS II.

This sales growth curve also drives the firm to allocate the promotional
efforts and sales strategies in a timely manner, because market growth can also be spurred by the introduction of the main market for the product, which helps the firm increase its product revenue.
References
Aggrawal, D., Singh, O., Anand, A., & Agarwal, M. (2014). Optimal introduction timing policy for a
successive generational product. International Journal of Technology Diffusion (IJTD), 5(1), 1–16.
Agliari, E., Burioni, R., Cassi, D., & Maria Neri, F. (2009). Word-of-mouth and dynamical inhomoge-
neous markets: An efficiency measure and optimal sampling policies for the pre-launch stage.
IMA Journal of Management Mathematics, 21(1), 67–83.
Anand, A., Aggarwal, R., Singh, O., & Aggrawal, D. (2016a). Understanding diffusion process in the con-
text of product dis-adoption. St. Petersburg State Polytechnical University Journal Economics, 9(2), 7–18.
Anand, A., Singh, O., Aggarwal, R., & Aggrawal, D. (2016b). Diffusion modeling based on customer’s
review and product satisfaction. International Journal of Technology Diffusion (IJTD), 7(1), 20–31.
Bass, F. M. (1969). A new product growth for model consumer durables. Management Science, 15(5),
215–227.
Chandrasekaran, D., & Tellis, G. J. (2007). A critical review of marketing research on diffusion of new
products. In N. K. Malhotra (Ed.), Review of Marketing Research (Vol. 3, pp. 39–80). Bingley: Emerald
Group Publishing Limited.
Gatignon, H., & Robertson, T. S. (1985). A propositional inventory for new diffusion research. Journal
of Consumer Research, 11(4), 849–867.
Goldenberg, J., Libai, B., & Muller, E. (2002). Riding the saddle: How cross-market communications
can create a major slump in sales. Journal of Marketing, 66(2), 1–16.
Golder, P. N., & Tellis, G. J. (1998). Growing, growing, gone: Modeling the sales slowdown of really
new consumer durables. University of Southern California Working Paper.
Helfat, C. E., & Peteraf, M. A. (2003). The dynamic resource‐based view: Capability lifecycles.
Strategic Management Journal, 24(10), 997–1010.
Hofer, C. W. (1975). Toward a contingency theory of business strategy. Academy of Management
Journal, 18(4), 784–810.
Kapur, P. K., Bardhan, A. & Jha, P. C. (2004), An alternative formulation of innovation diffusion model.
In V. K. Kapoor (Ed.), Mathematics and Information Theory, (pp. 17–23). New Delhi: Anamaya
Publication.
Kim, T., Hong, J. S., & Lee, H. (2014). Predicting when the mass market starts to develop: The dual
market model with delayed entry. IMA Journal of Management Mathematics, 27(3), 381–396.
Mahajan, V., & Muller, E. (1998). When is it worthwhile targeting the majority instead of the innova-
tors in a new product launch? Journal of Marketing Research, 35, 488–495.
Moore, G. A. (1991), Crossing the Chasm. New York: Harper Business.
Moore, G. A. (1995), Inside the Tornado. New York: Harper Business.
Rogers, E. M. (1995), The Diffusion of Innovations, 4th ed. New York: The Free Press.
Singh, O., Kapur, P. K., & Sachdeva, N. (2015), Technology management in segmented markets.
Quality, Reliability, Infocom Technology and Industrial Technology Management (pp. 78–89).
New Delhi: I K International Publishing House.
Van den Bulte, C., & Lilien, G. L. (1997). Bias and systematic change in the parameter estimates of
macro-level diffusion models. Marketing Science, 16(4), 338–353.
Wedel, M., & Kamakura, W. A. (2012). Market Segmentation: Conceptual and Methodological Foundations
(Vol. 8). New York: Springer Science & Business Media.
Wong, H. K., & Ellis, P. D. (2007). Is market orientation affected by the product life cycle? Journal of
World Business, 42(2), 145–156.
chapter ten
Kernel estimators for data analysis
Contents
10.1 Introduction......................................................................................................................... 177
10.2 Methodology of kernel estimators................................................................................... 178
10.2.1 Modification of smoothing parameter................................................................. 181
10.2.2 Support boundary.................................................................................................. 182
10.3 Identification of atypical elements................................................................................... 183
10.3.1 Basic version of the procedure.............................................................................. 183
10.3.2 Extended pattern of population............................................................................ 185
10.3.3 Equal-sized patterns of atypical and typical elements..................................... 186
10.3.4 Comments for Section 10.3.................................................................................... 187
10.4 Clustering............................................................................................................................. 187
10.4.1 Procedure................................................................................................................. 188
10.4.2 Influence of the parameters values on obtained results................................... 190
10.4.3 Comments for Section 10.4.................................................................................... 191
10.5 Classification........................................................................................................................ 192
10.5.1 Bayes classification................................................................................................. 192
10.5.2 Correction of values of smoothing parameter and modification intensity.... 193
10.5.3 Reduction to pattern sizes..................................................................................... 194
10.5.4 Structure for nonstationary patterns (concept drift)......................................... 195
10.5.5 Comments for Section 10.5.................................................................................... 198
10.6 Example practical application and final comments....................................................... 198
Acknowledgments....................................................................................................................... 200
References...................................................................................................................................... 201
10.1 Introduction
Perversely, one can state that contemporary data analysis has developed too vigorously, which has had the negative consequence of, in particular, an absence of due care and attention regarding formalism, mathematical justification, and, ultimately, a compact subject methodology. Before
the computer revolution of the second half of the 20th century, data analysis was conducted
based on already well-established, effective mathematical apparatuses of statistics. The main
trouble was then the inadequacy of the data, expressed primarily in small sample sizes, and
– in consequence – statistical procedures were directed toward maximal effectiveness in the
sense of gaining as much information as possible from them. The situation was diametrically
reversed in the 1980s, with the spread of not only efficient numerical calculation systems, but
also methods of automatic measurement. In a relatively short time, a total reversal in the con-
ditions occurred: the data became too numerous and carried too complex information for
processing by classic statistical methodology. Moreover, the absence of the ability to super-
vise such excessive and complicated data sets through a statistician’s intuition led to the
danger of the appearance of evidently erroneous data, resulting, for example, in faults occur-
ring during the measurements of particular elements. Such drastically reversed conditions
of data analysis tasks caused an enormous need for totally new procedures, and the speed
of progress brought about a situation in which they were based on separated, often specific,
concepts without mathematical justification or attempts at the unification of methodology.
Currently, it seems, the time for their modification, proving, and generalization has arrived.
The subject of this chapter is the presentation of a coherent concept of establishing the
methodology of kernel estimators for the three main tasks of data analysis: identification/
detection of atypical elements (outliers), clustering, and classification. The application of a uniform apparatus to all three basic problems facilitates comprehension of the material and, in consequence, the creation of individualized modifications, as well as, in the later phase, the design of a personal computer application. The use of nonparametric kernel estimators frees the results from assumptions on the data distribution – this concerns not only the shape of their grouping, but also the possibility of their partition into separate, incoherent parts. The methodology investigated in this chapter is practically parameter free, i.e., the user is not required to calculate parameter values, although it is possible to optionally modify them in order to achieve specific desired properties.
This chapter is constructed as follows. After this introduction, Section 10.2 presents an outline of the methodology of kernel estimators. This will be applied in Section 10.3 to the task of identifying atypical elements, in Section 10.4 to clustering, and in Section 10.5 to classification. In the framework of the final summary in Section 10.6, an example application of the investigated material in the creation of a mobile phone operator's marketing support strategy is presented.
10.2 Methodology of kernel estimators
Consider an n-dimensional random variable X with distribution density f, together with its m-element random sample

x_1, x_2, \ldots, x_m \in \mathbb{R}^n.  (10.1)

The kernel estimator \hat{f} \colon \mathbb{R}^n \to [0, \infty) of the density f is defined in its basic form as

\hat{f}(x) = \frac{1}{m h^n} \sum_{i=1}^{m} K\!\left( \frac{x - x_i}{h} \right),  (10.2)
where the measurable function K \colon \mathbb{R}^n \to [0, \infty), symmetrical with respect to zero and having a weak global maximum at this point, fulfills the condition \int_{\mathbb{R}^n} K(x) \, dx = 1 and is
called a kernel, whereas the positive coefficient h is referred to as a smoothing parameter.
For details, see the classic monographs (Kulczycki 2005; Silverman 1986; Wand and Jones
1995). Notably, a kernel estimator enables the identification of density for practically any
distribution, especially with no assumptions regarding its membership of a fixed class;
unusual, complex, or multimodal distributions are treated here as a typical unimodal
case. The form of the kernel K and the value of the smoothing parameter h are commonly
provided based on the mean integrated square error criterion.
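For orientation, formula (10.2) in the one-dimensional case can be sketched directly; the kernels below correspond to (10.3) and (10.4), while the sample and the value of h are illustrative.

```python
import numpy as np

def kde(x, sample, h, kernel):
    """One-dimensional kernel density estimator, Equation (10.2) with n = 1:
    f_hat(x) = (1 / (m h)) * sum_i K((x - x_i) / h)."""
    m = len(sample)
    return sum(kernel((x - xi) / h) for xi in sample) / (m * h)

def gauss(u):                       # normal kernel (10.3)
    return np.exp(-u * u / 2.0) / np.sqrt(2.0 * np.pi)

def uniform(u):                     # uniform kernel (10.4)
    return 0.5 if -1.0 <= u <= 1.0 else 0.0

sample = [0.0, 0.4, 1.1, 2.3]       # illustrative sample
f_hat = [kde(x, sample, 0.5, gauss) for x in np.linspace(-4.0, 6.0, 2001)]
```

Whatever kernel is chosen, the resulting estimate is nonnegative and integrates to one, which can be checked numerically on a fine grid.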
Chapter ten: Kernel estimators for data analysis 179
Thus, the selection of the kernel form is practically meaningless from a statistical point
of view and, in consequence, the user should above all take into account properties of the
desired estimator or/and computational aspects, useful for the application problem being
worked out; for details, see the literature (Kulczycki 2005, Section 3.1.3; Wand and Jones
1995, Sections 2.7 and 4.5).
For the one-dimensional case (i.e., when n = 1), the normal (Gauss) kernel

K(x) = \frac{1}{\sqrt{2\pi}} \exp\!\left( -\frac{x^2}{2} \right)  (10.3)
is generally held as basic. For special purposes, other types can be proposed; here, the uniform kernel

K(x) = \begin{cases} \frac{1}{2} & \text{for } x \in [-1, 1] \\ 0 & \text{for } x \notin [-1, 1] \end{cases}  (10.4)
will be used henceforth – it has bounded support and assumes a finite number of values,
which will be taken advantage of later in this chapter.
In the multidimensional case (i.e., when n > 1), a so-called product kernel will be
applied hereinafter.* The main idea here is the division of particular variables with the
multidimensional kernel then becoming a product of n one-dimensional kernels for
specific coordinates. Thus, the kernel estimator (10.2) is then given as
\hat{f}(x) = \frac{1}{m \prod_{j=1}^{n} h_j} \sum_{i=1}^{m} K_1\!\left( \frac{x_1 - x_{i,1}}{h_1} \right) K_2\!\left( \frac{x_2 - x_{i,2}}{h_2} \right) \cdots K_n\!\left( \frac{x_n - x_{i,n}}{h_n} \right),  (10.5)

where

x = [x_1, x_2, \ldots, x_n]^T \quad \text{and} \quad x_i = [x_{i,1}, x_{i,2}, \ldots, x_{i,n}]^T \quad \text{for } i = 1, 2, \ldots, m.  (10.6)
The above kernels fulfill the additional requirements of the particular procedures used
henceforth.
The value of the smoothing parameter is highly significant for the estimation quality,
and many advantageous algorithms for calculating it on the basis of a random sample have
been proposed.
First, consider the one-dimensional case. In specific conditions, e.g., during initial research, or for a numerous random sample (10.1) with a relatively regular distribution, the
* For description of another – radius – type, see the monographs by Kulczycki (2005, Section 3.1.3) and Wand
and Jones (1995, Section 4.5), where it is called spherically symmetric. This notion will not be used in this text.
approximate method (Kulczycki 2005, Section 3.1.5; Wand and Jones 1995, Section 3.2.1) is
sufficient, according to which
h = \left[ \frac{8 \sqrt{\pi}\, W(K)}{3\, U(K)^2\, m} \right]^{1/5} \hat{\sigma},  (10.7)

where W(K) = \int_{-\infty}^{\infty} K(x)^2 \, dx and U(K) = \int_{-\infty}^{\infty} x^2 K(x) \, dx, while \hat{\sigma} denotes the estimator of a
standard deviation:
\hat{\sigma} = \sqrt{ \frac{1}{m-1} \sum_{i=1}^{m} \left( x_i - \hat{E} \right)^2 } \quad \text{with} \quad \hat{E} = \frac{1}{m} \sum_{i=1}^{m} x_i.  (10.8)
The functional values occurring in formula (10.7) are, respectively, for the normal kernel (10.3)

W(K) = \frac{1}{2\sqrt{\pi}}, \quad U(K) = 1  (10.9)

and for the uniform kernel (10.4)

W(K) = \frac{1}{2}, \quad U(K) = \frac{1}{3}.  (10.10)
For specific cases, the more sophisticated yet effective plug-in method (Kulczycki 2005,
Section 3.1.5; Wand and Jones 1995, Section 3.6.1) can be recommended. Its concept consists
of the calculation of the smoothing parameter using the approximate method described
above, and after r steps improving the result, one obtains a value close to optimal. On the
basis of simulation research carried out for the needs of the material worked out in this
chapter, r = 2 can be proposed. In this case, the plug-in method consists of the application
of the following steps:
d_8 = \frac{105}{32 \sqrt{\pi}\, \hat{\sigma}^9},  (10.11)

where \hat{\sigma} is given by formula (10.8), and subsequently,

g_{II} = \left[ \frac{-2 K^{(6)}(0)}{m\, U(K)\, d_8} \right]^{1/9}  (10.12)

g_{I} = \left[ \frac{-2 K^{(4)}(0)}{m\, U(K)\, d_6(g_{II})} \right]^{1/7};  (10.13)

finally

h = \left[ \frac{W(K)}{m\, U(K)^2\, d_4(g_{I})} \right]^{1/5},  (10.14)

while

d_p(g) = \frac{1}{m^2 g^{p+1}} \sum_{i=1}^{m} \sum_{j=1}^{m} K^{(p)}\!\left( \frac{x_i - x_j}{g} \right) \quad \text{for } p = 4, 6.  (10.15)
The kernel K, applied in estimator (10.2), is used only in the last step (10.14). In the other steps, represented by formulas (10.12), (10.13), and (10.15), a different kernel may be used.
Generally, a normal kernel (10.3) is assumed; the quantities occurring in formulas (10.12),
(10.13), and (10.15) are then given by dependence (10.9) and also
K^{(6)}(x) = \frac{1}{\sqrt{2\pi}} \left( x^6 - 15x^4 + 45x^2 - 15 \right) \exp\!\left( -\frac{x^2}{2} \right), \quad K^{(6)}(0) = -\frac{15}{\sqrt{2\pi}}  (10.16)

K^{(4)}(x) = \frac{1}{\sqrt{2\pi}} \left( x^4 - 6x^2 + 3 \right) \exp\!\left( -\frac{x^2}{2} \right), \quad K^{(4)}(0) = \frac{3}{\sqrt{2\pi}}.  (10.17)
For the multidimensional case, thanks to using a product kernel, the methods presented
can be simply applied n times, sequentially for each coordinate.
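The whole r = 2 plug-in chain (10.11)–(10.15), with the normal kernel and its derivatives (10.16)–(10.17), can be sketched as follows for the one-dimensional case; note that the double sum in (10.15) makes each step O(m²).

```python
import math

def plugin_h(sample):
    """Plug-in smoothing parameter for n = 1, steps (10.11)-(10.15),
    using the normal kernel (10.3) throughout."""
    m = len(sample)
    E = sum(sample) / m
    sigma = math.sqrt(sum((x - E) ** 2 for x in sample) / (m - 1))         # (10.8)
    W, U = 1.0 / (2.0 * math.sqrt(math.pi)), 1.0                           # (10.9)
    c = 1.0 / math.sqrt(2.0 * math.pi)
    K6 = lambda x: c * (x**6 - 15*x**4 + 45*x**2 - 15) * math.exp(-x*x/2)  # (10.16)
    K4 = lambda x: c * (x**4 - 6*x**2 + 3) * math.exp(-x*x/2)              # (10.17)

    def d(p, g, Kp):                                                       # (10.15)
        total = sum(Kp((xi - xj) / g) for xi in sample for xj in sample)
        return total / (m**2 * g**(p + 1))

    d8 = 105.0 / (32.0 * math.sqrt(math.pi) * sigma**9)                    # (10.11)
    g2 = (-2.0 * K6(0.0) / (m * U * d8)) ** (1.0 / 9.0)                    # (10.12), g_II
    g1 = (-2.0 * K4(0.0) / (m * U * d(6, g2, K6))) ** (1.0 / 7.0)          # (10.13), g_I
    return (W / (m * U**2 * d(4, g1, K4))) ** (1.0 / 5.0)                  # (10.14)
```

For a regular, roughly normal sample the result is close to the approximate value given by (10.7).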
Finally, it is worth noting that too small a value of the smoothing parameter h implies the appearance of an excessive number of local extremes of the estimator \hat{f}, whereas too large a value causes its overflattening – this property will be actively used in later considerations.
In practical applications of kernel estimators, one can also use specific concepts,
generally improving the estimator properties, and others optionally fitting the model to a
considered reality.* In the first group, a so-called modification of the smoothing parameter –
presented in Section 10.2.1 (Kulczycki 2005, Section 3.1.6; Silverman 1986, Section 5.3.1) –
will be used henceforth, while in Section 10.2.2, the support boundary (Kulczycki 2005,
Section 3.1.7; Silverman 1986, Section 2.10), belonging to the second group, is presented.
where c ∈[0, ∞), f̂* means the kernel estimator without modification, and finally defining
the kernel estimator with modification of the smoothing parameter as
\hat{f}(x) = \frac{1}{m h^n} \sum_{i=1}^{m} \frac{1}{s_i^n} K\!\left( \frac{x - x_i}{h s_i} \right).  (10.19)
For the product kernel (10.5), it takes the form

\hat{f}(x) = \frac{1}{m \prod_{j=1}^{n} h_j} \sum_{i=1}^{m} \frac{1}{s_i^n} K_1\!\left( \frac{x_1 - x_{i,1}}{s_i h_1} \right) K_2\!\left( \frac{x_2 - x_{i,2}}{s_i h_2} \right) \cdots K_n\!\left( \frac{x_n - x_{i,n}}{s_i h_n} \right).  (10.20)
* According to the experience of the author and his research team, it is worth maintaining sensible self-restraint
in the application of specific ideas available in the subject literature, often ineffective in practice, and increasing
the complexity of the procedures.
As a consequence of the above concept, in the areas in which the elements of random
sample (10.1) are rare, the kernel estimators are additionally flattened, and in the regions of
their concentration – additionally peaked. The parameter c determines the intensity of the
modification procedure – when its value is larger/smaller, it becomes more/less distinct.
Using the criterion of the integrated mean square error, one can propose
c = 0.5. (10.21)
For details see the monographs by Kulczycki (2005, Section 3.1.6) and Silverman (1986,
Section 5.3.1).
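Under the assumption (based on the cited Silverman reference) that the modifying parameters s_i are normalized by the geometric mean of the unmodified estimates at the sample points, the procedure can be sketched as:

```python
import math

def adaptive_kde(sample, h, c=0.5):
    """Kernel estimator with smoothing-parameter modification for n = 1,
    Equation (10.19), with c = 0.5 as in (10.21); the normal kernel is used.
    The s_i are the unmodified estimates at the sample points divided by
    their geometric mean, raised to the power -c (assumed normalization)."""
    m = len(sample)
    K = lambda u: math.exp(-u * u / 2.0) / math.sqrt(2.0 * math.pi)
    f_star = lambda x: sum(K((x - xi) / h) for xi in sample) / (m * h)
    vals = [f_star(xi) for xi in sample]
    geo = math.exp(sum(math.log(v) for v in vals) / m)   # geometric mean of estimates
    s = [(v / geo) ** (-c) for v in vals]                # modifying parameters s_i
    return lambda x: sum(K((x - xi) / (h * si)) / si
                         for xi, si in zip(sample, s)) / (m * h)
```

Kernels placed in sparse regions receive s_i > 1 (flattening), those in dense regions s_i < 1 (peaking), while the estimate still integrates to one.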
10.2.2 Support boundary
For practical applications, specific coordinates of the random variable can describe diverse
quantities. A number of these, in particular those representing distance or time, must belong to properly bounded subsets, e.g., nonnegative numbers, for their correct interpretation. To avoid the misinterpretations and calculational errors resulting from this, a beneficial procedure for bounding a kernel estimator's support can be applied.
First, consider the one-dimensional case (when n = 1) and the left boundary – i.e., the
case where the condition fˆ ( x) = 0 for x < x*, with x* ∈ R (mostly x* = 0), is desired. A fragment
of the ith kernel which lies outside the interval [x_*, \infty) is symmetrically "reflected" with respect to the boundary x_* and becomes part of the kernel "hooked" at the reflection of the element x_i, that is, at the point 2x_* - x_i. So, after defining the function K_{x_*} \colon \mathbb{R} \to [0, \infty) by

K_{x_*}(x) = \begin{cases} K(x) & \text{when } x \geq x_* \\ 0 & \text{when } x < x_* \end{cases},  (10.22)
the basic form of kernel estimator (10.2) is the following:

\hat{f}(x) = \frac{1}{m h} \sum_{i=1}^{m} \left[ K_{x_*}\!\left( \frac{x - x_i}{h} \right) + K_{x_*}\!\left( \frac{x + x_i - 2 x_*}{h} \right) \right]  (10.23)
and analogously, the formula with the modification of the smoothing parameter (10.19) becomes

\hat{f}(x) = \frac{1}{m h} \sum_{i=1}^{m} \frac{1}{s_i} \left[ K_{x_*}\!\left( \frac{x - x_i}{h s_i} \right) + K_{x_*}\!\left( \frac{x + x_i - 2 x_*}{h s_i} \right) \right].  (10.24)
Cut fragments of kernels lying outside the assumed support are embodied into the support directly near its boundary; as the change so introduced is small, this is acceptable in practice.
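A sketch of the left-bounded estimator (10.23) with the normal kernel follows; the reflection guarantees that the estimate still integrates to one over [x_*, \infty).

```python
import math

def kde_left_bounded(x, sample, h, x_star=0.0):
    """Kernel estimator with left support boundary x_star, Equation (10.23):
    the estimate is zero below x_star, and above it each kernel is paired
    with its reflection about x_star."""
    if x < x_star:
        return 0.0
    m = len(sample)
    K = lambda u: math.exp(-u * u / 2.0) / math.sqrt(2.0 * math.pi)
    return sum(K((x - xi) / h) + K((x + xi - 2.0 * x_star) / h)
               for xi in sample) / (m * h)
```

Each reflected pair contributes exactly a unit of mass to [x_*, \infty), since the part of the original kernel lost below x_* is returned by its mirror image.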
The consideration for the right boundary of the support can be followed analogously.
In the multidimensional case, the concept presented may naturally be applied subse-
quently for every coordinate of the considered random variable. These cases, however,
will not be used further in this text. For more details, see the books by Kulczycki (2005)
and Silverman (1986, Section 2.10).
A broader description regarding various aspects of kernel estimators is found in the
classic monographs (Kulczycki 2005; Silverman 1986; Wand and Jones 1995). In the next
sections of the chapter, this methodology will be uniformly applied to three fundamental
procedures of data analysis: identification of atypical elements, clustering, and classification.
10.3 Identification of atypical elements
10.3.1 Basic version of the procedure
Consider the random sample

x_1, x_2, \ldots, x_m \in \mathbb{R}^n.  (10.25)

For each of its elements, calculate the value of the kernel estimator computed on the remaining elements, obtaining the set

\hat{f}_{-1}(x_1), \hat{f}_{-2}(x_2), \ldots, \hat{f}_{-m}(x_m),  (10.26)

where \hat{f}_{-i} means the kernel estimator \hat{f} calculated excluding the ith element, for i = 1, 2, \ldots, m. It is worth noting that, regardless of the dimension of the random variable
X, the values of set (10.26) are real (one-dimensional). Particular values fˆ− i ( xi ) character-
ize the probability of the occurrence of the element xi, and therefore, the lower the value
fˆ− i ( xi ), the more the element xi can be interpreted as “less typical,” or rather happening
more rarely.
Define now the number
r ∈(0,1) (10.27)
establishing sensitivity of the procedure for identifying atypical elements. This number
will determine the assumed proportion of atypical elements in relation to the total popu-
lation and, therefore, the ratio of the number of atypical elements to the sum of atypical
and typical. One can naturally assume r < 0.5, as otherwise the atypical elements would become typical and vice versa. In practice
is the most often used, with particular attention paid to the second option.
Let us treat set (10.26) as realizations of a real (one-dimensional) random variable
and calculate the estimator for the quantile of the order r. The positional estimator of
the second order (Parrish 1990, Kulczycki 1998) will be applied as follows, given by the
formula

\hat{q}_r = z_i + (m r + 0.5 - i)(z_{i+1} - z_i),  (10.29)

where i = [mr + 0.5], and [y] denotes the integral part of the number y \in \mathbb{R}, while z_i is the ith value in size of set (10.26) after being sorted; thus,

z_1 \leq z_2 \leq \cdots \leq z_m.  (10.30)
If, for a given tested element

x \in \mathbb{R}^n,  (10.31)

the condition

\hat{f}(x) \leq \hat{q}_r  (10.32)
is fulfilled, then this element should be considered atypical; in the opposite case, when

\hat{f}(x) > \hat{q}_r,  (10.33)
it is typical. Note that for the correctly estimated quantities f̂ and qˆ r, the above guarantees
obtaining the proportion of the number of atypical elements to total population at the
assumed level r.
The above procedure for identifying atypical elements, combined with the properties
of kernel estimators, allows in the multidimensional case for inferences based not only
on values for specific coordinates of a tested element, but above all, on the relationships
between them.
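The whole identification procedure can be sketched as follows; scipy's gaussian_kde with its default bandwidth stands in for the chapter's estimator f̂, a simple empirical quantile replaces the positional estimator, and the plain values f̂(xi) are used instead of the leave-one-out values f̂−i(xi):

```python
import numpy as np
from scipy.stats import gaussian_kde

def atypical_mask(sample, r=0.05):
    """Mark as atypical the elements whose kernel density estimate falls at
    or below the quantile of order r of the density values, cf. (10.32)."""
    kde = gaussian_kde(sample)       # stands in for the chapter's f-hat
    f_hat = kde(sample)              # density value at every sample element
    q_r = np.quantile(f_hat, r)      # empirical quantile of order r
    return f_hat <= q_r              # True marks an atypical element
```

For example, appending a distant outlier to a normal sample flags it, together with roughly a proportion r of the remaining elements.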
Chapter ten: Kernel estimators for data analysis 185
In the multidimensional case, the interval [a, b] generalizes to the n-dimensional cuboid [a1, b1] × [a2, b2] × ⋯ × [an, bn], where aj < bj for j = 1, 2, …, n.
First, the one-dimensional case is considered. Let us generate two pseudorandom numbers u and v, uniformly distributed on the intervals [a, b] and [0, c], respectively. Next, one should check whether

v ≤ f(u). (10.35)
If the above condition is fulfilled, then the value u ought to be assumed as the desired real-
ization of a random variable with distribution described by the density f, that is
x = u. (10.36)
In the opposite case, the numbers u and v need to be removed and the above procedure
repeated until the desired number of pseudorandom numbers x with density f is obtained.
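Condition (10.35) leads directly to the classic acceptance–rejection loop; a minimal one-dimensional sketch:

```python
import random

def neumann_sample(f, a, b, c, size):
    """Von Neumann (acceptance-rejection) generation of pseudorandom numbers
    with density f, supported on [a, b] and bounded above by c."""
    out = []
    while len(out) < size:
        u = random.uniform(a, b)     # candidate realization
        v = random.uniform(0.0, c)   # uniform height over [0, c]
        if v <= f(u):                # condition (10.35): accept u as x
            out.append(u)
        # otherwise u and v are discarded and the draw is repeated
    return out
```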
In the presented procedure, the density f is established by the methodology of kernel
estimators, described in Section 10.2. Denote its estimator as f̂ . The uniform kernel will
be employed, allowing easy calculation of the support boundaries a and b, as well as the
parameter c appearing in condition (10.34). Namely,
a = min_{i=1,2,…,m} xi − h (10.37)

b = max_{i=1,2,…,m} xi + h (10.38)

and

c = max_{i=1,2,…,m} { f̂(xi − h), f̂(xi + h) }. (10.39)
The last formula results from the fact that the maximum for a kernel estimator with the
uniform kernel must occur on the edge of one of the kernels. It is also worth noting that
calculations of parameters (10.37)−(10.39) do not require much effort. This is thanks to the
appropriate choice of kernel form, taking advantage of the kernel estimators' robustness in form.
In the multidimensional case, Neumann’s elimination algorithm is similar to the
previously discussed one-dimensional version. The edges of the n-dimensional cuboid
c = max_{i=1,2,…,m} f̂([xi,1 ± h, xi,2 ± h, …, xi,n ± h]T), taken over all combinations of the signs ±. (10.40)

The number of these combinations is finite and equal to 2^n. Using the formula presented, the n particular coordinates of the pseudorandom vector u and the subsequent number v are generated, after which condition (10.35) is checked.
The results of empirical research show that, for the properly extended set (10.25), the procedure investigated here for identifying atypical elements allows the assumed proportion of such elements in the whole population to be obtained with accuracy sufficient from an applicational point of view.
Similarly, the set of observations for which the opposite inequality (10.33) is true may be considered as a pattern of typical elements. Sizes of the above patterns equal mat and mt, respectively; of course mat + mt = m. We also have

mat / (mat + mt) ≅ r. (10.43)
In this way, unsupervised in its nature, the problem of identifying atypical elements
has been reduced to a supervised classification task, although with strongly unbalanced
patterns – taking into account relation (10.43) with condition (10.28), set (10.41) is in practice
around 10–100 times smaller than pattern (10.42). Classification is relatively conveniently
conditioned and can use many different well-developed methods. However, most pro-
cedures work much better if patterns are of similar or even equal sizes (Kaufman and
Rousseeuw 1990). Using once again the algorithm presented in Section 10.3.3, the size of set (10.41) can be increased to mt, so that mat = mt, thus equalizing the patterns of atypical (10.41) and typical (10.42) elements.
10.4 Clustering
Clustering has become the second basic problem within data analysis – compared to other procedures, it is more loosely defined and at a less advanced stage of research (Everitt et al. 2011; Xu and Wunsch 2009). It lies between classical data analysis, where the research objective has already been specified, and exploratory data analysis, in which the aim of future investigations is unknown a priori and its detection is an integral component of the research. In the first case, clustering may be applied for the purposes of classification, albeit without fixed patterns, whereas the second treats it as a division of the explored data into a few groups, each comprising elements that are similar to each other but differ significantly between particular groups.
Consider a set of elements from an investigated population. The most intuitive and
natural concept is the assumption that specific clusters are related to modes (local maxima)
of distribution density; thus, the “valleys” become the borders of the resulting clusters
(Fukunaga and Hostetler 1975). The algorithm described in this section is presented in its entirety, so that it can be applied without requiring users to conduct laborious research. Its attributes can be summarized as follows:
4. The possibility of changing the intensity of the smoothing parameter modification, with the aim of simultaneously increasing the cluster quantity in dense regions and reducing or even eliminating them from sparse areas of data, or vice versa.
5. The suitable relationship between the parameters mentioned in points 3 and 4 enables
reducing and even eliminating clusters in sparse regions, virtually without affecting
the cluster quantity in dense areas of data.
The characteristics from point 4, and consequently point 5, are particularly worthwhile
to highlight as being practically absent in other clustering methods. In applications, one
should underline the consequences of points 1 and 2, as well as possibly point 3.
10.4.1 Procedure
Consider – as in the previous section – a data set
x1, x2, …, xm ∈ Rn, (10.44)

and define iteratively

xi^0 = xi for i = 1, 2, …, m, (10.45)

xi^(k+1) = xi^k + b [∇f̂(xi^k) / f̂(xi^k)] for k = 0, 1, …, k* − 1, (10.46)

where b > 0 and k* ∈ N\{0}. Based on an optimizing criterion, one can suggest

b = (1/(n + 2)) min_{j=1,2,…,n} hj^2, (10.47)
while hj denotes the smoothing parameter value of the jth coordinate. To the above
task, the estimator with smoothing parameter modification with standard intensity
(10.21) can be applied; the product kernel (10.5) is used in the multidimensional case. As
a (one-dimensional) kernel, the normal kernel (10.3) is proposed because of its analytical convenience, differentiability over the entire domain, and the fact that its values are positive, which protects against division by zero in formula (10.46). In this case, the
quotient on the right side of Equation (10.46) takes the convenient form
∇f̂(xi^k) / f̂(xi^k) = − [ (xi,1^k − x̃i,1^k)/(s1^2 h1^2), (xi,2^k − x̃i,2^k)/(s2^2 h2^2), …, (xi,n^k − x̃i,n^k)/(sn^2 hn^2) ]T, (10.48)

where x̃i,j^k denotes the kernel-weighted mean of the jth coordinates of elements (10.44), and

xi^k = [xi,1^k, xi,2^k, …, xi,n^k]T. (10.49)
Next, it is assumed that algorithm (10.45) and (10.46) needs to be completed, in the event
that the following inequality is fulfilled after the subsequent kth step
| Dk − Dk−1 | ≤ a D0, (10.50)

where

D0 = Σ_{i=1…m−1} Σ_{j=i+1…m} d(xi, xj), Dk−1 = Σ_{i=1…m−1} Σ_{j=i+1…m} d(xi^(k−1), xj^(k−1)), Dk = Σ_{i=1…m−1} Σ_{j=i+1…m} d(xi^k, xj^k), (10.51)
where d denotes the Euclidean metric in Rn. Thus, D0 as well as Dk−1 and Dk mean the sums of the distances between specific elements (10.44) at the start of the algorithm and following the performance of the (k − 1)th and kth steps, respectively. Initially, one can suggest a = 0.001. A possible reduction of this value has practically no influence on the results; however, increasing it requires validation of the potential consequences. Finally, when condition (10.50) is fulfilled after the kth step, then
k* = k. (10.52)
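A one-dimensional sketch of this iteration together with the stopping rule; the gradient-ascent update xi ← xi + b f̂′(xi)/f̂(xi) is assumed here, with scipy's Gaussian KDE standing in for f̂ and a numerical derivative in place of the closed-form quotient (10.48):

```python
import numpy as np
from scipy.stats import gaussian_kde

def gradient_cluster_shift(x, b=None, a=0.001, max_steps=200):
    """Shift one-dimensional data points uphill along the density gradient,
    x_i <- x_i + b * f'(x_i)/f(x_i), stopping when the total inter-point
    distance changes by no more than a * D_0, cf. condition (10.50)."""
    x = np.asarray(x, dtype=float)
    kde = gaussian_kde(x)                      # density built once, from the data
    h = x.std(ddof=1) * kde.factor             # effective smoothing parameter
    if b is None:
        b = h ** 2 / 3.0                       # 1/(n+2) * h^2 with n = 1

    def pairwise_sum(y):                       # sum of all mutual distances
        return np.abs(y[:, None] - y[None, :]).sum() / 2.0

    def grad_over_f(y, eps=1e-4):              # numerical f'(y)/f(y)
        return (kde(y + eps) - kde(y - eps)) / (2.0 * eps) / kde(y)

    d0 = pairwise_sum(x)
    d_prev = d0
    for _ in range(max_steps):
        x = x + b * grad_over_f(x)             # assumed gradient-ascent update
        d_now = pairwise_sum(x)
        if abs(d_now - d_prev) <= a * d0:      # stopping rule
            break
        d_prev = d_now
    return x
```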
Next, the set of mutual distances between the elements in their final positions is considered:

{ d(xi^k*, xj^k*) } for i = 1, 2, …, m − 1; j = i + 1, i + 2, …, m. (10.54)
Its size is
md = m(m − 1) / 2. (10.55)
Considering set (10.54) as a one-dimensional random sample, one should calculate the auxiliary kernel estimator f̂d of the mutual distances (10.54). The normal kernel (10.3) is suggested once more; furthermore, the smoothing parameter modification procedure with the standard value of parameter (10.21) is applied, together with the left-sided bounding of the support to the interval [0, ∞); see formula (10.24) for x* = 0.
Finding – with appropriate accuracy – the “first” (in the sense of the lowest argument
value) local minimum of the function fˆd in the interval (0, D), where
D = max_{i=1,2,…,m−1; j=i+1,i+2,…,m} d(xi, xj), (10.56)
is the next task. For this objective, one can consider set (10.54) to be a random sample,
estimate its standard deviation applying formula (10.8), and subsequently take the values
x from the set
where int(100 ⋅ D) means the integral part of the number 100 ⋅ D, until the condition
is fulfilled. The first (the smallest) value* will be treated as the smallest distance between
cluster centers located in close proximity to each other and referred to as xd hereinafter.
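The determination of xd can be sketched as follows; scipy's gaussian_kde with its default bandwidth and without support bounding stands in for f̂d, and a simple grid scan replaces the stepping scheme described above:

```python
import numpy as np
from itertools import combinations
from scipy.stats import gaussian_kde

def first_distance_minimum(points, grid_size=400):
    """Estimate the density of the mutual distances (10.54) and return the
    smallest argument at which it attains a local minimum (the value x_d)."""
    pts = np.asarray(points, dtype=float)
    dist = np.array([np.linalg.norm(p - q) for p, q in combinations(pts, 2)])
    kde = gaussian_kde(dist)                   # stands in for f-hat_d
    grid = np.linspace(0.0, dist.max(), grid_size)
    vals = kde(grid)
    for j in range(1, grid_size - 1):
        if vals[j] < vals[j - 1] and vals[j] <= vals[j + 1]:
            return grid[j]                     # "first" local minimum
    return None                                # no minimum: a single cluster
```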
The final step is the creation of the clusters. This is achieved through:
1. Taking an element of set (10.44) and first producing a one-element cluster including it.
2. Finding an element of set (10.44) which differs from those in the cluster and lies nearer than xd to it; if such an element exists, it is added to the cluster; if not, go to point 4.
3. Finding an element of set (10.44), different from the elements in the cluster, lying closer than xd to at least one of them; if such an element exists, it is added to the cluster and point 3 is repeated.
4. Adding the attained cluster to a “cluster list” and removing its elements from set (10.44); if the set reduced in this way remains nonempty, go to point 1; otherwise, finish the algorithm.
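The four points above can be sketched as follows:

```python
import numpy as np

def form_clusters(points, x_d):
    """Points 1-4: start a cluster from any free element, then keep
    absorbing free elements lying closer than x_d to some cluster member."""
    pts = np.asarray(points, dtype=float)
    free = set(range(len(pts)))
    clusters = []
    while free:                                    # point 1
        cluster = [free.pop()]
        grew = True
        while grew:                                # points 2 and 3
            grew = False
            for j in list(free):
                if any(np.linalg.norm(pts[j] - pts[i]) < x_d for i in cluster):
                    cluster.append(j)
                    free.discard(j)
                    grew = True
        clusters.append(cluster)                   # point 4
    return clusters
```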
The “cluster list” obtained in this way contains all the clusters defined during the above procedure. This completes the basic form of the clustering procedure – its potential modifications, with their influence on the results, are described in the following section.
* In the event that such a value is nonexistent, the presence of one cluster should be recognized and the
procedure completed. The same applies to the irrational but formally possible situation m = 1, when set (10.54) is empty.
As discussed in Section 10.2, too great a value of the smoothing parameter h results in over-smoothing of the kernel estimator, while too small a value causes the appearance of too many local extremes. Therefore, increasing this parameter value – with respect to that calculated by the criterion of the mean integrated square error – results in fewer clusters; conversely, decreasing it yields more clusters. In both cases, one can emphasize that although the smoothing parameter influences the number of clusters, this number still depends solely on the data's internal structure. On the basis of the performed research, a change in the smoothing parameter value in the range −25% to +50% can be recommended. Results lying beyond this range need individual verification.
The intensity of the smoothing parameter modification – as described in Section
10.2.1 – is defined by the parameter c; its standard value is provided by formula (10.21). Its
increase sharpens the kernel estimator in the dense regions of set (10.44) and also smooths
it in the sparse areas; as a consequence, if this parameter value rises, then the number of
clusters in dense areas increases and simultaneously decreases in sparse regions. These
effects are reversed in the event of this parameter value diminishing. On the basis of the
performed research, the parameter c value can be proposed to be between 0 (indicating a
lack of modification) and 1.5. The validity of the obtained results needs individual verifica-
tion in the case of exceeding a value of 1.5. In particular one can suggest c = 1 as standard.
However, growth of the cluster number in dense data regions and at the same time
lowering or even eliminating clusters in sparse areas (as they frequently contain atypi-
cal elements appearing as a result of various errors) is frequently desired in practice.
Combining the aforementioned considerations, it is appropriate to propose increasing both the standard intensity of the smoothing parameter modification (10.21) and, simultaneously, the smoothing parameter h value calculated on the basis of the optimization criterion, to the value h* given as
h* = (3/2)^(c − 0.5) h. (10.59)
The combined effect of both of these factors implies a twofold smoothing of the estimator
f̂ in the areas in which set (10.44) is sparse. At the same time, the above factors virtually cancel each other out in dense regions; hence, they have almost zero influence on the discovery of clusters in such areas. On the basis of the conducted research, a change in the
parameter c value in the range of 0.5–1.0 can be executed; however, increases that exceed
1.0 require individual validation. In particular, the value c = 0.75 can be recommended in
such a case.
More details with visual aids are presented in the article by Kulczycki and
Charytanowicz (2010). Applications were synthetically presented in the paper by Kulczycki
et al. (2012), and also in more detail in the particular publications by Charytanowicz et al.
(2016), Kulczycki and Daniel (2009), and Łukasik et al. (2008).
10.5 Classification
Classification constitutes the third of the basic tasks of data analysis (Duda et al. 2001). In
the previously considered problems of atypical element identification (Section 10.3) and
clustering (Section 10.4), the subject of processing was only a data set – (10.25) or (10.44), respectively – with no additional information or supervision. These problems are therefore typical unsupervised tasks. Meanwhile, the classification issue
involves a tested element being assigned to previously defined groups (classes) repre-
sented by patterns. This constitutes additional significant information, and thus the clas-
sification becomes a supervised task.
Such beneficial conditions of a classification task cause the available methodology to
be rich and very varied. The concept presented in the following sections will be based on
the Bayes approach. The classifiers obtained in this way work quite well in complex real-world situations and are eagerly used by practitioners, chiefly because of their robustness, low requirements concerning patterns, and their illustrativeness, which supports individual modifications. In particular, the method proposed here offers the opportunity to attribute preferences to classes containing elements which – according to possibly asymmetrical task requirements – must especially not be incorrectly attributed to others. The parameters
of kernel estimators can be made more precise with the aim of successively improving
classification quality. Moreover, the application of the sensitivity method borrowed from
artificial neural networks allows the elimination of those pattern elements that have insig-
nificant or even a negative effect on the correctness of results. These last two procedures
will in turn be the basis for the creation of an effective adaptational structure, adjusting a
classifier to nonstationary data (so-called concept drift).
10.5.1 Bayes classification
Assume J sets containing elements from the space Rn:

x1,1, x1,2, …, x1,m1 (10.60)
x2,1, x2,2, …, x2,m2 (10.61)
  ⋮
xJ,1, xJ,2, …, xJ,mJ, (10.62)

representing the assumed classes. The sizes m1, m2, …, mJ need to be more or less proportional
to the “contribution” of specific classes within the investigated population. The aim of
classification is to map the tested element
x ∈ R n (10.63)
to one of the groups represented by patterns (10.60)−(10.62). Denote as fˆ1 , fˆ2 , , fˆJ kernel
estimators successively calculated on the basis of sets (10.60)−(10.62) treated as r andom sam-
ples (10.1) each time – the methodology used for this purpose is presented in Section 10.2.
According to the classic Bayes concept (Duda et al. 2001), the classified element (10.63) needs then to be attributed to the class j ∈ {1, 2, …, J} for which the value

mj f̂j(x) (10.64)

is the largest. By introducing the positive coefficients z1, z2, …, zJ, the above can be generalized to attributing the element to the class for which

zj mj f̂j(x) (10.65)

is the largest. The quality of such a classifier can be expressed by a functional counting incorrectly classified pattern elements; in its definition (10.66), # means the number of elements within a set. The classic leave-one-out method can be applied for the calculation of this functional's value for any fixed argument. Because this value is an integer, the modified Hooke–Jeeves algorithm (Kelley 1999) was used to find a minimum. Alternative conceptions are described in the survey paper (Venter 2010).
As a result of performed research, the assumption can be made that for every coor-
dinate, the grid should usually have nodes at the points 0.25, 0.5, …, 1.75. The functional
(10.66) values are calculated for these nodes; the attained results are then sorted and the
five best become starting conditions for the Hooke–Jeeves procedure, in which the initial
step value is proposed as 0.2. Following completion of each of the above five executions,
the values of functional (10.66) for the obtained end points are calculated, and that which
has the smallest value is the sought-after vector of the parameters b0 , b1 , b2 , , bn.
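The classification rule itself can be sketched as follows; scipy's gaussian_kde with its default bandwidth stands in for the estimators f̂j, and the parameter corrections described in this section are omitted:

```python
import numpy as np
from scipy.stats import gaussian_kde

def bayes_classify(x, patterns, z=None):
    """Attribute the tested element x to the class j maximizing
    z_j * m_j * f-hat_j(x), with one kernel estimator built per class."""
    if z is None:
        z = [1.0] * len(patterns)            # no class preferences
    scores = []
    for z_j, pattern in zip(z, patterns):
        pattern = np.asarray(pattern, dtype=float)
        kde = gaussian_kde(pattern.T)        # rows of pattern are elements
        scores.append(z_j * len(pattern) * kde(np.atleast_2d(x).T)[0])
    return int(np.argmax(scores))            # index of the winning class
```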
It is worthy of note that in this procedure it is not necessary to correct the classification parameters; however, doing so would enhance the classification quality and, moreover, would allow applying the easy and convenient formula (10.7) to calculate smoothing parameter values.
Consider nonnegative coefficients w1, w2, …, wm, normalized such that

Σ_{i=1…m} wi = m, (10.67)

subsequently assigned to the specific elements of random sample (10.1). Then the initial definition of kernel estimator (10.2) becomes

f̂(x) = (1/(m h^n)) Σ_{i=1…m} wi K((x − xi)/h). (10.68)
Formulas (10.2), (10.5), (10.19), (10.20), and (10.23), (10.24) can be changed analogously. The
coefficients wi describe the weight (significance) of the ith pattern element with respect to
classification quality. It should be noted that if wi ≡ 1, definition (10.68) is then reduced to
basic form (10.2).
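A one-dimensional sketch of the weighted estimator with the normal kernel (10.3):

```python
import numpy as np

def weighted_kde(x, sample, w, h):
    """Weighted kernel estimator (10.68) with the normal kernel, for n = 1:
    f-hat(x) = (1/(m*h)) * sum_i w_i * K((x - x_i)/h)."""
    sample = np.asarray(sample, dtype=float)
    w = np.asarray(w, dtype=float)           # weights fulfilling sum(w) == m
    u = (x - sample) / h
    kernel = np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)   # normal kernel
    return (w * kernel).sum() / (len(sample) * h)
```

With wi ≡ 1 this reduces to the basic estimator (10.2).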
The procedure for reducing pattern sets (10.60)−(10.62) consists of two stages. The first
consists of the weights wi calculation; the second is the removal of such random sample
elements which have the lowest respective weights. To realize the former of these two
stages, separate neural networks can be built for each class. For simplicity of the forthcoming notation, let the index j = 1, 2, …, J, which characterizes specific classes, be fixed. The constructed network has three layers and is of feedforward type: m inputs, which are related to subsequent elements of the pattern; a hidden layer of a size equal to the integral part of the number √m; and one output neuron. This network learns using a data set consisting of values of specific kernels for consecutive pattern elements, while the output is the kernel estimator value for the considered pattern element. Network learning is
put is the kernel estimator value for the considered pattern element. Network learning is
achieved through backward propagation of errors, with a momentum factor. At the com-
pletion of the procedure, the network is subjected to an analysis of sensitivity on learning
data; for details see the book by Zurada (1992). The essence of this method constitutes the
establishment – after network learning – of the influence of the subsequent inputs ui on the
output y; this is represented by the real coefficients
Si = ∂y(u1, u2, …, um) / ∂ui for i = 1, 2, …, m. (10.69)
Denote by Si^(p) the sensitivity coefficients obtained in consecutive iterations of the previous stage (with p = 1, 2, …, P), characterizing successive learning data. Aggregating them results in the coefficient Si defined as

Si = [ (1/P) Σ_{p=1…P} (Si^(p))^2 ]^(1/2) for i = 1, 2, …, m, (10.70)
which will be used to calculate the coefficients wi. Thus, first let

w̃i = 1 − Si / Σ_{j=1…m} Sj for i = 1, 2, …, m, (10.71)

and then normalize these values as

wi = m w̃i / Σ_{j=1…m} w̃j for i = 1, 2, …, m, (10.72)

in order to guarantee condition (10.67). The form of definition (10.71) is due to the network created here being the most sensitive to redundant and atypical elements, which suggests – as a consequence of the form of kernel estimator (10.68) – a requirement to assign to them suitably smaller values w̃i, and consequently wi; these coefficients characterize the significance of specific pattern elements with respect to the classification quality.
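A sketch of the weight calculation; the root-mean-square aggregation over iterations and the final rescaling so that the weights sum to m are assumptions of this sketch:

```python
import numpy as np

def pattern_weights(S):
    """Turn per-iteration sensitivities S[p][i] into element weights w_i:
    RMS aggregation over iterations, the 1 - S_i/sum(S_j) transform, and a
    final rescaling so that sum(w_i) == m, cf. condition (10.67)."""
    S = np.asarray(S, dtype=float)                 # shape (P, m)
    S_i = np.sqrt((S ** 2).mean(axis=0))           # assumed RMS aggregation
    w_tilde = 1.0 - S_i / S_i.sum()                # cf. (10.71)
    return len(w_tilde) * w_tilde / w_tilde.sum()  # assumed normalization
```

The most sensitive (redundant or atypical) element thus receives the smallest weight.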
The natural requirement that elements for which wi < 1 should be removed from the pattern has been confirmed by the performed research [observe that the mean value of the coefficients wi equals 1, due to the normalization introduced by formula (10.72)]. Increasing this threshold resulted in a significant drop in classification accuracy because of the loss of nonredundant, valuable information carried in the pattern. Conversely, decreasing the threshold led to a substantial fall in the reduction of pattern size; its effect on classification accuracy was barely noticeable in the proximity of the value 1, although a significant reduction of the threshold results in a significant increase in the number of errors.
[Figure 10.1 Flowchart of the classification procedure. The initial patterns enter block A (addition of new pattern elements and calculation of the parameter h); block B: calculation of the correction coefficients b0, b1, …, bn; block C: calculation of the weights w1, w2, …, wm; block D: sorting of the weights; block E: reduction of patterns; block F: recalculation of the weights; block G: reduction of patterns (elements with wi ≥ 1 proceed to block H, those with wi < 1 return to block A); block U: calculation of the derivatives w1′, w2′, …, wm′; block V: sorting of the derivatives; block W: return of not more than qm1*, qm2*, …, qmJ* elements with positive wi′ to block A; block Z: removal of the remaining elements; block H: Bayes classification.]
these values may worsen the classification quality, whereas an increase results in an excessive calculation time.
The elements of initial patterns (10.60)–(10.62) are provided as introductory data. Based
on these – according to the procedures presented in Section 10.2 – the value of the param-
eter h is calculated (for the parameter c it is given by formula (10.21)). Figure 10.1 shows this
action in block A. Next, corrections in the parameters h and c values are made by taking
the coefficients b0, b1, …, bn, as described in Section 10.5.2 (block B in Figure 10.1).
The subsequent procedure, shown by block C, is the calculation of the parameters
wi values mapped to particular elements of patterns, separately for each class, as in
Section 10.5.3. Following this, within each class, the values of the parameter wi are sorted
(block D), and then – in block E – the appropriate m1*, m2*, …, mJ* elements with the largest values wi are designated for the classification phase itself. The remaining undergo further
treatment, denoted in block U, which is presented in the following sections, after Bayes
classification has been dealt with.
The reduced patterns separately go through a procedure newly calculating the values
of parameters wi, presented in Section 10.5.3 and depicted in block F. In turn, as block G
in Figure 10.1 denotes, these pattern elements for which wi ≥ 1 are submitted to further
stages of the classification procedure, while those with wi < 1 are sent to block A for further
processing in the next steps of the algorithm after adding new elements of patterns. The
final, and also the principal part of the procedure worked out here is Bayes classification,
presented at the beginning of this section and marked by block H. Obviously, many tested elements (10.63) can be subjected to classification separately. After the procedure has been
finished, elements of patterns which have undergone classification are sent to the begin-
ning of the algorithm to block A, for further processing in the next steps, following the
addition of new elements of patterns.
Now – as mentioned two paragraphs earlier, in the last sentence – it remains to con-
sider those pattern elements whose values wi were not counted among the m1*, m2*, …, mJ*
largest for particular patterns. Thus, within block U, the derivative wi′ is calculated* for
each of them. If the element is “too new” and does not possess the k − 1 previous values wi,
then the gaps are filled with zeros (because the values wi generally oscillate around unity,
such behavior significantly increases the derivative value, and in consequence, ensures
against premature elimination of this element). Next, for each separate class, the elements
wi′ are sorted (block V). As marked in block W, the respective

qm1*, qm2*, …, qmJ* (10.73)

elements of each pattern with the largest derivative values, on the additional requirement that the value is positive, go back to block A for further calculations carried out after the
that the value is positive, go back to block A for further calculations carried out after the
addition of new elements. If the number of elements with positive derivative is less than
qm1*, qm2*, …, qmJ*, then the number of elements returning may be smaller (including even
zero). The remaining elements are permanently eliminated from the procedure, as shown
in block Z. In the above notation, q is a positive constant influencing the proportion of
* As the task considered here does not require the differences between subsequent values t1 , t2 , ..., tk to be equal,
it is therefore advantageous to apply interpolation methods. In the procedure worked out here, favorable
results were achieved using a classic method based on Newton's interpolation polynomial. Detailed formu-
las, as well as a treatment of other related concepts are found in the survey paper (Venter 2010). A backward
derivative, after taking into consideration the last three values, can be assumed as standard, i.e., a useful com-
promise between stability of results and possibility to react to changes (the derivative has then two degrees of
freedom).
patterns’ elements with little, but successively increasing meaning. As a standard value
q = 0.2 is proposed, or more generally q ∈[0.1, 0.25] depending on the size/speed of changes.
An increase in this parameter value allows more effective conforming to pattern changes,
although this potentially increases the calculation time, while lowering it may signifi-
cantly worsen adaptation. In the general case, this parameter can be different for particu-
lar patterns – then formula (10.73) takes the form q1m1*, q2m2*, …, qJmJ*, where q1, q2, …, qJ are positive.
The above procedure is repeated following the addition of new elements (block A
in Figure 10.1). Besides these elements – as has been mentioned earlier – for particular
patterns the m1*, m2*, …, mJ* elements of the greatest values wi are taken, respectively, as well as up to qm1*, qm2*, …, qmJ* (or in the generalized case q1m1*, q2m2*, …, qJmJ*) elements of the greatest derivative wi′, whose significance is thus successively increasing, most often due to the nonstationarity of patterns.
xi = [xi,1, xi,2, xi,3, …, xi,n]T for i = 1, 2, …, m, (10.74)
where xi,1 denotes the average monthly income per SIM card of the ith client, xi,2 is its length
of subscription, xi,3 is the number of active SIM cards, and possibly others xi , 4 , xi ,5 , ,xi , n in
accordance with the current market situation.
Firstly, atypical elements within set (10.74) were removed, according to the procedure presented in Section 10.3 (with r = 0.1). The regularity of the data structure was thus enhanced; it is worth noting that this was achieved through the removal of only those elements that had negligible importance for the further results of the investigated procedure.
Secondly, the data set was submitted to clustering by the procedure described in
Section 10.4. The consequence was the partitioning of the data set which consisted of
particular clients, into separate groups each composed of similar members. The results,
achieved for ordinary values of the modification intensity c and smoothing parameters
h, showed too great a number of small-sized clusters lying in low density areas of data –
mostly containing irrelevant, unusual clients – and an excessively large main cluster
containing more than half of all elements. Taking into account the properties of the used
algorithm, this value was raised to c = 1. As a consequence, the desired effects – the
significant lowering of the number of “peripheral” clusters as well as the splitting of
the main cluster – were thus attained. The number of clusters was then satisfactory and
changes to the smoothing parameters h value became redundant. At this point, the data
set comprising 1639 elements was partitioned into 26 clusters with sizes 488, 413, 247,
128, 54, 41, 34, 34, 33, 28, 26, 21, 20, 14, 13, 12, 10, two containing four elements, three containing three elements, two containing two elements, and two containing one element. Note that four groups can be clearly distinguished – the first includes two large clusters of 488 and 413 elements, the following contains two medium clusters with 247 and 128 elements, the next nine are small with 20–54 elements, and finally there are 13 clusters each with fewer than 20 elements. It was appropriate to eliminate these last clusters, although those including key or prestige clients (14, 13, 12, and 10 elements) were excluded from removal. Finally, 17 clusters remained for further analysis.
Then, in the case of each of the clusters found in this manner, an optimal scheme –
with regard to anticipated operator profit – was defined for the treatment of subscribers
belonging to this group. Elements of preference theory (Fodor and Rubens 1994) and fuzzy
logic were applied due to the usually imprecise nature of expert evaluation of such prob-
lems; however, details of this operation lie beyond the remit of this chapter – details can be
found in the publication by Kulczycki and Daniel (2009).
It is worthy of note that none of the above calculations needs to be performed during client negotiations; instead, they should merely be updated periodically (every 1–6 months in practice).
The client with whom negotiations are conducted can be characterized – in accordance with formula (10.74) – by an n-dimensional vector whose specific coordinates represent the respective features of this client. Such data can be obtained from the operator's database archive, if the client has previously been a subscriber, or alternatively from historic invoices issued by a rival network, should the operator be attempting to poach a client.
Attributing the client to an appropriate subscriber group during negotiations – according to
clusters defined earlier – was performed by applying Bayes classification presented in Section
10.5. Because the marketing strategies regarding specific clusters have been previously estab-
lished, the above action completes the procedure of supporting the marketing strategy with
regard to business clients, which was the objective of the project presented above.
The comments summarizing the above application example are symptomatic and can
usefully be treated as recapitulation of all the material presented in this chapter.
Thus, the use of the methodologically uniform apparatus of kernel estimators in the above concept of a marketing support strategy for a mobile phone operator made the analysis, and the creation of a useful computer application, significantly easier. In turn, its nonparametric character freed the concept from the difficult-to-foresee – often nonstandard – distributions of data appearing in contemporary complex tasks. In particular, there are no restrictions on the shape of data groupings, or even on the number of separate parts into which they are divided. The values of every parameter (excluding the easy-to-interpret proportion of atypical elements r) are set on the basis of optimization criteria, after which they can be appropriately matched to individual preferences. In this text all the necessary formulas are given, apart from the standard procedures used in Sections 10.5.2–10.5.4.
Currently, the fundamental challenge for kernel estimators is large sets of high-dimensional data. Thanks to the averaging properties of this type of estimator, quite satisfactory results can be obtained for the first of these aspects even by natural sampling of data set elements. For a fixed sample size m it is worth using classic random sampling (Vitter 1985), and in the case of streaming data, the algorithm presented in the paper by Aggarwal (2006). For multidimensionality one can apply classic reduction using the statistical method PCA (Jolliffe 2001) or a refined approach based on computational intelligence (Kulczycki and Łukasik 2014). More sophisticated methods, also allowing for the presence of categorical features, are currently the subject of intensive research by the author and his team.
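The classic random sampling mentioned above, for a fixed sample size m, can be realized in a single pass over the data with reservoir sampling. A minimal sketch of Vitter's Algorithm R (the function name and the test stream are illustrative):

```python
import random

def reservoir_sample(stream, m, rng=random.Random(0)):
    """Keep a uniform random sample of size m from an iterable of unknown length."""
    reservoir = []
    for t, item in enumerate(stream):
        if t < m:
            reservoir.append(item)       # fill the reservoir with the first m items
        else:
            # item t (0-based) replaces a random reservoir slot with probability m/(t+1)
            j = rng.randrange(t + 1)
            if j < m:
                reservoir[j] = item
    return reservoir

sample = reservoir_sample(range(10_000), 100)
print(len(sample))  # 100
```

Each element of the stream ends up in the sample with the same probability, which is what makes subsequent kernel estimation on the reduced sample statistically sound.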
Acknowledgments
The work was supported in parts by the Systems Research Institute of the Polish Academy
of Sciences in Warsaw, and the Faculty of Physics and Applied Computer Science of the
AGH University of Science and Technology in Cracow, Poland.
I thank my close associates – former Ph.D.-students – Małgorzata Charytanowicz, D.Sc.,
Ph.D., Karina Daniel, Ph.D., Piotr A. Kowalski, Ph.D., Damian Kruszewski, Ph.D., Szymon
Łukasik, Ph.D., coauthors of the common publications (Kulczycki and Charytanowicz
2010; Kulczycki et al. 2012; Kulczycki and Daniel 2009; Kulczycki and Kowalski 2011, 2015a, 2015b; Kulczycki and Kruszewski 2017a, 2017b; Kulczycki and Łukasik 2014). With their
consent, this text also contains results of our joint research.
References
Aggarwal C.C., Outlier Analysis. Springer, New York, 2013.
Aggarwal C.C., On biased reservoir sampling in the presence of stream evolution. In Proceedings
of the 32nd International Conference on Very Large Data Bases, Seoul, 12–15 September 2006, U.
Dayal, K.-Y. Whang, D.B. Lomet, G. Alonso, G.M. Lohman, M.L. Kersten, S.K. Cha, Y.-K. Kim
(eds.), VLDB Endowment, 2006.
Canaan C., Garai M.S., Daya M., Popular sorting algorithms. World Applied Programming, vol. 1,
pp. 62–71, 2011.
Charytanowicz M., Niewczas J., Kulczycki P., Kowalski P.A., Lukasik S., Discrimination of wheat
grain varieties using X-ray images. In: Information Technologies in Medicine, Pietka E., Badura P.,
Kawa J., Wieclawek W. (eds.), Springer, Berlin, 2016, pp. 39–50.
Duda R.O., Hart P.E., Storck D.G., Pattern Classification. Wiley, New York, 2001.
Everitt B.S., Landau S., Leese M., Stahl D., Cluster Analysis. Wiley, New York, 2011.
Fodor J., Roubens M., Fuzzy Preference Modelling and Multicriteria Decision Support. Kluwer, Dordrecht,
1994.
Fukunaga K., Hostetler L.D., The estimation of the gradient of a density function, with applications
in pattern recognition. IEEE Transactions on Information Theory, vol. 21, pp. 32–40, 1975.
Gentle J.E., Random Number Generation and Monte Carlo Methods. Springer, New York, 2003.
Jolliffe I.T., Principal Component Analysis. Springer, New York, 2001.
Kaufman L., Rousseeuw P.J., Finding Groups in Data: An Introduction to Cluster Analysis. Wiley,
New York, 1990.
Kelley C.T., Iterative Methods for Optimization. SIAM, Philadelphia, 1999.
Kincaid, D., Cheney, W., Numerical Analysis. Brooks/Cole, Pacific Grove, 2002.
Kulczycki P., Wykrywanie uszkodzeń w systemach zautomatyzowanych metodami statystycznymi. Alfa,
Warsaw, 1998.
Kulczycki P., Estymatory jądrowe w analizie systemowej. WNT, Warsaw, 2005.
Kulczycki P., Charytanowicz M., A complete gradient clustering algorithm formed with kernel
estimators. International Journal of Applied Mathematics and Computer Science, vol. 20, pp. 123–
134, 2010.
Kulczycki P., Charytanowicz M., Kowalski P.A., Łukasik S., The complete gradient clustering
algorithm: Properties in practical applications. Journal of Applied Statistics, vol. 39, pp. 1211–1224,
2012.
Kulczycki P., Daniel K., Metoda wspomagania strategii marketingowej operatora telefonii komórkowej. Przegląd Statystyczny, vol. 56, no. 2, pp. 116–134, 2009; Errata: vol. 56, no. 3–4, p. 3, 2009.
Kulczycki P., Kowalski P.A., Bayes classification of imprecise information of interval type. Control
and Cybernetics, vol. 40, pp. 101–123, 2011.
Kulczycki P., Kowalski P.A., Bayes classification for nonstationary patterns. International Journal of
Computational Methods, vol. 12, ID 1550008 (19 pages), 2015a.
Kulczycki P., Kowalski P.A., Classification of interval information with data drift. In: Modeling and
Using Context, Christiansen H., Stojanovic I., Papadopoulos G.A. (eds.), Springer, Berlin, 2015b,
pp. 495–500.
Kulczycki P., Kruszewski D., Detection of atypical elements with fuzzy and intuitionistic fuzzy
evaluations. In: Trends in Advanced Intelligent Control, Optimization and Automation, Mitkowski
W., Kacprzyk J., Oprzedkiewicz K., Skruch P. (eds.), Springer, Cham, 2017a, pp. 774–786.
Kulczycki P., Kruszewski D., Identification of atypical elements by transforming task to supervised
form with fuzzy and intuitionistic fuzzy evaluations. Applied Soft Computing, vol. 60, no. 11,
pp. 623–633, 2017b.
Kulczycki P., Łukasik S., An algorithm for reducing dimension and size of sample for data explo-
ration procedures. International Journal of Applied Mathematics and Computer Science, vol. 24,
pp. 133–149, 2014.
Łukasik S., Kowalski P.A., Charytanowicz M., Kulczycki P., Fuzzy models synthesis with
kernel-density-based clustering algorithm. In: Fifth International Conference on Fuzzy Systems
and Knowledge Discovery, J. Ma, Y. Yin, J. Yu, S. Zhou (eds.), IEEE Computer Society, Los
Alamitos, vol. 3, pp. 449–453, 2008.
Parrish R., Comparison of quantile estimators in normal sampling. Biometrics, vol. 46, pp. 247–257,
1990.
Silverman, B.W., Density Estimation for Statistics and Data Analysis. Chapman and Hall, London, 1986.
Venter G., Review of optimization techniques. In: Encyclopedia of Aerospace Engineering, Blockley R.,
Shyy W. (eds.), Wiley, New York, 2010, pp. 5229–5238.
Vitter J.S., Random sampling with reservoir. ACM Transactions on Mathematical Software, vol. 11,
pp. 37–57, 1985.
Wand M., Jones M., Kernel Smoothing. Chapman and Hall, London, 1995.
Xu R., Wunsch D., Clustering. Wiley, New York, 2009.
Zurada J., Introduction to Artificial Neural Systems. West Publishing, St. Paul, 1992.
chapter eleven

A new technique for constructing exact tolerance limits

K.N. Nechval
Transport and Telecommunication Institute

G. Berzins
University of Latvia
Contents
11.1 Introduction
11.2 Two-parameter Weibull distribution
11.3 Lower statistical γ-content tolerance limit with expected (1 − α)-confidence
11.4 Upper statistical γ-content tolerance limit with expected (1 − α)-confidence
11.5 Lower statistical (1 − α)-expectation tolerance limit
11.6 Upper statistical (1 − α)-expectation tolerance limit
11.7 Numerical example 1
11.8 Numerical example 2
11.9 Conclusion
References
11.1 Introduction
Statistical tolerance (prediction) limits are another tool for making statistical inference
on an unknown population. As opposed to a confidence limit that provides information
concerning an unknown population parameter, a tolerance limit provides information on
the entire population. In this chapter, two types of statistical tolerance limits are defined:
(1) γ-content tolerance limit with expected (1 − α)-confidence, and (2) (1 − α)-expectation
tolerance limit.
To be specific, let γ denote a proportion between 0 and 1. Then a one-sided γ-content tolerance limit with expected (1 − α)-confidence is determined so as to capture a proportion γ or more of the population, with a given expected confidence level 1 − α. For example, an
upper γ-content tolerance limit with expected (1 − α)-confidence for a univariate popula-
tion is such that with the given expected confidence level 1 − α, a specified proportion γ
or more of the population will fall below the limit. A lower γ-content tolerance limit with
expected (1 − α)-confidence satisfies similar conditions.
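The notion of content can be illustrated numerically in the simplest possible setting, with fully known parameters (unlike the estimation problem treated in this chapter): for a known distribution, the upper γ-content limit is just the γ-quantile, and simulation confirms that the proportion of the population below it is γ. The Weibull parameter values below are illustrative:

```python
import math, random

rng = random.Random(1)
gamma = 0.9
beta, delta = 2.0, 1.5            # illustrative known Weibull scale and shape

# Upper gamma-content limit for a fully known Weibull: its gamma-quantile.
upper = beta * (-math.log(1.0 - gamma)) ** (1.0 / delta)

# Check the content by simulation: the fraction of the population below the limit.
n_sim = 200_000
below = sum(beta * rng.expovariate(1.0) ** (1.0 / delta) < upper
            for _ in range(n_sim))
print(below / n_sim)  # close to 0.9
```

When the parameters must be estimated from a sample, the limit itself becomes random, which is exactly why the additional expected confidence level 1 − α enters the definitions above.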
Formally, $L_k(S)$ is a lower statistical γ-content tolerance limit with expected (1 − α)-confidence on future outcomes of the kth-order statistic $Y_k$ if

$$E_\theta\left\{\Pr\left(\int_{L_k(S)}^{\infty} g_\theta(y_k)\,dy_k \ge \gamma\right)\right\} = E_\theta\left\{\Pr\left(\overline{G}_\theta(L_k(S)) \ge \gamma\right)\right\} = 1-\alpha, \qquad (11.1)$$

and $U_k(S)$ is an upper γ-content tolerance limit with expected (1 − α)-confidence on future outcomes of the kth-order statistic $Y_k$ if

$$E_\theta\left\{\Pr\left(\int_0^{U_k(S)} g_\theta(y_k)\,dy_k \ge \gamma\right)\right\} = E_\theta\left\{\Pr\left(G_\theta(U_k(S)) \ge \gamma\right)\right\} = 1-\alpha, \qquad (11.2)$$

where

$$g_\theta(y_k) = \frac{1}{B(k,\,m-k+1)}\,[F_\theta(y_k)]^{k-1}\,[1-F_\theta(y_k)]^{m-k}\,f_\theta(y_k) \qquad (11.3)$$

is the probability density function of the kth-order statistic $Y_k$ in a set of m future observations,

$$B(a,b) = \int_0^1 t^{a-1}(1-t)^{b-1}\,dt = \frac{\Gamma(a)\,\Gamma(b)}{\Gamma(a+b)} \qquad (11.4)$$

is the beta function, and $\Gamma(a) = \int_0^\infty t^{a-1}e^{-t}\,dt$ is the gamma function,
$$G_\theta(y_k) = \Pr(Y_k \le y_k) = \sum_{i=k}^{m}\binom{m}{i}[F_\theta(y_k)]^i[1-F_\theta(y_k)]^{m-i} = \sum_{i=k}^{m}\sum_{j=0}^{m-i}\binom{m}{i}\binom{m-i}{j}(-1)^j[F_\theta(y_k)]^{i+j} = \int_0^{F_\theta(y_k)}\varphi(t\,|\,k,\,m-k+1)\,dt \qquad (11.5)$$

is the cumulative distribution function of $Y_k$, where

$$\varphi(t\,|\,a,b) = \frac{1}{B(a,b)}\,t^{a-1}(1-t)^{b-1}, \quad t\in(0,1), \qquad (11.6)$$

is a probability density function of the beta-distribution with the shape parameters a and b,

$$\overline{G}_\theta(y_k) = 1 - G_\theta(y_k) = \Pr(Y_k > y_k) = \sum_{i=0}^{k-1}\binom{m}{i}[F_\theta(y_k)]^i[1-F_\theta(y_k)]^{m-i} = \sum_{i=0}^{k-1}\sum_{j=0}^{i}\binom{m}{i}\binom{i}{j}(-1)^j\left[1-F_\theta(y_k)\right]^{m-i+j} = \int_{F_\theta(y_k)}^{1}\varphi(t\,|\,k,\,m-k+1)\,dt, \qquad (11.7)$$

and

$$\frac{dG_\theta(y_k)}{dy_k} = g_\theta(y_k). \qquad (11.8)$$
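The equality in (11.5) between the binomial tail sum and the incomplete beta integral can be checked numerically; a small sketch (the values of p = F_θ(y_k), k, and m are illustrative, and the integral is evaluated by the trapezoidal rule):

```python
import math

def order_stat_cdf_binomial(p, k, m):
    """Pr(Y_k <= y) with F_theta(y) = p, via the binomial tail sum in (11.5)."""
    return sum(math.comb(m, i) * p**i * (1 - p)**(m - i) for i in range(k, m + 1))

def order_stat_cdf_beta(p, k, m, steps=200_000):
    """The same probability via the incomplete beta integral in (11.5)."""
    a, b = k, m - k + 1
    B = math.factorial(a - 1) * math.factorial(b - 1) / math.factorial(a + b - 1)
    f = lambda t: t ** (a - 1) * (1 - t) ** (b - 1)
    h = p / steps
    s = 0.5 * (f(0.0) + f(p)) + sum(f(i * h) for i in range(1, steps))
    return s * h / B

p, k, m = 0.3, 2, 10
print(order_stat_cdf_binomial(p, k, m), order_stat_cdf_beta(p, k, m))  # both about 0.8507
```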
Indeed,

$$\begin{aligned}
\frac{dG_\theta(y_k)}{dy_k} &= \frac{d}{dy_k}\sum_{i=k}^{m}\binom{m}{i}[F_\theta(y_k)]^i[1-F_\theta(y_k)]^{m-i} \\
&= \sum_{i=k}^{m}\binom{m}{i}\left(i\,[F_\theta(y_k)]^{i-1}[1-F_\theta(y_k)]^{m-i}F_\theta'(y_k) - (m-i)[F_\theta(y_k)]^{i}[1-F_\theta(y_k)]^{m-i-1}F_\theta'(y_k)\right) \\
&= \sum_{i=k}^{m}\frac{m!}{(i-1)!\,(m-i)!}[F_\theta(y_k)]^{i-1}[1-F_\theta(y_k)]^{m-i}f_\theta(y_k) - \sum_{j=k+1}^{m}\frac{m!}{(j-1)!\,(m-j)!}[F_\theta(y_k)]^{j-1}[1-F_\theta(y_k)]^{m-j}f_\theta(y_k) \\
&= \frac{m!}{(k-1)!\,(m-k)!}[F_\theta(y_k)]^{k-1}[1-F_\theta(y_k)]^{m-k}f_\theta(y_k) = g_\theta(y_k), \qquad (11.9)
\end{aligned}$$

where $j = i+1$; equivalently,

$$\frac{dG_\theta(y_k)}{dy_k} = \frac{d}{dy_k}\int_0^{F_\theta(y_k)}\varphi(t\,|\,k,\,m-k+1)\,dt = \frac{1}{B(k,\,m-k+1)}[F_\theta(y_k)]^{k-1}[1-F_\theta(y_k)]^{m-k}F_\theta'(y_k) = g_\theta(y_k). \qquad (11.10)$$
The problem considered in this chapter is to find Lk (S) (lower statistical γ-content toler-
ance limit Lk with expected [1 − α]-confidence on future outcomes of the kth-order statis-
tic Yk) satisfying (11.1) and Uk(S) (upper statistical γ-content tolerance limit with expected
[1 − α]-confidence on future outcomes of the kth-order statistic Yk) satisfying (11.2) on the
basis of the experimental random sample X1, …, Xn when some or all numerical values of
components of the parametric vector θ are unspecified.
Thus, the logical purpose for a tolerance limit must be the prediction of future out-
comes for some production process. The coverage value γ is the percentage of the future
process outcomes to be captured by the prediction, and the confidence level (1 − α) is the
proportion of the time we hope to capture that percentage γ.
The common distributions used in life testing problems are the normal, exponential,
Weibull, and gamma distributions (Mendenhall 1958). Tolerance limits for the normal dis-
tribution have been considered in (Guttman 1957; Wald and Wolfowitz 1946; Wallis 1951),
and others.
Tolerance (prediction) limits enjoy a fairly rich history in the literature and have a
very important role in engineering and manufacturing applications. Patel (1986) provides
a review (which was fairly comprehensive at the time of publication) of tolerance intervals
(limits) for many distributions as well as a discussion of their relation with confidence
intervals (limits) for percentiles. Dunsmore (1978) and Guenther, Patil, and Uppuluri (1976)
discuss two-parameter exponential tolerance intervals (limits) and the estimation proce-
dure in greater detail. Engelhardt and Bain (1978) discuss how to modify the formulas
when dealing with type II censored data. Guenther (1972) and Hahn and Meeker (1991)
discuss how one-sided tolerance limits can be used to obtain approximate two-sided tol-
erance intervals by applying Bonferroni’s inequality. In Nechval et al. (2011, 2016a–c), the
exact statistical tolerance and prediction limits are discussed under parametric uncer-
tainty of underlying models.
In contrast to other statistical limits commonly used for statistical inference, tolerance limits (especially those for order statistics) are used relatively rarely. One reason is that the theoretical concept and the computational complexity of tolerance limits are significantly greater than those of the standard confidence and prediction limits. Thus, it becomes necessary to use innovative approaches that allow one to construct tolerance limits on future order statistics for many populations.
In this chapter, new approaches to constructing lower and upper statistical γ-content
tolerance limits with expected (1 − α)-confidence as well as (1 − α)-expectation tolerance
limits on order statistics in future samples are proposed. For illustration, a two-parameter
Weibull distribution is considered.
11.2 Two-parameter Weibull distribution

The two-parameter Weibull distribution with the probability density function

$$f_\theta(x) = \frac{\delta}{\beta}\left(\frac{x}{\beta}\right)^{\delta-1}\exp\left[-\left(\frac{x}{\beta}\right)^{\delta}\right], \quad x>0,\ \beta>0,\ \delta>0, \qquad (11.11)$$

and the cumulative distribution function

$$F_\theta(x) = 1-\exp\left[-\left(\frac{x}{\beta}\right)^{\delta}\right], \quad x>0,\ \beta>0,\ \delta>0, \qquad (11.12)$$

indexed by the scale and shape parameters β and δ, is used as the underlying distribution of a random variable X in a sample of the lifetime data, where θ = (β, δ).
The Weibull distribution is widely used in reliability and survival analysis due to its
flexible shape and ability to model a wide range of failure rates. It can be derived theo-
retically as a form of extreme value distribution, governing the time to occurrence of the
“weakest link” of many competing failure processes. Its special case with shape parameter
δ = 2 is the Rayleigh distribution, which is commonly used for modeling the magnitude of
radial error when x and y coordinate errors are independent normal variables with zero
mean and the same standard deviation while the case δ = 1 corresponds to the widely
used exponential distribution. For illustration, probability density functions of the two-
parameter Weibull distribution for selected values of β and δ are shown in Figure 11.1.
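The special cases mentioned above can be confirmed directly from the cdf (11.12); a small numeric sketch (the values of x and β are illustrative):

```python
import math

def weibull_cdf(x, beta, delta):
    """Two-parameter Weibull cdf, equation (11.12)."""
    return 1.0 - math.exp(-((x / beta) ** delta))

x, beta = 1.3, 2.0

# delta = 1: reduces to the exponential distribution with mean beta
print(weibull_cdf(x, beta, 1.0), 1.0 - math.exp(-x / beta))

# delta = 2: reduces to the Rayleigh distribution with scale sigma = beta / sqrt(2)
sigma = beta / math.sqrt(2.0)
print(weibull_cdf(x, beta, 2.0), 1.0 - math.exp(-x**2 / (2.0 * sigma**2)))
```

Each printed pair of values agrees exactly.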
Figure 11.1 The Weibull probability density functions for (β, δ) = (0.5, 2), (1.0, 2), (1.5, 3), and (3.0, 4).

Let X follow a Weibull distribution with scale parameter β and shape parameter δ. We consider both parameters β, δ to be unknown. Let (X1, …, Xn) be a random sample from
the two-parameter Weibull distribution (11.11), and let $\hat\beta$, $\hat\delta$ be the maximum likelihood estimates of β, δ, respectively, computed on the basis of (X1, …, Xn):

$$\hat\beta = \left(\sum_{i=1}^{n} x_i^{\hat\delta}\Big/ n\right)^{1/\hat\delta} \qquad (11.13)$$

and

$$\hat\delta = \left(\frac{\sum_{i=1}^{n} x_i^{\hat\delta}\ln x_i}{\sum_{i=1}^{n} x_i^{\hat\delta}} - \frac{1}{n}\sum_{i=1}^{n}\ln x_i\right)^{-1}. \qquad (11.14)$$
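Equations (11.13) and (11.14) determine $\hat\delta$ only implicitly. One simple way to solve them numerically (a sketch, not a prescription of the chapter; the bisection bounds are illustrative) is bisection on the fixed-point form of (11.14), followed by substitution into (11.13). The data below are the logic-circuit lifetimes of numerical example 2:

```python
import math

def weibull_mle(xs, lo=0.05, hi=100.0, iters=200):
    """Solve (11.14) for delta-hat by bisection, then compute beta-hat from (11.13)."""
    mean_log = sum(math.log(x) for x in xs) / len(xs)

    def h(d):
        # h(d) = [sum x^d ln x / sum x^d - mean(ln x)]^{-1} - d; decreasing in d
        s = sum(x ** d for x in xs)
        sl = sum(x ** d * math.log(x) for x in xs)
        return 1.0 / (sl / s - mean_log) - d

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if h(mid) > 0:
            lo = mid
        else:
            hi = mid
    delta = 0.5 * (lo + hi)
    beta = (sum(x ** delta for x in xs) / len(xs)) ** (1.0 / delta)
    return beta, delta

beta_hat, delta_hat = weibull_mle([830, 1020, 1175, 1424, 1603])
print(round(delta_hat, 3), round(beta_hat, 1))  # roughly 4.977 and 1321.3
```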
It is readily verified that any n − 2 of the Zi's, say Z1, …, Zn−2, form a set of n − 2 functionally independent ancillary statistics. The appropriate conditional approach is to consider the distributions of V1, V2, V3 conditional on the observed value of Z(n) = (Z1, …, Zn). (For purposes of symmetry of notation, we include all of Z1, …, Zn in the expressions stated here; it can be shown that Zn−1 and Zn can be determined as functions of Z1, …, Zn−2 only.)
Theorem 1 (Joint pdf of the pivotal quantities V1, V2 from the two-parameter Weibull distribution). Let (X1, ..., Xn) be a random sample of n observations from the two-parameter Weibull distribution (11.11). Then the joint pdf of the pivotal quantities

$$V_1 = \left(\frac{\hat\beta}{\beta}\right)^{\delta}, \qquad V_2 = \frac{\delta}{\hat\delta}, \qquad (11.17)$$

conditional on fixed

$$z^{(n)} = (z_1, \ldots, z_n), \qquad (11.18)$$

where

$$Z_i = \left(\frac{X_i}{\hat\beta}\right)^{\hat\delta}, \quad i = 1, \ldots, n, \qquad (11.19)$$

are ancillary statistics, any n − 2 of which form a functionally independent set, and $\hat\beta$ and $\hat\delta$ are the maximum likelihood estimates for β and δ, respectively, based on a random sample of n observations (X1, ..., Xn) from the two-parameter Weibull distribution (11.11), is given by

$$f_n(v_1, v_2\,|\,z^{(n)}) = \frac{1}{\Gamma(n)}\left(\sum_{i=1}^{n} z_i^{v_2}\right)^{n} v_1^{n-1}\exp\left(-v_1\sum_{i=1}^{n} z_i^{v_2}\right)\cdot\frac{1}{\vartheta(z^{(n)})}\, v_2^{n-2}\prod_{i=1}^{n} z_i^{v_2}\left(\sum_{i=1}^{n} z_i^{v_2}\right)^{-n} = f_n(v_1\,|\,z^{(n)}, v_2)\, f_n(v_2\,|\,z^{(n)}), \qquad (11.20)$$

where

$$f_n(v_1\,|\,z^{(n)}, v_2) = \frac{1}{\Gamma(n)}\left(\sum_{i=1}^{n} z_i^{v_2}\right)^{n} v_1^{n-1}\exp\left(-v_1\sum_{i=1}^{n} z_i^{v_2}\right), \quad v_1\in(0,\infty), \qquad (11.21)$$

$$f_n(v_2\,|\,z^{(n)}) = \frac{1}{\vartheta(z^{(n)})}\, v_2^{n-2}\prod_{i=1}^{n} z_i^{v_2}\left(\sum_{i=1}^{n} z_i^{v_2}\right)^{-n}, \quad v_2\in(0,\infty), \qquad (11.22)$$

and

$$\vartheta(z^{(n)}) = \int_0^{\infty} v_2^{n-2}\prod_{i=1}^{n} z_i^{v_2}\left(\sum_{i=1}^{n} z_i^{v_2}\right)^{-n} dv_2. \qquad (11.23)$$

Proof. The joint density of (X1, ..., Xn) is

$$f_\theta(x_1, \ldots, x_n) = \prod_{i=1}^{n}\frac{\delta}{\beta}\left(\frac{x_i}{\beta}\right)^{\delta-1}\exp\left[-\left(\frac{x_i}{\beta}\right)^{\delta}\right]. \qquad (11.24)$$
Using the invariant embedding technique (Nechval and Vasermanis 2004; Nechval
et al. 2008, 2010), we transform (11.24) to
$$f_\theta(x_1,\ldots,x_n)\,d\beta\,d\delta = \prod_{i=1}^{n}\left[\frac{\delta}{x_i}\left(\frac{x_i}{\beta}\right)^{\delta}\right]\exp\left[-\sum_{i=1}^{n}\left(\frac{x_i}{\beta}\right)^{\delta}\right] d\beta\,d\delta,$$

and, carrying out the change of variables from (β, δ) to (v₁, v₂) defined by (11.17) – so that $(x_i/\beta)^{\delta} = v_1 z_i^{v_2}$ – we obtain after some algebra

$$f_\theta(x_1,\ldots,x_n)\,d\beta\,d\delta = -\hat\beta\hat\delta\,n\prod_{i=1}^{n}\frac{1}{x_i}\;v_2^{n-2}\prod_{i=1}^{n} z_i^{v_2}\;v_1^{n-1}\exp\left(-v_1\sum_{i=1}^{n} z_i^{v_2}\right)dv_1\,dv_2$$

$$= -\hat\beta\hat\delta\,n\prod_{i=1}^{n}\frac{1}{x_i}\,\Gamma(n)\,v_2^{n-2}\prod_{i=1}^{n} z_i^{v_2}\left(\sum_{i=1}^{n} z_i^{v_2}\right)^{-n}\cdot\frac{1}{\Gamma(n)}\left(\sum_{i=1}^{n} z_i^{v_2}\right)^{n} v_1^{n-1}\exp\left(-v_1\sum_{i=1}^{n} z_i^{v_2}\right)dv_1\,dv_2. \qquad (11.25)$$
Normalizing (11.25) – that is, dividing it by its integral over $(v_1, v_2)\in(0,\infty)^2$, whereby the constant factor $-\hat\beta\hat\delta\,n\,\Gamma(n)\prod_{i=1}^{n}x_i^{-1}$ cancels – gives

$$\frac{v_2^{n-2}\prod_{i=1}^{n} z_i^{v_2}\left(\sum_{i=1}^{n} z_i^{v_2}\right)^{-n}\cdot\dfrac{1}{\Gamma(n)}\left(\sum_{i=1}^{n} z_i^{v_2}\right)^{n} v_1^{n-1}\exp\left(-v_1\sum_{i=1}^{n} z_i^{v_2}\right)}{\displaystyle\int_0^{\infty} v_2^{n-2}\prod_{i=1}^{n} z_i^{v_2}\left(\sum_{i=1}^{n} z_i^{v_2}\right)^{-n} dv_2} = f_n(v_1, v_2\,|\,z^{(n)}), \qquad (11.26)$$

which is (11.20). This completes the proof.
It can be noted that the quantity

$$W = V_1\sum_{i=1}^{n} Z_i^{V_2} = \left(\frac{\hat\beta}{\beta}\right)^{\delta}\sum_{i=1}^{n} Z_i^{V_2} = \left[\left(\frac{\hat\beta}{\beta}\right)^{\hat\delta}\right]^{V_2}\sum_{i=1}^{n} Z_i^{V_2} = V_3^{V_2}\sum_{i=1}^{n} Z_i^{V_2} \qquad (11.27)$$

is distributed as

$$W \sim g_n(w) = \frac{1}{\Gamma(n)}\,w^{n-1}\exp(-w), \quad w\in(0,\infty), \qquad (11.28)$$

i.e., as the special case a = n, b = 1 of the gamma density

$$f(w\,|\,a,b) = \frac{1}{\Gamma(a)\,b}\left(\frac{w}{b}\right)^{a-1}\exp(-w/b), \quad w\in(0,\infty). \qquad (11.29)$$
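Substituting (11.17) and (11.19) into (11.27) gives $W = \sum_{i=1}^{n}(X_i/\beta)^{\delta}$, a sum of n independent unit exponential variables, so the gamma law (11.28) can also be checked by simulation (the parameter values below are illustrative):

```python
import math, random

rng = random.Random(7)
n, beta, delta = 5, 2.0, 1.5          # illustrative sample size and true parameters
reps = 100_000

vals = []
for _ in range(reps):
    xs = [beta * rng.expovariate(1.0) ** (1.0 / delta) for _ in range(n)]
    vals.append(sum((x / beta) ** delta for x in xs))   # W from (11.27)

mean = sum(vals) / reps
var = sum((v - mean) ** 2 for v in vals) / reps
print(round(mean, 2), round(var, 2))  # both close to n = 5, as for a Gamma(n, 1) law
```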
11.3 Lower statistical γ-content tolerance limit with expected (1 − α)-confidence

Theorem 2. A lower statistical γ-content tolerance limit with expected (1 − α)-confidence on future outcomes of the kth-order statistic $Y_k$, satisfying (11.1), is given by

$$L_k = \eta_{L_k}^{1/\hat\delta}\,\hat\beta, \qquad (11.30)$$

where

$$\eta_{L_k} = \arg\left\{\int_0^{\infty}\int_0^{\ln(1-q_{1-\gamma})^{-1}\sum_{i=1}^{n} z_i^{v_2}\big/\eta_{L_k}^{v_2}}\frac{1}{\Gamma(n)}w^{n-1}e^{-w}\,\frac{1}{\vartheta(z^{(n)})}\,v_2^{n-2}\prod_{i=1}^{n} z_i^{v_2}\left(\sum_{i=1}^{n} z_i^{v_2}\right)^{-n} dw\,dv_2 = 1-\alpha\right\} \qquad (11.31)$$

is a tolerance factor; the maximum likelihood estimates $\hat\beta$ and $\hat\delta$ of the parameters β and δ are determined from (11.13) and (11.14), respectively; the ancillary statistics Zi, i = 1, …, n, are given by (11.19); and $q_{1-\gamma}$ is a quantile of the beta-distribution satisfying

$$\int_0^{q_{1-\gamma}}\varphi(t\,|\,k,\,m-k+1)\,dt = \int_0^{q_{1-\gamma}}\frac{1}{B(k,\,m-k+1)}\,t^{k-1}(1-t)^{m-k}\,dt = 1-\gamma. \qquad (11.32)$$
Proof. Taking into account (11.1), (11.7), (11.27), and (11.28), the following probability transformation can be carried out:

$$\begin{aligned}
\Pr\left(\int_{L_k}^{\infty} g_\theta(y_k)\,dy_k \ge \gamma\right) &= \Pr\left(\overline{G}_\theta(L_k) \ge \gamma\right) = \Pr\left(1 - \int_0^{F_\theta(L_k)}\varphi(t\,|\,k,\,m-k+1)\,dt \ge \gamma\right) \\
&= \Pr\left(\int_0^{F_\theta(L_k)}\varphi(t\,|\,k,\,m-k+1)\,dt \le 1-\gamma\right) = \Pr\left(F_\theta(L_k) \le q_{1-\gamma}\right) \\
&= \Pr\left(1-\exp\left[-(L_k/\beta)^{\delta}\right] \le q_{1-\gamma}\right) = \Pr\left(\exp\left[-(L_k/\beta)^{\delta}\right] \ge 1-q_{1-\gamma}\right) \\
&= \Pr\left((L_k/\beta)^{\delta} \le \ln(1-q_{1-\gamma})^{-1}\right) = \Pr\left(V_1\left[(L_k/\hat\beta)^{\hat\delta}\right]^{V_2} \le \ln(1-q_{1-\gamma})^{-1}\right) \\
&= \Pr\left(V_1\sum_{i=1}^{n} Z_i^{V_2} \le \frac{\ln(1-q_{1-\gamma})^{-1}\sum_{i=1}^{n} Z_i^{V_2}}{\left[(L_k/\hat\beta)^{\hat\delta}\right]^{V_2}}\right) = \Pr\left(W \le \frac{\ln(1-q_{1-\gamma})^{-1}\sum_{i=1}^{n} Z_i^{V_2}}{\left[(L_k/\hat\beta)^{\hat\delta}\right]^{V_2}}\right), \qquad (11.33)
\end{aligned}$$

where $q_{1-\gamma}$ is the (1 − γ)-quantile of the beta-distribution with the shape parameters a = k and b = m − k + 1. Using pivotal quantity averaging, it follows from (11.1) and (11.33) that
$$\begin{aligned}
&E\left\{\Pr\left(W \le \frac{\ln(1-q_{1-\gamma})^{-1}\sum_{i=1}^{n} Z_i^{V_2}}{\left[(L_k/\hat\beta)^{\hat\delta}\right]^{V_2}}\right)\right\} = E\left\{\int_0^{\ln(1-q_{1-\gamma})^{-1}\sum_{i=1}^{n} Z_i^{V_2}\big/\left[(L_k/\hat\beta)^{\hat\delta}\right]^{V_2}} g_n(w)\,dw\right\} \\
&\quad = \int_0^{\infty}\int_0^{\ln(1-q_{1-\gamma})^{-1}\sum_{i=1}^{n} z_i^{v_2}\big/\left[(L_k/\hat\beta)^{\hat\delta}\right]^{v_2}} g_n(w)\,f_n(v_2\,|\,z^{(n)})\,dw\,dv_2 \\
&\quad = \int_0^{\infty}\int_0^{\ln(1-q_{1-\gamma})^{-1}\sum_{i=1}^{n} z_i^{v_2}\big/\left[(L_k/\hat\beta)^{\hat\delta}\right]^{v_2}}\frac{1}{\Gamma(n)}w^{n-1}e^{-w}\,\frac{1}{\vartheta(z^{(n)})}\,v_2^{n-2}\prod_{i=1}^{n} z_i^{v_2}\left(\sum_{i=1}^{n} z_i^{v_2}\right)^{-n} dw\,dv_2 = 1-\alpha. \qquad (11.34)
\end{aligned}$$

Hence,

$$L_k = \arg\left\{\int_0^{\infty}\int_0^{\ln(1-q_{1-\gamma})^{-1}\sum_{i=1}^{n} z_i^{v_2}\big/\left[(L_k/\hat\beta)^{\hat\delta}\right]^{v_2}}\frac{1}{\Gamma(n)}w^{n-1}e^{-w}\,\frac{1}{\vartheta(z^{(n)})}\,v_2^{n-2}\prod_{i=1}^{n} z_i^{v_2}\left(\sum_{i=1}^{n} z_i^{v_2}\right)^{-n} dw\,dv_2 = 1-\alpha\right\}. \qquad (11.35)$$
Assuming that

$$(L_k/\hat\beta)^{\hat\delta} = \eta_{L_k}, \qquad (11.36)$$

we obtain from (11.35) that

$$L_k = \eta_{L_k}^{1/\hat\delta}\,\hat\beta, \qquad (11.37)$$

where $\eta_{L_k}$ satisfies

$$\int_0^{\infty}\int_0^{\ln(1-q_{1-\gamma})^{-1}\sum_{i=1}^{n} z_i^{v_2}\big/\eta_{L_k}^{v_2}}\frac{1}{\Gamma(n)}w^{n-1}e^{-w}\,\frac{1}{\vartheta(z^{(n)})}\,v_2^{n-2}\prod_{i=1}^{n} z_i^{v_2}\left(\sum_{i=1}^{n} z_i^{v_2}\right)^{-n} dw\,dv_2 = 1-\alpha. \qquad (11.38)$$

This ends the proof.

Corollary 2.1. If k = 1, then $L_k = \eta_{L_k}^{1/\hat\delta}\,\hat\beta$, where the tolerance factor $\eta_{L_k}$ is determined from (11.38), in which $q_{1-\gamma}$ is a quantile of the beta-distribution (with k = 1) satisfying (11.32).
Corollary 2.2. If k = m = 1, then

$$L_k = \eta_{L_k}^{1/\hat\delta}\,\hat\beta, \qquad (11.39)$$

where

$$\eta_{L_k} = \arg\left\{\int_0^{\infty}\int_0^{\ln\gamma^{-1}\sum_{i=1}^{n} z_i^{v_2}\big/\eta_{L_k}^{v_2}}\frac{1}{\Gamma(n)}w^{n-1}e^{-w}\,\frac{1}{\vartheta(z^{(n)})}\,v_2^{n-2}\prod_{i=1}^{n} z_i^{v_2}\left(\sum_{i=1}^{n} z_i^{v_2}\right)^{-n} dw\,dv_2 = 1-\alpha\right\}. \qquad (11.40)$$
11.4 Upper statistical γ-content tolerance limit with expected (1 − α)-confidence

Theorem 3. An upper statistical γ-content tolerance limit with expected (1 − α)-confidence on future outcomes of the kth-order statistic $Y_k$, satisfying (11.2), is given by

$$U_k = \eta_{U_k}^{1/\hat\delta}\,\hat\beta, \qquad (11.41)$$

where

$$\eta_{U_k} = \arg\left\{\int_0^{\infty}\int_0^{\ln(1-q_{\gamma})^{-1}\sum_{i=1}^{n} z_i^{v_2}\big/\eta_{U_k}^{v_2}}\frac{1}{\Gamma(n)}w^{n-1}e^{-w}\,\frac{1}{\vartheta(z^{(n)})}\,v_2^{n-2}\prod_{i=1}^{n} z_i^{v_2}\left(\sum_{i=1}^{n} z_i^{v_2}\right)^{-n} dw\,dv_2 = \alpha\right\} \qquad (11.42)$$

is a tolerance factor; the maximum likelihood estimates $\hat\beta$ and $\hat\delta$ of the parameters β and δ are determined from (11.13) and (11.14), respectively; the ancillary statistics Zi, i = 1, …, n, are given by (11.19); and $q_\gamma$ is a quantile of the beta-distribution satisfying

$$\int_0^{q_\gamma}\varphi(t\,|\,k,\,m-k+1)\,dt = \int_0^{q_\gamma}\frac{1}{B(k,\,m-k+1)}\,t^{k-1}(1-t)^{m-k}\,dt = \gamma. \qquad (11.43)$$
Proof. Taking into account (11.2), (11.5), (11.27), and (11.28), the following probability transformation can be carried out:

$$\begin{aligned}
\Pr\left(\int_0^{U_k} g_\theta(y_k)\,dy_k \ge \gamma\right) &= \Pr\left(G_\theta(U_k) \ge \gamma\right) = \Pr\left(\int_0^{F_\theta(U_k)}\varphi(t\,|\,k,\,m-k+1)\,dt \ge \gamma\right) = \Pr\left(F_\theta(U_k) \ge q_\gamma\right) \\
&= \Pr\left(1-\exp\left[-(U_k/\beta)^{\delta}\right] \ge q_\gamma\right) = \Pr\left(\exp\left[-(U_k/\beta)^{\delta}\right] \le 1-q_\gamma\right) \\
&= \Pr\left((U_k/\beta)^{\delta} \ge \ln(1-q_\gamma)^{-1}\right) = \Pr\left(V_1\left[(U_k/\hat\beta)^{\hat\delta}\right]^{V_2} \ge \ln(1-q_\gamma)^{-1}\right) \\
&= \Pr\left(W \ge \frac{\ln(1-q_\gamma)^{-1}\sum_{i=1}^{n} Z_i^{V_2}}{\left[(U_k/\hat\beta)^{\hat\delta}\right]^{V_2}}\right), \qquad (11.44)
\end{aligned}$$

where $q_\gamma$ is the γ-quantile of the beta-distribution (11.6) with the shape parameters a = k and b = m − k + 1.
Using pivotal quantity averaging, it follows from (11.2) and (11.44) that
$$\begin{aligned}
&E\left\{\Pr\left(W \ge \frac{\ln(1-q_\gamma)^{-1}\sum_{i=1}^{n} Z_i^{V_2}}{\left[(U_k/\hat\beta)^{\hat\delta}\right]^{V_2}}\right)\right\} = E\left\{1 - \int_0^{\ln(1-q_\gamma)^{-1}\sum_{i=1}^{n} Z_i^{V_2}\big/\left[(U_k/\hat\beta)^{\hat\delta}\right]^{V_2}} g_n(w)\,dw\right\} \\
&\quad = 1 - \int_0^{\infty}\int_0^{\ln(1-q_\gamma)^{-1}\sum_{i=1}^{n} z_i^{v_2}\big/\left[(U_k/\hat\beta)^{\hat\delta}\right]^{v_2}} g_n(w)\,f_n(v_2\,|\,z^{(n)})\,dw\,dv_2 \\
&\quad = 1 - \int_0^{\infty}\int_0^{\ln(1-q_\gamma)^{-1}\sum_{i=1}^{n} z_i^{v_2}\big/\left[(U_k/\hat\beta)^{\hat\delta}\right]^{v_2}}\frac{1}{\Gamma(n)}w^{n-1}e^{-w}\,\frac{1}{\vartheta(z^{(n)})}\,v_2^{n-2}\prod_{i=1}^{n} z_i^{v_2}\left(\sum_{i=1}^{n} z_i^{v_2}\right)^{-n} dw\,dv_2 = 1-\alpha. \qquad (11.45)
\end{aligned}$$

Hence,

$$U_k = \arg\left\{\int_0^{\infty}\int_0^{\ln(1-q_\gamma)^{-1}\sum_{i=1}^{n} z_i^{v_2}\big/\left[(U_k/\hat\beta)^{\hat\delta}\right]^{v_2}}\frac{1}{\Gamma(n)}w^{n-1}e^{-w}\,\frac{1}{\vartheta(z^{(n)})}\,v_2^{n-2}\prod_{i=1}^{n} z_i^{v_2}\left(\sum_{i=1}^{n} z_i^{v_2}\right)^{-n} dw\,dv_2 = \alpha\right\}. \qquad (11.46)$$
Assuming that

$$(U_k/\hat\beta)^{\hat\delta} = \eta_{U_k}, \qquad (11.47)$$

we obtain from (11.46) that

$$U_k = \eta_{U_k}^{1/\hat\delta}\,\hat\beta, \qquad (11.48)$$

where $\eta_{U_k}$ satisfies

$$\int_0^{\infty}\int_0^{\ln(1-q_\gamma)^{-1}\sum_{i=1}^{n} z_i^{v_2}\big/\eta_{U_k}^{v_2}}\frac{1}{\Gamma(n)}w^{n-1}e^{-w}\,\frac{1}{\vartheta(z^{(n)})}\,v_2^{n-2}\prod_{i=1}^{n} z_i^{v_2}\left(\sum_{i=1}^{n} z_i^{v_2}\right)^{-n} dw\,dv_2 = \alpha, \qquad (11.49)$$

and $q_\gamma$ is a quantile of the beta-distribution satisfying

$$\int_0^{q_\gamma}\varphi(t\,|\,k,\,m-k+1)\,dt = \int_0^{q_\gamma}\frac{1}{B(k,\,m-k+1)}\,t^{k-1}(1-t)^{m-k}\,dt = \gamma. \qquad (11.50)$$

This ends the proof.
Corollary 3.1. If k = m = 1, then

$$U_k = \eta_{U_k}^{1/\hat\delta}\,\hat\beta, \qquad (11.51)$$

where

$$\eta_{U_k} = \arg\left\{\int_0^{\infty}\int_0^{\ln(1-\gamma)^{-1}\sum_{i=1}^{n} z_i^{v_2}\big/\eta_{U_k}^{v_2}}\frac{1}{\Gamma(n)}w^{n-1}e^{-w}\,\frac{1}{\vartheta(z^{(n)})}\,v_2^{n-2}\prod_{i=1}^{n} z_i^{v_2}\left(\sum_{i=1}^{n} z_i^{v_2}\right)^{-n} dw\,dv_2 = \alpha\right\}. \qquad (11.52)$$
Remark 1. It will be noted that an upper statistical γ-content tolerance limit with expected (1 − α)-confidence may be obtained from a lower statistical γ-content tolerance limit with expected (1 − α)-confidence by replacing 1 − α by α and 1 − γ by γ.
11.5 Lower statistical (1 − α)-expectation tolerance limit

Theorem 4. A lower statistical (1 − α)-expectation tolerance limit $L_k \equiv L_k(S)$ on future outcomes of the kth-order statistic $Y_k$, satisfying

$$E_\theta\left\{\Pr\left(Y_k > L_k(S)\right)\right\} = E_\theta\left\{\int_{L_k(S)}^{\infty} g_\theta(y_k)\,dy_k\right\} = 1-\alpha, \qquad (11.53)$$

is given by

$$L_k = \eta_{L_k}^{1/\hat\delta}\,\hat\beta, \qquad (11.54)$$

where the tolerance factor $\eta_{L_k}$ satisfies

$$\sum_{l=0}^{k-1}\binom{m}{l}\sum_{j=0}^{l}\binom{l}{j}(-1)^j\,\frac{1}{\vartheta(z^{(n)})}\int_0^{\infty}\frac{v_2^{n-2}\prod_{i=1}^{n} z_i^{v_2}}{\left[\sum_{i=1}^{n} z_i^{v_2} + \eta_{L_k}^{v_2}(m-l+j)\right]^{n}}\,dv_2 = 1-\alpha. \qquad (11.55)$$
Proof. Taking into account (11.53), (11.7), (11.27), and (11.28), the following probability transformation can be carried out:

$$\begin{aligned}
\Pr(Y_k > L_k) &= \int_{L_k}^{\infty} g_\theta(y_k)\,dy_k = \overline{G}_\theta(L_k) = \sum_{l=0}^{k-1}\binom{m}{l}[F_\theta(L_k)]^l[1-F_\theta(L_k)]^{m-l} \\
&= \sum_{l=0}^{k-1}\sum_{j=0}^{l}\binom{m}{l}\binom{l}{j}(-1)^j\left[1-F_\theta(L_k)\right]^{m-l+j} = \sum_{l=0}^{k-1}\sum_{j=0}^{l}\binom{m}{l}\binom{l}{j}(-1)^j\exp\left[-(m-l+j)\left(\frac{L_k}{\beta}\right)^{\delta}\right] \\
&= \sum_{l=0}^{k-1}\sum_{j=0}^{l}\binom{m}{l}\binom{l}{j}(-1)^j\exp\left[-V_1\left[(L_k/\hat\beta)^{\hat\delta}\right]^{V_2}(m-l+j)\right] \\
&= \sum_{l=0}^{k-1}\sum_{j=0}^{l}\binom{m}{l}\binom{l}{j}(-1)^j\exp\left[-W\left[(L_k/\hat\beta)^{\hat\delta}\right]^{V_2}(m-l+j)\left(\sum_{i=1}^{n} Z_i^{V_2}\right)^{-1}\right]. \qquad (11.56)
\end{aligned}$$
Using pivotal quantity averaging, it follows from (11.22), (11.27), (11.28), (11.53), and (11.56) that

$$\begin{aligned}
E_\theta\left\{\Pr(Y_k > L_k(S))\right\} &= E\left\{\sum_{l=0}^{k-1}\sum_{j=0}^{l}\binom{m}{l}\binom{l}{j}(-1)^j\exp\left[-W\left[(L_k/\hat\beta)^{\hat\delta}\right]^{V_2}(m-l+j)\left(\sum_{i=1}^{n} Z_i^{V_2}\right)^{-1}\right]\right\} \\
&= \sum_{l=0}^{k-1}\sum_{j=0}^{l}\binom{m}{l}\binom{l}{j}(-1)^j\int_0^{\infty}\int_0^{\infty}\exp\left[-w\left[(L_k/\hat\beta)^{\hat\delta}\right]^{v_2}(m-l+j)\left(\sum_{i=1}^{n} z_i^{v_2}\right)^{-1}\right] g_n(w)\,f_n(v_2\,|\,z^{(n)})\,dw\,dv_2 \\
&= \sum_{l=0}^{k-1}\sum_{j=0}^{l}\binom{m}{l}\binom{l}{j}(-1)^j\,\frac{1}{\vartheta(z^{(n)})}\int_0^{\infty}\frac{v_2^{n-2}\prod_{i=1}^{n} z_i^{v_2}}{\left[\sum_{i=1}^{n} z_i^{v_2} + \left[(L_k/\hat\beta)^{\hat\delta}\right]^{v_2}(m-l+j)\right]^{n}}\,dv_2, \qquad (11.57)
\end{aligned}$$
where

$$L_k = \arg\left\{\sum_{l=0}^{k-1}\sum_{j=0}^{l}\binom{m}{l}\binom{l}{j}(-1)^j\,\frac{1}{\vartheta(z^{(n)})}\int_0^{\infty}\frac{v_2^{n-2}\prod_{i=1}^{n} z_i^{v_2}}{\left[\sum_{i=1}^{n} z_i^{v_2} + \left[(L_k/\hat\beta)^{\hat\delta}\right]^{v_2}(m-l+j)\right]^{n}}\,dv_2 = 1-\alpha\right\}. \qquad (11.58)$$

Assuming that

$$(L_k/\hat\beta)^{\hat\delta} = \eta_{L_k}, \qquad (11.59)$$

we obtain

$$L_k = \eta_{L_k}^{1/\hat\delta}\,\hat\beta, \qquad (11.60)$$

where the tolerance factor $\eta_{L_k}$ satisfies (11.55). This ends the proof. In particular, if k = 1, then $\eta_{L_k}$ satisfies

$$\frac{1}{\vartheta(z^{(n)})}\int_0^{\infty}\frac{v_2^{n-2}\prod_{i=1}^{n} z_i^{v_2}}{\left[\sum_{i=1}^{n} z_i^{v_2} + \eta_{L_k}^{v_2}\,m\right]^{n}}\,dv_2 = 1-\alpha. \qquad (11.61)$$
If k = m = 1, then $\eta_{L_k}$ satisfies

$$\frac{1}{\vartheta(z^{(n)})}\int_0^{\infty}\frac{v_2^{n-2}\prod_{i=1}^{n} z_i^{v_2}}{\left[\sum_{i=1}^{n} z_i^{v_2} + \eta_{L_k}^{v_2}\right]^{n}}\,dv_2 = 1-\alpha. \qquad (11.63)$$
11.6 Upper statistical (1 − α)-expectation tolerance limit

Theorem 5. An upper statistical (1 − α)-expectation tolerance limit $U_k \equiv U_k(S)$ on future outcomes of the kth-order statistic $Y_k$, satisfying

$$E_\theta\left\{\Pr\left(Y_k \le U_k(S)\right)\right\} = E_\theta\left\{\int_0^{U_k(S)} g_\theta(y_k)\,dy_k\right\} = 1-\alpha, \qquad (11.64)$$

is given by

$$U_k = \eta_{U_k}^{1/\hat\delta}\,\hat\beta, \qquad (11.65)$$

where the tolerance factor $\eta_{U_k}$ satisfies

$$\sum_{l=0}^{k-1}\binom{m}{l}\sum_{j=0}^{l}\binom{l}{j}(-1)^j\,\frac{1}{\vartheta(z^{(n)})}\int_0^{\infty}\frac{v_2^{n-2}\prod_{i=1}^{n} z_i^{v_2}}{\left[\sum_{i=1}^{n} z_i^{v_2} + \eta_{U_k}^{v_2}(m-l+j)\right]^{n}}\,dv_2 = \alpha. \qquad (11.66)$$
Proof. Taking into account (11.64), (11.7), (11.27), and (11.28), the following probability transformation can be carried out:

$$\begin{aligned}
\Pr(Y_k \le U_k) &= 1 - \int_{U_k}^{\infty} g_\theta(y_k)\,dy_k = 1 - \overline{G}_\theta(U_k) = 1 - \sum_{l=0}^{k-1}\binom{m}{l}[F_\theta(U_k)]^l[1-F_\theta(U_k)]^{m-l} \\
&= 1 - \sum_{l=0}^{k-1}\sum_{j=0}^{l}\binom{m}{l}\binom{l}{j}(-1)^j\left[1-F_\theta(U_k)\right]^{m-l+j} = 1 - \sum_{l=0}^{k-1}\sum_{j=0}^{l}\binom{m}{l}\binom{l}{j}(-1)^j\exp\left[-(m-l+j)\left(\frac{U_k}{\beta}\right)^{\delta}\right] \\
&= 1 - \sum_{l=0}^{k-1}\sum_{j=0}^{l}\binom{m}{l}\binom{l}{j}(-1)^j\exp\left[-V_1\left[(U_k/\hat\beta)^{\hat\delta}\right]^{V_2}(m-l+j)\right] \\
&= 1 - \sum_{l=0}^{k-1}\sum_{j=0}^{l}\binom{m}{l}\binom{l}{j}(-1)^j\exp\left[-W\left[(U_k/\hat\beta)^{\hat\delta}\right]^{V_2}(m-l+j)\left(\sum_{i=1}^{n} Z_i^{V_2}\right)^{-1}\right]. \qquad (11.67)
\end{aligned}$$
Using pivotal quantity averaging, it follows from (11.22), (11.27), (11.28), (11.64), and
(11.67) that
$$\begin{aligned}
E_\theta\left\{\Pr(Y_k \le U_k(S))\right\} &= E\left\{1 - \sum_{l=0}^{k-1}\sum_{j=0}^{l}\binom{m}{l}\binom{l}{j}(-1)^j\exp\left[-W\left[(U_k/\hat\beta)^{\hat\delta}\right]^{V_2}(m-l+j)\left(\sum_{i=1}^{n} Z_i^{V_2}\right)^{-1}\right]\right\} \\
&= 1 - \sum_{l=0}^{k-1}\sum_{j=0}^{l}\binom{m}{l}\binom{l}{j}(-1)^j\int_0^{\infty}\int_0^{\infty}\exp\left[-w\left[(U_k/\hat\beta)^{\hat\delta}\right]^{v_2}(m-l+j)\left(\sum_{i=1}^{n} z_i^{v_2}\right)^{-1}\right] g_n(w)\,f_n(v_2\,|\,z^{(n)})\,dw\,dv_2 \\
&= 1 - \sum_{l=0}^{k-1}\sum_{j=0}^{l}\binom{m}{l}\binom{l}{j}(-1)^j\,\frac{1}{\vartheta(z^{(n)})}\int_0^{\infty}\frac{v_2^{n-2}\prod_{i=1}^{n} z_i^{v_2}}{\left[\sum_{i=1}^{n} z_i^{v_2} + \left[(U_k/\hat\beta)^{\hat\delta}\right]^{v_2}(m-l+j)\right]^{n}}\,dv_2, \qquad (11.68)
\end{aligned}$$

where

$$U_k = \arg\left\{\sum_{l=0}^{k-1}\sum_{j=0}^{l}\binom{m}{l}\binom{l}{j}(-1)^j\,\frac{1}{\vartheta(z^{(n)})}\int_0^{\infty}\frac{v_2^{n-2}\prod_{i=1}^{n} z_i^{v_2}}{\left[\sum_{i=1}^{n} z_i^{v_2} + \left[(U_k/\hat\beta)^{\hat\delta}\right]^{v_2}(m-l+j)\right]^{n}}\,dv_2 = \alpha\right\}. \qquad (11.69)$$
Assuming that

$$(U_k/\hat\beta)^{\hat\delta} = \eta_{U_k}, \qquad (11.70)$$

we obtain

$$U_k = \eta_{U_k}^{1/\hat\delta}\,\hat\beta, \qquad (11.71)$$

where the tolerance factor $\eta_{U_k}$ satisfies (11.66). This ends the proof. In particular, if k = 1, then $\eta_{U_k}$ satisfies

$$\frac{1}{\vartheta(z^{(n)})}\int_0^{\infty}\frac{v_2^{n-2}\prod_{i=1}^{n} z_i^{v_2}}{\left[\sum_{i=1}^{n} z_i^{v_2} + \eta_{U_k}^{v_2}\,m\right]^{n}}\,dv_2 = \alpha, \qquad (11.72)$$

and if k = m = 1, then $\eta_{U_k}$ satisfies

$$\frac{1}{\vartheta(z^{(n)})}\int_0^{\infty}\frac{v_2^{n-2}\prod_{i=1}^{n} z_i^{v_2}}{\left[\sum_{i=1}^{n} z_i^{v_2} + \eta_{U_k}^{v_2}\right]^{n}}\,dv_2 = \alpha. \qquad (11.74)$$
11.7 Numerical example 1
Consider the data in an example discussed by Mann and Saunders (1969). They regard
the data coming from the Weibull distribution as the results of full-scale fatigue tests on
a particular type of component. The data are for a complete sample of size n = 3, with
observations X1 = 45.952, X2 = 54.143, and X3 = 65.440, results being expressed here in num-
ber of thousands of cycles. On the basis of these data, it is wished to obtain the lower
(1 − α)-expectation tolerance limit for the minimum (Y1) of independent lifetimes in a
group of m = 500 components which are to be put into service.
The maximum likelihood estimates of the unknown parameters δ and β, computed on the basis of (X1, X2, X3), are $\hat\delta$ = 7.726 and $\hat\beta$ = 58.706, respectively. Taking 1 − α = 0.8 and k = 1, with n = 3 and m = 500, we have from (11.60) that the statistical lower (1 − α)-expectation tolerance limit, $L_k \equiv L_k(S)$, for the minimum (Y1) of independent lifetimes in a group of m = 500 components which are to be put into service, is given by

$$L_k = \eta_{L_k}^{1/\hat\delta}\,\hat\beta = 5.527411, \qquad (11.75)$$

where

$$\eta_{L_k} = \arg\left\{\frac{1}{\vartheta(z^{(n)})}\int_0^{\infty}\frac{v_2^{n-2}\prod_{i=1}^{n} z_i^{v_2}}{\left[\sum_{i=1}^{n} z_i^{v_2} + \eta_{L_k}^{v_2}\,m\right]^{n}}\,dv_2 = 1-\alpha\right\} = 1.18\times 10^{-8}. \qquad (11.76)$$
Lawless (1973) obtained for this example (via conditional approach in terms of a Gumbel
distribution) the lower 80% prediction limit of 5.623, which is slightly larger than (11.75).
The resulting lower 80% prediction limit of Mee and Kushary (1994) for this example
(obtained via simulation) was 5.225, which is slightly smaller than (11.75). The Mann and
Saunders (1969) result for this example was only 0.766.
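The back-transformation in (11.75) from the tolerance factor to the limit can be verified directly; a one-line sketch using the values reported above ($\eta_{L_k}$ = 1.18 × 10⁻⁸, $\hat\delta$ = 7.726, $\hat\beta$ = 58.706):

```python
eta = 1.18e-8
delta_hat, beta_hat = 7.726, 58.706

L_k = eta ** (1.0 / delta_hat) * beta_hat   # equation (11.60) with the tolerance factor
print(round(L_k, 3))  # about 5.527, matching (11.75)
```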
Taking γ = 0.8, 1 − α = 0.8, and k = 1, with n = 3 and m = 500, we have from (11.37) that a lower statistical γ-content tolerance limit, $L_k \equiv L_k(S)$, with expected (1 − α)-confidence for the minimum (Y1) of independent lifetimes in a group of m = 500 components which are to be put into service, is

$$L_k = \eta_{L_k}^{1/\hat\delta}\,\hat\beta = 4.082282, \qquad (11.77)$$

where

$$\eta_{L_k} = \arg\left\{\int_0^{\infty}\int_0^{\ln(1-q_{1-\gamma})^{-1}\sum_{i=1}^{n} z_i^{v_2}\big/\eta_{L_k}^{v_2}}\frac{1}{\Gamma(n)}w^{n-1}e^{-w}\,\frac{1}{\vartheta(z^{(n)})}\,v_2^{n-2}\prod_{i=1}^{n} z_i^{v_2}\left(\sum_{i=1}^{n} z_i^{v_2}\right)^{-n} dw\,dv_2 = 1-\alpha\right\}. \qquad (11.78)$$
11.8 Numerical example 2
To investigate the performance of a logic circuit for a small electronic calculator, a circuit
manufacturer puts n = 5 of the circuits on life test without replacement under specified
environmental conditions, and the failures are observed after X1 = 830, X2 = 1020, X3 = 1175,
X4 = 1424, and X5 = 1603 hours. A buyer tells the circuit manufacturer that he wants to
place three orders (l = 3) for the same type of logic circuits to be shipped to three different
destinations. The buyer wants to select a random sample of q = 5 logic circuits from each
shipment to be tested. An order is accepted only if all of five logic circuits in each selected
sample meet the warranty lifetime. What warranty lifetime should the manufacturer offer
so that all of five logic circuits in each selected sample meet the warranty with probability
of 0.95?
In order to find this warranty lifetime, the manufacturer wishes to use a random sam-
ple of size n = 5 given above and to calculate the lower statistical simultaneous tolerance
limit Lk=1(S) (warranty lifetime) which is expected to capture a certain proportion, say,
γ = 0.975 or more of the population of selected items (m = lq = 15), with a given confidence
level 1 − α = 0.95. This lower statistical simultaneous tolerance limit is such that one can say with a certain confidence 1 − α that at least 100γ% of the logic circuits in each sample selected by the buyer for testing will operate longer than L1(S).
Goodness-of-fit testing. Let us assume that (X1, …, Xn) is a random sample from the two-parameter Weibull distribution (11.11), and let $\hat\beta$, $\hat\delta$ be the maximum likelihood estimates of β, δ, respectively, computed on the basis of (X1, …, Xn):

$$\hat\delta = \left(\frac{\sum_{i=1}^{n} x_i^{\hat\delta}\ln x_i}{\sum_{i=1}^{n} x_i^{\hat\delta}} - \frac{1}{n}\sum_{i=1}^{n}\ln x_i\right)^{-1} = 4.977351 \qquad (11.79)$$

and

$$\hat\beta = \left(\sum_{i=1}^{n} x_i^{\hat\delta}\Big/ n\right)^{1/\hat\delta} = 1321.323. \qquad (11.80)$$
We assess the statistical significance of departures from the Weibull model (11.11) by per-
forming the Anderson–Darling goodness-of-fit test. Among the many goodness-of-fit tests
available (e.g., Kolmogorov–Smirnov), the Anderson–Darling test is more sensitive to
deviations in the tails of a distribution than the older Kolmogorov–Smirnov test. The
Anderson–Darling test statistic is determined by (e.g., D'Agostino and Stephens 1986)
$$A^2 = -\sum_{i=1}^{n}\frac{(2i-1)\left[\ln F_\theta(x_i) + \ln\left(1 - F_\theta(x_{n+1-i})\right)\right]}{n} - n, \quad (11.81)$$

where n = 5 is the number of observations. The statistic from (11.81) needs to be modified for
small sample sizes. For the Weibull distribution, the modification of $A^2$ is
$$A_{mod}^2 = A^2\left(1 + \frac{0.2}{\sqrt{n}}\right). \quad (11.83)$$
The $A_{mod}^2$ value must then be compared with a critical value $A_\alpha^2$, which depends on the sig-
nificance level α and the distribution type. For the Weibull distribution, the computed
$A_{mod}^2$ value has to be less than the critical value $A_\alpha^2$ for the goodness of fit to be
accepted. For this example, α = 0.05 and $A_{\alpha=0.05}^2 = 0.757$:
$$A^2 = -\sum_{i=1}^{5}\frac{(2i-1)\left[\ln F_\theta(x_i) + \ln\left(1 - F_\theta(x_{6-i})\right)\right]}{5} - 5 = 0.202335, \quad (11.84)$$
$$A_{mod}^2 = A^2\left(1 + \frac{0.2}{\sqrt{5}}\right) = 0.220432 < A_{\alpha=0.05}^2 = 0.757. \quad (11.85)$$
Since the test statistic is less than the critical value, we do not reject the null hypothesis at
the significance level α = 0.05. Thus, there is no evidence to rule out the Weibull lifetime
model (11.11).
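Under the assumption that the fitted CDF $F_\theta$ is the Weibull CDF with the MLEs quoted above, the test computation in (11.81) through (11.85) can be reproduced as follows:

```python
import math

# Reproducing (11.81)-(11.85) with the MLEs quoted in the text.
delta_hat, beta_hat = 4.977351, 1321.323
x = sorted([830, 1020, 1175, 1424, 1603])
n = len(x)

def F(t):  # fitted Weibull CDF, Equation (11.11)
    return 1.0 - math.exp(-((t / beta_hat) ** delta_hat))

A2 = -sum((2 * i - 1) * (math.log(F(x[i - 1])) + math.log(1.0 - F(x[n - i])))
          for i in range(1, n + 1)) / n - n
A2_mod = A2 * (1.0 + 0.2 / math.sqrt(n))
A2_crit = 0.757                      # critical value for alpha = 0.05
print(round(A2, 4), round(A2_mod, 4), A2_mod < A2_crit)
```

The printed statistics agree with (11.84) and (11.85) to the precision allowed by the rounded MLEs.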
Now the lower one-sided simultaneous γ-content tolerance limit at the confidence level
1 − α, L1 ≡ L1 (S) (on the order statistic Y1 from a set of m = 15 future ordered observations
Y1 ≤ … ≤ Ym) is given by (11.37)
$$L_1 = \hat\beta\,\eta_{L_1}^{1/\hat\delta} = 328.7676 \cong 329, \quad (11.86)$$

where the tolerance factor

$$\eta_{L_1} = 0.0009842 \quad (11.87)$$

is obtained by solving (11.37) numerically for the confidence level 1 − α = 0.95, with γ = 0.975, n = 5, and q = 5.
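Given the tolerance factor, the warranty lifetime in (11.86) is a one-line computation:

```python
# The warranty lifetime of Equation (11.86), given the tolerance factor
# eta from (11.87) and the MLEs computed earlier.
delta_hat, beta_hat = 4.977351, 1321.323
eta_L1 = 0.0009842
L1 = beta_hat * eta_L1 ** (1.0 / delta_hat)
print(round(L1, 4))  # approx. 328.77, i.e., a warranty lifetime of about 329 hours
```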
11.9 Conclusion
The new technique (based on probability transformation and pivotal quantity averag-
ing) given and illustrated in this chapter is offered as a conceptually simple, efficient,
and useful method for constructing exact statistical tolerance limits on future outcomes
under parametric uncertainty of underlying models. It is based also on the idea of invari-
ant embedding of a sufficient statistic in the underlying model in order to construct piv-
otal quantities and to eliminate the unknown parameters from the problem via pivotal
quantity averaging. Using the proposed technique, the exact statistical tolerance limits on
future order statistics (under parametric uncertainty of underlying models) associated
with sampling from corresponding distributions can be found easily and quickly, making
tables, simulation, and special computer programs unnecessary.
We consider the one-sided statistical tolerance limits defined as follows: (1) a one-sided
statistical tolerance limit that covers at least 100γ% of the measurements with expected
100(1 − α)% confidence, and (2) a one-sided statistical tolerance limit determined so that the
expected proportion of the measurements covered by this limit is (1 − α). Such tolerance
limits are required, for example, when planning life tests: engineers may need to predict
the number of failures that will occur by the end of the test or to predict the amount of
time that it will take for a specified number of units to fail. The methodology described
in this chapter is illustrated for the two-parameter Weibull distribution. Applications to
other log-location-scale distributions could follow directly. Finally, we give two numerical
examples.
It should be noted that the results obtained in this chapter (Sections 11.3–11.8) via the
proposed technique are new.
References
D’Agostino, R.B. and M.A. Stephens, Goodness-of-Fit Techniques. New York: Marcel Dekker, 1986.
Dunsmore, J.R., “Some approximations for tolerance factors for the two parameter exponential dis-
tribution,” Technometrics, vol. 20, pp. 317–318, 1978.
Engelhardt, M. and L.J. Bain, “Tolerance limits and confidence limits on reliability for the two-
parameter exponential distribution,” Technometrics, vol. 20, pp. 37–39, 1978.
Guenther, W.C., “Tolerance intervals for univariate distributions,” Naval Research Logistics Quarterly,
vol. 19, pp. 309–333, 1972.
Guenther, W.C., S.A. Patil and V.R.R. Uppuluri, “One-sided β-content tolerance factors for the two
parameter exponential distribution,” Technometrics, vol. 18, pp. 333–340, 1976.
Guttman, I., “On the power of optimum tolerance regions when sampling from normal distribu-
tions,” Annals of Mathematical Statistics, vol. XXVIII, pp. 773–778, 1957.
Hahn, G.J. and W.Q. Meeker, Statistical Intervals: A Guide for Practitioners. New York: John Wiley &
Sons, 1991.
Lawless, J.F., “On estimation of the safe life when the underlying life distribution is Weibull,”
Technometrics, vol. 15, 857–865, 1973.
Mann, N.R. and S.C. Saunders, “On evaluation of warranty assurance when life has a Weibull dis-
tribution,” Biometrika, vol. 56, pp. 615–625, 1969.
Mee, R.W. and D. Kushary, “Prediction limits for the Weibull distribution utilizing simulation,”
Computational Statistics & Data Analysis, vol. 17, 327–336, 1994.
Mendenhall, V. “A bibliography on life testing and related topics,” Biometrika, vol. XLV, pp. 521–543,
1958.
Nechval, N.A. and E.K. Vasermanis, Improved Decisions in Statistics. Riga: Izglitibas soli, 2004.
Nechval, N.A., G. Berzins, M. Purgailis and K.N. Nechval, “Improved estimation of state of stochas-
tic systems via invariant embedding technique,” WSEAS Transactions on Mathematics, vol. 7,
pp. 141–159, 2008.
Nechval, N.A., M. Purgailis, G. Berzins, K. Cikste, J. Krasts and K.N. Nechval, “Invariant embed-
ding technique and its applications for improvement or optimization of statistical decisions,”
in Al-Begain, K., Fiems, D., Knottenbelt, W. (Eds.), Analytical and Stochastic Modeling Techniques
and Applications, (LNCS) (vol. 6148, pp. 306–320). Berlin: Springer-Verlag, 2010.
Nechval, N.A., K.N. Nechval and M. Purgailis, “Statistical inferences for future outcomes with appli-
cations to maintenance and reliability,” in Lecture Notes in Engineering and Computer Science:
Proceedings of the World Congress on Engineering, WCE 2011, 6–8 July, 2011 (pp. 865–871). London,
UK, 2011.
Nechval, N.A. and K.N. Nechval, "Tolerance limits on order statistics in future samples coming
from the two-parameter exponential distribution," American Journal of Theoretical and Applied
Statistics, vol. 5, pp. 1–6, 2016a.
Nechval, N.A., K.N. Nechval, S.P. Prisyazhnyuk and V.F. Strelchonok, “Tolerance limits on order sta-
tistics in future samples coming from the Pareto distribution,” Automatic Control and Computer
Sciences, vol. 50, pp. 423–431, 2016b.
Nechval, N.A., K.N. Nechval and V.F. Strelchonok, “A new approach to constructing tolerance limits
on order statistics in future samples coming from a normal distribution,” Advances in Image and
Video Processing (AIVP), vol. 4, pp. 47–61, 2016c.
Patel, J.K., “Tolerance limits: A review,” Communications in Statistics: Theory and Methodology, vol. 15,
pp. 2719–2762, 1986.
Wald, A. and J. Wolfowitz, “Tolerance limits for a normal distribution,” Annals of Mathematical
Statistics, vol. XVII, pp. 208–215, 1946.
Wallis, W.A., “Tolerance intervals for linear regression,” in Neyman, J. (Ed.), Second Berkeley Symposium on
Mathematical Statistics and Probability (pp. 43–51) Berkeley: University of California Press, 1951.
chapter twelve
Design of neural network–based PID controller
Contents
12.1 Introduction......................................................................................................................... 227
12.2 Kinematics and dynamics of the biped robot................................................................. 229
12.2.1 Dynamic balance margin while ascending the staircase................................. 230
12.2.2 Dynamic balance margin while descending the staircase............................... 231
12.2.3 Design of torque-based PID controllers for the biped robot............................ 232
12.3 MCIWO-based PID controller...........................................................................................234
12.4 MCIWO-NN–based PID controller.................................................................................. 236
12.5 Results and discussion....................................................................................................... 238
12.5.1 Ascending the staircase......................................................................................... 238
12.5.2 Descending the staircase....................................................................................... 241
12.6 Conclusions.......................................................................................................................... 245
References...................................................................................................................................... 245
12.1 Introduction
Compared to industrial manipulators, legged robots have much more interaction
with the ground, and it is a tough job to control such a robot in an effective manner. Further,
the mechanism, structure, and balancing of a two-legged robot are more complex than
those of other legged robots. Over the past few decades, researchers have been working
on the stability and control aspects of the biped robot on various terrains. Generating
a stable gait for the biped robot while walking on various terrains is a difficult task
and has been taken up by many researchers. Presently, researchers are utilizing zero moment
point (ZMP) [1] based control algorithms to control the gait of the two-legged robot. Other
researchers have tried to optimize the parameters of the ZMP-based controller by
utilizing nontraditional algorithms [2,3]. It is important to note that conventional PID
controllers are widely deployed in both industrial and nonindustrial applications
due to their simple design, ease of use, and cost effectiveness. Given the demand for PID
controllers in various applications, the real-time tuning/adaptation of the controller
gains (i.e., Kp, Kd, and Ki) in an online manner is a challenging task. Moreover, tuning
methods such as the Ziegler–Nichols [4] and Cohen [5] methods have already been shown
to be unsuitable for highly nonlinear, uncertain, and coupled robotic applications.
• In this research work, the authors introduce a torque-based PID controller to control
each joint of the biped robot in a systematic manner and to reduce the error between
two consecutive intervals of the various joints.
• Optimal tuning of the PID controller is performed using the MCIWO algorithm, instead
of the time-consuming manual tuning process. A NN tool has also been developed
to tune the gains of the PID controller.
• Further, the authors introduce cosine and chaotic variables into the standard
invasive weed optimization algorithm (i.e., modified chaotic invasive weed
optimization, MCIWO) to evolve the structure of the NN automatically and to
generate the gains in an adaptive manner. To the best of the authors' knowledge,
no researchers have used the MCIWO algorithm to evolve the structure of a
neural network in control applications.
where H1 = l4 cos(θ4) + l3 cos(θ3), L1 = l4 sin(θ4) + l3 sin(θ3), and ψ = θ4 − θ3 = arccos((H1² + L1² −
l4² − l3²)/(2l4l3)). The angle θ3 can be calculated by using the equation θ3 = θ4 − ψ.
where H2 = l9 cos(θ9) + l10 cos(θ10), L2 = l9 sin(θ9) + l10 sin(θ10), and ψ = θ10 − θ9. The angle θ9 can be
calculated by using θ9 = θ10 − ψ.
Further, the following mathematical expressions are used to calculate the joint angles
in the frontal plane:
$$\theta_2 = \theta_8 = \tan^{-1}\left(\frac{f_w}{H_1}\right) \quad (12.3)$$

$$\theta_5 = \theta_{11} = \tan^{-1}\left(\frac{0.5 f_w}{H_2}\right) \quad (12.4)$$
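A numerical sketch of these joint-angle computations is given below; all link lengths, the hip–ankle offsets H1, L1, H2, the foot-width parameter fw, and the known angle θ4 are hypothetical values chosen only to illustrate Equations (12.1) through (12.4).

```python
import math

# A numerical sketch of Equations (12.1)-(12.4).  Every constant below is a
# hypothetical illustration value, not a dimension of the chapter's robot.
l3, l4 = 0.093, 0.093        # shank and thigh link lengths (m), assumed
H1, L1 = 0.170, 0.040        # vertical/horizontal offsets (m), assumed
H2 = 0.170                   # offset used in Equation (12.4) (m), assumed
fw = 0.030                   # foot-width parameter (m), assumed

# psi = theta4 - theta3 from the law of cosines (Equations 12.1/12.2)
cos_psi = (H1**2 + L1**2 - l4**2 - l3**2) / (2.0 * l4 * l3)
psi = math.acos(max(-1.0, min(1.0, cos_psi)))   # clamp against rounding

theta4 = math.radians(60.0)  # assumed known joint angle
theta3 = theta4 - psi        # sagittal-plane angle

# Frontal-plane joint angles, Equations (12.3) and (12.4)
theta2 = theta8 = math.atan(fw / H1)
theta5 = theta11 = math.atan(0.5 * fw / H2)
print(math.degrees(psi), math.degrees(theta3), math.degrees(theta2))
```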
Figure 12.2 Biped robot walking on ascending the staircase. (a) Sagittal view and (b) frontal view.
Figure 12.3 ZMP and DBM in both sagittal and frontal planes.
$$x_{ZMP} = \frac{\displaystyle\sum_{i=1}^{n}\left(I_i\dot\omega_i - m_i\ddot x_i z_i + m_i x_i\left(g - \ddot z_i\right)\right)}{\displaystyle\sum_{i=1}^{n} m_i\left(\ddot z_i - g\right)} \quad (12.5)$$

$$y_{ZMP} = \frac{\displaystyle\sum_{i=1}^{n}\left(I_i\dot\omega_i - m_i\ddot y_i z_i + m_i y_i\left(g - \ddot z_i\right)\right)}{\displaystyle\sum_{i=1}^{n} m_i\left(\ddot z_i - g\right)} \quad (12.6)$$
where ω̇i, Ii, and mi represent the angular acceleration (rad/s²), mass moment of inertia
(kg m²), and mass (kg) of link i, g is the acceleration due to gravity (m/s²), z̈i, ẍi, and ÿi
denote the accelerations (m/s²) of the link in the z-, x-, and y-directions, respectively, and (xi, yi, zi)
indicates the coordinates of the ith lumped mass.
After determining the position of the ZMP, the dynamic balance margin (DBM) of the
biped robot in the X- and Y-directions (Figure 12.3) is calculated by using Equations (12.7)
and (12.8), respectively. It is important to note that for generating a dynamically bal-
anced gait, the ZMP should lie inside the foot support polygon. If the ZMP moves outside
the polygon, the links of the biped robot need to be moved in such a way that they push
the ZMP back inside the foot support polygon. Further, the dynamic balance margin is
defined as the distance between the point where the ZMP is acting and the end of the
foot support polygon:
$$x_{DBM} = \frac{f_s}{2} - x_{ZMP} \quad (12.7)$$

$$y_{DBM} = \frac{f_w}{2} - y_{ZMP} \quad (12.8)$$
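The ZMP and DBM computations above can be sketched as follows; all masses, inertias, positions, accelerations, and foot dimensions are hypothetical illustration values, and the signs follow Equations (12.5) through (12.8) exactly as printed.

```python
# A numerical sketch of the ZMP (12.5)-(12.6) and DBM (12.7)-(12.8)
# computations for a small set of lumped masses.  All numbers are
# hypothetical illustration values.
g = 9.81  # acceleration due to gravity (m/s^2)

# (I_i, domega_i, m_i, x_i, y_i, z_i, ddx_i, ddy_i, ddz_i) per lumped mass
links = [
    (0.002, 0.5, 0.3, 0.05, 0.02, 0.10, 0.2, 0.1, 0.05),
    (0.004, 0.3, 0.5, 0.08, 0.01, 0.25, 0.1, 0.0, 0.02),
    (0.003, 0.1, 0.4, 0.03, 0.03, 0.40, 0.3, 0.2, 0.01),
]

den = sum(m * (ddz - g) for (_, _, m, _, _, _, _, _, ddz) in links)
x_zmp = sum(I * dw - m * ddx * z + m * x * (g - ddz)
            for (I, dw, m, x, _, z, ddx, _, ddz) in links) / den
y_zmp = sum(I * dw - m * ddy * z + m * y * (g - ddz)
            for (I, dw, m, _, y, z, _, ddy, ddz) in links) / den

fs, fw = 0.16, 0.08            # foot support length and width (m), assumed
x_dbm = fs / 2.0 - x_zmp       # Equation (12.7)
y_dbm = fw / 2.0 - y_zmp       # Equation (12.8)
print(x_zmp, y_zmp, x_dbm, y_dbm)
```

A positive DBM in both directions indicates that the ZMP lies inside the foot support polygon, i.e., a dynamically balanced posture under this sign convention.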
Figure 12.4 Biped robot walking on descending the staircase. (a) Sagittal view and (b) frontal view.
similar procedure as that of the ascending case. However, there is a small difference in the
descending case, with the acceleration due to gravity "g" acting in the direction opposite
to that of the movement of the robot.
$$\tau_{i,the} = \sum_{j=1}^{n} M_{ij}(q)\,\ddot q_j + \sum_{j=1}^{n}\sum_{k=1}^{n} C_{ijk}\,\dot q_j \dot q_k + G_i, \quad i,j,k = 1,2,\ldots,n \quad (12.9)$$
where τi,the, qj, q̇j, and q̈j represent the theoretical torque required, the displacement of the
joint, the velocity of the joint, and the acceleration of the joint, respectively. Further, the
expanded forms of the inertia (Mij), centrifugal/Coriolis (hijk), and gravity (Gi) terms are as
follows:
$$M_{ij} = \sum_{p=\max(i,j)}^{n} \mathrm{Tr}\left(d_{pj}\, I_p\, d_{pi}^{T}\right), \quad i,j = 1,2,\ldots,n \quad (12.10)$$
$$h_{ijk} = \sum_{p=\max(i,j,k)}^{n} \mathrm{Tr}\left(\frac{\partial d_{pk}}{\partial q_p}\, I_p\, d_{pi}^{T}\right), \quad i,j,k = 1,2,\ldots,n \quad (12.11)$$
$$G_i = -\sum_{p=i}^{n} m_p\, g\, d_{pi}\, \bar r_p, \quad i = 1,2,\ldots,n \quad (12.12)$$
where r̄p, Ip, and g denote the mass center (m), the mass moment of inertia tensor (kg m²) of
the pth link, and the acceleration due to gravity (m/s²), respectively. It is important to note that
the acceleration of the joint plays a significant role in controlling each link of the biped
robot. By rearranging Equation (12.9), the expression for the acceleration of link i is
given as follows:
$$\ddot q_j = \left[\sum_{j=1}^{n} M_{ij}(q)\right]^{-1}\left(-\sum_{j=1}^{n}\sum_{k=1}^{n} C_{ijk}\,\dot q_j \dot q_k - G_i\right) + \left[\sum_{j=1}^{n} M_{ij}(q)\right]^{-1}\tau_{i,the}, \quad i,j,k = 1,2,\ldots,n \quad (12.13)$$
Now, considering the term

$$\left[\sum_{j=1}^{n} M_{ij}(q)\right]^{-1}\tau_{i,the} = \hat\tau, \quad (12.14)$$
$$\ddot q_j = \left[\sum_{j=1}^{n} M_{ij}(q)\right]^{-1}\left(-\sum_{j=1}^{n}\sum_{k=1}^{n} C_{ijk}\,\dot q_j \dot q_k - G_i\right) + \hat\tau, \quad i,j,k = 1,2,\ldots,n \quad (12.15)$$
In real-time applications, the theoretical torque and acceleration of each link are
not sufficient to estimate the actual torque and acceleration. For this reason, the actual
torque required at the different joints of the biped robot is calculated by using
the following PID expression:

$$\tau_{act} = K_p e + K_d \dot e + K_i \int e\, dt, \quad (12.16)$$

where τact represents the actual torques required at the different joints of the biped robot; the
terms Kp, Kd, and Ki denote the proportional, derivative, and integral gains of the PID control-
ler, respectively; and e indicates the value of the error (i.e., the difference between the desired and
actual values, e(θi) = θif − θis) related to each joint. After including the terms e and ė, the above
equation can be written as follows:

$$\tau_{act} = K_{pi}\left(\theta_{if} - \theta_{is}\right) - K_{di}\,\dot\theta_{is} + K_{ii}\int e\left(\theta_{is}\right) dt, \quad (12.17)$$
where θif and θis represent the final and initial angular positions at different joints of the
biped robot, respectively. Therefore, the final control equation that represents the accelera-
tion of the link is
$$\ddot q_j = \left[\sum_{j=1}^{n} M_{ij}(q)\right]^{-1}\left(-\sum_{j=1}^{n}\sum_{k=1}^{n} C_{ijk}\,\dot q_j \dot q_k - G_i\right) + K_{pi}\left(\theta_{if} - \theta_{is}\right) - K_{di}\,\dot\theta_{is} + K_{ii}\int e\left(\theta_{is}\right) dt \quad (12.18)$$
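The torque-based PID loop described above can be illustrated with a single-joint simulation; the simplified plant constants and the PID gains below are hypothetical illustration values, not the chapter's optimized gains.

```python
# A single-joint sketch of the torque-based PID control loop.  The plant
# model M*ddq + C*dq + G = tau is a deliberate simplification, and all
# constants are hypothetical.
M, C, G = 0.02, 0.01, 0.05        # inertia, damping, gravity terms (assumed)
Kp, Kd, Ki = 80.0, 4.0, 10.0      # PID gains (assumed)

theta_f = 0.5                      # desired joint angle (rad)
theta, dtheta, integ = 0.0, 0.0, 0.0
dt = 0.001
for _ in range(3000):              # 3 s of simulated time, Euler integration
    e = theta_f - theta            # error e = theta_if - theta_is
    integ += e * dt
    tau = Kp * e - Kd * dtheta + Ki * integ   # PID torque on the joint
    ddtheta = (tau - C * dtheta - G) / M      # resulting joint acceleration
    dtheta += ddtheta * dt
    theta += dtheta * dt
print(round(theta, 3))  # settles near the 0.5 rad target
```

The integral term removes the steady-state offset that the constant gravity load would otherwise leave behind.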
1. Initialize a population: The population of initial solutions is dispersed randomly over the
N-dimensional search space. It is important to note that each position of a weed
signifies one possible solution to the problem.
2. Reproduction: After growing, the individual weeds are allowed to reproduce new seeds depending
on their own fitness and on the lowest and highest fitness values in the colony. The number
of seeds produced by a weed increases linearly with its fitness: the weed with the best
fitness produces the most seeds, and the one with the worst fitness produces the fewest.
A strength of this algorithm is that all the weeds in the solution space, worst and best
alike, contribute to the reproduction process, so even the weed giving the worst fitness
value shares some useful information in the evolution process. The number of seeds (S)
produced by each weed is given by the following equation:
by the following equation:
f − fmin
S = Floor Smin + × Smax (12.19)
fmax − fmin
where fmin and fmax denote the minimum and maximum fitness value in the colony,
respectively, and Smin and Smax represent minimum and maximum number of seeds
produced by each plant, respectively.
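Equation (12.19) can be exercised directly; note that, as printed, it assigns more seeds to larger f, so the snippet below assumes a setting in which larger fitness is better.

```python
import math

# Equation (12.19) evaluated directly.  As printed, larger f earns more
# seeds, so this snippet assumes larger fitness is better.
def seeds(f, f_min, f_max, s_min, s_max):
    return math.floor(s_min + (f - f_min) / (f_max - f_min) * s_max)

# With the chapter's ascending-case settings S_min = 0, S_max = 5:
print([seeds(f, 0.0, 10.0, 0, 5) for f in (0.0, 4.0, 9.9, 10.0)])  # [0, 2, 4, 5]
```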
3. Spatial dispersal: The randomly generated seeds are distributed around the parent weed with
zero mean and a certain variance. Moreover, the standard deviation (σ) of the
random function is reduced nonlinearly from a previously specified initial
value (σinitial) to a final value (σfinal) in each generation. The equation governing this
process is as follows:
$$\sigma_{Gen} = \frac{\left(Gen_{max} - Gen\right)^{n}}{\left(Gen_{max}\right)^{n}}\left(\sigma_{initial} - \sigma_{final}\right) + \sigma_{final} \quad (12.20)$$
Figure 12.5 Flow chart showing the step-by-step procedure of the MCIWO algorithm.
where σinitial and σfinal indicate the initial and final standard deviations, respectively,
and Genmax and n represent the maximum number of generations and the modulation
index, respectively.
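The decay schedule of Equation (12.20) is easy to tabulate. The sketch below uses the ascending-case MCIWO settings reported later in the chapter (σfinal = 0.00001, n = 2, Genmax = 30); reading "σinitial = 3%" as 0.03 is an assumption of this illustration.

```python
# The standard-deviation schedule of Equation (12.20).  Interpreting the
# chapter's "3%" initial value as 0.03 is an assumption of this sketch.
sigma_initial, sigma_final = 0.03, 0.00001
n_mod, gen_max = 2, 30

def sigma(gen):
    return ((gen_max - gen) ** n_mod / gen_max ** n_mod) \
        * (sigma_initial - sigma_final) + sigma_final

print(sigma(0), sigma(15), sigma(30))  # decays from 0.03 down to 0.00001
```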
In order to improve the performance of the algorithm in the present research, the
authors introduced two new terms, namely, chaotic [27] and cosine [28,29] variables, in the
spatial dispersal step. The first, the chaotic random variable, is used to distribute
the seeds equally. This helps in enhancing the search space and in minimizing the
chances of the solution being trapped in a local optimum. The chaotic random number
considered in the present study is obtained from the Chebyshev map:
$$X_{k+1} = \cos\left(k \cos^{-1}\left(X_k\right)\right) \quad (12.21)$$
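A generator for the chaotic sequence of Equation (12.21), iterated exactly as written (the multiplier k is the iteration index):

```python
import math

# Chaotic sequence from the Chebyshev map of Equation (12.21).
def chebyshev_sequence(x0, length):
    xs = [x0]
    for k in range(1, length):
        xs.append(math.cos(k * math.acos(xs[-1])))
    return xs

seq = chebyshev_sequence(0.7, 10)
print([round(v, 4) for v in seq])  # every term stays inside [-1, 1]
```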
Further, the cosine variable assists in enhancing and exploring the search space in a
better manner, and it utilizes the unused resources in the search space. After intro-
ducing the cosine variable, Equation (12.20) is modified accordingly (Equation 12.22).
4. Competitive exclusion: After several iterations, the number of weeds in the colony reaches its
maximum (Pmax) by fast reproduction. At this stage, each weed is allowed to produce new
seeds. The newly produced seeds are then allowed to spread over the search space by
using a chaotic random number. After spreading over the search area, the seeds
occupy their positions and are assigned a rank along with the parent weeds.
Once the maximum allowable population is reached, the weeds with lower fitness are
eliminated, and the weeds with better fitness join the population in the next genera-
tion. This process continues until the maximum number of iterations is reached.
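Steps 1 through 4 can be condensed into a compact loop. The sketch below is a plain IWO iteration on a toy objective: the chaotic and cosine refinements of the MCIWO variant are omitted, and every parameter value is illustrative rather than the chapter's tuned setting.

```python
import math, random

# A compact sketch of the weed-colony loop in steps 1-4, minimizing a toy
# objective.  All parameter values are illustrative.
random.seed(1)

def objective(w):                          # toy fitness (to be minimized)
    return sum(v * v for v in w)

dim, pop_max = 2, 15                       # search dimension, max colony size
s_min, s_max = 0, 5                        # seed counts, cf. Equation (12.19)
sigma_initial, sigma_final, n_mod, gen_max = 1.0, 0.001, 2, 60
colony = [[random.uniform(-5, 5) for _ in range(dim)] for _ in range(5)]

for gen in range(gen_max):
    # nonlinearly decaying dispersal, cf. Equation (12.20)
    sigma = ((gen_max - gen) ** n_mod / gen_max ** n_mod) \
        * (sigma_initial - sigma_final) + sigma_final
    fits = [objective(w) for w in colony]
    f_best, f_worst = min(fits), max(fits)
    offspring = []
    for w, f in zip(colony, fits):
        # lower (better) fitness -> more seeds, linear ramp as in (12.19)
        ratio = 0.0 if f_worst == f_best else (f_worst - f) / (f_worst - f_best)
        for _ in range(s_min + math.floor(ratio * s_max)):
            offspring.append([v + random.gauss(0.0, sigma) for v in w])
    # competitive exclusion: keep only the pop_max fittest weeds
    colony = sorted(colony + offspring, key=objective)[:pop_max]

best = min(colony, key=objective)
print(objective(best))  # typically a small value near zero
```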
In the present chapter, the gains (i.e., Kp, Kd, and Ki) of the various PID controllers that
are used to control the individual joints of the biped robot are considered as the weeds of
the MCIWO algorithm. In this study, the PID controllers are designed to control only 12
of the 18 joints of the biped robot; the remaining six DOF belong to the hands and are
not considered here. Further, each PID controller requires three gains for its opera-
tion. Therefore, in total, 36 gain values are required to control all 12 joints of the biped
robot. One such population of the MCIWO algorithm looks like the following:
$$\underset{K_{p1}}{866.54},\ \underset{K_{d1}}{400.25},\ \underset{K_{i1}}{958.32},\ \ldots,\ \underset{K_{p12}}{758.35},\ \underset{K_{d12}}{550.96},\ \underset{K_{i12}}{688.78}$$
It is important to note that a fitness value needs to be assigned for each population
of the MCIWO algorithm. Here, the average angular error between the start and end
points of the interval for various joints is considered as the fitness of each population.
The fitness function ( f ) of the MCIWO PID controller is as follows:
$$f = \frac{1}{b}\sum_{j=1}^{b}\left[\sum_{k=1}^{p}\left(\alpha_{ijkf} - \alpha_{ijks}\right)^{2}\right]^{1/2} \quad (12.23)$$
where b denotes the number of intervals considered in one step, and p indicates the
number of joints for which the controllers are designed.
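A direct transcription of this fitness evaluation, with hypothetical error data:

```python
import math

# Fitness evaluation in the spirit of Equation (12.23): the average over b
# intervals of the RMS-style angular error across the p controlled joints.
# The angle values below are hypothetical illustration data.
def fitness(alpha_f, alpha_s):
    b = len(alpha_f)
    total = 0.0
    for row_f, row_s in zip(alpha_f, alpha_s):
        total += math.sqrt(sum((af - as_) ** 2 for af, as_ in zip(row_f, row_s)))
    return total / b

alpha_f = [[0.52, 0.31], [0.48, 0.29]]   # reached angles: 2 intervals x 2 joints
alpha_s = [[0.50, 0.30], [0.50, 0.30]]   # desired angles
print(round(fitness(alpha_f, alpha_s), 6))  # 0.022361
```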
Figure 12.6 shows the structure of the NN used to predict the gains of the PID controller.
The architecture of the NN consists of 12 input neurons that indicate
the angular error of each joint at different instances of time. The output layer consists of 36
neurons that represent the proportional, integral, and derivative gains of the PID control-
lers that are used to control each joint of the biped robot. The number of neurons in the
hidden layer has been decided with the help of a systematic study. It is important to note
that the performance of a NN depends on the connecting weights between the neurons
of input-hidden and hidden-output layers. During the parametric study, the connecting
weights, the bias value of the network and the coefficients of transfer function values of
individual layers are optimized with the help of the MCIWO algorithm. A batch mode
of training has been implemented to train the NN. Once the structure of the NN is opti-
mized, it can be used to predict the gains of the PID controller in a more adaptive manner.
The operating principle of a MCIWO-NN is shown in Figure 12.7.
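A forward pass through such a 12–14–36 network can be sketched as follows; the random weights, biases, and transfer-function coefficients stand in for the MCIWO-evolved values, and the mapping of the bounded outputs onto actual gain ranges is omitted.

```python
import math, random

# Forward pass of the 12-14-36 network described above: log-sigmoid hidden
# layer, tan-sigmoid output layer.  All numeric values are placeholders for
# the MCIWO-evolved weights, biases, and transfer-function coefficients.
random.seed(0)
n_in, n_hid, n_out = 12, 14, 36
W = [[random.uniform(0.0, 1.0) for _ in range(n_in)] for _ in range(n_hid)]
V = [[random.uniform(0.0, 1.0) for _ in range(n_hid)] for _ in range(n_out)]
b1 = b2 = 0.00005                 # bias values (assumed)
c1, c2 = 2.0, 5.0                 # transfer-function coefficients (assumed)

def logsig(a):
    return 1.0 / (1.0 + math.exp(-c1 * a))

def tansig(a):
    return math.tanh(c2 * a)

def predict_gains(errors):
    """errors: the 12 joint-angle errors; returns 36 bounded outputs that
    would be mapped onto the Kp, Kd, Ki gains of the 12 PID controllers."""
    hidden = [logsig(sum(w * e for w, e in zip(row, errors)) + b1) for row in W]
    return [tansig(sum(v * h for v, h in zip(row, hidden)) + b2) for row in V]

gains = predict_gains([0.01] * 12)
print(len(gains))  # 36
```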
Figure 12.7 Flow chart showing the structure of the MCIWO-NN algorithm.

Let us consider the changes in the angular positions of the joints, namely Δθ1, Δθ2, Δθ3,
Δθ4, Δθ5, Δθ6, Δθ7, Δθ8, Δθ9, Δθ10, Δθ11, and Δθ12, of the biped robot at equal intervals of time as
the input to the NN and the gains of the PID controller, such as, Kp, Kd, and Ki of each joint
of the biped robot, are taken as outputs of the NN. Here the main objective of using the
MCIWO is to optimize the structure of the NN—that is, connecting weights, bias values
and coefficients of transfer function of the NN. One such population of MCIWO in this
case looks like the following:
$$\underset{w_{11}}{0.25},\ \underset{w_{12}}{0.4},\ \underset{w_{13}}{0.72},\ \ldots,\ \underset{v_{14,36}}{0.34},\ \underset{b_1}{0.00009},\ \underset{b_2}{0.00005},\ \underset{c_{11}}{2},\ \underset{c_{12}}{5},\ \underset{c_{13}}{7}$$
Once the gains of the PID controller are obtained from the NN, they are used to control the
joint motors. The root mean square (RMS) error of the angular displacement of all the joints
between the end of each interval (αijkf) and the start of each interval (αijks) is considered as
the fitness (f) of each population member of MCIWO-NN and is given as follows:
$$f = \frac{1}{d}\sum_{i=1}^{d}\frac{1}{b}\sum_{j=1}^{b}\left[\sum_{k=1}^{p}\left(\alpha_{ijkf} - \alpha_{ijks}\right)^{2}\right]^{1/2} \quad (12.24)$$
where d represents the number of training scenarios and other terms have their usual
meaning.
After conducting the study, the optimal values of the parameters
of the MCIWO algorithm are found to be as follows: final value of standard deviation (σfinal) =
0.00001; initial standard deviation (σ initial) = 3%; maximum number of seeds (Smax) = 5; mini-
mum number of seeds (Smin) = 0; initial population size (npopi) = 10; final population size
(npopf) = 25; nonlinear modulation index (n) = 2; and generations (Gen) = 30. Further, a
systematic study has also been conducted in the MCIWO-NN approach to obtain the suit-
able transfer functions for hidden and output layers, and to find the optimal number of
neurons in the hidden layer of the network. From this study, it has been observed that
the log-sigmoid and tan-sigmoid transfer functions are seen to exhibit better performance
with the hidden and output layers, respectively. Further, the number of neurons in the
hidden layer that produce better fitness is found to be equal to 14. The numbers of neu-
rons in the input, hidden, and output layers are kept equal to 12, 14, and 36, respectively.
Therefore, the numbers of connecting weights used in the network are seen to be equal to
672 (12 × 14 + 14 × 36). Finally, the number of variables that represent the NN architecture
is found to be 675, which includes the connecting weights, one bias value of the network, and
two coefficients of the transfer functions.
For solving the above problem, the weights of the network are varied in the range
0.0–1.0, the bias values in the range 0.0–0.0001, and the coefficients of the transfer
functions in the range 1–10. In addition to the above study, here
also a parametric study is conducted to determine the optimal parameters of the MCIWO
algorithm that evolve the NN architecture. The optimal MCIWO parameters obtained
through the study are as follows: final value of standard deviation (σfinal) = 0.00001; initial
standard deviation (σ initial) = 4%; modulation index (n) = 4; final population (npopf) = 10;
initial population (npopi) = 5; maximum number of seeds (Smax) = 4; minimum number
of seeds (Smin) = 0; and maximum number of generations (Gen) = 50. Once the optimal
parameters of MCIWO and MCIWO-NN algorithms are identified, a comparative study
has been conducted in computer simulations on the said algorithms in terms of variation
of error, estimation of the torque required at each joint, variation of ZMP and DBM in
X- and Y-directions of the biped robot.
The variations of error at different joints of the biped robot using MCIWO and
MCIWO-NN algorithms are shown in Figure 12.8. It can be observed that the magnitude
of error reaches zero at the end of every interval. Further, the enlarged views show that the
MCIWO-NN (i.e., adaptive PID controller) converges faster than the MCIWO (i.e., optimal
controller) controller. It might be due to the reason that the trained NN could have pre-
dicted the values of gains of the controller with respect to the magnitude of error in the
joint angles, whereas in the MCIWO approach the gain values are constant and fixed, and
are not varying whenever there is a change in the magnitude of input signal to a particular
joint. This adaptiveness of the MCIWO-NN PID controller helped in converging the error
faster than the MCIWO PID controller.
Figure 12.9 shows the torque required at different joints of the biped robot while
ascending the staircase. It can be observed that the adaptive (MCIWO-NN) PID controller
requires less torque compared to optimized (MCIWO) PID controller. The reason for the
better performance is the same as the one discussed earlier. Moreover, Figure 12.10 shows
the variation of ZMP in X- and Y-directions of the biped robot while ascending the stair-
case. It can be observed that the ZMP of the MCIWO-NN–based PID controller is close to
the center of the foot when compared to the MCIWO-based PID controller. It means that
the adaptive PID controller has produced more dynamically balanced gaits than the opti-
mal PID controller.
Figure 12.8 Variation of error at different joints of the biped robot. (a) Joint 2, (b) Joint 3, (c) Joint 4,
(d) Joint 5, (e) Joint 8, (f) Joint 9, (g) Joint 10 and (h) Joint 11.
Figure 12.9 Torque required at different joints of the biped robot while walking on ascending the
staircase. (a) Swing leg and (b) stand leg.
Figure 12.10 Variation of ZMP in X- and Y-directions while walking on ascending the staircase.
Figure 12.11 Torque required at different joints of the biped robot while walking on descending the
staircase. (a) Swing leg and (b) stand leg.
Figure 12.12 Variation of ZMP in X- and Y-directions while walking on descending the staircase.
Figure 12.13 DBM on ascending and descending the staircase. (a) X-DBM and (b) Y-DBM.
and descending case are fed to the real robot. The schematic diagrams showing the real-
time walking of the biped robot while ascending and descending the staircase are shown
in Figures 12.14 and 12.15, respectively. From this it can be observed that the biped robot
has successfully negotiated the staircase with the help of the gait obtained by the adap-
tive PID controller.
[Figures 12.14 and 12.15: frame-by-frame snapshots of the walking cycle at 0, 5, 10, 15, 20, 25, 30, and 35 ms.]
12.6 Conclusions
In this research work, an attempt is made to develop a torque-based PID controller for the
biped robot while ascending and descending the staircase. A metaheuristic optimization
algorithm, namely the MCIWO algorithm, has been used to find the optimal gains of the PID
controller. Further, the same MCIWO algorithm has also been used to optimize the struc-
ture of the NN. It has been observed that in both the ascending and descending cases, the
MCIWO-NN–based PID controller is found to perform better than the MCIWO-based PID
controller. This may be due to the nature of the NN, which is able to produce adaptive gains
for the PID controller whenever the magnitude of the error at each joint changes.
Further, the developed controllers are successfully tested in computer simulations.
Finally, the gait obtained by the adaptive (i.e., MCIWO-NN controller) PID controller is
tested on a real biped robot.
Contents
13.1 Introduction......................................................................................................................... 247
13.2 Materials and methods...................................................................................................... 249
13.2.1 Data........................................................................................................................... 249
13.2.2 Machine learning algorithms............................................................................... 249
13.2.2.1 Neural network models.......................................................................... 250
13.2.2.2 NN model building with R programming tools................................. 252
13.2.2.3 Support vector regression models......................................................... 252
13.2.2.4 Decision tree models............................................................................... 253
13.2.2.5 Decision tree model building with R programming tools............... 254
13.2.2.6 Random forest models............................................................................ 255
13.2.2.7 Linear regression models........................................................................ 256
13.2.2.8 Model evaluation error metrics.............................................................. 257
13.3 Results and discussion....................................................................................................... 257
13.3.1 Neural network models........................................................................................ 257
13.3.2 Support vector regression models........................................................................ 257
13.3.3 Decision tree regression model............................................................................. 258
13.3.4 Random forest regression model.......................................................................... 259
13.3.5 Linear model for regression.................................................................................. 260
13.3.6 Machine learning models vis-à-vis linear regression model........................... 260
13.4 Conclusion........................................................................................................................... 261
References...................................................................................................................................... 261
13.1 Introduction
India is known throughout the world for its superior buffalo germplasm and accounts
for more than 57% of the world buffalo population. The buffalo is considered the
major dairy animal and the backbone of the Indian dairy industry. With a buffalo
population of 108.7 million (Anon., 2014), the country ranks first in the world. India also ranks first in milk
production, achieving an annual output of 155.5 million tonnes with per capita availability
of 337 g/day in 2015–2016 (Anon., 2017). Buffaloes contribute 53% (82.41 million tonnes) to
the total milk production in India.
The average productivity (5.76 kg/day/animal) of buffaloes is more than that of
indigenous cattle (3.41 kg/day/animal) in the country (Anon., 2017). Besides this, buffaloes
247
248 Advanced Mathematical Techniques in Engineering Sciences
contribute significantly toward meat production, draft power, and dung for manure and
fuel. Thus, the buffalo species is the most important and indispensable component of the
livestock sector in the country.
Murrah is one of the best milch breeds of buffalo. The population of Murrah buffaloes
is 48.25 million in India, out of which 11.68 million is pure and 36.56 million buffaloes
are graded. Murrah buffaloes contribute 44.39% to the total buffalo population in India.
Murrah buffaloes are known for high milk production and a higher fat percentage, which
is almost twice that of cow milk. Haryana state (especially Bhiwani, Jhajjar, Jind,
and Rohtak districts) is the home tract of Murrah buffaloes, but the graded Murrah
buffaloes are found throughout the country owing to their higher milk production potential
coupled with adaptation to wide ecological conditions and feed conversion efficiency. The
Murrah buffalo is basically the center of attraction for dairying among the various buffalo
breeds available in India (Mir et al., 2015). Hence, the Murrah breed of buffalo has been
appropriately named as the “black gold” of dairy animals in India. Also, several coun-
tries including Bangladesh, Brazil, Bulgaria, Egypt, etc., have used Murrah as an improver
breed for upgrading their native buffaloes.
To date, in many countries, the main approach to improving dairy animals has been to
increase milk production. Although the results have been satisfactory, it has been observed
over the years that increasing the milk yield of dairy animals deteriorates their reproductive
performance, as milk production traits are negatively associated with the fertility of the
animals (Berry et al., 2011). Under these constraints, an assessment is required to compare
predicted fertility in relation to milk production.
The fertility of the breeding bull in the herd may be assessed as the total number of
pregnancies relative to the total number of inseminations.
Accordingly, various studies have been carried out to predict fertility in dairy ani-
mals using classical regression analysis techniques (De Haas et al., 2007; Patil et al., 2014;
Cook and Green, 2016; Utt, 2016; Eriksson et al., 2017). Conventional regression techniques
are based on the assumption of a specific parametric function (such as linear, quadratic,
etc.) to fit the data, which can be rather rigid for modeling arbitrary relationships (Piles
et al., 2013). Alternatively, nonparametric methods like emerging machine learning (ML)
algorithms could be applied for the intelligent analysis of such traits (González-Recio
et al., 2014) as they do not require prior knowledge of any parametric function. Rather,
they can capture intricate relationships between dependent and independent variables as
well as complex dependencies among explanatory variables. Also, they are quite flexible
and can learn arbitrarily complex patterns when sufficient data are presented. ML
algorithms can realize how to perform important tasks by generalizing from examples,
i.e., automatic predictions from instances of desired behavior or past observations. Thus,
ML is the study of intelligent computer algorithms that improve automatically through
experience (Ramón et al., 2012; Shalev-Shwartz and Ben-David, 2014). These learning
methods have found several applications in performance modeling and evaluation in animal
sciences (Caraviello et al., 2006; Shahinfar et al., 2012; González-Recio et al., 2014; Murphy
et al., 2014; Shahinfar et al., 2014; Hempstalk et al., 2015; Fenlon et al., 2016; Borchers et al.,
2017). However, the majority of the studies in this area of research have been conducted
outside India. Very few studies have recently been carried out in India including pioneer-
ing work by the authors at this institute (Sharma et al., 2006; Sharma et al., 2007, 2013;
Panchal et al., 2016, 2017). Nevertheless, prediction of a bull's fertility using ML techniques
has not been attempted, especially in Murrah buffaloes. Hence, in this chapter, the authors
have investigated various emerging ML algorithms to predict the fertility in Murrah bulls
being maintained at the ICAR-National Dairy Research Institute (NDRI), Karnal, India.
Chapter thirteen: Modeling fertility in Murrah bulls with intelligent algorithms 249
Table 13.1 Summary statistics of the Murrah breeding bulls’ fertility data set
Variable Mean SD SE Minimum Maximum Range
Birth weight 35.8723 5.3674 0.7829 25 50 25
Weight (3-m) 68.3996 17.9533 2.6188 43 116 73
Weight (6-m) 111.2570 19.4178 2.8324 76 170 94
Weight (9-m) 159.0518 36.7451 5.3598 110 270 160
Weight (12-m) 272.8443 52.2760 7.6252 172 363 191
Weight (24-m) 382.3406 51.6531 7.5344 264 493 229
Age at first calving 3.2955 0.7420 0.1082 2.1 4.9 2.8
Post-thaw motility 49.0425 5.3809 0.7849 40 60 20
Conception rate 62.4381 16.9481 2.4721 30.43 94.74 64.31
SD: standard deviation; SE: standard error.
[Figure: schematic of a single-hidden-layer feed-forward neural network with inputs x_0, …, x_n, hidden-layer weights w_ij, and output weights v_0, …, v_m.]
The net input to the ith hidden layer neuron is

net_i = Σ_{j=1}^{n} w_{ij} x_j + θ_i^{(1)} = w_i ⋅ x + θ_i^{(1)}    (13.1)
where θ_i^{(1)} is the bias of the ith hidden layer neuron. The output from the ith hidden layer
neuron is given by
net = Σ_{i=1}^{m} v_i h_i + θ^{(2)} = v ⋅ h + θ^{(2)}    (13.3)
where vi represents the synaptic (or connection) strength between the ith hidden layer
neuron and the output neuron, while θ(2) is the bias of the output neuron.
Introducing a bias neuron x0 with input value as +1, Equation (13.1) can be rewritten as
net_i = Σ_{j=0}^{n} w_{ij} x_j = W_i ⋅ x    (13.4)
where W_{i0} ≡ θ_i^{(1)} and W_i is the weight vector w_i (associated with the ith hidden
neuron) augmented by the 0th column corresponding to the bias. Similarly, introducing an
auxiliary hidden neuron (i = 0) such that h_0 = +1 allows us to redefine Equation (13.3) as
net = Σ_{i=0}^{m} v_i h_i = V ⋅ h    (13.5)
where v_0 ≡ θ^{(2)}.
F = (1/P) Σ_{p=1}^{P} (net_o^{(p)} − t^{(p)})^2    (13.7)
This function is minimized using any standard optimization method, such as the Broyden–
Fletcher–Goldfarb–Shanno (BFGS) technique.
The NN discovers knowledge from complicated or imprecise data, which is employed
to find patterns and reveal trends that are too complex to be observed either by human
beings or classical statistical techniques. A substantially trained NN acts as an expert
system to analyze data in the specific domain of information for which it was trained.
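The forward pass of Equations (13.4) and (13.5), together with the error function of Equation (13.7), can be sketched in a few lines. The chapter builds its models with R's nnet and neuralnet packages; the standalone Python sketch below, with a logistic transfer function for the hidden layer (matching the "log sigmoid" of Table 13.2) and illustrative weights, is not the chapter's implementation:

```python
import math

def forward(W, V, x):
    """One forward pass of a single-hidden-layer network.

    W: m rows of (n+1) hidden-layer weights, column 0 holding the biases,
    as in Equation (13.4). V: m+1 output weights, V[0] being the bias,
    as in Equation (13.5). A logistic transfer function is assumed.
    """
    xa = [1.0] + list(x)                 # bias neuron x0 = +1
    h = [1.0]                            # auxiliary hidden neuron h0 = +1
    for wi in W:
        net_i = sum(wij * xj for wij, xj in zip(wi, xa))
        h.append(1.0 / (1.0 + math.exp(-net_i)))   # logistic activation
    return sum(vi * hi for vi, hi in zip(V, h))    # net = V . h

def error_F(W, V, patterns):
    """Error function F of Equation (13.7) over P patterns (x, t)."""
    P = len(patterns)
    return sum((forward(W, V, x) - t) ** 2 for x, t in patterns) / P
```

With all hidden weights zero and output weights (0, 2), the hidden activation is 0.5 and the network output is 1.0 regardless of the input; a training routine would adjust W and V (e.g., by BFGS) to minimize error_F.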
as these are less than ε; however, any deviation beyond this point is not accepted.
First, let us consider the case of linear functions f, which take the form

f(x) = ⟨w, x⟩ + b    (13.8)

where ⟨⋅, ⋅⟩ represents the dot product in ℵ. In the context of Equation (13.8), flatness
means determining the smallest value for w, which can be achieved by minimizing the
squared Euclidean norm ‖w‖^2. Formally, this is expressed as the convex optimization
problem:
minimize (1/2)‖w‖^2

subject to t^{(p)} − ⟨w, x^{(p)}⟩ − b ≤ ε
           ⟨w, x^{(p)}⟩ + b − t^{(p)} ≤ ε    (13.9)
It is implicitly assumed in Equation (13.9) that a function f actually exists that
approximates all pairs {x^{(p)}, t^{(p)}} with ε precision, i.e., that the convex optimization problem
is feasible. However, at times this may not be the case, or we may wish to tolerate some errors.
Slack variables ξ_p and ξ_p^* are introduced to handle otherwise infeasible constraints of the
optimization problem under consideration. Thus, the optimization problem (Equation 13.9) is
reformulated as

minimize (1/2)‖w‖^2 + c Σ_{p=1}^{P} (ξ_p + ξ_p^*)

subject to t^{(p)} − ⟨w, x^{(p)}⟩ − b ≤ ε + ξ_p
           ⟨w, x^{(p)}⟩ + b − t^{(p)} ≤ ε + ξ_p^*    (13.10)
           ξ_p, ξ_p^* ≥ 0
The constant c > 0 determines the trade-off between the flatness of f and the extent to which
deviations beyond ε are tolerated. The formulation in Equation (13.10) corresponds to the
so-called ε-insensitive loss function ξ_ε, defined as

ξ_ε := 0 if |ξ| ≤ ε, and ξ_ε := |ξ| − ε otherwise    (13.11)
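The ε-insensitive loss of Equation (13.11) is straightforward to state in code: deviations inside the ε tube cost nothing, and beyond it the cost grows linearly. The chapter fits its SVR models with R's e1071 package; the following Python function is only an illustrative sketch of the loss itself:

```python
def eps_insensitive(residual, eps):
    """epsilon-insensitive loss of Equation (13.11): zero inside the
    epsilon tube, linear in the excess deviation outside it."""
    return max(0.0, abs(residual) - eps)
```

For example, with ε = 0.1 a residual of 0.05 incurs no loss, while a residual of 0.3 incurs a loss of 0.2; the sign of the residual does not matter.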
f(x) = Σ_{m=1}^{M} c_m I(x ∈ F_m)    (13.12)
Let the criterion be minimization of the SSE, Σ_p (t^{(p)} − f(x^{(p)}))^2. Then the best ĉ_m is just the
average of t^{(p)} in region F_m:

ĉ_m = ave[t^{(p)} | x^{(p)} ∈ F_m]    (13.13)
where ave [⋅] denotes the average. Now, determining the best binary partition in terms of
minimum SSE, is generally computationally infeasible. Therefore, a top-down greedy-
search algorithm is applied. Starting with all the data points, consider a splitting variable
j and split point s, and define the pair of half-planes:
F_1(j, s) = {X | X_j ≤ s} and F_2(j, s) = {X | X_j > s}    (13.14)
Now, it is to seek the splitting variable j and split point s that solve:
min_{j, s} [ min_{c_1} Σ_{x^{(p)} ∈ F_1(j, s)} (t^{(p)} − c_1)^2 + min_{c_2} Σ_{x^{(p)} ∈ F_2(j, s)} (t^{(p)} − c_2)^2 ]    (13.15)
For any choice of j and s, the inner minimization is solved by
ĉ_1 = ave[t^{(p)} | x^{(p)} ∈ F_1(j, s)] and ĉ_2 = ave[t^{(p)} | x^{(p)} ∈ F_2(j, s)]    (13.16)
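The split search of Equations (13.14) through (13.16) can be sketched for a single splitting variable: every observed value is tried as a split point s, the inner minimization is solved by the region averages, and the point with the smallest total SSE wins. This illustrative Python sketch is not the rpart implementation actually used in the chapter:

```python
def best_split(xs, ts):
    """Exhaustive search over split points s (Equation 13.15) for one
    splitting variable; the inner minimisation uses the region averages
    of Equation (13.16). Returns the best (s, sse) pair."""
    best = None
    for s in sorted(set(xs))[:-1]:           # candidate split points
        left = [t for x, t in zip(xs, ts) if x <= s]    # region F1
        right = [t for x, t in zip(xs, ts) if x > s]    # region F2
        c1 = sum(left) / len(left)           # c-hat_1
        c2 = sum(right) / len(right)         # c-hat_2
        sse = (sum((t - c1) ** 2 for t in left)
               + sum((t - c2) ** 2 for t in right))
        if best is None or sse < best[1]:
            best = (s, sse)
    return best
```

On a toy sample with targets 0, 0, 10, 10 at x = 1, 2, 3, 4, the search picks s = 2, which separates the two target levels perfectly with zero SSE.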
For each splitting variable, the split point s can be determined quite promptly. Thus,
determination of the best pair (j, s) is feasible by browsing through all the inputs. Having
found the best split, the data are partitioned into the two resulting regions, and repeat
this splitting process on each of the two regions. Further, this process is employed on all
the resulting regions. Now, the question is how large the tree should be grown. Naturally,
a big tree could over-fit the data, whereas a short tree might not discover the underlying
important structure, i.e., tree size is a tuning parameter leading to the model’s complex-
ity. Thus, the optimum tree size should be adaptively resolved from the data. One smart
strategy would be to split tree nodes only if the decrease in SSE due to the split exceeds
some threshold. Nevertheless, this tactic is too short-sighted, as a seemingly poor split
might lead to a very good split further down. The ideal scheme would be to grow a large tree T0,
stopping the splitting process only when a certain minimum node size is attained. Then this
big tree is pruned using the cost-complexity pruning technique delineated below. Define a
subtree T ⊂ T0 to be any tree, which is attained as a result of pruning T0, i.e., collapsing any
number of its internal (nonterminal) nodes. Let the terminal nodes be indexed by m, with
node m representing region Fm. Consider |T| to be the number of terminal nodes in T. Let
the following quantities be stated in the present context:
N_m = #{x^{(p)} ∈ F_m}

ĉ_m = (1/N_m) Σ_{x^{(p)} ∈ F_m} t^{(p)}    (13.17)

Q_m(T) = (1/N_m) Σ_{x^{(p)} ∈ F_m} (t^{(p)} − ĉ_m)^2
C_α(T) = Σ_{m=1}^{|T|} N_m Q_m(T) + α|T|    (13.18)
This leads to a finite sequence of subtrees, and it can be shown that this sequence must
contain T_α. Further pedagogical details can be found in Breiman et al. (1984). The parameter α̂ is
computed by cross-validation: its value is chosen to minimize the cross-validated SSE.
The final tree is T_α̂.
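The quantities of Equations (13.17) and (13.18) can be computed directly from the targets that fall in each terminal node. A minimal illustrative Python sketch (the chapter's pruning is done inside rpart, not by hand):

```python
def cost_complexity(regions, alpha):
    """C_alpha(T) of Equation (13.18). `regions` is a list with one list of
    targets t per terminal node m; N_m, c-hat_m and Q_m(T) follow
    Equation (13.17), and |T| is the number of terminal nodes."""
    total = 0.0
    for targets in regions:
        n_m = len(targets)                                  # N_m
        c_m = sum(targets) / n_m                            # c-hat_m
        q_m = sum((t - c_m) ** 2 for t in targets) / n_m    # Q_m(T)
        total += n_m * q_m
    return total + alpha * len(regions)                     # + alpha * |T|
```

For two terminal nodes holding targets {1, 3} and {5}, the within-node SSE contributes 2.0, so C_α equals 2.0 at α = 0 and 3.0 at α = 0.5; larger α penalizes larger trees.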
1. For p = 1 to P:
   a. Obtain a bootstrap sample Z* of size N from the training data.
   b. Construct a random-forest tree T_p on the bootstrapped data, by recursively
      repeating the following steps for each terminal node of T_p until the minimum
      node size n_min is attained:
      i. Randomly choose m variables from the set of l variables.
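Steps (a) and (b)(i) above can be sketched as follows. This illustrative Python fragment (the function name is hypothetical) covers only the sampling steps visible in this excerpt; the remaining split-and-grow steps are handled by the randomForest package in the chapter:

```python
import random

def bootstrap_and_features(data, l, m, seed=0):
    """Draw a bootstrap sample Z* of size N with replacement (step a),
    then pick m of the l variables at random for the next split
    (step b.i). A fixed seed keeps the sketch reproducible."""
    rng = random.Random(seed)
    N = len(data)
    z_star = [data[rng.randrange(N)] for _ in range(N)]  # bootstrap sample
    feats = rng.sample(range(l), m)                      # m of l variables
    return z_star, feats
```

Repeating this P times, one per tree, and averaging the trees' predictions gives the random-forest estimate; decorrelating the trees via the random feature subset is the key design idea.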
t = b_0 + b_1 X^{(1)} + b_2 X^{(2)} + ⋯ + b_n X^{(n)}    (13.19)

This equation possesses the property that the prediction for t is a straight-line function of the
X variables. The slopes of their individual straight-line relationships with t are the
constants b_i, i = 1, 2, …, n, called the coefficients of the variables; each b_i indicates the change in the
predicted value of t per unit change in X^{(i)}, other things being equal. The additional constant b_0,
called the intercept, is the prediction that the model would produce if all the X values were zero
(if meaningful). The coefficients and intercept are estimated by the least squares method, i.e.,
setting them equal to the unique values that minimize the SSE within the sample of data to
which the model is fitted. Also, the model’s prediction errors are generally assumed to be
independently and identically normally distributed.
The linear model function, glm() supported by R programming language has been
employed for the MLR analysis in this chapter. The detailed pedagogic description can be
found in Kabacoff (2015).
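The least squares principle is easiest to see in the single-predictor case, where the estimates have a closed form. The chapter itself fits the full multiple regression with R's glm(); the Python sketch below is only a minimal one-predictor illustration:

```python
def least_squares(xs, ts):
    """Closed-form least-squares estimates of intercept b0 and slope b1
    for one predictor, minimising the in-sample SSE."""
    n = len(xs)
    mx = sum(xs) / n
    mt = sum(ts) / n
    b1 = (sum((x - mx) * (t - mt) for x, t in zip(xs, ts))
          / sum((x - mx) ** 2 for x in xs))   # covariance / variance
    b0 = mt - b1 * mx                         # line passes through means
    return b0, b1
```

On the exact line t = 1 + 2x, the estimates recover b0 = 1 and b1 = 2; with noisy data they give the SSE-minimizing line instead.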
MAE = (1/n) Σ_{i=1}^{n} |Actual_i − Predicted_i|    (13.20)
RMSE = √[ (1/n) Σ_{i=1}^{n} ((Actual_i − Predicted_i)/Actual_i)^2 ]    (13.21)
AIC = n ln(RSS/n) + 2k,  where  RSS = Σ_{i=1}^{n} (Actual_i − Predicted_i)^2    (13.22)
where n is the number of data points (observations in the test set), k is the number of
estimated parameters (including the variance), and RSS is the residual sum of squares of
the fitted model.
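Equations (13.20) through (13.22) translate directly into code. Note that the RMSE used in this chapter is the relative form, with each residual scaled by the actual value, which is why the reported RMSE values are small fractions. An illustrative Python sketch:

```python
import math

def mae(actual, predicted):
    """Mean absolute error, Equation (13.20)."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n

def rmse(actual, predicted):
    """Relative root mean square error, Equation (13.21): each residual
    is scaled by the corresponding actual value."""
    n = len(actual)
    return math.sqrt(sum(((a - p) / a) ** 2
                         for a, p in zip(actual, predicted)) / n)

def aic(actual, predicted, k):
    """Akaike information criterion, Equation (13.22), with k estimated
    parameters and RSS the residual sum of squares."""
    n = len(actual)
    rss = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return n * math.log(rss / n) + 2 * k
```

Lower values of all three metrics indicate a better model, which is how the comparison in Table 13.6 is read.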
[Figure: trained neural network mapping the input nodes bwt, wt3m, wt6m, wt9m, wt12m, wt24m, aam, and aptm to the output node cr (conception rate); fitted connection weights omitted.]
Table 13.2 Neural network model's optimum configuration and predictive performance

Type of neural network/R package | Learning algorithm | Number of neurons/transfer function in hidden layer(s) | Epochs/steps | MAE | RMSE | AIC
Feed-forward (nnet) | BFGS | 6 (Log sigmoid) | 100 | 13.12 | 0.15 | 177.47
Feed-forward (neuralnet) | Rprop | 2, 2 (Tangent sigmoid) | 6746 | 14.53 | 0.30 | 54.95
tune function in the same package. The tuning of the SVR, i.e., hyperparameter optimization or
model selection, was based on the grid search method. Many models were trained for
different combinations of cost and ε, and the optimal one was selected (Table 13.3). The
tune method was employed to train models with ε = 0, 0.1, 0.2, …, 1 and cost = 2^2, 2^3, 2^4, …, 2^9
(Figure 13.3).
In Figure 13.3, the darker the region is, the better the model is (i.e., RMSE is closer to
zero in darker regions).
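A grid search of this kind simply loops over every (ε, cost) pair and keeps the combination with the lowest cross-validated RMSE. The chapter performs this with the tune function of R's e1071 package; the Python skeleton below is illustrative only, and `toy_cv_rmse` is a hypothetical stand-in for the real 10-fold cross-validated SVR fit:

```python
def grid_search(eps_values, cost_values, cv_rmse):
    """Evaluate every (epsilon, cost) combination with the supplied
    cv_rmse(eps, cost) scorer and keep the pair with the lowest score,
    mirroring the chapter's tuning step."""
    best_pair, best_score = None, float("inf")
    for eps in eps_values:
        for cost in cost_values:
            score = cv_rmse(eps, cost)
            if score < best_score:
                best_pair, best_score = (eps, cost), score
    return best_pair, best_score

# Hypothetical stand-in for the real cross-validated SVR scorer:
toy_cv_rmse = lambda eps, cost: (eps - 1.0) ** 2 + (cost - 4) ** 2 / 100.0
eps_grid = [round(0.1 * i, 1) for i in range(11)]   # 0, 0.1, ..., 1
cost_grid = [2 ** p for p in range(2, 10)]          # 2^2 ... 2^9
```

With the toy scorer, whose minimum sits at ε = 1 and cost = 4 by construction, the search recovers exactly that pair; with a real scorer it would recover the ε = 1, cost = 4 optimum reported in Table 13.3 only if the cross-validation agrees.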
Table 13.3 Support vector regression machine's optimum configuration and predictive performance

SVR type/R package | Epsilon | Cost c | Kernel | Data partitioning scheme | MAE | RMSE | AIC
Eps-regression (grid search with tuning) | 1 (range 0–1) | 4 (range 2^2–2^9) | Gaussian RBF | 90:10 (10-fold cross validation) | 8.05 | 0.06 | 52.86

RBF: radial basis function.
[Figure 13.3: heat map titled "Performance of SVM" showing cross-validated RMSE over the grid of ε and cost values.]
Table 13.4 Decision tree regression model's optimum configuration and predictive performance

Approach/R package | Maxsurrogate | Usesurrogate | Data partitioning scheme | MAE | RMSE | AIC
Top-down greedy search (rpart package) | 0 | 0 | 90:10 | 13.26 | 0.22 | 155.23
Table 13.5 Random forest model's optimum configuration and predictive performance

R package | Number of trees (ntree) | Number of variables per level (mtry) | Data partitioning scheme | MAE | RMSE | AIC
Random Forest | 500 | 2 | 90:10 | 9.58 | 0.03 | 127.94
Table 13.6 Comparison of predictive accuracies of machine learning models vis-à-vis linear
model to predict fertility of Murrah breeding bulls

Accuracy metric | NN | SVR | DT | RF | LM
MAE | 13.12 | 8.05 | 13.26 | 9.58 | 6.44
RMSE | 0.15 | 0.06 | 0.22 | 0.03 | 0.18
AIC | 177.47 | 52.86 | 155.23 | 127.94 | 356.37
The experimental results (Table 13.6) that emerged from this study revealed that the
ML models, i.e., RF, SVR, and NN models, outperformed the LM, whereas the DT model
did not perform well due to its well-known inherent problem of over-fitting. Thus, the ML
approach (especially the RF paradigm) is capable of efficiently predicting the fertility of
Murrah bulls and was generally found better than the conventional linear model. Hence,
ML algorithms can be employed as a plausible alternative to linear regression models in
predicting the fertility of Murrah breeding bulls.
13.4 Conclusion
Various supervised ML algorithms, viz., NN, SVR machine, DT, and RF have been inves-
tigated empirically in this chapter, for modeling breeding bulls’ fertility in Murrah buf-
faloes. The performance of these intelligent models has been compared with that of the
classical linear model for regression, also developed in this study. The results of this study
revealed that the ML approach, generally, outperformed the classical linear models for
regression. Hence, the ML models developed in this study are better suited to precisely assess
the conception rate in Murrah breeding bulls at organized dairy farm(s) like ICAR-NDRI,
Karnal (India). These intelligent models will provide decision support to organized dairy
farms for selecting good buffalo bulls.
References
Anastasiadis, A.D., Magoulas, G.D. and Vrahatis, M.N. (2005). New globally convergent training
scheme based on the resilient propagation algorithm. Neurocomputing, 64, 253–270.
Anonymous (2014). 19th Livestock Census-2012 All India Report. Department of Animal Husbandry,
Dairying and Fisheries, Ministry of Agriculture, Govt. of India, New Delhi. www.dahd.nic.in/
sites/default/files/Livestock5.pdf.
Anonymous (2017). Annual Report 2016–17. Department of Animal Husbandry, Dairying & Fisheries
Ministry of Agriculture & Farmers Welfare Government of India.
Berglund, P. and Heeringa, S. (2014). Multiple Imputation of Missing Data Using SAS. SAS Institute Inc.,
Cary, NC.
Berry, D.P., Evans, R.D. and Mc Parland, S. (2011). Evaluation of bull fertility in dairy and beef cattle
using cow field data. Theriogenology, 75, 172–181.
Borchers, M.R., Chang, Y.M., Proudfoot, K.L., Wadsworth, B.A., Stone, A.E. and Bewley, J.M. (2017).
Machine-learning-based calving prediction from activity, lying, and ruminating behaviors in
dairy cattle. Journal of Dairy Science, 100, 5664–5674.
Breiman, L. (2001). Random forests. Machine Learning, 45, 5–32.
Breiman, L., Friedman, J., Olshen, R. and Stone, C. (1984). Classification and Regression Trees.
Wadsworth, New York.
Caraviello, D.Z., Weigel, K.A., Craven, M., Gianola, D., Cook, N.B., Nordlund, K.V., Fricke, P.M. and
Wiltbank, M.C. (2006). Analysis of reproductive performance of lactating cows on large dairy
farms using machine learning algorithms. Journal of Dairy Science, 89, 4703–4722.
Cook, J.G. and Green, M.J. (2016). Use of early lactation milk recording data to predict the calving to
conception interval in dairy herds. Journal of Dairy Science, 99, 4699–4706.
Daumé III, H. (2012). A course in machine learning. http://ciml.info.
De Haas, Y., Janss, L.L.G. and Kadarmideen, H.N. (2007). Genetic correlations between body condi-
tion scores and fertility in dairy cattle using bivariate random regression models. Journal of
Animal Breeding and Genetics, 124, 277–285.
Du, K.-L. and Swamy, M.N.S. (2014). Fundamentals of machine learning. Chapter 2. In: Neural
Networks and Statistical Learning. Springer, London. doi:10.1007/978-1-4471-5571-3_2.
Eriksson, S., Johansson, K., Axelsson, H.H. and Fikse, W.F. (2017). Genetic trends for fertility, udder
health and protein yield in Swedish red cattle estimated with different models. Journal of
Animal Breeding and Genetics, 134, 308–321.
Fenlon, C., O’Grady, L., Dunnion, J., Shalloo, L., Butler, S. and Doherty, M. (2016). A comparison
of machine learning techniques for predicting insemination outcome in Irish dairy cows. In:
Proceedings of the 24th Irish Conference on Artificial Intelligence and Cognitive Science, September
20–21, Dublin, Ireland, pp. 57–67. http://aics2016.ucd.ie/papers/full/AICS_2016_paper_30.pdf.
González-Recio, O., Rosa, G.J.M. and Gianola, D. (2014). Machine learning methods and predictive
ability metrics for genome-wide prediction of complex traits. Livestock Science, 166, 217–231.
Gunn, S.R. (1998). Support vector machines for classification and regression. In: ISIS Technical
Report, Image Speech & Intelligent Systems Group, University of Southampton, UK.
Hastie, T., Tibshirani, R. and Friedman, J. (2009). Elements of Statistical Learning: Data Mining, Inference
and Prediction. Second Edition. Springer, New York.
Haykin, S. (2005). Neural Networks: A Comprehensive Foundation. Second Edition. Pearson Education
(Singapore) Pte. Ltd., Delhi.
Hecht-Nielsen, R. (1990). Neurocomputing. Addison Wesley Longman Publishing Co., Inc. Boston, MA.
Hempstalk, K., McParland, S. and Berry, D.P. (2015). Machine learning algorithms for the prediction
of conception success to a given insemination in lactating dairy cows. Journal of Dairy Science,
98, 5262–5273.
Hornik, K. (1991). Approximation capabilities of multilayer feedforward networks. Neural Networks,
4, 251–257.
Intrator, O. and Intrator, N. (1993). Using neural nets for interpretation of nonlinear models.
In: Proceedings of the Statistical Computing Section, American Statistical Society, San Francisco,
pp. 244–249.
James, G., Witten, D., Hastie, T. and Tibshirani, R. (2013). An Introduction to Statistical Learning with
Applications in R. Springer, New York.
Kabacoff, R.I. (2015). R in Action: Data Analysis and Graphics with R. Manning Publications Co.,
New York.
Kominakis, A.P., Abas, Z., Maltaris, I. and Rogdakis, E. (2002). A preliminary study of the applica-
tion of artificial neural networks to prediction of milk yield in dairy sheep. Computers and
Electronics in Agriculture, 35, 35–48.
Kowalczyk, A. (2014). Support Vector Regression with R. www.svm-tutorial.com/2014/10/
support-vector-regression-r/.
Kuhn, M. and Johnson, K. (2013). Applied Predictive Modeling. Springer, New York.
doi:10.1007/978-1-4614-6849-3.
Li, F. (2008). Function approximation by neural networks. In: Sun, F., Zhang, J., Tan, Y., Cao, J. and
Yu, W. (Eds.) Advances in Neural Networks. Lecture Notes in Computer Science, 5263, 384–390.
Springer, Berlin, Germany.
Liaw, A. and Wiener, M. (2002). Classification and regression by randomForest. R News, 2/3, 18–22.
Mir, M.A., Chakravarty, A.K., Gupta, A.K., Naha, B.C., Jamuna, V., Patil, C.S. and Singh, A.P. (2015).
Optimizing age of bull at first use in relation to fertility of Murrah breeding bulls. Veterinary
World, 8, 518–522.
Murphy, M.D., O’Mahony, M.J., Shalloo, L., French, P. and Upton, J. (2014). Comparison of modeling
techniques for milk-production forecasting. Journal of Dairy Science, 97, 3352–3363.
Panchal, I., Sawhney, I.K., Sharma, A.K. and Dang, A.K. (2016). Classification of healthy and mas-
titis Murrah buffaloes by application of neural network models using yield and milk quality
parameters. Computers and Electronics in Agriculture, 127, 242–248.
Panchal, I., Sawhney, I.K., Sharma, A.K., Garg, M.K. and Dang, A.K. (2017). Mastitis detection in
Murrah buffaloes with intelligent models based upon electro-chemical and quality param-
eters of milk. Indian Journal of Animal Research, 51, 922–926.
Patil, C.S., Chakravarty, A.K., Singh, A., Kumar, V., Jamuna, V. and Vohra, V. (2014). Development
of a predictive model for daughter pregnancy rate and standardization of voluntary waiting
period in Murrah buffalo. Tropical Animal Health and Production, 46, 279–284.
Piles, M., Díez, J., delCoz, J.J., Montañés, E., Quevedo, J.R., Ramon, J., Rafel, O., López-Béjar, M. and
Tusell, L. (2013). Predicting fertility from seminal traits: Performance of several parametric and
non-parametric procedures. Livestock Science, 155, 137–147.
Ramón, M., Martínez-Pastor, F., García-Álvarez, O., Maroto-Morales, A., Josefa-Soler, A., Jiménez-
Rabadán, P., Fernández-Santos, M.R., Bernabéu, R. and Garde, J.J. (2012). Taking advantage of
the use of supervised learning methods for characterization of sperm population structure
related with freezability in the Iberian red deer. Theriogenology, 77, 1661–1672.
Riedmiller, M. and Braun, H. (1993). A direct adaptive method for faster backpropagation learning:
The RPROP algorithm. In: Proceedings of the IEEE International Conference on Neural Networks,
San Francisco, pp. 586–591.
Shahinfar, S., Mehrabani-Yeganeh, H., Lucas, C., Kalhor, A., Kazemian, M. and Weigel, K.A. (2012).
Prediction of breeding values for dairy cattle using artificial neural networks and neuro-fuzzy
systems. Computational and Mathematical Methods in Medicine. doi:10.1155/2012/127130.
Shahinfar, S., Page, D., Guenther, J., Cabrera, V., Fricke, P. and Weigel, K. (2014). Prediction of insemi-
nation outcomes in Holstein dairy cattle using alternative machine learning algorithms.
Journal of Dairy Science, 97, 731–742.
Shalev-Shwartz, S. and Ben-David, S. (2014). Understanding Machine Learning: From Theory to
Algorithms. Cambridge University Press, New York.
Smola, A.J. and Scholkopf, B. (2004). A tutorial on support vector regression. Statistics and Computing,
14, 199–222.
Smola, A. and Vishwanathan, S.V.N. (2008). Introduction to Machine Learning. Cambridge University
Press, Cambridge.
Sharma, A. K., Jain, D.K., Chakravarty, A.K., Malhotra, R. and Ruhil, A.P. (2013). Predicting economic
traits in Murrah buffaloes with connectionist models. Journal of Indian Society of Agricultural
Statistics, 67, 1–11.
Sharma, A.K., Sharma, R.K. and Kasana, H.S. (2006). Empirical comparisons of feed-forward con-
nectionist and conventional regression models for prediction of first lactation 305-day milk
yield in Karan Fries dairy cows. Neural Computing and Applications, 15, 359–365.
Sharma, A.K., Sharma, R.K. and Kasana, H.S. (2007). Prediction of first lactation 305-day milk yield
in Karan-Fries dairy cattle using ANN modelling. Applied Soft Computing, 7, 1112–1120.
Utt, M.D. (2016). Prediction of bull fertility. Animal Reproduction Science, 169, 37–44.
Vapnik, V. (1995). The Nature of Statistical Learning Theory. Springer, New York.
Venables, W.N. and Ripley, B.D. (2002). Modern Applied Statistics with S. Springer, New York.
Witten, I.H. and Frank, E. (2005). Data Mining: Practical Machine Learning Tools and Techniques. Second
Edition. Morgan Kaufmann Publishers, San Francisco, CA.
chapter fourteen

Computational study of the Coanda flow for V/STOL

Maharshi Subhash
Graphic Era University

Michele Trancossi
Sheffield Hallam University
Contents
14.1 Introduction
14.2 Governing equations
     14.2.1 Spalart–Allmaras model
     14.2.2 k–ε model
     14.2.3 SST k–ω model
     14.2.4 k–ε–ζ–f model
14.3 Grid independence test and solution methodology
14.4 Results and discussion
14.5 Conclusions
Acknowledgments
Nomenclature
References
14.1 Introduction
Vertical and short takeoff and landing (V/STOL) capability in the civil aviation sector is not only a dream but also a necessity for tomorrow, given the rapid growth of aviation for humanitarian purposes. Several methods [1–5] can be used to implement V/STOL for air vehicles; among them, the most suitable is based on thrust vectoring. The project ACHEON (Aerial Coanda High Efficiency Orienting-jet Nozzle) encompasses a thrust-vectoring propulsive nozzle called HOMER (High-speed Orienting Momentum with Enhanced Reversibility), based on a patent developed at the University of Modena and Reggio-Emilia, Italy [6]. The idea behind the project is to use a Coanda surface for thrust vectoring in order to achieve V/STOL. Several past works have described the use of the Coanda surface for flow control on aircraft wings and in other flow control devices [7–12]. The concept can also be used efficiently in other industrial applications such as plasma spray guns and direct injection in combustion chambers to improve combustion efficiency [13–15].
The attachment of a jet to an adjacent curved surface was recognized about two centuries ago by Young [16] and patented roughly a century later by Henri Coanda, a Romanian engineer; the phenomenon is therefore known as the "Coanda" effect. The conditions for the stability of flow over a curved surface were described by Rayleigh [17] in terms of streamline curvature, although this flow feature attracted little research attention until 1961. The mechanism by which flow over a convex surface produces the Coanda effect is described in the literature by Newman [18], who found that the flow adheres to the curved surface owing to the momentum balance between the centrifugal force and the pressure force [19]. Because of the interaction of the ambient fluid with the boundary layer of the flow over the curved surface, the static pressure increases gradually; the position on the curved surface where the pressure gradient becomes zero marks the verge of separation, and beyond this position the pressure gradient becomes positive and causes reverse flow. A detailed literature review, including engineering applications and especially lift generation on curved surfaces, has been carried out by Trancossi [20].
Some past works concentrated on the mechanism of the flow over the curved surface. Wille and Fernholtz [21] performed experiments on flow over a convex surface and found that surface curvature has a significant influence on the jet deflection. They also studied the boundary layer phenomena and the entrainment of ambient fluid, which causes the jet to adhere to the curved surface. At the same time, however, the jet spreading rate increases rapidly and eventually causes separation of the boundary layer. Therefore, to investigate the flow phenomena one needs to look inside the boundary layer, which has been attempted in previous work [22].
Experimental investigation of the flow over a circular cylinder by Fekete shows that the velocity profile on the curved surface is similar to that of a plane wall jet. He also investigated the surface pressure, the position of separation, and the wall shear stress, showing that the wall shear stress is negligible as long as the ratio b/R is not too small, and noting that experiments with b/R < 0.0075 may be prone to skin-friction effects. He found that θsep decreases with increased surface roughness; at large Reynolds numbers, however, the influence of surface roughness is negligible within the tested roughness range [23].
Neuendorf and Wygnansky [24] investigated experimentally the flow over a curved surface; they found that the entrainment of ambient fluid causes the jet to adhere to the curved surface but, on the other hand, also causes separation because of the increased jet spreading rate, so the boundary layer approximation fails. This indicates that the velocity-gradient and pressure-gradient conditions for separation need to be re-examined in order to reveal the physics of the flow over a curved surface. Without knowing this flow behavior, the design of such a nozzle would depend on trial-and-error methods, which consume more time and cost; this part of the work has already been investigated [22]. In the present work, the main emphasis is on the flow and geometric parameters. Another work on the identification of geometric parameters, by Patankar and Sridhar [25], delineated the behavior of the Coanda flow as a function of the aspect ratio (the ratio of jet orifice length to jet orifice width), but the choice of aspect ratio depends on the geometry of the flow. Because of the intricacies of the flow geometry, there are no unanimously accepted parameters that influence the flow. Here, an attempt has been made to define such parameters for this case and to encourage other researchers to validate them. For the nozzle flow over the Coanda surface, the aspect ratio can be defined as the ratio of the exit nozzle diameter (b) to the radius of curvature (R) of the Coanda surface; in the present chapter this ratio has been defined for design purposes.
Further parameters are discussed in Section 14.4. The present work focuses on the identification of the geometric and flow parameters for the design of such flows: to use this flow phenomenon in industrial applications, the flow and geometric parameters must be identified for a better optimized design.
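As a quick illustration of this geometric parameter, the b/R values of the two nozzle geometries studied later in this chapter can be computed directly; a minimal Python sketch, using the dimensions quoted in Section 14.4:

```python
# b/R: exit throat diameter over radius of curvature of the Coanda surface.
# This ratio is the main geometric parameter controlling jet adhesion.
def aspect_ratio(b_mm: float, R_mm: float) -> float:
    """Return the b/R ratio for a Coanda nozzle (units cancel)."""
    return b_mm / R_mm

config_1 = aspect_ratio(40.163, 101.566)  # first geometry (Figure 14.1)
config_2 = aspect_ratio(46.0, 179.543)    # second geometry (Figure 14.2)
print(round(config_1, 3), round(config_2, 3))  # 0.395 0.256
```

The smaller ratio of the second geometry is the one the chapter later associates with a larger attachment angle.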
14.2 Governing equations
The mass and momentum conservation equations are given in Reynolds-averaged form:

$$\frac{\partial U_i}{\partial x_i} = 0 \tag{14.1}$$

$$\frac{\partial U_i}{\partial t} + U_j \frac{\partial U_i}{\partial x_j} = -\frac{1}{\rho}\frac{\partial P}{\partial x_i} + \nu \frac{\partial^2 U_i}{\partial x_j \partial x_j} - \frac{\partial \overline{u_i' u_j'}}{\partial x_j} \tag{14.2}$$
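The Reynolds-averaged form above rests on decomposing the instantaneous velocity into a mean and a fluctuation, u = U + u′. A minimal numerical illustration with a synthetic signal (not CFD data) shows that the fluctuation has zero mean and that averaging the squared fluctuation yields a Reynolds-stress component:

```python
import numpy as np

rng = np.random.default_rng(0)
U_mean = 10.0                                # prescribed mean velocity (m/s)
u = U_mean + rng.normal(0.0, 2.0, 100_000)   # synthetic instantaneous signal

U = u.mean()                                 # Reynolds average U
u_prime = u - U                              # fluctuation u'
stress = (u_prime * u_prime).mean()          # one Reynolds-stress component

# The mean of the fluctuation vanishes by construction; the averaged
# square recovers the variance of the imposed fluctuations (sigma^2 = 4).
print(U, u_prime.mean(), stress)
```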
14.2.1 Spalart–Allmaras model
The Spalart–Allmaras transport equation for the working variable $\hat{\nu}$ is:

$$\frac{\partial \hat{\nu}}{\partial t} + u_j \frac{\partial \hat{\nu}}{\partial x_j} = c_{b1}\left(1 - f_{t2}\right)\hat{S}\hat{\nu} - \left[c_{w1} f_w - \frac{c_{b1}}{\kappa^2} f_{t2}\right]\left(\frac{\hat{\nu}}{d}\right)^2 + \frac{1}{\sigma}\left[\frac{\partial}{\partial x_j}\left((\nu + \hat{\nu})\frac{\partial \hat{\nu}}{\partial x_j}\right) + c_{b2}\frac{\partial \hat{\nu}}{\partial x_i}\frac{\partial \hat{\nu}}{\partial x_i}\right] \tag{14.3}$$

The turbulent eddy viscosity is

$$\mu_t = \rho \hat{\nu} f_{\nu 1} \tag{14.4}$$

where

$$f_{\nu 1} = \frac{\chi^3}{\chi^3 + c_{\nu 1}^3} \tag{14.5}$$

$$\chi = \frac{\hat{\nu}}{\nu} \tag{14.6}$$

and ρ is the density, ν = μ/ρ is the molecular kinematic viscosity, and μ is the molecular dynamic viscosity. Additional definitions are given by the following equations:

$$\hat{S} = \Omega + \frac{\hat{\nu}}{\kappa^2 d^2} f_{\nu 2} \tag{14.7}$$

where $\Omega = \sqrt{2 W_{ij} W_{ij}}$ is the magnitude of the vorticity, d is the distance from the field point to the nearest wall, and

$$f_{\nu 2} = 1 - \frac{\chi}{1 + \chi f_{\nu 1}} \tag{14.8}$$

$$f_w = g\left[\frac{1 + c_{w3}^6}{g^6 + c_{w3}^6}\right]^{1/6} \tag{14.9}$$

$$g = r + c_{w2}\left(r^6 - r\right) \tag{14.10}$$

$$r = \min\left[\frac{\hat{\nu}}{\hat{S}\kappa^2 d^2},\; 10\right] \tag{14.11}$$

$$f_{t2} = c_{t3}\exp\left(-c_{t4}\chi^2\right) \tag{14.12}$$

$$W_{ij} = \frac{1}{2}\left(\frac{\partial u_i}{\partial x_j} - \frac{\partial u_j}{\partial x_i}\right) \tag{14.13}$$

with $c_{t4} = 0.5$ and $c_{w1} = \dfrac{c_{b1}}{\kappa^2} + \dfrac{1 + c_{b2}}{\sigma}$.
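As a sanity check on the auxiliary relations above, the damping functions can be evaluated directly; a small sketch assuming the standard value c_v1 = 7.1, which the chapter does not list explicitly:

```python
# Damping functions of the Spalart-Allmaras model (Equations 14.5, 14.6, 14.8).
# c_v1 = 7.1 is the standard published value, assumed here.
C_V1 = 7.1

def f_v1(chi: float) -> float:
    """Viscous damping of the eddy viscosity (Equation 14.5)."""
    return chi**3 / (chi**3 + C_V1**3)

def f_v2(chi: float) -> float:
    """Auxiliary function entering S-hat (Equation 14.8)."""
    return 1.0 - chi / (1.0 + chi * f_v1(chi))

# chi = nu_hat / nu: far from walls chi grows large and f_v1 -> 1, so
# mu_t = rho * nu_hat * f_v1 tends to rho * nu_hat; near walls f_v1 -> 0.
for chi in (0.1, 1.0, 10.0, 100.0):
    print(chi, round(f_v1(chi), 4), round(f_v2(chi), 4))
```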
14.2.2 k–ε model

The transport equations for the turbulent kinetic energy k and its dissipation rate ε are:

$$\rho\frac{\partial k}{\partial t} + \rho u_j \nabla k = \nabla\left[\left(\mu + \frac{\mu_T}{\sigma_k}\right)\nabla k\right] + \mu_T P - \rho\varepsilon \tag{14.14}$$

$$\rho\frac{\partial \varepsilon}{\partial t} + \rho u_j \nabla\varepsilon = \nabla\left[\left(\mu + \frac{\mu_T}{\sigma_\varepsilon}\right)\nabla\varepsilon\right] + \frac{\varepsilon}{k}\left(C_{\varepsilon 1}\mu_T P - \rho C_{\varepsilon 2}\varepsilon\right) \tag{14.15}$$

where the eddy viscosity is

$$\mu_T = \rho C_\mu \frac{k^2}{\varepsilon} \tag{14.17}$$

The model constants take their standard values.
14.2.3 SST k–ω model

The eddy viscosity is

$$\nu_T = \frac{a_1 k}{\max\left(a_1 \omega,\; S F_2\right)} \tag{14.18}$$

and the transport equations for k and ω are:

$$\frac{\partial k}{\partial t} + U_j\frac{\partial k}{\partial x_j} = P_k - \beta^* k\omega + \frac{\partial}{\partial x_j}\left[\left(\nu + \sigma_k\nu_T\right)\frac{\partial k}{\partial x_j}\right] \tag{14.19}$$

$$\frac{\partial \omega}{\partial t} + U_j\frac{\partial \omega}{\partial x_j} = \alpha S^2 - \beta\omega^2 + \frac{\partial}{\partial x_j}\left[\left(\nu + \sigma_\omega\nu_T\right)\frac{\partial \omega}{\partial x_j}\right] + 2\left(1 - F_1\right)\sigma_{\omega 2}\frac{1}{\omega}\frac{\partial k}{\partial x_i}\frac{\partial \omega}{\partial x_i} \tag{14.20}$$

The blending functions are

$$F_2 = \tanh\left[\left(\max\left(\frac{2\sqrt{k}}{\beta^*\omega y},\; \frac{500\nu}{y^2\omega}\right)\right)^2\right] \tag{14.21}$$

$$F_1 = \tanh\left[\left(\min\left(\max\left(\frac{\sqrt{k}}{\beta^*\omega y},\; \frac{500\nu}{y^2\omega}\right),\; \frac{4\sigma_{\omega 2} k}{CD_{k\omega} y^2}\right)\right)^4\right] \tag{14.22}$$

$$S^2 = 2S_{ij}S_{ij}, \qquad S_{ij} = \frac{1}{2}\left(\partial_j u_i + \partial_i u_j\right) \tag{14.23}$$

$$G = \nu_T \frac{\partial U_i}{\partial x_j}\left(\frac{\partial U_i}{\partial x_j} + \frac{\partial U_j}{\partial x_i}\right) \tag{14.25}$$

$$CD_{k\omega} = \max\left(2\rho\sigma_{\omega 2}\frac{1}{\omega}\frac{\partial k}{\partial x_j}\frac{\partial \omega}{\partial x_j},\; 10^{-10}\right) \tag{14.26}$$

Each model constant φ is blended between its inner (k–ω) value φ₁ and its outer (k–ε) value φ₂:

$$\phi = \phi_1 F_1 + \phi_2\left(1 - F_1\right) \tag{14.27}$$

with $\alpha_1 = 5/9$, $\alpha_2 = 0.44$, $\beta_1 = 3/40$, $\beta_2 = 0.0828$, $\beta^* = 0.09$, $\sigma_{k1} = 0.85$, $\sigma_{k2} = 1$, $\sigma_{\omega 1} = 0.5$, and $\sigma_{\omega 2} = 0.856$.
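The constant blending of Equation 14.27 can be sketched in a few lines; a minimal example using the σ_k values listed above:

```python
# Blending of SST model constants (Equation 14.27):
# F1 = 1 near the wall (k-omega branch), F1 = 0 in the free stream
# (k-epsilon branch); intermediate F1 interpolates linearly.
def blend(phi_1: float, phi_2: float, F1: float) -> float:
    return phi_1 * F1 + phi_2 * (1.0 - F1)

sigma_k1, sigma_k2 = 0.85, 1.0   # inner and outer values from the list above
print(blend(sigma_k1, sigma_k2, 1.0))   # near-wall value: 0.85
print(blend(sigma_k1, sigma_k2, 0.0))   # free-stream value: 1.0
print(blend(sigma_k1, sigma_k2, 0.5))   # mid-blend: 0.925
```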
14.2.4 k–ε–ζ–f model

Incorporating Durbin's [29] elliptic relaxation concept, a new eddy-viscosity turbulence model comprising four equations, denoted k–ε–ζ–f, was developed by Hanjalic et al. [30]. The eddy viscosity is obtained in the following form:

$$\nu_t = C_\mu \zeta \frac{k^2}{\varepsilon} \tag{14.28}$$

moreover, the rest of the variables follow from the set of model equations:

$$\frac{\partial\left(\alpha_k\rho_k k_k\right)}{\partial t} + \frac{\partial}{\partial x_j}\left(\alpha_k\rho_k v_k k_k\right) = \frac{\partial}{\partial x_j}\left[\alpha_k\left(\mu_k + \frac{\mu_{kt}}{\sigma_k}\right)\frac{\partial k_k}{\partial x_j}\right] + \alpha_k\rho_k\left(P_k - \varepsilon_k\right) \tag{14.29}$$

$$\frac{\partial\left(\alpha_k\rho_k \varepsilon_k\right)}{\partial t} + \frac{\partial}{\partial x_j}\left(\alpha_k\rho_k v_k \varepsilon_k\right) = \frac{C^*_{\varepsilon 1} P_k \alpha_k \varepsilon_k - C_{\varepsilon 2}\alpha_k \varepsilon_k^2}{k_k} + \frac{\partial}{\partial x_j}\left[\alpha_k\left(\mu_k + \frac{\mu_{kt}}{\sigma_\varepsilon}\right)\frac{\partial \varepsilon_k}{\partial x_j}\right] \tag{14.30}$$

$$\frac{\partial\left(\alpha_k\rho_k \zeta_k\right)}{\partial t} + \frac{\partial}{\partial x_j}\left(\alpha_k\rho_k v_k \zeta_k\right) = \rho f - \rho\frac{\zeta}{k}P_k + \frac{\partial}{\partial x_j}\left[\alpha_k\left(\mu_k + \frac{\mu_{kt}}{\sigma_\zeta}\right)\frac{\partial \zeta_k}{\partial x_j}\right] \tag{14.31}$$

The elliptic relaxation function f satisfies

$$f - L^2\frac{\partial^2 f}{\partial x_j \partial x_j} = \left(C_1 + C_2\frac{P_k}{\varepsilon}\right)\frac{\frac{2}{3} - \zeta}{T} \tag{14.32}$$

with the turbulent time and length scales

$$T = \max\left[\min\left(\frac{k}{\varepsilon},\; \frac{a}{\sqrt{6}\,C_\mu\zeta\left|S\right|}\right),\; C_T\left(\frac{\nu}{\varepsilon}\right)^{1/2}\right] \tag{14.33}$$

$$L = C_L\max\left[\min\left(\frac{k^{3/2}}{\varepsilon},\; \frac{k^{1/2}}{\sqrt{6}\,C_\mu\zeta\left|S\right|}\right),\; C_\eta\left(\frac{\nu^3}{\varepsilon}\right)^{1/4}\right] \tag{14.34}$$

An additional modification to the ε-equation is that the constant $C_{\varepsilon 1}$ is dampened close to the wall:

$$C^*_{\varepsilon 1} = C_{\varepsilon 1}\left(1 + 0.045\sqrt{1/\zeta}\right) \tag{14.35}$$
14.3 Grid independence test and solution methodology

The computations have been performed with the commercial software AVL Fire [32] for the k–ζ–f turbulence model and with Fluent 6.3 (2016) [33] for the remaining models.
The grid independence check has been performed according to the ERCOFTAC [30] guidelines and as described in previous papers [34,35]. The optimum (numerically stable) grid has been determined through computations at different refinement levels near the curved surface (first grid point from the wall at 80, 40, and 20 µm). It has been found that when the grid resolves the viscous sublayer down to a y+ value below two (first grid point at 20 µm from the wall), the jet deflection angle becomes independent of the grid. In addition, an x+ value below 60 is required for a stable solution of the flow along the downstream direction. Four turbulence models have been employed: the Spalart–Allmaras (SA) model [26], the SST k–ω model [28], the k–ε model with enhanced wall treatment [27], and the k–ζ–f model [30].
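The first-cell height needed for a target y+ can be estimated before meshing from a flat-plate skin-friction correlation. The sketch below is only a rough pre-meshing estimate under assumed air properties and reference length; the chapter's figures of 20 µm and y+ < 2 come from its own geometry and velocities:

```python
import math

def first_cell_height(U: float, L: float, y_plus: float,
                      rho: float = 1.225, nu: float = 1.5e-5) -> float:
    """Estimate the wall-normal height (m) of the first cell for a target y+.

    Uses the flat-plate correlation Cf = 0.026 / Re_L**(1/7); the actual
    Coanda-surface boundary layer will differ, so treat this as a
    pre-meshing estimate only.
    """
    re_l = U * L / nu
    cf = 0.026 / re_l ** (1.0 / 7.0)
    tau_w = 0.5 * cf * rho * U * U        # wall shear stress (Pa)
    u_tau = math.sqrt(tau_w / rho)        # friction velocity (m/s)
    return y_plus * nu / u_tau

# e.g. U = 40 m/s, reference length 0.1 m, target y+ = 1:
# yields a spacing of a few micrometres
print(first_cell_height(40.0, 0.1, 1.0))
```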
In this chapter, the discretization error has been minimized by using the second-order upwind scheme for the momentum equation and for the modified turbulent viscosity in the Spalart–Allmaras (SA) model (Equation 14.3). Pressure and velocity have been coupled through the SIMPLE (Semi-Implicit Method for Pressure-Linked Equations) algorithm [36].
The first-order implicit method has been used to discretize the unsteady term. The advantage of the fully implicit scheme is that it is unconditionally stable with respect to the time step size; the time step has been taken as Δt = 1 × 10⁻³ s. Figure 14.3 (a–c) shows the residual RMS plots for three of the models; for the SA model the error is in the range 10⁻⁹–10⁻¹⁰, the lowest among the turbulence models compared.
(a) SA model: residuals (continuity, x-velocity, y-velocity, ν̂) between 1E−09 and 1E−10 over iterations 25,000–30,000. (b) SST k–ω model: residuals (continuity, x-velocity, y-velocity, k, ω) between 1E−05 and 1E−10 over iterations 71,000–76,000. (c) k–ε model: residuals (continuity, x-velocity, y-velocity, k, ε) between 1E−03 and 1E−06 over iterations 4,000–9,000.
Figure 14.3 Residual plot for different turbulence models: (a) SA, (b) SST k–ω, (c) k–ε model.
Moreover, the k–ζ–f model has been found to be more numerically stable than the other models; it has therefore been used for this computational study.

14.4 Results and discussion

For a V/STOL application the flow velocity will clearly reach the compressible range. In the present work, however, the study starts from incompressible flow and extends to the verge of compressible flow (M = 0.3).
For the geometry of Figure 14.1, the radius of exit curvature (R) is 101.566 mm, the exit throat diameter (b) is 40.163 mm, the inlet diameters are d1 = d2 = 56 mm, and the ratio b/R is 0.395. The average velocity and the velocity ratio are held constant for each computation, with average velocities in the range 20–40 m/s. Initial observation of the computational results shows that the jet deflection angle is a function of the velocity ratio. The flow is visualized through velocity contours in Figures 14.4 through 14.6 for selected cases. These contours reveal that the highest velocity occurs at the nozzle exit attached to the upper curvature: the upper nozzle velocity is designated V1 and the lower nozzle velocity V2, and since their ratio is always greater than one, the flow attaches near the upper exit curvature, where the maximum velocity occurs. Consequently, the boundary layer thickness is very small, of the order of micrometers.
In the second configuration, shown in Figure 14.2, the throat diameter (b) is 46 mm, the radius of curvature (R) is 179.543 mm, the inlet diameters are d1 = d2 = 77 mm, and the b/R ratio is 0.256. The velocity contours are shown in Figure 14.7. Complete attachment of the flow to the Coanda surface is seen for velocity ratios greater than 1.3. The reason for the larger attachment angle can be explained through the velocity vectors (Figure 14.8) and the velocity plots (Figures 14.9 and 14.10).
Figure 14.8 depicts the velocity profile at the exit of the nozzle. The subsequent plots in Figures 14.9 and 14.10 show the velocity profile near the exit and far from the exit, respectively. Near the exit, the pull of the low-velocity jet toward the high-velocity jet is weak; as the flow progresses, the effect becomes more pronounced. Far from the exit, the velocity profile attains a higher gradient, and in effect the low-velocity jet is drawn toward the high-velocity jet.
Figure 14.4 Velocity contours for V1/V2 = 1.3 (θ = 6.158°), 1.8 (θ = 11.943°), 2.5 (θ = 17.996°), and 6 (θ = 24.035°).
Therefore, it can be said that decreasing the b/R ratio from 0.395 to 0.256 yields a larger attachment angle. The jet adhesion angle is thus a strong function of the b/R ratio, a geometric parameter, which is the main controlling parameter for the jet adhesion angle.
After computation over the above-mentioned velocity ranges, the relation between the velocity ratio (V1/V2) and the jet deflection angle (θ) is plotted in Figure 14.11. For average velocities of 20, 25, and 30 m/s, the deflection angle has nearly the same value up to large velocity ratios. For Vav = 35 m/s, there is little difference in the jet
Figure 14.5 Velocity contours, including V1/V2 = 1.3 (θ = 6.729°) and 1.8 (θ = 12.346°).
deflection angle from V1/V2 = 4 onward; for lower velocity ratios it is nearly the same. In other words, the rate of increase of the deflection angle for an average velocity of 35 m/s is higher than for the lower velocities (20, 25, and 30 m/s) and for 40 m/s. For Vav = 40 m/s, the deflection angle is larger than for the other average velocities. The reason may lie in the fact that, at this inlet velocity, the exit velocity of the jet is quite high, near Mach number 0.3. Nevertheless, the slope is the same as for 20, 25, and 30 m/s over all velocity ratios. Now, the different behaviors of the slope for
Figure 14.6 Velocity contours for V1/V2 = 1.3 (θ = 6.912°), 1.8 (θ = 13.125°), 2.5 (θ = 19.545°), and 6 (θ = 26.822°).
average velocity 35 m/s at larger velocity ratios have been investigated in detail. Investigating the reason for this aberration requires examining the flow phenomena in detail; therefore, Reexit and Reflow have been calculated for the whole average-velocity range, as shown in Table 14.1.
An interesting phenomenon has been observed: for the average velocities 20, 25, 30, and 40 m/s, the exit Reynolds number is always higher than the flow Reynolds number over the exit curvature. This behavior points to laminarization of the flow; for most cases the flow Reynolds numbers are of the order of 10⁵ (we are using the word
Figure 14.7 Velocity contours for different velocity ratios and average velocities: V1/V2 = 2 at Vav = 30 m/s, V1/V2 = 4 at Vav = 25 m/s, V1/V2 = 1.3 at Vav = 35 m/s, and V1/V2 = 1.14 at Vav = 37.5 m/s.
Figure 14.8 Velocity vector at different positions of the outer wall of the Coanda surface for
V1/V2 = 1.14 and Vav = 37.5 m/s.
Figure 14.9 Velocity profile (velocity magnitude vs. position) near the exit of the nozzle at stations x = 0.05, 0.10, 0.15, and 0.20 m for V1/V2 = 1.14 and Vav = 37.5 m/s.
Figure 14.10 Velocity profile (velocity magnitude vs. position) far from the exit of the nozzle at stations x = 0.30, 0.40, 0.50, and 0.70 m for V1/V2 = 1.14 and Vav = 37.5 m/s.
Figure 14.11 Jet deflection angle θ versus velocity ratio V1/V2 for Vav = 20, 25, 30, 35, and 40 m/s.
Table 14.1 Exit and flow Reynolds numbers for different velocity ratios and average velocities

Vav = 35 m/s
  V1/V2    Re_exit      Re_flow
  1.3      2.695E+05    3.135E+05
  1.8      2.695E+05    3.135E+05
  2.5      2.302E+05    2.679E+05
  6.0      1.110E+05    1.291E+05

Vav = 30 m/s
  1.3      1.977E+05    1.572E+05
  1.8      2.310E+05    1.836E+05
  2.5      9.467E+04    7.525E+04
  6.0      7.989E+04    6.350E+04

Vav = 25 m/s
  1.3      7.870E+04    2.116E+04
  1.8      1.646E+05    8.599E+04
  2.5      7.888E+04    6.227E+04
  6.0      6.658E+04    7.012E+04

Vav = 20 m/s
  1.3      5.28E+04     1.435E+04
  1.8      1.32E+05     6.942E+04
  2.5      1.32E+05     1.004E+05
  6.0      1.31E+05     1.392E+05
"laminarization" because the flow retards on the curved surface relative to the flow at the exit of the nozzle). Very few cases fall below this range; but for the average velocity of 35 m/s, the flow accelerates over the curved surface, and consequently the flow Reynolds number is higher than the exit Reynolds number. We thus reach the point that the flow over the curved surface can exhibit two phenomena: one is laminarization and the other is acceleration. These two distinct behaviors give different characteristics of the jet deflection angle at higher velocity ratios. To view this behavior from another angle, the Mach number has been calculated at the exit flow velocity and found to be around 0.3. This is near compressible flow behavior, and for higher average velocities the exit Mach number is slightly above this value. These two flow behaviors should be further investigated experimentally.
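The two Reynolds numbers compared above follow the definitions given in the Nomenclature; a short sketch using the second geometry's dimensions and illustrative (not the chapter's) velocities:

```python
import math

NU_AIR = 1.5e-5        # kinematic viscosity of air (m^2/s), assumed value
B = 0.046              # exit throat diameter (m), second geometry
R = 0.179543           # Coanda surface radius (m), second geometry

def re_exit(v_exit_av: float) -> float:
    """Re at the nozzle exit: V_exit,av * b / nu (see Nomenclature)."""
    return v_exit_av * B / NU_AIR

def re_flow(v_flow_av: float, theta_deg: float) -> float:
    """Re over the curved surface: V_flow,av * R*theta / nu."""
    arc = R * math.radians(theta_deg)   # arc length R*theta (m)
    return v_flow_av * arc / NU_AIR

# Comparing the two numbers indicates laminarization (Re_exit > Re_flow)
# or acceleration over the curved surface (Re_flow > Re_exit).
print(re_exit(90.0), re_flow(70.0, 20.0))
```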
Our next attempt is to envisage the effect of laminarization and of flow acceleration upon the thrust. Here, thrust means the force exerted by the flow along the flow direction and normal to it. The thrust in the x-direction contributes to cruising, and the normal thrust in the y-direction contributes to the lift force of the aircraft. It has been calculated by multiplying the mass flux by the exit velocity, as shown in Table 14.2. The force normal to the flow direction can contribute to the lift force. It is observed from Table 14.2 that the normal force is maximum for velocity ratios in the range 1.8–2.5. This is a strong function of the flow phenomena over the curved surface depicted above. Therefore, the force that contributes to lift is maximized in the velocity-ratio range of 1.8–2.5. Through this computational study, the range of maximum force has been identified, which is vital for the design of such a nozzle for maximum lift force.
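Using the deflection angle, the exit thrust can be split into stream-wise and normal components. The decomposition by θ below is an illustrative assumption, since the chapter states only that the thrust is obtained from the mass flux times the exit velocity:

```python
import math

def thrust_components(m_dot: float, v_exit: float, theta_deg: float):
    """Thrust F = m_dot * v_exit, split into stream-wise (Fx) and
    normal (Fy) parts using the jet deflection angle theta.

    Splitting F with theta is an illustrative assumption here, not the
    chapter's stated method; m_dot in kg/s, v_exit in m/s, theta in degrees.
    """
    f = m_dot * v_exit
    theta = math.radians(theta_deg)
    return f * math.cos(theta), f * math.sin(theta)

# Illustrative values: 2 kg/s at 100 m/s deflected by 20 degrees.
fx, fy = thrust_components(m_dot=2.0, v_exit=100.0, theta_deg=20.0)
print(fx, fy)   # Fy/Fx equals tan(20 deg)
```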
Until now, we have discussed the influence of the given parameters on the flow. For the given velocity ratio at which maximum thrust is obtained, the jet deflection angle is a function of the b/R ratio.
For the velocity ratios 1.3, 2, and 4, complete attachment of the flow over the exit curvature is seen for the lower value of the b/R ratio. This parameter has a strong
Table 14.2 Thrust components for different velocity ratios and average velocities

Vav = 35 m/s
  V1/V2    Fx (N)     Fy (N)
  1.3      470.639    32.272
  1.8      470.634    56.322
  2.5      402.063    73.827
  6.0      193.852    57.003

Vav = 30 m/s
  1.3      295.984    23.385
  1.8      345.773    43.333
  2.5      141.717    27.383
  6.0      119.593    35.587

Vav = 25 m/s
  1.3      98.175     6.280
  1.8      205.371    28.274
  2.5      98.407     18.219
  6.0      83.056     23.674

Vav = 20 m/s
  1.3      52.668     3.587
  1.8      131.428    18.13
  2.5      131.229    24.462
  6.0      130.925    31.676
influence on the attachment of the flow and can be adjusted to obtain the desired propulsive thrust.
14.5 Conclusions
The flow behavior has been studied in detail, and the flow and geometric parameters for the design of such a nozzle have been identified through computational fluid dynamics analysis. The influence of the b/R ratio proves stronger than that of the other parameters; this parameter can therefore serve as one of the controlling elements in designing the nozzle for maximum lift and cruising velocity. The range of V1/V2 for maximum thrust has been defined. A correlation for thrust and deflection angle will be developed once an experimental database is available.
Acknowledgments
Some of the computational works of the present chapter were performed as part of Project
ACHEON (Aerial Coanda High Efficiency Orienting-jet Nozzle) with ref. 309041, sup-
ported by the European Union through the 7th Framework Programme during the stay
of the first author at UNIMORE Italy. Some of the computational studies were performed
recently using AVL-Fire. The first author acknowledges AVL List GmbH, Hans-List-Platz
1, A-8020, Graz, Austria for providing AVL AST software for research and development
purposes under the University Partnership Program.
Nomenclature
b = Exit throat diameter of the nozzle (m)
d1 = Diameter of the upper nozzle (m)
d2 = Diameter of the lower nozzle (m)
Fx = X component of the resultant thrust at the exit of the nozzle (N)
Fy = Y component of the resultant thrust at the exit of the nozzle (N)
P = Mean pressure (N/m2)
R = Radius of the Coanda surface attached with the nozzle exit (m)
Reexit = Reynolds number of the flow at the exit of the nozzle, Vexit,av × b/ν (–)
Reflow = Reynolds number of the flow at the curved surface of the nozzle, Vflow,av × Rθ/ν (–)
Ui = Reynolds averaged velocity tensor (m/s)
u’i = Fluctuating velocity tensor (m/s)
V1 = Velocity of the flow in upper nozzle (m/s)
V2 = Velocity of the flow in lower nozzle (m/s)
References
1. Yoshitani, N., Hashimoto, S.-I., Kimura, T., Motohashi, K., and Ueno, S., “Flight Control
Simulators for Unmanned Fixed-Wing and VTOL Aircraft,” ICROS-SICE International Joint
Conference 2009, August 18–21, 2009, Fukuoka International Congress Center, Japan.
2. Thomason, T., “Bell-Boeing JVX Tilt Rotor Program – Flight Test Program,” American Institute
of Aeronautics and Astronautics, 1983, AIAA Paper No. 83-2726.
3. Saeed, B., Gratton, G., and Mares, C., “A Feasibility Assessment of Annular Winged VTOL
Flight Vehicles,” Aeronautical Journal, Vol. 115, 2011, pp. 683–692.
4. Kim, H., Rajesh, G., Setoguchi, T., and Matsuo, S., “Optimization Study of a Coanda Ejector,”
Journal of Thermal Science, Vol. 15, No. 4, 2006, pp. 331–336.
5. Alvi, F., Strykowski, P., Krothapalli, A., and Forliti, D., “Vectoring Thrust in Multiaxis Using
Confined Shear Layers,” Journal of Fluids Engineering, Vol. 122, No. 1, 2000, pp. 3–13.
6. Trancossi, M., Dumas, A., Giuliani, I., and Baffigi, I., "Ugello capace di deviare in modo dinamico e controllabile un getto sintetico senza parti meccaniche in movimento e suo sistema di controllo," Patent No. RE2011A000049, Italy, 2011.
7. Freund J. B. and Mungal, M. G., “Drag and Wake Modification of Axisymmetric Bluff Bodies
Using Coanda Blowing,” Journal of Aircraft, Vol. 31, No. 3, May–June 1994, pp. 572–578.
8. Chng, T. L., Rachman, A., Tsai, H. M., and Zha, Ge-C., “Flow Control of an Airfoil via Injection
and Suction,” Journal of Aircraft, Vol. 46, No. 1, 2009, pp. 291–300.
9. Lee, D.-W., Hwang, J.-G., Kwon, Y.-D., Kwon, S.-B., Kim, G-Y., and Lee, D.-E., “A Study on the
Air Knife Flow with Coanda Effect,” Journal of Mechanical Science and Technology, Vol. 21, 2007,
pp. 2214–2220.
10. Lalli, F., Bruschi, A., Lama, R., Liberti, L., Mandrone, S., and Pesarino, V., “Coanda Effects in
Coastal Flows,” Coastal Engineering, Vol. 57, 2010, pp. 278–289.
11. Collis, S.S., Joslin, R.D., Seifert, A., and Theofilis, V., “Issues in Active Flow Control: Theory,
Control, Simulation, and Experiment,” Progress in Aerospace Science, Vol. 40, 2004, pp. 237–289.
12. Florin, F., Alexandru, D., Octavian, P., and Horia, D., "Control of Two-Dimensional Turbulent Wall Jet on a Coanda Surface," Proceedings in Applied Mathematics and Mechanics, Vol. 11, 2011, pp. 651–652.
13. Mabey, K., Smith, B., Whichard, G., and McKechnie, T., “Coanda-Assisted Spray Manipulation
Collar for a Commercial Plasma Spray Gun,” Journal of Thermal Spray Technology, Vol. 20, No. 4,
2011, pp. 782–790.
14. Kim, H., Rajesh, G., Setoguchi, T., and Matsuo, S., “Optimization Study of a Coanda Ejector,”
Journal of Thermal Science, Vol. 15, No. 4, 2006, pp. 331–336.
15. Vanierschot, M., Persoons, T., and Van den Bulck, E., “A New Method for Annular Jet Control
Based on Cross-Flow Injection,” Physics of Fluids, Vol. 21, 2009, pp. 025103-1–025103-9.
16. Young, T., “Outlines of Experiments and Inquires Respecting Sound and Light,” Philosophical
Transactions of Royal Society of London, Vol. 90, 1 January 1800, pp. 106–150.
17. Rayleigh, L., “On the Dynamics of Revolving Fluid,” Proceedings of Royal Society of London,
Series A, Vol. 93, No. 648, 1 March, 1917, pp. 148–154.
18. Newman, B. G., The Deflexion of Plane Jets by Adjacent Boundaries, in Coanda Effect, In
Boundary Layer and Flow Control, edited by G. V. Lachmann, Vol. 1, Pergamon Press, Oxford,
1961, pp. 232–264.
19. Carpenter, P. W., and Green, P. N., “The Aeroacoustics and Aerodynamics of High-Speed
Coanda Devices, Part 1: Conventional Arrangement of Exit Nozzle and Surface,” Journal of
Sound and Vibration, Vol. 208, No. 5, 1997, pp. 777–801.
20. Trancossi, M., “An Overview of Scientific and Technical Literature on Coanda Effect Applied
to Nozzles,” SAE Technical Papers No. 2011-01-2591, Issn 0148-7191, 2011.
21. Wille, R., and Fernholtz, H., “Report on the First European Mechanics Colloquium, on the
Coanda Effect,” Journal of Fluid Mechanics, Vol. 23, No. 4, 1965, pp. 801–819.
22. Subhash, M., and Dumas, A., “Computational Study of Coanda Adhesion over Curved
Surface,” SAE International Journal of Aerospace, 2013, paper number: 13ATC-0018/2013-01-2302
(Accepted for Publication).
23. Fekete, G. I., “Coanda Flow of a Two-Dimensional Wall Jet on the Outside of a Circular
Cylinder,” Mechanical Engineering Research Laboratories, Rept. 63-11, McGill University,
1963.
24. Neuendorf, R., and Wygnansky, I., “On a Turbulent Wall Jet Flowing over a Circular Cylinder,”
Journal of Fluid Mechanics, Vol. 381, 1999, pp. 1–25.
25. Patankar, U., and Sridhar, K., “Three-Dimensional Curved Wall Jets,” Journal of Basic
Engineering, Vol. 94, No. 2, 1972, pp. 339–344.
26. Spalart, P. R., and Allmaras, S. R., “A One-Equation Turbulence Model for Aerodynamic
Flows,” AIAA Paper No. 92-0439, 1992.
27. Launder, B. E., and Sharma, B. I., “Application of the Energy Dissipation Model of Turbulence
to the Calculation of Flow Near a Spinning Disc,” Letters in Heat and Mass Transfer, Vol. 1, No.
2, 1974, pp. 131–138.
Chapter fourteen: Computational study of the Coanda flow for V/STOL 283
28. Menter, F. R., “Two-Equation Eddy-Viscosity Turbulence Models for Engineering Applications,”
AIAA Journal, Vol. 32, No. 8, August 1994, pp. 1598–1605.
29. Durbin, P. A., “Separated Flow Computations with the k-ε-v2 Model,” AIAA Journal, Vol. 33,
1995, pp. 659–664.
30. Hanjalic, K., Popovac, M., and Hadziabdic, M., “A Robust Near-Wall Elliptic-Relaxation Eddy-
Viscosity Turbulence Model for CFD,” International Journal of Heat Fluid Flow, Vol. 25, No. 6,
2004, pp. 1047–1051.
31. ANSYS Fluent User manual, 2016.
32. AVL-Fire User manual, 2014.
33. Casey, M., and Wintergerste, T., “ERCOFTAC Special Interest Group on ‘Quality and Trust in
Industrial CFD’ Best Practice Guidelines,” Version 1.0, January 2000.
34. Rizzi, A., and Vos, J., “Towards Establishing Credibility in Computational Fluid Dynamics,”
AIAA Journal, Vol. 36, No. 5, 1998, pp. 668–675.
35. Celik, I., Li, J., Hu, G., and Shaffer, C., “Limitations of Richardson Extrapolation and Some
Possible Remedies,” Journal of Fluids Engineering, Vol. 127, July 2005, pp. 795–805.
36. Patankar, S. V., and Spalding, D. B., “A Calculation Procedure for Heat, Mass and Momentum
Transfer in Three-Dimensional Parabolic Flows,” International Journal of Heat and Mass Transfer,
Vol. 15, 1972, pp. 1787–1806.
Chapter fifteen
Introduction to collocation method
Contents
15.1 Introduction 285
15.2 Collocation method 286
15.3 B-spline 287
  15.3.1 B-spline of degree zero 287
  15.3.2 First-degree (linear) B-spline 288
  15.3.3 Second-degree (quadratic) B-spline 288
15.4 Characteristics of B-spline basis functions 289
15.5 Types of B-spline 289
  15.5.1 Trigonometric B-spline basis functions 289
  15.5.2 Exponential B-spline basis functions 290
15.6 Methodology: Collocation method using B-spline basis function 290
15.7 Numerical solution of advection diffusion equation using collocation method 292
  15.7.1 Using B-spline basis functions 292
  15.7.2 Using trigonometric B-spline basis functions 293
15.8 Numerical example 295
References 296
15.1 Introduction
Due to the wide occurrence and applicability of ordinary and partial differential equations in various branches of science and engineering, a variety of nonlinear systems of initial and boundary value problems have been studied extensively in the literature. Many mathematical models of engineering problems can be expressed as partial differential equations, for example in describing the physics of various phenomena in science, the physical laws of fluid flow and diffusion in transport problems, electromagnetic waves, neural networks, tissue engineering, and quantum phenomena. These are some of the application areas in which the phenomena or processes of interest are naturally described as initial and boundary value problems. Since it is not always feasible to obtain analytical solutions of the resulting model equations, advanced numerical methods are needed.
A variety of numerical methods are available for obtaining approximate solutions of partial differential equations. Two of the most popular techniques are the finite difference method and the finite element method. In the finite difference method, the solution is computed at a finite number of points by approximating the derivatives at each of the selected points; the accuracy of the method depends on the refinement of the grid of points at which the solution is evaluated [1]. In the finite element method, the domain is divided into a finite number of elements, with nodes allocated at predefined locations on the element boundaries [2]. The elements and nodes form a mesh that can be refined to minimize the error.
In the last few years, the collocation method, a type of finite element method, has emerged as a popular technique for solving ordinary and partial differential equations. The method was developed from the finite element method using concepts of the finite difference method, and has been applied with many different types of basis functions with the aim of obtaining the best possible numerical solutions of linear and nonlinear mathematical problems. It involves satisfying the differential equation, to some tolerance, at a selected finite number of points, called collocation points.
In this chapter, the collocation method is discussed using B-spline basis functions in both standard and trigonometric form. A numerical problem involving an advection diffusion equation is solved to describe the application of the method in detail, and the results are presented in tables of absolute and maximum absolute errors.
15.2 Collocation method
In the collocation method, the numerical solution of a differential equation is obtained as a linear combination of basis functions with unknown coefficients to be determined. In this approach, a given function is approximated by a polynomial at collocation points chosen in some predefined way, either uniformly or nonuniformly. Consider the application of this approach to a general differential equation of the form
f(x, u, u_x) = 0    (15.1)
to be solved by the collocation method on the domain [x_L, x_R], with the known boundary conditions

u(x_L) = ϕ_0,   u(x_R) = ϕ_1    (15.2)

The approximate solution is expressed as a linear combination of basis functions:

U = Σ_{i=1}^{N} c_i ϕ_i(x)    (15.3)
Here, the c_i are the unknowns to be calculated, and N is the total number of domain partitions. The number of partitions also affects the performance of the method: the more domain partitions, the closer the approximate solution is to the exact solution.
To apply the approach of the collocation method, the approximated solution value at the
boundary is taken from the boundary conditions, and the solution is obtained at internal
node points.
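The procedure above can be sketched in a few lines of code. The example below is purely illustrative (the boundary value problem, the polynomial basis, and all names are choices made for this sketch, not taken from the chapter): it solves u″ = −1 with u(0) = u(1) = 0 by collocating at two interior points, using basis functions that already satisfy the boundary conditions.

```python
import numpy as np

# Basis functions vanishing at x = 0 and x = 1, and their second derivatives
phi = [lambda x: x * (1 - x), lambda x: x**2 * (1 - x)]
d2phi = [lambda x: -2.0, lambda x: 2.0 - 6.0 * x]

# Collocation points: the equation u'' + 1 = 0 is enforced exactly here
pts = [1.0 / 3.0, 2.0 / 3.0]

# Assemble and solve the collocation system A c = b
A = np.array([[d2(xk) for d2 in d2phi] for xk in pts])
b = np.full(len(pts), -1.0)
c = np.linalg.solve(A, b)

def U(x):
    """Approximate solution U(x) = sum_i c_i phi_i(x)."""
    return sum(ci * p(x) for ci, p in zip(c, phi))
```

Because the exact solution x(1 − x)/2 lies in the span of the chosen basis (c = (1/2, 0)), the collocation equations recover it exactly; in general, the residual of the differential equation vanishes only at the collocation points.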
Chapter fifteen: Introduction to collocation method 287
15.3 B-spline
The theory of B-spline functions is well established for obtaining approximate numerical solutions of boundary value problems, whether ordinary or partial differential equations, owing to their distinct properties.
Schoenberg [3], in 1946, was the first researcher to use the term B-spline ("B" refers to basis) in his mathematical research. He described the B-spline as a short form of basis spline, representing a smooth, piecewise polynomial. The concept of the B-spline is an extended form of splines with some additional properties: a B-spline basis function is a spline function defined upon a knot sequence x_i, having minimal support with respect to a given degree, smoothness, and domain partition.
The first definition of the B-spline basis functions was given by Schumaker [4] using the idea of divided differences. Subsequently, a recurrence relation was obtained independently by Cox [5] and de Boor [6] in the early 1970s to compute B-splines of various orders and degrees. This recursive formula calculates the mth B-spline basis function of the lth degree by implementation of Leibniz's theorem, and can be stated as follows:

B_{m,l}(x) = V_{m,l} B_{m,l−1}(x) + (1 − V_{m+1,l}) B_{m+1,l−1}(x)    (15.4)

where V_{m,l} = (x − x_m)/(x_{m+l} − x_m).

This is the well-known Cox–de Boor recursion formula, which calculates a B-spline basis function of a particular degree as a linear combination of basis functions of smaller degree. Here B_{m,l}(x) is the mth B-spline basis function of degree l, and x is a parameter variable.
The recurrence relation (15.4) can be used with l = 1, starting from the zero-degree B-splines defined below, to generate the first-degree B-splines, and then in turn to construct the higher-order basis functions: the basis function B_{m,l}(x) for degree l ≥ 1 can hence be written as a linear combination of two basis functions of degree l − 1.
15.3.1 B-spline of degree zero

B_{m,0}(x) = 1,   x ∈ [x_m, x_{m+1})
           = 0,   otherwise    (15.5)
From the definition, a zero-degree B-spline is a function that takes the value one on the half-open interval [x_m, x_{m+1}) and is zero at all other points. The appearance of the zero-degree B-spline is as presented in Figure 15.1.
15.3.2 First-degree (linear) B-spline

B_{m,1}(x) = (x − x_m)/(x_{m+1} − x_m),         x ∈ [x_m, x_{m+1})
           = (x_{m+2} − x)/(x_{m+2} − x_{m+1}),  x ∈ [x_{m+1}, x_{m+2})    (15.6)
           = 0,                                  otherwise

15.3.3 Second-degree (quadratic) B-spline

B_{m,2}(x) = (x − x_m)² / ((x_{m+2} − x_m)(x_{m+1} − x_m)),   x ∈ [x_m, x_{m+1})
           = (x − x_m)(x_{m+2} − x) / ((x_{m+2} − x_m)(x_{m+2} − x_{m+1}))
             + (x_{m+3} − x)(x − x_{m+1}) / ((x_{m+3} − x_{m+1})(x_{m+2} − x_{m+1})),   x ∈ [x_{m+1}, x_{m+2})    (15.7)
           = (x_{m+3} − x)² / ((x_{m+3} − x_{m+1})(x_{m+3} − x_{m+2})),   x ∈ [x_{m+2}, x_{m+3})
           = 0,   otherwise
From the definition it can be concluded that this second-degree B-spline basis function is nonzero over three consecutive knot spans; its curve is depicted in Figure 15.3.
Using a similar approach, the formula for higher-degree B-splines can be obtained.
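The Cox–de Boor recursion translates directly into code. The following sketch (the function name and the knot choice are illustrative assumptions, not from the chapter) evaluates B_{m,l}(x) starting from the zero-degree case, and its output can be checked against the closed forms (15.6) and (15.7):

```python
def bspline(m, l, x, t):
    """Evaluate the B-spline basis function B_{m,l}(x) on the knot
    sequence t using the Cox-de Boor recursion."""
    if l == 0:
        # Zero-degree B-spline: indicator function of the half-open knot span
        return 1.0 if t[m] <= x < t[m + 1] else 0.0
    # Linear combination of two basis functions of degree l - 1
    left = (x - t[m]) / (t[m + l] - t[m]) * bspline(m, l - 1, x, t)
    right = (t[m + l + 1] - x) / (t[m + l + 1] - t[m + 1]) * bspline(m + 1, l - 1, x, t)
    return left + right

knots = list(range(11))   # uniform integer knots 0, 1, ..., 10
```

On these knots, bspline(0, 2, 1.0, knots) returns 0.5, which matches the second piece of (15.7) evaluated at x = 1, and the quadratic basis functions sum to one away from the ends of the knot range (partition of unity).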
15.5 Types of B-spline
B-spline basis functions are also available in trigonometric and exponential forms. The most commonly used basis functions are the cubic B-splines (B-splines of degree three), because they are symmetric with respect to the origin. The trigonometric and exponential B-spline basis functions of degree three are given below.
15.5.1 Trigonometric B-spline basis functions

TB_m(x) = (1/w) p³(x_m),   x ∈ [x_m, x_{m+1})
        = (1/w) [p(x_m)(p(x_m)q(x_{m+2}) + q(x_{m+3})p(x_{m+1})) + q(x_{m+4})p²(x_{m+2})],   x ∈ [x_{m+1}, x_{m+2})
        = (1/w) [q(x_{m+4})(p(x_{m+1})q(x_{m+3}) + q(x_{m+4})p(x_{m+2})) + p(x_m)q²(x_{m+3})],   x ∈ [x_{m+2}, x_{m+3})    (15.8)
        = (1/w) q³(x_{m+4}),   x ∈ [x_{m+3}, x_{m+4})
where
h = (b − a)/n is the step size for the domain x ∈ [a, b], and
p(x_m) = sin((x − x_m)/2),   q(x_m) = sin((x_m − x)/2),   w = sin(h/2) sin(h) sin(3h/2).
This is a piecewise cubic trigonometric function with geometric properties such as C² continuity, nonnegativity, and partition of unity.
15.5.2 Exponential B-spline basis functions

EB_m(x) = b_2((x_{m−2} − x) − (1/p) sinh(p(x_{m−2} − x))),   x ∈ [x_{m−2}, x_{m−1})
        = a_1 + b_1(x_m − x) + c_1 exp(p(x_m − x)) + d_1 exp(−p(x_m − x)),   x ∈ [x_{m−1}, x_m)
        = a_1 + b_1(x − x_m) + c_1 exp(p(x − x_m)) + d_1 exp(−p(x − x_m)),   x ∈ [x_m, x_{m+1})    (15.9)
        = b_2((x − x_{m+2}) − (1/p) sinh(p(x − x_{m+2}))),   x ∈ [x_{m+1}, x_{m+2})

where
a_1 = phc/(phc − s),   b_1 = (p/2)(c(c − 1) + s²)/((phc − s)(1 − c)),   b_2 = p/(2(phc − s)),
c_1 = (1/4)(e^{−ph}(1 − c) + s(e^{−ph} − 1))/((phc − s)(1 − c)),
d_1 = (1/4)(e^{ph}(c − 1) + s(e^{ph} − 1))/((phc − s)(1 − c)),
c = cosh(ph),   s = sinh(ph),   and h = (b − a)/n is the step size for the domain x ∈ [a, b].
15.6 Methodology: Collocation method using B-spline basis function
The approximate solution is written as

U(x, t) = Σ_{j=m−l+2}^{m+l−2} c_j B_j(x)    (15.10)

Here, l defines the degree of the B-spline basis functions, m defines the number of collocation points, and the c_j are constants to be calculated from the resulting matrix system, which can be solved using any numerical method.
The formula for the cubic B-spline basis function, constructed from the definition and the second-degree B-spline basis function, was first given by Prenter [7] to solve partial differential equations:

B_{m,3}(x) = (1/h³)(x − x_{m−2})³,   x ∈ [x_{m−2}, x_{m−1})
           = (1/h³)[(x − x_{m−2})³ − 4(x − x_{m−1})³],   x ∈ [x_{m−1}, x_m)
           = (1/h³)[(x_{m+2} − x)³ − 4(x_{m+1} − x)³],   x ∈ [x_m, x_{m+1})    (15.11)
           = (1/h³)(x_{m+2} − x)³,   x ∈ [x_{m+1}, x_{m+2})
           = 0,   otherwise
The cubic B-spline basis function defined above is nonzero over four knot spans and is depicted in Figure 15.4. From this definition, the values of B_m(x) and of its first and second derivatives at the node points can be tabulated as in Table 15.1.
By substituting l = 3 in (15.10), the solution can be approximated as

U(x, t) = Σ_{j=m−1}^{m+1} c_j B_j(x)
It is evident that the nonzero part of Bm is localized to a small neighborhood of xm, namely,
in the interval xm− 2 < xm < xm+ 2 . Because of this, only Bm− 1 , Bm , Bm+ 1 contribute to the value
of U at xm. Using the values of basis functions at the node points from Table 15.1 in Equation
(15.13), the approximate solution and its derivatives up to second order can be determined
in terms of parameters cm′ s that can be written as
Table 15.1 Values of B_m(x) for the cubic B-spline and its derivatives at the nodal points

            x_{m−2}   x_{m−1}   x_m       x_{m+1}   x_{m+2}
B_m(x)      0         1         4         1         0
B_m′(x)     0         3/h       0         −3/h      0
B_m″(x)     0         6/h²      −12/h²    6/h²      0
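The entries of Table 15.1 can be checked numerically by implementing the piecewise definition (15.11) directly. The sketch below (all names and the choice h = 1 are illustrative) evaluates the cubic B-spline centered at x_m = 2 at the five knots, and approximates the first and second derivatives at x_m by central differences, which is legitimate since the cubic B-spline is C²:

```python
def cubic_bspline(x, xm, h=1.0):
    """Cubic B-spline B_{m,3}(x) of (15.11) on uniform knots
    x_{m-2}, ..., x_{m+2} centered at xm with spacing h."""
    k = [xm + j * h for j in (-2, -1, 0, 1, 2)]
    if k[0] <= x < k[1]:
        v = (x - k[0]) ** 3
    elif k[1] <= x < k[2]:
        v = (x - k[0]) ** 3 - 4 * (x - k[1]) ** 3
    elif k[2] <= x < k[3]:
        v = (k[4] - x) ** 3 - 4 * (k[3] - x) ** 3
    elif k[3] <= x < k[4]:
        v = (k[4] - x) ** 3
    else:
        v = 0.0
    return v / h ** 3

# Nodal values should reproduce the row B_m(x) of Table 15.1: 0, 1, 4, 1, 0
vals = [cubic_bspline(x, 2.0) for x in (0.0, 1.0, 2.0, 3.0, 4.0)]

# Central differences for B' and B'' at the central node x_m = 2 (h = 1)
e = 1e-5
d1 = (cubic_bspline(2 + e, 2.0) - cubic_bspline(2 - e, 2.0)) / (2 * e)
d2 = (cubic_bspline(2 + e, 2.0) - 2 * cubic_bspline(2.0, 2.0)
      + cubic_bspline(2 - e, 2.0)) / e**2
```

With h = 1, d1 comes out approximately 0 and d2 approximately −12, matching the tabulated B_m′(x_m) = 0 and B_m″(x_m) = −12/h².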
U(x_m, t) = c_{m−1} + 4c_m + c_{m+1}
hU′(x_m, t) = 3(c_{m+1} − c_{m−1})    (15.14)
h²U″(x_m, t) = 6(c_{m−1} − 2c_m + c_{m+1})
15.7 Numerical solution of advection diffusion equation using collocation method
15.7.1 Using B-spline basis functions
Consider the advection diffusion equation

u_t + αu_x = βu_xx    (15.15)

Applying the Crank–Nicolson time discretization gives

(u^{n+1} − u^n)/Δt = β(u_xx^{n+1} + u_xx^n)/2 − α(u_x^{n+1} + u_x^n)/2    (15.16)
Substituting the nodal values (15.14) into (15.16) yields, for m = 0, …, N,

c_{m−1}^{n+1}(1 − 6z − 3y) + c_m^{n+1}(4 + 12z) + c_{m+1}^{n+1}(1 − 6z + 3y) = u^n − (αΔt/2)u_x^n + (βΔt/2)u_xx^n    (15.17)

Here, y = αΔt/(2h) and z = βΔt/(2h²).
Eliminating c_{−1}^{n+1} and c_{N+1}^{n+1} by means of the boundary conditions gives, at the boundary nodes,

c_0^{n+1}(36z + 12y) + c_1^{n+1}(6y) = u^n − (αΔt/2)u_x^n + (βΔt/2)u_xx^n − ϕ_0(1 − 6z − 3y)    (15.18)

c_{N−1}^{n+1}(−6y) + c_N^{n+1}(36z − 12y) = u^n − (αΔt/2)u_x^n + (βΔt/2)u_xx^n − ϕ_1(1 − 6z + 3y)    (15.19)
The complete system can be written in matrix form as Ac^{n+1} = r^n, where A is the tridiagonal matrix

[ 36z + 12y     6y            0             ⋯             0           ]
[ 1 − 6z − 3y   4 + 12z       1 − 6z + 3y                             ]
[               ⋱             ⋱             ⋱                         ]
[               1 − 6z − 3y   4 + 12z       1 − 6z + 3y               ]
[ 0             ⋯             0             −6y           36z − 12y   ]

and r^n is the vector with entries

u^n − (αΔt/2)u_x^n + (βΔt/2)u_xx^n − ϕ_0(1 − 6z − 3y)   (first row),
u^n − (αΔt/2)u_x^n + (βΔt/2)u_xx^n                      (interior rows),
u^n − (αΔt/2)u_x^n + (βΔt/2)u_xx^n − ϕ_1(1 − 6z + 3y)   (last row).
15.7.2 Using trigonometric B-spline basis functions
For the trigonometric cubic B-spline, the nodal values (Table 15.2) are

a_1 = sin²(h/2) / (sin(h) sin(3h/2)),   a_2 = 2/(1 + 2cos(h)),
a_3 = −3/(4 sin(3h/2)),   a_4 = 3/(4 sin(3h/2)),
a_5 = 3(1 + cos(h)) / (16 sin²(h/2)(2cos(h/2) + cos(3h/2))),
a_6 = −3cos²(h/2) / (sin²(h/2)(2 + 4cos(h)))
Using the linear combination formula to write the approximate solution with trigonometric B-spline basis functions, the approximate solution and its derivatives up to second order can be determined in terms of the time-dependent parameters c_m as

U(x_m, t) = a_1 c_{m−1} + a_2 c_m + a_1 c_{m+1}
U′(x_m, t) = a_4 c_{m−1} + a_3 c_{m+1}
U″(x_m, t) = a_5 c_{m−1} + a_6 c_m + a_5 c_{m+1}
On substituting these values of the basis functions into Equation (15.16), the system can be written, for m = 0, …, N, as

c_{m−1}^{n+1}(a_1 + ηa_4 − γa_5) + c_m^{n+1}(a_2 − γa_6) + c_{m+1}^{n+1}(a_1 + ηa_3 − γa_5) = u^n − (αΔt/2)u_x^n + (βΔt/2)u_xx^n    (15.20)

Here, γ = βΔt/2 and η = αΔt/2. Eliminating c_{−1}^{n+1} with the boundary condition at x_0 gives

c_0^{n+1}(γ(a_2a_5/a_1 − a_6) − η(a_2a_4/a_1)) + c_1^{n+1}η(a_3 − a_4) = u^n − (αΔt/2)u_x^n + (βΔt/2)u_xx^n − (ϕ_0/a_1)(a_1 − γa_5 + ηa_4)    (15.21)
Table 15.2 Value of Bm(x) for trigonometric cubic B-spline and its derivatives at the nodal points
xm−2 xm−1 xm xm+1 xm+2
TBm ( x) 0 a1 a2 a1 0
TBm′ ( x) 0 a3 0 a4 0
TBm′′ ( x) 0 a5 a6 a5 0
Similarly, eliminating c_{N+1}^{n+1} at x_N gives

c_{N−1}^{n+1}η(a_4 − a_3) − c_N^{n+1}(γ(a_2a_5/a_1 − a_6) − η(a_2a_3/a_1)) = u^n − (αΔt/2)u_x^n + (βΔt/2)u_xx^n − (ϕ_1/a_1)(a_1 − γa_5 + ηa_3)    (15.22)
15.8 Numerical example
The B-spline basis functions are widely used to solve various linear and nonlinear ordinary and partial differential equations; see Refs. [9–16]. This chapter is an effort to discuss the basics of these basis functions and their implementation for solving differential equations.
To gain insight into the method, consider the problem with α = 0 and β = 1 in Equation (15.15), which reduces to the heat equation

u_t = u_xx,   0 ≤ x ≤ 1,  t > 0

with boundary conditions u(0, t) = 0, u(1, t) = 0 and initial condition u(x, 0) = sin(πx). The exact solution of the equation is u(x, t) = exp(−π²t) sin(πx).
The numerical solution of this equation is obtained by the collocation method using standard B-spline and trigonometric B-spline basis functions at t = 1. To assess the accuracy of the obtained solutions, errors are calculated for both types of basis functions and are reported in Table 15.3. The maximum absolute errors are reported in Table 15.4 for two combinations of time step and domain partition: first with time step 0.01 and 40 domain partitions for t = 1 to t = 3, and then with time step 0.001 and 160 domain partitions at the same time levels.
It can be concluded from Tables 15.3 and 15.4 that both forms of the B-spline basis functions give solutions comparable with the exact solution. In the case of the standard B-spline
Table 15.4 Maximum absolute errors using B-spline and trigonometric B-spline
Time B-spline Trigonometric B-spline
At N = 40 and ∆t = 0.01
1 2.0112E-04 2.4165E-05
2 1.2937E-05 3.2392E-07
3 1.0714E-06 2.7468E-08
At N = 160 and ∆t = 0.001
1 2.4242E-08 6.5252E-06
2 2.8568E-11 6.1191E-10
3 1.5320E-15 4.4115E-14
basis functions, the solution improves markedly as the number of domain partitions is increased and the time step reduced, while in the case of the trigonometric B-spline the solution also improves, but more gradually.
References
1. W. Zahra, Numerical Treatment of Boundary Value Problems Using Spline Functions, LAP Lambert Academic Publishing, Germany, 2010.
2. K. S. Surana, J. N. Reddy, The Finite Element Method for Boundary Value Problems, CRC Press,
Taylor & Francis Group, Boca Raton, FL, 2016.
3. I. J. Schoenberg, Contribution to the problem of approximation of equidistant data by analytical
functions, Quarterly Applied Mathematics, 4, 1946, 45–99.
4. L. L. Schumaker, Spline Functions: Basic Theory, Wiley, New York, 1981.
5. M. G. Cox, The numerical evaluation of B-splines, Journal of the Institute of Mathematical
Applications, 10, 1972, 134–149.
6. C. de Boor, A Practical Guide to Splines, Springer Verlag, New York, 1978.
7. P. M. Prenter, Splines and Variational Methods, John Wiley & Sons, New York, 1975.
8. D. U. Von Rosenberg, Methods for Solution of Partial Differential Equations, Vol. 113, American
Elsevier Publishing Inc., New York, 1969.
9. M. K. Kadalbajoo, L. P. Tripathi, A. Kumar, A cubic B-spline collocation method for a numerical
solution of the generalized Black–Scholes equation, Mathematical and Computer Modelling, 55
(3–4), 2012, 1483–1505.
10. A. K. Khalifa, K. R. Raslan, H. M. Alzubaidi, A collocation method with cubic B-splines for solv-
ing the MRLW equation, Journal of Computational and Applied Mathematics, 212 (2), 2008, 406–418.
11. R. Pourgholi, Applications of cubic B-splines collocation method for solving nonlinear inverse
parabolic partial differential equations, Numerical Methods for Partial Differential Equations, 33 (1),
2017, 88–104.
12. M. Gholamian, J. Saberi-Nadjafi, Cubic B-splines collocation method for a class of partial
integro-differential equation, Alexandria Engineering Journal, 2017. doi:10.1016/j.aej.2017.06.004.
13. G. Arora, V. Joshi, A computational approach using modified trigonometric cubic B-spline for
numerical solution of Burgers’ equation in one and two dimensions, Alexandria Engineering
Journal, 2017. doi:10.1016/j.aej.2017.02.017.
14. G. Arora, V. Joshi, A computational approach for solution of one dimensional parabolic partial
differential equation with application in biological processes, Ain Shams Engineering Journal, 2016.
doi:10.1016/j.asej.2016.06.013.
15. M. Abbas, A. A. Majid, A. I. Md. Ismail, A. Rashid, Numerical method using cubic trigonomet-
ric B-spline technique for nonclassical diffusion problems, Abstract and Applied Analysis, 2014,
Article ID 849682, 11 pages.
16. O. Ersoy, I. Dag, The exponential cubic B-spline collocation method for the Kuramoto–Sivashinsky equation, Filomat, 30 (3), 2016, 853–861. doi:10.2298/FIL1603853E.
Chapter sixteen
Rayleigh’s approximation method
Contents
16.1 Introduction 297
16.2 Problem formulation and its solution 299
  16.2.1 Solution for the lower highly anisotropic half-space 301
  16.2.2 Solution for the upper fluid-saturated poroelastic half-space 302
16.3 Boundary conditions 305
16.4 Solution of the first-order approximation of the corrugation 305
16.5 Solution for second-order approximation of the corrugation 307
16.6 Special case of a simple harmonic interface 309
16.7 Particular cases for special case 310
16.8 Energy distribution 312
16.9 Numerical discussion and results 313
  16.9.1 Effect of corrugation amplitude 314
  16.9.2 Effect of corrugation wavelength 316
  16.9.3 Effect of frequency factor 317
  16.9.4 Influence of initial stress parameter on poroelastic half-space 318
  16.9.5 Influence of initial stress parameter on highly anisotropic half-space 321
16.10 Concluding remarks 324
References 325
16.1 Introduction
In recent years, the scattering of elastic waves by obstacles present in the medium has drawn considerable attention from researchers across the globe, because such investigations enable us to unravel deep subsurface structures of immense operational use in oil exploration, earthquake engineering, and much more. Various types of materials impart distinct propagation behavior to waves underneath the earth. During SH-wave propagation, the boundaries present between the layers cause the individual waves to be reflected or transmitted through the interface depending on the angle of incidence; this defines the distributive characteristics of the interface. Furthermore,
the boundaries are mostly irregular or corrugated, which further increases the complexity of the investigation. The variation of the wave propagation also depends largely on the physical characteristics of the medium; therefore, propagation through such layers can also reveal important facts about faults and anticlinal structures beneath the earth. The phenomena of reflection and transmission are the principal concepts behind geophysics and seismology, and oil and gas exploration companies have used them for years to detect the accumulation of hydrocarbons beneath the earth. Literature already exists on the reflection and transmission of SH-waves, such as Ewing et al. (1957), Keith et al. (1977), and Aki and Richards (2002). Fokkema (1980) investigated these phenomena using time-harmonic waves, and Pal and Chattopadhyay (1984) studied the stress-free boundary between two incompressible materials using the reflection and transmission phenomena.
Nowadays, reflection and refraction through porous media have become a core subject of investigation because of their dynamic behavior, and a separate field of study has emerged concerning propagation through porous media. A typical porous medium contains pores, usually filled with fluid. Such materials are often characterized by their porosity, defined as the ratio of the volume of void space to the total volume; porosity values range from 0 to 1. The connected pore space enables filtration of the pore fluid through the porous medium.
media. Pumice, sandstone, and soil are some of the naturally occurring porous materi-
als found in the earth. The dynamic nature of such porous materials is the major aspect
of rock study, which is effectively used in seismic exploration for detailed investiga-
tion of subsurface structures to explore sedimentary basins for hydrocarbon produc-
tion. Deresiewicz (1961) first studied the boundary effects on the wave propagation in a
liquid-filled porous media. Wu et al. (1990) investigated reflection and refraction of elastic
waves from a fluid-saturated porous solid boundary. Sharma and Gogna (1992) used the
plane harmonic waves to investigate the reflection and refraction phenomena through
an interface between an elastic solid and a liquid-saturated porous media by making
purposeful use of the asymptotic approximation of dissipation function. Tajuddin and
Haussaini (2005) analyzed the reflection phenomena of plane waves at the boundaries
of a liquid-filled poroelastic half-space. Tomar and Arora (2007) studied the reflection
and refraction phenomena of elastic waves through an elastic/porous solid filled with
immiscible fluids.
Wave propagation through an anisotropic medium is fundamentally different from that through an isotropic medium. In seismology, a propagating medium is said to be anisotropic if the phase velocity varies with factors such as the direction of wave propagation, the direction of particle motion, the orientation of the material, and the stress and strain of the propagating medium. The anisotropic properties of the material contribute significantly to the reflection and refraction coefficients, and information about these coefficients can help us understand the mechanical properties of the medium. Anisotropy also arises from the presence of thin laminates arranged in a particular order; other factors such as micro-fracturing and the orientation of minerals can also result in a general anisotropy. Since it is normally difficult to derive a general anisotropy from a specific anisotropy, the anisotropy in a wave propagation problem should be of the general type. These general problems have motivated the present study. Crampin (1977) was the first researcher to differentiate anisotropy from isotropy; he established that the variation in velocity due to anisotropy is one of the many anomalies that can occur in a medium. The concepts of reflection and transmission in the anisotropic half-space have been the basis of geological
Chapter sixteen: Rayleigh’s approximation method 299
study to explore continental margins for mineral exploration. Daley and Hron (1979)
investigated the ellipsoidal anisotropic media to derive reflection and transmission coef-
ficients for seismic waves. Rokhlin et al. (1986) studied this wave scattering phenomena
of elastic waves on a plane interface lying between two generally anisotropic media.
Then, Thomsen (1988) published a paper on reflection seismology over azimuthally
anisotropic media.
Rayleigh (1907) made the first attempt to find the solution to the reflection problem of light or sound incident perpendicularly on an uneven boundary surface. Sato (1955) then applied Rayleigh's concept to elastic waves, and this work was later extended by Asano (1960, 1961, 1966). Abubakar (1962a–c) studied the scattering of elastic waves incident on a corrugated interface by utilizing the perturbation technique, and Saini and Singh (1977) studied the effect of anisotropy on the reflection of SH-waves at an interface. In general terms, Rayleigh's method approximates the exponential term associated with the corrugated interface: for the first-order approximation of the corrugation the linear terms are retained, and since the amplitude and slope of the corrugated boundary are assumed to be very small, the higher-order terms are neglected. Much further literature has been published on Rayleigh's method applied to elastic wave scattering at corrugated interfaces, such as Tomar and Saini (1997), Tomar et al. (2002), Tomar and Kaur (2003), and Tomar and Singh (2007). Tomar and Kaur (2007) then investigated the behavior of the SH-wave at a corrugated interface lying between a dry sandy half-space and an anisotropic elastic half-space.
In the present chapter, utilizing Rayleigh's approximation method, an attempt has been made to study the reflection and refraction pattern at a corrugated interface sandwiched between an initially stressed fluid-saturated poroelastic half-space and a highly anisotropic half-space. Here, the highly anisotropic half-space is considered to be triclinic. Closed-form formulae for the reflection and refraction coefficients have been derived. Rayleigh's method has been effectively used to derive first- and second-order approximations of the coefficients. Some special cases have also been deduced. The energy ratios of the reflected and refracted waves are also presented. Various two-dimensional plots have been drawn to show the effects of parameters such as the initial stress parameter, corrugation amplitude, corrugation wavelength, and frequency factor.
The corrugated interface z = ζ is represented by the Fourier series
\[
\zeta = \sum_{n=1}^{\infty}\left(\zeta_n e^{in\lambda x} + \zeta_{-n} e^{-in\lambda x}\right) \tag{16.1}
\]
Here, ζ_n and ζ_{−n} are the Fourier expansion coefficients, λ is the wave number, n is the series expansion order, and the wavelength of the corrugation is 2π/λ.
300 Advanced Mathematical Techniques in Engineering Sciences
Figure 16.1 Geometry of the problem: the corrugated interface ζ(x) separating the upper initially stressed fluid-saturated poroelastic half-space F2 from the lower highly anisotropic half-space F1; the incident SH-wave at angle e gives rise to the regularly reflected wave (B, e), irregularly reflected waves (B1, e1) and (B1′, e1′), the regularly refracted wave (D0, f), and irregularly refracted waves (D1, f1) and (D1′, f1′).
\[
\zeta_1 = \zeta_{-1} = \frac{d}{2}, \qquad \zeta_{\pm n} = \frac{c_n \mp i s_n}{2}, \quad n = 2, 3, \ldots \tag{16.2}
\]
If the shape of the corrugated interface is represented by only one cosine term, i.e., ζ = d cos λx, then 2π/λ and d are the wavelength and amplitude of the corrugation, respectively. Let u_i, v_i, and w_i (i = 1, 2) be the displacement components along the x, y, and z directions, respectively.
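As a quick numerical illustration of the Fourier representation in Equation (16.1), the sketch below (with purely illustrative values for d and λ) evaluates ζ(x) from its coefficients and confirms that the single-cosine choice ζ±1 = d/2 of Equation (16.2) reproduces ζ = d cos λx:

```python
import numpy as np

def corrugation(x, coeffs, lam):
    """Evaluate zeta(x) = sum_n (zeta_n e^{i n lam x} + zeta_{-n} e^{-i n lam x}).

    coeffs maps each order n >= 1 to the pair (zeta_n, zeta_{-n}).
    """
    z = np.zeros_like(x, dtype=complex)
    for n, (zp, zm) in coeffs.items():
        z += zp * np.exp(1j * n * lam * x) + zm * np.exp(-1j * n * lam * x)
    return z.real  # a real boundary has zeta_{-n} = conj(zeta_n)

# Single-cosine interface of Equation (16.2): zeta_1 = zeta_{-1} = d/2
d, lam = 0.3, 2.0                                  # illustrative values only
x = np.linspace(0.0, 2 * np.pi / lam, 200)         # one corrugation wavelength
zeta = corrugation(x, {1: (d / 2, d / 2)}, lam)
assert np.allclose(zeta, d * np.cos(lam * x))      # recovers d*cos(lam*x)
```

Higher-order boundary shapes are obtained simply by adding further (ζ_n, ζ_{−n}) pairs to the dictionary.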
For the propagation of SH-waves, it is assumed that
\[
u_i = 0, \quad w_i = 0, \quad v_i = v_i(x, z, t), \quad \frac{\partial}{\partial y} \equiv 0 \tag{16.4}
\]
Indices 1 and 2 stand for the highly anisotropic half-space and the fluid-saturated poroelastic half-space, respectively. The first and second partial derivatives with respect to time are represented as ∂_t and ∂_tt, respectively. Moreover, ∂_z and ∂_zz stand for d/dz and d²/dz², respectively.
\[
\begin{aligned}
T_{11}^t &= F_{11}e_{11} + F_{12}e_{22} + F_{13}e_{33} + F_{14}e_{23} + F_{15}e_{13} + F_{16}e_{12},\\
T_{22}^t &= F_{12}e_{11} + F_{22}e_{22} + F_{23}e_{33} + F_{24}e_{23} + F_{25}e_{13} + F_{26}e_{12},\\
T_{33}^t &= F_{13}e_{11} + F_{23}e_{22} + F_{33}e_{33} + F_{34}e_{23} + F_{35}e_{13} + F_{36}e_{12},\\
T_{23}^t &= F_{14}e_{11} + F_{24}e_{22} + F_{34}e_{33} + F_{44}e_{23} + F_{45}e_{13} + F_{46}e_{12},\\
T_{13}^t &= F_{15}e_{11} + F_{25}e_{22} + F_{35}e_{33} + F_{45}e_{23} + F_{55}e_{13} + F_{56}e_{12},\\
T_{12}^t &= F_{16}e_{11} + F_{26}e_{22} + F_{36}e_{33} + F_{46}e_{23} + F_{56}e_{13} + F_{66}e_{12}.
\end{aligned}
\tag{16.5}
\]
Here, Tijt , Fij, and eij are the components of the stress tensor, stiffness coefficients, and
components of the strain tensor, respectively.
The equations of motion in the absence of body forces are given by Biot (1965),
where ρ1 denotes the mass density, and ωx, ωy, and ωz are the rotational components
given by
\[
\omega_x = \frac{1}{2}\left(\partial_y w - \partial_z v\right), \quad \omega_y = \frac{1}{2}\left(\partial_z u - \partial_x w\right), \quad \omega_z = \frac{1}{2}\left(\partial_x v - \partial_y u\right)
\]
Now, using Equations (16.4)–(16.6), we get the governing equation of motion and the
stress–strain components
\[
\partial_x T_{21}^t + \partial_z T_{23}^t - \frac{S_{11}^t}{2}\,\partial_{xx} v_1 = \rho_1\,\partial_{tt} v_1 \tag{16.7}
\]
and
\[
\partial_{zz} V - 2ik_1\mu_1\,\partial_z V + k_1^2\mu_2\left(\frac{\omega^2}{k_1^2\beta_1^2} - 1 + \xi\right)V = 0 \tag{16.10}
\]
where
\[
\mu_1 = \frac{F_{46}}{F_{44}}, \quad \mu_2 = \frac{F_{66}}{F_{44}}, \quad \xi = \frac{S_{11}^t}{F_{66}}, \quad \text{and} \quad \beta_1^2 = \frac{F_{66}}{\rho_1}
\]
\[
v_1(x, z, t) = \left(A_0 e^{i\Omega_0 z} + B_0 e^{-i\Omega z}\right)e^{i(\omega t - k_1 x)} \tag{16.11}
\]
where
\[
\Omega_0 = k_1\left(\mu_1 + \sqrt{\mu_1^2 + \mu_2\left(\cot^2 e + \xi\right)}\right) \quad \text{and} \quad \Omega = k_1\left(-\mu_1 + \sqrt{\mu_1^2 + \mu_2\left(\cot^2 e + \xi\right)}\right)
\]
Hence, the displacement for the lower half-space is given by
\[
u_2 = 0, \quad w_2 = 0, \quad v_2 = v_2(x, z, t) \quad \text{and} \quad U_2 = 0, \quad W_2 = 0, \quad V_2 = V_2(x, z, t) \tag{16.13}
\]
\[
T_{23}^p = 2Ge_{23}, \qquad T_{31}^p = 2Ge_{31},
\]
where A, C, D, F, G, K, M, and N are the material constants; T_{ij}^p are the components of the stress tensor acting on the solid phase of the poroelastic material; E = div U is the fluid volumetric strain; and σ = −fP is the stress acting on the fluid phase of the poroelastic material, in which P is the pressure in the fluid and f is the porosity of the poroelastic material.
With the help of Equations (16.13) and (16.14), the equation of motion for SH-wave
propagation in fluid-saturated initially stressed poroelastic medium in the absence of
body forces and the viscoelasticity of the fluid based on Biot (1956a,b, 1962, 1965) can be
written as
where ω_ij = ½(u_{i,j} − u_{j,i}); S_{11}^p is the horizontal initial stress; and ρ11, ρ22, and ρ12 take into account the inertial effects of the moving fluid and are related to the densities of the solid part ρ_s, the fluid part ρ_f, and the aggregate medium ρ2 by ρ2 = ρ11 + 2ρ12 + ρ22 = ρ_s + f(ρ_f − ρ_s). Moreover, the following inequalities hold for the dynamic coefficients:
\[
\rho_{11} > 0, \quad \rho_{12} \le 0, \quad \rho_{22} > 0, \quad \rho_{11}\rho_{22} - \rho_{12}^2 > 0
\]
On further simplification, Equation (16.15) results in
\[
\left(N + \frac{S_{11}^p}{2}\right)\partial_{xx} v_2 + G\,\partial_{zz} v_2 = d'\,\partial_{tt} v_2 \tag{16.16}
\]
where
\[
d' = \rho_{11} - \frac{\rho_{12}^2}{\rho_{22}}
\]
To solve Equation (16.16), we assume v2(x, z, t) = V(z) e^{i(ωt − k2x)}; applying this in Equation (16.16), we get
\[
\partial_{zz} V + \eta^2 V = 0, \tag{16.17}
\]
where
\[
\eta = k_2\sqrt{\mu_1'\left(-1 + \mu_2'\operatorname{cosec}^2 f\right)}, \quad \mu_1' = \frac{N}{G}\left(1+\xi_1\right), \quad \mu_2' = d_p\left(1+\xi_1\right)^{-1}, \quad \xi_1 = \frac{S_{11}^p}{2N}, \quad \text{and} \quad d_p = \gamma_{11} - \frac{\gamma_{12}^2}{\gamma_{22}}
\]
\[
V(z) = C_0 e^{-i\eta z} + D_0 e^{i\eta z} \tag{16.18}
\]
The displacement for the upper initially stressed poroelastic half-space, i.e., F2, is then given as
\[
v_2(x, z, t) = \left(C_0 e^{-i\eta z} + D_0 e^{i\eta z}\right)e^{i(\omega t - k_2 x)} \tag{16.19}
\]
where C0 and D0 are constants, and k2 is the wave number defined by the law of refraction
k2 : k1 = sin e : sin f
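The parameter set defined beneath Equation (16.17) can be bundled into a small helper. In this sketch the γij denote the mass coefficients ρij normalized by the aggregate density ρ2, and all numerical inputs are hypothetical placeholders rather than the chapter's data:

```python
import math

def eta_poroelastic(k2, f, G, N, S11p, gamma11, gamma12, gamma22):
    """Vertical wavenumber eta of Equation (16.17) from the parameters
    defined beneath it (gamma_ij: mass coefficients rho_ij / rho_2)."""
    xi1 = S11p / (2.0 * N)                    # initial stress parameter
    dp = gamma11 - gamma12 ** 2 / gamma22     # dynamical coefficient d_p
    mu1p = (N / G) * (1.0 + xi1)
    mu2p = dp / (1.0 + xi1)
    # cosec^2 f = 1 / sin^2 f
    return k2 * math.sqrt(mu1p * (-1.0 + mu2p / math.sin(f) ** 2))

# Illustrative (hypothetical) inputs only
eta = eta_poroelastic(k2=2.0, f=math.radians(20.0), G=1.387e9, N=2.774e9,
                      S11p=0.2e9, gamma11=0.9, gamma12=-0.001, gamma22=0.1)
```

For refraction angles small enough that the radicand stays positive, η is real and the refracted wave in (16.19) is propagating; a negative radicand would signal an evanescent wave.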
Let us assume that a plane SH-wave of unit amplitude propagates in the lower half-space (F1) and is incident at the corrugated interface z = ζ, making an angle e with the z-axis. Due to the corrugation at the interface, the reflection and refraction phenomena will be affected, and the incident SH-wave will give rise to (1) a regularly reflected and a regularly transmitted wave at angles e and f with the z-axis, in the lower (F1) and upper (F2) half-spaces, respectively; (2) a spectrum of nth-order irregularly reflected and irregularly refracted waves at angles e_n and f_n on the left side of the regularly reflected and regularly refracted waves, respectively; and (3) a similar spectrum of irregularly reflected and irregularly refracted waves at angles e_n′ and f_n′ on the right side of the regularly reflected and regularly refracted waves, respectively, at the corrugated interface.
The angle of refraction f is related to the angle of incidence e through Snell's law:
\[
\frac{\sin e}{\beta_1} = \frac{\sin f}{\beta_2} \tag{16.20}
\]
The angles e_n, e_n′, f_n, and f_n′ are given by the following spectrum theorem of Abubakar (1962a–c):
\[
\sin e_n - \sin e = \frac{n\lambda\beta_1}{\omega}, \quad \sin e_n' - \sin e = -\frac{n\lambda\beta_1}{\omega}, \quad
\sin f_n - \sin f = \frac{n\lambda\beta_2}{\omega}, \quad \sin f_n' - \sin f = -\frac{n\lambda\beta_2}{\omega} \tag{16.21}
\]
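Snell's law (16.20) and the spectrum theorem (16.21) translate directly into code. In the sketch below the wave speeds, corrugation wave number, and frequency are illustrative only; the helper raises an error when an nth-order wave becomes evanescent (|sin| > 1):

```python
import math

def spectral_angles(e, beta1, beta2, lam, omega, n=1):
    """Angles of the n-th order spectrum from Snell's law (16.20) and the
    spectrum theorem (16.21); raises if a wave becomes evanescent."""
    sin_f = beta2 * math.sin(e) / beta1          # Snell's law (16.20)
    sines = {
        "f":    sin_f,
        "e_n":  math.sin(e) + n * lam * beta1 / omega,
        "e_n'": math.sin(e) - n * lam * beta1 / omega,
        "f_n":  sin_f + n * lam * beta2 / omega,
        "f_n'": sin_f - n * lam * beta2 / omega,
    }
    for name, s in sines.items():
        if abs(s) > 1.0:
            raise ValueError(f"{name} is evanescent (sin = {s:.3f})")
    return {name: math.asin(s) for name, s in sines.items()}

# Illustrative (hypothetical) inputs only
angles = spectral_angles(e=math.radians(30.0), beta1=2.5, beta2=2.0,
                         lam=0.5, omega=10.0)
```

Note that increasing the order n or the corrugation wave number λ pushes the spectral sines toward ±1, so only finitely many spectral orders propagate for a given frequency.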
The total displacement in the lower highly anisotropic half-space (F1) is then given by the sum of the incident, regularly reflected, and irregularly reflected waves:
\[
v(x,z,t) = \left[ B_0 e^{i\Omega_0 z} + B e^{-i\Omega z} + \sum_{n=1}^{\infty} B_n e^{-i\Omega_n z} e^{-in\lambda x} + \sum_{n=1}^{\infty} B_n' e^{-i\Omega_n' z} e^{in\lambda x} \right] e^{i\omega\left(t - \frac{x\sin e}{\beta_1}\right)} \tag{16.22}
\]
where
\[
\Omega_n = \frac{\omega\sin e_n}{\beta_1}\left(-\mu_1 + \sqrt{\mu_1^2 + \mu_2\left(\cot^2 e_n + \xi\right)}\right) \quad \text{and} \quad
\Omega_n' = \frac{\omega\sin e_n'}{\beta_1}\left(-\mu_1 + \sqrt{\mu_1^2 + \mu_2\left(\cot^2 e_n' + \xi\right)}\right)
\]
Similarly, the total displacement in the upper initially stressed fluid-saturated poroelastic half-space (F2) is the sum of the regularly refracted and irregularly refracted waves:
\[
v_2(x,z,t) = \left[ D_0 e^{i\eta z} + \sum_{n=1}^{\infty} D_n e^{i\eta_n z} e^{-in\lambda x} + \sum_{n=1}^{\infty} D_n' e^{i\eta_n' z} e^{in\lambda x} \right] e^{i\omega\left(t - \frac{x\sin f}{\beta_2}\right)} \tag{16.23}
\]
where
\[
\eta_n = \frac{\omega\sin f_n}{\beta_2}\sqrt{\mu_1'\left(-1 + \mu_2'\operatorname{cosec}^2 f_n\right)} \quad \text{and} \quad
\eta_n' = \frac{\omega\sin f_n'}{\beta_2}\sqrt{\mu_1'\left(-1 + \mu_2'\operatorname{cosec}^2 f_n'\right)}
\]
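The refracted field of Equation (16.23) can be assembled numerically once truncated at a finite order. The sketch below keeps only the n = 1 terms; all amplitudes and vertical wavenumbers are hypothetical placeholders:

```python
import cmath, math

def v2_first_order(x, z, t, D0, D1, D1p, eta, eta1, eta1p, lam, omega, f, beta2):
    """Refracted SH displacement of Equation (16.23) truncated at n = 1."""
    # common plane-wave carrier e^{i omega (t - x sin f / beta2)}
    carrier = cmath.exp(1j * omega * (t - x * math.sin(f) / beta2))
    series = (D0 * cmath.exp(1j * eta * z)
              + D1 * cmath.exp(1j * eta1 * z) * cmath.exp(-1j * lam * x)
              + D1p * cmath.exp(1j * eta1p * z) * cmath.exp(1j * lam * x))
    return series * carrier

# All amplitudes and vertical wavenumbers below are hypothetical placeholders
v = v2_first_order(x=1.0, z=0.5, t=0.0, D0=1e-3, D1=3e-5, D1p=1.6e-4,
                   eta=2.0, eta1=2.2, eta1p=1.8, lam=0.5,
                   omega=10.0, f=math.radians(14.0), beta2=2.0)
assert abs(v) <= 1e-3 + 3e-5 + 1.6e-4   # triangle-inequality bound
```

Since all exponentials have unit modulus for real wavenumbers, the field magnitude is bounded by the sum of the amplitude moduli, which the assertion checks.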
The constants B0 and B are the reflection and refraction coefficients at the plane interface, respectively, and the constants Bn, Bn′ and Dn, Dn′ are the reflection and refraction coefficients, respectively, for the first-order approximation of the corrugation. All these constants are determined from the boundary conditions at the interface.
16.3 Boundary conditions
The boundary conditions at the corrugated interface z = ζ ensure the continuity of
displacement and stress, i.e.,
v1 = v2 (16.24)
\[
B_0 e^{i\Omega_0\zeta} + B e^{-i\Omega\zeta} + \sum_{n=1}^{\infty} B_n e^{-i\Omega_n\zeta} e^{-in\lambda x} + \sum_{n=1}^{\infty} B_n' e^{-i\Omega_n'\zeta} e^{in\lambda x}
= D_0 e^{i\eta\zeta} + \sum_{n=1}^{\infty} D_n e^{i\eta_n\zeta} e^{-in\lambda x} + \sum_{n=1}^{\infty} D_n' e^{i\eta_n'\zeta} e^{in\lambda x} \tag{16.26}
\]
and
\[
\begin{aligned}
& B_0\left[\Omega_0\left(F_{44} - \zeta'F_{64}\right) - \frac{\omega\sin e}{\beta_1}\left(F_{46} - \zeta'F_{66}\right)\right]e^{i\Omega_0\zeta}
+ B\left[-\Omega\left(F_{44} - \zeta'F_{64}\right) - \frac{\omega\sin e}{\beta_1}\left(F_{46} - \zeta'F_{66}\right)\right]e^{-i\Omega\zeta} \\
&\quad + \sum_{n=1}^{\infty} B_n e^{-i\Omega_n\zeta}e^{-in\lambda x}\left[-\left(F_{44} - \zeta'F_{64}\right)\Omega_n - \left(\frac{\omega\sin e}{\beta_1} + n\lambda\right)\left(F_{46} - \zeta'F_{66}\right)\right] \\
&\quad + \sum_{n=1}^{\infty} B_n' e^{-i\Omega_n'\zeta}e^{in\lambda x}\left[-\left(F_{44} - \zeta'F_{64}\right)\Omega_n' - \left(\frac{\omega\sin e}{\beta_1} - n\lambda\right)\left(F_{46} - \zeta'F_{66}\right)\right] \\
&= D_0\left[G\eta + \zeta'N\frac{\omega\sin f}{\beta_2}\right]e^{i\eta\zeta}
+ \sum_{n=1}^{\infty} D_n e^{i\eta_n\zeta}e^{-in\lambda x}\left[\eta_nG + n\lambda\zeta'N + \zeta'N\frac{\omega\sin f}{\beta_2}\right] \\
&\quad + \sum_{n=1}^{\infty} D_n' e^{i\eta_n'\zeta}e^{in\lambda x}\left[\eta_n'G - n\lambda\zeta'N + \zeta'N\frac{\omega\sin f}{\beta_2}\right]
\end{aligned}
\tag{16.27}
\]
From Equations (16.26) and (16.27), the reflection and refraction coefficients of the nth order of approximation of the corrugated interface can be determined.
In view of Equation (16.28), using Equations (16.26) and (16.27) and collecting the terms independent of x and ζ on both sides, we obtain
\[
B_0 + B = D_0 \tag{16.29}
\]
These equations provide the values of the reflection and refraction coefficients of the regularly reflected and refracted SH-wave at a plane interface.
Solving Equations (16.29) and (16.30), we have
\[
\frac{B}{B_0} = \frac{-G\eta + F_{44}\Omega_0 - \dfrac{F_{46}\,\omega\sin e}{\beta_1}}{G\eta + F_{44}\Omega + \dfrac{F_{46}\,\omega\sin e}{\beta_1}}, \tag{16.31}
\]
\[
\frac{D_0}{B_0} = \frac{F_{44}\Omega_0 + F_{44}\Omega}{G\eta + F_{44}\Omega + \dfrac{F_{46}\,\omega\sin e}{\beta_1}} \tag{16.32}
\]
Equations (16.31) and (16.32) give the reflection and refraction coefficients of SH-waves at a
plane interface between initially stressed fluid-saturated poroelastic half-space and highly
anisotropic half-space.
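These closed-form ratios are easy to evaluate numerically. In the sketch below the inputs are hypothetical placeholders (not the chapter's material data); the assertion checks the displacement-continuity identity 1 + B/B0 = D0/B0 that follows from Equation (16.29):

```python
import math

def plane_interface_coefficients(F44, F46, G, eta, Omega0, Omega, omega, e, beta1):
    """B/B0 and D0/B0 from Equations (16.31)-(16.32) for a plane interface."""
    denom = G * eta + F44 * Omega + F46 * omega * math.sin(e) / beta1
    refl = (-G * eta + F44 * Omega0 - F46 * omega * math.sin(e) / beta1) / denom
    refr = F44 * (Omega0 + Omega) / denom
    return refl, refr

# Illustrative (hypothetical) inputs only
refl, refr = plane_interface_coefficients(
    F44=5.0, F46=0.4, G=1.4, eta=2.0, Omega0=3.0, Omega=2.8,
    omega=10.0, e=math.radians(30.0), beta1=2.5)

# Continuity of displacement (16.29), B0 + B = D0, implies 1 + B/B0 = D0/B0
assert abs(1.0 + refl - refr) < 1e-12
```

The identity holds term by term: adding the numerator of (16.31) to the common denominator cancels the Gη and F46 terms and leaves exactly F44(Ω0 + Ω), the numerator of (16.32).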
In order to find the solutions of the first-order approximation for the coefficients Bn and Dn, we collect the coefficients of e^{−inλx} on both sides of Equations (16.26) and (16.27), where
\[
b_n = -F_{44}\Omega_n - F_{46}\left(\frac{\omega\sin e}{\beta_1} + n\lambda\right), \qquad d_n = \eta_nG,
\]
\[
t_1 = n\lambda\left(-F_{46}\Omega_0 + F_{66}\frac{\omega\sin e}{\beta_1}\right) + \Omega_0\left(-F_{44}\Omega_0 + F_{46}\frac{\omega\sin e}{\beta_1}\right),
\]
\[
t_2 = n\lambda\left(F_{46}\Omega + F_{66}\frac{\omega\sin e}{\beta_1}\right) + \Omega\left(-F_{44}\Omega - F_{46}\frac{\omega\sin e}{\beta_1}\right),
\]
\[
t_3 = -G\eta^2 + \frac{n\lambda N\omega\sin f}{\beta_2}
\]
Equating the coefficients of e^{inλx}, we obtain the first-order approximation for the coefficients Bn′ and Dn′, where
\[
b_n' = -F_{44}\Omega_n' - F_{46}\left(\frac{\omega\sin e}{\beta_1} - n\lambda\right), \qquad d_n' = \eta_n'G,
\]
\[
t_4 = n\lambda\left(F_{46}\Omega_0 - F_{66}\frac{\omega\sin e}{\beta_1}\right) + \Omega_0\left(-F_{44}\Omega_0 + F_{46}\frac{\omega\sin e}{\beta_1}\right),
\]
\[
t_5 = n\lambda\left(-F_{46}\Omega - F_{66}\frac{\omega\sin e}{\beta_1}\right) + \Omega\left(-F_{44}\Omega - F_{46}\frac{\omega\sin e}{\beta_1}\right),
\]
\[
t_6 = -G\eta^2 - \frac{n\lambda N\omega\sin f}{\beta_2}
\]
From Equations (16.33) to (16.36), we obtain the reflection and refraction coefficients of the irregularly reflected and refracted waves for the first-order approximation, where
\[
\Pi_n^+B_n = i\zeta_{-n}\left[(d_n - b_n)\left(-\Omega_0 + \Omega\frac{B}{B_0} + \eta\frac{D_0}{B_0}\right) + (t_3 - \eta b_n)\frac{D_0}{B_0} - (t_2 + b_n\Omega)\frac{B}{B_0} + \left(-t_1 + b_n\Omega_0\right)\right],
\]
\[
\Pi_n^+D_n = i\zeta_{-n}\left[(t_3 - \eta b_n)\frac{D_0}{B_0} - (t_2 + b_n\Omega)\frac{B}{B_0} + \left(-t_1 + b_n\Omega_0\right)\right],
\]
\[
\Pi_n^-B_n' = i\zeta_n\left[(d_n' - b_n')\left(-\Omega_0 + \Omega\frac{B}{B_0} + \eta\frac{D_0}{B_0}\right) + (t_6 + \eta b_n')\frac{D_0}{B_0} - (t_5 + b_n'\Omega)\frac{B}{B_0} + \left(-t_4 - b_n'\Omega_0\right)\right],
\]
\[
\Pi_n^-D_n' = i\zeta_n\left[(t_6 + \eta b_n')\frac{D_0}{B_0} - (t_5 + b_n'\Omega)\frac{B}{B_0} + \left(-t_4 - b_n'\Omega_0\right)\right],
\]
\[
\Pi_n^+ = (d_n - b_n), \qquad \Pi_n^- = (d_n' - b_n')
\]
Substituting Equations (16.1) and (16.38) into Equations (16.26) and (16.27) and comparing the terms independent of x, the coefficients of e^{−inλx}, and those of e^{inλx} separately on both sides of the equations thus obtained, we get the following system of six equations, which on solving gives the reflection and refraction coefficients for the second-order approximation:
\[
\begin{aligned}
& B_0\left(1 - \Omega_0^2\zeta_{-n}\zeta_n\right) + B\left(1 - \Omega^2\zeta_{-n}\zeta_n\right) - i\Omega_n\zeta_nB_n - i\Omega_n'\zeta_{-n}B_n'
= D_0\left(1 - \eta^2\zeta_{-n}\zeta_n\right) + i\eta_n\zeta_nD_n + i\eta_n'\zeta_{-n}D_n', \\[4pt]
& i\Omega_0\zeta_{-n}B_0 - i\Omega\zeta_{-n}B + \left(1 - \Omega_n^2\zeta_{-n}\zeta_n\right)B_n - \frac{\Omega_n'^2\zeta_{-n}^2}{2}B_n'
= i\eta\zeta_{-n}D_0 + \left(1 - \eta_n^2\zeta_{-n}\zeta_n\right)D_n - \frac{\eta_n'^2\zeta_{-n}^2}{2}D_n', \\[4pt]
& i\Omega_0\zeta_nB_0 - i\Omega\zeta_nB - \frac{\Omega_n^2\zeta_n^2}{2}B_n + \left(1 - \Omega_n'^2\zeta_{-n}\zeta_n\right)B_n'
= i\eta\zeta_nD_0 - \frac{\eta_n^2\zeta_n^2}{2}D_n + \left(1 - \eta_n'^2\zeta_{-n}\zeta_n\right)D_n',
\end{aligned}
\]
\[
\begin{aligned}
& B_0\left[iF_{44}\Omega_0\left(1-\Omega_0^2\zeta_{-n}\zeta_n\right) + \frac{iF_{46}\,\omega\sin e}{\beta_1}\left(-1+\Omega_0^2\zeta_{-n}\zeta_n\right)\right] \\
&\quad + B\left[iF_{44}\Omega\left(-1+\Omega^2\zeta_{-n}\zeta_n\right) + \frac{iF_{46}\,\omega\sin e}{\beta_1}\left(-1+\Omega^2\zeta_{-n}\zeta_n\right)\right] \\
&\quad + B_n\left[-F_{44}\Omega_n^2\zeta_n + n\lambda\left(-1+\frac{\Omega_n^2\zeta_{-n}\zeta_n}{2}\right)\left(\zeta_n\Omega_nF_{64} + n\lambda\zeta_nF_{66} + \frac{\zeta_nF_{66}\,\omega\sin e}{\beta_1}\right) + F_{46}\Omega_n\zeta_n - n\lambda\,\frac{F_{46}\,\omega\sin e}{2\beta_1}\right] \\
&\quad + B_n'\left[-F_{44}\Omega_n'^2\zeta_{-n} + n\lambda\left(1-\frac{\Omega_n'^2\zeta_{-n}\zeta_n}{2}\right)\right] \\
&= D_0\,iG\eta\left(1-\eta_n^2\zeta_{-n}\zeta_n\right) \\
&\quad + D_n\left[-G\eta_n^2\zeta_n - n\lambda N\left\{n\lambda\zeta_n\left(-1+\frac{\eta_n^2\zeta_{-n}\zeta_n}{2}\right) + \frac{\omega\sin f}{\beta_2}\,\zeta_n\left(-1+\frac{\eta_n^2\zeta_{-n}\zeta_n}{2}\right)\right\}\right] \\
&\quad + D_n'\left[-G\eta_n'^2\zeta_{-n} - n\lambda N\left\{n\lambda\zeta_{-n}\left(1+\frac{\eta_n'^2\zeta_{-n}\zeta_n}{2}\right) + \frac{\omega\sin f}{\beta_2}\,\zeta_{-n}\left(-1+\frac{\eta_n'^2\zeta_{-n}\zeta_n}{2}\right)\right\}\right],
\end{aligned}
\]
\[
\begin{aligned}
& B_0\left[F_{64}\,n\lambda\,\Omega_0\zeta_{-n}\left(1-\frac{\Omega_0^2\zeta_{-n}\zeta_n}{2}\right) - \frac{n\lambda F_{66}\,\omega\sin e\,\zeta_{-n}}{\beta_1} - F_{44}\Omega_0^2\zeta_{-n}\left(1+\frac{\Omega_0^2\zeta_{-n}\zeta_n}{2}\right) + \frac{\Omega_0F_{46}\,\omega\sin e\,\zeta_{-n}}{\beta_1}\right] \\
&\quad + B_n\left[F_{64}\,n\lambda\,\Omega_n\zeta_{-n}\left(-1+\Omega_n^2\zeta_{-n}\zeta_n\right) + iF_{64}\,n\lambda\left(-1+\Omega_n^2\zeta_{-n}\zeta_n\right) - \frac{F_{46}\,\omega\sin e}{\beta_1}\left(1-\Omega_n^2\zeta_{-n}\zeta_n\right)\right] \\
&\quad + B_n'\left[i\Omega_n'^2\zeta_{-n}^2\left(-n\lambda F_{64} + \frac{F_{44}\Omega_n'}{2}\right) + iF_{66}\,n\lambda\,\Omega_n'\zeta_{-n}^2\left(n\lambda - \frac{F_{66}\,\omega\sin e}{\beta_1} - \frac{F_{46}\Omega_n'}{2}\right)\right] \\
&= D_0\left[-G\eta^2\zeta_{-n} - \frac{n\lambda N\zeta_{-n}\,\omega\sin f}{\beta_2}\left(-1+\frac{\eta^2\zeta_{-n}\zeta_n}{2}\right)\right] \\
&\quad + D_n\left[-\frac{i\eta_n^2}{2}\,G\eta_n\zeta_{-n}^2 + \frac{i\,n\lambda N}{\beta_2}\left\{n\lambda\beta_2\zeta_n^2 + \omega\sin f\,\eta_n\zeta_n^2\right\}\right] + D_n'\,iG\eta_n'\left(1-\eta_n'^2\zeta_{-n}\zeta_n\right),
\end{aligned}
\]
\[
\zeta_n = \zeta_{-n} =
\begin{cases}
\dfrac{d}{2}, & n = 1 \\[4pt]
0, & n = 2, 3, \ldots
\end{cases}
\]
In this case, 2π/λ is the wavelength and d is the amplitude of the corrugation. Thus, the
reflection and refraction coefficients for the first-order approximation of the corrugation
can be obtained by setting n = 1 in Equation (16.37), and we obtain
where
\[
\Pi_1^+B_1 = \frac{id}{2}\left[(d_1-b_1)\left(-\Omega_0+\Omega\frac{B}{B_0}+\eta\frac{D_0}{B_0}\right)+(t_3'-\eta b_1)\frac{D_0}{B_0}-(t_2'+b_1\Omega)\frac{B}{B_0}+\left(-t_1'+b_1\Omega_0\right)\right],
\]
\[
\Pi_1^+D_1 = \frac{id}{2}\left[(t_3'-\eta b_1)\frac{D_0}{B_0}-(t_2'+b_1\Omega)\frac{B}{B_0}+\left(-t_1'+b_1\Omega_0\right)\right],
\]
\[
\Pi_1^-B_1' = \frac{id}{2}\left[(d_1'-b_1')\left(-\Omega_0+\Omega\frac{B}{B_0}+\eta\frac{D_0}{B_0}\right)+(t_6'+\eta b_1')\frac{D_0}{B_0}-(t_5'+b_1'\Omega)\frac{B}{B_0}+\left(-t_4'-b_1'\Omega_0\right)\right],
\]
\[
\Pi_1^-D_1' = \frac{id}{2}\left[(t_6'+\eta b_1')\frac{D_0}{B_0}-(t_5'+b_1'\Omega)\frac{B}{B_0}+\left(-t_4'-b_1'\Omega_0\right)\right],
\]
\[
\Pi_1^+ = (d_1-b_1), \qquad \Pi_1^- = (d_1'-b_1')
\]
where
\[
\Omega_1 = \frac{\omega\sin e_1}{\beta_1}\left(-\mu_1+\sqrt{\mu_1^2+\mu_2\left(\cot^2 e_1+\xi\right)}\right), \quad
\Omega_1' = \frac{\omega\sin e_1'}{\beta_1}\left(-\mu_1+\sqrt{\mu_1^2+\mu_2\left(\cot^2 e_1'+\xi\right)}\right),
\]
\[
\eta_1 = \frac{\omega\sin f_1}{\beta_2}\sqrt{\mu_1'\left(-1+\mu_2'\operatorname{cosec}^2 f_1\right)} \quad \text{and} \quad
\eta_1' = \frac{\omega\sin f_1'}{\beta_2}\sqrt{\mu_1'\left(-1+\mu_2'\operatorname{cosec}^2 f_1'\right)}
\]
Case I: When the lower half-space becomes an isotropic elastic medium without initial stress, i.e., F44 = F66 = μ, F46 = 0, S11t = 0, and the upper half-space remains the initially stressed fluid-saturated poroelastic half-space, then Equation (16.39) reduces to
\[
\Pi_1^+B_1 = \frac{id}{2}\left[(d_1-b_1)\left(-\Omega_0+\Omega\frac{B}{B_0}+\eta\frac{D_0}{B_0}\right)+(t_3'-\eta b_1)\frac{D_0}{B_0}-(t_2'+b_1\Omega)\frac{B}{B_0}+\left(-t_1'+b_1\Omega_0\right)\right],
\]
\[
\Pi_1^+D_1 = \frac{id}{2}\left[(t_3'-\eta b_1)\frac{D_0}{B_0}-(t_2'+b_1\Omega)\frac{B}{B_0}+\left(-t_1'+b_1\Omega_0\right)\right],
\]
\[
\Pi_1^-B_1' = \frac{id}{2}\left[(d_1'-b_1')\left(-\Omega_0+\Omega\frac{B}{B_0}+\eta\frac{D_0}{B_0}\right)+(t_6'+\eta b_1')\frac{D_0}{B_0}-(t_5'+b_1'\Omega)\frac{B}{B_0}+\left(-t_4'-b_1'\Omega_0\right)\right],
\]
\[
\Pi_1^-D_1' = \frac{id}{2}\left[(t_6'+\eta b_1')\frac{D_0}{B_0}-(t_5'+b_1'\Omega)\frac{B}{B_0}+\left(-t_4'-b_1'\Omega_0\right)\right],
\]
\[
\Pi_1^+ = (d_1-b_1), \qquad \Pi_1^- = (d_1'-b_1')
\]
where
\[
b_1 = -\mu\Omega_1, \quad d_1 = \eta_1G, \quad
t_1' = \lambda\,\frac{\mu\omega\sin e}{\beta_1} + \Omega_0\left(-\mu\Omega_0\right), \quad
t_2' = \lambda\,\frac{\mu\omega\sin e}{\beta_1} + \Omega\left(-\mu\Omega\right),
\]
\[
t_3' = -G\eta^2 + \frac{\lambda N\omega\sin f}{\beta_2}, \quad
b_1' = -\mu\Omega_1', \quad d_1' = \eta_1'G, \quad
t_4' = -\lambda\,\frac{\mu\omega\sin e}{\beta_1} + \Omega_0\left(-\mu\Omega_0\right),
\]
\[
t_5' = -\lambda\,\frac{\mu\omega\sin e}{\beta_1} + \Omega\left(-\mu\Omega\right), \quad
t_6' = -G\eta^2 - \frac{\lambda N\omega\sin f}{\beta_2},
\]
\[
\Omega_1 = \frac{\omega\sin e_1}{\beta_1}\sqrt{\cot^2 e_1}, \quad
\Omega_1' = \frac{\omega\sin e_1'}{\beta_1}\sqrt{\cot^2 e_1'}, \quad
\eta_1 = \frac{\omega\sin f_1}{\beta_2}\sqrt{\mu_1'\left(-1+\mu_2'\operatorname{cosec}^2 f_1\right)}, \quad
\eta_1' = \frac{\omega\sin f_1'}{\beta_2}\sqrt{\mu_1'\left(-1+\mu_2'\operatorname{cosec}^2 f_1'\right)},
\]
\[
\mu_1 = 0, \quad \mu_2 = 1, \quad \xi = 0, \quad \beta_1^2 = \frac{\mu}{\rho_1}, \quad
\mu_1' = \frac{N}{G}\left(1+\xi_1\right), \quad \mu_2' = d_p\left(1+\xi_1\right)^{-1}, \quad
\xi_1 = \frac{S_{11}^p}{2N}, \quad d_p = \gamma_{11}-\frac{\gamma_{12}^2}{\gamma_{22}}
\]
Equation (16.40) is deduced for the case when the SH-wave is incident at a corrugated interface between an initially stressed fluid-saturated poroelastic half-space and an isotropic elastic half-space.
Case II: When the upper half-space becomes an isotropic elastic medium without initial stress and without poroelasticity, i.e., S11p = 0, d_p → 1, N = G = μ_p, and the lower half-space is considered as the highly anisotropic half-space, then Equation (16.39) reduces to
\[
\Pi_1^+B_1 = \frac{id}{2}\left[(d_1-b_1)\left(-\Omega_0+\Omega\frac{B}{B_0}+\eta\frac{D_0}{B_0}\right)+(t_3'-\eta b_1)\frac{D_0}{B_0}-(t_2'+b_1\Omega)\frac{B}{B_0}+\left(-t_1'+b_1\Omega_0\right)\right],
\]
\[
\Pi_1^+D_1 = \frac{id}{2}\left[(t_3'-\eta b_1)\frac{D_0}{B_0}-(t_2'+b_1\Omega)\frac{B}{B_0}+\left(-t_1'+b_1\Omega_0\right)\right],
\]
\[
\Pi_1^-B_1' = \frac{id}{2}\left[(d_1'-b_1')\left(-\Omega_0+\Omega\frac{B}{B_0}+\eta\frac{D_0}{B_0}\right)+(t_6'+\eta b_1')\frac{D_0}{B_0}-(t_5'+b_1'\Omega)\frac{B}{B_0}+\left(-t_4'-b_1'\Omega_0\right)\right],
\]
\[
\Pi_1^-D_1' = \frac{id}{2}\left[(t_6'+\eta b_1')\frac{D_0}{B_0}-(t_5'+b_1'\Omega)\frac{B}{B_0}+\left(-t_4'-b_1'\Omega_0\right)\right],
\]
\[
\Pi_1^+ = (d_1-b_1), \qquad \Pi_1^- = (d_1'-b_1')
\]
where
\[
\Omega_1 = \frac{\omega\sin e_1}{\beta_1}\left(-\mu_1+\sqrt{\mu_1^2+\mu_2\left(\cot^2 e_1+\xi\right)}\right), \quad
\Omega_1' = \frac{\omega\sin e_1'}{\beta_1}\left(-\mu_1+\sqrt{\mu_1^2+\mu_2\left(\cot^2 e_1'+\xi\right)}\right),
\]
\[
\eta_1 = \frac{\omega\sin f_1}{\beta_2}\sqrt{-1+\operatorname{cosec}^2 f_1}, \quad
\eta_1' = \frac{\omega\sin f_1'}{\beta_2}\sqrt{-1+\operatorname{cosec}^2 f_1'},
\]
\[
\mu_1 = \frac{F_{46}}{F_{44}}, \quad \mu_2 = \frac{F_{66}}{F_{44}}, \quad \xi = \frac{S_{11}^t}{F_{66}}, \quad \beta_1^2 = \frac{F_{66}}{\rho_1}, \quad
\eta = k_2\sqrt{-1+\operatorname{cosec}^2 f},
\]
\[
\mu_1' = 1, \quad \mu_2' = 1, \quad \xi_1 = 0, \quad \text{and} \quad d_p = \gamma_{11}-\frac{\gamma_{12}^2}{\gamma_{22}}
\]
Equation (16.41) is deduced for the case when the SH-wave is incident at a corrugated interface between an isotropic elastic half-space and a highly anisotropic half-space.
16.8 Energy distribution
It is apparent that when a plane SH-wave is incident on any surface, the energy of the incident wave is distributed among the reflected and refracted waves. The energy flux for the incident wave and each of the individual reflected and refracted waves can be obtained by multiplying the total energy per unit volume by the wave velocity and the area of the wavefront. In our case, the total energy per unit volume is twice the mean kinetic energy density. Also, the wavefront area is proportional to the cosine of the angle between the wave and the normal. Therefore, by Snell's law and the spectrum theorem, the energy equation for each of the individual waves, i.e., the incident, regularly reflected and refracted, and irregularly reflected and refracted SH-waves for the nth-order approximation of the corrugation, can be written as (Abubakar 1962b, Tomar and Kaur 2007)
\[
1 = \left|\frac{B}{B_0}\right|^2 + \sum_{n=1}^{\infty}\frac{\cos e_n}{\cos e}\left|\frac{B_n}{B_0}\right|^2 + \sum_{n=1}^{\infty}\frac{\cos e_n'}{\cos e}\left|\frac{B_n'}{B_0}\right|^2 + \frac{\rho_2\beta_2\cos f}{\rho_1\beta_1\cos e}\left|\frac{D_0}{B_0}\right|^2 + \sum_{n=1}^{\infty}\frac{\rho_2\beta_2\cos f_n}{\rho_1\beta_1\cos e}\left|\frac{D_n}{B_0}\right|^2 + \sum_{n=1}^{\infty}\frac{\rho_2\beta_2\cos f_n'}{\rho_1\beta_1\cos e}\left|\frac{D_n'}{B_0}\right|^2 \tag{16.42}
\]
The energy distribution at the plane interface between the two different types of half-spaces can be deduced from Equation (16.42) by setting the coefficients Bn, Dn, Bn′, and Dn′ to zero, as they depend on the corrugation amplitude:
\[
1 = \left|\frac{B}{B_0}\right|^2 + \frac{\rho_2\beta_2\tan e}{\rho_1\beta_1\tan f}\left|\frac{D_0}{B_0}\right|^2
\]
\[
\sum_{i=1}^{6} E_i \approx 1
\]
Here, E1 and E2 are the energy ratios of the regularly reflected and regularly refracted waves. An energy ratio is defined as the ratio of the energy of a reflected/refracted wave to the energy of the incident wave. Similarly, E3, E5 and E4, E6 can be defined as the energy ratios of the irregularly reflected waves and irregularly refracted waves, respectively, for the first-order approximation of the corrugation. Thus, the energy ratios are given as
\[
E_1 = \left|\frac{B}{B_0}\right|^2, \quad E_2 = \frac{\rho_2\beta_2\cos f}{\rho_1\beta_1\cos e}\left|\frac{D_0}{B_0}\right|^2, \quad E_3 = \frac{\cos e_1}{\cos e}\left|\frac{B_1}{B_0}\right|^2,
\]
\[
E_4 = \frac{\rho_2\beta_2\cos f_1}{\rho_1\beta_1\cos e}\left|\frac{D_1}{B_0}\right|^2, \quad E_5 = \frac{\cos e_1'}{\cos e}\left|\frac{B_1'}{B_0}\right|^2, \quad E_6 = \frac{\rho_2\beta_2\cos f_1'}{\rho_1\beta_1\cos e}\left|\frac{D_1'}{B_0}\right|^2
\]
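The six ratios above can be evaluated together in a short routine. The amplitude-ratio moduli and angles below are hypothetical placeholders; with physically consistent inputs, the energy balance Σ Ei ≈ 1 of the previous section should be recovered:

```python
import math

def energy_ratios(ratios, rho1, beta1, rho2, beta2, e, f, e1, f1, e1p, f1p):
    """E1..E6 above; `ratios` holds the moduli of B/B0, D0/B0, B1/B0,
    D1/B0, B1'/B0, D1'/B0 (in that order)."""
    B, D0, B1, D1, B1p, D1p = ratios
    imp = rho2 * beta2 / (rho1 * beta1)          # impedance-like prefactor
    return (B ** 2,
            imp * math.cos(f) / math.cos(e) * D0 ** 2,
            math.cos(e1) / math.cos(e) * B1 ** 2,
            imp * math.cos(f1) / math.cos(e) * D1 ** 2,
            math.cos(e1p) / math.cos(e) * B1p ** 2,
            imp * math.cos(f1p) / math.cos(e) * D1p ** 2)

# Hypothetical amplitude-ratio moduli and angles, for illustration only
Es = energy_ratios((0.9988, 0.0011, 3e-5, 3e-5, 8e-5, 1.6e-4),
                   rho1=2.6e3, beta1=2.5e3, rho2=1.9e3, beta2=1.2e3,
                   e=math.radians(30.0), f=math.radians(14.0),
                   e1=math.radians(40.0), f1=math.radians(18.0),
                   e1p=math.radians(22.0), f1p=math.radians(11.0))
```

Each ratio is non-negative by construction, so a computed total far from unity is a useful sanity check on the amplitude ratios fed in.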
For the numerical computations, the following values of the material constants have been considered:
\[
G = 0.1387 \times 10^{10}\ \text{N/m}^2, \quad N = 0.2774 \times 10^{10}\ \text{N/m}^2, \quad \rho_{11} = 1.926137 \times 10^{3}\ \text{kg/m}^3,
\]
\[
\rho_{12} = -0.002137 \times 10^{3}\ \text{kg/m}^3, \quad \rho_{22} = 0.215337 \times 10^{3}\ \text{kg/m}^3
\]
Figure 16.2 Variation of modulus of amplitude ratio (B1/B0) with respect to angle of incidence for different values of amplitude of corrugation (d = 0.1, 0.2, 0.3).
Figure 16.3 Variation of modulus of amplitude ratio (D1/B0) with respect to angle of incidence for different values of amplitude of corrugation (d = 0.1, 0.2, 0.3).
gets decreased with increasing incident angles. Where corrugation amplitude (d) is concerned, the amplitude ratios experience a positive effect over most of the incident-angle range. Furthermore, the ratio (D1/B0) in Figure 16.3 shows behavior similar to that of Figure 16.2 with respect to incidence angle and corrugation amplitude. Here, the effect of corrugation amplitude is less pronounced, as indicated by the smaller spacing between the curves.
Figures 16.4 and 16.5 have been plotted to discuss the variation of ( B1′ /B0 ) and ( D1′ /B0 ) against the incident angle while considering the corrugation amplitude (d) as the affecting parameter. In Figure 16.4, the amplitude ratio ( B1′ /B0 ) increases gradually throughout the incident-angle range. However, with increasing values of d, the ratio ( B1′ /B0 ) decreases. The amplitude ratio ( D1′ /B0 ) has the same characteristics as ( B1′ /B0 ), as is clearly visible in Figure 16.5.
Figure 16.4 Variation of modulus of amplitude ratio ( B1′ /B0 ) with respect to angle of incidence for different values of amplitude of corrugation (d = 0.1, 0.2, 0.3).
Figure 16.5 Variation of modulus of amplitude ratio ( D1′ /B0 ) with respect to angle of incidence for different values of amplitude of corrugation (d = 0.1, 0.2, 0.3).
Figure 16.6 Variation of modulus of amplitude ratio (B1/B0) with respect to angle of incidence for different values of wavelength of corrugation (λ = 1.0, 2.0, 3.0).
Figure 16.7 Variation of modulus of amplitude ratio (D1/B0) with respect to angle of incidence for different values of wavelength of corrugation (λ = 1.0, 2.0, 3.0).
Figures 16.8 and 16.9. In Figure 16.8, ( B1′ /B0 ) decreases for the smaller incident angles until it attains its minimum. On further increase in incident angle, the amplitude ratio increases and reaches its maximum; on a still further increase in angle, the ratio decreases. However, it is observed that with increasing values of λ, the value of ( B1′ /B0 ) increases for most incident angles. A very different case is observed in Figure 16.9, where ( D1′ /B0 ) starts increasing from 0° onward and then decreases afterward. The wavelength λ has a positive influence on ( D1′ /B0 ), but the influence diminishes at higher incident angles.
Figure 16.8 Variation of modulus of amplitude ratio ( B1′ /B0 ) with respect to angle of incidence for different values of wavelength of corrugation (λ = 1.0, 2.0, 3.0).
Figure 16.9 Variation of modulus of amplitude ratio ( D1′ /B0 ) with respect to angle of incidence for different values of wavelength of corrugation (λ = 1.0, 2.0, 3.0).
Figure 16.10 Variation of modulus of amplitude ratio (B1/B0) with respect to angle of incidence for different values of frequency factor (ωd/β1).
Figure 16.11 Variation of modulus of amplitude ratio (D1/B0) with respect to angle of incidence for different values of frequency factor (ωd/β1 = 0.2, 0.3, 0.4).
the effect of (ωd/β1) is much more pronounced, as the spacing between the curves is much larger. However, (B1/B0) decreases at smaller incident angles and increases at higher angles. Similar behavior is observed in Figure 16.11 for (D1/B0). In Figure 16.12, the amplitude ratio first decreases slightly with incident angle, then increases, and finally decreases. The frequency factor has a favorable influence on ( B1′ /B0 ) for most of the incidence-angle range. In Figure 16.13, the value of ( D1′ /B0 ) increases initially and then decreases with increasing incident angle. Moreover, the frequency factor has a positive influence on the amplitude ratios throughout the incident-angle range.
Figure 16.12 Variation of modulus of amplitude ratio ( B1′ /B0 ) with respect to angle of incidence for different values of frequency factor (ωd/β1).
Figure 16.13 Variation of modulus of amplitude ratio ( D1′ /B0 ) with respect to angle of incidence for different values of frequency factor (ωd/β1 = 0.2, 0.3, 0.4).
Figure 16.14 Variation of modulus of amplitude ratio (B/B0) with respect to angle of incidence for different values of initial stress parameter (ξ1 = 0.0, 0.2, 0.4) associated with the poroelastic half-space.
Figure 16.15 Variation of modulus of amplitude ratio (D0/B0) with respect to angle of incidence for different values of initial stress parameter (ξ1 = 0.0, 0.2, 0.4) associated with the poroelastic half-space.
Figure 16.16 Variation of modulus of amplitude ratio (B1/B0) with respect to angle of incidence for different values of initial stress parameter (ξ1 = 0.0, 0.2, 0.4) associated with the poroelastic half-space.
Figure 16.17 Variation of modulus of amplitude ratio (D1/B0) with respect to angle of incidence for different values of initial stress parameter (ξ1 = 0.0, 0.2, 0.4) associated with the poroelastic half-space.
Figure 16.18 Variation of modulus of amplitude ratio ( B1′ /B0 ) with respect to angle of incidence for different values of initial stress parameter (ξ1 = 0.0, 0.2, 0.4) associated with the poroelastic half-space.
Figure 16.19 Variation of modulus of amplitude ratio ( D1′ /B0 ) with respect to angle of incidence for different values of initial stress parameter (ξ1 = 0.0, 0.2, 0.4) associated with the poroelastic half-space.
half-space. The amplitude ratios vary with the incident angle, but it is very much evident from all the figures that the initial stress parameter values (ξ1 = 0.0, 0.2, and 0.4) have no influence on the amplitude ratios, as all the curves overlap each other. Thus, it can be concluded that the initial stress parameter related to the poroelastic half-space does not have any prominent effect on the amplitude ratios.
Figure 16.20 Variation of modulus of amplitude ratio (B/B0) with respect to angle of incidence for different values of initial stress parameter (ξ = 0.0, 0.2, 0.4) associated with the highly anisotropic half-space.
Figure 16.21 Variation of modulus of amplitude ratio (D0/B0) with respect to angle of incidence for different values of initial stress parameter (ξ = 0.0, 0.2, 0.4) associated with the highly anisotropic half-space.
Figure 16.22 Variation of modulus of amplitude ratio (B1/B0) with respect to angle of incidence for different values of initial stress parameter (ξ = 0.0, 0.2, 0.4) associated with the highly anisotropic half-space.
Figure 16.23 Variation of modulus of amplitude ratio (D1/B0) with respect to angle of incidence for different values of initial stress parameter (ξ = 0.0, 0.2, 0.4) associated with the highly anisotropic half-space.
Figure 16.24 Variation of modulus of amplitude ratio ( B1′ /B0 ) with respect to angle of incidence for different values of initial stress parameter (ξ = 0.0, 0.2, 0.4) associated with the highly anisotropic half-space.
Figure 16.25 Variation of modulus of amplitude ratio ( D1′ /B0 ) with respect to angle of incidence for different values of initial stress parameter (ξ = 0.0, 0.2, 0.4) associated with the highly anisotropic half-space.
that both ratios experience the same influence. In both graphs, the amplitude ratios decrease gradually with increasing incidence angle, but they increase with the initial stress parameter related to the highly anisotropic half-space.
Similarly, (B1/B0) in Figure 16.22 and (D1/B0) in Figure 16.23 also behave identically. The amplitude ratios start to increase from smaller incidence angles and remain indistinguishable up to about 15°, irrespective of the initial stress parameter value. With a further increase in incidence angle, the effect of the initial stress parameter associated with the highly anisotropic half-space is clearly recognizable, and the ratios increase with the stress parameter.
The amplitude ratio ( B1′ /B0 ) in Figure 16.24 is not affected considerably by the initial stress parameter up to 20°. On further increasing the angle, the effect of the initial stress parameter becomes clearly visible, and ( B1′ /B0 ) increases. Besides, in Figure 16.25, ( D1′ /B0 ) starts to increase from 0° and continues to increase up to a certain angle, after which it decreases gradually. Further, the amplitude ratios increase with the initial stress parameter related to the highly anisotropic half-space.
16.10 Concluding remarks
A comprehensive investigation has been done to study the reflection and refraction
phenomena of a plane SH-wave through a corrugated interface sandwiched between
an initially stressed fluid-saturated poroelastic half-space and a highly anisotropic half-
space. Rayleigh’s method of approximation has been effectively utilized to derive first- and
second-order approximations of the coefficients. A rigorous analysis of the reflection and refraction coefficients against various parameters, such as corrugation amplitude, corrugation wavelength, frequency factor, and the initial stress parameters associated with both half-spaces, has been carried out. Each of the individual parameters has been discussed separately and in detail. The critical findings of this descriptive study can be worthwhile to the fields of geophysics and geology, and may furnish valuable assistance to geoscientists for the proper interpretation of geological structures.
References
Abubakar, I. Scattering of plane elastic waves at rough surfaces – I. Proc. Camb. Philos. Soc. 58 (1962a):
136–157.
Abubakar, I. Reflection and refraction of plane SH-waves at irregular interfaces – I. J. Phys. Earth 10.1
(1962b): 1–14.
Abubakar, I. Reflection and refraction of plane SH-waves at irregular interfaces – II. J. Phys. Earth 10.1 (1962c): 15–20.
Aki, K., Richards, P.G. Quantitative Seismology, 2nd ed. University Science Books, Sausalito (2002).
Asano, S. Reflection and refraction of elastic waves at a corrugated boundary surface. Part-I. The
case of incidence of SH-wave. Bull. Earthq. Res. Inst. 38.2 (1960): 177–197.
Asano, S. Reflection and refraction of elastic waves at a corrugated boundary surface. Part-II. Bull.
Earthq. Res. Inst. 39.3 (1961): 367–466.
Asano, S. Reflection and refraction of elastic waves at a corrugated interface. Bull. Seismol. Soc. Am.
56.1 (1966): 201–221.
Biot, M.A. Theory of elastic waves in a fluid saturated porous solid. I. Low frequency range. J. Acoust.
Soc. Am. 28 (1956a): 168–178.
Biot, M.A. Theory of elastic waves in a fluid saturated porous solid. II. High frequency range.
J. Acoust. Soc. Am. 28 (1956b): 179–191.
Biot, M.A. Mechanics of deformations and acoustic propagation in porous media. J. Appl. Phys. 33
(1962): 1482–1489.
Biot, M.A. Mechanics of Incremental Deformations, Wiley, New York (1965).
Crampin, S. A review of the effects of anisotropic layering on the propagation of seismic waves.
Geophys. J. Int. 49.1 (1977): 9–27.
Daley, P.F., Hron, F. Reflection and transmission coefficients for seismic waves in ellipsoidally anisotropic media. Geophysics 44.1 (1979): 27–38.
Deresiewicz, H. The effect of boundaries on wave propagation in a liquid-filled porous solid: II.
Love waves in a porous layer. Bull. Seismol. Soc. Am. 51.1 (1961): 51–59.
Ewing, W.M., Jardetzky, W.S., Press, F. Elastic Waves in Layered Media. Lamont Geological Observatory
Contribution. McGraw-Hill, New York (1957).
Fokkema, J.T. Reflection and transmission of elastic waves by the spatially periodic interface between
two solids (theory of the integral-equation method). Wave Motion 2.4 (1980): 375–393.
Keith, C.M., Crampin, S. Seismic body waves in anisotropic media: Reflection and refraction at a
plane interface. Geophys. J. Int. 49.1 (1977): 181–208.
Lord Rayleigh, O. M. On the dynamical theory of gratings. Proc. R. Soc. Lond. A 79.532 (1907):
399–416.
Pal, A.K., Chattopadhyay, A. The reflection phenomena of plane waves at a free boundary in a pre-stressed elastic half-space. J. Acoust. Soc. Am. 76.3 (1984): 924–925.
Rokhlin, S.I., Bolland, T.K., Adler, L. Reflection and refraction of elastic waves on a plane interface between two generally anisotropic media. J. Acoust. Soc. Am. 79.4 (1986): 906–918.
Saini, S.L., Singh, S.J. Effect of anisotropy on the reflection of SH-waves at an interface. Geophys. Res.
Bull. 15.2 (1977): 67–73.
Sato, R. The reflection of elastic waves on corrugated surface. Zisin 8.1 (1955): 8–22.
Sharma, M.D., Gogna, M.L. Reflection and refraction of plane harmonic waves at an interface
between elastic solid and porous solid saturated by viscous liquid. Pure and Appl. Geophys. 138
(1992): 249–266.
Tajuddin, M., Hussaini, S.J. Reflection of plane waves at boundaries of a liquid filled poroelastic half-
space. J. Appl. Geophys. 58.1 (2005): 59–86.
Thomsen, L. Reflection seismology over azimuthally anisotropic media. Geophysics 53.3 (1988): 304–313.
Tiersten, H.F. Linear Piezoelectric Plate Vibrations, Plenum Press, New York (1969).
Tomar, S.K., Arora, A. Reflection and transmission of elastic waves at an elastic/porous solid saturated by two immiscible fluids. Int. J. Solids Struct. 43 (2006): 1991–2013 [Erratum, ibid 44, 5796–5800 (2007)].
Tomar, S.K., Kaur, J. Reflection and transmission of SH-waves at a corrugated interface between two laterally and vertically heterogeneous anisotropic elastic solid half-spaces. Earth, Planets and Space 55.9 (2003): 531–547.
Tomar, S.K., Kaur, J. SH-waves at a corrugated interface between a dry sandy half-space and an anisotropic elastic half-space. Acta Mech. 190 (2007): 1–28.
Tomar, S.K., Kumar, R., Chopra, A. Reflection and refraction of SH-waves at a corrugated interface between transversely isotropic and visco-elastic solid half spaces. Acta Geophys. Pol. 50.2 (2002): 231–249.
Tomar, S.K., Saini, S.L. Reflection and refraction of SH-waves at a corrugated interface between two dimensional transversely isotropic half spaces. J. Phys. Earth 45.5 (1997): 347–362.
Tomar, S.K., Singh, S.S. Quasi-P-waves at a corrugated interface between two laterally dissimilar monoclinic half spaces. Int. J. Solids Struct. 44.1 (2007): 197–228.
Wu, K.Y., Xue, Q., Adler, L. Reflection and transmission of elastic waves from a fluid-saturated porous solid boundary. J. Acoust. Soc. Am. 87.6 (1990): 2349–2358.
Index
Coanda flow, for vertical and short takeoffs and landings (V/STOL) (Cont.)
    SST k–ω model, 268–269
    grid independence test and solution methodology, 270–273
    introduction, 265–267
    results and discussion, 273–281
Coanda, Henry, 266
Collocation method, 286
    introduction, 285–286
    numerical example, 295–296
    numerical solution of advection diffusion equation using, 292–295
    using B-spline basis functions
        characteristics of, 289
        exponential B-spline basis functions, 290
        first-degree (linear) B-spline, 288
        methodology, 290–292
        second-degree (quadratic) B-spline, 288–289
        to solving differential equations, 285–296
        trigonometric B-spline basis functions, 289–290
        types, 289–290
        zero-degree B-spline, 287–288
Collocation points, 286
Complex convolution theorem, 12–13
Comprehensive R Archival Network (CRAN), 252
“Conada” effect, 266
Concept drift, 195–198
Conflicting bifuzzy number (CBFN), 113
Conflicting bifuzzy set (CBFS), see Time-dependent conflicting bifuzzy set (CBFS)
Constrained nonparametric maximum likelihood estimation (CNPMLE), 135–138
    multiple failure-occurrence time data case, 137–138
    single failure-occurrence time data case, 135–137
Control theory, 54
Conventional reliability of a system, 112
Convergence of Fourier series, 42
Convex conflicting bifuzzy set (CBFS), 113
Convolution theorem, 11–12
Cost ratio, 132
Cox–de Boor recursion formula, 287
Cox–Lewis (CL) NHPP model, 142
Cox–Lewis process, 132
Crank–Nicolson scheme, 292
Crisp and fuzzy sets, comparison of, 62
Cuckoo search algorithm (CSA), 228
(α,β)-Cut of a time-dependent CBFS, 113

D

Damping theorem, 9
Data analysis, 54
    failure time data analysis of repairable system, 129–146
    Kernel estimators for, 177–200
    YouTube view count, 156–163
Decision tree (DT) models
    building with R programming tools, 254–255
    for modeling fertility in Murrah bulls, 253
    results and discussion, 258, 259
Deferred Cesàro means, 49
Defuzzification, 61
Degree of approximation, 51–52
Delta function, 14–15
    of first order, 15
    of second order, 15
Derringer’s desirability function method, 96–97
Desirability function approach, for simultaneous optimization, 96–97
Differentiation theorem
    for image, 10–11
    for original, 9–10
Diffusion of innovation, 165
Digital image processing, 54
Dirac delta function, 14–15
Dirichlet’s theorem, 41
Doetsch, G., 2
Dual-market innovation diffusion model (DMIDM), 168, 171, 174, 175
Dual-market models, 166
Dual-response surface methodology
    for simultaneous optimization of pulp yield and viscosity of pulp cooking process, 91–109
Dynamic balance margin (DBM), of biped robot, 230–232

E

Early market adopters, 166
Early market adoption model, 169
Efficient solution, see Pareto optimal solution
18-Degrees of Freedom (DOF) biped robot, 229
Energy distribution, 312–313
ε-SVR model, 252
eps-regression method, 252
Equal fuzzy sets, 59
Evolutionary and nature-inspired optimization algorithms, 228
(1 − α)-expectation tolerance limit
    lower statistical, 217–219
    upper statistical, 219–222
Exponential B-spline basis functions, 290

F

Failure time data analysis of repairable system
    introduction, 129–131
    model description, 131
    nonparametric estimation methods
        constrained nonparametric ML estimator, 135–138
        Kernel-based approach, 138–142
    numerical examples
        real example with multiple minimal repair data sets, 144–146
Q

Quadratic B-spline basis function, 288–289
Quadratic quality criterion, 13

R

Ramanujan-Fourier series, 54
Random forest (RF) models
    for prediction of fertility in Murrah bulls, 255–256
    results and discussion, 259–260
randomForest package, 256
Rayleigh’s approximation method
    boundary conditions, 305
    energy distribution, 312–313
    introduction, 297–299
    numerical discussion and results, 313–324
        corrugation amplitude effect, 314–315
        corrugation wavelength effect, 316–317
        frequency factor effect, 317–318, 319
        initial stress parameter on highly anisotropic half-space, 321–324
        initial stress parameter on poroelastic half-space, 318–321
    particular cases for special case, 310–312
    problem formulation and its solution, 299–304
    on reflection/refraction phenomena of plane SH-wave, 297–324
    solution for first-order approximation of corrugation, 305–307
    solution for lower highly anisotropic half-space, 301–302

S

Second bias theorem, 8
Second-degree (quadratic) B-spline, 288–289
Sedov, L.I., 2
Shifted unit function, 13
SH-wave propagation, 298–299
Signal processing, 54
Similarity theorem, 7
SIMPLE (Semi-Implicit Method for Pressure Linked Equation) algorithm, 271
Simultaneous optimization of multiple characteristics methodology
    data collection and modeling, 100–104
    Derringer’s desirability function method, 96–97
    dual-response surface methodology, 99–100
    fuzzy logic approach, 98–99
    introduction, 91–96
    optimization, 104–106
    of pulp cooking process, 91–109
    Taguchi’s loss function approach, 97–98
    validation, 106
Single failure-occurrence time data case, 133–134
Single-objective optimization problem, 73–74
Single step function/single jump function, 13
Small order, 43
Smoothing parameter, 139
    kernel estimator with modification of, 181–182
Social networks, 149
Soft computing techniques
    adaptive neuro fuzzy inference system (ANFIS), 66
    applications, 67–68