Machine Learning for Computer Scientists
and Data Analysts
Setareh Rafatirad • Houman Homayoun •
Zhiqian Chen • Sai Manoj Pudukotai Dinakarrao
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland
AG 2022
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
The recent popularity of machine learning (ML) has led to its adoption in almost
every known application area. The applications range from smart homes, smart grids,
and forex markets to military applications and autonomous drones. A plethora of
machine learning techniques has been introduced in the past few years, and each
technique is well suited to a specific set of applications rather than offering a
one-size-fits-all approach.
To determine how best to apply ML to a given problem, it is essential to understand
the current state of the art of the existing ML techniques, their pros and cons,
their behavior, and the applications that have already adopted them.
This book is thus aimed at researchers and practitioners who are familiar with their
application requirements and are interested in applying ML techniques, not only for
better performance but also to ensure that the adopted ML technique is not overkill
for the application at hand. We hope that this book will provide a structured
introduction and relevant background to aspiring engineers who are new to the field,
while also helping researchers familiar with the field to refresh their background.
This introduction will then be used to build up and introduce current and emerging
ML paradigms and their applications in multiple case studies.
Organization This book is organized into three parts that consist of multiple
chapters. The first part introduces the relevant background information pertaining
to ML and the traditional learning approaches that are widely used.
• Chapter 1 introduces the concept of applied machine learning. The metrics
used for evaluating the machine learning performance, data pre-processing, and
techniques to visualize and analyze the outputs (classification or regression or
other applications) are discussed.
• Chapter 2 presents a brief review of the probability theory and linear algebra that
are essential for a better understanding of the ML techniques discussed in the
later parts of the book.
• “Online Learning”: Shuo Lei (Virginia Tech), Yifeng Gao (University of Texas
Rio Grande Valley), Xuchao Zhang (NEC Labs America)
• “Recommender Learning”: Shanshan Feng (Harbin Institute of Technology,
Shenzhen), Kaiqi Zhao (University of Auckland)
• “Graph Learning”: Liang Zhao (Emory University)
• “SensorNet: An Educational Neural Network Framework for Low-Power
Multimodal Data Classification”: Tinoosh Mohsenin (University of Maryland
Baltimore County), Arnab Mazumder (University of Maryland Baltimore
County), Hasib-Al-Rashid (University of Maryland Baltimore County)
• “Transfer Learning in Mobile Health”: Hassan Ghasemzadeh (Arizona State
University)
• “Applied Machine Learning for Computer Architecture Security”: Hossein
Sayadi (California State University, Long Beach)
• “Applied Machine Learning for Cloud Resource Management”: Hossein
Mohammadi Makrani (University of California Davis), Najme Nazari
(University of California Davis)
Kaiqi Zhao (University of Auckland), Shanshan Feng (Harbin Institute of Technol-
ogy, Shenzhen), Xuchao Zhang (NEC Labs America), Yifeng Gao (University of
Texas Rio Grande Valley), Shuo Lei (Virginia Tech), Zonghan Zhang (Mississippi
State University), and Qi Zhang (University of South Carolina).
References
Index
Part I
Basics of Machine Learning
Chapter 1
What Is Applied Machine Learning?
1.1 Introduction
and durable models and performing hardware analysis on trained models (in terms
of hardware latency and area) are significant applied machine learning problems to
address in this sector [1, 7, 8].
The majority of this book discusses the difficulties and best practices associated
with constructing machine learning models, including understanding an applica-
tion’s properties and the underlying sample dataset.
What Is a Machine Learning Pipeline? How do you describe the goal of machine
learning? What are the main steps in the machine learning pipeline? We will answer
these questions through both formal definitions and practical examples. A machine
learning pipeline is meant to help automate the machine learning workflow in order
to obtain actionable insights from big datasets. The goal of machine learning is to
train an accurate model to solve an underlying problem. However, the term pipeline
is misleading, as many of the steps in the machine learning workflow may be repeated
iteratively to improve the accuracy of the model. The cyclical architecture of
machine learning pipelines is demonstrated in Fig. 1.1.
Initially, the input (or collected) data is prepared before any analysis is performed.
This step includes tasks such as data cleaning, data imputation, feature engineering,
data scaling/standardization, and data sampling. These tasks deal with issues such as
noise, outliers, categorical variables that must be transformed, features that must
be normalized or standardized, and imbalanced (or biased) datasets.
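The preparation tasks listed above can be sketched with pandas. This is a minimal illustration, not the book's own code: the column names (age, fare, seat_class) and the values are hypothetical, and median imputation and one-hot encoding are only one reasonable choice among several.

```python
import numpy as np
import pandas as pd

# Hypothetical raw records: one missing value and one categorical column.
raw = pd.DataFrame({
    "age": [25.0, 32.0, np.nan, 41.0],
    "fare": [10.0, 200.0, 35.0, 55.0],
    "seat_class": ["economy", "business", "economy", "first"],
})

# Data imputation: fill the missing age with the column median.
prepared = raw.copy()
prepared["age"] = prepared["age"].fillna(prepared["age"].median())

# Transform the categorical variable into indicator (one-hot) columns.
prepared = pd.get_dummies(prepared, columns=["seat_class"])

# Standardization: rescale numeric features to zero mean and unit variance.
for column in ["age", "fare"]:
    values = prepared[column]
    prepared[column] = (values - values.mean()) / values.std(ddof=0)

print(prepared)
```

Whether to impute with the median, the mean, or a model-based estimate depends on the dataset, so the choice here should be read as an assumption.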
In the Exploratory Data Analysis (EDA) step, data is analyzed to understand its
characteristics, such as whether it has a normal or skewed distribution (see Fig. 1.2).
Skewness in data affects a statistical model’s performance, especially in the case
of regression-based models. To prevent skewness from harming the results, it is
common practice to apply a transformation (such as a log transformation) over the
whole set of values and use the transformed data for the statistical model.
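As a rough illustration of the log transformation mentioned above, the sketch below measures skewness before and after the transform. The data is synthetic (a seeded log-normal sample), and the use of log1p rather than a plain logarithm is an assumption made so that zeros would not cause errors.

```python
import numpy as np
import pandas as pd

# Synthetic right-skewed sample (log-normal); the seed makes it repeatable.
rng = np.random.default_rng(0)
values = pd.Series(rng.lognormal(mean=0.0, sigma=1.0, size=1000))

# log1p computes log(1 + x), which also tolerates zeros in the data.
transformed = np.log1p(values)

skew_before = values.skew()
skew_after = transformed.skew()
print(f"skewness before: {skew_before:.2f}")
print(f"skewness after:  {skew_after:.2f}")
```

A skewness close to zero indicates a roughly symmetric distribution, so a smaller value after the transform suggests that it helped.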
Fig. 1.2 Comparison of different data distributions. In a right-skewed (positive) distribution, the
long tail extends to the positive, or right, side of the peak. In a left-skewed (negative) distribution,
the long tail extends to the negative, or left, side of the peak. (a) Right-skewed. (b) Normal
distribution. (c) Left-skewed
We live in the age of big data where data lives in various sources and repositories
stored in different formats: structured and unstructured. Raw input data can contain
structured data (e.g., numeric information, date), unstructured data (e.g., image,
text), or a combination of both, which is called semi-structured data. Structured data
is quantitative data that can fit nicely into a relational database such as the dataset in
Table 1.2 where the information is stored in tabular form. Structured attributes can
be transformed into quantitative values that can be processed by a machine. Unlike
structured data, unstructured data needs to be further processed to extract structured
information from it; such information is referred to as data about data, or what we
call metadata in this book.
Table 1.1 demonstrates a dataset for a sample text corpus related to research
publications in the public domain. This is an example of a semi-structured dataset,
which includes both structured and unstructured attributes: the attributes of year and
Table 1.2 Structured malware dataset, obtained from virusshare and virustotal, covering 5 different classes of malware
Bus-cycles Branch-instructions Cache-references Node-loads Node-stores Cache-misses Branch-loads LLC-loads L1-dcache-stores Class
11463 37940 8057 1104 111 2419 37190 2360 38598 Backdoor
1551 5055 1096 165 17 333 4916 330 5003 Backdoor
29560 126030 20008 1769 146 4098 108108 5987 99237 Backdoor
26211 117761 14783 1666 48 4182 117250 4788 91070 Backdoor
30139 123550 20744 1800 158 4238 124724 6969 115862 Backdoor
12989 30012 9076 1252 136 5412 27909 2000 27170 Benign
6546 12767 4953 548 87 3683 13157 864 12361 Benign
8532 31803 7087 699 124 3240 34722 1970 34974 Benign
14350 27451 9157 1843 178 6611 28507 2411 24908 Benign
13837 25436 12235 1296 192 7148 24747 2533 23757 Benign
1068674 8211420 168839 42612 28574 73696 6298568 64166 6202146 Rootkit
1054761 8187337 162526 41245 28389 71576 6688738 67408 6655480 Rootkit
1046053 8196952 158955 40525 28113 70250 6981991 69597 6950106 Rootkit
1038524 8124926 157896 40207 28214 69910 7134795 71132 7148734 Rootkit
1030773 8069156 158085 39603 28265 69356 7230800 72226 7294250 Rootkit
999182 29000000 455 64 5 94 29000000 289 14000000 Trojan
999189 29000000 457 65 5 95 29000000 288 14000000 Trojan
999260 29000000 457 65 6 96 29000000 287 14000000 Trojan
999265 29000000 459 67 6 98 29000000 287 14000000 Trojan
999277 29000000 459 67 6 98 29000000 288 14000000 Trojan
989865 9128084 2549 169 37 268 9404871 923 9614242 Virus
989984 9130539 2529 168 37 266 9402351 920 9611680 Virus
990117 9132992 2510 167 36 264 9400377 914 9609689 Virus
990233 9135227 2491 165 36 262 9397484 909 9606756 Virus
990366 9137694 2473 164 36 260 9395002 903 9604237 Virus
760836 7851079 165236 8891 4047 13803 10530146 38930 4454651 Worm
765750 7957382 161998 8717 3967 13533 10573140 38205 4453953 Worm
770445 8059123 158884 8549 3891 13273 10606358 37508 4450452 Worm
774993 8157690 155888 8388 3818 13022 10660033 36824 4454598 Worm
779347 8251754 153008 8237 3747 12785 10693711 36171 4452344 Worm
1.3 Knowing the Application and Data
ables including SEAT CLASS, GUESTS, FARE, and customer TITLE. Histograms
display a general distribution of a set of numeric values corresponding to a dataset
variable over a range.
Plots are a great means of understanding the data behind an application. Some example
applications of such plots are described in Table 1.5. It is important to note that
every plot serves a different purpose and applies to a particular type of data.
Therefore, it is crucial to understand why each technique is used during the EDA
step. Such graphical tools can help maximize insight, reveal underlying structure,
check for outliers, test assumptions, and discover optimal factors.
1.4 Getting Started Using Python
As indicated in Table 1.5, several Python libraries offer very useful tools to plot
your data. Python is a general-purpose programming language with a very large user
community. It is well suited to working with large datasets and machine learning
analysis. In this book, we focus on using the Python language for various machine
learning tasks, hands-on examples, and exercises.
Before getting started with Python for applying machine learning techniques to a
problem, you may want to find out which IDEs (Integrated Development Environments)
and text editors are tailored for Python programming, or look at code samples that
you may find helpful. An IDE is a program dedicated to software development. A Python
IDE usually includes an editor to write and handle Python code; build, execution, and
debugging tools; and some form of source control. Several Python programming
environments exist, and the right choice depends on how experienced the programmer
is and on the machine learning task at hand. For example, Jupyter Notebook is a very
helpful environment for beginners who have just started with traditional machine
learning or deep learning. Jupyter Notebook can be installed in a virtual environment
using Anaconda-Navigator, which helps with creating virtual environments and
installing packages needed for data science and deep learning. While Jupyter Notebook
is more suitable for beginners, there are machine learning frameworks such as
TensorFlow that are mostly used for deep learning tasks. As such, depending on how
advanced you are in Python programming, you may end up using a particular Python
programming environment. In this book, we will begin by using Jupyter Notebook for
programming examples and hands-on exercises. As we move toward more advanced machine
learning tasks, we switch to TensorFlow. You can download and install
Anaconda-Navigator on your machine using the following link by selecting the Python
3.7 version: https://www.anaconda.com/distribution/.
Once it is installed, navigate to Jupyter Notebook and hit “Launch.” You will then
have to choose or create a workspace folder that you will use to store all your Python
programs. Navigate to your workspace directory and hit the “New” button to create a
new Python program and select Python 3. Use the following link to get familiar with
the environment: https://docs.anaconda.com/anaconda/user-guide/getting-started/.
In the remaining part of this chapter, you will learn how to conduct preliminary
machine learning tasks through multiple Python programming examples.
Table 1.5 Popular Python tools for understanding the data behind an application. https://github.com/dgrtwo/gleam
Plot type      Python library   Usage description
Line plot      Plotly           Trends in data
Scatter plot   Gleam            Multivariate data
[Example plots: petal_width versus petal_length; petal_length by species (setosa, versicolor, virginica); petal_width versus petal_length colored by species]
from nltk.util import ngrams

# Input text
text = """tighter time analysis for real-time traffic in on-chip \
networks with shared priorities"""
print('input text: ' + text)

# split() without arguments splits on whitespace and drops empty tokens
tokens = text.split()

output2 = list(ngrams(tokens, 2))  # 2-grams
output4 = list(ngrams(tokens, 4))  # 4-grams

# Keep only the bigrams that contain none of the stop words
stop_words = {"for", "in", "with"}
bigrams = [gram for gram in output2 if not stop_words.intersection(gram)]

print('\nall extracted bigrams are:')
print(bigrams)

# Drop quadgrams that contain "for" anywhere, or "in"/"with" in their
# first two positions
quadgrams = [
    gram for gram in output4
    if "for" not in gram and not {"in", "with"}.intersection(gram[:2])
]

print('\nall extracted quadgrams are:')
print(quadgrams)

>input text: tighter time analysis for real-time traffic in on-chip networks with shared priorities

>all extracted bigrams are:
>[('tighter', 'time'), ('time', 'analysis'), ('real-time', 'traffic'), ('on-chip', 'networks'), ('shared', 'priorities')]
1.6 Data Exploration
especially when we arrive at modeling the data in order to apply machine learning.
Plotting in EDA involves histograms, box plots, scatter plots, and many more.
Exploring the data often takes considerable time. Through the process of EDA, we can
also define the problem statement for our dataset, which is very important.
Some of the common questions one can ask during EDA are:
• What kind of variations exist in data?
• What type of knowledge is discovered from the covariance matrix of data in
terms of the correlations between the variables?
• How are the variables distributed?
• What strategy should be followed with regard to the outliers detected in a dataset?
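Several of these questions can be approached with a few pandas calls. The sketch below is only an illustration on synthetic data: the column names (length, width, noise) are hypothetical, and "width" is deliberately constructed to depend on "length" so that the correlation matrix has something to reveal.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data: "width" depends on "length", "noise" does not.
rng = np.random.default_rng(42)
length = rng.normal(loc=4.0, scale=1.5, size=200)
frame = pd.DataFrame({
    "length": length,
    "width": 0.4 * length + rng.normal(scale=0.2, size=200),
    "noise": rng.normal(size=200),
})

# Variation and distribution of each variable (count, mean, std, quartiles).
print(frame.describe())

# Covariance and Pearson correlation between every pair of variables.
covariance = frame.cov()
correlation = frame.corr()
print(correlation.round(2))
```

A correlation near 1 or -1 points to a strong linear relationship between two variables, while a value near 0 suggests that one carries little linear information about the other.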
Some typical graphical techniques widely used during EDA include histogram,
confusion matrix, box plot, scatter plot, principal component analysis (PCA), and so
forth. Some of the available popular Python libraries used for EDA include seaborn,
pandas, matplotlib, and NumPy. In this section, we will illustrate multiple examples
showing how EDA is conducted on a sample dataset.
Let us begin the EDA by importing some libraries required to perform EDA.
import pandas as pd
import seaborn as sns  # visualization
import matplotlib.pyplot as plt  # visualization
import numpy as np
The first step in performing EDA is to represent the data as a DataFrame, which
offers extensive functionality for data analysis and data manipulation. Loading the
data into a pandas DataFrame is one of the most preliminary steps in EDA. Because
the values in the dataset are comma separated, all we have to do is read the CSV
file into a DataFrame, and pandas does the job for us. First, download iris.csv from
https://raw.githubusercontent.com/uiuc-cse/data-fa14/gh-pages/data/iris.csv.
Loading the data and determining its statistics can be done using the following
commands:
import pandas as pd
import matplotlib.pyplot as plt

data = pd.read_csv('iris.csv')
print('size of the dataset and the number of features are:')
print(data.shape)
print('\ncolumn names in the dataset:')
print(data.columns)
print('\nnumber of samples for each flower species:')
print(data["species"].value_counts())

data.plot(kind='scatter', x='petal_length', y='petal_width')
plt.show()

>size of the dataset and the number of features are:
>(150, 5)

>column names in the dataset:
>Index(['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species'], dtype='object')

>number of samples for each flower species:
>virginica 50
>setosa 50
>versicolor 50
>Name: species, dtype: int64
Fig. 1.6 2D scatter plot for iris dataset, based on two attributes “petal-length” and “petal-width”
A scatter plot can display the distribution of data. Figure 1.6 shows a 2D scatter
plot for visualizing the iris data (the command is included in the previous code
snippet), with petal_length on the x-axis and petal_width on the y-axis. However,
with this plot, it is difficult to understand the per-class distribution of the data.
Color coding the points by flower species makes the classes visible. This can be
done using the seaborn (sns) library by executing the following commands:
import seaborn as sns

sns.set_style("whitegrid")
sns.FacetGrid(data, hue="species", height=4) \
    .map(plt.scatter, "petal_length", "petal_width") \
    .add_legend()
plt.show()
Looking at the scatter plot in Fig. 1.6, it is a bit difficult to make sense of the
data, since all data points are displayed with the same color regardless of their
label (i.e., category). Once a different color is used for each label, however, we
can say a lot about the data. Figure 1.7 shows the color-coded scatter plot, coloring
setosa blue, versicolor orange, and virginica green. One can see how the data is
distributed across the two axes of petal-width and petal-length based on the flower
species. The plot clearly shows three clusters (blue, orange, and green): blue and
orange do not overlap, whereas orange and green do.
1.7 A Practice for Performing Exploratory Data Analysis
[Fig. 1.7: color-coded scatter plot of petal_width versus petal_length by species (setosa, versicolor, virginica)]
One important observation from this plot is that the petal-width and petal-length
attributes can distinguish between setosa and versicolor and between setosa and
virginica. However, the same attributes cannot distinguish versicolor from virginica
due to their overlapping clusters. This implies that the analyst should explore other
attributes to train an accurate classifier and perform a reliable classification.
So here is the summary of our observations:
• Using petal-length and petal-width features, we can distinguish setosa flowers
from others. How about using all the attributes?
• Separating versicolor from virginica is much harder, as they overlap considerably
in the petal-width and petal-length attributes. Would one obtain the same
observation if the sepal-width and sepal-length attributes were used instead?
We have also included the 3D scatter plot in the Jupyter notebook for this tutorial.
A sample tutorial for 3D scatter plots with Plotly Express can be found at
https://plot.ly/pandas/3d-scatter-plots/. Note that interpreting a 3D scatter plot
requires a lot of mouse interaction. (What about a 4D, 5D, or n-D scatter plot?)
Pair-Plot
When the number of features in a dataset is high, a pair-plot can be used to clearly
visualize the correlations between the dataset variables. The pair-plot visualization
helps to view 2D patterns (Fig. 1.8) but cannot show higher-dimensional patterns in
3D and 4D. Real-world datasets contain many features, and the relation between all
possible pairs of variables should be analyzed. The pair plot gives a scatter plot
for every combination of two variables that you want to analyze and thus shows the
relationship between the variables.
To plot multiple pairwise bivariate distributions in a dataset, you can use the
pairplot() function in seaborn. This shows the relationship for each of the
$\binom{n}{2}$ pairs of variables in a DataFrame as a matrix of plots, and the
diagonal plots are the univariate plots. Figure 1.8 illustrates the pair-plot for
the iris dataset, which leads to the following observations:
• Petal-length and petal-width are the most useful features for identifying the
various flower types.
• While Setosa can be easily identified (linearly separable), Virginica and Versicolor
have some overlap (almost linearly separable).
With the help of pair-plot, we can find “lines” and “if-else” conditions to build a
simple model to classify the flower types.
plt.close()
sns.set_style("whitegrid")
# "data" is the iris DataFrame loaded earlier with pd.read_csv
sns.pairplot(data, hue="species", height=3)
plt.show()
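The idea of turning "lines" read off the pair-plot into "if-else" conditions can be sketched as a tiny rule-based classifier. The thresholds below (2.5 for petal length, 1.75 for petal width) are illustrative assumptions chosen by eye, not fitted values, and the function name classify_iris is hypothetical.

```python
def classify_iris(petal_length, petal_width):
    """Classify a flower with two hand-picked threshold rules."""
    # Illustrative thresholds read off a petal scatter plot, not fitted values.
    if petal_length < 2.5:
        return "setosa"
    if petal_width < 1.75:
        return "versicolor"
    return "virginica"

# A few hand-typed (petal_length, petal_width, species) measurements.
samples = [
    (1.4, 0.2, "setosa"),
    (4.7, 1.4, "versicolor"),
    (6.0, 2.5, "virginica"),
]

predictions = [classify_iris(length, width) for length, width, _ in samples]
print(predictions)
```

Because versicolor and virginica overlap, a rule of this kind will misclassify some flowers near the boundary; it is meant as a baseline, not as an accurate model.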
Fig. 1.9 Histogram plot showing frequency distribution for variable “petal_length”
Histogram Plot
Box Plot
A box and whisker plot, also called a box plot, displays the five-number summary
of a set of data. The five-number summary is the minimum, first quartile, median,
third quartile, and maximum. In a box plot, we draw a box from the first quartile to
the third quartile. A vertical line goes through the box at the median. The whiskers
go from each quartile to the minimum or maximum. A box and whisker plot is a
way of summarizing a set of data measured on an interval scale. It is often used
in exploratory data analysis. This type of graph is used to show the shape of the
distribution, its central value, and its variability. In a box and whisker plot:
• The ends of the box are the upper and lower quartiles, so the box spans the
interquartile range.
• The median is marked by a vertical line inside the box.
• The whiskers are the two lines outside the box that extend to the highest and
lowest observations.
The following code snippet shows how a box plot is used to visualize the
distribution of the iris dataset. Figure 1.10 shows the box plot visualization across
the iris dataset “species” output variable.
sns.boxplot(x='species', y='petal_length', data=data)
plt.show()
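The five-number summary that a box plot draws can also be computed directly. The following is a small sketch using NumPy; the petal lengths are hypothetical values typed in for illustration, and np.percentile uses linear interpolation by default, which is only one of several quartile conventions.

```python
import numpy as np

# Hypothetical petal lengths in centimetres, used only for illustration.
petal_length = np.array([1.3, 1.4, 1.4, 1.5, 1.5, 1.6, 1.7, 1.9])

minimum = petal_length.min()
q1, median, q3 = np.percentile(petal_length, [25, 50, 75])
maximum = petal_length.max()

print("min, Q1, median, Q3, max:")
print(minimum, q1, median, q3, maximum)
```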
Violin Plots
Violin plots are a method of plotting numeric data and can be considered a
combination of the box plot with a kernel density plot. In the violin plot (Fig. 1.11),
we can find the same information as in box plots:
• Median.
• Interquartile range.
• The lower/upper adjacent values, defined as first quartile - 1.5 × IQR and third
quartile + 1.5 × IQR, respectively. These values can be used in a simple outlier
detection technique (Tukey’s fences), where observations lying outside of these
“fences” can be considered outliers.
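The fence rule described in the last bullet can be sketched in a few lines of NumPy. The measurements below are hypothetical, with one extreme value injected on purpose, and the multiplier 1.5 is the conventional choice rather than a requirement.

```python
import numpy as np

# Hypothetical measurements; the last value is an injected outlier.
values = np.array([1.3, 1.4, 1.4, 1.5, 1.5, 1.6, 1.7, 1.9, 5.0])

q1, q3 = np.percentile(values, [25, 75])
iqr = q3 - q1

# Tukey's fences: 1.5 x IQR below Q1 and above Q3.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = values[(values < lower_fence) | (values > upper_fence)]
print("fences:", lower_fence, upper_fence)
print("outliers:", outliers)
```

Points flagged this way are candidates for inspection; whether to drop, cap, or keep them depends on the application.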
Fig. 1.10 Box plot for Iris dataset over “species” variable
Data in statistics are sometimes classified according to how many variables are
in a study. For example, “height” might be one variable and “weight” might be
another variable. Depending on the number of variables being looked at, the data is
univariate (one variable), bivariate (two variables), or multivariate (more than two
variables).
Multivariate data analysis is a set of statistical models that examine patterns in
multi-dimensional data by considering several data variables at once. It is an
expansion of bivariate data analysis, which considers only two variables in its
models. As multivariate models consider more variables, they can examine more
complex phenomena and find data patterns that more accurately represent the real
world. These analyses can be done using the seaborn library in the following manner,
depicted in Fig. 1.12, which shows the bivariate distribution of “petal-length” and
“petal-width,” as well as the univariate profile of each attribute in the margin.
sns.jointplot(x="petal_length", y="petal_width", data=data, kind="kde")
plt.show()
Visualization techniques are very effective, helping the analyst understand the
trends in data.
Standard Deviation
The standard deviation is a statistic that measures the dispersion of a dataset
relative to its mean. It is calculated as the square root of the variance, which in
turn is obtained from each data point’s deviation from the mean. If the data points
are far from the mean, there is a higher deviation within the dataset. The more
dispersed the data, the larger the standard deviation; conversely, the more tightly
clustered the data, the smaller the standard deviation.
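# Split the iris DataFrame ("data") by species
iris_setosa = data[data["species"] == "setosa"]
iris_virginica = data[data["species"] == "virginica"]
iris_versicolor = data[data["species"] == "versicolor"]

print("\n Std-dev:")
print(np.std(iris_setosa["petal_length"]))
print(np.std(iris_virginica["petal_length"]))
print(np.std(iris_versicolor["petal_length"]))

>Std-dev:
>0.17191858538273286
>0.5463478745268441
>0.4651881339845204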
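To make the definition concrete, the standard deviation can be computed step by step and compared with NumPy's built-in function. This is a small sketch on a hypothetical sample; it also shows the difference between the population form (divide by n, NumPy's default) and the sample form (divide by n - 1, selected with ddof=1).

```python
import numpy as np

# Hypothetical sample used only to illustrate the formula.
values = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

deviations = values - values.mean()   # distance of each point from the mean
variance = np.mean(deviations ** 2)   # population variance (divide by n)
std_manual = np.sqrt(variance)

std_population = np.std(values)       # NumPy default: ddof=0, divide by n
std_sample = np.std(values, ddof=1)   # sample estimate: divide by n - 1

print(std_manual, std_population, std_sample)
```

For small samples the two forms differ noticeably, so it is worth stating which one is being reported.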
Mean/Average
The mean/average is the most popular and well-known measure of central tendency.
It can be used with both discrete and continuous data, although it is most often
used with continuous data. The mean is the sum of all values in the dataset divided
by the number of values in the dataset. So, if we have n data points in a dataset
and they have values $x_1, x_2, \ldots, x_n$, the sample mean, usually denoted by
$\bar{x}$, is $\bar{x} = (x_1 + x_2 + \cdots + x_n)/n$.
print("Means:")
print(np.mean(iris_setosa["petal_length"]))
# Mean with an outlier.
print(np.mean(np.append(iris_setosa["petal_length"], 50)))
print(np.mean(iris_versicolor["petal_length"]))

>Means:
>1.4620000000000002
>2.4137254901960787
>4.26
Variance
Median
Percentile
Percentiles are used to understand and interpret data. The nth percentile of a set
of data is the value below which n percent of the data falls. Percentiles can be
calculated using the formula $n = (P/100) \times N$, where P = percentile, N = number
Quantile
Interquartile Range
The IQR describes the middle 50% of values when ordered from lowest to highest.
To find the interquartile range (IQR), first find the median (middle value) of the
lower half and of the upper half of the data. These values are quartile 1 (Q1) and
quartile 3 (Q3). The IQR is the difference between Q3 and Q1.
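The median-of-halves procedure described above can be sketched as follows. The helper name iqr_by_halves is hypothetical, and the handling of odd-length data (the middle value is left out of both halves) is one common convention assumed here. NumPy's np.percentile interpolates between neighbouring values instead, so the two approaches can give slightly different quartiles.

```python
import numpy as np

def iqr_by_halves(values):
    """IQR from the medians of the lower and upper halves of the sorted data."""
    ordered = np.sort(np.asarray(values, dtype=float))
    half = len(ordered) // 2
    lower = ordered[:half]                  # values below the median position
    upper = ordered[len(ordered) - half:]   # values above the median position
    q1 = np.median(lower)
    q3 = np.median(upper)
    return q1, q3, q3 - q1

data_points = [1, 3, 5, 7, 9, 11, 13, 15]
q1, q3, iqr = iqr_by_halves(data_points)
print(q1, q3, iqr)

# np.percentile interpolates between neighbours, so it can differ slightly.
q1_np, q3_np = np.percentile(data_points, [25, 75])
print(q1_np, q3_np, q3_np - q1_np)
```

For this sample the halves method gives Q1 = 4, Q3 = 12, and IQR = 8, while the interpolating method gives Q1 = 4.5, Q3 = 11.5, and IQR = 7.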
The mean absolute deviation of a dataset is the average distance between each data
point and the mean. It gives us an idea about the variability in a dataset. To
compute it, calculate the mean, calculate how far away each data point is from the
mean using positive distances (also called absolute deviations), add those deviations
together, and divide the sum by the number of data points. Note that the following
snippet reports the median absolute deviation, a related robust measure that is
based on the median rather than the mean.
from statsmodels import robust

print("\n Median Absolute Deviation")
print(robust.mad(iris_setosa["petal_length"]))
print(robust.mad(iris_virginica["petal_length"]))
print(robust.mad(iris_versicolor["petal_length"]))

>Median Absolute Deviation
>0.14826022185056031
>0.6671709983275211
>0.5189107764769602
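The difference between the mean absolute deviation and the median absolute deviation can be seen on a small example. The sketch below uses plain NumPy and a hypothetical sample with one extreme value. It computes the unscaled median absolute deviation; to our understanding, statsmodels' robust.mad additionally divides by a normal-consistency constant (about 0.6745) by default, so its numbers are larger by that factor.

```python
import numpy as np

# Hypothetical sample with one extreme value.
values = np.array([1.0, 2.0, 3.0, 4.0, 100.0])

# Mean absolute deviation: average distance from the mean.
mean_abs_dev = np.mean(np.abs(values - values.mean()))

# Median absolute deviation: median distance from the median.
median_abs_dev = np.median(np.abs(values - np.median(values)))

print(mean_abs_dev, median_abs_dev)
```

The single extreme value pulls the mean absolute deviation up sharply, while the median absolute deviation barely reacts, which is why the latter is called a robust measure.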
Precision and recall alone are not adequate for showing detection performance,
and they can even contradict each other, because neither includes all the results
and samples in its formula. The F-score (i.e., F-measure) is calculated from
precision and recall to compensate for this disadvantage. The Receiver Operating
Characteristic (ROC) is a statistical plot that depicts binary detection performance
as the discrimination threshold is varied. The ROC space is defined
by FPR and TPR as the x and y axes, respectively. It helps the detector determine
the trade-offs between TP and FP, in other words, the benefits and costs. Since TPR and
FPR are equivalent to sensitivity and (1 − specificity), respectively, each prediction
result represents one point in the ROC space, in which the upper left
corner, coordinate (0, 1), stands for the best detection result,
representing 100% sensitivity and 100% specificity (the perfect detection point).
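To make the idea of points in ROC space concrete, the following minimal sketch computes one (FPR, TPR) pair per threshold. The labels, scores, and thresholds are hypothetical, and the function name is an illustrative choice, not part of any library.

```python
def roc_points(labels, scores, thresholds):
    """Return a list of (FPR, TPR) pairs, one per threshold.

    A sample is predicted positive when its score is >= the threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for t in thresholds:
        predicted = [s >= t for s in scores]
        tp = sum(1 for y, p in zip(labels, predicted) if y == 1 and p)
        fp = sum(1 for y, p in zip(labels, predicted) if y == 0 and p)
        points.append((fp / negatives, tp / positives))
    return points

labels = [1, 1, 1, 0, 1, 0, 0, 0]                   # hypothetical ground truth
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.1]  # hypothetical detector scores
points = roc_points(labels, scores, thresholds=[0.0, 0.5, 0.65, 1.01])
print(points)  # [(1.0, 1.0), (0.25, 1.0), (0.0, 0.75), (0.0, 0.0)]
```

Lowering the threshold moves the point toward (1, 1), raising it moves the point toward (0, 0), and a threshold that reaches (0, 1) would be the perfect detection point.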
\[ \mathrm{ACC} = \frac{TP + TN}{TP + FP + TN + FN} = \frac{8 + 6}{8 + 1 + 6 + 2} = 0.82. \tag{1.1} \]

\[ P = \frac{TP}{FP + TP} = \frac{8}{1 + 8} = 0.89. \tag{1.2} \]

\[ R = \frac{TP}{TP + FN} = \frac{8}{8 + 2} = 0.8. \tag{1.3} \]

\[ F\text{-Measure} = \frac{2 \times (P \times R)}{P + R} = \frac{2 \times (0.89 \times 0.8)}{0.89 + 0.8} = 0.84. \tag{1.4} \]
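The arithmetic in (1.1) to (1.4) can be checked with a short script. This is a minimal sketch using the counts from the example (TP = 8, FP = 1, TN = 6, FN = 2); the variable names are illustrative.

```python
# Confusion-matrix counts taken from the worked example.
tp, fp, tn, fn = 8, 1, 6, 2

accuracy = (tp + tn) / (tp + fp + tn + fn)
precision = tp / (fp + tp)
recall = tp / (tp + fn)
f_measure = 2 * (precision * recall) / (precision + recall)

print(round(accuracy, 2))   # 0.82
print(round(precision, 2))  # 0.89
print(round(recall, 2))     # 0.8
print(round(f_measure, 2))  # 0.84
```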
of the dataset including dataset size, the number of features (dimensions), type
of features (numeric, nominal, discrete, continuous, binary, and so forth), and class
attribute (dependent variable).
Problem 1.3 Plot the distribution of data to show the number of data points per
class and describe if the dataset is balanced or not. If the dataset is imbalanced or
skewed, what solution do you propose as a remedy?
Problem 1.4 Identify outliers (if any) in the dataset and propose a solution to deal
with the outliers and explain why it is a suitable approach to be applied to this
dataset. You can use a visualization technique such as a box plot or a scatter plot to
identify outliers.
Problem 1.5 Perform a high-level statistical analysis of the dataset in terms of
reporting the mean, median, mean absolute deviation, and quantile before dealing
with potential outliers.
Problem 1.6 Perform bivariate analysis (correlation matrix, pair plots) to find a
combination of useful features (i.e., independent variables) for classification.
Problem 1.7 Download the Airline .json file from
https://github.com/sathwikabavikadi/Machine-Learning-for-Computer-Scientists-and-Data-Analysts,
convert it to a .csv file, and import it into a dataframe.
Problem 1.8 Write Python code to extract the gender, age, and title (such as “Mr”)
attributes from the “Description” field. Use the pandas library.
Problem 1.9 Using the output of Problem 1.8, write Python code to perform data
imputation on the age and gender attributes. Explain your approach. You can use the
numpy library.
Problem 1.10 Write Python code to plot the distribution of the Gender attribute after
imputation using a histogram plot.
Problem 1.11 Write Python code to plot the distribution of the Age attribute and plot
the box plots.
Problem 1.12 Write Python code to plot the correlations between the dataset
attributes. You can use the seaborn and matplotlib libraries. If you find
correlations between independent variables, report them.
Problem 1.13 Outline the EDA techniques discussed in this chapter and the
significance of these techniques.
Problem 1.14 Discuss the importance of data pre-processing.
Chapter 2
A Brief Review of Probability Theory
and Linear Algebra
2.1 Introduction
In daily life, we encounter many events and experiments whose outcomes are
uncertain and governed by probability. Probability theory is
a useful tool for quantitatively describing and forecasting the outcomes
of such experiments. By applying probability theory to a problem,
one can simplify its understanding, evaluate it using the relevant mathematical
model, and forecast probable outcomes. Two examples
are provided here to help you gain a better understanding of probability theory’s
applicability.
Consider rolling a fair die as a simple example. When we roll a fair die,
the outcome is not certain; it is governed by probability. In more detail,
the outcome is “1” with probability 1/6, “2” with probability 1/6, and,
likewise, each face of the die occurs with probability 1/6.
In other words, if we repeat this experiment many times, the
outcome “1” occurs in about 16.67% of the trials. A similar interpretation
applies to the other possible outcomes. This analysis and interpretation
follow directly from the definition of probability.
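The long-run interpretation of the die example can be illustrated with a small simulation. This is a minimal sketch: the function name, the number of rolls, and the fixed seed are arbitrary choices made for reproducibility, not part of the original text.

```python
import random
from collections import Counter

def roll_frequencies(num_rolls, seed=0):
    """Simulate rolls of a fair six-sided die and return relative frequencies."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    counts = Counter(rng.randint(1, 6) for _ in range(num_rolls))
    return {face: counts[face] / num_rolls for face in range(1, 7)}

freqs = roll_frequencies(60_000)
for face, freq in freqs.items():
    # Each relative frequency should be close to 1/6 (about 0.1667).
    print(face, round(freq, 4))
```

With more rolls, the relative frequencies tend to settle closer to 1/6, which is the sense in which each face "occurs with probability 1/6."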
Another example is the arrival and departure rates of customers
in a restaurant. Using probability theory, the arrival rate of the customers, the
time each customer spends in the restaurant, and their departure rate can be
modeled and analyzed mathematically. In particular, the average income of
the restaurant can be estimated. Based on these predictions and analyses,
one can take action to improve the performance of the restaurant.
Probability theory, in general, covers a broad range of applications. In any
subject where complete information is unavailable, and hence there is no certainty about
the outcome, the problem can be handled through the use of probability theory. Other
\[ P(\text{Head}) = \frac{1}{2}. \]
Similarly, the probability of obtaining “Tail” is 1/2. Generally, the probability
of an event is written as P(X = xi). In this example, xi could be “Head” or
“Tail.” Note that the probability of an event is always non-negative and less than or
equal to one:
\[ 0 \le P(X = x_i) \le 1. \tag{2.1} \]
i ∈ DX ,
j ∈ DY ,
\[ p(X = x_i) = \frac{c_i}{N}, \qquad p(Y = y_j) = \frac{r_j}{N}. \tag{2.3} \]
In the above equation, ci and rj are achieved as below:
\[ c_i = \sum_{j \in D_Y} n_{ij}, \qquad r_j = \sum_{i \in D_X} n_{ij}. \tag{2.4} \]
\[ P(X = x_i) = \sum_{j \in D_Y} \frac{n_{ij}}{N}, \qquad P(Y = y_j) = \sum_{i \in D_X} \frac{n_{ij}}{N}. \tag{2.5} \]
In particular, consider P(X = xi). It can be seen from the above formula that,
after summing over all j ∈ DY, the probability of X = xi no longer depends on
the random variable Y. This is called the “marginal probability” of X, which can
be rewritten as below:
\[ P(X = x_i) = \sum_{j \in D_Y} P(X = x_i, Y = y_j). \tag{2.6} \]
\[ D_X = \{\text{Head}, \text{Tail}\}, \qquad D_Y = \{1, 2, \cdots, 6\}. \]
Now, consider that we are interested in finding the probability of X = “Head” and
Y = 1. Here, nij = 1, where i corresponds to the event “Head” of random variable
X, and j corresponds to the event “1” of random variable Y. Moreover,
the total number of outcomes obtained from the combination of X and Y is N = 12.
According to (2.2), the joint probability of X and Y is obtained as below:
\[ P(X = \text{Head}, Y = 1) = \frac{n_{ij}}{N} = \frac{1}{12}. \tag{2.7} \]
Also, inspired by (2.6), the marginal probabilities of X and Y would be obtained as
below:
\[
\begin{aligned}
P(X = \text{Head}) &= \sum_{j \in D_Y} P(X = \text{Head}, Y = y_j) = \frac{6}{12}, \\
P(Y = 1) &= \sum_{i \in D_X} P(X = x_i, Y = 1) = \frac{2}{12}.
\end{aligned}
\tag{2.8}
\]
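The joint and marginal probabilities in (2.7) and (2.8) can be reproduced by enumerating all outcomes. The sketch below assumes, as in the worked example, that X is the coin and Y is the die, and that all 12 combinations are equally likely; the variable and function names are illustrative.

```python
from fractions import Fraction
from itertools import product

coin = ["Head", "Tail"]    # values taken by X (assumed to be the coin)
die = [1, 2, 3, 4, 5, 6]   # values taken by Y (assumed to be the die)

outcomes = list(product(coin, die))
N = len(outcomes)          # 12 equally likely combinations

# Joint probability table: n_ij / N with n_ij = 1 for every pair.
joint = {(x, y): Fraction(1, N) for x, y in outcomes}

def marginal_x(x):
    """Marginal P(X = x): sum the joint probabilities over all values of Y."""
    return sum(joint[(x, y)] for y in die)

def marginal_y(y):
    """Marginal P(Y = y): sum the joint probabilities over all values of X."""
    return sum(joint[(x, y)] for x in coin)

print(joint[("Head", 1)])  # 1/12
print(marginal_x("Head"))  # 1/2, i.e., 6/12
print(marginal_y(1))       # 1/6, i.e., 2/12
```

Using `Fraction` keeps the results exact, so they can be compared directly with the fractions in the equations.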
Consider the case in which the event X = xi occurred given the knowledge that
the event Y = yj has already occurred. The probability of X = xi given Y = yj
types. He takes what was best in them and sets it forth as a
standard and prophecy for the future, a pattern in the mount to be
realised hereafter in the structure of God's spiritual temple upon
earth.
But the Holy Spirit guided the hopes and intuitions of the sacred
writers to a special fulfilment. We can see that their types have one
antitype in the growth of the Church and the progress of mankind;
but the Old Testament looked for their chief fulfilment in a Divine
Messenger and Deliverer: its ideals are types of the Messiah. The
higher life of a good man was a revelation of God and a promise of
His highest and best manifestation in Christ. We shall endeavour to
show in subsequent chapters how Chronicles served to develop the
idea of the Messiah.
But the chronicler's types are not all prophecies of future progress or
Messianic glory. The brighter portions of his picture are thrown into
relief by a dark background. The good in Jeroboam is as completely
ignored as the evil in David. Apart from any question of historical
accuracy, the type is unfortunately a true one. There is a leaven of
the Pharisees and of Herod, as well as a leaven of the kingdom. If
the base leaven be left to work by itself, it will leaven the whole
mass; and in a final estimate of the character of those who
do evil “with both hands earnestly,” little allowance needs to be
made for redeeming features. Even if we are still able to believe that
there is a seed of goodness in things evil, we are forced to admit
that the seed has remained dead and unfertilised, has had no
growth and borne no fruit. But probably most men may sometimes
be profitably admonished by considering the typical sinner—the man
in whose nature evil has been able to subdue all things to itself.
Chapter II. David—I. His Tribe And Dynasty.
King and kingdom were so bound up in ancient life that an ideal for
the one implied an ideal for the other; all distinction and glory
possessed by either was shared by both. The tribe and kingdom of
Judah were exalted by the fame of David and Solomon; but, on the
other hand, a specially exalted position is accorded to David in the
Old Testament because he is the representative of the people of
Jehovah. David himself had been anointed by Divine command to be
king of Israel, and he thus became the founder of the only legitimate
dynasty of Hebrew kings. Saul and Ishbosheth had no significance
for the later religious history of the nation. Apparently to the
chronicler the history of true religion in Israel was a blank between
Joshua and David; the revival began when the Ark was brought to
Zion, and the first steps were taken to rear the Temple in succession
to the Mosaic tabernacle. He therefore omits the history of the
Judges and Saul. But the battle of Gilboa is given to introduce the
reign of David, and incidental condemnation is passed on Saul: “So
Saul died for his trespass which he committed against the Lord,
because of the word of the Lord, which he kept not, and also for
that he asked counsel of one that had a familiar spirit, to inquire
thereby, and inquired not of the Lord; therefore He slew him
and turned the kingdom unto David the son of Jesse.”
The reign of Saul had been an unsuccessful experiment; its only real
value had been to prepare the way for David. At the same time the
portrait of Saul is not given at full length, like those of the wicked
kings, partly perhaps because the chronicler had little interest for
anything before the time of David and the Temple, but partly, we
may hope, because the record of David's affection for Saul kept alive
a kindly feeling towards the founder of the monarchy.
Inasmuch as Jehovah had “turned the kingdom unto David,” the
reign of Ishbosheth was evidently the intrusion of an illegitimate
pretender; and the chronicler treats it as such. If we had only
Chronicles, we should know nothing about the reign of Ishbosheth,
and should suppose that, on the death of Saul, David succeeded at
once to an undisputed sovereignty over all Israel. The interval of
conflict is ignored because, according to the chronicler's views, David
was, from the first, king de jure over the whole nation. Complete
silence as to Ishbosheth was the most effective way of expressing
this fact.
The reasons for denying the legitimacy of the northern kings were
obvious and conclusive. Successful rebels who had destroyed the
political and religious unity of Israel could not inherit “the sure
mercies of David” or be included in the covenant which secured the
permanence of his dynasty.
But the choice of the house of David involved the choice of the tribe
of Judah and the rejection of the kingdom of Samaria. The ten
tribes, as well as the kings of Israel, had cut themselves off both
from the Temple and the sacred dynasty, and therefore from the
covenant into which Jehovah had entered with “the man after his
own heart.” Such a limitation of the chosen people was suggested by
many precedents. Chronicles, following the Pentateuch, tells how the
call came to Abraham, but only some of the descendants of one of
his sons inherited the promise. Why should not a selection be made
from among the sons of Jacob? But the twelve tribes had been
explicitly and solemnly included in the unity of Israel, largely through
David himself. The glory of David and Solomon consisted in their
sovereignty over a united people. The national recollection of this
golden age loved to dwell on the union of the twelve tribes. The
Pentateuch added legal sanction to ancient sentiment. The twelve
tribes were associated together in national lyrics, like the “Blessing
of Jacob” and the “Blessing of Moses.” The song of Deborah told
how the northern tribes “came to the help of the Lord against the
mighty.” It was simply impossible for the chronicler to absolutely
repudiate the ten tribes; and so they are formally included in the
genealogies of Israel, and are recognised in the history of David and
Solomon. Then the recognition stops. From the time of the
disruption the northern kingdom is quietly but persistently ignored.
Its prophets and sanctuaries were as illegitimate as its kings. The
great struggle of Elijah and Elisha for the honour of Jehovah is
omitted, with all the rest of their history. Elijah is only mentioned as
sending a letter to Jehoram, king of Judah; Elisha is never even
named.
On the other hand, it is more than once implied that Judah, with the
Levites, and the remnants of Simeon and Benjamin, are the true
Israel. When Rehoboam “was strong he forsook the law of the Lord,
and all Israel with him.” After Shishak's invasion, “the princes of
Israel and the king humbled themselves.”135 The annals of Manasseh,
king of Judah, are said to be “written among the acts of the kings of
Israel.”136 The register of the exiles who returned with Zerubbabel is
headed “The number of the men of the people of Israel.”137 The
chronicler tacitly anticipates the position of St. Paul: “They are not
all Israel which are of Israel”; and the Apostle might have appealed
to Chronicles to show that the majority of Israel might fail to
recognise and accept the Divine purpose for Israel, and that the true
Israel would then be found in an elect remnant. The Jews of the
second Temple naturally and inevitably came to ignore the ten tribes
and to regard themselves as constituting this true Israel. As a matter
of history, there had been a period during which the prophets of
Samaria were of far more importance to the religion of Jehovah than
the temple at Jerusalem; but in the chronicler's time the very
existence of the ten tribes was ancient history. Then, at any rate,
it was true that God's Israel was to be found in the Jewish
community, at and around Jerusalem. They inherited the religious
spirit of their fathers, and received from them the sacred writings
and traditions, and carried on the sacred ritual. They preserved the
truth and transmitted it from generation to generation, till at last it
was merged in the mightier stream of Christian revelation.
Churches are still apt to ignore their obligations to teachers who, like
the prophets of Samaria, seem to have been associated with alien or
hostile branches of the family of God. A religious movement which
fails to secure for itself a permanent monument is usually labelled
heresy. If it has neither obtained recognition within the Church nor
yet organised a sect for itself, its services are forgotten or
denied. Even the orthodoxy of one generation is sometimes
contemptuous of the older orthodoxy which made it possible; and
yet Gnostics, Arians and Athanasians, Arminians and Calvinists, have
all done something to build up the temple of faith.
Chapter III. David—II. His Personal History.
During the six or seven centuries that elapsed between the
death of David and the chronicler, the name of David had come to
have a symbolic meaning, which was largely independent of the
personal character and career of the actual king. His reign had
become idealised by the magic of antiquity; it was a glory of “the
good old times.” His own sins and failures were obscured by the
crimes and disasters of later kings. And yet, in spite of all its
shortcomings, the “house of David” still remained the symbol alike of
ancient glory and of future hopes. We have seen from the
genealogies how intimate the connection was between the family
and its founder. Ephraim and Benjamin may mean either patriarchs
or tribes. A Jew was not always anxious to distinguish between the
family and the founder. “David” and “the house of David” became
almost interchangeable terms.
Even the prophets of the eighth century connect the future destiny
of Israel with David and his house. The child, of whom Isaiah
prophesied, was to sit “upon the throne of David” and be “over his
kingdom, to establish it and to uphold it with judgment and with
righteousness from henceforth even for ever.”139 And, again, the king
who is to “sit ... in truth, ... judging, and seeking judgment, and
swift to do righteousness,” is to have “his throne ... established in
mercy in the tent of David.”140 When Sennacherib attacked
Jerusalem, the city was defended141 for Jehovah's own sake and for
His servant David's sake. In the word of the Lord that came to Isaiah
for Hezekiah, David supersedes, as it were, the sacred fathers of the
Hebrew race; Jehovah is not spoken of as “the God of Abraham,
Isaac, and Jacob,” but “the God of David.”142 As founder of
the dynasty, he takes rank with the founders of the race and religion
of Israel: he is “the patriarch David.”143 The northern prophet Hosea
looks forward to the time when “the children of Israel shall return,
and seek the Lord their God and David their king”144; when Amos
wishes to set forth the future prosperity of Israel, he says that the
Lord “will raise up the tabernacle of David”145; in Micah “the ruler in
Israel” is to come forth from Bethlehem Ephrathah, the birthplace of
David146; in Jeremiah such references to David are frequent, the
most characteristic being those relating to the “righteous branch,
whom the Lord will raise up unto David,” who “shall reign as king
and deal wisely, and shall execute judgment and justice in the land,
in whose days Judah shall be saved, and Israel shall dwell safely”147;
in Ezekiel “My servant David” is to be the shepherd and prince of
Jehovah's restored and reunited people148; Zechariah, writing at
what we may consider the beginning of the chronicler's own period,
follows the language of his predecessors: he applies Jeremiah's
prophecy of “the righteous branch” to Zerubbabel, the prince of the
house of David149: similarly in Haggai Zerubbabel is the chosen of
Jehovah150; in the appendix to Zechariah it is said that when “the
Lord defends the inhabitants of Jerusalem” “the house of David shall
be as God, as the angel of the Lord before them.”151 In the later
literature, Biblical and apocryphal, the Davidic origin of the
Messiah is not conspicuous till it reappears in the Psalms of
Solomon152 and the New Testament, but the idea had not necessarily
been dormant meanwhile. The chronicler and his school studied and
meditated on the sacred writings, and must have been familiar with
this doctrine of the prophets. The interest in such a subject would
not be confined to scholars. Doubtless the downtrodden people
cherished with ever-growing ardour the glorious picture of the
Davidic king. In the synagogues it was not only Moses, but the
Prophets, that were read; and they could never allow the picture of
the Messianic king to grow faint and pale.153
David's name was also familiar as the author of many psalms. The
inhabitants of Jerusalem would often hear them sung at the Temple,
and they were probably used for private devotion. In this way
especially the name of David had become associated with the
deepest and purest spiritual experiences.
This brief survey shows how utterly impossible it was for the
chronicler to transfer the older narrative bodily from the book of
Samuel to his own pages. Large omissions were absolutely
necessary. He could not sit down in cold blood to tell his readers that
the man whose name they associated with the most sacred
memories and the noblest hopes of Israel had been guilty of
treacherous murder, and had offered himself to the Philistines as an
ally against the people of Jehovah.
We have already seen that the events of David's reign at Hebron and
his struggle with Ishbosheth are omitted because the chronicler does
not recognise Ishbosheth as a legitimate king. The omission would
also commend itself because this section contains the account of
Joab's murder of Abner and David's inability to do more than protest
against the crime. “I am this day weak, though anointed king; and
these men the sons of Zeruiah are too hard for me,”155 are scarcely
words that become an ideal king.
In 2 Sam. xxi. 15-17 we are told that David waxed faint and had to
be rescued by Abishai. This is omitted by Chronicles probably
because it detracts from the character of David as the ideal hero.
The next paragraph in Samuel also tended to depreciate David's
prowess. It stated that Goliath was slain by Elhanan. The chronicler
introduces a correction. It was not Goliath whom Elhanan slew, but
Lahmi, the brother of Goliath. However, the text in Samuel is
evidently corrupt; and possibly this is one of the cases in which
Chronicles has preserved the correct text.158
Then follow two omissions that are not easily accounted for. 2 Sam.
xxii., xxiii., contain two psalms, Psalm xviii. and “the Last Words of
David,” the latter not included in the Psalter. These psalms are
generally considered a late addition to the book of Samuel,
and it is barely possible that they were not in the copy used by the
chronicler; but the late date of Chronicles makes against this
supposition. The psalms may be omitted for the sake of brevity, and
yet elsewhere a long cento of passages from post-Exilic psalms is
added to the material derived from the book of Samuel. Possibly
something in the omitted section jarred upon the theological
sensibilities of the chronicler, but it is not clear what. He does not as
a rule look below the surface for obscure suggestions of undesirable
views. The grounds of his alterations and omissions are usually
sufficiently obvious; but these particular omissions are not at present
susceptible of any obvious explanation. Further research into the
theology of Judaism may perhaps provide us with one hereafter.
Our first impression as we read the book is that David comes into
the history as abruptly as Elijah or Melchizedek. Jehovah slew Saul
“and turned the kingdom unto David the son of Jesse.”159 Apparently
the Divine appointment is promptly and enthusiastically accepted by
the nation; all the twelve tribes come at once in their tens and
hundreds of thousands to Hebron to make David king. They then
march straight to Jerusalem and take it by storm, and forthwith
attempt to bring up the Ark to Zion. An unfortunate accident
necessitates a delay of three months, but at the end of that time the
Ark is solemnly installed in a tent at Jerusalem.160
We are not told who David the son of Jesse was, or why the Divine
choice fell upon him, or how he had been prepared for his
responsible position, or how he had so commended himself to Israel
as to be accepted with universal acclaim. He must, however, have
been of noble family and high character; and it is hinted that he had
had a distinguished career as a soldier.161 We should expect to find
his name in the introductory genealogies; and if we have read these
lists of names with conscientious attention, we shall remember that
there are sundry incidental references to David, and that he was the
seventh son of Jesse,162 who was descended from the Patriarch
Judah, through Boaz, the husband of Ruth.
This chapter partly explains David's popularity after Saul's death; but
it only carries the mystery a stage further back. How did this outlaw
and apparently unpatriotic rebel get so strong a hold on the
affections of Israel?
The main thread of the history is interrupted here and later on166 to
insert incidents which illustrate the personal courage and prowess of
David and his warriors. We are also told how busily occupied David
was during the three months' sojourn of the Ark in the house of
Obed-edom the Gittite. He accepted an alliance with Hiram, king of
Tyre; he added to his harem; he successfully repelled two inroads of
the Philistines, and made him houses in the city of David.167
This revelation of the Divine will as to the position of the Temple led
David to proceed at once with preparations for its erection by
Solomon, which occupied all his energies for the remainder of his
life.172 He gathered funds and materials, and gave his son full
instructions about the building; he organised the priests and Levites,
the Temple orchestra and choir, the doorkeepers, treasurers, officers,
and judges; he also organised the army, the tribes, and the royal
exchequer on the model of the corresponding arrangements for the
Temple.
Then follows the closing scene of David's life. The sun of Israel sets
amid the flaming glories of the western sky. No clouds or mists rob
him of accustomed splendour. David calls a great assembly of
princes and warriors; he addresses a solemn exhortation to them
and to Solomon; he delivers to his son instructions for “all the
works” which “I have been made to understand in writing from the
hand of Jehovah.” It is almost as though the plans of the Temple had
shared with the first tables of stone the honour of being written with
the very finger of God Himself, and David were even greater than
Moses. He reminds Solomon of all the preparations he had made,
and appeals to the princes and the people for further gifts;
and they render willingly—thousands of talents of gold, and silver,
and brass, and iron. David offers prayer and thanksgiving to the
Lord: “And David said to all the congregation, Now bless Jehovah
our God. And all the congregation blessed Jehovah, the God of their
fathers, and bowed down their heads, and worshipped Jehovah and
the king. And they sacrificed sacrifices unto Jehovah, and offered
burnt offerings unto Jehovah, on the morrow after that day, even a
thousand bullocks, a thousand rams, and a thousand lambs, with
their drink offerings and sacrifices in abundance for all Israel, and
did eat and drink before Jehovah on that day with great gladness.
And they made Solomon king; ... and David died in a good old age,
full of days, riches, and honour, and Solomon his son reigned in his
stead.”173
What idea does Chronicles give us of the man and his character? He
is first and foremost a man of earnest piety and deep spiritual
feeling. Like the great religious leaders of the chronicler's own time,
his piety found its chief expression in ritual. The main business of his
life was to provide for the sanctuary and its services; that is, for the
highest fellowship of God and man, according to the ideas then
current. But David is no mere formalist; the psalm of thanksgiving
for the return of the Ark to Jerusalem is a worthy tribute to the
power and faithfulness of Jehovah.174 His prayer after God had
promised to establish his dynasty is instinct with devout confidence
and gratitude.175 But the most gracious and appropriate of these
Davidic utterances is his last prayer and thanksgiving for the liberal
gifts of the people for the Temple.176
A portrait reveals the artist as well as the model, and the chronicler,
in depicting David, gives indications of the morality of his own times.
We may deduce from his omissions a certain progress in moral
sensitiveness. The book of Samuel emphatically condemns David's
treachery towards Uriah, and is conscious of the discreditable nature
of many incidents connected with the revolts of Absalom and
Adonijah; but the silence of Chronicles implies an even severer
condemnation. In other matters, however, the chronicler “judges
himself in that which he approveth.”179 Of course the first business of
an ancient king was to protect his people from their enemies and to
enrich them at the expense of their neighbours. The urgency of
these duties may excuse, but not justify, the neglect of the more
peaceful departments of the administration. The modern reader is
struck by the little stress laid by the narrative upon good
government at home; it is just mentioned, and that is about all. As
the sentiment of international morality is even now only in its
infancy, we cannot wonder at its absence from Chronicles; but we
are a little surprised to find that cruelty towards prisoners is included
without comment in the character of the ideal king.180 It is curious
that the account in the book of Samuel is slightly ambiguous and
might possibly admit of a comparatively mild interpretation; but
Chronicles, according to the ordinary translation, says definitely, “He
cut them with saws.” The mere reproduction of this
passage need not imply full and deliberate approval of its contents;
but it would not have been allowed to remain in the picture of the
ideal king, if the chronicler had felt any strong conviction as to the
duty of humanity towards one's enemies. Unfortunately we know
from the book of Esther and elsewhere that later Judaism had not
attained to any wide enthusiasm of humanity.
Chapter IV. David—III. His Official Dignity.
Indeed, the title of the royal house of Judah rested upon Divine
appointment. “Jehovah ... turned the kingdom unto David;
... and they anointed David king over Israel, according to the word
of Jehovah by the hand of Samuel.”182 But the Divine choice was
confirmed by the cordial consent of the nation; the sovereigns of
Judah, like those of England, ruled by the grace of God and the will
of the people. Even before David's accession the Israelites had
flocked to his standard; and after the death of Saul a great array of
the twelve tribes came to Hebron to make David king, “and all the
rest also of Israel were of one heart to make David king.”183 Similarly
Solomon is the king “whom God hath chosen,” and all the
congregation make him king and anoint him to be prince.184 The
double election of David by Jehovah and by the nation is clearly set
forth in the book of Samuel, and in Chronicles the omission of
David's early career emphasises this election. In the book of Samuel
we are shown the natural process that brought about the change of
dynasty; we see how the Divine choice took effect through the wars
between Saul and the Philistines and through David's own ability and
energy. Chronicles is mostly silent as to secondary causes, and fixes