
UNIT 2 PYTHON LIBRARIES FOR

CYBER SECURITY

2.0 Introduction
2.1 Learning Outcomes
2.2 Cyber Security Threats
2.3 How is Python used in Pen Testing?
2.4 Python Libraries for Cyber Security
2.4.1 Faker
2.4.2 Scapy
2.4.3 Beautiful Soup
2.5 Let Us Sum It Up
2.6 Check Your Progress: Key

2.0 Introduction
In this unit, we will discuss Python libraries for cyber security.
Penetration testing draws on Python at almost every stage, so we
first look at how Python is used in pen testing and then examine
three widely used libraries in detail: Faker, for generating random
test data; Scapy, for crafting, sending, and analyzing network
packets; and Beautiful Soup, for parsing and extracting data from
HTML and XML documents.
2.1 Learning Outcomes

After completing this Unit, one should be able to:

 explain the concept of cyber security threats
 describe how pen testing is applied in cyber security
 use Python libraries for cyber security

2.2 CYBER SECURITY THREATS

Cybersecurity is the process of guarding computer systems, mobile
devices, data networks, and servers against malicious attacks, and
of ensuring that the services they provide are not misdirected or
interrupted. As a result of data breaches, companies lose a lot of
money, and there are further negative effects on employees and
other users.
Due to high-profile data leaks, cyber security is becoming a global
issue, causing significant problems. Almost every company, from
startups to technology giants, has begun to consider the associated
risks, and data protection is now one of the most sought-after
professions. The information security sector is on high alert as
significant new cyber security threats appear every day.
Cyber-attacks involving fraud, viruses, cryptocurrencies, and the
misuse of machine learning and artificial intelligence (ML and AI)
also put the resources and information of businesses in danger.
The cybercrime crisis is already on the rise, and the consequences
have never been more significant, leading to an increased risk of:

Distortion: Through the systematic spread of disinformation, chatbots
and automated sources degrade the quality and trustworthiness of
information.
Disruption: Because we depend on fragile connectivity, the risk of
malware being used to take control of IoT devices is increasing.

Deterioration: Rapid advances in smart devices, together with
conflicting expectations imposed by personal privacy rules and
national security, have damaged businesses' capacity to secure
their data.

2.3 How is Python used in Pen Testing?


Penetration testing is a broad field in which cyber security
professionals analyze an organization's security by simulating
attacks. Cyber security specialists can then alert the organization
to severe security flaws, allowing it to prepare its defenses.
When someone performs a pen test, they act as if they were an attacker.
Penetration testing is divided into seven stages; in all but the
first, several Python libraries can be used:

Stage 1: Pre-Engagement - A cybersecurity team determines the


objectives and logistics of the pen test.

Stage 2: Information Gathering - Pen testers rely on the Python


libraries: NMAP, Twisted, Beautiful Soup, Scapy, Socket,
Mechanize, and Devploit.

Stage 3: Threat Modeling - Pen testers rely on the Python libraries:


Python Framework and Threat-modeling 0.0.1.

Stage 4: Vulnerability Scanning - Pen testers rely on the Python


libraries: Vulners 1.5.13, Safety, and Scapy.

Stage 5: Exploitation - Pen testers rely on the Python libraries:


Pymetasploit3 (to interface with the Metasploit framework), Scapy,
Socket, and BYOB.

Stage 6: Post-Exploitation - Pen testers rely on the Python


libraries: Pymetasploit3, BYOB, and RSPET.

Stage 7: Reporting - Pen testers rely on the Python libraries: Sys,


Plotly, Pandas, and NLTK.

When faced with a cyber-attack, cyber security specialists depend on a
few Python modules and frameworks:
Pslist - a tool for cataloging running processes and determining how
they begin and end.

Pstree - a program that uses a tree layout to show which processes are
executing and how they are related.

Psscan - assists in the discovery of hidden or previously terminated
processes.

Psxview - provides a comprehensive cross-view of the operating
system's processes, where they are located, and whether they are
active.

GRR (Google Rapid Response) is a framework for incident handling.


Cyber security specialists also depend on automating security duties
when confronted with a cyber-attack. SOAR (Security
Orchestration, Automation, and Response) tools aid in automating
security processes and are commonly used during incident
response while analyzing various alerts.

Check Your Progress 1


Note: a) Space is given below for writing your answers.
b) Check your answers with those given at this unit's end.
1) Write a short note on Pen Testing in Python.

…………………………………………………………………………
……………………………………………………………………
……………………………………………………………………
…………………………………………………………………….

2.4 Python Libraries for Cyber Security


The following sections describe Python libraries that are useful for
cyber security.

2.4.1 Faker
Faker is an open-source Python package for constructing datasets of
random data with random properties such as name, age, and
address. It supports all major locales and languages, so it can
create data appropriate to a given location.

The value of fake data in cyber security cannot be overstated. It may
be used both for testing and in real-world cyber security
operations. Faker is a strong Python library for creating random
data such as names, addresses, emails, countries, text, and URLs.
Check out the following Faker example, which makes a random
name and address using the Faker() class.
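
Here is a minimal sketch of such an example; the values printed are
random and will differ on every run:

from faker import Faker

# Create a generator with the default (en_US) locale
fake = Faker()

# Generate a random name and a random postal address
print(fake.name())
print(fake.address())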

Compatibility

Faker stopped supporting Python 2 in version 4.0.0, and since version
5.0.0 it requires Python 3.6 or above. If Python 2 compatibility is
still required, install version 3.0.1 in the meantime, and consider
porting the code to Python 3 to use all of Faker's new capabilities.
This package was formerly known as fake-factory; that name was
deprecated by the end of 2016, and a lot has changed since then.

Implementation

We first need to install the package with pip install faker.

Import the package:
Faker can then print/return fake data, such as a fake name, address,
email, SMS message, etc.

Implementing various features:

Next, we'll look at some of the Faker library's features; to do so,
we'll create a Faker instance and store it in a variable.
exp = Faker()

We'll now use this instance to produce various properties.


print('Name: ', exp.name())

print('Address: ',exp.address())

print('DOB: ',exp.date_of_birth())

We may create information in various languages based on different
locations and regions. All we have to do is specify the locales that
we desire. Let's generate some Japanese and Hindi data together.
exp = Faker(['ja_JP', 'hi_IN'])

for i in range(5):
    print(exp.name())   # names drawn from the Japanese and Hindi locales

We may also make sentences using a specified word library comprising
terms of our choosing, and Faker will build fake sentences from
those words.
words = ['Hello','Abhishek','all', 'are','where','why',]

exp.sentence(ext_word_list=words)

In addition to names and addresses, we can construct an entire profile
for a person who does not exist by using the profile feature; here
we create a fake identity.
exp.profile()

Faker is also able to generate a randomized dataset.

Generating a fictitious dataset with Faker:

We'll use the profile method to create a collection with the profiles of
100 different people, all of whom are fake. These profiles will be
created in the Hindi locale, and we'll use pandas to hold the
records in a data frame. The dataset we produce includes different
attributes, such as address, location, and website, and can be used
to fulfil whatever criteria are required. Putting these records into a
data frame lets us do things like visualization and analysis with
them.
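
A minimal sketch of that idea, assuming pandas is available and using
the hi_IN locale mentioned above:

import pandas as pd
from faker import Faker

exp = Faker('hi_IN')   # Hindi-locale generator

# Build a data frame of 100 fake profiles (name, address, job, birthdate, ...)
profiles = pd.DataFrame([exp.profile() for _ in range(100)])
print(profiles.head())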

Generate columns that match specific formats

You may also construct fake data that must follow a specific structure,
such as a product code or a phone model:

# Use bothify to generate random digits (#) or letters (?).
# The letters used can be limited with the letters= argument.
fake = Faker()   # the following snippets assume a Faker instance named fake
print(fake.bothify('PROD-??-##', letters='ABCDE'))
print(fake.bothify('iPhone-#'))

Creating Boolean columns:

For Boolean columns, you may specify the chance that a random result
will be True.

# Create fake True/False values
# Random True/False
print(fake.boolean())

# Specify % True
print(fake.boolean(chance_of_getting_true=25))

For category columns, you can give a list of values from which to pick
at random. If you don't want each item to get an equal probability
of being picked, you can define the weight to assign to each value
instead.

import numpy as np

industry = ['Automotive','Health Care','Manufacturing','High Tech','Retail']


# Specify probabilities of each category (must sum to 1.0)
weights = [0.6, 0.2, 0.1, 0.07, 0.03]

# p= specifies the probabilities of each category. Must sum to 1.0


print(np.random.choice(industry, p=weights))

# Generating choice without weights (equal probability on all elements)


print(np.random.choice(industry))


Create numeric columns with a normal distribution

By specifying the mean and standard deviation, you may generate
numeric columns for fields such as sales figures. You can also
generate random integers by specifying an upper bound.

Generating dates between two points in time:

Dates and datetimes can be constructed in a variety of ways. You can
choose a date within this century, this decade, this year, or this
month, or within an interval between two dates.

# 1st argument is mean of distribution, 2nd is standard deviation


print(np.random.normal(1000, 100))
# Rounded result
print(round(np.random.normal(1000, 100)))

# Generate random integer between 0 and 4


print(np.random.randint(5))

print(fake.date_this_century().strftime('%m-%d-%Y'))
print(fake.date_this_decade().strftime('%m-%d-%Y'))
print(fake.date_this_year().strftime('%m-%d-%Y'))
print(fake.date_this_month().strftime('%m-%d-%Y'))
print(fake.time())
import pandas as pd

# Start and end dates to generate data


my_start = pd.to_datetime('01-01-2021')
my_end = pd.to_datetime('12-31-2021')

print(f'Random date between {my_start} & {my_end}')


fake.date_between_dates(my_start, my_end).strftime('%m-%d-%Y')

We can also generate parts of dates, or dates relative to today:

print(fake.year())
print(fake.month())
print(fake.day_of_month())
print(fake.day_of_week())
print(fake.month_name())
print(fake.past_date('-1y'))
print(fake.future_date('+1d'))

2.4.2 Scapy
Scapy is a strong Python package used in pen testing for activities
such as scanning, probing, unit testing, and attack detection. The
program aims to sniff, transmit, analyze, and forge network
packets. The primary concept is to send packets and receive a
meaningful answer. Scapy offers an advantage over comparable
programs like Nmap, which often respond with a straightforward
(open/closed/filtered) status. Engineers may create packets
(requests), record responses (answers) as packet pairs
(request, answer), and obtain the results in the form of a
(request, answer) list using Scapy. Many tools ignore packets the
target network/host does not reply to; Scapy, on the other hand,
gives users all the information by constructing an additional list of
unanswered packets. In addition to probing, Scapy may also
deliver invalid frames to a targeted server, inject 802.11 frames,
decode WEP-encrypted VoIP packets, and so on. Scapy provides
these capabilities once the module is imported with a simple
import line.

The expression "from scapy. all import *" in the following code
instructs the processor to export all of the Scapy module's
capabilities. The desired Scapy functions can be imported by
substituting the asterisk (*) sign with the relevant procedures.
Consider scenario.
#! /usr/bin/env pythonj8
import sys
from scapy.all import ICMP, IP, ARP
Scapy is a Python library with its own command line interpreter (CLI)
for creating, modifying, sending, and capturing network packets.
It may be used interactively or as a library by importing it into
Python scripts. It is compatible with Linux, Mac OS X, and
Microsoft Windows.

When used in cyber security, this tool allows us to perform
reconnaissance and/or network attacks. Scapy's main advantage is
that, unlike other tools, it lets us manipulate data packets at a low
level, allowing us to use existing network techniques and calibrate
them to meet our needs.

Sending packets and receiving replies are the two main functions of
Scapy. It sends packets, receives answers, matches requests with
responses, and returns a list of packet pairs (request, response)
together with a list of unmatched packets. It has a significant
benefit over Nmap or hping in that the response is not restricted to
a simple status but includes the entire packet.

And, since Scapy is a Python library, we can build our own tools on
top of it. In this manner, we can create additional higher-level
capabilities and combine them according to our requirements.

Using the Command Line Interpreter (CLI)

Scapy ships with its own interactive command line interpreter (CLI).
To get started with this tool, we need to do the following:
Download and install Scapy.
Run it from the command prompt (with administrator privileges).
Performing Basic Tasks with the Command Line Interface (CLI)
The main basic functions we should know are:
ls() returns a list of all accessible layers.
explore() provides a graphical interface for seeing layers that already
exist.
lsc() : available functions
help() returns a help menu.
Among the functions, the following are the most common:

send(): send packets at level 3.
sendp(): send packets at level 2.
sr(): send and receive packets at level 3.
srp(): send and receive packets at level 2.
sr1(): send and receive only the first packet at level 3.
srp1(): sends and receives only the first packet to level 2.
sniff(): packet sniffing.
traceroute(): command trace route.
arping(): Send who-has ARP requests to determine which machines
are up in the network.
Example of use through the Command Line Interpreter (CLI)
Perhaps the simplest packet to make is an ICMP packet. In this
scenario, we'll create a packet, keep it in a variable (p), and
parameterize it with an IP layer (destination IP), a second ICMP
layer, and finally a payload ("hello SanExperts").
Then we'll transmit it with sr1(p), as sketched below.
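
A minimal sketch of that exchange, assuming 10.0.0.1 stands in for a
reachable destination IP:

from scapy.all import IP, ICMP, Raw, sr1

# Build the packet: IP layer / ICMP layer / payload
p = IP(dst="10.0.0.1") / ICMP() / Raw(load="hello SanExperts")

# Send at layer 3 and keep only the first reply (None if nothing answers)
reply = sr1(p, timeout=2)

# Display the layered contents of the packet we built
p.show()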

If we execute the show() command, it will display the layered contents
of the packet.
We used the IP address of our own equipment in this example, but we
could change the packet's settings to use another IP or MAC
address. We may send at level 3 (sr) or level 2 (srp) of the TCP/IP
architecture, depending on our needs.

To scan for hosts in our subnet, for example, we just run the srp()
command and display the values of the hosts that have answered
(ans):
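
A minimal ARP-scan sketch along those lines; the subnet
192.168.1.0/24 is an illustrative assumption:

from scapy.all import ARP, Ether, srp

# Broadcast a who-has ARP request for every address in the subnet
ans, unans = srp(
    Ether(dst="ff:ff:ff:ff:ff:ff") / ARP(pdst="192.168.1.0/24"),
    timeout=2, verbose=False,
)

# Print the IP and MAC address of every host that answered
for sent, received in ans:
    print(received.psrc, received.hwsrc)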

If we need help on any of the commands, we can show it by using the
‘help()’ command:

Using Scapy in Real-Life Situations

Port scanning is the first use case.
Using Scapy, we build a program for penetration testing at both the
TCP and UDP levels.
This will enable us to create a specialized method for assessing
services on our network.

To implement the use case, we'll first create a Python program that
accepts command-line parameters. We write a routine that lets us
specify the target to be checked, the port numbers to be scanned,
either individually or as a range (1-n), and whether the scan will
be performed over TCP or UDP. As a prerequisite, we must
import Scapy's libraries.
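
A minimal argument-parsing sketch for such a script; the option names
(--target, --ports, --proto) are illustrative assumptions, not part of
the original program:

import argparse
from scapy.all import IP, TCP, UDP, ICMP, sr1   # imports reused by the sketches below

def parse_args():
    parser = argparse.ArgumentParser(description="Simple Scapy port scanner")
    parser.add_argument("--target", required=True, help="IP address to scan")
    parser.add_argument("--ports", default="1-1024",
                        help="single port (80) or range (1-1024)")
    parser.add_argument("--proto", choices=["tcp", "udp"], default="tcp")
    return parser.parse_args()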

In the main() section of the program, we check that the arguments are
correct and start the procedure that will do the scan, as sketched
below.
Next, we write the routine that will do the scan, distinguishing TCP
from UDP. We describe the different layers with the destination
data and the relevant settings to assemble our packet, then send it
with sr1() and save the outcome in a variable (result).
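
A sketch of the TCP branch of that routine, under the assumptions
above:

def scan_tcp_port(target, port):
    # Build a TCP SYN probe and send it at layer 3
    probe = IP(dst=target) / TCP(dport=port, flags="S")
    result = sr1(probe, timeout=1, verbose=False)
    return result   # a reply packet, or None if nothing came back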

Next, depending on the responses, we'll look at the flags collected to
see whether the ports are open or closed. Before proceeding, it is
necessary to understand how TCP works and how its handshakes
and flags are formed, so that the port's status can be identified
from each answer.

• Open port: The port isn't restricted by a firewall, and a service is
listening on it.
SYN and ACK flags are received (0x12).

• Closed port: The port is not blocked by a firewall, but no service is
listening on it.
RST and ACK flags are received (0x14).
• Filtered port: A firewall (FW) prevents entry to the port.
There is no response.

Returning to the sample code, here is how the answer is assessed.

In this example, the "SYN, ACK" flags are looked for in the response;
if they are found, it is assumed that the handshake was accepted
and the port is open.
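
A sketch of that check, reusing the scan_tcp_port() helper assumed
above:

def classify_tcp(result):
    if result is None:
        return "filtered"          # no response: likely dropped by a firewall
    if result.haslayer(TCP):
        flags = int(result[TCP].flags)
        if flags == 0x12:          # SYN+ACK: a service accepted the handshake
            return "open"
        if flags == 0x14:          # RST+ACK: port reachable but closed
            return "closed"
    return "filtered"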

We'll use a similar procedure with UDP. We'll transmit the packet and
wait a while to see whether anything comes back, because UDP is
a connectionless protocol, and then decide whether the port is
open or closed based on the sort of answer. If we want further
information, such as whether the ports are "filtered", we should
add that logic to the TCP and UDP testers, as in the sketch below:
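
A sketch of the UDP branch under the same assumptions; an ICMP
port-unreachable reply (type 3, code 3) is treated as a closed port:

def scan_udp_port(target, port):
    probe = IP(dst=target) / UDP(dport=port)
    result = sr1(probe, timeout=2, verbose=False)
    if result is None:
        return "open|filtered"     # no answer: open or silently dropped
    if result.haslayer(UDP):
        return "open"              # the service replied over UDP
    if result.haslayer(ICMP) and int(result[ICMP].type) == 3 \
            and int(result[ICMP].code) == 3:
        return "closed"            # ICMP port unreachable
    return "filtered"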

Finally, running the script produces output listing the status of each
scanned port.

2.4.3 Beautiful Soup
Beautiful Soup is a Python library for parsing and extracting data from
HTML and XML files. It works with your chosen parser to
provide smooth parse tree navigation, search, and editing. It is not
uncommon for it to save developers hours, if not days, of work.
Here's an HTML page that will be used to demonstrate several concepts:



html_doc = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="title"><b>The Dormouse's story</b></p>

<p class="story">Once upon a time there were three little sisters; and their names were
<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>

<p class="story">...</p>
"""

When we run the "three sisters" text through Beautiful Soup, we get
Easy preparation objects, which is a layered data model that
reflects the document:
from bs4 import BeautifulSoup
soup = BeautifulSoup(html_doc, 'html.parser')

print(soup.prettify())
# <html>
# <head>
# <title>
# The Dormouse's story
# </title>
# </head>
# <body>
# <p class="title">
# <b>
# The Dormouse's story
# </b>
# </p>
# <p class="story">
# Once upon a time there were three little sisters; and their names were
# <a class="sister" href="http://example.com/elsie" id="link1">
# Elsie
# </a>
# ,
# <a class="sister" href="http://example.com/lacie" id="link2">
# Lacie
# </a>
# and
# <a class="sister" href="http://example.com/tillie" id="link2">
# Tillie
# </a>
# ; and they lived at the bottom of a well.
# </p>
# <p class="story">
# ...
# </p>
# </body>
# </html>


Here are a few quick ways to navigate that data structure:
soup.title
# <title>The Dormouse's story</title>

soup.title.name
# u'title'

soup.title.string
# u'The Dormouse's story'

soup.title.parent.name
# u'head'

soup.p
# <p class="title"><b>The Dormouse's story</b></p>

soup.p['class']
# u'title'

soup.a
# <a class="sister" href="http://example.com/elsie" id="link1">Elsie</a>

soup.find_all('a')
# [<a class="sister" href="http://example.com/elsie" id="link1">Elsie</a>,
# <a class="sister" href="http://example.com/lacie" id="link2">Lacie</a>,
# <a class="sister" href="http://example.com/tillie" id="link3">Tillie</a>]

soup.find(id="link3")
# <a class="sister" href="http://example.com/tillie" id="link3">Tillie</a>

Extracting all URLs found within a page's <a> tags is a common task:

for link in soup.find_all('a'):
print(link.get('href'))
# http://example.com/elsie
# http://example.com/lacie
# http://example.com/tillie

Extracting all of the text from a page is another common task:
print(soup.get_text())
# The Dormouse's story
#
# The Dormouse's story
#
# Once upon a time there were three little sisters; and their names were
# Elsie,
# Lacie and
# Tillie;
# and they lived at the bottom of a well.
#
# ...

Installing Beautiful Soup

If you're using a recent version of Debian or Ubuntu Linux, you can
install Beautiful Soup with the system package manager:
Install python-bs4 using apt-get (for Python 2).
Install python3-bs4 using apt-get (for Python 3).

Because Beautiful Soup 4 is published through PyPI, you can install it
with easy_install or pip if you don't have access to the system
packager. The package name is beautifulsoup4, and it works on
both Python 2 and Python 3. Make sure you use the right version
of pip or easy_install for your Python version (with Python 3 they
may be named pip3 and easy_install3, respectively).
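
For example, the pip command is simply:

$ pip install beautifulsoup4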

If you don't have easy_install or pip installed, you can download the
Beautiful Soup 4 source tarball and install it with setup.py.

If everything else fails, the Beautiful Soup license allows you to
package the whole library with your application: you can use
Beautiful Soup without installing it by downloading the tarball
and copying the bs4 directory into your application's codebase.
Beautiful Soup is developed using Python 2.7 and Python 3.2, but it
should work with more recent versions too.

Installing a parser
Beautiful Soup supports the HTML parser included in Python’s
standard library and several third-party Python parsers. One is
the lxml parser. Depending on your setup, you might install lxml
with one of these commands:

$ apt-get install python-lxml

$ easy_install lxml

$ pip install lxml

Another alternative is the pure-Python html5lib parser, which parses


HTML as a web browser does. Depending on the setup, one
might install html5lib with one of these commands:

$ apt-get install python-html5lib

$ easy_install html5lib

$ pip install html5lib

Making the soup

To parse a document, pass it into the BeautifulSoup constructor. You


can pass in a string or an open filehandle:

from bs4 import BeautifulSoup

with open("index.html") as fp:


soup = BeautifulSoup(fp)

soup = BeautifulSoup("<html>data</html>")

First, the document is converted to Unicode, and HTML entities are


converted to Unicode characters:

BeautifulSoup("Sacr&eacute; bleu!")
<html><head></head><body>Sacré bleu!</body></html>

Beautiful Soup then parses the document using the best available
parser. It will use an HTML parser unless you specifically tell it to
use an XML. (See Parsing XML.)

Kinds of objects

Beautiful Soup transforms a complex HTML document into a


complex tree of Python objects. But you’ll only ever have to deal
with four kinds of objects: Tag , NavigableString , BeautifulSoup ,
and Comment .

Tag

A Tag object corresponds to an XML or HTML tag in the original


document:

soup = BeautifulSoup('<b class="boldest">Extremely bold</b>')


tag = soup.b
type(tag)
# <class 'bs4.element.Tag'>

Tags have a lot of attributes and methods, and we will cover most of
them in Navigating the tree and Searching the tree. For now, the
most important features of a tag are its name and attributes.

Name

Every tag has a name, accessible as .name :

tag.name
# u'b'

If you change a tag’s name, the change will be reflected in any HTML
markup generated by Beautiful Soup:

tag.name = "blockquote"
tag
# <blockquote class="boldest">Extremely bold</blockquote>

Attributes
A tag may have any number of attributes. The
tag <b id="boldest"> has an attribute “id” whose value is
“boldest.” You can access a tag’s attributes by treating the tag like
a dictionary:

tag['id']
# u'boldest'

You can access that dictionary directly as .attrs :

tag.attrs
# {u'id': 'boldest'}

You can add, remove, and modify a tag’s attributes. Again, this is done
by treating the tag as a dictionary:

tag['id'] = 'verybold'
tag['another-attribute'] = 1
tag
# <b another-attribute="1" id="verybold"></b>

del tag['id']
del tag['another-attribute']
tag
# <b></b>

tag['id']
# KeyError: 'id'
print(tag.get('id'))
# None

Multi-valued attributes

HTML 4 defines a few attributes that can have multiple values. HTML
5 removes a couple of them but defines a few more. The most
common multi-valued attribute is class (a tag can have more than
one CSS class). Others
include rel , rev , accept-charset , headers , and accesskey .
Beautiful Soup presents the value(s) of a multi-valued attribute as
a list:

css_soup = BeautifulSoup('<p class="body"></p>')


css_soup.p['class']
# ["body"]

css_soup = BeautifulSoup('<p class="body strikeout"></p>')


css_soup.p['class']
# ["body", "strikeout"]

If an attribute looks like it has more than one value, but it’s not a
multi-valued attribute as defined by any version of the HTML
standard, Beautiful Soup will leave the attribute alone:

id_soup = BeautifulSoup('<p id="my id"></p>')


id_soup.p['id']
# 'my id'

When you turn a tag back into a string, multiple attribute values are
consolidated:

rel_soup = BeautifulSoup('<p>Back to the <a


rel="index">homepage</a></p>')
rel_soup.a['rel']
# ['index']
rel_soup.a['rel'] = ['index', 'contents']
print(rel_soup.p)
# <p>Back to the <a rel="index contents">homepage</a></p>

You can disable this by passing multi_valued_attributes=None as a


keyword argument into the BeautifulSoup constructor:

no_list_soup = BeautifulSoup('<p class="body strikeout"></p>', 'html',


multi_valued_attributes=None)
no_list_soup.p['class']
# u'body strikeout'

You can use get_attribute_list() to get a value that's always a list,
whether or not it's a multi-valued attribute:

id_soup.p.get_attribute_list('id')
# ["my id"]

If you parse a document as XML, there are no multi-valued attributes:

xml_soup = BeautifulSoup('<p class="body strikeout"></p>', 'xml')


xml_soup.p['class']
# u'body strikeout'

Again, you can configure this using


the multi_valued_attributes argument:

class_is_multi= { '*' : 'class'}


xml_soup = BeautifulSoup('<p class="body strikeout"></p>', 'xml',
multi_valued_attributes=class_is_multi)
xml_soup.p['class']
# [u'body', u'strikeout']
You probably won’t need to do this, but use the defaults as a guide if
you do. They implement the rules described in the HTML
specification:

from bs4.builder import builder_registry


builder_registry.lookup('html').DEFAULT_CDATA_LIST_ATTRIBUTES

NavigableString

A string corresponds to a bit of text within a tag. Beautiful Soup uses


the NavigableString class to contain these bits of text:

tag.string
# u'Extremely bold.'
type(tag.string)
# <class 'bs4.element.NavigableString'>

A NavigableString is just like a Python Unicode string, except that it


also supports some of the features described in Navigating the
tree and Searching the tree. You can convert a NavigableString to
a Unicode string with unicode() :

unicode_string = unicode(tag.string)
unicode_string
# u'Extremely bold.'
type(unicode_string)
# <type 'unicode'>

You can’t edit a string in place, but you can replace one string with
another, using replace_with():

tag.string.replace_with("No longer bold")


tag
# <blockquote>No longer bold</blockquote>

NavigableString supports most of the features described


in Navigating the tree and Searching the tree, but not all of them.
In particular, since a string can’t contain anything (the way a tag
may include a string or another tag), strings don’t support
the .contents or .string attributes, or the find() method.

If you want to use a NavigableString outside of Beautiful Soup, you


should call unicode() on it to turn it into a standard Python
Unicode string. If you don’t, your string will carry around a
reference to the entire Beautiful Soup parse tree, even when you’re
done using Beautiful Soup. This is a big waste of memory.

BeautifulSoup

The BeautifulSoup object represents the parsed document as a whole.


For most purposes, you can treat it as a Tag object, which means it
supports most of the methods described in Navigating the
tree and Searching the tree.

You can also pass a BeautifulSoup object into one of the methods
defined in Modifying the tree, just as you would a Tag. This lets
you do things like combine two parsed documents:

doc = BeautifulSoup("<document><content/>INSERT FOOTER


HERE</document", "xml")
footer = BeautifulSoup("<footer>Here's the footer</footer>", "xml")
doc.find(text="INSERT FOOTER HERE").replace_with(footer)
# u'INSERT FOOTER HERE'
print(doc)
# <?xml version="1.0" encoding="utf-8"?>
# <document><content/><footer>Here's the footer</footer></document>

Since the BeautifulSoup object doesn’t correspond to an actual HTML


or XML tag, it has no name and no attributes. But sometimes it’s
useful to look at its .name , so it’s been given the
special .name “[document]”:

soup.name
# u'[document]'
Check Your Progress 2
Note: a) Space is given below for writing your answers.
b) Check your answers with those given at this unit's end.
i) Write a short note on the Scapy library.

ii) Write briefly about how the Beautiful Soup library is used in cyber
security.

……………………………………………………………………
…………………………………………………………………
…………………………………………………………………
…………………………………………………………………
……………………………………………………………….

2.5 Let Us Sum It Up

In this unit, we saw how Python supports the different stages of
penetration testing and examined three libraries in detail. Faker
generates random but realistic data such as names, addresses,
profiles, and dates, which is useful for building test datasets.
Scapy lets us craft, send, sniff, and analyze network packets,
either interactively through its CLI or as a library inside Python
scripts, for example to build port scanners. Beautiful Soup parses
HTML and XML documents into a navigable tree of Python
objects, making it easy to extract links, text, and attributes from
web pages.

2.6 Check Your Progress: Key

1) Penetration testing is a broad field in which cyber security
professionals analyze an organization's security by simulating
attacks. Cyber security specialists can then alert the
organization to serious security flaws, allowing it to prepare
its defenses.
Penetration testing is divided into seven stages:
 Stage 1: Pre-Engagement - A cybersecurity team determines the
objectives and logistics of the pen test.
 Stage 2: Information Gathering - Pen testers rely on the Python
libraries: NMAP, Twisted, Beautiful Soup, Scapy, Socket,
Mechanize, and Devploit.
 Stage 3: Threat Modeling - Pen testers rely on the Python libraries:
Python Framework and Threat-modeling 0.0.1.
 Stage 4: Vulnerability Scanning - Pen testers rely on the Python
libraries: Vulners 1.5.13, Safety, and Scapy.
 Stage 5: Exploitation - Pen testers rely on the Python libraries:
Pymetasploit3 (to interface with the Metasploit framework), Scapy,
Socket, and BYOB.
 Stage 6: Post-Exploitation - Pen testers rely on the Python
libraries: Pymetasploit3, BYOB, and RSPET.
 Stage 7: Reporting - Pen testers rely on the Python libraries: Sys,
Plotly, Pandas, and NLTK.
2.

i) Scapy is a strong Python package. The program aims to
sniff, transmit, analyze, and forge network packets.
The primary concept is to send packets and receive a
meaningful answer. Scapy offers an advantage over
comparable programs like Nmap, which often respond
with a straightforward (open/closed/filtered) status.

Scapy also provides a command line interpreter (CLI) that allows you
to type commands interactively.
The main basic functions we should know are:
ls() returns a list of all accessible layers.

explore() provides a graphical interface for seeing layers that already


exist.
lsc() : available functions
help() returns a help menu.
Among the functions, the following are the most common:
send(): send packets at level 3.
sendp(): send packets at level 2.
sr(): send and receive packets at level 3.
srp(): send and receive packets at level 2.
sr1(): send and receive only the first packet at level 3.
srp1(): sends and receives only the first packet to level 2.
sniff(): packet sniffing.
traceroute(): command trace route.
arping(): Send who-has ARP requests to determine which machines
are up in the network.

ii) Beautiful Soup is a Python library for parsing and extracting data
from HTML and XML files. It works with your chosen parser to
provide smooth parse tree navigation, search, and editing. It is not
uncommon for it to save developers hours, if not days, of work.

If you're using a recent version of Debian or Ubuntu Linux, you can
install Beautiful Soup with the system package manager:
Install python-bs4 using apt-get (for Python 2).
Install python3-bs4 using apt-get (for Python 3).

Beautiful Soup transforms a complex HTML document into a


complex tree of Python objects. But you’ll only ever have to deal
with four kinds of objects: Tag , NavigableString , BeautifulSoup ,
and Comment .

You can also pass a BeautifulSoup object into one of the methods
defined in Modifying the tree, just as you would a Tag.

