Lesson-5_Shell_Scripting_and_Django
Explain Django
Shell Scripting
● The shell accepts human-readable commands and converts them into something the kernel
can understand.
● The kernel is a computer program at the core of a computer’s operating system and
controls everything in the system.
● The kernel is responsible for the following in a Linux environment:
○ File management
○ Process management
○ I/O management
○ Memory management
○ Device management
Kernel, Shell, and Terminal
[Figure: layered diagram with the labels Utilities, Terminal, Shell, Kernel, and Hardware]
Shell Scripting
● Shells usually accept commands as input from users and execute them.
● When we have a group of commands to be executed routinely, we can write these
commands in a file that can be read and executed by the shell.
● These files are called shell scripts or shell programs.
● By convention, each shell script is saved with the .sh file extension.
• Example: myscript.sh
Elements of Shell Scripting
● Python is well suited for system programming, including platform-independent system
programming.
● Python offers smart data structures and lets you work with numbers and dates without
complexity.
● Python lets you write unit tests.
● For system programming, the sys and os modules serve as an abstraction
layer between the application and the operating system.
● The general advantages of Python in system programming are:
○ Simple and clear
○ Well structured
○ Highly flexible
sys Module
The table below gives the names and descriptions of attributes and functions of the sys module:
Name | Description
sys.displayhook | Function called to display the result of an interactively entered expression; replacing it changes the way the interpreter prints such results
sys.stdin, sys.stdout, sys.stderr | Give access to the standard input, standard output, and standard error data streams
sys.byteorder | Indicator of the native byte order ('little' or 'big')
sys.executable | String containing the name of the executable binary (path and executable file name) for the Python interpreter
sys.maxint | Largest positive integer supported by Python's regular integer type (Python 2 only; removed in Python 3, where integers have no fixed upper limit)
sys.maxsize | Largest value a variable of type Py_ssize_t can take; it depends on the platform's pointer size and limits the size of Python's data structures such as strings and lists
sys.maxunicode | Integer giving the largest supported code point for a Unicode character
sys.modules | A dictionary mapping module names to modules which have already been loaded
sys.path | Contains the search path, where Python looks for modules
sys.platform | Name of the platform on which Python is running
sys.version_info | A tuple containing the five components of the version number: major, minor, micro, release-level, and serial. The values of this tuple are integers except the value for the release level, which is one of the following: 'alpha', 'beta', 'candidate', or 'final'
sys.__stdin__, sys.__stdout__, sys.__stderr__ | Contain the original values of stdin, stdout, and stderr at the start of the program
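A few of these attributes can be inspected directly. The block below is a minimal sketch, not part of the original slides; the helper name describe_interpreter is invented for illustration.

```python
import sys


def describe_interpreter():
    """Collect a few sys attributes into a dictionary (illustrative helper)."""
    return {
        "executable": sys.executable,          # path of the running interpreter
        "platform": sys.platform,              # e.g. 'linux', 'darwin', 'win32'
        "byteorder": sys.byteorder,            # 'little' or 'big'
        "major": sys.version_info.major,       # first component of the version tuple
        "releaselevel": sys.version_info.releaselevel,
        "search_path_entries": len(sys.path),  # number of module search locations
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
    # Diagnostics usually go to stderr so they do not mix with normal output.
    print("done", file=sys.stderr)
```

The values differ from machine to machine, so no fixed output is shown here.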
os Module
Function | Description
os.getcwd() | Returns a string with the path of the current working directory
os.chdir(path) | Changes the current working directory to path
os.getcwdu() | Like getcwd(), but returns a unicode string (Python 2 only; in Python 3, getcwd() already returns str)
os.listdir(path) | Returns a list with the contents of the directory defined by path, that is, subdirectories and file names
os.mkdir(path[, mode=0o777]) | Creates a directory named path with numeric mode; raises FileExistsError if the directory already exists. The default mode is 0o777 (octal)
os.renames(old, new) | Works like rename(), except that it first creates any intermediate directories needed to make the new pathname
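The os functions listed above can be combined as in the following sketch. It works inside a temporary directory so nothing permanent is created; the helper name make_and_list and the directory name reports are assumptions made for this example.

```python
import os
import tempfile


def make_and_list(parent):
    """Create a subdirectory inside parent, enter it, and list parent."""
    start = os.getcwd()                        # remember where we started
    target = os.path.join(parent, "reports")   # hypothetical directory name
    os.mkdir(target)                           # raises FileExistsError if present
    os.chdir(target)
    try:
        inside = os.getcwd()
    finally:
        os.chdir(start)                        # always restore the working directory
    return inside, sorted(os.listdir(parent))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        inside, entries = make_and_list(tmp)
        print(inside)
        print(entries)
```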
● Instead of calling the system function of the os module, os.system('touch xyz'), we can use
the run() function or the Popen class of the subprocess module.
● subprocess.run starts a process, waits for it to finish, and then returns a CompletedProcess
instance that has information about what happened.
● If you want processes to run in the background or need to interact with them while they
continue to run, you need the Popen constructor.
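The difference between run and Popen can be sketched as follows. This is a minimal illustration under one assumption: the child program is the Python interpreter itself (sys.executable), chosen so the example does not depend on any particular shell utility. The function names are invented for this example.

```python
import subprocess
import sys

# The child is the Python interpreter itself, so no external tool is required.
CHILD = [sys.executable, "-c", "print('hello from child')"]


def run_and_wait():
    """subprocess.run blocks until the child exits, then returns CompletedProcess."""
    return subprocess.run(CHILD, capture_output=True, text=True)


def run_in_background():
    """Popen returns immediately; communicate() later collects the output."""
    proc = subprocess.Popen(CHILD, stdout=subprocess.PIPE, text=True)
    # ... other work could happen here while the child runs ...
    out, _ = proc.communicate()
    return proc.returncode, out


if __name__ == "__main__":
    done = run_and_wait()
    print(done.returncode, done.stdout.strip())
    print(run_in_background())
```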
subprocess Module
● The subprocess module is safe from injection by default, unless shell=True is used.
● Programs like ssh pass their arguments to a shell after they have started, which can cause
injection vulnerabilities even when shell=True is not used.
● shlex.quote will ensure that any spaces or shell metacharacters are properly escaped.
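>>> import shlex
>>> import subprocess as sp
>>> # Without shlex: the remote shell re-parses path
>>> sp.run(['ssh', 'user@host', 'ls', path])
>>> # With shlex: path is quoted for the remote shell
>>> sp.run(['ssh', 'user@host', 'ls', shlex.quote(path)])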
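What shlex.quote actually does to a string can be seen without contacting any host. In this sketch the helper remote_ls_command is hypothetical and only builds the argument list; it does not run ssh.

```python
import shlex


def remote_ls_command(path):
    """Build the argument list for a remote ls; path is quoted for the remote shell."""
    return ["ssh", "user@host", "ls", shlex.quote(path)]


if __name__ == "__main__":
    # A string made only of safe characters is returned unchanged.
    print(shlex.quote("notes.txt"))
    # Spaces and metacharacters cause the whole string to be single-quoted.
    print(shlex.quote("my file; rm -rf ~"))
    print(remote_ls_command("my file; rm -rf ~"))
```

Without the quoting, the remote shell would treat the semicolon as a command separator and run the second command.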
Reading and Writing Files
● Traditionally, we use command-line utilities like grep, sed, awk, tr, and sort to go over text files line by line.
● In Python, we need to turn a path into a file object before processing.
● The open() function takes a path and returns a file object.
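Turning a path into a file object and walking it line by line might look like the sketch below. The helper count_matching_lines and the file name sample.txt are assumptions for this example; the file is created in a temporary directory.

```python
import os
import tempfile


def count_matching_lines(path, needle):
    """Return how many lines of the file at path contain needle."""
    count = 0
    # The with statement closes the file even if an error occurs.
    with open(path, encoding="utf-8") as handle:
        for line in handle:            # a file object yields one line at a time
            if needle in line:
                count += 1
    return count


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = os.path.join(tmp, "sample.txt")    # hypothetical file name
        with open(sample, "w", encoding="utf-8") as handle:
            handle.write("alpha\nbeta\nalpha beta\n")
        print(count_matching_lines(sample, "alpha"))
```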
The shutil library provides utility functions for copying and archiving files and directory trees. The code below
shows how shutil performs various file operations in Python:
import os
import shutil
# $ mv src dest
shutil.move('src', 'dest')
# $ cp src dest
shutil.copy2('src', 'dest')
# $ cp -r src dest
shutil.copytree('src', 'dest')
# $ rm a_file
os.remove('a_file') # ok, that's not shutil
# $ rm -r a_dir
shutil.rmtree('a_dir')
# $ tar caf 'my_archive.tar.gz' 'my_folder'
# base_name excludes the extension; make_archive appends .tar.gz itself
shutil.make_archive('my_archive', 'gztar', root_dir='.', base_dir='my_folder')
Replacing Miscellaneous File Operations
● Shell commands like sed, grep, and awk can be replaced with regular expressions in Python.
● In Python, regex functionality is provided by the re module.
● grep is the Unix utility that goes through each line of a file, tests if it contains a certain pattern, and
then prints the lines that match.
The code below shows how to test a string for a pattern in Python with the re module:
>>> import re
>>> re.search(r'a pattern', r'string containing a pattern')
<re.Match object; span=(18, 27), match='a pattern'>
>>> re.search(r'a pattern', r'string without the pattern')
>>> # Returns None, which isn't printed in the Python REPL
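A grep-like filter over several lines can be written with or without regex. The following is a minimal sketch; the function names grep_plain and grep_regex and the sample log lines are invented for illustration.

```python
import re


def grep_plain(lines, text):
    """Keep lines containing a fixed substring (similar to grep -F)."""
    return [line for line in lines if text in line]


def grep_regex(lines, pattern):
    """Keep lines where the regular expression matches anywhere in the line."""
    compiled = re.compile(pattern)
    return [line for line in lines if compiled.search(line)]


if __name__ == "__main__":
    log = ["error: disk full", "ok: started", "error 42: timeout"]
    print(grep_plain(log, "error"))
    print(grep_regex(log, r"error \d+"))
```

The substring test is enough for fixed text; the regex version is only needed when the pattern has structure, such as "error" followed by a number.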
Replacing sed, grep, and awk
The code below shows an example of replacing awk with Python to split strings:
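One possible version of such an example is sketched here, assuming the goal is awk-style field extraction. The helper nth_field is hypothetical; it relies only on str.split.

```python
def nth_field(line, index, sep=None):
    """Return field number index (1-based, like awk's $1, $2, ...)."""
    fields = line.split(sep)       # sep=None splits on runs of whitespace
    return fields[index - 1]


if __name__ == "__main__":
    # Roughly: awk '{print $2}'
    print(nth_field("alice  42   admin", 2))
    # Roughly: awk -F: '{print $1}'
    print(nth_field("root:x:0:0", 1, sep=":"))
```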
● A process that fails doesn't raise an exception by default, in Python or in the shell.
● A non-zero exit code usually signals failure, but it can also indicate something other than an error
(grep, for example, exits with 1 when nothing matches), so it can be tested in an if condition.
● If you want a non-zero exit code to crash the program, especially during development, you can pass
check=True to subprocess.run, which then raises CalledProcessError.
The code below shows an example of dealing with exit codes in Python:
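>>> import subprocess as sp
>>> proc = sp.run(['ls', 'missing_file'])  # hypothetical command that fails
>>> if proc.returncode != 0:
...     pass  # do something else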
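Both ways of handling a failing process can be compared side by side. This sketch assumes a child that always exits with status 3, started through sys.executable for portability; the function names are invented for this example.

```python
import subprocess
import sys

# A child process that exits with status 3; sys.executable keeps it portable.
FAILING = [sys.executable, "-c", "import sys; sys.exit(3)"]


def exit_code_without_check():
    """Without check, a failing process only sets returncode."""
    return subprocess.run(FAILING).returncode


def exit_code_with_check():
    """With check=True, a non-zero exit raises CalledProcessError."""
    try:
        subprocess.run(FAILING, check=True)
    except subprocess.CalledProcessError as err:
        return err.returncode
    return 0


if __name__ == "__main__":
    print(exit_code_without_check())
    print(exit_code_with_check())
```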
● We can use the stdin and stdout parameters to redirect input from a file and output to a file, respectively.
● To do something with input and output text inside the script, we need to use the special constant
subprocess.PIPE.
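The two kinds of redirection can be sketched as follows. The child program here is an assumption: a one-line Python command that upper-cases its standard input. The function names are hypothetical.

```python
import os
import subprocess
import sys
import tempfile

# Child program that upper-cases whatever arrives on its standard input.
UPPER = [sys.executable, "-c",
         "import sys; sys.stdout.write(sys.stdin.read().upper())"]


def upper_via_pipe(text):
    """Send text to the child's stdin and capture its stdout inside the script."""
    proc = subprocess.run(UPPER, input=text, stdout=subprocess.PIPE, text=True)
    return proc.stdout


def upper_via_files(src_path, dest_path):
    """Redirect stdin from one file and stdout to another, like `< src > dest`."""
    with open(src_path) as src, open(dest_path, "w") as dest:
        subprocess.run(UPPER, stdin=src, stdout=dest, check=True)


if __name__ == "__main__":
    print(upper_via_pipe("shell\n"), end="")
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "in.txt")      # hypothetical file names
        dest = os.path.join(tmp, "out.txt")
        with open(src, "w") as handle:
            handle.write("pipe me\n")
        upper_via_files(src, dest)
        with open(dest) as handle:
            print(handle.read(), end="")
```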
● In a shell script, you just use the output of date for time.
● Python has two libraries for dealing with time: time and datetime
The code below shows an example of using time and datetime in Python:
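A minimal sketch of such an example follows. The dates are arbitrary and the helper days_between is invented for illustration; time is used for elapsed seconds and datetime for calendar arithmetic and formatting.

```python
import time
from datetime import datetime, timedelta


def days_between(start, end):
    """Whole days from start to end (both datetime objects)."""
    return (end - start).days


if __name__ == "__main__":
    # time: seconds since the epoch, handy for measuring elapsed time
    began = time.time()
    time.sleep(0.01)
    print(f"elapsed: {time.time() - began:.3f}s")

    # datetime: calendar-aware values and arithmetic
    release = datetime(2024, 1, 31, 9, 30)       # arbitrary example date
    print(release + timedelta(days=1))           # rolls over into February
    print(release.strftime("%Y-%m-%d %H:%M"))    # similar to: date '+%Y-%m-%d %H:%M'
    print(days_between(datetime(2024, 1, 1), release))
```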
Objective: You are given a project to write a Python program that performs shell operations.
3. Code methods for reading files, managing processes, performing date arithmetic and
Web scraping is another task often done with shell scripts. It is the process of extracting
information from a website and is one of the most important techniques for extracting data
from the internet. It takes unstructured data from websites and converts it into
structured data.
Basic Steps for Web Scraping
1. Select website
2. Authenticate
3. Generate request
4. Process information
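These steps can be sketched with the Python standard library alone. Everything named here is an assumption for illustration: the client name lesson-scraper, the bearer-token scheme, and the helpers build_request, extract_links, and scrape_links. Only scrape_links touches the network, and it is not called in the example.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Process information: turn HTML into a structured list of link targets."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def build_request(url, token=None):
    """Select a URL, attach credentials if needed, and create the request."""
    headers = {"User-Agent": "lesson-scraper/0.1"}      # hypothetical client name
    if token:
        headers["Authorization"] = f"Bearer {token}"    # one common auth scheme
    return Request(url, headers=headers)


def extract_links(html):
    """Feed HTML text to the parser and return the collected links."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


def scrape_links(url, token=None):
    """Send the request and process the response (needs network access)."""
    with urlopen(build_request(url, token)) as response:
        return extract_links(response.read().decode("utf-8", errors="replace"))


if __name__ == "__main__":
    page = '<a href="/a">A</a><p>text</p><a href="/b">B</a>'
    print(extract_links(page))
```

Always check a site's terms and access restrictions before sending real requests.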
Web Scraping Applications
Web scraping plays a major role in data extraction, which in turn supports business improvement. Today
almost every business needs a website, which explains the importance of web scraping for information
extraction.
Some of the applications of web scraping are:
● Data Science
● E-Commerce
● Marketing
● Sales
● Finance
Different Methods of Web Scraping
There are different methods of extracting information from websites. Authentication is an important aspect
of web scraping, and every website places some restrictions on the extraction of its content.
Web scraping focuses on extracting data such as product costs, weather data, pollution data, crime
data, and stock price movements into a local database for analysis.
● Copying
● API keys
● Socket programming
Web Scraping in Python
Python is one of the most popular languages for web scraping. Web scraping can be used for data analysis
when we have to analyze information from a website.
The important Python libraries that assist us in web scraping are:
Demo: Web Scraping Using Beautiful Soup
Django
Django is a popular, high-level Python framework for web development. It is free and
open source, and it lets you create web apps with less code. As a framework, it is used
for backend development, and its templates produce the front-end pages.
Disqus
YouTube
Bitbucket
Mozilla
Spotify
Important Attributes of Django
• A URL is a web address; assigning view functions to URLs is called
mapping.
• The static folder stores static assets such as CSS files, JavaScript files, and images.
• Functions related to the web app are written in views. A view also renders
content to templates, puts information into models, and gets information
from the database.
• A form fetches data from an HTML form and helps connect it to the model.
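The idea of mapping URLs to view functions can be illustrated without Django itself. The sketch below is plain Python and is not Django's implementation; in a real Django project the mapping lives in urls.py and is built with django.urls.path. All names here (home, about, URL_PATTERNS, dispatch) are invented for illustration.

```python
# Hypothetical view functions: each takes a request and returns response text.
def home(request):
    return "Welcome"


def about(request):
    return "About us"


# The mapping itself: URL path -> view function.
URL_PATTERNS = {
    "/": home,
    "/about/": about,
}


def dispatch(path, request=None):
    """Look up the view mapped to path and call it, or report a 404."""
    view = URL_PATTERNS.get(path)
    if view is None:
        return "404 Not Found"
    return view(request)


if __name__ == "__main__":
    print(dispatch("/"))
    print(dispatch("/about/"))
    print(dispatch("/missing/"))
```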
Duration: 20 min.
a. Beautiful Soup
b. Pandas
c. Numpy
Beautiful Soup is for web scraping, Pandas for data analysis, and Numpy for numerical analysis.
Knowledge Check 2
Data extraction is the most important aspect of web scraping.
a. False
b. True
Web scraping means extracting information from a URL. So, data extraction is the most important aspect of web
scraping.
Knowledge Check 3
What are the features available in the Django web framework?
a. Web apps
b. Templates
c. Both a & b
Django framework is the simplest way to create web apps and templates using Python.
Knowledge Check 4
In Python, a=BeautifulSoup() is an expression, where a is ____________
a. A constructor
b. An object
c. A class
d. None of above
Knowledge Check 5
What is the role of the render_to_response method in Django?
d. None of above
Duration: 45 min.