

Web Frameworks:
Intro to Web
Communication
CSE304 – Python Programming and Web Frameworks
Dr R Anushiadevi/SoC/SASTRA
Textbook: Wesley J Chun, Core PYTHON Applications Programming,
Prentice Hall, Third Edition, 2013

Introduction

➢ The World Wide Web (WWW) runs on the principle of Client-Server Architecture.
➢ A Web Client, or simply a Client, is the application installed on the user's device that is used to interact with the Internet.
➢ This application is usually the Web Browser, which acts as an interface for the user to access any documents, information or data on the Internet.
➢ On the other side of this architecture, the Web Server, or simply the Server, is the set of processes that run on the information provider's host computers.
➢ This infrastructure, which comprises both software and hardware, waits for client requests, processes them, returns the user-requested data if it is available, and sends an appropriate error message if the data is not found.
The Client-Server Architecture

Protocols & the Internet
➢ Protocols are known as the language of communication between the Client and the Server.
➢ The standard protocol used for web communication is the Hypertext Transfer Protocol [HTTP], which runs on top of the Transmission Control Protocol/Internet Protocol Suite [TCP/IP Suite].
➢ The Internet is metaphorically known as the giant 'cloud' of interconnected Clients and Servers all across the globe.
➢ To preserve information security, the details of the information exchange between the server and the client are kept completely hidden from the user, including the underlying protocols like TCP, IP and HTTP that carry out the job.
➢ Information regarding the intermediate nodes is also hidden from the users for the sake of enhanced security.
Expanded View of the Internet
Network Defenses
➢ Firewalls help fight unauthorized access to a corporate (or home) network by locking down entry points that are configurable on a per-network basis.
➢ These systems reduce the chances of hacking by locking down everything and opening only the ports for well-known services like Web Servers and Secure Shell (SSH) or the Secure Sockets Layer (SSL).
➢ Proxy Servers are a useful tool for administrators to monitor network traffic. They also cache data, so webpages load much faster, while at the same time telling the company hosting the server what its clients (in this case, its employees) are using the Internet for.
➢ There are different kinds of Proxy Servers – Forward Proxies and Reverse Proxies – based on which side's information they are programmed to record data for.
Server Farms
➢ Companies with very large websites host their own Server Farms located at their Internet Service Providers [ISPs].
➢ This arrangement is known as Co-Location – meaning that a company's servers reside at an ISP alongside the computers of other corporate customers.
➢ All the clients [Host Machines] are connected to the Server Farm through a Network Backbone – following any desired network topology – so the clients are provided with faster access, and the backbone helps the transmitted data avoid packet loss during transport from the server to the user.
➢ Some famous Server Farms across the globe include:
➢ Microsoft Data Center [1.2 Million sq.ft.]
➢ Yotta D1 Data Center [India's Largest Server Farm]
➢ The Citadel [World's Largest Server Farm – 7.2 Million sq.ft.]

Web Frameworks:
Python Web
Client Tools
CSE304 – Python Programming and Web Frameworks
Dr R Anushiadevi/SoC/SASTRA
Textbook: Wesley J Chun, Core PYTHON Applications Programming,
Prentice Hall, Third Edition, 2013

Introduction
➢ The Web Browser is not the only Web Client for accessing a Server.
➢ Most browsers provide only limited capability – viewing and interacting with websites [using the Hypertext Markup Language (HTML), eXtensible Markup Language (XML), etc.]
➢ However, a Client Program can download data, store it, manipulate it, or even transmit it from one location (known as the Sender) to another (known as the Receiver).
➢ Web Browsers help us access websites on the Internet by uniquely identifying them with what is known as the Uniform Resource Locator [URL].
➢ This is also known as the Web Address of the webpage.
➢ In Python, the 'urllib' module helps in manipulating the URLs available on the Internet.
Components of a URL

https://www.example.co.uk:443/blog/article/search?docid=720&hl=en#dayone

➢ Scheme: https
➢ Network Location (incl. subdomain, domain name, top-level domain and port number): www.example.co.uk:443
➢ Path: /blog/article/search
➢ Query Parameters (?): docid=720&hl=en
➢ Optional Fragment: dayone
Manipulating URLs in Python
➢ Python 2.x provided two modules – urlparse and urllib – that dealt with URLs with completely different functionality and capabilities.
➢ However, both modules have been deemed 'obsolete' since Python 3.x; they have now been combined into one single package called urllib, with the following 4 library modules containing the functionality:

➢ urllib.request -> For opening and reading URLs
➢ urllib.parse -> For parsing/manipulating URLs
➢ urllib.error -> Containing the exceptions raised by urllib.request
➢ urllib.robotparser -> For parsing robots.txt files

➢ Among the above four, we will be discussing the different functions that are available in the urllib.request and urllib.parse libraries.
urllib.parse Library
[Formerly known as urlparse]
➢ Our syllabus covers three functions available from the original urlparse library:
➢ urllib.parse.urlparse() -> formerly urlparse.urlparse()
➢ urllib.parse.urlunparse() -> formerly urlparse.urlunparse()
➢ urllib.parse.urljoin() -> formerly urlparse.urljoin()

➢ urllib.parse.urlparse ( urlstring, scheme='', allow_fragments=True )
Parses the entered URL into six components, returning a 6-item named tuple. This corresponds to the general structure of the URL discussed in the previous slide: scheme, netloc, path, params, query, fragment.

➢ The scheme argument gives the default addressing scheme, to be used only if the URL does not specify one; it accepts a string. In case of an invalid specification, a ValueError exception is raised for the parameter in error.
Continuation

The items of the named tuple can also be referred to by their index values, from 0 [scheme] to 5 [fragment].
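A minimal sketch of urlparse() in action (the sample URL is illustrative); note that components can be read either by attribute name or by index:

```python
from urllib.parse import urlparse

# Parse a sample URL into its six components.
result = urlparse("https://www.example.co.uk:443/blog/article/search?docid=720&hl=en#dayone")

# Access components by attribute name...
print(result.scheme)    # https
print(result.netloc)    # www.example.co.uk:443
print(result.path)      # /blog/article/search
print(result.query)     # docid=720&hl=en
print(result.fragment)  # dayone

# ...or by index, from 0 (scheme) to 5 (fragment).
print(result[0])        # https
print(result[5])        # dayone
```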
urllib.parse Library: urllib.parse.urlunparse(), urljoin()
➢ urllib.parse.urlunparse ( urlparsetuple )
Constructs a URL from a tuple as returned by the urlparse() function. The 'urlparsetuple' argument can be any 6-item iterable. For accurate results, it is advised that only tuples produced by the urlparse() function be passed to this function. Returns a string.

➢ urllib.parse.urljoin ( baseurl, newurl, allow_fragments=True )
Constructs a full (absolute) URL by combining the base URL with another URL. It normally uses components of the base URL to provide the missing components of the new URL.
Continuation

When the 'newurl' is only the 'path' part of a URL, urljoin() automatically takes the other components of the URL from the 'baseurl' and combines them with the 'newurl'. Once again, the result of the function is a string (urlstring).
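The behaviour described above can be sketched as follows (the URLs are illustrative):

```python
from urllib.parse import urlparse, urlunparse, urljoin

# Round trip: parse a URL, then rebuild it from the 6-item tuple.
parts = urlparse("https://www.example.com/blog/article?hl=en#top")
rebuilt = urlunparse(parts)
print(rebuilt)  # https://www.example.com/blog/article?hl=en#top

# urljoin() fills in the missing components of the new URL from the base URL.
# A relative path replaces the last path segment of the base URL...
print(urljoin("https://www.example.com/blog/article", "search"))
# https://www.example.com/blog/search

# ...while a path starting with '/' replaces the whole path.
print(urljoin("https://www.example.com/blog/article", "/about"))
# https://www.example.com/about
```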
urllib.parse Library: Legacy Functions from the urllib Module
➢ Certain functions from the urllib module of Python 2.x are now part of the urllib.parse library in the latest versions of Python. In our syllabus, we cover 5 functions of the legacy urllib module:
➢ urllib.parse.quote() -> formerly urllib.quote()
➢ urllib.parse.quote_plus() -> formerly urllib.quote_plus()
➢ urllib.parse.unquote() -> formerly urllib.unquote()
➢ urllib.parse.unquote_plus() -> formerly urllib.unquote_plus()
➢ urllib.parse.urlencode() -> formerly urllib.urlencode()

➢ urllib.parse.quote ( string, safe='/', encoding=None, errors=None )
Replaces special characters in the string using %xx escapes. Letters, digits and the characters " _ . - ~ " are not quoted. This function is usually used for quoting a section of the URL.
Continuation
➢ urllib.parse.quote_plus ( string, safe='', encoding=None, errors=None )
Similar to the quote() function, but this function also replaces spaces with a 'plus' (+) sign (and, since safe defaults to the empty string, it quotes '/' as well). It accepts a string (a complete URL or a part of one) and returns the quoted URL string as the result.

➢ urllib.parse.unquote ( string, encoding='utf-8', errors='replace' )
Replaces all the %xx escapes in a quoted URL string to recover a normal URL string. The encoding parameter defaults to 'utf-8', the standard encoding method used in modern-day web technologies. Invalid sequences are, by default, replaced by a placeholder character (the Unicode replacement character).

➢ urllib.parse.unquote_plus ( string, encoding='utf-8', errors='replace' )
Like the unquote() function, but also replaces plus signs with spaces. Accepts a string as input and returns the original URL string as output.
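A short sketch of the four quoting functions above (the strings are illustrative):

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

# quote() escapes special characters as %xx, leaving '/' safe by default.
print(quote("/path/with spaces&more"))        # /path/with%20spaces%26more

# quote_plus() additionally turns spaces into '+'.
print(quote_plus("core python apps"))         # core+python+apps

# unquote()/unquote_plus() reverse the transformations.
print(unquote("/path/with%20spaces%26more"))  # /path/with spaces&more
print(unquote_plus("core+python+apps"))       # core python apps
```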
Continuation

➢ urllib.parse.urlencode ( query, doseq=False, safe='', encoding=None, errors=None, quote_via=quote_plus )
If one wishes to send a query URL to a request-based web page, the urlencode() function can be used to create such a query-based URL string. It accepts a dictionary (or a sequence of 2-tuples) containing the query as input, encodes it into a string using the UTF-8 encoding by default, and returns a query-based URL string as output. The urlencode() function implicitly calls quote_plus() (or quote(), depending on the quote_via parameter) to properly create a query URL.
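The following sketch gives an idea of how urlencode() works (the parameter names and URL are illustrative):

```python
from urllib.parse import urlencode

# Encode a dictionary of query parameters into a query string.
# Values are quoted via quote_plus(), so spaces become '+'.
params = {"docid": 720, "hl": "en", "q": "core python"}
query = urlencode(params)
print(query)  # docid=720&hl=en&q=core+python

# Append the query string to a base URL to form the final query URL.
url = "https://www.example.com/search?" + query
print(url)    # https://www.example.com/search?docid=720&hl=en&q=core+python
```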
urllib.request Library
[Formerly known as urllib]
➢ Our syllabus covers two functions available in the urllib.request library:
➢ urllib.request.urlopen() -> formerly urllib.urlopen()
➢ urllib.request.urlretrieve() -> formerly urllib.urlretrieve()

➢ urllib.request.urlopen ( url, data=None, [timeout, ]*, cafile=None, capath=None )
Opens the given URL, which is a urlstring. data must be an object specifying any additional data to be sent to the server; it defaults to None. The timeout parameter specifies the timeout in seconds for blocking operations such as a connection attempt in HTTP/HTTPS-based requests. The optional cafile specifies a file of trusted CA [Certificate Authority] certificates for SSL-based connection requests, while capath points to a directory of such certificate files. Both of these parameters also default to None.
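A minimal sketch of urlopen(). To keep it self-contained and runnable without network access, it opens a local file through a file:// URL; an http:// or https:// URL is used the same way:

```python
import os
import tempfile
from pathlib import Path
from urllib.request import urlopen

# Create a small local HTML file to stand in for a remote page.
with tempfile.NamedTemporaryFile("w", suffix=".html", delete=False) as f:
    f.write("<html><body>Hello, urllib!</body></html>")
    tmp_path = f.name

# urlopen() returns a file-like object; read() yields bytes.
response = urlopen(Path(tmp_path).as_uri())
content = response.read().decode("utf-8")
print(content)  # <html><body>Hello, urllib!</body></html>

os.remove(tmp_path)
```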
Continuation & the urllib.request.urlretrieve() function
➢ The object returned on successful execution of the function behaves like a file, so file-handling methods like read(), readlines() or readline() are used for displaying the results.
➢ In case of any errors in the entered URL or parameters, the URLError exception is raised during program execution.

➢ urllib.request.urlretrieve ( url, filename=None, reporthook=None, data=None )
Similar to the urlopen() function, but instead of returning a file-like object, it downloads the entire result and saves it to your hard disk. It returns a 2-tuple containing the filename and the set of MIME headers returned by the web server. To save the file in a particular location, the second argument can be used, which accepts a string [the exact file location].
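A sketch of urlretrieve(), again using a local file:// URL so it runs without network access; in practice the first argument is typically an http(s) URL:

```python
import os
import tempfile
from pathlib import Path
from urllib.request import urlretrieve

# A small local file standing in for a remote page.
src = tempfile.NamedTemporaryFile("w", suffix=".html", delete=False)
src.write("<html><body>saved page</body></html>")
src.close()

# Download the URL and save it to a chosen location on disk.
# The return value is a 2-tuple: (filename, MIME headers).
dest = src.name + ".copy"
filename, headers = urlretrieve(Path(src.name).as_uri(), dest)

saved = Path(filename).read_text()
print(saved)  # <html><body>saved page</body></html>

os.remove(src.name)
os.remove(dest)
```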
Legacy Functions of the 'urllib' Module that were available in Python 2.x & Below

Web Frameworks:
Web Clients &
Web Servers
CSE304 – Python Programming and Web Frameworks
Dr R Anushiadevi/SoC/SASTRA
Textbook: Wesley J Chun, Core PYTHON Applications Programming,
Prentice Hall, Third Edition, 2013

Web Clients in a Nutshell
➢ One example of a well-known Web Client is the Crawler/Spider/Bot.

➢ Crawlers are tasked with performing a variety of jobs that include:
➢ Indexing pages for a large search engine such as Google, Yahoo!, etc.;
➢ Offline browsing – downloading documents onto a local hard disk and rearranging hyperlinks to create mirrors of files for local browsing;
➢ Downloading and storing historical or archival documents;
➢ Web page caching, to save superfluous downloading time on website revisits.

➢ How does it do that? A crawler is a long, complex program built from modules like HTMLParser, cStringIO and urllib, with BeautifulSoup providing the HTML parsing for the client.
➢ There are also third-party browser-simulating tools such as Mechanize, which also runs on Python and is typically combined with BeautifulSoup for parsing.
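The core parsing step of such a crawler can be sketched with the standard library alone (html.parser is the Python 3 name for Python 2's HTMLParser module; the sample page is illustrative, and a real crawler would obtain it via urlopen(url).read()):

```python
from html.parser import HTMLParser

# Collect every hyperlink on a page so it can be queued for later download.
class LinkCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag's attributes.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

page = '<html><body><a href="/docs">Docs</a> <a href="https://example.com">Home</a></body></html>'
collector = LinkCollector()
collector.feed(page)
print(collector.links)  # ['/docs', 'https://example.com']
```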
Web Servers
➢ Many Web Clients like Google Chrome, Mozilla Firefox, Brave Browser, Microsoft Edge, Opera GX and Safari talk to dedicated Web Servers like Apache, lighttpd, Microsoft IIS, LiteSpeed Technologies' LiteSpeed and ACME Laboratories' thttpd to establish the Client-Server Architecture.
➢ The Django and Google App Engine development servers were based on Python's 'BaseHTTPServer' module.
➢ The Handler is the piece of software that does the majority of the web serving.
➢ It processes client requests, identifies the requested information and returns an appropriate response, either statically or dynamically generated. The number of requests a server can handle gives an approximate measure of the server's efficiency.
➢ Once again, most of these modules have been deemed 'obsolete' as of Python 3.x. However, similarly behaving modules (such as http.server) have efficiently replaced them in the latest versions, while also giving provisional legacy support.
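A minimal sketch of a handler built on http.server (the Python 3 successor of BaseHTTPServer); the handler class, response text and port choice here are illustrative:

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

# The handler processes each client request and returns a
# (here statically generated) response.
class HelloHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"Hello from a tiny Python web server!"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        # Silence per-request logging for this sketch.
        pass

# Bind to port 0 so the OS picks a free port, serve in a background
# thread, and issue one request against our own server.
server = HTTPServer(("127.0.0.1", 0), HelloHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

reply = urlopen(f"http://127.0.0.1:{server.server_port}/").read().decode("utf-8")
print(reply)  # Hello from a tiny Python web server!
server.shutdown()
```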
Web Servers in a Nutshell


Thank You!
