Module1 - Web Programming Fundamentals
➢ Topics to be covered:
Working of web browser, HTTP protocol, HTTPS, DNS, TLS, XML introduction, JSON
Introduction, DOM, URL, URI, REST API.
➢ What is a Browser?
A browser is a software program used to explore, retrieve, and display the information
available on the World Wide Web. This information may take the form of web pages, pictures,
videos, and other files, all connected through hyperlinks and addressed by URLs
(Uniform Resource Locators). For example, you are viewing this page by using a browser.
A browser is a client program: it runs on the user's computer or mobile device and contacts the
web server for the information the user requests. The web server sends the data back to the
browser, which displays the result on the user's device. On behalf of its users, the browser
sends requests to web servers all over the internet by using HTTP (Hypertext Transfer Protocol).
A browser needs a device such as a smartphone, computer, or tablet and an internet connection to work.
The WorldWideWeb was the first web browser. It was created by W3C Director Tim Berners-
Lee in 1990. Later, it was renamed Nexus to avoid confusion caused by the actual World Wide
Web.
1. Refresh button: The refresh button reloads the contents of the current web page. Most web
browsers keep local copies of visited pages in a cache to improve performance. Sometimes
this cache prevents you from seeing updated information; in this case, clicking the
refresh button fetches the page again so that you see the current version.
When a user enters a web address or URL such as javatpoint.com in the address bar, the browser
first asks a domain name server (DNS) to look up that name. These requests are routed through
several routers and switches.
The domain name servers hold a list of host names and their corresponding IP addresses. Thus,
the name you type in the address bar is converted into a numeric IP address that identifies
the computer (the web server) to which the browser should send its request.
The browser acts as a part of the client-server model. A browser is a client program that sends the
request to the server in response to the user search queries by using Hypertext Transfer Protocol
or HTTP. When the server receives the request, it collects information about the requested
document and forwards the information back to the browser. Thereafter, the browser translates
and displays the information on the user device.
o When a user enters something (like javatpoint.com) in the browser, the browser first sends
a lookup request to a domain name server.
o The domain name server replies to the browser with the IP address of the web server that
hosts the website.
o The browser sends the user's request to the web server at that IP address.
o The server sends the information back to the IP address of the browser, which was included
with the request. The requested page may include links to other files on the same
server, like images, which the browser also requests from the server.
o The browser gathers all the information requested by the user and displays it on the device
screen in the form of web pages.
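The lookup-then-request sequence above can be sketched in Python. This is only an illustration: DNS_TABLE is a hypothetical in-memory stand-in for a real domain name server, the host names are placeholders, and the addresses come from the reserved documentation range 192.0.2.0/24.

```python
from urllib.parse import urlsplit

# Hypothetical stand-in for a DNS server's table of names and addresses.
DNS_TABLE = {
    "www.example.com": "192.0.2.10",
    "images.example.com": "192.0.2.20",
}

# Well-known default ports for the two web schemes.
DEFAULT_PORTS = {"http": 80, "https": 443}


def resolve(hostname, table=DNS_TABLE):
    """Return the IP address for a host name, as a DNS lookup would."""
    try:
        return table[hostname.lower()]
    except KeyError:
        raise LookupError(f"no address found for {hostname!r}") from None


def plan_request(url, table=DNS_TABLE):
    """Work out where the browser must connect and which path to ask for."""
    parts = urlsplit(url)
    ip_address = resolve(parts.hostname, table)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    path = parts.path or "/"
    return ip_address, port, path


print(plan_request("http://www.example.com/index.html"))
```

A real program would replace resolve() with a call such as socket.getaddrinfo(), which asks the operating system's resolver instead of a local table.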
➢ HTTP
HTTP stands for Hypertext Transfer Protocol. The World Wide Web is about communication between
web clients and servers, and this communication is done by sending HTTP requests and receiving
HTTP responses. HTTP is the set of rules for transferring files -- such as text, images, sound,
video and other multimedia files -- over the web. As soon as a user opens their web browser,
they are indirectly using HTTP. HTTP is an application protocol that runs on top of the TCP/IP
suite of protocols, which forms the foundation of the internet. HTTP/2 was published in May
2015, and HTTP/3 (which runs over QUIC rather than TCP) followed in June 2022. The newer
versions are alternatives to their predecessor, HTTP/1.1, but do not make it obsolete.
o The World Wide Web is about communication between web clients and
web servers.
o Clients are often browsers (Chrome, Edge, Safari), but they can be any type of
program or device.
1. The browser requests an HTML page. The server returns an HTML file.
2. The browser requests a style sheet. The server returns a CSS file.
3. The browser requests a JPG image. The server returns a JPG file.
4. The browser requests JavaScript code. The server returns a JS file.
5. The browser requests data. The server returns data (in XML or JSON).
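The request/response pairs listed above can be tried out locally with Python's standard library. The sketch below starts a small server on the loopback address and requests three resources from it. The paths, bodies, and the names RESOURCES, DemoHandler, and fetch_all are assumptions made up for this illustration.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical resources: path -> (Content-Type, body bytes)
RESOURCES = {
    "/index.html": ("text/html", b"<h1>Hello</h1>"),
    "/style.css": ("text/css", b"h1 { color: navy; }"),
    "/data.json": ("application/json", json.dumps({"ok": True}).encode()),
}


class DemoHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        entry = RESOURCES.get(self.path)
        if entry is None:
            self.send_error(404)
            return
        content_type, body = entry
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # suppress the default per-request log lines


def fetch_all():
    """Start the server on a free port, request every resource, then stop it."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    results = {}
    try:
        for path in RESOURCES:
            conn = HTTPConnection("127.0.0.1", port, timeout=5)
            conn.request("GET", path)
            response = conn.getresponse()
            results[path] = (
                response.status,
                response.getheader("Content-Type"),
                response.read(),
            )
            conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return results


for path, (status, content_type, body) in fetch_all().items():
    print(path, status, content_type, len(body))
```

Each resource travels in its own request and response, and the Content-Type header tells the client how to interpret the returned bytes.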
➢ Working of WEB
Picture a client on one side and a server on the other. A user wants to see a website,
like www.microsoft.com. The user types the URL of a page into a client program, usually a
browser. But first, the computer of the user and the web server need to be connected.
That is the job of the internet. Using the TCP/IP protocols, a connection is established over a
combination of cable and wireless media, which prepares the environment for the two computers
to talk via the HTTP protocol.
On the other side, the server processes the request, prepares the response, and sends it back
to the client, again in the form of an HTTP message. In the simplest case the connection is then
closed; HTTP/1.1 and later versions usually keep it open for further requests.
An HTTP response is sent by a server to the client. The response provides the client with the
resource it requested. It can also inform the client that the requested action has been carried
out, or that an error occurred while processing the request.
HTTP requests
This is when a client device, such as an internet browser, asks the server for the information needed
to load the website. The request provides the server with the desired information it needs to tailor
its response to the client device. Each HTTP request contains encoded data, with information such
as:
• The specific version of HTTP followed, for example HTTP/1.1 or HTTP/2.
• HTTP request headers. This includes data such as what type of browser is being used
and what data the request is seeking from the server. It can also include cookies, which
show information previously sent from the server handling the request.
• An HTTP body. This is optional information the server needs from the request, such as
user forms -- username/password logins, short responses and file uploads -- that are
being submitted to the website.
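As an illustration of these parts, the following sketch assembles a raw HTTP/1.1 request message by hand. The function name build_request, the host example.com, and the form data are placeholders chosen for this example; real clients normally leave this work to an HTTP library.

```python
def build_request(method, path, host, headers=None, body=b""):
    """Assemble an HTTP/1.1 request: request line, headers, blank line, body."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    if body:
        # The server needs the body length to know where the message ends.
        lines.append(f"Content-Length: {len(body)}")
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


raw = build_request(
    "POST",
    "/login",
    "example.com",
    headers={"User-Agent": "demo-client"},
    body=b"username=alice",
)
print(raw.decode("ascii"))
```

The first line carries the method, the requested path, and the HTTP version; the headers follow, and the optional body comes after the blank line.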
HTTP response
1. Status Line
2. Response Header Fields or a series of HTTP headers
3. Message Body
In the response message, as in the request message, each HTTP header is followed by a carriage
return line feed (CRLF). After the last of the HTTP headers, an additional CRLF is used, and
then the message body begins.
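The three-part layout can be made concrete with a small parser. This is a minimal sketch for a complete HTTP/1.1 response held in memory; parse_response is a name invented for this example, and it ignores details such as repeated headers or chunked bodies.

```python
def parse_response(raw):
    """Split a raw HTTP/1.1 response into status line, headers, and body."""
    # The first empty line (CRLF CRLF) separates the head from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    status_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body


raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 14\r\n"
    b"\r\n"
    b"<h1>Hello</h1>"
)
status_line, headers, body = parse_response(raw)
print(status_line)
print(headers)
print(body)
```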
➢ Status Line
In the response message, the status line is the first line. The status line contains three items:
a) HTTP Version
It shows the HTTP specification with which the server has tried to make the message
comply.
Example
HTTP-Version = HTTP/1.1
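b) Status Code
1xx: Information
It shows that the request was received and the process is continuing.
2xx: Success
It shows that the action was received successfully, understood, and accepted.
3xx: Redirection
It shows that further action must be taken to complete the request.
4xx: Client Error
It shows that the request contains incorrect syntax or cannot be fulfilled.
5xx: Server Error
It shows that the server failed to fulfil an apparently valid request.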
c) Reason Phrase
It is also known as the status text. It is a human-readable text that summarizes the meaning of
the status code.
HTTP/1.1 200 OK
Here, HTTP/1.1 is the HTTP version, 200 is the status code, and OK is the reason phrase.
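A status line can be taken apart in a few lines of Python. The sketch below splits it into its three items and derives the class of the response from the first digit of the code. The names parse_status_line and STATUS_CLASSES are made up for this illustration.

```python
from http import HTTPStatus

# The first digit of the status code selects the class of response.
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def parse_status_line(line):
    """Return (version, code, reason phrase, class) for a status line."""
    version, code_text, reason = line.split(" ", 2)
    code = int(code_text)
    return version, code, reason, STATUS_CLASSES[code // 100]


print(parse_status_line("HTTP/1.1 200 OK"))
print(parse_status_line("HTTP/1.1 404 Not Found"))

# The standard library also knows the usual reason phrase for each code.
print(HTTPStatus(404).phrase)
```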
The HTTP headers of the server's response contain information that a client can
use to find out more about the response and about the server that sent it. This information
helps the client display the response to a user, store the response for future use,
and make further requests to the server now or in the future.
In addition to the web page files it can serve, a web server contains an HTTP daemon, a
program that waits for HTTP requests and handles them when they arrive. A web browser is
an HTTP client that sends requests to servers. When the browser user enters file requests by
either "opening" a web file by typing in a URL or clicking on a hypertext link, the browser
builds an HTTP request and sends it to the Internet Protocol address (IP address) indicated by
the URL. The HTTP daemon in the destination server receives the request and sends back the
requested file or files associated with the request.
To expand on this example, a user wants to visit TechTarget.com. The user types in the web
address and the computer sends a "GET" request to a server that hosts that address. That GET
request is sent using HTTP and tells the TechTarget server that the user is looking for
the HTML (Hypertext Markup Language) code used to structure and give the login page its
look and feel. The text of that login page is included in the HTML response, but other parts
of the page -- particularly its images and videos -- are requested by separate HTTP requests
and responses. The more requests that are made -- for example, to call a page that has
numerous images -- the longer it will take the server to respond to those requests and for the
user's system to load the page.
When these request/response pairs are being sent, they use TCP/IP to reduce and transport
information in small packets of binary sequences of ones and zeros. These packets are
physically sent through electric wires, fiber optic cables and wireless networks.
The requests and responses that servers and clients use to share data with each other consist
of ASCII code. Requests state what information the client is seeking from the server;
responses contain code that the client browser will translate into a web page.
HTTPS (Hypertext Transfer Protocol Secure) is also called HTTP over SSL because HTTPS
communication is encrypted using SSL (Secure Sockets Layer) or, in current deployments, its
successor TLS (Transport Layer Security).
Those websites which need login credentials should use the HTTPS protocol for sending the data.
It allows users to create a secured encrypted connection and helps them to protect their
information from being stolen.
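On the client side, setting up such an encrypted connection can be sketched with Python's standard ssl module. The function names below are invented for this example, and requiring TLS 1.2 as the minimum version is an assumption, not something these notes prescribe.

```python
import socket
import ssl


def make_client_context():
    """Create a TLS context that verifies the server's certificate and host name."""
    context = ssl.create_default_context()
    # Assumption for this sketch: refuse protocol versions older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def negotiated_tls_version(host, port=443, timeout=5.0):
    """Connect to an HTTPS server and report the TLS version that was agreed."""
    context = make_client_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls_sock:
            return tls_sock.version()


context = make_client_context()
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

Calling negotiated_tls_version() with the name of a reachable HTTPS site needs network access, which is why the example only prints the settings of the context.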
HTTP | HTTPS
2. HTTP operates at the application layer. | 2. HTTPS is also an application-layer protocol; its encryption is provided by TLS, which runs between HTTP and the transport layer.
3. The data transferred in HTTP is plain text. | 3. The data transferred in HTTPS is encrypted, i.e., ciphertext.
4. By default, this protocol operates on port number 80. | 4. By default, this protocol operates on port number 443.
5. The URL (Uniform Resource Locator) of HTTP starts with http:// | 5. The URL (Uniform Resource Locator) of HTTPS starts with https://
6. This protocol does not need any certificate. | 6. This protocol requires an SSL/TLS certificate.
8. HTTP is fast compared to HTTPS. | 8. HTTPS is somewhat slower than HTTP because of the encryption overhead.
10. Examples of HTTP websites are educational sites, internet forums, etc. | 10. Examples of HTTPS websites are shopping websites, banking websites, etc.
➢ Advantages of HTTPS
Following are the advantages or benefits of a Hypertext Transfer Protocol Secure (HTTPS):
➢ Disadvantages of HTTPS
Following are the disadvantages or limitations of a Hypertext Transfer Protocol Secure (HTTPS):
o The website owner needs to obtain an SSL/TLS certificate, which may have to be purchased
(some certificate authorities issue them free of charge).
o Accessing the website can be slightly slower because of the extra handshake and
encryption steps in the communication.
o When a site moves from HTTP to HTTPS, its owner needs to update all internal links.
A uniform resource locator (URL) is the address of a resource on the internet or the World Wide
Web. It is also known as a web address, and it is one kind of uniform resource identifier (URI).
For example, https://www.google.com is the URL or web address of the Google website. A URL
represents the address of a resource, including the protocol used to access it.
A URL directs the user to a particular online resource, such as a video, webpage, or other
resource. For example, when you search for information on Google, the search results display the
URLs of the relevant resources in response to your search query. The title which appears in the
search results is a hyperlink to the URL of the webpage. A Uniform Resource Identifier (URI) is
the more general term, which refers to all kinds of names and addresses of resources on web
servers. The first part of a URL is known as the protocol identifier, and it specifies the
protocol to use; the second part, known as the resource name, gives the IP address or the domain
name of the resource. The two parts are separated by a colon and two forward slashes, as in
http://www.google.com.
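➢ Example:
http://www.w3cschool.com/index.html
where http is the protocol identifier, www.w3cschool.com is the domain name of the server, and
/index.html is the path of the requested file on that server.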
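The parts of a URL can be inspected with Python's standard urllib.parse module. The helper name describe_url is invented for this sketch.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Break a URL into its named components."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,      # protocol identifier
        "host": parts.hostname,      # domain name or IP address
        "port": parts.port,          # None when the URL gives no explicit port
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }


print(describe_url("http://www.w3cschool.com/index.html"))
print(describe_url("https://www.example.com:8443/docs/index.html?lang=en#top"))
```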
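What is XML?
XML (Extensible Markup Language) is a text-based markup language for storing and transporting
structured data.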
XML Example
XML documents form a hierarchical structure that looks like a tree, known as the XML tree, which
starts at "the root" and branches to "the leaves".
The first line is the XML declaration. It defines the XML version (1.0) and the encoding used
(ISO-8859-1 = Latin-1/West European character set).
<?xml version="1.0" encoding="ISO-8859-1"?>
The next line describes the root element of the document (like saying: "this document is a
note"):
<note>
The next 4 lines describe 4 child elements of the root (to, from, heading, and body).
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
And finally the last line defines the end of the root element.
</note>
XML documents must contain a root element. This element is "the parent" of all other elements.
The elements in an XML document form a document tree. The tree starts at the root and branches
to the lowest level of the tree.
<root>
<child>
<subchild>.....</subchild>
</child>
</root>
The terms parent, child, and sibling are used to describe the relationships between elements.
Parent elements have children. Children on the same level are called siblings (brothers or sisters).
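The parent, child, and sibling relationships can be explored with Python's standard xml.etree.ElementTree module. The sketch below parses the note document described in this section; the XML declaration is left out because the text is passed as a Python string, and the helper name describe is made up for the example.

```python
import xml.etree.ElementTree as ET

NOTE_XML = """<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>"""


def describe(root):
    """Return the root tag and a (tag, text) pair for each child element."""
    return root.tag, [(child.tag, child.text) for child in root]


root = ET.fromstring(NOTE_XML)   # the root element is the parent
tag, children = describe(root)
print(tag)
for child_tag, text in children:  # the children are siblings of each other
    print(" ", child_tag, "=", text)
```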
➢ XML Validation
A well-formed XML document is an XML document with correct syntax. A well-formed XML
document can additionally be validated against a DTD or a schema. It helps to understand what
a well-formed document is before looking at XML validation.
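Well-formedness can be checked with any XML parser, because a parser rejects text with incorrect syntax. A minimal sketch using Python's standard library follows; is_well_formed is a name chosen for this example, and the check covers syntax only, not validity against a DTD or schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True when the text parses as XML, False on a syntax error."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


print(is_well_formed("<note><to>Tove</to></note>"))   # properly nested tags
print(is_well_formed("<note><to>Tove</note></to>"))   # tags closed in the wrong order
```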
➢ XML DTD
In simple words we can say that a DTD defines the document structure with a list of legal
elements and attributes.
Both DTD and XML Schema are used to define what a valid XML document looks like.
➢ XML schema
An XML schema is itself written in XML. It uses namespaces to allow reuse of existing definitions.
It supports a large number of built-in data types and the definition of derived data types.
What is DTD
DTD stands for Document Type Definition. It defines the legal building blocks of an XML
document. It is used to define document structure with a list of legal elements and attributes.
Purpose of DTD
Its main purpose is to define the structure of an XML document. It contains a list of legal
elements and defines the structure with the help of them.
Checking Validation
Before proceeding with XML DTD, you should understand the two levels of checking. An XML
document is called "well-formed" if it has correct syntax.
A "valid" XML document is a well-formed document that has also been validated against a DTD.
Let's take an example of a well-formed and valid XML document. It follows all the rules of the DTD.
<?xml version="1.0"?>
<!DOCTYPE employee SYSTEM "employee.dtd">
<employee>
<firstname>vimal</firstname>
<lastname>jaiswal</lastname>
<email>[email protected]</email>
</employee>
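In the above example, the DOCTYPE declaration refers to an external DTD file, employee.dtd. The
content of the file is shown below.
<!ELEMENT employee (firstname,lastname,email)>
<!ELEMENT firstname (#PCDATA)>
<!ELEMENT lastname (#PCDATA)>
<!ELEMENT email (#PCDATA)>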
Description of DTD
<!DOCTYPE employee: It defines that the root element of the document is employee.
<!ELEMENT employee: It defines that the employee element contains 3 elements: "firstname,
lastname and email".
<!ELEMENT firstname: It defines that the firstname element is of type #PCDATA (parsed
character data).
<!ELEMENT lastname: It defines that the lastname element is of type #PCDATA (parsed
character data).
<!ELEMENT email: It defines that the email element is of type #PCDATA (parsed character
data).
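Python's standard library does not include a DTD validator, so the sketch below is not real DTD validation. It is a hand-written check that mirrors the rules just described (root element employee; children firstname, lastname, and email in that order; each child holding text only). The function name and sample values are assumptions made for the illustration; a real project would use a validating parser.

```python
import xml.etree.ElementTree as ET

# Child elements the employee element must contain, in this order.
EXPECTED_CHILDREN = ["firstname", "lastname", "email"]


def follows_employee_rules(xml_text):
    """Hand-written stand-in for validating against the employee rules."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return False  # not even well-formed
    if root.tag != "employee":
        return False
    if [child.tag for child in root] != EXPECTED_CHILDREN:
        return False
    # Text-only children must not contain nested elements.
    return all(len(child) == 0 for child in root)


good = ("<employee><firstname>vimal</firstname><lastname>jaiswal</lastname>"
        "<email>vimal@example.com</email></employee>")  # placeholder address
bad = "<employee><lastname>jaiswal</lastname><firstname>vimal</firstname></employee>"
print(follows_employee_rules(good), follows_employee_rules(bad))
```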
➢ XML Schema
XML schema is a language used for expressing constraints on XML documents.
Several schema languages are in use nowadays, for example RELAX NG and
XSD (XML Schema Definition).
An XML schema is used to define the structure of an XML document. It is like a DTD but
provides more control over the XML structure.
Checking Validation
An XML document is called "well-formed" if it has correct syntax. A well-formed and
valid XML document is one which has been validated against a schema.
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.javatpoint.com"
           xmlns="http://www.javatpoint.com"
           elementFormDefault="qualified">
  <xs:element name="employee">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="firstname" type="xs:string"/>
        <xs:element name="lastname" type="xs:string"/>
        <xs:element name="email" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
An XML schema defines two kinds of element types:
1. simpleType
2. complexType
simpleType
The simpleType allows you to have text-based elements. Such an element contains only text; it
cannot have attributes or child elements.
complexType
The complexType allows an element to hold attributes and child elements. It can contain
additional sub-elements and can be left empty.
➢ DTD vs XSD
There are many differences between DTD (Document Type Definition) and XSD (XML Schema
Definition). In short, DTD provides less control over XML structure whereas XSD (XML schema)
provides more control.
DTD | XSD
1) DTD stands for Document Type Definition. | XSD stands for XML Schema Definition.
2) DTDs are derived from SGML syntax. | XSDs are written in XML.
3) DTD doesn't support datatypes. | XSD supports datatypes for elements and attributes.
5) DTD offers only basic control over the order of child elements. | XSD defines the order and number of child elements precisely.
7) DTD is not simple to learn. | XSD is simple to learn because you don't need to learn a new language.
8) DTD provides less control on XML structure. | XSD provides more control on XML structure.
➢ CDATA vs PCDATA
CDATA
CDATA: (Unparsed Character data): CDATA contains the text which is not parsed further in an
XML document. Tags inside the CDATA text are not treated as markup and entities will not be
expanded.
<?xml version="1.0"?>
<!DOCTYPE employee SYSTEM "employee.dtd">
<employee>
<![CDATA[
<firstname>vimal</firstname>
<lastname>jaiswal</lastname>
<email>[email protected]</email>
]]>
</employee>
In the above CDATA example, CDATA is used just after the opening tag of the employee element to
leave the data/text unparsed, so the markup inside the CDATA section is returned literally as
the text value of employee.
PCDATA
PCDATA (Parsed Character Data): PCDATA is text that will be parsed by an XML parser. Tags
inside PCDATA are treated as markup, and entities are expanded.
<?xml version="1.0"?>
<!DOCTYPE employee SYSTEM "employee.dtd">
<employee>
<firstname>vimal</firstname>
<lastname>jaiswal</lastname>
<email>[email protected]</email>
</employee>
In the above example, the employee element contains 3 more elements, 'firstname', 'lastname', and
'email', so the parser descends further to obtain the data/text of firstname, lastname and email
as the value of employee.
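The difference between the two can be observed with Python's standard xml.etree.ElementTree parser. The documents below are shortened versions of the employee example, reduced to one child element to keep the sketch small.

```python
import xml.etree.ElementTree as ET

CDATA_DOC = "<employee><![CDATA[<firstname>vimal</firstname>]]></employee>"
PCDATA_DOC = "<employee><firstname>vimal</firstname></employee>"

# CDATA: the tags inside are kept as literal text, so there are no child elements.
cdata_root = ET.fromstring(CDATA_DOC)
print(len(cdata_root), repr(cdata_root.text))

# PCDATA: the tags are parsed as markup, so employee has a firstname child.
pcdata_root = ET.fromstring(PCDATA_DOC)
print(len(pcdata_root), pcdata_root[0].tag, pcdata_root[0].text)

# Entities are expanded in parsed text but left untouched inside CDATA.
print(ET.fromstring("<note>Tom &amp; Jerry</note>").text)
print(ET.fromstring("<note><![CDATA[Tom &amp; Jerry]]></note>").text)
```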
What is JSON
JSON, an acronym for JavaScript Object Notation, is an open standard format which is
lightweight and text-based, designed explicitly for human-readable data interchange. It is a
language-independent data format, supported by almost every programming language, framework, and
library.
JSON was first specified by Douglas Crockford in the early 2000s. In 2013, JSON was
standardized as ECMA-404, and RFC 8259 was published in 2017.
JSON is an open standard for exchanging data on the web. It supports data structures like objects
and arrays, so it is easy to read and write data in JSON.
In JSON, data is represented as key-value pairs. Curly braces hold objects, in which each name
is followed by a colon and its value, and commas separate the key-value pairs. Square brackets
hold arrays, in which the values are comma-separated.
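These syntax rules can be seen at work with Python's standard json module. The record below is a made-up example; after parsing, JSON objects become dictionaries and JSON arrays become lists.

```python
import json

text = """
{
  "name": "Peter",
  "present": true,
  "numberofdayspresent": 29,
  "manager": null,
  "skills": ["HTML", "CSS"]
}
"""

record = json.loads(text)             # JSON text -> Python objects
print(type(record).__name__)          # the object became a dict
print(record["name"], record["present"], record["manager"])
print(record["skills"][0])            # arrays are indexed from 0

round_trip = json.dumps(record)       # Python objects -> JSON text
print(json.loads(round_trip) == record)
```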
What is JSON
o JSON stands for JavaScript Object Notation.
Features of JSON
o Simplicity
o Openness
o Self-Describing
o Internationalization
o Extensibility
o Interoperability
o Less Verbose: In contrast to XML, JSON uses a compact syntax without closing tags, which
makes it easier to read. The saving becomes substantial in complex systems that exchange a lot
of data.
o Faster: Parsing JSON is generally faster than parsing XML, because XML DOM libraries
need extra memory to handle large XML files. JSON documents are also smaller, which
reduces the transfer cost and increases the parsing speed.
o Readable: The JSON structure is easily readable and straightforward. Regardless of the
programming language that you are using, you can easily map JSON to your domain objects.
o Structured Data: JSON uses a map (key-value) data structure, whereas XML follows a tree
structure. Key-value pairs are less expressive, but they give a predictable and easily
understandable model.
JSON Objects
In JSON, objects are dictionaries enclosed in curly brackets, i.e., { }. Objects are
written as key/value pairs, where the key has to be a string and the value has to be
a valid JSON data type: string, number, object, array, Boolean or null. The key and
value are separated by a colon, and a comma separates each key/value pair.
Example:
JSON Arrays
In JSON, an array is an ordered list of values enclosed in square brackets [ ]. An array
value can be a string, number, object, array, Boolean or null.
Example:
[
  {
    "PizzaName" : "Country Feast",
    "Base" : "Cheese burst",
    "Toppings" : ["Jalepenos", "Black Olives", "Extra cheese", "Sausages", "Cherry tomatoes"]
  },
  {
    "PizzaName" : "Veggie Paradise",
    "Base" : "Thin crust",
    "Toppings" : ["Jalepenos", "Black Olives", "Grilled Mushrooms", "Onions", "Cherry tomatoes"],
    "Spicy" : "yes",
    "Veg" : "yes"
  }
]
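In the above example, the top-level value is an array that contains two pizza objects. Each
object holds key/value pairs such as PizzaName, Base, and Toppings (the second object also
has Spicy and Veg), and the value of Toppings is itself an array of strings.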
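Nested arrays and objects are read by combining list indexing with dictionary lookups. The sketch below uses a two-pizza menu like the one above, trimmed to keep it short; names_with_topping is a helper invented for the example.

```python
import json

MENU_JSON = """
[
  {"PizzaName": "Country Feast", "Base": "Cheese burst",
   "Toppings": ["Black Olives", "Extra cheese", "Sausages"]},
  {"PizzaName": "Veggie Paradise", "Base": "Thin crust",
   "Toppings": ["Black Olives", "Onions"], "Spicy": "yes", "Veg": "yes"}
]
"""

menu = json.loads(MENU_JSON)  # a list of dictionaries


def names_with_topping(pizzas, topping):
    """Return the names of all pizzas whose Toppings array contains the topping."""
    return [pizza["PizzaName"] for pizza in pizzas if topping in pizza["Toppings"]]


print(len(menu))                              # number of objects in the array
print(menu[0]["Toppings"][1])                 # second topping of the first pizza
print(names_with_topping(menu, "Sausages"))
print(menu[0].get("Spicy"), menu[1].get("Spicy"))  # a missing key gives None
```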
➢ JSON Vs XML
JSON stands for JavaScript Object Notation, whereas XML stands for Extensible Markup
Language. Nowadays, JSON and XML are widely used as data interchange formats, and both
have been adopted by applications as a way to store structured data.
JSON | XML
JSON is easy to learn. | XML is more complex to learn than JSON.
It is simple to read and write. | It is more complex to read and write than JSON.
It is data-oriented. | It is document-oriented.
JSON example:
[
  {
    "name" : "Peter",
    "employed id" : "E231",
    "present" : true,
    "numberofdayspresent" : 29
  },
  {
    "name" : "Jhon",
    "employed id" : "E331",
    "present" : true,
    "numberofdayspresent" : 27
  }
]
XML example:
<name>
  <name>Peter</name>
</name>
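To compare the two formats side by side, the sketch below serializes one record both ways with Python's standard library. The record, the root tag employee, and the helper names are assumptions made for this example. Note that XML stores every value as text, while JSON keeps numbers and Booleans as distinct types.

```python
import json
import xml.etree.ElementTree as ET

record = {"name": "Peter", "present": True, "numberofdayspresent": 29}


def to_json(data):
    """Serialize the record as JSON text."""
    return json.dumps(data)


def to_xml(data, root_tag="employee"):
    """Serialize the record as XML, one child element per key."""
    root = ET.Element(root_tag)
    for key, value in data.items():
        child = ET.SubElement(root, key)
        child.text = str(value)  # XML element content is always text
    return ET.tostring(root, encoding="unicode")


print(to_json(record))
print(to_xml(record))
```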
The DOM (Document Object Model) is a way to represent a webpage in a structured, hierarchical
form so that it becomes easier for programmers to navigate the document. With the DOM, we can
easily access and manipulate tags, IDs, classes, attributes, or elements of HTML using the
methods provided by the Document object. Using the DOM, JavaScript gets access to the HTML
as well as the CSS of the web page and can also add behavior to the HTML elements. In short,
the Document Object Model is an API that represents and interacts with HTML or
XML documents.
HTML is used to structure web pages and JavaScript is used to add behavior to them.
When an HTML file is loaded into the browser, JavaScript cannot work with the HTML text
directly, so a corresponding document (the DOM) is created. The DOM is a representation of
the same HTML document in a different format, built from objects. JavaScript cannot
interpret the tags (<h1>H</h1>) in the HTML document, but it can work with the object h1 in
the DOM. JavaScript can then access each of the objects (h1, p, etc.) by using different
functions.
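In a browser these objects are reached from JavaScript. As a stand-in, the sketch below uses Python's standard xml.dom.minidom module, which offers the same style of DOM interface. The small page is a made-up, well-formed example, since minidom parses XML rather than arbitrary HTML.

```python
from xml.dom.minidom import parseString

PAGE = "<html><body><h1 id='title'>H</h1><p>First</p><p>Second</p></body></html>"

doc = parseString(PAGE)

# Each tag in the text has become an object that can be looked up and inspected.
h1 = doc.getElementsByTagName("h1")[0]
print(h1.tagName, h1.getAttribute("id"), h1.firstChild.data)

paragraphs = [p.firstChild.data for p in doc.getElementsByTagName("p")]
print(paragraphs)
```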
Structure of DOM
DOM can be thought of as a tree or forest (more than one tree). The term structure model is
sometimes used to describe the tree-like representation of a document. Each branch of the tree
ends in a node, and each node contains objects. Event listeners can be added to nodes and
are triggered when a given event occurs. One important property of DOM structure models
is structural isomorphism: if any two DOM implementations are used to create a representation
of the same document, they will create the same structure model, with precisely the same
objects and relationships.
Properties of DOM: Let’s see the properties of the document object that can be accessed and
modified by the document object.
The HTML DOM is a standard object model and programming interface for HTML. It
defines:
In other words: The HTML DOM is a standard for how to get, change, add, or delete
HTML elements.
XML DOM
It is a standard object model for XML. XML documents have a hierarchy of informational
units called nodes; the DOM is a standard programming interface for describing those nodes and
the relationships between them.
The XML DOM also provides an API that allows a developer to add, edit, move or remove
nodes at any point in the tree in order to create an application.
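Adding, editing, and removing nodes can be sketched with Python's standard xml.dom.minidom module. The employee document and the department element are placeholders chosen for this illustration.

```python
from xml.dom.minidom import parseString

doc = parseString(
    "<employee><firstname>vimal</firstname><lastname>jaiswal</lastname></employee>"
)
root = doc.documentElement

# Add: create a new element with a text node and append it to the root.
department = doc.createElement("department")
department.appendChild(doc.createTextNode("IT"))
root.appendChild(department)

# Edit: change the text stored inside an existing element.
root.getElementsByTagName("firstname")[0].firstChild.data = "Vimal"

# Remove: detach an element from its parent.
lastname = root.getElementsByTagName("lastname")[0]
root.removeChild(lastname)

print(root.toxml())
```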
Following is the diagram for the DOM structure. The diagram depicts that parser evaluates
an XML document as a DOM structure by traversing through each node.
➢ REST API
Representational State Transfer (REST) is an architectural style that defines a set of constraints
to be used for creating web services. A REST API is a simple and flexible way of accessing web
services without any additional processing layer. REST is often preferred to the heavier
Simple Object Access Protocol (SOAP) because REST uses less bandwidth and is simpler and
more flexible, which makes it more suitable for internet usage. It is used to fetch information
from, or send information to, a web service. Communication with a REST API is typically done
through HTTP requests.
Working: A request is sent from client to server in the form of a web URL as HTTP GET or
POST or PUT or DELETE request. After that, a response comes back from the server in the
form of a resource which can be anything like HTML, XML, Image, or JSON. But now JSON
is the most popular format being used in Web Services.
In HTTP there are five methods that are commonly used in a REST-based architecture:
POST, GET, PUT, PATCH, and DELETE. These correspond to the create, read, update,
and delete (CRUD) operations: POST creates, GET reads, PUT and PATCH update, and DELETE
deletes. There are other methods which are less frequently used, like OPTIONS and HEAD.
❖ POST: The POST verb is most often used to create new resources. In particular,
it is used to create subordinate resources, that is, resources subordinate to some other
(e.g. parent) resource. On successful creation, return HTTP status 201 together with a
Location header that links to the newly created resource.
❖ PUT: It is used to update existing resources. However, PUT can also be used
to create a resource in the case where the resource ID is chosen by the client instead
of by the server, in other words, when the PUT is to a URI that contains a resource ID
that does not exist yet. On successful update, return 200 (or 204 if not returning any
content in the body) from a PUT. If using PUT for create, return HTTP status 201 on
successful creation. PUT is not a safe operation, but it is idempotent.
❖ PATCH: It is used to modify existing resources. The PATCH request only needs to contain
the changes to the resource, not the complete resource. This resembles PUT, but the
body contains a set of instructions describing how a resource currently residing on the
server should be modified to produce a new version. This means that the PATCH body
should not just be a modified part of the resource, but should be written in some kind of
patch language like JSON Patch or XML Patch. PATCH is neither safe nor guaranteed to be
idempotent.
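The mapping from HTTP methods to CRUD operations and status codes can be sketched without any network code. NoteStore, the /notes path, and the note fields are hypothetical names for this illustration. For brevity, PATCH here merges a partial dictionary into the stored note instead of interpreting a patch language such as JSON Patch.

```python
import itertools


class NoteStore:
    """In-memory collection with REST-style handling of method and path."""

    def __init__(self):
        self._notes = {}
        self._next_id = itertools.count(1)

    def handle(self, method, path, body=None):
        """Return (status code, response data) for one request."""
        parts = [p for p in path.split("/") if p]
        if not parts or parts[0] != "notes" or len(parts) > 2:
            return 404, None
        if len(parts) == 1:
            return self._collection(method, body)
        if not parts[1].isdigit():
            return 404, None
        return self._item(method, int(parts[1]), body)

    def _collection(self, method, body):
        if method == "GET":        # read the whole collection
            return 200, list(self._notes.values())
        if method == "POST":       # create; the server chooses the ID
            note_id = next(self._next_id)
            self._notes[note_id] = {**(body or {}), "id": note_id}
            return 201, self._notes[note_id]
        return 405, None

    def _item(self, method, note_id, body):
        exists = note_id in self._notes
        if method == "PUT":        # replace, or create with a client-chosen ID
            self._notes[note_id] = {**(body or {}), "id": note_id}
            return (200 if exists else 201), self._notes[note_id]
        if method not in ("GET", "PATCH", "DELETE"):
            return 405, None
        if not exists:
            return 404, None
        if method == "GET":        # read one resource
            return 200, self._notes[note_id]
        if method == "PATCH":      # partial update (simplified merge)
            self._notes[note_id].update(body or {})
            return 200, self._notes[note_id]
        del self._notes[note_id]   # DELETE
        return 204, None


store = NoteStore()
print(store.handle("POST", "/notes", {"text": "buy milk"}))
print(store.handle("PATCH", "/notes/1", {"text": "buy oat milk"}))
print(store.handle("DELETE", "/notes/1"))
print(store.handle("GET", "/notes/1"))
```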
REST defines six architectural constraints that a system should respect in order to
qualify as RESTful; the last one, code on demand, is optional. Adhering to these
constraints promotes reliability, scalability, and extensibility. The six principles of
REST architecture are discussed below.
Client-server architecture
Server-side and client-side responsibilities are separated so that each side can be
implemented independently of the other. The server-side code (the API) and the client-side
code can each be changed without affecting the other, as long as both continue to
communicate in the same format. In a REST architecture, different clients send requests to
the same endpoints, perform the same actions, and get the same responses.
Statelessness
The communication between client and server does not keep track of the state of sessions
from one request to the next. The state of a session is included in each request, so neither the
client nor the server needs to know the state of the other to communicate. Each request is
complete and self-sufficient. There is no need to maintain a continuous connection between
client and server, which implies a greater tolerance to failure. In addition, this allows REST
APIs to respond to requests from several clients without tying up server resources. Authentication
is commonly treated as an exception: many APIs issue a token or session identifier so that the
client does not have to resend its credentials on every request.
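One way to keep requests self-sufficient while avoiding repeated credentials is a signed token that the client sends with every request, so the server stores no session. The sketch below illustrates the idea with Python's standard hmac module. The key, the token layout, and the function names are assumptions made for the example, and it leaves out essentials of a real scheme such as expiry times.

```python
import hashlib
import hmac

SECRET_KEY = b"demo-secret"  # hypothetical server-side key


def _sign(username):
    return hmac.new(SECRET_KEY, username.encode(), hashlib.sha256).hexdigest()


def issue_token(username):
    """Create a token that carries the user name plus a signature over it."""
    return f"{username}.{_sign(username)}"


def authenticate(token):
    """Return the user name if the signature is valid, otherwise None."""
    username, _, signature = token.rpartition(".")
    if not username:
        return None
    if hmac.compare_digest(signature.encode(), _sign(username).encode()):
        return username
    return None


def handle_request(headers):
    """Each request is judged only by what it carries; no session is stored."""
    user = authenticate(headers.get("Authorization", ""))
    if user is None:
        return 401, "authentication required"
    return 200, f"hello {user}"


token = issue_token("alice")
print(handle_request({"Authorization": token}))
print(handle_request({}))
```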
Uniform interface
The different actions and/or resources available, with their specific endpoints and parameters,
must be agreed on and applied consistently by both the client and the server. Each
response should contain enough information to be interpreted without the client needing any
other information beforehand. Responses should not be too long and should contain links to
other endpoints.
Caching
Responses can be cached to avoid overloading the server unnecessarily. Caching must be
well managed: The REST API must specify whether and for how long a response can be
cached to avoid the client receiving outdated information.
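In HTTP, the server usually states how long a response may be reused through the Cache-Control header. The sketch below shows a client-side cache that honours the max-age and no-store directives. ResponseCache and freshness_lifetime are names invented for this example, and the clock is passed in so that the behaviour can be demonstrated without waiting.

```python
import time


def freshness_lifetime(cache_control):
    """Return how many seconds a response may be reused (0 = do not cache)."""
    directives = {}
    for part in cache_control.split(","):
        name, _, value = part.strip().partition("=")
        if name:
            directives[name.lower()] = value
    if "no-store" in directives:
        return 0
    value = directives.get("max-age", "")
    return int(value) if value.isdigit() else 0


class ResponseCache:
    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # url -> (body, expiry time)

    def store(self, url, body, cache_control):
        lifetime = freshness_lifetime(cache_control)
        if lifetime > 0:
            self._entries[url] = (body, self._clock() + lifetime)

    def lookup(self, url):
        """Return the cached body, or None if it is absent or out of date."""
        entry = self._entries.get(url)
        if entry is None:
            return None
        body, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[url]
            return None
        return body


now = [0.0]  # fake clock so the example does not need to sleep
cache = ResponseCache(clock=lambda: now[0])
cache.store("/notes", b"[1, 2]", "public, max-age=60")
print(cache.lookup("/notes"))
now[0] = 61.0
print(cache.lookup("/notes"))
```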
Layered architecture
Code on demand
This constraint is optional. It means that an API can return executable code instead of a
response in JSON or XML, for example. This means that a RESTful API can extend the
client’s code while simplifying it by providing executable code such as a JavaScript script
or a Java applet.