Theory Module 1 - Web Programming Fundamentals
How does the Web work?
A web browser is a type of software that allows you to find and view websites on
the Internet.
Web Application Architecture
High Level Browser Architecture
User Interface:
• The user interface is the space where the user interacts with the browser.
• It includes the address bar, back and next buttons, home button, refresh and
stop, bookmark option, etc.
• Every other part, except the window where the requested web page is displayed,
comes under it.
Browser Engine:
• The browser engine works as a bridge between the user interface and
the rendering engine.
Data Storage:
• It is a small database created on the local drive of the computer where the
browser is installed.
• It manages user data such as cache, cookies, bookmarks and preferences.
UI Backend:
• UI backend is used for drawing basic widgets like combo boxes and
windows.
• It uses operating system user interface methods.
URL
HTTP
● An application protocol invented alongside HTML to create the first
interactive, text-based web browser
● It is a client-server protocol and the foundation of any data exchange on the Web
● The client is a browser or an app
● The server could be IIS, Apache, Python Tornado, Node.js
● A request-response architecture
● Stateless: each request is handled independently of earlier ones
● Default port: 80
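The request-response exchange described above can be sketched in Python. This is a minimal illustration, not a full HTTP implementation: the helper names `build_request` and `parse_response`, the path `/index.html`, and the canned response text are assumptions made for the example, and nothing is sent over the network.

```python
def build_request(method, path, host):
    """Build a minimal HTTP/1.1 request message as text."""
    lines = [
        f"{method} {path} HTTP/1.1",  # request line: method, target, version
        f"Host: {host}",              # the Host header is required in HTTP/1.1
        "Connection: close",
        "",                           # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines)


def parse_response(raw):
    """Split a raw HTTP response into status code, headers and body."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    status_code = int(status_line.split(" ")[1])
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_code, headers, body


# The client side: a request for a page (hypothetical path and host).
request = build_request("GET", "/index.html", "example.com")

# The server side: a canned response standing in for a real server.
canned = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "Content-Length: 13\r\n"
    "\r\n"
    "<h1>Home</h1>"
)
status, headers, body = parse_response(canned)
print(status, headers["content-type"], body)
```

Because HTTP is stateless, every request must carry everything the server needs (here the method, the path and the `Host` header); nothing is remembered from a previous exchange.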
HTTP Request & Response
[Figure: HTTP Request and HTTP Response message formats]
● Secure Sockets Layer (SSL) technology protects your Web site and makes it easy
for your Web site visitors to trust you in three essential ways:
○ Privacy - An SSL Certificate enables encryption of sensitive
information during online transactions.
○ Integrity - Data exchanged over the encrypted connection cannot be
altered in transit without detection.
○ Authentication - A Certificate Authority verifies the identity of the
certificate owner when the certificate is issued, and each SSL Certificate
contains unique, authenticated information about that owner.
● SSL stands for Secure Sockets Layer, and was first developed by
Netscape back in 1994.
● SSL prevents unauthorized third parties from seeing or altering any
data being exchanged across the internet.
● An HTTP-based SSL connection is always initiated by the client using a
URL starting with https:// instead of http://.
● At the beginning of an SSL session, an SSL handshake is performed.
● This handshake produces the cryptographic parameters of the session.
SSL Handshake
1. The SSL or TLS client sends a client hello message that lists
cryptographic information such as the SSL or TLS version and, in the
client's order of preference, the CipherSuites supported by the client.
2. The message also contains a random byte string that is used in
subsequent computations.
3. The protocol allows for the client hello to include the data
compression methods supported by the client.
4. The SSL or TLS server responds with a server hello message that contains
the CipherSuite chosen by the server from the list provided by the client,
the session ID, and another random byte string.
5. The server also sends its digital certificate.
6. If the server requires a digital certificate for client authentication, the
server sends a client certificate request that includes a list of the types of
certificates supported and the Distinguished Names of acceptable
Certification Authorities (CAs).
7. The SSL or TLS client verifies the server's digital certificate.
8. The SSL or TLS client sends the random byte string that enables both
the client and the server to compute the secret key to be used for
encrypting subsequent message data.
9. The random byte string itself is encrypted with the server's public key.
10. If the SSL or TLS server sent a client certificate request, the client sends a
random byte string encrypted with the client's private key, together with
the client's digital certificate, or a no digital certificate alert.
11. This alert is only a warning, but with some implementations the
handshake fails if client authentication is mandatory.
12. The SSL or TLS server verifies the client's certificate.
13. The SSL or TLS client sends the server a finished message, which
is encrypted with the secret key, indicating that the client part of
the handshake is complete.
14. The SSL or TLS server sends the client a finished message, which
is encrypted with the secret key, indicating that the server part of
the handshake is complete.
15. For the duration of the SSL or TLS session, the server and client
can now exchange messages that are symmetrically encrypted with the shared
secret key.
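The key idea of the handshake, that both sides end up with the same secret key because they both hold the same random byte strings, can be illustrated with a toy sketch. This is NOT the real SSL/TLS key derivation: the function `derive_session_key` and the use of a single HMAC-SHA256 call are simplifying assumptions made only to show why both parties arrive at an identical key.

```python
import hashlib
import hmac
import os


def derive_session_key(pre_master, client_random, server_random):
    """Toy derivation: mix the shared secret with both random strings."""
    return hmac.new(pre_master, client_random + server_random, hashlib.sha256).digest()


# Steps 1-4: each side contributes a random byte string in its hello message.
client_random = os.urandom(32)
server_random = os.urandom(32)

# Steps 8-9: the client generates a further random byte string and sends it
# to the server encrypted with the server's public key (encryption omitted here).
pre_master = os.urandom(48)

# Both sides now hold the same three inputs and compute the key independently.
client_key = derive_session_key(pre_master, client_random, server_random)
server_key = derive_session_key(pre_master, client_random, server_random)

print(client_key == server_key)
```

An eavesdropper sees both hello random strings in the clear, but not the third byte string, so it cannot compute the same key.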
TLS
● TLS is designed to facilitate privacy and data security for communications
over the Internet.
● Encrypts the communication between web applications and servers.
● Proposed by the Internet Engineering Task Force (IETF).
● Transport Layer Security (TLS) was designed to provide security at the
transport layer.
● TLS was derived from a security protocol called Secure Sockets Layer (SSL).
● Netscape originally developed SSL in 1994 to secure its browser,
Netscape Navigator. The last version of SSL was SSL 3.0. The first
version of TLS, released in 1999, is based on SSL 3.0.
● TLS is used to secure application layer protocols like FTP, HTTP, and
SMTP.
● TLS has three primary functionalities and one de facto functionality:
○ Encryption – Conceals data transferred between two
parties, typically a client and a server hosting a web application.
This prevents eavesdropping.
○ Authentication – Certifies the identities of two parties
communicating over the internet, using a certificate issued by a
trusted certificate authority (CA). This prevents
impersonation attacks.
○ Integrity – Verifies that the data being sent across a network
has not been tampered with on its journey, using a message
authentication code attached to each record. This prevents an
attacker in the middle from altering data undetected.
○ Replay prevention – Protects against replay attacks, in which an
attacker captures messages and sends them again.
How does TLS work? (similar to SSL)
● To use TLS, a website must have a TLS certificate installed on its origin
server (often still called an SSL certificate).
● A TLS certificate is issued by a certificate authority to the person or
business that owns a domain.
● The certificate contains important information about who owns the
domain, along with the server's public key, both of which are important for
validating the server's identity.
● A TLS connection is initiated using a sequence known as the TLS
handshake.
● When a user navigates to a website that uses TLS, the TLS handshake
begins between the user's device and the web server.
During the TLS handshake, the user's device and the web server:
● Specify which version of TLS they will use
● Decide on which cipher suites they will use
● Authenticate the identity of the server using the server's TLS
certificate
● Generate session keys for encrypting messages between them after
the handshake is complete
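From a client's point of view, the handshake choices listed above map onto a few settings in Python's standard `ssl` module. The sketch below only configures a context; the function `negotiated_parameters` is a hypothetical helper that would open a real connection, so it is defined but not called here.

```python
import socket
import ssl

# A default client context verifies the server certificate against the
# system's trusted CAs and checks that it matches the hostname.
context = ssl.create_default_context()

# Restrict which protocol versions may be agreed on during the handshake.
context.minimum_version = ssl.TLSVersion.TLSv1_2


def negotiated_parameters(host, port=443):
    """Connect to host and return the agreed TLS version and cipher suite.

    Requires network access, so it is not called in this sketch.
    """
    with socket.create_connection((host, port)) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version(), tls.cipher()


print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

Calling `negotiated_parameters` on a TLS-enabled host would perform the full handshake (version, cipher suite, server authentication, session keys) inside `wrap_socket`.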
Certificate Authorities and Chain of Trust
● Authentication is an integral part of establishing every TLS
connection. After all, it is possible to carry out a conversation over
an encrypted tunnel with any peer, including an attacker, and unless
we can be sure that the host we are speaking to is the one we trust,
then all the encryption work could be for nothing.
● What Is a Certificate Authority (CA)?
A certificate authority (CA), also sometimes referred to as a
certification authority, is a company or organization that acts to
validate the identities of entities (such as websites, email addresses,
companies, or individual persons) and bind them to cryptographic
keys through the issuance of electronic documents known as digital
certificates.
● A digital certificate provides:
○ Authentication, by serving as a credential to validate the identity
of the entity that it is issued to.
○ Encryption, for secure communication over insecure
networks such as the Internet.
○ Integrity of documents signed with the certificate so that
they cannot be altered by a third party in transit.
● When an SSL certificate is issued for installation, an intermediate
certificate or certificate bundle is sent along with it.
● When a browser downloads your website’s SSL certificate upon arriving at
a homepage, it begins chaining that certificate back to its root.
● It will begin by following the chain to the intermediate that has been
installed; from there it continues tracing backwards until it arrives at
a trusted root certificate.
● If the certificate is valid and can be chained back to a trusted root, it will
be trusted. If it can’t be chained back to a trusted root, the browser will
issue a warning about the certificate.
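The chaining logic above can be sketched with a toy model. All certificate names, the `CERTS` table and the `chain_to_root` function are hypothetical; a real browser also verifies each signature, validity period and revocation status, none of which is modelled here.

```python
# Roots the (imaginary) browser ships with and trusts.
TRUSTED_ROOTS = {"Example Root CA"}

# subject -> certificate, where each certificate names its issuer.
CERTS = {
    "www.example.com": {"issuer": "Example Intermediate CA"},
    "Example Intermediate CA": {"issuer": "Example Root CA"},
    "Example Root CA": {"issuer": "Example Root CA"},  # self-signed root
    "shady.example": {"issuer": "Unknown CA"},
}


def chain_to_root(subject, certs, trusted_roots, max_depth=10):
    """Follow issuer links from subject; return the chain or None if untrusted."""
    chain = [subject]
    current = subject
    for _ in range(max_depth):
        if current in trusted_roots:
            return chain                      # reached a trusted root
        cert = certs.get(current)
        if cert is None or cert["issuer"] == current:
            return None                       # broken chain or untrusted self-signed cert
        current = cert["issuer"]
        chain.append(current)
    return None                               # chain too long


print(chain_to_root("www.example.com", CERTS, TRUSTED_ROOTS))
print(chain_to_root("shady.example", CERTS, TRUSTED_ROOTS))
```

A `None` result corresponds to the case where the browser shows a certificate warning.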
Differences between SSL and TLS
● Version: The last version of SSL is 3.0. TLS 1.0 was derived from SSL 3.0,
and the current version of TLS is 1.3.
● Cipher suites: SSL offers support for the Fortezza cipher suite; TLS does
not. TLS follows a better standardization process that makes it easier to
define new cipher suites. Commonly cited algorithms include RC4, Triple DES,
AES and IDEA.
● Alert messages: SSL has the “No certificate” alert message. TLS
removes this alert message and replaces it with several other alert
messages.
● Record Protocol: SSL uses its own Message Authentication Code (MAC)
construction to protect each message, while TLS uses HMAC, a hash-based
message authentication code.
● Handshake process
In SSL, the hash calculation also comprises the master secret and
Domain Name System
● Phonebook of the Internet
● Translates domain names to IP addresses
● DNS servers eliminate the need for humans to memorize IP addresses
● The process of DNS resolution involves converting a hostname into
a computer friendly IP address
● The Domain Name System (DNS) is a hierarchical and decentralized naming
system for computers, services, or other resources connected to the
Internet or a private network.
● The Domain Name System delegates the responsibility of assigning
domain names and mapping those names to Internet resources by
designating authoritative name servers for each domain.
● Network administrators may delegate authority over sub-domains of
their allocated name space to other name servers.
● All DNS servers fall into one of four categories:
➢ Recursive resolvers
➢ Root nameservers
➢ TLD nameservers
➢ Authoritative nameservers
DNS recursive resolver
● Also known as a DNS recursor
● First stop in a DNS query
● Acts as a middleman between a client and a DNS nameserver
● After receiving a DNS query from a web client, a recursive resolver will
respond with cached data if it has any
● If not, it sends a request to a root nameserver, then to a TLD nameserver,
and then to an authoritative nameserver
● After receiving a response from the authoritative nameserver containing
the requested IP address, the recursive resolver then sends a response to
the client.
DNS root nameserver
● The 13 DNS root nameservers are known to every recursive resolver, and
they are the first stop in a recursive resolver’s quest for DNS records.
● A root server accepts a recursive resolver’s query which includes a domain
name, and the root nameserver responds by directing the recursive resolver
to a TLD nameserver, based on the extension of that domain (.com, .net,
etc.).
● The root nameservers are overseen by a nonprofit organization called
the Internet Corporation for Assigned Names and Numbers (ICANN).
TLD nameserver
● Maintains information for all the domain names that share a
common domain extension, such as .com, .net, etc.
● Respond by pointing to the authoritative nameserver for that
domain.
● Management of TLD nameservers is handled by the Internet
Assigned Numbers Authority (IANA), which is a branch of ICANN.
How does DNS work?
1. A user types ‘example.com’ into a web browser and the query travels into
the Internet and is received by a DNS recursive resolver.
2. The resolver then queries a DNS root nameserver.
3. The root nameserver responds to the resolver with the address of a TLD
server which stores the information for its domains.
4. The resolver then makes a request to the TLD server.
5. The TLD server then responds with the IP address of the domain’s
authoritative nameserver.
6. Lastly, the recursive resolver sends a query to the domain’s authoritative
nameserver.
7. The IP address for example.com is then returned to the resolver from
the nameserver.
8. The DNS resolver then responds to the web browser with the IP address of
the domain requested initially.
9. The browser makes an HTTP request to the IP address.
10. The server at that IP returns the webpage to be rendered in the
browser.
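Steps 1 to 8 can be simulated with in-memory tables standing in for the three kinds of nameserver. This is a teaching sketch only: the server names, the `RecursiveResolver` class and the address `192.0.2.10` (taken from a range reserved for documentation) are assumptions, and no real DNS traffic is involved.

```python
# Hypothetical nameserver data: root -> TLD server -> authoritative server -> IP.
ROOT = {"com": "tld-com"}
TLD = {"tld-com": {"example.com": "ns.example.com"}}
AUTHORITATIVE = {"ns.example.com": {"example.com": "192.0.2.10"}}


class RecursiveResolver:
    """Toy recursive resolver with a cache and a log of servers contacted."""

    def __init__(self):
        self.cache = {}
        self.queries = []

    def resolve(self, hostname):
        if hostname in self.cache:            # answer from cache when possible
            return self.cache[hostname]
        extension = hostname.rsplit(".", 1)[-1]
        self.queries.append("root")           # step 2: ask a root nameserver
        tld_server = ROOT[extension]          # step 3: root points to the TLD server
        self.queries.append(tld_server)       # step 4: ask the TLD server
        auth_server = TLD[tld_server][hostname]   # step 5: TLD points to authoritative
        self.queries.append(auth_server)      # step 6: ask the authoritative server
        ip = AUTHORITATIVE[auth_server][hostname]  # step 7: IP address returned
        self.cache[hostname] = ip
        return ip                             # step 8: answer goes back to the browser


resolver = RecursiveResolver()
first = resolver.resolve("example.com")
second = resolver.resolve("example.com")      # served from cache, no new queries
print(first, resolver.queries)
```

The second lookup shows why the recursive resolver's cache matters: the root, TLD and authoritative servers are contacted only once.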
Authoritative nameserver
● The authoritative nameserver is usually the resolver’s last step in the
journey for an IP address.
● Contains information specific to the domain name it serves (e.g.
google.com) and it can provide a recursive resolver with the IP address of
that web server found in the DNS record.
DNS Records
● DNS record types are records that provide important information about
a hostname or domain.
● The following are the five major DNS record types:
○ A record
■ "A" in A record stands for "address." An A record shows the IP
address for a specific hostname or domain
○ AAAA record
■ An AAAA record works like an A record, but points to an IPv6
address.
○ CNAME record
■ CNAME, or in full "canonical name", is a DNS record that points
a domain name (an alias) to another domain.
■ A common use of CNAME records is running multiple subdomains for
different purposes on the same server.
○ Nameserver (NS) record
■ A nameserver (NS) record specifies the authoritative DNS server for
a domain.
■ When we purchase a web hosting service or set up a simple
website, we receive nameserver details along with it.
○ Mail exchange (MX) record
■ A mail exchange (MX) record is a DNS record type
that shows where emails for a domain should be routed to.
■ You can have multiple MX records for a single domain name.
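The five record types can be pictured as entries in a lookup table keyed by name and type. The sketch below is illustrative only: the `RECORDS` table, the `lookup` function and every hostname and address in it are made up, and it shows how a CNAME alias is followed to reach the record of the target name.

```python
# (name, record type) -> list of values; all data is hypothetical.
RECORDS = {
    ("example.com", "A"): ["192.0.2.10"],
    ("example.com", "AAAA"): ["2001:db8::10"],
    ("www.example.com", "CNAME"): ["example.com"],
    ("example.com", "NS"): ["ns1.example.com", "ns2.example.com"],
    ("example.com", "MX"): ["10 mail1.example.com", "20 mail2.example.com"],
}


def lookup(name, rtype, records=RECORDS, max_hops=5):
    """Return the values for (name, rtype), following CNAME aliases if needed."""
    for _ in range(max_hops):
        if (name, rtype) in records:
            return records[(name, rtype)]
        alias = records.get((name, "CNAME"))
        if alias is None:
            return []                 # no such record and no alias to follow
        name = alias[0]               # retry the query with the canonical name
    return []


print(lookup("www.example.com", "A"))   # resolved through the CNAME alias
print(lookup("example.com", "MX"))      # several MX records for one domain
```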
Document Object Model (DOM)
● The DOM is a W3C (World Wide Web Consortium) standard.
● All the nodes in a node tree have relationships to each other. The tree structure
is called a node tree.
HTML DOM Node Tree (Document Tree)
● In a node tree, the top node is called the root
● Every node, except the root, has exactly one parent node
● A node can have any number of children
● A leaf is a node with no children
● Siblings are nodes with the same parent
● To access a node in tree:
○ By using the getElementById() method
○ By using the getElementsByTagName() method
○ By navigating the node tree (using the node relationships).
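The node relationships can be explored outside a browser with Python's standard `xml.dom.minidom` module, which implements a subset of the W3C DOM. The tiny document below is an assumption made for the example. In a browser the same navigation is done from JavaScript, where `getElementById()` is also available; it is not used here because minidom only recognises IDs declared through a DTD.

```python
from xml.dom import minidom

# A small, well-formed document with no whitespace between elements,
# so that every child node is an element or a text node we expect.
doc = minidom.parseString(
    "<html><head><title>Demo</title></head>"
    "<body><h1>Heading</h1><p>First</p><p>Second</p></body></html>"
)

root = doc.documentElement                    # the root element of the tree
paragraphs = doc.getElementsByTagName("p")    # access nodes by tag name
first_p = paragraphs[0]

parent = first_p.parentNode                   # every non-root node has one parent
sibling = first_p.nextSibling                 # siblings share the same parent
children = [node.tagName for node in parent.childNodes]
text = first_p.firstChild.data                # the text node is a leaf

print(root.tagName, parent.tagName, children, text, sibling.firstChild.data)
```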
eXtensible Markup Language (XML)
● Markup language for documents containing structured information
● Defined by the WWW Consortium (W3C)
● Based on Standard Generalized Markup Language (SGML)
● Bridge for data exchange on the Web
XML                               HTML
● Extensible set of tags          ● Fixed set of tags
● Content oriented                ● Presentation oriented
● Standard data infrastructure    ● No data validation capabilities
● Allows multiple output forms    ● Single presentation
● An XML element is made up of a start tag, an end tag, and data in between.
● Example:
<director> Matthew Dunn </director>
● Example of another element with the same value:
<actor> Matthew Dunn </actor>
● XML tags are case-sensitive, so these are three different tags:
<CITY> <City> <city>
● XML can abbreviate empty elements, for example:
<married> </married> can be abbreviated to
<married/>
● An attribute is a name-value pair separated by an equal sign (=).
● Example:
<City ZIP="94608"> Emeryville </City>
○ Attributes are used to attach additional, secondary information to
an element.
● A basic XML document is an XML element that can, but might not,
include nested XML elements.
● Example:
<books>
<book isbn="123">
<title> Second Chance </title>
<author> Matthew Dunn </author>
</book>
</books>
XML Data Model
<BOOKS>
<book id="123" loc="library">
<author>Hull</author>
<title>California</title>
<year> 1995 </year>
</book>
<article id="555" ref="123">
<author>Su</author>
<title> Purdue</title>
</article>
</BOOKS>
Authoring XML Documents
● All elements must have an end tag.
● All elements must be cleanly nested.
● All attribute values must be enclosed in quotation marks.
● Each document must have a unique first element, the root node.
● Each XML-based standard defines what its valid elements are, using XML type
specification languages to specify the syntax.
● Database schemas constrain what information can be stored, and the
data types of stored values
○ XML documents are not required to have an associated schema
○ However, schemas are very important for XML data exchange
○ Otherwise, a site cannot automatically interpret data received
from another site
○ Two mechanisms for specifying an XML schema
■ Document Type Definition (DTD) - to validate structure
■ XML Schema - to validate structure and data types
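The authoring rules listed under "Authoring XML Documents" (end tags, clean nesting, quoted attribute values, a single root) are what an XML parser checks before any schema is involved. A minimal sketch with Python's standard `xml.etree.ElementTree` parser follows; the helper `is_well_formed` and the sample strings are assumptions made for the example. Note that this checks well-formedness only, not validity against a DTD or XML Schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


good = '<books><book isbn="123"><title>Second Chance</title></book></books>'
missing_end_tag = '<books><book isbn="123"><title>Second Chance</book></books>'
bad_nesting = "<a><b></a></b>"
unquoted_attribute = "<book isbn=123></book>"
two_roots = "<a></a><b></b>"
case_mismatch = "<City>Emeryville</city>"   # tag names are case-sensitive

for sample in (good, missing_end_tag, bad_nesting,
               unquoted_attribute, two_roots, case_mismatch):
    print(is_well_formed(sample), sample)
```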
Document Type Definition (DTD)
● A DTD checks the grammar of a document in terms of its structure
● The type of an XML document can be specified using a DTD
● A DTD constrains the structure of XML data
○ What elements can occur
○ What attributes can/must an element have
○ What subelements can/must occur inside each element, and how
many times.
● A DTD does not constrain data types
● All values are represented as strings in XML
● A DTD can be internal or external
PCDATA
● PCDATA means parsed character data.
● Think of character data as the text found between the start tag and the end tag of
an XML element.
● PCDATA is text that WILL be parsed by a parser. The text will be examined by the
parser for entities and markup.
● Tags inside the text will be treated as markup and entities will be expanded.
● However, parsed character data should not contain any &, <, or > characters; these
need to be represented by the &amp;, &lt; and &gt; entities, respectively.
CDATA
● CDATA means character data.
● CDATA is text that will NOT be parsed by a parser. Tags inside the text will NOT be
treated as markup and entities will not be expanded.
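The difference between parsed and unparsed character data can be observed with Python's standard library. The sample strings below are assumptions chosen for the illustration: `escape` replaces the three special characters with entities, the parser expands them again, and a CDATA section passes its content through untouched.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# PCDATA: special characters must be written as entities ...
escaped = escape("1 < 2 & 3 > 2")
# ... and the parser expands the entities back when reading the text.
pcdata_text = ET.fromstring("<note>" + escaped + "</note>").text

# PCDATA: tags inside the text are treated as markup (a child element).
with_markup = ET.fromstring("<note>see <b>this</b></note>")

# CDATA: the content is not parsed, so "<b>" stays plain text.
cdata_text = ET.fromstring("<note><![CDATA[<b>1 < 2 & 3 > 2</b>]]></note>").text

print(escaped)
print(pcdata_text)
print(len(with_markup), with_markup[0].tag)
print(cdata_text)
```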
Internal DTD
<?xml version="1.0"?>
<!DOCTYPE note [
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend</body>
</note>
Example document (truncated in the source):
<db><person>
<name>Alan</name> ...
DTD for it might be:
<!DOCTYPE db [
<!ELEMENT db (person*)>
<!ELEMENT person (name, age, email ...
<!ELEMENT name (#PCDATA)>
<!ELEMENT age ( ...
External DTD
<?xml version="1.0" encoding="UTF-8"?>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
</note>
● Online validation
○ https://www.xmlvalidation.com/
○ https://www.truugo.com/xml_validator/
● Extension in VS Code
XML Schema
● XML Schema is a more sophisticated schema language which
addresses the drawbacks of DTDs. It supports
○ Typing of values
■ E.g. integer, string, etc.
■ Also, constraints on min/max values
○ User-defined, complex types
○ Many more features, including
■ uniqueness and foreign key constraints, inheritance
● XML Schema is itself specified in XML syntax, unlike DTDs
○ More-standard representation, but verbose
● XML Schema is integrated with namespaces
● BUT: XML Schema is significantly more complicated than DTDs.
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="bank" type="BankType"/>
<xs:element name="account">
<xs:complexType>
<xs:sequence>
<xs:element name="account_number" type="xs:string"/>
<xs:element name="branch_name" type="xs:string"/>
<xs:element name="balance" type="xs:decimal"/>
</xs:sequence>
</xs:complexType>
</xs:element>
….. definitions of customer and depositor ….
<xs:complexType name="BankType">
<xs:sequence>
<xs:element ref="account" minOccurs="0" maxOccurs="unbounded"/>
<xs:element ref="customer" minOccurs="0" maxOccurs="unbounded"/>
<xs:element ref="depositor" minOccurs="0" maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>
</xs:schema>
Concluding XML
● XQuery
○ XQuery is a functional language that is used to retrieve information
stored in XML format. XQuery can be used on XML
documents, relational databases containing data in XML
formats, or XML databases.
● Converting a Relational Database to XML
○ The schema of a relational database can be converted into XML
through the EER (extended entity relationship) model.
○ The semantics of the relational database, captured in an EER
diagram, are mapped to an XML schema using stepwise procedures,
and the data is mapped to an XML document under the definitions of
the XML schema.
● Challenges facing XML
○ XML is verbose
■ a human has to read it!
○ An open tag cannot be left unclosed
○ Browsers do not present XML as a formatted page on their own
■ XML documents must be converted to HTML
before they are deployed
○ Multiple white-space characters
■ XML represents 50 blank spaces as one space.
○ XML is not optimized for access speed
● Online Validation
○ https://www.freeformatter.com/xml-validator-xsd.html
JSON
● “JSON” stands for “JavaScript Object Notation”
● A lightweight text-based data-interchange format
● Completely language independent
● Despite the name, JSON is a (mostly) language-independent way of
specifying objects as name-value pairs
● Based on a subset of the JavaScript Programming Language
● Easy to understand, manipulate and generate
○ Easy for humans to read and write.
○ Easy for machines to parse and generate
JSON syntax
var employeeData =
{ "employee_id": 1234567,
"name": "Jeff Fox",
"hire_date": "1/1/2013",
"location": "Norwalk, CT",
"consultant": false
};
Arrays in JSON
● An ordered collection of values
● Begins with [ (left bracket)
● Ends with ] (right bracket)
● Values are separated by , (comma)
var employeeData =
{ "employee_id": 1236937,
"name": "Jeff Fox",
"hire_date": "1/1/2013",
"location": "Norwalk, CT",
"consultant": false,
"random_nums":
[ 24,65,12,94 ]
};
The BNF is simple
● JSON Parser
○ You can convert a JavaScript object into JSON text
○ Object to Text Conversion
var myJSONText = JSON.stringify(myObject);
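Because JSON is language independent, the same object-to-text conversion works outside JavaScript. The sketch below uses Python's standard `json` module with the employee record from the slides; the variable names are chosen for the example.

```python
import json

# A Python dict mirroring the employeeData object shown earlier.
employee = {
    "employee_id": 1234567,
    "name": "Jeff Fox",
    "hire_date": "1/1/2013",
    "location": "Norwalk, CT",
    "consultant": False,
    "random_nums": [24, 65, 12, 94],
}

# Object to text: Python's False becomes JSON's false, lists become arrays.
text = json.dumps(employee)

# Text to object: parsing the text gives back an equal structure.
restored = json.loads(text)

print(text)
print(restored == employee, restored["random_nums"][1])
```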
1. Interface Definition: An API provides a set of rules and protocols for how
software components should interact. This includes specifying the
requests that can be made, the data formats to use, and the responses to
expect.
2. Endpoints: APIs often expose certain endpoints (URLs) that allow
applications to access specific functions or data. For example, a weather
API might have an endpoint to get current weather information.
3. Requests and Responses: When an application wants to use an API, it sends
a request to one of these endpoints. The API processes the request and
sends back a response, which could be data or an action result.
4. Data Formats: APIs typically use data formats like JSON or XML to structure
the information being exchanged. This makes it easier for different
systems to understand and work with the data.
5. Authentication: Many APIs require authentication to ensure that only
authorized users or systems can access the data or services. This can be
done using API keys, tokens, or other methods.
REST API
● An API is an application programming interface. It is a set of rules that
allow programs to talk to each other. The developer creates the API on the
server and allows the client to talk to it.
● REST determines what the API looks like.
● It stands for “Representational State Transfer”.
● REST is a set of rules that developers follow when they create their API.
One of these rules states that you should be able to get a piece of data
(called a resource) when you link to a specific URL.
○ Create
○ Read
○ Update
○ delete
● Each URL is called a request while the data sent back to you is called
a response.
● The idea behind REST is to treat all server URLs as access points for
the various resources on the server.
● E.g.:
[GET]    http://example.com/users
[POST]   http://example.com/users
[GET]    http://example.com/users/1
[PUT]    http://example.com/users/1
[DELETE] http://example.com/users/1
● GET - reads
● POST - creates new resources
● PUT - Update a given resource
● DELETE - deletes a resource
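The mapping from HTTP methods to create, read, update and delete operations can be sketched with an in-memory stand-in for a server. The `UserApi` class, its `handle` method and the status codes it returns are illustrative assumptions, not a real web framework; only the `/users` and `/users/1` paths come from the examples above.

```python
class UserApi:
    """Toy REST-style handler for the /users resource (no real networking)."""

    def __init__(self):
        self.users = {}
        self.next_id = 1

    def handle(self, method, path, body=None):
        parts = [p for p in path.split("/") if p]
        if not parts or parts[0] != "users" or len(parts) > 2:
            return 404, None
        if len(parts) == 1:                       # collection: /users
            if method == "GET":                   # read all users
                return 200, list(self.users.values())
            if method == "POST":                  # create a new user
                user = {"id": self.next_id, **(body or {})}
                self.users[self.next_id] = user
                self.next_id += 1
                return 201, user
            return 405, None
        if not parts[1].isdigit():
            return 404, None
        user_id = int(parts[1])                   # single resource: /users/<id>
        if user_id not in self.users:
            return 404, None
        if method == "GET":                       # read one user
            return 200, self.users[user_id]
        if method == "PUT":                       # update the given user
            self.users[user_id] = {"id": user_id, **(body or {})}
            return 200, self.users[user_id]
        if method == "DELETE":                    # delete the user
            del self.users[user_id]
            return 204, None
        return 405, None


api = UserApi()
created = api.handle("POST", "/users", {"name": "Ada"})
fetched = api.handle("GET", "/users/1")
updated = api.handle("PUT", "/users/1", {"name": "Ada Lovelace"})
deleted = api.handle("DELETE", "/users/1")
missing = api.handle("GET", "/users/1")
print(created, fetched, updated, deleted, missing)
```

The same URL (`/users/1`) serves three different operations; it is the HTTP method that selects which one is performed on the resource.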
When to Consider Alternatives
● REST: If simplicity and compatibility with web standards are more
important, or if you need to support HTTP/1.1, REST might be easier
to implement and integrate.
● WebSocket: For real-time applications requiring low latency and a
persistent connection with continuous two-way communication,
WebSocket might be more appropriate.
● SOAP: When you need extensive built-in support for complex transactions
and security features, and are working within environments where XML
is preferred.
● GraphQL: When you require flexible queries and precise control over the
data returned in responses, and you want a single endpoint for data
fetching.
● gRPC: When you need high-performance, efficient communication
between services with strict requirements for latency and throughput.
Sample Questions
1. The architecture of Web application.
2. HTTP methods
3. Explain GET and POST request methods of HTTP.
4. Differentiate following:
a. HTTP and HTTPS
b. SSL and TLS
c. XML and JSON
d. URI, URL, URN
5. Short Note:
a. REST API and its characteristics
b. DNS
6. Structure of browser (components of the browser)
7. Functions of different components of the browser.
8. Working of the rendering engine
9. What is DNS? Explain domain and subdomain with suitable examples.
10. Write a short note on DOM.
● Self-learning Topics: Nginx server