Web Science
Web Expansion: Initially used in educational and research institutions before public
adoption.
E-commerce Growth: Businesses leveraged the web for online transactions and
inventory management.
Content Monetization: Shift between free (ad-supported) and paid content models.
3. Technological Foundations
A web browser is a software application that allows users to access and interact with content on
the World Wide Web. It interprets and renders web pages using various protocols and
technologies.
1. Rendering Web Pages
Supports various web technologies like HTML5, CSS3, JavaScript, and WebAssembly.
Enables interactive elements like forms, animations, and dynamic content.
Allows users to install extensions and plugins for added functionality (e.g., ad blockers,
password managers).
Supports themes and developer tools for customization.
Provides Inspect Element, Console, and Network Monitoring for web development.
Helps in debugging CSS, JavaScript, and API requests.
1. Web Browser
A web browser is software that allows users to access and interact with websites.
It acts as a bridge between the user (client) and the web server by sending requests and
displaying responses.
Google Chrome
Mozilla Firefox
Microsoft Edge
Safari
Data Storage | Stored temporarily (until browser closes) | Stored on the server (until session expires)
Cookie Usage | Stores HTTP session ID as a cookie | Uses session ID to track user actions
Expiration | Ends when the browser is closed | Ends after inactivity timeout
3. Web Server
A web server is software or hardware that processes requests and delivers web pages to
users.
It stores, processes, and serves web content using protocols like HTTP/HTTPS.
A web browser is an application that allows users to access and navigate the World
Wide Web (WWW).
Acts as an intermediary between the client (user) and the server to request and display
web content.
Functions as an interpreter and rendering engine that displays HTML pages containing text,
images, styles, and JavaScript.
Examples: Google Chrome, Microsoft Edge, Mozilla Firefox, Safari.
1990: Tim Berners-Lee created the first web browser, World Wide Web (later renamed
Nexus).
1993: Mosaic, the first browser to support images and text together, was developed by
Marc Andreessen and his team.
1994: Netscape Navigator, an advanced commercial browser, was released.
1995: Microsoft introduced Internet Explorer, pre-installed in Windows OS.
Other modern browsers such as Mozilla Firefox, Google Chrome, and Safari emerged
with enhanced features.
2. Web Server
A web server is a combination of software and hardware that processes and delivers web
pages to users.
Uses HTTP and other protocols to manage requests and serve web content.
Can handle multiple user requests simultaneously, ensuring high availability.
How a Web Server Works
Static Content: Pre-existing files (HTML, CSS, images) directly served by the web
server.
Dynamic Content: Generated in real-time using programming languages such as PHP,
Python, JSP, or ASP.
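The difference between static and dynamic content can be sketched with a tiny WSGI application in Python. This is an illustrative sketch only: the STATIC_FILES dictionary, the /time route, and the call helper are assumptions made for the example, not part of any particular server.

```python
from datetime import datetime, timezone

# Hypothetical in-memory "document root"; a real server would read files from disk.
STATIC_FILES = {"/index.html": b"<h1>Welcome</h1>"}

def app(environ, start_response):
    """Minimal WSGI application: static lookup first, then one dynamic route."""
    path = environ.get("PATH_INFO", "/")
    if path in STATIC_FILES:
        # Static content: bytes that already exist are returned unchanged.
        status, body = "200 OK", STATIC_FILES[path]
    elif path == "/time":
        # Dynamic content: the body is generated when the request arrives.
        now = datetime.now(timezone.utc).isoformat()
        status, body = "200 OK", f"<p>Server time: {now}</p>".encode("utf-8")
    else:
        status, body = "404 Not Found", b"<h1>Not Found</h1>"
    start_response(status, [("Content-Type", "text/html; charset=utf-8"),
                            ("Content-Length", str(len(body)))])
    return [body]

def call(path):
    """Invoke the app the way a WSGI server would; returns (status, body)."""
    captured = {}
    def start_response(status, headers):
        captured["status"] = status
    body = b"".join(app({"PATH_INFO": path, "REQUEST_METHOD": "GET"}, start_response))
    return captured["status"], body
```

To try it in a browser, the same app object could be passed to wsgiref.simple_server.make_server("", 8000, app) from the standard library.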
3. HTTP Protocol
What is HTTP?
Request Line: Contains method (GET, POST, PUT, etc.), requested URL, and HTTP
version.
Headers: Provide additional request details (e.g., authentication, content type, caching
instructions).
Body (Optional): Contains form data or uploaded content for methods like POST or PUT.
Status Line: Contains HTTP version and response status code (200 OK, 404 Not Found,
500 Internal Server Error, etc.).
Headers: Describe the content type, encoding, caching policies, etc.
Body: Contains the actual response (HTML, JSON, etc.).
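The request and response layout described above can be made concrete with a short Python sketch. The helper names parse_request and build_response, the /login path, and the form values are assumptions chosen for illustration; production code would normally rely on an HTTP library instead of parsing by hand.

```python
def parse_request(raw):
    """Split a raw HTTP/1.1 request into request line, headers, and body."""
    head, _, body = raw.partition("\r\n\r\n")      # blank line ends the headers
    lines = head.split("\r\n")
    method, target, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {"method": method, "target": target, "version": version,
            "headers": headers, "body": body}

def build_response(code, reason, body):
    """Assemble status line, headers, blank line, and body."""
    payload = body.encode("utf-8")
    return (f"HTTP/1.1 {code} {reason}\r\n"
            "Content-Type: text/html; charset=utf-8\r\n"
            f"Content-Length: {len(payload)}\r\n"
            "\r\n"
            f"{body}")

raw_request = (
    "POST /login HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    "Content-Length: 27\r\n"
    "\r\n"
    "username=alice&password=pw1"
)
request = parse_request(raw_request)
response = build_response(404, "Not Found", "<h1>Not Found</h1>")
```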
Common HTTP Methods
1. Programmatic Approaches
Involves writing source code using scripting languages (Perl, Python, Tcl) or high-level
programming languages (Java) to create dynamic web applications.
HTML content is embedded within the code, making this approach developer-centric rather
than designer-friendly.
Challenges:
o Difficult for designers to modify layouts without developer intervention.
o Reduces design flexibility and slows down the creative process.
o Maintenance becomes cumbersome as both logic and presentation are intertwined.
Example Technologies: CGI scripts, Java Servlets
2. Template-Based Approaches
Focus: Prioritizes the structure and layout of web pages over programming logic.
Uses templates that consist mostly of formatting elements, with minimal embedded scripting
for dynamic content.
Key Features:
o Allows basic logic such as conditional statements, loops, and parameter substitution for
content injection.
o Provides a designer-friendly environment with greater flexibility and independence
from developers.
Advantages:
o Web designers can create and modify layouts without requiring deep programming
knowledge.
o Easier to maintain due to the clear separation between formatting and functionality.
Examples:
o Server-Side Includes (SSI)
o Adobe ColdFusion
o Apache Velocity
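As a rough illustration of parameter substitution, the sketch below uses Python's standard string.Template in place of the template engines listed above. The PAGE layout, the render function, and the product names are hypothetical.

```python
from html import escape
from string import Template

# The template is mostly markup; $title and $items are the only placeholders.
PAGE = Template(
    "<html><body>\n"
    "<h1>$title</h1>\n"
    "<ul>\n$items</ul>\n"
    "</body></html>"
)

def render(title, products):
    # string.Template only substitutes values, so the loop is written in code here.
    # Engines such as Velocity let the template itself express the loop.
    items = "".join(f"<li>{escape(name)}</li>\n" for name in products)
    return PAGE.substitute(title=escape(title), items=items)

page_html = render("Catalog", ["Pen", "Notebook"])
```

A designer can change the markup inside PAGE without touching render, which is the separation of formatting and functionality that this approach aims for.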
Some argue that embedding scripting logic into templates diminishes their usability for
designers, turning them into limited programming languages.
The ideal approach maintains a balance, ensuring templates empower designers without
requiring them to handle complex logic.
A minimal set of logic constructs (such as loops and conditional statements) enables designers
to work independently while still allowing some dynamic content control.
4. Hybrid Approaches
HTML (HyperText Markup Language) is the fundamental language for creating structured web
content.
Invented by Tim Berners-Lee at CERN in the late 1980s as a system for sharing and linking
documents online.
In 1993, Mosaic, the first widely used graphical web browser, was released by NCSA at UIUC,
making HTML more accessible.
<!DOCTYPE HTML>: Defines the HTML version to ensure consistent rendering across browsers.
<html>: The root element enclosing all other tags.
<head>: Contains metadata, links to stylesheets, and scripts for functionality.
<body>: Includes all visible web page content such as text, images, videos, and interactive
elements.
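To see how these four parts nest, the following sketch feeds a minimal page to Python's standard html.parser and records each element with its depth. The sample page and the Outline class are made up for the example.

```python
from html.parser import HTMLParser

DOCUMENT = """<!DOCTYPE html>
<html>
  <head>
    <title>Sample Page</title>
  </head>
  <body>
    <h1>Hello</h1>
    <p>Visible content goes in the body.</p>
  </body>
</html>"""

class Outline(HTMLParser):
    """Records each element together with its nesting depth."""
    def __init__(self):
        super().__init__()
        self.depth = 0
        self.outline = []

    def handle_starttag(self, tag, attrs):
        self.outline.append((self.depth, tag))
        self.depth += 1

    def handle_endtag(self, tag):
        self.depth -= 1

parser = Outline()
parser.feed(DOCUMENT)
# parser.outline now lists html at depth 0, head and body at depth 1,
# and their children at depth 2.
```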
4. HTML Forms
Purpose: Used to collect and process user input via interactive fields.
<form> tag serves as a container for input elements.
Common Input Elements:
o <input>: Captures text, passwords, emails, numbers, and other data types.
o <label>: Associates descriptive text with input fields for clarity.
Important Attributes:
o action: Specifies the URL where form data is submitted.
o method: Defines how data is sent (GET or POST).
o target: Determines how the response is displayed (e.g., new window, same page).
o enctype: Specifies encoding type when submitting data (e.g., multipart/form-data).
o autocomplete: Enables auto-filling of previous inputs.
o novalidate: Prevents automatic form validation upon submission.
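The effect of the method attribute can be sketched by encoding the same fields both ways in Python. The action URL and the field values are hypothetical, and the sketch only covers the default enctype (application/x-www-form-urlencoded), not multipart/form-data.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

fields = {"username": "alice", "city": "New Delhi"}
action = "https://www.example.com/search"   # hypothetical action URL

# method="get": the encoded fields become the query string of the action URL.
get_url = f"{action}?{urlencode(fields)}"

# method="post" with the default enctype: the same encoding is used,
# but it travels in the request body instead of the URL.
post_body = urlencode(fields).encode("ascii")
post_headers = {"Content-Type": "application/x-www-form-urlencoded",
                "Content-Length": str(len(post_body))}

# The receiving side reverses the encoding to recover the submitted values.
received = parse_qs(urlsplit(get_url).query)
```

Because GET places the data in the URL, it ends up in browser history and server logs, which is one reason sensitive fields such as passwords are normally sent with POST.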
5. Introduction to XML
XML (Extensible Markup Language) is a structured format for storing and transporting data.
Unlike HTML, XML does not define presentation; it focuses on data storage and transfer.
HTML is used for displaying content, whereas XML is used for carrying structured data.
XML tags are customizable, allowing users to define their own schema, while HTML has
predefined tags.
XML provides hierarchical data organization, making it ideal for structured information
exchange.
Prolog: Contains metadata about the XML document (e.g., XML version and encoding
information).
Root Element: The top-level container that encloses all other elements.
Elements: Define data points within the XML document using opening and closing tags.
Attributes: Provide additional details within elements in name-value pairs.
XML follows a tree structure, with a root element at the top and nested child elements below.
Rules for XML Documents:
o Must have a single root element.
o Elements must have properly nested opening and closing tags.
o Tags are case-sensitive (<Title> is different from <title>).
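These rules can be checked with Python's standard xml.etree.ElementTree parser, which rejects documents that break them. The sample documents and the is_well_formed helper are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the document, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

good = ('<?xml version="1.0"?>'
        '<bookstore><book category="cooking">'
        '<title>Everyday Italian</title>'
        '</book></bookstore>')
two_roots = "<book/><book/>"                                   # more than one root
bad_nesting = "<book><title>Everyday Italian</book></title>"   # tags overlap
case_mismatch = "<Title>Everyday Italian</title>"              # Title is not title

root = ET.fromstring(good)     # root element of the tree
book = root.find("book")       # child element
category = book.get("category")  # attribute as a name-value pair
```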
The Document Object Model (DOM) provides a structured way to access and manipulate XML
content.
Web browsers use built-in XML parsers to convert XML data into an interactive object model.
Example: Parsing XML in JavaScript
<script>
// Parse an XML string into a DOM document and read the first <title> element.
var text = "<bookstore><book><title>Everyday Italian</title></book></bookstore>";
var parser = new DOMParser();
var xmlDoc = parser.parseFromString(text, "text/xml");
console.log(xmlDoc.getElementsByTagName("title")[0].childNodes[0].nodeValue);
</script>
UNIT 5
Definition:
The Security Development Lifecycle (SDL) is a security assurance process that builds security into
every phase of software development rather than adding it afterward. It was originally developed by
Microsoft to address security concerns in software development.
Phases of SDL:
Traditional SDL follows a waterfall model, making it hard to adapt to Agile methodologies.
Agile SDL integrates security into rapid development cycles, ensuring that security is a
continuous and iterative process.
UNIT 6
Definition:
SQL Injection (SQLi) is a web security vulnerability that allows attackers to inject malicious SQL code into
input fields to manipulate database queries. This can lead to unauthorized data access, data corruption,
or even full database control.
1. Classic SQL Injection: Directly injecting malicious SQL queries into input fields.
2. Blind SQL Injection: Extracting data based on conditional responses (e.g., true/false conditions).
3. Time-Based SQL Injection: Using time delays to determine if an application is vulnerable.
4. Union-Based SQL Injection: Using the UNION SQL operator to retrieve data from other tables.
Vulnerable Query:
SELECT * FROM users WHERE username = 'admin' AND password = 'password';
Malicious Input:
admin' --
Executed Query:
SELECT * FROM users WHERE username = 'admin' --' AND password = 'password';
Prevention Methods:
# Parameterized query: placeholders keep user input out of the SQL text.
query = "SELECT * FROM users WHERE username = ? AND password = ?"
cursor.execute(query, (user, password))
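The attack and the fix can be reproduced end to end with Python's built-in sqlite3 module. This is a teaching sketch under stated assumptions: the in-memory table, the sample credentials, and the function names are invented, and the password is stored in plain text only to mirror the query above (a real system would store a hash).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('admin', 's3cret')")

def login_vulnerable(user, password):
    # UNSAFE: input is pasted into the SQL text, so quotes and comment
    # markers inside it are interpreted as SQL.
    query = (f"SELECT * FROM users WHERE username = '{user}' "
             f"AND password = '{password}'")
    return conn.execute(query).fetchall()

def login_safe(user, password):
    # Placeholders send the values separately from the SQL text.
    query = "SELECT * FROM users WHERE username = ? AND password = ?"
    return conn.execute(query, (user, password)).fetchall()

attack = "admin' --"
bypassed = login_vulnerable(attack, "wrong")   # password check is commented out
blocked = login_safe(attack, "wrong")          # input is treated as a literal name
```

With the vulnerable function the attacker logs in as admin without knowing the password; with the parameterized version the same input simply matches no row.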
Definition:
Stored procedures are SQL routines stored in the database that run predefined queries. If a procedure
builds SQL dynamically from unvalidated input, it is exposed to the same injection attacks as application code.
Prevention Methods:
-- The parameter is bound as a value and is never concatenated into SQL text.
CREATE PROCEDURE GetUser(IN userInput VARCHAR(100))
BEGIN
    SELECT * FROM users WHERE username = userInput;
END;
Definition:
SQL Column Truncation occurs when user input is truncated due to a column’s length limit, leading to
security risks such as privilege escalation or account takeover.
How It Works:
If a database column has a character limit (e.g., VARCHAR(10)), an attacker can exploit it by
registering a longer username that, once truncated, matches an existing user's username.
Example of an Attack:
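The mechanism can be simulated in plain Python without a database. The sketch assumes a database that silently truncates over-long values and ignores trailing spaces when comparing strings; whether a real database behaves this way depends on the product and its configuration. The names store and same_user are hypothetical.

```python
COLUMN_LIMIT = 10  # mirrors the VARCHAR(10) example

def store(username):
    """Simulate a database that silently truncates over-long input."""
    return username[:COLUMN_LIMIT]

def same_user(a, b):
    """Simulate a comparison that ignores trailing spaces (assumed behaviour)."""
    return a.rstrip(" ") == b.rstrip(" ")

existing = store("admin")

# "admin" + 5 spaces + "x" is 11 characters and differs from "admin",
# so a naive "is this username taken?" check lets it through.
attacker_input = "admin" + " " * 5 + "x"
passes_uniqueness_check = attacker_input != existing

stored = store(attacker_input)           # the trailing "x" is cut off
collides = same_user(stored, existing)   # now indistinguishable from "admin"
```

Rejecting input that exceeds the column length before it reaches the database, together with a unique constraint on the column, closes this gap.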
Prevention Methods:
Stored Procedure Attack | Avoid dynamic SQL; use parameterized stored procedures
SQL Column Truncation | Enforce strict input validation; implement unique constraints
UNIT 7
7.1 Access Control
Access control is a security technique that regulates who or what can view or use resources in a
computing environment. It ensures that only authorized individuals can access specific data or systems.
Definition: Horizontal access control restricts users from accessing data that belongs to other
users at the same privilege level.
Example: In a web application, a user should only be able to access their own profile but not
another user’s profile.
Security Issues:
o Insecure Direct Object References (IDOR): Attackers can manipulate URLs or request
parameters to gain unauthorized access to other users’ data.
o Session Hijacking: Unauthorized access due to stolen session tokens.
Mitigation:
o Implement role-based or attribute-based access control (RBAC/ABAC).
o Use secure session management techniques.
o Validate user permissions before returning requested data.
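A minimal sketch of the last mitigation, assuming a hypothetical PROFILES store and a user ID taken from the authenticated session: the server compares the owner of the requested record with the session user before returning anything.

```python
# Hypothetical data store keyed by profile ID.
PROFILES = {
    1: {"owner_id": 1, "email": "alice@example.com"},
    2: {"owner_id": 2, "email": "bob@example.com"},
}

class Forbidden(Exception):
    """Raised when a user requests a resource they do not own."""

def get_profile(session_user_id, requested_profile_id):
    profile = PROFILES.get(requested_profile_id)
    # The ID in the request is attacker-controlled; the ID in the session is not.
    # Unknown IDs and foreign IDs are rejected the same way.
    if profile is None or profile["owner_id"] != session_user_id:
        raise Forbidden("access denied")
    return profile
```

Without the ownership check, changing the profile ID in the URL would be enough to read another user's data, which is the IDOR pattern described above.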
Definition: Vertical access control restricts users from accessing functionalities beyond their
assigned privilege level.
Example: A normal user should not be able to perform administrative actions like modifying
system configurations.
Security Issues:
o Privilege Escalation: Attackers exploit system vulnerabilities to gain higher access.
o Misconfigured Access Control: Improperly assigned roles allow unauthorized users to
execute privileged functions.
Mitigation:
o Implement the Principle of Least Privilege (PoLP).
o Use multi-factor authentication (MFA) for sensitive operations.
o Perform regular access control audits.
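Vertical checks can be sketched as a permission lookup that runs before a privileged function. The PERMISSIONS map, the requires decorator, and update_config are hypothetical names; a real application would take the role from the authenticated session, not from a function argument.

```python
from functools import wraps

# Hypothetical role-to-permission map; each role gets only what it needs.
PERMISSIONS = {
    "user": {"view_profile"},
    "admin": {"view_profile", "edit_config"},
}

def requires(permission):
    """Decorator that rejects callers whose role lacks the permission."""
    def decorator(func):
        @wraps(func)
        def wrapper(role, *args, **kwargs):
            # Unknown roles get an empty permission set, so they are denied.
            if permission not in PERMISSIONS.get(role, set()):
                raise PermissionError(f"{role!r} may not {permission}")
            return func(role, *args, **kwargs)
        return wrapper
    return decorator

@requires("edit_config")
def update_config(role, key, value):
    return f"{key} set to {value}"
```

Keeping the map small and denying by default follows the Principle of Least Privilege: a role that is not listed, or a permission that is not granted, results in a refusal.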
7.2 Authentication
Authentication is the process of verifying a user's identity before granting access to resources.
Common Issues:
o Weak passwords (e.g., "123456", "password") are easily guessed.
o Reused passwords across multiple platforms increase the risk of credential stuffing
attacks.
o Brute force attacks exploit weak authentication mechanisms.
o Phishing attacks trick users into revealing passwords.
Mitigation Strategies:
o Enforce password complexity rules (length, special characters, uppercase/lowercase).
o Implement account lockout mechanisms after multiple failed login attempts.
o Use Multi-Factor Authentication (MFA) for added security.
o Educate users on phishing awareness.
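The first two mitigations can be sketched in a few lines of Python. The minimum length of 12, the threshold of 3 attempts, and the names is_strong and LoginGuard are assumptions for the example; real systems usually add a timed unlock and rate limiting as well.

```python
import re

MAX_ATTEMPTS = 3  # hypothetical lockout threshold

def is_strong(password, min_length=12):
    """Check length plus lowercase, uppercase, and special characters."""
    return (len(password) >= min_length
            and re.search(r"[a-z]", password) is not None
            and re.search(r"[A-Z]", password) is not None
            and re.search(r"[^A-Za-z0-9]", password) is not None)

class LoginGuard:
    """Counts consecutive failures per account and locks after MAX_ATTEMPTS."""

    def __init__(self):
        self.failures = {}

    def is_locked(self, username):
        return self.failures.get(username, 0) >= MAX_ATTEMPTS

    def record(self, username, success):
        if success:
            self.failures.pop(username, None)   # reset on successful login
        else:
            self.failures[username] = self.failures.get(username, 0) + 1
```

A login handler would call is_locked before checking credentials and record afterward, so repeated guesses against one account stop being evaluated once the threshold is reached.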
UNIT 8
8.1 SOAP (Simple Object Access Protocol)
What is SOAP?
SOAP is a protocol for exchanging structured information between applications over the internet. It uses
XML messages to request and respond to services.
1. Client sends a request – The request is sent in XML format over HTTP, HTTPS, SMTP, or other
protocols.
2. Server processes the request – The web service interprets the SOAP request and performs the
requested operation.
3. Response is sent back – The server returns an XML-formatted response containing the
requested data.
Components of SOAP:
Importance of SOAP
What is WSDL?
WSDL (Web Services Description Language) is an XML-based language that describes how a SOAP web
service can be accessed and used. It acts as a contract between the client and the server, defining
available services, data formats, and protocols.
Importance of WSDL
What is UDDI?
UDDI (Universal Description, Discovery, and Integration) is a registry where businesses can publish and
discover web services. It acts like a phonebook for web services, helping clients find services dynamically.
1. Service providers register their WSDL and service information in a UDDI registry.
2. Clients search for services using UDDI queries.
3. Once a service is found, the client retrieves the WSDL and uses SOAP to interact with it.
Components of UDDI
Importance of UDDI
1. Set up a web service using Java (JAX-WS), .NET, or Python (Zeep library).
2. Define the WSDL for the service.
3. Develop a client to consume the SOAP service.
4. Test with SOAP UI – A tool to send SOAP requests and view responses.
Example SOAP request:
<soapenv:Envelope
    xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:cur="http://currency.example.com/">
  <soapenv:Body>
    <cur:GetExchangeRate>
      <cur:FromCurrency>USD</cur:FromCurrency>
      <cur:ToCurrency>INR</cur:ToCurrency>
    </cur:GetExchangeRate>
  </soapenv:Body>
</soapenv:Envelope>
Example SOAP response:
<soapenv:Envelope
    xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:cur="http://currency.example.com/">
  <soapenv:Body>
    <cur:GetExchangeRateResponse>
      <cur:Rate>82.50</cur:Rate>
    </cur:GetExchangeRateResponse>
  </soapenv:Body>
</soapenv:Envelope>
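A client-side view of the same exchange can be sketched with Python's standard xml.etree.ElementTree, without a SOAP library. The functions build_request and parse_rate are hypothetical helpers, the currency namespace is the example one used above, and no network call is made; a real client would POST the request to the endpoint named in the service's WSDL.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"   # SOAP 1.1 envelope namespace
CUR_NS = "http://currency.example.com/"                 # example service namespace

def build_request(from_currency, to_currency):
    """Build a GetExchangeRate envelope and return it as bytes."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{CUR_NS}}}GetExchangeRate")
    ET.SubElement(call, f"{{{CUR_NS}}}FromCurrency").text = from_currency
    ET.SubElement(call, f"{{{CUR_NS}}}ToCurrency").text = to_currency
    return ET.tostring(envelope, encoding="utf-8")

def parse_rate(response_xml):
    """Extract the Rate value from a GetExchangeRateResponse envelope."""
    root = ET.fromstring(response_xml)
    ns = {"soapenv": SOAP_NS, "cur": CUR_NS}
    rate = root.find("soapenv:Body/cur:GetExchangeRateResponse/cur:Rate", ns)
    return float(rate.text)

sample_response = f"""<soapenv:Envelope xmlns:soapenv="{SOAP_NS}" xmlns:cur="{CUR_NS}">
  <soapenv:Body>
    <cur:GetExchangeRateResponse>
      <cur:Rate>82.50</cur:Rate>
    </cur:GetExchangeRateResponse>
  </soapenv:Body>
</soapenv:Envelope>"""

request_bytes = build_request("USD", "INR")
rate = parse_rate(sample_response)
```

Both prefixes must be declared on the envelope (or an enclosing element); a response that uses cur: without declaring it is not well-formed and the parser rejects it.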
Summary of Web Services
Feature | SOAP | WSDL | UDDI
Definition | XML-based protocol for communication | XML format describing web services | Registry for discovering web services
Role | Handles message exchange | Defines service structure | Stores service details