Internet and Web Applications

The document provides an introduction to basic aspects of internet and web applications, including network enabled applications like BitTorrent and Git, as well as web applications like web search engines and web feeds. It discusses important applications, mechanisms, protocols, and languages associated with internet and web applications, and covers topics like web application frameworks, web crawling, content management, file sharing, and business models. The document also introduces concepts of Web 2.0, Web 3.0, XML, XHTML, stylesheets, and other technologies that enable modern web applications.

Uploaded by

Mohamed Sayed
Copyright
© All Rights Reserved

Department of Computer Science, Institute for System Architecture, Chair for Computer Networks

Internet and Web Applications

Introduction
Content

Network-enabled applications

• Internet applications
– BitTorrent (file distribution system)
– Git (version management)
– …
• Web applications
– Web search engines
– Web feeds
– Weblogs
– …

A subset of the important applications, mechanisms, protocols and languages associated with these two categories will be discussed in the lecture.

slide 2
Content

1. Introduction
2. Basic aspects of Web applications
3. Interaction models in the World Wide Web
4. Extensible Stylesheet Language / Cascading Style Sheets
5. Semantic Web
6. Web Application Frameworks
7. Subscription Services
8. Web Crawling and Web Search
9. Content and File Management, Wikis
10. File Sharing
11. Load Balancing and Content Distribution
12. Business Models

slide 3
Web

• The World Wide Web is a system of hypertext documents that are viewable in a web browser
• Hypertext documents are documents that contain cross-references (links) to other documents
• Runs as a service/application on top of the Internet protocols
• Originally based on three fundamental mechanisms:
– Application protocol (Hypertext Transfer Protocol)
– Document description format (Hypertext Markup Language)
– Document addressing mechanism (Uniform Resource Identifiers)

slide 4
Web 2.0
• Web 2.0 is a buzzword for modern forms of applications and services that are accessed through the web browser, such as wikis or weblogs
• These applications often share a strong social component and appealing user interfaces, and they are based on the enhanced interaction models and new document representations discussed in this lecture
• Web 2.0 extends the three original mechanisms of the Web:
– Application protocol (Hypertext Transfer Protocol): protocol improvements, Ajax, Comet / HTTP Streaming, Web Sockets
– Document description format: (X)HTML, RSS, Atom, RDF, HTML5, …
– Document addressing mechanism (Uniform Resource Identifiers)
• Further characteristics: increasing importance of client-side logic, social communities, social software, new business models, ...
slide 5
Web 3.0

• Web 3.0, or the Semantic Web, is a vision of semantically interconnected data in the Web
• While Web 2.0 has been a technological evolution, Web 3.0 is a conceptual evolution of the Web that builds on the enhanced technological environment
• The focus is not on documents but on data, which thereby becomes Linked Data
• Important formats are RDF, RDFS and OWL (see chapter 5), as well as Microformats

“I have a dream for the Web [in which computers] become capable of
analyzing all the data on the Web – the content, links, and
transactions between people and computers. A ‘Semantic Web’, which
should make this possible, has yet to emerge, but when it does, the
day-to-day mechanisms of trade, bureaucracy and our daily lives will
be handled by machines talking to machines. The ‘intelligent agents’
people have touted for ages will finally materialize.”
(Tim Berners-Lee, inventor of the original World Wide Web, in Weaving the Web, ISBN 0-7528-2090-7)
slide 6
Basic knowledge
Outline

• Extensible Markup Language (XML)
– Language for describing, processing and exchanging data
• Extensible Hypertext Markup Language (XHTML)
– XML reformulation of HTML
• XML validation languages
– Document Type Definition (DTD)
– XML Schema
• XML Processing Models
– Document Object Model (DOM)
– Simple API for XML (SAX)
• Selected alternatives to XML
• Mechanism for message content type description
• JavaScript
• Cascading Style Sheets (CSS)

slide 8
XML

• The Extensible Markup Language (XML) is a general-purpose meta-language for the representation, exchange and processing of data
• Basis for many web-related formats:
– Extensible Hypertext Markup Language (XHTML)
– Extensible Stylesheet Language (XSL)
– XML serialization of the Resource Description Framework (RDF)
– XML Schema
– Atom
– Really Simple Syndication (RSS)
– …
• XML documents are organised in a simple tree structure with the documents’ elements as nodes
slide 9
XML terminology

• Every node may have content and attributes
• An element’s content is one of
– Empty content (no value)
– Simple content (text values)
– Element content (further “tags”)
– Mixed content (simple and element content)

Example tree (‘text’ stands for an arbitrary text value):

bookstock                            root element; has element content
    book id=42                       child element of node “bookstock”
    book id=43
    book id=44                       attribute “id” with value “44”
        author   title   isbn   …    three siblings (sister elements); “book” is their parent element
        text     text    text        simple content

slide 10


XML terminology

<?xml version="1.0" encoding="UTF-8"?>
<bookstock>
  <book id="42">
    <title>TCP/IP Illustrated</title>
    <author>W.R. Stevens</author>
    <isbn>0201633469</isbn>
  </book>
  <book id="43">
    <title>Mobile Web Services</title>
    <author>F. Hirsch</author>
    <author>J. Kemp</author>
    <isbn>0470015969</isbn>
  </book>
  …
</bookstock>
<!-- End of document -->

• The first line is the XML declaration; its encoding attribute names the document’s character encoding (UTF-8, UTF-16, ISO-8859-1, …)
• <bookstock> is the start of the document’s root element
• <book id="42"> is the start of a child element of “bookstock”; it carries the attribute “id” with the attribute’s value “42”
• “book” is the parent of the elements “title”, “author” and “isbn”, which are siblings; together they form the element content of “book”
• <title> is the start tag of element “title”; </book> is the end tag of element “book”
• The text inside “isbn” is its simple content (text content)
• The last line is a comment
slide 11
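The terminology above can be tried out with a short script. The following is a minimal sketch that uses Python's standard `xml.etree.ElementTree` module on a shortened copy of the "bookstock" document; the helper name `describe` is an assumption made for this illustration and is not part of the lecture material.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the "bookstock" document from the slide.
XML_DOC = """<?xml version="1.0" encoding="UTF-8"?>
<bookstock>
  <book id="42">
    <title>TCP/IP Illustrated</title>
    <author>W.R. Stevens</author>
    <isbn>0201633469</isbn>
  </book>
  <book id="43">
    <title>Mobile Web Services</title>
    <author>F. Hirsch</author>
    <author>J. Kemp</author>
    <isbn>0470015969</isbn>
  </book>
</bookstock>
"""


def describe(root):
    """Return the root tag and, per book, (id attribute, child tags, title text)."""
    books = []
    for book in root:  # iterating an element yields its child elements
        child_tags = [child.tag for child in book]  # the siblings inside "book"
        books.append((book.get("id"), child_tags, book.findtext("title")))
    return root.tag, books


# Parsing bytes lets the parser honour the encoding named in the XML declaration.
root = ET.fromstring(XML_DOC.encode("utf-8"))
root_tag, books = describe(root)

print("root element:", root_tag)
for book_id, tags, title in books:
    print("book id =", book_id, "| children:", tags, "| title:", title)
```

The attribute value is read with `get`, the element content by iterating over the element, and the simple content of "title" with `findtext`.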
XML syntax

• The most important syntactic rules for XML data are:
– Non-empty elements need a start tag and an end tag (<X>content</X>); an empty element can be closed directly (<X/>)
– Exactly one root element exists
– All attribute values are quoted (single or double quotes)
– Nested tags do not overlap (<X><Y></X></Y> is not allowed)
– XML is case sensitive: <X></x> is incorrect
– The XML tags, tag content etc. must comply with the declared character encoding (default: UTF-8)
• Syntactically correct XML documents are called well-formed

slide 12
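Well-formedness can be checked mechanically by any XML parser. The sketch below feeds the small fragments used in the rules above to Python's standard `xml.etree.ElementTree` parser; the function name `is_well_formed` and the sample labels are assumptions chosen for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts `text`, False if it reports a syntax error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# One fragment per syntax rule; the labels are only for the printout.
samples = {
    "start and end tag": "<X>content</X>",
    "empty element": "<X/>",
    "two root elements": "<X/><Y/>",
    "unquoted attribute": "<X id=42/>",
    "overlapping tags": "<X><Y></X></Y>",
    "case mismatch": "<X></x>",
}

for label, text in samples.items():
    print(f"{label:20} {text:18} well-formed: {is_well_formed(text)}")
```

Only the first two fragments are accepted; each of the others violates exactly one of the rules listed above.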
XML namespaces

• Different contributors might use the same element names to refer to different things
• If such elements are mixed in one file, name conflicts will occur

<datafile>
  <author>Charlie Brown</author>
  <title>Book list</title>
  <bookstock>
    <book>
      <title>TCP/IP Illustrated</title>
      <author>W.R. Stevens</author>
      <isbn>0201633469</isbn>
      <price>41,89</price>
    </book>
  </bookstock>
</datafile>

If a computer program is supposed to find all book titles and scans the XML file for “title” elements, it wrongly assumes the data file’s title to be a book title.

slide 13
XML namespaces

<datafile xmlns:a="http://www.example.com/doc/"
          xmlns:b="http://www.example.com/myNs/">
  <a:author>Charlie Brown</a:author>
  <a:title>Book list</a:title>
  <b:bookstock>
    <b:book>
      <b:title>TCP/IP Illustrated</b:title>
      <b:author>W.R. Stevens</b:author>
      <b:isbn>0201633469</b:isbn>
      <b:price>41,89</b:price>
    </b:book>
  </b:bookstock>
</datafile>

Each xmlns attribute imports a namespace and binds it to a prefix. No name conflicts occur because the elements are bound to different namespaces.

• Name conflicts are solved by XML namespaces, which are used to qualify elements and attributes by unique identifiers
• These identifiers are web addresses that need not point to existing resources
• A prefix can be bound to one namespace; the prefix is then attached to all elements that belong to this namespace
• Typically, standardised XML dialects define a namespace that is imported by all documents using this dialect

slide 14
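The effect of namespace qualification can be sketched with Python's standard `xml.etree.ElementTree` module. The document is a shortened copy of the namespaced example above; the prefixes `doc` and `books` in the lookup table are assumptions made for this illustration and deliberately differ from the prefixes `a` and `b` in the file, since only the namespace identifier matters for matching.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the namespaced "datafile" document.
XML_DOC = """<datafile xmlns:a="http://www.example.com/doc/"
          xmlns:b="http://www.example.com/myNs/">
  <a:author>Charlie Brown</a:author>
  <a:title>Book list</a:title>
  <b:bookstock>
    <b:book>
      <b:title>TCP/IP Illustrated</b:title>
      <b:author>W.R. Stevens</b:author>
    </b:book>
  </b:bookstock>
</datafile>"""

# Prefixes used by the searching program (hypothetical names).
NS = {
    "doc": "http://www.example.com/doc/",
    "books": "http://www.example.com/myNs/",
}

root = ET.fromstring(XML_DOC)

# Search the whole tree for "title" elements of one namespace only.
book_titles = [elem.text for elem in root.findall(".//books:title", NS)]
file_titles = [elem.text for elem in root.findall(".//doc:title", NS)]

print("book titles:", book_titles)
print("data file title:", file_titles)
```

A program that searches for `books:title` no longer picks up the data file's own title, which is the conflict described on the previous slide.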
XML validity

• A valid XML document references, and conforms to, a further description of its structure and of the data types used in it
• Two frequently used type definition languages are:
1. Document Type Definition (DTD)
• Defines a list of legal elements
• Referenced from or embedded in the related XML document
• Main shortcoming:
– Capabilities are limited (especially for type definitions)

<!DOCTYPE book [
<!ELEMENT book (title,author,isbn)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT isbn (#PCDATA)>
]>

Element “author” is a child element of “book” and of type PCDATA (parsed character data).

slide 15
XML validity

2. XML Schema
• XML dialect that enables the definition of various data types
• Defines two categories of types:
– Simple types (such as string, integer etc.)
– Complex types (such as sequences of other types)

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="book">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="author" type="xs:string"/>
        <xs:element name="isbn" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

Element “book” contains the child elements “title”, “author” and “isbn”, which are all of type “string”.

slide 16
XHTML

• Besides HTML5 (discussed in chapter 2), one widespread application of XML is the Extensible Hypertext Markup Language (XHTML)
• Its language constructs and thus its expressive power were derived from HTML
• A high degree of standardisation and a stricter syntax lead to further unification of browser engines
• Use of XML for hypertext documents makes processing by regular XML tools possible (e.g. for style definition or syntax validation)
• Extensible by further XML languages such as MathML for embedding mathematical expressions into web documents
• XHTML documents may be validated by a DTD
– E.g. XHTML 1.1: http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd

slide 17
XHTML example

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
  "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
  <head>
    <title>XHTML Example</title>
  </head>
  <body>
    <h3>Available books</h3>
    <br/>
    <table>
      <tr><td>Title</td><td>Author</td></tr>
      <tr><td>TCP/IP Illustrated, Volume 1</td><td>W.R. Stevens</td></tr>
      <tr><td>Mobile Web Services</td><td>F. Hirsch / J. Kemp</td></tr>
      <tr><td>Hacking RSS and Atom</td><td>L.M. Orchard</td></tr>
    </table>
  </body>
</html>

• The DOCTYPE declaration references the associated DTD
• The xmlns attribute of “html” defines the default namespace
• Empty elements such as <br/> are closed directly

slide 18
DOM

• The Document Object Model (DOM) is an application programming interface standardised by the W3C for accessing and manipulating XML documents
• DOM interprets the document in its logical tree structure, which makes it necessary to load a tree representation of the document into memory
• It makes it possible to traverse the tree or to directly access elements by an id

Document tree:

bookstock
    book id=42    book id=43    book id=44
        author   title   isbn   …
        text     text    text

var elem = document.getElementById("42")               // "book" element with id=42
var child = elem.getElementsByTagName("author")[0]     // first "author" element below it
var secondChild = child.firstChild                     // text node inside "author"
var value = secondChild.nodeValue                      // text value of that node

getElementsByTagName returns a list of elements that have the given element name

slide 19
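The same DOM navigation can be tried outside a browser. The following is a minimal sketch with Python's standard `xml.dom.minidom` module, which implements a subset of the W3C DOM interface. It assumes a compact copy of the "bookstock" document and selects the book by comparing its plain "id" attribute, because `getElementById` in minidom only finds attributes that have been declared as ID-typed; the helper name `first_author` is hypothetical.

```python
from xml.dom.minidom import parseString

# Compact copy of the "bookstock" document (no whitespace between inner elements).
XML_DOC = (
    "<bookstock>"
    '<book id="42"><title>TCP/IP Illustrated</title>'
    "<author>W.R. Stevens</author><isbn>0201633469</isbn></book>"
    '<book id="43"><title>Mobile Web Services</title>'
    "<author>F. Hirsch</author><author>J. Kemp</author>"
    "<isbn>0470015969</isbn></book>"
    "</bookstock>"
)


def first_author(document, book_id):
    """Return the text of the first "author" element of the book with the given id."""
    for book in document.getElementsByTagName("book"):
        if book.getAttribute("id") == book_id:
            author = book.getElementsByTagName("author")[0]
            text_node = author.firstChild  # the text node below "author"
            return text_node.nodeValue
    return None  # no book with that id


document = parseString(XML_DOC)  # loads the whole tree into memory
print(first_author(document, "42"))
print(first_author(document, "43"))
```

The whole document is parsed into memory before the first lookup, which is the cost of the DOM model mentioned in the bullets above.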
SAX
// Application (SAX handler excerpt)
public void startDocument (){
    System.out.println("Start document");
}
public void startElement (…){
    …
}
public void characters (…){
    …
}
public void endElement (…){
    …
}
public void endDocument (){
    System.out.println("End document");
}

The SAX parser reads the document and calls the handler methods:

<?xml version="1.0"?>
<bookstock>
Text
</bookstock>

• The Simple API for XML (SAX) is an event-based mechanism for XML processing
• Principle:
1. A program that, for example, is intended to extract information from an XML file registers event handlers at a SAX parser
2. The parser reads the XML document sequentially from the beginning to the end
3. Every time the parser reads an element, attribute, special character etc., it generates an event that is delegated to the associated event handler, which then may extract the wanted information

slide 20
SAX example

Input document

<?xml version="1.0" encoding="UTF-8"?>
<bookstock>
<book id="42">
<title>TCP/IP Illustrated</title>
<author>W.R. Stevens</author>
<isbn>0201633469</isbn>
</book>
<book id="43">
<title>Mobile Web Services</title>
<author>F. Hirsch</author>
<author>J. Kemp</author>
<isbn>0470015969</isbn>
</book>
</bookstock>

Event handler output (produced by SAX)

Start document
Start element: bookstock
Characters: "\n"
Start element: book(Attributes:id=42;)
Characters: "\n"
Start element: title
Characters: "TCP/IP Illustrated"
End element: title
Characters: "\n"
Start element: author
Characters: "W.R. Stevens"
End element: author
Characters: "\n"
Start element: isbn
Characters: "0201633469"
End element: isbn
Characters: "\n"
End element: book
Characters: "\n"
...
End element: bookstock
End document
slide 21
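The event sequence shown above can be reproduced with a small handler. This is a sketch using Python's standard `xml.sax` module rather than the Java API of the lecture; the class name `LoggingHandler` and the shortened input document are assumptions for this illustration. Note that a SAX parser is free to deliver character data in several chunks, so the exact number of "Characters" events may differ between parsers.

```python
import xml.sax

# Shortened input document (first book only).
XML_DOC = b"""<?xml version="1.0" encoding="UTF-8"?>
<bookstock>
<book id="42">
<title>TCP/IP Illustrated</title>
<author>W.R. Stevens</author>
<isbn>0201633469</isbn>
</book>
</bookstock>"""


class LoggingHandler(xml.sax.ContentHandler):
    """Event handler that records one line per SAX event."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startDocument(self):
        self.events.append("Start document")

    def startElement(self, name, attrs):
        attributes = "".join(f"{key}={value};" for key, value in attrs.items())
        suffix = f"(Attributes:{attributes})" if attributes else ""
        self.events.append(f"Start element: {name}{suffix}")

    def characters(self, content):
        self.events.append(f"Characters: {content!r}")

    def endElement(self, name):
        self.events.append(f"End element: {name}")

    def endDocument(self):
        self.events.append("End document")


handler = LoggingHandler()
xml.sax.parseString(XML_DOC, handler)  # the parser reads sequentially and fires events
print("\n".join(handler.events))
```

Unlike the DOM approach, no tree is kept in memory; the handler sees each event once and must store whatever it wants to keep.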
Selected alternatives to XML

• Though XML is used intensively, XML documents have a certain overhead and contain redundant information
• Alternatives applied in Web applications include:
– JSON (JavaScript Object Notation, discussed in chapter 3)
– YAML (YAML Ain’t Markup Language)
  • Compact, text-based and human-readable data serialization format
  • Superset of JSON (as of YAML 1.2)
  • Offers lists, associative arrays and scalars for structuring data
  • Enables cross-references between nodes
  • Hierarchies are realized by indentation

E.g. YAML list in block format

lectures:
  - 'Internet and Web Applications'
  - 'Distributed Systems'
  - 'Mobile Communication and Computing'
slide 22
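The overhead argument can be made concrete by serializing the same list in two formats. The sketch below uses only Python's standard library, so it compares XML with JSON instead of YAML; the XML element name `lecture` is an assumption made for this illustration.

```python
import json
import xml.etree.ElementTree as ET

lectures = [
    "Internet and Web Applications",
    "Distributed Systems",
    "Mobile Communication and Computing",
]

# JSON: one key and a list of strings.
as_json = json.dumps({"lectures": lectures})

# XML: every list entry needs its own start tag and end tag.
root = ET.Element("lectures")
for name in lectures:
    ET.SubElement(root, "lecture").text = name
as_xml = ET.tostring(root, encoding="unicode")

print(len(as_json), as_json)
print(len(as_xml), as_xml)
```

Both strings carry the same three values; the XML form is longer because each value is wrapped in a repeated pair of tags.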
Overview of interaction on the Web
• The Web is based on a classical client-server architecture
• The client fetches documents (addressed by a Uniform Resource Identifier) from the server using the Hypertext Transfer Protocol (HTTP)
• Documents may have different types of content (text, images, …)
• The content of one document may consist of different sections that are interpreted by the client

1. The client (web browser) requests a document from the web server
2. The client interprets the returned XHTML document, which consists of:
– Displayed data (text)
– Layout and style (CSS)
– Program logic (JavaScript)

slide 23
Content type

• In order to find an appropriate interpreter program for received content on the client side, the content’s meta information specifies a content type (sometimes referred to as “media type” or “MIME type”)
• Format:
Content-Type ":" type "/" subtype
• Examples:
– Content-Type: application/xml (for general XML content)
– Content-Type: video/mpeg (for MPEG encoded content)
– Content-Type: image/jpeg (for JPEG encoded content)

Message received by a client

Header part of the received message:
Content-Type: application/xml
Content-Length: 136713

Followed by the content.

slide 24
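How a client splits such a header into type and subtype can be sketched with Python's standard `email` package, which parses the same header syntax. The message below reuses the header values from the slide; the one-element body is a placeholder assumption and does not match the stated content length.

```python
from email import message_from_string

# Header part, an empty line, then the content (placeholder body).
RAW_MESSAGE = (
    "Content-Type: application/xml\r\n"
    "Content-Length: 136713\r\n"
    "\r\n"
    "<bookstock/>"
)

message = message_from_string(RAW_MESSAGE)

media_type = message.get_content_type()      # "type/subtype", lower-cased
main_type = message.get_content_maintype()   # part before the slash
subtype = message.get_content_subtype()      # part after the slash
length = int(message["Content-Length"])      # header values arrive as strings

print(media_type, main_type, subtype, length)
```

The empty line separates the header part from the content, which is why the parser does not treat `<bookstock/>` as a header field.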
Content type

• The client manages a data structure that maps known content types to responsible applications
• After the client has received a document, it analyses the document’s content type information
• If the client is not responsible for the type of content, it looks up an appropriate application and initiates handling of the document by this application

1. The client (web browser) requests a document from the server
2. The server answers with a message consisting of a header (Content-Type: text/plain, …) and the content
3. The client looks up the content type in its table (type → application)

// Client’s code fragment
if(notResponsible){
    document = response;
    format = response.getType();
    program = database.lookup(format);
    application = start(program);
    application.delegateTo(document);
}

slide 25
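The lookup step in the client's code fragment can be sketched as a plain dictionary lookup. Everything in the following block is hypothetical: the table entries, the set of types the browser handles itself and the `save-to-disk` fallback are assumptions chosen for illustration, not the behaviour of any particular browser.

```python
# Hypothetical table mapping content types to external applications.
APPLICATIONS = {
    "application/pdf": "pdf-viewer",
    "video/mpeg": "video-player",
    "image/jpeg": "image-viewer",
}

# Content types the (hypothetical) browser interprets itself.
BROWSER_TYPES = {"text/html", "application/xhtml+xml", "text/plain"}


def dispatch(content_type, document):
    """Return (name of the responsible handler, document) for a received document."""
    # Drop parameters such as "; charset=UTF-8" and normalise the case.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type in BROWSER_TYPES:
        return "browser", document
    program = APPLICATIONS.get(media_type)
    if program is None:
        return "save-to-disk", document  # assumed fallback for unknown types
    return program, document


print(dispatch("text/html; charset=UTF-8", "<html/>")[0])
print(dispatch("video/mpeg", b"...")[0])
print(dispatch("application/x-unknown", b"...")[0])
```

A real client would start the looked-up program and hand the document over to it; the sketch only returns the handler name so that the decision itself stays visible.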
JavaScript

• JavaScript is a widespread object-based scripting language (standardised under the name ‘ECMAScript’) that is often used for client-side execution
• May be embedded into (X)HTML documents or linked from them
• After the web browser has identified code segments, it forwards them to the JavaScript engine for execution
• The result of the execution might influence the original document’s content or trigger an additional reaction (e.g. open a new browser window)

<html>
  <head>
    <script language="JavaScript" type="text/javascript">
      function hello() { alert("Hello!"); }
    </script>
  </head>
  <body onLoad="hello()">
    This is a simple JavaScript example.
  </body>
</html>

• The script element contains the definition of the function “hello”
• When the page is loaded, the function “hello” is called

[Figure: JavaScript engine result]


slide 26
JavaScript

• An important security concept is the sandbox principle, which by default permits script logic running inside the JavaScript engine to access only objects inside the browser
– The JavaScript engine runs inside the web browser
– Scripts get no direct access to the file system or to network resources
• JavaScript is often used in combination with DOM, leading to a dynamically changeable structure of an XHTML document
– JavaScript code works on the DOM view of an XHTML document through the DOM API

slide 27
CSS

• The Cascading Style Sheets (CSS) language is a client-side language for defining the layout of markup language elements
• Simplifies the uniform design of web pages and its maintenance
• A CSS description may be embedded directly into an (X)HTML file or linked from it (e.g. <link rel="stylesheet" href="main.css">)
• If style definitions are given as an external file, the layout of all associated (X)HTML pages can be changed by editing this single file (one CSS file serves several (X)HTML pages)
• The CSS file contains a number of selectors (e.g. (X)HTML tags or self-defined names) whose properties are defined
• These selectors refer to elements of (X)HTML files (e.g. to a tag with a specific name or attribute)

Selector
{
    property1: valuesX;
    property2: valuesY;
}

slide 28
CSS
main.css

input{ font-size: 110%;
       color: red;
       background: #ffff00; }
.top{ padding-top: 20px; }
p{ font-size: 16pt;
   font-family: "Garamond", serif;
   font-weight: bold; }
#fat { font-weight: bold; }

index.html

<head>…
  <link rel="stylesheet" href="main.css">
</head>
<body>
  <div class="top">
    <img src="/banner.jpg"/>
  </div>
  <p class="top">
    Welcome to the Samoa Webmailer Service!
  </p>
  <div id="fat">
    Please enter mail address and password.</div>
  <form method="post" action="login">
    <input type="text" name="id"/><br/>
    <input type="password" name="password"/>
    <br/><input type="submit" value="login"/>
  </form>
</body>

Rendered result:
• Banner image: 20 pixel padding at the top
• Welcome paragraph: 20 pixel padding at the top; bold; Garamond as font
• Text “Please enter mail address and password.”: bold
• Input fields: yellow (#ffff00) background; red font

slide 29
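How the selectors of main.css pick elements of index.html can be sketched in a few lines. The following is a strongly simplified model, not a CSS engine: it only knows tag, class (`.name`) and id (`#name`) selectors, and it merges matching rules in source order without the specificity rules of the real cascade. The function names `matches` and `computed_style` are assumptions for this illustration.

```python
# Rules of main.css as a mapping from selector to properties.
RULES = {
    "input": {"font-size": "110%", "color": "red", "background": "#ffff00"},
    ".top": {"padding-top": "20px"},
    "p": {"font-size": "16pt", "font-family": '"Garamond", serif',
          "font-weight": "bold"},
    "#fat": {"font-weight": "bold"},
}


def matches(selector, tag, classes=(), element_id=None):
    """Check a simple selector against one element."""
    if selector.startswith("."):      # class selector
        return selector[1:] in classes
    if selector.startswith("#"):      # id selector
        return selector[1:] == element_id
    return selector == tag            # tag selector


def computed_style(tag, classes=(), element_id=None):
    """Merge the properties of all matching rules (source order, no specificity)."""
    style = {}
    for selector, properties in RULES.items():
        if matches(selector, tag, classes, element_id):
            style.update(properties)
    return style


print(computed_style("p", classes=("top",)))     # <p class="top">
print(computed_style("div", element_id="fat"))   # <div id="fat">
print(computed_style("input"))                   # every <input> element
```

The paragraph with class "top" receives properties from two rules, `.top` and `p`, which corresponds to the padding and the bold Garamond text in the rendered result.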
References

Links at World Wide Web Consortium (W3C):

XML home: http://www.w3.org/XML
XML Schema home: http://www.w3.org/XML/Schema
XHTML 1.0: http://www.w3.org/TR/xhtml1/
XHTML 1.1: http://www.w3.org/TR/xhtml11/
HTML 5: http://www.w3.org/TR/2014/REC-html5-20141028/
DOM home: http://www.w3.org/DOM/

Further Links

YAML specification: http://www.yaml.org/spec/1.2/spec.html
SAX home: http://www.saxproject.org/
JavaScript at mozilla.org: https://developer.mozilla.org/en/About_JavaScript

Article about Web 2.0

http://oreilly.com/web2/archive/what-is-web-20.html
slide 30
