4020 Week 1
Systems
ITEC 4020
Instructor: Jimmy Huang
[email protected]
https://yorku.zoom.us/j/94310213415?pwd=UC9MeVZvdTAvT29XM1l3aGJRVjAwdz09
Meeting ID: 943 1021 3415 Passcode: 956408
http://www.yorku.ca/jhuang/4020A.html
Motivation
◆ Web-based Knowledge & Data Management
➢ A huge amount of Web data
➢ How to organize and retrieve it, how to discover interesting
patterns, and how to make recommendations from it
Web Search Engine
Uber Taxi and Didi Chuxing
Amazon, Alibaba, Tencent, JD.com
Web Blog Analysis
Spam Email Detection
Online Electronic Medical Data Analysis
Electronic Health Care and eHealth
Social Network Analysis
Amazon Business Model
Examples of Web Search
Engines
Examples of Web
Some Internet Related Research
Projects for Students
◆ “Searching and Analyzing Big Data: Context-sensitive and
Task-aware Approaches”. This project is supported by
NSERC Individual Discovery Grant (2020 - 2026)
◆ “Analyzing and Searching Medical Data for Cost Effective
Health Care”. This project is supported by Early Researcher
Award/Premier’s Research Excellence Award
◆ “Finding Best Evidence for Evidence-based Best Practice
Recommendations in Health Care”. This project is supported
by NSERC Collaborative R&D Grant
◆ Other research projects will also be available, such as IBM
SUR and OCE projects.
◆ Advanced Data Mining and Machine Learning Technologies
for Next Generation eHealth Decision Support Systems
Course Objectives
◆ This course will cover both programming aspects of
internet applications and advanced topics of Web
technology, such as information retrieval, Web search
and Web mining.
Week 1
Outline
◆ The Internet
◆ The Web
◆ What makes the Web work?
➢ HTTP
➢ URL
➢ HTML
➢ CGI
◆ Example of a Web page
◆ Summary
The Internet
[Figure: a message travelling across the Internet from a sender ("From") to a recipient ("To") at IP address 123.21.12.131]
Historical View: Internet
◆ 1969 - Telnet
◆ 1970 - 4 computers
➢ Stanford, UCLA, UC Santa Barbara, U Utah
◆ 1971 - FTP
◆ 1983 - 562 computers on the internet
◆ 1993 - 1.2 million computers on the internet
◆ 1999 - ssh, sftp, ...
◆ 2010 - Amazon, Alibaba, ...
◆ 2020 - Smart devices, ...
The Web
◆ World-Wide Web (Web, WWW)
➢ networked information system that provides a simple
way of browsing different types (text, pictures, video,
audio, etc.) of information on the Internet using
hyperlinks.
◆ Web pages
➢ electronic documents that typically contain several
types of information accessible via the World Wide Web
◆ Web sites
➢ a collection of related Web pages of a certain individual,
group, or organization.
◆ The Web uses a client/server model
Client-Server Model
◆ Browser: software to interact with internet data at the client
◆ Server: machine that services internet requests
◆ Request cycle
➢ Browser to Server: Request File
➢ Server to Browser: Send File
➢ Browser: Display File
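The request/send/display cycle can be sketched in a few lines of Python. This is only an illustration, not part of the course material: the handler name `FileHandler`, the helper `fetch_once`, and the page content are assumptions, and the "browser" here is just a bare HTTP client that prints what it receives.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical page content standing in for a file hosted by the server.
PAGE = b"<html><body>Hello from the server</body></html>"


class FileHandler(BaseHTTPRequestHandler):
    """Server side: answer every GET with the same small HTML document."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)  # "Send File"

    def log_message(self, *args):
        pass  # keep the demo output quiet


def fetch_once():
    """Start a local server, act as the client once, return (status, body)."""
    server = HTTPServer(("127.0.0.1", 0), FileHandler)  # port 0: any free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", "/index.html")  # "Request File"
        response = conn.getresponse()
        body = response.read()
        conn.close()
        return response.status, body
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    status, body = fetch_once()
    print(status, body.decode())  # "Display File" (a real browser would render it)
```

Both roles run in one process here only to keep the sketch self-contained; on the Web the client and server are normally separate machines.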
What is a Web Server?
Web server
◆ computer running application software that listens and
responds to a client computer’s request made through a
web browser
◆ machine that hosts web pages and other web
documents
◆ provides web documents and other online services
using HTTP
What is a Web Browser?
Web browser
◆ application software that is used to locate and issue a
request for the page on the web server that hosts the
document
◆ It also interprets the page sent back by the web server
and displays it on the monitor of the client computer
◆ computer program that lets you view and explore
information on the World Wide Web
Web Browsers
Note: Not all URLs will have the directory and filename
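To make the note concrete, here is a small sketch that splits a URL into its parts with Python's standard `urllib.parse`. It assumes the usual protocol://host/directory/filename layout; the helper name `url_parts` and the dictionary keys are illustrative choices, not terminology from the slides.

```python
import posixpath
from urllib.parse import urlsplit


def url_parts(url):
    """Split a URL into protocol, host, directory, and filename."""
    parts = urlsplit(url)
    # The path may be empty, in which case directory and filename are empty too.
    directory, filename = posixpath.split(parts.path)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "directory": directory,
        "filename": filename,
    }


if __name__ == "__main__":
    print(url_parts("http://www.yorku.ca/jhuang/4020A.html"))
    print(url_parts("http://www.yorku.ca"))  # no directory, no filename
```

For the second URL the server decides which document to return, typically a default page for the site.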
HyperText Markup Language (HTML)
◆ Hypertext
➢ presents and relates information as hyperlinked
documents that point to other documents or resources.
◆ HTML
➢ A standard markup language that defines a hypertext
document.
➢ A simple, powerful, platform-independent document
language.
➢ Specifies how content should be displayed
➢ Browser interprets HTML
➢ Same HTML file often looks different across browsers
➢ HTML files are the source files of Web pages
HTML File Structure
<HTML>
<HEAD>
<TITLE>Page Title</TITLE>
</HEAD>
<BODY>
Stuff
</BODY>
</HTML>
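As a rough illustration of how a program (for example a browser) reads this structure, the sketch below feeds the skeleton above to Python's standard `html.parser` and records the tags it meets and the page title. The class name `TitleFinder` and the helper `outline` are assumptions made for this example.

```python
from html.parser import HTMLParser

# The skeleton document from the slide.
DOC = """<HTML>
<HEAD>
<TITLE>Page Title</TITLE>
</HEAD>
<BODY>
Stuff
</BODY>
</HTML>"""


class TitleFinder(HTMLParser):
    """Collect opening tags in document order and the text inside <TITLE>."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)  # the parser reports tag names in lowercase
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def outline(doc):
    """Return (title, list of opening tags) for an HTML document."""
    finder = TitleFinder()
    finder.feed(doc)
    finder.close()
    return finder.title, finder.tags


if __name__ == "__main__":
    print(outline(DOC))
```

Tag names are case-insensitive in HTML, which is why the uppercase tags on the slide come back in lowercase.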
What About Graphics?
◆ An HTML file can refer to an image file
<h2>Teaching</h2>
<p><a href="http://ai.uwaterloo.ca/3421.html">
COSC 3421 Fall 2002</a></p>
<p><a href="http://ai.uwaterloo.ca/3221.html">
COSC 3221 Winter 2003</a></p>
Simple Formatting
<H1><FONT COLOR="#b80000">
Heading level 1</FONT></H1>
<H2><FONT COLOR="#ff0000">
Heading level 2 </FONT> </H2>
<P>Paragraph with <B>bold</B> and
<I>italic</I> text.</P>
<HR>
Creating HTML Files
[Figure: a Web client (browser) retrieves web pages over the Internet using the HTTP protocol from a Web server that hosts them. A Web authoring system, with a scanner, video capture, and sound card as inputs, creates web pages and publishes them to the server.]
Web page: a document written in HTML, JSP, or ASP.
Internet Client-Server Systems
Internet Banking
WeChat Business Model
Amazon Business Model
Static and Dynamic Web Pages
Common Gateway Interface (CGI)
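CGI-based Web Application
◆ Web Browser to Web Server: HTTP Request
◆ CGI Scripts/Applications (on the Web Server side) to Database: Get Data
◆ Database to CGI Scripts/Applications: Return data
◆ Web Server to Web Browser: HTTP Document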
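The flow above can be sketched as a CGI-style script in Python. Everything specific here is an assumption made for illustration: the in-memory SQLite table, the `code` query parameter, and the function names `make_database` and `handle_request`. The CGI conventions themselves are real: the web server passes the query string in the `QUERY_STRING` environment variable, and the script writes a header block, a blank line, and then the document to standard output.

```python
import os
import sqlite3
from html import escape
from urllib.parse import parse_qs


def make_database():
    """Hypothetical in-memory table standing in for the Database box."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE courses (code TEXT PRIMARY KEY, title TEXT)")
    # Illustrative row only; a real application would use a persistent database.
    db.execute(
        "INSERT INTO courses VALUES ('4020', 'Internet Client-Server Systems')"
    )
    return db


def handle_request(environ, db):
    """Build the CGI response (header, blank line, HTML) for one request."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    code = query.get("code", [""])[0]
    # "Get Data" / "Return data": parameterized query against the database.
    row = db.execute(
        "SELECT title FROM courses WHERE code = ?", (code,)
    ).fetchone()
    title = row[0] if row else "unknown course"
    # Escape user input before placing it in the page.
    body = f"<html><body><p>{escape(code)}: {escape(title)}</p></body></html>"
    return "Content-Type: text/html\n\n" + body


if __name__ == "__main__":
    # When run by a web server as a CGI script, os.environ carries the request.
    print(handle_request(os.environ, make_database()), end="")
```

Keeping the logic in `handle_request` rather than at module level is a design choice for this sketch: it lets the same code be exercised with a plain dictionary instead of a live web server.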
How a Web Page Works
[Figure: annotated web page showing the URL, navigational tools, graphics, and hyperlinks]
Cookies
◆ A piece of information generated by the Web server
and stored on the client side, ready for future access.
◆ Cookies can make CGI scripts more interactive.
◆ Cookies are text files stored on the Web client.
◆ A CGI script creates a cookie and has the Web server send
it to the client's browser, which stores it on the hard disk.
◆ Later, when the client revisits the Web site and uses a CGI
script that requests this cookie, the client's browser
sends the information stored in the cookie.
Cookies
◆ How do cookies work?
➢ Request: Client to Origin Server A
➢ Response: Origin Server A to Client, with header Set-Cookie: XYZ
➢ Request: Client to Origin Server A, with header Cookie: XYZ
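The Set-Cookie / Cookie exchange can be traced with Python's standard `http.cookies` module. This is a minimal sketch with no network involved: the cookie name `visitor` and the four helper functions are hypothetical, and the client's "cookie jar" is simply a `SimpleCookie` object rather than a file on disk.

```python
from http.cookies import SimpleCookie


def server_set_cookie(name, value):
    """Server side: build the Set-Cookie response header line."""
    cookie = SimpleCookie()
    cookie[name] = value
    return cookie.output()  # e.g. "Set-Cookie: visitor=XYZ"


def client_store(set_cookie_line, jar):
    """Client side: remember the cookie from a Set-Cookie header line."""
    header, _, value = set_cookie_line.partition(":")
    if header.strip().lower() == "set-cookie":
        jar.load(value.strip())


def client_cookie_header(jar):
    """Client side: build the Cookie request header for a later visit."""
    pairs = "; ".join(f"{m.key}={m.value}" for m in jar.values())
    return f"Cookie: {pairs}"


def server_read_cookie(cookie_line, name):
    """Server side: read one cookie value back from the Cookie header line."""
    received = SimpleCookie()
    received.load(cookie_line.partition(":")[2].strip())
    return received[name].value if name in received else None


if __name__ == "__main__":
    jar = SimpleCookie()                             # the browser's stored cookies
    response_line = server_set_cookie("visitor", "XYZ")
    client_store(response_line, jar)                 # first response
    request_line = client_cookie_header(jar)         # next request
    print(response_line)
    print(request_line)
    print(server_read_cookie(request_line, "visitor"))
```

The server never reads the client's disk; it only sees what the browser chooses to send back in the Cookie header.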