Unit 12
What is URI?
A URI (Uniform Resource Identifier) is a string of characters that identifies a resource on
the internet by its location, its name, or both.
A URI has two subsets: URL (Uniform Resource Locator) and URN (Uniform Resource
Name). A URI that contains only a name is a URN, not a URL. In practice, we mostly
encounter URLs and URNs rather than the general term URI.
A URI contains a scheme, an authority, a path, a query, and a fragment. Some of the most
common URI schemes are http, https, ftp, ldap, and telnet.
Syntax of URI
scheme:[//authority]path[?query][#fragment]
o Scheme: The first component of a URI is the scheme, a sequence of characters that
begins with a letter and may contain letters, digits, the plus sign (+), the period (.), or
the hyphen (-). It is followed by a colon (:). Popular schemes are http, file, ftp, data,
and irc. Schemes should be registered with IANA.
o Authority: The authority component is optional and is preceded by two slashes (//).
It contains three sub-components:
    o userinfo: It may contain a username and an optional password separated by
    a colon. This sub-component is followed by the @ symbol.
    o host: It contains either a registered name or an IP address. An IPv6 address
    must be enclosed in square brackets ([]).
    o port: An optional port number, separated from the host by a colon.
o Path: It consists of a sequence of path segments separated by slashes (/). A URI
always has a path, although the path may be empty (zero length).
o Query: It is an optional component, which is preceded by a question mark(?). It
contains a query string of non-hierarchical data.
o Fragment: It is also an optional component, preceded by a hash(#) symbol. It
consists of a fragment identifier that provides direction to a secondary resource.
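The components listed above can be seen by splitting a URI with Python's standard urllib.parse module. This is a minimal sketch; the example.com URI, the user name, and the password are made-up values chosen only so that every component is present.

```python
from urllib.parse import urlsplit

# Hypothetical URI that exercises every component of the generic syntax.
uri = "https://user:secret@example.com:8443/docs/guide.html?lang=en&page=2#install"
parts = urlsplit(uri)

print("scheme   :", parts.scheme)
print("userinfo :", parts.username, parts.password)
print("host     :", parts.hostname)
print("port     :", parts.port)
print("path     :", parts.path)
print("query    :", parts.query)
print("fragment :", parts.fragment)

# A URN has a scheme and a path but no authority component.
urn = urlsplit("urn:isbn:0-476-27557-4")
print("urn scheme:", urn.scheme, "| authority:", repr(urn.netloc), "| path:", urn.path)
```

Note that the URN has an empty authority: everything after the scheme is treated as the path.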
What is a URL?
o A URL (Uniform Resource Locator) is used to find the location of a resource on
the web. It is both a reference to a resource and a way to access that resource. A URL
always points to a unique resource, which can be an HTML page, a CSS document, an
image, etc.
o A URL names the protocol used to access the resource, which can be HTTP, HTTPS, FTP,
etc.
o It is commonly thought of as the address of a website, which users see in the
browser's address bar. An example of a URL is given below:
https://google.com
Syntax of URL
Every HTTP URL follows the generic URI syntax, so the syntax of a URL is the same as
the syntax of a URI. It is given below:
scheme:[//authority]path[?query][#fragment]
o Scheme: The first component of a URL is the scheme, which names the protocol that a
browser must use to request the resource. The protocols commonly used for
websites are HTTP and HTTPS.
o Authority: The authority includes two sub-components, the domain name and the port,
separated by a colon. The domain name is the registered name of the resource, such
as javatpoint.com, and the port is the technical gate used to access the resource
on a web server. Port 80 is the default for HTTP and port 443 is the default for HTTPS.
o Path: The path indicates the complete path to the resource on the web server, for
example /software/htp/index.html.
o Query String: It is a string that contains name and value pairs. If it is used in a
URL, it follows the path component and supplies additional information, such as
"?key1=value1&key2=value2".
o Fragment: It is also an optional component, preceded by a hash(#) symbol. It
consists of a fragment identifier that provides direction to a secondary resource.
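As a small illustration of the URL components above, the sketch below fills in the default port when the URL does not name one and decodes the query string into name and value pairs. The describe helper and the DEFAULT_PORTS table are hypothetical names introduced for this example, and the "#top" fragment is invented.

```python
from urllib.parse import urlsplit, parse_qs

# Default ports named in the text; an illustrative table, not a full registry.
DEFAULT_PORTS = {"http": 80, "https": 443}

def describe(url):
    """Split a URL and return its components as a dictionary."""
    parts = urlsplit(url)
    # Use the explicit port if present, otherwise the scheme's default.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": port,
        "path": parts.path,
        "query": parse_qs(parts.query),  # each name maps to a list of values
        "fragment": parts.fragment,
    }

info = describe(
    "https://javatpoint.com/software/htp/index.html?key1=value1&key2=value2#top"
)
for name, value in info.items():
    print(f"{name:9}: {value}")
```

parse_qs returns a list for every name because a query string may repeat the same name several times.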
Difference between URI and URL
o A URI covers both URLs and URNs and identifies a resource by name, by location, or by
both; in contrast, a URL is a subset of URI and identifies only the location of the
resource.
o An example of a URI is urn:isbn:0-476-27557-4, whereas an example of a URL
is https://google.com.
o A URI can be used to identify resources in HTML, XML, and other files, whereas
a URL is used to locate a resource such as a web page.
o Every URL is a URI, but not every URI is a URL.
URI | URL
URI is an acronym for Uniform Resource Identifier. | URL is an acronym for Uniform Resource Locator.
URI contains two subsets: URN, which tells the name, and URL, which tells the location. | URL is the subset of URI that tells only the location of the resource.
Not all URIs are URLs, as a URI can tell either the name or the location. | All URLs are URIs, as every URL contains only the location.
A URI aims to identify a resource and differentiate it from other resources by using the name of the resource or the location of the resource. | A URL aims to find the location or address of a resource on the web.
The URI scheme can be a protocol, designation, specification, or anything else. | The scheme of a URL is usually a protocol such as HTTP, HTTPS, FTP, etc.
What is UDP?
UDP stands for User Datagram Protocol. It works similarly to TCP in that it is also
used for sending and receiving messages. The main difference is that UDP is a
connectionless protocol. Here, connectionless means that no connection is established prior
to communication. UDP also does not guarantee the delivery of data packets. It sends
the data and does not check whether the data has been received at the receiver's end, so
it is also known as the "fire-and-forget" protocol. UDP is faster than TCP because it
does not provide assurance for the delivery of packets.
o Type of protocol
Both protocols, TCP and UDP, are transport layer protocols. TCP is a
connection-oriented protocol, whereas UDP is a connectionless protocol. This means
that TCP requires a connection to be set up before communication, whereas UDP does not
require any connection.
o Reliability
TCP is a reliable protocol as it provides assurance for the delivery of the data. It
follows the acknowledgment mechanism: the receiver acknowledges the data it has
received, and the sender checks for that acknowledgment (ACK). If the ACK arrives, the
data has been received successfully. If no ACK arrives in time, TCP resends the data.
TCP also follows flow and error control mechanisms.
UDP is an unreliable protocol as it does not ensure the delivery of the data.
o Flow Control
TCP follows the flow control mechanism, which ensures that a large number of packets are
not sent to the receiver at the same time, while UDP does not follow any flow control
mechanism.
o Ordering
TCP uses ordering and sequencing techniques to ensure that the data packets are
received in the same order in which they were sent. On the other hand, UDP does not
follow any ordering and sequencing technique; i.e., data may arrive in any sequence.
o Speed
TCP establishes a connection between the sender and receiver, performs error
checking, and guarantees the delivery of data packets. UDP neither creates
a connection nor guarantees the delivery of data packets, so UDP is faster than
TCP.
o Flow of data
In TCP, data can flow in both directions, which means that it provides a full-duplex
service. On the other hand, UDP is mainly suitable for the unidirectional flow of data.
Let's look at the differences between the TCP and UDP in a tabular form.
Feature | TCP | UDP
Full form | It stands for Transmission Control Protocol. | It stands for User Datagram Protocol.
Speed | TCP is slower than UDP as it performs error checking, flow control, and provides assurance for the delivery of data packets. | UDP is faster than TCP as it does not guarantee the delivery of data packets.
Header size | The TCP header is at least 20 bytes. | The UDP header is 8 bytes.
Acknowledgment | TCP uses the three-way-handshake concept. In this concept, if the sender receives the ACK, then the sender will send the data. TCP also has the ability to resend the lost data. | UDP does not wait for any acknowledgment; it just sends the data.
Flow control mechanism | It follows the flow control mechanism in which too many packets cannot be sent to the receiver at the same time. | This protocol follows no such mechanism.
Error checking | TCP performs error checking by using a checksum. When the data is corrupted, it is retransmitted to the receiver. | It provides only an optional checksum and does not resend the lost data packets.
Applications | This protocol is mainly used where a secure and reliable communication process is required, like military services, web browsing, and e-mail. | This protocol is used where fast communication is required and reliability is less important, like VoIP, game streaming, video and music streaming, etc.
The World Wide Web (WWW) application exists through web browsers accessing the
content available on web servers. Although it is often thought of as an end-user application,
you can actually use WWW to manage a router or switch. You enable a web server
function in the router or switch and use a browser to access the router or switch.
The Domain Name System (DNS) allows users to use names to refer to computers, with
DNS being used to find the corresponding IP addresses. DNS also uses a client/server
model, with DNS servers being controlled by networking personnel and DNS client
functions being part of almost any device that uses TCP/IP today. The client simply asks
the DNS server to supply the IP address that corresponds to a given name.
Simple Network Management Protocol (SNMP) is an application layer protocol used
specifically for network device management. For instance, Cisco supplies a large variety
of network management products, many of them in the Cisco Prime network management
software product family. They can be used to query, compile, store, and display
information about a network's operation. To query the network devices, Cisco Prime
software mainly uses SNMP protocols.
Traditionally, to move files to and from a router or switch, Cisco used Trivial File
Transfer Protocol (TFTP). TFTP defines a protocol for basic file transfer, hence the word
trivial. Alternatively, routers and switches can use File Transfer Protocol (FTP), which
is a much more functional protocol, to transfer files. Both work well for moving files
into and out of Cisco devices. FTP allows many more features, making it a good choice for
the general end-user population. TFTP client and server applications are very simple,
making them good tools as embedded parts of networking devices.
Some of these applications use TCP, and some use UDP. For instance, Simple Mail
Transfer Protocol (SMTP) and Post Office Protocol version 3 (POP3), both used for
transferring mail, require guaranteed delivery, so they use TCP. Regardless of which
transport layer protocol is used, applications use a well-known port number so
that clients know which port to attempt to connect to. Table 5-3 lists several popular
applications and their well-known port numbers.
TCP establishes and terminates connections between the endpoints, whereas UDP does not.
Many protocols operate under these same concepts, so the terms connection-oriented
and connectionless are used to refer to the general idea of each. More formally, these
terms can be defined as follows:
Connection-oriented protocol: A protocol that requires an exchange of messages
before data transfer begins, or that has a required pre-established correlation
between two endpoints.
Connectionless protocol: A protocol that does not require an exchange of messages
and that does not require a pre-established correlation between two endpoints.
UDP is an alternative to Transmission Control Protocol (TCP). Both UDP and TCP run on
top of IP and are sometimes referred to as UDP/IP or TCP/IP. However, there are
important differences between the two. For example, UDP enables process-to-process
communication, while TCP supports host-to-host communication.
TCP sends individual packets and is considered a reliable transport medium. On the other
hand, UDP sends messages, called datagrams, and is considered a best-effort mode of
communications. This means UDP doesn't provide any guarantees that the data will be
delivered or offer special features to retransmit lost or corrupted messages.
UDP provides two services not provided by the IP layer. It provides port numbers to help
distinguish different user requests. It also provides an optional checksum capability to
verify that the data arrived intact.
It allows packets to be dropped and received in a different order than they were
transmitted, making it suitable for real-time applications where latency might be a
concern.
It can be used for transaction-based protocols, such as DNS or Network Time Protocol
(NTP).
It can be used where a large number of clients are connected and where real-time error
correction isn't necessary, such as gaming, voice or video conferencing, and streaming
media.
UDP header composition
UDP uses headers when packaging message data to transfer over network connections.
UDP headers contain a set of parameters -- called fields -- defined by the technical
specifications of the protocol.
The User Datagram Protocol header has four fields, each of which is 2 bytes. They are the
following:
o Source port number
o Destination port number
o Length
o Checksum
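To make the 8-byte header concrete, the sketch below packs and unpacks the four 2-byte fields with Python's struct module. It assumes the standard field order of source port, destination port, length, and checksum; the helper names and port numbers are illustrative, and the checksum is simply left at 0 rather than computed.

```python
import struct

# Four unsigned 16-bit fields in network (big-endian) byte order: 8 bytes total.
UDP_HEADER = struct.Struct("!HHHH")

def build_datagram(src_port, dst_port, payload, checksum=0):
    """Prefix the payload with a UDP header (checksum left at 0 in this sketch)."""
    length = UDP_HEADER.size + len(payload)  # the length field covers header + data
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload

def parse_datagram(data):
    """Split raw datagram bytes into the header fields and the payload."""
    src_port, dst_port, length, checksum = UDP_HEADER.unpack_from(data)
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "length": length,
        "checksum": checksum,
        "payload": data[UDP_HEADER.size:length],
    }

datagram = build_datagram(3022, 53, b"hello")
fields = parse_datagram(datagram)
print("header size:", UDP_HEADER.size, "bytes")
print(fields)
```

In practice the operating system builds this header for you; the sketch only shows where the 8 bytes come from.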
Unlike TCP, UDP doesn't guarantee the packets will get to the right destinations. This
means UDP doesn't connect to the receiving computer directly, which TCP does. Rather, it
sends the data out and relies on the devices in between the sending and receiving
computers to correctly get the data where it's supposed to go.
Most applications wait for any replies they expect to receive as a result of packets sent
using UDP. If an application doesn't receive a reply within a certain time frame, the
application sends the packet again, or it stops trying.
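The wait-then-resend behaviour described above can be sketched with two UDP sockets on the loopback interface. This is an illustration under assumptions, not how any particular application does it: the function names, the 0.3 second timeout, the limit of three attempts, and the toy server that deliberately ignores the first request are all invented for the example.

```python
import socket
import threading

def request_with_retry(sock, server_addr, message, timeout=0.3, attempts=3):
    """Send a datagram and wait for a reply; resend on timeout, then give up."""
    sock.settimeout(timeout)
    for attempt in range(1, attempts + 1):
        sock.sendto(message, server_addr)
        try:
            reply, _ = sock.recvfrom(2048)
            return reply, attempt
        except socket.timeout:
            pass  # no reply in time: loop around and send again
    return None, attempts  # stop trying

def lossy_server(sock):
    """Toy server that ignores the first request to simulate a lost datagram."""
    sock.recvfrom(2048)               # first request is silently dropped
    data, addr = sock.recvfrom(2048)  # second request is answered
    sock.sendto(b"reply to " + data, addr)

server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
server.settimeout(5)
worker = threading.Thread(target=lossy_server, args=(server,), daemon=True)
worker.start()

client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
reply, attempts_used = request_with_retry(client, server.getsockname(), b"ping")
worker.join()
client.close()
server.close()
print(reply, "after", attempts_used, "attempt(s)")
```

UDP itself never resends anything; the retry loop lives entirely in the application.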
UDP uses a simple transmission model that doesn't include handshaking dialogues to
provide reliability, ordering or data integrity. Consequently, UDP's service is unreliable.
Packets may arrive out of order, appear to have duplicates or disappear without warning.
Although this transmission method doesn't guarantee that the data being sent will reach its
destination, it does have low overhead and is popular for services that don't absolutely
have to work the first time.
Applications of UDP
Lossless data transmission
UDP can be used in applications that require lossless data transmission. For example, an
application that is configured to manage the process of retransmitting lost packets and
correctly arrange received packets might use UDP. This approach can help to improve the
data transfer rate of large files compared to TCP.
In the Open Systems Interconnection (OSI) communication model, UDP is in Layer 4, the
transport layer. UDP works in conjunction with higher-level protocols to help manage data
transmission services, including Trivial File Transfer Protocol (TFTP), Real Time Streaming
Protocol (RTSP) and Simple Network Management Protocol (SNMP).
Gaming, voice and video
UDP is an ideal protocol for network applications in which perceived latency is critical,
such as in gaming, voice and video communications. These examples can suffer some data
loss without adversely affecting perceived quality. In some cases, however, forward error
correction techniques are used in addition to UDP to improve audio and video quality,
despite some loss.
Services that don't need fixed packet transmission
UDP can also be used for applications that depend on the reliable exchange of information
but have their own methods for acknowledging packets. These services are advantageous
because they're not bound to fixed patterns to guarantee the completeness and correctness
of the data packets sent. Users can decide how and when to respond to information that's
incorrect or out of order.
Multicasting and routing update protocols
UDP can also be used for multicasting because it supports packet switching. In addition,
UDP is used for some routing update protocols, such as Routing Information Protocol (RIP).
Fast applications
UDP can be used in applications where speed rather than reliability is critical. For instance, it
might be prudent to use UDP in an application sending data from a fast acquisition where it's
OK to lose some data points.
UDP characteristics include the following:
It is a connectionless protocol.
It is used for VoIP, video streaming, gaming and live broadcasts.
It is faster and needs fewer resources.
The packets don't necessarily arrive in order.
It allows missing packets -- the sender is unable to know whether a packet has been
received.
It is better suited for applications that need fast, efficient transmission, such as games.
Port | Service | Description | Transport
53 | Domain Name System (DNS) | Port for DNS requests, network routing, and zone transfers | TCP and UDP
67/68 | Dynamic Host Configuration Protocol (DHCP) | Used on networks that do not use static IP address assignment. | UDP
Let's return to the example we used in the previous topic (Figure 199). We are sending an
HTTP request from our client at 177.41.72.6 to the Web site at 41.199.222.3. The server for
that Web site will use well-known port number 80, so its socket is 41.199.222.3:80, as we
saw before. We have been assigned ephemeral port number 3,022 for our Web browser, so the
client socket is 177.41.72.6:3022. The overall connection between these devices can be
described using this socket pair:
(41.199.222.3:80, 177.41.72.6:3022)
Unlike TCP, UDP is a connectionless protocol, so it obviously doesn't use connections. The
pair of sockets on the sending and receiving devices can still be used to identify the two
processes exchanging data, but since there are no connections the socket pair doesn't have
the significance that it does in TCP.
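The socket pair of a TCP connection can be observed directly. The sketch below is a loopback stand-in for the Web example: it uses 127.0.0.1 and an OS-chosen server port instead of 41.199.222.3:80, because binding port 80 normally needs administrator rights. Both ends report the same (server socket, client socket) pair.

```python
import socket

# Server side: listen on the loopback interface; port 0 lets the OS pick a port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)

# Client side: the OS assigns an ephemeral port when the connection is made.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(listener.getsockname())
server_side, _ = listener.accept()

# Each end sees the same pair: (server socket, client socket).
pair_seen_by_client = (client.getpeername(), client.getsockname())
pair_seen_by_server = (server_side.getsockname(), server_side.getpeername())
print("client view:", pair_seen_by_client)
print("server view:", pair_seen_by_server)

server_port = listener.getsockname()[1]
client.close()
server_side.close()
listener.close()
```

The ephemeral port printed for the client changes from run to run, just as port 3,022 in the example was picked for that one connection.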
netstat Command
The netstat command generates displays that show network status and protocol statistics.
You can display the status of TCP and UDP endpoints in table format, routing table
information, and interface information.
netstat displays various types of network data depending on the command line option
selected. These displays are the most useful for system administration. The syntax for this
form is:
netstat [-m] [-n] [-s] [-i | -r] [-f address_family]
The most frequently used options for determining network status are -s, -r, and -i. See
the netstat(1M) man page for a description of the options.
DNS is a TCP/IP protocol used on different platforms. The domain name space is divided
into three different sections: generic domains, country domains, and inverse domain.
Generic Domains
o It defines the registered hosts according to their generic behavior.
o Each node in a tree defines the domain name, which is an index to the DNS database.
o It uses three-character labels, and these labels describe the organization type.
Label Description
Country Domain
The format of a country domain is the same as that of a generic domain, but it uses
two-character country abbreviations (e.g., us for the United States) in place of
three-character organizational abbreviations.
Inverse Domain
The inverse domain is used for mapping an address to a name. For example, suppose a server
receives a request from a client, and the server holds the files of only authorized
clients. To determine whether the client is on the authorized list, the server sends a
query to the DNS server and asks it to map the client's address to a name.
Working of DNS
o DNS is a client/server network communication protocol. DNS clients send requests
to the server, while DNS servers send responses to the client.
o Client requests that contain a name to be converted into an IP address are known as
forward DNS lookups, while requests that contain an IP address to be converted
into a name are known as reverse DNS lookups.
o DNS implements a distributed database to store the names of all the hosts available
on the internet.
o If a client such as a web browser sends a request containing a hostname, then a piece of
software called the DNS resolver sends a request to the DNS server to obtain the IP
address of that hostname. If the DNS server does not hold the IP address associated
with the hostname, it forwards the request to another DNS server. Once the IP address
has arrived at the resolver, the client can complete its request over the Internet
Protocol.
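Forward and reverse lookups can be tried from Python through the operating system's resolver. The two helper functions below are hypothetical names for this sketch; they wrap the standard socket.getaddrinfo and socket.gethostbyaddr calls and return whatever the local resolver and its configured DNS servers answer.

```python
import socket

def forward_lookup(hostname):
    """Name -> addresses: return the sorted, de-duplicated IP addresses found."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})

def reverse_lookup(address):
    """Address -> name: return the primary hostname, or None if none is known."""
    try:
        return socket.gethostbyaddr(address)[0]
    except (socket.herror, socket.gaierror):
        return None
```

For example, forward_lookup("lifewire.com") would ask the configured DNS server for that name, and reverse_lookup applied to one of the returned addresses would attempt the opposite mapping. Both need network access, and the answers depend on the resolver in use.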
DNS server
A DNS server is a computer server that contains a database of public IP addresses and
their associated hostnames, and in most cases serves to resolve, or translate, those
names to IP addresses as requested. DNS servers run special software and communicate
with each other using special protocols.
However, computers and network devices don't work well with domain names when
trying to locate each other on the internet. It's far more efficient and precise to use an IP
address, which is the numerical representation of what server in the network (internet)
the website resides on.
When you type a website address into your browser's address bar and press Enter, a DNS
server goes to work to find the address that you want to visit. It does this by sending a
DNS query to several servers, each of which translates a different part of the domain
name you entered. The different servers queried are:
A DNS Resolver: Receives the request to resolve the domain name with the IP
address. This server does the grunt work in figuring out where the site you want
to go actually resides on the internet.
A Root Server: The root server receives the first request and returns a result that
tells the DNS resolver the address of the Top Level Domain (TLD)
server that stores the information about the site. A top level domain is the
equivalent of the .com or .net portion of the domain name you entered into the
address bar.
A TLD Server: The DNS resolver then queries this server, which returns the
Authoritative Name Server where the site's records are actually stored.
An Authoritative Name Server: Finally, the DNS resolver queries this server to
learn the actual IP address of the website you're trying to reach.
Once the IP address is returned, the website you wanted to visit is then displayed in your
web browser.
It sounds like a lot of back and forth, and it is, but it all happens very quickly with little
delay in returning the site you want to visit.
The process described above happens the first time you visit a site. If you visit the same
site again before the cache on your web browser is cleared, there's no need to go through
all these steps. Instead, the web browser pulls the information from the cache to load
the website even faster.
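The chain of queries (root server, TLD server, authoritative name server) and the effect of the cache can be modelled with a toy resolver. Everything here is an in-memory stand-in: the server labels are invented, the single address is taken from the nslookup output for lifewire.com shown in these notes, and no real DNS traffic is sent.

```python
# Toy stand-ins for the three kinds of servers the resolver talks to.
ROOT_SERVER = {"com": "tld-server-com"}                         # TLD -> TLD server
TLD_SERVERS = {"tld-server-com": {"lifewire.com": "auth-ns-1"}}  # domain -> name server
AUTHORITATIVE = {"auth-ns-1": {"lifewire.com": "151.101.2.114"}}  # domain -> address

class ToyResolver:
    """Resolves a name step by step and remembers answers in a cache."""

    def __init__(self):
        self.cache = {}
        self.queries_sent = 0

    def resolve(self, name):
        if name in self.cache:  # repeat visit: no queries needed
            return self.cache[name]
        tld = name.rsplit(".", 1)[-1]
        tld_server = ROOT_SERVER[tld]                # 1. ask the root server
        self.queries_sent += 1
        auth_server = TLD_SERVERS[tld_server][name]  # 2. ask the TLD server
        self.queries_sent += 1
        address = AUTHORITATIVE[auth_server][name]   # 3. ask the authoritative server
        self.queries_sent += 1
        self.cache[name] = address
        return address

resolver = ToyResolver()
first = resolver.resolve("lifewire.com")
queries_after_first = resolver.queries_sent
second = resolver.resolve("lifewire.com")
queries_after_second = resolver.queries_sent
print(first, queries_after_first, second, queries_after_second)
```

The second call returns the same address without sending any further queries, which is the saving the cache provides.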
Primary and Secondary DNS Servers
In most cases, a primary and a secondary DNS server are configured on your router or
computer when you connect to your internet service provider. There are two DNS
servers in case one of them happens to fail, in which case the second is used to resolve
the hostnames you enter.
Several publicly accessible DNS servers are available for you to use, and you can change
the DNS servers your network connects to.
Some DNS servers can provide faster access times than others. This is often a function of
how close you are to those servers. If your ISP's DNS servers are closer to you than
Google's, for example, you may find domain names are resolved quicker using the default
servers from your ISP than with an external server.
If you experience connection problems where it seems no websites will load, it's possible
there's an error with the DNS server. If the DNS server isn't able to find the correct IP
address that's associated with the hostname you enter, the website can't be located and
loaded.
A computer or device, including smartphones and tablets, connected to your router can use a
different set of DNS servers to resolve internet addresses. These will supersede those
configured on your router and will be used instead.
The nslookup command is used to query your DNS server on Windows PCs.
Start by opening the Command Prompt tool and then typing the following:
nslookup lifewire.com
Name: lifewire.com
Addresses: 151.101.2.114
151.101.66.114
151.101.130.114
151.101.194.114
In the example above, the nslookup command tells you the IP address, or several IP
addresses in this case, that the lifewire.com address translates to.
DNS Root Servers
There are 13 important DNS root servers on the internet that store a complete database
of domain names and their associated public IP addresses. These top-tier DNS servers
are named A through M for the first 13 letters of the alphabet. Ten of these servers are in
the US, one in London, one in Stockholm, and one in Japan.
The Internet Assigned Numbers Authority (IANA) maintains the list of DNS root servers.
Malware Attacks That Change DNS Server Settings
Malware attacks against DNS servers are not at all uncommon. Always run an antivirus
program because malware can attack your computer in a way that changes the DNS
server settings.
For example, if your computer uses Google's DNS servers (8.8.8.8 and 8.8.4.4) and you
open your bank's website, you naturally expect that when you enter its familiar URL,
you'll be sent to the bank's website.
There are two things you should do to avoid becoming a victim of a DNS settings attack.
The first is to install antivirus software so that malicious programs are caught before
they can do any damage.
The second is to pay close attention to the appearance of important websites you visit
regularly. If you visit one and the site looks off in some way—maybe the images are all
different or the site's colors have changed, or menus don't look right, or you find
misspellings (hackers can be dreadful spellers)—or you get an "invalid certificate"
message in your browser, it might be a sign that you're on a faked website.
This ability to redirect traffic can also be used for positive purposes. For
example, OpenDNS can redirect traffic intended for adult websites, gambling websites,
social media websites, or other sites that network administrators or organizations don't
want their users visiting. Instead, users may be sent to a page with a "Blocked" message.
HTTP
HTTP Transactions
The above figure shows the HTTP transaction between client and server. The client
initiates a transaction by sending a request message to the server. The server replies to the
request message by sending a response message.
Messages
HTTP messages are of two types: request and response. Both the message types follow the
same message format.
Request Message: The request message is sent by the client that consists of a request line,
headers, and sometimes a body.
Response Message: The response message is sent by the server to the client that consists
of a status line, headers, and sometimes a body.
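The request and response formats can be illustrated by building and parsing the raw text of each message. This sketch assumes the HTTP/1.1 text format (a first line, header lines, a blank line, then an optional body); the function names, the example.com host, and the sample response are made up for illustration.

```python
def build_request(method, path, host, headers=None, body=b""):
    """Assemble a request message: request line, headers, blank line, body."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    if body:
        lines.append(f"Content-Length: {len(body)}")
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii") + body

def parse_response(raw):
    """Split a response message into status line, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, status, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return {"version": version, "status": int(status), "reason": reason,
            "headers": headers, "body": body}

request = build_request("GET", "/index.html", "example.com")
sample_response = (b"HTTP/1.1 200 OK\r\n"
                   b"Content-Type: text/plain\r\n"
                   b"Content-Length: 5\r\n"
                   b"\r\n"
                   b"hello")
response = parse_response(sample_response)
print(request)
print(response)
```

The GET request here has no body, while the sample response carries a 5-byte body, matching the "sometimes a body" wording above.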
o A client that wants to access a document on the internet needs an address. To
facilitate access to documents, HTTP uses the concept of the Uniform Resource
Locator (URL).
o The Uniform Resource Locator (URL) is a standard way of specifying any kind of
information on the internet.
o The URL defines four parts: method, host computer, port, and path.
o Method: The method is the protocol used to retrieve the document from a server.
For example, HTTP.
o Host: The host is the computer where the information is stored, and the computer is
given an alias name. Web pages are mainly stored on computers that are
given an alias name beginning with the characters "www", although this prefix
is not mandatory.
o Port: The URL can also contain the port number of the server, but it's an optional
field. If the port number is included, then it must come between the host and path
and it should be separated from the host by a colon.
o Path: Path is the pathname of the file where the information is stored. The path
itself can contain slashes that separate the directories from the subdirectories and files.
Web Clients and server
Feature | Client | Server
Tasks | The common tasks for a client are simple and mostly include requesting services. | Complex tasks such as fulfilling client requests, storing and processing large datasets, and data analysis are common for a server.
Switch off | Client systems can be switched off without any fear. | Switching off servers may be disastrous for client systems that continuously request the services.
Login support | There can be single-user logins. | Servers support multiple user logins and request processing simultaneously.
FTP
Objectives of FTP
Why FTP?
Although transferring files from one system to another seems simple and straightforward,
it can sometimes cause problems. For example, two systems may have different file
conventions, different ways of representing text and data, or different directory
structures. FTP overcomes these problems by establishing two connections between hosts.
One connection is used for data transfer, and the other is used for control information
(commands and responses).
Mechanism of FTP
The above figure shows the basic model of the FTP. The FTP client has three components:
the user interface, control process, and data transfer process. The server has two
components: the server control process and the server data transfer process.
o Control Connection: The control connection uses very simple rules for
communication. Through control connection, we can transfer a line of command or
line of response at a time. The control connection is made between the control
processes. The control connection remains connected during the entire interactive
FTP session.
o Data Connection: The Data Connection uses very complex rules as data types may
vary. The data connection is made between data transfer processes. The data
connection opens when a command comes for transferring the files and closes when
the file is transferred.
FTP Clients
o An FTP client is a program that implements the File Transfer Protocol, which allows you
to transfer files between two hosts on the internet.
o It allows a user to connect to a remote host and upload or download files.
o It has a set of commands that we can use to connect to a host, transfer files
between you and the host, and close the connection.
o An FTP program is also available as a built-in component in a Web browser. This
GUI-based FTP client makes file transfer very easy and does not require the user to
remember the FTP commands.
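The connect, transfer, and close sequence can be sketched with Python's standard ftplib module. This is only an outline: the download_file helper is a hypothetical name, and the host, user name, password, and file names are placeholders to be supplied by the caller.

```python
from ftplib import FTP

def download_file(host, user, password, remote_name, local_path):
    """Connect, log in, download one file in binary mode, and close the session."""
    with FTP(host) as ftp:  # opens the control connection
        ftp.login(user=user, passwd=password)
        with open(local_path, "wb") as out:
            # RETR uses a separate data connection that exists only for the transfer.
            ftp.retrbinary(f"RETR {remote_name}", out.write)
    # Leaving the with-block ends the session and closes the control connection.
```

A call such as download_file("ftp.example.com", "alice", "secret", "report.pdf", "report.pdf") would need a reachable FTP server. Note that plain FTP sends the password in clear text.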
Advantages of FTP:
o Speed: One of the biggest advantages of FTP is speed. FTP is one of the fastest
ways to transfer files from one computer to another.
o Efficient: It is more efficient as we do not need to complete all the operations to get
the entire file.
o Security: To access the FTP server, we need to login with the username and
password. Therefore, we can say that FTP is more secure.
o Back & forth movement: FTP allows us to transfer the files back and forth.
Suppose you are a manager of the company, you send some information to all the
employees, and they all send information back on the same server.
Disadvantages of FTP:
o The standard requirement of the industry is that all FTP transmissions should
be encrypted. However, not all FTP providers are equal, and not all of them
offer encryption. So, we have to look for FTP providers that provide
encryption.
o FTP serves two operations, i.e., sending and receiving large files on a network.
However, the size limit of a file that can be sent is 2GB. It also doesn't allow you to
run simultaneous transfers to multiple receivers.
o Passwords and file contents are sent in clear text, which allows unwanted
eavesdropping. It is also possible for attackers to carry out a brute force
attack by trying to guess the FTP password.
o It is not compatible with every system.
Network Virtual Terminal (NVT)
o The network virtual terminal is an interface that defines how data and
commands are sent across the network.
o In today's world, systems are heterogeneous. For example, each operating
system accepts its own special combination of characters as a token: the end-of-file
token on a computer running DOS is ctrl+z, while on a computer running UNIX
it is ctrl+d.
o TELNET solves this issue by defining a universal interface known as the network
virtual terminal (NVT).
o The TELNET client translates the characters that come from the local
terminal into NVT form and then delivers them to the network. The TELNET
server then translates the data from NVT form into a form that can be
understood by the remote computer.
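As a rough illustration of this translation step, the sketch below converts text between a local end-of-line convention and NVT form, which uses 7-bit ASCII with CR LF marking the end of a line. The function names to_nvt and from_nvt are hypothetical, and the handling of control characters and Telnet commands is left out.

```python
def to_nvt(text: str) -> bytes:
    # Normalize any local CR LF to LF first, then emit CR LF for every line end.
    normalized = text.replace("\r\n", "\n")
    # NVT data is 7-bit ASCII; encode() raises an error for other characters.
    return normalized.replace("\n", "\r\n").encode("ascii")


def from_nvt(data: bytes, local_newline: str = "\n") -> str:
    # Replace the NVT end-of-line marker with the local convention.
    return data.decode("ascii").replace("\r\n", local_newline)


if __name__ == "__main__":
    wire = to_nvt("ls -l\n")
    print(wire)             # bytes as they would be handed to the network
    print(from_nvt(wire))   # text as the remote side would see it
```

A client on a UNIX-like system would call to_nvt before sending, and a server on a system with a different convention would pass its own local_newline to from_nvt.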
Telnet
o The main task of the internet is to provide services to users. For example, users
want to run different application programs at a remote site and transfer the results
to the local site. This requires a client-server program such as FTP or SMTP. But it
is not practical to create a specific client-server program for each demand.
o The better solution is to provide a general client-server program that lets the user
access any application program on a remote computer, that is, a program that
allows a user to log on to a remote computer. A popular client-server program,
Telnet, is used to meet such demands. Telnet is an abbreviation for Terminal
Network.
o Telnet provides a connection to the remote computer in such a way that the local
terminal appears to be at the remote side.
Local Login
o When a user logs into a local computer, it is known as local login.
o When the workstation is running a terminal emulator, the keystrokes
entered by the user are accepted by the terminal driver. The terminal
driver then passes these characters to the operating system, which in
turn invokes the desired application program.
o However, the operating system assigns special meanings to special
characters. For example, in UNIX some combinations of characters have
special meanings, such as the control character with "z", which means suspend.
Such situations do not create any problem as the terminal driver knows
the meaning of such characters. But they can cause problems in
remote login.
Remote login
The user sends the keystrokes to the terminal driver, and the characters are then sent to
the TELNET client. The TELNET client, in turn, transforms the characters into a
universal character set known as network virtual terminal (NVT) characters and delivers
them to the local TCP/IP stack.
The commands in NVT form are transmitted to the TCP/IP stack at the remote machine.
Here, the characters are delivered to the operating system and then passed to the
TELNET server. The TELNET server transforms the characters into a form that can be
understood by the remote computer. However, the characters cannot be passed directly
to the operating system, as the remote operating system is not designed to receive
characters from a TELNET server. Therefore, it requires a piece of software, known as a
pseudoterminal driver, that can accept the characters from the TELNET server. The
operating system then passes these characters to the appropriate application program.
SSH
SSH stands for Secure Shell or Secure Socket Shell. It is a cryptographic network protocol
that allows two computers to communicate and share data over an insecure network
such as the internet. It is used to log in to a remote server to execute commands and to
transfer data from one machine to another.
The SSH protocol was developed by SSH Communications Security Ltd. to communicate
safely with the remote machine.
Its security features are widely used by network administrators for managing systems and
applications remotely.
As a simple example, suppose you want to send a package to
one of your friends. Without the SSH protocol, it can be opened and read by anyone. But if
you send it using the SSH protocol, it will be encrypted and secured with public keys, and
only the receiver can open it.
[Figure: communication before SSH and after SSH]
The SSH protocol works in a client-server model, which means it connects a secure shell
client application (End where the session is displayed) with the SSH server (End where
session executes).
SSH was initially developed to replace insecure login protocols such as
Telnet and rlogin, and hence it performs the same function.
The basic use of SSH is to connect to a remote system for a terminal session; to do this,
the following command is used:
ssh UserName@server.test.com
The above command enables the client to connect to the server,
named server.test.com, using the ID UserName.
If we are connecting for the first time, the client will display the remote host's public key
fingerprint and ask whether to connect. The following message will be shown:
The authenticity of host 'sample.ssh.com' cannot be established.
DSA key fingerprint is 01:23:45:67:89:ab:cd:ef:ff:fe:dc:ba:98:76:54:32:10.
Are you sure you want to continue connecting (yes/no)?
To continue the session, we need to type yes; otherwise, no. If we type yes, the host key
will be stored in the known_hosts file of the local system. By default, this is the hidden
file ~/.ssh/known_hosts in the user's home directory. Once the host
key is stored in this file, there is no need for further approval, as the stored host key will
automatically authenticate the server on later connections.
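The first-connection check described above can be sketched as follows. This is a simplified model, not how any particular SSH client is implemented: the known-hosts store is a plain dictionary instead of a file, the function names are hypothetical, and the fingerprint is a SHA-256 hash printed as colon-separated hex purely for illustration (real clients differ in hash and display format).

```python
import base64
import hashlib


def fingerprint(public_key_b64: str) -> str:
    # Hash the decoded key blob and show it as colon-separated hex pairs.
    digest = hashlib.sha256(base64.b64decode(public_key_b64)).hexdigest()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def check_host(known_hosts: dict, host: str, public_key_b64: str) -> str:
    # "unknown": never seen before, so the user must be asked (yes/no).
    # "match": same key as stored, so no further approval is needed.
    # "mismatch": the key changed, which may indicate an attack.
    stored = known_hosts.get(host)
    if stored is None:
        return "unknown"
    return "match" if stored == fingerprint(public_key_b64) else "mismatch"


def trust_host(known_hosts: dict, host: str, public_key_b64: str) -> None:
    # Called after the user answers "yes" to the prompt.
    known_hosts[host] = fingerprint(public_key_b64)


if __name__ == "__main__":
    hosts = {}
    key = base64.b64encode(b"demo-host-key").decode()  # stand-in for a real key
    print(check_host(hosts, "server.test.com", key))
    trust_host(hosts, "server.test.com", key)
    print(check_host(hosts, "server.test.com", key))
```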
The SSH architecture is made up of three well-separated layers. These layers are:
1. Transport Layer
2. User-authentication layer
3. Connection Layer
The SSH protocol architecture is an open architecture; hence it provides great flexibility
and enables SSH to be used for many purposes beyond a secure shell. In the
architecture, the transport layer is similar to Transport Layer Security (TLS). The
user-authentication layer can be used with custom authentication methods, and the
connection layer allows multiplexing different secondary sessions into a single SSH
connection.
Transport Layer
The SSH transport layer typically runs on top of TCP/IP. For SSH-2, this layer is
responsible for handling the initial key exchange, server authentication, encryption setup,
compression, and integrity verification. It works as an interface for sending and receiving
plaintext packets with sizes of up to 32,768 bytes.
User-Authentication Layer
As its name suggests, the user-authentication layer is responsible for handling client
authentication and provides various authentication methods. Authentication is
client-driven; hence, when a prompt for a password appears, it usually comes from the SSH
client rather than the server, and the server merely responds to the client's
authentication requests.
Connection Layer
The connection layer defines the channels through which SSH services are provided, along
with channel requests and global requests. One SSH connection
can host several channels simultaneously, and each channel can transfer data in both
directions. Channel requests are used to relay out-of-band
channel-specific data, for example, the changed size of a terminal window or the exit code of
a server-side process. The standard channel types of the connection layer include session,
x11, direct-tcpip, and forwarded-tcpip. Over these channels, SSH can carry:
o Data
o Text
o Commands
o Files
Files are transferred using SFTP (SSH File Transfer Protocol, also called Secure File
Transfer Protocol), which provides FTP-like file transfer over the encrypted SSH connection.
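The idea of several channels sharing one connection can be sketched with a simplified framing: a message type byte, a channel number, a length, and the payload. This is loosely modeled on the channel data message of SSH-2 (message number 94) but is not the complete wire format; there is no encryption, padding, or flow control, and the function names are hypothetical.

```python
import struct

CHANNEL_DATA = 94        # message number for channel data in SSH-2
HEADER = ">BII"          # type (1 byte), channel (4 bytes), length (4 bytes)


def pack_channel_data(channel: int, payload: bytes) -> bytes:
    # Prefix the payload with its type, channel number, and length.
    return struct.pack(HEADER, CHANNEL_DATA, channel, len(payload)) + payload


def demultiplex(stream: bytes) -> dict:
    # Split one byte stream back into the data belonging to each channel.
    channels = {}
    offset = 0
    header_size = struct.calcsize(HEADER)
    while offset < len(stream):
        msg_type, channel, length = struct.unpack_from(HEADER, stream, offset)
        if msg_type != CHANNEL_DATA:
            raise ValueError("unexpected message type")
        offset += header_size
        channels[channel] = channels.get(channel, b"") + stream[offset:offset + length]
        offset += length
    return channels


if __name__ == "__main__":
    # Two channels interleaved on a single connection.
    stream = (pack_channel_data(0, b"ls\n")
              + pack_channel_data(1, b"file-bytes")
              + pack_channel_data(0, b"pwd\n"))
    print(demultiplex(stream))
```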
o Telnet was the first internet application protocol used to create and maintain a
terminal session on a remote host.
o Both SSH and Telnet offer the same functionality. The main difference is that
the SSH protocol is secured with public-key cryptography, which authenticates the
endpoints while setting up a terminal session. Telnet, on the other hand, provides no
cryptographic authentication of the endpoints, making it less secure.
o SSH sends encrypted data, while Telnet sends data in plain text.
o Due to its high security, SSH is the preferred protocol for public networks, while
Telnet, being less secure, is suitable only for private networks.
o SSH runs on port 22 by default, although this can be changed, while Telnet uses port
23 and was designed with local area networks in mind.
To make a secure transmission, SSH uses three different encryption techniques at various
points during a transmission. These techniques are:
1. Symmetrical Encryption
2. Asymmetrical Encryption
3. Hashing
Symmetrical Encryption
A single key is used in symmetric encryption to encrypt and decrypt
the messages sent to and received from the destination. This technique is also known as
shared key encryption because both devices use the same key to encrypt the data they send
and decrypt the data they receive.
This technique encrypts the entire SSH connection to prevent man-in-the-middle attacks.
One issue arises at the time of the initial key exchange: if a
third party is present during the key exchange, they could learn the key and read the entire
message.
A key exchange algorithm is used to prevent this problem. With this algorithm, the
secret key can be established securely even if the exchange is observed.
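One well-known key exchange algorithm is Diffie-Hellman. The sketch below is a toy version using the textbook parameters p = 23 and g = 5, which are far too small for real use; it only shows that both sides arrive at the same secret while exchanging public values alone. The function names are hypothetical.

```python
import secrets

P = 23   # toy prime modulus (real deployments use very large primes)
G = 5    # toy generator


def make_keypair():
    # The private value stays on the machine; only the public value is sent.
    private = secrets.randbelow(P - 2) + 1      # random value in [1, P - 2]
    public = pow(G, private, P)
    return private, public


def shared_secret(own_private: int, peer_public: int) -> int:
    # Both sides compute G ** (a * b) mod P without ever sending a or b.
    return pow(peer_public, own_private, P)


if __name__ == "__main__":
    client_private, client_public = make_keypair()
    server_private, server_public = make_keypair()
    # Only client_public and server_public cross the network.
    print(shared_secret(client_private, server_public)
          == shared_secret(server_private, client_public))
```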
Asymmetrical Encryption
In asymmetrical encryption, two different keys are used for encryption and decryption:
a private key and a public key. The private key is known to its owner only and must not be
shared with anyone else, whereas the public key is shared publicly. The public key is saved on
the SSH server, whereas the private key is saved locally on the SSH client; these two keys
form a key pair. A message encrypted with the public key can only be decrypted with the
corresponding private key.
It is a very secure technique: even if a third party gets the public key, they cannot decrypt
the message because they don't know the private key.
Asymmetrical encryption does not encrypt the complete SSH session. Instead, it is
mainly used during the key exchange for symmetric encryption. Before
establishing a connection, both systems (client and server) generate temporary
public-private key pairs and then share their public keys to generate the shared secret key.
After a secure symmetric connection is established, the server encrypts a challenge with the
client's public key and transmits it to the client for authentication. The client can only
decrypt the data if it has the private key, and hence the SSH session is established.
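The public/private relationship can be illustrated with textbook RSA on deliberately tiny numbers. This is only a sketch of the principle: the primes 61 and 53 and the exponent 17 are classroom values, there is no padding, and real SSH keys are thousands of bits long.

```python
# Toy RSA key pair built from two small primes (illustration only).
P, Q = 61, 53
N = P * Q                    # modulus, part of both keys
PHI = (P - 1) * (Q - 1)
E = 17                       # public exponent
D = pow(E, -1, PHI)          # private exponent (modular inverse, Python 3.8+)


def encrypt(message: int) -> int:
    # Anyone holding the public key (E, N) can encrypt.
    return pow(message, E, N)


def decrypt(ciphertext: int) -> int:
    # Only the holder of the private key (D, N) can decrypt.
    return pow(ciphertext, D, N)


if __name__ == "__main__":
    challenge = 65
    ciphertext = encrypt(challenge)
    print(ciphertext, decrypt(ciphertext))
```

In the authentication step described above, the server would play the role of the encrypting side and the client would prove possession of the private key by recovering the challenge.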
Hashing
In SSH, one-way hashing is another form of
cryptography that is used alongside encryption. The hashing technique is different from the
above two methods, as its output is not
meant to be decrypted. It generates a signature or summary of the information. SSH
uses HMAC (Hash-based Message Authentication Code) to ensure that messages arrive in
complete and unmodified form.
In this technique, each transmitted message must carry a MAC, which is computed from three
components: the symmetric key, the packet sequence number, and the message content. These
three components are fed to the hash function, which generates a string that doesn't have any
meaning by itself, and this string is sent to the host. The host has the same information, so it
computes the hash as well, and if the generated hash matches the received hash, it
means the message has not been tampered with.
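The three-component MAC can be sketched with Python's standard hmac module. The choice of SHA-256 and the way the sequence number is packed in front of the message are assumptions made for this example; the function names are hypothetical.

```python
import hashlib
import hmac
import struct


def compute_mac(key: bytes, sequence_number: int, message: bytes) -> bytes:
    # Combine the packet sequence number (as 4 bytes) with the message content,
    # then key the hash with the shared symmetric key.
    data = struct.pack(">I", sequence_number) + message
    return hmac.new(key, data, hashlib.sha256).digest()


def verify_mac(key: bytes, sequence_number: int, message: bytes, received: bytes) -> bool:
    # The receiver recomputes the MAC and compares in constant time.
    return hmac.compare_digest(compute_mac(key, sequence_number, message), received)


if __name__ == "__main__":
    key = b"shared-session-key"
    mac = compute_mac(key, 7, b"ls -l")
    print(verify_mac(key, 7, b"ls -l", mac))      # unmodified message
    print(verify_mac(key, 7, b"rm -rf", mac))     # altered message
```

Including the sequence number means that replaying an old packet at a different position in the stream also fails verification.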
Disadvantages of Telnet:
o User ID and password are transmitted without any encryption. This leads to a security risk
in the Telnet protocol, as eavesdropping and snooping are easy for intruders or
hackers.
o It is not possible to run GUI-based tools over a Telnet connection, as it is a
character-based communication tool. It is not possible to transmit cursor movements and
other GUI information.
o It is a very inefficient protocol.
o Each keystroke requires several context switches before it reaches the other end.
o It is expensive due to slow typing speeds.
SMTP
Components of SMTP
o First, we break the SMTP client and SMTP server into two components:
the user agent (UA) and the mail transfer agent (MTA). The user agent (UA) prepares the
message, creates the envelope, and then puts the message in the envelope. The mail
transfer agent (MTA) transfers this mail across the internet.
o SMTP allows a more complex system by adding a relaying system. Instead of just
having one MTA at the sending side and one at the receiving side, more MTAs can be added,
acting either as a client or a server to relay the email.
o Email can also be sent to users on systems that do not use the TCP/IP protocol;
this is achieved by the use of a mail gateway. The mail gateway is a
relay MTA that can be used to receive such email.
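The split between preparing a message and transferring it can be sketched with Python's standard library. Here build_message plays the role of the user agent, and send hands the result to an MTA over SMTP. The addresses, the host name localhost, and port 25 are placeholders assumed for the example; send is defined but not called, since it needs a reachable mail server.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender: str, recipient: str, subject: str, body: str) -> EmailMessage:
    # User agent step: prepare the headers and the message content.
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send(msg: EmailMessage, host: str = "localhost", port: int = 25) -> None:
    # Transfer step: hand the message to an MTA, which relays it onward.
    # The envelope sender and recipients are taken from the message headers.
    with smtplib.SMTP(host, port) as client:
        client.send_message(msg)


if __name__ == "__main__":
    message = build_message("alice@example.com", "bob@example.org",
                            "Hello", "Hi Bob, this is a test.")
    print(message)
```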
Working of SMTP
POP Protocol
The POP protocol stands for Post Office Protocol. As we know, SMTP is used by the
message transfer agent. When a message is sent, SMTP is used to deliver the
message from the client to the sender's server and then to the recipient's server. But the
message is delivered from the recipient's server to the actual recipient with the help of the
Message Access Agent. The Message Access Agent uses one of two protocols, i.e., POP3 or IMAP.
Suppose a sender wants to send mail to a receiver. First, the mail is transmitted to the
sender's mail server. Then, the mail is transmitted from the sender's mail server to the
receiver's mail server over the internet. On arriving at the receiver's mail server, the mail is
then sent to the user. The whole process is done with the help of email protocols. The
transmission of mail from the sender to the sender's mail server and then to the receiver's
mail server is done with the help of the SMTP protocol. At the receiver's mail server, the
POP or IMAP protocol takes the data and transmits it to the actual user.
SMTP is a push protocol, so it pushes the message from the client to the server. As we
can observe in the above figure, SMTP pushes the message from the client to the
recipient's mail server. The third stage of email communication requires a pull protocol,
and POP is a pull protocol: when the mail is transmitted from the recipient's mail server to
the client, the client is pulling the mail from the server.
What is POP3?
POP3 is a simple protocol with very limited functionality. In the case of the
POP3 protocol, the POP3 client is installed on the recipient's system, while the POP3 server
is installed on the recipient's mail server.
The first version of the Post Office Protocol was introduced in 1984 as RFC 918 by
the Internet Engineering Task Force. The developers developed a simple and effective email
protocol, now known as the POP3 protocol, which is used for retrieving emails from the
server. This provides the facility for accessing the mails offline rather than accessing the
mailbox online.
In 1985, Post Office Protocol version 2 was introduced in RFC 937, but it was replaced
by Post Office Protocol version 3 in 1988 with the publication of RFC 1081. POP3 was then
revised several times over the following years, and the refined specification
was published in 1996 as RFC 1939.
Although the POP3 protocol has undergone various enhancements, the developers
maintained a basic principle that it follows a three-stage process at the time of mail
retrieval between the client and the server. They tried to make this protocol very simple,
and this simplicity makes this protocol very popular today.
To establish the connection between the POP3 server and the POP3 client, the POP3 client
first sends the user name to the POP3 server. If the user name is found on the POP3 server,
it sends an OK message. The POP3
client then sends the password to the POP3 server. If the password matches, the POP3
server sends an OK message, and the connection is established. After the establishment
of the connection, the client can see the list of mails on the POP3 mail server. In the list of
mails, the user gets the email numbers and sizes from the server. From this list, the
user can start retrieving mail.
Once the client retrieves all the emails from the server, all the emails from the server are
deleted. Therefore, we can say that the emails are restricted to a particular machine, so it
would not be possible to access the same mails on another machine. This situation can be
overcome by configuring the email settings to leave a copy of mail on the mail server.
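The session described above (user name, password, listing, retrieval, optional deletion) can be sketched with Python's standard poplib module. The host, user name, and password are supplied by the caller, the function names are hypothetical, and the TLS variant of the client class is an assumption made for this example. The leave_on_server flag corresponds to the "leave a copy of mail on the mail server" setting.

```python
import poplib


def parse_listing(lines):
    # Each listing line looks like b"1 1200": message number, then size in octets.
    return [tuple(int(field) for field in line.split()[:2]) for line in lines]


def fetch_all(host: str, user: str, password: str, leave_on_server: bool = True):
    conn = poplib.POP3_SSL(host)              # POP3 over TLS
    try:
        conn.user(user)                       # server answers +OK if the user exists
        conn.pass_(password)                  # server answers +OK if the password matches
        _, lines, _ = conn.list()             # message numbers and sizes
        messages = []
        for number, _size in parse_listing(lines):
            _, body_lines, _ = conn.retr(number)
            messages.append(b"\r\n".join(body_lines))
            if not leave_on_server:
                conn.dele(number)             # mark for deletion on the server
        return messages
    finally:
        conn.quit()                           # deletions take effect at QUIT


if __name__ == "__main__":
    print(parse_listing([b"1 1200", b"2 3400"]))
```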
Advantages of POP3:
o It allows users to read email offline. It requires an internet connection only
at the time of downloading emails from the server. Once the mails are downloaded
from the server, they reside on the hard disk of our
computer and can be accessed without the internet. Therefore, we can say that
the POP3 protocol does not require permanent internet connectivity.
o It provides easy and fast access to the emails as they are already stored on our PC.
o There is no limit on the size of the email which we receive or send.
o It requires less server storage space as all the mails are stored on the local machine.
o There is no maximum size for the mailbox, other than the limit imposed by the size of the
hard disk.
o It is a simple protocol, so it is one of the most popular protocols used today.
o It is easy to configure and use.
Disadvantages of POP3:
o When the emails are downloaded from the server, they are deleted from the
server by default. So, mails cannot be accessed from other machines unless the client is
configured to leave a copy of the mail on the server.
o Transferring the mail folder from the local machine to another machine can be
difficult.
o Since all the attachments are stored on your local machine, there is a high risk of a
virus attack if the virus scanner does not scan them. The virus attack can harm the
computer.
o The email folder which is downloaded from the mail server can also become
corrupted.
o The mails are stored on the local machine, so anyone who uses your machine can
access the email folder.
IMAP Protocol
IMAP stands for Internet Message Access Protocol. It is an application layer protocol
which is used to receive emails from the mail server. Along with POP3, it is one of the most
commonly used protocols for retrieving emails.
It also follows the client/server model. On one side, we have an IMAP client, which is a
process running on a computer. On the other side, we have an IMAP server, which is also a
process running on another computer. Both computers are connected through a network.
The IMAP protocol runs on top of the TCP/IP transport layer, which means that it implicitly
uses the reliability of that protocol. The IMAP server listens on port 143 by default,
although this port number can be changed, and the TCP connection is established between the
IMAP client and the IMAP server on that port.
POP3 became the most popular protocol for accessing TCP/IP mailboxes. It
implements the offline mail access model, which means that the mails are retrieved from
the mail server to the local machine and then deleted from the mail server.
Millions of users use the POP3 protocol to access their incoming mail. However, because of
the offline mail access model, it is limited in what it can do. In an ideal world we would
prefer the online model, but in the online model we need to be connected to the internet at
all times.
The biggest problem with offline access using POP3 is that the mails are permanently
removed from the server, so multiple computers cannot access the mails. The solution to
this problem is to store the mails on the remote server rather than on the local machine.
POP3 also faces another issue, i.e., data security and safety.
The solution to this problem is to use the disconnected access model, which provides the
benefits of both online and offline access. In the disconnected access model, the user can
retrieve the mail for local use as in the POP3 protocol, and the user does not need to be
connected to the internet continuously. However, the changes made to the mailboxes are
synchronized between the client and the server.
The mail remains on the server, so different applications can access it in the future. When
developers recognized these benefits, they made some attempts to implement the
disconnected access model with POP3, using the POP3 commands that provide
the option to leave the mails on the server.
This works, but only to a limited extent; for example, keeping track of which messages are
new or old becomes an issue when messages are both retrieved and left on the server. So,
POP3 lacks some features that are required for a proper disconnected access model.
In the mid-1980s, development began at Stanford University on a new protocol that
would provide a more capable way of accessing user mailboxes. The result was the
Interactive Mail Access Protocol, which was later renamed the Internet
Message Access Protocol.
The first version of IMAP formally documented as an internet standard was IMAP
version 2, in RFC 1064, published in July 1988. It was updated in RFC 1176,
August 1990, retaining the same version. A new document was then created for version 3,
known as IMAP3, in RFC 1203, which was published in February 1991. However, IMAP3
was never accepted by the marketplace, so people kept using IMAP2. An extension to the
protocol, called IMAPbis, was later created, which added support for Multipurpose Internet
Mail Extensions (MIME) to IMAP. This was a very important development due to the
usefulness of MIME. Despite this, IMAPbis was never published as an RFC. This may be due
to the problems associated with IMAP3. In December 1994, IMAP version 4, i.e., IMAP4,
was published in two RFCs: RFC 1730, describing the main protocol, and RFC 1731,
describing the authentication mechanisms for IMAP4. IMAP4 is the current version of
IMAP, which is widely used today. It continues to be refined; its revised version,
known as IMAP4rev1, was defined in RFC 2060 and later updated in
RFC 3501.
IMAP Features
IMAP was designed to provide a more flexible way for the user
to access the mailbox. It can operate in any of the three modes, i.e., online, offline, and
disconnected mode. Of these, the online and disconnected modes are of interest to most
users of the protocol.
o Access and retrieve mail from remote server: The user can access the mail on the
remote server while retaining the mails on the remote server.
o Set message flags: Message flags are set so that the user can keep track of which
messages they have already seen.
o Manage multiple mailboxes: The user can manage multiple mailboxes and transfer
messages from one mailbox to another. Users who are working on various projects
can organize their mail into corresponding categories.
o Determine information prior to downloading: The user can inspect information about a
message and decide whether to retrieve it before downloading the mail from the mail server.
o Download a portion of a message: It allows you to download a portion of a
message, such as one body part of a MIME multipart message. This can be useful when
a message contains large multimedia files alongside a short text element.
o Organize mails on the server: In the case of POP3, the user is not allowed to manage the
mails on the server. With IMAP, on the other hand, users can organize the mails on the
server according to their requirements; for example, they can create, delete, or rename
mailboxes on the server.
o Search: Users can search the contents of the emails.
o Check email header: Users can also check the email header prior to downloading.
o Create hierarchy: Users can also create folders to organize the mails in a
hierarchy.
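Several of these features (searching, checking headers before downloading, leaving mail on the server) can be sketched with Python's standard imaplib module. The host, user name, and password are supplied by the caller, the function names are hypothetical, and the TLS variant of the client class is an assumption made for this example. The mailbox is opened read-only and the headers are fetched with a peek request so that the messages are not marked as seen.

```python
import imaplib


def parse_search(data):
    # A search reply is a list holding one space-separated byte string of numbers.
    return [int(n) for n in data[0].split()] if data and data[0] else []


def unseen_headers(host: str, user: str, password: str, mailbox: str = "INBOX"):
    with imaplib.IMAP4_SSL(host) as conn:             # IMAP over TLS
        conn.login(user, password)
        conn.select(mailbox, readonly=True)           # do not change flags
        _, data = conn.search(None, "UNSEEN")         # server-side search
        headers = {}
        for number in parse_search(data):
            # Fetch only the header part; the body stays on the server.
            _, parts = conn.fetch(str(number), "(BODY.PEEK[HEADER])")
            headers[number] = parts[0][1]
        return headers


if __name__ == "__main__":
    print(parse_search([b"1 2 3"]))
```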
The IMAP protocol synchronizes all the devices with the main server. Suppose we
have three devices, a desktop, a mobile, and a laptop, as shown in the above figure. If all
these devices are accessing the same mailbox, it will be synchronized across all of them.
Here, synchronization means that when a mail is opened on one device, it will be
marked as opened on all the other devices, and if we delete the mail, it will also be
deleted from all the other devices. In
IMAP, we can see all the folders such as spam, inbox, sent, etc. We can also create our own
folder, known as a custom folder, which will be visible on all the other devices.
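The reason every device sees the same state is that the state lives on the server rather than on any one device. A minimal in-memory model of that idea is sketched below; the Mailbox and Device classes are hypothetical and involve no networking.

```python
class Mailbox:
    """Server-side mailbox: the single source of truth for all devices."""

    def __init__(self, subjects):
        self.messages = {
            number: {"subject": subject, "seen": False}
            for number, subject in enumerate(subjects, start=1)
        }

    def open(self, number):
        # Opening a mail sets its "seen" flag on the server.
        self.messages[number]["seen"] = True
        return self.messages[number]["subject"]

    def delete(self, number):
        del self.messages[number]


class Device:
    """A client that always reads its view from the shared server mailbox."""

    def __init__(self, name, mailbox):
        self.name = name
        self.mailbox = mailbox

    def view(self):
        return {n: m["seen"] for n, m in self.mailbox.messages.items()}


if __name__ == "__main__":
    server = Mailbox(["Invoice", "Meeting notes"])
    desktop, mobile = Device("desktop", server), Device("mobile", server)
    server.open(1)          # opened from one device
    print(mobile.view())    # the other device sees it as opened
```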