Unit III Ddis
Authentication in Distributed
Systems
• Authentication is identification plus verification. Identification is the procedure whereby an
entity claims a certain identity, while verification is the procedure whereby that claim is checked.
Thus the correctness of an authentication relies heavily on the verification procedure employed.
• The entities in a distributed system that can be distinctly identified are collectively referred to as
principals. There are three main types of authentication of interest in a distributed system:
• (A1) message content authentication: verifying that the content of a message received is the
same as when it was sent;
• (A2) message origin authentication: verifying that the sender of a received message is the
same one recorded in the sender field of the message; and
• (A3) general identity authentication: verifying that a principal’s identity is as claimed.
• (A1) is commonly handled by tagging a key-dependent message authentication code (MAC) onto
a message before it is sent.
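The MAC tagging described for (A1) can be sketched as follows. This is a minimal illustration, assuming HMAC-SHA256 from Python's standard library as the MAC and a key already shared by sender and receiver; the key value and the function names `tag_message` and `verify_message` are hypothetical and not part of the original slides.

```python
import hashlib
import hmac


def tag_message(key: bytes, message: bytes) -> bytes:
    """Sender side: compute a key-dependent MAC tag for the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_message(key: bytes, message: bytes, tag: bytes) -> bool:
    """Receiver side: recompute the tag and compare in constant time."""
    expected = tag_message(key, message)
    return hmac.compare_digest(expected, tag)


# Hypothetical shared key and message, for illustration only.
shared_key = b"demo-shared-key"
msg = b"transfer 10 units to B"
tag = tag_message(shared_key, msg)

print(verify_message(shared_key, msg, tag))                        # unmodified message
print(verify_message(shared_key, b"transfer 99 units to B", tag))  # altered content
```

A receiver that holds the same key accepts the message only if the recomputed tag matches, so any change to the content in transit is detected.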
A simple classification of authentication protocols
Authentication Protocol Paradigms
Protocols Based upon Symmetric Cryptosystems
• Authentication Server
Example: Kerberos Authentication Service (Symmetric Cryptosystem)
1. The Authentication Protocol:
2. Initial Authentication at login:
3. Obtain a ticket for the server:
4. Requesting Service from the server:
Protocols Based upon Asymmetric Cryptosystems
Example: SSL Protocol
2. SSL handshake protocol:
Peer to Peer Networks and Unstructured Overlays
Peer to Peer Networks
Overlay Network
• Logical network built on top of a physical network.
• Nodes in the overlay are a subset of the nodes of the underlying physical network.
• Multiple overlay networks are possible. Each overlay corresponds to a different application.
Structured and Unstructured Overlays
• A core mechanism in P2P networks is searching for
data, and this mechanism depends on how (i) the
data, and (ii) the network, are organized.
• Search algorithms for P2P networks tend to be
data-centric, as opposed to the host-centric
algorithms for traditional networks.
• P2P search uses the P2P overlay, which is a logical
graph among the peers, that is used for the object
search and object storage and management
algorithms.
• The P2P overlay is the application layer overlay,
where communication between peers is point-to-point
(representing a logical all-to-all connectivity)
once a connection is established. The P2P overlay
can be structured or unstructured; an unstructured
overlay uses no particular graph structure. Object
storage and search strategies are intricately linked
to the overlay structure as well as to the data
organization mechanisms.
Data indexing
Data identified by indexing, which allows physical data
independence from apps.
• Centralized indexing, e.g., versions of Napster, DNS
• Distributed indexing. Indexes to data scattered across
peers. Access data through mechanisms such as
Distributed Hash Tables (DHT). These differ in hash
mapping, search algorithms, diameter for lookup, fault
tolerance, churn resilience. A typical DHT uses a flat
key space.
• Local indexing. Each peer indexes only the local objects.
Remote objects need to be searched for. Used commonly in
unstructured overlays (e.g., Gnutella) along with flooding
search or random walk search.
Another classification:
• Semantic indexing: human readable, e.g., filename,
keyword, database key. Supports keyword searches,
range searches, approximate searches.
• Semantic-free indexing. Not human readable.
Corresponds to index obtained by use of hash function.
Simple Distributed Hash Table scheme
[Figure: Mappings from node address space and object space in a
simple DHT.]
• Highly deterministic placement of files/data allows
fast lookup.
• But file insertions/deletions under churn incur some
cost.
• Attribute search, range search, keyword search etc.
not possible.
Chord
Two steps involved.
• Map the object value to its key
• Map the key to the node in the
native address space using
lookup
• Common address space is an m-
bit identifier (2^m addresses), and
this space is arranged on a
logical ring mod 2^m.
Example:
Create an initial Chord network with 16 nodes
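The two mapping steps above can be sketched in a few lines. This is a simplified illustration that assumes a global, sorted view of all node identifiers and uses the standard Chord rule that a key is stored at its successor, the first node whose identifier is equal to or follows the key on the ring. It does not implement Chord's distributed lookup. The ring size `M = 6`, the node identifiers, the use of SHA-1, and the names `chord_id` and `successor` are assumptions chosen for the example.

```python
import hashlib
from bisect import bisect_left

M = 6  # assumed number of identifier bits, giving a ring of 2**M = 64 addresses


def chord_id(name: str, m: int = M) -> int:
    """Step 1: map an object value (or node address) to an m-bit key."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


def successor(node_ids: list[int], key: int) -> int:
    """Step 2: map a key to the first node at or after it on the ring.

    node_ids must be sorted; the modulo wraps past the largest id to the smallest.
    """
    i = bisect_left(node_ids, key)
    return node_ids[i % len(node_ids)]


# Hypothetical node identifiers already placed on the 64-address ring.
nodes = sorted([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])

print(successor(nodes, 10))  # key 10 is stored at node 14
print(successor(nodes, 60))  # key 60 wraps around the ring to node 1
print(successor(nodes, chord_id("song.mp3")))
```

In a real Chord network no node holds the full list; each node resolves the successor by forwarding the lookup along the ring.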
Content Addressable Network (CAN)
Tapestry
P2P Searching Algorithms
1. Search for file, data, or peer
2. Unstructured
• Napster, Gnutella, KaZaA, eDonkey, etc.
3. Structured
• Chord, Pastry, Tapestry, CAN, etc.
Napster
Centralized Directory
• A centralized directory resembles a client-server architecture
in that a large central server provides the directory service.
All peers inform this central server of their IP address and the
files they are making available for sharing. The server queries
the peers at regular intervals to check whether they are still
connected. The server thus maintains a large database recording
which files are present at which IP addresses. The first system
to use this method was Napster, for MP3 distribution.
Working
• When a requesting peer comes in, it sends its query to
the server.
• Since the server has all the information about its peers, it
returns the IP addresses of all the peers having the requested
file to the requesting peer.
• The file transfer then takes place directly between the
requesting peer and one of the peers that holds the file.
The major problem with such an architecture is that there is a
single point of failure. If the server crashes, the whole P2P network
crashes. Also, since all of the processing is done by a single
server, a huge database has to be maintained and regularly
updated.
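The centralized directory described above can be sketched as an in-memory index. This is an illustrative model only, not Napster's actual protocol: the class name `CentralDirectory`, its methods, and the sample IP addresses are hypothetical, and the liveness check is reduced to an explicit `unregister` call.

```python
from collections import defaultdict


class CentralDirectory:
    """Toy central server: maps each file name to the peers that share it."""

    def __init__(self):
        self._files = defaultdict(set)  # file name -> set of peer addresses

    def register(self, peer_addr, file_names):
        """A peer announces its address and the files it shares."""
        for name in file_names:
            self._files[name].add(peer_addr)

    def unregister(self, peer_addr):
        """Remove a peer that failed the periodic liveness check."""
        for peers in self._files.values():
            peers.discard(peer_addr)

    def query(self, file_name):
        """Return the addresses of all peers holding the requested file."""
        return sorted(self._files.get(file_name, set()))


directory = CentralDirectory()
directory.register("10.0.0.1", ["a.mp3", "b.mp3"])
directory.register("10.0.0.2", ["a.mp3"])
print(directory.query("a.mp3"))  # both peers hold a.mp3
```

The server only answers the lookup; the requesting peer then contacts one of the returned addresses to transfer the file. Every query depends on this one object, which is the single point of failure noted above.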
Gnutella
Query Flooding
Query flooding runs over a distributed overlay
network (a graph-like structure) with no central
directory. Gnutella was the first decentralized
peer-to-peer network.
Working
• When one peer requests a file, the request is
sent to all its neighboring nodes, i.e., to all
nodes connected to this node. If those nodes
don’t have the required file, they pass the
query on to their neighbors, and so on. This is
called query flooding.
• When a peer with the requested file is found
(referred to as a query hit), the flooding stops
at that peer, which sends back the file name and
file size to the client along the reverse of the
path the query took.
• If there are multiple query hits, the client
selects one of these peers.
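The flooding steps above can be simulated on a small overlay graph. This is a sketch under stated assumptions rather than the Gnutella wire protocol: the hop limit `ttl` and the shared `visited` set used to suppress duplicate forwarding are simplifications added for the example, and the function name `flood_query`, the peer names, and the file name are hypothetical.

```python
from collections import deque


def flood_query(neighbors, files, start, wanted, ttl=3):
    """Flood a query from `start`; return {hit peer: reverse path back to start}."""
    hits = {}
    visited = {start}  # assumption: each peer handles a given query only once
    queue = deque([(start, [start], ttl)])
    while queue:
        peer, path, hops_left = queue.popleft()
        if peer != start and wanted in files.get(peer, set()):
            # Query hit: the reply travels back along the reversed query path.
            hits[peer] = list(reversed(path))
            continue  # a peer that has the file does not forward the query
        if hops_left == 0:
            continue  # assumed hop limit reached, stop forwarding
        for nxt in neighbors.get(peer, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, path + [nxt], hops_left - 1))
    return hits


# Hypothetical overlay: adjacency lists and the files each peer shares.
neighbors = {
    "A": ["B", "C"], "B": ["A", "D"], "C": ["A", "E"],
    "D": ["B", "F"], "E": ["C"], "F": ["D"],
}
files = {"D": {"song.mp3"}, "E": {"song.mp3"}, "F": {"song.mp3"}}

hits = flood_query(neighbors, files, "A", "song.mp3")
print(hits)  # the client A then picks one of the hit peers
```

Peer F also holds the file but is never asked, because D answers the query instead of forwarding it. With multiple hits returned, the client chooses which peer to download from.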
Gnutella: Gnutella represents a new wave of
P2P applications providing distributed discovery
and sharing of resources across the Internet.
Gnutella is distinguished by its support for
anonymity. A Gnutella network consists of a
dynamically changing set of peers connected
using TCP/IP.