Problem Definition: Application Aware Backup System

The document describes an application aware backup system that will develop APIs for backing up Linux files, applications, and network devices to the cloud in an optimized manner. The system will allow for selective retrieval of backed up data from the cloud. The goal is to provide open source APIs that can be used by others to optimize storage of data in the cloud and selectively retrieve data from the cloud for Linux systems.

Uploaded by

swapy_04
Copyright
© Attribution Non-Commercial (BY-NC)

Application Aware Backup System

PROBLEM DEFINITION

Today, most organizations want to store their important data in the cloud. The Application
Aware Backup System is a fully supported solution for backing up Linux files, applications, and
network devices to the cloud. We are going to develop APIs for backing up data to the cloud
and restoring it in an optimized manner. The restore APIs will let users retrieve data
selectively. The Application Aware Backup System is open source backup and recovery
software. Our objective is to provide cost-effective data storage in the cloud. This solution
provides a cutting-edge option for small, medium, and large businesses.

Our aim is to provide open source APIs that anyone can use in their own
applications. These APIs will provide optimized storage of data in the cloud and selective
retrieval of data from the cloud for Linux-based systems.


CHAPTER 1: LITERATURE REVIEW

To understand the importance of our project, one has to know the following basics
related to our topic:

Cloud Storage:

Clients would be able to access their applications and data from anywhere at any time.
They could access the cloud computing system using any computer linked to the Internet.
Data wouldn't be confined to a hard drive on one user's computer or even a corporation's
internal network.

It could bring hardware costs down. Cloud computing systems would reduce the need
for advanced hardware on the client side. You wouldn't need to buy the fastest computer with
the most memory, because the cloud system would take care of those needs for you. Instead,
you could buy an inexpensive computer terminal. The terminal could include a monitor, input
devices like a keyboard and mouse and just enough processing power to run the middleware
necessary to connect to the cloud system. You wouldn't need a large hard drive because you'd
store all your information on a remote computer.

Corporations that rely on computers have to make sure they have the right software in
place to achieve goals. Cloud computing systems give these organizations company-wide
access to computer applications. The companies don't have to buy a set of software or
software licenses for every employee. Instead, the company could pay a metered fee to a
cloud computing company. Corporations might save money on IT support. Streamlined
hardware would, in theory, have fewer problems than a network of heterogeneous machines
and operating systems.

Servers and digital storage devices take up space. Some companies rent physical
space to store servers and databases because they don't have it available on site. Cloud
computing gives these companies the option of storing data on someone else's hardware,
removing the need for physical space on the front end. If the cloud computing system's back
end is a grid computing system, then the client could take advantage of the entire network's
processing power. Often, scientists and researchers work with calculations so complex that it
would take years for individual computers to complete them. On a grid computing system,
the client could send the calculation to the cloud for processing. The cloud system would tap
into the processing power of all available computers on the back end, significantly speeding
up the calculation.

ASP Architecture:

Fig. 1.1. Cloud Architecture and layers in cloud


Archive Services Platform provides the foundation for an enterprise-class system that
incorporates secure storage, rich search functionality, WORM retention capabilities, and
simple recovery. The architecture of the Archive Services Platform is designed to handle the
complex issues of digital asset management:

• Scalability
• Reliability
• Performance
• Search capabilities
• Storage capabilities, including strong retention policies
• Security

SCALABILITY

The Archive Services Platform is designed for horizontal scalability in small, cost-
effective increments, with no known theoretical limit. Each individual service can easily scale
by adding servers. For example, adding storage servers with additional disk drives increases
storage capacity, and adding search nodes increases the capacity for handling search requests.

RELIABILITY

The design of the Archive Services Platform is reliable, providing outstanding system
availability and failure resiliency. To ensure the availability of digital assets, the Archive
Services Platform stores four copies of all data assets in two geographically separated data
centers.

To further support high availability, the Archive Services Platform is designed with
no single point of failure. For example, computing capacity is distributed among many
different computers, and storage capacity is distributed across multiple volumes. Plus, all
operations are distributed across multiple data centers. As a result, the Archive Services
Platform is resilient against isolated failures, from a single node failing to a site outage.


PERFORMANCE

The distributed architecture of the Archive Services Platform also provides fast performance
for extensive searches across large amounts of stored assets. The platform is designed to:

• Return 99 percent of simple text search results within 2 seconds. More complex
searches, such as those including wildcards or ranges, may take longer.
• Ingest thousands of assets per minute.
• Destroy thousands of assets per minute.
• Make assets searchable within 30 minutes after ingestion, on average.
• Support 5 search queries per second for each instance.
• Accept assets with no pre-imposed size limit.

SEARCH CAPABILITIES

The distributed structure of the Archive Services Platform also provides the ability to
search for assets easily, which helps retrieve desired assets quickly and efficiently.

• Interactive Searching
The interactive search function returns the metadata of the identified assets, so
that a partner or custom application can allow end users to further refine the search
criteria and review the resulting assets.

• Bulk Searching
The bulk search function enables a partner or custom application to specify
large numbers of assets for retrieval. There are also options for destroying assets or
modifying the metadata of assets.

• Full-text Indexing
The Archive Services Platform provides complete indexes to all text within
both digital assets and the assets’ associated metadata. An optimized index enables
complex searches for text and other search criteria.

Extra features of the searching mechanism are as follows:

• Federated Search Model
• Search Support for Multiple Formats
• Rich Querying Capabilities
• Boolean expressions
• Phrases
• Keywords
• Wildcards
• Date Ranges
• Flexible Asset Metadata

STORAGE CAPABILITIES

Storage management is a key part of the design of the Archive Services Platform,
which provides a reliable and secure platform for large-scale digital asset storage.

SECURITY

Security is at the heart of every product and service. The Archive Services Platform
encrypts every asset with a unique 256-bit key using the AES encryption algorithm. Any
hostile attempt to reconstitute the key for an asset would require assembling information from
a large number of disk volumes distributed across multiple server nodes. Therefore, any
hostile attempt to decrypt any asset would require possession of a large fraction of all the disk
volumes, rendering such an attempt practically impossible. Strong encryption also permits
efficient, secure destruction of an asset by overwriting its fixed-length key using a data-
shredding algorithm.

Certificate-based Access Controls

A digital certificate is used to authenticate every connection to the archive, and link to
the customer's identity. The customer identity is then used to check each operation at each
step in the workflow, whether search or retrieval, to ensure that each customer can access
only their own data, not another customer's data.


Data Integrity

The Archive Services Platform computes SHA-256 hashes of each data asset, and
checks those hashes for discrepancies on every step of storage and retrieval, as well as during
routine periodic integrity sweeps. This integrity validation is also made available to the
partner or custom application, so that it can independently check the results of transmission,
in addition to the TLS protocol that is also checking each data packet transmitted.
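
The hash check described above can be sketched in Python as follows. This is only an illustration built on the standard `hashlib` module; the function names `sha256_of` and `verify_asset` are hypothetical and are not part of the Archive Services Platform API.

```python
import hashlib
import io

def sha256_of(stream, chunk_size=65536):
    """Return the hex SHA-256 digest of a binary stream, read in chunks."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()

def verify_asset(stream, expected_hex):
    """True when the stream's digest matches the hash recorded at ingestion."""
    return sha256_of(stream) == expected_hex

# Record a hash at ingestion, then re-check it after retrieval.
original = b"backup payload"
recorded = sha256_of(io.BytesIO(original))
assert verify_asset(io.BytesIO(original), recorded)
assert not verify_asset(io.BytesIO(b"corrupted payload"), recorded)
```

A client application could run the same check on its side after a download, independently of the checks performed by TLS during transmission.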

Transmission Security

To ensure the secure transmission of assets from partners and customers to Iron
Mountain, Iron Mountain supports the Secure Sockets Layer (SSL) and Transport Layer
Security (TLS) communications protocols, secured by digital certificates. TLS and its
precursor, SSL, are communications protocols that provide secure client/server
communication across networks, such as the Internet. TLS and SSL feature the following
security measures:

• Endpoint-to-endpoint secure transmission: All data is encrypted during transit. As
mentioned above, all data is also encrypted during storage.
• Two-way validation, or authentication, of the certificate owner to ensure that the
entity is who they claim to be. Although SSL/TLS also supports unilateral
authentication, Iron Mountain uses bilateral authentication for added security. Both
the client and the server must prove to each other who they are by using digital
certificates.
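
As a rough illustration of bilateral (mutual) authentication, the sketch below configures TLS contexts with Python's standard `ssl` module. The function names and the certificate file parameters are assumptions made for this example; nothing here describes Iron Mountain's actual configuration.

```python
import ssl

def make_server_context(certfile=None, keyfile=None, client_ca=None):
    """Build a server-side TLS context that demands a client certificate."""
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    ctx.verify_mode = ssl.CERT_REQUIRED      # reject clients without a valid certificate
    if certfile:
        ctx.load_cert_chain(certfile, keyfile)        # the server's own identity
    if client_ca:
        ctx.load_verify_locations(cafile=client_ca)   # CA that signed the client certificates
    return ctx

def make_client_context(certfile=None, keyfile=None):
    """Build a client-side context that verifies the server and can present a certificate."""
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
    if certfile:
        ctx.load_cert_chain(certfile, keyfile)        # the client's own identity
    return ctx
```

With unilateral authentication only the client verifies the server; setting `verify_mode` to `ssl.CERT_REQUIRED` on the server side is what makes the check two-way.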

Physical Security

Iron Mountain owns or leases off-site data bunkers that provide high-security,
environmentally-controlled storage for media, and includes data centers with redundant
infrastructure. These data bunkers include the following security measures:

• Extensive multi-acre underground sites.
• Gated entrances with 24x7 security guards.
• Restricted access requiring photo ID and escort.
• Real-time closed circuit TV monitoring.
• Commercial power feeds with generators for full backup power.
• Internal and external 24x7 environmental monitoring alarms for temperature,
“waterbug” leaks, smoke, fire, and motion detection.
• External accreditation by the Uptime Institute according to their Tier Classification
and Performance Standard.

COMPONENTS

Iron Mountain's Archive Services Platform architecture is designed to provide high
availability for all services. All data is replicated to two sites, and the secondary site contains
infrastructure and equipment sufficient to provide all services in the case of a network outage
to the primary site. Site-to-site replication utilizes a private multi-gigabit network link that
provides secure transmission together with other Iron Mountain Digital storage services.

Major Components of ASP

ARCHIVE SERVICES API NODES

The Archive Services API nodes provide REST and SOAP web services connection points.
These consist of multiple nodes connected to a load balancer. The API nodes perform authentication
and authorization for all requests for ingestion, retrieval, search, and destruction services. Requests
are then distributed to nodes within the storage grid and search grid, depending on the type of request.
API nodes can scale separately from grid nodes in order to increase connection capacity.

STORAGE GRID

The Storage Grid provides high-density storage based on commodity server equipment with
industry-standard high-capacity disk drives. The servers within the storage grid distribute data to
multiple nodes in order to enforce redundancy, data optimization, and encryption.

Periodic data integrity checks use pre- and post-encryption hashes calculated with the SHA-
256 hash algorithm. The storage grids also store segments of the keys needed to decrypt data.
Multiple storage nodes are required to combine shared key material to allow decryption.
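
The source does not say how the key segments are produced. As a minimal sketch of the general idea, the example below uses simple XOR splitting, in which every share is needed to rebuild the key; this is a swapped-in technique chosen for brevity, and `split_key` and `combine_shares` are hypothetical names.

```python
import secrets

def split_key(key, n_shares):
    """Split key into n_shares byte strings; all of them are needed to rebuild it."""
    if n_shares < 2:
        raise ValueError("need at least two shares")
    shares = [secrets.token_bytes(len(key)) for _ in range(n_shares - 1)]
    last = key
    for share in shares:
        last = bytes(a ^ b for a, b in zip(last, share))
    return shares + [last]

def combine_shares(shares):
    """XOR all shares together to recover the original key."""
    key = bytes(len(shares[0]))   # all-zero starting value
    for share in shares:
        key = bytes(a ^ b for a, b in zip(key, share))
    return key

key = secrets.token_bytes(32)      # a 256-bit key, as in the text
shares = split_key(key, 4)         # for example, one share per storage node
assert combine_shares(shares) == key
```

Any single share is indistinguishable from random bytes, which matches the claim that an attacker would need material from many volumes before decryption becomes possible.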

SEARCH GRID

The Search Grid provides low-latency metadata and full-text search capabilities by
distributing indexes across many nodes. Search indexes are also replicated across sites, allowing
failover for searches in the event of the failure of a search node. During ingestion, Text Extraction
services extract text from industry-standard document types, and stream the text directly to the
indexing services collocated with the query servers.

Search nodes also provide bulk search capability by visiting each search node in turn and
aggregating results into a single stream of results returned through the API nodes.
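
The bulk search pattern, visiting each node in turn and merging the hits into one stream, can be sketched as below. The in-memory dictionaries stand in for search nodes, and the names `search_node` and `bulk_search` are assumptions for illustration only.

```python
from itertools import chain

def search_node(index, term):
    """Yield IDs of assets on one node whose text contains the term."""
    for asset_id, text in index.items():
        if term.lower() in text.lower():
            yield asset_id

def bulk_search(nodes, term):
    """Visit each node in turn and chain the hits into one result stream."""
    return chain.from_iterable(search_node(index, term) for index in nodes)

nodes = [
    {"a1": "quarterly backup report", "a2": "mail archive"},
    {"b1": "Backup policy", "b2": "invoice"},
]
print(list(bulk_search(nodes, "backup")))   # ['a1', 'b1']
```

Because the result is a lazy iterator, the caller can start consuming hits from the first node before later nodes have been visited.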

TRANSACTION MONITORING, REPORTING, AND BILLING

Several databases track customer transactions, provide data aggregation for reporting, and
prepare bills based on storage usage by customers. Reporting services are separate from transactional
services, in order to ensure consistent transaction performance.

Sendmail:

While going through the literature, we first reviewed Linux-based server
systems (the Sendmail server, the Postgres server, etc.). Sendmail acts as a post office to which all
messages can be submitted for routing. Sendmail can interpret both Internet-style addressing
(that is, user@domain) and UUCP-style addressing (that is, host!user). How addresses are
interpreted is controlled by the Sendmail configuration file. Sendmail can rewrite message
addresses to conform to standards on many common target networks. This gives a brief overview of
the Sendmail server’s functionality.
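
To make the two address styles concrete, the following sketch classifies an address as Internet-style or UUCP-style. It is not how Sendmail works internally (Sendmail applies rewriting rules from its configuration file); `parse_address` is a hypothetical helper written only for illustration.

```python
def parse_address(address):
    """Classify an address as Internet-style or UUCP-style and split it."""
    address = address.strip()
    if "@" in address:
        user, _, domain = address.rpartition("@")
        return {"style": "internet", "user": user, "host": domain}
    if "!" in address:
        # For a multi-hop bang path, everything after the first host is kept as the user part.
        host, _, user = address.partition("!")
        return {"style": "uucp", "user": user, "host": host}
    return {"style": "local", "user": address, "host": None}

print(parse_address("user@domain"))
print(parse_address("host!user"))
```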

REST:

As a programming approach, REST is a lightweight alternative to Web Services and
RPC. Much like Web Services, a REST service is:

• Platform-independent (you don't care if the server is Unix, the client is a Mac, or
anything else),
• Language-independent (C# can talk to Java, etc.),
• Standards-based (runs on top of HTTP), and
• Can easily be used in the presence of firewalls.

To understand REST, we went through the following details.
REST stands for Representational State Transfer. (It is sometimes spelled "ReST".) It
relies on a stateless, client-server, cacheable communications protocol -- and in virtually all
cases, the HTTP protocol is used.

REST is an architectural style for designing networked applications. The idea is that, rather
than using complex mechanisms such as CORBA, RPC or SOAP to connect between
machines, simple HTTP is used to make calls between machines.

• In many ways, the World Wide Web itself, based on HTTP, can be viewed as a
REST-based architecture.

RESTful applications use HTTP requests to post data (create and/or update), read data (e.g.,
make queries), and delete data. Thus, REST uses HTTP for all four CRUD
(Create/Read/Update/Delete) operations.
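
A minimal sketch of the CRUD-to-HTTP mapping is shown below, using Python's standard `urllib.request`. The URL `http://backup.example/assets` and the JSON payload are hypothetical, and mapping update to PUT is a common convention rather than something the text prescribes (the text notes that POST can serve for both create and update). The request is built but not sent.

```python
import json
import urllib.request

# Conventional mapping of CRUD operations onto HTTP methods.
CRUD_TO_HTTP = {"create": "POST", "read": "GET", "update": "PUT", "delete": "DELETE"}

def build_request(operation, url, payload=None):
    """Return an unsent urllib Request for the given CRUD operation."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    request = urllib.request.Request(url, data=data, method=CRUD_TO_HTTP[operation])
    if data is not None:
        request.add_header("Content-Type", "application/json")
    return request

req = build_request("create", "http://backup.example/assets", {"name": "etc.tar"})
print(req.get_method(), req.full_url)   # POST http://backup.example/assets
```

Passing the returned object to `urllib.request.urlopen` would perform the call against a real service.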

REST is a lightweight alternative to mechanisms like RPC (Remote Procedure Calls) and
Web Services (SOAP, WSDL, et al.).

• Despite being simple, REST is fully-featured; there's basically nothing you can do in
Web Services that can't be done with a RESTful architecture.

REST is not a "standard". There will never be a W3C recommendation for REST, for
example. And while there are REST programming frameworks, working with REST is so
simple that you can often "roll your own" with standard library features in languages like
Perl, Java, or C#.

While explaining why they chose REST over SOAP, Yahoo! developers wrote that they "believe
REST has a lower barrier to entry, is easier to use than SOAP, and is entirely sufficient for
[Yahoo's] services" (Yahoo! Developer Network FAQ, as of February 2008).


Advantages:

• The main advantages of ROA (Resource-Oriented Architecture, the REST-based approach) are ease of
implementation, agility of the design, and a lightweight approach.
• Another advantage of REST is performance: with better cache support, lightweight
requests and responses, and easier response parsing, REST allows for nimbler clients
and servers, and reduces network traffic, too.
• As REST matures, expect it to become better understood and more popular even in
more conservative industries.

Disadvantages:

• The main advantage of SOAP-based SOA over ROA is the more mature tool support;
however, this could change over time.
• Another SOA advantage is the type safety of XML requests (for responses,
ROA can also use XML if the developers desire it).


CHAPTER 2: SOFTWARE REQUIREMENTS SPECIFICATION

2.1. Introduction

2.1.1 Project Scope

Before starting any new project, one has to identify the need for it. Windows-based
application-aware backup systems (such as MS DPM) offer features such as the MS file
filter and filter driver, and techniques such as Volume Shadow Copy are available to
Windows-based applications. No comparable application is available on Linux-based
systems. To fill this gap, we are going to implement an Application Aware Backup System
for the Linux environment. Because the backup is stored in the cloud, the system can be
used inside and outside the intranet, as required, by any organization running Linux-based
web servers. We are going to implement APIs for the different functions: backup, restore,
search, and so on. These APIs will help storage analysts store data in an optimized manner,
and the restore APIs will support selective retrieval of data. The APIs will be released as
open source so that anyone can use them to develop their own applications. A proactive
approach will be taken to reduce organizations' storage costs in the cloud. In future, the
Application Aware Backup System could also be used in distributed applications.

2.1.2 User Characteristics

Role            Characteristics
Administrator   He/she can create user accounts, grant access to users, and
                monitor the backup process.
User            Initiates the backup process and gives information about the data to
                back up or retrieve.

Table 2.1 User Characteristics


2.1.3 Operating Environment

We are using the Linux operating system. Linux was chosen for its security model: access
permissions are restrictive, and even administrative privileges must be granted explicitly.
It also provides a virtual file system (VFS) layer that gives a uniform interface to
files of any supported type. For these reasons, Linux-based servers are considered more
secure for transactions, and Linux was chosen as the operating environment.

2.1.4 Assumptions and constraints

2.1.4.1 Assumptions

We assume that we are permitted access to the Linux server architecture in order to
retrieve information from it. Because the backup is stored in the cloud, we also
assume that we have proper access to all the resources of the cloud storage. All
storage and security constraints are handled by the cloud itself, so we do not
have to concern ourselves with the integrity of the backup in the cloud.

2.1.4.2 Constraints

• The administrator must be licensed to use the cloud service.
• Before accessing the system, the user must be given proper access rights by the
administrator, who registers the user with the cloud.
• A user can handle only one Linux server.
• The user must specify how frequently the backup is to be taken.

2.2 System Features

Functional Requirements

This section describes the set of interfaces, split into subsections modeled for backup,
store/restore, search, and adding/removing accounts by the administrator. Once an account has
been created, the user can retrieve selected data from the cloud through the user interface.


• APIs

The APIs mainly contain a generic bridge for communicating with the ASP. They also
provide functions such as backup, store, restore, search, adding and deleting records, and
account creation. The APIs will be open source, so anyone can use them.
Each function is handled by its own API: the backup API performs backups, the
restore API performs restores, and so on.

• Service agent

The service agent is an application/service-aware layer that acts as the interface, or
bridge, between the Linux-based system and the API layer.

• Linux system

The server runs on this system. Data to be backed up resides on the Linux system, and
data restored from the cloud is stored there too. The Linux system is considered more secure
than other operating environments, which in turn makes transactions more secure.

• User interface

The user communicates with the backup system through the user interface, which
offers the choices of backup, restore, and search. Backup and
restore options are present in the user interface, along with a Browse button for selecting
the file that the user wants to back up or restore.

2.3 External Interface Requirements

• User Interfaces
The application should give the user access to
functions such as backup, restore, search, and adding/removing files. The user can
choose which of these functions to use and interacts with the
system through a GUI developed for this purpose.


• Software Interfaces
A service agent is provided to interface the developed APIs with the Linux system.
Through the service agent, the APIs act as the interface between the cloud and the Linux
system, storing data in the cloud in an optimized manner and retrieving it from the cloud
selectively.

• Communication Interfaces

Communication between the APIs and the Archive Services Platform, and therefore
between the Linux system and the cloud, is carried out using the SOAP or REST
protocol. SOAP provides a
simple and lightweight mechanism for exchanging structured and typed information
between peers in a decentralized, distributed environment using XML. SOAP does
not itself define any application semantics such as a programming model or
implementation specific semantics; rather it defines a simple mechanism for
expressing application semantics by providing a modular packaging model and
encoding mechanisms for encoding data within modules. This allows SOAP to be
used in a large variety of systems ranging from messaging systems to RPC.
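
To illustrate the packaging model, the sketch below builds a SOAP 1.1 envelope with Python's standard `xml.etree.ElementTree`. The envelope namespace is the standard SOAP 1.1 one; the application namespace `urn:example:backup`, the `RestoreFile` operation, and its `path` parameter are hypothetical and do not come from the Archive Services Platform API.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"   # SOAP 1.1 envelope namespace
APP_NS = "urn:example:backup"                            # hypothetical application namespace

def build_envelope(operation, params):
    """Wrap an application-defined operation and its parameters in a SOAP envelope."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element("{%s}Envelope" % SOAP_ENV)
    body = ET.SubElement(envelope, "{%s}Body" % SOAP_ENV)
    call = ET.SubElement(body, "{%s}%s" % (APP_NS, operation))
    for name, value in params.items():
        ET.SubElement(call, "{%s}%s" % (APP_NS, name)).text = str(value)
    return ET.tostring(envelope, encoding="unicode")

print(build_envelope("RestoreFile", {"path": "/etc/mail/aliases"}))
```

The envelope and body carry no meaning of their own; the semantics live entirely in the application-defined element placed inside the body, which is the point the paragraph above makes.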

2.4 Nonfunctional Requirements

2.4.1 Performance Requirements

The expected outcome will depend upon following factors:

1. Accessing Linux server architecture.

2. Deduplication and metadata formation.

3. Privileges to the multiple users over the cloud.

4. Proper utilization of the bandwidth over the cloud.

5. Efficient use of memory while creating the backup from server
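
Deduplication and metadata formation (item 2) can be sketched as a content-addressed chunk store. The report does not specify a deduplication scheme, so the fixed-size chunking and the `DedupStore` class below are assumptions for illustration.

```python
import hashlib

class DedupStore:
    """Store each distinct chunk once, keyed by its SHA-256 digest."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}      # digest -> chunk bytes
        self.manifests = {}   # file name -> ordered list of digests (the metadata)

    def backup(self, name, data):
        digests = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)   # skip chunks that are already stored
            digests.append(digest)
        self.manifests[name] = digests

    def restore(self, name):
        return b"".join(self.chunks[d] for d in self.manifests[name])

store = DedupStore(chunk_size=4)
store.backup("a.txt", b"AAAABBBBAAAA")
store.backup("b.txt", b"BBBBCCCC")
print(len(store.chunks))   # 3 distinct chunks stored for 5 chunk references
```

Only the chunks missing from the store would need to be uploaded, which is where the bandwidth and storage savings come from.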


2.4.2 Safety Requirements

To prevent data loss, the database management system used should be capable
of maintaining the atomicity property.
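
Atomicity means that a group of changes is applied entirely or not at all. The sketch below demonstrates this with Python's built-in `sqlite3` module; SQLite and the `backups` table are stand-ins chosen for the example, since the report does not name a database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE backups (name TEXT PRIMARY KEY, size INTEGER)")

def record_backup(conn, entries):
    """Insert all entries or none: the connection context manager
    commits on success and rolls back if an exception escapes."""
    with conn:
        conn.executemany("INSERT INTO backups VALUES (?, ?)", entries)

record_backup(conn, [("etc.tar", 120)])
try:
    # The second row repeats a primary key, so the whole batch is rolled back.
    record_backup(conn, [("home.tar", 900), ("etc.tar", 5)])
except sqlite3.IntegrityError:
    pass

print(conn.execute("SELECT name FROM backups").fetchall())   # [('etc.tar',)]
```

The row for `home.tar` does not survive even though its own insert succeeded, because the batch it belonged to failed as a whole.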

2.4.3 Security Requirements

Data being backed up or restored must be secure. For this purpose, an archive service layer
is provided in the cloud, which takes care of data security through encryption of data,
digital signatures, or other techniques. To authenticate users, user validation
can be provided in the backup system.

2.4.4 Software Quality Attributes

1. Accurate retrieval of selective backup from Linux server.

2. Deduplication and metadata formation.

3. Flexibility provision

4. Optimization.

First and most importantly, an integrated API allows for native communication between the
software application and the cloud storage provider. This eliminates the need for the
independent software vendor's (ISV's)
application to depend on the appliance for data movement. In addition, it allows the
ISV greater control over how and where data is stored, and even may save them time
in the development process, by not having to re-invent a wheel that is already
available. Finally API sets, like Iron Mountain’s Archive Services Platform API, can
provide the application with greater functionality beyond just simple cloud storage.
An integrated API set also provides the end user with a more simplified means of
interacting with the secondary storage tier and greater control over how that data is
stored or retrieved. This is because the interaction with the data is from within the
application which should eliminate external procedures to data management.


2.4.5 Hardware Requirement

1. System running Fedora 13 (Linux kernel 2.6)

2. Iron Mountain network infrastructure

3. Intranet

2.4.6 Software Requirement

1. SOAP/REST APIs

2. JDK 1.6


2.5. Analysis Model


2.5.1. Data Flow Diagram

Start

Backup path:
1. Take data to the local cache and through the staging area.
2. Transfer using the service agent and library file.
3. Backup file is stored in the cloud.

Restore path:
1. Select required data from the cloud.
2. Data is transferred through the SOAP/REST protocol.
3. Required file is stored in the local cache.
4. File is fetched from the local cache.

Stop

Fig. 2.1 Data Flow Diagram
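
The two paths of the data flow diagram can be walked through with a small in-memory simulation. Everything here is hypothetical: `CloudStub` stands in for the cloud store, and the plain function calls stand in for the service agent and the SOAP/REST transfer.

```python
class CloudStub:
    """In-memory stand-in for the cloud store (hypothetical)."""
    def __init__(self):
        self.objects = {}

def backup(files, cloud):
    """Backup path: stage data in a local cache, then transfer it to the cloud."""
    staging = dict(files)                 # local cache / staging area
    for name, data in staging.items():    # transfer step (service agent in the diagram)
        cloud.objects[name] = data
    return sorted(staging)

def restore(names, cloud):
    """Restore path: fetch only the selected files into a local cache."""
    local_cache = {n: cloud.objects[n] for n in names if n in cloud.objects}
    return local_cache

cloud = CloudStub()
backup({"/etc/hosts": b"127.0.0.1 localhost", "/etc/motd": b"hello"}, cloud)
print(restore(["/etc/motd"], cloud))   # {'/etc/motd': b'hello'}
```

Selective retrieval shows up in `restore`: only the requested names are transferred, not the whole backup set.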

2.5.2 Class Diagram

Fig. 2.2 Class Diagram

2.5.3 State-transition Diagrams

Fig. 2.3 Account Creation

Fig. 2.4 Backup Creation

Fig. 2.5 Retrieval of Selected Information

CHAPTER 3: SYSTEM DESIGN

3.1 System Architecture

Fig. 3.1 Overall System Design

Fig. 3.2 Architecture Design

3.2 UML Diagrams

3.2.1 Activity Diagram

Fig. 3.3 Activity Diagram

3.2.2 Sequence Diagram

Fig. 3.4 Account Creation

Fig. 3.5 Creation of Backup

Fig. 3.6 Retrieval of Information

3.2.3 Use Case Diagram

Fig. 3.7 Use Case Diagram

3.2.4 Component Diagram

Fig. 3.8 Component Diagram

CHAPTER 4: PROJECT APPLICATION

4.1 Advantages:

1. The proposed system will help organizations store data in the cloud without duplication,
using the APIs we develop.
2. It will reduce an organization's rental cost for cloud storage, because the
APIs avoid storing duplicate data.
3. It will minimize bandwidth usage by retrieving data selectively.
4. The APIs will be available as open source so that anyone can use them
in their applications.
5. As storage needs skyrocket, cloud storage services provide a cost-effective option for
offloading data and sparing companies the need to build out their storage
infrastructure.
6. While companies seek the most effective ways to manage data growth, there are
important points to consider and best practices to follow when weighing cloud storage
providers and solutions.
7. It is helping Independent Software Vendors (ISVs) deliver value to their customers
through comprehensive and secure cloud storage solutions.
8. Cloud archiving is cost-efficient because providers leverage economies of scale.
Billing to the end customer is incremental, based on how much data is stored rather
than a large up-front fee.
9. Cloud archiving providers also have a more sophisticated understanding of
compliance requirements, as they are accustomed to operating in a constantly
changing regulatory landscape.
10. Once business applications are integrated with a scalable cloud archiving solution,
users can easily archive data, regardless of increased storage needs, and retrieve it
when necessary.


4.2 Disadvantages:

The security of stored data depends on the ASP (Archive Services Platform), because
security is provided by the ASL (archive service layer). Data security is therefore
entirely dependent on the ASP.

4.3 Applications:

1. The Application Aware Backup System provides data backup to the cloud efficiently and at
minimum cost on the Linux platform.
2. The APIs will be available as open source so that anyone can use them in their
applications.
3. It will provide selective retrieval of data on the Linux platform.
4. Cloud archiving is an enabling technology that can benefit many markets.
Applications can make calls directly to the cloud archiving API or via a gateway,
exposing storage as if it were on-site.
5. Cloud archiving, therefore, offers hosted archiving that can be integrated into a wide
variety of applications used by customers and provided by ISVs and application
developers, including email archiving, content management, compliance, discovery
and management of unstructured information, and any application that creates or
manages large volumes of unstructured or semi structured content.
6. Cloud archiving is perfect for applications with data that is principally static in nature
or has specific data retention policies. An archive consists of records that have been
especially selected for long-term preservation. Documents and other artifacts should
be archived only when they are ready to be kept long-term. Archiving in general, and
cloud archiving in particular, are not suitable for applications that use the data
actively, such as online transaction processing applications.


CHAPTER 5: APPENDIX A

Glossary:

API: This means ‘Application Programming Interface’.

ASP: This is the ‘Archive Services Platform’, in which data is held in co-located data centers. Data is
centralized in a robust database and is always backed up. It is accessible from any location and is
compatible with Windows and Apple systems. The Archive Services Platform is a comprehensive storage and
information management platform. Unlike other storage cloud services, the Archive Services
Platform offers more than basic storage. It provides a secure and reliable storage solution for
all your data.

• Data is stored using 256-bit AES encryption technology
• The Platform is WORM enabled when retention is applied
• Data is stored redundantly within and across geographically split Data Centers

In addition to providing a comprehensive storage solution, the Archive Services Platform also
features extensive search capabilities. Data stored in the Platform is full-text indexed upon
ingestion and can be searched for from within your application.

Search by:

• ID
• Content
• Metadata

Other capabilities of the Platform include retention management and secure destruction. The
Platform is accessible by a web services API and command line interface.

• REST API – We have expanded our programming interfaces to include additional REST
calls.
• Standard Asset Type and Asset Format Support – For your end-users, you can now
streamline the process of indexing and categorizing their archived information by
allowing them to choose from a standard set of names for their archived information.
• Folders – The Archive Services Platform now provides your end-users with virtual
folders.
• Programmatic Customer On-boarding – We have streamlined the process of
on-boarding our partners' customers.
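
The search capabilities described above (full-text indexing upon ingestion, then search by ID, content, or metadata) can be illustrated with a minimal in-memory sketch. This is not the Archive Services Platform API: the class `ArchiveIndex`, its method names, and the sample records are all hypothetical, chosen only to show how indexing at ingestion time makes later content searches cheap.

```python
import re
from collections import defaultdict


class ArchiveIndex:
    """Toy in-memory archive (hypothetical): indexes record text on ingestion."""

    def __init__(self):
        self.records = {}                    # record ID -> (content, metadata)
        self.word_index = defaultdict(set)   # lowercase word -> set of record IDs

    def ingest(self, record_id, content, metadata=None):
        """Store a record and add every word of its content to the full-text index."""
        self.records[record_id] = (content, dict(metadata or {}))
        for word in re.findall(r"\w+", content.lower()):
            self.word_index[word].add(record_id)

    def search_by_id(self, record_id):
        """Return the (content, metadata) pair for an ID, or None if unknown."""
        return self.records.get(record_id)

    def search_by_content(self, query):
        """Return sorted IDs of records that contain every word of the query."""
        words = re.findall(r"\w+", query.lower())
        if not words:
            return []
        hits = set.intersection(*(self.word_index.get(w, set()) for w in words))
        return sorted(hits)

    def search_by_metadata(self, key, value):
        """Return sorted IDs of records whose metadata maps key to value."""
        return sorted(rid for rid, (_, meta) in self.records.items()
                      if meta.get(key) == value)


# Hypothetical sample data.
index = ArchiveIndex()
index.ingest("doc-1", "Quarterly backup report for the mail server", {"type": "report"})
index.ingest("doc-2", "Mail server configuration notes", {"type": "notes"})
print(index.search_by_content("mail server"))
print(index.search_by_metadata("type", "notes"))
```

Because the word index is built once at ingestion, a content search only intersects small sets of IDs instead of rescanning every stored document.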

ASL: This is the ‘Archive Service Layer’, which secures the data to be backed up.

SENDMAIL:

RPM packages required: sendmail, sendmail-cf and m4

Ubuntu/Debian install: sudo apt-get install sendmail sendmail-base sendmail-bin sendmail-cf
mailutils

Sendmail receives mail for local system user login accounts. Mail is held in a single file:
/var/mail/userID

Steps to run mail server using sendmail:

1. Required for inbound mail: The mail server must be identified by the DNS as the mail
server in order to receive mail. See the YoLinux web tutorial on configuring DNS.
2. /etc/mail/local-host-names (Required) (Red Hat 7.1 - Fedora Core 3)
/etc/sendmail.cw (Red Hat 6.x)

This file contains all of the alternate host names of the server. (i.e. domain-name.com)
Sendmail will not accept mail for a domain unless it is permitted to do so by the
contents of this file.

3. File /etc/aliases (Optional) lists alternative names for email recipients.


Sample:

webmaster: john, dave
postmaster: kim, garret
larry.anderson: larry
moe.anderson: moe
curly.anderson: curly

4. After creating or modifying this file, run the command newaliases, which
generates a new version of the file /etc/aliases.db. There is no need to restart the
sendmail daemon; the changes are picked up automatically.
5. File /etc/mail/virtusertable (Optional) allows the separation of emails by domain, e.g.
[email protected] and [email protected] go to two different users, greg1 and
greg2.

[email protected] dave
[email protected] john
[email protected] john
@domain-2.com error:nouser User unknown
@domain3.com mathew

6. The second column is the local user, a remote forwarding email address or a mailing
list entry in /etc/aliases. The domain "domain-2.com" will only receive email for
[email protected] and [email protected] while all other mail to this
domain receives an error message.
7. Convert /etc/mail/virtusertable to /etc/mail/virtusertable.db with the commands:
o cd /etc/mail
o make
8. Relaying and receiving mail is controlled by the file: /etc/mail/access. By default
relaying is only allowed by localhost and sendmail will accept mail from all. (Red Hat
7.1 default is more strict but the restriction is not from the access file. More below.)
Required for outbound email. Helpful for blocking some unwanted inbound email.

localhost.localdomain RELAY

localhost RELAY
127.0.0.1 RELAY

9. Generate database file:


10. [root prompt]# makemap hash /etc/mail/access.db < /etc/mail/access

11. The access file can be used to thwart spammers. After adding entries to the access
file, generate the database file with the command above.

XXX.XXX.XXX.XXX REJECT
YYY.YYY.YYY.YYY ERROR:"550 We don't accept mail from spammers"
[email protected] REJECT " Spam not accepted"
ZZZ.ZZZ.ZZZ.ZZZ OK - Override rules and allow

ZZZ.ZZZ OK - Allow from ZZZ.ZZZ.*.* network

12. See the /etc/mail/access file I am currently using. It changes daily. Feel free to cut and
paste this Sendmail access file to your system.
Other access lists:
o Iowa State University
o West-Point.org
o Wizcrafts.net: Exploited server IP block list

Sendmail.org: More info on cf-readme (See Anti-Spam section)

13. Sendmail must be running. See the YoLinux init tutorial to learn how the sendmail
daemon can be configured to be started by the system upon system boot. This may
have been configured during installation.
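
The access rules described in the steps above can be modelled with a short sketch. It assumes that a connecting IP address is matched against the most specific key first (the full address, then shorter network prefixes), which is consistent with the sample entries but simplifies what sendmail actually does; host names and email-address keys are not handled. The function names and the sample addresses (taken from the reserved documentation ranges) are hypothetical.

```python
SAMPLE_ACCESS = """
# key                   action
localhost.localdomain   RELAY
127.0.0.1               RELAY
192.0.2.7               REJECT
198.51.100              OK
198.51.100.9            ERROR:"550 We don't accept mail from spammers"
"""


def parse_access(text):
    """Parse access-file text into a dict mapping each key to its action."""
    rules = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(None, 1)  # key, then the rest of the line as the action
        if len(fields) == 2:
            rules[fields[0]] = fields[1].strip()
    return rules


def lookup_ip(rules, ip):
    """Return the action of the most specific matching key, or None if no key matches."""
    octets = ip.split(".")
    while octets:
        action = rules.get(".".join(octets))
        if action is not None:
            return action
        octets.pop()  # drop the last octet and retry with the shorter network prefix
    return None


rules = parse_access(SAMPLE_ACCESS)
for address in ("127.0.0.1", "192.0.2.7", "198.51.100.20", "198.51.100.9", "203.0.113.5"):
    print(address, "->", lookup_ip(rules, address))
```

A result of None here simply means that no entry applies, so the server's default policy decides what happens to the message.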

The default configuration is fairly secure and usable. For Red Hat 6 and earlier systems, you
are ready to mail. For Red Hat 7 systems, there is one more step. See changes below required
to receive mail.

34
Application Aware Backup System

Note: A user defined in the aliases file is valid for all domains hosted by the system, unless
you have configured virtual hosting.
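
How the virtusertable and aliases files combine can be sketched as follows. This is a simplified model, not sendmail's implementation: it assumes the full address is tried first, then the "@domain" catch-all, and that the result is then expanded through the aliases. The function names, addresses, and domains are hypothetical.

```python
def expand(name, aliases, seen=None):
    """Expand an alias into its final recipients; non-alias names map to themselves."""
    seen = set() if seen is None else seen
    if name in seen:
        return []  # guards against alias loops and duplicate recipients
    seen.add(name)
    if name not in aliases:
        return [name]
    recipients = []
    for member in aliases[name]:
        recipients.extend(expand(member, aliases, seen))
    return recipients


def resolve(address, virtusertable, aliases):
    """Return the local recipients for an address, or raise LookupError on an error entry."""
    local_part, _, domain = address.partition("@")
    # Most specific entry first: the full address, then the domain catch-all.
    target = virtusertable.get(address) or virtusertable.get("@" + domain)
    if target is None:
        target = local_part  # no virtual hosting entry: aliases apply for every domain
    if target.startswith("error:"):
        raise LookupError(target)
    return expand(target, aliases)


# Hypothetical tables mirroring the structure of the samples above.
virtusertable = {
    "sales@example.com": "dave",
    "info@example.com": "webmaster",
    "@example.org": "error:nouser User unknown",
    "@example.net": "mathew",
}
aliases = {
    "webmaster": ["john", "dave"],
    "postmaster": ["kim", "garret"],
}

print(resolve("info@example.com", virtusertable, aliases))
print(resolve("anyone@example.net", virtusertable, aliases))
print(resolve("postmaster@localhost", virtusertable, aliases))
```

The last call shows the point of the note above: with no virtusertable entry for the domain, the alias is honoured regardless of which hosted domain the mail was addressed to.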

For alternate configurations change the file: sendmail.cf

The config file sendmail.cf has become so complex that most people use the m4 macro
package to generate this file from a sendmail.mc file. Pre-configured ".mc" files can be found
in the directory:

• /etc/mail/ (Red Hat 9.0 - Fedora Core 3)
• /usr/lib/sendmail-cf/cf/ (Red Hat 7.1)
• /usr/share/sendmail-cf/cf/ (Red Hat 6.x)

Default Red Hat sendmail.cf configurations:

• Fedora Core 3, Red Hat Enterprise Linux 4, CentOS 4:

cd /etc/mail
make

(make checks for changes and rebuilds the data files.)


or perform the manual process:

m4 /etc/mail/sendmail.mc > /etc/mail/sendmail.cf

• Red Hat 9.0:

m4 /usr/share/sendmail-cf/m4/cf.m4 /etc/mail/sendmail.mc > /etc/mail/sendmail.cf

• Red Hat 7.1:

You will find that the files /etc/sendmail.cf and /usr/share/sendmail-cf/cf/redhat.cf
are the same; this is the Red Hat default.

Note: the cf.m4 file is represented as an include file in the sendmail "mc" macro file.
(include(`/usr/share/sendmail-cf/m4/cf.m4'))

• Red Hat 6.x:

You will find that the files /etc/sendmail.cf and /usr/lib/sendmail-cf/cf/redhat.cf
are identical; this is the Red Hat default.

cd /usr/lib/sendmail-cf/cf/
m4 ../m4/cf.m4 redhat.mc > /etc/sendmail.cf
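
The per-release m4 invocations listed above can be collected in a small lookup, sketched here as a dry-run helper. It only builds the command string and never runs it. The release keys and the names `M4_BUILDS` and `m4_command` are hypothetical labels; the paths are those given in the text, and Red Hat 7.1 is left out because no command is given for it.

```python
import shlex

# Paths as stated in the text above; the dictionary keys are illustrative labels.
M4_BUILDS = {
    "fedora-core-3": {
        "cwd": None,
        "inputs": ["/etc/mail/sendmail.mc"],
        "output": "/etc/mail/sendmail.cf",
    },
    "red-hat-9.0": {
        "cwd": None,
        "inputs": ["/usr/share/sendmail-cf/m4/cf.m4", "/etc/mail/sendmail.mc"],
        "output": "/etc/mail/sendmail.cf",
    },
    "red-hat-6.x": {
        "cwd": "/usr/lib/sendmail-cf/cf/",
        "inputs": ["../m4/cf.m4", "redhat.mc"],
        "output": "/etc/sendmail.cf",
    },
}


def m4_command(release):
    """Return the shell command that regenerates sendmail.cf for a release."""
    build = M4_BUILDS[release]
    inputs = " ".join(shlex.quote(path) for path in build["inputs"])
    command = "m4 " + inputs + " > " + shlex.quote(build["output"])
    if build["cwd"]:
        command = "cd " + shlex.quote(build["cwd"]) + " && " + command
    return command


for release in M4_BUILDS:
    print(release + ": " + m4_command(release))
```

Printing the command rather than executing it lets an administrator review it, and back up the existing sendmail.cf, before overwriting the file.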

THE CLOUD STORAGE API:

What does a cloud storage API do for an independent software vendor (ISV)? First and most
importantly, it allows for native communication between the software application and the
cloud storage provider. This eliminates the need for the ISV's application to depend on an
appliance for data movement. In addition, it gives the ISV greater control over how and where
data is stored, and may even save development time, since there is no need to reinvent a
wheel that is already available. Finally, API sets such as Iron Mountain’s Archive Services
Platform API can provide the application with functionality beyond simple cloud storage.

An integrated API set also gives the end user a simpler means of interacting with the
secondary storage tier and greater control over how data is stored or retrieved. This is
because the interaction with the data happens from within the application, which should
eliminate external data management procedures. It should also bring more consistent
adherence to storage and retention policies, because those policies can be set while the
data is live instead of going back and trying to classify data years after it was created.

Efficient communication with the cloud is just the bare minimum of what an API should do for
the ISV. If all the API does is provide an additional storage point, it has little more value
than supporting the latest disk-to-tape device. Cloud storage APIs such as the Iron Mountain
Archive Services Platform API do much more. For example, the Archive Services Platform API
uses either SOAP or RESTful web services to expose its interfaces, which include ingestion,
search, retrieval, retention, and destruction capabilities. APIs like this can add
capabilities to an ISV's application without much additional development effort. These are
critical advantages for the ISV, because advanced cloud storage API sets can speed time to
market for new features and reduce the investment and time needed to develop these
capabilities. Finally, they enable the ISV to keep its customers happy in a very competitive
marketplace.
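
The idea that retention policies are set while the data is live, at ingestion time, can be shown with a minimal sketch. It is an in-memory stand-in, not the Archive Services Platform API: the class `ArchiveStore`, the exception `RetentionError`, and every method and parameter name are hypothetical. It models WORM-style behaviour by refusing to overwrite or destroy an item until its retention period has passed.

```python
from datetime import date, timedelta


class RetentionError(Exception):
    """Raised when an item under retention would be overwritten or destroyed."""


class ArchiveStore:
    """Hypothetical in-memory store covering ingestion, retrieval, retention, destruction."""

    def __init__(self):
        self._items = {}  # item ID -> {"data": bytes, "retain_until": date or None}

    def _locked(self, item_id, today):
        """True while the item's retention period has not yet ended."""
        retain_until = self._items[item_id]["retain_until"]
        return retain_until is not None and today < retain_until

    def ingest(self, item_id, data, retain_days=None, today=None):
        """Store data; the retention policy is fixed here, while the data is live."""
        today = today or date.today()
        if item_id in self._items and self._locked(item_id, today):
            raise RetentionError(item_id + " is under retention and cannot be overwritten")
        retain_until = None
        if retain_days is not None:
            retain_until = today + timedelta(days=retain_days)
        self._items[item_id] = {"data": data, "retain_until": retain_until}

    def retrieve(self, item_id):
        """Return the stored bytes for an item."""
        return self._items[item_id]["data"]

    def destroy(self, item_id, today=None):
        """Delete an item, but only once its retention period has ended."""
        today = today or date.today()
        if self._locked(item_id, today):
            raise RetentionError(item_id + " is under retention and cannot be destroyed")
        del self._items[item_id]


store = ArchiveStore()
store.ingest("invoice-1", b"scanned invoice", retain_days=30, today=date(2024, 1, 1))
try:
    store.destroy("invoice-1", today=date(2024, 1, 15))
except RetentionError as error:
    print("refused:", error)
store.destroy("invoice-1", today=date(2024, 2, 1))  # retention ended on 2024-01-31
```

Passing `today` explicitly keeps the sketch deterministic; a real service would rely on its own clock and would persist the retention date together with the data.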

CHAPTER 6: REFERENCES

White Papers:

1. Hartwig Gunzer, Sales Engineer, Borland, “Introduction to web services”, March 2002.
2. Joshiua Liu, Harry Cheng, “Interactive LDAP”, University of California, IEEE
Publication 2008.

IEEE Papers:

1. Chen Wei, Chun Mei, “Analysis and design of Linux file system based on computer
forensic”, University of Xiaman Fujian, China, IEEE paper-2010.

Websites:

1. http://www.developers.ironmountail.com
2. Daniel Richards, “How stuff works (How send mail server works)”,
http://www.howstuffworks.net/howsendmailworks.
3. http://www.w3.org
4. http://www.webservices.org
5. Danielson, Krissi (2008-03-26). “Distingiushing Cloud Computing”. Ebizq.net.
http://www.ebizq.net/blogs/saasweek/2008/03/distingiushing_cloud_computing/
