Cloud Computing Project Report
Experiment No. 1
Aim: Introduction to cloud computing.
1. Objectives:
To provide an overview of the concepts of Cloud Computing.
To engage in research in Cloud Computing.
2. Outcomes:
To understand and appreciate cloud architecture.
To analyze the local and global impact of computing on
individuals, organizations, and society.
To recognize the need for, and an ability to engage in life-long learning.
4. Theory:
• Self-service provisioning: End users can spin up computing resources for almost any
type of workload on-demand.
• Elasticity: Companies can scale up as computing needs increase and then scale down
again as demand decreases.
• Pay per use: Computing resources are measured at a granular level, allowing users to
pay only for the resources and workloads they use.
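As a hypothetical illustration of the pay-per-use model: at an assumed rate of $0.10 per
instance-hour, running three virtual servers for 8 hours would cost 3 × 8 × $0.10 = $2.40,
with nothing further owed once the servers are shut down.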
Private cloud services are delivered from a business' data center to internal users. This
model offers versatility and convenience, while preserving management, control and
security. Internal customers may or may not be billed for services through IT chargeback.
In the Public cloud model, a third-party provider delivers the cloud service over the
Internet. Public cloud services are sold on-demand, typically by the minute or the hour.
Customers only pay for the CPU cycles, storage or bandwidth they consume. Leading
public cloud providers include Amazon Web Services (AWS), Microsoft Azure,
IBM/SoftLayer and Google Compute Engine.
Hybrid cloud is a combination of public cloud services and on-premises private cloud –
with orchestration and automation between the two.
IT people talk about three different kinds of cloud computing service (infrastructure,
platform, and software as a service), where a different level of the stack is provided for
you. Note that there's a certain amount of vagueness about how these things are defined
and some overlap between them.
Advantages: The pros of cloud computing are obvious and compelling. If your
business is selling books or repairing shoes, why get involved in the nitty gritty of
buying and maintaining a complex computer system? If you run an insurance office,
do you really want your sales agents wasting time running anti-virus software,
upgrading word-processors, or worrying about hard-drive crashes? Do you really
want them cluttering your expensive computers with their personal emails, illegally
shared MP3 files, and naughty YouTube videos—when you could leave that
responsibility to someone else? Cloud computing allows you to buy in only the
services you want, when you want them, cutting the upfront capital costs of
computers and peripherals. You avoid equipment going out of date and other familiar
IT problems like ensuring system security and reliability. You can add extra services
(or take them away) at a moment's notice as your business needs change. It's really
quick and easy to add new applications or services to your business without waiting
weeks or months for the new computer (and its software) to arrive.
Disadvantages: If you're using software as a service (for example, writing a report using an online
word processor or sending emails through webmail), you need a reliable, high-
speed, broadband Internet connection functioning the whole time you're working.
That's something we take for granted in countries such as the United States, but it's
much more of an issue in developing countries or rural areas where broadband is
unavailable.
If you're buying in services, you can buy only what people are providing, so you may
be restricted to off-the-peg solutions rather than ones that precisely meet your needs.
Not only that, but you're completely at the mercy of your suppliers if they suddenly
decide to stop supporting a product you've come to depend on. (Google, for example,
upset many users when it announced in September 2012 that its cloud-based Google
Docs would drop support for old but de facto standard Microsoft Office file formats
such as .DOC, .XLS, and .PPT, giving a mere one week's notice of the
change—although, after public pressure, it later extended the deadline by three
months.) Critics charge that cloud computing is a return to the bad old days of
mainframes and proprietary systems, where businesses are locked into unsuitable,
long-term arrangements with big, inflexible companies. Instead of using "generative"
systems (ones that can be added to and extended in exciting ways the developers
never envisaged), you're effectively using "dumb terminals" whose uses are severely
limited by the supplier. Good for convenience and security, perhaps, but what will
you lose in flexibility? And is such a restrained approach good for the future of the
Internet as a whole? (To see why it may not be, take a look at Jonathan Zittrain's
eloquent book The Future of the Internet—And How to Stop It.)
5. Conclusion:
Experiment No. 2
1. Aim: To study and implement Infrastructure as a Service.
2. Objectives:
To understand the concepts of virtualization and to use the cloud as Infrastructure as a
Service.
To learn the technique and its complexity.
To understand the importance of this technique from an application point of view.
3. Outcomes:
5. Theory:
Infrastructure as a Service (IaaS) is one of the three fundamental service
models of cloud computing alongside Platform as a Service (PaaS) and Software
as a Service (SaaS). As with all cloud computing services, it provides access to
computing resources in a virtualised environment, "the Cloud", across a public
connection, usually the internet. In the case of IaaS the computing resource
provided is specifically that of virtualised hardware, in other words, computing
infrastructure. The definition includes such offerings as virtual server space,
network connections, bandwidth, IP addresses and load balancers. Physically, the
pool of hardware resource is pulled from a multitude of servers and networks
usually distributed across numerous data centers, all of which the cloud provider
is responsible for maintaining. The client, on the other hand, is given access to the
virtualised components in order to build their own IT platforms.
In common with the other two forms of cloud hosting, IaaS can be utilised by
enterprise customers to create cost effective and easily scalable IT solutions
where the complexities and expenses of managing the underlying hardware are
outsourced to the cloud provider. If the scale of a business customer's operations
fluctuates, or they are looking to expand, they can tap into the cloud resource as
and when they need it rather than purchase, install and integrate hardware
themselves.
IaaS customers pay on a per-use basis, typically by the hour, week or month.
Some providers also charge customers based on the amount of virtual machine
space they use. This pay-as-you-go model eliminates the capital expense of
deploying in-house hardware and software. However, users should monitor their
IaaS environments closely to avoid being charged for unauthorized services.
6. Procedure:
Installation Steps (OpenStack using DevStack):
a. sudo apt-get install git
b. git clone https://git.openstack.org/openstack-dev/devstack
c. cd devstack
d. ls (you can see stack.sh in the list)
e. ./stack.sh
f. Enter the password each time it is requested.
g. If the run fails with a permissions error, come out of the devstack directory (cd ..),
run sudo chmod 777 devstack, then cd devstack and run ./stack.sh again.
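Once stack.sh completes, DevStack prints the dashboard URL and credentials. As a quick
sanity check you can boot a test instance from the command line. The following is a minimal
sketch, assuming the openrc credentials file generated by DevStack and the CirrOS image that
DevStack registers by default (the exact image name varies by version):
# load the credentials generated by DevStack
source ~/devstack/openrc admin admin
# list the images DevStack registered (a CirrOS image by default)
openstack image list
# boot a tiny test VM from the CirrOS image
openstack server create --flavor m1.tiny --image cirros-0.3.4-x86_64-disk test-vm
# confirm the instance reaches ACTIVE state
openstack server list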
7. Conclusion:
We have installed OpenStack using DevStack and implemented it. It provides
access to computing resources in a virtual environment. With the help of
Infrastructure as a Service we can build our own IT platform, and we can run a
Windows operating system on Ubuntu and vice versa inside virtual machines.
Experiment No. 3
1. Aim: To study and implement Storage as a Service.
2. Objectives:
You use your word processor (most likely some version of Microsoft Word) to
write memos, letters, thank-you notes, fax coversheets, reports, newsletters, you
name it. The word processor is an essential part of our computing lives. A
number of web-based replacements for Microsoft's venerable Word program are
available. All of these programs let you write your letters and memos and reports
from any computer, no installed software necessary, as long as that computer has
a connection to the Internet. And every document you create is housed on the
web, so you don’t have to worry about taking your work with you. It’s cloud
computing at its most useful, and it’s here today.
Exploring Web-Based Word Processors:
There are a half-dozen or so really good web-based word processing applications,
led by the ever-popular Google Docs. We’ll start our look at these applications
with Google’s application and work through the rest in alphabetic order.
Google Docs:
Google Docs (docs.google.com) is the most popular web-based word processor
available today. Docs is actually a suite of applications that also includes Google
Spreadsheets and Google Presentations; the Docs part of the Docs suite is the
actual word processing application. Like all things Google, the Google Docs
interface is clean and, most important, it works well without imposing a steep
learning curve. Basic formatting is easy enough to do, storage space for your
documents is generous, and sharing, collaboration, and version control are a snap.
When you log in to Google Docs with your Google account, you see the Docs home page.
This is the home page for all the Docs applications (word processing,
spreadsheets, and presentations); all your previously created documents are listed
on this page. The leftmost pane helps you organize your documents. You can
store files in folders, view documents by type (word processing document or
spreadsheet), and display documents shared with specific people.
Collaborating on Spreadsheets:
If the word processor is the most-used office application, the spreadsheet is the
second most-important app. Office users and home users alike use spreadsheets to
prepare budgets, create expense reports, perform “what if” analyses, and
otherwise crunch their numbers. And thus we come to those spreadsheets in the
cloud, the web-based spreadsheets that let you share your numbers with other
users via the Internet. All the advantages of web-based word processors apply to
web-based spreadsheets: group collaboration, anywhere/anytime access,
portability, and so on.
Exploring Web-Based Spreadsheets:
Several web-based spreadsheet applications are worthy competitors to Microsoft
Excel. Chief among these is Google Spreadsheets, which we’ll discuss first, but
there are many other apps that also warrant your attention. If you’re at all
interested in moving your number crunching and financial analysis into the cloud,
these web-based applications are worth checking out.
Google Spreadsheets
Google Spreadsheets was Google’s first application in the cloud office suite first
known as Google Docs & Spreadsheets and now just known as Google Docs. As
befits its longevity, Google Spreadsheets is Google’s most sophisticated web-
based application. You access existing spreadsheets and create new ones from the
main Google Docs page (docs.google.com). To create a new spreadsheet, click
the New button and select Spreadsheet; the new spreadsheet opens in a new
window and you can edit it.
Collaborating on Presentations:
One of the last components of the traditional office suite to move into the cloud is
the presentation application. Microsoft PowerPoint has ruled the desktop forever,
and it’s proven difficult to offer competitive functionality in a web-based
application; if nothing else, slides with large graphics are slow to upload and
download in an efficient manner. That said, there is a new crop of web-based
presentation applications that aim to give PowerPoint a run for its money. The big
players, as might be expected, are Google and Zoho, but there are several other
applications that are worth considering if you need to take your presentations with
you on the road—or collaborate with users in other locations.
Google Presentations:
If there’s a leader in the online presentations market, it’s probably Google
Presentations, simply because of Google's dominant position with other web-based
office apps. Google Presentations is the latest addition to the Google Docs suite of
apps, joining the Google Docs word processor and Google Spreadsheets
spreadsheet application. Users can create new presentations and open existing
ones from the main Google Docs page (docs.google.com). Open a presentation by
clicking its title or icon. Create a new presentation by selecting New, then
Presentation. Your presentation now opens in a new window on your desktop.
What you do get is the ability to add title, text, and blank slides; a PowerPoint-
like slide sorter pane; a selection of predesigned themes; the ability to publish
your file to the web or export it as a PowerPoint PPT or Adobe PDF file; and quick
and easy sharing and collaboration, the same as with Google’s other web-based
apps.
Collaborating on Databases:
A database does many of the same things that a spreadsheet does, but in a
different and often more efficient manner. In fact, many small businesses use
spreadsheets for database-like functions. A local database is one in which all the
data is stored on an individual computer. A networked database is one in which
the data is stored on a computer or server connected to a network, and accessible
by all computers connected to that network. Finally, an online or web-based
database stores data on a cloud of servers somewhere on the Internet, which is
accessible by any authorized user with an Internet connection. The primary
advantage of a web-based database is that data can easily be shared with a large
number of other users, no matter where they may be located. When your
employee database is in the cloud, for example, authorized users can reach it from any location.
Exploring Web-Based Databases:
In the desktop computing world, the leading database program today is Microsoft
Access. (This wasn’t always the case; dBase used to rule the database roost, but
things change over time.) In larger enterprises, you’re likely to encounter more
sophisticated software from Microsoft, Oracle, and other companies.
Interestingly, none of the major database software developers currently provide
web-based database applications. Instead, you have to turn to a handful of start-up
companies (and one big established name) for your online database needs.
Cebase
Cebase (www.cebase.com) lets you create new database applications with a few
clicks of your mouse; all you have to do is fill in a few forms and make a few
choices from some pull-down lists. Data entry is via web forms, and then your
data is displayed in a spreadsheet-like layout. You can then sort, filter, and group
your data as you like. Sharing is accomplished by clicking the Share link at the
top of any data page. You invite users to share your database via email, and then
adjust their permissions after they’ve accepted your invitation.
6. Result:
Snapshots:
Step 1: Sign into the Google Drive website with your Google account.
If you don’t have a Google account, you can create one for free. Google Drive will allow
you to store your files in the cloud, as well as create documents and forms through the
Google Drive web interface.
Step 4: Use the navigation bar on the left side to browse your files.
“My Drive” is where all of your uploaded files and folders are stored. “Shared with Me”
are documents and files that have been shared with you by other Drive users. “Starred”
files are files that you have marked as important, and “Recent” files are the ones you
have most recently edited.
• You can drag and drop files and folders around your Drive to organize them as you see
fit.
• Click the Folder icon with a "+" sign to create a new folder in your Drive. You can
create folders inside of other folders to organize your files.
A menu will appear that allows you to choose what type of document you want to create.
You have several options by default, and more can be added by clicking the "More" link
at the bottom of the menu:
Other Capabilities
1. Edit photos
2. Listen to music
3. Make drawings
4. Merge PDFs
7. Conclusion:
Google Docs provides an efficient way to store data. It fits well into the Storage as a
Service model. It has varied options to create documents, presentations and
spreadsheets. It saves documents automatically every few seconds, and they can be
shared anywhere on the Internet at the click of a button.
Experiment No. 4
1. Aim: To study and implement identity management.
5. Theory:
Identity Management
Every enterprise will have its own identity management system to control access to
information and computing resources. Cloud providers either integrate the
customer’s identity management system into their own infrastructure, using
federation or SSO technology, or a biometric-based identification system, or
provide an identity management solution of their own. CloudID, for instance,
provides a privacy-preserving, cloud-based and cross-enterprise biometric
identification solution for this problem. It links the confidential information of the
users to their biometrics and stores it in an encrypted fashion. Making use of a
searchable encryption technique, biometric identification is performed in the encrypted
domain to make sure that the cloud provider or potential attackers do not gain
access to any sensitive data or even the contents of the individual queries.
6. Procedure:
1. We have our own cloud system ready, so we open it in VirtualBox.
2. The system in this setup is at 192.168.44.169.
3. Continue with the steps below.
21. If you check in the browser, you will see that the paper has been uploaded.
23. If you add any content to a folder that is synced, you can see the change in the cloud in
the browser.
• Enter the new user’s Login Name and their initial Password
• Optionally, assign Groups memberships
• Click the Create button
Reset a User’s Password: You cannot recover a user’s password, but you can set a new
one:
Renaming a User
Each ownCloud user has two names: a unique Login Name used for authentication, and a
Full Name, which is their display name. You can edit the display name of a user, but you
cannot change the login name of any user.
Managing Groups
You can assign new users to groups when you create them, and create new groups when
you create new users. You may also use the Add Group button at the top of the left pane
to create new groups. New group members will immediately have access to file shares
that belong to their new groups.
Click the gear on the lower left pane to set a default storage quota. This is automatically
applied to new users. You may assign a different quota to any user by selecting from the
Quota dropdown, selecting either a preset value or entering a custom value. When you
create custom quotas, use the normal abbreviations for your storage values such as 500
MB, 5 GB, 5 TB, and so on.
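These user, group, and quota operations can also be scripted with ownCloud's occ
command-line tool. The following is a minimal sketch, assuming an Apache setup whose web
server user is www-data and an ownCloud version whose occ provides the group:add,
user:add and user:setting commands; the names and quota shown are placeholders:
# create a group, then a user who is added to it (occ prompts for a password)
sudo -u www-data php occ group:add students
sudo -u www-data php occ user:add --display-name="Jane Doe" --group="students" jdoe
# assign the new account a custom storage quota
sudo -u www-data php occ user:setting jdoe files quota "5 GB"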
Deleting users
Deleting a user is easy: hover your cursor over their name on the Users page until a
trashcan icon appears at the far right. Click the trashcan, and they’re gone. You’ll see an
undo button at the top of the page, which remains until you refresh the page. When the
undo button is gone you cannot recover the deleted user.
File Sharing
ownCloud users can share files with their ownCloud groups and other users on the same
ownCloud server, and create public shares for people who are not ownCloud users. You
have control of a number of user permissions on file shares:
• Allowing resharing
Step 2: By default, the ownCloud Web interface opens to your Files page. You can add, remove,
and share files, and make changes based on the access privileges set by you (if you are
administering the server) or by your server administrator. You can access your ownCloud files
with the ownCloud web interface and create, preview, edit, delete, share, and re-share files. Your
ownCloud administrator has the option to disable these features, so if any of them are missing on
your system ask your server administrator.
Step 3: Apps Selection Menu: Located in the upper left corner, click the arrow to open a
dropdown menu to navigate to your various available apps. Apps Information field: Located in
the left sidebar, this provides filters and tasks associated with your selected app. Application
View: The main central field in the ownCloud user interface. This field displays the contents or
user features of your selected app.
Step 4: Share the file or folder with a group or other users, and create public shares with
hyperlinks. You can also see who you have shared with already, and revoke shares by clicking
the trash can icon. If username auto-completion is enabled, when you start typing the user or
group name ownCloud will automatically complete it for you. If your administrator has enabled
email notifications, you can send an email notification of the new share from the sharing screen.
8. Conclusion:
We have studied how to use ownCloud for identity management of users. We can
create multiple groups and grant privileges to view or modify data as per the defined
permissions. It also offers a simple look and feel that can be used by anyone.
Experiment No. 5
1. Aim: To study cloud security management.
5. Theory:
Cloud computing security is the set of control-based technologies and policies
designed to adhere to regulatory compliance rules and protect information, data
applications and infrastructure associated with cloud computing use. Because of
the cloud's very nature as a shared resource, identity management, privacy
and access control are of particular concern. With more organizations using cloud
computing and associated cloud providers for data operations, proper security in
these and other potentially vulnerable areas has become a priority for
organizations contracting with a cloud computing provider.
Cloud computing security processes should address the security controls the
cloud provider will incorporate to maintain the customer's data security, privacy
and compliance with necessary regulations. The processes will also likely
include a business continuity and data backup plan in the case of a cloud security
breach.
Physical security
Personnel security
Application security
Cloud providers ensure that applications available as a service via the cloud
(SaaS) are secure by specifying, designing, implementing, testing and
maintaining appropriate application security measures in the production
environment. Note that - as with any commercial software - the controls they
implement may not necessarily fully mitigate all the risks they have identified,
and that they may not necessarily have identified all the risks that are of concern
to customers. Consequently, customers may also need to assure themselves that
cloud applications are adequately secured for their specific purposes, including
their compliance obligations.
6. Procedure:
You need to scan that QR code on your mobile phone using a barcode scanner
(install it on the mobile phone). You also need to install "Google Authenticator" on
your mobile phone to generate the MFA code.
7) Google Authenticator keeps generating a new MFA code every 30 seconds;
you will have to enter that code while logging in as the user.
Hence, security is maintained by the MFA device code:
no one can use your AWS account even if they have your user name and
password, because the MFA code is on your MFA device (a mobile phone in this case)
and it changes every 30 seconds.
Permissions in a user account:
After creating the user by following the above-mentioned steps, you can give certain
permissions to a specific user:
1) Click on the created user
2) Go to the "Permissions" tab
3) Click on the "Attach Policy" button
4) Select the needed policy from the given list and click Apply.
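The same user, policy, and MFA workflow can also be driven from the AWS CLI. The
following is a minimal sketch, assuming the CLI is configured with administrator
credentials; the user name, device name, account ID, and codes are placeholders:
# create the IAM user
aws iam create-user --user-name demo-user
# attach a managed policy (here, read-only access) to the user
aws iam attach-user-policy --user-name demo-user \
    --policy-arn arn:aws:iam::aws:policy/ReadOnlyAccess
# create a virtual MFA device; the QR code is written to qrcode.png
aws iam create-virtual-mfa-device --virtual-mfa-device-name demo-user-mfa \
    --outfile qrcode.png --bootstrap-method QRCodePNG
# enable MFA with two consecutive codes from Google Authenticator
aws iam enable-mfa-device --user-name demo-user \
    --serial-number arn:aws:iam::123456789012:mfa/demo-user-mfa \
    --authentication-code1 123456 --authentication-code2 654321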
7. Result:
Step 1: Go to aws.amazon.com
Step 2: Click on "My Account". Select "AWS Management Console" and click on
it. Give the Email id in the required field.
8. Conclusion:
We have studied how to secure the cloud and its data. Amazon Web Services (AWS) provides strong
security through extended facilities and services such as the MFA device. It also gives you the
ability to add your own permissions and policies for securing data.
Experiment No. 6
Aim: Case Study: Amazon Web Services
1. Objectives:
To provide an overview of the concepts of Amazon Web Services.
To study Amazon's web services in Cloud Computing.
2. Outcomes:
4. Theory:
About Amazon.com
Amazon.com is the world’s largest online retailer. In 2011, Amazon.com switched from tape
backup to using Amazon Simple Storage Service (Amazon S3) for backing up the majority
of its Oracle databases. This strategy reduces complexity and capital expenditures, provides
faster backup and restore performance, eliminates tape capacity planning for backup and archive,
and frees up administrative staff for higher value operations. The company was able to replace
their backup tape infrastructure with cloud-based Amazon S3 storage, eliminate backup
software, and achieve a 12X performance improvement, reducing restore time from around
15 hours to 2.5 hours in select scenarios.
The Challenge
As Amazon.com grows larger, the sizes of their Oracle databases continue to grow, and so does
the sheer number of databases they maintain. This has caused growing pains related to backing
up legacy Oracle databases to tape and led to the consideration of alternate strategies including
the use of Cloud services of Amazon Web Services (AWS), a subsidiary of Amazon.com. Some
of the business challenges Amazon.com faced included:
Utilization and capacity planning is complex, and time and capital expense budget are at a
premium. Significant capital expenditures were required over the years for tape hardware, data
center space for this hardware, and enterprise licensing fees for tape software. During that
time, managing tape infrastructure required highly skilled staff to spend time with setup,
certification and engineering archive planning instead of on higher value projects. And at the
end of every fiscal year, projecting future capacity requirements required time consuming
audits, forecasting, and budgeting.
The cost of backup software required to support multiple tape devices sneaks up on you. Tape
robots provide basic read/write capability, but in order to fully utilize them, you must invest in
proprietary tape backup software. For Amazon.com, the cost of the software had been high,
and added significantly to overall backup costs. The cost of this software was an ongoing
budgeting pain point, but one that was difficult to address as long as backups needed to be
written to tape devices.
Maintaining reliable backups and being fast and efficient when retrieving data requires a lot of
time and effort with tape. When data needs to be durably stored on tape, multiple copies are
required. When everything is working correctly, and there is minimal contention for tape
resources, the tape robots and backup software can easily find the required data. However, if
there is a hardware failure, human intervention is necessary to restore from tape. Contention
for tape drives resulting from multiple users’ tape requests slows down restore processes even
more. This adds to the recovery time objective (RTO) and makes achieving it more
challenging compared to backing up to Cloud storage.
Why Amazon Web Services
Amazon.com initiated the evaluation of Amazon S3 for economic and performance
improvements related to data backup. As part of that evaluation, they considered security,
availability, and performance aspects of Amazon S3 backups. Amazon.com also executed a cost-
benefit analysis to ensure that a migration to Amazon S3 would be financially worthwhile. That
cost benefit analysis included the following elements:
Performance advantage and cost competitiveness. It was important that the overall costs of the
backups did not increase. At the same time, Amazon.com required faster backup and recovery
performance. The time and effort required for backup and for recovery operations proved to be
a significant improvement over tape, with restoring from Amazon S3 running from two to
twelve times faster than a similar restore from tape. Amazon.com required any new backup
medium to provide improved performance while maintaining or reducing overall costs.
Backing up to on-premises disk based storage would have improved performance, but missed
on cost competitiveness. Amazon S3 Cloud based storage met both criteria.
Less operational friction. Amazon.com DBAs had to evaluate whether Amazon S3 backups
would be viable for their database backups. They determined that using Amazon S3 for
backups was easy to implement because it worked seamlessly with Oracle RMAN.
Strong data security. Amazon.com found that AWS met all of their requirements for physical
security, security accreditations, and security processes, protecting data in flight, data at rest,
and utilizing suitable encryption standards.
The Benefits
With the migration to Amazon S3 well along the way to completion, Amazon.com has realized
several benefits, including:
Elimination of complex and time-consuming tape capacity planning. Amazon.com is growing
larger and more dynamic each year, both organically and as a result of acquisitions. AWS has
enabled Amazon.com to keep pace with this rapid expansion, and to do so seamlessly.
Historically, Amazon.com business groups have had to write annual backup plans, quantifying
the amount of tape storage that they plan to use for the year and the frequency with which they
will use the tape resources. These plans are then used to charge each organization for their
tape usage, spreading the cost among many teams. With Amazon S3, teams simply pay for
what they use, and are billed for their usage as they go. There are virtually no upper limits as
to how much data can be stored in Amazon S3, and so there are no worries about running out
of resources. For teams adopting Amazon S3 backups, the need for formal planning has been
all but eliminated.
Reduced capital expenditures. Amazon.com no longer needs to acquire tape robots, tape
drives, tape inventory, data center space, networking gear, enterprise backup software, or
predict future tape consumption. This eliminates the burden of budgeting for capital
equipment well in advance as well as the capital expense.
Immediate availability of data for restoring – no need to locate or retrieve physical tapes.
Whenever a DBA needs to restore data from tape, they face delays. The tape backup software
needs to read the tape catalog to find the correct files to restore, locate the correct tape, mount
the tape, and read the data from it. In almost all cases the data is spread across multiple tapes,
resulting in further delays. This, combined with contention for tape drives resulting from
multiple users’ tape requests, slows the process down even more. This is especially severe
during critical events such as a data center outage, when many databases must be restored
simultaneously and as soon as possible. None of these problems occur with Amazon S3. Data
restores can begin immediately, with no waiting or tape queuing – and that means the database
can be recovered much faster.
Backing up a database to Amazon S3 can be two to twelve times faster than with tape drives.
As one example, in a benchmark test a DBA was able to restore 3.8 terabytes in 2.5 hours over
gigabit Ethernet. This amounts to 25 gigabytes per minute, or 422 MB per second, an
effective rate of 3.37 gigabits per second. This exceeds the raw link speed because
Amazon.com uses RMAN data compression, so less data crosses the wire than is restored.
This 2.5 hours compares to, conservatively, 10-15 hours that would be required to restore from tape.
Easy implementation of Oracle RMAN backups to Amazon S3. The DBAs found it easy to
start backing up their databases to Amazon S3. Directing Oracle RMAN backups to Amazon
S3 requires only the configuration of the Oracle Secure Backup Cloud (SBC) module (a sketch
of this configuration appears at the end of this section). The effort required to configure the
Oracle SBC module amounted to an hour or less per database. After this one-time setup, the
database backups were transparently redirected to Amazon S3.
Durable data storage provided by Amazon S3, which is designed for 11 nines durability. On
occasion, Amazon.com has experienced hardware failures with tape infrastructure – tapes that
break, tape drives that fail, and robotic components that fail. Sometimes this happens when a
DBA is trying to restore a database, and dramatically increases the mean time to recover
(MTTR). With the durability and availability of Amazon S3, these issues are no longer a
concern.
Freeing up valuable human resources. With tape infrastructure, Amazon.com had to seek out
engineers who were experienced with very large tape backup installations – a specialized,
vendor-specific skill set that is difficult to find. They also needed to hire data center
technicians and dedicate them to problem-solving and troubleshooting hardware issues –
replacing drives, shuffling tapes around, shipping and tracking tapes, and so on. Amazon S3
allowed them to free up these specialists from day-to-day operations so that they can work on
more valuable, business-critical engineering tasks.
Elimination of physical tape transport to off-site location. Any company that has been storing
Oracle backup data offsite should take a hard look at the costs involved in transporting,
securing and storing their tapes offsite – these costs can be reduced or possibly eliminated by
storing the data in Amazon S3.
As the world’s largest online retailer, Amazon.com continuously innovates in order to provide
improved customer experience and offer products at the lowest possible prices. One such
innovation has been to replace tape with Amazon S3 storage for database backups. This
innovation is one that can be easily replicated by other organizations that back up their Oracle
databases to tape.
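To make the SBC configuration mentioned above concrete, the following is a minimal sketch
of directing an RMAN backup to Amazon S3 through the OSB Cloud module; the library and
parameter-file paths are placeholders and will vary with your installation:
rman target / <<'EOF'
# point the RMAN sbt channel at the OSB Cloud module library
CONFIGURE CHANNEL DEVICE TYPE sbt
  PARMS 'SBT_LIBRARY=/opt/oracle/osbws/lib/libosbws.so,
         ENV=(OSB_WS_PFILE=/opt/oracle/osbws/config/osbws.ora)';
# back up the whole database to Amazon S3 through that channel
BACKUP DEVICE TYPE sbt DATABASE;
EOF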
Experiment No. 7
Aim: Case Study: Google App Engine
Google App Engine (often referred to as GAE or simply App Engine; its Java version is also
known by the acronym GAE/J) is a platform as a service (PaaS) cloud computing platform for developing and
hosting web applications in Google-managed data centers. Applications are sandboxed and run
across multiple servers. App Engine offers automatic scaling for web applications—as the
number of requests increases for an application, App Engine automatically allocates more
resources for the web application to handle the additional demand.
Google App Engine is free up to a certain level of consumed resources. Fees are charged for
additional storage, bandwidth, or instance hours required by the application. It was first released
as a preview version in April 2008, and came out of preview in September 2011.
Currently, the supported programming languages are Python, Java (and, by extension, other JVM
languages such as Groovy, JRuby, Scala, Clojure and Jython, as well as PHP via a special version
of Quercus), and Go. Google has said that it plans to support more languages in the future, and
that the Google App Engine has been written to be language independent.
All billed High-Replication Datastore App Engine applications have a 99.95% uptime SLA.
Portability Concerns
Developers worry that the applications will not be portable from App Engine and fear being
locked into the technology. In response, there are a number of projects to create open-source
back-ends for the various proprietary/closed APIs of app engine, especially the datastore.
Although these projects are at various levels of maturity, none of them is at the point where
installing and running an App Engine app is as simple as it is on Google’s service. AppScale and
TyphoonAE are two of the open source efforts.
AppScale can run Python, Java, and Go GAE applications on EC2 and other cloud vendors.
TyphoonAE can run Python App Engine applications on any cloud that supports Linux machines.
The web2py web framework offers migration between SQL databases and Google App Engine;
however, it doesn't support several App Engine-specific features such as transactions and
namespaces.
Compared to other scalable hosting services such as Amazon EC2, App Engine provides more
infrastructure to make it easy to write scalable applications, but can only run a limited range of
applications designed for that infrastructure.
App Engine’s infrastructure removes many of the system administration and development
challenges of building applications to scale to hundreds of requests per second and beyond.
Google handles deploying code to a cluster, monitoring, failover, and launching application
instances as necessary.
While other services let users install and configure nearly any *NIX compatible software, App
Engine requires developers to use only its supported languages, APIs, and frameworks. Current
APIs allow storing and retrieving data from a BigTable non-relational database; making HTTP
requests; sending e-mail; manipulating images; and caching. Existing web applications that
require a relational database will not run on App Engine without modification.
Per-day and per-minute quotas restrict bandwidth and CPU use, number of requests served,
number of concurrent requests, and calls to the various APIs, and individual requests are
terminated if they take more than 60 seconds or return more than 32MB of data.
Google App Engine’s datastore has a SQL-like syntax called “GQL”. GQL intentionally does not
support the Join statement, because it seems to be inefficient when queries span more than one
machine. Instead, one-to-many and many-to-many relationships can be accomplished using
ReferenceProperty().
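For example, rather than joining two kinds, a Book entity can hold a reference to its Author
and be fetched with a key filter. A hypothetical query in GQL's SELECT-only syntax (the kind
and key values are made up for illustration):
SELECT * FROM Book WHERE author = KEY('Author', 4711)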
Experiment No. 8
Aim: Installation and configuration of Hadoop.
Theory:
What is Hadoop?
Hadoop is an open-source Apache framework for distributed storage (HDFS) and distributed
processing (MapReduce) of very large data sets across clusters of commodity hardware.
Master/Slave Architecture:
In this architecture, the Master is either the NameNode or the JobTracker or both, and the Slaves
are multiple DataNode/TaskTracker pairs ({DataNode, TaskTracker}, ..., {DataNode,
TaskTracker}).
Installing Java
Hadoop requires a working Java installation, so install a JDK before installing Hadoop.
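A minimal sketch for a Debian/Ubuntu system (the package name is version- and
distribution-dependent):
Command: sudo apt-get install openjdk-8-jdk
Command: java -version (to verify the installation)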
Pre-installation Setup
Before installing Hadoop, we need to know the prerequisites for installation. The following are
the prerequisites:
Memory
Processor model and speed
Operating system and Network
Memory: A minimum of 8 GB RAM is required.
Processor model and speed: A quad-, hexa-, or octa-core processor at 2-2.5 GHz.
Operating system and Network requirements: Installing Hadoop on Linux is preferable to
installing it on Windows, and for learning purposes, install Hadoop in pseudo-distributed mode.
Configuration:
The following are the steps to configure files to set up HDFS and MapReduce environment:
Step:1 Extract the core Hadoop configuration files into a temporary directory.
Step:2 The files are in the path: configuration_files/core_Hadoop directory where companion
files are decompressed.
Step:3 Make necessary changes in the configuration files.
Step:4 In the temporary directory, locate the files and edit their properties based on your
environment.
Step:5 Search for the TODO markers in the files to find the properties to replace.
The following are the steps to install Hadoop 2.4.1 in pseudo-distributed mode.
Step 1 − Change to the directory containing the downloaded files:
The following command is used on the command prompt:
Command: cd Downloads
Step 2 − Create soft links (shortcuts).
The following command is used to create shortcuts:
Command: ln -s ./Downloads/hadoop-2.7.2/ ./hadoop
Step 3 − Configure .bashrc
This following code is to modify PATH variable in bash shell.
Command: vi ./.bashrc
The following code exports the variables to the path:
export HADOOP_HOME=/home/luck/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Step 4 − Configure Hadoop in Stand-alone mode:
The following command is used to configure Hadoop's hadoop-env.sh file:
Command: vi ./hadoop/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/home/luck/jdk
Step 5 − Exit and re-open the command prompt
Step 6 − Run a Hadoop job on the standalone cluster:
To test the setup, run the hadoop command; its usage message should be displayed.
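As a further smoke test you can run one of the example jobs that ship with Hadoop. A sketch,
assuming the examples jar for your version sits at the usual path under $HADOOP_HOME:
Command: hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar pi 2 5
This estimates π using 2 map tasks with 5 samples each.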
Step 7 − Go to the directory where you downloaded the compressed Hadoop file and unzip it
using the terminal:
Command: $ tar -xzvf hadoop-2.7.3.tar.gz
Command: hadoop version
Hadoop 2.4.1
Subversion https://svn.apache.org/repos/asf/hadoop/common -r 1529768
Compiled by hortonmu on 2013-10-07T06:28Z
Compiled with protoc 2.5.0
From source with checksum 79e53ce7994d1628b240f09af91e1af4
The above result shows that Hadoop standalone mode setup is working.
The following XML files must be reconfigured in order to develop Hadoop in Java:
core-site.xml
mapred-site.xml
hdfs-site.xml
core-site.xml:
The core-site.xml file contains information regarding memory allocated for the file system, the
port number used for Hadoop instance, size of Read/Write buffers, and memory limit for storing
the data.
Open the core-site.xml with the following command and add the properties listed below
between the <configuration> and </configuration> tags in this file.
Command: ~$ sudo gedit $HADOOP_HOME/etc/hadoop/core-site.xml
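A typical minimal property set for pseudo-distributed mode (the NameNode URI shown is the
conventional localhost default and may differ in your environment):
<configuration>
   <property>
      <name>fs.defaultFS</name>
      <value>hdfs://localhost:9000</value>
   </property>
</configuration>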