Module 5 - Chapter 2
Module - 5
Cloud Applications (Chapter 10)
Cloud computing has gained huge popularity in industry due to its ability to host applications for which the
services can be delivered to consumers rapidly at minimal cost.
This chapter discusses some application case studies, detailing their architecture and how they leveraged
various cloud technologies.
Applications from a range of domains, from scientific to engineering, gaming, and social networking, are
considered.
The Web service forms the front-end of a platform that is hosted in the cloud and leverages the three layers of the cloud
computing stack: SaaS, PaaS, and IaaS.
The Web service constitutes the SaaS application, which stores ECG data in the Amazon S3 service and issues a
processing request to the scalable cloud platform.
The runtime platform is composed of a dynamically sizable number of instances running the workflow engine
and Aneka.
The number of workflow engine instances is controlled according to the number of requests in the queue of
each instance, while Aneka controls the number of EC2 instances used to execute the single tasks defined by
the workflow engine for a single ECG processing job.
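The queue-based scaling rule described above can be illustrated with a minimal sketch. The function name, the target queue length per instance, and the instance limits are assumptions made for illustration; they are not taken from the platform itself.

```python
import math


def desired_engine_instances(queue_lengths, target_per_instance,
                             min_instances=1, max_instances=10):
    """Return how many workflow engine instances should be running.

    queue_lengths: pending requests in the queue of each running instance.
    target_per_instance: hypothetical number of queued requests that one
    instance is expected to absorb.
    """
    total_pending = sum(queue_lengths)
    needed = math.ceil(total_pending / target_per_instance)
    # Clamp the result so the pool never drops to zero or grows unbounded.
    return max(min_instances, min(max_instances, needed))


# Three instances holding 12, 9, and 15 queued requests, target of 6 each.
print(desired_engine_instances([12, 9, 15], target_per_instance=6))  # 6
```

The same idea applies one level down, where Aneka decides how many EC2 instances are needed for the tasks of a single job.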
Advantages
1. Elasticity. The cloud infrastructure can grow and shrink according to the requests served. As a result,
doctors and hospitals do not have to invest in large computing infrastructures designed after capacity planning,
thus making more effective use of budgets.
2. Ubiquity. Cloud computing technologies are easily accessible and promise to deliver systems with minimal
or no downtime. Computing systems hosted in the cloud are accessible from any Internet device through simple
interfaces (such as SOAP and REST-based Web services). This makes such systems easy to integrate with other
systems maintained on hospital premises.
3. Cost savings. Cloud services are priced on a pay-per-use basis and with volume prices for large numbers of
service requests.
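As a worked example of pay-per-use pricing with volume prices, the sketch below computes a bill from tiered rates. The tier boundaries and prices are purely hypothetical (expressed in thousandths of a dollar) and do not reflect any real provider.

```python
def monthly_cost(requests, tiers):
    """Compute a tiered pay-per-use bill.

    tiers: list of (upper_bound, price_per_request) in ascending order.
    The last tier must use None as its upper bound (no limit).
    Prices are integers in thousandths of a dollar (hypothetical values).
    """
    cost = 0
    lower = 0
    for upper, price in tiers:
        if upper is None or requests < upper:
            cost += (requests - lower) * price
            break
        # This tier is completely used up; bill it in full and move on.
        cost += (upper - lower) * price
        lower = upper
    return cost


# Hypothetical tiers: first 1000 requests at 10 units, the rest at 5 units.
TIERS = [(1000, 10), (None, 5)]
print(monthly_cost(1500, TIERS))  # 1000*10 + 500*5 = 12500
```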
Protein structure prediction is a computationally intensive task that is fundamental to different types of research
in the life sciences.
The geometric structure of a protein cannot be directly inferred from the sequence of genes that compose its
structure, but it is the result of complex computations aimed at identifying the structure that minimizes the
required energy.
This task requires the investigation of a space with a massive number of states, consequently creating a large
number of computations for each of these states.
One project that investigates the use of cloud technologies for protein structure prediction is Jeeva - an
integrated Web portal that enables scientists to offload the prediction task to a computing cloud based on Aneka
(Figure 10.2).
The prediction task uses machine learning techniques (support vector machines) for determining the secondary
structure of proteins.
These techniques translate the problem into one of pattern recognition, where a sequence has to be classified into
one of three possible classes (E, H, and C).
A popular implementation based on support vector machines divides the pattern recognition problem into three
phases: initialization, classification, and a final phase.
Although these three phases have to be executed in sequence, the classification phase itself can be
parallelized: multiple classifiers are executed concurrently.
This reduces the computational time of the prediction.
The prediction algorithm is then translated into a task graph that is submitted to Aneka.
Once the task is completed, the middleware makes the results available for visualization through the portal.
The advantage of using cloud technologies is the capability to leverage a scalable computing infrastructure
that can be grown and shrunk on demand.
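The three-phase structure (initialization, concurrent classification, final phase) can be sketched as follows. This is only an illustration of the control flow: the classifiers are toy stand-ins rather than support vector machines, the residue sets are arbitrary and not biologically meaningful, and a local thread pool stands in for tasks that would be submitted to Aneka.

```python
from concurrent.futures import ThreadPoolExecutor


def initialize(sequence, window=3):
    """Phase 1: cut the sequence into overlapping windows (toy features)."""
    return [sequence[i:i + window] for i in range(len(sequence) - window + 1)]


def make_classifier(label, residues):
    """Build a stand-in classifier that scores every window for one class."""
    def classify(windows):
        return label, [sum(ch in residues for ch in w) for w in windows]
    return classify


def finalize(scores_by_label):
    """Phase 3: for every window, pick the class with the highest score."""
    labels = sorted(scores_by_label)
    count = len(next(iter(scores_by_label.values())))
    return "".join(
        max(labels, key=lambda lab: scores_by_label[lab][i])
        for i in range(count)
    )


def predict(sequence):
    windows = initialize(sequence)
    classifiers = [
        make_classifier("E", "VIY"),  # arbitrary residue sets (assumption)
        make_classifier("H", "AEL"),
        make_classifier("C", "GPS"),
    ]
    # Phase 2: the independent classifiers run concurrently.
    with ThreadPoolExecutor() as pool:
        results = pool.map(lambda clf: clf(windows), classifiers)
    return finalize(dict(results))


print(predict("GGGVVV"))  # CCEE
```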
The dimensionality of typical gene expression datasets ranges from several thousands to over tens of thousands
of genes.
This problem is approached with learning classifiers, which generate a population of condition-action rules that
guide the classification process. The eXtended Classifier System (XCS) has been utilized for classifying large
datasets in bioinformatics and computer science domains.
A variation of this algorithm, CoXCS [162], has proven to be effective in these conditions. CoXCS divides the
entire search space into subdomains and employs the standard XCS algorithm in each of these subdomains.
Such a process is computationally intensive but can be easily parallelized because the classification problems
on the subdomains can be solved concurrently.
Cloud-CoXCS (Figure 10.3) is a cloud-based implementation of CoXCS that leverages Aneka to solve the
classification problems in parallel and compose their outcomes. The algorithm is controlled by strategies, which
define the way the outcomes are composed together and whether the process needs to be iterated.
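The divide, solve-in-parallel, and compose pattern can be sketched as below. This is not an implementation of XCS or CoXCS: each subdomain is handled by a simple threshold rule used as a stand-in classifier, the composition strategy is a plain majority vote, and iteration is omitted. All function names are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_features(num_features, num_subdomains):
    """Partition feature indices round-robin into subdomains."""
    return [list(range(k, num_features, num_subdomains))
            for k in range(num_subdomains)]


def train_stub(samples, labels, features):
    """Stand-in for one classifier run on a single subdomain.

    Learns a threshold on the mean of the selected features.
    Assumes binary labels (0/1) with both classes present.
    """
    def mean(row):
        return sum(row[f] for f in features) / len(features)

    pos = [mean(r) for r, y in zip(samples, labels) if y == 1]
    neg = [mean(r) for r, y in zip(samples, labels) if y == 0]
    pos_mean, neg_mean = sum(pos) / len(pos), sum(neg) / len(neg)
    threshold = (pos_mean + neg_mean) / 2
    sign = 1 if pos_mean >= neg_mean else -1
    return lambda row: 1 if sign * (mean(row) - threshold) > 0 else 0


def majority_vote(predictions):
    """One possible composition strategy: the most frequent outcome wins."""
    return Counter(predictions).most_common(1)[0][0]


def co_classify(samples, labels, row, num_subdomains=2, compose=majority_vote):
    subdomains = split_features(len(samples[0]), num_subdomains)
    # The subdomain problems are independent, so they are solved concurrently.
    with ThreadPoolExecutor() as pool:
        models = list(pool.map(
            lambda feats: train_stub(samples, labels, feats), subdomains))
    return compose([model(row) for model in models])
```

Passing a different `compose` function corresponds to choosing a different strategy for combining the outcomes.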
1 Salesforce.com
Salesforce.com is the most popular and most developed CRM solution available today.
As of today, more than 100,000 customers have chosen Salesforce.com to implement their CRM solutions.
The application provides customizable CRM solutions that can be integrated with additional features developed
by third parties.
Salesforce.com is based on the Force.com cloud development platform.
This platform is a scalable, high-performance middleware that executes all operations of all Salesforce.com
applications.
The architecture of the Force.com platform is shown in Figure 10.5.
At the core of the platform resides its metadata architecture, which provides the system with flexibility and
scalability.
Application core logic and business rules are saved as metadata into the Force.com store.
Both application structure and application data are stored in the store. A runtime engine executes application
logic by retrieving its metadata and then performing the operations on the data.
A full-text search engine supports the runtime engine. This allows application users to have an effective user
experience. The search engine maintains its indexing data in a separate store.
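The idea of a metadata-driven runtime can be illustrated with a small sketch: object definitions and business rules are kept as data, and a generic engine interprets them at run time. The metadata layout, the `Lead` object, and the rule format are invented for illustration and are not the Force.com schema.

```python
# Hypothetical metadata store: structure and business rules kept as data.
METADATA = {
    "Lead": {
        "fields": {"name": str, "revenue": int},
        "rules": [
            {"if_field": "revenue", "at_least": 100000,
             "set": {"priority": "high"}},
        ],
    }
}


def run(object_type, record, metadata=METADATA):
    """Generic runtime: fetch the metadata, validate, then apply the rules."""
    definition = metadata[object_type]
    for field, expected in definition["fields"].items():
        if not isinstance(record.get(field), expected):
            raise TypeError(f"{object_type}.{field} must be {expected.__name__}")
    result = dict(record)  # leave the caller's record untouched
    for rule in definition["rules"]:
        if result[rule["if_field"]] >= rule["at_least"]:
            result.update(rule["set"])
    return result


print(run("Lead", {"name": "Acme", "revenue": 250000}))
```

Changing application behavior then means editing the metadata, not the engine, which is where the flexibility comes from.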
3 NetSuite
NetSuite provides a collection of applications that help customers manage every aspect of the business
enterprise.
Its offering is divided into three major products: NetSuite Global ERP, NetSuite Global CRM+, and NetSuite
Global Ecommerce.
Moreover, an all-in-one solution, NetSuite One World, integrates all three products.
The services NetSuite delivers are powered by two large datacenters on the East and West coasts of the United
States, connected by redundant links.
This allows NetSuite to guarantee 99.5% uptime to its customers.
The NetSuite Business Operating System (NS-BOS) is a complete stack of technologies for building SaaS
business applications that leverage the capabilities of NetSuite products.
On top of the SaaS infrastructure, the NetSuite Business Suite components offer accounting, ERP, CRM, and
ecommerce capabilities.
10.2.2 Productivity
Productivity applications replicate in the cloud the most common tasks that we are used to performing on our
desktops, from document storage to office automation and complete desktop environments hosted in the cloud.
1 Dropbox and iCloud
Online storage solutions have turned into SaaS applications and become more usable as well as more advanced
and accessible.
The most popular solution for online document storage is Dropbox, which allows users to synchronize any file
across any platform and any device in a seamless manner (Figure 10.6). Dropbox provides users with free
storage that is accessible through the abstraction of a folder. Users can access their Dropbox folder either
through a browser or by downloading and installing a Dropbox client, which provides access to the online
storage by means of a special folder. All modifications to this folder are silently synced, so that changes
are propagated to all the local instances of the Dropbox folder across all the devices.
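One simple way to detect which files in a synchronized folder have changed is to compare content hashes. The sketch below shows that general technique only; it is not a description of the actual Dropbox protocol, and the function names are hypothetical.

```python
import hashlib


def snapshot(files):
    """Map each path to the SHA-256 digest of its content.

    files: dict of path -> bytes (an in-memory stand-in for a folder).
    """
    return {path: hashlib.sha256(data).hexdigest()
            for path, data in files.items()}


def diff(local, remote):
    """Compare two snapshots and list what must change on the remote side."""
    upload = sorted(p for p in local if remote.get(p) != local[p])
    delete = sorted(p for p in remote if p not in local)
    return upload, delete


local = snapshot({"notes.txt": b"v2", "todo.txt": b"buy milk"})
remote = snapshot({"notes.txt": b"v1", "old.txt": b"obsolete"})
print(diff(local, remote))  # (['notes.txt', 'todo.txt'], ['old.txt'])
```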
Another interesting application in this area is iCloud, a cloud-based document-sharing application provided by
Apple to synchronize iOS-based devices in a completely transparent manner.
Documents, photos, and videos are automatically synched as changes are made, without any explicit operation.
This allows the system to efficiently automate common operations without any human intervention.
This capability is limited to iOS devices, and currently there are no plans to provide iCloud with a Web-based
interface that would make user content accessible from even unsupported platforms.
2 Google Docs
Google Docs is a SaaS application that delivers the basic office automation capabilities with support for
collaborative editing over the Web.
Google Docs allows users to create and edit text documents, spreadsheets, presentations, forms, and drawings.
It aims to replace desktop products such as Microsoft Office and OpenOffice and to provide a similar interface and
functionality as a cloud service.
By being stored in the Google infrastructure, these documents are always available from anywhere and from
any device that is connected to the Internet.
Google Docs is a good example of what cloud computing can deliver to end users: ubiquitous access to
resources, elasticity, absence of installation and maintenance costs, and delivery of core functionalities as a
service.
The EyeOS architecture is quite simple: On the server side, the EyeOS application maintains the information
about user profiles and their data, and the client side constitutes the access point for users and administrators to
interact with the system. EyeOS stores the data about users and applications on the server file system. Once the
user has logged in by providing credentials, the desktop environment is rendered in the client's browser by
downloading all the JavaScript libraries required to build the user interface and implement the core
functionalities of EyeOS.
EyeOS also provides APIs for developing new applications and integrating new capabilities into the system.
EyeOS applications are server-side components that are defined by at least two files (stored in the
eyeos/apps/appname directory): appname.php and appname.js. The first file defines and implements all the
operations that the application exposes; the JavaScript file contains the code that needs to be loaded in the
browser in order to provide user interaction with the application.
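The two-file convention described above can be expressed as a small check. The helper names are hypothetical and the check works on a plain list of paths, so it does not touch a real EyeOS installation.

```python
from pathlib import PurePosixPath


def app_files(appname, root="eyeos/apps"):
    """Return the two files that define an application by convention."""
    app_dir = PurePosixPath(root) / appname
    return [app_dir / f"{appname}.php", app_dir / f"{appname}.js"]


def missing_files(appname, existing_paths, root="eyeos/apps"):
    """List the required files that are absent from existing_paths."""
    existing = {PurePosixPath(p) for p in existing_paths}
    return [str(p) for p in app_files(appname, root) if p not in existing]


print(missing_files("notes", ["eyeos/apps/notes/notes.php"]))
# ['eyeos/apps/notes/notes.js']
```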
Xcerion XML Internet OS/3 (XIOS/3) is another example of a Web desktop environment. The service is
delivered as part of the CloudMe application, which is a solution for cloud document storage. The key
differentiator of XIOS/3 is its strong leverage of XML, used to implement many of the tasks of the OS:
rendering user interfaces, defining application business logic, structuring file system organization, and even
application development.
XIOS/3 is released as open-source software and implements a marketplace where third parties can easily deploy
applications that can be installed on top of the virtual desktop environment. It is possible to develop any type of
application and feed it with data accessible through XML Web services: developers have to define the user
interface, bind UI components to service calls and operations, and provide the logic on how to process the data.
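The three steps (define the UI, bind components to service calls, process the returned data) can be illustrated with a sketch. The XML vocabulary below is invented for this example and is not the XIOS/3 format; the service is a local function standing in for an XML Web service.

```python
import xml.etree.ElementTree as ET

# Hypothetical UI description: a button bound to a service, writing to a label.
UI_XML = """
<app name="quotes">
  <button id="refresh" label="Refresh" service="getQuote" target="quoteLabel"/>
  <label id="quoteLabel"/>
</app>
"""


def load_bindings(xml_text):
    """Collect component id -> (service name, target component) bindings."""
    root = ET.fromstring(xml_text)
    return {el.get("id"): (el.get("service"), el.get("target"))
            for el in root.iter() if el.get("service")}


def trigger(component_id, bindings, services, state):
    """Simulate a UI event: call the bound service and store its result."""
    service_name, target = bindings[component_id]
    state[target] = services[service_name]()
    return state


bindings = load_bindings(UI_XML)
services = {"getQuote": lambda: "42.0"}  # stand-in for a remote service call
print(trigger("refresh", bindings, services, {}))  # {'quoteLabel': '42.0'}
```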
1 Facebook
Facebook is probably the most evident and interesting environment in social networking.
With more than 800 million users, it has become one of the largest Websites in the world.
To sustain this incredible growth, it has been fundamental that Facebook be capable of continuously adding
capacity and developing new scalable technologies and software systems while maintaining high performance
to ensure a smooth user experience.
Currently, the social network is backed by two data centers that have been built and optimized to reduce costs
and impact on the environment.
On top of this highly efficient infrastructure, built and designed out of inexpensive hardware, a completely
customized stack of suitably modified and refined open-source technologies constitutes the back-end of the
largest social network.
The reference stack serving Facebook is based on LAMP (Linux, Apache, MySQL, and PHP). This collection
of technologies is accompanied by a collection of other services developed in-house.
These services are developed in a variety of languages and implement specific functionalities such as search,
news feeds, notifications, and others.
While serving page requests, the social graph of the user is composed.
The social graph identifies a collection of interlinked information that is of relevance for a given user.
Most of the user data are served by querying a distributed cluster of MySQL instances, which mostly contain
key-value pairs.
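Composing a user's social graph from key-value pairs spread over several database instances can be sketched as follows. In-memory dictionaries stand in for the MySQL instances, and the key naming scheme (`profile:`, `friends:`) and hashing choice are assumptions made for illustration, not Facebook's actual design.

```python
import zlib


class ShardedStore:
    """Key-value store spread over several shards (dicts stand in for DBs)."""

    def __init__(self, num_shards):
        self.shards = [{} for _ in range(num_shards)]

    def _shard(self, key):
        # A stable hash decides which shard owns the key.
        return self.shards[zlib.crc32(key.encode()) % len(self.shards)]

    def put(self, key, value):
        self._shard(key)[key] = value

    def get(self, key, default=None):
        return self._shard(key).get(key, default)


def social_graph(store, user_id):
    """Gather the interlinked information relevant to one user."""
    friends = store.get(f"friends:{user_id}", [])
    return {
        "user": store.get(f"profile:{user_id}"),
        "friends": [store.get(f"profile:{f}") for f in friends],
    }


store = ShardedStore(num_shards=4)
store.put("profile:1", {"name": "Ann"})
store.put("profile:2", {"name": "Bob"})
store.put("friends:1", [2])
print(social_graph(store, 1))
```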
1 Animoto
Animoto is the most popular example of media applications on the cloud. The Website provides users with a
very straightforward interface for quickly creating videos out of images, music, and video fragments submitted
by users. Users select a specific theme for a video, upload the photos and videos and order them in the sequence
they want to appear, select the song for the music, and render the video. The process is executed in the
background and the user is notified via email once the video is rendered.
A proprietary artificial intelligence (AI) engine, which selects the animation and transition effects according to
pictures and music, drives the rendering operation. Users only have to define the storyboard by organizing
pictures and videos into the desired sequence.
The infrastructure of Animoto is complex and is composed of different systems that all need to scale (Figure
10.8). The core function is implemented on top of the Amazon Web Services infrastructure. It uses Amazon
EC2 for the Web front-end and worker nodes; Amazon S3 for the storage of pictures, music, and videos; and
Amazon SQS for connecting all the components.
The system's auto-scaling capabilities are managed by Rightscale, which monitors the load and controls the
creation of new worker instances.
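The front-end, queue, worker, and storage arrangement can be sketched with local stand-ins: an in-process queue plays the role of Amazon SQS, a dictionary plays the role of Amazon S3, and threads play the role of EC2 worker nodes. The job format and the "rendering" step are placeholders, not Animoto's implementation.

```python
import queue
import threading


def run_pipeline(jobs, num_workers=2):
    job_queue = queue.Queue()  # stands in for Amazon SQS
    storage = {}               # stands in for Amazon S3
    lock = threading.Lock()

    def worker():
        while True:
            job = job_queue.get()
            if job is None:    # sentinel: no more work for this worker
                job_queue.task_done()
                break
            # Placeholder for the real rendering step.
            video = f"video({', '.join(job['assets'])})"
            with lock:
                storage[job["id"]] = video
            job_queue.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for job in jobs:           # the Web front-end enqueues rendering requests
        job_queue.put(job)
    for _ in threads:          # one sentinel per worker
        job_queue.put(None)
    job_queue.join()
    for t in threads:
        t.join()
    return storage


jobs = [{"id": "a", "assets": ["p1.jpg", "song.mp3"]},
        {"id": "b", "assets": ["p2.jpg"]}]
print(run_pipeline(jobs))
```

Because workers only communicate through the queue, more of them can be added under load without changing the front-end, which is what makes the auto-scaling described above possible.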
Game log processing is also utilized to build statistics on players and rank them. These features constitute the
additional value of online gaming portals that attract more and more gamers. The processing of game logs is a
potentially compute-intensive operation that strongly depends on the number of players online and the number
of games monitored.
The use of cloud computing technologies can provide the required elasticity for seamlessly processing these
workloads and scaling as required when the number of users increases. A prototype of
cloud-based game log processing has been implemented by Titan Inc. (Figure 10.10).
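Building player statistics and rankings from game logs can be sketched as below. The log format (`KILL <killer> <victim>`) is a hypothetical one chosen for the example and is unrelated to the Titan Inc. prototype.

```python
from collections import Counter


def rank_players(log_lines):
    """Aggregate kills and deaths per player and rank the players.

    Only lines of the form 'KILL <killer> <victim>' are counted;
    any other line is ignored.
    """
    kills, deaths = Counter(), Counter()
    for line in log_lines:
        parts = line.split()
        if len(parts) == 3 and parts[0] == "KILL":
            kills[parts[1]] += 1
            deaths[parts[2]] += 1
    players = set(kills) | set(deaths)
    # Rank by most kills, then fewest deaths, then name for a stable order.
    return sorted(((p, kills[p], deaths[p]) for p in players),
                  key=lambda row: (-row[1], row[2], row[0]))


log = ["KILL ann bob", "KILL ann cat", "KILL bob ann", "JOIN dan"]
print(rank_players(log))
# [('ann', 2, 1), ('bob', 1, 1), ('cat', 0, 1)]
```

Each log file can be processed independently and the counters merged afterwards, so the work spreads naturally over as many nodes as the current load requires.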