Demystifying The Cloud-eBook


Demystifying the Cloud
An introduction to Cloud Computing

Version 1.1 August 2010

This guide explains the concepts of Cloud Computing in simple terms. The first three chapters introduce the key concepts and terminology of the Cloud. The remaining chapters cover the major implementations of Cloud Computing, including Amazon Web Services, Microsoft Windows Azure Platform and Google App Engine. It is targeted at beginner and intermediate developers with a basic understanding of web technologies.

Disclaimer

This eBook was authored prior to my employment with Amazon Web Services. The views, terminology, architecture and references depicted in this material are my own and do not represent those of my employer.

Cloud Computing Strategist


www.janakiramm.net | [email protected]

Janakiram MSV

Chapter 1 Defining the Cloud


Evolution of Cloud Computing
Evolution of ISP

Multiple factors led to the evolution of Cloud Computing. One of the key factors is the way Internet Service Providers (ISPs) matured over time. I am borrowing this analogy from Forrester Research.

Evolution of ISP

Page 2 Janakiram MSV http://www.janakiramm.net

From the initial days of offering basic Internet connectivity to offering software as a service, ISPs have come a long way. ISP 1.0 was all about providing Internet access to customers. ISP 2.0 was the phase where ISPs offered hosting capabilities. The next step was co-location, through which ISPs started leasing out rack space and bandwidth. With this, companies could host their own servers running custom Line of Business (LoB) applications that could be accessed over the web by their employees, trading partners and customers. ISP 3.0 offered applications on subscription, resulting in the Application Service Provider (ASP) model. The latest, Software as a Service (SaaS), is a mature ASP model. The next logical step for ISPs would be to embrace the Cloud.

The Programmable Web

Web Services made the web programmable. They enabled developers to look at the Internet as a class library or an object model. Technologies like Simple Object Access Protocol (SOAP), Representational State Transfer (REST), JavaScript Object Notation (JSON) and Plain Old XML (POX) fueled the growth of APIs on the web. Today every popular search engine, social networking site and syndication portal offers APIs to developers.

Virtualization

Virtualization is one of the most discussed terms among CIOs and IT decision makers. Through Virtualization, the data center infrastructure can be consolidated from hundreds of servers to just tens of servers. All the physical server roles like Web Servers, Database Servers and Messaging Servers run as virtualized instances. This results in a lower Total Cost of Ownership (TCO) and brings substantial savings on power bills and cooling equipment.

Though the evolution of ISPs, the programmable web and virtualization are independent trends, they all contributed to the evolution of Cloud Computing.


Understanding Cloud Computing


If you are wondering what is so special about the Cloud in Cloud Computing, here is the explanation. Traditionally, developers and architects used a picture of a cloud to illustrate a remote resource connected via the web. Eventually the cloud became the logical connector between local and remote resources on the Internet.

Most developers get confused when they encounter the term Cloud Computing. According to them, their Web Services are already hosted on the Cloud and could therefore be called Cloud Services. While there is some truth in this argument, it is not a very accurate way of describing Cloud Computing. Let's look at Cloud Computing through the eyes of a developer.

Think Web Services

Most developers are familiar with Web Services. Web Services are based on a few simple concepts. Every Web Service accepts a request and returns a response (even if there is no explicit return value, an HTTP 200 OK status is considered a response). They are units of code that can be invoked over the web. Typically, Web Services accept one or more input parameters and invoke processing logic that produces an output. Web Services are part of web applications that run on a typical stack consisting of hardware, a server OS and an application development platform. For a while, think about how you could expose every layer that powers your web application as a Web Service.
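To make the request and response idea concrete, here is a minimal sketch in Python using only the standard library's WSGI support. The service name add_service, its parameters a and b, and the helper call are assumptions made up for this illustration; they do not come from any product discussed in this guide.

```python
from urllib.parse import parse_qs
from wsgiref.util import setup_testing_defaults


def add_service(environ, start_response):
    """A tiny Web Service: reads two query parameters and returns their sum."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    try:
        a = int(params["a"][0])
        b = int(params["b"][0])
    except (KeyError, ValueError):
        # Missing or non-numeric input still produces a response, just not 200 OK.
        start_response("400 Bad Request", [("Content-Type", "text/plain")])
        return [b"expected integer parameters a and b"]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [str(a + b).encode("ascii")]


def call(app, query_string):
    """Invoke a WSGI app in-process and return (status, body) for inspection."""
    environ = {}
    setup_testing_defaults(environ)
    environ["QUERY_STRING"] = query_string
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], body.decode("ascii")


if __name__ == "__main__":
    print(call(add_service, "a=2&b=3"))  # prints ('200 OK', '5')
```

The same function could be served over real HTTP with wsgiref.simple_server; calling it in-process simply keeps the sketch free of network setup.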

Web Services Stack


Cloud OS

Visualize a scenario where the hardware and the Operating System (OS) are exposed as a Web Service over the public Internet. Based on the principles of Web Services, we could send a request to this service along with a few parameters. Since the OS acts as an interface to the CPU and the devices, we could invoke a service that accepts a job to be processed by the OS and the underlying hardware. Technically, this Web Service has just turned the OS + H/W combination into a service. We can start consuming this service by submitting CPU-intensive tasks to this new breed of Web Service. What do you call an OS that is exposed on the web as a service? Maybe a Cloud OS? We will answer this in the coming sections.

Exposing the hardware and the OS as a Service


Cloud FX

Developers always develop and deploy their applications on application development platforms. Some of the most popular application development platforms are .NET and Java. In the last scenario, we saw how the OS + H/W combination is offered as a service. Now, imagine a scenario where the application development platform is offered to you as a service. Through this, you will be able to develop and test your applications on a low-end, inexpensive notebook PC but submit your code to run on the most powerful hardware infrastructure. It is the same programming language, SDK and runtime that runs on your development environment. If the hardware, OS, language runtime and SDK are offered to you as a service, what would you call this? A Cloud Platform, or maybe Cloud FX? We will address this in the next section.


Exposing the Runtime + SDK as a Service


Cloud Application

Today, most traditional desktop applications like word processors and spreadsheet packages are available over the web. This new breed of applications needs just a browser and offers high fidelity with the desktop software. This fundamentally changes the way software is deployed and licensed. You need not double-click setup.exe to install an Office suite on your desktop. Just subscribe to the applications and the features that you need, and pay only for what you use. This is almost equivalent to exposing the application as a service. These applications may be called Cloud Applications. We will take another look at this later.

Web App as a Service

Welcome to the World of Services


Infrastructure as a Service


In the previous section we discussed the Cloud OS. All that the Cloud OS offers is infrastructure services. You may choose to use a REST API to manage this OS, or use SSH or a Remote Desktop console. Technically, when you are able to delegate a program to execute on a remote OS running on the web, you are leveraging Infrastructure as a Service (IaaS). This is different from classic web hosting. Classic web hosting typically hosts web pages and cannot execute code that needs low-level access to the OS API. Nor can it dynamically scale on demand. IaaS enables you to run your computing task on a virtually unlimited number of machines. Remember that through IaaS, you have just moved a server running in your backyard into the Cloud. You still own the managing, patching, securing and health of the remote servers. Amazon EC2 is an example of a commercial IaaS offering.

Cloud OS = Infrastructure as a Service

Platform as a Service

Platform as a Service (PaaS) goes one level above the Cloud OS. Through this, developers can leverage a scalable platform to run their applications. The advantage of PaaS is that developers need not worry about installing, maintaining, securing and patching the server. The PaaS provider takes responsibility for the infrastructure and exposes the platform alone as a service. Through this, developers can achieve a higher level of scalability, reliability and availability for their applications. Microsoft Azure and Google App Engine are examples of PaaS.


Cloud FX = Platform as a Service

Software as a Service

Software as a Service (SaaS) is a silent revolution in the world of traditional software products. With the availability of Intel Atom based netbooks and abundant bandwidth, most applications are moving to the Cloud to be offered as services. Consumers can now use inexpensive devices that are capable of connecting to the web to get their work done. This reduces the upfront investment in software and brings in the Pay-as-you-go model. Google Apps, Salesforce.com and Microsoft Online Services are examples of SaaS.

Cloud App = Software as a Service

What does Cloud Computing mean to you?


IT Professionals and System Administrators

For IT Professionals, Cloud Computing is all about consolidating and outsourcing the infrastructure. They are typically focused on Infrastructure as a Service. IT Pros will move away from managing individual servers in their data centers to using a unified console to manage, track and monitor the health of the remote server instances running on the Cloud.

IaaS is the focus area of IT Pros and system administrators

Developers and Architects

Platform as a Service is an offering meant for developers and architects. They need to design applications keeping the statelessness of the Cloud in mind. Architects should start thinking about the patterns that will make applications scale seamlessly, on the fly, across hundreds of servers.

PaaS is the focus area of developers and architects

Consumers

Consumers will experience the Cloud through a variety of applications that they use in their day-to-day life. If you have ever used Google Docs or Microsoft Live Mesh, you have already been leveraging the Cloud. Consumers will subscribe to Software as a Service offerings.

SaaS delivers software through a subscription for the consumers


Chapter 2 The Tenets of the Cloud


The 4 Key Tenets
I want to quickly recap the definition of Cloud Computing. It is all about outsourcing your infrastructure and applications to run on a remote resource. The phrase "remote resource" in the definition can be misleading and creates the illusion that running your web app on a server hosted abroad is Cloud Computing. So, what qualifies the remote resource to be called the Cloud? Here are the 4 key capabilities that Cloud Computing offers:

Elasticity

This is the most important attribute of the Cloud. You might start running your application on just a single server. But in no time, Cloud Computing enables you to scale your application to run on hundreds of servers. Once the traffic and usage of your application decrease, you can scale down to tens of servers. All this happens almost instantly, and the best thing is that your application and your customers don't even notice. This dynamic capability to scale up and scale down is called Elasticity. Elasticity brings an illusion of infinity. Though nothing is infinite in this world, your application can get as many resources as it demands. This is the biggest unique selling point of the Cloud.

Now, think of web hosting. When you want to add another server to your web application, your hoster has to manually provision it for you. Adding servers and configuring the network topology introduces a time lag that your business cannot afford. Most Cloud Computing vendors offer an intuitive way of manipulating your server configuration and topology. Elasticity is the single most important attribute of the Cloud.
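The scale-up and scale-down decision can be sketched as a small function. This is only an illustration under simple assumptions: the function name desired_servers, the per-server capacity of 200 requests per second and the traffic figures are all made up, and real providers use richer signals than a single load number.

```python
import math


def desired_servers(requests_per_second, capacity_per_server,
                    min_servers=1, max_servers=100):
    """Return how many servers are needed to carry the current load."""
    needed = math.ceil(requests_per_second / capacity_per_server)
    # Never drop below the minimum, never grow past the configured ceiling.
    return max(min_servers, min(max_servers, needed))


if __name__ == "__main__":
    # Hypothetical traffic samples over a day, in requests per second.
    for load in [50, 900, 12000, 2500, 80]:
        count = desired_servers(load, capacity_per_server=200)
        print(load, "req/s ->", count, "server(s)")
```

With these sample numbers the fleet grows from 1 server to 60 at the peak and shrinks back to 1 when traffic falls, which is the elasticity described above.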


Elasticity of Cloud Computing

Pay-By-Use

Elasticity and Pay-By-Use go hand in hand. When you scale up your application by adding more resources, you know how much it is going to cost you. Pay-By-Use is a boon for startups. As an entrepreneur, you have to balance your investment between human resources and IT resources. The biggest benefit of Pay-By-Use is that it reduces capital expenditure (CAPEX) and turns your IT investment into operating expenditure (OPEX). The analogy that I typically use is that of a cable or DTH TV subscription. During the Cricket World Cup or the NBA season, you would want to subscribe to the sports channels and unsubscribe the moment the event is over. With Pay-By-Use, you can subscribe and unsubscribe to IT infrastructure based on your needs, and you only pay for what you use. This is the most optimal way of spending your IT budget.

Self Service

When you are able to scale up and down and pay only for what you use, you never want to wait for someone in the data center to add a server to your application. The Cloud can deliver on its promise only when there is Self Service. Through this, you can control the resources all by yourself without an intermediary. When you add a new CPU core, a server instance or extra storage, you do it yourself using the console offered by the Cloud provider. This reduces IT support and maintenance. Today most organizations have dedicated IT teams to provision a new machine, storage, collaboration portal and mailboxes as part of on-boarding new employees. Through Self Service, a fairly non-technical person can achieve these tasks, and you don't need certified system administrators to do this. For example, when you sign up with Google Apps, it is very simple and intuitive to configure the mailboxes for the employees. With more and more applications moving to the Cloud, Self Service becomes the preferred way of configuring and managing the IT infrastructure.

Server Configuration @ ElasticHosts.com

ElasticFox for managing Amazon Web Services

Programmability

This is a critical parameter of the Cloud. The Cloud makes developers extremely important. Developers are familiar with the concept of multithreading, where they spawn new threads to improve the scalability and responsiveness of the application. They incorporate logic to create additional threads on demand. The programmability aspect of the Cloud adds a new dimension to development. Developers can now create additional machines and add them to their applications on demand. They can treat the entire data center, servers and machines as an object model that can be programmed. They can now run a for-each loop over every server instance and decide what to do with each instance. Amazon Web Services has the most mature API for programmatically controlling Cloud-based resources. Azure has started supporting a management API that lets developers programmatically deploy and manage Azure applications. By leveraging these APIs, developers are building applications to manage the infrastructure, and some of these frontends run on iPhone and Windows Mobile. Now, imagine clicking a button on your mobile phone to add a dozen servers to your application. Thanks to the Cloud, developers are more important than ever!
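The idea of treating servers as an object model can be sketched with plain Python objects. Everything here is hypothetical: Instance, DataCenter, launch and stop_idle are invented names standing in for whatever a provider's management API exposes, and no real cloud SDK is being called.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """Hypothetical server instance; not a type from any real cloud SDK."""
    instance_id: str
    cpu_percent: float
    running: bool = True


@dataclass
class DataCenter:
    """Hypothetical in-memory stand-in for a provider's management API."""
    instances: list = field(default_factory=list)

    def launch(self, count):
        """Add `count` new instances, the programmatic 'add a dozen servers'."""
        start = len(self.instances)
        new = [Instance(f"i-{start + n:04d}", cpu_percent=0.0) for n in range(count)]
        self.instances.extend(new)
        return new


def stop_idle(dc, idle_below=5.0):
    """The for-each loop: visit every instance and decide what to do with it."""
    stopped = []
    for inst in dc.instances:
        if inst.running and inst.cpu_percent < idle_below:
            inst.running = False
            stopped.append(inst.instance_id)
    return stopped


if __name__ == "__main__":
    dc = DataCenter([Instance("i-0000", 72.0), Instance("i-0001", 1.5)])
    print(stop_idle(dc))       # prints ['i-0001']
    dc.launch(12)              # add a dozen servers
    print(len(dc.instances))   # prints 14
```

A real implementation would replace the in-memory list with calls to the vendor's API, but the shape of the code, a loop over instances plus a decision per instance, stays the same.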

AWS SDK for .NET


Azure Tools for Eclipse

iPhone App to manage AWS

So, let's summarize what we just discussed. Cloud Computing has 4 key tenets: 1) Elasticity, 2) Pay-By-Use, 3) Self Service, and 4) Programmability.

Hosting vs. Cloud Computing

Revisiting the ongoing debate of hosting vs. Cloud Computing, let's see which of these attributes the hosting model exposes. Hosting can never meet the promise of elasticity. Even if it does, it won't match the economics of the Cloud. Hosting does offer some level of Self Service, but not to the extent of manipulating the server configuration on the fly! The Pay-By-Use attribute is emulated by some hosting companies, but it is not the norm in the hosting business. Programmability is too expensive for hosters to support, as they cannot invest in the SDKs and tools to manage the infrastructure. So, it is evident that hosting is not the same as Cloud Computing.

Having understood the key attributes of the Cloud, you might start wondering how you can bring these capabilities to your data center in the enterprise. The reality is that these capabilities can be applied to your data center, and that is called the Private Cloud. It is time for us to discuss the various implementations of the Cloud. We will look at 4 different ways in which the Cloud can be implemented.

Hosting vs. the Cloud

The 4 Implementations of the Cloud


Public Cloud

This is the most popular incarnation of the Cloud. Many businesses and individuals experience the Cloud through the Public Cloud implementation. It needs a huge investment, and only well-established companies with deep pockets like Microsoft, Amazon and Google can afford to set one up. A Public Cloud is implemented on thousands of servers running across hundreds of data centers deployed across tens of locations around the world. The best thing about the Public Cloud is that customers can choose a location for their application to be deployed. This reduces the latency when consumers access the application. For example, a London-based business can choose to deploy its app at the Europe data center, while an American company may prefer a data center in North America. With this geographical spread, Public Clouds like Amazon Web Services and Microsoft Windows Azure also offer Content Delivery Network (CDN) features. Through this, static content is automatically replicated across data centers around the globe, increasing the scalability and availability of the applications.

Public Cloud

Private Cloud

Simply put, Private Clouds are normal data centers within an enterprise with all the 4 attributes of the Cloud: Elasticity, Self Service, Pay-By-Use and Programmability. By setting up a Private Cloud, enterprises can consolidate their IT infrastructure. They will need fewer IT staff to manage the data center. They will also see reduced power bills because of lower electricity consumption and reduced cooling needs. A Private Cloud empowers employees within an organization through Self Service of their IT needs. It becomes easy to provision new machines and quickly assign them to project teams. A Private Cloud borrows some of the best practices of the Public Cloud but is limited to an organizational boundary. A Private Cloud can be set up using a variety of offerings from VMware, Microsoft, IBM, Sun and others. There are also open source implementations like Eucalyptus and Ubuntu Enterprise Cloud. We will discuss more of Private Cloud in the coming episodes.


Private Cloud

Hybrid Cloud

There are scenarios where you need a combination of Private Cloud and Public Cloud. Due to regulations and compliance issues in a few countries, sensitive data like citizen information, patient medical history and financial transactions cannot be stored on servers that are not physically located within the political boundaries of the country. In some scenarios, enterprise customers want to get the best of both worlds by logically connecting their Private Cloud and the Public Cloud. Through this, they can offer seamless scalability by moving some of the on-premise and Private Cloud based applications to the Public Cloud. Security plays a critical role in connecting the Private Cloud to the Public Cloud. Realizing its importance, Amazon Web Services has recently announced Virtual Private Cloud (VPC), which securely bridges a Private Cloud and Amazon Web Services. It is almost like extending your infrastructure beyond the organizational boundary and the firewall in a secure way. Microsoft's recent announcement of Windows AppFabric brings the concept of Hybrid Cloud to Microsoft's future customers.


Hybrid Cloud

Community Cloud

A Community Cloud is implemented when a set of businesses have a similar requirement and share the same context. It is made available to a select set of organizations. For example, the Federal government in the US may decide to set up a government-specific Community Cloud that can be leveraged by all the states. Through this, individual local bodies like state governments will be freed from investing in, maintaining and managing their local data centers. Similarly, the Reserve Bank of India (RBI) may set up a Community Cloud for all the financial institutions that share common goals and requirements. So, a Community Cloud is a sort of Private Cloud that goes beyond just one organization.

Community Cloud


Chapter 3 The Anatomy of the Cloud


Introduction to Virtualization
Virtualization is abstracting the hardware to run virtual instances of multiple guest operating systems on a single host operating system. You can see Virtualization in action by installing Microsoft Virtual PC, VMware Player or Sun VirtualBox. These are desktop virtualization solutions that let you install and run an OS within the host OS. The virtualized guest OS images are called Virtual Machines. The benefit of virtualization is realized more on the servers than on the desktops.

Server Virtualization

There are many reasons for running Virtualization on the servers in a traditional data center. Here are a few:

Mean Time To Restore

It is far more flexible and faster to restore a failed web server, app server or database server that is running as a virtualized instance. Since these instances are just files on the hard disk of the host operating system, copying over a replica of the failed server image is faster than restoring a failed physical server. Administrators can maintain multiple versions of the VMs that come in handy during restoration. The best thing about this is that the whole copy and restore process can be automated as part of a disaster recovery plan.


Maximizing the server utilization

It is very common that certain servers in the data center are underutilized while others are maxed out. Through virtualization, the load can be evenly spread across all the servers. There are management software offerings that will automatically move VMs to idle servers to dynamically manage the load across the data center.

Reduction in maintenance cost

Virtualization has a direct impact on the bottom line. First, by consolidating the data center to run on fewer but more powerful servers, there is a significant cost reduction. The power consumed by the data center and the maintenance cost of the cooling equipment come down drastically. The other problem that virtualization solves is the migration of servers. When the hardware reaches the end of its lifecycle, the physical servers need to be replaced. Backing up and restoring the data and reinstalling software on a production server is very complex and expensive. Virtualization makes this process extremely simple and cost effective. The physical servers are replaced and the VMs just get restarted without any change in configuration. This has a big impact on IT budgets.

Efficient management

All major virtualization software has a centralized console to manage, maintain, track and monitor the health of the physical servers and the VMs running on them. Because of the simplicity and the dynamic capabilities, IT administrators spend less time managing the infrastructure. This results in better management and cost savings for the company.
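Spreading VM load across servers can be illustrated with a simple greedy placement. This is a sketch of one common heuristic (heaviest VM first, onto the least-loaded host), not the algorithm used by any particular management product; the function name spread_vms, the VM names and the load figures are invented for the example.

```python
def spread_vms(vm_loads, host_names):
    """Assign each VM to the currently least-loaded host (greedy heuristic)."""
    placement = {host: [] for host in host_names}
    totals = {host: 0 for host in host_names}
    # Place the heaviest VMs first so the small ones can fill the gaps.
    for vm, load in sorted(vm_loads.items(), key=lambda item: item[1], reverse=True):
        target = min(host_names, key=lambda host: totals[host])
        placement[target].append(vm)
        totals[target] += load
    return placement, totals


if __name__ == "__main__":
    # Hypothetical CPU load per VM, in arbitrary units.
    vms = {"web-1": 40, "web-2": 35, "db-1": 60, "mail-1": 15, "batch-1": 30}
    placement, totals = spread_vms(vms, ["host-a", "host-b"])
    print(placement)
    print(totals)  # prints {'host-a': 90, 'host-b': 90}
```

In this sample both hosts end up carrying 90 units, instead of one host running hot while the other sits idle.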

Virtualization on the Server


Let's understand more about server virtualization. Typically the OS is designed to act as an interface between the applications and the hardware. It is not specifically designed to run guest OS instances on top of it.

OS manages the applications


In fact, in the server virtualization scenario, the host OS is not very significant. It is confined to booting up and running the VMs. Given that the OS is not ideal for running multiple VMs and has little role to play, there is a new breed of software called the Hypervisor that takes over from the OS. A Hypervisor is an efficient Virtual Machine Manager (VMM) that is designed from the ground up to run multiple high-performance VMs. So, a Hypervisor is to VMs what an OS is to processes.

A Hypervisor can potentially replace the OS and can even boot directly from a VM. This is called the bare metal approach to virtualization. These Hypervisors have a small footprint of a few megabytes (VMware ESXi is just 32MB in size!) and come with an embedded OS. Hypervisors are assisted by the hardware virtualization features built into the latest Intel and AMD CPUs. This combination of hardware and Hypervisor turns the server into a lean and mean machine to host multiple VMs. The VM that the Hypervisor boots as a host is referred to here as a paravirtualized VM. This concept makes virtualization extremely powerful. Imagine a server booting in a few seconds and the required paravirtualized (host) VM being copied over gigabit Ethernet to run multiple guest VMs. This makes the data center very dynamic and agile. The Hypervisor can be controlled from a central console and told which host VM to boot and which guest VMs to run on it.


Bare Metal Virtualization

A look at the Hypervisor market


Citrix XenServer

This product is based on the proven, open source Hypervisor called Xen. Xen's paravirtualization technology is widely acknowledged as the fastest and most secure virtualization software in the industry, and it is enhanced by taking full advantage of the latest Intel VT and AMD-V hardware virtualization assist capabilities. This product is free and can be downloaded from Citrix.com.

VMware ESXi

This product is another bare metal Hypervisor, from the virtualization leader VMware. It is one of the best Hypervisors, with a footprint of just 32MB. ESXi ships with the Direct Console User Interface (DCUI), which provides the basic UI required for administering and managing the Hypervisor. Through its standard Common Information Model (CIM) system, it also exposes APIs to control the infrastructure.

Microsoft Hyper-V Server

This is a free Hypervisor from Microsoft based on the same Hypervisor that ships with Microsoft Windows Server Hyper-V edition. It is best suited for Virtual Desktop Infrastructure (VDI) because of its compatibility with Windows Vista and Windows 7. Hyper-V does not have any local GUI but can be managed from System Center Virtual Machine Manager (SCVMM).

Virtualization and the Cloud


The architecture that we discussed forms the heart and soul of Cloud Computing. Here is how:

Elasticity

We know that the key attribute of the Cloud is Elasticity, which is the ability to scale up and scale down on the fly. This capability is achieved only through virtualization. Scaling up is technically adding more server VMs to an application, and scaling down is detaching VMs from the application.

Self Service

The next attribute is Self Service. The Hypervisor comes with an API and the required agents to manage it remotely. This functionality can surface through the Self Service portals that the Cloud vendor offers. So, when you move a slider to increase the number of servers in your web tier, you are essentially asking the Hypervisor to act on that request.

Pay-By-Use

Pay-By-Use is the next attribute of the Cloud. By leveraging the management and monitoring capabilities of the Hypervisor, metering the usage of resources like CPUs, RAM and storage can be easily achieved.

Programmable Infrastructure

Programmable Infrastructure is the last key tenet of the Cloud. We already saw how the API wired into Hypervisors can be leveraged. Developers can talk to the Hypervisor directly through its native APIs or through Web Services exposed by the Cloud vendors. Through this, they can take control of the VMs.

It is clear that the Cloud relies heavily on virtualization and efficient Hypervisors to achieve its goal.
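Metering is what turns monitoring data into a Pay-By-Use bill. The sketch below shows the arithmetic only. The UsageRecord fields, the bill function and especially the rates are assumptions invented for illustration; they are not any vendor's pricing or metering format.

```python
from dataclasses import dataclass

# Hypothetical hourly rates, made up for illustration; not any vendor's pricing.
RATES = {"cpu_core_hours": 0.05, "ram_gb_hours": 0.01, "storage_gb_hours": 0.0002}


@dataclass
class UsageRecord:
    """One metering sample as a hypervisor's monitoring layer might report it."""
    vm_id: str
    hours: float
    cpu_cores: int
    ram_gb: float
    storage_gb: float


def bill(records, rates=RATES):
    """Sum metered usage per resource and price it; returns (usage, total_cost)."""
    usage = {"cpu_core_hours": 0.0, "ram_gb_hours": 0.0, "storage_gb_hours": 0.0}
    for r in records:
        usage["cpu_core_hours"] += r.cpu_cores * r.hours
        usage["ram_gb_hours"] += r.ram_gb * r.hours
        usage["storage_gb_hours"] += r.storage_gb * r.hours
    total = sum(usage[name] * rates[name] for name in usage)
    return usage, round(total, 2)


if __name__ == "__main__":
    records = [
        UsageRecord("vm-1", hours=10, cpu_cores=2, ram_gb=4, storage_gb=100),
        UsageRecord("vm-2", hours=5, cpu_cores=4, ram_gb=8, storage_gb=50),
    ]
    usage, total = bill(records)
    print(usage)
    print(total)
```

For the two sample VMs this works out to 40 CPU core-hours, 80 GB-hours of RAM and 1250 GB-hours of storage, so the customer pays only for the hours the VMs actually ran.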

Dissecting the Cloud


Now that we know how Virtualization forms the core of the Cloud, let me put things in perspective. Let's see what actually goes on inside the Cloud.

Geographic location

We start by deciding where to physically run our application. Most Cloud providers give you an option to host your application at a specific location. Depending on the customer base and the expected user location, you can choose a location. This will ensure that all your components, like storage, compute and database services, are hosted within the same data center. This reduces latency and makes the application more responsive.
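One way to reason about the choice of location is to weigh each candidate region's latency by where your users are. The sketch below assumes you already have rough latency measurements; the function name pick_region, the region names and every number are hypothetical.

```python
def pick_region(latency_ms, user_share):
    """Return the region with the lowest user-weighted average latency."""
    def weighted(region):
        return sum(user_share[country] * latency_ms[region][country]
                   for country in user_share)
    return min(latency_ms, key=weighted)


if __name__ == "__main__":
    # Hypothetical round-trip latencies from each region to each user country.
    latency_ms = {
        "europe": {"uk": 20, "us": 110},
        "north-america": {"uk": 95, "us": 25},
    }
    # A London-based business with most of its users in the UK.
    print(pick_region(latency_ms, {"uk": 0.8, "us": 0.2}))  # prints europe
```

With the user mix reversed (mostly US users), the same function would pick the North American region instead.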

Geographically spread Cloud data centers

Data Center

Though you do not have a direct choice in this, your app will be deployed at a data center physically located in the place that you have chosen. These data centers typically run thousands of powerful servers that offer a lot of storage and computing power.

A Cloud data center runs hundreds of servers

Server

You never know which physical server is responsible for running your code and application. In most cases, the app that you deployed may be powered by more than one server running within the same data center. You cannot assume that the same physical server will run the next instances of your app. Servers are treated as a commodity resource to host the VMs. There is no affinity between a VM and a physical server. Each server in the data center is optimally utilized at any given point.

Each server runs the Hypervisor and the VM(s)

Virtual Machine

This is the layer that you will directly interact with. In Platform as a Service (PaaS), you may not realize that you are dealing with a VM, but in reality most Cloud implementations will host your code or app on a VM. VMs are essential to respect the 4 tenets of the Cloud. Your application runs on a VM that is managed by the Hypervisor running across all the servers. These VMs are moved across servers based on server utilization. There is no guarantee that the VM that you launch will keep running on the same physical server. A load balancer ensures that your applications are scalable by exploiting the power of all the VMs associated with your application.
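The load balancer's job can be sketched with the simplest possible policy, round robin. This is an illustration only: the class name RoundRobinBalancer and the VM names are made up, and production load balancers also consider health checks and current load.

```python
import itertools


class RoundRobinBalancer:
    """Hand each incoming request to the next VM in a fixed rotation."""

    def __init__(self, vm_names):
        if not vm_names:
            raise ValueError("at least one VM is required")
        self._cycle = itertools.cycle(list(vm_names))

    def route(self, request):
        """Return (vm, request) so the caller can see where the request went."""
        return next(self._cycle), request


if __name__ == "__main__":
    lb = RoundRobinBalancer(["vm-a", "vm-b", "vm-c"])
    print([lb.route(f"req-{n}")[0] for n in range(5)])
    # prints ['vm-a', 'vm-b', 'vm-c', 'vm-a', 'vm-b']
```

Because the caller only ever talks to the balancer, VMs can be added, removed or moved between physical servers without the application's users noticing.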


Under the hood of a server


Chapter 4 Introducing Amazon Web Services


Overview of Amazon Web Services

Amazon Web Services (AWS) is one of the earliest and most successful implementations of the Public Cloud. Many well-known online properties leverage AWS. Amazon initially started by offering a Cloud based message queuing service called Amazon Simple Queue Service (SQS). It eventually added services such as Mechanical Turk, Simple Storage Service (S3), Elastic Compute Cloud (EC2), a CDN service called CloudFront, and a flexible, distributed database service called SimpleDB. Amazon recently announced the availability of MySQL in the Cloud through a service called Relational Database Service (RDS).

Amazon Web Services

Given that Amazon offers the core capabilities to run a complete web application or a Line of Business application, it clearly falls under Infrastructure as a Service (IaaS). AWS is truly the platform of platforms: you can choose the OS, app server and programming language of your choice. AWS SDKs and APIs are available for most of the popular languages, including Java, .NET, Python and Ruby.


AWS API Bindings (figure labels: Your Application, EC2, Amazon Physical Infrastructure, SOAP or REST based API)

Let's take a closer look at some of the major Cloud service offerings from Amazon:

S3

Amazon's Simple Storage Service or S3 is a great way to store data on the Cloud that can be accessed by any application with access to the Internet. S3 can store any arbitrary data as objects accompanied by metadata. These objects are organized into buckets. Every bucket and object has a set of permissions defined in an Access Control List (ACL). The objects stored in S3 can be anything from a document or a media file to serialized objects or even Virtual Machine images. Each object can be up to 5GB in size, while the metadata can be up to 2KB. All the objects can be accessed using simple REST or SOAP calls. This makes S3 an ideal storage solution to centrally store and retrieve data across multiple clients. S3 can also be treated as a virtual file system to provide persistent storage capabilities to applications.
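As a small illustration of how buckets and objects are addressed, the sketch below builds the bucket/object URL pattern used in this chapter and checks an object against the 5GB and 2KB limits quoted above. The function names are my own, and counting metadata size as the total bytes of all names and values is an assumption made for the example; this is not part of any AWS SDK.

```python
from urllib.parse import quote

MAX_OBJECT_BYTES = 5 * 1024 ** 3   # 5GB object limit quoted in the text
MAX_METADATA_BYTES = 2 * 1024      # 2KB metadata limit quoted in the text


def object_url(bucket, key):
    """Build the bucket-style URL for one object."""
    # quote() escapes characters such as spaces but keeps "/" separators.
    return f"http://{bucket}.s3.amazonaws.com/{quote(key)}"


def fits_in_s3(object_bytes, metadata):
    """Check an object and its metadata against the limits above.

    Assumption: metadata size is the total bytes of all names and values.
    """
    metadata_bytes = sum(
        len(name.encode()) + len(value.encode())
        for name, value in metadata.items()
    )
    return object_bytes <= MAX_OBJECT_BYTES and metadata_bytes <= MAX_METADATA_BYTES


# Hypothetical bucket and key names, for illustration only.
print(object_url("my-bucket", "reports/q1 summary.pdf"))
print(fits_in_s3(10 * 1024 ** 2, {"author": "jani", "type": "report"}))
```

Because every object has its own URL, any client that can issue an HTTP request can read it, subject to the permissions in the ACL.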


Simple Storage Services (figure labels: Bucket, Object, http://bucket.s3.amazonaws.com/object)

EC2

In simple terms, using EC2 is like renting a server that runs at a remote location. These servers are actually virtual machines running in Amazon's powerful data centers. Each virtual server instance is launched from a template that Amazon calls an Amazon Machine Image (AMI). Instances come in different sizes that you can choose from; please refer to http://aws.amazon.com/ec2/#instance for more details on the instance types. There are many pre-configured AMIs to choose from. The typical workflow on EC2 is to choose a pre-configured AMI, launch it, customize the running instance by adding software and loading your app and, finally, save the result as your own custom AMI on S3. You can launch multiple instances of your AMI and associate a static address, called an Elastic IP, with an instance. Because you can dynamically launch more instances of the same AMI to scale up and terminate them to scale down, the service is called Elastic Compute Cloud.
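The scale-up and scale-down decision can be sketched with a little arithmetic. The following Python example assumes that every instance of an AMI can serve a fixed number of requests per second; the function names and the numbers are hypothetical, and no EC2 API is called.

```python
import math


def instances_needed(requests_per_second, capacity_per_instance, minimum=1):
    """Return how many identical instances are needed for the current load."""
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    needed = math.ceil(requests_per_second / capacity_per_instance)
    # Always keep at least `minimum` instances running.
    return max(minimum, needed)


def scaling_action(running, requests_per_second, capacity_per_instance):
    """Describe whether to launch or terminate instances of the same AMI."""
    target = instances_needed(requests_per_second, capacity_per_instance)
    if target > running:
        return ("launch", target - running)
    if target < running:
        return ("terminate", running - target)
    return ("hold", 0)


# Peak traffic: 2 instances running, 950 requests/s, 200 requests/s each.
print(scaling_action(2, 950, 200))
# Quiet period: 5 instances running, 100 requests/s.
print(scaling_action(5, 100, 200))
```

You pay only for the instances that are running, so terminating idle instances during quiet periods is what turns elasticity into savings.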


Elastic Compute Cloud

SQS

SQS is the message queue on the Cloud. It lets applications send messages programmatically through web service calls as a way to communicate over the Internet. Message Oriented Middleware (MOM) is a popular way of ensuring that messages are delivered once and only once. Moving that infrastructure to the web by yourself is expensive and hard to maintain. SQS gives you this capability on demand through the pay-by-use model. SQS is accessible through REST and SOAP based APIs.

Simple Message Queue

CloudFront

When your web application targets global users, it makes sense to serve the static content from a server that is closer to the user. One of the solutions based on this principle is called a Content Delivery Network (CDN). But building this infrastructure of geographically spread servers to serve static content can be very expensive. CloudFront is CDN as a service. Amazon leverages its data center presence across the globe by serving content through these edge locations. CloudFront utilizes S3 by replicating the buckets across multiple edge servers. Amazon charges you only for the data that is served through CloudFront, and there is no requirement for upfront payment.
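The idea of serving content from a location closer to the user can be sketched as follows. This Python example assumes that "closer" means the shortest great-circle distance; the edge locations and coordinates are invented for illustration and are not Amazon's actual list, and a real CDN typically routes on network conditions rather than geography alone.

```python
import math

# Hypothetical edge locations as (latitude, longitude); not Amazon's real list.
EDGE_LOCATIONS = {
    "us-east": (38.9, -77.0),
    "europe": (53.3, -6.3),
    "asia": (1.35, 103.8),
}


def distance_km(a, b):
    """Great-circle distance between two (lat, lon) points (haversine)."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(h))


def nearest_edge(user_location, edges=EDGE_LOCATIONS):
    """Pick the edge location closest to the user."""
    return min(edges, key=lambda name: distance_km(user_location, edges[name]))


print(nearest_edge((48.85, 2.35)))    # a user in Paris
print(nearest_edge((12.97, 77.59)))   # a user in Bangalore
```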

Cloud Front

SimpleDB

If S3 offers storage for arbitrary binary data, SimpleDB is a flexible way to store name/value pairs on the Cloud. This dramatically reduces the overhead of continuously maintaining a relational database. SimpleDB is accessed through REST and HTTP calls and can be easily consumed by any client that can parse an HTTP response. Many Web 2.0 applications built using AJAX, Flash and Silverlight can easily access data from SimpleDB. It is the only service from Amazon that is free up to a specific threshold.
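To show what storing name/value pairs without a fixed schema looks like, here is a toy in-memory store in Python. The class and method names are my own and only mimic the idea; this is not the SimpleDB API.

```python
class NameValueStore:
    """Toy store of items, each holding free-form name/value attributes."""

    def __init__(self):
        self._items = {}

    def put(self, item_name, **attributes):
        # No schema: every item may carry a different set of attribute names.
        self._items.setdefault(item_name, {}).update(attributes)

    def get(self, item_name):
        return dict(self._items.get(item_name, {}))

    def select(self, name, value):
        """Return the names of items whose attribute `name` equals `value`."""
        return sorted(
            item for item, attrs in self._items.items()
            if attrs.get(name) == value
        )


store = NameValueStore()
store.put("book-1", title="Cloud Basics", category="ebook")
store.put("song-7", title="Blue Sky", category="music", length="3:41")
store.put("book-2", title="Scaling Up", category="ebook", pages="46")

print(store.select("category", "ebook"))
print(store.get("song-7"))
```

Notice that the song carries a length attribute and one of the books a pages attribute, yet neither required a change to a table definition.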


Simple DB

RDS

Amazon RDS offers a relational database on the Cloud. It is based on the popular MySQL database. When you are moving a traditional Line of Business application to the Cloud and want to maintain high fidelity with the existing systems, you can choose RDS. The advantage of RDS is that you do not install, configure, manage or maintain the DB server. Routine operations like patching the server and backing up the databases are taken care of by Amazon, and you only consume the service. RDS is priced on a pay-as-you-go model and there is no upfront investment required. It is accessible through REST and SOAP based APIs.

Relational Database Services

Scenarios
Scalable Web Application

If you are an aspiring entrepreneur and want to go live with your app without an upfront investment, Amazon is the place to go. By running your web app on Amazon, you can dynamically scale your application on demand and pay only for what you use. This can be the best playground for determining your server capacity needs and assessing peak traffic patterns before the commercial launch of your web app.

Line of Business Application

If your enterprise has to open up an internal LOB application to its employees and trading partners, it can extend the application to the Cloud by leveraging an AWS concept called Virtual Private Cloud (VPC). This achieves Hybrid Cloud capabilities by partially moving an application to the Cloud while still running the sensitive and proprietary part of the LOB application secured behind the firewall. VPC enables organizations to securely extend themselves to the Cloud.

Data Archival

Data that is not very frequently accessed but may be required due to data retention policies can be easily archived on Amazon S3. By building a simple, searchable frontend, this data can be searched and retrieved on demand. Moving the data to the Cloud ensures that it is available from anywhere, at any time.

High-Performance Computing On Demand

Many enterprises have an occasional requirement for high performance computing. Investing in high-end servers is not an optimal solution because they may not be utilized after the task is done. With AWS, companies can virtually hire as much computing power as they need and pay only for what they use. This eliminates the expensive proposition of investing in the infrastructure.


Scalable Media Delivery

A TV channel might want to start delivering its recorded shows to a global audience. Since most of the content is static, it can leverage CDN capabilities. Signing up with services like Akamai and Limelight can be expensive. If the media content is already stored on S3, it is very easy and cost effective to leverage Amazon's CloudFront to deliver it through the geographically spread edge locations.


Chapter 5 Introduction to Microsoft Windows Azure Platform


Overview of Windows Azure Platform

At a high level, the Windows Azure Platform has 4 key services. The first one is Windows Azure, which is the Cloud OS from Microsoft. The second is AppFabric, which enables the integration of on-premise services with the Cloud. The third is a database on the Cloud called SQL Azure, which is based on Microsoft SQL Server. The latest addition to the platform is a service codenamed Dallas, which is a marketplace to publish, discover, consume and analyze premium content. Though the Windows Azure Platform is designed for developers building applications on the Microsoft platform, it can also be leveraged by developers building applications in Java and PHP environments. Microsoft is investing in the right set of tools and plug-ins for Eclipse and other popular developer environments.


Windows Azure Platform

I will first explain each of the components of the Windows Azure Platform and then walk you through the scenarios for deploying applications on this platform.

Windows Azure

Windows Azure is the heart and soul of the Azure Platform. It is the OS that runs on each and every server in data centers across multiple geographic locations. It is interesting to note that the Windows Azure OS is not available as a retail OS. It is a homegrown version designed exclusively to power Microsoft's Cloud infrastructure. Windows Azure abstracts the underlying hardware and creates the illusion that it is just one instance of an OS. Because this OS runs across multiple physical servers, there is a layer on top that coordinates the execution of processes. This layer is called the Fabric. In between the Fabric and the Windows Azure OS, there are hundreds of Virtual Machines (VMs) that actually run the code and the applications. As a developer, you will see only two services at the top of this stack: 1) Compute and 2) Storage.

Windows Azure architecture

You interact with the Compute service when you deploy your applications on Windows Azure. Applications are expected to run within one of two roles, called the Web Role and the Worker Role. The Web Role is meant to host typical ASP.NET web applications or any other CGI web applications. The Worker Role hosts long running processes that do not have any UI. Think of the Web Role as an IIS container and the Worker Role as the Windows Services container. The Web Role and the Worker Role can talk to each other in multiple ways. The Web Role can also host WCF Services that expose an HTTP endpoint. The code within the Worker Role runs independently of the Web Role. Through the Worker Role, you can port either .NET applications or native COM applications to Windows Azure.

When you run an application, you definitely need storage, either for simple configuration data or for more complex binary data. Windows Azure Storage comes in three flavors: 1) Blobs, 2) Tables and 3) Queues. Blobs can store large binary objects like media files, documents and even serialized objects. Tables offer flexible name/value based storage. Finally, Queues are used to deliver messages reliably between applications. Queues are the best mechanism to communicate between the Web Role and the Worker Role. The data stored in Azure Storage can be accessed through HTTP and REST calls.
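The way a Web Role hands work to a Worker Role through a queue can be sketched locally. In the Python example below, a standard library queue stands in for an Azure Storage Queue and a background thread stands in for the Worker Role; the function names and the thumbnail task are hypothetical, and none of this is the Windows Azure SDK.

```python
import queue
import threading

work_queue = queue.Queue()   # stands in for an Azure Storage Queue
results = []


def web_role(image_names):
    """Accept requests quickly and hand the slow part to the worker."""
    for name in image_names:
        work_queue.put(name)


def worker_role():
    """Long running background loop with no UI."""
    while True:
        name = work_queue.get()
        if name is None:          # sentinel: shut the worker down
            break
        results.append(f"thumbnail-of-{name}")


worker = threading.Thread(target=worker_role)
worker.start()
web_role(["a.png", "b.png"])
work_queue.put(None)
worker.join()
print(results)
```

The two roles never call each other directly. They share only the queue, which is what lets each of them run on a different number of instances.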

Figure 3. Compute & Storage Service

So, we just discussed that Windows Azure offers a Compute and a Storage service. The Compute service is consumed by deploying a web application in a Web Role and a long running process in a Worker Role. Storage can be consumed through Blobs, Tables and Queues.

AppFabric


Windows Azure platform AppFabric was earlier called .NET Services. This service enables seamless integration of services that run within an organization behind a firewall with services that are hosted on the Cloud. It forms a secure bridge between legacy applications and Cloud services. AppFabric also brings federated identity to Cloud based applications. The two key components of AppFabric are 1) Service Bus and 2) Access Control.

AppFabric connecting on-premise to the Cloud

Service Bus provides secure connectivity between on-premise and Cloud services. It can be used to register, discover and consume services irrespective of their location. Services hosted behind firewalls and NAT can be registered with the Service Bus and then invoked by Cloud services. The Service Bus abstracts the physical location of a service by providing a URI that can be invoked by any potential consumer.

Access Control is a mechanism to secure your Cloud services and applications. It provides a declarative way of defining rules and claims through which callers can gain access to Cloud services. Access Control rules can be easily and flexibly configured to cover a variety of security needs and different identity-management infrastructures. Access Control enables enterprises to integrate their on-premise security mechanisms, like Active Directory, with Cloud based authentication. Developers can program Access Control through simple WCF based services.
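The declarative flavor of rules and claims can be illustrated with a small Python sketch. It assumes a very simple model in which each rule maps one incoming claim to one granted claim; the rule format, group names and actions are invented for the example and do not reflect the actual Access Control configuration or API.

```python
# Hypothetical rule set: each rule maps an incoming claim to an output claim.
RULES = [
    {"if": ("group", "Partners"), "then": ("action", "read-inventory")},
    {"if": ("group", "Employees"), "then": ("action", "read-inventory")},
    {"if": ("group", "Employees"), "then": ("action", "place-order")},
]


def issue_claims(input_claims, rules=RULES):
    """Return the output claims granted by the rules for these input claims."""
    presented = set(input_claims)
    return {rule["then"] for rule in rules if rule["if"] in presented}


def is_allowed(input_claims, action):
    """A caller may perform an action only if a rule grants that claim."""
    return ("action", action) in issue_claims(input_claims)


partner = [("group", "Partners")]
employee = [("group", "Employees")]
print(is_allowed(partner, "read-inventory"))
print(is_allowed(partner, "place-order"))
print(is_allowed(employee, "place-order"))
```

Because the rules are data rather than code, access can be changed by editing the rule set without touching the service being protected.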


SQL Azure

SQL Azure is Microsoft SQL Server on the Cloud. Unlike Azure Storage, which is meant for unstructured data, SQL Azure is a full-blown relational database engine. It is based on the same DB engine as MS SQL Server and can be queried with T-SQL. Because of its fidelity with MS SQL Server, on-premise applications can quickly start consuming this service. Developers can talk to SQL Azure using the ADO.NET or ODBC APIs. PHP developers can consume it through the native PHP API. Through Microsoft SQL Azure Data Sync, data can be easily synchronized between an on-premise SQL Server and SQL Azure. This is a very powerful feature for building hubs of data on the Cloud that always stay in sync with your local databases. For all practical purposes, SQL Azure can be treated exactly like a DB server running in your data center, without the overhead of your teams maintaining and managing it. Because Microsoft is responsible for the installation, maintenance and availability of the DB service, businesses can focus solely on manipulating and accessing data as a service. With the pay-as-you-go approach, there is no upfront investment and you pay only for what you use.

SQL Azure

Microsoft Codenamed Dallas

This service is an exchange set up by Microsoft for parties that can publish useful data or content and parties that can consume this data in their applications. For example, a public sector agency can publish interesting and useful census data, and a company in the healthcare business might be looking for exactly this data. This company can search for and discover the census dataset and pay for what it consumes. Data can be published in a variety of forms, including BLOBs, CSV files, spreadsheets and RSS feeds. The datasets published on Dallas can be consumed directly through tools like Microsoft Excel or integrated into custom applications by calling the REST based API. So, this turns into a marketplace of data publishers and data consumers. Through an add-in for Excel 2010 called PowerPivot (previously known as Project Gemini), end users can directly consume the data in Microsoft Excel.

Figure 5. Microsoft Codename Dallas service

To summarize what we just discussed, Windows Azure Platform Services consist of Windows Azure, AppFabric, SQL Azure and a project codenamed Dallas.

Scenarios for Microsoft Windows Azure Platform


Scalable Web Application

Because the Windows Azure Platform is based on the familiar .NET platform, ASP.NET developers can design and develop web applications on fairly inexpensive machines and then deploy them on Azure. This empowers developers to instantly scale their web apps without worrying about the cost and the complexity of infrastructure needs. Even PHP developers can enjoy the elasticity and pay-by-use attributes of the platform.

Compute Intensive Application

The Windows Azure Platform can be used to run process intensive applications that occasionally need high end computing resources. By leveraging the Worker Role, developers can move code to the Cloud that runs across multiple instances in parallel. The data generated by either a Web Role or on-premise applications can be fed to the Worker Roles through Azure Storage.

Centralized Data Access

When data has to be made accessible to a variety of applications running across the browser, desktop and mobile, it makes sense to store it in a central location. Azure Cloud based storage can be a great solution for persisting and maintaining data that can be easily consumed by desktop applications, Silverlight, Flash and AJAX based web applications, or mobile applications. With the Pay-as-you-grow model, there is no upfront investment and you pay only for what you use.

Hybrid Applications (Cloud + On-Premise)

There may be a requirement for extending a part of an application to the Cloud or building a Cloud façade for an existing application. By utilizing AppFabric services like Service Bus and Access Control, on-premise applications can be seamlessly and securely extended to the Cloud. AppFabric can thus enable the Hybrid Cloud scenario.

Cloud Based Data Hub

Through SQL Azure, companies can securely build data hubs that are open to trading partners and mobile employees. For example, the inventory of a manufacturing company can be securely hosted on the Cloud and kept in sync with the local inventory database. The Cloud based DB can then be opened up for B2B partners to directly query and place orders. SQL Azure and SQL Azure Data Sync together enable such scenarios.


Chapter 6 Google App Engine


Google App Engine (GAE) is a platform to deploy and run web applications on Google's infrastructure. It comes with a dynamic web server with full support for common web technologies. It offers a transactional data store for persisting data. Developers can integrate their web applications with Google Accounts through the APIs. The biggest advantage of running web applications on GAE is the scalability that it offers. Your web application will be as scalable as some of the popular Google services like search.

Your web app running along with Google properties

Google App Engine currently supports Python and Java environments. Java developers can deploy and run JSPs and Servlets, while Python developers can use the standard library. The applications running on GAE live in a sandbox that provides multi-tenancy and isolation across applications. Because of this sandbox, not all operations are possible. For example, opening and listening on sockets is disabled.


Components of Google App Engine

The next logical layer is a set of APIs and services to support web application developers. This layer has a persistent Datastore, User Authentication services, Task Scheduler and Task Queue, URL Fetch, a Mail component, MemCache and Image Manipulation. All these services are exposed through native API bindings. For example, Java developers can use JDO/JPA to talk to the datastore. Let's take a closer look at some of the services provided by GAE.

Java Runtime

GAE is based on the Java 6 VM and a Servlet 2.5 container. The datastore can be accessed through the JDO/JPA API. It supports JSR 107 for the MemCache API. Mail can be accessed through the javax.mail API. java.net.URLConnection provides access to the URLFetch service. Apart from the core Java language, other JVM-based languages like JRuby and Scala can also be used.

Python Runtime

GAE comes with a rich set of APIs and tools for developing web applications based on Python. It supports Python 2.5.2, and Python 3 is being considered for future releases. You can also take advantage of a wide variety of mature libraries and frameworks for Python web application development, such as Django. The Python environment provides rich Python APIs for the datastore, Google Accounts, URL fetch, and email services. App Engine also provides a simple Python web application framework called webapp to make it easy to start building applications.


Datastore

App Engine comes with a very powerful data store that can scale dynamically. It also features a query engine and support for transactions. The datastore is different from traditional relational databases. The objects stored in the datastore are called Entities, and they are schemaless. These entities have a set of properties that can be queried using a SQL-like grammar called GQL (Google Query Language). The datastore is strongly consistent and supports optimistic concurrency control.

User Authentication

One of the advantages of using GAE is its integration with Google Accounts. This empowers developers to leverage Google's secure authentication engine for their custom applications. While a user is signed in to the application, the app can access the user's email address as well as a unique user ID. The app can also detect whether the current user is an administrator, making it easy to implement admin-only areas of the app.
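The optimistic concurrency control mentioned for the datastore can be illustrated with a toy versioned store in Python. The idea is that a write succeeds only if the entity has not changed since it was read; otherwise the writer re-reads and retries. The class, method names and the version counter are assumptions made for this sketch and are not the App Engine datastore API.

```python
class ConflictError(Exception):
    """Raised when an entity changed since it was read."""


class VersionedStore:
    def __init__(self):
        self._entities = {}   # key -> (version, properties)

    def read(self, key):
        version, props = self._entities.get(key, (0, {}))
        return version, dict(props)

    def write(self, key, expected_version, props):
        current_version, _ = self._entities.get(key, (0, {}))
        # Reject the write if someone else updated the entity in between.
        if current_version != expected_version:
            raise ConflictError(f"{key} is at version {current_version}")
        self._entities[key] = (current_version + 1, dict(props))
        return current_version + 1


store = VersionedStore()
version_a, _ = store.read("counter")      # writer A reads
version_b, _ = store.read("counter")      # writer B reads the same version
store.write("counter", version_a, {"hits": 1})   # A wins

try:
    store.write("counter", version_b, {"hits": 1})   # B is now stale
except ConflictError:
    # B re-reads the latest state and retries on top of it.
    version_b, props = store.read("counter")
    store.write("counter", version_b, {"hits": props["hits"] + 1})

print(store.read("counter"))
```

No locks are held while the writers work, which suits a store that has to scale across many servers; the cost is that a writer must be prepared to retry.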

Google Accounts integration with App Engine

URL Fetch

This service fetches external web pages using the same high-bandwidth infrastructure that many other Google applications use.

Mail

This service enables developers to programmatically send email messages from custom web applications.

MemCache

The Memcache service provides applications with a high performance in-memory key-value cache that is accessible by multiple instances of the application. Memcache is useful for data that does not need the persistence and transactional features of the datastore, such as temporary data or data copied from the datastore to the cache for high speed access.
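Copying data from the datastore to the cache for high speed access usually follows a simple pattern: look in the cache first and fall back to the datastore only on a miss. The Python sketch below shows that pattern with a plain dictionary standing in for Memcache and a made-up lookup function standing in for the datastore; it does not use the App Engine APIs.

```python
cache = {}                       # stands in for the Memcache service
stats = {"datastore_reads": 0}   # counts how often the slow path is taken


def load_from_datastore(key):
    """Pretend datastore lookup (hypothetical; always succeeds)."""
    stats["datastore_reads"] += 1
    return f"profile-for-{key}"


def get_profile(key):
    """Read through the cache: try it first, fall back to the datastore."""
    value = cache.get(key)
    if value is None:
        value = load_from_datastore(key)
        cache[key] = value       # keep a copy for the next request
    return value


first = get_profile("alice")     # miss: goes to the datastore
second = get_profile("alice")    # hit: served from the cache
print(first, second, stats["datastore_reads"])
```

Since cached data can disappear at any time, the application must always be able to rebuild it from the datastore, as the fallback branch does here.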

Image Manipulation

Through this service, developers can manipulate images. With this API, you can resize, crop, rotate and flip images in JPEG and PNG formats.

Scheduled Tasks

Scheduled Tasks are also called cron jobs. Other than running interactive web applications, GAE can also schedule tasks that are invoked at a specific time.

To get started on Google App Engine, download the Eclipse plug-in and the SDK. The SDK emulates the GAE environment locally and enables you to design, develop and test applications on your machine before finally deploying them on GAE.

About the author

Janakiram MSV is the Web Services Evangelist at Amazon Web Services (AWS) in India. He helps developers, IT professionals, customers and partners understand the value of the AWS cloud platform and use its service offerings. His responsibilities include enabling ISVs, SIs and enterprises with the skills to move their IT infrastructure and applications to the Cloud. In his previous role, he was the Technology Architect Cloud at Microsoft Corporation, where he was responsible for architecting enterprise solutions on the Microsoft Cloud Computing platform. Janakiram is a Microsoft Certified Professional on the Windows Azure Platform. He worked for Microsoft Corporation for over 10 years, during which he played various roles that involved selling, marketing and evangelizing the Microsoft Application Platform and Tools to customers and partners in India. He has been a regular speaker at events like Microsoft TechEd, Microsoft Developer Days, Great Indian Developer Summit, JAX India and Foss.in. Janakiram's blog can be found at http://www.janakiramm.net

