
Cloud Computing

Computing is the process of using computer technology to complete a given goal-oriented task.

What is Cloud Computing?

Cloud computing is the delivery of computing resources (such as storage, processing power, and software) over the internet. Major providers like Google Cloud Platform (GCP), Amazon Web Services (AWS), and Microsoft Azure manage these resources for users, who can then access services such as data storage and web-based software without owning physical hardware.

History of Cloud Computing

• 1960s: John McCarthy envisioned computing as a utility.
• 1999: Salesforce offered the first cloud-based service (SaaS) for customer data management.
• 2002: Amazon created AWS, initially for internal use and later as a commercial service.
• 2006: Google's CEO, Eric Schmidt, popularized the term "cloud computing".
• 2008: Microsoft announced Azure (generally available from 2010), expanding the cloud service market. Cloud computing grew rapidly and is now integral to modern businesses.

Why Cloud Computing?

• Flexibility and Security: Easily adaptable and secure.


• Increased Collaboration: Simplifies teamwork and remote work.
• Cost Savings: Reduces expenses on hardware and software.
• Quality Control: Ensures consistent performance.
• Disaster Recovery: Protects data from loss.
• Competitiveness: Keeps businesses up-to-date with technology.
• Sustainability: More environmentally friendly than traditional setups.

Characteristics of Cloud Computing

• Pay-as-you-go: Pay only for what you use.


• On-demand self-service: Access resources as needed without human intervention.
• High Scalability and Elasticity: Easily scale resources up or down.
• Statistics Generation: Monitor and manage resource usage effectively.
• Resiliency and Availability: High uptime and reliability.
• Resource Pooling: Shared resources across multiple users.
• Security: Advanced security measures by cloud providers.
• Broad Network Access: Accessible from anywhere with an internet connection.

Advantages of Cloud Computing

• Work from Anywhere: Access data and services from any device.
• Cost Efficiency: Lower setup and maintenance costs.
• Backup and Recovery: Simplified data backup and recovery.
• Database Security: Enhanced protection for stored data.

Disadvantages of Cloud Computing

• Security Risks: Potential vulnerabilities in cloud resources.


• Costs: Can become expensive with increased usage.
• Data Persistence: Ensuring data is fully deleted can be difficult.
Applications of Cloud Computing

• Data Storage and Backup: Reliable and cost-effective data storage.


• Business Applications: Used for resource planning, CRM, and more.
• Education: Online courses and virtual classrooms.
• Entertainment: Streaming music, videos, and gaming.
• Management: Project, HR, and inventory management.
• Social media: Platforms like Facebook and Twitter rely on cloud computing for data processing and
storage.

Introduction to Cloud Computing

Cloud computing is the delivery of computing resources like storage and software over the internet,
managed by providers such as GCP, AWS, and Azure.

History of Cloud Computing

• Client-Server Model: Central server holds data; users connect to access it.
• Distributed Computing Model: Multiple computers share resources via networking.
• Cloud Computing: Emerged in the early 2000s with AWS (2002), Google App Engine (2008), and
Microsoft Azure (2010). Cloud computing improved efficiency and flexibility.

Characteristics of Cloud Computing

• On-Demand Resources: Users can deploy and manage resources as needed.


• High Scalability: Resources can quickly expand or contract based on demand.
• Resource Pooling: Shared resources are dynamically allocated.
• Statistics Generation: Usage monitoring for effective management and billing.

Types of Cloud Computing

• Public Cloud: Accessible to everyone, cost-effective, managed by providers.


• Private Cloud: Used by a single organization, more secure, self-managed or by third-party.
• Community Cloud: Shared by multiple organizations with common goals.
• Hybrid Cloud: Combination of public and private clouds, balancing security and flexibility.

Types of Cloud Services

• Infrastructure as a Service (IaaS): Provides basic computing resources like virtual machines and
storage.
• Platform as a Service (PaaS): Offers hardware and software tools over the internet for application
development.
• Software as a Service (SaaS): Delivers software applications over the internet without needing
installation.
• Anything as a Service (XaaS): Encompasses a variety of services delivered online, paid for based
on usage.

Advantages of Cloud Computing

• Accessibility: Access data from anywhere at any time.


• Cost Savings: Reduces the need for expensive hardware and software.
• Backup and Security: Simplifies data backup and enhances security.

Disadvantages of Cloud Computing


• Security Risks: Potential vulnerabilities due to shared resources.
• Cost Management: Costs can rise with increased usage.
• Data Persistence: Ensuring complete data deletion can be challenging.

Conclusion

Cloud computing offers flexible, scalable, and cost-effective computing resources. It includes various types
like public, private, and hybrid clouds, and services such as IaaS, PaaS, SaaS, and XaaS. While it provides
significant benefits like accessibility and cost savings, it also presents challenges like security risks and cost
management.

Limitations of Traditional Computing Approaches

Traditional computing has been integral to business and daily life, but it has several limitations. Initially,
businesses relied on manual processes like pen, paper, and fax machines. Over time, computers replaced
these methods, making business operations more efficient.

However, traditional computing presents several challenges:

1. High Cost and Complexity: Setting up and maintaining in-house IT infrastructure is expensive and
complex. It involves significant investment in hardware, software, and skilled personnel.
2. Time-Consuming: The process of setting up infrastructure can take weeks or months.
3. Maintenance Burden: Regular maintenance, updates, and security measures are continuous
burdens.
4. Scalability Issues: Scaling up or down based on demand is difficult and often requires additional
investment.
5. Limited Access and Flexibility: Traditional systems are not easily accessible remotely, limiting
flexibility and mobility.
6. Resource Management: Organizations often face challenges in managing hardware and software
resources efficiently.

Is There Any Solution to These Worries?

When needing IT infrastructure, organizations can either manage it themselves or outsource to third parties.
However, both options come with their own set of issues, such as managing hardware and software, which
may not be the organization's core competency.

Traditional outsourcing doesn't solve all problems because different users have different requirements. For
instance, application developers need a different setup than end-users. Therefore, traditional outsourcing
can't fully address the complexities and concerns of all users.

Three Layers of Computing

Computing facilities can be divided into three layers:

1. Infrastructure: The hardware components like processors, memory, and storage devices. It requires
basic amenities like power and cooling.
2. Platform: The combination of hardware and software, including operating systems and runtime
libraries, where applications run. It is also where software development occurs.
3. Application: The software applications accessed by end-users for various tasks like document
editing, gaming, or business operations.

Three Layers in Traditional Computing


In traditional computing, the boundaries between these layers are blurred, causing end-users and developers
to manage underlying layers, which isn't their primary focus.

1. Infrastructure: Setting up infrastructure is complex, time-consuming, and requires significant investment and maintenance.
2. Platform: Building and managing platforms require expertise and can delay actual tasks. Licensing
and updates are additional burdens.
3. Application: Users often need to manage underlying layers to use applications, leading to additional
difficulties and costs.

The End of Traditional Computing

Technological advancements have led to the development of cloud computing, which offers computing
resources as services, much like utilities. This model removes the burden of managing infrastructure,
platforms, and applications, allowing users to focus on their core tasks.

Cloud Computing as a Utility Service

Cloud computing provides three main services:

1. Infrastructure Service: Ready-made hardware resources.


2. Platform Service: Pre-configured platforms for development and application hosting.
3. Application Service: Software applications accessible via the cloud.

This model offers flexibility, cost-efficiency, and better service quality by leveraging the economies of scale
of large cloud vendors. Users only need a basic device and internet connection to access these services,
reducing the demands on local hardware.

Concerns

While cloud computing raises concerns about reliability and data security, it generally offers better or
comparable safeguards than traditional computing. The key advantage is the significant reduction in cost and
complexity, along with improved service flexibility and quality.

Benefits of Cloud Computing:
Cloud computing has transformed the scope of computing, offering it as an on-demand utility service. This
model provides several benefits that influence its adoption over traditional computing methods:

1. Less Acquisition/Purchase Cost:


o Initial investment in hardware and software is significantly reduced as users only need basic
client systems.
2. Reduced Operational Cost:
o Costs of system administration, maintenance, and 24/7 energy support are shifted to the
provider.
3. Reduced System Management Responsibility:
o Cloud vendors handle infrastructure and system management, freeing users to focus on their
core tasks.
4. Use-basis Payment Facility:
o Users are charged based on usage time or volume, reducing overall computing costs.
5. Unlimited Computing Power and Storage:
o Access to powerful computing resources and virtually unlimited storage as needed.
6. Quality of Service:
o High-quality services are provided by expert vendors, ensuring reliability and efficiency.
7. Reliability:
o Cloud vendors offer robust load balancing, backup, and recovery, ensuring consistent service.
8. Continuous Availability:
o Cloud services have high uptime, typically guaranteed at 99.9%, ensuring continuous
availability.
9. Locational Independence/Convenience of Access:
o Accessible anywhere with an Internet connection, using various devices like PCs, tablets, and
smartphones.
10. High Resiliency:
o Cloud infrastructure is resilient to attacks and faults, with redundancy and effective recovery
mechanisms.
11. Quick Deployment:
o Rapid and automatic resource provisioning allows for quick system or application
deployment.
12. Automatic Software Updates:
o Software updates are managed by the vendor, ensuring users always have access to the latest
versions.
13. No License Procurement:
o Users do not need to purchase software licenses; instead, they pay for usage as needed.
14. Safety against Disaster:
o Robust recovery systems protect against technical failures or natural disasters, safeguarding
data.
15. Environment Friendly:
o Efficient resource utilization reduces e-waste and carbon footprint, promoting green
computing.

These benefits make cloud computing a cost-effective, flexible, and reliable alternative to traditional
computing methods.

Challenges of Cloud Computing:
While cloud computing offers many benefits, it also presents several challenges. Efforts are ongoing to
address these issues:

1. Limited Portability between Cloud Providers:


o Moving applications and data between different cloud providers can be difficult, leading to
vendor lock-in.
2. Inter-operability Problem:
o Different cloud platforms may not work seamlessly together, complicating integration and
data exchange.
3. Data Security:
o Ensuring data privacy and protection in the cloud is critical, as data is stored off-site and
accessed over the internet.
4. Reduced Control over Governance:
o Users have less control over the infrastructure and data management, which can be an issue
for compliance and oversight.
5. Multi-Regional Compliance and Legal Issues:
o Cloud services operating across different regions must comply with varying legal and
regulatory requirements, complicating compliance.
6. Bandwidth Cost:
o High bandwidth costs can arise from the large volume of data transfer between users and
cloud services.

These challenges highlight areas where cloud computing needs to improve, but ongoing efforts by cloud
vendors aim to mitigate these issues.

Types of Cloud Computing:
Cloud computing provides computing resources as a service, with various deployment models like public,
private, hybrid, and community clouds. These models differ based on implementation, hosting, and
accessibility. Here are the main types:

1. Public Cloud:
o Operated by third-party providers offering services over the internet with pay-as-you-go
billing.
o Examples: Microsoft Azure, Google App Engine, Amazon EC2.
o Benefits: Cost-effective, scalable, ideal for small businesses.
o Drawback: Shared resources may lead to security concerns.
2. Private Cloud:
o Runs on private infrastructure, offering dedicated resources to a single organization.
o Examples: HP Data Centers, Elastic-Private Cloud.
o Benefits: High security and control, customizable.
o Drawback: High cost, requires skilled management.
3. Hybrid Cloud:
o Combines public and private clouds, allowing data and applications to be shared between
them.
o Benefits: Scalable, cost-effective, secure, and can handle peak loads.
o Drawback: Complex to manage.
4. Community Cloud:
o Shared by several organizations with similar needs, managed internally or by a third party.
o Benefits: Cost-sharing, higher security than public clouds.
o Drawback: Limited storage and bandwidth, not suitable for all businesses.
5. Multi-cloud:
o Uses multiple cloud services from different providers.
o Benefits: Prevents vendor lock-in, enhances reliability, meets specific business and
application needs.
o Drawback: Can be complex to manage and integrate.

Cloud Computing Service:


Cloud computing offers three primary service models: Infrastructure-as-a-Service (IaaS), Platform-as-a-
Service (PaaS), and Software-as-a-Service (SaaS). These models provide computing resources over the
internet, allowing consumers to use them as needed without managing the underlying infrastructure.

1. Infrastructure-as-a-Service (IaaS):
o Provides virtualized hardware resources such as virtual processors, memory, storage, and
networks.
o Consumers can build virtual machines and other infrastructure components.
o Examples: Amazon EC2, Google Compute Engine.
o Benefits: No need to manage physical hardware, scalable, cost-effective.
2. Platform-as-a-Service (PaaS):
o Offers a platform for developing, testing, and deploying applications.
o Includes infrastructure plus middleware, development tools, and runtime environments.
o Examples: Google App Engine, Microsoft Azure.
o Benefits: Simplifies application development, reduces the cost of ownership, supports
collaborative work.
3. Software-as-a-Service (SaaS):
o Delivers software applications over the internet, accessible via web browsers.
o Hosted and maintained by the service provider, including updates and data management.
o Examples: Salesforce CRM, Google Apps, Microsoft Office 365.
o Benefits: No need for software installation or maintenance, accessible from anywhere, cost-
effective subscription model.

These models collectively enable users to focus on their core tasks while leveraging scalable and efficient
cloud resources.
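
As a concrete illustration of the IaaS model described above, the short Python sketch below provisions a virtual server on Amazon EC2 using the boto3 SDK. It is a minimal, hedged example: the AMI ID, key pair name, and region are hypothetical placeholders, and a real deployment would also configure networking and security groups.

import boto3

# Minimal IaaS sketch: launch one small virtual machine on Amazon EC2.
# The AMI ID, key pair, and region below are hypothetical placeholders.
ec2 = boto3.resource("ec2", region_name="us-east-1")
instances = ec2.create_instances(
    ImageId="ami-0123456789abcdef0",   # hypothetical machine image
    InstanceType="t2.micro",           # small pay-as-you-go instance size
    KeyName="my-key-pair",             # hypothetical SSH key pair
    MinCount=1,
    MaxCount=1,
)
print("Launched instance:", instances[0].id)

Under PaaS and SaaS the same need is met at a higher level: the provider manages the runtime or the whole application, so the consumer writes no equivalent provisioning code.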

Comparison of the traditional system model and the cloud system model:

• Ownership of Infrastructure
  o Traditional: Organizations own and maintain the computing infrastructure.
  o Cloud: Cloud service providers own and maintain the infrastructure.
• Deployment Location
  o Traditional: On-premises.
  o Cloud: Off-premises (managed by CSPs).
• Expenditure
  o Traditional: Upfront capital expenditure.
  o Cloud: Operational expenditure (subscription/pay-per-use).
• Scalability
  o Traditional: Limited scalability; requires additional hardware procurement.
  o Cloud: Elastic scalability; resources scale dynamically based on demand.
• Resource Management
  o Traditional: Organizations manage and maintain computing resources.
  o Cloud: Cloud service providers handle infrastructure management.
• Work Environment
  o Traditional: Fixed work environment, often within the organization's network.
  o Cloud: Anywhere access via the internet; flexible work environments.

This table highlights the key differences between the traditional system model and the cloud system model
in terms of ownership, deployment, expenditure, scalability, resource management, and work environment.

OTHER CATEGORY OF CLOUD SERVICES:


1. Security Management-as-a-Service: Provides cloud-based security solutions, such as threat
detection, vulnerability scanning, and incident response.
2. Identity Management-as-a-Service (IDaaS): Offers cloud-based identity and access management
solutions, including single sign-on, multi-factor authentication.
3. Storage-as-a-Service: Delivers cloud-based storage solutions, allowing users to store and access
data over the internet.
4. Database-as-a-Service: Provides cloud-based database solutions, offering scalable databases
without the need for infrastructure management.
5. Backup-as-a-Service (BaaS): Offers cloud-based backup and recovery solutions, allowing users to
backup their data and recover it when needed.
6. Compliance-as-a-Service: Provides cloud-based compliance solutions, helping organizations meet
regulatory requirements and standards.
7. Desktop-as-a-Service: Delivers cloud-based virtual desktop solutions, enabling users to access their
desktop environment remotely.
8. Monitoring-as-a-Service: Offers cloud-based monitoring solutions, allowing users to monitor their
IT infrastructure and applications.

These categories extend the range of cloud services beyond infrastructure, platform, and software, providing
specialized solutions for various business needs such as security, identity management, storage, database
management, backup, compliance, desktop virtualization, and monitoring.

OPEN CLOUD SERVICES:

• Eucalyptus: A cloud computing platform that enables users to build private and hybrid clouds compatible
with Amazon Web Services (AWS) APIs.

• OpenNebula: An open-source cloud computing toolkit for managing virtualized data centers and private
cloud infrastructure.

• Nebula: Developed by NASA, Nebula is an open-source cloud computing platform designed for scientific
research and high-performance computing.

• Nimbus: An open-source toolkit for building Infrastructure-as-a-Service (IaaS) clouds, primarily focused
on providing cloud computing capabilities for scientific and academic research.

• OpenStack: A widely-used open-source cloud computing platform for building public and private clouds,
offering infrastructure and platform services.

• Apache VCL (Virtual Computing Lab): An open-source cloud computing platform designed for
managing and provisioning virtual machines in educational environments.

• Apache CloudStack: An open-source cloud computing platform for building and managing public,
private, and hybrid clouds.

• Enomaly ECP (Elastic Computing Platform): An open-source cloud computing platform for building
private and public clouds, offering infrastructure services with features like auto-scaling and resource
scheduling.

Resource Virtualization:
Overview: Virtualization in cloud computing creates virtual versions of computing resources like servers,
storage, and networks. It allows multiple virtual instances to run on a single physical infrastructure,
enhancing resource utilization, scalability, and flexibility. Virtualization optimizes hardware usage and
reduces costs, making it a cornerstone of cloud computing.

Work of Virtualization in Cloud Computing: Virtualization enables users to share infrastructure and
reduce costs in cloud computing. It allows outsourcing IT maintenance, streamlining operations, and
optimizing expenses, thus becoming a cost-saving advantage for cloud users.

Benefits of Virtualization:

1. Better Performance: Utilizes CPU resources more efficiently, leading to improved performance.
2. Enhanced Security: Virtual machines are logically separated, enhancing security and availability.
3. Cost Reduction: Enables running multiple virtual machines on the same hardware, reducing
operational costs.
4. Reliability: Offers better reliability and disaster recovery compared to traditional systems.
5. Environmentally Friendly: Reduces physical resource consumption, making it more
environmentally friendly.

Drawback of Virtualization:

1. Significant Initial Investment: Transitioning to cloud infrastructure requires substantial upfront investment.
2. Learning Curve: Requires skilled personnel for seamless integration and operation.
3. Data Security Concerns: Hosting data on third-party platforms introduces potential security risks.

Characteristics of Virtualization:

1. Managed Execution: Allows controlled execution of programs on various computer environments.


2. Sharing: Enables sharing of computing environments to reduce hardware requirements.
3. Aggregation: Consolidates separate hosts into one single host, known as aggregation of resources.
4. Emulation: Allows emulation of different programs/devices on host devices.
5. Isolation: Ensures virtual machine or guest application is separate from the host machine.
6. Portability: Enables running virtual images on different hosts without recompilation.

Types of Virtualization:

1. Application Virtualization: Provides remote access to applications from a server.


2. Network Virtualization: Combines available resources by separating bandwidth into different
channels.
3. Desktop Virtualization: Allows remote access to operating systems from a remote server.
4. Storage Virtualization: Stores data on multiple servers controlled by virtual storage systems.
5. Server Virtualization: Restructures one central server into multiple smaller virtual servers.
6. Data Virtualization: Formats data logically for access by stakeholders without revealing
background processes.

Uses of Virtualization:

1. Data Integration
2. Business Integration
3. Service-Oriented Architecture (SOA) Data Services
4. Searching Organizational Data

Conclusion: Virtualization in cloud computing enhances resource pooling, cost savings, and environmental
friendliness. It offers various benefits like better performance, enhanced security, and reliability. Different
types of virtualization cater to diverse needs, ensuring scalability, reliability, and accessibility of data.

Resource Pooling, Sharing and Provisioning:


Resource pooling in cloud computing involves combining multiple pools of computing resources like
servers, storage, and networks to efficiently allocate resources to consumers. It replaces traditional silos of
resources with interconnected pools to meet consumer needs without revealing the actual resource locations.

Resource Pooling Architecture:

Design: A resource pooling architecture combines identical computing resources into pools and ensures
synchronized allocation.

Categories of Resources: Computing resources are categorized into computer/server, network, and
storage, with a focus on processors, memory, network devices, and storage.

1. Computer or Server Pool:

• Setup: Physical servers are grouped into pools and installed with operating systems and system
software.
• Virtualization: Virtual machines are created on these servers, and physical processor and memory
components are linked to increase server capacity.

2. Storage Pool:

• Setup: Storage disks are configured with proper partitioning and formatting and provided to
consumers in a virtualized mode.

3. Network Pool:

• Setup: Networking components like switches and routers are pre-configured and delivered in
virtualized mode to consumers to build their own networks.

4. Hierarchical Organization:

• Structure: Data centers organize separate resource pools for server, processor, memory, storage, and
network components.
• Hierarchy: Hierarchical structures establish parent and child relationships among pools to handle
complexity and ensure fault tolerance.

Benefits:

1. Efficiency: Efficient allocation of resources based on consumer demand.


2. Scalability: Ability to scale resources up or down based on requirements.
3. Fault Tolerance: Eliminates single points of failure with nested sub-pool architecture.

Conclusion: Resource pooling in cloud computing enables efficient allocation and management of
computing resources. By grouping resources into interconnected pools and organizing them hierarchically,
cloud providers ensure scalability, fault tolerance, and efficient resource utilization.

Resource sharing in cloud computing increases resource utilization rates by allowing multiple
applications to use pooled resources. It involves distributing pooled and virtualized resources among
applications, users, and servers, requiring appropriate architectural support.

Resource Sharing Challenges:

1. Quality of Service (QoS): Maintaining performance isolation is crucial for ensuring QoS as multiple
applications compete for the same resources.
2. Predictability: Predicting response and turnaround time becomes difficult due to resource sharing,
necessitating optimized resource management strategies.

Multi-tenancy:

• Definition: Multi-tenancy allows a resource component to serve different consumers while keeping
them isolated from each other.
• Implementation: In cloud computing, multi-tenancy is realized through ownership-free sharing of
resources in virtualized mode and temporary allocation from resource pools.

Types of Tenancy:

• Public Cloud: Co-tenants are mutually exclusive and share the same computing infrastructure.
• Community Cloud: Co-tenants belong to the same community and share similar interests.
• Private Cloud: Tenancy is limited within sub-co-tenants internal to a single organization.

Tenancy at Different Levels of Cloud Services:

• Infrastructure as a Service (IaaS): Shared computing infrastructure resources like servers and
storage.
• Platform as a Service (PaaS): Sharing of operating system by multiple applications.
• Software as a Service (SaaS): Single application instance and/or database instance serving multiple
consumers.

Benefits of Multi-tenancy:

1. Cost Reduction: Increases resource utilization rate, reducing investment.


2. Efficiency: Eases application maintenance and improves utilization of resources.

Conclusion: Resource sharing and multi-tenancy in cloud computing enable efficient resource utilization
and cost reduction by distributing pooled resources among applications and users while maintaining
performance isolation and ensuring QoS.


RESOURCE PROVISIONING:
In traditional computing, setting up new servers or virtual servers is time-consuming. Cloud computing, with
its virtualization and IaaS model, enables rapid provisioning of resources, often in just minutes, if the
required resources are available. This is a significant advantage of cloud computing, allowing users to create
virtual servers through self-service interfaces.

Flexible Provisioning in Cloud Computing: Flexible resource provisioning is essential in cloud computing
to meet varying demands. Orchestration of resources must be intelligent and rapid to provision resources to
applications dynamically.

Autonomic Resource Provisioning: In cloud computing, resource provisioning is automated through artificial intelligence, known as autonomic resource provisioning. The aim is to efficiently manage resource demand by allocating one resource to multiple applications as needed.

Role of SLA: Service level agreements (SLAs) between consumers and cloud providers help estimate
resource requirements. Cloud providers plan resources based on SLAs to dynamically allocate physical
resources to virtual machines (VMs) running end-user applications.
Resource Provisioning Approaches: Resource provisioning in cloud computing is enabled through VM
provisioning. Two approaches are static and dynamic provisioning. Static provisioning allocates resources
once during VM creation, while dynamic provisioning adjusts resource capacity based on workload
fluctuations.

Hybrid Approach: A hybrid provisioning approach combines static and dynamic provisioning to address
real-time scenarios with changing load in cloud computing.

Resource Under-provisioning and Over-provisioning: Traditional computing often faces resource under-
provisioning or over-provisioning issues due to fixed-size resource allocation. Cloud computing mitigates
these issues by employing dynamic or hybrid resource provisioning approaches, ensuring high performance
at lower costs.

VM Sizing: VM sizing ensures the allocated resources match the workload. It can be done on a VM-by-VM
basis or jointly, allowing unused resources from less loaded VMs to be allocated to others.

Dynamic Provisioning and Fault Tolerance: Dynamic resource provisioning in cloud computing enhances
fault tolerance by replacing faulty nodes with new ones. This leads to a zero-downtime architecture,
ensuring continuous operation even during hardware failures.

Zero Downtime Architecture: Dynamic VM provisioning enables zero-downtime architecture by swiftly migrating virtual servers to new hosts in case of physical server failures.

In essence, resource provisioning in cloud computing is characterized by its flexibility, automation, and
ability to ensure high performance while optimizing costs.

Case Study:
The following case study illustrates the use of Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS) in a real-world scenario.

Case Study: Cloud Computing Adoption in a Startup

Background: A tech startup, XYZ Innovations, has developed a new social media analytics tool aimed at
helping businesses analyze and optimize their social media marketing strategies. The company is growing
rapidly and needs to scale its infrastructure to handle increasing demand from customers.

IaaS Implementation: To meet their infrastructure needs, XYZ Innovations decides to adopt IaaS. They
choose a popular cloud provider and provision virtual servers to host their application. With IaaS, they have
full control over the virtual machines, allowing them to install and configure the necessary software stack
for their social media analytics tool. They can also scale their infrastructure up or down based on demand,
ensuring they only pay for the resources they use.

PaaS Implementation: As XYZ Innovations continues to grow, they realize that managing the entire
software stack on their own virtual machines is becoming cumbersome. They decide to migrate their
application to a PaaS solution offered by their cloud provider. With PaaS, they no longer need to worry
about managing the underlying infrastructure. Instead, they focus on developing and deploying their
application using the tools and services provided by the PaaS platform. This allows them to accelerate their
development process and focus more on innovation.

SaaS Implementation: To complement their social media analytics tool, XYZ Innovations decides to offer
additional services to their customers. They develop a SaaS application that integrates with their existing
platform and provides advanced reporting and visualization features. This SaaS application is hosted on the
same cloud platform as their PaaS solution, allowing seamless integration between the two. Customers can
now access the social media analytics tool and the additional reporting features through a web browser,
without the need for any installation or maintenance on their end.

Benefits:

• Scalability: With IaaS, XYZ Innovations can easily scale their infrastructure to handle increasing
demand.
• Development Efficiency: PaaS enables faster development and deployment of applications,
allowing XYZ Innovations to innovate more rapidly.
• User Accessibility: SaaS makes their services easily accessible to customers, who can access the
applications from anywhere with an internet connection.
• Cost Savings: By leveraging cloud computing services, XYZ Innovations can reduce upfront
infrastructure costs and pay only for the resources they use.

Conclusion: By adopting IaaS, PaaS, and SaaS solutions, XYZ Innovations has been able to rapidly scale
their infrastructure, accelerate their development process, and offer additional services to their customers.
This has helped them stay competitive in the rapidly evolving market of social media analytics.

UNIT-2
Scaling in the Cloud:

What is Scaling?

• Scaling is about a system's ability to grow or shrink as needed. In computing, it means handling
varying workloads efficiently without performance issues or unnecessary costs.

Traditional vs. Cloud Computing Scaling:

• In traditional computing, scaling resources manually is common, but it's inefficient because
resources often go unused.
• Cloud computing offers dynamic and automatic scaling, adjusting resources on-the-fly based on
demand.

Foundation of Cloud Scaling:

• Cloud scaling relies on:


o Resource virtualization: Making resources flexible and easier to manage.
o Resource sharing: Efficiently using resources among multiple users.
o Dynamic resource provisioning: Adding or removing resources as needed.

Scaling Strategies in Cloud:


• Proactive Scaling: Planning resource adjustments based on predictable patterns, like scheduled
events or regular peaks in demand.
• Reactive Scaling: Automatically responding to changes in resource utilization, adding or removing
resources as needed in real-time.

Auto-Scaling in Cloud:

• How it Works: The system automatically adjusts resources based on predefined conditions or
schedules.
• Scaling Boundaries: Limits can be set to prevent excessive scaling, with manual intervention
required if boundaries are exceeded.

In summary, scaling in the cloud is about adapting resources to meet demand efficiently. It involves
proactive planning based on predictable patterns and reactive adjustments to real-time changes, all done
automatically within predefined limits.
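
The following minimal sketch, with hypothetical thresholds and helper names, shows the kind of reactive auto-scaling rule described above: measured utilization is compared against predefined limits, and the node count is adjusted only within the configured scaling boundaries.

# Reactive auto-scaling sketch (illustrative thresholds and boundaries).
MIN_NODES, MAX_NODES = 2, 10            # scaling boundaries
SCALE_OUT_AT, SCALE_IN_AT = 0.80, 0.30  # utilization thresholds

def autoscale(current_nodes: int, avg_utilization: float) -> int:
    """Return the new node count for the measured average utilization."""
    if avg_utilization > SCALE_OUT_AT and current_nodes < MAX_NODES:
        return current_nodes + 1        # scale out: add a resource node
    if avg_utilization < SCALE_IN_AT and current_nodes > MIN_NODES:
        return current_nodes - 1        # scale in: release a resource node
    return current_nodes                # within limits: no change

print(autoscale(4, 0.92))  # -> 5
print(autoscale(4, 0.15))  # -> 3

Beyond MAX_NODES the system would stop scaling automatically and require manual intervention, matching the scaling-boundary behaviour described above.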

Types of Scaling:
• Vertical Scaling (Scaling Up): Increasing a system's capacity by replacing existing components
with more powerful ones.
• Horizontal Scaling (Scaling Out): Increasing capacity by adding more resources without replacing
existing ones.

Vertical Scaling:

• Advantages: Simple to implement, less risky, and has little impact on application architecture.
• Disadvantages: Expensive, limited by maximum hardware capacity, and may cause service interruption while components are replaced.

Horizontal Scaling:

• Advantages: No service interruption, uses commodity hardware, and can handle almost unlimited
traffic.
• Disadvantages: Complex to manage, requires distributed computing architecture, and may need
application redesign.

Comparison Between Vertical and Horizontal Scaling:

• Vertical Scaling: Simple but limited. Suitable for any computing environment.
• Horizontal Scaling: Complex but cost-effective. Needs distributed computing architecture.
Horizontal Scaling in the Cloud:

• Offers virtually unlimited scalability.


• More suitable for cloud-native environments due to its ability to distribute load across multiple
nodes.

Performance vs. Scalability:

• Performance: Measured by response time and overall completion time for tasks.
• Scalability: Refers to the system's ability to maintain performance with growing user numbers.
• Ideal Scalable System: Response time should remain consistent regardless of concurrent users.

Balancing Performance and Scalability:

• There's often a trade-off between performance and scalability.


• Setting a scalability limit helps manage performance degradation beyond a certain number of
concurrent users.

In essence, while vertical scaling is simpler but limited, horizontal scaling offers greater scalability potential
but requires a more complex setup. Balancing performance and scalability is crucial for a robust computing
system.

Comparison of Vertical and Horizontal Scaling:

• Also known as
  o Vertical: Scaling up.
  o Horizontal: Scaling out.
• Mechanism
  o Vertical: Involves replacement of components or resource nodes with more powerful ones.
  o Horizontal: Involves introduction of additional components or resource nodes.
• Computing environment
  o Vertical: Can be implemented in any type of computing environment.
  o Horizontal: Can only be implemented in a distributed computing environment.
• Management complexity
  o Vertical: Has less management complexity.
  o Horizontal: Managing larger numbers of nodes increases system complexity.
• Service interruption
  o Vertical: May cause service interruption, as the system needs to restart after a component is replaced.
  o Horizontal: Does not cause service interruption; no system restart is required.
• Application architecture
  o Vertical: Has less influence on application architecture.
  o Horizontal: Has a more fundamental influence on application architecture.
• Hardware
  o Vertical: A vertically scalable application tends to run on high-end, specialized hardware.
  o Horizontal: A horizontally scalable application tends to run on low-end (commodity) hardware.
• Load distribution
  o Vertical: Loads are concentrated on fewer, larger nodes.
  o Horizontal: Spreads the load across many nodes.
• Cost
  o Vertical: Upgrading a resource's capacity beyond a certain level can become very expensive.
  o Horizontal: Generally less expensive; cost grows roughly in proportion to the capacity added.
• Expansion limit
  o Vertical: Expansion is limited by a resource's maximum capacity.
  o Horizontal: Expansion does not depend on the capacity limits of individual hardware components.
• Long-term suitability
  o Vertical: Not a long-term solution for scaling.
  o Horizontal: Provides a long-term solution for scaling.

This comparison should make it easier to see the differences between the vertical and horizontal scaling approaches.

Cloud bursting is like a safety valve for your computing system. When your organization's internal
infrastructure or private cloud can't handle the demand, it expands into a public cloud temporarily. This
setup is called cloud bursting, and it lets you manage changing resource needs without disrupting your users.

Here's a simpler breakdown:

Cloud Bursting:

• What is it? When your internal infrastructure or private cloud can't handle the demand, so your
system expands into a public cloud temporarily.
• How does it work? It's based on two main functions: automated scaling listener and resource
replicator. The scaling listener decides when to switch to the external cloud, and the replicator keeps
your systems synchronized during the switch.
• When is it used? It's handy for systems with sudden spikes in traffic, like during events or
promotions.
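
A minimal sketch of the automated scaling listener's decision, assuming a hypothetical fixed private-cloud capacity: requests are served in-house until that capacity is exhausted, after which overflow is routed to the public cloud.

# Cloud bursting sketch: route overflow traffic to a public cloud
# once the private cloud's (hypothetical) capacity is exhausted.
PRIVATE_CAPACITY = 100   # concurrent requests the private cloud can handle

def route_request(active_private_requests: int) -> str:
    if active_private_requests < PRIVATE_CAPACITY:
        return "private-cloud"          # normal operation
    return "public-cloud"               # burst into rented public resources

print(route_request(42))    # -> private-cloud
print(route_request(100))   # -> public-cloud (bursting)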

Why is Scalability Important for Business?

• What is scalability? It's the ability of your computing system to handle more workload efficiently.
• Why does it matter? Scalability means happy customers because your system can handle more
traffic without slowing down. Even tiny delays in loading times can hurt business—Amazon saw a
1% decrease in revenue for every 100-millisecond delay.
• How does cloud computing help? Cloud computing gives you almost unlimited scalability. Your
applications can automatically adjust to handle more traffic when needed and scale down during
quieter times, letting you focus on your business goals.

So, cloud bursting is like a safety net, ensuring your system can handle whatever comes its way, while
scalability is crucial for keeping customers happy and your business thriving.

Capacity Planning: Ensuring Enough Resources


What is it? Capacity planning is like having enough seats at a concert. You want to make sure there's room
for everyone who wants to come, but you don't want to rent too many seats and waste money.

Why does it matter? It's all about balancing supply and demand. You want to have just the right amount of
resources (like servers and storage) to support your applications, neither too much nor too little.
In the past: Companies used to estimate their resource needs for a long time ahead and invest in fixed
amounts of computing power. But this often led to wasted resources or not having enough when demand
spiked.

Now: With cloud computing, things have changed. You can rent resources as you need them, paying only
for what you use. This flexibility helps avoid wasted resources and ensures you have enough when demand
goes up.

Who's responsible? Everyone has a role. Cloud providers manage the physical resources, while consumers
(like businesses) estimate their needs and communicate them through service level agreements (SLAs).

Benefits: Proper capacity planning means better performance for your applications and more control over
costs. It's like having just the right amount of popcorn for movie night—enough to enjoy without leftovers.

Steps for Capacity Planning:


Step 1: Determine Expected Demand

• Look at past usage patterns to understand how resources are typically used over time.
• Analyze data to predict future demand, considering factors like seasonal variations and special
events.
• This helps in knowing when resources will be needed the most.

Step 2: Analyze Current Load

• Monitor the current usage of resources like processors, memory, storage, and network connectivity.
• Identify any bottlenecks or stress points where resources are maxed out and affecting performance.
• Understand how the system responds to different levels of demand.

Step 3: Understand System Value

• Evaluate the importance of meeting additional demand for the business.


• Consider the cost of adding more capacity versus the benefits it brings.
• Decide whether adding capacity is worth it based on the value it adds to the system.
By following these steps, service providers can ensure they have the right amount of resources to meet
demand without overspending or compromising performance.
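
As a rough illustration of how the three steps combine, the sketch below (with made-up numbers) turns an expected peak demand and a measured per-server capacity into a server count with a safety margin.

import math

# Capacity planning sketch; all figures below are illustrative assumptions.
peak_requests_per_second = 1200   # step 1: expected peak demand
requests_per_server = 150         # step 2: measured capacity of one server
headroom = 1.25                   # step 3: 25% safety margin deemed worthwhile

servers_needed = math.ceil(peak_requests_per_second * headroom / requests_per_server)
print(servers_needed)             # -> 10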

Load Balancing:
Definition: Load balancing is a crucial technique in distributed computing systems. It ensures that
processing and communication tasks are evenly distributed across network resources, preventing any single
resource from becoming overloaded.

Purpose: Load balancing is especially important for handling large and unpredictable numbers of service
requests. It evenly distributes workload among multiple computing resources like processors and memory,
improving overall system performance.

Benefits:

1. Improved Resource Utilization: Efficient load balancing ensures that resources are utilized
optimally, maximizing system performance.
2. Enhanced Reliability: By distributing workload across multiple resources, load balancing increases
system reliability. If one component fails, others can continue functioning without disruption.
3. Scalability: Load balancing enables a system to scale according to demand, redirecting load to
newly added resources as needed.
4. Continuous Availability: By preventing overloading of resources, load balancing ensures that
applications remain available at all times, enhancing user experience.

Implementation: Load balancing can be implemented through hardware or software solutions. Software-
based solutions are cost-effective and easier to configure and maintain compared to hardware-based
solutions.

Importance in Cloud Computing: Load balancing is essential in cloud computing to maintain operational
efficiency and reliability. It ensures that workloads are properly distributed among available resources,
making the system scalable and tolerant to failures. It also maximizes resource availability and minimizes
downtime, critical for businesses relying on cloud services.

Categories of Load Balancing
Static Approach:

• Definition: Static load balancing doesn't use a knowledge base to distribute tasks. It assigns tasks to
resources solely based on task characteristics, without considering the current state of resources.
• Operation: Tasks are evenly distributed among available resources based on predefined rules. It
doesn't track the workload of each resource.
• Advantages: Simple to design and implement, uses algorithms like round robin.
• Example: If two servers are available and six similar requests come in, static load balancing ensures
each server handles three requests.
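
A minimal round-robin sketch of the example above: six similar requests are distributed cyclically over two servers, so each server ends up with three.

from itertools import cycle

# Static load balancing sketch: round robin over two servers.
servers = cycle(["server-1", "server-2"])
for i in range(1, 7):
    print(f"request-{i} -> {next(servers)}")
# Each server handles three of the six requests.
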
Dynamic Approach:

• Definition: Dynamic load balancing considers the current state of resources when distributing tasks.
It adjusts task allocation based on real-time information about resource load.
• Operation: Load balancers continually monitor resource load and adjust task distribution
accordingly. Tasks are assigned based on current resource availability.
• Advantages: Ensures even resource utilization, increases fault tolerance, and supports scalability.
• Example: If one server becomes overloaded, dynamic load balancing redirects tasks to less loaded
servers to maintain system stability.

Exploring Dynamic Load Balancing:

• Distributed Approach: All nodes share the task of load balancing, either cooperatively or non-
cooperatively.
• Non-Distributed Approach: Load balancing is managed by one or a cluster of nodes. It can be
centralized or semi-distributed.
• Centralized: A single node manages load balancing for the entire system, communicating directly
with other nodes.
• Semi-Distributed: Load balancing tasks are partitioned among groups of nodes, each with a central
node responsible for balancing within the group.
• Objective: Balancing loads while minimizing overhead, optimizing response time, ensuring system
health, and facilitating scalability.

Load balancing plays a critical role in optimizing resource utilization and ensuring efficient task execution
in distributed systems like cloud computing. Whether using a static or dynamic approach, load balancers
help maintain system performance, reliability, and scalability.

LOAD BALANCING ALGORITHMS

Objectives of Load Balancing Algorithms

The main goal of load balancing algorithms is to ensure no server is overloaded in terms of capacity or
performance. These algorithms are divided into two categories:

1. Class-Agnostic Load Balancing Algorithms


o Definition: These algorithms distribute load without knowing details about the requests.
o Characteristics:
▪ Simple to design and implement.
▪ Don't consider specifics of requests.
o Examples:
▪ Round Robin: Distributes tasks cyclically among servers.
▪ Random: Assigns tasks to servers randomly.
▪ Least Connection: Directs requests to the server with the fewest active connections.
2. Class-Aware Load Balancing Algorithms
o Definition: These algorithms use knowledge about the requests to make informed decisions.
o Sub-Categories:
▪ Content-Aware: Uses information about the request content to avoid redundant
processing. Example: Streaming requests for the same video go to the same server.
▪ Client-Aware: Uses information about the client to enhance processing efficiency.
Example: VIP customer requests are routed to high-performance servers.
o Advantages: Improves performance and resource use, suitable for critical environments.
o Challenges: More complex to design and implement, benefits depend on system
heterogeneity.
o Examples:
▪ URL Hashing: Routes based on URL hash.
▪ Session Persistence: Ensures requests from the same client session go to the same
server.
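
A minimal sketch of the class-agnostic Least Connection strategy listed above, with illustrative connection counts: each new request goes to the server with the fewest active connections.

# Least-connection sketch (connection counts are illustrative).
active_connections = {"server-1": 12, "server-2": 4, "server-3": 9}

def least_connection_target(connections: dict) -> str:
    return min(connections, key=connections.get)

target = least_connection_target(active_connections)
active_connections[target] += 1   # the chosen server accepts the new request
print(target)                     # -> server-2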

Persistence in Load Balancing

Persistence, or stickiness, ensures that all requests from a single client session go to the same backend
server. This is important for applications that need to maintain session state.

• Methods for Maintaining Persistence:


o Session Tracking: Stores session data on the load balancer.
o In-Memory Session Database: Stores session info in memory, replicated across servers.
o Session Cookies: Uses client-side cookies to store session info, reducing load on the
balancer.
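
One simple way to realize persistence is to hash a session identifier onto the list of backend servers, so every request carrying the same session ID reaches the same server. The sketch below is illustrative, not a specific load balancer's implementation.

import hashlib

servers = ["server-1", "server-2", "server-3"]

def sticky_server(session_id: str) -> str:
    # Hash the session ID and map it onto one of the backend servers.
    digest = int(hashlib.md5(session_id.encode()).hexdigest(), 16)
    return servers[digest % len(servers)]

print(sticky_server("session-abc"))   # always the same server for this session
print(sticky_server("session-abc"))
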
Application Delivery Controller (ADC)

ADCs are advanced load balancers that enhance application performance.

• Features:
o SSL Offloading: Moves encryption tasks from the server to the ADC.
o Content Compression: Reduces data size sent over the network, improving speed and
bandwidth use.

Case Studies

1. Google Cloud
o DNS Level Load Balancing: Redirects requests to the nearest data center.
o Data Center Level Load Balancing: Distributes requests based on server load within the
data center.
2. Amazon EC2
o Auto-Scaling Group Integration: Dynamically adjusts capacity based on traffic.
o Controller Service: Monitors load balancers' health and performance.
o DNS Records: Updates load balancer records dynamically for efficient scaling.

Classification and Key Considerations

Load balancing algorithms are classified based on their awareness of the requests:

1. Class-Agnostic Algorithms:
o Round Robin
o Random
o Least Connection
2. Class-Aware Algorithms:
o Content-Aware: Routes similar requests to the same server.
o Client-Aware: Routes requests from similar clients to specific servers.
o Client & Content Aware: Combines both approaches.

Key Considerations for Load Balancing:

• Resource Utilization: Efficient use of all resources.


• Response Time: Minimize time to respond to requests.
• System Health: Monitor and maintain component health.
• Associated Overhead: Balance task movement overhead with performance benefits.
• Scalability: Support dynamic scaling of resources.

By choosing the right load balancing algorithm, cloud service providers can enhance system performance,
reliability, and efficiency.

13. File System and Storage

13.1 Requirements of Data-Intensive Computing

Data-intensive computing involves processing large datasets that require efficient management and rapid
data movement. Traditional enterprise storage systems are inadequate for such tasks. Key requirements
include:
• Partitioning and Distribution: Large datasets need to be partitioned and processed across multiple
nodes.
• Scalability: Effective data partitioning and distribution promote scalability.
• I/O Performance: High-performance data processing demands efficient data handling to reduce
access time.
• Parallel and Distributed Processing: Handling complex data efficiently requires parallel and
distributed computing models like MapReduce.

13.2 Challenges Before Cloud Native File System

Cloud native file systems must meet several challenges not faced by traditional file systems, including:

• Multi-Tenancy: Ensuring isolation and security for multiple tenants sharing resources.
• Scalability: Supporting both upward and downward scaling to meet varying storage needs without
resource wastage.
• Unlimited Storage: Providing virtually unlimited and fault-tolerant storage using inexpensive
hardware.
• Efficiency: Handling thousands of concurrent operations efficiently.
• Compatibility: Maintaining backward compatibility with existing file system interfaces.
• Metered Use: Enabling resource usage metering.
• Error Detection and Recovery: Incorporating automatic error detection and recovery mechanisms.

13.3 Model for High-Performance Processing of Large Data-Sets

Processing large datasets often requires parallel and distributed models like MapReduce. Key aspects
include:

• MapReduce Programming Model: A framework developed by Google to process massive unstructured data in parallel (a minimal sketch follows this list).
o Map Function: Divides data processing tasks across a distributed environment, generating
intermediate results.
o Reduce Function: Merges intermediate results to produce the final output.
o Scalability: Promotes application scalability by parallelizing data processing across multiple
nodes.
• Other Models: Extensions like Hadoop MapReduce, Pig, Hive, and Sphere, which offer modified
and easier interfaces for processing large datasets.
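
The sketch below runs the classic word-count example on a single machine to show the shape of the MapReduce model: a map function emits intermediate (key, value) pairs and a reduce function merges them. Real frameworks such as Hadoop execute many map and reduce tasks in parallel across a cluster.

from collections import defaultdict

def map_phase(document: str):
    # Map: emit an intermediate (word, 1) pair for every word in the input.
    return [(word, 1) for word in document.split()]

def reduce_phase(pairs):
    # Reduce: merge intermediate pairs into a final count per word.
    counts = defaultdict(int)
    for word, value in pairs:
        counts[word] += value
    return dict(counts)

pairs = map_phase("the cloud stores the data")
print(reduce_phase(pairs))   # -> {'the': 2, 'cloud': 1, 'stores': 1, 'data': 1}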

13.4 Cloud Native File System

To support high-performance computing, several cloud-native file systems have been developed, including:

• IBM General Parallel File System (GPFS): An early high-performance distributed file system
developed by IBM.
• Google File System (GFS): A scalable and reliable distributed file system using inexpensive
hardware, designed for large file storage and access.
• Hadoop Distributed File System (HDFS): An open-source implementation of GFS, providing
scalable and reliable data storage on commodity servers.
• Ghost Cloud File System: A scalable private cloud file system designed for use within an Amazon
Web Services (AWS) account.
• Gluster File System (GlusterFS): An open-source distributed file system capable of scaling across
multiple disks, machines, and data centers.
• Kosmos File System (KFS): An open-source GFS implementation developed in C++, also known as
CloudStore.
• Sector Distributed File System: Another open-source file system inspired by GFS.
13.5 Storage Deployment Models

Cloud storage can be deployed in three models:

• Public Cloud Storage: Accessible to anyone and provided by third-party service providers.
• Private Cloud Storage: Managed by the consumer enterprise, can be set up on-premises or off-
premises.
• Hybrid Cloud Storage: Combines public and private storage, using private storage for critical data
and public storage for archiving.

13.6 Storage Types

Cloud storage is categorized based on user needs:

• General Purpose Storage: For end-users storing files and folders.


• Specialized Storage: For developers needing control over the storage system for deploying and
developing applications.

13.7 Popular Cloud Storages for Developers

Several managed cloud storage services are popular among developers, providing reliable and high-
performance storage:

• Amazon Elastic Block Store (EBS): Provides block-level storage volumes for Amazon EC2
instances, supporting various file systems.
• Amazon Simple Storage Service (S3): Stores files as objects within buckets, capable of handling
trillions of objects.
• Google Cloud Storage: Persistent storage attached to Google Compute Engine (GCE), storing data
as objects within buckets.
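
As a small illustration of object storage, the sketch below writes and reads one object in an Amazon S3 bucket using boto3; the bucket and key names are hypothetical, and the bucket is assumed to exist already.

import boto3

s3 = boto3.client("s3")

# Store a file as an object inside a bucket (bucket/key names are hypothetical).
s3.put_object(Bucket="example-bucket", Key="reports/2024.txt", Body=b"hello cloud")

# Read the object back.
obj = s3.get_object(Bucket="example-bucket", Key="reports/2024.txt")
print(obj["Body"].read())   # -> b'hello cloud'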

These advancements in cloud file systems and storage solutions are crucial for meeting the demands of high-
performance and data-intensive computing environments.

UNIT-3
Multi-tenant software:
Applications have traditionally been developed for single enterprises, where the data belongs to one
organization. However, SaaS platforms require a single application to handle data from multiple customers,
known as multi-tenancy. Multi-tenancy can also be achieved through virtualization, where each tenant has
its own virtual machines. This chapter discusses application-level multi-tenancy.

1 Multi-Entity Support

Before SaaS, large organizations needed applications to support multiple units (entities) while keeping data
segregated. For example, a bank's software should allow branch-level users to see only their data, but also
support centralized changes and global processing like inter-branch reconciliation. This is similar to multi-
tenancy.

To achieve multi-entity support, each database table includes an additional column (OU_ID) indicating the
organizational unit. Queries filter data based on the current user's unit. This is called the single schema
model, where one schema holds data for all entities.

Advantages of the single schema model include easy upgrades for all customers. However, it can be
complex to support custom fields for specific customers. The model shown in Figure 9.2 demonstrates how
custom fields are managed separately, making upgrades complicated.
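
A minimal sketch of the single schema model, using an in-memory SQLite table as a stand-in for the shared database: every row carries an OU_ID column, and every query filters on the current user's organizational unit so each branch sees only its own data.

import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE accounts (ou_id TEXT, name TEXT, balance REAL)")
db.executemany("INSERT INTO accounts VALUES (?, ?, ?)", [
    ("branch-A", "Alice", 500.0),
    ("branch-B", "Bob", 750.0),
])

current_user_ou = "branch-A"   # taken from the logged-in user's context
rows = db.execute(
    "SELECT name, balance FROM accounts WHERE ou_id = ?", (current_user_ou,)
).fetchall()
print(rows)                    # -> [('Alice', 500.0)] : only branch-A data is visible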

2 Multi-Schema Approach

The multi-schema approach uses separate schemas for each customer. This simplifies application design and
makes it easier to re-engineer existing applications. Each schema can have its own customizations. Figure
9.3 shows how separate schemas handle customer data, making the system more flexible.

3 Multi-Tenancy Using Cloud Data Stores

Cloud data stores such as Google App Engine's Datastore, Amazon SimpleDB, and Azure's data services support multi-tenancy. For example, Google App Engine allows dynamic creation of classes for different schemas, as shown in Figure 9.4. This flexibility makes it easier to manage multi-tenant applications on these platforms.
4 Data Access Control for Enterprise Applications

Multi-tenancy can also be useful within an enterprise, where access to data needs to be controlled based on
various rules. Data Access Control (DAC) can manage who sees what data. Figure 9.5 illustrates a generic
DAC implementation where each table has an additional DAC_ID field. Rules define access, and users are
assigned DAC roles.

In traditional databases, SQL queries can enforce DAC by joining tables. In cloud databases, such joins are
not supported, so DAC must be implemented in the application code.
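Because cloud data stores typically lack the joins a relational database would use, the DAC check moves into application code. The sketch below is a simplified illustration: records carry a DAC_ID, users are assigned DAC roles, and the filter is applied after the rows are fetched. Role names and rules are hypothetical.

```python
# Hypothetical rule set: which DAC roles may see which DAC_IDs.
DAC_RULES = {
    "regional-manager": {"DAC-NORTH", "DAC-SOUTH"},
    "branch-clerk":     {"DAC-NORTH"},
}

records = [
    {"id": 1, "dac_id": "DAC-NORTH", "amount": 120},
    {"id": 2, "dac_id": "DAC-SOUTH", "amount": 340},
    {"id": 3, "dac_id": "DAC-EAST",  "amount": 75},
]

def visible_records(user_role, rows):
    # Without server-side joins, access control is enforced here in the
    # application, after the rows are read from the cloud data store.
    allowed = DAC_RULES.get(user_role, set())
    return [r for r in rows if r["dac_id"] in allowed]

print(visible_records("branch-clerk", records))   # only the DAC-NORTH record
```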

To summarize, achieving multi-tenancy, especially with cloud databases, involves complexities and careful
planning to ensure data security and efficient customization.

Data in the cloud:


Overview

Relational databases have dominated enterprise applications since the 1980s, providing a reliable way to
store and retrieve data, especially for transaction processing. However, with the rise of web services from
companies like Google, new distributed systems like GFS (Google File System) and BigTable, as well as
programming models like MapReduce, have been developed. These new systems are particularly effective
for handling large volumes of data and parallel processing.

Relational Databases

Relational databases use SQL to interact with users and applications, optimizing query execution through
memory and disk operations. Data is typically stored in rows on disk pages, but column-oriented storage can
be more efficient for read-heavy tasks like analytics.

Relational databases have evolved to support parallel processing using different architectures (shared
memory, shared disk, shared nothing). They handle transaction isolation through locking mechanisms,
which become complex in parallel and distributed setups. These databases are designed for transaction
processing, but large-scale parallelism requires new approaches.
Cloud File Systems: GFS and HDFS

GFS (Google File System) and HDFS (Hadoop Distributed File System) are designed for managing large
files across clusters of servers. They handle hardware failures and support multiple clients reading, writing,
and appending data in parallel. These systems break files into chunks, which are replicated across different
servers to ensure reliability. Clients access data chunks directly after getting metadata from a master server,
ensuring consistency and fault tolerance through regular updates.
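To make the chunking idea concrete, the sketch below splits a file into fixed-size chunks and assigns each chunk to several servers, mimicking in a much simplified form how GFS/HDFS replicate chunks across a cluster. The chunk size, replication factor, and server names are arbitrary illustrative values.

```python
CHUNK_SIZE = 4          # bytes per chunk (real systems use 64-128 MB chunks/blocks)
REPLICATION_FACTOR = 3  # copies kept of each chunk
SERVERS = ["srv-1", "srv-2", "srv-3", "srv-4"]

def split_into_chunks(data, size=CHUNK_SIZE):
    return [data[i:i + size] for i in range(0, len(data), size)]

def place_chunks(chunks):
    # Each chunk is replicated on REPLICATION_FACTOR distinct servers,
    # so the failure of any single server loses no data.
    placement = {}
    for idx in range(len(chunks)):
        placement[idx] = [SERVERS[(idx + r) % len(SERVERS)]
                          for r in range(REPLICATION_FACTOR)]
    return placement

chunks = split_into_chunks(b"hello distributed file systems")
print(place_chunks(chunks))
```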

BigTable, HBase, and Dynamo

BigTable (by Google) and HBase (on HDFS) are distributed storage systems that manage data in a
structured, multi-dimensional map format. Data is accessed by row key, column key, and timestamp, with
column families storing related data together, similar to column-oriented databases. Data is managed by
tablet servers, with metadata servers locating these tablets.

Amazon's Dynamo is a key-value store that handles high volumes of concurrent updates using distributed
object versioning and quorum consistency, ensuring data reliability across nodes. Data is stored with
versioning, allowing for conflict resolution by applications. A quorum protocol ensures consistency by
requiring reads and writes to access multiple replicas.
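The quorum rule can be stated numerically: with N replicas, a write quorum W and a read quorum R, choosing R + W > N guarantees that every read overlaps at least one replica holding the latest write. The small check below uses typical example values, not figures from any particular deployment.

```python
def quorum_overlaps(n_replicas, write_quorum, read_quorum):
    # R + W > N means any read set and any write set must share a replica,
    # so a read always sees at least one up-to-date copy.
    return read_quorum + write_quorum > n_replicas

print(quorum_overlaps(3, 2, 2))   # True  -> reads are guaranteed to see the latest write
print(quorum_overlaps(3, 1, 1))   # False -> a read may miss the latest write
```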

Cloud Data Stores: Datastore and SimpleDB

Google App Engine's Datastore and Amazon's SimpleDB are built on BigTable and Dynamo, respectively.
Datastore uses BigTable's infrastructure for efficient data storage and querying. SimpleDB uses Dynamo's
key-value approach, providing scalable, fault-tolerant storage.

Summary

Cloud data strategies like BigTable, HBase, and Dynamo offer scalable and fault-tolerant alternatives to
traditional relational databases. They are particularly well-suited for large-scale, parallel data processing,
leveraging distributed file systems like GFS and HDFS for efficient data access and reliability. These
systems support the growing needs of enterprise applications in the cloud.
1. Database in Cloud:
• Implementation Forms:
o Traditional Database Solution on IaaS: Users deploy database applications on virtual
machines.
o Database-as-a-Service (DBaaS): Service providers manage backend administration tasks.

2. Data Models:

• Structured Data (SQL Model): Traditional relational databases like Oracle, SQL Server.
• Unstructured Data (NoSQL Model): Suitable for scalable systems, efficient for unstructured data
sets. Examples: Amazon SimpleDB, Google Datastore, Apache Cassandra.

3. Database-as-a-Service (DBaaS):

• Managed by service providers, offering scalability, metered billing, automated provisioning, and
increased security.
• Supports both structured and unstructured data.

4. Relational DBMS in Cloud:

• Deployment Options:
o Traditional Deployment: Users deploy traditional RDBMS on cloud servers, manage
administration.
o Relational Database-as-a-Service: Fully-managed RDBMS by cloud providers.

5. Examples of Managed Relational Database Services:

• Amazon RDS: Supports MySQL, Oracle, SQL Server, PostgreSQL, and Amazon Aurora.
• Google Cloud SQL: Managed MySQL database.
• Azure SQL Database: Managed Microsoft SQL Server.

In essence, cloud databases offer flexibility in deployment, management, and scalability, catering to both
structured and unstructured data needs. Managed services relieve users of administrative burdens, providing
on-demand access and automated functionalities.

NoSQL databases are a class of database systems that handle large volumes of unstructured data more
efficiently than traditional relational databases. They are crucial in cloud computing environments because
they can store and retrieve such data effectively at scale. Let's break down the key points:

1. Emergence of Big Data: With the rise of web-based platforms, data started growing exponentially
in volume and complexity. Big data refers to massive sets of structured and unstructured data with
characteristics like high volume, velocity (fast-changing), and variety (different formats like text,
audio, video).
2. Challenges with Relational Databases: Traditional relational databases faced challenges in
handling the increasing volume of data, especially with the surge in online transactions and social
networking. They struggled to scale horizontally, meaning they couldn't easily distribute data across
multiple servers to handle increased loads.
3. Need for NoSQL Databases: As web applications moved to cloud computing, it became evident
that traditional relational databases were inadequate for handling modern data requirements. NoSQL
databases emerged as a solution because they could scale horizontally, distribute data efficiently, and
handle unstructured data effectively.
4. Characteristics of NoSQL Databases:
o Horizontal Scalability: NoSQL databases can scale out across multiple servers to handle
increasing data loads.
o Flexible Schemas: They allow for schema-less data storage, enabling easy adaptation to
changing data structures.
o Non-relational: NoSQL databases can efficiently handle both relational and non-relational
data.
o Auto-distribution and Replication: Data distribution and replication are automatic
processes in NoSQL databases, ensuring high availability and fault tolerance.
o Integrated Caching: Many NoSQL databases offer integrated caching capabilities to
improve performance.
5. Types of NoSQL Databases:
o Key-Value: Simplest form where data is stored as key-value pairs.
o Document-Oriented: Data is stored in documents, usually using JSON or XML formats.
o Column-Family: Data is grouped in columns, allowing for efficient querying.
o Graph: Data is stored as graph structures, useful for applications with complex relationships.
6. Popular NoSQL Databases: Examples include MongoDB, Cassandra, HBase, DynamoDB,
CouchDB, and Neo4j. Each has its own strengths and weaknesses, catering to different use cases.
7. Selecting the Right NoSQL Database: Choosing the appropriate NoSQL database depends on the
specific requirements of the application. There's no one-size-fits-all solution, and often multiple
NoSQL databases may be used together to optimize performance.
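As an illustration of the document-oriented category, the sketch below stores and queries JSON-like documents with MongoDB's pymongo driver. The connection string, database, and collection names are placeholders, and a locally running MongoDB server is assumed.

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")   # placeholder connection string
db = client["shop"]

# Documents are schema-less: each record may carry different fields.
db.orders.insert_one({"order_id": 1, "customer": "Asha", "items": ["pen", "book"]})
db.orders.insert_one({"order_id": 2, "customer": "Bo", "gift_wrap": True})

# Queries match on document fields rather than fixed columns.
for doc in db.orders.find({"customer": "Asha"}):
    print(doc)
```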

Content Delivery Network:


1. Why CDNs are Needed:
o In cloud computing, everything travels through the internet, which can cause delays in
delivering content to users.
o These delays can hurt user experience and business success.
o CDNs help by creating a dedicated network for faster content delivery, reducing delays and
improving user experience.
2. How CDNs Work:
o CDNs store copies of content on servers located closer to users, reducing the distance content
needs to travel.
o When users request content, CDNs deliver it from the nearest server, speeding up delivery.
o CDNs use caching and replication to store content on servers worldwide, ensuring fast and
reliable delivery.
3. Advantages of CDNs:
o CDNs handle heavy traffic efficiently, ensuring fast delivery even during peak times.
o They support more simultaneous users and reduce load on individual servers, improving
performance.
o CDNs accelerate content delivery, making websites load faster for users.
o By reducing the distance content travels, CDNs lower delivery costs and improve reliability.
o They also provide better security by storing content copies on multiple servers, protecting
against data loss.
4. Disadvantages of CDNs:
o CDNs create a new point of failure in the delivery chain, so if a CDN service fails, content
providers are affected.
o Content providers must manage content through the CDN provider's facilities, adding
complexity.
5. CDN Service Providers:
o Several companies offer CDN services, including Akamai, Limelight, Amazon's CloudFront,
Microsoft's Azure CDN, and CDNetworks.
o These providers offer global networks of servers for fast content delivery, catering to
different business needs and geographical regions.

Overall, CDNs are essential for delivering content quickly and reliably to users worldwide, improving user
experience and business success.
Security Reference Model
Security is a big concern in cloud computing, just like in regular computing. When individuals or businesses
switch to cloud services, they want to be sure their data and operations are safe. Cloud computing means
working with data and applications outside your usual setup, which can feel risky.

People often say, "Cloud computing is a great idea, but it's not secure." But, if done right, cloud computing
can be just as safe as traditional methods. Understanding the risks and choosing the right setup for your
needs is key.

Security Concerns in Cloud Computing

In the past, we used firewalls to protect our networks from outside threats. But with cloud computing, the
traditional security boundaries change. Computing resources move outside our usual security zones, and we
need new ways to keep them safe.

Cloud Security Working Groups

Several groups, like the Cloud Security Alliance (CSA) and the Jericho Forum, have worked on creating
security standards for cloud computing. They've published guidelines and best practices to help both
providers and users stay safe.

Elements of Cloud Security Model

Gartner outlined seven key security issues for cloud computing:

1. User Access: Who has control over your data in the cloud?
2. Regulatory Compliance: Are your cloud providers following security rules?
3. Data Location: Do you know where your data is stored?
4. Data Segregation: Is your data kept separate from others'?
5. Recovery: What happens if there's a disaster? Can you get your data back?
6. Investigative Support: Can you investigate any problems that occur?
7. Long-Term Viability: What happens if your cloud provider goes out of business?

By asking these questions and choosing reputable providers, you can make sure your cloud computing is as
safe as possible.
Cloud Security Reference Model

For years, various groups and organizations worked on developing a model to address cloud security. The
Jericho Forum proposed the "Cloud Cube Model" in 2009 to tackle the issue of security boundaries blurring
among collaborating businesses.

The Cloud Cube Model

This model suggests that cloud security shouldn't just be measured by whether systems are "internal" or
"external." It looks at factors like who manages the cloud and who has access rights.

Primary Objectives

The Cloud Cube Model aims to represent different cloud formations, highlight their characteristics, and
show the benefits and risks. It also emphasizes that traditional approaches to computing aren't always
obsolete.

The Four Criteria

The model focuses on four security-related criteria:

1. Data storage location (Internal/External)


2. Ownership of technology (Proprietary/Open)
3. Security boundary (Perimeterized/De-perimeterized)
4. Service sourcing (Insourced/Outsourced)

These criteria help determine the nature of the cloud formation.

Examining Cloud Security against Traditional Computing

Cloud security depends on both providers and consumers. Consumers have more responsibility for security
management with Infrastructure as a Service (IaaS), while it decreases with Software as a Service (SaaS).
Security Policy

Security policies guide reliable security implementation in a system. These policies include management
policy, regulatory policy, advisory policy, and informative policy.

Trusted Cloud Computing

Building trust in cloud computing is essential. Providers can gain trust through security certifications like the
Security, Trust & Assurance Registry (STAR) from the Cloud Security Alliance (CSA).

How to Make a Cloud Service Trusted

Providers can strengthen trust by obtaining security certifications. Transparency about security measures
helps consumers feel confident in adopting cloud services.

Cloud Security Concerns:


• Cloud computing offers many benefits, but it also brings new security challenges due to shared
infrastructure and resources.
• Understanding cloud architecture and choosing the right deployment model can reduce security risks.
• Both cloud providers and consumers share responsibility for security.

Key Security Needs:

• Cloud security must address basic requirements like confidentiality, integrity, availability, identity
management, and access control.
• These requirements are not new, but their implementation in the cloud requires special attention.

Responsibility:

• Security in the cloud is a joint effort between providers and consumers.


• Providers must secure their infrastructure and client data, while consumers must ensure providers
meet security standards.

Service-Level Agreements (SLAs):

• SLAs establish trust between providers and consumers, outlining service capabilities and security
measures.
• They should include detailed security provisions and responsibilities for both parties.

Threats and Risks:

• Threats like eavesdropping, fraud, theft, sabotage, and external attacks exist in cloud computing.
• Cloud-specific threats include infrastructure, information, and access control vulnerabilities.

Infrastructure Security:

• Focuses on controlling physical resources supporting the cloud infrastructure.


• Includes network, host, and service-level security, with responsibilities shared between providers and
consumers.

Information Security:

• Addresses confidentiality, integrity, and availability of data.


• Measures include encryption, access control, and protection against network-based attacks.
Protection from Undesirable Circumstances:

• Plans should be in place for scenarios like provider bankruptcy, acquisition, or service
discontinuation.
• Consumers should have access to their data and be able to move it to other providers if needed.

Overall, cloud security requires collaboration between providers and consumers, clear SLAs, and measures
to protect data integrity, confidentiality, and availability.

Identity Management and Access Control:

Identity management and access control, also known as IAM, are essential for secure computing. Here's
why:

1. Operational Efficiency: IAM automates user verification, making system operations smoother.
2. Enhanced Security: It protects systems and data from harmful attacks.

IAM includes:

• Identification Management: Users state their identities, usually with a unique ID or username.
• Authentication Management: Verifies a user's identity through passwords, fingerprints, or retina
scans.
• Authorization Management: Determines a user's level of access rights after verifying identity.
• Access Management: Implements organizational policies and system privileges for resource
requests.
• Accountability: Ensures individuals cannot deny actions within the system, using audit trails and
logs.
• Monitoring and Auditing: Allows users to monitor, audit, and report compliance issues regarding
resource access.

IAM in the Cloud:

In cloud computing, IAM is crucial due to the loss of control over infrastructure. Both service providers and
consumers play roles in ensuring security:

• Service Providers: Must implement IAM mechanisms to protect cloud environments.


• Consumers: Need to implement proper access control techniques like authentication and
authorization.

IAM in cloud computing includes robust authentication, authorization, and access control mechanisms, often
using modern technologies like biometrics or smart cards.

Exploring Identity Management:

User authentication can be simplified with Single Sign-On (SSO) and Federated Identity Management
(FIM):

• SSO: Lets users access multiple applications with a single sign-in.


• FIM: Links identity information across different applications within known trust boundaries.

Exploring Access Control:

Access control limits user access to systems and data. Models like Mandatory Access Control (MAC),
Discretionary Access Control (DAC), and Non-Discretionary Access Control determine access policies
based on user roles and system requirements.
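A minimal sketch of the authorization step described above, using a role-based policy: after a user has been identified and authenticated, the requested action on a resource is checked against the permissions attached to that user's role. Role names and permissions here are purely illustrative.

```python
# Illustrative role-to-permission mapping (role-based / non-discretionary style).
ROLE_PERMISSIONS = {
    "admin":  {("vm", "start"), ("vm", "stop"), ("storage", "read"), ("storage", "write")},
    "viewer": {("vm", "list"), ("storage", "read")},
}

USER_ROLES = {"alice": "admin", "bob": "viewer"}

def is_authorized(user, resource, action):
    # Authorization runs only after identification and authentication;
    # here the user's identity is assumed to be already verified.
    role = USER_ROLES.get(user)
    return (resource, action) in ROLE_PERMISSIONS.get(role, set())

print(is_authorized("bob", "storage", "read"))   # True
print(is_authorized("bob", "vm", "stop"))        # False
```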
Cloud Security Design Principles:

• Least Privilege: Assign minimum required privileges.


• Defense in Depth: Use multiple layers of protection.
• Fail Safe: Ensure system remains secure even in case of failure.
• Economy of Mechanism: Keep security design simple.
• Open Design: Design security systems that don't rely on secrecy.
• Complete Mediation: Rigorously check access controls with every request.
• Least Common Mechanism: Avoid sharing security mechanisms among components.
• Separation of Privilege: Distribute privileges among multiple subjects.
• Weakest Link: Continuously assess and improve weakest parts of security system.
• Psychological Acceptability: Ensure security mechanisms don't overly complicate user experience.

Cloud Security Management Frameworks:

Frameworks like ITIL and ISO 27001/27002 provide guidelines for managing cloud security effectively.

Security-as-a-Service:

Organizations can outsource security management responsibilities to third-party providers, who offer
services like email and web content filtering, vulnerability management, and identity management. These
services are subscription-based and managed by the provider.

In summary, IAM, access control, and robust security design principles are critical for ensuring cloud
security, with frameworks and Security-as-a-Service options available to help organizations manage their
security effectively.

Privacy and Cloud Computing

Understanding Privacy: Privacy and security are both important in cloud computing, but they're not the
same. While security focuses on protecting data from unauthorized access, privacy is about controlling who
can access personal or sensitive information.

Consumer Responsibility: Cloud service providers are mainly responsible for maintaining privacy, but
consumers should also consider their own privacy needs when agreeing to service terms.

Regulatory Compliance: Meeting legal requirements in different countries can be challenging due to the
global nature of cloud data storage. Consumers should understand their privacy rights and the provider's
approach to compliance.

GRC Issues: Governance, risk, and compliance become more complex with cloud adoption. Regular audits
can help identify any violations and build consumer trust.

What is Privacy?: Privacy involves keeping personal information confidential and under control.
Personally identifiable information (PII), like names and contact details, falls under privacy regulations.

Privacy Concerns in Cloud Computing:

• Access to Data: Consumers need clarity on who can access their data and under what circumstances.
• Compliance: Privacy laws vary by region, so consumers must understand legal implications.
• Storage Location: Data stored in the cloud can be spread across multiple locations, subject to
different privacy laws.
• Retention and Destruction: Consumers should know how long data will be stored and what
happens when they switch providers.
Key Concerns: Privacy laws differ globally, making it crucial for consumers to understand their rights and
the provider's privacy policies.

Security vs. Privacy:

• Security: Focuses on data protection (confidentiality, integrity, availability).


• Privacy: Concerned with appropriate use of information and freedom from observation or
disturbance.

Importance of Privacy Policy: Cloud providers should have clear privacy policies to address consumer
concerns and build trust. These policies outline how privacy is maintained and any limitations on liability.

Compliance in Cloud Computing

Understanding Compliance: Compliance involves following regulations and laws, which can be
challenging due to differing standards across countries and regions. Cloud computing adds complexity
because data is stored in multiple locations with different regulations.

Types of Compliance Concerns:

• Regulatory: Meeting legal requirements.


• Performance: Ensuring systems meet objectives.
• Security: Protecting data from threats.
• Legal: Adhering to laws and contracts.

Role of Service Level Agreements (SLAs): SLAs between cloud service providers (CSPs) and consumers
address compliance issues. Both parties must agree on requirements and resolve any violations.

Shared Responsibility: Compliance isn't solely the provider's responsibility. Consumers also play a role in
monitoring and addressing compliance issues.

Governance, Risk, and Compliance (GRC): GRC encompasses governance strategies, risk management,
and compliance to regulations. It's crucial for organizations to address these areas together to avoid conflicts.

Steps to Address GRC:

1. Risk Assessment: Regularly assess risks related to governance and compliance.


2. Key Controls: Identify and document key controls to mitigate risks.
3. Monitoring: Monitor key controls to ensure compliance.
4. Reporting: Report on functional metrics and performance indicators.
5. Continuous Improvement: Address any gaps or conflicts in compliance processes.

Importance of GRC:

• Information Explosion: Increased data requires better management for security and privacy.
• Scores of Regulations: Many regulations make compliance challenging.
• Globalization: Businesses operate globally, facing varied regulations.
• Cloud Computing: Cloud adds complexity due to data storage across jurisdictions.

Automated GRC Monitoring: Tools automate GRC monitoring, enhancing visibility and efficiency.
Solutions from vendors like SAP offer automated GRC programs.

Audit and Monitoring:

• System Audit: Regular checks to assess system performance, security, and compliance.
• Internal and External Audit: Internal audits are performed by employees, while external audits are
conducted by independent professionals.

Audit Frameworks:

• SysTrust and WebTrust: Focus on security, availability, confidentiality, and integrity.


• ISO 27001: Addresses information security and risk management.

Auditing the Cloud: Cloud customers need assurance that providers comply with regulations. Right to
Audit clauses in contracts allow clients to conduct audits. CSPs can adopt compliance programs based on
standard frameworks to build trust.

UNIT-4
Portability and interoperability are critical concerns in cloud computing, impacting a consumer's ability to
move their systems between different providers and integrate applications across various environments. Let's
break down the key points:

Portability:

• Definition: Portability refers to the ability to move computing entities, such as virtual machines,
applications, or development environments, from one system to another without losing functionality.
• Levels of Concern: Portability issues vary across different layers of cloud services, including SaaS,
PaaS, and IaaS.
• Categories of Portability: Data portability, application portability, and platform portability are the
primary categories of concern.
• Challenges: Moving data between cloud services requires standard data formats. Application
portability is hindered by platform-specific features and APIs. Platform portability involves either
reusing platform components or bundling machine images.

Interoperability:

• Definition: Interoperability refers to the ability of systems or applications to communicate and
exchange information effectively.
• Cloud Perspective: Interoperability in cloud computing involves communication between multiple
cloud environments, including public, private, and hybrid clouds.
• Importance: Interoperability enables seamless data exchange between systems running in different
cloud environments and facilitates application integration.
• Initiatives: Various organizations, such as the Cloud Computing Interoperability Forum (CCIF) and
the Distributed Management Task Force (DMTF), are working on developing open standards and
protocols to promote interoperability.

Addressing Challenges:

• Portability Solutions: Standardizing data formats and APIs, as well as initiatives like the Simple
Cloud API, help address portability challenges.
• Interoperability Solutions: Standardizing communication interfaces and data formats, leveraging
service-oriented architecture (SOA) principles, and initiatives from organizations like CCIF and
DMTF are key to promoting interoperability.

Context and Evolution:

• Historical Context: Portability and interoperability challenges have been present in computing since
earlier days, but cloud computing exacerbates these issues due to the distributed nature of cloud
environments.
• Current Efforts: Technologists are actively working on addressing these challenges through
standardization efforts and collaborative initiatives.

Overall, understanding and addressing portability and interoperability concerns are essential for consumers
considering cloud adoption, as they impact the flexibility and effectiveness of cloud deployments.

Various situations where interoperability and portability issues may arise in cloud computing, along with
potential recommendations and remedies:

Scenario 1: Customer Switches the Cloud Service Provider:

• Portability Considerations:
o SaaS Level: Data portability is critical, ensuring data format, extent, and semantics align
between providers. Migration tools or standard data formats can aid in this.
o PaaS Level: Application portability depends on compatibility of virtual machines, operating
systems, and development environments. Portable machine image formats like OVF can
assist in this.
• Interoperability Considerations:
o For SaaS, functional interface similarities between old and new providers are important. API
compatibility is crucial for applications using SaaS APIs.
o For PaaS and IaaS, API interoperability is vital for applications.

Scenario 2: Customer Uses Cloud Services from Multiple Providers Concurrently:

• Recommendations from Scenario 1 remain valid.


• Additional considerations include data portability between equivalent or non-equivalent SaaS
services and application portability concerns at PaaS level.

Scenario 3: Customer Links One Cloud Service to Another Cloud Service:

• Portability Considerations: No entity transfer occurs, so portability is not a concern.


• Interoperability Considerations: Integration complexities vary based on whether the first cloud
service is SaaS, PaaS, or IaaS. Standardized APIs and SOA approaches facilitate interoperability.

Scenario 4: Customer Links In-house Capabilities with Cloud Services:

• Portability Considerations: Data and functionality/process integration are crucial. APIs must be
well-defined for both on-premises and cloud applications.
• Interoperability Considerations: Integration effort is reduced with SOA techniques or PaaS, where
customers have more control. IaaS integration requires less effort as customers control platform
services and applications.

Addressing these scenarios effectively involves understanding the specific challenges and leveraging
appropriate technologies and standards to ensure seamless interoperability and portability between cloud
services and environments.

Machine imaging and virtual appliances are both valuable tools in the realm of cloud computing, each
serving specific purposes and offering distinct benefits. Let's break down the key points about each:

Machine Imaging:

• Definition: A machine image is essentially a clone of an entire system stored in a file, which can be
deployed later to launch multiple instances of that machine.
• Contents: It includes the operating system, pre-installed applications, and tools necessary to run the
machine.
• Benefits:
o Streamlines deployment of virtual servers or machines, providing users with pre-configured
options.
o Facilitates faster disaster recovery by allowing users to restore systems from image backups.

Virtual Appliance:

• Definition: A virtual appliance is a pre-integrated, self-contained system consisting of a minimal
operating system environment (JeOS) and a single application.
• Contents: It contains only the essential parts of the operating system required to support the
application, along with the application itself and its dependencies.
• Benefits:
o Simplifies deployment by encapsulating application dependencies in a self-contained unit.
o Offers enhanced security by running applications in isolation, reducing the risk of
interference with other applications.

Distinguishing Features:

• Machine Image vs. Machine Instance: A machine image serves as a template from which multiple
machine instances can be launched. Each instance is a virtual server created from the image.
• Deployment Objective: Machine imaging primarily focuses on launching machine instances over
virtual infrastructure, while virtual appliances aim to deploy applications over virtual infrastructure.

Open Virtualization Format (OVF):

• Definition: OVF is a packaging standard developed by DMTF to address portability issues of virtual
systems, such as virtual machine images or virtual appliances.
• Purpose: OVF enables cross-platform portability by providing a common packaging format for
virtual systems.
• Delivery: OVF packages consist of one or more files, including an XML descriptor file containing
metadata, virtual disk files, and other relevant data. OVF packages can be delivered either as a
package of files or as a single OVA (Open Virtualization Appliance) file.

In summary, machine imaging and virtual appliances offer efficient ways to deploy and manage applications
in cloud computing environments, with each serving specific needs and providing unique advantages. The
adoption of standards like OVF further enhances interoperability and portability across different
virtualization platforms.

| Aspect | Virtual Machine Image | Virtual Appliance |
| --- | --- | --- |
| Contents | Full image of the operating system. | Minimal required part of the OS (Just-Enough OS) along with the application and its dependencies. |
| Application Presence | May or may not contain any application. | Always contains an application and its dependencies. |
| Multiple Applications | May contain more than one application installed over the OS. | Generally contains only one application installed per OS image. |
| Primary Objective | To launch machine instances over virtual infrastructure. | To launch applications over virtual infrastructure. |
| Installation of Software | Other software applications can be installed over machine instances created from virtual machine images. | Nothing can be installed over a deployed virtual appliance. |
| Nature | Image of a virtual machine stored in archived format as captured at some instance. | A type of virtual machine image in customized (and reduced) form. |

This table summarizes the key differences between virtual machine images and virtual appliances, focusing
on their contents, objectives, and the ability to install additional software.

Cloud Management and a Programming Model Case Study

Cloud Architecture Revisited

• Frontend and Backend: Cloud computing involves client applications at the frontend and the cloud
as the backend, which is composed of multiple layers and abstractions.
• Backend Components: The backend comprises a cloud building platform and cloud services.
• Platform for Cloud: This includes the infrastructure and support system for cloud operations.
• Cloud Services: Comprise infrastructure, platform, and application services, with infrastructure
services being the most common.

Design Characteristics

• Dynamic Infrastructure: Utilizes techniques like resource pooling, sharing, virtualization,


provisioning, and load balancing.
• Flexible Application Architecture: Built on dynamic infrastructure using service-oriented
architecture (SOA) principles, enabling loose coupling and composability.

Cloud Computing Overview

• Deployment Options: Private, public, hybrid, and community deployments, with three main service
delivery categories: IaaS, PaaS, and SaaS.
• Degree of Abstraction: Increases as one moves upwards along the service levels, with flexibility
decreasing as abstraction increases.

Migration into Cloud

• Reasons for Migration: Business needs and technological advancements.


• Forms of Migration: Can involve importing applications, making changes to code, redesigning
applications, or changing entire architectures.
• Phases of Migration: Assessment, isolation, mapping, rebuilding, augmentation, testing, and
optimization.

Asset Management in Cloud

• Simplified Asset Management: Consumers can easily manage computing assets through cloud
service providers, with capacity expansion and provisioning becoming simpler and more agile.

These summaries capture the key points discussed in the text, providing a clearer understanding of cloud
management and programming models.

Cloud Service Management Categories

1. Business Support:
o Involves activities like customer management, contract management, inventory management,
accounting and billing, reporting and auditing, and pricing and rating.
2. Provisioning and Configuration:
o Core activities include rapid provisioning, resource changing, monitoring and reporting,
metering, and SLA management.
3. Portability and Interoperability:
o Focuses on data, application, and platform portability, as well as service interoperability.

Cloud Management Tools

• Cloud management tools enhance performance attributes like effectiveness, security, and flexibility.
• They include features like resource configuration, provisioning, policy management, performance
monitoring, cost monitoring, performance optimization, and security management.
• Various vendors offer cloud management solutions tailored to different cloud environments.

Cloud Management Standards

• Standards like CIMI (Cloud Infrastructure Management Interface) and OVF (Open Virtualization
Format) simplify cloud management by providing standard APIs and packaging formats.
• These standards enable the development of vendor-independent cloud management tools.

Share of Management Responsibilities

• Responsibilities vary between service consumers and providers across SaaS, PaaS, and IaaS models.
• Consumers have minimal responsibilities in SaaS, moderate in PaaS, and maximum in IaaS.

Cloud Service Lifecycle

• Phases include service template creation, SLA definition and service contracts, service creation and
provisioning, service optimization and customization, service maintenance and governance, and
service retirement.
• The lifecycle starts with service creation and ends with service retirement, aiming for optimal
resource usage.
The lifecycle of cloud services has several stages to manage their dynamic nature. Here's a simpler
breakdown:

1. Service Template Creation:


o In Phase 1, service templates are defined or modified for creating services. These templates
are like blueprints.
2. Understanding Consumer Needs:
o Phase 2 focuses on understanding consumer needs and managing relationships through SLA
contracts.
3. Service Creation and Deployment:
o Phase 3 involves creating, deleting, or modifying actual services based on SLA contracts. It
includes planning deployment on the cloud and managing resources for service execution.
4. Service Optimization and Customization:
o Phase 4 optimizes and customizes services based on performance evaluations, adjusting
attributes as needed.
5. Service Maintenance and Monitoring:
o Phase 5 monitors service resources closely to ensure compliance with agreements and
security. It involves routine maintenance, reporting, and billing.
6. Service Retirement:
o Phase 6 handles service retirement, including migration requests or contract terminations. It
addresses archiving, data protection, and aims to minimize resource waste.

The cloud service lifecycle begins with creating templates and ends with retiring services, aiming for
efficient resource usage throughout.

SLA MANAGEMENT

An SLA (Service-Level Agreement) is a legal agreement between a service provider and a consumer or
carrier regarding the quality of services provided. In cloud computing, SLAs define various parameters for
service quality, such as uptime or response time for issue resolution. Before finalizing the SLA, both parties
must clearly identify the Service-Level Objectives (SLOs).

TYPES OF SLA

There are two main types of SLAs:

1. Infrastructure SLA: This covers issues related to the infrastructure, like network connectivity and
server availability. For example, it might guarantee that network packet loss will not exceed 1% in a
month.
2. Application SLA: This focuses on application performance metrics, such as web server latency or
database response time.
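Availability targets in an SLA translate directly into allowed downtime. The short calculation below is an illustrative aid rather than a figure from any particular provider's SLA: it converts an uptime percentage into the maximum downtime permitted per month.

```python
def allowed_downtime_minutes(uptime_percent, minutes_in_month=30 * 24 * 60):
    # e.g. a 99.9% monthly uptime guarantee allows about 43.2 minutes of downtime.
    return minutes_in_month * (1 - uptime_percent / 100)

for target in (99.0, 99.9, 99.99):
    print(f"{target}% uptime -> {allowed_downtime_minutes(target):.1f} minutes/month")
```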

SLA LIFECYCLE

The lifecycle of an SLA involves several phases:

1. Contract Definition: Service providers create SLAs based on standard templates, which can be
customized for individual customers.
2. Publishing and Discovery: Providers publish their service offerings, and consumers search the
service catalog to find suitable options.
3. Negotiation: Both parties negotiate the terms of the SLA before signing the final agreement.
4. Operationalization: Once the SLA is in effect, monitoring, accounting, and enforcement activities
ensure compliance with the agreed-upon terms.
5. Decommissioning: If the relationship between the consumer and provider ends, this phase specifies
the terms for terminating the agreement.
Overall, the SLA lifecycle involves defining, negotiating, implementing, and terminating agreements to
ensure that service quality meets expectations.

DISASTER RECOVERY IN CLOUD

In cloud computing, disasters or damage to physical computing resources cause less harm than in traditional
setups, because data centers are well protected and interconnected, allowing data and workloads to be
replicated and restored from other sites. Disaster recovery planning involves two key factors:

1. Recovery Point Objective (RPO): This determines the maximum acceptable data loss in case of a
disaster. For example, if the RPO is 8 hours, backups must be made at least every 8 hours.
2. Recovery Time Objective (RTO): This specifies the acceptable downtime for the system in case of a
disaster. For instance, if the RTO is 6 hours, the system must be operational again within 6 hours.

DATA RECOVERY STRATEGIES

A good disaster recovery plan should meet the RPO and RTO needs. It involves:
A. Backups and data retention: Regular backups of persistent data are crucial for recovery. Each service provider has its backup strategy.
B. Geographic redundancy: Storing data in multiple geographic locations increases survival chances during disasters.
C. Organizational redundancy: Maintaining backups with alternative service providers safeguards against unexpected shutdowns or failures.

INTERCLOUD OR CLOUD FEDERATION

InterCloud, also called cloud federation, is the concept of connecting multiple clouds to share resources and
support each other in case of saturation. It's like a "cloud of clouds" where communication protocols allow
clouds to interact. This federation enables workload flexibility and resource sharing among interconnected
clouds with similar architecture and interfaces.
CLOUD PROGRAMMING: A CASE STUDY WITH ANEKA
Aneka is a cloud platform developed by the University of Melbourne, Australia, and commercialized by
Manjrasoft, an Australian company. It's named after the Sanskrit word for "many in one" because it supports
multiple programming models: task programming, thread programming, and MapReduce programming.

Let's break down the key components and programming models supported by Aneka:

1. Components of Aneka:
o Executors: These are nodes that execute tasks.
o Schedulers: They arrange the execution of tasks across multiple nodes.
o WorkUnits: These are the units of work that make up an application.
o Manager: This is a client component that communicates with the Aneka system.
2. Programming Models:
o Thread Programming:
▪ This model focuses on achieving high performance by allowing multiple threads to
run simultaneously.
▪ In Aneka, it's implemented using distributed threads called Aneka threads.
▪ Developers use APIs similar to .NET's thread class, making it easy to port existing
multi-threaded applications to Aneka.
o Task Programming:
▪ This model considers applications as collections of independent tasks that can be
executed in any order.
▪ Aneka supports this model with APIs like the ITask interface.
▪ Tasks can be bundled and sent to the Aneka cloud for execution.
o MapReduce Programming:
▪ This model, popularized by Google, is used for processing large volumes of data
efficiently.
▪ Aneka's implementation follows the structure of Hadoop, a Java-based framework.
▪ It involves two phases: mapping and reducing, where data is filtered, sorted, and
summarized.
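The two phases can be illustrated independently of Aneka or Hadoop. The sketch below is a plain-Python word count: the map step emits (word, 1) pairs and the reduce step sums the counts for each word. It only mirrors the programming model; it does not use Aneka's actual APIs.

```python
from collections import defaultdict

def map_phase(document):
    # Emit a (key, value) pair for every word in the input.
    return [(word.lower(), 1) for word in document.split()]

def reduce_phase(pairs):
    # Group by key and aggregate the values for each key.
    counts = defaultdict(int)
    for word, count in pairs:
        counts[word] += count
    return dict(counts)

documents = ["the cloud scales", "the cloud stores data"]
intermediate = [pair for doc in documents for pair in map_phase(doc)]
print(reduce_phase(intermediate))   # {'the': 2, 'cloud': 2, ...}
```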

Aneka is built on the Microsoft .NET framework and provides tools and APIs for developing .NET
applications. It can be deployed on various cloud platforms, including public clouds like Amazon EC2 and
Microsoft Azure, as well as private clouds. This flexibility allows developers to build hybrid applications
with minimal effort.
Popular cloud services:

Cloud computing has become a big deal over the past decade, with major players like Amazon, Microsoft,
and Google leading the charge. They offer various cloud services catering to businesses and individuals
alike.
1. Amazon Web Services (AWS): Amazon is a big player in cloud services, especially in
Infrastructure-as-a-Service (IaaS). Their suite of services, AWS, covers everything from computing
to storage to databases.
o Elastic Compute Cloud (EC2): This allows users to create virtual servers in the cloud.
o Simple Storage Service (S3) and Elastic Block Store (EBS): Used for data storage in the
cloud.
o Elastic Beanstalk: A platform for developing and deploying applications.
o Relational Database Service (RDS), DynamoDB, and SimpleDB: For managing databases.
o CloudFront: A content delivery service (CDN).
2. Amazon Elastic Compute Cloud (EC2): This is Amazon's main cloud computing platform, where
users can create and manage virtual servers with different configurations.
o EC2 uses different instance types like general purpose, compute optimized, memory
optimized, storage optimized, and GPU instances.
o Users can choose the instance type based on their application needs, like balancing between
processor, memory, storage, and network resources.
3. Amazon Storage Systems:
o Amazon Simple Storage Service (Amazon S3): A scalable and persistent storage system for
durable data storage.
o Amazon Elastic Block Store (Amazon EBS): Persistent storage designed for use with
Amazon EC2 instances.
o S3 is good for storing data independently from other services, while EBS is specifically for
use with EC2 instances.
4. AWS Elastic Beanstalk: This is Amazon's Platform-as-a-Service (PaaS) offering, allowing users to
develop and deploy applications seamlessly.
5. Database Services of AWS:
o Amazon Relational Database Service (Amazon RDS): A managed relational database
service supporting various database engines like MySQL, PostgreSQL, Oracle, and Microsoft
SQL Server.
o SimpleDB and DynamoDB: Fully managed NoSQL database services catering to different
workload needs.
6. Amazon CDN Service: CloudFront: A content delivery network service for distributing content to
end-users with low latency.
7. Amazon Message Queuing Service: SQS: A fully managed message queuing service for storing
messages as they travel between computing nodes.
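As a hedged sketch of how EC2 instances are created programmatically, the snippet below uses boto3 to launch a single small instance. The AMI ID is a placeholder, and credentials and permissions are assumed to be configured outside the code.

```python
import boto3

ec2 = boto3.resource("ec2", region_name="us-east-1")

# ImageId below is a placeholder; a real call needs a valid AMI for the region.
instances = ec2.create_instances(
    ImageId="ami-0123456789abcdef0",
    InstanceType="t2.micro",       # a general purpose instance type
    MinCount=1,
    MaxCount=1,
)
print("Launched:", instances[0].id)
```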

These services make it easier for businesses and individuals to access computing resources, store data,
deploy applications, and manage databases in the cloud.

Microsoft Azure is Microsoft's cloud service offering, encompassing various components like Azure Virtual
Machine, Azure Platform, Azure Storage, Azure Database Services, and Azure Content Delivery Network
(CDN). Here's a simplified breakdown:

1. Azure Virtual Machine: Allows users to create scalable virtual machines on-demand. It runs on the
Azure operating system with a layer called the 'Fabric Controller' managing computing and storage
resources. Users can choose between Windows and Linux operating systems.
2. Azure Platform: Combines Infrastructure-as-a-Service (IaaS) and Platform-as-a-Service (PaaS)
offerings. It provides application servers, storage, networking, and computing infrastructure.
Developers can use the cloud-enabled .NET Framework for Windows-based application
development.
3. Azure Storage: Provides scalable, durable, and highly-available storage for Azure Virtual Machines.
It includes services like Blob storage, Table storage, Queue storage, and File storage. Blob storage is
used for storing large volumes of unstructured data.
4. Azure Database Services: Offers various database services like Azure SQL Database (a managed
relational database service), DocumentDB (a NoSQL document database service), and HDInsight (an
Apache Hadoop solution for Big Data processing).
5. Azure Content Delivery Network (CDN): Enables easy delivery of high-bandwidth content hosted
in Azure by caching blobs and static content at strategically-placed locations for high-performance
content delivery.
6. Microsoft's SaaS Offerings: Includes services like Office 365, OneDrive, and SharePoint Online.
Office 365 is a cloud version of the traditional Microsoft Office suite with additional services like
Outlook, Skype, and OneDrive. OneDrive offers free online storage, while SharePoint Online
facilitates secure collaboration.
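To show how Blob storage is typically used from code, here is a small sketch with the azure-storage-blob Python SDK. The connection string and container name are placeholders, and the container is assumed to already exist.

```python
from azure.storage.blob import BlobServiceClient

# Placeholder connection string; in practice this comes from the storage account.
service = BlobServiceClient.from_connection_string("<connection-string>")

# Blobs live inside containers within a storage account.
blob = service.get_blob_client(container="notes", blob="unit4/summary.txt")
blob.upload_blob(b"unstructured data stored as a blob", overwrite=True)

print(blob.download_blob().readall())
```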

In summary, Microsoft Azure provides a comprehensive range of cloud services, including virtual machines,
platform services, storage solutions, database services, content delivery network, and software-as-a-service
offerings like Office 365, OneDrive, and SharePoint Online.

Google Cloud, like other major cloud providers, offers a wide range of services across different categories
like Infrastructure-as-a-Service (IaaS), Platform-as-a-Service (PaaS), and Software-as-a-Service (SaaS).
Let's break down the key offerings:

1. Google's IaaS Offerings:


o Google Compute Engine (GCE): Allows developers to run large-scale distributed
applications on virtual machines hosted on Google's infrastructure. It supports various Linux
distributions and offers both pre-defined and custom-configured machine types.
o Google Cloud Storage: Provides scalable storage accessible via simple APIs. Data is stored
as objects in buckets, and each object consists of data and metadata.
2. Google's PaaS Offering:
o Google App Engine (GAE): A platform for developing and deploying web applications
without worrying about infrastructure management. It supports high-level programming
languages like Java, Python, and PHP, with built-in scalability and auto load balancing
features.
3. Database Services:
o Cloud SQL: A fully-managed relational database service based on MySQL. It offers
scalability and integrates well with Google App Engine.
o Cloud Datastore: A schema-less NoSQL data store for storing non-relational data. It
supports SQL-like queries and provides ACID transactions.
o BigQuery: A tool for analyzing massive datasets in real-time using SQL-like queries. It's not
a database but allows quick exploration and examination of Big Data in the cloud.
4. Google's SaaS Offerings:
o G Suite (formerly Google Apps): Offers various applications for communication,
collaboration, and productivity, including Gmail, Google Docs, Google Drive, Google
Calendar, and more. These applications are accessible via web browsers and are popular for
personal and enterprise use.
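A comparable sketch for Google Cloud Storage, using the google-cloud-storage client library. The bucket name is a placeholder, and application default credentials are assumed to be available.

```python
from google.cloud import storage

client = storage.Client()                       # uses application default credentials
bucket = client.bucket("example-notes-bucket")  # placeholder bucket name

# Objects consist of data plus metadata and live inside buckets.
blob = bucket.blob("unit4/summary.txt")
blob.upload_from_string("object data stored in Cloud Storage")

print(blob.download_as_text())
```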

In summary, Google Cloud provides a comprehensive suite of cloud services, ranging from infrastructure
and platform solutions to productivity and collaboration tools, catering to the needs of developers,
businesses, and individual users alike.

Here is a concise comparison of the three providers:

| Aspect | AWS | Azure | Google Cloud |
| --- | --- | --- | --- |
| Vendor | Amazon | Microsoft | Google |
| Hypervisor | Xen | Microsoft Azure | KVM |
| Load Balancing | Available | Available | Available |
| Auto-scaling | Available | Available | Available |
| Provisioning | Instant | Instant | Instant |
| Virtual Machines (VM) | EC2 | Azure VM | GCE |
| Subscription Options | On-demand, Reserved, Spot | On-demand, Reserved | On-demand |
| OS Options | Windows, Linux | Windows, Linux | Windows, Linux |
| Billing Plan | Per-hour | Per-minute | Per-minute |
| Application Platform | Elastic Beanstalk | Azure | App Engine |
| Runtimes | .NET, Java, PHP, Python, Ruby, Node.js, Go, Docker | .NET, Java, PHP, Python, Ruby, Node.js, Docker | Java, PHP, Python, Ruby, Node.js, Go, .NET |
| Storage Service | S3, EBS | Azure Storage | Cloud Storage |
| Database Service | RDS | Azure SQL Database | Cloud SQL |
| Big Data Solution | DynamoDB, SimpleDB | DocumentDB | Cloud Datastore |
| Analytics Tool | EMR | HDInsight | BigQuery |
| CDN Offering | CloudFront | Azure CDN | None |
| Identity Management | IAM | Azure AD B2C | Cloud IAM |

This table provides a quick overview of the main features and offerings of AWS, Azure, and Google Cloud.
UNIT-5

Enterprise Architecture: Role and Evolution


Evolution of Technology in Enterprises

Technological Evolution:

• Mainframes to Client-Server Systems: Transitioned to more flexible systems.


• Internet to Cloud Computing: Shifted to more scalable and accessible platforms.
• Adaptation: Enterprises balance stability of old systems with new business needs and technologies.

The Role of Enterprise Architecture (EA)

Enterprise Architecture (EA):

• Purpose: Manage the complexities of changing technical environments.


• Key Roles:
1. Maintaining Application Descriptions: Inventory of all software applications, their
functions, technical details, and interactions.
2. Defining and Enforcing Standards: Set and enforce technical standards for selecting new
technologies and platforms.

Managing Enterprise Data and Processes

Information Systems:

• Track Data and Manage Processes: Essential for enterprise operations.


• EA Perspective:
o Business Processes: Identify and name each process (e.g., 'prospect to order', 'order to cash').
o Process Classification:
▪ Vertical Processes: Within a single department (e.g., sales, accounting).
▪ Horizontal Processes: Span multiple departments (e.g., 'order to cash').

Enterprise Components

Component-Based Structure:

• Business Components: High-level processes grouped by function.


• Application Components: Smaller processes forming cohesive data sets (e.g., 'customer data',
'payments data').
• Software Components: Split into:
o Entity Components: Handle data.
o Process Components: Handle business logic and workflows.

Application Integration and Service-Oriented Architecture (SOA)

Integration Needs:

• Processes: Spanning multiple applications.


• Data Access: Across different applications.
• Resolving Overlaps: In functionalities of packaged solutions.
• Unified Data View: Across the enterprise.

Integration Levels:
1. Data Level: Direct data transfer.
2. API Level: Applications publish API libraries.
3. Service Level: Applications publish services (e.g., web services).
4. User Interface Level: Common user interfaces using mashup APIs.
5. Workflow Level: Tasks in one application trigger tasks in others.

EAI (Enterprise Application Integration) and SOA Strategies:

• For Legacy Systems:


o Use data-level or service methods.
• For Unified Systems:
o Develop common data models and unified user interfaces.
• Gradual Replacement:
o Replace legacy systems without impacting users.

Overall Goal:

• EA provides a roadmap for continuous IT adaptation, balancing current needs with future goals.

Enterprise Technical Architecture: Deployment Options

In-House vs. Cloud: When deciding where to deploy enterprise applications, consider these factors:

1. Uniformity vs. Best of Breed:


o Uniformity: Standardizing on a single technology stack (like Java or Microsoft) reduces
costs and simplifies maintenance.
o Best of Breed: Modern systems allow different technologies to work together, making it
possible to use various cloud providers alongside internal systems.
2. Network and Data Security:
o Data Security: Cloud data centers are secure, but businesses need to consider data location
regulations and transfer security.
o Network Security: Internal applications may need enhanced security for cloud deployment.
Features like Amazon's virtual private cloud can help.
3. Implementation and Quick Wins:
o Implementation: Adopting new technologies requires considering skills, tools, and business
impact to minimize costs and ensure longevity.
o Quick Wins: New technologies often start in less critical areas (like user interfaces) to
improve experience without affecting core systems.

Managing Data Center Complexity

Challenges in Large Enterprises:

• Complexity: Extensive IT infrastructures with many applications and servers are costly and difficult
to manage.
• Server Sprawl: Using many underutilized servers is inefficient. Virtualization allows multiple
applications to share servers, improving efficiency.

Automation:

• Cloud platforms offer automation (e.g., server provisioning), reducing manual efforts and costs.

Benefits of Cloud Computing:

• Cost Savings: Virtualization and automation in the cloud lead to significant savings.
• Efficiency: Shared resources and automated management enhance IT efficiency.

Future chapters will explore cloud platforms and their technologies in detail.

Enterprise software: ERP, SCM, CRM


13.1 Anatomy of a Large Enterprise

What a Manufacturing Corporation Does:

• Plans products: what, when, and how much to build.


• Executes sales, marketing, manufacturing, distribution, and customer support.
• Tracks costs, revenues, human resources, and material assets.

Core Data Model

• Customers are reached through marketing and sales.


• Orders are placed by customers.
• Products are delivered from inventory.
• Manufacturing involves executing processes and tracking costs.
• Billing customers and providing after-sales service.
• Accounting manages revenue, costs, and profits.
• Human Resources (HR) manages employees.
• Assets and Energy are tracked.
• Enterprise Applications like ERP, SCM, and CRM manage these functions.

13.2 Partners: People and Organizations

Partner Model
• Partners can be organizations or individuals.
• Roles: A partner can have multiple roles (customer, supplier, employee).
• Communications: Track interactions with partners through different mechanisms.

13.3 Products

Product Model

• Products: Goods or services produced/consumed by the organization.


• Classifications: Products are classified by categories (e.g., model, grade).
• Features and Pricing: Products have features affecting their price.
• Inventory: Physical goods stored in facilities.
• Suppliers: Track which suppliers provide which products.
• Reorder Guidelines: Policies on inventory maintenance.

13.4 Orders: Sales and Purchases

Order Model

• Sales Orders: Accepted from customers.


• Purchase Orders: Placed for required products.
• Order Items: Each order consists of items for products or features.
• Shipments: Order items fulfilled by shipment items.
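The order model can be pictured as a small set of linked record types. The dataclass sketch below is one possible rendering of it (field names are illustrative, not the book's exact schema): an order owns order items, and shipment items fulfil those items.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class OrderItem:
    product: str
    quantity: int
    shipped_quantity: int = 0      # filled in as shipment items fulfil this item

@dataclass
class Order:
    order_id: int
    kind: str                      # "sales" (from a customer) or "purchase" (to a supplier)
    partner: str                   # the customer or supplier on the order
    items: List[OrderItem] = field(default_factory=list)

order = Order(1001, "sales", "Acme Retail",
              items=[OrderItem("widget", 10), OrderItem("gadget", 2)])
order.items[0].shipped_quantity = 10   # a shipment item fulfils the first order item
```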

Quote Model

• Requests: Received from customers or placed with suppliers.


• Quotes: Responses to requests detailing products and prices.
• Agreements: Contracts based on quotes, leading to orders.

13.5 Execution: Tracking Work

Work Model

• Work Orders: Created from sales orders or internal requirements.


• Efforts: Track planned and actual efforts for work orders.
• Party Assignments: Efforts assigned to employees or contractors.
• Rates: Cost or charge rates for efforts.
• Timesheets: Record time spent on efforts.

This simplified version highlights the key components and relationships in enterprise applications without
the detailed technical specifications.

13.6 Billing

Billing Process:

• Customers: Billed after delivery, payments received.
• Suppliers: Payments disbursed for purchases, traced to invoices and billable goods/services.
• Invoices: Can be sales or purchase invoices.
• Invoice Components:
o Invoice items capture details like amount, description, quantity.
o Linked to goods/services delivery (billed items: shipment items, effort/timesheet entries,
order items).

Billing Model (Figure 13.10):

• Classes: Invoice, Invoice Item, Product, Product Feature, Partner.
• Relationships:
o Invoice to Invoice Item (one-to-many).
o Invoice Item to Product/Product Feature (optional).
o Invoice Item to Billed Item (one-to-many).
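
To make the one-to-many relationships concrete, here is a minimal, hypothetical Python sketch (the
class names simply mirror the model above; it is not the book's implementation) in which an invoice
aggregates invoice items, each optionally linked to a product and to the billed items it covers:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Product:
    name: str

@dataclass
class BilledItem:
    """A shipment item, effort/timesheet entry, or order item being billed."""
    description: str
    amount: float

@dataclass
class InvoiceItem:
    description: str
    quantity: int
    unit_price: float
    product: Optional[Product] = None                              # optional link to a product/feature
    billed_items: List[BilledItem] = field(default_factory=list)   # invoice item -> billed items

    def amount(self) -> float:
        return self.quantity * self.unit_price

@dataclass
class Invoice:
    partner: str                                                   # customer or supplier being billed
    items: List[InvoiceItem] = field(default_factory=list)         # invoice -> invoice items

    def total(self) -> float:
        return sum(item.amount() for item in self.items)

inv = Invoice(partner="Acme Corp",
              items=[InvoiceItem("Widget", 10, 2.5, product=Product("Widget"))])
print(inv.total())   # 25.0
```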

13.7 Accounting

Accounting:

• Financial Position: Affected by payments, unpaid invoices, inventory.
• Chart of Accounts: Set of GL (general ledger) accounts, each with a GL-ID, name, description,
account type (revenue, expense, asset, liability).
• Budget Entities: Like manufacturing plants, linked to GL accounts.

Accounting Model (Figure 13.12):

• Classes: GL Account, Budget Entity, Accounting Transaction, Transaction Detail.
• Relationships:
o GL Account to Budget Entity (many-to-many).
o Accounting Transaction to GL Account (one-to-many).
o Transaction Detail to Accounting Transaction (many-to-one).

13.8 Enterprise Processes, Build vs. Buy, and SaaS

Enterprise Processes:

• End-to-End Processes: Critical for efficiency, like ‘order to cash’.
• Unified Data Model: Ideal but complicated with packaged software.

Integration Challenges:

• Multiple Packages: SCM, ERP, and CRM systems that must be kept synchronized.
• Domain-Specific Functionality: Financial services with complex partner models not covered by
standard CRM.

Build vs. Buy:

• Integration vs. Development Costs: Weighing integration challenges against high development
costs.

SaaS Considerations:

• Data Security and Volume: Key factors when deciding whether a function can move to SaaS.
• CRM: Suitable for SaaS due to low data replication needs.
• Order Management: Suitable, especially for web-based orders.
• Human Resources: Suitable, can handle payroll and bank transactions.
• SCM and Core ERP: Less suitable due to high data volume and physical asset linkage.

This simplified version covers the essential points about billing, accounting, enterprise processes, and the
considerations for building vs. buying software and using SaaS.

Custom enterprise applications and Dev 2.0

Building Enterprise Systems: Simplified

In the previous chapter, we learned that enterprises have various information needs, which can be managed
using CRM, SCM, or core ERP systems. These systems can be either pre-packaged or custom-built,
depending on specific requirements.

14.1 Software Architecture for Enterprise Components

Enterprise applications are made up of different parts called application components. Each component
manages specific tasks like storing data, interacting with users, and executing business logic. These
components are organized into layers, following a design pattern called 'model-view-controller' (MVC). For
example, in a typical application server architecture, the presentation layer deals with user interfaces, the
business logic layer executes computations, and the data access layer interacts with the database.

14.2 User Interface Patterns and Basic Transactions

14.2.1 Layered MVC and the AJAX Paradigm

In transaction-processing applications, users interact with the system by viewing, entering, and modifying
data. For instance, when entering a new order, data validation and submission happen through layers like
HTML pages and server-side scripts. Traditional web architectures retrieve pages through server requests,
while newer paradigms like AJAX allow for more efficient page updates without full page reloads. These
architectures use controllers to manage navigation and data flow between different layers.

14.2.2 Common UI Patterns

Common patterns for user interfaces include search, results, editing, and creating pages. For example, an
order entry system may have screens for searching orders, viewing results, and editing order details. These
interfaces can be formalized using defined rules and patterns, making it easier to design and develop similar
interfaces for other applications.

14.2.3 Formal Models and Frameworks

Formal models help define user interface behavior, making it easier to reuse patterns and frameworks across
different applications. While technology evolves rapidly, formal models provide a stable foundation for
adapting to new technologies and patterns. However, as new technologies emerge, these models may need to
be updated to accommodate new patterns and features.

In summary, understanding software architecture principles and common UI patterns is crucial for building
scalable enterprise applications, whether using pre-packaged solutions or custom development.

Business Logic:

• Business logic refers to the rules and processes that govern how data is handled within an
application, apart from the user interface.
• It includes tasks like data validation, computation, transaction management, data manipulation, and
calling other methods.
• For example, when a user submits an order form, the business logic validates the order, computes the
total value, manages the transaction, and inserts the order data into the database.
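
For example, the order-submission logic described above might look like the following small sketch
(hypothetical names, with the database reduced to an in-memory list for illustration):

```python
orders_table = []          # stand-in for the data access layer / database

def submit_order(order):
    """Business logic for a new order: validate, compute the total, then persist."""
    # Validation
    if not order.get("customer_id"):
        raise ValueError("order must reference a customer")
    if not order.get("items"):
        raise ValueError("order must contain at least one item")
    # Computation
    order["total"] = sum(item["qty"] * item["unit_price"] for item in order["items"])
    # Data manipulation: insert the validated order
    orders_table.append(order)
    return order["total"]

print(submit_order({"customer_id": 7,
                    "items": [{"qty": 2, "unit_price": 9.5},
                              {"qty": 1, "unit_price": 30.0}]}))   # 49.0
```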

Rule-based Computing:

• Rule-based computing uses formal logic to define rules that determine whether certain conditions are
true or false.
• These rules are useful for validation and other computations.
• Rule engines evaluate these rules against facts (data) to determine their truth value.
• Rule engines can be backward-chaining (evaluating specific predicates) or forward-chaining
(determining all true predicates).
• Rule-based systems allow for dynamic addition and modification of rules at runtime.
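
As a hedged illustration, not any particular product's rule engine, the sketch below shows a tiny
forward-chaining evaluator: each rule is a condition/action pair over a dictionary of facts, and new
rules can be added at runtime:

```python
# Minimal forward-chaining rule engine: each rule is (name, condition, action).
# Conditions and actions are plain functions over a mutable dict of facts.

def run_rules(facts, rules):
    changed = True
    while changed:                      # keep firing until no rule adds new facts
        changed = False
        for name, condition, action in rules:
            if condition(facts) and not facts.get(f"_fired_{name}"):
                action(facts)
                facts[f"_fired_{name}"] = True
                changed = True
    return facts

rules = [
    ("large_order",
     lambda f: f.get("order_value", 0) > 10000,
     lambda f: f.update(needs_approval=True)),
    ("valid_order",
     lambda f: f.get("customer_id") is not None and f.get("order_value", 0) > 0,
     lambda f: f.update(valid=True)),
]

# Rules can be appended or changed at runtime, e.g. a new discount policy:
rules.append(("discount",
              lambda f: f.get("valid") and f.get("order_value", 0) > 5000,
              lambda f: f.update(discount=0.05)))

print(run_rules({"customer_id": 42, "order_value": 12000}, rules))
```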

Modeling Business Logic using MapReduce:

• Some platforms use a visual approach called Logic Map, based on the MapReduce cloud
programming paradigm.
• Logic Maps represent business logic functions like create, search, and update as nodes in a graph.
• These nodes manipulate relational records and perform computations without a user interface.
• Logic Maps allow for parallel execution of tasks in cloud environments and handle complex business
logic functions efficiently.

Model Driven Interpreters (MDI) in Dev 2.0:

1. MDA vs. MDI:

• MDA (Model Driven Architecture) transforms models into code using code generators.
• MDI (Model Driven Interpreter) directly interprets higher-level abstractions at runtime, without
generating code.
• In MDI, the application functionality is represented by a meta-model, which is interpreted directly.

2. Application Modeling and Maintenance:

• MDA typically involves visual modeling or higher-level language specifications translated into code.
• MDI platforms use a WYSIWYG approach, with changes reflected instantly in the running platform.
• This makes Dev 2.0 platforms 'always on', eliminating compile and build cycles and allowing
immediate testing and use of incremental changes.

3. Multi-tenant Dev 2.0: Application Virtualization:

• In MDI platforms, application specifications are stored in a model repository.
• A single Dev 2.0 instance can interpret multiple application models simultaneously, making it a
multi-tenant platform.
• Different applications can use the same platform instance without conflicts by using separate
schemas for each application's database.
• Similar to hardware virtualization enabling multiple operating systems to run on a single machine,
application virtualization in Dev 2.0 allows multiple applications to run on a single platform
instance.
• Application virtualization drives productivity improvements by adapting to rapidly changing
application functionality and needs, akin to how hardware virtualization optimizes costs in response
to changing hardware requirements.
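
To make the interpreter idea concrete, the following toy sketch (purely illustrative, not an actual
Dev 2.0 product) treats an application "model" as plain data describing an entity's fields; one
generic interpreter validates and stores records for several tenant applications against their own
models:

```python
# A toy model-driven interpreter: application behaviour is driven by a
# meta-model (plain data), not by generated code.

MODELS = {
    "crm_app": {"entity": "Customer", "fields": {"name": str, "credit_limit": float}},
    "hr_app":  {"entity": "Employee", "fields": {"name": str, "grade": int}},
}

STORE = {app: [] for app in MODELS}            # one "schema" per tenant application

def create_record(app, record):
    model = MODELS[app]                        # interpret the meta-model at runtime
    for field_name, field_type in model["fields"].items():
        if not isinstance(record.get(field_name), field_type):
            raise ValueError(f"{model['entity']}.{field_name} must be {field_type.__name__}")
    STORE[app].append(record)
    return record

# Two different applications served by the same interpreter instance:
create_record("crm_app", {"name": "Acme", "credit_limit": 50000.0})
create_record("hr_app",  {"name": "Asha", "grade": 3})
print(STORE)
```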

SECURITY, ERROR HANDLING, TRANSACTIONS AND WORKFLOW


1. Application Security:

• It's about controlling who can access what in an application.
• Methods like the Kerberos protocol and the JAAS (Java Authentication and Authorization Service) library are used for secure authentication.
• After logging in securely, users are given access to specific functions and data.
• Access control is often implemented at both the application and server levels.
• Application security involves managing user roles, permissions, and access to functions and data.
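
A minimal sketch of role-based access control in Python is shown below (the roles, permissions, and
functions are assumed for illustration; this is not the Kerberos or JAAS API): after authentication,
the user's role determines which functions may be called:

```python
from functools import wraps

# role -> permitted functions (an assumed, illustrative permission table)
PERMISSIONS = {
    "clerk":   {"view_order"},
    "manager": {"view_order", "approve_order"},
}

class AccessDenied(Exception):
    pass

def requires(permission):
    """Decorator enforcing that the calling user's role grants `permission`."""
    def decorator(func):
        @wraps(func)
        def wrapper(user, *args, **kwargs):
            if permission not in PERMISSIONS.get(user["role"], set()):
                raise AccessDenied(f"{user['name']} may not {permission}")
            return func(user, *args, **kwargs)
        return wrapper
    return decorator

@requires("approve_order")
def approve_order(user, order_id):
    return f"order {order_id} approved by {user['name']}"

print(approve_order({"name": "Asha", "role": "manager"}, 17))
# approve_order({"name": "Ravi", "role": "clerk"}, 17) would raise AccessDenied
```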

2. Error Handling:

• It's about how an application deals with errors and communicates them to users.
• Different types of errors need different handling, from technical errors to business validation failures.
• Error management frameworks ensure errors are reported uniformly and appropriately.
• Errors can be shown immediately or bundled until it's appropriate to inform the user.
• Error handling is crucial for user experience and involves multiple layers of the application
architecture.

3. Transaction Management:

• It's about ensuring data integrity and concurrency control during user interactions.
• Each user interaction should be atomic, especially in multi-user scenarios.
• Optimistic concurrency control is commonly used, where each record has a version number.
• If data changes between read and write attempts, a version number mismatch triggers a transaction
failure.
• Transaction management ensures data consistency and is essential for multi-user applications.
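
A minimal sketch of optimistic concurrency control, assuming each record carries a version number
that is checked and incremented on every write (illustrative Python, not a specific database API):

```python
class ConcurrentUpdateError(Exception):
    pass

# In-memory stand-in for a database table: id -> (data, version)
table = {1: ({"status": "NEW"}, 1)}

def read(record_id):
    data, version = table[record_id]
    return dict(data), version          # the caller remembers the version it read

def write(record_id, new_data, expected_version):
    _, current_version = table[record_id]
    if current_version != expected_version:      # someone else wrote in between
        raise ConcurrentUpdateError("record modified by another transaction")
    table[record_id] = (new_data, current_version + 1)

data, v = read(1)                 # user A reads version 1
data["status"] = "APPROVED"
write(1, data, v)                 # succeeds, version becomes 2

stale, v_old = {"status": "CANCELLED"}, 1
try:
    write(1, stale, v_old)        # user B still holds version 1 -> the write fails
except ConcurrentUpdateError as e:
    print("transaction rolled back:", e)
```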

All these aspects are considered cross-cutting concerns, meaning they involve multiple layers of the
application architecture. Aspect-oriented programming (AOP) is a technique used to manage such concerns
efficiently.

Workflow and business processes:

15.1 Implementing Workflow in an Application: In a simple leave approval process, employees request
leave, which goes to their supervisor for approval. If approved, it's recorded by HR. Employees are notified
of the outcome. The process involves forms for requesting leave, approving it, and viewing the result.

15.2 Workflow Meta-Model Using ECA Rules: The process involves activities like applying for leave,
approving it, and viewing results. These activities are performed by different actors like employees,
supervisors, and HR managers. We can model this using Event-Condition-Action (ECA) rules.

15.3 ECA Workflow Engine: A simple workflow engine can be created using ECA rules. It updates
activity lists and sends notifications based on changes in the process. Users can view their pending tasks
through a worklist.
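
A hedged sketch of such an engine in Python: each rule states that when an event occurs and its
condition holds, an action fires, for example adding a task to someone's worklist (the leave-approval
names below are illustrative):

```python
# Tiny ECA (event-condition-action) engine for a leave-approval flow.
worklists = {"supervisor": [], "hr": [], "employee": []}

RULES = [
    # (event, condition, action)
    ("leave_requested",
     lambda req: True,
     lambda req: worklists["supervisor"].append(("approve_leave", req))),
    ("leave_approved",
     lambda req: req["approved"],
     lambda req: worklists["hr"].append(("record_leave", req))),
    ("leave_approved",
     lambda req: True,
     lambda req: worklists["employee"].append(("view_result", req))),
]

def raise_event(event, payload):
    """Evaluate every rule registered for this event and fire the matching actions."""
    for ev, condition, action in RULES:
        if ev == event and condition(payload):
            action(payload)

raise_event("leave_requested", {"employee": "Asha", "days": 3})
raise_event("leave_approved",  {"employee": "Asha", "days": 3, "approved": True})
print(worklists)
```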

15.4 Using an External Workflow Engine: External workflow engines are necessary when transactions
from multiple applications need to be managed. They handle flow updates, notifications, worklists, and
navigation between applications.

15.5 Process Modeling and BPMN: BPMN (Business Process Modeling Notation) is a graphical way to
represent processes. It includes activities, gateways, transactions, and exceptions to model complex
processes more intuitively.

15.6 Workflow in the Cloud: Workflow services are ideal for cloud deployment, as they enable integration
between different applications deployed on-premise or in the cloud. More cloud-based workflow solutions
are likely to emerge in the future.

Enterprise analytics and search:

An in-depth exploration of enterprise search and analytics, highlighting the importance of leveraging data to
uncover hidden patterns and drive business decisions. Let's break down some key points:

1. Motivations for Enterprise Search and Analytics:
o The text identifies several motivations for maintaining data about an enterprise's operations,
including segmenting customers, targeting marketing campaigns effectively, detecting
anomalies, identifying problems and opinions, and assessing overall situations and trends.
o It emphasizes the importance of using analytical applications to support knowledge discovery
tasks, especially in the context of increasing amounts of unstructured data generated and
maintained by large enterprises.
2. Enterprise Knowledge Discovery Tasks:
o These motivations translate into concrete knowledge discovery tasks, such as customer
segmentation, targeted advertising, anomaly detection, and opinion and trend analysis.
o Each task requires specific techniques and approaches tailored to the enterprise's data and
objectives.
3. Business Intelligence (BI) and OLAP:
o Business intelligence involves aggregating, slicing, and dicing data using online analytical
processing (OLAP) tools to view complex summaries of large volumes of data.
o OLAP is popular for human-assisted segmentation of data and optimizing business strategies.
o OLAP typically operates on a star schema, a popular data model for capturing
multidimensional information, especially where dimensions are hierarchical.
4. Data Warehousing and OLAP on a Star Schema:
o Data warehousing involves transforming operational data into a form suitable for reporting
and analysis.
o OLAP tasks are performed on a star schema, which consists of a central fact table surrounded
by associated dimension tables.
o OLAP queries on a star schema are typically expressed using SQL or MDX
(Multidimensional Expressions).
5. OLAP Using MapReduce:
o The text discusses how OLAP queries can be efficiently executed on a cloud database using
parallel computing with the MapReduce paradigm.
o It describes the process of parallel execution of OLAP queries, involving mapping subsets of
data records to processors and reducing intermediate results to compute final aggregates.
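
As a hedged, single-machine illustration of this idea (plain Python, no cluster framework), mappers
emit (dimension, measure) pairs from their share of a toy fact table and the reduce step sums the
measure per dimension value:

```python
from collections import defaultdict

# Toy fact table: (region, product, sales_amount)
facts = [("EU", "widget", 120.0), ("US", "widget", 200.0),
         ("EU", "gadget",  80.0), ("US", "widget",  50.0)]

def map_phase(records):
    """Each mapper works on a subset of fact records and emits (key, value) pairs."""
    for region, _product, amount in records:
        yield region, amount

def reduce_phase(pairs):
    """Reducers sum the measure for each dimension value."""
    totals = defaultdict(float)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

# Simulate two mappers working on halves of the data, then a single reduce step.
intermediate = list(map_phase(facts[:2])) + list(map_phase(facts[2:]))
print(reduce_phase(intermediate))   # {'EU': 200.0, 'US': 250.0}
```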

Overall, the text provides a comprehensive overview of enterprise search and analytics, covering
motivations, knowledge discovery tasks, business intelligence, data warehousing, OLAP, and parallel OLAP
using MapReduce. It emphasizes the importance of leveraging data to drive business decisions and optimize
operational processes.

TEXT AND DATA MINING:

Data Classification:

• What it is: Sorting data into different groups based on certain characteristics.
• How it's done: We represent our data in a matrix. Each row represents a data point, and each column
represents a feature. We use techniques like the Singular Value Decomposition (SVD) to simplify
the data and find patterns.
• Example: Imagine we have documents categorized into physics (P), computing (C), and biology (B).
We can use SVD to analyze these documents and classify new ones based on their content.
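
A minimal NumPy sketch of this idea, assuming a tiny made-up term-document matrix rather than a real
corpus: the SVD projects the labelled documents into a low-dimensional concept space, and a new
document is assigned the label of its nearest neighbour there:

```python
import numpy as np

# Rows = documents, columns = term counts (toy data for P = physics, C = computing).
A = np.array([[3, 0, 1, 0],    # P
              [2, 1, 0, 0],    # P
              [0, 3, 0, 2],    # C
              [0, 2, 1, 3]],   # C
             dtype=float)
labels = ["P", "P", "C", "C"]

U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
Vk = Vt[:k].T                 # term -> concept projection
docs_k = A @ Vk               # documents in the reduced k-dimensional space (= U_k S_k)

def classify(new_doc):
    """Project a new term-count vector into the concept space and pick the nearest labelled doc."""
    q = new_doc @ Vk
    dists = np.linalg.norm(docs_k - q, axis=1)
    return labels[int(np.argmin(dists))]

print(classify(np.array([1.0, 0.0, 2.0, 0.0])))   # expected: "P" (nearest to the physics docs)
```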

Computing the SVD using MapReduce:

• What it is: A method to compute SVD, a way to simplify large data sets.
• How it works: We divide our data into smaller parts and process them in parallel, using a technique
called MapReduce. This helps speed up the computation, especially for big data sets.

Clustering Data:

• What it is: Grouping data points together based on similarities.
• How it's done: Similar to classification, we use SVD to simplify the data and find clusters. This
helps us see which data points are similar to each other.
• Example: We can use SVD to represent our data in a lower-dimensional space, making it easier to
see clusters.

Anomaly Detection:
• What it is: Identifying unusual data points that don't fit with the rest.
• How it's done: We look for data points that are not well-represented by the main patterns in the data.
These outliers could indicate something unusual or suspicious.
• Example: Using SVD, we can identify data points that don't fit well with the main patterns in the
data, potentially highlighting anomalies.
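
Continuing in the same spirit, a hedged sketch of outlier scoring: reconstruct each data point from
the top-k singular vectors and flag points whose reconstruction error is unusually large (the data
and the cutoff below are purely illustrative):

```python
import numpy as np

# Toy data: most points lie close to a one-dimensional pattern; one does not.
X = np.array([[1.0, 1.0], [2.0, 2.1], [3.0, 2.9],
              [4.0, 4.2], [5.0, 4.8], [1.0, 4.0]])   # last row is the odd one out

U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 1
X_hat = U[:, :k] @ np.diag(s[:k]) @ Vt[:k]     # rank-k reconstruction of every point

errors = np.linalg.norm(X - X_hat, axis=1)     # distance of each point from the main pattern
threshold = errors.mean() + 1.5 * errors.std() # simple illustrative cutoff
print("reconstruction errors:", np.round(errors, 2))
print("anomalous rows:", np.where(errors > threshold)[0])   # expected: row 5
```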

In essence, these methods help us make sense of large amounts of data by simplifying them and finding
patterns or anomalies.

TEXT AND DATABASE SEARCH:


In the world of enterprise search, the challenge is to make finding information within a company as easy as
web searching. However, there are significant differences between web search and enterprise search:

1. Ranking Difficulty: Web search can rank results by popularity, while enterprise search must
prioritize accuracy. On the other hand, far more is known about each user within an enterprise,
which can be used to tailor results.
2. Data Structure: Web data is linked explicitly through hyperlinks, while enterprise data is often a
mix of text and structured records in databases, which aren't always linked directly.
3. Security: Enterprise data often has security restrictions, unlike public web data.
4. Data Variety: Web data is mainly text and easily located via URLs, but enterprise data is a mix of
text, structured records, and various formats.

To handle search efficiently, techniques like MapReduce and Latent Semantic Indexing (LSI) are used:

• MapReduce: It efficiently processes search queries in parallel, breaking down tasks into smaller
chunks handled by different processors.
• LSI: It uses a mathematical technique, the Singular Value Decomposition (SVD), to capture the
latent context of search terms, handling synonyms and polysemous words better than exact keyword
matching.
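
For instance, building the inverted index that such searches run against can itself be phrased in
map/reduce style. A hedged, single-machine Python sketch (no real cluster framework; the documents
are made up) has mappers emit (term, document-id) pairs and the reduce step collect postings lists:

```python
from collections import defaultdict

docs = {1: "cloud computing delivers storage over the internet",
        2: "enterprise search differs from web search",
        3: "cloud platforms offer storage and computing"}

def map_phase(doc_id, text):
    for term in set(text.split()):        # emit each distinct term once per document
        yield term, doc_id

def reduce_phase(pairs):
    index = defaultdict(list)
    for term, doc_id in pairs:
        index[term].append(doc_id)        # postings list: documents containing the term
    return index

pairs = [p for doc_id, text in docs.items() for p in map_phase(doc_id, text)]
index = reduce_phase(pairs)
print(sorted(index["cloud"]))             # [1, 3]
print(sorted(index["search"]))            # [2]
```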

For structured data, traditional SQL queries have limitations, so there's growing interest in applying search
technology to databases:

• Challenges: SQL queries may not cover all search needs, especially when dealing with multiple
databases or cloud platforms that don't support joins.
• Solutions: Techniques like indexing and parallel processing can help search structured data
efficiently, though incorporating relationships between data items without explicit joins is an
ongoing area of research.

Overall, the goal is to make searching within a company as seamless and effective as searching the web,
even with the added complexities of enterprise data.

Enterprise Cloud Computing Ecosystem

We've discussed various cloud computing technologies and their impact on enterprise software needs,
mainly focusing on major providers like Amazon, Google, and Microsoft. However, there's more to the
cloud ecosystem beyond these giants. Emerging technologies complement public clouds, enable
interoperability with private data centers, and facilitate private cloud creation within enterprises.

Categories in the Cloud Ecosystem:

1. Public Cloud Providers: These offer Infrastructure as a Service (IaaS) and Platform as a Service
(PaaS) solutions. Examples include Amazon EC2 (IaaS) and Google App Engine (PaaS). Some
traditional hosting providers are also entering the cloud space.
2. Cloud Management Platforms and Tools: Tools like RightScale and EnStratus help manage
complex cloud deployments. They offer features like infrastructure configuration, monitoring, and
load balancing.
3. Tools for Building Private Clouds: Enterprises are interested in private clouds, combining
virtualization, self-service infrastructure, and dynamic monitoring. Tools like VMware and 3tera
assist in this. Grid computing technologies are also used in high-performance computing
environments.

Examples:

• Eucalyptus (IaaS): An open-source framework for building private clouds. It mimics Amazon
EC2's infrastructure and provides insights into cloud architectures.
• AppScale (PaaS on IaaS): Mimics Google App Engine's platform on an IaaS platform. It allows
scalable deployment of GAE-like environments on platforms like EC2 or Eucalyptus.

These tools simplify cloud deployment and management, catering to diverse enterprise needs.
