
DEVOPS

Pfizer has its Data Centre (Production Centre) on Google Public Cloud in Ohio, USA. The Data Centre is built with the proper network resources (VPC, subnets, load balancers, firewalls, etc.) to deploy the Pfizer e-commerce application onto the Ohio environment. Pfizer also wants a Disaster Recovery Centre in another region (Seattle), with the Data Centre and Disaster Recovery Centre operating in Active-Passive mode. How do you build the DR environment on demand with Infrastructure as Code using Terraform? Please lay down the architecture, with all the steps, for the same.

To build a Disaster Recovery (DR) environment for Pfizer's e-commerce application using Terraform for infrastructure as
code, we need to follow a systematic approach. Here's a step-by-step guide outlining the architecture and necessary steps:
Architecture Overview:
1. Primary Data Center (Ohio, USA):
- Hosts the production environment for Pfizer's e-commerce application.
- Built on Google Public Cloud with proper network resources, VPC, subnets, load balancers, firewalls, etc.
2. Disaster Recovery Center (Seattle, USA):
- Serves as a backup location in case of failures or disasters in the primary data center.
- Configured in an Active-Passive mode, where it remains passive until the primary data center fails.
Steps to Build DR Environment with Terraform:
1. Define Infrastructure as Code:
- Write Terraform configuration files (.tf) to define resources for both primary and DR environments.
- Define VPC, subnets, firewall rules, load balancers, VM instances, storage, etc., for both regions.
2. Set Up Network Connectivity:
- Configure Cloud VPN or Dedicated Interconnect between the Ohio and Seattle regions to ensure secure communication.
- Implement proper routing tables and DNS configurations to route traffic between the primary and DR environments.
3. Replicate Data:
- Set up data replication mechanisms (e.g., database replication, file synchronization) to mirror data between the primary
and DR sites.
- Ensure data consistency and integrity across both environments.
4. Automation and Monitoring:
- Implement automation scripts or Terraform modules to automate failover processes from the primary to the DR
environment.
- Integrate monitoring tools to continuously monitor the health and performance of both environments.
- Set up alerts and notifications for any anomalies or failures.
5. Test Failover Procedures:
- Regularly conduct failover drills to test the effectiveness of the DR setup.
- Simulate various failure scenarios to ensure the DR environment can seamlessly take over in case of a disaster.
6. Compliance and Security:
- Ensure compliance with regulatory requirements and industry standards for data protection and disaster recovery.
- Implement security best practices, such as encryption, access controls, and vulnerability assessments, in both
environments.
Terraform Implementation:
Here's a simplified example of Terraform code to illustrate the creation of primary and DR environments:
# Define resources for the Primary Data Center (Ohio)
resource "google_compute_instance" "primary_instance" {
  # Configuration for the primary VM instance (name, machine type, zone, boot disk, network)
}

# Define other resources like VPC, subnets, load balancers, etc.

# Define resources for the Disaster Recovery Center (Seattle)
resource "google_compute_instance" "dr_instance" {
  # Configuration for the DR VM instance
}

# Define other resources similar to the primary, but with the DR region and settings
# Define VPN or Interconnect configuration for connectivity between the regions
# Define data replication mechanisms (e.g., Cloud Storage, database replication)
# Define automation scripts or modules for failover procedures
# Define monitoring configurations for both environments
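Because the requirement is to build the DR environment on demand, one common pattern (shown here only as a hedged sketch) is to gate the DR resources behind a Terraform variable so they are created only when needed. The variable name, machine type, image, and the referenced google_compute_subnetwork.dr_subnet below are illustrative assumptions, not part of the original design:

variable "enable_dr" {
  description = "Set to true to provision the DR environment on demand"
  type        = bool
  default     = false
}

resource "google_compute_instance" "dr_instance" {
  count        = var.enable_dr ? 1 : 0   # DR instance exists only when enable_dr = true
  name         = "dr-app-server"
  machine_type = "e2-standard-4"          # illustrative machine type
  zone         = "us-west1-a"             # stand-in zone for the Seattle-side region

  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-12"    # illustrative image
    }
  }

  network_interface {
    subnetwork = google_compute_subnetwork.dr_subnet.id   # assumed DR subnet defined elsewhere
  }
}

Running terraform apply with enable_dr = true stands up the DR environment on demand; setting it back to false tears it down when it is no longer required.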

Conclusion:
Building a Disaster Recovery environment with Terraform requires careful planning, configuration, and testing to ensure the
resilience and reliability of the application. By following the steps outlined above and leveraging Terraform's infrastructure as
code capabilities, Pfizer can achieve a robust DR setup to safeguard its e-commerce application against potential disasters
or failures.
If resources are locked into a particular configuration with preset values for CPU, memory, and networking, which does not expand as demand grows and does not shrink when there is no demand, what kind of mechanism would you use on the cloud? Explain with an example. Also, how do you control the economics of the resources on the cloud when you adopt the specified mechanism?

In a scenario where resources are locked into fixed configurations that do not automatically scale based on demand, a
common mechanism used on the cloud is "Auto Scaling Groups" coupled with "Elastic Load Balancing" and possibly
"Serverless Computing" services. Let's break down each component with an example:
Auto Scaling Groups (ASG):
Auto Scaling Groups allow you to automatically adjust the number of instances in a group based on demand. Even though
the CPU, memory, and networking configuration might be fixed for each instance, you can control the number of instances
dynamically.
Example:
Suppose Pfizer's e-commerce application experiences increased traffic during peak hours. With ASG, when the CPU
utilization exceeds a certain threshold, it triggers the auto-scaling policy to launch additional instances to handle the load.
Conversely, during low traffic periods, instances can be terminated to save costs.
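A minimal Terraform sketch of this idea, assuming an AWS deployment; the group name, sizes, and the referenced launch template and subnet variable are hypothetical, and the target-tracking policy keeps average CPU near 60%:

resource "aws_autoscaling_group" "app" {
  name                = "ecommerce-asg"          # illustrative name
  min_size            = 2
  max_size            = 10
  desired_capacity    = 2
  vpc_zone_identifier = var.private_subnet_ids   # assumed subnet list variable

  launch_template {
    id      = aws_launch_template.app.id         # assumed launch template defined elsewhere
    version = "$Latest"
  }
}

# Scale out and in automatically to keep average CPU utilization near the target
resource "aws_autoscaling_policy" "cpu_target" {
  name                   = "cpu-target-tracking"
  autoscaling_group_name = aws_autoscaling_group.app.name
  policy_type            = "TargetTrackingScaling"

  target_tracking_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ASGAverageCPUUtilization"
    }
    target_value = 60.0
  }
}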
Elastic Load Balancing (ELB):
Elastic Load Balancing distributes incoming traffic across multiple instances to ensure optimal utilization and high
availability. It works seamlessly with Auto Scaling Groups to distribute traffic to newly launched instances.
Example:
When new instances are launched due to increased demand, ELB automatically distributes incoming requests across all
instances, ensuring no single instance becomes overloaded.
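Continuing the same hypothetical Terraform/AWS sketch, an Application Load Balancer can be attached to the Auto Scaling Group so that newly launched instances automatically start receiving traffic; the names and the VPC/subnet variables are assumptions:

resource "aws_lb" "app" {
  name               = "ecommerce-alb"        # illustrative name
  load_balancer_type = "application"
  subnets            = var.public_subnet_ids  # assumed public subnet list
}

resource "aws_lb_target_group" "app" {
  name     = "ecommerce-tg"
  port     = 8080
  protocol = "HTTP"
  vpc_id   = var.vpc_id                       # assumed VPC id variable
}

# Register instances launched by the Auto Scaling Group with the target group,
# so scaled-out instances start receiving traffic automatically.
# (A listener, not shown, would forward incoming requests to this target group.)
resource "aws_autoscaling_attachment" "app" {
  autoscaling_group_name = aws_autoscaling_group.app.name
  lb_target_group_arn    = aws_lb_target_group.app.arn
}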
Serverless Computing:
Serverless computing services like AWS Lambda or Google Cloud Functions allow you to run code without provisioning or
managing servers. They automatically scale based on the number of incoming requests, providing a cost-effective solution
for sporadic workloads.
Example:
For certain components of the e-commerce application that have variable demand, such as image processing or email
notifications, you can leverage serverless functions. They automatically scale up or down based on the incoming workload,
eliminating the need to manage infrastructure manually.
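As a hedged sketch of the serverless option in the same Terraform style, a function such as the image-processing example could be declared as below; the function name, packaged zip file, and IAM role are illustrative assumptions:

resource "aws_lambda_function" "image_processing" {
  function_name = "image-processing"            # illustrative name
  runtime       = "python3.12"
  handler       = "handler.main"                # assumed handler inside the package
  filename      = "build/image_processing.zip"  # assumed pre-built deployment package
  role          = aws_iam_role.lambda_exec.arn  # assumed execution role defined elsewhere
  memory_size   = 256
  timeout       = 30
  # Lambda scales automatically with the number of incoming events; there are no instances to manage.
}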
Cost Control:
To control the economics of resources on the cloud while adopting these mechanisms, you can implement several
strategies:
1. Reserved Instances or Savings Plans: Commit to a specific instance type or usage volume in advance to receive discounts
on your cloud usage.
2. Use Spot Instances: Take advantage of unused capacity in the cloud at significantly lower prices than on-demand
instances. However, they may be interrupted with short notice.
3. Scheduled Scaling: Predictable traffic patterns can be addressed by scheduling scaling actions, for instance scaling up
before peak hours and scaling down during off-peak times (a brief sketch follows this list).
4. Monitoring and Optimization: Continuously monitor resource utilization and optimize configurations to ensure efficient
resource usage and minimize costs.
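A minimal sketch of scheduled scaling (strategy 3), assuming Terraform and reusing the hypothetical Auto Scaling Group from the earlier sketch; the schedules, times, and sizes are illustrative:

# Scale up shortly before the expected daily peak (times in UTC, illustrative)
resource "aws_autoscaling_schedule" "scale_up_peak" {
  scheduled_action_name  = "scale-up-before-peak"
  autoscaling_group_name = aws_autoscaling_group.app.name
  recurrence             = "0 8 * * MON-FRI"
  min_size               = 4
  max_size               = 10
  desired_capacity       = 6
}

# Scale back down during off-peak hours to save cost
resource "aws_autoscaling_schedule" "scale_down_off_peak" {
  scheduled_action_name  = "scale-down-off-peak"
  autoscaling_group_name = aws_autoscaling_group.app.name
  recurrence             = "0 20 * * *"
  min_size               = 2
  max_size               = 10
  desired_capacity       = 2
}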
By combining these mechanisms and strategies, Pfizer can ensure efficient resource utilization, scalability, and cost-
effectiveness for their e-commerce application on the cloud.
Being a project manager, you are given the task of testing and deploying a new version (V2) of a mission-critical software product. The existing version (V1) is currently running live. Define a deployment strategy to effectively migrate from the current live version to the new version with minimal risk and downtime. Draw a simple schematic of the deployment strategy with the various servers and routers involved. (Answer in terms of Blue/Green Deployments.)

Blue/Green deployment is an excellent strategy for minimizing risk and downtime during software updates. In this
approach, you maintain two identical production environments, one "blue" and one "green." While one environment serves
live traffic (e.g., V1), the other can be prepared with the new version (V2) and tested thoroughly before switching traffic to it.
Here's how the deployment strategy could look:
1. Setup:
- Blue Environment (Live): This is the current live environment running V1.
- Green Environment (New): This environment is identical to the blue environment but runs the updated V2.
2. Testing:
- Deploy V2 to the green environment.
- Conduct thorough testing in the green environment to ensure V2 works as expected. This includes unit tests, integration
tests, performance tests, and user acceptance tests.
3. Validation:
- Once testing is complete, validate the V2 deployment in the green environment against predefined criteria to ensure it
meets the required quality standards.
4. Switching Traffic:
- Route a portion of the live traffic to the green environment gradually, using load balancers or DNS settings. This can be
done in stages (e.g., 10% initially).
- Monitor the performance and stability of V2 in the green environment. If any issues arise, you can quickly roll back by
redirecting all traffic back to the blue environment.
5. Monitoring and Rollback:
- Continuously monitor the performance, error rates, and other key metrics in both blue and green environments.
- If any critical issues are detected in the green environment, rollback to the blue environment by redirecting all traffic back
to it.
- Investigate the root cause of the issues and fix them before attempting another deployment.
6. Complete Transition:
- Once V2 in the green environment is stable and performing well, gradually increase the traffic to it until all traffic is routed
there.
- Decommission the blue environment once it is no longer serving any traffic.
As for the schematic, here is a simple text representation of the setup, with a router/load balancer in front of the two environments:
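                        Users / Clients
                               |
                               v
                Router / Load Balancer (traffic switch)
                      |                      |
              live traffic (V1)      no live traffic until V2 is validated
                      v                      v
          +---------------------+   +---------------------+
          |  Blue Environment   |   |  Green Environment  |
          |  App Servers - V1   |   |  App Servers - V2   |
          +---------------------+   +---------------------+
                      |                      |
                      +------ Monitoring ----+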

In this schematic:
- Traffic is initially routed to the Blue Environment (V1).
- The Green Environment (V2) is prepared with the new version but doesn't receive live traffic initially.
- Once V2 is validated, traffic is gradually shifted from the Blue to the Green Environment.
- Monitoring tools should be set up to watch both environments, and rollback procedures should be in place if issues arise.
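If the environments sit behind an AWS Application Load Balancer managed with Terraform, the gradual traffic shift in step 4 could be expressed with weighted forwarding, as in this hypothetical sketch; the target group names and weights are illustrative:

resource "aws_lb_listener" "app" {
  load_balancer_arn = aws_lb.app.arn            # assumed load balancer defined elsewhere
  port              = 80
  protocol          = "HTTP"

  default_action {
    type = "forward"

    forward {
      target_group {
        arn    = aws_lb_target_group.blue.arn   # Blue environment (V1) - currently live
        weight = 90
      }
      target_group {
        arn    = aws_lb_target_group.green.arn  # Green environment (V2) - gradually increased
        weight = 10
      }
    }
  }
}

Shifting traffic is then a matter of adjusting the weights (for example 90/10, then 50/50, then 0/100) and re-applying, and rolling back simply means restoring the original weights.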
A leading European healthcare player with a large IT estate wants to adopt Agile and DevOps practices to gain true business value. They want to reduce the release cycle time from 12 weeks to 3-4 weeks. Currently they use Java-based technologies for waterfall SDLC development of their monolithic applications, and they plan to move to the cloud. Their team size is 26, with effort distributed among development, manual testing, and operations. Please recommend a suitable deployment pipeline that brings business value in terms of faster feature releases, better service quality, efficient deployments at scale, and cost optimization. While you recommend a deployment pipeline, please list the suitable tools for each stage, including continuous monitoring, and explain the benefits.

To transition from a traditional waterfall SDLC to Agile and DevOps practices and achieve faster release cycles, improved
service quality, efficient deployments, and cost optimization, the healthcare player can adopt a modern deployment
pipeline tailored to their needs. Here's a recommended deployment pipeline along with suitable tools for each stage:
1. Version Control:
- Tool: Git
- Benefits: Enables versioning, collaboration, and traceability of code changes. Developers can work concurrently on
features, branches, and merge changes easily.
2. Continuous Integration (CI):
- Tool: Jenkins, GitLab CI, or Travis CI
- Benefits: Automates the process of integrating code changes into a shared repository, triggering builds and tests. This
ensures early detection of integration issues and maintains code quality.
3. Automated Testing:
- Unit Tests: JUnit, Mockito
- Integration Tests: Selenium, Cucumber
- Benefits: Automated testing ensures code reliability and reduces manual testing efforts. It helps catch bugs early in the
development cycle and provides fast feedback to developers.
4. Containerization:
- Tool: Docker
- Benefits: Enables packaging applications and their dependencies into lightweight containers, ensuring consistency
across different environments and simplifying deployment.
5. Orchestration:
- Tool: Kubernetes
- Benefits: Automates deployment, scaling, and management of containerized applications. It provides flexibility,
scalability, and resilience while optimizing resource utilization.
6. Continuous Delivery (CD):
- Tool: Spinnaker, Jenkins Pipeline
- Benefits: Automates the deployment process from code commit to production, enabling faster and more reliable
releases. CD pipelines allow for automated testing, approvals, and canary deployments for risk-free releases.
7. Infrastructure as Code (IaC):
- Tool: Terraform, AWS CloudFormation
- Benefits: Defines and manages infrastructure using code, ensuring consistency, scalability, and repeatability across
environments. It facilitates automated provisioning and configuration of cloud resources (a brief sketch follows this list).
8. Continuous Monitoring:
- Tool: Prometheus, Grafana, ELK Stack
- Benefits: Monitors application performance, availability, and resource utilization in real-time. It helps identify and
troubleshoot issues proactively, ensuring high service quality and reliability.
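As a hedged illustration of stage 7, the Kubernetes cluster used in stage 5 could itself be provisioned with Terraform. Assuming, purely for illustration, Google Cloud's managed Kubernetes service (GKE); the cluster name, region, and node count are assumptions:

resource "google_container_cluster" "app" {
  name               = "healthcare-app-cluster"   # illustrative name
  location           = "europe-west1"             # illustrative European region
  initial_node_count = 3                          # small starting pool; autoscaling can be added later
}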
Benefits of the recommended deployment pipeline:
- Faster Feature Releases: Automation streamlines the development, testing, and deployment processes, reducing cycle
times from weeks to days or even hours.
- Better Service Quality: Automated testing, containerization, and continuous monitoring ensure code reliability, scalability,
and performance.
- Efficient Deployments with Scale: Orchestration and infrastructure as code enable efficient deployment and scaling of
applications across cloud environments.
- Cost Optimization: Containerization and cloud-native technologies optimize resource utilization, enabling cost-effective
scaling and infrastructure management.
By implementing this deployment pipeline, the healthcare player can achieve their goals of adopting Agile and DevOps
practices, reducing release cycle times, and delivering business value through faster, higher-quality software releases.
What is virtualization? How is containerization different from virtualization? Discuss Docker, Kubernetes, and AWS Lambda and their suitability for a DevOps implementation. Explain each of these with an example.

Virtualization is a technology that allows multiple virtual instances of operating systems (OS) to run on a single physical
machine. It enables efficient utilization of hardware resources by abstracting the physical hardware and creating virtual
representations, known as virtual machines (VMs). Each VM can run its own OS and applications, independent of the
underlying hardware.
Containerization, on the other hand, is a lightweight form of virtualization that abstracts the application from the underlying
OS. Instead of virtualizing the entire OS, containers encapsulate the application and its dependencies, allowing it to run
consistently across different computing environments. Containers share the host OS kernel and are more lightweight and
portable compared to VMs.
Docker is a popular containerization platform that simplifies the process of creating, deploying, and managing containers. It
provides a standardized format for packaging applications and their dependencies into containers, making it easy to build,
ship, and run applications across different environments.
Kubernetes is an open-source container orchestration platform designed to automate the deployment, scaling, and
management of containerized applications. It provides features such as automatic scaling, self-healing, and rolling updates,
enabling efficient management of containerized workloads at scale.
AWS Lambda is a serverless computing service provided by Amazon Web Services (AWS). It allows developers to run code
without provisioning or managing servers. With Lambda, developers can upload their code and specify triggers to execute it
in response to events, such as HTTP requests, database changes, or file uploads.
In a DevOps context, Docker, Kubernetes, and AWS Lambda play different roles:
1. Docker: Developers can use Docker to package their applications and dependencies into containers, ensuring
consistency between development, testing, and production environments. For example, a development team building a
microservices-based application might use Docker to containerize each service and manage dependencies more efficiently.
2. Kubernetes: DevOps teams can use Kubernetes to automate the deployment, scaling, and management of containerized
applications. For instance, a DevOps engineer might deploy a web application using Kubernetes, specifying resource
requirements, scaling policies, and health checks to ensure high availability and reliability (see the sketch after this list).
3. AWS Lambda: DevOps teams can leverage AWS Lambda to implement serverless architectures and automate workflows
without managing infrastructure. For example, a DevOps engineer might use AWS Lambda to trigger automated deployment
pipelines in response to code commits, automatically building and deploying applications whenever changes are made.
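To make the Docker and Kubernetes roles above concrete, here is a minimal, hypothetical sketch using Terraform's Kubernetes provider to run a containerized web service; the image name, labels, and replica count are illustrative assumptions:

resource "kubernetes_deployment" "web" {
  metadata {
    name = "web-app"
  }

  spec {
    replicas = 3                                       # Kubernetes keeps three copies running

    selector {
      match_labels = { app = "web-app" }
    }

    template {
      metadata {
        labels = { app = "web-app" }
      }

      spec {
        container {
          name  = "web-app"
          image = "registry.example.com/web-app:1.0"   # assumed Docker image built by the team
          port {
            container_port = 8080
          }
        }
      }
    }
  }
}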
In summary, Docker, Kubernetes, and AWS Lambda are all valuable tools for DevOps practitioners, offering different
capabilities for containerization, orchestration, and serverless computing. By combining these technologies effectively,
DevOps teams can streamline development workflows, improve scalability and reliability, and accelerate time-to-market for
their applications.
A major bank has been observing a lot of incidents of late. On close analysis they found that sometimes there are performance issues on VMs, and they are not receiving any alerts for them. Also, there is no dashboard for application metrics (e.g., user login should happen within 5 seconds, payments should be processed within 2 seconds). As there is no application-metrics-specific dashboard, by the time an issue impacts the end user, it's too late. Based on the information provided, please answer the following:
(a) Would you recommend a top-down monitoring approach here, or a bottom-up monitoring approach? Why?
(b) Recommend an application monitoring tool that could be considered. The application is written in Java.
(c) As the application is deployed on an AWS EC2 Auto Scaling Group, scaling up and scaling down is the norm. How can we ensure that new instances coming up because of scaling have monitoring fully enabled?
(a) Given the scenario described, I would recommend a top-down monitoring approach. Top-down monitoring focuses on
the end-user experience and business processes, which aligns well with the bank's concern about performance issues
impacting end users without receiving alerts. By monitoring user interactions and application performance from the top level
down to the underlying infrastructure, the bank can proactively identify issues before they affect end users. This approach
also allows for the creation of application-specific dashboards to monitor critical metrics, addressing the lack of such
dashboards mentioned in the problem statement.

(b) For application monitoring in Java-based environments, a tool like New Relic or AppDynamics would be suitable. These
tools provide comprehensive monitoring capabilities for Java applications, including performance monitoring, error tracking,
and real-time analytics. They offer features such as customizable dashboards, alerting mechanisms, and deep visibility into
application code and dependencies, which would address the bank's need for application-specific monitoring.

(c) To ensure that new instances coming up due to scaling in the AWS EC2 Auto Scaling Group have monitoring enabled, you
can leverage AWS CloudWatch. CloudWatch provides monitoring and observability services for AWS resources, including
EC2 instances. By configuring CloudWatch agent or integrating with AWS CloudWatch Logs, you can automatically enable
monitoring on new instances as they are launched within the Auto Scaling Group. Additionally, you can use AWS
CloudFormation templates or AWS Systems Manager Automation to automate the deployment and configuration of
monitoring agents on new instances, ensuring consistent monitoring across the environment.
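One hedged way to express point (c) in Terraform is to bake monitoring into the Auto Scaling Group's launch template, so every instance the group launches enables detailed metrics and starts the CloudWatch agent on boot. The AMI variable and the SSM parameter name holding the agent configuration below are assumptions:

resource "aws_launch_template" "bank_app" {
  name_prefix   = "bank-app-"
  image_id      = var.ami_id            # assumed AMI with the CloudWatch agent pre-installed
  instance_type = "m5.large"            # illustrative instance type

  monitoring {
    enabled = true                      # detailed (1-minute) EC2 metrics for every new instance
  }

  # Start the CloudWatch agent on boot, pulling its config from SSM Parameter Store (assumed parameter name)
  user_data = base64encode(<<-EOT
    #!/bin/bash
    /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl \
      -a fetch-config -m ec2 -c ssm:AmazonCloudWatch-bank-app-config -s
  EOT
  )
}

The Auto Scaling Group then references this launch template, so monitoring is enabled consistently on every scale-out without any manual steps.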

ABC organization would like to monitor the following scenarios occurring at their end continuously:

(a) Suspicious Login Attempts
(b) Network Failure or downtime
(c) User sessions, buffer cache
(d) Tracking VMs
ABC Organization must apply Application Monitoring, Network Monitoring, Database Monitoring, or Security Monitoring for each of these scenarios. Select the correct monitoring type for each and explain your choice.

Here's how I would recommend monitoring each scenario:

(a) Suspicious Login Attempts:
- Monitoring Type: Security Monitoring
Explanation: Security Monitoring involves tracking and analyzing activities within a network or system to detect and respond
to security threats. Suspicious login attempts are a common indicator of potential security breaches, such as brute force
attacks or unauthorized access attempts. By implementing security monitoring tools and techniques, ABC Organization can
continuously monitor login attempts, analyze patterns, and receive alerts for any suspicious activity, enabling them to take
appropriate action to mitigate potential security risks.
(b) Network Failure or Downtime:
- Monitoring Type: Network Monitoring
Explanation: Network Monitoring focuses on monitoring the performance and availability of network devices, connections,
and services. By employing network monitoring tools, ABC Organization can continuously monitor network infrastructure for
signs of failure or downtime, such as fluctuations in network traffic, device failures, or connectivity issues. This enables
proactive identification and resolution of network issues to minimize downtime and ensure uninterrupted service delivery.
(c) User Sessions, Buffer Cache:
- Monitoring Type: Application Monitoring
Explanation: Application Monitoring involves monitoring the performance and behavior of applications and their
components in real-time. User sessions and buffer cache are critical components of application performance. By
implementing application monitoring tools, ABC Organization can continuously monitor user sessions to track user
activities, identify performance bottlenecks, and optimize resource utilization, including buffer cache management. This
enables ABC Organization to ensure optimal application performance and user experience.
(d) Tracking VMs:
- Monitoring Type: Infrastructure Monitoring
Explanation: Infrastructure Monitoring focuses on monitoring the health, performance, and availability of IT infrastructure
components, including virtual machines (VMs). VMs are fundamental building blocks of modern IT infrastructure, and
tracking their status and performance is crucial for maintaining the overall health and efficiency of the infrastructure. By
deploying infrastructure monitoring tools, ABC Organization can continuously track VMs, monitor resource usage, detect
performance issues, and ensure the optimal allocation of resources to support business operations effectively.
By applying the appropriate monitoring types for each scenario, ABC Organization can establish a comprehensive
monitoring strategy to continuously monitor and manage their IT environment, ensuring operational efficiency, performance
optimization, and proactive threat detection and response.
