SRE Detailed With Google
Training Plan
Course Duration
Prerequisites
Learning outcomes
● GCP connectivity
● Prometheus and Grafana
● Dynatrace, Datadog
● ELK
● Linux on Google Compute Engine
● GitHub, Jenkins
Course Content
Introduction to SRE
o Part 1: Introduction to SRE
What is SRE?
The history of SRE and its development
The principles of SRE
Differences between SRE and traditional operations roles
The benefits of implementing SRE
SRE teams and organizational structure
The role of SRE in DevOps
o Part 2: Building Reliable Systems
Define reliability by embracing risk
Measure reliability through SLIs, SLOs, and error budgets
Lab Activity: Define SLx
Lab Activity: Build Error Budget Policy
Reliability concepts and metrics
Understanding the "error budget" concept
Setting service level objectives (SLOs)
How to measure and report on reliability
Designing reliable systems
Building resiliency into systems
Managing risk and failure modes
Scaling systems for growth
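The relationship between an SLO and its error budget can be sketched with a little arithmetic. The example below is only an illustration: the 99.9% target, the request counts, and the function names are assumptions made for the sketch, not values prescribed by the course.

```python
# Illustrative only: the SLO target and request counts are made-up numbers.


def error_budget(slo_target: float, total_events: int) -> float:
    """Return how many bad events the SLO tolerates over the window."""
    return (1.0 - slo_target) * total_events


def budget_remaining(slo_target: float, total_events: int, bad_events: int) -> float:
    """Return the unspent fraction of the error budget (negative if overspent)."""
    budget = error_budget(slo_target, total_events)
    if budget == 0:
        raise ValueError("a 100% SLO leaves no error budget")
    return 1.0 - bad_events / budget


if __name__ == "__main__":
    slo = 0.999          # hypothetical availability SLO
    total = 1_000_000    # hypothetical requests served in the window
    failed = 400         # hypothetical failed requests in the window

    sli = (total - failed) / total  # measured SLI: fraction of good requests
    print(f"SLI: {sli:.4%}")
    print(f"Allowed failures: {error_budget(slo, total):.0f}")
    print(f"Budget remaining: {budget_remaining(slo, total, failed):.0%}")
```

An error budget policy would typically attach actions to the remaining fraction, for example pausing feature releases once it drops below an agreed level.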
Linux Basics
o Linux Basics
o Linux Directory Structure
o Linux Basic Commands
o clear
o pwd
o cd
o echo
o ls
o history
o whoami
o sudo su
o Copy, Remove, Move and Time Commands
o touch and cat
o watch
o env
o diff and grep commands
o Head, tail, sort and more commands
o zip and tar
o tr and wc commands
o Disk utilities like free, fdisk, df and du commands
o Getting help from the command-line interface
o w, who, hostname, hostnamectl and uname commands
o Search for files and directories using find and locate commands
o top command and an explanation of its output
o User and group management commands
o id
id -u <user>
id -g <group>
o sudo useradd <user>
o sudo passwd <user>
o sudo userdel <user>
o sudo groupadd <group>
o sudo groupdel <group>
o sed, awk, vmstat and netstat commands
o vnstat command
o cut command
o Merge multiple files using paste command
o Connect and Manage remote machine using SSH
o Changing files and directory permissions
o tar and zip commands
o Scheduling future jobs using crontab
o PATH environment variable
o Curl
o short tutorial on ssh
o short tutorial on vi text editor
o ifconfig, ip, netstat, nslookup
o short tutorial on apt-get and yum
Bash Scripting
o 5. Conditional Statements
o 6. Looping Constructs
for Loop
while Loop
until Loop
Loop Control Statements (break, continue)
o 7. Functions in Bash
o 8. Error Handling
o 9. String Manipulation
Concatenation
Substring Extraction
Searching and Replacing in Strings
Docker
Virtualization and Containerization
Install Git on Windows as well as on the VM
Overview of virtualization
Overview of Hypervisors
Install Docker
Docker Architecture
Lab:
Pull Docker images from Docker Hub
Create containers from Docker images
Access it outside the machine
Get inside a container
Get inside an already created container
Install Java in it
Commit and create a new image
Verify that a container created from the new image has Java in it
Pushing our image to Docker Hub
Push/pull images to/from an ECR repository
Export and import images and containers
Discussion of additional options when creating containers
o Volume
What volumes are and why we need them
Different types of Docker volumes
Lab
o Create a container attached to a volume and understand the internals
Named volumes
Anonymous volumes
Bind mounts and understand the internals
tmpfs mounts and understand the internals
o Understand the various options
How to distinguish them
Which volume type to use? When to use them?
Lab:
Create a Docker web container connected to a backend MySQL container.
Crash and restore the MySQL container.
Create or download a simple multi-container app in Java and Python
Dockerize it
Explain the differences in Dockerizing applications written in different languages
Kubernetes
o Kubernetes
Overview of Orchestration
Install a three-node cluster (one master and two workers) using kubeadm
Understand each component that gets installed
Kubernetes architecture
kube-system namespace
Advantages of Kubernetes
Pods
Deep dive into pods
Labs: Creating our own Pods
o Using imperative approach
o Using declarative approach
How was the Pod created?
Hands on Deep dive into Pods
Namespaces
Lab: Namespaces
Labels and selectors
Lab: Labels and selectors
ReplicationController and ReplicaSet
Lab
o Hands-on: impact of a ReplicaSet
o Difference between them
o Deployment
Lab:
o Create a deployment
o Overview of deployment strategies
o Rolling update
o Scale out and scale in
o Update and rollback
o Recreate and OnDelete
o Kubernetes volumes
Discussion on the internals
emptyDir
hostPath
PV
PVC
Connecting Pods to NFS
o Lab:
Kubernetes volumes
o ServiceTypes
ClusterIP
Lab: ClusterIP
NodePort
Lab: NodePort
LoadBalancer
Deep dive into how Kubernetes networking works
DNS Lookup
Deep dive into how DNS lookup works
Ingress
o Kubernetes security
Network policies
RBAC
Assignment: To be done by engineers with little help from the trainer.
GKE
o Introduction to GKE
1.1 Overview
1.2 Key Features
1.3 Benefits
o Getting Started
2.1 Setting up a GKE Cluster
2.2 Installing gcloud CLI
2.3 Configuring kubectl
Monitoring
- Monitoring using native GCP
Introduction to Monitoring on GCP
1.1 Overview of Monitoring in GCP
1.2 Importance of Monitoring for Cloud Applications
1.3 GCP Monitoring Services Overview
Implement monitoring for the node and Kubernetes cluster created earlier using GCP services.
Implement monitoring for the node and Kubernetes cluster created earlier using Prometheus and Grafana.
Implement monitoring for the node and Kubernetes cluster created earlier using Dynatrace.
- Datadog
o Introduction to Datadog
Overview of Datadog
Key Features and Capabilities
Use Cases and Benefits
o Getting Started with Datadog
Creating a Datadog Account
Navigating the Datadog Web Interface
Installation and Setup
o Datadog Dashboards
Creating Custom Dashboards
Dashboard Widgets and Metrics
Sharing and Collaborating on Dashboards
Implement monitoring for the node and Kubernetes cluster created earlier using Datadog.
Log management
- ELK
o Introduction to ELK Stack
Overview of ELK Stack
Key Components: Elasticsearch, Logstash, Kibana
Use Cases and Benefits
o Installing and Setting Up ELK Stack
Installing Elasticsearch
Installing Logstash
Installing Kibana
Configuring Basic Settings
o Understanding Elasticsearch
Introduction to Elasticsearch
Indexing and Searching Data
Data Sharding and Replication
Mapping and Analysis
o Logstash: Data Collection and Processing
Logstash Overview
Configuring Logstash Input
Filter Plugins for Data Processing
Output Plugins for Data Routing
o Kibana: Data Visualization and Exploration
Introduction to Kibana
Connecting Kibana to Elasticsearch
Creating Index Patterns
Building Visualizations and Dashboards
o Elasticsearch Query DSL
Basics of Elasticsearch Query Language
Querying and Filtering Data
Aggregations and Metrics
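As a rough illustration of how querying, filtering, and aggregations combine in one Query DSL request, the sketch below builds a request body as a Python dictionary and prints it as JSON. The field names (`service.keyword`, `level.keyword`, `@timestamp`) and the function name are assumptions about a hypothetical log index, not part of the course material.

```python
import json


def build_error_query(service: str, since: str = "now-1h") -> dict:
    """Build a Query DSL body that counts ERROR logs for one service over time.

    The field names used here are hypothetical and depend on the index mapping.
    """
    return {
        "size": 0,  # return only aggregation results, no individual hits
        "query": {
            "bool": {
                # filter clauses match exactly and do not affect scoring
                "filter": [
                    {"term": {"service.keyword": service}},
                    {"term": {"level.keyword": "ERROR"}},
                    {"range": {"@timestamp": {"gte": since}}},
                ]
            }
        },
        "aggs": {
            # bucket the matching documents into 5-minute intervals
            "errors_over_time": {
                "date_histogram": {"field": "@timestamp", "fixed_interval": "5m"}
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_error_query("checkout"), indent=2))
```

The printed JSON could be sent as the body of a `_search` request against the relevant index, or pasted into the Kibana Dev Tools console.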
o Advanced Elasticsearch Features
Full-Text Search and Analyzers
Highlighting and Fuzzy Search
Geo-Location Queries
o Log Management and Parsing
Parsing Log Files with Logstash
Grok Patterns and Regular Expressions
Enriching Log Data
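A grok pattern is essentially a regular expression with named captures. To show the idea outside Logstash, the sketch below parses a line shaped like `55.3.244.1 GET /index.html 15824 0.043` using Python named groups. The log format and the field names are assumptions chosen for the illustration; a real pipeline would use the grok filter itself.

```python
import re
from typing import Dict, Optional

# Roughly mirrors a grok pattern such as:
#   %{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}
# The sub-patterns below are simplified stand-ins, not the exact grok definitions.
LINE_RE = re.compile(
    r"(?P<client>\d{1,3}(?:\.\d{1,3}){3}) "
    r"(?P<method>[A-Z]+) "
    r"(?P<request>\S+) "
    r"(?P<bytes>\d+) "
    r"(?P<duration>\d+(?:\.\d+)?)"
)


def parse_line(line: str) -> Optional[Dict[str, str]]:
    """Return the named fields of a log line, or None if it does not match."""
    match = LINE_RE.fullmatch(line.strip())
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(parse_line("55.3.244.1 GET /index.html 15824 0.043"))
    print(parse_line("not a log line"))
```

Lines that fail to match return `None`, which corresponds loosely to the failure tag Logstash adds to events a grok filter cannot parse.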
o Beats: Lightweight Data Shippers
Overview of Beats
Configuring Filebeat for Log Shipping
Metricbeat for System and Service Metrics
CI/CD
- Jenkins
o Introduction to Jenkins
Overview of Jenkins
Continuous Integration and Continuous Delivery (CI/CD)
Key Features and Benefits
o Installing and Setting Up Jenkins
Installing Jenkins
Configuring Jenkins
Jenkins Plugins and Integration
o Creating Your First Jenkins Job
Introduction to Jenkins Jobs
Setting Up Source Code Repositories
Configuring Jenkins Freestyle Projects
o Pipeline as Code with Jenkinsfile
Understanding Jenkinsfile
Declarative vs. Scripted Pipelines
Writing and Configuring Jenkins Pipelines
o Version Control Integration
Integrating Jenkins with Git
Configuring Jenkins with GitHub
o Build Tools and Build Environments
Integration with Build Tools (e.g., Maven)
Configuring Build Environments
Artifact Management
o Automated Testing with Jenkins
Setting Up Automated Tests
Running Unit Tests
Integration Testing and Code Quality Checks
o Continuous Deployment with Jenkins
Introduction to Continuous Deployment
Deploying Applications with Jenkins
Implementation of CI/CD with Jenkins
o Jenkins Distributed Builds
Configuring Jenkins Agents
Master-Slave Architecture
Cloud-Based Build Agents
o Security in Jenkins
User Authentication and Authorization
Role-Based Access Control (RBAC)
o Monitoring and Logging
Jenkins Metrics and Monitoring
Centralized Logging with Jenkins
Health Checks and Notifications
Implement CI/CD for the application orchestrated earlier using Jenkins.
o Custom Actions
Creating Custom Actions
Sharing and Using Custom Actions
Best Practices for Action Development
Implement CI/CD for the application orchestrated earlier using GitHub Actions.
Provision a Kubernetes stack using Terraform such that you can reuse it in dev, testing, staging, and production environments.
Python
o Python Basics
Variables and Data Types
Operators and Expressions
Control Flow (if statements, loops)
o Functions in Python
Defining Functions
Function Parameters and Return Values
Scope and Lifetime of Variables
Collect basic server metrics such as CPU usage, memory usage, disk space, and network statistics.
Use external Python libraries such as psutil (for system monitoring) and requests (for fetching external data).
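The metrics exercise above might start from something like the following sketch. It assumes `psutil` is installed (`pip install psutil`) and that the root filesystem `/` is the disk of interest. The function names and the threshold values in the usage note are hypothetical, and the `requests` part of the exercise is not shown.

```python
from typing import Dict


def collect_metrics() -> Dict[str, float]:
    """Sample CPU, memory, disk, and network counters using psutil."""
    # Imported here so the helper below stays usable without psutil installed.
    import psutil  # third-party library

    net = psutil.net_io_counters()
    return {
        "cpu_percent": psutil.cpu_percent(interval=1),      # averaged over 1 second
        "memory_percent": psutil.virtual_memory().percent,
        "disk_percent": psutil.disk_usage("/").percent,     # assumes "/" is the target mount
        "net_bytes_sent": float(net.bytes_sent),            # cumulative since boot
        "net_bytes_recv": float(net.bytes_recv),            # cumulative since boot
    }


def over_threshold(metrics: Dict[str, float], limits: Dict[str, float]) -> Dict[str, float]:
    """Return only the metrics whose value exceeds its configured limit."""
    return {
        name: value
        for name, value in metrics.items()
        if name in limits and value > limits[name]
    }
```

A possible use is `over_threshold(collect_metrics(), {"cpu_percent": 90.0, "disk_percent": 80.0})`, which returns an empty dictionary while the host is within the chosen limits.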
Objective: Set up a basic web application on GCP using Google App Engine.
Steps:
1. Get/download a web application.
2. Set up a new project on GCP.
3. Deploy the web application to Google App Engine.
4. Configure a custom domain for your web application.
5. Explore monitoring and logging features in Google Cloud Console.
Objective: Create a scalable and redundant storage solution using Google Cloud Storage.
Steps: