DevOps Shack
DevOps Guide for Python
1. Introduction
Python has established itself as one of the most widely used languages in
DevOps due to its powerful scripting capabilities and ease of automation. It
plays a crucial role in infrastructure automation, CI/CD pipelines, cloud
management, and system administration.
Why Is Python Popular in DevOps?
1. Versatility & Simplicity: Python’s easy-to-read syntax allows DevOps
engineers to write efficient automation scripts without a steep learning
curve.
2. Cross-Platform Support: Python runs seamlessly on Linux, Windows, and
macOS, making it an ideal choice for managing diverse IT environments.
3. Rich Ecosystem of Libraries: Python offers specialized libraries like boto3
for AWS automation, paramiko for SSH-based tasks, and subprocess for
shell scripting.
4. Integration with DevOps Tools: Many DevOps tools, such as Ansible,
Kubernetes, and Jenkins, support Python for scripting and plugin
development.
Python’s Impact on DevOps Adoption
A JetBrains survey reports that 38% of Python usage is in DevOps,
automation, and system administration, highlighting its significance in
managing modern infrastructure and workflows.
Python helps DevOps teams with:
Scripting: Automating repetitive tasks like log management, server
provisioning, and software deployment.
Automation: Streamlining infrastructure provisioning, configuration
management, and CI/CD workflows.
Platform Engineering: Developing custom tools and internal platforms for
DevOps teams to enhance efficiency.
With its adaptability and strong community support, Python remains an
essential language for DevOps professionals aiming to improve automation
and operational efficiency.
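As an example of the routine scripting described above, here is a minimal sketch that removes old log files; the directory path and retention period are illustrative choices:
import os
import time

LOG_DIR = "/var/log/myapp"   # illustrative log directory
MAX_AGE_DAYS = 7             # illustrative retention period

cutoff = time.time() - MAX_AGE_DAYS * 86400
for name in os.listdir(LOG_DIR):
    path = os.path.join(LOG_DIR, name)
    # Remove plain files older than the retention period
    if os.path.isfile(path) and os.path.getmtime(path) < cutoff:
        os.remove(path)
        print(f"Removed old log: {path}")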
2. CI/CD Automation
Python scripts frequently run inside CI/CD pipelines to handle supporting tasks such as fetching short-lived credentials before a deployment:
import requests

def get_auth_token():
    url = "https://api.example.com/get-token"
    response = requests.get(url)
    return response.json().get("token")

auth_token = get_auth_token()
print(f"Retrieved Token: {auth_token}")
This script ensures that deployments use dynamic authentication instead of
hardcoded credentials.
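As a follow-on sketch, the token can be passed as a bearer token when triggering a deployment; the deployment endpoint and payload below are hypothetical:
import requests

def trigger_deployment(auth_token):
    # Hypothetical deployment API; substitute your CI/CD system's endpoint
    deploy_url = "https://api.example.com/deploy"
    headers = {"Authorization": f"Bearer {auth_token}"}
    response = requests.post(deploy_url, headers=headers, json={"service": "web-app"})
    return response.status_code

print(trigger_deployment(get_auth_token()))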
3. DevOps Platform Tooling
In modern DevOps practices, organizations often develop internal DevOps
platforms and tools to streamline workflows, enforce security policies, and
enhance automation. Python is widely used in platform engineering to build
these custom tools, ensuring flexibility and scalability.
Why Python for DevOps Platform Engineering?
• Custom Automation – Python allows DevOps engineers to develop
scripts and tools tailored to specific internal needs.
• Seamless API Integration – Python makes it easy to interact with APIs for
managing cloud services, CI/CD pipelines, and monitoring systems.
• Cross-Platform Compatibility – Python-based DevOps tools work across
Linux, Windows, and macOS environments.
Real-World Use Cases of Python in DevOps Tooling
if args.deploy:
    print("Starting deployment...")
This script can be extended to automate deployment processes, manage
configurations, or trigger CI/CD pipelines.
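A minimal sketch of such a command-line tool, assuming argparse for flag parsing (the tool name and flags are illustrative):
import argparse

def main():
    parser = argparse.ArgumentParser(description="Internal DevOps helper tool")
    parser.add_argument("--deploy", action="store_true", help="Trigger a deployment")
    parser.add_argument("--env", default="staging", help="Target environment")
    args = parser.parse_args()

    if args.deploy:
        print(f"Starting deployment to {args.env}...")
    else:
        print("Nothing to do. Use --deploy to start a deployment.")

if __name__ == "__main__":
    main()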
Developing API-Based Automation Tools:
Python’s Flask framework allows the creation of custom API services for
DevOps automation.
from flask import Flask, jsonify

app = Flask(__name__)

@app.route('/health', methods=['GET'])
def health_check():
    return jsonify({"status": "OK"})

if __name__ == '__main__':
    app.run(port=5000)
This lightweight monitoring service can be integrated into infrastructure
health checks or used as a webhook receiver for CI/CD events.
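As a usage sketch, another script or CI job could poll this endpoint to verify the service is up; the URL and timeout are illustrative:
import requests

def is_healthy(url="http://localhost:5000/health"):
    try:
        response = requests.get(url, timeout=5)
        return response.status_code == 200 and response.json().get("status") == "OK"
    except requests.RequestException:
        return False

print("Service healthy" if is_healthy() else "Service unreachable")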
s3 = boto3.client("s3")
buckets = s3.list_buckets()
6
This helps automate security audits and enforce compliance policies.
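A minimal sketch of such an audit, assuming the compliance rule is "no bucket ACL may grant access to all users" (one of many possible checks):
import boto3

s3 = boto3.client("s3")
PUBLIC_GROUP = "http://acs.amazonaws.com/groups/global/AllUsers"

for bucket in s3.list_buckets()["Buckets"]:
    acl = s3.get_bucket_acl(Bucket=bucket["Name"])
    # Flag any grant that exposes the bucket to all users
    is_public = any(
        grant.get("Grantee", {}).get("URI") == PUBLIC_GROUP
        for grant in acl["Grants"]
    )
    if is_public:
        print(f"WARNING: bucket {bucket['Name']} is publicly accessible")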
Python’s Role in DevOps Platform Engineering
• Enhances productivity by automating repetitive tasks.
• Improves security by enabling automated compliance checks.
• Facilitates scalability by allowing the development of internal DevOps
tools that integrate seamlessly with existing infrastructure.
Python is a key enabler for DevOps platform engineering, making it an
essential tool for organizations building internal automation solutions.
4. Cloud Automation
Cloud environments are an essential part of DevOps, and Python is widely
used for automating cloud infrastructure and services. With cloud providers
like AWS, Azure, and Google Cloud offering SDKs for Python, DevOps engineers
can manage cloud resources programmatically, enabling faster, more reliable,
and scalable automation.
Why Use Python for Cloud Automation?
• Extensive Cloud SDK Support – Libraries like boto3 (AWS), azure-mgmt
(Azure), and google-cloud-sdk (GCP) allow seamless cloud interactions.
• Infrastructure as Code (IaC) Automation – Python scripts can provision
and configure cloud infrastructure dynamically.
• Serverless & Event-Driven Automation – Python is widely used in cloud-
based Lambda functions, event triggers, and webhook-based
automation.
Real-World Use Cases of Python in Cloud Automation
import boto3

# Describe existing EC2 instances
ec2 = boto3.client('ec2')
instances = ec2.describe_instances()

# Create an S3 bucket for automation artifacts
s3 = boto3.client('s3')
bucket_name = "my-devops-automation-bucket"
s3.create_bucket(Bucket=bucket_name)
print(f"Bucket {bucket_name} created successfully!")
This is useful for automated backups, data management, and storage
provisioning.
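For instance, a nightly backup job might upload an archive into that bucket; the local file path and key format below are illustrative:
import boto3
from datetime import datetime

s3 = boto3.client("s3")
bucket_name = "my-devops-automation-bucket"

# Upload a local backup archive under a date-stamped key
backup_file = "/tmp/app-backup.tar.gz"  # illustrative path
key = f"backups/app-backup-{datetime.now():%Y-%m-%d}.tar.gz"
s3.upload_file(backup_file, bucket_name, key)
print(f"Uploaded {backup_file} to s3://{bucket_name}/{key}")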
Defining Infrastructure as Code with the AWS CDK for Python:
from aws_cdk import core
from aws_cdk import aws_s3 as s3

class MyStack(core.Stack):
    def __init__(self, scope: core.Construct, id: str, **kwargs):
        super().__init__(scope, id, **kwargs)
        s3.Bucket(self, "MyBucket")

app = core.App()
MyStack(app, "CloudAutomationStack")
app.synth()
This IaC approach ensures repeatability, consistency, and version control for
cloud infrastructure.
return "Completed"
This script helps in cost optimization by shutting down instances when they are
not needed.
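The sketch below assumes instances to stop carry an AutoStop tag and that the function runs as an AWS Lambda handler (the tag name and scheduling are illustrative):
import boto3

ec2 = boto3.client("ec2")

def lambda_handler(event, context):
    # Find running instances tagged for automatic shutdown
    response = ec2.describe_instances(
        Filters=[
            {"Name": "tag:AutoStop", "Values": ["true"]},
            {"Name": "instance-state-name", "Values": ["running"]},
        ]
    )
    instance_ids = [
        instance["InstanceId"]
        for reservation in response["Reservations"]
        for instance in reservation["Instances"]
    ]
    if instance_ids:
        ec2.stop_instances(InstanceIds=instance_ids)
        print(f"Stopped instances: {instance_ids}")
    return "Completed"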
Python’s Role in Cloud Automation
• Simplifies cloud management through API-driven automation.
• Enhances efficiency by reducing manual intervention in cloud
operations.
• Improves scalability by enabling Infrastructure as Code (IaC)
deployments.
Python’s integration with cloud providers makes it an essential tool for DevOps
engineers to automate cloud infrastructure and services efficiently.
5. Monitoring & Alerting
Monitoring is another area where Python shines: the psutil library exposes system metrics such as CPU, memory, and disk usage with a few lines of code.
import psutil

cpu_usage = psutil.cpu_percent(interval=1)
memory_usage = psutil.virtual_memory().percent
disk_usage = psutil.disk_usage('/').percent

print(f"CPU Usage: {cpu_usage}%")
print(f"Memory Usage: {memory_usage}%")
print(f"Disk Usage: {disk_usage}%")
This script helps monitor system health and detect high resource utilization.
log_file = "/var/log/app.log"
A Flask webhook can receive alerts from a monitoring system and trigger scaling actions:
from flask import Flask, request
import boto3

app = Flask(__name__)
ec2 = boto3.client('ec2')

@app.route('/scale', methods=['POST'])
def scale():
    alert_data = request.json
    if alert_data['metric'] == "high_cpu":
        ec2.start_instances(InstanceIds=['i-1234567890abcdef'])
        return "Scaling triggered!", 200
    return "No action taken.", 200

if __name__ == '__main__':
    app.run(port=5000)
This webhook can be integrated with monitoring tools like Prometheus,
CloudWatch, or Datadog to trigger auto-scaling events.
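To test the webhook locally, a monitoring alert can be simulated with a simple POST request; the payload shape is illustrative:
import requests

# Simulate an alert payload like the one a monitoring tool might send
alert = {"metric": "high_cpu", "value": 92}
response = requests.post("http://localhost:5000/scale", json=alert)
print(response.status_code, response.text)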
Alerts can also be pushed to Slack through an incoming webhook:
import requests

SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/your/slack/webhook"
# Example payload; Slack incoming webhooks expect a "text" field
message = {"text": "Alert: high CPU usage detected on the production server."}
requests.post(SLACK_WEBHOOK_URL, json=message)
This helps improve incident response times by notifying the DevOps team
instantly.
Python’s Role in Monitoring & Alerting
• Enhances observability by collecting system and application metrics.
• Improves reliability by automating incident detection and resolution.
• Integrates seamlessly with monitoring tools to create custom
dashboards and alerting systems.
Python’s flexibility makes it an ideal choice for real-time monitoring, proactive
alerting, and automated incident response in DevOps environments.
6. MLOps Automation
Python also connects DevOps practices with machine learning workflows, automating model training, packaging, and deployment.
import joblib
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Load data and create a train/test split (iris dataset assumed from the model name)
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train model
model = RandomForestClassifier()
model.fit(X_train, y_train)

# Save model
joblib.dump(model, 'iris_model.pkl')
This automates model training and ensures that the latest model is always
available for deployment.
from flask import Flask, request, jsonify
import joblib

app = Flask(__name__)
model = joblib.load('iris_model.pkl')

@app.route('/predict', methods=['POST'])
def predict():
    data = request.get_json()
    prediction = model.predict([data['features']])
    return jsonify({'prediction': prediction.tolist()})

if __name__ == '__main__':
    app.run(debug=True)
This allows machine learning models to be exposed as APIs, making them easy
to integrate into larger systems or applications.
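A quick usage example, calling the endpoint with the requests library (the feature values are illustrative and match the iris input shape):
import requests

payload = {"features": [5.1, 3.5, 1.4, 0.2]}
response = requests.post("http://localhost:5000/predict", json=payload)
print(response.json())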
The saved model can also be evaluated before it is promoted to production:
model = joblib.load('iris_model.pkl')
accuracy = model.score(X_test, y_test)  # X_test, y_test come from the earlier train/test split
print(f"Model accuracy: {accuracy:.2f}")
Training can be scheduled and orchestrated with Apache Airflow:
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def train_model():
    # Your model training code here
    pass

# DAG name and schedule are illustrative
dag = DAG('ml_training_pipeline', start_date=datetime(2024, 1, 1), schedule_interval='@daily')

train_task = PythonOperator(task_id='train_model', python_callable=train_model, dag=dag)
Airflow allows DevOps engineers to automate end-to-end machine learning
workflows, making it an essential tool in MLOps automation.
Python’s Role in MLOps
• End-to-End Automation – Python automates training, testing,
deployment, and monitoring of machine learning models, integrating
them into DevOps workflows.
• Collaboration Between Teams – Python helps DevOps engineers and
data scientists collaborate on deploying, scaling, and monitoring ML
models in production.
• Streamlined Model Deployment – By integrating Python with APIs,
Docker, and orchestration tools like Airflow, the deployment and
management of ML models are streamlined and automated.
Python is indispensable in MLOps, enabling continuous integration,
continuous delivery, and continuous training of machine learning models in
production environments.
7. Important Python Modules for DevOps Automation
In DevOps, Python's versatility is enhanced by a wide range of modules and
libraries that simplify automation, system administration, cloud management,
and integration tasks. Below is a list of essential Python modules that DevOps
engineers frequently use for automation and other DevOps tasks.
1. os and sys
• Use: These modules are part of Python’s standard library and are
primarily used for interacting with the operating system.
• Use Cases: File system manipulation, process management, environment
variable handling, and path management.
o Example: Running shell commands, getting environment variables.
import os
print(os.environ['HOME'])
os.system('ls -l')
2. subprocess
• Use: To spawn new processes, connect to their input/output/error pipes,
and obtain their return codes.
• Use Cases: Running shell commands and capturing output from
command-line tools. This is especially useful for automating tasks that
involve system-level commands or scripts.
import subprocess
result = subprocess.run(['ls', '-l'], stdout=subprocess.PIPE)
print(result.stdout.decode())
3. psutil
• Use: Provides an interface for retrieving information on running
processes and system utilization (CPU, memory, disks, network, and
sensors).
• Use Cases: Monitoring and managing system resources, creating alerts
based on system health metrics.
import psutil
print(f"CPU Usage: {psutil.cpu_percent()}%")
print(f"Memory Usage: {psutil.virtual_memory().percent}%")
4. boto3
• Use: The Amazon Web Services (AWS) SDK for Python, used to interact
with AWS services like EC2, S3, and Lambda.
• Use Cases: Automating AWS infrastructure provisioning, managing
resources, deploying applications, and scaling cloud infrastructure.
import boto3

ec2 = boto3.resource('ec2')
for instance in ec2.instances.all():
    print(instance.id, instance.state)
5. requests and urllib3
• Use: These modules are used for making HTTP requests. requests is a
simpler and more popular library for making HTTP requests in Python,
while urllib3 is more low-level.
• Use Cases: API integrations, sending data to monitoring systems,
managing configurations, making HTTP requests to third-party services.
import requests
response = requests.get('https://api.example.com/data')
print(response.json())
6. paramiko
• Use: A module for working with SSH to execute commands on remote
machines.
• Use Cases: Automating SSH access, running commands remotely,
managing servers, and securely transferring files.
import paramiko
ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect('example.com', username='user', password='password')
stdin, stdout, stderr = ssh.exec_command('uptime')
print(stdout.read().decode())
7. PyYAML
• Use: A module used for parsing and writing YAML files, which are
commonly used for configuration management.
• Use Cases: Reading/writing configuration files, automating tasks related
to system configuration, managing infrastructure as code (e.g., Ansible
playbooks).
import yaml

with open('config.yaml', 'r') as file:
    config = yaml.load(file, Loader=yaml.FullLoader)

print(config)
8. json
• Use: Part of the Python standard library for parsing JSON data. JSON is
widely used for configuration files and data interchange in DevOps
pipelines.
• Use Cases: Reading and writing JSON files, integrating with APIs,
managing configuration data, and passing data between services.
import json
data = {'name': 'DevOps', 'language': 'Python'}
json_data = json.dumps(data)
print(json_data)
9. logging
• Use: Python's built-in library for logging, which is critical for tracking the
execution of scripts and diagnosing errors.
• Use Cases: Setting up logging in automation scripts to track execution,
failures, and system events.
import logging
logging.basicConfig(level=logging.INFO)
logging.info('This is an info message')
logging.error('This is an error message')
10. re (Regular Expressions)
• Use: Provides regular expression matching operations.
• Use Cases: Parsing log files, validating data, extracting information,
filtering logs, or automating tests.
import re

match = re.search(r"error", "Log file with an error message")
if match:
    print("Error found!")
11. pandas
• Use: A data analysis library, often used for working with structured data
such as CSV or Excel files.
• Use Cases: Analyzing logs, processing CSV files, and working with
structured data for DevOps automation (e.g., managing deployments,
configuration data).
import pandas as pd
df = pd.read_csv('data.csv')
print(df.head())
12. smtplib
• Use: A Python module for sending emails using the Simple Mail Transfer
Protocol (SMTP).
• Use Cases: Sending automated alerts, notifications, or reports as part of
the automation process.
import smtplib
from email.mime.text import MIMEText
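Continuing from the imports above, a minimal sketch of sending an alert email; the addresses and SMTP server are illustrative:
msg = MIMEText("Disk usage exceeded 90% on the build server.")  # example alert body
msg['Subject'] = "DevOps Automation Alert"
msg['From'] = "alerts@example.com"
msg['To'] = "devops-team@example.com"

# Send via a local or corporate SMTP relay (server address is illustrative)
with smtplib.SMTP('localhost') as server:
    server.send_message(msg)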
13. flask
• Use: A lightweight web framework for building small web services and APIs in Python.
• Use Cases: Creating internal API services, webhook receivers, and health or status endpoints for DevOps automation.
from flask import Flask, jsonify

app = Flask(__name__)

@app.route('/status', methods=['GET'])
def status():
    return jsonify({"status": "up and running"})

if __name__ == '__main__':
    app.run(debug=True)
14. Kubernetes Python Client
• Use: A Python client for interacting with Kubernetes clusters.
• Use Cases: Managing Kubernetes resources, scaling deployments,
monitoring pods, and integrating with CI/CD pipelines.
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()
pods = v1.list_pod_for_all_namespaces(watch=False)
for pod in pods.items:
    print(f"Pod Name: {pod.metadata.name}")
Project: Server Health Monitoring and Alerting System
Objective:
Automate the process of monitoring system health (CPU, memory, and disk
usage) and send email alerts if these exceed certain thresholds (e.g., CPU usage
> 80%, Memory usage > 75%).
Prerequisites:
• Python: Ensure Python is installed on your system (Python 3.x
recommended).
• Libraries: We'll need several Python libraries, mainly psutil for system
monitoring, smtplib for sending emails, and logging for logging events.
• Email: A Gmail or SMTP-compatible email account to send alerts (for Gmail, generate an app-specific password, since sign-in from "less secure apps" is no longer supported).
Workflow
1. Start Monitoring
o Begin the periodic health-check loop.
2. Collect System Metrics
o Use psutil to read the current CPU and memory usage.
3. Compare Against Thresholds
o Check whether CPU usage exceeds 80% or memory usage exceeds 75%.
4. Threshold Exceeded
▪ Log the event as a warning (e.g., high CPU or memory usage).
▪ Call the Send Alert function.
5. Send Alert via Email
o Use smtplib to send an email with system health details (CPU and
memory usage) to the designated recipient.
o Log success or failure of sending the alert.
6. Repeat Monitoring
o After sending an alert or completing the check, wait for the next
cycle (e.g., 5 minutes) and repeat the monitoring process.
7. Exit/Stop
o The monitoring continues indefinitely until manually stopped.
Step-by-Step Implementation
1. Set Up Python Environment
Install necessary Python libraries using pip:
pip install psutil
(smtplib and logging are part of Python's standard library, so they do not need to be installed separately.)
These libraries will allow you to:
• psutil: Access system resource stats (CPU, memory, disk usage).
• smtplib: Send emails via an SMTP server (like Gmail).
• logging: Log events related to system health checks.
2. Import Libraries and Define Functions
import psutil
import smtplib
from email.mime.text import MIMEText
import time
import logging
3. Configure Logging
You can use the logging module to keep track of events, such as when
thresholds are exceeded or when health checks occur. Set up logging to write
to a file:
logging.basicConfig(filename='server_health.log', level=logging.INFO,
format='%(asctime)s - %(levelname)s - %(message)s')
This log file will track all health check results and alert actions.
4. Create Functions to Monitor System Resources
We'll use psutil to monitor system resources such as CPU usage, memory
usage, and disk usage.
Function to get system health stats:
def get_system_health():
    cpu_usage = psutil.cpu_percent(interval=1)  # CPU usage percentage sampled over 1 second
    memory = psutil.virtual_memory()  # Memory stats (total, available, used, percent)
    memory_usage = memory.percent  # Memory usage percentage
    return cpu_usage, memory_usage
This function will return the CPU and memory usage.
5. Define the Alerting Function
Using smtplib and MIMEText, we can set up email alerts. This function will send
an email whenever a threshold is exceeded.
def send_alert(cpu_usage, memory_usage):
    # Email setup (replace the placeholder addresses with your own)
    sender = "monitoring@example.com"
    recipient = "devops-team@example.com"
    subject = "Server Health Alert"
    body = (f"Warning! The server has exceeded its thresholds.\n\n"
            f"CPU Usage: {cpu_usage}%\nMemory Usage: {memory_usage}%")

    msg = MIMEText(body)
    msg['Subject'] = subject
    msg['From'] = sender
    msg['To'] = recipient

    try:
        with smtplib.SMTP_SSL('smtp.gmail.com', 465) as server:
            server.login(sender, 'your_password')  # Use your Gmail app password
            server.sendmail(sender, [recipient], msg.as_string())
        logging.info("Alert sent successfully!")
    except Exception as e:
        logging.error(f"Failed to send alert: {e}")
6. Define the Threshold Check Function
Next, we'll define the function that checks if CPU or memory usage exceeds the
defined thresholds. If the usage exceeds these thresholds, we call the
send_alert() function.
def check_thresholds():
    cpu_usage, memory_usage = get_system_health()
    # Thresholds: CPU > 80%, Memory > 75%
    if cpu_usage > 80 or memory_usage > 75:
        logging.warning(f"Threshold exceeded! CPU: {cpu_usage}% | Memory: {memory_usage}%")
        send_alert(cpu_usage, memory_usage)
7. Define the Monitoring Loop
The monitor() function runs the check continuously, logs the current usage, and waits five minutes between cycles.
def monitor():
    while True:
        check_thresholds()
        logging.info(f"CPU: {psutil.cpu_percent()}% | Memory: {psutil.virtual_memory().percent}%")
        time.sleep(300)  # Check every 5 minutes
8. Running the Monitoring System
Now, we need to execute the monitoring function in the script's main section:
if __name__ == '__main__':
    monitor()
When this script runs, it will monitor the system’s health, check the usage of
resources, and send an email alert if the usage exceeds the specified
thresholds. It will also log all checks and alerts.
Additional Enhancements
1. Customizable Thresholds
Instead of hardcoding thresholds, consider making them configurable via a
settings file or environment variables. This will make the tool more flexible.
import os

# Read thresholds from environment variables (variable names and defaults are illustrative)
CPU_THRESHOLD = int(os.getenv("CPU_THRESHOLD", 80))
MEMORY_THRESHOLD = int(os.getenv("MEMORY_THRESHOLD", 75))
2. Monitor Disk Usage
Disk usage can be tracked alongside CPU and memory:
def get_disk_usage():
    disk = psutil.disk_usage('/')
    return disk.percent
3. Integration with Other Monitoring Tools
You can integrate this script with popular monitoring tools like Prometheus or
Grafana for more advanced visualizations and alerting.
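As a sketch of one integration path, the prometheus_client library can expose these metrics over HTTP for Prometheus to scrape; the port and metric names are illustrative:
import time
import psutil
from prometheus_client import Gauge, start_http_server

# Metric names are illustrative
cpu_gauge = Gauge('server_cpu_usage_percent', 'CPU usage percentage')
memory_gauge = Gauge('server_memory_usage_percent', 'Memory usage percentage')

if __name__ == '__main__':
    start_http_server(8000)  # Metrics exposed at http://localhost:8000/metrics
    while True:
        cpu_gauge.set(psutil.cpu_percent(interval=1))
        memory_gauge.set(psutil.virtual_memory().percent)
        time.sleep(15)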
4. Handling Edge Cases
Make sure your script handles any edge cases, such as when psutil cannot
retrieve data or the SMTP server fails.
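For example, the health check can be wrapped so that a failure to read metrics is logged instead of crashing the monitor; this is a sketch of one approach using the functions defined above:
def safe_get_system_health():
    try:
        return get_system_health()
    except Exception as e:
        # If psutil fails for any reason, log it and skip this cycle
        logging.error(f"Failed to read system metrics: {e}")
        return None

health = safe_get_system_health()
if health is not None:
    cpu_usage, memory_usage = health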
5. Notifications via Multiple Channels
Instead of just email, you could use APIs like Slack or Telegram to send alerts
for more timely notifications.
Conclusion
By implementing this Server Health Monitoring and Alerting System, you will:
• Gain experience using Python for system monitoring and automation.
• Learn how to interact with system APIs (e.g., CPU, memory) using psutil.
• Automate real-time alerts for system health, which is a key DevOps
responsibility.
• Understand how to implement a simple but powerful monitoring tool for
both personal and production environments.
This project is scalable, and as you gain more experience, you can extend it
with more features such as integrating with cloud platforms (AWS, GCP), using
Kubernetes, and handling different types of system alerts.
Conclusion
Python's power in DevOps automation is largely enhanced by these essential
libraries and modules. They offer a wide variety of tools for system
administration, automation, monitoring, cloud management, and data
processing. Understanding and using these modules enables DevOps engineers
to automate repetitive tasks, integrate with cloud services, monitor system
health, and build custom solutions for complex DevOps workflows.
Throughout this guide, we have explored how Python plays a crucial role in
DevOps automation and system administration. By delving into various use
cases such as CI/CD, infrastructure provisioning, cloud automation, and
monitoring, we have highlighted Python's versatility and power in automating
repetitive tasks and enhancing DevOps workflows.
Python's ease of use, extensive libraries, and seamless integration with popular
tools make it a valuable asset for DevOps engineers. Whether you're managing
cloud environments with Boto3, monitoring system health, or automating
deployment pipelines, Python provides the flexibility and reliability needed to
improve efficiency, reduce manual intervention, and ensure robust system
performance.
By learning the fundamental concepts, key Python modules, and real-world
implementation strategies, you can successfully integrate Python into your
DevOps processes. From basic scripting tasks to more advanced automation
like cloud management, monitoring, and continuous integration, Python
enables DevOps engineers to build and scale solutions efficiently.
As DevOps continues to evolve, mastering Python for automation not only
enhances your capabilities but also makes you a more competitive and
effective engineer in the rapidly growing DevOps field. By undertaking projects
such as the Server Health Monitoring and Alerting System, you gain hands-on
experience and deepen your understanding of practical automation in real-
world scenarios.
In conclusion, mastering Python for DevOps is an essential skill for any engineer
looking to automate workflows, streamline operations, and drive more
efficient, scalable systems. With the tools, techniques, and best practices
outlined in this guide, you’re well-equipped to embark on a rewarding journey
into Python-driven DevOps automation.