SW Architecture - Lecture - 03 - Lab
and Grafana
Objective
- Understand Availability Principles: Learn the key principles of availability in software architecture, particularly focusing on redundancy, fault detection, and monitoring.
- Implement Tactics for Availability: Set up and test an architecture with redundancy to ensure continuous service, and use local tools to monitor availability and service performance without relying on paid cloud solutions.
Prerequisites
- Docker and Docker Compose installed, to run and manage containerized applications.
- Familiarity with Docker, Prometheus, Grafana, and either Python or Node.js, to write and deploy the application code.
- The Python requests library (pip install requests), which the heartbeat script imports.
Lab Steps
Instructions:
- Write a basic server application using Python’s Flask library (or Node.js with Express if
preferred).
- Build a Docker image from this application, allowing it to run consistently across instances.
- Use Docker Compose to launch two instances of this server, each on a different port, to
mimic redundancy.
Code Examples:
```python
from flask import Flask

app = Flask(__name__)


@app.route('/')
def home():
    return "Server is running and available!"


if __name__ == "__main__":
    # Bind to all interfaces so the port published by Docker is reachable.
    app.run(host="0.0.0.0", port=5000)
```
Dockerfile:
```dockerfile
FROM python:3.8-slim
WORKDIR /app
COPY server.py /app
RUN pip install flask
CMD ["python", "server.py"]
```
Docker Compose:
```yaml
version: '3'
services:
  server1:
    build: .
    ports:
      - "5001:5000"
  server2:
    build: .
    ports:
      - "5002:5000"
```
Outcome: Two server instances, accessible on ports 5001 and 5002, simulate redundancy.
Both instances serve the same application, so the service stays reachable on the other port if one instance fails.
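Because this setup has no load balancer, a client has to switch instances itself. The following is a minimal sketch of client-side failover, assuming the two ports from the Compose file; the helper names `http_get` and `fetch_with_failover` are hypothetical and not part of the lab code.

```python
from typing import Callable, Iterable, Optional
from urllib.request import urlopen


def http_get(url: str, timeout: float = 2.0) -> str:
    """Fetch a URL and return the response body as text."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def fetch_with_failover(
    servers: Iterable[str],
    fetch: Callable[[str], str] = http_get,
) -> Optional[str]:
    """Return the first successful response, or None if every server fails."""
    for server in servers:
        try:
            return fetch(server)
        except OSError as error:
            # URLError and socket timeouts are OSError subclasses.
            print(f"{server} unavailable ({error}); trying next instance")
    return None
```

Calling `fetch_with_failover(["http://localhost:5001", "http://localhost:5002"])` tries the first instance and falls back to the second only when the first cannot be reached.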
Instructions:
- Write a Python script that sends a request to each server instance every 10 seconds.
- Log responses to monitor which servers are up or down, simulating a health-check
process.
```python
import time

import requests

servers = ["http://localhost:5001", "http://localhost:5002"]


def check_servers():
    for server in servers:
        try:
            # A timeout keeps one unresponsive server from stalling the loop.
            response = requests.get(server, timeout=5)
            if response.status_code == 200:
                print(f"{server} is UP")
            else:
                print(f"{server} returned error code: {response.status_code}")
        except requests.exceptions.RequestException:
            print(f"{server} is DOWN")


while True:
    check_servers()
    time.sleep(10)
```
Outcome: A real-time log showing the status of each server, demonstrating fault detection in
a simple yet effective way.
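The heartbeat results can also be summarized as an availability figure per server. The sketch below assumes each check is recorded as a `(server, is_up)` pair; that record shape and the function name `availability_by_server` are illustrative assumptions, not something the heartbeat script produces as written.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Assumed record shape: (server URL, True if the check succeeded).
Sample = Tuple[str, bool]


def availability_by_server(samples: Iterable[Sample]) -> Dict[str, float]:
    """Return the fraction of successful checks for each server."""
    up_counts = defaultdict(int)
    total_counts = defaultdict(int)
    for server, is_up in samples:
        total_counts[server] += 1
        if is_up:
            up_counts[server] += 1
    return {
        server: up_counts[server] / total_counts[server]
        for server in total_counts
    }
```

With one check every 10 seconds, each sample stands for roughly 10 seconds of observed time, so the ratio is only as precise as the heartbeat interval allows.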
Instructions:
- Configure Prometheus to monitor the servers with a scrape interval of 15 seconds.
- Use Docker Compose to run Prometheus and Grafana locally for monitoring and
visualization.
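```yaml
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: 'servers'
    static_configs:
      # From inside the Prometheus container, localhost is that container
      # itself, so the servers are addressed by Compose service name and
      # container port. Prometheus scrapes the /metrics path by default.
      - targets: ['server1:5000', 'server2:5000']
```
Docker Compose:
```yaml
  # Add under `services:` in docker-compose.yml
  grafana:
    image: grafana/grafana
    ports:
      - "3000:3000"
```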
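Prometheus scrapes the `/metrics` path of each target by default and expects its text exposition format, while the Flask server above only defines `/`. The sketch below shows what a single gauge looks like in that format; the function name `render_gauge` and the metric name `server_up` are hypothetical, and in practice the official `prometheus_client` library is the usual way to produce this output.

```python
def render_gauge(name: str, help_text: str, value: float) -> str:
    """Render one gauge metric in the Prometheus text exposition format."""
    return (
        f"# HELP {name} {help_text}\n"
        f"# TYPE {name} gauge\n"
        f"{name} {value}\n"
    )


if __name__ == "__main__":
    # Hypothetical metric: 1 while the server process is running.
    print(render_gauge("server_up", "Whether the server is running", 1), end="")
```

A server could return this string from a `/metrics` route with a plain-text content type so that the scrape succeeds.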
Instructions:
- Run the two server instances using Docker.
- Simulate a failure by stopping one container (docker stop <container-id>) and monitor the
system’s response.
Expected Outcome: The remaining server should continue serving requests, as recorded by
the heartbeat script and visualized on the Grafana dashboard, demonstrating effective
redundancy.
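The benefit of redundancy can be estimated with the standard formula for parallel components: the service is down only when every instance is down at once. The sketch below assumes instance failures are independent, which is optimistic for two containers on the same host; the function name `combined_availability` is illustrative.

```python
from math import prod
from typing import Iterable


def combined_availability(instance_availabilities: Iterable[float]) -> float:
    """Availability of redundant instances, assuming independent failures.

    The service is unavailable only if all instances are unavailable,
    so A = 1 - (1 - a1) * (1 - a2) * ... * (1 - an).
    """
    return 1.0 - prod(1.0 - a for a in instance_availabilities)
```

For example, two instances that are each available 99% of the time give a combined availability of about 99.99% under this assumption.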
Instructions:
- Update docker-compose.yml with a restart: always policy for each server instance to
ensure automatic recovery in case of failure.
```yaml
version: '3'
services:
  server1:
    build: .
    ports:
      - "5001:5000"
    restart: always
  server2:
    build: .
    ports:
      - "5002:5000"
    restart: always
```
Test Self-Healing:
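Outcome: Docker automatically restarts a server container whose process exits unexpectedly, demonstrating self-healing as a fault-recovery tactic. A container stopped manually with docker stop is not restarted by the policy until it is started again or the Docker daemon restarts.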
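Recovery time can be read off the heartbeat data by pairing each first failed check with the next successful one. The sketch below assumes time-ordered `(timestamp, is_up)` samples for a single server; that record shape and the function name `downtime_intervals` are hypothetical.

```python
from typing import Iterable, List, Tuple


def downtime_intervals(
    samples: Iterable[Tuple[float, bool]],
) -> List[Tuple[float, float]]:
    """Return (first_down, next_up) timestamp pairs from ordered samples."""
    intervals = []
    down_since = None
    for timestamp, is_up in samples:
        if not is_up and down_since is None:
            # First failed check of a new outage.
            down_since = timestamp
        elif is_up and down_since is not None:
            # First successful check after an outage closes the interval.
            intervals.append((down_since, timestamp))
            down_since = None
    # An outage still open at the end of the data is not reported.
    return intervals
```

The measured recovery time is bounded by the heartbeat interval: with checks every 10 seconds, an outage shorter than that may not appear at all.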
Deliverables
- Server Code: Provide the source code for the server and Dockerfile.
- Docker Compose File: Submit the updated docker-compose.yml with Prometheus and Grafana configurations.
- Heartbeat Script Log: Include the log showing server availability over time.
- Grafana Dashboard Screenshot: Show visualized metrics for server uptime and fault detection.
- Documentation: Briefly document your observations during failover and self-healing tests, noting how the system maintained availability.
Evaluation Criteria
- Code Accuracy: Correctness in server setup, heartbeat functionality, and monitoring configuration.
- Implementation Success: Effective use of Docker for redundancy and self-healing.
- Documentation and Visualization: Quality of Grafana dashboard and clear documentation of redundancy and self-healing behavior.