Autoscaling Kubernetes Apps With Prometheus and KEDA
Scalability is a key requirement for cloud native applications. With Kubernetes, scaling your application is as simple as increasing the number of replicas for the corresponding Deployment or ReplicaSet, but this is a manual process. Kubernetes makes it possible to automatically scale your applications (i.e. Pods in a Deployment or ReplicaSet) in a declarative manner using the Horizontal Pod Autoscaler specification. By default, there is support for using CPU utilization (Resource metrics) as the criterion for auto-scaling, but it is also possible to integrate custom as well as externally provided metrics.
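For reference, a minimal CPU-based Horizontal Pod Autoscaler looks like this (a sketch with illustrative names, not part of this tutorial's setup):

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 1
  maxReplicas: 10
  metrics:
  # scale out when average CPU utilization across Pods exceeds 50%
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50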
This blog will demonstrate how you can use external metrics to auto-scale a Kubernetes application. For demonstration purposes, we will use HTTP access request metrics that are exposed using Prometheus. Instead of using the Horizontal Pod Autoscaler directly, we will leverage Kubernetes Event Driven Autoscaling, aka KEDA, an open source Kubernetes operator which integrates natively with the Horizontal Pod Autoscaler to provide fine-grained autoscaling (including to/from zero) for event-driven workloads.
I would love to have your feedback and suggestions! Feel free to tweet or drop a
comment 😃
Overview
Here is a summary of how things work end to end; each piece is discussed in detail below:
The Prometheus scaler in KEDA is configured and deployed to auto-scale the app based on the HTTP access count metric.
KEDA supports the concept of Scalers, which act as a bridge between KEDA and an external system such as Prometheus. A Scaler fetches relevant data from its target system, which is then used by KEDA to help drive auto-scaling. There is support for multiple scalers (including Kafka, Redis, etc.), Prometheus among them. This means that you can leverage KEDA to auto-scale your Kubernetes Deployments using Prometheus metrics.
Sample application
The example Golang app exposes an HTTP endpoint and does two important things:
It uses the Prometheus Go client library to instrument the app and expose the http_requests metric, backed by a Counter. The Prometheus metrics endpoint is available at /metrics.
It increments a Redis counter (access_count) every time the /test endpoint is accessed, as the code below shows.
package main

import (
	"fmt"
	"net/http"
	"os"
	"strconv"
	"time"

	"github.com/go-redis/redis"
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// http_requests counter, registered with the default Prometheus registry
var httpRequestsCounter = promauto.NewCounter(prometheus.CounterOpts{
	Name: "http_requests",
	Help: "number of http requests",
})

// Redis key holding the shared access count (checked later in the tutorial)
const redisCounterName = "access_count"

// Redis client setup is assumed here; the address would come from the app's config
var client = redis.NewClient(&redis.Options{Addr: os.Getenv("REDIS_HOST")})

func main() {
	// expose Prometheus metrics at /metrics
	http.Handle("/metrics", promhttp.Handler())

	http.HandleFunc("/test", func(w http.ResponseWriter, r *http.Request) {
		// count every request to /test
		defer httpRequestsCounter.Inc()

		// increment the shared access counter in Redis
		count, err := client.Incr(redisCounterName).Result()
		if err != nil {
			fmt.Println("Unable to increment redis counter", err)
			os.Exit(1)
		}

		resp := "Accessed on " + time.Now().String() + "\nAccess count " + strconv.Itoa(int(count))
		w.Write([]byte(resp))
	})

	http.ListenAndServe(":8080", nil)
}
Prometheus server
The Prometheus deployment manifest consists of RBAC resources (a ClusterRole, ServiceAccount, and ClusterRoleBinding), a ConfigMap with the scrape configuration, a Deployment for the Prometheus server, and a Service to access it; you can see all of these in the kubectl output further below.
KEDA ScaledObject
A ScaledObject maps a Deployment to an event source, along with metadata on the event source (e.g. connection string secret, queue name), polling interval, cooldown period, etc. The ScaledObject will result in a corresponding autoscaling resource (an HPA definition) to scale the Deployment.
When a ScaledObject gets deleted, the corresponding HPA definition is cleaned up.
Here is the ScaledObject definition for our example, which uses the Prometheus scaler:
apiVersion: keda.k8s.io/v1alpha1
kind: ScaledObject
metadata:
  name: prometheus-scaledobject
  namespace: default
  labels:
    deploymentName: go-prom-app
spec:
  scaleTargetRef:
    deploymentName: go-prom-app
  pollingInterval: 15
  cooldownPeriod: 30
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus-service.default.svc.cluster.local:9090
      metricName: access_frequency
      threshold: '3'
      query: sum(rate(http_requests[2m]))
Note the following:
The query sum(rate(http_requests[2m])) defines the metric to be used for the scaling decision, and threshold sets the target value per replica.
As per pollingInterval, KEDA will poll the Prometheus target every fifteen seconds.
A minimum of one Pod will be maintained (minReplicaCount), and the number of Pods will not exceed maxReplicaCount (ten in this example).
It is possible to set minReplicaCount to zero. In this case, KEDA will "activate" the deployment from zero to one and then leave it to the HPA to auto-scale it further (the same is done the other way around, i.e. scale in from one to zero). We haven't chosen zero since this is an HTTP service and not an on-demand system such as a message queue/topic consumer.
A single Pod will remain as long as the value for sum(rate(http_requests[2m])) stays below three. If it goes up, there will be an additional Pod for every increase of three in sum(rate(http_requests[2m])), e.g. if the value is between 12 and 14, the number of Pods will be 4.
Pre-requisites
All you need is a Kubernetes cluster and kubectl
Kubernetes cluster: this example uses minikube, but feel free to use any other. You can install minikube using this guide.
Setup
Install KEDA
You can deploy KEDA in multiple ways as per the documentation. I am simply using a monolithic YAML file to get the job done:
kubectl apply -f https://raw.githubusercontent.com/kedacore/keda/master/deploy/KedaScaleController.yaml
To confirm, wait for the KEDA operator Pod to start (Running state) before you proceed.
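The operator runs in the keda namespace (which is also what we delete during cleanup), so one way to check on it is:

kubectl get pods -n keda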
Install Redis
The sample app uses Redis to store the access count, and we install it with Helm. helm init initializes the local Helm CLI and also installs Tiller into your Kubernetes cluster.
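A sketch of the install, assuming the stable/redis Helm chart (the release name redis-server matches the cleanup command at the end of this tutorial):

helm init
helm install --name redis-server stable/redis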
Wait for the Redis server Pod to start (Running state) before you proceed.
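Next, deploy the sample app and its Service; the manifest file name here is illustrative, so use the one from your copy of the example:

kubectl apply -f go-app.yaml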
//output
deployment.apps/go-prom-app created
service/go-prom-app-service created
Wait for the application Pod to start (Running state) before you proceed.
The scrape configuration in the Prometheus ConfigMap (prom-conf) uses Kubernetes service discovery to find the app's Service by its run label:
kubernetes_sd_configs:
- role: service
relabel_configs:
- source_labels: [__meta_kubernetes_service_label_run]
  regex: go-prom-app-service
  action: keep
To deploy:
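Assuming the Prometheus manifest is saved as prometheus.yaml (an illustrative file name):

kubectl apply -f prometheus.yaml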
//output
clusterrole.rbac.authorization.k8s.io/prometheus created
serviceaccount/default configured
clusterrolebinding.rbac.authorization.k8s.io/prometheus created
configmap/prom-conf created
deployment.extensions/prometheus-deployment created
service/prometheus-service created
Wait for the Prometheus server Pod to start (Running state) before you proceed.
Check the application Pod; it should have exactly one instance running, since minReplicaCount was set to 1.
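For example (this lists the Pods in the default namespace; look for the go-prom-app Pod):

kubectl get pods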
Autoscaling in action…
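The curl commands below assume you have port-forwarded the app and Prometheus Services to localhost (the Service ports are assumed to match the container ports):

kubectl port-forward service/go-prom-app-service 8080:8080
kubectl port-forward service/prometheus-service 9090:9090

Now access the application endpoint: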
curl http://localhost:8080/test
At this point, check Redis as well. You will see that the access_count key has been incremented to 1:
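A sketch of the check, assuming the Redis Pod created by the chart is named redis-server-master-0 (adjust to your Pod's actual name):

kubectl exec -it redis-server-master-0 -- redis-cli get access_count
//output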
"1"
Confirm that the http_requests metric count is also the same.
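One way to check it, given the /metrics endpoint and the port-forward above:

curl http://localhost:8080/metrics | grep http_requests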
Generate load
We will use hey, a utility program to generate load. Invoke it like so:
./hey http://localhost:8080/test
By default, the utility sends 200 requests. You should be able to confirm this using the Prometheus metrics as well as Redis.
curl -g 'http://localhost:9090/api/v1/query?query=sum(rate(http_requests[2m]))'
//output
{"status":"success","data":{"resultType":"vector","result":[{"metric":{},"value":[1571734214.228,"1.686057971014493"]}]}}
In this case, the actual result is 1.686057971014493 (in value). This is not enough to cross the threshold of 3, so we stay at a single Pod.
Moar load!
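As a sketch, you can push the rate well past the threshold by increasing the request count with hey's -n flag (the exact number is up to you):

./hey -n 2000 http://localhost:8080/test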
In a new terminal, keep track of the application Pods.
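One way is kubectl's watch flag:

kubectl get pods -w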
After some time, you will see that the Deployment is scaled out by the HPA and new Pods are spun up.
If the load does not persist, the Deployment will be scaled back down to the point where only a single Pod is running.
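Since the ScaledObject results in a regular HPA under the hood (as noted earlier), you can also inspect it directly:

kubectl get hpa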
If you check the actual metric (returned by the PromQL query) using
curl -g 'http://localhost:9090/api/v1/query?query=sum(rate(http_requests[2m]))'
you will see it rise as load is applied and fall back once the load stops, with the Pod count tracking it.
To clean up
//Delete KEDA
kubectl delete namespace keda
//Delete Redis
helm del --purge redis-server
Conclusion
KEDA allows you to auto-scale your Kubernetes Deployments (to/from zero) based on data from external metrics such as Prometheus metrics, queue length in Redis, consumer lag of a Kafka topic, etc. It does all the heavy lifting of integrating with the external source, as well as exposing its metrics via a metrics server for the Horizontal Pod Autoscaler to weave its magic!
That’s all for this blog. Also, if you found this article useful, please like and follow 😃