For a quick test of horizontal autoscaling of an NginX deployment, we need the following ingredients:

  • a metrics server installation, so that the pods' CPU usage can be measured
  • an nginx deployment with a CPU request
  • an autoscaler configuration
  • a process that causes high CPU load

This can be tested on: https://killercoda.com/playgrounds/scenario/kubernetes

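Install the Metrics Server (quick and insecure way)

The sed call below adds the --kubelet-insecure-tls argument to the metrics server container, which disables verification of the kubelet certificates. That is acceptable on a playground, but not in production. Here and in the following commands, k is shorthand for kubectl (for example via alias k=kubectl).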

curl -s -L https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml | sed 's_\(--cert-dir=/tmp\)_\1\n\ \ \ \ \ \ \ \ - --kubelet-insecure-tls_g' | k apply -f -
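
Before continuing, it can help to confirm that the metrics server is actually serving data, since the autoscaler reports <unknown> targets until it does. The following is a minimal sketch, assuming the upstream manifest installs a Deployment named metrics-server in the kube-system namespace; the function name check_metrics_server is made up for this example.

```shell
# Hypothetical helper: waits for the metrics-server rollout, then queries pod metrics.
# Assumes the Deployment is called "metrics-server" and lives in "kube-system".
check_metrics_server() {
  kubectl -n kube-system rollout status deployment/metrics-server --timeout=120s || return 1
  # "kubectl top" only returns data once the metrics API has collected samples;
  # this can take a minute after the rollout has finished.
  kubectl top pods -A
}
```

Calling check_metrics_server and seeing CPU and memory figures for the running pods suggests that the autoscaler will be able to read them as well.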

Create an NginX Deployment…

…with a CPU resource request of 0.1 (that is, 100 millicores)

k create deploy nginx --image=nginx --dry-run=client -o yaml | sed 's_\(resources: \){}_resources:\n\ \ \ \ \ \ \ \ \ \ requests:\n\ \ \ \ \ \ \ \ \ \ \ \ cpu: 0.1_' > nginx.deploy.yaml;
k apply -f nginx.deploy.yaml
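
As an alternative to patching the generated YAML with sed, the request could probably also be set on an already existing Deployment with kubectl set resources. This is only a sketch; the helper name set_cpu_request is hypothetical, and it assumes the Deployment has a single container.

```shell
# Hypothetical helper: sets a CPU request on an existing Deployment and prints it back.
# $1 = deployment name, $2 = CPU request (for example 100m, which equals 0.1 CPU).
set_cpu_request() {
  kubectl set resources deployment "$1" --requests="cpu=$2" || return 1
  # Read the value back from the first container of the pod template.
  kubectl get deployment "$1" \
    -o jsonpath='{.spec.template.spec.containers[0].resources.requests.cpu}{"\n"}'
}
```

For example, set_cpu_request nginx 100m after a plain k create deploy nginx --image=nginx. Without a CPU request, the autoscaler has nothing to relate the measured usage to and cannot compute a utilization percentage.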

Create a horizontal Autoscaler

kubectl autoscale deployment nginx --cpu-percent=50 --min=1 --max=4

Run a high CPU Process

k exec -it deploy/nginx -- bash -c 'while true; do echo -n . >/dev/null; done'
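
The loop above runs until it is interrupted with Ctrl-C. For unattended tests, a time-limited variant may be more convenient. The sketch below assumes that the container image provides the timeout and bash commands (the Debian-based nginx image should); the helper name burn_cpu is made up for this example.

```shell
# Hypothetical helper: burns CPU in one nginx pod for a fixed number of seconds.
# $1 = duration in seconds (defaults to 120).
# Assumes "timeout" and "bash" exist inside the container.
burn_cpu() {
  kubectl exec deploy/nginx -- timeout "${1:-120}" bash -c 'while true; do :; done'
}
```

Note that timeout ends the loop with a non-zero exit status, so a failing return code from burn_cpu 180 is expected once the time is up.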

Scaling up Result:

k get hpa --watch
NAME    REFERENCE          TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
nginx   Deployment/nginx   0%/50%     1         4         1          51s
nginx   Deployment/nginx   721%/50%   1         4         1          60s
nginx   Deployment/nginx   942%/50%   1         4         4          75s
nginx   Deployment/nginx   228%/50%   1         4         4          90s
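
The jump from 1 to 4 replicas follows from the documented autoscaler formula: desired replicas = ceil(current replicas * current utilization / target utilization), limited to the range between min and max. The utilization is measured relative to the CPU request, so 721% of 0.1 CPU is roughly 0.72 cores. The function below is a simplified sketch of that arithmetic; it ignores the tolerance band and pod readiness handling of the real controller, and the name desired_replicas is made up for this example.

```shell
# desired_replicas CURRENT USAGE_PCT TARGET_PCT MIN MAX
# Simplified model of the HPA formula, using integer arithmetic only.
desired_replicas() {
  local current="$1" usage="$2" target="$3" min="$4" max="$5"
  # Integer ceiling of current * usage / target.
  local desired=$(( (current * usage + target - 1) / target ))
  # Clamp the result to the configured minimum and maximum.
  if [ "$desired" -lt "$min" ]; then desired="$min"; fi
  if [ "$desired" -gt "$max" ]; then desired="$max"; fi
  echo "$desired"
}

# Values from the second line of the output above: 1 replica at 721% with a 50% target.
# The unclamped result would be 15, so the maximum of 4 applies.
desired_replicas 1 721 50 1 4
```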

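Scaling down Result:

After stopping the while loop, the load drops to 0%, but the replica count only falls back to 1 once the scale-down stabilization window (300 seconds by default) has passed: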

k get hpa --watch
NAME    REFERENCE          TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
nginx   Deployment/nginx   0%/50%    1         4         4          6m47s
nginx   Deployment/nginx   0%/50%    1         4         4          8m15s
nginx   Deployment/nginx   0%/50%    1         4         1          8m30s

Appendix A.1: Next Generation (v2) Autoscaling with Default Behavior

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  creationTimestamp: null
  name: nginx
spec:
  maxReplicas: 4
  minReplicas: 1
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15
      - type: Pods
        value: 4
        periodSeconds: 15
      selectPolicy: Max
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
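
With selectPolicy: Max, the scale-up policy that permits the largest change wins. The sketch below illustrates the upper bound for a single period under the two default policies (100 percent or 4 pods per 15 seconds). It is a simplified model rather than the controller's exact algorithm, and the function name scale_up_limit is hypothetical.

```shell
# scale_up_limit CURRENT PERCENT PODS
# Prints the highest replica count reachable within one period when the
# larger of a Percent policy and a Pods policy applies (selectPolicy: Max).
scale_up_limit() {
  local current="$1" percent="$2" pods="$3"
  # Increase allowed by the Percent policy, rounded up.
  local by_percent=$(( (current * percent + 99) / 100 ))
  local increase="$by_percent"
  if [ "$pods" -gt "$increase" ]; then increase="$pods"; fi
  echo $(( current + increase ))
}

# Starting from 1 replica, the Pods policy dominates and allows up to 5 replicas.
scale_up_limit 1 100 4
```

Because that bound of 5 is above maxReplicas: 4, the default behavior can go from 1 to 4 replicas in a single step, which matches the scale-up output shown earlier.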

Appendix A.2: Next Generation (v2) Autoscaling with Custom Behavior

The goals here are:

  • scale down quickly once the CPU load is no longer high (scaleDown stabilizationWindowSeconds set to 0)
  • scale up by one pod every 60 seconds (scaleUp policy of type Pods with value 1 and periodSeconds 60)

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  creationTimestamp: null
  name: nginx
spec:
  maxReplicas: 4
  minReplicas: 1
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 0
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
      - type: Pods
        value: 1
        periodSeconds: 60
      selectPolicy: Max
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
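
Neither manifest in this appendix contains a metrics section. According to the autoscaling/v2 API reference, the autoscaler then falls back to a default of 80% average CPU utilization. To keep the 50% target of the kubectl autoscale command, the metric can be stated explicitly. The following sketch prints such a manifest; the helper name hpa_manifest and its parameter are assumptions made for this example.

```shell
# Hypothetical helper: prints an autoscaling/v2 manifest for the nginx Deployment
# with an explicit CPU utilization target. $1 = target percentage (defaults to 50).
hpa_manifest() {
  cat <<EOF
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx
spec:
  minReplicas: 1
  maxReplicas: 4
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: ${1:-50}
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 0
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
      - type: Pods
        value: 1
        periodSeconds: 60
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
EOF
}
```

The output can be piped straight into the cluster, for example with hpa_manifest 50 | k apply -f -.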

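We can see that the number of replicas rises one by one from 1 to 4 while the load is high, and drops quickly once the CPU usage falls below the threshold. The target shows 80% instead of 50%, most likely because the manifest defines no metrics and the API default of 80% average CPU utilization applies: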

controlplane $ k get hpa --watch
NAME    REFERENCE          TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
nginx   Deployment/nginx   0%/80%     1         4         1          32m
nginx   Deployment/nginx   517%/80%   1         4         1          32m
nginx   Deployment/nginx   816%/80%   1         4         2          33m
nginx   Deployment/nginx   414%/80%   1         4         2          33m
nginx   Deployment/nginx   428%/80%   1         4         2          33m
nginx   Deployment/nginx   428%/80%   1         4         2          34m
nginx   Deployment/nginx   419%/80%   1         4         3          34m
nginx   Deployment/nginx   274%/80%   1         4         3          34m
nginx   Deployment/nginx   275%/80%   1         4         3          35m
nginx   Deployment/nginx   279%/80%   1         4         4          35m
nginx   Deployment/nginx   236%/80%   1         4         4          35m
nginx   Deployment/nginx   214%/80%   1         4         4          35m
nginx   Deployment/nginx   204%/80%   1         4         4          36m
nginx   Deployment/nginx   134%/80%   1         4         4          36m
nginx   Deployment/nginx   0%/80%     1         4         4          36m
nginx   Deployment/nginx   0%/80%     1         4         1          36m
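
The one-minute spacing between the replica changes can be reproduced with a small model of the Pods policy. This is an idealised sketch: it assumes the first step is allowed immediately and ignores the autoscaler's sync interval and pod start-up time. The function name scale_up_timeline is made up for this example.

```shell
# scale_up_timeline START DESIRED PODS_PER_PERIOD PERIOD_SECONDS
# Prints the earliest time (relative to the first scale-up) at which each
# replica count may be reached under a single Pods policy.
scale_up_timeline() {
  local replicas="$1" desired="$2" step="$3" period="$4"
  local elapsed=0
  while [ "$replicas" -lt "$desired" ]; do
    replicas=$(( replicas + step ))
    # Never exceed the desired replica count.
    if [ "$replicas" -gt "$desired" ]; then replicas="$desired"; fi
    echo "t>=${elapsed}s replicas=${replicas}"
    elapsed=$(( elapsed + period ))
  done
}

# Policy from the manifest above: 1 pod per 60 seconds, scaling from 1 to 4.
scale_up_timeline 1 4 1 60
```

Under these assumptions, the model reaches 4 replicas about two minutes after the first step, in line with the 33m, 34m and 35m ages in the output above.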

More Information:

Next Generation Autoscaling with Behavior Control