Luca Terracciano edited this page Dec 11, 2020 · 2 revisions

System Autoscaler CRD

The system will use two different Custom Resources to perform the autoscaling:

  • Service Level Agreement
  • PodScale

Service Level Agreement

Service Level Agreement is the CRD directly managed by the user. This resource sets a condition that lets the system decide whether to scale the pod resources and replicas up or down. The constraint can be specified on different metrics, e.g. application response time, throughput, etc.
In this first version the only supported metric is the response time. The resource looks like the following:

apiVersion: systemautoscaler.polimi.it/v1beta1
kind: ServiceLevelAgreement
metadata:
  name: example-sla
spec:
  metric:
    responseTime: 10
  defaultResources:
    cpu: 50m
    memory: 70Mi
  serviceSelector:
    matchLabels:
      app: example-app

Spec fields

  • Metric: specifies which metric the agreement is built on and its desired value. In the example, the agreement is on the response time and the desired value is 10 (milliseconds). This means that the system will try to keep the Services' response time below 10 milliseconds on average. The only supported metric at the moment is the response time.
  • DefaultResources: specifies a default set of resource requests, used when the corresponding field is empty in the spec of a Pod belonging to the Service.
  • ServiceSelector: specifies a selector so that the agreement matches only a specific subset of Services.
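
The three spec fields above can be illustrated with a small Python sketch. It is not the autoscaler's implementation: the class and function names are hypothetical, the selector is assumed to be a plain equality match on matchLabels, and the violation check is assumed to be a simple average over observed response times.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceLevelAgreement:
    """Hypothetical in-memory mirror of the ServiceLevelAgreement spec."""
    response_time_ms: float                            # spec.metric.responseTime
    default_resources: dict                            # spec.defaultResources
    match_labels: dict = field(default_factory=dict)   # spec.serviceSelector.matchLabels


def selects(sla, service_labels):
    """True when every matchLabels pair is present on the Service."""
    return all(service_labels.get(k) == v for k, v in sla.match_labels.items())


def effective_requests(sla, pod_requests):
    """Fill in requests that are missing or empty in the Pod spec with the SLA defaults."""
    provided = {k: v for k, v in pod_requests.items() if v}
    return {**sla.default_resources, **provided}


def violates(sla, observed_response_times_ms):
    """True when the average observed response time exceeds the agreed value."""
    if not observed_response_times_ms:
        return False  # no samples, nothing to judge (assumption)
    average = sum(observed_response_times_ms) / len(observed_response_times_ms)
    return average > sla.response_time_ms


# Values taken from the example-sla manifest above.
sla = ServiceLevelAgreement(
    response_time_ms=10,
    default_resources={"cpu": "50m", "memory": "70Mi"},
    match_labels={"app": "example-app"},
)

print(selects(sla, {"app": "example-app", "tier": "web"}))
print(effective_requests(sla, {"cpu": "100m"}))
print(violates(sla, [8, 12, 14]))
```

A Pod that already declares a cpu request keeps it and only receives the default memory request, which is how the DefaultResources description reads.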

PodScale

PodScale is a system-managed resource. It is created once a Pod has been matched by a Service Level Agreement. The resource looks like the following:

apiVersion: systemautoscaler.polimi.it/v1beta1
kind: PodScale
metadata:
  name: example-podscale
spec:
  serviceLevelAgreement:
    name: example-sla
    namespace: foo
  pod:
    name: example-sa
    namespace: foo
  desired:
    cpu: 40m
    memory: 70Mi
status:
  actual:
    cpu: 40m
    memory: 70Mi

Spec fields

  • ServiceLevelAgreement: contains the metadata of the ServiceLevelAgreement matched by the Pod's Service.
  • Pod: contains the metadata of the Pod to scale.
  • Desired: the ideal amount of resources to assign to a Pod. This amount corresponds to the one suggested by the Recommender component. It is only a desired value because it does not take into account the resources available on the node.
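
As a rough illustration of how the spec fields fit together, the following Python sketch assembles a PodScale manifest as a plain dictionary. The function name and the way the resource name is derived from the Pod name are assumptions made for this example; the field layout follows the manifest shown above.

```python
import json


def build_podscale(sla_name, sla_namespace, pod_name, pod_namespace, desired):
    """Assemble a PodScale manifest for a Pod matched by an SLA (illustrative only)."""
    return {
        "apiVersion": "systemautoscaler.polimi.it/v1beta1",
        "kind": "PodScale",
        # Hypothetical naming scheme: the source does not say how PodScales are named.
        "metadata": {"name": f"{pod_name}-podscale", "namespace": pod_namespace},
        "spec": {
            "serviceLevelAgreement": {"name": sla_name, "namespace": sla_namespace},
            "pod": {"name": pod_name, "namespace": pod_namespace},
            # Desired resources, as suggested by the Recommender.
            "desired": dict(desired),
        },
    }


podscale = build_podscale(
    "example-sla", "foo", "example-sa", "foo", {"cpu": "40m", "memory": "70Mi"}
)
print(json.dumps(podscale, indent=2))
```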

Status fields

  • Actual: the amount of resources to be assigned to the Pod. The amount is decided by the Contention Manager component by taking into account all the Desired resources of the pods running on a specific node. Whenever it changes, the Pod Resource Updater will eventually update the pod resources accordingly.
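
The page does not describe the policy the Contention Manager uses to turn Desired into Actual. The sketch below assumes one possible policy, proportional scaling: if the Desired amounts on a node fit within its capacity they are granted as they are, otherwise each pod receives a share proportional to its request. All function names are hypothetical, and the quantity parsers only handle the simple forms used in the examples on this page.

```python
def parse_cpu_millicores(quantity):
    """'40m' -> 40, '2' -> 2000. Only these two forms are handled."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(float(quantity) * 1000)


_MEMORY_FACTORS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def parse_memory_bytes(quantity):
    """'70Mi' -> 73400320. Only Ki/Mi/Gi suffixes and plain byte counts are handled."""
    for suffix, factor in _MEMORY_FACTORS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)


def resolve_contention(desired, capacity):
    """Map {pod: desired amount} to {pod: actual amount} for one resource on one node.

    Assumed policy: grant everything when it fits, otherwise scale proportionally.
    """
    total = sum(desired.values())
    if total <= capacity:
        return dict(desired)
    return {pod: amount * capacity // total for pod, amount in desired.items()}


desired_cpu = {
    "pod-a": parse_cpu_millicores("400m"),
    "pod-b": parse_cpu_millicores("600m"),
}
print(resolve_contention(desired_cpu, capacity=2000))  # fits: unchanged
print(resolve_contention(desired_cpu, capacity=500))   # contended: scaled down
```

With integer division the granted amounts never add up to more than the capacity, at the cost of leaving a few units unassigned in some cases.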

Other resources

ReplicaSet

The ReplicaSet object is an existing Kubernetes resource, the same one used by the HPA for horizontal scaling. By editing this object, it is possible to manage the number of pod replicas. In practice, we only need to modify the replica count in the Deployment/StatefulSet resource, and the ReplicaSet Controller will spawn the correct number of pods.
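
Changing the replica count amounts to updating spec.replicas on the workload. As a minimal sketch, the Python helper below (a hypothetical name, not part of the project) builds the JSON merge-patch body for that field; how the patch is sent to the API server is left out.

```python
import json


def replicas_patch(replicas):
    """Return a JSON merge-patch body that sets spec.replicas."""
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    return json.dumps({"spec": {"replicas": replicas}})


print(replicas_patch(3))
```

The resulting body could be applied, for example, with `kubectl patch deployment <name> --type merge -p '<body>'`.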
