If a pod is still being created for a statefulset, the downscale webhook rejects any resource change that would reduce the statefulset's replica count:
level=error ts=2023-07-18T01:23:33.406992007Z name=ingester-zone-a resource=statefulsets namespace=mimir-dev-11 request_gvk="apps/v1, Kind=StatefulSet" old_replicas=225 new_replicas=5 msg="downscale not allowed due to error" err="Post "http://ingester-zone-a-218.ingester-zone-a.mimir-dev-11.svc.cluster.local:80/ingester/prepare-shutdown": dial tcp: lookup ingester-zone-a-218.ingester-zone-a.mimir-dev-11.svc.cluster.local on 10.188.0.10:53: no such host"
This was discovered when an HPA was scaling up too aggressively. When we tried to revert the change that caused the aggressive scale-up, the downscale webhook rejected the revert because the statefulset was still upscaling.
jhalterman changed the title from "Downscale webhook fails if any pod is being created" to "Downscale webhook fails when currently upscaling" on Jul 18, 2023
We could ignore "no such host" errors when performing this check since that implies the machine wasn't running in the first place. This might not be a perfect solution, but an improvement at least.
Did the prepare-shutdown call eventually succeed once the pod started?
Yes, it would succeed for a pod eventually, but in this scenario the HPA was regularly creating new pods, so then the same error would be hit on a new pod the next time a resource change was attempted.