Ability to rotate pods when marked with an annotation or label #148

Open
aallawala opened this issue May 23, 2024 · 4 comments
Comments

@aallawala
Would there happen to be a way for the rollout-operator to act on pods when a given label or annotation is applied to them?

The Kubernetes cluster that our Mimir statefulsets run on has a 15-day node cycling enforcement. Because the PDB allows only 1 disruption at a time across the entire fleet of a statefulset (for example, ingesters in zones 1, 2, and 3), cycling all nodes in a Kubernetes cluster can take up to a week.

This means that the system is left exposed to any upstream failures that the cloud provider may experience while the node cycling is occurring.

The cycling mechanism can either utilize a PDB or apply an annotation to the Mimir statefulset pods to have another system cycle the pods onto a new node.

The question I have is whether the rollout-operator can act on these annotations to safely delete the pods at a quicker rate than the 1-by-1 pace of the current PDB enforcement. This should be possible because more than 1 pod in a zone can be deleted as long as all other zones are still up.
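To make the safety condition above concrete, here is a minimal Go sketch of the decision, not anything the rollout-operator implements today. The `Pod` type is a simplified stand-in for a Kubernetes pod, and the annotation key `example.com/rotate-requested` is a hypothetical name chosen only for illustration. The sketch assumes that marked pods in a zone may be deleted together only when every pod in every other zone is ready.

```go
package main

import (
	"fmt"
	"sort"
)

// rotateAnnotation is a hypothetical annotation key used for illustration;
// the rollout-operator does not define it.
const rotateAnnotation = "example.com/rotate-requested"

// Pod is a simplified stand-in for a Kubernetes pod.
type Pod struct {
	Name        string
	Zone        string
	Ready       bool
	Annotations map[string]string
}

// deletablePods returns the names of marked pods that can be deleted together.
// It picks the first zone (in alphabetical order) that contains marked pods and
// for which every pod in every other zone is ready. It returns nil if no zone
// qualifies.
func deletablePods(pods []Pod) []string {
	marked := map[string][]string{} // zone -> names of marked pods
	unreadyZones := map[string]bool{}
	for _, p := range pods {
		if !p.Ready {
			unreadyZones[p.Zone] = true
		}
		if p.Annotations[rotateAnnotation] == "true" {
			marked[p.Zone] = append(marked[p.Zone], p.Name)
		}
	}

	zones := make([]string, 0, len(marked))
	for z := range marked {
		zones = append(zones, z)
	}
	sort.Strings(zones)

	for _, z := range zones {
		safe := true
		for u := range unreadyZones {
			if u != z {
				// Another zone is already disrupted, so this zone must wait.
				safe = false
				break
			}
		}
		if safe {
			names := marked[z]
			sort.Strings(names)
			return names
		}
	}
	return nil
}

func main() {
	mark := map[string]string{rotateAnnotation: "true"}
	pods := []Pod{
		{Name: "ingester-zone-a-0", Zone: "a", Ready: true, Annotations: mark},
		{Name: "ingester-zone-a-1", Zone: "a", Ready: true, Annotations: mark},
		{Name: "ingester-zone-b-0", Zone: "b", Ready: true, Annotations: mark},
		{Name: "ingester-zone-c-0", Zone: "c", Ready: true},
	}
	fmt.Println(deletablePods(pods))
}
```

In this example both marked pods in zone `a` are selected in one pass, while the marked pod in zone `b` waits until zone `a` is healthy again.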

@pracucci
Collaborator

> apply an annotation to the Mimir statefulset pods to have another system cycle the pods onto a new node

As a workaround, have you considered modifying a pod spec annotation (in the StatefulSet) to trigger a rollout?

@aallawala
Author

aallawala commented Jun 21, 2024

> As a workaround, have you considered modifying a pod spec annotation (in the StatefulSet) to trigger a rollout?

@pracucci, this is not a bad idea. I suspect it is akin to simply deploying a change to the statefulset itself (a change to the STS for the rollout-operator to roll out).

I'd like to not need to update all pods, if I can get away with it. It's unnecessary churn for the ingester pods and they may end up getting rescheduled on a K8s node that's set to expire soon thereafter.

Is there appetite for this feature? I've seen this in other rollout operators such as this, where an annotation can be added and the rollout operator acts on the pod if all conditions are right.

@pracucci
Collaborator

To my understanding, you would like to set an annotation on an individual pod and have the rollout-operator take care of rescheduling it (in practice, deleting the pod) when it's safe to do so. If my understanding is correct, I think this is a bit out of the scope of rollout-operator, because the operator is designed to coordinate statefulsets (essentially, the idea of the operator is to bypass the built-in statefulset rolling update strategy and build a custom one).

That being said, this doesn't stop you from trying it in a fork to see how it works for you. Building the rollout-operator image is easy, so you can check whether it solves your problem. If it does, then we can see how invasive the changes are and make an informed decision.

@aallawala
Author

@pracucci, I'd like to add that if this is viewed as a way to bypass the statefulset rolling update strategy, then it seems like a graceful deletion is also within scope of an operator like this. I could be wrong though!

I'll take a stab at adding an entry point for a graceful delete and tag you once I'm ready. Thanks for all the input!

aallawala changed the title from "Ability to rotate pods when marked with an annotation" to "Ability to rotate pods when marked with an annotation or label" on Jul 20, 2024