
KEP-4563: EvictionRequest API (fka Evacuation) #4565


Open · wants to merge 14 commits into master from evacuation-api

Conversation

atiratree
Member

@atiratree atiratree commented Mar 28, 2024

  • One-line PR description: introduce EvictionRequest API to allow managed graceful pod removal

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory labels Mar 28, 2024
@k8s-ci-robot k8s-ci-robot requested review from kow3ns and soltysh March 28, 2024 15:05
@k8s-ci-robot k8s-ci-robot added sig/apps Categorizes an issue or PR as relevant to SIG Apps. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Mar 28, 2024
Comment on lines 228 to 245
We will introduce a new term called evacuation. This is a contract between the evacuation instigator,
the evacuee, and the evacuator. The contract is enforced by the API and an evacuation controller.
We can think of evacuation as a managed and safer alternative to eviction.
Watch out for the risk of confusing end users.

We already have preemption and eviction and people confuse the two. Or three, because there are two kinds of eviction. And there's disruption in the mix.

Do we want to rename Scheduling, Preemption and Eviction to Scheduling, Preemption, Evacuation and Eviction?

Member Author

Good idea, I added a mention of which kind of eviction I mean here.

Do we want to rename Scheduling, Preemption and Eviction to Scheduling, Preemption, Evacuation and Eviction?

Yes, I think we want to add a new concept there and generally update the docs once we have an alpha.

Member

I'm with Tim here.

Preemption vs Eviction is already quite confusing. And TBH, I couldn't fully understand what the "evacuation" is supposed to solve by reading the summary or motivation.

From Goals:

Changes to the eviction API to support the evacuation process.

If this is already going to be part of the Eviction API, maybe it should be named as a form of eviction. Something like "cooperative eviction" or "eviction with ack" or something along those lines?


I'm all for framing it as another type of eviction; we already have two, so the extra cognitive load for users is not so much a problem.

Member Author

@atiratree atiratree May 8, 2024

@alculquicondor I have updated the summary and goals, I hope it makes more sense now.

I think the name should make the most sense to the person creating the Evacuation (Evacuation Instigator ATM). So CooperativeEviction or EvictionWithAck is a bit misleading IMO, because from that person's perspective there is no additional step required of them. Only the evacuators and the evacuation controller implement the cooperative evacuation process, but this is hidden from the normal user.

My suggestions:

  • GracefulEviction (might confuse people if it is associated with graceful pod termination, which it is not)
  • SafeEviction (*safer than the API-initiated one for some pods)
  • Or just call it Eviction? And tell people to use it instead of the Eviction API endpoint? This might be a bit confusing (at least in the beginning)


We can bikeshed names for the API kind; I'd throw a few of my own into the hat:

  • EvictionRequest
  • PodEvictionRequest

Member Author

@atiratree atiratree Dec 3, 2024

I have renamed the API to EvictionRequest to make the term recognizable. A minor disadvantage is that we have to clarify what type of eviction we mean when we say evict (API-initiated eviction, or EvictionRequest).

The rest of the renames are as follows:

Evacuation (noun) -> EvictionRequest / Eviction Process
evacuation (verb) -> request an eviction / terminate / evict / process eviction
Evacuator -> Interceptor
Evacuee -> Pod
Evacuator Class -> Interceptor Class
Evacuation Instigator -> Eviction Requester
Evacuation Controller -> Eviction Request Controller
ActiveEvacuatorClass -> ActiveInterceptorClass
ActiveEvacuatorCompleted -> ActiveInterceptorCompleted
EvacuationProgressTimestamp -> ProgressTimestamp
ExpectedEvacuationFinishTime -> ExpectedInterceptorFinishTime
EvacuationCancellationPolicy -> EvictionRequestCancellationPolicy
FailedEvictionCounter -> FailedAPIEvictionCounter
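
For readers skimming the rename, here is a rough Go sketch of how the renamed fields might sit in the EvictionRequest spec/status. This is only an illustration assembled from the names above (types are simplified so the snippet stands alone); the KEP text remains the authoritative schema, and the placement of `Priority` and `ProgressDeadlineSeconds` in the spec is my assumption.

```go
// Illustrative sketch only: field names are taken from the rename list above,
// types are simplified (plain Go types instead of metav1 types) so the example
// compiles on its own. The KEP text is the authoritative schema.
package sketch

import "time"

type EvictionRequestSpec struct {
	// Interceptors registered for the pod, in priority order
	// (assumption: the KEP defines the exact shape of this list).
	Interceptors []Interceptor
	// ProgressDeadlineSeconds bounds how long the active interceptor may go
	// without reporting progress (600-21600 per the doc comment quoted later
	// in this thread); assumed to live in the spec since the requester sets it.
	ProgressDeadlineSeconds int32
}

type Interceptor struct {
	InterceptorClass string // formerly Evacuator Class
	Priority         int32  // ordering between interceptors (assumption)
}

type EvictionRequestStatus struct {
	ActiveInterceptorClass            string    // formerly ActiveEvacuatorClass
	ActiveInterceptorCompleted        bool      // formerly ActiveEvacuatorCompleted
	ProgressTimestamp                 time.Time // formerly EvacuationProgressTimestamp
	ExpectedInterceptorFinishTime     time.Time // formerly ExpectedEvacuationFinishTime
	EvictionRequestCancellationPolicy string    // formerly EvacuationCancellationPolicy
	FailedAPIEvictionCounter          int32     // formerly FailedEvictionCounter
}
```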

Comment on lines 1130 to 1302
<!--
What other approaches did you consider, and why did you rule them out? These do
not need to be as detailed as the proposal, but should include enough
information to express the idea and why it was not acceptable.
-->

We have a number of examples of having a SomethingRequest or SomethingClaim API that then causes a something (certificate signing, node provisioning, etc).
Think of TokenRequest (a subresource for a ServiceAccount), or CertificateSigningRequest.

I would confidently and strongly prefer to have an EvictionRequest or PodEvictionRequest API, rather than an Evacuation API kind.

It's easy to teach that we have evictions and that an EvictionRequest is asking for one to happen; it's hard to teach the difference between an eviction and an evacuation.

As a side effect, this makes the feature gate easier to name (eg PodEvictionRequests).

Member Author

As you have mentioned, we already have different kinds of eviction. So I think it would be good to use a completely new term to distinguish it from the others.

Also, Evacuation does not always result in eviction (and PDB consultation). It depends on the controller/workload. For some workloads like DaemonSets and static pods, API eviction has never worked before. This could also be very confusing if we name it the same way.

I think Evacuation fits this better because

  1. The name is shorter. If we go with EvacuationRequest then the evacuation will become just an abstract term and less recognizable.
  2. It seems it will have quite a lot of functionality included (state synchronization between multiple instigators and multiple evacuators, state of the evacuee and evacuation). TokenRequest and CertificateSigningRequest are simpler and not involved in a complex process.


I suggested EvictionRequest so that we don't have to have a section with the (too long) title: Scheduling, Preemption, Evacuation and Eviction. Not EvacuationRequest.

Adding another term doesn't scale so well: it means helping n people understand the difference between evacuation and eviction. It's a scaling challenge where n is not only large, it probably includes quite a few Kubernetes maintainers.


As for CertificateSigningRequest being simple: I don't buy it. There are three controllers, custom signers, an integration with trust bundles, the horrors of ASN.1 and X.509… trust me, it's complicated enough.

Member Author

I understand that it will be confusing for people, but that will happen regardless of which term we use.

My main issue is that evacuation does not directly translate to eviction. Therefore, I think it would be preferable to choose a new term (not necessarily evacuation).

I would like to get additional opinions from people about this. And we will definitely have to come back to this in the API review.

Member Author

Should be resolved now: #4565 (comment)

@atiratree atiratree force-pushed the evacuation-api branch 8 times, most recently from 13611ce to 2d15b79 on March 28, 2024 20:48
@atiratree
Member Author

  • I have updated the KEP to include support for multiple evacuators
  • Evacuators can now advertise which pods they are able to evacuate, even before the evacuation. The advantage of this approach is that we can trigger an eviction immediately without a delay (previously known as acceptDeadlineSeconds) if we do not find an evacuator. I have added a bunch of restrictions to ensure the API cannot be misused.
  • Clarified how the Evacuation objects should be named and how the admission should work in general (also for pods). This will ensure a 1:1 mapping between pods and Evacuation.
  • Removed the ability to add a full reference of the evacuator because it would be a hassle to synchronize, considering the evacuator leader election and multiple evacuators in play.


Example evacuation triggers:
- Node maintenance controller: node maintenance triggered by an admin.
- Descheduler: descheduling triggered by a descheduling rule.

If the descheduler requests an eviction, what thing is being evacuated?

(the node maintenance and cluster autoscaler examples are easier: you're evacuating an entire node)

Member Author

A single pod or multiple pods. The descheduler can use it as a new mechanism instead of eviction.


OK, so people typically think of “evacuate” as a near synonym of “drain” - you drain a node, you evacuate a rack or zone full of servers. Saying that you can evacuate a Pod might make people think its containers all get stopped, or just confuse readers. We do need to smooth out how we can teach this.

Member Author

It seems it can be used in both scenarios https://www.merriam-webster.com/grammar/can-you-evacuate-people.
Evacuation of containers doesn't make sense because they are tied to the pod lifecycle. But, I guess it could be confusing if we do not make it explicitly clear what we are targeting.

@sftim sftim Jun 3, 2024

Thing is, Kubernetes users typically - before we accept this KEP - use “evacuate” as a synonym for drain.

I'm (still) fine with the API idea, and still concerned about the naming.

Contributor

Just to +1 the potential confusion of the term "evacuation".
Is it okay to have a glossary of terms for "evacuation", "eviction", and "drain" (or any other potentially confusing terms) added somewhere in this KEP?

Member Author

I can include it in the KEP. And yes, we are going to change the name to something more suitable.

Member

"To evacuate a person" implies "get them out of trouble, to safety" as opposed to "to empty" (as in vacuum). It's not ENTRIELY wrong in this context, but it's not entirely right either.

Member Author

I will change the name to EvictionRequest. Evacuation was originally chosen to distinguish it from eviction, but there is value in staying close to existing concepts.

Member Author

The API is renamed to EvictionRequest now, see #4565 (comment) for more details

@atiratree atiratree force-pushed the evacuation-api branch 6 times, most recently from 1900020 to 085672a on April 10, 2024 21:47
See some practical use cases for this feature:
1. Ability to upscale first before terminating the pods with a Deployment: [Deployment Pod Surge Example](#deployment-pod-surge-example)
based on the [EvictionRequest Process](#evictionrequest-process).
2. Ability to upscale first before terminating the pods with HPA: [HorizontalPodAutoscaler Pod Surge Example](#horizontalpodautoscaler-pod-surge-example)


I guess dragging the PDB along to have minAvailable == HPA's current target is pretty hacky, hence the need for a signal that is not blocked by a PDB.

Member Author

Yes, there are a bunch of problems here: user friendliness, atomicity of actions, and responding to any descheduling (e.g. node drain).


Is the lack of atomicity covered anywhere further in this document? It was something that came to my mind while reading this too

Member Author

Good idea, added to the motivation section.

@txomon

txomon commented Dec 10, 2024

Not sure if it's relevant to the conversation; however, I would expect that after this change, user scenario 2 is solved without the user doing anything, as the maxSurge parameter and the fact that it's a rolling-update Deployment are already specified in the Deployment spec.

@atiratree
Member Author

Yes, the implementation of the follow-ups is only outlined here; there should be additional KEPs for each improvement. And the majority of the things proposed should result in an immediate benefit without any user action.

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: atiratree
Once this PR has been reviewed and has the lgtm label, please assign soltysh for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@atiratree
Member Author

We will revisit this KEP as part of the new Node Lifecycle WG: kubernetes/community#8396

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 13, 2025
@atiratree
Member Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 15, 2025
@atiratree
Member Author

FYI: this is being reviewed by the Node Lifecycle WG (see Agenda/Recording https://docs.google.com/document/d/1LSSfiJatBYX7dhLTowYygDO6MK0K-NZ_L52bEZfcZqU) and we will continue next Monday.

@JoelSpeed JoelSpeed left a comment

I'm a little concerned about why the pod side of this API is implemented as a set of annotations, rather than something structured within the pod spec. There's a lot of complexity in the API, and a lot of complexity around how the annotations are formatted, where, I believe at least, you'd save a bunch of that complexity by just creating a first-class API.

For example, there's this part about the priorities, and not allowing third parties to interleave their priorities with the core controller actors. If there were more structure, different groups of actors could be prioritised relative to each other (group priority and actor priority), and then that issue would surely be resolved, not just for the core controller group, but also for other third-party implementations that have multiple controllers reconciling.

Was a first class pod API ever considered?
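
To make the suggestion concrete, here is a rough sketch (my own invention for illustration, not anything in the KEP) of what a structured pod-side field with separate group and actor priorities could look like:

```go
// Hypothetical illustration of the suggestion above; the KEP currently uses
// pod annotations, and none of these field names exist anywhere.
package sketch

// PodEvictionInterception would live on the pod as a structured field
// instead of annotations, giving the registration a real schema.
type PodEvictionInterception struct {
	Interceptors []RegisteredInterceptor
}

type RegisteredInterceptor struct {
	InterceptorClass string
	// GroupPriority orders groups of actors (e.g. core controllers vs a
	// third-party project), so third parties cannot accidentally interleave
	// with the core controller group.
	GroupPriority int32
	// ActorPriority orders actors within a single group.
	ActorPriority int32
}
```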

to edit the PDB to account for the additional pod that needs to be spun to move the workload
from one node to another. This has been discussed in issue [kubernetes/kubernetes#66811](https://github.com/kubernetes/kubernetes/issues/66811)
and in issue [kubernetes/kubernetes#114877](https://github.com/kubernetes/kubernetes/issues/114877).
2. Similar to the first point, it is difficult to use PDBs for applications that can have a variable


Would it not make sense to recommend that users of HPA set their minimum to some number greater than 1 when they are using PDBs, which would avoid this issue, no?


Comment on lines +306 to +307
Any pod can be the subject of an eviction request. There can be multiple interceptors for a single
pod, and they should all advertise which pods they are responsible for. Normally, the


I assume that the various interceptors are expected to have no prior knowledge of each other, and, should expect there to be no ordering/priority between them? We don't want to create dependencies between them right?

Member Author

Yes, the design should not presume any ordering as it is difficult to predict all the use cases. This should mostly be left to the ecosystem.

  1. We do not want to handle the dependencies and would like to leave this to the ecosystem. If one project has knowledge of another and would like to preempt it, it should be possible. So the projects can dynamically set the priority.
  2. We expect the core controllers (Deployment, HPA, etc.) to be aware of each other and to resolve the priorities/ordering accordingly.


So the projects can dynamically set the priority.

This would be up to the cluster administrator to set, and to be aware of the ordering requirements, no?

Member Author

It would be, if a custom behavior is desired, but I would expect the components/projects to set reasonable defaults.


If an unknowing cluster administrator were to install two systems simultaneously, and they had multiple components with default priorities, then they could just interleave accidentally? Feels like that is asking for trouble.

Member Author

@atiratree atiratree Aug 11, 2025

Yeah, it is. What are our alternatives though?

Should we hard fail when something tries to add an interceptor to a pod and another component occupies the priority?

That would increase the need for conflict resolution, and almost everyone would have to solve it. On the other hand, if we allow conflicts, most of the time nothing bad would happen (taking into account that we allow controllers to have their own interval). Interceptors should expect that they can be preempted, e.g. running an A+B interceptor vs a B+A interceptor.
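
To spell out the trade-off, a tiny sketch (the map-based registry and names are assumptions for illustration, not the KEP's annotation format) of "hard fail on an occupied priority" versus "allow the conflict and let the newcomer preempt":

```go
// Hypothetical helper illustrating the two conflict-handling options above;
// the real mechanism uses pod annotations whose format the KEP defines.
package sketch

import "fmt"

// registerInterceptor either rejects a registration whose priority is already
// taken (strict == true) or lets the newcomer overwrite it (strict == false),
// in which case the displaced interceptor must tolerate being preempted.
func registerInterceptor(byPriority map[int32]string, priority int32, class string, strict bool) error {
	if existing, ok := byPriority[priority]; ok && existing != class && strict {
		return fmt.Errorf("priority %d is already occupied by %s", priority, existing)
	}
	byPriority[priority] = class
	return nil
}
```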

Comment on lines 311 to 312
1. It can reject the eviction request and wait for the pod to be intercepted by another interceptor
or evicted by the eviction request controller.


Why would an interceptor decide to defer the decision to someone else?

Member Author

This just lists all the options. It could happen if there was a race between the interceptor annotation removal and the eviction.

Comment on lines 964 to 965
- To prevent misuse, we will maintain a list of allowed `*.k8s.io` interceptor classes. And reject
any classes outside the main Kubernetes project on admission.


So third parties can't create interceptors?

Member Author

This was poorly worded. They can, just not with the .k8s.io suffix.

Comment on lines +975 to +978
`.spec.interceptors` is only set by the Eviction Requester and during the EvictionRequest object
create admission. We do not allow subsequent changes to this field to ensure the predictability of
the eviction request process. Also, late registration of the interceptor could go unnoticed and be
preempted by the eviction request controller, resulting in the premature eviction of the pod.


Does this mean that the eviction requestor must have prior knowledge of many other interceptors? I would have expected the interceptors that matter would self register (kind of like a PDB does), rather than requiring some pre-existing knowledge

Member Author

Yes, the interceptors would self-register, but on a pod beforehand. Eviction requesters should have an easy way of creating EvictionRequests without having to track the available interceptors.
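
A small sketch of that flow as I read it; the assumption (mine, for illustration) is that create admission copies the interceptors already registered on the pod into `.spec.interceptors`, so the requester never has to enumerate them:

```go
// Sketch only: the copy-on-admission behavior and the data shapes here are
// simplified assumptions used to illustrate the flow, not the KEP's API.
package sketch

type EvictionRequestSpec struct {
	PodName      string
	Interceptors []string
}

// admitCreate fills in .spec.interceptors from the interceptor classes that
// self-registered on the pod beforehand, so an eviction requester can create
// an EvictionRequest that names only the pod.
func admitCreate(spec *EvictionRequestSpec, podRegisteredInterceptors []string) {
	if len(spec.Interceptors) == 0 {
		spec.Interceptors = append(spec.Interceptors, podRegisteredInterceptors...)
	}
}
```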

12. Actor A updates the EvictionRequest status and ensures that
`.status.evictionRequestCancellationPolicy=Allow`
13. Actor A deletes the p-1 pod.
14. EvictionRequest is garbage collected once the pods terminate even with the descheduling


You can't remove an object with a finalizer present, so this statement reads oddly right now. What do you actually expect to happen here? Something is removing that finalizer, no?

Member Author

Yes, the finalizer should be removed by the Eviction Request Controller GC:

For convenience, we will also remove requester finalizers with
`evictionrequest.coordination.k8s.io/` prefix when the eviction request task is complete (points 2
and 3). Other finalizers will still block deletion.
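
A minimal sketch of that GC step, assuming a simple prefix match (the concrete requester-finalizer prefix is the one quoted above; everything else here is simplified):

```go
// Sketch of the garbage-collection behavior described above: strip only the
// requester-owned finalizers once the eviction request task is complete, and
// keep every other finalizer so it continues to block deletion.
package sketch

import "strings"

func pruneRequesterFinalizers(finalizers []string, requesterPrefix string) []string {
	kept := make([]string, 0, len(finalizers))
	for _, f := range finalizers {
		if strings.HasPrefix(f, requesterPrefix) {
			continue // requester finalizer, removed by the controller's GC
		}
		kept = append(kept, f) // anything else still blocks deletion
	}
	return kept
}
```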

already exists. It sets the
`requester.evictionrequest.coordination.k8s.io/name_descheduling.avalanche.io` finalizer on the
EvictionRequest.
4. The eviction request controller designates Actor B as the next interceptor by updating


Who are Actor A and B and how do they relate to the nodemaintenance and descheduler?

Member Author

please see:

Let's assume there is a single pod p-1 of application P with interceptors A and B:


Yes, I did read that, but I still don't get the relation. Are these (do these/should these) interceptors tied at all to NodeMaintenance/Descheduling, or are they completely independent of those concepts and just actors who are knowledgeable about a particular pod and how it should be removed?

Member Author

Yes, they are completely independent and not tied to NodeMaintenance/Descheduling. I have updated the intro to reflect this.

Comment on lines +1219 to +1230
5. The deployment controller creates a set of surge pods C to compensate for the future loss of
availability of pods B. The new pods are created by temporarily surging the `.spec.replicas`
count of the underlying replica sets up to the value of deployments `maxSurge`.
6. Pods C are scheduled on a new schedulable node that is not under the node drain.
7. Pods C become available.
8. The deployment controller scales down the surging replica sets back to their original value.
9. The deployment controller sets `ActiveInterceptorCompleted=true` on the eviction requests of
pods B that are ready to be deleted.
10. The eviction request controller designates the replica set controller as the next interceptor by
updating `.status.activeInterceptorClass`.
11. The replica set controller deletes the pods to which an EvictionRequest object has been
assigned, preserving the availability of the application.


This sounds like it will interfere with the normal operation of deployments/replicasets? Would scaling down the replicasets not just delete/initiate other pods to be removed, and maybe, the pod in question?

Member Author

It would. However, ReplicaSets controlled by Deployments should only be scaled by the Deployments during normal operation.


Have we consulted with sig-apps about the implications of this? It sounds like a fairly large change to the way deployments work

Member Author

Yes, I have presented the extensions to Deployment, etc. when discussing the EvictionRequest KEP in sig-apps. However, the discussion was mostly focused on EvictionRequest.

There are a number of open sig-apps issues that could benefit from these solutions for which there are no alternatives.

@atiratree
Member Author

@JoelSpeed thanks a lot for the thorough review. These are valuable observations, and perhaps we should consider making more of the API first class. I will also try to resolve your remarks regarding the current API (e.g. spec/status).

// ProgressDeadlineSeconds, the eviction request is passed over to the next interceptor with the
// highest priority. If there is none, the pod is evicted using the Eviction API.
//
// The minimum value is 600 (10m) and the maximum value is 21600 (6h).
Member

Is 6 hours enough for all operations? I can think of operations in systems such as Database Orchestration that take far longer than 6 hours. I can also think of times where an EvictionRequest is intentionally deprioritised for even days in large long running systems.

I think we should not set a maximum value here. I appreciate that opens up the potential for users to pick problematic values, but I also think it's a case of "if you break it, you buy it" when you set it to a very large number or one so high it doesn't have an effect.

Member Author

This is not the total time of the operation. It is only the maximum amount of time allotted to the controller to provide updates on said operation. The maximum is useful to ensure that controllers do not start the operation and forget about it. There must be an active entity.

Please see #4565 (comment) for more details

Also, this value is set by the eviction requester (e.g., node drain) and not the interceptor. The interceptor has to comply with the minimum update period (expected to be 10 minutes).
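
A minimal sketch of the described behavior, using the field names from the rename list earlier in the thread (the wiring is simplified; treat it as an illustration, not the controller's actual code):

```go
// Sketch of the progress-deadline handling described above: if the active
// interceptor has not reported progress within ProgressDeadlineSeconds, the
// controller hands the request to the next interceptor, or falls back to the
// Eviction API when none remain.
package sketch

import "time"

func nextAction(progressTimestamp time.Time, progressDeadlineSeconds int32, remainingInterceptorClasses []string, now time.Time) string {
	deadline := progressTimestamp.Add(time.Duration(progressDeadlineSeconds) * time.Second)
	if now.Before(deadline) {
		return "wait" // the active interceptor is still within its progress window
	}
	if len(remainingInterceptorClasses) > 0 {
		return "hand over to " + remainingInterceptorClasses[0] // next-highest priority
	}
	return "evict via the Eviction API" // no interceptor left to try
}
```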

Member

Moved to that thread.

@atiratree atiratree force-pushed the evacuation-api branch 6 times, most recently from fb00a38 to 79bf3b6 on August 1, 2025 18:26
@atiratree atiratree force-pushed the evacuation-api branch 2 times, most recently from f0d9681 to 8fc4f0d on August 11, 2025 13:19