Storage upgrade mechanism #52185
Thanks, Daniel. Good summary. To echo:
Emphasis on removal. It sucks to carry old APIs forward, but this is a liability I think we can't ignore. We've gotten lucky this far. @kubernetes/sig-network-api-reviews @cmluciano @dcbw @danwinship @kubernetes/api-approvers @kubernetes/api-reviewers
For context, the OpenShift approach is that we force a migration before every upgrade starts, and the migration has to complete. The migration is a kubectl command that reads all objects and writes a no-op change (which forces storage turnover). We use this for protobuf migration, field defaulting, new versions, storage version conversion, self-link fixup, etc. It's basically a production version of upgrade-storage-objects.sh. Depending on how complicated we want to get, I'd say it's the minimum viable path, while the cluster-consensus approach is probably fine but more complicated.
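For readers unfamiliar with the pattern, here is a minimal sketch of such a no-op read/write migration, assuming kubectl access to the cluster; the resource list is illustrative, not part of any shipped tool:

```bash
#!/usr/bin/env bash
# Force storage turnover: read every object and write it back
# unchanged. The apiserver re-encodes each object at its current
# preferred storage version as it is persisted.
set -euo pipefail

# Illustrative resource list; a real migration would cover every
# resource whose storage encoding is changing.
for resource in networkpolicies.networking.k8s.io; do
  kubectl get "${resource}" --all-namespaces -o json \
    | kubectl replace -f -
done
```

`kubectl replace` carries the resourceVersion from the read, so a concurrent write fails the replace rather than being silently clobbered.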
Is this blocking the removal of Alpha APIs, too?
@smarterclayton Is that an upstream command?
@thockin I think it depends on how the alpha API is storing its objects.
Downstream, it is a bunch of utility code for a migration framework (we have other migrators for things like RBAC, alpha annotations to beta fields, image references) and then a set of simple commands. I don't know that I think of this as
@smarterclayton Gotcha. I will take a look at it, but that still doesn't sound like an ideal way to solve the problem. Like, it's good if you're going to run one or two of these things with super-well-trained humans doing the upgrade, but Kubernetes isn't installed like that :)
It's just part of general cluster upgrade operations. Someone has to roll the masters. Someone has to lay down new config. I agree there can be magic involved. But magic may have higher cost in some scenarios.
I have worked closely with OpenShift's installer/upgrade team on their use of the
Per conversation on sig-api-machinery, there was general consensus that it makes more sense to have this be a controller's responsibility, with some set of strategies.
We need to figure out downgrades as well, which are similar. Right now, a downgrade will strand the storage of any new APIs that were created.
Brand-new resources will get orphaned on downgrade, but should be inert. New versions of existing resources shouldn't start persisting with the new version until n+1 releases after they are introduced (or rolling HA apiserver upgrades won't work). If we only support single-version downgrade, there shouldn't be any issues.
Issues go stale after 90d of inactivity. Prevent issues from auto-closing with an /lifecycle frozen comment. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle frozen
Any status update here? We'd like to remove the deprecated v1beta1 NetworkPolicy.
I haven't seen a better suggestion than a no-op read/write cycle via the API on the affected resources. Enforcing that that has been done seems like the responsibility of the deployment mechanism, so it can be staged after all apiservers in an HA deployment are upgraded.
In the meantime, you can stop serving the deprecated version by default. The types still remain, so they can be read from storage, and we still have the in-tree debt, but it pushes people toward the new types as new clusters are deployed.
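As a sketch of what "not serving by default" means operationally, assuming the apiserver's --runtime-config flag and using NetworkPolicy's old extensions/v1beta1 home as the example (the exact granularity available has varied by release):

```bash
# Stop serving a deprecated group/version on the kube-apiserver.
# Objects already in etcd are untouched, and the Go types remain
# compiled in, so stored data can still be read and converted.
kube-apiserver \
  --runtime-config=extensions/v1beta1/networkpolicies=false \
  ...   # remaining apiserver flags unchanged
```

Clusters that still need the old version can flip it back on explicitly, which is what makes this a default change rather than a removal.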
Speaking of HA: today each apiserver has its own independent default storage versions, which are imposed immediately upon apiserver upgrade. That doesn't seem desirable.
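One stopgap for that skew, sketched here under the assumption that the (since-deprecated) kube-apiserver --storage-versions flag is available, is to pin storage versions explicitly during the rolling upgrade so old and new apiservers keep writing the same encoding; the group/versions below are illustrative:

```bash
# While old and new apiservers coexist, pin the new binaries to the
# storage versions the old binaries still write.
kube-apiserver \
  --storage-versions=extensions/v1beta1,apps/v1beta2 \
  ...   # remaining apiserver flags unchanged

# After every apiserver is upgraded, drop (or advance) the pin and run
# the no-op read/write migration so etcd catches up to the new version.
```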
@lavalamp @caesarxuchao If you aren't able to handle this issue, consider unassigning yourself and/or adding the help wanted label. 🤖 I am a bot run by vllry. 👩‍🔬
/remove-triage unresolved
long-term-issue (note to self)
/assign
This is marked as
/lifecycle frozen
The foundational pieces are being worked on here:
The #sig-api-machinery-storageversion-dev Slack channel is available as well.
Before we remove an API object version, we need to be certain that all stored objects have been upgraded to a version that will remain readable in the future. The old plan for that was to run the cluster/upgrade-storage-objects.sh script after each upgrade. Unfortunately, that approach has a number of problems.
Therefore, we need to design a robust solution for this problem, one which works in the face of HA installations, user-provided apiservers, rollbacks, patch version releases, etc. It probably won't look much like a script that you run after an upgrade; instead it may look more like a system where apiservers come to consensus on the desired storage version, plus a controller that does the job that script was supposed to do whenever the consensus changes.
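To make that division of labor concrete, here is a purely hypothetical sketch: the storage-versions record and its keys do not exist in any shipped API and are invented for illustration only.

```bash
# Hypothetical flow: each apiserver publishes the version it intends
# to write into a shared record; a controller watches that record and,
# when the consensus moves, replays stored objects through the API so
# etcd is re-encoded at the agreed version.
consensus=$(kubectl get configmap storage-versions -n kube-system \
  -o jsonpath='{.data.networkpolicies}')   # hypothetical record

if [ "${consensus}" = "networking.k8s.io/v1" ]; then
  kubectl get networkpolicies --all-namespaces -o json \
    | kubectl replace -f -                  # no-op write re-encodes
fi
```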
In the meantime, it is not safe to remove API object versions, so we are instituting a moratorium on API object version removal until this system is in place. (Deprecations are still fine.)