
Re: How to recover from failed update in OpenShift 4.2.x?

On Nov 17, 2019, at 9:34 PM, Joel Pearson <japearson agiledigital com au> wrote:

So, I'm running OpenShift 4.2 on Azure UPI following this blog article: https://blog.openshift.com/openshift-4-1-upi-environment-deployment-on-microsoft-azure-cloud/ with a few customisations on the terraform side.

One of the main differences, it seems, is how the router/ingress is handled. A normal Azure install uses load balancers, but UPI Azure uses a regular router (the kind I'm used to seeing in 3.x), which is configured by setting the endpoint publishing strategy to "HostNetwork".

This sounds like a bug in Azure UPI. IPI is the reference architecture; UPI shouldn’t have a default that diverges from it.

It was all working fine in OpenShift 4.2.0 and 4.2.2, but when I upgraded to OpenShift 4.2.4 the router stopped listening on ports 80 and 443. I could see the pod running with "crictl ps", but "netstat -tpln" didn't show anything listening.
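
The symptom described above (router pod running, nothing listening on 80 or 443) can also be checked from outside the node. The following is only a minimal sketch using the Python standard library; the function names and the default port list are illustrative assumptions, not part of any OpenShift tooling.

```python
import socket


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_router_ports(host, ports=(80, 443)):
    """Map each port to whether something is accepting connections on it."""
    return {port: is_listening(host, port) for port in ports}
```

Calling `check_router_ports` with the address of a node that should be running the router returns a dictionary keyed by port, so a `False` for 80 or 443 would match what "netstat -tpln" showed here.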

I tried moving the version back from 4.2.4 to 4.2.2, but I accidentally used the 4.1.22 image digest value, so I quickly reverted to 4.2.4 once I saw the apiservers coming up as 4.1.22. I then noticed that there was a 4.2.7 release on the candidate-4.2 channel, so I switched to that, and ingress started working properly again.
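
A mix-up like the 4.1.22 one above could be caught by comparing version numbers before an update is applied. This is a hedged sketch only: the helper names are hypothetical, it assumes plain dotted numeric versions such as "4.2.4", and it does not look up or validate image digests.

```python
def parse_version(text):
    """Turn '4.2.4' into (4, 2, 4) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def is_rollback(current, target):
    """Return True if target is an older version than current."""
    return parse_version(target) < parse_version(current)


def check_update(current, target):
    """Refuse a target that would move the cluster to an older version."""
    if is_rollback(current, target):
        raise ValueError(
            "refusing to go from %s to older version %s" % (current, target)
        )
    return target
```

Comparing integer tuples rather than raw strings matters because a string comparison would order "4.2.10" before "4.2.4".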

So my question is: what is the strategy for recovering from a failed update? Do I need to have etcd backups and then restore the cluster by restoring etcd, i.e. https://docs.openshift.com/container-platform/4.2/backup_and_restore/disaster_recovery/scenario-2-restoring-cluster-state.html

The upgrade page specifically says "Reverting your cluster to a previous version, or a rollback, is not supported. Only upgrading to a newer version is supported." So is the expectation for a production cluster that you would restore from backup if the cluster isn't usable?

Backup, yes.  If you could open a bug for the documentation that would be great.

Maybe the upgrade page should mention taking backups, especially since there is no rollback option?