Re: Master/ETCD Migration

On Tue, Dec 13, 2016 at 12:37 PM, Diego Castro <diego castro getupcloud com> wrote:
Thanks John, it's very helpful.
Looking over the playbook code, it seems to replace all certificates and trigger node evacuation to update the CA in all pods. I definitely don't want that!

It should only do that when openshift_certificates_redeploy_ca is set to True, otherwise it should just redeploy certificates on the masters. 

There is also a PR for splitting out the certificate redeploy playbooks to allow for more flexibility when running: https://github.com/openshift/openshift-ansible/pull/2671

- ETCD won't be a problem, since I can replace the certs, migrate the datadir and restart the masters.

We don't currently support automated resizing or migration of etcd, but this approach should work just fine.

That said, one *could* do the following:
- Add the new etcd hosts to the inventory
- Run Ansible against the hosts (I suspect it will fail on service startup)
- Add the newly provisioned etcd hosts manually to the cluster using etcdctl
- If the Ansible run above failed, re-run Ansible to finish landing the etcd config change
- Remove the old etcd hosts from the etcd cluster using etcdctl
- Update the inventory to remove the old etcd hosts
- Run Ansible to remove the old etcd hosts from the master configs
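
The etcdctl part of the steps above could be sketched roughly as follows. This is only an illustration that builds the command lines in order and does not run anything. It assumes etcdctl v2-style syntax (`member add <name> <peerURL>`, `member remove <memberID>`, `cluster-health`); the member names, peer URLs and member IDs are hypothetical placeholders, and TLS and endpoint options are left out.

```python
from typing import Dict, List


def etcd_migration_plan(new_members: Dict[str, str],
                        old_member_ids: List[str]) -> List[str]:
    """Build (but do not run) an ordered plan for swapping etcd members.

    new_members maps a new member name to its peer URL.
    old_member_ids lists the IDs to retire, as shown by `etcdctl member list`.
    Lines starting with '#' are manual steps, not commands.
    Assumes etcdctl v2-style argument order; adjust for other versions.
    """
    plan: List[str] = []
    # Grow the cluster one member at a time so quorum is never at risk.
    for name, peer_url in new_members.items():
        plan.append(f"etcdctl member add {name} {peer_url}")
        plan.append(f"# bring etcd up on {name} (e.g. re-run Ansible) before continuing")
        plan.append("etcdctl cluster-health")
    # Only shrink the cluster once every new member has been added.
    for member_id in old_member_ids:
        plan.append(f"etcdctl member remove {member_id}")
    plan.append("etcdctl member list")
    return plan


if __name__ == "__main__":
    # Hypothetical host name and member ID, for illustration only.
    for step in etcd_migration_plan(
        {"etcd-new-1": "https://etcd-new-1.example.com:2380"},
        ["6e3bd23ae5f1eae0"],
    ):
        print(step)
```

Adding one member at a time and checking health before the next change keeps the cluster from losing quorum while it is temporarily larger than usual.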

- The masters are a big issue, since I had to change the public cluster hostname.

Indeed, but there shouldn't be a huge disruption from doing a rolling update of the master services to land the new certificate. The controllers service will migrate (possibly multiple times), but that should be mostly transparent to running apps and users.

Diego Castro / The CloudFather
GetupCloud.com - Eliminamos a Gravidade

2016-12-13 11:17 GMT-03:00 Skarbek, John <John Skarbek ca com>:


We’ve done a similar thing in our environment. I’m not sure if the openshift-ansible guys have a better way, but this is what we did at that time.

We created a custom playbook to run through all the steps as necessary. Due to the version of openshift-ansible we were running, we had to be careful when working on whichever server was at index 0 in the array of hosts. (I think they have resolved that problem now.)

First we created a play that copied the necessary certificates to all the nodes, so that it didn’t matter which node was at index 0 of the list of nodes. Then we had the playbook limited to operate on one node at a time, which dealt with tearing it down. Then we’d run the deploy on the entire cluster. For the new node, everything was installed as necessary. For the rest of the cluster it was mostly a no-op. We use static addresses, so the only thing that really changed was the underlying host. Certificate regeneration was limited.

For the master nodes, this was pretty easy. For the etcd nodes, we had to do a bit of extra work, as the nodes being added to the cluster had different member IDs than what the cluster thought those nodes ought to have. Following etcd’s docs on Member Migration should help you out here.
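
One way to find the stale member ID for a rebuilt host is to look it up by peer URL in the `etcdctl member list` output. The sketch below is not what we ran; it assumes etcdctl v2-style output lines of the form `<id>: name=<name> peerURLs=<url> clientURLs=<url>`, and the sample host names and IDs are made up.

```python
import re
from typing import Dict, Optional

# Assumed line shape (etcdctl v2 style), e.g.:
#   6e3bd23ae5f1eae0: name=node2 peerURLs=http://host:2380 clientURLs=http://host:2379
# The pattern also tolerates an optional bracketed tag right after the ID.
_MEMBER_LINE = re.compile(r"^(?P<id>[0-9a-f]+)(?:\[\w+\])?:\s+(?P<fields>.*)$")


def parse_member_list(output: str) -> Dict[str, Dict[str, str]]:
    """Map member ID -> {field name: value} from `etcdctl member list` text."""
    members: Dict[str, Dict[str, str]] = {}
    for line in output.splitlines():
        match = _MEMBER_LINE.match(line.strip())
        if not match:
            continue  # skip blank or unrecognised lines
        fields = dict(
            part.split("=", 1)
            for part in match.group("fields").split()
            if "=" in part
        )
        members[match.group("id")] = fields
    return members


def find_member_id(members: Dict[str, Dict[str, str]],
                   peer_url: str) -> Optional[str]:
    """Return the ID of the member advertising peer_url, or None."""
    for member_id, fields in members.items():
        if peer_url in fields.get("peerURLs", "").split(","):
            return member_id
    return None


if __name__ == "__main__":
    sample = (
        "6e3bd23ae5f1eae0: name=node2 peerURLs=http://10.0.1.11:2380 "
        "clientURLs=http://10.0.1.11:2379\n"
        "a8266ecf031671f3: name=node1 peerURLs=http://10.0.1.10:2380 "
        "clientURLs=http://10.0.1.10:2379\n"
    )
    print(find_member_id(parse_member_list(sample), "http://10.0.1.10:2380"))
```

The ID found this way is the one the cluster still associates with the old host, so it is the one to remove before the rebuilt host is added back as a new member.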

The only major part we had to be careful of was doing the work on the node that was going to be the first node. Due to the way the playbooks operated, that node held a lot of config and certificate details that would get copied around. If they’ve addressed this, it shouldn’t be an issue, but at the time, we got around this by simply adjusting the order in which nodes were defined in our inventory file.

A wee bit laborious, but definitely doable.

In our case, we didn’t experience any downtime: the master nodes cycled through the haproxy box appropriately, and the etcd nodes were removed and added to the cluster without any major headaches.

Though I’m now more curious whether the team at Red Hat working on openshift-ansible has addressed any of these sorts of issues to make it easier.

John Skarbek

On December 13, 2016 at 08:35:54, Diego Castro (diego castro getupcloud com) wrote:

Hello, I have to migrate my production HA masters/etcd servers to new boxes.

Steps Intended:

1) Create new master and etcd machines using the byo/config playbook.
2) Stop the old masters and move etcd data directory to new etcd servers
3) Start the new masters
4) Run byo/openshift-cluster/redeploy-certificates.yml against the cluster to update the CA and node configuration.

- Is this the best or the right way to do it, since this is a production cluster and I want minimal downtime?

Diego Castro / The CloudFather
GetupCloud.com - Eliminamos a Gravidade
users mailing list
users lists openshift redhat com

Jason DeTiberus
