We’ve done a similar thing in our environment. I’m not sure if the openshift-ansible guys have a better way, but this is what we did at that time.
We created a custom playbook to run through all the steps as necessary. Due to the version of openshift-ansible we were running, we had to be careful when we worked on whichever server was at index 0 in the array of hosts. (I think they have resolved that problem now.)
First, we created a play that copied the necessary certificates to all the nodes, so that it didn’t matter which node was at index 0 of the list of nodes. We then limited the playbook to operate on one node at a time for the teardown, and afterwards ran the deploy on the entire cluster. For the new node, everything was installed as necessary; for the rest of the cluster it was mostly a no-op. We use static addresses, so the only thing that really changed was the underlying host, and certificate regeneration was limited.
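To make the sequence concrete, here is a rough sketch of the order of runs described above. The playbook and inventory names (`copy_certs.yml`, `teardown.yml`, `deploy.yml`, `hosts`) are made up for illustration and are not the names we actually used; `--limit` is the standard ansible-playbook option for restricting a run to one host.

```python
# Sketch only: all playbook and inventory names here are hypothetical.
from typing import List


def rolling_replace_commands(
    nodes: List[str],
    inventory: str = "hosts",
    certs_playbook: str = "copy_certs.yml",
    teardown_playbook: str = "teardown.yml",
    deploy_playbook: str = "deploy.yml",
) -> List[List[str]]:
    """Build the ansible-playbook invocations for a one-node-at-a-time rebuild."""
    # Step 1: copy certificates to every node once, up front.
    commands = [["ansible-playbook", "-i", inventory, certs_playbook]]
    for node in nodes:
        # Step 2: tear down exactly one node.
        commands.append(
            ["ansible-playbook", "-i", inventory, teardown_playbook, "--limit", node]
        )
        # Step 3: run the deploy on the whole cluster; it should be mostly
        # a no-op everywhere except on the freshly rebuilt node.
        commands.append(["ansible-playbook", "-i", inventory, deploy_playbook])
    return commands


if __name__ == "__main__":
    for cmd in rolling_replace_commands(["master1", "master2"]):
        print(" ".join(cmd))
```

This only prints the commands; you would want to check the cluster's health between iterations before moving on to the next node.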
For the master nodes, this was pretty easy. For the etcd nodes, we had to do a bit of extra work, because the nodes being added to the cluster had different member IDs than what the cluster thought those nodes ought to have. Following etcd’s docs on Member Migration should help you out here.
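As a rough illustration of the etcd part, the swap boils down to removing the stale member and then adding the rebuilt node as a new one. The helper below is hypothetical and only assembles the `etcdctl member remove` and `etcdctl member add` command lines. It assumes the v2-style syntax by default, leaves out the endpoint and TLS flags a real cluster would need, and should be checked against the docs for your etcd version.

```python
# Sketch only: helper name and arguments are illustrative assumptions.
from typing import List


def etcd_member_swap_commands(
    old_member_id: str, name: str, peer_url: str, v3: bool = False
) -> List[List[str]]:
    """Build the etcdctl commands to drop a stale member and re-add the node."""
    # Remove the member ID the cluster still has on record for the old host.
    remove = ["etcdctl", "member", "remove", old_member_id]
    if v3:
        # v3 etcdctl takes the peer URL through the --peer-urls flag.
        add = ["etcdctl", "member", "add", name, "--peer-urls=" + peer_url]
    else:
        # v2 etcdctl takes the peer URL as a positional argument.
        add = ["etcdctl", "member", "add", name, peer_url]
    return [remove, add]
```

The member ID would come from `etcdctl member list`, and the rebuilt node gets a fresh ID when it is added back.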
The only major thing we had to be careful about was doing the work on the node that was going to be the first node. Due to the way the playbooks operated, they put a lot of config and certificate details on that node, which would then get copied around. If they’ve addressed this, it shouldn’t be an issue, but at the time we got around it by simply adjusting the order in which the nodes were defined in our inventory file.
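The reordering itself is trivial. As a minimal sketch, assuming the hosts of one inventory group are held in a plain list (the function name is made up), it amounts to moving the node you are about to rebuild out of the first slot:

```python
# Sketch only: a hypothetical helper, not part of openshift-ansible.
from typing import List


def move_off_first(hosts: List[str], target: str) -> List[str]:
    """Return a copy of hosts with target moved to the end of the list."""
    if target not in hosts:
        raise ValueError(f"{target!r} is not in the host list")
    # Keep the relative order of the other hosts and put target last,
    # so a different host ends up at index 0.
    return [h for h in hosts if h != target] + [target]
```

For example, `move_off_first(["master1", "master2", "master3"], "master1")` puts `master2` at index 0. With a single-host group there is nothing to reorder and the list comes back unchanged.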
A wee bit laborious, but definitely doable.
In our case, we didn’t experience any downtime: the master nodes cycled through the haproxy box appropriately, and the etcd nodes were removed from and added to the cluster without any major headaches.
Though I’m now curious whether the team at Red Hat working on openshift-ansible has addressed any of these sorts of issues to make it easier.
On December 13, 2016 at 08:35:54, Diego Castro (diego castro getupcloud com) wrote: