
Moving from multitenant to subnet plugin results in disaster



To avoid problems[1] with the multitenant networking plugin, we recently switched to the subnet plugin.  After restarting every instance of origin-node and origin-master across our cluster, we found that nothing could communicate with anything else over the service network.  Deploys were unable to pull from the registry, and apps were inaccessible.  We "fixed" this by reverting to the multitenant plugin, which brought us back to the previous, less broken state, but only after rebuilding all apps (not just redeploying them).

Does anybody know what went so horribly wrong here?  I may be able to provide logs if need be (I'm not sure whether they have rolled over yet).  One potential source of trouble: instead of shutting down all nodes, making the change, and then bringing them back up, I changed them one by one.  Is that a bad thing to do?  Also, should we expect to need to rebuild all apps after changing the networking plugin?  Does that include the router and registry?

[1] Pods on different machines were sometimes unable to communicate with each other via the service network.  Likely fixed by https://github.com/openshift/openshift-sdn/pull/285
--
Alex Wauck // DevOps Engineer
+1 612 790 1558 (USA Mobile) 

E X O S I T E 
