
OpenShift Origin 3.7 Template Broker seems super flaky



Hi,

Has anyone else noticed that the new OpenShift Origin 3.7 Template Broker seems super flaky?

For example, I deploy Jenkins (Persistent or Ephemeral) from the template and then modify its route, say by adding this annotation:

kubernetes.io/tls-acme: 'true'

I have https://github.com/tnozicka/openshift-acme installed in the cluster, which then grabs an SSL cert for me and adds it to the route. Moments later, all resources created from the template are garbage collected for no apparent reason.

I also got the same behaviour when I modified the service account the Jenkins template uses: I had added an additional route, so I added a new "serviceaccounts.openshift.io/oauth-redirectreference.jenkins:" entry. It took a bit longer (about 12 hours), but everything disappeared again. I suspect that if you modify any object that a template created, the template broker will eventually remove all the objects it created.
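One way to check that suspicion might be to list which objects in the namespace carry an owner reference to a TemplateInstance, since those are the ones the garbage collector would cascade to if the TemplateInstance itself were deleted. The following is only a minimal sketch: the function names are hypothetical, the list of resource types is my guess at what the Jenkins template creates, and it assumes `oc` is on the PATH and logged in to the cluster.

```python
import json
import subprocess


def owned_by_template_instance(items):
    """Return (kind, name, owner_name) for every object that lists a
    TemplateInstance among its metadata.ownerReferences."""
    owned = []
    for item in items:
        meta = item.get("metadata", {})
        for ref in meta.get("ownerReferences", []):
            if ref.get("kind") == "TemplateInstance":
                owned.append((item.get("kind"), meta.get("name"), ref.get("name")))
    return owned


def list_namespace_objects(namespace):
    """Fetch objects from the cluster as parsed JSON.

    Assumption: the resource types below cover what the template created;
    adjust the list for other templates.
    """
    out = subprocess.check_output(
        ["oc", "get", "dc,route,svc,sa,pvc,rolebinding",
         "-n", namespace, "-o", "json"]
    )
    return json.loads(out)["items"]
```

Calling `owned_by_template_instance(list_namespace_objects("jenkins-test"))` before and after editing the route would show whether the owner references are still in place, which might help narrow down why the TemplateInstance is being collected.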

Is there any way to disable the new template broker and use the old template system?

In Origin 3.6 it was flawless and worked with openshift-acme without any problems at all.

I should mention that if I create things manually, everything works fine: I can use openshift-acme, and none of my resources vanish on a whim.

Here is a snippet of the logs. You can see the acme endpoints are removed after successfully getting a cert, and moments later the deleting starts:

Jan 08 00:26:47 master-0.openshift.staging.local dockerd-current[23329]: I0108 00:26:47.648255       1 leaderelection.go:199] successfully renewed lease kube-service-catalog/service-catalog-controller-manager
Jan 08 00:26:47 master-0.openshift.staging.local origin-node[26684]: I0108 00:26:47.744777   26749 roundrobin.go:338] LoadBalancerRR: Removing endpoints for jenkins-test/acme-9cv97q5dn8:
Jan 08 00:26:47 master-0.openshift.staging.local dockerd-current[23329]: I0108 00:26:47.744777   26749 roundrobin.go:338] LoadBalancerRR: Removing endpoints for jenkins-test/acme-9cv97q5dn8:
Jan 08 00:26:47 master-0.openshift.staging.local origin-node[26684]: I0108 00:26:47.762005   26749 ovs.go:143] Error executing ovs-ofctl: ovs-ofctl: None: invalid IP address
Jan 08 00:26:47 master-0.openshift.staging.local dockerd-current[23329]: I0108 00:26:47.762005   26749 ovs.go:143] Error executing ovs-ofctl: ovs-ofctl: None: invalid IP address
Jan 08 00:26:47 master-0.openshift.staging.local dockerd-current[23329]: E0108 00:26:47.765091   26749 sdn_controller.go:284] Error deleting OVS flows for service &{{ } {acme-9cv97q5dn8  jenkins-test /api/v1/namespaces/jenkins-test/services/acme-9cv97q5dn8 94c6b3b3-f40a-11e7-88e5-fa163eb8ca3a 622382 0 2018-01-08 00:26:34 +0000 UTC <nil> <nil> map[] map[] [] nil [] } {ClusterIP [{http TCP 80 {0 80 } 0}] map[] None  []  None []  0} {{[]}}}: exit status 1
Jan 08 00:26:47 master-0.openshift.staging.local origin-node[26684]: E0108 00:26:47.765091   26749 sdn_controller.go:284] Error deleting OVS flows for service &{{ } {acme-9cv97q5dn8  jenkins-test /api/v1/namespaces/jenkins-test/services/acme-9cv97q5dn8 94c6b3b3-f40a-11e7-88e5-fa163eb8ca3a 622382 0 2018-01-08 00:26:34 +0000 UTC <nil> <nil> map[] map[] [] nil [] } {ClusterIP [{http TCP 80 {0 80 } 0}] map[] None  []  None []  0} {{[]}}}: exit status 1
Jan 08 00:26:48 master-0.openshift.staging.local dockerd-current[23329]: I0108 00:26:48.139090       1 rest.go:362] Starting watch for /api/v1/namespaces, rv=622418 labels= fields= timeout=8m38s
Jan 08 00:26:48 master-0.openshift.staging.local origin-master-api[23448]: I0108 00:26:48.139090       1 rest.go:362] Starting watch for /api/v1/namespaces, rv=622418 labels= fields= timeout=8m38s
Jan 08 00:26:49 master-0.openshift.staging.local dockerd-current[23329]: I0108 00:26:49.668205       1 leaderelection.go:199] successfully renewed lease kube-service-catalog/service-catalog-controller-manager
Jan 08 00:26:49 master-0.openshift.staging.local dockerd-current[23329]: I0108 00:26:49.885207       1 garbagecollector.go:291] processing item [template.openshift.io/v1/TemplateInstance, namespace: jenkins-test, name: e3639aec-bbbc-4170-b0e4-3b63735af348, uid: 915d585d-f408-11e7-88e5-fa163eb8ca3a]
Jan 08 00:26:49 master-0.openshift.staging.local origin-master-controllers[73353]: I0108 00:26:49.885207       1 garbagecollector.go:291] processing item [template.openshift.io/v1/TemplateInstance, namespace: jenkins-test, name: e3639aec-bbbc-4170-b0e4-3b63735af348, uid: 915d585d-f408-11e7-88e5-fa163eb8ca3a]
Jan 08 00:26:49 master-0.openshift.staging.local dockerd-current[23329]: I0108 00:26:49.904249       1 garbagecollector.go:394] delete object [template.openshift.io/v1/TemplateInstance, namespace: jenkins-test, name: e3639aec-bbbc-4170-b0e4-3b63735af348, uid: 915d585d-f408-11e7-88e5-fa163eb8ca3a] with propagation policy Background
Jan 08 00:26:49 master-0.openshift.staging.local origin-master-controllers[73353]: I0108 00:26:49.904249       1 garbagecollector.go:394] delete object [template.openshift.io/v1/TemplateInstance, namespace: jenkins-test, name: e3639aec-bbbc-4170-b0e4-3b63735af348, uid: 915d585d-f408-11e7-88e5-fa163eb8ca3a] with propagation policy Background
Jan 08 00:26:49 master-0.openshift.staging.local dockerd-current[23329]: I0108 00:26:49.910964       1 garbagecollector.go:291] processing item [apps.openshift.io/v1/DeploymentConfig, namespace: jenkins-test, name: jenkins, uid: 91759f72-f408-11e7-88e5-fa163eb8ca3a]
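To make the deletion order easier to follow, the garbage collector entries can be pulled out of the journal and de-duplicated, since each message above appears twice (once from dockerd-current and once from the origin service). This is a rough sketch only: the regular expression is written against the line format shown in this excerpt and may not match other versions, and `gc_events` is a hypothetical helper name.

```python
import re

# Matches the two garbagecollector.go message shapes seen in the excerpt above.
GC_EVENT = re.compile(
    r"garbagecollector\.go:\d+\] (?P<action>processing item|delete object) "
    r"\[(?P<kind>[^,\]]+), namespace: (?P<namespace>[^,\]]+), "
    r"name: (?P<name>[^,\]]+), uid: (?P<uid>[^\]]+)\]"
)


def gc_events(lines):
    """Return unique (action, kind, namespace, name) tuples in first-seen order.

    Note: de-duplicating on the tuple also collapses genuine repeats of the
    same action on the same object, which is acceptable for a quick overview.
    """
    seen = set()
    events = []
    for line in lines:
        match = GC_EVENT.search(line)
        if match is None:
            continue
        event = (
            match.group("action"),
            match.group("kind"),
            match.group("namespace"),
            match.group("name"),
        )
        if event not in seen:
            seen.add(event)
            events.append(event)
    return events
```

Feeding it the journal output line by line, for example from `open("journal.log")`, should give the sequence in which the TemplateInstance and then its dependents are processed.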

Any ideas? Has anyone else seen this? Considering that "openshift-ansible-service-broker" is deployed in a broken state by openshift-ansible on the release-3.7 branch (for Origin; I think Enterprise would work, as the tags exist), I suspect that not many people are using the new service brokers described here: https://blog.openshift.com/whats-new-in-openshift-3-7-service-catalog-and-brokers/

Thanks,

Joel
