
Re: "Readiness probe failed" after a restart of the node



It sounds like the node can no longer reach the pod network, because
the connection is refused.  Are you using openshift-sdn?
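
If it is openshift-sdn, a few quick checks on the broken node might help narrow it down. The paths and names below assume a default Origin install (on older installs the node config lives under /etc/openshift/node rather than /etc/origin/node), and br0 is the usual openshift-sdn OVS bridge:

    # which network plugin is the node configured with?
    grep networkPluginName /etc/origin/node/node-config.yaml

    # does the OVS side look the same as on a working node?
    ovs-vsctl show
    ovs-vsctl list-ports br0

    # can the node itself reach the probe endpoint? (pod IP taken from the event below)
    curl -v http://10.1.0.21:8080/mgmt/health

Comparing that output between a healthy node and this one should show whether the SDN setup actually completed after the reboot.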

> On Nov 13, 2015, at 6:34 AM, v <vekt0r7 gmx net> wrote:
>
> Hello,
>
> I just had a very interesting problem with an OpenShift node. After a restart, many of our pods would be like this:
>
> root node02 ~ # oc get po
> NAME               READY     STATUS    RESTARTS   AGE
> zzz-1-vmu4g        0/1       Running   0          44s
>
> root node02 ~ # oc describe po
>  1m            1m              1       {kubelet node02.xyz.com}       spec.containers{zzz}            created         Created with docker id 060d48664a9a
>  1m            1m              1       {kubelet node02.xyz.com}       spec.containers{zzz}            started         Started with docker id 060d48664a9a
>  1m            35s             4       {kubelet node02.xyz.com}       spec.containers{zzz}            unhealthy       Readiness probe failed: Get http://10.1.0.21:8080/mgmt/health: dial tcp 10.1.0.21:8080: connection refused
>
> And they would stay like this. What was very weird was that "brctl show" would show lots of different veth interfaces:
>
> root node02 ~ # brctl show
> bridge name     bridge id               STP enabled     interfaces
> docker0         8000.56847afe9799       no
> lbr0            8000.0a0dac23c824       no              veth0e410ea
>                                                        veth1b8a907
> [many more lines like that, *snip*]
>                                                        vethddc8aa5
>                                                        vethf9ba02f
>                                                        vlinuxbr
>
> On our working nodes "brctl show" does not show any veth* interfaces.
> We tried many things: restarting the node one more time, restarting the pods, restarting docker/origin-node, restarting iptables-service and openvswitch, but in the end the only thing that helped was running
> ansible-playbook ~/openshift-ansible/playbooks/byo/config.yml -i ~/openshift-hosts
> one more time and then restarting the node.
> After that, all the veth* interfaces disappeared again and everything was fine.
>
> Needless to say, running ansible-playbook every time something goes wrong is not a good solution for us. Anyone got an idea as to what was going on there?
>
> Regards,
> v
>
> _______________________________________________
> users mailing list
> users lists openshift redhat com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users

