[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: Pods stuck on 'ContainerCreating' when redhat/openshift-ovs-multitenant enabled



I found the root cause of this issue.
On my machine, I first deployed OCP with Calico, and it worked well.
Then I ran the uninstall playbook and reinstalled with the openshift-ovs-multitenant SDN plugin, and it no longer worked.
I found the following:

[root@buzz1 openshift-ansible]# systemctl status atomic-openshift-node.service
atomic-openshift-node.service - OpenShift Node
   Loaded: loaded (/etc/systemd/system/atomic-openshift-node.service; enabled; vendor preset: disabled)
   Active: active (running) since Mon 2019-10-14 00:43:08 PDT; 22h ago
 Main PID: 87388 (hyperkube)
   CGroup: /system.slice/atomic-openshift-node.service
           ├─87388 /usr/bin/hyperkube kubelet --v=6 --address=0.0.0.0 --allow-privileged=true --anonymous-auth=true --authentication-toke...
           └─88872 /opt/cni/bin/calico

Oct 14 23:15:48 buzz1.fyre.ibm.com atomic-openshift-node[87388]: I1014 23:15:48.289674   87388 common.go:71] Using namespace "kube-s....yaml
Oct 14 23:15:48 buzz1.fyre.ibm.com atomic-openshift-node[87388]: I1014 23:15:48.289809   87388 file.go:199] Reading config file "/et...yaml"
Oct 14 23:15:48 buzz1.fyre.ibm.com atomic-openshift-node[87388]: I1014 23:15:48.292556   87388 common.go:62] Generated UID "598eab3c....yaml
Oct 14 23:15:48 buzz1.fyre.ibm.com atomic-openshift-node[87388]: I1014 23:15:48.293602   87388 common.go:66] Generated Name "master-....yaml
Oct 14 23:15:48 buzz1.fyre.ibm.com atomic-openshift-node[87388]: I1014 23:15:48.294512   87388 common.go:71] Using namespace "kube-s....yaml
Oct 14 23:15:48 buzz1.fyre.ibm.com atomic-openshift-node[87388]: I1014 23:15:48.295667   87388 file.go:199] Reading config file "/et...yaml"
Oct 14 23:15:48 buzz1.fyre.ibm.com atomic-openshift-node[87388]: I1014 23:15:48.296350   87388 common.go:62] Generated UID "d71dc810....yaml
Oct 14 23:15:48 buzz1.fyre.ibm.com atomic-openshift-node[87388]: I1014 23:15:48.296367   87388 common.go:66] Generated Name "master-....yaml
Oct 14 23:15:48 buzz1.fyre.ibm.com atomic-openshift-node[87388]: I1014 23:15:48.296379   87388 common.go:71] Using namespace "kube-s....yaml
Oct 14 23:15:48 buzz1.fyre.ibm.com atomic-openshift-node[87388]: I1014 23:15:48.300194   87388 config.go:303] Setting pods for source file
Oct 14 23:15:48 buzz1.fyre.ibm.com atomic-openshift-node[87388]: I1014 23:15:48.361625   87388 kubelet.go:1884] SyncLoop (SYNC): 3 p...d33c)
Oct 14 23:15:48 buzz1.fyre.ibm.com atomic-openshift-node[87388]: I1014 23:15:48.361693   87388 config.go:100] Looking for [api file]...e:{}]
Oct 14 23:15:48 buzz1.fyre.ibm.com atomic-openshift-node[87388]: I1014 23:15:48.361716   87388 kubelet.go:1907] SyncLoop (housekeeping)
Hint: Some lines were ellipsized, use -l to show in full.
[root@buzz1 openshift-ansible]# ps -ef | grep calico
root      88872  87388  0 23:15 ?        00:00:00 /opt/cni/bin/calico
root      88975  74601  0 23:15 pts/0    00:00:00 grep --color=auto calico
[root@buzz1 openshift-ansible]# 
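For anyone wondering how a leftover plugin can take over: as far as I know, the kubelet scans /etc/cni/net.d and uses the lexicographically first valid config file, so a stale Calico file sorts ahead of the SDN one. A small illustration (the file names here are assumed for the demo, not taken from my host):

```shell
# Simulate a CNI config dir containing both a leftover Calico config
# and the openshift-sdn config; the kubelet picks the first in sort order.
d=$(mktemp -d)
touch "$d/10-calico.conflist" "$d/80-openshift-network.conf"
first=$(ls "$d" | sort | head -n 1)
echo "kubelet would pick: $first"   # 10-calico.conflist
rm -rf "$d"
```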

It seems the Calico binary is left over here. Using the same inventory file, OCP 3.11 deployed successfully on a clean VM.
So my guess is that the uninstall playbook did not clean up Calico thoroughly.
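In case it helps others, here is a rough cleanup sketch I would try before re-running the install playbook. It is exercised against a scratch $ROOT so it is harmless to run as-is; on a real node, set ROOT to empty and stop atomic-openshift-node.service first. Everything beyond /opt/cni/bin/calico (the calico-ipam binary, the config file name) is an assumption about a typical Calico install, so verify paths on your host before deleting:

```shell
# Hedged sketch: remove leftover Calico CNI artifacts from the
# conventional CNI locations. $ROOT is a throwaway dir for safety.
ROOT=$(mktemp -d)
mkdir -p "$ROOT/opt/cni/bin" "$ROOT/etc/cni/net.d"
# Simulate the leftovers found above:
touch "$ROOT/opt/cni/bin/calico" "$ROOT/etc/cni/net.d/10-calico.conflist"

# The actual cleanup:
rm -f "$ROOT"/opt/cni/bin/calico "$ROOT"/opt/cni/bin/calico-ipam
rm -f "$ROOT"/etc/cni/net.d/*calico*
```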


On Oct 12, 2019, at 11:52 PM, Yu Wei <yu2003w hotmail com> wrote:

Hi,
I tried to install OCP 3.11 with the following variables set:
openshift_use_openshift_sdn=true
os_sdn_network_plugin_name='redhat/openshift-ovs-multitenant'

Some pods are stuck in 'ContainerCreating'.
[root@buzz1 openshift-ansible]# oc get pods --all-namespaces
NAMESPACE               NAME                                    READY     STATUS              RESTARTS   AGE
default                 docker-registry-1-deploy                0/1       ContainerCreating   0          5h
default                 registry-console-1-deploy               0/1       ContainerCreating   0          5h
kube-system             master-api-buzz1.center1.com            1/1       Running             0          5h
kube-system             master-controllers-buzz1.center1.com    1/1       Running             0          5h
kube-system             master-etcd-buzz1.center1.com           1/1       Running             0          5h
openshift-node          sync-x8j7d                              1/1       Running             0          5h
openshift-sdn           ovs-ff7r7                               1/1       Running             0          5h
openshift-sdn           sdn-7frfw                               1/1       Running             10         5h
openshift-web-console   webconsole-85494cdb8c-s2dnh             0/1       ContainerCreating   0          5h

Running 'oc describe pods', I got the following:

Events:
  Type     Reason                  Age              From                         Message
  ----     ------                  ----             ----                         -------
  Warning  FailedCreatePodSandBox  2m               kubelet, buzz1  Failed create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "8570c350953e29185ef8ab05d628f90c6791a56ac392e40f2f6e30a14a76ab22" network for pod "network-diag-test-pod-qz7hv": NetworkPlugin cni failed to set up pod "network-diag-test-pod-qz7hv_network-diag-global-ns-q7vbn" network: context deadline exceeded, failed to clean up sandbox container "8570c350953e29185ef8ab05d628f90c6791a56ac392e40f2f6e30a14a76ab22" network for pod "network-diag-test-pod-qz7hv": NetworkPlugin cni failed to teardown pod "network-diag-test-pod-qz7hv_network-diag-global-ns-q7vbn" network: context deadline exceeded]
  Normal   SandboxChanged          2s (x8 over 2m)  kubelet, buzz1  Pod sandbox changed, it will be killed and re-created.

How could I resolve this problem?
Any thoughts?

Thanks,
Jared

_______________________________________________
users mailing list
users lists openshift redhat com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users

