
pods are not getting deleted in openshift 4.2



Hi All,
I am using OpenShift 4.2 and trying to delete a daemon set
with the 'oc delete ds <ds_name>' command, but it is failing.

We deployed our own container image as a daemon set, and we stop the running application processes using a preStop hook
that runs systemctl stop <service_name>. Stopping this service stops all the application processes
that we spawned inside the container.

But, as I mentioned, running 'oc delete ds <daemonset_name>' on the master node does not kill the pods on the worker nodes. As a result,
the pods stay in the Terminating state forever when viewed from the master node, while they are actually still running on the worker nodes.
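
For anyone reproducing this, the stuck pods can be listed by filtering the STATUS column of the pod listing. This is only a minimal sketch: the helper name filter_terminating is hypothetical, and it assumes the default column layout (NAME READY STATUS RESTARTS AGE) with the header row suppressed.

```shell
#!/bin/sh
# Hypothetical helper: print the names of pods whose STATUS column (3rd field)
# is "Terminating". Reads `oc get pods --no-headers` style output on stdin.
filter_terminating() {
    awk '$3 == "Terminating" { print $1 }'
}

# Usage on a machine with cluster access (not executed here):
#   oc get pods --no-headers -o wide | filter_terminating
```

Adding -o wide keeps STATUS in the third column and also shows the node each pod is scheduled on.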

I tried manually deleting the pods on the worker node using crictl rm <container_id>, but it does not delete them.
However, when I use runc kill <full_container_id> 37 (signal 37) on the worker nodes, it kills the container.
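
As a side note on why signal 37 may behave differently from the default stop signal: on Linux with glibc, SIGRTMIN is normally 34, which would make signal 37 equal to SIGRTMIN+3. systemd documents SIGRTMIN+3 as a halt request when it runs as PID 1, so this would fit if systemd is PID 1 inside the container (an assumption, suggested only by the use of systemctl in the preStop hook). The sketch below just does the arithmetic; the value 34 and the helper name rt_offset are assumptions, not something taken from the node.

```shell
#!/bin/sh
# Assumption: SIGRTMIN is 34, as on Linux with glibc.
# This can be checked on the node with `kill -l` in bash.
SIGRTMIN_ASSUMED=34

# Hypothetical helper: print the offset of a real-time signal number from SIGRTMIN.
rt_offset() {
    echo $(( $1 - SIGRTMIN_ASSUMED ))
}

echo "signal 37 = SIGRTMIN+$(rt_offset 37)"

# The command from the post, with the placeholder ID kept as-is (not executed here):
#   sudo runc kill <full_container_id> 37
```

Under that assumption the script prints "signal 37 = SIGRTMIN+3".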


Expected behavior:
'oc delete ds <daemonset> '

should delete the pods on worker nodes.

Any help with this would be highly appreciated.
Looking forward to your reply.

Thanks & Regards,
Ramana

These are the details:

OS version:
[core compute-2 ~]$ cat /etc/os-release
NAME="Red Hat Enterprise Linux CoreOS"
VERSION="42.81.20191223.0"
VERSION_ID="4.2"
PRETTY_NAME="Red Hat Enterprise Linux CoreOS 42.81.20191223.0 (Ootpa)"
ID="rhcos"
ID_LIKE="rhel fedora"
ANSI_COLOR="0;31"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="OpenShift Container Platform"
REDHAT_BUGZILLA_PRODUCT_VERSION="4.2"
REDHAT_SUPPORT_PRODUCT="OpenShift Container Platform"
REDHAT_SUPPORT_PRODUCT_VERSION="4.2"
OSTREE_VERSION=42.81.20191223.0

crictl version: 
[core compute-2 ~]$ sudo crictl version
Version:  0.1.0
RuntimeName:  cri-o
RuntimeVersion:  1.14.11-4.dev.rhaos4.2.git179ea6b.el8
RuntimeApiVersion:  v1alpha1

[core compute-2 ~]$ sudo runc --version
runc version spec: 1.0.1-dev

[core control-plane-0 ~]$ oc version (run on the master node; the client version is the same on the worker node)
Client Version: v4.2.13
Server Version: 4.2.14
Kubernetes Version: v1.14.6+b294fe5

These are the logs collected on the master node after running 'oc delete ds <daemonset_name>':
[core control-plane-0 ~]$ kubectl get events --sort-by='{.lastTimestamp}'

72m         Warning   FailedKillPod      pod/test-defaultgroup-fg9zm   error killing pod: [failed to "KillContainer" for "test-defaultgroup" with KillContainerError: "rpc error: code = Unknown desc = failed to stop container d79f627a1ca96bab4fd060dcd425774e01cc7451304cbb1574c2632afef386a3: failed to stop container \"d79f627a1ca96bab4fd060dcd425774e01cc7451304cbb1574c2632afef386a3\": failed to find process: <nil>"
, failed to "KillPodSandbox" for "1398e7ed-44e9-11ea-b014-005056b87475" with KillPodSandboxError: "rpc error: code = Unknown desc = failed to stop container k8s_test-defaultgroup_test-defaultgroup-fg9zm_default_1398e7ed-44e9-11ea-b014-005056b87475_0 in pod sandbox 1fc87253a559df2341c2af4a8d0d746c92fe8722b715ce3cd21bc3b5e82015d7: failed to stop container \"d79f627a1ca96bab4fd060dcd425774e01cc7451304cbb1574c2632afef386a3\": failed to find process: <nil>"
]
69m         Warning   FailedKillPod      pod/test-defaultgroup-vqrw4   error killing pod: [failed to "KillContainer" for "test-defaultgroup" with KillContainerError: "rpc error: code = Unknown desc = failed to stop container c6d141612aa37b424ead130f109298ffcb212653171d43728d610bcb3bdd9073: failed to stop container \"c6d141612aa37b424ead130f109298ffcb212653171d43728d610bcb3bdd9073\": failed to find process: <nil>"
, failed to "KillPodSandbox" for "fba29286-4520-11ea-b014-005056b87475" with KillPodSandboxError: "rpc error: code = Unknown desc = failed to stop container k8s_test-defaultgroup_test-defaultgroup-vqrw4_default_fba29286-4520-11ea-b014-005056b87475_0 in pod sandbox f42ce723a4ea1560a9eb988b086c759feb3450ed28b551b3f768ddf5a4aca889: failed to stop container \"c6d141612aa37b424ead130f109298ffcb212653171d43728d610bcb3bdd9073\": failed to find process: <nil>"
]
4m53s       Normal    Killing            pod/test-defaultgroup-fg9zm   Stopping container test-defaultgroup
3m22s       Normal    Killing            pod/test-defaultgroup-zxzz9   Stopping container test-defaultgroup
3m19s       Normal    Killing            pod/test-defaultgroup-vqrw4   Stopping container test-defaultgroup

When I manually try to remove the container with the crictl command, I get the following error:
[core compute-2 ~]$ sudo crictl rm fe3d97bf8a70b
Removing the container "fe3d97bf8a70b" failed: rpc error: code = Unknown desc = unable to stop container fe3d97bf8a70be26327b42aae31fde08231244eefc94599a3bbc1282c4c160e0: failed to stop container fe3d97bf8a70be26327b42aae31fde08231244eefc94599a3bbc1282c4c160e0: failed to stop container "fe3d97bf8a70be26327b42aae31fde08231244eefc94599a3bbc1282c4c160e0": failed to find process: <nil>
[core compute-2 ~]$                             
