
OpenShift installation error on TASK openshift_cluster_monitoring_operator : Wait for the ServiceMonitor CRD to be created



Hello,

Has anyone on this list had this issue on OpenShift 3.11?

My inventory (hosts) file:

[masters]
master.os.serra.local

[etcd]
master.os.serra.local

[nodes]
master.os.serra.local openshift_node_group_name='node-config-master-infra'
node1.os.serra.local openshift_node_group_name='node-config-compute'

[OSEv3:children]
masters
nodes
etcd

[OSEv3:vars]
ansible_user=root
openshift_deployment_type=openshift-enterprise
openshift_master_default_subdomain=apps.os.serra.local
debug_level=2
oreg_auth_user='110xxxx|user1'
oreg_auth_password='XXXXXXXXXXXXXXxxx'
openshift_check_min_host_memory_gb=4


I have already registered with Red Hat for the oreg_auth_user and oreg_auth_password credentials. Both systems are RHEL 7.5 with the latest updates.
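
For what it is worth, the registry credentials can be sanity-checked directly on each host before re-running the installer. A minimal check, assuming the default registry.redhat.io registry for openshift-enterprise deployments (the image name below is my guess at the 3.11 monitoring operator image; adjust if your setup differs):

# Log in with the same credentials used for oreg_auth_user/oreg_auth_password
# (docker prompts for the password so it is not echoed)
docker login -u '110xxxx|user1' registry.redhat.io

# Then try pulling one of the images the monitoring playbook needs
docker pull registry.redhat.io/openshift3/ose-cluster-monitoring-operator:v3.11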


deploy_cluster.yml output:

TASK [openshift_cluster_monitoring_operator : Set cluster-monitoring-operator template] ***
changed: [master.os.serra.local]

TASK [openshift_cluster_monitoring_operator : Set cluster-monitoring-operator template] ***
changed: [master.os.serra.local]

TASK [openshift_cluster_monitoring_operator : Wait for the ServiceMonitor CRD to be created] ***
FAILED - RETRYING: Wait for the ServiceMonitor CRD to be created (30 retries left).
FAILED - RETRYING: Wait for the ServiceMonitor CRD to be created (29 retries left).
FAILED - RETRYING: Wait for the ServiceMonitor CRD to be created (28 retries left).
......
FAILED - RETRYING: Wait for the ServiceMonitor CRD to be created (1 retries left).
fatal: [master.os.serra.local]: FAILED! => {"attempts": 30, "changed": true, "cmd": ["oc", "get", "crd", "servicemonitors.monitoring.coreos.com", "-n", "openshift-monitoring", "--config=/tmp/openshift-cluster-monitoring-ansible-SswP6B/admin.kubeconfig"], "delta": "0:00:00.274308", "end": "2018-10-14 19:20:36.769452", "msg": "non-zero return code", "rc": 1, "start": "2018-10-14 19:20:36.495144", "stderr": "No resources found.\nError from server (NotFound): customresourcedefinitions.apiextensions.k8s.io \"servicemonitors.monitoring.coreos.com\" not found", "stderr_lines": ["No resources found.", "Error from server (NotFound): customresourcedefinitions.apiextensions.k8s.io \"servicemonitors.monitoring.coreos.com\" not found"], "stdout": "", "stdout_lines": []}
        to retry, use: --limit @/usr/share/ansible/openshift-ansible/playbooks/deploy_cluster.retry

PLAY RECAP *********************************************************************
localhost                  : ok=11   changed=0    unreachable=0    failed=0  
master.os.serra.local      : ok=589  changed=264  unreachable=0    failed=1  
node1.os.serra.local       : ok=118  changed=61   unreachable=0    failed=0  


INSTALLER STATUS ***************************************************************
Initialization               : Complete (0:00:37)
Health Check                 : Complete (0:01:08)
Node Bootstrap Preparation   : Complete (0:14:57)
etcd Install                 : Complete (0:02:50)
Master Install               : Complete (0:05:44)
Master Additional Install    : Complete (0:01:58)
Node Join                    : Complete (0:00:33)
Hosted Install               : Complete (0:01:02)
Cluster Monitoring Operator  : In Progress (0:15:31)
        This phase can be restarted by running: playbooks/openshift-monitoring/config.yml


Failure summary:


  1. Hosts:    master.os.serra.local
     Play:     Configure Cluster Monitoring Operator
     Task:     Wait for the ServiceMonitor CRD to be created
     Message:  non-zero return code
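
Since the ServiceMonitor CRD is created by the cluster-monitoring-operator itself, it is probably worth checking whether that operator pod ever started before retrying the phase. A rough sketch of what I would look at (standard oc/ansible-playbook invocations; <inventory> stands for whatever inventory file was passed to deploy_cluster.yml):

# Is the operator pod running, or stuck Pending / pulling its image?
oc -n openshift-monitoring get pods
oc -n openshift-monitoring describe deployment cluster-monitoring-operator

# Once the underlying problem is fixed, only this phase needs re-running:
ansible-playbook -i <inventory> /usr/share/ansible/openshift-ansible/playbooks/openshift-monitoring/config.yml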


In /var/log/messages I see errors that /etc/cni/net.d/ is empty, like:

Oct 14 19:17:58 master atomic-openshift-node: exec openshift start network --config=/etc/origin/node/node-config.yaml --kubeconfig=/tmp/kubeconfig --loglevel=${DEBUG_LOGLEVEL:-2}
Oct 14 19:17:58 master atomic-openshift-node: ] Args:[] WorkingDir: Ports:[{Name:healthz HostPort:10256 ContainerPort:10256 Protocol:TCP HostIP:}] EnvFrom:[] Env:[{Name:OPENSHIFT_DNS_DOMAIN Value:cluster.local ValueFrom:nil}] Resources:{Limits:map[] Requests:map[cpu:{i:{value:100 scale:-3} d:{Dec:<nil>} s:100m Format:DecimalSI} memory:{i:{value:209715200 scale:0} d:{Dec:<nil>} s: Format:BinarySI}]} VolumeMounts:[{Name:host-config ReadOnly:true MountPath:/etc/origin/node/ SubPath: MountPropagation:<nil>} {Name:host-sysconfig-node ReadOnly:true MountPath:/etc/sysconfig/origin-node SubPath: MountPropagation:<nil>} {Name:host-var-run ReadOnly:false MountPath:/var/run SubPath: MountPropagation:<nil>} {Name:host-var-run-dbus ReadOnly:true MountPath:/var/run/dbus/ SubPath: MountPropagation:<nil>} {Name:host-var-run-ovs ReadOnly:true MountPath:/var/run/openvswitch/ SubPath: MountPropagation:<nil>} {Name:host-var-run-kubernetes ReadOnly:true MountPath:/var/run/kubernetes/ SubPath: MountPropagation:<nil>} {Name:host-var-run-openshift-sdn ReadOnly:false MountPath:/var/run/openshift-sdn SubPath: MountPropagation:<nil>} {Name:host-opt-cni-bin ReadOnly:false MountPath:/host/opt/cni/bin SubPath: MountPropagation:<nil>} {Name:host-etc-cni-netd ReadOnly:false MountPath:/etc/cni/net.d SubPath: MountPropagation:<nil>} {Name:host-var-lib-cni-networks-openshift-sdn ReadOnly:false MountPath:/var/lib/cni/networks/openshift-sdn SubPath: MountPropagation:<nil>} {Name:sdn-token-8f5tb ReadOnly:true MountPath:/var/run/secrets/kubernetes.io/serviceaccount SubPath: MountPropagation:<nil>}] VolumeDevices:[] LivenessProbe:nil ReadinessProbe:nil Lifecycle:nil TerminationMessagePath:/dev/termination-log TerminationMessagePolicy:File ImagePullPolicy:IfNotPresent SecurityContext:&SecurityContext{Capabilities:nil,Privileged:*true,SELinuxOptions:nil,RunAsUser:*0,RunAsNonRoot:nil,ReadOnlyRootFilesystem:nil,AllowPrivilegeEscalation:nil,RunAsGroup:nil,} Stdin:false StdinOnce:false TTY:false} is dead, but RestartPolicy says that we should restart it.
Oct 14 19:17:58 master atomic-openshift-node: I1014 19:17:58.931158   17873 kuberuntime_manager.go:757] checking backoff for container "sdn" in pod "sdn-7fcwv_openshift-sdn(2b9a2a99-cfdb-11e8-85c5-525400975a6b)"
Oct 14 19:17:58 master atomic-openshift-node: I1014 19:17:58.931335   17873 kuberuntime_manager.go:767] Back-off 5m0s restarting failed container=sdn pod=sdn-7fcwv_openshift-sdn(2b9a2a99-cfdb-11e8-85c5-525400975a6b)
Oct 14 19:17:58 master atomic-openshift-node: E1014 19:17:58.931374   17873 pod_workers.go:186] Error syncing pod 2b9a2a99-cfdb-11e8-85c5-525400975a6b ("sdn-7fcwv_openshift-sdn(2b9a2a99-cfdb-11e8-85c5-525400975a6b)"), skipping: failed to "StartContainer" for "sdn" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=sdn pod=sdn-7fcwv_openshift-sdn(2b9a2a99-cfdb-11e8-85c5-525400975a6b)"
Oct 14 19:18:00 master atomic-openshift-node: W1014 19:18:00.354523   17873 cni.go:172] Unable to update cni config: No networks found in /etc/cni/net.d
Oct 14 19:18:00 master atomic-openshift-node: E1014 19:18:00.354630   17873 kubelet.go:2101] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Oct 14 19:18:01 master atomic-openshift-node: E1014 19:18:01.828464   17873 summary.go:102] Failed to get system container stats for "/system.slice/atomic-openshift-node.service": failed to get cgroup stats for "/system.slice/atomic-openshift-node.service": failed to get container info for "/system.slice/atomic-openshift-node.service": unknown container "/system.slice/atomic-openshift-node.service"
Oct 14 19:18:03 master python: ansible-command Invoked with warn=True exec
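
The CNI errors make me suspect the monitoring failure is only a symptom: while the sdn pod is in CrashLoopBackOff, no CNI config gets written to /etc/cni/net.d, the node network never becomes ready, and pods such as the monitoring operator cannot start at all. A sketch of how this can be dug into (the pod and container names sdn-7fcwv and sdn are taken from the log above):

# Check the SDN daemonset pods and pull the log of the crashing container
oc -n openshift-sdn get pods -o wide
oc -n openshift-sdn logs sdn-7fcwv -c sdn

# The SDN pod should drop its CNI config file here once it is healthy
ls -l /etc/cni/net.d/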


