
Re: Node not joining cluster during ansible install



(reposting: forgot to reply-all the first time)


Judging by the number of tasks your summary reports as completed, I am not sure your installation actually ran to completion. I would expect to see somewhere between one and two thousand tasks.
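If it helps, here is a rough sketch for totalling the counts from a saved run. It assumes that "tasks completed" corresponds to the ok= values in the PLAY RECAP, which is only an approximation, and the function name recap_ok_counts is made up for this example.

```shell
#!/bin/sh
# Sketch only: sum the ok= counts from an Ansible PLAY RECAP.
# Assumption: the ok= values are a fair proxy for "tasks completed".

# Reads ansible-playbook output on stdin. Prints "<count><TAB><host>" for
# every recap line, followed by a final "<sum><TAB>TOTAL" line.
recap_ok_counts() {
    awk '
        match($0, /ok=[0-9]+/) {
            # Digits that follow "ok=" on this recap line.
            n = substr($0, RSTART + 3, RLENGTH - 3)
            # Host name is everything before the first " :" separator.
            host = $0
            sub(/[ \t]*:.*/, "", host)
            printf "%d\t%s\n", n, host
            total += n
        }
        END { printf "%d\tTOTAL\n", total }
    '
}
```

For example, with the playbook output saved to a file (install.log is a hypothetical name): recap_ok_counts < install.log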


A while back we changed the node integration behavior so that a node that fails to provision no longer stops your entire installation. This eases the pain of provisioning large clusters (a hundred nodes or more).

<private node1 dns name> : ok=235  changed=56 unreachable=0    failed=0

That node did not fully install. Open a shell on that node and check the openshift services. I'm willing to bet that

systemctl list-units --all | grep -i origin

would show the node service is not running. Find the name of the node service and then examine the journal logs for that node

journalctl -x -u <node-service-name>
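The two steps above can be strung together roughly like this. It is only a sketch: it assumes the node unit's name contains both "origin" and "node" (something like origin-node.service), which may not match your install, and the function names are invented for the example.

```shell
#!/bin/sh
# Sketch only: find node-like units and show their state and recent logs.
# Assumption: the node unit name contains both "origin" and "node".

# Reads `systemctl list-units --all --plain --no-legend` output on stdin
# and prints the matching unit names (taken from the first column).
filter_node_units() {
    awk '{ print $1 }' | grep -i 'origin' | grep -i 'node'
}

# Hypothetical helper: for each matching unit, print its status and the
# last 100 journal entries with explanatory text (-x).
inspect_node_units() {
    systemctl list-units --all --plain --no-legend |
        filter_node_units |
        while read -r unit; do
            systemctl status --no-pager "$unit"
            journalctl -x --no-pager -n 100 -u "$unit"
        done
}
```

Run inspect_node_units as root on the node that failed to join. If it prints nothing, the name pattern is probably wrong for your setup, so fall back to the plain grep above.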


I think we (the openshift-ansible team) will want to add detection of failed node integrations into our error summary report in the future. Would you mind please opening an issue for this on our github page with this information?
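Until something like that exists, a manual check along these lines might do. This is just a sketch of the idea, not what openshift-ansible does: it compares the host names you expected to join against the NAME column of "oc get nodes", and missing_nodes is a made-up name.

```shell
#!/bin/sh
# Sketch only: report expected hosts that are absent from `oc get nodes`.
# Assumption: the first line of the input is the header row and the first
# column holds the node name.

# stdin: output of `oc get nodes`; arguments: expected host names.
# Prints every expected host that is not registered, one per line.
missing_nodes() {
    registered=$(awk 'NR > 1 { print $1 }')
    for host in "$@"; do
        # -F fixed string, -x whole-line match, -q quiet.
        printf '%s\n' "$registered" | grep -Fxq -- "$host" ||
            printf '%s\n' "$host"
    done
}
```

For example, on the master, with placeholder host names: oc get nodes | missing_nodes master.example.internal node1.example.internal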


Thanks!



On Sun, Jul 30, 2017 at 10:57 AM, Tim Dudgeon <tdudgeon ml gmail com> wrote:
I'm trying to get to grips with the advanced (Ansible) installer.
Initially I'm trying to do something very simple, fire up a cluster with one master and one node.
My inventory file looks like this:

[OSEv3:children]
masters
nodes


[OSEv3:vars]
ansible_ssh_user=root
openshift_hostname=<private master dns name>
openshift_master_cluster_hostname=<private master dns name>
openshift_master_cluster_public_hostname=<public master dns name>
openshift_disable_check=docker_storage,memory_availability
openshift_deployment_type=origin

[masters]
<private master dns name>

[etcd]
<private master dns name>


[nodes]
<private master dns name>
<private node1 dns name>


I run:
ansible-playbook ~/openshift-ansible/playbooks/byo/config.yml
and (after a long time) it completes, without any noticeable errors:

...
PLAY RECAP *********************************************************************
<private node1 dns name>  : ok=235  changed=56   unreachable=0  failed=0
<private master dns name> : ok=623  changed=166  unreachable=0  failed=0
localhost                 : ok=12   changed=0    unreachable=0  failed=0

Both nodes seem to have been set up OK.
But when I look on the master node there is only the master in the cluster, no second node:

oc get nodes
NAME                        STATUS                     AGE
<private master dns name>   Ready,SchedulingDisabled   32m

and of course, in this state, nothing can be scheduled.

Presumably the node should be added to the cluster, so any ideas what is going wrong here?

Thanks
Tim




--
Tim Bielawa, Sr. Software Engineer [ED-C137]
IRC: tbielawa (#openshift)
1BA0 4FAB 4C13 FBA0 A036  4958 AD05 E75E 0333 AE37
