
Re: Failure when adding node - Approve node certificates when bootstrapping



Hi,

I recently ran the scale-up procedure on a 3.11 OKD cluster and got the same result (a failure) from the Ansible run.
However, when I checked the node status and related details, I found that the node had been added to the cluster successfully and was in the "Ready" state.

The following shows the status of the nodes, their roles, internal IPs, etc.:

oc get nodes -o wide

I had a similar CSR problem when I first installed the cluster and posted a question here some weeks ago. My problem was DNS-related, but while searching for a solution I found that subsequent runs of the node playbook generated OKD CSRs that were never approved and stayed in the Pending state.
You can check whether there are any, and what state they are in, with:

oc get csr
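
If some requests do turn out to be stuck, they can also be approved by hand. Below is a minimal sketch, assuming that the last column of the plain `oc get csr` table is the condition and reads exactly "Pending" for unapproved requests. The helper names pending_csrs and approve_pending_csrs are my own, not part of oc; `oc adm certificate approve` is the regular approval command.

```shell
# Print the names of CSRs whose last column (CONDITION) is exactly "Pending".
# Reads the plain-text table produced by `oc get csr` on stdin.
pending_csrs() {
    awk '$NF == "Pending" { print $1 }'
}

# Approve every pending CSR, one at a time.
# Run `oc get csr | pending_csrs` first and review the list.
approve_pending_csrs() {
    oc get csr | pending_csrs | while read -r name; do
        oc adm certificate approve "$name"
    done
}
```

Running `oc get csr | pending_csrs` on its own only lists the candidates, which is a safer first step than approving everything blindly.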


What I did was enable automatic certificate approval on the master by setting the variable

openshift_master_bootstrap_auto_approve=true

This is documented as being used in the cluster auto-scaling procedure for AWS-deployed clusters. I honestly don't know whether this change has side effects apart from eliminating the duplicate/invalid CSRs created by subsequent runs of the same playbook. Again, I set it while trying to solve the initial problem and left it in place for later operations with the inventory file.
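
Before re-running a playbook it may be worth confirming that the variable is really set in the inventory. A small sketch, assuming the inventory is the /etc/ansible/hosts file from the original post and that the variable sits on a line of its own (typically under [OSEv3:vars], which is my assumption about the layout); auto_approve_enabled is a made-up helper name.

```shell
# Succeed (exit 0) if the inventory file given as $1 contains an uncommented
# line setting openshift_master_bootstrap_auto_approve to true.
auto_approve_enabled() {
    grep -Eiq '^[[:space:]]*openshift_master_bootstrap_auto_approve[[:space:]]*=[[:space:]]*true[[:space:]]*$' "$1"
}
```

For example: `auto_approve_enabled /etc/ansible/hosts && echo "auto-approve is on"`.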

Going back to the scale-up problem: after checking the node state, I also verified that the node gets Pods scheduled on it (either by running repeated deployments of a test app, or by adding a label to the new node and specifying it as a selector in the test DeploymentConfig).
In my case, again, the node addition seems to have succeeded despite the Ansible install error.
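
The scheduling check can be sketched roughly as follows. The helper names pods_on_node and label_node are made up for illustration, and the label in the usage line (scaleup-test=true) is a hypothetical example; the node name is the one from the original post. The spec.nodeName field selector is a standard one for pods.

```shell
# List all pods, in every project, that are scheduled on the node named $1.
pods_on_node() {
    oc get pods --all-namespaces -o wide --field-selector "spec.nodeName=$1"
}

# Add a label ($2, in key=value form) to the node named $1, so that a test
# DeploymentConfig can target the node through its nodeSelector.
label_node() {
    oc label node "$1" "$2"
}
```

Usage would look like `label_node os-node2.MYDOMAIN scaleup-test=true`, followed by `pods_on_node os-node2.MYDOMAIN` once the test deployment has rolled out.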

Hope this is of some help,
Dan


On 25.06.2019 21:00, Robert Dahlem wrote:
Hi,

I tried adding a node by adding to /etc/ansible/hosts:
===============================================================================
[OSEv3:children]
new_nodes

[new_nodes:vars]
openshift_disable_check=disk_availability,memory_availability,docker_storage

[new_nodes]
os-node2.MYDOMAIN openshift_node_group_name='node-config-compute'
===============================================================================

and running:
# ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/openshift-node/scaleup.yml

Unfortunately this (repeatedly) ends in:

===============================================================================
TASK [Approve node certificates when bootstrapping]
*******************************************************************************
FAILED - RETRYING: Approve node certificates when bootstrapping (30 retries left).
...FAILED - RETRYING: Approve node certificates when bootstrapping (1 retries left).
...
        to retry, use: --limit @/usr/share/ansible/openshift-ansible/playbooks/openshift-node/scaleup.retry
===============================================================================

# uname -a
Linux os-master.MYDOMAIN 3.10.0-957.21.3.el7.x86_64 #1 SMP Tue Jun 18 16:35:19 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

# cat /etc/os-release
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

# oc version
oc v3.11.0+62803d0-1
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://os-master.openshift.rdahlem.de:8443
openshift v3.11.0+7f5d53b-195
kubernetes v1.11.0+d4cacc0


# ansible --version
ansible 2.6.14
  config file = /etc/ansible/ansible.cfg
  configured module search path = [u'/root/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python2.7/site-packages/ansible
  executable location = /bin/ansible
  python version = 2.7.5 (default, Jun 20 2019, 20:27:34) [GCC 4.8.5 20150623 (Red Hat 4.8.5-36)]

What additional information would be needed?

Kind regards,
Robert
