
Re: Missing OpenShift Nodes - Unable to Join Cluster

No, the hostnames are the same.  Because I was getting the "external Id from Cloud provider" error, I disabled the AWS configuration settings and left it as solely a BYO.  

This allowed me to get my nodes back up.  There's definitely something wrong with the AWS cloud provider settings and how instance names for nodes are being looked up.

I only need the AWS config for EBS storage for Persistent Volumes, so I can't fully disable the AWS settings.

How does the external id lookup work?  Can I verify the settings it expects?

Isaac Christoffersen
Technical Director
w: 703.318.7800 x8202 | m: 703.980.2836 | @ichristo

Vizuri, a division of AEM Corporation
13880 Dulles Corner Lane # 300
Herndon, Virginia 20171
www.vizuri.com | @1Vizuri

On Thu, Sep 8, 2016 at 9:24 PM, Jason DeTiberus <jdetiber redhat com> wrote:

On Sep 8, 2016 7:06 PM, "Isaac Christoffersen" <ichristoffersen vizuri com> wrote:
> I'm running Origin in AWS and after adding some shared EFS volumes to the node instances, the nodes seem to be unable to rejoin the cluster.  
> It's a 3 Master + ETCD setup with 4 application Nodes.  An 'oc get nodes' returns an empty list and of course, none of the pods will start.
> The relevant error messages I see are:
> "Unable to construct api.Node object for kubelet: failed to get external ID from cloud provider: instance not found"
> "Could not find an allocated subnet for node: ip-10-0-37-217..... , Waiting..."
> and 
> "Error updating node status, will retry: error getting node "ip-10-0-37-217....": nodes "ip-10-0-37-217...." not found"
> Any insights into how to start troubleshooting further?  I'm baffled.

Did the nodes come back up with a new IP address? If so, the internal DNS name would have also changed and the node would need to be reconfigured accordingly.

Items that would need to be updated:
- node name in the node config
- node serving certificate

There is an Ansible playbook that can automate the redeployment of certificates as well (playbooks/byo/openshift-cluster/redeploy-certificates.yml).
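
As a quick way to check the first item, here is a rough Python sketch that compares the nodeName in the node config with the private DNS name the instance currently reports. It is only an illustration under a few assumptions: the node config lives at /etc/origin/node/node-config.yaml, the EC2 instance metadata endpoint (local-hostname) is reachable from the node without a session token, and the cloud provider expects the node name to equal that private DNS name. The helper names (parse_node_name, names_match, fetch_private_dns_name) are made up for this example.

```python
import re
import urllib.request

# Assumed default locations; adjust both for your installation.
NODE_CONFIG_PATH = "/etc/origin/node/node-config.yaml"
METADATA_URL = "http://169.254.169.254/latest/meta-data/local-hostname"


def parse_node_name(config_text):
    """Return the value of the top-level nodeName key, or None if absent."""
    match = re.search(r"^nodeName:\s*(\S+)\s*$", config_text, re.MULTILINE)
    if match is None:
        return None
    # Drop optional YAML quoting around the value.
    return match.group(1).strip("\"'")


def names_match(node_name, private_dns_name):
    """Case-insensitive comparison; a missing value never matches."""
    if not node_name or not private_dns_name:
        return False
    return node_name.strip().lower() == private_dns_name.strip().lower()


def fetch_private_dns_name(url=METADATA_URL, timeout=2):
    """Ask the EC2 metadata service for the instance's private DNS name."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


if __name__ == "__main__":
    with open(NODE_CONFIG_PATH) as handle:
        configured = parse_node_name(handle.read())
    actual = fetch_private_dns_name()
    print("nodeName in config:", configured)
    print("EC2 private DNS:   ", actual)
    print("match" if names_match(configured, actual) else "MISMATCH")
```

Run it on each node as a user that can read the node config. A MISMATCH would suggest the node name needs updating before the certificates are redeployed; a match would point the investigation elsewhere.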

Jason DeTiberus
