
RE: DNS lookup failures

The logging from dnsmasq was insightful.  It looks like the lookups favor the second server in the list.
In my case, the second server was meant for quick offsite lookups, so the local lookups sent to it were failing.

From: Brigman, Larry

Sent: Thursday, February 22, 2018 3:57 PM

To: Clayton Coleman

Cc: users lists openshift redhat com

Subject: RE: DNS lookup failures

I hadn't tried that.  
I did turn on dnsmasq logging of queries to help pinpoint the problem.

One of the issues was outside of OpenShift: a DNS server that wouldn't forward requests and would just time out.

Getting that server out of the system was a multi-step process, and dnsmasq had to be restarted after the config file changes before it picked up the correct DNS servers.

When it occurs again, I'll use dig against the local resolver where I'm getting the failure.
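For anyone who wants to script that check, here is a minimal sketch in Python that sends a single A-record query straight to one resolver over UDP, roughly what `dig @server name` does, and reports the response code. It is only an illustration under stated assumptions: the function names are hypothetical, it handles UDP only with no retries or truncation handling, and it reads just the response header rather than the answer records.

```python
import socket
import struct

# Common DNS response codes (RFC 1035); others are reported as their number.
RCODES = {0: "NOERROR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}


def build_query(hostname, query_id=0x1234):
    """Build a minimal DNS query packet asking for an A record."""
    # Header: id, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each label prefixed by its length, ended by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN).
    return header + qname + struct.pack("!HH", 1, 1)


def parse_header(response):
    """Return (rcode_name, answer_count) from a DNS response header."""
    _id, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    rcode = flags & 0x000F
    return RCODES.get(rcode, str(rcode)), ancount


def query_resolver(server, hostname, timeout=2.0):
    """Send one UDP query to `server` on port 53 and summarize the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(hostname), (server, 53))
            response, _ = sock.recvfrom(512)
        except socket.timeout:
            return ("TIMEOUT", 0)
        except OSError:
            # For example, nothing is listening on port 53 at that address.
            return ("UNREACHABLE", 0)
    return parse_header(response)
```

Calling `query_resolver(server, name)` once for the local dnsmasq address and once for each upstream server, with the same name, should show which one returns NXDOMAIN or times out while the others answer.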

On a multi-node cluster, changing all the files is painful.

From: Clayton Coleman [ccoleman redhat com]

Sent: Thursday, February 22, 2018 2:58 PM

To: Brigman, Larry

Cc: users lists openshift redhat com

Subject: Re: DNS lookup failures

Do you see errors when you try to dig the master DNS address?  Or if you dig the local dnsmasq?

I wonder if we're caching a negative lookup or something similar.

On Wed, Feb 21, 2018 at 6:01 PM, Brigman, Larry 
<Larry Brigman arris com> wrote:

I have been experiencing DNS lookup failures.  This is preventing production deployment of OpenShift.
I see it in two cases: lookup of a remote docker registry and lookup of an LDAP service.  Neither is local to the server(s) in question, but both are local to internal DNS servers.
The LDAP case is easier for me to replicate, as I just need to attempt to log in.
Feb 20 11:21:16 lab-stack1 atomic-openshift-master-api: E0220 11:21:16.924930    2005 login.go:176] Error authenticating "XXXX" with provider "ldap": LDAP Result Code 200 "": dial tcp: lookup ldap.xxx.xxx on xxx.xxx.xxx.xxx:53: no such
I obfuscated the user, provider name, and host for security.
xxx.xxx.xxx.xxx:53 is the master node, which is running dnsmasq with the default configuration provided by the openshift-ansible installation.
The names get resolved for a while if I go on a host and run ‘host ldap.xxx.xxx’.  Lookups then work for a while and then revert to failing.
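Since the failure comes and goes, one way to pin down when it flips is to poll the name and record only the state changes. The sketch below is an illustration, not something from the installation: `can_resolve` and `watch` are hypothetical names, and the lookup goes through the system resolver via `socket.getaddrinfo`, which may not follow the same path as the master process.

```python
import socket
import time
from datetime import datetime


def can_resolve(hostname):
    """Return True if the system resolver returns any address for hostname."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False


def watch(hostname, checks, interval=5.0, resolve=can_resolve, sleep=time.sleep):
    """Poll hostname `checks` times; return (timestamp, state) per state change."""
    transitions = []
    previous = None
    for i in range(checks):
        ok = resolve(hostname)
        if ok != previous:
            # Record only the moments where resolution starts or stops working.
            stamp = datetime.now().isoformat(timespec="seconds")
            transitions.append((stamp, "OK" if ok else "FAIL"))
            previous = ok
        if i < checks - 1:
            sleep(interval)
    return transitions
```

The timestamps of the transitions can then be lined up against the dnsmasq query log and the master API log entries.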

oc version
oc v3.7.0+7ed6862
kubernetes v1.7.6+a08f5eeb62
features: Basic-Auth GSSAPI Kerberos SPNEGO
openshift v3.7.0+7ed6862
kubernetes v1.7.6+a08f5eeb62
What are the next steps to try?  Using dig or host on the node in question always returns a valid lookup result.



