[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: openshift dns: docker-registry-deploy times out, fails




| If you've also set up /etc/hosts on your master, its dnsmasq server ought to be able to resolve the addresses (having read /etc/hosts). Is this the case? Check with dig:
| dig master.rh71 @192.168.122.78 

Before re-configuring master-config.yaml:

[root@master ~]# dig master.rh71 @192.168.122.78

; <<>> DiG 9.9.4-RedHat-9.9.4-29.el7_2.1 <<>> master.rh71 @192.168.122.78
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 42892
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available

;; QUESTION SECTION:
;master.rh71. IN A

;; Query time: 0 msec
;; SERVER: 192.168.122.78#53(192.168.122.78)
;; WHEN: Tue Jan 12 15:28:31 EST 2016
;; MSG SIZE rcvd: 29



I'm not experienced with dig, but the SERVFAIL status suggests the server can't resolve the master with the default setup (i.e., quick install method, no changes to dnsmasq).
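For what it's worth, my understanding (an editorial sketch, not from the guide): with the default quick install, port 53 on the master is the embedded SkyDNS, which only serves cluster records and returns SERVFAIL for plain host names; after the split, dnsmasq owns :53 and does read /etc/hosts. A couple of checks to confirm which case you're in (stock paths/flags; adjust for your setup):

```shell
# Which process answers on :53? Quick install: the master's embedded SkyDNS.
# After the dnsmasq/SkyDNS split: dnsmasq.
ss -lnup | grep ':53 '

# dnsmasq only honors /etc/hosts when 'no-hosts' is absent from its config
grep -rn 'no-hosts' /etc/dnsmasq.conf /etc/dnsmasq.d/ 2>/dev/null

# Re-test after any change to /etc/hosts or the dnsmasq config
systemctl restart dnsmasq
dig +short master.rh71 @192.168.122.78
```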



With dnsmasq & SkyDNS re-configured:

[root@master ~]# dig master.rh71 @192.168.122.78

; <<>> DiG 9.9.4-RedHat-9.9.4-29.el7_2.1 <<>> master.rh71 @192.168.122.78
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 63347
;; flags: qr aa rd ra ad; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;master.rh71. IN A

;; ANSWER SECTION:
master.rh71. 0 IN A 192.168.122.78

;; Query time: 1 msec
;; SERVER: 192.168.122.78#53(192.168.122.78)
;; WHEN: Tue Jan 12 15:44:25 EST 2016
;; MSG SIZE rcvd: 45


That seems to fix name resolution, but the issue persists with deploying the docker-registry.
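One possibility worth noting (my guess, not verified on this cluster): the deployer pod that already failed won't pick up the DNS fix on its own, and the node service holds the resolver configuration it read at startup. Something like the following might be needed after fixing DNS:

```shell
# Pick up the new resolver configuration on the node
systemctl restart atomic-openshift-node

# Retry the failed registry deployment rather than waiting on the dead pod
oc deploy docker-registry --retry
```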


Also, I forgot to include the journalctl output for atomic-openshift-node. It's filled with errors related to the slave being unable to resolve the master.


Jan 12 15:25:38 slave.rh71 atomic-openshift-node[31791]: E0112 15:25:38.823326 31791 manager.go:1342] Failed tearing down the infra container: exit status 1

Jan 12 15:25:38 slave.rh71 atomic-openshift-node[31791]: I0112 15:25:38.824581 31791 manager.go:1419] Killing container "e1a1b97c8ff9e69158fe1998d772f4273a4b3dad1962f407dcde3ea065713297 default/docker-registry-1-deploy" with 30 second g
Jan 12 15:25:39 slave.rh71 atomic-openshift-node[31791]: I0112 15:25:39.149577 31791 manager.go:1451] Container "e1a1b97c8ff9e69158fe1998d772f4273a4b3dad1962f407dcde3ea065713297 default/docker-registry-1-deploy" exited after 324.955308m
Jan 12 15:25:39 slave.rh71 atomic-openshift-node[31791]: E0112 15:25:39.153585 31791 pod_workers.go:113] Error syncing pod 0e5738be-b96a-11e5-a344-525400aef072, skipping: failed to delete containers ([exit status 1])
Jan 12 15:25:39 slave.rh71 atomic-openshift-node[31791]: I0112 15:25:39.200610 31791 container.go:430] Failed to update stats for container "/system.slice/boot.mount": failed to parse memory.usage_in_bytes - read /sys/fs/cgroup/memory/s
Jan 12 15:25:41 slave.rh71 atomic-openshift-node[31791]: I0112 15:25:41.010269 31791 helpers.go:96] Unable to get network stats from pid 34704: couldn't read network stats: failure opening /proc/34704/net/dev: open /proc/34704/net/dev:
Jan 12 15:25:41 slave.rh71 atomic-openshift-node[31791]: I0112 15:25:41.798062 31791 helpers.go:96] Unable to get network stats from pid 34455: couldn't read network stats: failure opening /proc/34455/net/dev: open /proc/34455/net/dev:
Jan 12 15:25:42 slave.rh71 atomic-openshift-node[31791]: I0112 15:25:42.797868 31791 helpers.go:96] Unable to get network stats from pid 34455: couldn't read network stats: failure opening /proc/34455/net/dev: open /proc/34455/net/dev:
Jan 12 15:25:44 slave.rh71 atomic-openshift-node[31791]: I0112 15:25:44.797762 31791 helpers.go:96] Unable to get network stats from pid 34455: couldn't read network stats: failure opening /proc/34455/net/dev: open /proc/34455/net/dev:
Jan 12 15:25:48 slave.rh71 atomic-openshift-node[31791]: I0112 15:25:48.797794 31791 helpers.go:96] Unable to get network stats from pid 34455: couldn't read network stats: failure opening /proc/34455/net/dev: open /proc/34455/net/dev:
Jan 12 15:25:49 slave.rh71 atomic-openshift-node[31791]: I0112 15:25:49.010277 31791 helpers.go:96] Unable to get network stats from pid 34704: couldn't read network stats: failure opening /proc/34704/net/dev: open /proc/34704/net/dev:
Jan 12 15:25:56 slave.rh71 atomic-openshift-node[31791]: I0112 15:25:56.797829 31791 helpers.go:96] Unable to get network stats from pid 34455: couldn't read network stats: failure opening /proc/34455/net/dev: open /proc/34455/net/dev:
Jan 12 15:41:17 slave.rh71 atomic-openshift-node[31791]: I0112 15:41:17.806951 31791 iowatcher.go:102] Unexpected EOF during watch stream event decoding: unexpected EOF
Jan 12 15:41:17 slave.rh71 atomic-openshift-node[31791]: I0112 15:41:17.807053 31791 iowatcher.go:102] Unexpected EOF during watch stream event decoding: unexpected EOF
Jan 12 15:41:17 slave.rh71 atomic-openshift-node[31791]: I0112 15:41:17.806951 31791 iowatcher.go:102] Unexpected EOF during watch stream event decoding: unexpected EOF
Jan 12 15:41:17 slave.rh71 atomic-openshift-node[31791]: I0112 15:41:17.810417 31791 iowatcher.go:102] Unexpected EOF during watch stream event decoding: unexpected EOF
Jan 12 15:41:17 slave.rh71 atomic-openshift-node[31791]: I0112 15:41:17.811365 31791 iowatcher.go:102] Unexpected EOF during watch stream event decoding: unexpected EOF
Jan 12 15:41:17 slave.rh71 atomic-openshift-node[31791]: E0112 15:41:17.812240 31791 reflector.go:206] pkg/kubelet/kubelet.go:240: Failed to watch *api.Node: Get https://master.rh71:8443/api/v1/watch/nodes?fieldSelector=metadata.name%3D
Jan 12 15:41:17 slave.rh71 atomic-openshift-node[31791]: E0112 15:41:17.812377 31791 reflector.go:206] pkg/kubelet/config/apiserver.go:43: Failed to watch *api.Pod: Get https://master.rh71:8443/api/v1/watch/pods?fieldSelector=spec.nodeN
Jan 12 15:41:17 slave.rh71 atomic-openshift-node[31791]: E0112 15:41:17.812431 31791 reflector.go:206] pkg/kubelet/kubelet.go:223: Failed to watch *api.Service: Get https://master.rh71:8443/api/v1/watch/services?resourceVersion=905: dia
Jan 12 15:41:17 slave.rh71 atomic-openshift-node[31791]: I0112 15:41:17.812486 31791 iowatcher.go:102] Unexpected EOF during watch stream event decoding: unexpected EOF
Jan 12 15:41:17 slave.rh71 atomic-openshift-node[31791]: E0112 15:41:17.813541 31791 reflector.go:206] pkg/proxy/config/api.go:47: Failed to watch *api.Service: Get https://master.rh71:8443/api/v1/watch/services?resourceVersion=1117: di
Jan 12 15:41:17 slave.rh71 atomic-openshift-node[31791]: E0112 15:41:17.813620 31791 reflector.go:206] pkg/proxy/config/api.go:60: Failed to watch *api.Endpoints: Get https://master.rh71:8443/api/v1/watch/endpoints?resourceVersion=1117:
Jan 12 15:41:17 slave.rh71 atomic-openshift-node[31791]: E0112 15:41:17.815172 31791 reflector.go:206] /builddir/build/BUILD/atomic-openshift-git-15.5e061c3/_thirdpartyhacks/src/github.com/openshift/openshift-sdn/plugins/osdn/osdn.go:52
Jan 12 15:41:18 slave.rh71 atomic-openshift-node[31791]: E0112 15:41:18.065568 31791 kubelet.go:2396] Error updating node status, will retry: error getting node "slave.rh71": Get https://master.rh71:8443/api/v1/nodes/slave.rh71: dial tc
Jan 12 15:41:18 slave.rh71 atomic-openshift-node[31791]: E0112 15:41:18.066377 31791 kubelet.go:2396] Error updating node status, will retry: error getting node "slave.rh71": Get https://master.rh71:8443/api/v1/nodes/slave.rh71: dial tc
Jan 12 15:41:18 slave.rh71 atomic-openshift-node[31791]: E0112 15:41:18.067003 31791 kubelet.go:2396] Error updating node status, will retry: error getting node "slave.rh71": Get https://master.rh71:8443/api/v1/nodes/slave.rh71: dial tc
Jan 12 15:41:18 slave.rh71 atomic-openshift-node[31791]: E0112 15:41:18.067595 31791 kubelet.go:2396] Error updating node status, will retry: error getting node "slave.rh71": Get https://master.rh71:8443/api/v1/nodes/slave.rh71: dial tc
Jan 12 15:41:18 slave.rh71 atomic-openshift-node[31791]: E0112 15:41:18.068158 31791 kubelet.go:2396] Error updating node status, will retry: error getting node "slave.rh71": Get https://master.rh71:8443/api/v1/nodes/slave.rh71: dial tc
Jan 12 15:41:18 slave.rh71 atomic-openshift-node[31791]: E0112 15:41:18.068170 31791 kubelet.go:977] Unable to update node status: update node status exceeds retry count
Jan 12 15:41:18 slave.rh71 atomic-openshift-node[31791]: E0112 15:41:18.813350 31791 reflector.go:206] pkg/kubelet/kubelet.go:240: Failed to watch *api.Node: Get https://master.rh71:8443/api/v1/watch/nodes?fieldSelector=metadata.name%3D
Jan 12 15:41:18 slave.rh71 atomic-openshift-node[31791]: E0112 15:41:18.813900 31791 reflector.go:206] pkg/kubelet/config/apiserver.go:43: Failed to watch *api.Pod: Get https://master.rh71:8443/api/v1/watch/pods?fieldSelector=spec.nodeN
Jan 12 15:41:18 slave.rh71 atomic-openshift-node[31791]: E0112 15:41:18.813949 31791 reflector.go:206] pkg/kubelet/kubelet.go:223: Failed to watch *api.Service: Get https://master.rh71:8443/api/v1/watch/services?resourceVersion=905: dia
Jan 12 15:41:18 slave.rh71 atomic-openshift-node[31791]: E0112 15:41:18.814487 31791 reflector.go:206] pkg/proxy/config/api.go:47: Failed to watch *api.Service: Get https://master.rh71:8443/api/v1/watch/services?resourceVersion=1117: di
Jan 12 15:41:18 slave.rh71 atomic-openshift-node[31791]: E0112 15:41:18.814536 31791 reflector.go:206] pkg/proxy/config/api.go:60: Failed to watch *api.Endpoints: Get https://master.rh71:8443/api/v1/watch/endpoints?resourceVersion=1117:
Jan 12 15:41:18 slave.rh71 atomic-openshift-node[31791]: E0112 15:41:18.815919 31791 reflector.go:206] /builddir/build/BUILD/atomic-openshift-git-15.5e061c3/_thirdpartyhacks/src/github.com/openshift/openshift-sdn/plugins/osdn/osdn.go:52
Jan 12 15:41:19 slave.rh71 atomic-openshift-node[31791]: E0112 15:41:19.816513 31791 reflector.go:206] pkg/kubelet/kubelet.go:223: Failed to watch *api.Service: Get https://master.rh71:8443/api/v1/watch/services?resourceVersion=905: dia
Jan 12 15:41:19 slave.rh71 atomic-openshift-node[31791]: E0112 15:41:19.817388 31791 reflector.go:206] pkg/kubelet/kubelet.go:240: Failed to watch *api.Node: Get https://master.rh71:8443/api/v1/watch/nodes?fieldSelector=metadata.name%3D
Jan 12 15:41:19 slave.rh71 atomic-openshift-node[31791]: E0112 15:41:19.817848 31791 reflector.go:206] pkg/proxy/config/api.go:60: Failed to watch *api.Endpoints: Get https://master.rh71:8443/api/v1/watch/endpoints?resourceVersion=1117:
Jan 12 15:41:19 slave.rh71 atomic-openshift-node[31791]: E0112 15:41:19.817917 31791 reflector.go:206] pkg/proxy/config/api.go:47: Failed to watch *api.Service: Get https://master.rh71:8443/api/v1/watch/services?resourceVersion=1117: di
Jan 12 15:41:19 slave.rh71 atomic-openshift-node[31791]: E0112 15:41:19.817945 31791 reflector.go:206] pkg/kubelet/config/apiserver.go:43: Failed to watch *api.Pod: Get https://master.rh71:8443/api/v1/watch/pods?fieldSelector=spec.nodeN
Jan 12 15:41:19 slave.rh71 atomic-openshift-node[31791]: E0112 15:41:19.819333 31791 reflector.go:206] /builddir/build/BUILD/atomic-openshift-git-15.5e061c3/_thirdpartyhacks/src/github.com/openshift/openshift-sdn/plugins/osdn/osdn.go:52
Jan 12 15:41:20 slave.rh71 atomic-openshift-node[31791]: E0112 15:41:20.817735 31791 reflector.go:206] pkg/kubelet/kubelet.go:223: Failed to watch *api.Service: Get https://master.rh71:8443/api/v1/watch/services?resourceVersion=905: dia
Jan 12 15:41:20 slave.rh71 atomic-openshift-node[31791]: E0112 15:41:20.818426 31791 reflector.go:206] pkg/kubelet/kubelet.go:240: Failed to watch *api.Node: Get https://master.rh71:8443/api/v1/watch/nodes?fieldSelector=metadata.name%3D
Jan 12 15:41:20 slave.rh71 atomic-openshift-node[31791]: E0112 15:41:20.818572 31791 reflector.go:206] pkg/proxy/config/api.go:60: Failed to watch *api.Endpoints: Get https://master.rh71:8443/api/v1/watch/endpoints?resourceVersion=1117:
Jan 12 15:41:20 slave.rh71 atomic-openshift-node[31791]: E0112 15:41:20.818819 31791 reflector.go:206] pkg/kubelet/config/apiserver.go:43: Failed to watch *api.Pod: Get https://master.rh71:8443/api/v1/watch/pods?fieldSelector=spec.nodeN
Jan 12 15:41:20 slave.rh71 atomic-openshift-node[31791]: E0112 15:41:20.818850 31791 reflector.go:206] pkg/proxy/config/api.go:47: Failed to watch *api.Service: Get https://master.rh71:8443/api/v1/watch/services?resourceVersion=1117: di
Jan 12 15:41:20 slave.rh71 atomic-openshift-node[31791]: E0112 15:41:20.819911 31791 reflector.go:206] /builddir/build/BUILD/atomic-openshift-git-15.5e061c3/_thirdpartyhacks/src/github.com/openshift/openshift-sdn/plugins/osdn/osdn.go:52
Jan 12 15:41:22 slave.rh71 atomic-openshift-node[31791]: I0112 15:41:22.183391 31791 roundrobin.go:263] LoadBalancerRR: Setting endpoints for default/kubernetes:dns-tcp to [192.168.122.78:8053]
Jan 12 15:41:22 slave.rh71 atomic-openshift-node[31791]: I0112 15:41:22.183465 31791 roundrobin.go:220] Delete endpoint 192.168.122.78:8053 for service "default/kubernetes:dns-tcp"
Jan 12 15:41:22 slave.rh71 atomic-openshift-node[31791]: I0112 15:41:22.183483 31791 roundrobin.go:220] Delete endpoint 192.168.122.78:53 for service "default/kubernetes:dns-tcp"
Jan 12 15:41:22 slave.rh71 atomic-openshift-node[31791]: I0112 15:41:22.183513 31791 roundrobin.go:263] LoadBalancerRR: Setting endpoints for default/kubernetes:dns to [192.168.122.78:8053]
Jan 12 15:41:22 slave.rh71 atomic-openshift-node[31791]: I0112 15:41:22.183530 31791 roundrobin.go:220] Delete endpoint 192.168.122.78:8053 for service "default/kubernetes:dns"
Jan 12 15:41:22 slave.rh71 atomic-openshift-node[31791]: I0112 15:41:22.183542 31791 roundrobin.go:220] Delete endpoint 192.168.122.78:53 for service "default/kubernetes:dns"
Jan 12 15:41:22 slave.rh71 atomic-openshift-node[31791]: I0112 15:41:22.206927 31791 proxier.go:390] Adding new service "default/kubernetes:dns" at 172.30.0.1:8053/UDP
Jan 12 15:41:22 slave.rh71 atomic-openshift-node[31791]: I0112 15:41:22.207037 31791 proxier.go:332] Proxying for service "default/kubernetes:dns" on UDP port 37361
Jan 12 15:41:22 slave.rh71 atomic-openshift-node[31791]: E0112 15:41:22.211031 31791 proxysocket.go:216] ReadFrom failed, exiting ProxyLoop: read udp [::]:39824: use of closed network connection
Jan 12 15:41:22 slave.rh71 atomic-openshift-node[31791]: I0112 15:41:22.242899 31791 proxier.go:390] Adding new service "default/kubernetes:dns-tcp" at 172.30.0.1:8053/TCP
Jan 12 15:41:22 slave.rh71 atomic-openshift-node[31791]: I0112 15:41:22.243034 31791 proxier.go:332] Proxying for service "default/kubernetes:dns-tcp" on TCP port 53574



From: "Luke Meyer" <lmeyer redhat com>
To: "Jon Cope" <jcope redhat com>
Cc: "users" <users lists openshift redhat com>
Sent: Tuesday, January 12, 2016 11:53:32 AM
Subject: Re: openshift dns: docker-registry-deploy times out, fails



On Tue, Jan 12, 2016 at 11:04 AM, Jon Cope <jcope redhat com> wrote:
Hi all,
I'm new to OpenShift and attempting to run a small proof-of-concept cluster. I can deploy OpenShift 3.1 using the quick install method without issue. The problem is that the docker-registry-1-deploy pod times out when it can't contact the master service. I suspect OpenShift DNS isn't working properly, but I'm unsure how to diagnose it.

My end goal is to set up the docker-registry using a GlusterFS PVC.

From the pod's log:
    [root@master init-cluster]# oc logs docker-registry-1-deploy
    F0111 18:26:03.967979 1 deployer.go:65] couldn't get deployment default/docker-registry-1: Get https://master.rh71:8443/api/v1/namespaces/default/replicationcontrollers/docker-registry-1: dial tcp: lookup master.rh71: no such host

General info:

- 2 vm cluster; rhel server 7.2 (also reproducible on 7.1)
- ose 3.1
- docker 1.8.2

- /etc/hosts is set across both nodes:
    [root@slave init-cluster]# cat /etc/hosts
    127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
    ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6

    192.168.122.78 master.rh71
    192.168.122.52 slave.rh71

Just FYI, pods and direct docker containers don't inherit /etc/hosts from the node. In order to resolve it inside containers, DNS needs to resolve these names.
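A quick way to check what a pod actually sees (a sketch; `<some-running-pod>` is a placeholder — substitute a pod name from `oc get pods`):

```shell
# The pod's resolv.conf is written by the kubelet, not copied from the node's
# /etc/hosts, so this file is what matters for in-pod resolution
oc exec <some-running-pod> -- cat /etc/resolv.conf
oc exec <some-running-pod> -- nslookup master.rh71
```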
 

- /etc/resolv.conf looks right:
    [root@slave ~]# cat /etc/resolv.conf
    # Generated by NetworkManager
    search rh71
    nameserver 192.168.122.1

- first indicator of something wrong:
    [root@master ~]# oc get pods
    NAME READY STATUS RESTARTS AGE
    docker-registry-1-deploy 0/1 Error 0 59s

- describing the pod
[root@master ~]# oc describe pods
Name: docker-registry-1-deploy
Namespace: default
Image(s): openshift3/ose-deployer:v3.1.0.4
Node: slave.rh71/192.168.122.52
Start Time: Mon, 11 Jan 2016 18:25:59 -0500
Labels: openshift.io/deployer-pod-for.name=docker-registry-1
Status: Failed
Reason:
Message:
IP:
Replication Controllers: <none>
Containers:
deployment:
Container ID: docker://674407a0ac9187245dbb4df45b70465091cfd5368315a1a569d5987e62be9785
Image: openshift3/ose-deployer:v3.1.0.4
Image ID: docker://9580a28b3e18c64cff56f96e3f777464431accde6c98b3765d9bfc5a7e619ea2
QoS Tier:
memory: BestEffort
cpu: BestEffort
State: Terminated
Reason: Error
Exit Code: 255
Started: Mon, 11 Jan 2016 18:26:02 -0500
Finished: Mon, 11 Jan 2016 18:26:04 -0500
Ready: False
Restart Count: 0
Environment Variables:
KUBERNETES_MASTER: https://master.rh71:8443
OPENSHIFT_MASTER: https://master.rh71:8443
BEARER_TOKEN_FILE: /var/run/secrets/kubernetes.io/serviceaccount/token
OPENSHIFT_CA_DATA: <left out to reduce the wall of text>

OPENSHIFT_DEPLOYMENT_NAME: docker-registry-1
OPENSHIFT_DEPLOYMENT_NAMESPACE: default
Conditions:
Type Status
Ready False
Volumes:
deployer-token-duq7o:
Type: Secret (a secret that should populate this volume)
SecretName: deployer-token-duq7o
Events:
FirstSeen LastSeen Count From SubobjectPath Reason Message
───────── ──────── ───── ──── ───────────── ────── ───────
23m 23m 1 {kubelet slave.rh71} implicitly required container POD Pulled Container image "openshift3/ose-pod:v3.1.0.4" already present on machine
23m 23m 1 {kubelet slave.rh71} implicitly required container POD Created Created with docker id 9ffdbb0f8e6c
23m 23m 1 {kubelet slave.rh71} implicitly required container POD Started Started with docker id 9ffdbb0f8e6c
23m 23m 1 {kubelet slave.rh71} spec.containers{deployment} Pulled Container image "openshift3/ose-deployer:v3.1.0.4" already present on machine
23m 23m 1 {kubelet slave.rh71} spec.containers{deployment} Created Created with docker id 674407a0ac91
23m 23m 1 {kubelet slave.rh71} spec.containers{deployment} Started Started with docker id 674407a0ac91
22m 22m 1 {kubelet slave.rh71} implicitly required container POD Killing Killing with docker id 9ffdbb0f8e6c
22m 22m 1 {kubelet slave.rh71} FailedSync Error syncing pod, skipping: failed to delete containers ([exit status 1])
19m 19m 1 {scheduler } Scheduled Successfully assigned docker-registry-1-deploy to slave.rh71


To test, I created a busybox container on the slave and attempted an nslookup of the master. It cannot resolve the server name.

[root@slave ~]# docker run -it --rm busybox nslookup master.rh71
Server: 192.168.122.1
Address 1: 192.168.122.1

nslookup: can't resolve 'master.rh71'

Kubernetes inserts the SkyDNS (master) IP into /etc/resolv.conf for containers it owns. When you run a docker container directly, it doesn't get this; it just gets what the host has (and no /etc/hosts). So be aware those are different environments. You could try `oc run` to directly run an image (not sure how long that has existed).
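A rough sketch of that test (flags and generated names may differ on this 3.1 release; on this version `oc run` creates a deployment config, so clean up afterwards — `<pod-suffix>` is the generated suffix):

```shell
# Run busybox under OpenShift so the kubelet injects the cluster resolver,
# then read the lookup result from the pod's log
oc run dns-test --image=busybox --command -- nslookup master.rh71
oc logs dns-test-1-<pod-suffix>
oc delete dc dns-test
```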
 


Following instructions on configuring dnsmasq and OpenShift's SkyDNS to coexist on the master allowed the nodes to perform nslookups but did nothing to fix the issue. Guide here: http://developerblog.redhat.com/2015/11/19/dns-your-openshift-v3-cluster/
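For anyone following along, the gist of that guide as I understand it (paraphrased from memory — the blog post is authoritative; addresses are this cluster's, the filename is hypothetical): move the master's SkyDNS off :53 to :8053 in master-config.yaml, then let dnsmasq own :53, forwarding cluster names to SkyDNS and answering host names from /etc/hosts. The :8053 endpoints in the journalctl output above match this layout.

```yaml
# master-config.yaml: rebind the master's SkyDNS so dnsmasq can own port 53
dnsConfig:
  bindAddress: 0.0.0.0:8053
```

```
# /etc/dnsmasq.d/openshift.conf (hypothetical filename)
# Cluster names go to SkyDNS on :8053; everything else falls through to the
# normal upstream, and /etc/hosts (master.rh71, slave.rh71) is served directly.
server=/cluster.local/127.0.0.1#8053
server=/30.172.in-addr.arpa/127.0.0.1#8053
```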

If you've also set up /etc/hosts on your master, its dnsmasq server ought to be able to resolve the addresses (having read /etc/hosts). Is this the case? Check with dig:

dig master.rh71 @192.168.122.78 
 


Here I'm stumped. What else can I do to further diagnose the cause? It appears that OpenShift DNS isn't working as expected; however, I'm lost as to where to look next.

Appreciatively,
Jon

_______________________________________________
users mailing list
users lists openshift redhat com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


