
openshift dns: docker-registry-deploy times out, fails



Hi all, 
I'm new to OpenShift and attempting to run a small proof-of-concept cluster. I can deploy OpenShift 3.1 using the quick install method without issue. The problem is that the docker-registry-1-deploy pod times out because it can't contact the master service. I suspect OpenShift DNS isn't working properly, but I'm unsure how to diagnose it.

My end goal is to set up the docker-registry with a GlusterFS PVC. 

From the pod's log: 
    [root master init-cluster]# oc logs docker-registry-1-deploy 
    F0111 18:26:03.967979 1 deployer.go:65] couldn't get deployment default/docker-registry-1: Get https://master.rh71:8443/api/v1/namespaces/default/replicationcontrollers/docker-registry-1: dial tcp: lookup master.rh71: no such host 

General info: 

- 2 vm cluster; rhel server 7.2 (also reproducible on 7.1) 
- ose 3.1 
- docker 1.8.2 

- /etc/hosts is set across both nodes: 
    [root slave init-cluster]# cat /etc/hosts 
    127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4 
    ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6 

    192.168.122.78 master.rh71 
    192.168.122.52 slave.rh71 

- /etc/resolv.conf looks right: 
    [root slave ~]# cat /etc/resolv.conf 
    # Generated by NetworkManager 
    search rh71 
    nameserver 192.168.122.1 
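
One quick check (a sketch; the hostname and addresses are taken from the configs above) is to ask that upstream nameserver directly whether it can resolve the cluster names, since containers inherit the host's /etc/resolv.conf but not its /etc/hosts:

```shell
# Ask the nameserver from /etc/resolv.conf directly; an empty
# answer means containers have no way to resolve the master's name.
dig +short master.rh71 @192.168.122.1

# Compare with how the host itself resolves it (here /etc/hosts wins):
getent hosts master.rh71
```

If the first query returns nothing while the second succeeds, the names exist only in /etc/hosts and pods will never see them.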

- the first indicator that something was wrong: 
    [root master ~]# oc get pods 
    NAME READY STATUS RESTARTS AGE 
    docker-registry-1-deploy 0/1 Error 0 59s 

- describing the pod: 
    [root master ~]# oc describe pods 
    Name:           docker-registry-1-deploy 
    Namespace:      default 
    Image(s):       openshift3/ose-deployer:v3.1.0.4 
    Node:           slave.rh71/192.168.122.52 
    Start Time:     Mon, 11 Jan 2016 18:25:59 -0500 
    Labels:         openshift.io/deployer-pod-for.name=docker-registry-1 
    Status:         Failed 
    Reason: 
    Message: 
    IP: 
    Replication Controllers:  <none> 
    Containers: 
      deployment: 
        Container ID:   docker://674407a0ac9187245dbb4df45b70465091cfd5368315a1a569d5987e62be9785 
        Image:          openshift3/ose-deployer:v3.1.0.4 
        Image ID:       docker://9580a28b3e18c64cff56f96e3f777464431accde6c98b3765d9bfc5a7e619ea2 
        QoS Tier: 
          memory:       BestEffort 
          cpu:          BestEffort 
        State:          Terminated 
          Reason:       Error 
          Exit Code:    255 
          Started:      Mon, 11 Jan 2016 18:26:02 -0500 
          Finished:     Mon, 11 Jan 2016 18:26:04 -0500 
        Ready:          False 
        Restart Count:  0 
        Environment Variables: 
          KUBERNETES_MASTER:              https://master.rh71:8443 
          OPENSHIFT_MASTER:               https://master.rh71:8443 
          BEARER_TOKEN_FILE:              /var/run/secrets/kubernetes.io/serviceaccount/token 
          OPENSHIFT_CA_DATA:              <left out to reduce the wall of text> 
          OPENSHIFT_DEPLOYMENT_NAME:      docker-registry-1 
          OPENSHIFT_DEPLOYMENT_NAMESPACE: default 
    Conditions: 
      Type    Status 
      Ready   False 
    Volumes: 
      deployer-token-duq7o: 
        Type:        Secret (a secret that should populate this volume) 
        SecretName:  deployer-token-duq7o 
    Events: 
      FirstSeen  LastSeen  Count  From                  SubobjectPath                Reason      Message 
      ─────────  ────────  ─────  ────                  ─────────────                ──────      ─────── 
      23m        23m       1      {kubelet slave.rh71}  implicitly required container POD  Pulled    Container image "openshift3/ose-pod:v3.1.0.4" already present on machine 
      23m        23m       1      {kubelet slave.rh71}  implicitly required container POD  Created   Created with docker id 9ffdbb0f8e6c 
      23m        23m       1      {kubelet slave.rh71}  implicitly required container POD  Started   Started with docker id 9ffdbb0f8e6c 
      23m        23m       1      {kubelet slave.rh71}  spec.containers{deployment}  Pulled      Container image "openshift3/ose-deployer:v3.1.0.4" already present on machine 
      23m        23m       1      {kubelet slave.rh71}  spec.containers{deployment}  Created     Created with docker id 674407a0ac91 
      23m        23m       1      {kubelet slave.rh71}  spec.containers{deployment}  Started     Started with docker id 674407a0ac91 
      22m        22m       1      {kubelet slave.rh71}  implicitly required container POD  Killing   Killing with docker id 9ffdbb0f8e6c 
      22m        22m       1      {kubelet slave.rh71}                               FailedSync  Error syncing pod, skipping: failed to delete containers ([exit status 1]) 
      19m        19m       1      {scheduler }                                       Scheduled   Successfully assigned docker-registry-1-deploy to slave.rh71 


To test, I created a busybox container on the slave and attempted an nslookup of the master. It could not resolve the server name. 

[root slave ~]# docker run -it --rm busybox nslookup master.rh71 
Server: 192.168.122.1 
Address 1: 192.168.122.1 

nslookup: can't resolve 'master.rh71' 
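
That result matches the deployer failure: Docker copies the host's /etc/resolv.conf into containers but not its /etc/hosts, so names defined only in /etc/hosts are invisible inside pods. One way to bridge the gap (a sketch, assuming 192.168.122.1 is a dnsmasq instance you control; the file name below is hypothetical) is to publish the host entries through dnsmasq itself:

```
# Hypothetical drop-in: /etc/dnsmasq.d/openshift-hosts.conf
# host-record publishes an A record for each cluster node, so the
# names become resolvable from inside containers as well.
host-record=master.rh71,192.168.122.78
host-record=slave.rh71,192.168.122.52
```

(If 192.168.122.1 is libvirt's dnsmasq for the default network, the equivalent records would instead go into the network XML via `virsh net-edit`.)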


Following the instructions for configuring dnsmasq and OpenShift's SkyDNS to coexist on the master allowed the nodes to perform nslookups, but it did nothing to fix the deployment failure. Guide here: http://developerblog.redhat.com/2015/11/19/dns-your-openshift-v3-cluster/ 
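
For anyone comparing notes, the shape of that guide's setup as I understand it (a sketch from memory; verify the details against the post) is to move the master's SkyDNS off port 53 and let dnsmasq forward cluster lookups to it while sending everything else upstream:

```
# /etc/origin/master/master-config.yaml (master):
#   dnsConfig:
#     bindAddress: 0.0.0.0:8053
#
# /etc/dnsmasq.conf additions (master):
server=/cluster.local/127.0.0.1#8053         # cluster names -> SkyDNS on 8053
server=/30.172.in-addr.arpa/127.0.0.1#8053   # reverse lookups for the default
                                             # 172.30.0.0/16 service network
```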

Here I'm stumped. What else can I do to diagnose the cause further? It appears that OpenShift DNS isn't working as expected, but I'm not sure where to look next. 

Appreciatively, 
Jon

