
Re: openshift-ansible release-3.10 - Install fails with control plane pods



Sure, see attached. 

Before each attempt I pull the latest release-3.10 branch for openshift-ansible.
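
For reference, "pulling the latest branch" for me is simply the following, run in an existing clone of openshift-ansible:

    cd openshift-ansible
    git checkout release-3.10
    git pull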

@Scott Dodson: I am going to investigate again using your suggestions.

> Marc,
> 
> Is it possible to share  your ansible inventory file to review your
> openshift installation? I know there are some changes in 3.10 installation
> and might reflect in the inventory.
> 
> On Thu, Aug 30, 2018 at 3:37 PM Marc Schlegel <marc schlegel gmx de> wrote:
> 
> > Thanks for the link. It looks like the api pod is not coming up at all!
> >
> > Log from k8s_controllers_master-controllers-*
> >
> > [vagrant master ~]$ sudo docker logs
> > k8s_controllers_master-controllers-master.vnet.de_kube-system_a3c3ca56f69ed817bad799176cba5ce8_1
> > E0830 18:28:05.787358       1 reflector.go:205]
> > github.com/openshift/origin/vendor/k8s.io/kubernetes/cmd/kube-scheduler/app/server.go:594:
> > Failed to list *v1.Pod: Get
> > https://master.vnet.de:8443/api/v1/pods?fieldSelector=spec.schedulerName%3Ddefault-scheduler%2Cstatus.phase%21%3DFailed%2Cstatus.phase%21%3DSucceeded&limit=500&resourceVersion=0:
> > dial tcp 127.0.0.1:8443: getsockopt: connection refused
> > E0830 18:28:05.788589       1 reflector.go:205]
> > github.com/openshift/origin/vendor/k8s.io/client-go/informers/factory.go:87:
> > Failed to list *v1.ReplicationController: Get
> > https://master.vnet.de:8443/api/v1/replicationcontrollers?limit=500&resourceVersion=0:
> > dial tcp 127.0.0.1:8443: getsockopt: connection refused
> > E0830 18:28:05.804239       1 reflector.go:205]
> > github.com/openshift/origin/vendor/k8s.io/client-go/informers/factory.go:87:
> > Failed to list *v1.Node: Get
> > https://master.vnet.de:8443/api/v1/nodes?limit=500&resourceVersion=0:
> > dial tcp 127.0.0.1:8443: getsockopt: connection refused
> > E0830 18:28:05.806879       1 reflector.go:205]
> > github.com/openshift/origin/vendor/k8s.io/client-go/informers/factory.go:87:
> > Failed to list *v1beta1.StatefulSet: Get
> > https://master.vnet.de:8443/apis/apps/v1beta1/statefulsets?limit=500&resourceVersion=0:
> > dial tcp 127.0.0.1:8443: getsockopt: connection refused
> > E0830 18:28:05.808195       1 reflector.go:205]
> > github.com/openshift/origin/vendor/k8s.io/client-go/informers/factory.go:87:
> > Failed to list *v1beta1.PodDisruptionBudget: Get
> > https://master.vnet.de:8443/apis/policy/v1beta1/poddisruptionbudgets?limit=500&resourceVersion=0:
> > dial tcp 127.0.0.1:8443: getsockopt: connection refused
> > E0830 18:28:06.673507       1 reflector.go:205]
> > github.com/openshift/origin/vendor/k8s.io/client-go/informers/factory.go:87:
> > Failed to list *v1.PersistentVolume: Get
> > https://master.vnet.de:8443/api/v1/persistentvolumes?limit=500&resourceVersion=0:
> > dial tcp 127.0.0.1:8443: getsockopt: connection refused
> > E0830 18:28:06.770141       1 reflector.go:205]
> > github.com/openshift/origin/vendor/k8s.io/client-go/informers/factory.go:87:
> > Failed to list *v1beta1.ReplicaSet: Get
> > https://master.vnet.de:8443/apis/extensions/v1beta1/replicasets?limit=500&resourceVersion=0:
> > dial tcp 127.0.0.1:8443: getsockopt: connection refused
> > E0830 18:28:06.773878       1 reflector.go:205]
> > github.com/openshift/origin/vendor/k8s.io/client-go/informers/factory.go:87:
> > Failed to list *v1.Service: Get
> > https://master.vnet.de:8443/api/v1/services?limit=500&resourceVersion=0:
> > dial tcp 127.0.0.1:8443: getsockopt: connection refused
> > E0830 18:28:06.778204       1 reflector.go:205]
> > github.com/openshift/origin/vendor/k8s.io/client-go/informers/factory.go:87:
> > Failed to list *v1.StorageClass: Get
> > https://master.vnet.de:8443/apis/storage.k8s.io/v1/storageclasses?limit=500&resourceVersion=0:
> > dial tcp 127.0.0.1:8443: getsockopt: connection refused
> > E0830 18:28:06.784874       1 reflector.go:205]
> > github.com/openshift/origin/vendor/k8s.io/client-go/informers/factory.go:87:
> > Failed to list *v1.PersistentVolumeClaim: Get
> > https://master.vnet.de:8443/api/v1/persistentvolumeclaims?limit=500&resourceVersion=0:
> > dial tcp 127.0.0.1:8443: getsockopt: connection refused
> >
> > The log is full of those. Since it is all about the API, I tried to get the
> > logs from k8s_POD_master-api-master.vnet.de_kube-system_*, which is
> > completely empty :-/
> >
> > [vagrant master ~]$ sudo docker logs
> > k8s_POD_master-api-master.vnet.de_kube-system_86017803919d833e39cb3d694c249997_1
> > [vagrant master ~]$
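> >
> > As far as I understand, the k8s_POD_* containers are only the "pause"
> > containers, so next I will list everything to see whether a real api
> > container (k8s_api_master-api-*) was ever created, along these lines
> > (the name filter is my guess):
> >
> > [vagrant master ~]$ sudo docker ps -a --filter name=k8s_api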
> >
> > Is there any special prerequisite about the api-pod?
> >
> > regards
> > Marc
> >
> >
> > > Marc,
> > >
> > > could you please look over the issue [1], pull the master pod logs, and
> > > see if you bumped into the same issue mentioned by the other folks?
> > > Also make sure the openshift-ansible release is the latest one.
> > >
> > > Dani
> > >
> > > [1] https://github.com/openshift/openshift-ansible/issues/9575
> > >
> > > On Wed, Aug 29, 2018 at 7:36 PM Marc Schlegel <marc schlegel gmx de>
> > wrote:
> > >
> > > > Hello everyone
> > > >
> > > > I am having trouble getting a working Origin 3.10 installation using the
> > > > openshift-ansible installer. My install always fails because the control
> > > > plane pods are not available. I've checked out the release-3.10 branch of
> > > > openshift-ansible and configured the inventory accordingly.
> > > >
> > > >
> > > > TASK [openshift_control_plane : Start and enable self-hosting node] ******************
> > > > changed: [master]
> > > > TASK [openshift_control_plane : Get node logs] *******************************
> > > > skipping: [master]
> > > > TASK [openshift_control_plane : debug] ******************************************
> > > > skipping: [master]
> > > > TASK [openshift_control_plane : fail] *********************************************
> > > > skipping: [master]
> > > > TASK [openshift_control_plane : Wait for control plane pods to appear] ***************
> > > >
> > > > failed: [master] (item=etcd) => {"attempts": 60, "changed": false, "item": "etcd",
> > > > "msg": {"cmd": "/bin/oc get pod master-etcd-master.vnet.de -o json -n kube-system",
> > > > "results": [{}], "returncode": 1, "stderr": "The connection to the server
> > > > master.vnet.de:8443 was refused - did you specify the right host or port?\n",
> > > > "stdout": ""}}
> > > >
> > > > TASK [openshift_control_plane : Report control plane errors] *************************
> > > > fatal: [master]: FAILED! => {"changed": false, "msg": "Control plane pods didn't come up"}
> > > >
> > > >
> > > > I am using Vagrant to set up a local domain (vnet.de) which also
> > > > includes a dnsmasq node so I have full control over the DNS. The
> > > > following VMs are running, and DNS and SSH work as expected:
> > > >
> > > > Hostname          IP
> > > > domain.vnet.de    192.168.60.100
> > > > master.vnet.de    192.168.60.150  (DNS also resolves openshift.vnet.de,
> > > >                                    configured as
> > > >                                    openshift_master_cluster_public_hostname;
> > > >                                    this node also runs etcd)
> > > > infra.vnet.de     192.168.60.151  (the openshift_master_default_subdomain
> > > >                                    wildcard points to this node)
> > > > app1.vnet.de      192.168.60.152
> > > > app2.vnet.de      192.168.60.153
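> > > >
> > > > (For the record, I verified DNS and SSH roughly like this, assuming
> > > > domain.vnet.de is the dnsmasq node:)
> > > >
> > > > dig +short master.vnet.de @192.168.60.100
> > > > ssh vagrant@master.vnet.de true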
> > > >
> > > >
> > > > When connecting to the master node I can see that several Docker
> > > > containers are up and running:
> > > >
> > > > [vagrant master ~]$ sudo docker ps
> > > > CONTAINER ID   IMAGE                                    COMMAND                  CREATED          STATUS          NAMES
> > > > 9a0844123909   ff5dd2137a4f                             "/bin/sh -c '#!/bi..."   19 minutes ago   Up 19 minutes   k8s_etcd_master-etcd-master.vnet.de_kube-system_a2c858fccd481c334a9af7413728e203_0
> > > > 41d803023b72   f216d84cdf54                             "/bin/bash -c '#!/..."   19 minutes ago   Up 19 minutes   k8s_controllers_master-controllers-master.vnet.de_kube-system_a3c3ca56f69ed817bad799176cba5ce8_0
> > > > 044c9d12588c   docker.io/openshift/origin-pod:v3.10.0   "/usr/bin/pod"           19 minutes ago   Up 19 minutes   k8s_POD_master-api-master.vnet.de_kube-system_86017803919d833e39cb3d694c249997_0
> > > > 10a197e394b3   docker.io/openshift/origin-pod:v3.10.0   "/usr/bin/pod"           19 minutes ago   Up 19 minutes   k8s_POD_master-controllers-master.vnet.de_kube-system_a3c3ca56f69ed817bad799176cba5ce8_0
> > > > 20f4f86bdd07   docker.io/openshift/origin-pod:v3.10.0   "/usr/bin/pod"           19 minutes ago   Up 19 minutes   k8s_POD_master-etcd-master.vnet.de_kube-system_a2c858fccd481c334a9af7413728e203_0
> > > >
> > > >
> > > > However, port 8443 is not open on the master node, so it is no wonder
> > > > the Ansible installer complains.
> > > >
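> > > > (This is how I checked, with the /healthz path being my own guess at a
> > > > probe endpoint; both come back empty or refused:)
> > > >
> > > > [vagrant master ~]$ sudo ss -tlnp | grep 8443
> > > > [vagrant master ~]$ curl -k https://master.vnet.de:8443/healthz
> > > >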
> > > > The machines are running plain CentOS 7.5, and I ran
> > > > openshift-ansible/playbooks/prerequisites.yml first and then
> > > > openshift-ansible/playbooks/deploy_cluster.yml.
> > > > I've double-checked the installation documentation and my Vagrant
> > > > config; all looks correct.
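> > > >
> > > > (Concretely, the two runs look like this; the inventory path is just
> > > > where I keep my file:)
> > > >
> > > > ansible-playbook -i ~/inventory openshift-ansible/playbooks/prerequisites.yml
> > > > ansible-playbook -i ~/inventory openshift-ansible/playbooks/deploy_cluster.yml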
> > > >
> > > > Any ideas/advice?
> > > > regards
> > > > Marc
> > > >
> > > >

# Create an OSEv3 group that contains the masters, nodes, and etcd groups
[OSEv3:children]
masters
nodes
etcd

# Set variables common for all OSEv3 hosts
[OSEv3:vars]
# SSH user; this user should allow SSH-based auth without requiring a password
ansible_ssh_user=vagrant

# If ansible_ssh_user is not root, ansible_become must be set to true
ansible_become=true

openshift_deployment_type=origin
openshift_release=v3.10
openshift_master_cluster_public_hostname=openshift.vnet.de
openshift_master_default_subdomain=apps.vnet.de

# Networking Defaults
osm_cluster_network_cidr=10.128.0.0/14
openshift_portal_net=172.30.0.0/16
osm_host_subnet_length=9 

# uncomment the following to enable htpasswd authentication; defaults to AllowAllPasswordIdentityProvider
#openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider'}]

openshift_disable_check=memory_availability,disk_availability,docker_storage,docker_storage_driver


# host group for masters
[masters]
master

# host group for etcd
[etcd]
master

# host group for nodes, includes region info
[nodes]
master openshift_node_group_name='node-config-master'
app1 openshift_node_group_name='node-config-compute'
app2 openshift_node_group_name='node-config-compute'
infra openshift_node_group_name='node-config-infra'
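
# To sanity-check how Ansible resolves the groups above, the standard
# ansible-inventory command can print the parsed tree, e.g.:
#   ansible-inventory -i <this file> --graph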
