[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: after native HA install atomic-openshift-master service is not enabled

[root ose-ha-master-01 playbook-openshift]# oc get events
4h 28s 408 docker-registry-1 ReplicationController FailedCreate {deployer } Error creating deployer pod for default/docker-registry-1: pods "docker-registry-1-deploy" is forbidden: pod node label selector conflicts with its project node label selector

I think if I modify the default project to select nodes labeled `region=infra`, it will fix it.

[root ose-ha-master-01 playbook-openshift]# oc get namespace default -o json | jq ".metadata.annotations"
"openshift.io/sa.initialized-roles": "true",
"openshift.io/sa.scc.mcs": "s0:c1,c0",
"openshift.io/sa.scc.supplemental-groups": "1000000000/10000",
"openshift.io/sa.scc.uid-range": "1000000000/10000"
[root ose-ha-master-01 playbook-openshift]# oc edit namespace default
[root ose-ha-master-01 playbook-openshift]# oc get namespace default -o json | jq ".metadata.annotations"
"openshift.io/node-selector": "region=infra",
"openshift.io/sa.initialized-roles": "true",
"openshift.io/sa.scc.mcs": "s0:c1,c0",
"openshift.io/sa.scc.supplemental-groups": "1000000000/10000",
"openshift.io/sa.scc.uid-range": "1000000000/10000"

That did it. Thank you!
[root ose-ha-master-01 playbook-openshift]# oc get pods
docker-registry-1-17w76 0/1 Pending 0 4s
docker-registry-1-deploy 1/1 Running 0 36s
router-1-aait2 0/1 Pending 0 4s
router-1-deploy 1/1 Running 0 36s

Is there a more direct command to change the node selector for a project?
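(For reference, a more direct route than `oc edit` would be to set the annotation in one shot with `oc annotate`; this is a sketch assuming the same annotation key shown in the output above:)

```shell
# Set the project node selector annotation directly instead of editing the
# namespace by hand; --overwrite replaces any existing value.
oc annotate namespace default \
  openshift.io/node-selector=region=infra --overwrite
```

If I remember right, new projects can also get a selector at creation time via `oadm new-project --node-selector=...`.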

----- On Feb 3, 2016, at 6:42 AM, Andrew Butcher <abutcher redhat com> wrote:
Everything looks good there except the part where there aren't any pods. Are there any events related to the docker-registry in `oc get events`? If no pods are being created, I would expect the events to have some information.

On Tue, Feb 2, 2016 at 10:59 PM, Dale Bewley <dale bewley net> wrote:
I didn't realize there was a difference in services for the native HA config. Thanks for that information.

I did a fresh install and below is how I deployed the registry.

[root ose-ha-master-01 playbook-openshift]# oc status
In project default on server https://ose-master.ha.os.example.com:8443

svc/kubernetes - ports 443, 53, 53

View details with 'oc describe <resource>/<name>' or list everything with 'oc get all'.
[root ose-ha-master-01 playbook-openshift]# oc get SecurityContextConstraints privileged -o json | jq '.users'
[root ose-ha-master-01 playbook-openshift]# oadm registry \
> --service-account=registry \
> --config=/etc/origin/master/admin.kubeconfig \
> --credentials=/etc/origin/master/openshift-registry.kubeconfig \
> --images='registry.access.redhat.com/openshift3/ose-${component}:${version}' \
> --selector="region=infra"
DeploymentConfig "docker-registry" created
Service "docker-registry" created
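(One subtle point in the command above: the single quotes keep the shell from expanding `${component}` and `${version}`, so the literal placeholders reach `oadm registry`, which substitutes them itself. This can be checked offline:)

```shell
# Single quotes pass the placeholders through untouched; double quotes would
# have let the shell try to expand ${component} and ${version} (to empty
# strings, since neither is set).
img='registry.access.redhat.com/openshift3/ose-${component}:${version}'
echo "$img"
# → registry.access.redhat.com/openshift3/ose-${component}:${version}
```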

Other than creating the dc and the svc, nothing seems to happen.

[root ose-ha-master-01 playbook-openshift]# oc get deploymentconfig --all-namespaces
default docker-registry ConfigChange 1
[root ose-ha-master-01 playbook-openshift]# oc get pods --all-namespaces
[root ose-ha-master-01 playbook-openshift]# oc get svc
docker-registry <none> 5000/TCP docker-registry=default 1m
kubernetes <none> 443/TCP,53/UDP,53/TCP <none> 16m

There are no pods, and `ansible nodes -m command -a 'docker ps'` tells me there are no containers present anywhere.

The nodes are schedulable. What else can I look for?

[root ose-ha-master-01 playbook-openshift]# oc get nodes
ose-ha-master-01.example.com kubernetes.io/hostname=ose-ha-master-01.example.com,region=infra,zone=rhev Ready,SchedulingDisabled 28m
ose-ha-master-02.example.com kubernetes.io/hostname=ose-ha-master-02.example.com,region=infra,zone=rhev Ready,SchedulingDisabled 28m
ose-ha-master-03.example.com kubernetes.io/hostname=ose-ha-master-03.example.com,region=infra,zone=rhev Ready,SchedulingDisabled 28m
ose-ha-node-01.example.com kubernetes.io/hostname=ose-ha-node-01.example.com,region=infra,zone=rhev Ready 28m
ose-ha-node-02.example.com kubernetes.io/hostname=ose-ha-node-02.example.com,region=infra,zone=rhev Ready 28m
ose-ha-node-03.example.com kubernetes.io/hostname=ose-ha-node-03.example.com,region=primary,zone=rhev Ready 28m
ose-ha-node-04.example.com kubernetes.io/hostname=ose-ha-node-04.example.com,region=primary,zone=rhev Ready 28m
ose-ha-node-05.example.com kubernetes.io/hostname=ose-ha-node-05.example.com,region=primary,zone=rhev Ready 28m
ose-ha-node-06.example.com kubernetes.io/hostname=ose-ha-node-06.example.com,region=primary,zone=rhev Ready 28m

----- On Jan 31, 2016, at 1:50 PM, Andrew Butcher <abutcher redhat com> wrote:
Hey Dale,
Two services are started when using the native ha method; atomic-openshift-master-api and atomic-openshift-master-controllers. How did you deploy the registry? Are there any failed pods?
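(A quick way to confirm this on a master is to ask systemd about the units directly; a sketch, using the unit names as given in this thread:)

```shell
# With the native HA method, the split api and controllers units should be
# enabled, while the combined atomic-openshift-master unit stays disabled.
systemctl is-enabled atomic-openshift-master-api atomic-openshift-master-controllers
systemctl is-enabled atomic-openshift-master || true
```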

On Sun, Jan 31, 2016 at 4:14 PM, Dale Bewley <dale bewley net> wrote:

I'm provisioning a Native HA OpenShift Enterprise 3.1 cluster using the byo playbook. There are no failures, but at the end the master service is disabled.
(Full hosts file here: https://gist.github.com/dlbewley/d7db07edb7fa6da72259 )

ose-ha-master-[01:03].example.com openshift_node_labels="{'region': 'infra', 'zone': 'rhev'}" openshift_schedulable=False
ose-ha-node-[01:02].example.com    openshift_node_labels="{'region': 'infra', 'zone': 'rhev'}"
ose-ha-node-[03:06].example.com    openshift_node_labels="{'region': 'primary', 'zone': 'rhev'}"

The playbook runs with no failures, `oc get nodes` output looks fine, and I can even log in to the web console by way of the load balancer node with my LDAP credentials. Quite an impressive playbook. :)

PLAY RECAP ********************************************************************
localhost : ok=18 changed=0 unreachable=0 failed=0
ose-ha-etcd-01.example.com : ok=191 changed=40 unreachable=0 failed=0
ose-ha-etcd-02.example.com : ok=89 changed=20 unreachable=0 failed=0
ose-ha-etcd-03.example.com : ok=89 changed=20 unreachable=0 failed=0
ose-ha-lb-01.example.com : ok=29 changed=7 unreachable=0 failed=0
ose-ha-master-01.example.com : ok=334 changed=73 unreachable=0 failed=0
ose-ha-master-02.example.com : ok=203 changed=47 unreachable=0 failed=0
ose-ha-master-03.example.com : ok=203 changed=47 unreachable=0 failed=0
ose-ha-node-01.example.com : ok=103 changed=24 unreachable=0 failed=0
ose-ha-node-02.example.com : ok=103 changed=24 unreachable=0 failed=0
ose-ha-node-03.example.com : ok=103 changed=24 unreachable=0 failed=0
ose-ha-node-04.example.com : ok=103 changed=24 unreachable=0 failed=0
ose-ha-node-05.example.com : ok=103 changed=24 unreachable=0 failed=0
ose-ha-node-06.example.com : ok=103 changed=24 unreachable=0 failed=0

However, when I attempted to deploy the registry, nothing happened. I then noticed that the `atomic-openshift-master` service was never enabled by the playbook.

It looks like https://github.com/openshift/openshift-ansible/blob/master/roles/openshift_master/tasks/main.yml#L274 is skipped because of this conditional: `when: not openshift_master_ha | bool`.

[root ose-ha-master-01 ]# grep -A1 'Start and enable master' ansible-byo-2016-01-28.log
TASK: [openshift_master | Start and enable master] ****************************
skipping: [ose-ha-master-01.example.com]
TASK: [openshift_master | Start and enable master api] ************************
changed: [ose-ha-master-01.example.com]
TASK: [openshift_master | Start and enable master controller] *****************
changed: [ose-ha-master-01.example.com]
TASK: [openshift_master | Start and enable master] ****************************
skipping: [ose-ha-master-02.example.com]
TASK: [openshift_master | Start and enable master api] ************************
changed: [ose-ha-master-02.example.com]
TASK: [openshift_master | Start and enable master controller] *****************
changed: [ose-ha-master-02.example.com]
TASK: [openshift_master | Start and enable master] ****************************
skipping: [ose-ha-master-03.example.com]
TASK: [openshift_master | Start and enable master api] ************************
changed: [ose-ha-master-03.example.com]
TASK: [openshift_master | Start and enable master controller] *****************
changed: [ose-ha-master-03.example.com]

Any idea why this is happening? As near as I can tell, openshift_master_ha should be true: it's set from `"{{ groups.oo_masters_to_config | length > 1 }}"` and I see my 3 masters in that group.
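(A rough way to sanity-check that group size from the CLI; hypothetical, since `oo_masters_to_config` is built at playbook runtime — this counts the `masters` group from the byo inventory in the gist instead, and assumes the inventory file is named `hosts`:)

```shell
# Count the hosts ansible resolves for the inventory masters group;
# openshift_master_ha is derived from the runtime group having > 1 member.
ansible -i hosts masters --list-hosts | grep -c 'example.com'
```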

users mailing list
users lists openshift redhat com
