
Re: Openshift failing to start on second master potential HAProxy issue



Hi guys,

Thanks for the replies. I was pulled from the project for a few days and am only getting back to it now.

Installed and up and running with two masters, one node, and one LB.
No doubt there will be further fun and games but for now we have something to test with. 

Thanks once more, genuinely appreciate the help. 


On 24 May 2016, at 16:55, Scott Dodson <sdodson redhat com> wrote:

A single load balancer is a risk in general, but it's even worse when that load balancer is one of your masters. We mostly added that functionality as a proof of concept, with the expectation that people would use something else for truly HA environments, such as an F5, an ELB, or multiple HAProxy instances with keepalived ipfailover.

On Tue, May 24, 2016 at 11:15 AM, Julio Saura <jsaura hiberus com> wrote:
Hello,

Just in case it helps: I have just deployed a new installation like this, and I can confirm that using one of the masters as the LB does not work. You have to deploy the LB on a machine other than a master.

I used a small VM in my environment as a dedicated load balancer for the masters, and it worked without any problem.

Hope it helps!

Best regards

On 24 May 2016, at 16:52, Scott Dodson <sdodson redhat com> wrote:

Ok, so the master that's failing is also your load balancer defined in your [lb] group, correct? I'd suggest not using one of your masters as the load balancer if at all possible. There may be enough knobs in the inventory to allow this to work but I don't think this is a setup we've tested.
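
One way to catch this before running the playbook is to check that no host in the [lb] group also appears in the masters group. The following is a minimal sketch in Python, not part of openshift-ansible. It assumes an INI-style inventory whose groups are named [masters] and [lb]; the function names and the sample inventory are hypothetical, and the parser deliberately ignores inventory features such as host ranges and ":children" groups.

```python
def parse_groups(inventory_text):
    """Map each plain [group] header to the set of hostnames listed under it."""
    groups = {}
    current = None
    for raw in inventory_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            groups.setdefault(current, set())
        elif current is not None and ":" not in current:
            # Skip [group:vars] / [group:children] sections; in a plain group
            # the first token is the hostname, the rest are host variables.
            groups[current].add(line.split()[0])
    return groups


def lb_master_overlap(inventory_text):
    """Return the hosts that are listed in both [lb] and [masters]."""
    groups = parse_groups(inventory_text)
    return sorted(groups.get("lb", set()) & groups.get("masters", set()))


# Hypothetical inventory for illustration only.
SAMPLE = """
[OSEv3:children]
masters
lb

[masters]
master1.openshift
master2.openshift

[lb]
master2.openshift
"""

print(lb_master_overlap(SAMPLE))
```

An empty result means the load balancer is on its own machine, which is the layout suggested above.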

On Tue, May 24, 2016 at 10:16 AM, Ronan O Keeffe <ronanok donedeal ie> wrote:
Hi Scott,

I have deleted the installs and am starting fresh. What I ran earlier was
lsof -i :8443, which I assume will be identical to lsof -i4 :8443 as IPv6 is disabled. 

I am re-installing now from a template and can test that exact command in a few mins. 

lsof -i :8443

COMMAND   PID    USER   FD   TYPE  DEVICE SIZE/OFF NODE NAME
haproxy 16621 haproxy    6u  IPv4   62049      0t0  TCP *:pcsync-https (LISTEN)
haproxy 16621 haproxy   26u  IPv4 8617543      0t0  TCP master2.openshift:pcsync-https->master2.openshift:36691 (CLOSE_WAIT)
haproxy 16621 haproxy   27u  IPv4 8617545      0t0  TCP master2.openshift:57968->master2.openshift:pcsync-https (FIN_WAIT2)
haproxy 16621 haproxy   28u  IPv4 8617546      0t0  TCP master2.openshift:57969->master2.openshift:pcsync-https (FIN_WAIT2)
haproxy 16621 haproxy   29u  IPv4 8617547      0t0  TCP master2.openshift:57970->master2.openshift:pcsync-https (FIN_WAIT2)
haproxy 16621 haproxy   30u  IPv4 8617548      0t0  TCP master2.openshift:57971->master2.openshift:pcsync-https (FIN_WAIT2)
haproxy 16621 haproxy   31u  IPv4 8617549      0t0  TCP master2.openshift:57972->master2.openshift:pcsync-https (FIN_WAIT2)
haproxy 16621 haproxy *792u  IPv4 8657802      0t0  TCP master2.openshift:49633->master2.openshift:pcsync-https (FIN_WAIT2)
haproxy 16621 haproxy *793u  IPv4 8657803      0t0  TCP master2.openshift:49634->master2.openshift:pcsync-https (FIN_WAIT2)
haproxy 16621 haproxy *794u  IPv4 8657804      0t0  TCP master2.openshift:49635->master2.openshift:pcsync-https (FIN_WAIT2)
haproxy 16621 haproxy *795u  IPv4 8657805      0t0  TCP master2.openshift:49636->master2.openshift:pcsync-https (FIN_WAIT2)
haproxy 16621 haproxy *796u  IPv4 8657806      0t0  TCP master2.openshift:49637->master2.openshift:pcsync-https (FIN_WAIT2)
haproxy 16621 haproxy *797u  IPv4 8657807      0t0  TCP master2.openshift:pcsync-https->master2.openshift:49615 (CLOSE_WAIT)
haproxy 16621 haproxy *798u  IPv4 8657808      0t0  TCP master2.openshift:pcsync-https->master2.openshift:49616 (CLOSE_WAIT)
haproxy 16621 haproxy *799u  IPv4 8657809      0t0  TCP master2.openshift:pcsync-https->master2.openshift:49617 (CLOSE_WAIT)
haproxy 16621 haproxy *800u  IPv4 8657810      0t0  TCP master2.openshift:pcsync-https->master2.openshift:49618 (CLOSE_WAIT)
haproxy 16621 haproxy *801u  IPv4 8657811      0t0  TCP master2.openshift:pcsync-https->master2.openshift:49619 (CLOSE_WAIT)
haproxy 16621 haproxy *802u  IPv4 8657812      0t0  TCP master2.openshift:pcsync-https->master2.openshift:49620 (CLOSE_WAIT)
.
.
.

lsof -i :8443 | wc -l
39983

The remaining entries, almost 40,000 in total, all look the same. Nothing else is on port 8443 that I can see.
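
With this many rows, a per-state tally is easier to read than the raw listing. Below is a minimal Python sketch, assuming the text has the same column layout as the lsof output above (command name first, TCP state in parentheses at the end of the line). The function name and the sample rows are illustrative only.

```python
import re
from collections import Counter

# The TCP state is the parenthesised word at the end of each row.
STATE_RE = re.compile(r"\((\w+)\)\s*$")


def tally_states(lsof_output):
    """Count rows of `lsof -i` output per (command, TCP state)."""
    counts = Counter()
    for line in lsof_output.splitlines():
        fields = line.split()
        if not fields or fields[0] == "COMMAND":
            continue  # skip blank lines and the header row
        match = STATE_RE.search(line)
        if match:
            counts[(fields[0], match.group(1))] += 1
    return counts


# A few rows copied from the listing above.
SAMPLE = """\
COMMAND   PID    USER   FD   TYPE  DEVICE SIZE/OFF NODE NAME
haproxy 16621 haproxy    6u  IPv4   62049      0t0  TCP *:pcsync-https (LISTEN)
haproxy 16621 haproxy   26u  IPv4 8617543      0t0  TCP master2.openshift:pcsync-https->master2.openshift:36691 (CLOSE_WAIT)
haproxy 16621 haproxy   27u  IPv4 8617545      0t0  TCP master2.openshift:57968->master2.openshift:pcsync-https (FIN_WAIT2)
haproxy 16621 haproxy *792u  IPv4 8657802      0t0  TCP master2.openshift:49633->master2.openshift:pcsync-https (FIN_WAIT2)
"""

for (command, state), count in tally_states(SAMPLE).most_common():
    print(command, state, count)
```

Run against the full capture, a tally like this would show at a glance whether anything other than haproxy holds port 8443 and how the connections split between CLOSE_WAIT and FIN_WAIT2.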

----
Ronan O Keeffe
System Administrator

DoneDeal


On 24 May 2016, at 15:02, Scott Dodson <sdodson redhat com> wrote:

Can you see what's using port 8443 and remove that conflict? `lsof -i4:8443` should show you what's listening on that port.
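
If lsof is not at hand, a bind attempt gives a quick yes/no answer to whether the port is already taken. This is a rough Python sketch under that assumption; `port_is_free` is a hypothetical helper, and unlike lsof it cannot say which process holds the port.

```python
import errno
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can bind host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return False  # the "address already in use" case
            raise  # e.g. permission errors are a different problem
    return True


if not port_is_free(8443):
    print("port 8443 is already in use")
```

A False result corresponds to the same condition the master API reports as "address already in use".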

On Tue, May 24, 2016 at 7:47 AM, Ronan O Keeffe <ronanok donedeal ie> wrote:
Hi,

We are currently attempting to install OpenShift Origin via the advanced
install method using Ansible. Initially this will be for testing, but we hope
to move it to production eventually.
We are installing two masters and one node as a test:
master1.openshift
master2.openshift
node1.openshift

Specifically we are following this method:
https://docs.openshift.org/latest/install_config/install/advanced_install.html
but using the 7.6 EPEL repo as the 7.5 one is not available.

Running:
ansible-playbook ~/openshift-ansible-master/playbooks/byo/config.yml
seems to install OpenShift on master1.openshift successfully but fails on
master2.openshift.

The error message is as follows:
failed: [master2.openshift] => {"failed": true}
msg: Job for origin-master-api.service failed because the control process exited with error code. See "systemctl status origin-master-api.service" and "journalctl -xe" for details.


Output of systemctl status origin-master-api.service:

systemctl status -l origin-master-api.service
● origin-master-api.service - Atomic OpenShift Master API
  Loaded: loaded (/usr/lib/systemd/system/origin-master-api.service; enabled; vendor preset: disabled)
  Active: failed (Result: exit-code) since Tue 2016-05-24 12:13:32 IST; 27min ago
    Docs: https://github.com/openshift/origin
 Process: 24577 ExecStart=/usr/bin/openshift start master api --config=/etc/origin/master/master-config.yaml $OPTIONS (code=exited, status=255)
Main PID: 24577 (code=exited, status=255)
May 24 12:13:32 master1.openshift atomic-openshift-master-api[24577]: I0524 12:13:32.762625   24577 master.go:262] Started Origin API at 0.0.0.0:8443/oapi/v1
May 24 12:13:32 master1.openshift atomic-openshift-master-api[24577]: I0524 12:13:32.762630   24577 master.go:262] Started OAuth2 API at 0.0.0.0:8443/oauth
May 24 12:13:32 master1.openshift atomic-openshift-master-api[24577]: I0524 12:13:32.762634   24577 master.go:262] Started Web Console 0.0.0.0:8443/console/
May 24 12:13:32 master1.openshift atomic-openshift-master-api[24577]: I0524 12:13:32.762639   24577 master.go:262] Started Swagger Schema API at 0.0.0.0:8443/swaggerapi/
May 24 12:13:32 master1.openshift atomic-openshift-master-api[24577]: I0524 12:13:32.893090   24577 net.go:105] Got error tls.timeoutError{}, trying again: "0.0.0.0:8443"
May 24 12:13:32 master1.openshift atomic-openshift-master-api[24577]: F0524 12:13:32.913543   24577 master.go:277] listen tcp4 0.0.0.0:8443: listen: address already in use


cat /etc/redhat-release - Red Hat Enterprise Linux Server release 7.2 (Maipo)
uname -a - Linux master1.openshift 3.10.0-327.18.2.el7.x86_64
docker version - 1.9.1 docker-common-1.9.1-40.el7.centos.x86_64
ansible --version - ansible 1.9.6

Hope someone can help get us past this hurdle.

Cheers,
Ronan.

_______________________________________________
users mailing list
users lists openshift redhat com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


