
Re: OKD installation on CentOS 7.6



Hi Wilfried,
just as some input: when you can access your node while origin-node isn't running/disabled, what happens when you start docker and origin-node? The access should go down, I guess. That way you should be able to track down the process causing the issue. For example, set up an external port check on 22, and log ps -ef / netstat -tupln / docker ps / journalctl / iptables -L frequently to pin down the time/process when the node becomes unavailable. 
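A minimal watcher for that could look like the sketch below (the log directory and the COUNT/INTERVAL defaults are arbitrary choices of mine; raise them for a real session, and run it as root so iptables -L and docker ps return data):

```shell
#!/bin/sh
# Take periodic state snapshots while reproducing the outage.
# LOGDIR, COUNT and INTERVAL are arbitrary defaults; for a real
# session use something like COUNT=120 INTERVAL=5.
LOGDIR=${LOGDIR:-/var/tmp/okd-debug}
COUNT=${COUNT:-3}
INTERVAL=${INTERVAL:-1}
mkdir -p "$LOGDIR"

snapshot() {
    {
        echo "=== $(date +%Y-%m-%dT%H:%M:%S) ==="
        ps -ef
        netstat -tupln 2>/dev/null
        docker ps 2>/dev/null
        iptables -L -n 2>/dev/null
    } >> "$LOGDIR/state.log"
}

i=0
while [ "$i" -lt "$COUNT" ]; do
    snapshot
    sleep "$INTERVAL"
    i=$((i + 1))
done
```

Then diff consecutive "===" sections in state.log against the moment the external port check on 22 starts failing; whatever rule or process appears in that window is your suspect.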
OKD does not limit network access based on subnet or similar. So this behaviour is an unwanted side effect caused by the environment (network, sysconfig, external firewall etc.). What is different from a vanilla CentOS installation? Are there any routines that run while the node is starting? Maybe it's an issue of services going up in the wrong order. I wouldn't study the ansible installer itself, as it seems to be working correctly. Try to find the exact moment/process when the access gets denied. 
And are you using the openshift-installer from github or from the CentOS repository? The RPMs from the CentOS repo are not that well updated, so maybe try the openshift-installer from github (branch release-3.11) and use the playbooks from there. Sometimes there are relevant bug fixes included.

Regards,
Nikolas

On Wed, Apr 17, 2019 at 10:44, ANUZET Wilfried <wilfried anuzet uclouvain be> wrote:

Hi Nikolas,

 

I just asked the network team here whether something blocks OKD at the network level, and it seems not.

 

And since you can access the servers only from certain hosts after the installation, it really looks like an external component breaks something.

Because you have a short window to access the server from your client, I'm pretty sure it's not a local firewalld issue, as network and firewall go up together. So a different service is causing the issue. I would try to identify these processes until it's clear which component causes that behaviour. 

To me as well, the issue seems related to an OpenShift component, as the server is inaccessible when OKD starts. I'll try to identify which one …

 

I asked this before, but about the wrong nodes. Is dnsmasq running on the LB node?

I just checked, and dnsmasq is not running on the LB.

 

You could maybe verify that by stopping the origin-node and docker services and getting rid of all OpenShift-specific processes (also dnsmasq), so only basic services are running (or by disabling them and rebooting). 

I stopped the origin-node.service and docker.service units and nothing changed.

After disabling origin-node.service and docker.service and rebooting, the node server is accessible from outside its subnet.

The issue seems clearly related to OKD ;)

 

On the LB, using a CLI browser (lynx), I can access the master URL (https://okdmst01t.stluc.ucl.ac.be:8443, which redirects correctly to https://okdmst01t:8443/console/; obviously the login page mentions that JavaScript must be activated).

I just saw that I forgot to put the OKD master/node IPs in the /etc/hosts of the LB.

I just added them, but it changed nothing.

 

I'm also out of ideas, but I will check every OKD pod and read the openshift installer more closely (though, well written as it is, it's also insanely nested, with a lot of import_playbook, import_tasks …)

 

:'(

 


Wilfried Anuzet
Service Infrastructure
Département Information & Systèmes
Tél: +32 2 764 2488


Avenue Hippocrate, 10 - 1200 Bruxelles - Belgique - Tel: + 32 2 764 11 11 - www.saintluc.be


Soutenez les Cliniques, soutenez la Fondation Saint-Luc
Support our Hospital, support Fondation Saint-Luc

 

 

From: Nikolas Philips <nikolas philips gmail com>
Sent: Wednesday, April 17, 2019 09:48
To: ANUZET Wilfried <wilfried anuzet uclouvain be>
Cc: OpenShift Users List <users lists openshift redhat com>
Subject: Re: OKD installation on CentOS 7.6

 

Hi Wilfried,

sadly, I'm a bit out of ideas about what could cause this issue. All the settings and configs you showed me looked good. 

And since you can access the servers only from certain hosts after the installation, it really looks like an external component breaks something.

My guess would be that maybe an external/internal firewall blocks external traffic to your nodes when certain ports are open (or similar). Maybe because of DNS, to prevent spoofing? (I asked this before, but about the wrong nodes. Is dnsmasq running on the LB node?)

You could maybe verify that by stopping the origin-node and docker services and getting rid of all OpenShift-specific processes (also dnsmasq), so only basic services are running (or by disabling them and rebooting). 
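For example, a teardown sketch like this (dry-run by default, so it only prints the commands; set RUN="" and run it as root to actually stop things; adjust the unit names to your setup):

```shell
#!/bin/sh
# Dry-run sketch of a teardown to get back to basic services only.
# RUN=echo just prints the commands; set RUN="" (as root) to execute.
RUN=${RUN:-echo}

teardown() {
    $RUN systemctl stop origin-node docker
    $RUN systemctl stop dnsmasq
    # stop any containers that are still running
    ids=$(docker container ls -q 2>/dev/null)
    if [ -n "$ids" ]; then
        $RUN docker container stop $ids
    fi
    return 0
}

teardown
```

If the node is still unreachable with all of that stopped, the cause is outside OKD; if it becomes reachable, restart the services one by one to find the culprit.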

Because you have a short window to access the server from your client, I'm pretty sure it's not a local firewalld issue, as network and firewall go up together. So a different service is causing the issue. I would try to identify these processes until it's clear which component causes that behaviour. 

 

But you can access the cluster through the LB (e.g. 8443 or 443), right? 

 

Regards,

Nikolas

 

On Wed, Apr 17, 2019 at 09:12, ANUZET Wilfried <wilfried anuzet uclouvain be> wrote:

Hello Nikolas,

 

Here's the output of the firewall-cmd command on the LB and master:

LB:

public (active)
  target: default
  icmp-block-inversion: no
  interfaces: ens192
  sources:
  services: ssh dhcpv6-client
  ports: 10250/tcp 10256/tcp 80/tcp 443/tcp 4789/udp 9000-10000/tcp 1936/tcp
  protocols:
  masquerade: no
  forward-ports:
  source-ports:
  icmp-blocks:
  rich rules:

MASTER:

public (active)
  target: default
  icmp-block-inversion: no
  interfaces: ens192
  sources:
  services: ssh dhcpv6-client
  ports: 10250/tcp 10256/tcp 80/tcp 443/tcp 4789/udp 9000-10000/tcp 1936/tcp 2379/tcp 2380/tcp 9000/tcp 8443/tcp 8444/tcp 8053/tcp 8053/udp
  protocols:
  masquerade: no
  forward-ports:
  source-ports:
  icmp-blocks:
  rich rules:

 

 


 

 

From: Nikolas Philips <nikolas philips gmail com>
Sent: Tuesday, April 16, 2019 19:17
To: ANUZET Wilfried <wilfried anuzet uclouvain be>
Cc: OpenShift Users List <users lists openshift redhat com>
Subject: Re: OKD installation on CentOS 7.6

 

Sorry Wilfried,

I missed the line with "os_firewall_use_firewalld" in your inventory file. 

What's the output of "firewall-cmd --list-all" on the LB and master?

 

 

On Tue, Apr 16, 2019 at 17:52, ANUZET Wilfried <wilfried anuzet uclouvain be> wrote:

Thanks Nikolas,

 

Here are some answers to better identify the source of the problem:

 

·         I can connect via ssh before running the ansible installer; I run another ansible playbook beforehand to be compliant with our enterprise policy.

In this playbook I just ensure that firewalld is up and running, but I keep the default values (just the ssh service open and icmp responses not blocked).

If I uninstall OpenShift and reboot the server, I can connect to it again.

 

·         All of these servers have only one NIC

 

·         I tried to disable firewalld and flush all iptables rules but still can't reach the server

/!\ I just saw that I can reach the server from another server in the same subnet, without deactivating and flushing the firewall /!\

 

·         Connected on one node:

disabled origin-node via systemd => still no connection

added the ssh port and icmp in iptables => still no connection or icmp response

it seems that kubernetes recreates some rules (via the pods / docker containers which are still running? Do I have to stop them all via docker container stop $(docker container ls -q)?)

 

·         Here's the information about one node

ip a sh

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000

    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

    inet 127.0.0.1/8 scope host lo

       valid_lft forever preferred_lft forever

    inet6 ::1/128 scope host

       valid_lft forever preferred_lft forever

2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000

    link/ether 00:50:56:92:79:03 brd ff:ff:ff:ff:ff:ff

    inet 10.244.246.68/24 brd 10.244.246.255 scope global noprefixroute ens192

       valid_lft forever preferred_lft forever

    inet6 fe80::250:56ff:fe92:7903/64 scope link

       valid_lft forever preferred_lft forever

3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default

    link/ether 02:42:cb:3e:8f:86 brd ff:ff:ff:ff:ff:ff

    inet 172.17.0.1/16 scope global docker0

       valid_lft forever preferred_lft forever

4: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000

    link/ether 82:94:30:55:98:12 brd ff:ff:ff:ff:ff:ff

5: br0: <BROADCAST,MULTICAST> mtu 1450 qdisc noop state DOWN group default qlen 1000

    link/ether ee:73:3d:25:b7:48 brd ff:ff:ff:ff:ff:ff

6: vxlan_sys_4789: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65535 qdisc noqueue master ovs-system state UNKNOWN group default qlen 1000

    link/ether 5a:63:33:de:9f:70 brd ff:ff:ff:ff:ff:ff

    inet6 fe80::5863:33ff:fede:9f70/64 scope link

       valid_lft forever preferred_lft forever

7: tun0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default qlen 1000

    link/ether b6:35:b5:77:d4:60 brd ff:ff:ff:ff:ff:ff

    inet 10.131.0.1/23 brd 10.131.1.255 scope global tun0

       valid_lft forever preferred_lft forever

    inet6 fe80::b435:b5ff:fe77:d460/64 scope link

       valid_lft forever preferred_lft forever

 

netstat

Active Internet connections (only servers)

Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name   

tcp        0      0 127.0.0.1:9101          0.0.0.0:*               LISTEN      16787/node_exporter

tcp        0      0 0.0.0.0:111             0.0.0.0:*               LISTEN      1/systemd          

tcp        0      0 127.0.0.1:53            0.0.0.0:*               LISTEN      13155/openshift    

tcp        0      0 10.131.0.1:53           0.0.0.0:*               LISTEN      9666/dnsmasq       

tcp        0      0 10.244.246.68:53        0.0.0.0:*               LISTEN      9666/dnsmasq       

tcp        0      0 172.17.0.1:53           0.0.0.0:*               LISTEN      9666/dnsmasq       

tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      6515/sshd          

tcp        0      0 127.0.0.1:11256         0.0.0.0:*               LISTEN      13155/openshift    

tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN      6762/master        

tcp6       0      0 :::9100                 :::*                    LISTEN      16837/./kube-rbac-p

tcp6       0      0 :::111                  :::*                    LISTEN      1/systemd          

tcp6       0      0 :::10256                :::*                    LISTEN      13155/openshift    

tcp6       0      0 fe80::5863:33ff:fede:53 :::*                    LISTEN      9666/dnsmasq       

tcp6       0      0 fe80::b435:b5ff:fe77:53 :::*                    LISTEN      9666/dnsmasq       

tcp6       0      0 fe80::250:56ff:fe92::53 :::*                    LISTEN      9666/dnsmasq       

tcp6       0      0 :::22                   :::*                    LISTEN      6515/sshd          

tcp6       0      0 ::1:25                  :::*                    LISTEN      6762/master        

udp        0      0 127.0.0.1:53            0.0.0.0:*                           13155/openshift    

udp        0      0 10.131.0.1:53           0.0.0.0:*                           9666/dnsmasq       

udp        0      0 10.244.246.68:53        0.0.0.0:*                           9666/dnsmasq       

udp        0      0 172.17.0.1:53           0.0.0.0:*                           9666/dnsmasq       

udp        0      0 0.0.0.0:111             0.0.0.0:*                           1/systemd          

udp        0      0 127.0.0.1:323           0.0.0.0:*                           5855/chronyd       

udp        0      0 0.0.0.0:4789            0.0.0.0:*                           -                  

udp        0      0 0.0.0.0:922             0.0.0.0:*                           5857/rpcbind       

udp6       0      0 fe80::5863:33ff:fede:53 :::*                                9666/dnsmasq       

udp6       0      0 fe80::b435:b5ff:fe77:53 :::*                                9666/dnsmasq       

udp6       0      0 fe80::250:56ff:fe92::53 :::*                                9666/dnsmasq       

udp6       0      0 :::111                  :::*                                1/systemd          

udp6       0      0 ::1:323                 :::*                                5855/chronyd       

udp6       0      0 :::4789                 :::*                                -                   

udp6       0      0 :::922                  :::*                                5857/rpcbind    

 

 

And here's the information about the infra node:

ip a sh

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000

    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

    inet 127.0.0.1/8 scope host lo

       valid_lft forever preferred_lft forever

    inet6 ::1/128 scope host

       valid_lft forever preferred_lft forever

2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000

    link/ether 00:50:56:92:7e:0e brd ff:ff:ff:ff:ff:ff

    inet 10.244.246.67/24 brd 10.244.246.255 scope global noprefixroute ens192

       valid_lft forever preferred_lft forever

    inet6 fe80::250:56ff:fe92:7e0e/64 scope link

       valid_lft forever preferred_lft forever

3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default

    link/ether 02:42:2f:95:40:a0 brd ff:ff:ff:ff:ff:ff

    inet 172.17.0.1/16 scope global docker0

       valid_lft forever preferred_lft forever

4: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000

    link/ether 5e:fa:58:76:66:e3 brd ff:ff:ff:ff:ff:ff

5: br0: <BROADCAST,MULTICAST> mtu 1450 qdisc noop state DOWN group default qlen 1000

    link/ether 7a:fa:a0:8a:3c:44 brd ff:ff:ff:ff:ff:ff

6: vxlan_sys_4789: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65535 qdisc noqueue master ovs-system state UNKNOWN group default qlen 1000

    link/ether 9a:3d:14:c3:b7:88 brd ff:ff:ff:ff:ff:ff

    inet6 fe80::983d:14ff:fec3:b788/64 scope link

       valid_lft forever preferred_lft forever

7: tun0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default qlen 1000

    link/ether a6:e8:18:17:e5:90 brd ff:ff:ff:ff:ff:ff

    inet 10.129.0.1/23 brd 10.129.1.255 scope global tun0

       valid_lft forever preferred_lft forever

    inet6 fe80::a4e8:18ff:fe17:e590/64 scope link

       valid_lft forever preferred_lft forever

10: veth942356cc if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master ovs-system state UP group default

    link/ether 96:86:35:57:42:df brd ff:ff:ff:ff:ff:ff link-netnsid 2

    inet6 fe80::9486:35ff:fe57:42df/64 scope link

       valid_lft forever preferred_lft forever

11: veth138895b0 if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master ovs-system state UP group default

    link/ether 5e:2a:76:d4:d3:3c brd ff:ff:ff:ff:ff:ff link-netnsid 0

    inet6 fe80::5c2a:76ff:fed4:d33c/64 scope link

       valid_lft forever preferred_lft forever

12: veth6b93f489 if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master ovs-system state UP group default

    link/ether 5a:ba:57:1e:f1:8d brd ff:ff:ff:ff:ff:ff link-netnsid 1

    inet6 fe80::58ba:57ff:fe1e:f18d/64 scope link

       valid_lft forever preferred_lft forever

13: veth13cc07c2 if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master ovs-system state UP group default

    link/ether da:38:d2:ab:25:c0 brd ff:ff:ff:ff:ff:ff link-netnsid 3

    inet6 fe80::d838:d2ff:feab:25c0/64 scope link

       valid_lft forever preferred_lft forever

14: vethbe90ec8d if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master ovs-system state UP group default

    link/ether da:92:a1:00:1a:c8 brd ff:ff:ff:ff:ff:ff link-netnsid 4

    inet6 fe80::d892:a1ff:fe00:1ac8/64 scope link

       valid_lft forever preferred_lft forever

15: veth94866813 if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master ovs-system state UP group default

    link/ether 86:a0:7b:f7:26:55 brd ff:ff:ff:ff:ff:ff link-netnsid 5

    inet6 fe80::84a0:7bff:fef7:2655/64 scope link

       valid_lft forever preferred_lft forever

16: vethb41bbb85 if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master ovs-system state UP group default

    link/ether ee:3d:77:ca:f6:81 brd ff:ff:ff:ff:ff:ff link-netnsid 6

    inet6 fe80::ec3d:77ff:feca:f681/64 scope link

       valid_lft forever preferred_lft forever

17: vethe66c0168 if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master ovs-system state UP group default

    link/ether d6:d6:aa:3f:a9:2c brd ff:ff:ff:ff:ff:ff link-netnsid 7

    inet6 fe80::d4d6:aaff:fe3f:a92c/64 scope link

       valid_lft forever preferred_lft forever

18: vethcc45533d if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master ovs-system state UP group default

    link/ether 36:9e:1c:cc:18:b0 brd ff:ff:ff:ff:ff:ff link-netnsid 8

    inet6 fe80::349e:1cff:fecc:18b0/64 scope link

       valid_lft forever preferred_lft forever

20: vethe86b839d if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master ovs-system state UP group default

    link/ether 6a:4b:17:c1:37:bc brd ff:ff:ff:ff:ff:ff link-netnsid 10

    inet6 fe80::684b:17ff:fec1:37bc/64 scope link

       valid_lft forever preferred_lft forever

 

netstat

Active Internet connections (only servers)

Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name   

tcp        0      0 127.0.0.1:10443         0.0.0.0:*               LISTEN      47456/haproxy      

tcp        0      0 127.0.0.1:10444         0.0.0.0:*               LISTEN      47456/haproxy      

tcp        0      0 127.0.0.1:9101          0.0.0.0:*               LISTEN      20146/node_exporter

tcp        0      0 0.0.0.0:111             0.0.0.0:*               LISTEN      1/systemd          

tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN      47456/haproxy      

tcp        0      0 127.0.0.1:43730         0.0.0.0:*               LISTEN      499/hyperkube      

tcp        0      0 127.0.0.1:53            0.0.0.0:*               LISTEN      13116/openshift    

tcp        0      0 10.129.0.1:53           0.0.0.0:*               LISTEN      9678/dnsmasq       

tcp        0      0 10.244.246.67:53        0.0.0.0:*               LISTEN      9678/dnsmasq       

tcp        0      0 172.17.0.1:53           0.0.0.0:*               LISTEN      9678/dnsmasq        

tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      6502/sshd          

tcp        0      0 127.0.0.1:11256         0.0.0.0:*               LISTEN      13116/openshift    

tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN      6769/master        

tcp        0      0 0.0.0.0:443             0.0.0.0:*               LISTEN      47456/haproxy      

tcp6       0      0 :::10250                :::*                    LISTEN      499/hyperkube      

tcp6       0      0 :::9100                 :::*                    LISTEN      20203/./kube-rbac-p

tcp6       0      0 :::111                  :::*                    LISTEN      1/systemd          

tcp6       0      0 :::1936                 :::*                    LISTEN      14493/openshift-rou

tcp6       0      0 :::10256                :::*                    LISTEN      13116/openshift    

tcp6       0      0 fe80::684b:17ff:fec1:53 :::*                    LISTEN      9678/dnsmasq       

tcp6       0      0 fe80::349e:1cff:fecc:53 :::*                    LISTEN      9678/dnsmasq       

tcp6       0      0 fe80::d4d6:aaff:fe3f:53 :::*                    LISTEN      9678/dnsmasq       

tcp6       0      0 fe80::ec3d:77ff:feca:53 :::*                    LISTEN      9678/dnsmasq       

tcp6       0      0 fe80::84a0:7bff:fef7:53 :::*                    LISTEN      9678/dnsmasq       

tcp6       0      0 fe80::d892:a1ff:fe00:53 :::*                    LISTEN      9678/dnsmasq       

tcp6       0      0 fe80::d838:d2ff:feab:53 :::*                    LISTEN      9678/dnsmasq       

tcp6       0      0 fe80::58ba:57ff:fe1e:53 :::*                    LISTEN      9678/dnsmasq       

tcp6       0      0 fe80::5c2a:76ff:fed4:53 :::*                    LISTEN      9678/dnsmasq       

tcp6       0      0 fe80::9486:35ff:fe57:53 :::*                    LISTEN      9678/dnsmasq       

tcp6       0      0 fe80::983d:14ff:fec3:53 :::*                    LISTEN      9678/dnsmasq       

tcp6       0      0 fe80::a4e8:18ff:fe17:53 :::*                    LISTEN      9678/dnsmasq       

tcp6       0      0 fe80::250:56ff:fe92::53 :::*                    LISTEN      9678/dnsmasq       

tcp6       0      0 :::22                   :::*                    LISTEN      6502/sshd          

tcp6       0      0 ::1:25                  :::*                    LISTEN      6769/master        

udp        0      0 127.0.0.1:53            0.0.0.0:*                           13116/openshift    

udp        0      0 10.129.0.1:53           0.0.0.0:*                           9678/dnsmasq       

udp        0      0 10.244.246.67:53        0.0.0.0:*                           9678/dnsmasq       

udp        0      0 172.17.0.1:53           0.0.0.0:*                           9678/dnsmasq       

udp        0      0 0.0.0.0:111             0.0.0.0:*                           1/systemd          

udp        0      0 127.0.0.1:323           0.0.0.0:*                           5863/chronyd       

udp        0      0 0.0.0.0:4789            0.0.0.0:*                           -                  

udp        0      0 0.0.0.0:929             0.0.0.0:*                           5856/rpcbind       

udp6       0      0 fe80::684b:17ff:fec1:53 :::*                                9678/dnsmasq       

udp6       0      0 fe80::349e:1cff:fecc:53 :::*                                9678/dnsmasq       

udp6       0      0 fe80::d4d6:aaff:fe3f:53 :::*                                9678/dnsmasq       

udp6       0      0 fe80::ec3d:77ff:feca:53 :::*                                9678/dnsmasq       

udp6       0      0 fe80::84a0:7bff:fef7:53 :::*                                9678/dnsmasq       

udp6       0      0 fe80::d892:a1ff:fe00:53 :::*                                9678/dnsmasq       

udp6       0      0 fe80::d838:d2ff:feab:53 :::*                                9678/dnsmasq       

udp6       0      0 fe80::58ba:57ff:fe1e:53 :::*                                9678/dnsmasq       

udp6       0      0 fe80::5c2a:76ff:fed4:53 :::*                                9678/dnsmasq       

udp6       0      0 fe80::9486:35ff:fe57:53 :::*                                9678/dnsmasq       

udp6       0      0 fe80::983d:14ff:fec3:53 :::*                                9678/dnsmasq       

udp6       0      0 fe80::a4e8:18ff:fe17:53 :::*                                9678/dnsmasq       

udp6       0      0 fe80::250:56ff:fe92::53 :::*                                9678/dnsmasq       

udp6       0      0 :::111                  :::*                                1/systemd          

udp6       0      0 ::1:323                 :::*                                5863/chronyd       

udp6       0      0 :::4789                 :::*                                -                  

udp6       0      0 :::929                  :::*                                5856/rpcbind       

 

Hope this will be useful

 


 

 

From: Nikolas Philips <nikolas philips gmail com>
Sent: Tuesday, April 16, 2019 17:27
To: ANUZET Wilfried <wilfried anuzet uclouvain be>
Cc: OpenShift Users List <users lists openshift redhat com>
Subject: Re: OKD installation on CentOS 7.6

 

Hi Wilfried,

did you check that you could connect to these servers via ssh before you ran the ansible installer?

Have you applied any custom iptables rules to these servers (via cloud-init or similar, maybe)?

Do these servers have only one NIC and one IP address over which you access them?

Maybe try to open port 22 explicitly via iptables on one node, to test if it's the firewall that blocks the requests. 

See what happens if you stop the origin-node service on a compute node (systemctl stop origin-node). If this doesn't help, try to flush all applied iptables rules and re-add only port 22 afterwards (better make a backup first; I think kube-proxy will generate them, but I'm not 100% sure). 
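The flush-and-reopen step could be sketched like this (dry-run by default: IPT only echoes the commands; set IPT=iptables and run as root to apply them, and take the iptables-save backup first):

```shell
#!/bin/sh
# Dry-run sketch: flush all rules but keep SSH (and ping) reachable.
# IPT="echo iptables" only prints; set IPT=iptables (as root) to apply.
IPT=${IPT:-echo iptables}

# Before applying for real, back up so you can roll back:
#   iptables-save > /root/iptables.backup
#   iptables-restore < /root/iptables.backup   (to undo)

flush_keep_ssh() {
    $IPT -P INPUT ACCEPT                       # avoid locking yourself out
    $IPT -F                                    # flush all chains
    $IPT -A INPUT -p tcp --dport 22 -j ACCEPT  # re-allow SSH
    $IPT -A INPUT -p icmp -j ACCEPT            # re-allow ping
}

flush_keep_ssh
```

If SSH comes back after the flush, one of the generated rules (or a chain kube-proxy rebuilds) is dropping your traffic; re-add the rules in batches to narrow it down.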

And please provide the output of "netstat -tupln" and "ip address show" from one node and the infra node (and check whether the sshd service is bound to your external IP).

Even if the cluster is causing this behaviour, I think the issue might be caused by a certain server config (e.g. firewall, network). I'm trying to isolate the possible cause with these questions. 

 

Best Regards,

Nikolas

 

 

On Tue, Apr 16, 2019 at 16:42, ANUZET Wilfried <wilfried anuzet uclouvain be> wrote:

Hello Nikolas,

 

I just tested something, and it is obviously a network problem on the OpenShift cluster itself.

I just rebooted the master to test, and the server is accessible through a little window when the TCP/IP stack is up but before the firewall / OKD starts.

 

Don't know where I missed something.

 


 

 

From: ANUZET Wilfried
Sent: Tuesday, April 16, 2019 15:17
To: 'Nikolas Philips' <nikolas philips gmail com>
Subject: RE: OKD installation on CentOS 7.6

 

Hello Nikolas,

 

Here are the points you mentioned that I have to check:

·         On all servers, NM_CONTROLLED=yes is set in the network interface definitions.

                The service itself is running:

[root okdmst01t ~]# systemctl status NetworkManager
● NetworkManager.service - Network Manager
   Loaded: loaded (/usr/lib/systemd/system/NetworkManager.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2019-04-15 10:05:21 CEST; 1 day 5h ago
     Docs: man:NetworkManager(8)
 Main PID: 10304 (NetworkManager)
   CGroup: /system.slice/NetworkManager.service
           └─10304 /usr/sbin/NetworkManager --no-daemon

Apr 15 10:17:40 okdmst01t.stluc.ucl.ac.be NetworkManager[10304]: <info>  [1555316260.6944] device (veth726db232): enslaved to non-master-type device ovs-system; ignoring
Apr 15 10:18:35 okdmst01t.stluc.ucl.ac.be NetworkManager[10304]: <info>  [1555316315.3185] device (veth138e5060): carrier: link connected
Apr 15 10:18:35 okdmst01t.stluc.ucl.ac.be NetworkManager[10304]: <info>  [1555316315.3188] manager: (veth138e5060): new Veth device (/org/freedesktop/NetworkManager/Devices/12)
Apr 15 10:18:35 okdmst01t.stluc.ucl.ac.be NetworkManager[10304]: <info>  [1555316315.3346] device (veth138e5060): enslaved to non-master-type device ovs-system; ignoring
Apr 15 10:18:44 okdmst01t.stluc.ucl.ac.be NetworkManager[10304]: <info>  [1555316324.3338] manager: (veth95ee3ae7): new Veth device (/org/freedesktop/NetworkManager/Devices/13)
Apr 15 10:18:44 okdmst01t.stluc.ucl.ac.be NetworkManager[10304]: <info>  [1555316324.3347] device (veth95ee3ae7): carrier: link connected
Apr 15 10:18:44 okdmst01t.stluc.ucl.ac.be NetworkManager[10304]: <info>  [1555316324.3555] device (veth95ee3ae7): enslaved to non-master-type device ovs-system; ignoring
Apr 15 10:20:39 okdmst01t.stluc.ucl.ac.be NetworkManager[10304]: <info>  [1555316439.2149] device (vethb5a95288): carrier: link connected
Apr 15 10:20:39 okdmst01t.stluc.ucl.ac.be NetworkManager[10304]: <info>  [1555316439.2155] manager: (vethb5a95288): new Veth device (/org/freedesktop/NetworkManager/Devices/14)
Apr 15 10:20:39 okdmst01t.stluc.ucl.ac.be NetworkManager[10304]: <info>  [1555316439.2515] device (vethb5a95288): enslaved to non-master-type device ovs-system; ignoring

 

·         I can reach another server in a different internal subnet, outside the /24 subnet used by the OKD servers (I can't reach a server outside our internal network, as outbound SSH and ICMP are disabled at our firewall level…)

 

·         The routes are the same on the master and LB node:

LB:

[root okdlb01t ~]$ ip route show
default via 10.244.246.2 dev ens192 proto static metric 100
10.244.246.0/24 dev ens192 proto kernel scope link src 10.244.246.84 metric 100
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1

MASTER:

[root okdmst01t ~]# ip route show
default via 10.244.246.2 dev ens192 proto static metric 100
10.128.0.0/14 dev tun0 scope link
10.244.246.0/24 dev ens192 proto kernel scope link src 10.244.246.66 metric 100
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
172.30.0.0/16 dev tun0

 

·         Here's the result of the oc command regarding the OpenShift SDN pods:

[root okdmst01t ~]# oc get pods -n openshift-sdn
NAME        READY     STATUS    RESTARTS   AGE
ovs-h6vqq   1/1       Running   0          1d
ovs-prm2z   1/1       Running   0          1d
ovs-r5wll   1/1       Running   0          1d
ovs-stnc5   1/1       Running   0          1d
sdn-4g5fk   1/1       Running   0          1d
sdn-4vlpr   1/1       Running   0          1d
sdn-5775r   1/1       Running   0          1d
sdn-j87dp   1/1       Running   0          1d

 

 

Thanks for your help.

 


 

 

From: Nikolas Philips <nikolas philips gmail com>
Sent: Tuesday, April 16, 2019 14:47
To: ANUZET Wilfried <wilfried anuzet uclouvain be>
Subject: Re: OKD installation on CentOS 7.6

 

Hey Wilfried,

it looks like you've got some networking issues. I think the [lb] node isn't affected because there's only an HAProxy deployment there, and that node is probably not integrated into the SDN of your cluster. So I guess the ansible installer, or rather the installation of the SDN, messed up your network settings. 

Can you reach hosts outside of your subnet from the master node? E.g. 1.1.1.1 or a different internal host from a different subnet? 

Is NetworkManager enabled and running on all nodes (required!)?

Are the default routes correct on all nodes? (Check with "ip route show" and look for the default line. Is the gateway correct? Is it the same as on the LB node?)

When you are connected to the master node, can you execute "oc get nodes"? If yes, can you check whether the SDN pods are running ("oc get pods -n openshift-sdn")? And are the nodes Ready? 

 

Best Regards,

Nikolas

 

 

On Tue, Apr 16, 2019 at 14:21, ANUZET Wilfried <wilfried anuzet uclouvain be> wrote:

Hello,

 

I tried to install OKD onto brand new CentOS 7.6 VMs.

As I had already set up a simple cluster on my cloud server to learn OpenShift (1 master, 1 node / CentOS 7.6 running on Proxmox), I assumed it would be just as easy using the openshift-ansible project.

 

Here are the servers I want to deploy:

okdlb01t => OKD Load balancer / 1CPU / 2G RAM / 1NIC

okdmst01t => OKD master / 8CPU / 16G RAM / 1NIC

okdnod01t / okdnod02t => 2 OKD nodes / 4CPU / 8G RAM / 1NIC

okdinf01t => OKD infrastructure node / 4CPU / 8G RAM / 1NIC

 


All servers are configured to:

- use one of our internal /24 networks

- use the corporate proxy at user space and docker level

- use Red Hat Satellite as the repositories source

- use Active Directory as the user authentication method

- be accessible through SSH.

 

Here's my inventory file:

---------------------
[masters]
okdmst01t.stluc.ucl.ac.be

[etcd]
okdmst01t.stluc.ucl.ac.be openshift_master_cluster_hostname="okdmst01t.stluc.ucl.ac.be" openshift_schedulable=true

[nodes]
okdmst01t.stluc.ucl.ac.be openshift_node_group_name="node-config-master"
okdinf01t.stluc.ucl.ac.be openshift_node_group_name="node-config-infra"
okdnod0[1:2]t.stluc.ucl.ac.be openshift_node_group_name="node-config-compute"

[lb]
okdlb01t.stluc.ucl.ac.be

[OSEv3:children]
masters
nodes
etcd
lb

[OSEv3:vars]
openshift_deployment_type=origin
openshift_master_default_subdomain=okdt.stluc.ucl.ac.be
debug_level=2
ansible_become=true
openshift_docker_insecure_registries=172.30.0.0/16
openshift_release=3.11
openshift_install_examples=true
os_firewall_use_firewalld=true
openshift_disable_check=docker_image_availability
---------------------

 

I use the Ansible Tower upstream (AWX) to deploy OKD and made the following workflow:

prerequisites.yml == on-success ==> deploy-cluster.yml == on-failure ==> uninstall.yml

 

Everything seems to run well and my workflow executes correctly.

 

But, I don't know why, once OKD is deployed none of the master / node / infra servers are accessible through ssh, and none respond to ping.

I can still use the VMware console and see that all containers are up and running.

 

I can still log in to the LB, and all nodes are visible from it.

 

So I can't connect to the web console or log in using oc:

- in Browser (tested with latest Firefox and Chromium): https://okdmst01t.stluc.ucl.ac.be:8443/

  Connection timed out

 

- CLI:

  oc login https://okdmst01t.stluc.ucl.ac.be:8443

  error: dial tcp 10.244.246.66:8443: i/o timeout - verify you have provided the correct host and port and that the server is currently running.

 

Do you have a clue about what I have to check?

Is there something I missed?

I already read the latest OKD doc and the server-world tutorial (https://www.server-world.info/en/note?os=CentOS_7&p=openshift311&f=1), but I can't find anything to help me solve this.

I don't really know what to search …

If you have a clue or something to help please share it.

 

Best regards.

 


 

 

_______________________________________________
users mailing list
users lists openshift redhat com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users

