
Re: ansible run with cert errors (certificate signed by unknown authority)



Dear community,
I think the problem lies here:

$ openssl x509 -in /etc/etcd/peer.crt -text -noout
        Subject: CN=xxx.xxx
            X509v3 Subject Alternative Name:
                IP Address:z.z.z.z

The CN belongs to master 1, but the IP address in the SAN belongs to master 3.
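
A minimal sketch for checking this on each master. The helper names `san_ips` and `cert_text_has_ip` are made up for illustration; they only parse the text that `openssl x509 -text` prints, so the expected IP has to be supplied by hand.

```shell
# Print every IP listed as "IP Address:<ip>" in `openssl x509 -text`
# output read from stdin, one per line.
san_ips() {
    grep -o 'IP Address:[^,[:space:]]*' | cut -d: -f2-
}

# Succeed only if the expected IP ($1) is among the SAN IPs.
# The certificate text is read from stdin; the match is exact per line,
# so 10.0.0.3 does not match 10.0.0.30.
cert_text_has_ip() {
    san_ips | grep -xF -- "$1" >/dev/null
}

# Usage on a master, with expected_ip set to that host's own address:
#   openssl x509 -in /etc/etcd/peer.crt -text -noout \
#       | cert_text_has_ip "$expected_ip" \
#       || echo "peer.crt does not list $expected_ip"
```

If the check fails on a host, that host is serving a peer certificate issued for a different member, which would match the "certificate is valid for y.y.y.y, not z.z.z.z" messages in the log.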

In addition, the same /etc/etcd/peer.crt, with identical values, is present on all three masters.
It should be: (on master1) CN:master1 IP:master1
(on master2) CN:master2 IP:master2
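
One way to confirm that the hosts really share a certificate is to compare fingerprints. This is only a sketch: `duplicate_fingerprints` is a hypothetical helper, and the usage comment assumes SSH access to each master under the inventory hostnames.

```shell
# Read lines of "<host> <fingerprint>" from stdin and print each
# fingerprint that is reported by more than one host.
duplicate_fingerprints() {
    awk '{ count[$2]++ } END { for (fp in count) if (count[fp] > 1) print fp }'
}

# Usage (assumes ssh works for each master):
#   for h in xxx.xxx yyy.yyy zzz.zzz; do
#       fp=$(ssh "$h" openssl x509 -in /etc/etcd/peer.crt -noout \
#                -fingerprint -sha256 | cut -d= -f2)
#       printf '%s %s\n' "$h" "$fp"
#   done | duplicate_fingerprints
```

Any output means at least two masters hold the same peer.crt; no output means every host has a distinct certificate.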

It seems that one of the recent commits in this area broke things; it was working fine before :(
But I can't find the commit. :(

Really need help with this.
Thanks a lot!
   Sebastian Wieseler



On 8 Apr 2016, at 12:05 PM, Sebastian Wieseler <sebastian myrepublic com sg> wrote:

Dear community,
I am running the latest ansible playbook version and followed the advanced installation guide.
(Updating 6bae443..1b82b1b)


When I execute ansible-playbook ~/openshift-ansible/playbooks/byo/config.yml it fails with:
TASK: [openshift_master | Start and enable master api] ************************
failed: [x.x.x.x] => {"failed": true}
msg: Job for origin-master-api.service failed because the control process exited with error code. See "systemctl status origin-master-api.service" and "journalctl -xe" for details.



Apr 08 03:47:43   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since pipeline's sending buffer is full
Apr 08 03:47:43   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since pipeline's sending buffer is full
Apr 08 03:47:43   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since pipeline's sending buffer is full
Apr 08 03:47:43   etcd[12180]: dropped MsgHeartbeatResp to 9dc58f8e2290c613 since pipeline's sending buffer is full
Apr 08 03:47:43   etcd[12180]: dropped MsgProp to 9dc58f8e2290c613 since pipeline's sending buffer is full
Apr 08 03:47:45   etcd[12180]: dropped MsgHeartbeatResp to 9dc58f8e2290c613 since pipeline's sending buffer is full
Apr 08 03:47:45   etcd[12180]: publish error: etcdserver: request timed out, possibly due to connection lost
Apr 08 03:47:45   origin-master-controllers[116866]: E0408 03:47:45.976514  116866 leaderlease.go:69] unable to check lease openshift.io/leases/controllers: 501: All the given peers are not reachable (failed to propose on members [https://xxx.xxx:2379 x509: certificate signed by unknown authority]) [0]

Apr 08 03:47:47   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since pipeline's sending buffer is full
Apr 08 03:47:47   etcd[12180]: the connection to peer af936f5f6ff57c05 is unhealthy
Apr 08 03:47:47   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since pipeline's sending buffer is full
Apr 08 03:47:47   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since pipeline's sending buffer is full
Apr 08 03:47:47   etcd[12180]: dropped MsgHeartbeatResp to 9dc58f8e2290c613 since pipeline's sending buffer is full
Apr 08 03:47:47   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since pipeline's sending buffer is full
Apr 08 03:47:47   etcd[12180]: dropped MsgProp to 9dc58f8e2290c613 since pipeline's sending buffer is full
Apr 08 03:47:47   origin-node[26652]: E0408 03:47:47.708378   26652 kubelet.go:2761] Error updating node status, will retry: error getting node "xxx.xxx": error #0: net/http: TLS handshake timeout
Apr 08 03:47:47   origin-node[26652]: error #1: net/http: TLS handshake timeout
Apr 08 03:47:47   origin-node[26652]: error #2: x509: certificate signed by unknown authority
Apr 08 03:47:48   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since pipeline's sending buffer is full
Apr 08 03:47:48   etcd[12180]: dropped MsgHeartbeatResp to 9dc58f8e2290c613 since pipeline's sending buffer is full
Apr 08 03:47:48   etcd[12180]: dropped MsgHeartbeatResp to 9dc58f8e2290c613 since pipeline's sending buffer is full
Apr 08 03:47:49   origin-node[26652]: E0408 03:47:49.187066   26652 kubelet.go:2761] Error updating node status, will retry: error getting node "xxx.xxx": error #0: x509: certificate signed by unknown authority
Apr 08 03:47:49   origin-node[26652]: error #1: x509: certificate signed by unknown authority
Apr 08 03:47:49   origin-node[26652]: error #2: x509: certificate signed by unknown authority
Apr 08 03:47:49   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since pipeline's sending buffer is full
Apr 08 03:47:49   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since pipeline's sending buffer is full
Apr 08 03:47:49   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since pipeline's sending buffer is full
Apr 08 03:47:49   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since pipeline's sending buffer is full
Apr 08 03:47:49   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since pipeline's sending buffer is full
Apr 08 03:47:49   etcd[12180]: dropped MsgHeartbeatResp to 9dc58f8e2290c613 since pipeline's sending buffer is full
Apr 08 03:47:49   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since pipeline's sending buffer is full
Apr 08 03:47:49   etcd[12180]: failed to dial af936f5f6ff57c05 on stream MsgApp v2 (EOF)
Apr 08 03:47:49   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since pipeline's sending buffer is full
Apr 08 03:47:49   etcd[12180]: failed to dial af936f5f6ff57c05 on stream Message (EOF)
Apr 08 03:47:49   etcd[12180]: dropped MsgHeartbeatResp to 9dc58f8e2290c613 since pipeline's sending buffer is full
Apr 08 03:47:49   etcd[12180]: dropped MsgHeartbeatResp to 9dc58f8e2290c613 since pipeline's sending buffer is full
Apr 08 03:47:49   etcd[12180]: the connection with 9dc58f8e2290c613 became inactive
Apr 08 03:47:49   etcd[12180]: failed to write 9dc58f8e2290c613 on pipeline (EOF)
Apr 08 03:47:51   etcd[12180]: failed to dial af936f5f6ff57c05 on stream Message (x509: certificate is valid for y.y.y.y, not z.z.z.z)
 --> Note: z.z.z.z is my master03 and y.y.y.y is my master02.
Apr 08 03:47:51   etcd[12180]: failed to dial af936f5f6ff57c05 on stream MsgApp v2 (x509: certificate is valid for y.y.y.y, not z.z.z.z)
Apr 08 03:47:52   etcd[12180]: failed to write 9dc58f8e2290c613 on pipeline (net/http: TLS handshake timeout)
Apr 08 03:47:52   etcd[12180]: the connection with 9dc58f8e2290c613 became active
Apr 08 03:47:53   etcd[12180]: the connection with 9dc58f8e2290c613 became inactive
Apr 08 03:47:53   etcd[12180]: failed to write 9dc58f8e2290c613 on pipeline (net/http: TLS handshake timeout)
Apr 08 03:47:54   etcd[12180]: etcdserver: request timed out, possibly due to connection lost
Apr 08 03:47:56   etcd[12180]: publish error: etcdserver: request timed out, possibly due to connection lost
Apr 08 03:47:56   etcd[12180]: the connection with 9dc58f8e2290c613 became active
Apr 08 03:48:01   origin-node[26652]: E0408 03:48:01.380964   26652 kubelet.go:2761] Error updating node status, will retry: error getting node "xxx.xxxt": error
Apr 08 03:48:01   origin-node[26652]: error #1: net/http: TLS handshake timeout
Apr 08 03:48:01   origin-node[26652]: error #2: x509: certificate signed by unknown authority
Apr 08 03:48:03   etcd[12180]: the connection with 9dc58f8e2290c613 became inactive
Apr 08 03:48:03   etcd[12180]: failed to write 9dc58f8e2290c613 on pipeline (EOF)
Apr 08 03:48:04   origin-master-controllers[116866]: E0408 03:48:04.691728  116866 leaderlease.go:69] unable to check lease openshift.io/leases/controllers: 501: All the given peers are not reachable
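
Most of the journal above is etcd's "dropped Msg..." noise. A small filter can pull out only the certificate and TLS lines; `tls_errors` is a hypothetical helper name, and the patterns are taken from the messages quoted above.

```shell
# Keep only journal lines that mention an x509 error or a TLS
# handshake timeout; everything else is dropped.
tls_errors() {
    grep -E 'x509:|TLS handshake timeout'
}

# Usage:
#   journalctl --no-pager --since today | tls_errors
```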



My setup includes three masters:
[masters]
x.x.x.x openshift_hostname=xxx.xxx openshift_public_hostname=xxx.xxx
y.y.y.y openshift_hostname=yyy.yyy openshift_public_hostname=yyy.yyy
z.z.z.z openshift_hostname=zzz.zzz openshift_public_hostname=zzz.zzz

[etcd]
x.x.x.x openshift_hostname=xxx.xxx openshift_public_hostname=xxx.xxx
y.y.y.y openshift_hostname=yyy.yyy openshift_public_hostname=yyy.yyy
z.z.z.z openshift_hostname=zzz.zzz openshift_public_hostname=zzz.zzz



I also tried destroying the config:
# yum -y remove openshift openshift-* etcd
# rm -rf /etc/origin /var/lib/openshift /etc/etcd \
    /var/lib/etcd /etc/sysconfig/atomic-openshift* \
    /root/.kube/config /etc/ansible/facts.d /usr/share/openshift

But ansible fails at the same step and the cert errors persist.

Can somebody help me?

Thanks a lot in advance!
Best Regards,
  Sebastian Wieseler


_______________________________________________
users mailing list
users lists openshift redhat com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users

