
Re: weird issue with etcd



hello

yes, i only have two .. i know 3 is the recommended number, but i guessed it might also work with 2 :/

yes, both etcd servers can reach each other on the peer port (2380) .. i have checked that too ..
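
The telnet checks in this thread can be approximated in a few lines of Python. This is only a sketch: `can_connect` and `report` are made-up helper names, the hostnames are the ones from the master config quoted below, and the ports are the client (2379) and peer (2380) ports Scott mentions. A successful TCP connect only shows that the port is reachable; it says nothing about whether the etcd member has a leader.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a plain TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


# Assumed values, taken from the config and replies quoted in this thread.
ETCD_HOSTS = ["openshift-balancer01", "openshift-balancer02"]
PORTS = {"client": 2379, "peer": 2380}


def report(hosts=ETCD_HOSTS, ports=PORTS):
    """Map (host, port name) to True/False reachability."""
    return {
        (host, name): can_connect(host, port)
        for host in hosts
        for name, port in ports.items()
    }
```

Calling `report()` from each master and from each etcd host would give the same information as the manual telnet runs, in one pass.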

so maybe it is because i only have two etcd servers?
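
For reference, a minimal sketch of the majority arithmetic behind Scott's advice, assuming etcd's usual rule that a leader can only be elected by a majority of the cluster members. The function names are made up for illustration.

```python
def quorum(members: int) -> int:
    """Smallest majority of a cluster with `members` voting members."""
    return members // 2 + 1


def failures_tolerated(members: int) -> int:
    """How many members can be down while a majority is still reachable."""
    return members - quorum(members)


for n in (1, 2, 3, 5):
    print(f"members={n} quorum={quorum(n)} tolerated={failures_tolerated(n)}")
```

With two members the quorum is also two, so losing either host leaves the survivor unable to win an election. That matches the etcd log further down, where the remaining node only ever receives its own vote.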

thanks scott!!


> On 21 Jun 2016, at 15:19, Scott Dodson <sdodson redhat com> wrote:
> 
> Julio,
> 
> First, it looks like you've only got two etcd hosts. In order to
> tolerate failure of a single host you'll want three.
> From your master config it looks like your two etcd hosts are
> openshift-balancer01 and openshift-balancer02, can each of those hosts
> connect to each other on port 2380? They will connect directly to each
> other for clustering purposes, then the masters will connect to each
> of the etcd hosts on port 2379 for client connectivity.
> 
> --
> Scott
> 
> On Tue, Jun 21, 2016 at 7:28 AM, Julio Saura <jsaura hiberus com> wrote:
>> yes
>> 
>> working
>> 
>> [root openshift-master01 ~]# telnet XXXXX 2380
>> Trying XXXX...
>> Connected to XXXX.
>> Escape character is '^]'.
>> ^CConnection closed by foreign host.
>> 
>> 
>> On 21 Jun 2016, at 13:21, Jason DeTiberus <jdetiber redhat com> wrote:
>> 
>> Did you verify connectivity over the peering port as well (2380)?
>> 
>> On Jun 21, 2016 7:17 AM, "Julio Saura" <jsaura hiberus com> wrote:
>>> 
>>> hello
>>> 
>>> same problem
>>> 
>>> jun 21 13:11:03 openshift-master01 atomic-openshift-master-api[59618]:
>>> F0621 13:11:03.155246   59618 auth.go:141] error #0: dial tcp XXXX:2379:
>>> connection refused ( the one i rebooted )
>>> jun 21 13:11:03 openshift-master01 atomic-openshift-master-api[59618]:
>>> error #1: client: etcd member https://YYYY:2379 has no leader
>>> 
>>> i rebooted one etcd server and my master is not able to use the other one
>>> 
>>> still able to connect from both masters using telnet to the etcd port ..
>>> 
>>> any clue? this is weird.
>>> 
>>> 
>>>> On 14 Jun 2016, at 9:28, Julio Saura <jsaura hiberus com> wrote:
>>>> 
>>>> hello
>>>> 
>>>> yes, it is correct .. it was the first thing i checked ..
>>>> 
>>>> first master
>>>> 
>>>> etcdClientInfo:
>>>>   ca: master.etcd-ca.crt
>>>>   certFile: master.etcd-client.crt
>>>>   keyFile: master.etcd-client.key
>>>>   urls:
>>>>   - https://openshift-balancer01:2379
>>>>   - https://openshift-balancer02:2379
>>>> 
>>>> 
>>>> second master
>>>> 
>>>> etcdClientInfo:
>>>>   ca: master.etcd-ca.crt
>>>>   certFile: master.etcd-client.crt
>>>>   keyFile: master.etcd-client.key
>>>>   urls:
>>>>   - https://openshift-balancer01:2379
>>>>   - https://openshift-balancer02:2379
>>>> 
>>>> dns names resolve on both masters
>>>> 
>>>> Best regards and thanks!
>>>> 
>>>> 
>>>>> On 13 Jun 2016, at 18:45, Scott Dodson <sdodson redhat com>
>>>>> wrote:
>>>>> 
>>>>> Can you verify that the connection information in the etcdClientInfo
>>>>> section of /etc/origin/master/master-config.yaml is correct?
>>>>> 
>>>>> On Mon, Jun 13, 2016 at 11:56 AM, Julio Saura <jsaura hiberus com>
>>>>> wrote:
>>>>>> hello
>>>>>> 
>>>>>> yes.. i have a external balancer in front of my masters for HA as doc
>>>>>> says.
>>>>>> 
>>>>>> i don’t have any balancer in front of my etcd servers for the masters'
>>>>>> connections, it’s not necessary, right? the masters will try all available
>>>>>> etcd servers if one is down, right?
>>>>>> 
>>>>>> i don’t know why, but none of my masters were able to connect to the
>>>>>> second etcd instance, yet telnet from their shell worked .. so it was
>>>>>> not a network or firewall issue..
>>>>>> 
>>>>>> 
>>>>>> best regards.
>>>>>> 
>>>>>>> On 13 Jun 2016, at 17:53, Clayton Coleman <ccoleman redhat com>
>>>>>>> wrote:
>>>>>>> 
>>>>>>> I have not seen that particular issue.  Do you have a load balancer
>>>>>>> in between your masters and etcd?
>>>>>>> 
>>>>>>> On Fri, Jun 10, 2016 at 5:55 AM, Julio Saura <jsaura hiberus com>
>>>>>>> wrote:
>>>>>>>> hello
>>>>>>>> 
>>>>>>>> i have an origin 3.1 installation working cool so far
>>>>>>>> 
>>>>>>>> today one of my etcd nodes ( 1 of 2 ) crashed and i started having
>>>>>>>> problems..
>>>>>>>> 
>>>>>>>> i noticed that one of my master nodes was not able to connect
>>>>>>>> to the second etcd server, and that this etcd server was not able to get
>>>>>>>> elected as leader..
>>>>>>>> 
>>>>>>>> 
>>>>>>>> jun 10 11:09:55 openshift-balancer02 etcd[47218]: 12c8a31c8fcae0d4 is starting a new election at term 10048
>>>>>>>> jun 10 11:09:55 openshift-balancer02 etcd[47218]: 12c8a31c8fcae0d4 became candidate at term 10049
>>>>>>>> jun 10 11:09:55 openshift-balancer02 etcd[47218]: 12c8a31c8fcae0d4 received vote from 12c8a31c8fcae0d4 at term 10049
>>>>>>>> jun 10 11:09:55 openshift-balancer02 etcd[47218]: 12c8a31c8fcae0d4 [logterm: 8, index: 4600461] sent vote request to bf80ee3a26e8772c at term 10049
>>>>>>>> jun 10 11:09:56 openshift-balancer02 etcd[47218]: got unexpected response error (etcdserver: request timed out)
>>>>>>>> 
>>>>>>>> my masters logged that they were not able to connect to the etcd
>>>>>>>> 
>>>>>>>> er.go:218] unexpected ListAndWatch error: pkg/storage/cacher.go:161:
>>>>>>>> Failed to list *extensions.Job: error #0: dial tcp X.X.X.X:2379: connection
>>>>>>>> refused
>>>>>>>> 
>>>>>>>> so i tried a simple test, just telnet from masters to the etcd node
>>>>>>>> port ..
>>>>>>>> 
>>>>>>>> [root openshift-master01 log]# telnet X.X.X.X 2379
>>>>>>>> Trying X.X.X.X...
>>>>>>>> Connected to X.X.X.X.
>>>>>>>> Escape character is '^]'.
>>>>>>>> 
>>>>>>>> so i was able to connect from masters.
>>>>>>>> 
>>>>>>>> i was not able to recover my oc masters until the first etcd node
>>>>>>>> rebooted .. so it seems my etcd “cluster” is not working without the first
>>>>>>>> node ..
>>>>>>>> 
>>>>>>>> any clue?
>>>>>>>> 
>>>>>>>> thanks
>>>>>>>> 
>>>>>>>> 
>>>>>>>> _______________________________________________
>>>>>>>> users mailing list
>>>>>>>> users lists openshift redhat com
>>>>>>>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>> 
>> 


