
Re: router/registry crashes



https://docs.openshift.org/latest/install_config/install/prerequisites.html#system-requirements
and https://access.redhat.com/articles/2191731 (if you are a Red Hat
access subscriber) cover a lot of the details of capacity planning.

On Thu, Apr 21, 2016 at 1:06 PM, Candide Kemmler
<candide intrinsic world> wrote:
> Sorry for my long silence and late answer, but I had a hard time reproducing the bug, as a lot of other issues crept in.
> I finally decided to opt for a machine with (much) more RAM, which seems to have solved all the issues.
>
> So I went from a 2 GB master / 4 GB node cluster to an 8 GB master / 30 GB node setup.
>
> Really, is it possible to evaluate requirements ahead of time? What would be a good metric?
>
>
>> On 21 Apr 2016, at 15:09, Clayton Coleman <ccoleman redhat com> wrote:
>>
>> When you connect to the router on port 1936 at path healthz (from one
>> of your hosts), what happens?
>>
>>> On Apr 21, 2016, at 8:22 AM, Candide Kemmler <candide intrinsic world> wrote:
>>>
>>> This is getting really annoying: I completely changed our deployment strategy in order to use fewer resources, but those problems keep reappearing.
>>>
>>> The router and registry will crash occasionally.
>>>
>>> They'll keep restarting, so problems come and go, but when the router is gone, evidently everything stops working.
>>>
>>> Eventually though the pod will stop retrying, and everything is dead for good.
>>>
>>> Router events:
>>>
>>> 2:18:16 PM    Normal    Pulled    Container image "openshift/origin-haproxy-router:v1.1.6" already present on machine
>>> 3 times in the last 7 minutes
>>> 2:18:14 PM    Warning    Unhealthy    Readiness probe failed: Get http://localhost:1936/healthz: dial tcp 127.0.0.1:1936: connection refused
>>> 3 times in the last minute
>>> 2:18:14 PM    Normal    Killing    Killing container with docker id d75e2094cb69: pod "router-1-dz5h0_default(8888db2b-07b5-11e6-b1b1-560000242f19)" container "router" is unhealthy, it will be killed and re-created.
>>> 2:17:54 PM    Warning    Unhealthy    Liveness probe failed: Get http://localhost:1936/healthz: dial tcp 127.0.0.1:1936: connection refused
>>> 2:17:25 PM    Normal    Started    Started container with docker id d75e2094cb69
>>> 2:17:06 PM    Normal    Created    Created container with docker id d75e2094cb69
>>> 2:16:36 PM    Normal    Killing    Killing container with docker id c1ab9c8a9742: pod "router-1-dz5h0_default(8888db2b-07b5-11e6-b1b1-560000242f19)" container "router" is unhealthy, it will be killed and re-created.
>>> 2:16:35 PM    Warning    Unhealthy    Readiness probe failed: Get http://localhost:1936/healthz: read tcp 127.0.0.1:1936: use of closed network connection
>>> 10 times in the last 17 minutes
>>> 2:16:35 PM    Warning    Unhealthy    Liveness probe failed: Get http://localhost:1936/healthz: read tcp 127.0.0.1:1936: use of closed network connection
>>> 9 times in the last 17 minutes
>>> 2:13:01 PM    Warning    Unhealthy    Readiness probe failed: Get http://localhost:1936/healthz: net/http: request canceled while waiting for connection
>>> 2 times in the last 5 minutes
>>> 2:12:52 PM    Warning    Unhealthy    Liveness probe failed: Get http://localhost:1936/healthz: net/http: request canceled while waiting for connection
>>> 2:11:02 PM    Normal    Created    Created container with docker id c1ab9c8a9742
>>> 2:11:02 PM    Normal    Started    Started container with docker id c1ab9c8a9742
>>> 2:11:01 PM    Normal    Killing    Killing container with docker id 5aa1a32bb178: pod "router-1-dz5h0_default(8888db2b-07b5-11e6-b1b1-560000242f19)" container "router" is unhealthy, it will be killed and re-created.
>>>
>
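
The health check asked about in the quoted thread (router port 1936, path healthz) can also be run by hand from the host where the router pod is running. Below is a minimal sketch in Python, using only the standard library. The function name probe_healthz and its defaults are hypothetical, chosen for illustration; only the port and path come from the events above. It treats any HTTP status from 200 to 399 as healthy, which is how Kubernetes HTTP probes judge success.

```python
import http.client
import urllib.error
import urllib.request


def probe_healthz(host="localhost", port=1936, path="/healthz", timeout=2.0):
    """Send one HTTP GET to the router health endpoint.

    Returns (healthy, detail). The name and defaults are illustrative
    assumptions; only port 1936 and /healthz come from the thread.
    """
    url = f"http://{host}:{port}{path}"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            # Kubernetes HTTP probes count 200-399 as success.
            return 200 <= resp.status < 400, f"HTTP {resp.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (4xx/5xx).
        return False, f"HTTP {exc.code}"
    except (urllib.error.URLError, http.client.HTTPException, OSError) as exc:
        # Nothing listening, connection dropped, or timed out.
        return False, f"no response: {exc}"


if __name__ == "__main__":
    healthy, detail = probe_healthz()
    print("healthy" if healthy else "unhealthy", "-", detail)
```

A "no response" result with "connection refused" would match the probe failures in the events above and suggests that nothing is listening on the port at that moment, rather than the router answering slowly.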

