
Re: [3.x]: openshift router and its own metrics



Hi Clayton,

Certainly some metrics should be preserved across reloads; for example, a counter such as haproxy_server_http_responses_total should keep accumulating across a reload (though, to an extent, Prometheus natively handles counter resets correctly).
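
To illustrate what handling resets means on the query side, here is a rough Python sketch. It assumes the usual convention that any drop between two consecutive scrapes is treated as a reset to zero; the function name and the sample values are hypothetical, and this is not Prometheus's actual code.

```python
def increase_with_resets(samples):
    """Total increase of a counter over a list of scraped values.

    Assumption: a drop between consecutive samples means the counter was
    reset to zero, so the post-reset value is counted in full.
    """
    total = 0.0
    for prev, cur in zip(samples, samples[1:]):
        if cur >= prev:
            total += cur - prev  # normal monotonic growth
        else:
            total += cur         # reset detected: count from zero
    return total


# Hypothetical scrapes: the counter drops from 150 to 20 after a reload.
scrapes = [100, 130, 150, 20, 45]
print(increase_with_resets(scrapes))  # 30 + 20 + 20 + 25 = 95.0
```

Requests served between the last scrape before a reset and the reset itself are lost to this approach, which is one reason to also preserve the counters in the exporter.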
 
However, the metric haproxy_server_http_average_response_latency_milliseconds also appears to be accumulating when we wouldn't expect it to. (According to the haproxy stats, I think that's a rolling average over the last 1024 calls -- so it goes up and down, or should.)
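
As a sketch of the behaviour I'd expect from such a metric, here is a plain windowed mean in Python. The class name is hypothetical and haproxy's internal computation may well differ; the point is only that the value can fall as well as rise, unlike a counter.

```python
from collections import deque


class WindowedAverage:
    """Mean of the most recent `window` observations (illustrative only)."""

    def __init__(self, window=1024):
        # deque with maxlen silently drops the oldest sample when full
        self.samples = deque(maxlen=window)

    def observe(self, latency_ms):
        self.samples.append(latency_ms)

    def value(self):
        if not self.samples:
            return 0.0
        return sum(self.samples) / len(self.samples)


avg = WindowedAverage(window=4)  # tiny window just for the demo
for ms in [10, 10, 10, 10]:
    avg.observe(ms)
low = avg.value()    # 10.0

for ms in [100, 100]:
    avg.observe(ms)
high = avg.value()   # (10 + 10 + 100 + 100) / 4 = 55.0

for ms in [10, 10, 10, 10]:
    avg.observe(ms)
back = avg.value()   # slow samples have left the window: 10.0
```

If the exported value only ever grows across reloads, that suggests it is being summed like a counter rather than reported like a gauge.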

Thoughts?


Cheers,
Dani


On Thu, Aug 15, 2019 at 3:59 PM Clayton Coleman <ccoleman redhat com> wrote:
Metrics memory use in the router should be proportional to the number of services, endpoints, and routes. I doubt it's leaking there, and if it were, it'd be really slow, since we never restart the router monitor process. Stats should definitely be preserved across reloads, but will not be preserved across the pod being restarted.

On Thu, Aug 15, 2019 at 10:30 AM Dan Mace <dmace redhat com> wrote:


On Thu, Aug 15, 2019 at 10:03 AM Daniel Comnea <comnea dani gmail com> wrote:
Hi,

I would appreciate it if anyone could confirm that my understanding of how the router haproxy image [1] is built is correct.
Am I right to assume that the image [1] is built as seen, without any other layer being added to include [2]?
Also, am I right to say that the haproxy metrics [2] are part of the origin package?


A bit of background/context:

A while back, on OKD 3.7, we had to swap the openshift 3.7.2 router image for the 3.10 one because we were seeing problems with reloads, and we wanted the native haproxy 1.8 reload feature so that reloads would stop affecting traffic.

While everything has been working fine, we've noticed recently that the haproxy stats slowly increase, and we wonder whether this is an accumulation caused (maybe?) by the reloads. I'm aware of a change made in [3]; however, I suspect it is not part of the 3.10 image, hence my question to check whether my understanding is correct.


Cheers,
Dani


I think Clayton (copied) has the history here, but the nature of the metrics commit you referenced is that many of the exposed metric points are counters, which were being reset across reloads. The patch was (I think) meant to let counter metrics accumulate correctly across reloads.
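
Roughly, the idea of accumulating on the exporter side can be sketched as below. This is a Python illustration under my own assumptions, not the code from the referenced commit; the class and attribute names are hypothetical.

```python
class ReloadSafeCounter:
    """Expose a monotonic total on top of a raw counter that may reset.

    Assumption: the raw value only goes backwards when the proxy process
    is replaced by a reload and its counters start again from zero.
    """

    def __init__(self):
        self.offset = 0    # sum of final values seen from earlier processes
        self.last_raw = 0  # most recent raw value read from the stats

    def update(self, raw):
        if raw < self.last_raw:
            # Raw counter went backwards: fold the old total into the offset.
            self.offset += self.last_raw
        self.last_raw = raw
        return self.offset + raw


counter = ReloadSafeCounter()
exposed = [counter.update(raw) for raw in [100, 150, 20, 45]]
print(exposed)  # [100, 150, 170, 195]
```

This kind of accumulation only makes sense for counters; applying it to a gauge or an average would make that value grow when it should not.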

As to how the image itself is built, the pkg directory is part of the router controller code included with the image. Not sure if that answers your question.

--

Dan Mace

Principal Software Engineer, OpenShift

Red Hat

dmace redhat com


