[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: Openshift HA environment - keepalived high number of close syscalls

Hadn't seen this until you mentioned it. I can see the close calls in my local env. It looks like they happen in a new process - after a clone() syscall, roughly a couple of seconds apart. So it is likely part of the script that does the health check:
     script "</dev/tcp/${ip}/${watch_port}"
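That check leans on bash's /dev/tcp pseudo-device: the redirection succeeds only if the underlying TCP connect() does. A minimal standalone sketch of the same idea (ip and watch_port here are placeholder values, not taken from any cluster config):

```shell
#!/bin/bash
# Sketch of the health check: redirecting from /dev/tcp/<host>/<port>
# makes bash attempt a TCP connect(); success means the port is up.
# ip/watch_port are hypothetical placeholders for illustration.
ip="127.0.0.1"
watch_port=8443

# The subshell opens fd 3 on the connection and tears it down on exit.
# keepalived clone()s a child like this every check interval, and each
# child's fd teardown is where the close() calls show up in a trace.
if (exec 3<"/dev/tcp/${ip}/${watch_port}") 2>/dev/null; then
    echo "port ${watch_port} on ${ip}: open"
else
    echo "port ${watch_port} on ${ip}: closed"
fi
```

Note this is bash-specific; /dev/tcp is a bash feature, not a real device node, so the check fails under plain /bin/sh.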

But I don't see a slowdown on the cpu side on my instance - it's been running at about 1% for the last 30-odd minutes, so I suspect the slowdown might have to do with the agent/sysdig in your case.

Filing a bug would be good - I spent some time on it just now but couldn't figure out what's causing it, or whether it's a "feature".
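If you do file one, an strace syscall summary makes the close() rate easy to quantify for the bug report. A sketch (needs root, assumes strace is installed, and the pidof usage assumes a single keepalived master process):

```shell
# Count close() calls from keepalived and every child it clone()s.
# Sketch only: requires root and strace; -f follows forked children,
# -c prints a per-syscall count summary when strace exits.
pid=$(pidof keepalived | awk '{print $1}')
timeout 60 strace -f -c -e trace=close -p "$pid"
```

The summary printed after the 60 seconds gives the total close() count for the interval, which you can compare against the ~17 million/minute figure.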


On Tue, Apr 5, 2016 at 2:04 PM, Chuck Sochin <csochin westwardone net> wrote:
Using OSEv3.1.1

I'm looking to set up sysdig in our native HA openshift environment, but I'm having issues getting the agent to run on our infra nodes hosting keepalived and ha-proxy -- the agent runs without issue on all the other nodes in our env.

After the agent has been running for an hour or two, the node hangs and our hypervisor reports 100% cpu utilization. A power reset is the only option to bring the node back to life. The problem may be with keepalived doing an extremely large number (around 17 million per minute) of "close" syscall operations, and it looks like those close operations are on any available fd. Is this expected behavior of keepalived running in an OSEv3.1.1 HA environment?


users mailing list
users lists openshift redhat com

