Re: Socket activation
- From: Clayton Coleman <ccoleman redhat com>
- To: Krishna Raman <kraman gmail com>
- Cc: Openshift Dev <dev lists openshift redhat com>
- Subject: Re: Socket activation
- Date: Wed, 4 Sep 2013 15:29:29 -0400 (EDT)
----- Original Message -----
> On Sep 4, 2013, at 11:33 AM, David Strauss <david davidstrauss net> wrote:
> > My other responses are in-line, but I wanted to offer yet another
> > option: out-of-band container activation. We currently use this for
> > MariaDB, but the way I'll suggest here is slightly different and more
> > general-purpose than the Drupal integration we use.
> > It's pretty safe to assume that for the main application to come online,
> > its dependent resources need to come online. Socket activation handles
> > this lazily, which is nice, but it's probably a minimal optimization
> > over another option: activating all containers publishing to the
> > application when the initial HTTP request comes in.
> > It wouldn't require any socket magic, nor would it only support HTTP.
> > It's just that the entry point would have to be HTTP.
> > This approach has a major downside, though. If someone wants to
> > directly connect to MariaDB, they'd either have to invoke the
> > out-of-band mechanism first or send an HTTP request to their
> > application to activate dependencies. We've seen a bit of user
> > confusion over this, despite our best attempts at documentation.
> > It might be possible to turn the out-of-band activation into a plus,
> > though. We're looking to move to a security approach where direct
> > MariaDB access requires making an API or dashboard request to unlock
> > the access, which would be a prime time to activate the necessary
> > container(s), too.
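
To make the out-of-band idea above concrete, here is a minimal sketch, assuming a simple map from each application to the containers it depends on. The `Activator` class, the `start_container` callback, and the container names are hypothetical and only illustrate the flow; they are not taken from any existing implementation.

```python
class Activator:
    """Starts an application's dependencies the first time the app is hit."""

    def __init__(self, dependencies, start_container):
        # dependencies: app name -> list of container names it relies on
        self.dependencies = dependencies
        # start_container: hypothetical callback that un-idles one container
        self.start_container = start_container
        self.active = set()

    def activate(self, app):
        """Start every idle container the app needs; return the ones started."""
        started = []
        for name in self.dependencies.get(app, []) + [app]:
            if name not in self.active:
                self.start_container(name)
                self.active.add(name)
                started.append(name)
        return started


log = []
activator = Activator({"drupal-site": ["mariadb"]}, log.append)
activator.activate("drupal-site")  # initial HTTP request, or an API "unlock" call
activator.activate("drupal-site")  # later requests start nothing new
print(log)
```

The same `activate` call could be wired to the API or dashboard "unlock" request, so that direct MariaDB access and activation happen in one step.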
> > On Wed, Sep 4, 2013 at 11:18 AM, Krishna Raman <kraman gmail com> wrote:
> >> It would only be the first SYN that would be ignored and would be picked
> >> up upon the next retry once the container is started.
> >> If the container takes longer to start than the SYN timeout, then even the
> >> socat approach would result in timeout at your router level.
> >> Would the router's failure-detection trip based on a single SYN failure?
> > It would only be a single SYN failure for each container. If lots of
> > containers get activated on a box, say, after a reboot, there'd be a
> > spike in failed SYN receipts.
> When an OpenShift node is rebooted, all containers that were running before
> the reboot are started by a separate systemd service. So I don't think we
> would hit a lot of SYN failures in that case, but I see your point.
> There might be other situations where this could occur.
> If the request is HTTP/HTTPS, then the Apache reverse proxy on the node would
> hide these failures, but if it is sent directly to an exposed port, then the
> failure would be propagated to the router/switch. Other than running haproxy
> or an equivalent on the node (instead of socat in each gear) to hide these
> failures, I don't see a lot of options.
> We could make this a configuration/setup option: if the router/switch
> misbehaves when it gets SYN failures, then use haproxy; otherwise, live with
> one SYN failure per gear un-idle.
> What do you think?
Is SYN on startup after a node reboot really an issue? I'm not sure that scenario is what I consider the normal activation scenario. If you lose a node for X minutes, and it then takes Y minutes for a set of containers to be brought back under load, the solution isn't really magic queueing in the network; it's tolerance at the higher levels of the stack that is necessary (a lost node should trigger a master/slave failover, a leader election, etc., rather than a really long wait). Realistically, activation is going to range from 1 to 120 seconds, with a large percentage under 20s and a slightly larger cluster around 30-40s for big Java stacks. Can we tolerate drops in those windows? Magical activation and idling only work when you can mask the end effect from the user. A 15-20s transaction is almost always going to result in user failure, so it really doesn't matter how long it takes after that.
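
As a rough way to reason about those windows, the sketch below models a client whose SYNs are dropped until the container is listening. It assumes plain exponential backoff with an initial retransmission timeout of 1 second and 6 retries; both values are assumptions that vary by kernel and sysctl settings, and the function names are made up for illustration.

```python
def syn_send_times(initial_rto=1.0, retries=6):
    """Times (seconds after the first SYN) at which the client sends each SYN.

    Assumes each wait is double the previous one, starting at initial_rto.
    """
    times = [0.0]
    wait = initial_rto
    for _ in range(retries):
        times.append(times[-1] + wait)
        wait *= 2
    return times


def connect_delay(activation_seconds, initial_rto=1.0, retries=6):
    """Time of the first SYN sent once the container is listening.

    Returns None if the client runs out of retries before activation finishes.
    """
    for t in syn_send_times(initial_rto, retries):
        if t >= activation_seconds:
            return t
    return None


for activation in (5, 20, 40, 120):
    print(activation, connect_delay(activation))
```

Under these assumptions the client waits noticeably longer than the activation itself, because the next SYN only goes out at the following backoff step, and a 120-second activation is never reached at all.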
> >>> Have you looked into environment-based socket passing with a shim like
> >>> socat? socat already supports specifying a socket using a file
> >>> descriptor, and it can forward to and from sockets in the container's
> >>> network namespace. It seems like it would be equivalent in
> >>> functionality and compatibility with SYN timeout and iptables but
> >>> without breaking the TCP spec or requiring global firewall changes at
> >>> container start-up time.
> >> A socat based approach has a few issues:
> >> 1) I am trying to avoid having to route packets through a user level
> >> program. Would rather keep it in kernel space and use IPTables.
> >> 2) Even though it is small, socat and other programs will add overhead to
> >> each gear.
> >> 3) Based on some tests we did a while ago, socat becomes unresponsive if
> >> there is heavy load. Rob can elaborate on this.
> > I haven't load-tested socat for anything, and I completely understand
> > the other reasons here.
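
For reference, environment-based socket passing of the kind discussed above could look roughly like the sketch below. It assumes the systemd convention (`LISTEN_PID` and `LISTEN_FDS` in the environment, inherited descriptors starting at 3); the thread does not say which mechanism a shim would actually use, and the helper names are hypothetical.

```python
import os
import socket

SD_LISTEN_FDS_START = 3  # first inherited descriptor in the systemd convention


def inherited_fds(environ=None, pid=None):
    """Return the descriptor numbers passed to this process, if any."""
    environ = os.environ if environ is None else environ
    pid = os.getpid() if pid is None else pid
    # The variables only apply to the process they were set for.
    if environ.get("LISTEN_PID") != str(pid):
        return []
    try:
        count = int(environ.get("LISTEN_FDS", "0"))
    except ValueError:
        return []
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + max(count, 0)))


def inherited_listeners(environ=None, pid=None):
    """Wrap each inherited descriptor in a socket object ready for accept().

    Raises OSError if a descriptor is not actually a socket.
    """
    return [socket.socket(fileno=fd) for fd in inherited_fds(environ, pid)]
```

A shim built this way would still relay traffic in user space, so it does not address the overhead and load concerns listed above.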
> > --
> > David Strauss
> > | david davidstrauss net
> > | +1 512 577 5827 [mobile]