
Re: Draft 1 of PEP005 - Highly available web applications - is available for review

On Jul 9, 2013, at 4:58 PM, Luke Meyer <lmeyer redhat com> wrote:

> This got a little long. Hopefully much of it can be dispensed with quickly as me just not having the whole picture...
> Terms
> •  "system proxy" Kind of a nebulous term until defined; I think it's important to identify that this is an HTTP proxy, not the port proxy. Perhaps "web proxy" or "host web proxy"

"System web proxy", "node web proxy", or "system HTTP proxy" would all work.  "Node" is problematic because it overlaps with the node.js websockets proxy.

> • "a specific node, which has a system proxy which chooses which gear receives the traffic." => "choose" sounds like it's imbued with intelligence rather than just plain configuration. How about just "a specific node, which has a web proxy routing traffic to gears."


> • "head gear" Thought we were standardizing on "primary gear"? Can have more than one primary but two heads are unusual...

"Head gear" is an old term.  The new term is "web load balancer gear".  "Primary" was discussed but is not part of the PEP, so I'd prefer to leave it out.

> •  Also, in addition / instead of mentioning OSE 1.2 here how about referring to Origin releases or just timeframe. Scaled apps have worked this way for the last year at least...


> Topology diagram / Failover of a proxy
> • With multiple web load balancer gears all active/active and actually receiving traffic, there will have to be some coordination between them to determine the total level of traffic being received (how??) - unless the router does this; in that case, it seems a lot simpler to expect the router to be full LB as well - what is the web load balancer in a gear really getting us in that scenario except to be a bottleneck and complication?

Rate of config change at scale.  The router is high volume with a low rate of config change.  Gear LBs can change more frequently.

The diagram could show active/passive, but that requires 2x overprovisioning.  It also means dedicated hardware and config, operating on a full system.  It's not fundamental to the diagram, for sure; there are many options.

> Routing table data model
> • Why is phpmyadmin called out in particular here? Don't quite understand why any plugin should get special treatment - seems dangerous (note, OSE doesn't even ship phpmyadmin or rockmongo). Unless we provide some kind of SPI endpoint in the cartridge manifest which can be used generically.

phpMyAdmin represents an embedded web endpoint.  We do.  The PEP can be generic.

> • Come to think of it, why do DB gears get special notice... I think I would include info about proxy ports providing any kind of non-http service. 

The list of examples is based on concrete scenarios.  The solution is intended to be generic to all potential protocols.

> • Re git URLs - does git need to be HA as well?

Eventually, yes.

> If so, does the router component also proxy git requests (somehow?? maybe a forwarding hook)?

No, this will probably be covered in the deploy PEP, but git *can* be kept in parallel on all gears.  A git push brings along all the info necessary to do a build, and HA git builds are a concrete feature.

> If not, how will the user switch to push to a different gear when the one cloned as origin is down?

Change their git remote / add a new one.

> OpenShift Router component
> "An OpenShift" => "An OpenShift installation"


> Multiple web load balancer gears per application
> "A load balancer gear should be stoppable, just like other gears. In general, the minimum availability level is to have two web load balancers" => OK, so does this only apply to stopping via the broker?

If you stop a gear explicitly then you are no longer available.

> Does the broker dub a new web load balancer gear before stopping the old one?


> If one load balancer gear of two is killed/crashes, what corrective action happens and what initiates it?

None / up to an implementer.  Detecting failed load balancers is possible from a router but is out of scope for this PEP (or left to a follow-up).
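
As a rough illustration only of what router-side detection could look like: the sketch below assumes the router keeps a short history of health probe results per web load balancer gear and treats a gear as failed once its last few probes have all failed. The function names, the gear ids, and the `max_failures=3` threshold are hypothetical and not part of the PEP; the minimum of two comes from the PEP text quoted above.

```python
# Hypothetical sketch: none of these names come from the PEP or OpenShift.

def is_healthy(probes, max_failures=3):
    """Return False only when the last `max_failures` probes all failed."""
    recent = probes[-max_failures:]
    return not (len(recent) == max_failures and not any(recent))


def healthy_balancers(history, max_failures=3):
    """history maps a gear id to its list of probe results (True = success)."""
    return sorted(
        gear for gear, probes in history.items()
        if is_healthy(probes, max_failures)
    )


def below_minimum(history, minimum=2, max_failures=3):
    """True when fewer than `minimum` web load balancer gears look healthy."""
    return len(healthy_balancers(history, max_failures)) < minimum


history = {
    "lb-1": [True, False, False, False],  # last three probes failed
    "lb-2": [True, True, False, True],    # one blip, still healthy
}
print(healthy_balancers(history))  # which gears still look healthy
print(below_minimum(history))      # whether the minimum of two is violated
```

What happens when `below_minimum` returns True (notify the broker, alert an operator, nothing at all) is exactly the part left to an implementer.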

> Seems to me only the router would know for sure that the minimum limit had been violated (since it sees the connection is failing), so it would need to ask the broker to install and activate a web load balancer on another gear?
> Network traffic a single, in-gear load balancer can handle - need to consider not only bandwidth, but connections. If routing to the web proxy, it has a node-wide cap on number of open connections.

Are you referring to ulimit or something else?  R=16 takes that into account today; I will call out ulimit.
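
For reference, a small sketch of how the open file descriptor limit bounds a proxy process's connection count. It assumes a Unix host and that each proxied connection holds two descriptors (one client socket, one backend socket). The `reserved=64` allowance for logs, listeners and the like is an arbitrary illustrative figure, and the helper name is hypothetical.

```python
import resource


def max_proxied_connections(soft_limit, fds_per_connection=2, reserved=64):
    """Estimate how many proxied connections fit under an fd limit.

    Returns None when the limit is unlimited.
    """
    if soft_limit == resource.RLIM_INFINITY:
        return None
    return max(0, (soft_limit - reserved) // fds_per_connection)


# Limits for the current process (what `ulimit -n` reports for the soft limit).
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("soft:", soft, "hard:", hard)
print("estimated connection cap:", max_proxied_connections(soft))
```

Note that this is a per-process limit; a node-wide cap would also depend on how many proxy processes run and on kernel-level settings.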

> If routing to proxy ports, the TCP proxy is still (I assume) going to cap open connections node-wide.


> Auto-scaling multiple load balancers
> It seems to me the specifics of how to coordinate multiple balancer gears is getting glossed over here and would be really complicated. Meanwhile those who are concerned about HA already have advanced LBs that would make anything we came up with feel like a regression. Why not just let the router handle balancing and deciding when to scale, and just deprecate web LB gears for HA apps? It would simplify a lot of things...

It's glossed over because high volume apps are typically going to be managed much more closely by an operations team in the short term.  Getting auto balancing right at the hundreds-of-gears level isn't a short term priority; it's mostly targeted at 2-10 gears.

Web LB gears are still relevant and are not related to scale up.  Also, not all advanced LBs can provide multi-app balancing with all the characteristics needed.  It is an option for an implementer to make the decision to route direct to gears.

It's certainly possible to drive scaling at the router, just by looking at the load balancer weights.

> Other notes
> This model handles no outage larger than one node. A related and relatively small change might be to modify node configuration and the gear allocation algorithm to specify "confidence zones" (terminology made up) - basically, if you have a gear in each of two confidence zones, the risk that both will be offline at the same time is considered acceptable. So for instance, maybe if you have two racks in the server room (or two rooms) on different power sources, each is in a different zone. Then the broker tries to ensure HA apps span confidence zones. Not strictly part of this PEP but a relatively minor extension of the existing gear dispersion algorithm that lets the user specify desired level of HA confidence.

Seems like part of geo.
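
To make the "confidence zones" idea concrete, here is a minimal sketch of a zone-aware placement step. It assumes each node is labelled with a zone and that the broker, when placing a new gear for an app, prefers the zone that currently holds the fewest of that app's gears. The node names, zone labels, and function names are hypothetical; the real gear dispersion algorithm is not shown in this thread.

```python
from collections import Counter


def pick_node(candidates, placed_zones):
    """Pick a node for the next gear of an app.

    candidates:   list of (node_name, zone) tuples that have capacity.
    placed_zones: zones of the app's gears that are already placed.
    Prefers the least-used zone; ties are broken by node name.
    """
    counts = Counter(placed_zones)
    return min(candidates, key=lambda c: (counts[c[1]], c[0]))


def spans_zones(placed_zones, required=2):
    """True when the app's gears cover at least `required` distinct zones."""
    return len(set(placed_zones)) >= required


candidates = [("node-a", "rack-1"), ("node-b", "rack-1"), ("node-c", "rack-2")]
placed = ["rack-1"]                      # the app already has one gear on rack-1
node, zone = pick_node(candidates, placed)
placed.append(zone)
print(node, zone, spans_zones(placed))
```

The `required` argument is where a user-specified level of HA confidence could plug in.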

> What about this requires adjustment in a multi-gear-size environment?

Nothing I can think of.  Your thoughts?

> ----- Original Message -----
> From: "Clayton Coleman" <ccoleman redhat com>
> To: dev lists openshift redhat com
> Sent: Wednesday, July 3, 2013 7:59:28 PM
> Subject: Draft 1 of PEP005 - Highly available web applications - is available for review
> As part of the long term evolution of the OpenShift platform, we intend to introduce support for highly available web applications.  A PEP has been drafted in the public repository to describe some of the design considerations and general intent:
>  https://github.com/openshift/openshift-pep/blob/master/openshift-pep-005.md
> This is an early draft, and there are topics that are deliberately considered out of scope of the document at the current time, or are insufficiently specified.  If you have questions, comments, or feedback, please reply to this thread.
> Thanks!
