
Re: Scale up and kaboom



Thanks - there is an open issue for Kube, and if we can verify this is the same problem, we will include the fix in 1.1.1.

On Jan 8, 2016, at 6:58 PM, Diego Castro <diego castro getupcloud com> wrote:

I can confirm that if an all-in-one server runs out of disk, things go crazy (the console acts weird when scaling pods).
It happens if you deploy the metrics system (Cassandra uses a lot of disk space).
I'll try to reproduce it one more time and send the logs.


Diego Castro
The CloudFather
(11) 3230.5927
+54 (911) 2159.1779
gtalk: diego castro getupcloud com



2016-01-08 19:27 GMT-03:00 Clayton Coleman <ccoleman redhat com>:
Ok, can you file a bug and describe the scenario?  It's possible this is a UI bug or a lower level problem.

On Jan 7, 2016, at 7:44 AM, John Skarbek <jskarbek rallydev com> wrote:

Clayton,

I meant pods.  When I hit the up button, a pod tried to be deployed to a node without any room left, and the count of failed pods started to increase dramatically.
[screencast attachment: 2016-01-07 07-35-37.gif]
And it continues to increase until I go back and stop it via the command line (deleting the failed pods and then changing the scale count).  Hitting the down button doesn't stop the process.
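For anyone hitting the same thing, the manual stop can be sketched roughly like this (a hedged sketch: it assumes the replication controller is named `logging-fluentd-1` and that the STATUS column is field 3 of `oc get pods --no-headers`, matching the snippet further down):

```shell
# Hypothetical cleanup sketch -- the RC name and column positions are assumptions.
# failed_pods: print the names of pods whose STATUS reads OutOfDisk.
failed_pods() {
  awk '$3 == "OutOfDisk" {print $1}'
}

# Against a live cluster (guarded so the sketch is copy-paste safe):
if command -v oc >/dev/null 2>&1; then
  # 1. stop the controller from replacing the pods we delete
  oc scale rc/logging-fluentd-1 --replicas=0
  # 2. delete the accumulated OutOfDisk pods
  oc get pods --no-headers | failed_pods | xargs -r oc delete pod
fi
```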

This same issue appears to occur via the command line as well.  Here's a snippet after trying to scale the pods:
```
logging-fluentd-1-tnjv7   0/1       Pending            0          0s
logging-fluentd-1-uj81h   0/1       OutOfDisk          0          6s
logging-fluentd-1-ukw2q   0/1       OutOfDisk          0          9s
logging-fluentd-1-ullqn   0/1       OutOfDisk          0          1s
logging-fluentd-1-v19ka   0/1       OutOfDisk          0          8s
logging-fluentd-1-w4zre   0/1       OutOfDisk          0          5s
logging-fluentd-1-wkzco   0/1       OutOfDisk          0          12s
logging-fluentd-1-wvces   0/1       OutOfDisk          0          12s
logging-fluentd-1-x5h0i   0/1       OutOfDisk          0          1s
logging-fluentd-1-xl4hz   0/1       OutOfDisk          0          12s
logging-fluentd-1-xqhul   0/1       OutOfDisk          0          10s
logging-fluentd-1-ykpku   0/1       OutOfDisk          0          13s
logging-fluentd-1-z2map   0/1       OutOfDisk          0          7s
[root@master-001 ~]# oc get pods | wc -l
116
[root@master-001 ~]# oc get pods | wc -l
119
```
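A quick way to watch the pile-up is to tally pods by status instead of just counting lines (a sketch; it only assumes the five-column `oc get pods` layout shown above, with STATUS in field 3):

```shell
# count_by_status: tally how many pods are in each status (field 3 of the listing).
count_by_status() {
  awk '{n[$3]++} END {for (s in n) print s, n[s]}' | sort
}

# Against a live cluster: oc get pods --no-headers | count_by_status
```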

On Wed, Jan 6, 2016 at 5:38 PM, Clayton Coleman <ccoleman redhat com> wrote:
When you say manually scale up a node, what do you mean?

On Jan 6, 2016, at 5:36 PM, John Skarbek <jskarbek rallydev com> wrote:

Has anyone seen a slight issue, induced by an unknown administrator, where one decides to manually scale up a pod, OpenShift tries to place it on a node that doesn't have any disk space left, and OpenShift proceeds to get stuck in a loop trying to deploy to it?

I did this, and OpenShift created 500 nodes in one minute, and the counter is still climbing.  All of these are failed pods on the same node that ran out of disk space.

--

John Skarbek

Infrastructure Engineer

CA Technologies | 1101 Haynes St, Suite 105 | Raleigh, NC 27604

Office: +1 720 921 8126 | john skarbek ca com

Rally is now CA Technologies

_______________________________________________
users mailing list
users lists openshift redhat com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users






