
Re: Enabling Cluster Metrics



I'm running without persistent storage. 

After the pods have been up for more than 20 minutes, they change to the "Running" state and start working. Could the delay be due to insufficient memory?
I've watched their state for a day and the pods keep working. I will try larger scenarios when I can, and I will post if the error appears again.


Now I have this problem (I've had it before as well):

I've launched a pod specifying requests and limits for CPU and memory, but when I look at the pod overview pages, the metrics graphs show a value of 0 (for every pod, not only this one).


Heapster logs:

W0210 18:11:48.637940       1 reflector.go:224] /tmp/gopath/src/k8s.io/heapster/sources/pods.go:173: watch of *api.Pod ended with: 401: The event in requested index is outdated and cleared (the requested history has been cleared [2322574/2322059]) [2323573]



# curl -X GET https://hawkular-metrics.example.com/hawkular/metrics/status -k

{"MetricsService":"STARTED","Implementation-Version":"0.12.0.Final"....


Here curl doesn't return a JSON object:

# curl -H "Authorization: Bearer XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX" \
       -H "Hawkular-tenant: test" \
       -X GET https://hawkular-metrics.example.com/hawkular/metrics/metrics -k | python -m json.tool

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    68  100    68    0     0    110      0 --:--:-- --:--:-- --:--:--   110

No JSON object could be decoded
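
One way to narrow this down is to look at the 68 bytes the server actually returned instead of piping them straight into json.tool. The following is a minimal Python sketch, not part of the original exchange. The function name decode_or_report and the non-JSON sample body are hypothetical, since the real response body is not shown in this message.

```python
import json


def decode_or_report(body):
    """Return ("json", obj) if body parses as JSON, otherwise ("raw", body).

    Keeping the raw text makes it possible to read whatever the server
    sent back (for example an error message) when decoding fails.
    """
    try:
        return ("json", json.loads(body))
    except ValueError:
        # json.loads raises ValueError (or a subclass) on invalid input.
        return ("raw", body)


# A status-style body decodes normally.
kind, value = decode_or_report('{"MetricsService": "STARTED"}')

# A hypothetical non-JSON body is returned unchanged for inspection.
raw_kind, raw_value = decode_or_report("not a json document")
```

Saving the curl output to a file (curl's -o option) and passing its contents to this function would show whether the body is an error message rather than a JSON document.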


The problem doesn't seem to be caused by certificate issues: the node IP appears correctly in the certificates.




On Wed, Feb 10, 2016 at 17:56, Clayton Coleman <ccoleman redhat com> wrote:
I don't know what unconfigured table means (beyond maybe your tables need to be recreated because you have an old version) but I bet Matt does.

On Feb 10, 2016, at 10:50 AM, Alejandro Nieto Boza <ale90nb gmail com> wrote:

Thanks, the OpenShift DNS wasn't running correctly. That error no longer appears, but...

Now I have another error (one I have already seen in other scenarios).

This is the state of my metrics pods:

# oc get pods
NAME                         READY     STATUS      RESTARTS   AGE
hawkular-cassandra-1-j09f6   1/1       Running     0          10m
hawkular-metrics-xpa33       0/1       Error       1          10m
heapster-42vyz               0/1       Error       2          10m
metrics-deployer-e5e3v       0/1       Completed   0          12m


# oc get pods
NAME                         READY     STATUS             RESTARTS   AGE
hawkular-cassandra-1-j09f6   1/1       Running            0          12m
hawkular-metrics-xpa33       0/1       Completed          2          12m
heapster-42vyz               0/1       CrashLoopBackOff   4          12m
metrics-deployer-e5e3v       0/1       Completed          0          15m


The hawkular-metrics pod keeps switching between the Completed and Error states (?)
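
To make this kind of flapping easier to spot, two `oc get pods` snapshots can be compared programmatically. The following is a rough Python sketch under the assumption that the output keeps the column layout shown above. The helper names pod_statuses and status_changes are hypothetical and not part of any OpenShift tooling.

```python
def pod_statuses(oc_get_pods_output):
    """Map pod name -> STATUS from the text printed by `oc get pods`."""
    lines = [l for l in oc_get_pods_output.splitlines() if l.strip()]
    header = lines[0].split()
    name_col = header.index("NAME")
    status_col = header.index("STATUS")
    statuses = {}
    for line in lines[1:]:
        fields = line.split()
        statuses[fields[name_col]] = fields[status_col]
    return statuses


def status_changes(before, after):
    """Return {pod: (old_status, new_status)} for pods whose status changed."""
    old, new = pod_statuses(before), pod_statuses(after)
    return {
        name: (old[name], new[name])
        for name in old
        if name in new and old[name] != new[name]
    }


# The two snapshots quoted in this message.
BEFORE = """\
NAME                         READY     STATUS      RESTARTS   AGE
hawkular-cassandra-1-j09f6   1/1       Running     0          10m
hawkular-metrics-xpa33       0/1       Error       1          10m
heapster-42vyz               0/1       Error       2          10m
metrics-deployer-e5e3v       0/1       Completed   0          12m
"""

AFTER = """\
NAME                         READY     STATUS             RESTARTS   AGE
hawkular-cassandra-1-j09f6   1/1       Running            0          12m
hawkular-metrics-xpa33       0/1       Completed          2          12m
heapster-42vyz               0/1       CrashLoopBackOff   4          12m
metrics-deployer-e5e3v       0/1       Completed          0          15m
"""

changes = status_changes(BEFORE, AFTER)
```

For these snapshots only hawkular-metrics-xpa33 and heapster-42vyz are reported as changed, which matches what is visible by eye.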


These are some logs from the hawkular-metrics pod:

 # oc logs hawkular-metrics-xpa33
15:22:08,104 ERROR [org.jboss.msc.service.fail] (MSC service thread 1-1) MSC000001: Failed to start service jboss.deployment.unit."hawkular-metrics-api-jaxrs.war": org.jboss.msc.service.StartException in service jboss.deployment.unit."hawkular-metrics-api-jaxrs.war": Failed to start service
        at org.jboss.msc.service.ServiceControllerImpl$StartTask.run(ServiceControllerImpl.java:1904)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalStateException: Container is down
...............

15:22:08,211 ERROR [org.jboss.msc.service.fail] (MSC service thread 1-1) MSC000001: Failed to start service jboss.serverManagement.controller.management.http: org.jboss.msc.service.StartException in service jboss.serverManagement.controller.management.http: Failed to start service
        at org.jboss.msc.service.ServiceControllerImpl$StartTask.run(ServiceControllerImpl.java:1904)
...............


15:29:35,416 FATAL [org.hawkular.metrics.api.jaxrs.MetricsServiceLifecycle] (metricsservice-lifecycle-thread) HAWKMETRICS200006: An error occurred trying to connect to the Cassandra cluster: com.datastax.driver.core.exceptions.InvalidQueryException: unconfigured table retentions_idx
        at com.datastax.driver.core.exceptions.InvalidQueryException.copy(InvalidQueryException.java:35)
.................



And obviously heapster cannot connect to hawkular-metrics:

# oc logs heapster-42vyz
Could not connect to https://hawkular-metrics:443/hawkular/metrics/status. Curl exit code: 7. Status Code 000
'https://hawkular-metrics:443/hawkular/metrics/status' is not accessible [HTTP status code: 000. Curl exit code 7]. Retrying.


hawkular-cassandra logs don't show errors.



2016-02-10 14:50 GMT+01:00 Clayton Coleman <ccoleman redhat com>:
Can you try from one of your nodes to reach the nameserver directly and via the proxy?

    dig @<your master ip> kubernetes.default.svc.cluster.local
    dig @172.30.0.1 kubernetes.default.svc.cluster.local



On Feb 10, 2016, at 8:40 AM, Alejandro Nieto Boza <ale90nb gmail com> wrote:

It's like you said.

Test logs:
# oc logs test
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (6) Could not resolve host: kubernetes; Unknown error




Test2 logs: 
# oc logs test2
nameserver "172.30.0.1"
nameserver "another-ip"




# oc get svc/kubernetes -n default
NAME         CLUSTER_IP   EXTERNAL_IP   PORT(S)                 SELECTOR   AGE
kubernetes   "172.30.0.1"   <none>        443/TCP,53/UDP,53/TCP   <none>     92d
search test.svc.cluster.local svc.cluster.local cluster.local test.es
options ndots:5






2016-02-10 14:01 GMT+01:00 Clayton Coleman <ccoleman redhat com>:
That seems to indicate that inside the deployment container DNS is not working.  Can you do the following to check:

    oc run --image centos:7 test --generator=run-pod/v1 --restart=Never -- curl https://kubernetes
    oc logs test

And then

    oc run --image centos:7 test2 --generator=run-pod/v1 --restart=Never -- cat /etc/resolv.conf
    oc logs test2

The latter should have a nameserver pointing to the master by its service IP - the command:

    oc get svc/kubernetes -n default

Should show that same IP
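
The comparison described above can also be sketched in Python. This is only an illustration under stated assumptions: the helper names nameservers and points_at_service are hypothetical, the sample text imitates a typical /etc/resolv.conf, and surrounding double quotes are stripped in case the pasted output contains them.

```python
def nameservers(resolv_conf_text):
    """Return the nameserver addresses listed in resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            # Strip quotes that may surround the address in pasted output.
            servers.append(fields[1].strip('"'))
    return servers


def points_at_service(resolv_conf_text, service_ip):
    """True if the service IP is one of the configured nameservers."""
    return service_ip in nameservers(resolv_conf_text)


# Sample text in the shape printed by `cat /etc/resolv.conf` in a pod.
SAMPLE = """\
nameserver 172.30.0.1
search test.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
"""

matches = points_at_service(SAMPLE, "172.30.0.1")
```

The second argument would be the CLUSTER_IP column from `oc get svc/kubernetes -n default`.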

On Feb 10, 2016, at 7:39 AM, Alejandro Nieto Boza <ale90nb gmail com> wrote:

Hi,

I've been following these steps to deploy metrics:


When I run the following command:


oc process -f metrics.yaml -v \
HAWKULAR_METRICS_HOSTNAME=hawkular-metrics.example.com,USE_PERSISTENT_STORAGE=false \
| oc create -f -


I get the following error:

Creating the Cassandra Certificate Secrets configuration json file
+++ base64
++++ echo hawkular-cassandra
+++ base64 -w 0 /etc/deploy/_output/hawkular-cassandra.truststore
+++ base64
++++ echo RjR--747mUzmTS-
+++ base64 -w 0 /etc/deploy/_output/hawkular-cassandra.pem
++ echo
++ echo 'Creating the Cassandra Certificate Secrets configuration json file'
++ cat
+++ base64 -w 0 /etc/deploy/_output/hawkular-cassandra.cert
+++ base64 -w 0 /etc/deploy/_output/hawkular-cassandra-ca.cert
Creating Hawkular Metrics & Cassandra Secrets
++ echo 'Creating Hawkular Metrics & Cassandra Secrets'
++ oc create -f /etc/deploy/_output/hawkular-metrics-secrets.json
unable to connect to a server to handle "secrets": Get https://kubernetes.default.svc:443/api: dial tcp: lookup kubernetes.default.svc: no such host




# oc get pods
NAME                     READY     STATUS    RESTARTS   AGE
metrics-deployer-7gcpd   0/1       Error     0          39m


How can I find out whether my Kubernetes master URL is https://kubernetes.default.svc:443 or something else?

My OpenShift installation isn't an upgrade.
_______________________________________________
users mailing list
users lists openshift redhat com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


