
Re: Enabling Cluster Metrics



----- Original Message -----
> From: "Clayton Coleman" <ccoleman redhat com>
> To: "Alejandro Nieto Boza" <ale90nb gmail com>, mwringe redhat com
> Cc: "users" <users lists openshift redhat com>
> Sent: Wednesday, February 10, 2016 11:56:14 AM
> Subject: Re: Enabling Cluster Metrics
> 
> I don't know what unconfigured table means (beyond maybe your tables need
> to be recreated because you have an old version) but I bet Matt does.
> 
> On Feb 10, 2016, at 10:50 AM, Alejandro Nieto Boza <ale90nb gmail com>
> wrote:
> 
> Thanks, the OpenShift DNS wasn't running correctly. That error no longer
> appears, but...
> 
> Now I have a different error (one I have already seen in other
> scenarios).
> 
> This is the state of my metrics pods:
> 
> # oc get pods
> NAME                         READY     STATUS      RESTARTS   AGE
> hawkular-cassandra-1-j09f6   1/1       Running     0          10m
> hawkular-metrics-xpa33       0/1       Error       1          10m
> heapster-42vyz               0/1       Error       2          10m
> metrics-deployer-e5e3v       0/1       Completed   0          12m
> 
> 
> # oc get pods
> NAME                         READY     STATUS             RESTARTS   AGE
> hawkular-cassandra-1-j09f6   1/1       Running            0          12m
> hawkular-metrics-xpa33       0/1       Completed          2          12m
> heapster-42vyz               0/1       CrashLoopBackOff   4          12m
> metrics-deployer-e5e3v       0/1       Completed          0          15m
> 
> 
> The hawkular-metrics pod keeps switching between the Completed and Error states (?)
> 
> 
> These are some logs of hawkular-metrics pod:
> 
>  # oc logs hawkular-metrics-xpa33
> 15:22:08,104 ERROR [org.jboss.msc.service.fail] (MSC service thread 1-1)
> MSC000001: Failed to start service
> jboss.deployment.unit."hawkular-metrics-api-jaxrs.war":
> org.jboss.msc.service.StartException in service
> jboss.deployment.unit."hawkular-metrics-api-jaxrs.war": Failed to start
> service
>         at
> org.jboss.msc.service.ServiceControllerImpl$StartTask.run(ServiceControllerImpl.java:1904)
>         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.IllegalStateException: Container is down
> ...............
> 
> 15:22:08,211 ERROR [org.jboss.msc.service.fail] (MSC service thread 1-1)
> MSC000001: Failed to start service
> jboss.serverManagement.controller.management.http:
> org.jboss.msc.service.StartException in service
> jboss.serverManagement.controller.management.http: Failed to start service
>         at
> org.jboss.msc.service.ServiceControllerImpl$StartTask.run(ServiceControllerImpl.java:1904)
> ...............
> 
> 
> 15:29:35,416 FATAL [org.hawkular.metrics.api.jaxrs.MetricsServiceLifecycle]
> (metricsservice-lifecycle-thread) HAWKMETRICS200006: An error occurred
> trying to connect to the Cassandra cluster:
> com.datastax.driver.core.exceptions.InvalidQueryException: unconfigured
> table retentions_idx
>         at
> com.datastax.driver.core.exceptions.InvalidQueryException.copy(InvalidQueryException.java:35)
> .................

I have not seen this exact issue before, but it resembles something I have seen: if you deploy origin-metrics and then switch to the OSE metrics images, a similar error can occur, because the version of Hawkular Metrics in origin-metrics uses a different schema than the one in the OSE images. Are you running this without persistent storage? If you are using persistent storage, was it previously used with a different version of Hawkular Metrics?
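
One rough way to answer the storage question is to look at the volumes on the Cassandra pod. The sketch below is only an illustration: storage_kind is a hypothetical helper (not part of oc), and it assumes the pod definition uses the standard persistentVolumeClaim or emptyDir volume fields.

```shell
# Sketch only: classify the storage behind a pod from its YAML definition.
# storage_kind is a hypothetical helper name, not an oc subcommand.
storage_kind() {
  # Read the pod definition (as printed by `oc get pod <name> -o yaml`) from stdin.
  pod_yaml=$(cat)
  case "$pod_yaml" in
    *persistentVolumeClaim:*) echo "persistent" ;;  # backed by a PVC, data survives redeploys
    *emptyDir:*)              echo "ephemeral" ;;   # data is lost when the pod goes away
    *)                        echo "unknown" ;;
  esac
}

# Usage (pod name taken from the `oc get pods` output quoted above):
#   oc get pod hawkular-cassandra-1-j09f6 -o yaml | storage_kind
#   oc get pvc    # lists any persistent volume claims in the project
```

If this reports "persistent", it is worth checking whether that volume was written by an earlier metrics deployment.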


> 
> 
> And obviously heapster cannot connect to hawkular-metrics:
> 
> # oc logs heapster-42vyz
> Could not connect to https://hawkular-metrics:443/hawkular/metrics/status.
> Curl exit code: 7. Status Code 000
> 'https://hawkular-metrics:443/hawkular/metrics/status' is not accessible
> [HTTP status code: 000. Curl exit code 7]. Retrying.
> 
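
For reference, curl exit code 7 means the connection to the host failed, which fits hawkular-metrics being down; a DNS failure would show up as exit code 6, as in the earlier test pod. A minimal sketch for reading these codes, where explain_curl_exit is a hypothetical helper name and only the two codes seen in this thread are covered:

```shell
# Sketch only: translate the curl exit codes seen in this thread.
# explain_curl_exit is a hypothetical helper name.
explain_curl_exit() {
  case "$1" in
    0) echo "ok" ;;
    6) echo "could not resolve host" ;;
    7) echo "failed to connect to host" ;;
    *) echo "other curl error" ;;   # see the EXIT CODES section of the curl man page
  esac
}

# Usage, run from somewhere that can reach the service:
#   curl -k -s -o /dev/null https://hawkular-metrics:443/hawkular/metrics/status
#   explain_curl_exit $?
```
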
> 
> hawkular-cassandra logs don't show errors.
> 
> 
> 
> 2016-02-10 14:50 GMT+01:00 Clayton Coleman <ccoleman redhat com>:
> 
> > Can you try from one of your nodes to reach the nameserver directly and
> > via the proxy?
> >
> >     dig @<your master ip> kubernetes.default.svc.cluster.local
> >     dig @172.30.0.1 kubernetes.default.svc.cluster.local
> >
> >
> >
> > On Feb 10, 2016, at 8:40 AM, Alejandro Nieto Boza <ale90nb gmail com>
> > wrote:
> >
> > It's like you said.
> >
> > Test logs:
> > # oc logs test
> >   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
> >                                  Dload  Upload   Total   Spent    Left  Speed
> >   0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (6) Could not resolve host: kubernetes; Unknown error
> >
> >
> >
> >
> > Test2 logs:
> > # oc logs test2
> > nameserver "172.30.0.1"
> > nameserver "another-ip"
> >
> >
> >
> >
> > # oc get svc/kubernetes -n default
> > NAME         CLUSTER_IP     EXTERNAL_IP   PORT(S)                 SELECTOR   AGE
> > kubernetes   "172.30.0.1"   <none>        443/TCP,53/UDP,53/TCP   <none>     92d
> > search test.svc.cluster.local svc.cluster.local cluster.local test.es
> > options ndots:5
> >
> >
> >
> >
> >
> >
> > 2016-02-10 14:01 GMT+01:00 Clayton Coleman <ccoleman redhat com>:
> >
> >> That seems to indicate that inside the deployment container DNS is not
> >> working.  Can you do the following to check:
> >>
> >>     oc run --image centos:7 test --generator=run-pod/v1 --restart=Never -- curl https://kubernetes
> >>     oc logs test
> >>
> >> And then
> >>
> >>     oc run --image centos:7 test2 --generator=run-pod/v1 --restart=Never -- cat /etc/resolv.conf
> >>     oc logs test2
> >>
> >> The latter should have a nameserver pointing to the master by its service
> >> IP - the command:
> >>
> >>     oc get svc/kubernetes -n default
> >>
> >> Should show that same IP
> >>
> >> On Feb 10, 2016, at 7:39 AM, Alejandro Nieto Boza <ale90nb gmail com>
> >> wrote:
> >>
> >> Hi,
> >>
> >> I've been following these steps to deploy metrics:
> >>
> >> https://docs.openshift.org/latest/install_config/cluster_metrics.html
> >>
> >> When I run the following command:
> >>
> >>
> >> oc process -f metrics.yaml -v \
> >> HAWKULAR_METRICS_HOSTNAME=hawkular-metrics.example.com,USE_PERSISTENT_STORAGE=false \
> >> | oc create -f -
> >>
> >>
> >> I get the following error:
> >>
> >> Creating the Cassandra Certificate Secrets configuration json file
> >> +++ base64
> >> ++++ echo hawkular-cassandra
> >> +++ base64 -w 0 /etc/deploy/_output/hawkular-cassandra.truststore
> >> +++ base64
> >> ++++ echo RjR--747mUzmTS-
> >> +++ base64 -w 0 /etc/deploy/_output/hawkular-cassandra.pem
> >> ++ echo
> >> ++ echo 'Creating the Cassandra Certificate Secrets configuration json file'
> >> ++ cat
> >> +++ base64 -w 0 /etc/deploy/_output/hawkular-cassandra.cert
> >> +++ base64 -w 0 /etc/deploy/_output/hawkular-cassandra-ca.cert
> >> Creating Hawkular Metrics & Cassandra Secrets
> >> ++ echo 'Creating Hawkular Metrics & Cassandra Secrets'
> >> ++ oc create -f /etc/deploy/_output/hawkular-metrics-secrets.json
> >> unable to connect to a server to handle "secrets": Get
> >> https://kubernetes.default.svc:443/api: dial tcp: lookup
> >> kubernetes.default.svc: no such host
> >>
> >>
> >>
> >>
> >> # oc get pods
> >> NAME                     READY     STATUS    RESTARTS   AGE
> >> metrics-deployer-7gcpd   0/1       Error     0          39m
> >>
> >>
> >> How can I know if my kubernetes master URL is
> >> https://kubernetes.default.svc:443 or is another URL?
> >>
> >> My OpenShift installation isn't an upgrade.
> >>
> >>
> >>
> >
> 

