Re: Metrics deployment



----- Original Message -----
> From: "Srinivas Naga Kotaru (skotaru)" <skotaru cisco com>
> To: "Matt Wringe" <mwringe redhat com>
> Cc: users lists openshift redhat com
> Sent: Monday, June 13, 2016 7:26:06 PM
> Subject: Re: Metrics deployment
> 
> Matt
> 
> PV issue resolved. I was able to see the PV successfully bound, and the Cassandra
> container is running. However, it seems the puzzle is not fully solved yet.

Are you sure the OpenShift DNS server is running?

If you are running OSE 3.1, can you please follow https://access.redhat.com/solutions/2329131 and check whether the Hawkular Metrics logs now show errors? Essentially, just run `oc exec hawkular-metrics-xxxxx cat /opt/eap/standalone/log/server.log`.
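
For reference, the `Curl exit code: 6` in the Heapster output is curl's "could not resolve host" error, which is why I am asking about DNS. Below is a minimal Python sketch that pulls the exit codes out of such log lines and maps them to their meaning in the curl man page. The function name `explain_curl_failures` and the small lookup table are only an illustration on my part; they are not part of Heapster or OpenShift.

```python
import re

# A few curl exit codes, as described in the curl man page.
CURL_EXIT_CODES = {
    6: "could not resolve host (DNS lookup failed)",
    7: "failed to connect to host",
    28: "operation timed out",
    35: "SSL/TLS connect error",
    60: "peer certificate could not be verified",
}

# Matches both "Curl exit code: 6" and "Curl exit code 6" as printed by Heapster.
EXIT_CODE_RE = re.compile(r"Curl exit code:?\s*(\d+)")


def explain_curl_failures(log_text):
    """Return a sorted list of (exit_code, meaning) pairs found in log_text."""
    codes = {int(m.group(1)) for m in EXIT_CODE_RE.finditer(log_text)}
    return [
        (code, CURL_EXIT_CODES.get(code, "unknown; see the curl man page"))
        for code in sorted(codes)
    ]


if __name__ == "__main__":
    sample = (
        "Could not connect to https://hawkular-metrics:443/hawkular/metrics/status.\n"
        "Curl exit code: 6. Status Code 000\n"
        "[HTTP status code: 000. Curl exit code 6]. Retrying.\n"
    )
    for code, meaning in explain_curl_failures(sample):
        print(f"curl exit code {code}: {meaning}")
```

A code of 6 points at name resolution for `hawkular-metrics` rather than at the Hawkular Metrics service itself, so the DNS setup is the first thing to check.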

> 
> I can see that the other container (heapster) is not coming up, and I am seeing the errors below:
> 
> [skotaru l3imas-id2-01 metrics]$ oc logs -f heapster-fnkdc
> Endpoint Check in effect. Checking
> https://hawkular-metrics:443/hawkular/metrics/status
> Could not connect to https://hawkular-metrics:443/hawkular/metrics/status.
> Curl exit code: 6. Status Code 000
> 'https://hawkular-metrics:443/hawkular/metrics/status' is not accessible
> [HTTP status code: 000. Curl exit code 6]. Retrying.
> Could not connect to https://hawkular-metrics:443/hawkular/metrics/status.
> Curl exit code: 6. Status Code 000
> 'https://hawkular-metrics:443/hawkular/metrics/status' is not accessible
> [HTTP status code: 000. Curl exit code 6]. Retrying.
> 
> 
> # oc get pv
> pv-5gb-0011   5Gi        RWO           Bound       openshift-infra/metrics-cassandra-1             22m
> 
> 
> $ oc get pods
> NAME                         READY     STATUS      RESTARTS   AGE
> hawkular-cassandra-1-2pzd7   1/1       Running     0          20m
> hawkular-metrics-mf5qf       0/1       Running     7          20m
> heapster-fnkdc               0/1       Error       6          20m
> metrics-deployer-cvep0       0/1       Completed   0          21m
> 
> # oc logs -f hawkular-metrics-mf5qf
> 
> 19:20:00,819 INFO  [org.xnio] (MSC service thread 1-2) XNIO Version
> 3.0.14.GA-redhat-1
> 19:20:00,831 INFO  [org.jboss.as.server] (Controller Boot Thread) JBAS015888:
> Creating http management service using socket-binding (management-http)
> 19:20:00,834 INFO  [org.xnio.nio] (MSC service thread 1-2) XNIO NIO
> Implementation Version 3.0.14.GA-redhat-1
> 19:20:00,844 INFO  [org.jboss.remoting] (MSC service thread 1-2) JBoss
> Remoting version 3.3.5.Final-redhat-1
> 
> $ oc logs -f heapster-fnkdc
> Endpoint Check in effect. Checking
> https://hawkular-metrics:443/hawkular/metrics/status
> Could not connect to https://hawkular-metrics:443/hawkular/metrics/status.
> Curl exit code: 6. Status Code 000
> 'https://hawkular-metrics:443/hawkular/metrics/status' is not accessible
> [HTTP status code: 000. Curl exit code 6]. Retrying.
> Could not connect to https://hawkular-metrics:443/hawkular/metrics/status.
> Curl exit code: 6. Status Code 000
> 'https://hawkular-metrics:443/hawkular/metrics/status' is not accessible
> [HTTP status code: 000. Curl exit code 6]. Retrying.
> Could not connect to https://hawkular-metrics:443/hawkular/metrics/status.
> Curl exit code: 6. Status Code 000
> 
> $ oc logs -f hawkular-cassandra-1-2pzd7
> INFO  23:00:24 Starting listening for CQL clients on
> hawkular-cassandra-1-2pzd7/10.1.6.2:9042...
> INFO  23:00:24 Binding thrift service to
> hawkular-cassandra-1-2pzd7/10.1.6.2:9160
> INFO  23:00:24 enabling encrypted thrift connections between client and
> server
> INFO  23:00:24 Listening for thrift clients...
> INFO  23:00:26 Created default superuser role 'cassandra'
> 
> # oc get svc
> NAME                       CLUSTER-IP       EXTERNAL-IP   PORT(S)                               AGE
> hawkular-cassandra         172.30.2.13      <none>        9042/TCP,9160/TCP,7000/TCP,7001/TCP   25m
> hawkular-cassandra-nodes   None             <none>        9042/TCP,9160/TCP,7000/TCP,7001/TCP   25m
> hawkular-metrics           172.30.117.176   <none>        443/TCP                               25m
> heapster                   172.30.107.135   <none>        80/TCP                                25m
> 
> #curl -I 172.30.117.176:443//hawkular/metrics/status
> 
> HTTP/1.1 504 Gateway Timeout
> Mime-Version: 1.0
> Date: Mon, 13 Jun 2016 23:25:47 GMT
> Content-Type: text/html
> Connection: keep-alive
> Proxy-Connection: keep-alive
> Content-Length: 1572
> 
> --
> Srinivas Kotaru
> 
> On 6/13/16, 2:33 PM, "Srinivas Naga Kotaru (skotaru)" <skotaru cisco com>
> wrote:
> 
> >Matt
> >
> >That is a good catch. I ran without USE_PERSISTENT_STORAGE=false and it is working.
> >
> >I adjusted the PV to 5Gi and reran. I will update you on progress.
> >
> >Thank you for your help so far.
> >
> >--
> >Srinivas Kotaru
> >
> >On 6/13/16, 2:27 PM, "Matt Wringe" <mwringe redhat com> wrote:
> >
> >>
> >>
> >>----- Original Message -----
> >>> From: "Srinivas Naga Kotaru (skotaru)" <skotaru cisco com>
> >>> To: "Matt Wringe" <mwringe redhat com>
> >>> Cc: users lists openshift redhat com
> >>> Sent: Monday, June 13, 2016 5:21:01 PM
> >>> Subject: Re: Metrics deployment
> >>> 
> >>> Oh ok
> >>> 
> >>> I am using a PV for metrics
> >>> 
> >>> description: "The persistent volume size for each of the Cassandra nodes"
> >>>   name: CASSANDRA_PV_SIZE
> >>>   value: "10Gi"
> >>> 
> >>> oc get pv
> >>> NAME          CAPACITY   ACCESSMODES   STATUS      CLAIM            REASON    AGE
> >>> pv-1gb-001    1Gi        RWO           Available                              4d
> >>> pv-1gb-002    1Gi        RWO           Available                              4d
> >>> pv-1gb-003    1Gi        RWO           Available                              4d
> >>> pv-1gb-004    1Gi        RWO           Bound       thlatt/mongodb             4d
> >>> pv-1gb-005    1Gi        RWO           Available                              4d
> >>> pv-2gb-0010   2Gi        RWO           Available                              4d
> >>> pv-2gb-006    2Gi        RWO           Available                              4d
> >>> pv-2gb-007    2Gi        RWO           Available                              4d
> >>> pv-2gb-008    2Gi        RWO           Available                              4d
> >>> pv-2gb-009    2Gi        RWO           Available                              4d
> >>> pv-5gb-0011   5Gi        RWO           Available                              4d
> >>> pv-5gb-0012   5Gi        RWO           Available                              4d
> >>> pv-5gb-0013   5Gi        RWO           Available                              4d
> >>> pv-5gb-0014   5Gi        RWO           Available                              4d
> >>> pv-5gb-0015   5Gi        RWO           Available                              4d
> >>> 
> >>> I am running the command below (HOSTNAME, MASTER_API and PV info are
> >>> hardcoded, so I am not passing any parameters):
> >>> 
> >>> $ oc new-app -f metrics-deployer.yaml
> >>> 
> >>
> >>I suspect that Cassandra is blocked because it is waiting for a 10Gi PV
> >>to become available, and none of the PVs listed above is big enough.
> >>
> >>> 
> >>> --
> >>> Srinivas Kotaru
> >>> 
> >>> On 6/13/16, 2:12 PM, "Matt Wringe" <mwringe redhat com> wrote:
> >>> 
> >>> >----- Original Message -----
> >>> >> From: "Srinivas Naga Kotaru (skotaru)" <skotaru cisco com>
> >>> >> To: "Matt Wringe" <mwringe redhat com>
> >>> >> Cc: users lists openshift redhat com
> >>> >> Sent: Monday, June 13, 2016 4:55:55 PM
> >>> >> Subject: Re: Metrics deployment
> >>> >> 
> >>> >> Matt
> >>> >> 
> >>> >> Thanks for looking into it. I reran the setup, but had the same issue.
> >>> >> 
> >>> >> # oc get pods
> >>> >> NAME                         READY     STATUS              RESTARTS   AGE
> >>> >> hawkular-cassandra-1-y2egy   0/1       ContainerCreating   0          5m
> >>> >> hawkular-metrics-4b16f       0/1       Running             1          4m
> >>> >> heapster-x2gj2               0/1       Running             2          4m
> >>> >> metrics-deployer-9v7vc       0/1       Completed           0          6m
> >>> >> 
> >>> >> $ oc logs -f hawkular-cassandra-1-y2egy
> >>> >> Error from server: container "hawkular-cassandra-1" in pod
> >>> >> "hawkular-cassandra-1-y2egy" is waiting to start: ContainerCreating
> >>> >
> >>> >Ok, so it looks like something is blocking the Cassandra pod from
> >>> >starting.
> >>> >
> >>> >If you are using persistent storage, Cassandra will not start until the
> >>> >PV is available. There may be some more information about Cassandra in
> >>> >the pod section of the console under events.
> >>> >
> >>> >What command did you use when deploying the deployer?
> >>> >
> >>> >> 
> >>> >> $ oc logs -f hawkular-metrics-4b16f
> >>> >> 
> >>> >> 16:54:25,703 DEBUG [org.jboss.as.config] (MSC service thread 1-4) VM
> >>> >> Arguments: -Duser.home=/home/jboss -Duser.name=jboss -D[Standalone]
> >>> >> -XX:+UseCompressedOops -verbose:gc
> >>> >> -Xloggc:/opt/eap/standalone/log/gc.log
> >>> >> -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseGCLogFileRotation
> >>> >> -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=3M -XX:-TraceClassUnloading
> >>> >> -Xms1303m -Xmx1303m -XX:MaxPermSize=256m
> >>> >> -Djava.net.preferIPv4Stack=true
> >>> >> -Djboss.modules.system.pkgs=org.jboss.logmanager
> >>> >> -Djava.awt.headless=true
> >>> >> -Djboss.modules.policy-permissions=true
> >>> >> -Xbootclasspath/p:/opt/eap/jboss-modules.jar:/opt/eap/modules/system/layers/base/org/jboss/logmanager/main/jboss-logmanager-1.5.4.Final-redhat-1.jar:/opt/eap/modules/system/layers/base/org/jboss/logmanager/ext/main/javax.json-1.0.4.jar:/opt/eap/modules/system/layers/base/org/jboss/logmanager/ext/main/jboss-logmanager-ext-1.0.0.Alpha2-redhat-1.jar
> >>> >> -Djava.util.logging.manager=org.jboss.logmanager.LogManager
> >>> >> -javaagent:/opt/eap/jolokia.jar=port=8778,protocol=https,caCert=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt,clientPrincipal=cn=system:master-proxy,useSslClientAuthentication=true,extraClientCheck=true,host=0.0.0.0,discoveryEnabled=false
> >>> >> -Djava.security.egd=file:/dev/./urandom
> >>> >> -Dorg.jboss.boot.log.file=/opt/eap/standalone/log/server.log
> >>> >> -Dlogging.configuration=file:/opt/eap/standalone/configuration/logging.properties
> >>> >> 16:54:27,079 INFO  [org.xnio] (MSC service thread 1-3) XNIO Version
> >>> >> 3.0.14.GA-redhat-1
> >>> >> 16:54:27,083 INFO  [org.xnio.nio] (MSC service thread 1-3) XNIO NIO
> >>> >> Implementation Version 3.0.14.GA-redhat-1
> >>> >> 16:54:27,101 INFO  [org.jboss.as.server] (Controller Boot Thread)
> >>> >> JBAS015888:
> >>> >> Creating http management service using socket-binding
> >>> >> (management-http)
> >>> >> 16:54:27,104 INFO  [org.jboss.remoting] (MSC service thread 1-3) JBoss
> >>> >> Remoting version 3.3.5.Final-redhat-1
> >>> >> 
> >>> >> $ oc logs -f heapster-x2gj2
> >>> >> Endpoint Check in effect. Checking
> >>> >> https://hawkular-metrics:443/hawkular/metrics/status
> >>> >> Could not connect to
> >>> >> https://hawkular-metrics:443/hawkular/metrics/status.
> >>> >> Curl exit code: 6. Status Code 000
> >>> >> 'https://hawkular-metrics:443/hawkular/metrics/status' is not
> >>> >> accessible
> >>> >> [HTTP status code: 000. Curl exit code 6]. Retrying.
> >>> >> Could not connect to
> >>> >> https://hawkular-metrics:443/hawkular/metrics/status.
> >>> >> Curl exit code: 6. Status Code 000
> >>> >> 'https://hawkular-metrics:443/hawkular/metrics/status' is not
> >>> >> accessible
> >>> >> [HTTP status code: 000. Curl exit code 6]. Retrying.
> >>> >> Could not connect to
> >>> >> https://hawkular-metrics:443/hawkular/metrics/status.
> >>> >> Curl exit code: 6. Status Code 000
> >>> >> 
> >>> >> 
> >>> >>  $ oc logs -f metrics-deployer-9v7vc
> >>> >> 
> >>> >> ++ oc create -f -
> >>> >> serviceaccount "heapster" created
> >>> >> service "heapster" created
> >>> >> replicationcontroller "heapster" created
> >>> >> + echo 'Success!'
> >>> >> Success!
> >>> >> 
> >>> >> --
> >>> >> Srinivas Kotaru
> >>> >> 
> >>> >> On 6/13/16, 1:49 PM, "Matt Wringe" <mwringe redhat com> wrote:
> >>> >> 
> >>> >> >
> >>> >> >
> >>> >> >----- Original Message -----
> >>> >> >> From: "Srinivas Naga Kotaru (skotaru)" <skotaru cisco com>
> >>> >> >> To: users lists openshift redhat com
> >>> >> >> Sent: Monday, June 13, 2016 3:58:12 PM
> >>> >> >> Subject: Metrics deployment
> >>> >> >> 
> >>> >> >> 
> >>> >> >> 
> >>> >> >> Hi
> >>> >> >> 
> >>> >> >> 
> >>> >> >> 
> >>> >> >> I am trying to configure metrics in our newly installed clusters. I am
> >>> >> >> seeing the errors below even though the metrics-deploy script completed
> >>> >> >> successfully. I used our environment-specific
> >>> >> >> HAWKULAR_METRICS_HOSTNAME and MASTER_URL.
> >>> >> >> 
> >>> >> >> 
> >>> >> >> 
> >>> >> >> # oc new-app -f metrics-deployer.yaml
> >>> >> >> 
> >>> >> >> 
> >>> >> >> 
> >>> >> >> Note: customized Cassandra PV, MASTER_URL, and
> >>> >> >> HAWKULAR_METRICS_HOSTNAME (hard-coded as values)
> >>> >> >> 
> >>> >> >> 
> >>> >> >> 
> >>> >> >> template "hawkular-heapster" created
> >>> >> >> 
> >>> >> >> Deploying the Heapster component
> >>> >> >> 
> >>> >> >> ++ echo 'Deploying the Heapster component'
> >>> >> >> 
> >>> >> >> ++ '[' -n '' ']'
> >>> >> >> 
> >>> >> >> ++ oc create -f -
> >>> >> >> 
> >>> >> >> ++ oc process hawkular-heapster -v
> >>> >> >> IMAGE_PREFIX=registry.access.redhat.com/openshift3/,IMAGE_VERSION=latest,MASTER_URL=https://lae3-alln-int-idev01.cisco.com:443,NODE_ID=nodename
> >>> >> >> 
> >>> >> >> serviceaccount "heapster" created
> >>> >> >> 
> >>> >> >> service "heapster" created
> >>> >> >> 
> >>> >> >> replicationcontroller "heapster" created
> >>> >> >> 
> >>> >> >> + echo 'Success!'
> >>> >> >> 
> >>> >> >> Success!
> >>> >> >> 
> >>> >> >> 
> >>> >> >> 
> >>> >> >> # oc get pods
> >>> >> >> 
> >>> >> >> NAME                         READY     STATUS              RESTARTS   AGE
> >>> >> >> hawkular-cassandra-1-9nzio   0/1       ContainerCreating   0          4m
> >>> >> >> hawkular-metrics-hi7mb       0/1       Running             1          4m
> >>> >> >> heapster-e8gbu               0/1       Running             2          4m
> >>> >> >> metrics-deployer-64703       0/1       ContainerCreating   0          3s
> >>> >> >> metrics-deployer-cd1nf       0/1       Completed           0          5m
> >>> >> >> 
> >>> >> >
> >>> >> >It looks like none of your containers are fully up and running yet.
> >>> >> >
> >>> >> >Without Cassandra running, Hawkular Metrics will not run, and
> >>> >> >Heapster will wait until Hawkular Metrics is fully running.
> >>> >> >
> >>> >> >Do you see anything in the Cassandra logs? The first step will be to
> >>> >> >get Cassandra running properly.
> >>> >> >
> >>> >> >> 
> >>> >> >> 
> >>> >> >> 
> >>> >> >> $ oc logs -f heapster-e8gbu
> >>> >> >> 
> >>> >> >> Endpoint Check in effect. Checking
> >>> >> >> https://hawkular-metrics:443/hawkular/metrics/status
> >>> >> >> 
> >>> >> >> Could not connect to
> >>> >> >> https://hawkular-metrics:443/hawkular/metrics/status.
> >>> >> >> Curl exit code: 6. Status Code 000
> >>> >> >> 
> >>> >> >> 'https://hawkular-metrics:443/hawkular/metrics/status' is not
> >>> >> >> accessible
> >>> >> >> [HTTP status code: 000. Curl exit code 6]. Retrying.
> >>> >> >> 
> >>> >> >> Could not connect to
> >>> >> >> https://hawkular-metrics:443/hawkular/metrics/status.
> >>> >> >> Curl exit code: 6. Status Code 000
> >>> >> >
> >>> >> >Heapster waits until Hawkular Metrics is started before trying to
> >>> >> >push metrics to it. The issue that you are seeing is because Heapster
> >>> >> >could not properly connect to Hawkular Metrics. Until the Hawkular
> >>> >> >Metrics service is fully up, Heapster will not be able to connect to
> >>> >> >it.
> >>> >> >
> >>> >> >
> >>> >> >> 
> >>> >> >> 
> >>> >> >> 
> >>> >> >> 
> >>> >> >> What is wrong? Why is it checking just hawkular-metrics rather than
> >>> >> >> the full routing URL that was provided as HAWKULAR_METRICS_HOSTNAME?
> >>> >> >
> >>> >> >The Hawkular Metrics service has two hostnames: the internal hostname
> >>> >> >used by the internal components (e.g. 'hawkular-metrics') and the
> >>> >> >external hostname (e.g. what is configured via
> >>> >> >HAWKULAR_METRICS_HOSTNAME). The OpenShift DNS server resolves service
> >>> >> >names as hostnames, which is where the internal 'hawkular-metrics'
> >>> >> >comes from.
> >>> >> >
> >>> >> >> 
> >>> >> >> 
> >>> >> >> 
> >>> >> >> 
> >>> >> >> 
> >>> >> >> 
> >>> >> >> 
> >>> >> >> 
> >>> >> >> 
> >>> >> >> 
> >>> >> >> --
> >>> >> >> 
> >>> >> >> 
> >>> >> >> Srinivas Kotaru
> >>> >> >> 
> >>> >> >> _______________________________________________
> >>> >> >> users mailing list
> >>> >> >> users lists openshift redhat com
> >>> >> >> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
> >>> >> >> 
> >>> >> 
> >>> >> 
> >>> 
> >>> 
> >
> 
> 

