
Re: issue adding node to district



Jason,

I have the same issue. Were you able to resolve it?

Regards,
Anthony


On Fri, Sep 5, 2014 at 11:56 PM, Brenton Leanhardt <bleanhar redhat com> wrote:
> +++ Jason Marley [05/09/14 16:53 -0400]:
>>
>> Thanks for the quick reply, was about to give up :) .
>>
>> Definitely seems like something is wrong with mcollective, because no
>> results are returning from either of those commands. I'll double-check
>> how I configured it.
>>
>> Does cron run on both node and broker or just the broker?
>
>
> That's going to run on the Node.  The cronjob generates facts every
> minute that the Node will report to the Broker.
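>
> A quick way to sanity-check that on the Node (a rough sketch; the exact
> facts file path is whatever plugin.yaml points to in your server.cfg,
> the one below is only the usual default):
>
>   # which facts file mcollective is configured to read
>   grep plugin.yaml /opt/rh/ruby193/root/etc/mcollective/server.cfg
>   # the cronjob should keep that file no more than a minute or so old
>   stat -c '%y %n' /opt/rh/ruby193/root/etc/mcollective/facts.yaml
>   # and the minutely cron script itself should be present
>   ls -l /etc/cron.minutely/openshift-facts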
>
>
>>
>> [root broker sbin]# oo-mco facts -v kernel
>> Discovering hosts using the mc method for 2 second(s) .... 1
>> Report for fact: kernel
>>
>>
>> ---- rpc stats ----
>>           Nodes: 1 / 0
>>     Pass / Fail: 0 / 0
>>      Start Time: 2014-09-05 16:49:27 -0400
>>  Discovery Time: 2019.78ms
>>      Agent Time: 12002.98ms
>>      Total Time: 14022.76ms
>>
>>
>> No response from:
>>
>>   node1.os.com
>>
>> [root broker sbin]# oo-mco inventory node1.os.com
>> Did not receive any results from node node1.os.com
>>
>> Jason
>>
>> ----- Original Message -----
>>>
>>> +++ Jason Marley [05/09/14 16:32 -0400]:
>>> >It is a lab env; no dice using the latest oo-admin-ctl-district, although
>>> >it went further. Any other thoughts?
>>> >
>>> >[root broker sbin]# oo-admin-ctl-district -c add-node -n small_district -i node1.os.com
>>> >
>>> >
>>> >{"_id"=>"54083b272e25cd0ccc000001",
>>> > "uuid"=>"54083b272e25cd0ccc000001",
>>> > "available_uids"=>"<6000 uids hidden>",
>>> > "name"=>"small_district",
>>> > "platform"=>"linux",
>>> > "gear_size"=>"small",
>>> > "available_capacity"=>6000,
>>> > "max_uid"=>6999,
>>> > "max_capacity"=>6000,
>>> > "active_servers_size"=>0,
>>> > "updated_at"=>2014-09-04 10:12:55 UTC,
>>> > "created_at"=>2014-09-04 10:12:55 UTC}
>>> >
>>> >ERROR OUTPUT:
>>> >Cannot connect to node.
>>>
>>> I'm betting it's something a little wrong with the MCollective setup
>>> on that machine.  All 'oo-mco ping' is doing is telling you that
>>> ActiveMQ is set up properly and that the Broker at least knows the Node
>>> exists.
>>>
>>> Based on that error output, you are running fairly recent code, since
>>> that error message was introduced in the commits I sent previously.
>>> That error is printed when it can't find the kernel fact for some
>>> reason.
>>>
>>> What does this command return?
>>>
>>> oo-mco facts -v kernel
>>>
>>> Another good mcollective sanity check is:
>>>
>>> oo-mco inventory node1.os.com
>>>
>>> With facts, the most important thing to check would be the value for
>>> plugin.yaml in /opt/rh/ruby193/root/etc/mcollective/server.cfg (or
>>> /etc/mcollective/server.cfg if you are using Fedora).  It needs to
>>> match the value that is being generated by
>>> /etc/cron.minutely/openshift-facts.  You also need to ensure that
>>> the cronjob is running.
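>>>
>>> Something along these lines (untested; adjust the paths for your
>>> install, and use systemctl instead of service on a systemd-based
>>> system) should show whether the two agree and whether cron is doing
>>> its job:
>>>
>>>   # the facts file mcollective reads
>>>   grep plugin.yaml /opt/rh/ruby193/root/etc/mcollective/server.cfg
>>>   # see which file the minutely cron script generates
>>>   cat /etc/cron.minutely/openshift-facts
>>>   # make sure cron itself is running on the Node
>>>   service crond status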
>>>
>>> --Brenton
>>>
>>>
>>> >
>>> >[root broker sbin]# oo-mco ping
>>> >node1.os.com                             time=83.01 ms
>>> >
>>> >
>>> >---- ping statistics ----
>>> >1 replies max: 83.01 min: 83.01 avg: 83.01
>>> >
>>> >
>>> >Jason
>>> >
>>> >----- Original Message -----
>>> >> +++ Jason Marley [04/09/14 17:46 -0400]:
>>> >> >Hi All,
>>> >> >
>>> >> >I'm having an issue adding a node to my district. When I run mco ping
>>> >> >I can see my node, but when I add it I see a weird message. I verified
>>> >> >that my broker was set up correctly, but my node is not. My node says
>>> >> >mcollective is not running as a service, but it is still listening for
>>> >> >messages from the broker and responds to pings. When I run the node
>>> >> >check, it says mcollective is not running and that my SELinux contexts
>>> >> >are not correct. Some other people have had this issue, but I haven't
>>> >> >seen any resolution except for a Bugzilla ticket
>>> >> >(https://bugzilla.redhat.com/show_bug.cgi?id=1074553).
>>> >> >
>>> >> >Any help would be appreciated.
>>> >> >
>>> >> >[root broker ~]# oo-admin-ctl-district -c add-node -n small_district -i node1.os.com
>>> >> >/usr/sbin/oo-admin-ctl-district:215:in `block in <main>': undefined
>>> >> >method `casecmp' for nil:NilClass (NoMethodError)
>>> >> >        from /usr/sbin/oo-admin-ctl-district:178:in `block in collate_errors'
>>> >> >        from /usr/sbin/oo-admin-ctl-district:176:in `each'
>>> >> >        from /usr/sbin/oo-admin-ctl-district:176:in `collate_errors'
>>> >> >        from /usr/sbin/oo-admin-ctl-district:213:in `<main>'
>>> >>
>>> >> I'm not sure what version of the code you are running but see if your
>>> >> version of /usr/sbin/oo-admin-ctl-district has the following changes:
>>> >>
>>> >>
>>> >> https://github.com/openshift/origin-server/commit/11632afde2d6b407ef1a6fe217e31b9cc0f5ce88
>>> >>
>>> >> https://github.com/openshift/origin-server/commit/e17edc775d8debf9706c5a677e730084b5635b50
>>> >>
>>> >> If you're running in a non-production environment, you could likely
>>> >> back up your version of /usr/sbin/oo-admin-ctl-district and just
>>> >> replace it with
>>> >>
>>> >> https://raw.githubusercontent.com/openshift/origin-server/master/broker-util/oo-admin-ctl-district
>>> >>
>>> >> to test.
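>>> >>
>>> >> Roughly (untested; keep the backup around in case the master branch
>>> >> doesn't match the rest of your installed release):
>>> >>
>>> >>   # back up the installed script, then pull the current one from master
>>> >>   cp -p /usr/sbin/oo-admin-ctl-district /usr/sbin/oo-admin-ctl-district.bak
>>> >>   curl -o /usr/sbin/oo-admin-ctl-district https://raw.githubusercontent.com/openshift/origin-server/master/broker-util/oo-admin-ctl-district
>>> >>   chmod 755 /usr/sbin/oo-admin-ctl-district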
>>> >>
>>> >> >
>>> >> >[root broker ~]# date
>>> >> >Thu Sep  4 17:36:56 EDT 2014
>>> >> >
>>> >> >[root broker ~]# oo-mco ping
>>> >> >node1.os.com                             time=76.51 ms
>>> >> >
>>> >> >
>>> >> >---- ping statistics ----
>>> >> >1 replies max: 76.51 min: 76.51 avg: 76.51
>>> >> >
>>> >> >[root node1 ~]# oo-accept-node -v
>>> >> >INFO: using default accept-node extensions
>>> >> >INFO: loading node configuration file /etc/openshift/node.conf
>>> >> >INFO: loading resource limit file /etc/openshift/resource_limits.conf
>>> >> >INFO: finding external network device
>>> >> >INFO: checking node public hostname resolution
>>> >> >INFO: checking selinux status
>>> >> >INFO: checking selinux openshift-origin policy
>>> >> >INFO: checking selinux booleans
>>> >> >INFO: checking package list
>>> >> >INFO: checking services
>>> >> >FAIL: service ruby193-mcollective not running
>>> >> >FAIL: Could not get SELinux context for ruby193-mcollective
>>> >> >INFO: checking kernel semaphores >= 512
>>> >> >INFO: checking cgroups configuration
>>> >> >INFO: checking cgroups processes
>>> >> >INFO: find district uuid: NONE
>>> >> >INFO: determining node uid range: 1000 to 6999
>>> >> >INFO: checking presence of tc qdisc
>>> >> >INFO: checking for cgroup filter
>>> >> >INFO: checking presence of tc classes
>>> >> >INFO: checking filesystem quotas
>>> >> >INFO: checking quota db file selinux label
>>> >> >INFO: checking 0 user accounts
>>> >> >INFO: checking application dirs
>>> >> >INFO: checking system httpd configs
>>> >> >INFO: checking cartridge repository
>>> >> >2 ERRORS
>>> >> >
>>> >> >[root node1 ~]# tail /var/log/openshift/node/ruby193-mcollective.log
>>> >> >D, [2014-09-04T17:36:57.886032 #11971] DEBUG -- :
>>> >> > pluginmanager.rb:83:in
>>> >> >`[]' Returning cached plugin connector_plugin with class
>>> >> >MCollective::Connector::Activemq
>>> >> >D, [2014-09-04T17:36:57.886184 #11971] DEBUG -- : activemq.rb:362:in
>>> >> >`publish' Sending a broadcast message to ActiveMQ target
>>> >> >'/queue/mcollective.reply.broker.os.com_6176' with headers
>>> >> >'{"timestamp"=>"1409866617000", "expires"=>"1409866687000"}'
>>> >> >D, [2014-09-04T17:36:57.886400 #11971] DEBUG -- :
>>> >> > runnerstats.rb:56:in
>>> >> >`block in sent' Incrementing replies stat
>>> >> >D, [2014-09-04T17:37:02.204937 #11971] DEBUG -- :
>>> >> > pluginmanager.rb:83:in
>>> >> >`[]' Returning cached plugin security_plugin with class
>>> >> >MCollective::Security::Psk
>>> >> >D, [2014-09-04T17:37:02.205498 #11971] DEBUG -- : base.rb:178:in
>>> >> >`create_request' Encoding a request for agent 'registration' in
>>> >> >collective
>>> >> >mcollective with request id 2cd84c1bd9d35bb0bedd32a36a0ee0b9
>>> >> >D, [2014-09-04T17:37:02.205545 #11971] DEBUG -- : psk.rb:98:in
>>> >> > `callerid'
>>> >> >Setting callerid to uid=0 based on callertype=uid
>>> >> >D, [2014-09-04T17:37:02.205606 #11971] DEBUG -- : base.rb:70:in
>>> >> > `publish'
>>> >> >Sending registration 2cd84c1bd9d35bb0bedd32a36a0ee0b9 to collective
>>> >> >mcollective
>>> >> >D, [2014-09-04T17:37:02.205646 #11971] DEBUG -- :
>>> >> > pluginmanager.rb:83:in
>>> >> >`[]' Returning cached plugin connector_plugin with class
>>> >> >MCollective::Connector::Activemq
>>> >> >D, [2014-09-04T17:37:02.205718 #11971] DEBUG -- : activemq.rb:362:in
>>> >> >`publish' Sending a broadcast message to ActiveMQ target
>>> >> >'/topic/mcollective.registration.agent' with headers
>>> >> >'{"timestamp"=>"1409866622000", "expires"=>"1409866692000",
>>> >> >"reply-to"=>"/queue/mcollective.reply.node1.os.com_11971"}'
>>> >> >D, [2014-09-04T17:37:04.503263 #11971] DEBUG -- : activemq.rb:169:in
>>> >> >`on_hbfire' Publishing heartbeat to
>>> >> >stomp://mcollective broker os com:61613: send_fire,
>>> >> >{:curt=>1409866624.5028434, :last_sleep=>30.49942183494568}
>>> >> >
>>> >> >
>>> >> >Jason
>>> >> >
>>> >> >_______________________________________________
>>> >> >dev mailing list
>>> >> >dev lists openshift redhat com
>>> >> >http://lists.openshift.redhat.com/openshiftmm/listinfo/dev
>>> >>
>>>
>
> _______________________________________________
> dev mailing list
> dev lists openshift redhat com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/dev

