
Re: issue adding node to district



+++ Jason Marley [05/09/14 16:32 -0400]:
It is a lab environment. No luck using the latest oo-admin-ctl-district, although it got further. Any other thoughts?

[root@broker sbin]# oo-admin-ctl-district -c add-node -n small_district -i node1.os.com


{"_id"=>"54083b272e25cd0ccc000001",
"uuid"=>"54083b272e25cd0ccc000001",
"available_uids"=>"<6000 uids hidden>",
"name"=>"small_district",
"platform"=>"linux",
"gear_size"=>"small",
"available_capacity"=>6000,
"max_uid"=>6999,
"max_capacity"=>6000,
"active_servers_size"=>0,
"updated_at"=>2014-09-04 10:12:55 UTC,
"created_at"=>2014-09-04 10:12:55 UTC}

ERROR OUTPUT:
Cannot connect to node.

I'm betting something is slightly wrong with the MCollective setup
on that machine.  All 'oo-mco ping' tells you is that ActiveMQ is
set up properly and that the Broker at least knows the Node
exists.

Based on that error output you are running fairly recent code, since
that error message was introduced in the commits I sent previously.
It is printed when the script can't find the kernel fact for some
reason.

What does this command return?

oo-mco facts -v kernel

Another good mcollective sanity check is:

oo-mco inventory node1.os.com

With facts the most important thing to check would be the value for
plugin.yaml in /opt/rh/ruby193/root/etc/mcollective/server.cfg (or
/etc/mcollective/server.cfg if you are using Fedora).  It needs to
match the value that is being generated in
/etc/cron.minutely/openshift-facts.  You also need to ensure that the
cron job is actually running.
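
The comparison described above can be sketched in shell. This is only a
rough sketch, not part of any OpenShift tooling: the function names are
made up for illustration, and it assumes server.cfg uses the usual
"plugin.yaml = <path>" form and that the cron script mentions the facts
file path literally.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with OpenShift or MCollective).

# Print the value of the plugin.yaml setting from an MCollective server.cfg.
# Assumes lines of the form: plugin.yaml = /some/path/facts.yaml
facts_path_from_cfg() {
    sed -n 's/^[[:space:]]*plugin\.yaml[[:space:]]*=[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*$/\1/p' "$1" | tail -n 1
}

# Succeed only if the cron script mentions the same facts file path.
# $1 = server.cfg, $2 = cron script
cron_mentions_facts_path() {
    facts_path=$(facts_path_from_cfg "$1")
    [ -n "$facts_path" ] && grep -qF -- "$facts_path" "$2"
}

# Example call on a node, using the paths named in this thread:
#   cron_mentions_facts_path \
#       /opt/rh/ruby193/root/etc/mcollective/server.cfg \
#       /etc/cron.minutely/openshift-facts \
#     && echo "facts path matches" || echo "facts path MISMATCH"
```

This only checks that the two files agree on the path. Whether the cron
job is actually firing still has to be confirmed separately.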

--Brenton



[root@broker sbin]# oo-mco ping
node1.os.com                             time=83.01 ms


---- ping statistics ----
1 replies max: 83.01 min: 83.01 avg: 83.01


Jason

----- Original Message -----
+++ Jason Marley [04/09/14 17:46 -0400]:
>Hi All,
>
>I'm having an issue adding a node to my district. When I run mco ping I can
>see my node, but when I add it I see a weird message. I verified that my
>broker is set up correctly, but my node is not. The node reports that
>mcollective is not running as a service, yet it is still listening for
>messages from the broker and responds to pings. When I run the node check it
>says mcollective is not running and that my SELinux contexts are not correct.
>Some other people have had this issue, but I haven't seen any resolution
>except for a bugzilla ticket
>(https://bugzilla.redhat.com/show_bug.cgi?id=1074553).
>
>any help would be appreciated.
>
>[root@broker ~]# oo-admin-ctl-district -c add-node -n small_district -i node1.os.com
>/usr/sbin/oo-admin-ctl-district:215:in `block in <main>': undefined method
>`casecmp' for nil:NilClass (NoMethodError)
>        from /usr/sbin/oo-admin-ctl-district:178:in `block in
>        collate_errors'
>        from /usr/sbin/oo-admin-ctl-district:176:in `each'
>        from /usr/sbin/oo-admin-ctl-district:176:in `collate_errors'
>        from /usr/sbin/oo-admin-ctl-district:213:in `<main>'

I'm not sure what version of the code you are running but see if your
version of /usr/sbin/oo-admin-ctl-district has the following changes:

https://github.com/openshift/origin-server/commit/11632afde2d6b407ef1a6fe217e31b9cc0f5ce88
https://github.com/openshift/origin-server/commit/e17edc775d8debf9706c5a677e730084b5635b50

If you're running in a non-production environment you could
back up your version of /usr/sbin/oo-admin-ctl-district and
replace it with
https://raw.githubusercontent.com/openshift/origin-server/master/broker-util/oo-admin-ctl-district
to test.
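
A minimal sketch of that backup-and-replace step, for a lab machine
only. The function name, the ".orig" backup suffix, and the /tmp
download location are assumptions made for illustration, not anything
the tooling prescribes.

```shell
#!/bin/sh
# Hypothetical helper: back up an installed script, then swap in a new copy.
# $1 = installed script, $2 = replacement file already downloaded
swap_in_script() {
    target=$1
    replacement=$2
    backup="$target.orig"   # assumed backup name

    # Refuse to continue if the replacement is missing or empty
    # (for example after a failed download).
    if [ ! -s "$replacement" ]; then
        echo "replacement $replacement is missing or empty" >&2
        return 1
    fi

    # Keep only the first backup so re-running never overwrites the original.
    if [ ! -e "$backup" ]; then
        cp -p "$target" "$backup" || return 1
    fi

    cp "$replacement" "$target" && chmod 755 "$target"
}

# Example call on the broker (download path is an assumption):
#   curl -fsSL -o /tmp/oo-admin-ctl-district \
#     https://raw.githubusercontent.com/openshift/origin-server/master/broker-util/oo-admin-ctl-district
#   swap_in_script /usr/sbin/oo-admin-ctl-district /tmp/oo-admin-ctl-district
```

To roll back, copy the ".orig" file over the installed script again.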

>
>[root@broker ~]# date
>Thu Sep  4 17:36:56 EDT 2014
>
>[root@broker ~]# oo-mco ping
>node1.os.com                             time=76.51 ms
>
>
>---- ping statistics ----
>1 replies max: 76.51 min: 76.51 avg: 76.51
>
>[root@node1 ~]# oo-accept-node -v
>INFO: using default accept-node extensions
>INFO: loading node configuration file /etc/openshift/node.conf
>INFO: loading resource limit file /etc/openshift/resource_limits.conf
>INFO: finding external network device
>INFO: checking node public hostname resolution
>INFO: checking selinux status
>INFO: checking selinux openshift-origin policy
>INFO: checking selinux booleans
>INFO: checking package list
>INFO: checking services
>FAIL: service ruby193-mcollective not running
>FAIL: Could not get SELinux context for ruby193-mcollective
>INFO: checking kernel semaphores >= 512
>INFO: checking cgroups configuration
>INFO: checking cgroups processes
>INFO: find district uuid: NONE
>INFO: determining node uid range: 1000 to 6999
>INFO: checking presence of tc qdisc
>INFO: checking for cgroup filter
>INFO: checking presence of tc classes
>INFO: checking filesystem quotas
>INFO: checking quota db file selinux label
>INFO: checking 0 user accounts
>INFO: checking application dirs
>INFO: checking system httpd configs
>INFO: checking cartridge repository
>2 ERRORS
>
>[root@node1 ~]# tail /var/log/openshift/node/ruby193-mcollective.log
>D, [2014-09-04T17:36:57.886032 #11971] DEBUG -- : pluginmanager.rb:83:in
>`[]' Returning cached plugin connector_plugin with class
>MCollective::Connector::Activemq
>D, [2014-09-04T17:36:57.886184 #11971] DEBUG -- : activemq.rb:362:in
>`publish' Sending a broadcast message to ActiveMQ target
>'/queue/mcollective.reply.broker.os.com_6176' with headers
>'{"timestamp"=>"1409866617000", "expires"=>"1409866687000"}'
>D, [2014-09-04T17:36:57.886400 #11971] DEBUG -- : runnerstats.rb:56:in
>`block in sent' Incrementing replies stat
>D, [2014-09-04T17:37:02.204937 #11971] DEBUG -- : pluginmanager.rb:83:in
>`[]' Returning cached plugin security_plugin with class
>MCollective::Security::Psk
>D, [2014-09-04T17:37:02.205498 #11971] DEBUG -- : base.rb:178:in
>`create_request' Encoding a request for agent 'registration' in collective
>mcollective with request id 2cd84c1bd9d35bb0bedd32a36a0ee0b9
>D, [2014-09-04T17:37:02.205545 #11971] DEBUG -- : psk.rb:98:in `callerid'
>Setting callerid to uid=0 based on callertype=uid
>D, [2014-09-04T17:37:02.205606 #11971] DEBUG -- : base.rb:70:in `publish'
>Sending registration 2cd84c1bd9d35bb0bedd32a36a0ee0b9 to collective
>mcollective
>D, [2014-09-04T17:37:02.205646 #11971] DEBUG -- : pluginmanager.rb:83:in
>`[]' Returning cached plugin connector_plugin with class
>MCollective::Connector::Activemq
>D, [2014-09-04T17:37:02.205718 #11971] DEBUG -- : activemq.rb:362:in
>`publish' Sending a broadcast message to ActiveMQ target
>'/topic/mcollective.registration.agent' with headers
>'{"timestamp"=>"1409866622000", "expires"=>"1409866692000",
>"reply-to"=>"/queue/mcollective.reply.node1.os.com_11971"}'
>D, [2014-09-04T17:37:04.503263 #11971] DEBUG -- : activemq.rb:169:in
>`on_hbfire' Publishing heartbeat to
>stomp://mcollective@broker.os.com:61613: send_fire,
>{:curt=>1409866624.5028434, :last_sleep=>30.49942183494568}
>
>
>Jason
>


