
Re: issue adding node to district



Thanks for the quick reply; I was about to give up :)

It definitely seems like something is wrong with MCollective, because neither of those commands returns any results. I'll double-check how I configured it.

Does the cron job run on both the node and the broker, or just on the broker?

[root broker sbin]# oo-mco facts -v kernel
Discovering hosts using the mc method for 2 second(s) .... 1
Report for fact: kernel


---- rpc stats ----
           Nodes: 1 / 0
     Pass / Fail: 0 / 0
      Start Time: 2014-09-05 16:49:27 -0400
  Discovery Time: 2019.78ms
      Agent Time: 12002.98ms
      Total Time: 14022.76ms


No response from:

   node1.os.com

[root broker sbin]# oo-mco inventory node1.os.com
Did not receive any results from node node1.os.com
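
While I double-check the config, the plugin.yaml check Brenton describes in the quoted reply below could be scripted roughly like this, presumably run on the node. It is only a sketch: the default paths are the ones named in his message, the helper names are made up for illustration, and it assumes the cron script mentions the facts file path literally, which may not hold on every install.

```python
#!/usr/bin/env python
"""Sketch: check that plugin.yaml in server.cfg matches the facts cron script."""
import sys

# Default paths taken from the quoted message; use /etc/mcollective/server.cfg
# on Fedora. Both can be overridden on the command line.
SERVER_CFG = "/opt/rh/ruby193/root/etc/mcollective/server.cfg"
FACTS_CRON = "/etc/cron.minutely/openshift-facts"


def plugin_yaml_path(cfg_text):
    """Return the value of the plugin.yaml setting, or None if it is not set."""
    for raw in cfg_text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and anything that is not a key = value pair.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == "plugin.yaml":
            return value.strip()
    return None


def cron_mentions(cron_text, path):
    """True if the cron script text contains the configured facts path.

    Assumption: the script names the file literally instead of building
    the path from variables.
    """
    return bool(path) and path in cron_text


def main(cfg_file=SERVER_CFG, cron_file=FACTS_CRON):
    try:
        with open(cfg_file) as f:
            cfg_text = f.read()
        with open(cron_file) as f:
            cron_text = f.read()
    except (IOError, OSError) as err:
        print("could not read file: %s" % err)
        return
    path = plugin_yaml_path(cfg_text)
    if path is None:
        print("plugin.yaml is not set in %s" % cfg_file)
    elif cron_mentions(cron_text, path):
        print("OK: %s refers to %s" % (cron_file, path))
    else:
        print("MISMATCH: %s does not mention %s" % (cron_file, path))


if __name__ == "__main__":
    main(*sys.argv[1:3])
```

A MISMATCH here would only be a hint to compare the two files by hand, and an OK does not show that the cron job is actually running.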

Jason

----- Original Message -----
> +++ Jason Marley [05/09/14 16:32 -0400]:
> >It is a lab env; no dice using the latest oo-admin-ctl-district, although it
> >got further. Any other thoughts?
> >
> >[root broker sbin]# oo-admin-ctl-district -c add-node -n small_district -i node1.os.com
> >
> >
> >{"_id"=>"54083b272e25cd0ccc000001",
> > "uuid"=>"54083b272e25cd0ccc000001",
> > "available_uids"=>"<6000 uids hidden>",
> > "name"=>"small_district",
> > "platform"=>"linux",
> > "gear_size"=>"small",
> > "available_capacity"=>6000,
> > "max_uid"=>6999,
> > "max_capacity"=>6000,
> > "active_servers_size"=>0,
> > "updated_at"=>2014-09-04 10:12:55 UTC,
> > "created_at"=>2014-09-04 10:12:55 UTC}
> >
> >ERROR OUTPUT:
> >Cannot connect to node.
> 
> I'm betting it's something a little wrong with the MCollective setup
> on that machine.  All 'oo-mco ping' is doing is telling you that
> ActiveMQ is set up properly and that the Broker at least knows the Node
> exists.
> 
> Based on that error output you are running fairly recent code since
> that error message was introduced in the commits I sent previously.
> That error is printed when it can't find the kernel fact for some
> reason.
> 
> What does this command return?
> 
> oo-mco facts -v kernel
> 
> Another good mcollective sanity check is:
> 
> oo-mco inventory node1.os.com
> 
> With facts the most important thing to check would be the value for
> plugin.yaml in /opt/rh/ruby193/root/etc/mcollective/server.cfg (or
> /etc/mcollective/server.cfg if you are using Fedora).  It needs to
> match the value that is being generated in
> /etc/cron.minutely/openshift-facts.  You also need to ensure that
> cronjob is running.
> 
> --Brenton
> 
> 
> >
> >[root broker sbin]# oo-mco ping
> >node1.os.com                             time=83.01 ms
> >
> >
> >---- ping statistics ----
> >1 replies max: 83.01 min: 83.01 avg: 83.01
> >
> >
> >Jason
> >
> >----- Original Message -----
> >> +++ Jason Marley [04/09/14 17:46 -0400]:
> >> >Hi All,
> >> >
> >> >I'm having an issue adding a node to my district. When I run mco ping I
> >> >can see my node, but when I add it I see a weird message. I verified that
> >> >my broker was set up correctly, but my node is not. My node says
> >> >mcollective is not running as a service, but it is still listening for
> >> >messages from the broker and responds to pings. When I run the node check
> >> >it says mcollective is not running and that my SELinux contexts are not
> >> >correct. Some other people have had this issue, but I haven't seen any
> >> >resolution except for a Bugzilla ticket
> >> >(https://bugzilla.redhat.com/show_bug.cgi?id=1074553).
> >> >
> >> >Any help would be appreciated.
> >> >
> >> >[root broker ~]# oo-admin-ctl-district -c add-node -n small_district -i node1.os.com
> >> >/usr/sbin/oo-admin-ctl-district:215:in `block in <main>': undefined method `casecmp' for nil:NilClass (NoMethodError)
> >> >        from /usr/sbin/oo-admin-ctl-district:178:in `block in collate_errors'
> >> >        from /usr/sbin/oo-admin-ctl-district:176:in `each'
> >> >        from /usr/sbin/oo-admin-ctl-district:176:in `collate_errors'
> >> >        from /usr/sbin/oo-admin-ctl-district:213:in `<main>'
> >>
> >> I'm not sure what version of the code you are running but see if your
> >> version of /usr/sbin/oo-admin-ctl-district has the following changes:
> >>
> >> https://github.com/openshift/origin-server/commit/11632afde2d6b407ef1a6fe217e31b9cc0f5ce88
> >> https://github.com/openshift/origin-server/commit/e17edc775d8debf9706c5a677e730084b5635b50
> >>
> >> If you're running in a non-production environment you could likely
> >> back up your version of /usr/sbin/oo-admin-ctl-district and just
> >> replace it with
> >> https://raw.githubusercontent.com/openshift/origin-server/master/broker-util/oo-admin-ctl-district
> >> to test.
> >>
> >> >
> >> >[root broker ~]# date
> >> >Thu Sep  4 17:36:56 EDT 2014
> >> >
> >> >[root broker ~]# oo-mco ping
> >> >node1.os.com                             time=76.51 ms
> >> >
> >> >
> >> >---- ping statistics ----
> >> >1 replies max: 76.51 min: 76.51 avg: 76.51
> >> >
> >> >[root node1 ~]# oo-accept-node -v
> >> >INFO: using default accept-node extensions
> >> >INFO: loading node configuration file /etc/openshift/node.conf
> >> >INFO: loading resource limit file /etc/openshift/resource_limits.conf
> >> >INFO: finding external network device
> >> >INFO: checking node public hostname resolution
> >> >INFO: checking selinux status
> >> >INFO: checking selinux openshift-origin policy
> >> >INFO: checking selinux booleans
> >> >INFO: checking package list
> >> >INFO: checking services
> >> >FAIL: service ruby193-mcollective not running
> >> >FAIL: Could not get SELinux context for ruby193-mcollective
> >> >INFO: checking kernel semaphores >= 512
> >> >INFO: checking cgroups configuration
> >> >INFO: checking cgroups processes
> >> >INFO: find district uuid: NONE
> >> >INFO: determining node uid range: 1000 to 6999
> >> >INFO: checking presence of tc qdisc
> >> >INFO: checking for cgroup filter
> >> >INFO: checking presence of tc classes
> >> >INFO: checking filesystem quotas
> >> >INFO: checking quota db file selinux label
> >> >INFO: checking 0 user accounts
> >> >INFO: checking application dirs
> >> >INFO: checking system httpd configs
> >> >INFO: checking cartridge repository
> >> >2 ERRORS
> >> >
> >> >[root node1 ~]# tail /var/log/openshift/node/ruby193-mcollective.log
> >> >D, [2014-09-04T17:36:57.886032 #11971] DEBUG -- : pluginmanager.rb:83:in
> >> >`[]' Returning cached plugin connector_plugin with class
> >> >MCollective::Connector::Activemq
> >> >D, [2014-09-04T17:36:57.886184 #11971] DEBUG -- : activemq.rb:362:in
> >> >`publish' Sending a broadcast message to ActiveMQ target
> >> >'/queue/mcollective.reply.broker.os.com_6176' with headers
> >> >'{"timestamp"=>"1409866617000", "expires"=>"1409866687000"}'
> >> >D, [2014-09-04T17:36:57.886400 #11971] DEBUG -- : runnerstats.rb:56:in
> >> >`block in sent' Incrementing replies stat
> >> >D, [2014-09-04T17:37:02.204937 #11971] DEBUG -- : pluginmanager.rb:83:in
> >> >`[]' Returning cached plugin security_plugin with class
> >> >MCollective::Security::Psk
> >> >D, [2014-09-04T17:37:02.205498 #11971] DEBUG -- : base.rb:178:in
> >> >`create_request' Encoding a request for agent 'registration' in
> >> >collective
> >> >mcollective with request id 2cd84c1bd9d35bb0bedd32a36a0ee0b9
> >> >D, [2014-09-04T17:37:02.205545 #11971] DEBUG -- : psk.rb:98:in `callerid'
> >> >Setting callerid to uid=0 based on callertype=uid
> >> >D, [2014-09-04T17:37:02.205606 #11971] DEBUG -- : base.rb:70:in `publish'
> >> >Sending registration 2cd84c1bd9d35bb0bedd32a36a0ee0b9 to collective
> >> >mcollective
> >> >D, [2014-09-04T17:37:02.205646 #11971] DEBUG -- : pluginmanager.rb:83:in
> >> >`[]' Returning cached plugin connector_plugin with class
> >> >MCollective::Connector::Activemq
> >> >D, [2014-09-04T17:37:02.205718 #11971] DEBUG -- : activemq.rb:362:in
> >> >`publish' Sending a broadcast message to ActiveMQ target
> >> >'/topic/mcollective.registration.agent' with headers
> >> >'{"timestamp"=>"1409866622000", "expires"=>"1409866692000",
> >> >"reply-to"=>"/queue/mcollective.reply.node1.os.com_11971"}'
> >> >D, [2014-09-04T17:37:04.503263 #11971] DEBUG -- : activemq.rb:169:in
> >> >`on_hbfire' Publishing heartbeat to
> >> >stomp://mcollective@broker.os.com:61613: send_fire,
> >> >{:curt=>1409866624.5028434, :last_sleep=>30.49942183494568}
> >> >
> >> >
> >> >Jason
> >> >
> >> >_______________________________________________
> >> >dev mailing list
> >> >dev lists openshift redhat com
> >> >http://lists.openshift.redhat.com/openshiftmm/listinfo/dev
> >>
> 

