
Re: issue adding node to district



I saw something similar back in July: http://lists.openshift.redhat.com/openshift-archives/dev/2014-July/msg00217.html

Changing direct_addressing to 1 in /opt/rh/ruby193/root/etc/mcollective/server.cfg resolved it for me, although I was told the change might cause other problems with MCollective. We never got to the root cause, but I have not had any trouble since making the change.
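
For reference, it is a one-line edit (a sketch showing only the relevant setting; on my install the value was 0 before the change, and the rest of server.cfg is untouched):

    # /opt/rh/ruby193/root/etc/mcollective/server.cfg
    direct_addressing = 1

MCollective on the node then needs a restart (service ruby193-mcollective restart) to pick it up.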

On Tue, Sep 30, 2014 at 3:41 PM, Jason Marley <jmarley@redhat.com> wrote:
I wasn't able to resolve it in the community version. I switched gears back to OpenShift Enterprise and was able to get that all running.

Maybe try uninstalling and reinstalling mcollective/activemq?
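
Something along these lines, perhaps (package names are a guess based on the service names in this thread; confirm with yum list installed before running):

    yum reinstall ruby193-mcollective activemq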

Jason


----- Original Message -----
> Jason,
>
> I have the same issue. Were you able to resolve it?
>
> Regards,
> Anthony
>
>
> On Fri, Sep 5, 2014 at 11:56 PM, Brenton Leanhardt <bleanhar@redhat.com>
> wrote:
> > +++ Jason Marley [05/09/14 16:53 -0400]:
> >>
> >> Thanks for the quick reply, I was about to give up :).
> >>
> >> Definitely seems like something is wrong with mcollective, because no
> >> results are returned from either of those commands. I'll double-check
> >> how I configured it.
> >>
> >> Does cron run on both the node and the broker, or just the broker?
> >
> >
> > That's going to run on the Node.  The cronjob generates facts every
> > minute, which the Node then reports to the Broker.
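> >
> > A quick sanity check on the Node (the facts file path below is an
> > assumption based on typical Origin installs; the plugin.yaml setting
> > in server.cfg has the authoritative location):
> >
> >   ls -l /etc/cron.minutely/openshift-facts
> >   ls -l /opt/rh/ruby193/root/etc/mcollective/facts.yaml  # mtime should be under a minute old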
> >
> >
> >>
> >> [root@broker sbin]# oo-mco facts -v kernel
> >> Discovering hosts using the mc method for 2 second(s) .... 1
> >> Report for fact: kernel
> >>
> >>
> >> ---- rpc stats ----
> >>           Nodes: 1 / 0
> >>     Pass / Fail: 0 / 0
> >>      Start Time: 2014-09-05 16:49:27 -0400
> >>  Discovery Time: 2019.78ms
> >>      Agent Time: 12002.98ms
> >>      Total Time: 14022.76ms
> >>
> >>
> >> No response from:
> >>
> >>   node1.os.com
> >>
> >> [root@broker sbin]# oo-mco inventory node1.os.com
> >> Did not receive any results from node node1.os.com
> >>
> >> Jason
> >>
> >> ----- Original Message -----
> >>>
> >>> +++ Jason Marley [05/09/14 16:32 -0400]:
> >>> >It is a lab env; no dice using the latest oo-admin-ctl-district,
> >>> >although it went further. Any other thoughts?
> >>> >
> >>> >[root@broker sbin]# oo-admin-ctl-district -c add-node -n small_district -i node1.os.com
> >>> >
> >>> >
> >>> >{"_id"=>"54083b272e25cd0ccc000001",
> >>> > "uuid"=>"54083b272e25cd0ccc000001",
> >>> > "available_uids"=>"<6000 uids hidden>",
> >>> > "name"=>"small_district",
> >>> > "platform"=>"linux",
> >>> > "gear_size"=>"small",
> >>> > "available_capacity"=>6000,
> >>> > "max_uid"=>6999,
> >>> > "max_capacity"=>6000,
> >>> > "active_servers_size"=>0,
> >>> > "updated_at"=>2014-09-04 10:12:55 UTC,
> >>> > "created_at"=>2014-09-04 10:12:55 UTC}
> >>> >
> >>> >ERROR OUTPUT:
> >>> >Cannot connect to node.
> >>>
> >>> I'm betting something is slightly wrong with the MCollective setup
> >>> on that machine.  All 'oo-mco ping' tells you is that ActiveMQ is
> >>> set up properly and that the Broker at least knows the Node exists.
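> >>>
> >>> On the Node itself it's also worth confirming the service state
> >>> directly (service name taken from the oo-accept-node output quoted
> >>> further down):
> >>>
> >>>   service ruby193-mcollective status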
> >>>
> >>> Based on that error output you are running fairly recent code, since
> >>> that error message was introduced in the commits I sent previously.
> >>> That error is printed when it can't find the kernel fact for some
> >>> reason.
> >>>
> >>> What does this command return?
> >>>
> >>> oo-mco facts -v kernel
> >>>
> >>> Another good mcollective sanity check is:
> >>>
> >>> oo-mco inventory node1.os.com
> >>>
> >>> With facts, the most important thing to check is the value of
> >>> plugin.yaml in /opt/rh/ruby193/root/etc/mcollective/server.cfg (or
> >>> /etc/mcollective/server.cfg if you are using Fedora).  It needs to
> >>> match the path of the facts file generated by
> >>> /etc/cron.minutely/openshift-facts.  You also need to ensure that
> >>> cronjob is running.
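> >>>
> >>> For example (a sketch; the greps just surface the two values so you
> >>> can compare them by eye):
> >>>
> >>>   grep 'plugin.yaml' /opt/rh/ruby193/root/etc/mcollective/server.cfg
> >>>   grep 'yaml' /etc/cron.minutely/openshift-facts
> >>>
> >>> The paths in the two outputs should agree.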
> >>>
> >>> --Brenton
> >>>
> >>>
> >>> >
> >>> >[root@broker sbin]# oo-mco ping
> >>> >node1.os.com                             time=83.01 ms
> >>> >
> >>> >
> >>> >---- ping statistics ----
> >>> >1 replies max: 83.01 min: 83.01 avg: 83.01
> >>> >
> >>> >
> >>> >Jason
> >>> >
> >>> >----- Original Message -----
> >>> >> +++ Jason Marley [04/09/14 17:46 -0400]:
> >>> >> >Hi All,
> >>> >> >
> >>> >> >I'm having an issue adding a node to my district. When I run mco ping
> >>> >> >I can see my node, but when I add it I see a weird message. I verified
> >>> >> >that my broker was set up correctly, but my node is not. My node says
> >>> >> >mcollective is not running as a service, yet it is still listening for
> >>> >> >messages from the broker and responds to pings. When I run the node
> >>> >> >check it says mcollective is not running and that my SELinux contexts
> >>> >> >are not correct. Some other people have had this issue, but I haven't
> >>> >> >seen any resolution except for a bugzilla ticket
> >>> >> >(https://bugzilla.redhat.com/show_bug.cgi?id=1074553).
> >>> >> >
> >>> >> >any help would be appreciated.
> >>> >> >
> >>> >> >[root@broker ~]# oo-admin-ctl-district -c add-node -n small_district -i node1.os.com
> >>> >> >/usr/sbin/oo-admin-ctl-district:215:in `block in <main>': undefined method `casecmp' for nil:NilClass (NoMethodError)
> >>> >> >        from /usr/sbin/oo-admin-ctl-district:178:in `block in collate_errors'
> >>> >> >        from /usr/sbin/oo-admin-ctl-district:176:in `each'
> >>> >> >        from /usr/sbin/oo-admin-ctl-district:176:in `collate_errors'
> >>> >> >        from /usr/sbin/oo-admin-ctl-district:213:in `<main>'
> >>> >>
> >>> >> I'm not sure what version of the code you are running but see if your
> >>> >> version of /usr/sbin/oo-admin-ctl-district has the following changes:
> >>> >>
> >>> >>
> >>> >> https://github.com/openshift/origin-server/commit/11632afde2d6b407ef1a6fe217e31b9cc0f5ce88
> >>> >>
> >>> >> https://github.com/openshift/origin-server/commit/e17edc775d8debf9706c5a677e730084b5635b50
> >>> >>
> >>> >> If you're running in a non-production environment, you could likely
> >>> >> back up your version of /usr/sbin/oo-admin-ctl-district and just
> >>> >> replace it with
> >>> >> https://raw.githubusercontent.com/openshift/origin-server/master/broker-util/oo-admin-ctl-district
> >>> >> to test.
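> >>> >>
> >>> >> Something like this, for instance (a sketch; keep the backup handy
> >>> >> in case you need to roll back):
> >>> >>
> >>> >>   cp /usr/sbin/oo-admin-ctl-district /usr/sbin/oo-admin-ctl-district.bak
> >>> >>   curl -o /usr/sbin/oo-admin-ctl-district \
> >>> >>     https://raw.githubusercontent.com/openshift/origin-server/master/broker-util/oo-admin-ctl-district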
> >>> >>
> >>> >> >
> >>> >> >[root@broker ~]# date
> >>> >> >Thu Sep  4 17:36:56 EDT 2014
> >>> >> >
> >>> >> >[root@broker ~]# oo-mco ping
> >>> >> >node1.os.com                             time=76.51 ms
> >>> >> >
> >>> >> >
> >>> >> >---- ping statistics ----
> >>> >> >1 replies max: 76.51 min: 76.51 avg: 76.51
> >>> >> >
> >>> >> >[root@node1 ~]# oo-accept-node -v
> >>> >> >INFO: using default accept-node extensions
> >>> >> >INFO: loading node configuration file /etc/openshift/node.conf
> >>> >> >INFO: loading resource limit file /etc/openshift/resource_limits.conf
> >>> >> >INFO: finding external network device
> >>> >> >INFO: checking node public hostname resolution
> >>> >> >INFO: checking selinux status
> >>> >> >INFO: checking selinux openshift-origin policy
> >>> >> >INFO: checking selinux booleans
> >>> >> >INFO: checking package list
> >>> >> >INFO: checking services
> >>> >> >FAIL: service ruby193-mcollective not running
> >>> >> >FAIL: Could not get SELinux context for ruby193-mcollective
> >>> >> >INFO: checking kernel semaphores >= 512
> >>> >> >INFO: checking cgroups configuration
> >>> >> >INFO: checking cgroups processes
> >>> >> >INFO: find district uuid: NONE
> >>> >> >INFO: determining node uid range: 1000 to 6999
> >>> >> >INFO: checking presence of tc qdisc
> >>> >> >INFO: checking for cgroup filter
> >>> >> >INFO: checking presence of tc classes
> >>> >> >INFO: checking filesystem quotas
> >>> >> >INFO: checking quota db file selinux label
> >>> >> >INFO: checking 0 user accounts
> >>> >> >INFO: checking application dirs
> >>> >> >INFO: checking system httpd configs
> >>> >> >INFO: checking cartridge repository
> >>> >> >2 ERRORS
> >>> >> >
> >>> >> >[root@node1 ~]# tail /var/log/openshift/node/ruby193-mcollective.log
> >>> >> >D, [2014-09-04T17:36:57.886032 #11971] DEBUG -- : pluginmanager.rb:83:in `[]' Returning cached plugin connector_plugin with class MCollective::Connector::Activemq
> >>> >> >D, [2014-09-04T17:36:57.886184 #11971] DEBUG -- : activemq.rb:362:in `publish' Sending a broadcast message to ActiveMQ target '/queue/mcollective.reply.broker.os.com_6176' with headers '{"timestamp"=>"1409866617000", "expires"=>"1409866687000"}'
> >>> >> >D, [2014-09-04T17:36:57.886400 #11971] DEBUG -- : runnerstats.rb:56:in `block in sent' Incrementing replies stat
> >>> >> >D, [2014-09-04T17:37:02.204937 #11971] DEBUG -- : pluginmanager.rb:83:in `[]' Returning cached plugin security_plugin with class MCollective::Security::Psk
> >>> >> >D, [2014-09-04T17:37:02.205498 #11971] DEBUG -- : base.rb:178:in `create_request' Encoding a request for agent 'registration' in collective mcollective with request id 2cd84c1bd9d35bb0bedd32a36a0ee0b9
> >>> >> >D, [2014-09-04T17:37:02.205545 #11971] DEBUG -- : psk.rb:98:in `callerid' Setting callerid to uid=0 based on callertype=uid
> >>> >> >D, [2014-09-04T17:37:02.205606 #11971] DEBUG -- : base.rb:70:in `publish' Sending registration 2cd84c1bd9d35bb0bedd32a36a0ee0b9 to collective mcollective
> >>> >> >D, [2014-09-04T17:37:02.205646 #11971] DEBUG -- : pluginmanager.rb:83:in `[]' Returning cached plugin connector_plugin with class MCollective::Connector::Activemq
> >>> >> >D, [2014-09-04T17:37:02.205718 #11971] DEBUG -- : activemq.rb:362:in `publish' Sending a broadcast message to ActiveMQ target '/topic/mcollective.registration.agent' with headers '{"timestamp"=>"1409866622000", "expires"=>"1409866692000", "reply-to"=>"/queue/mcollective.reply.node1.os.com_11971"}'
> >>> >> >D, [2014-09-04T17:37:04.503263 #11971] DEBUG -- : activemq.rb:169:in `on_hbfire' Publishing heartbeat to stomp://mcollective@broker.os.com:61613: send_fire, {:curt=>1409866624.5028434, :last_sleep=>30.49942183494568}
> >>> >> >
> >>> >> >
> >>> >> >Jason
> >>> >> >
> >>> >>
> >>>
> >
>

_______________________________________________
dev mailing list
dev@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/dev

