
Re: issue adding node to district



Thanks, Charles,

That solved the problem.

I think it was a bug in the documentation:
http://openshift.github.io/documentation/oo_deployment_guide_comprehensive.html#configure-mcollective

However, it is already fixed in the master branch.

On Tue, Sep 30, 2014 at 11:25 PM, Charles Simpson <csimpson gmail com> wrote:
> I saw something similar back in July:
> http://lists.openshift.redhat.com/openshift-archives/dev/2014-July/msg00217.html
>
> Changing direct_addressing to 1 in
> /opt/rh/ruby193/root/etc/mcollective/server.cfg resolved the problem for me,
> but I was told the change might cause other problems with MCollective. That
> thread never reached a root cause, but I have not had any issues since
> making the configuration change.
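>
> For reference, the change is a single line in the MCollective server
> config, sketched here (path as given above; the restart command assumes
> the ruby193-mcollective service name seen in oo-accept-node output):
>
>   # /opt/rh/ruby193/root/etc/mcollective/server.cfg
>   direct_addressing = 1
>
>   # then restart MCollective on the node so it takes effect
>   service ruby193-mcollective restart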
>
> On Tue, Sep 30, 2014 at 3:41 PM, Jason Marley <jmarley redhat com> wrote:
>>
>> I wasn't able to resolve it in the community version. I switched gears back
>> to OpenShift Enterprise and was able to get that all running.
>>
>> Maybe try uninstalling and reinstalling MCollective/ActiveMQ?
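>>
>> Roughly something like this (a sketch; the exact package names are my
>> assumption for an RHEL/CentOS-style install, so verify with rpm -qa first):
>>
>>   # on the node
>>   yum reinstall ruby193-mcollective
>>
>>   # on the broker
>>   yum reinstall activemq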
>>
>> Jason
>>
>>
>> ----- Original Message -----
>> > Jason,
>> >
>> > I have the same issue. Were you able to resolve it?
>> >
>> > Regards,
>> > Anthony
>> >
>> >
>> > On Fri, Sep 5, 2014 at 11:56 PM, Brenton Leanhardt <bleanhar redhat com> wrote:
>> > > +++ Jason Marley [05/09/14 16:53 -0400]:
>> > >>
>> > >> Thanks for the quick reply, I was about to give up :).
>> > >>
>> > >> It definitely seems like something is wrong with MCollective, because
>> > >> no results are returned from either of those commands. I'll
>> > >> double-check how I configured it.
>> > >>
>> > >> Does the cronjob run on both the node and the broker, or just the broker?
>> > >
>> > >
>> > > That's going to run on the Node.  The cronjob generates facts every
>> > > minute that the Node will report to the Broker.
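>> > >
>> > > A quick sanity check on the Node (a sketch):
>> > >
>> > >   cat /etc/cron.minutely/openshift-facts   # the generator script should exist
>> > >   service crond status                     # cron itself must be running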
>> > >
>> > >
>> > >>
>> > >> [root broker sbin]# oo-mco facts -v kernel
>> > >> Discovering hosts using the mc method for 2 second(s) .... 1
>> > >> Report for fact: kernel
>> > >>
>> > >>
>> > >> ---- rpc stats ----
>> > >>           Nodes: 1 / 0
>> > >>     Pass / Fail: 0 / 0
>> > >>      Start Time: 2014-09-05 16:49:27 -0400
>> > >>  Discovery Time: 2019.78ms
>> > >>      Agent Time: 12002.98ms
>> > >>      Total Time: 14022.76ms
>> > >>
>> > >>
>> > >> No response from:
>> > >>
>> > >>   node1.os.com
>> > >>
>> > >> [root broker sbin]# oo-mco inventory node1.os.com
>> > >> Did not receive any results from node node1.os.com
>> > >>
>> > >> Jason
>> > >>
>> > >> ----- Original Message -----
>> > >>>
>> > >>> +++ Jason Marley [05/09/14 16:32 -0400]:
>> > >>> >It is a lab env; no dice using the latest oo-admin-ctl-district,
>> > >>> >although it went further. Any other thoughts?
>> > >>> >
>> > >>> >[root broker sbin]# oo-admin-ctl-district -c add-node -n small_district -i node1.os.com
>> > >>> >
>> > >>> >
>> > >>> >{"_id"=>"54083b272e25cd0ccc000001",
>> > >>> > "uuid"=>"54083b272e25cd0ccc000001",
>> > >>> > "available_uids"=>"<6000 uids hidden>",
>> > >>> > "name"=>"small_district",
>> > >>> > "platform"=>"linux",
>> > >>> > "gear_size"=>"small",
>> > >>> > "available_capacity"=>6000,
>> > >>> > "max_uid"=>6999,
>> > >>> > "max_capacity"=>6000,
>> > >>> > "active_servers_size"=>0,
>> > >>> > "updated_at"=>2014-09-04 10:12:55 UTC,
>> > >>> > "created_at"=>2014-09-04 10:12:55 UTC}
>> > >>> >
>> > >>> >ERROR OUTPUT:
>> > >>> >Cannot connect to node.
>> > >>>
>> > >>> I'm betting it's something a little wrong with the MCollective setup
>> > >>> on that machine.  All 'oo-mco ping' is doing is telling you that
>> > >>> ActiveMQ is set up properly and that the Broker at least knows the
>> > >>> Node exists.
>> > >>>
>> > >>> Based on that error output you are running fairly recent code, since
>> > >>> that error message was introduced in the commits I sent previously.
>> > >>> That error is printed when it can't find the kernel fact for some
>> > >>> reason.
>> > >>>
>> > >>> What does this command return?
>> > >>>
>> > >>> oo-mco facts -v kernel
>> > >>>
>> > >>> Another good mcollective sanity check is:
>> > >>>
>> > >>> oo-mco inventory node1.os.com
>> > >>>
>> > >>> With facts, the most important thing to check would be the value of
>> > >>> plugin.yaml in /opt/rh/ruby193/root/etc/mcollective/server.cfg (or
>> > >>> /etc/mcollective/server.cfg if you are using Fedora).  It needs to
>> > >>> match the facts file path that /etc/cron.minutely/openshift-facts
>> > >>> generates.  You also need to ensure that the cronjob is running.
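>> > >>>
>> > >>> A quick way to compare the two on the Node, as a sketch (it assumes
>> > >>> the cron script references the facts file by its literal .yaml path):
>> > >>>
>> > >>>   grep plugin.yaml /opt/rh/ruby193/root/etc/mcollective/server.cfg
>> > >>>   grep -o '/[^ ]*\.yaml' /etc/cron.minutely/openshift-facts
>> > >>>
>> > >>> The two paths should be identical, and the facts file itself should
>> > >>> be no more than a minute or two old if the cronjob is firing.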
>> > >>>
>> > >>> --Brenton
>> > >>>
>> > >>>
>> > >>> >
>> > >>> >[root broker sbin]# oo-mco ping
>> > >>> >node1.os.com                             time=83.01 ms
>> > >>> >
>> > >>> >
>> > >>> >---- ping statistics ----
>> > >>> >1 replies max: 83.01 min: 83.01 avg: 83.01
>> > >>> >
>> > >>> >
>> > >>> >Jason
>> > >>> >
>> > >>> >----- Original Message -----
>> > >>> >> +++ Jason Marley [04/09/14 17:46 -0400]:
>> > >>> >> >Hi All,
>> > >>> >> >
>> > >>> >> >I'm having an issue adding a node to my district. When I run mco
>> > >>> >> >ping I can see my node, but when I add it I see a weird message.
>> > >>> >> >I verified that my broker was set up correctly, but my node is
>> > >>> >> >not. My node says mcollective is not running as a service, but it
>> > >>> >> >is still listening for messages from the broker and responds to
>> > >>> >> >pings. When I run the node check it says mcollective is not
>> > >>> >> >running and that my SELinux contexts are not correct. Some other
>> > >>> >> >people have had this issue, but I haven't seen any resolution
>> > >>> >> >except for a bugzilla ticket
>> > >>> >> >(https://bugzilla.redhat.com/show_bug.cgi?id=1074553).
>> > >>> >> >
>> > >>> >> >Any help would be appreciated.
>> > >>> >> >
>> > >>> >> >[root broker ~]# oo-admin-ctl-district -c add-node -n small_district -i node1.os.com
>> > >>> >> >/usr/sbin/oo-admin-ctl-district:215:in `block in <main>': undefined method `casecmp' for nil:NilClass (NoMethodError)
>> > >>> >> >        from /usr/sbin/oo-admin-ctl-district:178:in `block in collate_errors'
>> > >>> >> >        from /usr/sbin/oo-admin-ctl-district:176:in `each'
>> > >>> >> >        from /usr/sbin/oo-admin-ctl-district:176:in `collate_errors'
>> > >>> >> >        from /usr/sbin/oo-admin-ctl-district:213:in `<main>'
>> > >>> >>
>> > >>> >> I'm not sure what version of the code you are running, but see if
>> > >>> >> your version of /usr/sbin/oo-admin-ctl-district has the following
>> > >>> >> changes:
>> > >>> >>
>> > >>> >> https://github.com/openshift/origin-server/commit/11632afde2d6b407ef1a6fe217e31b9cc0f5ce88
>> > >>> >> https://github.com/openshift/origin-server/commit/e17edc775d8debf9706c5a677e730084b5635b50
>> > >>> >> If you're running in a non-production environment, you could
>> > >>> >> likely back up your version of /usr/sbin/oo-admin-ctl-district and
>> > >>> >> just replace it with
>> > >>> >> https://raw.githubusercontent.com/openshift/origin-server/master/broker-util/oo-admin-ctl-district
>> > >>> >> to test.
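>> > >>> >>
>> > >>> >> For example (a sketch):
>> > >>> >>
>> > >>> >>   cp /usr/sbin/oo-admin-ctl-district /usr/sbin/oo-admin-ctl-district.bak
>> > >>> >>   curl -o /usr/sbin/oo-admin-ctl-district \
>> > >>> >>     https://raw.githubusercontent.com/openshift/origin-server/master/broker-util/oo-admin-ctl-district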
>> > >>> >>
>> > >>> >> >
>> > >>> >> >[root broker ~]# date
>> > >>> >> >Thu Sep  4 17:36:56 EDT 2014
>> > >>> >> >
>> > >>> >> >[root broker ~]# oo-mco ping
>> > >>> >> >node1.os.com                             time=76.51 ms
>> > >>> >> >
>> > >>> >> >
>> > >>> >> >---- ping statistics ----
>> > >>> >> >1 replies max: 76.51 min: 76.51 avg: 76.51
>> > >>> >> >
>> > >>> >> >[root node1 ~]# oo-accept-node -v
>> > >>> >> >INFO: using default accept-node extensions
>> > >>> >> >INFO: loading node configuration file /etc/openshift/node.conf
>> > >>> >> >INFO: loading resource limit file
>> > >>> >> > /etc/openshift/resource_limits.conf
>> > >>> >> >INFO: finding external network device
>> > >>> >> >INFO: checking node public hostname resolution
>> > >>> >> >INFO: checking selinux status
>> > >>> >> >INFO: checking selinux openshift-origin policy
>> > >>> >> >INFO: checking selinux booleans
>> > >>> >> >INFO: checking package list
>> > >>> >> >INFO: checking services
>> > >>> >> >FAIL: service ruby193-mcollective not running
>> > >>> >> >FAIL: Could not get SELinux context for ruby193-mcollective
>> > >>> >> >INFO: checking kernel semaphores >= 512
>> > >>> >> >INFO: checking cgroups configuration
>> > >>> >> >INFO: checking cgroups processes
>> > >>> >> >INFO: find district uuid: NONE
>> > >>> >> >INFO: determining node uid range: 1000 to 6999
>> > >>> >> >INFO: checking presence of tc qdisc
>> > >>> >> >INFO: checking for cgroup filter
>> > >>> >> >INFO: checking presence of tc classes
>> > >>> >> >INFO: checking filesystem quotas
>> > >>> >> >INFO: checking quota db file selinux label
>> > >>> >> >INFO: checking 0 user accounts
>> > >>> >> >INFO: checking application dirs
>> > >>> >> >INFO: checking system httpd configs
>> > >>> >> >INFO: checking cartridge repository
>> > >>> >> >2 ERRORS
>> > >>> >> >
>> > >>> >> >[root node1 ~]# tail /var/log/openshift/node/ruby193-mcollective.log
>> > >>> >> >D, [2014-09-04T17:36:57.886032 #11971] DEBUG -- : pluginmanager.rb:83:in `[]' Returning cached plugin connector_plugin with class MCollective::Connector::Activemq
>> > >>> >> >D, [2014-09-04T17:36:57.886184 #11971] DEBUG -- : activemq.rb:362:in `publish' Sending a broadcast message to ActiveMQ target '/queue/mcollective.reply.broker.os.com_6176' with headers '{"timestamp"=>"1409866617000", "expires"=>"1409866687000"}'
>> > >>> >> >D, [2014-09-04T17:36:57.886400 #11971] DEBUG -- : runnerstats.rb:56:in `block in sent' Incrementing replies stat
>> > >>> >> >D, [2014-09-04T17:37:02.204937 #11971] DEBUG -- : pluginmanager.rb:83:in `[]' Returning cached plugin security_plugin with class MCollective::Security::Psk
>> > >>> >> >D, [2014-09-04T17:37:02.205498 #11971] DEBUG -- : base.rb:178:in `create_request' Encoding a request for agent 'registration' in collective mcollective with request id 2cd84c1bd9d35bb0bedd32a36a0ee0b9
>> > >>> >> >D, [2014-09-04T17:37:02.205545 #11971] DEBUG -- : psk.rb:98:in `callerid' Setting callerid to uid=0 based on callertype=uid
>> > >>> >> >D, [2014-09-04T17:37:02.205606 #11971] DEBUG -- : base.rb:70:in `publish' Sending registration 2cd84c1bd9d35bb0bedd32a36a0ee0b9 to collective mcollective
>> > >>> >> >D, [2014-09-04T17:37:02.205646 #11971] DEBUG -- : pluginmanager.rb:83:in `[]' Returning cached plugin connector_plugin with class MCollective::Connector::Activemq
>> > >>> >> >D, [2014-09-04T17:37:02.205718 #11971] DEBUG -- : activemq.rb:362:in `publish' Sending a broadcast message to ActiveMQ target '/topic/mcollective.registration.agent' with headers '{"timestamp"=>"1409866622000", "expires"=>"1409866692000", "reply-to"=>"/queue/mcollective.reply.node1.os.com_11971"}'
>> > >>> >> >D, [2014-09-04T17:37:04.503263 #11971] DEBUG -- : activemq.rb:169:in `on_hbfire' Publishing heartbeat to stomp://mcollective broker os com:61613: send_fire, {:curt=>1409866624.5028434, :last_sleep=>30.49942183494568}
>> > >>> >> >
>> > >>> >> >
>> > >>> >> >Jason
>> > >>> >> >
>> > >>> >>
>> > >>>
>> > >
>> >
>>
>
>

