
Re: OpenShift-v3 Beta3 -- BYO ansible playbook failure (hostname lookup ?)



On 05/05/15 17:50 +0200, Florian Daniel Otel wrote:
Just tried another run (clean system restored from snapshot)  but this time
using "openshift_public_hostname" for all the hosts  (master + 2 nodes) in
"/etc/ansible/hosts"  -- as per your indication below and earlier in this
thread.

Still, the same result -- breaks in the same place, the same way  :((

Again, this would have been "part and parcel of an OSS project" if it weren't for a high-profile customer that's eager to see this working ... I'm sure you understand :)

Anyway, I'll patiently wait for you to signal there's a fix / patch I can
try.

The fix I'm currently testing can be found in this PR: https://github.com/openshift/openshift-ansible/pull/199

I've replicated the issue on my end, and my testing so far seems to show that it is working. The only thing that should be holding the PR back currently is that I need to finish up some pylint-related cleanup and code review.

Please give the code from the PR a run and let me know if you run into any issues proceeding with the deployment using it.

--
Jason DeTiberus


Kind thanks yet again for the help.



On Tue, May 5, 2015 at 5:35 PM, Jason DeTiberus <jdetiber redhat com> wrote:


On 05/05/15 17:27 +0200, Florian Daniel Otel wrote:

Still, no luck :((


Unfortunately, you won't be able to make any progress until I finish up
the bug fix for openshift-ansible (specifically, the openshift-facts
module). I hope to finish up the fix today.


I amended my "/etc/ansible/hosts" to use "openshift_hostname" for all the nodes. I even set "openshift_public_hostname" for "master1" to a permanent Elastic IP (even if, again, this is not intended to be used).


You should be able to go ahead and set openshift_public_hostname to the internal hostnames (which is what I actually meant to suggest, rather than setting openshift_hostname -- that one may already be set properly, though the run isn't getting far enough into the playbook to tell).

Here is the gist with the whole log of the playbook run (using "-vvvv"), the content of "/etc/ansible/hosts", and the output of "--list-hosts" and "--list-tasks". I tried running the playbook from the "master1" node itself, not an external "jumphost":

https://gist.github.com/FlorianOtel/cab952b01150df01d0dc

(note: I edited out the Elastic IP of "master1").

One question (and here comes out my Ansible ignorance): is it normal that
"--list-hosts" lists "localhost" for all plays ?


It depends on the playbooks. In our case, we use "localhost" for some plays to allow for synchronizing content between hosts, aggregating groups of hosts for calling other playbooks, and setting "facts" to pass into other playbooks, tasks, roles, etc.

 Thanks again for trying to help -- after two days wasted on this, this is
rather frustrating....


My apologies for the frustration. My test systems have been using either
AWS Classic or a VPC that issues both internal and external hostnames as
well as external IPs, so I missed a few edge cases within the
openshift-facts module that processes the ec2 metadata for the system.

--
Jason


 On Tue, May 5, 2015 at 4:25 PM, Jason DeTiberus <jdetiber redhat com>
wrote:

 On 05/05/15 10:42 +0200, Florian Daniel Otel wrote:


 (follow-up)


 Jason, all,


I have now replicated the issue from a fourth, "jumpstart" host. The entire output of "ansible-playbook -vvvv openshift-ansible/playbooks/byo/config.yml" for this run is located here:
https://gist.github.com/FlorianOtel/d67b4d9a62a1ce3e1a57


 Thanks again,


 Florian





 On Mon, May 4, 2015 at 9:47 PM, Florian Daniel Otel <
florian otel gmail com>
wrote:


  Thanks Jason,



Not sure what "public hostname" you are referring to since this setup is completely isolated / self-contained:



For cloud environments we try to distinguish between publicly accessible hostnames and IP addresses and internal-only hostnames and IP addresses, with the assumption that users will want to access their OpenShift environment publicly (for some value of publicly). We use the instance metadata within the openshift-facts module to do this. In this case, we are incorrectly assuming that an instance always has values for the metadata items associated with public/private hostnames and IP addresses (which is not the case for certain VPC configurations).
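
To illustrate the kind of guard that is missing, here is a minimal sketch only (not the actual openshift-facts code; the function name and metadata key below are just for illustration):

# Minimal sketch, NOT the openshift-facts implementation: guard against EC2
# instance metadata that lacks a public hostname entry (as happens in some
# VPC configurations).
def choose_public_hostname(metadata, fallback_hostname):
    # dict.get() returns None instead of raising when the VPC does not
    # publish a 'public-hostname' metadata item.
    public_hostname = metadata.get('public-hostname')
    if isinstance(public_hostname, basestring) and public_hostname:
        return public_hostname
    # Fall back to the internal hostname (or an inventory override such as
    # openshift_public_hostname) when no public hostname is available.
    return fallback_hostname

# Example: a VPC instance with no public hostname falls back cleanly.
print(choose_public_hostname({'local-hostname': 'node1.nuage-vpc253.internal'},
                             'node1.nuage-vpc253.internal'))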


I have set up a DNS server on the same VPC subnet. It acts as zone master for my internal domain (in my case "nuage-vpc253.internal") + forwarder. This is a fourth host, in addition to my (intended...) 1 x master + 2 x nodes.



The entries in my "nuage-vpc253.internal" zone for my nodes are the VPC / subnet-local IP addresses.



(yes, these are DHCP addresses from the VPC subnet. And yes, I know that's wrong .. :)). However, they seem to persist throughout the instance lifetime and this setup is simply intended as a test environment, isolated from any outside use)



No issue with this in this case; the only thing is that you can rely only on the auto-detected value of openshift_hostname. The value for openshift_public_hostname will need to be overridden to account for the VPC configuration.


To override the detected public_hostname, your inventory should look like the following (trimmed "/etc/ansible/hosts"; each entry should be a single line, in case it gets wrapped in email):

# host group for masters
[masters]
master1.nuage-vpc253.internal openshift_public_hostname=master1.nuage-vpc253.internal

# host group for nodes
[nodes]
node1.nuage-vpc253.internal openshift_public_hostname=node1.nuage-vpc253.internal
node2.nuage-vpc253.internal openshift_public_hostname=node2.nuage-vpc253.internal


This will override the detected value with the specified value (once the bug has been fixed in openshift-ansible that is causing openshift-facts to choke on AWS metadata that does not contain an external or internal hostname).


I verified that both "hostname -s" and "hostname -f" return the correct entries, and both resolve nicely to internal IP addresses from that DNS server. This is from any other host in the VPC (all hosts in my VPC have their "/etc/resolv.conf" pointing to the internal DNS server instead of the AWS-provided DNS -- hence all internal names resolve correctly).



  LMK if there is any additional information you need.



  One last question:



Is there any issue with trying to deploy that from one of the nodes in the setup itself (i.e. the "master1" node)? Should I use another "jumpstart" host -- i.e. a fourth host, in addition to my 1 x master + 2 x nodes?



There *shouldn't* be an issue with installing from one of the hosts; however, we have had issues crop up from time to time, since most testing is done from a separate host. Anything preventing ansible from being run from one of the instances should definitely be considered a bug, though.



  Thanks for trying to help,



  Florian













  On Mon, May 4, 2015 at 6:33 PM, Jason DeTiberus <jdetiber redhat com>

wrote:



  On 04/05/15 11:54 +0200, Florian Daniel Otel wrote:



  Hello all,



I'm trying to set up an OpenShift-v3 Beta3 environment consisting of 3 hosts -- 1 master + 2 nodes, as follows (trimmed output of "/etc/ansible/hosts"):



  # host group for masters

[masters]
master1.nuage-vpc253.internal




  # host group for nodes

[nodes]
node1.nuage-vpc253.internal
node2.nuage-vpc253.internal




  <snip>



My problem: when, on the "master1" node, I try to run the BYO Ansible playbook (as per this GitHub repo -- https://github.com/detiber/openshift-ansible ), as follows:



  ansible-playbook -vvvv ./openshift-ansible/playbooks/byo/config.yml



  The playbook results in an error:




  <snip>



failed: [master1.nuage-vpc253.internal] => {"failed": true, "parsed": false}
Traceback (most recent call last):
  File "/root/.ansible/tmp/ansible-tmp-1430732251.8-74072282264480/openshift_facts", line 4981, in <module>
    main()
  File "/root/.ansible/tmp/ansible-tmp-1430732251.8-74072282264480/openshift_facts", line 461, in main
    openshift_facts = OpenShiftFacts(role, fact_file, local_facts)
  File "/root/.ansible/tmp/ansible-tmp-1430732251.8-74072282264480/openshift_facts", line 36, in __init__
    self.facts = self.generate_facts(local_facts)
  File "/root/.ansible/tmp/ansible-tmp-1430732251.8-74072282264480/openshift_facts", line 44, in generate_facts
    facts = self.apply_provider_facts(defaults, provider_facts, roles)
  File "/root/.ansible/tmp/ansible-tmp-1430732251.8-74072282264480/openshift_facts", line 142, in apply_provider_facts
    facts['common'][h_var] = self.choose_hostname([provider_facts['network'].get(h_var)], facts['common'][ip_var])
  File "/root/.ansible/tmp/ansible-tmp-1430732251.8-74072282264480/openshift_facts", line 164, in choose_hostname
    ips = [ i for i in hostnames if i is not None and re.match(r'\A\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\Z', i) ]
  File "/usr/lib64/python2.7/re.py", line 137, in match
    return _compile(pattern, flags).match(string)
TypeError: expected string or buffer




  <snip>



Now, if I'm reading that correctly, that's due to an error parsing the hostname (?). Again, here's the output on said host.




You are correct that it is an error in parsing a hostname, though the hostname I believe it is trying to parse is the AWS public hostname from the metadata (I'm assuming the VPC you are using is configured to not issue public hostnames).
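
For reference, the TypeError itself is just Python 2.7's re.match() rejecting a non-string argument; a minimal reproduction (None is only a stand-in here to trigger the same error, not necessarily the exact value the module saw) looks like:

# Python 2.7: re.match() raises "TypeError: expected string or buffer"
# whenever its second argument is not a string -- which is what happens
# when the hostname pulled from the metadata isn't actually a string.
import re

hostname = None  # stand-in for a missing/non-string metadata value
try:
    re.match(r'\A\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\Z', hostname)
except TypeError as err:
    print(err)  # -> expected string or buffer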



  I started working on a fix for this here:

https://github.com/openshift/openshift-ansible/pull/199



I'll pick back up on it this afternoon (as time permits) and verify that it does what it should do with different combinations of VPC configurations for hostnames and IP addresses.



One other thing to note is that if you are going to access your environment from outside of the VPC, then you will also need to provide the openshift_public_hostname setting for the hosts (as described here: https://github.com/openshift/training/blob/master/beta-3-setup.md#generic-cloud-install ), especially if using the generated self-signed certificates.



  --

Jason DeTiberus







