
Re: OpenShift-v3 Beta3 -- BYO ansible playbook failure (hostname lookup ?)



Just tried another run (clean system restored from snapshot), but this time using "openshift_public_hostname" for all the hosts (master + 2 nodes) in "/etc/ansible/hosts", as per your indication below and earlier in this thread.

Still, the same result -- breaks in the same place, the same way  :(( 

Again, this would have been "part and parcel of an OSS project" if it weren't for a high-profile customer that's eager to see this working ... I'm sure you understand :)

Anyway, I'll patiently wait for you to signal there's a fix / patch I can try. 

Kind thanks yet again for the help. 



On Tue, May 5, 2015 at 5:35 PM, Jason DeTiberus <jdetiber redhat com> wrote:

On 05/05/15 17:27 +0200, Florian Daniel Otel wrote:
Still, no luck :((

Unfortunately, you won't be able to make any progress until I finish up the bug fix for openshift-ansible (specifically, the openshift-facts module). I hope to finish up the fix today.


I amended my "/etc/ansible/hosts" to use "openshift_hostname" for all the nodes. I even set "openshift_public_hostname" for the "master1" with a permanent Elastic IP (even if, again, this is not intended to be used).

You should be able to go ahead and set openshift_public_hostname to the internal hostnames (which is what I actually meant to reply with instead of setting openshift_hostname, which may actually be properly set, though it really isn't getting far enough into the playbook run to tell).

Here is the gist with the whole log of the playbook run (using "-vvvv"), the content of "/etc/ansible/hosts", and the output of "--list-hosts" and "--list-tasks". I tried running the playbook from the "master1" node itself, not an external "jumphost":

https://gist.github.com/FlorianOtel/cab952b01150df01d0dc

(note: I edited out the Elastic IP of the "master1").

One question (and here my Ansible ignorance shows): is it normal that "--list-hosts" lists "localhost" for all plays?

It depends on the playbooks; in our case we use "localhost" for some plays to allow for synchronizing content between hosts, aggregating groups of hosts for calling other playbooks, and setting "facts" to pass into other playbooks, tasks, roles, etc.

Thanks again for trying to help -- after two days wasted on this, this is
rather frustrating....

My apologies for the frustration. My test systems have been using either AWS Classic or a VPC that issues both internal and external hostnames as well as external IPs, so I missed a few edge cases within the openshift-facts module that processes the ec2 metadata for the system.

--
Jason


On Tue, May 5, 2015 at 4:25 PM, Jason DeTiberus <jdetiber redhat com> wrote:

On 05/05/15 10:42 +0200, Florian Daniel Otel wrote:

(follow-up)

Jason, all,

I have now replicated the issue from a fourth, "jumpstart" host. The entire output of "ansible-playbook -vvvv openshift-ansible/playbooks/byo/config.yml" for this run is located here:
https://gist.github.com/FlorianOtel/d67b4d9a62a1ce3e1a57

Thanks again,

Florian




On Mon, May 4, 2015 at 9:47 PM, Florian Daniel Otel <florian otel gmail com> wrote:

 Thanks Jason,


 Not sure what "public hostname" you are referring to since this setup is
completely isolated / self-contained:


For cloud environments we try to distinguish between publicly accessible
hostnames and IP addresses and internal only hostnames and IP addresses,
with the assumption that users will want to access their OpenShift
environment publicly (for some value of publicly).  We use the instance
metadata within the openshift-facts module to do this.  In this case, we
are incorrectly assuming that an instance always has values for the given
metadata item associated with public/private hostnames and ip addresses
(which is not the case for certain VPC configurations).
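
As an illustration only (this is not the openshift-facts code; get_metadata_item is a hypothetical helper), a Python 2 sketch of how an EC2 instance metadata lookup comes back empty when the VPC issues no public hostname:

# Hypothetical illustration, not the openshift-facts implementation:
# query the EC2 instance metadata service and return None for items the
# instance does not have (e.g. 'public-hostname' in a VPC that is not
# configured to issue public DNS hostnames).
import urllib2

METADATA_BASE = 'http://169.254.169.254/latest/meta-data/'

def get_metadata_item(item):
    try:
        return urllib2.urlopen(METADATA_BASE + item, timeout=2).read()
    except urllib2.URLError:
        # The metadata service answers 404 for items that do not exist
        # for this instance; treat that as "no value".
        return None

public_hostname = get_metadata_item('public-hostname')  # may be None here
local_hostname = get_metadata_item('local-hostname')    # internal EC2 hostname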

I have set up a DNS server on the same VPC subnet. It acts as zone master for my internal domain (in my case "nuage-vpc253.internal") + forwarder. This is a fourth host, in addition to my (intended...) 1 x master + 2 x nodes.


 The entries in my "nuage-vpc253.internal" zone  for my nodes are the VPC /
subnet-local IP addresses.


(yes, these are DHCP addresses from the VPC subnet. And yes, I know that's wrong .. :)). However, they seem to persist throughout the instance lifetime and this setup is simply intended as a test environment, isolated from any outside use)


No issue with this in this case; the only thing is that you can only rely on the auto-detected value of openshift_hostname. The value for openshift_public_hostname will need to be overridden to account for the VPC configuration.

To override the detected public_hostname, your inventory should look like the following (/etc/ansible/hosts, trimmed; each entry is a single line, in case it gets wrapped in email):

# host group for masters
[masters]
master1.nuage-vpc253.internal openshift_public_hostname=master1.nuage-vpc253.internal

# host group for nodes
[nodes]
node1.nuage-vpc253.internal openshift_public_hostname=node1.nuage-vpc253.internal
node2.nuage-vpc253.internal openshift_public_hostname=node2.nuage-vpc253.internal

This will override the detected value with the specified value (once the bug in openshift-ansible that is causing openshift-facts to choke on AWS metadata that does not contain an external or internal hostname has been fixed).
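
Purely as an illustration of that precedence (this is not the actual openshift-facts merging logic), a value supplied in the inventory wins over whatever was auto-detected:

# Hypothetical sketch of the override precedence described above, not
# the actual openshift-facts code: an inventory-supplied
# openshift_public_hostname replaces the auto-detected value.
detected = {'hostname': 'node1.nuage-vpc253.internal', 'public_hostname': None}
inventory_overrides = {'public_hostname': 'node1.nuage-vpc253.internal'}

facts = dict(detected)
facts.update((k, v) for k, v in inventory_overrides.items() if v)
print(facts['public_hostname'])  # -> node1.nuage-vpc253.internal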

I verified that both "hostname -s" and "hostname -f" return the correct entries, and both resolve nicely to internal IP addresses from that DNS server. This is the case from any other host in the VPC as well (all hosts in my VPC have their "/etc/resolv.conf" pointing to the internal DNS server instead of the AWS-provided DNS -- hence all internal names resolve correctly).


 LMK if there is any additional information you need.


 One last question:


Is there any issue with trying to deploy this from one of the hosts in the setup itself (i.e. the "master1" node)? Should I use another "jumpstart" host -- i.e. a fourth host, in addition to my 1 x master + 2 x nodes?


There *shouldn't* be an issue with installing from one of the hosts; however, we have had issues crop up from time to time since most testing is done from a separate host. Anything preventing ansible from being run from one of the instances should definitely be considered a bug, though.


 Thanks for trying to help,


 Florian












On Mon, May 4, 2015 at 6:33 PM, Jason DeTiberus <jdetiber redhat com> wrote:


 On 04/05/15 11:54 +0200, Florian Daniel Otel wrote:


 Hello all,


I'm trying to set up an OpenShift-v3 Beta3 environment consisting of 3 hosts -- 1 master + 2 nodes, as follows (trimmed output of "/etc/ansible/hosts"):


 # host group for masters
[masters]
master1.nuage-vpc253.internal



 # host group for nodes
[nodes]
node1.nuage-vpc253.internal
node2.nuage-vpc253.internal



 <snip>


My problem: when on the "master1" node I try to run the BYO Ansible playbook (as per this GitHub repo -- https://github.com/detiber/openshift-ansible), as follows:


 ansible-playbook -vvvv ./openshift-ansible/playbooks/byo/config.yml


 The playbook results in an error:



 <snip>


failed: [master1.nuage-vpc253.internal] => {"failed": true, "parsed": false}
Traceback (most recent call last):
  File "/root/.ansible/tmp/ansible-tmp-1430732251.8-74072282264480/openshift_facts", line 4981, in <module>
    main()
  File "/root/.ansible/tmp/ansible-tmp-1430732251.8-74072282264480/openshift_facts", line 461, in main
    openshift_facts = OpenShiftFacts(role, fact_file, local_facts)
  File "/root/.ansible/tmp/ansible-tmp-1430732251.8-74072282264480/openshift_facts", line 36, in __init__
    self.facts = self.generate_facts(local_facts)
  File "/root/.ansible/tmp/ansible-tmp-1430732251.8-74072282264480/openshift_facts", line 44, in generate_facts
    facts = self.apply_provider_facts(defaults, provider_facts, roles)
  File "/root/.ansible/tmp/ansible-tmp-1430732251.8-74072282264480/openshift_facts", line 142, in apply_provider_facts
    facts['common'][h_var] = self.choose_hostname([provider_facts['network'].get(h_var)], facts['common'][ip_var])
  File "/root/.ansible/tmp/ansible-tmp-1430732251.8-74072282264480/openshift_facts", line 164, in choose_hostname
    ips = [ i for i in hostnames if i is not None and re.match(r'\A\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\Z', i) ]
  File "/usr/lib64/python2.7/re.py", line 137, in match
    return _compile(pattern, flags).match(string)
TypeError: expected string or buffer



 <snip>


Now, if I'm reading that correctly, this is due to an error parsing the hostname (?). Again, here's the output on said host.



You are correct that it is an error in parsing a hostname, though the hostname I believe it is trying to parse is the AWS public hostname from the metadata (I'm assuming the VPC you are using is configured to not issue public hostnames).
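
To make the failure mode concrete, here is a minimal, hypothetical sketch (Python 2, to match the traceback above; pick_hostname is not the actual function in openshift_facts) of how a missing public hostname can trip the regex match, and how a guarded version would fall back to the detected address:

# A minimal, hypothetical sketch of the failure mode -- not the real
# openshift_facts module. When the VPC issues no public hostname, the
# value pulled from the instance metadata may not be a string at all
# (e.g. None or an empty structure), and on Python 2.7 re.match() then
# raises "TypeError: expected string or buffer".
import re

IPV4_RE = r'\A\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\Z'

def pick_hostname(candidates, fallback=''):
    # Defensive version: keep only real, non-empty strings before regex
    # matching, then prefer a hostname over a bare IPv4 address.
    usable = [c for c in candidates if isinstance(c, basestring) and c]
    ips = [c for c in usable if re.match(IPV4_RE, c)]
    names = [c for c in usable if c not in ips]
    if names:
        return names[0]
    if ips:
        return ips[0]
    return fallback

# A VPC without public hostnames yields no usable candidate, so the
# chooser falls back to the detected internal address:
print(pick_hostname([None], fallback='10.0.0.12'))  # -> 10.0.0.12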


 I started working on a fix for this here:
https://github.com/openshift/openshift-ansible/pull/199


I'll pick back up on it this afternoon (as time permits) and verify that it does what it should do with different combinations of VPC configurations for hostnames and IP addresses.


One other thing to note is that if you are going to access your environment from outside of the VPC, then you will also need to provide the openshift_public_hostname setting for the hosts (as described here: https://github.com/openshift/training/blob/master/beta-3-setup.md#generic-cloud-install), especially if using the generated self-signed certificates.


 --
Jason DeTiberus






