
Re: ose 3 installation issue



Hi Brenton,

Thank you so much for your help and for clarifying things :) The good news is that I am now able to deploy it successfully with one master and two nodes. I cleaned up the environment as you suggested and tried again; it was just a certificate issue.

I can see that all three nodes are running and in the Ready state with "oc get nodes".



I am hitting one other issue, though: when I try to deploy the Docker registry using the command below, it doesn't get deployed:

oadm registry --config=/etc/openshift/master/admin.kubeconfig \
    --credentials=/etc/openshift/master/openshift-registry.kubeconfig \
    --images='registry.access.redhat.com/openshift3/ose-${component}:${version}'
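For reference, here is how I am checking on the deployment (a sketch using standard oc commands with the admin kubeconfig from above; the deployer pod name assumes the default docker-registry-1-deploy naming):

oc get pods -n default --config=/etc/openshift/master/admin.kubeconfig
oc logs docker-registry-1-deploy -n default --config=/etc/openshift/master/admin.kubeconfig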


The pod logs show:


F1003 02:56:56.522753       1 deployer.go:64] couldn't get deployment default/docker-registry-1: Get https://ip-.- -1.compute.internal:8443/api/v1/namespaces/default/replicationcontrollers/docker-registry-1: dial tcp: lookup ip-.- -1.compute.internal: no such host


and when I curl the URL shown in the pod logs, it returns:


curl -L https://ip-.- -1.compute.internal:8443/api/v1/namespaces/default/replicationcontrollers/docker-registry-1 --insecure

{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "User \"system:anonymous\" cannot get replicationcontrollers in project \"default\"",
  "reason": "Forbidden",
  "details": {
    "name": "docker-registry-1",
    "kind": "replicationcontrollers"
  },
  "code": 403
}
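If I understand it right, the 403 from curl is expected, since curl without a client certificate comes in as system:anonymous. With the admin kubeconfig the same object should be readable, e.g. (a sketch reusing the path from the oadm command above):

oc get rc docker-registry-1 -n default --config=/etc/openshift/master/admin.kubeconfig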


Any pointers? Do I need to log in and create a project first?

And how can I access this registry: registry.access.redhat.com/openshift3?

Thanks a lot again!





On Thu, Oct 1, 2015 at 9:15 PM, Brenton Leanhardt <bleanhar redhat com> wrote:
On Thu, Oct 1, 2015 at 11:33 AM, priyanka Gupta
<priyanka4openshift gmail com> wrote:
> Hi Brenton, thanks again. So nothing is required for this? I tried with the
> public hostname as well, but if I don't set up SSH keys it fails to connect to
> the hosts. Note that I am not using the Ansible scripts, just the installer via
> curl. Could you please clarify? Thanks a lot for helping continuously.

The installer must be able to connect via SSH to all the hosts
involved in the installation.  That's not an OpenShift requirement;
it's more of a side effect of how we're using Ansible.  I just wanted
to make sure you understood the difference between what is required
for installation and what is required at runtime for the environment,
because I was worried there was some confusion.  You're hitting a type
of runtime problem.  The Node is trying to self-register but it can't
reach the Master for some reason.

There are a lot of certificates involved in OpenShift and I wanted to
make clear that the ones that are likely causing the problem are _not_
the SSH certificates but rather the x509 certificates that the Node is
using to contact the Master's API.  That's why you'll want to look in
the openshift-node logs.
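For example, something like this should surface them (a quick sketch, assuming the node service logs to the journal, which it does by default):

journalctl -u openshift-node --no-pager | grep -iE 'certificate|x509|tls'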

>
>
> On Thursday, October 1, 2015, Brenton Leanhardt <bleanhar redhat com> wrote:
>>
>> On Thu, Oct 1, 2015 at 11:16 AM, priyanka Gupta
>> <priyanka4openshift gmail com> wrote:
>> >
>> > Hi Brenton,
>> >
>> > Thanks, I will try this way too. Just one thing: if you are not copying SSH
>> > keys to the hosts, how is it allowing access with the root user? I ran the
>> > utility from the master host, where I created SSH keys and copied them to
>> > the nodes so that the master can communicate with the nodes.
>> >
>> >
>> > Are you not doing this?
>>
>> By default the Master and Node OpenShift services actually
>> communicate via x509 certificates, which are generated on the Master
>> (technically you can provide your own certificates, but that's another
>> topic).  Ansible triggers the creation of the certificates on the
>> Master and then syncs them locally to the system running the installer,
>> which in turn pushes them to the systems that will run as Nodes.  SSH
>> connectivity is only required from the installer to all the systems.
>> In my example I actually didn't use the root user at all but instead
>> Ansible handled the sudo invocations.
>>
>> Where this can be tripped up today is if a previous installation
>> failed after the certificates were created.  We're working to provide
>> smarter playbooks that will handle this scenario, but for now you'll
>> have to manually clean things up.
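>> For what it's worth (just a sketch, assuming the default /etc/openshift
>> layout used elsewhere in this thread), the generated certificates that a
>> failed run leaves behind live under:
>>
>> ls /etc/openshift/master /etc/openshift/node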
>>
>> >
>> >
>> > Thanks,
>> > Priya
>> >
>> > On Thursday, October 1, 2015, Brenton Leanhardt <bleanhar redhat com>
>> > wrote:
>> >>
>> >> On Wed, Sep 30, 2015 at 11:05 PM, priyanka Gupta
>> >> <priyanka4openshift gmail com> wrote:
>> >> > Hi Brenton,
>> >> >
>> >> > Thanks. Yes, it looks correct to me. Below is the output; it shows the
>> >> > private IPs of the instances in "validated_facts", like:
>> >> >
>> >> > masters: [10.0.0.0]
>> >> > nodes: [10.0.0.0, 17.10.0.0]
>> >> > validated_facts:
>> >> >   private_ip_node1: {hostname: -.internal, ip: 10.0.0.9,
>> >> >     public_hostname: -.compute.amazonaws.com, public_ip: 50.0.0.1}
>> >> >   private_ip_master1: {hostname: ip--.compute.internal, ip: 10.0.0.0,
>> >> >     public_hostname: --.amazonaws.com, public_ip: 50.0.0.0}
>> >> >
>> >> >
>> >> > As you mentioned, I am supplying only the private IPs during installation.
>> >> > Here I am using 1 node and 1 master, but it shows two nodes (the master
>> >> > itself and the other defined node). Is that the expected behavior?
>> >>
>> >> If you are inputting the private IPs during installation, I'm assuming
>> >> you are running the installer from within Amazon.  I tend to run the
>> >> installer outside of Amazon.  Here's the exact configuration I just
>> >> used (note: these hosts are done now)
>> >>
>> >> masters: [ec2-54-85-68-36.compute-1.amazonaws.com]
>> >> nodes: [ec2-54-172-254-176.compute-1.amazonaws.com, ec2-54-85-68-36.compute-1.amazonaws.com]
>> >> validated_facts:
>> >>   ec2-54-172-254-176.compute-1.amazonaws.com: {hostname: ip-172-18-10-102.ec2.internal,
>> >>     ip: 172.18.10.102, public_hostname: ec2-54-172-254-176.compute-1.amazonaws.com,
>> >>     public_ip: 54.172.254.176}
>> >>   ec2-54-85-68-36.compute-1.amazonaws.com: {hostname: ip-172-18-3-233.ec2.internal,
>> >>     ip: 172.18.3.233, public_hostname: ec2-54-85-68-36.compute-1.amazonaws.com,
>> >>     public_ip: 54.85.68.36}
>> >>
>> >> I fed the public hostnames into the installer, and Ansible
>> >> correctly filled in everything else without me needing to change the
>> >> public/private IPs or hostnames.  After that the installer ran
>> >> successfully and I have a working environment in EC2.
>> >>
>> >> Here are my running nodes:
>> >>
>> >> [root ip-172-18-3-233 ~]# oc get nodes
>> >> NAME                            LABELS                                                  STATUS                     AGE
>> >> ip-172-18-10-102.ec2.internal   kubernetes.io/hostname=ip-172-18-10-102.ec2.internal   Ready                      27m
>> >> ip-172-18-3-233.ec2.internal    kubernetes.io/hostname=ip-172-18-3-233.ec2.internal    Ready,SchedulingDisabled   27m
>> >>
>> >> Thinking about this a little more, I'm wondering if you ran the
>> >> installation a few times and perhaps the certificates are now out of
>> >> sync and breaking subsequent installs.  You should check the logs on
>> >> the node that is failing to register.  Run 'journalctl -f -u
>> >> openshift-node' and then 'systemctl restart openshift-node' in another
>> >> shell.  Look for problems connecting to the master that are related to
>> >> certificates.
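>> >> In other words, in one shell:
>> >>
>> >>   journalctl -f -u openshift-node
>> >>
>> >> and then, in a second shell:
>> >>
>> >>   systemctl restart openshift-node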
>> >>
>> >> If this turns out to be the case, then you will likely need to clean up
>> >> your environment.  If you are familiar with Ansible, I would try
>> >> running the uninstall playbook:
>> >>
>> >>
>> >> https://github.com/openshift/openshift-ansible/blob/master/playbooks/adhoc/atomic_openshift_tutorial_reset.yml.
>> >> Otherwise you could follow the manual steps in the training
>> >> documentation we used for our beta.  These steps should still be
>> >> accurate:
>> >>
>> >>
>> >>
>> >> https://github.com/openshift/training/blob/master/deprecated/uninstall.md#uninstallation
>> >>
>> >> --Brenton
>> >>
>> >> >
>> >> > Do let me know if I need to check or configure something else.
>> >> >
>> >> >
>> >> >
>> >> >
>> >> > On Wed, Sep 30, 2015 at 11:27 PM, Brenton Leanhardt
>> >> > <bleanhar redhat com>
>> >> > wrote:
>> >> >>
>> >> >> If you look in ~/.config/openshift/installer.cfg.yml on the system
>> >> >> where you ran the installer, does the "validated_facts" section look
>> >> >> accurate?
>> >> >>
>> >> >> Here's an example:
>> >> >>
>> >> >> validated_facts:
>> >> >>   ose3-master.example.com: {hostname: ose3-master.example.com, ip: 192.168.133.2,
>> >> >>     public_hostname: ose3-master.example.com, public_ip: 192.168.133.2}
>> >> >>   ose3-node1.example.com: {hostname: ose3-node1.example.com, ip: 192.168.133.3,
>> >> >>     public_hostname: ose3-node1.example.com, public_ip: 192.168.133.3}
>> >> >>
>> >> >> The key is the name of the host from the installer's perspective.
>> >> >> Here are some additional notes from the installer's output when you're
>> >> >> asked to supply the values:
>> >> >>
>> >> >>  * The installation host is the hostname from the installer's
>> >> >> perspective.
>> >> >>  * The IP of the host should be the internal IP of the instance.
>> >> >>  * The public IP should be the externally accessible IP associated
>> >> >>    with the instance.
>> >> >>  * The hostname should resolve to the internal IP from the instances
>> >> >>    themselves.
>> >> >>  * The public hostname should resolve to the external IP from hosts
>> >> >>    outside of the cloud.
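>> >> >> A quick way to sanity-check those resolution rules (a sketch using getent;
>> >> >> substitute your own hostnames from the example above):
>> >> >>
>> >> >>   # On one of the instances, this should return the internal IP; run from
>> >> >>   # a machine outside the cloud, the public hostname should return the
>> >> >>   # public IP.
>> >> >>   getent hosts ose3-node1.example.com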
>> >> >>
>> >> >> On Wed, Sep 30, 2015 at 11:59 AM, priyanka Gupta
>> >> >> <priyanka4openshift gmail com> wrote:
>> >> >> > Hi,
>> >> >> >
>> >> >> > I am installing ose v3 using "sh <(curl -s
>> >> >> > https://install.openshift.com/ose/)". I am using a one-node,
>> >> >> > one-master setup. I have followed all the prerequisites on both
>> >> >> > hosts.
>> >> >> >
>> >> >> > But when I run the utility, it just fails at:
>> >> >> >
>> >> >> > TASK: [openshift_manage_node | Wait for Node Registration]
>> >> >> > ********************
>> >> >> > changed: [1--.10.0.898] =>
>> >> >> > (item=ip-1---10-0-00.--east-2.compute.internal)
>> >> >> > failed: [1--.10.0.898] =>
>> >> >> > (item=ip----10-0-00.--east-2.compute.internal)
>> >> >> > =>
>> >> >> > {"attempts": 10, "changed": true, "cmd": ["oc", "get", "node",
>> >> >> > "ip----10-0-00.--east-2.compute.internal"], "delta":
>> >> >> > "0:00:00.215592",
>> >> >> > "end": "2015-09-30 08:08:20.860288", "failed": true, "item":
>> >> >> > "ip----10-0-00.--east-2.compute.internal", "rc": 1, "start":
>> >> >> > "2015-09-30
>> >> >> > 08:08:20.644696", "warnings": []}
>> >> >> > stderr: Error from server: node
>> >> >> > "ip----10-0-00.--east-2.compute.internal"
>> >> >> > not found
>> >> >> > msg: Task failed as maximum retries was encountered
>> >> >> >
>> >> >> > FATAL: all hosts have already failed -- aborting
>> >> >> >
>> >> >> >
>> >> >> > 1--.10.0.898        : ok=151  changed=45   unreachable=0
>> >> >> > failed=1
>> >> >> >
>> >> >> > I have already googled similar issues and tried the suggested
>> >> >> > resolutions, but nothing worked, though I can ping each instance
>> >> >> > through its private/public IP. Here I am not using any DNS names,
>> >> >> > just IP addresses.
>> >> >> >
>> >> >> > Not sure what is going wrong.
>> >> >> >
>> >> >> >
>> >> >> > Can anyone take a look at it and tell me what could be the issue?
>> >> >> >
>> >> >> > Hope to get help soon.
>> >> >> >
>> >> >> > Thanks much in advance
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > _______________________________________________
>> >> >> > dev mailing list
>> >> >> > dev lists openshift redhat com
>> >> >> > http://lists.openshift.redhat.com/openshiftmm/listinfo/dev
>> >> >> >
>> >> >
>> >> >
>> >
>> >

