
Re: Certificate Problem?



Hrm - if oc get pods is the thing that is causing the registry to be
unable to connect to the master, that implies something deeper (and
possibly more fundamental) about the network connections in play, or a
bad proxy or other component of the system.  I haven't ever seen a
failure like that that wasn't related to networking (it's very
unlikely there is anything in the master or client code triggering
this).  Do you have a firewall or other system daemon monitoring the
server?  Are packets being dropped anywhere?  Are your iptables rules
being managed by something else?
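
One way to check for intermittent drops would be to probe the master's
API port from the node in a loop while repeating 'oc get pods' on the
master. The following is only a minimal sketch, assuming bash and the
coreutils timeout command are available on the node. probe_tcp and
probe_loop are hypothetical helper names, not part of oc or OpenShift;
the host and port in the usage line are taken from the log quoted
below.

```shell
# Hypothetical helper: probe_tcp HOST PORT [TIMEOUT_SECONDS]
# Succeeds if a TCP connection to HOST:PORT opens within the timeout.
probe_tcp() {
  local host=$1 port=$2 secs=${3:-5}
  # bash's /dev/tcp pseudo-device opens a TCP socket;
  # timeout(1) bounds how long we wait for the connect.
  timeout "$secs" bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

# Hypothetical helper: probe_loop HOST PORT COUNT [INTERVAL_SECONDS]
# Probes COUNT times, prints one timestamped line per attempt and a
# summary line, and succeeds only if every attempt connected.
probe_loop() {
  local host=$1 port=$2 count=$3 interval=${4:-1}
  local failures=0 i=1
  while [ "$i" -le "$count" ]; do
    if probe_tcp "$host" "$port"; then
      echo "$(date +%T) ok   $host:$port"
    else
      echo "$(date +%T) FAIL $host:$port"
      failures=$((failures + 1))
    fi
    sleep "$interval"
    i=$((i + 1))
  done
  echo "failures: $failures/$count"
  [ "$failures" -eq 0 ]
}
```

Run on the node, for example, as "probe_loop master.sixtree.com 8443 30 1".
If FAIL lines only show up while 'oc get pods' is being repeated on the
master, that would point at the network path rather than the deployer.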

On Sat, Aug 22, 2015 at 4:47 PM, Justin Wood <justin wood sixtree co nz> wrote:
> OK, I played around with this a lot.  In my original situation I had mistakenly assigned only one core of my 8-core 2.2 GHz i7, and the registry stayed pending and the system generally misbehaved.  With two or four cores assigned, the registry gets created pretty quickly, unless you are impatient and keep running 'oc get pods' to see how things are going.  If you do that, it fails with "ExitCode:255".
>
> [root master ~]# oc logs docker-registry-1-deploy
> F0822 16:01:16.406786       1 deployer.go:64] couldn't get deployment default/docker-registry-1: Get
>
>
> e.g.
>
> [root master ~]# oadm registry --config=/etc/openshift/master/admin.kubeconfig     --credentials=/etc/openshift/master/openshift-registry.kubeconfig     --images='registry.access.redhat.com/openshift3/ose-${component}:${version}'
> deploymentconfigs/docker-registry
> services/docker-registry
> [root master ~]# docker ps
> CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
> [root master ~]# oc get pods
> NAME                       READY     STATUS    RESTARTS   AGE
> docker-registry-1-deploy   0/1       Running   0          8s
> [root master ~]# oc get pods
> NAME                       READY     STATUS    RESTARTS   AGE
> docker-registry-1-deploy   1/1       Running   0          16s
> [root master ~]# oc get pods
> NAME                       READY     STATUS    RESTARTS   AGE
> docker-registry-1-deploy   1/1       Running   0          18s
> [root master ~]# oc get pods
> NAME                       READY     STATUS    RESTARTS   AGE
> docker-registry-1-deploy   1/1       Running   0          23s
> [root master ~]# oc get pods
> NAME                       READY     STATUS    RESTARTS   AGE
> docker-registry-1-deploy   1/1       Running   0          29s
> [root master ~]# oc get pods
> NAME                       READY     STATUS    RESTARTS   AGE
> docker-registry-1-deploy   1/1       Running   0          31s
> [root master ~]# oc get pods
> NAME                       READY     STATUS    RESTARTS   AGE
> docker-registry-1-deploy   1/1       Running   0          33s
> [root master ~]# oc get pods
> NAME                       READY     STATUS         RESTARTS   AGE
> docker-registry-1-deploy   0/1       ExitCode:255   0          42s
> [root master ~]# oc build-logs docker-registry-1-deploy
> Error from server: build "docker-registry-1-deploy" not found
> [root master ~]# oc get pods
> NAME                       READY     STATUS         RESTARTS   AGE
> docker-registry-1-deploy   0/1       ExitCode:255   0          1m
> [root master ~]# oc logs docker-registry-1-deploy
> F0822 16:01:16.406786       1 deployer.go:64] couldn't get deployment default/docker-registry-1: Get https://master.sixtree.com:8443/api/v1/namespaces/default/replicationcontrollers/docker-registry-1: dial tcp: i/o timeout
>
>> On 21/08/2015, at 11:04 am, Justin Wood <justin wood sixtree co nz> wrote:
>>
>> That was a clever idea!  I exec'd into the registry deploy pod and hit that URL, but never got a response before it was killed.  Something I was doing perhaps kept it alive longer and, hey presto, my registry was created.  I then exec'd into the registry and it worked, but took a long time to return:
>>
>> [root master ~]# oc rsh docker-registry-1-8aycp
>> <.com:8443/api/v1/namespaces/default/replicationcontrollers/docker-registry-1
>> {
>>  "kind": "Status",
>>  "apiVersion": "v1",
>>  "metadata": {},
>>  "status": "Failure",
>>  "message": "User \"system:anonymous\" cannot get replicationcontrollers in project \"default\"",
>>  "reason": "Forbidden",
>>  "details": {
>>    "name": "docker-registry-1",
>>    "kind": "replicationcontrollers"
>>  },
>>  "code": 403
>>
>> I checked my VMs and they are a bit under-specced.  I’ll give them another core and some more RAM and let everyone know how that goes.
>>
>> Thanks for your help!
>> Justin
>>
>>> On 21/08/2015, at 10:29 am, Clayton Coleman <ccoleman redhat com> wrote:
>>>
>>> can you create a pod, exec into it, and then try pinging the master
>>> (to verify the pods can reach back to the master)?
>>>
>>> On Thu, Aug 20, 2015 at 6:17 PM, Justin Wood <justin wood sixtree co nz> wrote:
>>>> Yes.  I also get a successful response from the URL that’s timing out:
>>>>
>>>> [root node1 ~]# curl -k https://master.example.com:8443/api/v1/namespaces/default/replicationcontrollers/docker-registry-1
>>>> {
>>>> "kind": "Status",
>>>> "apiVersion": "v1",
>>>> "metadata": {},
>>>> "status": "Failure",
>>>> "message": "User \"system:anonymous\" cannot get replicationcontrollers in project \"default\"",
>>>> "reason": "Forbidden",
>>>> "details": {
>>>>   "name": "docker-registry-1",
>>>>   "kind": "replicationcontrollers"
>>>> },
>>>> "code": 403
>>>>
>>>> I’m looking for a way to bump the logging level up.
>>>>
>>>> Justin
>>>>
>>>>> On 21/08/2015, at 10:04 am, Clayton Coleman <ccoleman redhat com> wrote:
>>>>>
>>>>> Does master.example.com resolve from your node?  Is the IP address the
>>>>> same as your master instance?
>>>>>
>>>>> On Thu, Aug 20, 2015 at 5:48 PM, Justin Wood <justin wood example co nz> wrote:
>>>>>> Ok here’s what I get.
>>>>>>
>>>>>> [root master ~]# oc logs docker-registry-1-deploy
>>>>>> F0820 17:35:02.953324       1 deployer.go:64] couldn't get deployment default/docker-registry-1: Get https://master.example.com:8443/api/v1/namespaces/default/replicationcontrollers/docker-registry-1: dial tcp: i/o timeout
>>>>>>
>>>>>> [root master ~]# oc get pods
>>>>>> NAME                       READY     STATUS         RESTARTS   AGE
>>>>>> docker-registry-1-deploy   0/1       ExitCode:255   0          3m
>>>>>>
>>>>>>
>>>>>> Aug 21 09:14:18 master.example.com openshift-master[1466]: 2015/08/21 09:14:18 etcdserver: saved snapshot at index 20002
>>>>>> Aug 21 09:34:31 master.example.com openshift-master[1466]: I0821 09:34:31.317638 1466 controller.go:72] Ignoring change for DeploymentConfig default/docker-registry:1; no existing Deployment found
>>>>>> Aug 21 09:34:31 master.example.com openshift-master[1466]: I0821 09:34:31.702437 1466 factory.go:214] About to try and schedule pod docker-registry-1-deploy
>>>>>> Aug 21 09:34:31 master.example.com openshift-master[1466]: I0821 09:34:31.703204 1466 factory.go:312] Attempting to bind docker-registry-1-deploy to node1.example.com
>>>>>> Aug 21 09:34:33 master.example.com openshift-master[1466]: I0821 09:34:33.492440    1466 controller.go:85] Ignoring DeploymentConfig change for default/docker-registry:1 (latestVersion=1); same as Deployment default/docker-registry-1
>>>>>>
>>>>>> I took the firewall on node1 down, just for good measure, and tried again, but got the same result.
>>>>>>
>>>>>> Justin
>>>>>>
>>>>>>> On 21/08/2015, at 9:31 am, Clayton Coleman <ccoleman redhat com> wrote:
>>>>>>>
>>>>>>> Hrm, the TLS error may be a red herring.  Pull the logs for the deploy
>>>>>>> pod - oc logs docker-registry-1-deploy
>>>>>>>
>>>>>>> On Thu, Aug 20, 2015 at 5:29 PM, Justin Wood <justin wood example co nz> wrote:
>>>>>>>> Thanks Clayton.  This is what I have
>>>>>>>>
>>>>>>>> ...
>>>>>>>> serviceAccountConfig:
>>>>>>>>   managedNames:
>>>>>>>>   - default
>>>>>>>>   - builder
>>>>>>>>   - deployer
>>>>>>>>   masterCA: ca.crt
>>>>>>>>   privateKeyFile: serviceaccounts.private.key
>>>>>>>>   publicKeyFiles:
>>>>>>>>   - serviceaccounts.public.key
>>>>>>>> servingInfo:
>>>>>>>>   bindAddress: 0.0.0.0:8443
>>>>>>>>   certFile: master.server.crt
>>>>>>>>   clientCA: ca.crt
>>>>>>>>   keyFile: master.server.key
>>>>>>>>   maxRequestsInFlight: 500
>>>>>>>>   requestTimeoutSeconds: 3600
>>>>>>>> …
>>>>>>>>
>>>>>>>> and I was running the command as system:admin
>>>>>>>>
>>>>>>>> [root master ~]# oc whoami
>>>>>>>> system:admin
>>>>>>>>
>>>>>>>>
>>>>>>>> Cheers
>>>>>>>> Justin
>>>>>>>>
>>>>>>>>> On 21/08/2015, at 8:40 am, Clayton Coleman <ccoleman redhat com> wrote:
>>>>>>>>>
>>>>>>>>> Hrm, check that you have "masterCA" set under the serviceAccountConfig field in your master-config.yaml
>>>>>>>>>
>>>>>>>>> On Thu, Aug 20, 2015 at 4:05 PM, Justin Wood <justin wood example co nz> wrote:
>>>>>>>>> Hi All
>>>>>>>>>
>>>>>>>>> I just did a fresh install of OpenShift using this guide
>>>>>>>>>
>>>>>>>>> https://docs.openshift.com/enterprise/3.0/admin_guide/install/advanced_install.html
>>>>>>>>>
>>>>>>>>> and everything comes up as it should, but when I try to deploy a registry it fails.
>>>>>>>>>
>>>>>>>>> The logs indicate that I need to address some certificate issue.  Where do I add trusted certs, or how do I configure it to just use plain HTTP?
>>>>>>>>>
>>>>>>>>> Here are the logs
>>>>>>>>>
>>>>>>>>> Aug 20 19:26:18 master.example.com openshift-master[1466]: [676ns] [676ns] About to list directory
>>>>>>>>> Aug 20 19:26:18 master.example.com openshift-master[1466]: [819.978876ms] [819.9782ms] List extracted
>>>>>>>>> Aug 20 19:26:18 master.example.com openshift-master[1466]: [819.989248ms] [10.372µs] List filtered
>>>>>>>>> Aug 20 19:26:18 master.example.com openshift-master[1466]: [819.989814ms] [566ns] END
>>>>>>>>> Aug 20 19:26:18 master.example.com openshift-master[1466]: I0820 19:26:18.298101    1466 trace.go:57] Trace "List *api.PodList" (started 2015-08-20 19:26:17.394538848 +1200 NZST):
>>>>>>>>> Aug 20 19:26:18 master.example.com openshift-master[1466]: [490ns] [490ns] About to list directory
>>>>>>>>> Aug 20 19:26:18 master.example.com openshift-master[1466]: [903.534372ms] [903.533882ms] List extracted
>>>>>>>>> Aug 20 19:26:18 master.example.com openshift-master[1466]: [903.537414ms] [3.042µs] List filtered
>>>>>>>>> Aug 20 19:26:18 master.example.com openshift-master[1466]: [903.537779ms] [365ns] END
>>>>>>>>> Aug 20 19:26:19 master.example.com openshift-master[1466]: I0820 19:26:19.363015    1466 common.go:66] Self IP: 172.16.63.129.
>>>>>>>>> Aug 20 19:29:50 master.example.com openshift-master[1466]: I0820 19:29:50.900598 1466 controller.go:72] Ignoring change for DeploymentConfig default/docker-registry:1; no existing Deployment found
>>>>>>>>> Aug 20 19:29:51 master.example.com openshift-master[1466]: I0820 19:29:51.014624    1466 factory.go:214] About to try and schedule pod docker-registry-1-deploy
>>>>>>>>> Aug 20 19:29:51 master.example.com openshift-master[1466]: I0820 19:29:51.014842    1466 factory.go:312] Attempting to bind docker-registry-1-deploy to node1.example.com
>>>>>>>>> Aug 20 19:30:21 master.example.com openshift-master[1466]: I0820 19:30:21.843904 1466 controller.go:85] Ignoring DeploymentConfig change for default/docker-registry:1 (latestVersion=1); same as Deployment default/docker-registry-1
>>>>>>>>> Aug 20 19:32:22 master.example.com openshift-master[1466]: I0820 19:32:22.844859 1466 controller.go:85] Ignoring DeploymentConfig change for default/docker-registry:1 (latestVersion=1); same as Deployment default/docker-registry-1
>>>>>>>>>
>>>>>>>>> Aug 20 19:33:35 master.example.com openshift-master[1466]: 2015/08/20 19:33:35 http: TLS handshake error from 172.16.63.129:56385: remote error: unknown certificate authority
>>>>>>>>>
>>>>>>>>> Aug 20 19:34:23 master.example.com openshift-master[1466]: I0820 19:34:23.951961 1466 controller.go:85] Ignoring DeploymentConfig change for default/docker-registry:1 (latestVersion=1); same as Deployment default/docker-registry-1
>>>>>>>>> Aug 20 19:36:24 master.example.com openshift-master[1466]: I0820 19:36:24.873571 1466 controller.go:85] Ignoring DeploymentConfig change for default/docker-registry:1 (latestVersion=1); same as Deployment default/docker-registry-1
>>>>>>>>> Aug 20 19:37:03 master.example.com openshift-master[1466]: I0820 19:37:03.750158    1466 replication_controller.go:370] Replication Controller has been deleted default/docker-registry-1
>>>>>>>>> Aug 20 19:37:21 master.example.com openshift-master[1466]: I0820 19:37:21.932608    1466 controller.go:72] Ignoring change for DeploymentConfig default/docker-registry:1; no existing Deployment found
>>>>>>>>>
>>>>>>>>> Cheers
>>>>>>>>> Justin
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Clayton Coleman | Lead Engineer, OpenShift



-- 
Clayton Coleman | Lead Engineer, OpenShift

