
NFS-based PV when used for aggregated logging elasticsearch storage

I am trying to set up aggregated logging (via https://docs.openshift.org/latest/install_config/aggregate_logging.html) on my OpenShift Origin cluster. The cluster was installed with the 'advanced' Ansible playbook and is using Origin v1.1.1 with kubernetes v1.1.0-origin-1107-g4c8e6f4. The one modification I've made to the aggregated logging setup from the docs is to use an NFS-based persistent volume and a corresponding persistentVolumeClaim. The NFS server's export, the PV, and the PVC are set up as outlined in https://docs.openshift.org/latest/install_config/persistent_storage/persistent_storage_nfs.html.

When the `logging-es-XXXX` pod gets created it immediately goes into a crash cycle with these errors in the log:

    [2016-02-08 23:53:33,745][INFO ][plugins                  ] [Dmitri Bukharin] loaded [searchguard, openshift-elasticsearch-plugin, cloud-kubernetes], sites []
    {1.5.2}: Initialization Failed ...
    - ElasticsearchIllegalStateException[Failed to created node environment]

Clearly, it is having trouble with the permissions or existence of the mounted persistentVolume. I can't `exec` into the real pod to verify anything since it crashes too quickly, but I did create a duplicate pod that runs a `sleep` instead of the default `run.sh` `CMD`. In this pod I can verify that the PV is properly mounted, and running `run.sh` (https://github.com/openshift/origin-aggregated-logging/blob/master/elasticsearch/Dockerfile#L32) by hand fails in the same way. If I modify this one-off pod definition again to remove the `runAsUser: 1000040000` from the container's `securityContext`, it seems to work (or at least does not fail outright):

    [master]# oc exec -ti logging-es-068wj8on-2-12xle-rcw bash
    bash-4.2$ id
    uid=1000 gid=0(root)

    bash-4.2$ ls -l /elasticsearch/persistent/
    total 0

    bash-4.2$ ls -ld /elasticsearch/persistent/
    drwxr-xr-x 2 nobody nobody 4096 Feb  8 18:29 /elasticsearch/persistent/

    bash-4.2$ /opt/app-root/src/run.sh
    [2016-02-09 00:03:13,831][INFO ][node                     ] [Captain Omen] version[1.5.2], pid[22], build[62ff986/2015-04-27T09:21:06Z]
    [2016-02-09 00:03:13,832][INFO ][node                     ] [Captain Omen] initializing ...
    [2016-02-09 00:03:14,882][INFO ][plugins                  ] [Captain Omen] loaded [searchguard, openshift-elasticsearch-plugin, cloud-kubernetes], sites []
    [2016-02-09 00:03:20,026][INFO ][node                     ] [Captain Omen] initialized
    [2016-02-09 00:03:20,027][INFO ][node                     ] [Captain Omen] starting ...
    [2016-02-09 00:03:20,265][INFO ][transport                ] [Captain Omen] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/]}
    [2016-02-09 00:03:20,303][INFO ][discovery                ] [Captain Omen] logging-es/Xpb-Cet6RaGdQdOgiAcM0w
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/usr/share/elasticsearch/plugins/cloud-kubernetes/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/usr/share/elasticsearch/plugins/openshift-elasticsearch-plugin/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
    [2016-02-09 00:03:23,516][INFO ][cluster.service          ] [Captain Omen] new_master [Captain Omen][Xpb-Cet6RaGdQdOgiAcM0w][logging-es-068wj8on-2-12xle-rcw][inet[/]], reason: zen-disco-join (elected_as_master)
    [2016-02-09 00:03:23,562][INFO ][http                     ] [Captain Omen] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/]}
    [2016-02-09 00:03:23,563][INFO ][node                     ] [Captain Omen] started
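
One way to separate a permissions problem from a missing mount, in a debug pod like the one above, is to try creating a file in the data directory as the container's user. The following is a minimal sketch, assuming a POSIX shell is available in the image; `probe_writable` is a hypothetical helper name and is not part of the logging image or of `oc`.

```shell
# Hypothetical helper: report whether the current user can create files
# in the given directory. Returns 0 if writable, 1 if not, 2 if missing.
probe_writable() {
    dir="$1"
    probe="$dir/.write-probe.$$"

    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 2
    fi

    # Creating a file is the real test; mode bits alone can mislead on NFS.
    if touch "$probe" 2>/dev/null; then
        rm -f "$probe"
        echo "writable: $dir (as uid $(id -u))"
        return 0
    fi

    echo "not writable: $dir (as uid $(id -u))"
    # Show numeric owner and group so they can be compared with `id`.
    ls -ldn "$dir"
    return 1
}
```

Calling `probe_writable /elasticsearch/persistent` once with the `runAsUser` setting in place and once without it would show whether the assigned UID is what prevents writes to the NFS export.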

I've run through the docs several times and rebuilt the project environment each time, but get the same result. Does anyone have an idea of where I might look next to figure out why using an NFS-based PV/PVC doesn't seem to work with the default Origin aggregated logging setup?


Robert Wehner
