
Re: NFS-based PV when used for aggregated logging elasticsearch storage

I know you said you ran through https://docs.openshift.org/latest/install_config/persistent_storage/persistent_storage_nfs.html#selinux-and-nfs-export-settings but did you set up SupplementalGroups as suggested in the last paragraph?

The NFS export and directory must be set up so that it is accessible by your pods. Either set the export to be owned by the container’s primary UID, or give your pod group-based access using SupplementalGroups. See Volume Security for more information.

The fact that it has access once `runAsUser` is removed from the one-off pod suggests that everything is OK except for file permissions, so unless you want to run it as root, you'll need to get the group access worked out.
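
For what it's worth, here is a rough sketch of what group-based access might look like on a test pod. Everything specific in it is an assumption on my part: the GID 5555, the pod name, the image placeholder, and the claim name are made up for illustration, and it presumes the export directory on the NFS server is group-owned by that GID and group-writable. Only the mount path comes from your output.

```yaml
# Sketch only. GID 5555, the names, and the image are placeholders, not values from this thread.
# Assumes the NFS export directory is group-owned by GID 5555 and group-writable (e.g. mode 0770).
apiVersion: v1
kind: Pod
metadata:
  name: es-nfs-test                  # hypothetical name
spec:
  securityContext:
    supplementalGroups: [5555]       # should match the group that owns the export
  containers:
  - name: elasticsearch
    image: your-elasticsearch-image  # placeholder, use the image from your logging-es pod
    command: ["sleep", "3600"]       # same trick as your one-off pod, so you can exec in
    volumeMounts:
    - name: es-storage
      mountPath: /elasticsearch/persistent
  volumes:
  - name: es-storage
    persistentVolumeClaim:
      claimName: logging-es-claim    # hypothetical claim name
```

If that works, running `id` inside the pod should list the extra group, and the same `securityContext` stanza would then go into the pod template of the logging-es deployment config rather than a standalone pod.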

On Mon, Feb 8, 2016 at 7:40 PM, Robert Wehner <robert wehner returnpath com> wrote:
I am trying to set up aggregated logging (via https://docs.openshift.org/latest/install_config/aggregate_logging.html) on my OpenShift Origin cluster. The cluster was installed with the 'advanced' Ansible playbook and is using Origin v1.1.1 with kubernetes v1.1.0-origin-1107-g4c8e6f4. The one modification I've made to the aggregated logging setup from the docs is to use an NFS-based persistent volume and a corresponding persistentVolumeClaim. The NFS server's export and the PV and PVC are set up as outlined in https://docs.openshift.org/latest/install_config/persistent_storage/persistent_storage_nfs.html.

When the `logging-es-XXXX` pod gets created it immediately goes into a crash cycle with these errors in the log:

    [2016-02-08 23:53:33,745][INFO ][plugins                  ] [Dmitri Bukharin] loaded [searchguard, openshift-elasticsearch-plugin, cloud-kubernetes], sites []
    {1.5.2}: Initialization Failed ...
    - ElasticsearchIllegalStateException[Failed to created node environment]

Clearly, it is having trouble with the permissions or existence of the mounted persistentVolume. I can't `exec` into the real pod to verify anything since it crashes too quickly, but I did create a duplicate pod that runs a `sleep` instead of the default `run.sh` `CMD`. In this pod I can verify that the PV is properly mounted, and running `run.sh` (https://github.com/openshift/origin-aggregated-logging/blob/master/elasticsearch/Dockerfile#L32) by hand fails in the same way. If I modify this one-off pod object definition again to remove the `runAsUser: 1000040000` from the `securityContext` for the container, it seems to work (or at least not fail outright).

    [master]# oc exec -ti logging-es-068wj8on-2-12xle-rcw bash
    bash-4.2$ id
    uid=1000 gid=0(root)

    bash-4.2$ ls -l /elasticsearch/persistent/
    total 0

    bash-4.2$ ls -ld /elasticsearch/persistent/
    drwxr-xr-x 2 nobody nobody 4096 Feb  8 18:29 /elasticsearch/persistent/

    bash-4.2$ /opt/app-root/src/run.sh
    [2016-02-09 00:03:13,831][INFO ][node                     ] [Captain Omen] version[1.5.2], pid[22], build[62ff986/2015-04-27T09:21:06Z]
    [2016-02-09 00:03:13,832][INFO ][node                     ] [Captain Omen] initializing ...
    [2016-02-09 00:03:14,882][INFO ][plugins                  ] [Captain Omen] loaded [searchguard, openshift-elasticsearch-plugin, cloud-kubernetes], sites []
    [2016-02-09 00:03:20,026][INFO ][node                     ] [Captain Omen] initialized
    [2016-02-09 00:03:20,027][INFO ][node                     ] [Captain Omen] starting ...
    [2016-02-09 00:03:20,265][INFO ][transport                ] [Captain Omen] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/]}
    [2016-02-09 00:03:20,303][INFO ][discovery                ] [Captain Omen] logging-es/Xpb-Cet6RaGdQdOgiAcM0w
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/usr/share/elasticsearch/plugins/cloud-kubernetes/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/usr/share/elasticsearch/plugins/openshift-elasticsearch-plugin/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
    [2016-02-09 00:03:23,516][INFO ][cluster.service          ] [Captain Omen] new_master [Captain Omen][Xpb-Cet6RaGdQdOgiAcM0w][logging-es-068wj8on-2-12xle-rcw][inet[/]], reason: zen-disco-join (elected_as_master)
    [2016-02-09 00:03:23,562][INFO ][http                     ] [Captain Omen] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/]}
    [2016-02-09 00:03:23,563][INFO ][node                     ] [Captain Omen] started

I've run through the docs several times and rebuilt the project environment each time, but get the same result. Does anyone have an idea of where I might look next to figure out why using an NFS-based PV/PVC doesn't seem to work with the default Origin aggregated logging setup?


Robert Wehner

users mailing list
users lists openshift redhat com
