
Re: [Logging] searchguard configuration issue? ["warning", "elasticsearch"], "pid":1, "message":"Unable to revive connection: https://logging-es:9200/"}



This looks a lot like this BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1449378, "Timeout after 30SECONDS while retrieving configuration"

What version of Origin are you using?

I found that I had to run the sgadmin script in each ES pod at the same time. When one run succeeded and the other failed, I ran it again and it worked.

It seems to be caused by the sgadmin script trying to make sure that all nodes can see the searchguard index. Since we create one index per node, if another node does not have searchguard set up successfully, the current node's setup will fail. Retrying at the same time on every node until they all succeed seems to be the fix. :(
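
In case it helps, here is a rough sketch of the "run it everywhere at once, and repeat until every pod succeeds" loop. It is only an illustration, not what the logging images do themselves. The helper names (run_in_pod, seed_all) are made up for this example, and the command list is a placeholder: substitute whatever sgadmin invocation your ES image uses. The only external call it relies on is `oc exec <pod> -- <command>`.

```python
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor


def run_in_pod(pod, command):
    """Run `command` (a list of strings) inside `pod` via `oc exec`.

    Returns True when the command exits with status 0.
    """
    result = subprocess.run(["oc", "exec", pod, "--", *command])
    return result.returncode == 0


def seed_all(pods, command, runner=run_in_pod, max_rounds=5, delay=10.0):
    """Start `command` in every pod at the same time, in repeated rounds.

    A round counts as successful only if the command succeeded in all pods.
    Returns the number of the first successful round, or None if no round
    succeeded within `max_rounds`.
    """
    if not pods:
        raise ValueError("no pods given")
    for round_no in range(1, max_rounds + 1):
        # One thread per pod so the runs overlap in time.
        with ThreadPoolExecutor(max_workers=len(pods)) as pool:
            results = list(pool.map(lambda pod: runner(pod, command), pods))
        if all(results):
            return round_no
        if round_no < max_rounds:
            time.sleep(delay)  # short pause before retrying every pod
    return None


# Example call (pod names taken from this thread; the command is a
# placeholder for your own sgadmin invocation):
#
#   seed_all(
#       ["logging-es-ne81bsny-5-jdcdk", "logging-es-x39myqbs-1-s5g7c"],
#       ["/path/to/your/sgadmin-wrapper"],
#   )
```

The runner argument exists only so the retry logic can be tried out without a cluster; by default it shells out to oc.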

-peter

On Wed, Jul 12, 2017 at 9:03 AM, Stéphane Klein <contact stephane-klein info> wrote:
Hi,

For the past day, after the ES cluster pods restarted, I have been getting this error message when logging-es starts:

$ oc logs -f logging-es-ne81bsny-5-jdcdk
Comparing the specificed RAM to the maximum recommended for ElasticSearch...
Inspecting the maximum RAM available...
ES_JAVA_OPTS: '-Dmapper.allow_dots_in_name=true -Xms128M -Xmx4096m'
Checking if Elasticsearch is ready on https://localhost:9200 ......................................Will connect to localhost:9300 ... done
Contacting elasticsearch cluster 'elasticsearch' and wait for YELLOW clusterstate ...
Clustername: logging-es
Clusterstate: YELLOW
Number of nodes: 2
Number of data nodes: 2
.searchguard.logging-es-ne81bsny-5-jdcdk index does not exists, attempt to create it ... done (with 1 replicas, auto expand replicas is off)
Populate config from /opt/app-root/src/sgconfig/
Will update 'config' with /opt/app-root/src/sgconfig/sg_config.yml
   SUCC: Configuration for 'config' created or updated
Will update 'roles' with /opt/app-root/src/sgconfig/sg_roles.yml
   SUCC: Configuration for 'roles' created or updated
Will update 'rolesmapping' with /opt/app-root/src/sgconfig/sg_roles_mapping.yml
   SUCC: Configuration for 'rolesmapping' created or updated
Will update 'internalusers' with /opt/app-root/src/sgconfig/sg_internal_users.yml
   SUCC: Configuration for 'internalusers' created or updated
Will update 'actiongroups' with /opt/app-root/src/sgconfig/sg_action_groups.yml
   SUCC: Configuration for 'actiongroups' created or updated
Timeout (java.util.concurrent.TimeoutException: Timeout after 30SECONDS while retrieving configuration for [config, roles, rolesmapping, internalusers, actiongroups](index=.searchguard.logging-es-x39myqbs-1-s5g7c))
Done with failures

After some time, my ES cluster (2 nodes) is green:

stephane$ oc rsh logging-es-x39myqbs-1-s5g7c bash
[...]asticsearch/secret/admin-cert https://localhost:9200/_cluster/health?pretty=true
{
  "cluster_name" : "logging-es",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 2,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 1643,
  "active_shards" : 3286,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}

I see this error in the Kibana container:

$ oc logs -f -c kibana logging-kibana-1-jblhl
{"type":"log","@timestamp":"2017-07-12T12:54:54Z","tags":["warning","elasticsearch"],"pid":1,"message":"No living connections"}
{"type":"log","@timestamp":"2017-07-12T12:54:57Z","tags":["warning","elasticsearch"],"pid":1,"message":"Unable to revive connection: https://logging-es:9200/"}

But from inside the Kibana container I can reach the Elasticsearch server:

$ oc rsh -c kibana logging-kibana-1-jblhl bash
$ curl https://logging-es:9200/ --cacert /etc/kibana/keys/ca --key /etc/kibana/keys/key --cert /etc/kibana/keys/cert
{
  "name" : "Adri Nital",
  "cluster_name" : "logging-es",
  "cluster_uuid" : "iRo3wOHWSq2bTZskrIs6Zg",
  "version" : {
    "number" : "2.4.4",
    "build_hash" : "fcbb46dfd45562a9cf00c604b30849a6dec6b017",
    "build_timestamp" : "2017-01-03T11:33:16Z",
    "build_snapshot" : false,
    "lucene_version" : "5.5.2"
  },
  "tagline" : "You Know, for Search"
}

How can I fix this error?

Best regards,
Stéphane