
Re: CNS heketi-cli replace node?



I have found that, after adding a new (4th) node, I can run the gluster deploy ‘config’ playbook again with some changes, with the ‘new’ node listed in my inventory file under new_node and new_gluster.  This created a new 4th node, and I was then able to fail out and delete the ‘simulated failed’ node (1).
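
As a rough sketch of the fail-out-and-delete sequence once the extra node is in place: the helper below only prints the heketi-cli steps in order and does not run them. The function name plan_node_removal and the HEKETI_URL variable are made up for illustration, and the device IDs are assumed to have been looked up beforehand (for example with heketi-cli node info). The step order follows my reading of the CNS "Deleting Node" doc and is not a verified procedure.

```shell
#!/bin/sh
# Dry-run sketch: print the heketi-cli steps for retiring one node, in order.
# Nothing is executed here; review the output and run each step by hand.
# HEKETI_URL and plan_node_removal are illustrative names, not part of heketi.

HEKETI_URL="${HEKETI_URL:-http://localhost:8080}"

# Usage: plan_node_removal NODE_ID [DEVICE_ID ...]
plan_node_removal() {
    node_id="$1"
    shift
    # Keep "$HEKETI_CLI_KEY" literal so the secret is never printed.
    auth="-s $HEKETI_URL --user admin --secret \"\$HEKETI_CLI_KEY\""

    # 1. Take the node offline so no new bricks are placed on it.
    printf '%s\n' "heketi-cli node disable $node_id $auth"
    # 2. Move existing bricks off the node; this needs another node to move them to.
    printf '%s\n' "heketi-cli node remove $node_id $auth"
    # 3. Delete each device that belonged to the node.
    for device_id in "$@"; do
        printf '%s\n' "heketi-cli device delete $device_id $auth"
    done
    # 4. Delete the now-empty node from the topology.
    printf '%s\n' "heketi-cli node delete $node_id $auth"
}
```

Calling plan_node_removal with the failed node's ID followed by its device IDs prints one command per line. Step 2 is the one that returned "No Replacement was found" for me when only 3 nodes existed, which is consistent with it needing somewhere to move the bricks.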

 

But IS this the right process?

 

Thanks,

 

Todd

 

From: "Walters, Todd" <Todd_Walters unigroup com>
Date: Wednesday, August 1, 2018 at 1:44 PM
To: "users lists openshift redhat com" <users lists openshift redhat com>
Subject: CNS heketi-cli replace node?

 

Does anyone know the proper procedure to replace a storage node in OpenShift?  I’ve not found a successful process. We have 3 ‘storage’ nodes in 3 AWS AZs.  If one node fails or we lose an AZ, we have to be able to replace the node. We’re on OpenShift Origin 3.9 and GlusterFS CNS.  We have 3 storage nodes, each with a gluster pod, and 1 heketi pod.

 

When I simulate a node failure and try to replace the node, I run into issues every time.  This has to work for the setup to be ‘production’ ready, but I cannot find a solution.  I’ve tried shutting the node down to simulate a failure, leaving it running, and adding a 4th node, then following the delete process defined in the CNS Gluster doc located here:

https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.3/html/container-native_storage_for_openshift_container_platform/ch12s02#Deleting_Node

 

I get this "Failed to remove device" error:
sh-4.4# heketi-cli node disable fb344a2ea889c7e25a772e747eeeec2a -s http://localhost:8080 --user admin --secret "$HEKETI_CLI_KEY"

Node fb344a2ea889c7e25a772e747eeeec2a is now offline

sh-4.4# heketi-cli node remove fb344a2ea889c7e25a772e747eeeec2a -s http://localhost:8080 --user admin --secret "$HEKETI_CLI_KEY"

Error: Failed to remove device, error: No Replacement was found for resource requested to be removed

 

Or I get an error that /var/lib/heketi/db is in a read-only state, and at that point I have to destroy the whole cluster and start again.

 

Please let me know if you have done this successfully, and what process you used.

 

Thank you

 

Todd



