
Re: Failure to detach Azure Disk in OpenShift 4.2.7 after 15 minutes



Did you run must-gather while it couldn’t detach?

Without deeper debug info from the interval when the detach was failing, it’s hard to say. If you can reproduce it and run must-gather, we might be able to find the cause.
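For reference, a minimal sketch of capturing that data while the pod is still stuck. It assumes the `oc` client is on the PATH and logged in as a cluster admin; the helper names and the timestamped directory prefix are hypothetical, chosen only for illustration.

```python
import subprocess
from datetime import datetime, timezone
from typing import List


def must_gather_command(dest_dir: str) -> List[str]:
    """Build the `oc adm must-gather` invocation.

    dest_dir is a hypothetical local directory for the collected data.
    """
    return ["oc", "adm", "must-gather", f"--dest-dir={dest_dir}"]


def collect_while_stuck(prefix: str = "must-gather-detach") -> str:
    """Run must-gather into a timestamped directory and return its name.

    Intended to be called while the volume is still failing to detach,
    so the collected logs cover the interval of interest.
    """
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    dest_dir = f"{prefix}-{stamp}"
    # check=True raises CalledProcessError if oc exits with a non-zero status.
    subprocess.run(must_gather_command(dest_dir), check=True)
    return dest_dir
```

Calling `collect_while_stuck()` during the failure window leaves a directory that can be archived and shared on the list.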

On Nov 24, 2019, at 10:25 PM, Joel Pearson <japearson agiledigital com au> wrote:

Hi,

I updated some machine config to configure chrony for masters and workers, and I found that one of my containers got stuck after the masters had restarted.

The container still couldn't start after 15 minutes, because its disk was still attached to master-2 whereas the pod had been scheduled on master-1.

In the end I manually detached the disk in the Azure console.

Is this a known issue? Or should I have waited for more than 15 minutes?

Maybe this happened because the masters restarted, so whatever is responsible for detaching the disk was restarted too, and there was no cleanup process to detach it from the original node? I'm not sure whether this is further complicated by the fact that my masters are also workers.

Here is the event information from the pod:

  Warning  FailedMount         57s (x8 over 16m)   kubelet, resource-group-prefix-master-1  Unable to mount volumes for pod "odoo-3-m9kxs_odoo(c0a31c68-0f2c-11ea-b695-000d3a970043)": timeout expired waiting for volumes to attach or mount for pod "odoo"/"odoo-3-m9kxs". list of unmounted volumes=[odoo-data]. list of unattached volumes=[odoo-1 odoo-data default-token-5d6x7]

  Warning  FailedAttachVolume  55s (x15 over 15m)  attachdetach-controller                       AttachVolume.Attach failed for volume "pvc-61f1ad81-0f24-11ea-8f8f-000d3a970df2" : Attach volume "resource-group-prefix-dynamic-pvc-61f1ad81-0f24-11ea-8f8f-000d3a970df2" to instance "resource-group-prefix-master-1" failed with compute.VirtualMachinesClient#CreateOrUpdate: Failure sending request: StatusCode=0 -- Original Error: autorest/azure: Service returned an error. Status=<nil> Code="ConflictingUserInput" Message="A disk with name resource-group-prefix-dynamic-pvc-61f1ad81-0f24-11ea-8f8f-000d3a970df2 already exists in Resource Group RESOURCE-GROUP-PREFIX-RG and is attached to VM /subscriptions/xxxx-xxx-xxxx-xxxx-xxxxx/resourceGroups/resource-group-prefix-rg/providers/Microsoft.Compute/virtualMachines/resource-group-prefix-master-2. 'Name' is an optional property for a disk and a unique name will be generated if not provided." Target="/subscriptions/xxxx-xxx-xxxx-xxxx-xxxxx/resourceGroups/resource-group-prefix-rg/providers/Microsoft.Compute/disks/resource-group-prefix-dynamic-pvc-61f1ad81-0f24-11ea-8f8f-000d3a970df2"
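The FailedAttachVolume message above already names both the node the disk is stuck on and the node that wants it. A small sketch that pulls those out of the message text follows. The function name and the regular expressions are assumptions based only on the wording of this one event, not on any documented message format.

```python
import re
from typing import Dict, Optional


def parse_attach_conflict(message: str) -> Optional[Dict[str, str]]:
    """Extract disk name, current holder and wanted node from the event text.

    Returns None if the message does not match the wording seen in this thread.
    """
    disk = re.search(r"A disk with name (\S+) already exists", message)
    holder = re.search(r"is attached to VM (\S+)", message)
    target = re.search(r'to instance "([^"]+)"', message)
    if not (disk and holder and target):
        return None
    # The VM resource ID ends the sentence, so drop the trailing period
    # and keep only the last path segment (the VM name).
    holder_id = holder.group(1).rstrip(".")
    return {
        "disk": disk.group(1),
        "attached_to": holder_id.rsplit("/", 1)[-1],
        "wanted_on": target.group(1),
    }
```

When `attached_to` and `wanted_on` differ, as they do here (master-2 versus master-1), the disk has to be detached from the first node before the attach can succeed.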

Thanks,

Joel
