VM Volume Attacher Enhancement
Volume attach failed
- Duplicated disk target error
| 2021-09-28T16:00:03.816682107-07:00 stderr F I0928 23:00:03.816558 1 cephvolume.go:73] attach disk &{XMLName:{Space: Local:} Device:disk RawIO: SGIO: Snapshot: Model: Driver:0xc00073b420 Auth:0xc000f49e40 Source:0xc000159860 BackingStore:<nil> Geometry:<nil> BlockIO:<nil> Mirror:<nil> Target:0xc000a58b40 IOTune:0xc00019b970 ReadOnly:<nil> Shareable:<nil> Transient:<nil> Serial:pvc-33003998-6624-4ac9-a923-d94f9401abdf WWN: Vendor: Product: Encryption:<nil> Boot:<nil> Alias:<nil> Address:0xc001420b40} error: virError(Code=27, Domain=20, Message=’XML error: target ‘vdf’ duplicated for disk sources ‘volume-0aab375c-1858-4f09-b276-ea297cd29a3d’ and ‘volume-63ef92c4-a027-476c-a2de-9fcf501dd4de’’) <disk type=’network’ device=’disk’> <driver name=’qemu’ type=’raw’ cache=’none’ io=’native’/> <auth username=’cinder’> <secret type=’ceph’ uuid=’1bcf1a49-b42f-4bc2-6e70-9ea7a4006740’/> </auth> <source protocol=’rbd’ name=’volumes/volume-0aab375c-1858-4f09-b276-ea297cd29a3d’ > <host name=’10.166.77.141’ port=’6789’/> <host name=’10.78.225.48’ port=’6789’/> <host name=’10.33.212.26’ port=’6789’/> <host name=’10.212.255.52’ port=’6789’/> <host name=’10.164.134.166’ port=’6789’/> </source> <target dev=’vdf’ bus=’virtio’/> <iotune> <total_bytes_sec>157286400</total_bytes_sec> <total_iops_sec>300</total_iops_sec> </iotune> <serial>pvc-22650282-34fe-412c-a5a3-1df0bdb3cadd</serial> <alias name=’virtio-disk5’/> <address type=’pci’ domain=’0x0000’ bus=’0x00’ slot=’0x09’ function=’0x0’/> </disk> ➜ ~ k. 130 tSwitched to context “130”. ➜ ~ tk get pv pvc-22650282-34fe-412c-a5a3-1df0bdb3cadd Error from server (NotFound): persistentvolumes “pvc-22650282-34fe-412c-a5a3-1df0bdb3cadd” not found |
|---|
The live domain dump already contains a vdf device, but that device is not found in the persistent XML.
- Backend volume does not exist; the attach should report this explicitly
| I0721 03:39:04.965504 1 cephvolume.go:73] attach disk &{XMLName:{Space: Local:} Device:disk RawIO: SGIO: Snapshot: Model: Driver:0xc0015061c0 Auth:0xc00294c880 Source:0xc0004db310 BackingStore:<nil> Geometry:<nil> BlockIO:<nil> Mirror:<nil> Target:0xc00685fd00 IOTune:0xc0014fc8f0 ReadOnly:<nil> Shareable:<nil> Transient:<nil> Serial:pvc-20e1dd78-f543-40ae-bdb0-3eb74f0ffb1c WWN: Vendor: Product: Encryption:<nil> Boot:<nil> Alias:<nil> Address:0xc001508cc0} error: virError(Code=1, Domain=10, Message=’internal error: unable to execute QEMU command ‘device_add’: Property ‘virtio-blk-device.drive’ can’t find value ‘drive-virtio-disk15’’) |
|---|
Use rbd info to look up the volume first; if it is not found, report that error.
- List-watch lost an item; restarting the container recovered it:
| 2023-03-28T16:46:39.34066771-07:00 stderr F I0328 23:46:39.340535 1 streamwatcher.go:103] Unexpected EOF during watch stream event decoding: unexpected EOF 2023-03-28T16:46:39.340722015-07:00 stderr F I0328 23:46:39.340600 1 reflector.go:371] tess.io/ebay/vm-volume/pkg/controller/attach/attach_controller.go:285: Watch close - *v1.VmVolumeAttachment total 2 items received 2023-03-28T16:47:39.094070215-07:00 stderr F I0328 23:47:39.093952 1 reflector.go:371] tess.io/ebay/vm-volume/pkg/controller/attach/attach_controller.go:286: Watch close - *v1.Node total 1007 items received 2023-03-28T16:49:39.350681116-07:00 stderr F I0328 23:49:39.350572 1 streamwatcher.go:103] Unexpected EOF during watch stream event decoding: unexpected EOF 2023-03-28T16:49:39.35071423-07:00 stderr F I0328 23:49:39.350619 1 reflector.go:371] tess.io/ebay/vm-volume/pkg/controller/attach/attach_controller.go:285: Watch close - *v1.VmVolumeAttachment total 0 items received 2023-03-28T16:51:29.857528981-07:00 stderr F I0328 23:51:29.857397 1 attach_controller.go:453] VmVolumeAttachment pvc-891a1e4f-fe72-447d-ac29-efea9675bc51.tess-node-7fm8k is already attached phase 2023-03-28T16:51:29.857578201-07:00 stderr F I0328 23:51:29.857451 1 attach_controller.go:252] tess140/pvc-891a1e4f-fe72-447d-ac29-efea9675bc51.tess-node-7fm8k added to queue 2023-03-28T16:51:32.341806809-07:00 stderr F I0328 23:51:32.341711 1 attach_controller.go:453] VmVolumeAttachment pvc-891a1e4f-fe72-447d-ac29-efea9675bc51.tess-node-7fm8k is already attached phase 2023-03-28T16:51:32.341832149-07:00 stderr F I0328 23:51:32.341743 1 attach_controller.go:252] tess140/pvc-891a1e4f-fe72-447d-ac29-efea9675bc51.tess-node-7fm8k added to queue 2023-03-28T16:51:32.341835164-07:00 stderr F E0328 23:51:32.341758 1 attach_controller.go:323] vmvolumeattachment “tess140/pvc-891a1e4f-fe72-447d-ac29-efea9675bc51.tess-node-7fm8k” in work queue no longer exists 2023-03-28T16:54:35.827164294-07:00 stderr F I0328 23:54:35.827044 1 
streamwatcher.go:103] Unexpected EOF during watch stream event decoding: unexpected EOF 2023-03-28T16:54:35.827197159-07:00 stderr F I0328 23:54:35.827105 1 reflector.go:371] tess.io/ebay/vm-volume/pkg/controller/attach/attach_controller.go:285: Watch close - *v1.VmVolumeAttachment total 3 items received 2023-03-28T16:54:37.095045006-07:00 stderr F I0328 23:54:37.094917 1 reflector.go:371] tess.io/ebay/vm-volume/pkg/controller/attach/attach_controller.go:286: Watch close - *v1.Node total 1056 items received 2023-03-28T16:54:40.55082912-07:00 stderr F I0328 23:54:40.550708 1 attach_controller.go:453] VmVolumeAttachment pvc-d0c67ff7-aeb0-4e13-b373-507b335a67fb.tess-node-2nb47 is already attached phase |
|---|
All attachers hit the EOF at the same time.
- Double use of a PCI address
| I0330 00:28:55.070331 1 cephvolume.go:73] attach disk &{XMLName:{Space: Local:} Device:disk RawIO: SGIO: Snapshot: Model: Driver:0xc000024380 Auth:0xc006a51600 Source:0xc000640cd0 BackingStore:<nil> Geometry:<nil> BlockIO:<nil> Mirror:<nil> Target:0xc00342db80 IOTune:0xc005071c30 ReadOnly:<nil> Shareable:<nil> Transient:<nil> Serial:pvc-39c80157-0862-433c-a1ec-49475db818cf WWN: Vendor: Product: Encryption:<nil> Boot:<nil> Alias:<nil> Address:0xc000f5e240} error: virError(Code=27, Domain=20, Message=’XML error: Attempted double use of PCI Address 0000:00:0a.0’) |
|---|
It auto-recovered the first time: https://tessio.slack.com/archives/C03JP804D/p1670377925841209
The 0x0a address does not appear in the dumpxml output, but it is in the persistent file and is used by a PV. Detaching that PV failed because the disk was not found in the dump, as in the first case under "Volume detach failed".
The address 0000:00:0a.0 corresponds to this PCI address element:
<address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/>
So this is a slot conflict. The attacher needs to skip the occupied slot.
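One way to skip an occupied slot is sketched below, assuming the attacher can collect the PCI addresses from both the live dump and the persistent file. All names here (`pciAddress`, `parsePCIAddress`, `usedSlots`, `nextFreeSlot`) are hypothetical, and treating the union of both sources as "in use" is an assumption drawn from the case above, not a confirmed design.

```go
package main

import "fmt"

// maxSlot is the highest slot number on a conventional PCI bus (5-bit slot field).
const maxSlot = 0x1f

// pciAddress mirrors the attributes of libvirt's <address type='pci' .../> element.
type pciAddress struct {
	Domain, Bus, Slot, Function uint
}

// parsePCIAddress parses the "domain:bus:slot.function" hex form, e.g. "0000:00:0a.0".
func parsePCIAddress(s string) (pciAddress, error) {
	var a pciAddress
	if _, err := fmt.Sscanf(s, "%x:%x:%x.%x", &a.Domain, &a.Bus, &a.Slot, &a.Function); err != nil {
		return pciAddress{}, fmt.Errorf("parsing PCI address %q: %w", s, err)
	}
	return a, nil
}

// XML renders the address as a libvirt <address> element.
func (a pciAddress) XML() string {
	return fmt.Sprintf("<address type='pci' domain='0x%04x' bus='0x%02x' slot='0x%02x' function='0x%x'/>",
		a.Domain, a.Bus, a.Slot, a.Function)
}

// usedSlots merges the slots on the given bus from several address lists,
// for example one list from the live dump and one from the persistent file.
func usedSlots(bus uint, lists ...[]string) (map[uint]bool, error) {
	used := make(map[uint]bool)
	for _, list := range lists {
		for _, s := range list {
			a, err := parsePCIAddress(s)
			if err != nil {
				return nil, err
			}
			if a.Bus == bus {
				used[a.Slot] = true
			}
		}
	}
	return used, nil
}

// nextFreeSlot returns the lowest slot in [start, maxSlot] that is not marked used.
func nextFreeSlot(used map[uint]bool, start uint) (uint, error) {
	for slot := start; slot <= maxSlot; slot++ {
		if !used[slot] {
			return slot, nil
		}
	}
	return 0, fmt.Errorf("no free PCI slot at or above 0x%02x", start)
}

func main() {
	live := []string{"0000:00:09.0"}       // addresses seen in the live dump
	persistent := []string{"0000:00:0a.0"} // addresses seen only in the persistent file
	used, err := usedSlots(0, live, persistent)
	if err != nil {
		panic(err)
	}
	slot, err := nextFreeSlot(used, 0x09)
	if err != nil {
		panic(err)
	}
	fmt.Println(pciAddress{Slot: slot}.XML())
}
```

In this sketch, slot 0x0a is skipped even though the live dump does not show it, because the persistent file still claims it.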
Volume detach failed
- Volume exists in the persistent file but not in the dump
| I0330 00:38:54.615254 1 cephvolume.go:227] detach disk &{XMLName:{Space: Local:disk} Device:disk RawIO: SGIO: Snapshot: Model: Driver:0xc000736540 Auth:0xc0027b68a0 Source:0xc000641040 BackingStore:<nil> Geometry:<nil> BlockIO:<nil> Mirror:<nil> Target:0xc004557640 IOTune:0xc0006e4370 ReadOnly:<nil> Shareable:<nil> Transient:<nil> Serial:pvc-febae406-15ad-4d05-9c93-b2d09c197840 WWN: Vendor: Product: Encryption:<nil> Boot:<nil> Alias:<nil> Address:0xc00709cba0} error: virError(Code=9, Domain=10, Message=’operation failed: disk vde not found’) |
|---|
This case cannot return nil directly because the device still exists in the VM. Returning success at this point could leave the system in a complicated, inconsistent state. The attacher retries the check and automatically returns success once the volume is actually detached.
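The retry check could look roughly like the sketch below. It assumes "actually detached" means the disk's serial is absent from both the live dump and the persistent XML; the source does not spell out the exact criterion, so treat that rule and the names `diskPresent` and `detachComplete` as hypothetical.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// domainDoc captures only the parts of a libvirt domain XML needed for the check.
type domainDoc struct {
	Devices struct {
		Disks []struct {
			Serial string `xml:"serial"`
		} `xml:"disk"`
	} `xml:"devices"`
}

// diskPresent reports whether a disk with the given serial appears in the domain XML.
func diskPresent(doc, serial string) (bool, error) {
	var d domainDoc
	if err := xml.Unmarshal([]byte(doc), &d); err != nil {
		return false, fmt.Errorf("parsing domain XML: %w", err)
	}
	for _, disk := range d.Devices.Disks {
		if disk.Serial == serial {
			return true, nil
		}
	}
	return false, nil
}

// detachComplete returns true only when the serial is gone from both documents.
// Assumption: this is the condition under which the attacher may report success.
func detachComplete(liveXML, persistentXML, serial string) (bool, error) {
	inLive, err := diskPresent(liveXML, serial)
	if err != nil {
		return false, err
	}
	inPersistent, err := diskPresent(persistentXML, serial)
	if err != nil {
		return false, err
	}
	return !inLive && !inPersistent, nil
}

func main() {
	live := `<domain><devices></devices></domain>`
	persistent := `<domain><devices><disk type='network' device='disk'>` +
		`<target dev='vde' bus='virtio'/>` +
		`<serial>pvc-febae406-15ad-4d05-9c93-b2d09c197840</serial>` +
		`</disk></devices></domain>`
	done, err := detachComplete(live, persistent, "pvc-febae406-15ad-4d05-9c93-b2d09c197840")
	fmt.Println(done, err)
}
```

Here the disk is missing from the live dump but still present in the persistent XML, so the check reports that the detach is not complete and the caller would retry.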
- Volume exists in the dump and on the host, but the VM is not running
| I0417 07:57:33.764883 1 iscsivolume.go:176] detach disk &{XMLName:{Space: Local:disk} Device:disk RawIO: SGIO: Snapshot: Model: Driver:0xc004d3b0a0 Auth:<nil> Source:0xc000a66b90 BackingStore:<nil> Geometry:<nil> BlockIO:<nil> Mirror:<nil> Target:0xc009aefdc0 IOTune:<nil> ReadOnly:<nil> Shareable:<nil> Transient:<nil> Serial:pvc-9ad2511a-d8d9-4b14-b285-7acb4ef33800 WWN: Vendor: Product: Encryption:<nil> Boot:<nil> Alias:<nil> Address:0xc0048ce780} error: virError(Code=55, Domain=20, Message=’Requested operation is not valid: domain is not running’) root@tess-node-bf7rf-tess93:/# virsh list --all Id Name State -——————————————————- - virtlet-19c7472c-40a1-tess-node-69wjv shut off |
|---|