CSI Inline Volume Becomes Orphan After Kubelet Restart During Pod Termination
When a pod is terminating and the kubelet restarts or shuts down at the same time, a CSI inline (ephemeral) volume can be left as an orphan — kubelet skips both the unmount and the cleanup of the volume after it restarts. This post walks through the root cause and the fix.
Problem
When a pod is in the Terminating state and the kubelet is restarted, the following message appears in the kubelet log:
    kubelet: I reconciler.go:388] "Could not construct volume information, cleaning up mounts"
After kubelet restarts, it receives the SyncLoop DELETE event for the pod, but the underlying LVM logical volume is never removed — it becomes an orphan resource on the node.
Root Cause Analysis
1. reconstructVolume takes the wrong plugin lookup path
After kubelet restarts, it calls reconstructVolume to rebuild the in-memory volume state from existing mount paths on disk. The flow is:
    kubelet restart
A CSI inline (ephemeral) volume uses api.CSIVolumeSource, while a CSI PV uses api.CSIPersistentVolumeSource. The getPVSourceFromSpec function only handles the latter and errors out on the former.
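To make the type mismatch concrete, here is a minimal sketch. The types and the pvSourceFromSpec helper are simplified, hypothetical stand-ins written for this post, not the real Kubernetes definitions; they only model the behavior described above, where a lookup that expects a PV-backed source errors out on an inline spec.

```go
package main

import (
	"errors"
	"fmt"
)

// Simplified stand-ins for the two API source types (assumption: the real
// types carry many more fields).
type CSIVolumeSource struct{ Driver string }                         // inline (ephemeral) volume
type CSIPersistentVolumeSource struct{ Driver, VolumeHandle string } // PV-backed volume

// Spec is a hypothetical stand-in for a volume spec: exactly one field is set.
type Spec struct {
	Inline     *CSIVolumeSource
	Persistent *CSIPersistentVolumeSource
}

// pvSourceFromSpec models a helper that only understands PV-backed specs.
func pvSourceFromSpec(spec *Spec) (*CSIPersistentVolumeSource, error) {
	if spec != nil && spec.Persistent != nil {
		return spec.Persistent, nil
	}
	return nil, errors.New("CSIPersistentVolumeSource not defined in spec")
}

func main() {
	specs := []*Spec{
		{Persistent: &CSIPersistentVolumeSource{Driver: "example.csi.driver", VolumeHandle: "vol-1"}},
		{Inline: &CSIVolumeSource{Driver: "example.csi.driver"}},
	}
	for _, s := range specs {
		src, err := pvSourceFromSpec(s)
		if err != nil {
			// The inline spec ends up here: it has no persistent source.
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("PV source:", src.VolumeHandle)
	}
}
```

The inline spec is perfectly valid; it simply reaches a function that was never meant to see it.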
2. Inline volumes should go through GetUniqueVolumeNameFromSpecWithPod
For ephemeral volumes, the unique name must incorporate the pod UID, so the code should call GetUniqueVolumeNameFromSpecWithPod. The branching logic is:
    if attachablePlugin != nil || deviceMountablePlugin != nil {
A CSI inline volume should not have attachablePlugin or deviceMountablePlugin. However, the code uses FindAttachablePluginByName (lookup by plugin name) instead of FindDeviceMountablePluginBySpec (lookup by spec). This causes the inline volume to incorrectly enter the if branch and call GetUniqueVolumeNameFromSpec, which fails.
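The difference between the two lookup styles can be sketched as follows. Everything here is a hypothetical model: volumeSpec, findAttachableByName, findDeviceMountableBySpec and uniqueName are illustrative names, and the name formats are assumptions chosen only to show that the pod-scoped path includes the pod UID. It is not the kubelet implementation.

```go
package main

import "fmt"

const csiPluginName = "kubernetes.io/csi"

// volumeSpec is a hypothetical, simplified stand-in for a volume spec.
type volumeSpec struct {
	Name     string
	IsInline bool // true for a CSI inline (ephemeral) volume
}

// findAttachableByName models a lookup by plugin name: it cannot see the
// volume, so every CSI volume matches, inline or not.
func findAttachableByName(pluginName string) bool {
	return pluginName == csiPluginName
}

// findDeviceMountableBySpec models a lookup by spec: it can inspect the
// volume and report no match for an inline volume.
func findDeviceMountableBySpec(spec volumeSpec) bool {
	return !spec.IsInline
}

// uniqueName mirrors the branch: a matched plugin takes the spec-only path,
// otherwise the pod-scoped path is used.
func uniqueName(matched bool, podUID string, spec volumeSpec) (string, error) {
	if matched {
		// Stand-in for the spec-only path, which needs a PV source.
		if spec.IsInline {
			return "", fmt.Errorf("volume %q has no persistent volume source", spec.Name)
		}
		return fmt.Sprintf("%s/%s", csiPluginName, spec.Name), nil
	}
	// Stand-in for the pod-scoped path: the pod UID is part of the name.
	return fmt.Sprintf("%s/%s-%s", csiPluginName, podUID, spec.Name), nil
}

func main() {
	spec := volumeSpec{Name: "scratch", IsInline: true}

	_, err := uniqueName(findAttachableByName(csiPluginName), "pod-1234", spec)
	fmt.Println("lookup by name:", err)

	name, err := uniqueName(findDeviceMountableBySpec(spec), "pod-1234", spec)
	fmt.Println("lookup by spec:", name, err)
}
```

In this model the by-name lookup sends the inline volume down the spec-only path, which fails, while the by-spec lookup lets it fall through to the pod-scoped name.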
3. Consequence of reconstructVolume failure
    reconstructedVolume, err := rc.reconstructVolume(volume)
Because the pod is Terminating, it is not added to the Desired State of World (DSW), so volumeInDSW is false. Kubelet calls cleanupMounts → UnmountVolume. But at this point the container has not yet fully exited, so the CSI driver returns Aborted (NodeUnpublish is still in progress). This is the only cleanup opportunity — after the failure, kubelet does not retry, and the volume is left as an orphan.
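The one-shot nature of this cleanup is the crux, so here is a small model of it. The node type and syncAfterRestart function are hypothetical and written only for illustration; they assume, as described above, that a failed cleanup is logged and never retried.

```go
package main

import (
	"errors"
	"fmt"
)

// errAborted stands in for the CSI driver's Aborted response.
var errAborted = errors.New("aborted: operation already in progress")

// node is a hypothetical model of the state relevant to one volume.
type node struct {
	containerRunning bool
	volumeMounted    bool
	unmountAttempts  int
}

// unmount succeeds only once the container has exited.
func (n *node) unmount() error {
	n.unmountAttempts++
	if n.containerRunning {
		return errAborted
	}
	n.volumeMounted = false
	return nil
}

// syncAfterRestart models the reconstruction pass that runs after a restart.
func syncAfterRestart(n *node, reconstructErr error, volumeInDSW bool) {
	if reconstructErr == nil {
		return // volume is tracked again; the normal unmount flow handles it
	}
	if volumeInDSW {
		return // the pod still wants the volume; leave the mount alone
	}
	// One-shot cleanup: a failure is only reported, never retried.
	if err := n.unmount(); err != nil {
		fmt.Println("cleanup failed:", err)
	}
}

func main() {
	n := &node{containerRunning: true, volumeMounted: true}
	syncAfterRestart(n, errors.New("could not construct volume information"), false)

	// The container exits later, but nothing calls unmount again.
	n.containerRunning = false
	fmt.Printf("attempts=%d mounted=%v\n", n.unmountAttempts, n.volumeMounted)
}
```

In this model the volume stays mounted after exactly one failed attempt, which is the orphan state described above.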
Trigger conditions
All three must be true simultaneously:
- Pod is in Terminating state
- Kubelet restarts or crashes during pod termination
- reconstructVolume fails, and the pod is not in DSW
CSI Driver Side
While kubelet attempts cleanup, the local CSI driver also tries to execute NodeUnpublishVolume but fails because the LV is still in use by the container:
    CSI local driver: NodeUnpublishVolume
The LV cannot be removed because the container is still using the filesystem. NodeUnpublish returns an Internal error.
Fix
Upstream fix
This bug was fixed in Kubernetes 1.25: kubernetes/kubernetes#108997
The core change: in reconstructVolume, the plugin lookup for CSI inline volumes is changed to use FindDeviceMountablePluginBySpec instead of FindAttachablePluginByName. This correctly skips the attachablePlugin/deviceMountablePlugin check for inline volumes, routes them to GetUniqueVolumeNameFromSpecWithPod, and allows the volume to be properly rebuilt in ActualStateOfWorld so the normal unmount flow can proceed.
Workaround for older versions
Before the fix is rolled out, orphan LVs must be cleaned up manually:
    # 1. List all LVs to find orphans (those with no active mount)
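The "no active mount" comparison in the first step can be sketched as a small helper. This is only an illustration: findUnmounted is a hypothetical function, and the device paths are made-up inputs that you would collect yourself from your LVM and mount listings. An LV without a mount is only a candidate, so confirm that no pod or PV still references it before removing anything.

```go
package main

import (
	"fmt"
	"sort"
)

// findUnmounted returns the LV device paths that do not appear as a mount
// source. Both inputs are assumed to use the same path format.
func findUnmounted(lvPaths []string, mountSources map[string]bool) []string {
	var candidates []string
	for _, p := range lvPaths {
		if !mountSources[p] {
			candidates = append(candidates, p)
		}
	}
	sort.Strings(candidates) // stable output for review
	return candidates
}

func main() {
	// Hypothetical inputs, standing in for real LV and mount listings.
	lvs := []string{"/dev/vg0/lv-b", "/dev/vg0/lv-a", "/dev/vg0/lv-c"}
	mounted := map[string]bool{"/dev/vg0/lv-a": true}

	for _, p := range findUnmounted(lvs, mounted) {
		fmt.Println("orphan candidate:", p)
	}
}
```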
Summary
| Stage | Bug | Impact |
|---|---|---|
| Plugin lookup | FindAttachablePluginByName incorrectly matches inline volumes | reconstructVolume errors out |
| Path decision | Inline volume incorrectly enters GetUniqueVolumeNameFromSpec | Volume cannot be added to ASW |
| Cleanup window | cleanupMounts is called while container is still running | Guaranteed to fail |
The root cause is that CSI inline volumes share the same reconstructVolume code path as CSI PVs but require different plugin lookup semantics. Kubernetes 1.25 fixes this by differentiating the plugin lookup method based on the volume spec type.