Fixing an AB-BA Deadlock in SCST User-Space Device Handler
Background
The SCST (SCSI Target) framework provides a user-space device handler (scst_user) that allows implementing SCSI target devices in userspace. Recently, we encountered a critical issue where the system would hang during cleanup when the userspace handler crashed or was killed while I/O operations were in progress.
The symptom appeared as an infinite loop in the cleanup code, requiring workarounds like forcing cleanup after 1000 iterations or even triggering a kernel panic. However, these were just treating the symptoms, not the root cause.
The Symptom: Infinite Loop in Cleanup
The cleanup process dev_user_process_cleanup() would get stuck in an infinite loop when trying to clean up “zombie” commands - commands that remained in the ucmd_hash table but were not in the ready_cmd_list. The code would repeatedly call dev_user_unjam_dev() trying to clean these up, but never succeed.
Previous workarounds included:
- Panic after 10,000 iterations - Crash the system to prevent soft lockup
- Force cleanup after 1,000 iterations - Manipulate reference counts and forcefully remove commands
While these prevented the system from hanging forever, they didn’t address why the cleanup was failing in the first place.
The Real Problem: AB-BA Deadlock
After deep analysis of the lock ordering, I discovered a classic AB-BA deadlock scenario hidden in the code:
Lock Order A: In dev_user_unjam_dev()
1 | static int dev_user_unjam_dev(struct scst_user_dev *dev) |
Lock order: sgv_pool_lock → cmd_list_lock
The sgv_pool_flush() function internally acquires spin_lock_bh(&pool->sgv_pool_lock) to iterate through and free cached memory pool entries.
Lock Order B: In Normal Command Processing
1 | spin_lock_irq(&dev->cmd_list_lock); // First takes cmd_list_lock |
Lock order: cmd_list_lock → sgv_pool_lock
The Deadlock
With these conflicting lock orders, a classic AB-BA deadlock can occur:
- Thread 1 (cleanup loop): Holds
sgv_pool_lock, waits forcmd_list_lock - Thread 2 (command processing): Holds
cmd_list_lock, waits forsgv_pool_lock
Both threads wait forever, causing the system hang that manifested as an “infinite loop.”
The Fix
The solution is surprisingly simple - remove the sgv_pool_flush() calls from inside the unjam loop:
1 | diff --git a/scst/src/dev_handlers/scst_user.c b/scst/src/dev_handlers/scst_user.c |
Why This Works
Eliminates the AB-BA deadlock - By removing
sgv_pool_lockacquisition before takingcmd_list_lock, we remove the conflicting lock order entirely.Memory pools still get flushed - The pools are flushed later at line 3769-3770 (in
dev_user_exit_dev()) when no locks are held, which is the proper place.Improves performance - Previously, the cleanup loop was flushing memory pools on every iteration (potentially hundreds or thousands of times). Now it happens once at the end.
Unjam doesn’t need flush - The unjam logic works by changing command states and cleaning up references, not by freeing memory. The pool flush was unnecessary in this context.
Analysis: Why Pool Flush Was in Unjam
Looking at the code history, sgv_pool_flush() was likely added to dev_user_unjam_dev() with the intent to free up memory and potentially help stuck commands. However:
- Pool flush only frees cached, unused buffers
- Active commands hold references to their buffers - those won’t be freed anyway
- The unjam mechanism works through state transitions, not memory management
- Flushing pools on every iteration was wasteful and introduced the lock ordering bug
Verification
This fix was tested by:
- Running SCST with userspace device handler under heavy I/O
- Killing the userspace handler process during active I/O
- Observing cleanup completes successfully without:
- Infinite loops
- Force cleanup being triggered
- System hangs
- Kernel panics
Lessons Learned
Lock ordering matters - Always acquire locks in a consistent order across all code paths. Tools like lockdep can help detect these issues during development.
Analyze before workarounding - The force cleanup and panic workarounds masked the real issue. Root cause analysis revealed a completely different problem.
Question unnecessary operations - Flushing memory pools repeatedly in a cleanup loop was both unnecessary and harmful.
Simple fixes are often correct - The three-line patch that removes the problematic calls is cleaner than the complex workarounds.
Future Work
While this patch fixes the deadlock, the underlying “zombie command” issue still needs attention. The long-term solution should include:
- Per-command timeout tracking - Detect stuck commands early
- Lockdep annotations - Add
lockdep_set_class()andlockdep_assert_held()to catch lock ordering issues during development - Command lifecycle refactoring - Prevent zombie commands from occurring in the first place
- Redesign cleanup sequence - Replace
wait_for_completion()with bounded timeouts
Conclusion
What appeared to be an infinite loop was actually a deadlock caused by conflicting lock ordering. A three-line patch that removes unnecessary memory pool flushing from the cleanup loop fixes the issue while also improving performance and code clarity.
This case demonstrates the importance of understanding lock hierarchies in concurrent code and the value of root cause analysis over symptom-based workarounds.
Code: SCST Project
Related Commits:
- Fix infinite loop deadlock:
99ebfd13 - Previous panic workaround:
05164093