Fixing an AB-BA Deadlock in SCST User-Space Device Handler

Background

The SCST (SCSI Target) framework provides a user-space device handler (scst_user) that allows implementing SCSI target devices in userspace. Recently, we encountered a critical issue where the system would hang during cleanup when the userspace handler crashed or was killed while I/O operations were in progress.

The symptom appeared as an infinite loop in the cleanup code, requiring workarounds like forcing cleanup after 1000 iterations or even triggering a kernel panic. However, these were just treating the symptoms, not the root cause.

The Symptom: Infinite Loop in Cleanup

The cleanup process dev_user_process_cleanup() would get stuck in an infinite loop when trying to clean up “zombie” commands - commands that remained in the ucmd_hash table but were not in the ready_cmd_list. The code would repeatedly call dev_user_unjam_dev() trying to clean these up, but never succeed.

Previous workarounds included:

  1. Panic after 10,000 iterations - Crash the system to prevent soft lockup
  2. Force cleanup after 1,000 iterations - Manipulate reference counts and forcefully remove commands

While these prevented the system from hanging forever, they didn’t address why the cleanup was failing in the first place.

The Real Problem: AB-BA Deadlock

After deep analysis of the lock ordering, I discovered a classic AB-BA deadlock scenario hidden in the code:

Lock Order A: In dev_user_unjam_dev()

1
2
3
4
5
6
7
8
9
static int dev_user_unjam_dev(struct scst_user_dev *dev)
{
...
sgv_pool_flush(dev->pool); // Takes sgv_pool_lock
sgv_pool_flush(dev->pool_clust); // Takes sgv_pool_lock

spin_lock_irq(&dev->cmd_list_lock); // Then takes cmd_list_lock
...
}

Lock order: sgv_pool_lockcmd_list_lock

The sgv_pool_flush() function internally acquires spin_lock_bh(&pool->sgv_pool_lock) to iterate through and free cached memory pool entries.

Lock Order B: In Normal Command Processing

1
2
3
4
5
6
7
8
spin_lock_irq(&dev->cmd_list_lock);    // First takes cmd_list_lock
...
ucmd_put(ucmd); // May free command
→ dev_user_free_ucmd(ucmd);
→ (releases cmd_list_lock)
→ Memory operations
→ sgv_put_obj() // Then takes sgv_pool_lock
→ spin_lock_bh(&pool->sgv_pool_lock);

Lock order: cmd_list_locksgv_pool_lock

The Deadlock

With these conflicting lock orders, a classic AB-BA deadlock can occur:

  • Thread 1 (cleanup loop): Holds sgv_pool_lock, waits for cmd_list_lock
  • Thread 2 (command processing): Holds cmd_list_lock, waits for sgv_pool_lock

Both threads wait forever, causing the system hang that manifested as an “infinite loop.”

The Fix

The solution is surprisingly simple - remove the sgv_pool_flush() calls from inside the unjam loop:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
diff --git a/scst/src/dev_handlers/scst_user.c b/scst/src/dev_handlers/scst_user.c
index 501d55bd..e9699b91 100644
--- a/scst/src/dev_handlers/scst_user.c
+++ b/scst/src/dev_handlers/scst_user.c
@@ -2688,9 +2688,6 @@ static int dev_user_unjam_dev(struct scst_user_dev *dev)

TRACE_MGMT_DBG("Unjamming dev %p", dev);

- sgv_pool_flush(dev->pool);
- sgv_pool_flush(dev->pool_clust);
-
spin_lock_irq(&dev->udev_cmd_threads.cmd_list_lock);

repeat:

Why This Works

  1. Eliminates the AB-BA deadlock - By removing sgv_pool_lock acquisition before taking cmd_list_lock, we remove the conflicting lock order entirely.

  2. Memory pools still get flushed - The pools are flushed later at line 3769-3770 (in dev_user_exit_dev()) when no locks are held, which is the proper place.

  3. Improves performance - Previously, the cleanup loop was flushing memory pools on every iteration (potentially hundreds or thousands of times). Now it happens once at the end.

  4. Unjam doesn’t need flush - The unjam logic works by changing command states and cleaning up references, not by freeing memory. The pool flush was unnecessary in this context.

Analysis: Why Pool Flush Was in Unjam

Looking at the code history, sgv_pool_flush() was likely added to dev_user_unjam_dev() with the intent to free up memory and potentially help stuck commands. However:

  • Pool flush only frees cached, unused buffers
  • Active commands hold references to their buffers - those won’t be freed anyway
  • The unjam mechanism works through state transitions, not memory management
  • Flushing pools on every iteration was wasteful and introduced the lock ordering bug

Verification

This fix was tested by:

  1. Running SCST with userspace device handler under heavy I/O
  2. Killing the userspace handler process during active I/O
  3. Observing cleanup completes successfully without:
    • Infinite loops
    • Force cleanup being triggered
    • System hangs
    • Kernel panics

Lessons Learned

  1. Lock ordering matters - Always acquire locks in a consistent order across all code paths. Tools like lockdep can help detect these issues during development.

  2. Analyze before workarounding - The force cleanup and panic workarounds masked the real issue. Root cause analysis revealed a completely different problem.

  3. Question unnecessary operations - Flushing memory pools repeatedly in a cleanup loop was both unnecessary and harmful.

  4. Simple fixes are often correct - The three-line patch that removes the problematic calls is cleaner than the complex workarounds.

Future Work

While this patch fixes the deadlock, the underlying “zombie command” issue still needs attention. The long-term solution should include:

  1. Per-command timeout tracking - Detect stuck commands early
  2. Lockdep annotations - Add lockdep_set_class() and lockdep_assert_held() to catch lock ordering issues during development
  3. Command lifecycle refactoring - Prevent zombie commands from occurring in the first place
  4. Redesign cleanup sequence - Replace wait_for_completion() with bounded timeouts

Conclusion

What appeared to be an infinite loop was actually a deadlock caused by conflicting lock ordering. A three-line patch that removes unnecessary memory pool flushing from the cleanup loop fixes the issue while also improving performance and code clarity.

This case demonstrates the importance of understanding lock hierarchies in concurrent code and the value of root cause analysis over symptom-based workarounds.


Code: SCST Project
Related Commits:

  • Fix infinite loop deadlock: 99ebfd13
  • Previous panic workaround: 05164093