Message ID | 20221208031002.106700-8-michael.christie@oracle.com |
---|---|
State | Superseded |
Headers | show |
Series | scsi target task management fixes | expand |
On Wed, Dec 07, 2022 at 09:10:02PM -0600, Mike Christie wrote: > > This fixes a bug where an initiator thinks a LUN_RESET has cleaned > up running commands when it hasn't. The bug was added in: > > commit 51ec502a3266 ("target: Delete tmr from list before processing") > > The problem occurs when: > > 1. We have N IO cmds running in the target layer spread over 2 sessions. > 2. The initiator sends a LUN_RESET for each session. > 3. session1's LUN_RESET loops over all the running commands from both > sessions and moves them to its local drain_task_list. > 4. session2's LUN_RESET does not see the LUN_RESET from session1 because > the commit above has it remove itself. session2 also does not see any > commands since the other reset moved them off the state lists. > 5. sessions2's LUN_RESET will then complete with a successful response. > 6. sessions2's inititor believes the running commands on its session are > now cleaned up due to the successful response and cleans up the running > commands from its side. It then restarts them. > 7. The commands do eventually complete on the backend and the target > starts to return aborted task statuses for them. The initiator will > either throw a invalid ITT error or might accidentally lookup a new task > if the ITT has been reallocated already. > > This fixes the bug by reverting the patch. > > Fixes: 51ec502a3266 ("target: Delete tmr from list before processing") > Signed-off-by: Mike Christie <michael.christie@oracle.com> > --- > drivers/target/target_core_tmr.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/drivers/target/target_core_tmr.c b/drivers/target/target_core_tmr.c > index bac111456fa1..ba2a2c18dae9 100644 > --- a/drivers/target/target_core_tmr.c > +++ b/drivers/target/target_core_tmr.c > @@ -188,9 +188,10 @@ static void core_tmr_drain_tmr_list( > * LUN_RESET tmr.. > */ > spin_lock_irqsave(&dev->se_tmr_lock, flags); > - if (tmr) > - list_del_init(&tmr->tmr_list); > list_for_each_entry_safe(tmr_p, tmr_pp, &dev->dev_tmr_list, tmr_list) { > + if (tmr_p == tmr) > + continue; > + > cmd = tmr_p->task_cmd; > if (!cmd) { > pr_err("Unable to locate struct se_cmd for TMR\n"); > -- > 2.25.1 > Reviewed-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
diff --git a/drivers/target/target_core_tmr.c b/drivers/target/target_core_tmr.c index bac111456fa1..ba2a2c18dae9 100644 --- a/drivers/target/target_core_tmr.c +++ b/drivers/target/target_core_tmr.c @@ -188,9 +188,10 @@ static void core_tmr_drain_tmr_list( * LUN_RESET tmr.. */ spin_lock_irqsave(&dev->se_tmr_lock, flags); - if (tmr) - list_del_init(&tmr->tmr_list); list_for_each_entry_safe(tmr_p, tmr_pp, &dev->dev_tmr_list, tmr_list) { + if (tmr_p == tmr) + continue; + cmd = tmr_p->task_cmd; if (!cmd) { pr_err("Unable to locate struct se_cmd for TMR\n");
This fixes a bug where an initiator thinks a LUN_RESET has cleaned up running commands when it hasn't. The bug was added in: commit 51ec502a3266 ("target: Delete tmr from list before processing") The problem occurs when: 1. We have N IO cmds running in the target layer spread over 2 sessions. 2. The initiator sends a LUN_RESET for each session. 3. session1's LUN_RESET loops over all the running commands from both sessions and moves them to its local drain_task_list. 4. session2's LUN_RESET does not see the LUN_RESET from session1 because the commit above has it remove itself. session2 also does not see any commands since the other reset moved them off the state lists. 5. sessions2's LUN_RESET will then complete with a successful response. 6. sessions2's inititor believes the running commands on its session are now cleaned up due to the successful response and cleans up the running commands from its side. It then restarts them. 7. The commands do eventually complete on the backend and the target starts to return aborted task statuses for them. The initiator will either throw a invalid ITT error or might accidentally lookup a new task if the ITT has been reallocated already. This fixes the bug by reverting the patch. Fixes: 51ec502a3266 ("target: Delete tmr from list before processing") Signed-off-by: Mike Christie <michael.christie@oracle.com> --- drivers/target/target_core_tmr.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)