Message ID | 20210416092146.3201-1-d.bogdanov@yadro.com |
---|---|
State | New |
Headers | show |
Series | [v2] target: core: remove from tmr_list at lun unlink | expand |
Hi Mike, >>> --- a/drivers/target/iscsi/iscsi_target.c >>> +++ b/drivers/target/iscsi/iscsi_target.c >>> @@ -2142,7 +2142,7 @@ iscsit_handle_task_mgt_cmd(struct iscsi_conn *conn, struct iscsi_cmd *cmd, >>> * TMR TASK_REASSIGN. >>> */ >>> iscsit_add_cmd_to_response_queue(cmd, conn, cmd->i_state); >>> - target_put_sess_cmd(&cmd->se_cmd); >>> + transport_generic_free_cmd(&cmd->se_cmd, false); >>> return 0; >>> } >> >> Doh. I see how I got all confused. I guess this path leaks the lun_ref >> taken by transport_lookup_tmr_lun. It looks like an old issue and >> nothing to do with your patch. >> I'm not sure if we are supposed to be calling >> transport_generic_free_cmd twice. It looks like it works ok, because your patch added the "cmd->se_lun = NULL" >> in transport_lun_remove_cmd, so we won't do a double list deletion. >> It feels dirty though. I can feel Bart saying, "Mike did you see the >> comment at the top of the function". :) >> >> Maybe it's best to more cleanly unwind what was setup in the rror >> path. I think we can fix lun_ref leak too. >> >> So instead of doing transport_generic_free_cmd above do >> transport_lun_remove_cmd to match/undo the transport_lookup_tmr_lun call in iscsit_handle_task_mgt_cmd? > >Shoot. I'm all over the place. I think the root issue is my original comment on the v1 patch was wrong. >On a failure we would still do: >iscsit_free_cmd -> transport_generic_free_cmd -> transport_lun_remove_cmd >right? So we don't need any change in the iscsi target. It should all just work. iscsit_free_cmd will be called only at response completion(next incoming iscsi command): iscsit_ack_from_expstatsn -> iscsit_free_cmd. That produces some unacceptable delay of lun unlinking from cmd. There is a bug report to the similar behavior: http://lkml.iu.edu/hypermail/linux/kernel/2002.0/05272.html Because of that complain, the commit 83f85b8ec305, that solves the same crash as I am fixing, was reverted. So, this piece of patch has some indirect relation :) I will extract it to a separate patch in the coming patchset on TMF handling. BR, Dmitry
diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c index d0e7ed8f28cc..3311b031a812 100644 --- a/drivers/target/iscsi/iscsi_target.c +++ b/drivers/target/iscsi/iscsi_target.c @@ -2142,7 +2142,7 @@ iscsit_handle_task_mgt_cmd(struct iscsi_conn *conn, struct iscsi_cmd *cmd, * TMR TASK_REASSIGN. */ iscsit_add_cmd_to_response_queue(cmd, conn, cmd->i_state); - target_put_sess_cmd(&cmd->se_cmd); + transport_generic_free_cmd(&cmd->se_cmd, false); return 0; } EXPORT_SYMBOL(iscsit_handle_task_mgt_cmd); diff --git a/drivers/target/target_core_tmr.c b/drivers/target/target_core_tmr.c index 7347285471fa..323a173010c1 100644 --- a/drivers/target/target_core_tmr.c +++ b/drivers/target/target_core_tmr.c @@ -50,15 +50,6 @@ EXPORT_SYMBOL(core_tmr_alloc_req); void core_tmr_release_req(struct se_tmr_req *tmr) { - struct se_device *dev = tmr->tmr_dev; - unsigned long flags; - - if (dev) { - spin_lock_irqsave(&dev->se_tmr_lock, flags); - list_del_init(&tmr->tmr_list); - spin_unlock_irqrestore(&dev->se_tmr_lock, flags); - } - kfree(tmr); } diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c index 5ecb9f18a53d..34b20c0646ab 100644 --- a/drivers/target/target_core_transport.c +++ b/drivers/target/target_core_transport.c @@ -667,6 +667,20 @@ static void target_remove_from_state_list(struct se_cmd *cmd) spin_unlock_irqrestore(&dev->queues[cmd->cpuid].lock, flags); } +static void target_remove_from_tmr_list(struct se_cmd *cmd) +{ + struct se_device *dev = NULL; + unsigned long flags; + + if (cmd->se_cmd_flags & SCF_SCSI_TMR_CDB) + dev = cmd->se_tmr_req->tmr_dev; + + if (dev) { + spin_lock_irqsave(&dev->se_tmr_lock, flags); + list_del_init(&cmd->se_tmr_req->tmr_list); + spin_unlock_irqrestore(&dev->se_tmr_lock, flags); + } +} /* * This function is called by the target core after the target core has * finished processing a SCSI command or SCSI TMF. Both the regular command @@ -678,13 +692,6 @@ static int transport_cmd_check_stop_to_fabric(struct se_cmd *cmd) { unsigned long flags; - target_remove_from_state_list(cmd); - - /* - * Clear struct se_cmd->se_lun before the handoff to FE. - */ - cmd->se_lun = NULL; - spin_lock_irqsave(&cmd->t_state_lock, flags); /* * Determine if frontend context caller is requesting the stopping of @@ -719,8 +726,16 @@ static void transport_lun_remove_cmd(struct se_cmd *cmd) if (!lun) return; + target_remove_from_state_list(cmd); + target_remove_from_tmr_list(cmd); + if (cmpxchg(&cmd->lun_ref_active, true, false)) percpu_ref_put(&lun->lun_ref); + + /* + * Clear struct se_cmd->se_lun before the handoff to FE. + */ + cmd->se_lun = NULL; } static void target_complete_failure_work(struct work_struct *work)