Message ID | 20230129234441.116310-6-michael.christie@oracle.com |
---|---|
State | Superseded |
Series | target: TMF and recovery fixes |
On Mon, Jan 30, 2023 at 0:45, Mike Christie <michael.christie@oracle.com> wrote:
>
> This fixes a bug added in:
>
> commit f36199355c64 ("scsi: target: iscsi: Fix cmd abort fabric stop
> race")
>
> If we have multiple sessions to the same se_device we can hit a race where
> a LUN_RESET on one session cleans up the se_cmds from under another
> session which is being closed. This results in the closing session freeing
> its conn/session structs while they are still in use.
>
> The bug is:
>
> 1. Session1 has IO se_cmd1.
> 2. Session2 can also have se_cmds for IO and optionally TMRs for ABORTS
> but then gets a LUN_RESET.
> 3. The LUN_RESET on session2 sees the se_cmds on session1 and during
> the drain stages marks them all with CMD_T_ABORTED.
> 4. session1 is now closed so iscsit_release_commands_from_conn only sees
> se_cmds with the CMD_T_ABORTED bit set and returns immediately even
> though we have outstanding commands.
> 5. session1's connection and session are freed.
> 6. The backend request for se_cmd1 completes and it accesses the freed
> connection/session.
>
> This hooks the iscsit layer into the cmd counter code, so we can wait for
> all outstanding se_cmds before freeing the connection.
>
> Fixes: f36199355c64 ("scsi: target: iscsi: Fix cmd abort fabric stop race")
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>  drivers/target/iscsi/iscsi_target.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
> index 11115c207844..83b007141229 100644
> --- a/drivers/target/iscsi/iscsi_target.c
> +++ b/drivers/target/iscsi/iscsi_target.c
> @@ -4245,6 +4245,16 @@ static void iscsit_release_commands_from_conn(struct iscsit_conn *conn)
>  		iscsit_free_cmd(cmd, true);
>
>  	}
> +
> +	/*
> +	 * Wait on commands that were cleaned up via the aborted_task path.
> +	 * LLDs that implement iscsit_wait_conn will already have waited for
> +	 * commands.
> +	 */
> +	if (!conn->conn_transport->iscsit_wait_conn) {
> +		target_stop_cmd_counter(conn->cmd_cnt);
> +		target_wait_for_cmds(conn->cmd_cnt);
> +	}
>  }
>
>  static void iscsit_stop_timers_for_cmds(
> --
> 2.25.1
>

Reviewed-by: Maurizio Lombardi <mlombard@redhat.com>
diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
index 11115c207844..83b007141229 100644
--- a/drivers/target/iscsi/iscsi_target.c
+++ b/drivers/target/iscsi/iscsi_target.c
@@ -4245,6 +4245,16 @@ static void iscsit_release_commands_from_conn(struct iscsit_conn *conn)
 		iscsit_free_cmd(cmd, true);
 
 	}
+
+	/*
+	 * Wait on commands that were cleaned up via the aborted_task path.
+	 * LLDs that implement iscsit_wait_conn will already have waited for
+	 * commands.
+	 */
+	if (!conn->conn_transport->iscsit_wait_conn) {
+		target_stop_cmd_counter(conn->cmd_cnt);
+		target_wait_for_cmds(conn->cmd_cnt);
+	}
 }
 
 static void iscsit_stop_timers_for_cmds(
This fixes a bug added in:

commit f36199355c64 ("scsi: target: iscsi: Fix cmd abort fabric stop
race")

If we have multiple sessions to the same se_device we can hit a race where
a LUN_RESET on one session cleans up the se_cmds from under another
session which is being closed. This results in the closing session freeing
its conn/session structs while they are still in use.

The bug is:

1. Session1 has IO se_cmd1.
2. Session2 can also have se_cmds for IO and optionally TMRs for ABORTS
but then gets a LUN_RESET.
3. The LUN_RESET on session2 sees the se_cmds on session1 and during
the drain stages marks them all with CMD_T_ABORTED.
4. session1 is now closed so iscsit_release_commands_from_conn only sees
se_cmds with the CMD_T_ABORTED bit set and returns immediately even
though we have outstanding commands.
5. session1's connection and session are freed.
6. The backend request for se_cmd1 completes and it accesses the freed
connection/session.

This hooks the iscsit layer into the cmd counter code, so we can wait for
all outstanding se_cmds before freeing the connection.

Fixes: f36199355c64 ("scsi: target: iscsi: Fix cmd abort fabric stop race")
Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/target/iscsi/iscsi_target.c | 10 ++++++++++
 1 file changed, 10 insertions(+)
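
The stop-then-wait ordering described in the commit message can be illustrated with a rough user-space analogy. This is only a sketch using pthreads: every name in it (struct cmd_counter, cmd_get, cmd_put, counter_stop, counter_wait, run_teardown_demo) is hypothetical, and it does not mirror how target_stop_cmd_counter() or target_wait_for_cmds() are implemented in the kernel. It only shows why the connection must not be freed until every outstanding command has dropped its reference.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical stand-in for a per-connection command counter. */
struct cmd_counter {
	pthread_mutex_t lock;
	pthread_cond_t idle;	/* signalled when outstanding reaches 0 */
	int outstanding;	/* commands still in flight */
	bool stopped;		/* once set, no new commands are accepted */
};

/* Hypothetical connection; allocated separately from its counter. */
struct conn {
	struct cmd_counter *cnt;
	int completed;		/* written by completions, under cnt->lock */
};

/* Take a reference for a new command; fails after counter_stop(). */
static bool cmd_get(struct cmd_counter *c)
{
	bool ok;

	pthread_mutex_lock(&c->lock);
	ok = !c->stopped;
	if (ok)
		c->outstanding++;
	pthread_mutex_unlock(&c->lock);
	return ok;
}

/* Drop a command's reference and wake the waiter on the last one. */
static void cmd_put(struct cmd_counter *c)
{
	pthread_mutex_lock(&c->lock);
	if (--c->outstanding == 0)
		pthread_cond_broadcast(&c->idle);
	pthread_mutex_unlock(&c->lock);
}

static void counter_stop(struct cmd_counter *c)
{
	pthread_mutex_lock(&c->lock);
	c->stopped = true;
	pthread_mutex_unlock(&c->lock);
}

static void counter_wait(struct cmd_counter *c)
{
	pthread_mutex_lock(&c->lock);
	while (c->outstanding > 0)
		pthread_cond_wait(&c->idle, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

/* Simulated backend completion: touches the conn, then drops its ref. */
static void *backend_complete(void *arg)
{
	struct conn *conn = arg;
	struct cmd_counter *cnt = conn->cnt;

	usleep(10000);			/* the command is still "in flight" */
	pthread_mutex_lock(&cnt->lock);
	conn->completed++;		/* conn must still be alive here */
	pthread_mutex_unlock(&cnt->lock);
	cmd_put(cnt);			/* conn is not touched after this */
	return NULL;
}

/* Returns how many completions ran before the connection was freed. */
static int run_teardown_demo(int ncmds)
{
	struct cmd_counter *cnt = calloc(1, sizeof(*cnt));
	struct conn *conn = calloc(1, sizeof(*conn));
	pthread_t *tids = calloc(ncmds > 0 ? ncmds : 1, sizeof(*tids));
	int i, completed;

	if (!cnt || !conn || !tids)
		abort();
	pthread_mutex_init(&cnt->lock, NULL);
	pthread_cond_init(&cnt->idle, NULL);
	conn->cnt = cnt;

	for (i = 0; i < ncmds; i++) {
		if (!cmd_get(cnt) ||
		    pthread_create(&tids[i], NULL, backend_complete, conn) != 0)
			abort();
	}

	counter_stop(cnt);	/* refuse new commands */
	counter_wait(cnt);	/* block until all in-flight commands finish */
	completed = conn->completed;
	free(conn);		/* safe: no completion can still use conn */

	for (i = 0; i < ncmds; i++)
		pthread_join(tids[i], NULL);
	pthread_cond_destroy(&cnt->idle);
	pthread_mutex_destroy(&cnt->lock);
	free(cnt);
	free(tids);
	return completed;
}
```

Calling run_teardown_demo(3) from a main() should report that all three completions ran before the connection was freed. Skipping the counter_wait() call in this sketch would reproduce the shape of steps 4 to 6 above: the connection would be freed while a completion could still write to it.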