Message ID | 20201215194731.2326-1-jhasan@marvell.com |
---|---|
State | New |
Headers | show |
Series | [v2] scsi: libfc: Avoid invoking response handler twice if ep is already completed. | expand |
Hello Martin, This is a gentle reminder to apply this patch into scsi-queue at your earliest convenience. Regards Javed -----Original Message----- From: Javed Hasan <jhasan@marvell.com> Sent: Wednesday, December 16, 2020 1:18 AM To: martin.petersen@oracle.com Cc: linux-scsi@vger.kernel.org; GR-QLogic-Storage-Upstream <GR-QLogic-Storage-Upstream@marvell.com>; Javed Hasan <jhasan@marvell.com> Subject: [PATCH v2] scsi: libfc: Avoid invoking response handler twice if ep is already completed. Issue :- race condition getting hit between the response handler get called because of the exchange_mgr_reset() which clear out all the active XID and the response we get via interrupt as the same time Below are the sequence of events occurring in case of "issue/race condition" :- rport ba0200: Port timeout, state PLOGI rport ba0200: Port entered PLOGI state from PLOGI state xid 1052: Exchange timer armed : 20000 msecs xid timer armed here rport ba0200: Received LOGO request while in state PLOGI rport ba0200: Delete port rport ba0200: work event 3 rport ba0200: lld callback ev 3 bnx2fc: rport_event_hdlr: event = 3, port_id = 0xba0200 bnx2fc: ba0200 - rport not created Yet!! /* Here we reset any outstanding exchanges before freeing rport using the exch_mgr_reset() */ xid 1052: Exchange timer canceled /*Here we got two response for one xid*/ xid 1052: invoking resp(), esb 20000000 state 3 xid 1052: invoking resp(), esb 20000000 state 3 xid 1052: fc_rport_plogi_resp() : ep->resp_active 2 xid 1052: fc_rport_plogi_resp() : ep->resp_active 2 Signed-off-by: Javed Hasan <jhasan@marvell.com> diff --git a/drivers/scsi/libfc/fc_exch.c b/drivers/scsi/libfc/fc_exch.c index 16eb3b60ed58..368724f4281e 100644 --- a/drivers/scsi/libfc/fc_exch.c +++ b/drivers/scsi/libfc/fc_exch.c @@ -1624,8 +1624,13 @@ static void fc_exch_recv_seq_resp(struct fc_exch_mgr *mp, struct fc_frame *fp) rc = fc_exch_done_locked(ep); WARN_ON(fc_seq_exch(sp) != ep); spin_unlock_bh(&ep->ex_lock); - if (!rc) + if (!rc) { fc_exch_delete(ep); + } else { + FC_EXCH_DBG(ep, "ep is completed already," + "hence skip calling the resp\n"); + goto skip_resp; + } } /* @@ -1644,6 +1649,7 @@ static void fc_exch_recv_seq_resp(struct fc_exch_mgr *mp, struct fc_frame *fp) if (!fc_invoke_resp(ep, sp, fp)) fc_frame_free(fp); +skip_resp: fc_exch_release(ep); return; rel: @@ -1900,10 +1906,16 @@ static void fc_exch_reset(struct fc_exch *ep) fc_exch_hold(ep); - if (!rc) + if (!rc) { fc_exch_delete(ep); + } else { + FC_EXCH_DBG(ep, "ep is completed already," + "hence skip calling the resp\n"); + goto skip_resp; + } fc_invoke_resp(ep, sp, ERR_PTR(-FC_EX_CLOSED)); +skip_resp: fc_seq_set_resp(sp, NULL, ep->arg); fc_exch_release(ep); } -- 2.18.2
On Tue, 15 Dec 2020 11:47:31 -0800, Javed Hasan wrote: > Issue :- race condition getting hit between the response handler > get called because of the exchange_mgr_reset() which clear out all > the active XID and the response we get via interrupt as the same time > > Below are the sequence of events occurring in case of > "issue/race condition" :- > rport ba0200: Port timeout, state PLOGI > rport ba0200: Port entered PLOGI state from PLOGI state > xid 1052: Exchange timer armed : 20000 msecs xid timer armed here > rport ba0200: Received LOGO request while in state PLOGI > rport ba0200: Delete port > rport ba0200: work event 3 > rport ba0200: lld callback ev 3 > bnx2fc: rport_event_hdlr: event = 3, port_id = 0xba0200 > bnx2fc: ba0200 - rport not created Yet!! > /* Here we reset any outstanding exchanges before > freeing rport using the exch_mgr_reset() */ > xid 1052: Exchange timer canceled > /*Here we got two response for one xid*/ > xid 1052: invoking resp(), esb 20000000 state 3 > xid 1052: invoking resp(), esb 20000000 state 3 > xid 1052: fc_rport_plogi_resp() : ep->resp_active 2 > xid 1052: fc_rport_plogi_resp() : ep->resp_active 2 Applied to 5.11/scsi-fixes, thanks! [1/1] scsi: libfc: Avoid invoking response handler twice if ep is already completed. https://git.kernel.org/mkp/scsi/c/b2b0f16fa65e -- Martin K. Petersen Oracle Linux Engineering
diff --git a/drivers/scsi/libfc/fc_exch.c b/drivers/scsi/libfc/fc_exch.c index 16eb3b60ed58..368724f4281e 100644 --- a/drivers/scsi/libfc/fc_exch.c +++ b/drivers/scsi/libfc/fc_exch.c @@ -1624,8 +1624,13 @@ static void fc_exch_recv_seq_resp(struct fc_exch_mgr *mp, struct fc_frame *fp) rc = fc_exch_done_locked(ep); WARN_ON(fc_seq_exch(sp) != ep); spin_unlock_bh(&ep->ex_lock); - if (!rc) + if (!rc) { fc_exch_delete(ep); + } else { + FC_EXCH_DBG(ep, "ep is completed already," + "hence skip calling the resp\n"); + goto skip_resp; + } } /* @@ -1644,6 +1649,7 @@ static void fc_exch_recv_seq_resp(struct fc_exch_mgr *mp, struct fc_frame *fp) if (!fc_invoke_resp(ep, sp, fp)) fc_frame_free(fp); +skip_resp: fc_exch_release(ep); return; rel: @@ -1900,10 +1906,16 @@ static void fc_exch_reset(struct fc_exch *ep) fc_exch_hold(ep); - if (!rc) + if (!rc) { fc_exch_delete(ep); + } else { + FC_EXCH_DBG(ep, "ep is completed already," + "hence skip calling the resp\n"); + goto skip_resp; + } fc_invoke_resp(ep, sp, ERR_PTR(-FC_EX_CLOSED)); +skip_resp: fc_seq_set_resp(sp, NULL, ep->arg); fc_exch_release(ep); }
Issue :- race condition getting hit between the response handler get called because of the exchange_mgr_reset() which clear out all the active XID and the response we get via interrupt as the same time Below are the sequence of events occurring in case of "issue/race condition" :- rport ba0200: Port timeout, state PLOGI rport ba0200: Port entered PLOGI state from PLOGI state xid 1052: Exchange timer armed : 20000 msecs xid timer armed here rport ba0200: Received LOGO request while in state PLOGI rport ba0200: Delete port rport ba0200: work event 3 rport ba0200: lld callback ev 3 bnx2fc: rport_event_hdlr: event = 3, port_id = 0xba0200 bnx2fc: ba0200 - rport not created Yet!! /* Here we reset any outstanding exchanges before freeing rport using the exch_mgr_reset() */ xid 1052: Exchange timer canceled /*Here we got two response for one xid*/ xid 1052: invoking resp(), esb 20000000 state 3 xid 1052: invoking resp(), esb 20000000 state 3 xid 1052: fc_rport_plogi_resp() : ep->resp_active 2 xid 1052: fc_rport_plogi_resp() : ep->resp_active 2 Signed-off-by: Javed Hasan <jhasan@marvell.com>