
[0/4] Revert "Call blk_mq_free_tag_set() earlier"

Message ID 20220821220502.13685-1-bvanassche@acm.org

Message

Bart Van Assche Aug. 21, 2022, 10:04 p.m. UTC
Hi Martin,

Since a device, target or host reference may be held when scsi_remove_host()
or scsi_remove_target() is called and since the patch series "Call
blk_mq_free_tag_set() earlier" makes these functions wait until all references
are gone, that patch series may trigger a deadlock. Hence this request to
revert the patch series "Call blk_mq_free_tag_set() earlier".

Thanks,

Bart.

Bart Van Assche (4):
  scsi: core: Revert "Call blk_mq_free_tag_set() earlier"
  scsi: core: Revert "Simplify LLD module reference counting"
  scsi: core: Revert "Make sure that hosts outlive targets"
  scsi: core: Revert "Make sure that targets outlive devices"

 drivers/scsi/hosts.c       | 18 +++++-------------
 drivers/scsi/scsi.c        |  9 +++------
 drivers/scsi/scsi_scan.c   |  9 ---------
 drivers/scsi/scsi_sysfs.c  | 29 ++++++++++++-----------------
 include/scsi/scsi_device.h |  2 --
 include/scsi/scsi_host.h   |  3 ---
 6 files changed, 20 insertions(+), 50 deletions(-)
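
The deadlock described in the cover letter can be illustrated with a small userspace model. This is only a sketch under stated assumptions: struct model_host, model_host_get(), model_host_put() and model_remove_host() are hypothetical names invented for this illustration and are not the kernel's struct Scsi_Host or scsi_remove_host(). The model assumes that every other context eventually drops its references, while the caller cannot drop its own reference because it is blocked inside the remove function.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a SCSI host; NOT the kernel's struct Scsi_Host. */
struct model_host {
	int refcount;	/* number of outstanding references */
	bool removed;	/* set once removal has completed */
};

static void model_host_get(struct model_host *h)
{
	h->refcount++;
}

static void model_host_put(struct model_host *h)
{
	h->refcount--;
}

/*
 * Model of a remove function that waits until all references are gone.
 * caller_refs is the number of references held by the context that calls
 * this function. Returns true if removal completes, false if the wait
 * could never finish (the modeled deadlock).
 */
static bool model_remove_host(struct model_host *h, int caller_refs)
{
	assert(caller_refs >= 0 && caller_refs <= h->refcount);

	/* Assumption: all other contexts eventually drop their references. */
	while (h->refcount > caller_refs)
		model_host_put(h);

	/*
	 * The caller is blocked here, so it can never drop its own
	 * references. If any remain, the wait never ends.
	 */
	if (h->refcount != 0)
		return false;

	h->removed = true;
	return true;
}
```

In this model, calling model_remove_host(&h, 0) on a host with two outstanding references completes, whereas calling model_remove_host(&h, 1) returns false because the caller's own reference keeps the count above zero. The real kernel code paths are more involved; see the linked reports in the comments for the actual call traces.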

Comments

Ming Lei Aug. 29, 2022, 9:14 a.m. UTC | #1
Hi Bart,

On Mon, Aug 22, 2022 at 6:05 AM Bart Van Assche <bvanassche@acm.org> wrote:
>
> Hi Martin,
>
> Since a device, target or host reference may be held when scsi_remove_host()
> or scsi_remove_target() is called and since the patch series "Call
> blk_mq_free_tag_set() earlier" makes these functions wait until all references
> are gone, that patch series may trigger a deadlock. Hence this request to
> revert the patch series "Call blk_mq_free_tag_set() earlier".

Care to share the deadlock details? Such as dmesg log or theory behind.

Thanks,
Ming

Bart Van Assche Aug. 29, 2022, 12:14 p.m. UTC | #2
On 8/29/22 02:14, Ming Lei wrote:
> On Mon, Aug 22, 2022 at 6:05 AM Bart Van Assche <bvanassche@acm.org> wrote:
>> Since a device, target or host reference may be held when scsi_remove_host()
>> or scsi_remove_target() is called and since the patch series "Call
>> blk_mq_free_tag_set() earlier" makes these functions wait until all references
>> are gone, that patch series may trigger a deadlock. Hence this request to
>> revert the patch series "Call blk_mq_free_tag_set() earlier".
> 
> Care to share the deadlock details? Such as dmesg log or theory behind.

Hi Ming,

Details of two different deadlock scenarios are available here:
* [syzbot] INFO: task hung in scsi_remove_host 
(https://lore.kernel.org/all/000000000000b5187d05e6c08086@google.com/).
* https://lore.kernel.org/all/Yv%2FMKymRC9O04Nqu@google.com/. The link 
[2] in this email includes a call trace and instructions for reproducing 
this issue. My root cause analysis of this deadlock is available here: 
https://lore.kernel.org/all/27d0dde8-344c-1dd0-cc26-0e10c4e1f296@acm.org/.

Thanks,

Bart.

Martin K. Petersen Sept. 1, 2022, 5:12 a.m. UTC | #3
On Sun, 21 Aug 2022 15:04:58 -0700, Bart Van Assche wrote:

> Since a device, target or host reference may be held when scsi_remove_host()
> or scsi_remove_target() is called and since the patch series "Call
> blk_mq_free_tag_set() earlier" makes these functions wait until all references
> are gone, that patch series may trigger a deadlock. Hence this request to
> revert the patch series "Call blk_mq_free_tag_set() earlier".
> 
> Thanks,
> 
> [...]

Applied to 6.0/scsi-fixes, thanks!

[1/4] scsi: core: Revert "Call blk_mq_free_tag_set() earlier"
      https://git.kernel.org/mkp/scsi/c/2b36209ca818
[2/4] scsi: core: Revert "Simplify LLD module reference counting"
      https://git.kernel.org/mkp/scsi/c/70e8d057bef5
[3/4] scsi: core: Revert "Make sure that hosts outlive targets"
      https://git.kernel.org/mkp/scsi/c/d94b2d00f7bf
[4/4] scsi: core: Revert "Make sure that targets outlive devices"
      https://git.kernel.org/mkp/scsi/c/f782201ebc2b