Message ID | 20231105183712.26520-1-marex@denx.de |
---|---|
State | New |
Headers | show |
Series | [v2] scsi: mvsas: Try to enable MSI | expand |
> > Signed-off-by: Marek Vasut <marex@denx.de> > --- > Note that the "PCI-E x0, Bandwidth Usage: UnKnown Gbps" is due to QEMU > vfio-pci VT-d passthrough, for some reason this is what it reports. > The issue with PCI MSI happens on real hardware too, this vfio/VT-d > is just debugging convenience. > Note that this would be nice to have in stable series, but I'm reluctant > to ask for that in order to avoid breaking other peoples' machines. > Maybe a default-off kernel parameter for the mvsas module would be > acceptable for stable? > --- > Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> > Cc: "Martin K. Petersen" <martin.petersen@oracle.com> > Cc: Bart Van Assche <bvanassche@acm.org> > Cc: Damien Le Moal <dlemoal@kernel.org> > Cc: Jason Yan <yanaijie@huawei.com> > Cc: John Garry <john.g.garry@oracle.com> > Cc: linux-scsi@vger.kernel.org > --- > V2: - Limit the MSI enablement to OCZ devices only > --- > drivers/scsi/mvsas/mv_init.c | 17 +++++++++++++++++ > 1 file changed, 17 insertions(+) > > diff --git a/drivers/scsi/mvsas/mv_init.c b/drivers/scsi/mvsas/mv_init.c > index 43ebb331e2167..d3b1cee6b3252 100644 > --- a/drivers/scsi/mvsas/mv_init.c > +++ b/drivers/scsi/mvsas/mv_init.c > @@ -571,6 +571,17 @@ static int mvs_pci_init(struct pci_dev *pdev, const struct pci_device_id *ent) > rc = sas_register_ha(SHOST_TO_SAS_HA(shost)); > if (rc) > goto err_out_shost; > + > + /* Try to enable MSI, this is needed at least on OCZ RevoDrive 3 X2 */ > + if (pdev->vendor == PCI_VENDOR_ID_OCZ) { PCI_VENDOR_ID_OCZ means 9485. So how about enable MSI for all PCI device IDs which use that, which is all OCZ and MARVELL_EXT? I could not get my hands on a datasheet for that SoC (could you?), but since all previous generations supported MSI, I think that it's a safe bet. Then, if we do that, instead of repeating this same vendor check, how about add a new member to mvs_chip_info to flag whether we need to try MSI? For example, it could be mvs_chip_info.use_msi . > + rc = pci_enable_msi(mvi->pdev); > + if (rc) { > + dev_err(&mvi->pdev->dev, > + "mvsas: Failed to enable MSI for OCZ device, attached drives may not be detected. rc=%d\n", > + rc); We should fail to load the driver in this case. > + } > + } > + > rc = request_irq(pdev->irq, irq_handler, IRQF_SHARED, > DRV_NAME, SHOST_TO_SAS_HA(shost)); > if (rc) > @@ -583,6 +594,9 @@ static int mvs_pci_init(struct pci_dev *pdev, const struct pci_device_id *ent) > return 0; > > err_not_sas: > + if (pdev->vendor == PCI_VENDOR_ID_OCZ) > + pci_disable_msi(mvi->pdev); > + > sas_unregister_ha(SHOST_TO_SAS_HA(shost)); > err_out_shost: > scsi_remove_host(mvi->shost); > @@ -607,6 +621,9 @@ static void mvs_pci_remove(struct pci_dev *pdev) > tasklet_kill(&((struct mvs_prv_info *)sha->lldd_ha)->mv_tasklet); > #endif > > + if (pdev->vendor == PCI_VENDOR_ID_OCZ) > + pci_disable_msi(mvi->pdev); > + > sas_unregister_ha(sha); > sas_remove_host(mvi->shost); >
On 11/6/23 12:08, John Garry wrote: Hi, >> drivers/scsi/mvsas/mv_init.c | 17 +++++++++++++++++ >> 1 file changed, 17 insertions(+) >> >> diff --git a/drivers/scsi/mvsas/mv_init.c b/drivers/scsi/mvsas/mv_init.c >> index 43ebb331e2167..d3b1cee6b3252 100644 >> --- a/drivers/scsi/mvsas/mv_init.c >> +++ b/drivers/scsi/mvsas/mv_init.c >> @@ -571,6 +571,17 @@ static int mvs_pci_init(struct pci_dev *pdev, >> const struct pci_device_id *ent) >> rc = sas_register_ha(SHOST_TO_SAS_HA(shost)); >> if (rc) >> goto err_out_shost; >> + >> + /* Try to enable MSI, this is needed at least on OCZ RevoDrive 3 >> X2 */ >> + if (pdev->vendor == PCI_VENDOR_ID_OCZ) { > > PCI_VENDOR_ID_OCZ means 9485. It does not, see: $ git grep PCI_VENDOR_ID_OCZ include/ include/linux/pci_ids.h:#define PCI_VENDOR_ID_OCZ 0x1b85 > So how about enable MSI for all PCI device > IDs which use that, which is all OCZ and MARVELL_EXT? I could not get my > hands on a datasheet for that SoC (could you?), but since all previous > generations supported MSI, I think that it's a safe bet. Nope. I only have the one device here. > Then, if we do that, instead of repeating this same vendor check, how > about add a new member to mvs_chip_info to flag whether we need to try > MSI? For example, it could be mvs_chip_info.use_msi . > >> + rc = pci_enable_msi(mvi->pdev); >> + if (rc) { >> + dev_err(&mvi->pdev->dev, >> + "mvsas: Failed to enable MSI for OCZ device, attached >> drives may not be detected. rc=%d\n", >> + rc); > > We should fail to load the driver in this case. Wouldn't it be better to give the legacy IRQ a chance in any case, maybe those do work on some of the other OCZ devices (or other versions of firmware) ?
On 06/11/2023 12:59, Marek Vasut wrote: >>> drivers/scsi/mvsas/mv_init.c | 17 +++++++++++++++++ >>> 1 file changed, 17 insertions(+) >>> >>> diff --git a/drivers/scsi/mvsas/mv_init.c b/drivers/scsi/mvsas/mv_init.c >>> index 43ebb331e2167..d3b1cee6b3252 100644 >>> --- a/drivers/scsi/mvsas/mv_init.c >>> +++ b/drivers/scsi/mvsas/mv_init.c >>> @@ -571,6 +571,17 @@ static int mvs_pci_init(struct pci_dev *pdev, >>> const struct pci_device_id *ent) >>> rc = sas_register_ha(SHOST_TO_SAS_HA(shost)); >>> if (rc) >>> goto err_out_shost; >>> + >>> + /* Try to enable MSI, this is needed at least on OCZ RevoDrive 3 >>> X2 */ >>> + if (pdev->vendor == PCI_VENDOR_ID_OCZ) { >> >> PCI_VENDOR_ID_OCZ means 9485. I meant chip_9485, not the PCI vendor ID. See how it is used as a lookup to chip-specific parameters for multiple OCZ and MARVELL SoCs in mvs_pci_table[] and mvs_chips[] > > It does not, see: > > $ git grep PCI_VENDOR_ID_OCZ include/ > include/linux/pci_ids.h:#define PCI_VENDOR_ID_OCZ 0x1b85 > >> So how about enable MSI for all PCI device IDs which use that, which >> is all OCZ and MARVELL_EXT? I could not get my hands on a datasheet >> for that SoC (could you?), but since all previous generations >> supported MSI, I think that it's a safe bet. > > Nope. I only have the one device here. Checking whether the PCI vendor is PCI_VENDOR_ID_OCZ actually covers many PCI devices, but they all use chip_9485 > >> Then, if we do that, instead of repeating this same vendor check, how >> about add a new member to mvs_chip_info to flag whether we need to try >> MSI? For example, it could be mvs_chip_info.use_msi . >> >>> + rc = pci_enable_msi(mvi->pdev); >>> + if (rc) { >>> + dev_err(&mvi->pdev->dev, >>> + "mvsas: Failed to enable MSI for OCZ device, >>> attached drives may not be detected. rc=%d\n", >>> + rc); >> >> We should fail to load the driver in this case. > > Wouldn't it be better to give the legacy IRQ a chance in any case, maybe > those do work on some of the other OCZ devices (or other versions of > firmware) ? Then according to the change here, we would always call pci_disable_msi() in removal path for OCZ, regardless of whether the original pci_enable_msi() call was successful - is that safe and proper? Thanks, John
On 11/6/23 23:12, John Garry wrote: > On 06/11/2023 12:59, Marek Vasut wrote: >>>> drivers/scsi/mvsas/mv_init.c | 17 +++++++++++++++++ >>>> 1 file changed, 17 insertions(+) >>>> >>>> diff --git a/drivers/scsi/mvsas/mv_init.c b/drivers/scsi/mvsas/mv_init.c >>>> index 43ebb331e2167..d3b1cee6b3252 100644 >>>> --- a/drivers/scsi/mvsas/mv_init.c >>>> +++ b/drivers/scsi/mvsas/mv_init.c >>>> @@ -571,6 +571,17 @@ static int mvs_pci_init(struct pci_dev *pdev, >>>> const struct pci_device_id *ent) >>>> rc = sas_register_ha(SHOST_TO_SAS_HA(shost)); >>>> if (rc) >>>> goto err_out_shost; >>>> + >>>> + /* Try to enable MSI, this is needed at least on OCZ RevoDrive 3 >>>> X2 */ >>>> + if (pdev->vendor == PCI_VENDOR_ID_OCZ) { >>> >>> PCI_VENDOR_ID_OCZ means 9485. > > I meant chip_9485, not the PCI vendor ID. See how it is used as a lookup > to chip-specific parameters for multiple OCZ and MARVELL SoCs in > mvs_pci_table[] and mvs_chips[] > >> >> It does not, see: >> >> $ git grep PCI_VENDOR_ID_OCZ include/ >> include/linux/pci_ids.h:#define PCI_VENDOR_ID_OCZ 0x1b85 >> >>> So how about enable MSI for all PCI device IDs which use that, which >>> is all OCZ and MARVELL_EXT? I could not get my hands on a datasheet >>> for that SoC (could you?), but since all previous generations >>> supported MSI, I think that it's a safe bet. >> >> Nope. I only have the one device here. > > Checking whether the PCI vendor is PCI_VENDOR_ID_OCZ actually covers > many PCI devices, but they all use chip_9485 > >> >>> Then, if we do that, instead of repeating this same vendor check, how >>> about add a new member to mvs_chip_info to flag whether we need to try >>> MSI? For example, it could be mvs_chip_info.use_msi . >>> >>>> + rc = pci_enable_msi(mvi->pdev); >>>> + if (rc) { >>>> + dev_err(&mvi->pdev->dev, >>>> + "mvsas: Failed to enable MSI for OCZ device, >>>> attached drives may not be detected. rc=%d\n", >>>> + rc); >>> >>> We should fail to load the driver in this case. >> >> Wouldn't it be better to give the legacy IRQ a chance in any case, maybe >> those do work on some of the other OCZ devices (or other versions of >> firmware) ? > > Then according to the change here, we would always call > pci_disable_msi() in removal path for OCZ, regardless of whether the > original pci_enable_msi() call was successful - is that safe and proper? pci_disable_msi() does nothing if MSI is not enabled. So it shouldn't be an issue. > > Thanks, > John > > >
diff --git a/drivers/scsi/mvsas/mv_init.c b/drivers/scsi/mvsas/mv_init.c index 43ebb331e2167..d3b1cee6b3252 100644 --- a/drivers/scsi/mvsas/mv_init.c +++ b/drivers/scsi/mvsas/mv_init.c @@ -571,6 +571,17 @@ static int mvs_pci_init(struct pci_dev *pdev, const struct pci_device_id *ent) rc = sas_register_ha(SHOST_TO_SAS_HA(shost)); if (rc) goto err_out_shost; + + /* Try to enable MSI, this is needed at least on OCZ RevoDrive 3 X2 */ + if (pdev->vendor == PCI_VENDOR_ID_OCZ) { + rc = pci_enable_msi(mvi->pdev); + if (rc) { + dev_err(&mvi->pdev->dev, + "mvsas: Failed to enable MSI for OCZ device, attached drives may not be detected. rc=%d\n", + rc); + } + } + rc = request_irq(pdev->irq, irq_handler, IRQF_SHARED, DRV_NAME, SHOST_TO_SAS_HA(shost)); if (rc) @@ -583,6 +594,9 @@ static int mvs_pci_init(struct pci_dev *pdev, const struct pci_device_id *ent) return 0; err_not_sas: + if (pdev->vendor == PCI_VENDOR_ID_OCZ) + pci_disable_msi(mvi->pdev); + sas_unregister_ha(SHOST_TO_SAS_HA(shost)); err_out_shost: scsi_remove_host(mvi->shost); @@ -607,6 +621,9 @@ static void mvs_pci_remove(struct pci_dev *pdev) tasklet_kill(&((struct mvs_prv_info *)sha->lldd_ha)->mv_tasklet); #endif + if (pdev->vendor == PCI_VENDOR_ID_OCZ) + pci_disable_msi(mvi->pdev); + sas_unregister_ha(sha); sas_remove_host(mvi->shost);
This seems to be needed on OCZ RevoDrive 3 X2 / RevoDrive 350 OCZ Technology Group, Inc. RevoDrive 3 X2 PCIe SSD 240 GB (Marvell SAS Controller) [1b85:1021] (rev 02) Without MSI enabled, the controller fails as follows: " mvsas 0000:00:02.0: mvsas: PCI-E x0, Bandwidth Usage: UnKnown Gbps scsi host0: mvsas sas: Enter sas_scsi_recover_host busy: 0 failed: 0 ata1.00: qc timeout after 5000 msecs (cmd 0xec) ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4) ata1.00: qc timeout after 10000 msecs (cmd 0xec) ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4) ata1.00: qc timeout after 30000 msecs (cmd 0xec) ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4) sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1 sas: sas_probe_sata: for direct-attached device 0000000000000000 returned -19 sas: Enter sas_scsi_recover_host busy: 0 failed: 0 ata2.00: qc timeout after 5000 msecs (cmd 0xec) ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4) ata2.00: qc timeout after 10000 msecs (cmd 0xec) ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4) ata2.00: qc timeout after 30000 msecs (cmd 0xec) ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4) sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1 sas: sas_probe_sata: for direct-attached device 0100000000000000 returned -19 " With this patch, the controller detects the two SSD drives on it: " mvsas 0000:00:02.0: mvsas: PCI-E x0, Bandwidth Usage: UnKnown Gbps scsi host0: mvsas sas: Enter sas_scsi_recover_host busy: 0 failed: 0 ata1.00: ATA-8: OCZ-REVODRIVE350, 2.50, max UDMA/133 ata1.00: 234441648 sectors, multi 1: LBA48 NCQ (depth 32) ata1.00: configured for UDMA/133 sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1 scsi 0:0:0:0: Direct-Access ATA OCZ-REVODRIVE350 2.50 PQ: 0 ANSI: 5 sas: Enter sas_scsi_recover_host busy: 0 failed: 0 ata2.00: ATA-8: OCZ-REVODRIVE350, 2.50, max UDMA/133 ata2.00: 234441648 sectors, multi 1: LBA48 NCQ (depth 32) ata2.00: configured for UDMA/133 sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1 scsi 0:0:1:0: Direct-Access ATA OCZ-REVODRIVE350 2.50 PQ: 0 ANSI: 5 scsi 0:0:0:0: Attached scsi generic sg0 type 0 sd 0:0:0:0: [sda] 234441648 512-byte logical blocks: (120 GB/112 GiB) sd 0:0:1:0: [sdb] 234441648 512-byte logical blocks: (120 GB/112 GiB) sd 0:0:1:0: Attached scsi generic sg1 type 0 sd 0:0:0:0: [sda] Write Protect is off sd 0:0:1:0: [sdb] Write Protect is off sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA sd 0:0:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA sd 0:0:0:0: [sda] Preferred minimum I/O size 512 bytes sd 0:0:1:0: [sdb] Preferred minimum I/O size 512 bytes sd 0:0:0:0: [sda] Attached SCSI disk sd 0:0:1:0: [sdb] Attached SCSI disk " The enablement of MSIs has been part of this driver before, but was removed without any real explanation in commit: 20b09c2992fe ("[SCSI] mvsas: add support for 94xx; layout change; bug fixes") The enablement of MSIs is also part of the 'oczpcie' driver, which is really an ancient fork of this driver with a lot of variable renames and such. This variant of fix attempt limits the MSI enablement to OCZ devices, since the only ones produced are likely the RevoDrive series, and all are likely to be affected. In case the MSI enablement fails, try to continue in hope that the probe might still somehow work out, but warn the user about this. Signed-off-by: Marek Vasut <marex@denx.de> --- Note that the "PCI-E x0, Bandwidth Usage: UnKnown Gbps" is due to QEMU vfio-pci VT-d passthrough, for some reason this is what it reports. The issue with PCI MSI happens on real hardware too, this vfio/VT-d is just debugging convenience. Note that this would be nice to have in stable series, but I'm reluctant to ask for that in order to avoid breaking other peoples' machines. Maybe a default-off kernel parameter for the mvsas module would be acceptable for stable? --- Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Bart Van Assche <bvanassche@acm.org> Cc: Damien Le Moal <dlemoal@kernel.org> Cc: Jason Yan <yanaijie@huawei.com> Cc: John Garry <john.g.garry@oracle.com> Cc: linux-scsi@vger.kernel.org --- V2: - Limit the MSI enablement to OCZ devices only --- drivers/scsi/mvsas/mv_init.c | 17 +++++++++++++++++ 1 file changed, 17 insertions(+)