diff mbox series

[v2] scsi: mvsas: Try to enable MSI

Message ID 20231105183712.26520-1-marex@denx.de
State New
Headers show
Series [v2] scsi: mvsas: Try to enable MSI | expand

Commit Message

Marek Vasut Nov. 5, 2023, 6:36 p.m. UTC
This seems to be needed on OCZ RevoDrive 3 X2 / RevoDrive 350
OCZ Technology Group, Inc. RevoDrive 3 X2 PCIe SSD 240 GB (Marvell SAS Controller) [1b85:1021] (rev 02)

Without MSI enabled, the controller fails as follows:
"
mvsas 0000:00:02.0: mvsas: PCI-E x0, Bandwidth Usage: UnKnown Gbps
scsi host0: mvsas
sas: Enter sas_scsi_recover_host busy: 0 failed: 0
ata1.00: qc timeout after 5000 msecs (cmd 0xec)
ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata1.00: qc timeout after 10000 msecs (cmd 0xec)
ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata1.00: qc timeout after 30000 msecs (cmd 0xec)
ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1
sas: sas_probe_sata: for direct-attached device 0000000000000000 returned -19
sas: Enter sas_scsi_recover_host busy: 0 failed: 0
ata2.00: qc timeout after 5000 msecs (cmd 0xec)
ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata2.00: qc timeout after 10000 msecs (cmd 0xec)
ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata2.00: qc timeout after 30000 msecs (cmd 0xec)
ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1
sas: sas_probe_sata: for direct-attached device 0100000000000000 returned -19
"

With this patch, the controller detects the two SSD drives on it:
"
mvsas 0000:00:02.0: mvsas: PCI-E x0, Bandwidth Usage: UnKnown Gbps
scsi host0: mvsas
sas: Enter sas_scsi_recover_host busy: 0 failed: 0
ata1.00: ATA-8: OCZ-REVODRIVE350, 2.50, max UDMA/133
ata1.00: 234441648 sectors, multi 1: LBA48 NCQ (depth 32)
ata1.00: configured for UDMA/133
sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1
scsi 0:0:0:0: Direct-Access     ATA      OCZ-REVODRIVE350 2.50 PQ: 0 ANSI: 5
sas: Enter sas_scsi_recover_host busy: 0 failed: 0
ata2.00: ATA-8: OCZ-REVODRIVE350, 2.50, max UDMA/133
ata2.00: 234441648 sectors, multi 1: LBA48 NCQ (depth 32)
ata2.00: configured for UDMA/133
sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1
scsi 0:0:1:0: Direct-Access     ATA      OCZ-REVODRIVE350 2.50 PQ: 0 ANSI: 5
scsi 0:0:0:0: Attached scsi generic sg0 type 0
sd 0:0:0:0: [sda] 234441648 512-byte logical blocks: (120 GB/112 GiB)
sd 0:0:1:0: [sdb] 234441648 512-byte logical blocks: (120 GB/112 GiB)
sd 0:0:1:0: Attached scsi generic sg1 type 0
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:1:0: [sdb] Write Protect is off
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 0:0:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 0:0:0:0: [sda] Preferred minimum I/O size 512 bytes
sd 0:0:1:0: [sdb] Preferred minimum I/O size 512 bytes
sd 0:0:0:0: [sda] Attached SCSI disk
sd 0:0:1:0: [sdb] Attached SCSI disk
"

The enablement of MSIs has been part of this driver before, but was
removed without any real explanation in commit:
20b09c2992fe ("[SCSI] mvsas: add support for 94xx; layout change; bug fixes")
The enablement of MSIs is also part of the 'oczpcie' driver, which is
really an ancient fork of this driver with a lot of variable renames
and such.

This variant of fix attempt limits the MSI enablement to OCZ devices,
since the only ones produced are likely the RevoDrive series, and all
are likely to be affected. In case the MSI enablement fails, try to
continue in hope that the probe might still somehow work out, but warn
the user about this.

Signed-off-by: Marek Vasut <marex@denx.de>
---
Note that the "PCI-E x0, Bandwidth Usage: UnKnown Gbps" is due to QEMU
     vfio-pci VT-d passthrough, for some reason this is what it reports.
     The issue with PCI MSI happens on real hardware too, this vfio/VT-d
     is just debugging convenience.
Note that this would be nice to have in stable series, but I'm reluctant
     to ask for that in order to avoid breaking other peoples' machines.
     Maybe a default-off kernel parameter for the mvsas module would be
     acceptable for stable?
---
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: Jason Yan <yanaijie@huawei.com>
Cc: John Garry <john.g.garry@oracle.com>
Cc: linux-scsi@vger.kernel.org
---
V2: - Limit the MSI enablement to OCZ devices only
---
 drivers/scsi/mvsas/mv_init.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

Comments

John Garry Nov. 6, 2023, 11:08 a.m. UTC | #1
> 
> Signed-off-by: Marek Vasut <marex@denx.de>
> ---
> Note that the "PCI-E x0, Bandwidth Usage: UnKnown Gbps" is due to QEMU
>       vfio-pci VT-d passthrough, for some reason this is what it reports.
>       The issue with PCI MSI happens on real hardware too, this vfio/VT-d
>       is just debugging convenience.
> Note that this would be nice to have in stable series, but I'm reluctant
>       to ask for that in order to avoid breaking other peoples' machines.
>       Maybe a default-off kernel parameter for the mvsas module would be
>       acceptable for stable?
> ---
> Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
> Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
> Cc: Bart Van Assche <bvanassche@acm.org>
> Cc: Damien Le Moal <dlemoal@kernel.org>
> Cc: Jason Yan <yanaijie@huawei.com>
> Cc: John Garry <john.g.garry@oracle.com>
> Cc: linux-scsi@vger.kernel.org
> ---
> V2: - Limit the MSI enablement to OCZ devices only
> ---
>   drivers/scsi/mvsas/mv_init.c | 17 +++++++++++++++++
>   1 file changed, 17 insertions(+)
> 
> diff --git a/drivers/scsi/mvsas/mv_init.c b/drivers/scsi/mvsas/mv_init.c
> index 43ebb331e2167..d3b1cee6b3252 100644
> --- a/drivers/scsi/mvsas/mv_init.c
> +++ b/drivers/scsi/mvsas/mv_init.c
> @@ -571,6 +571,17 @@ static int mvs_pci_init(struct pci_dev *pdev, const struct pci_device_id *ent)
>   	rc = sas_register_ha(SHOST_TO_SAS_HA(shost));
>   	if (rc)
>   		goto err_out_shost;
> +
> +	/* Try to enable MSI, this is needed at least on OCZ RevoDrive 3 X2 */
> +	if (pdev->vendor == PCI_VENDOR_ID_OCZ) {

PCI_VENDOR_ID_OCZ means 9485. So how about enable MSI for all PCI device 
IDs which use that, which is all OCZ and MARVELL_EXT? I could not get my 
hands on a datasheet for that SoC (could you?), but since all previous 
generations supported MSI, I think that it's a safe bet.

Then, if we do that, instead of repeating this same vendor check, how 
about add a new member to mvs_chip_info to flag whether we need to try 
MSI? For example, it could be mvs_chip_info.use_msi .

> +		rc = pci_enable_msi(mvi->pdev);
> +		if (rc) {
> +			dev_err(&mvi->pdev->dev,
> +				"mvsas: Failed to enable MSI for OCZ device, attached drives may not be detected. rc=%d\n",
> +				rc);

We should fail to load the driver in this case.

> +		}
> +	}
> +
>   	rc = request_irq(pdev->irq, irq_handler, IRQF_SHARED,
>   		DRV_NAME, SHOST_TO_SAS_HA(shost));
>   	if (rc)
> @@ -583,6 +594,9 @@ static int mvs_pci_init(struct pci_dev *pdev, const struct pci_device_id *ent)
>   	return 0;
>   
>   err_not_sas:
> +	if (pdev->vendor == PCI_VENDOR_ID_OCZ)
> +		pci_disable_msi(mvi->pdev);
> +
>   	sas_unregister_ha(SHOST_TO_SAS_HA(shost));
>   err_out_shost:
>   	scsi_remove_host(mvi->shost);
> @@ -607,6 +621,9 @@ static void mvs_pci_remove(struct pci_dev *pdev)
>   	tasklet_kill(&((struct mvs_prv_info *)sha->lldd_ha)->mv_tasklet);
>   #endif
>   
> +	if (pdev->vendor == PCI_VENDOR_ID_OCZ)
> +		pci_disable_msi(mvi->pdev);
> +
>   	sas_unregister_ha(sha);
>   	sas_remove_host(mvi->shost);
>
Marek Vasut Nov. 6, 2023, 12:59 p.m. UTC | #2
On 11/6/23 12:08, John Garry wrote:

Hi,

>>   drivers/scsi/mvsas/mv_init.c | 17 +++++++++++++++++
>>   1 file changed, 17 insertions(+)
>>
>> diff --git a/drivers/scsi/mvsas/mv_init.c b/drivers/scsi/mvsas/mv_init.c
>> index 43ebb331e2167..d3b1cee6b3252 100644
>> --- a/drivers/scsi/mvsas/mv_init.c
>> +++ b/drivers/scsi/mvsas/mv_init.c
>> @@ -571,6 +571,17 @@ static int mvs_pci_init(struct pci_dev *pdev, 
>> const struct pci_device_id *ent)
>>       rc = sas_register_ha(SHOST_TO_SAS_HA(shost));
>>       if (rc)
>>           goto err_out_shost;
>> +
>> +    /* Try to enable MSI, this is needed at least on OCZ RevoDrive 3 
>> X2 */
>> +    if (pdev->vendor == PCI_VENDOR_ID_OCZ) {
> 
> PCI_VENDOR_ID_OCZ means 9485.

It does not, see:

$ git grep PCI_VENDOR_ID_OCZ include/
include/linux/pci_ids.h:#define PCI_VENDOR_ID_OCZ               0x1b85

> So how about enable MSI for all PCI device 
> IDs which use that, which is all OCZ and MARVELL_EXT? I could not get my 
> hands on a datasheet for that SoC (could you?), but since all previous 
> generations supported MSI, I think that it's a safe bet.

Nope. I only have the one device here.

> Then, if we do that, instead of repeating this same vendor check, how 
> about add a new member to mvs_chip_info to flag whether we need to try 
> MSI? For example, it could be mvs_chip_info.use_msi .
> 
>> +        rc = pci_enable_msi(mvi->pdev);
>> +        if (rc) {
>> +            dev_err(&mvi->pdev->dev,
>> +                "mvsas: Failed to enable MSI for OCZ device, attached 
>> drives may not be detected. rc=%d\n",
>> +                rc);
> 
> We should fail to load the driver in this case.

Wouldn't it be better to give the legacy IRQ a chance in any case, maybe 
those do work on some of the other OCZ devices (or other versions of 
firmware) ?
John Garry Nov. 6, 2023, 2:12 p.m. UTC | #3
On 06/11/2023 12:59, Marek Vasut wrote:
>>> drivers/scsi/mvsas/mv_init.c | 17 +++++++++++++++++
>>>   1 file changed, 17 insertions(+)
>>>
>>> diff --git a/drivers/scsi/mvsas/mv_init.c b/drivers/scsi/mvsas/mv_init.c
>>> index 43ebb331e2167..d3b1cee6b3252 100644
>>> --- a/drivers/scsi/mvsas/mv_init.c
>>> +++ b/drivers/scsi/mvsas/mv_init.c
>>> @@ -571,6 +571,17 @@ static int mvs_pci_init(struct pci_dev *pdev, 
>>> const struct pci_device_id *ent)
>>>       rc = sas_register_ha(SHOST_TO_SAS_HA(shost));
>>>       if (rc)
>>>           goto err_out_shost;
>>> +
>>> +    /* Try to enable MSI, this is needed at least on OCZ RevoDrive 3 
>>> X2 */
>>> +    if (pdev->vendor == PCI_VENDOR_ID_OCZ) {
>>
>> PCI_VENDOR_ID_OCZ means 9485.

I meant chip_9485, not the PCI vendor ID. See how it is used as a lookup 
to chip-specific parameters for multiple OCZ and MARVELL SoCs in 
mvs_pci_table[] and mvs_chips[]

> 
> It does not, see:
> 
> $ git grep PCI_VENDOR_ID_OCZ include/
> include/linux/pci_ids.h:#define PCI_VENDOR_ID_OCZ               0x1b85
> 
>> So how about enable MSI for all PCI device IDs which use that, which 
>> is all OCZ and MARVELL_EXT? I could not get my hands on a datasheet 
>> for that SoC (could you?), but since all previous generations 
>> supported MSI, I think that it's a safe bet.
> 
> Nope. I only have the one device here.

Checking whether the PCI vendor is PCI_VENDOR_ID_OCZ actually covers 
many PCI devices, but they all use chip_9485

> 
>> Then, if we do that, instead of repeating this same vendor check, how 
>> about add a new member to mvs_chip_info to flag whether we need to try 
>> MSI? For example, it could be mvs_chip_info.use_msi .
>>
>>> +        rc = pci_enable_msi(mvi->pdev);
>>> +        if (rc) {
>>> +            dev_err(&mvi->pdev->dev,
>>> +                "mvsas: Failed to enable MSI for OCZ device, 
>>> attached drives may not be detected. rc=%d\n",
>>> +                rc);
>>
>> We should fail to load the driver in this case.
> 
> Wouldn't it be better to give the legacy IRQ a chance in any case, maybe 
> those do work on some of the other OCZ devices (or other versions of 
> firmware) ?

Then according to the change here, we would always call 
pci_disable_msi() in removal path for OCZ, regardless of whether the 
original pci_enable_msi() call was successful - is that safe and proper?

Thanks,
John
Damien Le Moal Nov. 6, 2023, 10:13 p.m. UTC | #4
On 11/6/23 23:12, John Garry wrote:
> On 06/11/2023 12:59, Marek Vasut wrote:
>>>> drivers/scsi/mvsas/mv_init.c | 17 +++++++++++++++++
>>>>   1 file changed, 17 insertions(+)
>>>>
>>>> diff --git a/drivers/scsi/mvsas/mv_init.c b/drivers/scsi/mvsas/mv_init.c
>>>> index 43ebb331e2167..d3b1cee6b3252 100644
>>>> --- a/drivers/scsi/mvsas/mv_init.c
>>>> +++ b/drivers/scsi/mvsas/mv_init.c
>>>> @@ -571,6 +571,17 @@ static int mvs_pci_init(struct pci_dev *pdev, 
>>>> const struct pci_device_id *ent)
>>>>       rc = sas_register_ha(SHOST_TO_SAS_HA(shost));
>>>>       if (rc)
>>>>           goto err_out_shost;
>>>> +
>>>> +    /* Try to enable MSI, this is needed at least on OCZ RevoDrive 3 
>>>> X2 */
>>>> +    if (pdev->vendor == PCI_VENDOR_ID_OCZ) {
>>>
>>> PCI_VENDOR_ID_OCZ means 9485.
> 
> I meant chip_9485, not the PCI vendor ID. See how it is used as a lookup 
> to chip-specific parameters for multiple OCZ and MARVELL SoCs in 
> mvs_pci_table[] and mvs_chips[]
> 
>>
>> It does not, see:
>>
>> $ git grep PCI_VENDOR_ID_OCZ include/
>> include/linux/pci_ids.h:#define PCI_VENDOR_ID_OCZ               0x1b85
>>
>>> So how about enable MSI for all PCI device IDs which use that, which 
>>> is all OCZ and MARVELL_EXT? I could not get my hands on a datasheet 
>>> for that SoC (could you?), but since all previous generations 
>>> supported MSI, I think that it's a safe bet.
>>
>> Nope. I only have the one device here.
> 
> Checking whether the PCI vendor is PCI_VENDOR_ID_OCZ actually covers 
> many PCI devices, but they all use chip_9485
> 
>>
>>> Then, if we do that, instead of repeating this same vendor check, how 
>>> about add a new member to mvs_chip_info to flag whether we need to try 
>>> MSI? For example, it could be mvs_chip_info.use_msi .
>>>
>>>> +        rc = pci_enable_msi(mvi->pdev);
>>>> +        if (rc) {
>>>> +            dev_err(&mvi->pdev->dev,
>>>> +                "mvsas: Failed to enable MSI for OCZ device, 
>>>> attached drives may not be detected. rc=%d\n",
>>>> +                rc);
>>>
>>> We should fail to load the driver in this case.
>>
>> Wouldn't it be better to give the legacy IRQ a chance in any case, maybe 
>> those do work on some of the other OCZ devices (or other versions of 
>> firmware) ?
> 
> Then according to the change here, we would always call 
> pci_disable_msi() in removal path for OCZ, regardless of whether the 
> original pci_enable_msi() call was successful - is that safe and proper?

pci_disable_msi() does nothing if MSI is not enabled. So it shouldn't be an issue.

> 
> Thanks,
> John
> 
> 
>
diff mbox series

Patch

diff --git a/drivers/scsi/mvsas/mv_init.c b/drivers/scsi/mvsas/mv_init.c
index 43ebb331e2167..d3b1cee6b3252 100644
--- a/drivers/scsi/mvsas/mv_init.c
+++ b/drivers/scsi/mvsas/mv_init.c
@@ -571,6 +571,17 @@  static int mvs_pci_init(struct pci_dev *pdev, const struct pci_device_id *ent)
 	rc = sas_register_ha(SHOST_TO_SAS_HA(shost));
 	if (rc)
 		goto err_out_shost;
+
+	/* Try to enable MSI, this is needed at least on OCZ RevoDrive 3 X2 */
+	if (pdev->vendor == PCI_VENDOR_ID_OCZ) {
+		rc = pci_enable_msi(mvi->pdev);
+		if (rc) {
+			dev_err(&mvi->pdev->dev,
+				"mvsas: Failed to enable MSI for OCZ device, attached drives may not be detected. rc=%d\n",
+				rc);
+		}
+	}
+
 	rc = request_irq(pdev->irq, irq_handler, IRQF_SHARED,
 		DRV_NAME, SHOST_TO_SAS_HA(shost));
 	if (rc)
@@ -583,6 +594,9 @@  static int mvs_pci_init(struct pci_dev *pdev, const struct pci_device_id *ent)
 	return 0;
 
 err_not_sas:
+	if (pdev->vendor == PCI_VENDOR_ID_OCZ)
+		pci_disable_msi(mvi->pdev);
+
 	sas_unregister_ha(SHOST_TO_SAS_HA(shost));
 err_out_shost:
 	scsi_remove_host(mvi->shost);
@@ -607,6 +621,9 @@  static void mvs_pci_remove(struct pci_dev *pdev)
 	tasklet_kill(&((struct mvs_prv_info *)sha->lldd_ha)->mv_tasklet);
 #endif
 
+	if (pdev->vendor == PCI_VENDOR_ID_OCZ)
+		pci_disable_msi(mvi->pdev);
+
 	sas_unregister_ha(sha);
 	sas_remove_host(mvi->shost);