[2/3] scsi: qla2xxx: make target send correct LOGO

Message ID 20241008132402.26164-3-a.kovaleva@yadro.com
State Superseded
Series Fix bugs in qla2xxx driver

Commit Message

Anastasia Kovaleva Oct. 8, 2024, 1:24 p.m. UTC
When an ACL is removed from the target, the target sends a LOGO command
to the initiator to break the connection. However, the HBA fills the
port_name and port_id fields of the LOGO command with all zeroes, which
is not valid. The initiator rejects the command, but the target does not
process the reject, since it assumes LOGO can never fail. This leaves
the system in a state where the initiator thinks it is still logged in
to the target and can send commands to it, but the target ignores all
incoming commands from this initiator.

If, in this situation, the initiator sends a command (e.g. during a
scan) and receives no response within the timeout, it sends ABORT for
the command. After the ABORT response also times out, the initiator
sends LOGO to the target. Only then can the initiator log in to the
target again and restore the connection. Overall, this sequence hangs
the system for approximately a minute.

By default, the driver sends a LOGO command to the HBA with only port_id
filled in, expecting the HBA to look up the matching port_name in its
internal table. The HBA doesn't do that and instead fills these fields
with all zeroes.

This patch makes the driver send the LOGO command to the HBA with
port_name and port_id in the I/O PARAMETER fields. The HBA then copies
these values to the corresponding fields in the LOGO command frame.

Signed-off-by: Anastasia Kovaleva <a.kovaleva@yadro.com>
Reviewed-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
---
 drivers/scsi/qla2xxx/qla_iocb.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

Comments

Hannes Reinecke Oct. 9, 2024, 6:32 a.m. UTC | #1
On 10/8/24 15:24, Anastasia Kovaleva wrote:
> Upon removing the ACL from the target, it sends a LOGO command to the
> initiator to break the connection. But HBA fills port_name and port_id
> of the LOGO command with all zeroes, which is not valid. The initiator
> sends a reject for this command, but it is not being processed on the
> target, since it assumes LOGO can never fail. This leaves a system in a
> state where the initiator thinks it is still logged in to the target and
> can send commands to it, but the target ignores all incoming commands
> from this initiator.
> 
> If, in such a situation, the initiator sends some command (e.g. during a
> scan), after not receiving a response for a timeout duration, it sends
> ABORT for the command. After a timeout on receiving an ABORT response,
> the initiator sends LOGO to the target. Only after that, the initiator
> can successfully relogin to the target and restore the connection. In
> the end, this whole situation hangs the system for approximately a
> minute.
> 
> By default, the driver sends a LOGO command to HBA filling only port_id,
> expecting HBA to match port_id with the correct port_name from its
> internal table. HBA doesn't do that, instead filling these fields with
> all zeroes.
> 
> This patch makes the driver send a LOGO command to HBA with port_name
> and port_id in the I/O PARAMETER fields. HBA then copies these values to
> corresponding fields in the LOGO command frame.
> 
> Signed-off-by: Anastasia Kovaleva <a.kovaleva@yadro.com>
> Reviewed-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
> ---
>   drivers/scsi/qla2xxx/qla_iocb.c | 11 +++++++++++
>   1 file changed, 11 insertions(+)
> 
> diff --git a/drivers/scsi/qla2xxx/qla_iocb.c b/drivers/scsi/qla2xxx/qla_iocb.c
> index 0b41e8a06602..90026fca14dc 100644
> --- a/drivers/scsi/qla2xxx/qla_iocb.c
> +++ b/drivers/scsi/qla2xxx/qla_iocb.c
> @@ -2486,6 +2486,17 @@ qla24xx_logout_iocb(srb_t *sp, struct logio_entry_24xx *logio)
>   	logio->port_id[1] = sp->fcport->d_id.b.area;
>   	logio->port_id[2] = sp->fcport->d_id.b.domain;
>   	logio->vp_index = sp->vha->vp_idx;
> +	logio->io_parameter[0] = cpu_to_le32(sp->vha->d_id.b.al_pa |
> +				 sp->vha->d_id.b.area << 8 |
> +				 sp->vha->d_id.b.domain << 16);
> +	logio->io_parameter[1] = cpu_to_le32(sp->vha->port_name[3] |
> +				 sp->vha->port_name[2] << 8 |
> +				 sp->vha->port_name[1] << 16 |
> +				 sp->vha->port_name[0] << 24);
> +	logio->io_parameter[2] = cpu_to_le32(sp->vha->port_name[7] |
> +				 sp->vha->port_name[6] << 8 |
> +				 sp->vha->port_name[5] << 16 |
> +				 sp->vha->port_name[4] << 24);
>   }
>   
>   static void

Now that looks like serious debugging. Well done.

Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes

Patch

diff --git a/drivers/scsi/qla2xxx/qla_iocb.c b/drivers/scsi/qla2xxx/qla_iocb.c
index 0b41e8a06602..90026fca14dc 100644
--- a/drivers/scsi/qla2xxx/qla_iocb.c
+++ b/drivers/scsi/qla2xxx/qla_iocb.c
@@ -2486,6 +2486,17 @@  qla24xx_logout_iocb(srb_t *sp, struct logio_entry_24xx *logio)
 	logio->port_id[1] = sp->fcport->d_id.b.area;
 	logio->port_id[2] = sp->fcport->d_id.b.domain;
 	logio->vp_index = sp->vha->vp_idx;
+	logio->io_parameter[0] = cpu_to_le32(sp->vha->d_id.b.al_pa |
+				 sp->vha->d_id.b.area << 8 |
+				 sp->vha->d_id.b.domain << 16);
+	logio->io_parameter[1] = cpu_to_le32(sp->vha->port_name[3] |
+				 sp->vha->port_name[2] << 8 |
+				 sp->vha->port_name[1] << 16 |
+				 sp->vha->port_name[0] << 24);
+	logio->io_parameter[2] = cpu_to_le32(sp->vha->port_name[7] |
+				 sp->vha->port_name[6] << 8 |
+				 sp->vha->port_name[5] << 16 |
+				 sp->vha->port_name[4] << 24);
 }
 
 static void
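
For readers who want to check the byte layout outside the driver, the packing done in the hunk above can be sketched as a small standalone C fragment. This is only an illustration under stated assumptions: `struct fc_port_id`, `pack_port_id()`, `pack_wwpn_word()` and `fill_logo_params()` are hypothetical names that do not exist in qla2xxx, and the sketch leaves out the `cpu_to_le32()` conversion, so the values are in host byte order. As in the patch, the three words describe the sending port (`sp->vha`), while the destination still goes into `logio->port_id` from `sp->fcport`.

```c
#include <stdint.h>

/* Hypothetical stand-in for the driver's d_id.b fields (not the real type). */
struct fc_port_id {
	uint8_t domain;
	uint8_t area;
	uint8_t al_pa;
};

/* 24-bit port ID: al_pa in bits 0-7, area in bits 8-15, domain in bits 16-23. */
static uint32_t pack_port_id(const struct fc_port_id *id)
{
	return (uint32_t)id->al_pa |
	       (uint32_t)id->area << 8 |
	       (uint32_t)id->domain << 16;
}

/* Four WWPN bytes packed so that the first byte is the most significant. */
static uint32_t pack_wwpn_word(const uint8_t *b)
{
	return (uint32_t)b[3] |
	       (uint32_t)b[2] << 8 |
	       (uint32_t)b[1] << 16 |
	       (uint32_t)b[0] << 24;
}

/*
 * Mirrors the three assignments in the patch: word 0 carries the port ID,
 * words 1 and 2 carry the upper and lower halves of the 8-byte port name.
 * The driver additionally wraps each word in cpu_to_le32(); that step is
 * omitted here.
 */
static void fill_logo_params(uint32_t io_parameter[3],
			     const struct fc_port_id *id,
			     const uint8_t wwpn[8])
{
	io_parameter[0] = pack_port_id(id);
	io_parameter[1] = pack_wwpn_word(&wwpn[0]);
	io_parameter[2] = pack_wwpn_word(&wwpn[4]);
}
```

The casts to `uint32_t` before shifting are a choice made for this sketch: they keep the `<< 24` shift of a byte with its top bit set within unsigned arithmetic. The patch itself relies on the kernel's usual integer promotion behaviour instead.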