diff mbox series

[v2] scsi: target: alua: do not report emtpy port group

Message ID 20220912125457.22573-1-d.bogdanov@yadro.com
State New
Headers show
Series [v2] scsi: target: alua: do not report emtpy port group | expand

Commit Message

Dmitry Bogdanov Sept. 12, 2022, 12:54 p.m. UTC
Default target port group is always returned in the list of port
groups, even if the behaviour is unwanted, i.e. it has no members and
non-default port groups are primary port groups.

SPC-4 ("5.15.2.7 Target port asymmetric access state reporting")
states that a target MAY not provide info about port groups that do not
contain the current port through that the RTPG is received.

This patch hides port groups with no ports in REPORT TARGET PORT GROUPS
response.

Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
---
v2:
  new solution - just skip all empty groups
---
 drivers/target/target_core_alua.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Mike Christie Sept. 14, 2022, 7:18 p.m. UTC | #1
On 9/12/22 4:45 PM, Dmitry Bogdanov wrote:
>> Remember how ESX used to send a RTPG to one port and expect that it got
>> every group and that the state info was all in sync (basically opposite
>> if what's in the spec now)?
>>
>> The spec and ESX were updated, but I don't know if other OSs did this and
>> if/when everyone was updated. Do you know this info? Are the old ESX versions
>> that worked like that end of life?
> ESXi is kinda a pain. But fortunately it has nothing to do with that
> patch 😄

I didn't get why that is. How do you set up a distributed/cluster/HA target? I'm
probably missing that part.

Software drivers like iscsi I get, but for HW drivers I didn't see how you do it.

For example, if you have 2 systems/nodes running LIO which each export the same
device via 1 port each where one is active/optimized and the other is standby and you
are using qla2xxx, then on the local node would you create 2 groups:

[root@ol8n4 alua]# pwd
/sys/kernel/config/target/core/iblock_0/device0/alua

[root@ol8n4 alua]# ls
default_tg_pt_gp  local  remote

Then under the mapped lun:

[root@ol8n4 lun_0]# pwd
/sys/kernel/config/target/..../tpgt_1/lun/lun_0

in the alua_tg_pt_gp file you set that to local. That would then have tg_pt_gp_members
set, but remote would not.

Before your patch, windows and ESX could do a RTPG to just one port/path and we would
return the default, local and remote groups. We don't want the default group, but we
wanted the local and remote one returned. With your patch we only return the the local
one now. I wasn't sure how that works for drivers like qla2xxx.

For iscsi, you could just mirror the remote node locally, so you would have a second
tpg:

[root@ol8n4 lun_0]# pwd
/sys/kernel/config/target/..../tpgt_2/lun/lun_0

and in there set alua_tg_pt_gp to remote. Your patch works fine for that because both
groups then have tg_pt_gp_members set so if the OS just does a RTPG to one path/port
you get all the groups.
Dmitry Bogdanov Sept. 15, 2022, 6:08 a.m. UTC | #2
On Wed, Sep 14, 2022 at 02:18:40PM -0500, Mike Christie wrote:
> 
> On 9/12/22 4:45 PM, Dmitry Bogdanov wrote:
> >> Remember how ESX used to send a RTPG to one port and expect that it got
> >> every group and that the state info was all in sync (basically opposite
> >> if what's in the spec now)?
> >>
> >> The spec and ESX were updated, but I don't know if other OSs did this and
> >> if/when everyone was updated. Do you know this info? Are the old ESX versions
> >> that worked like that end of life?
> > ESXi is kinda a pain. But fortunately it has nothing to do with that
> > patch 😄
> 
> I didn't get why that is. How do you set up a distributed/cluster/HA target? I'm
> probably missing that part.
> 
> Software drivers like iscsi I get, but for HW drivers I didn't see how you do it.
> 
> For example, if you have 2 systems/nodes running LIO which each export the same
> device via 1 port each where one is active/optimized and the other is standby and you
> are using qla2xxx, then on the local node would you create 2 groups:
> 
> [root@ol8n4 alua]# pwd
> /sys/kernel/config/target/core/iblock_0/device0/alua
> 
> [root@ol8n4 alua]# ls
> default_tg_pt_gp  local  remote
> 
> Then under the mapped lun:
> 
> [root@ol8n4 lun_0]# pwd
> /sys/kernel/config/target/..../tpgt_1/lun/lun_0
> 
> in the alua_tg_pt_gp file you set that to local. That would then have tg_pt_gp_members
> set, but remote would not.
> 
> Before your patch, windows and ESX could do a RTPG to just one port/path and we would
> return the default, local and remote groups. We don't want the default group, but we
> wanted the local and remote one returned. With your patch we only return the the local
> one now. I wasn't sure how that works for drivers like qla2xxx.
> 
> For iscsi, you could just mirror the remote node locally, so you would have a second
> tpg:
> 
> [root@ol8n4 lun_0]# pwd
> /sys/kernel/config/target/..../tpgt_2/lun/lun_0
> 
> and in there set alua_tg_pt_gp to remote. Your patch works fine for that because both
> groups then have tg_pt_gp_members set so if the OS just does a RTPG to one path/port
> you get all the groups.
>
I use a virtual remote fabric driver to configure wwn/iqn-tpg-lun of
remote peers at each local node. In that way 'remote' alua port group
will have ports(RTPI) too. That allows RTPG (and other discovery-like
commands) report all ports in all port groups in the cluster.
I sent it within the RFC patchset:
https://patchwork.kernel.org/project/target-devel/patch/20220803162857.27770-36-d.bogdanov@yadro.com/

BR,
 Dmitry
Mike Christie Sept. 22, 2022, 4:26 p.m. UTC | #3
On 9/15/22 1:08 AM, Dmitry Bogdanov wrote:
> On Wed, Sep 14, 2022 at 02:18:40PM -0500, Mike Christie wrote:
>>
>> On 9/12/22 4:45 PM, Dmitry Bogdanov wrote:
>>>> Remember how ESX used to send a RTPG to one port and expect that it got
>>>> every group and that the state info was all in sync (basically opposite
>>>> if what's in the spec now)?
>>>>
>>>> The spec and ESX were updated, but I don't know if other OSs did this and
>>>> if/when everyone was updated. Do you know this info? Are the old ESX versions
>>>> that worked like that end of life?
>>> ESXi is kinda a pain. But fortunately it has nothing to do with that
>>> patch 😄
>>
>> I didn't get why that is. How do you set up a distributed/cluster/HA target? I'm
>> probably missing that part.
>>
>> Software drivers like iscsi I get, but for HW drivers I didn't see how you do it.
>>
>> For example, if you have 2 systems/nodes running LIO which each export the same
>> device via 1 port each where one is active/optimized and the other is standby and you
>> are using qla2xxx, then on the local node would you create 2 groups:
>>
>> [root@ol8n4 alua]# pwd
>> /sys/kernel/config/target/core/iblock_0/device0/alua
>>
>> [root@ol8n4 alua]# ls
>> default_tg_pt_gp  local  remote
>>
>> Then under the mapped lun:
>>
>> [root@ol8n4 lun_0]# pwd
>> /sys/kernel/config/target/..../tpgt_1/lun/lun_0
>>
>> in the alua_tg_pt_gp file you set that to local. That would then have tg_pt_gp_members
>> set, but remote would not.
>>
>> Before your patch, windows and ESX could do a RTPG to just one port/path and we would
>> return the default, local and remote groups. We don't want the default group, but we
>> wanted the local and remote one returned. With your patch we only return the the local
>> one now. I wasn't sure how that works for drivers like qla2xxx.
>>
>> For iscsi, you could just mirror the remote node locally, so you would have a second
>> tpg:
>>
>> [root@ol8n4 lun_0]# pwd
>> /sys/kernel/config/target/..../tpgt_2/lun/lun_0
>>

>> and in there set alua_tg_pt_gp to remote. Your patch works fine for that because both
>> groups then have tg_pt_gp_members set so if the OS just does a RTPG to one path/port
>> you get all the groups.
>>
> I use a virtual remote fabric driver to configure wwn/iqn-tpg-lun of
> remote peers at each local node. In that way 'remote' alua port group
> will have ports(RTPI) too. That allows RTPG (and other discovery-like
> commands) report all ports in all port groups in the cluster.
> I sent it within the RFC patchset:

Ok shoot. I think this type of setup is not common, so the patch should be ok for most
users. However, I know people did do some complex setups and I'm worried those might
break.

I think the remote target patch is fine. Did that require any additional patches?
Maybe we could add that patch and your patch in this email at the same time and we
could migrate users.

What's your take?
Dmitry Bogdanov Sept. 23, 2022, 11:38 a.m. UTC | #4
On Thu, Sep 22, 2022 at 11:26:29AM -0500, michael.christie@oracle.com wrote:
> 
> On 9/15/22 1:08 AM, Dmitry Bogdanov wrote:
> > On Wed, Sep 14, 2022 at 02:18:40PM -0500, Mike Christie wrote:
> >>
> >> On 9/12/22 4:45 PM, Dmitry Bogdanov wrote:
> >>>> Remember how ESX used to send a RTPG to one port and expect that it got
> >>>> every group and that the state info was all in sync (basically opposite
> >>>> if what's in the spec now)?
> >>>>
> >>>> The spec and ESX were updated, but I don't know if other OSs did this and
> >>>> if/when everyone was updated. Do you know this info? Are the old ESX versions
> >>>> that worked like that end of life?
> >>> ESXi is kinda a pain. But fortunately it has nothing to do with that
> >>> patch 😄
> >>
> >> I didn't get why that is. How do you set up a distributed/cluster/HA target? I'm
> >> probably missing that part.
> >>
> >> Software drivers like iscsi I get, but for HW drivers I didn't see how you do it.
> >>
> >> For example, if you have 2 systems/nodes running LIO which each export the same
> >> device via 1 port each where one is active/optimized and the other is standby and you
> >> are using qla2xxx, then on the local node would you create 2 groups:
> >>
> >> [root@ol8n4 alua]# pwd
> >> /sys/kernel/config/target/core/iblock_0/device0/alua
> >>
> >> [root@ol8n4 alua]# ls
> >> default_tg_pt_gp  local  remote
> >>
> >> Then under the mapped lun:
> >>
> >> [root@ol8n4 lun_0]# pwd
> >> /sys/kernel/config/target/..../tpgt_1/lun/lun_0
> >>
> >> in the alua_tg_pt_gp file you set that to local. That would then have tg_pt_gp_members
> >> set, but remote would not.
> >>
> >> Before your patch, windows and ESX could do a RTPG to just one port/path and we would
> >> return the default, local and remote groups. We don't want the default group, but we
> >> wanted the local and remote one returned. With your patch we only return the the local
> >> one now. I wasn't sure how that works for drivers like qla2xxx.
> >>
> >> For iscsi, you could just mirror the remote node locally, so you would have a second
> >> tpg:
> >>
> >> [root@ol8n4 lun_0]# pwd
> >> /sys/kernel/config/target/..../tpgt_2/lun/lun_0
> >>
> 
> >> and in there set alua_tg_pt_gp to remote. Your patch works fine for that because both
> >> groups then have tg_pt_gp_members set so if the OS just does a RTPG to one path/port
> >> you get all the groups.
> >>
> > I use a virtual remote fabric driver to configure wwn/iqn-tpg-lun of
> > remote peers at each local node. In that way 'remote' alua port group
> > will have ports(RTPI) too. That allows RTPG (and other discovery-like
> > commands) report all ports in all port groups in the cluster.
> > I sent it within the RFC patchset:
> 
> Ok shoot. I think this type of setup is not common, so the patch should be ok for most
> users. However, I know people did do some complex setups and I'm worried those might
> break.

I am not sure that any kind of cluster configuration is available now.
Because RTPI (without my patchset) is not configurable and normally it
is the same on the ports(LUNs now) on different nodes. That prevents
a correct reporting of ports in RTPG.

> 
> I think the remote target patch is fine. Did that require any additional patches?
Remote target patch is self-sufficient but it is not useful alone I think.
Together with RTPI patchset it allows to report in RTPG all ports in a
cluster.

> Maybe we could add that patch and your patch in this email at the same time and we
> could migrate users.
> 
> What's your take?
I will send the remote target patch soon, but as I said, it is useful
only together with RTPI patchset (published but not yet merged to
scsi-queue).

BR,
 Dmitry
diff mbox series

Patch

diff --git a/drivers/target/target_core_alua.c b/drivers/target/target_core_alua.c
index fb91423a4e2e..c8470e7c0e10 100644
--- a/drivers/target/target_core_alua.c
+++ b/drivers/target/target_core_alua.c
@@ -164,6 +164,9 @@  target_emulate_report_target_port_groups(struct se_cmd *cmd)
 	spin_lock(&dev->t10_alua.tg_pt_gps_lock);
 	list_for_each_entry(tg_pt_gp, &dev->t10_alua.tg_pt_gps_list,
 			tg_pt_gp_list) {
+		/* Skip empty port groups */
+		if (!tg_pt_gp->tg_pt_gp_members)
+			continue;
 		/*
 		 * Check if the Target port group and Target port descriptor list
 		 * based on tg_pt_gp_members count will fit into the response payload.