diff mbox series

scsi: target: iscsi: set memalloc_noio with loopback network connections

Message ID 20230208200957.14073-1-djeffery@redhat.com
State New
Headers show
Series scsi: target: iscsi: set memalloc_noio with loopback network connections | expand

Commit Message

David Jeffery Feb. 8, 2023, 8:09 p.m. UTC
If an admin connects an iscsi initiator to an iscsi target on the same
system, the iscsi connection is vulnerable to deadlocks during memory
allocations. Memory allocations in the target task accepting the I/O from
the initiator can wait on the initiator's I/O when the system is under
memory pressure, causing a deadlock situation between the iscsi target and
initiator.

When in this configuration, the deadlock scenario can be avoided by use of
GFP_NOIO allocations. Rather than force all configurations to use NOIO,
memalloc_noio_save/restore can be used to force GFP_NOIO allocations only
when in this loopback configuration.

Signed-off-by: David Jeffery <djeffery@redhat.com>
---
 drivers/target/iscsi/iscsi_target.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

Comments

Laurence Oberman Feb. 8, 2023, 8:58 p.m. UTC | #1
On Wed, 2023-02-08 at 15:09 -0500, David Jeffery wrote:
> If an admin connects an iscsi initiator to an iscsi target on the
> same
> system, the iscsi connection is vulnerable to deadlocks during memory
> allocations. Memory allocations in the target task accepting the I/O
> from
> the initiator can wait on the initiator's I/O when the system is
> under
> memory pressure, causing a deadlock situation between the iscsi
> target and
> initiator.
> 
> When in this configuration, the deadlock scenario can be avoided by
> use of
> GFP_NOIO allocations. Rather than force all configurations to use
> NOIO,
> memalloc_noio_save/restore can be used to force GFP_NOIO allocations
> only
> when in this loopback configuration.
> 
> Signed-off-by: David Jeffery <djeffery@redhat.com>
> ---
>  drivers/target/iscsi/iscsi_target.c | 21 +++++++++++++++++++++
>  1 file changed, 21 insertions(+)
> 
> diff --git a/drivers/target/iscsi/iscsi_target.c
> b/drivers/target/iscsi/iscsi_target.c
> index baf4da7bb3b4..a68e47e2cdf9 100644
> --- a/drivers/target/iscsi/iscsi_target.c
> +++ b/drivers/target/iscsi/iscsi_target.c
> @@ -16,6 +16,7 @@
>  #include <linux/vmalloc.h>
>  #include <linux/idr.h>
>  #include <linux/delay.h>
> +#include <linux/sched/mm.h>
>  #include <linux/sched/signal.h>
>  #include <asm/unaligned.h>
>  #include <linux/inet.h>
> @@ -4168,7 +4169,10 @@ int iscsi_target_rx_thread(void *arg)
>  {
>  	int rc;
>  	struct iscsit_conn *conn = arg;
> +	struct dst_entry *dst;
>  	bool conn_freed = false;
> +	bool loopback = false;
> +	unsigned int flags;
>  
>  	/*
>  	 * Allow ourselves to be interrupted by SIGINT so that a
> @@ -4186,8 +4190,25 @@ int iscsi_target_rx_thread(void *arg)
>  	if (!conn->conn_transport->iscsit_get_rx_pdu)
>  		return 0;
>  
> +	/*
> +	 * If the iscsi connection is over a loopback device from using
> +	 * iscsi and iscsit on the same system, we need to set
> memalloc_noio to
> +	 * prevent memory allocation deadlocks between target and
> initiator.
> +	 */
> +	rcu_read_lock();
> +	dst = rcu_dereference(conn->sock->sk->sk_dst_cache);
> +	if (dst && dst->dev && dst->dev->flags & IFF_LOOPBACK)
> +		loopback = true;
> +	rcu_read_unlock();
> +
> +	if (loopback)
> +		flags = memalloc_noio_save();
> +
>  	conn->conn_transport->iscsit_get_rx_pdu(conn);
>  
> +	if (loopback)
> +		memalloc_noio_restore(flags);
> +
>  	if (!signal_pending(current))
>  		atomic_set(&conn->transport_failed, 1);
>  	iscsit_take_action_for_connection_exit(conn, &conn_freed);


I had mentioned to Mike that this was already tested at a large
customer and in our labs and resolved the deadlocks .

Regards
Laurence Oberman
Laurence Oberman Feb. 9, 2023, 7:26 p.m. UTC | #2
On Wed, 2023-02-08 at 15:58 -0500, Laurence Oberman wrote:
> On Wed, 2023-02-08 at 15:09 -0500, David Jeffery wrote:
> > If an admin connects an iscsi initiator to an iscsi target on the
> > same
> > system, the iscsi connection is vulnerable to deadlocks during
> > memory
> > allocations. Memory allocations in the target task accepting the
> > I/O
> > from
> > the initiator can wait on the initiator's I/O when the system is
> > under
> > memory pressure, causing a deadlock situation between the iscsi
> > target and
> > initiator.
> > 
> > When in this configuration, the deadlock scenario can be avoided by
> > use of
> > GFP_NOIO allocations. Rather than force all configurations to use
> > NOIO,
> > memalloc_noio_save/restore can be used to force GFP_NOIO
> > allocations
> > only
> > when in this loopback configuration.
> > 
> > Signed-off-by: David Jeffery <djeffery@redhat.com>
> > ---
> >  drivers/target/iscsi/iscsi_target.c | 21 +++++++++++++++++++++
> >  1 file changed, 21 insertions(+)
> > 
> > diff --git a/drivers/target/iscsi/iscsi_target.c
> > b/drivers/target/iscsi/iscsi_target.c
> > index baf4da7bb3b4..a68e47e2cdf9 100644
> > --- a/drivers/target/iscsi/iscsi_target.c
> > +++ b/drivers/target/iscsi/iscsi_target.c
> > @@ -16,6 +16,7 @@
> >  #include <linux/vmalloc.h>
> >  #include <linux/idr.h>
> >  #include <linux/delay.h>
> > +#include <linux/sched/mm.h>
> >  #include <linux/sched/signal.h>
> >  #include <asm/unaligned.h>
> >  #include <linux/inet.h>
> > @@ -4168,7 +4169,10 @@ int iscsi_target_rx_thread(void *arg)
> >  {
> >  	int rc;
> >  	struct iscsit_conn *conn = arg;
> > +	struct dst_entry *dst;
> >  	bool conn_freed = false;
> > +	bool loopback = false;
> > +	unsigned int flags;
> >  
> >  	/*
> >  	 * Allow ourselves to be interrupted by SIGINT so that a
> > @@ -4186,8 +4190,25 @@ int iscsi_target_rx_thread(void *arg)
> >  	if (!conn->conn_transport->iscsit_get_rx_pdu)
> >  		return 0;
> >  
> > +	/*
> > +	 * If the iscsi connection is over a loopback device from using
> > +	 * iscsi and iscsit on the same system, we need to set
> > memalloc_noio to
> > +	 * prevent memory allocation deadlocks between target and
> > initiator.
> > +	 */
> > +	rcu_read_lock();
> > +	dst = rcu_dereference(conn->sock->sk->sk_dst_cache);
> > +	if (dst && dst->dev && dst->dev->flags & IFF_LOOPBACK)
> > +		loopback = true;
> > +	rcu_read_unlock();
> > +
> > +	if (loopback)
> > +		flags = memalloc_noio_save();
> > +
> >  	conn->conn_transport->iscsit_get_rx_pdu(conn);
> >  
> > +	if (loopback)
> > +		memalloc_noio_restore(flags);
> > +
> >  	if (!signal_pending(current))
> >  		atomic_set(&conn->transport_failed, 1);
> >  	iscsit_take_action_for_connection_exit(conn, &conn_freed);
> 
> I had mentioned to Mike that this was already tested at a large
> customer and in our labs and resolved the deadlocks .
> 
> Regards
> Laurence Oberman
> 

Tested-by:   Laurence Oberman <loberman@redhat.com>
Reviewed-by: Laurence Oberman <loberman@redhat.com>

I hate  to nag here but we have a pressing customer issue and are keen
to get others to weigh in here.

Regards
Laurence



Thanks
Laurence
Maurizio Lombardi Feb. 13, 2023, 11:59 a.m. UTC | #3
st 8. 2. 2023 v 21:10 odesílatel David Jeffery <djeffery@redhat.com> napsal:
>
>
> +       /*
> +        * If the iscsi connection is over a loopback device from using
> +        * iscsi and iscsit on the same system, we need to set memalloc_noio to
> +        * prevent memory allocation deadlocks between target and initiator.
> +        */
> +       rcu_read_lock();
> +       dst = rcu_dereference(conn->sock->sk->sk_dst_cache);
> +       if (dst && dst->dev && dst->dev->flags & IFF_LOOPBACK)
> +               loopback = true;
> +       rcu_read_unlock();

Hi Mike,
I tested it, it works. The customer also confirmed that it fixes the
deadlock on his setup.

Maurizio
Mike Christie Feb. 13, 2023, 4:22 p.m. UTC | #4
On 2/13/23 5:59 AM, Maurizio Lombardi wrote:
> st 8. 2. 2023 v 21:10 odesílatel David Jeffery <djeffery@redhat.com> napsal:
>>
>>
>> +       /*
>> +        * If the iscsi connection is over a loopback device from using
>> +        * iscsi and iscsit on the same system, we need to set memalloc_noio to
>> +        * prevent memory allocation deadlocks between target and initiator.
>> +        */
>> +       rcu_read_lock();
>> +       dst = rcu_dereference(conn->sock->sk->sk_dst_cache);
>> +       if (dst && dst->dev && dst->dev->flags & IFF_LOOPBACK)
>> +               loopback = true;
>> +       rcu_read_unlock();
> 
> Hi Mike,
> I tested it, it works. The customer also confirmed that it fixes the
> deadlock on his setup.

You never responded about why/how it's used in production. Is it some sort
of clustering or container or what?

The login related code can still swing back on you if it's run for a relogin.
It would happen if we overqueue and a nop timesout because the iscsi recv thread
is waiting for backend resources like a request/queue slot, or if management tools
disable/enable the tpgt for reconfigs, etc.
Laurence Oberman Feb. 13, 2023, 4:25 p.m. UTC | #5
On Mon, 2023-02-13 at 10:22 -0600, Mike Christie wrote:
> On 2/13/23 5:59 AM, Maurizio Lombardi wrote:
> > st 8. 2. 2023 v 21:10 odesílatel David Jeffery <djeffery@redhat.com
> > > napsal:
> > > 
> > > +       /*
> > > +        * If the iscsi connection is over a loopback device from
> > > using
> > > +        * iscsi and iscsit on the same system, we need to set
> > > memalloc_noio to
> > > +        * prevent memory allocation deadlocks between target and
> > > initiator.
> > > +        */
> > > +       rcu_read_lock();
> > > +       dst = rcu_dereference(conn->sock->sk->sk_dst_cache);
> > > +       if (dst && dst->dev && dst->dev->flags & IFF_LOOPBACK)
> > > +               loopback = true;
> > > +       rcu_read_unlock();
> > 
> > Hi Mike,
> > I tested it, it works. The customer also confirmed that it fixes
> > the
> > deadlock on his setup.
> 
> You never responded about why/how it's used in production. Is it some
> sort
> of clustering or container or what?
> 
> The login related code can still swing back on you if it's run for a
> relogin.
> It would happen if we overqueue and a nop timesout because the iscsi
> recv thread
> is waiting for backend resources like a request/queue slot, or if
> management tools
> disable/enable the tpgt for reconfigs, etc.
> 

Hi Mike,
The use case described is as follows:


"This customer moved their on-premise system to the cloud.
Their on-premise system runs with two servers and one external storage
and uses data mirroring software to mirror data.

When moving to the cloud, customer wanted to implement a data mirror
using data mirror software with two instances to reduce the cost of
using the cloud infrastructure.
To build a system with two instances, we use iSCSI to mirror data
between a local disk on one instance and a local disk on the other
instance.
We coexist iSCSI initiator and target so that data mirroring software
can access each disk through a unified interface."

Thanks
Laurence
diff mbox series

Patch

diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
index baf4da7bb3b4..a68e47e2cdf9 100644
--- a/drivers/target/iscsi/iscsi_target.c
+++ b/drivers/target/iscsi/iscsi_target.c
@@ -16,6 +16,7 @@ 
 #include <linux/vmalloc.h>
 #include <linux/idr.h>
 #include <linux/delay.h>
+#include <linux/sched/mm.h>
 #include <linux/sched/signal.h>
 #include <asm/unaligned.h>
 #include <linux/inet.h>
@@ -4168,7 +4169,10 @@  int iscsi_target_rx_thread(void *arg)
 {
 	int rc;
 	struct iscsit_conn *conn = arg;
+	struct dst_entry *dst;
 	bool conn_freed = false;
+	bool loopback = false;
+	unsigned int flags;
 
 	/*
 	 * Allow ourselves to be interrupted by SIGINT so that a
@@ -4186,8 +4190,25 @@  int iscsi_target_rx_thread(void *arg)
 	if (!conn->conn_transport->iscsit_get_rx_pdu)
 		return 0;
 
+	/*
+	 * If the iscsi connection is over a loopback device from using
+	 * iscsi and iscsit on the same system, we need to set memalloc_noio to
+	 * prevent memory allocation deadlocks between target and initiator.
+	 */
+	rcu_read_lock();
+	dst = rcu_dereference(conn->sock->sk->sk_dst_cache);
+	if (dst && dst->dev && dst->dev->flags & IFF_LOOPBACK)
+		loopback = true;
+	rcu_read_unlock();
+
+	if (loopback)
+		flags = memalloc_noio_save();
+
 	conn->conn_transport->iscsit_get_rx_pdu(conn);
 
+	if (loopback)
+		memalloc_noio_restore(flags);
+
 	if (!signal_pending(current))
 		atomic_set(&conn->transport_failed, 1);
 	iscsit_take_action_for_connection_exit(conn, &conn_freed);