diff mbox series

[patchlet] r8169: fix napi_schedule_irqoff() called with irqs enabled warning

Message ID 9c34e18280bde5c14f40b1ef89a5e6ea326dd5da.camel@gmx.de
State New
Headers show
Series [patchlet] r8169: fix napi_schedule_irqoff() called with irqs enabled warning | expand

Commit Message

Mike Galbraith Oct. 16, 2020, 11:26 a.m. UTC
When the kernel is built with PREEMPT_RT or booted with threadirqs,
irqs are not disabled when rtl8169_interrupt() is called, inspiring
__raise_softirq_irqoff() to gripe.  Use plain napi_schedule().

Signed-off-by: Mike Galbraith <efault@gmx.de>
---
 drivers/net/ethernet/realtek/r8169_main.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Heiner Kallweit Oct. 16, 2020, 11:34 a.m. UTC | #1
On 16.10.2020 13:26, Mike Galbraith wrote:
> 

> When the kernel is built with PREEMPT_RT or booted with threadirqs,

> irqs are not disabled when rtl8169_interrupt() is called, inspiring

> __raise_softirq_irqoff() to gripe.  Use plain napi_schedule().

> 


I'm aware of the topic, but missing the benefits of the irqoff version
unconditionally doesn't seem to be the best option. See also:
https://lore.kernel.org/linux-arm-kernel/20201008162749.860521-1-john@metanate.com/
Needed is a function that dynamically picks the right version.

> Signed-off-by: Mike Galbraith <efault@gmx.de>

> ---

>  drivers/net/ethernet/realtek/r8169_main.c |    2 +-

>  1 file changed, 1 insertion(+), 1 deletion(-)

> 

> --- a/drivers/net/ethernet/realtek/r8169_main.c

> +++ b/drivers/net/ethernet/realtek/r8169_main.c

> @@ -4573,7 +4573,7 @@ static irqreturn_t rtl8169_interrupt(int

>  	}

> 

>  	rtl_irq_disable(tp);

> -	napi_schedule_irqoff(&tp->napi);

> +	napi_schedule(&tp->napi);

>  out:

>  	rtl_ack_events(tp, status);

> 

>
Mike Galbraith Oct. 16, 2020, 11:57 a.m. UTC | #2
On Fri, 2020-10-16 at 13:34 +0200, Heiner Kallweit wrote:
> On 16.10.2020 13:26, Mike Galbraith wrote:

> >

> > When the kernel is built with PREEMPT_RT or booted with threadirqs,

> > irqs are not disabled when rtl8169_interrupt() is called, inspiring

> > __raise_softirq_irqoff() to gripe.  Use plain napi_schedule().

> >

>

> I'm aware of the topic, but missing the benefits of the irqoff version

> unconditionally doesn't seem to be the best option. See also:

> https://lore.kernel.org/linux-arm-kernel/20201008162749.860521-1-john@metanate.com/

> Needed is a function that dynamically picks the right version.


Thanks for your time, I'll just carry it locally.

	-Mike
Vladimir Oltean Oct. 16, 2020, 2:26 p.m. UTC | #3
On Fri, Oct 16, 2020 at 01:34:55PM +0200, Heiner Kallweit wrote:
> I'm aware of the topic, but missing the benefits of the irqoff version

> unconditionally doesn't seem to be the best option.


What are the benefits of the irqoff version? As far as I see it, the
only use case for that function is when the caller has _explicitly_
disabled interrupts.

The plain napi_schedule call will check if irqs are disabled. If they
are, it won't do anything further in that area. There is no performance
impact except for a check.

> Needed is a function that dynamically picks the right version.


So you want to replace a check with another check, am I right? How will
that improve anything performance-wise?
Heiner Kallweit Oct. 16, 2020, 2:41 p.m. UTC | #4
On 16.10.2020 16:26, Vladimir Oltean wrote:
> On Fri, Oct 16, 2020 at 01:34:55PM +0200, Heiner Kallweit wrote:

>> I'm aware of the topic, but missing the benefits of the irqoff version

>> unconditionally doesn't seem to be the best option.

> 

> What are the benefits of the irqoff version? As far as I see it, the

> only use case for that function is when the caller has _explicitly_

> disabled interrupts.

> 

If the irqoff version wouldn't have a benefit, then I think we wouldn't
have it ..

> The plain napi_schedule call will check if irqs are disabled. If they

> are, it won't do anything further in that area. There is no performance

> impact except for a check.

> 

There is no such check, and in general currently attempts are made to
remove usage of e.g. in_interrupt(). napi_schedule() has additional calls
to local_irq_save() and local_irq_restore().

>> Needed is a function that dynamically picks the right version.

> 

> So you want to replace a check with another check, am I right? How will

> that improve anything performance-wise?

>
Vladimir Oltean Oct. 16, 2020, 3:40 p.m. UTC | #5
On Fri, Oct 16, 2020 at 04:41:50PM +0200, Heiner Kallweit wrote:
> On 16.10.2020 16:26, Vladimir Oltean wrote:

> > On Fri, Oct 16, 2020 at 01:34:55PM +0200, Heiner Kallweit wrote:

> >> I'm aware of the topic, but missing the benefits of the irqoff version

> >> unconditionally doesn't seem to be the best option.

> > 

> > What are the benefits of the irqoff version? As far as I see it, the

> > only use case for that function is when the caller has _explicitly_

> > disabled interrupts.

> > 

> If the irqoff version wouldn't have a benefit, then I think we wouldn't

> have it ..

> 

> > The plain napi_schedule call will check if irqs are disabled. If they

> > are, it won't do anything further in that area. There is no performance

> > impact except for a check.

> > 

> There is no such check, and in general currently attempts are made to

> remove usage of e.g. in_interrupt(). napi_schedule() has additional calls

> to local_irq_save() and local_irq_restore().


This has nothing to do with in_interrupt().

Now, to explain where my confusion came from.
arm64 has this:

static inline unsigned long arch_local_irq_save(void)
{
	unsigned long flags;

	flags = arch_local_save_flags();

	/*
	 * There are too many states with IRQs disabled, just keep the current
	 * state if interrupts are already disabled/masked.
	 */
	if (!arch_irqs_disabled_flags(flags))
		arch_local_irq_disable();

	return flags;
}

I just thought that the generic implementation had the "if" too.
Mike Galbraith Oct. 16, 2020, 5:11 p.m. UTC | #6
On Fri, 2020-10-16 at 17:26 +0300, Vladimir Oltean wrote:
> On Fri, Oct 16, 2020 at 01:34:55PM +0200, Heiner Kallweit wrote:

> > I'm aware of the topic, but missing the benefits of the irqoff version

> > unconditionally doesn't seem to be the best option.

>

> What are the benefits of the irqoff version? As far as I see it, the

> only use case for that function is when the caller has _explicitly_

> disabled interrupts.


Yeah, it's a straight up correctness issue as it sits.  There is a
dinky bit of overhead added to the general case when using the correct
function though, at least on x86.  I personally don't see why we should
care deeply enough to want to add more code to avoid it given there are
about a zillions places where we do the same for the same reason, but
that's a maintainer call.

	-Mike
Vladimir Oltean Oct. 16, 2020, 5:19 p.m. UTC | #7
On Fri, Oct 16, 2020 at 07:11:22PM +0200, Mike Galbraith wrote:
> Yeah, it's a straight up correctness issue as it sits.


Yeah, tell me about it, you don't even want to know what it looks like
when the local_softirq_pending_ref percpu mask gets corrupted. The
symptom will be that random syscalls, like clock_nanosleep, will make
user processes hang in the kernel, because their timers will appear to
never expire. You'll also see negative expiry times in /proc/timer_list.
Eventually the entire system dies. All of that, reproducible with a
simple flood ping, given enough time.  Ask me how I come to know....
The 215602a8d212 ("enetc: use napi_schedule to be compatible with
PREEMPT_RT") commit is from before the lockdep warning came to be.
Heiner Kallweit Oct. 16, 2020, 7:15 p.m. UTC | #8
On 16.10.2020 19:11, Mike Galbraith wrote:
> On Fri, 2020-10-16 at 17:26 +0300, Vladimir Oltean wrote:

>> On Fri, Oct 16, 2020 at 01:34:55PM +0200, Heiner Kallweit wrote:

>>> I'm aware of the topic, but missing the benefits of the irqoff version

>>> unconditionally doesn't seem to be the best option.

>>

>> What are the benefits of the irqoff version? As far as I see it, the

>> only use case for that function is when the caller has _explicitly_

>> disabled interrupts.

> 

> Yeah, it's a straight up correctness issue as it sits.  There is a

> dinky bit of overhead added to the general case when using the correct

> function though, at least on x86.  I personally don't see why we should

> care deeply enough to want to add more code to avoid it given there are

> about a zillions places where we do the same for the same reason, but

> that's a maintainer call.

> 

Of course switching back to napi_schedule() is the easiest solution,
and also for r8169 we may come to the conclusion that it's the best one.
(or, considering full RT, we may even remove the irqoff version completely)
But we should spend at least a few thoughts on whether and how the
irqoff version could be improved. This would have two benefits:
- avoid the local_irq_save/local_irq_restore overhead (architecture-dependent)
- automatically fix all drivers using the irqoff version
If others go the easy way, then this doesn't mean that we must not think
about a better way.
Mike Galbraith Oct. 17, 2020, 2:26 a.m. UTC | #9
On Fri, 2020-10-16 at 21:15 +0200, Heiner Kallweit wrote:
>

> But we should spend at least a few thoughts on whether and how the

> irqoff version could be improved. This would have two benefits:


I disagree, using the irqoff version in a place that irqoff is not
always true is a bug.

*Maybe* it would be worth it to twiddle the regular version, but color
me highly skeptical.  Branches ain't free (and arm for one already adds
one), and static branches are not generic whereas napi_schedule() is.

	-Mike
diff mbox series

Patch

--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -4573,7 +4573,7 @@  static irqreturn_t rtl8169_interrupt(int
 	}

 	rtl_irq_disable(tp);
-	napi_schedule_irqoff(&tp->napi);
+	napi_schedule(&tp->napi);
 out:
 	rtl_ack_events(tp, status);