diff mbox series

acpi-cpufreq: Skip initialization if a cpufreq driver exists

Message ID 20210521221906.199436-1-kyle.meyer@hpe.com
State New
Series acpi-cpufreq: Skip initialization if a cpufreq driver exists

Commit Message

Kyle Meyer May 21, 2021, 10:19 p.m. UTC
Revert part of commit 75c0758137c7a
("acpi-cpufreq: Fail initialization if driver cannot be registered").

acpi-cpufreq is mutually exclusive with intel_pstate; however,
acpi-cpufreq is loaded multiple times during startup while intel_pstate is
enabled. On systems using systemd the kernel triggers one uevent for each
device as a result of systemd-udev-trigger.service. The service exists to
retrigger all devices as uevents sent by the kernel before systemd-udevd
is running are missed. The delay caused by systemd-udevd repeatedly loading
the driver, getting a fail return, and unloading the driver twice per
logical CPU has a significant impact on the startup time, and can cause
some devices to be unavailable after reaching the root login prompt.

Load the driver once but skip initialization if a cpufreq driver exists by
changing the return value of acpi_cpufreq_init() from -EEXIST to 0 when
cpufreq_get_current_driver() finds one.

Signed-off-by: Kyle Meyer <kyle.meyer@hpe.com>
---
 drivers/cpufreq/acpi-cpufreq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Rafael J. Wysocki May 24, 2021, 2:16 p.m. UTC | #1
On Sat, May 22, 2021 at 12:19 AM Kyle Meyer <kyle.meyer@hpe.com> wrote:
>
> Revert part of commit 75c0758137c7a
> ("acpi-cpufreq: Fail initialization if driver cannot be registered").
>
> acpi-cpufreq is mutually exclusive with intel_pstate; however,
> acpi-cpufreq is loaded multiple times during startup while intel_pstate is
> enabled. On systems using systemd the kernel triggers one uevent for each
> device as a result of systemd-udev-trigger.service. The service exists to
> retrigger all devices as uevents sent by the kernel before systemd-udevd
> is running are missed. The delay caused by systemd-udevd repeatedly loading
> the driver, getting a fail return, and unloading the driver twice per
> logical CPU has a significant impact on the startup time, and can cause
> some devices to be unavailable after reaching the root login prompt.
>
> Load the driver once but skip initialization if a cpufreq driver exists by
> changing the return value of acpi_cpufreq_init() from -EEXIST to 0 when
> cpufreq_get_current_driver() finds one.
>
> Signed-off-by: Kyle Meyer <kyle.meyer@hpe.com>
> ---
>  drivers/cpufreq/acpi-cpufreq.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c
> index 7e7450453714..e79a945369d1 100644
> --- a/drivers/cpufreq/acpi-cpufreq.c
> +++ b/drivers/cpufreq/acpi-cpufreq.c
> @@ -1003,7 +1003,7 @@ static int __init acpi_cpufreq_init(void)
>
>         /* don't keep reloading if cpufreq_driver exists */
>         if (cpufreq_get_current_driver())
> -               return -EEXIST;
> +               return 0;
>
>         pr_debug("%s\n", __func__);
>
> --

Applied as 5.14 material with some edits in the subject and changelog, thanks!
Viresh Kumar June 7, 2021, 7:25 a.m. UTC | #2
Hi Rafael,

On Mon, May 24, 2021 at 7:47 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
> On Sat, May 22, 2021 at 12:19 AM Kyle Meyer <kyle.meyer@hpe.com> wrote:
>
> > diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c
> > index 7e7450453714..e79a945369d1 100644
> > --- a/drivers/cpufreq/acpi-cpufreq.c
> > +++ b/drivers/cpufreq/acpi-cpufreq.c
> > @@ -1003,7 +1003,7 @@ static int __init acpi_cpufreq_init(void)
> >
> >         /* don't keep reloading if cpufreq_driver exists */
> >         if (cpufreq_get_current_driver())
> > -               return -EEXIST;
> > +               return 0;
> >
> >         pr_debug("%s\n", __func__);
> >
> > --
>
> Applied as 5.14 material with some edits in the subject and changelog, thanks!

I am not sure how this is supposed to work. If we return 0 from
acpi_cpufreq_init(), then the driver will never be used, since its
acpi_cpufreq_init() will never get called again later.

cpufreq drivers don't follow the generic device/driver model where a driver gets
probed again if a device appears and so this is broken.

Please revert this patch.
Rafael J. Wysocki June 7, 2021, 11:02 a.m. UTC | #3
On Mon, Jun 7, 2021 at 9:26 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> Hi Rafael,
>
> On Mon, May 24, 2021 at 7:47 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
> > On Sat, May 22, 2021 at 12:19 AM Kyle Meyer <kyle.meyer@hpe.com> wrote:
>
> > > diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c
> > > index 7e7450453714..e79a945369d1 100644
> > > --- a/drivers/cpufreq/acpi-cpufreq.c
> > > +++ b/drivers/cpufreq/acpi-cpufreq.c
> > > @@ -1003,7 +1003,7 @@ static int __init acpi_cpufreq_init(void)
> > >
> > >         /* don't keep reloading if cpufreq_driver exists */
> > >         if (cpufreq_get_current_driver())
> > > -               return -EEXIST;
> > > +               return 0;
> > >
> > >         pr_debug("%s\n", __func__);
> > >
> > > --
> >
> > Applied as 5.14 material with some edits in the subject and changelog, thanks!
>
> I am not sure how this is supposed to work. If we return 0 from
> acpi_cpufreq_init(), then the driver will never be used, since its
> acpi_cpufreq_init() will never get called again later.

Unless the module is unloaded and loaded again, that is.

> cpufreq drivers don't follow the generic device/driver model where a driver gets
> probed again if a device appears and so this is broken.

It is broken anyway as per the changelog of this patch.

On systems with several hundred logical CPUs this really can be troublesome.

> Please revert this patch.

Well, you can argue that the problem at hand is outside the kernel and
so it's not the kernel's business to address it.

After all, systemd-udevd could learn to avoid attempting to load the
module again if it fails with -EEXIST, but I'm not sure how different
that really would be from what this patch does, in practice.
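
The userspace alternative mentioned above can be sketched roughly as follows. This is a minimal illustration only, not how systemd-udevd or kmod actually behaves: `struct skip_cache`, `request_module_cached()` and `fake_load_always_eexist()` are hypothetical names invented for this example, and the loader callback stands in for whatever call really inserts the module.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKIP_CACHE_MAX 16

/* Hypothetical cache of module names whose init already failed with -EEXIST. */
struct skip_cache {
	const char *names[SKIP_CACHE_MAX]; /* pointers must outlive the cache */
	size_t count;
};

/* Stand-in for the real module loader: returns 0 or a negative errno. */
typedef int (*load_fn)(const char *name);

static bool skip_cache_contains(const struct skip_cache *c, const char *name)
{
	for (size_t i = 0; i < c->count; i++)
		if (strcmp(c->names[i], name) == 0)
			return true;
	return false;
}

/*
 * Try to load a module, but do not call the loader again for a module
 * that has already reported -EEXIST once.
 */
static int request_module_cached(struct skip_cache *c, const char *name,
				 load_fn load)
{
	assert(c && name && load);

	if (skip_cache_contains(c, name))
		return -EEXIST; /* answered from the cache, no load attempt */

	int ret = load(name);

	if (ret == -EEXIST && c->count < SKIP_CACHE_MAX)
		c->names[c->count++] = name;
	return ret;
}

/* Fake loader for illustration: counts calls and always reports -EEXIST. */
static unsigned int fake_load_calls;

static int fake_load_always_eexist(const char *name)
{
	(void)name;
	fake_load_calls++;
	return -EEXIST;
}
```

With such a cache, a burst of per-CPU load requests for the same module would reach the loader only once, while the kernel would keep reporting the failure. The sketch ignores invalidation, for example when the active cpufreq driver goes away later.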
Viresh Kumar June 7, 2021, 11:14 a.m. UTC | #4
On 07-06-21, 13:02, Rafael J. Wysocki wrote:
> On Mon, Jun 7, 2021 at 9:26 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > I am not sure how this is supposed to work. If we return 0 from
> > acpi_cpufreq_init(), then the driver will never be used, since its
> > acpi_cpufreq_init() will never get called again later.
>
> Unless the module is unloaded and loaded again, that is.

Right.

> > cpufreq drivers don't follow the generic device/driver model where a driver gets
> > probed again if a device appears and so this is broken.
>
> It is broken anyway as per the changelog of this patch.
>
> On systems with several hundred logical CPUs this really can be troublesome.

Hmm, I agree.

> > Please revert this patch.
>
> Well, you can argue that the problem at hand is outside the kernel and
> so it's not the kernel's business to address it.

Exactly. What we did here is add a band-aid to make a userspace tool
happy; the kernel was doing the right thing earlier.

> After all, systemd-udevd could learn to avoid attempting to load the
> module again if it fails with -EEXIST,

That is one way, right.

> but I'm not sure how different
> that really would be from what this patch does, in practice.

The very first difference is that we won't be adding an incorrect hack
in the kernel to solve this userspace problem. Otherwise, to make the
acpi-cpufreq driver work after a user unloads intel-pstate, the user
would be required to unload acpi-cpufreq and load it again, which
will surely look confusing to the user. Why unload it only to load it
again?

Leaving a module inserted in an unusable state is not the right
solution to fix a problem IMHO.

-- 
viresh

Patch

diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c
index 7e7450453714..e79a945369d1 100644
--- a/drivers/cpufreq/acpi-cpufreq.c
+++ b/drivers/cpufreq/acpi-cpufreq.c
@@ -1003,7 +1003,7 @@  static int __init acpi_cpufreq_init(void)
 
 	/* don't keep reloading if cpufreq_driver exists */
 	if (cpufreq_get_current_driver())
-		return -EEXIST;
+		return 0;
 
 	pr_debug("%s\n", __func__);
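
To make the effect of the one-line change easier to follow, here is a minimal user-space model of the guard and of a loader that requests the module once per uevent. It is a sketch under stated assumptions, not kernel code: `struct sim_cpufreq_core`, `sim_acpi_cpufreq_init()` and `sim_count_init_calls()` are hypothetical names, and the model assumes that a module whose init fails is discarded (so the next request runs init again), while a module whose init returns 0 stays resident.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the cpufreq core's registered-driver state. */
struct sim_cpufreq_core {
	const char *current_driver; /* NULL when no driver is registered */
};

/*
 * Models only the guard at the top of acpi_cpufreq_init().
 * patched == false: behaviour before the patch (-EEXIST).
 * patched == true:  behaviour after the patch (0).
 */
static int sim_acpi_cpufreq_init(const struct sim_cpufreq_core *core,
				 bool patched)
{
	assert(core != NULL);

	if (core->current_driver)
		return patched ? 0 : -EEXIST;

	return 0; /* the real driver would go on to register itself here */
}

/*
 * Models a loader that requests the module once per uevent and counts
 * how many times the init function ends up running.
 */
static unsigned int sim_count_init_calls(const struct sim_cpufreq_core *core,
					 bool patched, unsigned int nr_uevents)
{
	bool resident = false;
	unsigned int calls = 0;

	for (unsigned int i = 0; i < nr_uevents; i++) {
		if (resident)
			continue; /* already loaded, nothing to do */
		calls++;
		if (sim_acpi_cpufreq_init(core, patched) == 0)
			resident = true;
	}
	return calls;
}
```

In this model, with another driver already registered, the unpatched guard runs init once per request, whereas the patched guard runs it once. The cost raised in the review is also visible: in the patched case the module stays resident although it never registered a driver.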