Message ID | 20210521221906.199436-1-kyle.meyer@hpe.com |
---|---|
State | New |
Series | acpi-cpufreq: Skip initialization if a cpufreq driver exists |
On Sat, May 22, 2021 at 12:19 AM Kyle Meyer <kyle.meyer@hpe.com> wrote:
>
> Revert part of commit 75c0758137c7a
> ("acpi-cpufreq: Fail initialization if driver cannot be registered").
>
> acpi-cpufreq is mutually exclusive with intel_pstate, however,
> acpi-cpufreq is loaded multiple times during startup while intel_pstate is
> enabled. On systems using systemd the kernel triggers one uevent for each
> device as a result of systemd-udev-trigger.service. The service exists to
> retrigger all devices as uevents sent by the kernel before systemd-udevd
> is running are missed. The delay caused by systemd-udevd repeatedly loading
> the driver, getting a fail return, and unloading the driver twice per
> logical CPU has a significant impact on the startup time, and can cause
> some devices to be unavailable after reaching the root login prompt.
>
> Load the driver once but skip initialization if a cpufreq driver exists by
> changing the return value of cpufreq_get_current_driver() from -EEXIST to
> 0.
>
> Signed-off-by: Kyle Meyer <kyle.meyer@hpe.com>
> ---
>  drivers/cpufreq/acpi-cpufreq.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c
> index 7e7450453714..e79a945369d1 100644
> --- a/drivers/cpufreq/acpi-cpufreq.c
> +++ b/drivers/cpufreq/acpi-cpufreq.c
> @@ -1003,7 +1003,7 @@ static int __init acpi_cpufreq_init(void)
>
>         /* don't keep reloading if cpufreq_driver exists */
>         if (cpufreq_get_current_driver())
> -               return -EEXIST;
> +               return 0;
>
>         pr_debug("%s\n", __func__);
>
> --

Applied as 5.14 material with some edits in the subject and changelog, thanks!
Hi Rafael,

On Mon, May 24, 2021 at 7:47 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
> On Sat, May 22, 2021 at 12:19 AM Kyle Meyer <kyle.meyer@hpe.com> wrote:
> > diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c
> > index 7e7450453714..e79a945369d1 100644
> > --- a/drivers/cpufreq/acpi-cpufreq.c
> > +++ b/drivers/cpufreq/acpi-cpufreq.c
> > @@ -1003,7 +1003,7 @@ static int __init acpi_cpufreq_init(void)
> >
> >         /* don't keep reloading if cpufreq_driver exists */
> >         if (cpufreq_get_current_driver())
> > -               return -EEXIST;
> > +               return 0;
> >
> >         pr_debug("%s\n", __func__);
> >
> > --
>
> Applied as 5.14 material with some edits in the subject and changelog, thanks!

I am not sure how this is supposed to work. If we return 0 from
acpi_cpufreq_init(), then the driver will never be used, since its
acpi_cpufreq_init() will never get called again later.

cpufreq drivers don't follow the generic device/driver model where a
driver gets probed again if a device appears and so this is broken.

Please revert this patch.
On Mon, Jun 7, 2021 at 9:26 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> Hi Rafael,
>
> On Mon, May 24, 2021 at 7:47 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
> > On Sat, May 22, 2021 at 12:19 AM Kyle Meyer <kyle.meyer@hpe.com> wrote:
> >
> > > diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c
> > > index 7e7450453714..e79a945369d1 100644
> > > --- a/drivers/cpufreq/acpi-cpufreq.c
> > > +++ b/drivers/cpufreq/acpi-cpufreq.c
> > > @@ -1003,7 +1003,7 @@ static int __init acpi_cpufreq_init(void)
> > >
> > >         /* don't keep reloading if cpufreq_driver exists */
> > >         if (cpufreq_get_current_driver())
> > > -               return -EEXIST;
> > > +               return 0;
> > >
> > >         pr_debug("%s\n", __func__);
> > >
> > > --
> >
> > Applied as 5.14 material with some edits in the subject and changelog, thanks!
>
> I am not sure how this is supposed to work. If we return 0 from
> acpi_cpufreq_init(), then the driver will never be used, since its
> acpi_cpufreq_init() will never get called again later.

Unless the module is unloaded and loaded again, that is.

> cpufreq drivers don't follow the generic device/driver model where a
> driver gets probed again if a device appears and so this is broken.

It is broken anyway as per the changelog of this patch.

On systems with several hundred logical CPUs this really can be troublesome.

> Please revert this patch.

Well, you can argue that the problem at hand is outside the kernel and
so it's not a kernel's business to address it.

After all, systemd-udevd could learn to avoid attempting to load the
module again if it fails with -EEXIST, but I'm not sure how different
that really would be from what this patch does, in practice.
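As a rough illustration of the userspace alternative raised in this reply (a loader that remembers a module failed with -EEXIST and stops retrying it), the C sketch below keeps a small skip list. It is not systemd code and makes no claim about how systemd-udevd is structured; `struct skip_list`, `skip_list_record()`, `skip_list_should_try()`, and the size limits are hypothetical names and values invented for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical limits chosen for the sketch only. */
#define MAX_SKIPPED 16
#define MAX_NAME    64

/* Hypothetical table of module names whose load failed with -EEXIST. */
struct skip_list {
    char names[MAX_SKIPPED][MAX_NAME];
    size_t count;
};

static bool skip_list_contains(const struct skip_list *sl, const char *name)
{
    for (size_t i = 0; i < sl->count; i++)
        if (strcmp(sl->names[i], name) == 0)
            return true;
    return false;
}

/* Record the outcome of one load attempt; only -EEXIST is remembered. */
static void skip_list_record(struct skip_list *sl, const char *name, int err)
{
    assert(sl != NULL && name != NULL);
    if (err != -EEXIST || skip_list_contains(sl, name))
        return;
    if (sl->count == MAX_SKIPPED || strlen(name) >= MAX_NAME)
        return; /* table full or name too long: keep retrying as before */
    strcpy(sl->names[sl->count++], name);
}

/* True if a further load attempt for this module is still worthwhile. */
static bool skip_list_should_try(const struct skip_list *sl, const char *name)
{
    assert(sl != NULL && name != NULL);
    return !skip_list_contains(sl, name);
}
```

A real implementation would also need a way to drop an entry again, for example once the conflicting driver is gone; the sketch leaves that out.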
On 07-06-21, 13:02, Rafael J. Wysocki wrote:
> On Mon, Jun 7, 2021 at 9:26 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > I am not sure how this is supposed to work. If we return 0 from
> > acpi_cpufreq_init(), then the driver will never be used, since its
> > acpi_cpufreq_init() will never get called again later.
>
> Unless the module is unloaded and loaded again, that is.

Right.

> > cpufreq drivers don't follow the generic device/driver model where a
> > driver gets probed again if a device appears and so this is broken.
>
> It is broken anyway as per the changelog of this patch.
>
> On systems with several hundred logical CPUs this really can be troublesome.

Hmm, I agree.

> > Please revert this patch.
>
> Well, you can argue that the problem at hand is outside the kernel and
> so it's not a kernel's business to address it.

Exactly, what we did here is add a band-aid to make a userspace tool
happy, the kernel was doing the right thing earlier.

> After all, systemd-udevd could learn to avoid attempting to load the
> module again if it fails with -EEXIST,

That is one way, right.

> but I'm not sure how different
> that really would be from what this patch does, in practice.

The very first difference is we won't be adding an incorrect hack in
the kernel to solve this userspace problem.

Else in order to make acpi-cpufreq driver work, after a user unloads
intel-pstate, user would be required to unload the acpi-cpufreq and
load it again, which will surely look confusing to the user. Why
unload to load it again?

Leaving a module inserted in an unusable state is not the right
solution to fix a problem IMHO.

--
viresh
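The objection above can be walked through with a small user-space model. This is a toy sketch, not kernel code: `struct cpufreq_model`, `enum model_driver`, and the `model_*()` helpers are hypothetical names, and the model assumes that a load request for an already-loaded module does not run its init function again and that registration succeeds whenever no other driver is present.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model, not kernel code; every name here is hypothetical. */
enum model_driver { DRV_NONE, DRV_INTEL_PSTATE, DRV_ACPI_CPUFREQ };

struct cpufreq_model {
    enum model_driver current; /* registered cpufreq driver, if any */
    bool acpi_loaded;          /* acpi-cpufreq module is resident */
};

/* Models a load request for acpi-cpufreq with the patch applied. */
static void model_modprobe_acpi(struct cpufreq_model *s)
{
    assert(s != NULL);
    if (s->acpi_loaded)
        return;                 /* already loaded: init does not run again */
    /* Patched init returns 0 when another driver exists, so the module stays. */
    s->acpi_loaded = true;
    if (s->current == DRV_NONE)
        s->current = DRV_ACPI_CPUFREQ; /* assumed to register successfully */
}

/* Models removing the acpi-cpufreq module. */
static void model_rmmod_acpi(struct cpufreq_model *s)
{
    assert(s != NULL);
    if (s->current == DRV_ACPI_CPUFREQ)
        s->current = DRV_NONE;
    s->acpi_loaded = false;
}

/* Models the other driver (intel_pstate in this thread) going away. */
static void model_remove_intel_pstate(struct cpufreq_model *s)
{
    assert(s != NULL);
    if (s->current == DRV_INTEL_PSTATE)
        s->current = DRV_NONE;
}
```

Starting from `{ DRV_INTEL_PSTATE, false }`, calling `model_modprobe_acpi()`, then `model_remove_intel_pstate()`, then `model_modprobe_acpi()` again leaves `current` at `DRV_NONE`; only `model_rmmod_acpi()` followed by `model_modprobe_acpi()` ends with `DRV_ACPI_CPUFREQ`, which is the unload-then-load step described in the reply.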
diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c
index 7e7450453714..e79a945369d1 100644
--- a/drivers/cpufreq/acpi-cpufreq.c
+++ b/drivers/cpufreq/acpi-cpufreq.c
@@ -1003,7 +1003,7 @@ static int __init acpi_cpufreq_init(void)

        /* don't keep reloading if cpufreq_driver exists */
        if (cpufreq_get_current_driver())
-               return -EEXIST;
+               return 0;

        pr_debug("%s\n", __func__);
Revert part of commit 75c0758137c7a
("acpi-cpufreq: Fail initialization if driver cannot be registered").

acpi-cpufreq is mutually exclusive with intel_pstate, however,
acpi-cpufreq is loaded multiple times during startup while intel_pstate is
enabled. On systems using systemd the kernel triggers one uevent for each
device as a result of systemd-udev-trigger.service. The service exists to
retrigger all devices as uevents sent by the kernel before systemd-udevd
is running are missed. The delay caused by systemd-udevd repeatedly loading
the driver, getting a fail return, and unloading the driver twice per
logical CPU has a significant impact on the startup time, and can cause
some devices to be unavailable after reaching the root login prompt.

Load the driver once but skip initialization if a cpufreq driver exists by
changing the return value of cpufreq_get_current_driver() from -EEXIST to
0.

Signed-off-by: Kyle Meyer <kyle.meyer@hpe.com>
---
 drivers/cpufreq/acpi-cpufreq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
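To make the cost described in the changelog concrete, the following C sketch counts how often the init function would run for a given number of load requests, before and after the change. It is a user-space toy model, not kernel code: `struct module_model` and the `model_*()` functions are hypothetical, the model assumes another cpufreq driver is already registered, and it assumes that a request for an already-loaded module does not run init again. In the actual patch the changed value is what `acpi_cpufreq_init()` returns when `cpufreq_get_current_driver()` reports an existing driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy model, not kernel code; every name here is hypothetical. */
struct module_model {
    bool loaded;    /* module currently resident */
    int init_calls; /* how many times the init function ran */
};

/* Models the init result while another cpufreq driver is registered. */
static int model_init(bool patched)
{
    return patched ? 0 : -EEXIST;
}

/* Models one module load request triggered by a uevent. */
static void model_load_request(struct module_model *m, bool patched)
{
    if (m->loaded)
        return;           /* already resident: nothing to do */
    m->init_calls++;
    if (model_init(patched) == 0)
        m->loaded = true; /* init succeeded, module stays loaded */
    /* On failure the module is dropped again, so loaded stays false. */
}

/* Returns how many times init runs for nrequests load requests. */
static int model_count_init_calls(int nrequests, bool patched)
{
    struct module_model m = { .loaded = false, .init_calls = 0 };

    assert(nrequests >= 0);
    for (int i = 0; i < nrequests; i++)
        model_load_request(&m, patched);
    return m.init_calls;
}
```

Taking the changelog's figure of two requests per logical CPU and an arbitrary 256 CPUs, `model_count_init_calls(2 * 256, false)` evaluates to 512, while `model_count_init_calls(2 * 256, true)` evaluates to 1.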