Message ID | 20231006153612.5851-3-sumitg@nvidia.com |
---|---|
State | New |
Series | Add support for _TFP and change throttle pctg |
On 06/10/23 21:22, Rafael J. Wysocki wrote:
> On Fri, Oct 6, 2023 at 5:36 PM Sumit Gupta <sumitg@nvidia.com> wrote:
>>
>> From: Srikar Srimath Tirumala <srikars@nvidia.com>
>>
>> Current implementation of processor_thermal performs software throttling
>> in fixed steps of "20%" which can be too coarse for some platforms.
>> We observed some performance gain after reducing the throttle percentage.
>> Change the CPUFREQ thermal reduction percentage and maximum thermal steps
>> to be configurable. Also, update the default values of both for Nvidia
>> Tegra241 (Grace) SoC. The thermal reduction percentage is reduced to "5%"
>> and accordingly the maximum number of thermal steps are increased as they
>> are derived from the reduction percentage.
>>
>> Signed-off-by: Srikar Srimath Tirumala <srikars@nvidia.com>
>> Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
>> ---
>>  drivers/acpi/processor_thermal.c | 43 +++++++++++++++++++++++++++++---
>>  1 file changed, 40 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/acpi/processor_thermal.c b/drivers/acpi/processor_thermal.c
>> index b7c6287eccca..677ba8bc3fbc 100644
>> --- a/drivers/acpi/processor_thermal.c
>> +++ b/drivers/acpi/processor_thermal.c
>> @@ -26,7 +26,16 @@
>>   */
>>
>>  #define CPUFREQ_THERMAL_MIN_STEP 0
>> -#define CPUFREQ_THERMAL_MAX_STEP 3
>> +
>> +static int cpufreq_thermal_max_step __read_mostly = 3;
>> +
>> +/*
>> + * Minimum throttle percentage for processor_thermal cooling device.
>> + * The processor_thermal driver uses it to calculate the percentage amount by
>> + * which cpu frequency must be reduced for each cooling state. This is also used
>> + * to calculate the maximum number of throttling steps or cooling states.
>> + */
>> +static int cpufreq_thermal_pctg __read_mostly = 20;
>>
>>  static DEFINE_PER_CPU(unsigned int, cpufreq_thermal_reduction_pctg);
>>
>> @@ -71,7 +80,7 @@ static int cpufreq_get_max_state(unsigned int cpu)
>>  	if (!cpu_has_cpufreq(cpu))
>>  		return 0;
>>
>> -	return CPUFREQ_THERMAL_MAX_STEP;
>> +	return cpufreq_thermal_max_step;
>>  }
>>
>>  static int cpufreq_get_cur_state(unsigned int cpu)
>> @@ -113,7 +122,8 @@ static int cpufreq_set_cur_state(unsigned int cpu, int state)
>>  		if (!policy)
>>  			return -EINVAL;
>>
>> -		max_freq = (policy->cpuinfo.max_freq * (100 - reduction_pctg(i) * 20)) / 100;
>> +		max_freq = (policy->cpuinfo.max_freq *
>> +			    (100 - reduction_pctg(i) * cpufreq_thermal_pctg)) / 100;
>>
>>  		cpufreq_cpu_put(policy);
>>
>> @@ -126,10 +136,37 @@ static int cpufreq_set_cur_state(unsigned int cpu, int state)
>>  	return 0;
>>  }
>>
>> +#ifdef CONFIG_HAVE_ARM_SMCCC_DISCOVERY
>> +#define SMCCC_SOC_ID_T241 0x036b0241
>> +
>> +static void acpi_thermal_cpufreq_config_nvidia(void)
>> +{
>> +	s32 soc_id = arm_smccc_get_soc_id_version();
>> +
>> +	/* Check JEP106 code for NVIDIA Tegra241 chip (036b:0241) */
>> +	if (soc_id < 0 || soc_id != SMCCC_SOC_ID_T241)
>> +		return;
>> +
>> +	/* Reduce the CPUFREQ Thermal reduction percentage to 5% */
>> +	cpufreq_thermal_pctg = 5;
>> +
>> +	/*
>> +	 * Derive the MAX_STEP from minimum throttle percentage so that the reduction
>> +	 * percentage doesn't end up becoming negative. Also, cap the MAX_STEP so that
>> +	 * the CPU performance doesn't become 0.
>> +	 */
>> +	cpufreq_thermal_max_step = (100 / cpufreq_thermal_pctg) - 1;
>> +}
>
> Looks better now, but one more thing: This is introducing an
> ARM-specific piece of code into an otherwise generic file and there is
> drivers/acpi/arm64/ for ARM-specific code, so I would very much prefer
> this piece of code to go there.
>
> Of course, it won't be able to modify the static variables directly
> then, but what if instead it defines functions to return the
> appropriate values?
>
> The variables in question could be initialized with the help of those
> functions then.
>

Hi Rafael,

Thank you for the review!
Have done the suggested change and sent v4[1].
Please suggest if it looks fine now (or) needs any further change.

[1] https://lore.kernel.org/lkml/20231009171839.12267-1-sumitg@nvidia.com/

Best Regards,
Sumit Gupta

>> +#else
>> +static inline void acpi_thermal_cpufreq_config_nvidia(void) {}
>> +#endif
>> +
>>  void acpi_thermal_cpufreq_init(struct cpufreq_policy *policy)
>>  {
>>  	unsigned int cpu;
>>
>> +	acpi_thermal_cpufreq_config_nvidia();
>> +
>>  	for_each_cpu(cpu, policy->related_cpus) {
>>  		struct acpi_processor *pr = per_cpu(processors, cpu);
>>  		int ret;
>> --
>> 2.17.1
>>
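For readers following the review, the split Rafael suggests (arch-specific code returns the values, generic code initializes its own variables from them) might look roughly like the user-space sketch below. This is only an illustration and is not taken from v4: the names `arch_thermal_cpufreq_pctg`, `thermal_cfg` and `thermal_cfg_init` are hypothetical, and the SoC ID is passed in as a parameter instead of being read with `arm_smccc_get_soc_id_version()` so that the example compiles outside the kernel.

```c
#include <assert.h>

/* ---- Would live under drivers/acpi/arm64/ (hypothetical names) ---- */
#define SOC_ID_T241 0x036b0241

/*
 * Return the per-step throttle percentage for this SoC,
 * or 0 when the platform has no override.
 * Assumption: soc_id is supplied by the caller for illustration.
 */
static int arch_thermal_cpufreq_pctg(int soc_id)
{
	return soc_id == SOC_ID_T241 ? 5 : 0;
}

/* ---- Generic side: owns the variables, only asks for values ---- */
struct thermal_cfg {
	int pctg;      /* percentage removed per cooling state */
	int max_step;  /* highest cooling state */
};

static struct thermal_cfg thermal_cfg_init(int soc_id)
{
	/* Defaults taken from the patch: 20% steps, max step 3. */
	struct thermal_cfg cfg = { .pctg = 20, .max_step = 3 };
	int pctg = arch_thermal_cpufreq_pctg(soc_id);

	assert(pctg <= 100);
	if (pctg > 0) {
		cfg.pctg = pctg;
		/* Same derivation as in the patch. */
		cfg.max_step = (100 / pctg) - 1;
	}
	return cfg;
}
```

With this shape the generic file keeps its static variables private, and the arch code never writes to them directly.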
diff --git a/drivers/acpi/processor_thermal.c b/drivers/acpi/processor_thermal.c
index b7c6287eccca..677ba8bc3fbc 100644
--- a/drivers/acpi/processor_thermal.c
+++ b/drivers/acpi/processor_thermal.c
@@ -26,7 +26,16 @@
  */
 
 #define CPUFREQ_THERMAL_MIN_STEP 0
-#define CPUFREQ_THERMAL_MAX_STEP 3
+
+static int cpufreq_thermal_max_step __read_mostly = 3;
+
+/*
+ * Minimum throttle percentage for processor_thermal cooling device.
+ * The processor_thermal driver uses it to calculate the percentage amount by
+ * which cpu frequency must be reduced for each cooling state. This is also used
+ * to calculate the maximum number of throttling steps or cooling states.
+ */
+static int cpufreq_thermal_pctg __read_mostly = 20;
 
 static DEFINE_PER_CPU(unsigned int, cpufreq_thermal_reduction_pctg);
 
@@ -71,7 +80,7 @@ static int cpufreq_get_max_state(unsigned int cpu)
 	if (!cpu_has_cpufreq(cpu))
 		return 0;
 
-	return CPUFREQ_THERMAL_MAX_STEP;
+	return cpufreq_thermal_max_step;
 }
 
 static int cpufreq_get_cur_state(unsigned int cpu)
@@ -113,7 +122,8 @@ static int cpufreq_set_cur_state(unsigned int cpu, int state)
 		if (!policy)
 			return -EINVAL;
 
-		max_freq = (policy->cpuinfo.max_freq * (100 - reduction_pctg(i) * 20)) / 100;
+		max_freq = (policy->cpuinfo.max_freq *
+			    (100 - reduction_pctg(i) * cpufreq_thermal_pctg)) / 100;
 
 		cpufreq_cpu_put(policy);
 
@@ -126,10 +136,37 @@ static int cpufreq_set_cur_state(unsigned int cpu, int state)
 	return 0;
 }
 
+#ifdef CONFIG_HAVE_ARM_SMCCC_DISCOVERY
+#define SMCCC_SOC_ID_T241 0x036b0241
+
+static void acpi_thermal_cpufreq_config_nvidia(void)
+{
+	s32 soc_id = arm_smccc_get_soc_id_version();
+
+	/* Check JEP106 code for NVIDIA Tegra241 chip (036b:0241) */
+	if (soc_id < 0 || soc_id != SMCCC_SOC_ID_T241)
+		return;
+
+	/* Reduce the CPUFREQ Thermal reduction percentage to 5% */
+	cpufreq_thermal_pctg = 5;
+
+	/*
+	 * Derive the MAX_STEP from minimum throttle percentage so that the reduction
+	 * percentage doesn't end up becoming negative. Also, cap the MAX_STEP so that
+	 * the CPU performance doesn't become 0.
+	 */
+	cpufreq_thermal_max_step = (100 / cpufreq_thermal_pctg) - 1;
+}
+#else
+static inline void acpi_thermal_cpufreq_config_nvidia(void) {}
+#endif
+
 void acpi_thermal_cpufreq_init(struct cpufreq_policy *policy)
 {
 	unsigned int cpu;
 
+	acpi_thermal_cpufreq_config_nvidia();
+
 	for_each_cpu(cpu, policy->related_cpus) {
 		struct acpi_processor *pr = per_cpu(processors, cpu);
 		int ret;
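
To make the step arithmetic in the patch concrete, here is a minimal user-space sketch of the two formulas it uses. The helper names `throttled_max_freq` and `derived_max_step` are hypothetical and do not exist in the kernel; frequencies are assumed to be in kHz and small enough that multiplying by 100 does not overflow an `unsigned int`.

```c
#include <assert.h>

/*
 * Frequency cap for a given cooling state, using the patch's formula:
 *   max_freq = cpuinfo_max_freq * (100 - state * pctg) / 100
 * Hypothetical helper for illustration only.
 */
static unsigned int throttled_max_freq(unsigned int cpuinfo_max_freq,
				       unsigned int state,
				       unsigned int pctg)
{
	assert(pctg > 0 && pctg <= 100);
	/* The max-step cap is what keeps this factor strictly positive. */
	assert(state * pctg < 100);
	return (cpuinfo_max_freq * (100 - state * pctg)) / 100;
}

/* Highest cooling state derived from the percentage, as in the patch. */
static unsigned int derived_max_step(unsigned int pctg)
{
	assert(pctg > 0 && pctg <= 100);
	return (100 / pctg) - 1;
}
```

With 5% steps the derived maximum state is 19, so the deepest state still leaves 5% of `cpuinfo.max_freq` (for example 150000 kHz out of 3000000 kHz). With the default 20% steps and the unchanged maximum state of 3, the floor is 40% of the maximum frequency; note that this default of 3 is kept as-is in the patch rather than derived from the formula, which would give 4.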