Message ID | 20240516200555.33982-1-trintaeoitogc@gmail.com |
---|---|
State | New |
Headers | show |
Series | [thermal] adding check if the thermal firmware is running | expand |
On 5/16/2024 10:05 PM, Guilherme Giacomo Simoes wrote: > in the dmesg is showing the message "failed to read out thermal zone" > as if the temperature read is failed by don't find the thermal zone. > > After researching and debugging, I see that this specific error is > occurrence because the thermal try read the temperature when is started, > but the firmware is not running yet. > > For more legibiliti i create the NOTLOAD error code in the errno-base.h, > and in the iwl_mvm_tzone_get_temp() on tt.c i check if firmware is > running and I set the NOTLOAD code for ret variable and goto out. > > After this, in the update_temperature() in the thermal_code.c i received > the return of thermal_zone_get_temp() and check if return is NOTLOAD, > because if it is, I print the warning message "firmware yet not load" > and return for caller > > The thermal_core.c i think that is generic for any thermal drivers and > not only used for tt.c of course. > But if this ipotetic driver not check if firmware is running before read > the temperature, the thermal_code.c is work as a before this change. > > After this change, in my computer I compile and install kernel in /boot > and in my dmesg the message "failed to read out thermal zone" is not > show any more. In your place the warning messafe "Firmware yet not > load" is showing. > > I would like to thank you in advance for any contribution, suggestion > or criticism of my patch suggestion. > --- > drivers/net/wireless/intel/iwlwifi/mvm/tt.c | 10 ++++++++-- > drivers/thermal/thermal_core.c | 10 +++++++--- > include/uapi/asm-generic/errno-base.h | 1 + > 3 files changed, 16 insertions(+), 5 deletions(-) > > diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/tt.c b/drivers/net/wireless/intel/iwlwifi/mvm/tt.c > index 8083c4b2ab6b..dd5725db06d2 100644 > --- a/drivers/net/wireless/intel/iwlwifi/mvm/tt.c > +++ b/drivers/net/wireless/intel/iwlwifi/mvm/tt.c > @@ -620,8 +620,14 @@ static int iwl_mvm_tzone_get_temp(struct thermal_zone_device *device, > > mutex_lock(&mvm->mutex); > > - if (!iwl_mvm_firmware_running(mvm) || > - mvm->fwrt.cur_fw_img != IWL_UCODE_REGULAR) { > + const int res = iwl_mvm_firmware_running(mvm); > + > + if (!res) { > + ret = -NOTLOAD; > + goto out; > + } > + > + if (mvm->fwrt.cur_fw_img != IWL_UCODE_REGULAR) { > ret = -ENODATA; > goto out; > } > diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c > index 34a31bc72023..4116d312d4a1 100644 > --- a/drivers/thermal/thermal_core.c > +++ b/drivers/thermal/thermal_core.c > @@ -414,10 +414,14 @@ static void handle_thermal_trip(struct thermal_zone_device *tz, > > static void update_temperature(struct thermal_zone_device *tz) > { > - int temp, ret; > - > - ret = __thermal_zone_get_temp(tz, &temp); > + int temp; > + int ret = __thermal_zone_get_temp(tz, &temp); > if (ret) { > + if (ret == -NOTLOAD) { > + pr_warn("Firmware yet not load"); > + return; > + } > + The thermal core doesn't need to be modified for this. Please print the new message from the driver and you may as well return -EAGAIN from it in all cases when the issue is expected to be intermittent to prevent the core from printing the (existing) warning message. Thanks! > if (ret != -EAGAIN) > dev_warn(&tz->device, > "failed to read out thermal zone (%d)\n", > diff --git a/include/uapi/asm-generic/errno-base.h b/include/uapi/asm-generic/errno-base.h > index 9653140bff92..8b92c41f7993 100644 > --- a/include/uapi/asm-generic/errno-base.h > +++ b/include/uapi/asm-generic/errno-base.h > @@ -36,5 +36,6 @@ > #define EPIPE 32 /* Broken pipe */ > #define EDOM 33 /* Math argument out of domain of func */ > #define ERANGE 34 /* Math result not representable */ > +#define NOTLOAD 35 /* Firmware yet not load */ > > #endif
Okay, this makes sense. I will send a new patch with your suggestions. Thanks Em sex., 17 de mai. de 2024 às 07:20, Wysocki, Rafael J <rafael.j.wysocki@intel.com> escreveu: > > > On 5/16/2024 10:05 PM, Guilherme Giacomo Simoes wrote: > > in the dmesg is showing the message "failed to read out thermal zone" > > as if the temperature read is failed by don't find the thermal zone. > > > > After researching and debugging, I see that this specific error is > > occurrence because the thermal try read the temperature when is started, > > but the firmware is not running yet. > > > > For more legibiliti i create the NOTLOAD error code in the errno-base.h, > > and in the iwl_mvm_tzone_get_temp() on tt.c i check if firmware is > > running and I set the NOTLOAD code for ret variable and goto out. > > > > After this, in the update_temperature() in the thermal_code.c i received > > the return of thermal_zone_get_temp() and check if return is NOTLOAD, > > because if it is, I print the warning message "firmware yet not load" > > and return for caller > > > > The thermal_core.c i think that is generic for any thermal drivers and > > not only used for tt.c of course. > > But if this ipotetic driver not check if firmware is running before read > > the temperature, the thermal_code.c is work as a before this change. > > > > After this change, in my computer I compile and install kernel in /boot > > and in my dmesg the message "failed to read out thermal zone" is not > > show any more. In your place the warning messafe "Firmware yet not > > load" is showing. > > > > I would like to thank you in advance for any contribution, suggestion > > or criticism of my patch suggestion. > > --- > > drivers/net/wireless/intel/iwlwifi/mvm/tt.c | 10 ++++++++-- > > drivers/thermal/thermal_core.c | 10 +++++++--- > > include/uapi/asm-generic/errno-base.h | 1 + > > 3 files changed, 16 insertions(+), 5 deletions(-) > > > > diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/tt.c b/drivers/net/wireless/intel/iwlwifi/mvm/tt.c > > index 8083c4b2ab6b..dd5725db06d2 100644 > > --- a/drivers/net/wireless/intel/iwlwifi/mvm/tt.c > > +++ b/drivers/net/wireless/intel/iwlwifi/mvm/tt.c > > @@ -620,8 +620,14 @@ static int iwl_mvm_tzone_get_temp(struct thermal_zone_device *device, > > > > mutex_lock(&mvm->mutex); > > > > - if (!iwl_mvm_firmware_running(mvm) || > > - mvm->fwrt.cur_fw_img != IWL_UCODE_REGULAR) { > > + const int res = iwl_mvm_firmware_running(mvm); > > + > > + if (!res) { > > + ret = -NOTLOAD; > > + goto out; > > + } > > + > > + if (mvm->fwrt.cur_fw_img != IWL_UCODE_REGULAR) { > > ret = -ENODATA; > > goto out; > > } > > diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c > > index 34a31bc72023..4116d312d4a1 100644 > > --- a/drivers/thermal/thermal_core.c > > +++ b/drivers/thermal/thermal_core.c > > @@ -414,10 +414,14 @@ static void handle_thermal_trip(struct thermal_zone_device *tz, > > > > static void update_temperature(struct thermal_zone_device *tz) > > { > > - int temp, ret; > > - > > - ret = __thermal_zone_get_temp(tz, &temp); > > + int temp; > > + int ret = __thermal_zone_get_temp(tz, &temp); > > if (ret) { > > + if (ret == -NOTLOAD) { > > + pr_warn("Firmware yet not load"); > > + return; > > + } > > + > > The thermal core doesn't need to be modified for this. > > Please print the new message from the driver and you may as well return > -EAGAIN from it in all cases when the issue is expected to be > intermittent to prevent the core from printing the (existing) warning > message. > > Thanks! > > > > if (ret != -EAGAIN) > > dev_warn(&tz->device, > > "failed to read out thermal zone (%d)\n", > > diff --git a/include/uapi/asm-generic/errno-base.h b/include/uapi/asm-generic/errno-base.h > > index 9653140bff92..8b92c41f7993 100644 > > --- a/include/uapi/asm-generic/errno-base.h > > +++ b/include/uapi/asm-generic/errno-base.h > > @@ -36,5 +36,6 @@ > > #define EPIPE 32 /* Broken pipe */ > > #define EDOM 33 /* Math argument out of domain of func */ > > #define ERANGE 34 /* Math result not representable */ > > +#define NOTLOAD 35 /* Firmware yet not load */ > > > > #endif
diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/tt.c b/drivers/net/wireless/intel/iwlwifi/mvm/tt.c index 8083c4b2ab6b..dd5725db06d2 100644 --- a/drivers/net/wireless/intel/iwlwifi/mvm/tt.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/tt.c @@ -620,8 +620,14 @@ static int iwl_mvm_tzone_get_temp(struct thermal_zone_device *device, mutex_lock(&mvm->mutex); - if (!iwl_mvm_firmware_running(mvm) || - mvm->fwrt.cur_fw_img != IWL_UCODE_REGULAR) { + const int res = iwl_mvm_firmware_running(mvm); + + if (!res) { + ret = -NOTLOAD; + goto out; + } + + if (mvm->fwrt.cur_fw_img != IWL_UCODE_REGULAR) { ret = -ENODATA; goto out; } diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c index 34a31bc72023..4116d312d4a1 100644 --- a/drivers/thermal/thermal_core.c +++ b/drivers/thermal/thermal_core.c @@ -414,10 +414,14 @@ static void handle_thermal_trip(struct thermal_zone_device *tz, static void update_temperature(struct thermal_zone_device *tz) { - int temp, ret; - - ret = __thermal_zone_get_temp(tz, &temp); + int temp; + int ret = __thermal_zone_get_temp(tz, &temp); if (ret) { + if (ret == -NOTLOAD) { + pr_warn("Firmware yet not load"); + return; + } + if (ret != -EAGAIN) dev_warn(&tz->device, "failed to read out thermal zone (%d)\n", diff --git a/include/uapi/asm-generic/errno-base.h b/include/uapi/asm-generic/errno-base.h index 9653140bff92..8b92c41f7993 100644 --- a/include/uapi/asm-generic/errno-base.h +++ b/include/uapi/asm-generic/errno-base.h @@ -36,5 +36,6 @@ #define EPIPE 32 /* Broken pipe */ #define EDOM 33 /* Math argument out of domain of func */ #define ERANGE 34 /* Math result not representable */ +#define NOTLOAD 35 /* Firmware yet not load */ #endif