[v4,0/4] mfd: tps6586x: register restart handler

Message ID 20230327-tegra-pmic-reboot-v4-0-b24af219fb47@skidata.com

Message

Benjamin Bara April 13, 2023, 7:46 a.m. UTC
Hi!

The Tegra20 requires an enabled VDE power domain during startup. As the
VDE is currently not used, it is disabled during runtime.
Since 8f0c714ad9be, there is a workaround for the "normal restart path"
which enables the VDE before doing PMC's warm reboot. This workaround is
not executed in the "emergency restart path", leading to a hang-up
during start.

This series implements and registers a new pmic-based restart handler
for boards with tps6586x. This cold reboot ensures that the VDE power
domain is enabled during startup on tegra20-based boards.

Since bae1d3a05a8b, i2c transfers are non-atomic while preemption is
disabled (which is e.g. done during panic()). This could lead to
warnings ("Voluntary context switch within RCU") in i2c-based restart
handlers during emergency restart. The state of preemption should be
detected by i2c_in_atomic_xfer_mode() to use atomic i2c xfer when
required. Besides the new system_state check, the check is the same as
the one pre-v5.2.
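
As a rough userspace model of the check described above (not the kernel code itself), the sketch below compares two variants of the atomic-transfer decision. The `*_model` names, the struct and the reduced enum are made up for illustration. That the check since bae1d3a05a8b keys on disabled interrupts is an assumption not stated in this cover letter, and the context used in the example (preemption disabled, interrupts enabled) is an illustrative state, not a claim about what panic() leaves behind.

```c
#include <assert.h>
#include <stdbool.h>

/* Reduced stand-in for the kernel's system_state values; only the
 * relative order (RUNNING before RESTART) matters here. */
enum system_state_model {
	MODEL_SYSTEM_RUNNING,
	MODEL_SYSTEM_RESTART,
};

/* Hypothetical snapshot of the context a transfer is issued from. */
struct ctx_model {
	enum system_state_model system_state;
	int preempt_count;	/* > 0 while preemption is disabled */
	bool irqs_disabled;
};

/* Models preemptible(): no preempt count held and interrupts enabled. */
static bool preemptible_model(const struct ctx_model *c)
{
	return c->preempt_count == 0 && !c->irqs_disabled;
}

/* Assumed form of the check since bae1d3a05a8b. */
bool atomic_xfer_old_model(const struct ctx_model *c)
{
	return c->system_state > MODEL_SYSTEM_RUNNING && c->irqs_disabled;
}

/* Assumed form of the check with !preemptible(). */
bool atomic_xfer_new_model(const struct ctx_model *c)
{
	return c->system_state > MODEL_SYSTEM_RUNNING && !preemptible_model(c);
}

/* Usage: preemption disabled, interrupts enabled, restart in progress. */
void atomic_xfer_example(void)
{
	struct ctx_model c = { MODEL_SYSTEM_RESTART, 1, false };

	assert(!atomic_xfer_old_model(&c));	/* would sleep in the xfer */
	assert(atomic_xfer_new_model(&c));	/* picks the atomic xfer */

	/* Without the system_state update, neither variant goes atomic. */
	c.system_state = MODEL_SYSTEM_RUNNING;
	assert(!atomic_xfer_new_model(&c));
}
```

The last assertion is why the system_state change and the i2c change are needed together in this model: the preemption check alone has no effect while system_state still reads as running.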

v3: https://lore.kernel.org/r/20230327-tegra-pmic-reboot-v3-0-3c0ee3567e14@skidata.com
v2: https://lore.kernel.org/all/20230320220345.1463687-1-bbara93@gmail.com/
system_state: https://lore.kernel.org/all/20230320213230.1459532-1-bbara93@gmail.com/
v1: https://lore.kernel.org/all/20230316164703.1157813-1-bbara93@gmail.com/

v4:
- 1,2: add "Fixes" and adapt commit messages
- 4: reduce delay after requesting the restart (as suggested by Dmitry)

v3:
- bring system_state back in this series
- do atomic i2c xfer if not preemptible (as suggested by Dmitry)
- fix style issues mentioned by Dmitry
- add cc stable as suggested by Dmitry
- add explanation why this is needed for Jon

v2:
- use devm-based restart handler
- convert the existing power_off handler to a devm-based handler
- handle system_state in extra series

---
Benjamin Bara (4):
      kernel/reboot: emergency_restart: set correct system_state
      i2c: core: run atomic i2c xfer when !preemptible
      mfd: tps6586x: use devm-based power off handler
      mfd: tps6586x: register restart handler

 drivers/i2c/i2c-core.h |  2 +-
 drivers/mfd/tps6586x.c | 45 +++++++++++++++++++++++++++++++++++++--------
 kernel/reboot.c        |  1 +
 3 files changed, 39 insertions(+), 9 deletions(-)
---
base-commit: 197b6b60ae7bc51dd0814953c562833143b292aa
change-id: 20230327-tegra-pmic-reboot-4175ff814a4b

Best regards,

Comments

Wolfram Sang April 13, 2023, 7:51 p.m. UTC | #1
On Thu, Apr 13, 2023 at 09:46:40AM +0200, Benjamin Bara wrote:
> From: Benjamin Bara <benjamin.bara@skidata.com>
> 
> Since bae1d3a05a8b, i2c transfers are non-atomic if preemption is
> disabled. However, non-atomic i2c transfers require preemption (e.g. in
> wait_for_completion() while waiting for the DMA).
> 
> panic() calls preempt_disable_notrace() before calling
> emergency_restart(). Therefore, if an i2c device is used for the
> restart, the xfer should be atomic. This avoids warnings like:
> 
> [   12.667612] WARNING: CPU: 1 PID: 1 at kernel/rcu/tree_plugin.h:318 rcu_note_context_switch+0x33c/0x6b0
> [   12.676926] Voluntary context switch within RCU read-side critical section!
> ...
> [   12.742376]  schedule_timeout from wait_for_completion_timeout+0x90/0x114
> [   12.749179]  wait_for_completion_timeout from tegra_i2c_wait_completion+0x40/0x70
> ...
> [   12.994527]  atomic_notifier_call_chain from machine_restart+0x34/0x58
> [   13.001050]  machine_restart from panic+0x2a8/0x32c
> 
> Use !preemptible() instead, which is basically the same check as
> pre-v5.2.
> 
> Fixes: bae1d3a05a8b ("i2c: core: remove use of in_atomic()")
> Cc: stable@vger.kernel.org # v5.2+
> Suggested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> Signed-off-by: Benjamin Bara <benjamin.bara@skidata.com>

So, with Peter's input and me checking again:

Acked-by: Wolfram Sang <wsa@kernel.org>

I assume this shall go in via the mfd-tree. Let me know if I should pick
it instead.

Dmitry Osipenko April 13, 2023, 8:36 p.m. UTC | #2
On 4/13/23 10:46, Benjamin Bara wrote:
> +static int tps6586x_power_off_handler(struct sys_off_data *data)
>  {
> -	if (tps6586x_clr_bits(tps6586x_dev, TPS6586X_SUPPLYENE, EXITSLREQ_BIT))
> -		return;
> +	struct device *tps6586x_dev = data->cb_data;
> +	int ret;
> +
> +	ret = tps6586x_clr_bits(tps6586x_dev, TPS6586X_SUPPLYENE, EXITSLREQ_BIT);
> +	if (ret)
> +		return ret;
>  
> -	tps6586x_set_bits(tps6586x_dev, TPS6586X_SUPPLYENE, SLEEP_MODE_BIT);
> +	return tps6586x_set_bits(tps6586x_dev, TPS6586X_SUPPLYENE, SLEEP_MODE_BIT);

Handlers must return NOTIFY_DONE or notifier_from_errno(). Sorry for
missing this previously.

Benjamin Bara April 14, 2023, 6:15 a.m. UTC | #3
On Thu, 13 Apr 2023, 22:37 Dmitry Osipenko,
<dmitry.osipenko@collabora.com> wrote:
> Handlers must return NOTIFY_DONE or notifier_from_errno(). Sorry for
> missing this previously.

Thanks!

AFAIU, notifier_from_errno() sets NOTIFY_STOP_MASK, which stops
atomic_notifier_call_chain() immediately. So I think NOTIFY_DONE is the
only valid return value for sys_off handlers, to not skip others. So I
think letting sys_off_notify() [1] always return NOTIFY_DONE might be a
good idea.
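
To make the concern above concrete, here is a minimal userspace sketch of a notifier chain walk. The NOTIFY_* values and the two errno helpers mirror include/linux/notifier.h as far as I know them; the handlers, `call_chain_model()` and the counter are hypothetical names for illustration, and the walk is a simplification of what atomic_notifier_call_chain() does.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Values mirrored from include/linux/notifier.h. */
#define NOTIFY_DONE      0x0000
#define NOTIFY_OK        0x0001
#define NOTIFY_STOP_MASK 0x8000

/* Same arithmetic as the kernel helpers of the same names. */
static int notifier_from_errno(int err)
{
	if (err)
		return NOTIFY_STOP_MASK | (NOTIFY_OK - err);
	return NOTIFY_OK;
}

static int notifier_to_errno(int ret)
{
	ret &= ~NOTIFY_STOP_MASK;
	return ret > NOTIFY_OK ? NOTIFY_OK - ret : 0;
}

typedef int (*handler_fn)(void);

/* Hypothetical handlers: one skips, one fails, one counts its calls. */
int later_handler_calls;

int skipping_handler(void) { return NOTIFY_DONE; }
int failing_handler(void)  { return notifier_from_errno(-EIO); }
int later_handler(void)    { later_handler_calls++; return NOTIFY_DONE; }

/* Simplified chain walk: stop once a handler sets NOTIFY_STOP_MASK. */
int call_chain_model(const handler_fn *chain, size_t n)
{
	int ret = NOTIFY_DONE;

	for (size_t i = 0; i < n; i++) {
		ret = chain[i]();
		if (ret & NOTIFY_STOP_MASK)
			break;
	}
	return ret;
}

/* Usage: the failing handler keeps later_handler() from running. */
void chain_example(void)
{
	const handler_fn chain[] = {
		skipping_handler, failing_handler, later_handler,
	};

	later_handler_calls = 0;
	assert(notifier_to_errno(call_chain_model(chain, 3)) == -EIO);
	assert(later_handler_calls == 0);
}
```

In this model an errno returned through notifier_from_errno() ends the walk, while NOTIFY_DONE lets the next handler run.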

If so, we could return a "notify return errno" (or also a "normal
errno") from the handler, which is checked in [1] but then replaced with
NOTIFY_DONE. This would give us a common place to check for failed
handlers.

Handlers then should only return NOTIFY_DONE when they are skipped (e.g.
when the requested reboot mode is not supported by the handler).
Otherwise, I think ETIME, ENOSYS or ENOTSUPP might fit when the
communication was successful and a possible delay was awaited, but the
handler still reached its return. What do you think?

Thanks and best regards,
Benjamin

[1] https://elixir.bootlin.com/linux/v6.3-rc6/source/kernel/reboot.c#L327

Dmitry Osipenko April 24, 2023, 10:42 a.m. UTC | #4
On 4/14/23 09:15, Benjamin Bara wrote:
> On Thu, 13 Apr 2023, 22:37 Dmitry Osipenko,
> <dmitry.osipenko@collabora.com> wrote:
>> Handlers must return NOTIFY_DONE or notifier_from_errno(). Sorry for
>> missing this previously.
> 
> Thanks!
> 
> AFAIU, notifier_from_errno() sets NOTIFY_STOP_MASK, which stops
> atomic_notifier_call_chain() immediately. So I think NOTIFY_DONE is the
> only valid return value for sys_off handlers, to not skip others. So I
> think letting sys_off_notify() [1] always return NOTIFY_DONE might be a
> good idea.
> 
> If so, we could return a "notify return errno" (or also a "normal
> errno") from the handler, which is checked in [1] but then replaced with
> NOTIFY_DONE. This would give us a common place to check for failed
> handlers.
> 
> Handlers then should only return NOTIFY_DONE when they are skipped (e.g.
> when the requested reboot mode is not supported by the handler).
> Otherwise, I think ETIME, ENOSYS or ENOTSUPP might fit when the
> communication was successful and a possible delay was awaited, but the
> handler still reached its return. What do you think?

The behaviour may depend on a particular platform and driver. In general
and in case of this driver, it should be more reliable and cleaner to
abort the reboot on an error that shall never happen.

Benjamin Bara April 24, 2023, 12:07 p.m. UTC | #5
On Mon, 24 Apr 2023 at 12:42, Dmitry Osipenko <dmitry.osipenko@collabora.com> wrote:
> In general and in case of this driver, it should be more reliable and
> cleaner to abort the reboot on an error that shall never happen.

Thanks! Then I will drop my 4/6 of v5 [1].

[1] https://lore.kernel.org/all/20230327-tegra-pmic-reboot-v5-4-ab090e03284d@skidata.com/