[1/1] i2c: designware: Ensure runtime suspend is invoked during rapid slave unregistration and registration

Message ID 20250412023303.378600-1-ende.tan@starfivetech.com
State New

Commit Message

EnDe Tan April 12, 2025, 2:33 a.m. UTC
From: Tan En De <ende.tan@starfivetech.com>

Replaced pm_runtime_put() with pm_runtime_put_sync_suspend() to ensure
the runtime suspend is invoked immediately when unregistering a slave.
This prevents a race condition where suspend was skipped when
unregistering and registering slave in quick succession.

Signed-off-by: Tan En De <ende.tan@starfivetech.com>
---
 drivers/i2c/busses/i2c-designware-slave.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

EnDe Tan April 20, 2025, 3:31 a.m. UTC | #1
It appears that when performing a rapid sequence of `delete_device -> new_device -> delete_device -> new_device`, the `dw_i2c_plat_runtime_suspend` is not invoked for the second `delete_device`.

This seems to happen because when `i2c_dw_unreg_slave` is about to trigger suspend during the second `delete_device`, the second `new_device` operation cancels the suspend. As a result, `dw_i2c_plat_runtime_resume` is not called (since there was no suspend), which means `i_dev->init` (i.e., `i2c_dw_init_slave`) is skipped.

Because `i2c_dw_init_slave` is skipped, `i2c_dw_configure_fifo_slave` is not invoked, which leaves `DW_IC_INTR_MASK` unconfigured.
If we inspect the interrupt mask register using devmem, it will show as zero.

Here's an example shell script to reproduce the issue:
```
#!/bin/sh

SLAVE_LADDR=0x1010
SLAVE_BUS=13
NEW_DEVICE=/sys/bus/i2c/devices/i2c-$SLAVE_BUS/new_device
DELETE_DEVICE=/sys/bus/i2c/devices/i2c-$SLAVE_BUS/delete_device

# Create initial device
echo slave-24c02 $SLAVE_LADDR > $NEW_DEVICE
sleep 2

# Rapid sequence of delete_device -> new_device -> delete_device -> new_device
echo $SLAVE_LADDR > $DELETE_DEVICE
echo slave-24c02 $SLAVE_LADDR > $NEW_DEVICE
echo $SLAVE_LADDR > $DELETE_DEVICE
echo slave-24c02 $SLAVE_LADDR > $NEW_DEVICE

# If we use devmem to inspect IC_INTR_MASK, it will show as zero
```
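The script above leaves the devmem check as a comment. A minimal sketch of that check follows, assuming a BusyBox-style `devmem ADDRESS WIDTH` tool. The helper names `intr_mask_addr` and `read_intr_mask` are hypothetical, and the controller base address is not given in this thread, so it has to be supplied by the caller (for example from the platform's device tree or `/proc/iomem`). The 0x30 offset is the `DW_IC_INTR_MASK` offset in the DesignWare I2C register map.

```sh
#!/bin/sh
# Offset of DW_IC_INTR_MASK within the DesignWare I2C register block.
IC_INTR_MASK_OFFSET=0x30

# Hypothetical helper: print the physical address of IC_INTR_MASK,
# given the controller's base address as $1.
intr_mask_addr() {
    printf '0x%08x\n' $(( $1 + $IC_INTR_MASK_OFFSET ))
}

# Hypothetical helper: read the 32-bit register through /dev/mem.
# Assumes a BusyBox-style devmem and root privileges.
read_intr_mask() {
    devmem "$(intr_mask_addr "$1")" 32
}
```

After running the reproduction sequence, calling `read_intr_mask` with the base address of the bus under test should report zero if the race was hit.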
Jarkko Nikula May 2, 2025, 11:03 a.m. UTC | #2
Hi

Sorry for the delay. Comment below.

On 4/20/25 6:31 AM, EnDe Tan wrote:
> It appears that when performing a rapid sequence of `delete_device -> new_device -> delete_device -> new_device`, the `dw_i2c_plat_runtime_suspend` is not invoked for the second `delete_device`.
> 
> This seems to happen because when `i2c_dw_unreg_slave` is about to trigger suspend during the second `delete_device`, the second `new_device` operation cancels the suspend. As a result, `dw_i2c_plat_runtime_resume` is not called (since there was no suspend), which means `i_dev->init` (i.e., `i2c_dw_init_slave`) is skipped.
> 
> Because `i2c_dw_init_slave` is skipped, `i2c_dw_configure_fifo_slave` is not invoked, which leaves `DW_IC_INTR_MASK` unconfigured.
> If we inspect the interrupt mask register using devmem, it will show as zero.
> 
> Here's an example shell script to reproduce the issue:
> ```
> #!/bin/sh
> 
> SLAVE_LADDR=0x1010
> SLAVE_BUS=13
> NEW_DEVICE=/sys/bus/i2c/devices/i2c-$SLAVE_BUS/new_device
> DELETE_DEVICE=/sys/bus/i2c/devices/i2c-$SLAVE_BUS/delete_device
> 
> # Create initial device
> echo slave-24c02 $SLAVE_LADDR > $NEW_DEVICE
> sleep 2
> 
> # Rapid sequence of delete_device -> new_device -> delete_device -> new_device
> echo $SLAVE_LADDR > $DELETE_DEVICE
> echo slave-24c02 $SLAVE_LADDR > $NEW_DEVICE
> echo $SLAVE_LADDR > $DELETE_DEVICE
> echo slave-24c02 $SLAVE_LADDR > $NEW_DEVICE
> 
> # If we use devmem to inspect IC_INTR_MASK, it will show as zero
> ```
> 
Good explanation and could you add it to the commit log together with the
example?
Andi Shyti May 5, 2025, 9:48 p.m. UTC | #3
Hi EnDe,

On Fri, May 02, 2025 at 02:03:25PM +0300, Jarkko Nikula wrote:
> On 4/20/25 6:31 AM, EnDe Tan wrote:
> > It appears that when performing a rapid sequence of `delete_device -> new_device -> delete_device -> new_device`, the `dw_i2c_plat_runtime_suspend` is not invoked for the second `delete_device`.
> > 
> > This seems to happen because when `i2c_dw_unreg_slave` is about to trigger suspend during the second `delete_device`, the second `new_device` operation cancels the suspend. As a result, `dw_i2c_plat_runtime_resume` is not called (since there was no suspend), which means `i_dev->init` (i.e., `i2c_dw_init_slave`) is skipped.
> > 
> > Because `i2c_dw_init_slave` is skipped, `i2c_dw_configure_fifo_slave` is not invoked, which leaves `DW_IC_INTR_MASK` unconfigured.
> > If we inspect the interrupt mask register using devmem, it will show as zero.
> > 
> > Here's an example shell script to reproduce the issue:
> > ```
> > #!/bin/sh
> > 
> > SLAVE_LADDR=0x1010
> > SLAVE_BUS=13
> > NEW_DEVICE=/sys/bus/i2c/devices/i2c-$SLAVE_BUS/new_device
> > DELETE_DEVICE=/sys/bus/i2c/devices/i2c-$SLAVE_BUS/delete_device
> > 
> > # Create initial device
> > echo slave-24c02 $SLAVE_LADDR > $NEW_DEVICE
> > sleep 2
> > 
> > # Rapid sequence of delete_device -> new_device -> delete_device -> new_device
> > echo $SLAVE_LADDR > $DELETE_DEVICE
> > echo slave-24c02 $SLAVE_LADDR > $NEW_DEVICE
> > echo $SLAVE_LADDR > $DELETE_DEVICE
> > echo slave-24c02 $SLAVE_LADDR > $NEW_DEVICE
> > 
> > # If we use devmem to inspect IC_INTR_MASK, it will show as zero
> > ```

Please, don't remove the interesting parts of the original email
from your reply, otherwise it wouldn't be easy to follow the
discussion. Please refer to the email netiquette[*].

> Good explanation and could you add it to the commit log together with the
> example?

If you want you can paste the new commit log as reply to this
e-mail.

Thanks,
Andi

[*] https://www.ietf.org/rfc/rfc1855.txt
EnDe Tan May 8, 2025, 8:30 a.m. UTC | #4
Hi Andi and Jarkko, thank you for the feedback.

> -----Original Message-----
> From: Andi Shyti <andi.shyti@kernel.org>
> Sent: Tuesday, 6 May, 2025 5:49 AM
> To: Jarkko Nikula <jarkko.nikula@linux.intel.com>
> Cc: EnDe Tan <ende.tan@starfivetech.com>; linux-i2c@vger.kernel.org;
...
> > Good explanation and could you add it to the commit log together with the
> > example?
> 
> If you want you can paste the new commit log as reply to this e-mail.

Here is the new commit log, feel free to let me know if further changes are required:

Replaced pm_runtime_put() with pm_runtime_put_sync_suspend() to ensure
the runtime suspend is invoked immediately when unregistering a slave.
This prevents a race condition where suspend was skipped when
unregistering and registering slave in quick succession.

For example, consider the rapid sequence of
`delete_device -> new_device -> delete_device -> new_device`.
In this sequence, it is observed that the dw_i2c_plat_runtime_suspend() might
not be invoked after `delete_device` operation.

This is because after `delete_device` operation, when the
pm_runtime_put() is about to trigger suspend, the following `new_device`
operation might race and cancel the suspend.

If that happens, during the `new_device` operation,
dw_i2c_plat_runtime_resume() is skipped (since there was no suspend), which
means `i_dev->init()`, i.e. i2c_dw_init_slave(), is skipped.
Since i2c_dw_init_slave() is skipped, i2c_dw_configure_fifo_slave() is
skipped too, which leaves `DW_IC_INTR_MASK` unconfigured. If we inspect
the interrupt mask register using devmem, it will show as zero.

Example shell script to reproduce the issue:
```
  #!/bin/sh

  SLAVE_LADDR=0x1010
  SLAVE_BUS=13
  NEW_DEVICE=/sys/bus/i2c/devices/i2c-$SLAVE_BUS/new_device
  DELETE_DEVICE=/sys/bus/i2c/devices/i2c-$SLAVE_BUS/delete_device

  # Create initial device
  echo slave-24c02 $SLAVE_LADDR > $NEW_DEVICE
  sleep 2

  # Rapid sequence of
  # delete_device -> new_device -> delete_device -> new_device
  echo $SLAVE_LADDR > $DELETE_DEVICE
  echo slave-24c02 $SLAVE_LADDR > $NEW_DEVICE
  echo $SLAVE_LADDR > $DELETE_DEVICE
  echo slave-24c02 $SLAVE_LADDR > $NEW_DEVICE

  # Using devmem to inspect IC_INTR_MASK will show as zero
```
Jarkko Nikula May 8, 2025, 11:55 a.m. UTC | #5
On 5/8/25 11:30 AM, EnDe Tan wrote:
> Hi Andi and Jarkko, thank you for the feedback.
> 
>> -----Original Message-----
>> From: Andi Shyti <andi.shyti@kernel.org>
>> Sent: Tuesday, 6 May, 2025 5:49 AM
>> To: Jarkko Nikula <jarkko.nikula@linux.intel.com>
>> Cc: EnDe Tan <ende.tan@starfivetech.com>; linux-i2c@vger.kernel.org;
> ...
>>> Good explanation and could you add it to the commit log together with the
>>> example?
>>
>> If you want you can paste the new commit log as reply to this e-mail.
> 
> Here is the new commit log, feel free to let me know if further changes are required:
> 
> Replaced pm_runtime_put() with pm_runtime_put_sync_suspend() to ensure
> the runtime suspend is invoked immediately when unregistering a slave.
> This prevents a race condition where suspend was skipped when
> unregistering and registering slave in quick succession.
> 
> For example, consider the rapid sequence of
> `delete_device -> new_device -> delete_device -> new_device`.
> In this sequence, it is observed that the dw_i2c_plat_runtime_suspend() might
> not be invoked after `delete_device` operation.
> 
> This is because after `delete_device` operation, when the
> pm_runtime_put() is about to trigger suspend, the following `new_device`
> operation might race and cancel the suspend.
> 
> If that happens, during the `new_device` operation,
> dw_i2c_plat_runtime_resume() is skipped (since there was no suspend), which
> means `i_dev->init()`, i.e. i2c_dw_init_slave(), is skipped.
> Since i2c_dw_init_slave() is skipped, i2c_dw_configure_fifo_slave() is
> skipped too, which leaves `DW_IC_INTR_MASK` unconfigured. If we inspect
> the interrupt mask register using devmem, it will show as zero.
> 
> Example shell script to reproduce the issue:
> ```
>    #!/bin/sh
> 
>    SLAVE_LADDR=0x1010
>    SLAVE_BUS=13
>    NEW_DEVICE=/sys/bus/i2c/devices/i2c-$SLAVE_BUS/new_device
>    DELETE_DEVICE=/sys/bus/i2c/devices/i2c-$SLAVE_BUS/delete_device
> 
>    # Create initial device
>    echo slave-24c02 $SLAVE_LADDR > $NEW_DEVICE
>    sleep 2
> 
>    # Rapid sequence of
>    # delete_device -> new_device -> delete_device -> new_device
>    echo $SLAVE_LADDR > $DELETE_DEVICE
>    echo slave-24c02 $SLAVE_LADDR > $NEW_DEVICE
>    echo $SLAVE_LADDR > $DELETE_DEVICE
>    echo slave-24c02 $SLAVE_LADDR > $NEW_DEVICE
> 
>    # Using devmem to inspect IC_INTR_MASK will show as zero
> ```
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Andi Shyti May 12, 2025, 11:29 p.m. UTC | #6
Hi EnDe,

On Thu, May 08, 2025 at 08:30:27AM +0000, EnDe Tan wrote:
> Hi Andi and Jarkko, thank you for the feedback.
> 
> > -----Original Message-----
> > From: Andi Shyti <andi.shyti@kernel.org>
> > Sent: Tuesday, 6 May, 2025 5:49 AM
> > To: Jarkko Nikula <jarkko.nikula@linux.intel.com>
> > Cc: EnDe Tan <ende.tan@starfivetech.com>; linux-i2c@vger.kernel.org;
> ...
> > > Good explanation and could you add it to the commit log together with the
> > > example?
> > 
> > If you want you can paste the new commit log as reply to this e-mail.
> 
> Here is the new commit log, feel free to let me know if further changes are required:
> 
> Replaced pm_runtime_put() with pm_runtime_put_sync_suspend() to ensure
> the runtime suspend is invoked immediately when unregistering a slave.
> This prevents a race condition where suspend was skipped when
> unregistering and registering slave in quick succession.
> 
> For example, consider the rapid sequence of
> `delete_device -> new_device -> delete_device -> new_device`.
> In this sequence, it is observed that the dw_i2c_plat_runtime_suspend() might
> not be invoked after `delete_device` operation.
> 
> This is because after `delete_device` operation, when the
> pm_runtime_put() is about to trigger suspend, the following `new_device`
> operation might race and cancel the suspend.
> 
> If that happens, during the `new_device` operation,
> dw_i2c_plat_runtime_resume() is skipped (since there was no suspend), which
> means `i_dev->init()`, i.e. i2c_dw_init_slave(), is skipped.
> Since i2c_dw_init_slave() is skipped, i2c_dw_configure_fifo_slave() is
> skipped too, which leaves `DW_IC_INTR_MASK` unconfigured. If we inspect
> the interrupt mask register using devmem, it will show as zero.
> 
> Example shell script to reproduce the issue:
> ```
>   #!/bin/sh
> 
>   SLAVE_LADDR=0x1010
>   SLAVE_BUS=13
>   NEW_DEVICE=/sys/bus/i2c/devices/i2c-$SLAVE_BUS/new_device
>   DELETE_DEVICE=/sys/bus/i2c/devices/i2c-$SLAVE_BUS/delete_device
> 
>   # Create initial device
>   echo slave-24c02 $SLAVE_LADDR > $NEW_DEVICE
>   sleep 2
> 
>   # Rapid sequence of
>   # delete_device -> new_device -> delete_device -> new_device
>   echo $SLAVE_LADDR > $DELETE_DEVICE
>   echo slave-24c02 $SLAVE_LADDR > $NEW_DEVICE
>   echo $SLAVE_LADDR > $DELETE_DEVICE
>   echo slave-24c02 $SLAVE_LADDR > $NEW_DEVICE
> 
>   # Using devmem to inspect IC_INTR_MASK will show as zero
> ```

Thanks, merged to i2c/i2c-host.

I just reworded the title to:

i2c: designware: Invoke runtime suspend on quick slave re-registration

to keep it under 75 characters.

Thanks,
Andi

Patch

diff --git a/drivers/i2c/busses/i2c-designware-slave.c b/drivers/i2c/busses/i2c-designware-slave.c
index 5cd4a5f7a472..b936a240db0a 100644
--- a/drivers/i2c/busses/i2c-designware-slave.c
+++ b/drivers/i2c/busses/i2c-designware-slave.c
@@ -96,7 +96,7 @@  static int i2c_dw_unreg_slave(struct i2c_client *slave)
 	i2c_dw_disable(dev);
 	synchronize_irq(dev->irq);
 	dev->slave = NULL;
-	pm_runtime_put(dev->dev);
+	pm_runtime_put_sync_suspend(dev->dev);
 
 	return 0;
 }
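
The effect of the one-line change can be illustrated with a toy model of the race described in the commit log. This is a sketch in plain shell, not kernel code: every function and variable name is made up for illustration, the runtime PM behaviour is heavily simplified, and it assumes (as the thread reports) that the interrupt mask is zero after unregistration and is only reprogrammed by the resume path.

```sh
#!/bin/sh
# Toy model of the runtime-PM race; all names are illustrative.
suspended=0        # 1 once the modelled runtime suspend has run
suspend_pending=0  # 1 while an asynchronous suspend request is queued
intr_mask=0        # stands in for DW_IC_INTR_MASK (0 = unconfigured)

# Models pm_runtime_put(): only queues the suspend request.
put_async() { suspend_pending=1; }

# Models pm_runtime_put_sync_suspend(): suspends before returning.
put_sync_suspend() { suspend_pending=0; suspended=1; }

# Models the deferred work that carries out a queued suspend.
run_pending() {
    if [ "$suspend_pending" -eq 1 ]; then
        suspend_pending=0
        suspended=1
    fi
}

# Models the get in the register path: cancels a queued suspend, and
# runs the resume path (which reprograms the mask) only if the device
# had actually suspended.
reg_slave() {
    suspend_pending=0
    if [ "$suspended" -eq 1 ]; then
        suspended=0
        intr_mask=1
    fi
}

# Models the unregister path: mask ends up zero, then the put variant
# named in $1 is called.
unreg_slave() {
    intr_mask=0
    "$1"
}

# Run one delete -> new pair; $1 = put variant, $2 = quick|slow.
scenario() {
    suspended=0; suspend_pending=0; intr_mask=1
    unreg_slave "$1"
    if [ "$2" = "slow" ]; then run_pending; fi
    reg_slave
    echo "$intr_mask"
}

echo "put_async, quick re-register:        mask=$(scenario put_async quick)"
echo "put_async, slow re-register:         mask=$(scenario put_async slow)"
echo "put_sync_suspend, quick re-register: mask=$(scenario put_sync_suspend quick)"
```

In this model only the first case leaves the mask at zero: the queued suspend is cancelled before it runs, so the resume path never reprograms the mask. With the synchronous variant the suspend has already completed by the time the slave is registered again, so the resume path always runs.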