mbox series

[0/3] i2c: mv64xxx: Support for I2C unstuck

Message ID 20230926234801.4078042-1-chris.packham@alliedtelesis.co.nz
Headers show
Series i2c: mv64xxx: Support for I2C unstuck | expand

Message

Chris Packham Sept. 26, 2023, 11:47 p.m. UTC
Some newer Marvell SoCs natively support a recovery mechanisim. This can be
used as an alternative to the generic pinctrl/GPIO recovery mechanism used on
the older SoCs.

Chris Packham (3):
  dt-bindings: i2c: mv64xxx: update bindings for unstuck register
  arm64: dts: marvell: AC5: use I2C unstuck function
  i2c: mv64xxx: add support for FSM based recovery

 .../bindings/i2c/marvell,mv64xxx-i2c.yaml     |  5 +-
 arch/arm64/boot/dts/marvell/ac5-98dx25xx.dtsi | 14 ++--
 drivers/i2c/busses/i2c-mv64xxx.c              | 71 +++++++++++++++++--
 3 files changed, 75 insertions(+), 15 deletions(-)

Comments

Andi Shyti Oct. 5, 2023, 9:58 p.m. UTC | #1
Hi Chris,

Looks good, just a few questions.

> +static int
> +mv64xxx_i2c_recover_bus(struct i2c_adapter *adap)
> +{
> +	struct mv64xxx_i2c_data *drv_data = i2c_get_adapdata(adap);
> +	int ret;
> +	u32 val;
> +
> +	dev_dbg(&adap->dev, "Trying i2c bus recovery\n");
> +	writel(MV64XXX_I2C_UNSTUCK_TRIGGER, drv_data->unstuck_reg);
> +	ret = readl_poll_timeout_atomic(drv_data->unstuck_reg, val,
> +					!(val & MV64XXX_I2C_UNSTUCK_INPROGRESS),
> +					1000, 5000);

here you are busy looping for 1ms between reads which is a long
time. Why not using read_poll_timeout() instead?

> +	if (ret) {
> +		dev_err(&adap->dev, "recovery timeout\n");
> +		return ret;
> +	}
> +
> +	if (val & MV64XXX_I2C_UNSTUCK_ERROR) {
> +		dev_err(&adap->dev, "recovery failed\n");
> +		return -EBUSY;
> +	}
> +
> +	dev_info(&adap->dev, "recovery complete after %d pulses\n", MV64XXX_I2C_UNSTUCK_COUNT(val));

dev_dbg?

> +	return 0;
> +}
> +

[...]

> -	if (of_device_is_compatible(np, "marvell,mv78230-a0-i2c")) {
> +	if (of_device_is_compatible(np, "marvell,mv78230-a0-i2c") ||
> +	    of_device_is_compatible(np, "marvell,armada-8k-i2c")) {

should this be part of a different patch?

>  		drv_data->offload_enabled = false;
>  		/* The delay is only needed in standard mode (100kHz) */
>  		if (bus_freq <= I2C_MAX_STANDARD_MODE_FREQ)
> @@ -936,8 +973,21 @@ mv64xxx_of_config(struct mv64xxx_i2c_data *drv_data,
>  }
>  #endif /* CONFIG_OF */
>  
> -static int mv64xxx_i2c_init_recovery_info(struct mv64xxx_i2c_data *drv_data,
> -					  struct device *dev)
> +static int mv64xxx_i2c_init_fsm_recovery_info(struct mv64xxx_i2c_data *drv_data,
> +					      struct device *dev)
> +{
> +	struct i2c_bus_recovery_info *rinfo = &drv_data->rinfo;
> +
> +	dev_info(dev, "using FSM for recovery\n");

dev_dbg?

> +	rinfo->recover_bus = mv64xxx_i2c_recover_bus;
> +	drv_data->adapter.bus_recovery_info = rinfo;
> +
> +	return 0;
> +
> +}
> +

[...]

> +	/* optional unstuck support */
> +	res = platform_get_resource(pd, IORESOURCE_MEM, 1);
> +	if (res) {
> +		drv_data->unstuck_reg = devm_ioremap_resource(&pd->dev, res);
> +		if (IS_ERR(drv_data->unstuck_reg))
> +			return PTR_ERR(drv_data->unstuck_reg);

OK, we failed to ioremap... but instead of returning an error,
wouldn't it be better to just set unstuck_reg to NULL and move
forward without unstuck support?

Maybe you will stil crash later because something might have
happened, but failing on purpose on an optional feature looks a
bit too drastic to me. What do you think?

Thanks,
Andi
Chris Packham Oct. 5, 2023, 10:39 p.m. UTC | #2
On 6/10/23 10:58, Andi Shyti wrote:
> Hi Chris,
>
> Looks good, just a few questions.
>
>> +static int
>> +mv64xxx_i2c_recover_bus(struct i2c_adapter *adap)
>> +{
>> +	struct mv64xxx_i2c_data *drv_data = i2c_get_adapdata(adap);
>> +	int ret;
>> +	u32 val;
>> +
>> +	dev_dbg(&adap->dev, "Trying i2c bus recovery\n");
>> +	writel(MV64XXX_I2C_UNSTUCK_TRIGGER, drv_data->unstuck_reg);
>> +	ret = readl_poll_timeout_atomic(drv_data->unstuck_reg, val,
>> +					!(val & MV64XXX_I2C_UNSTUCK_INPROGRESS),
>> +					1000, 5000);
> here you are busy looping for 1ms between reads which is a long
> time. Why not using read_poll_timeout() instead?

I needed to use the atomic variant because this ends up getting called 
from an interrupt handler (mv64xxx_i2c_intr() -> mv64xxx_i2c_fsm()). I 
probably don't need to wait so long between reads those times were just 
pulled out of thin air. In my experimentation the faults that can be 
cleared do so within a couple of clocks, if it hasn't cleared within 8 
clocks it's not going to.

>> +	if (ret) {
>> +		dev_err(&adap->dev, "recovery timeout\n");
>> +		return ret;
>> +	}
>> +
>> +	if (val & MV64XXX_I2C_UNSTUCK_ERROR) {
>> +		dev_err(&adap->dev, "recovery failed\n");
>> +		return -EBUSY;
>> +	}
>> +
>> +	dev_info(&adap->dev, "recovery complete after %d pulses\n", MV64XXX_I2C_UNSTUCK_COUNT(val));
> dev_dbg?
ack.
>> +	return 0;
>> +}
>> +
> [...]
>
>> -	if (of_device_is_compatible(np, "marvell,mv78230-a0-i2c")) {
>> +	if (of_device_is_compatible(np, "marvell,mv78230-a0-i2c") ||
>> +	    of_device_is_compatible(np, "marvell,armada-8k-i2c")) {
> should this be part of a different patch?

Yes sorry. Originally I was going to use a new compatible to indicate 
the unstuck support but went with the 2nd reg cell so this is unnecessary.

>
>>   		drv_data->offload_enabled = false;
>>   		/* The delay is only needed in standard mode (100kHz) */
>>   		if (bus_freq <= I2C_MAX_STANDARD_MODE_FREQ)
>> @@ -936,8 +973,21 @@ mv64xxx_of_config(struct mv64xxx_i2c_data *drv_data,
>>   }
>>   #endif /* CONFIG_OF */
>>   
>> -static int mv64xxx_i2c_init_recovery_info(struct mv64xxx_i2c_data *drv_data,
>> -					  struct device *dev)
>> +static int mv64xxx_i2c_init_fsm_recovery_info(struct mv64xxx_i2c_data *drv_data,
>> +					      struct device *dev)
>> +{
>> +	struct i2c_bus_recovery_info *rinfo = &drv_data->rinfo;
>> +
>> +	dev_info(dev, "using FSM for recovery\n");
> dev_dbg?
>
>> +	rinfo->recover_bus = mv64xxx_i2c_recover_bus;
>> +	drv_data->adapter.bus_recovery_info = rinfo;
>> +
>> +	return 0;
>> +
>> +}
>> +
> [...]
>
>> +	/* optional unstuck support */
>> +	res = platform_get_resource(pd, IORESOURCE_MEM, 1);
>> +	if (res) {
>> +		drv_data->unstuck_reg = devm_ioremap_resource(&pd->dev, res);
>> +		if (IS_ERR(drv_data->unstuck_reg))
>> +			return PTR_ERR(drv_data->unstuck_reg);
> OK, we failed to ioremap... but instead of returning an error,
> wouldn't it be better to just set unstuck_reg to NULL and move
> forward without unstuck support?
>
> Maybe you will stil crash later because something might have
> happened, but failing on purpose on an optional feature looks a
> bit too drastic to me. What do you think?

Personally I think if the reg property is supplied in the dts we'd 
better be able to use it. If the feature is not wanted then the way to 
indicate this is by supplying only one reg cell.

I'd be happy with a dev_warn() and unstuck_reg = NULL if that helps get 
this landed.

>
> Thanks,
> Andi
Andi Shyti Oct. 5, 2023, 11:07 p.m. UTC | #3
Hi Chris,

> >> +static int
> >> +mv64xxx_i2c_recover_bus(struct i2c_adapter *adap)
> >> +{
> >> +	struct mv64xxx_i2c_data *drv_data = i2c_get_adapdata(adap);
> >> +	int ret;
> >> +	u32 val;
> >> +
> >> +	dev_dbg(&adap->dev, "Trying i2c bus recovery\n");
> >> +	writel(MV64XXX_I2C_UNSTUCK_TRIGGER, drv_data->unstuck_reg);
> >> +	ret = readl_poll_timeout_atomic(drv_data->unstuck_reg, val,
> >> +					!(val & MV64XXX_I2C_UNSTUCK_INPROGRESS),
> >> +					1000, 5000);
> > here you are busy looping for 1ms between reads which is a long
> > time. Why not using read_poll_timeout() instead?
> 
> I needed to use the atomic variant because this ends up getting called 
> from an interrupt handler (mv64xxx_i2c_intr() -> mv64xxx_i2c_fsm()). I 
> probably don't need to wait so long between reads those times were just 
> pulled out of thin air. In my experimentation the faults that can be 
> cleared do so within a couple of clocks, if it hasn't cleared within 8 
> clocks it's not going to.

It's still a long time to wait in atomic context...
readl_poll_timeout_atomic() waits in udelays, where the maximum
accepted waiting time is 10us. Here you are waiting 100 times
more.

If we can't be within that value I would rather use a thread.

Or, you could also consider using threaded_irq()... but this
might have a bit of a higher impact.

[...]

> >> +	/* optional unstuck support */
> >> +	res = platform_get_resource(pd, IORESOURCE_MEM, 1);
> >> +	if (res) {
> >> +		drv_data->unstuck_reg = devm_ioremap_resource(&pd->dev, res);
> >> +		if (IS_ERR(drv_data->unstuck_reg))
> >> +			return PTR_ERR(drv_data->unstuck_reg);
> > OK, we failed to ioremap... but instead of returning an error,
> > wouldn't it be better to just set unstuck_reg to NULL and move
> > forward without unstuck support?
> >
> > Maybe you will stil crash later because something might have
> > happened, but failing on purpose on an optional feature looks a
> > bit too drastic to me. What do you think?
> 
> Personally I think if the reg property is supplied in the dts we'd 
> better be able to use it. If the feature is not wanted then the way to 
> indicate this is by supplying only one reg cell.
> 
> I'd be happy with a dev_warn() and unstuck_reg = NULL if that helps get 
> this landed.

Don't ahve a strong opinion... as you like. Mine is just an
opinion and your argument is valid :-)

Andi