Message ID | 20171115123823.1895515-1-arnd@arndb.de |
---|---|
State | New |
Headers | show |
Series | [RFT] drm: adv7511/33: fix adv7511_cec_init() failure handling | expand |
On Thu, Nov 16, 2017 at 1:50 PM, John Stultz <john.stultz@linaro.org> wrote: > On Wed, Nov 15, 2017 at 4:37 AM, Arnd Bergmann <arnd@arndb.de> wrote: >> An otherwise correct cleanup patch from Dan Carpenter turned a broken >> failure handling from a feature patch by Hans Verkuil into a kernel >> Oops, so bisection points to commit 7af35b0addbc ("drm/kirin: Checking >> for IS_ERR() instead of NULL") rather than 3b1b975003e4 ("drm: >> adv7511/33: add HDMI CEC support"). >> >> I've managed to piece together several partial problems, though >> I'm still struggling with the bigger picture: >> >> adv7511_probe() registers a drm_bridge structure that was allocated >> with devm_kzalloc(). It calls adv7511_cec_init(), which fails for an >> unknown reason, which in turn triggers the registered structure to be >> removed. >> >> Elsewhere, kirin_drm_platform_probe() gets called, which calls >> of_graph_get_remote_node(), and that returns NULL. Before Dan's >> patch we would go on with a NULL pointer here and register that, >> now kirin_drm_platform_probe() fails with -ENODEV. >> >> In a third driver, dsi_parse_dt() calls drm_of_find_panel_or_bridge(), >> which after not finding a panel goes on to call of_drm_find_bridge(), >> and that crashes due to the earlier list corruption. >> >> This addresses the first issue by making sure that adv7511_probe() >> does not leave behind any corrupted list entries. This should >> get the system back to boot but needs testing. From my understanding, >> there is at least one more bug that needs to be resolved to actually >> get everything to work again. > > So I've started hitting the issue this patch tries to address (now > that the related code landed in Linus' tree). The only issue is that > with this fix, I don't see graphics initializing properly, so I > suspect something is still wrong with the error handling (though what > exactly I'm not sure). So this seems to only happen when CONFIG_DRM_I2C_ADV7511_CEC is enabled. If it is on, I don't get any graphics, but if its disabled graphics works. Tying this with Arnd's patch, I'm guessing adv7511_cec_init() is failing, but it seems like instead of just disabling the CEC feature, we're failing to load the driver entirely. Maybe should the logic be something like: #ifdef CONFIG_DRM_I2C_ADV7511_CEC ret = adv7511_cec_init(dev, adv7511, offset); if (ret) #endif regmap_write(adv7511->regmap, ADV7511_REG_CEC_CTRL + offset, ADV7511_CEC_CTRL_POWER_DOWN); ? thanks -john
On Thu, Nov 16, 2017 at 2:20 PM, John Stultz <john.stultz@linaro.org> wrote: > On Thu, Nov 16, 2017 at 1:50 PM, John Stultz <john.stultz@linaro.org> wrote: >> On Wed, Nov 15, 2017 at 4:37 AM, Arnd Bergmann <arnd@arndb.de> wrote: >>> An otherwise correct cleanup patch from Dan Carpenter turned a broken >>> failure handling from a feature patch by Hans Verkuil into a kernel >>> Oops, so bisection points to commit 7af35b0addbc ("drm/kirin: Checking >>> for IS_ERR() instead of NULL") rather than 3b1b975003e4 ("drm: >>> adv7511/33: add HDMI CEC support"). >>> >>> I've managed to piece together several partial problems, though >>> I'm still struggling with the bigger picture: >>> >>> adv7511_probe() registers a drm_bridge structure that was allocated >>> with devm_kzalloc(). It calls adv7511_cec_init(), which fails for an >>> unknown reason, which in turn triggers the registered structure to be >>> removed. >>> >>> Elsewhere, kirin_drm_platform_probe() gets called, which calls >>> of_graph_get_remote_node(), and that returns NULL. Before Dan's >>> patch we would go on with a NULL pointer here and register that, >>> now kirin_drm_platform_probe() fails with -ENODEV. >>> >>> In a third driver, dsi_parse_dt() calls drm_of_find_panel_or_bridge(), >>> which after not finding a panel goes on to call of_drm_find_bridge(), >>> and that crashes due to the earlier list corruption. >>> >>> This addresses the first issue by making sure that adv7511_probe() >>> does not leave behind any corrupted list entries. This should >>> get the system back to boot but needs testing. From my understanding, >>> there is at least one more bug that needs to be resolved to actually >>> get everything to work again. >> >> So I've started hitting the issue this patch tries to address (now >> that the related code landed in Linus' tree). The only issue is that >> with this fix, I don't see graphics initializing properly, so I >> suspect something is still wrong with the error handling (though what >> exactly I'm not sure). > > So this seems to only happen when CONFIG_DRM_I2C_ADV7511_CEC is > enabled. If it is on, I don't get any graphics, but if its disabled > graphics works. > > Tying this with Arnd's patch, I'm guessing adv7511_cec_init() is > failing, but it seems like instead of just disabling the CEC feature, > we're failing to load the driver entirely. > > Maybe should the logic be something like: > #ifdef CONFIG_DRM_I2C_ADV7511_CEC > ret = adv7511_cec_init(dev, adv7511, offset); > if (ret) > #endif > regmap_write(adv7511->regmap, ADV7511_REG_CEC_CTRL + offset, > ADV7511_CEC_CTRL_POWER_DOWN); > > ? I just tested with this, and this approach seems to work for me. thanks -john
On Fri, Nov 17, 2017 at 12:43 AM, Hans Verkuil <hverkuil@xs4all.nl> wrote: > diff --git a/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c b/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c > index 3a33075dbb22..56eeeea6a1fa 100644 > --- a/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c > +++ b/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c > @@ -1202,9 +1202,7 @@ static int adv7511_probe(struct i2c_client *i2c, const struct i2c_device_id *id) > offset = adv7511->type == ADV7533 ? ADV7533_REG_CEC_OFFSET : 0; > > #ifdef CONFIG_DRM_I2C_ADV7511_CEC > - ret = adv7511_cec_init(dev, adv7511, offset); > - if (ret) > - goto err_unregister_cec; > + adv7511_cec_init(dev, adv7511, offset); > #else > regmap_write(adv7511->regmap, ADV7511_REG_CEC_CTRL + offset, > ADV7511_CEC_CTRL_POWER_DOWN); One tiny nit-pick I realized I should have made in my patch... In the !CONFIG_DRM_I2C_ADV7511_CEC, can you just define adv7511_cec_init() as { regmap_write(adv7511->regmap, ADV7511_REG_CEC_CTRL + offset, ADV7511_CEC_CTRL_POWER_DOWN); } Then we can call it either way, and don't need to have the ufly #ifdefs in the adv7511_probe function? thanks -john
On 11/23/2017 05:52 AM, John Stultz wrote: > On Tue, Nov 21, 2017 at 12:17 AM, Hans Verkuil <hverkuil@xs4all.nl> wrote: >> If the device tree for a board did not specify a cec clock, then >> adv7511_cec_init would return an error, which would cause adv7511_probe() >> to fail and thus there is no HDMI output. >> >> There is no need to have adv7511_probe() fail if the CEC initialization >> fails, so just change adv7511_cec_init() to a void function. In addition, >> adv7511_cec_init() should just return silently if the cec clock isn't >> found and show a message for any other errors. >> >> An otherwise correct cleanup patch from Dan Carpenter turned this broken >> failure handling into a kernel Oops, so bisection points to commit >> 7af35b0addbc ("drm/kirin: Checking for IS_ERR() instead of NULL") rather >> than 3b1b975003e4 ("drm: adv7511/33: add HDMI CEC support"). >> >> Based on earlier patches from Arnd and John. >> >> Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> >> Cc: Xinliang Liu <xinliang.liu@linaro.org> >> Cc: Dan Carpenter <dan.carpenter@oracle.com> >> Cc: Sean Paul <seanpaul@chromium.org> >> Cc: Archit Taneja <architt@codeaurora.org> >> Cc: John Stultz <john.stultz@linaro.org> >> Link: https://bugs.linaro.org/show_bug.cgi?id=3345 >> Link: https://lkft.validation.linaro.org/scheduler/job/48017#L3551 >> Fixes: 7af35b0addbc ("drm/kirin: Checking for IS_ERR() instead of NULL") >> Fixes: 3b1b975003e4 ("drm: adv7511/33: add HDMI CEC support") >> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> >> Tested-by: Hans Verkuil <hans.verkuil@cisco.com> >> --- >> This rework of Arnd and John's patches goes a bit further and just silently >> exits if there is no cec clock defined in the dts. I'm sure that's the >> reason why the kirin board failed on this. BTW: if the kirin board DOES >> support cec, then it would be nice if it can be hooked up in the dts! >> >> Tested with my Dragonboard and Renesas Koelsch board. Also tested what >> happens when probing is deferred due to missing cec clock. >> >> John, can you test this again? > > Sorry I didn't get back to you yesterday on this! > > Seems to be working ok for me! > > Tested-by: John Stultz <john.stultz@linaro.org> Queued to drm-misc-fixes. Thanks for fixing this. Archit
diff --git a/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c b/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c index 0e14f1572d05..93d1ecafe8fa 100644 --- a/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c +++ b/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c @@ -1204,7 +1204,7 @@ static int adv7511_probe(struct i2c_client *i2c, const struct i2c_device_id *id) #ifdef CONFIG_DRM_I2C_ADV7511_CEC ret = adv7511_cec_init(dev, adv7511, offset); if (ret) - goto err_unregister_cec; + goto err_unregister_bridge; #else regmap_write(adv7511->regmap, ADV7511_REG_CEC_CTRL + offset, ADV7511_CEC_CTRL_POWER_DOWN); @@ -1212,6 +1212,11 @@ static int adv7511_probe(struct i2c_client *i2c, const struct i2c_device_id *id) return 0; +#ifdef CONFIG_DRM_I2C_ADV7511_CEC +err_unregister_bridge: + adv7511_audio_exit(adv7511); + drm_bridge_remove(&adv7511->bridge); +#endif err_unregister_cec: i2c_unregister_device(adv7511->i2c_cec); if (adv7511->cec_clk)