Message ID | 20230328223740.69446-1-dennis@kernel.org |
---|---|
State | New |
Headers | show |
Series | mmc: allow mmc to block wait_for_device_probe() | expand |
On Tue, Mar 28, 2023 at 03:37:40PM -0700, Dennis Zhou wrote: > I've been hitting a failed data device lookup when using dm-verity and a > root device on an emmc partition. This is because there is a race where > dm-verity is looking for a data device, but the partitions on the emmc > device haven't been probed yet. > > Initially I looked at solving this by changing devt_from_devname() to > look for partitions, but it seems there is legacy reasons and issues due > to dm. > > MMC uses 2 levels of probing. The first to handle initializing the > host and the second to iterate attached devices. The second is done by > a workqueue item. However, this paradigm makes wait_for_device_probe() > useless as a barrier for when we can assume attached devices have been > probed. > > This patch fixes this by exposing 2 methods inc/dec_probe_count() to > allow device drivers that do asynchronous probing to delay waiters on > wait_for_device_probe() so that when they are released, they can assume > attached devices have been probed. Please no. For 2 reasons: - the api names you picked here do not make much sense from a global namespace standpoint. Always try to do "noun/verb" as well, so if we really wanted to do this it would be "driver_probe_incrememt()" or something like that. - drivers and subsystems should not be messing around with the probe count as it's a hack in the first place to get around other issues. Please let's not make it worse and make a formal api for it and allow anyone to mess with it. Why can't you just use normal deferred probing for this? thanks, greg k-h
On Wed, Mar 29, 2023 at 06:54:11AM +0200, Greg Kroah-Hartman wrote: > On Tue, Mar 28, 2023 at 03:37:40PM -0700, Dennis Zhou wrote: > > I've been hitting a failed data device lookup when using dm-verity and a > > root device on an emmc partition. This is because there is a race where > > dm-verity is looking for a data device, but the partitions on the emmc > > device haven't been probed yet. > > > > Initially I looked at solving this by changing devt_from_devname() to > > look for partitions, but it seems there is legacy reasons and issues due > > to dm. > > > > MMC uses 2 levels of probing. The first to handle initializing the > > host and the second to iterate attached devices. The second is done by > > a workqueue item. However, this paradigm makes wait_for_device_probe() > > useless as a barrier for when we can assume attached devices have been > > probed. > > > > This patch fixes this by exposing 2 methods inc/dec_probe_count() to > > allow device drivers that do asynchronous probing to delay waiters on > > wait_for_device_probe() so that when they are released, they can assume > > attached devices have been probed. > Thanks for the quick reply. > Please no. For 2 reasons: > - the api names you picked here do not make much sense from a global > namespace standpoint. Always try to do "noun/verb" as well, so if > we really wanted to do this it would be "driver_probe_incrememt()" > or something like that. Yeah that is a bit of a blunder on my part... > - drivers and subsystems should not be messing around with the probe > count as it's a hack in the first place to get around other issues. > Please let's not make it worse and make a formal api for it and allow > anyone to mess with it. > That's fair. > Why can't you just use normal deferred probing for this? > I'm not familiar with why mmc is written the way it is, but probing creates a notion of the host whereas the devices attached are probed later via a work item. Examining it a bit closer, inlining the first discovery call avoids all of this mess. I sent that out just now in [1]. Hopefully that'll be fine. [1] https://lore.kernel.org/lkml/20230329202148.71107-1-dennis@kernel.org/T/#u > thanks, > > greg k-h Thanks, Dennis
diff --git a/drivers/base/dd.c b/drivers/base/dd.c index 95ae347df137..c0117476e1d6 100644 --- a/drivers/base/dd.c +++ b/drivers/base/dd.c @@ -494,6 +494,19 @@ EXPORT_SYMBOL_GPL(device_bind_driver); static atomic_t probe_count = ATOMIC_INIT(0); static DECLARE_WAIT_QUEUE_HEAD(probe_waitqueue); +void inc_probe_count(void) +{ + atomic_inc(&probe_count); +} +EXPORT_SYMBOL_GPL(inc_probe_count); + +void dec_probe_count(void) +{ + if (atomic_dec_return(&probe_count) == 0) + wake_up_all(&probe_waitqueue); +} +EXPORT_SYMBOL_GPL(dec_probe_count); + static ssize_t state_synced_show(struct device *dev, struct device_attribute *attr, char *buf) { @@ -793,8 +806,8 @@ static int driver_probe_device(struct device_driver *drv, struct device *dev) !defer_all_probes) driver_deferred_probe_trigger(); } - atomic_dec(&probe_count); - wake_up_all(&probe_waitqueue); + if (atomic_dec_return(&probe_count) == 0) + wake_up_all(&probe_waitqueue); return ret; } diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c index 368f10405e13..92690984dac2 100644 --- a/drivers/mmc/core/core.c +++ b/drivers/mmc/core/core.c @@ -2192,11 +2192,11 @@ void mmc_rescan(struct work_struct *work) int i; if (host->rescan_disable) - return; + goto out_probe; /* If there is a non-removable card registered, only scan once */ if (!mmc_card_is_removable(host) && host->rescan_entered) - return; + goto out_probe; host->rescan_entered = 1; if (host->trigger_card_event && host->ops->card_event) { @@ -2247,6 +2247,13 @@ void mmc_rescan(struct work_struct *work) out: if (host->caps & MMC_CAP_NEEDS_POLL) mmc_schedule_delayed_work(&host->detect, HZ); + +out_probe: + if (host->start_probe) { + /* matches inc_probe_count() in mmc_start_host() */ + dec_probe_count(); + host->start_probe = 0; + } } void mmc_start_host(struct mmc_host *host) @@ -2261,6 +2268,15 @@ void mmc_start_host(struct mmc_host *host) } mmc_gpiod_request_cd_irq(host); + + /* + * MMC uses 2 levels of probing. The first to handle initializing the + * host and the second to iterate attached devices. However, this + * paradigm breaks wait_for_device_probe(). Fix this here by + * incrementing the probe_count and decrementing after the scan. + */ + host->start_probe = 1; + inc_probe_count(); _mmc_detect_change(host, 0, false); } @@ -2273,6 +2289,11 @@ void __mmc_stop_host(struct mmc_host *host) host->rescan_disable = 1; cancel_delayed_work_sync(&host->detect); + /* start_probe is protected by the cancel_delayed_work_sync() */ + if (host->start_probe) { + dec_probe_count(); + host->start_probe = 0; + } } void mmc_stop_host(struct mmc_host *host) diff --git a/include/linux/device.h b/include/linux/device.h index e270cb740b9e..d09bdc33d1cf 100644 --- a/include/linux/device.h +++ b/include/linux/device.h @@ -891,6 +891,13 @@ int __must_check device_reprobe(struct device *dev); bool device_is_bound(struct device *dev); +/* + * Functions that inc/dec probe_count to allow device drivers that finish + * probing asynchronously to delay wait_for_device_probe() appropriately. + */ +void inc_probe_count(void); +void dec_probe_count(void); + /* * Easy functions for dynamically creating devices on the fly */ diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h index 0c0c9a0fdf57..ea7b9158f052 100644 --- a/include/linux/mmc/host.h +++ b/include/linux/mmc/host.h @@ -428,6 +428,7 @@ struct mmc_host { unsigned int retune_paused:1; /* re-tuning is temporarily disabled */ unsigned int retune_crc_disable:1; /* don't trigger retune upon crc */ unsigned int can_dma_map_merge:1; /* merging can be used */ + unsigned int start_probe:1; /* if this is our first scan */ int rescan_disable; /* disable card detection */ int rescan_entered; /* used with nonremovable devices */
I've been hitting a failed data device lookup when using dm-verity and a root device on an emmc partition. This is because there is a race where dm-verity is looking for a data device, but the partitions on the emmc device haven't been probed yet. Initially I looked at solving this by changing devt_from_devname() to look for partitions, but it seems there is legacy reasons and issues due to dm. MMC uses 2 levels of probing. The first to handle initializing the host and the second to iterate attached devices. The second is done by a workqueue item. However, this paradigm makes wait_for_device_probe() useless as a barrier for when we can assume attached devices have been probed. This patch fixes this by exposing 2 methods inc/dec_probe_count() to allow device drivers that do asynchronous probing to delay waiters on wait_for_device_probe() so that when they are released, they can assume attached devices have been probed. Signed-off-by: Dennis Zhou <dennis@kernel.org> --- drivers/base/dd.c | 17 +++++++++++++++-- drivers/mmc/core/core.c | 25 +++++++++++++++++++++++-- include/linux/device.h | 7 +++++++ include/linux/mmc/host.h | 1 + 4 files changed, 46 insertions(+), 4 deletions(-)