Message ID | 20200413100321.v4.9.I07c1f70e0e8f2dc0004bd38970b4e258acdc773e@changeid |
---|---|
State | New |
Headers | show |
Series | None | expand |
Quoting Douglas Anderson (2020-04-13 10:04:14) > Auditing tcs_invalidate() made me worried. Specifically I saw that it > used spin_lock(), not spin_lock_irqsave(). That always worries me > unless I can trace for sure that I'm in the interrupt handler or that > someone else already disabled interrupts. > > Looking more at it, there is actually no reason for these locks > anyway. Specifically the only reason you'd ever call > rpmh_rsc_invalidate() is if you cared that the sleep/wake TCSes were > empty. That means that they need to continue to be empty even after > rpmh_rsc_invalidate() returns. The only way that can happen is if the > caller already has done something to keep all other RPMH users out. > It should be noted that even though the caller is only worried about > making sleep/wake TCSes empty, they also need to worry about stopping > active-only transfers if they need to handle the case where > active-only transfers might borrow the wake TCS. > > At the moment rpmh_rsc_invalidate() is only called in PM code from the > last CPU. If that later changes the caller will still need to solve > the above problems themselves, so these locks will never be useful. > > Continuing to audit tcs_invalidate(), I found a bug. The function > didn't properly check for a borrowed TCS if we hadn't recently written > anything into the TCS. Specifically, if we've never written to the > WAKE_TCS (or we've flushed it recently) then tcs->slots is empty. > We'll early-out and we'll never call tcs_is_free(). > > I thought about fixing this bug by either deleting the early check for > bitmap_empty() or possibly only doing it if we knew we weren't on a > TCS that could be borrowed. However, I think it's better to just > delete the checks. > > As argued above it's up to the caller to make sure that all other > users of RPMH are quiet before tcs_invalidate() is called. Since > callers need to handle the zero-active-TCS case anyway that means they > need to make sure that the active-only transfers are quiet before > calling too. The one way tcs_invalidate() gets called today is > through rpmh_rsc_cpu_pm_callback() which calls > rpmh_rsc_ctrlr_is_busy() to handle this. When we have another path to > get to tcs_invalidate() it will also need to come up with something > similar and it won't need this extra check either. If we later find > some code path that actually needs this check back in (and somehow > manages to be race free) we can always add it back in. > > Signed-off-by: Douglas Anderson <dianders@chromium.org> > Reviewed-by: Maulik Shah <mkshah@codeaurora.org> > Tested-by: Maulik Shah <mkshah@codeaurora.org> > --- Reviewed-by: Stephen Boyd <swboyd@chromium.org>
diff --git a/drivers/soc/qcom/rpmh-internal.h b/drivers/soc/qcom/rpmh-internal.h index f06350cbc9a2..dba8510c0669 100644 --- a/drivers/soc/qcom/rpmh-internal.h +++ b/drivers/soc/qcom/rpmh-internal.h @@ -132,7 +132,7 @@ struct rsc_drv { int rpmh_rsc_send_data(struct rsc_drv *drv, const struct tcs_request *msg); int rpmh_rsc_write_ctrl_data(struct rsc_drv *drv, const struct tcs_request *msg); -int rpmh_rsc_invalidate(struct rsc_drv *drv); +void rpmh_rsc_invalidate(struct rsc_drv *drv); void rpmh_tx_done(const struct tcs_request *msg, int r); int rpmh_flush(struct rpmh_ctrlr *ctrlr); diff --git a/drivers/soc/qcom/rpmh-rsc.c b/drivers/soc/qcom/rpmh-rsc.c index 4f2a72fdac2c..9bd0c7c3db7c 100644 --- a/drivers/soc/qcom/rpmh-rsc.c +++ b/drivers/soc/qcom/rpmh-rsc.c @@ -198,50 +198,38 @@ static bool tcs_is_free(struct rsc_drv *drv, int tcs_id) * This will clear the "slots" variable of the given tcs_group and also * tell the hardware to forget about all entries. * - * Return: 0 if no problem, or -EAGAIN if the caller should try again in a - * bit. Caller should make sure to enable interrupts between tries. + * The caller must ensure that no other RPMH actions are happening when this + * function is called, since otherwise the device may immediately become + * used again even before this function exits. */ -static int tcs_invalidate(struct rsc_drv *drv, int type) +static void tcs_invalidate(struct rsc_drv *drv, int type) { int m; struct tcs_group *tcs = &drv->tcs[type]; - spin_lock(&tcs->lock); - if (bitmap_empty(tcs->slots, MAX_TCS_SLOTS)) { - spin_unlock(&tcs->lock); - return 0; - } + /* Caller ensures nobody else is running so no lock */ + if (bitmap_empty(tcs->slots, MAX_TCS_SLOTS)) + return; for (m = tcs->offset; m < tcs->offset + tcs->num_tcs; m++) { - if (!tcs_is_free(drv, m)) { - spin_unlock(&tcs->lock); - return -EAGAIN; - } write_tcs_reg_sync(drv, RSC_DRV_CMD_ENABLE, m, 0); write_tcs_reg_sync(drv, RSC_DRV_CMD_WAIT_FOR_CMPL, m, 0); } bitmap_zero(tcs->slots, MAX_TCS_SLOTS); - spin_unlock(&tcs->lock); - - return 0; } /** * rpmh_rsc_invalidate() - Invalidate sleep and wake TCSes. * @drv: The RSC controller. * - * Return: 0 if no problem, or -EAGAIN if the caller should try again in a - * bit. Caller should make sure to enable interrupts between tries. + * The caller must ensure that no other RPMH actions are happening when this + * function is called, since otherwise the device may immediately become + * used again even before this function exits. */ -int rpmh_rsc_invalidate(struct rsc_drv *drv) +void rpmh_rsc_invalidate(struct rsc_drv *drv) { - int ret; - - ret = tcs_invalidate(drv, SLEEP_TCS); - if (!ret) - ret = tcs_invalidate(drv, WAKE_TCS); - - return ret; + tcs_invalidate(drv, SLEEP_TCS); + tcs_invalidate(drv, WAKE_TCS); } /** diff --git a/drivers/soc/qcom/rpmh.c b/drivers/soc/qcom/rpmh.c index e6a453759abd..aab6cc6bba4f 100644 --- a/drivers/soc/qcom/rpmh.c +++ b/drivers/soc/qcom/rpmh.c @@ -440,7 +440,6 @@ static int send_single(struct rpmh_ctrlr *ctrlr, enum rpmh_state state, * * Return: * * 0 - Success - * * -EAGAIN - Retry again * * Error code - Otherwise */ int rpmh_flush(struct rpmh_ctrlr *ctrlr) @@ -456,9 +455,7 @@ int rpmh_flush(struct rpmh_ctrlr *ctrlr) } /* Invalidate the TCSes first to avoid stale data */ - ret = rpmh_rsc_invalidate(ctrlr_to_drv(ctrlr)); - if (ret) - return ret; + rpmh_rsc_invalidate(ctrlr_to_drv(ctrlr)); /* First flush the cached batch requests */ ret = flush_batch(ctrlr);