Message ID | 20170209153403.9730-7-linus.walleij@linaro.org
---|---
State | New
Series | multiqueue for MMC/SD third try
On 09/02/17 17:33, Linus Walleij wrote:
> The waitqueue in the host context is there to signal back from
> mmc_request_done() through mmc_wait_data_done() that the hardware
> is done with a command, and when the wait is over, the core
> will typically submit the next asynchronous request that is pending
> just waiting for the hardware to be available.
>
> This is in the way for letting the mmc_request_done() trigger the
> report up to the block layer that a block request is finished.
>
> Re-jig this as a first step, remvoving the waitqueue and introducing
> a work that will run after a completed asynchronous request,
> finalizing that request, including retransmissions, and eventually
> reporting back with a completion and a status code to the
> asynchronous issue method.
>
> This had the upside that we can remove the MMC_BLK_NEW_REQUEST
> status code and the "new_request" state in the request queue
> that is only there to make the state machine spin out
> the first time we send a request.
>
> Introduce a workqueue in the host for handling just this, and
> then a work and completion in the asynchronous request to deal
> with this mechanism.
>
> This is a central change that let us do many other changes since
> we have broken the submit and complete code paths in two, and we
> can potentially remove the NULL flushing of the asynchronous
> pipeline and report block requests as finished directly from
> the worker.

This needs more thought.  The completion should go straight to the mmc
block driver from the ->done() callback.  And from there straight back to
the block layer if recovery is not needed.  We want to stop using
mmc_start_areq() altogether because we never want to wait - we always want
to issue (if possible) and return.  The core API to use is
__mmc_start_req() but the block driver should populate mrq->done with its
own handler. i.e.
change __mmc_start_req():

-	mrq->done = mmc_wait_done;
+	if (!mrq->done)
+		mrq->done = mmc_wait_done;

mrq->done() would complete the request (e.g. via blk_complete_request())
if it has no errors (and doesn't need polling), and wake up the queue
thread to finish up everything else and start the next request.

For the blk-mq port, the queue thread should also be retained, partly
because it solves some synchronization problems, but mostly because, at
this stage, we anyway don't have solutions for all the different ways the
driver can block.
(as listed here https://marc.info/?l=linux-mmc&m=148336571720463&w=2 )

--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thursday, February 09, 2017 04:33:53 PM Linus Walleij wrote: > The waitqueue in the host context is there to signal back from > mmc_request_done() through mmc_wait_data_done() that the hardware > is done with a command, and when the wait is over, the core > will typically submit the next asynchronous request that is pending > just waiting for the hardware to be available. > > This is in the way for letting the mmc_request_done() trigger the > report up to the block layer that a block request is finished. > > Re-jig this as a first step, remvoving the waitqueue and introducing > a work that will run after a completed asynchronous request, > finalizing that request, including retransmissions, and eventually > reporting back with a completion and a status code to the > asynchronous issue method. > > This had the upside that we can remove the MMC_BLK_NEW_REQUEST > status code and the "new_request" state in the request queue > that is only there to make the state machine spin out > the first time we send a request. > > Introduce a workqueue in the host for handling just this, and > then a work and completion in the asynchronous request to deal > with this mechanism. > > This is a central change that let us do many other changes since > we have broken the submit and complete code paths in two, and we > can potentially remove the NULL flushing of the asynchronous > pipeline and report block requests as finished directly from > the worker. > > Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Best regards, -- Bartlomiej Zolnierkiewicz Samsung R&D Institute Poland Samsung Electronics -- To unsubscribe from this list: send the line "unsubscribe linux-mmc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Feb 22, 2017 at 2:29 PM, Adrian Hunter <adrian.hunter@intel.com> wrote: > On 09/02/17 17:33, Linus Walleij wrote: >> The waitqueue in the host context is there to signal back from >> mmc_request_done() through mmc_wait_data_done() that the hardware >> is done with a command, and when the wait is over, the core >> will typically submit the next asynchronous request that is pending >> just waiting for the hardware to be available. >> >> This is in the way for letting the mmc_request_done() trigger the >> report up to the block layer that a block request is finished. >> >> Re-jig this as a first step, remvoving the waitqueue and introducing >> a work that will run after a completed asynchronous request, >> finalizing that request, including retransmissions, and eventually >> reporting back with a completion and a status code to the >> asynchronous issue method. >> >> This had the upside that we can remove the MMC_BLK_NEW_REQUEST >> status code and the "new_request" state in the request queue >> that is only there to make the state machine spin out >> the first time we send a request. >> >> Introduce a workqueue in the host for handling just this, and >> then a work and completion in the asynchronous request to deal >> with this mechanism. >> >> This is a central change that let us do many other changes since >> we have broken the submit and complete code paths in two, and we >> can potentially remove the NULL flushing of the asynchronous >> pipeline and report block requests as finished directly from >> the worker. > > This needs more thought. The completion should go straight to the mmc block > driver from the ->done() callback. And from there straight back to the > block layer if recovery is not needed. We want to stop using > mmc_start_areq() altogether because we never want to wait - we always want > to issue (if possible) and return. I don't quite follow this. 
Isn't what you request exactly what patch 15/16 "mmc: queue: issue requests in massive parallel" is doing? The whole patch series leads up to that. > The core API to use is __mmc_start_req() but the block driver should > populate mrq->done with its own handler. i.e. change __mmc_start_req() > > - mrq->done = mmc_wait_done; > + if (!mrq->done) > + mrq->done = mmc_wait_done; > > mrq->done() would complete the request (e.g. via blk_complete_request()) if > it has no errors (and doesn't need polling), and wake up the queue thread to > finish up everything else and start the next request. I think this is what it does at the end of the patch series, patch 15/16. I have to split it somehow... > For the blk-mq port, the queue thread should also be retained, partly > because it solves some synchronization problems, but mostly because, at this > stage, we anyway don't have solutions for all the different ways the driver > can block. > (as listed here https://marc.info/?l=linux-mmc&m=148336571720463&w=2 ) Essentially I take out that thread and replace it with this one worker introduced in this very patch. I agree the driver can block in many ways and that is why I need to have it running in process context, and this is what the worker introduced here provides. Maybe I'm getting something backwards, sorry then :/ Yours, Linus Walleij -- To unsubscribe from this list: send the line "unsubscribe linux-mmc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 10/03/17 00:49, Linus Walleij wrote: > On Wed, Feb 22, 2017 at 2:29 PM, Adrian Hunter <adrian.hunter@intel.com> wrote: >> On 09/02/17 17:33, Linus Walleij wrote: >>> The waitqueue in the host context is there to signal back from >>> mmc_request_done() through mmc_wait_data_done() that the hardware >>> is done with a command, and when the wait is over, the core >>> will typically submit the next asynchronous request that is pending >>> just waiting for the hardware to be available. >>> >>> This is in the way for letting the mmc_request_done() trigger the >>> report up to the block layer that a block request is finished. >>> >>> Re-jig this as a first step, remvoving the waitqueue and introducing >>> a work that will run after a completed asynchronous request, >>> finalizing that request, including retransmissions, and eventually >>> reporting back with a completion and a status code to the >>> asynchronous issue method. >>> >>> This had the upside that we can remove the MMC_BLK_NEW_REQUEST >>> status code and the "new_request" state in the request queue >>> that is only there to make the state machine spin out >>> the first time we send a request. >>> >>> Introduce a workqueue in the host for handling just this, and >>> then a work and completion in the asynchronous request to deal >>> with this mechanism. >>> >>> This is a central change that let us do many other changes since >>> we have broken the submit and complete code paths in two, and we >>> can potentially remove the NULL flushing of the asynchronous >>> pipeline and report block requests as finished directly from >>> the worker. >> >> This needs more thought. The completion should go straight to the mmc block >> driver from the ->done() callback. And from there straight back to the >> block layer if recovery is not needed. We want to stop using >> mmc_start_areq() altogether because we never want to wait - we always want >> to issue (if possible) and return. > > I don't quite follow this. 
Isn't what you request exactly what > patch 15/16 "mmc: queue: issue requests in massive parallel" > is doing? There is the latency for the worker that runs mmc_finalize_areq() and then another latency to wake up the worker that is running mmc_start_areq(). That is 2 wake-ups instead of 1. As a side note, ideally we would be able to issue the next request from the interrupt or soft interrupt context of the completion (i.e. 0 wake-ups between requests), but we would probably have to look at the host API to support that. > > The whole patch series leads up to that. > >> The core API to use is __mmc_start_req() but the block driver should >> populate mrq->done with its own handler. i.e. change __mmc_start_req() >> >> - mrq->done = mmc_wait_done; >> + if (!mrq->done) >> + mrq->done = mmc_wait_done; >> >> mrq->done() would complete the request (e.g. via blk_complete_request()) if >> it has no errors (and doesn't need polling), and wake up the queue thread to >> finish up everything else and start the next request. > > I think this is what it does at the end of the patch series, patch 15/16. > I have to split it somehow... > >> For the blk-mq port, the queue thread should also be retained, partly >> because it solves some synchronization problems, but mostly because, at this >> stage, we anyway don't have solutions for all the different ways the driver >> can block. >> (as listed here https://marc.info/?l=linux-mmc&m=148336571720463&w=2 ) > > Essentially I take out that thread and replace it with this one worker > introduced in this very patch. I agree the driver can block in many ways > and that is why I need to have it running in process context, and this > is what the worker introduced here provides. The last time I looked at the blk-mq I/O scheduler code, it pulled up to qdepth requests from the I/O scheduler and left them on a local list while running ->queue_rq(). 
That means blocking in ->queue_rq() leaves some number of requests in limbo (not issued but also not in the I/O scheduler) for that time. Maybe blk-mq should offer a pull interface to I/O scheduler users? -- To unsubscribe from this list: send the line "unsubscribe linux-mmc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 03/10/2017 07:21 AM, Adrian Hunter wrote:
>> Essentially I take out that thread and replace it with this one worker
>> introduced in this very patch. I agree the driver can block in many ways
>> and that is why I need to have it running in process context, and this
>> is what the worker introduced here provides.
>
> The last time I looked at the blk-mq I/O scheduler code, it pulled up to
> qdepth requests from the I/O scheduler and left them on a local list while
> running ->queue_rq().  That means blocking in ->queue_rq() leaves some
> number of requests in limbo (not issued but also not in the I/O scheduler)
> for that time.

Look again, if we're not handling the requeued dispatches, we pull one
at the time from the scheduler.

--
Jens Axboe
On 11/03/17 00:05, Jens Axboe wrote: > On 03/10/2017 07:21 AM, Adrian Hunter wrote: >>> Essentially I take out that thread and replace it with this one worker >>> introduced in this very patch. I agree the driver can block in many ways >>> and that is why I need to have it running in process context, and this >>> is what the worker introduced here provides. >> >> The last time I looked at the blk-mq I/O scheduler code, it pulled up to >> qdepth requests from the I/O scheduler and left them on a local list while >> running ->queue_rq(). That means blocking in ->queue_rq() leaves some >> number of requests in limbo (not issued but also not in the I/O scheduler) >> for that time. > > Look again, if we're not handling the requeued dispatches, we pull one > at the time from the scheduler. > That's good :-) Now the next thing ;-) It looks like we either set BLK_MQ_F_BLOCKING and miss the possibility of issuing synchronous requests immediately, or we don't set BLK_MQ_F_BLOCKING in which case we are never allowed to sleep in ->queue_rq(). Is that true? -- To unsubscribe from this list: send the line "unsubscribe linux-mmc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 03/13/2017 03:25 AM, Adrian Hunter wrote: > On 11/03/17 00:05, Jens Axboe wrote: >> On 03/10/2017 07:21 AM, Adrian Hunter wrote: >>>> Essentially I take out that thread and replace it with this one worker >>>> introduced in this very patch. I agree the driver can block in many ways >>>> and that is why I need to have it running in process context, and this >>>> is what the worker introduced here provides. >>> >>> The last time I looked at the blk-mq I/O scheduler code, it pulled up to >>> qdepth requests from the I/O scheduler and left them on a local list while >>> running ->queue_rq(). That means blocking in ->queue_rq() leaves some >>> number of requests in limbo (not issued but also not in the I/O scheduler) >>> for that time. >> >> Look again, if we're not handling the requeued dispatches, we pull one >> at the time from the scheduler. >> > > That's good :-) > > Now the next thing ;-) > > It looks like we either set BLK_MQ_F_BLOCKING and miss the possibility of > issuing synchronous requests immediately, or we don't set BLK_MQ_F_BLOCKING > in which case we are never allowed to sleep in ->queue_rq(). Is that true? Only one of those statements is true - if you don't set BLK_MQ_F_BLOCKING, then you may never block in your ->queue_rq() function. But if you do set it, it does not preclude immediate issue of sync requests. -- Jens Axboe -- To unsubscribe from this list: send the line "unsubscribe linux-mmc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 13/03/17 16:19, Jens Axboe wrote: > On 03/13/2017 03:25 AM, Adrian Hunter wrote: >> On 11/03/17 00:05, Jens Axboe wrote: >>> On 03/10/2017 07:21 AM, Adrian Hunter wrote: >>>>> Essentially I take out that thread and replace it with this one worker >>>>> introduced in this very patch. I agree the driver can block in many ways >>>>> and that is why I need to have it running in process context, and this >>>>> is what the worker introduced here provides. >>>> >>>> The last time I looked at the blk-mq I/O scheduler code, it pulled up to >>>> qdepth requests from the I/O scheduler and left them on a local list while >>>> running ->queue_rq(). That means blocking in ->queue_rq() leaves some >>>> number of requests in limbo (not issued but also not in the I/O scheduler) >>>> for that time. >>> >>> Look again, if we're not handling the requeued dispatches, we pull one >>> at the time from the scheduler. >>> >> >> That's good :-) >> >> Now the next thing ;-) >> >> It looks like we either set BLK_MQ_F_BLOCKING and miss the possibility of >> issuing synchronous requests immediately, or we don't set BLK_MQ_F_BLOCKING >> in which case we are never allowed to sleep in ->queue_rq(). Is that true? > > Only one of those statements is true - if you don't set BLK_MQ_F_BLOCKING, > then you may never block in your ->queue_rq() function. But if you do set it, > it does not preclude immediate issue of sync requests. I meant it gets put to the workqueue rather than issued in the context of the submitter. -- To unsubscribe from this list: send the line "unsubscribe linux-mmc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 03/14/2017 06:59 AM, Adrian Hunter wrote: > On 13/03/17 16:19, Jens Axboe wrote: >> On 03/13/2017 03:25 AM, Adrian Hunter wrote: >>> On 11/03/17 00:05, Jens Axboe wrote: >>>> On 03/10/2017 07:21 AM, Adrian Hunter wrote: >>>>>> Essentially I take out that thread and replace it with this one worker >>>>>> introduced in this very patch. I agree the driver can block in many ways >>>>>> and that is why I need to have it running in process context, and this >>>>>> is what the worker introduced here provides. >>>>> >>>>> The last time I looked at the blk-mq I/O scheduler code, it pulled up to >>>>> qdepth requests from the I/O scheduler and left them on a local list while >>>>> running ->queue_rq(). That means blocking in ->queue_rq() leaves some >>>>> number of requests in limbo (not issued but also not in the I/O scheduler) >>>>> for that time. >>>> >>>> Look again, if we're not handling the requeued dispatches, we pull one >>>> at the time from the scheduler. >>>> >>> >>> That's good :-) >>> >>> Now the next thing ;-) >>> >>> It looks like we either set BLK_MQ_F_BLOCKING and miss the possibility of >>> issuing synchronous requests immediately, or we don't set BLK_MQ_F_BLOCKING >>> in which case we are never allowed to sleep in ->queue_rq(). Is that true? >> >> Only one of those statements is true - if you don't set BLK_MQ_F_BLOCKING, >> then you may never block in your ->queue_rq() function. But if you do set it, >> it does not preclude immediate issue of sync requests. > > I meant it gets put to the workqueue rather than issued in the context of > the submitter. There's one case that doesn't look like it was converted properly, but that's a mistake. The general insert-and-run cases run inline if we can, but the direct-issue needs a fixup, see below. 
--
Jens Axboe

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 159187a28d66..4196d6bee92d 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1434,7 +1434,8 @@ static blk_qc_t request_to_qc_t(struct blk_mq_hw_ctx *hctx, struct request *rq)
 	return blk_tag_to_qc_t(rq->internal_tag, hctx->queue_num, true);
 }
 
-static void blk_mq_try_issue_directly(struct request *rq, blk_qc_t *cookie)
+static void blk_mq_try_issue_directly(struct request *rq, blk_qc_t *cookie,
+				      bool can_block)
 {
 	struct request_queue *q = rq->q;
 	struct blk_mq_queue_data bd = {
@@ -1475,7 +1476,7 @@ static void blk_mq_try_issue_directly(struct request *rq, blk_qc_t *cookie)
 	}
 
 insert:
-	blk_mq_sched_insert_request(rq, false, true, true, false);
+	blk_mq_sched_insert_request(rq, false, true, false, can_block);
 }
 
 /*
@@ -1569,11 +1570,11 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
 
 		if (!(data.hctx->flags & BLK_MQ_F_BLOCKING)) {
 			rcu_read_lock();
-			blk_mq_try_issue_directly(old_rq, &cookie);
+			blk_mq_try_issue_directly(old_rq, &cookie, false);
 			rcu_read_unlock();
 		} else {
 			srcu_idx = srcu_read_lock(&data.hctx->queue_rq_srcu);
-			blk_mq_try_issue_directly(old_rq, &cookie);
+			blk_mq_try_issue_directly(old_rq, &cookie, true);
 			srcu_read_unlock(&data.hctx->queue_rq_srcu, srcu_idx);
 		}
 		goto done;
On Tue, Mar 14, 2017 at 08:36:26AM -0600, Jens Axboe wrote:
> There's one case that doesn't look like it was converted properly, but
> that's a mistake. The general insert-and-run cases run inline if we can,
> but the direct-issue needs a fixup, see below.

Note that blk_mq_try_issue_directly is only called for the multiple
hardware queue case, so MMC would not hit it.
On 03/14/2017 08:43 AM, Christoph Hellwig wrote:
> On Tue, Mar 14, 2017 at 08:36:26AM -0600, Jens Axboe wrote:
>> There's one case that doesn't look like it was converted properly, but
>> that's a mistake. The general insert-and-run cases run inline if we can,
>> but the direct-issue needs a fixup, see below.
>
> Note that blk_mq_try_issue_directly is only called for the multiple
> hardware queue case, so MMC would not hit it.

Right, which is why I said that the general case works fine, it's only
the explicit issue-direct that currently does not.  Not by design, just
an error.

--
Jens Axboe
On Fri, Mar 10, 2017 at 3:21 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:
> On 10/03/17 00:49, Linus Walleij wrote:
>> On Wed, Feb 22, 2017 at 2:29 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:
>>> On 09/02/17 17:33, Linus Walleij wrote:
>>>> This is a central change that let us do many other changes since
>>>> we have broken the submit and complete code paths in two, and we
>>>> can potentially remove the NULL flushing of the asynchronous
>>>> pipeline and report block requests as finished directly from
>>>> the worker.
>>>
>>> This needs more thought.  The completion should go straight to the mmc block
>>> driver from the ->done() callback.  And from there straight back to the
>>> block layer if recovery is not needed.  We want to stop using
>>> mmc_start_areq() altogether because we never want to wait - we always want
>>> to issue (if possible) and return.
>>
>> I don't quite follow this.  Isn't what you request exactly what
>> patch 15/16 "mmc: queue: issue requests in massive parallel"
>> is doing?
>
> There is the latency for the worker that runs mmc_finalize_areq() and then
> another latency to wake up the worker that is running mmc_start_areq().
> That is 2 wake-ups instead of 1.

That is correct (though the measured effect is small). However when we
switch to MQ it must happen like this due to its asynchronous nature of
issuing requests to us. Then we have MQ's submission thread coming in
from one end and our worker to manage retries and errors on the other
side. We obviously cannot do the retries and resends in MQ's context as
it blocks subsequent requests.

> As a side note, ideally we would be able to issue the next request from the
> interrupt or soft interrupt context of the completion (i.e. 0 wake-ups
> between requests), but we would probably have to look at the host API to
> support that.

I looked at that and couldn't find a good way to get to that point.
Mainly because of mmc_start_bkops() that may arbitrarily fire after every
command and start new requests to the card, and that of course requires a
process context to happen.

Then there is the retune thing that I do not fully understand how it
schedules, but it might be fine since I'm under the impression that it
is done at the start of the next request if need be.

Maybe both can be overcome by quick checks in IRQ context and then this
can be done. (I'm not smart enough to see that yet, sorry.)

However since we activate the blocking context in MQ I don't know if it
can even deal with having requests completed in interrupt context so that
the next thing that happens after completing the request and returning
from the interrupt is that the block layer thread gets scheduled (unless
something more important is going on), I guess it is possible? It looks
like it could be really efficient.

>>> For the blk-mq port, the queue thread should also be retained, partly
>>> because it solves some synchronization problems, but mostly because, at this
>>> stage, we anyway don't have solutions for all the different ways the driver
>>> can block.
>>> (as listed here https://marc.info/?l=linux-mmc&m=148336571720463&w=2 )
>>
>> Essentially I take out that thread and replace it with this one worker
>> introduced in this very patch. I agree the driver can block in many ways
>> and that is why I need to have it running in process context, and this
>> is what the worker introduced here provides.
>
> The last time I looked at the blk-mq I/O scheduler code, it pulled up to
> qdepth requests from the I/O scheduler and left them on a local list while
> running ->queue_rq().  That means blocking in ->queue_rq() leaves some
> number of requests in limbo (not issued but also not in the I/O scheduler)
> for that time.

I think Jens provided a patch for this bug (don't see the patch upstream
though).

Yours,
Linus Walleij
On Tue, Mar 14, 2017 at 3:36 PM, Jens Axboe <axboe@kernel.dk> wrote:
> There's one case that doesn't look like it was converted properly, but
> that's a mistake. The general insert-and-run cases run inline if we can,
> but the direct-issue needs a fixup, see below.
>
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 159187a28d66..4196d6bee92d 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1434,7 +1434,8 @@ static blk_qc_t request_to_qc_t(struct blk_mq_hw_ctx *hctx, struct request *rq)
>  	return blk_tag_to_qc_t(rq->internal_tag, hctx->queue_num, true);
>  }
>
> -static void blk_mq_try_issue_directly(struct request *rq, blk_qc_t *cookie)
> +static void blk_mq_try_issue_directly(struct request *rq, blk_qc_t *cookie,
> +				      bool can_block)
>  {
>  	struct request_queue *q = rq->q;
>  	struct blk_mq_queue_data bd = {
> @@ -1475,7 +1476,7 @@ static void blk_mq_try_issue_directly(struct request *rq, blk_qc_t *cookie)
>  	}
>
>  insert:
> -	blk_mq_sched_insert_request(rq, false, true, true, false);
> +	blk_mq_sched_insert_request(rq, false, true, false, can_block);
>  }
>
>  /*
> @@ -1569,11 +1570,11 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
>
>  		if (!(data.hctx->flags & BLK_MQ_F_BLOCKING)) {
>  			rcu_read_lock();
> -			blk_mq_try_issue_directly(old_rq, &cookie);
> +			blk_mq_try_issue_directly(old_rq, &cookie, false);
>  			rcu_read_unlock();
>  		} else {
>  			srcu_idx = srcu_read_lock(&data.hctx->queue_rq_srcu);
> -			blk_mq_try_issue_directly(old_rq, &cookie);
> +			blk_mq_try_issue_directly(old_rq, &cookie, true);
>  			srcu_read_unlock(&data.hctx->queue_rq_srcu, srcu_idx);
>  		}
>  		goto done;

Jens do you have this in your patch queue or is it something you want us
to test and submit back to you?

Yours,
Linus Walleij
diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index c49c90dba839..c459d80c66bf 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -1562,6 +1562,8 @@ static void mmc_blk_rw_rq_prep(struct mmc_queue_req *mqrq,
 	mqrq->areq.mrq = &brq->mrq;
 	mqrq->areq.err_check = mmc_blk_err_check;
+	mqrq->areq.host = card->host;
+	kthread_init_work(&mqrq->areq.finalization_work, mmc_finalize_areq);
 
 	mmc_queue_bounce_pre(mqrq);
 }
@@ -1672,8 +1674,6 @@ static void mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *new_req)
 			 * and there is nothing more to do until it is
 			 * complete.
 			 */
-			if (status == MMC_BLK_NEW_REQUEST)
-				mq->new_request = true;
 			return;
 		}
 
@@ -1811,7 +1811,6 @@ void mmc_blk_issue_rq(struct mmc_queue *mq, struct request *req)
 		goto out;
 	}
 
-	mq->new_request = false;
 	if (req && req_op(req) == REQ_OP_DISCARD) {
 		/* complete ongoing async transfer before issuing discard */
 		if (card->host->areq)
@@ -1832,7 +1831,7 @@ void mmc_blk_issue_rq(struct mmc_queue *mq, struct request *req)
 	}
 
 out:
-	if ((!req && !mq->new_request) || req_is_special)
+	if (!req || req_is_special)
 		/*
 		 * Release host when there are no more requests
 		 * and after special request(discard, flush) is done.
diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index 663799240635..8ecf61e51662 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -415,10 +415,13 @@ EXPORT_SYMBOL(mmc_start_bkops);
  */
 static void mmc_wait_data_done(struct mmc_request *mrq)
 {
-	struct mmc_context_info *context_info = &mrq->host->context_info;
+	struct mmc_host *host = mrq->host;
+	struct mmc_context_info *context_info = &host->context_info;
+	struct mmc_async_req *areq = host->areq;
 
 	context_info->is_done_rcv = true;
-	wake_up_interruptible(&context_info->wait);
+	/* Schedule a work to deal with finalizing this request */
+	kthread_queue_work(&host->req_done_worker, &areq->finalization_work);
 }
 
 static void mmc_wait_done(struct mmc_request *mrq)
@@ -592,43 +595,34 @@ static void mmc_post_req(struct mmc_host *host, struct mmc_request *mrq,
  *	will return MMC_BLK_SUCCESS if no request was
  *	going on.
  */
-static enum mmc_blk_status mmc_finalize_areq(struct mmc_host *host)
+void mmc_finalize_areq(struct kthread_work *work)
 {
+	struct mmc_async_req *areq =
+		container_of(work, struct mmc_async_req, finalization_work);
+	struct mmc_host *host = areq->host;
 	struct mmc_context_info *context_info = &host->context_info;
-	enum mmc_blk_status status;
-
-	if (!host->areq)
-		return MMC_BLK_SUCCESS;
-
-	while (1) {
-		wait_event_interruptible(context_info->wait,
-				(context_info->is_done_rcv ||
-				 context_info->is_new_req));
+	enum mmc_blk_status status = MMC_BLK_SUCCESS;
 
-		if (context_info->is_done_rcv) {
-			struct mmc_command *cmd;
+	if (context_info->is_done_rcv) {
+		struct mmc_command *cmd;
 
-			context_info->is_done_rcv = false;
-			cmd = host->areq->mrq->cmd;
+		context_info->is_done_rcv = false;
+		cmd = areq->mrq->cmd;
 
-			if (!cmd->error || !cmd->retries ||
-			    mmc_card_removed(host->card)) {
-				status = host->areq->err_check(host->card,
-							       host->areq);
-				break; /* return status */
-			} else {
-				mmc_retune_recheck(host);
-				pr_info("%s: req failed (CMD%u): %d, retrying...\n",
-					mmc_hostname(host),
-					cmd->opcode, cmd->error);
-				cmd->retries--;
-				cmd->error = 0;
-				__mmc_start_request(host, host->areq->mrq);
-				continue; /* wait for done/new event again */
-			}
+		if (!cmd->error || !cmd->retries ||
+		    mmc_card_removed(host->card)) {
+			status = areq->err_check(host->card,
+						 areq);
+		} else {
+			mmc_retune_recheck(host);
+			pr_info("%s: req failed (CMD%u): %d, retrying...\n",
+				mmc_hostname(host),
+				cmd->opcode, cmd->error);
+			cmd->retries--;
+			cmd->error = 0;
+			__mmc_start_request(host, areq->mrq);
+			return; /* wait for done/new event again */
 		}
-
-		return MMC_BLK_NEW_REQUEST;
 	}
 
 	mmc_retune_release(host);
@@ -644,10 +638,12 @@ static enum mmc_blk_status mmc_finalize_areq(struct mmc_host *host)
 	}
 
 	/* Successfully postprocess the old request at this point */
-	mmc_post_req(host, host->areq->mrq, 0);
+	mmc_post_req(host, areq->mrq, 0);
 
-	return status;
+	areq->finalization_status = status;
+	complete(&areq->complete);
 }
+EXPORT_SYMBOL(mmc_finalize_areq);
 
 /**
  *	mmc_start_areq - start an asynchronous request
@@ -677,18 +673,21 @@ struct mmc_async_req *mmc_start_areq(struct mmc_host *host,
 	if (areq)
 		mmc_pre_req(host, areq->mrq);
 
-	/* Finalize previous request */
-	status = mmc_finalize_areq(host);
+	/* Finalize previous request, if there is one */
+	if (previous) {
+		wait_for_completion(&previous->complete);
+		status = previous->finalization_status;
+	} else {
+		status = MMC_BLK_SUCCESS;
+	}
 	if (ret_stat)
 		*ret_stat = status;
 
-	/* The previous request is still going on... */
-	if (status == MMC_BLK_NEW_REQUEST)
-		return NULL;
-
 	/* Fine so far, start the new request! */
-	if (status == MMC_BLK_SUCCESS && areq)
+	if (status == MMC_BLK_SUCCESS && areq) {
+		init_completion(&areq->complete);
 		start_err = __mmc_start_data_req(host, areq->mrq);
+	}
 
 	/* Cancel a prepared request if it was not started.
 	 */
 	if ((status != MMC_BLK_SUCCESS || start_err) && areq)
@@ -2996,7 +2995,6 @@ void mmc_init_context_info(struct mmc_host *host)
 	host->context_info.is_new_req = false;
 	host->context_info.is_done_rcv = false;
 	host->context_info.is_waiting_last_req = false;
-	init_waitqueue_head(&host->context_info.wait);
 }
 
 static int __init mmc_init(void)
diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index 5cb369c2664b..73250ed8f093 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -86,11 +86,6 @@ static int mmc_queue_thread(void *d)
 		set_current_state(TASK_RUNNING);
 		mmc_blk_issue_rq(mq, req);
 		cond_resched();
-		if (mq->new_request) {
-			mq->new_request = false;
-			continue; /* fetch again */
-		}
-
 		/*
 		 * Current request becomes previous request
 		 * and vice versa.
@@ -143,7 +138,6 @@ static void mmc_request_fn(struct request_queue *q)
 
 	if (cntx->is_waiting_last_req) {
 		cntx->is_new_req = true;
-		wake_up_interruptible(&cntx->wait);
 	}
 
 	if (mq->asleep)
diff --git a/drivers/mmc/core/queue.h b/drivers/mmc/core/queue.h
index e298f100101b..39d8e710287e 100644
--- a/drivers/mmc/core/queue.h
+++ b/drivers/mmc/core/queue.h
@@ -40,7 +40,6 @@ struct mmc_queue {
 	struct mmc_card		*card;
 	struct task_struct	*thread;
 	struct semaphore	thread_sem;
-	bool			new_request;
 	bool			suspended;
 	bool			asleep;
 	struct mmc_blk_data	*blkdata;
diff --git a/include/linux/mmc/core.h b/include/linux/mmc/core.h
index a0c63ea28796..5db0fb722c37 100644
--- a/include/linux/mmc/core.h
+++ b/include/linux/mmc/core.h
@@ -9,6 +9,7 @@
 #define LINUX_MMC_CORE_H
 
 #include <linux/completion.h>
+#include <linux/kthread.h>
 #include <linux/types.h>
 
 struct mmc_data;
@@ -23,7 +24,6 @@ enum mmc_blk_status {
 	MMC_BLK_DATA_ERR,
 	MMC_BLK_ECC_ERR,
 	MMC_BLK_NOMEDIUM,
-	MMC_BLK_NEW_REQUEST,
 };
 
 struct mmc_command {
@@ -158,6 +158,7 @@ struct mmc_request {
 struct mmc_card;
 struct mmc_async_req;
 
+void mmc_finalize_areq(struct kthread_work *work);
 struct mmc_async_req *mmc_start_areq(struct mmc_host *host,
 				     struct mmc_async_req *areq,
 				     enum mmc_blk_status *ret_stat);
diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
index b04f8cd51c82..c5f61f2f2310 100644
--- a/include/linux/mmc/host.h
+++ b/include/linux/mmc/host.h
@@ -14,6 +14,7 @@
 #include <linux/device.h>
 #include <linux/fault-inject.h>
 #include <linux/kthread.h>
+#include <linux/completion.h>
 
 #include <linux/mmc/core.h>
 #include <linux/mmc/card.h>
@@ -169,6 +170,10 @@ struct mmc_async_req {
 	 * Returns 0 if success otherwise non zero.
 	 */
 	enum mmc_blk_status (*err_check)(struct mmc_card *, struct mmc_async_req *);
+	struct kthread_work finalization_work;
+	enum mmc_blk_status finalization_status;
+	struct completion complete;
+	struct mmc_host *host;
 };
 
 /**
@@ -192,13 +197,11 @@ struct mmc_slot {
  * @is_done_rcv		wake up reason was done request
  * @is_new_req		wake up reason was new request
  * @is_waiting_last_req	mmc context waiting for single running request
- * @wait		wait queue
  */
 struct mmc_context_info {
	bool			is_done_rcv;
	bool			is_new_req;
	bool			is_waiting_last_req;
-	wait_queue_head_t	wait;
 };
 
 struct regulator;
The waitqueue in the host context is there to signal back from
mmc_request_done() through mmc_wait_data_done() that the hardware
is done with a command, and when the wait is over, the core
will typically submit the next asynchronous request that is pending,
just waiting for the hardware to be available.

This is in the way of letting mmc_request_done() trigger the
report up to the block layer that a block request is finished.

Re-jig this as a first step, removing the waitqueue and introducing
a work that will run after a completed asynchronous request,
finalizing that request, including retransmissions, and eventually
reporting back with a completion and a status code to the
asynchronous issue method.

This has the upside that we can remove the MMC_BLK_NEW_REQUEST
status code and the "new_request" state in the request queue
that is only there to make the state machine spin out
the first time we send a request.

Introduce a workqueue in the host for handling just this, and
then a work and completion in the asynchronous request to deal
with this mechanism.

This is a central change that lets us do many other changes, since
we have broken the submit and complete code paths in two, and we
can potentially remove the NULL flushing of the asynchronous
pipeline and report block requests as finished directly from
the worker.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
 drivers/mmc/core/block.c |  7 ++--
 drivers/mmc/core/core.c  | 84 +++++++++++++++++++++++-------------------------
 drivers/mmc/core/queue.c |  6 ----
 drivers/mmc/core/queue.h |  1 -
 include/linux/mmc/core.h |  3 +-
 include/linux/mmc/host.h |  7 ++--
 6 files changed, 51 insertions(+), 57 deletions(-)

-- 
2.9.3