Message ID | 20230609131817.712867-1-xianting.tian@linux.alibaba.com |
---|---|
Series | fixup potential cpu stall |
On Fri, Jun 09, 2023 at 03:39:24PM +0200, Greg KH wrote:
> On Fri, Jun 09, 2023 at 09:18:15PM +0800, Xianting Tian wrote:
> > From: Xianting Tian <tianxianting.txt@alibaba-inc.com>
> >
> > Cpu stall issue may happen if device is configured with multi queues
> > and large queue depth, so fix it.
> >
> > Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com>
> > ---
> >  drivers/crypto/virtio/virtio_crypto_core.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/drivers/crypto/virtio/virtio_crypto_core.c b/drivers/crypto/virtio/virtio_crypto_core.c
> > index 1198bd306365..94849fa3bd74 100644
> > --- a/drivers/crypto/virtio/virtio_crypto_core.c
> > +++ b/drivers/crypto/virtio/virtio_crypto_core.c
> > @@ -480,6 +480,7 @@ static void virtcrypto_free_unused_reqs(struct virtio_crypto *vcrypto)
> >  			kfree(vc_req->req_data);
> >  			kfree(vc_req->sgs);
> >  		}
> > +		cond_resched();
>
> that's not "fixing a stall", it is "call the scheduler because we are
> taking too long".  The CPU isn't stalled at all, just busy.
>
> Are you sure this isn't just a bug in the code?  Why is this code taking
> so long that you have to force the scheduler to run?  This is almost
> always a sign that something else needs to be fixed instead.

And same comment on the other 2 patches, please fix this properly.

Also, this is a tight loop that is just freeing memory, why is it taking
so long?  Why do you want it to take longer (which is what you are doing
here), ideally it would be faster, not slower, so you are now slowing
down the system overall with this patchset, right?

thanks,

greg k-h

On Fri, Jun 09, 2023 at 09:49:39PM +0800, Xianting Tian wrote:
>
> On 2023/6/9 at 9:41 PM, Greg KH wrote:
> > On Fri, Jun 09, 2023 at 03:39:24PM +0200, Greg KH wrote:
> > > On Fri, Jun 09, 2023 at 09:18:15PM +0800, Xianting Tian wrote:
> > > > From: Xianting Tian <tianxianting.txt@alibaba-inc.com>
> > > >
> > > > Cpu stall issue may happen if device is configured with multi queues
> > > > and large queue depth, so fix it.
> > > >
> > > > Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com>
> > > > ---
> > > >  drivers/crypto/virtio/virtio_crypto_core.c | 1 +
> > > >  1 file changed, 1 insertion(+)
> > > >
> > > > diff --git a/drivers/crypto/virtio/virtio_crypto_core.c b/drivers/crypto/virtio/virtio_crypto_core.c
> > > > index 1198bd306365..94849fa3bd74 100644
> > > > --- a/drivers/crypto/virtio/virtio_crypto_core.c
> > > > +++ b/drivers/crypto/virtio/virtio_crypto_core.c
> > > > @@ -480,6 +480,7 @@ static void virtcrypto_free_unused_reqs(struct virtio_crypto *vcrypto)
> > > >  			kfree(vc_req->req_data);
> > > >  			kfree(vc_req->sgs);
> > > >  		}
> > > > +		cond_resched();
> > > that's not "fixing a stall", it is "call the scheduler because we are
> > > taking too long".  The CPU isn't stalled at all, just busy.
> > >
> > > Are you sure this isn't just a bug in the code?  Why is this code taking
> > > so long that you have to force the scheduler to run?  This is almost
> > > always a sign that something else needs to be fixed instead.
> > And same comment on the other 2 patches, please fix this properly.
> >
> > Also, this is a tight loop that is just freeing memory, why is it taking
> > so long?  Why do you want it to take longer (which is what you are doing
> > here), ideally it would be faster, not slower, so you are now slowing
> > down the system overall with this patchset, right?
>
> yes, it is the similar fix with one for virtio-net
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/net/virtio_net.c?h=v6.4-rc5&id=f8bb5104394560e29017c25bcade4c6b7aabd108

I would argue that this too is incorrect, because why does freeing
memory take so long?  And again, you are making it take longer, is that
ok?

thanks,

greg k-h

On 2023/6/9 at 10:05 PM, Greg KH wrote:
> On Fri, Jun 09, 2023 at 09:49:39PM +0800, Xianting Tian wrote:
>> On 2023/6/9 at 9:41 PM, Greg KH wrote:
>>> On Fri, Jun 09, 2023 at 03:39:24PM +0200, Greg KH wrote:
>>>> On Fri, Jun 09, 2023 at 09:18:15PM +0800, Xianting Tian wrote:
>>>>> From: Xianting Tian <tianxianting.txt@alibaba-inc.com>
>>>>>
>>>>> Cpu stall issue may happen if device is configured with multi queues
>>>>> and large queue depth, so fix it.
>>>>>
>>>>> Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com>
>>>>> ---
>>>>>  drivers/crypto/virtio/virtio_crypto_core.c | 1 +
>>>>>  1 file changed, 1 insertion(+)
>>>>>
>>>>> diff --git a/drivers/crypto/virtio/virtio_crypto_core.c b/drivers/crypto/virtio/virtio_crypto_core.c
>>>>> index 1198bd306365..94849fa3bd74 100644
>>>>> --- a/drivers/crypto/virtio/virtio_crypto_core.c
>>>>> +++ b/drivers/crypto/virtio/virtio_crypto_core.c
>>>>> @@ -480,6 +480,7 @@ static void virtcrypto_free_unused_reqs(struct virtio_crypto *vcrypto)
>>>>>  			kfree(vc_req->req_data);
>>>>>  			kfree(vc_req->sgs);
>>>>>  		}
>>>>> +		cond_resched();
>>>> that's not "fixing a stall", it is "call the scheduler because we are
>>>> taking too long".  The CPU isn't stalled at all, just busy.
>>>>
>>>> Are you sure this isn't just a bug in the code?  Why is this code taking
>>>> so long that you have to force the scheduler to run?  This is almost
>>>> always a sign that something else needs to be fixed instead.
>>> And same comment on the other 2 patches, please fix this properly.
>>>
>>> Also, this is a tight loop that is just freeing memory, why is it taking
>>> so long?  Why do you want it to take longer (which is what you are doing
>>> here), ideally it would be faster, not slower, so you are now slowing
>>> down the system overall with this patchset, right?
>> yes, it is the similar fix with one for virtio-net
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/net/virtio_net.c?h=v6.4-rc5&id=f8bb5104394560e29017c25bcade4c6b7aabd108
> I would argue that this too is incorrect, because why does freeing
> memory take so long?  And again, you are making it take longer, is that
> ok?

Yes, it may take longer, but I think it does no harm. The number of
queues and the queue depth are not fixed; they depend on the user's
configuration. Freeing all queues may keep the CPU in kernel space for a
long time without scheduling, so there is a risk of starving other tasks.

>
> thanks,
>
> greg k-h

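For readers following the disagreement above, here is a minimal userspace sketch of the pattern the patch introduces: drain every queue, and let the scheduler run once after each queue. It is an analogy only, not the driver code. `struct demo_req`, `struct demo_queue` and the `demo_*` helpers are hypothetical names made up for this illustration, and `sched_yield()` is an assumed stand-in for the kernel's `cond_resched()`. The two differ: `cond_resched()` reschedules only when a reschedule is pending, while `sched_yield()` asks the scheduler to run another thread on every call, so this sketch overstates the cost being questioned in the thread.

```c
#include <sched.h>   /* sched_yield() */
#include <stddef.h>  /* size_t, NULL */
#include <stdlib.h>  /* malloc(), free() */

/* Hypothetical stand-in for a request left behind on a queue. */
struct demo_req {
	void *req_data;
	void *sgs;
	struct demo_req *next;
};

/* Hypothetical stand-in for one virtqueue: a singly linked list. */
struct demo_queue {
	struct demo_req *head;
};

/* Push one request onto the queue; returns 0 on success, -1 on failure. */
static int demo_queue_push(struct demo_queue *q)
{
	struct demo_req *r = malloc(sizeof(*r));

	if (!r)
		return -1;
	r->req_data = malloc(16);
	r->sgs = malloc(16);
	r->next = q->head;
	q->head = r;
	return 0;
}

/* Detach one request, or return NULL when the queue is empty. */
static struct demo_req *demo_queue_detach(struct demo_queue *q)
{
	struct demo_req *r = q->head;

	if (r)
		q->head = r->next;
	return r;
}

/*
 * Free every leftover request on every queue and return how many were
 * freed. The yield sits after the inner loop, so it runs once per
 * queue, mirroring where the patch places cond_resched().
 */
static size_t demo_free_unused(struct demo_queue *queues, size_t nqueues)
{
	size_t freed = 0;

	for (size_t i = 0; i < nqueues; i++) {
		struct demo_req *r;

		while ((r = demo_queue_detach(&queues[i])) != NULL) {
			free(r->req_data);
			free(r->sgs);
			free(r);
			freed++;
		}
		sched_yield(); /* assumption: userspace analogue of cond_resched() */
	}
	return freed;
}
```

With the yield placed per queue rather than per request, the extra work grows with the number of queues, not with the queue depth; the total time spent freeing still depends on both, which is the configuration dependence described in the reply above.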
On Fri, Jun 09, 2023 at 09:18:15PM +0800, Xianting Tian wrote:
> From: Xianting Tian <tianxianting.txt@alibaba-inc.com>
>
> Cpu stall issue may happen if device is configured with multi queues
> and large queue depth, so fix it.
>
> Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com>

include a Fixes tag?

> ---
>  drivers/crypto/virtio/virtio_crypto_core.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/crypto/virtio/virtio_crypto_core.c b/drivers/crypto/virtio/virtio_crypto_core.c
> index 1198bd306365..94849fa3bd74 100644
> --- a/drivers/crypto/virtio/virtio_crypto_core.c
> +++ b/drivers/crypto/virtio/virtio_crypto_core.c
> @@ -480,6 +480,7 @@ static void virtcrypto_free_unused_reqs(struct virtio_crypto *vcrypto)
>  			kfree(vc_req->req_data);
>  			kfree(vc_req->sgs);
>  		}
> +		cond_resched();
>  	}
>  }
>
> --
> 2.17.1

On 2023/6/9 at 11:57 PM, Michael S. Tsirkin wrote:
> On Fri, Jun 09, 2023 at 09:18:15PM +0800, Xianting Tian wrote:
>> From: Xianting Tian <tianxianting.txt@alibaba-inc.com>
>>
>> Cpu stall issue may happen if device is configured with multi queues
>> and large queue depth, so fix it.
> What does "may happen" imply exactly?
> was this observed?

I haven't met such an issue; this patch set is just a theoretical fix.

>
>> Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com>
>> ---
>>  drivers/crypto/virtio/virtio_crypto_core.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/crypto/virtio/virtio_crypto_core.c b/drivers/crypto/virtio/virtio_crypto_core.c
>> index 1198bd306365..94849fa3bd74 100644
>> --- a/drivers/crypto/virtio/virtio_crypto_core.c
>> +++ b/drivers/crypto/virtio/virtio_crypto_core.c
>> @@ -480,6 +480,7 @@ static void virtcrypto_free_unused_reqs(struct virtio_crypto *vcrypto)
>>  			kfree(vc_req->req_data);
>>  			kfree(vc_req->sgs);
>>  		}
>> +		cond_resched();
>>  	}
>>  }
>>
>> --
>> 2.17.1

On Sat, Jun 10, 2023 at 11:20:49AM +0800, Xianting Tian wrote:
>
> On 2023/6/9 at 11:57 PM, Michael S. Tsirkin wrote:
> > On Fri, Jun 09, 2023 at 09:18:15PM +0800, Xianting Tian wrote:
> > > From: Xianting Tian <tianxianting.txt@alibaba-inc.com>
> > >
> > > Cpu stall issue may happen if device is configured with multi queues
> > > and large queue depth, so fix it.
> > What does "may happen" imply exactly?
> > was this observed?
> I haven't met such an issue; this patch set is just a theoretical fix.

Then I would not recommend adding it at this time, as you just slowed
down the kernel for something that no one has reported :(

thanks,

greg k-h

On 2023/6/22 at 7:59 PM, Michael S. Tsirkin wrote:
> On Fri, Jun 09, 2023 at 09:18:14PM +0800, Xianting Tian wrote:
>> Cpu stall issue may happen if device is configured with multi queues
>> and large queue depth, so fix it.
>
> I applied this after tweaking commit log to address Greg's comments.
> In the future I expect you guys to do such tweaks yourself.

thanks.

>
>> Xianting Tian (3):
>>   virtio-crypto: fixup potential cpu stall when free unused bufs
>>   virtio_console: fixup potential cpu stall when free unused bufs
>>   virtio_bt: fixup potential cpu stall when free unused bufs
>>
>>  drivers/bluetooth/virtio_bt.c              | 1 +
>>  drivers/char/virtio_console.c              | 1 +
>>  drivers/crypto/virtio/virtio_crypto_core.c | 1 +
>>  3 files changed, 3 insertions(+)
>>
>> --
>> 2.17.1

Hello:

This series was applied to bluetooth/bluetooth-next.git (master)
by Michael S. Tsirkin <mst@redhat.com>:

On Fri, 9 Jun 2023 21:18:14 +0800 you wrote:
> Cpu stall issue may happen if device is configured with multi queues
> and large queue depth, so fix it.
>
> Xianting Tian (3):
>   virtio-crypto: fixup potential cpu stall when free unused bufs
>   virtio_console: fixup potential cpu stall when free unused bufs
>   virtio_bt: fixup potential cpu stall when free unused bufs
>
> [...]

Here is the summary with links:
  - [1/3] virtio-crypto: fixup potential cpu stall when free unused bufs
    https://git.kernel.org/bluetooth/bluetooth-next/c/7a5103b81a96
  - [2/3] virtio_console: fixup potential cpu stall when free unused bufs
    https://git.kernel.org/bluetooth/bluetooth-next/c/56b5e65efe00
  - [3/3] virtio_bt: fixup potential cpu stall when free unused bufs
    https://git.kernel.org/bluetooth/bluetooth-next/c/3845308fc8b0

You are awesome, thank you!