[RFC,v2,0/2] ceph_check_delayed_caps() softlockup

Message ID 20210629094749.25253-1-lhenriques@suse.de

Message

Luis Henriques June 29, 2021, 9:47 a.m. UTC
This is an attempt to fix the softlockup on the delayed_work workqueue.  As
stated in patch 0002:

  Function ceph_check_delayed_caps() is called from the mdsc->delayed_work
  workqueue and it can be kept looping for quite some time if caps keep being
  added back to the mdsc->cap_delay_list.  This may result in the watchdog
  tainting the kernel with the softlockup flag.

v2 of this fix modifies the approach by time-bounding the loop in this
function, so that any caps added to the list *after* the loop starts will
be postponed to the next wq run.
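
As a rough illustration of the time-bounding idea described above (not the
actual patch code), here is a minimal userspace sketch.  All names in it
(struct entry, struct queue, queue_push(), process_bounded()) are
hypothetical, the clock is a plain counter rather than jiffies, and counter
wraparound is ignored; kernel code would compare jiffies with helpers such as
time_before()/time_after().

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAP 16

/* Hypothetical stand-in for an entry on a delay list. */
struct entry {
	unsigned long queued_at;  /* tick at which the entry was (re)queued */
	int requeue;              /* nonzero: handling it puts it back on the list */
};

/* Fixed-size FIFO ring buffer, oldest entry at 'head'. */
struct queue {
	struct entry items[QUEUE_CAP];
	size_t head;
	size_t count;
};

static void queue_push(struct queue *q, unsigned long now, int requeue)
{
	size_t tail;

	assert(q->count < QUEUE_CAP);
	tail = (q->head + q->count) % QUEUE_CAP;
	q->items[tail].queued_at = now;
	q->items[tail].requeue = requeue;
	q->count++;
}

/*
 * Handle only the entries that were queued at or before the moment the loop
 * started.  The first entry queued later than that ends the loop, so entries
 * that keep being re-added cannot keep the loop spinning; they are left for
 * the next run.  Returns the number of entries handled.
 */
static size_t process_bounded(struct queue *q, unsigned long *clock)
{
	unsigned long loop_start = *clock;
	size_t handled = 0;

	while (q->count > 0) {
		struct entry e = q->items[q->head];

		if (e.queued_at > loop_start)
			break;  /* added after the loop began: defer to next run */

		q->head = (q->head + 1) % QUEUE_CAP;
		q->count--;
		handled++;

		(*clock)++;  /* model the time spent handling the entry */
		if (e.requeue)
			queue_push(q, *clock, e.requeue);
	}
	return handled;
}
```

Without the queued_at check, an entry with requeue set would be handled again
and again within the same run; with it, each run handles at most the entries
that were already on the list when the run began.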

An extra change in 0001 (suggested by Jeff) allows scheduling runs for
periods shorter than the default 5-second period.  This way,
delayed_work() can have the next run scheduled for the next list element's
ci->i_hold_caps_max instead of 5 secs.
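
The delay selection rule in 0001 (a delay of 0, or one longer than the
default, falls back to the default) can be sketched in isolation as below.
This is a simplified userspace model, not the kernel function: SKETCH_HZ,
DEFAULT_DELAY_TICKS and effective_delay() are hypothetical names, the tick
rate is an assumed value, and the rounding done by round_jiffies_relative()
in the patch is deliberately left out.

```c
#define SKETCH_HZ 250UL                      /* assumed tick rate, illustration only */
#define DEFAULT_DELAY_TICKS (5 * SKETCH_HZ)  /* default period: 5 secs */

/*
 * Return the delay (in ticks) that would be used for the next run:
 * a caller-supplied delay is honoured only if it is non-zero and no
 * longer than the default period; otherwise the default is used.
 */
static unsigned long effective_delay(unsigned long delay)
{
	if (delay == 0 || delay > DEFAULT_DELAY_TICKS)
		return DEFAULT_DELAY_TICKS;
	return delay;
}
```

With these assumed values, effective_delay(100) yields 100 ticks, while
effective_delay(0) and any value above 1250 ticks yield the 5-second default.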

This patchset should fix the issue reported here [1], although a quick
search for "ceph_check_delayed_caps" in the tracker returns a few more
bugs, possibly duplicates.

[1] https://tracker.ceph.com/issues/46284

Luis Henriques (2):
  ceph: allow schedule_delayed() callers to set delay for workqueue
  ceph: reduce contention in ceph_check_delayed_caps()

 fs/ceph/caps.c       | 17 ++++++++++++++++-
 fs/ceph/mds_client.c | 24 +++++++++++++++---------
 fs/ceph/super.h      |  2 +-
 3 files changed, 32 insertions(+), 11 deletions(-)

Comments

Ilya Dryomov June 29, 2021, 10:14 a.m. UTC | #1
On Tue, Jun 29, 2021 at 11:47 AM Luis Henriques <lhenriques@suse.de> wrote:
>
> Allow schedule_delayed() callers to explicitly set the delay value instead
> of defaulting to a 5 secs value.
>
> Signed-off-by: Luis Henriques <lhenriques@suse.de>
> ---
>  fs/ceph/mds_client.c | 19 ++++++++++++-------
>  1 file changed, 12 insertions(+), 7 deletions(-)
>
> diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
> index e5af591d3bd4..08c76bf57fb1 100644
> --- a/fs/ceph/mds_client.c
> +++ b/fs/ceph/mds_client.c
> @@ -4502,13 +4502,18 @@ void inc_session_sequence(struct ceph_mds_session *s)
>  }
>
>  /*
> - * delayed work -- periodically trim expired leases, renew caps with mds
> + * delayed work -- periodically trim expired leases, renew caps with mds.  If
> + * the @delay parameter is set to 0 or if it's more than 5 secs, the default
> + * workqueue delay value of 5 secs will be used.
>   */
> -static void schedule_delayed(struct ceph_mds_client *mdsc)
> +static void schedule_delayed(struct ceph_mds_client *mdsc, unsigned long delay)
>  {
> -       int delay = 5;
> -       unsigned hz = round_jiffies_relative(HZ * delay);
> -       schedule_delayed_work(&mdsc->delayed_work, hz);
> +       unsigned long max_delay = round_jiffies_relative(HZ * 5);
> +
> +       /* 5 secs default delay */
> +       if (!delay || (delay > max_delay))
> +               delay = max_delay;
> +       schedule_delayed_work(&mdsc->delayed_work, delay);

Hi Luis,

Is there a reason to not round the non-default delay?  Does it need to
be precise?

Thanks,

                Ilya