[RFC,0/4] Action initalization fixes

Message ID	20210331164012.28653-1-vladbu@nvidia.com
Headers
Return-Path: <netdev-owner@kernel.org>
Received-SPF: Pass (protection.outlook.com: domain of nvidia.com designates 216.228.112.32 as permitted sender) receiver=protection.outlook.com; client-ip=216.228.112.32; helo=mail.nvidia.com;
From: Vlad Buslov <vladbu@nvidia.com>
To: <netdev@vger.kernel.org>
CC: <memxor@gmail.com>, <xiyou.wangcong@gmail.com>, <davem@davemloft.net>, <jhs@mojatatu.com>, <jiri@resnulli.us>, <kuba@kernel.org>, <toke@redhat.com>, Vlad Buslov <vladbu@nvidia.com>
Subject: [PATCH RFC 0/4] Action initalization fixes
Date: Wed, 31 Mar 2021 19:40:08 +0300
Message-ID: <20210331164012.28653-1-vladbu@nvidia.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain
Precedence: bulk
Series	Action initalization fixes
[RFC,0/4] Action initalization fixes
[RFC,1/4] net: sched: fix action overwrite reference counting
[RFC,4/4] tc-testing: add simple action test to verify batch change cleanup

Message

Vlad Buslov March 31, 2021, 4:40 p.m. UTC

This series contains fixes and relevant tests for two issues that are being
discussed in the mailing list thread "net: sched: bump refcount for new action in
ACT replace mode".

Sending this as an RFC to gather feedback. The non-RFC submission will probably
be split in two: fixes to net, tests to net-next.

Vlad Buslov (4):
  net: sched: fix action overwrite reference counting
  net: sched: fix err handler in tcf_action_init()
  tc-testing: add simple action test to verify batch add cleanup
  tc-testing: add simple action test to verify batch change cleanup

 include/net/act_api.h                         |  5 +-
 net/sched/act_api.c                           | 57 +++++++++++-------
 net/sched/cls_api.c                           |  9 +--
 .../tc-testing/tc-tests/actions/simple.json   | 59 +++++++++++++++++++
 4 files changed, 104 insertions(+), 26 deletions(-)

Comments

Cong Wang April 2, 2021, 11:14 p.m. UTC | #1

On Wed, Mar 31, 2021 at 9:41 AM Vlad Buslov <vladbu@nvidia.com> wrote:
>
> With recent changes that separated action module load from action
> initialization tcf_action_init() function error handling code was modified
> to manually release the loaded modules if loading/initialization of any
> further action in same batch failed. For the case when all modules
> successfully loaded and some of the actions were initialized before one of
> them failed in init handler. In this case for all previous actions the
> module will be released twice by the error handler: First time by the loop
> that manually calls module_put() for all ops, and second time by the action
> destroy code that puts the module after destroying the action.

This is really strange. Isn't tc_action_load_ops() paired with module_put()
under 'err_mod'? And the one in tcf_action_destroy() paired with
tcf_action_init_1()? Is it the one below which causes the imbalance?

        /* module count goes up only when brand new policy is created
         * if it exists and is only bound to in a_o->init() then
         * ACT_P_CREATED is not returned (a zero is).
         */
        if (err != ACT_P_CREATED)
                module_put(a_o->owner);

Thanks.

Vlad Buslov April 3, 2021, 10:01 a.m. UTC | #2

On Sat 03 Apr 2021 at 02:14, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> On Wed, Mar 31, 2021 at 9:41 AM Vlad Buslov <vladbu@nvidia.com> wrote:
>>
>> With recent changes that separated action module load from action
>> initialization tcf_action_init() function error handling code was modified
>> to manually release the loaded modules if loading/initialization of any
>> further action in same batch failed. For the case when all modules
>> successfully loaded and some of the actions were initialized before one of
>> them failed in init handler. In this case for all previous actions the
>> module will be released twice by the error handler: First time by the loop
>> that manually calls module_put() for all ops, and second time by the action
>> destroy code that puts the module after destroying the action.
>
> This is really strange. Isn't tc_action_load_ops() paired with module_put()
> under 'err_mod'? And the one in tcf_action_destroy() paired with
> tcf_action_init_1()? Is it the one below which causes the imbalance?
>
>         /* module count goes up only when brand new policy is created
>          * if it exists and is only bound to in a_o->init() then
>          * ACT_P_CREATED is not returned (a zero is).
>          */
>         if (err != ACT_P_CREATED)
>                 module_put(a_o->owner);

This problem is not related to the action change reference counting
imbalance, which is addressed in the previous commit. The issue is that
tcf_action_init_1() doesn't take another reference to the module.
It expects the caller to get the reference before calling init and "takes
over" the reference in case of success (i.e. the action instance now owns
the reference, which will be released when the action instance is destroyed).
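
The hand-off described in this paragraph can be modelled in a few lines of userspace C. This is only an illustrative sketch, not kernel code: every toy_* name is a hypothetical stand-in, and the counter is a plain int rather than the real module refcount.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel objects; not the real structures. */
struct toy_module { int refcnt; };
struct toy_action { struct toy_module *owner; };

static void toy_module_get(struct toy_module *m) { m->refcnt++; }
static void toy_module_put(struct toy_module *m) { m->refcnt--; }

/* Models tcf_action_init_1() as described above: it takes no reference of
 * its own. On success the new action inherits the reference the caller
 * already holds; on failure that reference stays with the caller.
 */
static struct toy_action *toy_init_1(struct toy_module *m, int should_fail)
{
	struct toy_action *a;

	if (should_fail)
		return NULL;
	a = malloc(sizeof(*a));
	if (!a)
		return NULL;
	a->owner = m;	/* the action instance now owns the caller's reference */
	return a;
}

/* Models action destruction: the instance drops the reference it owns. */
static void toy_destroy(struct toy_action *a)
{
	assert(a != NULL);
	toy_module_put(a->owner);
	free(a);
}

/* Returns the module refcount balance after one create/destroy cycle. */
static int toy_lifecycle(int should_fail)
{
	struct toy_module m = { 0 };
	struct toy_action *a;

	toy_module_get(&m);		/* caller takes the reference first */
	a = toy_init_1(&m, should_fail);
	if (!a)
		toy_module_put(&m);	/* failure: caller still owns the reference */
	else
		toy_destroy(a);		/* success: destroy releases it */
	return m.refcnt;
}
```

In this model toy_lifecycle() returns 0 for both the success and the failure case, because exactly one party releases the single reference that was taken.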

So, the following happens in the reproduction provided in the commit message
when executing "tc actions add action simple sdata \"1\" index 1
action simple sdata \"2\" index 2" command:

1. tcf_action_init() is called with batch of two actions of same type,
no module references are held, 'actions' array is empty:

act_simple refcnt balance = 0
actions[] = {}

2. tc_action_load_ops() is called for first action:

act_simple refcnt balance = +1
actions[] = {}

3. tc_action_load_ops() is called for second action:

act_simple refcnt balance = +2
actions[] = {}

4. tcf_action_init_1() called for first action, succeeds, action
instance is assigned to 'actions' array:

act_simple refcnt balance = +2
actions[] = { [0]=act1 }

5. tcf_action_init_1() fails for second action, 'actions' array not
changed, goto err:

act_simple refcnt balance = +2
actions[] = { [0]=act1 }

6. tcf_action_destroy() is called for 'actions' array, last reference to
first action is released, tcf_action_destroy_1() calls module_put() for
the action's module:

act_simple refcnt balance = +1
actions[] = {}

7. err_mod loop starts iterating over ops array, executes module_put()
for the first action's ops:

act_simple refcnt balance = 0
actions[] = {}

8. err_mod loop executes module_put() for the second action's ops:

act_simple refcnt balance = -1
actions[] = {}

The goal of my fix is to not unconditionally release the module
reference for successfully initialized actions because this is already
handled by action destroy code. Hope this explanation clarifies things.
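
The arithmetic of the walk-through above can be replayed with a small userspace model. This is a sketch under stated assumptions, not the actual patch: toy_batch_error_balance() and its skip_initialized flag are hypothetical, and the flag only approximates the stated goal of not releasing the module a second time for actions that were already initialized.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_BATCH 2	/* two actions of the same type, as in the reproduction */

/* Returns the act_simple refcount balance after the error path has run.
 * With skip_initialized set, the err_mod loop skips ops whose action was
 * initialized, since destroying that action already put the module.
 */
static int toy_batch_error_balance(bool skip_initialized)
{
	bool initialized[TOY_BATCH] = { false };
	int refcnt = 0;
	int i;

	/* Steps 2-3: ops are loaded for every action in the batch. */
	for (i = 0; i < TOY_BATCH; i++)
		refcnt++;
	assert(refcnt == TOY_BATCH);

	/* Steps 4-5: the first init succeeds, the second one fails. */
	initialized[0] = true;

	/* Step 6: destroying each initialized action puts the module. */
	for (i = 0; i < TOY_BATCH; i++)
		if (initialized[i])
			refcnt--;

	/* err_mod loop: put the module for the loaded ops. */
	for (i = 0; i < TOY_BATCH; i++) {
		if (skip_initialized && initialized[i])
			continue;
		refcnt--;
	}
	return refcnt;
}
```

toy_batch_error_balance(false) ends at -1, matching the final state listed above, while toy_batch_error_balance(true) ends at 0.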

Regards,
Vlad

Cong Wang April 5, 2021, 10:56 p.m. UTC | #3

On Sat, Apr 3, 2021 at 3:01 AM Vlad Buslov <vladbu@nvidia.com> wrote:
> So, the following happens in reproduction provided in commit message
> when executing "tc actions add action simple sdata \"1\" index 1
> action simple sdata \"2\" index 2" command:
>
> 1. tcf_action_init() is called with batch of two actions of same type,
> no module references are held, 'actions' array is empty:
>
> act_simple refcnt balance = 0
> actions[] = {}
>
> 2. tc_action_load_ops() is called for first action:
>
> act_simple refcnt balance = +1
> actions[] = {}
>
> 3. tc_action_load_ops() is called for second action:
>
> act_simple refcnt balance = +2
> actions[] = {}
>
> 4. tcf_action_init_1() called for first action, succeeds, action
> instance is assigned to 'actions' array:
>
> act_simple refcnt balance = +2
> actions[] = { [0]=act1 }
>
> 5. tcf_action_init_1() fails for second action, 'actions' array not
> changed, goto err:
>
> act_simple refcnt balance = +2
> actions[] = { [0]=act1 }
>
> 6. tcf_action_destroy() is called for 'actions' array, last reference to
> first action is released, tcf_action_destroy_1() calls module_put() for
> actions module:
>
> act_simple refcnt balance = +1
> actions[] = {}
>
> 7. err_mod loop starts iterating over ops array, executes module_put()
> for first actions ops:
>
> act_simple refcnt balance = 0
> actions[] = {}
>
> 7. err_mod loop executes module_put() for second actions ops:
>
> act_simple refcnt balance = -1
> actions[] = {}
>
>
> The goal of my fix is to not unconditionally release the module
> reference for successfully initialized actions because this is already
> handled by action destroy code. Hope this explanation clarifies things.

Great explanation! It seems harder and harder to understand the
module refcnt here. How about we just take the refcnt when we
successfully create an action? Something like this:

diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index b919826939e0..075cc80480bf 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -493,6 +493,7 @@ int tcf_idr_create(struct tc_action_net *tn, u32 index, struct nlattr *est,
        }

        p->idrinfo = idrinfo;
+       __module_get(ops->owner);
        p->ops = ops;
        *a = p;
        return 0;
@@ -1035,13 +1036,6 @@ struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp,
        if (!name)
                a->hw_stats = hw_stats;

-       /* module count goes up only when brand new policy is created
-        * if it exists and is only bound to in a_o->init() then
-        * ACT_P_CREATED is not returned (a zero is).
-        */
-       if (err != ACT_P_CREATED)
-               module_put(a_o->owner);
-
        return a;

 err_out:
@@ -1100,7 +1094,8 @@ int tcf_action_init(struct net *net, struct tcf_proto *tp, struct nlattr *nla,
        tcf_idr_insert_many(actions);

        *attr_size = tcf_action_full_attrs_size(sz);
-       return i - 1;
+       err = i - 1;
+       goto err_mod;

 err:
        tcf_action_destroy(actions, bind);

The idea is that, at the higher level, we hold a refcnt when loading the module
and put it back _unconditionally_ when returning, and we take another refcnt only
when we create an action, putting it back conditionally when an error happens.
In pseudocode, it is something like this:

load_ops() // module refcnt +1
init_actions(); // module refcnt +1 only when create a new one
if (err)
  // module refcnt -1 when we delete one
  module_put();
module_put(); // module refcnt -1

This looks much easier to track. What do you think?
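
To check the bookkeeping of the scheme sketched above, here is a small userspace model in C. It is an illustration only: toy_proposed_balance() and its parameters are hypothetical, and the model assumes that every action in the batch is newly created rather than bound to an existing one.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns the module refcount balance once the batch init returns.
 * n_actions: batch size; fail_at: index of the action whose init fails,
 * or -1 if every init succeeds.
 */
static int toy_proposed_balance(int n_actions, int fail_at)
{
	bool err = false;
	int created = 0;
	int refcnt = 0;
	int i;

	assert(n_actions > 0 && fail_at < n_actions);

	for (i = 0; i < n_actions; i++)
		refcnt++;		/* load_ops(): +1 per action */

	for (i = 0; i < n_actions; i++) {
		if (i == fail_at) {
			err = true;
			break;
		}
		refcnt++;		/* creating a new action: +1 */
		created++;
	}

	if (err)
		refcnt -= created;	/* destroying created actions: -1 each */

	for (i = 0; i < n_actions; i++)
		refcnt--;		/* unconditional put on return: -1 per action */

	return refcnt;
}
```

With a batch of two, toy_proposed_balance(2, 1) ends at 0 (nothing leaks on error) and toy_proposed_balance(2, -1) ends at 2 (one reference per live action).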

Thanks!

Vlad Buslov April 6, 2021, 7:35 p.m. UTC | #4

On Tue 06 Apr 2021 at 01:56, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> On Sat, Apr 3, 2021 at 3:01 AM Vlad Buslov <vladbu@nvidia.com> wrote:
>> So, the following happens in reproduction provided in commit message
>> when executing "tc actions add action simple sdata \"1\" index 1
>> action simple sdata \"2\" index 2" command:
>>
>> 1. tcf_action_init() is called with batch of two actions of same type,
>> no module references are held, 'actions' array is empty:
>>
>> act_simple refcnt balance = 0
>> actions[] = {}
>>
>> 2. tc_action_load_ops() is called for first action:
>>
>> act_simple refcnt balance = +1
>> actions[] = {}
>>
>> 3. tc_action_load_ops() is called for second action:
>>
>> act_simple refcnt balance = +2
>> actions[] = {}
>>
>> 4. tcf_action_init_1() called for first action, succeeds, action
>> instance is assigned to 'actions' array:
>>
>> act_simple refcnt balance = +2
>> actions[] = { [0]=act1 }
>>
>> 5. tcf_action_init_1() fails for second action, 'actions' array not
>> changed, goto err:
>>
>> act_simple refcnt balance = +2
>> actions[] = { [0]=act1 }
>>
>> 6. tcf_action_destroy() is called for 'actions' array, last reference to
>> first action is released, tcf_action_destroy_1() calls module_put() for
>> actions module:
>>
>> act_simple refcnt balance = +1
>> actions[] = {}
>>
>> 7. err_mod loop starts iterating over ops array, executes module_put()
>> for first actions ops:
>>
>> act_simple refcnt balance = 0
>> actions[] = {}
>>
>> 7. err_mod loop executes module_put() for second actions ops:
>>
>> act_simple refcnt balance = -1
>> actions[] = {}
>>
>>
>> The goal of my fix is to not unconditionally release the module
>> reference for successfully initialized actions because this is already
>> handled by action destroy code. Hope this explanation clarifies things.
>
> Great explanation! It seems harder and harder to understand the
> module refcnt here. How about we just take the refcnt when we
> successfully create an action? Something like this:
>
> diff --git a/net/sched/act_api.c b/net/sched/act_api.c
> index b919826939e0..075cc80480bf 100644
> --- a/net/sched/act_api.c
> +++ b/net/sched/act_api.c
> @@ -493,6 +493,7 @@ int tcf_idr_create(struct tc_action_net *tn, u32
> index, struct nlattr *est,
>         }
>
>         p->idrinfo = idrinfo;
> +       __module_get(ops->owner);
>         p->ops = ops;
>         *a = p;
>         return 0;
> @@ -1035,13 +1036,6 @@ struct tc_action *tcf_action_init_1(struct net
> *net, struct tcf_proto *tp,
>         if (!name)
>                 a->hw_stats = hw_stats;
>
> -       /* module count goes up only when brand new policy is created
> -        * if it exists and is only bound to in a_o->init() then
> -        * ACT_P_CREATED is not returned (a zero is).
> -        */
> -       if (err != ACT_P_CREATED)
> -               module_put(a_o->owner);
> -
>         return a;
>
>  err_out:
> @@ -1100,7 +1094,8 @@ int tcf_action_init(struct net *net, struct
> tcf_proto *tp, struct nlattr *nla,
>         tcf_idr_insert_many(actions);
>
>         *attr_size = tcf_action_full_attrs_size(sz);
> -       return i - 1;
> +       err = i - 1;
> +       goto err_mod:
>
>  err:
>         tcf_action_destroy(actions, bind);
>
> The idea is on the higher level we hold refcnt when loading module and
> put it back _unconditionally_ when returning, and hold a refcnt only when
> we create an action and conditionally put it back when an error happens.
> With pseudo code, it is something like this:
>
> load_ops() // module refcnt +1
> init_actions(); // module refcnt +1 only when create a new one
> if (err)
>   // module refcnt -1 when we delete one
>   module_put();
> module_put(); // module refcnt -1
>
> This looks much easier to track. What do you think?
>
> Thanks!

Indeed, your suggestion looks more straightforward. The only thing we
need to keep in mind is that the action->init() callbacks assume that the
caller releases the module even after calling tcf_idr_create(), so we also
need to modify tcf_idr_release() (used by error handlers in action->init()
implementations) to release the module.
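
A rough userspace sketch of why the release path matters once creation takes its own reference. This is not kernel code: the toy_* functions are hypothetical models of tcf_idr_create() and tcf_idr_release(), and the release_puts_module flag exists only to compare the two behaviours.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; not the real kernel structures. */
struct toy_module { int refcnt; };
struct toy_action { struct toy_module *owner; bool live; };

/* Models tcf_idr_create() with the proposed change: creation takes a ref. */
static void toy_idr_create(struct toy_action *a, struct toy_module *m)
{
	m->refcnt++;
	a->owner = m;
	a->live = true;
}

/* Models tcf_idr_release(); the flag selects whether releasing the action
 * also drops the module reference taken at creation time.
 */
static void toy_idr_release(struct toy_action *a, bool release_puts_module)
{
	assert(a->live);
	a->live = false;
	if (release_puts_module)
		a->owner->refcnt--;
}

/* Models an init() callback that fails after it has created the action. */
static int toy_init_fails_after_create(struct toy_module *m,
				       bool release_puts_module)
{
	struct toy_action a;

	toy_idr_create(&a, m);
	/* ... a later validation step fails, so the error handler runs ... */
	toy_idr_release(&a, release_puts_module);
	return -1;
}

/* Returns the module refcount balance after a failed init. */
static int toy_failed_init_balance(bool release_puts_module)
{
	struct toy_module m = { 0 };

	m.refcnt++;	/* caller loads ops: +1 */
	(void)toy_init_fails_after_create(&m, release_puts_module);
	m.refcnt--;	/* caller's unconditional put: -1 */
	return m.refcnt;
}
```

In this model toy_failed_init_balance(false) leaves a leaked reference (balance 1), while toy_failed_init_balance(true) returns to 0.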

I'll run some tests tomorrow to verify that I'm not missing anything
else.

Regards,
Vlad