Message ID | 20200923035624.7307-1-xiyou.wangcong@gmail.com |
---|---|
Headers | show |
Series | net_sched: fix a UAF in tcf_action_init() | expand |
From: Cong Wang <xiyou.wangcong@gmail.com> Date: Tue, 22 Sep 2020 20:56:22 -0700 > This patchset fixes a use-after-free triggered by syzbot. Please > find more details in each patch description. Series applied and queued up for -stable, thanks Cong.
On Wed 23 Sep 2020 at 06:56, Cong Wang <xiyou.wangcong@gmail.com> wrote: > All TC actions call tcf_idr_insert() for new action at the end > of their ->init(), so we can actually move it to a central place > in tcf_action_init_1(). > > And once the action is inserted into the global IDR, other parallel > process could free it immediately as its refcnt is still 1, so we can > not fail after this, we need to move it after the goto action > validation to avoid handling the failure case after insertion. > > This is found during code review, is not directly triggered by syzbot. > And this prepares for the next patch. > > Cc: Vlad Buslov <vladbu@mellanox.com> > Cc: Jamal Hadi Salim <jhs@mojatatu.com> > Cc: Jiri Pirko <jiri@resnulli.us> > Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> > --- Hi Cong, Thanks for fixing this! > include/net/act_api.h | 2 -- > net/sched/act_api.c | 38 ++++++++++++++++++++------------------ > net/sched/act_bpf.c | 4 +--- > net/sched/act_connmark.c | 1 - > net/sched/act_csum.c | 3 --- > net/sched/act_ct.c | 2 -- > net/sched/act_ctinfo.c | 3 --- > net/sched/act_gact.c | 2 -- > net/sched/act_gate.c | 3 --- > net/sched/act_ife.c | 3 --- > net/sched/act_ipt.c | 2 -- > net/sched/act_mirred.c | 2 -- > net/sched/act_mpls.c | 2 -- > net/sched/act_nat.c | 3 --- > net/sched/act_pedit.c | 2 -- > net/sched/act_police.c | 2 -- > net/sched/act_sample.c | 2 -- > net/sched/act_simple.c | 2 -- > net/sched/act_skbedit.c | 2 -- > net/sched/act_skbmod.c | 2 -- > net/sched/act_tunnel_key.c | 3 --- > net/sched/act_vlan.c | 2 -- > 22 files changed, 21 insertions(+), 66 deletions(-) > > diff --git a/include/net/act_api.h b/include/net/act_api.h > index cb382a89ea58..87214927314a 100644 > --- a/include/net/act_api.h > +++ b/include/net/act_api.h > @@ -166,8 +166,6 @@ int tcf_idr_create_from_flags(struct tc_action_net *tn, u32 index, > struct nlattr *est, struct tc_action **a, > const struct tc_action_ops *ops, int bind, > u32 flags); > -void tcf_idr_insert(struct tc_action_net *tn, struct tc_action *a); > - > void tcf_idr_cleanup(struct tc_action_net *tn, u32 index); > int tcf_idr_check_alloc(struct tc_action_net *tn, u32 *index, > struct tc_action **a, int bind); > diff --git a/net/sched/act_api.c b/net/sched/act_api.c > index 063d8aaf2900..0030f00234ee 100644 > --- a/net/sched/act_api.c > +++ b/net/sched/act_api.c > @@ -467,17 +467,6 @@ int tcf_idr_create_from_flags(struct tc_action_net *tn, u32 index, > } > EXPORT_SYMBOL(tcf_idr_create_from_flags); > > -void tcf_idr_insert(struct tc_action_net *tn, struct tc_action *a) > -{ > - struct tcf_idrinfo *idrinfo = tn->idrinfo; > - > - mutex_lock(&idrinfo->lock); > - /* Replace ERR_PTR(-EBUSY) allocated by tcf_idr_check_alloc */ > - WARN_ON(!IS_ERR(idr_replace(&idrinfo->action_idr, a, a->tcfa_index))); > - mutex_unlock(&idrinfo->lock); > -} > -EXPORT_SYMBOL(tcf_idr_insert); > - > /* Cleanup idr index that was allocated but not initialized. */ > > void tcf_idr_cleanup(struct tc_action_net *tn, u32 index) > @@ -902,6 +891,16 @@ static const struct nla_policy tcf_action_policy[TCA_ACT_MAX + 1] = { > [TCA_ACT_HW_STATS] = NLA_POLICY_BITFIELD32(TCA_ACT_HW_STATS_ANY), > }; > > +static void tcf_idr_insert(struct tc_action *a) > +{ > + struct tcf_idrinfo *idrinfo = a->idrinfo; > + > + mutex_lock(&idrinfo->lock); > + /* Replace ERR_PTR(-EBUSY) allocated by tcf_idr_check_alloc */ > + WARN_ON(!IS_ERR(idr_replace(&idrinfo->action_idr, a, a->tcfa_index))); > + mutex_unlock(&idrinfo->lock); > +} > + > struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp, > struct nlattr *nla, struct nlattr *est, > char *name, int ovr, int bind, > @@ -989,6 +988,16 @@ struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp, > if (err < 0) > goto err_mod; > > + if (TC_ACT_EXT_CMP(a->tcfa_action, TC_ACT_GOTO_CHAIN) && > + !rcu_access_pointer(a->goto_chain)) { > + tcf_action_destroy_1(a, bind); > + NL_SET_ERR_MSG(extack, "can't use goto chain with NULL chain"); > + return ERR_PTR(-EINVAL); > + } I don't think calling tcf_action_destoy_1() is enough here. Since you moved this block before assigning cookie and releasing the module, you also need to release them manually in addition to destroying the action instance. > + > + if (err == ACT_P_CREATED) > + tcf_idr_insert(a); > + > if (!name && tb[TCA_ACT_COOKIE]) > tcf_set_action_cookie(&a->act_cookie, cookie); > > @@ -1002,13 +1011,6 @@ struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp, > if (err != ACT_P_CREATED) > module_put(a_o->owner); > > - if (TC_ACT_EXT_CMP(a->tcfa_action, TC_ACT_GOTO_CHAIN) && > - !rcu_access_pointer(a->goto_chain)) { > - tcf_action_destroy_1(a, bind); > - NL_SET_ERR_MSG(extack, "can't use goto chain with NULL chain"); > - return ERR_PTR(-EINVAL); > - } > - > return a; > > err_mod: [...]
On Fri, Sep 25, 2020 at 8:24 AM Vlad Buslov <vlad@buslov.dev> wrote: > > + if (TC_ACT_EXT_CMP(a->tcfa_action, TC_ACT_GOTO_CHAIN) && > > + !rcu_access_pointer(a->goto_chain)) { > > + tcf_action_destroy_1(a, bind); > > + NL_SET_ERR_MSG(extack, "can't use goto chain with NULL chain"); > > + return ERR_PTR(-EINVAL); > > + } > > I don't think calling tcf_action_destoy_1() is enough here. Since you > moved this block before assigning cookie and releasing the module, you > also need to release them manually in addition to destroying the action > instance. > tcf_action_destoy_1() eventually calls free_tcf() which frees cookie and tcf_action_destroy() which releases module refcnt. What am I missing here? Thanks.
On Fri 25 Sep 2020 at 22:22, Cong Wang <xiyou.wangcong@gmail.com> wrote: > On Fri, Sep 25, 2020 at 8:24 AM Vlad Buslov <vlad@buslov.dev> wrote: >> > + if (TC_ACT_EXT_CMP(a->tcfa_action, TC_ACT_GOTO_CHAIN) && >> > + !rcu_access_pointer(a->goto_chain)) { >> > + tcf_action_destroy_1(a, bind); >> > + NL_SET_ERR_MSG(extack, "can't use goto chain with NULL chain"); >> > + return ERR_PTR(-EINVAL); >> > + } >> >> I don't think calling tcf_action_destoy_1() is enough here. Since you >> moved this block before assigning cookie and releasing the module, you >> also need to release them manually in addition to destroying the action >> instance. >> > > tcf_action_destoy_1() eventually calls free_tcf() which frees cookie and > tcf_action_destroy() which releases module refcnt. > > What am I missing here? > > Thanks. The memory referenced by the function local pointer "cookie" hasn't been assigned yet to the a->act_cookie because in your patch you moved goto_chain validation code before the cookie change. That means that if user overwrites existing action, then action old a->act_cookie will be freed by tcf_action_destroy_1() but new cookie that was allocated by nla_memdup_cookie() will leak. Regards, Vlad
hello, On Fri, 2020-09-25 at 22:45 +0300, Vlad Buslov wrote: > On Fri 25 Sep 2020 at 22:22, Cong Wang <xiyou.wangcong@gmail.com> wrote: > > On Fri, Sep 25, 2020 at 8:24 AM Vlad Buslov <vlad@buslov.dev> wrote: > > > > + if (TC_ACT_EXT_CMP(a->tcfa_action, TC_ACT_GOTO_CHAIN) && > > > > + !rcu_access_pointer(a->goto_chain)) { > > > > + tcf_action_destroy_1(a, bind); > > > > + NL_SET_ERR_MSG(extack, "can't use goto chain with NULL chain"); > > > > + return ERR_PTR(-EINVAL); > > > > + } > > > > > > I don't think calling tcf_action_destoy_1() is enough here. Since you > > > moved this block before assigning cookie and releasing the module, you > > > also need to release them manually in addition to destroying the action > > > instance. > > > > > > > tcf_action_destoy_1() eventually calls free_tcf() which frees cookie and > > tcf_action_destroy() which releases module refcnt. > > > > What am I missing here? > > > > Thanks. > > The memory referenced by the function local pointer "cookie" hasn't been > assigned yet to the a->act_cookie because in your patch you moved > goto_chain validation code before the cookie change. That means that if > user overwrites existing action, then action old a->act_cookie will be > freed by tcf_action_destroy_1() but new cookie that was allocated by > nla_memdup_cookie() will leak. maybe we can just delete this if (TC_ACT_EXT_CMP(...)) { ... } statement, instead of moving it? Each TC action already does the check for NULL "goto chains" with a_o->init() -> tcf_action_check_ctrlact(), so this if () statement looks dead code to me _ I probably forgot to remove it after all actions were converted to validate the control action inside their .init() function.
On Mon, Sep 28, 2020 at 3:14 AM Davide Caratti <dcaratti@redhat.com> wrote: > > hello, > > On Fri, 2020-09-25 at 22:45 +0300, Vlad Buslov wrote: > > On Fri 25 Sep 2020 at 22:22, Cong Wang <xiyou.wangcong@gmail.com> wrote: > > > On Fri, Sep 25, 2020 at 8:24 AM Vlad Buslov <vlad@buslov.dev> wrote: > > > > > + if (TC_ACT_EXT_CMP(a->tcfa_action, TC_ACT_GOTO_CHAIN) && > > > > > + !rcu_access_pointer(a->goto_chain)) { > > > > > + tcf_action_destroy_1(a, bind); > > > > > + NL_SET_ERR_MSG(extack, "can't use goto chain with NULL chain"); > > > > > + return ERR_PTR(-EINVAL); > > > > > + } > > > > > > > > I don't think calling tcf_action_destoy_1() is enough here. Since you > > > > moved this block before assigning cookie and releasing the module, you > > > > also need to release them manually in addition to destroying the action > > > > instance. > > > > > > > > > > tcf_action_destoy_1() eventually calls free_tcf() which frees cookie and > > > tcf_action_destroy() which releases module refcnt. > > > > > > What am I missing here? > > > > > > Thanks. > > > > The memory referenced by the function local pointer "cookie" hasn't been > > assigned yet to the a->act_cookie because in your patch you moved > > goto_chain validation code before the cookie change. That means that if > > user overwrites existing action, then action old a->act_cookie will be > > freed by tcf_action_destroy_1() but new cookie that was allocated by > > nla_memdup_cookie() will leak. Yes, good catch! > > maybe we can just delete this if (TC_ACT_EXT_CMP(...)) { ... } > statement, instead of moving it? Each TC action already does the check > for NULL "goto chains" with a_o->init() -> tcf_action_check_ctrlact(), > so this if () statement looks dead code to me _ I probably forgot to > remove it after all actions were converted to validate the control > action inside their .init() function. Good point, I think you are right, I will send a patch to remove it. Thanks!