mbox series

[bpf-next,0/3] potential memleak and use after free in bpf verifier

Message ID 20210707043811.5349-1-hefengqing@huawei.com
Headers show
Series potential memleak and use after free in bpf verifier | expand

Message

He Fengqing July 7, 2021, 4:38 a.m. UTC
While reading the code of bpf verifier, I found these two issues.
Patch 1 move the bpf_prog_clone_free function into filter.h, so
we can use it in other file. Patch 2 fix memleak in an error
handling path in bpf_patch_insn_data function.
Patch 3 fix a use after free in bpf_check function.

He Fengqing (3):
  bpf: Move bpf_prog_clone_free into filter.h file
  bpf: Fix a memory leak in an error handling path in
    'bpf_patch_insn_data()'
  bpf: Fix a use after free in bpf_check()

 include/linux/filter.h | 17 ++++++++++++-
 kernel/bpf/core.c      | 27 +++++---------------
 kernel/bpf/verifier.c  | 58 ++++++++++++++++++++++++++++++++----------
 net/core/filter.c      |  2 +-
 4 files changed, 68 insertions(+), 36 deletions(-)

Comments

Song Liu July 7, 2021, 7:25 a.m. UTC | #1
On Tue, Jul 6, 2021 at 8:53 PM He Fengqing <hefengqing@huawei.com> wrote:
>
> In bpf_patch_insn_data, env->prog was input parameter of
> bpf_patch_insn_single function. bpf_patch_insn_single call
> bpf_prog_realloc to realloc ebpf prog. When we need to malloc new prog,
> bpf_prog_realloc will free the old prog, in this scenery is the
> env->prog.
> Then bpf_patch_insn_data function call adjust_insn_aux_data function, if
> adjust_insn_aux_data function return error, bpf_patch_insn_data will
> return NULL.
> In bpf_check->convert_ctx_accesses->bpf_patch_insn_data call chain, if
> bpf_patch_insn_data return NULL, env->prog has been freed in
> bpf_prog_realloc, then bpf_check will use the freed env->prog.

Besides "what is the bug", please also describe "how to fix it". For example,
add "Fix it by adding a free_old argument to bpf_prog_realloc(), and ...".
Also, for the subject of 0/3, it is better to say "fix potential
memory leak and ...".

>
> Signed-off-by: He Fengqing <hefengqing@huawei.com>
> ---
>  include/linux/filter.h |  2 +-
>  kernel/bpf/core.c      |  9 ++++---
>  kernel/bpf/verifier.c  | 53 ++++++++++++++++++++++++++++++++----------
>  net/core/filter.c      |  2 +-
>  4 files changed, 49 insertions(+), 17 deletions(-)
>
> diff --git a/include/linux/filter.h b/include/linux/filter.h
> index f39e008a377d..ec11a5ae92c2 100644
> --- a/include/linux/filter.h
> +++ b/include/linux/filter.h
> @@ -881,7 +881,7 @@ void bpf_prog_jit_attempt_done(struct bpf_prog *prog);
>  struct bpf_prog *bpf_prog_alloc(unsigned int size, gfp_t gfp_extra_flags);
>  struct bpf_prog *bpf_prog_alloc_no_stats(unsigned int size, gfp_t gfp_extra_flags);
>  struct bpf_prog *bpf_prog_realloc(struct bpf_prog *fp_old, unsigned int size,
> -                                 gfp_t gfp_extra_flags);
> +                                 gfp_t gfp_extra_flags, bool free_old);
>  void __bpf_prog_free(struct bpf_prog *fp);
>
>  static inline void bpf_prog_clone_free(struct bpf_prog *fp)
> diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
> index 49b0311f48c1..e5616bb1665b 100644
> --- a/kernel/bpf/core.c
> +++ b/kernel/bpf/core.c
> @@ -218,7 +218,7 @@ void bpf_prog_fill_jited_linfo(struct bpf_prog *prog,
>  }
>
>  struct bpf_prog *bpf_prog_realloc(struct bpf_prog *fp_old, unsigned int size,
> -                                 gfp_t gfp_extra_flags)
> +                                 gfp_t gfp_extra_flags, bool free_old)
>  {
>         gfp_t gfp_flags = GFP_KERNEL_ACCOUNT | __GFP_ZERO | gfp_extra_flags;
>         struct bpf_prog *fp;
> @@ -238,7 +238,8 @@ struct bpf_prog *bpf_prog_realloc(struct bpf_prog *fp_old, unsigned int size,
>                 /* We keep fp->aux from fp_old around in the new
>                  * reallocated structure.
>                  */
> -               bpf_prog_clone_free(fp_old);
> +               if (free_old)
> +                       bpf_prog_clone_free(fp_old);
>         }
>
>         return fp;
> @@ -456,7 +457,7 @@ struct bpf_prog *bpf_patch_insn_single(struct bpf_prog *prog, u32 off,
>          * last page could have large enough tailroom.
>          */
>         prog_adj = bpf_prog_realloc(prog, bpf_prog_size(insn_adj_cnt),
> -                                   GFP_USER);
> +                                   GFP_USER, false);
>         if (!prog_adj)
>                 return ERR_PTR(-ENOMEM);
>
> @@ -1150,6 +1151,8 @@ struct bpf_prog *bpf_jit_blind_constants(struct bpf_prog *prog)
>                         return tmp;
>                 }
>
> +               if (tmp != clone)
> +                       bpf_prog_clone_free(clone);
>                 clone = tmp;
>                 insn_delta = rewritten - 1;
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 41109f49b724..e75b933f69e4 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -11855,7 +11855,10 @@ static int opt_subreg_zext_lo32_rnd_hi32(struct bpf_verifier_env *env,
>                 new_prog = bpf_patch_insn_data(env, adj_idx, patch, patch_len);
>                 if (!new_prog)
>                         return -ENOMEM;
> -               env->prog = new_prog;
> +               if (new_prog != env->prog) {
> +                       bpf_prog_clone_free(env->prog);
> +                       env->prog = new_prog;
> +               }

Can we move this check into bpf_patch_insn_data()?

>                 insns = new_prog->insnsi;
>                 aux = env->insn_aux_data;
>                 delta += patch_len - 1;
[...]

> diff --git a/net/core/filter.c b/net/core/filter.c
> index d70187ce851b..8a8d1a3ba5c2 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -1268,7 +1268,7 @@ static struct bpf_prog *bpf_migrate_filter(struct bpf_prog *fp)
>
>         /* Expand fp for appending the new filter representation. */
>         old_fp = fp;
> -       fp = bpf_prog_realloc(old_fp, bpf_prog_size(new_len), 0);
> +       fp = bpf_prog_realloc(old_fp, bpf_prog_size(new_len), 0, true);

Can we add some logic here and not add free_old to bpf_prog_realloc()?

Thanks,
Song
He Fengqing July 8, 2021, 3 a.m. UTC | #2
在 2021/7/7 15:25, Song Liu 写道:
> On Tue, Jul 6, 2021 at 8:53 PM He Fengqing <hefengqing@huawei.com> wrote:

>>

>> In bpf_patch_insn_data, env->prog was input parameter of

>> bpf_patch_insn_single function. bpf_patch_insn_single call

>> bpf_prog_realloc to realloc ebpf prog. When we need to malloc new prog,

>> bpf_prog_realloc will free the old prog, in this scenery is the

>> env->prog.

>> Then bpf_patch_insn_data function call adjust_insn_aux_data function, if

>> adjust_insn_aux_data function return error, bpf_patch_insn_data will

>> return NULL.

>> In bpf_check->convert_ctx_accesses->bpf_patch_insn_data call chain, if

>> bpf_patch_insn_data return NULL, env->prog has been freed in

>> bpf_prog_realloc, then bpf_check will use the freed env->prog.

> 

> Besides "what is the bug", please also describe "how to fix it". For example,

> add "Fix it by adding a free_old argument to bpf_prog_realloc(), and ...".

> Also, for the subject of 0/3, it is better to say "fix potential

> memory leak and ...".


Thanks for your suggestion.

> 

>>

>> Signed-off-by: He Fengqing <hefengqing@huawei.com>

>> ---

>>   include/linux/filter.h |  2 +-

>>   kernel/bpf/core.c      |  9 ++++---

>>   kernel/bpf/verifier.c  | 53 ++++++++++++++++++++++++++++++++----------

>>   net/core/filter.c      |  2 +-

>>   4 files changed, 49 insertions(+), 17 deletions(-)

>>

>> diff --git a/include/linux/filter.h b/include/linux/filter.h

>> index f39e008a377d..ec11a5ae92c2 100644

>> --- a/include/linux/filter.h

>> +++ b/include/linux/filter.h

>> @@ -881,7 +881,7 @@ void bpf_prog_jit_attempt_done(struct bpf_prog *prog);

>>   struct bpf_prog *bpf_prog_alloc(unsigned int size, gfp_t gfp_extra_flags);

>>   struct bpf_prog *bpf_prog_alloc_no_stats(unsigned int size, gfp_t gfp_extra_flags);

>>   struct bpf_prog *bpf_prog_realloc(struct bpf_prog *fp_old, unsigned int size,

>> -                                 gfp_t gfp_extra_flags);

>> +                                 gfp_t gfp_extra_flags, bool free_old);

>>   void __bpf_prog_free(struct bpf_prog *fp);

>>

>>   static inline void bpf_prog_clone_free(struct bpf_prog *fp)

>> diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c

>> index 49b0311f48c1..e5616bb1665b 100644

>> --- a/kernel/bpf/core.c

>> +++ b/kernel/bpf/core.c

>> @@ -218,7 +218,7 @@ void bpf_prog_fill_jited_linfo(struct bpf_prog *prog,

>>   }

>>

>>   struct bpf_prog *bpf_prog_realloc(struct bpf_prog *fp_old, unsigned int size,

>> -                                 gfp_t gfp_extra_flags)

>> +                                 gfp_t gfp_extra_flags, bool free_old)

>>   {

>>          gfp_t gfp_flags = GFP_KERNEL_ACCOUNT | __GFP_ZERO | gfp_extra_flags;

>>          struct bpf_prog *fp;

>> @@ -238,7 +238,8 @@ struct bpf_prog *bpf_prog_realloc(struct bpf_prog *fp_old, unsigned int size,

>>                  /* We keep fp->aux from fp_old around in the new

>>                   * reallocated structure.

>>                   */

>> -               bpf_prog_clone_free(fp_old);

>> +               if (free_old)

>> +                       bpf_prog_clone_free(fp_old);

>>          }

>>

>>          return fp;

>> @@ -456,7 +457,7 @@ struct bpf_prog *bpf_patch_insn_single(struct bpf_prog *prog, u32 off,

>>           * last page could have large enough tailroom.

>>           */

>>          prog_adj = bpf_prog_realloc(prog, bpf_prog_size(insn_adj_cnt),

>> -                                   GFP_USER);

>> +                                   GFP_USER, false);

>>          if (!prog_adj)

>>                  return ERR_PTR(-ENOMEM);

>>

>> @@ -1150,6 +1151,8 @@ struct bpf_prog *bpf_jit_blind_constants(struct bpf_prog *prog)

>>                          return tmp;

>>                  }

>>

>> +               if (tmp != clone)

>> +                       bpf_prog_clone_free(clone);

>>                  clone = tmp;

>>                  insn_delta = rewritten - 1;

>>

>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c

>> index 41109f49b724..e75b933f69e4 100644

>> --- a/kernel/bpf/verifier.c

>> +++ b/kernel/bpf/verifier.c

>> @@ -11855,7 +11855,10 @@ static int opt_subreg_zext_lo32_rnd_hi32(struct bpf_verifier_env *env,

>>                  new_prog = bpf_patch_insn_data(env, adj_idx, patch, patch_len);

>>                  if (!new_prog)

>>                          return -ENOMEM;

>> -               env->prog = new_prog;

>> +               if (new_prog != env->prog) {

>> +                       bpf_prog_clone_free(env->prog);

>> +                       env->prog = new_prog;

>> +               }

> 

> Can we move this check into bpf_patch_insn_data()?


Ok, I will change this in next version.

> 

>>                  insns = new_prog->insnsi;

>>                  aux = env->insn_aux_data;

>>                  delta += patch_len - 1;

> [...]

> 

>> diff --git a/net/core/filter.c b/net/core/filter.c

>> index d70187ce851b..8a8d1a3ba5c2 100644

>> --- a/net/core/filter.c

>> +++ b/net/core/filter.c

>> @@ -1268,7 +1268,7 @@ static struct bpf_prog *bpf_migrate_filter(struct bpf_prog *fp)

>>

>>          /* Expand fp for appending the new filter representation. */

>>          old_fp = fp;

>> -       fp = bpf_prog_realloc(old_fp, bpf_prog_size(new_len), 0);

>> +       fp = bpf_prog_realloc(old_fp, bpf_prog_size(new_len), 0, true);

> 

> Can we add some logic here and not add free_old to bpf_prog_realloc()?


Ok, maybe we can free old_fp here, never in bpf_prog_realloc.


> 

> Thanks,

> Song

> .

>
Alexei Starovoitov July 8, 2021, 3:09 a.m. UTC | #3
On Wed, Jul 7, 2021 at 8:00 PM He Fengqing <hefengqing@huawei.com> wrote:
>

> Ok, I will change this in next version.


before you spam the list with the next version
please explain why any of these changes are needed?
I don't see an explanation in the patches and I don't see a bug in the code.
Did you check what is the prog clone ?
When is it constructed? Why verifier has anything to do with it?
He Fengqing July 9, 2021, 11:11 a.m. UTC | #4
在 2021/7/8 11:09, Alexei Starovoitov 写道:
> On Wed, Jul 7, 2021 at 8:00 PM He Fengqing <hefengqing@huawei.com> wrote:

>>

>> Ok, I will change this in next version.

> 

> before you spam the list with the next version

> please explain why any of these changes are needed?

> I don't see an explanation in the patches and I don't see a bug in the code.

> Did you check what is the prog clone ?

> When is it constructed? Why verifier has anything to do with it?

> .

> 



I'm sorry, I didn't describe these errors clearly.

bpf_check(bpf_verifier_env)
     |
     |->do_misc_fixups(env)
     |    |
     |    |->bpf_patch_insn_data(env)
     |    |    |
     |    |    |->bpf_patch_insn_single(env->prog)
     |    |    |    |
     |    |    |    |->bpf_prog_realloc(env->prog)
     |    |    |    |    |
     |    |    |    |    |->construct new_prog
     |    |    |    |    |    free old_prog(env->prog)
     |    |    |    |    |
     |    |    |    |    |->return new_prog;
     |    |    |    |
     |    |    |    |->return new_prog;
     |    |    |
     |    |    |->adjust_insn_aux_data
     |    |    |    |
     |    |    |    |->return ENOMEM;
     |    |    |
     |    |    |->return NULL;
     |    |
     |    |->return ENOMEM;

bpf_verifier_env->prog had been freed in bpf_prog_realloc function.


There are two errors here, the first is memleak in the 
bpf_patch_insn_data function, and the second is use after free in the 
bpf_check function.

memleak in bpf_patch_insn_data:

Look at the call chain above, if adjust_insn_aux_data function return 
ENOMEM, bpf_patch_insn_data will return NULL, but we do not free the 
new_prog.

So in the patch 2, before bpf_patch_insn_data return NULL, we free the 
new_prog.

use after free in bpf_check:

If bpf_patch_insn_data function return NULL, we will not assign new_prog 
to the bpf_verifier_env->prog, but bpf_verifier_env->prog has been freed 
in the bpf_prog_realloc function. Then in bpf_check function, we will 
use bpf_verifier_env->prog after do_misc_fixups function.

In the patch 3, I added a free_old parameter to bpf_prog_realloc, in 
this scenario we don't free old_prog. Instead, we free it in the 
do_misc_fixups function when bpf_patch_insn_data return a valid new_prog.

Thanks for your reviews.
Alexei Starovoitov July 9, 2021, 3:12 p.m. UTC | #5
On Fri, Jul 9, 2021 at 4:11 AM He Fengqing <hefengqing@huawei.com> wrote:
>

>

>

> 在 2021/7/8 11:09, Alexei Starovoitov 写道:

> > On Wed, Jul 7, 2021 at 8:00 PM He Fengqing <hefengqing@huawei.com> wrote:

> >>

> >> Ok, I will change this in next version.

> >

> > before you spam the list with the next version

> > please explain why any of these changes are needed?

> > I don't see an explanation in the patches and I don't see a bug in the code.

> > Did you check what is the prog clone ?

> > When is it constructed? Why verifier has anything to do with it?

> > .

> >

>

>

> I'm sorry, I didn't describe these errors clearly.

>

> bpf_check(bpf_verifier_env)

>      |

>      |->do_misc_fixups(env)

>      |    |

>      |    |->bpf_patch_insn_data(env)

>      |    |    |

>      |    |    |->bpf_patch_insn_single(env->prog)

>      |    |    |    |

>      |    |    |    |->bpf_prog_realloc(env->prog)

>      |    |    |    |    |

>      |    |    |    |    |->construct new_prog

>      |    |    |    |    |    free old_prog(env->prog)

>      |    |    |    |    |

>      |    |    |    |    |->return new_prog;

>      |    |    |    |

>      |    |    |    |->return new_prog;

>      |    |    |

>      |    |    |->adjust_insn_aux_data

>      |    |    |    |

>      |    |    |    |->return ENOMEM;

>      |    |    |

>      |    |    |->return NULL;

>      |    |

>      |    |->return ENOMEM;

>

> bpf_verifier_env->prog had been freed in bpf_prog_realloc function.

>

>

> There are two errors here, the first is memleak in the

> bpf_patch_insn_data function, and the second is use after free in the

> bpf_check function.

>

> memleak in bpf_patch_insn_data:

>

> Look at the call chain above, if adjust_insn_aux_data function return

> ENOMEM, bpf_patch_insn_data will return NULL, but we do not free the

> new_prog.

>

> So in the patch 2, before bpf_patch_insn_data return NULL, we free the

> new_prog.

>

> use after free in bpf_check:

>

> If bpf_patch_insn_data function return NULL, we will not assign new_prog

> to the bpf_verifier_env->prog, but bpf_verifier_env->prog has been freed

> in the bpf_prog_realloc function. Then in bpf_check function, we will

> use bpf_verifier_env->prog after do_misc_fixups function.

>

> In the patch 3, I added a free_old parameter to bpf_prog_realloc, in

> this scenario we don't free old_prog. Instead, we free it in the

> do_misc_fixups function when bpf_patch_insn_data return a valid new_prog.


Thanks for explaining.
Why not to make adjust_insn_aux_data() in bpf_patch_insn_data() first then?
Just changing the order will resolve both issues, no?
He Fengqing July 12, 2021, 2:17 a.m. UTC | #6
在 2021/7/9 23:12, Alexei Starovoitov 写道:
> On Fri, Jul 9, 2021 at 4:11 AM He Fengqing <hefengqing@huawei.com> wrote:

>>

>>

>>

>> 在 2021/7/8 11:09, Alexei Starovoitov 写道:

>>> On Wed, Jul 7, 2021 at 8:00 PM He Fengqing <hefengqing@huawei.com> wrote:

>>>>

>>>> Ok, I will change this in next version.

>>>

>>> before you spam the list with the next version

>>> please explain why any of these changes are needed?

>>> I don't see an explanation in the patches and I don't see a bug in the code.

>>> Did you check what is the prog clone ?

>>> When is it constructed? Why verifier has anything to do with it?

>>> .

>>>

>>

>>

>> I'm sorry, I didn't describe these errors clearly.

>>

>> bpf_check(bpf_verifier_env)

>>       |

>>       |->do_misc_fixups(env)

>>       |    |

>>       |    |->bpf_patch_insn_data(env)

>>       |    |    |

>>       |    |    |->bpf_patch_insn_single(env->prog)

>>       |    |    |    |

>>       |    |    |    |->bpf_prog_realloc(env->prog)

>>       |    |    |    |    |

>>       |    |    |    |    |->construct new_prog

>>       |    |    |    |    |    free old_prog(env->prog)

>>       |    |    |    |    |

>>       |    |    |    |    |->return new_prog;

>>       |    |    |    |

>>       |    |    |    |->return new_prog;

>>       |    |    |

>>       |    |    |->adjust_insn_aux_data

>>       |    |    |    |

>>       |    |    |    |->return ENOMEM;

>>       |    |    |

>>       |    |    |->return NULL;

>>       |    |

>>       |    |->return ENOMEM;

>>

>> bpf_verifier_env->prog had been freed in bpf_prog_realloc function.

>>

>>

>> There are two errors here, the first is memleak in the

>> bpf_patch_insn_data function, and the second is use after free in the

>> bpf_check function.

>>

>> memleak in bpf_patch_insn_data:

>>

>> Look at the call chain above, if adjust_insn_aux_data function return

>> ENOMEM, bpf_patch_insn_data will return NULL, but we do not free the

>> new_prog.

>>

>> So in the patch 2, before bpf_patch_insn_data return NULL, we free the

>> new_prog.

>>

>> use after free in bpf_check:

>>

>> If bpf_patch_insn_data function return NULL, we will not assign new_prog

>> to the bpf_verifier_env->prog, but bpf_verifier_env->prog has been freed

>> in the bpf_prog_realloc function. Then in bpf_check function, we will

>> use bpf_verifier_env->prog after do_misc_fixups function.

>>

>> In the patch 3, I added a free_old parameter to bpf_prog_realloc, in

>> this scenario we don't free old_prog. Instead, we free it in the

>> do_misc_fixups function when bpf_patch_insn_data return a valid new_prog.

> 

> Thanks for explaining.

> Why not to make adjust_insn_aux_data() in bpf_patch_insn_data() first then?

> Just changing the order will resolve both issues, no?

> .

> 

adjust_insn_aux_data() need the new constructed new_prog as an input 
parameter, so we must call bpf_patch_insn_single() before 
adjust_insn_aux_data().

But we can make adjust_insn_aux_data() never return ENOMEM. In 
bpf_patch_insn_data(), first we pre-malloc memory for new aux_data, then 
call bpf_patch_insn_single() to constructed the new_prog, at last call 
adjust_insn_aux_data() functin. In this way, adjust_insn_aux_data() 
never fails.

bpf_patch_insn_data(env) {
	struct bpf_insn_aux_data *new_data = vzalloc();
	struct bpf_prog *new_prog;
	if (new_data == NULL)
		return NULL;

	new_prog = bpf_patch_insn_single(env->prog);
	if (new_prog == NULL) {
		vfree(new_data);
		return NULL;
	}

	adjust_insn_aux_data(new_prog, new_data);
	return new_prog;
}
What do you think about it?
Alexei Starovoitov July 13, 2021, 11:17 p.m. UTC | #7
On Sun, Jul 11, 2021 at 7:17 PM He Fengqing <hefengqing@huawei.com> wrote:
>

>

>

> 在 2021/7/9 23:12, Alexei Starovoitov 写道:

> > On Fri, Jul 9, 2021 at 4:11 AM He Fengqing <hefengqing@huawei.com> wrote:

> >>

> >>

> >>

> >> 在 2021/7/8 11:09, Alexei Starovoitov 写道:

> >>> On Wed, Jul 7, 2021 at 8:00 PM He Fengqing <hefengqing@huawei.com> wrote:

> >>>>

> >>>> Ok, I will change this in next version.

> >>>

> >>> before you spam the list with the next version

> >>> please explain why any of these changes are needed?

> >>> I don't see an explanation in the patches and I don't see a bug in the code.

> >>> Did you check what is the prog clone ?

> >>> When is it constructed? Why verifier has anything to do with it?

> >>> .

> >>>

> >>

> >>

> >> I'm sorry, I didn't describe these errors clearly.

> >>

> >> bpf_check(bpf_verifier_env)

> >>       |

> >>       |->do_misc_fixups(env)

> >>       |    |

> >>       |    |->bpf_patch_insn_data(env)

> >>       |    |    |

> >>       |    |    |->bpf_patch_insn_single(env->prog)

> >>       |    |    |    |

> >>       |    |    |    |->bpf_prog_realloc(env->prog)

> >>       |    |    |    |    |

> >>       |    |    |    |    |->construct new_prog

> >>       |    |    |    |    |    free old_prog(env->prog)

> >>       |    |    |    |    |

> >>       |    |    |    |    |->return new_prog;

> >>       |    |    |    |

> >>       |    |    |    |->return new_prog;

> >>       |    |    |

> >>       |    |    |->adjust_insn_aux_data

> >>       |    |    |    |

> >>       |    |    |    |->return ENOMEM;

> >>       |    |    |

> >>       |    |    |->return NULL;

> >>       |    |

> >>       |    |->return ENOMEM;

> >>

> >> bpf_verifier_env->prog had been freed in bpf_prog_realloc function.

> >>

> >>

> >> There are two errors here, the first is memleak in the

> >> bpf_patch_insn_data function, and the second is use after free in the

> >> bpf_check function.

> >>

> >> memleak in bpf_patch_insn_data:

> >>

> >> Look at the call chain above, if adjust_insn_aux_data function return

> >> ENOMEM, bpf_patch_insn_data will return NULL, but we do not free the

> >> new_prog.

> >>

> >> So in the patch 2, before bpf_patch_insn_data return NULL, we free the

> >> new_prog.

> >>

> >> use after free in bpf_check:

> >>

> >> If bpf_patch_insn_data function return NULL, we will not assign new_prog

> >> to the bpf_verifier_env->prog, but bpf_verifier_env->prog has been freed

> >> in the bpf_prog_realloc function. Then in bpf_check function, we will

> >> use bpf_verifier_env->prog after do_misc_fixups function.

> >>

> >> In the patch 3, I added a free_old parameter to bpf_prog_realloc, in

> >> this scenario we don't free old_prog. Instead, we free it in the

> >> do_misc_fixups function when bpf_patch_insn_data return a valid new_prog.

> >

> > Thanks for explaining.

> > Why not to make adjust_insn_aux_data() in bpf_patch_insn_data() first then?

> > Just changing the order will resolve both issues, no?

> > .

> >

> adjust_insn_aux_data() need the new constructed new_prog as an input

> parameter, so we must call bpf_patch_insn_single() before

> adjust_insn_aux_data().


Right. I forgot about insn_has_def32() logic and
commit b325fbca4b13 ("bpf: verifier: mark patched-insn with
sub-register zext flag")
that added that extra parameter.

> But we can make adjust_insn_aux_data() never return ENOMEM. In

> bpf_patch_insn_data(), first we pre-malloc memory for new aux_data, then

> call bpf_patch_insn_single() to constructed the new_prog, at last call

> adjust_insn_aux_data() functin. In this way, adjust_insn_aux_data()

> never fails.

>

> bpf_patch_insn_data(env) {

>         struct bpf_insn_aux_data *new_data = vzalloc();

>         struct bpf_prog *new_prog;

>         if (new_data == NULL)

>                 return NULL;

>

>         new_prog = bpf_patch_insn_single(env->prog);

>         if (new_prog == NULL) {

>                 vfree(new_data);

>                 return NULL;

>         }

>

>         adjust_insn_aux_data(new_prog, new_data);

>         return new_prog;

> }

> What do you think about it?


That's a good idea. Let's do that. The new size for vzalloc is easy to compute.
What should be the commit in the Fixes tag?
commit 8041902dae52 ("bpf: adjust insn_aux_data when patching insns")
right?
4 year old bug then.
I wonder why syzbot with malloc error injection didn't catch it sooner.
He Fengqing July 14, 2021, 1:53 a.m. UTC | #8
在 2021/7/14 7:17, Alexei Starovoitov 写道:
> On Sun, Jul 11, 2021 at 7:17 PM He Fengqing <hefengqing@huawei.com> wrote:

>>

>>

>>

>> 在 2021/7/9 23:12, Alexei Starovoitov 写道:

>>> On Fri, Jul 9, 2021 at 4:11 AM He Fengqing <hefengqing@huawei.com> wrote:

>>>>

>>>>

>>>>

>>>> 在 2021/7/8 11:09, Alexei Starovoitov 写道:

>>>>> On Wed, Jul 7, 2021 at 8:00 PM He Fengqing <hefengqing@huawei.com> wrote:

>>>>>>

>>>>>> Ok, I will change this in next version.

>>>>>

>>>>> before you spam the list with the next version

>>>>> please explain why any of these changes are needed?

>>>>> I don't see an explanation in the patches and I don't see a bug in the code.

>>>>> Did you check what is the prog clone ?

>>>>> When is it constructed? Why verifier has anything to do with it?

>>>>> .

>>>>>

>>>>

>>>>

>>>> I'm sorry, I didn't describe these errors clearly.

>>>>

>>>> bpf_check(bpf_verifier_env)

>>>>        |

>>>>        |->do_misc_fixups(env)

>>>>        |    |

>>>>        |    |->bpf_patch_insn_data(env)

>>>>        |    |    |

>>>>        |    |    |->bpf_patch_insn_single(env->prog)

>>>>        |    |    |    |

>>>>        |    |    |    |->bpf_prog_realloc(env->prog)

>>>>        |    |    |    |    |

>>>>        |    |    |    |    |->construct new_prog

>>>>        |    |    |    |    |    free old_prog(env->prog)

>>>>        |    |    |    |    |

>>>>        |    |    |    |    |->return new_prog;

>>>>        |    |    |    |

>>>>        |    |    |    |->return new_prog;

>>>>        |    |    |

>>>>        |    |    |->adjust_insn_aux_data

>>>>        |    |    |    |

>>>>        |    |    |    |->return ENOMEM;

>>>>        |    |    |

>>>>        |    |    |->return NULL;

>>>>        |    |

>>>>        |    |->return ENOMEM;

>>>>

>>>> bpf_verifier_env->prog had been freed in bpf_prog_realloc function.

>>>>

>>>>

>>>> There are two errors here, the first is memleak in the

>>>> bpf_patch_insn_data function, and the second is use after free in the

>>>> bpf_check function.

>>>>

>>>> memleak in bpf_patch_insn_data:

>>>>

>>>> Look at the call chain above, if adjust_insn_aux_data function return

>>>> ENOMEM, bpf_patch_insn_data will return NULL, but we do not free the

>>>> new_prog.

>>>>

>>>> So in the patch 2, before bpf_patch_insn_data return NULL, we free the

>>>> new_prog.

>>>>

>>>> use after free in bpf_check:

>>>>

>>>> If bpf_patch_insn_data function return NULL, we will not assign new_prog

>>>> to the bpf_verifier_env->prog, but bpf_verifier_env->prog has been freed

>>>> in the bpf_prog_realloc function. Then in bpf_check function, we will

>>>> use bpf_verifier_env->prog after do_misc_fixups function.

>>>>

>>>> In the patch 3, I added a free_old parameter to bpf_prog_realloc, in

>>>> this scenario we don't free old_prog. Instead, we free it in the

>>>> do_misc_fixups function when bpf_patch_insn_data return a valid new_prog.

>>>

>>> Thanks for explaining.

>>> Why not to make adjust_insn_aux_data() in bpf_patch_insn_data() first then?

>>> Just changing the order will resolve both issues, no?

>>> .

>>>

>> adjust_insn_aux_data() need the new constructed new_prog as an input

>> parameter, so we must call bpf_patch_insn_single() before

>> adjust_insn_aux_data().

> 

> Right. I forgot about insn_has_def32() logic and

> commit b325fbca4b13 ("bpf: verifier: mark patched-insn with

> sub-register zext flag")

> that added that extra parameter.

> 

>> But we can make adjust_insn_aux_data() never return ENOMEM. In

>> bpf_patch_insn_data(), first we pre-malloc memory for new aux_data, then

>> call bpf_patch_insn_single() to constructed the new_prog, at last call

>> adjust_insn_aux_data() functin. In this way, adjust_insn_aux_data()

>> never fails.

>>

>> bpf_patch_insn_data(env) {

>>          struct bpf_insn_aux_data *new_data = vzalloc();

>>          struct bpf_prog *new_prog;

>>          if (new_data == NULL)

>>                  return NULL;

>>

>>          new_prog = bpf_patch_insn_single(env->prog);

>>          if (new_prog == NULL) {

>>                  vfree(new_data);

>>                  return NULL;

>>          }

>>

>>          adjust_insn_aux_data(new_prog, new_data);

>>          return new_prog;

>> }

>> What do you think about it?

> 

> That's a good idea. Let's do that. The new size for vzalloc is easy to compute.

> What should be the commit in the Fixes tag?

> commit 8041902dae52 ("bpf: adjust insn_aux_data when patching insns")

> right?


Ok, I will add this in the commit message.

> 4 year old bug then.

> I wonder why syzbot with malloc error injection didn't catch it sooner.

> .

>