Message ID | 20220329104705.65256-3-ammarfaizi2@gnuweeb.org |
---|---|
State | New |
Series | Two x86 fixes |
On Tue, Mar 29 2022 at 17:47, Ammar Faizi wrote:
> In mce_threshold_create_device(), if threshold_create_bank() fails, the
> @bp will be leaked, because the call to mce_threshold_remove_device()
> will not free the @bp. mce_threshold_remove_device() frees
> @threshold_banks. At that point, the @bp has not been written to
> @threshold_banks, @threshold_banks is NULL, so the call is just a nop.
>
> Fix this by extracting the cleanup part into a new static function
> __threshold_remove_device(), then call it from create/remove device
> functions.

The way simpler fix is to move

> 	}
> 	this_cpu_write(threshold_banks, bp);

before the loop. That's safe because the banks cannot yet be reached via
an MCE as the vector is not yet enabled:

> 	if (thresholding_irq_en)
> 		mce_threshold_vector = amd_threshold_interrupt;

Thanks,

        tglx

On 4/4/22 12:03 AM, Thomas Gleixner wrote:
> On Tue, Mar 29 2022 at 17:47, Ammar Faizi wrote:
>
>> In mce_threshold_create_device(), if threshold_create_bank() fails, the
>> @bp will be leaked, because the call to mce_threshold_remove_device()
>> will not free the @bp. mce_threshold_remove_device() frees
>> @threshold_banks. At that point, the @bp has not been written to
>> @threshold_banks, @threshold_banks is NULL, so the call is just a nop.
>>
>> Fix this by extracting the cleanup part into a new static function
>> __threshold_remove_device(), then call it from create/remove device
>> functions.
>
> The way simpler fix is to move
>
>> 	}
>> 	this_cpu_write(threshold_banks, bp);
>
> before the loop. That's safe because the banks cannot yet be reached via
> an MCE as the vector is not yet enabled:
>
>> 	if (thresholding_irq_en)
>> 		mce_threshold_vector = amd_threshold_interrupt;

Thomas,

I did like what you said (in the patch v4), but after Yazen and Borislav
reviewed it, we got a conclusion that it's not safe.

See [1] and [2] for the full message.

[1]: https://lore.kernel.org/lkml/YkFsQhpGGXIFTMyp@zn.tnic/
[2]: https://lore.kernel.org/lkml/Yh+oyD%2F5M3TW5ZMM@yaz-ubuntu/

Yazen, Borislav, please take a deeper look on this again. I will send
a v7 revision to really make it simpler by moving that "per-CPU var write"
before the loop.

Thanks!

On 4/4/22 12:43 AM, Ammar Faizi wrote:
> On 4/4/22 12:03 AM, Thomas Gleixner wrote:
>> On Tue, Mar 29 2022 at 17:47, Ammar Faizi wrote:
>>
>>> In mce_threshold_create_device(), if threshold_create_bank() fails, the
>>> @bp will be leaked, because the call to mce_threshold_remove_device()
>>> will not free the @bp. mce_threshold_remove_device() frees
>>> @threshold_banks. At that point, the @bp has not been written to
>>> @threshold_banks, @threshold_banks is NULL, so the call is just a nop.
>>>
>>> Fix this by extracting the cleanup part into a new static function
>>> __threshold_remove_device(), then call it from create/remove device
>>> functions.
>>
>> The way simpler fix is to move
>>
>>> 	}
>>> 	this_cpu_write(threshold_banks, bp);
>>
>> before the loop. That's safe because the banks cannot yet be reached via
>> an MCE as the vector is not yet enabled:
>>> 	if (thresholding_irq_en)
>>> 		mce_threshold_vector = amd_threshold_interrupt;
> Thomas,
>
> I did like what you said (in the patch v4), but after Yazen and Borislav
> reviewed it, we got a conclusion that it's not safe.
>
> See [1] and [2] for the full message.
>
> [1]: https://lore.kernel.org/lkml/YkFsQhpGGXIFTMyp@zn.tnic/
> [2]: https://lore.kernel.org/lkml/Yh+oyD%2F5M3TW5ZMM@yaz-ubuntu/
>
> Yazen, Borislav, please take a deeper look on this again. I will send
> a v7 revision to really make it simpler by moving that "per-CPU var write"
> before the loop.

(only if it's really safe)

On Mon, Apr 04, 2022 at 12:45:04AM +0700, Ammar Faizi wrote:
> > Yazen, Borislav, please take a deeper look on this again. I will send
> > a v7 revision to really make it simpler by moving that "per-CPU var write"
> > before the loop.
>
> (only if it's really safe)

Don't bother - I've discussed it with tglx offlist. The current solution
remains.

diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 1940d305db1c..d293ae088d6b 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -1294,10 +1294,23 @@ static void threshold_remove_bank(struct threshold_bank *bank)
 	kfree(bank);
 }
 
+static void __threshold_remove_device(struct threshold_bank **bp,
+				      unsigned int numbanks)
+{
+	unsigned int bank;
+
+	for (bank = 0; bank < numbanks; bank++) {
+		if (bp[bank]) {
+			threshold_remove_bank(bp[bank]);
+			bp[bank] = NULL;
+		}
+	}
+	kfree(bp);
+}
+
 int mce_threshold_remove_device(unsigned int cpu)
 {
 	struct threshold_bank **bp = this_cpu_read(threshold_banks);
-	unsigned int bank, numbanks = this_cpu_read(mce_num_banks);
 
 	if (!bp)
 		return 0;
@@ -1308,13 +1321,7 @@ int mce_threshold_remove_device(unsigned int cpu)
 	 */
 	this_cpu_write(threshold_banks, NULL);
 
-	for (bank = 0; bank < numbanks; bank++) {
-		if (bp[bank]) {
-			threshold_remove_bank(bp[bank]);
-			bp[bank] = NULL;
-		}
-	}
-	kfree(bp);
+	__threshold_remove_device(bp, this_cpu_read(mce_num_banks));
 	return 0;
 }
 
@@ -1351,15 +1358,14 @@ int mce_threshold_create_device(unsigned int cpu)
 		if (!(this_cpu_read(bank_map) & (1 << bank)))
 			continue;
 		err = threshold_create_bank(bp, cpu, bank);
-		if (err)
-			goto out_err;
+		if (err) {
+			__threshold_remove_device(bp, numbanks);
+			return err;
+		}
 	}
 	this_cpu_write(threshold_banks, bp);
 
 	if (thresholding_irq_en)
 		mce_threshold_vector = amd_threshold_interrupt;
 	return 0;
-out_err:
-	mce_threshold_remove_device(cpu);
-	return err;
 }
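
For readers who want to see the leak outside the kernel, here is a minimal userspace sketch of the pattern the patch changes. It is not the kernel code: `published_banks` stands in for the per-CPU `threshold_banks` pointer, and `model_alloc()`, `model_free()`, `create_bank()`, `remove_banks()`, `fail_at_bank`, `live_allocs` and `NUM_BANKS` are hypothetical names made up for this illustration. The `fixed` flag selects between the old error path (call the remove function, which only sees the published pointer) and the new one (hand the still-unpublished array to the shared cleanup helper).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define NUM_BANKS 4			/* hypothetical bank count for this model */

struct bank { int id; };

static struct bank **published_banks;	/* models the per-CPU threshold_banks */
static int live_allocs;			/* outstanding allocations (leak counter) */
static int fail_at_bank = -1;		/* bank index that fails; -1 means none */

static void *model_alloc(size_t n, size_t size)
{
	void *p = calloc(n, size);

	if (p)
		live_allocs++;
	return p;
}

static void model_free(void *p)
{
	if (p) {
		live_allocs--;
		free(p);
	}
}

static int create_bank(struct bank **bp, unsigned int bank)
{
	if ((int)bank == fail_at_bank)
		return -1;		/* injected failure */
	bp[bank] = model_alloc(1, sizeof(*bp[bank]));
	if (!bp[bank])
		return -1;
	bp[bank]->id = (int)bank;
	return 0;
}

/* Shared cleanup, in the role of __threshold_remove_device(). */
static void remove_banks(struct bank **bp, unsigned int numbanks)
{
	assert(bp != NULL);
	for (unsigned int bank = 0; bank < numbanks; bank++) {
		model_free(bp[bank]);
		bp[bank] = NULL;
	}
	model_free(bp);
}

/* In the role of mce_threshold_remove_device(): sees only the published pointer. */
static int remove_device(void)
{
	struct bank **bp = published_banks;

	if (!bp)
		return 0;
	published_banks = NULL;
	remove_banks(bp, NUM_BANKS);
	return 0;
}

static int create_device(bool fixed)
{
	struct bank **bp = model_alloc(NUM_BANKS, sizeof(*bp));

	if (!bp)
		return -1;
	for (unsigned int bank = 0; bank < NUM_BANKS; bank++) {
		if (create_bank(bp, bank)) {
			if (fixed)
				remove_banks(bp, NUM_BANKS);	/* frees the unpublished array */
			else
				remove_device();	/* nop: nothing published yet */
			return -1;
		}
	}
	published_banks = bp;		/* publish only after full success */
	return 0;
}
```

With `fail_at_bank` set to a valid index, `create_device(false)` returns an error and leaves `live_allocs` above zero, while `create_device(true)` returns the same error with `live_allocs` back at zero. The model says nothing about interrupt safety, which is the reason the thread gives for not publishing the pointer before the loop.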