Message ID | 20200605132130.1411255-1-daniel.thompson@linaro.org |
---|---|
Headers | show |
Series | kgdb: Honour the kprobe blacklist when setting breakpoints | expand |
On Fri, Jun 05, 2020 at 02:21:26PM +0100, Daniel Thompson wrote: > kgdb has traditionally adopted a no safety rails approach to breakpoint > placement. If the debugger is commanded to place a breakpoint at an > address then it will do so even if that breakpoint results in kgdb > becoming inoperable. > > A stop-the-world debugger with memory peek/poke does intrinsically > provide its operator with the means to hose their system in all manner > of exciting ways (not least because stopping-the-world is already a DoS > attack ;-) ) but the current no safety rail approach is not easy to > defend, especially given kprobes provides us with plenty of machinery to > mark parts of the kernel where breakpointing is discouraged. > > This patchset introduces some safety rails by using the existing > kprobes infrastructure. It does not cover all locations where > breakpoints can cause trouble but it will definitely block off several > avenues, including the architecture specific parts that are handled by > arch_within_kprobe_blacklist(). > > This patch is an RFC because: > > 1. My workstation is still chugging through the compile testing. > > 2. Patch 4 needs more runtime testing. > > 3. The code to extract the kprobe blacklist code (patch 4 again) needs > more review especially for its impact on arch specific code. > > To be clear I do plan to do the detailed review of the kprobe blacklist > stuff but would like to check the direction of travel first since the > change is already surprisingly big and maybe there's a better way to > organise things. Thanks for doing these patches, esp 1-3 look very good to me. I've taken the liberty to bounce the entire set to Masami-San, who is the kprobes maintainer for comments as well.
On Fri, Jun 05, 2020 at 04:29:53PM +0200, Peter Zijlstra wrote: > On Fri, Jun 05, 2020 at 02:21:26PM +0100, Daniel Thompson wrote: > > kgdb has traditionally adopted a no safety rails approach to breakpoint > > placement. If the debugger is commanded to place a breakpoint at an > > address then it will do so even if that breakpoint results in kgdb > > becoming inoperable. > > > > A stop-the-world debugger with memory peek/poke does intrinsically > > provide its operator with the means to hose their system in all manner > > of exciting ways (not least because stopping-the-world is already a DoS > > attack ;-) ) but the current no safety rail approach is not easy to > > defend, especially given kprobes provides us with plenty of machinery to > > mark parts of the kernel where breakpointing is discouraged. > > > > This patchset introduces some safety rails by using the existing > > kprobes infrastructure. It does not cover all locations where > > breakpoints can cause trouble but it will definitely block off several > > avenues, including the architecture specific parts that are handled by > > arch_within_kprobe_blacklist(). > > > > This patch is an RFC because: > > > > 1. My workstation is still chugging through the compile testing. > > > > 2. Patch 4 needs more runtime testing. > > > > 3. The code to extract the kprobe blacklist code (patch 4 again) needs > > more review especially for its impact on arch specific code. > > > > To be clear I do plan to do the detailed review of the kprobe blacklist > > stuff but would like to check the direction of travel first since the > > change is already surprisingly big and maybe there's a better way to > > organise things. > > Thanks for doing these patches, esp 1-3 look very good to me. > > I've taken the liberty to bounce the entire set to Masami-San, who is > the kprobes maintainer for comments as well. OK, after having had a second look, one thing we can perhaps address with the last patch, or perhaps on top of that, is extending the kprobes_blacklist() with data regions. Because these patches only exclude kgdb from setting breakpoints on code; data breakpoints do not match what we do with arch_build_bp_info().
On Fri, Jun 05, 2020 at 04:29:53PM +0200, Peter Zijlstra wrote: > On Fri, Jun 05, 2020 at 02:21:26PM +0100, Daniel Thompson wrote: > > kgdb has traditionally adopted a no safety rails approach to breakpoint > > placement. If the debugger is commanded to place a breakpoint at an > > address then it will do so even if that breakpoint results in kgdb > > becoming inoperable. > > > > A stop-the-world debugger with memory peek/poke does intrinsically > > provide its operator with the means to hose their system in all manner > > of exciting ways (not least because stopping-the-world is already a DoS > > attack ;-) ) but the current no safety rail approach is not easy to > > defend, especially given kprobes provides us with plenty of machinery to > > mark parts of the kernel where breakpointing is discouraged. > > > > This patchset introduces some safety rails by using the existing > > kprobes infrastructure. It does not cover all locations where > > breakpoints can cause trouble but it will definitely block off several > > avenues, including the architecture specific parts that are handled by > > arch_within_kprobe_blacklist(). > > > > This patch is an RFC because: > > > > 1. My workstation is still chugging through the compile testing. > > > > 2. Patch 4 needs more runtime testing. > > > > 3. The code to extract the kprobe blacklist code (patch 4 again) needs > > more review especially for its impact on arch specific code. > > > > To be clear I do plan to do the detailed review of the kprobe blacklist > > stuff but would like to check the direction of travel first since the > > change is already surprisingly big and maybe there's a better way to > > organise things. > > Thanks for doing these patches, esp 1-3 look very good to me. > > I've taken the liberty to bounce the entire set to Masami-San, who is > the kprobes maintainer for comments as well. Not a liberty... leaving out Masami-san was an oversight on my part. Thanks for connecting! Daniel.
On Fri, Jun 05, 2020 at 04:44:57PM +0200, Peter Zijlstra wrote: > On Fri, Jun 05, 2020 at 04:29:53PM +0200, Peter Zijlstra wrote: > > On Fri, Jun 05, 2020 at 02:21:26PM +0100, Daniel Thompson wrote: > > > kgdb has traditionally adopted a no safety rails approach to breakpoint > > > placement. If the debugger is commanded to place a breakpoint at an > > > address then it will do so even if that breakpoint results in kgdb > > > becoming inoperable. > > > > > > A stop-the-world debugger with memory peek/poke does intrinsically > > > provide its operator with the means to hose their system in all manner > > > of exciting ways (not least because stopping-the-world is already a DoS > > > attack ;-) ) but the current no safety rail approach is not easy to > > > defend, especially given kprobes provides us with plenty of machinery to > > > mark parts of the kernel where breakpointing is discouraged. > > > > > > This patchset introduces some safety rails by using the existing > > > kprobes infrastructure. It does not cover all locations where > > > breakpoints can cause trouble but it will definitely block off several > > > avenues, including the architecture specific parts that are handled by > > > arch_within_kprobe_blacklist(). > > > > > > This patch is an RFC because: > > > > > > 1. My workstation is still chugging through the compile testing. > > > > > > 2. Patch 4 needs more runtime testing. > > > > > > 3. The code to extract the kprobe blacklist code (patch 4 again) needs > > > more review especially for its impact on arch specific code. > > > > > > To be clear I do plan to do the detailed review of the kprobe blacklist > > > stuff but would like to check the direction of travel first since the > > > change is already surprisingly big and maybe there's a better way to > > > organise things. > > > > Thanks for doing these patches, esp 1-3 look very good to me. > > > > I've taken the liberty to bounce the entire set to Masami-San, who is > > the kprobes maintainer for comments as well. > > OK, after having had a second look, one thing we can perhaps address > with the last patch, or perhaps on top of that, is extending the > kprobes_blacklist() with data regions. > > Because these patches only exclude kgdb from setting breakpoints on > code; data breakpoints do not match what we do with > arch_build_bp_info(). Right, my patches will only affect the code paths where kgdb sets software breakpoints. In principle h/w breakpoints, whether they are code or data, should be able to rely on hw_breakpoint_arch_parse() and any errors from the hw breakpoint API will propagate into the kgdb core and do the right thing. In practice it looks like kgdb/x86/hw_breakpoint have plumbed together in a manner that circumvents the parse (and therefore# arch_build_bp_info() never runs). Rather the h/w/ break problem using the kprobe blacklist I think we could just fix these code paths. The following is 100% untested (not even compile) and I copied a line or two from register_perf_hw_breakpoint() without checking what they do ;-). Nevertheless I hope it gives a clear idea about what I am talking about! If this was developed into a "real" patch then I think dbg_release_bp_slot() should perhaps be replaced with a better API that doesn't bypass the checks rather than solving everything in kgdb.c . Daniel. diff --git a/arch/x86/kernel/kgdb.c b/arch/x86/kernel/kgdb.c index c44fe7d8d9a4..64ac0ee9b55c 100644 --- a/arch/x86/kernel/kgdb.c +++ b/arch/x86/kernel/kgdb.c @@ -223,11 +223,12 @@ static void kgdb_correct_hw_break(void) hw_breakpoint_restore(); } -static int hw_break_reserve_slot(int breakno) +static int kgdb_register_hw_break(int breakno) { int cpu; int cnt = 0; struct perf_event **pevent; + struct arch_hw_breakpoint hw = { }; if (dbg_is_early) return 0; @@ -237,6 +238,11 @@ static int hw_break_reserve_slot(int breakno) pevent = per_cpu_ptr(breakinfo[breakno].pev, cpu); if (dbg_reserve_bp_slot(*pevent)) goto fail; + if (hw_breakpoint_arch_parse(*pevent, &(*pevent)->attr, hw)) { + cnt++; + goto fail; + } + (*pevent)->hw.info = hw; } return 0; @@ -361,7 +367,7 @@ kgdb_set_hw_break(unsigned long addr, int len, enum kgdb_bptype bptype) return -1; } breakinfo[i].addr = addr; - if (hw_break_reserve_slot(i)) { + if (kgdb_register_hw_break(i)) { breakinfo[i].addr = 0; return -1; }
On Fri, 5 Jun 2020 16:29:53 +0200 Peter Zijlstra <peterz@infradead.org> wrote: > On Fri, Jun 05, 2020 at 02:21:26PM +0100, Daniel Thompson wrote: > > kgdb has traditionally adopted a no safety rails approach to breakpoint > > placement. If the debugger is commanded to place a breakpoint at an > > address then it will do so even if that breakpoint results in kgdb > > becoming inoperable. > > > > A stop-the-world debugger with memory peek/poke does intrinsically > > provide its operator with the means to hose their system in all manner > > of exciting ways (not least because stopping-the-world is already a DoS > > attack ;-) ) but the current no safety rail approach is not easy to > > defend, especially given kprobes provides us with plenty of machinery to > > mark parts of the kernel where breakpointing is discouraged. > > > > This patchset introduces some safety rails by using the existing > > kprobes infrastructure. It does not cover all locations where > > breakpoints can cause trouble but it will definitely block off several > > avenues, including the architecture specific parts that are handled by > > arch_within_kprobe_blacklist(). > > > > This patch is an RFC because: > > > > 1. My workstation is still chugging through the compile testing. > > > > 2. Patch 4 needs more runtime testing. > > > > 3. The code to extract the kprobe blacklist code (patch 4 again) needs > > more review especially for its impact on arch specific code. > > > > To be clear I do plan to do the detailed review of the kprobe blacklist > > stuff but would like to check the direction of travel first since the > > change is already surprisingly big and maybe there's a better way to > > organise things. > > Thanks for doing these patches, esp 1-3 look very good to me. > > I've taken the liberty to bounce the entire set to Masami-San, who is > the kprobes maintainer for comments as well. Thanks Peter to Cc me. Reusing kprobe blacklist is good to me as far as it doesn't expand it only for kgdb. For example, if a function which can cause a recursive trap issue only when the kgdb puts a breakpoint, it should be covered by kgdb blacklist, or we should make a "noinstr-list" including both :) Thus, Nack for PATCH 4/4, that seems a bit selfish change. If kgdb wants to use the "kprobes blacklist", we should make CONFIG_KGDB depending on CONFIG_KPROBES. Or, (as I pointed) we should make independent "noinstr-list" and use it from both. Thank you, -- Masami Hiramatsu <mhiramat@kernel.org>
On Thu, Jun 11, 2020 at 09:42:09PM +0900, Masami Hiramatsu wrote: > On Fri, 5 Jun 2020 16:29:53 +0200 > Peter Zijlstra <peterz@infradead.org> wrote: > > > On Fri, Jun 05, 2020 at 02:21:26PM +0100, Daniel Thompson wrote: > > > kgdb has traditionally adopted a no safety rails approach to breakpoint > > > placement. If the debugger is commanded to place a breakpoint at an > > > address then it will do so even if that breakpoint results in kgdb > > > becoming inoperable. > > > > > > A stop-the-world debugger with memory peek/poke does intrinsically > > > provide its operator with the means to hose their system in all manner > > > of exciting ways (not least because stopping-the-world is already a DoS > > > attack ;-) ) but the current no safety rail approach is not easy to > > > defend, especially given kprobes provides us with plenty of machinery to > > > mark parts of the kernel where breakpointing is discouraged. > > > > > > This patchset introduces some safety rails by using the existing > > > kprobes infrastructure. It does not cover all locations where > > > breakpoints can cause trouble but it will definitely block off several > > > avenues, including the architecture specific parts that are handled by > > > arch_within_kprobe_blacklist(). > > > > > > This patch is an RFC because: > > > > > > 1. My workstation is still chugging through the compile testing. > > > > > > 2. Patch 4 needs more runtime testing. > > > > > > 3. The code to extract the kprobe blacklist code (patch 4 again) needs > > > more review especially for its impact on arch specific code. > > > > > > To be clear I do plan to do the detailed review of the kprobe blacklist > > > stuff but would like to check the direction of travel first since the > > > change is already surprisingly big and maybe there's a better way to > > > organise things. > > > > Thanks for doing these patches, esp 1-3 look very good to me. > > > > I've taken the liberty to bounce the entire set to Masami-San, who is > > the kprobes maintainer for comments as well. > > Thanks Peter to Cc me. > > Reusing kprobe blacklist is good to me as far as it doesn't expand it > only for kgdb. For example, if a function which can cause a recursive > trap issue only when the kgdb puts a breakpoint, it should be covered > by kgdb blacklist, or we should make a "noinstr-list" including > both :) Recursion is what focuses the mind but I don't think we'd need recursion for there to be problems. For example taking a kprobe trap whilst executing the kgdb trap handler (or vice versa) is already likely to be fragile and is almost certainly untested on most or all architectures. Further if I understood Peter's original nudge correctly then, in addition, x86 plans to explicitly prohibit this anyway. On other words I think there will only be one blacklist. > Thus, Nack for PATCH 4/4, that seems a bit selfish change. If kgdb wants > to use the "kprobes blacklist", we should make CONFIG_KGDB depending on > CONFIG_KPROBES. Some of the architectures currently have kgdb support but do not have kprobes... > Or, (as I pointed) we should make independent "noinstr-list" > and use it from both. This sounds like this wouldn't really be a functional change over what I have proposed. More like augmenting it with a massive symbol rename (and maybe a little bit of extra code movement in the headers to give us linux/noinstr.h). Taking my cues from things like set_fs() I originally decided to keep away from such a big rename ;-) . Personally I'm open to a rename. I could write PATCH 4/4 assuming a rename will come (e.g. different naming for new files and Kconfig options) and follow that with an automatically generated rename patch (or patches). Daniel.
On Thu, 11 Jun 2020 15:32:40 +0100 Daniel Thompson <daniel.thompson@linaro.org> wrote: > On Thu, Jun 11, 2020 at 09:42:09PM +0900, Masami Hiramatsu wrote: > > On Fri, 5 Jun 2020 16:29:53 +0200 > > Peter Zijlstra <peterz@infradead.org> wrote: > > > > > On Fri, Jun 05, 2020 at 02:21:26PM +0100, Daniel Thompson wrote: > > > > kgdb has traditionally adopted a no safety rails approach to breakpoint > > > > placement. If the debugger is commanded to place a breakpoint at an > > > > address then it will do so even if that breakpoint results in kgdb > > > > becoming inoperable. > > > > > > > > A stop-the-world debugger with memory peek/poke does intrinsically > > > > provide its operator with the means to hose their system in all manner > > > > of exciting ways (not least because stopping-the-world is already a DoS > > > > attack ;-) ) but the current no safety rail approach is not easy to > > > > defend, especially given kprobes provides us with plenty of machinery to > > > > mark parts of the kernel where breakpointing is discouraged. > > > > > > > > This patchset introduces some safety rails by using the existing > > > > kprobes infrastructure. It does not cover all locations where > > > > breakpoints can cause trouble but it will definitely block off several > > > > avenues, including the architecture specific parts that are handled by > > > > arch_within_kprobe_blacklist(). > > > > > > > > This patch is an RFC because: > > > > > > > > 1. My workstation is still chugging through the compile testing. > > > > > > > > 2. Patch 4 needs more runtime testing. > > > > > > > > 3. The code to extract the kprobe blacklist code (patch 4 again) needs > > > > more review especially for its impact on arch specific code. > > > > > > > > To be clear I do plan to do the detailed review of the kprobe blacklist > > > > stuff but would like to check the direction of travel first since the > > > > change is already surprisingly big and maybe there's a better way to > > > > organise things. > > > > > > Thanks for doing these patches, esp 1-3 look very good to me. > > > > > > I've taken the liberty to bounce the entire set to Masami-San, who is > > > the kprobes maintainer for comments as well. > > > > Thanks Peter to Cc me. > > > > Reusing kprobe blacklist is good to me as far as it doesn't expand it > > only for kgdb. For example, if a function which can cause a recursive > > trap issue only when the kgdb puts a breakpoint, it should be covered > > by kgdb blacklist, or we should make a "noinstr-list" including > > both :) > > Recursion is what focuses the mind but I don't think we'd need > recursion for there to be problems. > > For example taking a kprobe trap whilst executing the kgdb trap handler > (or vice versa) is already likely to be fragile and is almost certainly > untested on most or all architectures. Ah, OK. Actually, on x86 that is not a problem (it can handle recursive int3 if software handles it correctly), but other arch may not accept it. Hmm, then using NOKPROBE_SYMBOL() is reasonable. > Further if I understood Peter's > original nudge correctly then, in addition, x86 plans to explicitly > prohibit this anyway. > > On other words I think there will only be one blacklist. Agreed. > > Thus, Nack for PATCH 4/4, that seems a bit selfish change. If kgdb wants > > to use the "kprobes blacklist", we should make CONFIG_KGDB depending on > > CONFIG_KPROBES. > > Some of the architectures currently have kgdb support but do not have > kprobes... "depends on KPROBES if HAVE_KPROBES" ? :-) (Anyway, it is a good chance to port kprobe on such arch...) Thank you, -- Masami Hiramatsu <mhiramat@kernel.org>
On Fri, Jun 12, 2020 at 07:13:49PM +0900, Masami Hiramatsu wrote: > On Thu, 11 Jun 2020 15:32:40 +0100 > Daniel Thompson <daniel.thompson@linaro.org> wrote: > > > On Thu, Jun 11, 2020 at 09:42:09PM +0900, Masami Hiramatsu wrote: > > > On Fri, 5 Jun 2020 16:29:53 +0200 > > > Peter Zijlstra <peterz@infradead.org> wrote: > > > > > > > On Fri, Jun 05, 2020 at 02:21:26PM +0100, Daniel Thompson wrote: > > > > > kgdb has traditionally adopted a no safety rails approach to breakpoint > > > > > placement. If the debugger is commanded to place a breakpoint at an > > > > > address then it will do so even if that breakpoint results in kgdb > > > > > becoming inoperable. > > > > > > > > > > A stop-the-world debugger with memory peek/poke does intrinsically > > > > > provide its operator with the means to hose their system in all manner > > > > > of exciting ways (not least because stopping-the-world is already a DoS > > > > > attack ;-) ) but the current no safety rail approach is not easy to > > > > > defend, especially given kprobes provides us with plenty of machinery to > > > > > mark parts of the kernel where breakpointing is discouraged. > > > > > > > > > > This patchset introduces some safety rails by using the existing > > > > > kprobes infrastructure. It does not cover all locations where > > > > > breakpoints can cause trouble but it will definitely block off several > > > > > avenues, including the architecture specific parts that are handled by > > > > > arch_within_kprobe_blacklist(). > > > > > > > > > > This patch is an RFC because: > > > > > > > > > > 1. My workstation is still chugging through the compile testing. > > > > > > > > > > 2. Patch 4 needs more runtime testing. > > > > > > > > > > 3. The code to extract the kprobe blacklist code (patch 4 again) needs > > > > > more review especially for its impact on arch specific code. > > > > > > > > > > To be clear I do plan to do the detailed review of the kprobe blacklist > > > > > stuff but would like to check the direction of travel first since the > > > > > change is already surprisingly big and maybe there's a better way to > > > > > organise things. > > > > > > > > Thanks for doing these patches, esp 1-3 look very good to me. > > > > > > > > I've taken the liberty to bounce the entire set to Masami-San, who is > > > > the kprobes maintainer for comments as well. > > > > > > Thanks Peter to Cc me. > > > > > > Reusing kprobe blacklist is good to me as far as it doesn't expand it > > > only for kgdb. For example, if a function which can cause a recursive > > > trap issue only when the kgdb puts a breakpoint, it should be covered > > > by kgdb blacklist, or we should make a "noinstr-list" including > > > both :) > > > > Recursion is what focuses the mind but I don't think we'd need > > recursion for there to be problems. > > > > For example taking a kprobe trap whilst executing the kgdb trap handler > > (or vice versa) is already likely to be fragile and is almost certainly > > untested on most or all architectures. > > Ah, OK. Actually, on x86 that is not a problem (it can handle recursive int3 > if software handles it correctly), but other arch may not accept it. > Hmm, then using NOKPROBE_SYMBOL() is reasonable. > > > Further if I understood Peter's > > original nudge correctly then, in addition, x86 plans to explicitly > > prohibit this anyway. > > > > On other words I think there will only be one blacklist. > > Agreed. > > > > Thus, Nack for PATCH 4/4, that seems a bit selfish change. If kgdb wants > > > to use the "kprobes blacklist", we should make CONFIG_KGDB depending on > > > CONFIG_KPROBES. > > > > Some of the architectures currently have kgdb support but do not have > > kprobes... > > "depends on KPROBES if HAVE_KPROBES" ? :-) I'm personally be OK with something like: #ifndef CONFIG_KGDB_ALLOW_UNSAFE_BREAKPOINTS if (within_kprobe_blacklist()) ... #endif And then setup Kconfig so that KGDB_ALLOW_UNSAFE_BREAKPOINTS defaults to n and add a extra check to put a warning in dmesg that breakpoints are disabled. > (Anyway, it is a good chance to port kprobe on such arch...) I like kprobes very much... but not quite enough to want to implement it on architectures I don't use ;-). Daniel.