Message ID | 20241211153955.33518-2-tjarlama@gmail.com |
---|---|
State | Superseded |
Headers | show |
Series | Add a new command in kgdb for vmcoreinfo | expand |
Hi, On Wed, Dec 11, 2024 at 7:40 AM Amal Raj T <tjarlama@gmail.com> wrote: > > From: Amal Raj T <amalrajt@meta.com> > > Add a new function kgdb_mem2ebin that converts memory > to binary format, escaping special characters > ('$', '#', and '}'). kgdb_mem2ebin function ensures > that memory data is properly formatted and escaped > before being sent over the wire. Additionally, this > function reduces the amount of data exchanged between > debugger compared to hex. > > Signed-off-by: Amal Raj T <amalrajt@meta.com> > --- > include/linux/kgdb.h | 1 + > kernel/debug/gdbstub.c | 31 +++++++++++++++++++++++++++++++ > 2 files changed, 32 insertions(+) > > diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h > index 76e891ee9e37..fa3cf38a14de 100644 > --- a/include/linux/kgdb.h > +++ b/include/linux/kgdb.h > @@ -322,6 +322,7 @@ extern struct kgdb_io *dbg_io_ops; > > extern int kgdb_hex2long(char **ptr, unsigned long *long_val); > extern char *kgdb_mem2hex(char *mem, char *buf, int count); > +extern char *kgdb_mem2ebin(char *mem, char *buf, int count); > extern int kgdb_hex2mem(char *buf, char *mem, int count); > > extern int kgdb_isremovedbreak(unsigned long addr); > diff --git a/kernel/debug/gdbstub.c b/kernel/debug/gdbstub.c > index f625172d4b67..6198d2eb49c4 100644 > --- a/kernel/debug/gdbstub.c > +++ b/kernel/debug/gdbstub.c > @@ -257,6 +257,37 @@ char *kgdb_mem2hex(char *mem, char *buf, int count) > return buf; > } > > +/* > + * Convert memory to binary format for GDB remote protocol > + * transmission, escaping special characters ($, #, and }). Why exactly are those characters special? What considers them special and so why do you need to escape them? I guess you really just care about avoiding # and $ and you're using '}' as your escape character so you need to escape that too? Your function comment should describe the escaping method and ideally provide a few examples. > + */ > +char *kgdb_mem2ebin(char *mem, char *buf, int count) One of the two buffers seems like it should be "const", right? That would help document which was input and which was output. I guess "mem" is the input? "count" should be "size_t" Presumably there should be two counts talking about the sizes of each buffer, or at least some documentation should be in the function comment talking about the fact that "buf" needs to be twice the size? > +{ > + char *tmp; > + int err; > + > + tmp = buf + count; Could use a comment that the buffer needs to be 2x long to handle escaping and that you'll use the 2nd half as a temp buffer. > + > + err = copy_from_kernel_nofault(tmp, mem, count); > + if (err) > + return NULL; > + while (count > 0) { If you change `count` to `size_t` the above test won't work because it'll be unsigned. Still probably better to use `size_t`, but just a warning that you'll have to change the condition. > + unsigned char c = *tmp; > + > + if (c == 0x7d || c == 0x23 || c == 0x24) { > + *buf++ = 0x7d; Don't hardcode. Much better to use '}', '#', '$' > + *buf++ = c ^ 0x20; > + } else { > + *buf++ = c; > + } > + count -= 1; > + tmp += 1; count--; tmp++; > + } > + *buf = 0; Why is the resulting buffer '\0' terminated when the source buffer isn't? Adding this termination means that the destination buffer needs to be 1 byte bigger, right? ...or maybe the source buffer doesn't actually have any embedded '\0' characters and you're using this for termination to the other side? It would be good to clarify. In other words, if the input is 2 bytes big: '}}' The output buffer will be 5 bytes big: '}]}]\0' > + > + return buf; What exactly is this return value? It points right past the end of the buffer? You seem to just use it as an error value (NULL means error, non-NULL means no error). Why not return an error code then? -Doug
On Wed, Dec 11, 2024 at 07:39:53AM -0800, Amal Raj T wrote: > From: Amal Raj T <amalrajt@meta.com> > > Add a new function kgdb_mem2ebin that converts memory > to binary format, escaping special characters > ('$', '#', and '}'). What about the '*' character? The gdb docs say "Responses sent by the stub must also escape 0x2a (ASCII ‘*’), so that it is not interpreted as the start of a run-length encoded sequence". > kgdb_mem2ebin function ensures > that memory data is properly formatted and escaped > before being sent over the wire. Additionally, this > function reduces the amount of data exchanged between > debugger compared to hex. > > Signed-off-by: Amal Raj T <amalrajt@meta.com> > --- > include/linux/kgdb.h | 1 + > kernel/debug/gdbstub.c | 31 +++++++++++++++++++++++++++++++ > 2 files changed, 32 insertions(+) > > diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h > index 76e891ee9e37..fa3cf38a14de 100644 > --- a/include/linux/kgdb.h > +++ b/include/linux/kgdb.h > @@ -322,6 +322,7 @@ extern struct kgdb_io *dbg_io_ops; > > extern int kgdb_hex2long(char **ptr, unsigned long *long_val); > extern char *kgdb_mem2hex(char *mem, char *buf, int count); > +extern char *kgdb_mem2ebin(char *mem, char *buf, int count); > extern int kgdb_hex2mem(char *buf, char *mem, int count); > > extern int kgdb_isremovedbreak(unsigned long addr); > diff --git a/kernel/debug/gdbstub.c b/kernel/debug/gdbstub.c > index f625172d4b67..6198d2eb49c4 100644 > --- a/kernel/debug/gdbstub.c > +++ b/kernel/debug/gdbstub.c > @@ -257,6 +257,37 @@ char *kgdb_mem2hex(char *mem, char *buf, int count) > return buf; > } > > +/* > + * Convert memory to binary format for GDB remote protocol > + * transmission, escaping special characters ($, #, and }). > + */ It would be good to include a link to the following in this comment: https://sourceware.org/gdb/current/onlinedocs/gdb.html/Overview.html#Binary-Data Doug already mentioned putting buffer size expectations in this comment. Maybe also mention that the worst case packing is identical to kgdb_mem2hex() (e.g. 2*count + 1). Daniel.
Hi, On Mon, Jan 13, 2025 at 8:09 AM Amal <tjarlama@gmail.com> wrote: > > On Fri, Dec 13, 2024 at 12:55 AM Doug Anderson <dianders@chromium.org> wrote: > > > > Hi, > > > > On Wed, Dec 11, 2024 at 7:40 AM Amal Raj T <tjarlama@gmail.com> wrote: > > > > > > From: Amal Raj T <amalrajt@meta.com> > > > > > > Add a new function kgdb_mem2ebin that converts memory > > > to binary format, escaping special characters > > > ('$', '#', and '}'). kgdb_mem2ebin function ensures > > > that memory data is properly formatted and escaped > > > before being sent over the wire. Additionally, this > > > function reduces the amount of data exchanged between > > > debugger compared to hex. > > > > > > Signed-off-by: Amal Raj T <amalrajt@meta.com> > > > --- > > > include/linux/kgdb.h | 1 + > > > kernel/debug/gdbstub.c | 31 +++++++++++++++++++++++++++++++ > > > 2 files changed, 32 insertions(+) FWIW, it looks like Stephen has already responded to your V3 and the protocol might change a bunch. I'm not going to do a detailed review on it at this time. I'll comment on a few of the things below, though. One other note is that you should have done a "reply to all" when responding to my review feedback, not just a reply to me. The responses should be archived on the lists unless there was a specific reason to exclude them. Adding the lists back here since I don't see anything sensitive in your reply. > > > diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h > > > index 76e891ee9e37..fa3cf38a14de 100644 > > > --- a/include/linux/kgdb.h > > > +++ b/include/linux/kgdb.h > > > @@ -322,6 +322,7 @@ extern struct kgdb_io *dbg_io_ops; > > > > > > extern int kgdb_hex2long(char **ptr, unsigned long *long_val); > > > extern char *kgdb_mem2hex(char *mem, char *buf, int count); > > > +extern char *kgdb_mem2ebin(char *mem, char *buf, int count); > > > extern int kgdb_hex2mem(char *buf, char *mem, int count); > > > > > > extern int kgdb_isremovedbreak(unsigned long addr); > > > diff --git a/kernel/debug/gdbstub.c b/kernel/debug/gdbstub.c > > > index f625172d4b67..6198d2eb49c4 100644 > > > --- a/kernel/debug/gdbstub.c > > > +++ b/kernel/debug/gdbstub.c > > > @@ -257,6 +257,37 @@ char *kgdb_mem2hex(char *mem, char *buf, int count) > > > return buf; > > > } > > > > > > +/* > > > + * Convert memory to binary format for GDB remote protocol > > > + * transmission, escaping special characters ($, #, and }). > > > > Why exactly are those characters special? What considers them special > > and so why do you need to escape them? I guess you really just care > > about avoiding # and $ and you're using '}' as your escape character > > so you need to escape that too? > > > > Your function comment should describe the escaping method and ideally > > provide a few examples. > > > > > Added relevant links in the commit message and commented two examples > in code > > > + */ > > > +char *kgdb_mem2ebin(char *mem, char *buf, int count) > > > > One of the two buffers seems like it should be "const", right? That > > would help document which was input and which was output. I guess > > "mem" is the input? > > > > "count" should be "size_t" > > > This was copied from `kgdb_mem2hex` which uses a similar template. > Should we continue using `int` or just this function need to be migrated to > `size_t` In general the existing kdb/kgdb functions are not always the best examples. I think size_t is correct here. Indeed, if you look at what "kgdb_mem2hex" does with "count" it passes it to copy_from_kernel_nofault() which takes a size_t. Feel free to post a (separate) patch fixing kgdb_mem2hex(). > > Presumably there should be two counts talking about the sizes of each > > buffer, or at least some documentation should be in the function > > comment talking about the fact that "buf" needs to be twice the size? > > > > > > > +{ > > > + char *tmp; > > > + int err; > > > + > > > + tmp = buf + count; > > > > Could use a comment that the buffer needs to be 2x long to handle > > escaping and that you'll use the 2nd half as a temp buffer. > > > > > Done, commented in v3 > > > + > > > + err = copy_from_kernel_nofault(tmp, mem, count); > > > + if (err) > > > + return NULL; > > > + while (count > 0) { > > > > If you change `count` to `size_t` the above test won't work because > > it'll be unsigned. Still probably better to use `size_t`, but just a > > warning that you'll have to change the condition. > > > > > > > + unsigned char c = *tmp; > > > + > > > + if (c == 0x7d || c == 0x23 || c == 0x24) { > > > + *buf++ = 0x7d; > > > > Don't hardcode. Much better to use '}', '#', '$' > > > Fixed in v3 > > > > > + *buf++ = c ^ 0x20; > > > + } else { > > > + *buf++ = c; > > > + } > > > + count -= 1; > > > + tmp += 1; > > > > count--; > > tmp++; > > > Fixed in v3 > > > + } > > > + *buf = 0; > > > > Why is the resulting buffer '\0' terminated when the source buffer > > isn't? Adding this termination means that the destination buffer needs > > to be 1 byte bigger, right? > > > > ...or maybe the source buffer doesn't actually have any embedded '\0' > > characters and you're using this for termination to the other side? It > > would be good to clarify. > > > > In other words, if the input is 2 bytes big: > > '}}' > > > > The output buffer will be 5 bytes big: > > '}]}]\0' > > > Don't think null termination is required in this case. > In this case, the `gdb_serial_stub` already initializes the buffer with `\0`, > and `put_packet` iterates over the buffer until the first occurrence of \0. > This works for `vmcoreinfo` query. But for input buffers with additional > `\0` characters need to be escaped, to make sure `put_packet` parsed the > entire buffer. > > > + > > > + return buf; > > > > What exactly is this return value? It points right past the end of the buffer? > > > > You seem to just use it as an error value (NULL means error, non-NULL > > means no error). Why not return an error code then? > > > Other encoders also return a `char *`. However there is no error handling > anywhere, the return value is discarded at the caller side. As per above, existing kdb/kgdb code isn't always the best example to follow. Personally I'd rather see an error code returned unless there's a good reason not to. If you really want to return a pointer like this, a bare minimum would be to document it. The kgdb_mem2hex() function you pointed to _does_ at least document what is returned. -Doug
diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h index 76e891ee9e37..fa3cf38a14de 100644 --- a/include/linux/kgdb.h +++ b/include/linux/kgdb.h @@ -322,6 +322,7 @@ extern struct kgdb_io *dbg_io_ops; extern int kgdb_hex2long(char **ptr, unsigned long *long_val); extern char *kgdb_mem2hex(char *mem, char *buf, int count); +extern char *kgdb_mem2ebin(char *mem, char *buf, int count); extern int kgdb_hex2mem(char *buf, char *mem, int count); extern int kgdb_isremovedbreak(unsigned long addr); diff --git a/kernel/debug/gdbstub.c b/kernel/debug/gdbstub.c index f625172d4b67..6198d2eb49c4 100644 --- a/kernel/debug/gdbstub.c +++ b/kernel/debug/gdbstub.c @@ -257,6 +257,37 @@ char *kgdb_mem2hex(char *mem, char *buf, int count) return buf; } +/* + * Convert memory to binary format for GDB remote protocol + * transmission, escaping special characters ($, #, and }). + */ +char *kgdb_mem2ebin(char *mem, char *buf, int count) +{ + char *tmp; + int err; + + tmp = buf + count; + + err = copy_from_kernel_nofault(tmp, mem, count); + if (err) + return NULL; + while (count > 0) { + unsigned char c = *tmp; + + if (c == 0x7d || c == 0x23 || c == 0x24) { + *buf++ = 0x7d; + *buf++ = c ^ 0x20; + } else { + *buf++ = c; + } + count -= 1; + tmp += 1; + } + *buf = 0; + + return buf; +} + /* * Convert the hex array pointed to by buf into binary to be placed in * mem. Return a pointer to the character AFTER the last byte