Message ID | 20220715060227.23923-1-njavali@marvell.com |
---|---|
Headers | show |
Series | qla2xxx driver features | expand |
On Mon, 18 Jul 2022 12:02:26 -0700 Arun Easi <aeasi@marvell.com> wrote: > Many times when a problem was reported on the driver, we had to request > for a repro with extended error logging (via driver module parameter) > turned on. With this internal tracing in place, log messages that appear > only with extended error logging are captured by default in the internal > trace buffer. > > AFAIK, kernel tracing requires a user initiated action to be turned on, > like enabling individual tracepoints. Though a script (either startup or > udev) can do this job, enabling tracepoints by default for a single > driver, I think, may not be a preferred choice -- particularly when the > trace buffer is shared across the kernel. That also brings the extra > overhead of finding how this could be done across various distros. > > For cases where the memory/driver size matters, there is an option to > compile out this feature, plus choosing a lower default trace buffer size. You can enable an ftrace instance from inside the kernel, and make it a compile time option. #include <linux/trace_events.h> #include <linux/trace.h> struct trace_array *tr; tr = trace_array_get_by_name("qla2xxx"); trace_array_set_clr_event(tr, "qla", NULL, true); And now you have trace events running: # cat /sys/kernel/tracing/instances/qla/trace -- Steve
On Tue, 19 Jul 2022 10:25:14 +0200 Daniel Wagner <dwagner@suse.de> wrote: > > You can enable an ftrace instance from inside the kernel, and make it a > > compile time option. > > > > #include <linux/trace_events.h> > > #include <linux/trace.h> > > > > struct trace_array *tr; > > > > tr = trace_array_get_by_name("qla2xxx"); > > trace_array_set_clr_event(tr, "qla", NULL, true); > > > > And now you have trace events running: > > > > # cat /sys/kernel/tracing/instances/qla/trace The above should be: # cat /sys/kernel/tracing/instances/qla2xxx/trace as the instance name is the string sent to trace_array_get_by_name(). -- Steve
On Tue, 19 Jul 2022 14:06:11 -0700 Arun Easi <aeasi@marvell.com> wrote: > It appears by calling the above two interfaces, I get a separate instance > of "qla" only traces, but with only the "qla"-only instance being enabled, > leaving the main (/sys/kernel/tracing/events/qla/enable) one disabled. No > issues there, but when I enable both of them, I get garbage values on one. > > /sys/kernel/tracing/instances/qla2xxx/trace: > cat-56106 [012] ..... 2419873.470098: ql_dbg_log: qla2xxx [0000:05:00.0]-1054:14: Entered (null). > cat-56106 [012] ..... 2419873.470101: ql_dbg_log: qla2xxx [0000:05:00.0]-1000:14: Entered ×+<96>²Ü<98>^H. > cat-56106 [012] ..... 2419873.470102: ql_dbg_log: qla2xxx [0000:05:00.0]-1006:14: Prepare to issue mbox cmd=0xde589000. > > /sys/kernel/tracing/trace: > cat-56106 [012] ..... 2419873.470097: ql_dbg_log: qla2xxx [0000:05:00.0]-1054:14: Entered qla2x00_get_firmware_state. > cat-56106 [012] ..... 2419873.470100: ql_dbg_log: qla2xxx [0000:05:00.0]-1000:14: Entered qla2x00_mailbox_command. > cat-56106 [012] ..... 2419873.470102: ql_dbg_log: qla2xxx [0000:05:00.0]-1006:14: Prepare to issue mbox cmd=0x69. > > It appears that only one should be enabled at a time. Per my read of > Documentation/trace/ftrace.rst, the main directory and instances have > separate trace buffers, so I am a bit confused with the above output. That's because it uses va_list, and va_list can only be used once. I just added helpers for va_list use cases: https://lore.kernel.org/all/20220705224453.120955146@goodmis.org/ But it will likely suffer the same issue, but I can easily fix that with this on top: Not even compiled tested: -- Steve diff --git a/include/trace/stages/stage6_event_callback.h b/include/trace/stages/stage6_event_callback.h index 0f51f6b3ab70..3c554a585320 100644 --- a/include/trace/stages/stage6_event_callback.h +++ b/include/trace/stages/stage6_event_callback.h @@ -40,7 +40,12 @@ #undef __assign_vstr #define __assign_vstr(dst, fmt, va) \ - vsnprintf(__get_str(dst), TRACE_EVENT_STR_MAX, fmt, *(va)) + do { \ + va_list __cp_va; \ + va_copy(__cp_va, *(va)); \ + vsnprintf(__get_str(dst), TRACE_EVENT_STR_MAX, fmt, __cp_va); \ + va_end(__cp_va); \ + } while (0) #undef __bitmask #define __bitmask(item, nr_bits) __dynamic_array(unsigned long, item, -1)
On Tue, 19 Jul 2022, 2:40pm, Steven Rostedt wrote: > On Tue, 19 Jul 2022 14:06:11 -0700 > Arun Easi <aeasi@marvell.com> wrote: > > > It appears by calling the above two interfaces, I get a separate instance > > of "qla" only traces, but with only the "qla"-only instance being enabled, > > leaving the main (/sys/kernel/tracing/events/qla/enable) one disabled. No > > issues there, but when I enable both of them, I get garbage values on one. > > > > /sys/kernel/tracing/instances/qla2xxx/trace: > > cat-56106 [012] ..... 2419873.470098: ql_dbg_log: qla2xxx [0000:05:00.0]-1054:14: Entered (null). > > cat-56106 [012] ..... 2419873.470101: ql_dbg_log: qla2xxx [0000:05:00.0]-1000:14: Entered ×+<96>²Ü<98>^H. > > cat-56106 [012] ..... 2419873.470102: ql_dbg_log: qla2xxx [0000:05:00.0]-1006:14: Prepare to issue mbox cmd=0xde589000. > > > > /sys/kernel/tracing/trace: > > cat-56106 [012] ..... 2419873.470097: ql_dbg_log: qla2xxx [0000:05:00.0]-1054:14: Entered qla2x00_get_firmware_state. > > cat-56106 [012] ..... 2419873.470100: ql_dbg_log: qla2xxx [0000:05:00.0]-1000:14: Entered qla2x00_mailbox_command. > > cat-56106 [012] ..... 2419873.470102: ql_dbg_log: qla2xxx [0000:05:00.0]-1006:14: Prepare to issue mbox cmd=0x69. > > > > It appears that only one should be enabled at a time. Per my read of > > Documentation/trace/ftrace.rst, the main directory and instances have > > separate trace buffers, so I am a bit confused with the above output. > > That's because it uses va_list, and va_list can only be used once. Ah, that makes sense. > > I just added helpers for va_list use cases: > > https://urldefense.proofpoint.com/v2/url?u=https-3A__lore.kernel.org_all_20220705224453.120955146-40goodmis.org_&d=DwIFaQ&c=nKjWec2b6R0mOyPaz7xtfQ&r=P-q_Qkt75qFy33SvdD2nAxAyN87eO1d-mFO-lqNOomw&m=cDBXba_lNDGk7c1Qm4elDe1Co-RV8zR-c1A9xV7vc4nMcQO7iRle72qNxljHsDNA&s=74PzgMj6nmfRCLXd3oMIqt-_DZbQeukUl0nK7MzfUjs&e= > > But it will likely suffer the same issue, but I can easily fix that with > this on top: > > Not even compiled tested: > > -- Steve > > diff --git a/include/trace/stages/stage6_event_callback.h b/include/trace/stages/stage6_event_callback.h > index 0f51f6b3ab70..3c554a585320 100644 > --- a/include/trace/stages/stage6_event_callback.h > +++ b/include/trace/stages/stage6_event_callback.h > @@ -40,7 +40,12 @@ > > #undef __assign_vstr > #define __assign_vstr(dst, fmt, va) \ > - vsnprintf(__get_str(dst), TRACE_EVENT_STR_MAX, fmt, *(va)) > + do { \ > + va_list __cp_va; \ > + va_copy(__cp_va, *(va)); \ > + vsnprintf(__get_str(dst), TRACE_EVENT_STR_MAX, fmt, __cp_va); \ > + va_end(__cp_va); \ > + } while (0) > > #undef __bitmask > #define __bitmask(item, nr_bits) __dynamic_array(unsigned long, item, -1) > This will be nice. Thanks for your help, Steve. Regards, -Arun
Hi Daniel, On Tue, 19 Jul 2022, 1:25am, Daniel Wagner wrote: > External Email > > ---------------------------------------------------------------------- > Hi Arun, > > On Mon, Jul 18, 2022 at 03:22:43PM -0400, Steven Rostedt wrote: > > On Mon, 18 Jul 2022 12:02:26 -0700 > > Arun Easi <aeasi@marvell.com> wrote: > > > > > Many times when a problem was reported on the driver, we had to request > > > for a repro with extended error logging (via driver module parameter) > > > turned on. With this internal tracing in place, log messages that appear > > > only with extended error logging are captured by default in the internal > > > trace buffer. > > > > > > AFAIK, kernel tracing requires a user initiated action to be turned on, > > > like enabling individual tracepoints. Though a script (either startup or > > > udev) can do this job, enabling tracepoints by default for a single > > > driver, I think, may not be a preferred choice -- particularly when the > > > trace buffer is shared across the kernel. That also brings the extra > > > overhead of finding how this could be done across various distros. > > > > > > For cases where the memory/driver size matters, there is an option to > > > compile out this feature, plus choosing a lower default trace buffer size. > > I am really asking the question why do we need to add special debugging > code to every single driver? Can't we try to make more use of generic > code and extend it if necessary? > > Both FC drivers qla2xxx and lpfc are doing their own thing for > debugging/logging and I really fail to see why we can't not move towards > a more generic way. Dumping logs to the kernel log was the simplest way > but as this series shows, this is not something you can do in production > systems. No contention here on a generic way. The per instance trace creation and enabling from within the kernel looks like such a one. Let me revisit the trace patches with this new info. Regards, -Arun > > > You can enable an ftrace instance from inside the kernel, and make it a > > compile time option. > > > > #include <linux/trace_events.h> > > #include <linux/trace.h> > > > > struct trace_array *tr; > > > > tr = trace_array_get_by_name("qla2xxx"); > > trace_array_set_clr_event(tr, "qla", NULL, true); > > > > And now you have trace events running: > > > > # cat /sys/kernel/tracing/instances/qla/trace > > Thanks Steve for the tip! > > Daniel >
On Tue, 19 Jul 2022 15:09:30 -0700 Arun Easi <aeasi@marvell.com> wrote: > This will be nice. > > Thanks for your help, Steve. No problem. What a coincidence that I just finished this work earlier this month. -- Steve
On 7/20/22 00:09, Arun Easi wrote: > On Tue, 19 Jul 2022, 1:25am, Daniel Wagner wrote: >> On Mon, Jul 18, 2022 at 03:22:43PM -0400, Steven Rostedt wrote: >>> On Mon, 18 Jul 2022 12:02:26 -0700 >>> Arun Easi <aeasi@marvell.com> wrote: >>> >>>> Many times when a problem was reported on the driver, we had to request >>>> for a repro with extended error logging (via driver module parameter) >>>> turned on. With this internal tracing in place, log messages that appear >>>> only with extended error logging are captured by default in the internal >>>> trace buffer. >>>> >>>> AFAIK, kernel tracing requires a user initiated action to be turned on, >>>> like enabling individual tracepoints. Though a script (either startup or >>>> udev) can do this job, enabling tracepoints by default for a single >>>> driver, I think, may not be a preferred choice -- particularly when the >>>> trace buffer is shared across the kernel. That also brings the extra >>>> overhead of finding how this could be done across various distros. >>>> >>>> For cases where the memory/driver size matters, there is an option to >>>> compile out this feature, plus choosing a lower default trace buffer size. >> >> I am really asking the question why do we need to add special debugging >> code to every single driver? Can't we try to make more use of generic >> code and extend it if necessary? >> >> Both FC drivers qla2xxx and lpfc are doing their own thing for All three FC drivers: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/s390/scsi/zfcp_dbf.c We have this flight recorder (multiple topical ring buffers per vHBA to avoid some higher frequency topics lose history of others) since a long time [maybe pre-dating ftrace], including crash tool support to extract it from post mortem kernel dumps. We use binary trace records, like tracepoints, with offline decoding/formatting (zfcpdbf in s390-tools package). Other s390 kernel components share the underlying s390dbf infrastructure. The crash extension "ftrace" is probably able to do an export from dump for the approach Steven suggested. I had used it with kernel function tracer. Very useful. >> debugging/logging and I really fail to see why we can't not move towards >> a more generic way. Dumping logs to the kernel log was the simplest way >> but as this series shows, this is not something you can do in production >> systems. > > No contention here on a generic way. The per instance trace creation and > enabling from within the kernel looks like such a one. Let me revisit the > trace patches with this new info. > > Regards, > -Arun > >> >>> You can enable an ftrace instance from inside the kernel, and make it a >>> compile time option. >>> >>> #include <linux/trace_events.h> >>> #include <linux/trace.h> >>> >>> struct trace_array *tr; >>> >>> tr = trace_array_get_by_name("qla2xxx"); >>> trace_array_set_clr_event(tr, "qla", NULL, true); >>> >>> And now you have trace events running: >>> >>> # cat /sys/kernel/tracing/instances/qla/trace >> >> Thanks Steve for the tip! >> >> Daniel >>
On Sat, 23 Jul 2022 09:47:21 +0200 Steffen Maier <maier@linux.ibm.com> wrote: > The crash extension "ftrace" is probably able to do an export from dump for the > approach Steven suggested. I had used it with kernel function tracer. Very useful. Note, the crash extension is just called "trace" and you can get it here: https://github.com/fujitsu/crash-trace -- Steve