[v5,0/1] gpio: add simple logic analyzer using polling

Message ID 20211123164902.35370-1-wsa+renesas@sang-engineering.com

Message

Wolfram Sang Nov. 23, 2021, 4:49 p.m. UTC
The bravery continues with the next update of the in-kernel logic
analyzer based on GPIO polling with local irqs disabled. It has been
tested locally and remotely and provided satisfactory results. Besides
the driver, there is also a script which isolates a CPU to achieve the
best possible result. I am aware of the latency limitations. However,
the intention is debugging only, not mass production. Especially for
remote debugging and for getting a first impression, this has already
been useful. Documentation is within the patch, to give a better idea
of what this is all about.

A branch is here:
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux.git renesas/gpio-logic-analyzer-v5

And an eLinux-wiki page with a picture of a result is here:
https://elinux.org/Kernel_GPIO_Logic_analyzer

Changes since last version:
* increased version number to 5 to match internal and external numbering
* added "SLOPPY" to the Kconfig symbol as well
* addressed Geert's comments (typos and more 'unsigned int' usage).
  Thanks!
* addressed Andy's comments (updated comment, use dev_err_probe, use
  '... || fail' pattern consistently in the script). Thanks!

I've used the analyzer in a few more scenarios and on another SoC
(Renesas R-Car M3-W) and was happy with the outcome. Looking forward to
other tests and comments. From my side this is good to go.

Happy hacking,

   Wolfram


Wolfram Sang (1):
  gpio: add sloppy logic analyzer using polling

 .../dev-tools/gpio-sloppy-logic-analyzer.rst  |  71 ++++
 Documentation/dev-tools/index.rst             |   1 +
 drivers/gpio/Kconfig                          |  17 +
 drivers/gpio/Makefile                         |   1 +
 drivers/gpio/gpio-sloppy-logic-analyzer.c     | 340 ++++++++++++++++++
 tools/gpio/gpio-sloppy-logic-analyzer         | 222 ++++++++++++
 6 files changed, 652 insertions(+)
 create mode 100644 Documentation/dev-tools/gpio-sloppy-logic-analyzer.rst
 create mode 100644 drivers/gpio/gpio-sloppy-logic-analyzer.c
 create mode 100755 tools/gpio/gpio-sloppy-logic-analyzer

Comments

Andy Shevchenko Nov. 23, 2021, 6:45 p.m. UTC | #1
On Tue, Nov 23, 2021 at 05:49:02PM +0100, Wolfram Sang wrote:
> This is a sloppy logic analyzer using GPIOs. It comes with a script to
> isolate a CPU for polling. While this is definitely not a production
> level analyzer, it can be a helpful first view when remote debugging.
> Read the documentation for details.

...

> +Result is a .sr file to be consumed with PulseView or sigrok-cli from the free
> +`sigrok`_ project. It is a zip file which also contains the binary sample data
> +which may be consumed by other software. The filename is the logic analyzer
> +instance name plus a since-epoch timestamp.
> +
> +.. _sigrok: https://sigrok.org/

Alas, yet another tool required... (Sad thoughts, since I recently installed
the PicoScope software).

...

>     kgdb
>     kselftest
>     kunit/index

> +   gpio-sloppy-logic-analyzer

The above looks ordered; do we need some groups here or so?

...

> +	mutex_lock(&priv->lock);

> +	if (priv->blob_dent) {

Redundant (i.e. duplicate).

> +		debugfs_remove(priv->blob_dent);
> +		priv->blob_dent = NULL;
> +	}

...

> +gpio_err:

A bit confusing name. What about

enable_irq_and_free_data:

?

> +	preempt_enable_notrace();
> +	local_irq_enable();
> +	if (ret)
> +		dev_err(priv->dev, "couldn't read GPIOs: %d\n", ret);
> +
> +	kfree(priv->trig_data);
> +	priv->trig_data = NULL;
> +	priv->trig_len = 0;

...

> +static int gpio_la_poll_probe(struct platform_device *pdev)
> +{
> +	struct gpio_la_poll_priv *priv;
> +	struct device *dev = &pdev->dev;
> +	const char *devname = dev_name(dev);
> +	const char *gpio_names[GPIO_LA_MAX_PROBES];

> +	char *meta = NULL;
> +	unsigned int i, meta_len = 0;
> +	int ret;

Perhaps

	unsigned int i, meta_len = 0;
	char *meta = NULL;
	int ret;


...

> +	ret = device_property_read_string_array(dev, "probe-names", gpio_names,
> +						priv->descs->ndescs);
> +	if (ret >= 0 && ret != priv->descs->ndescs)

> +		ret = -ENODATA;

Don't remember if we already discussed this error code, but data is there,
it's not correct. EBADSLT? EBADR? ECHRNG?

> +	if (ret < 0)
> +		return dev_err_probe(dev, ret, "error naming the GPIOs");

...

> +	for (i = 0; i < priv->descs->ndescs; i++) {
> +		unsigned int add_len;
> +		char *new_meta, *consumer_name;
> +
> +		if (gpiod_cansleep(priv->descs->desc[i]))
> +			return -EREMOTE;
> +
> +		consumer_name = kasprintf(GFP_KERNEL, "%s: %s", devname, gpio_names[i]);
> +		if (!consumer_name)
> +			return -ENOMEM;
> +		gpiod_set_consumer_name(priv->descs->desc[i], consumer_name);
> +		kfree(consumer_name);
> +
> +		/* '10' is length of 'probe00=\n\0' */
> +		add_len = strlen(gpio_names[i]) + 10;
> +
> +		new_meta = devm_krealloc(dev, meta, meta_len + add_len, GFP_KERNEL);
> +		if (!new_meta)
> +			return -ENOMEM;
> +
> +		meta = new_meta;
> +		meta_len += snprintf(meta + meta_len, add_len, "probe%02u=%s\n",
> +				     i + 1, gpio_names[i]);

Do we really need the 'probe%02u=' part? It's redundant since it may be derived
from the line number of the output (and it is always in [1..ndescs+1]).

> +	}

...

> +	dev_info(dev, "initialized");

Is it useful?

...

> +print_help()
> +{
> +	cat <<EOF

	cat << EOF

is slightly easier to read.

> +EOF
> +}

...

> +set_newmask()
> +{
> +	for f in $(find "$1" -iname "$2"); do echo "$newmask" > "$f" 2>/dev/null || true; done

While it's okay here, the rule of thumb is to never use `for` or `while`
against a list of filenames.

> +}
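
For illustration, a pattern that avoids iterating over a word-split file
list is to let find run the write itself, e.g. (untested sketch, not how
the script currently does it):

	find "$1" -iname "$2" \
		-exec sh -c 'echo "$1" > "$2" 2>/dev/null || true' _ "$newmask" {} \;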

...

> +init_cpu()
> +{
> +	isol_cpu="$1"

> +	[ -d $cpusetdir ] || mkdir $cpusetdir

`mkdir -p` and drop needless test.

> +	mount | grep -q $cpusetdir || mount -t cpuset cpuset $cpusetdir

> +	[ -d "$lacpusetdir" ] || mkdir "$lacpusetdir"

`mkdir -p` and drop needless test.

> +	cur_cpu="$(cat "$lacpusetdir"/cpus)"
> +	[ "$cur_cpu" = "$isol_cpu" ] && return
> +	[ -z "$cur_cpu" ] || fail "CPU$isol_cpu requested but CPU$cur_cpu already isolated"
> +
> +	echo "$isol_cpu" > "$lacpusetdir"/cpus || fail "Could not isolate CPU$isol_cpu. Does it exist?"
> +	echo 1 > "$lacpusetdir"/cpu_exclusive
> +	echo 0 > "$lacpusetdir"/mems
> +
> +	oldmask=$(cat /proc/irq/default_smp_affinity)

> +	val=$((0x$oldmask & ~(1 << isol_cpu)))
> +	newmask=$(printf "%x" $val)

Can be on one line (in a single expression).
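
E.g. something like this (untested) should be equivalent:

	newmask=$(printf "%x" $((0x$oldmask & ~(1 << isol_cpu))))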

> +	set_newmask '/proc/irq' '*smp_affinity'
> +	set_newmask '/sys/devices/virtual/workqueue/' 'cpumask'
> +
> +	# Move tasks away from isolated CPU
> +	for p in $(ps -o pid | tail -n +2); do
> +		mask=$(taskset -p "$p") || continue
> +		# Ignore tasks with a custom mask, i.e. not equal $oldmask
> +		[ "${mask##*: }" = "$oldmask" ] || continue
> +		taskset -p "$newmask" "$p" || continue
> +	done 2>/dev/null >/dev/null

`> /dev/null 2>&1` is idiomatic. And I think there is actually a subtle
difference between the two.

> +	echo 1 > /sys/module/rcupdate/parameters/rcu_cpu_stall_suppress
> +
> +	cpufreqgov="/sys/devices/system/cpu/cpu$isol_cpu/cpufreq/scaling_governor"
> +	[ -w "$cpufreqgov" ] && echo 'performance' > "$cpufreqgov" || true
> +}

...

> +parse_triggerdat()
> +{
> +	oldifs="$IFS"
> +	IFS=','; for trig in $1; do
> +		mask=0; val1=0; val2=0
> +		IFS='+'; for elem in $trig; do
> +			chan=${elem%[lhfrLHFR]}
> +			mode=${elem#$chan}
> +			# Check if we could parse something and the channel number fits

> +			[ "$chan" != "$elem" ] && [ "$chan" -le $max_chans ] || fail "Trigger syntax error: $elem"

No need to execute `test` twice:

			[ "$chan" != "$elem" -a "$chan" -le $max_chans ] || fail "Trigger syntax error: $elem"

> +			bit=$((1 << (chan - 1)))
> +			mask=$((mask | bit))
> +			case $mode in
> +				[hH]) val1=$((val1 | bit)); val2=$((val2 | bit));;
> +				[fF]) val1=$((val1 | bit));;
> +				[rR]) val2=$((val2 | bit));;
> +			esac
> +		done

> +		trigger_bindat="$trigger_bindat$(printf '\\%o\\%o' $mask $val1)"
> +		[ $val1 -ne $val2 ] && trigger_bindat="$trigger_bindat$(printf '\\%o\\%o' $mask $val2)"

`printf` with arguments may be split to a separate helper function.
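
For example, a tiny helper along these lines (the name is made up) would
keep the two call sites short:

	append_trigger_word()
	{
		trigger_bindat="$trigger_bindat$(printf '\\%o\\%o' "$1" "$2")"
	}

	append_trigger_word "$mask" "$val1"
	[ "$val1" -ne "$val2" ] && append_trigger_word "$mask" "$val2"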

> +	done
> +	IFS="$oldifs"
> +}
> +
> +do_capture()
> +{
> +	taskset "$1" echo 1 > "$lasysfsdir"/capture || fail "Capture error! Check kernel log"

Shouldn't this function setup signal TRAPs?

> +	srtmp=$(mktemp -d)
> +	echo 1 > "$srtmp"/version
> +	cp "$lasysfsdir"/sample_data "$srtmp"/logic-1-1
> +	cat > "$srtmp"/metadata <<EOF

	cat > "$srtmp"/metadata << EOF

> +[global]
> +sigrok version=0.2.0
> +
> +[device 1]
> +capturefile=logic-1
> +total probes=$(wc -l < "$lasysfsdir"/meta_data)
> +samplerate=${samplefreq}Hz
> +unitsize=1
> +EOF
> +	cat "$lasysfsdir"/meta_data >> "$srtmp"/metadata
> +
> +	zipname="$outputdir/${lasysfsdir##*/}-$(date +%s).sr"
> +	zip -jq "$zipname" "$srtmp"/*
> +	rm -rf "$srtmp"
> +	delay_ack=$(cat "$lasysfsdir"/delay_ns_acquisition)
> +	[ "$delay_ack" -eq 0 ] && delay_ack=1
> +	echo "Logic analyzer done. Saved '$zipname'"
> +	echo "Max sample frequency this time: $((1000000000 / delay_ack))Hz."
> +}
> +
> +rep=$(getopt -a -l cpu:,duration-us:,help,instance:,kernel-debug-dir:,num_samples:,output-dir:,sample_freq:,trigger: -o c:d:hi:k:n:o:s:t: -- "$@") || exit 1
> +eval set -- "$rep"
> +while true; do
> +	case "$1" in
> +	-c|--cpu) initcpu="$2"; shift;;
> +	-d|--duration-us) duration="$2"; shift;;
> +	-h|--help) print_help; exit 0;;
> +	-i|--instance) lainstance="$2"; shift;;
> +	-k|--kernel-debug-dir) debugdir="$2"; shift;;
> +	-n|--num_samples) numsamples="$2"; shift;;
> +	-o|--output-dir) outputdir="$2"; shift;;
> +	-s|--sample_freq) samplefreq="$2"; shift;;
> +	-t|--trigger) triggerdat="$2"; shift;;
> +	--) break;;

> +	*) fail "error parsing command line: $*";;

$@ is better, actually one should never use $*.

> +	esac
> +	shift
> +done

...

Wondering, shouldn't there be a simple validation before starting that the
commands we need are present, such as zip?
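
A minimal sketch of such an up-front check, reusing the script's 'fail'
helper (the command list here is only an example):

	for cmd in zip taskset getopt; do
		command -v "$cmd" > /dev/null || fail "required command '$cmd' not found"
	done
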
Wolfram Sang Nov. 24, 2021, 8:12 p.m. UTC | #2
> I'd argue that debugfs isn't really the right interface for a useful
> tool that is this LA.

I have to disagree. This is a kind-of logic analyzer, it is sloppy. To
emphasize it is for debugging only, I think debugfs is the proper place
for it.

But thanks for finding it useful :)
Geert Uytterhoeven Nov. 30, 2021, 10:37 a.m. UTC | #3
Hi Wolfram,

On Tue, Nov 23, 2021 at 5:49 PM Wolfram Sang
<wsa+renesas@sang-engineering.com> wrote:
> This is a sloppy logic analyzer using GPIOs. It comes with a script to
> isolate a CPU for polling. While this is definitely not a production
> level analyzer, it can be a helpful first view when remote debugging.
> Read the documentation for details.
>
> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>

> --- a/drivers/gpio/Makefile
> +++ b/drivers/gpio/Makefile
> @@ -74,6 +74,7 @@ obj-$(CONFIG_GPIO_IT87)                       += gpio-it87.o
>  obj-$(CONFIG_GPIO_IXP4XX)              += gpio-ixp4xx.o
>  obj-$(CONFIG_GPIO_JANZ_TTL)            += gpio-janz-ttl.o
>  obj-$(CONFIG_GPIO_KEMPLD)              += gpio-kempld.o
> +obj-$(CONFIG_GPIO_SLOPPY_LOGIC_ANALYZER) += gpio-sloppy-logic-analyzer.o
>  obj-$(CONFIG_GPIO_LOGICVC)             += gpio-logicvc.o
>  obj-$(CONFIG_GPIO_LOONGSON1)           += gpio-loongson1.o
>  obj-$(CONFIG_GPIO_LOONGSON)            += gpio-loongson.o

(noticed while resolving a merge conflict with the out-of-tree
 gpio-litex)

Please preserve sort order, cfr. the (hilarious?) comment at the top
of the list:

    # Device drivers. Generally keep list sorted alphabetically
    obj-$(CONFIG_GPIO_REGMAP)       += gpio-regmap.o
    obj-$(CONFIG_GPIO_GENERIC)      += gpio-generic.o

    # directly supported by gpio-generic

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
Wolfram Sang Dec. 18, 2021, 9:40 a.m. UTC | #4
Hi Andy,

> > +Result is a .sr file to be consumed with PulseView or sigrok-cli from the free
> > +`sigrok`_ project. It is a zip file which also contains the binary sample data
> > +which may be consumed by other software. The filename is the logic analyzer
> > +instance name plus a since-epoch timestamp.
> > +
> > +.. _sigrok: https://sigrok.org/
> 
> Alas, yet another tool required... (Sad thoughts, since I recently installed
> the PicoScope software).

? For sure, another tool is required. Do you want the analyzer itself to
output pretty SVG files? :)

> 
> >     kgdb
> >     kselftest
> >     kunit/index
> 
> > +   gpio-sloppy-logic-analyzer
> 
> The above looks ordered; do we need some groups here or so?

No feedback from the doc-maintainers so far. Can easily be fixed
afterwards if needed.

> > +	mutex_lock(&priv->lock);
> 
> > +	if (priv->blob_dent) {
> 
> Redundant (i.e. duplicate).

Nope, it can be NULL if the memory allocation goes wrong.

> > +gpio_err:
> 
> A bit confusing name. What about
> 
> enable_irq_and_free_data:

Yes, fixed in v6.

> > +	char *meta = NULL;
> > +	unsigned int i, meta_len = 0;
> > +	int ret;
> 
> Perhaps
> 
> 	unsigned int i, meta_len = 0;
> 	char *meta = NULL;
> 	int ret;

I'd like to keep the pointers grouped together.

> > +	if (ret >= 0 && ret != priv->descs->ndescs)
> 
> > +		ret = -ENODATA;
> 
> Don't remember if we already discussed this error code, but data is there,
> it's not correct. EBADSLT? EBADR? ECHRNG?

In your V1 review, you suggested -ENODATA. I will pick yet another one,
but it really matters zero in practice.

> > +		meta_len += snprintf(meta + meta_len, add_len, "probe%02u=%s\n",
> > +				     i + 1, gpio_names[i]);
> 
> Do we really need the 'probe%02u=' part? It's redundant since it may be derived
> from the line number of the output (and it is always in [1..ndescs+1]).

It makes creating the .sr-file a lot easier. If you feel strongly about
it, then you can later remove it and also update the script, I'd say.

> > +	dev_info(dev, "initialized");
> 
> Is it useful?

For the third time, yes!

> > +	cat <<EOF
> 
> 	cat << EOF
> 
> is slightly easier to read.

I'll fix it.

> > +	[ -d $cpusetdir ] || mkdir $cpusetdir
> 
> `mkdir -p` and drop needless test.

It is not the same. I prefer to bail out if e.g. '/dev/' does not exist
rather than silently create it.

> > +	val=$((0x$oldmask & ~(1 << isol_cpu)))
> > +	newmask=$(printf "%x" $val)
> 
> Can be on one line (in a single expression).

Ok.

> `> /dev/null 2>&1` is idiomatic. And I think there is actually a subtle
> difference between the two.

What is the difference? Does it matter here?

> > +			[ "$chan" != "$elem" ] && [ "$chan" -le $max_chans ] || fail "Trigger syntax error: $elem"
> 
> No need to execute `test` twice:
> 
> 			[ "$chan" != "$elem" -a "$chan" -le $max_chans ] || fail "Trigger syntax error: $elem"

I read that '-a' and '-o' are deprecated. Dunno where but looking again
I found this: https://stackoverflow.com/questions/20449680/boolean-operators-a-o-in-bash

> 
> > +			bit=$((1 << (chan - 1)))
> > +			mask=$((mask | bit))
> > +			case $mode in
> > +				[hH]) val1=$((val1 | bit)); val2=$((val2 | bit));;
> > +				[fF]) val1=$((val1 | bit));;
> > +				[rR]) val2=$((val2 | bit));;
> > +			esac
> > +		done
> 
> > +		trigger_bindat="$trigger_bindat$(printf '\\%o\\%o' $mask $val1)"
> > +		[ $val1 -ne $val2 ] && trigger_bindat="$trigger_bindat$(printf '\\%o\\%o' $mask $val2)"
> 
> `printf` with arguments may be split to a separate helper function.

I think this is a micro-optimization, but feel free to change it later.

> > +	taskset "$1" echo 1 > "$lasysfsdir"/capture || fail "Capture error! Check kernel log"
> 
> Shouldn't this function setup signal TRAPs?

To do what?

> $@ is better, actually one should never use $*.

What difference does it make when expanding into a string?

> Wondering, shouldn't be a simple validator before start that we have commands
> present, such as zip?

This is what the variable 'neededcmds' is for...

Kind regards,

   Wolfram
Andy Shevchenko Dec. 19, 2021, 11:04 a.m. UTC | #5
On Sat, Dec 18, 2021 at 11:21 PM Wolfram Sang
<wsa+renesas@sang-engineering.com> wrote:

> > > +Result is a .sr file to be consumed with PulseView or sigrok-cli from the free
> > > +`sigrok`_ project. It is a zip file which also contains the binary sample data
> > > +which may be consumed by other software. The filename is the logic analyzer
> > > +instance name plus a since-epoch timestamp.
> > > +
> > > +.. _sigrok: https://sigrok.org/
> >
> > Alas, yet another tool required... (Sad thoughts, since I recently installed
> > the PicoScope software).
>
> ? For sure, another tool is required. Do you want the analyzer itself to
> output pretty SVG files? :)

I mean that there is similar functionality in different tools; for one
purpose you need one, for another purpose another, and there are no file
format converters available (as far as my shallow googling shows).

...

> > > +   if (ret >= 0 && ret != priv->descs->ndescs)
> >
> > > +           ret = -ENODATA;
> >
> > Don't remember if we already discussed this error code, but data is there,
> > it's not correct. EBADSLT? EBADR? ECHRNG?
>
> In your V1 review, you suggested -ENODATA. I will pick yet another one,
> but it really matters zero in practice.

Ah, okay, then choose the one you think fits most.

...

> > Do we really need the 'probe%02u=' part? It's redundant since it may be derived
> > from the line number of the output (and it is always in [1..ndescs+1]).
>
> It makes creating the .sr-file a lot easier. If you feel strongly about
> it, then you can later remove it and also update the script, I'd say.

No strong opinion, I don't know the Sigrok tool and its file format,
so I can't tell if it makes sense or doesn't.

...

> > `> /dev/null 2>&1` is idiomatic. And I think there is actually a subtle
> > difference between the two.
>
> What is the difference? Does it matter here?

I'm a bit lost in the context here, but ' > /dev/null 2>&1' means redirect
stdout to /dev/null, followed by redirecting stderr to stdout (which is
already redirected to /dev/null). The other construction might have side
effects IIRC.
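
For reference, the ordering only matters when one stream is duplicated from
the other via 2>&1; two direct redirections to files end up the same either
way:

	cmd > /dev/null 2>&1    # stdout to /dev/null, then stderr to where stdout now points
	cmd 2>&1 > /dev/null    # stderr to the *original* stdout (e.g. the tty), then stdout to /dev/null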

...

> > > +                   [ "$chan" != "$elem" ] && [ "$chan" -le $max_chans ] || fail "Trigger syntax error: $elem"
> >
> > No need to execute `test` twice:
> >
> >                       [ "$chan" != "$elem" -a "$chan" -le $max_chans ] || fail "Trigger syntax error: $elem"
>
> I read that '-a' and '-o' are deprecated. Dunno where but looking again
> I found this: https://stackoverflow.com/questions/20449680/boolean-operators-a-o-in-bash

The SO post talks about _bash_; your script is a plain shell one, right?
And for the record, I don't like bashisms in generic code like the
scripts we use with the Linux kernel.

...

> > > +   taskset "$1" echo 1 > "$lasysfsdir"/capture || fail "Capture error! Check kernel log"
> >
> > Shouldn't this function setup signal TRAPs?
>
> To do what?

To clean up the garbage it may leave behind in case of an interrupted run, no?
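
For example, a cleanup trap in the script could look roughly like this
(a sketch; $srtmp is the temporary directory created in do_capture):

	srtmp=$(mktemp -d)
	# remove the temp dir on exit and make signals go through the EXIT trap
	trap 'rm -rf "$srtmp"' EXIT
	trap 'exit 1' INT TERM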

...

> > $@ is better, actually one should never use $*.
>
> What difference does it make when expanding into a string?

The difference is in how "foo bar" (with double quotes!) will be
represented. In your case it will be translated as "foo" and "bar"; in
the case I'm suggesting it will be "foo bar".
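
A quick illustration of the difference inside double quotes:

	set -- "foo bar" baz
	for a in "$@"; do echo "<$a>"; done   # prints <foo bar> and <baz>
	for a in "$*"; do echo "<$a>"; done   # prints <foo bar baz>, joined into one word
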
Wolfram Sang Dec. 19, 2021, 1:09 p.m. UTC | #6
Hi Andy,

> I mean that there is similar functionality in different tools; for one
> purpose you need one, for another purpose another, and there are no file
> format converters available (as far as my shallow googling shows).

Yes, this is a truth I can't change. So I chose the Free Software
solution; at least the format is documented. Also, you could just try
sigrok and PulseView, it has a nice set of protocol decoders. GTKWave
should be able to use the binary data inside the .sr file IIRC. For other
software, you can at least write a converter because the format is open.
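
For instance, a capture can be fed straight into a protocol decoder with
sigrok-cli; the file name and the channel names below are just examples
and depend on the actual instance and probe-names:

	sigrok-cli -i la0-1639824000.sr -P i2c:scl=scl:sda=sda -A i2c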

> > In your V1 review, you suggested -ENODATA. I will pick yet another one,
> > but it really matters zero in practice.
> 
> Ah, okay, then choose the one you think fits most.

I took -EBADR now.

> > What is the difference? Does it matter here?
> 
> I'm a bit lost in the context here, but the ' > /dev/null 2>&1' means
> to redirect stdout to the /dev/null followed by redirecting stderr to
> stdout (which is redirected to /dev/null). The other construction
> might have side effects IIRC.

Andy, *if* there is a side effect, I will happily fix it. But "it might
have a side effect IIRC" leaves all the detective work to me and I am
not short of other action items. Especially because this is not a
critical path.

> > I read that '-a' and '-o' are deprecated. Dunno where but looking again
> > I found this: https://stackoverflow.com/questions/20449680/boolean-operators-a-o-in-bash
> 
> The SO talks about _bash_, your script is a plain Shell one, right?

It talks about being deprecated in POSIX, so quite the opposite of a
bashism, I'd say.

> > > > +   taskset "$1" echo 1 > "$lasysfsdir"/capture || fail "Capture error! Check kernel log"
> > >
> > > Shouldn't this function setup signal TRAPs?
> >
> > To do what?
> 
> To clean up the garbage it may leave in case of the interrupted run, no?

I don't see any? Which ones do you have in mind?

> > > $@ is better, actually one should never use $*.
> >
> > What difference does it make when expanding into a string?
> 
> The difference is on how the  "foo bar" (with double quotes!) will be
> represented. In your case it will be translated as "foo" and "bar", in
> the case I'm saying it will be "foo bar".

I know the difference very well. I was interested in what difference you
see when they get expanded into a string.

Happy hacking,

   Wolfram