diff mbox series

[v2,1/8] lib/printbuf: New data structure for heap-allocated strings

Message ID 20220421234837.3629927-7-kent.overstreet@gmail.com
State New
Headers show
Series [v2,1/8] lib/printbuf: New data structure for heap-allocated strings | expand

Commit Message

Kent Overstreet April 21, 2022, 11:48 p.m. UTC
This adds printbufs: simple heap-allocated strings meant for building up
structured messages, for logging/procfs/sysfs and elsewhere. They've
been heavily used in bcachefs for writing .to_text() functions/methods -
pretty printers, which has in turn greatly improved the overall quality
of error messages.

Basic usage is documented in include/linux/printbuf.h.

The next patches in the series are going to be using printbufs to
implement a .to_text() method for shrinkers, and improving OOM
reporting.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
---
 include/linux/printbuf.h | 164 +++++++++++++++++++++++
 lib/Makefile             |   2 +-
 lib/printbuf.c           | 274 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 439 insertions(+), 1 deletion(-)
 create mode 100644 include/linux/printbuf.h
 create mode 100644 lib/printbuf.c

Comments

James Bottomley April 23, 2022, 2:16 p.m. UTC | #1
[change of subject for an easy thread kill, since I'm sure most people
on cc don't want to follow this]
On Fri, 2022-04-22 at 17:13 -0400, Kent Overstreet wrote:
> On Fri, Apr 22, 2022 at 04:03:12PM -0400, James Bottomley wrote:
> > Hey, I didn't say that at all.  I said vendoring the facebook
> > reference implementation wouldn't work (it being 74k lines and us
> > using 300) but that facebook was doing the right thing for us with
> > zstd because they were maintaining the core code we needed, even if
> > we couldn't vendor it from their code base:
> > 
> > https://lore.kernel.org/rust-for-linux/ea85b3bce5f172dc73e2be8eb4dbd21fae826fa1.camel@HansenPartnership.com/
> > 
> > You were the one who said all that about facebook, while
> > incorrectly implying I said it first (which is an interesting
> > variation on the strawman fallacy):
> 
> Hey, sorry for picking on you James - I didn't mean to single you
> out.
> 
> Let me explain where I'm coming from - actually, this email I
> received puts it better than I could:
> 
> From: "John Ericson" <mail@johnericson.me>
> To: "Kent Overstreet" <kent.overstreet@gmail.com>
> Subject: Nice email on cargo crates in Linux
> 
> Hi. I saw your email linked from https://lwn.net/Articles/889924/.
> Nicely put!
> 
> I was involved in some of the allocating try_* function work to avoid
> nasty panics, and thus make sure Rust in Linux didn't have to
> reinvent the wheel redoing the alloc library. I very much share your
> goals but suspect this will happen in a sort of boil-the-frog way
> over time. The basic portability method of "incentivize users to
> write code that's *more* portable than what they currently
> need" is a culture shock for user space and kernel space alike. But
> if we get all our ducks in a row on the Rust side, eventually it
> should just be "too tempting", and the code sharing will happen for
> economic reasons (rather than ideological ones about trying to smash
> down the arbitrary dichotomies :) going it alone).
> 
> The larger issue is major projects closed off unto themselves such as
> Linux, PostgreSQL, or even the Glasgow Haskell Compiler (something I
> am currently writing a plan to untangle) have fallen too deep in
> Conway's law's embrace, and thus are full of bad habits like
> https://johno.com/composition-over-configuration is arguing against.
> Over time, I hope using external libraries will not only result in
> less duplicated work, but also "pry open" their architecture a bit,
> tamping down on crazy configurability and replacing it with more
> principled composition.
> 
> I fear this is all way too spicy to bring up on the LKML in 2022, at
> least coming from someone like me without any landed kernel patches
> under his belt, but I wanted to let you know other people have
> similar thoughts.
> 
> Cheers,
> 
> John
> 
> -------
> 
> Re: Rust in Linux, I get frustrated when I see senior people tell the
> new Rust people "don't do that" to things that are standard practice
> in the outside world.

You stripped the nuance of that.  I said many no_std crates could be
used in the kernel.  I also said that the async crate couldn't because
the rust compiler itself would have to support the kernel threading
model.

> I think Linus said recently that Rust in the kernel is something that
> could fail, and he's right - but if it fails, it won't just be the
> failure of the Rust people to do the required work, it'll be _our_
> failure too, a failure to work with them.

The big risk is that rust needs to adapt to the kernel environment. 
This isn't rust specific, llvm had similar issues as an alternative C
compiler.  I think rust in the kernel would fail if it were only the
rust kernel people asking.  Fortunately the pressure to support rust in
embedded leading to the rise in no_std crates is a force which can also
get rust in the kernel over the finish line because of the focus it
puts on getting the language and crates to adapt to non standard
environments.

[...]
> The kernel community has a lot of that going on here. Again, sorry to
> pick on you James, but I wanted to make the argument that - maybe the
> kernel _should_ be adopting a more structured way of using code from
> outside repositories, like cargo, or git submodules (except I've
> never had a positive experience with git submodules, so ignore that
> suggestion, unless I've just been using them wrong, in which case
> someone please teach me). To read you and Greg saying "nah, just
> copy code from other repos, it's fine" - it felt like being back in
> the old days when we were still trying to get people to use source
> control, and having that one older colleague who _insisted_ on not
> using source control of any kind, and that's a bit disheartening.

Even in C terms, the kernel is a nostdlib environment.  If a C project
has too much libc dependency it's not going to work directly in the
kernel, nor should it.  Let's look at zstd (which is pretty much a
nostdlib project) as a great example: the facebook people didn't
actually port the top of their tree (1.5) to the kernel, they
backported bug fixes to the 1.4 branch and made a special release
(1.4.10) just for us.  Why did they do this?  It was because the 1.5
version vastly increased stack use to the extent it would run off the
end of the limited kernel stack so couldn't be ported directly into the
kernel.  A lot of C libraries that are nostdlib have problems like this
as well (you can use recursion, but not in the kernel).  There's no
easy way of shimming environmental constraints like this.

The lesson: it is possible to make the core of a project mobile, but
only if you're aware of all the environmental constraints it will run
into as it gets ported.  The list of possible environments is huge:
kernel, embedded, industrial control ..., so naturally not every (or
more accurately hardly any) project wants to do this.

James
Kent Overstreet April 24, 2022, 8:36 p.m. UTC | #2
On Sat, Apr 23, 2022 at 10:16:37AM -0400, James Bottomley wrote:
> You stripped the nuance of that.  I said many no_std crates could be
> used in the kernel.  I also said that the async crate couldn't because
> the rust compiler itself would have to support the kernel threading
> model.

I just scanned through that thread and that's not what you said. What you said
was:

> The above is also the rust crate problem in miniature: the crates grow
> API features the kernel will never care about and importing them
> wholesale is going to take forever because of the internal kernel
> support issue.  In the end, to take rust async as an example, it will
> be much better to do for rust what we've done for zlib: take the core
> that can support the kernel threading model and reimplement that in the
> kernel crate.  The act of doing that will a) prove people care enough
> about the functionality and b) allow us to refine it nicely.
> 
> I also don't think rust would really want to import crates wholesale.
> The reason for no_std is that rust is trying to adapt to embedded
> environments, which the somewhat harsh constraints of the kernel is
> very similar to.

But maybe your position has changed somewhat? It sounds like you've been
arguing against just directly depending on foreign reposotories and for the
staus quo of just ad-hoc copying of code.

I'll help by stating my own position: I think we should be coming up with a
process for how dependencies on other git repositories are going to work,
something better than just cut and paste. Whether or not we vendorize code isn't
really that important, but I'd say that if we are vendorizing code and we're not
including entire sub-repositories (like cargo vendor does) we ought to still
make this a scripted process that takes as an input a list of files we're
pulling and a remote repository we're pulling from, and the file list and the
remote repo (and commit ID we're pulling from) should all be checked in.

I think using cargo would be _great_ because it would handle this part for us
(perhaps minus pulling of individual files? haven't checked) instead of
home-growing our own. However, I'd like something for C repositories too, and I
don't think we want to start depending on cargo for non Rust development... :)

But much more important that the technical details of how we import code is just
having good answers for people who aren't embedded in Linux kernel development
culture, so we aren't just telling them "no" by default because we haven't
thought about this stuff and having them walk away frustrated.

> > I think Linus said recently that Rust in the kernel is something that
> > could fail, and he's right - but if it fails, it won't just be the
> > failure of the Rust people to do the required work, it'll be _our_
> > failure too, a failure to work with them.
> 
> The big risk is that rust needs to adapt to the kernel environment. 
> This isn't rust specific, llvm had similar issues as an alternative C
> compiler.  I think rust in the kernel would fail if it were only the
> rust kernel people asking.  Fortunately the pressure to support rust in
> embedded leading to the rise in no_std crates is a force which can also
> get rust in the kernel over the finish line because of the focus it
> puts on getting the language and crates to adapt to non standard
> environments.

It's both! It's on all of us to make this work.

> > The kernel community has a lot of that going on here. Again, sorry to
> > pick on you James, but I wanted to make the argument that - maybe the
> > kernel _should_ be adopting a more structured way of using code from
> > outside repositories, like cargo, or git submodules (except I've
> > never had a positive experience with git submodules, so ignore that
> > suggestion, unless I've just been using them wrong, in which case
> > someone please teach me). To read you and Greg saying "nah, just
> > copy code from other repos, it's fine" - it felt like being back in
> > the old days when we were still trying to get people to use source
> > control, and having that one older colleague who _insisted_ on not
> > using source control of any kind, and that's a bit disheartening.
> 
> Even in C terms, the kernel is a nostdlib environment.  If a C project
> has too much libc dependency it's not going to work directly in the
> kernel, nor should it.  Let's look at zstd (which is pretty much a
> nostdlib project) as a great example: the facebook people didn't
> actually port the top of their tree (1.5) to the kernel, they
> backported bug fixes to the 1.4 branch and made a special release
> (1.4.10) just for us.  Why did they do this?  It was because the 1.5
> version vastly increased stack use to the extent it would run off the
> end of the limited kernel stack so couldn't be ported directly into the
> kernel.  A lot of C libraries that are nostdlib have problems like this
> as well (you can use recursion, but not in the kernel).  There's no
> easy way of shimming environmental constraints like this.

I wonder if we might have come up with a better solution if there'd been more
cross-project communication and less siloing. Small stacks aren't particular to
the kernel - it's definitely not unheard of to write userspace code where you
want to have a lot of small stacks (especially if you're doing some kind of
coroutine style threading; I've done stuff like this in the past) - and to me,
as someone who's been incrementing on and maintaining a codebase in active use
for 10 years, having multiple older versions in active use that need bugfixes
gives me cold shivers.

I wouldn't be surprised if at some point the zstd people walk back some of their
changes or make it configurable at some point :)
Joe Perches April 24, 2022, 11:46 p.m. UTC | #3
On Thu, 2022-04-21 at 19:48 -0400, Kent Overstreet wrote:
> This adds printbufs: simple heap-allocated strings meant for building up
> structured messages, for logging/procfs/sysfs and elsewhere. They've
> been heavily used in bcachefs for writing .to_text() functions/methods -
> pretty printers, which has in turn greatly improved the overall quality
> of error messages.
> 
> Basic usage is documented in include/linux/printbuf.h.

Given the maximum printk output is less than 1024 bytes, why should
this be allowed to be larger than that or larger than PAGE_SIZE?

> + * pr_human_readable_u64, pr_human_readable_s64: Print an integer with human
> + * readable units.

Why not extend vsprintf for this using something like %pH[8|16|32|64] 
or %pH[c|s|l|ll|uc|us|ul|ull] ?
Kent Overstreet April 25, 2022, 12:45 a.m. UTC | #4
On Sun, Apr 24, 2022 at 04:46:03PM -0700, Joe Perches wrote:
> On Thu, 2022-04-21 at 19:48 -0400, Kent Overstreet wrote:
> > This adds printbufs: simple heap-allocated strings meant for building up
> > structured messages, for logging/procfs/sysfs and elsewhere. They've
> > been heavily used in bcachefs for writing .to_text() functions/methods -
> > pretty printers, which has in turn greatly improved the overall quality
> > of error messages.
> > 
> > Basic usage is documented in include/linux/printbuf.h.
> 
> Given the maximum printk output is less than 1024 bytes, why should
> this be allowed to be larger than that or larger than PAGE_SIZE?

It's not just used there - in bcachefs I use it for sysfs & debugfs, as well as
userspace code for e.g. printing out the superblock (which gets pretty big when
including all the variable length sections).

> > + * pr_human_readable_u64, pr_human_readable_s64: Print an integer with human
> > + * readable units.
> 
> Why not extend vsprintf for this using something like %pH[8|16|32|64] 
> or %pH[c|s|l|ll|uc|us|ul|ull] ?

It'd be incompatible with userspace printf. I do like the way we extend printf
is the kernel, but I'm trying to make sure the code I write now is by default
portable between both kernel space and userspace. Glibc has its own mechanism
for extending printf, I've been meaning to look at that more and see if it'd be
possible to do something more generic and extensible that works for both.
Matthew Wilcox April 25, 2022, 2:44 a.m. UTC | #5
On Sun, Apr 24, 2022 at 04:46:03PM -0700, Joe Perches wrote:
> > + * pr_human_readable_u64, pr_human_readable_s64: Print an integer with human
> > + * readable units.
> 
> Why not extend vsprintf for this using something like %pH[8|16|32|64] 
> or %pH[c|s|l|ll|uc|us|ul|ull] ?

The %pX extension we have is _cute_, but ultimately a bad idea.  It
centralises all kinds of unrelated things in vsprintf.c, eg bdev_name()
and clock() and ip_addr_string().

Really, it's working around that we don't have something like Java's
StringBuffer (which I see both seq_buf and printbuf as attempting to
be).  So we have this primitive format string hack instead of exposing
methods like:

void dentry_string(struct strbuf *, struct dentry *);

as an example,
                if (unlikely(ino == dir->i_ino)) {
                        EXT4_ERROR_INODE(dir, "'%pd' linked to parent dir",
                                         dentry);
                        return ERR_PTR(-EFSCORRUPTED);
                }

would become something like:

		if (unlikely(ino == dir->i_ino)) {
			struct strbuf strbuf;
			strbuf_char(strbuf, '\'');
			dentry_string(strbuf, dentry);
			strbuf_string(strbuf, "' linked to parent dir");
			EXT4_ERROR_INODE(dir, strbuf);
			return ERR_PTR(-EFSCORRUPTED);
		}

which isn't terribly nice, but C has sucky syntax for string
construction.  Other languages have done this better, including Rust.
Kent Overstreet April 25, 2022, 4:19 a.m. UTC | #6
On Mon, Apr 25, 2022 at 03:44:34AM +0100, Matthew Wilcox wrote:
> On Sun, Apr 24, 2022 at 04:46:03PM -0700, Joe Perches wrote:
> > > + * pr_human_readable_u64, pr_human_readable_s64: Print an integer with human
> > > + * readable units.
> > 
> > Why not extend vsprintf for this using something like %pH[8|16|32|64] 
> > or %pH[c|s|l|ll|uc|us|ul|ull] ?
> 
> The %pX extension we have is _cute_, but ultimately a bad idea.  It
> centralises all kinds of unrelated things in vsprintf.c, eg bdev_name()
> and clock() and ip_addr_string().

And it's not remotely discoverable. I didn't realize we had bdev_name()
available as a format string until just now or I would've been using it!

> Really, it's working around that we don't have something like Java's
> StringBuffer (which I see both seq_buf and printbuf as attempting to
> be).  So we have this primitive format string hack instead of exposing
> methods like:
> 
> void dentry_string(struct strbuf *, struct dentry *);

Exactly!

> as an example,
>                 if (unlikely(ino == dir->i_ino)) {
>                         EXT4_ERROR_INODE(dir, "'%pd' linked to parent dir",
>                                          dentry);
>                         return ERR_PTR(-EFSCORRUPTED);
>                 }
> 
> would become something like:
> 
> 		if (unlikely(ino == dir->i_ino)) {
> 			struct strbuf strbuf;
> 			strbuf_char(strbuf, '\'');
> 			dentry_string(strbuf, dentry);
> 			strbuf_string(strbuf, "' linked to parent dir");
> 			EXT4_ERROR_INODE(dir, strbuf);
> 			return ERR_PTR(-EFSCORRUPTED);
> 		}
> 
> which isn't terribly nice, but C has sucky syntax for string
> construction.  Other languages have done this better, including Rust.

Over IRC just now you proposed "%p(%p)", dentry_name, dentry - I'm _really_
liking this idea, especially if we can get glibc to take it.

Then your ext4 example becomes just 

	if (unlikely(ino == dir->i_ino)) {
		EXT4_ERROR_INODE(dir, "'%p(%p)' linked to parent dir",
				 dentry_name, dentry);
		return ERR_PTR(-EFSCORRUPTED);
	}

And you can cscope to the pretty-printer! And dentry_name becomes just

void dentry_name(struct printbuf *out, struct dentry *dentry)
{
	...
}

Which is quite a bit simpler than the current definition.

Sweeeeeet.
Joe Perches April 25, 2022, 4:48 a.m. UTC | #7
On Mon, 2022-04-25 at 00:19 -0400, Kent Overstreet wrote:
> On Mon, Apr 25, 2022 at 03:44:34AM +0100, Matthew Wilcox wrote:
> > On Sun, Apr 24, 2022 at 04:46:03PM -0700, Joe Perches wrote:
> > > > + * pr_human_readable_u64, pr_human_readable_s64: Print an integer with human
> > > > + * readable units.
> > > 
> > > Why not extend vsprintf for this using something like %pH[8|16|32|64] 
> > > or %pH[c|s|l|ll|uc|us|ul|ull] ?
> > 
> > The %pX extension we have is _cute_, but ultimately a bad idea.  It
> > centralises all kinds of unrelated things in vsprintf.c, eg bdev_name()
> > and clock() and ip_addr_string().
> 
> And it's not remotely discoverable. I didn't realize we had bdev_name()
> available as a format string until just now or I would've been using it!

Documentation/core-api/printk-formats.rst
Kent Overstreet April 25, 2022, 4:59 a.m. UTC | #8
On Sun, Apr 24, 2022 at 09:48:58PM -0700, Joe Perches wrote:
> On Mon, 2022-04-25 at 00:19 -0400, Kent Overstreet wrote:
> > On Mon, Apr 25, 2022 at 03:44:34AM +0100, Matthew Wilcox wrote:
> > > On Sun, Apr 24, 2022 at 04:46:03PM -0700, Joe Perches wrote:
> > > > > + * pr_human_readable_u64, pr_human_readable_s64: Print an integer with human
> > > > > + * readable units.
> > > > 
> > > > Why not extend vsprintf for this using something like %pH[8|16|32|64] 
> > > > or %pH[c|s|l|ll|uc|us|ul|ull] ?
> > > 
> > > The %pX extension we have is _cute_, but ultimately a bad idea.  It
> > > centralises all kinds of unrelated things in vsprintf.c, eg bdev_name()
> > > and clock() and ip_addr_string().
> > 
> > And it's not remotely discoverable. I didn't realize we had bdev_name()
> > available as a format string until just now or I would've been using it!
> 
> Documentation/core-api/printk-formats.rst

Who has time for docs?
Joe Perches April 25, 2022, 5 a.m. UTC | #9
On Mon, 2022-04-25 at 00:59 -0400, Kent Overstreet wrote:
> On Sun, Apr 24, 2022 at 09:48:58PM -0700, Joe Perches wrote:
> > On Mon, 2022-04-25 at 00:19 -0400, Kent Overstreet wrote:
> > > On Mon, Apr 25, 2022 at 03:44:34AM +0100, Matthew Wilcox wrote:
> > > > On Sun, Apr 24, 2022 at 04:46:03PM -0700, Joe Perches wrote:
> > > > > > + * pr_human_readable_u64, pr_human_readable_s64: Print an integer with human
> > > > > > + * readable units.
> > > > > 
> > > > > Why not extend vsprintf for this using something like %pH[8|16|32|64] 
> > > > > or %pH[c|s|l|ll|uc|us|ul|ull] ?
> > > > 
> > > > The %pX extension we have is _cute_, but ultimately a bad idea.  It
> > > > centralises all kinds of unrelated things in vsprintf.c, eg bdev_name()
> > > > and clock() and ip_addr_string().
> > > 
> > > And it's not remotely discoverable. I didn't realize we had bdev_name()
> > > available as a format string until just now or I would've been using it!
> > 
> > Documentation/core-api/printk-formats.rst
> 
> Who has time for docs?

The same people that have time to reimplement the already implemented?
Kent Overstreet April 25, 2022, 5:56 a.m. UTC | #10
On Sun, Apr 24, 2022 at 10:00:32PM -0700, Joe Perches wrote:
> On Mon, 2022-04-25 at 00:59 -0400, Kent Overstreet wrote:
> > On Sun, Apr 24, 2022 at 09:48:58PM -0700, Joe Perches wrote:
> > > On Mon, 2022-04-25 at 00:19 -0400, Kent Overstreet wrote:
> > > > On Mon, Apr 25, 2022 at 03:44:34AM +0100, Matthew Wilcox wrote:
> > > > > On Sun, Apr 24, 2022 at 04:46:03PM -0700, Joe Perches wrote:
> > > > > > > + * pr_human_readable_u64, pr_human_readable_s64: Print an integer with human
> > > > > > > + * readable units.
> > > > > > 
> > > > > > Why not extend vsprintf for this using something like %pH[8|16|32|64] 
> > > > > > or %pH[c|s|l|ll|uc|us|ul|ull] ?
> > > > > 
> > > > > The %pX extension we have is _cute_, but ultimately a bad idea.  It
> > > > > centralises all kinds of unrelated things in vsprintf.c, eg bdev_name()
> > > > > and clock() and ip_addr_string().
> > > > 
> > > > And it's not remotely discoverable. I didn't realize we had bdev_name()
> > > > available as a format string until just now or I would've been using it!
> > > 
> > > Documentation/core-api/printk-formats.rst
> > 
> > Who has time for docs?
> 
> The same people that have time to reimplement the already implemented?

Touché :)
James Bottomley April 26, 2022, 2:22 a.m. UTC | #11
On Sun, 2022-04-24 at 16:36 -0400, Kent Overstreet wrote:
> On Sat, Apr 23, 2022 at 10:16:37AM -0400, James Bottomley wrote:
> > You stripped the nuance of that.  I said many no_std crates could
> > be used in the kernel.  I also said that the async crate couldn't
> > because the rust compiler itself would have to support the kernel
> > threading model.
> 
> I just scanned through that thread and that's not what you said. What
> you said was:
> 
> > The above is also the rust crate problem in miniature: the crates
> > grow API features the kernel will never care about and importing
> > them wholesale is going to take forever because of the internal
> > kernel support issue.  In the end, to take rust async as an
> > example, it will be much better to do for rust what we've done for
> > zlib: take the core that can support the kernel threading model and
> > reimplement that in the kernel crate.  The act of doing that will
> > a) prove people care enough about the functionality and b) allow us
> > to refine it nicely.
> > 
> > I also don't think rust would really want to import crates
> > wholesale.  The reason for no_std is that rust is trying to adapt
> > to embedded environments, which the somewhat harsh constraints of
> > the kernel is very similar to.
> 
> But maybe your position has changed somewhat?

I read those two as saying the same thing just with differing levels of
detail.

>  It sounds like you've been arguing against just directly depending
> on foreign reposotories and for the staus quo of just ad-hoc copying
> of code.

I don't think I've said anything generalised about that either way. 
However, I have noted that most of the external repositories I've
looked at can't (or shouldn't) be imported directly.  Perhaps if we
could find one that could be pulled in directly, we could use that to
drive a discussion of how.

> I'll help by stating my own position: I think we should be coming up
> with a process for how dependencies on other git repositories are
> going to work, something better than just cut and paste. Whether or
> not we vendorize code isn't really that important, but I'd say that
> if we are vendorizing code and we're notincluding entire sub-
> repositories (like cargo vendor does) we ought to still make this a
> scripted process that takes as an input a list of files we're
> pulling and a remote repository we're pulling from, and the file list
> and the remote repo (and commit ID we're pulling from) should all be
> checked in.

Do we have an example of an external piece of code we could do this to?

[...]
> > > The kernel community has a lot of that going on here. Again,
> > > sorry to pick on you James, but I wanted to make the argument
> > > that - maybe the kernel _should_ be adopting a more structured
> > > way of using code from outside repositories, like cargo, or git
> > > submodules (except I've never had a positive experience with git
> > > submodules, so ignore that suggestion, unless I've just been
> > > using them wrong, in which case someone please teach me). To read
> > > you and Greg saying "nah, just copy code from other repos, it's
> > > fine" - it felt like being back in the old days when we were
> > > still trying to get people to use source control, and having that
> > > one older colleague who _insisted_ on not using source control of
> > > any kind, and that's a bit disheartening.
> > 
> > Even in C terms, the kernel is a nostdlib environment.  If a C
> > project has too much libc dependency it's not going to work
> > directly in the kernel, nor should it.  Let's look at zstd (which
> > is pretty much a nostdlib project) as a great example: the facebook
> > people didn't actually port the top of their tree (1.5) to the
> > kernel, they backported bug fixes to the 1.4 branch and made a
> > special release (1.4.10) just for us.  Why did they do this?  It
> > was because the 1.5 version vastly increased stack use to the
> > extent it would run off the end of the limited kernel stack so
> > couldn't be ported directly into the kernel.  A lot of C libraries
> > that are nostdlib have problems like this as well (you can use
> > recursion, but not in the kernel).  There's no easy way of shimming
> > environmental constraints like this.
> 
> I wonder if we might have come up with a better solution if there'd
> been more cross-project communication and less siloing. Small stacks
> aren't particular to the kernel - it's definitely not unheard of to
> write userspace code where you want to have a lot of small stacks
> (especially if you're doing some kind of coroutine style threading;
> I've done stuff like this in the past)

But would you say that every piece of userspace code should reject
recursion and write for small stacks just in case it needs to be reused
in a more extreme environment?

I don't; I think it's fine there are loads of implementations that
would never work in our environment, because mostly there's no reason
to use them in the kernel (or another embedded environment).  I also
understand why people build for the standard userspace environment
first ... it reduces complexity and makes the construction way easier. 

>  - and to me, as someone who's been incrementing on and maintaining a
> codebase in active use for 10 years, having multiple older versions
> in active use that need bugfixes gives me cold shivers.
> 
> I wouldn't be surprised if at some point the zstd people walk back
> some of their changes or make it configurable at some point :)

Many things can happen in the future, but right at the moment I still
think zstd is working fine for both parties.

James
diff mbox series

Patch

diff --git a/include/linux/printbuf.h b/include/linux/printbuf.h
new file mode 100644
index 0000000000..276cdecf08
--- /dev/null
+++ b/include/linux/printbuf.h
@@ -0,0 +1,164 @@ 
+/* SPDX-License-Identifier: LGPL-2.1+ */
+/* Copyright (C) 2022 Kent Overstreet */
+
+#ifndef _LINUX_PRINTBUF_H
+#define _LINUX_PRINTBUF_H
+
+/*
+ * Printbufs: Simple heap allocated strings, with some features for structered
+ * formatting.
+ *
+ * This code has provisions for use in userspace, to aid in making other code
+ * portable between kernelspace and userspace.
+ *
+ * Basic example:
+ *
+ *   struct printbuf buf = PRINTBUF;
+ *
+ *   pr_buf(&buf, "foo=");
+ *   foo_to_text(&buf, foo);
+ *   printk("%s", buf.buf);
+ *   printbuf_exit(&buf);
+ *
+ * We can now write pretty printers instead of writing code that dumps
+ * everything to the kernel log buffer, and then those pretty-printers can be
+ * used by other code that outputs to kernel log, sysfs, debugfs, etc.
+ *
+ * Memory allocation: Outputing to a printbuf may allocate memory. This
+ * allocation is done with GFP_KERNEL, by default: use the newer
+ * memalloc_*_(save|restore) functions as needed.
+ *
+ * Since no equivalent yet exists for GFP_ATOMIC/GFP_NOWAIT, memory allocations
+ * will be done with GFP_ATOMIC if printbuf->atomic is nonzero.
+ *
+ * Memory allocation failures: We don't return errors directly, because on
+ * memory allocation failure we usually don't want to bail out and unwind - we
+ * want to print what we've got, on a best-effort basis. But code that does want
+ * to return -ENOMEM may check printbuf.allocation_failure.
+ *
+ * Indenting, tabstops:
+ *
+ * To aid is writing multi-line pretty printers spread across multiple
+ * functions, printbufs track the current indent level.
+ *
+ * pr_indent_push() and pr_indent_pop() increase and decrease the current indent
+ * level, respectively.
+ *
+ * To use tabstops, set printbuf->tabstops[]; they are in units of spaces, from
+ * start of line. Once set, pr_tab() will output spaces up to the next tabstop.
+ * pr_tab_rjust() will also advance the current line of text up to the next
+ * tabstop, but it does so by shifting text since the previous tabstop up to the
+ * next tabstop - right justifying it.
+ *
+ * Make sure you use pr_newline() instead of \n in the format string for indent
+ * level and tabstops to work corretly.
+ *
+ * Output units: printbuf->units exists to tell pretty-printers how to output
+ * numbers: a raw value (e.g. directly from a superblock field), as bytes, or as
+ * human readable bytes. pr_units() and pr_sectors() obey it.
+ *
+ * Other helpful functions:
+ *
+ * pr_human_readable_u64, pr_human_readable_s64: Print an integer with human
+ * readable units.
+ *
+ * pr_time(): for printing a time_t with strftime in userspace, prints as an
+ * integer number of seconds in the kernel.
+ *
+ * pr_string_option: Given an enumerated value and a string array with names for
+ * each option, prints out the enum names with the selected one indicated with
+ * square brackets.
+ *
+ * pr_bitflags: Given a bitflag and a string array with names for each bit,
+ * prints out the names of the selected bits.
+ */
+
+#include <linux/slab.h>
+#include <linux/string_helpers.h>
+
+enum printbuf_units {
+	PRINTBUF_UNITS_RAW,
+	PRINTBUF_UNITS_BYTES,
+	PRINTBUF_UNITS_HUMAN_READABLE,
+};
+
+struct printbuf {
+	char			*buf;
+	unsigned		size;
+	unsigned		pos;
+	unsigned		last_newline;
+	unsigned		last_field;
+	unsigned		indent;
+	enum printbuf_units	units:8;
+	/*
+	 * If nonzero, allocations will be done with GFP_ATOMIC:
+	 */
+	u8			atomic;
+	bool			allocation_failure:1;
+	/* SI units (10^3), or 2^10: */
+	enum string_size_units	human_readable_units:1;
+	u8			tabstop;
+	u8			tabstops[4];
+};
+
+#define PRINTBUF ((struct printbuf) { .human_readable_units = STRING_UNITS_2 })
+
+/**
+ * printbuf_exit - exit a printbuf, freeing memory it owns and poisoning it
+ * against accidental use.
+ */
+static inline void printbuf_exit(struct printbuf *buf)
+{
+	kfree(buf->buf);
+	buf->buf = ERR_PTR(-EINTR); /* poison value */
+}
+
+/**
+ * printbuf_reset - re-use a printbuf without freeing and re-initializing it:
+ */
+static inline void printbuf_reset(struct printbuf *buf)
+{
+	buf->pos		= 0;
+	buf->last_newline	= 0;
+	buf->last_field		= 0;
+	buf->indent		= 0;
+	buf->tabstop		= 0;
+	buf->allocation_failure	= 0;
+}
+
+/**
+ * printbuf_atomic_inc - mark as entering an atomic section
+ */
+static inline void printbuf_atomic_inc(struct printbuf *buf)
+{
+	buf->atomic++;
+}
+
+/**
+ * printbuf_atomic_inc - mark as leaving an atomic section
+ */
+static inline void printbuf_atomic_dec(struct printbuf *buf)
+{
+	buf->atomic--;
+}
+
+void pr_buf(struct printbuf *out, const char *fmt, ...)
+	__attribute__ ((format (printf, 2, 3)));
+
+void pr_char(struct printbuf *buf, char c);
+void pr_newline(struct printbuf *);
+void pr_indent_push(struct printbuf *, unsigned);
+void pr_indent_pop(struct printbuf *, unsigned);
+void pr_tab(struct printbuf *);
+void pr_tab_rjust(struct printbuf *);
+void pr_human_readable_u64(struct printbuf *, u64);
+void pr_human_readable_s64(struct printbuf *, s64);
+void pr_units(struct printbuf *, s64, s64);
+void pr_sectors(struct printbuf *, u64);
+void pr_time(struct printbuf *, u64);
+void pr_uuid(struct printbuf *, u8 *);
+void pr_string_option(struct printbuf *, const char * const list[], size_t);
+void pr_bitflags(struct printbuf *, const char * const list[], u64);
+const char *printbuf_str(const struct printbuf *);
+
+#endif /* _LINUX_PRINTBUF_H */
diff --git a/lib/Makefile b/lib/Makefile
index c588a126a3..31a3904eda 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -34,7 +34,7 @@  lib-y := ctype.o string.o vsprintf.o cmdline.o \
 	 is_single_threaded.o plist.o decompress.o kobject_uevent.o \
 	 earlycpio.o seq_buf.o siphash.o dec_and_lock.o \
 	 nmi_backtrace.o nodemask.o win_minmax.o memcat_p.o \
-	 buildid.o
+	 buildid.o printbuf.o
 
 lib-$(CONFIG_PRINTK) += dump_stack.o
 lib-$(CONFIG_SMP) += cpumask.o
diff --git a/lib/printbuf.c b/lib/printbuf.c
new file mode 100644
index 0000000000..e0dfa82cda
--- /dev/null
+++ b/lib/printbuf.c
@@ -0,0 +1,274 @@ 
+// SPDX-License-Identifier: LGPL-2.1+
+/* Copyright (C) 2022 Kent Overstreet */
+
+#ifdef __KERNEL__
+#include <linux/export.h>
+#include <linux/kernel.h>
+#else
+#define EXPORT_SYMBOL(x)
+#endif
+
+#include <linux/log2.h>
+#include <linux/printbuf.h>
+#include <linux/string_helpers.h>
+
+static inline size_t printbuf_remaining(struct printbuf *buf)
+{
+	return buf->size - buf->pos;
+}
+
+static inline size_t printbuf_linelen(struct printbuf *buf)
+{
+	return buf->pos - buf->last_newline;
+}
+
+static int printbuf_realloc(struct printbuf *out, unsigned extra)
+{
+	unsigned new_size;
+	char *buf;
+
+	if (out->pos + extra + 1 < out->size)
+		return 0;
+
+	new_size = roundup_pow_of_two(out->size + extra);
+	buf = krealloc(out->buf, new_size, !out->atomic ? GFP_KERNEL : GFP_ATOMIC);
+
+	if (!buf) {
+		out->allocation_failure = true;
+		return -ENOMEM;
+	}
+
+	out->buf	= buf;
+	out->size	= new_size;
+	return 0;
+}
+
+void pr_buf(struct printbuf *out, const char *fmt, ...)
+{
+	va_list args;
+	int len;
+
+	do {
+		va_start(args, fmt);
+		len = vsnprintf(out->buf + out->pos, printbuf_remaining(out), fmt, args);
+		va_end(args);
+	} while (len + 1 >= printbuf_remaining(out) &&
+		 !printbuf_realloc(out, len + 1));
+
+	len = min_t(size_t, len,
+		  printbuf_remaining(out) ? printbuf_remaining(out) - 1 : 0);
+	out->pos += len;
+}
+EXPORT_SYMBOL(pr_buf);
+
+void pr_char(struct printbuf *buf, char c)
+{
+	if (!printbuf_realloc(buf, 1)) {
+		buf->buf[buf->pos++] = c;
+		buf->buf[buf->pos] = 0;
+	}
+}
+EXPORT_SYMBOL(pr_char);
+
+void pr_newline(struct printbuf *buf)
+{
+	unsigned i;
+
+	pr_char(buf, '\n');
+
+	buf->last_newline	= buf->pos;
+
+	for (i = 0; i < buf->indent; i++)
+		pr_char(buf, ' ');
+
+	buf->last_field		= buf->pos;
+	buf->tabstop = 0;
+}
+EXPORT_SYMBOL(pr_newline);
+
+void pr_indent_push(struct printbuf *buf, unsigned spaces)
+{
+	buf->indent += spaces;
+	while (spaces--)
+		pr_char(buf, ' ');
+}
+EXPORT_SYMBOL(pr_indent_push);
+
+void pr_indent_pop(struct printbuf *buf, unsigned spaces)
+{
+	if (buf->last_newline + buf->indent == buf->pos) {
+		buf->pos -= spaces;
+		buf->buf[buf->pos] = 0;
+	}
+	buf->indent -= spaces;
+}
+EXPORT_SYMBOL(pr_indent_pop);
+
+void pr_tab(struct printbuf *buf)
+{
+	BUG_ON(buf->tabstop > ARRAY_SIZE(buf->tabstops));
+
+	while (printbuf_remaining(buf) > 1 &&
+	       printbuf_linelen(buf) < buf->tabstops[buf->tabstop])
+		pr_char(buf, ' ');
+
+	buf->last_field = buf->pos;
+	buf->tabstop++;
+}
+EXPORT_SYMBOL(pr_tab);
+
+void pr_tab_rjust(struct printbuf *buf)
+{
+	BUG_ON(buf->tabstop > ARRAY_SIZE(buf->tabstops));
+
+	if (printbuf_linelen(buf) < buf->tabstops[buf->tabstop]) {
+		unsigned move = buf->pos - buf->last_field;
+		unsigned shift = buf->tabstops[buf->tabstop] -
+			printbuf_linelen(buf);
+
+		printbuf_realloc(buf, shift);
+
+		if (buf->last_field + shift + 1 < buf->size) {
+			move = min(move, buf->size - 1 - buf->last_field - shift);
+
+			memmove(buf->buf + buf->last_field + shift,
+				buf->buf + buf->last_field,
+				move);
+			memset(buf->buf + buf->last_field, ' ', shift);
+			buf->pos += shift;
+			buf->buf[buf->pos] = 0;
+		}
+	}
+
+	buf->last_field = buf->pos;
+	buf->tabstop++;
+}
+EXPORT_SYMBOL(pr_tab_rjust);
+
+void pr_human_readable_u64(struct printbuf *buf, u64 v)
+{
+	printbuf_realloc(buf, 10);
+	string_get_size(v, 1, buf->human_readable_units, buf->buf + buf->pos,
+			printbuf_remaining(buf));
+	buf->pos += strlen(buf->buf + buf->pos);
+}
+EXPORT_SYMBOL(pr_human_readable_u64);
+
+void pr_human_readable_s64(struct printbuf *buf, s64 v)
+{
+	if (v < 0)
+		pr_char(buf, '-');
+	pr_human_readable_u64(buf, abs(v));
+}
+EXPORT_SYMBOL(pr_human_readable_s64);
+
+void pr_units(struct printbuf *out, s64 raw, s64 bytes)
+{
+	switch (out->units) {
+	case PRINTBUF_UNITS_RAW:
+		pr_buf(out, "%llu", raw);
+		break;
+	case PRINTBUF_UNITS_BYTES:
+		pr_buf(out, "%llu", bytes);
+		break;
+	case PRINTBUF_UNITS_HUMAN_READABLE:
+		pr_human_readable_s64(out, bytes);
+		break;
+	}
+}
+EXPORT_SYMBOL(pr_units);
+
+void pr_sectors(struct printbuf *out, u64 v)
+{
+	pr_units(out, v, v << 9);
+}
+EXPORT_SYMBOL(pr_sectors);
+
+#ifdef __KERNEL__
+
+void pr_time(struct printbuf *out, u64 time)
+{
+	pr_buf(out, "%llu", time);
+}
+EXPORT_SYMBOL(pr_time);
+
+void pr_uuid(struct printbuf *out, u8 *uuid)
+{
+	pr_buf(out, "%pUb", uuid);
+}
+EXPORT_SYMBOL(pr_uuid);
+
+#else
+
+#include <time.h>
+#include <uuid.h>
+
+void pr_time(struct printbuf *out, u64 _time)
+{
+	char time_str[64];
+	time_t time = _time;
+	struct tm *tm = localtime(&time);
+	size_t err = strftime(time_str, sizeof(time_str), "%c", tm);
+
+	if (!err)
+		pr_buf(out, "(formatting error)");
+	else
+		pr_buf(out, "%s", time_str);
+}
+
+void pr_uuid(struct printbuf *out, u8 *uuid)
+{
+	char uuid_str[40];
+
+	uuid_unparse_lower(uuid, uuid_str);
+	pr_buf(out, uuid_str);
+}
+
+#endif
+
+void pr_string_option(struct printbuf *out,
+		      const char * const list[],
+		      size_t selected)
+{
+	size_t i;
+
+	for (i = 0; list[i]; i++)
+		pr_buf(out, i == selected ? "[%s] " : "%s ", list[i]);
+}
+EXPORT_SYMBOL(pr_string_option);
+
+void pr_bitflags(struct printbuf *out,
+		 const char * const list[], u64 flags)
+{
+	unsigned bit, nr = 0;
+	bool first = true;
+
+	while (list[nr])
+		nr++;
+
+	while (flags && (bit = __ffs(flags)) < nr) {
+		if (!first)
+			pr_buf(out, ",");
+		first = false;
+		pr_buf(out, "%s", list[bit]);
+		flags ^= 1 << bit;
+	}
+}
+EXPORT_SYMBOL(pr_bitflags);
+
+/**
+ * printbuf_str - returns printbuf's buf as a C string, guaranteed to be null
+ * terminated
+ */
+const char *printbuf_str(const struct printbuf *buf)
+{
+	/*
+	 * If we've written to a printbuf then it's guaranteed to be a null
+	 * terminated string - but if we haven't, then we might not have
+	 * allocated a buffer at all:
+	 */
+	return buf->pos
+		? buf->buf
+		: "";
+}
+EXPORT_SYMBOL(printbuf_str);