[PATCH v5 00/32] Printbufs

Matthew Wilcox (Oracle) posted 32 patches 3 years, 8 months ago
arch/powerpc/kernel/process.c             |   16 +-
arch/powerpc/kernel/security.c            |   75 +-
arch/powerpc/platforms/pseries/papr_scm.c |   34 +-
arch/x86/kernel/cpu/resctrl/rdtgroup.c    |   16 +-
drivers/acpi/apei/erst-dbg.c              |    1 +
drivers/clk/tegra/clk-bpmp.c              |   21 +-
drivers/input/joystick/analog.c           |   23 +-
drivers/pci/p2pdma.c                      |   21 +-
fs/d_path.c                               |   35 +
include/linux/dcache.h                    |    1 +
include/linux/kernel.h                    |   12 +
include/linux/printbuf.h                  |  257 ++++
include/linux/seq_buf.h                   |  162 --
include/linux/string.h                    |    5 +
include/linux/string_helpers.h            |    8 +-
include/linux/trace_events.h              |    2 +-
include/linux/trace_seq.h                 |   17 +-
kernel/trace/trace.c                      |   45 +-
kernel/trace/trace_dynevent.c             |   34 +-
kernel/trace/trace_events_filter.c        |    2 +-
kernel/trace/trace_events_synth.c         |   53 +-
kernel/trace/trace_functions_graph.c      |    6 +-
kernel/trace/trace_kprobe.c               |    2 +-
kernel/trace/trace_seq.c                  |  111 +-
lib/Makefile                              |    4 +-
lib/hexdump.c                             |  246 ++--
lib/printbuf.c                            |  258 ++++
lib/seq_buf.c                             |  397 -----
lib/string_helpers.c                      |  224 +--
lib/test_hexdump.c                        |   30 +-
lib/test_printf.c                         |    6 -
lib/vsprintf.c                            | 1640 ++++++++++-----------
mm/memcontrol.c                           |   54 +-
tools/testing/nvdimm/test/ndtest.c        |   22 +-
34 files changed, 1855 insertions(+), 1985 deletions(-)
create mode 100644 include/linux/printbuf.h
delete mode 100644 include/linux/seq_buf.h
create mode 100644 lib/printbuf.c
delete mode 100644 lib/seq_buf.c
[PATCH v5 00/32] Printbufs
Posted by Matthew Wilcox (Oracle) 3 years, 8 months ago
[all this code is from Kent, I'm helping him out because he's having
trouble with git send-email and OAUTH]

This should (hopefully) be the final iteration of this patch series.

Previous discussion:
https://lore.kernel.org/linux-mm/20220620004233.3805-1-kent.overstreet@gmail.com/

Changes since v4:

 - %pf(%p) format extension has been dropped for now, since it turned out to be
   a bit controversial. I think we're going to want it, but I'm leaving it for a
   future patch series.
 - There was a small off by one error in the patch converting
   trace_events_synth. The original code using seq_buf had to calculate the size
   of the string to allocate; since printbufs have native support for heap
   allocating strings, the version of the patch in this series just uses that
   for a nice cleanup.

What are printbufs, and why?
============================

Printbufs are like the existing seq_buf, with some differences and new features:

 - Heap allocation (optional)
   
   Currently, we have no standard mechanism for printing to a heap-allocated
   string: code that needs to do this must calculate the size of the buffer to
   allocate, which is tedious and error prone. IOW, this is a major ergonomic
   improvement.

 - Indent level

   Any time we're printing structured multi-line data, proper indenting makes
   the output considerably more readable. Printbufs have state for the current
   indent level, controlled by printbuf_indent_add() and printbuf_indent_sub()
   and obeyed by prt_newline().

   In the future this may work with just \n, pending work to do this without
   performance regressions.

 - Tab stops

   Printing reports with nicely aligned columns is something we do a _lot_, and
   printbufs make this a lot easier. After setting tabstops (byte offsets from
   start of line), prt_tab() will output spaces up to the next tabstop, and
   pr_tab_rjust() will right justify output since the previous output against
   the next tabstap. 

   In the future this may work with just \t, pending work to do this without
   performance regressions.

Why a new data structure, instead of extending seq_buf?
=======================================================

Some semantic changes were required. The main ones were:

 - printbufs have different overflow semantics vs. seq_buf, to support
   snprintf() which is required to return the number of characters that were
   written even when an empty buffer is passed 

 - seq_buf has a readpos member which printbuf drops, since it's only for
   tracing.

Also, a number of the existing seq_buf users were converted to printbuf's heap
allocation functionality, replacing open coded string size calculations and
allocations.

Given all this, adding the new thing and converting existing users one by one
felt like the more incremental/bisectable approach.

Goals of this patch series:
===========================

 - A unified API for functions that output strings (i.e. pretty printers).

   Currently, we've got a bit of a mess of different calling conventions for
   functions that output strings. There's sprintf/snprintf style, seq_buf, and
   various other variations of passing raw character pointers around.

 - Low level helpers for building up strings (prt_char(), prt_str(), etc.).

   This replaces a _lot_ of low level pointer arithmetic in vsprintf.c, and
   elsewhere, with proper helpers.

 - Cleanup of pretty-printers in vsprintf.c

   Currently, we have a lot of pretty-printers in vsprintf.c invoked by various
   %p extensions. However, arguments to those pretty printers are passed as
   additional format string characters, instead of normal function arguments,
   and we have the code that decodes format strings mixed with pretty-printer
   code, which is not great for readability and makes those pretty-printers
   unnecessarily tied to vsprintf.c internal details.

   One of my goals is to give those pretty-printers normal function call
   interfaces, and separate out the format-string parsing: this patch series
   does some of the prep work. I've got more patches in progress in that vein,
   but this patch series is big enough as is so I'm leaving them for later.

Kent Overstreet (32):
  lib/printbuf: New data structure for printing strings
  lib/string_helpers: Convert string_escape_mem() to printbuf
  vsprintf: Convert to printbuf
  lib/hexdump: Convert to printbuf
  lib/string_helpers: string_get_size() now returns characters wrote
  lib/printbuf: Heap allocation
  lib/printbuf: Tabstops, indenting
  lib/printbuf: Unit specifiers
  vsprintf: Improve number()
  vsprintf: prt_u64_minwidth(), prt_u64()
  test_printf: Drop requirement that sprintf not write past nul
  vsprintf: Start consolidating printf_spec handling
  vsprintf: Refactor resource_string()
  vsprintf: Refactor fourcc_string()
  vsprintf: Refactor ip_addr_string()
  vsprintf: Refactor mac_address_string()
  vsprintf: time_and_date() no longer takes printf_spec
  vsprintf: flags_string() no longer takes printf_spec
  vsprintf: Refactor device_node_string, fwnode_string
  vsprintf: Refactor hex_string, bitmap_string_list, bitmap_string
  Input/joystick/analog: Convert from seq_buf -> printbuf
  mm/memcontrol.c: Convert to printbuf
  clk: tegra: bpmp: Convert to printbuf
  tools/testing/nvdimm: Convert to printbuf
  powerpc: Convert to printbuf
  x86/resctrl: Convert to printbuf
  PCI/P2PDMA: Convert to printbuf
  tracing: trace_events_synth: Convert to printbuf
  d_path: prt_path()
  ACPI/APEI: Add missing include
  tracing: Convert to printbuf
  Delete seq_buf

 arch/powerpc/kernel/process.c             |   16 +-
 arch/powerpc/kernel/security.c            |   75 +-
 arch/powerpc/platforms/pseries/papr_scm.c |   34 +-
 arch/x86/kernel/cpu/resctrl/rdtgroup.c    |   16 +-
 drivers/acpi/apei/erst-dbg.c              |    1 +
 drivers/clk/tegra/clk-bpmp.c              |   21 +-
 drivers/input/joystick/analog.c           |   23 +-
 drivers/pci/p2pdma.c                      |   21 +-
 fs/d_path.c                               |   35 +
 include/linux/dcache.h                    |    1 +
 include/linux/kernel.h                    |   12 +
 include/linux/printbuf.h                  |  257 ++++
 include/linux/seq_buf.h                   |  162 --
 include/linux/string.h                    |    5 +
 include/linux/string_helpers.h            |    8 +-
 include/linux/trace_events.h              |    2 +-
 include/linux/trace_seq.h                 |   17 +-
 kernel/trace/trace.c                      |   45 +-
 kernel/trace/trace_dynevent.c             |   34 +-
 kernel/trace/trace_events_filter.c        |    2 +-
 kernel/trace/trace_events_synth.c         |   53 +-
 kernel/trace/trace_functions_graph.c      |    6 +-
 kernel/trace/trace_kprobe.c               |    2 +-
 kernel/trace/trace_seq.c                  |  111 +-
 lib/Makefile                              |    4 +-
 lib/hexdump.c                             |  246 ++--
 lib/printbuf.c                            |  258 ++++
 lib/seq_buf.c                             |  397 -----
 lib/string_helpers.c                      |  224 +--
 lib/test_hexdump.c                        |   30 +-
 lib/test_printf.c                         |    6 -
 lib/vsprintf.c                            | 1640 ++++++++++-----------
 mm/memcontrol.c                           |   54 +-
 tools/testing/nvdimm/test/ndtest.c        |   22 +-
 34 files changed, 1855 insertions(+), 1985 deletions(-)
 create mode 100644 include/linux/printbuf.h
 delete mode 100644 include/linux/seq_buf.h
 create mode 100644 lib/printbuf.c
 delete mode 100644 lib/seq_buf.c

-- 
2.35.1
Re: [PATCH v5 00/32] Printbufs
Posted by Petr Mladek 3 years, 6 months ago
On Mon 2022-08-08 03:40:56, Matthew Wilcox (Oracle) wrote:
> [all this code is from Kent, I'm helping him out because he's having
> trouble with git send-email and OAUTH]
> 
> This should (hopefully) be the final iteration of this patch series.
> 
> Previous discussion:
> https://lore.kernel.org/linux-mm/20220620004233.3805-1-kent.overstreet@gmail.com/
> 
> Changes since v4:
> 
>  - %pf(%p) format extension has been dropped for now, since it turned out to be
>    a bit controversial. I think we're going to want it, but I'm leaving it for a
>    future patch series.
>  - There was a small off by one error in the patch converting
>    trace_events_synth. The original code using seq_buf had to calculate the size
>    of the string to allocate; since printbufs have native support for heap
>    allocating strings, the version of the patch in this series just uses that
>    for a nice cleanup.
> 
> What are printbufs, and why?
> ============================
> 
> Printbufs are like the existing seq_buf, with some differences and new features:
> 
>  - Heap allocation (optional)
>    
>    Currently, we have no standard mechanism for printing to a heap-allocated
>    string: code that needs to do this must calculate the size of the buffer to
>    allocate, which is tedious and error prone. IOW, this is a major ergonomic
>    improvement.
> 
>  - Indent level
> 
>    Any time we're printing structured multi-line data, proper indenting makes
>    the output considerably more readable. Printbufs have state for the current
>    indent level, controlled by printbuf_indent_add() and printbuf_indent_sub()
>    and obeyed by prt_newline().
> 
>    In the future this may work with just \n, pending work to do this without
>    performance regressions.
> 
>  - Tab stops
> 
>    Printing reports with nicely aligned columns is something we do a _lot_, and
>    printbufs make this a lot easier. After setting tabstops (byte offsets from
>    start of line), prt_tab() will output spaces up to the next tabstop, and
>    pr_tab_rjust() will right justify output since the previous output against
>    the next tabstap. 
> 
>    In the future this may work with just \t, pending work to do this without
>    performance regressions.

The more I think about this patchset the more doubts I have.

My first reaction was rather positive because

1. __prt_char(out, c)

	looks more safe and sane than repeating the following code
	pattern in vsprintf code

    if (buf < end)
	*buf = '0';
    ++buf;


2. printk("%pf", meaningful_function_name(ptr))

     is more user friendly, extendable. With a type check,
     it might be even more safe than

   printk("%pxyz", ptr);


3. pretty print API would allow to avoid some tricky use of
   temporary buffers in vsprintf code, for example, see
   https://lore.kernel.org/r/20220808024128.3219082-15-willy@infradead.org


What are the concerns?

It seems that the motivation for the pretty print is to allow
printing more pretty/fancy multi-line debug messages. The API
should be usable inside vsprintf.c and also outside.

1. vsprintf() is very important core API. It is used (not only for)
   when kernel wants to provide a human readable feedback, for
   example, via printk(), trace_printk(), procfs, sysfs, debugfs.

   If a bug in vsprintf() blocks printk()/trace_printk() then
   crash_dump might be the only way to debug the kernel.


2. My experience with printk() is that external APIs are big source of
   problems. Some of them are even solved by hacks, for example:

     + Console drivers have to check oops_in_progress before taking
       port->lock to prevent a deadlock.

     + printk_deferred() or printk_once() have to be used by code that
       might be called by printk().

    This patchset adds another external API.

    The %pf feature allows writing crazy callbacks called inside
    vsprintf()/printk() without any proper review and self-tests.

    People would want to extend the pretty print API for a
    profs/sysfs/debugfs use-case. They would take it easily.
    There is a lower risk in this case because only a particular
    file is affected, the API is called in a well defined context.
    But it looks like a call for problems if we allow to call
    the same complicated code also from vsprintf() or printk()
    in any context.


3. Features usually complicate the code immediately or later.

   For example, the record based ring buffer complicated the printk()
   code a lot. It introduced many regressions. Development of the
   lockless variant was a real challenge. And it did not solve
   everything. Peple still complain that pr_cont() is not reliable.
   Multi-line output might be mixed, ...

   The pretty print API is actually designed for multi-line output.
   But it will not help when used with printk() that uses 1k buffers
   internally. And adding support for "unlimited" printk() messages
   would be another challenge. It would bring new problems,
   for example, one printk() caller might block others for too long, ...


Any opinion is really appreciated.

Best Regards,
Petr


PS: I am sorry that I did not react on this patchset for months. I was
    overloaded, sick twice, and had vacation.

    A more detailed review of the patchset would help me to have
    stronger opinion about it. I am not clever and experienced enough
    to see all the consequences on the first look.
Re: [PATCH v5 00/32] Printbufs
Posted by Kent Overstreet 3 years, 6 months ago
On 9/23/22 03:10, Petr Mladek wrote:
> It seems that the motivation for the pretty print is to allow
> printing more pretty/fancy multi-line debug messages. The API
> should be usable inside vsprintf.c and also outside.
> 
> 1. vsprintf() is very important core API. It is used (not only for)
>     when kernel wants to provide a human readable feedback, for
>     example, via printk(), trace_printk(), procfs, sysfs, debugfs.
> 
>     If a bug in vsprintf() blocks printk()/trace_printk() then
>     crash_dump might be the only way to debug the kernel.

All the more reason for that code to be written with safe APIs, wouldn't 
you say?

> 2. My experience with printk() is that external APIs are big source of
>     problems. Some of them are even solved by hacks, for example:
> 
>       + Console drivers have to check oops_in_progress before taking
>         port->lock to prevent a deadlock.
> 
>       + printk_deferred() or printk_once() have to be used by code that
>         might be called by printk().
> 
>      This patchset adds another external API.

We should differentiate between sprintf and printk.

sprintf is _just_ for formatting strings, it doesn't have the same 
concerns as printk re: locking the output buffer - or any locking 
concerns whatsover, it's a pure function that doesn't mutate outside 
system state.

This patch series is about introducing a safe common API for sprintf and 
other code that outputs to strings, and I'd also note that even Linus 
agreed on the need for such an API, because it gets rid of the separate 
stack-allocated buffers that have been a problem in the past.

There's no impact on printk. The extra feature that printbufs add - heap 
allocation - isn't used by printk.


>      The %pf feature allows writing crazy callbacks called inside
>      vsprintf()/printk() without any proper review and self-tests.

I came to similar conclusions about %pf after the discussion with Linus; 
that part has been dropped from the patch set for now.

I do think we can do it safely in the future, and in the meantime it 
_may_ be worthwhile for use in a much more limited - probably not, I'd 
rather hold off and do it right. But since it hasn't been in the patch 
set the last few times I posted it, let's leave this out of the discussion.

(I think the way to make it safe will be to output to a heap allocated 
buffer instead of the printk buffer directly; however, that's not an 
option currently because we'd need to plumb through a gfp flags 
parameter and obviously we're not going to do that to printk. However, 
we're looking at switching from gfp flags to the memalloc_*() API, which 
would make this work).

>      People would want to extend the pretty print API for a
>      profs/sysfs/debugfs use-case. They would take it easily.
>      There is a lower risk in this case because only a particular
>      file is affected, the API is called in a well defined context.
>      But it looks like a call for problems if we allow to call
>      the same complicated code also from vsprintf() or printk()
>      in any context.

Since %pf isn't getting added, this isn't a concern.

_But_, introducing a common API so that we can use the same code for 
outputting to the system console, or procfs, or debugfs, is precisely 
what this is all about! We've got a fair amount of code duplication 
(some of which this patch series addresses; see the hexdump patches) 
because of this lack of a common API - often messily, with subtle 
differences for absolutely no reason.

We _want_ common, reusable, generic code.

Without %pf, to use printbuf code for outputting to the console the 
standard approach is going to be something like

struct printbuf buf = PRINTBUF;

prt_foo(&buf, foo);
printk("%s\n", buf.buf);
printbuf_exit(&buf);

Obviously, this isn't the greatest ergonomic-wise, %pf would be better 
in that respect. But this is still quite a bit better than completely 
duplicated code - one for outputting to a seq_buf, another set for 
seq_file, another for printk()...

I also have a patch series in the works for a printbuf based replacement 
for seq_file - eliminating that API fragmentation as well. I've just 
been waiting for movement on this patch series...

> 3. Features usually complicate the code immediately or later.
> 
>     For example, the record based ring buffer complicated the printk()
>     code a lot. It introduced many regressions. Development of the
>     lockless variant was a real challenge. And it did not solve
>     everything. Peple still complain that pr_cont() is not reliable.
>     Multi-line output might be mixed, ...

I'm not seeing the complications this patch series introduces in its 
current form. There's heap allocation, which replaces open-coded output 
buffer allocation that's currently done all over the place. And there's 
tabstops and indenting; given the amount of columnar data presented in 
procfs and debugfs, I think those are reasonable additions.

In return, you get a whole ton of code that was previously using raw 
pointer arithmetic converted to safe APIs, and we have a start on 
standardizing a lot of different code on a common API.

And, the sprintf code ends up a whole lot more readable and easier to 
work on.

>     The pretty print API is actually designed for multi-line output.
>     But it will not help when used with printk() that uses 1k buffers
>     internally. And adding support for "unlimited" printk() messages
>     would be another challenge. It would bring new problems,
>     for example, one printk() caller might block others for too long, ...

I'm not seeing that this is a real difficulty, since we're not doing %pf 
at this time.

  - add a printk_string_as_lines(), which takes a string buffer (not a 
format string) and calls printk() on it one line at a time. This would 
be an easy way around the current 1k buffer.
  - or, if the printk code wants to keep the output lines together and 
not have them interspersed with other output, that should be easier when 
we're presenting you the output already formatted and all you need to do 
is memcpy().

Basically, we've got broad agreement that calling arbitrary 
pretty-printers from printk() context is a bad idea, so since we won't 
be doing that this looks like a non-issue to me as well.

>      A more detailed review of the patchset would help me to have
>      stronger opinion about it. I am not clever and experienced enough
>      to see all the consequences on the first look.

There's really not a lot in the way of functional changes, since %pf has 
been dropped - it's just refactoring and converting to common APIs, and 
it was pretty mechanical as far as refactorings go.

Besides what we've talked about, the other thing I was doing that might 
be worth discussing was working on separating the pretty-printers in 
vsprintf.c from the format string parsing code. What I was seeing was a 
lot of code mixing parsing of the format string, and I think that code 
would be _much_ easier to read and work on with the format string 
parsing confined to pointer(), and having it call pretty-printers with a 
normal function call interface with well-typed arguments.

I have additional patches finishing this work for around half of the 
pretty-printers in vsprintf.c, but you were complaining about the patch 
series growing, so I haven't posted them yet...

Anyways, I hope this addresses some concerns.

Cheers,
Kent