It is now possible to mmap() a ring-buffer to stream its content. Add
some documentation and a code example.
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
diff --git a/Documentation/trace/index.rst b/Documentation/trace/index.rst
index 5092d6c13af5..0b300901fd75 100644
--- a/Documentation/trace/index.rst
+++ b/Documentation/trace/index.rst
@@ -29,6 +29,7 @@ Linux Tracing Technologies
timerlat-tracer
intel_th
ring-buffer-design
+ ring-buffer-map
stm
sys-t
coresight/index
diff --git a/Documentation/trace/ring-buffer-map.rst b/Documentation/trace/ring-buffer-map.rst
new file mode 100644
index 000000000000..2ba7b5339178
--- /dev/null
+++ b/Documentation/trace/ring-buffer-map.rst
@@ -0,0 +1,105 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==================================
+Tracefs ring-buffer memory mapping
+==================================
+
+:Author: Vincent Donnefort <vdonnefort@google.com>
+
+Overview
+========
+Tracefs ring-buffer memory map provides an efficient method to stream data
+as no memory copy is necessary. The application mapping the ring-buffer becomes
+then a consumer for that ring-buffer, in a similar fashion to trace_pipe.
+
+Memory mapping setup
+====================
+The mapping works with a mmap() of the trace_pipe_raw interface.
+
+The first system page of the mapping contains ring-buffer statistics and
+description. It is referred as the meta-page. One of the most important field of
+the meta-page is the reader. It contains the subbuf ID which can be safely read
+by the mapper (see ring-buffer-design.rst).
+
+The meta-page is followed by all the subbuf, ordered by ascendant ID. It is
+therefore effortless to know where the reader starts in the mapping:
+
+.. code-block:: c
+
+ reader_id = meta->reader->id;
+ reader_offset = meta->meta_page_size + reader_id * meta->subbuf_size;
+
+When the application is done with the current reader, it can get a new one using
+the trace_pipe_raw ioctl() TRACE_MMAP_IOCTL_GET_READER. This ioctl also updates
+the meta-page fields.
+
+Limitations
+===========
+When a mapping is in place on a Tracefs ring-buffer, it is not possible to
+either resize it (either by increasing the entire size of the ring-buffer or
+each subbuf). It is also not possible to use snapshot or splice.
+
+Concurrent readers (either another application mapping that ring-buffer or the
+kernel with trace_pipe) are allowed but not recommended. They will compete for
+the ring-buffer and the output is unpredictable.
+
+Example
+=======
+
+.. code-block:: c
+
+ #include <fcntl.h>
+ #include <stdio.h>
+ #include <stdlib.h>
+ #include <unistd.h>
+
+ #include <linux/trace_mmap.h>
+
+ #include <sys/mman.h>
+ #include <sys/ioctl.h>
+
+ #define TRACE_PIPE_RAW "/sys/kernel/tracing/per_cpu/cpu0/trace_pipe_raw"
+
+ int main(void)
+ {
+ int page_size = getpagesize(), fd, reader_id;
+ unsigned long meta_len, data_len;
+ struct trace_buffer_meta *meta;
+ void *map, *reader, *data;
+
+ fd = open(TRACE_PIPE_RAW, O_RDONLY);
+ if (fd < 0)
+ exit(EXIT_FAILURE);
+
+ map = mmap(NULL, page_size, PROT_READ, MAP_SHARED, fd, 0);
+ if (map == MAP_FAILED)
+ exit(EXIT_FAILURE);
+
+ meta = (struct trace_buffer_meta *)map;
+ meta_len = meta->meta_page_size;
+
+ printf("entries: %lu\n", meta->entries);
+ printf("overrun: %lu\n", meta->overrun);
+ printf("read: %lu\n", meta->read);
+ printf("subbufs_touched:%lu\n", meta->subbufs_touched);
+ printf("subbufs_lost: %lu\n", meta->subbufs_lost);
+ printf("subbufs_read: %lu\n", meta->subbufs_read);
+ printf("nr_subbufs: %u\n", meta->nr_subbufs);
+
+ data_len = meta->subbuf_size * meta->nr_subbufs;
+ data = mmap(NULL, data_len, PROT_READ, MAP_SHARED, fd, data_len);
+ if (data == MAP_FAILED)
+ exit(EXIT_FAILURE);
+
+ if (ioctl(fd, TRACE_MMAP_IOCTL_GET_READER) < 0)
+ exit(EXIT_FAILURE);
+
+ reader_id = meta->reader.id;
+ reader = data + meta->subbuf_size * reader_id;
+
+ munmap(data, data_len);
+ munmap(meta, meta_len);
+ close (fd);
+
+ return 0;
+ }
--
2.43.0.275.g3460e3d667-goog
Hi Vincent,
On Thu, 11 Jan 2024 16:17:11 +0000
Vincent Donnefort <vdonnefort@google.com> wrote:
> It is now possible to mmap() a ring-buffer to stream its content. Add
> some documentation and a code example.
>
> Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
>
> diff --git a/Documentation/trace/index.rst b/Documentation/trace/index.rst
> index 5092d6c13af5..0b300901fd75 100644
> --- a/Documentation/trace/index.rst
> +++ b/Documentation/trace/index.rst
> @@ -29,6 +29,7 @@ Linux Tracing Technologies
> timerlat-tracer
> intel_th
> ring-buffer-design
> + ring-buffer-map
> stm
> sys-t
> coresight/index
> diff --git a/Documentation/trace/ring-buffer-map.rst b/Documentation/trace/ring-buffer-map.rst
> new file mode 100644
> index 000000000000..2ba7b5339178
> --- /dev/null
> +++ b/Documentation/trace/ring-buffer-map.rst
> @@ -0,0 +1,105 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +==================================
> +Tracefs ring-buffer memory mapping
> +==================================
> +
> +:Author: Vincent Donnefort <vdonnefort@google.com>
> +
> +Overview
> +========
> +Tracefs ring-buffer memory map provides an efficient method to stream data
> +as no memory copy is necessary. The application mapping the ring-buffer becomes
> +then a consumer for that ring-buffer, in a similar fashion to trace_pipe.
> +
> +Memory mapping setup
> +====================
> +The mapping works with a mmap() of the trace_pipe_raw interface.
> +
> +The first system page of the mapping contains ring-buffer statistics and
> +description. It is referred as the meta-page. One of the most important field of
> +the meta-page is the reader. It contains the subbuf ID which can be safely read
> +by the mapper (see ring-buffer-design.rst).
> +
> +The meta-page is followed by all the subbuf, ordered by ascendant ID. It is
> +therefore effortless to know where the reader starts in the mapping:
> +
> +.. code-block:: c
> +
> + reader_id = meta->reader->id;
> + reader_offset = meta->meta_page_size + reader_id * meta->subbuf_size;
> +
> +When the application is done with the current reader, it can get a new one using
> +the trace_pipe_raw ioctl() TRACE_MMAP_IOCTL_GET_READER. This ioctl also updates
> +the meta-page fields.
> +
> +Limitations
> +===========
> +When a mapping is in place on a Tracefs ring-buffer, it is not possible to
> +either resize it (either by increasing the entire size of the ring-buffer or
> +each subbuf). It is also not possible to use snapshot or splice.
I've played with the sample code.
- "free_buffer" just doesn't work when the process is mmap the ring buffer.
- After mmap the buffers, when the snapshot took, the IOCTL returns an error.
OK, but I rather like to fail snapshot with -EBUSY when the buffer is mmaped.
> +
> +Concurrent readers (either another application mapping that ring-buffer or the
> +kernel with trace_pipe) are allowed but not recommended. They will compete for
> +the ring-buffer and the output is unpredictable.
> +
> +Example
> +=======
> +
> +.. code-block:: c
> +
> + #include <fcntl.h>
> + #include <stdio.h>
> + #include <stdlib.h>
> + #include <unistd.h>
> +
> + #include <linux/trace_mmap.h>
> +
> + #include <sys/mman.h>
> + #include <sys/ioctl.h>
> +
> + #define TRACE_PIPE_RAW "/sys/kernel/tracing/per_cpu/cpu0/trace_pipe_raw"
> +
> + int main(void)
> + {
> + int page_size = getpagesize(), fd, reader_id;
> + unsigned long meta_len, data_len;
> + struct trace_buffer_meta *meta;
> + void *map, *reader, *data;
> +
> + fd = open(TRACE_PIPE_RAW, O_RDONLY);
> + if (fd < 0)
> + exit(EXIT_FAILURE);
> +
> + map = mmap(NULL, page_size, PROT_READ, MAP_SHARED, fd, 0);
> + if (map == MAP_FAILED)
> + exit(EXIT_FAILURE);
> +
> + meta = (struct trace_buffer_meta *)map;
> + meta_len = meta->meta_page_size;
> +
> + printf("entries: %lu\n", meta->entries);
> + printf("overrun: %lu\n", meta->overrun);
> + printf("read: %lu\n", meta->read);
> + printf("subbufs_touched:%lu\n", meta->subbufs_touched);
> + printf("subbufs_lost: %lu\n", meta->subbufs_lost);
> + printf("subbufs_read: %lu\n", meta->subbufs_read);
> + printf("nr_subbufs: %u\n", meta->nr_subbufs);
> +
> + data_len = meta->subbuf_size * meta->nr_subbufs;
> + data = mmap(NULL, data_len, PROT_READ, MAP_SHARED, fd, data_len);
> + if (data == MAP_FAILED)
> + exit(EXIT_FAILURE);
> +
> + if (ioctl(fd, TRACE_MMAP_IOCTL_GET_READER) < 0)
> + exit(EXIT_FAILURE);
> +
> + reader_id = meta->reader.id;
> + reader = data + meta->subbuf_size * reader_id;
Also, this caused a bus error if I add below 2 lines here.
printf("reader_id: %d, addr: %p\n", reader_id, reader);
printf("read data head: %lx\n", *(unsigned long *)reader);
-----
/ # cd /sys/kernel/tracing/
/sys/kernel/tracing # echo 1 > events/enable
[ 17.941894] Scheduler tracepoints stat_sleep, stat_iowait, stat_blocked and stat_runtime require the kernel parameter schedstats=enable or kernel.sched_schedstats=1
/sys/kernel/tracing #
/sys/kernel/tracing # echo 1 > buffer_percent
/sys/kernel/tracing # /mnt/rbmap2
entries: 245291
overrun: 203741
read: 0
subbufs_touched:2041
subbufs_lost: 1688
subbufs_read: 0
nr_subbufs: 355
reader_id: 1, addr: 0x7f0cde51a000
Bus error
-----
Is this expected behavior? how can I read the ring buffer?
Thank you,
> +
> + munmap(data, data_len);
> + munmap(meta, meta_len);
> + close (fd);
> +
> + return 0;
> + }
> --
> 2.43.0.275.g3460e3d667-goog
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
On Sun, 14 Jan 2024 23:26:43 +0900
Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote:
> Hi Vincent,
>
> On Thu, 11 Jan 2024 16:17:11 +0000
> Vincent Donnefort <vdonnefort@google.com> wrote:
>
> > It is now possible to mmap() a ring-buffer to stream its content. Add
> > some documentation and a code example.
> >
> > Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
> >
> > diff --git a/Documentation/trace/index.rst b/Documentation/trace/index.rst
> > index 5092d6c13af5..0b300901fd75 100644
> > --- a/Documentation/trace/index.rst
> > +++ b/Documentation/trace/index.rst
> > @@ -29,6 +29,7 @@ Linux Tracing Technologies
> > timerlat-tracer
> > intel_th
> > ring-buffer-design
> > + ring-buffer-map
> > stm
> > sys-t
> > coresight/index
> > diff --git a/Documentation/trace/ring-buffer-map.rst b/Documentation/trace/ring-buffer-map.rst
> > new file mode 100644
> > index 000000000000..2ba7b5339178
> > --- /dev/null
> > +++ b/Documentation/trace/ring-buffer-map.rst
> > @@ -0,0 +1,105 @@
> > +.. SPDX-License-Identifier: GPL-2.0
> > +
> > +==================================
> > +Tracefs ring-buffer memory mapping
> > +==================================
> > +
> > +:Author: Vincent Donnefort <vdonnefort@google.com>
> > +
> > +Overview
> > +========
> > +Tracefs ring-buffer memory map provides an efficient method to stream data
> > +as no memory copy is necessary. The application mapping the ring-buffer becomes
> > +then a consumer for that ring-buffer, in a similar fashion to trace_pipe.
> > +
> > +Memory mapping setup
> > +====================
> > +The mapping works with a mmap() of the trace_pipe_raw interface.
> > +
> > +The first system page of the mapping contains ring-buffer statistics and
> > +description. It is referred as the meta-page. One of the most important field of
> > +the meta-page is the reader. It contains the subbuf ID which can be safely read
> > +by the mapper (see ring-buffer-design.rst).
> > +
> > +The meta-page is followed by all the subbuf, ordered by ascendant ID. It is
> > +therefore effortless to know where the reader starts in the mapping:
> > +
> > +.. code-block:: c
> > +
> > + reader_id = meta->reader->id;
> > + reader_offset = meta->meta_page_size + reader_id * meta->subbuf_size;
> > +
> > +When the application is done with the current reader, it can get a new one using
> > +the trace_pipe_raw ioctl() TRACE_MMAP_IOCTL_GET_READER. This ioctl also updates
> > +the meta-page fields.
> > +
> > +Limitations
> > +===========
> > +When a mapping is in place on a Tracefs ring-buffer, it is not possible to
> > +either resize it (either by increasing the entire size of the ring-buffer or
> > +each subbuf). It is also not possible to use snapshot or splice.
>
> I've played with the sample code.
>
> - "free_buffer" just doesn't work when the process is mmap the ring buffer.
> - After mmap the buffers, when the snapshot took, the IOCTL returns an error.
>
> OK, but I rather like to fail snapshot with -EBUSY when the buffer is mmaped.
>
> > +
> > +Concurrent readers (either another application mapping that ring-buffer or the
> > +kernel with trace_pipe) are allowed but not recommended. They will compete for
> > +the ring-buffer and the output is unpredictable.
> > +
> > +Example
> > +=======
> > +
> > +.. code-block:: c
> > +
> > + #include <fcntl.h>
> > + #include <stdio.h>
> > + #include <stdlib.h>
> > + #include <unistd.h>
> > +
> > + #include <linux/trace_mmap.h>
> > +
> > + #include <sys/mman.h>
> > + #include <sys/ioctl.h>
> > +
> > + #define TRACE_PIPE_RAW "/sys/kernel/tracing/per_cpu/cpu0/trace_pipe_raw"
> > +
> > + int main(void)
> > + {
> > + int page_size = getpagesize(), fd, reader_id;
> > + unsigned long meta_len, data_len;
> > + struct trace_buffer_meta *meta;
> > + void *map, *reader, *data;
> > +
> > + fd = open(TRACE_PIPE_RAW, O_RDONLY);
> > + if (fd < 0)
> > + exit(EXIT_FAILURE);
> > +
> > + map = mmap(NULL, page_size, PROT_READ, MAP_SHARED, fd, 0);
> > + if (map == MAP_FAILED)
> > + exit(EXIT_FAILURE);
> > +
> > + meta = (struct trace_buffer_meta *)map;
> > + meta_len = meta->meta_page_size;
> > +
> > + printf("entries: %lu\n", meta->entries);
> > + printf("overrun: %lu\n", meta->overrun);
> > + printf("read: %lu\n", meta->read);
> > + printf("subbufs_touched:%lu\n", meta->subbufs_touched);
> > + printf("subbufs_lost: %lu\n", meta->subbufs_lost);
> > + printf("subbufs_read: %lu\n", meta->subbufs_read);
> > + printf("nr_subbufs: %u\n", meta->nr_subbufs);
> > +
> > + data_len = meta->subbuf_size * meta->nr_subbufs;
> > + data = mmap(NULL, data_len, PROT_READ, MAP_SHARED, fd, data_len);
The above is buggy. It should be:
data = mmap(NULL, data_len, PROT_READ, MAP_SHARED, fd, meta_len);
The last parameter is where to start the mapping from, which is just
after the meta page. The code is currently starting the map far away
from that.
-- Steve
> > + if (data == MAP_FAILED)
> > + exit(EXIT_FAILURE);
> > +
> > + if (ioctl(fd, TRACE_MMAP_IOCTL_GET_READER) < 0)
> > + exit(EXIT_FAILURE);
> > +
> > + reader_id = meta->reader.id;
> > + reader = data + meta->subbuf_size * reader_id;
>
> Also, this caused a bus error if I add below 2 lines here.
>
> printf("reader_id: %d, addr: %p\n", reader_id, reader);
> printf("read data head: %lx\n", *(unsigned long *)reader);
>
> -----
> / # cd /sys/kernel/tracing/
> /sys/kernel/tracing # echo 1 > events/enable
> [ 17.941894] Scheduler tracepoints stat_sleep, stat_iowait, stat_blocked and stat_runtime require the kernel parameter schedstats=enable or kernel.sched_schedstats=1
> /sys/kernel/tracing #
> /sys/kernel/tracing # echo 1 > buffer_percent
> /sys/kernel/tracing # /mnt/rbmap2
> entries: 245291
> overrun: 203741
> read: 0
> subbufs_touched:2041
> subbufs_lost: 1688
> subbufs_read: 0
> nr_subbufs: 355
> reader_id: 1, addr: 0x7f0cde51a000
> Bus error
> -----
>
> Is this expected behavior? how can I read the ring buffer?
>
> Thank you,
>
> > +
> > + munmap(data, data_len);
> > + munmap(meta, meta_len);
> > + close (fd);
> > +
> > + return 0;
> > + }
> > --
> > 2.43.0.275.g3460e3d667-goog
> >
>
>
On Sun, 14 Jan 2024 11:23:24 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:
> On Sun, 14 Jan 2024 23:26:43 +0900
> Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote:
>
> > Hi Vincent,
> >
> > On Thu, 11 Jan 2024 16:17:11 +0000
> > Vincent Donnefort <vdonnefort@google.com> wrote:
> >
> > > It is now possible to mmap() a ring-buffer to stream its content. Add
> > > some documentation and a code example.
> > >
> > > Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
> > >
> > > diff --git a/Documentation/trace/index.rst b/Documentation/trace/index.rst
> > > index 5092d6c13af5..0b300901fd75 100644
> > > --- a/Documentation/trace/index.rst
> > > +++ b/Documentation/trace/index.rst
> > > @@ -29,6 +29,7 @@ Linux Tracing Technologies
> > > timerlat-tracer
> > > intel_th
> > > ring-buffer-design
> > > + ring-buffer-map
> > > stm
> > > sys-t
> > > coresight/index
> > > diff --git a/Documentation/trace/ring-buffer-map.rst b/Documentation/trace/ring-buffer-map.rst
> > > new file mode 100644
> > > index 000000000000..2ba7b5339178
> > > --- /dev/null
> > > +++ b/Documentation/trace/ring-buffer-map.rst
> > > @@ -0,0 +1,105 @@
> > > +.. SPDX-License-Identifier: GPL-2.0
> > > +
> > > +==================================
> > > +Tracefs ring-buffer memory mapping
> > > +==================================
> > > +
> > > +:Author: Vincent Donnefort <vdonnefort@google.com>
> > > +
> > > +Overview
> > > +========
> > > +Tracefs ring-buffer memory map provides an efficient method to stream data
> > > +as no memory copy is necessary. The application mapping the ring-buffer becomes
> > > +then a consumer for that ring-buffer, in a similar fashion to trace_pipe.
> > > +
> > > +Memory mapping setup
> > > +====================
> > > +The mapping works with a mmap() of the trace_pipe_raw interface.
> > > +
> > > +The first system page of the mapping contains ring-buffer statistics and
> > > +description. It is referred as the meta-page. One of the most important field of
> > > +the meta-page is the reader. It contains the subbuf ID which can be safely read
> > > +by the mapper (see ring-buffer-design.rst).
> > > +
> > > +The meta-page is followed by all the subbuf, ordered by ascendant ID. It is
> > > +therefore effortless to know where the reader starts in the mapping:
> > > +
> > > +.. code-block:: c
> > > +
> > > + reader_id = meta->reader->id;
> > > + reader_offset = meta->meta_page_size + reader_id * meta->subbuf_size;
> > > +
> > > +When the application is done with the current reader, it can get a new one using
> > > +the trace_pipe_raw ioctl() TRACE_MMAP_IOCTL_GET_READER. This ioctl also updates
> > > +the meta-page fields.
> > > +
> > > +Limitations
> > > +===========
> > > +When a mapping is in place on a Tracefs ring-buffer, it is not possible to
> > > +either resize it (either by increasing the entire size of the ring-buffer or
> > > +each subbuf). It is also not possible to use snapshot or splice.
> >
> > I've played with the sample code.
> >
> > - "free_buffer" just doesn't work when the process is mmap the ring buffer.
> > - After mmap the buffers, when the snapshot took, the IOCTL returns an error.
> >
> > OK, but I rather like to fail snapshot with -EBUSY when the buffer is mmaped.
> >
> > > +
> > > +Concurrent readers (either another application mapping that ring-buffer or the
> > > +kernel with trace_pipe) are allowed but not recommended. They will compete for
> > > +the ring-buffer and the output is unpredictable.
> > > +
> > > +Example
> > > +=======
> > > +
> > > +.. code-block:: c
> > > +
> > > + #include <fcntl.h>
> > > + #include <stdio.h>
> > > + #include <stdlib.h>
> > > + #include <unistd.h>
> > > +
> > > + #include <linux/trace_mmap.h>
> > > +
> > > + #include <sys/mman.h>
> > > + #include <sys/ioctl.h>
> > > +
> > > + #define TRACE_PIPE_RAW "/sys/kernel/tracing/per_cpu/cpu0/trace_pipe_raw"
> > > +
> > > + int main(void)
> > > + {
> > > + int page_size = getpagesize(), fd, reader_id;
> > > + unsigned long meta_len, data_len;
> > > + struct trace_buffer_meta *meta;
> > > + void *map, *reader, *data;
> > > +
> > > + fd = open(TRACE_PIPE_RAW, O_RDONLY);
> > > + if (fd < 0)
> > > + exit(EXIT_FAILURE);
> > > +
> > > + map = mmap(NULL, page_size, PROT_READ, MAP_SHARED, fd, 0);
> > > + if (map == MAP_FAILED)
> > > + exit(EXIT_FAILURE);
> > > +
> > > + meta = (struct trace_buffer_meta *)map;
> > > + meta_len = meta->meta_page_size;
> > > +
> > > + printf("entries: %lu\n", meta->entries);
> > > + printf("overrun: %lu\n", meta->overrun);
> > > + printf("read: %lu\n", meta->read);
> > > + printf("subbufs_touched:%lu\n", meta->subbufs_touched);
> > > + printf("subbufs_lost: %lu\n", meta->subbufs_lost);
> > > + printf("subbufs_read: %lu\n", meta->subbufs_read);
> > > + printf("nr_subbufs: %u\n", meta->nr_subbufs);
> > > +
> > > + data_len = meta->subbuf_size * meta->nr_subbufs;
> > > + data = mmap(NULL, data_len, PROT_READ, MAP_SHARED, fd, data_len);
>
> The above is buggy. It should be:
>
> data = mmap(NULL, data_len, PROT_READ, MAP_SHARED, fd, meta_len);
>
> The last parameter is where to start the mapping from, which is just
> after the meta page. The code is currently starting the map far away
> from that.
Ah, indeed! I confirmed that fixed the bus error.
Thank you!
>
> -- Steve
>
> > > + if (data == MAP_FAILED)
> > > + exit(EXIT_FAILURE);
> > > +
> > > + if (ioctl(fd, TRACE_MMAP_IOCTL_GET_READER) < 0)
> > > + exit(EXIT_FAILURE);
> > > +
> > > + reader_id = meta->reader.id;
> > > + reader = data + meta->subbuf_size * reader_id;
> >
> > Also, this caused a bus error if I add below 2 lines here.
> >
> > printf("reader_id: %d, addr: %p\n", reader_id, reader);
> > printf("read data head: %lx\n", *(unsigned long *)reader);
> >
> > -----
> > / # cd /sys/kernel/tracing/
> > /sys/kernel/tracing # echo 1 > events/enable
> > [ 17.941894] Scheduler tracepoints stat_sleep, stat_iowait, stat_blocked and stat_runtime require the kernel parameter schedstats=enable or kernel.sched_schedstats=1
> > /sys/kernel/tracing #
> > /sys/kernel/tracing # echo 1 > buffer_percent
> > /sys/kernel/tracing # /mnt/rbmap2
> > entries: 245291
> > overrun: 203741
> > read: 0
> > subbufs_touched:2041
> > subbufs_lost: 1688
> > subbufs_read: 0
> > nr_subbufs: 355
> > reader_id: 1, addr: 0x7f0cde51a000
> > Bus error
> > -----
> >
> > Is this expected behavior? how can I read the ring buffer?
> >
> > Thank you,
> >
> > > +
> > > + munmap(data, data_len);
> > > + munmap(meta, meta_len);
> > > + close (fd);
> > > +
> > > + return 0;
> > > + }
> > > --
> > > 2.43.0.275.g3460e3d667-goog
> > >
> >
> >
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
On Thu, 11 Jan 2024 16:17:11 +0000
Vincent Donnefort <vdonnefort@google.com> wrote:
> It is now possible to mmap() a ring-buffer to stream its content. Add
> some documentation and a code example.
>
> Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
>
> diff --git a/Documentation/trace/index.rst b/Documentation/trace/index.rst
> index 5092d6c13af5..0b300901fd75 100644
> --- a/Documentation/trace/index.rst
> +++ b/Documentation/trace/index.rst
> @@ -29,6 +29,7 @@ Linux Tracing Technologies
> timerlat-tracer
> intel_th
> ring-buffer-design
> + ring-buffer-map
> stm
> sys-t
> coresight/index
> diff --git a/Documentation/trace/ring-buffer-map.rst b/Documentation/trace/ring-buffer-map.rst
> new file mode 100644
> index 000000000000..2ba7b5339178
> --- /dev/null
> +++ b/Documentation/trace/ring-buffer-map.rst
> @@ -0,0 +1,105 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +==================================
> +Tracefs ring-buffer memory mapping
> +==================================
> +
> +:Author: Vincent Donnefort <vdonnefort@google.com>
> +
> +Overview
> +========
> +Tracefs ring-buffer memory map provides an efficient method to stream data
> +as no memory copy is necessary. The application mapping the ring-buffer becomes
> +then a consumer for that ring-buffer, in a similar fashion to trace_pipe.
> +
> +Memory mapping setup
> +====================
> +The mapping works with a mmap() of the trace_pipe_raw interface.
> +
> +The first system page of the mapping contains ring-buffer statistics and
> +description. It is referred as the meta-page. One of the most important field of
> +the meta-page is the reader. It contains the subbuf ID which can be safely read
> +by the mapper (see ring-buffer-design.rst).
> +
> +The meta-page is followed by all the subbuf, ordered by ascendant ID. It is
> +therefore effortless to know where the reader starts in the mapping:
> +
> +.. code-block:: c
> +
> + reader_id = meta->reader->id;
> + reader_offset = meta->meta_page_size + reader_id * meta->subbuf_size;
> +
> +When the application is done with the current reader, it can get a new one using
> +the trace_pipe_raw ioctl() TRACE_MMAP_IOCTL_GET_READER. This ioctl also updates
> +the meta-page fields.
> +
> +Limitations
> +===========
> +When a mapping is in place on a Tracefs ring-buffer, it is not possible to
> +either resize it (either by increasing the entire size of the ring-buffer or
> +each subbuf). It is also not possible to use snapshot or splice.
> +
> +Concurrent readers (either another application mapping that ring-buffer or the
> +kernel with trace_pipe) are allowed but not recommended. They will compete for
> +the ring-buffer and the output is unpredictable.
> +
> +Example
> +=======
> +
> +.. code-block:: c
> +
> + #include <fcntl.h>
> + #include <stdio.h>
> + #include <stdlib.h>
> + #include <unistd.h>
> +
> + #include <linux/trace_mmap.h>
> +
> + #include <sys/mman.h>
> + #include <sys/ioctl.h>
> +
> + #define TRACE_PIPE_RAW "/sys/kernel/tracing/per_cpu/cpu0/trace_pipe_raw"
> +
> + int main(void)
> + {
> + int page_size = getpagesize(), fd, reader_id;
> + unsigned long meta_len, data_len;
> + struct trace_buffer_meta *meta;
> + void *map, *reader, *data;
nit: this example code has a compile warning.
rbmap.c: In function ‘main’:
rbmap.c:18:21: warning: variable ‘reader’ set but not used [-Wunused-but-set-variable]
18 | void *map, *reader, *data;
| ^~~~~~
> +
> + fd = open(TRACE_PIPE_RAW, O_RDONLY);
> + if (fd < 0)
> + exit(EXIT_FAILURE);
> +
> + map = mmap(NULL, page_size, PROT_READ, MAP_SHARED, fd, 0);
> + if (map == MAP_FAILED)
> + exit(EXIT_FAILURE);
> +
> + meta = (struct trace_buffer_meta *)map;
> + meta_len = meta->meta_page_size;
> +
> + printf("entries: %lu\n", meta->entries);
> + printf("overrun: %lu\n", meta->overrun);
> + printf("read: %lu\n", meta->read);
> + printf("subbufs_touched:%lu\n", meta->subbufs_touched);
> + printf("subbufs_lost: %lu\n", meta->subbufs_lost);
> + printf("subbufs_read: %lu\n", meta->subbufs_read);
> + printf("nr_subbufs: %u\n", meta->nr_subbufs);
> +
> + data_len = meta->subbuf_size * meta->nr_subbufs;
> + data = mmap(NULL, data_len, PROT_READ, MAP_SHARED, fd, data_len);
> + if (data == MAP_FAILED)
> + exit(EXIT_FAILURE);
> +
> + if (ioctl(fd, TRACE_MMAP_IOCTL_GET_READER) < 0)
> + exit(EXIT_FAILURE);
> +
> + reader_id = meta->reader.id;
> + reader = data + meta->subbuf_size * reader_id;
So here, maybe you need;
printf("Current read sub-buffer address: %p\n", reader);
Thank you,
> +
> + munmap(data, data_len);
> + munmap(meta, meta_len);
> + close (fd);
> +
> + return 0;
> + }
> --
> 2.43.0.275.g3460e3d667-goog
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
© 2016 - 2025 Red Hat, Inc.