[PATCH] ring-buffer: Do not read at &event->array[0] if it across the page

Tze-nan Wu posted 1 patch 1 year ago
There is a newer version of this series
kernel/trace/ring_buffer.c | 4 ++++
1 file changed, 4 insertions(+)
[PATCH] ring-buffer: Do not read at &event->array[0] if it across the page
Posted by Tze-nan Wu 1 year ago
While reading from the tracing/trace, the ftrace reader rarely encounters
a KASAN invalid access issue.
It is likely that the writer has disrupted the ring_buffer that the reader
is currently parsing. the kasan report is as below:

[name:report&]BUG: KASAN: invalid-access in rb_iter_head_event+0x27c/0x3d0
[name:report&]Read of size 4 at addr 71ffff8111a18000 by task xxxx
[name:report_sw_tags&]Pointer tag: [71], memory tag: [0f]
[name:report&]
CPU: 2 PID: 380 Comm: xxxx
Call trace:
dump_backtrace+0x168/0x1b0
show_stack+0x2c/0x3c
dump_stack_lvl+0xa4/0xd4
print_report+0x268/0x9b0
kasan_report+0xdc/0x148
kasan_tag_mismatch+0x28/0x3c
__hwasan_tag_mismatch+0x2c/0x58
rb_event_length() [inline]
rb_iter_head_event+0x27c/0x3d0
ring_buffer_iter_peek+0x23c/0x6e0
__find_next_entry+0x1ac/0x3d8
s_next+0x1f0/0x310
seq_read_iter+0x4e8/0x77c
seq_read+0xf8/0x150
vfs_read+0x1a8/0x4cc

In some edge cases, ftrace reader could access to an invalid address,
specifically when reading 4 bytes beyond the end of the currently page.
While issue happened, the dump of rb_iter_head_event is shown as below:

    in function rb_iter_head_event:
          - iter->head = 0xFEC
          - iter->next_event = 0xFEC
          - commit = 0xFF0
          - read_stamp = 0x2955AC46DB0
          - page_stamp = 0x2955AC2439A
          - iter->head_page->page = 0x71FFFF8111A17000
          - iter->head_page->time_stamp = 0x2956A142267
          - iter->head_page->page->commit = 0xFF0
          - the content in iter->head_page->page
                0x71FFFF8111A17FF0: 01010075 00002421 0A123B7C FFFFFFC0

In rb_iter_head_event, reader will call rb_event_length with argument
(struct ring_buffer_event *event = 0x71FFFF8111A17FFC).
Since the content data start at address 0x71FFFF8111A17FFC are 0xFFFFFFC0.
event->type will be interpret as 0x0, than the reader will try to get the
length by accessing event->array[0], which is an invalid address:
    &event->array[0] = 0x71FFFF8111A18000

Signed-off-by: Tze-nan Wu <Tze-nan.Wu@mediatek.com>
---
Following patch may not become a solution, it merely checks if the address
to be accessed is valid or not within the rb_event_length before access.
And not sure if there is any side-effect it can lead to.

I am curious about what a better solution for this issue would look like.
Should we address the problem from the writer or the reader?

Also I wonder if the cause of the issue is exactly as I suspected.
Any Suggestion will be appreciated.

Test below can reproduce the issue in 2 hours on kernel-6.1.24:
    $cd /sys/kernel/tracing/
    # make the reader and writer race more through resize the buffer to 8kb
    $echo 8 > buffer_size_kn
    # enable all events
    $echo 1 > event/enable
    # enable trace
    $echo 1 > tracing_on
 
    # write and run a script that keep reading trace
    $./read_trace.sh

    ``` read_trace.sh
       while :
       do
           cat /sys/kernel/tracing/trace > /dev/null
       done

    ```
---
 kernel/trace/ring_buffer.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 78502d4c7214..ed5ddc3a134b 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -200,6 +200,8 @@ rb_event_length(struct ring_buffer_event *event)
 		if (rb_null_event(event))
 			/* undefined */
 			return -1;
+		if (((unsigned long)event & 0xfffUL) >= PAGE_SIZE - 4)
+			return -1;
 		return  event->array[0] + RB_EVNT_HDR_SIZE;
 
 	case RINGBUF_TYPE_TIME_EXTEND:
@@ -209,6 +211,8 @@ rb_event_length(struct ring_buffer_event *event)
 		return RB_LEN_TIME_STAMP;
 
 	case RINGBUF_TYPE_DATA:
+		if ((((unsigned long)event & 0xfffUL) >= PAGE_SIZE - 4) && !event->type_len)
+			return -1;
 		return rb_event_data_length(event);
 	default:
 		WARN_ON_ONCE(1);
-- 
2.18.0
Re: [PATCH] ring-buffer: Do not read at &event->array[0] if it across the page
Posted by Steven Rostedt 1 year ago
On Tue, 5 Sep 2023 22:12:45 +0800
Tze-nan Wu <Tze-nan.Wu@mediatek.com> wrote:

> While reading from the tracing/trace, the ftrace reader rarely encounters
> a KASAN invalid access issue.
> It is likely that the writer has disrupted the ring_buffer that the reader
> is currently parsing. the kasan report is as below:

What kernel are you using? There's been recent fixes that handle the race
between reading the last entry and the writer adding to it. This looks like
you may have hit that bug.

> 
> [name:report&]BUG: KASAN: invalid-access in rb_iter_head_event+0x27c/0x3d0
> [name:report&]Read of size 4 at addr 71ffff8111a18000 by task xxxx
> [name:report_sw_tags&]Pointer tag: [71], memory tag: [0f]
> [name:report&]
> CPU: 2 PID: 380 Comm: xxxx
> Call trace:
> dump_backtrace+0x168/0x1b0
> show_stack+0x2c/0x3c
> dump_stack_lvl+0xa4/0xd4
> print_report+0x268/0x9b0
> kasan_report+0xdc/0x148
> kasan_tag_mismatch+0x28/0x3c
> __hwasan_tag_mismatch+0x2c/0x58
> rb_event_length() [inline]
> rb_iter_head_event+0x27c/0x3d0
> ring_buffer_iter_peek+0x23c/0x6e0
> __find_next_entry+0x1ac/0x3d8
> s_next+0x1f0/0x310
> seq_read_iter+0x4e8/0x77c
> seq_read+0xf8/0x150
> vfs_read+0x1a8/0x4cc
> 
> In some edge cases, ftrace reader could access to an invalid address,
> specifically when reading 4 bytes beyond the end of the currently page.
> While issue happened, the dump of rb_iter_head_event is shown as below:
> 
>     in function rb_iter_head_event:
>           - iter->head = 0xFEC
>           - iter->next_event = 0xFEC
>           - commit = 0xFF0
>           - read_stamp = 0x2955AC46DB0
>           - page_stamp = 0x2955AC2439A
>           - iter->head_page->page = 0x71FFFF8111A17000
>           - iter->head_page->time_stamp = 0x2956A142267
>           - iter->head_page->page->commit = 0xFF0
>           - the content in iter->head_page->page
>                 0x71FFFF8111A17FF0: 01010075 00002421 0A123B7C FFFFFFC0
> 
> In rb_iter_head_event, reader will call rb_event_length with argument
> (struct ring_buffer_event *event = 0x71FFFF8111A17FFC).

As the commit is considered the end of buffer, it should never read past
that, which the above appears to be doing. That's why I think it hit that
race.

> Since the content data start at address 0x71FFFF8111A17FFC are 0xFFFFFFC0.
> event->type will be interpret as 0x0, than the reader will try to get the
> length by accessing event->array[0], which is an invalid address:
>     &event->array[0] = 0x71FFFF8111A18000
> 
> Signed-off-by: Tze-nan Wu <Tze-nan.Wu@mediatek.com>
> ---
> Following patch may not become a solution, it merely checks if the address
> to be accessed is valid or not within the rb_event_length before access.
> And not sure if there is any side-effect it can lead to.
> 
> I am curious about what a better solution for this issue would look like.
> Should we address the problem from the writer or the reader?
> 
> Also I wonder if the cause of the issue is exactly as I suspected.
> Any Suggestion will be appreciated.

I guess this depends on if you have the fixes or not?

> 
> Test below can reproduce the issue in 2 hours on kernel-6.1.24:
>     $cd /sys/kernel/tracing/
>     # make the reader and writer race more through resize the buffer to 8kb
>     $echo 8 > buffer_size_kn
>     # enable all events
>     $echo 1 > event/enable
>     # enable trace
>     $echo 1 > tracing_on
>  
>     # write and run a script that keep reading trace
>     $./read_trace.sh
> 
>     ``` read_trace.sh
>        while :
>        do
>            cat /sys/kernel/tracing/trace > /dev/null
>        done
> 
>     ```

Thanks, I'll look at that when I finish debugging the eventfs bug.

-- Steve
Re: [PATCH] ring-buffer: Do not read at &event->array[0] if it across the page
Posted by Tze-nan Wu (吳澤南) 1 year ago
On Tue, 2023-09-05 at 12:39 -0400, Steven Rostedt wrote:
>  	 
> External email : Please do not click links or open attachments until
> you have verified the sender or the content.
>  On Tue, 5 Sep 2023 22:12:45 +0800
> Tze-nan Wu <Tze-nan.Wu@mediatek.com> wrote:
> 
> > While reading from the tracing/trace, the ftrace reader rarely
> encounters
> > a KASAN invalid access issue.
> > It is likely that the writer has disrupted the ring_buffer that the
> reader
> > is currently parsing. the kasan report is as below:
> 
> What kernel are you using? There's been recent fixes that handle the
> race
> between reading the last entry and the writer adding to it. This
> looks like
> you may have hit that bug.
> 

I am currently using kernel version 6.1.24, but I have applied the
patch for the ringbuffer.c file up to kernel version 6.1.52. However, I
am still able to reprduce the issue.

BTW, my decice have ARCH=ARM64 and PAGESIZE=4K

> > 
> > [name:report&]BUG: KASAN: invalid-access in
> rb_iter_head_event+0x27c/0x3d0
> > [name:report&]Read of size 4 at addr 71ffff8111a18000 by task xxxx
> > [name:report_sw_tags&]Pointer tag: [71], memory tag: [0f]
> > [name:report&]
> > CPU: 2 PID: 380 Comm: xxxx
> > Call trace:
> > dump_backtrace+0x168/0x1b0
> > show_stack+0x2c/0x3c
> > dump_stack_lvl+0xa4/0xd4
> > print_report+0x268/0x9b0
> > kasan_report+0xdc/0x148
> > kasan_tag_mismatch+0x28/0x3c
> > __hwasan_tag_mismatch+0x2c/0x58
> > rb_event_length() [inline]
> > rb_iter_head_event+0x27c/0x3d0
> > ring_buffer_iter_peek+0x23c/0x6e0
> > __find_next_entry+0x1ac/0x3d8
> > s_next+0x1f0/0x310
> > seq_read_iter+0x4e8/0x77c
> > seq_read+0xf8/0x150
> > vfs_read+0x1a8/0x4cc
> > 
> > In some edge cases, ftrace reader could access to an invalid
> address,
> > specifically when reading 4 bytes beyond the end of the currently
> page.
> > While issue happened, the dump of rb_iter_head_event is shown as
> below:
> > 
> >     in function rb_iter_head_event:
> >           - iter->head = 0xFEC
> >           - iter->next_event = 0xFEC
> >           - commit = 0xFF0
> >           - read_stamp = 0x2955AC46DB0
> >           - page_stamp = 0x2955AC2439A
> >           - iter->head_page->page = 0x71FFFF8111A17000
> >           - iter->head_page->time_stamp = 0x2956A142267
> >           - iter->head_page->page->commit = 0xFF0
> >           - the content in iter->head_page->page
> >                 0x71FFFF8111A17FF0: 01010075 00002421 0A123B7C
> FFFFFFC0
> > 
> > In rb_iter_head_event, reader will call rb_event_length with
> argument
> > (struct ring_buffer_event *event = 0x71FFFF8111A17FFC).
> 
> As the commit is considered the end of buffer, it should never read
> past
> that, which the above appears to be doing. That's why I think it hit
> that
> race.
> 

Same as my thought,
since every time the reader try to access the address in next page,
the below condition hold in rb_iter_head_event function:

        if (iter->page_stamp != iter_head_page->page->time_stamp ||
                commit > rb_page_commit(iter_head_page))
                        goto reset;

I observe the result by the debug patch provided below:

        @@ -2378,6 +2378,19 @@ rb_iter_head_event(struct
ring_buffer_iter *iter)
                commit = rb_page_commit(iter_head_page);
                smp_rmb();
        +       if ((iter->head >= 0xFECUL) && commit == 0xFF0UL) {
        +               pr_info("rbdbg: cpu = %d, event = 0x%lx, iter-
>head = 0x%lx,\
        +                       commit = 0xFF0, type = 0x%x, ts = 0x%x,
array addr = 0x%lx\n",\
        +                       iter->cpu_buffer->cpu, (unsigned
long)event, iter->head,\
        +                       event->type_len, event->time_delta,
(unsigned long)(&(event->array[0])));
        +               mdelay(500);
        +               pr_info("rbdbg2: cpu = %d, event = 0x%lx, iter-
>head = 0x%lx,\
        +                       commit = 0xFF0, type = 0x%x, ts = 0x%x,
array addr = 0x%lx\n",\
        +                       iter->cpu_buffer->cpu, (unsigned
long)event, iter->head,\
        +                       event->type_len, event->time_delta,
(unsigned long)(&(event->array[0])));
        +               if (iter->page_stamp != iter_head_page->page-
>time_stamp || commit > rb_page_commit(iter_head_page))
        +                       pr_info("rbdbg: writer corrupt
reader\n");
        +       }
                event = __rb_page_index(iter_head_page, iter->head);
                length = rb_event_length(event);

Note that the mdelay(500) in the debug patch can reproduce the issue
easier with same test in my environmnet,
I am now able to reproduce the issue within 15 minutes if the debug
patch on.

> > Since the content data start at address 0x71FFFF8111A17FFC are
> 0xFFFFFFC0.
> > event->type will be interpret as 0x0, than the reader will try to
> get the
> > length by accessing event->array[0], which is an invalid address:
> >     &event->array[0] = 0x71FFFF8111A18000
> > 
> > Signed-off-by: Tze-nan Wu <Tze-nan.Wu@mediatek.com>
> > ---
> > Following patch may not become a solution, it merely checks if the
> address
> > to be accessed is valid or not within the rb_event_length before
> access.
> > And not sure if there is any side-effect it can lead to.
> > 
> > I am curious about what a better solution for this issue would look
> like.
> > Should we address the problem from the writer or the reader?
> > 
> > Also I wonder if the cause of the issue is exactly as I suspected.
> > Any Suggestion will be appreciated.
> 
> I guess this depends on if you have the fixes or not?
> 

yes, I could try to pick the patches that only included in mainline but
not in kernel-6.1.52 for ring_buffer.c file,
and do the same test to see if the issue is still reproducible.

> > 
> > Test below can reproduce the issue in 2 hours on kernel-6.1.24:
> >     $cd /sys/kernel/tracing/
> >     # make the reader and writer race more through resize the
> buffer to 8kb
> >     $echo 8 > buffer_size_kn
> >     # enable all events
> >     $echo 1 > event/enable
> >     # enable trace
> >     $echo 1 > tracing_on
> >  
> >     # write and run a script that keep reading trace
> >     $./read_trace.sh
> > 
> >     ``` read_trace.sh
> >        while :
> >        do
> >            cat /sys/kernel/tracing/trace > /dev/null
> >        done
> > 
> >     ```
> 
> Thanks, I'll look at that when I finish debugging the eventfs bug.
> 
> -- Steve

Also thank for your reply,

-- Tze-nan
Re: [PATCH] ring-buffer: Do not read at &event->array[0] if it across the page
Posted by Tze-nan Wu (吳澤南) 1 year ago
On Thu, 2023-09-07 at 17:42 +0800, Tze-nan Wu wrote:

More information added.
> On Tue, 2023-09-05 at 12:39 -0400, Steven Rostedt wrote:
> >  	 
> > External email : Please do not click links or open attachments
> > until
> > you have verified the sender or the content.
> >  On Tue, 5 Sep 2023 22:12:45 +0800
> > Tze-nan Wu <Tze-nan.Wu@mediatek.com> wrote:
> > 
> > > While reading from the tracing/trace, the ftrace reader rarely
> > 
> > encounters
> > > a KASAN invalid access issue.
> > > It is likely that the writer has disrupted the ring_buffer that
> > > the
> > 
> > reader
> > > is currently parsing. the kasan report is as below:
> > 
> > What kernel are you using? There's been recent fixes that handle
> > the
> > race
> > between reading the last entry and the writer adding to it. This
> > looks like
> > you may have hit that bug.
> > 
> 
> I am currently using kernel version 6.1.24, but I have applied the
> patch for the ringbuffer.c file up to kernel version 6.1.52. However,
> I
> am still able to reprduce the issue.
> 
> BTW, my decice have ARCH=ARM64 and PAGESIZE=4K
> 
> > > 
> > > [name:report&]BUG: KASAN: invalid-access in
> > 
> > rb_iter_head_event+0x27c/0x3d0
> > > [name:report&]Read of size 4 at addr 71ffff8111a18000 by task
> > > xxxx
> > > [name:report_sw_tags&]Pointer tag: [71], memory tag: [0f]
> > > [name:report&]
> > > CPU: 2 PID: 380 Comm: xxxx
> > > Call trace:
> > > dump_backtrace+0x168/0x1b0
> > > show_stack+0x2c/0x3c
> > > dump_stack_lvl+0xa4/0xd4
> > > print_report+0x268/0x9b0
> > > kasan_report+0xdc/0x148
> > > kasan_tag_mismatch+0x28/0x3c
> > > __hwasan_tag_mismatch+0x2c/0x58
> > > rb_event_length() [inline]
> > > rb_iter_head_event+0x27c/0x3d0
> > > ring_buffer_iter_peek+0x23c/0x6e0
> > > __find_next_entry+0x1ac/0x3d8
> > > s_next+0x1f0/0x310
> > > seq_read_iter+0x4e8/0x77c
> > > seq_read+0xf8/0x150
> > > vfs_read+0x1a8/0x4cc
> > > 
> > > In some edge cases, ftrace reader could access to an invalid
> > 
> > address,
> > > specifically when reading 4 bytes beyond the end of the currently
> > 
> > page.
> > > While issue happened, the dump of rb_iter_head_event is shown as
> > 
> > below:
> > > 
> > >     in function rb_iter_head_event:
> > >           - iter->head = 0xFEC
> > >           - iter->next_event = 0xFEC
> > >           - commit = 0xFF0
> > >           - read_stamp = 0x2955AC46DB0
> > >           - page_stamp = 0x2955AC2439A
> > >           - iter->head_page->page = 0x71FFFF8111A17000
> > >           - iter->head_page->time_stamp = 0x2956A142267
> > >           - iter->head_page->page->commit = 0xFF0
> > >           - the content in iter->head_page->page
> > >                 0x71FFFF8111A17FF0: 01010075 00002421 0A123B7C
> > 
> > FFFFFFC0
> > > 
> > > In rb_iter_head_event, reader will call rb_event_length with
> > 
> > argument
> > > (struct ring_buffer_event *event = 0x71FFFF8111A17FFC).
> > 
> > As the commit is considered the end of buffer, it should never read
> > past
> > that, which the above appears to be doing. That's why I think it
> > hit
> > that
> > race.
> > 
> 
> Same as my thought,
> since every time the reader try to access the address in next page,
> the below condition hold in rb_iter_head_event function:
> 
>         if (iter->page_stamp != iter_head_page->page->time_stamp ||
>                 commit > rb_page_commit(iter_head_page))
>                         goto reset;
> 
> I observe the result by the debug patch provided below:
> 
>         @@ -2378,6 +2378,19 @@ rb_iter_head_event(struct
> ring_buffer_iter *iter)
>                 commit = rb_page_commit(iter_head_page);
>                 smp_rmb();
>         +       if ((iter->head >= 0xFECUL) && commit == 0xFF0UL) {
>         +               pr_info("rbdbg: cpu = %d, event = 0x%lx,
> iter-
> > head = 0x%lx,\
> 
>         +                       commit = 0xFF0, type = 0x%x, ts =
> 0x%x,
> array addr = 0x%lx\n",\
>         +                       iter->cpu_buffer->cpu, (unsigned
> long)event, iter->head,\
>         +                       event->type_len, event->time_delta,
> (unsigned long)(&(event->array[0])));
>         +               mdelay(500);
>         +               pr_info("rbdbg2: cpu = %d, event = 0x%lx,
> iter-
> > head = 0x%lx,\
> 
>         +                       commit = 0xFF0, type = 0x%x, ts =
> 0x%x,
> array addr = 0x%lx\n",\
>         +                       iter->cpu_buffer->cpu, (unsigned
> long)event, iter->head,\
>         +                       event->type_len, event->time_delta,
> (unsigned long)(&(event->array[0])));
>         +               if (iter->page_stamp != iter_head_page->page-
> > time_stamp || commit > rb_page_commit(iter_head_page))
> 
>         +                       pr_info("rbdbg: writer corrupt
> reader\n");
>         +       }
>                 event = __rb_page_index(iter_head_page, iter->head);
>                 length = rb_event_length(event);
> 
> Note that the mdelay(500) in the debug patch can reproduce the issue
> easier with same test in my environmnet,
> I am now able to reproduce the issue within 15 minutes if the debug
> patch on.
> 

The debug patch may give something similar to the below log just before
the invalid access happened, for me it looks like the padding event had
just corrupted by the writer before the reader invoke rb_event_length
function on it.

[  338.156772] cat: [name:ring_buffer&]rbdbg: cpu = 0, event =
0x????????????dffc, iter->head = 0xfec, commit = 0xFF0, type = 0x1d, ts
= 0x0, array addr = 0x????????????e000
[  338.656796] cat: [name:ring_buffer&]rbdbg2: cpu = 0, event =
0x????????????dffc, iter->head = 0xfec, commit = 0xFF0, type = 0x0, ts
= 0x0, array addr = 0x????????????e000
[  338.656803] cat: [name:ring_buffer&]rbdbg: writer corrupt reader
[  338.656810] cat:
[name:report&]=========================================================
=========
[  338.656819] cat: [name:report&]BUG: KASAN: invalid-access in
rb_event_length


> > > Since the content data start at address 0x71FFFF8111A17FFC are
> > 
> > 0xFFFFFFC0.
> > > event->type will be interpret as 0x0, than the reader will try to
> > 
> > get the
> > > length by accessing event->array[0], which is an invalid address:
> > >     &event->array[0] = 0x71FFFF8111A18000
> > > 
> > > Signed-off-by: Tze-nan Wu <Tze-nan.Wu@mediatek.com>
> > > ---
> > > Following patch may not become a solution, it merely checks if
> > > the
> > 
> > address
> > > to be accessed is valid or not within the rb_event_length before
> > 
> > access.
> > > And not sure if there is any side-effect it can lead to.
> > > 
> > > I am curious about what a better solution for this issue would
> > > look
> > 
> > like.
> > > Should we address the problem from the writer or the reader?
> > > 
> > > Also I wonder if the cause of the issue is exactly as I
> > > suspected.
> > > Any Suggestion will be appreciated.
> > 
> > I guess this depends on if you have the fixes or not?
> > 
> 
> yes, I could try to pick the patches that only included in mainline
> but
> not in kernel-6.1.52 for ring_buffer.c file,
> and do the same test to see if the issue is still reproducible.
> 
> > > 
> > > Test below can reproduce the issue in 2 hours on kernel-6.1.24:
> > >     $cd /sys/kernel/tracing/
> > >     # make the reader and writer race more through resize the
> > 
> > buffer to 8kb
> > >     $echo 8 > buffer_size_kn
> > >     # enable all events
> > >     $echo 1 > event/enable
> > >     # enable trace
> > >     $echo 1 > tracing_on
> > >  
> > >     # write and run a script that keep reading trace
> > >     $./read_trace.sh
> > > 
> > >     ``` read_trace.sh
> > >        while :
> > >        do
> > >            cat /sys/kernel/tracing/trace > /dev/null
> > >        done
> > > 
> > >     ```
> > 
> > Thanks, I'll look at that when I finish debugging the eventfs bug.
> > 
> > -- Steve
> 
> Also thank for your reply,
> 

And the temporary fix patch in my first mail should have some modified
as shown below.
+               if (((unsigned long)event & 0xfffUL) >= PAGE_SIZE - 4)
                                             ^^^^^^
                                            PAGE_SIZE-1
+                       return -1;

> -- Tze-nan
Re: [PATCH] ring-buffer: Do not read at &event->array[0] if it across the page
Posted by Steven Rostedt 1 year ago
On Thu, 7 Sep 2023 10:10:57 +0000
Tze-nan Wu (吳澤南) <Tze-nan.Wu@mediatek.com> wrote:

> > Same as my thought,
> > since every time the reader try to access the address in next page,
> > the below condition hold in rb_iter_head_event function:
> > 
> >         if (iter->page_stamp != iter_head_page->page->time_stamp ||
> >                 commit > rb_page_commit(iter_head_page))
> >                         goto reset;
> > 
> > I observe the result by the debug patch provided below:
> > 
> >         @@ -2378,6 +2378,19 @@ rb_iter_head_event(struct
> > ring_buffer_iter *iter)
> >                 commit = rb_page_commit(iter_head_page);
> >                 smp_rmb();
> >         +       if ((iter->head >= 0xFECUL) && commit == 0xFF0UL) {
> >         +               pr_info("rbdbg: cpu = %d, event = 0x%lx,
> > iter-  
> > > head = 0x%lx,\  
> > 
> >         +                       commit = 0xFF0, type = 0x%x, ts =
> > 0x%x,
> > array addr = 0x%lx\n",\
> >         +                       iter->cpu_buffer->cpu, (unsigned
> > long)event, iter->head,\
> >         +                       event->type_len, event->time_delta,
> > (unsigned long)(&(event->array[0])));
> >         +               mdelay(500);
> >         +               pr_info("rbdbg2: cpu = %d, event = 0x%lx,
> > iter-  
> > > head = 0x%lx,\  
> > 
> >         +                       commit = 0xFF0, type = 0x%x, ts =
> > 0x%x,
> > array addr = 0x%lx\n",\
> >         +                       iter->cpu_buffer->cpu, (unsigned
> > long)event, iter->head,\
> >         +                       event->type_len, event->time_delta,
> > (unsigned long)(&(event->array[0])));
> >         +               if (iter->page_stamp != iter_head_page->page-  
> > > time_stamp || commit > rb_page_commit(iter_head_page))  
> > 
> >         +                       pr_info("rbdbg: writer corrupt
> > reader\n");
> >         +       }
> >                 event = __rb_page_index(iter_head_page, iter->head);
> >                 length = rb_event_length(event);
> > 
> > Note that the mdelay(500) in the debug patch can reproduce the issue
> > easier with same test in my environmnet,
> > I am now able to reproduce the issue within 15 minutes if the debug
> > patch on.
> >   
> 
> The debug patch may give something similar to the below log just before
> the invalid access happened, for me it looks like the padding event had
> just corrupted by the writer before the reader invoke rb_event_length
> function on it.
> 
> [  338.156772] cat: [name:ring_buffer&]rbdbg: cpu = 0, event =
> 0x????????????dffc, iter->head = 0xfec, commit = 0xFF0, type = 0x1d, ts
> = 0x0, array addr = 0x????????????e000
> [  338.656796] cat: [name:ring_buffer&]rbdbg2: cpu = 0, event =
> 0x????????????dffc, iter->head = 0xfec, commit = 0xFF0, type = 0x0, ts
> = 0x0, array addr = 0x????????????e000
> [  338.656803] cat: [name:ring_buffer&]rbdbg: writer corrupt reader
> [  338.656810] cat:
> [name:report&]=========================================================
> =========
> [  338.656819] cat: [name:report&]BUG: KASAN: invalid-access in
> rb_event_length
> 
> 
> > > > Since the content data start at address 0x71FFFF8111A17FFC are  
> > > 
> > > 0xFFFFFFC0.  
> > > > event->type will be interpret as 0x0, than the reader will try to  
> > > 
> > > get the  
> > > > length by accessing event->array[0], which is an invalid address:
> > > >     &event->array[0] = 0x71FFFF8111A18000
> > > > 
> > > > Signed-off-by: Tze-nan Wu <Tze-nan.Wu@mediatek.com>
> > > > ---
> > > > Following patch may not become a solution, it merely checks if
> > > > the  
> > > 
> > > address  
> > > > to be accessed is valid or not within the rb_event_length before  
> > > 
> > > access.  
> > > > And not sure if there is any side-effect it can lead to.
> > > > 
> > > > I am curious about what a better solution for this issue would
> > > > look  
> > > 
> > > like.  
> > > > Should we address the problem from the writer or the reader?
> > > > 
> > > > Also I wonder if the cause of the issue is exactly as I
> > > > suspected.
> > > > Any Suggestion will be appreciated.  
> > > 
> > > I guess this depends on if you have the fixes or not?
> > >   
> > 
> > yes, I could try to pick the patches that only included in mainline
> > but
> > not in kernel-6.1.52 for ring_buffer.c file,
> > and do the same test to see if the issue is still reproducible.
> >   
> > > > 
> > > > Test below can reproduce the issue in 2 hours on kernel-6.1.24:
> > > >     $cd /sys/kernel/tracing/
> > > >     # make the reader and writer race more through resize the  
> > > 
> > > buffer to 8kb  
> > > >     $echo 8 > buffer_size_kn
> > > >     # enable all events
> > > >     $echo 1 > event/enable
> > > >     # enable trace
> > > >     $echo 1 > tracing_on
> > > >  
> > > >     # write and run a script that keep reading trace
> > > >     $./read_trace.sh
> > > > 
> > > >     ``` read_trace.sh
> > > >        while :
> > > >        do
> > > >            cat /sys/kernel/tracing/trace > /dev/null
> > > >        done
> > > > 
> > > >     ```  
> > > 
> > > Thanks, I'll look at that when I finish debugging the eventfs bug.
> > > 
> > > -- Steve  
> > 
> > Also thank for your reply,
> >   
> 
> And the temporary fix patch in my first mail should have some modified
> as shown below.
> +               if (((unsigned long)event & 0xfffUL) >= PAGE_SIZE - 4)
>                                              ^^^^^^
>                                             PAGE_SIZE-1
> +                       return -1;


Your email is getting really hard to read due to the line wrap formatting.

Anyway, can you try this patch?

-- Steve

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 72ccf75defd0..5b0653378089 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -2390,6 +2390,10 @@ rb_iter_head_event(struct ring_buffer_iter *iter)
 	 */
 	commit = rb_page_commit(iter_head_page);
 	smp_rmb();
+	/* An event needs to be at least 8 bytes in size */
+	if (iter->head > commit - 8)
+		goto reset;
+
 	event = __rb_page_index(iter_head_page, iter->head);
 	length = rb_event_length(event);