[PATCH v2] kasan:print the original fault addr when access invalid shadow

Posted by Haibo Li 11 months, 2 weeks ago
When the checked address is illegal, the corresponding shadow address
from kasan_mem_to_shadow() may have no mapping in the MMU page tables.
Accessing such a shadow address causes a kernel oops.
Here is a sample oops on arm64 (39-bit VA)
with KASAN_SW_TAGS and KASAN_OUTLINE enabled:

[ffffffb80aaaaaaa] pgd=000000005d3ce003, p4d=000000005d3ce003,
    pud=000000005d3ce003, pmd=0000000000000000
Internal error: Oops: 0000000096000006 [#1] PREEMPT SMP
Modules linked in:
CPU: 3 PID: 100 Comm: sh Not tainted 6.6.0-rc1-dirty #43
Hardware name: linux,dummy-virt (DT)
pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : __hwasan_load8_noabort+0x5c/0x90
lr : do_ib_ob+0xf4/0x110

ffffffb80aaaaaaa is the shadow address for efffff80aaaaaaaa.
The oops is caused by reading this invalid shadow address in
kasan_check_range.
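
The relation between the two addresses above can be checked with a small
user-space sketch. It is only an illustration: the shadow offset and scale
shift below are assumed values for arm64 with a 39-bit VA and
KASAN_SW_TAGS, and the helper names are made up for this example.

```c
#include <stdint.h>

/*
 * Assumed configuration (not stated in the oops itself): arm64, 39-bit VA,
 * CONFIG_KASAN_SW_TAGS. Both constants are assumptions of this sketch.
 */
#define SHADOW_SCALE_SHIFT 4
#define SHADOW_OFFSET      0xefffffc000000000ULL

/* Model of resetting the pointer tag: the top byte becomes 0xff. */
static uint64_t reset_tag(uint64_t addr)
{
	return addr | (0xffULL << 56);
}

/* Model of kasan_mem_to_shadow(): scale the address down, then add the offset. */
static uint64_t mem_to_shadow(uint64_t addr)
{
	return (addr >> SHADOW_SCALE_SHIFT) + SHADOW_OFFSET;
}
```

Under these assumptions, mem_to_shadow(reset_tag(0xefffff80aaaaaaaa))
evaluates to 0xffffffb80aaaaaaa, which matches the faulting address in the
oops.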

Generic KASAN hits a similar oops.

The oops reports only the faulting shadow address, not
the original address.

Commit 2f004eea0fc8 ("x86/kasan: Print original address on #GP")
introduced kasan_non_canonical_hook but limited it to KASAN_INLINE.

This patch extends it to KASAN_OUTLINE mode.

Signed-off-by: Haibo Li <haibo.li@mediatek.com>
---
v2:
- Given the possible performance impact of checking the shadow address,
   switch to kasan_non_canonical_hook, which only runs after the oops.
---
 include/linux/kasan.h | 6 +++---
 mm/kasan/report.c     | 4 +---
 2 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/include/linux/kasan.h b/include/linux/kasan.h
index 3df5499f7936..a707ee8b19ce 100644
--- a/include/linux/kasan.h
+++ b/include/linux/kasan.h
@@ -466,10 +466,10 @@ static inline void kasan_free_module_shadow(const struct vm_struct *vm) {}
 
 #endif /* (CONFIG_KASAN_GENERIC || CONFIG_KASAN_SW_TAGS) && !CONFIG_KASAN_VMALLOC */
 
-#ifdef CONFIG_KASAN_INLINE
+#ifdef CONFIG_KASAN
 void kasan_non_canonical_hook(unsigned long addr);
-#else /* CONFIG_KASAN_INLINE */
+#else /* CONFIG_KASAN */
 static inline void kasan_non_canonical_hook(unsigned long addr) { }
-#endif /* CONFIG_KASAN_INLINE */
+#endif /* CONFIG_KASAN */
 
 #endif /* LINUX_KASAN_H */
diff --git a/mm/kasan/report.c b/mm/kasan/report.c
index ca4b6ff080a6..3974e4549c3e 100644
--- a/mm/kasan/report.c
+++ b/mm/kasan/report.c
@@ -621,9 +621,8 @@ void kasan_report_async(void)
 }
 #endif /* CONFIG_KASAN_HW_TAGS */
 
-#ifdef CONFIG_KASAN_INLINE
 /*
- * With CONFIG_KASAN_INLINE, accesses to bogus pointers (outside the high
+ * With CONFIG_KASAN, accesses to bogus pointers (outside the high
  * canonical half of the address space) cause out-of-bounds shadow memory reads
  * before the actual access. For addresses in the low canonical half of the
  * address space, as well as most non-canonical addresses, that out-of-bounds
@@ -659,4 +658,3 @@ void kasan_non_canonical_hook(unsigned long addr)
 	pr_alert("KASAN: %s in range [0x%016lx-0x%016lx]\n", bug_type,
 		 orig_addr, orig_addr + KASAN_GRANULE_SIZE - 1);
 }
-#endif
-- 
2.18.0
Re: [PATCH v2] kasan:print the original fault addr when access invalid shadow
Posted by Andrey Konovalov 11 months, 2 weeks ago
On Mon, Oct 9, 2023 at 9:37 AM Haibo Li <haibo.li@mediatek.com> wrote:
> This patch extends it to KASAN_OUTLINE mode.

Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>

Thank you!

On a related note, I have debugged the reason why
kasan_non_canonical_hook sometimes doesn't get engaged properly for
the SW_TAGS mode. I'll post a fix next week.
Re: [PATCH v2] kasan:print the original fault addr when access invalid shadow
Posted by Andrew Morton 11 months, 2 weeks ago
On Mon, 9 Oct 2023 15:37:48 +0800 Haibo Li <haibo.li@mediatek.com> wrote:

> Commit 2f004eea0fc8 ("x86/kasan: Print original address on #GP")
> introduced kasan_non_canonical_hook but limited it to KASAN_INLINE.
> 
> This patch extends it to KASAN_OUTLINE mode.

Is 2f004eea0fc8 a suitable Fixes: target for this?  If not, what is?

Also, I'm assuming that we want to backport this fix into earlier
kernel versions?
Re: [PATCH v2] kasan:print the original fault addr when access invalid shadow
Posted by Haibo Li 11 months, 2 weeks ago
> On Mon, 9 Oct 2023 15:37:48 +0800 Haibo Li <haibo.li@mediatek.com> wrote:
> 
> > Commit 2f004eea0fc8 ("x86/kasan: Print original address on #GP")
> > introduced kasan_non_canonical_hook but limited it to KASAN_INLINE.
> > 
> > This patch extends it to KASAN_OUTLINE mode.
> 
> Is 2f004eea0fc8 a suitable Fixes: target for this?  If not, what is?
> 
Yes, 2f004eea0fc8 is a suitable Fixes: target.
All we need is a better crash report for this case.
With commit 2f004eea0fc8 and commit
07b742a4d912 ("arm64: mm: log potential KASAN shadow alias"),
it is easy to identify the original address when an
out-of-bounds KASAN shadow access occurs.
Currently, this feature is only available for the KASAN_INLINE case.
As Jann said, it is also suitable for the KASAN_OUTLINE case.
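
As a rough illustration of what the hook recovers, the arithmetic it
applies to a faulting shadow address can be modelled in user space. This
is a sketch only: the constants are assumed values for arm64 with a
39-bit VA and KASAN_SW_TAGS, and the function names are hypothetical.

```c
#include <stdint.h>

/* Assumed values for arm64, 39-bit VA, CONFIG_KASAN_SW_TAGS. */
#define SHADOW_SCALE_SHIFT 4
#define SHADOW_OFFSET      0xefffffc000000000ULL
#define GRANULE_SIZE       (1ULL << SHADOW_SCALE_SHIFT)

/* Start of the granule of original addresses covered by one shadow byte. */
static uint64_t shadow_to_orig(uint64_t shadow)
{
	return (shadow - SHADOW_OFFSET) << SHADOW_SCALE_SHIFT;
}

/* Last address of that granule, i.e. the upper bound of the printed range. */
static uint64_t orig_range_end(uint64_t shadow)
{
	return shadow_to_orig(shadow) + GRANULE_SIZE - 1;
}
```

For the shadow address ffffffb80aaaaaaa from the oops, this arithmetic
gives the range ffffff80aaaaaaa0 to ffffff80aaaaaaaf. The low four bits
of the original address are lost in the shadow mapping, and under these
assumptions the top byte comes back as 0xff rather than the original
0xef tag.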

> Also, I'm assuming that we want to backport this fix into earlier
> kernel versions?
In my opinion, since this only improves the crash report, there is no
need to backport it.