[PATCH v1 0/3] x86/traps: Fix DR6/DR7 inintialization

Xin Li (Intel) posted 3 patches 3 months, 4 weeks ago
arch/x86/coco/sev/core.c             |  1 +
arch/x86/coco/sev/vc-handle.c        |  1 +
arch/x86/include/asm/debugreg.h      | 12 +++++-----
arch/x86/include/asm/sev-internal.h  |  2 --
arch/x86/include/uapi/asm/debugreg.h |  9 ++++++-
arch/x86/kernel/cpu/common.c         | 17 ++++++-------
arch/x86/kernel/hw_breakpoint.c      |  8 +++----
arch/x86/kernel/kgdb.c               |  4 ++--
arch/x86/kernel/process_32.c         |  6 ++---
arch/x86/kernel/process_64.c         |  6 ++---
arch/x86/kernel/traps.c              | 36 +++++++++++++++++-----------
arch/x86/kvm/vmx/nested.c            |  2 +-
arch/x86/kvm/vmx/vmx.c               |  6 ++---
arch/x86/kvm/x86.c                   |  4 ++--
14 files changed, 63 insertions(+), 51 deletions(-)
[PATCH v1 0/3] x86/traps: Fix DR6/DR7 inintialization
Posted by Xin Li (Intel) 3 months, 4 weeks ago
Sohil reported seeing a split lock warning when running a test that
generates userspace #DB:

  x86/split lock detection: #DB: sigtrap_loop_64/4614 took a bus_lock trap at address: 0x4011ae


We investigated the issue and identified how the false bus lock detected
warning is generated under certain test conditions:

  1) The warning is a false positive.

  2) It is not caused by the test itself.

  3) It occurs even when Bus Lock Detection (BLD) is disabled.

  4) It only happens on the first #DB on a CPU.


And the root cause is, at boot time, Linux zeros DR6.  This leads to
different DR6 values depending on whether the CPU supports BLD:

  1) On CPUs with BLD support, DR6 becomes 0xFFFF07F0 (bit 11, DR6.BLD,
     is cleared).

  2) On CPUs without BLD, DR6 becomes 0xFFFF0FF0.

Since only BLD-induced #DB exceptions clear DR6.BLD and other debug
exceptions leave it unchanged, even if the first #DB is unrelated to
BLD, DR6.BLD is still cleared.  As a result, such a first #DB is
misinterpreted as a BLD #DB, and a false warning is triggerred.


Fix the bug by initializing DR6 by writing its architectural reset
value at boot time.


DR7 suffers from a similar issue.  We apply the same fix.


This patch set is based on tip/x86/urgent branch as of today.


Xin Li (Intel) (3):
  x86/traps: Move DR7_RESET_VALUE to <uapi/asm/debugreg.h>
  x86/traps: Initialize DR7 by writing its architectural reset value
  x86/traps: Initialize DR6 by writing its architectural reset value

 arch/x86/coco/sev/core.c             |  1 +
 arch/x86/coco/sev/vc-handle.c        |  1 +
 arch/x86/include/asm/debugreg.h      | 12 +++++-----
 arch/x86/include/asm/sev-internal.h  |  2 --
 arch/x86/include/uapi/asm/debugreg.h |  9 ++++++-
 arch/x86/kernel/cpu/common.c         | 17 ++++++-------
 arch/x86/kernel/hw_breakpoint.c      |  8 +++----
 arch/x86/kernel/kgdb.c               |  4 ++--
 arch/x86/kernel/process_32.c         |  6 ++---
 arch/x86/kernel/process_64.c         |  6 ++---
 arch/x86/kernel/traps.c              | 36 +++++++++++++++++-----------
 arch/x86/kvm/vmx/nested.c            |  2 +-
 arch/x86/kvm/vmx/vmx.c               |  6 ++---
 arch/x86/kvm/x86.c                   |  4 ++--
 14 files changed, 63 insertions(+), 51 deletions(-)


base-commit: 7cd9a11dd0c3d1dd225795ed1b5b53132888e7b5
-- 
2.49.0
Re: [PATCH v1 0/3] x86/traps: Fix DR6/DR7 inintialization
Posted by Sohil Mehta 3 months, 4 weeks ago
On 6/13/2025 12:01 AM, Xin Li (Intel) wrote:

> 
> Xin Li (Intel) (3):
>   x86/traps: Move DR7_RESET_VALUE to <uapi/asm/debugreg.h>
>   x86/traps: Initialize DR7 by writing its architectural reset value
>   x86/traps: Initialize DR6 by writing its architectural reset value
> 

The patches fix the false bus_lock warning that I was observing with the
infinite sigtrap selftest.

Tested-by: Sohil Mehta <sohil.mehta@intel.com>

I'll try it out again once you send the updated version.

In future, should we incorporate a #DB (or bus_lock) specific selftest
to detect such DR6/7 initialization issues?
Re: [PATCH v1 0/3] x86/traps: Fix DR6/DR7 inintialization
Posted by Xin Li 3 months, 4 weeks ago
On 6/13/2025 3:43 PM, Sohil Mehta wrote:
> On 6/13/2025 12:01 AM, Xin Li (Intel) wrote:
> 
>>
>> Xin Li (Intel) (3):
>>    x86/traps: Move DR7_RESET_VALUE to <uapi/asm/debugreg.h>
>>    x86/traps: Initialize DR7 by writing its architectural reset value
>>    x86/traps: Initialize DR6 by writing its architectural reset value
>>
> 
> The patches fix the false bus_lock warning that I was observing with the
> infinite sigtrap selftest.
> 
> Tested-by: Sohil Mehta <sohil.mehta@intel.com>
> 
> I'll try it out again once you send the updated version.

Thank you very much!

> 
> In future, should we incorporate a #DB (or bus_lock) specific selftest
> to detect such DR6/7 initialization issues?


I cant think of how to tests it.  Any suggestion about a new test?
Re: [PATCH v1 0/3] x86/traps: Fix DR6/DR7 inintialization
Posted by H. Peter Anvin 3 months, 4 weeks ago
On June 13, 2025 4:22:57 PM PDT, Xin Li <xin@zytor.com> wrote:
>On 6/13/2025 3:43 PM, Sohil Mehta wrote:
>> On 6/13/2025 12:01 AM, Xin Li (Intel) wrote:
>> 
>>> 
>>> Xin Li (Intel) (3):
>>>    x86/traps: Move DR7_RESET_VALUE to <uapi/asm/debugreg.h>
>>>    x86/traps: Initialize DR7 by writing its architectural reset value
>>>    x86/traps: Initialize DR6 by writing its architectural reset value
>>> 
>> 
>> The patches fix the false bus_lock warning that I was observing with the
>> infinite sigtrap selftest.
>> 
>> Tested-by: Sohil Mehta <sohil.mehta@intel.com>
>> 
>> I'll try it out again once you send the updated version.
>
>Thank you very much!
>
>> 
>> In future, should we incorporate a #DB (or bus_lock) specific selftest
>> to detect such DR6/7 initialization issues?
>
>
>I cant think of how to tests it.  Any suggestion about a new test?
>

You would have to map some memory uncached.
Re: [PATCH v1 0/3] x86/traps: Fix DR6/DR7 inintialization
Posted by Ethan Zhao 3 months, 3 weeks ago
在 2025/6/14 11:38, H. Peter Anvin 写道:
> On June 13, 2025 4:22:57 PM PDT, Xin Li <xin@zytor.com> wrote:
>> On 6/13/2025 3:43 PM, Sohil Mehta wrote:
>>> On 6/13/2025 12:01 AM, Xin Li (Intel) wrote:
>>>
>>>> Xin Li (Intel) (3):
>>>>     x86/traps: Move DR7_RESET_VALUE to <uapi/asm/debugreg.h>
>>>>     x86/traps: Initialize DR7 by writing its architectural reset value
>>>>     x86/traps: Initialize DR6 by writing its architectural reset value
>>>>
>>> The patches fix the false bus_lock warning that I was observing with the
>>> infinite sigtrap selftest.
>>>
>>> Tested-by: Sohil Mehta <sohil.mehta@intel.com>
>>>
>>> I'll try it out again once you send the updated version.
>> Thank you very much!
>>
>>> In future, should we incorporate a #DB (or bus_lock) specific selftest
>>> to detect such DR6/7 initialization issues?
>>
>> I cant think of how to tests it.  Any suggestion about a new test?
>>
> You would have to map some memory uncached.

To trigger a real bus_lock event in user space ?


Thanks,

Ethan

>
-- 
"firm, enduring, strong, and long-lived"

Re: [PATCH v1 0/3] x86/traps: Fix DR6/DR7 inintialization
Posted by Peter Zijlstra 3 months, 4 weeks ago
On Fri, Jun 13, 2025 at 12:01:14AM -0700, Xin Li (Intel) wrote:
> Sohil reported seeing a split lock warning when running a test that
> generates userspace #DB:
> 
>   x86/split lock detection: #DB: sigtrap_loop_64/4614 took a bus_lock trap at address: 0x4011ae
> 
> 
> We investigated the issue and identified how the false bus lock detected
> warning is generated under certain test conditions:
> 
>   1) The warning is a false positive.
> 
>   2) It is not caused by the test itself.
> 
>   3) It occurs even when Bus Lock Detection (BLD) is disabled.
> 
>   4) It only happens on the first #DB on a CPU.
> 
> 
> And the root cause is, at boot time, Linux zeros DR6.  This leads to
> different DR6 values depending on whether the CPU supports BLD:
> 
>   1) On CPUs with BLD support, DR6 becomes 0xFFFF07F0 (bit 11, DR6.BLD,
>      is cleared).
> 
>   2) On CPUs without BLD, DR6 becomes 0xFFFF0FF0.
> 
> Since only BLD-induced #DB exceptions clear DR6.BLD and other debug
> exceptions leave it unchanged, even if the first #DB is unrelated to
> BLD, DR6.BLD is still cleared.  As a result, such a first #DB is
> misinterpreted as a BLD #DB, and a false warning is triggerred.
> 
> 
> Fix the bug by initializing DR6 by writing its architectural reset
> value at boot time.
> 
> 
> DR7 suffers from a similar issue.  We apply the same fix.

Bah, this DR6 polarity is a pain in the behind for sure. Patches look
good, except I'm really not a fan of using those 'names'. But I'll not
object too much of others like it.
Re: [PATCH v1 0/3] x86/traps: Fix DR6/DR7 inintialization
Posted by Xin Li 3 months, 4 weeks ago
On 6/13/2025 12:18 AM, Peter Zijlstra wrote:
> On Fri, Jun 13, 2025 at 12:01:14AM -0700, Xin Li (Intel) wrote:
>>
>> Since only BLD-induced #DB exceptions clear DR6.BLD and other debug
>> exceptions leave it unchanged, even if the first #DB is unrelated to
>> BLD, DR6.BLD is still cleared.  As a result, such a first #DB is
>> misinterpreted as a BLD #DB, and a false warning is triggerred.
>>
>>
>> Fix the bug by initializing DR6 by writing its architectural reset
>> value at boot time.
>>
>>
>> DR7 suffers from a similar issue.  We apply the same fix.
> 
> Bah, this DR6 polarity is a pain in the behind for sure. Patches look
> good, except I'm really not a fan of using those 'names'. But I'll not
> object too much of others like it.

Let's see if there will be more objections :)