Some older Intel CPUs have the following erratum:
"Not-Present Page Faults May Set the RSVD Flag in the Error Code
Problem:
An attempt to access a page that is not marked present causes a page
fault. Such a page fault delivers an error code in which both the
P flag (bit 0) and the RSVD flag (bit 3) are 0. Due to this erratum,
not-present page faults may deliver an error code in which the P flag
is 0 but the RSVD flag is 1.
Implication:
Software may erroneously infer that a page fault was due to a
reserved-bit violation when it was actually due to an attempt
to access a not-present page.
Workaround: Page-fault handlers should ignore the RSVD flag in the error
code if the P flag is 0."
This issue was observed on several nodes that crashed with messages like:
httpd: Corrupted page table at address 7f62d5b48e68
PGD 80000002e92bf067 PUD 1c99c5067 PMD 195015067 PTE 7fffffffb78b680
Bad pagetable: 000c [#1] SMP
Let's follow the recommendation and ignore the RSVD flag in the
error code if the P flag is 0.
Link: https://lore.kernel.org/all/aae9c7c6-989c-0261-470a-252537493b53@openvz.org
Signed-off-by: Vasily Averin <vvs@openvz.org>
---
arch/x86/mm/fault.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index fe10c6d76bac..ffc6d6bd2a22 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1481,6 +1481,15 @@ handle_page_fault(struct pt_regs *regs, unsigned long error_code,
 	if (unlikely(kmmio_fault(regs, address)))
 		return;
 
+	/*
+	 * Some older Intel CPUs have errata
+	 * "Not-Present Page Faults May Set the RSVD Flag in the Error Code"
+	 * It is recommended to ignore the RSVD flag (bit 3) in the error code
+	 * if the P flag (bit 0) is 0.
+	 */
+	if (unlikely((error_code & X86_PF_RSVD) && !(error_code & X86_PF_PROT)))
+		error_code &= ~X86_PF_RSVD;
+
 	/* Was the fault on kernel-controlled part of the address space? */
 	if (unlikely(fault_in_kernel_space(address))) {
 		do_kern_addr_fault(regs, error_code, address);
--
2.36.1
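
For readers who want to try the check outside the kernel, here is a minimal user-space sketch of the same masking logic. The two bit values follow the bit positions given in the erratum text and reuse the names from the patch; the helper and self-test function names are hypothetical and do not exist in the kernel.

```c
#include <assert.h>

/* Bit values as named in the erratum text and used by the patch. */
#define X86_PF_PROT (1UL << 0)	/* P flag: 0 means the page was not present */
#define X86_PF_RSVD (1UL << 3)	/* RSVD flag: reserved bit set in a paging entry */

/*
 * Hypothetical helper (not a kernel function): drop RSVD when P is clear,
 * as the erratum workaround recommends.
 */
static unsigned long pf_ignore_spurious_rsvd(unsigned long error_code)
{
	if ((error_code & X86_PF_RSVD) && !(error_code & X86_PF_PROT))
		error_code &= ~X86_PF_RSVD;
	return error_code;
}

/* Small self-check covering the three interesting cases. */
static void pf_ignore_spurious_rsvd_selftest(void)
{
	/* 0xc: not-present fault with RSVD set, so RSVD is dropped. */
	assert(pf_ignore_spurious_rsvd(0xc) == 0x4);
	/* 0x9: P and RSVD both set, a genuine reserved-bit fault, unchanged. */
	assert(pf_ignore_spurious_rsvd(0x9) == 0x9);
	/* 0x4: ordinary not-present fault, unchanged. */
	assert(pf_ignore_spurious_rsvd(0x4) == 0x4);
}
```

Note that a fault with both P and RSVD set passes through untouched, so real reserved-bit violations would still be reported under this scheme.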
On June 29, 2022 10:58:36 PM PDT, Vasily Averin <vvs@openvz.org> wrote:
>diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
>index fe10c6d76bac..ffc6d6bd2a22 100644
>--- a/arch/x86/mm/fault.c
>+++ b/arch/x86/mm/fault.c
>@@ -1481,6 +1481,15 @@ handle_page_fault(struct pt_regs *regs, unsigned long error_code,
> if (unlikely(kmmio_fault(regs, address)))
> return;
>
>+ /*
>+ * Some older Intel CPUs have errata
>+ * "Not-Present Page Faults May Set the RSVD Flag in the Error Code"
>+ * It is recommended to ignore the RSVD flag (bit 3) in the error code
>+ * if the P flag (bit 0) is 0.
>+ */
>+ if (unlikely((error_code & X86_PF_RSVD) && !(error_code & X86_PF_PROT)))
>+ error_code &= ~X86_PF_RSVD;
>+
> /* Was the fault on kernel-controlled part of the address space? */
> if (unlikely(fault_in_kernel_space(address))) {
> do_kern_addr_fault(regs, error_code, address);
Are there other bits we could/should mask out in the case P = 0? The only bits that should be able to appear are ones that are independent of the PTE content.
On 7/1/22 03:42, H. Peter Anvin wrote:
> On June 29, 2022 10:58:36 PM PDT, Vasily Averin <vvs@openvz.org> wrote:
>> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
>> index fe10c6d76bac..ffc6d6bd2a22 100644
>> --- a/arch/x86/mm/fault.c
>> +++ b/arch/x86/mm/fault.c
>> @@ -1481,6 +1481,15 @@ handle_page_fault(struct pt_regs *regs, unsigned long error_code,
>> if (unlikely(kmmio_fault(regs, address)))
>> return;
>>
>> + /*
>> + * Some older Intel CPUs have errata
>> + * "Not-Present Page Faults May Set the RSVD Flag in the Error Code"
>> + * It is recommended to ignore the RSVD flag (bit 3) in the error code
>> + * if the P flag (bit 0) is 0.
>> + */
>> + if (unlikely((error_code & X86_PF_RSVD) && !(error_code & X86_PF_PROT)))
>> + error_code &= ~X86_PF_RSVD;
>> +
>> /* Was the fault on kernel-controlled part of the address space? */
>> if (unlikely(fault_in_kernel_space(address))) {
>> do_kern_addr_fault(regs, error_code, address);
>
> Are there other bits we could/should mask out in the case P = 0? The
> only bits that should be able to appear are ones that are independent
> of the PTE content.
According to the "Intel® 64 and IA-32 Architectures Software Developer’s
Manual Volume 3A: System Programming Guide, Part 1", there are several other
similar bits:
http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf
"4.7 PAGE-FAULT EXCEPTIONS
...
• HLAT (bit 7).
This flag is 1 if there is no translation for the linear address using HLAT
paging because, in one of the paging structure entries used to translate that
address, either the P flag was 0 or a reserved bit was set. An error code will
set this flag only if it clears bit 0 or sets bit 3. This flag will not be set
by a page fault resulting from a violation of access rights, nor for one
encountered during ordinary paging, including the case in which there has been
a restart of HLAT paging.
• SGX flag (bit 15).
This flag is 1 if the exception is unrelated to paging and resulted from
violation of SGX-specific access-control requirements. Because such a violation
can occur only if there is no ordinary page fault, this flag is set only if
the P flag (bit 0) is 1 and the RSVD flag (bit 3) and the PK flag (bit 5)
are both 0."
However, only the RSVD flag has a documented erratum in real processors,
so I don't think any other bits need to be masked.
Thank you,
Vasily Averin
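
To make the two quoted rules concrete, here is a small user-space sketch that checks whether an error code is consistent with them. It is only an illustration of the SDM wording quoted above, not a proposal for the patch. PF_HLAT is an illustrative name for bit 7 chosen for this example, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define X86_PF_PROT (1UL << 0)	/* P flag */
#define X86_PF_RSVD (1UL << 3)	/* RSVD flag */
#define X86_PF_PK   (1UL << 5)	/* PK flag */
#define PF_HLAT     (1UL << 7)	/* illustrative name for the HLAT flag */
#define X86_PF_SGX  (1UL << 15)	/* SGX flag */

/*
 * Hypothetical helper: returns true if the HLAT and SGX bits obey the
 * constraints in the quoted SDM text.
 */
static bool pf_flags_match_sdm(unsigned long ec)
{
	bool p    = (ec & X86_PF_PROT) != 0;
	bool rsvd = (ec & X86_PF_RSVD) != 0;
	bool pk   = (ec & X86_PF_PK) != 0;

	/* HLAT may be set only if P is clear or RSVD is set. */
	if ((ec & PF_HLAT) && p && !rsvd)
		return false;

	/* SGX may be set only if P is set and both RSVD and PK are clear. */
	if ((ec & X86_PF_SGX) && (!p || rsvd || pk))
		return false;

	return true;
}

/* A few sample codes exercising both rules. */
static void pf_flags_match_sdm_selftest(void)
{
	assert(pf_flags_match_sdm(0x8001));	/* SGX with P=1 */
	assert(!pf_flags_match_sdm(0x8000));	/* SGX with P=0 */
	assert(pf_flags_match_sdm(0x0080));	/* HLAT with P=0 */
	assert(!pf_flags_match_sdm(0x0081));	/* HLAT with P=1, RSVD=0 */
}
```

Both rules depend on P and RSVD being reported correctly, which is another reason to handle the RSVD erratum before the error code is interpreted any further.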
On 6/29/22 22:58, Vasily Averin wrote:
> Some older Intel CPUs have errata:
> "Not-Present Page Faults May Set the RSVD Flag in the Error Code

Please include a link to the documentation when you cite things like this.
For example, this is very helpful:

	Several older Intel CPUs have this or a similar erratum. For
	instance, the "Intel Xeon Processor 5400 Series Specification
	Update" has "AX74 ... Not-Present Page Faults May Set the RSVD
	Flag in the Error Code".

	https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/xeon-5400-spec-update.pdf

That makes it a *LOT* easier to find the actual erratum and its text.

I honestly also wouldn't mind if you just copy a chunk of the problem text
verbatim into the changelog. Intel does have a habit of updating text in
documents like that and it's quite handy to have a snapshot of what you
were reading when you wrote the patch.
Several older Intel CPUs have this or a similar erratum.
For instance, the "Intel Xeon Processor 5400 Series
Specification Update" [1] has
"AX74. Not-Present Page Faults May Set the RSVD Flag in the Error Code
Problem:
An attempt to access a page that is not marked present causes
a page fault. Such a page fault delivers an error code in which
both the P flag (bit 0) and the RSVD flag (bit 3) are 0.
Due to this erratum, not-present page faults may deliver
an error code in which the P flag is 0 but the RSVD flag is 1.
Implication:
Software may erroneously infer that a page fault was due to
a reserved-bit violation when it was actually due to an attempt
to access a not-present page. Intel has not observed this erratum
with any commercially available software.
Workaround:
Page-fault handlers should ignore the RSVD flag in the error
code if the P flag is 0"
This problem has been observed several times on several nodes using
Intel Xeon E5450 processors. These nodes crashed after
"Bad pagetable: 000c" messages like this:
Corrupted page table at address 7f62d5b48e68
PGD 80000002e92bf067 PUD 1c99c5067 PMD 195015067 PTE 7fffffffb78b680
Bad pagetable: 000c [#1] SMP
The error code here is 0xc: the RSVD flag (bit 3) is set while the P flag
(bit 0) is clear.
Let's follow the recommendations and ignore the RSVD flag in the cases
described.
Link: [1] https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/xeon-5400-spec-update.pdf
Link: https://lore.kernel.org/all/aae9c7c6-989c-0261-470a-252537493b53@openvz.org
Reported-by: Steve Sipes <steve.sipes@comandsolutions.com>
Signed-off-by: Vasily Averin <vvs@openvz.org>
---
v2: added original reporter
improved patch description, added link to CPU spec update
---
arch/x86/mm/fault.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index fe10c6d76bac..ffc6d6bd2a22 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1481,6 +1481,15 @@ handle_page_fault(struct pt_regs *regs, unsigned long error_code,
 	if (unlikely(kmmio_fault(regs, address)))
 		return;
 
+	/*
+	 * Some older Intel CPUs have errata
+	 * "Not-Present Page Faults May Set the RSVD Flag in the Error Code"
+	 * It is recommended to ignore the RSVD flag (bit 3) in the error code
+	 * if the P flag (bit 0) is 0.
+	 */
+	if (unlikely((error_code & X86_PF_RSVD) && !(error_code & X86_PF_PROT)))
+		error_code &= ~X86_PF_RSVD;
+
 	/* Was the fault on kernel-controlled part of the address space? */
 	if (unlikely(fault_in_kernel_space(address))) {
 		do_kern_addr_fault(regs, error_code, address);
--
2.36.1
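
As a reading aid for the "Bad pagetable: 000c" line in the changelog, the following user-space sketch decodes the low four bits of a page-fault error code. The bit values match the x86 error-code layout for bits 0 to 3; the struct and function names are hypothetical and exist only for this example.

```c
#include <assert.h>

#define X86_PF_PROT  (1UL << 0)	/* P flag: fault on a present page */
#define X86_PF_WRITE (1UL << 1)	/* W flag: write access */
#define X86_PF_USER  (1UL << 2)	/* U flag: user-mode access */
#define X86_PF_RSVD  (1UL << 3)	/* RSVD flag: reserved bit violation */

/* Hypothetical container for the decoded flags (each field is 0 or 1). */
struct pf_flags {
	int present;
	int write;
	int user;
	int rsvd;
};

/* Hypothetical helper: split an error code into its low four flags. */
static struct pf_flags pf_decode(unsigned long error_code)
{
	struct pf_flags f = {
		.present = !!(error_code & X86_PF_PROT),
		.write   = !!(error_code & X86_PF_WRITE),
		.user    = !!(error_code & X86_PF_USER),
		.rsvd    = !!(error_code & X86_PF_RSVD),
	};
	return f;
}

/* 0xc is binary 1100: user-mode read, page not present, RSVD set. */
static void pf_decode_selftest(void)
{
	struct pf_flags f = pf_decode(0xc);

	assert(!f.present && !f.write && f.user && f.rsvd);
}
```

Decoded this way, 0xc is exactly the combination the erratum describes: P is 0 while RSVD is 1.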