[PATCH] x86/boot: Restrict directmap permissions for .text/.rodata

Andrew Cooper posted 1 patch 2 years, 4 months ago
Test gitlab-ci passed
Patches applied successfully (tree, apply log)
git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/20211206130855.15372-1-andrew.cooper3@citrix.com
There is a newer version of this series
xen/arch/x86/setup.c | 8 ++++++++
1 file changed, 8 insertions(+)
[PATCH] x86/boot: Restrict directmap permissions for .text/.rodata
Posted by Andrew Cooper 2 years, 4 months ago
While we've been diligent to ensure that the main text/data/rodata mappings
have suitable restrictions, their aliases via the directmap were left fully
RW.  Worse, we even had pieces of code making use of this as a feature.

Restrict the permissions, as we have no legitimate need for writeability of
these areas via the directmap alias.

Note that the pagetables and cpu0_stack do get written through their directmap
alias, so we can't just read-only the whole image.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>

Ever so slightly RFC, as it has only had light testing.

Notes:
 * The stubs are still have RX via one alias, RW via another, and these need
   to stay.  Hardening options include splitting the stubs so the SYSCALL ones
   can be read-only after setup, and/or expanding the stub size to 4k per CPU
   so we really can keep the writeable alias as not present when the stub
   isn't in active use.
 * Future CPUs with Protection Key Supervisor (Sapphire Rapids and later)
   would be able to inhibit writeability outside of a permitted region, and
   because the protection key is per logical thread, we woulnd't need to
   expand the stubs to 4k per CPU.
 * At the time of writing, PV Shim still makes use of .rodata's read/write
   alias in the directmap to patch the hypercall table, but that runs earlier
   on boot.  Also, there are patches out to address this.
 * For backporting, this patch depends on c/s e7f147bf4ac7 ("x86/crash: Drop
   manual hooking of exception_table[]"), and nothing would break at compile
   time if the dependency was missing.
---
 xen/arch/x86/setup.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index f40a9fe5d351..c8641c227d9a 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1566,6 +1566,14 @@ void __init noreturn __start_xen(unsigned long mbi_p)
         destroy_xen_mappings((unsigned long)&__2M_rwdata_end,
                              ROUNDUP((unsigned long)&__2M_rwdata_end, MB(2)));
 
+    /*
+     * Mark all of .text and .rodata as RO in the directmap - we don't want
+     * these sections writeable via any alias.
+     */
+    modify_xen_mappings((unsigned long)__va(__pa(_start)),
+                        (unsigned long)__va(__pa(__2M_rodata_end)),
+                        PAGE_HYPERVISOR_RO);
+
     nr_pages = 0;
     for ( i = 0; i < e820.nr_map; i++ )
         if ( e820.map[i].type == E820_RAM )
-- 
2.11.0


Re: [PATCH] x86/boot: Restrict directmap permissions for .text/.rodata
Posted by Jan Beulich 2 years, 4 months ago
On 06.12.2021 14:08, Andrew Cooper wrote:
> While we've been diligent to ensure that the main text/data/rodata mappings
> have suitable restrictions, their aliases via the directmap were left fully
> RW.  Worse, we even had pieces of code making use of this as a feature.
> 
> Restrict the permissions, as we have no legitimate need for writeability of
> these areas via the directmap alias.

Where do we end up reading .text and/or .rodata through the directmap? Can't
we zap the mappings altogether?

As to superpage shattering - I understand this is not deemed to be an issue
in the common case since, with Xen moved as high up below 4G as possible,
it wouldn't normally live inside a 1G mapping anyway? This may want calling
out here. Plus, in non-EFI, non-XEN_ALIGN_2M builds isn't this going to
shatter a 2M page at the tail of .rodata?

> Note that the pagetables and cpu0_stack do get written through their directmap
> alias, so we can't just read-only the whole image.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Roger Pau Monné <roger.pau@citrix.com>
> CC: Wei Liu <wl@xen.org>
> 
> Ever so slightly RFC, as it has only had light testing.
> 
> Notes:
>  * The stubs are still have RX via one alias, RW via another, and these need
>    to stay.  Hardening options include splitting the stubs so the SYSCALL ones
>    can be read-only after setup, and/or expanding the stub size to 4k per CPU
>    so we really can keep the writeable alias as not present when the stub
>    isn't in active use.
>  * Future CPUs with Protection Key Supervisor (Sapphire Rapids and later)
>    would be able to inhibit writeability outside of a permitted region, and
>    because the protection key is per logical thread, we woulnd't need to
>    expand the stubs to 4k per CPU.

I'm afraid I don't follow: The keys still apply to entire pages, don't they?
This would still allow write access by all 16 CPUs sharing a page for their
stubs.

>  * At the time of writing, PV Shim still makes use of .rodata's read/write
>    alias in the directmap to patch the hypercall table, but that runs earlier
>    on boot.  Also, there are patches out to address this.

I did consider committing that change, but it wasn't clear to me whether
there were dependencies on earlier parts of the series that it's part of.

>  * For backporting, this patch depends on c/s e7f147bf4ac7 ("x86/crash: Drop
>    manual hooking of exception_table[]"), and nothing would break at compile
>    time if the dependency was missing.

Hmm, not nice. I'm likely to forget if we would indeed decide to backport
the one here.

Jan


Re: [PATCH] x86/boot: Restrict directmap permissions for .text/.rodata
Posted by Andrew Cooper 2 years, 4 months ago
On 06/12/2021 13:58, Jan Beulich wrote:
> On 06.12.2021 14:08, Andrew Cooper wrote:
>> While we've been diligent to ensure that the main text/data/rodata mappings
>> have suitable restrictions, their aliases via the directmap were left fully
>> RW.  Worse, we even had pieces of code making use of this as a feature.
>>
>> Restrict the permissions, as we have no legitimate need for writeability of
>> these areas via the directmap alias.
> Where do we end up reading .text and/or .rodata through the directmap? Can't
> we zap the mappings altogether?

I felt it was safer to keep readability via the directmap.

I'm not aware of any logic we have which reads the directmap in order,
but it ought to be possible.

> As to superpage shattering - I understand this is not deemed to be an issue
> in the common case since, with Xen moved as high up below 4G as possible,
> it wouldn't normally live inside a 1G mapping anyway? This may want calling
> out here. Plus, in non-EFI, non-XEN_ALIGN_2M builds isn't this going to
> shatter a 2M page at the tail of .rodata?

cpu0_stack has already shattered down to 4k, which is likely in the same
superpage as rodata in a non-2M build.

But at the end of the day, it is a security/performance tradeoff.

memcpy(__va(__pa(divide_error)), "\x0f\x0b", 2);
asm ("div %ecx" :: "c" (0));

is an especially low barrier for an attacker who has a partial write gadget.

The security benefits are substantial, and the perf downsides are a
handful of extra pagetables, and a handful of pagewalks taking extra
steps, in non-fast paths (i.e. distinctly marginal).

It occurs to me while writing this that the same applies to livepatches.

>
>> Note that the pagetables and cpu0_stack do get written through their directmap
>> alias, so we can't just read-only the whole image.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> ---
>> CC: Jan Beulich <JBeulich@suse.com>
>> CC: Roger Pau Monné <roger.pau@citrix.com>
>> CC: Wei Liu <wl@xen.org>
>>
>> Ever so slightly RFC, as it has only had light testing.
>>
>> Notes:
>>  * The stubs are still have RX via one alias, RW via another, and these need
>>    to stay.  Hardening options include splitting the stubs so the SYSCALL ones
>>    can be read-only after setup, and/or expanding the stub size to 4k per CPU
>>    so we really can keep the writeable alias as not present when the stub
>>    isn't in active use.
>>  * Future CPUs with Protection Key Supervisor (Sapphire Rapids and later)
>>    would be able to inhibit writeability outside of a permitted region, and
>>    because the protection key is per logical thread, we woulnd't need to
>>    expand the stubs to 4k per CPU.
> I'm afraid I don't follow: The keys still apply to entire pages, don't they?
> This would still allow write access by all 16 CPUs sharing a page for their
> stubs.

It would be all stubs, because there are only 16 protection keys and we
wouldn't want to interleave adjacent stub mappings.

The logic would now be:

pks_allow_write_access();
write new stub;
pks_revoke_write_access();

so as to limit writeability of any stub to just the critical intending
to modify it.

This way, an unrelated buggy hypercall couldn't write into the stub.

>>  * At the time of writing, PV Shim still makes use of .rodata's read/write
>>    alias in the directmap to patch the hypercall table, but that runs earlier
>>    on boot.  Also, there are patches out to address this.
> I did consider committing that change, but it wasn't clear to me whether
> there were dependencies on earlier parts of the series that it's part of.

I've got a dis-entangled version in my CET series.

https://github.com/andyhhp/xen/commit/8d55e1c8ff1d979c985b3fb75c23627348c15209

which needed some header file adjustments to build.

But yes - I too was thinking that it ought to be committed.

~Andrew

Re: [PATCH] x86/boot: Restrict directmap permissions for .text/.rodata
Posted by Jan Beulich 2 years, 4 months ago
On 06.12.2021 16:11, Andrew Cooper wrote:
> On 06/12/2021 13:58, Jan Beulich wrote:
>> On 06.12.2021 14:08, Andrew Cooper wrote:
>>> While we've been diligent to ensure that the main text/data/rodata mappings
>>> have suitable restrictions, their aliases via the directmap were left fully
>>> RW.  Worse, we even had pieces of code making use of this as a feature.
>>>
>>> Restrict the permissions, as we have no legitimate need for writeability of
>>> these areas via the directmap alias.
>> Where do we end up reading .text and/or .rodata through the directmap? Can't
>> we zap the mappings altogether?
> 
> I felt it was safer to keep readability via the directmap.
> 
> I'm not aware of any logic we have which reads the directmap in order,
> but it ought to be possible.

Could you add a sentence to this effect to this description, please?

>> As to superpage shattering - I understand this is not deemed to be an issue
>> in the common case since, with Xen moved as high up below 4G as possible,
>> it wouldn't normally live inside a 1G mapping anyway? This may want calling
>> out here. Plus, in non-EFI, non-XEN_ALIGN_2M builds isn't this going to
>> shatter a 2M page at the tail of .rodata?
> 
> cpu0_stack has already shattered down to 4k, which is likely in the same
> superpage as rodata in a non-2M build.
> 
> But at the end of the day, it is a security/performance tradeoff.
> 
> memcpy(__va(__pa(divide_error)), "\x0f\x0b", 2);
> asm ("div %ecx" :: "c" (0));
> 
> is an especially low barrier for an attacker who has a partial write gadget.
> 
> The security benefits are substantial, and the perf downsides are a
> handful of extra pagetables, and a handful of pagewalks taking extra
> steps, in non-fast paths (i.e. distinctly marginal).

How do you easily know what paths there are accessing data on the same
(potential) superpage? However, thinking about it, with the directmap
mapping presumably not getting used at all, how the mapping is arranged
doesn't really matter (except for the extra memory needed, but as you
say that's probably marginal).

Jan


Re: [PATCH] x86/boot: Restrict directmap permissions for .text/.rodata
Posted by Andrew Cooper 1 year, 1 month ago
On 06/12/2021 3:21 pm, Jan Beulich wrote:
> On 06.12.2021 16:11, Andrew Cooper wrote:
>> On 06/12/2021 13:58, Jan Beulich wrote:
>>> On 06.12.2021 14:08, Andrew Cooper wrote:
>>>> While we've been diligent to ensure that the main text/data/rodata
>>>> mappings
>>>> have suitable restrictions, their aliases via the directmap were
>>>> left fully
>>>> RW.  Worse, we even had pieces of code making use of this as a
>>>> feature.
>>>>
>>>> Restrict the permissions, as we have no legitimate need for
>>>> writeability of
>>>> these areas via the directmap alias.
>>> Where do we end up reading .text and/or .rodata through the
>>> directmap? Can't
>>> we zap the mappings altogether?
>>
>> I felt it was safer to keep readability via the directmap.
>>
>> I'm not aware of any logic we have which reads the directmap in order,
>> but it ought to be possible.
>
> Could you add a sentence to this effect to this description, please?

Ok.  The commit message a rewrite anyway, given changes to the boot cpu
stack.

>
>>> As to superpage shattering - I understand this is not deemed to be
>>> an issue
>>> in the common case since, with Xen moved as high up below 4G as
>>> possible,
>>> it wouldn't normally live inside a 1G mapping anyway? This may want
>>> calling
>>> out here. Plus, in non-EFI, non-XEN_ALIGN_2M builds isn't this going to
>>> shatter a 2M page at the tail of .rodata?
>>
>> cpu0_stack has already shattered down to 4k, which is likely in the same
>> superpage as rodata in a non-2M build.
>>
>> But at the end of the day, it is a security/performance tradeoff.
>>
>> memcpy(__va(__pa(divide_error)), "\x0f\x0b", 2);
>> asm ("div %ecx" :: "c" (0));
>>
>> is an especially low barrier for an attacker who has a partial write
>> gadget.
>>
>> The security benefits are substantial, and the perf downsides are a
>> handful of extra pagetables, and a handful of pagewalks taking extra
>> steps, in non-fast paths (i.e. distinctly marginal).
>
> How do you easily know what paths there are accessing data on the same
> (potential) superpage? However, thinking about it, with the directmap
> mapping presumably not getting used at all, how the mapping is arranged
> doesn't really matter (except for the extra memory needed, but as you
> say that's probably marginal).

Indeed.  Any path which requires Xen to reach into guest state via the
directmap isn't a fastpath in the first place.

~Andrew