[v1] RE: [RFC 0/4] Adding Virtual Memory Fuses to Xen

Posted by Stefano Stabellini 3 years, 1 month ago

On Tue, 20 Dec 2022, Smith, Jackson wrote:
> > Hi Stefano,
> >
> > On 16/12/2022 01:46, Stefano Stabellini wrote:
> > > On Thu, 15 Dec 2022, Julien Grall wrote:
> > >>>> On 13/12/2022 19:48, Smith, Jackson wrote:
> > >>> Yes, we are familiar with the "secret-free hypervisor" work. As you
> > >>> point out, both our work and the secret-free hypervisor remove the
> > >>> directmap region to mitigate the risk of leaking sensitive guest
> > >>> secrets. However, our work is slightly different because it
> > >>> additionally prevents attackers from tricking Xen into remapping a
> > >>> guest.
> > >>
> > >> I understand your goal, but I don't think this is achieved (see
> > >> above). You would need an entity to prevent write to TTBR0_EL2 in
> > >> order to fully protect it.
> > >
> > > > Without a way to stop Xen from reading/writing TTBR0_EL2, we cannot
> > > > claim that the guest's secrets are 100% safe.
> > > >
> > > > But the attacker would have to follow the sequence you outlined above
> > > > to change Xen's pagetables and remap guest memory before accessing it.
> > > > It is an additional obstacle for attackers that want to steal other
> > > > guests' secrets. The size of the code that the attacker would need to
> > > > inject in Xen would need to be bigger and more complex.
> >
> > Right, that's why I wrote with a bit more work. However, the nuance
> > you mention doesn't seem to be present in the cover letter:
> >
> > "This creates what we call "Software Enclaves", ensuring that an
> > adversary with arbitrary code execution in the hypervisor STILL cannot
> > read/write guest memory."
> >
> > So if the end goal is really to protect against *all* sorts of arbitrary
> > code, then I think we should have a rough idea of what this will look
> > like in Xen.
> >
> > From a brief look, it doesn't look like it would be possible to prevent
> > modification to TTBR0_EL2 (even from EL3). We would need to
> > investigate if there are other bits in the architecture to help us.
> >
> > >
> > > Every little helps :-)
> >
> > I can see how making the life of the attacker more difficult is 
> > appealing.
> > Yet, the goal needs to be clarified and the risk with the approach
> > acknowledged (see above).
> >
> 
> You're right, we should have mentioned this weakness in our first email.
> Sorry about the oversight! This is definitely still a limitation that we
> have not yet overcome. However, we do think that the increase in
> attacker workload that you and Stefano are discussing could still be
> valuable to security-conscious Xen users.
> 
> It would be nice to find additional architecture features that we can use
> to close this hole on Arm, but there aren't any that stand out to me
> either.
> 
> With this limitation in mind, what are the next steps we should take to
> support this feature for the xen community? Is this increase in attacker
> workload meaningful enough to justify the inclusion of VMF in Xen?

I think it could be valuable as an additional obstacle for the attacker
to overcome. The next step would be to port your series on top of
Julien's "Remove the directmap" patch series
https://marc.info/?l=xen-devel&m=167119090721116

Julien, what do you think?

Re: [RFC 0/4] Adding Virtual Memory Fuses to Xen

Posted by Julien Grall 3 years, 1 month ago

Hi Stefano,

On 22/12/2022 00:38, Stefano Stabellini wrote:
> I think it could be valuable as an additional obstacle for the attacker
> to overcome. The next step would be to port your series on top of
> Julien's "Remove the directmap" patch series
> https://marc.info/?l=xen-devel&m=167119090721116
> 
> Julien, what do you think?

If we want Xen to be used in confidential compute, then we need a
compelling story and to prove that we are at least as secure as other
hypervisors.

So I think we need to investigate a few areas:
    * Can we protect the TTBR? I don't think this can be done with the
      HW. But maybe I overlooked something.
    * Can VMF be extended to more use-cases? For instance, for
      hypercalls, we could have a bounce buffer.
    * If we can't fully secure VMF, can the attack surface be reduced
      (e.g. disable hypercalls at runtime/compile time)? Could we use a
      different architecture (I am thinking of something like pKVM [1])?
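
To make the bounce-buffer idea for hypercalls a little more concrete, here is
a minimal sketch in C. It is a toy model and not Xen code: `struct bounce_buf`,
`guest_stage_args()` and `hyp_fetch_args()` are hypothetical names, and the
assumption is that one small region stays shared between guest and hypervisor
so that handling a hypercall never requires mapping other guest memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define BOUNCE_SIZE 64

/* Hypothetical shared region: the only memory both sides can see in this model. */
struct bounce_buf {
    unsigned char data[BOUNCE_SIZE];
    size_t len;
};

/* Guest side: stage the hypercall arguments in the shared buffer. */
bool guest_stage_args(struct bounce_buf *bb, const void *args, size_t len)
{
    if (len > BOUNCE_SIZE)
        return false;
    memcpy(bb->data, args, len);
    bb->len = len;
    return true;
}

/* Hypervisor side: copy the arguments out, touching only the shared buffer. */
bool hyp_fetch_args(const struct bounce_buf *bb, void *out, size_t out_size)
{
    size_t len = bb->len; /* work on a local copy of the length before validating it */

    if (len > BOUNCE_SIZE || len > out_size)
        return false;
    memcpy(out, bb->data, len);
    return true;
}
```

In this model the guest calls `guest_stage_args()` before issuing the
hypercall and the hypervisor calls `hyp_fetch_args()` while handling it;
oversized or inconsistent lengths are rejected on both sides.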

Cheers,

[1] https://lwn.net/Articles/836693/

-- 
Julien Grall

Re: [RFC 0/4] Adding Virtual Memory Fuses to Xen

Posted by Demi Marie Obenour 3 years, 1 month ago

On Thu, Dec 22, 2022 at 09:52:11AM +0000, Julien Grall wrote:
> If we want Xen to be used in confidential compute, then we need a compelling
> story and to prove that we are at least as secure as other hypervisors.
> 
> So I think we need to investigate a few areas:
>    * Can we protect the TTBR? I don't think this can be done with the HW.
> But maybe I overlooked something.

This can be done by running most of Xen at a lower EL, and having only a
small trusted (and hopefully formally verified) kernel run at EL2.

>    * Can VMF be extended to more use-cases? For instance, for hypercalls,
> we could have a bounce buffer.
>    * If we can't fully secure VMF, can the attack surface be reduced (e.g.
> disable hypercalls at runtime/compile time)? Could we use a different
> architecture (I am thinking something like pKVM [1])?
> 
> Cheers,
> 
> [1] https://lwn.net/Articles/836693/

pKVM has been formally verified already, in the form of seKVM.  So there
very much is precedent for this.
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab

Re: [RFC 0/4] Adding Virtual Memory Fuses to Xen

Posted by Julien Grall 3 years, 1 month ago


On 22/12/2022 10:14, Demi Marie Obenour wrote:
> On Thu, Dec 22, 2022 at 09:52:11AM +0000, Julien Grall wrote:
>> If we want Xen to be used in confidential compute, then we need a compelling
>> story and to prove that we are at least as secure as other hypervisors.
>>
>> So I think we need to investigate a few areas:
>>     * Can we protect the TTBR? I don't think this can be done with the HW.
>> But maybe I overlooked something.
> 
> This can be done by running most of Xen at a lower EL, and having only a
> small trusted (and hopefully formally verified) kernel run at EL2.

This is what I hinted at in my 3rd bullet. :) I didn't consider this for
the first bullet because the goal of that question is to figure out
whether we can leave all of Xen running at EL2 and still have the same
guarantee.
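
The concern about TTBR0_EL2 can be illustrated with a toy model in C. This is
only a sketch under stated assumptions and says nothing about how the VMF
patches are actually implemented: `struct toy_table`, `struct toy_cpu` and the
`toy_*()` helpers are hypothetical, `root` stands in for TTBR0_EL2, and a
"fuse" is modelled as a flag that makes the normal mapping path refuse further
changes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_SLOTS 8
#define UNMAPPED UINT32_MAX

/* Toy translation table: one slot per virtual page, holding a frame number. */
struct toy_table {
    uint32_t frame[NR_SLOTS];
};

/* Toy CPU state: 'root' plays the role of TTBR0_EL2 in this model. */
struct toy_cpu {
    struct toy_table *root;
    bool fuse_blown;
};

void toy_table_init(struct toy_table *t)
{
    for (size_t i = 0; i < NR_SLOTS; i++)
        t->frame[i] = UNMAPPED;
}

/* Normal mapping path: refuses any change once the fuse is blown. */
bool toy_map(struct toy_cpu *cpu, size_t slot, uint32_t frame)
{
    if (cpu->fuse_blown || slot >= NR_SLOTS)
        return false;
    cpu->root->frame[slot] = frame;
    return true;
}

/* Nothing in this model guards the root itself, mirroring the concern above. */
void toy_write_root(struct toy_cpu *cpu, struct toy_table *t)
{
    cpu->root = t;
}

/* A frame is reachable if any slot of the active table points at it. */
bool toy_can_access(const struct toy_cpu *cpu, uint32_t frame)
{
    for (size_t i = 0; i < NR_SLOTS; i++)
        if (cpu->root->frame[i] == frame)
            return true;
    return false;
}
```

Once `fuse_blown` is set, `toy_map()` fails and a guest frame stays
unreachable through the original table. Calling `toy_write_root()` with a
table the attacker filled in separately makes the same frame reachable again,
which is the extra step an attacker would have to perform.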

Cheers,

-- 
Julien Grall

Re: [RFC 0/4] Adding Virtual Memory Fuses to Xen

Posted by Demi Marie Obenour 3 years, 1 month ago

On Thu, Dec 22, 2022 at 10:21:57AM +0000, Julien Grall wrote:
> 
> 
> On 22/12/2022 10:14, Demi Marie Obenour wrote:
> > On Thu, Dec 22, 2022 at 09:52:11AM +0000, Julien Grall wrote:
> > > If we want Xen to be used in confidential compute, then we need a compelling
> > > story and to prove that we are at least as secure as other hypervisors.
> > > 
> > > So I think we need to investigate a few areas:
> > >     * Can we protect the TTBR? I don't think this can be done with the HW.
> > > But maybe I overlooked something.
> > 
> > This can be done by running most of Xen at a lower EL, and having only a
> > small trusted (and hopefully formally verified) kernel run at EL2.
> 
> This is what I hinted at in my 3rd bullet. :) I didn't consider this for the
> first bullet because the goal of that question is to figure out whether we
> can leave all of Xen running at EL2 and still have the same guarantee.

It should be possible (see Google Native Client) but whether or not it
is useful is questionable.  I expect the complexity of the needed
compiler patches and binary-level static analysis to be greater than
that of running most of Xen at a lower exception level.
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab