Implement entropy leak reporting for virtio-rng

[RFC PATCH 0/1] Implement entropy leak reporting for virtio-rng

Posted by Babis Chalios 2 years, 10 months ago

This patchset implements the entropy leak reporting feature proposal [1]
for virtio-rng devices.

Entropy leaking (as defined in the specification proposal) typically
happens when we take a snapshot of a VM or while we resume a VM from a
snapshot. In these cases, we want to let the guest know so that it can
reset state that needs to be unique, for example.

This feature offers functionality similar to VMGENID. However, it also
makes it possible to build guest-side mechanisms that notify user-space
applications, i.e. a VMGENID-like facility for user space in addition to
the kernel.

The new specification describes two request types that the guest might
place in the queues for the device to perform: a fill-on-leak request,
where the device fills a buffer with random bytes, and a copy-on-leak
request, where the device copies data between two guest-provided
buffers. We currently trigger the handling of guest requests when saving
the VM state and when loading a VM from a snapshot file.
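
As a rough illustration of the two request types described above, here is
a minimal standalone sketch. It is not the code from the patch: the enum,
the struct, handle_entropy_leak() and demo_fill() are hypothetical names
chosen for this example, and the real device works on virtqueue
descriptors rather than plain pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical request kinds; names are illustrative, not from the spec. */
enum leak_req_kind {
    LEAK_REQ_FILL, /* fill-on-leak: device writes random bytes into dst */
    LEAK_REQ_COPY, /* copy-on-leak: device copies src into dst */
};

/* Hypothetical in-memory view of one request queued by the guest. */
struct leak_req {
    enum leak_req_kind kind;
    const uint8_t *src; /* only used by LEAK_REQ_COPY */
    uint8_t *dst;
    size_t len;
};

/* Callback that synchronously produces len random bytes. */
typedef void (*random_fill_fn)(uint8_t *buf, size_t len);

/*
 * Handle every queued request in order. Meant to run on snapshot
 * save/load, before the guest resumes. Returns the number of requests
 * handled.
 */
static size_t handle_entropy_leak(struct leak_req *reqs, size_t nreqs,
                                  random_fill_fn fill)
{
    size_t done = 0;

    assert(fill != NULL);
    for (size_t i = 0; i < nreqs; i++) {
        struct leak_req *r = &reqs[i];

        switch (r->kind) {
        case LEAK_REQ_FILL:
            fill(r->dst, r->len);
            done++;
            break;
        case LEAK_REQ_COPY:
            memmove(r->dst, r->src, r->len);
            done++;
            break;
        }
    }
    return done;
}

/* Deterministic stand-in for an entropy source, for demonstration only. */
static void demo_fill(uint8_t *buf, size_t len)
{
    memset(buf, 0xA5, len);
}
```

A guest could, for example, queue a copy-on-leak request that overwrites
a generation counter and a fill-on-leak request that reseeds a buffer;
both would be completed by one call to handle_entropy_leak().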

This is an RFC, since the corresponding specification changes have not
yet been merged. It is also meant to allow testing the matching patch
set that implements the feature in the Linux front-end driver [2].

However, I would like to ask the community's opinion regarding the
handling of the fill-on-leak requests. Essentially, these requests are
very similar to the normal virtio-rng entropy requests, with the catch
that we should complete them before resuming the VM, so that we avoid
race conditions in notifying the guest about entropy leak events. This
means that we cannot rely on the RngBackend's API, which is
asynchronous. At the moment, I have handled that using getrandom(), but
I would like a solution that is not limited to (relatively new) Linux
hosts. I am inclined to solve that by extending the RngBackend API with
a synchronous call to request random bytes, and I'd like to hear
opinions on this approach.

[1] https://www.mail-archive.com/virtio-dev@lists.oasis-open.org/msg09016.html
[2] https://lore.kernel.org/lkml/20230131145543.86369-1-bchalios@amazon.es/

Babis Chalios (1):
  virtio-rng: implement entropy leak feature

 hw/virtio/virtio-rng.c                      | 170 +++++++++++++++++++-
 include/hw/virtio/virtio-rng.h              |   9 +-
 include/standard-headers/linux/virtio_rng.h |   3 +
 3 files changed, 179 insertions(+), 3 deletions(-)

-- 
2.39.2

Re: [RFC PATCH 0/1] Implement entropy leak reporting for virtio-rng

Posted by Amit Shah 2 years, 10 months ago

Hey Babis,

On Mon, 2023-04-03 at 12:52 +0200, Babis Chalios wrote:
> This patchset implements the entropy leak reporting feature proposal [1]
> for virtio-rng devices.
> 
> Entropy leaking (as defined in the specification proposal) typically
> happens when we take a snapshot of a VM or while we resume a VM from a
> snapshot. In these cases, we want to let the guest know so that it can
> reset state that needs to be unique, for example.
> 
> This feature is offering functionality similar to what VMGENID does.
> However, it allows to build mechanisms on the guest side to notify
> user-space applications, like VMGENID for userspace and additionally for
> kernel.
> 
> The new specification describes two request types that the guest might
> place in the queues for the device to perform, a fill-on-leak request
> where the device needs to fill with random bytes a buffer and a
> copy-on-leak request where the device needs to perform a copy between
> two guest-provided buffers. We currently trigger the handling of guest
> requests when saving the VM state and when loading a VM from a snapshot
> file.
> 
> This is an RFC, since the corresponding specification changes have not
> yet been merged. It also aims to allow testing a respective patch-set
> implementing the feature in the Linux front-end driver[2].
> 
> However, I would like to ask the community's opinion regarding the
> handling of the fill-on-leak requests. Essentially, these requests are
> very similar to the normal virtio-rng entropy requests, with the catch
> that we should complete these requests before resuming the VM, so that
> we avoid race-conditions in notifying the guest about entropy leak
> events. This means that we cannot rely on the RngBackend's API, which is
> asynchronous. At the moment, I have handled that using getrandom(), but
> I would like a solution which doesn't work only with (relatively new)
> Linux hosts. I am inclined to solve that by extending the RngBackend API
> with a synchronous call to request for random bytes and I'd like to hear
> opinions on this approach.

The patch looks OK - I suggest you add a new sync call that also probes
for the availability of getrandom().  If that doesn't exist, your new
code should figure out a way to deal with the lack of that call.

On older Linux or non-Linux, there are other ways of getting random
numbers, so I expect the sync backend you introduce to get more capable.
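
To make the probing idea concrete, here is a minimal standalone sketch,
assuming a POSIX host. It is not QEMU code: try_getrandom(),
read_urandom() and sync_random_bytes() are hypothetical names. It tries
the getrandom() system call first and falls back to a blocking read from
/dev/urandom if the call is unavailable.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for the syscall() declaration on glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>
#if defined(__linux__)
#include <sys/syscall.h>
#endif

/* Probe getrandom() via the raw syscall; 0 on success, -1 if unusable. */
static int try_getrandom(uint8_t *buf, size_t len)
{
#if defined(__linux__) && defined(SYS_getrandom)
    size_t off = 0;

    while (off < len) {
        long n = syscall(SYS_getrandom, buf + off, len - off, 0);

        if (n < 0) {
            if (errno == EINTR) {
                continue;
            }
            return -1; /* e.g. ENOSYS on kernels older than 3.17 */
        }
        off += (size_t)n;
    }
    return 0;
#else
    (void)buf;
    (void)len;
    return -1;
#endif
}

/* Fallback: blocking read from /dev/urandom; 0 on success, -1 on error. */
static int read_urandom(uint8_t *buf, size_t len)
{
    size_t off = 0;
    int fd = open("/dev/urandom", O_RDONLY);

    if (fd < 0) {
        return -1;
    }
    while (off < len) {
        ssize_t n = read(fd, buf + off, len - off);

        if (n < 0 && errno == EINTR) {
            continue;
        }
        if (n <= 0) {
            close(fd);
            return -1;
        }
        off += (size_t)n;
    }
    close(fd);
    return 0;
}

/* Synchronous entry point: returns only once buf is filled, or fails. */
static int sync_random_bytes(uint8_t *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    if (try_getrandom(buf, len) == 0) {
        return 0;
    }
    return read_urandom(buf, len);
}
```

A caller handling a fill-on-leak request would check the return value of
sync_random_bytes() and must decide what to do on failure, since the
guest is waiting for the buffer before it resumes.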

		Amit

Re: [RFC PATCH 0/1] Implement entropy leak reporting for virtio-rng

Posted by Jason A. Donenfeld 2 years, 10 months ago

On Tue, Apr 11, 2023 at 6:19 PM Amit Shah <amit@infradead.org> wrote:
>
> Hey Babis,
>
> On Mon, 2023-04-03 at 12:52 +0200, Babis Chalios wrote:
> > This patchset implements the entropy leak reporting feature proposal [1]
> > for virtio-rng devices.
> >
> > Entropy leaking (as defined in the specification proposal) typically
> > happens when we take a snapshot of a VM or while we resume a VM from a
> > snapshot. In these cases, we want to let the guest know so that it can
> > reset state that needs to be unique, for example.
> >
> > This feature is offering functionality similar to what VMGENID does.
> > However, it allows to build mechanisms on the guest side to notify
> > user-space applications, like VMGENID for userspace and additionally for
> > kernel.
> >
> > The new specification describes two request types that the guest might
> > place in the queues for the device to perform, a fill-on-leak request
> > where the device needs to fill with random bytes a buffer and a
> > copy-on-leak request where the device needs to perform a copy between
> > two guest-provided buffers. We currently trigger the handling of guest
> > requests when saving the VM state and when loading a VM from a snapshot
> > file.
> >
> > This is an RFC, since the corresponding specification changes have not
> > yet been merged. It also aims to allow testing a respective patch-set
> > implementing the feature in the Linux front-end driver[2].
> >
> > However, I would like to ask the community's opinion regarding the
> > handling of the fill-on-leak requests. Essentially, these requests are
> > very similar to the normal virtio-rng entropy requests, with the catch
> > that we should complete these requests before resuming the VM, so that
> > we avoid race-conditions in notifying the guest about entropy leak
> > events. This means that we cannot rely on the RngBackend's API, which is
> > asynchronous. At the moment, I have handled that using getrandom(), but
> > I would like a solution which doesn't work only with (relatively new)
> > Linux hosts. I am inclined to solve that by extending the RngBackend API
> > with a synchronous call to request for random bytes and I'd like to hear
> > opinions on this approach.
>
> The patch looks OK - I suggest you add a new sync call that also probes
> for the availability of getrandom().

qemu_guest_getrandom_nofail?

Re: [RFC PATCH 0/1] Implement entropy leak reporting for virtio-rng

Posted by Babis Chalios 2 years, 10 months ago


On 11/4/23 18:20, Jason A. Donenfeld wrote:
> On Tue, Apr 11, 2023 at 6:19 PM Amit Shah <amit@infradead.org> wrote:
>> Hey Babis,
>>
>> On Mon, 2023-04-03 at 12:52 +0200, Babis Chalios wrote:
>>> This patchset implements the entropy leak reporting feature proposal [1]
>>> for virtio-rng devices.
>>>
>>> Entropy leaking (as defined in the specification proposal) typically
>>> happens when we take a snapshot of a VM or while we resume a VM from a
>>> snapshot. In these cases, we want to let the guest know so that it can
>>> reset state that needs to be unique, for example.
>>>
>>> This feature is offering functionality similar to what VMGENID does.
>>> However, it allows to build mechanisms on the guest side to notify
>>> user-space applications, like VMGENID for userspace and additionally for
>>> kernel.
>>>
>>> The new specification describes two request types that the guest might
>>> place in the queues for the device to perform, a fill-on-leak request
>>> where the device needs to fill with random bytes a buffer and a
>>> copy-on-leak request where the device needs to perform a copy between
>>> two guest-provided buffers. We currently trigger the handling of guest
>>> requests when saving the VM state and when loading a VM from a snapshot
>>> file.
>>>
>>> This is an RFC, since the corresponding specification changes have not
>>> yet been merged. It also aims to allow testing a respective patch-set
>>> implementing the feature in the Linux front-end driver[2].
>>>
>>> However, I would like to ask the community's opinion regarding the
>>> handling of the fill-on-leak requests. Essentially, these requests are
>>> very similar to the normal virtio-rng entropy requests, with the catch
>>> that we should complete these requests before resuming the VM, so that
>>> we avoid race-conditions in notifying the guest about entropy leak
>>> events. This means that we cannot rely on the RngBackend's API, which is
>>> asynchronous. At the moment, I have handled that using getrandom(), but
>>> I would like a solution which doesn't work only with (relatively new)
>>> Linux hosts. I am inclined to solve that by extending the RngBackend API
>>> with a synchronous call to request for random bytes and I'd like to hear
>>> opinions on this approach.
>> The patch looks OK - I suggest you add a new sync call that also probes
>> for the availability of getrandom().
> qemu_guest_getrandom_nofail?

That should work, I think. Any objections to this Amit?

Cheers,
Babis

Re: [RFC PATCH 0/1] Implement entropy leak reporting for virtio-rng

Posted by Amit Shah 2 years, 10 months ago

On Thu, 2023-04-13 at 15:36 +0200, Babis Chalios wrote:
> 
> On 11/4/23 18:20, Jason A. Donenfeld wrote:
> > On Tue, Apr 11, 2023 at 6:19 PM Amit Shah <amit@infradead.org> wrote:
> > > Hey Babis,
> > > 
> > > On Mon, 2023-04-03 at 12:52 +0200, Babis Chalios wrote:
> > > > This patchset implements the entropy leak reporting feature proposal [1]
> > > > for virtio-rng devices.
> > > > 
> > > > Entropy leaking (as defined in the specification proposal) typically
> > > > happens when we take a snapshot of a VM or while we resume a VM from a
> > > > snapshot. In these cases, we want to let the guest know so that it can
> > > > reset state that needs to be unique, for example.
> > > > 
> > > > This feature is offering functionality similar to what VMGENID does.
> > > > However, it allows to build mechanisms on the guest side to notify
> > > > user-space applications, like VMGENID for userspace and additionally for
> > > > kernel.
> > > > 
> > > > The new specification describes two request types that the guest might
> > > > place in the queues for the device to perform, a fill-on-leak request
> > > > where the device needs to fill with random bytes a buffer and a
> > > > copy-on-leak request where the device needs to perform a copy between
> > > > two guest-provided buffers. We currently trigger the handling of guest
> > > > requests when saving the VM state and when loading a VM from a snapshot
> > > > file.
> > > > 
> > > > This is an RFC, since the corresponding specification changes have not
> > > > yet been merged. It also aims to allow testing a respective patch-set
> > > > implementing the feature in the Linux front-end driver[2].
> > > > 
> > > > However, I would like to ask the community's opinion regarding the
> > > > handling of the fill-on-leak requests. Essentially, these requests are
> > > > very similar to the normal virtio-rng entropy requests, with the catch
> > > > that we should complete these requests before resuming the VM, so that
> > > > we avoid race-conditions in notifying the guest about entropy leak
> > > > events. This means that we cannot rely on the RngBackend's API, which is
> > > > asynchronous. At the moment, I have handled that using getrandom(), but
> > > > I would like a solution which doesn't work only with (relatively new)
> > > > Linux hosts. I am inclined to solve that by extending the RngBackend API
> > > > with a synchronous call to request for random bytes and I'd like to hear
> > > > opinions on this approach.
> > > The patch looks OK - I suggest you add a new sync call that also probes
> > > for the availability of getrandom().
> > qemu_guest_getrandom_nofail?
> 
> That should work, I think. Any objections to this Amit?

That's one way to do it - and is fine too - but you still need to probe
before calling into that function to ensure it's not going to fail.

		Amit

Re: [RFC PATCH 0/1] Implement entropy leak reporting for virtio-rng

Posted by Jason A. Donenfeld 2 years, 10 months ago

Hi Babis,

Why are you resending this? As I mentioned before, I'm going to move
forward in implementing this feature in a way that actually works with
the RNG. I'll use your RFC patch as a base, but I think beyond that, I
can take it from here.

Thanks,
Jason

Re: [RFC PATCH 0/1] Implement entropy leak reporting for virtio-rng

Posted by Jason A. Donenfeld 2 years, 10 months ago

On Mon, Apr 3, 2023 at 4:15 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>
> Hi Babis,
>
> Why are you resending this? As I mentioned before, I'm going to move
> forward in implementing this feature in a way that actually works with
> the RNG. I'll use your RFC patch as a base, but I think beyond that, I
> can take it from here.

Grrr, sorry! This is for QEMU! I understand.

(Kernel ends from me are forthcoming.)

Jason

Re: [RFC PATCH 0/1] Implement entropy leak reporting for virtio-rng

Posted by bchalios@amazon.es 2 years, 10 months ago


On 4/3/23 4:16 PM, "Jason A. Donenfeld" <Jason@zx2c4.com> wrote:
> On Mon, Apr 3, 2023 at 4:15 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> >
> > Hi Babis,
> >
> > Why are you resending this? As I mentioned before, I'm going to move
> > forward in implementing this feature in a way that actually works with
> > the RNG. I'll use your RFC patch as a base, but I think beyond that, I
> > can take it from here.
> 
> Grrr, sorry! This is for QEMU! I understand.
> 
> (Kernel ends from me are forthcoming.)
> 
> Jason
> 

Hey Jason,

Good to hear from you. Yeap, I thought it would be nice to be able to test this using QEMU (apart from Firecracker).

Cheers,
Babis