fs/userfaultfd.c | 28 ++- tools/testing/selftests/mm/uffd-unit-tests.c | 203 +++++++++++++++++++ 2 files changed, 225 insertions(+), 6 deletions(-)
When discussing some userfaultfd issues with Andrea, Andrea pointed out an ABI issue with userfaultfd that existed for years. Luckily the issue should only be a very corner case one, and the fix (even if changing the kernel ABI) should only be in the good way, IOW there should have no risk breaking any userapp but only fixing. This issue also should not matter if the userapp didn't enable any of the UFFD_FEATURE_EVENT_* feature. The first patch contains more information on the issue and the fix. The 2nd patch is a test case I added which would fail on old kernels (including current latest branches) but will start working after patch 1 applied. Thanks, Peter Xu (2): mm/userfaultfd: Fix uninitialized output field for -EAGAIN race mm/selftests: Add a test to verify mmap_changing race with -EAGAIN fs/userfaultfd.c | 28 ++- tools/testing/selftests/mm/uffd-unit-tests.c | 203 +++++++++++++++++++ 2 files changed, 225 insertions(+), 6 deletions(-) -- 2.48.1
On Thu, Apr 24, 2025 at 5:57 PM Peter Xu <peterx@redhat.com> wrote:
>
> When discussing some userfaultfd issues with Andrea, Andrea pointed out an
> ABI issue with userfaultfd that existed for years. Luckily the issue
> should only be a very corner case one, and the fix (even if changing the
> kernel ABI) should only be in the good way, IOW there should have no risk
> breaking any userapp but only fixing.
FWIW, my userspace basically looks like this:
struct uffdio_continue uffdio_continue;
int64_t target_len = /* whatever */;
int64_t bytes_mapped = 0;
int ioctl_ret;
do {
uffdio_continue.range = /* whatever */;
uffdio_continue.mapped = 0;
ioctl_ret = ioctl(uffd, UFFDIO_CONTINUE, &uffdio_continue);
if (uffdio_continue.mapped < 0) { break; }
bytes_mapped += uffdio_continue.mapped;
} while (bytes_mapped < target_len && errno == EAGAIN);
I think your patch would indeed break this. (Perhaps I shouldn't be
reading from `mapped` without first checking that errno == EAGAIN.)
Well, that's what I would say, except in practice I never actually hit
the mmap_changing case while invoking UFFDIO_CONTINUE. :)
On 25.04.25 17:45, James Houghton wrote:
> On Thu, Apr 24, 2025 at 5:57 PM Peter Xu <peterx@redhat.com> wrote:
>>
>> When discussing some userfaultfd issues with Andrea, Andrea pointed out an
>> ABI issue with userfaultfd that existed for years. Luckily the issue
>> should only be a very corner case one, and the fix (even if changing the
>> kernel ABI) should only be in the good way, IOW there should have no risk
>> breaking any userapp but only fixing.
>
> FWIW, my userspace basically looks like this:
>
> struct uffdio_continue uffdio_continue;
> int64_t target_len = /* whatever */;
> int64_t bytes_mapped = 0;
> int ioctl_ret;
> do {
> uffdio_continue.range = /* whatever */;
> uffdio_continue.mapped = 0;
> ioctl_ret = ioctl(uffd, UFFDIO_CONTINUE, &uffdio_continue);
> if (uffdio_continue.mapped < 0) { break; }
> bytes_mapped += uffdio_continue.mapped;
> } while (bytes_mapped < target_len && errno == EAGAIN);
>
> I think your patch would indeed break this. (Perhaps I shouldn't be
> reading from `mapped` without first checking that errno == EAGAIN.)
>
> Well, that's what I would say, except in practice I never actually hit
> the mmap_changing case while invoking UFFDIO_CONTINUE. :)
Hm, but what if mfill_atomic_continue() would already return -EAGAIN
when checking mmap_changing etc?
Wouldn't code already run into an issue there?
--
Cheers,
David / dhildenb
On Fri, Apr 25, 2025 at 11:58 AM David Hildenbrand <david@redhat.com> wrote:
>
> On 25.04.25 17:45, James Houghton wrote:
> > On Thu, Apr 24, 2025 at 5:57 PM Peter Xu <peterx@redhat.com> wrote:
> >>
> >> When discussing some userfaultfd issues with Andrea, Andrea pointed out an
> >> ABI issue with userfaultfd that existed for years. Luckily the issue
> >> should only be a very corner case one, and the fix (even if changing the
> >> kernel ABI) should only be in the good way, IOW there should have no risk
> >> breaking any userapp but only fixing.
> >
> > FWIW, my userspace basically looks like this:
> >
> > struct uffdio_continue uffdio_continue;
> > int64_t target_len = /* whatever */;
> > int64_t bytes_mapped = 0;
> > int ioctl_ret;
> > do {
> > uffdio_continue.range = /* whatever */;
> > uffdio_continue.mapped = 0;
> > ioctl_ret = ioctl(uffd, UFFDIO_CONTINUE, &uffdio_continue);
> > if (uffdio_continue.mapped < 0) { break; }
> > bytes_mapped += uffdio_continue.mapped;
> > } while (bytes_mapped < target_len && errno == EAGAIN);
> >
> > I think your patch would indeed break this. (Perhaps I shouldn't be
> > reading from `mapped` without first checking that errno == EAGAIN.)
> >
> > Well, that's what I would say, except in practice I never actually hit
> > the mmap_changing case while invoking UFFDIO_CONTINUE. :)
>
> Hm, but what if mfill_atomic_continue() would already return -EAGAIN
> when checking mmap_changing etc?
>
> Wouldn't code already run into an issue there?
Ah, thanks David. You're right, my code is already broken! :(
So given that we already have a case where -EAGAIN is put in the
output field, I change my mind, let's keep putting -EAGAIN in the
output field, and I'll go fix my code.
On Fri, Apr 25, 2025 at 12:07:31PM -0400, James Houghton wrote:
> On Fri, Apr 25, 2025 at 11:58 AM David Hildenbrand <david@redhat.com> wrote:
> >
> > On 25.04.25 17:45, James Houghton wrote:
> > > On Thu, Apr 24, 2025 at 5:57 PM Peter Xu <peterx@redhat.com> wrote:
> > >>
> > >> When discussing some userfaultfd issues with Andrea, Andrea pointed out an
> > >> ABI issue with userfaultfd that existed for years. Luckily the issue
> > >> should only be a very corner case one, and the fix (even if changing the
> > >> kernel ABI) should only be in the good way, IOW there should have no risk
> > >> breaking any userapp but only fixing.
> > >
> > > FWIW, my userspace basically looks like this:
> > >
> > > struct uffdio_continue uffdio_continue;
> > > int64_t target_len = /* whatever */;
> > > int64_t bytes_mapped = 0;
> > > int ioctl_ret;
> > > do {
> > > uffdio_continue.range = /* whatever */;
> > > uffdio_continue.mapped = 0;
> > > ioctl_ret = ioctl(uffd, UFFDIO_CONTINUE, &uffdio_continue);
> > > if (uffdio_continue.mapped < 0) { break; }
> > > bytes_mapped += uffdio_continue.mapped;
> > > } while (bytes_mapped < target_len && errno == EAGAIN);
> > >
> > > I think your patch would indeed break this. (Perhaps I shouldn't be
> > > reading from `mapped` without first checking that errno == EAGAIN.)
> > >
> > > Well, that's what I would say, except in practice I never actually hit
> > > the mmap_changing case while invoking UFFDIO_CONTINUE. :)
> >
> > Hm, but what if mfill_atomic_continue() would already return -EAGAIN
> > when checking mmap_changing etc?
> >
> > Wouldn't code already run into an issue there?
>
> Ah, thanks David. You're right, my code is already broken! :(
>
> So given that we already have a case where -EAGAIN is put in the
> output field, I change my mind, let's keep putting -EAGAIN in the
> output field, and I'll go fix my code.
Thanks both for the comments.
AFAIU it shouldn't affect any app that doesn't use UFFD_FEATURE_EVENT_* as
mentioned in cover letter. But I tend to agree a fix is good, that any app
should better check ioctl retval and errno, before anything else..
--
Peter Xu
© 2016 - 2025 Red Hat, Inc.