From: Demi Marie Obenour
To: stable@vger.kernel.org, Xen developer discussion, Juergen Gross
Cc: Demi Marie Obenour
Subject: [PATCH 4.9] xen/gntdev: Avoid blocking in unmap_grant_pages()
Date: Mon, 27 Jun 2022 14:10:06 -0400
Message-Id: <20220627181006.1954-5-demi@invisiblethingslab.com>
X-Mailer: git-send-email 2.35.3
In-Reply-To: <20220627181006.1954-1-demi@invisiblethingslab.com>
References: <20220627181006.1954-1-demi@invisiblethingslab.com>

commit dbe97cff7dd9f0f75c524afdd55ad46be3d15295 upstream

unmap_grant_pages() currently waits for the pages to no longer be used.
In https://github.com/QubesOS/qubes-issues/issues/7481, this led to a
deadlock against i915: i915 was waiting for gntdev's MMU notifier to
finish, while gntdev was waiting for i915 to free its pages. I also
believe this is responsible for various deadlocks I have experienced in
the past.

Avoid these problems by making unmap_grant_pages async. This requires
making it return void, as any errors will not be available when the
function returns. Fortunately, the only use of the return value is a
WARN_ON(), which can be replaced by a WARN_ON when the error is detected.
Additionally, a failed call will not prevent further calls from being
made, but this is harmless.

Because unmap_grant_pages is now async, the grant handle will be set to
INVALID_GRANT_HANDLE too late to prevent multiple unmaps of the same
handle. Instead, a separate bool array is allocated for this purpose.
This wastes memory, but stuffing this information in padding bytes is
too fragile. Furthermore, it is necessary to grab a reference to the
map before making the asynchronous call, and release the reference when
the call returns.

It is also necessary to guard against reentrancy in gntdev_put_map(),
and to handle the case where userspace tries to map a mapping whose
contents have not all been freed yet.

Fixes: 745282256c75 ("xen/gntdev: safely unmap grants in case they are still in use")
Cc: stable@vger.kernel.org
Signed-off-by: Demi Marie Obenour
Reviewed-by: Juergen Gross
Link: https://lore.kernel.org/r/20220622022726.2538-1-demi@invisiblethingslab.com
Signed-off-by: Juergen Gross
---
 drivers/xen/gntdev.c | 144 ++++++++++++++++++++++++++++++-------------
 1 file changed, 102 insertions(+), 42 deletions(-)

diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index 69d59102ff1b..2c3248e71e9c 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -57,6 +57,7 @@ MODULE_PARM_DESC(limit, "Maximum number of grants that may be mapped by "
 
 static atomic_t pages_mapped = ATOMIC_INIT(0);
 
+/* True in PV mode, false otherwise */
 static int use_ptemod;
 #define populate_freeable_maps use_ptemod
 
@@ -92,11 +93,16 @@ struct grant_map {
 	struct gnttab_unmap_grant_ref *unmap_ops;
 	struct gnttab_map_grant_ref *kmap_ops;
 	struct gnttab_unmap_grant_ref *kunmap_ops;
+	bool *being_removed;
 	struct page **pages;
 	unsigned long pages_vm_start;
+	/* Number of live grants */
+	atomic_t live_grants;
+	/* Needed to avoid allocation in unmap_grant_pages */
+	struct gntab_unmap_queue_data unmap_data;
 };
 
-static int unmap_grant_pages(struct grant_map *map, int offset, int pages);
+static void unmap_grant_pages(struct grant_map *map, int offset, int pages);
 
 /* ------------------------------------------------------------------ */
 
@@ -127,6 +133,7 @@ static void gntdev_free_map(struct grant_map *map)
 	kfree(map->unmap_ops);
 	kfree(map->kmap_ops);
 	kfree(map->kunmap_ops);
+	kfree(map->being_removed);
 	kfree(map);
 }
 
@@ -145,12 +152,15 @@ static struct grant_map *gntdev_alloc_map(struct gntdev_priv *priv, int count)
 	add->kmap_ops = kcalloc(count, sizeof(add->kmap_ops[0]), GFP_KERNEL);
 	add->kunmap_ops = kcalloc(count, sizeof(add->kunmap_ops[0]), GFP_KERNEL);
 	add->pages = kcalloc(count, sizeof(add->pages[0]), GFP_KERNEL);
+	add->being_removed =
+		kcalloc(count, sizeof(add->being_removed[0]), GFP_KERNEL);
 	if (NULL == add->grants ||
 	    NULL == add->map_ops ||
 	    NULL == add->unmap_ops ||
 	    NULL == add->kmap_ops ||
 	    NULL == add->kunmap_ops ||
-	    NULL == add->pages)
+	    NULL == add->pages ||
+	    NULL == add->being_removed)
 		goto err;
 
 	if (gnttab_alloc_pages(count, add->pages))
@@ -215,6 +225,34 @@ static void gntdev_put_map(struct gntdev_priv *priv, struct grant_map *map)
 		return;
 
 	atomic_sub(map->count, &pages_mapped);
+	if (map->pages && !use_ptemod) {
+		/*
+		 * Increment the reference count. This ensures that the
+		 * subsequent call to unmap_grant_pages() will not wind up
+		 * re-entering itself. It *can* wind up calling
+		 * gntdev_put_map() recursively, but such calls will be with a
+		 * reference count greater than 1, so they will return before
+		 * this code is reached. The recursion depth is thus limited to
+		 * 1.
+		 */
+		atomic_set(&map->users, 1);
+
+		/*
+		 * Unmap the grants. This may or may not be asynchronous, so it
+		 * is possible that the reference count is 1 on return, but it
+		 * could also be greater than 1.
+		 */
+		unmap_grant_pages(map, 0, map->count);
+
+		/* Check if the memory now needs to be freed */
+		if (!atomic_dec_and_test(&map->users))
+			return;
+
+		/*
+		 * All pages have been returned to the hypervisor, so free the
+		 * map.
+		 */
+	}
 
 	if (map->notify.flags & UNMAP_NOTIFY_SEND_EVENT) {
 		notify_remote_via_evtchn(map->notify.event);
@@ -272,6 +310,7 @@ static int set_grant_ptes_as_special(pte_t *pte, pgtable_t token,
 
 static int map_grant_pages(struct grant_map *map)
 {
+	size_t alloced = 0;
 	int i, err = 0;
 
 	if (!use_ptemod) {
@@ -320,85 +359,107 @@ static int map_grant_pages(struct grant_map *map)
 			map->pages, map->count);
 
 	for (i = 0; i < map->count; i++) {
-		if (map->map_ops[i].status == GNTST_okay)
+		if (map->map_ops[i].status == GNTST_okay) {
 			map->unmap_ops[i].handle = map->map_ops[i].handle;
-		else if (!err)
+			if (!use_ptemod)
+				alloced++;
+		} else if (!err)
 			err = -EINVAL;
 
 		if (map->flags & GNTMAP_device_map)
 			map->unmap_ops[i].dev_bus_addr = map->map_ops[i].dev_bus_addr;
 
 		if (use_ptemod) {
-			if (map->kmap_ops[i].status == GNTST_okay)
+			if (map->kmap_ops[i].status == GNTST_okay) {
+				if (map->map_ops[i].status == GNTST_okay)
+					alloced++;
 				map->kunmap_ops[i].handle = map->kmap_ops[i].handle;
-			else if (!err)
+			} else if (!err)
 				err = -EINVAL;
 		}
 	}
+	atomic_add(alloced, &map->live_grants);
 	return err;
 }
 
-static int __unmap_grant_pages(struct grant_map *map, int offset, int pages)
+static void __unmap_grant_pages_done(int result,
+		struct gntab_unmap_queue_data *data)
 {
-	int i, err = 0;
-	struct gntab_unmap_queue_data unmap_data;
+	unsigned int i;
+	struct grant_map *map = data->data;
+	unsigned int offset = data->unmap_ops - map->unmap_ops;
+
+	for (i = 0; i < data->count; i++) {
+		WARN_ON(map->unmap_ops[offset+i].status);
+		pr_debug("unmap handle=%d st=%d\n",
+			map->unmap_ops[offset+i].handle,
+			map->unmap_ops[offset+i].status);
+		map->unmap_ops[offset+i].handle = -1;
+	}
+	/*
+	 * Decrease the live-grant counter. This must happen after the loop to
+	 * prevent premature reuse of the grants by gnttab_mmap().
+	 */
+	atomic_sub(data->count, &map->live_grants);
 
+	/* Release reference taken by unmap_grant_pages */
+	gntdev_put_map(NULL, map);
+}
+
+static void __unmap_grant_pages(struct grant_map *map, int offset, int pages)
+{
 	if (map->notify.flags & UNMAP_NOTIFY_CLEAR_BYTE) {
 		int pgno = (map->notify.addr >> PAGE_SHIFT);
+
 		if (pgno >= offset && pgno < offset + pages) {
 			/* No need for kmap, pages are in lowmem */
 			uint8_t *tmp = pfn_to_kaddr(page_to_pfn(map->pages[pgno]));
+
 			tmp[map->notify.addr & (PAGE_SIZE-1)] = 0;
 			map->notify.flags &= ~UNMAP_NOTIFY_CLEAR_BYTE;
 		}
 	}
 
-	unmap_data.unmap_ops = map->unmap_ops + offset;
-	unmap_data.kunmap_ops = use_ptemod ? map->kunmap_ops + offset : NULL;
-	unmap_data.pages = map->pages + offset;
-	unmap_data.count = pages;
+	map->unmap_data.unmap_ops = map->unmap_ops + offset;
+	map->unmap_data.kunmap_ops = use_ptemod ? map->kunmap_ops + offset : NULL;
+	map->unmap_data.pages = map->pages + offset;
+	map->unmap_data.count = pages;
+	map->unmap_data.done = __unmap_grant_pages_done;
+	map->unmap_data.data = map;
+	atomic_inc(&map->users); /* to keep map alive during async call below */
 
-	err = gnttab_unmap_refs_sync(&unmap_data);
-	if (err)
-		return err;
-
-	for (i = 0; i < pages; i++) {
-		if (map->unmap_ops[offset+i].status)
-			err = -EINVAL;
-		pr_debug("unmap handle=%d st=%d\n",
-			map->unmap_ops[offset+i].handle,
-			map->unmap_ops[offset+i].status);
-		map->unmap_ops[offset+i].handle = -1;
-	}
-	return err;
+	gnttab_unmap_refs_async(&map->unmap_data);
 }
 
-static int unmap_grant_pages(struct grant_map *map, int offset, int pages)
+static void unmap_grant_pages(struct grant_map *map, int offset, int pages)
 {
-	int range, err = 0;
+	int range;
+
+	if (atomic_read(&map->live_grants) == 0)
+		return; /* Nothing to do */
 
 	pr_debug("unmap %d+%d [%d+%d]\n", map->index, map->count, offset, pages);
 
 	/* It is possible the requested range will have a "hole" where we
 	 * already unmapped some of the grants. Only unmap valid ranges.
 	 */
-	while (pages && !err) {
-		while (pages && map->unmap_ops[offset].handle == -1) {
+	while (pages) {
+		while (pages && map->being_removed[offset]) {
 			offset++;
 			pages--;
 		}
 		range = 0;
 		while (range < pages) {
-			if (map->unmap_ops[offset+range].handle == -1)
+			if (map->being_removed[offset + range])
 				break;
+			map->being_removed[offset + range] = true;
 			range++;
 		}
-		err = __unmap_grant_pages(map, offset, range);
+		if (range)
+			__unmap_grant_pages(map, offset, range);
 		offset += range;
 		pages -= range;
 	}
-
-	return err;
 }
 
 /* ------------------------------------------------------------------ */
@@ -454,7 +515,6 @@ static void unmap_if_in_range(struct grant_map *map,
 			      unsigned long start, unsigned long end)
 {
 	unsigned long mstart, mend;
-	int err;
 
 	if (!map->vma)
 		return;
@@ -468,10 +528,9 @@ static void unmap_if_in_range(struct grant_map *map,
 			map->index, map->count,
 			map->vma->vm_start, map->vma->vm_end,
 			start, end, mstart, mend);
-	err = unmap_grant_pages(map,
+	unmap_grant_pages(map,
 				(mstart - map->vma->vm_start) >> PAGE_SHIFT,
 				(mend - mstart) >> PAGE_SHIFT);
-	WARN_ON(err);
 }
 
 static void mn_invl_range_start(struct mmu_notifier *mn,
@@ -503,7 +562,6 @@ static void mn_release(struct mmu_notifier *mn,
 {
 	struct gntdev_priv *priv = container_of(mn, struct gntdev_priv, mn);
 	struct grant_map *map;
-	int err;
 
 	mutex_lock(&priv->lock);
 	list_for_each_entry(map, &priv->maps, next) {
@@ -512,8 +570,7 @@ static void mn_release(struct mmu_notifier *mn,
 		pr_debug("map %d+%d (%lx %lx)\n",
 				map->index, map->count,
 				map->vma->vm_start, map->vma->vm_end);
-		err = unmap_grant_pages(map, /* offset */ 0, map->count);
-		WARN_ON(err);
+		unmap_grant_pages(map, /* offset */ 0, map->count);
 	}
 	list_for_each_entry(map, &priv->freeable_maps, next) {
 		if (!map->vma)
@@ -521,8 +578,7 @@ static void mn_release(struct mmu_notifier *mn,
 		pr_debug("map %d+%d (%lx %lx)\n",
 				map->index, map->count,
 				map->vma->vm_start, map->vma->vm_end);
-		err = unmap_grant_pages(map, /* offset */ 0, map->count);
-		WARN_ON(err);
+		unmap_grant_pages(map, /* offset */ 0, map->count);
 	}
 	mutex_unlock(&priv->lock);
 }
@@ -1012,6 +1068,10 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma)
 		goto unlock_out;
 	}
 
+	if (atomic_read(&map->live_grants)) {
+		err = -EAGAIN;
+		goto unlock_out;
+	}
 	atomic_inc(&map->users);
 
 	vma->vm_ops = &gntdev_vmops;
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab
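
As a reading aid for the bookkeeping the commit message describes, here is a
minimal userspace sketch of the three pieces working together: a per-grant
"being removed" flag that stops the same grant from being unmapped twice, a
live-grant counter that only drops once a completion has run, and a reference
taken before each asynchronous request and released in its completion. It is
a model under assumptions and not the kernel code: every toy_* name is
hypothetical, plain ints stand in for atomic_t, the asynchronous step is
simulated by a small queue drained by hand, and a "freed" flag stands in for
actually freeing the map.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_PENDING 16

/* Hypothetical stand-in for struct grant_map; not a kernel type. */
struct toy_map {
	int users;            /* reference count (atomic_t in the driver) */
	int live_grants;      /* grants whose unmap has not completed yet */
	bool *being_removed;  /* one flag per grant, set when an unmap is queued */
	bool freed;           /* set where a real implementation would free */
};

/* One queued unmap request; models a pending asynchronous call. */
struct toy_request {
	struct toy_map *map;
	size_t offset, count;
};

struct toy_queue {
	struct toy_request reqs[TOY_MAX_PENDING];
	size_t len;
};

/* Drop one reference; the last reference marks the map as freed. */
static void toy_put(struct toy_map *map)
{
	assert(map->users > 0);
	if (--map->users == 0)
		map->freed = true;
}

/*
 * Queue an unmap for every grant in [offset, offset + pages) that has not
 * been claimed by an earlier call. Returns the number of grants queued.
 */
static size_t toy_unmap_range(struct toy_map *map, struct toy_queue *q,
			      size_t offset, size_t pages)
{
	size_t queued = 0;

	if (map->live_grants == 0)
		return 0; /* nothing is mapped, so nothing to do */

	while (pages) {
		size_t range = 0;

		/* Skip grants that an earlier call already claimed. */
		while (pages && map->being_removed[offset]) {
			offset++;
			pages--;
		}
		/* Claim a contiguous run of unclaimed grants. */
		while (range < pages && !map->being_removed[offset + range]) {
			map->being_removed[offset + range] = true;
			range++;
		}
		if (range) {
			assert(q->len < TOY_MAX_PENDING);
			map->users++; /* keep the map alive until completion */
			q->reqs[q->len++] =
				(struct toy_request){ map, offset, range };
			queued += range;
		}
		offset += range;
		pages -= range;
	}
	return queued;
}

/* Run every pending completion, as the async machinery would do later. */
static void toy_complete_all(struct toy_queue *q)
{
	for (size_t i = 0; i < q->len; i++) {
		struct toy_request *r = &q->reqs[i];

		r->map->live_grants -= (int)r->count;
		toy_put(r->map); /* release the reference taken when queued */
	}
	q->len = 0;
}
```

In this model, overlapping calls to toy_unmap_range() queue each grant at
most once, and a caller that wanted to re-map would have to wait until
live_grants reaches zero, which corresponds to the -EAGAIN check added to
gntdev_mmap() in the patch.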