From nobody Tue Feb 10 00:59:08 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0D9E8C433EF for ; Mon, 30 May 2022 21:00:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S243242AbiE3VAo (ORCPT ); Mon, 30 May 2022 17:00:44 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40110 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S243222AbiE3VAc (ORCPT ); Mon, 30 May 2022 17:00:32 -0400 Received: from mail-lj1-x233.google.com (mail-lj1-x233.google.com [IPv6:2a00:1450:4864:20::233]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D74EF915BB for ; Mon, 30 May 2022 14:00:29 -0700 (PDT) Received: by mail-lj1-x233.google.com with SMTP id t13so10323782ljd.6 for ; Mon, 30 May 2022 14:00:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=3gzHOM+c5HMloZ7sIBCpupu74wnaQ0zWHpxDMyFFqtQ=; b=mGcUtPiCk4PHtAzNluJb0vuddgduTuyyBtyazANXcy5nD1Xf/KtR0S9gIx13RnyRds 8u3rO5oCXpCHS2mZHww++hLTinm0nlnfMeiYj9kthO3SthrbB6DbrlrStbvwNSsfHEGc wiHH2K4htOWLf1r5a4hre0y3UKmZdHF8K/c7gXAcXUD3rs+GntblOimOOCeOWM6mZlDQ huEslXMfOOdMOB+tThTkdTLjzZgql2u33u46YX03yaCDbnad6zTWS4UeCmckfztLnbor BEJwJe277/EujUTG6ud3+fVt64QbT0GJeL0322CCZeAMk5nIuSga2OUr9HENmiul55pB k7yQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=3gzHOM+c5HMloZ7sIBCpupu74wnaQ0zWHpxDMyFFqtQ=; b=t3TpgZxISo7MThQCmD0TwaW48eTO/4Xx+9Wsv+gtES1f/ZibJYeu83JsZq90Zp4GpW VSoncF96FxyhgYJWXo4NjclW9N20hdxwIGpVnfhMzCxpLNIbE70piwtkV+C2H+GzcRrW cMHeeilbctYszUOaD0HR249GaZl8MBifPiIDhwGgLmawf8hYIrqEkE2r1ygzJNEkkvzu Y8LNbcJTYPmEDNkljh6jza1IyOxwdix0yPiJU4+X3SuzM/Sl7kWcEfZocLf2UeSQtZA6 LwCJmmRyRPDb8lhlfsOwWm9VyrARWomm6l8nUeRIcRdQZIf8weu2Z5KFK73PQJqEc/uW IsVg== X-Gm-Message-State: AOAM530yPf3uMgQ5HaKtjaFXppOIq5rIQVOSSoMx+sZjzK6DRhyKq39X gK1pLTKqMg8WlgClTooRBUU= X-Google-Smtp-Source: ABdhPJzpTG1IHR4PsqQEdQZUx9c91vwjmyp9o0CwooQn1XnsvHQd3A/2FTgg8PV1Fj5m7CxUzdqCUQ== X-Received: by 2002:a2e:7c15:0:b0:253:e242:1897 with SMTP id x21-20020a2e7c15000000b00253e2421897mr27412017ljc.72.1653944427560; Mon, 30 May 2022 14:00:27 -0700 (PDT) Received: from otyshchenko.router ([212.22.223.21]) by smtp.gmail.com with ESMTPSA id k21-20020a2ea275000000b0025550e2693asm581541ljm.38.2022.05.30.14.00.26 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Mon, 30 May 2022 14:00:27 -0700 (PDT) From: Oleksandr Tyshchenko To: xen-devel@lists.xenproject.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: Juergen Gross , Boris Ostrovsky , Stefano Stabellini , Julien Grall , Oleksandr Tyshchenko , "Michael S. Tsirkin" , Christoph Hellwig Subject: [PATCH V3 2/8] xen/grants: support allocating consecutive grants Date: Tue, 31 May 2022 00:00:11 +0300 Message-Id: <1653944417-17168-3-git-send-email-olekstysh@gmail.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1653944417-17168-1-git-send-email-olekstysh@gmail.com> References: <1653944417-17168-1-git-send-email-olekstysh@gmail.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" From: Juergen Gross For support of virtio via grant mappings in rare cases larger mappings using consecutive grants are needed. Support those by adding a bitmap of free grants. As consecutive grants will be needed only in very rare cases (e.g. when configuring a virtio device with a multi-page ring), optimize for the normal case of non-consecutive allocations. Signed-off-by: Juergen Gross Reviewed-by: Boris Ostrovsky --- Changes RFC -> V1: - no changes Changes V1 -> V2: - no changes Changes V2 -> V3: - rebase, move "if (unlikely(ref < GNTTAB_NR_RESERVED_ENTRIES))" to put_free_entry_locked() - do not overwrite "i" in gnttab_init(), introduce local max_nr_grefs - add a comment on top of "while (from < to)" in get_free_seq() - add Boris' R-b --- drivers/xen/grant-table.c | 251 +++++++++++++++++++++++++++++++++++++++---= ---- include/xen/grant_table.h | 4 + 2 files changed, 219 insertions(+), 36 deletions(-) diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c index 1a1aec0..947d82f 100644 --- a/drivers/xen/grant-table.c +++ b/drivers/xen/grant-table.c @@ -33,6 +33,7 @@ =20 #define pr_fmt(fmt) "xen:" KBUILD_MODNAME ": " fmt =20 +#include #include #include #include @@ -70,9 +71,32 @@ =20 static grant_ref_t **gnttab_list; static unsigned int nr_grant_frames; + +/* + * Handling of free grants: + * + * Free grants are in a simple list anchored in gnttab_free_head. They are + * linked by grant ref, the last element contains GNTTAB_LIST_END. The num= ber + * of free entries is stored in gnttab_free_count. + * Additionally there is a bitmap of free entries anchored in + * gnttab_free_bitmap. This is being used for simplifying allocation of + * multiple consecutive grants, which is needed e.g. for support of virtio. + * gnttab_last_free is used to add free entries of new frames at the end of + * the free list. + * gnttab_free_tail_ptr specifies the variable which references the start + * of consecutive free grants ending with gnttab_last_free. This pointer is + * updated in a rather defensive way, in order to avoid performance hits in + * hot paths. + * All those variables are protected by gnttab_list_lock. + */ static int gnttab_free_count; -static grant_ref_t gnttab_free_head; +static unsigned int gnttab_size; +static grant_ref_t gnttab_free_head =3D GNTTAB_LIST_END; +static grant_ref_t gnttab_last_free =3D GNTTAB_LIST_END; +static grant_ref_t *gnttab_free_tail_ptr; +static unsigned long *gnttab_free_bitmap; static DEFINE_SPINLOCK(gnttab_list_lock); + struct grant_frames xen_auto_xlat_grant_frames; static unsigned int xen_gnttab_version; module_param_named(version, xen_gnttab_version, uint, 0); @@ -168,16 +192,116 @@ static int get_free_entries(unsigned count) =20 ref =3D head =3D gnttab_free_head; gnttab_free_count -=3D count; - while (count-- > 1) - head =3D gnttab_entry(head); + while (count--) { + bitmap_clear(gnttab_free_bitmap, head, 1); + if (gnttab_free_tail_ptr =3D=3D __gnttab_entry(head)) + gnttab_free_tail_ptr =3D &gnttab_free_head; + if (count) + head =3D gnttab_entry(head); + } gnttab_free_head =3D gnttab_entry(head); gnttab_entry(head) =3D GNTTAB_LIST_END; =20 + if (!gnttab_free_count) { + gnttab_last_free =3D GNTTAB_LIST_END; + gnttab_free_tail_ptr =3D NULL; + } + spin_unlock_irqrestore(&gnttab_list_lock, flags); =20 return ref; } =20 +static int get_seq_entry_count(void) +{ + if (gnttab_last_free =3D=3D GNTTAB_LIST_END || !gnttab_free_tail_ptr || + *gnttab_free_tail_ptr =3D=3D GNTTAB_LIST_END) + return 0; + + return gnttab_last_free - *gnttab_free_tail_ptr + 1; +} + +/* Rebuilds the free grant list and tries to find count consecutive entrie= s. */ +static int get_free_seq(unsigned int count) +{ + int ret =3D -ENOSPC; + unsigned int from, to; + grant_ref_t *last; + + gnttab_free_tail_ptr =3D &gnttab_free_head; + last =3D &gnttab_free_head; + + for (from =3D find_first_bit(gnttab_free_bitmap, gnttab_size); + from < gnttab_size; + from =3D find_next_bit(gnttab_free_bitmap, gnttab_size, to + 1)) { + to =3D find_next_zero_bit(gnttab_free_bitmap, gnttab_size, + from + 1); + if (ret < 0 && to - from >=3D count) { + ret =3D from; + bitmap_clear(gnttab_free_bitmap, ret, count); + from +=3D count; + gnttab_free_count -=3D count; + if (from =3D=3D to) + continue; + } + + /* + * Recreate the free list in order to have it properly sorted. + * This is needed to make sure that the free tail has the maximum + * possible size. + */ + while (from < to) { + *last =3D from; + last =3D __gnttab_entry(from); + gnttab_last_free =3D from; + from++; + } + if (to < gnttab_size) + gnttab_free_tail_ptr =3D __gnttab_entry(to - 1); + } + + *last =3D GNTTAB_LIST_END; + if (gnttab_last_free !=3D gnttab_size - 1) + gnttab_free_tail_ptr =3D NULL; + + return ret; +} + +static int get_free_entries_seq(unsigned int count) +{ + unsigned long flags; + int ret =3D 0; + + spin_lock_irqsave(&gnttab_list_lock, flags); + + if (gnttab_free_count < count) { + ret =3D gnttab_expand(count - gnttab_free_count); + if (ret < 0) + goto out; + } + + if (get_seq_entry_count() < count) { + ret =3D get_free_seq(count); + if (ret >=3D 0) + goto out; + ret =3D gnttab_expand(count - get_seq_entry_count()); + if (ret < 0) + goto out; + } + + ret =3D *gnttab_free_tail_ptr; + *gnttab_free_tail_ptr =3D gnttab_entry(ret + count - 1); + gnttab_free_count -=3D count; + if (!gnttab_free_count) + gnttab_free_tail_ptr =3D NULL; + bitmap_clear(gnttab_free_bitmap, ret, count); + + out: + spin_unlock_irqrestore(&gnttab_list_lock, flags); + + return ret; +} + static void do_free_callbacks(void) { struct gnttab_free_callback *callback, *next; @@ -204,21 +328,51 @@ static inline void check_free_callbacks(void) do_free_callbacks(); } =20 -static void put_free_entry(grant_ref_t ref) +static void put_free_entry_locked(grant_ref_t ref) { - unsigned long flags; - if (unlikely(ref < GNTTAB_NR_RESERVED_ENTRIES)) return; =20 - spin_lock_irqsave(&gnttab_list_lock, flags); gnttab_entry(ref) =3D gnttab_free_head; gnttab_free_head =3D ref; + if (!gnttab_free_count) + gnttab_last_free =3D ref; + if (gnttab_free_tail_ptr =3D=3D &gnttab_free_head) + gnttab_free_tail_ptr =3D __gnttab_entry(ref); gnttab_free_count++; + bitmap_set(gnttab_free_bitmap, ref, 1); +} + +static void put_free_entry(grant_ref_t ref) +{ + unsigned long flags; + + spin_lock_irqsave(&gnttab_list_lock, flags); + put_free_entry_locked(ref); check_free_callbacks(); spin_unlock_irqrestore(&gnttab_list_lock, flags); } =20 +static void gnttab_set_free(unsigned int start, unsigned int n) +{ + unsigned int i; + + for (i =3D start; i < start + n - 1; i++) + gnttab_entry(i) =3D i + 1; + + gnttab_entry(i) =3D GNTTAB_LIST_END; + if (!gnttab_free_count) { + gnttab_free_head =3D start; + gnttab_free_tail_ptr =3D &gnttab_free_head; + } else { + gnttab_entry(gnttab_last_free) =3D start; + } + gnttab_free_count +=3D n; + gnttab_last_free =3D i; + + bitmap_set(gnttab_free_bitmap, start, n); +} + /* * Following applies to gnttab_update_entry_v1 and gnttab_update_entry_v2. * Introducing a valid entry into the grant table: @@ -450,23 +604,31 @@ void gnttab_free_grant_references(grant_ref_t head) { grant_ref_t ref; unsigned long flags; - int count =3D 1; - if (head =3D=3D GNTTAB_LIST_END) - return; + spin_lock_irqsave(&gnttab_list_lock, flags); - ref =3D head; - while (gnttab_entry(ref) !=3D GNTTAB_LIST_END) { - ref =3D gnttab_entry(ref); - count++; + while (head !=3D GNTTAB_LIST_END) { + ref =3D gnttab_entry(head); + put_free_entry_locked(head); + head =3D ref; } - gnttab_entry(ref) =3D gnttab_free_head; - gnttab_free_head =3D head; - gnttab_free_count +=3D count; check_free_callbacks(); spin_unlock_irqrestore(&gnttab_list_lock, flags); } EXPORT_SYMBOL_GPL(gnttab_free_grant_references); =20 +void gnttab_free_grant_reference_seq(grant_ref_t head, unsigned int count) +{ + unsigned long flags; + unsigned int i; + + spin_lock_irqsave(&gnttab_list_lock, flags); + for (i =3D count; i > 0; i--) + put_free_entry_locked(head + i - 1); + check_free_callbacks(); + spin_unlock_irqrestore(&gnttab_list_lock, flags); +} +EXPORT_SYMBOL_GPL(gnttab_free_grant_reference_seq); + int gnttab_alloc_grant_references(u16 count, grant_ref_t *head) { int h =3D get_free_entries(count); @@ -480,6 +642,24 @@ int gnttab_alloc_grant_references(u16 count, grant_ref= _t *head) } EXPORT_SYMBOL_GPL(gnttab_alloc_grant_references); =20 +int gnttab_alloc_grant_reference_seq(unsigned int count, grant_ref_t *firs= t) +{ + int h; + + if (count =3D=3D 1) + h =3D get_free_entries(1); + else + h =3D get_free_entries_seq(count); + + if (h < 0) + return -ENOSPC; + + *first =3D h; + + return 0; +} +EXPORT_SYMBOL_GPL(gnttab_alloc_grant_reference_seq); + int gnttab_empty_grant_references(const grant_ref_t *private_head) { return (*private_head =3D=3D GNTTAB_LIST_END); @@ -572,16 +752,13 @@ static int grow_gnttab_list(unsigned int more_frames) goto grow_nomem; } =20 + gnttab_set_free(gnttab_size, extra_entries); =20 - for (i =3D grefs_per_frame * nr_grant_frames; - i < grefs_per_frame * new_nr_grant_frames - 1; i++) - gnttab_entry(i) =3D i + 1; - - gnttab_entry(i) =3D gnttab_free_head; - gnttab_free_head =3D grefs_per_frame * nr_grant_frames; - gnttab_free_count +=3D extra_entries; + if (!gnttab_free_tail_ptr) + gnttab_free_tail_ptr =3D __gnttab_entry(gnttab_size); =20 nr_grant_frames =3D new_nr_grant_frames; + gnttab_size +=3D extra_entries; =20 check_free_callbacks(); =20 @@ -1424,20 +1601,20 @@ static int gnttab_expand(unsigned int req_entries) int gnttab_init(void) { int i; - unsigned long max_nr_grant_frames; + unsigned long max_nr_grant_frames, max_nr_grefs; unsigned int max_nr_glist_frames, nr_glist_frames; - unsigned int nr_init_grefs; int ret; =20 gnttab_request_version(); max_nr_grant_frames =3D gnttab_max_grant_frames(); + max_nr_grefs =3D max_nr_grant_frames * + gnttab_interface->grefs_per_grant_frame; nr_grant_frames =3D 1; =20 /* Determine the maximum number of frames required for the * grant reference free list on the current hypervisor. */ - max_nr_glist_frames =3D (max_nr_grant_frames * - gnttab_interface->grefs_per_grant_frame / RPP); + max_nr_glist_frames =3D max_nr_grefs / RPP; =20 gnttab_list =3D kmalloc_array(max_nr_glist_frames, sizeof(grant_ref_t *), @@ -1454,6 +1631,12 @@ int gnttab_init(void) } } =20 + gnttab_free_bitmap =3D bitmap_zalloc(max_nr_grefs, GFP_KERNEL); + if (!gnttab_free_bitmap) { + ret =3D -ENOMEM; + goto ini_nomem; + } + ret =3D arch_gnttab_init(max_nr_grant_frames, nr_status_frames(max_nr_grant_frames)); if (ret < 0) @@ -1464,15 +1647,10 @@ int gnttab_init(void) goto ini_nomem; } =20 - nr_init_grefs =3D nr_grant_frames * - gnttab_interface->grefs_per_grant_frame; - - for (i =3D GNTTAB_NR_RESERVED_ENTRIES; i < nr_init_grefs - 1; i++) - gnttab_entry(i) =3D i + 1; + gnttab_size =3D nr_grant_frames * gnttab_interface->grefs_per_grant_frame; =20 - gnttab_entry(nr_init_grefs - 1) =3D GNTTAB_LIST_END; - gnttab_free_count =3D nr_init_grefs - GNTTAB_NR_RESERVED_ENTRIES; - gnttab_free_head =3D GNTTAB_NR_RESERVED_ENTRIES; + gnttab_set_free(GNTTAB_NR_RESERVED_ENTRIES, + gnttab_size - GNTTAB_NR_RESERVED_ENTRIES); =20 printk("Grant table initialized\n"); return 0; @@ -1481,6 +1659,7 @@ int gnttab_init(void) for (i--; i >=3D 0; i--) free_page((unsigned long)gnttab_list[i]); kfree(gnttab_list); + bitmap_free(gnttab_free_bitmap); return ret; } EXPORT_SYMBOL_GPL(gnttab_init); diff --git a/include/xen/grant_table.h b/include/xen/grant_table.h index 7d0f2f0..a174f90 100644 --- a/include/xen/grant_table.h +++ b/include/xen/grant_table.h @@ -127,10 +127,14 @@ int gnttab_try_end_foreign_access(grant_ref_t ref); */ int gnttab_alloc_grant_references(u16 count, grant_ref_t *pprivate_head); =20 +int gnttab_alloc_grant_reference_seq(unsigned int count, grant_ref_t *firs= t); + void gnttab_free_grant_reference(grant_ref_t ref); =20 void gnttab_free_grant_references(grant_ref_t head); =20 +void gnttab_free_grant_reference_seq(grant_ref_t head, unsigned int count); + int gnttab_empty_grant_references(const grant_ref_t *pprivate_head); =20 int gnttab_claim_grant_reference(grant_ref_t *pprivate_head); --=20 2.7.4