[PATCH for-4.16 v4] gnttab: allow setting max version per-domain

Roger Pau Monne posted 1 patch 2 years, 5 months ago
Failed in applying to current master (apply log)
There is a newer version of this series
docs/man/xl.cfg.5.pod.in             |  6 ++++++
docs/man/xl.conf.5.pod.in            |  7 +++++++
tools/helpers/init-xenstore-domain.c |  1 +
tools/include/libxl.h                |  7 +++++++
tools/libs/light/libxl_create.c      |  3 +++
tools/libs/light/libxl_dm.c          |  1 +
tools/libs/light/libxl_types.idl     |  1 +
tools/ocaml/libs/xc/xenctrl.ml       |  1 +
tools/ocaml/libs/xc/xenctrl.mli      |  1 +
tools/ocaml/libs/xc/xenctrl_stubs.c  |  7 ++++++-
tools/xl/xl.c                        |  8 ++++++++
tools/xl/xl.h                        |  1 +
tools/xl/xl_parse.c                  |  9 +++++++++
xen/arch/arm/domain_build.c          |  2 ++
xen/arch/x86/setup.c                 |  1 +
xen/common/domain.c                  |  3 ++-
xen/common/grant_table.c             | 28 ++++++++++++++++++++++++++--
xen/include/public/domctl.h          | 10 ++++++++--
xen/include/xen/config.h             |  1 +
xen/include/xen/grant_table.h        |  5 +++--
20 files changed, 95 insertions(+), 8 deletions(-)
[PATCH for-4.16 v4] gnttab: allow setting max version per-domain
Posted by Roger Pau Monne 2 years, 5 months ago
Introduce a new domain create field so that toolstack can specify the
maximum grant table version usable by the domain. This is plumbed into
xl and settable by the user as max_grant_version.

Previously this was only settable on a per host basis using the
gnttab command line option.

Note the version is specified using 4 bits, which leaves room to
specify up to grant table version 14 (15 is used to signal default max
version). Given that we only have 2 grant table versions right now,
and a new version is unlikely in the near future using 4 bits seems
more than enough.

xenstored stubdomains are limited to grant table v1 because the
current MiniOS code used to build them only has support for grants v1.
There are existing limits set for xenstored stubdomains at creation
time that already match the defaults in MiniOS.

Note that when using the default grant version the specific max
version in use by the domain is not migrated. Any guests should be
able to cope with the max grant version changing across migrations,
and if a specific guest relies on a maximum grant version being
unconditionally available it should be specified on the configuration
file.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
Reviewed-by: Ian Jackson <iwj@xenproject.org>
---
NB: the stubdom max grant version is cloned from the domain one. Not
sure whether long term we might want to use different options for the
stubdom and the domain. In any case the attack surface will always be
max(stubdom, domain), so maybe it's just pointless to allow more fine
grained settings.
---
Changes since v3:
 - Expand commit message re xenstored stubdomains.

Changes since v2:
 - Drop XEN_DOMCTLGRANT_MAX - it's unused.
 - Rename max_grant_version field to max_version in the grant table
   struct.
 - Print domain on log messages.
 - Print a message if host has more than 16Tb of RAM and grant v2 is
   disabled.
 - Add a TB macro.

Changes since v1:
 - Introduce a grant_opts field and use the low 4 bits to specify the
   version. Remaining bits will be used for other purposes.
---
Cc: Ian Jackson <iwj@xenproject.org>
---
Posting this patch alone as I think allowing to control transient
grants on a per-domain basis will require a bit more of work.

Release rationale:

We have had a bunch of security issues involving grant table v2 (382,
379, 268, 255) which could have been avoided by limiting the grant
table version available to guests. This can be currently done using a
global host parameter, but it's certainly more helpful to be able to
do it on a per domain basis from the toolstack.

Changes to the hypervisor by this patch are fairly minimal, as there
are already checks for the max grant table version allowed, so the
main change there is moving the max grant table version limit inside
the domain struct and plumbing it through the toolstrack.

I think the risk here is quite low for libxl/xl, because it's
extensively tested by osstest, so the main risk would be breaking the
Ocaml stubs, which could go unnoticed as those are not actually tested
by osstest.
---
 docs/man/xl.cfg.5.pod.in             |  6 ++++++
 docs/man/xl.conf.5.pod.in            |  7 +++++++
 tools/helpers/init-xenstore-domain.c |  1 +
 tools/include/libxl.h                |  7 +++++++
 tools/libs/light/libxl_create.c      |  3 +++
 tools/libs/light/libxl_dm.c          |  1 +
 tools/libs/light/libxl_types.idl     |  1 +
 tools/ocaml/libs/xc/xenctrl.ml       |  1 +
 tools/ocaml/libs/xc/xenctrl.mli      |  1 +
 tools/ocaml/libs/xc/xenctrl_stubs.c  |  7 ++++++-
 tools/xl/xl.c                        |  8 ++++++++
 tools/xl/xl.h                        |  1 +
 tools/xl/xl_parse.c                  |  9 +++++++++
 xen/arch/arm/domain_build.c          |  2 ++
 xen/arch/x86/setup.c                 |  1 +
 xen/common/domain.c                  |  3 ++-
 xen/common/grant_table.c             | 28 ++++++++++++++++++++++++++--
 xen/include/public/domctl.h          | 10 ++++++++--
 xen/include/xen/config.h             |  1 +
 xen/include/xen/grant_table.h        |  5 +++--
 20 files changed, 95 insertions(+), 8 deletions(-)

diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
index 55c4881205..21a39adb70 100644
--- a/docs/man/xl.cfg.5.pod.in
+++ b/docs/man/xl.cfg.5.pod.in
@@ -580,6 +580,12 @@ to have. This value controls how many pages of foreign domains can be accessed
 via the grant mechanism by this domain. The default value is settable via
 L<xl.conf(5)>.
 
+=item B<max_grant_version=NUMBER>
+
+Specify the maximum grant table version the domain is allowed to use. Current
+supported versions are 1 and 2. The default value is settable via
+L<xl.conf(5)>.
+
 =item B<nomigrate=BOOLEAN>
 
 Disable migration of this domain.  This enables certain other features
diff --git a/docs/man/xl.conf.5.pod.in b/docs/man/xl.conf.5.pod.in
index b48e99131a..0a70698a7c 100644
--- a/docs/man/xl.conf.5.pod.in
+++ b/docs/man/xl.conf.5.pod.in
@@ -101,6 +101,13 @@ Sets the default value for the C<max_maptrack_frames> domain config value.
 Default: value of Xen command line B<gnttab_max_maptrack_frames>
 parameter (or its default value if unspecified).
 
+=item B<max_grant_version=NUMBER>
+
+Sets the default value for the C<max_grant_version> domain config value.
+
+Default: value of Xen command line B<gnttab> parameter (or its default value if
+unspecified).
+
 =item B<vif.default.script="PATH">
 
 Configures the default hotplug script used by virtual network devices.
diff --git a/tools/helpers/init-xenstore-domain.c b/tools/helpers/init-xenstore-domain.c
index 6836002f0b..7cd1aa8f7c 100644
--- a/tools/helpers/init-xenstore-domain.c
+++ b/tools/helpers/init-xenstore-domain.c
@@ -88,6 +88,7 @@ static int build(xc_interface *xch)
          */
         .max_grant_frames = 4,
         .max_maptrack_frames = 128,
+        .grant_opts = 1,
     };
 
     xs_fd = open("/dev/xen/xenbus_backend", O_RDWR);
diff --git a/tools/include/libxl.h b/tools/include/libxl.h
index 2e8679dbcb..8621161845 100644
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -514,6 +514,13 @@
  */
 #define LIBXL_HAVE_VPMU 1
 
+/*
+ * LIBXL_HAVE_MAX_GRANT_VERSION indicates libxl_domain_build_info has a
+ * max_grant_version field for setting the max grant table version per
+ * domain.
+ */
+#define LIBXL_HAVE_MAX_GRANT_VERSION 1
+
 /*
  * libxl ABI compatibility
  *
diff --git a/tools/libs/light/libxl_create.c b/tools/libs/light/libxl_create.c
index 5a61d01722..431c569dd2 100644
--- a/tools/libs/light/libxl_create.c
+++ b/tools/libs/light/libxl_create.c
@@ -607,6 +607,9 @@ int libxl__domain_make(libxl__gc *gc, libxl_domain_config *d_config,
             .max_evtchn_port = b_info->event_channels,
             .max_grant_frames = b_info->max_grant_frames,
             .max_maptrack_frames = b_info->max_maptrack_frames,
+            .grant_opts = b_info->max_grant_version == -1
+                          ? XEN_DOMCTL_GRANT_version_default
+                          : b_info->max_grant_version,
             .vmtrace_size = ROUNDUP(b_info->vmtrace_buf_kb << 10, XC_PAGE_SHIFT),
         };
 
diff --git a/tools/libs/light/libxl_dm.c b/tools/libs/light/libxl_dm.c
index 9d93056b5c..9a8ddbe188 100644
--- a/tools/libs/light/libxl_dm.c
+++ b/tools/libs/light/libxl_dm.c
@@ -2320,6 +2320,7 @@ void libxl__spawn_stub_dm(libxl__egc *egc, libxl__stub_dm_spawn_state *sdss)
 
     dm_config->b_info.max_grant_frames = guest_config->b_info.max_grant_frames;
     dm_config->b_info.max_maptrack_frames = guest_config->b_info.max_maptrack_frames;
+    dm_config->b_info.max_grant_version = guest_config->b_info.max_grant_version;
 
     dm_config->b_info.u.pv.features = "";
 
diff --git a/tools/libs/light/libxl_types.idl b/tools/libs/light/libxl_types.idl
index 608d55a456..ce856febe5 100644
--- a/tools/libs/light/libxl_types.idl
+++ b/tools/libs/light/libxl_types.idl
@@ -519,6 +519,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
 
     ("max_grant_frames",    uint32, {'init_val': 'LIBXL_MAX_GRANT_DEFAULT'}),
     ("max_maptrack_frames", uint32, {'init_val': 'LIBXL_MAX_GRANT_DEFAULT'}),
+    ("max_grant_version",   integer, {'init_val': '-1'}),
     
     ("device_model_version", libxl_device_model_version),
     ("device_model_stubdomain", libxl_defbool),
diff --git a/tools/ocaml/libs/xc/xenctrl.ml b/tools/ocaml/libs/xc/xenctrl.ml
index addcf4cc59..d3eacfba6e 100644
--- a/tools/ocaml/libs/xc/xenctrl.ml
+++ b/tools/ocaml/libs/xc/xenctrl.ml
@@ -84,6 +84,7 @@ type domctl_create_config =
 	max_evtchn_port: int;
 	max_grant_frames: int;
 	max_maptrack_frames: int;
+	max_grant_version: int;
 	arch: arch_domainconfig;
 }
 
diff --git a/tools/ocaml/libs/xc/xenctrl.mli b/tools/ocaml/libs/xc/xenctrl.mli
index 0a5ce529e9..96e5d14643 100644
--- a/tools/ocaml/libs/xc/xenctrl.mli
+++ b/tools/ocaml/libs/xc/xenctrl.mli
@@ -76,6 +76,7 @@ type domctl_create_config = {
   max_evtchn_port: int;
   max_grant_frames: int;
   max_maptrack_frames: int;
+  max_grant_version: int;
   arch: arch_domainconfig;
 }
 
diff --git a/tools/ocaml/libs/xc/xenctrl_stubs.c b/tools/ocaml/libs/xc/xenctrl_stubs.c
index ad953d36bd..1e60925069 100644
--- a/tools/ocaml/libs/xc/xenctrl_stubs.c
+++ b/tools/ocaml/libs/xc/xenctrl_stubs.c
@@ -188,7 +188,8 @@ CAMLprim value stub_xc_domain_create(value xch, value wanted_domid, value config
 #define VAL_MAX_EVTCHN_PORT     Field(config, 5)
 #define VAL_MAX_GRANT_FRAMES    Field(config, 6)
 #define VAL_MAX_MAPTRACK_FRAMES Field(config, 7)
-#define VAL_ARCH                Field(config, 8)
+#define VAL_MAX_GRANT_VERSION   Field(config, 8)
+#define VAL_ARCH                Field(config, 9)
 
 	uint32_t domid = Int_val(wanted_domid);
 	int result;
@@ -198,6 +199,9 @@ CAMLprim value stub_xc_domain_create(value xch, value wanted_domid, value config
 		.max_evtchn_port = Int_val(VAL_MAX_EVTCHN_PORT),
 		.max_grant_frames = Int_val(VAL_MAX_GRANT_FRAMES),
 		.max_maptrack_frames = Int_val(VAL_MAX_MAPTRACK_FRAMES),
+		.grant_opts = Int_val(VAL_MAX_GRANT_VERSION) == -1
+			      ? XEN_DOMCTL_GRANT_version_default
+			      : Int_val(VAL_MAX_GRANT_VERSION),
 	};
 
 	domain_handle_of_uuid_string(cfg.handle, String_val(VAL_HANDLE));
@@ -251,6 +255,7 @@ CAMLprim value stub_xc_domain_create(value xch, value wanted_domid, value config
 	}
 
 #undef VAL_ARCH
+#undef VAL_MAX_GRANT_VERSION
 #undef VAL_MAX_MAPTRACK_FRAMES
 #undef VAL_MAX_GRANT_FRAMES
 #undef VAL_MAX_EVTCHN_PORT
diff --git a/tools/xl/xl.c b/tools/xl/xl.c
index f422f9fed5..7564ff323b 100644
--- a/tools/xl/xl.c
+++ b/tools/xl/xl.c
@@ -55,6 +55,7 @@ bool progress_use_cr = 0;
 bool timestamps = 0;
 int max_grant_frames = -1;
 int max_maptrack_frames = -1;
+int max_grant_version = -1;
 libxl_domid domid_policy = INVALID_DOMID;
 
 xentoollog_level minmsglevel = minmsglevel_default;
@@ -219,6 +220,13 @@ static void parse_global_config(const char *configfile,
     else if (e != ESRCH)
         exit(1);
 
+    e = xlu_cfg_get_bounded_long (config, "max_grant_version", 0,
+                                  INT_MAX, &l, 1);
+    if (!e)
+        max_grant_version = l;
+    else if (e != ESRCH)
+        exit(1);
+
     libxl_cpu_bitmap_alloc(ctx, &global_vm_affinity_mask, 0);
     libxl_cpu_bitmap_alloc(ctx, &global_hvm_affinity_mask, 0);
     libxl_cpu_bitmap_alloc(ctx, &global_pv_affinity_mask, 0);
diff --git a/tools/xl/xl.h b/tools/xl/xl.h
index 7e23f30192..cf12c79a9b 100644
--- a/tools/xl/xl.h
+++ b/tools/xl/xl.h
@@ -282,6 +282,7 @@ extern char *default_colo_proxy_script;
 extern char *blkdev_start;
 extern int max_grant_frames;
 extern int max_maptrack_frames;
+extern int max_grant_version;
 extern libxl_bitmap global_vm_affinity_mask;
 extern libxl_bitmap global_hvm_affinity_mask;
 extern libxl_bitmap global_pv_affinity_mask;
diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c
index c503b9be00..117fcdcb2b 100644
--- a/tools/xl/xl_parse.c
+++ b/tools/xl/xl_parse.c
@@ -1431,6 +1431,15 @@ void parse_config_data(const char *config_source,
     else
         exit(1);
 
+    e = xlu_cfg_get_bounded_long (config, "max_grant_version", 0,
+                                  INT_MAX, &l, 1);
+    if (e == ESRCH) /* not specified */
+        b_info->max_grant_version = max_grant_version;
+    else if (!e)
+        b_info->max_grant_version = l;
+    else
+        exit(1);
+
     libxl_defbool_set(&b_info->claim_mode, claim_mode);
 
     if (xlu_cfg_get_string (config, "on_poweroff", &buf, 0))
diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 0167731ab0..faeb3eba76 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -2967,6 +2967,7 @@ void __init create_domUs(void)
             .max_evtchn_port = -1,
             .max_grant_frames = -1,
             .max_maptrack_frames = -1,
+            .grant_opts = XEN_DOMCTL_GRANT_version_default,
         };
 
         if ( !dt_device_is_compatible(node, "xen,domain") )
@@ -3074,6 +3075,7 @@ void __init create_dom0(void)
         .max_evtchn_port = -1,
         .max_grant_frames = gnttab_dom0_frames(),
         .max_maptrack_frames = -1,
+        .grant_opts = XEN_DOMCTL_GRANT_version_default,
     };
 
     /* The vGIC for DOM0 is exactly emulating the hardware GIC */
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index b101565f14..b5b6c75447 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -750,6 +750,7 @@ static struct domain *__init create_dom0(const module_t *image,
         .max_evtchn_port = -1,
         .max_grant_frames = -1,
         .max_maptrack_frames = -1,
+        .grant_opts = XEN_DOMCTL_GRANT_version_default,
         .max_vcpus = dom0_max_vcpus(),
         .arch = {
             .misc_flags = opt_dom0_msr_relaxed ? XEN_X86_MSR_RELAXED : 0,
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 8b53c49d1e..0c7052c770 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -678,7 +678,8 @@ struct domain *domain_create(domid_t domid,
         init_status |= INIT_evtchn;
 
         if ( (err = grant_table_init(d, config->max_grant_frames,
-                                     config->max_maptrack_frames)) != 0 )
+                                     config->max_maptrack_frames,
+                                     config->grant_opts)) != 0 )
             goto fail;
         init_status |= INIT_gnttab;
 
diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index e510395d08..f94f0f272c 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -53,6 +53,7 @@ struct grant_table {
     percpu_rwlock_t       lock;
     /* Lock protecting the maptrack limit */
     spinlock_t            maptrack_lock;
+    unsigned int          max_version;
     /*
      * Defaults to v1.  May be changed with GNTTABOP_set_version.  All other
      * values are invalid.
@@ -1917,11 +1918,33 @@ active_alloc_failed:
 }
 
 int grant_table_init(struct domain *d, int max_grant_frames,
-                     int max_maptrack_frames)
+                     int max_maptrack_frames, unsigned int options)
 {
     struct grant_table *gt;
+    unsigned int max_grant_version = options & XEN_DOMCTL_GRANT_version_mask;
     int ret = -ENOMEM;
 
+    if ( max_grant_version == XEN_DOMCTL_GRANT_version_default )
+        max_grant_version = opt_gnttab_max_version;
+    if ( !max_grant_version )
+    {
+        dprintk(XENLOG_INFO, "%pd: invalid grant table version 0 requested\n",
+                d);
+        return -EINVAL;
+    }
+    if ( max_grant_version > opt_gnttab_max_version )
+    {
+        dprintk(XENLOG_INFO,
+                "%pd: requested grant version (%u) greater than supported (%u)\n",
+                d, max_grant_version, opt_gnttab_max_version);
+        return -EINVAL;
+    }
+    if ( unlikely(max_page >= PFN_DOWN(TB(16))) && is_pv_domain(d) &&
+         max_grant_version < 2 )
+        dprintk(XENLOG_INFO,
+                "%pd: host memory above 16Tb and grant table v2 disabled\n",
+                d);
+
     /* Default to maximum value if no value was specified */
     if ( max_grant_frames < 0 )
         max_grant_frames = opt_max_grant_frames;
@@ -1947,6 +1970,7 @@ int grant_table_init(struct domain *d, int max_grant_frames,
     gt->gt_version = 1;
     gt->max_grant_frames = max_grant_frames;
     gt->max_maptrack_frames = max_maptrack_frames;
+    gt->max_version = max_grant_version;
 
     /* Install the structure early to simplify the error path. */
     gt->domain = d;
@@ -3076,7 +3100,7 @@ gnttab_set_version(XEN_GUEST_HANDLE_PARAM(gnttab_set_version_t) uop)
         goto out;
 
     res = -ENOSYS;
-    if ( op.version == 2 && opt_gnttab_max_version == 1 )
+    if ( op.version == 2 && gt->max_version == 1 )
         goto out; /* Behave as before set_version was introduced. */
 
     res = 0;
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index 51017b47bc..0ec57614bd 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -89,14 +89,20 @@ struct xen_domctl_createdomain {
     /*
      * Various domain limits, which impact the quantity of resources
      * (global mapping space, xenheap, etc) a guest may consume.  For
-     * max_grant_frames and max_maptrack_frames, < 0 means "use the
-     * default maximum value in the hypervisor".
+     * max_grant_frames, max_maptrack_frames and max_gnttab_version < 0
+     * means "use the default maximum value in the hypervisor".
      */
     uint32_t max_vcpus;
     uint32_t max_evtchn_port;
     int32_t max_grant_frames;
     int32_t max_maptrack_frames;
 
+/* Grant version, use low 4 bits. */
+#define XEN_DOMCTL_GRANT_version_mask    0xf
+#define XEN_DOMCTL_GRANT_version_default 0xf
+
+    uint32_t grant_opts;
+
     /* Per-vCPU buffer size in bytes.  0 to disable. */
     uint32_t vmtrace_size;
 
diff --git a/xen/include/xen/config.h b/xen/include/xen/config.h
index b76222ecf6..37b42c756a 100644
--- a/xen/include/xen/config.h
+++ b/xen/include/xen/config.h
@@ -81,6 +81,7 @@
 #define KB(_kb)     (_AC(_kb, ULL) << 10)
 #define MB(_mb)     (_AC(_mb, ULL) << 20)
 #define GB(_gb)     (_AC(_gb, ULL) << 30)
+#define TB(_tb)     (_AC(_tb, ULL) << 40)
 
 #define IS_ALIGNED(val, align) (((val) & ((align) - 1)) == 0)
 
diff --git a/xen/include/xen/grant_table.h b/xen/include/xen/grant_table.h
index 41713e2726..fe6225346f 100644
--- a/xen/include/xen/grant_table.h
+++ b/xen/include/xen/grant_table.h
@@ -36,7 +36,7 @@ extern unsigned int opt_max_grant_frames;
 
 /* Create/destroy per-domain grant table context. */
 int grant_table_init(struct domain *d, int max_grant_frames,
-                     int max_maptrack_frames);
+                     int max_maptrack_frames, unsigned int options);
 void grant_table_destroy(
     struct domain *d);
 void grant_table_init_vcpu(struct vcpu *v);
@@ -67,7 +67,8 @@ int gnttab_acquire_resource(
 
 static inline int grant_table_init(struct domain *d,
                                    int max_grant_frames,
-                                   int max_maptrack_frames)
+                                   int max_maptrack_frames,
+                                   unsigned int options)
 {
     return 0;
 }
-- 
2.33.0


Re: [PATCH for-4.16 v4] gnttab: allow setting max version per-domain
Posted by Julien Grall 2 years, 5 months ago
Hi Roger,

On 29/10/2021 08:59, Roger Pau Monne wrote:
> diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
> index e510395d08..f94f0f272c 100644
> --- a/xen/common/grant_table.c
> +++ b/xen/common/grant_table.c
> @@ -53,6 +53,7 @@ struct grant_table {
>       percpu_rwlock_t       lock;
>       /* Lock protecting the maptrack limit */
>       spinlock_t            maptrack_lock;
> +    unsigned int          max_version;
>       /*
>        * Defaults to v1.  May be changed with GNTTABOP_set_version.  All other
>        * values are invalid.
> @@ -1917,11 +1918,33 @@ active_alloc_failed:
>   }
>   
>   int grant_table_init(struct domain *d, int max_grant_frames,
> -                     int max_maptrack_frames)
> +                     int max_maptrack_frames, unsigned int options)
>   {
>       struct grant_table *gt;
> +    unsigned int max_grant_version = options & XEN_DOMCTL_GRANT_version_mask;
>       int ret = -ENOMEM;
>   
> +    if ( max_grant_version == XEN_DOMCTL_GRANT_version_default )
> +        max_grant_version = opt_gnttab_max_version;
> +    if ( !max_grant_version )
> +    {
> +        dprintk(XENLOG_INFO, "%pd: invalid grant table version 0 requested\n",
> +                d);
> +        return -EINVAL;
> +    }
> +    if ( max_grant_version > opt_gnttab_max_version )
> +    {
> +        dprintk(XENLOG_INFO,
> +                "%pd: requested grant version (%u) greater than supported (%u)\n",
> +                d, max_grant_version, opt_gnttab_max_version);
> +        return -EINVAL;
> +    }
> +    if ( unlikely(max_page >= PFN_DOWN(TB(16))) && is_pv_domain(d) &&

 From my understanding, the limit for the grant table v1 is based on the 
page granularity used and the size of the fields.

So the limit you add is valid for 4KB but not 16KB/64KB. Therefore, I 
think it would be better to use:

'max_page >= (1U << 32)'

Furthermore, it would add a comment explaining where this limit comes from.

Lastly, did you check the compiler wouldn't throw an error on arm32?

> +         max_grant_version < 2 )
> +        dprintk(XENLOG_INFO,
> +                "%pd: host memory above 16Tb and grant table v2 disabled\n",
> +                d);

I would switch this one to a printk().

Cheers,

-- 
Julien Grall

Re: [PATCH for-4.16 v4] gnttab: allow setting max version per-domain
Posted by Roger Pau Monné 2 years, 5 months ago
On Fri, Oct 29, 2021 at 09:58:55AM +0100, Julien Grall wrote:
> Hi Roger,
> 
> On 29/10/2021 08:59, Roger Pau Monne wrote:
> > diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
> > index e510395d08..f94f0f272c 100644
> > --- a/xen/common/grant_table.c
> > +++ b/xen/common/grant_table.c
> > @@ -53,6 +53,7 @@ struct grant_table {
> >       percpu_rwlock_t       lock;
> >       /* Lock protecting the maptrack limit */
> >       spinlock_t            maptrack_lock;
> > +    unsigned int          max_version;
> >       /*
> >        * Defaults to v1.  May be changed with GNTTABOP_set_version.  All other
> >        * values are invalid.
> > @@ -1917,11 +1918,33 @@ active_alloc_failed:
> >   }
> >   int grant_table_init(struct domain *d, int max_grant_frames,
> > -                     int max_maptrack_frames)
> > +                     int max_maptrack_frames, unsigned int options)
> >   {
> >       struct grant_table *gt;
> > +    unsigned int max_grant_version = options & XEN_DOMCTL_GRANT_version_mask;
> >       int ret = -ENOMEM;
> > +    if ( max_grant_version == XEN_DOMCTL_GRANT_version_default )
> > +        max_grant_version = opt_gnttab_max_version;
> > +    if ( !max_grant_version )
> > +    {
> > +        dprintk(XENLOG_INFO, "%pd: invalid grant table version 0 requested\n",
> > +                d);
> > +        return -EINVAL;
> > +    }
> > +    if ( max_grant_version > opt_gnttab_max_version )
> > +    {
> > +        dprintk(XENLOG_INFO,
> > +                "%pd: requested grant version (%u) greater than supported (%u)\n",
> > +                d, max_grant_version, opt_gnttab_max_version);
> > +        return -EINVAL;
> > +    }
> > +    if ( unlikely(max_page >= PFN_DOWN(TB(16))) && is_pv_domain(d) &&
> 
> From my understanding, the limit for the grant table v1 is based on the page
> granularity used and the size of the fields.
> 
> So the limit you add is valid for 4KB but not 16KB/64KB. Therefore, I think
> it would be better to use:
> 
> 'max_page >= (1U << 32)'

I'm slightly confused. Isn't Xen always using a 4KB page granularity,
and that also applies to the grant table entries?

I don't think it's possible to use correctly use a 16KB or 64KB page
as an entry for the grant table, as Xen assumes those to always be 4KB
based.

> Furthermore, it would add a comment explaining where this limit comes from.
> 
> Lastly, did you check the compiler wouldn't throw an error on arm32?

I've tested a previous version (v2), but not this one. I assume it
doesn't build?

I've pushed it to gitlab and will adjust as needed.

> > +         max_grant_version < 2 )
> > +        dprintk(XENLOG_INFO,
> > +                "%pd: host memory above 16Tb and grant table v2 disabled\n",
> > +                d);
> 
> I would switch this one to a printk().

That's fine, will adjust unless someone has objections.

Thanks, Roger.

Re: [PATCH for-4.16 v4] gnttab: allow setting max version per-domain
Posted by Julien Grall 2 years, 5 months ago

On 29/10/2021 10:41, Roger Pau Monné wrote:
> On Fri, Oct 29, 2021 at 09:58:55AM +0100, Julien Grall wrote:
>> Hi Roger,
Hi Roger,

>> On 29/10/2021 08:59, Roger Pau Monne wrote:
>>> diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
>>> index e510395d08..f94f0f272c 100644
>>> --- a/xen/common/grant_table.c
>>> +++ b/xen/common/grant_table.c
>>> @@ -53,6 +53,7 @@ struct grant_table {
>>>        percpu_rwlock_t       lock;
>>>        /* Lock protecting the maptrack limit */
>>>        spinlock_t            maptrack_lock;
>>> +    unsigned int          max_version;
>>>        /*
>>>         * Defaults to v1.  May be changed with GNTTABOP_set_version.  All other
>>>         * values are invalid.
>>> @@ -1917,11 +1918,33 @@ active_alloc_failed:
>>>    }
>>>    int grant_table_init(struct domain *d, int max_grant_frames,
>>> -                     int max_maptrack_frames)
>>> +                     int max_maptrack_frames, unsigned int options)
>>>    {
>>>        struct grant_table *gt;
>>> +    unsigned int max_grant_version = options & XEN_DOMCTL_GRANT_version_mask;
>>>        int ret = -ENOMEM;
>>> +    if ( max_grant_version == XEN_DOMCTL_GRANT_version_default )
>>> +        max_grant_version = opt_gnttab_max_version;
>>> +    if ( !max_grant_version )
>>> +    {
>>> +        dprintk(XENLOG_INFO, "%pd: invalid grant table version 0 requested\n",
>>> +                d);
>>> +        return -EINVAL;
>>> +    }
>>> +    if ( max_grant_version > opt_gnttab_max_version )
>>> +    {
>>> +        dprintk(XENLOG_INFO,
>>> +                "%pd: requested grant version (%u) greater than supported (%u)\n",
>>> +                d, max_grant_version, opt_gnttab_max_version);
>>> +        return -EINVAL;
>>> +    }
>>> +    if ( unlikely(max_page >= PFN_DOWN(TB(16))) && is_pv_domain(d) &&
>>
>>  From my understanding, the limit for the grant table v1 is based on the page
>> granularity used and the size of the fields.
>>
>> So the limit you add is valid for 4KB but not 16KB/64KB. Therefore, I think
>> it would be better to use:
>>
>> 'max_page >= (1U << 32)'
> 
> I'm slightly confused. Isn't Xen always using a 4KB page granularity,

Yes. We only support 4KB today. But most of Xen is agnostic to the page 
granularity. I have actually started to look to allow 64KB/16KB page 
granularity for Xen on Arm in my spare time.

> and that also applies to the grant table entries?
The page granularity for the hypercall interface is whatever the page 
granularity Xen is using. So...

> 
> I don't think it's possible to use correctly use a 16KB or 64KB page
> as an entry for the grant table, as Xen assumes those to always be 4KB
> based.

... if you build Xen with 16KB, then the grant table entries will be 
using 16KB.

So I would like to avoid making the assumption that we are always using 
4KB. That said, the worse that can happen is a spurious message. So this 
is more to get an accurate check.

> 
>> Furthermore, it would add a comment explaining where this limit comes from.
>>
>> Lastly, did you check the compiler wouldn't throw an error on arm32?
> 
> I've tested a previous version (v2), but not this one. I assume it
> doesn't build?

I haven't tried. But I remember in the past seen report for always 
true/false check. Maybe that was just on coverity?

> 
> I've pushed it to gitlab and will adjust as needed.

Thanks!

Cheers,

-- 
Julien Grall

Re: [PATCH for-4.16 v4] gnttab: allow setting max version per-domain
Posted by Roger Pau Monné 2 years, 5 months ago
Hello,

On Fri, Oct 29, 2021 at 11:01:29AM +0100, Julien Grall wrote:
> 
> 
> On 29/10/2021 10:41, Roger Pau Monné wrote:
> > On Fri, Oct 29, 2021 at 09:58:55AM +0100, Julien Grall wrote:
> > > Hi Roger,
> Hi Roger,
> 
> > > On 29/10/2021 08:59, Roger Pau Monne wrote:
> > > > diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
> > > > index e510395d08..f94f0f272c 100644
> > > > --- a/xen/common/grant_table.c
> > > > +++ b/xen/common/grant_table.c
> > > > @@ -53,6 +53,7 @@ struct grant_table {
> > > >        percpu_rwlock_t       lock;
> > > >        /* Lock protecting the maptrack limit */
> > > >        spinlock_t            maptrack_lock;
> > > > +    unsigned int          max_version;
> > > >        /*
> > > >         * Defaults to v1.  May be changed with GNTTABOP_set_version.  All other
> > > >         * values are invalid.
> > > > @@ -1917,11 +1918,33 @@ active_alloc_failed:
> > > >    }
> > > >    int grant_table_init(struct domain *d, int max_grant_frames,
> > > > -                     int max_maptrack_frames)
> > > > +                     int max_maptrack_frames, unsigned int options)
> > > >    {
> > > >        struct grant_table *gt;
> > > > +    unsigned int max_grant_version = options & XEN_DOMCTL_GRANT_version_mask;
> > > >        int ret = -ENOMEM;
> > > > +    if ( max_grant_version == XEN_DOMCTL_GRANT_version_default )
> > > > +        max_grant_version = opt_gnttab_max_version;
> > > > +    if ( !max_grant_version )
> > > > +    {
> > > > +        dprintk(XENLOG_INFO, "%pd: invalid grant table version 0 requested\n",
> > > > +                d);
> > > > +        return -EINVAL;
> > > > +    }
> > > > +    if ( max_grant_version > opt_gnttab_max_version )
> > > > +    {
> > > > +        dprintk(XENLOG_INFO,
> > > > +                "%pd: requested grant version (%u) greater than supported (%u)\n",
> > > > +                d, max_grant_version, opt_gnttab_max_version);
> > > > +        return -EINVAL;
> > > > +    }
> > > > +    if ( unlikely(max_page >= PFN_DOWN(TB(16))) && is_pv_domain(d) &&
> > > 
> > >  From my understanding, the limit for the grant table v1 is based on the page
> > > granularity used and the size of the fields.
> > > 
> > > So the limit you add is valid for 4KB but not 16KB/64KB. Therefore, I think
> > > it would be better to use:
> > > 
> > > 'max_page >= (1U << 32)'
> > 
> > I'm slightly confused. Isn't Xen always using a 4KB page granularity,
> 
> Yes. We only support 4KB today. But most of Xen is agnostic to the page
> granularity. I have actually started to look to allow 64KB/16KB page
> granularity for Xen on Arm in my spare time.
> 
> > and that also applies to the grant table entries?
> The page granularity for the hypercall interface is whatever the page
> granularity Xen is using. So...

I've somehow assumed that the current hypercall ABI was strictly tied
to 4KB pages, as that's for example already hardcoded in Linux
as XEN_PAGE_SIZE.

> > 
> > I don't think it's possible to use correctly use a 16KB or 64KB page
> > as an entry for the grant table, as Xen assumes those to always be 4KB
> > based.
> 
> ... if you build Xen with 16KB, then the grant table entries will be using
> 16KB.
> 
> So I would like to avoid making the assumption that we are always using 4KB.
> That said, the worse that can happen is a spurious message. So this is more
> to get an accurate check.

I don't have strong objections to using max_page >> 32, it might even
be clearer than checking against TB(16).

It's just that the check would be wrong if we allow Xen itself to use
a different page size than the one used by the grant table interface
to the guest.

> > 
> > > Furthermore, it would add a comment explaining where this limit comes from.
> > > 
> > > Lastly, did you check the compiler wouldn't throw an error on arm32?
> > 
> > I've tested a previous version (v2), but not this one. I assume it
> > doesn't build?
> 
> I haven't tried. But I remember in the past seen report for always
> true/false check. Maybe that was just on coverity?

Hm, possibly. It seems like debian-unstable arm32 gcc is building OK,
but I've got no idea if different compiler versions could complain.

Thanks, Roger.

Re: [PATCH for-4.16 v4] gnttab: allow setting max version per-domain
Posted by Julien Grall 2 years, 5 months ago
On 29/10/2021 12:04, Roger Pau Monné wrote:
> Hello,

Hi Roger,

> On Fri, Oct 29, 2021 at 11:01:29AM +0100, Julien Grall wrote:
>>
>>
>> On 29/10/2021 10:41, Roger Pau Monné wrote:
>>> On Fri, Oct 29, 2021 at 09:58:55AM +0100, Julien Grall wrote:
>>>> On 29/10/2021 08:59, Roger Pau Monne wrote:
>>>>> diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
>>>>> index e510395d08..f94f0f272c 100644
>>>>> --- a/xen/common/grant_table.c
>>>>> +++ b/xen/common/grant_table.c
>>>>> @@ -53,6 +53,7 @@ struct grant_table {
>>>>>         percpu_rwlock_t       lock;
>>>>>         /* Lock protecting the maptrack limit */
>>>>>         spinlock_t            maptrack_lock;
>>>>> +    unsigned int          max_version;
>>>>>         /*
>>>>>          * Defaults to v1.  May be changed with GNTTABOP_set_version.  All other
>>>>>          * values are invalid.
>>>>> @@ -1917,11 +1918,33 @@ active_alloc_failed:
>>>>>     }
>>>>>     int grant_table_init(struct domain *d, int max_grant_frames,
>>>>> -                     int max_maptrack_frames)
>>>>> +                     int max_maptrack_frames, unsigned int options)
>>>>>     {
>>>>>         struct grant_table *gt;
>>>>> +    unsigned int max_grant_version = options & XEN_DOMCTL_GRANT_version_mask;
>>>>>         int ret = -ENOMEM;
>>>>> +    if ( max_grant_version == XEN_DOMCTL_GRANT_version_default )
>>>>> +        max_grant_version = opt_gnttab_max_version;
>>>>> +    if ( !max_grant_version )
>>>>> +    {
>>>>> +        dprintk(XENLOG_INFO, "%pd: invalid grant table version 0 requested\n",
>>>>> +                d);
>>>>> +        return -EINVAL;
>>>>> +    }
>>>>> +    if ( max_grant_version > opt_gnttab_max_version )
>>>>> +    {
>>>>> +        dprintk(XENLOG_INFO,
>>>>> +                "%pd: requested grant version (%u) greater than supported (%u)\n",
>>>>> +                d, max_grant_version, opt_gnttab_max_version);
>>>>> +        return -EINVAL;
>>>>> +    }
>>>>> +    if ( unlikely(max_page >= PFN_DOWN(TB(16))) && is_pv_domain(d) &&
>>>>
>>>>   From my understanding, the limit for the grant table v1 is based on the page
>>>> granularity used and the size of the fields.
>>>>
>>>> So the limit you add is valid for 4KB but not 16KB/64KB. Therefore, I think
>>>> it would be better to use:
>>>>
>>>> 'max_page >= (1U << 32)'
>>>
>>> I'm slightly confused. Isn't Xen always using a 4KB page granularity,
>>
>> Yes. We only support 4KB today. But most of Xen is agnostic to the page
>> granularity. I have actually started to look to allow 64KB/16KB page
>> granularity for Xen on Arm in my spare time.
>>
>>> and that also applies to the grant table entries?
>> The page granularity for the hypercall interface is whatever the page
>> granularity Xen is using. So...
> 
> I've somehow assumed that the current hypercall ABI was strictly tied
> to 4KB pages, as that's for example already hardcoded in Linux
> as XEN_PAGE_SIZE.

It is a mess. Before, XEN_PAGE_SIZE was introduced, we were assuming 
that Linux and Xen were using the same page granularity.

When I introduced the support to run guest with 16KB and 64KB, then we 
decided to introduce the define. Looking back at the decision, this was 
a mistake and we should have introduced an hypercall to fetch the ABI 
granularity instead.

> 
>>>
>>> I don't think it's possible to use correctly use a 16KB or 64KB page
>>> as an entry for the grant table, as Xen assumes those to always be 4KB
>>> based.
>>
>> ... if you build Xen with 16KB, then the grant table entries will be using
>> 16KB.
>>
>> So I would like to avoid making the assumption that we are always using 4KB.
>> That said, the worse that can happen is a spurious message. So this is more
>> to get an accurate check.
> 
> I don't have strong objections to using max_page >> 32, it might even
> be clearer than checking against TB(16).
> 
> It's just that the check would be wrong if we allow Xen itself to use
> a different page size than the one used by the grant table interface
> to the guest.

With the current interface, I don't see how we can untie the hypervisor 
page granularity from the ABI.

At least on Arm, if we were going to build Xen with 64KB page size, then 
we will not be able to do mapping less than 64KB in the stage-2 
page-tables (aka HAP on x86).

This is going to require some overhaul in the ABI to untie the page 
granularity from the hypervisor one.

For now, if we were going to introduce page granularity 64KB, then we 
will automatically change the ABI to use 64KB page granularity. 
Therefore, I think the check would be suitable with the current code.

[...]

> Hm, possibly. It seems like debian-unstable arm32 gcc is building OK,
> but I've got no idea if different compiler versions could complain.

Thanks for the testing. It should be fine then. This would be a build 
failure, so it can be easily fixed afterwards if needed.

Cheers,

-- 
Julien Grall

Re: [PATCH for-4.16 v4] gnttab: allow setting max version per-domain
Posted by Roger Pau Monné 2 years, 5 months ago
On Fri, Oct 29, 2021 at 02:25:22PM +0100, Julien Grall wrote:
> On 29/10/2021 12:04, Roger Pau Monné wrote:
> > Hello,
> 
> Hi Roger,
> 
> > On Fri, Oct 29, 2021 at 11:01:29AM +0100, Julien Grall wrote:
> > > 
> > > 
> > > On 29/10/2021 10:41, Roger Pau Monné wrote:
> > > > On Fri, Oct 29, 2021 at 09:58:55AM +0100, Julien Grall wrote:
> > > > > On 29/10/2021 08:59, Roger Pau Monne wrote:
> > > > > > diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
> > > > > > index e510395d08..f94f0f272c 100644
> > > > > > --- a/xen/common/grant_table.c
> > > > > > +++ b/xen/common/grant_table.c
> > > > > > @@ -53,6 +53,7 @@ struct grant_table {
> > > > > >         percpu_rwlock_t       lock;
> > > > > >         /* Lock protecting the maptrack limit */
> > > > > >         spinlock_t            maptrack_lock;
> > > > > > +    unsigned int          max_version;
> > > > > >         /*
> > > > > >          * Defaults to v1.  May be changed with GNTTABOP_set_version.  All other
> > > > > >          * values are invalid.
> > > > > > @@ -1917,11 +1918,33 @@ active_alloc_failed:
> > > > > >     }
> > > > > >     int grant_table_init(struct domain *d, int max_grant_frames,
> > > > > > -                     int max_maptrack_frames)
> > > > > > +                     int max_maptrack_frames, unsigned int options)
> > > > > >     {
> > > > > >         struct grant_table *gt;
> > > > > > +    unsigned int max_grant_version = options & XEN_DOMCTL_GRANT_version_mask;
> > > > > >         int ret = -ENOMEM;
> > > > > > +    if ( max_grant_version == XEN_DOMCTL_GRANT_version_default )
> > > > > > +        max_grant_version = opt_gnttab_max_version;
> > > > > > +    if ( !max_grant_version )
> > > > > > +    {
> > > > > > +        dprintk(XENLOG_INFO, "%pd: invalid grant table version 0 requested\n",
> > > > > > +                d);
> > > > > > +        return -EINVAL;
> > > > > > +    }
> > > > > > +    if ( max_grant_version > opt_gnttab_max_version )
> > > > > > +    {
> > > > > > +        dprintk(XENLOG_INFO,
> > > > > > +                "%pd: requested grant version (%u) greater than supported (%u)\n",
> > > > > > +                d, max_grant_version, opt_gnttab_max_version);
> > > > > > +        return -EINVAL;
> > > > > > +    }
> > > > > > +    if ( unlikely(max_page >= PFN_DOWN(TB(16))) && is_pv_domain(d) &&
> > > > > 
> > > > >   From my understanding, the limit for the grant table v1 is based on the page
> > > > > granularity used and the size of the fields.
> > > > > 
> > > > > So the limit you add is valid for 4KB but not 16KB/64KB. Therefore, I think
> > > > > it would be better to use:
> > > > > 
> > > > > 'max_page >= (1U << 32)'
> > > > 
> > > > I'm slightly confused. Isn't Xen always using a 4KB page granularity,
> > > 
> > > Yes. We only support 4KB today. But most of Xen is agnostic to the page
> > > granularity. I have actually started to look to allow 64KB/16KB page
> > > granularity for Xen on Arm in my spare time.
> > > 
> > > > and that also applies to the grant table entries?
> > > The page granularity for the hypercall interface is whatever the page
> > > granularity Xen is using. So...
> > 
> > I've somehow assumed that the current hypercall ABI was strictly tied
> > to 4KB pages, as that's for example already hardcoded in Linux
> > as XEN_PAGE_SIZE.
> 
> It is a mess. Before, XEN_PAGE_SIZE was introduced, we were assuming that
> Linux and Xen were using the same page granularity.
> 
> When I introduced the support to run guest with 16KB and 64KB, then we
> decided to introduce the define. Looking back at the decision, this was a
> mistake and we should have introduced an hypercall to fetch the ABI
> granularity instead.
> 
> > 
> > > > 
> > > > I don't think it's possible to use correctly use a 16KB or 64KB page
> > > > as an entry for the grant table, as Xen assumes those to always be 4KB
> > > > based.
> > > 
> > > ... if you build Xen with 16KB, then the grant table entries will be using
> > > 16KB.
> > > 
> > > So I would like to avoid making the assumption that we are always using 4KB.
> > > That said, the worse that can happen is a spurious message. So this is more
> > > to get an accurate check.
> > 
> > I don't have strong objections to using max_page >> 32, it might even
> > be clearer than checking against TB(16).

FWIW, I've changed the check to use >> 32 and limited it to
CONFIG_64BIT, since on 32bit arches max_page will be a 32bit value.

Thanks, Roger.

Re: [PATCH for-4.16 v4] gnttab: allow setting max version per-domain
Posted by Jan Beulich 2 years, 5 months ago
On 29.10.2021 09:59, Roger Pau Monne wrote:
> @@ -1917,11 +1918,33 @@ active_alloc_failed:
>  }
>  
>  int grant_table_init(struct domain *d, int max_grant_frames,
> -                     int max_maptrack_frames)
> +                     int max_maptrack_frames, unsigned int options)
>  {
>      struct grant_table *gt;
> +    unsigned int max_grant_version = options & XEN_DOMCTL_GRANT_version_mask;
>      int ret = -ENOMEM;
>  
> +    if ( max_grant_version == XEN_DOMCTL_GRANT_version_default )
> +        max_grant_version = opt_gnttab_max_version;
> +    if ( !max_grant_version )
> +    {
> +        dprintk(XENLOG_INFO, "%pd: invalid grant table version 0 requested\n",
> +                d);
> +        return -EINVAL;
> +    }
> +    if ( max_grant_version > opt_gnttab_max_version )
> +    {
> +        dprintk(XENLOG_INFO,
> +                "%pd: requested grant version (%u) greater than supported (%u)\n",
> +                d, max_grant_version, opt_gnttab_max_version);
> +        return -EINVAL;
> +    }
> +    if ( unlikely(max_page >= PFN_DOWN(TB(16))) && is_pv_domain(d) &&

This needs to be >, not >= - the variable's name is inaccurate wrt its
actual use. IOW in the proposed alternative code you'd need to check
(max_page - 1) >> 32 against zero.

Jan


Re: [PATCH for-4.16 v4] gnttab: allow setting max version per-domain
Posted by Andrew Cooper 2 years, 5 months ago
On 29/10/2021 08:59, Roger Pau Monne wrote:
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index 0167731ab0..faeb3eba76 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -2967,6 +2967,7 @@ void __init create_domUs(void)
>              .max_evtchn_port = -1,
>              .max_grant_frames = -1,
>              .max_maptrack_frames = -1,
> +            .grant_opts = XEN_DOMCTL_GRANT_version_default,

These three will need to be opt_gnttab_max_version which will need
exporting.

See final comment for why "default" mustn't exist.

> diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
> index e510395d08..f94f0f272c 100644
> --- a/xen/common/grant_table.c
> +++ b/xen/common/grant_table.c
> @@ -1917,11 +1918,33 @@ active_alloc_failed:
>  }
>  
>  int grant_table_init(struct domain *d, int max_grant_frames,
> -                     int max_maptrack_frames)
> +                     int max_maptrack_frames, unsigned int options)
>  {
>      struct grant_table *gt;
> +    unsigned int max_grant_version = options & XEN_DOMCTL_GRANT_version_mask;
>      int ret = -ENOMEM;
>  
> +    if ( max_grant_version == XEN_DOMCTL_GRANT_version_default )
> +        max_grant_version = opt_gnttab_max_version;
> +    if ( !max_grant_version )
> +    {
> +        dprintk(XENLOG_INFO, "%pd: invalid grant table version 0 requested\n",
> +                d);
> +        return -EINVAL;
> +    }
> +    if ( max_grant_version > opt_gnttab_max_version )
> +    {
> +        dprintk(XENLOG_INFO,
> +                "%pd: requested grant version (%u) greater than supported (%u)\n",
> +                d, max_grant_version, opt_gnttab_max_version);
> +        return -EINVAL;
> +    }

I think this wants to live in sanitise_domain_config() along with all
the other auditing of flags and settings.  Also, it can be simplified:

if ( max_grant_version < 1 ||
    max_grant_version > opt_gnttab_max_version )
{
    dprintk(XENLOG_INFO, "Requested gnttab max version %u outside of
supported range [%u, %u]\n", ...);
}



> +    if ( unlikely(max_page >= PFN_DOWN(TB(16))) && is_pv_domain(d) &&
> +         max_grant_version < 2 )
> +        dprintk(XENLOG_INFO,
> +                "%pd: host memory above 16Tb and grant table v2 disabled\n",
> +                d);

This is rather more complicated.

For PV, this going wrong in the first place is conditional on CONFIG_BIGMEM.
For HVM, it the guest address size, not the host.
For ARM, I don't even know, because I've lost track of which bits of the
ABI are directmap in an otherwise translated domain.

I think it is probably useful to do something about it, but probably not
in this patch.

Perhaps modify domain_set_alloc_bitsize() to impose an upper limit for
the "host memory size matters" cases?

For the guest address size cases, this possibly wants to feed in to the
max policy calculations in the same way that shadow kinda does.

> diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
> index 51017b47bc..0ec57614bd 100644
> --- a/xen/include/public/domctl.h
> +++ b/xen/include/public/domctl.h
> @@ -89,14 +89,20 @@ struct xen_domctl_createdomain {
>      /*
>       * Various domain limits, which impact the quantity of resources
>       * (global mapping space, xenheap, etc) a guest may consume.  For
> -     * max_grant_frames and max_maptrack_frames, < 0 means "use the
> -     * default maximum value in the hypervisor".
> +     * max_grant_frames, max_maptrack_frames and max_gnttab_version < 0
> +     * means "use the default maximum value in the hypervisor".
>       */
>      uint32_t max_vcpus;
>      uint32_t max_evtchn_port;
>      int32_t max_grant_frames;
>      int32_t max_maptrack_frames;
>  
> +/* Grant version, use low 4 bits. */
> +#define XEN_DOMCTL_GRANT_version_mask    0xf
> +#define XEN_DOMCTL_GRANT_version_default 0xf

This needs to be a toolstack decision, not something in Xen.  This
doesn't fix the case where VMs can't cope with change underfoot.

It is fine for the user say "use the default", but this must be turned
into an explicit 1 or 2 by the toolstack, so that the version(s) visible
to the guest remains invariant while it is booted.

Given the timescales, I'll put together a prereq patch exposing
gnttab-v1/2 in virt_caps for the toolstack to reason over.

~Andrew


Re: [PATCH for-4.16 v4] gnttab: allow setting max version per-domain
Posted by Roger Pau Monné 2 years, 5 months ago
On Fri, Oct 29, 2021 at 05:39:52PM +0100, Andrew Cooper wrote:
> On 29/10/2021 08:59, Roger Pau Monne wrote:
> > diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
> > index e510395d08..f94f0f272c 100644
> > --- a/xen/common/grant_table.c
> > +++ b/xen/common/grant_table.c
> > @@ -1917,11 +1918,33 @@ active_alloc_failed:
> >  }
> >  
> >  int grant_table_init(struct domain *d, int max_grant_frames,
> > -                     int max_maptrack_frames)
> > +                     int max_maptrack_frames, unsigned int options)
> >  {
> >      struct grant_table *gt;
> > +    unsigned int max_grant_version = options & XEN_DOMCTL_GRANT_version_mask;
> >      int ret = -ENOMEM;
> >  
> > +    if ( max_grant_version == XEN_DOMCTL_GRANT_version_default )
> > +        max_grant_version = opt_gnttab_max_version;
> > +    if ( !max_grant_version )
> > +    {
> > +        dprintk(XENLOG_INFO, "%pd: invalid grant table version 0 requested\n",
> > +                d);
> > +        return -EINVAL;
> > +    }
> > +    if ( max_grant_version > opt_gnttab_max_version )
> > +    {
> > +        dprintk(XENLOG_INFO,
> > +                "%pd: requested grant version (%u) greater than supported (%u)\n",
> > +                d, max_grant_version, opt_gnttab_max_version);
> > +        return -EINVAL;
> > +    }
> 
> I think this wants to live in sanitise_domain_config() along with all
> the other auditing of flags and settings.

The reason to place those there is that the sanity checks for the
other grant table related parameters (max_grant_frames and
max_maptrack_frames) are performed in this function also. I think it's
better to keep the checks together.

We should consider exporting the relevant values from grant table
code and then moving all the checks to sanitise_domain_config, but
likely a follow up work given the current point in the release.

> Also, it can be simplified:
> 
> if ( max_grant_version < 1 ||
>     max_grant_version > opt_gnttab_max_version )
> {
>     dprintk(XENLOG_INFO, "Requested gnttab max version %u outside of
> supported range [%u, %u]\n", ...);
> }

It was originally done this way so that the first check
(!max_grant_version) could be adjusted when support for
max_grant_version == 0 was introduced [0] in order to signal the
disabling of grant tables altogether.

> 
> 
> > +    if ( unlikely(max_page >= PFN_DOWN(TB(16))) && is_pv_domain(d) &&
> > +         max_grant_version < 2 )
> > +        dprintk(XENLOG_INFO,
> > +                "%pd: host memory above 16Tb and grant table v2 disabled\n",
> > +                d);
> 
> This is rather more complicated.
> 
> For PV, this going wrong in the first place is conditional on CONFIG_BIGMEM.
> For HVM, it the guest address size, not the host.
> For ARM, I don't even know, because I've lost track of which bits of the
> ABI are directmap in an otherwise translated domain.

This was only aiming to cover the PV case, which I think it's the more
likely one. It's possible there's people attempting to create PV
guests on a 16TB machine, but I think it's more unlikely that the
guest itself will have 16TB of RAM.

> I think it is probably useful to do something about it, but probably not
> in this patch.

I'm fine with this, we had no warning at all before, so I don't think
it should be a hard requirement to add one now. It would be nice if
there's consensus, but otherwise let's just skip it.

> Perhaps modify domain_set_alloc_bitsize() to impose an upper limit for
> the "host memory size matters" cases?
> 
> For the guest address size cases, this possibly wants to feed in to the
> max policy calculations in the same way that shadow kinda does.

So grant table version will basically clamp the amount of memory a
guest can use?

What about guests that doesn't use grant tables at all, do we expect
those to set the future max_grant_version to 0 in order to avoid the
clamping without having to expose grant v2?

> > diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
> > index 51017b47bc..0ec57614bd 100644
> > --- a/xen/include/public/domctl.h
> > +++ b/xen/include/public/domctl.h
> > @@ -89,14 +89,20 @@ struct xen_domctl_createdomain {
> >      /*
> >       * Various domain limits, which impact the quantity of resources
> >       * (global mapping space, xenheap, etc) a guest may consume.  For
> > -     * max_grant_frames and max_maptrack_frames, < 0 means "use the
> > -     * default maximum value in the hypervisor".
> > +     * max_grant_frames, max_maptrack_frames and max_gnttab_version < 0
> > +     * means "use the default maximum value in the hypervisor".
> >       */
> >      uint32_t max_vcpus;
> >      uint32_t max_evtchn_port;
> >      int32_t max_grant_frames;
> >      int32_t max_maptrack_frames;
> >  
> > +/* Grant version, use low 4 bits. */
> > +#define XEN_DOMCTL_GRANT_version_mask    0xf
> > +#define XEN_DOMCTL_GRANT_version_default 0xf
> 
> This needs to be a toolstack decision, not something in Xen.  This
> doesn't fix the case where VMs can't cope with change underfoot.
> 
> It is fine for the user say "use the default", but this must be turned
> into an explicit 1 or 2 by the toolstack, so that the version(s) visible
> to the guest remains invariant while it is booted.

Please bear with me, as I'm afraid I don't understand why this is
relevant. Allowed max grant version can only change as a result of a
migration, and A VM being booted cannot (usually) be migrated, as it
requires guest cooperation (and a fully setup xenstore).

Any guest allowing migration during boot (which is AFAICT the only way
for a max grant table version change) should be capable of handling
the max grant table version changing on resume, whereas this resume
happens in the middle of the boot process is a guest decision, and it
should be capable of handling such changes, or else refuse to suspend
during boot. Any resume process will always involve a
re-initialization of the grant table.

If the intent is to make this compatible with transparent live
migration I think there are also other grant table structures that
will need to be migrated in that case, and thus the version would
already be conveyed there.

Thanks, Roger.

Re: [PATCH for-4.16 v4] gnttab: allow setting max version per-domain [and 1 more messages]
Posted by Ian Jackson 2 years, 5 months ago
Roger Pau Monné writes ("Re: [PATCH for-4.16 v4] gnttab: allow setting max version per-domain"):
> [...]

Roger Pau Monné writes ("Re: [PATCH for-4.16 2/2] xen: Report grant table v1/v2 capabilities to the toolstack"):
> [...]

FTAOD I think these patches are related and I see discussion is still
ongoing.  Thanks for the CCs.  I would like to make a release-ack
decision after the discussion has converged.

Thanks,
Ian.

Re: [PATCH for-4.16 v4] gnttab: allow setting max version per-domain
Posted by Andrew Cooper 2 years, 5 months ago
On 30/10/2021 08:53, Roger Pau Monné wrote:
> On Fri, Oct 29, 2021 at 05:39:52PM +0100, Andrew Cooper wrote:
>> On 29/10/2021 08:59, Roger Pau Monne wrote:
>>> diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
>>> index e510395d08..f94f0f272c 100644
>>> --- a/xen/common/grant_table.c
>>> +++ b/xen/common/grant_table.c
>>> @@ -1917,11 +1918,33 @@ active_alloc_failed:
>>>  }
>>>  
>>>  int grant_table_init(struct domain *d, int max_grant_frames,
>>> -                     int max_maptrack_frames)
>>> +                     int max_maptrack_frames, unsigned int options)
>>>  {
>>>      struct grant_table *gt;
>>> +    unsigned int max_grant_version = options & XEN_DOMCTL_GRANT_version_mask;
>>>      int ret = -ENOMEM;
>>>  
>>> +    if ( max_grant_version == XEN_DOMCTL_GRANT_version_default )
>>> +        max_grant_version = opt_gnttab_max_version;
>>> +    if ( !max_grant_version )
>>> +    {
>>> +        dprintk(XENLOG_INFO, "%pd: invalid grant table version 0 requested\n",
>>> +                d);
>>> +        return -EINVAL;
>>> +    }
>>> +    if ( max_grant_version > opt_gnttab_max_version )
>>> +    {
>>> +        dprintk(XENLOG_INFO,
>>> +                "%pd: requested grant version (%u) greater than supported (%u)\n",
>>> +                d, max_grant_version, opt_gnttab_max_version);
>>> +        return -EINVAL;
>>> +    }
>> I think this wants to live in sanitise_domain_config() along with all
>> the other auditing of flags and settings.
> The reason to place those there is that the sanity checks for the
> other grant table related parameters (max_grant_frames and
> max_maptrack_frames) are performed in this function also. I think it's
> better to keep the checks together.
>
> We should consider exporting the relevant values from grant table
> code and then moving all the checks to sanitise_domain_config, but
> likely a follow up work given the current point in the release.
>
>> Also, it can be simplified:
>>
>> if ( max_grant_version < 1 ||
>>     max_grant_version > opt_gnttab_max_version )
>> {
>>     dprintk(XENLOG_INFO, "Requested gnttab max version %u outside of
>> supported range [%u, %u]\n", ...);
>> }
> It was originally done this way so that the first check
> (!max_grant_version) could be adjusted when support for
> max_grant_version == 0 was introduced [0] in order to signal the
> disabling of grant tables altogether.
>
>>
>>> +    if ( unlikely(max_page >= PFN_DOWN(TB(16))) && is_pv_domain(d) &&
>>> +         max_grant_version < 2 )
>>> +        dprintk(XENLOG_INFO,
>>> +                "%pd: host memory above 16Tb and grant table v2 disabled\n",
>>> +                d);
>> This is rather more complicated.
>>
>> For PV, this going wrong in the first place is conditional on CONFIG_BIGMEM.
>> For HVM, it the guest address size, not the host.
>> For ARM, I don't even know, because I've lost track of which bits of the
>> ABI are directmap in an otherwise translated domain.
> This was only aiming to cover the PV case, which I think it's the more
> likely one. It's possible there's people attempting to create PV
> guests on a 16TB machine, but I think it's more unlikely that the
> guest itself will have 16TB of RAM.
>
>> I think it is probably useful to do something about it, but probably not
>> in this patch.
> I'm fine with this, we had no warning at all before, so I don't think
> it should be a hard requirement to add one now. It would be nice if
> there's consensus, but otherwise let's just skip it.
>
>> Perhaps modify domain_set_alloc_bitsize() to impose an upper limit for
>> the "host memory size matters" cases?
>>
>> For the guest address size cases, this possibly wants to feed in to the
>> max policy calculations in the same way that shadow kinda does.
> So grant table version will basically clamp the amount of memory a
> guest can use?
>
> What about guests that doesn't use grant tables at all, do we expect
> those to set the future max_grant_version to 0 in order to avoid the
> clamping without having to expose grant v2?
>
>>> diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
>>> index 51017b47bc..0ec57614bd 100644
>>> --- a/xen/include/public/domctl.h
>>> +++ b/xen/include/public/domctl.h
>>> @@ -89,14 +89,20 @@ struct xen_domctl_createdomain {
>>>      /*
>>>       * Various domain limits, which impact the quantity of resources
>>>       * (global mapping space, xenheap, etc) a guest may consume.  For
>>> -     * max_grant_frames and max_maptrack_frames, < 0 means "use the
>>> -     * default maximum value in the hypervisor".
>>> +     * max_grant_frames, max_maptrack_frames and max_gnttab_version < 0
>>> +     * means "use the default maximum value in the hypervisor".
>>>       */
>>>      uint32_t max_vcpus;
>>>      uint32_t max_evtchn_port;
>>>      int32_t max_grant_frames;
>>>      int32_t max_maptrack_frames;
>>>  
>>> +/* Grant version, use low 4 bits. */
>>> +#define XEN_DOMCTL_GRANT_version_mask    0xf
>>> +#define XEN_DOMCTL_GRANT_version_default 0xf
>> This needs to be a toolstack decision, not something in Xen.  This
>> doesn't fix the case where VMs can't cope with change underfoot.
>>
>> It is fine for the user say "use the default", but this must be turned
>> into an explicit 1 or 2 by the toolstack, so that the version(s) visible
>> to the guest remains invariant while it is booted.
> Please bear with me, as I'm afraid I don't understand why this is
> relevant. Allowed max grant version can only change as a result of a
> migration

No.  Allowed max grant version is (well - needs to be) a fixed property
of the VM, even across migration.

It was a fundamentally mistake to ever have gnttab v2 active by default,
without an enumeration, and with no way of turning it off.  Same too for
evtchn, but we've already taken a patch to knobble fifo support.


The toolstack needs to explicitly select v1 or v2.  It's fine to pick a
default on behalf a user which doesn't care, but what moves in the
migrate stream must an explicit, unambiguous value, so the destination
Xen and toolstack can reconstruct the VM exactly.

"default" is ambiguous, and cannot be recovered after the fact.  In
particular, a vm with no explicit configuration, when booted on a Xen
with gnttab limited to v1 on the command line, should not have v2 become
accessible by migrating to a second Xen with no command line limit.  It
is fine, if that VM subsequently reboots, to find that v2 is now available.

~Andrew


Re: [PATCH for-4.16 v4] gnttab: allow setting max version per-domain
Posted by Julien Grall 2 years, 5 months ago
Hi Andree,

On 02/11/2021 14:34, Andrew Cooper wrote:
> Same too for
> evtchn, but we've already taken a patch to knobble fifo support.

I know that Amazon submitted a patch to allow disabling FIFO [1]. But 
AFAIK, this is not yet merged because of disagrement (?) on the approach.

Cheers,

[1] https://lore.kernel.org/xen-devel/20201203124159.3688-2-paul@xen.org/

-- 
Julien Grall

Re: [PATCH for-4.16 v4] gnttab: allow setting max version per-domain
Posted by Andrew Cooper 2 years, 5 months ago
On 02/11/2021 15:00, Julien Grall wrote:
> Hi Andree,
>
> On 02/11/2021 14:34, Andrew Cooper wrote:
>> Same too for
>> evtchn, but we've already taken a patch to knobble fifo support.
>
> I know that Amazon submitted a patch to allow disabling FIFO [1]. But
> AFAIK, this is not yet merged because of disagrement (?) on the approach.
>
> Cheers,
>
> [1] https://lore.kernel.org/xen-devel/20201203124159.3688-2-paul@xen.org/
>

:(  I honestly thought we'd fixed that.  It certainly needs fixing.

I'll see about adding suitable evtchn abi indications alongside the
grant abi indications, but that will have to be tomorrow at this point.

~Andrew

[PATCH for-4.16 1/2] tools/golang: Regenerate bindings
Posted by Andrew Cooper 2 years, 5 months ago
This was missed previously.

Fixes: 1e6706b0d123 ("xen/arm: Introduce gpaddr_bits field to struct xen_domctl_getdomaininfo")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Ian Jackson <iwj@xenproject.org>
CC: Wei Liu <wl@xen.org>

For 4.16.  Fixing a build-time error in a change already accepted.
---
 tools/golang/xenlight/helpers.gen.go | 2 ++
 tools/golang/xenlight/types.gen.go   | 1 +
 2 files changed, 3 insertions(+)

diff --git a/tools/golang/xenlight/helpers.gen.go b/tools/golang/xenlight/helpers.gen.go
index 2449580badb6..c010f2d3a47f 100644
--- a/tools/golang/xenlight/helpers.gen.go
+++ b/tools/golang/xenlight/helpers.gen.go
@@ -323,6 +323,7 @@ x.CpuTime = uint64(xc.cpu_time)
 x.VcpuMaxId = uint32(xc.vcpu_max_id)
 x.VcpuOnline = uint32(xc.vcpu_online)
 x.Cpupool = uint32(xc.cpupool)
+x.GpaddrBits = byte(xc.gpaddr_bits)
 x.DomainType = DomainType(xc.domain_type)
 
  return nil}
@@ -355,6 +356,7 @@ xc.cpu_time = C.uint64_t(x.CpuTime)
 xc.vcpu_max_id = C.uint32_t(x.VcpuMaxId)
 xc.vcpu_online = C.uint32_t(x.VcpuOnline)
 xc.cpupool = C.uint32_t(x.Cpupool)
+xc.gpaddr_bits = C.uint8_t(x.GpaddrBits)
 xc.domain_type = C.libxl_domain_type(x.DomainType)
 
  return nil
diff --git a/tools/golang/xenlight/types.gen.go b/tools/golang/xenlight/types.gen.go
index b2e8bd1a855e..8c8c481b86f6 100644
--- a/tools/golang/xenlight/types.gen.go
+++ b/tools/golang/xenlight/types.gen.go
@@ -316,6 +316,7 @@ CpuTime uint64
 VcpuMaxId uint32
 VcpuOnline uint32
 Cpupool uint32
+GpaddrBits byte
 DomainType DomainType
 }
 
-- 
2.11.0


Re: [PATCH for-4.16 1/2] tools/golang: Regenerate bindings
Posted by Ian Jackson 2 years, 5 months ago
Andrew Cooper writes ("[PATCH for-4.16 1/2] tools/golang: Regenerate bindings"):
> This was missed previously.
> 
> Fixes: 1e6706b0d123 ("xen/arm: Introduce gpaddr_bits field to struct xen_domctl_getdomaininfo")
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Thanks.

Release-Acked-by: Ian Jackson <iwj@xenproject.org>

(but note the mail I just sent about commit moratorium)

[PATCH for-4.16 2/2] xen: Report grant table v1/v2 capabilities to the toolstack
Posted by Andrew Cooper 2 years, 5 months ago
In order to let the toolstack be able to set the gnttab version on a
per-domain basis, it needs to know which ABIs Xen supports.  Introduce
XEN_SYSCTL_PHYSCAP_gnttab_v{1,2} for the purpose, and plumb in down into
userspace.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: George Dunlap <George.Dunlap@eu.citrix.com>
CC: Ian Jackson <iwj@xenproject.org>
CC: Jan Beulich <JBeulich@suse.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Wei Liu <wl@xen.org>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Julien Grall <julien@xen.org>
CC: Christian Lindig <christian.lindig@citrix.com>
CC: Edwin Török <edvin.torok@citrix.com>

For 4.16.  This is useful information:

  xl info virt_caps | grep -q gnttab-v2

is far more coherent for the admin than having to root around in the Xen build
time configuraiton, or command line settings.

Furthermore, it is information necessary for the toolstack to be able to
choose the gnttab version on a per-domain basis (a feature under consideration
for 4.16).

Risks are largely at build time, but adding new PHYSCAPs is formulaic.  In
this case, just using 2 bits in an already existing hypercall.  golang was
generated automatically, while Ocaml has build time ABI checks issues.
---
 tools/golang/xenlight/helpers.gen.go | 4 ++++
 tools/golang/xenlight/types.gen.go   | 2 ++
 tools/include/libxl.h                | 6 ++++++
 tools/libs/light/libxl.c             | 4 ++++
 tools/libs/light/libxl_types.idl     | 2 ++
 tools/ocaml/libs/xc/xenctrl.ml       | 2 ++
 tools/ocaml/libs/xc/xenctrl.mli      | 2 ++
 tools/xl/xl_info.c                   | 6 ++++--
 xen/common/grant_table.c             | 2 +-
 xen/common/sysctl.c                  | 6 ++++++
 xen/include/public/sysctl.h          | 6 +++++-
 xen/include/xen/grant_table.h        | 2 ++
 12 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/tools/golang/xenlight/helpers.gen.go b/tools/golang/xenlight/helpers.gen.go
index c010f2d3a47f..afa44a857a6c 100644
--- a/tools/golang/xenlight/helpers.gen.go
+++ b/tools/golang/xenlight/helpers.gen.go
@@ -3369,6 +3369,8 @@ x.CapShadow = bool(xc.cap_shadow)
 x.CapIommuHapPtShare = bool(xc.cap_iommu_hap_pt_share)
 x.CapVmtrace = bool(xc.cap_vmtrace)
 x.CapVpmu = bool(xc.cap_vpmu)
+x.CapGnttabV1 = bool(xc.cap_gnttab_v1)
+x.CapGnttabV2 = bool(xc.cap_gnttab_v2)
 
  return nil}
 
@@ -3401,6 +3403,8 @@ xc.cap_shadow = C.bool(x.CapShadow)
 xc.cap_iommu_hap_pt_share = C.bool(x.CapIommuHapPtShare)
 xc.cap_vmtrace = C.bool(x.CapVmtrace)
 xc.cap_vpmu = C.bool(x.CapVpmu)
+xc.cap_gnttab_v1 = C.bool(x.CapGnttabV1)
+xc.cap_gnttab_v2 = C.bool(x.CapGnttabV2)
 
  return nil
  }
diff --git a/tools/golang/xenlight/types.gen.go b/tools/golang/xenlight/types.gen.go
index 8c8c481b86f6..31b5af0777d5 100644
--- a/tools/golang/xenlight/types.gen.go
+++ b/tools/golang/xenlight/types.gen.go
@@ -1011,6 +1011,8 @@ CapShadow bool
 CapIommuHapPtShare bool
 CapVmtrace bool
 CapVpmu bool
+CapGnttabV1 bool
+CapGnttabV2 bool
 }
 
 type Connectorinfo struct {
diff --git a/tools/include/libxl.h b/tools/include/libxl.h
index 2e8679dbcb21..54c10f6efe23 100644
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -515,6 +515,12 @@
 #define LIBXL_HAVE_VPMU 1
 
 /*
+ * LIBXL_HAVE_PHYSINFO_CAP_GNTTAB indicates that libxl_physinfo has a
+ * cap_gnttab_v1/2 fields, which indicates the available grant table ABIs.
+ */
+#define LIBXL_HAVE_PHYSINFO_CAP_GNTTAB 1
+
+/*
  * libxl ABI compatibility
  *
  * The only guarantee which libxl makes regarding ABI compatibility
diff --git a/tools/libs/light/libxl.c b/tools/libs/light/libxl.c
index a032723fdecb..a77aa856fdd6 100644
--- a/tools/libs/light/libxl.c
+++ b/tools/libs/light/libxl.c
@@ -405,6 +405,10 @@ int libxl_get_physinfo(libxl_ctx *ctx, libxl_physinfo *physinfo)
     physinfo->cap_vmtrace =
         !!(xcphysinfo.capabilities & XEN_SYSCTL_PHYSCAP_vmtrace);
     physinfo->cap_vpmu = !!(xcphysinfo.capabilities & XEN_SYSCTL_PHYSCAP_vpmu);
+    physinfo->cap_gnttab_v1 =
+        !!(xcphysinfo.capabilities & XEN_SYSCTL_PHYSCAP_gnttab_v1);
+    physinfo->cap_gnttab_v2 =
+        !!(xcphysinfo.capabilities & XEN_SYSCTL_PHYSCAP_gnttab_v2);
 
     GC_FREE;
     return 0;
diff --git a/tools/libs/light/libxl_types.idl b/tools/libs/light/libxl_types.idl
index 608d55a4568d..573bba68ee3a 100644
--- a/tools/libs/light/libxl_types.idl
+++ b/tools/libs/light/libxl_types.idl
@@ -1065,6 +1065,8 @@ libxl_physinfo = Struct("physinfo", [
     ("cap_iommu_hap_pt_share", bool),
     ("cap_vmtrace", bool),
     ("cap_vpmu", bool),
+    ("cap_gnttab_v1", bool),
+    ("cap_gnttab_v2", bool),
     ], dir=DIR_OUT)
 
 libxl_connectorinfo = Struct("connectorinfo", [
diff --git a/tools/ocaml/libs/xc/xenctrl.ml b/tools/ocaml/libs/xc/xenctrl.ml
index addcf4cc593d..ed2924a2b34a 100644
--- a/tools/ocaml/libs/xc/xenctrl.ml
+++ b/tools/ocaml/libs/xc/xenctrl.ml
@@ -123,6 +123,8 @@ type physinfo_cap_flag =
 	| CAP_IOMMU_HAP_PT_SHARE
 	| CAP_Vmtrace
 	| CAP_Vpmu
+	| CAP_Gnttab_v1
+	| CAP_Gnttab_v2
 
 type physinfo =
 {
diff --git a/tools/ocaml/libs/xc/xenctrl.mli b/tools/ocaml/libs/xc/xenctrl.mli
index 0a5ce529e951..d20dc0108dce 100644
--- a/tools/ocaml/libs/xc/xenctrl.mli
+++ b/tools/ocaml/libs/xc/xenctrl.mli
@@ -108,6 +108,8 @@ type physinfo_cap_flag =
   | CAP_IOMMU_HAP_PT_SHARE
   | CAP_Vmtrace
   | CAP_Vpmu
+  | CAP_Gnttab_v1
+  | CAP_Gnttab_v2
 
 type physinfo = {
   threads_per_core : int;
diff --git a/tools/xl/xl_info.c b/tools/xl/xl_info.c
index 2c86b317b702..712b7638b013 100644
--- a/tools/xl/xl_info.c
+++ b/tools/xl/xl_info.c
@@ -210,7 +210,7 @@ static void output_physinfo(void)
          info.hw_cap[4], info.hw_cap[5], info.hw_cap[6], info.hw_cap[7]
         );
 
-    maybe_printf("virt_caps              :%s%s%s%s%s%s%s%s%s\n",
+    maybe_printf("virt_caps              :%s%s%s%s%s%s%s%s%s%s%s\n",
          info.cap_pv ? " pv" : "",
          info.cap_hvm ? " hvm" : "",
          info.cap_hvm && info.cap_hvm_directio ? " hvm_directio" : "",
@@ -219,7 +219,9 @@ static void output_physinfo(void)
          info.cap_shadow ? " shadow" : "",
          info.cap_iommu_hap_pt_share ? " iommu_hap_pt_share" : "",
          info.cap_vmtrace ? " vmtrace" : "",
-         info.cap_vpmu ? " vpmu" : ""
+         info.cap_vpmu ? " vpmu" : "",
+         info.cap_gnttab_v1 ? " gnttab-v1" : "",
+         info.cap_gnttab_v2 ? " gnttab-v2" : ""
         );
 
     vinfo = libxl_get_version_info(ctx);
diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index e510395d088e..a20319b22abc 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -178,7 +178,7 @@ static int parse_gnttab_max_maptrack_frames(const char *arg)
 #define GNTTAB_MAX_VERSION 2
 #endif
 
-static unsigned int __read_mostly opt_gnttab_max_version = GNTTAB_MAX_VERSION;
+unsigned int __read_mostly opt_gnttab_max_version = GNTTAB_MAX_VERSION;
 static bool __read_mostly opt_transitive_grants = true;
 
 static int __init parse_gnttab(const char *s)
diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
index f2dab722b683..1ad3c29351db 100644
--- a/xen/common/sysctl.c
+++ b/xen/common/sysctl.c
@@ -12,6 +12,7 @@
 #include <xen/sched.h>
 #include <xen/domain.h>
 #include <xen/event.h>
+#include <xen/grant_table.h>
 #include <xen/domain_page.h>
 #include <xen/trace.h>
 #include <xen/console.h>
@@ -283,6 +284,11 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
         if ( vpmu_is_available )
             pi->capabilities |= XEN_SYSCTL_PHYSCAP_vpmu;
 
+        if ( opt_gnttab_max_version >= 1 )
+            pi->capabilities |= XEN_SYSCTL_PHYSCAP_gnttab_v1;
+        if ( opt_gnttab_max_version >= 2 )
+            pi->capabilities |= XEN_SYSCTL_PHYSCAP_gnttab_v2;
+
         if ( copy_to_guest(u_sysctl, op, 1) )
             ret = -EFAULT;
     }
diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index 3e53681b4368..55252e97f230 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -104,8 +104,12 @@ struct xen_sysctl_tbuf_op {
 /* The platform supports vPMU. */
 #define XEN_SYSCTL_PHYSCAP_vpmu          (1u << 7)
 
+/* Xen supports the Grant v1 and/or v2 ABIs. */
+#define XEN_SYSCTL_PHYSCAP_gnttab_v1     (1u << 8)
+#define XEN_SYSCTL_PHYSCAP_gnttab_v2     (1u << 9)
+
 /* Max XEN_SYSCTL_PHYSCAP_* constant.  Used for ABI checking. */
-#define XEN_SYSCTL_PHYSCAP_MAX XEN_SYSCTL_PHYSCAP_vpmu
+#define XEN_SYSCTL_PHYSCAP_MAX XEN_SYSCTL_PHYSCAP_gnttab_v2
 
 struct xen_sysctl_physinfo {
     uint32_t threads_per_core;
diff --git a/xen/include/xen/grant_table.h b/xen/include/xen/grant_table.h
index 41713e2726e9..9ee830cfd0ab 100644
--- a/xen/include/xen/grant_table.h
+++ b/xen/include/xen/grant_table.h
@@ -32,6 +32,7 @@ struct grant_table;
 
 #ifdef CONFIG_GRANT_TABLE
 
+extern unsigned int opt_gnttab_max_version;
 extern unsigned int opt_max_grant_frames;
 
 /* Create/destroy per-domain grant table context. */
@@ -63,6 +64,7 @@ int gnttab_acquire_resource(
 
 #else
 
+#define opt_gnttab_max_version 0
 #define opt_max_grant_frames 0
 
 static inline int grant_table_init(struct domain *d,
-- 
2.11.0


Re: [PATCH for-4.16 2/2] xen: Report grant table v1/v2 capabilities to the toolstack
Posted by Roger Pau Monné 2 years, 5 months ago
On Fri, Oct 29, 2021 at 06:38:13PM +0100, Andrew Cooper wrote:
> In order to let the toolstack be able to set the gnttab version on a
> per-domain basis, it needs to know which ABIs Xen supports.  Introduce
> XEN_SYSCTL_PHYSCAP_gnttab_v{1,2} for the purpose, and plumb in down into
> userspace.

I did consider exposing the versions supported together with
max_{grant,maptrack}_frames using a new grant-table dedicated sysctl,
but maybe it's fine to expose the version as a physcap and fetch the
other two separately?

I certainly didn't look much into this, maybe it's pointless to expose
max_{grant,maptrack}_frames. I think the toolstack will need to be
able to fetch grant related limits on a per-domain basis like we plan
to do with the grant version in order to assert the destination host
supports the current limit applied to the domain.

Thanks, Roger.

Re: [PATCH for-4.16 2/2] xen: Report grant table v1/v2 capabilities to the toolstack
Posted by Jan Beulich 2 years, 5 months ago
On 30.10.2021 13:20, Roger Pau Monné wrote:
> On Fri, Oct 29, 2021 at 06:38:13PM +0100, Andrew Cooper wrote:
>> In order to let the toolstack be able to set the gnttab version on a
>> per-domain basis, it needs to know which ABIs Xen supports.  Introduce
>> XEN_SYSCTL_PHYSCAP_gnttab_v{1,2} for the purpose, and plumb in down into
>> userspace.
> 
> I did consider exposing the versions supported together with
> max_{grant,maptrack}_frames using a new grant-table dedicated sysctl,
> but maybe it's fine to expose the version as a physcap and fetch the
> other two separately?
> 
> I certainly didn't look much into this, maybe it's pointless to expose
> max_{grant,maptrack}_frames. I think the toolstack will need to be
> able to fetch grant related limits on a per-domain basis like we plan
> to do with the grant version in order to assert the destination host
> supports the current limit applied to the domain.

I think the other two are a better fit to expose via hypfs (alongside
the version). And indeed they are being exposed there already, just
not the version (which would imo belong there for completeness, not
to replace the exposure as a physcap here).

Jan


Re: [PATCH for-4.16 2/2] xen: Report grant table v1/v2 capabilities to the toolstack
Posted by Andrew Cooper 2 years, 5 months ago
On 30/10/2021 12:20, Roger Pau Monné wrote:
> On Fri, Oct 29, 2021 at 06:38:13PM +0100, Andrew Cooper wrote:
>> In order to let the toolstack be able to set the gnttab version on a
>> per-domain basis, it needs to know which ABIs Xen supports.  Introduce
>> XEN_SYSCTL_PHYSCAP_gnttab_v{1,2} for the purpose, and plumb in down into
>> userspace.
> I did consider exposing the versions supported together with
> max_{grant,maptrack}_frames using a new grant-table dedicated sysctl,
> but maybe it's fine to expose the version as a physcap and fetch the
> other two separately?

The naming is already rather wonky.  xl calls it virt_caps which is how
most humans interact with the content, while the comments call it
"platform capabilities".

> I certainly didn't look much into this, maybe it's pointless to expose
> max_{grant,maptrack}_frames. I think the toolstack will need to be
> able to fetch grant related limits on a per-domain basis like we plan
> to do with the grant version in order to assert the destination host
> supports the current limit applied to the domain.

All capabilities, settings and limits ought to be available to an admin
in dom0.  `xl dmesg | grep $FOO` is inadequate.

In an ideal world, yes, we would have max_*_frames available too so `xl`
can give a more coherent error message than "domaincreate returned
-EINVAL", but for now at least those are checks suitably.

On x86, the correct way for this all to have worked would be to have a
gnttab leaf in Xen's CPUID leaves, but that is firmly in ABI-v2
territory now.

~Andrew


Re: [PATCH for-4.16 2/2] xen: Report grant table v1/v2 capabilities to the toolstack
Posted by Christian Lindig 2 years, 5 months ago

On 29 Oct 2021, at 18:38, Andrew Cooper <Andrew.Cooper3@citrix.com<mailto:Andrew.Cooper3@citrix.com>> wrote:

In order to let the toolstack be able to set the gnttab version on a
per-domain basis, it needs to know which ABIs Xen supports.  Introduce
XEN_SYSCTL_PHYSCAP_gnttab_v{1,2} for the purpose, and plumb in down into
userspace.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com<mailto:andrew.cooper3@citrix.com>>
---
CC: George Dunlap <George.Dunlap@eu.citrix.com<mailto:George.Dunlap@eu.citrix.com>>
CC: Ian Jackson <iwj@xenproject.org<mailto:iwj@xenproject.org>>
CC: Jan Beulich <JBeulich@suse.com<mailto:JBeulich@suse.com>>
CC: Stefano Stabellini <sstabellini@kernel.org<mailto:sstabellini@kernel.org>>
CC: Wei Liu <wl@xen.org<mailto:wl@xen.org>>
CC: Roger Pau Monné <roger.pau@citrix.com<mailto:roger.pau@citrix.com>>
CC: Julien Grall <julien@xen.org<mailto:julien@xen.org>>
CC: Christian Lindig <christian.lindig@citrix.com<mailto:christian.lindig@citrix.com>>
CC: Edwin Török <edvin.torok@citrix.com<mailto:edvin.torok@citrix.com>>

The OCaml changes are pure mechanics - so I’m fine with them.

Acked-by: Christian Lindig <christian.lindig@citrix.com<mailto:christian.lindig@citrix.com>>
Re: [PATCH for-4.16 2/2] xen: Report grant table v1/v2 capabilities to the toolstack
Posted by Ian Jackson 2 years, 5 months ago
Andrew Cooper writes ("[PATCH for-4.16 2/2] xen: Report grant table v1/v2 capabilities to the toolstack"):
> In order to let the toolstack be able to set the gnttab version on a
> per-domain basis, it needs to know which ABIs Xen supports.  Introduce
> XEN_SYSCTL_PHYSCAP_gnttab_v{1,2} for the purpose, and plumb in down into
> userspace.

It's not clear to me whether there is consensus around this patch ?

Anyway, tools parts:

Reviewed-by: Ian Jackson <iwj@xenproject.org>

Ian.

Re: [PATCH for-4.16 2/2] xen: Report grant table v1/v2 capabilities to the toolstack
Posted by Roger Pau Monné 2 years, 5 months ago
On Fri, Oct 29, 2021 at 06:38:13PM +0100, Andrew Cooper wrote:
> In order to let the toolstack be able to set the gnttab version on a
> per-domain basis, it needs to know which ABIs Xen supports.  Introduce
> XEN_SYSCTL_PHYSCAP_gnttab_v{1,2} for the purpose, and plumb in down into
> userspace.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks, Roger.