[PATCH v6 3/4] drm: Expand max DRM device number to full MINORBITS

Michał Winiarski posted 4 patches 2 years, 1 month ago
There is a newer version of this series
[PATCH v6 3/4] drm: Expand max DRM device number to full MINORBITS
Posted by Michał Winiarski 2 years, 1 month ago
Having a limit of 64 DRM devices is not good enough for modern world
where we have multi-GPU servers, SR-IOV virtual functions and virtual
devices used for testing.
Let's utilize full minor range for DRM devices.
To avoid regressing the existing userspace, we're still maintaining the
numbering scheme where 0-63 is used for primary, 64-127 is reserved
(formerly for control) and 128-191 is used for render.
For minors >= 192, we're allocating minors dynamically on a first-come,
first-served basis.

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
 drivers/gpu/drm/drm_drv.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
index 34b60196c443..c2c6e80e6b31 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -121,10 +121,19 @@ static void drm_minor_alloc_release(struct drm_device *dev, void *data)
 	xa_erase(drm_minor_get_xa(minor->type), minor->index);
 }
 
+/*
+ * DRM used to support 64 devices, for backwards compatibility we need to maintain the
+ * minor allocation scheme where minors 0-63 are primary nodes, 64-127 are control nodes,
+ * and 128-191 are render nodes.
+ * After reaching the limit, we're allocating minors dynamically - first-come, first-serve.
+ * Accel nodes are using a distinct major, so the minors are allocated in continuous 0-MAX
+ * range.
+ */
 #define DRM_MINOR_LIMIT(t) ({ \
 	typeof(t) _t = (t); \
 	_t == DRM_MINOR_ACCEL ? XA_LIMIT(0, ACCEL_MAX_MINORS) : XA_LIMIT(64 * _t, 64 * _t + 63); \
 })
+#define DRM_EXTENDED_MINOR_LIMIT XA_LIMIT(192, (1 << MINORBITS) - 1)
 
 static int drm_minor_alloc(struct drm_device *dev, enum drm_minor_type type)
 {
@@ -140,6 +149,9 @@ static int drm_minor_alloc(struct drm_device *dev, enum drm_minor_type type)
 
 	r = xa_alloc(drm_minor_get_xa(type), &minor->index,
 		     NULL, DRM_MINOR_LIMIT(type), GFP_KERNEL);
+	if (r == -EBUSY && (type == DRM_MINOR_PRIMARY || type == DRM_MINOR_RENDER))
+		r = xa_alloc(&drm_minors_xa, &minor->index,
+			     NULL, DRM_EXTENDED_MINOR_LIMIT, GFP_KERNEL);
 	if (r < 0)
 		return r;
 
-- 
2.41.0

Re: [PATCH v6 3/4] drm: Expand max DRM device number to full MINORBITS
Posted by Simon Ser 2 years, 1 month ago
On Monday, July 24th, 2023 at 23:14, Michał Winiarski <michal.winiarski@intel.com> wrote:

> Having a limit of 64 DRM devices is not good enough for modern world
> where we have multi-GPU servers, SR-IOV virtual functions and virtual
> devices used for testing.
> Let's utilize full minor range for DRM devices.
> To avoid regressing the existing userspace, we're still maintaining the
> numbering scheme where 0-63 is used for primary, 64-127 is reserved
> (formerly for control) and 128-191 is used for render.
> For minors >= 192, we're allocating minors dynamically on a first-come,
> first-served basis.

In general the approach looks good to me. Old libdrm will see the new
nodes as nodes with an unknown type when it tries to infer the nod type
from the minor, which is as good as it gets.

We do need patches to stop trying to infer the node type from the minor
in libdrm, though. Emil has suggested using sysfs, which we already do
in a few places in libdrm.
Re: [PATCH v6 3/4] drm: Expand max DRM device number to full MINORBITS
Posted by Christian König 2 years, 1 month ago
Am 26.07.23 um 20:15 schrieb Simon Ser:
> On Monday, July 24th, 2023 at 23:14, Michał Winiarski <michal.winiarski@intel.com> wrote:
>
>> Having a limit of 64 DRM devices is not good enough for modern world
>> where we have multi-GPU servers, SR-IOV virtual functions and virtual
>> devices used for testing.
>> Let's utilize full minor range for DRM devices.
>> To avoid regressing the existing userspace, we're still maintaining the
>> numbering scheme where 0-63 is used for primary, 64-127 is reserved
>> (formerly for control) and 128-191 is used for render.
>> For minors >= 192, we're allocating minors dynamically on a first-come,
>> first-served basis.
> In general the approach looks good to me. Old libdrm will see the new
> nodes as nodes with an unknown type when it tries to infer the nod type
> from the minor, which is as good as it gets.

Yeah, agree. I wouldn't upstream patch #4, but apart from that it looks 
like it shouldn't break anything which wasn't broken before.

> We do need patches to stop trying to infer the node type from the minor
> in libdrm, though. Emil has suggested using sysfs, which we already do
> in a few places in libdrm.

That sounds like a really good idea to me as well.

But what do we do with DRM_MAX_MINOR? Change it or keep it and say apps 
should use drmGetDevices2() like Emil suggested?

Regards,
Christian.
Re: [PATCH v6 3/4] drm: Expand max DRM device number to full MINORBITS
Posted by Simon Ser 2 years, 1 month ago
On Thursday, July 27th, 2023 at 14:01, Christian König <christian.koenig@amd.com> wrote:

> > We do need patches to stop trying to infer the node type from the minor
> > in libdrm, though. Emil has suggested using sysfs, which we already do
> > in a few places in libdrm.
> 
> That sounds like a really good idea to me as well.
> 
> But what do we do with DRM_MAX_MINOR? Change it or keep it and say apps
> should use drmGetDevices2() like Emil suggested?

DRM_MAX_MINOR has been bumped to 64 now.

With the new minor allocation scheme, DRM_MAX_MINOR is meaningless
because there is no "max minor per type" concept anymore: the minor no
longer contains the type.

So I'd suggest leaving it as-is (so that old apps still continue to
work on systems with < 64 devices like they do today) and mark it as
deprecated.
Re: [PATCH v6 3/4] drm: Expand max DRM device number to full MINORBITS
Posted by Christian König 2 years, 1 month ago
Am 28.07.23 um 16:22 schrieb Simon Ser:
> On Thursday, July 27th, 2023 at 14:01, Christian König <christian.koenig@amd.com> wrote:
>
>>> We do need patches to stop trying to infer the node type from the minor
>>> in libdrm, though. Emil has suggested using sysfs, which we already do
>>> in a few places in libdrm.
>> That sounds like a really good idea to me as well.
>>
>> But what do we do with DRM_MAX_MINOR? Change it or keep it and say apps
>> should use drmGetDevices2() like Emil suggested?
> DRM_MAX_MINOR has been bumped to 64 now.
>
> With the new minor allocation scheme, DRM_MAX_MINOR is meaningless
> because there is no "max minor per type" concept anymore: the minor no
> longer contains the type.
>
> So I'd suggest leaving it as-is (so that old apps still continue to
> work on systems with < 64 devices like they do today) and mark it as
> deprecated.

Sounds like a plan to me.

Regards,
Christian.
Re: [PATCH v6 3/4] drm: Expand max DRM device number to full MINORBITS
Posted by James Zhu 2 years, 1 month ago
I would like if these kernel patches are accepted by everyone, If yes, 
when they can be upstream.

I have a MR for libdrm to support drm nodes type up to 2^MINORBITS  
nodes which can work with these patches,

https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/305

Thanks!

James

On 2023-08-08 09:55, Christian König wrote:
> Am 28.07.23 um 16:22 schrieb Simon Ser:
>> On Thursday, July 27th, 2023 at 14:01, Christian König 
>> <christian.koenig@amd.com> wrote:
>>
>>>> We do need patches to stop trying to infer the node type from the 
>>>> minor
>>>> in libdrm, though. Emil has suggested using sysfs, which we already do
>>>> in a few places in libdrm.
>>> That sounds like a really good idea to me as well.
>>>
>>> But what do we do with DRM_MAX_MINOR? Change it or keep it and say apps
>>> should use drmGetDevices2() like Emil suggested?
>> DRM_MAX_MINOR has been bumped to 64 now.
>>
>> With the new minor allocation scheme, DRM_MAX_MINOR is meaningless
>> because there is no "max minor per type" concept anymore: the minor no
>> longer contains the type.
>>
>> So I'd suggest leaving it as-is (so that old apps still continue to
>> work on systems with < 64 devices like they do today) and mark it as
>> deprecated.
>
> Sounds like a plan to me.
>
> Regards,
> Christian.
Re: [PATCH v6 3/4] drm: Expand max DRM device number to full MINORBITS
Posted by Simon Ser 2 years ago
On Tuesday, August 8th, 2023 at 17:04, James Zhu <jamesz@amd.com> wrote:

> I have a MR for libdrm to support drm nodes type up to 2^MINORBITS
> nodes which can work with these patches,
> 
> https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/305

FWIW, this MR has been merged, so in theory this kernel patch should
work fine with latest libdrm.
Re: [PATCH v6 3/4] drm: Expand max DRM device number to full MINORBITS
Posted by James Zhu 2 years ago
Hi Simon,

Thanks! Yes, this kernel patch should work with latest libdrm.

Best regards!

James Zhu

On 2023-08-23 06:53, Simon Ser wrote:
> On Tuesday, August 8th, 2023 at 17:04, James Zhu <jamesz@amd.com> wrote:
>
>> I have a MR for libdrm to support drm nodes type up to 2^MINORBITS
>> nodes which can work with these patches,
>>
>> https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/305
> FWIW, this MR has been merged, so in theory this kernel patch should
> work fine with latest libdrm.
Re: [PATCH v6 3/4] drm: Expand max DRM device number to full MINORBITS
Posted by Simon Ser 2 years ago
On Wednesday, August 23rd, 2023 at 12:53, Simon Ser <contact@emersion.fr> wrote:

> On Tuesday, August 8th, 2023 at 17:04, James Zhu jamesz@amd.com wrote:
> 
> > I have a MR for libdrm to support drm nodes type up to 2^MINORBITS
> > nodes which can work with these patches,
> > 
> > https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/305
> 
> FWIW, this MR has been merged, so in theory this kernel patch should
> work fine with latest libdrm.

Hm, we might want to adjust MAX_DRM_NODES still. It's set to 256
currently, which should be enough for 128 DRM devices, but not more.
Not bumping this value will make drmGetDevices2() print a warning and
not return all devices on systems with more than 128 DRM devices.