From: Elizabeth Figura
To: Arnd Bergmann, Greg Kroah-Hartman, Jonathan Corbet, Shuah Khan
Cc: linux-kernel@vger.kernel.org, linux-api@vger.kernel.org,
 wine-devel@winehq.org, André Almeida, Wolfram Sang, Arkadiusz Hiler,
 Peter Zijlstra, Andy Lutomirski, linux-doc@vger.kernel.org,
 linux-kselftest@vger.kernel.org, Randy Dunlap, Ingo Molnar, Will Deacon,
 Waiman Long, Boqun Feng, Elizabeth Figura
Subject: [PATCH v4 01/27] ntsync: Introduce NTSYNC_IOC_WAIT_ANY.
Date: Mon, 15 Apr 2024 20:08:11 -0500
Message-ID: <20240416010837.333694-2-zfigura@codeweavers.com>
In-Reply-To: <20240416010837.333694-1-zfigura@codeweavers.com>
References: <20240416010837.333694-1-zfigura@codeweavers.com>
X-Mailing-List: linux-kernel@vger.kernel.org

This corresponds to part of the functionality of the NT syscall
NtWaitForMultipleObjects(). Specifically, it implements the behaviour
where the third argument (wait_any) is TRUE, and it does not handle
alertable waits. Those features have been split out into separate patches
to ease review.

This patch therefore implements the wait/wake infrastructure which
comprises the core of ntsync's functionality.

NTSYNC_IOC_WAIT_ANY is a vectored wait function similar to poll(). Unlike
poll(), it "consumes" objects when they are signaled. For semaphores, this
means decrementing the internal counter by one. At most one object can be
consumed by this function.

This wait/wake model is fundamentally different from that used anywhere
else in the kernel, and for that reason ntsync does not use any existing
infrastructure, such as futexes, kernel mutexes or semaphores, or
wait_event().

Up to 64 objects can be waited on at once. As soon as one is signaled, the
object with the lowest index is consumed, and that index is returned via
the "index" field.

A timeout is supported. The timeout is passed as a u64 nanosecond value,
which represents absolute time measured against either the MONOTONIC or
REALTIME clock (controlled by the flags argument). If U64_MAX is passed,
the ioctl waits indefinitely.

This ioctl validates that all objects belong to the relevant device. This
is not necessary for any technical reason related to NTSYNC_IOC_WAIT_ANY,
but will be necessary for NTSYNC_IOC_WAIT_ALL introduced in the following
patch.

Two u32s of padding are left in the ntsync_wait_args structure; one will
be used by a patch later in the series (which is split out to ease
review).

Signed-off-by: Elizabeth Figura
---
 drivers/misc/ntsync.c       | 250 ++++++++++++++++++++++++++++++++++++
 include/uapi/linux/ntsync.h |  16 +++
 2 files changed, 266 insertions(+)

diff --git a/drivers/misc/ntsync.c b/drivers/misc/ntsync.c
index 3c2f743c58b0..c6f84a5fc8c0 100644
--- a/drivers/misc/ntsync.c
+++ b/drivers/misc/ntsync.c
@@ -6,11 +6,16 @@
  */
 
 #include
+#include
 #include
 #include
+#include
+#include
 #include
 #include
 #include
+#include
+#include
 #include
 #include
 #include
@@ -30,6 +35,8 @@ enum ntsync_type {
  *
  * Both rely on struct file for reference counting. Individual
  * ntsync_obj objects take a reference to the device when created.
+ * Wait operations take a reference to each object being waited on for
+ * the duration of the wait.
  */
 
 struct ntsync_obj {
@@ -47,12 +54,56 @@ struct ntsync_obj {
 			__u32 max;
 		} sem;
 	} u;
+
+	struct list_head any_waiters;
+};
+
+struct ntsync_q_entry {
+	struct list_head node;
+	struct ntsync_q *q;
+	struct ntsync_obj *obj;
+	__u32 index;
+};
+
+struct ntsync_q {
+	struct task_struct *task;
+	__u32 owner;
+
+	/*
+	 * Protected via atomic_try_cmpxchg(). Only the thread that wins the
+	 * compare-and-swap may actually change object states and wake this
+	 * task.
+	 */
+	atomic_t signaled;
+
+	__u32 count;
+	struct ntsync_q_entry entries[];
 };
 
 struct ntsync_device {
 	struct file *file;
 };
 
+static void try_wake_any_sem(struct ntsync_obj *sem)
+{
+	struct ntsync_q_entry *entry;
+
+	lockdep_assert_held(&sem->lock);
+
+	list_for_each_entry(entry, &sem->any_waiters, node) {
+		struct ntsync_q *q = entry->q;
+		int signaled = -1;
+
+		if (!sem->u.sem.count)
+			break;
+
+		if (atomic_try_cmpxchg(&q->signaled, &signaled, entry->index)) {
+			sem->u.sem.count--;
+			wake_up_process(q->task);
+		}
+	}
+}
+
 /*
  * Actually change the semaphore state, returning -EOVERFLOW if it is made
  * invalid.
@@ -88,6 +139,8 @@ static int ntsync_sem_post(struct ntsync_obj *sem, void __user *argp)
 
 	prev_count = sem->u.sem.count;
 	ret = post_sem_state(sem, args);
+	if (!ret)
+		try_wake_any_sem(sem);
 
 	spin_unlock(&sem->lock);
 
@@ -141,6 +194,7 @@ static struct ntsync_obj *ntsync_alloc_obj(struct ntsync_device *dev,
 	obj->dev = dev;
 	get_file(dev->file);
 	spin_lock_init(&obj->lock);
+	INIT_LIST_HEAD(&obj->any_waiters);
 
 	return obj;
 }
@@ -191,6 +245,200 @@ static int ntsync_create_sem(struct ntsync_device *dev, void __user *argp)
 	return put_user(fd, &user_args->sem);
 }
 
+static struct ntsync_obj *get_obj(struct ntsync_device *dev, int fd)
+{
+	struct file *file = fget(fd);
+	struct ntsync_obj *obj;
+
+	if (!file)
+		return NULL;
+
+	if (file->f_op != &ntsync_obj_fops) {
+		fput(file);
+		return NULL;
+	}
+
+	obj = file->private_data;
+	if (obj->dev != dev) {
+		fput(file);
+		return NULL;
+	}
+
+	return obj;
+}
+
+static void put_obj(struct ntsync_obj *obj)
+{
+	fput(obj->file);
+}
+
+static int ntsync_schedule(const struct ntsync_q *q, const struct ntsync_wait_args *args)
+{
+	ktime_t timeout = ns_to_ktime(args->timeout);
+	clockid_t clock = CLOCK_MONOTONIC;
+	ktime_t *timeout_ptr;
+	int ret = 0;
+
+	timeout_ptr = (args->timeout == U64_MAX ? NULL : &timeout);
+
+	if (args->flags & NTSYNC_WAIT_REALTIME)
+		clock = CLOCK_REALTIME;
+
+	do {
+		if (signal_pending(current)) {
+			ret = -ERESTARTSYS;
+			break;
+		}
+
+		set_current_state(TASK_INTERRUPTIBLE);
+		if (atomic_read(&q->signaled) != -1) {
+			ret = 0;
+			break;
+		}
+		ret = schedule_hrtimeout_range_clock(timeout_ptr, 0, HRTIMER_MODE_ABS, clock);
+	} while (ret < 0);
+	__set_current_state(TASK_RUNNING);
+
+	return ret;
+}
+
+/*
+ * Allocate and initialize the ntsync_q structure, but do not queue us yet.
+ */
+static int setup_wait(struct ntsync_device *dev,
+		      const struct ntsync_wait_args *args,
+		      struct ntsync_q **ret_q)
+{
+	const __u32 count = args->count;
+	int fds[NTSYNC_MAX_WAIT_COUNT];
+	struct ntsync_q *q;
+	__u32 i, j;
+
+	if (!args->owner)
+		return -EINVAL;
+
+	if (args->pad || args->pad2 || (args->flags & ~NTSYNC_WAIT_REALTIME))
+		return -EINVAL;
+
+	if (args->count > NTSYNC_MAX_WAIT_COUNT)
+		return -EINVAL;
+
+	if (copy_from_user(fds, u64_to_user_ptr(args->objs),
+			   array_size(count, sizeof(*fds))))
+		return -EFAULT;
+
+	q = kmalloc(struct_size(q, entries, count), GFP_KERNEL);
+	if (!q)
+		return -ENOMEM;
+	q->task = current;
+	q->owner = args->owner;
+	atomic_set(&q->signaled, -1);
+	q->count = count;
+
+	for (i = 0; i < count; i++) {
+		struct ntsync_q_entry *entry = &q->entries[i];
+		struct ntsync_obj *obj = get_obj(dev, fds[i]);
+
+		if (!obj)
+			goto err;
+
+		entry->obj = obj;
+		entry->q = q;
+		entry->index = i;
+	}
+
+	*ret_q = q;
+	return 0;
+
+err:
+	for (j = 0; j < i; j++)
+		put_obj(q->entries[j].obj);
+	kfree(q);
+	return -EINVAL;
+}
+
+static void try_wake_any_obj(struct ntsync_obj *obj)
+{
+	switch (obj->type) {
+	case NTSYNC_TYPE_SEM:
+		try_wake_any_sem(obj);
+		break;
+	}
+}
+
+static int ntsync_wait_any(struct ntsync_device *dev, void __user *argp)
+{
+	struct ntsync_wait_args args;
+	struct ntsync_q *q;
+	int signaled;
+	__u32 i;
+	int ret;
+
+	if (copy_from_user(&args, argp, sizeof(args)))
+		return -EFAULT;
+
+	ret = setup_wait(dev, &args, &q);
+	if (ret < 0)
+		return ret;
+
+	/* queue ourselves */
+
+	for (i = 0; i < args.count; i++) {
+		struct ntsync_q_entry *entry = &q->entries[i];
+		struct ntsync_obj *obj = entry->obj;
+
+		spin_lock(&obj->lock);
+		list_add_tail(&entry->node, &obj->any_waiters);
+		spin_unlock(&obj->lock);
+	}
+
+	/* check if we are already signaled */
+
+	for (i = 0; i < args.count; i++) {
+		struct ntsync_obj *obj = q->entries[i].obj;
+
+		if (atomic_read(&q->signaled) != -1)
+			break;
+
+		spin_lock(&obj->lock);
+		try_wake_any_obj(obj);
+		spin_unlock(&obj->lock);
+	}
+
+	/* sleep */
+
+	ret = ntsync_schedule(q, &args);
+
+	/* and finally, unqueue */
+
+	for (i = 0; i < args.count; i++) {
+		struct ntsync_q_entry *entry = &q->entries[i];
+		struct ntsync_obj *obj = entry->obj;
+
+		spin_lock(&obj->lock);
+		list_del(&entry->node);
+		spin_unlock(&obj->lock);
+
+		put_obj(obj);
+	}
+
+	signaled = atomic_read(&q->signaled);
+	if (signaled != -1) {
+		struct ntsync_wait_args __user *user_args = argp;
+
+		/* even if we caught a signal, we need to communicate success */
+		ret = 0;
+
+		if (put_user(signaled, &user_args->index))
+			ret = -EFAULT;
+	} else if (!ret) {
+		ret = -ETIMEDOUT;
+	}
+
+	kfree(q);
+	return ret;
+}
+
 static int ntsync_char_open(struct inode *inode, struct file *file)
 {
 	struct ntsync_device *dev;
@@ -222,6 +470,8 @@ static long ntsync_char_ioctl(struct file *file, unsigned int cmd,
 	switch (cmd) {
 	case NTSYNC_IOC_CREATE_SEM:
 		return ntsync_create_sem(dev, argp);
+	case NTSYNC_IOC_WAIT_ANY:
+		return ntsync_wait_any(dev, argp);
 	default:
 		return -ENOIOCTLCMD;
 	}
diff --git a/include/uapi/linux/ntsync.h b/include/uapi/linux/ntsync.h
index dcfa38fdc93c..60ad414b5552 100644
--- a/include/uapi/linux/ntsync.h
+++ b/include/uapi/linux/ntsync.h
@@ -16,7 +16,23 @@ struct ntsync_sem_args {
 	__u32 max;
 };
 
+#define NTSYNC_WAIT_REALTIME	0x1
+
+struct ntsync_wait_args {
+	__u64 timeout;
+	__u64 objs;
+	__u32 count;
+	__u32 owner;
+	__u32 index;
+	__u32 flags;
+	__u32 pad;
+	__u32 pad2;
+};
+
+#define NTSYNC_MAX_WAIT_COUNT 64
+
 #define NTSYNC_IOC_CREATE_SEM		_IOWR('N', 0x80, struct ntsync_sem_args)
+#define NTSYNC_IOC_WAIT_ANY		_IOWR('N', 0x82, struct ntsync_wait_args)
 
 #define NTSYNC_IOC_SEM_POST		_IOWR('N', 0x81, __u32)
 
-- 
2.43.0
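
For readers who want to try the interface from userspace, here is a minimal sketch of how the ntsync_wait_args structure from this patch might be filled in before calling the NTSYNC_IOC_WAIT_ANY ioctl. It is not part of the patch. The struct is a hand-copied mirror of the uapi definition (real code would include the uapi header instead), and every sketch_* and SKETCH_* name is a hypothetical helper invented for illustration. The checks mirror the ones setup_wait() performs: a nonzero owner, at most 64 objects, zeroed padding, and no flag other than NTSYNC_WAIT_REALTIME. Treating an overflowing deadline as "wait indefinitely" is an assumption of this sketch, not something the patch specifies.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hand-copied mirror of struct ntsync_wait_args (assumption: same layout). */
struct sketch_wait_args {
	uint64_t timeout;	/* absolute deadline in ns, or UINT64_MAX */
	uint64_t objs;		/* user pointer to an array of int fds */
	uint32_t count;		/* number of fds in the array */
	uint32_t owner;		/* must be nonzero */
	uint32_t index;		/* written by the kernel on success */
	uint32_t flags;
	uint32_t pad;		/* must be zero */
	uint32_t pad2;		/* must be zero */
};

#define SKETCH_WAIT_REALTIME	0x1	/* mirrors NTSYNC_WAIT_REALTIME */
#define SKETCH_MAX_WAIT_COUNT	64	/* mirrors NTSYNC_MAX_WAIT_COUNT */

/*
 * Turn a relative timeout into the absolute deadline the ioctl expects.
 * now_ns must come from the same clock that the flags select.
 * UINT64_MAX means "wait indefinitely" and is passed through unchanged;
 * a sum that would overflow is treated the same way (sketch assumption).
 */
static uint64_t sketch_deadline_ns(uint64_t now_ns, uint64_t rel_ns)
{
	if (rel_ns == UINT64_MAX || now_ns > UINT64_MAX - rel_ns)
		return UINT64_MAX;
	return now_ns + rel_ns;
}

/*
 * Fill in the argument block for a wait-any call.
 * Returns 0 on success, or -1 for input that setup_wait() in the patch
 * would reject with -EINVAL.
 */
static int sketch_fill_wait_any(struct sketch_wait_args *args,
				const int *fds, uint32_t count,
				uint32_t owner, uint32_t flags,
				uint64_t deadline_ns)
{
	assert(args != NULL);

	if (!owner || count > SKETCH_MAX_WAIT_COUNT ||
	    (flags & ~(uint32_t)SKETCH_WAIT_REALTIME))
		return -1;

	/* Zeroing the whole block also clears pad and pad2. */
	memset(args, 0, sizeof(*args));
	args->timeout = deadline_ns;
	args->objs = (uint64_t)(uintptr_t)fds;
	args->count = count;
	args->owner = owner;
	args->flags = flags;
	return 0;
}
```

A caller would then pass the filled block to ioctl() on the ntsync device file descriptor and, when the call returns 0, read the index field to learn which object was consumed.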