From: Elizabeth Figura
To: Arnd Bergmann, Greg Kroah-Hartman, Jonathan Corbet, Shuah Khan
Cc: linux-kernel@vger.kernel.org, linux-api@vger.kernel.org,
 wine-devel@winehq.org, André Almeida, Wolfram Sang, Arkadiusz Hiler,
 Peter Zijlstra, Andy Lutomirski, linux-doc@vger.kernel.org,
 linux-kselftest@vger.kernel.org, Randy Dunlap, Ingo Molnar, Will Deacon,
 Waiman Long, Boqun Feng, Elizabeth Figura
Subject: [PATCH v5 01/28] ntsync: Introduce NTSYNC_IOC_WAIT_ANY.
Date: Sun, 19 May 2024 15:24:27 -0500
Message-ID: <20240519202454.1192826-2-zfigura@codeweavers.com>
In-Reply-To: <20240519202454.1192826-1-zfigura@codeweavers.com>
References: <20240519202454.1192826-1-zfigura@codeweavers.com>

This corresponds to part of the functionality of the NT syscall
NtWaitForMultipleObjects(). Specifically, it implements the behaviour where
the third argument (wait_any) is TRUE, and it does not handle alertable waits.
Those features have been split out into separate patches to ease review.

This patch therefore implements the wait/wake infrastructure which comprises
the core of ntsync's functionality.

NTSYNC_IOC_WAIT_ANY is a vectored wait function similar to poll(). Unlike
poll(), it "consumes" objects when they are signaled. For semaphores, this
means decrementing the internal counter by one. At most one object can be
consumed by this function.

This wait/wake model is fundamentally different from that used anywhere else
in the kernel, and for that reason ntsync does not use any existing
infrastructure, such as futexes, kernel mutexes or semaphores, or
wait_event().

Up to 64 objects can be waited on at once. As soon as one is signaled, the
object with the lowest index is consumed, and that index is returned via the
"index" field.

A timeout is supported. The timeout is passed as a u64 nanosecond value, which
represents absolute time measured against either the MONOTONIC or REALTIME
clock (controlled by the flags argument). If U64_MAX is passed, the ioctl
waits indefinitely.

This ioctl validates that all objects belong to the relevant device. This is
not necessary for any technical reason related to NTSYNC_IOC_WAIT_ANY, but
will be necessary for NTSYNC_IOC_WAIT_ALL introduced in the following patch.

Some padding fields are added for alignment and for fields which will be added
in future patches (split out to ease review).

Signed-off-by: Elizabeth Figura
---
 drivers/misc/ntsync.c       | 245 ++++++++++++++++++++++++++++++++++++
 include/uapi/linux/ntsync.h |  14 +++
 2 files changed, 259 insertions(+)

diff --git a/drivers/misc/ntsync.c b/drivers/misc/ntsync.c
index 3c2f743c58b0..d5864891caf0 100644
--- a/drivers/misc/ntsync.c
+++ b/drivers/misc/ntsync.c
@@ -6,11 +6,16 @@
  */
 
 #include
+#include
 #include
 #include
+#include
+#include
 #include
 #include
 #include
+#include
+#include
 #include
 #include
 #include
@@ -30,6 +35,8 @@ enum ntsync_type {
  *
  * Both rely on struct file for reference counting. Individual
  * ntsync_obj objects take a reference to the device when created.
+ * Wait operations take a reference to each object being waited on for
+ * the duration of the wait.
  */
 
 struct ntsync_obj {
@@ -47,12 +54,55 @@ struct ntsync_obj {
 			__u32 max;
 		} sem;
 	} u;
+
+	struct list_head any_waiters;
+};
+
+struct ntsync_q_entry {
+	struct list_head node;
+	struct ntsync_q *q;
+	struct ntsync_obj *obj;
+	__u32 index;
+};
+
+struct ntsync_q {
+	struct task_struct *task;
+
+	/*
+	 * Protected via atomic_try_cmpxchg(). Only the thread that wins the
+	 * compare-and-swap may actually change object states and wake this
+	 * task.
+	 */
+	atomic_t signaled;
+
+	__u32 count;
+	struct ntsync_q_entry entries[];
 };
 
 struct ntsync_device {
 	struct file *file;
 };
 
+static void try_wake_any_sem(struct ntsync_obj *sem)
+{
+	struct ntsync_q_entry *entry;
+
+	lockdep_assert_held(&sem->lock);
+
+	list_for_each_entry(entry, &sem->any_waiters, node) {
+		struct ntsync_q *q = entry->q;
+		int signaled = -1;
+
+		if (!sem->u.sem.count)
+			break;
+
+		if (atomic_try_cmpxchg(&q->signaled, &signaled, entry->index)) {
+			sem->u.sem.count--;
+			wake_up_process(q->task);
+		}
+	}
+}
+
 /*
  * Actually change the semaphore state, returning -EOVERFLOW if it is made
  * invalid.
@@ -88,6 +138,8 @@ static int ntsync_sem_post(struct ntsync_obj *sem, void __user *argp)
 
 	prev_count = sem->u.sem.count;
 	ret = post_sem_state(sem, args);
+	if (!ret)
+		try_wake_any_sem(sem);
 
 	spin_unlock(&sem->lock);
 
@@ -141,6 +193,7 @@ static struct ntsync_obj *ntsync_alloc_obj(struct ntsync_device *dev,
 	obj->dev = dev;
 	get_file(dev->file);
 	spin_lock_init(&obj->lock);
+	INIT_LIST_HEAD(&obj->any_waiters);
 
 	return obj;
 }
@@ -191,6 +244,196 @@ static int ntsync_create_sem(struct ntsync_device *dev, void __user *argp)
 	return put_user(fd, &user_args->sem);
 }
 
+static struct ntsync_obj *get_obj(struct ntsync_device *dev, int fd)
+{
+	struct file *file = fget(fd);
+	struct ntsync_obj *obj;
+
+	if (!file)
+		return NULL;
+
+	if (file->f_op != &ntsync_obj_fops) {
+		fput(file);
+		return NULL;
+	}
+
+	obj = file->private_data;
+	if (obj->dev != dev) {
+		fput(file);
+		return NULL;
+	}
+
+	return obj;
+}
+
+static void put_obj(struct ntsync_obj *obj)
+{
+	fput(obj->file);
+}
+
+static int ntsync_schedule(const struct ntsync_q *q, const struct ntsync_wait_args *args)
+{
+	ktime_t timeout = ns_to_ktime(args->timeout);
+	clockid_t clock = CLOCK_MONOTONIC;
+	ktime_t *timeout_ptr;
+	int ret = 0;
+
+	timeout_ptr = (args->timeout == U64_MAX ? NULL : &timeout);
+
+	if (args->flags & NTSYNC_WAIT_REALTIME)
+		clock = CLOCK_REALTIME;
+
+	do {
+		if (signal_pending(current)) {
+			ret = -ERESTARTSYS;
+			break;
+		}
+
+		set_current_state(TASK_INTERRUPTIBLE);
+		if (atomic_read(&q->signaled) != -1) {
+			ret = 0;
+			break;
+		}
+		ret = schedule_hrtimeout_range_clock(timeout_ptr, 0, HRTIMER_MODE_ABS, clock);
+	} while (ret < 0);
+	__set_current_state(TASK_RUNNING);
+
+	return ret;
+}
+
+/*
+ * Allocate and initialize the ntsync_q structure, but do not queue us yet.
+ */
+static int setup_wait(struct ntsync_device *dev,
+		      const struct ntsync_wait_args *args,
+		      struct ntsync_q **ret_q)
+{
+	const __u32 count = args->count;
+	int fds[NTSYNC_MAX_WAIT_COUNT];
+	struct ntsync_q *q;
+	__u32 i, j;
+
+	if (args->pad[0] || args->pad[1] || args->pad[2] || (args->flags & ~NTSYNC_WAIT_REALTIME))
+		return -EINVAL;
+
+	if (args->count > NTSYNC_MAX_WAIT_COUNT)
+		return -EINVAL;
+
+	if (copy_from_user(fds, u64_to_user_ptr(args->objs),
+			   array_size(count, sizeof(*fds))))
+		return -EFAULT;
+
+	q = kmalloc(struct_size(q, entries, count), GFP_KERNEL);
+	if (!q)
+		return -ENOMEM;
+	q->task = current;
+	atomic_set(&q->signaled, -1);
+	q->count = count;
+
+	for (i = 0; i < count; i++) {
+		struct ntsync_q_entry *entry = &q->entries[i];
+		struct ntsync_obj *obj = get_obj(dev, fds[i]);
+
+		if (!obj)
+			goto err;
+
+		entry->obj = obj;
+		entry->q = q;
+		entry->index = i;
+	}
+
+	*ret_q = q;
+	return 0;
+
+err:
+	for (j = 0; j < i; j++)
+		put_obj(q->entries[j].obj);
+	kfree(q);
+	return -EINVAL;
+}
+
+static void try_wake_any_obj(struct ntsync_obj *obj)
+{
+	switch (obj->type) {
+	case NTSYNC_TYPE_SEM:
+		try_wake_any_sem(obj);
+		break;
+	}
+}
+
+static int ntsync_wait_any(struct ntsync_device *dev, void __user *argp)
+{
+	struct ntsync_wait_args args;
+	struct ntsync_q *q;
+	int signaled;
+	__u32 i;
+	int ret;
+
+	if (copy_from_user(&args, argp, sizeof(args)))
+		return -EFAULT;
+
+	ret = setup_wait(dev, &args, &q);
+	if (ret < 0)
+		return ret;
+
+	/* queue ourselves */
+
+	for (i = 0; i < args.count; i++) {
+		struct ntsync_q_entry *entry = &q->entries[i];
+		struct ntsync_obj *obj = entry->obj;
+
+		spin_lock(&obj->lock);
+		list_add_tail(&entry->node, &obj->any_waiters);
+		spin_unlock(&obj->lock);
+	}
+
+	/* check if we are already signaled */
+
+	for (i = 0; i < args.count; i++) {
+		struct ntsync_obj *obj = q->entries[i].obj;
+
+		if (atomic_read(&q->signaled) != -1)
+			break;
+
+		spin_lock(&obj->lock);
+		try_wake_any_obj(obj);
+		spin_unlock(&obj->lock);
+	}
+
+	/* sleep */
+
+	ret = ntsync_schedule(q, &args);
+
+	/* and finally, unqueue */
+
+	for (i = 0; i < args.count; i++) {
+		struct ntsync_q_entry *entry = &q->entries[i];
+		struct ntsync_obj *obj = entry->obj;
+
+		spin_lock(&obj->lock);
+		list_del(&entry->node);
+		spin_unlock(&obj->lock);
+
+		put_obj(obj);
+	}
+
+	signaled = atomic_read(&q->signaled);
+	if (signaled != -1) {
+		struct ntsync_wait_args __user *user_args = argp;
+
+		/* even if we caught a signal, we need to communicate success */
+		ret = 0;
+
+		if (put_user(signaled, &user_args->index))
+			ret = -EFAULT;
+	} else if (!ret) {
+		ret = -ETIMEDOUT;
+	}
+
+	kfree(q);
+	return ret;
+}
+
 static int ntsync_char_open(struct inode *inode, struct file *file)
 {
 	struct ntsync_device *dev;
@@ -222,6 +465,8 @@ static long ntsync_char_ioctl(struct file *file, unsigned int cmd,
 	switch (cmd) {
 	case NTSYNC_IOC_CREATE_SEM:
 		return ntsync_create_sem(dev, argp);
+	case NTSYNC_IOC_WAIT_ANY:
+		return ntsync_wait_any(dev, argp);
 	default:
 		return -ENOIOCTLCMD;
 	}
diff --git a/include/uapi/linux/ntsync.h b/include/uapi/linux/ntsync.h
index dcfa38fdc93c..edc12c7a10dc 100644
--- a/include/uapi/linux/ntsync.h
+++ b/include/uapi/linux/ntsync.h
@@ -16,7 +16,21 @@ struct ntsync_sem_args {
 	__u32 max;
 };
 
+#define NTSYNC_WAIT_REALTIME	0x1
+
+struct ntsync_wait_args {
+	__u64 timeout;
+	__u64 objs;
+	__u32 count;
+	__u32 index;
+	__u32 flags;
+	__u32 pad[3];
+};
+
+#define NTSYNC_MAX_WAIT_COUNT	64
+
 #define NTSYNC_IOC_CREATE_SEM		_IOWR('N', 0x80, struct ntsync_sem_args)
+#define NTSYNC_IOC_WAIT_ANY		_IOWR('N', 0x82, struct ntsync_wait_args)
 
 #define NTSYNC_IOC_SEM_POST		_IOWR('N', 0x81, __u32)
 
-- 
2.43.0
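
The claim protocol described in the commit message (a waiter's "signaled"
slot starts at -1, and only the compare-and-swap winner may consume an object)
can be illustrated with a small userspace model. This is a single-threaded
sketch, not kernel code. The names model_sem, model_waiter, model_try_wake and
model_wait_any_poll are hypothetical and do not appear in the patch. C11
atomics stand in for atomic_try_cmpxchg(), and the per-object spinlock,
sleeping and timeouts are deliberately left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the semaphore state of an ntsync object. */
struct model_sem {
	uint32_t count;
};

/* Hypothetical stand-in for struct ntsync_q: -1 means "not yet signaled". */
struct model_waiter {
	atomic_int signaled;
};

/*
 * Try to consume one unit of `sem` on behalf of waiter `w`.
 * Only the caller that swaps -1 -> index may decrement the count.
 */
static bool model_try_wake(struct model_sem *sem, struct model_waiter *w, int index)
{
	int expected = -1;

	if (sem->count == 0)
		return false;

	if (!atomic_compare_exchange_strong(&w->signaled, &expected, index))
		return false;	/* another object already claimed this waiter */

	sem->count--;
	return true;
}

/*
 * Model of the "check if we are already signaled" pass: scan the objects in
 * index order and stop once the waiter has been claimed.
 * Returns the consumed index, or -1 if nothing was signaled.
 */
static int model_wait_any_poll(struct model_sem *sems, uint32_t n)
{
	struct model_waiter w;
	uint32_t i;

	assert(n <= 64);	/* mirrors NTSYNC_MAX_WAIT_COUNT */
	atomic_init(&w.signaled, -1);

	for (i = 0; i < n; i++) {
		if (atomic_load(&w.signaled) != -1)
			break;
		model_try_wake(&sems[i], &w, (int)i);
	}

	return atomic_load(&w.signaled);
}
```

With counts {0, 2, 1} the scan claims index 1 and leaves index 2 untouched,
which matches the "at most one object is consumed, lowest index first" rule.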