From nobody Thu Apr 2 23:57:06 2026
From: Shaurya Rane
To: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Cc: manfred@colorfullife.com, viro@zeniv.linux.org.uk, brauner@kernel.org,
 chuck.lever@oracle.com, jlayton@kernel.org, rstoyanov@fedoraproject.org,
 ptikhomirov@virtuozzo.com, Shaurya Rane
Subject: [RFC PATCH 1/3] mqueue: uapi: add struct mq_peek_attr and F_MQ_PEEK
Date: Thu, 26 Mar 2026 00:30:23 +0530
Message-Id: <20260325190025.40312-2-ssrane_b23@ee.vjti.ac.in>
In-Reply-To: <20260325190025.40312-1-ssrane_b23@ee.vjti.ac.in>
References: <20260325190025.40312-1-ssrane_b23@ee.vjti.ac.in>

Add the user-visible interface for non-destructive POSIX message queue
inspection via fcntl(2).

POSIX message queues have no way to inspect queued messages without
consuming them: mq_receive() always dequeues the message it returns.
This makes it impossible for checkpoint/restore tools such as CRIU to
save and replay message queue contents without destroying the queue
state in the process.

struct mq_peek_attr describes the request: the caller specifies an
index into the queue in receive order (0 = next message that
mq_receive() would return, i.e. highest priority, FIFO within same
priority) and a buffer to receive the payload. On return, msg_prio is
filled with the message priority and the return value is the number of
bytes copied.

F_MQ_PEEK = F_LINUX_SPECIFIC_BASE + 17 is the new fcntl command that
accepts a pointer to struct mq_peek_attr.
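For illustration only, a userspace caller might wrap the proposed command
roughly as follows. This is a sketch that assumes the series is applied as
posted: the local struct mq_peek_attr and the F_MQ_PEEK fallback value
(1024 + 17, i.e. F_LINUX_SPECIFIC_BASE + 17) mirror the uapi definitions
proposed in this patch, and mq_peek_msg() is a hypothetical helper name,
not part of the series. On a kernel without the series the fcntl() call is
expected to fail rather than return data.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: value proposed by this patch (F_LINUX_SPECIFIC_BASE + 17). */
#ifndef F_MQ_PEEK
#define F_MQ_PEEK (1024 + 17)
#endif

/* Assumption: userspace mirror of the proposed uapi structure. */
struct mq_peek_attr {
	int32_t  offset;	/* in:  index in receive order, 0 = next to dequeue */
	uint32_t msg_prio;	/* out: priority of the message at offset */
	size_t   buf_len;	/* in:  size of buf in bytes */
	char    *buf;		/* out: payload, truncated to buf_len */
};

/*
 * Hypothetical helper: peek at message number `index` on the queue
 * descriptor `mqd` (an ordinary file descriptor on Linux).
 * Returns the number of bytes copied, or -1 with errno set
 * (ENOMSG when `index` is past the last queued message).
 */
static long mq_peek_msg(int mqd, int32_t index, char *buf, size_t len,
			uint32_t *prio)
{
	struct mq_peek_attr attr = {
		.offset = index,
		.buf_len = len,
		.buf = buf,
	};
	long n;

	assert(buf != NULL && prio != NULL);

	n = fcntl(mqd, F_MQ_PEEK, &attr);
	if (n >= 0)
		*prio = attr.msg_prio;	/* only valid on success */
	return n;
}
```

A dump tool could call mq_peek_msg() with index 0, 1, 2, ... until it
fails with ENOMSG, sizing the buffer from mq_msgsize as reported by
mq_getattr().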
Link: https://github.com/checkpoint-restore/criu/issues/2285
Signed-off-by: Shaurya Rane
---
 include/uapi/linux/fcntl.h  |  6 ++++++
 include/uapi/linux/mqueue.h | 21 +++++++++++++++++++++
 2 files changed, 27 insertions(+)

diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h
index aadfbf6e0cb3..ea34f87de0fb 100644
--- a/include/uapi/linux/fcntl.h
+++ b/include/uapi/linux/fcntl.h
@@ -84,6 +84,12 @@
 #define F_GETDELEG		(F_LINUX_SPECIFIC_BASE + 15)
 #define F_SETDELEG		(F_LINUX_SPECIFIC_BASE + 16)
 
+/*
+ * Peek at a POSIX message queue message by index without consuming it.
+ * Argument is a pointer to struct mq_peek_attr (see <linux/mqueue.h>).
+ */
+#define F_MQ_PEEK		(F_LINUX_SPECIFIC_BASE + 17)
+
 /* Argument structure for F_GETDELEG and F_SETDELEG */
 struct delegation {
 	__u32	d_flags;	/* Must be 0 */
diff --git a/include/uapi/linux/mqueue.h b/include/uapi/linux/mqueue.h
index b516b66840ad..7133b84c70d1 100644
--- a/include/uapi/linux/mqueue.h
+++ b/include/uapi/linux/mqueue.h
@@ -53,4 +53,25 @@ struct mq_attr {
 
 #define NOTIFY_COOKIE_LEN	32
 
+/*
+ * Argument structure for fcntl(F_MQ_PEEK).
+ *
+ * Peek at a POSIX message queue message by index without removing it.
+ * @offset:   Index in receive order (0 = highest priority, next to dequeue).
+ *            FIFO ordering is preserved within the same priority level.
+ * @msg_prio: Output: priority of the message at @offset.
+ * @buf_len:  Size of the caller-provided buffer at @buf.
+ * @buf:      Output: message payload is written here; truncated to @buf_len
+ *            bytes if the message is larger.
+ *
+ * Returns the number of bytes copied on success, -ENOMSG if @offset is
+ * >= mq_curmsgs, or a negative error code on failure.
+ */
+struct mq_peek_attr {
+	__s32		offset;
+	__u32		msg_prio;
+	__kernel_size_t	buf_len;
+	char __user	*buf;
+};
+
 #endif
-- 
2.34.1

From nobody Thu Apr 2 23:57:06 2026
From: Shaurya Rane
To: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Cc: manfred@colorfullife.com, viro@zeniv.linux.org.uk, brauner@kernel.org,
 chuck.lever@oracle.com, jlayton@kernel.org, rstoyanov@fedoraproject.org,
 ptikhomirov@virtuozzo.com, Shaurya Rane
Subject: [RFC PATCH 2/3] msg: move struct msg_msgseg and DATALEN_* to include/linux/msg.h
Date: Thu, 26 Mar 2026 00:30:24 +0530
Message-Id: <20260325190025.40312-3-ssrane_b23@ee.vjti.ac.in>
In-Reply-To: <20260325190025.40312-1-ssrane_b23@ee.vjti.ac.in>
References: <20260325190025.40312-1-ssrane_b23@ee.vjti.ac.in>

struct msg_msgseg and the DATALEN_MSG / DATALEN_SEG macros are
currently private to ipc/msgutil.c. struct msg_msg (already in the
public kernel header include/linux/msg.h) carries a pointer to
msg_msgseg, making it an incomplete type for all callers outside
msgutil.c.

Move the definition of struct msg_msgseg and the two DATALEN macros to
include/linux/msg.h so that other IPC code can safely copy
multi-segment message payloads into a kernel buffer under a spinlock,
without calling store_msg() which performs copy_to_user() and therefore
cannot be used under a spinlock.

ipc/msgutil.c already includes <linux/msg.h>, so it picks up the
definitions from the header with no functional change.
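The segment arithmetic implied by the two DATALEN macros can be worked
through with a small standalone sketch. It is not kernel code:
overflow_segments() is a hypothetical helper, and the page and header
sizes are passed in as parameters because the real values depend on the
architecture and kernel configuration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of chained overflow segments (struct msg_msgseg blocks) needed
 * for a payload of `len` bytes, given:
 *   page_size - stand-in for PAGE_SIZE
 *   msg_hdr   - stand-in for sizeof(struct msg_msg)
 *   seg_hdr   - stand-in for sizeof(struct msg_msgseg)
 */
static size_t overflow_segments(size_t len, size_t page_size,
				size_t msg_hdr, size_t seg_hdr)
{
	size_t datalen_msg, datalen_seg;

	assert(page_size > msg_hdr && page_size > seg_hdr);

	datalen_msg = page_size - msg_hdr;	/* models DATALEN_MSG */
	datalen_seg = page_size - seg_hdr;	/* models DATALEN_SEG */

	if (len <= datalen_msg)
		return 0;	/* fits in the block holding the header */

	/* Round up: the last segment may be only partially filled. */
	return (len - datalen_msg + datalen_seg - 1) / datalen_seg;
}
```

Assuming a 4096-byte page, a 48-byte struct msg_msg and an 8-byte struct
msg_msgseg (illustrative figures only), the first block holds 4048
payload bytes and each segment 4088, so an 8192-byte message needs two
overflow segments.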
Signed-off-by: Shaurya Rane
---
 include/linux/msg.h | 13 +++++++++++++
 ipc/msgutil.c       |  7 -------
 2 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/include/linux/msg.h b/include/linux/msg.h
index 9a972a296b95..2d5353bace9a 100644
--- a/include/linux/msg.h
+++ b/include/linux/msg.h
@@ -5,6 +5,19 @@
 #include
 #include
 
+/*
+ * Each message is stored in one or more page-sized segments.
+ * The first segment is embedded in struct msg_msg; overflow goes into
+ * chained struct msg_msgseg blocks.
+ */
+struct msg_msgseg {
+	struct msg_msgseg *next;
+	/* message data follows immediately */
+};
+
+#define DATALEN_MSG	((size_t)PAGE_SIZE - sizeof(struct msg_msg))
+#define DATALEN_SEG	((size_t)PAGE_SIZE - sizeof(struct msg_msgseg))
+
 /* one msg_msg structure for each message */
 struct msg_msg {
 	struct list_head m_list;
diff --git a/ipc/msgutil.c b/ipc/msgutil.c
index e28f0cecb2ec..9cd4b078d55c 100644
--- a/ipc/msgutil.c
+++ b/ipc/msgutil.c
@@ -31,13 +31,6 @@ struct ipc_namespace init_ipc_ns = {
 	.user_ns = &init_user_ns,
 };
 
-struct msg_msgseg {
-	struct msg_msgseg *next;
-	/* the next part of the message follows immediately */
-};
-
-#define DATALEN_MSG	((size_t)PAGE_SIZE-sizeof(struct msg_msg))
-#define DATALEN_SEG	((size_t)PAGE_SIZE-sizeof(struct msg_msgseg))
 
 static kmem_buckets *msg_buckets __ro_after_init;
 
-- 
2.34.1

From nobody Thu Apr 2 23:57:06 2026
From: Shaurya Rane
To: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Cc: manfred@colorfullife.com, viro@zeniv.linux.org.uk, brauner@kernel.org,
 chuck.lever@oracle.com, jlayton@kernel.org, rstoyanov@fedoraproject.org,
 ptikhomirov@virtuozzo.com, Shaurya Rane
Subject: [RFC PATCH 3/3] ipc/mqueue: implement fcntl(F_MQ_PEEK) for non-destructive message inspection
Date: Thu, 26 Mar 2026 00:30:25 +0530
Message-Id: <20260325190025.40312-4-ssrane_b23@ee.vjti.ac.in>
In-Reply-To: <20260325190025.40312-1-ssrane_b23@ee.vjti.ac.in>
References: <20260325190025.40312-1-ssrane_b23@ee.vjti.ac.in>

Add support for F_MQ_PEEK, a new fcntl command that reads a POSIX
message queue message by index without removing it from the queue.

Background:

CRIU (Checkpoint/Restore In Userspace) supports live container
migration and process checkpoint/restore. POSIX message queues are a
widely-used IPC mechanism, but CRIU cannot checkpoint processes that
hold open mqueue file descriptors: there is no kernel interface to
inspect queued messages non-destructively.

The SysV IPC analogue (MSG_COPY for msgrcv) was introduced specifically
for CRIU in commit 4a674f34ba04 ("ipc: introduce message queue copy
feature"). This patch provides the equivalent for POSIX mqueues.

Implementation:

The queue stores messages in a red-black tree (info->msg_tree) keyed by
priority, with each tree node holding a FIFO list of messages at that
priority level. mq_peek_at_offset() walks this structure in receive
order (highest priority first, FIFO within priority) to locate the
message at the requested index without modifying any state.

Message payload is copied into a kvmalloc'd kernel buffer under
info->lock using pure memcpy() (no page faults possible). This
correctly handles multi-segment messages by walking the msg_msgseg
chain. The lock is released before copy_to_user() transfers the kernel
buffer to userspace.
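The receive-order indexing described above can be modelled outside the
kernel. The sketch below is a toy under stated assumptions, not the
patch's code: a sorted array of priority levels stands in for the
red-black tree, a singly linked list stands in for each per-priority
FIFO, and struct toy_msg, struct toy_level and peek_at() are hypothetical
names.

```c
#include <assert.h>
#include <stddef.h>

struct toy_msg {
	const char *text;
	struct toy_msg *next;	/* next-oldest message at the same priority */
};

struct toy_level {
	unsigned int prio;
	struct toy_msg *head;	/* oldest message at this priority */
};

/*
 * levels[] must be sorted by ascending priority, as an in-order tree
 * walk would yield. Returns the message at receive-order index `offset`
 * (0 = what the next receive would return) and stores its priority in
 * *prio, or returns NULL if fewer than offset + 1 messages are queued.
 */
static const struct toy_msg *peek_at(const struct toy_level *levels,
				     size_t nlevels, int offset,
				     unsigned int *prio)
{
	int count = 0;

	assert(levels != NULL || nlevels == 0);

	for (size_t i = nlevels; i-- > 0; ) {	/* highest priority first */
		const struct toy_msg *m;

		for (m = levels[i].head; m; m = m->next) {	/* FIFO order */
			if (count++ == offset) {
				if (prio)
					*prio = levels[i].prio;
				return m;
			}
		}
	}
	return NULL;
}
```

With two messages queued at priority 5 and one at priority 1, indexes 0
and 1 resolve to the priority-5 messages in arrival order, index 2 to the
priority-1 message, and index 3 to NULL, which is the case the patch
reports as -ENOMSG.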
A new include/linux/mqueue.h kernel header is added to declare
do_mq_peek() for use from fs/fcntl.c, following the same pattern as
include/linux/memfd.h for memfd_fcntl().

Concurrency:

The snapshot is consistent within the spin_lock() critical section.
Between two F_MQ_PEEK calls the queue may change (messages may be sent
or received). This is documented snapshot semantics, analogous to
/proc entries. CRIU freezes the target process via ptrace before
dumping, so in practice the queue is stable for the entire checkpoint
sequence.

Link: https://github.com/checkpoint-restore/criu/issues/2285
Signed-off-by: Shaurya Rane
---
 fs/fcntl.c             |   4 ++
 include/linux/mqueue.h |  19 ++++++
 ipc/mqueue.c           | 129 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 152 insertions(+)
 create mode 100644 include/linux/mqueue.h

diff --git a/fs/fcntl.c b/fs/fcntl.c
index f93dbca08435..32d0dcc8e544 100644
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -24,6 +24,7 @@
 #include
 #include
 #include
+#include <linux/mqueue.h>
 #include
 #include
 #include
@@ -563,6 +564,9 @@ static long do_fcntl(int fd, unsigned int cmd, unsigned long arg,
 			return -EFAULT;
 		err = fcntl_setdeleg(fd, filp, &deleg);
 		break;
+	case F_MQ_PEEK:
+		err = do_mq_peek(filp, argp);
+		break;
 	default:
 		break;
 	}
diff --git a/include/linux/mqueue.h b/include/linux/mqueue.h
new file mode 100644
index 000000000000..a725fcf90d39
--- /dev/null
+++ b/include/linux/mqueue.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __LINUX_MQUEUE_H
+#define __LINUX_MQUEUE_H
+
+#include
+
+struct file;
+
+#ifdef CONFIG_POSIX_MQUEUE
+long do_mq_peek(struct file *filp, struct mq_peek_attr __user *uattr);
+#else
+static inline long do_mq_peek(struct file *filp,
+			      struct mq_peek_attr __user *uattr)
+{
+	return -EBADF;
+}
+#endif /* CONFIG_POSIX_MQUEUE */
+
+#endif /* __LINUX_MQUEUE_H */
diff --git a/ipc/mqueue.c b/ipc/mqueue.c
index bb7c9e5d2b90..5e73864a9657 100644
--- a/ipc/mqueue.c
+++ b/ipc/mqueue.c
@@ -286,6 +286,135 @@ static inline struct msg_msg *msg_get(struct mqueue_inode_info *info)
 	return msg;
 }
 
+/*
+ * mq_peek_at_offset - locate a message by receive-order index.
+ *
+ * Walk the priority tree from highest to lowest priority, and within each
+ * priority level in FIFO order, returning the message at position @offset
+ * (0 = next message that mq_receive() would dequeue).
+ *
+ * Must be called with info->lock held. Does not modify queue state.
+ * Returns NULL if @offset >= mq_curmsgs.
+ */
+static struct msg_msg *mq_peek_at_offset(struct mqueue_inode_info *info,
+					 int offset)
+{
+	struct posix_msg_tree_node *leaf;
+	struct rb_node *node;
+	struct msg_msg *msg;
+	int count = 0;
+
+	for (node = info->msg_tree_rightmost; node; node = rb_prev(node)) {
+		leaf = rb_entry(node, struct posix_msg_tree_node, rb_node);
+		list_for_each_entry(msg, &leaf->msg_list, m_list) {
+			if (count == offset)
+				return msg;
+			count++;
+		}
+	}
+	return NULL;
+}
+
+/*
+ * mq_msg_copy_to_buf - copy message payload into a flat kernel buffer.
+ *
+ * Handles multi-segment messages by walking the msg_msgseg chain.
+ * Uses only memcpy() so it is safe to call under info->lock.
+ * Returns the number of bytes copied.
+ */
+static size_t mq_msg_copy_to_buf(struct msg_msg *msg, void *buf, size_t buf_len)
+{
+	size_t alen, to_copy, copied = 0;
+	struct msg_msgseg *seg;
+
+	to_copy = min(buf_len, msg->m_ts);
+
+	alen = min(to_copy, DATALEN_MSG);
+	memcpy(buf, msg + 1, alen);
+	copied += alen;
+	to_copy -= alen;
+
+	for (seg = msg->next; seg && to_copy > 0; seg = seg->next) {
+		alen = min(to_copy, DATALEN_SEG);
+		memcpy((char *)buf + copied, seg + 1, alen);
+		copied += alen;
+		to_copy -= alen;
+	}
+	return copied;
+}
+
+/*
+ * do_mq_peek - implement fcntl(F_MQ_PEEK).
+ *
+ * Read the message at position @attr.offset in receive order from the
+ * queue without removing it. Position 0 is the message that the next
+ * mq_receive() would return (highest priority, FIFO within priority).
+ *
+ * The snapshot is consistent within the spin_lock() critical section.
+ * Between two F_MQ_PEEK calls the queue may change; this is documented
+ * snapshot semantics analogous to /proc entries.
+ *
+ * Returns bytes copied on success, -ENOMSG if offset >= mq_curmsgs.
+ */
+long do_mq_peek(struct file *filp, struct mq_peek_attr __user *uattr)
+{
+	struct mqueue_inode_info *info;
+	struct mq_peek_attr attr;
+	struct msg_msg *msg;
+	void *kbuf;
+	long ret;
+
+	if (filp->f_op != &mqueue_file_operations)
+		return -EBADF;
+
+	if (!(filp->f_mode & FMODE_READ))
+		return -EBADF;
+
+	if (copy_from_user(&attr, uattr, sizeof(attr)))
+		return -EFAULT;
+
+	if (attr.offset < 0 || !attr.buf_len || !attr.buf)
+		return -EINVAL;
+
+	info = MQUEUE_I(file_inode(filp));
+
+	/*
+	 * Allocate the kernel copy buffer before taking the spinlock.
+	 * Cap at mq_msgsize: no message can exceed it.
+	 */
+	kbuf = kvmalloc(min_t(size_t, attr.buf_len, info->attr.mq_msgsize),
+			GFP_KERNEL);
+	if (!kbuf)
+		return -ENOMEM;
+
+	spin_lock(&info->lock);
+
+	msg = mq_peek_at_offset(info, attr.offset);
+	if (!msg) {
+		spin_unlock(&info->lock);
+		kvfree(kbuf);
+		return -ENOMSG;
+	}
+
+	/*
+	 * Copy the payload under the lock using pure memcpy() (no page
+	 * faults), then transfer to userspace after releasing the lock.
+	 */
+	ret = mq_msg_copy_to_buf(msg, kbuf,
+				 min_t(size_t, attr.buf_len,
+				       info->attr.mq_msgsize));
+	attr.msg_prio = msg->m_type;
+
+	spin_unlock(&info->lock);
+
+	if (copy_to_user(attr.buf, kbuf, ret) ||
+	    copy_to_user(uattr, &attr, sizeof(attr)))
+		ret = -EFAULT;
+
+	kvfree(kbuf);
+	return ret;
+}
+
 static struct inode *mqueue_get_inode(struct super_block *sb,
 		struct ipc_namespace *ipc_ns, umode_t mode,
 		struct mq_attr *attr)
-- 
2.34.1