From nobody Fri Apr 10 10:43:51 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 24A23C32771 for ; Fri, 19 Aug 2022 15:41:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1350054AbiHSPld (ORCPT ); Fri, 19 Aug 2022 11:41:33 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37860 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1349967AbiHSPk6 (ORCPT ); Fri, 19 Aug 2022 11:40:58 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6301AFF8CD; Fri, 19 Aug 2022 08:40:48 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 055A3B8277D; Fri, 19 Aug 2022 15:40:47 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5CC65C433C1; Fri, 19 Aug 2022 15:40:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1660923645; bh=JH8YGLEcHg3LX4WC0hQaLrQ6zIleqe7Bxnu577DJ1JQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=2rnjkPK1OE+wf3pv8lYDNW+ZT8ZZJprMO3k+vrqzCToIhCoK0+ICAUCNZi1Cz7rdi SJg36vAXfF5H6feyU4mYmHILLNNmCqsQZbXV8xNDW8PmjYQAunMULNgSibCqYm6pSn x5punIvHykIiMB++niSbr2fVoG9VVLA8t1WoD4wo= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Jens Axboe Subject: [PATCH 5.15 01/14] io_uring: use original request task for inflight tracking Date: Fri, 19 Aug 2022 17:40:17 +0200 Message-Id: <20220819153711.712418650@linuxfoundation.org> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20220819153711.658766010@linuxfoundation.org> 
References: <20220819153711.658766010@linuxfoundation.org> User-Agent: quilt/0.67 X-stable: review X-Patchwork-Hint: ignore MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" From: Jens Axboe commit 386e4fb6962b9f248a80f8870aea0870ca603e89 upstream. In prior kernels, we did file assignment always at prep time. This meant that req->task =3D=3D current. But after deferring that assignment and then pushing the inflight tracking back in, we've got the inflight tracking using current when it should in fact now be using req->task. Fix up that error introduced by adding the inflight tracking back after file assignments got modified. Fixes: 9cae36a094e7 ("io_uring: reinstate the inflight tracking") Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman Tested-by: Bagas Sanjaya Tested-by: Ron Economos Tested-by: Shuah Khan Tested-by: Sudip Mukherjee --- fs/io_uring.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -1405,7 +1405,7 @@ static void io_req_track_inflight(struct { if (!(req->flags & REQ_F_INFLIGHT)) { req->flags |=3D REQ_F_INFLIGHT; - atomic_inc(&current->io_uring->inflight_tracked); + atomic_inc(&req->task->io_uring->inflight_tracked); } } From nobody Fri Apr 10 10:43:51 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 307DCC3F6B0 for ; Fri, 19 Aug 2022 15:42:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1350065AbiHSPma (ORCPT ); Fri, 19 Aug 2022 11:42:30 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38634 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1349981AbiHSPlt (ORCPT ); Fri, 19 Aug 2022 11:41:49 -0400
Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EAAFFEA336; Fri, 19 Aug 2022 08:41:07 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 262D8B82814; Fri, 19 Aug 2022 15:41:06 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 6EE08C433C1; Fri, 19 Aug 2022 15:41:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1660923664; bh=qh/axndb7wUJZpXguucKoIulVp/Qb3Nd7Atm6YWpB5k=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=KyUglHcQebGDrtrwLWNu1M4ir0TS5OALsmqOMY7HEjtejt2cb6sMFLjxduKsyugpv DbJqbOMTUQcWoe3cDCAUFVGImrQPocItkbBU107NqAbvSvTmJ4tuRgrxZWLTknBqyx MF4+Xo6eb1esJQQgtJiLJh4GqqRDmxqbIBWpK08s= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Nimish Mishra , Anirban Chakraborty , Debdeep Mukhopadhyay , Jerome Forissier , Jens Wiklander , Linus Torvalds Subject: [PATCH 5.15 02/14] tee: add overflow check in register_shm_helper() Date: Fri, 19 Aug 2022 17:40:18 +0200 Message-Id: <20220819153711.747538439@linuxfoundation.org> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20220819153711.658766010@linuxfoundation.org> References: <20220819153711.658766010@linuxfoundation.org> User-Agent: quilt/0.67 MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" From: Jens Wiklander commit 573ae4f13f630d6660008f1974c0a8a29c30e18a upstream. With special lengths supplied by user space, register_shm_helper() has an integer overflow when calculating the number of pages covered by a supplied user space memory region. 
This causes internal_get_user_pages_fast(), a helper function of pin_user_pages_fast(), to do a NULL pointer dereference: Unable to handle kernel NULL pointer dereference at virtual address 00000= 00000000010 Modules linked in: CPU: 1 PID: 173 Comm: optee_example_a Not tainted 5.19.0 #11 Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015 pc : internal_get_user_pages_fast+0x474/0xa80 Call trace: internal_get_user_pages_fast+0x474/0xa80 pin_user_pages_fast+0x24/0x4c register_shm_helper+0x194/0x330 tee_shm_register_user_buf+0x78/0x120 tee_ioctl+0xd0/0x11a0 __arm64_sys_ioctl+0xa8/0xec invoke_syscall+0x48/0x114 Fix this by adding an explicit call to access_ok() in tee_shm_register_user_buf() to catch an invalid user space address early. Fixes: 033ddf12bcf5 ("tee: add register user memory") Cc: stable@vger.kernel.org Reported-by: Nimish Mishra Reported-by: Anirban Chakraborty Reported-by: Debdeep Mukhopadhyay Suggested-by: Jerome Forissier Signed-off-by: Jens Wiklander Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman Tested-by: Bagas Sanjaya Tested-by: Ron Economos Tested-by: Shuah Khan Tested-by: Sudip Mukherjee --- drivers/tee/tee_shm.c | 3 +++ 1 file changed, 3 insertions(+) --- a/drivers/tee/tee_shm.c +++ b/drivers/tee/tee_shm.c @@ -222,6 +222,9 @@ struct tee_shm *tee_shm_register(struct goto err; } =20 + if (!access_ok((void __user *)addr, length)) + return ERR_PTR(-EFAULT); + mutex_lock(&teedev->mutex); shm->id =3D idr_alloc(&teedev->idr, shm, 1, 0, GFP_KERNEL); mutex_unlock(&teedev->mutex); From nobody Fri Apr 10 10:43:51 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 58C57C32773 for ; Fri, 19 Aug 2022 15:42:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1350005AbiHSPmh (ORCPT ); Fri, 19 Aug 2022 11:42:37
-0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36850 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1349721AbiHSPlu (ORCPT ); Fri, 19 Aug 2022 11:41:50 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 56841101D20; Fri, 19 Aug 2022 08:41:09 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 749D3615FC; Fri, 19 Aug 2022 15:41:08 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7FDA5C433C1; Fri, 19 Aug 2022 15:41:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1660923667; bh=QXKbbD2rBMM2SVJ7jtAh4BSriiVTIwVcRFRHymGJ04g=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=hHa2v3hybS9+H/eLQFowKQJiHAJ/hkS8sxY6jZoazvF5vtxWVSm6K2QQPGIzwy7uk 4kfCQF8wwIrcyy1VYAaUtOVH7yrNAzRPQS0hgeg25vQ0AxjgAN3CYtkE8qykJm0Mpi 6rC1aGgRTnslyPTMOT9xR4PnGMY8veAbwwEwnZ4E= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Jamal Hadi Salim , Stephen Hemminger , "David S. Miller" Subject: [PATCH 5.15 03/14] net_sched: cls_route: disallow handle of 0 Date: Fri, 19 Aug 2022 17:40:19 +0200 Message-Id: <20220819153711.787764500@linuxfoundation.org> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20220819153711.658766010@linuxfoundation.org> References: <20220819153711.658766010@linuxfoundation.org> User-Agent: quilt/0.67 MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" From: Jamal Hadi Salim commit 02799571714dc5dd6948824b9d080b44a295f695 upstream. 
Follows up on: https://lore.kernel.org/all/20220809170518.164662-1-cascardo@canonical.com/ handle of 0 implies from/to of universe realm which is not very sensible. Let's see what this patch will do: $sudo tc qdisc add dev $DEV root handle 1:0 prio //let's manufacture a way to insert handle of 0 $sudo tc filter add dev $DEV parent 1:0 protocol ip prio 100 \ route to 0 from 0 classid 1:10 action ok //gets rejected... Error: handle of 0 is not valid. We have an error talking to the kernel, -1 //let's create a legit entry.. sudo tc filter add dev $DEV parent 1:0 protocol ip prio 100 route from 10 \ classid 1:10 action ok //what did the kernel insert? $sudo tc filter ls dev $DEV parent 1:0 filter protocol ip pref 100 route chain 0 filter protocol ip pref 100 route chain 0 fh 0x000a8000 flowid 1:10 from 10 action order 1: gact action pass random type none pass val 0 index 1 ref 1 bind 1 //Let's try to replace that legit entry with a handle of 0 $ sudo tc filter replace dev $DEV parent 1:0 protocol ip prio 100 \ handle 0x000a8000 route to 0 from 0 classid 1:10 action drop Error: Replacing with handle of 0 is invalid. We have an error talking to the kernel, -1 And last, let's run Cascardo's POC: $ ./poc 0 0 -22 -22 -22 Signed-off-by: Jamal Hadi Salim Acked-by: Stephen Hemminger Signed-off-by: David S.
Miller Signed-off-by: Greg Kroah-Hartman Tested-by: Bagas Sanjaya Tested-by: Ron Economos Tested-by: Shuah Khan Tested-by: Sudip Mukherjee --- net/sched/cls_route.c | 10 ++++++++++ 1 file changed, 10 insertions(+) --- a/net/sched/cls_route.c +++ b/net/sched/cls_route.c @@ -424,6 +424,11 @@ static int route4_set_parms(struct net * return -EINVAL; } =20 + if (!nhandle) { + NL_SET_ERR_MSG(extack, "Replacing with handle of 0 is invalid"); + return -EINVAL; + } + h1 =3D to_hash(nhandle); b =3D rtnl_dereference(head->table[h1]); if (!b) { @@ -477,6 +482,11 @@ static int route4_change(struct net *net int err; bool new =3D true; =20 + if (!handle) { + NL_SET_ERR_MSG(extack, "Creating with handle of 0 is invalid"); + return -EINVAL; + } + if (opt =3D=3D NULL) return handle ? -EINVAL : 0; From nobody Fri Apr 10 10:43:51 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 92ADEC32771 for ; Fri, 19 Aug 2022 15:42:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1350082AbiHSPmk (ORCPT ); Fri, 19 Aug 2022 11:42:40 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36488 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1350001AbiHSPlw (ORCPT ); Fri, 19 Aug 2022 11:41:52 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 031DB102F35; Fri, 19 Aug 2022 08:41:13 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 46CD2B8277D; Fri, 19 Aug 2022 15:41:12 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 947BEC433D7; Fri, 
19 Aug 2022 15:41:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1660923670; bh=3ZsFsvZyMK+E7wOFl4R8OLSIHARPZ4AMFBzPb0nUZ6o=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=XsVtPRkEgKCFURn5YwJ95C7IwKKGu21+9CdrYWqKULAUef6Gv+VRArCyAfo4sWbny R0G/FripBkk/X6KWJTbUtTNqUsCyQqPNQqgIx1XqbGKZEc1LSO8Cz5WU9GfCWpihu5 taP0x/xYsFz+jthcCyMQvNqRddSF7duw81Y7aBBc= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Hyunchul Lee , Namjae Jeon , Steve French , zdi-disclosures@trendmicro.com Subject: [PATCH 5.15 04/14] ksmbd: prevent out of bound read for SMB2_WRITE Date: Fri, 19 Aug 2022 17:40:20 +0200 Message-Id: <20220819153711.816369367@linuxfoundation.org> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20220819153711.658766010@linuxfoundation.org> References: <20220819153711.658766010@linuxfoundation.org> User-Agent: quilt/0.67 MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" From: Hyunchul Lee commit ac60778b87e45576d7bfdbd6f53df902654e6f09 upstream. Memory read out of bounds can be written to a file if DataOffset is 0 and Length is too large in an SMB2_WRITE request of a compound request. To prevent this, when checking the length of the data area of SMB2_WRITE in smb2_get_data_area_len(), let the minimum of DataOffset be the size of SMB2 header + the size of SMB2_WRITE header. This bug can lead to an oops looking something like: [ 798.008715] BUG: KASAN: slab-out-of-bounds in copy_page_from_iter_atomic= +0xd3d/0x14b0 [ 798.008724] Read of size 252 at addr ffff88800f863e90 by task kworker/0:= 2/2859 ... [ 798.008754] Call Trace: [ 798.008756] [ 798.008759] dump_stack_lvl+0x49/0x5f [ 798.008764] print_report.cold+0x5e/0x5cf [ 798.008768] ? __filemap_get_folio+0x285/0x6d0 [ 798.008774] ?
copy_page_from_iter_atomic+0xd3d/0x14b0 [ 798.008777] kasan_report+0xaa/0x120 [ 798.008781] ? copy_page_from_iter_atomic+0xd3d/0x14b0 [ 798.008784] kasan_check_range+0x100/0x1e0 [ 798.008788] memcpy+0x24/0x60 [ 798.008792] copy_page_from_iter_atomic+0xd3d/0x14b0 [ 798.008795] ? pagecache_get_page+0x53/0x160 [ 798.008799] ? iov_iter_get_pages_alloc+0x1590/0x1590 [ 798.008803] ? ext4_write_begin+0xfc0/0xfc0 [ 798.008807] ? current_time+0x72/0x210 [ 798.008811] generic_perform_write+0x2c8/0x530 [ 798.008816] ? filemap_fdatawrite_wbc+0x180/0x180 [ 798.008820] ? down_write+0xb4/0x120 [ 798.008824] ? down_write_killable+0x130/0x130 [ 798.008829] ext4_buffered_write_iter+0x137/0x2c0 [ 798.008833] ext4_file_write_iter+0x40b/0x1490 [ 798.008837] ? __fsnotify_parent+0x275/0xb20 [ 798.008842] ? __fsnotify_update_child_dentry_flags+0x2c0/0x2c0 [ 798.008846] ? ext4_buffered_write_iter+0x2c0/0x2c0 [ 798.008851] __kernel_write+0x3a1/0xa70 [ 798.008855] ? __x64_sys_preadv2+0x160/0x160 [ 798.008860] ? security_file_permission+0x4a/0xa0 [ 798.008865] kernel_write+0xbb/0x360 [ 798.008869] ksmbd_vfs_write+0x27e/0xb90 [ksmbd] [ 798.008881] ? ksmbd_vfs_read+0x830/0x830 [ksmbd] [ 798.008892] ? _raw_read_unlock+0x2a/0x50 [ 798.008896] smb2_write+0xb45/0x14e0 [ksmbd] [ 798.008909] ? __kasan_check_write+0x14/0x20 [ 798.008912] ? _raw_spin_lock_bh+0xd0/0xe0 [ 798.008916] ? smb2_read+0x15e0/0x15e0 [ksmbd] [ 798.008927] ? memcpy+0x4e/0x60 [ 798.008931] ? _raw_spin_unlock+0x19/0x30 [ 798.008934] ? ksmbd_smb2_check_message+0x16af/0x2350 [ksmbd] [ 798.008946] ? _raw_spin_lock_bh+0xe0/0xe0 [ 798.008950] handle_ksmbd_work+0x30e/0x1020 [ksmbd] [ 798.008962] process_one_work+0x778/0x11c0 [ 798.008966] ? _raw_spin_lock_irq+0x8e/0xe0 [ 798.008970] worker_thread+0x544/0x1180 [ 798.008973] ? __cpuidle_text_end+0x4/0x4 [ 798.008977] kthread+0x282/0x320 [ 798.008982] ? process_one_work+0x11c0/0x11c0 [ 798.008985] ? 
kthread_complete_and_exit+0x30/0x30 [ 798.008989] ret_from_fork+0x1f/0x30 [ 798.008995] Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3") Cc: stable@vger.kernel.org Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-17817 Signed-off-by: Hyunchul Lee Acked-by: Namjae Jeon Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman Tested-by: Bagas Sanjaya Tested-by: Ron Economos Tested-by: Shuah Khan Tested-by: Sudip Mukherjee --- fs/ksmbd/smb2misc.c | 7 +++++-- fs/ksmbd/smb2pdu.c | 6 ++---- 2 files changed, 7 insertions(+), 6 deletions(-) --- a/fs/ksmbd/smb2misc.c +++ b/fs/ksmbd/smb2misc.c @@ -132,8 +132,11 @@ static int smb2_get_data_area_len(unsign *len =3D le16_to_cpu(((struct smb2_read_req *)hdr)->ReadChannelInfoLengt= h); break; case SMB2_WRITE: - if (((struct smb2_write_req *)hdr)->DataOffset) { - *off =3D le16_to_cpu(((struct smb2_write_req *)hdr)->DataOffset); + if (((struct smb2_write_req *)hdr)->DataOffset || + ((struct smb2_write_req *)hdr)->Length) { + *off =3D max_t(unsigned int, + le16_to_cpu(((struct smb2_write_req *)hdr)->DataOffset), + offsetof(struct smb2_write_req, Buffer) - 4); *len =3D le32_to_cpu(((struct smb2_write_req *)hdr)->Length); break; } --- a/fs/ksmbd/smb2pdu.c +++ b/fs/ksmbd/smb2pdu.c @@ -6471,10 +6471,8 @@ int smb2_write(struct ksmbd_work *work) (offsetof(struct smb2_write_req, Buffer) - 4)) { data_buf =3D (char *)&req->Buffer[0]; } else { - if ((u64)le16_to_cpu(req->DataOffset) + length > get_rfc1002_len(req)) { - pr_err("invalid write data offset %u, smb_len %u\n", - le16_to_cpu(req->DataOffset), - get_rfc1002_len(req)); + if (le16_to_cpu(req->DataOffset) < + offsetof(struct smb2_write_req, Buffer)) { err =3D -EINVAL; goto out; } From nobody Fri Apr 10 10:43:51 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1D1D4C32771 for ; 
Fri, 19 Aug 2022 15:42:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1350025AbiHSPmq (ORCPT ); Fri, 19 Aug 2022 11:42:46 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38262 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1350012AbiHSPmC (ORCPT ); Fri, 19 Aug 2022 11:42:02 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 647DA101D3D; Fri, 19 Aug 2022 08:41:15 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id E3E1661638; Fri, 19 Aug 2022 15:41:14 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id A5B34C433D6; Fri, 19 Aug 2022 15:41:13 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1660923674; bh=bEpqF+Ej5NYE78XIC1YKdvwrUh3MjVO9mrh/ZAlg750=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=g6d2bIrqJZW++YA7XXWHDqHBFQgejDQjdHX2aTjRoX8YGYDP5T6YrSHMEdAc/ui3N taFfKRgtTCAFYhjUHvNwVUNpI6/QpqGGAvovz0xO+TJ/lkN88lgp5kxJQwT8M2Ny2v RJwY7pbPlVEwVasN7JF3sX9TK8Ih2RMtG3XU82eg= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Hyunchul Lee , Namjae Jeon , Steve French , zdi-disclosures@trendmicro.com Subject: [PATCH 5.15 05/14] ksmbd: fix heap-based overflow in set_ntacl_dacl() Date: Fri, 19 Aug 2022 17:40:21 +0200 Message-Id: <20220819153711.847846093@linuxfoundation.org> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20220819153711.658766010@linuxfoundation.org> References: <20220819153711.658766010@linuxfoundation.org> User-Agent: quilt/0.67 MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org 
Content-Type: text/plain; charset="utf-8" From: Namjae Jeon commit 8f0541186e9ad1b62accc9519cc2b7a7240272a7 upstream. The test case uses the SMB2_SET_INFO_HE command to set a malformed file attribute under the label `security.NTACL`. The SMB2_QUERY_INFO_HE command in the test case then triggers the following overflow. [ 4712.003781] =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D [ 4712.003790] BUG: KASAN: slab-out-of-bounds in build_sec_desc+0x842/0x1dd= 0 [ksmbd] [ 4712.003807] Write of size 1060 at addr ffff88801e34c068 by task kworker/= 0:0/4190 [ 4712.003813] CPU: 0 PID: 4190 Comm: kworker/0:0 Not tainted 5.19.0-rc5 #1 [ 4712.003850] Workqueue: ksmbd-io handle_ksmbd_work [ksmbd] [ 4712.003867] Call Trace: [ 4712.003870] [ 4712.003873] dump_stack_lvl+0x49/0x5f [ 4712.003935] print_report.cold+0x5e/0x5cf [ 4712.003972] ? ksmbd_vfs_get_sd_xattr+0x16d/0x500 [ksmbd] [ 4712.003984] ? cmp_map_id+0x200/0x200 [ 4712.003988] ? build_sec_desc+0x842/0x1dd0 [ksmbd] [ 4712.004000] kasan_report+0xaa/0x120 [ 4712.004045] ? build_sec_desc+0x842/0x1dd0 [ksmbd] [ 4712.004056] kasan_check_range+0x100/0x1e0 [ 4712.004060] memcpy+0x3c/0x60 [ 4712.004064] build_sec_desc+0x842/0x1dd0 [ksmbd] [ 4712.004076] ? parse_sec_desc+0x580/0x580 [ksmbd] [ 4712.004088] ? ksmbd_acls_fattr+0x281/0x410 [ksmbd] [ 4712.004099] smb2_query_info+0xa8f/0x6110 [ksmbd] [ 4712.004111] ? psi_group_change+0x856/0xd70 [ 4712.004148] ? update_load_avg+0x1c3/0x1af0 [ 4712.004152] ? asym_cpu_capacity_scan+0x5d0/0x5d0 [ 4712.004157] ? xas_load+0x23/0x300 [ 4712.004162] ? smb2_query_dir+0x1530/0x1530 [ksmbd] [ 4712.004173] ? _raw_spin_lock_bh+0xe0/0xe0 [ 4712.004179] handle_ksmbd_work+0x30e/0x1020 [ksmbd] [ 4712.004192] process_one_work+0x778/0x11c0 [ 4712.004227] ? _raw_spin_lock_irq+0x8e/0xe0 [ 4712.004231] worker_thread+0x544/0x1180 [ 4712.004234] ?
__cpuidle_text_end+0x4/0x4 [ 4712.004239] kthread+0x282/0x320 [ 4712.004243] ? process_one_work+0x11c0/0x11c0 [ 4712.004246] ? kthread_complete_and_exit+0x30/0x30 [ 4712.004282] ret_from_fork+0x1f/0x30 This patch adds buffer validation for the security descriptor stored by a malformed SMB2_SET_INFO_HE command, and allocates a large response buffer for the SMB2_O_INFO_SECURITY file info class. Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3") Cc: stable@vger.kernel.org Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-17771 Reviewed-by: Hyunchul Lee Signed-off-by: Namjae Jeon Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman Tested-by: Bagas Sanjaya Tested-by: Ron Economos Tested-by: Shuah Khan Tested-by: Sudip Mukherjee --- fs/ksmbd/smb2pdu.c | 39 ++++++++++----- fs/ksmbd/smbacl.c | 130 +++++++++++++++++++++++++++++++++++-------------= ----- fs/ksmbd/smbacl.h | 2=20 fs/ksmbd/vfs.c | 5 ++ 4 files changed, 119 insertions(+), 57 deletions(-) --- a/fs/ksmbd/smb2pdu.c +++ b/fs/ksmbd/smb2pdu.c @@ -541,9 +541,10 @@ int smb2_allocate_rsp_buf(struct ksmbd_w struct smb2_query_info_req *req; =20 req =3D work->request_buf; - if (req->InfoType =3D=3D SMB2_O_INFO_FILE && - (req->FileInfoClass =3D=3D FILE_FULL_EA_INFORMATION || - req->FileInfoClass =3D=3D FILE_ALL_INFORMATION)) + if ((req->InfoType =3D=3D SMB2_O_INFO_FILE && + (req->FileInfoClass =3D=3D FILE_FULL_EA_INFORMATION || + req->FileInfoClass =3D=3D FILE_ALL_INFORMATION)) || + req->InfoType =3D=3D SMB2_O_INFO_SECURITY) sz =3D large_sz; } =20 @@ -2981,7 +2982,7 @@ int smb2_open(struct ksmbd_work *work) goto err_out; =20 rc =3D build_sec_desc(user_ns, - pntsd, NULL, + pntsd, NULL, 0, OWNER_SECINFO | GROUP_SECINFO | DACL_SECINFO, @@ -3824,6 +3825,15 @@ static int verify_info_level(int info_le return 0; } =20 +static int smb2_resp_buf_len(struct ksmbd_work *work, unsigned short hdr2_= len) +{ + int free_len; + + free_len =3D (int)(work->response_sz - + (get_rfc1002_len(work->response_buf) +
4)) - hdr2_len; + return free_len; +} + static int smb2_calc_max_out_buf_len(struct ksmbd_work *work, unsigned short hdr2_len, unsigned int out_buf_len) @@ -3833,9 +3843,7 @@ static int smb2_calc_max_out_buf_len(str if (out_buf_len > work->conn->vals->max_trans_size) return -EINVAL; =20 - free_len =3D (int)(work->response_sz - - (get_rfc1002_len(work->response_buf) + 4)) - - hdr2_len; + free_len =3D smb2_resp_buf_len(work, hdr2_len); if (free_len < 0) return -EINVAL; =20 @@ -5087,10 +5095,10 @@ static int smb2_get_info_sec(struct ksmb struct smb_ntsd *pntsd =3D (struct smb_ntsd *)rsp->Buffer, *ppntsd =3D NU= LL; struct smb_fattr fattr =3D {{0}}; struct inode *inode; - __u32 secdesclen; + __u32 secdesclen =3D 0; unsigned int id =3D KSMBD_NO_FID, pid =3D KSMBD_NO_FID; int addition_info =3D le32_to_cpu(req->AdditionalInformation); - int rc; + int rc =3D 0, ppntsd_size =3D 0; =20 if (addition_info & ~(OWNER_SECINFO | GROUP_SECINFO | DACL_SECINFO | PROTECTED_DACL_SECINFO | @@ -5136,11 +5144,14 @@ static int smb2_get_info_sec(struct ksmb =20 if (test_share_config_flag(work->tcon->share_conf, KSMBD_SHARE_FLAG_ACL_XATTR)) - ksmbd_vfs_get_sd_xattr(work->conn, user_ns, - fp->filp->f_path.dentry, &ppntsd); - - rc =3D build_sec_desc(user_ns, pntsd, ppntsd, addition_info, - &secdesclen, &fattr); + ppntsd_size =3D ksmbd_vfs_get_sd_xattr(work->conn, user_ns, + fp->filp->f_path.dentry, + &ppntsd); + + /* Check if sd buffer size exceeds response buffer size */ + if (smb2_resp_buf_len(work, 8) > ppntsd_size) + rc =3D build_sec_desc(user_ns, pntsd, ppntsd, ppntsd_size, + addition_info, &secdesclen, &fattr); posix_acl_release(fattr.cf_acls); posix_acl_release(fattr.cf_dacls); kfree(ppntsd); --- a/fs/ksmbd/smbacl.c +++ b/fs/ksmbd/smbacl.c @@ -690,6 +690,7 @@ posix_default_acl: static void set_ntacl_dacl(struct user_namespace *user_ns, struct smb_acl *pndacl, struct smb_acl *nt_dacl, + unsigned int aces_size, const struct smb_sid *pownersid, const struct smb_sid *pgrpsid, struct 
smb_fattr *fattr) @@ -703,9 +704,19 @@ static void set_ntacl_dacl(struct user_n if (nt_num_aces) { ntace =3D (struct smb_ace *)((char *)nt_dacl + sizeof(struct smb_acl)); for (i =3D 0; i < nt_num_aces; i++) { - memcpy((char *)pndace + size, ntace, le16_to_cpu(ntace->size)); - size +=3D le16_to_cpu(ntace->size); - ntace =3D (struct smb_ace *)((char *)ntace + le16_to_cpu(ntace->size)); + unsigned short nt_ace_size; + + if (offsetof(struct smb_ace, access_req) > aces_size) + break; + + nt_ace_size =3D le16_to_cpu(ntace->size); + if (nt_ace_size > aces_size) + break; + + memcpy((char *)pndace + size, ntace, nt_ace_size); + size +=3D nt_ace_size; + aces_size -=3D nt_ace_size; + ntace =3D (struct smb_ace *)((char *)ntace + nt_ace_size); num_aces++; } } @@ -878,7 +889,7 @@ int parse_sec_desc(struct user_namespace /* Convert permission bits from mode to equivalent CIFS ACL */ int build_sec_desc(struct user_namespace *user_ns, struct smb_ntsd *pntsd, struct smb_ntsd *ppntsd, - int addition_info, __u32 *secdesclen, + int ppntsd_size, int addition_info, __u32 *secdesclen, struct smb_fattr *fattr) { int rc =3D 0; @@ -938,15 +949,25 @@ int build_sec_desc(struct user_namespace =20 if (!ppntsd) { set_mode_dacl(user_ns, dacl_ptr, fattr); - } else if (!ppntsd->dacloffset) { - goto out; } else { struct smb_acl *ppdacl_ptr; + unsigned int dacl_offset =3D le32_to_cpu(ppntsd->dacloffset); + int ppdacl_size, ntacl_size =3D ppntsd_size - dacl_offset; + + if (!dacl_offset || + (dacl_offset + sizeof(struct smb_acl) > ppntsd_size)) + goto out; + + ppdacl_ptr =3D (struct smb_acl *)((char *)ppntsd + dacl_offset); + ppdacl_size =3D le16_to_cpu(ppdacl_ptr->size); + if (ppdacl_size > ntacl_size || + ppdacl_size < sizeof(struct smb_acl)) + goto out; =20 - ppdacl_ptr =3D (struct smb_acl *)((char *)ppntsd + - le32_to_cpu(ppntsd->dacloffset)); set_ntacl_dacl(user_ns, dacl_ptr, ppdacl_ptr, - nowner_sid_ptr, ngroup_sid_ptr, fattr); + ntacl_size - sizeof(struct smb_acl), + nowner_sid_ptr, 
ngroup_sid_ptr, + fattr); } pntsd->dacloffset =3D cpu_to_le32(offset); offset +=3D le16_to_cpu(dacl_ptr->size); @@ -980,24 +1001,31 @@ int smb_inherit_dacl(struct ksmbd_conn * struct smb_sid owner_sid, group_sid; struct dentry *parent =3D path->dentry->d_parent; struct user_namespace *user_ns =3D mnt_user_ns(path->mnt); - int inherited_flags =3D 0, flags =3D 0, i, ace_cnt =3D 0, nt_size =3D 0; - int rc =3D 0, num_aces, dacloffset, pntsd_type, acl_len; + int inherited_flags =3D 0, flags =3D 0, i, ace_cnt =3D 0, nt_size =3D 0, = pdacl_size; + int rc =3D 0, num_aces, dacloffset, pntsd_type, pntsd_size, acl_len, aces= _size; char *aces_base; bool is_dir =3D S_ISDIR(d_inode(path->dentry)->i_mode); =20 - acl_len =3D ksmbd_vfs_get_sd_xattr(conn, user_ns, - parent, &parent_pntsd); - if (acl_len <=3D 0) + pntsd_size =3D ksmbd_vfs_get_sd_xattr(conn, user_ns, + parent, &parent_pntsd); + if (pntsd_size <=3D 0) return -ENOENT; dacloffset =3D le32_to_cpu(parent_pntsd->dacloffset); - if (!dacloffset) { + if (!dacloffset || (dacloffset + sizeof(struct smb_acl) > pntsd_size)) { rc =3D -EINVAL; goto free_parent_pntsd; } =20 parent_pdacl =3D (struct smb_acl *)((char *)parent_pntsd + dacloffset); + acl_len =3D pntsd_size - dacloffset; num_aces =3D le32_to_cpu(parent_pdacl->num_aces); pntsd_type =3D le16_to_cpu(parent_pntsd->type); + pdacl_size =3D le16_to_cpu(parent_pdacl->size); + + if (pdacl_size > acl_len || pdacl_size < sizeof(struct smb_acl)) { + rc =3D -EINVAL; + goto free_parent_pntsd; + } =20 aces_base =3D kmalloc(sizeof(struct smb_ace) * num_aces * 2, GFP_KERNEL); if (!aces_base) { @@ -1008,11 +1036,23 @@ int smb_inherit_dacl(struct ksmbd_conn * aces =3D (struct smb_ace *)aces_base; parent_aces =3D (struct smb_ace *)((char *)parent_pdacl + sizeof(struct smb_acl)); + aces_size =3D acl_len - sizeof(struct smb_acl); =20 if (pntsd_type & DACL_AUTO_INHERITED) inherited_flags =3D INHERITED_ACE; =20 for (i =3D 0; i < num_aces; i++) { + int pace_size; + + if (offsetof(struct smb_ace, 
access_req) > aces_size) + break; + + pace_size =3D le16_to_cpu(parent_aces->size); + if (pace_size > aces_size) + break; + + aces_size -=3D pace_size; + flags =3D parent_aces->flags; if (!smb_inherit_flags(flags, is_dir)) goto pass; @@ -1057,8 +1097,7 @@ int smb_inherit_dacl(struct ksmbd_conn * aces =3D (struct smb_ace *)((char *)aces + le16_to_cpu(aces->size)); ace_cnt++; pass: - parent_aces =3D - (struct smb_ace *)((char *)parent_aces + le16_to_cpu(parent_aces->size)= ); + parent_aces =3D (struct smb_ace *)((char *)parent_aces + pace_size); } =20 if (nt_size > 0) { @@ -1153,7 +1192,7 @@ int smb_check_perm_dacl(struct ksmbd_con struct smb_ntsd *pntsd =3D NULL; struct smb_acl *pdacl; struct posix_acl *posix_acls; - int rc =3D 0, acl_size; + int rc =3D 0, pntsd_size, acl_size, aces_size, pdacl_size, dacl_offset; struct smb_sid sid; int granted =3D le32_to_cpu(*pdaccess & ~FILE_MAXIMAL_ACCESS_LE); struct smb_ace *ace; @@ -1162,37 +1201,33 @@ int smb_check_perm_dacl(struct ksmbd_con struct smb_ace *others_ace =3D NULL; struct posix_acl_entry *pa_entry; unsigned int sid_type =3D SIDOWNER; - char *end_of_acl; + unsigned short ace_size; =20 ksmbd_debug(SMB, "check permission using windows acl\n"); - acl_size =3D ksmbd_vfs_get_sd_xattr(conn, user_ns, - path->dentry, &pntsd); - if (acl_size <=3D 0 || !pntsd || !pntsd->dacloffset) { - kfree(pntsd); - return 0; - } + pntsd_size =3D ksmbd_vfs_get_sd_xattr(conn, user_ns, + path->dentry, &pntsd); + if (pntsd_size <=3D 0 || !pntsd) + goto err_out; + + dacl_offset =3D le32_to_cpu(pntsd->dacloffset); + if (!dacl_offset || + (dacl_offset + sizeof(struct smb_acl) > pntsd_size)) + goto err_out; =20 pdacl =3D (struct smb_acl *)((char *)pntsd + le32_to_cpu(pntsd->dacloffse= t)); - end_of_acl =3D ((char *)pntsd) + acl_size; - if (end_of_acl <=3D (char *)pdacl) { - kfree(pntsd); - return 0; - } + acl_size =3D pntsd_size - dacl_offset; + pdacl_size =3D le16_to_cpu(pdacl->size); =20 - if (end_of_acl < (char *)pdacl + 
le16_to_cpu(pdacl->size) || - le16_to_cpu(pdacl->size) < sizeof(struct smb_acl)) { - kfree(pntsd); - return 0; - } + if (pdacl_size > acl_size || pdacl_size < sizeof(struct smb_acl)) + goto err_out; =20 if (!pdacl->num_aces) { - if (!(le16_to_cpu(pdacl->size) - sizeof(struct smb_acl)) && + if (!(pdacl_size - sizeof(struct smb_acl)) && *pdaccess & ~(FILE_READ_CONTROL_LE | FILE_WRITE_DAC_LE)) { rc =3D -EACCES; goto err_out; } - kfree(pntsd); - return 0; + goto err_out; } =20 if (*pdaccess & FILE_MAXIMAL_ACCESS_LE) { @@ -1200,11 +1235,16 @@ int smb_check_perm_dacl(struct ksmbd_con DELETE; =20 ace =3D (struct smb_ace *)((char *)pdacl + sizeof(struct smb_acl)); + aces_size =3D acl_size - sizeof(struct smb_acl); for (i =3D 0; i < le32_to_cpu(pdacl->num_aces); i++) { + if (offsetof(struct smb_ace, access_req) > aces_size) + break; + ace_size =3D le16_to_cpu(ace->size); + if (ace_size > aces_size) + break; + aces_size -=3D ace_size; granted |=3D le32_to_cpu(ace->access_req); ace =3D (struct smb_ace *)((char *)ace + le16_to_cpu(ace->size)); - if (end_of_acl < (char *)ace) - goto err_out; } =20 if (!pdacl->num_aces) @@ -1216,7 +1256,15 @@ int smb_check_perm_dacl(struct ksmbd_con id_to_sid(uid, sid_type, &sid); =20 ace =3D (struct smb_ace *)((char *)pdacl + sizeof(struct smb_acl)); + aces_size =3D acl_size - sizeof(struct smb_acl); for (i =3D 0; i < le32_to_cpu(pdacl->num_aces); i++) { + if (offsetof(struct smb_ace, access_req) > aces_size) + break; + ace_size =3D le16_to_cpu(ace->size); + if (ace_size > aces_size) + break; + aces_size -=3D ace_size; + if (!compare_sids(&sid, &ace->sid) || !compare_sids(&sid_unix_NFS_mode, &ace->sid)) { found =3D 1; @@ -1226,8 +1274,6 @@ int smb_check_perm_dacl(struct ksmbd_con others_ace =3D ace; =20 ace =3D (struct smb_ace *)((char *)ace + le16_to_cpu(ace->size)); - if (end_of_acl < (char *)ace) - goto err_out; } =20 if (*pdaccess & FILE_MAXIMAL_ACCESS_LE && found) { --- a/fs/ksmbd/smbacl.h +++ b/fs/ksmbd/smbacl.h @@ -193,7 +193,7 @@ struct 
posix_acl_state { int parse_sec_desc(struct user_namespace *user_ns, struct smb_ntsd *pntsd, int acl_len, struct smb_fattr *fattr); int build_sec_desc(struct user_namespace *user_ns, struct smb_ntsd *pntsd, - struct smb_ntsd *ppntsd, int addition_info, + struct smb_ntsd *ppntsd, int ppntsd_size, int addition_info, __u32 *secdesclen, struct smb_fattr *fattr); int init_acl_state(struct posix_acl_state *state, int cnt); void free_acl_state(struct posix_acl_state *state); --- a/fs/ksmbd/vfs.c +++ b/fs/ksmbd/vfs.c @@ -1543,6 +1543,11 @@ int ksmbd_vfs_get_sd_xattr(struct ksmbd_ } =20 *pntsd =3D acl.sd_buf; + if (acl.sd_size < sizeof(struct smb_ntsd)) { + pr_err("sd size is invalid\n"); + goto out_free; + } + (*pntsd)->osidoffset =3D cpu_to_le32(le32_to_cpu((*pntsd)->osidoffset) - NDR_NTSD_OFFSETOF); (*pntsd)->gsidoffset =3D cpu_to_le32(le32_to_cpu((*pntsd)->gsidoffset) - From nobody Fri Apr 10 10:43:51 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 488D4C32771 for ; Fri, 19 Aug 2022 15:42:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1350029AbiHSPmu (ORCPT ); Fri, 19 Aug 2022 11:42:50 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38454 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1350022AbiHSPmI (ORCPT ); Fri, 19 Aug 2022 11:42:08 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F26EC103604; Fri, 19 Aug 2022 08:41:19 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 7605FB82813; Fri, 19 Aug 2022 
15:41:18 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id D4A5BC433D6; Fri, 19 Aug 2022 15:41:16 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1660923677; bh=7UDH/fqBWEzSvu/YCltMHj3q068IxXIPypo+K7qoAuY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=zpt8FaVa2FIjC/D2RTNf6PrGkcoYzOzVREvEFlT/UrTeqPjx6jwyRQhiCsLVJaYYv hsqdUslWTM4oMumezIGT2bFaR2elfI8VlejrsB5aeIWMm/vpP0zChOdX6cvq9BpKnK d8G5JYGFs8uCAqoPWa+iMuacuB3RRLm3FoKnMW/A= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org, stable@vger.kernel.org Cc: Greg Kroah-Hartman , Thadeu Lima de Souza Cascardo Subject: [PATCH 5.15 06/14] Revert "x86/ftrace: Use alternative RET encoding" Date: Fri, 19 Aug 2022 17:40:22 +0200 Message-Id: <20220819153711.879328558@linuxfoundation.org> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20220819153711.658766010@linuxfoundation.org> References: <20220819153711.658766010@linuxfoundation.org> User-Agent: quilt/0.67 MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" From: Thadeu Lima de Souza Cascardo This reverts commit e54fcb0812faebd147de72bd37ad87cc4951c68c. This temporarily reverts the backport of upstream commit 1f001e9da6bbf482311e45e48f53c2bd2179e59c. It was not correct to copy the ftrace stub as it would contain a relative jump to the return thunk which would not apply to the context where it was being copied to, leading to ftrace support to be broken. 
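The failure mode can be illustrated with a small user-space sketch (not
kernel code; the helper names and the addresses are made up for
illustration). A "jmp rel32" stores its target as a displacement from the
end of the instruction, so the same five bytes land somewhere else once
they are copied to a different address:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define JMP32_OPCODE 0xE9	/* x86 "jmp rel32" */
#define JMP32_SIZE   5		/* opcode byte + 32-bit displacement */

/* Encode a "jmp rel32" that sits at address ip and lands on target. */
static void encode_jmp32(uint8_t insn[JMP32_SIZE], uint64_t ip, uint64_t target)
{
	/* The displacement is relative to the end of the instruction. */
	int64_t diff = (int64_t)target - (int64_t)(ip + JMP32_SIZE);
	int32_t rel = (int32_t)diff;

	insn[0] = JMP32_OPCODE;
	/* Host byte order; matches x86 on a little-endian host. */
	memcpy(&insn[1], &rel, sizeof(rel));
}

/* Where do these bytes jump to if they are executed at address ip? */
static uint64_t jmp32_destination(const uint8_t insn[JMP32_SIZE], uint64_t ip)
{
	int32_t rel;

	assert(insn[0] == JMP32_OPCODE);
	memcpy(&rel, &insn[1], sizeof(rel));
	return ip + JMP32_SIZE + (int64_t)rel;
}
```

Encoding the jump for address 0x1000 and then evaluating the copied bytes
at 0x2000 gives a destination that is off by exactly the distance of the
copy, which is why the stub cannot simply be copied into a trampoline.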
Signed-off-by: Thadeu Lima de Souza Cascardo Signed-off-by: Greg Kroah-Hartman Tested-by: Bagas Sanjaya Tested-by: Ron Economos Tested-by: Shuah Khan Tested-by: Sudip Mukherjee --- arch/x86/kernel/ftrace.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-) --- a/arch/x86/kernel/ftrace.c +++ b/arch/x86/kernel/ftrace.c @@ -309,7 +309,7 @@ union ftrace_op_code_union { } __attribute__((packed)); }; =20 -#define RET_SIZE (IS_ENABLED(CONFIG_RETPOLINE) ? 5 : 1 + IS_ENABLED(CONFI= G_SLS)) +#define RET_SIZE 1 + IS_ENABLED(CONFIG_SLS) =20 static unsigned long create_trampoline(struct ftrace_ops *ops, unsigned int *tramp_size) @@ -368,10 +368,7 @@ create_trampoline(struct ftrace_ops *ops =20 /* The trampoline ends with ret(q) */ retq =3D (unsigned long)ftrace_stub; - if (cpu_feature_enabled(X86_FEATURE_RETHUNK)) - memcpy(ip, text_gen_insn(JMP32_INSN_OPCODE, ip, &__x86_return_thunk), JM= P32_INSN_SIZE); - else - ret =3D copy_from_kernel_nofault(ip, (void *)retq, RET_SIZE); + ret =3D copy_from_kernel_nofault(ip, (void *)retq, RET_SIZE); if (WARN_ON(ret < 0)) goto fail; From nobody Fri Apr 10 10:43:51 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id AE399C32771 for ; Fri, 19 Aug 2022 15:42:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1349934AbiHSPmy (ORCPT ); Fri, 19 Aug 2022 11:42:54 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38514 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1350024AbiHSPmK (ORCPT ); Fri, 19 Aug 2022 11:42:10 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2B733EA316; Fri, 19 Aug 2022 08:41:22 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org 
[52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id A6F58B8280C; Fri, 19 Aug 2022 15:41:21 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 065AEC433D6; Fri, 19 Aug 2022 15:41:19 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1660923680; bh=ba+AyLMN6YwEW7LiiCSOx/+EWjjh2P9tOMHMza/BVyY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=YkwxM1iLIAnvXOh29PLrWMuAJ9W0D0JEvHysQfNOXsbmd7FxDgSUSQoDHI+TXVdlV AHCbasauBbUpE7LMvraTX3F3V5jPPR8CjBSoE2d6g2shbHreZjbtsaRpHhv/Brj1VF obVl/Pd5Fz0Zsn3WWX6dfYykRuEF3X1tk7enTWy8= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org, stable@vger.kernel.org Cc: Greg Kroah-Hartman , "Peter Zijlstra (Intel)" , Josh Poimboeuf , Thadeu Lima de Souza Cascardo Subject: [PATCH 5.15 07/14] x86/ibt,ftrace: Make function-graph play nice Date: Fri, 19 Aug 2022 17:40:23 +0200 Message-Id: <20220819153711.919713708@linuxfoundation.org> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20220819153711.658766010@linuxfoundation.org> References: <20220819153711.658766010@linuxfoundation.org> User-Agent: quilt/0.67 MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" From: Peter Zijlstra commit e52fc2cf3f662828cc0d51c4b73bed73ad275fce upstream. Return trampoline must not use indirect branch to return; while this preserves the RSB, it is fundamentally incompatible with IBT. Instead use a retpoline like ROP gadget that defeats IBT while not unbalancing the RSB. And since ftrace_stub is no longer a plain RET, don't use it to copy from. Since RET is a trivial instruction, poke it directly. 
Signed-off-by: Peter Zijlstra (Intel) Acked-by: Josh Poimboeuf Link: https://lore.kernel.org/r/20220308154318.347296408@infradead.org [cascardo: remove ENDBR] Signed-off-by: Thadeu Lima de Souza Cascardo Signed-off-by: Greg Kroah-Hartman Tested-by: Bagas Sanjaya Tested-by: Ron Economos Tested-by: Shuah Khan Tested-by: Sudip Mukherjee --- arch/x86/kernel/ftrace.c | 9 ++------- arch/x86/kernel/ftrace_64.S | 19 +++++++++++++++---- 2 files changed, 17 insertions(+), 11 deletions(-) --- a/arch/x86/kernel/ftrace.c +++ b/arch/x86/kernel/ftrace.c @@ -322,12 +322,12 @@ create_trampoline(struct ftrace_ops *ops unsigned long offset; unsigned long npages; unsigned long size; - unsigned long retq; unsigned long *ptr; void *trampoline; void *ip; /* 48 8b 15 is movq (%rip), %rdx */ unsigned const char op_ref[] =3D { 0x48, 0x8b, 0x15 }; + unsigned const char retq[] =3D { RET_INSN_OPCODE, INT3_INSN_OPCODE }; union ftrace_op_code_union op_ptr; int ret; =20 @@ -365,12 +365,7 @@ create_trampoline(struct ftrace_ops *ops goto fail; =20 ip =3D trampoline + size; - - /* The trampoline ends with ret(q) */ - retq =3D (unsigned long)ftrace_stub; - ret =3D copy_from_kernel_nofault(ip, (void *)retq, RET_SIZE); - if (WARN_ON(ret < 0)) - goto fail; + memcpy(ip, retq, RET_SIZE); =20 /* No need to test direct calls on created trampolines */ if (ops->flags & FTRACE_OPS_FL_SAVE_REGS) { --- a/arch/x86/kernel/ftrace_64.S +++ b/arch/x86/kernel/ftrace_64.S @@ -181,7 +181,6 @@ SYM_INNER_LABEL(ftrace_graph_call, SYM_L =20 /* * This is weak to keep gas from relaxing the jumps. - * It is also used to copy the RET for trampolines. 
*/ SYM_INNER_LABEL_ALIGN(ftrace_stub, SYM_L_WEAK) UNWIND_HINT_FUNC @@ -335,7 +334,7 @@ SYM_FUNC_START(ftrace_graph_caller) SYM_FUNC_END(ftrace_graph_caller) =20 SYM_FUNC_START(return_to_handler) - subq $24, %rsp + subq $16, %rsp =20 /* Save the return values */ movq %rax, (%rsp) @@ -347,7 +346,19 @@ SYM_FUNC_START(return_to_handler) movq %rax, %rdi movq 8(%rsp), %rdx movq (%rsp), %rax - addq $24, %rsp - JMP_NOSPEC rdi + + addq $16, %rsp + /* + * Jump back to the old return address. This cannot be JMP_NOSPEC rdi + * since IBT would demand that contain ENDBR, which simply isn't so for + * return addresses. Use a retpoline here to keep the RSB balanced. + */ + ANNOTATE_INTRA_FUNCTION_CALL + call .Ldo_rop + int3 +.Ldo_rop: + mov %rdi, (%rsp) + UNWIND_HINT_FUNC + RET SYM_FUNC_END(return_to_handler) #endif From nobody Fri Apr 10 10:43:51 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3F3DEC32771 for ; Fri, 19 Aug 2022 15:43:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1349979AbiHSPm6 (ORCPT ); Fri, 19 Aug 2022 11:42:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38596 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1349951AbiHSPmN (ORCPT ); Fri, 19 Aug 2022 11:42:13 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D4F7D101C76; Fri, 19 Aug 2022 08:41:26 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id AC8E9B8277D; Fri, 19 Aug 2022 15:41:24 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA 
id 01B22C433C1; Fri, 19 Aug 2022 15:41:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1660923683; bh=diXNHA+nES9RnYc+jkOrZvmAJAKsH+GdidFUo88bBoU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=iGg3XuMuVjO2/gtfwRfhDgRG6wnFcnvb0gFVkLXpaCs2xazYhYQXrqdC+MkOQCuh3 iO+xrqwk1RdUhKtpx5i3OlvGOJLQ5kTPms+3Mie+vQn5m9fm+7ErsXAA8JfnnrfKNw XKZeJgwLq40Fg+7ijf3nEl5YRkimcHP3lyUtNKSc= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org, stable@vger.kernel.org Cc: Greg Kroah-Hartman , "Peter Zijlstra (Intel)" , Borislav Petkov , Josh Poimboeuf , Thadeu Lima de Souza Cascardo Subject: [PATCH 5.15 08/14] x86/ftrace: Use alternative RET encoding Date: Fri, 19 Aug 2022 17:40:24 +0200 Message-Id: <20220819153711.948887870@linuxfoundation.org> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20220819153711.658766010@linuxfoundation.org> References: <20220819153711.658766010@linuxfoundation.org> User-Agent: quilt/0.67 MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" From: Peter Zijlstra commit 1f001e9da6bbf482311e45e48f53c2bd2179e59c upstream. Use the return thunk in ftrace trampolines, if needed. Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Borislav Petkov Reviewed-by: Josh Poimboeuf Signed-off-by: Borislav Petkov [cascardo: use memcpy(text_gen_insn) as there is no __text_gen_insn] Signed-off-by: Thadeu Lima de Souza Cascardo Signed-off-by: Greg Kroah-Hartman Tested-by: Bagas Sanjaya Tested-by: Ron Economos Tested-by: Shuah Khan Tested-by: Sudip Mukherjee --- arch/x86/kernel/ftrace.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) --- a/arch/x86/kernel/ftrace.c +++ b/arch/x86/kernel/ftrace.c @@ -309,7 +309,7 @@ union ftrace_op_code_union { } __attribute__((packed)); }; =20 -#define RET_SIZE 1 + IS_ENABLED(CONFIG_SLS) +#define RET_SIZE (IS_ENABLED(CONFIG_RETPOLINE) ? 
5 : 1 + IS_ENABLED(CONFI= G_SLS)) =20 static unsigned long create_trampoline(struct ftrace_ops *ops, unsigned int *tramp_size) @@ -365,7 +365,12 @@ create_trampoline(struct ftrace_ops *ops goto fail; =20 ip =3D trampoline + size; - memcpy(ip, retq, RET_SIZE); + + /* The trampoline ends with ret(q) */ + if (cpu_feature_enabled(X86_FEATURE_RETHUNK)) + memcpy(ip, text_gen_insn(JMP32_INSN_OPCODE, ip, &__x86_return_thunk), JM= P32_INSN_SIZE); + else + memcpy(ip, retq, sizeof(retq)); =20 /* No need to test direct calls on created trampolines */ if (ops->flags & FTRACE_OPS_FL_SAVE_REGS) { From nobody Fri Apr 10 10:43:51 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BB16FC32773 for ; Fri, 19 Aug 2022 15:43:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1350070AbiHSPnD (ORCPT ); Fri, 19 Aug 2022 11:43:03 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38106 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1350041AbiHSPmO (ORCPT ); Fri, 19 Aug 2022 11:42:14 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A6F05102F3F; Fri, 19 Aug 2022 08:41:28 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 1F0D761630; Fri, 19 Aug 2022 15:41:27 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 10E18C433C1; Fri, 19 Aug 2022 15:41:25 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1660923686; bh=GmyLQo4ZOTkk3TH6XnGNNUHjOXkNC3XIh8LyzUnIjxs=; 
        h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
        b=zenOlyg5kHqNW0WWfhpAKzcoLbjnVIkhl211BAIN/aghyn/r/mYmF6EB/po1e8HUh
         y4Fjt7rqpjYiuCzrOjZfuzjhzAmZ5G9ilCWw0OwhrZ4YvLcOojcEsKHtOPg/ZSZLiw
         EWPaFPW7PwZej86FwuLpG1MamvBu87GErj6cstxY=
From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, David Sterba, Qu Wenruo
Subject: [PATCH 5.15 09/14] btrfs: only write the sectors in the vertical stripe which has data stripes
Date: Fri, 19 Aug 2022 17:40:25 +0200
Message-Id: <20220819153711.977759356@linuxfoundation.org>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <20220819153711.658766010@linuxfoundation.org>
References: <20220819153711.658766010@linuxfoundation.org>
User-Agent: quilt/0.67
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

From: Qu Wenruo

commit bd8f7e627703ca5707833d623efcd43f104c7b3f upstream.

If we have only an 8K partial write at the beginning of a full RAID56
stripe, we will write the following contents:

                    0  8K           32K             64K
  Disk 1 (data):    |XX|            |               |
  Disk 2 (data):    |               |               |
  Disk 3 (parity):  |XXXXXXXXXXXXXXX|XXXXXXXXXXXXXXX|

  |X| means the sector will be written back to disk.

Note that although we won't write any sectors from disk 2, we will still
write the full 64KiB of parity to disk.

This behavior is fine for now, but not for the future (especially for
RAID56J, as we waste quite some space to journal the unused parity
stripes).

So here we also utilize btrfs_raid_bio::dbitmap: any time we queue a
higher level bio into an rbio, we update rbio::dbitmap to indicate which
vertical stripes need to be written back.

And at finish_rmw(), we also check dbitmap to see if we need to write
any sector in the vertical stripe.
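
As a rough user-space sketch of that bookkeeping (not the kernel code;
the 4K sector size, the 32-bit bitmap and the helper names are
assumptions made for illustration), an 8K write at the start of the full
stripe marks only the first two vertical stripes:

```c
#include <assert.h>
#include <stdint.h>

#define SECTORSIZE_BITS 12u				/* assumed 4K sectors */
#define SECTORSIZE	(1u << SECTORSIZE_BITS)
#define STRIPE_LEN	(64u * 1024u)			/* 64K per device */
#define STRIPE_NSECTORS	(STRIPE_LEN / SECTORSIZE)	/* 16 vertical stripes */

/*
 * Return a bitmap with one bit per vertical stripe touched by a write of
 * len bytes that starts offset bytes into the data area of the full
 * stripe.  The offset is assumed to be sector aligned.
 */
static uint32_t dbitmap_for_write(uint64_t offset, uint32_t len)
{
	uint32_t bitmap = 0;
	uint64_t cur;

	assert(offset % SECTORSIZE == 0);

	for (cur = offset; cur < offset + len; cur += SECTORSIZE) {
		/* Sectors at the same offset on every device share one bit. */
		unsigned int bit = (unsigned int)(cur >> SECTORSIZE_BITS) %
				   STRIPE_NSECTORS;

		bitmap |= 1u << bit;
	}
	return bitmap;
}

/* Does vertical stripe nr contain any data that has to be written? */
static int needs_writeback(uint32_t bitmap, unsigned int nr)
{
	return (bitmap >> nr) & 1u;
}
```

With these assumptions dbitmap_for_write(0, 8192) is 0x3, so a writeback
loop that skips every column for which needs_writeback() is 0 only
touches the first two sectors of each device, parity included.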
So after the patch, above example will only lead to the following writeback pattern: 0 8K 32K 64K Disk 1 (data): |XX| | | Disk 2 (data): | | | Disk 3 (parity): |XX| | | Acked-by: David Sterba Signed-off-by: Qu Wenruo Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman Tested-by: Bagas Sanjaya Tested-by: Ron Economos Tested-by: Shuah Khan Tested-by: Sudip Mukherjee --- fs/btrfs/raid56.c | 55 +++++++++++++++++++++++++++++++++++++++++++++++++= +---- 1 file changed, 51 insertions(+), 4 deletions(-) --- a/fs/btrfs/raid56.c +++ b/fs/btrfs/raid56.c @@ -324,6 +324,9 @@ static void merge_rbio(struct btrfs_raid { bio_list_merge(&dest->bio_list, &victim->bio_list); dest->bio_list_bytes +=3D victim->bio_list_bytes; + /* Also inherit the bitmaps from @victim. */ + bitmap_or(dest->dbitmap, victim->dbitmap, dest->dbitmap, + dest->stripe_npages); dest->generic_bio_cnt +=3D victim->generic_bio_cnt; bio_list_init(&victim->bio_list); } @@ -865,6 +868,12 @@ static void rbio_orig_end_io(struct btrf =20 if (rbio->generic_bio_cnt) btrfs_bio_counter_sub(rbio->fs_info, rbio->generic_bio_cnt); + /* + * Clear the data bitmap, as the rbio may be cached for later usage. + * do this before before unlock_stripe() so there will be no new bio + * for this bio. + */ + bitmap_clear(rbio->dbitmap, 0, rbio->stripe_npages); =20 /* * At this moment, rbio->bio_list is empty, however since rbio does not @@ -1197,6 +1206,9 @@ static noinline void finish_rmw(struct b else BUG(); =20 + /* We should have at least one data sector. */ + ASSERT(bitmap_weight(rbio->dbitmap, rbio->stripe_npages)); + /* at this point we either have a full stripe, * or we've read the full stripe from the drive. * recalculate the parity and write the new results. @@ -1268,6 +1280,11 @@ static noinline void finish_rmw(struct b for (stripe =3D 0; stripe < rbio->real_stripes; stripe++) { for (pagenr =3D 0; pagenr < rbio->stripe_npages; pagenr++) { struct page *page; + + /* This vertical stripe has no data, skip it. 
*/ + if (!test_bit(pagenr, rbio->dbitmap)) + continue; + if (stripe < rbio->nr_data) { page =3D page_in_rbio(rbio, stripe, pagenr, 1); if (!page) @@ -1292,6 +1309,11 @@ static noinline void finish_rmw(struct b =20 for (pagenr =3D 0; pagenr < rbio->stripe_npages; pagenr++) { struct page *page; + + /* This vertical stripe has no data, skip it. */ + if (!test_bit(pagenr, rbio->dbitmap)) + continue; + if (stripe < rbio->nr_data) { page =3D page_in_rbio(rbio, stripe, pagenr, 1); if (!page) @@ -1715,6 +1737,33 @@ static void btrfs_raid_unplug(struct blk run_plug(plug); } =20 +/* Add the original bio into rbio->bio_list, and update rbio::dbitmap. */ +static void rbio_add_bio(struct btrfs_raid_bio *rbio, struct bio *orig_bio) +{ + const struct btrfs_fs_info *fs_info =3D rbio->fs_info; + const u64 orig_logical =3D orig_bio->bi_iter.bi_sector << SECTOR_SHIFT; + const u64 full_stripe_start =3D rbio->bioc->raid_map[0]; + const u32 orig_len =3D orig_bio->bi_iter.bi_size; + const u32 sectorsize =3D fs_info->sectorsize; + u64 cur_logical; + + ASSERT(orig_logical >=3D full_stripe_start && + orig_logical + orig_len <=3D full_stripe_start + + rbio->nr_data * rbio->stripe_len); + + bio_list_add(&rbio->bio_list, orig_bio); + rbio->bio_list_bytes +=3D orig_bio->bi_iter.bi_size; + + /* Update the dbitmap. */ + for (cur_logical =3D orig_logical; cur_logical < orig_logical + orig_len; + cur_logical +=3D sectorsize) { + int bit =3D ((u32)(cur_logical - full_stripe_start) >> + fs_info->sectorsize_bits) % rbio->stripe_npages; + + set_bit(bit, rbio->dbitmap); + } +} + /* * our main entry point for writes from the rest of the FS. 
*/ @@ -1731,9 +1780,8 @@ int raid56_parity_write(struct btrfs_fs_ btrfs_put_bioc(bioc); return PTR_ERR(rbio); } - bio_list_add(&rbio->bio_list, bio); - rbio->bio_list_bytes =3D bio->bi_iter.bi_size; rbio->operation =3D BTRFS_RBIO_WRITE; + rbio_add_bio(rbio, bio); =20 btrfs_bio_counter_inc_noblocked(fs_info); rbio->generic_bio_cnt =3D 1; @@ -2135,8 +2183,7 @@ int raid56_parity_recover(struct btrfs_f } =20 rbio->operation =3D BTRFS_RBIO_READ_REBUILD; - bio_list_add(&rbio->bio_list, bio); - rbio->bio_list_bytes =3D bio->bi_iter.bi_size; + rbio_add_bio(rbio, bio); =20 rbio->faila =3D find_logical_bio_stripe(rbio, bio); if (rbio->faila =3D=3D -1) { From nobody Fri Apr 10 10:43:51 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B6E86C32771 for ; Fri, 19 Aug 2022 15:41:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1349989AbiHSPlu (ORCPT ); Fri, 19 Aug 2022 11:41:50 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38596 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1349987AbiHSPlA (ORCPT ); Fri, 19 Aug 2022 11:41:00 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9AAD7102645; Fri, 19 Aug 2022 08:40:50 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 41526B82812; Fri, 19 Aug 2022 15:40:50 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 8D4F9C433C1; Fri, 19 Aug 2022 15:40:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1660923649; 
        bh=bcOZ4df+8jVzaKVhVAkFAKR1x3LIzs27krUqQ9YnTbM=;
        h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
        b=kjwPv5lUKVe2Fg/2jFGdYTR2Gmh1zfek3/fpFllOR7HzPmhb7NSX8uw4jPqjM7lg4
         gG9roBTk8Su8AwGFaxca/xZOnREGkfLpFZ3bxInn7O5DQ916LXetFL3IDiIofXHDUq
         GCisRtJOvn3BLYTe/+qcZYyTdih+Tt/7alAOpkDo=
From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, David Sterba, Qu Wenruo
Subject: [PATCH 5.15 10/14] btrfs: raid56: dont trust any cached sector in __raid56_parity_recover()
Date: Fri, 19 Aug 2022 17:40:26 +0200
Message-Id: <20220819153712.007595343@linuxfoundation.org>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <20220819153711.658766010@linuxfoundation.org>
References: <20220819153711.658766010@linuxfoundation.org>
User-Agent: quilt/0.67
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

From: Qu Wenruo

commit f6065f8edeb25f4a9dfe0b446030ad995a84a088 upstream.

[BUG]
There is a small workload which will always fail with recent kernels
(a simplified version of test case btrfs/125):

  mkfs.btrfs -f -m raid5 -d raid5 -b 1G $dev1 $dev2 $dev3
  mount $dev1 $mnt
  xfs_io -f -c "pwrite -S 0xee 0 1M" $mnt/file1
  sync
  umount $mnt
  btrfs dev scan -u $dev3
  mount -o degraded $dev1 $mnt
  xfs_io -f -c "pwrite -S 0xff 0 128M" $mnt/file2
  umount $mnt
  btrfs dev scan
  mount $dev1 $mnt
  btrfs balance start --full-balance $mnt
  umount $mnt

The failure is always a failure to read some tree blocks:

  BTRFS info (device dm-4): relocating block group 217710592 flags data|raid5
  BTRFS error (device dm-4): parent transid verify failed on 38993920 wanted 9 found 7
  BTRFS error (device dm-4): parent transid verify failed on 38993920 wanted 9 found 7
  ...

[CAUSE]
With the recently added debug output, we can see all RAID56 operations
related to full stripe 38928384:

  56.1183: raid56_read_partial: full_stripe=38928384 devid=2 type=DATA1 offset=0 opf=0x0 physical=9502720 len=65536
  56.1185: raid56_read_partial: full_stripe=38928384 devid=3 type=DATA2 offset=16384 opf=0x0 physical=9519104 len=16384
  56.1185: raid56_read_partial: full_stripe=38928384 devid=3 type=DATA2 offset=49152 opf=0x0 physical=9551872 len=16384
  56.1187: raid56_write_stripe: full_stripe=38928384 devid=3 type=DATA2 offset=0 opf=0x1 physical=9502720 len=16384
  56.1188: raid56_write_stripe: full_stripe=38928384 devid=3 type=DATA2 offset=32768 opf=0x1 physical=9535488 len=16384
  56.1188: raid56_write_stripe: full_stripe=38928384 devid=1 type=PQ1 offset=0 opf=0x1 physical=30474240 len=16384
  56.1189: raid56_write_stripe: full_stripe=38928384 devid=1 type=PQ1 offset=32768 opf=0x1 physical=30507008 len=16384
  56.1218: raid56_write_stripe: full_stripe=38928384 devid=3 type=DATA2 offset=49152 opf=0x1 physical=9551872 len=16384
  56.1219: raid56_write_stripe: full_stripe=38928384 devid=1 type=PQ1 offset=49152 opf=0x1 physical=30523392 len=16384
  56.2721: raid56_parity_recover: full stripe=38928384 eb=39010304 mirror=2
  56.2723: raid56_parity_recover: full stripe=38928384 eb=39010304 mirror=2
  56.2724: raid56_parity_recover: full stripe=38928384 eb=39010304 mirror=2

Before we enter raid56_parity_recover(), we have triggered some metadata
writes for the full stripe 38928384, which leads us to read all the
sectors from disk.

Furthermore, btrfs raid56 write will cache its calculated P/Q sectors to
avoid unnecessary reads.

This means, for that full stripe, after any partial write, we will have
stale data, along with P/Q calculated using that stale data.

Thankfully, due to patch "btrfs: only write the sectors in the vertical
stripe which has data stripes", we haven't submitted all the corrupted
P/Q to disk.

When we really need to recover a certain range, i.e. in
raid56_parity_recover(), we will use the cached rbio, along with its
cached sectors (the full stripe is all cached).

This explains why we have no event raid56_scrub_read_recover()
triggered.

Since we have the cached P/Q which is calculated using the stale data,
the recovered one will just be stale.

In our particular test case, it will always return the same incorrect
metadata, thus causing the same error message "parent transid verify
failed on 39010304 wanted 9 found 7" again and again.

[BTRFS DESTRUCTIVE RMW PROBLEM]

Test case btrfs/125 (and the above workload) always has trouble with
the destructive read-modify-write (RMW) cycle:

          0       32K     64K
  Data1:  |  Good |  Good |
  Data2:  |  Bad  |  Bad  |
  Parity: |  Good |  Good |

In the above case, if we trigger any write into Data1, we will use the
bad data in Data2 to re-generate parity, killing the only chance to
recover Data2, thus Data2 is lost forever.

This destructive RMW cycle is not specific to btrfs RAID56, but there
are some btrfs specific behaviors making the case even worse:

- Btrfs will cache sectors for unrelated vertical stripes.

  In the above example, if we're only writing into the 0~32K range,
  btrfs will still read data range (32K ~ 64K) of Data1, and (64K~128K)
  of Data2.
  This behavior is to cache sectors for later update.

  Incidentally commit d4e28d9b5f04 ("btrfs: raid56: make steal_rbio()
  subpage compatible") has a bug which makes RAID56 never trust the
  cached sectors, thus slightly improving the situation for recovery.

  Unfortunately, the follow-up fix "btrfs: update
  stripe_sectors::uptodate in steal_rbio" will revert the behavior back
  to the old one.

- Btrfs raid56 partial write will update all P/Q sectors and cache them

  This means that even if data at (64K ~ 96K) of Data2 is free space and
  only (96K ~ 128K) of Data2 is really stale data, when we write into
  that (96K ~ 128K) range we will update all the parity sectors for the
  full stripe.

  This unnecessary behavior will completely kill the chance of recovery.

  Thankfully, an unrelated optimization "btrfs: only write the sectors
  in the vertical stripe which has data stripes" will prevent
  submitting the write bio for untouched vertical sectors.

  That optimization will keep the on-disk P/Q untouched for a chance for
  later recovery.

[FIX]
Although we have no good way to completely fix the destructive RMW
(unless we go full scrub for each partial write), we can still limit the
damage.

With patch "btrfs: only write the sectors in the vertical stripe which
has data stripes" we no longer submit the P/Q of unrelated vertical
stripes, so the on-disk P/Q should still be fine.

Now all we really need to do is drop all the cached sectors when doing
recovery.

By this, we have a chance to read the original P/Q from disk, and have a
chance to recover the stale data, while still keeping the cache to speed
up the regular write path.

In fact, just dropping all the cache for the recovery path is good
enough to allow the test case btrfs/125 along with the small script to
pass reliably.

The lack of metadata writes after the degraded mount, and forced
metadata COW, are saving us this time.

So this patch will fix the behavior by not trusting any cache in
__raid56_parity_recover(), solving the problem while still keeping the
cache useful.

But please note that this test pass DOES NOT mean we have solved the
destructive RMW problem, we just do damage control a little better.
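
The damage control described above can be sketched in user space with
plain XOR parity (a toy model, not the btrfs code; the struct, the
two-column layout and the byte values are assumptions made for
illustration). After a partial write to one vertical stripe, rebuilding
an untouched column from the cached parity returns the stale data, while
rebuilding it from the on-disk parity returns the good data:

```c
#include <assert.h>
#include <stdint.h>

#define NCOLS	 2	/* vertical stripes in the toy full stripe */
#define GOOD_D2	 0x22	/* what Data2 is supposed to contain */
#define STALE_D2 0x99	/* what the out-of-date device really holds */

struct full_stripe {
	uint8_t d1[NCOLS];
	uint8_t d2[NCOLS];
	uint8_t p[NCOLS];	/* RAID5-style parity: P = D1 ^ D2 */
};

/* Rebuild Data2 of one column from Data1 and parity. */
static uint8_t rebuild_d2(const struct full_stripe *s, int col)
{
	assert(col >= 0 && col < NCOLS);
	return s->d1[col] ^ s->p[col];
}

/*
 * Do a partial write into column 0 of Data1, then recover column 1 of
 * Data2, either from the cached sectors or from what is on disk.
 */
static uint8_t recover_after_partial_write(int trust_cache)
{
	struct full_stripe disk, cache;
	int col;

	/* On disk: Data1 and parity are good, Data2 is stale. */
	for (col = 0; col < NCOLS; col++) {
		disk.d1[col] = 0x11;
		disk.d2[col] = STALE_D2;
		disk.p[col] = 0x11 ^ GOOD_D2;
	}

	/* RMW: read the full stripe, modify column 0, recompute all parity. */
	cache = disk;
	cache.d1[0] = 0x55;
	for (col = 0; col < NCOLS; col++)
		cache.p[col] = cache.d1[col] ^ cache.d2[col];

	/* Only the vertical stripe that has data is written back. */
	disk.d1[0] = cache.d1[0];
	disk.p[0] = cache.p[0];

	return rebuild_d2(trust_cache ? &cache : &disk, 1);
}
```

Column 0, the one that was actually written, is lost in both cases,
which is the part of the destructive RMW problem this patch does not
solve.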
Related patches: - btrfs: only write the sectors in the vertical stripe - d4e28d9b5f04 ("btrfs: raid56: make steal_rbio() subpage compatible") - btrfs: update stripe_sectors::uptodate in steal_rbio Acked-by: David Sterba Signed-off-by: Qu Wenruo Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman Tested-by: Bagas Sanjaya Tested-by: Ron Economos Tested-by: Shuah Khan Tested-by: Sudip Mukherjee --- fs/btrfs/raid56.c | 19 ++++++------------- 1 file changed, 6 insertions(+), 13 deletions(-) --- a/fs/btrfs/raid56.c +++ b/fs/btrfs/raid56.c @@ -2085,9 +2085,12 @@ static int __raid56_parity_recover(struc atomic_set(&rbio->error, 0); =20 /* - * read everything that hasn't failed. Thanks to the - * stripe cache, it is possible that some or all of these - * pages are going to be uptodate. + * Read everything that hasn't failed. However this time we will + * not trust any cached sector. + * As we may read out some stale data but higher layer is not reading + * that stale part. + * + * So here we always re-read everything in recovery path. 
*/ for (stripe =3D 0; stripe < rbio->real_stripes; stripe++) { if (rbio->faila =3D=3D stripe || rbio->failb =3D=3D stripe) { @@ -2096,16 +2099,6 @@ static int __raid56_parity_recover(struc } =20 for (pagenr =3D 0; pagenr < rbio->stripe_npages; pagenr++) { - struct page *p; - - /* - * the rmw code may have already read this - * page in - */ - p =3D rbio_stripe_page(rbio, stripe, pagenr); - if (PageUptodate(p)) - continue; - ret =3D rbio_add_io_page(rbio, &bio_list, rbio_stripe_page(rbio, stripe, pagenr), stripe, pagenr, rbio->stripe_len); From nobody Fri Apr 10 10:43:51 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id AB9FFC32771 for ; Fri, 19 Aug 2022 15:41:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1349454AbiHSPlx (ORCPT ); Fri, 19 Aug 2022 11:41:53 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38638 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1350009AbiHSPlB (ORCPT ); Fri, 19 Aug 2022 11:41:01 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 63017E341F; Fri, 19 Aug 2022 08:40:53 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id E2D81615ED; Fri, 19 Aug 2022 15:40:52 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id D20C7C433C1; Fri, 19 Aug 2022 15:40:51 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1660923652; bh=c/vhoYm7qWTuTa71RxkMDXidcM2P22GbZN/xGxqMp74=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; 
b=G5zVv4FAmcXJMbHyIi53Ct17Y78AEXsaiLKndCfdb95npxmLDe4bSrXk2dgoiB9s8 IOupID12uPCU8/YwoUPpj9VybcXPe8KQnpAF0Mkh0eyHSe4OFU3RilUjaKK/Wsg+g0 jz+0hM12FliiaBDrU1AlB6hfpZFOTDVF1mV+UXek= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, "Naveen N. Rao" , Eric Biederman , Andrew Morton , Mimi Zohar Subject: [PATCH 5.15 11/14] kexec_file: drop weak attribute from functions Date: Fri, 19 Aug 2022 17:40:27 +0200 Message-Id: <20220819153712.040538428@linuxfoundation.org> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20220819153711.658766010@linuxfoundation.org> References: <20220819153711.658766010@linuxfoundation.org> User-Agent: quilt/0.67 MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" From: Naveen N. Rao commit 65d9a9a60fd71be964effb2e94747a6acb6e7015 upstream. As requested (http://lkml.kernel.org/r/87ee0q7b92.fsf@email.froward.int.ebiederm.org), this series converts weak functions in kexec to use the #ifdef approach. Quoting the 3e35142ef99fe ("kexec_file: drop weak attribute from arch_kexec_apply_relocations[_add]") changelog: : Since commit d1bcae833b32f1 ("ELF: Don't generate unused section symbols") : [1], binutils (v2.36+) started dropping section symbols that it thought : were unused. This isn't an issue in general, but with kexec_file.c, gcc : is placing kexec_arch_apply_relocations[_add] into a separate : .text.unlikely section and the section symbol ".text.unlikely" is being : dropped. Due to this, recordmcount is unable to find a non-weak symbol in : .text.unlikely to generate a relocation record against. 
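The #ifdef approach this series converts to can be sketched in a single file, as below. This is a minimal illustration and not the kexec code: all toy_* names are hypothetical, and the "headers" are simulated by comments. The point is that an architecture announces its override with a same-named macro, so the generic static inline fallback is compiled only when no override exists and no weak symbol is needed.

```c
#include <assert.h>

/* All toy_* names are hypothetical; only the pattern matters. */

/* "Arch header": declare the override and define a same-named macro. */
int toy_arch_probe(int len);
#define toy_arch_probe toy_arch_probe

/* "Generic header": default helper shared by all fallbacks. */
static inline int toy_probe_default(int len)
{
	return len > 0 ? 0 : -1;
}

/* Skipped here, because the "arch header" defined the macro. */
#ifndef toy_arch_probe
static inline int toy_arch_probe(int len)
{
	return toy_probe_default(len);
}
#endif

/* Compiled here, because no arch announced an override. */
#ifndef toy_arch_cleanup
static inline int toy_arch_cleanup(void)
{
	return 0;
}
#endif

/* "Arch .c file": the real, non-weak definition of the override. */
int toy_arch_probe(int len)
{
	assert(len >= 0);
	return len >= 64 ? 0 : -2;
}
```

Because the override is an ordinary strong symbol and the fallback is a static inline, the linker never has to choose between a weak and a non-weak definition.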
This patch (of 2); Drop __weak attribute from functions in kexec_file.c: - arch_kexec_kernel_image_probe() - arch_kimage_file_post_load_cleanup() - arch_kexec_kernel_image_load() - arch_kexec_locate_mem_hole() - arch_kexec_kernel_verify_sig() arch_kexec_kernel_image_load() calls into kexec_image_load_default(), so drop the static attribute for the latter. arch_kexec_kernel_verify_sig() is not overridden by any architecture, so drop the __weak attribute. Link: https://lkml.kernel.org/r/cover.1656659357.git.naveen.n.rao@linux.vne= t.ibm.com Link: https://lkml.kernel.org/r/2cd7ca1fe4d6bb6ca38e3283c717878388ed6788.16= 56659357.git.naveen.n.rao@linux.vnet.ibm.com Signed-off-by: Naveen N. Rao Suggested-by: Eric Biederman Signed-off-by: Andrew Morton Signed-off-by: Mimi Zohar Signed-off-by: Greg Kroah-Hartman Tested-by: Bagas Sanjaya Tested-by: Ron Economos Tested-by: Shuah Khan Tested-by: Sudip Mukherjee --- arch/arm64/include/asm/kexec.h | 4 ++- arch/powerpc/include/asm/kexec.h | 9 +++++++ arch/s390/include/asm/kexec.h | 3 ++ arch/x86/include/asm/kexec.h | 6 +++++ include/linux/kexec.h | 44 +++++++++++++++++++++++++++++++++-= ----- kernel/kexec_file.c | 35 +------------------------------ 6 files changed, 61 insertions(+), 40 deletions(-) --- a/arch/arm64/include/asm/kexec.h +++ b/arch/arm64/include/asm/kexec.h @@ -103,7 +103,9 @@ extern const struct kexec_file_ops kexec =20 struct kimage; =20 -extern int arch_kimage_file_post_load_cleanup(struct kimage *image); +int arch_kimage_file_post_load_cleanup(struct kimage *image); +#define arch_kimage_file_post_load_cleanup arch_kimage_file_post_load_clea= nup + extern int load_other_segments(struct kimage *image, unsigned long kernel_load_addr, unsigned long kernel_size, char *initrd, unsigned long initrd_len, --- a/arch/powerpc/include/asm/kexec.h +++ b/arch/powerpc/include/asm/kexec.h @@ -119,6 +119,15 @@ int setup_purgatory(struct kimage *image #ifdef CONFIG_PPC64 struct kexec_buf; =20 +int 
arch_kexec_kernel_image_probe(struct kimage *image, void *buf, unsigne= d long buf_len); +#define arch_kexec_kernel_image_probe arch_kexec_kernel_image_probe + +int arch_kimage_file_post_load_cleanup(struct kimage *image); +#define arch_kimage_file_post_load_cleanup arch_kimage_file_post_load_clea= nup + +int arch_kexec_locate_mem_hole(struct kexec_buf *kbuf); +#define arch_kexec_locate_mem_hole arch_kexec_locate_mem_hole + int load_crashdump_segments_ppc64(struct kimage *image, struct kexec_buf *kbuf); int setup_purgatory_ppc64(struct kimage *image, const void *slave_code, --- a/arch/s390/include/asm/kexec.h +++ b/arch/s390/include/asm/kexec.h @@ -92,5 +92,8 @@ int arch_kexec_apply_relocations_add(str const Elf_Shdr *relsec, const Elf_Shdr *symtab); #define arch_kexec_apply_relocations_add arch_kexec_apply_relocations_add + +int arch_kimage_file_post_load_cleanup(struct kimage *image); +#define arch_kimage_file_post_load_cleanup arch_kimage_file_post_load_clea= nup #endif #endif /*_S390_KEXEC_H */ --- a/arch/x86/include/asm/kexec.h +++ b/arch/x86/include/asm/kexec.h @@ -193,6 +193,12 @@ int arch_kexec_apply_relocations_add(str const Elf_Shdr *relsec, const Elf_Shdr *symtab); #define arch_kexec_apply_relocations_add arch_kexec_apply_relocations_add + +void *arch_kexec_kernel_image_load(struct kimage *image); +#define arch_kexec_kernel_image_load arch_kexec_kernel_image_load + +int arch_kimage_file_post_load_cleanup(struct kimage *image); +#define arch_kimage_file_post_load_cleanup arch_kimage_file_post_load_clea= nup #endif #endif =20 --- a/include/linux/kexec.h +++ b/include/linux/kexec.h @@ -182,21 +182,53 @@ int kexec_purgatory_get_set_symbol(struc void *buf, unsigned int size, bool get_value); void *kexec_purgatory_get_symbol_addr(struct kimage *image, const char *na= me); +void *kexec_image_load_default(struct kimage *image); + +#ifndef arch_kexec_kernel_image_probe +static inline int +arch_kexec_kernel_image_probe(struct kimage *image, void *buf, unsigned lo= 
ng buf_len) +{ + return kexec_image_probe_default(image, buf, buf_len); +} +#endif + +#ifndef arch_kimage_file_post_load_cleanup +static inline int arch_kimage_file_post_load_cleanup(struct kimage *image) +{ + return kexec_image_post_load_cleanup_default(image); +} +#endif + +#ifndef arch_kexec_kernel_image_load +static inline void *arch_kexec_kernel_image_load(struct kimage *image) +{ + return kexec_image_load_default(image); +} +#endif =20 -/* Architectures may override the below functions */ -int arch_kexec_kernel_image_probe(struct kimage *image, void *buf, - unsigned long buf_len); -void *arch_kexec_kernel_image_load(struct kimage *image); -int arch_kimage_file_post_load_cleanup(struct kimage *image); #ifdef CONFIG_KEXEC_SIG int arch_kexec_kernel_verify_sig(struct kimage *image, void *buf, unsigned long buf_len); #endif -int arch_kexec_locate_mem_hole(struct kexec_buf *kbuf); =20 extern int kexec_add_buffer(struct kexec_buf *kbuf); int kexec_locate_mem_hole(struct kexec_buf *kbuf); =20 +#ifndef arch_kexec_locate_mem_hole +/** + * arch_kexec_locate_mem_hole - Find free memory to place the segments. + * @kbuf: Parameters for the memory search. + * + * On success, kbuf->mem will have the start address of the memory region = found. + * + * Return: 0 on success, negative errno on error. 
+ */ +static inline int arch_kexec_locate_mem_hole(struct kexec_buf *kbuf) +{ + return kexec_locate_mem_hole(kbuf); +} +#endif + /* Alignment required for elf header segment */ #define ELF_CORE_HEADER_ALIGN 4096 =20 --- a/kernel/kexec_file.c +++ b/kernel/kexec_file.c @@ -62,14 +62,7 @@ int kexec_image_probe_default(struct kim return ret; } =20 -/* Architectures can provide this probe function */ -int __weak arch_kexec_kernel_image_probe(struct kimage *image, void *buf, - unsigned long buf_len) -{ - return kexec_image_probe_default(image, buf, buf_len); -} - -static void *kexec_image_load_default(struct kimage *image) +void *kexec_image_load_default(struct kimage *image) { if (!image->fops || !image->fops->load) return ERR_PTR(-ENOEXEC); @@ -80,11 +73,6 @@ static void *kexec_image_load_default(st image->cmdline_buf_len); } =20 -void * __weak arch_kexec_kernel_image_load(struct kimage *image) -{ - return kexec_image_load_default(image); -} - int kexec_image_post_load_cleanup_default(struct kimage *image) { if (!image->fops || !image->fops->cleanup) @@ -93,11 +81,6 @@ int kexec_image_post_load_cleanup_defaul return image->fops->cleanup(image->image_loader_data); } =20 -int __weak arch_kimage_file_post_load_cleanup(struct kimage *image) -{ - return kexec_image_post_load_cleanup_default(image); -} - #ifdef CONFIG_KEXEC_SIG static int kexec_image_verify_sig_default(struct kimage *image, void *buf, unsigned long buf_len) @@ -110,8 +93,7 @@ static int kexec_image_verify_sig_defaul return image->fops->verify_sig(buf, buf_len); } =20 -int __weak arch_kexec_kernel_verify_sig(struct kimage *image, void *buf, - unsigned long buf_len) +int arch_kexec_kernel_verify_sig(struct kimage *image, void *buf, unsigned= long buf_len) { return kexec_image_verify_sig_default(image, buf, buf_len); } @@ -617,19 +599,6 @@ int kexec_locate_mem_hole(struct kexec_b } =20 /** - * arch_kexec_locate_mem_hole - Find free memory to place the segments. - * @kbuf: Parameters for the memory search. 
- * - * On success, kbuf->mem will have the start address of the memory region = found. - * - * Return: 0 on success, negative errno on error. - */ -int __weak arch_kexec_locate_mem_hole(struct kexec_buf *kbuf) -{ - return kexec_locate_mem_hole(kbuf); -} - -/** * kexec_add_buffer - place a buffer in a kexec segment * @kbuf: Buffer contents and memory parameters. * From nobody Fri Apr 10 10:43:51 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id DA0E6C32771 for ; Fri, 19 Aug 2022 15:42:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1350003AbiHSPmB (ORCPT ); Fri, 19 Aug 2022 11:42:01 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37706 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1350023AbiHSPlI (ORCPT ); Fri, 19 Aug 2022 11:41:08 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 06C611022AA; Fri, 19 Aug 2022 08:40:56 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 306C6615FB; Fri, 19 Aug 2022 15:40:56 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 09269C433D6; Fri, 19 Aug 2022 15:40:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1660923655; bh=wJ/V/LvE4I2T4P1SCgVc+d8avzexWZgTIgSHPGcTF3Q=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=cwMcdZXcqnj5TWnZhsapq+LghaSoX3Um/Sg3SNqT8iColeP82mypuFPnhHYaNudv0 BtmtJbF1qd2uEit6dgCWguyIxh2UrsLyo/UEdumjPNoCH5BTvKmg1WvBpP7LncYg5U wgjYHOwMY40+6/gGMEZJKPxEZxhn/OOgRdbRWlIc= From: Greg 
Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, "Eric W. Biederman" , Michal Suchanek , Baoquan He , Coiby Xu , Mimi Zohar Subject: [PATCH 5.15 12/14] kexec: clean up arch_kexec_kernel_verify_sig Date: Fri, 19 Aug 2022 17:40:28 +0200 Message-Id: <20220819153712.072098573@linuxfoundation.org> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20220819153711.658766010@linuxfoundation.org> References: <20220819153711.658766010@linuxfoundation.org> User-Agent: quilt/0.67 MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" From: Coiby Xu commit 689a71493bd2f31c024f8c0395f85a1fd4b2138e upstream. Before commit 105e10e2cf1c ("kexec_file: drop weak attribute from functions"), there was already no arch-specific implementation of arch_kexec_kernel_verify_sig. With weak attribute dropped by that commit, arch_kexec_kernel_verify_sig is completely useless. So clean it up. Note later patches are dependent on this patch so it should be backported to the stable tree as well. Cc: stable@vger.kernel.org Suggested-by: Eric W. 
Biederman Reviewed-by: Michal Suchanek Acked-by: Baoquan He Signed-off-by: Coiby Xu [zohar@linux.ibm.com: reworded patch description "Note"] Link: https://lore.kernel.org/linux-integrity/20220714134027.394370-1-coxu@= redhat.com/ Signed-off-by: Mimi Zohar Signed-off-by: Greg Kroah-Hartman Tested-by: Bagas Sanjaya Tested-by: Ron Economos Tested-by: Shuah Khan Tested-by: Sudip Mukherjee --- include/linux/kexec.h | 5 ----- kernel/kexec_file.c | 33 +++++++++++++-------------------- 2 files changed, 13 insertions(+), 25 deletions(-) --- a/include/linux/kexec.h +++ b/include/linux/kexec.h @@ -206,11 +206,6 @@ static inline void *arch_kexec_kernel_im } #endif =20 -#ifdef CONFIG_KEXEC_SIG -int arch_kexec_kernel_verify_sig(struct kimage *image, void *buf, - unsigned long buf_len); -#endif - extern int kexec_add_buffer(struct kexec_buf *kbuf); int kexec_locate_mem_hole(struct kexec_buf *kbuf); =20 --- a/kernel/kexec_file.c +++ b/kernel/kexec_file.c @@ -81,24 +81,6 @@ int kexec_image_post_load_cleanup_defaul return image->fops->cleanup(image->image_loader_data); } =20 -#ifdef CONFIG_KEXEC_SIG -static int kexec_image_verify_sig_default(struct kimage *image, void *buf, - unsigned long buf_len) -{ - if (!image->fops || !image->fops->verify_sig) { - pr_debug("kernel loader does not support signature verification.\n"); - return -EKEYREJECTED; - } - - return image->fops->verify_sig(buf, buf_len); -} - -int arch_kexec_kernel_verify_sig(struct kimage *image, void *buf, unsigned= long buf_len) -{ - return kexec_image_verify_sig_default(image, buf, buf_len); -} -#endif - /* * Free up memory used by kernel, initrd, and command line. 
This is tempor= ary * memory allocation which is not needed any more after these buffers have @@ -141,13 +123,24 @@ void kimage_file_post_load_cleanup(struc } =20 #ifdef CONFIG_KEXEC_SIG +static int kexec_image_verify_sig(struct kimage *image, void *buf, + unsigned long buf_len) +{ + if (!image->fops || !image->fops->verify_sig) { + pr_debug("kernel loader does not support signature verification.\n"); + return -EKEYREJECTED; + } + + return image->fops->verify_sig(buf, buf_len); +} + static int kimage_validate_signature(struct kimage *image) { int ret; =20 - ret =3D arch_kexec_kernel_verify_sig(image, image->kernel_buf, - image->kernel_buf_len); + ret =3D kexec_image_verify_sig(image, image->kernel_buf, + image->kernel_buf_len); if (ret) { =20 if (sig_enforce) { From nobody Fri Apr 10 10:43:51 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 28CC6C32773 for ; Fri, 19 Aug 2022 15:42:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1349941AbiHSPmO (ORCPT ); Fri, 19 Aug 2022 11:42:14 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37618 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1349934AbiHSPlq (ORCPT ); Fri, 19 Aug 2022 11:41:46 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2F246102F17; Fri, 19 Aug 2022 08:41:01 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id ED101B82813; Fri, 19 Aug 2022 15:40:59 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 39D8CC433C1; Fri, 19 Aug 2022 
15:40:58 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1660923658; bh=hc73Cztag373h5z9qzhWH4LIzO2kQuPhp2WyLHsG8rU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=mFNPpfzeZlUGTyRhrcJfJ0ivK3EafM9JsB7zj+N0a7XsTvT1feQIiX56A501IzckC 6aLEjtjg6tmBeKZjZbljyXG4VrG+YOerA6+zO2YSYr8ri9u4FwvDDvVfEKK9/0jHZa jOVNpYNjfrzPveQFv6pfVFiHpL071m2q6/WUaVug= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, kexec@lists.infradead.org, keyrings@vger.kernel.org, linux-security-module@vger.kernel.org, Michal Suchanek , Coiby Xu , Mimi Zohar Subject: [PATCH 5.15 13/14] kexec, KEYS: make the code in bzImage64_verify_sig generic Date: Fri, 19 Aug 2022 17:40:29 +0200 Message-Id: <20220819153712.107081653@linuxfoundation.org> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20220819153711.658766010@linuxfoundation.org> References: <20220819153711.658766010@linuxfoundation.org> User-Agent: quilt/0.67 MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" From: Coiby Xu commit c903dae8941deb55043ee46ded29e84e97cd84bb upstream. commit 278311e417be ("kexec, KEYS: Make use of platform keyring for signature verify") adds platform keyring support on x86 kexec but not arm64. The code in bzImage64_verify_sig uses the keys on the .builtin_trusted_keys, .machine, if configured and enabled, .secondary_trusted_keys, also if configured, and .platform keyrings to verify the signed kernel image as PE file. 
Cc: kexec@lists.infradead.org Cc: keyrings@vger.kernel.org Cc: linux-security-module@vger.kernel.org Reviewed-by: Michal Suchanek Signed-off-by: Coiby Xu Signed-off-by: Mimi Zohar Signed-off-by: Greg Kroah-Hartman Tested-by: Bagas Sanjaya Tested-by: Ron Economos Tested-by: Shuah Khan Tested-by: Sudip Mukherjee --- arch/x86/kernel/kexec-bzimage64.c | 20 +------------------- include/linux/kexec.h | 7 +++++++ kernel/kexec_file.c | 17 +++++++++++++++++ 3 files changed, 25 insertions(+), 19 deletions(-) --- a/arch/x86/kernel/kexec-bzimage64.c +++ b/arch/x86/kernel/kexec-bzimage64.c @@ -17,7 +17,6 @@ #include #include #include -#include =20 #include #include @@ -528,28 +527,11 @@ static int bzImage64_cleanup(void *loade return 0; } =20 -#ifdef CONFIG_KEXEC_BZIMAGE_VERIFY_SIG -static int bzImage64_verify_sig(const char *kernel, unsigned long kernel_l= en) -{ - int ret; - - ret =3D verify_pefile_signature(kernel, kernel_len, - VERIFY_USE_SECONDARY_KEYRING, - VERIFYING_KEXEC_PE_SIGNATURE); - if (ret =3D=3D -ENOKEY && IS_ENABLED(CONFIG_INTEGRITY_PLATFORM_KEYRING)) { - ret =3D verify_pefile_signature(kernel, kernel_len, - VERIFY_USE_PLATFORM_KEYRING, - VERIFYING_KEXEC_PE_SIGNATURE); - } - return ret; -} -#endif - const struct kexec_file_ops kexec_bzImage64_ops =3D { .probe =3D bzImage64_probe, .load =3D bzImage64_load, .cleanup =3D bzImage64_cleanup, #ifdef CONFIG_KEXEC_BZIMAGE_VERIFY_SIG - .verify_sig =3D bzImage64_verify_sig, + .verify_sig =3D kexec_kernel_verify_pe_sig, #endif }; --- a/include/linux/kexec.h +++ b/include/linux/kexec.h @@ -19,6 +19,7 @@ #include =20 #include +#include =20 #ifdef CONFIG_KEXEC_CORE #include @@ -206,6 +207,12 @@ static inline void *arch_kexec_kernel_im } #endif =20 +#ifdef CONFIG_KEXEC_SIG +#ifdef CONFIG_SIGNED_PE_FILE_VERIFICATION +int kexec_kernel_verify_pe_sig(const char *kernel, unsigned long kernel_le= n); +#endif +#endif + extern int kexec_add_buffer(struct kexec_buf *kbuf); int kexec_locate_mem_hole(struct kexec_buf *kbuf); =20 --- 
a/kernel/kexec_file.c +++ b/kernel/kexec_file.c @@ -123,6 +123,23 @@ void kimage_file_post_load_cleanup(struc } =20 #ifdef CONFIG_KEXEC_SIG +#ifdef CONFIG_SIGNED_PE_FILE_VERIFICATION +int kexec_kernel_verify_pe_sig(const char *kernel, unsigned long kernel_le= n) +{ + int ret; + + ret =3D verify_pefile_signature(kernel, kernel_len, + VERIFY_USE_SECONDARY_KEYRING, + VERIFYING_KEXEC_PE_SIGNATURE); + if (ret =3D=3D -ENOKEY && IS_ENABLED(CONFIG_INTEGRITY_PLATFORM_KEYRING)) { + ret =3D verify_pefile_signature(kernel, kernel_len, + VERIFY_USE_PLATFORM_KEYRING, + VERIFYING_KEXEC_PE_SIGNATURE); + } + return ret; +} +#endif + static int kexec_image_verify_sig(struct kimage *image, void *buf, unsigned long buf_len) { From nobody Fri Apr 10 10:43:51 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B45C3C32771 for ; Fri, 19 Aug 2022 15:42:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1350056AbiHSPmS (ORCPT ); Fri, 19 Aug 2022 11:42:18 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37836 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1349972AbiHSPls (ORCPT ); Fri, 19 Aug 2022 11:41:48 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AB31E102F23; Fri, 19 Aug 2022 08:41:04 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id F1329B82817; Fri, 19 Aug 2022 15:41:02 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 25D9BC433C1; Fri, 19 Aug 2022 15:41:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1660923661; bh=q01TiepYCgbd3YDF3Teq86Su7fa1NExU5eO/8Pw2Xpc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=kHOZ/cNj40mzAv1+SYNRiyco5Yxnp93P4lbEHIMGdFi3BiZZx1RyfvY9aZOF23ky2 rhFLrffKBDLRJojNoitlejwAByOxNsLb4G/JGRD2yPuA+2ZehmpsXaOImQJB2EHQGf NjZg8KKpn+ixLw7hgf/I/NH5ytLhsBoG2MtwGWNY= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Baoquan He , kexec@lists.infradead.org, keyrings@vger.kernel.org, linux-security-module@vger.kernel.org, Michal Suchanek , Will Deacon , Coiby Xu , Mimi Zohar Subject: [PATCH 5.15 14/14] arm64: kexec_file: use more system keyrings to verify kernel image signature Date: Fri, 19 Aug 2022 17:40:30 +0200 Message-Id: <20220819153712.144781772@linuxfoundation.org> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20220819153711.658766010@linuxfoundation.org> References: <20220819153711.658766010@linuxfoundation.org> User-Agent: quilt/0.67 MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" From: Coiby Xu commit 0d519cadf75184a24313568e7f489a7fc9b1be3b upstream. Currently, when loading a kernel image via the kexec_file_load() system call, arm64 can only use the .builtin_trusted_keys keyring to verify a signature, whereas x86 can use three more keyrings, i.e. the .secondary_trusted_keys, .machine and .platform keyrings. For example, one resulting problem is that kexec'ing a kernel image would be rejected with the error "Lockdown: kexec: kexec of unsigned images is restricted; see man kernel_lockdown.7". This patch set enables arm64 to make use of the same keyrings as x86 to verify the signature of a kexec'ed kernel image. 
Fixes: 732b7b93d849 ("arm64: kexec_file: add kernel signature verification = support") Cc: stable@vger.kernel.org # 105e10e2cf1c: kexec_file: drop weak attribute = from functions Cc: stable@vger.kernel.org # 34d5960af253: kexec: clean up arch_kexec_kerne= l_verify_sig Cc: stable@vger.kernel.org # 83b7bb2d49ae: kexec, KEYS: make the code in bz= Image64_verify_sig generic Acked-by: Baoquan He Cc: kexec@lists.infradead.org Cc: keyrings@vger.kernel.org Cc: linux-security-module@vger.kernel.org Co-developed-by: Michal Suchanek Signed-off-by: Michal Suchanek Acked-by: Will Deacon Signed-off-by: Coiby Xu Signed-off-by: Mimi Zohar Signed-off-by: Greg Kroah-Hartman Tested-by: Bagas Sanjaya Tested-by: Ron Economos Tested-by: Shuah Khan Tested-by: Sudip Mukherjee --- arch/arm64/kernel/kexec_image.c | 11 +---------- 1 file changed, 1 insertion(+), 10 deletions(-) --- a/arch/arm64/kernel/kexec_image.c +++ b/arch/arm64/kernel/kexec_image.c @@ -14,7 +14,6 @@ #include #include #include -#include #include #include #include @@ -130,18 +129,10 @@ static void *image_load(struct kimage *i return NULL; } =20 -#ifdef CONFIG_KEXEC_IMAGE_VERIFY_SIG -static int image_verify_sig(const char *kernel, unsigned long kernel_len) -{ - return verify_pefile_signature(kernel, kernel_len, NULL, - VERIFYING_KEXEC_PE_SIGNATURE); -} -#endif - const struct kexec_file_ops kexec_image_ops =3D { .probe =3D image_probe, .load =3D image_load, #ifdef CONFIG_KEXEC_IMAGE_VERIFY_SIG - .verify_sig =3D image_verify_sig, + .verify_sig =3D kexec_kernel_verify_pe_sig, #endif };