From nobody Mon Feb 9 06:24:39 2026
Date: Fri, 7 Jul 2023 20:19:03 +0000
In-Reply-To: <20230707201904.953262-1-jiaqiyan@google.com>
References: <20230707201904.953262-1-jiaqiyan@google.com>
Message-ID: <20230707201904.953262-4-jiaqiyan@google.com>
Subject: [PATCH v3 3/4] hugetlbfs: improve read HWPOISON hugepage
From: Jiaqi Yan
To: akpm@linux-foundation.org, mike.kravetz@oracle.com, naoya.horiguchi@nec.com
Cc: songmuchun@bytedance.com, shy828301@gmail.com, linmiaohe@huawei.com,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, duenwen@google.com,
 axelrasmussen@google.com, jthoughton@google.com, Jiaqi Yan
X-Mailer: git-send-email 2.41.0.255.g8b1d071c50-goog
X-Mailing-List: linux-kernel@vger.kernel.org
Mime-Version: 1.0
Content-Type: text/plain; charset="utf-8"

When a hugepage contains HWPOISON pages, read() fails to read any byte
of the hugepage and returns -EIO, although many bytes in the HWPOISON
hugepage are readable.

Improve this by allowing hugetlbfs_read_iter to return as many bytes as
possible.
For a requested range [offset, offset + len) that contains a HWPOISON
page, return [offset, first HWPOISON page addr); the next read attempt
will fail and return -EIO.

Reviewed-by: Mike Kravetz
Reviewed-by: Naoya Horiguchi
Signed-off-by: Jiaqi Yan
---
 fs/hugetlbfs/inode.c | 58 +++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 52 insertions(+), 6 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 7b17ccfa039d..c2b807d37f85 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -282,6 +282,42 @@ hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
 }
 #endif
 
+/*
+ * Someone wants to read @bytes from a HWPOISON hugetlb @page from @offset.
+ * Returns the maximum number of bytes one can read without touching the 1st raw
+ * HWPOISON subpage.
+ *
+ * The implementation borrows the iteration logic from copy_page_to_iter*.
+ */
+static size_t adjust_range_hwpoison(struct page *page, size_t offset, size_t bytes)
+{
+	size_t n = 0;
+	size_t res = 0;
+	struct folio *folio = page_folio(page);
+
+	/* First subpage to start the loop. */
+	page += offset / PAGE_SIZE;
+	offset %= PAGE_SIZE;
+	while (1) {
+		if (is_raw_hwp_subpage(folio, page))
+			break;
+
+		/* Safe to read n bytes without touching HWPOISON subpage. */
+		n = min(bytes, (size_t)PAGE_SIZE - offset);
+		res += n;
+		bytes -= n;
+		if (!bytes || !n)
+			break;
+		offset += n;
+		if (offset == PAGE_SIZE) {
+			page++;
+			offset = 0;
+		}
+	}
+
+	return res;
+}
+
 /*
  * Support for read() - Find the page attached to f_mapping and copy out the
  * data. This provides functionality similar to filemap_read().
@@ -300,7 +336,7 @@ static ssize_t hugetlbfs_read_iter(struct kiocb *iocb, struct iov_iter *to)
 
 	while (iov_iter_count(to)) {
 		struct page *page;
-		size_t nr, copied;
+		size_t nr, copied, want;
 
 		/* nr is the maximum number of bytes to copy from this page */
 		nr = huge_page_size(h);
@@ -328,16 +364,26 @@ static ssize_t hugetlbfs_read_iter(struct kiocb *iocb, struct iov_iter *to)
 		} else {
 			unlock_page(page);
 
-			if (PageHWPoison(page)) {
-				put_page(page);
-				retval = -EIO;
-				break;
+			if (!PageHWPoison(page))
+				want = nr;
+			else {
+				/*
+				 * Adjust how many bytes safe to read without
+				 * touching the 1st raw HWPOISON subpage after
+				 * offset.
+				 */
+				want = adjust_range_hwpoison(page, offset, nr);
+				if (want == 0) {
+					put_page(page);
+					retval = -EIO;
+					break;
+				}
 			}
 
 			/*
 			 * We have the page, copy it to user space buffer.
 			 */
-			copied = copy_page_to_iter(page, offset, nr, to);
+			copied = copy_page_to_iter(page, offset, want, to);
 			put_page(page);
 		}
 		offset += copied;
-- 
2.41.0.255.g8b1d071c50-goog
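
Not part of the patch: for anyone who wants to check the range arithmetic
in adjust_range_hwpoison() without a poisoned hugepage at hand, below is a
minimal userspace sketch of the same walk. Everything in it is an
assumption made for illustration: model_adjust_range(), MODEL_PAGE_SIZE
and the poisoned[] array are hypothetical names, the array stands in for
is_raw_hwp_subpage(), and a 4096-byte base page is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed base page size for the model; the kernel uses PAGE_SIZE. */
#define MODEL_PAGE_SIZE ((size_t)4096)

/*
 * Userspace model (hypothetical, not kernel code) of the walk done by
 * adjust_range_hwpoison(). poisoned[i] marks subpage i as raw HWPOISON
 * and stands in for is_raw_hwp_subpage(). Returns how many of @bytes,
 * starting at @offset, can be read before the first poisoned subpage.
 */
size_t model_adjust_range(const bool *poisoned, size_t nr_subpages,
			  size_t offset, size_t bytes)
{
	size_t idx = offset / MODEL_PAGE_SIZE;	/* first subpage to check */
	size_t res = 0;

	/* The caller must keep the request inside the hugepage. */
	assert(offset + bytes <= nr_subpages * MODEL_PAGE_SIZE);

	offset %= MODEL_PAGE_SIZE;
	while (bytes && !poisoned[idx]) {
		/* Bytes left in this subpage, capped by the request. */
		size_t room = MODEL_PAGE_SIZE - offset;
		size_t n = bytes < room ? bytes : room;

		res += n;
		bytes -= n;
		offset = 0;	/* later subpages are read from their start */
		idx++;
	}

	return res;
}
```

With four subpages and subpage 2 marked poisoned, a request starting at
offset 100 for the rest of the range yields 8092 bytes (3996 from subpage
0 plus 4096 from subpage 1). A follow-up request starting at offset 8192
yields 0, which is the case the patch maps to -EIO in
hugetlbfs_read_iter().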