From nobody Tue Apr 7 01:33:42 2026
From: "Kanchana P. Sridhar"
To: hannes@cmpxchg.org, yosry@kernel.org, nphamcs@gmail.com,
 chengming.zhou@linux.dev, akpm@linux-foundation.org,
 kanchanapsridhar2026@gmail.com, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Cc: herbert@gondor.apana.org.au, senozhatsky@chromium.org
Subject: [PATCH v2 1/2] mm: zswap: Remove redundant checks in zswap_cpu_comp_dead().
Date: Mon, 16 Mar 2026 18:48:01 -0700
Message-Id: <20260317014802.27591-2-kanchanapsridhar2026@gmail.com>
X-Mailer: git-send-email 2.39.5
In-Reply-To: <20260317014802.27591-1-kanchanapsridhar2026@gmail.com>
References: <20260317014802.27591-1-kanchanapsridhar2026@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

zswap_cpu_comp_dead() has redundant checks on the per-CPU acomp_ctx and
its "req" member. They are redundant because they are inconsistent with
how zswap_pool_create() handles a failure to allocate the acomp_ctx, and
with the NULL that acomp_request_alloc() returns when it fails to
allocate an acomp_req.

Fix these by converting them to NULL checks.

Add comments in zswap_cpu_comp_prepare() clarifying the expected return
values of the crypto_alloc_acomp_node() and acomp_request_alloc() APIs.

Suggested-by: Yosry Ahmed
Signed-off-by: Kanchana P. Sridhar
Acked-by: Yosry Ahmed
---
 mm/zswap.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index bdd24430f6ff..8ac38f1d0469 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -749,6 +749,10 @@ static int zswap_cpu_comp_prepare(unsigned int cpu, struct hlist_node *node)
 		goto fail;
 	}
 
+	/*
+	 * In case of an error, crypto_alloc_acomp_node() returns an
+	 * error pointer, never NULL.
+	 */
 	acomp = crypto_alloc_acomp_node(pool->tfm_name, 0, 0, cpu_to_node(cpu));
 	if (IS_ERR(acomp)) {
 		pr_err("could not alloc crypto acomp %s : %pe\n",
@@ -757,6 +761,7 @@ static int zswap_cpu_comp_prepare(unsigned int cpu, struct hlist_node *node)
 		goto fail;
 	}
 
+	/* acomp_request_alloc() returns NULL in case of an error. */
 	req = acomp_request_alloc(acomp);
 	if (!req) {
 		pr_err("could not alloc crypto acomp_request %s\n",
@@ -802,7 +807,7 @@ static int zswap_cpu_comp_dead(unsigned int cpu, struct hlist_node *node)
 	struct crypto_acomp *acomp;
 	u8 *buffer;
 
-	if (IS_ERR_OR_NULL(acomp_ctx))
+	if (!acomp_ctx)
 		return 0;
 
 	mutex_lock(&acomp_ctx->mutex);
@@ -817,8 +822,11 @@ static int zswap_cpu_comp_dead(unsigned int cpu, struct hlist_node *node)
 	/*
 	 * Do the actual freeing after releasing the mutex to avoid subtle
 	 * locking dependencies causing deadlocks.
+	 *
+	 * If there was an error in allocating @acomp_ctx->req, it
+	 * would be set to NULL.
 	 */
-	if (!IS_ERR_OR_NULL(req))
+	if (req)
 		acomp_request_free(req);
 	if (!IS_ERR_OR_NULL(acomp))
 		crypto_free_acomp(acomp);
-- 
2.39.5

From nobody Tue Apr 7 01:33:42 2026
From: "Kanchana P. Sridhar"
To: hannes@cmpxchg.org, yosry@kernel.org, nphamcs@gmail.com,
 chengming.zhou@linux.dev, akpm@linux-foundation.org,
 kanchanapsridhar2026@gmail.com, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Cc: herbert@gondor.apana.org.au, senozhatsky@chromium.org
Subject: [PATCH v2 2/2] mm: zswap: Tie per-CPU acomp_ctx lifetime to the pool.
Date: Mon, 16 Mar 2026 18:48:02 -0700
Message-Id: <20260317014802.27591-3-kanchanapsridhar2026@gmail.com>
X-Mailer: git-send-email 2.39.5
In-Reply-To: <20260317014802.27591-1-kanchanapsridhar2026@gmail.com>
References: <20260317014802.27591-1-kanchanapsridhar2026@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Currently, per-CPU acomp_ctx are allocated on pool creation and/or CPU
hotplug, and destroyed on pool destruction or CPU hotunplug. This
complicates lifetime management in order to save memory while a CPU is
offlined, which is not very common.

Simplify lifetime management by allocating per-CPU acomp_ctx once on
pool creation (or CPU hotplug for CPUs onlined later), and keeping them
allocated until the pool is destroyed.

Refactor cleanup code from zswap_cpu_comp_dead() into acomp_ctx_free()
to be used elsewhere.
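
To illustrate the lifetime model described above, here is a minimal
userspace sketch in C. It is not kernel code and not the patch's
implementation: struct ctx_model, struct pool_model, ctx_prepare(),
ctx_free(), pool_destroy() and lifetime_demo() are hypothetical names,
a fixed-size array stands in for the zero-initialized per-CPU
allocation, and malloc()/free() stand in for the kernel allocators.

```c
#include <assert.h>
#include <stdlib.h>

#define NR_CPUS_MODEL  4	/* hypothetical CPU count for the model */
#define BUF_SIZE_MODEL 4096	/* stands in for PAGE_SIZE */

/* Hypothetical stand-in for struct crypto_acomp_ctx. */
struct ctx_model {
	void *buffer;
	int initialized;
};

/* Hypothetical stand-in for the pool and its per-CPU contexts. */
struct pool_model {
	struct ctx_model ctx[NR_CPUS_MODEL];
};

/* Startup callback model: idempotent, so a CPU coming back online is a no-op. */
static int ctx_prepare(struct pool_model *pool, int cpu)
{
	struct ctx_model *ctx = &pool->ctx[cpu];

	if (ctx->initialized)
		return 0;

	ctx->buffer = malloc(BUF_SIZE_MODEL);
	if (!ctx->buffer)
		return -1;

	ctx->initialized = 1;
	return 0;
}

/* Free model: safe on never-prepared and on already-freed contexts. */
static void ctx_free(struct ctx_model *ctx)
{
	free(ctx->buffer);	/* free(NULL) is a no-op */
	ctx->buffer = NULL;
	ctx->initialized = 0;
}

/* Resources are released only here, when the pool itself goes away. */
static void pool_destroy(struct pool_model *pool)
{
	for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
		ctx_free(&pool->ctx[cpu]);
}

/* Walks one CPU through online, offline, online, then pool destruction. */
static int lifetime_demo(void)
{
	struct pool_model pool = { 0 };	/* zero-initialized, as the model relies on */
	void *first;

	assert(ctx_prepare(&pool, 1) == 0);
	first = pool.ctx[1].buffer;

	/* Offline: nothing runs, since the model has no teardown callback. */

	/* Online again: prepare finds the context initialized and reuses it. */
	assert(ctx_prepare(&pool, 1) == 0);
	assert(pool.ctx[1].buffer == first);

	pool_destroy(&pool);
	assert(pool.ctx[1].buffer == NULL);

	pool_destroy(&pool);	/* a second call is harmless */
	return 0;
}
```

Calling lifetime_demo() from a main() exercises the sequence. The point
of the sketch is only that, once prepare is idempotent and free is safe
to repeat, no teardown step is needed between offline and online.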

The main benefit of using the CPU hotplug multi state instance startup
callback to allocate the acomp_ctx resources is that it prevents the
cores from being offlined until the multi state instance addition call
returns.

  From Documentation/core-api/cpu_hotplug.rst:

    "The node list add/remove operations and the callback invocations are
     serialized against CPU hotplug operations."

Furthermore, zswap_[de]compress() cannot contend with
zswap_cpu_comp_prepare() because:

  - During pool creation/deletion, the pool is not in the zswap_pools
    list.

  - During CPU hot[un]plug, the CPU is not yet online, as Yosry pointed
    out. zswap_cpu_comp_prepare() will be run on a control CPU,
    since CPUHP_MM_ZSWP_POOL_PREPARE is in the PREPARE section of "enum
    cpuhp_state".

  In both these cases, any recursions into zswap reclaim from
  zswap_cpu_comp_prepare() will be handled by the old pool.

The above two observations enable the following simplifications:

 1) zswap_cpu_comp_prepare():

    a) acomp_ctx mutex locking:

       If the process gets migrated while zswap_cpu_comp_prepare() is
       running, it will complete on the new CPU. In case of failures, we
       pass the acomp_ctx pointer obtained at the start of
       zswap_cpu_comp_prepare() to acomp_ctx_free(), which again, can
       only undergo migration. There appear to be no contention
       scenarios that might cause inconsistent values of acomp_ctx's
       members. Hence, it seems there is no need for
       mutex_lock(&acomp_ctx->mutex) in zswap_cpu_comp_prepare().

    b) acomp_ctx mutex initialization:

       Since the pool is not yet on zswap_pools list, we don't need to
       initialize the per-CPU acomp_ctx mutex in
       zswap_pool_create(). This has been restored to occur in
       zswap_cpu_comp_prepare().

    c) Subsequent CPU offline-online transitions:

       zswap_cpu_comp_prepare() checks upfront if acomp_ctx->acomp is
       valid. If so, it returns success. This should handle any CPU
       hotplug online-offline transitions after pool creation is done.

 2) CPU offline vis-a-vis zswap ops:

    Let's suppose the process is migrated to another CPU before the
    current CPU is dysfunctional. If zswap_[de]compress() holds the
    acomp_ctx->mutex lock of the offlined CPU, that mutex will be
    released once it completes on the new CPU. Since there is no
    teardown callback, there is no possibility of UAF.

 3) Pool creation/deletion and process migration to another CPU:

    During pool creation/deletion, the pool is not in the zswap_pools
    list. Hence it cannot contend with zswap ops on that CPU. However,
    the process can get migrated.

    a) Pool creation --> zswap_cpu_comp_prepare()
                     --> process migrated:
                         * Old CPU offline: no-op.
                         * zswap_cpu_comp_prepare() continues to run on
                           the new CPU to finish allocating acomp_ctx
                           resources for the offlined CPU.

    b) Pool deletion --> acomp_ctx_free()
                     --> process migrated:
                         * Old CPU offline: no-op.
                         * acomp_ctx_free() continues to run on the new
                           CPU to finish de-allocating acomp_ctx
                           resources for the offlined CPU.

 4) Pool deletion vis-a-vis CPU onlining:

    The call to cpuhp_state_remove_instance() cannot race with
    zswap_cpu_comp_prepare() because of hotplug synchronization.

The current acomp_ctx_get_cpu_lock()/acomp_ctx_put_unlock() are
deleted. Instead, zswap_[de]compress() directly call
mutex_[un]lock(&acomp_ctx->mutex).

The per-CPU memory cost of not deleting the acomp_ctx resources upon CPU
offlining, and only deleting them when the pool is destroyed, is 8.28 KB
on x86_64. This cost is only paid when a CPU is offlined, until it is
onlined again.

Co-developed-by: Kanchana P. Sridhar
Signed-off-by: Kanchana P. Sridhar
Signed-off-by: Kanchana P Sridhar
Acked-by Yosry, with a minor change requested [2] that is addressed by
Acked-by: Yosry Ahmed
---
 mm/zswap.c | 180 ++++++++++++++++++++++++-----------------------------
 1 file changed, 80 insertions(+), 100 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index 8ac38f1d0469..6bdd2ed7d697 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -242,6 +242,34 @@ static inline struct xarray *swap_zswap_tree(swp_entry_t swp)
 **********************************/
 static void __zswap_pool_empty(struct percpu_ref *ref);
 
+static void acomp_ctx_free(struct crypto_acomp_ctx *acomp_ctx)
+{
+	if (!acomp_ctx)
+		return;
+
+	/*
+	 * If there was an error in allocating @acomp_ctx->req, it
+	 * would be set to NULL.
+	 */
+	if (acomp_ctx->req)
+		acomp_request_free(acomp_ctx->req);
+
+	acomp_ctx->req = NULL;
+
+	/*
+	 * We have to handle both cases here: an error pointer return from
+	 * crypto_alloc_acomp_node(); and a) NULL initialization by zswap, or
+	 * b) NULL assignment done in a previous call to acomp_ctx_free().
+	 */
+	if (!IS_ERR_OR_NULL(acomp_ctx->acomp))
+		crypto_free_acomp(acomp_ctx->acomp);
+
+	acomp_ctx->acomp = NULL;
+
+	kfree(acomp_ctx->buffer);
+	acomp_ctx->buffer = NULL;
+}
+
 static struct zswap_pool *zswap_pool_create(char *compressor)
 {
 	struct zswap_pool *pool;
@@ -263,19 +291,27 @@ static struct zswap_pool *zswap_pool_create(char *compressor)
 
 	strscpy(pool->tfm_name, compressor, sizeof(pool->tfm_name));
 
-	pool->acomp_ctx = alloc_percpu(*pool->acomp_ctx);
+	/* Many things rely on the zero-initialization. */
+	pool->acomp_ctx = alloc_percpu_gfp(*pool->acomp_ctx,
+					   GFP_KERNEL | __GFP_ZERO);
 	if (!pool->acomp_ctx) {
 		pr_err("percpu alloc failed\n");
 		goto error;
 	}
 
-	for_each_possible_cpu(cpu)
-		mutex_init(&per_cpu_ptr(pool->acomp_ctx, cpu)->mutex);
-
+	/*
+	 * This is serialized against CPU hotplug operations. Hence, cores
+	 * cannot be offlined until this finishes.
+	 */
 	ret = cpuhp_state_add_instance(CPUHP_MM_ZSWP_POOL_PREPARE,
 				       &pool->node);
+
+	/*
+	 * cpuhp_state_add_instance() will not cleanup on failure since
+	 * we don't register a hotunplug callback.
+	 */
 	if (ret)
-		goto error;
+		goto cpuhp_add_fail;
 
 	/* being the current pool takes 1 ref; this func expects the
 	 * caller to always add the new pool as the current pool
@@ -292,6 +328,10 @@ static struct zswap_pool *zswap_pool_create(char *compressor)
 
 ref_fail:
 	cpuhp_state_remove_instance(CPUHP_MM_ZSWP_POOL_PREPARE, &pool->node);
+
+cpuhp_add_fail:
+	for_each_possible_cpu(cpu)
+		acomp_ctx_free(per_cpu_ptr(pool->acomp_ctx, cpu));
 error:
 	if (pool->acomp_ctx)
 		free_percpu(pool->acomp_ctx);
@@ -322,9 +362,15 @@ static struct zswap_pool *__zswap_pool_create_fallback(void)
 
 static void zswap_pool_destroy(struct zswap_pool *pool)
 {
+	int cpu;
+
 	zswap_pool_debug("destroying", pool);
 
 	cpuhp_state_remove_instance(CPUHP_MM_ZSWP_POOL_PREPARE, &pool->node);
+
+	for_each_possible_cpu(cpu)
+		acomp_ctx_free(per_cpu_ptr(pool->acomp_ctx, cpu));
+
 	free_percpu(pool->acomp_ctx);
 
 	zs_destroy_pool(pool->zs_pool);
@@ -738,44 +784,41 @@ static int zswap_cpu_comp_prepare(unsigned int cpu, struct hlist_node *node)
 {
 	struct zswap_pool *pool = hlist_entry(node, struct zswap_pool, node);
 	struct crypto_acomp_ctx *acomp_ctx = per_cpu_ptr(pool->acomp_ctx, cpu);
-	struct crypto_acomp *acomp = NULL;
-	struct acomp_req *req = NULL;
-	u8 *buffer = NULL;
-	int ret;
+	int ret = -ENOMEM;
 
-	buffer = kmalloc_node(PAGE_SIZE, GFP_KERNEL, cpu_to_node(cpu));
-	if (!buffer) {
-		ret = -ENOMEM;
-		goto fail;
+	/*
+	 * To handle cases where the CPU goes through online-offline-online
+	 * transitions, we return if the acomp_ctx has already been initialized.
+	 */
+	if (acomp_ctx->acomp) {
+		WARN_ON_ONCE(IS_ERR(acomp_ctx->acomp));
+		return 0;
 	}
 
+	acomp_ctx->buffer = kmalloc_node(PAGE_SIZE, GFP_KERNEL, cpu_to_node(cpu));
+	if (!acomp_ctx->buffer)
+		return ret;
+
 	/*
 	 * In case of an error, crypto_alloc_acomp_node() returns an
 	 * error pointer, never NULL.
 	 */
-	acomp = crypto_alloc_acomp_node(pool->tfm_name, 0, 0, cpu_to_node(cpu));
-	if (IS_ERR(acomp)) {
+	acomp_ctx->acomp = crypto_alloc_acomp_node(pool->tfm_name, 0, 0, cpu_to_node(cpu));
+	if (IS_ERR(acomp_ctx->acomp)) {
 		pr_err("could not alloc crypto acomp %s : %pe\n",
-		       pool->tfm_name, acomp);
-		ret = PTR_ERR(acomp);
+		       pool->tfm_name, acomp_ctx->acomp);
+		ret = PTR_ERR(acomp_ctx->acomp);
 		goto fail;
 	}
 
 	/* acomp_request_alloc() returns NULL in case of an error. */
-	req = acomp_request_alloc(acomp);
-	if (!req) {
+	acomp_ctx->req = acomp_request_alloc(acomp_ctx->acomp);
+	if (!acomp_ctx->req) {
 		pr_err("could not alloc crypto acomp_request %s\n",
 		       pool->tfm_name);
-		ret = -ENOMEM;
 		goto fail;
 	}
 
-	/*
-	 * Only hold the mutex after completing allocations, otherwise we may
-	 * recurse into zswap through reclaim and attempt to hold the mutex
-	 * again resulting in a deadlock.
-	 */
-	mutex_lock(&acomp_ctx->mutex);
 	crypto_init_wait(&acomp_ctx->wait);
 
 	/*
@@ -783,83 +826,17 @@ static int zswap_cpu_comp_prepare(unsigned int cpu, struct hlist_node *node)
 	 * crypto_wait_req(); if the backend of acomp is scomp, the callback
 	 * won't be called, crypto_wait_req() will return without blocking.
 	 */
-	acomp_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
+	acomp_request_set_callback(acomp_ctx->req, CRYPTO_TFM_REQ_MAY_BACKLOG,
 				   crypto_req_done, &acomp_ctx->wait);
 
-	acomp_ctx->buffer = buffer;
-	acomp_ctx->acomp = acomp;
-	acomp_ctx->req = req;
-	mutex_unlock(&acomp_ctx->mutex);
+	mutex_init(&acomp_ctx->mutex);
 	return 0;
 
 fail:
-	if (!IS_ERR_OR_NULL(acomp))
-		crypto_free_acomp(acomp);
-	kfree(buffer);
+	acomp_ctx_free(acomp_ctx);
 	return ret;
 }
 
-static int zswap_cpu_comp_dead(unsigned int cpu, struct hlist_node *node)
-{
-	struct zswap_pool *pool = hlist_entry(node, struct zswap_pool, node);
-	struct crypto_acomp_ctx *acomp_ctx = per_cpu_ptr(pool->acomp_ctx, cpu);
-	struct acomp_req *req;
-	struct crypto_acomp *acomp;
-	u8 *buffer;
-
-	if (!acomp_ctx)
-		return 0;
-
-	mutex_lock(&acomp_ctx->mutex);
-	req = acomp_ctx->req;
-	acomp = acomp_ctx->acomp;
-	buffer = acomp_ctx->buffer;
-	acomp_ctx->req = NULL;
-	acomp_ctx->acomp = NULL;
-	acomp_ctx->buffer = NULL;
-	mutex_unlock(&acomp_ctx->mutex);
-
-	/*
-	 * Do the actual freeing after releasing the mutex to avoid subtle
-	 * locking dependencies causing deadlocks.
-	 *
-	 * If there was an error in allocating @acomp_ctx->req, it
-	 * would be set to NULL.
-	 */
-	if (req)
-		acomp_request_free(req);
-	if (!IS_ERR_OR_NULL(acomp))
-		crypto_free_acomp(acomp);
-	kfree(buffer);
-
-	return 0;
-}
-
-static struct crypto_acomp_ctx *acomp_ctx_get_cpu_lock(struct zswap_pool *pool)
-{
-	struct crypto_acomp_ctx *acomp_ctx;
-
-	for (;;) {
-		acomp_ctx = raw_cpu_ptr(pool->acomp_ctx);
-		mutex_lock(&acomp_ctx->mutex);
-		if (likely(acomp_ctx->req))
-			return acomp_ctx;
-		/*
-		 * It is possible that we were migrated to a different CPU after
-		 * getting the per-CPU ctx but before the mutex was acquired. If
-		 * the old CPU got offlined, zswap_cpu_comp_dead() could have
-		 * already freed ctx->req (among other things) and set it to
-		 * NULL. Just try again on the new CPU that we ended up on.
-		 */
-		mutex_unlock(&acomp_ctx->mutex);
-	}
-}
-
-static void acomp_ctx_put_unlock(struct crypto_acomp_ctx *acomp_ctx)
-{
-	mutex_unlock(&acomp_ctx->mutex);
-}
-
 static bool zswap_compress(struct page *page, struct zswap_entry *entry,
 			   struct zswap_pool *pool)
 {
@@ -872,7 +849,9 @@ static bool zswap_compress(struct page *page, struct zswap_entry *entry,
 	u8 *dst;
 	bool mapped = false;
 
-	acomp_ctx = acomp_ctx_get_cpu_lock(pool);
+	acomp_ctx = raw_cpu_ptr(pool->acomp_ctx);
+	mutex_lock(&acomp_ctx->mutex);
+
 	dst = acomp_ctx->buffer;
 	sg_init_table(&input, 1);
 	sg_set_page(&input, page, PAGE_SIZE, 0);
@@ -938,7 +917,7 @@ static bool zswap_compress(struct page *page, struct zswap_entry *entry,
 	else if (alloc_ret)
 		zswap_reject_alloc_fail++;
 
-	acomp_ctx_put_unlock(acomp_ctx);
+	mutex_unlock(&acomp_ctx->mutex);
 	return comp_ret == 0 && alloc_ret == 0;
 }
 
@@ -950,7 +929,8 @@ static bool zswap_decompress(struct zswap_entry *entry, struct folio *folio)
 	struct crypto_acomp_ctx *acomp_ctx;
 	int ret = 0, dlen;
 
-	acomp_ctx = acomp_ctx_get_cpu_lock(pool);
+	acomp_ctx = raw_cpu_ptr(pool->acomp_ctx);
+	mutex_lock(&acomp_ctx->mutex);
 	zs_obj_read_sg_begin(pool->zs_pool, entry->handle, input, entry->length);
 
 	/* zswap entries of length PAGE_SIZE are not compressed. */
@@ -969,7 +949,7 @@ static bool zswap_decompress(struct zswap_entry *entry, struct folio *folio)
 	}
 
 	zs_obj_read_sg_end(pool->zs_pool, entry->handle);
-	acomp_ctx_put_unlock(acomp_ctx);
+	mutex_unlock(&acomp_ctx->mutex);
 
 	if (!ret && dlen == PAGE_SIZE)
 		return true;
@@ -1789,7 +1769,7 @@ static int zswap_setup(void)
 	ret = cpuhp_setup_state_multi(CPUHP_MM_ZSWP_POOL_PREPARE,
 				      "mm/zswap_pool:prepare",
 				      zswap_cpu_comp_prepare,
-				      zswap_cpu_comp_dead);
+				      NULL);
 	if (ret)
 		goto hp_fail;
 
-- 
2.39.5
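
For readers less familiar with the two failure conventions that
acomp_ctx_free() has to handle, the following is a small userspace
model in C of the kernel's error-pointer helpers. It is a sketch under
stated assumptions, not the kernel's code: the *_model names and
errptr_demo() are hypothetical, and they only mirror what ERR_PTR(),
PTR_ERR(), IS_ERR() and IS_ERR_OR_NULL() do (a negative errno of at
most 4095 in magnitude is stored in the topmost pointer values).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO_MODEL 4095	/* mirrors the kernel's MAX_ERRNO */

/* Model of ERR_PTR(): encode a negative errno as a pointer value. */
static inline void *err_ptr_model(long error)
{
	return (void *)(intptr_t)error;
}

/* Model of PTR_ERR(): recover the errno from an error pointer. */
static inline long ptr_err_model(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Model of IS_ERR(): true only for the top MAX_ERRNO_MODEL pointer values. */
static inline int is_err_model(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO_MODEL;
}

/* Model of IS_ERR_OR_NULL(): covers both failure conventions at once. */
static inline int is_err_or_null_model(const void *ptr)
{
	return !ptr || is_err_model(ptr);
}

static int errptr_demo(void)
{
	int object;
	void *ok = &object;
	void *err = err_ptr_model(-ENOMEM);

	/* An allocator that fails with an error pointer never returns NULL. */
	assert(is_err_model(err));
	assert(ptr_err_model(err) == -ENOMEM);

	/* NULL is not an error pointer, so a NULL-returning API needs "!ptr". */
	assert(!is_err_model(NULL));
	assert(is_err_or_null_model(NULL));

	/* A valid object passes both checks. */
	assert(!is_err_model(ok));
	assert(!is_err_or_null_model(ok));
	return 0;
}
```

Under this model, a plain NULL check is enough for a field that can
only be NULL or valid, while a field that may hold either NULL or an
error pointer needs the combined check before it is freed.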