From nobody Mon Sep 15 21:48:41 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 723ABC678D9 for ; Mon, 9 Jan 2023 21:40:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238126AbjAIVkZ (ORCPT ); Mon, 9 Jan 2023 16:40:25 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50088 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238119AbjAIVjl (ORCPT ); Mon, 9 Jan 2023 16:39:41 -0500 Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 20C7D19C35 for ; Mon, 9 Jan 2023 13:38:25 -0800 (PST) Received: by mail-yb1-xb49.google.com with SMTP id l194-20020a2525cb000000b007b411fbdc13so10521428ybl.23 for ; Mon, 09 Jan 2023 13:38:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=ZzZnon9AuhDOmqXXwy+4pPV98gk0DXyzNf2lwwH3FVc=; b=MIKhBBBnQ+bqQnJww8/fPsZhi0whxCGIUG6kEGUGMp4FOf1NaJ2uHO5gbQnelxNiQA /zZow36TcojzDwPLoUOvG98yay8W+H/pD2HMrI1l93vbQq7yL0EPGLWZQ0R6+VY7DRtV JNWPRrorFPERhG5pCoV73P8K4BeUX6WEsXjfyFLgqrSJKvyOp3o0siNMo696W/r6yKXU y8RJ2vfZCTDjh6Oi2w1rHjBj4wQlmL0t+ZRPOFSxOTrZS7vGovErpCiunYz6Rey8BJDi cncB+zuYtfJ9rQv3FhWwefPz3sWwVINattaPynPAVTeC+nbR6FJw+i6h1qaddB0J7iUL VYgw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=ZzZnon9AuhDOmqXXwy+4pPV98gk0DXyzNf2lwwH3FVc=; b=xYacJHP40bNwZZ9twnGYb5PCc8dpJkchSlDL1xF2i5PTJ7ZHxpwYyqDWxJRQq6uWg2 
Yy6TKQ44ZPUlsJXDr0kkEUQd9jqEbXIXx+NoUzjetoz8srSk6LYmLzEYDBGcBjJVls3u JRrs7qAkURMiStp29GN6Z5Izkg/zAwKJ0Sx8Qv3+4N3OjDEYYtnAOt041OfNibz3IT1a vmPgeKDtvJ1B9rN79BM7ubUSrPFASFxKeB1zCw/2blzF09500N76+PqpPUkBAnE0fNTu 7Mhb4dLkeFMcgizGYXP0C0V6KGuaJ/5OYXZnL+guHdKfudoyk4U3zmQ/pr8hoDqdqTF/ QSow== X-Gm-Message-State: AFqh2kpb1hLG0CF0ZIBaiK0HRHfIDnH+sbYs3nntm7gSBlERwEoVTEj/ OcEgeZEmKZA22A3zFYVQ1gUcx9nNUN1VVJ0= X-Google-Smtp-Source: AMrXdXtOrlHW8ygC4EeYcDKs1WQIoeE3W+TEcqNZ/tFdoHO00Cnd/PEn3XI5TMhGx2JhnY0yguaIzWNDLrFIktg= X-Received: from tj.c.googlers.com ([fda3:e722:ac3:cc00:20:ed76:c0a8:53a]) (user=tjmercier job=sendgmr) by 2002:a81:6784:0:b0:460:c029:6c76 with SMTP id b126-20020a816784000000b00460c0296c76mr2274300ywc.515.1673300304313; Mon, 09 Jan 2023 13:38:24 -0800 (PST) Date: Mon, 9 Jan 2023 21:38:04 +0000 In-Reply-To: <20230109213809.418135-1-tjmercier@google.com> Mime-Version: 1.0 References: <20230109213809.418135-1-tjmercier@google.com> X-Mailer: git-send-email 2.39.0.314.g84b9a713c41-goog Message-ID: <20230109213809.418135-2-tjmercier@google.com> Subject: [PATCH 1/4] memcg: Track exported dma-buffers From: "T.J. Mercier" To: tjmercier@google.com, Tejun Heo , Zefan Li , Johannes Weiner , Jonathan Corbet , Sumit Semwal , "=?UTF-8?q?Christian=20K=C3=B6nig?=" , Michal Hocko , Roman Gushchin , Shakeel Butt , Muchun Song , Andrew Morton Cc: daniel.vetter@ffwll.ch, android-mm@google.com, jstultz@google.com, cgroups@vger.kernel.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org, linaro-mm-sig@lists.linaro.org, linux-mm@kvack.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" When a buffer is exported to userspace, use memcg to attribute the buffer to the allocating cgroup until all buffer references are released. 
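For readers who want to observe the attribution from userspace, here is a minimal sketch of reading the new counter. It assumes this patch is applied, so that a "dmabuf" line (in bytes) appears in the cgroup's memory.stat, and it assumes the caller passes the path to that file (typically under a cgroup v2 mount such as /sys/fs/cgroup). The helper names sketch_parse_stat and sketch_read_dmabuf_bytes are made up for illustration and are not part of this series.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: scan "key value\n" formatted text (the layout
 * memory.stat uses) for an exact key at the start of a line.
 * Returns 0 and stores the value on success, -1 if the key is absent.
 */
int sketch_parse_stat(const char *text, const char *key,
		      unsigned long long *out)
{
	size_t klen = strlen(key);
	const char *line = text;

	while (line && *line) {
		/* Require a space after the key so "dmabuf" != "dmabuf_x". */
		if (strncmp(line, key, klen) == 0 && line[klen] == ' ') {
			*out = strtoull(line + klen + 1, NULL, 10);
			return 0;
		}
		line = strchr(line, '\n');
		if (line)
			line++;	/* step past the newline to the next entry */
	}
	return -1;
}

/*
 * Hypothetical helper: read a memory.stat file and extract the "dmabuf"
 * entry. The 8 KiB buffer is an assumption about the file's size; a
 * larger file would be truncated here.
 */
int sketch_read_dmabuf_bytes(const char *path, unsigned long long *out)
{
	char buf[8192];
	size_t n;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';
	return sketch_parse_stat(buf, "dmabuf", out);
}
```

On a kernel without this patch the key is simply missing, so the helpers return -1 rather than reporting zero.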
Unlike the dmabuf sysfs stats implementation, this memcg accounting avoids contention over the kernfs_rwsem incurred when creating or removing nodes. Signed-off-by: T.J. Mercier --- Documentation/admin-guide/cgroup-v2.rst | 4 ++++ drivers/dma-buf/dma-buf.c | 5 +++++ include/linux/dma-buf.h | 3 +++ include/linux/memcontrol.h | 1 + mm/memcontrol.c | 4 ++++ 5 files changed, 17 insertions(+) diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-= guide/cgroup-v2.rst index c8ae7c897f14..538ae22bc514 100644 --- a/Documentation/admin-guide/cgroup-v2.rst +++ b/Documentation/admin-guide/cgroup-v2.rst @@ -1455,6 +1455,10 @@ PAGE_SIZE multiple when read back. Amount of memory used for storing in-kernel data structures. =20 + dmabuf (npn) + Amount of memory used for exported DMA buffers allocated by the cgroup. + Stays with the allocating cgroup regardless of how the buffer is shared. + workingset_refault_anon Number of refaults of previously evicted anonymous pages. =20 diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index e6528767efc7..ac45dd101c4d 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -75,6 +75,8 @@ static void dma_buf_release(struct dentry *dentry) */ BUG_ON(dmabuf->cb_in.active || dmabuf->cb_out.active); =20 + mod_memcg_state(dmabuf->memcg, MEMCG_DMABUF, -dmabuf->size); + mem_cgroup_put(dmabuf->memcg); dma_buf_stats_teardown(dmabuf); dmabuf->ops->release(dmabuf); =20 @@ -673,6 +675,9 @@ struct dma_buf *dma_buf_export(const struct dma_buf_exp= ort_info *exp_info) if (ret) goto err_dmabuf; =20 + dmabuf->memcg =3D get_mem_cgroup_from_mm(current->mm); + mod_memcg_state(dmabuf->memcg, MEMCG_DMABUF, dmabuf->size); + file->private_data =3D dmabuf; file->f_path.dentry->d_fsdata =3D dmabuf; dmabuf->file =3D file; diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h index 6fa8d4e29719..1f0ffb8e4bf5 100644 --- a/include/linux/dma-buf.h +++ b/include/linux/dma-buf.h @@ -22,6 +22,7 @@ #include 
#include #include +#include =20 struct device; struct dma_buf; @@ -446,6 +447,8 @@ struct dma_buf { struct dma_buf *dmabuf; } *sysfs_entry; #endif + /* The cgroup to which this buffer is currently attributed */ + struct mem_cgroup *memcg; }; =20 /** diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index d3c8203cab6c..1c1da2da20a6 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -37,6 +37,7 @@ enum memcg_stat_item { MEMCG_KMEM, MEMCG_ZSWAP_B, MEMCG_ZSWAPPED, + MEMCG_DMABUF, MEMCG_NR_STAT, }; =20 diff --git a/mm/memcontrol.c b/mm/memcontrol.c index ab457f0394ab..680189bec7e0 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -1502,6 +1502,7 @@ static const struct memory_stat memory_stats[] =3D { { "unevictable", NR_UNEVICTABLE }, { "slab_reclaimable", NR_SLAB_RECLAIMABLE_B }, { "slab_unreclaimable", NR_SLAB_UNRECLAIMABLE_B }, + { "dmabuf", MEMCG_DMABUF }, =20 /* The memory events */ { "workingset_refault_anon", WORKINGSET_REFAULT_ANON }, @@ -1519,6 +1520,7 @@ static int memcg_page_state_unit(int item) switch (item) { case MEMCG_PERCPU_B: case MEMCG_ZSWAP_B: + case MEMCG_DMABUF: case NR_SLAB_RECLAIMABLE_B: case NR_SLAB_UNRECLAIMABLE_B: case WORKINGSET_REFAULT_ANON: @@ -4042,6 +4044,7 @@ static const unsigned int memcg1_stats[] =3D { WORKINGSET_REFAULT_ANON, WORKINGSET_REFAULT_FILE, MEMCG_SWAP, + MEMCG_DMABUF, }; =20 static const char *const memcg1_stat_names[] =3D { @@ -4057,6 +4060,7 @@ static const char *const memcg1_stat_names[] =3D { "workingset_refault_anon", "workingset_refault_file", "swap", + "dmabuf", }; =20 /* Universal VM events cgroup1 shows, original sort order */ --=20 2.39.0.314.g84b9a713c41-goog From nobody Mon Sep 15 21:48:41 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 42879C678D8 for ; Mon, 9 Jan 2023 21:40:34 
+0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238047AbjAIVkP (ORCPT ); Mon, 9 Jan 2023 16:40:15 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49864 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238132AbjAIVjp (ORCPT ); Mon, 9 Jan 2023 16:39:45 -0500 Received: from mail-yw1-x114a.google.com (mail-yw1-x114a.google.com [IPv6:2607:f8b0:4864:20::114a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D5B6E3BE87 for ; Mon, 9 Jan 2023 13:38:27 -0800 (PST) Received: by mail-yw1-x114a.google.com with SMTP id 00721157ae682-4755eb8a57bso106023707b3.12 for ; Mon, 09 Jan 2023 13:38:27 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=jsT6wq4yb7S+NdjklyWb1ngjMGzkBXwNX+64A3lfzpA=; b=fvDtcLaeN/uzP3qHT8MCEklfe9OhLf6lmImIfuCacJYjG57XiBued7S3DOodRE/TYJ zCKNDg3ctqd3Dd67SPwPhW8ysEt0WGDBME6NGDgBcM5YrM326jXvkSXkPWXqmZwxbxsz JlJKx2q3UL/4NVtdfwNCMCP7QlTJxXBQXsxWNIIGkTyzBjhTdHqnlRz7A6SrrXEGohYg Y1TskQ4j5YZqdyazr4xhq4xZW9Gypy/dhRHNo7P2nwjIq6XxOhFfSm8keT19RigyoUi+ Lj952kU65FcH0FMfMGjpPdMsLkeIqjmWXAgeBMVlEt6SVmi5tV7uui22G7kOc9eQRFhz T4Xw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=jsT6wq4yb7S+NdjklyWb1ngjMGzkBXwNX+64A3lfzpA=; b=E6doaDt6zk8VjxgdXnWzmHLvKYLrXsRADPuk4EoDBxymm1eZZKHy7whqb93K8ujM9g 70qzAfreCmqsMol0vnjTtMpZSu//1BihK35koGLEoFVqspumEH/lGhbOw19G8wc9TAyn eBq4T//uMGBao8GbXsTaBEJiA/ZY6nc/ODBz7CY7KJj+yzjqEJdkHj957z8HoFHclYXu sGokTb88UX4jlcMW5Yh+mVwpNKzAoInyPixWBSZstp7fOAj5JNBWW+Hky9IQu2EYfVnj wJg8BwLylKpvwOOpGuIHHVI9uhogt9lwT4uxIOvwUTtxSAP4HVj/SRBtU/xh5orvE0Xa bd6w== X-Gm-Message-State: 
AFqh2kr81W2h0ef1BpbMPVTgSCGJHkq74vQvkEgzBdm++388hCxxaLYS noskJaPWBddmPrzgXW9/fETArHS9lP24SN0= X-Google-Smtp-Source: AMrXdXtIpfaXFqcDGkNsmQctSMqXPQ/tLvykMkynrX+C5cGc1INOkvBaGX4j41udFUeKmd51zadys8k1Cp1SKbA= X-Received: from tj.c.googlers.com ([fda3:e722:ac3:cc00:20:ed76:c0a8:53a]) (user=tjmercier job=sendgmr) by 2002:a0d:e241:0:b0:48f:a921:40f2 with SMTP id l62-20020a0de241000000b0048fa92140f2mr5685523ywe.275.1673300307137; Mon, 09 Jan 2023 13:38:27 -0800 (PST) Date: Mon, 9 Jan 2023 21:38:05 +0000 In-Reply-To: <20230109213809.418135-1-tjmercier@google.com> Mime-Version: 1.0 References: <20230109213809.418135-1-tjmercier@google.com> X-Mailer: git-send-email 2.39.0.314.g84b9a713c41-goog Message-ID: <20230109213809.418135-3-tjmercier@google.com> Subject: [PATCH 2/4] dmabuf: Add cgroup charge transfer function From: "T.J. Mercier" To: tjmercier@google.com, Sumit Semwal , "=?UTF-8?q?Christian=20K=C3=B6nig?=" Cc: hannes@cmpxchg.org, daniel.vetter@ffwll.ch, android-mm@google.com, jstultz@google.com, linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org, linaro-mm-sig@lists.linaro.org, linux-kernel@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The dma_buf_transfer_charge function provides a way for processes to transfer charge of a buffer to a different cgroup. This is essential for the cases where a central allocator process does allocations for various subsystems, hands over the fd to the client who requested the memory, and drops all references to the allocated memory. Signed-off-by: T.J. 
Mercier --- drivers/dma-buf/dma-buf.c | 45 ++++++++++++++++++++++++++++++++++++++ include/linux/dma-buf.h | 1 + include/linux/memcontrol.h | 6 +++++ 3 files changed, 52 insertions(+) diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index ac45dd101c4d..fd6c5002032b 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -11,6 +11,7 @@ * refining of this idea. */ =20 +#include #include #include #include @@ -1618,6 +1619,50 @@ void dma_buf_vunmap_unlocked(struct dma_buf *dmabuf,= struct iosys_map *map) } EXPORT_SYMBOL_NS_GPL(dma_buf_vunmap_unlocked, DMA_BUF); =20 +/** + * dma_buf_transfer_charge - Change the cgroup to which the provided dma_b= uf is charged. + * @dmabuf: [in] buffer whose charge will be migrated to a different cgroup + * @target: [in] the task_struct of the destination process for the cgroup= charge + * + * Only tasks that belong to the same cgroup the buffer is currently charg= ed to + * may call this function, otherwise it will return -EPERM. + * + * Returns 0 on success, or a negative errno code otherwise. 
+ */ +int dma_buf_transfer_charge(struct dma_buf *dmabuf, struct task_struct *ta= rget) +{ + struct mem_cgroup *current_cg, *target_cg; + int ret =3D 0; + + if (!IS_ENABLED(CONFIG_MEMCG)) + return 0; + + if (WARN_ON(!dmabuf) || WARN_ON(!target)) + return -EINVAL; + + current_cg =3D mem_cgroup_from_task(current); + target_cg =3D get_mem_cgroup_from_mm(target->mm); + + if (current_cg =3D=3D target_cg) + goto skip_transfer; + + if (cmpxchg(&dmabuf->memcg, current_cg, target_cg) !=3D current_cg) { + /* Only the current owner can transfer the charge */ + ret =3D -EPERM; + goto skip_transfer; + } + + mod_memcg_state(current_cg, MEMCG_DMABUF, -dmabuf->size); + mod_memcg_state(target_cg, MEMCG_DMABUF, dmabuf->size); + + mem_cgroup_put(current_cg); /* unref from buffer - buffer keeps new ref t= o target_cg */ + return 0; + +skip_transfer: + mem_cgroup_put(target_cg); + return ret; +} + #ifdef CONFIG_DEBUG_FS static int dma_buf_debug_show(struct seq_file *s, void *unused) { diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h index 1f0ffb8e4bf5..6aa128d76aa7 100644 --- a/include/linux/dma-buf.h +++ b/include/linux/dma-buf.h @@ -634,4 +634,5 @@ int dma_buf_vmap(struct dma_buf *dmabuf, struct iosys_m= ap *map); void dma_buf_vunmap(struct dma_buf *dmabuf, struct iosys_map *map); int dma_buf_vmap_unlocked(struct dma_buf *dmabuf, struct iosys_map *map); void dma_buf_vunmap_unlocked(struct dma_buf *dmabuf, struct iosys_map *map= ); +int dma_buf_transfer_charge(struct dma_buf *dmabuf, struct task_struct *ta= rget); #endif /* __DMA_BUF_H__ */ diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index 1c1da2da20a6..e5aec27044c7 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -1298,6 +1298,12 @@ struct mem_cgroup *mem_cgroup_from_css(struct cgroup= _subsys_state *css) return NULL; } =20 +static inline +struct mem_cgroup *mem_cgroup_from_task(struct task_struct *p) +{ + return NULL; +} + static inline void obj_cgroup_put(struct 
obj_cgroup *objcg) { } --=20 2.39.0.314.g84b9a713c41-goog From nobody Mon Sep 15 21:48:41 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 62F4BC678DB for ; Mon, 9 Jan 2023 21:40:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238102AbjAIVkX (ORCPT ); Mon, 9 Jan 2023 16:40:23 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50124 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238137AbjAIVjp (ORCPT ); Mon, 9 Jan 2023 16:39:45 -0500 Received: from mail-oi1-x249.google.com (mail-oi1-x249.google.com [IPv6:2607:f8b0:4864:20::249]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 85FF63BEA0 for ; Mon, 9 Jan 2023 13:38:32 -0800 (PST) Received: by mail-oi1-x249.google.com with SMTP id q8-20020a056808200800b00363bb37d22cso3115607oiw.19 for ; Mon, 09 Jan 2023 13:38:32 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=bWOMIyzYML+GnCVLadZ/bfKF2vZbEfEtiwoUNHe8I8g=; b=B/4nGCkbSar9JPAjXdE2nMB2agrlgj7EbvSmEmaJ+FSJY/iiSdTcvg5MdvBrbYHsHT aJRIFm3Wr5pW8VLrvwXyFhP3E3TT9U5sZcpJvG8FV1JDpbHJwB7rtgFFGgNK9LDaVjog Pxt5yN9L6hOC7uq/3GfuD7+lUUQyPP+3VSsa17lsU0G2BkfMYQpcL3C/SsXM7xi372Zn fNO5FJJkb2PUBmghvakzxN8uNK9UFoT5CrG30MTZzLQBuir0MX/aQWG2h1G8i+8H8PX5 4fz1y8odBf62yg98vv3M9x+vuJXu9iZH5joEdWq+vYgTfLpmFQLvIEwSeteAmim8RvJX MOWw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=bWOMIyzYML+GnCVLadZ/bfKF2vZbEfEtiwoUNHe8I8g=; 
b=zfej8iom4Jwl7KB3cMLWWSLFxQnhId53GGFXjfXWDdKFA1iKmNsFjhL2f4GNywh0OM uFAxuMJdPpQnuWHlJLBwb54ZIdXzArwphHslIcuN4QAWMQNnvXoaiBOqEggHz8M3f49X aZX5lcUIpkSJng6EWi0aZS5Vx8YxI0XdcvuYBPxC1Au8itWWtAW+jYRLLaU2QNM1dkhm 1hcwgQgGcWdGZkwZWT1vd55oknaJnGKcVznSpatCoc+ICQe1kSSomKHP9P5JmZOYjc85 zeEVByTkmanix9sSueCoXdm0AkK+R7jlfte404jQKPTB81JUY+iYnUglbebD365A94Fv YX2A== X-Gm-Message-State: AFqh2kprtxNX7FdWFJLSJFEQ7A6irUMrrRoKY9ufyl2O/KiidfQOnJL9 ZTx5FEae34F+RxR+xc3EHge1qMK2w0O/YxU= X-Google-Smtp-Source: AMrXdXt2b4GmnQbBAnictl+P9Qid5OxsP4Aa+Rreix4gjPXQPZYSTzo2urBLSPG5NnFGp9zwycV58Z5FT1JxcHQ= X-Received: from tj.c.googlers.com ([fda3:e722:ac3:cc00:20:ed76:c0a8:53a]) (user=tjmercier job=sendgmr) by 2002:a05:6870:6327:b0:15b:d2e:d059 with SMTP id s39-20020a056870632700b0015b0d2ed059mr587093oao.179.1673300311796; Mon, 09 Jan 2023 13:38:31 -0800 (PST) Date: Mon, 9 Jan 2023 21:38:06 +0000 In-Reply-To: <20230109213809.418135-1-tjmercier@google.com> Mime-Version: 1.0 References: <20230109213809.418135-1-tjmercier@google.com> X-Mailer: git-send-email 2.39.0.314.g84b9a713c41-goog Message-ID: <20230109213809.418135-4-tjmercier@google.com> Subject: [PATCH 3/4] binder: Add flags to relinquish ownership of fds From: "T.J. 
Mercier" To: tjmercier@google.com, Tejun Heo , Zefan Li , Johannes Weiner , Jonathan Corbet , Greg Kroah-Hartman , "=?UTF-8?q?Arve=20Hj=C3=B8nnev=C3=A5g?=" , Todd Kjos , Martijn Coenen , Joel Fernandes , Christian Brauner , Carlos Llamas , Suren Baghdasaryan , Sumit Semwal , "=?UTF-8?q?Christian=20K=C3=B6nig?=" Cc: daniel.vetter@ffwll.ch, android-mm@google.com, jstultz@google.com, Hridya Valsaraju , cgroups@vger.kernel.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org, linaro-mm-sig@lists.linaro.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Hridya Valsaraju This patch introduces flags BINDER_FD_FLAG_XFER_CHARGE and BINDER_FDA_FLAG_XFER_CHARGE that a process sending an individual fd or fd array to another process over binder IPC can set to relinquish ownership of the fd(s) being sent for memory accounting purposes. If the flag is found to be set during the fd or fd array translation and the fd is for a DMA-BUF, the buffer is uncharged from the sender's cgroup and charged to the receiving process's cgroup instead. It is up to the sending process to ensure that it closes the fds regardless of whether the transfer failed or succeeded. Most graphics shared memory allocations in Android are done by the graphics allocator HAL process. On requests from clients, the HAL process allocates memory and sends the fds to the clients over binder IPC. The graphics allocator HAL will not retain any references to the buffers. When the HAL sets *_FLAG_XFER_CHARGE for fd arrays holding DMA-BUF fds, or individual fd objects, binder will transfer the charge for the buffer from the allocator process cgroup to the client process cgroup.
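As a rough illustration of the sender side described above, the sketch below fills in an fd object with the transfer flag set. It does not include the real uapi header, because the flags field only exists with this patch applied; instead it defines local stand-ins (all sketch_* and SKETCH_* names are hypothetical) that mirror the layout proposed here, assuming a 64-bit binder_uintptr_t and the usual packed-character encoding of BINDER_TYPE_FD.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed to match B_PACK_CHARS('f', 'd', '*', B_TYPE_LARGE) in binder.h. */
#define SKETCH_BINDER_TYPE_FD \
	((uint32_t)(('f' << 24) | ('d' << 16) | ('*' << 8) | 0x85))

/* Same value this patch assigns to BINDER_FD_FLAG_XFER_CHARGE. */
#define SKETCH_FD_FLAG_XFER_CHARGE 0x01u

/* Local mirror of struct binder_object_header. */
struct sketch_object_header {
	uint32_t type;
};

/* Local mirror of struct binder_fd_object as modified by this patch. */
struct sketch_fd_object {
	struct sketch_object_header hdr;
	uint32_t flags;			/* was pad_flags before this patch */
	union {
		uint64_t pad_binder;	/* assumes 64-bit binder_uintptr_t */
		uint32_t fd;
	};
	uint64_t cookie;
};

/*
 * Prepare an fd object for a transaction. When xfer_charge is nonzero the
 * sender asks binder to move the DMA-BUF charge to the receiver's cgroup.
 */
void sketch_fill_fd_object(struct sketch_fd_object *obj, int fd,
			   int xfer_charge)
{
	assert(obj != NULL);
	memset(obj, 0, sizeof(*obj));	/* keep padding and cookie zeroed */
	obj->hdr.type = SKETCH_BINDER_TYPE_FD;
	obj->flags = xfer_charge ? SKETCH_FD_FLAG_XFER_CHARGE : 0;
	obj->fd = (uint32_t)fd;
}
```

A real sender would place this object in the transaction buffer and, as noted above, close its own copy of the fd afterwards whether or not the transfer succeeded.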
The pad [1] and pad_flags [2] fields of binder_fd_object and binder_fda_array_object come from alignment with flat_binder_object and have never been exposed for use from userspace. This new flags use follows the pattern set by binder_buffer_object. [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/comm= it/include/uapi/linux/android/binder.h?id=3Dfeba3900cabb8e7c87368faa28e7a69= 36809ba22 [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/comm= it/include/uapi/linux/android/binder.h?id=3D5cdcf4c6a638591ec0e98c57404a19e= 7f9997567 Signed-off-by: Hridya Valsaraju Signed-off-by: T.J. Mercier --- Documentation/admin-guide/cgroup-v2.rst | 3 ++- drivers/android/binder.c | 31 +++++++++++++++++++++---- drivers/dma-buf/dma-buf.c | 4 +--- include/linux/dma-buf.h | 1 + include/uapi/linux/android/binder.h | 23 ++++++++++++++---- 5 files changed, 50 insertions(+), 12 deletions(-) diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-= guide/cgroup-v2.rst index 538ae22bc514..d225295932c0 100644 --- a/Documentation/admin-guide/cgroup-v2.rst +++ b/Documentation/admin-guide/cgroup-v2.rst @@ -1457,7 +1457,8 @@ PAGE_SIZE multiple when read back. =20 dmabuf (npn) Amount of memory used for exported DMA buffers allocated by the cgroup. - Stays with the allocating cgroup regardless of how the buffer is shared. + Stays with the allocating cgroup regardless of how the buffer is shared + unless explicitly transferred. =20 workingset_refault_anon Number of refaults of previously evicted anonymous pages. 
diff --git a/drivers/android/binder.c b/drivers/android/binder.c index 880224ec6abb..9830848c8d25 100644 --- a/drivers/android/binder.c +++ b/drivers/android/binder.c @@ -42,6 +42,7 @@ =20 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt =20 +#include #include #include #include @@ -2237,7 +2238,7 @@ static int binder_translate_handle(struct flat_binder= _object *fp, return ret; } =20 -static int binder_translate_fd(u32 fd, binder_size_t fd_offset, +static int binder_translate_fd(u32 fd, binder_size_t fd_offset, __u32 flag= s, struct binder_transaction *t, struct binder_thread *thread, struct binder_transaction *in_reply_to) @@ -2275,6 +2276,26 @@ static int binder_translate_fd(u32 fd, binder_size_t= fd_offset, goto err_security; } =20 + if (IS_ENABLED(CONFIG_MEMCG) && (flags & BINDER_FD_FLAG_XFER_CHARGE)) { + struct dma_buf *dmabuf; + + if (unlikely(!is_dma_buf_file(file))) { + binder_user_error( + "%d:%d got transaction with XFER_CHARGE for non-dmabuf fd, %d\n", + proc->pid, thread->pid, fd); + ret =3D -EINVAL; + goto err_dmabuf; + } + + dmabuf =3D file->private_data; + ret =3D dma_buf_transfer_charge(dmabuf, target_proc->tsk); + if (ret) { + pr_warn("%d:%d Unable to transfer DMA-BUF fd charge to %d\n", + proc->pid, thread->pid, target_proc->pid); + goto err_xfer; + } + } + /* * Add fixup record for this transaction. The allocation * of the fd in the target needs to be done from a @@ -2294,6 +2315,8 @@ static int binder_translate_fd(u32 fd, binder_size_t = fd_offset, return ret; =20 err_alloc: +err_xfer: +err_dmabuf: err_security: fput(file); err_fget: @@ -2604,7 +2627,7 @@ static int binder_translate_fd_array(struct list_head= *pf_head, =20 ret =3D copy_from_user(&fd, sender_ufda_base + sender_uoffset, sizeof(fd= )); if (!ret) - ret =3D binder_translate_fd(fd, offset, t, thread, + ret =3D binder_translate_fd(fd, offset, fda->flags, t, thread, in_reply_to); if (ret) return ret > 0 ? 
-EINVAL : ret; @@ -3383,8 +3406,8 @@ static void binder_transaction(struct binder_proc *pr= oc, struct binder_fd_object *fp =3D to_binder_fd_object(hdr); binder_size_t fd_offset =3D object_offset + (uintptr_t)&fp->fd - (uintptr_t)fp; - int ret =3D binder_translate_fd(fp->fd, fd_offset, t, - thread, in_reply_to); + int ret =3D binder_translate_fd(fp->fd, fd_offset, fp->flags, + t, thread, in_reply_to); =20 fp->pad_binder =3D 0; if (ret < 0 || diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index fd6c5002032b..a65b42433099 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -34,8 +34,6 @@ =20 #include "dma-buf-sysfs-stats.h" =20 -static inline int is_dma_buf_file(struct file *); - struct dma_buf_list { struct list_head head; struct mutex lock; @@ -527,7 +525,7 @@ static const struct file_operations dma_buf_fops =3D { /* * is_dma_buf_file - Check if struct file* is associated with dma_buf */ -static inline int is_dma_buf_file(struct file *file) +int is_dma_buf_file(struct file *file) { return file->f_op =3D=3D &dma_buf_fops; } diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h index 6aa128d76aa7..092d572ce528 100644 --- a/include/linux/dma-buf.h +++ b/include/linux/dma-buf.h @@ -595,6 +595,7 @@ dma_buf_attachment_is_dynamic(struct dma_buf_attachment= *attach) return !!attach->importer_ops; } =20 +int is_dma_buf_file(struct file *file); struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf, struct device *dev); struct dma_buf_attachment * diff --git a/include/uapi/linux/android/binder.h b/include/uapi/linux/andro= id/binder.h index e72e4de8f452..696c2bdb8a7e 100644 --- a/include/uapi/linux/android/binder.h +++ b/include/uapi/linux/android/binder.h @@ -91,14 +91,14 @@ struct flat_binder_object { /** * struct binder_fd_object - describes a filedescriptor to be fixed up. 
* @hdr: common header structure - * @pad_flags: padding to remain compatible with old userspace code + * @flags: One or more BINDER_FD_FLAG_* flags * @pad_binder: padding to remain compatible with old userspace code * @fd: file descriptor * @cookie: opaque data, used by user-space */ struct binder_fd_object { struct binder_object_header hdr; - __u32 pad_flags; + __u32 flags; union { binder_uintptr_t pad_binder; __u32 fd; @@ -107,6 +107,17 @@ struct binder_fd_object { binder_uintptr_t cookie; }; =20 +enum { + /** + * @BINDER_FD_FLAG_XFER_CHARGE + * + * When set, the sender of a binder_fd_object wishes to relinquish owners= hip of the fd for + * memory accounting purposes. If the fd is for a DMA-BUF, the buffer is = uncharged from the + * sender's cgroup and charged to the receiving process's cgroup instead. + */ + BINDER_FD_FLAG_XFER_CHARGE =3D 0x01, +}; + /* struct binder_buffer_object - object describing a userspace buffer * @hdr: common header structure * @flags: one or more BINDER_BUFFER_* flags @@ -141,7 +152,7 @@ enum { =20 /* struct binder_fd_array_object - object describing an array of fds in a = buffer * @hdr: common header structure - * @pad: padding to ensure correct alignment + * @flags: One or more BINDER_FDA_FLAG_* flags * @num_fds: number of file descriptors in the buffer * @parent: index in offset array to buffer holding the fd array * @parent_offset: start offset of fd array in the buffer @@ -162,12 +173,16 @@ enum { */ struct binder_fd_array_object { struct binder_object_header hdr; - __u32 pad; + __u32 flags; binder_size_t num_fds; binder_size_t parent; binder_size_t parent_offset; }; =20 +enum { + BINDER_FDA_FLAG_XFER_CHARGE =3D BINDER_FD_FLAG_XFER_CHARGE, +}; + /* * On 64-bit platforms where user code may run in 32-bits the driver must * translate the buffer (and local binder) addresses appropriately. 
--=20 2.39.0.314.g84b9a713c41-goog From nobody Mon Sep 15 21:48:41 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 51FA5C678DA for ; Mon, 9 Jan 2023 21:40:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238063AbjAIVkU (ORCPT ); Mon, 9 Jan 2023 16:40:20 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49898 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238143AbjAIVjq (ORCPT ); Mon, 9 Jan 2023 16:39:46 -0500 Received: from mail-yw1-x114a.google.com (mail-yw1-x114a.google.com [IPv6:2607:f8b0:4864:20::114a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1A9DB2725 for ; Mon, 9 Jan 2023 13:38:37 -0800 (PST) Received: by mail-yw1-x114a.google.com with SMTP id 00721157ae682-4b34cf67fb6so105183227b3.6 for ; Mon, 09 Jan 2023 13:38:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=SQeNgY+J6MKzCRrIntmsIbLAzrhXUK7Ksf6ShqqTDqg=; b=JBDEWasB+3OrvV/Y27HbN2Neb2kEXiiVXQ+wxK++73bxItz+h84CoAdQzRT6NPJANE CLVxy9fExG/dYCBL9Kj+T7e6Fb6ul1yD6ZQb8A+AL4pzEaI7DoHPA9JZGXcUqFYa49aP iAkXaV01QrmTY+TL3EB8WOUxQiuibhGQDvpIh6fIzVNuB+Qk+ocz4E7Y33kxzw4PYa6g NN/E4LuCqSW7+xRTARJMsEa7Pt/FZojDU3MFoN6807jAGeTOdKlMYAXKehPKnbE4ZgBV im2AF/krq07AeNXvCPN706WdUF61A4jBsp9p0BqVwBod4BUCHSRlawT3zlavAiChILB8 KYsw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=SQeNgY+J6MKzCRrIntmsIbLAzrhXUK7Ksf6ShqqTDqg=; b=JjD2PvgbN2WnELd7pCXPtkpynphaW0sNaGyM9Ni/uiC0Y7XYnll1XfRK6seHB9tWzu 
Z5GBF9MNrWWxdDSjlUH6qD36/eFaUL30etsJ0iTumrQrJp6xYLG94sqqIFAmtt1tnax7 mBKBeMrtKFVUpFtEhWx7Mv/oHxe79DkBOy11Ky5Ds+8vrNMj4oCywCYO0pYYbFLFLi0R +A2OV3o4ko1ARIbPApJG6xtI/p4/jcHZfkfr5BLvf0E929JFeAiL/etbW2UR/Qeih5dP w2W5uwvGA/4YkhEg1SR80g99dqYetFFHmQS1W8tkOmu0mPpwyIci5kYfAqy7Rb+v6AXz YZsA== X-Gm-Message-State: AFqh2koAotobfNB9Thgx/jVrAhckrc0ltKryXmyyHIDamhurtIcR0xij u9TL5VJp14cxc+fZDMWe253T2LcgX1rcCSY= X-Google-Smtp-Source: AMrXdXsly8ftnTGxkXWURs6cgQlim0110CyvAjtUB2YdXI6XZT+4/hpYTJ6JlWae8xNU5rTd/sJIbiwGsASFKCY= X-Received: from tj.c.googlers.com ([fda3:e722:ac3:cc00:20:ed76:c0a8:53a]) (user=tjmercier job=sendgmr) by 2002:a0d:eb07:0:b0:475:794e:90e3 with SMTP id u7-20020a0deb07000000b00475794e90e3mr515886ywe.483.1673300316284; Mon, 09 Jan 2023 13:38:36 -0800 (PST) Date: Mon, 9 Jan 2023 21:38:07 +0000 In-Reply-To: <20230109213809.418135-1-tjmercier@google.com> Mime-Version: 1.0 References: <20230109213809.418135-1-tjmercier@google.com> X-Mailer: git-send-email 2.39.0.314.g84b9a713c41-goog Message-ID: <20230109213809.418135-5-tjmercier@google.com> Subject: [PATCH 4/4] security: binder: Add transfer_charge SELinux hook From: "T.J. Mercier" To: tjmercier@google.com, Greg Kroah-Hartman , "=?UTF-8?q?Arve=20Hj=C3=B8nnev=C3=A5g?=" , Todd Kjos , Martijn Coenen , Joel Fernandes , Christian Brauner , Carlos Llamas , Suren Baghdasaryan , Paul Moore , James Morris , "Serge E. Hallyn" , Stephen Smalley , Eric Paris Cc: hannes@cmpxchg.org, daniel.vetter@ffwll.ch, android-mm@google.com, jstultz@google.com, linux-kernel@vger.kernel.org, linux-security-module@vger.kernel.org, selinux@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Any process can cause a memory charge transfer to occur to any other process when transmitting a file descriptor through binder.
This should only be possible for central allocator processes, so a new SELinux permission is added to restrict which processes are allowed to initiate these charge transfers. Signed-off-by: T.J. Mercier --- drivers/android/binder.c | 5 +++++ include/linux/lsm_hook_defs.h | 2 ++ include/linux/lsm_hooks.h | 6 ++++++ include/linux/security.h | 2 ++ security/security.c | 6 ++++++ security/selinux/hooks.c | 9 +++++++++ security/selinux/include/classmap.h | 2 +- 7 files changed, 31 insertions(+), 1 deletion(-) diff --git a/drivers/android/binder.c b/drivers/android/binder.c index 9830848c8d25..9063db04826d 100644 --- a/drivers/android/binder.c +++ b/drivers/android/binder.c @@ -2279,6 +2279,11 @@ static int binder_translate_fd(u32 fd, binder_size_t= fd_offset, __u32 flags, if (IS_ENABLED(CONFIG_MEMCG) && (flags & BINDER_FD_FLAG_XFER_CHARGE)) { struct dma_buf *dmabuf; =20 + if (security_binder_transfer_charge(proc->cred, target_proc->cred)) { + ret =3D -EPERM; + goto err_security; + } + if (unlikely(!is_dma_buf_file(file))) { binder_user_error( "%d:%d got transaction with XFER_CHARGE for non-dmabuf fd, %d\n", diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h index ed6cb2ac55fa..8db2a958557e 100644 --- a/include/linux/lsm_hook_defs.h +++ b/include/linux/lsm_hook_defs.h @@ -33,6 +33,8 @@ LSM_HOOK(int, 0, binder_transfer_binder, const struct cre= d *from, const struct cred *to) LSM_HOOK(int, 0, binder_transfer_file, const struct cred *from, const struct cred *to, struct file *file) +LSM_HOOK(int, 0, binder_transfer_charge, const struct cred *from, + const struct cred *to) LSM_HOOK(int, 0, ptrace_access_check, struct task_struct *child, unsigned int mode) LSM_HOOK(int, 0, ptrace_traceme, struct task_struct *parent) diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h index 0a5ba81f7367..39c40c7bf519 100644 --- a/include/linux/lsm_hooks.h +++ b/include/linux/lsm_hooks.h @@ -1385,6 +1385,12 @@ * @file contains the struct file being 
transferred. * @to contains the struct cred for the receiving process. * Return 0 if permission is granted. + * @binder_transfer_charge: + * Check whether @from is allowed to transfer the memory charge for a + * buffer out of its cgroup to @to. + * @from contains the struct cred for the sending process. + * @to contains the struct cred for the receiving process. + * Return 0 if permission is granted. * * @ptrace_access_check: * Check permission before allowing the current process to trace the diff --git a/include/linux/security.h b/include/linux/security.h index 5b67f208f7de..3b7472308430 100644 --- a/include/linux/security.h +++ b/include/linux/security.h @@ -270,6 +270,8 @@ int security_binder_transfer_binder(const struct cred *= from, const struct cred *to); int security_binder_transfer_file(const struct cred *from, const struct cred *to, struct file *file); +int security_binder_transfer_charge(const struct cred *from, + const struct cred *to); int security_ptrace_access_check(struct task_struct *child, unsigned int m= ode); int security_ptrace_traceme(struct task_struct *parent); int security_capget(struct task_struct *target, diff --git a/security/security.c b/security/security.c index d1571900a8c7..97e1e74d1ff2 100644 --- a/security/security.c +++ b/security/security.c @@ -801,6 +801,12 @@ int security_binder_transfer_file(const struct cred *f= rom, return call_int_hook(binder_transfer_file, 0, from, to, file); } =20 +int security_binder_transfer_charge(const struct cred *from, + const struct cred *to) +{ + return call_int_hook(binder_transfer_charge, 0, from, to); +} + int security_ptrace_access_check(struct task_struct *child, unsigned int m= ode) { return call_int_hook(ptrace_access_check, 0, child, mode); diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c index 3c5be76a9199..823ef14924bd 100644 --- a/security/selinux/hooks.c +++ b/security/selinux/hooks.c @@ -2066,6 +2066,14 @@ static int selinux_binder_transfer_file(const struct= cred 
*from, &ad); } =20 +static int selinux_binder_transfer_charge(const struct cred *from, const s= truct cred *to) +{ + return avc_has_perm(&selinux_state, + cred_sid(from), cred_sid(to), + SECCLASS_BINDER, BINDER__TRANSFER_CHARGE, + NULL); +} + static int selinux_ptrace_access_check(struct task_struct *child, unsigned int mode) { @@ -7052,6 +7060,7 @@ static struct security_hook_list selinux_hooks[] __ls= m_ro_after_init =3D { LSM_HOOK_INIT(binder_transaction, selinux_binder_transaction), LSM_HOOK_INIT(binder_transfer_binder, selinux_binder_transfer_binder), LSM_HOOK_INIT(binder_transfer_file, selinux_binder_transfer_file), + LSM_HOOK_INIT(binder_transfer_charge, selinux_binder_transfer_charge), =20 LSM_HOOK_INIT(ptrace_access_check, selinux_ptrace_access_check), LSM_HOOK_INIT(ptrace_traceme, selinux_ptrace_traceme), diff --git a/security/selinux/include/classmap.h b/security/selinux/include= /classmap.h index a3c380775d41..2eef180d10d7 100644 --- a/security/selinux/include/classmap.h +++ b/security/selinux/include/classmap.h @@ -172,7 +172,7 @@ const struct security_class_mapping secclass_map[] =3D { { "tun_socket", { COMMON_SOCK_PERMS, "attach_queue", NULL } }, { "binder", { "impersonate", "call", "set_context_mgr", "transfer", - NULL } }, + "transfer_charge", NULL } }, { "cap_userns", { COMMON_CAP_PERMS, NULL } }, { "cap2_userns", --=20 2.39.0.314.g84b9a713c41-goog