From nobody Thu Dec 18 14:09:47 2025 Received: from mail-pf1-f201.google.com (mail-pf1-f201.google.com [209.85.210.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D51F2148827 for ; Mon, 2 Jun 2025 19:18:09 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1748891891; cv=none; b=n13doXQ519A1koazQ0f1rcVGYa0lPXPGIqh1uVUC9cNNMFIZ3x3i8tKJIG0WFzr86OxnN/grZJXNmQbqUjsguVKcEa6wrBOraO3u6sZS8hW2xrjE7jl6U2lsVG4OIevtnXskbDY4TWzvIsjV+GyPgrBlklj4Nzfj1r9YLlk2qek= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1748891891; c=relaxed/simple; bh=Kk4G9OjSA5hI8BLupzEnDWGdQPbbgM19H/mBxXD+AwQ=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=Nm5rKBM/+dJdqJxOGPFJkaud2B604ZYX4TuJRndhEv7Dh+WPAjPLF9R6mUW7DJJoxuHij11fZx+U1tY1dZBPPAJ9Ya7sj5QjJqTwKNPdsgYNjbhA+ouBCbR5smYfXPkZclsAPFK/R6cGMRpFIbm5lhDnJT33KAtgNkdZ1jpY1Dg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--ackerleytng.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=aFlY5ySW; arc=none smtp.client-ip=209.85.210.201 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--ackerleytng.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="aFlY5ySW" Received: by mail-pf1-f201.google.com with SMTP id d2e1a72fcca58-7370e73f690so5435380b3a.3 for ; Mon, 02 Jun 2025 12:18:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1748891889; 
x=1749496689; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=gC27xGN+OZxYxd8x3HPELSRtz95EF1fMjoET+WlkV5U=; b=aFlY5ySWQyoPc1s+gut3HKnEuw19gZOGXVjHcUk4OtxVfu5h2RXuh99Ya/653K2xwS 283iF8bBznmq/1nG3BbT6JDnJaHkHKqWB7Dbd4NusV5hPo2ZhsiTzjegg9HwN0KE9AkZ KVOUznD0Sh+0PsMNjbjG+XC9J0fQN6S7L203WTYXBr0caJH05ZohV3Hxf0ptPH6pidtO /2CNTSIUGYwv3jqHvY7D5xhOif/tuSsWWdYlPnf8FCcIftZJRcI6gW8Q+6BaRr++eDqV W7aaZ2SL3qo32lgZr5jgujxGKvCeu4+JUzhR/QhyR1Jhv1gbHa6dRW0bFqlG+lVglRNL akcg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1748891889; x=1749496689; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=gC27xGN+OZxYxd8x3HPELSRtz95EF1fMjoET+WlkV5U=; b=tHeBFikcOBi1dktgQPhfzuay0s/MHus7WkwBCfQGuEN1N6xyHHTw733P4OOPB2Xd9Z RHQaYkLlkoEztPvj4Q+JiuB1Xa3sjrfT2M+j3QcZ0NN4/xrWwA0KGYoJZxPyXxmG3HiQ Hv0LHInbt+sWdROoe8TpoPUrcJzq8uyqtxhHY9agz4ELLX4JAJPF4ETvMHixc6YFYg5n 7eNFpbO0zJi4w/qBcUoXSU0gOmqd20S3k/GPz/DiV6uL9d0oH7BUkT4APc8UcPvQ/Pza eF9ljkpPswYTiGX6oIOvVo0Hn5ISBt/J25rZgVrrq3ZCHl6v9SmoVc79+cm+NljZuNmb A37A== X-Forwarded-Encrypted: i=1; AJvYcCULjTSigCg4hZTj9QFl9vco+U61gS+yTMebJn8//82QVMhyeEYGobC9GPkV97FFAZA0nZRHIQdFW5qI53A=@vger.kernel.org X-Gm-Message-State: AOJu0YwCLaD/nXav0P1HtjJmTzkjtMkPC8GTiuIWEtiEYK/aRxSkduo3 UC6Iza7frDvtooajoLTvd4xpnFcIBssHoIovp3qqhLag34hA9ZdS0ajfZtVVU8TNNx5wh18qkMS N8NRl7MhmIX1ueZhFHYm7YzSt4A== X-Google-Smtp-Source: AGHT+IEZUo9EMXuhRCwoaNc5hwobX/60MfpH4lGs/P6VHbaKIkRHWpm212b2a7KnBe+VwVfAYZRtTQUZPJL+sgEDqQ== X-Received: from pfblu7.prod.google.com ([2002:a05:6a00:7487:b0:747:a97f:513f]) (user=ackerleytng job=prod-delivery.src-stubby-dispatcher) by 2002:a05:6a00:3cc5:b0:740:595a:f9bf with SMTP id d2e1a72fcca58-747bd9510fbmr21737179b3a.3.1748891888859; Mon, 02 Jun 2025 12:18:08 -0700 (PDT) Date: Mon, 2 Jun 2025 12:17:54 -0700 In-Reply-To: Precedence: bulk 
X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: X-Mailer: git-send-email 2.49.0.1204.g71687c7c1d-goog Message-ID: Subject: [PATCH 1/2] fs: Provide function that allocates a secure anonymous inode From: Ackerley Tng To: kvm@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, x86@kernel.org, linux-fsdevel@vger.kernel.org Cc: ackerleytng@google.com, aik@amd.com, ajones@ventanamicro.com, akpm@linux-foundation.org, amoorthy@google.com, anthony.yznaga@oracle.com, anup@brainfault.org, aou@eecs.berkeley.edu, bfoster@redhat.com, binbin.wu@linux.intel.com, brauner@kernel.org, catalin.marinas@arm.com, chao.p.peng@intel.com, chenhuacai@kernel.org, dave.hansen@intel.com, david@redhat.com, dmatlack@google.com, dwmw@amazon.co.uk, erdemaktas@google.com, fan.du@intel.com, fvdl@google.com, graf@amazon.com, haibo1.xu@intel.com, hch@infradead.org, hughd@google.com, ira.weiny@intel.com, isaku.yamahata@intel.com, jack@suse.cz, james.morse@arm.com, jarkko@kernel.org, jgg@ziepe.ca, jgowans@amazon.com, jhubbard@nvidia.com, jroedel@suse.de, jthoughton@google.com, jun.miao@intel.com, kai.huang@intel.com, keirf@google.com, kent.overstreet@linux.dev, kirill.shutemov@intel.com, liam.merwick@oracle.com, maciej.wieczor-retman@intel.com, mail@maciej.szmigiero.name, maz@kernel.org, mic@digikod.net, michael.roth@amd.com, mpe@ellerman.id.au, muchun.song@linux.dev, nikunj@amd.com, nsaenz@amazon.es, oliver.upton@linux.dev, palmer@dabbelt.com, pankaj.gupta@amd.com, paul.walmsley@sifive.com, pbonzini@redhat.com, pdurrant@amazon.co.uk, peterx@redhat.com, pgonda@google.com, pvorel@suse.cz, qperret@google.com, quic_cvanscha@quicinc.com, quic_eberman@quicinc.com, quic_mnalajal@quicinc.com, quic_pderrin@quicinc.com, quic_pheragu@quicinc.com, quic_svaddagi@quicinc.com, quic_tsoni@quicinc.com, richard.weiyang@gmail.com, rick.p.edgecombe@intel.com, rientjes@google.com, roypat@amazon.co.uk, rppt@kernel.org, seanjc@google.com, 
shuah@kernel.org, steven.price@arm.com, steven.sistare@oracle.com, suzuki.poulose@arm.com, tabba@google.com, thomas.lendacky@amd.com, vannapurve@google.com, vbabka@suse.cz, viro@zeniv.linux.org.uk, vkuznets@redhat.com, wei.w.wang@intel.com, will@kernel.org, willy@infradead.org, xiaoyao.li@intel.com, yan.y.zhao@intel.com, yilun.xu@intel.com, yuzenghui@huawei.com, zhiquan1.li@intel.com
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

The new function, alloc_anon_secure_inode(), returns an inode after
running checks in security_inode_init_security_anon().

Also refactor secretmem's file creation process to use the new
function.

Suggested-by: David Hildenbrand
Signed-off-by: Ackerley Tng
Acked-by: David Hildenbrand
---
 fs/anon_inodes.c   | 22 ++++++++++++++++------
 include/linux/fs.h |  1 +
 mm/secretmem.c     |  9 +--------
 3 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/fs/anon_inodes.c b/fs/anon_inodes.c
index 583ac81669c2..4c3110378647 100644
--- a/fs/anon_inodes.c
+++ b/fs/anon_inodes.c
@@ -55,17 +55,20 @@ static struct file_system_type anon_inode_fs_type = {
 	.kill_sb = kill_anon_super,
 };
 
-static struct inode *anon_inode_make_secure_inode(
-	const char *name,
-	const struct inode *context_inode)
+static struct inode *anon_inode_make_secure_inode(struct super_block *s,
+		const char *name, const struct inode *context_inode,
+		bool fs_internal)
 {
 	struct inode *inode;
 	int error;
 
-	inode = alloc_anon_inode(anon_inode_mnt->mnt_sb);
+	inode = alloc_anon_inode(s);
 	if (IS_ERR(inode))
 		return inode;
-	inode->i_flags &= ~S_PRIVATE;
+
+	if (!fs_internal)
+		inode->i_flags &= ~S_PRIVATE;
+
 	error = security_inode_init_security_anon(inode, &QSTR(name),
 						  context_inode);
 	if (error) {
@@ -75,6 +78,12 @@ static struct inode *anon_inode_make_secure_inode(
 	return inode;
 }
 
+struct inode *alloc_anon_secure_inode(struct super_block *s, const char *name)
+{
+	return anon_inode_make_secure_inode(s, name, NULL, true);
+}
+EXPORT_SYMBOL_GPL(alloc_anon_secure_inode);
+
 static struct file *__anon_inode_getfile(const char *name,
 					 const struct file_operations *fops,
 					 void *priv, int flags,
@@ -88,7 +97,8 @@ static struct file *__anon_inode_getfile(const char *name,
 		return ERR_PTR(-ENOENT);
 
 	if (make_inode) {
-		inode = anon_inode_make_secure_inode(name, context_inode);
+		inode = anon_inode_make_secure_inode(anon_inode_mnt->mnt_sb,
+						     name, context_inode, false);
 		if (IS_ERR(inode)) {
 			file = ERR_CAST(inode);
 			goto err;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 016b0fe1536e..0fded2e3c661 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3550,6 +3550,7 @@ extern int simple_write_begin(struct file *file, struct address_space *mapping,
 extern const struct address_space_operations ram_aops;
 extern int always_delete_dentry(const struct dentry *);
 extern struct inode *alloc_anon_inode(struct super_block *);
+extern struct inode *alloc_anon_secure_inode(struct super_block *, const char *);
 extern int simple_nosetlease(struct file *, int, struct file_lease **, void **);
 extern const struct dentry_operations simple_dentry_operations;
 
diff --git a/mm/secretmem.c b/mm/secretmem.c
index 1b0a214ee558..c0e459e58cb6 100644
--- a/mm/secretmem.c
+++ b/mm/secretmem.c
@@ -195,18 +195,11 @@ static struct file *secretmem_file_create(unsigned long flags)
 	struct file *file;
 	struct inode *inode;
 	const char *anon_name = "[secretmem]";
-	int err;
 
-	inode = alloc_anon_inode(secretmem_mnt->mnt_sb);
+	inode = alloc_anon_secure_inode(secretmem_mnt->mnt_sb, anon_name);
 	if (IS_ERR(inode))
 		return ERR_CAST(inode);
 
-	err = security_inode_init_security_anon(inode, &QSTR(anon_name), NULL);
-	if (err) {
-		file = ERR_PTR(err);
-		goto err_free_inode;
-	}
-
 	file = alloc_file_pseudo(inode, secretmem_mnt, "secretmem",
 				 O_RDWR, &secretmem_fops);
 	if (IS_ERR(file))
-- 
2.49.0.1204.g71687c7c1d-goog

From nobody Thu Dec 18 14:09:47 2025
Received: from mail-pg1-f201.google.com
(mail-pg1-f201.google.com [209.85.215.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6C3C6227EA4 for ; Mon, 2 Jun 2025 19:18:11 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.215.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1748891893; cv=none; b=bZy48bCfTdPcOwE2L5inX5h+argPltsvAzFyHKT48xa63lblYcjC+yWOwj6TwMoKzRgdufZfSmCn5bdVZvkKoY0n3rZMDm4GySvW4Lg35RQbra+8iGGizlSUIl+LmeSZI0bgO56OiECUYzkzyNutVkMkZcoRJkkFfra9bOkQCKM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1748891893; c=relaxed/simple; bh=uob5dPHELISa2eBCv6uj+1AuIK4K9rtFBcnduREVRwU=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=USROSlLmByN1Tstr4lIfmmzBPrEas3rvnHPiwmBeGlE1slvCRRM6kWZ/tbk2Eq6zCE4ijQWpE2ebq/3V16zDj3vh/hlrklpdBPIqA2rI9ncEtyw1qaClcruJbeIAavZovD78CrJRmzjoMdnMnact4J6s2YaSMznOc+8RxmYsXhk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--ackerleytng.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=g7EMk3qK; arc=none smtp.client-ip=209.85.215.201 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--ackerleytng.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="g7EMk3qK" Received: by mail-pg1-f201.google.com with SMTP id 41be03b00d2f7-b16b35ea570so4662182a12.0 for ; Mon, 02 Jun 2025 12:18:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1748891891; x=1749496691; darn=vger.kernel.org; 
h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=ZnfgUZmf8OAS/bgXEs2qdTQcvNpZfKjnQrP7dUxeEks=; b=g7EMk3qKhaEO50Eaa9FrbVLhlRbm0CQpGUcBFr1mNehJwPcHfAQCO3bSPtEpz39ASR fgXG85714VPTeQLEc5rYtS1EAF+m/YrYDnjn4FWV11447a8guLj6fQ/KBb4AMo9z9k2f TdlYy/T1whfezGh5bJjSVvoOkM9DYzCfYrQV9PmNxUYHSvuns49mzD63JPmO3d8m/1pL 3t3OJR2KK4geOtkgMXD23yXS22YCJ3mKICehk9jivYJ/BQYpakStYHvKRJ2KyfUX9A8D 25oljdy7hFI5Tb+P3OqitDB9ELwKc116DgzdjqSVPZ7gfnQROWPj6R5hYJd7DAoFb7Fr p9JA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1748891891; x=1749496691; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=ZnfgUZmf8OAS/bgXEs2qdTQcvNpZfKjnQrP7dUxeEks=; b=vGlErkcjEYg80hbr0FiZcWeabRsgU/3fSHoBB7DQy5jIwQPA9z0aW1XurMzXtP9vWQ E160X0mcuVlPcJSI/O23R5W7s8vqhNUjh1S9IkdTH6WY12gf7xpkw4VnM3WyQ6fb6rHf NvKsBRZF+MxpidLGaCLJwXZAvP2mSdh5AOS6vIhskAUSSVw6Gh3EWDBUyf23VjvNtdv2 Oq/h+joMXufPz8ejPSiKdUMrzdipMrlxK29k+t+IVM2UIDc1qjBd7UsD0MbX6EAGZY5n yU99ucwfUXu9nemBSDjz2HxZr3gJym7M/U1Rl2TNHaqjnZMXbqkQYoG0mFRn0plhfFlt EJtw== X-Forwarded-Encrypted: i=1; AJvYcCWz7SWjs31BF2jJJZlsjd8GqVUh48Kz77/2TxW/sjSeGmJIGoGRLzIHjfN7rUMnNp7gj50IAOeF1oJ5WEo=@vger.kernel.org X-Gm-Message-State: AOJu0YzT+76dfL5E0LlzY7za0FwCjHwBiJmeF6d4+Ct+AY6AjhFDOg5j 5owdDciXDMhLFauCQX1b7JMJzccNSG7t8m7DkZkedpXSkJNoZlZ2uUoXOLC4zmmPBSVCFVjjmTH vs0P5tblF4hg48rWE0EeAakKDnA== X-Google-Smtp-Source: AGHT+IH9pb2nOAVzd+tT3ShmCTFFviJ+5ZlU2s7fnOIWaU7dRI+FLtSvZUReomAAqqXsrjcDFUQgssBhev3ROJLBjg== X-Received: from pgct17.prod.google.com ([2002:a05:6a02:5291:b0:af9:8f44:d7ec]) (user=ackerleytng job=prod-delivery.src-stubby-dispatcher) by 2002:a05:6a21:394b:b0:21a:de8e:44a9 with SMTP id adf61e73a8af0-21bad1e8773mr14639083637.37.1748891890493; Mon, 02 Jun 2025 12:18:10 -0700 (PDT) Date: Mon, 2 Jun 2025 12:17:55 -0700 In-Reply-To: Precedence: bulk X-Mailing-List: 
linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: X-Mailer: git-send-email 2.49.0.1204.g71687c7c1d-goog Message-ID: <425cd410403e8913b42552d892add6ca543ec869.1748890962.git.ackerleytng@google.com> Subject: [PATCH 2/2] KVM: guest_memfd: Use guest mem inodes instead of anonymous inodes From: Ackerley Tng To: kvm@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, x86@kernel.org, linux-fsdevel@vger.kernel.org Cc: ackerleytng@google.com, aik@amd.com, ajones@ventanamicro.com, akpm@linux-foundation.org, amoorthy@google.com, anthony.yznaga@oracle.com, anup@brainfault.org, aou@eecs.berkeley.edu, bfoster@redhat.com, binbin.wu@linux.intel.com, brauner@kernel.org, catalin.marinas@arm.com, chao.p.peng@intel.com, chenhuacai@kernel.org, dave.hansen@intel.com, david@redhat.com, dmatlack@google.com, dwmw@amazon.co.uk, erdemaktas@google.com, fan.du@intel.com, fvdl@google.com, graf@amazon.com, haibo1.xu@intel.com, hch@infradead.org, hughd@google.com, ira.weiny@intel.com, isaku.yamahata@intel.com, jack@suse.cz, james.morse@arm.com, jarkko@kernel.org, jgg@ziepe.ca, jgowans@amazon.com, jhubbard@nvidia.com, jroedel@suse.de, jthoughton@google.com, jun.miao@intel.com, kai.huang@intel.com, keirf@google.com, kent.overstreet@linux.dev, kirill.shutemov@intel.com, liam.merwick@oracle.com, maciej.wieczor-retman@intel.com, mail@maciej.szmigiero.name, maz@kernel.org, mic@digikod.net, michael.roth@amd.com, mpe@ellerman.id.au, muchun.song@linux.dev, nikunj@amd.com, nsaenz@amazon.es, oliver.upton@linux.dev, palmer@dabbelt.com, pankaj.gupta@amd.com, paul.walmsley@sifive.com, pbonzini@redhat.com, pdurrant@amazon.co.uk, peterx@redhat.com, pgonda@google.com, pvorel@suse.cz, qperret@google.com, quic_cvanscha@quicinc.com, quic_eberman@quicinc.com, quic_mnalajal@quicinc.com, quic_pderrin@quicinc.com, quic_pheragu@quicinc.com, quic_svaddagi@quicinc.com, quic_tsoni@quicinc.com, richard.weiyang@gmail.com, rick.p.edgecombe@intel.com, 
rientjes@google.com, roypat@amazon.co.uk, rppt@kernel.org, seanjc@google.com, shuah@kernel.org, steven.price@arm.com, steven.sistare@oracle.com, suzuki.poulose@arm.com, tabba@google.com, thomas.lendacky@amd.com, vannapurve@google.com, vbabka@suse.cz, viro@zeniv.linux.org.uk, vkuznets@redhat.com, wei.w.wang@intel.com, will@kernel.org, willy@infradead.org, xiaoyao.li@intel.com, yan.y.zhao@intel.com, yilun.xu@intel.com, yuzenghui@huawei.com, zhiquan1.li@intel.com
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

guest_memfd's inode represents memory the guest_memfd is
providing. guest_memfd's file represents a struct kvm's view of that
memory.

Using a custom inode allows customization of the inode teardown
process via callbacks. For example, ->evict_inode() allows
customization of the truncation process on file close, and
->destroy_inode() and ->free_inode() allow customization of the inode
freeing process.

Customizing the truncation process allows flexibility in management of
guest_memfd memory and customization of the inode freeing process
allows proper cleanup of memory metadata stored on the inode.

Memory metadata is more appropriately stored on the inode (as opposed
to the file), since the metadata is for the memory and is not unique
to a specific binding and struct kvm.
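The inode-versus-file argument above can be pictured with a userspace
toy model. This is a sketch only, not kernel code: every toy_* name is
hypothetical, and the one assumption taken from the text is that memory
metadata hangs off the inode and is released from inode teardown
callbacks, while a file is merely one view of that inode.

```c
#include <stdlib.h>

struct toy_inode;

/* Hypothetical stand-in for the teardown hooks in super_operations. */
struct toy_super_ops {
	/* Loosely modeled on ->evict_inode(): drop the backing memory. */
	void (*evict_inode)(struct toy_inode *inode);
	/* Loosely modeled on ->free_inode(): free inode and metadata. */
	void (*free_inode)(struct toy_inode *inode);
};

struct toy_inode {
	int refcount;                    /* number of open views */
	void *metadata;                  /* describes the memory itself */
	const struct toy_super_ops *ops;
};

struct toy_file {
	struct toy_inode *inode;
	void *view;                      /* per-binding state lives here */
};

static int toy_evicted, toy_freed;       /* counters for observation only */

static void toy_evict(struct toy_inode *inode)
{
	(void)inode;
	toy_evicted++;
}

static void toy_free(struct toy_inode *inode)
{
	free(inode->metadata);
	free(inode);
	toy_freed++;
}

static const struct toy_super_ops toy_ops = {
	.evict_inode = toy_evict,
	.free_inode = toy_free,
};

static struct toy_inode *toy_inode_new(void)
{
	struct toy_inode *inode = calloc(1, sizeof(*inode));

	if (!inode)
		return NULL;
	inode->metadata = malloc(16);
	inode->ops = &toy_ops;
	return inode;
}

static struct toy_file toy_open(struct toy_inode *inode)
{
	struct toy_file file = { .inode = inode, .view = NULL };

	inode->refcount++;
	return file;
}

/* Teardown callbacks run only when the last view goes away. */
static void toy_close(struct toy_file *file)
{
	struct toy_inode *inode = file->inode;

	file->inode = NULL;
	if (--inode->refcount == 0) {
		inode->ops->evict_inode(inode);
		inode->ops->free_inode(inode);
	}
}
```

In this model, closing one of two views leaves the metadata alone, and
closing the last one runs the evict hook followed by the free hook
exactly once. Whether guest_memfd ever has more than one file per inode
is outside what this patch states.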
Signed-off-by: Ackerley Tng
Signed-off-by: Fuad Tabba
---
 include/uapi/linux/magic.h |   1 +
 virt/kvm/guest_memfd.c     | 134 +++++++++++++++++++++++++++++++------
 virt/kvm/kvm_main.c        |   7 +-
 virt/kvm/kvm_mm.h          |   9 ++-
 4 files changed, 125 insertions(+), 26 deletions(-)

diff --git a/include/uapi/linux/magic.h b/include/uapi/linux/magic.h
index bb575f3ab45e..638ca21b7a90 100644
--- a/include/uapi/linux/magic.h
+++ b/include/uapi/linux/magic.h
@@ -103,5 +103,6 @@
 #define DEVMEM_MAGIC		0x454d444d	/* "DMEM" */
 #define SECRETMEM_MAGIC		0x5345434d	/* "SECM" */
 #define PID_FS_MAGIC		0x50494446	/* "PIDF" */
+#define GUEST_MEMFD_MAGIC	0x474d454d	/* "GMEM" */
 
 #endif /* __LINUX_MAGIC_H__ */
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index b2aa6bf24d3a..1283b85aeb44 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -1,12 +1,16 @@
 // SPDX-License-Identifier: GPL-2.0
+#include
 #include
 #include
+#include
 #include
+#include
 #include
-#include
 
 #include "kvm_mm.h"
 
+static struct vfsmount *kvm_gmem_mnt;
+
 struct kvm_gmem {
 	struct kvm *kvm;
 	struct xarray bindings;
@@ -318,9 +322,51 @@ static struct file_operations kvm_gmem_fops = {
 	.fallocate = kvm_gmem_fallocate,
 };
 
-void kvm_gmem_init(struct module *module)
+static const struct super_operations kvm_gmem_super_operations = {
+	.statfs = simple_statfs,
+};
+
+static int kvm_gmem_init_fs_context(struct fs_context *fc)
+{
+	struct pseudo_fs_context *ctx;
+
+	if (!init_pseudo(fc, GUEST_MEMFD_MAGIC))
+		return -ENOMEM;
+
+	ctx = fc->fs_private;
+	ctx->ops = &kvm_gmem_super_operations;
+
+	return 0;
+}
+
+static struct file_system_type kvm_gmem_fs = {
+	.name = "kvm_guest_memory",
+	.init_fs_context = kvm_gmem_init_fs_context,
+	.kill_sb = kill_anon_super,
+};
+
+static int kvm_gmem_init_mount(void)
+{
+	kvm_gmem_mnt = kern_mount(&kvm_gmem_fs);
+
+	if (WARN_ON_ONCE(IS_ERR(kvm_gmem_mnt)))
+		return PTR_ERR(kvm_gmem_mnt);
+
+	kvm_gmem_mnt->mnt_flags |= MNT_NOEXEC;
+	return 0;
+}
+
+int kvm_gmem_init(struct module *module)
 {
 	kvm_gmem_fops.owner = module;
+
+	return kvm_gmem_init_mount();
+}
+
+void kvm_gmem_exit(void)
+{
+	kern_unmount(kvm_gmem_mnt);
+	kvm_gmem_mnt = NULL;
 }
 
 static int kvm_gmem_migrate_folio(struct address_space *mapping,
@@ -402,11 +448,71 @@ static const struct inode_operations kvm_gmem_iops = {
 	.setattr = kvm_gmem_setattr,
 };
 
+static struct inode *kvm_gmem_inode_make_secure_inode(const char *name,
+						      loff_t size, u64 flags)
+{
+	struct inode *inode;
+
+	inode = alloc_anon_secure_inode(kvm_gmem_mnt->mnt_sb, name);
+	if (IS_ERR(inode))
+		return inode;
+
+	inode->i_private = (void *)(unsigned long)flags;
+	inode->i_op = &kvm_gmem_iops;
+	inode->i_mapping->a_ops = &kvm_gmem_aops;
+	inode->i_mode |= S_IFREG;
+	inode->i_size = size;
+	mapping_set_gfp_mask(inode->i_mapping, GFP_HIGHUSER);
+	mapping_set_inaccessible(inode->i_mapping);
+	/* Unmovable mappings are supposed to be marked unevictable as well. */
+	WARN_ON_ONCE(!mapping_unevictable(inode->i_mapping));
+
+	return inode;
+}
+
+static struct file *kvm_gmem_inode_create_getfile(void *priv, loff_t size,
+						  u64 flags)
+{
+	static const char *name = "[kvm-gmem]";
+	struct inode *inode;
+	struct file *file;
+	int err;
+
+	err = -ENOENT;
+	if (!try_module_get(kvm_gmem_fops.owner))
+		goto err;
+
+	inode = kvm_gmem_inode_make_secure_inode(name, size, flags);
+	if (IS_ERR(inode)) {
+		err = PTR_ERR(inode);
+		goto err_put_module;
+	}
+
+	file = alloc_file_pseudo(inode, kvm_gmem_mnt, name, O_RDWR,
+				 &kvm_gmem_fops);
+	if (IS_ERR(file)) {
+		err = PTR_ERR(file);
+		goto err_put_inode;
+	}
+
+	file->f_flags |= O_LARGEFILE;
+	file->private_data = priv;
+
+out:
+	return file;
+
+err_put_inode:
+	iput(inode);
+err_put_module:
+	module_put(kvm_gmem_fops.owner);
+err:
+	file = ERR_PTR(err);
+	goto out;
+}
+
 static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags)
 {
-	const char *anon_name = "[kvm-gmem]";
 	struct kvm_gmem *gmem;
-	struct inode *inode;
 	struct file *file;
 	int fd, err;
 
@@ -420,32 +526,16 @@ static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags)
 		goto err_fd;
 	}
 
-	file = anon_inode_create_getfile(anon_name, &kvm_gmem_fops, gmem,
-					 O_RDWR, NULL);
+	file = kvm_gmem_inode_create_getfile(gmem, size, flags);
 	if (IS_ERR(file)) {
 		err = PTR_ERR(file);
 		goto err_gmem;
 	}
 
-	file->f_flags |= O_LARGEFILE;
-
-	inode = file->f_inode;
-	WARN_ON(file->f_mapping != inode->i_mapping);
-
-	inode->i_private = (void *)(unsigned long)flags;
-	inode->i_op = &kvm_gmem_iops;
-	inode->i_mapping->a_ops = &kvm_gmem_aops;
-	inode->i_mode |= S_IFREG;
-	inode->i_size = size;
-	mapping_set_gfp_mask(inode->i_mapping, GFP_HIGHUSER);
-	mapping_set_inaccessible(inode->i_mapping);
-	/* Unmovable mappings are supposed to be marked unevictable as well. */
-	WARN_ON_ONCE(!mapping_unevictable(inode->i_mapping));
-
 	kvm_get_kvm(kvm);
 	gmem->kvm = kvm;
 	xa_init(&gmem->bindings);
-	list_add(&gmem->entry, &inode->i_mapping->i_private_list);
+	list_add(&gmem->entry, &file_inode(file)->i_mapping->i_private_list);
 
 	fd_install(fd, file);
 	return fd;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index e85b33a92624..094cc0ad31fb 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -6420,7 +6420,9 @@ int kvm_init(unsigned vcpu_size, unsigned vcpu_align, struct module *module)
 	if (WARN_ON_ONCE(r))
 		goto err_vfio;
 
-	kvm_gmem_init(module);
+	r = kvm_gmem_init(module);
+	if (r)
+		goto err_gmem;
 
 	r = kvm_init_virtualization();
 	if (r)
@@ -6441,6 +6443,8 @@ int kvm_init(unsigned vcpu_size, unsigned vcpu_align, struct module *module)
 err_register:
 	kvm_uninit_virtualization();
 err_virt:
+	kvm_gmem_exit();
+err_gmem:
 	kvm_vfio_ops_exit();
 err_vfio:
 	kvm_async_pf_deinit();
@@ -6472,6 +6476,7 @@ void kvm_exit(void)
 	for_each_possible_cpu(cpu)
 		free_cpumask_var(per_cpu(cpu_kick_mask, cpu));
 	kmem_cache_destroy(kvm_vcpu_cache);
+	kvm_gmem_exit();
 	kvm_vfio_ops_exit();
 	kvm_async_pf_deinit();
 	kvm_irqfd_exit();
diff --git a/virt/kvm/kvm_mm.h b/virt/kvm/kvm_mm.h
index acef3f5c582a..dcacb76b8f00 100644
--- a/virt/kvm/kvm_mm.h
+++ b/virt/kvm/kvm_mm.h
@@ -68,17 +68,20 @@ static inline void gfn_to_pfn_cache_invalidate_start(struct kvm *kvm,
 #endif /* HAVE_KVM_PFNCACHE */
 
 #ifdef CONFIG_KVM_PRIVATE_MEM
-void kvm_gmem_init(struct module *module);
+int kvm_gmem_init(struct module *module);
+void kvm_gmem_exit(void);
 int kvm_gmem_create(struct kvm *kvm, struct kvm_create_guest_memfd *args);
 int kvm_gmem_bind(struct kvm *kvm, struct kvm_memory_slot *slot,
 		  unsigned int fd, loff_t offset);
 void kvm_gmem_unbind(struct kvm_memory_slot *slot);
 #else
-static inline void kvm_gmem_init(struct module *module)
+static inline int kvm_gmem_init(struct module *module)
 {
-
+	return 0;
 }
 
+static inline void kvm_gmem_exit(void) {};
+
 static inline int kvm_gmem_bind(struct kvm *kvm,
 					 struct kvm_memory_slot *slot,
 					 unsigned int fd, loff_t offset)
-- 
2.49.0.1204.g71687c7c1d-goog
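
The GUEST_MEMFD_MAGIC value added to magic.h appears to follow the
convention of its neighbours: the 32-bit constant is the four ASCII
characters from the trailing comment, packed with the first character
in the most significant byte. A minimal userspace sketch that checks
this reading; the helper magic_from_tag() is hypothetical and not part
of the patch, and the two constants are copied from the hunk.

```c
#include <stdint.h>

/* Values copied from the include/uapi/linux/magic.h hunk. */
#define SECRETMEM_MAGIC   0x5345434dU /* "SECM" */
#define GUEST_MEMFD_MAGIC 0x474d454dU /* "GMEM" */

/*
 * Hypothetical helper (not in the patch): pack a four-character tag
 * into a 32-bit value, first character in the most significant byte.
 */
static uint32_t magic_from_tag(const char tag[4])
{
	return ((uint32_t)(unsigned char)tag[0] << 24) |
	       ((uint32_t)(unsigned char)tag[1] << 16) |
	       ((uint32_t)(unsigned char)tag[2] << 8) |
	       (uint32_t)(unsigned char)tag[3];
}
```

Under this reading, magic_from_tag("GMEM") equals GUEST_MEMFD_MAGIC and
magic_from_tag("SECM") equals SECRETMEM_MAGIC. Because the new
super_operations wire up simple_statfs, this is presumably the value a
statfs-style query on the new pseudo filesystem would report as its
type.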