From: Elliot Berman
Date: Thu, 22 Feb 2024 15:16:47 -0800
Subject: [PATCH v17 24/35] virt: gunyah: Add ioctl to bind guestmem to VMs
Message-ID: <20240222-gunyah-v17-24-1e9da6763d38@quicinc.com>
References: <20240222-gunyah-v17-0-1e9da6763d38@quicinc.com>
In-Reply-To: <20240222-gunyah-v17-0-1e9da6763d38@quicinc.com>
X-Mailing-List: linux-kernel@vger.kernel.org
To: Alex Elder, Srinivas Kandagatla, Murali Nalajal, Trilok Soni,
 Srivatsa Vaddagiri, Carl van Schaik, Philip Derrin,
 Prakruthi Deepak Heragu, Jonathan Corbet, Rob Herring,
 Krzysztof Kozlowski, Conor Dooley, Catalin Marinas, Will Deacon,
 Konrad Dybcio, Bjorn Andersson, Dmitry Baryshkov, Fuad Tabba,
 Sean Christopherson, Andrew Morton
CC: Elliot Berman

A maple tree is used to maintain a map from guest address ranges to a
guestmemfd that provides the memory for that range of memory for the
guest. The mapping of guest address range to guestmemfd is called a
binding. Implement an ioctl to add/remove bindings to the virtual
machine. The binding determines whether the memory is shared (host
retains access) or lent (host loses access).

Signed-off-by: Elliot Berman
---
 drivers/virt/gunyah/guest_memfd.c | 426 +++++++++++++++++++++++++++++++++++++-
 drivers/virt/gunyah/vm_mgr.c      |  20 ++
 drivers/virt/gunyah/vm_mgr.h      |   9 +
 include/uapi/linux/gunyah.h       |  41 ++++
 4 files changed, 488 insertions(+), 8 deletions(-)

diff --git a/drivers/virt/gunyah/guest_memfd.c b/drivers/virt/gunyah/guest_memfd.c
index 19c72aa1d0e5e..d6be8bf67d996 100644
--- a/drivers/virt/gunyah/guest_memfd.c
+++ b/drivers/virt/gunyah/guest_memfd.c
@@ -16,6 +16,51 @@
 
 #include "vm_mgr.h"
 
+/**
+ * struct gunyah_gmem_binding - Represents a binding of guestmem to a Gunyah VM
+ * @gfn: Guest address to place acquired folios
+ * @ghvm: Pointer to Gunyah VM in this binding
+ * @i_off: offset into the guestmem to grab folios from
+ * @file: Pointer to guest_memfd
+ * @i_entry: list entry for inode->i_private_list
+ * @flags: Access flags for the binding
+ * @nr: Number of pages covered by this binding
+ */
+struct gunyah_gmem_binding {
+	u64 gfn;
+	struct gunyah_vm *ghvm;
+
+	pgoff_t i_off;
+	struct file *file;
+	struct list_head i_entry;
+
+	u32 flags;
+	unsigned long nr;
+};
+
+static inline pgoff_t gunyah_gfn_to_off(struct gunyah_gmem_binding *b, u64 gfn)
+{
+	return gfn - b->gfn + b->i_off;
+}
+
+static inline u64 gunyah_off_to_gfn(struct gunyah_gmem_binding *b, pgoff_t off)
+{
+	return off - b->i_off + b->gfn;
+}
+
+static inline bool gunyah_guest_mem_is_lend(struct gunyah_vm *ghvm, u32 flags)
+{
+	u8 access = flags & GUNYAH_MEM_ACCESS_MASK;
+
+	if (access == GUNYAH_MEM_FORCE_LEND)
+		return true;
+	else if (access == GUNYAH_MEM_FORCE_SHARE)
+		return false;
+
+	/* RM requires all VMs to be protected (isolated) */
+	return true;
+}
+
 static struct folio *gunyah_gmem_get_huge_folio(struct inode *inode,
 						pgoff_t index)
 {
@@ -83,6 +128,38 @@ static struct folio *gunyah_gmem_get_folio(struct inode *inode, pgoff_t index)
 	return folio;
 }
 
+/**
+ * gunyah_gmem_launder_folio() - Tries to unmap one folio from virtual machine(s)
+ * @folio: The folio to unmap
+ *
+ * Returns - 0 if the folio has been reclaimed from any virtual machine(s) that
+ * folio was mapped into.
+ */
+static int gunyah_gmem_launder_folio(struct folio *folio)
+{
+	struct address_space *const mapping = folio->mapping;
+	struct gunyah_gmem_binding *b;
+	pgoff_t index = folio_index(folio);
+	int ret = 0;
+	u64 gfn;
+
+	filemap_invalidate_lock_shared(mapping);
+	list_for_each_entry(b, &mapping->i_private_list, i_entry) {
+		/* if the mapping doesn't cover this folio: skip */
+		if (b->i_off > index || index > b->i_off + b->nr)
+			continue;
+
+		gfn = gunyah_off_to_gfn(b, index);
+		ret = gunyah_vm_reclaim_folio(b->ghvm, gfn, folio);
+		if (WARN_RATELIMIT(ret, "failed to reclaim gfn: %08llx %d\n",
+				   gfn, ret))
+			break;
+	}
+	filemap_invalidate_unlock_shared(mapping);
+
+	return ret;
+}
+
 static vm_fault_t gunyah_gmem_host_fault(struct vm_fault *vmf)
 {
 	struct folio *folio;
@@ -92,6 +169,15 @@ static vm_fault_t gunyah_gmem_host_fault(struct vm_fault *vmf)
 	if (!folio)
 		return VM_FAULT_SIGBUS;
 
+	/* If the folio is lent to a VM, try to reclaim it */
+	if (folio_test_private(folio) && gunyah_gmem_launder_folio(folio)) {
+		folio_unlock(folio);
+		folio_put(folio);
+		return VM_FAULT_SIGBUS;
+	}
+	/* gunyah_gmem_launder_folio should clear the private bit if it returns 0 */
+	BUG_ON(folio_test_private(folio));
+
 	vmf->page = folio_file_page(folio, vmf->pgoff);
 
 	return VM_FAULT_LOCKED;
@@ -103,9 +189,48 @@ static const struct vm_operations_struct gunyah_gmem_vm_ops = {
 
 static int gunyah_gmem_mmap(struct file *file, struct vm_area_struct *vma)
 {
-	file_accessed(file);
-	vma->vm_ops = &gunyah_gmem_vm_ops;
-	return 0;
+	struct address_space *const mapping = file->f_mapping;
+	struct gunyah_gmem_binding *b;
+	pgoff_t end_off;
+	int ret = 0;
+	u64 gfn, nr;
+
+	/* No support for private mappings to avoid COW. */
+	if ((vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) !=
+	    (VM_SHARED | VM_MAYSHARE)) {
+		return -EINVAL;
+	}
+
+	filemap_invalidate_lock_shared(mapping);
+	/**
+	 * userspace can only mmap if the folios covered by the requested
+	 * offset are not lent to the guest
+	 */
+	list_for_each_entry(b, &mapping->i_private_list, i_entry) {
+		if (!gunyah_guest_mem_is_lend(b->ghvm, b->flags))
+			continue;
+
+		/* if the binding doesn't cover this vma: skip */
+		if (vma->vm_pgoff + vma_pages(vma) < b->i_off)
+			continue;
+		if (vma->vm_pgoff > b->i_off + b->nr)
+			continue;
+
+		gfn = gunyah_off_to_gfn(b, vma->vm_pgoff);
+		end_off = max(vma->vm_pgoff + vma_pages(vma), b->i_off + b->nr);
+		nr = gunyah_off_to_gfn(b, end_off) - gfn;
+		ret = gunyah_vm_reclaim_range(b->ghvm, gfn, nr);
+		if (ret)
+			break;
+	}
+	filemap_invalidate_unlock_shared(mapping);
+
+	if (!ret) {
+		file_accessed(file);
+		vma->vm_ops = &gunyah_gmem_vm_ops;
+	}
+
+	return ret;
 }
 
 /**
@@ -122,9 +247,7 @@ static int gunyah_gmem_mmap(struct file *file, struct vm_area_struct *vma)
 static long gunyah_gmem_punch_hole(struct inode *inode, loff_t offset,
 				   loff_t len)
 {
-	truncate_inode_pages_range(inode->i_mapping, offset, offset + len - 1);
-
-	return 0;
+	return invalidate_inode_pages2_range(inode->i_mapping, offset, offset + len - 1);
 }
 
 static long gunyah_gmem_allocate(struct inode *inode, loff_t offset, loff_t len)
@@ -201,6 +324,12 @@ static long gunyah_gmem_fallocate(struct file *file, int mode, loff_t offset,
 
 static int gunyah_gmem_release(struct inode *inode, struct file *file)
 {
+	/**
+	 * each binding increments refcount on file, so we shouldn't be here
+	 * if i_private_list not empty.
+	 */
+	BUG_ON(!list_empty(&inode->i_mapping->i_private_list));
+
 	return 0;
 }
 
@@ -213,10 +342,26 @@ static const struct file_operations gunyah_gmem_fops = {
 	.release = gunyah_gmem_release,
 };
 
+static bool gunyah_gmem_release_folio(struct folio *folio, gfp_t gfp_flags)
+{
+	/* should return true if released; launder folio returns 0 if freed */
+	return !gunyah_gmem_launder_folio(folio);
+}
+
+static int gunyah_gmem_remove_folio(struct address_space *mapping,
+				    struct folio *folio)
+{
+	if (mapping != folio->mapping)
+		return -EINVAL;
+
+	return gunyah_gmem_launder_folio(folio);
+}
+
 static const struct address_space_operations gunyah_gmem_aops = {
 	.dirty_folio = noop_dirty_folio,
-	.migrate_folio = migrate_folio,
-	.error_remove_folio = generic_error_remove_folio,
+	.release_folio = gunyah_gmem_release_folio,
+	.launder_folio = gunyah_gmem_launder_folio,
+	.error_remove_folio = gunyah_gmem_remove_folio,
 };
 
 int gunyah_guest_mem_create(struct gunyah_create_mem_args *args)
@@ -264,6 +409,7 @@ int gunyah_guest_mem_create(struct gunyah_create_mem_args *args)
 	mapping_set_gfp_mask(inode->i_mapping, GFP_HIGHUSER);
 	mapping_set_large_folios(inode->i_mapping);
 	mapping_set_unmovable(inode->i_mapping);
+	mapping_set_release_always(inode->i_mapping);
 	/* Unmovable mappings are supposed to be marked unevictable as well. */
 	WARN_ON_ONCE(!mapping_unevictable(inode->i_mapping));
 
@@ -274,3 +420,267 @@ int gunyah_guest_mem_create(struct gunyah_create_mem_args *args)
 	put_unused_fd(fd);
 	return err;
 }
+
+void gunyah_gmem_remove_binding(struct gunyah_gmem_binding *b)
+{
+	WARN_ON(gunyah_vm_reclaim_range(b->ghvm, b->gfn, b->nr));
+	mtree_erase(&b->ghvm->bindings, b->gfn);
+	list_del(&b->i_entry);
+	fput(b->file);
+	kfree(b);
+}
+
+static inline unsigned long gunyah_gmem_page_mask(struct file *file)
+{
+	unsigned long gmem_flags = (unsigned long)file_inode(file)->i_private;
+
+	if (gmem_flags & GHMF_ALLOW_HUGEPAGE) {
+#if IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)
+		return HPAGE_PMD_MASK;
+#else
+		return ULONG_MAX;
+#endif
+	}
+
+	return PAGE_MASK;
+}
+
+static int gunyah_gmem_init_binding(struct gunyah_vm *ghvm, struct file *file,
+				    struct gunyah_map_mem_args *args,
+				    struct gunyah_gmem_binding *binding)
+{
+	const unsigned long page_mask = ~gunyah_gmem_page_mask(file);
+
+	if (args->flags & ~(GUNYAH_MEM_ALLOW_RWX | GUNYAH_MEM_ACCESS_MASK))
+		return -EINVAL;
+
+	if (args->guest_addr & page_mask)
+		return -EINVAL;
+
+	if (args->offset & page_mask)
+		return -EINVAL;
+
+	if (args->size & page_mask)
+		return -EINVAL;
+
+	binding->gfn = gunyah_gpa_to_gfn(args->guest_addr);
+	binding->ghvm = ghvm;
+	binding->i_off = args->offset >> PAGE_SHIFT;
+	binding->file = file;
+	binding->flags = args->flags;
+	binding->nr = args->size >> PAGE_SHIFT;
+
+	return 0;
+}
+
+static int gunyah_gmem_trim_binding(struct gunyah_gmem_binding *b,
+				    unsigned long start_delta,
+				    unsigned long end_delta)
+{
+	struct gunyah_vm *ghvm = b->ghvm;
+	int ret;
+
+	down_write(&ghvm->bindings_lock);
+	if (!start_delta && !end_delta) {
+		ret = gunyah_vm_reclaim_range(ghvm, b->gfn, b->nr);
+		if (ret)
+			goto unlock;
+		gunyah_gmem_remove_binding(b);
+	} else if (start_delta && !end_delta) {
+		/* keep the start */
+		ret = gunyah_vm_reclaim_range(ghvm, b->gfn + start_delta,
+					      b->gfn + b->nr);
+		if (ret)
+			goto unlock;
+		mtree_erase(&ghvm->bindings, b->gfn);
+		b->nr = start_delta;
+		ret = mtree_insert_range(&ghvm->bindings, b->gfn,
+					 b->gfn + b->nr - 1, b, GFP_KERNEL);
+	} else if (!start_delta && end_delta) {
+		/* keep the end */
+		ret = gunyah_vm_reclaim_range(ghvm, b->gfn,
+					      b->gfn + b->nr - end_delta);
+		if (ret)
+			goto unlock;
+		mtree_erase(&ghvm->bindings, b->gfn);
+		b->gfn += b->nr - end_delta;
+		b->i_off += b->nr - end_delta;
+		b->nr = end_delta;
+		ret = mtree_insert_range(&ghvm->bindings, b->gfn,
+					 b->gfn + b->nr - 1, b, GFP_KERNEL);
+	} else {
+		/* TODO: split the mapping into 2 */
+		ret = -EINVAL;
+	}
+
+unlock:
+	up_write(&ghvm->bindings_lock);
+	return ret;
+}
+
+static int gunyah_gmem_remove_mapping(struct gunyah_vm *ghvm, struct file *file,
+				      struct gunyah_map_mem_args *args)
+{
+	struct inode *inode = file_inode(file);
+	struct gunyah_gmem_binding *b = NULL;
+	unsigned long start_delta, end_delta;
+	struct gunyah_gmem_binding remove;
+	int ret;
+
+	ret = gunyah_gmem_init_binding(ghvm, file, args, &remove);
+	if (ret)
+		return ret;
+
+	ret = -ENOENT;
+	filemap_invalidate_lock(inode->i_mapping);
+	list_for_each_entry(b, &inode->i_mapping->i_private_list, i_entry) {
+		if (b->ghvm != remove.ghvm || b->flags != remove.flags ||
+		    WARN_ON(b->file != remove.file))
+			continue;
+		/**
+		 * Test if the binding to remove is within this binding
+		 * [gfn b nr]
+		 * [gfn remove nr]
+		 */
+		if (b->gfn > remove.gfn)
+			continue;
+		if (b->gfn + b->nr < remove.gfn + remove.nr)
+			continue;
+
+		/**
+		 * We found the binding!
+		 * Compute the delta in gfn start and make sure the offset
+		 * into guest memfd matches.
+		 */
+		start_delta = remove.gfn - b->gfn;
+		if (remove.i_off - b->i_off != start_delta)
+			break;
+		end_delta = b->gfn + b->nr - remove.gfn - remove.nr;
+
+		ret = gunyah_gmem_trim_binding(b, start_delta, end_delta);
+		break;
+	}
+
+	filemap_invalidate_unlock(inode->i_mapping);
+	return ret;
+}
+
+static bool gunyah_gmem_binding_overlaps(struct gunyah_gmem_binding *a,
+					 struct gunyah_gmem_binding *b)
+{
+	/* assumes we are operating on the same file, check to be sure */
+	BUG_ON(a->file != b->file);
+
+	/**
+	 * Gunyah only guarantees we can share a page with one VM and
+	 * doesn't (currently) allow us to share same page with multiple VMs,
+	 * regardless whether host can also access.
+	 * Gunyah supports, but Linux hasn't implemented mapping same page
+	 * into 2 separate addresses in guest's address space. This doesn't
+	 * seem reasonable today, but we could do it later.
+	 * All this to justify: check that the `a` region doesn't overlap with
+	 * `b` region w.r.t. file offsets.
+	 */
+	if (a->i_off + a->nr <= b->i_off)
+		return false;
+	if (a->i_off >= b->i_off + b->nr)
+		return false;
+
+	return true;
+}
+
+static int gunyah_gmem_add_mapping(struct gunyah_vm *ghvm, struct file *file,
+				   struct gunyah_map_mem_args *args)
+{
+	struct gunyah_gmem_binding *b, *tmp = NULL;
+	struct inode *inode = file_inode(file);
+	int ret;
+
+	b = kzalloc(sizeof(*b), GFP_KERNEL);
+	if (!b)
+		return -ENOMEM;
+
+	ret = gunyah_gmem_init_binding(ghvm, file, args, b);
+	if (ret)
+		return ret;
+
+	/**
+	 * When lending memory, we need to unmap single page from kernel's
+	 * logical map. To do that, we need can_set_direct_map().
+	 * arm64 doesn't map at page granularity without rodata=full.
+	 */
+	if (gunyah_guest_mem_is_lend(ghvm, b->flags) && !can_set_direct_map()) {
+		kfree(b);
+		pr_warn_once("Cannot lend memory without rodata=full");
+		return -EINVAL;
+	}
+
+	/**
+	 * First, check that the region of guest memfd the user is binding isn't
+	 * already bound to some other guest region.
+	 */
+	filemap_invalidate_lock(inode->i_mapping);
+	list_for_each_entry(tmp, &inode->i_mapping->i_private_list, i_entry) {
+		if (gunyah_gmem_binding_overlaps(b, tmp)) {
+			ret = -EEXIST;
+			goto unlock;
+		}
+	}
+
+	/**
+	 * mtree_insert_range will check that user hasn't mapped some other guest
+	 * memfd region to the same addresses.
+	 */
+	ret = mtree_insert_range(&ghvm->bindings, b->gfn, b->gfn + b->nr - 1, b,
+				 GFP_KERNEL);
+	if (ret)
+		goto unlock;
+
+	list_add(&b->i_entry, &inode->i_mapping->i_private_list);
+
+unlock:
+	filemap_invalidate_unlock(inode->i_mapping);
+	return ret;
+}
+
+int gunyah_gmem_modify_mapping(struct gunyah_vm *ghvm,
+			       struct gunyah_map_mem_args *args)
+{
+	u8 access = args->flags & GUNYAH_MEM_ACCESS_MASK;
+	struct file *file;
+	int ret = -EINVAL;
+
+	file = fget(args->guest_mem_fd);
+	if (!file)
+		return -EINVAL;
+
+	if (file->f_op != &gunyah_gmem_fops)
+		goto err_file;
+
+	if (args->flags & ~(GUNYAH_MEM_ALLOW_RWX | GUNYAH_MEM_UNMAP | GUNYAH_MEM_ACCESS_MASK))
+		goto err_file;
+
+	/* VM needs to have some permissions to the memory */
+	if (!(args->flags & GUNYAH_MEM_ALLOW_RWX))
+		goto err_file;
+
+	if (access != GUNYAH_MEM_DEFAULT_ACCESS &&
+	    access != GUNYAH_MEM_FORCE_LEND && access != GUNYAH_MEM_FORCE_SHARE)
+		goto err_file;
+
+	if (!PAGE_ALIGNED(args->guest_addr) || !PAGE_ALIGNED(args->offset) ||
+	    !PAGE_ALIGNED(args->size))
+		goto err_file;
+
+	if (args->flags & GUNYAH_MEM_UNMAP) {
+		args->flags &= ~GUNYAH_MEM_UNMAP;
+		ret = gunyah_gmem_remove_mapping(ghvm, file, args);
+	} else {
+		ret = gunyah_gmem_add_mapping(ghvm, file, args);
+	}
+
+err_file:
+	if (ret)
+		fput(file);
+	return ret;
+}
diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
index 238a7a4da554b..43878a39d4714 100644
--- a/drivers/virt/gunyah/vm_mgr.c
+++ b/drivers/virt/gunyah/vm_mgr.c
@@ -404,6 +404,8 @@ static __must_check struct gunyah_vm *gunyah_vm_alloc(struct gunyah_rm *rm)
 	INIT_LIST_HEAD(&ghvm->resource_tickets);
 
 	mt_init(&ghvm->mm);
+	mt_init(&ghvm->bindings);
+	init_rwsem(&ghvm->bindings_lock);
 
 	ghvm->addrspace_ticket.resource_type = GUNYAH_RESOURCE_TYPE_ADDR_SPACE;
 	ghvm->addrspace_ticket.label = GUNYAH_VM_ADDRSPACE_LABEL;
@@ -556,6 +558,14 @@ static long gunyah_vm_ioctl(struct file *filp, unsigned int cmd,
 		r = gunyah_vm_rm_function_instance(ghvm, &f);
 		break;
 	}
+	case GUNYAH_VM_MAP_MEM: {
+		struct gunyah_map_mem_args args;
+
+		if (copy_from_user(&args, argp, sizeof(args)))
+			return -EFAULT;
+
+		return gunyah_gmem_modify_mapping(ghvm, &args);
+	}
 	default:
 		r = -ENOTTY;
 		break;
@@ -573,6 +583,8 @@ EXPORT_SYMBOL_GPL(gunyah_vm_get);
 static void _gunyah_vm_put(struct kref *kref)
 {
 	struct gunyah_vm *ghvm = container_of(kref, struct gunyah_vm, kref);
+	struct gunyah_gmem_binding *b;
+	unsigned long idx = 0;
 	int ret;
 
 	/**
@@ -584,6 +596,13 @@ static void _gunyah_vm_put(struct kref *kref)
 
 	gunyah_vm_remove_functions(ghvm);
 
+	down_write(&ghvm->bindings_lock);
+	mt_for_each(&ghvm->bindings, b, idx, ULONG_MAX) {
+		gunyah_gmem_remove_binding(b);
+	}
+	up_write(&ghvm->bindings_lock);
+	WARN_ON(!mtree_empty(&ghvm->bindings));
+	mtree_destroy(&ghvm->bindings);
 	/**
 	 * If this fails, we're going to lose the memory for good and is
 	 * BUG_ON-worthy, but not unrecoverable (we just lose memory).
@@ -615,6 +634,7 @@ static void _gunyah_vm_put(struct kref *kref)
 		/* clang-format on */
 	}
 
+	WARN_ON(!mtree_empty(&ghvm->mm));
 	mtree_destroy(&ghvm->mm);
 
 	if (ghvm->vm_status > GUNYAH_RM_VM_STATUS_NO_STATE) {
diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
index 7ac0dd066991b..d9f101fc52632 100644
--- a/drivers/virt/gunyah/vm_mgr.h
+++ b/drivers/virt/gunyah/vm_mgr.h
@@ -38,6 +38,9 @@ long gunyah_dev_vm_mgr_ioctl(struct gunyah_rm *rm, unsigned int cmd,
  * @mm: A maple tree of all memory that has been mapped to a VM.
  * Indices are guest frame numbers; entries are either folios or
  * RM mem parcels
+ * @bindings: A maple tree of guest memfd bindings. Indices are guest frame
+ * numbers; entries are &struct gunyah_gmem_binding
+ * @bindings_lock: For serialization to @bindings
  * @addrspace_ticket: Resource ticket to the capability for guest VM's
  * address space
  * @host_private_extent_ticket: Resource ticket to the capability for our
@@ -77,6 +80,8 @@ long gunyah_dev_vm_mgr_ioctl(struct gunyah_rm *rm, unsigned int cmd,
 struct gunyah_vm {
 	u16 vmid;
 	struct maple_tree mm;
+	struct maple_tree bindings;
+	struct rw_semaphore bindings_lock;
 	struct gunyah_vm_resource_ticket addrspace_ticket,
 		host_private_extent_ticket, host_shared_extent_ticket,
 		guest_private_extent_ticket, guest_shared_extent_ticket;
@@ -172,5 +177,9 @@ int gunyah_vm_reclaim_folio(struct gunyah_vm *ghvm, u64 gfn, struct folio *folio
 int gunyah_vm_reclaim_range(struct gunyah_vm *ghvm, u64 gfn, u64 nr);
 
 int gunyah_guest_mem_create(struct gunyah_create_mem_args *args);
+int gunyah_gmem_modify_mapping(struct gunyah_vm *ghvm,
+			       struct gunyah_map_mem_args *args);
+struct gunyah_gmem_binding;
+void gunyah_gmem_remove_binding(struct gunyah_gmem_binding *binding);
 
 #endif
diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
index c5f506350364d..1af4c5ae6bc3e 100644
--- a/include/uapi/linux/gunyah.h
+++ b/include/uapi/linux/gunyah.h
@@ -87,6 +87,47 @@ struct gunyah_fn_desc {
 #define GUNYAH_VM_ADD_FUNCTION _IOW(GUNYAH_IOCTL_TYPE, 0x4, struct gunyah_fn_desc)
 #define GUNYAH_VM_REMOVE_FUNCTION _IOW(GUNYAH_IOCTL_TYPE, 0x7, struct gunyah_fn_desc)
 
+/**
+ * enum gunyah_map_flags - Possible flags on &struct gunyah_map_mem_args
+ * @GUNYAH_MEM_DEFAULT_ACCESS: Use default host access for the VM type
+ * @GUNYAH_MEM_FORCE_LEND: Force unmapping the memory once the guest starts to use
+ * @GUNYAH_MEM_FORCE_SHARE: Allow host to continue accessing memory when guest starts to use
+ * @GUNYAH_MEM_ALLOW_READ: Allow guest to read memory
+ * @GUNYAH_MEM_ALLOW_WRITE: Allow guest to write to the memory
+ * @GUNYAH_MEM_ALLOW_EXEC: Allow guest to execute instructions in the memory
+ */
+enum gunyah_map_flags {
+	GUNYAH_MEM_DEFAULT_ACCESS = 0,
+	GUNYAH_MEM_FORCE_LEND = 1,
+	GUNYAH_MEM_FORCE_SHARE = 2,
+#define GUNYAH_MEM_ACCESS_MASK 0x7
+
+	GUNYAH_MEM_ALLOW_READ = 1UL << 4,
+	GUNYAH_MEM_ALLOW_WRITE = 1UL << 5,
+	GUNYAH_MEM_ALLOW_EXEC = 1UL << 6,
+	GUNYAH_MEM_ALLOW_RWX =
+		(GUNYAH_MEM_ALLOW_READ | GUNYAH_MEM_ALLOW_WRITE | GUNYAH_MEM_ALLOW_EXEC),
+
+	GUNYAH_MEM_UNMAP = 1UL << 8,
+};
+
+/**
+ * struct gunyah_map_mem_args - Description to provide guest memory into a VM
+ * @guest_addr: Location in guest address space to place the memory
+ * @flags: See &enum gunyah_map_flags.
+ * @guest_mem_fd: File descriptor created by GUNYAH_CREATE_GUEST_MEM
+ * @offset: Offset into the guest memory file
+ */
+struct gunyah_map_mem_args {
+	__u64 guest_addr;
+	__u32 flags;
+	__u32 guest_mem_fd;
+	__u64 offset;
+	__u64 size;
+};
+
+#define GUNYAH_VM_MAP_MEM _IOW(GUNYAH_IOCTL_TYPE, 0x9, struct gunyah_map_mem_args)
+
 /*
  * ioctls for vCPU fds
  */
-- 
2.34.1
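
For reference, the argument checks that gunyah_gmem_modify_mapping()
applies before adding or removing a binding can be mirrored in userspace
before the struct is handed to ioctl(vm_fd, GUNYAH_VM_MAP_MEM, &args).
What follows is only a minimal sketch, not part of the patch. It assumes
4 KiB pages and redefines the uapi constants and struct locally, where a
real VMM would include <linux/gunyah.h>. SKETCH_PAGE_SIZE, make_lend_rw()
and map_args_plausible() are hypothetical names used for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the values in the uapi hunk of this patch. */
enum {
	GUNYAH_MEM_DEFAULT_ACCESS = 0,
	GUNYAH_MEM_FORCE_LEND = 1,
	GUNYAH_MEM_FORCE_SHARE = 2,
	GUNYAH_MEM_ACCESS_MASK = 0x7,
	GUNYAH_MEM_ALLOW_READ = 1 << 4,
	GUNYAH_MEM_ALLOW_WRITE = 1 << 5,
	GUNYAH_MEM_ALLOW_EXEC = 1 << 6,
	GUNYAH_MEM_ALLOW_RWX = GUNYAH_MEM_ALLOW_READ | GUNYAH_MEM_ALLOW_WRITE |
			       GUNYAH_MEM_ALLOW_EXEC,
	GUNYAH_MEM_UNMAP = 1 << 8,
};

/* Same field order and widths as struct gunyah_map_mem_args in the patch. */
struct gunyah_map_mem_args {
	uint64_t guest_addr;
	uint32_t flags;
	uint32_t guest_mem_fd;
	uint64_t offset;
	uint64_t size;
};
static_assert(sizeof(struct gunyah_map_mem_args) == 32, "unexpected padding");

#define SKETCH_PAGE_SIZE 4096ull /* assumption: 4 KiB base pages */

static bool page_aligned(uint64_t v)
{
	return (v & (SKETCH_PAGE_SIZE - 1)) == 0;
}

/* Mirrors the flag and alignment checks in gunyah_gmem_modify_mapping(). */
static bool map_args_plausible(const struct gunyah_map_mem_args *a)
{
	const uint32_t known = GUNYAH_MEM_ALLOW_RWX | GUNYAH_MEM_UNMAP |
			       GUNYAH_MEM_ACCESS_MASK;
	uint32_t access = a->flags & GUNYAH_MEM_ACCESS_MASK;

	if (a->flags & ~known)
		return false;
	/* the guest needs at least one of read/write/exec */
	if (!(a->flags & GUNYAH_MEM_ALLOW_RWX))
		return false;
	if (access != GUNYAH_MEM_DEFAULT_ACCESS &&
	    access != GUNYAH_MEM_FORCE_LEND && access != GUNYAH_MEM_FORCE_SHARE)
		return false;

	return page_aligned(a->guest_addr) && page_aligned(a->offset) &&
	       page_aligned(a->size);
}

/* Describe a lent, guest read/write binding of [off, off + size). */
static struct gunyah_map_mem_args make_lend_rw(uint32_t gmem_fd, uint64_t gpa,
						uint64_t off, uint64_t size)
{
	struct gunyah_map_mem_args a = {
		.guest_addr = gpa,
		.flags = GUNYAH_MEM_FORCE_LEND | GUNYAH_MEM_ALLOW_READ |
			 GUNYAH_MEM_ALLOW_WRITE,
		.guest_mem_fd = gmem_fd,
		.offset = off,
		.size = size,
	};
	return a;
}
```

Removing the same binding would reuse the identical guest_addr, offset,
size and access flags with GUNYAH_MEM_UNMAP OR'ed in, since
gunyah_gmem_remove_mapping() matches on those fields. Note that the
kernel may apply a stricter alignment than this sketch when the
guest_memfd was created with huge pages allowed.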