From nobody Sat Nov 30 07:48:23 2024 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 16B4217BEBD for ; Tue, 10 Sep 2024 19:16:03 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725995765; cv=none; b=SL005g7ZFBfD6C68zwAuE1I6OkXf4GyxinOo6eGRp8kEiE0wIs31gwetfFbMTkfdI0QF30GRTIVBSSZUf2Zs34aGeFSLzK5DGoyfdBcE1iIMZmO5HkLM6zqB1HbxEJ8Uu/4ho4ibjRASyb4/cV5h2f+fUdRGYrjZO+DMmqBHoeY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725995765; c=relaxed/simple; bh=ENqsfC2AIXwZ7rYjQKdgaymmNFIejXxikgRDslPwSKU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=AdOpaM6b+FNB8Gbuq+M/elEn+6nQwk2dZI3Iqh8o7px+jX3Q4+U3ovRqvw4tBWFEYU69yNLB3OyZVcDUfNRIJ2ogTgvXea8+VyPbndyWQk0MCfsY5ZXzCAIB3NyCGEgTM0aq8W9/XlamGw6SHA2/pfQFipZm8K8wxhpcBuelwFY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=CH8d6O8j; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="CH8d6O8j" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1725995763; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: 
in-reply-to:in-reply-to:references:references; bh=ueb83hVIj3hVMTASy964Wv7fO6dBN/axyM8tuEYr7Bc=; b=CH8d6O8jatP6972tZSXWot/ILOicPcipeQxzb42+UW7bm3kOeg83HEHvRhYiiuREmusn6q m3l+lxn0LQALC62hr7eX7Pt18Cev5ClYxRkDAaxOkvRst4wWIa+syTFU9sJX1USffrQacs QCaUX+LMKiumCEEs3BDV1SDk5nauDXc= Received: from mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-116-doxVM81dNx2G7XLtemTiQw-1; Tue, 10 Sep 2024 15:15:59 -0400 X-MC-Unique: doxVM81dNx2G7XLtemTiQw-1 Received: from mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.4]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 9738F1955F28; Tue, 10 Sep 2024 19:15:56 +0000 (UTC) Received: from t14s.fritz.box (unknown [10.22.17.222]) by mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id E8F6E30001A1; Tue, 10 Sep 2024 19:15:50 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org, linux-s390@vger.kernel.org, virtualization@lists.linux.dev, David Hildenbrand , Heiko Carstens , Vasily Gorbik , Alexander Gordeev , Christian Borntraeger , Sven Schnelle , Thomas Huth , Cornelia Huck , Janosch Frank , Claudio Imbrenda , "Michael S. 
Tsirkin", Jason Wang, Xuan Zhuo, Eugenio Pérez, Andrew Morton
Subject: [PATCH v1 1/5] s390/kdump: implement is_kdump_kernel()
Date: Tue, 10 Sep 2024 21:15:35 +0200
Message-ID: <20240910191541.2179655-2-david@redhat.com>
In-Reply-To: <20240910191541.2179655-1-david@redhat.com>
References: <20240910191541.2179655-1-david@redhat.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.4
Content-Type: text/plain; charset="utf-8"

s390x currently always results in is_kdump_kernel() == false, because
it sets "elfcorehdr_addr = ELFCORE_ADDR_MAX;" early during setup_arch to
deactivate the elfcorehdr= kernel parameter.

Let's follow the powerpc example and implement our own logic.

This is required for virtio-mem to reliably identify a kdump
environment to not try hotplugging memory.

Signed-off-by: David Hildenbrand
Tested-by: Mario Casquero
---
 arch/s390/include/asm/kexec.h | 4 ++++
 arch/s390/kernel/crash_dump.c | 6 ++++++
 2 files changed, 10 insertions(+)

diff --git a/arch/s390/include/asm/kexec.h b/arch/s390/include/asm/kexec.h
index 1bd08eb56d5f..bd20543515f5 100644
--- a/arch/s390/include/asm/kexec.h
+++ b/arch/s390/include/asm/kexec.h
@@ -94,6 +94,9 @@ void arch_kexec_protect_crashkres(void);
 
 void arch_kexec_unprotect_crashkres(void);
 #define arch_kexec_unprotect_crashkres arch_kexec_unprotect_crashkres
+
+bool is_kdump_kernel(void);
+#define is_kdump_kernel is_kdump_kernel
 #endif
 
 #ifdef CONFIG_KEXEC_FILE
@@ -107,4 +110,5 @@ int arch_kexec_apply_relocations_add(struct purgatory_info *pi,
 int arch_kimage_file_post_load_cleanup(struct kimage *image);
 #define arch_kimage_file_post_load_cleanup arch_kimage_file_post_load_cleanup
 #endif
+
 #endif /*_S390_KEXEC_H */
diff --git a/arch/s390/kernel/crash_dump.c b/arch/s390/kernel/crash_dump.c
index edae13416196..cca1827d3d2e 100644
--- a/arch/s390/kernel/crash_dump.c
+++ b/arch/s390/kernel/crash_dump.c
@@ -237,6 +237,12 @@ int remap_oldmem_pfn_range(struct vm_area_struct *vma, unsigned long from,
 						    prot);
 }
 
+bool is_kdump_kernel(void)
+{
+	return oldmem_data.start && !is_ipl_type_dump();
+}
+EXPORT_SYMBOL_GPL(is_kdump_kernel);
+
 static const char *nt_name(Elf64_Word type)
 {
 	const char *name = "LINUX";
-- 
2.46.0

From nobody Sat Nov 30 07:48:23 2024
From: David Hildenbrand
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, linux-s390@vger.kernel.org,
 virtualization@lists.linux.dev, David Hildenbrand, Heiko Carstens,
 Vasily Gorbik, Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
 Thomas Huth, Cornelia Huck, Janosch Frank, Claudio Imbrenda,
 "Michael S. Tsirkin", Jason Wang, Xuan Zhuo, Eugenio Pérez,
 Andrew Morton
Subject: [PATCH v1 2/5] s390/physmem_info: query diag500(STORAGE_LIMIT) to support QEMU/KVM memory devices
Date: Tue, 10 Sep 2024 21:15:36 +0200
Message-ID: <20240910191541.2179655-3-david@redhat.com>
In-Reply-To: <20240910191541.2179655-1-david@redhat.com>
References: <20240910191541.2179655-1-david@redhat.com>

To support memory devices under QEMU/KVM, such as virtio-mem,
we have to prepare our kernel virtual address space accordingly and
have to know the highest possible physical memory address we might see
later: the storage limit.

The good old SCLP interface is not suitable for this use case. In
particular, memory owned by memory devices has no relationship to
storage increments, it is always detected using the device driver, and
unaware OSes (no driver) must never try making use of that memory.
Consequently this memory is located outside of the "maximum storage
increment"-indicated memory range.

Let's use our new diag500 STORAGE_LIMIT subcode to query this storage
limit that can exceed the "maximum storage increment", and use the
existing interfaces (i.e., SCLP) to obtain information about the initial
memory that is not owned+managed by memory devices.

If a hypervisor does not support such memory devices, the address
exposed through diag500 STORAGE_LIMIT will correspond to the maximum
storage increment exposed through SCLP.

To teach kdump on s390x to include memory owned by memory devices, there
will be ways to query the relevant memory ranges from the device via a
driver running in special kdump mode (like virtio-mem already implements
to filter /proc/vmcore access so we don't end up reading from unplugged
device blocks).

Signed-off-by: David Hildenbrand
Tested-by: Mario Casquero
---
 arch/s390/boot/physmem_info.c        | 46 ++++++++++++++++++++++++++--
 arch/s390/include/asm/physmem_info.h |  3 ++
 2 files changed, 46 insertions(+), 3 deletions(-)

diff --git a/arch/s390/boot/physmem_info.c b/arch/s390/boot/physmem_info.c
index 4c9ad8258f7e..9cac8550bdca 100644
--- a/arch/s390/boot/physmem_info.c
+++ b/arch/s390/boot/physmem_info.c
@@ -109,6 +109,38 @@ static int diag260(void)
 	return 0;
 }
 
+static int diag500_storage_limit(unsigned long *max_physmem_end)
+{
+	register unsigned long __nr asm("1") = 0x4;
+	register unsigned long __storage_limit asm("2") = 0;
+	unsigned long reg1, reg2;
+	psw_t old;
+
+	asm volatile(
+		"	mvc	0(16,%[psw_old]),0(%[psw_pgm])\n"
+		"	epsw	%[reg1],%[reg2]\n"
+		"	st	%[reg1],0(%[psw_pgm])\n"
+		"	st	%[reg2],4(%[psw_pgm])\n"
+		"	larl	%[reg1],1f\n"
+		"	stg	%[reg1],8(%[psw_pgm])\n"
+		"	diag	2,4,0x500\n"
+		"1:	mvc	0(16,%[psw_pgm]),0(%[psw_old])\n"
+		: [reg1] "=&d" (reg1),
+		  [reg2] "=&a" (reg2),
+		  "+&d" (__storage_limit),
+		  "=Q" (get_lowcore()->program_new_psw),
+		  "=Q" (old)
+		: [psw_old] "a" (&old),
+		  [psw_pgm] "a" (&get_lowcore()->program_new_psw),
+		  "d" (__nr)
+		: "memory");
+	if (!__storage_limit)
+		return -EINVAL;
+	/* convert inclusive end to exclusive end. */
+	*max_physmem_end = __storage_limit + 1;
+	return 0;
+}
+
 static int tprot(unsigned long addr)
 {
 	unsigned long reg1, reg2;
@@ -157,7 +189,9 @@ unsigned long detect_max_physmem_end(void)
 {
 	unsigned long max_physmem_end = 0;
 
-	if (!sclp_early_get_memsize(&max_physmem_end)) {
+	if (!diag500_storage_limit(&max_physmem_end)) {
+		physmem_info.info_source = MEM_DETECT_DIAG500_STOR_LIMIT;
+	} else if (!sclp_early_get_memsize(&max_physmem_end)) {
 		physmem_info.info_source = MEM_DETECT_SCLP_READ_INFO;
 	} else {
 		max_physmem_end = search_mem_end();
@@ -170,11 +204,17 @@ void detect_physmem_online_ranges(unsigned long max_physmem_end)
 {
 	if (!sclp_early_read_storage_info()) {
 		physmem_info.info_source = MEM_DETECT_SCLP_STOR_INFO;
+		return;
 	} else if (!diag260()) {
 		physmem_info.info_source = MEM_DETECT_DIAG260;
-	} else if (max_physmem_end) {
-		add_physmem_online_range(0, max_physmem_end);
+		return;
+	} else if (physmem_info.info_source == MEM_DETECT_DIAG500_STOR_LIMIT) {
+		max_physmem_end = 0;
+		if (!sclp_early_get_memsize(&max_physmem_end))
+			physmem_info.info_source = MEM_DETECT_SCLP_READ_INFO;
 	}
+	if (max_physmem_end)
+		add_physmem_online_range(0, max_physmem_end);
 }
 
 void physmem_set_usable_limit(unsigned long limit)
diff --git a/arch/s390/include/asm/physmem_info.h b/arch/s390/include/asm/physmem_info.h
index f45cfc8bc233..51b68a43e195 100644
--- a/arch/s390/include/asm/physmem_info.h
+++ b/arch/s390/include/asm/physmem_info.h
@@ -9,6 +9,7 @@ enum physmem_info_source {
 	MEM_DETECT_NONE = 0,
 	MEM_DETECT_SCLP_STOR_INFO,
 	MEM_DETECT_DIAG260,
+	MEM_DETECT_DIAG500_STOR_LIMIT,
 	MEM_DETECT_SCLP_READ_INFO,
 	MEM_DETECT_BIN_SEARCH
 };
@@ -107,6 +108,8 @@ static inline const char *get_physmem_info_source(void)
 		return "sclp storage info";
 	case MEM_DETECT_DIAG260:
 		return "diag260";
+	case MEM_DETECT_DIAG500_STOR_LIMIT:
+		return "diag500 storage limit";
 	case MEM_DETECT_SCLP_READ_INFO:
 		return "sclp read info";
 	case MEM_DETECT_BIN_SEARCH:
-- 
2.46.0

From nobody Sat Nov 30 07:48:23 2024
From: David Hildenbrand
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, linux-s390@vger.kernel.org,
 virtualization@lists.linux.dev, David Hildenbrand, Heiko Carstens,
 Vasily Gorbik, Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
 Thomas Huth, Cornelia Huck, Janosch Frank, Claudio Imbrenda,
 "Michael S. Tsirkin", Jason Wang, Xuan Zhuo, Eugenio Pérez,
 Andrew Morton
Subject: [PATCH v1 3/5] virtio-mem: s390x support
Date: Tue, 10 Sep 2024 21:15:37 +0200
Message-ID: <20240910191541.2179655-4-david@redhat.com>
In-Reply-To: <20240910191541.2179655-1-david@redhat.com>
References: <20240910191541.2179655-1-david@redhat.com>

Now that s390x code is prepared for memory devices that reside above the
maximum storage increment exposed through SCLP, everything is in place
to unlock virtio-mem support.

As virtio-mem in Linux currently supports logically onlining/offlining
memory in pageblock granularity, we have an effective hot(un)plug
granularity of 1 MiB on s390x.

As virtio-mem adds/removes individual Linux memory blocks (256 MiB), we
will currently never use gigantic pages in the identity mapping.

It is worth noting that neither storage keys nor storage attributes
(e.g., data / nodat) are touched when onlining memory blocks, which is
good because we are not supposed to touch these parts for unplugged
device blocks that are logically offline in Linux.

We will currently never initialize storage keys for virtio-mem
memory -- IOW, storage_key_init_range() is never called. It could be
added in the future when plugging device blocks. But as that function
essentially does nothing without modifying the code (changing
PAGE_DEFAULT_ACC), that's just fine for now.

kexec should work as intended and just like on other architectures that
support virtio-mem: we will never place kexec binaries on virtio-mem
memory, and never indicate virtio-mem memory to the 2nd kernel.
The device driver in the 2nd kernel can simply reset the device --
turning all memory unplugged, to then start plugging memory and adding
it to Linux, without causing trouble because the memory is already
used elsewhere.

The special s390x kdump mode, whereby the 2nd kernel creates the ELF
core header, won't currently dump virtio-mem memory. The virtio-mem
driver has a special kdump mode, from where we can detect memory ranges
to dump. Based on this, support for dumping virtio-mem memory can be
added in the future fairly easily.

Signed-off-by: David Hildenbrand
Acked-by: Michael S. Tsirkin
Tested-by: Mario Casquero
---
 drivers/virtio/Kconfig | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
index 42a48ac763ee..fb320eea70fe 100644
--- a/drivers/virtio/Kconfig
+++ b/drivers/virtio/Kconfig
@@ -122,7 +122,7 @@ config VIRTIO_BALLOON
 
 config VIRTIO_MEM
 	tristate "Virtio mem driver"
-	depends on X86_64 || ARM64 || RISCV
+	depends on X86_64 || ARM64 || RISCV || S390
 	depends on VIRTIO
 	depends on MEMORY_HOTPLUG
 	depends on MEMORY_HOTREMOVE
@@ -132,11 +132,11 @@ config VIRTIO_MEM
 	 This driver provides access to virtio-mem paravirtualized memory
 	 devices, allowing to hotplug and hotunplug memory.
 
-	 This driver currently only supports x86-64 and arm64. Although it
-	 should compile on other architectures that implement memory
-	 hot(un)plug, architecture-specific and/or common
-	 code changes may be required for virtio-mem, kdump and kexec to work as
-	 expected.
+	 This driver currently supports x86-64, arm64, riscv and s390x.
+	 Although it should compile on other architectures that implement
+	 memory hot(un)plug, architecture-specific and/or common
+	 code changes may be required for virtio-mem, kdump and kexec to
+	 work as expected.
 
 	 If unsure, say M.
 
-- 
2.46.0

From nobody Sat Nov 30 07:48:23 2024
From: David Hildenbrand
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, linux-s390@vger.kernel.org,
 virtualization@lists.linux.dev, David Hildenbrand, Heiko Carstens,
 Vasily Gorbik, Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
 Thomas Huth, Cornelia Huck, Janosch Frank, Claudio Imbrenda,
 "Michael S. Tsirkin", Jason Wang, Xuan Zhuo, Eugenio Pérez,
 Andrew Morton
Subject: [PATCH v1 4/5] lib/Kconfig.debug: default STRICT_DEVMEM to "y" on s390x
Date: Tue, 10 Sep 2024 21:15:38 +0200
Message-ID: <20240910191541.2179655-5-david@redhat.com>
In-Reply-To: <20240910191541.2179655-1-david@redhat.com>
References: <20240910191541.2179655-1-david@redhat.com>

virtio-mem currently depends on !DEVMEM || STRICT_DEVMEM. Let's default
STRICT_DEVMEM to "y" just like we do for arm64 and x86.

There could be ways in the future to filter access to virtio-mem device
memory even without STRICT_DEVMEM, but for now let's just keep it
simple.

Signed-off-by: David Hildenbrand
Tested-by: Mario Casquero
---
 lib/Kconfig.debug | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index a30c03a66172..fce22ce54983 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1887,7 +1887,7 @@ config STRICT_DEVMEM
 	bool "Filter access to /dev/mem"
 	depends on MMU && DEVMEM
 	depends on ARCH_HAS_DEVMEM_IS_ALLOWED || GENERIC_LIB_DEVMEM_IS_ALLOWED
-	default y if PPC || X86 || ARM64
+	default y if PPC || X86 || ARM64 || S390
 	help
 	  If this option is disabled, you allow userspace (root) access to all
 	  of memory, including kernel and userspace memory. Accidental
-- 
2.46.0

From nobody Sat Nov 30 07:48:23 2024
From: David Hildenbrand
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, linux-s390@vger.kernel.org,
 virtualization@lists.linux.dev, David Hildenbrand, Heiko Carstens,
 Vasily Gorbik, Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
 Thomas Huth, Cornelia Huck, Janosch Frank, Claudio Imbrenda,
 "Michael S. Tsirkin", Jason Wang, Xuan Zhuo, Eugenio Pérez,
 Andrew Morton
Subject: [PATCH v1 5/5] s390/sparsemem: reduce section size to 128 MiB
Date: Tue, 10 Sep 2024 21:15:39 +0200
Message-ID: <20240910191541.2179655-6-david@redhat.com>
In-Reply-To: <20240910191541.2179655-1-david@redhat.com>
References: <20240910191541.2179655-1-david@redhat.com>

Ever since commit 421c175c4d609 ("[S390] Add support for memory
hot-add.") we've been using a section size of 256 MiB on s390x and
32 MiB on s390. Before that, we were using a section size of 32 MiB on
both architectures.

Likely the reason was that we'd expect a storage increment size of
256 MiB under z/VM back then. As we didn't support memory blocks
spanning multiple memory sections, we would have had to handle having
multiple memory blocks for a single storage increment, which complicates
things. Although that issue reappeared with even bigger storage
increment sizes later, nowadays we have memory blocks that can span
multiple memory sections and we avoid any such issue completely.

Now that we have a new mechanism to expose additional memory to a VM --
virtio-mem -- reduce the section size to 128 MiB to allow for more
flexibility and reduce the metadata overhead when dealing with
hot(un)plug granularity smaller than 256 MiB.

128 MiB has been used by x86-64 since the very beginning. arm64 with 4k
base pages switched to 128 MiB as well: it's just big enough on these
architectures to allow for using a huge page (2 MiB) in the vmemmap in
sane setups with sizeof(struct page) == 64 bytes and a huge page mapping
in the direct mapping, while still allowing for small hot(un)plug
granularity.

For s390x, we could even switch to a 64 MiB section size, as our huge
page size is 1 MiB: but the smaller the section size, the more sections
we'll have to manage especially on bigger machines. Making it consistent
with x86-64 and arm64 feels like the right thing for now.

Note that the smallest memory hot(un)plug granularity is also limited by
the memory block size, determined by extracting the memory increment
size from SCLP. Under QEMU/KVM, implementing virtio-mem, we expose 0;
therefore, we'll end up with a memory block size of 128 MiB with a
128 MiB section size.

Signed-off-by: David Hildenbrand
Tested-by: Mario Casquero
---
 arch/s390/include/asm/sparsemem.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/s390/include/asm/sparsemem.h b/arch/s390/include/asm/sparsemem.h
index c549893602ea..ff628c50afac 100644
--- a/arch/s390/include/asm/sparsemem.h
+++ b/arch/s390/include/asm/sparsemem.h
@@ -2,7 +2,7 @@
 #ifndef _ASM_S390_SPARSEMEM_H
 #define _ASM_S390_SPARSEMEM_H
 
-#define SECTION_SIZE_BITS	28
+#define SECTION_SIZE_BITS	27
 #define MAX_PHYSMEM_BITS	CONFIG_MAX_PHYSMEM_BITS
 
 #endif /* _ASM_S390_SPARSEMEM_H */
-- 
2.46.0
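
The section-size arithmetic in the 5/5 commit message can be checked
with a small standalone C sketch. It is only an illustration: it assumes
4 KiB base pages and sizeof(struct page) == 64 bytes, as mentioned in
the commit message, and the EX_* constants and ex_* helpers are made-up
names for this example, not kernel interfaces.

```c
/* Assumptions for illustration only: 4 KiB base pages, 64-byte struct page. */
#define EX_PAGE_SHIFT		12
#define EX_STRUCT_PAGE_SIZE	64UL

/* Hypothetical helper: number of bytes covered by one sparsemem section. */
static unsigned long ex_section_size(unsigned int section_size_bits)
{
	return 1UL << section_size_bits;
}

/* Hypothetical helper: bytes of vmemmap (struct page array) per section. */
static unsigned long ex_vmemmap_per_section(unsigned int section_size_bits)
{
	unsigned long pages = ex_section_size(section_size_bits) >> EX_PAGE_SHIFT;

	return pages * EX_STRUCT_PAGE_SIZE;
}
```

Under these assumptions, SECTION_SIZE_BITS == 27 gives 128 MiB sections
whose struct pages occupy 2 MiB of vmemmap, which is the huge page size
the commit message cites for x86-64 and arm64. 26 bits would give 64 MiB
sections with 1 MiB of vmemmap, matching the 1 MiB s390x huge page size
mentioned above, and the previous value of 28 bits gives 256 MiB
sections with 4 MiB of vmemmap.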