From nobody Thu Oct 2 02:13:22 2025 Received: from mail-qk1-f181.google.com (mail-qk1-f181.google.com [209.85.222.181]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D556419B5B1 for ; Tue, 23 Sep 2025 15:31:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.222.181 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758641513; cv=none; b=AonWSz86xOXlX/vGI5ss/wIEb6x/q+T75BAJ+HY1GedmVUD7UmneQz0SDUbjmfSjT8jdkUwiwLPAlfcRlNZH+zwdvwnXH9gwX/ZYOZ8ElyUnv7cafKQECo4e88LppA0DsfZ9dXKl55y/3DDU5K4sWNGzbKOJ91yVN1rFSM+G5zY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758641513; c=relaxed/simple; bh=3HNjt6CULz+cHwwdKVVXN2sfulSDWOqQIOjqdL72tRI=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=mqAB0Ewmby93qlALDZZXhQGOxz7Akl2HvPTj0OqE4ICJeimXqWFgbwHkJ+3+CIT6dR06OWNIttX1XyLwWz4q2lkxeWy9JrCoY9ucm6AH7V/mYzhDyZphDAWeQ4z+CPDEI0Cq5Vgn2zYCILPWcrXJi8ksdAnojlIFEwKQDesEWHA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=k0apYubt; arc=none smtp.client-ip=209.85.222.181 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="k0apYubt" Received: by mail-qk1-f181.google.com with SMTP id af79cd13be357-826fe3b3e2cso576642585a.1 for ; Tue, 23 Sep 2025 08:31:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1758641510; x=1759246310; darn=vger.kernel.org; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=p2agHPfPZaTgXNBUU2YV+2DvStIzk8xrRXE1APYdnik=; b=k0apYubteyyMN0Zpf3m3PDZs/DmDyQzBORe7ena6vR2PjqQWDL2OF8HwkFnaPaQTE4 fxmgpk7Z5Mq1+hbLj0TUT8HayiTp6MWqXzqZVtaLH6y+2/7S03Ljwy1ibNTM7hGPg551 hn1tNyWWu18PXl32rE+oc1RwECjnJb/wgEPyhIkfjH9DTiZpZ4U77Bisrdwb4MqjEQFI dR1NfqBFxrHJ31FDmtX/2/puyXpNmDa7Cbne6nHDQkRDgNK5h3BAuv7+2gAO1PxNbw2i jv1xajk1efhkMPjZ/SVL885vuiTRxVMXzJhN+1oJtMbzGB/rBkad6SVDjfm+EG9uufyf cZ7w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1758641510; x=1759246310; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=p2agHPfPZaTgXNBUU2YV+2DvStIzk8xrRXE1APYdnik=; b=pgVYDC9FDv/9JBK7ZZqA45Ovdy3hmK664WN6Hu4HztSFtfHQBZVahklAgXMrxbrZYS CrPnIce3VUGrYpzm9q8aA5+VfLLuEyrgRl3i3HQHuvalwWTuGvV+sP/HLGLJPcyM0MU1 sXu6xbntJnjq1gx3noEfpJ0V6lYv7HOwIT1EfKMbrWoSDGBO/cgBU2NnuuMwlaIzbHI3 yqaG6HLbaHodCqWCC/MZkgJcY2luREvEAiYyrRLPr/ipevXdi9XqyPUTu/dCqahPkqT/ FngvtIbixUxtjzG8mdZYhpoZ1MGV0BYb+9cKmSrspqn+hEaZFY9ePkgI4iEq6XpPY8oC RUJg== X-Gm-Message-State: AOJu0YyCa4OX3RnSfubexNUmgMyueEjmcL1oiNmUpF+suf8P/juZU/2r CaLnXejyeoOI3yKkOVCNOIYVYh9Jm6IS4HmfCADY0w1UNtBclXlbmlNfbPSoI9AdDK274Z8ym4z 2khv6vZ0= X-Gm-Gg: ASbGncugCECBOlyEWubI/rgmRTVP5Tr3Bu1DO+to+J/00LhFcuIhFYFzE9J+O4+UOz2 az6J6sToGRbWV70XNVXukXitkkN3tOM7t46xbsx06ZU4VmZZk4Hh1PT2tnUhxl5BKgnhqkgWFSe Snaqe4efyG/LDC8noYMcCxmqcoHf16rCDtScPlptYGkNMkQ17av3gkAilyvV5+Q3+SPSjk+qlgI ErLtj27RC6x7v4+1PlizrNZZ6l5QbtaHXf6TPhWX8q4yiRr0tKG1ZnKfzBd9JRsIPN0CuYshNuA JbrY6JyfiLFR7liMP/zOunFBVrrweVN4WEi7IXFh04wmmRAu+JhDon8r3GijevgoXRvGbZgucQT joZM6Q+ws15eYMfoSpJcdY8KW0LfziA== X-Google-Smtp-Source: AGHT+IFBiermbVc+NGaH2KSDAIb6LQ20HaNdZFOLB6++JaSogyaGxikiQGxrCvgpFfH2xruK1FwEgg== X-Received: by 2002:a05:620a:bcd:b0:848:81e5:446e with SMTP id 
af79cd13be357-8517279f40amr323834685a.72.1758641510121; Tue, 23 Sep 2025 08:31:50 -0700 (PDT) Received: from localhost ([79.173.157.19]) by smtp.gmail.com with ESMTPSA id af79cd13be357-84abe9c219asm365942385a.20.2025.09.23.08.31.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 23 Sep 2025 08:31:49 -0700 (PDT) From: Fam Zheng To: linux-kernel@vger.kernel.org Cc: Lukasz Luba , linyongting@bytedance.com, songmuchun@bytedance.com, satish.kumar@bytedance.com, Borislav Petkov , Thomas Gleixner , yuanzhu@bytedance.com, Ingo Molnar , Daniel Lezcano , fam.zheng@bytedance.com, Zhang Rui , fam@euphon.net, "H. Peter Anvin" , x86@kernel.org, liangma@bytedance.com, Dave Hansen , "Rafael J. Wysocki" , guojinhui.liam@bytedance.com, linux-pm@vger.kernel.org, Thom Hughes Subject: [RFC 1/5] x86/boot/e820: Fix memmap to parse with 1 argument Date: Tue, 23 Sep 2025 15:31:42 +0000 Message-Id: <20250923153146.365015-2-fam.zheng@bytedance.com> X-Mailer: git-send-email 2.39.5 In-Reply-To: <20250923153146.365015-1-fam.zheng@bytedance.com> References: <20250923153146.365015-1-fam.zheng@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Thom Hughes This is needed because, in the simplest case, the parker Application Kernel gets only one user-defined e820 entry from memmap. With fewer than two entries, e820__update_table() returned -1, and callers such as e820__finish_early_params() treat a negative return as an invalid user-supplied memory map. Return 0 instead, since a single-entry table needs no sorting or merging. 
Signed-off-by: Thom Hughes Signed-off-by: Fam Zheng --- arch/x86/kernel/e820.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c index 84264205dae5..05dfb192d4b9 100644 --- a/arch/x86/kernel/e820.c +++ b/arch/x86/kernel/e820.c @@ -330,7 +330,7 @@ int __init e820__update_table(struct e820_table *table) =20 /* If there's only one memory region, don't bother: */ if (table->nr_entries < 2) - return -1; + return 0; =20 BUG_ON(table->nr_entries > max_nr_entries); =20 --=20 2.39.5 From nobody Thu Oct 2 02:13:22 2025 Received: from mail-qv1-f51.google.com (mail-qv1-f51.google.com [209.85.219.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D613D25EFBC for ; Tue, 23 Sep 2025 15:31:53 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758641515; cv=none; b=OErHUcHHG79vx2p1qFEFYhqKXokURuLo5A7iqRXBAeUslBYSvurGLJw8rWeBuWsUSTYjndQY2EUHhUzpqUVegf0I9FWBW0LfJmJrAGFZHYXuUUUXaKG5WB3895sX0rv9zO/X1dJo0MfRVBRnXgxgfYxTDREFaV1ucQ/89xMGJbo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758641515; c=relaxed/simple; bh=6htOvfZjI2OMhlgVSh0J2CmuVWGKZAFM9288jbISAEY=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=NQyfsjoUtwZT1X9bwq8WafanIBFHlpnZ6VhrO/zcF41Ael6Ja1Bwoj6w3n3wGW/ZuR6H2QEmJ7hq9JftJxXhRYfPj9vVXXxUkPZ1Q/9Qv+kF/1smE/6Mc4YtcTtQD8B+QYhoRbOHTullhqmjSST8YnxNSxTiyCr7GYERnLz+9GY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=DC6/kH63; arc=none smtp.client-ip=209.85.219.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass 
(p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="DC6/kH63" Received: by mail-qv1-f51.google.com with SMTP id 6a1803df08f44-792f273fbe4so31657836d6.3 for ; Tue, 23 Sep 2025 08:31:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1758641512; x=1759246312; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=hcJ2gCgUm1v6G7+he7UDrBEl/Rd/VnVpfcNtogjhvQc=; b=DC6/kH63SPhSNJ6jw3pHQ01uVDe5Az9QT8tluZHwVP9ct/s+8XQeIP/PMsptN5U88O tc6aI4X0d1+4q0dB/piWz2wwEH/Am6A29EHaWa1ZplS8tsvPspLqz9f9Arlo12VgiGCu l+0PIJZo1By+bldQGAYCsJkBd20fzxARDYz2YtqMTfHF6AnBdXK29PyXb6XkfBBq0j5W WtUPxOFvC8RmtGn6GUrPumuPlU/ayAXwfCYvyBqAQX0L3e6TNR72iIMf0VEkS7LO/r/c W745wkdH+WY+Rh+m+Y+cA8PWDbL1bEohybo8iqKd+My4qIQMAHVwnHHQBCuf8eXP9oaa Va0Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1758641512; x=1759246312; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=hcJ2gCgUm1v6G7+he7UDrBEl/Rd/VnVpfcNtogjhvQc=; b=YfOjTG9rsgo1o08D7UTTHvCgv/1VmqjU14zLPNGHdsT8YJDmYGKiQMzwLEF+IPAXxw OkDU7sLXI0Ob5owJMkfoT3nzk63HkaJPZTCkUQgo0ODA1RV3vFYyni7zq7LK/xcwwAzx WH6wLdTDoUDPdB8ts04vLR+vHK8i/IyHbvVAte+Zk5SShizG1otpHO0ATiEkNzGrK7gE bVst9w2Pcnt6euFj9ZQyz23ZzGELtLrC7Pn9fSAps6E+Rkc3WbH7CmkX6d/FTporRhS/ qE5UX4qk8S/BgjdoVXnAx/6YeP8RNUU0Ww9oCszlG667rUdwmiUpWxfY/I6ltI29QwIs clSA== X-Gm-Message-State: AOJu0YzptSpXZe9gkBX/BQ7gvLFwGBFx58d5LjZUBmuej3TXKnvRBTgI r57oKoS4PMJHMCJFMZLYmfljoMBa3OZtdHUYIOQjKT0gq3Mdfa23MXNrUNtfNWPYVsX4OWjWo/X TwfjGiuU= X-Gm-Gg: 
ASbGncu8KAbcxQ5rh37m2VnbdOuUX9edU2nmSzCI1UGaAcyvIHo5yWiz6THSJOCFNkr PQ5Smot2XdLwv3cHjLl8Bn4OCa0pQngDV2hm4q3aorX/N9KK8cIxRuygeWrW4DzMDra8TR0gZsC jJeh23IYOOCyZr9pJeY8pQd+8vReDX9e0jL74xTDcN54PhIXt+m2c83GRZ4KDTbYFukMslZBNct /qD76tZnI8Zyb5Av1i/r2yi5g54an+nQsx9B+Vz5fzzT3ignr9mVVFqfN5VuYI5RsdnEwHit+Qo Zm6QpqgD6JhPpAeEmZaYRCxGrXIaGF8VXZu4UgndLGXYqH9VWEyCRUU94IcuHbpKtdlUhtt7gyd peNH+Wh8HcoV0NWri5Dm3Qeji9AlHF+m7TgpEQuf5 X-Google-Smtp-Source: AGHT+IFpL6H1dB2FZrnf40AeXbyhYSYaqzdlpFnn0TiqkPSYKZV0L9ra8zT/5TF6hcSjR6U0Mc1XQg== X-Received: by 2002:ad4:4eea:0:b0:780:4845:f347 with SMTP id 6a1803df08f44-7e70df9acc3mr37328886d6.44.1758641512131; Tue, 23 Sep 2025 08:31:52 -0700 (PDT) Received: from localhost ([79.173.157.19]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-7b0eb62c2cesm52459436d6.54.2025.09.23.08.31.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 23 Sep 2025 08:31:51 -0700 (PDT) From: Fam Zheng To: linux-kernel@vger.kernel.org Cc: Lukasz Luba , linyongting@bytedance.com, songmuchun@bytedance.com, satish.kumar@bytedance.com, Borislav Petkov , Thomas Gleixner , yuanzhu@bytedance.com, Ingo Molnar , Daniel Lezcano , fam.zheng@bytedance.com, Zhang Rui , fam@euphon.net, "H. Peter Anvin" , x86@kernel.org, liangma@bytedance.com, Dave Hansen , "Rafael J. Wysocki" , guojinhui.liam@bytedance.com, linux-pm@vger.kernel.org, Thom Hughes Subject: [RFC 2/5] x86/smpboot: Export wakeup_secondary_cpu_via_init Date: Tue, 23 Sep 2025 15:31:43 +0000 Message-Id: <20250923153146.365015-3-fam.zheng@bytedance.com> X-Mailer: git-send-email 2.39.5 In-Reply-To: <20250923153146.365015-1-fam.zheng@bytedance.com> References: <20250923153146.365015-1-fam.zheng@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Thom Hughes Will be used by parker setup code. 
Signed-off-by: Thom Hughes Signed-off-by: Fam Zheng --- arch/x86/include/asm/smp.h | 1 + arch/x86/kernel/smpboot.c | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h index ca073f40698f..cfc212bbb4a6 100644 --- a/arch/x86/include/asm/smp.h +++ b/arch/x86/include/asm/smp.h @@ -104,6 +104,7 @@ void native_smp_prepare_boot_cpu(void); void smp_prepare_cpus_common(void); void native_smp_prepare_cpus(unsigned int max_cpus); void native_smp_cpus_done(unsigned int max_cpus); +int wakeup_secondary_cpu_via_init(u32 phys_apicid, unsigned long start_eip= ); int common_cpu_up(unsigned int cpunum, struct task_struct *tidle); int native_kick_ap(unsigned int cpu, struct task_struct *tidle); int native_cpu_disable(void); diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c index c10850ae6f09..c9a941178488 100644 --- a/arch/x86/kernel/smpboot.c +++ b/arch/x86/kernel/smpboot.c @@ -715,7 +715,7 @@ static void send_init_sequence(u32 phys_apicid) /* * Wake up AP by INIT, INIT, STARTUP sequence. 
*/ -static int wakeup_secondary_cpu_via_init(u32 phys_apicid, unsigned long st= art_eip) +int wakeup_secondary_cpu_via_init(u32 phys_apicid, unsigned long start_eip) { unsigned long send_status =3D 0, accept_status =3D 0; int num_starts, j, maxlvt; --=20 2.39.5 From nobody Thu Oct 2 02:13:22 2025 Received: from mail-qk1-f172.google.com (mail-qk1-f172.google.com [209.85.222.172]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EFA2B26D4D4 for ; Tue, 23 Sep 2025 15:31:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.222.172 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758641522; cv=none; b=k7+PR6bUaEHTM+ulwoCl66UHbLjGHsuvZFmqGalrcuor5feFrs8rF5Dya824EFDkFbEGa4o2TFx1qrcAROEOk36UrVCb16tnMNw5kercQ8VBmjWIoOf/NmF5Y12eNvpX1nDBXePEIcXzACfAmxA8US6plWR6J4Tw00bnadvFkGI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758641522; c=relaxed/simple; bh=8gD7kOk9XUvcbXPZGRTdVEh45R1zcqpPgpBi0ANs4yI=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=M5ySYMH5X/+qIlNuWLaucZh0IUJJ+qITsrHt9o9xo9C+R85RHppXn/Dx4xeYJSkbocvPOTNMMN3y+xEjR6/i6bjcOcFSu0xvkigD8eROOx7WQ1d+YjKwVrrQyi0Y8iIfBlEnxxRciMZP3w3ebskj0RCz+93TJhN+Y2/Znp6xxoc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=flI53IY3; arc=none smtp.client-ip=209.85.222.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="flI53IY3" 
Received: by mail-qk1-f172.google.com with SMTP id af79cd13be357-84f8ec2df12so148732385a.3 for ; Tue, 23 Sep 2025 08:31:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1758641518; x=1759246318; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=6LeiKW+348aDLti2UK6b0T6va2myl+XeyweBDEl4/Zw=; b=flI53IY3kycg0Hj61kNyF4Ihz6U+8YRb5g8k+dvI2mNV9uD6x/p2Zn8QTruxmjJVoF eqHJY6GMsSzDMdaa+4g3xE95PRdSX2FNmdJi1oCf7VZMIms7RVJ+6jIvX2PQ7DaarNxs ZJ4GoLJ3h9o2pYpgRFM1nu/buiE4lM/E1WijidO3OrEssYoeXjv+VVZdFN8uw0dmw+vh NXmIimcjMPyPv2I4V8tGaXJPyStpDslSf0Xapkkik69ksYtLqQf8iFwGVZzu7Jc5vfBW kKrILj81BVt/rXtMxRys2+mjJp95URJN1WZM4mz8wa130HjcJF4pos/y8owy9rzZy9cY l7pA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1758641518; x=1759246318; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=6LeiKW+348aDLti2UK6b0T6va2myl+XeyweBDEl4/Zw=; b=MrRfsAtoaMiPYYRdFGq8YO9Y0cBSYYTc4GmYGzk0zISIBHy6H0So3E3OHyQZ8QnwI/ QcUi36cBbs8mz1VvWjV/2nZYpswsTxX2OsK/hhgk6/GFnve7F630aQ9bddOg4u7q7S7p 3Co3ndajLlPLpIljgpK7aLVFKi0OSplPDq7QtqLmSR4tkDmeAQxXGjnd4X8B40NKY2I1 KkKR7I/mpIS3pfjPTNZnRI795JEV5c69A0KLrIf1mzOqLqLo49rHhkk35l/AO11OCHqV x+AL6B8FTT0SQRj8C4KjMjf1nKQgEMFaRuwPlhHCW0xvha8woxcSTDoVuIAQTmEwZCo4 QT2Q== X-Gm-Message-State: AOJu0YzqciGbA207SDHMvm9TNZtDztW1BeXmtGwNdX0/u5LEFsLYM5iR mMSQDnTucowdZq6LwwLfn7Ug+fXEbMXOiMymJWXJWGPcPP50e4xVdDQrIPK33cI2h4CiB3fr+2+ kAKNpFVY= X-Gm-Gg: ASbGncsuViTadHzBjQAf48MrTQccq6y495r7bTJBCb9hXShBg9V1giP0UWEUmj+t9/G pIEAZwsforyjKtgNDSuUd4w4Fzp+WBmc2rvcxxuE4YTYKxSznUNSiU1AIXwV3DPHLfJJu9Lzeyr Wx18+o0MUsL3u9Ekt3NL1BO+yXc0lD0Fmct1d4sglxcYTiuvZxaio/zDrqYBohANcQFk8f+oX+c mif/Ux93GwutB5AVyfRL7j/REYQfkyhYru7RRnlZe44Y3ue5aeL/8688l9Ah1m5+gNtHvQzxXCO 
/RFnMK15HPLGlgcjWe8yogRiIfcnAsZAQskvWf9Kb+WyzQgbtUoaOggvVpOwztyhrCIvSSBRQqM UgpvyLa3WIevWab71NL/d2CySGMHCSaRTaG3qalHb X-Google-Smtp-Source: AGHT+IEtW5XPlbWjdJ2OTuuFZ79VV3pbXaIb8pDBEvWl/VWVRf6bjizM8kYkkTD6em39TFMetUW30A== X-Received: by 2002:a05:620a:400a:b0:82a:50c5:6138 with SMTP id af79cd13be357-8517456f8f9mr376900285a.40.1758641514217; Tue, 23 Sep 2025 08:31:54 -0700 (PDT) Received: from localhost ([79.173.157.19]) by smtp.gmail.com with ESMTPSA id af79cd13be357-836305769ecsm1021513185a.42.2025.09.23.08.31.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 23 Sep 2025 08:31:53 -0700 (PDT) From: Fam Zheng To: linux-kernel@vger.kernel.org Cc: Lukasz Luba , linyongting@bytedance.com, songmuchun@bytedance.com, satish.kumar@bytedance.com, Borislav Petkov , Thomas Gleixner , yuanzhu@bytedance.com, Ingo Molnar , Daniel Lezcano , fam.zheng@bytedance.com, Zhang Rui , fam@euphon.net, "H. Peter Anvin" , x86@kernel.org, liangma@bytedance.com, Dave Hansen , "Rafael J. Wysocki" , guojinhui.liam@bytedance.com, linux-pm@vger.kernel.org, Thom Hughes Subject: [RFC 3/5] x86/parker: Introduce parker kernfs interface Date: Tue, 23 Sep 2025 15:31:44 +0000 Message-Id: <20250923153146.365015-4-fam.zheng@bytedance.com> X-Mailer: git-send-email 2.39.5 In-Reply-To: <20250923153146.365015-1-fam.zheng@bytedance.com> References: <20250923153146.365015-1-fam.zheng@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Thom Hughes These are the control knobs exposed to the boot kernel to start the secondary kernels. 
Signed-off-by: Thom Hughes Signed-off-by: Fam Zheng --- arch/x86/Kbuild | 3 + arch/x86/Kconfig | 2 + arch/x86/parker/Kconfig | 4 + arch/x86/parker/Makefile | 2 + arch/x86/parker/internal.h | 54 ++ arch/x86/parker/kernfs.c | 1266 ++++++++++++++++++++++++++++++++++++ include/linux/parker.h | 7 + include/uapi/linux/magic.h | 1 + 8 files changed, 1339 insertions(+) create mode 100644 arch/x86/parker/Kconfig create mode 100644 arch/x86/parker/Makefile create mode 100644 arch/x86/parker/internal.h create mode 100644 arch/x86/parker/kernfs.c create mode 100644 include/linux/parker.h diff --git a/arch/x86/Kbuild b/arch/x86/Kbuild index f7fb3d88c57b..e50fec2e8e5a 100644 --- a/arch/x86/Kbuild +++ b/arch/x86/Kbuild @@ -16,6 +16,9 @@ obj-$(CONFIG_XEN) +=3D xen/ =20 obj-$(CONFIG_PVH) +=3D platform/pvh/ =20 +# Multi-kernel support +obj-$(CONFIG_PARKER) +=3D parker/ + # Hyper-V paravirtualization support obj-$(subst m,y,$(CONFIG_HYPERV)) +=3D hyperv/ =20 diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index f86e7072a5ba..490ea18cf783 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -3218,3 +3218,5 @@ config HAVE_ATOMIC_IOMAP source "arch/x86/kvm/Kconfig" =20 source "arch/x86/Kconfig.assembler" + +source "arch/x86/parker/Kconfig" diff --git a/arch/x86/parker/Kconfig b/arch/x86/parker/Kconfig new file mode 100644 index 000000000000..716a2537f12c --- /dev/null +++ b/arch/x86/parker/Kconfig @@ -0,0 +1,4 @@ +config PARKER + bool "Enable multi-kernel host support" + depends on X86_64 && SMP + select CMA diff --git a/arch/x86/parker/Makefile b/arch/x86/parker/Makefile new file mode 100644 index 000000000000..41c40fc64267 --- /dev/null +++ b/arch/x86/parker/Makefile @@ -0,0 +1,2 @@ +obj-y +=3D kernfs.o +$(obj)/kernfs.o: $(obj)/internal.h diff --git a/arch/x86/parker/internal.h b/arch/x86/parker/internal.h new file mode 100644 index 000000000000..a6150f1beb77 --- /dev/null +++ b/arch/x86/parker/internal.h @@ -0,0 +1,54 @@ +#ifndef _PARKER_INTERNAL_H +#define _PARKER_INTERNAL_H + 
+#include +#include +#include +#include + +/* Currently limit support for devices */ +#define PARKER_MAX_PCI_DEVICES 256 +#define PARKER_MAX_CPUS 512 + +/* For now, limited to one page, + * but could have chained pages for PCI devs + APIC ids */ +struct parker_control_structure { + phys_addr_t start_address; + bool online; + unsigned int parker_id; + u32 pci_dev_ids[PARKER_MAX_PCI_DEVICES]; + unsigned int num_pci_devs; + u32 apic_ids[PARKER_MAX_CPUS]; + unsigned int num_cpus; +}; + +struct parker_kernel_device_entry { + struct list_head list_entry; + struct kernfs_node *kn; + struct device *dev; +}; + +struct parker_kernel_entry { + struct kernfs_node *kn; + struct mutex mutex; + unsigned int id; + bool online; + struct cpumask cpu_mask; + /* Contiguous pages from CMA for parker physical memory */ + struct page *physical_memory_pages; + unsigned long physical_memory_page_count; + /* Control structure PAGE for now */ + struct page *control_structure_pages; + /* Currently always 1 but future proofing */ + unsigned long control_structure_page_count; + struct kernfs_node *kn_devices; + /* List of each kernfs node, get kobj from kernfs_node */ + struct list_head list_devices; +}; + +/* Ensure we don't exceed 1 page, if we do. We need to rethink control str= ucture + * and chain pages together. 
*/ +static_assert(sizeof(struct parker_control_structure) < PAGE_SIZE, + "struct (parker_control_structure) too large!"); + +#endif diff --git a/arch/x86/parker/kernfs.c b/arch/x86/parker/kernfs.c new file mode 100644 index 000000000000..68f4b7f779b5 --- /dev/null +++ b/arch/x86/parker/kernfs.c @@ -0,0 +1,1266 @@ +#define pr_fmt(fmt) "parker: " fmt + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "internal.h" + +static struct cma *parker_cma[MAX_NUMNODES]; +static unsigned long long parker_cma_size; +static unsigned long long parker_cma_size_in_node[MAX_NUMNODES]; +static struct page *parker_active_control_structure_page; + +static struct transition_pagetable_data { + struct x86_mapping_info info; + pgd_t *pgd; + void *stack; +} transition_pagetable; + +static int __init parker_early_cma(char *p) +{ + int nid, count =3D 0; + unsigned long long tmp; + char *s =3D p; + + while (*s) { + if (sscanf(s, "%llu%n", &tmp, &count) !=3D 1) + break; + + if (s[count] =3D=3D ':') { + if (tmp >=3D MAX_NUMNODES) + break; + nid =3D array_index_nospec(tmp, MAX_NUMNODES); + + s +=3D count + 1; + tmp =3D memparse(s, &s); + parker_cma_size_in_node[nid] =3D tmp; + parker_cma_size +=3D tmp; + + /* + * Skip the separator if have one, otherwise + * break the parsing. 
+ */ + if (*s =3D=3D ',') + s++; + else + break; + } else { + parker_cma_size =3D memparse(p, &p); + break; + } + } + + return 0; +} +early_param("parker_cma", parker_early_cma); + +#define ORDER_1G 30 +void __init parker_cma_reserve(void) +{ + bool node_specific_cma_alloc =3D false; + unsigned long long size, reserved, per_node; + int nid; + + if (!parker_cma_size) + return; + + for (nid =3D 0; nid < MAX_NUMNODES; nid++) { + if (parker_cma_size_in_node[nid] =3D=3D 0) + continue; + + if (!node_online(nid)) { + pr_warn("invalid node %d specified for CMA allocation\n", nid); + parker_cma_size -=3D parker_cma_size_in_node[nid]; + parker_cma_size_in_node[nid] =3D 0; + continue; + } + + if (parker_cma_size_in_node[nid] < SZ_1G) { + pr_warn("cma area of node %d should be at least 1GiB\n", nid); + parker_cma_size -=3D parker_cma_size_in_node[nid]; + parker_cma_size_in_node[nid] =3D 0; + } else { + node_specific_cma_alloc =3D true; + } + } + /* Validate the CMA size again in case some invalid nodes specified. */ + if (!parker_cma_size) + return; + + if (parker_cma_size < SZ_1G) { + pr_warn("cma area should be at least 1 GiB\n"); + parker_cma_size =3D 0; + return; + } + + if (!node_specific_cma_alloc) { + /* + * If 3 GB area is requested on a machine with 4 numa nodes, + * let's allocate 1 GB on first three nodes and ignore the last one. 
+ */ + per_node =3D DIV_ROUND_UP(parker_cma_size, nr_online_nodes); + pr_info("reserve CMA %llu MiB, up to %llu MiB per node\n", + parker_cma_size / SZ_1M, per_node / SZ_1M); + } + + reserved =3D 0; + for_each_online_node(nid) { + int res; + char name[CMA_MAX_NAME]; + + if (node_specific_cma_alloc) { + if (parker_cma_size_in_node[nid] =3D=3D 0) + continue; + + size =3D parker_cma_size_in_node[nid]; + } else { + size =3D min(per_node, parker_cma_size - reserved); + } + + size =3D round_up(size, SZ_1G); + + snprintf(name, sizeof(name), "parker%d", nid); + /* + * Note that 'order per bit' is based on smallest size that + * may be returned to CMA allocator in the case of + * huge page demotion. + */ + res =3D cma_declare_contiguous_nid(0, size, 0, + SZ_1G, + ORDER_1G - PAGE_SHIFT, false, name, + &parker_cma[nid], nid); + if (res) { + pr_warn("reservation failed - err %d, node %d\n", + res, nid); + continue; + } + + reserved +=3D size; + pr_info("reserved %llu MiB on node %d\n", + size / SZ_1M, nid); + + if (reserved >=3D parker_cma_size) + break; + } + + if (!reserved) + /* + * parker_cma_size is used to determine if allocations from + * cma are possible. Set to zero if no cma regions are set up. + */ + parker_cma_size =3D 0; +} + +/* Make sure we don't overwrite initial_code too early */ +struct semaphore cpu_kick_semaphore; + +__attribute__((noreturn)) static void parker_bsp_start(void) +{ + /* Let parker_start_kernel know we're here */ + up(&cpu_kick_semaphore); + + if (kexec_image) { + machine_kexec(kexec_image); + } + /* Not expected to be reached; spin if no image is loaded */ + for (;;) { + continue; + } +} + +__attribute__((noreturn)) static void parker_ap_wait(void) +{ + /* Let parker_start_kernel know we're here */ + up(&cpu_kick_semaphore); + + unsigned int cpu =3D smp_processor_id(); + unsigned int apic_id =3D apic->cpu_present_to_apicid(cpu); + + volatile struct parker_control_structure *pcs; + /* For now, use global active control page. 
+ * Eventually we can add lookup from CPU -> control page */ + pcs =3D page_address(parker_active_control_structure_page); + int idx =3D 0; + while (!READ_ONCE(pcs->start_address)) { + idx++; + continue; + } + pr_debug("parker trampoline physical address %llx\n", pcs->start_address); + smp_mb(); + u64 call_addr =3D 0; + /* There's no race condition on stack as we don't read the stack pointer = again */ + asm volatile ( + "mov (%1), %0\n\t" + "mov %3, %%rsp\n\t" + "mov %4, %%esi\n\t" + "mov %2, %%cr3\n\t" + ANNOTATE_RETPOLINE_SAFE + "call *%0\n\t" + : "+r" (call_addr) + : "r" (&pcs->start_address), + "r" (__sme_pa(transition_pagetable.pgd)), + "r" (__sme_pa(transition_pagetable.stack + PAGE_SIZE)), + "r" (apic_id) + : "esi", "rsp" + ); + + for (;;) { + continue; + } +} + +static void parker_host_ipicb(void) +{ + pr_info("OKK\n"); +} + +static void __init *alloc_pgt_page(void *dummy) +{ + return (void*)get_zeroed_page(GFP_ATOMIC); +} + +static int __init init_transition_pgtable(pgd_t *pgd) +{ + pgprot_t prot =3D PAGE_KERNEL_EXEC_NOENC; + unsigned long vaddr, paddr; + p4d_t *p4d; + pud_t *pud; + pmd_t *pmd; + pte_t *pte; + + vaddr =3D (unsigned long)parker_ap_wait; + pgd +=3D pgd_index(vaddr); + if (!pgd_present(*pgd)) { + p4d =3D (p4d_t *)alloc_pgt_page(NULL); + if (!p4d) + return -ENOMEM; + set_pgd(pgd, __pgd(__pa(p4d) | _KERNPG_TABLE)); + } + p4d =3D p4d_offset(pgd, vaddr); + if (!p4d_present(*p4d)) { + pud =3D (pud_t *)alloc_pgt_page(NULL); + if (!pud) + return -ENOMEM; + set_p4d(p4d, __p4d(__pa(pud) | _KERNPG_TABLE)); + } + pud =3D pud_offset(p4d, vaddr); + if (!pud_present(*pud)) { + pmd =3D (pmd_t *)alloc_pgt_page(NULL); + if (!pmd) + return -ENOMEM; + set_pud(pud, __pud(__pa(pmd) | _KERNPG_TABLE)); + } + pmd =3D pmd_offset(pud, vaddr); + if (!pmd_present(*pmd)) { + pte =3D (pte_t *)alloc_pgt_page(NULL); + if (!pte) + return -ENOMEM; + set_pmd(pmd, __pmd(__pa(pte) | _KERNPG_TABLE)); + } + pte =3D pte_offset_kernel(pmd, vaddr); + + paddr =3D __pa(vaddr); + 
set_pte(pte, pfn_pte(paddr >> PAGE_SHIFT, prot)); + + return 0; +} + +/* Allocate an intermediate trampoline page table that maps all physical + * memory, so it can be reused for all parker kernel instantiations. */ +static int __init parker_host_transition_pagetable_init(void) +{ + struct x86_mapping_info info =3D { + .alloc_pgt_page =3D alloc_pgt_page, +// .free_pgt_page =3D free_pgt_page, + .page_flag =3D __PAGE_KERNEL_LARGE_EXEC, + .kernpg_flag =3D _KERNPG_TABLE_NOENC, + }; + + pgd_t *pgd; + pgd =3D alloc_pgt_page(NULL); + void *stack =3D alloc_pgt_page(NULL); + if (!pgd || !stack) + return -ENOMEM; + + for (int i =3D 0; i < nr_pfn_mapped; i++) { + unsigned long mstart, mend; + + mstart =3D pfn_mapped[i].start << PAGE_SHIFT; + mend =3D pfn_mapped[i].end << PAGE_SHIFT; + if (kernel_ident_mapping_init(&info, pgd, mstart, mend)) { + //kernel_ident_mapping_free(&info, pgd); + return -ENOMEM; + } + } + + transition_pagetable.info =3D info; + transition_pagetable.pgd =3D pgd; + transition_pagetable.stack =3D stack; + + return init_transition_pgtable(pgd); +} +static int __init parker_kernfs_init(void); + +/* Multi-kernel module code for Primary <-> secondary communication */ +static int __init parker_module_init(void) +{ + if (is_parker_instance()) + return -ENODEV; + parker_kernfs_init(); + sema_init(&cpu_kick_semaphore, 0); + // TODO: Device registration for sysfs interface + // copying resctrl interface style with folder creation + // and deletion to create kernels. 
+ pr_info("Multikernel module loading...\n"); + // TODO: Custom + if (x86_platform_ipi_callback) { + pr_err("Platform callback exists\n"); + return -ENODEV; + } + x86_platform_ipi_callback =3D parker_host_ipicb; + if (parker_host_transition_pagetable_init()) { + pr_err("transition page table setup failed\n"); + return -ENODEV; + } + + return 0; +} + +static void __exit parker_module_exit(void) +{ + pr_info("Multikernel exiting.\n"); + //__free_pages(parker_control_page, 0); +} + +/* Ensure global parker lock is held */ +static int parker_start_kernel(struct parker_kernel_entry *pke) +{ + struct parker_control_structure *pcs; + struct list_head *dev_elem; + int ret; + + WRITE_ONCE(parker_active_control_structure_page, pke->control_structure_p= ages); + pcs =3D page_address(parker_active_control_structure_page); + + if (!pcs) + return -EINVAL; + + /* Add PCI device IDs to control structure */ + list_for_each(dev_elem, &pke->list_devices) { + struct parker_kernel_device_entry *pkde; + struct pci_dev *pdev; + int pci_dev_index =3D pcs->num_pci_devs++; + pkde =3D container_of(dev_elem, + struct parker_kernel_device_entry, + list_entry); + pdev =3D to_pci_dev(pkde->dev); + pcs->pci_dev_ids[pci_dev_index] =3D pci_dev_id(pdev); + } + + int bsp_cpu, cpu, i =3D 0; + /* Partitioned kernel's AP will wait on BSP to jump to kernel's startup c= ode */ + for_each_cpu(cpu, &pke->cpu_mask) { + u32 apicid =3D apic->cpu_present_to_apicid(cpu); + pcs->apic_ids[i] =3D apicid; + ++pcs->num_cpus; + int old =3D i++; + if (old =3D=3D 0) { + bsp_cpu =3D cpu; + continue; + } + + smpboot_control =3D cpu; + initial_code =3D (unsigned long)parker_ap_wait; + init_espfix_ap(cpu); + smp_mb(); + + pr_debug("parker AP apicid %u cpu %d\n", apicid, cpu); + unsigned long start_ip =3D real_mode_header->trampoline_start; + ret =3D wakeup_secondary_cpu_via_init(apicid, start_ip); + /* Continue on with errors for now */ + if (ret) { + pr_err("Failed to start cpu %d\n", cpu); + --i; + --pcs->num_cpus; + continue; + } + /* Wait for CPU to 
wakeup and start executing AP wait function */ + down(&cpu_kick_semaphore); + } + + /* Start the partitioned kernel's BSP */ + //mtrr_save_state(); + u32 apicid =3D apic->cpu_present_to_apicid(bsp_cpu); + smpboot_control =3D bsp_cpu; + initial_code =3D (unsigned long)parker_bsp_start; + init_espfix_ap(bsp_cpu); + smp_mb(); + unsigned long start_ip =3D real_mode_header->trampoline_start; + ret =3D wakeup_secondary_cpu_via_init(apicid, start_ip); + if (ret) + return ret; + down(&cpu_kick_semaphore); + + /* Wait for partitioned kernel to start */ + while (!READ_ONCE(pcs->online)) + cpu_relax(); + + return 0; +} + +static bool parker_kernel_is_online(struct parker_kernel_entry *pke) +{ + struct parker_control_structure *pcs; + pcs =3D page_address(pke->control_structure_pages); + return READ_ONCE(pcs->online); +} + +/* + * + * Proper implementation: + * /sys/fs/parker new kernelfs + * + */ +/* The filesystem can only be mounted once. */ +/* TODO: Deal with recovery of structures if unmounted */ +// Forward declarations +static int parker_get_tree(struct fs_context *fc); +static int parker_init_fs_context(struct fs_context *fc); +static void parker_fs_context_free(struct fs_context *fc); +static void parker_kill_sb(struct super_block *sb); +static int parker_kn_set_ugid(struct kernfs_node *kn); + +/* Mutex to protect parker access. */ +DEFINE_MUTEX(parker_mutex); +atomic_t parker_kernels =3D ATOMIC_INIT(0); +static bool parker_mounted; +/* All CPUs belonging to second kernel*/ +static struct cpumask parker_cpus; + +struct parker { + struct kernfs_node *kn; + /* TODO: control structures etc... 
*/ +}; + +struct parker_file_type { + char *name; + umode_t mode; + const struct kernfs_ops *kf_ops; + + int (*seq_show)(struct kernfs_open_file *of, struct seq_file *sf, void *v= ); + + ssize_t (*write)(struct kernfs_open_file *of, char *buf, size_t nbytes, l= off_t off); +}; + +static int parker_add_files(struct kernfs_node *kn, struct parker_file_typ= e *pfts, int len); + +static int parker_seqfile_show(struct seq_file *m, void *arg) +{ + struct kernfs_open_file *of =3D m->private; + struct parker_file_type *pft =3D of->kn->priv; + + if (pft->seq_show) + return pft->seq_show(of, m, arg); + + return 0; +} + +static ssize_t parker_file_write(struct kernfs_open_file *of, char *buf, + size_t nbytes, loff_t off) +{ + struct parker_file_type *pft =3D of->kn->priv; + + if (pft->write) + return pft->write(of, buf, nbytes, off); + + return -EINVAL; +} + +static const struct kernfs_ops parker_kf_ops =3D { + .atomic_write_len =3D PAGE_SIZE, + .write =3D parker_file_write, + .seq_show =3D parker_seqfile_show, +}; + +/* List of attributes in root - currently none */ +static struct parker_file_type root_attributes[] =3D {}; + +static int parker_kernel_index_show(struct kernfs_open_file *of, + struct seq_file *seq, void *v) +{ + struct parker_kernel_entry *pke =3D of->kn->parent->priv; + mutex_lock(&pke->mutex); + seq_printf(seq, "%u\n", pke->id); + mutex_unlock(&pke->mutex); + return 0; +} + +static int parker_kernel_control_structure_show(struct kernfs_open_file *o= f, + struct seq_file *seq, void *v) +{ + struct parker_kernel_entry *pke =3D of->kn->parent->priv; + mutex_lock(&parker_mutex); + seq_printf(seq, "0x%llx\n", page_to_phys(pke->control_structure_pages)); + mutex_unlock(&parker_mutex); + return 0; +} + + +static int parker_kernel_online_show(struct kernfs_open_file *of, + struct seq_file *seq, void *v) +{ + struct parker_kernel_entry *pke =3D of->kn->parent->priv; + bool online; + + mutex_lock(&pke->mutex); + online =3D parker_kernel_is_online(pke); + 
seq_printf(seq, "%u\n", online); + mutex_unlock(&pke->mutex); + return 0; +} + +static ssize_t parker_kernel_online_write(struct kernfs_open_file *of, + char *buf, + size_t nbytes, loff_t off) +{ + int ret; + struct parker_kernel_entry *pke =3D of->kn->parent->priv; + + mutex_lock(&parker_mutex); + mutex_lock(&pke->mutex); + + ret =3D parker_start_kernel(pke); + /* Only set online if the second kernel successfully started */ + if (!ret) + pke->online =3D true; + + + mutex_unlock(&pke->mutex); + mutex_unlock(&parker_mutex); + + return ret ?: nbytes; +} + +static int parker_kernel_cpus_show(struct kernfs_open_file *of, + struct seq_file *seq, void *v) +{ + struct parker_kernel_entry *pke =3D of->kn->parent->priv; + mutex_lock(&pke->mutex); + seq_printf(seq, "%*pbl\n", cpumask_pr_args(&pke->cpu_mask)); + mutex_unlock(&pke->mutex); + return 0; +} + +static ssize_t parker_kernel_cpus_write(struct kernfs_open_file *of, char = *buf, + size_t nbytes, loff_t off) +{ + cpumask_var_t tmpmask, newmask; + struct parker_kernel_entry *pke =3D of->kn->parent->priv; + int cpu, ret; + + if (!zalloc_cpumask_var(&tmpmask, GFP_KERNEL)) + return -ENOMEM; + + if (!zalloc_cpumask_var(&newmask, GFP_KERNEL)) { + free_cpumask_var(tmpmask); + return -ENOMEM; + } + + mutex_lock(&parker_mutex); + mutex_lock(&pke->mutex); + ret =3D cpulist_parse(buf, newmask); + + /* Check if any CPUs belong to another parker kernel */ + cpumask_and(tmpmask, newmask, &parker_cpus); + if (!cpumask_empty(tmpmask)) { + ret =3D -EINVAL; + goto out; + } + + /* If CPUs are currently online, offline them */ + cpumask_and(tmpmask, newmask, cpu_online_mask); + if (!cpumask_empty(tmpmask)) { + for_each_cpu(cpu, tmpmask) + remove_cpu(cpu); + } + + cpumask_or(&parker_cpus, &parker_cpus, newmask); + cpumask_copy(&pke->cpu_mask, newmask); +out: + free_cpumask_var(tmpmask); + free_cpumask_var(newmask); + mutex_unlock(&pke->mutex); + mutex_unlock(&parker_mutex); + return ret ?: nbytes; +} + +static int 
parker_kernel_memory_show(struct kernfs_open_file *of, + struct seq_file *seq, void *v) +{ + int ret; + + struct parker_kernel_entry *pke =3D of->kn->parent->priv; + mutex_lock(&pke->mutex); + if (!pke->physical_memory_pages) { + ret =3D -EINVAL; + goto out; + } + phys_addr_t base =3D page_to_phys(pke->physical_memory_pages); + unsigned long long size =3D pke->physical_memory_page_count * PAGE_SIZE; + seq_printf(seq, "%llu@0x%llx\n", size, base); + ret =3D 0; +out: + mutex_unlock(&pke->mutex); + return ret; +} + +static ssize_t parker_kernel_memory_write(struct kernfs_open_file *of, cha= r *buf, + size_t nbytes, loff_t off) +{ + struct parker_kernel_entry *pke =3D of->kn->parent->priv; + struct page *result; + int ret, memory_nid =3D NUMA_NO_NODE; + char *end; + + unsigned long long size; + unsigned long page_count; + + mutex_lock(&pke->mutex); + if (!(size =3D memparse(buf, &end))) { + ret =3D -EINVAL; + goto out; + } + + /* Ensure write is fully parsed */ + if (*end !=3D '\0' && *end !=3D '\n') { + ret =3D -EINVAL; + goto out; + } + + /* We need a CPU to determine which NUMA node to allocate memory on */ + if (cpumask_empty(&pke->cpu_mask)) { + ret =3D -EINVAL; + goto out; + } + + /* Get NUMA node for first cpu (BSP) */ + memory_nid =3D cpu_to_node(cpumask_first(&pke->cpu_mask)); + + if (pke->physical_memory_pages) { + if (!cma_release(parker_cma[memory_nid], + pke->physical_memory_pages, + pke->physical_memory_page_count)) { + ret =3D -EBUSY; + goto out; + } + } + + /* Assume that size is page aligned, if not second kernel loses page */ + page_count =3D size >> PAGE_SHIFT; + result =3D cma_alloc(parker_cma[memory_nid], page_count, 0, false); + + if (!result) { + ret =3D -ENOMEM; + pke->physical_memory_pages =3D NULL; + pke->physical_memory_page_count =3D 0; + goto out; + } + + if (!cma_pages_valid(parker_cma[memory_nid], result, page_count)) { + ret =3D -EINVAL; + if (!cma_release(parker_cma[memory_nid], result, page_count)) + pr_err("Failed to release invalid 
allocation."); + goto out; + + } + + pke->physical_memory_pages =3D result; + pke->physical_memory_page_count =3D page_count; + ret =3D 0; +out: + mutex_unlock(&pke->mutex); + return ret ?: nbytes; +} + +/* TODO: Consider implementation where we bind to pci-stub instead - avoid= rescanning problem? */ +static ssize_t parker_kernel_bind_write(struct kernfs_open_file *of, char = *buf, + size_t nbytes, loff_t off) +{ + struct parker_kernel_entry *pke =3D of->kn->parent->priv; + struct kernfs_node *dev_kn; + struct device *dev; + int ret =3D -ENODEV; + + mutex_lock(&pke->mutex); + dev =3D bus_find_device_by_name(&pci_bus_type, NULL, buf); + /* Remove from bus to prevent anyone from using it */ + if (dev) { + struct parker_kernel_device_entry *pkde; + /* If device already disabled, maybe owned by another kernel. Only claim= enabled devices */ + if (!pci_is_enabled(to_pci_dev(dev))) { + put_device(dev); + ret =3D -EBUSY; + goto out; + } + + pkde =3D kzalloc(sizeof(*pkde), GFP_KERNEL); + pkde->dev =3D dev; + dev_kn =3D kernfs_create_dir(pke->kn_devices, dev_name(dev), + pke->kn_devices->mode, pkde); + /* We use after kernfs_remove in unbind & rmdir case*/ + kernfs_get(dev_kn); + pkde->kn =3D dev_kn; + list_add_tail(&pkde->list_entry, &pke->list_devices); + kernfs_activate(dev_kn); + pci_bus_type.remove(dev); + ret =3D 0; + } + +out: + mutex_unlock(&pke->mutex); + return ret ?: nbytes; +} + +static ssize_t parker_kernel_unbind_write(struct kernfs_open_file *of, cha= r *buf, + size_t nbytes, loff_t off) +{ + struct parker_kernel_entry *pke =3D of->kn->parent->priv; + struct parker_kernel_device_entry *pkde; + struct kernfs_node *dev_kn; + struct device *dev; + int ret =3D -ENODEV; + + mutex_lock(&pke->mutex); + dev =3D bus_find_device_by_name(&pci_bus_type, NULL, buf); + /* Remove from bus to prevent anyone from using it */ + if (dev) { + /* Check if device is claimed by kernel */ + dev_kn =3D kernfs_find_and_get(pke->kn_devices, dev_name(dev)); + if (!dev_kn) { + 
put_device(dev); + goto out; + } + + /* Ensure PCI device isn't enabled */ + if (pci_is_enabled(to_pci_dev(dev))) { + put_device(dev); + kernfs_put(dev_kn); + goto out; + } + pkde =3D dev_kn->priv; + + ret =3D pci_bus_type.probe(dev); + put_device(dev); + kernfs_remove(dev_kn); + /* One reference from getting above, one from device subdir creation */ + kernfs_put(dev_kn); + kernfs_put(dev_kn); + list_del(&pkde->list_entry); + kfree(pkde); + } + +out: + mutex_unlock(&pke->mutex); + return ret ?: nbytes; +} + +/* Secondary kernel attributes */ +static struct parker_file_type per_kernel_attributes[] =3D { + /* Passed to secondary kernel to identify */ + { + .name =3D "id", + .mode =3D 0644, + .kf_ops =3D &parker_kf_ops, + .seq_show =3D parker_kernel_index_show, + }, + { + .name =3D "control_structure", + .mode =3D 0644, + .kf_ops =3D &parker_kf_ops, + .seq_show =3D parker_kernel_control_structure_show, + }, + { + .name =3D "cpus", + .mode =3D 0644, + .kf_ops =3D &parker_kf_ops, + .seq_show =3D parker_kernel_cpus_show, + .write =3D parker_kernel_cpus_write, + }, + /* Add per numa node memory? */ + { + .name =3D "memory", + .mode =3D 0644, + .kf_ops =3D &parker_kf_ops, + .seq_show =3D parker_kernel_memory_show, + .write =3D parker_kernel_memory_write, + }, + { + .name =3D "bind", + .mode =3D 0644, + .kf_ops =3D &parker_kf_ops, + .write =3D parker_kernel_bind_write, + }, + { + .name =3D "unbind", + .mode =3D 0644, + .kf_ops =3D &parker_kf_ops, + .write =3D parker_kernel_unbind_write, + }, + /* TODO: is status better? 
*/ + { + .name =3D "online", + .mode =3D 0644, + .kf_ops =3D &parker_kf_ops, + .seq_show =3D parker_kernel_online_show, + .write =3D parker_kernel_online_write, // todo + }, +}; + +struct parker_fs_context { + struct kernfs_fs_context kfc; +}; + +static int parker_setup_root(struct parker_fs_context *ctx); +static void parker_destroy_root(void); + +static struct kernfs_root *parker_root; +struct parker parker_default; + +static const struct fs_context_operations parker_fs_context_ops =3D { + .free =3D parker_fs_context_free, + .get_tree =3D parker_get_tree, +}; + +static struct file_system_type parker_fs_type =3D { + .name =3D "parker", + .init_fs_context =3D parker_init_fs_context, + .kill_sb =3D parker_kill_sb, +}; + + +static int parker_kernel_entry_destroy(struct parker_kernel_entry *pke) +{ + int cpu, memory_nid =3D NUMA_NO_NODE, ret =3D 0; + struct list_head *dev_elem, *n; + + + /* Bring back parker CPUs */ + for_each_cpu(cpu, &pke->cpu_mask) { + add_cpu(cpu); + if (memory_nid =3D=3D NUMA_NO_NODE) + memory_nid =3D cpu_to_node(cpu); + } + cpumask_andnot(&parker_cpus, &parker_cpus, &pke->cpu_mask); + + /* Free memory allocated */ + if (pke->physical_memory_page_count > 0 && + !cma_release(parker_cma[memory_nid], + pke->physical_memory_pages, + pke->physical_memory_page_count)) { + ret =3D -EBUSY; + } + + for (int i =3D 0; i < pke->control_structure_page_count; ++i) { + __free_pages(pke->control_structure_pages + i, 0); + } + + + /* Unclaim PCI devices */ + list_for_each_safe(dev_elem, n, &pke->list_devices) { + struct parker_kernel_device_entry *pkde; + pkde =3D container_of(dev_elem, + struct parker_kernel_device_entry, + list_entry); + ret =3D pci_bus_type.probe(pkde->dev); + if (ret) + continue; + put_device(pkde->dev); + kernfs_remove(pkde->kn); + kernfs_put(pkde->kn); + kfree(pkde); + } + + atomic_dec(&parker_kernels); + mutex_destroy(&pke->mutex); + if (pke->kn) + kernfs_put(pke->kn); + kfree(pke); + + return ret; +} + +static int 
parker_kernel_control_structure_alloc(struct parker_kernel_entr= y *pke) +{ + pke->control_structure_pages =3D alloc_pages(GFP_KERNEL | __GFP_ZERO, 0); + if (!pke->control_structure_pages) + return -ENOMEM; + + pke->control_structure_page_count =3D 1; + return 0; +} + +static int parker_kernel_entry_init(struct parker_kernel_entry *pke) +{ + struct kernfs_node *kn; + int ret; + // Also allocate any secondary structures? + + ret =3D parker_kernel_control_structure_alloc(pke); + if (ret) + return ret; + + atomic_inc(&parker_kernels); + pke->id =3D atomic_read(&parker_kernels); + pke->online =3D false; + mutex_init(&pke->mutex); + INIT_LIST_HEAD(&pke->list_devices); + + kn =3D kernfs_create_dir(pke->kn, "devices", pke->kn->mode, pke); + if (IS_ERR(kn)) { + /* As no devices, can't fail */ + parker_kernel_entry_destroy(pke); + return PTR_ERR(kn); + } + pke->kn_devices =3D kn; + + return 0; +} + +static int parker_mkdir(struct kernfs_node *parent_kn, const char *name, u= mode_t mode) +{ + int ret =3D 0; + struct parker_kernel_entry *pke; + struct kernfs_node *kn; + + /* Only allow creation from within root directory */ + if (parent_kn !=3D parker_default.kn) + return -EINVAL; + + if (strchr(name, '\n')) + return -EINVAL; + + mutex_lock(&parker_mutex); + pke =3D kzalloc(sizeof(*pke), GFP_KERNEL); + if (!pke) { + ret =3D -ENOMEM; + goto out_unlock; + } + + kn =3D kernfs_create_dir(parent_kn, name, mode, pke); + if (IS_ERR(kn)) { + ret =3D PTR_ERR(kn); + goto out_free_pke; + } + pke->kn =3D kn; + + ret =3D parker_kernel_entry_init(pke); + if (ret) + goto out_unlock; + + /* As we will use pke after kernfs_remove */ + kernfs_get(pke->kn); + + ret =3D parker_kn_set_ugid(kn); + if (ret) { + goto out_destroy; + } + + ret =3D parker_add_files(kn, per_kernel_attributes, ARRAY_SIZE(per_kernel= _attributes)); + if (ret) { + goto out_destroy; + } + + kernfs_activate(kn); + goto out_unlock; + +out_destroy: + kernfs_remove(pke->kn); + kernfs_put(pke->kn); +out_free_pke: + kfree(pke); 
+out_unlock: + mutex_unlock(&parker_mutex); + return ret; +} + +static int parker_rmdir(struct kernfs_node *kn) +{ + struct parker_kernel_entry *pke =3D kn->priv; + int ret =3D 0; + + /* Only handle rmdir of kernel */ + if (pke->kn !=3D kn) { + ret =3D -EBUSY; + goto out; + } + + if (parker_kernel_is_online(pke)) { + ret =3D -EBUSY; + goto out; + } + + /* First remove, ensuring no new operations */ + mutex_lock(&pke->mutex); + kernfs_remove_self(kn); + mutex_unlock(&pke->mutex); + + ret =3D parker_kernel_entry_destroy(pke); +out: + return ret; +} + +static struct kernfs_syscall_ops parker_kf_syscall_ops =3D { + .mkdir =3D parker_mkdir, + .rmdir =3D parker_rmdir, +}; + +static inline struct parker_fs_context *parker_fc2context(struct fs_contex= t *fc) +{ + struct kernfs_fs_context *kfc =3D fc->fs_private; + + return container_of(kfc, struct parker_fs_context, kfc); +} + +static int parker_kn_set_ugid(struct kernfs_node *kn) +{ + struct iattr iattr =3D { .ia_valid =3D ATTR_UID | ATTR_GID, + .ia_uid =3D current_fsuid(), + .ia_gid =3D current_fsgid(), }; + + if (uid_eq(iattr.ia_uid, GLOBAL_ROOT_UID) && + gid_eq(iattr.ia_gid, GLOBAL_ROOT_GID)) + return 0; + + return kernfs_setattr(kn, &iattr); +} + +static int parker_add_file(struct kernfs_node *parent_kn, + struct parker_file_type *pft) +{ + struct kernfs_node *kn; + int ret; + + kn =3D __kernfs_create_file(parent_kn, pft->name, pft->mode, + GLOBAL_ROOT_UID, GLOBAL_ROOT_GID, + 0, pft->kf_ops, pft, NULL, NULL); + if (IS_ERR(kn)) + return PTR_ERR(kn); + + ret =3D parker_kn_set_ugid(kn); + if (ret) { + kernfs_remove(kn); + return ret; + } + + return 0; +} + +static int parker_add_files(struct kernfs_node *kn, struct parker_file_typ= e *pfts, int len) +{ + struct parker_file_type *pft; + int ret; + + lockdep_assert_held(&parker_mutex); + + for (pft =3D pfts; pft < pfts + len; pft++) { + ret =3D parker_add_file(kn, pft); + if (ret) + goto error; + } + + return 0; +error: + pr_warn("Failed to add %s, err=3D%d\n", pft->name, 
ret); + while (--pft >=3D pfts) { + kernfs_remove_by_name(kn, pft->name); + } + return ret; +} + + +static int parker_init_fs_context(struct fs_context *fc) +{ + struct parker_fs_context *ctx; + ctx =3D kzalloc(sizeof(struct parker_fs_context), GFP_KERNEL); + if (!ctx) + return -ENOMEM; + + ctx->kfc.magic =3D PARKER_SUPER_MAGIC; // TODO: Add to include/uapi/linux= /magic.h + fc->fs_private =3D &ctx->kfc; + fc->ops =3D &parker_fs_context_ops; + put_user_ns(fc->user_ns); + fc->user_ns =3D get_user_ns(&init_user_ns); + fc->global =3D true; + return 0; +} + +static int parker_get_tree(struct fs_context *fc) +{ + struct parker_fs_context *ctx =3D parker_fc2context(fc); + int ret =3D 0; + + mutex_lock(&parker_mutex); + if (parker_mounted) { + ret =3D -EBUSY; + goto out; + } + + /* Filesystem was unmounted but kernels weren't cleaned up, reactivate la= st root */ + if (parker_default.kn) { + ctx->kfc.root =3D parker_root; + goto activate_root; + } + + ret =3D parker_setup_root(ctx); + if (ret) + goto out; + + ret =3D parker_add_files(parker_default.kn, root_attributes, ARRAY_SIZE(r= oot_attributes)); + if (ret < 0) + goto destroy_root; + +activate_root: + kernfs_activate(parker_default.kn); + ret =3D kernfs_get_tree(fc); + if (ret < 0) + goto destroy_root; + parker_mounted =3D true; + goto out; + +destroy_root: + parker_destroy_root(); +out: + mutex_unlock(&parker_mutex); + return ret; +} + +static void parker_fs_context_free(struct fs_context *fc) +{ + struct parker_fs_context *ctx =3D parker_fc2context(fc); + + kernfs_free_fs_context(fc); + kfree(ctx); +} + +static void parker_kill_sb(struct super_block *sb) +{ + mutex_lock(&parker_mutex); + parker_mounted =3D false; + + /* Only destroy root if no kernels are still declared */ + if (atomic_read(&parker_kernels) =3D=3D 0) { + parker_destroy_root(); + } + + kernfs_kill_sb(sb); + mutex_unlock(&parker_mutex); +} + +static void parker_destroy_root(void) +{ + kernfs_destroy_root(parker_root); + parker_default.kn
=3D NULL; +} + +static int parker_setup_root(struct parker_fs_context *ctx) +{ + parker_root =3D kernfs_create_root( + &parker_kf_syscall_ops, + KERNFS_ROOT_CREATE_DEACTIVATED | KERNFS_ROOT_EXTRA_OPEN_PERM_CHECK, + &parker_default); + + if (IS_ERR(parker_root)) + return PTR_ERR(parker_root); + + ctx->kfc.root =3D parker_root; + parker_default.kn =3D kernfs_root_to_node(parker_root); + + return 0; +} + +/* Prevent us from onlining CPUs provisioned to parker instance */ +static int parker_cpu_offline_startup(unsigned int cpu) +{ + int ret; + + mutex_lock(&parker_mutex); + ret =3D cpumask_test_cpu(cpu, &parker_cpus) ? -EINVAL : 0; + mutex_unlock(&parker_mutex); + + return ret; +} + + +static int __init parker_kernfs_init(void) +{ + int ret =3D 0; + + if (!parker_cma_size) { + pr_err("No parker CMA regions allocated, disabling parker.\n"); + return -ENOENT; + } + + ret =3D sysfs_create_mount_point(fs_kobj, "parker"); + if (ret) + return ret; + + ret =3D register_filesystem(&parker_fs_type); + if (ret) + goto cleanup_mountpoint; + + ret =3D cpuhp_setup_state(CPUHP_BP_PREPARE_DYN, "parker", parker_cpu_offl= ine_startup, NULL); + if (ret < 0) + goto cleanup_filesystem; + + return 0; +cleanup_filesystem: + unregister_filesystem(&parker_fs_type); +cleanup_mountpoint: + sysfs_remove_mount_point(fs_kobj, "parker"); + return ret; +} + +module_init(parker_module_init); +module_exit(parker_module_exit); + +MODULE_LICENSE("GPL v2"); +MODULE_AUTHOR("Thom Hughes"); +MODULE_DESCRIPTION("Parker Linux host module."); + diff --git a/include/linux/parker.h b/include/linux/parker.h new file mode 100644 index 000000000000..4984aefcee0f --- /dev/null +++ b/include/linux/parker.h @@ -0,0 +1,7 @@ +#ifndef _LINUX_PARKER_H +#define _LINUX_PARKER_H +#ifdef CONFIG_PARKER + +#endif /* CONFIG_PARKER */ +#endif /* _LINUX_PARKER_H */ + diff --git a/include/uapi/linux/magic.h b/include/uapi/linux/magic.h index bb575f3ab45e..25658054e3a7 100644 --- a/include/uapi/linux/magic.h +++
b/include/uapi/linux/magic.h @@ -38,6 +38,7 @@ #define OVERLAYFS_SUPER_MAGIC 0x794c7630 #define FUSE_SUPER_MAGIC 0x65735546 #define BCACHEFS_SUPER_MAGIC 0xca451a4e +#define PARKER_SUPER_MAGIC 0x5041524b /* "PARK" */ =20 #define MINIX_SUPER_MAGIC 0x137F /* minix v1 fs, 14 char names */ #define MINIX_SUPER_MAGIC2 0x138F /* minix v1 fs, 30 char names */ --=20 2.39.5
From nobody Thu Oct 2 02:13:22 2025
From: Fam Zheng To: linux-kernel@vger.kernel.org Cc: Lukasz Luba , linyongting@bytedance.com, songmuchun@bytedance.com, satish.kumar@bytedance.com, Borislav Petkov , Thomas Gleixner , yuanzhu@bytedance.com, Ingo Molnar , Daniel Lezcano , fam.zheng@bytedance.com, Zhang Rui , fam@euphon.net, "H. Peter Anvin" , x86@kernel.org, liangma@bytedance.com, Dave Hansen , "Rafael J.
Wysocki" , guojinhui.liam@bytedance.com, linux-pm@vger.kernel.org, Thom Hughes Subject: [RFC 4/5] x86/parker: Add parker initialisation code Date: Tue, 23 Sep 2025 15:31:45 +0000 Message-Id: <20250923153146.365015-5-fam.zheng@bytedance.com> X-Mailer: git-send-email 2.39.5 In-Reply-To: <20250923153146.365015-1-fam.zheng@bytedance.com> References: <20250923153146.365015-1-fam.zheng@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Thom Hughes Signed-off-by: Thom Hughes Signed-off-by: Fam Zheng --- arch/x86/kernel/setup.c | 4 + arch/x86/parker/Makefile | 3 +- arch/x86/parker/Makefile-full | 3 + arch/x86/parker/setup.c | 423 ++++++++++++++++++++++++++++ arch/x86/parker/trampoline.S | 55 ++++ arch/x86/parker/trampoline.h | 10 + drivers/thermal/intel/therm_throt.c | 3 + include/linux/parker-bkup.h | 22 ++ include/linux/parker.h | 15 + 9 files changed, 537 insertions(+), 1 deletion(-) create mode 100644 arch/x86/parker/Makefile-full create mode 100644 arch/x86/parker/setup.c create mode 100644 arch/x86/parker/trampoline.S create mode 100644 arch/x86/parker/trampoline.h create mode 100644 include/linux/parker-bkup.h diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c index cebee310e200..a3c7909efaf5 100644 --- a/arch/x86/kernel/setup.c +++ b/arch/x86/kernel/setup.c @@ -20,6 +20,7 @@ #include #include #include +#include #include #include #include @@ -917,6 +918,7 @@ void __init setup_arch(char **cmdline_p) * called before cache_bp_init() for setting up MTRR state. 
*/ init_hypervisor_platform(); + parker_init(); =20 tsc_early_init(); x86_init.resources.probe_roms(); @@ -1110,6 +1112,8 @@ void __init setup_arch(char **cmdline_p) =20 if (boot_cpu_has(X86_FEATURE_GBPAGES)) hugetlb_cma_reserve(PUD_SHIFT - PAGE_SHIFT); + /* Allocate memory for PARKER kernels */ + parker_cma_reserve(); =20 /* * Reserve memory for crash kernel after SRAT is parsed so that it diff --git a/arch/x86/parker/Makefile b/arch/x86/parker/Makefile index 41c40fc64267..506ad8cbff00 100644 --- a/arch/x86/parker/Makefile +++ b/arch/x86/parker/Makefile @@ -1,2 +1,3 @@ -obj-y +=3D kernfs.o +obj-y +=3D kernfs.o setup.o trampoline.o $(obj)/kernfs.o: $(obj)/internal.h +$(obj)/setup.o: $(obj)/internal.h diff --git a/arch/x86/parker/Makefile-full b/arch/x86/parker/Makefile-full new file mode 100644 index 000000000000..506ad8cbff00 --- /dev/null +++ b/arch/x86/parker/Makefile-full @@ -0,0 +1,3 @@ +obj-y +=3D kernfs.o setup.o trampoline.o +$(obj)/kernfs.o: $(obj)/internal.h +$(obj)/setup.o: $(obj)/internal.h diff --git a/arch/x86/parker/setup.c b/arch/x86/parker/setup.c new file mode 100644 index 000000000000..2d36dac05289 --- /dev/null +++ b/arch/x86/parker/setup.c @@ -0,0 +1,423 @@ +#define pr_fmt(fmt) "parker: " fmt + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "internal.h" +#include "trampoline.h" + +bool is_parker =3D false; + +static phys_addr_t parker_control_structure_address; +static volatile struct parker_control_structure *parker_control_structure; + +static void (*old_shutdown)(void); +static void (*old_restart)(char*); + +/* Take in parker control page as kernel parameter + * this also indicates we are booting as a parker kernel + * currently assumes that control structure is 1 page */ +static __init int parker_parse_early_param(char *opt) +{ + if (!opt) + return -EINVAL; + char *oldopt =3D opt; + 
parker_control_structure_address =3D memparse(opt, &opt); + if (oldopt =3D=3D opt) + return -EINVAL; + is_parker =3D true; + return 0; +} +early_param("parker", parker_parse_early_param); + +inline bool is_parker_instance(void) +{ + return is_parker; +} + +static struct resource parker_control_structure_resource =3D { + .name =3D "Parker Control Structure", + .start =3D 0, + .end =3D 0, + .flags =3D IORESOURCE_SYSTEM_RAM, + .desc =3D IORES_DESC_RESERVED +}; + +static struct real_mode_header parker_dummy_real_mode_header; + +static void parker_reserve_control_structure(unsigned long long addr) +{ + parker_control_structure_resource.start =3D addr; + parker_control_structure_resource.end =3D addr + PAGE_SIZE - 1; + insert_resource(&iomem_resource, &parker_control_structure_resource); +} + +static void __init parker_x2apic_init(void) +{ +#ifdef CONFIG_X86_X2APIC + if (!x2apic_enabled()) + return; + x2apic_phys =3D 1; + /* + * This will trigger the switch to apic_x2apic_phys. Empty OEM IDs + * ensure that only this APIC driver picks up the call. 
+ */ + default_acpi_madt_oem_check("", ""); +#endif +} + +/* Setup trampoline pagetable, stack and initial code pointer */ +static int __init parker_init_trampoline(void) +{ + /* Setup trampoline lock or else head_64.S:secondary_startup_64 will cras= h */ + trampoline_lock =3D (u32 *)&parker_trampoline_lock; + WRITE_ONCE(*trampoline_lock, 0); + + /* Map kernel page table so we can access kernel memory */ + for (int i =3D pgd_index(__PAGE_OFFSET); i < PTRS_PER_PGD; i++) + WRITE_ONCE(parker_trampoline_pgt[i], init_top_pgt[i].pgd); + + /* 1:1 map parker_ap_trampoline physical memory so we can jump from host*/ + u64 paddr_map =3D virt_to_phys(parker_ap_trampoline); + pgd_t *pgd =3D (pgd_t*)parker_trampoline_pgt; + p4d_t *p4d; + pud_t *pud; + pmd_t *pmd; + pte_t *pte; + pgd +=3D pgd_index(paddr_map); + pgprot_t prot =3D PAGE_KERNEL_EXEC_NOENC; + if (!pgd_present(*pgd)) { + p4d =3D (p4d_t *)memblock_alloc(PAGE_SIZE, PAGE_SIZE); + if (!p4d) + return -ENOMEM; + set_pgd(pgd, __pgd(__pa(p4d) | _KERNPG_TABLE)); + } + p4d =3D p4d_offset(pgd, paddr_map); + if (!p4d_present(*p4d)) { + pud =3D (pud_t *)memblock_alloc(PAGE_SIZE, PAGE_SIZE); + if (!pud) { + memblock_free(p4d, PAGE_SIZE); + return -ENOMEM; + } + set_p4d(p4d, __p4d(__pa(pud) | _KERNPG_TABLE)); + } + pud =3D pud_offset(p4d, paddr_map); + if (!pud_present(*pud)) { + pmd =3D (pmd_t *)memblock_alloc(PAGE_SIZE, PAGE_SIZE); + if (!pmd) { + memblock_free(p4d, PAGE_SIZE); + memblock_free(pud, PAGE_SIZE); + return -ENOMEM; + } + set_pud(pud, __pud(__pa(pmd) | _KERNPG_TABLE)); + } + pmd =3D pmd_offset(pud, paddr_map); + if (!pmd_present(*pmd)) { + pte =3D (pte_t *)memblock_alloc(PAGE_SIZE, PAGE_SIZE); + if (!pte) { + memblock_free(p4d, PAGE_SIZE); + memblock_free(pud, PAGE_SIZE); + memblock_free(pmd, PAGE_SIZE); + return -ENOMEM; + } + set_pmd(pmd, __pmd(__pa(pte) | _KERNPG_TABLE)); + } + pte =3D pte_offset_kernel(pmd, paddr_map); + + if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT)) + prot =3D PAGE_KERNEL_EXEC; + + set_pte(pte, 
pfn_pte(paddr_map >> PAGE_SHIFT, prot)); + + /* Write initial code (within parker) for secondary CPU initialisation */ + WRITE_ONCE(parker_trampoline_start, secondary_startup_64); + + /* Store virtual address of top of stack at bottom of stack */ + WRITE_ONCE(parker_trampoline_stack, &parker_trampoline_stack_end); + + /* Synchronise all updates */ + smp_mb(); + + return 0; +} + +static void parker_set_spoof_bsp(bool enabled) +{ + u64 msr; + rdmsrl(MSR_IA32_APICBASE, msr); + msr =3D enabled ? (msr | MSR_IA32_APICBASE_BSP) : + (msr & ~MSR_IA32_APICBASE_BSP); + wrmsrl(MSR_IA32_APICBASE, msr); +} + +static void __init parker_parse_smp_cfg(void) +{ + int ret; + pr_info("smpconfig\n"); + /* Disable legacy PIC */ + pic_mode =3D 0; + + /* Initialise x2apic here, as ACPI is disabled */ + parker_x2apic_init(); + + /* Spoof parker BSP as BSP or kernel thinks it's crash kernel */ + parker_set_spoof_bsp(true); + + /* Assume that lapic address is unchanged */ + register_lapic_address(APIC_DEFAULT_PHYS_BASE); + ret =3D parker_init_trampoline(); + if (ret < 0) { + pr_err("Failed to initialise trampoline.\n"); + smp_found_config =3D 0; + return; + } + + /* Register all APIC IDs for parker APs */ + for (int i =3D 0; i < parker_control_structure->num_cpus; ++i) { + topology_register_apic(READ_ONCE(parker_control_structure->apic_ids[i]), + CPU_ACPIID_INVALID, + true); + } + + smp_found_config =3D 1; +} + +static bool __init parker_x2apic_available(void) +{ + return x2apic_enabled(); +} + +static void parker_init_host_control(void) +{ + parker_reserve_control_structure(parker_control_structure_address); + phys_addr_t address =3D parker_control_structure_address; + parker_control_structure =3D early_memremap(address, PAGE_SIZE); + /* Announce trampoline physical address to host kernel */ + phys_addr_t trampoline_phys_addr =3D virt_to_phys(parker_ap_trampoline); + WRITE_ONCE(parker_control_structure->start_address, trampoline_phys_addr); + smp_mb(); +} + +/* Some APIC callback
overrides */ +static int parker_wakeup_secondary_cpu_64(u32 apicid, unsigned long _dummy= _start_eip) +{ + WRITE_ONCE(parker_trampoline_apicid, apicid); + smp_mb(); + + /* Wait for APIC id to be reset before continuing, + * ensuring no CPU misses trampoline kick. */ + while (READ_ONCE(parker_trampoline_apicid) !=3D 0) + cpu_relax(); + + return 0; +} + +static void parker_send_IPI_allbutself(int vector) +{ + if (num_online_cpus() < 2) + return; + + __apic_send_IPI_mask_allbutself(cpu_online_mask, vector); +} + +static void parker_send_IPI_all(int vector) +{ + __apic_send_IPI_mask(cpu_online_mask, vector); +} + + +/* Setup real mode header so SMP doesn't dereference null pointer */ +static void __init parker_realmode_init(void) +{ + real_mode_header =3D &parker_dummy_real_mode_header; +} + +static void parker_emergency_restart(void) +{ + pr_notice("Restart not supported, spinning\n"); + for (;;) { + continue; + } +} + +static void parker_offline(void) +{ + /* Remove BSP flag from APIC MSR + * or we crash on second use of BSP in parker kernel */ + parker_set_spoof_bsp(false); + parker_control_structure =3D memremap(parker_control_structure_address, P= AGE_SIZE, MEMREMAP_WB); + if (!parker_control_structure) { + pr_err("Unable to map control structure, unable to tell host we are offl= ine.\n"); + return; + } + WRITE_ONCE(parker_control_structure->online, false); + memunmap((void*)parker_control_structure); +} + +static void parker_shutdown(void) +{ + pr_info("shutting down.\n"); + parker_offline(); + old_shutdown(); +} + +/* No restart occurs, will just effectively shutdown */ +static void parker_restart(char *msg) +{ + pr_info("rebooting\n"); + parker_offline(); + old_restart(msg); +} + +static struct pci_bus __init *parker_pci_init_root_bus(int busno) +{ + struct pci_bus *bus; + struct pci_sysdata *sd; + LIST_HEAD(resources); + + /* If bus exists, continue */ + /* TODO: Is domain always 0? 
(probably not) */ + bus =3D pci_find_bus(0, busno); + if (bus) + return bus; + + sd =3D kzalloc(sizeof(*sd), GFP_KERNEL); + if (!sd) { + printk(KERN_ERR "PCI: OOM, skipping PCI bus %02x\n", busno); + return NULL; + } + + sd->node =3D x86_pci_root_bus_node(busno); + x86_pci_root_bus_resources(busno, &resources); + bus =3D pci_create_root_bus(NULL, busno, &pci_root_ops, sd, &resources); + if (!bus) { + pci_free_resource_list(&resources); + kfree(sd); + return NULL; + } + + return bus; +} + +static int __init parker_pci_init(void) +{ + /* Set to 0 as we are manually setting up and probing busses ourselves */ + pcibios_last_bus =3D 0; + + /* Scan only passed through PCI devices, passing through PCIe port unsupp= ored */ + for (int i =3D 0; i < parker_control_structure->num_pci_devs; ++i) { + u32 dev_id =3D parker_control_structure->pci_dev_ids[i]; + u32 busno =3D PCI_BUS_NUM(dev_id); + u32 devfn =3D dev_id & 0xff; + struct pci_bus *bus =3D parker_pci_init_root_bus(busno); + if (!bus) { + pr_err("Failed to get bus: %d\n", busno); + continue; + } + struct pci_dev *dev =3D pci_scan_single_device(bus, devfn); + if (!dev) { + pr_err("Failed to get dev: %d\n", devfn); + continue; + } + pci_bus_add_device(dev); + } + + /* We can announce online now to host kernel */ + WRITE_ONCE(parker_control_structure->online, 1); + smp_mb(); + + /* PCI initialisation is the last time the early mapped structure is used= */ + early_memunmap((void*)parker_control_structure, PAGE_SIZE); + + /* TODO: Disable rescan! */ + return 0; +} + +static int parker_pci_enable_irq(struct pci_dev *dev) +{ + /* Let's lie to everyone ;) */ + /* TODO: Find drivers that can use MSI only that fail to load without INT= -A + * then we can return -EINVAL; */ + return 0; +} + +static void parker_pci_disable_irq(struct pci_dev *dev) +{ + return; +} + +void __init parker_init() +{ + if (!is_parker_instance()) + return; + + pr_info("parker: Initialising parker..\n"); + /* TODO: Re-enable! 
*/ + legacy_pic =3D &null_legacy_pic; + /* Reserve dummy header so existing smpboot.c:do_boot_cpu code + * doesn't dereference NULL pointer */ + x86_platform.realmode_reserve =3D x86_init_noop; + x86_platform.realmode_init =3D parker_realmode_init; + + /* Disable legacy code */ + x86_platform.legacy.rtc =3D 0; + x86_platform.legacy.warm_reset =3D 0; + x86_platform.legacy.i8042 =3D X86_LEGACY_I8042_PLATFORM_ABSENT; + + /* Disable emergency restart */ + machine_ops.emergency_restart =3D parker_emergency_restart; + + /* Save old machine ops */ + old_shutdown =3D machine_ops.shutdown; + old_restart =3D machine_ops.restart; + + /* Ensure shutdown / restart makes host kernel aware parker is offline */ + machine_ops.shutdown =3D parker_shutdown; + machine_ops.restart =3D parker_restart; + + /* Use control structure for SMP CPU APIC ID enumeration */ + x86_init.mpparse.find_mptable =3D x86_init_noop; + x86_init.mpparse.early_parse_smp_cfg =3D x86_init_noop; + x86_init.mpparse.parse_smp_cfg =3D parker_parse_smp_cfg; + + /* TODO: Investigate x2apic alternative, but requires baremetal */ + x86_init.hyper.x2apic_available =3D parker_x2apic_available; + + /* Disable PCI IRQ handling as we don't support INT-A mode */ + x86_init.pci.init =3D parker_pci_init; + x86_init.pci.init_irq =3D x86_init_noop; + x86_init.pci.fixup_irqs =3D x86_init_noop; + pcibios_enable_irq =3D parker_pci_enable_irq; + pcibios_disable_irq =3D parker_pci_disable_irq; + + /* No ACPI, so no hotplugging (be nice) */ + disable_acpi(); + + /* Setup host kernel control page */ + parker_init_host_control(); + + /* Let smpboot.c:do_boot_cpu use our wakeup routine */ + apic_update_callback(wakeup_secondary_cpu_64, parker_wakeup_secondary_cpu= _64); + + /* Prevent shorthand IPIs */ + apic_update_callback(send_IPI_all, parker_send_IPI_all); + apic_update_callback(send_IPI_allbutself, parker_send_IPI_allbutself); +} diff --git a/arch/x86/parker/trampoline.S b/arch/x86/parker/trampoline.S new file mode 100644 index 
000000000000..2107201eb1de --- /dev/null +++ b/arch/x86/parker/trampoline.S @@ -0,0 +1,55 @@ +#include + +#include +#include +#include + +/* NOTE: No SME, host kernel and secondary kernel must match N-level pgt */ +/* NOTE: Changing this file will require a full recompilation as makefile = isn't setup properly */ +.text +.code64 +SYM_CODE_START(parker_ap_trampoline) + UNWIND_HINT_END_OF_STACK + ANNOTATE_NOENDBR +/* Spin for now */ +.Lno_trampoline_start: + mov parker_trampoline_start(%rip), %rcx + test %rcx, %rcx + jz .Lno_trampoline_start + leaq parker_trampoline_pgt(%rip), %rax +.Lno_stack: + /* Store vaddr of stack at top of stack */ + movq parker_trampoline_stack(%rip), %rsp + test %rsp, %rsp + jz .Lno_stack + mov %rax, %cr3 +.Lwrong_apicid: + cmp parker_trampoline_apicid, %esi + jne .Lwrong_apicid +.Ltrampoline_locked: + lock btsl $0, parker_trampoline_lock + jnc .Ltrampoline_unlocked + pause + jmp .Ltrampoline_locked +/* Assume APIC ID 0 is never secondary processor */ +.Ltrampoline_unlocked: + movl $0, parker_trampoline_apicid + ANNOTATE_RETPOLINE_SAFE + call *%rcx + ANNOTATE_UNRET_SAFE + ret + int3 +SYM_CODE_END(parker_ap_trampoline) + +.data +.balign PAGE_SIZE +SYM_DATA(parker_trampoline_pgt, .skip 4096) +SYM_DATA(parker_trampoline_start, .quad 0) +SYM_DATA(parker_trampoline_apicid, .long 0) +SYM_DATA(parker_trampoline_lock, .long 0) +.balign 4096 +/* TODO: Just allocate a stack why waste 4KB */ +SYM_DATA_START(parker_trampoline_stack) + .skip 4096 +SYM_DATA_END_LABEL(parker_trampoline_stack, SYM_L_GLOBAL, parker_trampolin= e_stack_end) +SYM_DATA(parker_trampoline_end, .quad 0) diff --git a/arch/x86/parker/trampoline.h b/arch/x86/parker/trampoline.h new file mode 100644 index 000000000000..b93ca612db99 --- /dev/null +++ b/arch/x86/parker/trampoline.h @@ -0,0 +1,10 @@ +#ifndef _TRAMPOLINE_H +#define _TRAMPOLINE_H +void parker_ap_trampoline(void); +extern u64 parker_trampoline_pgt[]; +extern void *parker_trampoline_start; +extern u32 parker_trampoline_lock; 
+extern void *parker_trampoline_stack;
+extern u32 parker_trampoline_apicid;
+extern u64 parker_trampoline_stack_end;
+#endif
diff --git a/drivers/thermal/intel/therm_throt.c b/drivers/thermal/intel/therm_throt.c
index e69868e868eb..dabc7e35ff72 100644
--- a/drivers/thermal/intel/therm_throt.c
+++ b/drivers/thermal/intel/therm_throt.c
@@ -20,6 +20,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -690,6 +691,8 @@ void intel_thermal_interrupt(void)
 /* Thermal monitoring depends on APIC, ACPI and clock modulation */
 static int intel_thermal_supported(struct cpuinfo_x86 *c)
 {
+	if (is_parker_instance())
+		return 0;
 	if (!boot_cpu_has(X86_FEATURE_APIC))
 		return 0;
 	if (!cpu_has(c, X86_FEATURE_ACPI) || !cpu_has(c, X86_FEATURE_ACC))
diff --git a/include/linux/parker-bkup.h b/include/linux/parker-bkup.h
new file mode 100644
index 000000000000..b00833b5a24b
--- /dev/null
+++ b/include/linux/parker-bkup.h
@@ -0,0 +1,22 @@
+#ifndef _LINUX_PARKER_H
+#define _LINUX_PARKER_H
+#ifdef CONFIG_PARKER
+extern void __init parker_cma_reserve(void);
+extern void __init parker_init(void);
+extern bool is_parker_instance(void);
+#else
+static inline __init void parker_cma_reserve(void)
+{
+}
+
+static inline __init void parker_init(void)
+{
+}
+
+static inline bool is_parker_instance(void)
+{
+	return false;
+}
+#endif /* CONFIG_PARKER */
+#endif /* _LINUX_PARKER_H */
+
diff --git a/include/linux/parker.h b/include/linux/parker.h
index 4984aefcee0f..b00833b5a24b 100644
--- a/include/linux/parker.h
+++ b/include/linux/parker.h
@@ -1,7 +1,22 @@
 #ifndef _LINUX_PARKER_H
 #define _LINUX_PARKER_H
 #ifdef CONFIG_PARKER
+extern void __init parker_cma_reserve(void);
+extern void __init parker_init(void);
+extern bool is_parker_instance(void);
+#else
+static inline __init void parker_cma_reserve(void)
+{
+}
 
+static inline __init void parker_init(void)
+{
+}
+
+static inline bool is_parker_instance(void)
+{
+	return false;
+}
 #endif /* CONFIG_PARKER */
 #endif /* _LINUX_PARKER_H */
 
-- 
2.39.5

From nobody Thu Oct 2 02:13:22 2025
From: Fam Zheng
To: linux-kernel@vger.kernel.org
Cc: Lukasz Luba, linyongting@bytedance.com, songmuchun@bytedance.com,
 satish.kumar@bytedance.com, Borislav Petkov, Thomas Gleixner,
 yuanzhu@bytedance.com, Ingo Molnar, Daniel Lezcano,
 fam.zheng@bytedance.com, Zhang Rui, fam@euphon.net, "H. Peter Anvin",
 x86@kernel.org, liangma@bytedance.com, Dave Hansen, "Rafael J. Wysocki",
 guojinhui.liam@bytedance.com, linux-pm@vger.kernel.org, Thom Hughes
Subject: [RFC 5/5] x86/apic: Make Parker instance use physical APIC
Date: Tue, 23 Sep 2025 15:31:46 +0000
Message-Id: <20250923153146.365015-6-fam.zheng@bytedance.com>
In-Reply-To: <20250923153146.365015-1-fam.zheng@bytedance.com>
References: <20250923153146.365015-1-fam.zheng@bytedance.com>

From: Thom Hughes

Signed-off-by: Thom Hughes
Signed-off-by: Fam Zheng
---
 arch/x86/kernel/apic/apic_flat_64.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/apic/apic_flat_64.c b/arch/x86/kernel/apic/apic_flat_64.c
index e0308d8c4e6c..e753125a1de8 100644
--- a/arch/x86/kernel/apic/apic_flat_64.c
+++ b/arch/x86/kernel/apic/apic_flat_64.c
@@ -9,6 +9,7 @@
  * James Cleverdon.
  */
 #include
+#include
 
 #include
 
@@ -21,7 +22,7 @@ static u32 physflat_get_apic_id(u32 x)
 
 static int physflat_probe(void)
 {
-	return 1;
+	return is_parker_instance();
 }
 
 static int physflat_acpi_madt_oem_check(char *oem_id, char *oem_table_id)
-- 
2.39.5
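
For reference, the device-ID decoding that parker_pci_init() in patch 4/5 performs on each pci_dev_ids[] entry can be sketched in user space roughly as follows. This is an illustrative sketch, not code from the series: struct sketch_bdf, sketch_decode_dev_id() and sketch_selfcheck() are hypothetical names, and it assumes each entry packs the bus number in bits 15:8 and devfn in bits 7:0, which is what the PCI_BUS_NUM(dev_id) and dev_id & 0xff expressions in the patch imply. Any higher bits (for example a PCI domain) are ignored here, in line with the "Is domain always 0?" TODO in the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * User-space sketch only. The shifts and masks below mirror the kernel
 * macros PCI_BUS_NUM(), PCI_SLOT() and PCI_FUNC(). The struct and the
 * function names are hypothetical and do not appear in the series.
 */
struct sketch_bdf {
	uint8_t bus;	/* bits 15:8 of the ID, as PCI_BUS_NUM() extracts */
	uint8_t slot;	/* bits 7:3 of devfn, as PCI_SLOT() extracts */
	uint8_t func;	/* bits 2:0 of devfn, as PCI_FUNC() extracts */
};

static struct sketch_bdf sketch_decode_dev_id(uint32_t dev_id)
{
	/* Same split as in parker_pci_init(): low byte is devfn. */
	uint32_t devfn = dev_id & 0xff;
	struct sketch_bdf bdf = {
		.bus  = (uint8_t)((dev_id >> 8) & 0xff),
		.slot = (uint8_t)((devfn >> 3) & 0x1f),
		.func = (uint8_t)(devfn & 0x07),
	};

	return bdf;
}

/* Small self-check: 0x3a10 is bus 0x3a, devfn 0x10, i.e. 3a:02.0. */
static void sketch_selfcheck(void)
{
	struct sketch_bdf bdf = sketch_decode_dev_id(0x3a10);

	assert(bdf.bus == 0x3a);
	assert(bdf.slot == 2);
	assert(bdf.func == 0);
}
```

Under these assumptions the host side would publish one such ID per passed-through function, and the Parker instance only creates a root bus for the bus numbers that actually appear in the list.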