From: Shivang Upadhyay
To: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
Cc: Shivang Upadhyay, Madhavan Srinivasan, Michael Ellerman,
    Nicholas Piggin, Christophe Leroy, Srikar Dronamraju,
    Shrikanth Hegde, "Nysal Jan K.A.", Vishal Chourasia,
    Ritesh Harjani, Sourabh Jain
Subject: [PATCH] pseries/kexec: skip resetting CPUs added by firmware but not started by the kernel
Date: Fri, 5 Dec 2025 19:58:25 +0530
Message-ID: <20251205142825.44698-1-shivangu@linux.ibm.com>
X-Mailer: git-send-email 2.52.0

During DLPAR operations, newly added CPUs start in halted mode. The
kernel then takes some time to initialize those CPUs internally and
start them using the "start-cpu" RTAS call.

However, if a kexec crash occurs in this window (before the new CPU
has been initialized), the kexec NMI tries to reset all other CPUs
from the crashing CPU, which causes the firmware to start the
uninitialized CPUs as well. As a result, the kdump kernel hangs
during bringup.

Sample log:
[175993.028231][ T1502] NIP [00007fffb953f394] 0x7fffb953f394
[175993.028314][ T1502] LR [00007fffb953f394] 0x7fffb953f394
[175993.028390][ T1502] --- interrupt: 3000
[    5.519483][    T1] Processor 0 is stuck.
[   11.089481][    T1] Processor 1 is stuck.

To fix this, only issue the system-reset hcall to CPUs that have
actually been started by the kernel.

Cc: Madhavan Srinivasan
Cc: Michael Ellerman
Cc: Nicholas Piggin
Cc: Christophe Leroy
Cc: Srikar Dronamraju
Cc: Shrikanth Hegde
Cc: Nysal Jan K.A.
Cc: Vishal Chourasia
Cc: Ritesh Harjani
Cc: Sourabh Jain
Signed-off-by: Shivang Upadhyay
---
 arch/powerpc/platforms/pseries/smp.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/smp.c b/arch/powerpc/platforms/pseries/smp.c
index db99725e752b..e5518cf71094 100644
--- a/arch/powerpc/platforms/pseries/smp.c
+++ b/arch/powerpc/platforms/pseries/smp.c
@@ -173,10 +173,24 @@ static void dbell_or_ic_cause_ipi(int cpu)
 
 static int pseries_cause_nmi_ipi(int cpu)
 {
-	int hwcpu;
+	int hwcpu, k;
 
 	if (cpu == NMI_IPI_ALL_OTHERS) {
-		hwcpu = H_SIGNAL_SYS_RESET_ALL_OTHERS;
+
+		for_each_present_cpu(k) {
+			if (k != smp_processor_id()) {
+				hwcpu = get_hard_smp_processor_id(k);
+
+				/* it is possible that cpu is present,
+				 * but not started yet
+				 */
+				if (paca_ptrs[hwcpu]->cpu_start == 1)
+					plpar_signal_sys_reset(hwcpu);
+			}
+		}
+
+		return 1;
+
 	} else {
 		if (cpu < 0) {
 			WARN_ONCE(true, "incorrect cpu parameter %d", cpu);
-- 
2.52.0
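
The selection rule described in the commit message (reset only CPUs that are present, already started by the kernel, and not the crashing CPU) can be modelled outside the kernel. The following is a minimal userspace sketch, not kernel code. `struct cpu_state`, `select_reset_targets`, and the `started` flag are hypothetical names standing in for the per-CPU state the patch consults, and the sketch assumes that state is indexed by logical CPU number.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU record for this model; not the kernel's paca. */
struct cpu_state {
	bool present;	/* CPU has been added by firmware (e.g. via DLPAR) */
	bool started;	/* kernel has already started this CPU */
	int hw_id;	/* hardware CPU id a reset request would target */
};

/*
 * Fill 'out' with the hardware ids of CPUs that should receive a reset:
 * every present, started CPU except 'self' (the crashing CPU).
 * 'cpus' is indexed by logical CPU number. Returns the number of targets.
 * 'out' must have room for at least 'ncpus' entries.
 */
int select_reset_targets(const struct cpu_state *cpus, int ncpus,
			 int self, int *out)
{
	int n = 0;

	assert(cpus != NULL && out != NULL);

	for (int k = 0; k < ncpus; k++) {
		if (k == self)
			continue;	/* never reset the crashing CPU */
		if (!cpus[k].present)
			continue;	/* slot not populated */
		if (!cpus[k].started)
			continue;	/* present but still halted: skip */
		out[n++] = cpus[k].hw_id;
	}

	return n;
}
```

In this model a CPU that is present but not yet started never appears in the target list, which is the behaviour the commit message asks for. A broadcast to "all others" has no equivalent of the `started` check.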