From: "Zhuo, Qiuxu"
To: Borislav Petkov, "Luck, Tony", Nikolay Borisov
CC: "Li,Rongqing(ACG CCN)", Thomas Gleixner, Ingo Molnar, Dave Hansen,
    "x86@kernel.org", "H. Peter Anvin", Yazen Ghannam, Avadhut Naik,
    "linux-kernel@vger.kernel.org", "linux-edac@vger.kernel.org"
Subject: RE: [PATCH] x86/mce: Restore MCA polling interval halving
Date: Mon, 20 Apr 2026 14:14:52 +0000
In-Reply-To: <20260417115001.GCaeIeaU8eS_EMNq6G@fat_crate.local>

Hi Boris,

> From: Borislav Petkov
> [...]
>
> On Wed, Apr 15, 2026 at 10:02:03PM +0200, Borislav Petkov wrote:
> > Lemme think about how to restructure this patch of mine...
>
> Ok, totally untested. This is only to show the idea. I've basically went and
> distributed the functionality where it fits best: the pr_info logging at mce_log
> time and the work trigger in the notifier. It ended up like below.
>
> I'll run it but I'd let you folks check it first, whether I've missed an angle
> conceptually.
>

1. Test precondition:

   - Added debug messages [1] on top of Boris' patch.
   - RAS_CEC was disabled.
   - A correctable error was injected every 10 seconds.

2. Tested with CMCI interrupts enabled:

   - The message "Machine check events logged" was printed each time a
     correctable error was injected.
   - EDAC and mcelog in the decode chain were notified as expected.

   So, this part tested OK.

3. Tested in polling mode (boot with "mce=no_cmci"):

   - A CPU’s timer interval was halved after calling mce_log(), or when
     !mce_gen_pool_empty() was true during polling [2].
   - A CPU’s timer interval was doubled when mce_gen_pool_empty() was true
     during polling [2].

   This part tested OK, but please see comments below about mce_gen_pool_empty()
   check in mce_timer_fn().

[1]
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index f3a793e3a6c8..927dcdb15ff4 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -152,7 +152,7 @@ EXPORT_PER_CPU_SYMBOL_GPL(injectm);
 void mce_log(struct mce_hw_err *err)
 {
 	if (mce_gen_pool_add(err)) {
-		pr_info(HW_ERR "Machine check events logged\n");
+		pr_info(HW_ERR "Machine check events logged by CPU %d\n", smp_processor_id());
 		irq_work_queue(&mce_irq_work);
 	}
 }
@@ -1781,10 +1781,13 @@ static void mce_timer_fn(struct timer_list *t)
 	 * Alert userspace if needed. If we logged an MCE, reduce the polling
 	 * interval, otherwise increase the polling interval.
 	 */
-	if (!mce_gen_pool_empty())
+	if (!mce_gen_pool_empty()) {
 		iv = max(iv / 2, (unsigned long) HZ/100);
-	else
+		pr_info("!mce_gen_pool_empty() - CPU %d halves timer interval %ums\n", smp_processor_id(), jiffies_to_msecs(iv));
+	} else {
 		iv = min(iv * 2, round_jiffies_relative(check_interval * HZ));
+		pr_info(" mce_gen_pool_empty() - CPU %d doubles timer interval %ums\n", smp_processor_id(), jiffies_to_msecs(iv));
+	}

 	if (mce_get_storm_mode()) {
 		__start_timer(t, HZ);

[2] See example of 'CPU 82':

    dmesg | grep -E 'Machine check events logged|CPU 82' | grep -v "EDAC"

[  323.797260] mce: [Hardware Error]: Machine check events logged by CPU 82
[  323.804985] mce: [Hardware Error]: Machine check events logged by CPU 82
[  323.812618] mce: [Hardware Error]: Machine check events logged by CPU 82
[  323.820237] mce: [Hardware Error]: Machine check events logged by CPU 82
[  323.827868] mce: !mce_gen_pool_empty() - CPU 82 halves timer interval 150000ms
[  323.827970] mce: [Hardware Error]: Machine check events logged by CPU 147
[  487.635781] mce: [Hardware Error]: Machine check events logged by CPU 219
[  487.652751] mce: [Hardware Error]: Machine check events logged by CPU 219
[  487.660571] mce: [Hardware Error]: Machine check events logged by CPU 219
[  487.668386] mce: [Hardware Error]: Machine check events logged by CPU 219
[  487.676195] mce: [Hardware Error]: Machine check events logged by CPU 219
[  487.684874] mce: !mce_gen_pool_empty() - CPU 82 halves timer interval 75000ms
[  563.411184] mce: [Hardware Error]: Machine check events logged by CPU 88
[  563.427845] mce: [Hardware Error]: Machine check events logged by CPU 88
[  563.435553] mce: [Hardware Error]: Machine check events logged by CPU 88
[  563.444290] mce: !mce_gen_pool_empty() - CPU 82 halves timer interval 37500ms
[  602.322784] mce: [Hardware Error]: Machine check events logged by CPU 241
[  602.331355] mce: [Hardware Error]: Machine check events logged by CPU 241
[  602.339264] mce: !mce_gen_pool_empty() - CPU 82 halves timer interval 18748ms
[  622.802721] mce: [Hardware Error]: Machine check events logged by CPU 82
[  622.811199] mce: [Hardware Error]: Machine check events logged by CPU 82
[  622.818948] mce: !mce_gen_pool_empty() - CPU 82 halves timer interval 9372ms
[  632.018480] mce: [Hardware Error]: Machine check events logged by CPU 273
[  632.026526] mce: [Hardware Error]: Machine check events logged by CPU 185
[  632.275383] mce:  mce_gen_pool_empty() - CPU 82 doubles timer interval 18744ms
[  647.122282] mce: [Hardware Error]: Machine check events logged by CPU 273
[  651.475854] mce:  mce_gen_pool_empty() - CPU 82 doubles timer interval 37488ms
[  661.970112] mce: [Hardware Error]: Machine check events logged by CPU 273
[  677.073945] mce: [Hardware Error]: Machine check events logged by CPU 273
[  682.193878] mce: [Hardware Error]: Machine check events logged by CPU 273
[  690.386214] mce:  mce_gen_pool_empty() - CPU 82 doubles timer interval 74976ms
[  692.433727] mce: [Hardware Error]: Machine check events logged by CPU 225
[  712.913487] mce: [Hardware Error]: Machine check events logged by CPU 113
[  717.009440] mce: [Hardware Error]: Machine check events logged by CPU 232
[  743.632392] mce: [Hardware Error]: Machine check events logged by CPU 113
[  743.640869] mce: [Hardware Error]: Machine check events logged by CPU 113
[  757.967947] mce: [Hardware Error]: Machine check events logged by CPU 273
[  766.160445] mce:  mce_gen_pool_empty() - CPU 82 doubles timer interval 149952ms
[  768.207807] mce: [Hardware Error]: Machine check events logged by CPU 253
[  792.783453] mce: [Hardware Error]: Machine check events logged by CPU 234
[  807.119257] mce: [Hardware Error]: Machine check events logged by CPU 253
[  817.359155] mce: [Hardware Error]: Machine check events logged by CPU 273
[  831.695030] mce: [Hardware Error]: Machine check events logged by CPU 234
[  852.174749] mce: [Hardware Error]: Machine check events logged by CPU 232
[  861.646550] mce: [Hardware Error]: Machine check events logged by CPU 232
[  866.510500] mce: [Hardware Error]: Machine check events logged by CPU 234
[  884.430286] mce: [Hardware Error]: Machine check events logged by CPU 234
[  899.534081] mce: [Hardware Error]: Machine check events logged by CPU 234
[  904.654067] mce: [Hardware Error]: Machine check events logged by CPU 234
[  922.573822] mce: [Hardware Error]: Machine check events logged by CPU 234
[  929.998246] mce: [Hardware Error]: Machine check events logged by CPU 261
[  930.000003] mce:  mce_gen_pool_empty() - CPU 82 doubles timer interval 299904ms
[  944.333567] mce: [Hardware Error]: Machine check events logged by CPU 232
[  952.525529] mce: [Hardware Error]: Machine check events logged by CPU 273
[  964.813362] mce: [Hardware Error]: Machine check events logged by CPU 232
[  979.149459] mce: [Hardware Error]: Machine check events logged by CPU 225
[  991.436910] mce: [Hardware Error]: Machine check events logged by CPU 273
[ 1005.772632] mce: [Hardware Error]: Machine check events logged by CPU 261
[ 1028.300439] mce: [Hardware Error]: Machine check events logged by CPU 233
[ 1028.300522] mce: [Hardware Error]: Machine check events logged by CPU 233
[ 1044.683195] mce: [Hardware Error]: Machine check events logged by CPU 261
[ 1054.922669] mce: [Hardware Error]: Machine check events logged by CPU 225
[ 1065.162622] mce: [Hardware Error]: Machine check events logged by CPU 261

> Thx.
>
> ---
> diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
> index 8dd424ac5de8..f3a793e3a6c8 100644
> --- a/arch/x86/kernel/cpu/mce/core.c
> +++ b/arch/x86/kernel/cpu/mce/core.c
> @@ -90,7 +90,6 @@ struct mca_config mca_cfg __read_mostly = {
>  };
>
>  static DEFINE_PER_CPU(struct mce_hw_err, hw_errs_seen);
> -static unsigned long mce_need_notify;
>
>  /*
>   * MCA banks polled by the period polling timer for corrected events.
> @@ -152,8 +151,10 @@ EXPORT_PER_CPU_SYMBOL_GPL(injectm);
>
>  void mce_log(struct mce_hw_err *err)
>  {
> -	if (mce_gen_pool_add(err))
> +	if (mce_gen_pool_add(err)) {
> +		pr_info(HW_ERR "Machine check events logged\n");
>  		irq_work_queue(&mce_irq_work);
> +	}
>  }
>  EXPORT_SYMBOL_GPL(mce_log);
>
> @@ -585,28 +586,6 @@ bool mce_is_correctable(struct mce *m)
>  }
>  EXPORT_SYMBOL_GPL(mce_is_correctable);
>
> -/*
> - * Notify the user(s) about new machine check events.
> - * Can be called from interrupt context, but not from machine check/NMI
> - * context.
> - */
> -static bool mce_notify_irq(void)
> -{
> -	/* Not more than two messages every minute */
> -	static DEFINE_RATELIMIT_STATE(ratelimit, 60*HZ, 2);
> -
> -	if (test_and_clear_bit(0, &mce_need_notify)) {
> -		mce_work_trigger();
> -
> -		if (__ratelimit(&ratelimit))
> -			pr_info(HW_ERR "Machine check events logged\n");
> -
> -		return true;
> -	}
> -
> -	return false;
> -}
> -
>  static int mce_early_notifier(struct notifier_block *nb, unsigned long val,
>  			      void *data)
>  {
> @@ -618,9 +597,7 @@ static int mce_early_notifier(struct notifier_block *nb, unsigned long val,
>  	/* Emit the trace record: */
>  	trace_mce_record(err);
>
> -	set_bit(0, &mce_need_notify);
> -
> -	mce_notify_irq();
> +	mce_work_trigger();
>
>  	return NOTIFY_DONE;
>  }
> @@ -1804,7 +1781,7 @@ static void mce_timer_fn(struct timer_list *t)
>  	 * Alert userspace if needed. If we logged an MCE, reduce the polling
>  	 * interval, otherwise increase the polling interval.
>  	 */
> -	if (mce_notify_irq())
> +	if (!mce_gen_pool_empty())

    mce_timer_fn()
      machine_check_poll()
        mce_log()
          irq_work_queue(&mce_irq_work)
            ...
            mce_irq_work_cb()
              mce_schedule_work()
                schedule_work(&mce_work)
                  ...
                  mce_gen_pool_process()  // [3] worker thread concurrently running
                                          //     on any CPU handles MCE logs.
      mce_gen_pool_empty()                // [4]

It seems there is a race between [3] and [4]: if the worker thread in [3]
drains the pool before mce_timer_fn() reaches the check in [4], the pool
would look empty and the interval would be doubled even though an error was
just logged.

My testing did not observe this race. That may simply be because
mce_timer_fn() (in softirq) completes fast enough that it always finishes
before [3] (in the worker thread) is scheduled to run.

>  		iv = max(iv / 2, (unsigned long) HZ/100);
>  	else
>  		iv = min(iv * 2, round_jiffies_relative(check_interval * HZ));
>

[...]

Thanks!
- Qiuxu
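
P.S. The interval values logged for CPU 82 in [2] can be reproduced with a
small userspace sketch of the halving/doubling arithmetic. It is only a model
of the "iv" update in mce_timer_fn(), not kernel code. The assumptions are
mine: HZ=250 (guessed from the logged intervals all being multiples of 4 ms),
a 300-second check_interval, a starting interval of check_interval * HZ, and
round_jiffies_relative() left out. The helper names are made up for this
illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed values, not taken from the test system's config. */
#define HZ             250UL
#define CHECK_INTERVAL 300UL	/* seconds */

static unsigned long max_ul(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/*
 * Model of the "iv" update in mce_timer_fn(): halve the interval (in jiffies)
 * when the pool is not empty, double it otherwise. The cap is simplified to
 * CHECK_INTERVAL * HZ, without round_jiffies_relative().
 */
static unsigned long next_interval(unsigned long iv, int pool_empty)
{
	assert(iv > 0);

	if (!pool_empty)
		return max_ul(iv / 2, HZ / 100);

	return min_ul(iv * 2, CHECK_INTERVAL * HZ);
}

/* Exact only when HZ divides 1000, which holds for the assumed HZ=250. */
static unsigned long jiffies_to_ms(unsigned long j)
{
	return j * (1000UL / HZ);
}

/* Five polls that see a non-empty pool, then five polls that see it empty. */
void print_trace(void)
{
	unsigned long iv = CHECK_INTERVAL * HZ;
	int i;

	for (i = 0; i < 10; i++) {
		int pool_empty = (i >= 5);

		iv = next_interval(iv, pool_empty);
		printf("%s timer interval %lums\n",
		       pool_empty ? "doubles" : "halves", jiffies_to_ms(iv));
	}
}
```

Calling print_trace() from a main() gives 150000, 75000, 37500, 18748 and
9372 ms for the halving steps, then 18744, 37488, 74976, 149952 and 299904 ms
for the doubling steps, the same sequence as the CPU 82 lines in [2]. Under
these assumptions, the odd-looking values (18748 rather than 18750) come from
integer division on jiffies.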