From nobody Tue Apr 7 18:03:28 2026 Received: from mx0b-0064b401.pphosted.com (mx0b-0064b401.pphosted.com [205.220.178.238]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E9C0B28D8DA; Thu, 12 Mar 2026 08:16:49 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=205.220.178.238 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773303415; cv=fail; b=txb/FT92uMft3DJWWNN/XNRJtcGkz1tM0Ms720hgiFh4c+Ze5fi7Phx49uGKi1R10JBwElWH0cRCIhnZJg0jDKUb4PBX1KCXHFmfSuFM4n2fvPYsyQv33m3G1H3VbXEjGr/5i0duzxRJRrjghfVO3rbWE2NqHc7hGaKSGZqUEtE= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773303415; c=relaxed/simple; bh=XkXNfXuGmMA99Wwbc/D8Jb+OOilyET0IIy8toHBxi0g=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: Content-Type:MIME-Version; b=u10cbfqSK2qeDo/l5obAu/RJ+zdn9AH4bBVY3WQUUpUhZULCsN1vNG1ThCNHoQ+4DTULa8LDE3McRqS5+BO9Hc0gb9Iwm9OucU/B1QX+bhGW1k8AqnT64OKCxDtD27LILBJXHuHm5kKcyvIphluZiZBZnndvg4mBN27foUrpgX0= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=windriver.com; spf=fail smtp.mailfrom=windriver.com; dkim=pass (2048-bit key) header.d=windriver.com header.i=@windriver.com header.b=RvEBJw+K; arc=fail smtp.client-ip=205.220.178.238 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=windriver.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=windriver.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=windriver.com header.i=@windriver.com header.b="RvEBJw+K" Received: from pps.filterd (m0250812.ppops.net [127.0.0.1]) by mx0a-0064b401.pphosted.com (8.18.1.11/8.18.1.11) with ESMTP id 62C5U2Hb3044191; Thu, 12 Mar 2026 08:16:41 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=windriver.com; 
h=cc:content-transfer-encoding:content-type:date:from :in-reply-to:message-id:mime-version:references:subject:to; s= PPS06212021; bh=gDIXXldpa7CZA/XfUCr79rBAbFjiAFc2QyJksVT6aaY=; b= RvEBJw+K39qoo2p8JxoWCyKsxCSJhHJglg0UFkvA1oxHShiTfTJ07i4y+uG7rF/R 5j8seVnx/AQ1ERnmDqwWXiM+bHKnk5eWsRpjT0/gHZNsnyjFgSLrAdsBm5ccq1/E Z7Z4sRh9iEEF6E+TNecIxcqtuuNC3/otaslxSO8SjxDWBMjVqzSA/48N5guFC5Oq oe2p7vQFA/iFLtZvoWTX6I24d4s7SSaM+u9EzelfnqObgL/YQAQ+DoCQTOHIbvCL DIaOSFSop1IQgMPQTgOtdPeFaXnFy+74r9fnvDbhllqIhNw/xMYf5SIaxPtplNH1 TH3ZtQX/pf8To24LF+0oMw== Received: from byapr05cu005.outbound.protection.outlook.com (mail-westusazon11010033.outbound.protection.outlook.com [52.101.85.33]) by mx0a-0064b401.pphosted.com (PPS) with ESMTPS id 4cuh78gdvp-1 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT); Thu, 12 Mar 2026 08:16:41 +0000 (GMT) ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=BEcb+Rj0Vf0/PhOHCOmSb7eATalWkFA+3BdzJ8dd3iimHVfSLaeX/Fj0JaVUYIqYx7v5XPOLvn1bXCOLGjUF9aGIBi/NTeY/PRx+Od8LduETFND+WWy4Jys+MSkJ88VMe9uXWyHLS0dFdsA4ooEHcVc1TJjI59y4R8llbPxCkx0zXur8ELbcvE/a019zwJij2JbmIp00Tt2M4sNc3kbEV8fQEJ9HkfwPzrJzMyggTQgiogaa+EugP0uOAstugws021yqxV6lkWZ0B5bUSBrsbp5cLoW5KlBQZv+bZt6+Vwv1tJ1t29bR52Duu4sh97xaQbn/VKg/mzw530vDbklIAw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=gDIXXldpa7CZA/XfUCr79rBAbFjiAFc2QyJksVT6aaY=; b=T6h4BqVwkBpMEvigHBBlr9fOE9DgNmeAmBNpheLuVQ9cDBtawHG21ULP5hfE2sQMjVHpI6EGVBCiVFoyoN/5cj3II3LLNsxZ6RzDAk0dvq7SyKq8rj21s6D1RvY8Kk1Imbssdr6B34N+eRpINeQqAVRXBb1+2ioEgwq637Zu+vx7kYJlu0mbpW39M/kpR12bmJ9dXqZCrNugHniyVNOAW9zWTtbPryqDFaEe2nl6CNt2GW/14twB4EweEwvUsJVMCMNBf7zFDeG1sb8/Y5jVUj/6lOJ9omorSxgf8DOT2ivHZ3bAf4Xwapf3wXHWT1WdB4JKKFqC5bAZyj0qD5gNUQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass 
smtp.mailfrom=windriver.com; dmarc=pass action=none header.from=windriver.com; dkim=pass header.d=windriver.com; arc=none Received: from SJ2PR11MB7546.namprd11.prod.outlook.com (2603:10b6:a03:4cc::8) by SA0PR11MB4720.namprd11.prod.outlook.com (2603:10b6:806:72::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.9723.4; Thu, 12 Mar 2026 08:16:39 +0000 Received: from SJ2PR11MB7546.namprd11.prod.outlook.com ([fe80::ca9b:dcf:8881:bced]) by SJ2PR11MB7546.namprd11.prod.outlook.com ([fe80::ca9b:dcf:8881:bced%5]) with mapi id 15.20.9700.010; Thu, 12 Mar 2026 08:16:39 +0000 From: "Ionut Nechita (Wind River)" To: ceph-devel@vger.kernel.org Cc: idryomov@gmail.com, xiubli@redhat.com, linux-kernel@vger.kernel.org, ionut_n2001@yahoo.com, Ionut Nechita Subject: [PATCH v1 01/13] libceph: handle EADDRNOTAVAIL more gracefully Date: Thu, 12 Mar 2026 10:16:07 +0200 Message-ID: <20260312081619.40854-2-ionut.nechita@windriver.com> X-Mailer: git-send-email 2.53.0 In-Reply-To: <20260312081619.40854-1-ionut.nechita@windriver.com> References: <20260312081619.40854-1-ionut.nechita@windriver.com> Content-Transfer-Encoding: quoted-printable X-ClientProxiedBy: FR2P281CA0085.DEUP281.PROD.OUTLOOK.COM (2603:10a6:d10:9b::12) To SJ2PR11MB7546.namprd11.prod.outlook.com (2603:10b6:a03:4cc::8) Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: SJ2PR11MB7546:EE_|SA0PR11MB4720:EE_ X-MS-Office365-Filtering-Correlation-Id: e013758a-6c43-4b3f-c57e-08de800fafa6 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|1800799024|52116014|376014|10070799003|366016|18002099003|56012099003|22082099003; X-Microsoft-Antispam-Message-Info: 
OhkvKu9yJaRq5YBfpOZFplWfuBqc9OaIuAKFKUced4M8RGa5rj5vFPqRL2PJ13fbsIgAhl64XU8uMlGRXehTFf5UvTqNQEtjF1b/5bDnhHNWwpnZ3KbVJslyIcDKRpQCgUtmt3jpxJBtMq3DN5wOfaShLRCLMPiEHjTJnsiJQkNvj5v3ZhQ6LyI4hEsX4dATGamlhH2oxYQnwykDlrOsJ1wDd+r2zWr2PxT6/Ox7x57y4vHweHbxfdNTcDTlKiJW6rERoQ4Y08ClDx/maAzFs+3FUXUJpWn1yMvTyvjXZLJamKPBWvzt+m1qVtZ38CzsPCGUey3W8MhM+iP7w7WxH8w0MacU88TOsS5rYXro4BJtc605MjgfSwat2L4cEChBnMA6/EDnJST79Ey7ecjHuvxsK80/hNA8lEbDPpoH53jFpgNVF/PNoWuJtrEquoKeAsPB8SsNqEcve+wWgE8qRyRe8Byr7qCJfS7Ih5ONvT54ZUcDm38fwqzKPhqw4alhYlp9bs4r9wt3eczrp/+LHoMvPKj+JBKpJ9r1Jm2pbiO9bJS7Mo0XxHd5qGKaqo6T+Zpcfjlcc57HupbT0xja2lidrz4tTkyTMrYuxJOvcc5Iw79QycWzSuEsDCMFqPeVASl9bc54EWVVHl1o9xC2By2YQKcnIqYWiEImMsbz3pL3LhCOLeBMBDo1aBGr5xuCNbV3nucLrDmeFpk5bQRMsg52Z1TIpvD7piKse6mZVsQ= X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:SJ2PR11MB7546.namprd11.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230040)(1800799024)(52116014)(376014)(10070799003)(366016)(18002099003)(56012099003)(22082099003);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 2 X-MS-Exchange-AntiSpam-MessageData-0: =?us-ascii?Q?342UXeiucB6UHFx+7exqthxzcY8emBJglb1UBndlOe5UDtJgnt3VOoB2W2FI?= =?us-ascii?Q?wKoAB0XTg6IZF3meoPCyNXoWawKY5wFnWu51BaxtheG1RVWjM8aR4+z3Lnoo?= =?us-ascii?Q?qO6EnEmrf+1apwZJpolLVmGAhv9Vn3R7z0kX6b5nQJ5pgxXcaT1X6aIxDq8j?= =?us-ascii?Q?en9T3a9PT8CTHeX1KSWXdXT1q7IvhBKb6HOJWE5gsMJMxKp3B4i9ufSs2rL7?= =?us-ascii?Q?LbRfmnWRz2ywOqwI6wg0K8BQh5hYAKLZCqYG//hFuPR92s5pf5FraQRwW6BA?= =?us-ascii?Q?aR30CpIUpeJ219bdtxlS+QIAHXJ9aDl2xJOfKdCTulD8lvtLStlNOZqMsjQW?= =?us-ascii?Q?645nhHB2XqbeOi9HY70rKUGcJqpG95aABLM8x/Hl3/R54nIz5IKpmXg81F3i?= =?us-ascii?Q?DuiHPtu5k860pWrKa/9l6AaRopBNws0Pydos2/yicB9KGIE6wdmsMrilg/v2?= =?us-ascii?Q?PgQdJER2k5K6qyKZd1UU0Go5YJ5jE8OIqvi8EWyEIfUku+IXmkHV7Pt1mMhR?= =?us-ascii?Q?yQxW67qKgOsyp1JPjE8/KXYYK4sW3+5STXGYxD+RfsEJ7+fvC4Vzhxty4Yqk?= =?us-ascii?Q?TmFI2nqek9YxzkI0LVkt3Rqv1H0lKO3z5vvyzdxYyl97t2Yq2rarII3JgjQY?= 
=?us-ascii?Q?0G9r/5iBVjc+cQtUqpoWW/jXUY4Srkx1xuK54pnaYVit2vlCK8+jSIatTaTG?= =?us-ascii?Q?Uqg17aaalxBDWywl84Z7hMZuoMrHmZzzxxMnPn5JDpOJM7y9DtltTttlmnCf?= =?us-ascii?Q?o98WNqmKrQX94dgtoA7Dpygx4A00/fec14BBE8kBgcCW1pOQP7QhV2vrayfi?= =?us-ascii?Q?mkzfI6Ncyoa/mHKXimkv6npJLq75uzeKS95YaO5Ipxo9aY84eYe7tvW0tvjW?= =?us-ascii?Q?yf12w0COEz5pYQt89QI2EJ8Pupf+jqoTBW0rgZEdqYP6U5X3SBQtSRFH3rBO?= =?us-ascii?Q?WTB4T0VsmpR09xdFG/f/T9760Ff1qkBRMu5F7dfl8bK0FtbLqIxsfcTR6zXf?= =?us-ascii?Q?tn7YwJMtuMqYq+mPHVqcGxE184YEdwWpwCBL5xHerslxyafXNmF6UmAfiGPn?= =?us-ascii?Q?1vx2OJU6hzlRAxGiCusotiApN3RjhTBl3empgO3YWc33wFZqRBttuuZ0JIb4?= =?us-ascii?Q?rGZPAfoBXOdj3xao33XNcpF9bZu1RVjh+1YjCcqmkCn6HAbCwx1qiDGrU1ON?= =?us-ascii?Q?67mHJu/RUBe7pvESVz1DdhWvr3nzbkfihnO72Q10SFLXjVROYKWNBU33wuIH?= =?us-ascii?Q?ImmtqcjYP3bUaFIGXoqxb8oIm9RzPn4GK7WnjLZrCEfqeTFJe9yPbcS582SS?= =?us-ascii?Q?wY9SbZ6jRHdsQQyQkrnUgjlpOiJ2fMBzFY7jmCZCIonfif94bo3cfQ1N1CV2?= =?us-ascii?Q?4i+69gE23kZpEPk5MgAzGXuXfjKjajYk+v8Q/jFQv0PQwWyZIJNPifgvMDmf?= =?us-ascii?Q?Ia2lc3k6oTSTq4hVmDYvWXWZbhKGYwO+0aAPOqq2m8MI3hVDmrHuCiwNe9DT?= =?us-ascii?Q?Yx01TxgxkwyeAzBWhL9qPRk+HTJj6BFWyTJiGMJe2VxO/eMxBubleGKeIsx7?= =?us-ascii?Q?nirZPELO2lFkrpLYbLsmdAFXencJz2InaciV64nXM3yJEP6nxoBgXr24KwcX?= =?us-ascii?Q?GVq6zEhIenIIjC6iEyEorRx0M2+dsgtWxkdPFiTR4XR7DSDXb6augCgaPpRx?= =?us-ascii?Q?POlVuISX/q2AheUCxGUXFZRzxj/89myzy3mekeMVspG+nNXVudeD2/I9OCv7?= =?us-ascii?Q?IcrIb7+wAv9b1LagMpgB8XtPlvti+G1qAj6DVM9iDqD2IB5a+v7RLdVFaaVq?= X-MS-Exchange-AntiSpam-MessageData-1: vZ1mHHV0M+W2WkW43+MZf+bIsJgktR7BhOc= X-Exchange-RoutingPolicyChecked: Tasmnnrx+FyBSDDQJ6dyQhJyCDt4T+PqIgXnEC+swTKMVp1YwWJ1g1T3yd4eJ3HFtzMjA+3dqyECSCRptRLTGPHZQLP0/Cb+TRJsllpiSINd79tXfD8Z38p3wW0lTlJDZf8F9mJE5WICyVmffU6WX4OKM3Vgv39c25PzrxWeIjLkKKj8flRKnZLRiTYA49UNLi15uMeyasOkD9nRji0d4Ej3ZXt5KySbMzG3lmdAG+BR9d/fH5lvtB3FyOeIvreiNlsfZNF1x4us8CppmcDlWvWS5OZpFz/eanu4RwnFNO8aqB/m4Xj7lVnNFOwjxH+qlg85YeLPtcH8iwuaOzXQwQ== X-OriginatorOrg: windriver.com X-MS-Exchange-CrossTenant-Network-Message-Id: 
e013758a-6c43-4b3f-c57e-08de800fafa6 X-MS-Exchange-CrossTenant-AuthSource: SJ2PR11MB7546.namprd11.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 12 Mar 2026 08:16:39.4196 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 8ddb2873-a1ad-4a18-ae4e-4644631433be X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: pjqUXMF96jw+aQkNBQv8BHOPzUnWl7+cPcPF4Y2fGz80cvKQrvEUznwiUZP5amD3TQWRGTv/kXBdfjlhyBEFpPYcRfV8bgjlRiGkBd+TFSg= X-MS-Exchange-Transport-CrossTenantHeadersStamped: SA0PR11MB4720 X-Authority-Analysis: v=2.4 cv=ALvEU0hV c=1 sm=1 tr=0 ts=69b27669 cx=c_pps a=+v7uXpzEFv5PX1YPhMLqHw==:117 a=6eWqkTHjU83fiwn7nKZWdM+Sl24=:19 a=z/mQ4Ysz8XfWz/Q5cLBRGdckG28=:19 a=lCpzRmAYbLLaTzLvsPZ7Mbvzbb8=:19 a=xqWC_Br6kY4A:10 a=Yq5XynenixoA:10 a=VkNPw1HP01LnGYTKEx00:22 a=bi6dqmuHe4P4UrxVR6um:22 a=fTW__CHxibyLmBMfj2wP:22 a=t7CeM3EgAAAA:8 a=1uu5jfe8Kc4fJiQIugQA:9 a=FdTzh2GWekK77mhwV6Dw:22 X-Proofpoint-Spam-Details-Enc: AW1haW4tMjYwMzEyMDA2NSBTYWx0ZWRfXxCbkOdm7929W setgyheODKnXIGvwsMf9LGbl2Vmfef4L3/8lWTD1M7HggerBulnSpOa/auq2t/hqMpd4v01C3DB lizxjuP5kinVgD+lR7JjJcEH3sTHtPC3ZE93vCu2fiDQLUsf9BghqJHm+N6AL5bUSoVlDY5iver P+VFDGjROmqIPKe1PyUf5WcNkppnAI618DDT/VYSOjB2eim1MKxv1VqQBVQ6sHuNM5M3hJwutmo 9cOnp2/MFiStTXuAnN4fNbbfvsRz/NpuBr1Pa+6OzH+XA7RDPoVpRF4BpjwzX/Yad8Yho7Y4NTZ 5+l0fWLuactA/MKR+5d8tIuBV16eUNHr9nbINn1XkqhjT8Knx57NyngIRXDjj4JgXIGeZxXXRN9 jcEVIGldpSrHIuHUBi4QNk0y6EIaRh/0zCelUVDn3amf6Td/aBg6SBI/KEklOXHPSEyOY83rSYh hmddVm6AFYYbwcnkDmg== X-Proofpoint-ORIG-GUID: 9V3lhcuEWMP4fO-2QzddipKnFQYWGDqD X-Proofpoint-GUID: 9V3lhcuEWMP4fO-2QzddipKnFQYWGDqD X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1143,Hydra:6.1.51,FMLib:17.12.100.49 definitions=2026-03-11_02,2026-03-09_02,2025-10-01_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 bulkscore=0 phishscore=0 clxscore=1015 adultscore=0 suspectscore=0 
priorityscore=1501 malwarescore=0 impostorscore=0 lowpriorityscore=0 spamscore=0 classifier=typeunknown authscore=0 authtc= authcc= route=outbound adjust=0 reason=mlx scancount=1 engine=8.22.0-2603050001 definitions=main-2603120065
Content-Type: text/plain; charset="utf-8"

From: Ionut Nechita

When connecting to Ceph monitors/OSDs, kernel_connect() may return
-EADDRNOTAVAIL if the source address is unavailable. This occurs during:

- IPv6 Duplicate Address Detection (DAD)
- IPv4/IPv6 interface state changes (link up/down events)
- Address removal or reconfiguration on the interface
- Network namespace transitions in containerized environments
- CNI reconfigurations during containerized rolling upgrades

Currently, libceph treats EADDRNOTAVAIL like any other connection error
and enters exponential backoff (BASE_DELAY_INTERVAL 250ms doubling up to
MAX_DELAY_INTERVAL 15s). Additionally, the monitor client has its own
hunt-level backoff (CEPH_MONC_HUNT_INTERVAL 3s * hunt_mult, where
hunt_mult doubles up to 10x = 30s max).

These two backoff mechanisms compound: at steady state each monitor gets
~30 seconds of attempts with connection-level delays up to 15s, and the
round-trip through all monitors takes ~60 seconds.

In production testing (6.12.0-1-rt-amd64, Dell PowerEdge R720, IPv6-only
Ceph cluster with 2 monitors), the EADDRNOTAVAIL condition persisted for
~36 minutes during a rolling upgrade:

  13:20:52 - mon0 session lost, hunting begins, first error -99
  13:57:03 - mon0 session finally re-established
  ~470 failed connect attempts across both monitors
  sync task blocked for 983+ seconds, triggering hung task warnings:
    "INFO: task sync:514917 blocked for more than 122 seconds"
    ...repeated at 245s, 368s, 491s, 614s, 737s, 860s, 983s

The duration of EADDRNOTAVAIL varies by environment: it can be brief
(simple DAD, 1-2s) or prolonged (complex network reconfiguration during
rolling upgrades, minutes).
In both cases, the key issue is that exponential backoff up to 15s wastes
time once the address becomes available -- the client may sit idle for up
to 15 seconds before attempting to reconnect.

This patch bypasses the exponential backoff for EADDRNOTAVAIL by using a
fixed short retry interval (ADDRNOTAVAIL_DELAY, HZ/10 = 100ms). This
ensures reconnection happens within 100ms of the address becoming
available, rather than waiting up to 15 seconds.

Implementation:
- Detect EADDRNOTAVAIL in ceph_tcp_connect() for both IPv4 and IPv6
- Signal the condition to con_fault() via an addr_notavail flag
  (per-protocol: v1 and v2)
- In con_fault(), use ADDRNOTAVAIL_DELAY instead of exponential backoff
  when the flag is set
- Clear the flag on successful connection and when reopening
- Use pr_warn_ratelimited() instead of pr_err() for this case

The fast retry is appropriate because each attempt is inexpensive
(kernel_connect() fails immediately when the address is unavailable) and
quick recovery is critical for storage availability.

Fixes: 60bf8bf8815e ("libceph: fix msgr backoff")
Signed-off-by: Ionut Nechita
---
 include/linux/ceph/messenger.h | 11 +++++++
 net/ceph/messenger.c           | 55 ++++++++++++++++++++++++++++++++--
 2 files changed, 63 insertions(+), 3 deletions(-)

diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h
index 1717cc57cdacd..730a754353aed 100644
--- a/include/linux/ceph/messenger.h
+++ b/include/linux/ceph/messenger.h
@@ -320,6 +320,13 @@ struct ceph_msg {
 /* ceph connection fault delay defaults, for exponential backoff */
 #define BASE_DELAY_INTERVAL	(HZ / 4)
 #define MAX_DELAY_INTERVAL	(15 * HZ)
+/*
+ * Shorter retry delay for EADDRNOTAVAIL. This error typically indicates
+ * a transient condition (IPv6 DAD in progress, address reconfiguration,
+ * temporary route issue) that resolves in 1-2 seconds. Fast retries
+ * allow quick recovery without exponential backoff delays.
+ */
+#define ADDRNOTAVAIL_DELAY	(HZ / 10)
 
 struct ceph_connection_v1_info {
 	struct kvec out_kvec[8],         /* sending header/footer data */
@@ -360,6 +367,8 @@ struct ceph_connection_v1_info {
 	u32 connect_seq;      /* identify the most recent connection
 				 attempt for this session */
 	u32 peer_global_seq;  /* peer's global seq for this connection */
+
+	bool addr_notavail;   /* address not available (transient) */
 };
 
 #define CEPH_CRC_LEN 4
@@ -430,6 +439,8 @@ struct ceph_connection_v2_info {
 
 	int con_mode;  /* CEPH_CON_MODE_* */
 
+	bool addr_notavail;  /* address not available (transient) */
+
 	void *conn_bufs[16];
 	int conn_buf_cnt;
 	int data_len_remain;
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 9f6d860411cbd..c40c7c332e7f4 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -466,8 +466,22 @@ int ceph_tcp_connect(struct ceph_connection *con)
 		     ceph_pr_addr(&con->peer_addr),
 		     sock->sk->sk_state);
 	} else if (ret < 0) {
-		pr_err("connect %s error %d\n",
-		       ceph_pr_addr(&con->peer_addr), ret);
+		if (ret == -EADDRNOTAVAIL) {
+			/*
+			 * Address not yet available - could be IPv6 DAD in
+			 * progress, address reconfiguration, or temporary
+			 * route issue. Use shorter delay.
+			 */
+			pr_warn_ratelimited("connect %s: address not available (DAD/route issue?), will retry\n",
+					    ceph_pr_addr(&con->peer_addr));
+			if (ceph_msgr2(from_msgr(con->msgr)))
+				con->v2.addr_notavail = true;
+			else
+				con->v1.addr_notavail = true;
+		} else {
+			pr_err("connect %s error %d\n",
+			       ceph_pr_addr(&con->peer_addr), ret);
+		}
 		sock_release(sock);
 		return ret;
 	}
@@ -476,6 +490,13 @@ int ceph_tcp_connect(struct ceph_connection *con)
 		tcp_sock_set_nodelay(sock->sk);
 
 	con->sock = sock;
+
+	/* Clear addr_notavail flag on successful connection */
+	if (ceph_msgr2(from_msgr(con->msgr)))
+		con->v2.addr_notavail = false;
+	else
+		con->v1.addr_notavail = false;
+
 	return 0;
 }
 
@@ -609,6 +630,13 @@ void ceph_con_open(struct ceph_connection *con,
 
 	memcpy(&con->peer_addr, addr, sizeof(*addr));
 	con->delay = 0;      /* reset backoff memory */
+
+	/* Clear addr_notavail flag when opening/reopening connection */
+	if (ceph_msgr2(from_msgr(con->msgr)))
+		con->v2.addr_notavail = false;
+	else
+		con->v1.addr_notavail = false;
+
 	mutex_unlock(&con->mutex);
 	queue_con(con);
 }
@@ -1613,6 +1641,8 @@ static void ceph_con_workfn(struct work_struct *work)
  */
 static void con_fault(struct ceph_connection *con)
 {
+	bool addr_issue = false;
+
 	dout("fault %p state %d to peer %s\n",
 	     con, con->state, ceph_pr_addr(&con->peer_addr));
 
@@ -1620,6 +1650,19 @@ static void con_fault(struct ceph_connection *con)
 		ceph_pr_addr(&con->peer_addr), con->error_msg);
 	con->error_msg = NULL;
 
+	/* Check and reset addr_notavail flag if set */
+	if (ceph_msgr2(from_msgr(con->msgr))) {
+		if (con->v2.addr_notavail) {
+			addr_issue = true;
+			con->v2.addr_notavail = false;
+		}
+	} else {
+		if (con->v1.addr_notavail) {
+			addr_issue = true;
+			con->v1.addr_notavail = false;
+		}
+	}
+
 	WARN_ON(con->state == CEPH_CON_S_STANDBY ||
 		con->state == CEPH_CON_S_CLOSED);
 
@@ -1644,7 +1687,13 @@ static void con_fault(struct ceph_connection *con)
 	} else {
 		/* retry after a delay. */
 		con->state = CEPH_CON_S_PREOPEN;
-		if (!con->delay) {
+		if (addr_issue) {
+			/*
+			 * Address not available - use shorter delay as this
+			 * is often a transient condition.
+			 */
+			con->delay = ADDRNOTAVAIL_DELAY;
+		} else if (!con->delay) {
 			con->delay = BASE_DELAY_INTERVAL;
 		} else if (con->delay < MAX_DELAY_INTERVAL) {
 			con->delay *= 2;
-- 
2.53.0

From nobody Tue Apr 7 18:03:28 2026
Received: from mx0b-0064b401.pphosted.com (mx0b-0064b401.pphosted.com [205.220.178.238]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D7F51283FC8; Thu, 12 Mar 2026 08:16:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=205.220.178.238 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773303414; cv=fail; b=bL+f+N1zKmllw2OOeXCcN0B0pSsNKiuC1mSaK8lHv6cCyc6b/sxYE7c43P4sAisapQwftp1sFuerPk1wismKw+ILFmJ4GgIjpVWVL9dXi11hGS4Gqz+/6Y+/FUPFtlDRFopLIZyL9WE7v0cXPuYE2Wx5qi9mvyJXy2J49mw4yyE= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773303414; c=relaxed/simple; bh=IzKUfgvjYpai3ijpdj9FtWWzwMFeI6Wg8sduiIZYXHk=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: Content-Type:MIME-Version; b=dDM/uJhZyEAWRQzNiNnDVPhugHpbhqp0yQS3JqpFqotcemt6al1DYPDtm6OM0jfXLUWByXWs7OF/YRVp3pgtOiO5rAm6hm9f+UC58CYPvXOL6+nIgJiG74J3wcuztq6nSWMgqAiesPCwNMkbGkGY/z2MPIknHAStAuHM9eHKuak= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=windriver.com; spf=pass smtp.mailfrom=windriver.com; dkim=pass (2048-bit key) header.d=windriver.com header.i=@windriver.com header.b=tC20hKbI; arc=fail smtp.client-ip=205.220.178.238 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=windriver.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=windriver.com Authentication-Results:
smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=windriver.com header.i=@windriver.com header.b="tC20hKbI" Received: from pps.filterd (m0250812.ppops.net [127.0.0.1]) by mx0a-0064b401.pphosted.com (8.18.1.11/8.18.1.11) with ESMTP id 62C5sMis3084412; Thu, 12 Mar 2026 08:16:43 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=windriver.com; h=cc:content-transfer-encoding:content-type:date:from :in-reply-to:message-id:mime-version:references:subject:to; s= PPS06212021; bh=QZiCRB5VY08wdymg15E36OteejdJ2gAIoiEzqamLAJM=; b= tC20hKbInN4KXLqWa0L3U43U+RafBovWFSW3Sik9kKZZ6TmqTfuEPXWW8JmX3baW 4F3xKwu6HFp1mIYLcoA+qvypOD43WFxG+X0PWEpDTPNUu+w6KiyniMOKYMIKqRSy uHazhKCmrZ7/IeN6p4SMk11rHU8Lz2buUGkNs8wLEIM3fI0aPBOWE2/WyUJqQOoj zqOqHgU7E3kvuDauz0zPeymJdxR04OcC2Hd/MiBrZCpOE7MV+woNkTZ70jdYpnFM e/+GxBwYqGgMh/7eNDizDURn+s64jYHCzYoNHsd88HzfZ9RNjMTRgCaLX2pUgsSv 4dSlBrnWurtBbQNgwG/vTg== Received: from byapr05cu005.outbound.protection.outlook.com (mail-westusazon11010057.outbound.protection.outlook.com [52.101.85.57]) by mx0a-0064b401.pphosted.com (PPS) with ESMTPS id 4cuh78gdvq-1 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT); Thu, 12 Mar 2026 08:16:43 +0000 (GMT) ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=QyKbyETiL6QzsfpC+Mh7yMJJtpZmV+1ChQfSBRGlsNbmeCvd57jb0p1wHKRfwBac4c76Iue4lAV15QFXl/k33VQ/eZygPw7+SUtudn3VnW1+5qo+4djtCYzbKdN6V741J8nq4dGGZ5G/jpFGKK+MrycPGB1bLY6fAUUA/j6YU3HDjK+9cz33l/lYaVEroXNDJVQ3fJvoWL9/gJX+A1jHO8kW8acuffIwjufNCmHYCIqrxObEHc2xennOtmxlp4TLYHK6jmMEaiEDSmXA4bVgYUlf6qiJkRNtqDPmYljMTNIXeB/eMfLArvRtKs+8Gz6o2IeyIn3SIUUmEvny3uIOFw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=QZiCRB5VY08wdymg15E36OteejdJ2gAIoiEzqamLAJM=; 
b=Tb8oGdR2dSnyyS8gmx7Si2MmFrBx2taNdmdCeqXqGjYo2qKHgjLDL0bzn8dJXD8mEclOM+n0aCj2G5Q89xF8D+pV7qchFwGkgKtwuDRkeI7NGt9dzExe/KkUw1G4ZAqkmbEinMmGo43hkJ14R6nQ8ZXadDmD/GM0dTotfM/jHGBBJgG0oJILRiUFbdD5i6GzGOFu/GfIBNtn1+S35yOyqIEF7YK5b2XPNKwvLmzKEiZUBvL4ZNw9q//7MjpJzR97c2eeobNsXlMjRwNtHvA8Xm6RowJRQMznqKLrpY1pwJIHO2wOh8XuedL3JjW8bFDapgA8Xp62B8qCbYFlPjhVdw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=windriver.com; dmarc=pass action=none header.from=windriver.com; dkim=pass header.d=windriver.com; arc=none Received: from SJ2PR11MB7546.namprd11.prod.outlook.com (2603:10b6:a03:4cc::8) by SA0PR11MB4720.namprd11.prod.outlook.com (2603:10b6:806:72::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.9723.4; Thu, 12 Mar 2026 08:16:41 +0000 Received: from SJ2PR11MB7546.namprd11.prod.outlook.com ([fe80::ca9b:dcf:8881:bced]) by SJ2PR11MB7546.namprd11.prod.outlook.com ([fe80::ca9b:dcf:8881:bced%5]) with mapi id 15.20.9700.010; Thu, 12 Mar 2026 08:16:41 +0000 From: "Ionut Nechita (Wind River)" To: ceph-devel@vger.kernel.org Cc: idryomov@gmail.com, xiubli@redhat.com, linux-kernel@vger.kernel.org, ionut_n2001@yahoo.com, Ionut Nechita Subject: [PATCH v1 02/13] ceph: add timeout protection to ceph_mdsc_sync() path Date: Thu, 12 Mar 2026 10:16:08 +0200 Message-ID: <20260312081619.40854-3-ionut.nechita@windriver.com> X-Mailer: git-send-email 2.53.0 In-Reply-To: <20260312081619.40854-1-ionut.nechita@windriver.com> References: <20260312081619.40854-1-ionut.nechita@windriver.com> Content-Transfer-Encoding: quoted-printable X-ClientProxiedBy: FR2P281CA0085.DEUP281.PROD.OUTLOOK.COM (2603:10a6:d10:9b::12) To SJ2PR11MB7546.namprd11.prod.outlook.com (2603:10b6:a03:4cc::8) Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: SJ2PR11MB7546:EE_|SA0PR11MB4720:EE_ 
X-MS-Office365-Filtering-Correlation-Id: a5adb8af-664e-4fd4-3d29-08de800fb0dd X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|1800799024|52116014|376014|10070799003|366016|18002099003|56012099003|22082099003; X-Microsoft-Antispam-Message-Info: 9czNZPYvIjTs2UaETJ9/JTNGxD2w9vMkgvPqbzVWdvsJ8hedEkHZf2z1/30/Qn/F3lJh+uqzBGMnrYPRwAz+sLobvy7YCv26jWAbjEHr7Hl+h5yKSl3Exkcqf4inyn2ZpG9OzFtNYJ0/YBEh9WdDtxzz66DFlHC5oRvsZ+NFgIpS2YousGGId6JfAB5LZEfCUTmApgtCwFBJkFLRA2QvGadYpqVdAYH9TzqytIKbNOHfsuSp5lQXeHAGUlCSHuE5YxfOZNIllxP0a25E7bdlmHQeWUfgAgHvICUjlUZgy+OE1sUW/8WY18B0NyV0e23O69a9qTS+PXYUDYe6GJjBeYywHD0zM7An40mfX9qRXv1j/ZApFYeGqhLLVdCPxSFcxfz3M7wev9v1yxrY7kxx1AMLCe+Z5G45arXGZg84y6aiyUMFwSSohE8vxMLxIxQOLFsR9lkx49m+V/gKo4+HysDaNDNHNLEYrSUnQFGgl1whA+kOZGLfpTca4DQ+GH9joNvMfF/ujM2j1fA9KK9uW+Fvce6bAz5EDiDxrCnFameLrFLfXv8S10eX2NIqIhBhCF/XKWWU0CO1e93Zp7Nz+Z/FBjPjMUlFSdusnr01hKmDs+DEM11D34Wj7YDAx82ytQynuClrTSvc1cJPwvqQzkyvH8GJNikZmQdZ53zNcKZEwNEKdP3Ep085ZuG0yoVY1H3EJh/e490EKGyIMLfESNZWbkSrxKg34Keh0IoYtR4= X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:SJ2PR11MB7546.namprd11.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230040)(1800799024)(52116014)(376014)(10070799003)(366016)(18002099003)(56012099003)(22082099003);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 2 X-MS-Exchange-AntiSpam-MessageData-0: =?us-ascii?Q?F7FVmT0/5Cgx96ZHqbLhRfUr6YU5C1zAdcVRVb2TgOlWfSA+OhcI5WFDCEuE?= =?us-ascii?Q?+Inn6NlUhEJfbsljUTj0PesiDnjk/ZmjwX1e7T4osQBEksl1/LdzL+rShs/O?= =?us-ascii?Q?FIwzYlZNXAQJ3Fo/aINJRNoRpb212pH+Ao1eEfsbrOaupaJpGj+FLy9YBlbf?= =?us-ascii?Q?lbFfhM0tT2jBrRC2CMBTYOeTTvuwOo33dHPYpRC6TICUMecfsbJjrA6dthCk?= =?us-ascii?Q?wbORI4xIJiFO7OJvyYKPgXF50KQy64wT+oLreTWEHU+Mp5Yj75Nr2KqTDOoF?= =?us-ascii?Q?je3ecokadHldh6rSC6JD++Po8KboW36i9gTLzG2uQY034AZPnDnlcRe1UJGU?= =?us-ascii?Q?eCsmBQZxMtZf1qY74RX5z9EsBUYDghpzgt1nfvoNbj6kJBNzjmld+rpwaDoQ?= 
=?us-ascii?Q?5ODyHQRNjmR3NCBSBxZFuCiKk7v6RpGEz+YU0B1tHPeCzw5gnjK1/GvDCskV?= =?us-ascii?Q?bL/6kj13vWRN3CW/V0vcemAIS41P5vPyiL436tJdleirqzqHww6Z2tB3eGBL?= =?us-ascii?Q?DSlXVzBWkp6QK9aVPF9ewTBqYU7dxAheW672C6r1o/NRHG98iSdGprlvDlGw?= =?us-ascii?Q?gEhO1Se2gFBhIFZLW3+j2fYh0Y1PBcy9KZNf2EIrVJw3J9jXlXgYKYWcRzQU?= =?us-ascii?Q?+TpKqvtTafIVR4Qxqxb5/TBRrJyxk1DSoFlCHsSFNEiSD64epUyr9sILs33S?= =?us-ascii?Q?IYEn5Rc5BbossjygD3HTu0c3e/1Xho6NC1uhJZm5ekT7Kq/MgoF6C5Vk7hdU?= =?us-ascii?Q?UaNB5XwrO5+1ldxnO+ew7wGeZI8E5Bg2vMabn++oaMmgo10PNoAmmiD9O+cN?= =?us-ascii?Q?I4LD9wSdJgvHF0G7ZhFZn3CO9oOCI7mC5sMaHAPJzk8YRqaq/NSQwuAhJ/9T?= =?us-ascii?Q?vgIOY/ofTnwL3tihyedrCv42Re2LJqVo1PjGWMQV9uQs07Dxv3h/4nB3msIB?= =?us-ascii?Q?mfnI7+hSX5VXY4pWkM6cp3ZYBXqYT7MQ1u5Qtp1QQqQe43ZSIJAd4O8Ou2pz?= =?us-ascii?Q?dGNH0JBhjTCG63IZPd/kinZmiNQSYhLOUpQJNQgWHVOdDLmYb2vDd4ZQFLkk?= =?us-ascii?Q?xs1wX3GgyVmidJbyy+EGumGgP1vUKQOgI66zDDaL3gwVuoW0FO9A68/Mhdxy?= =?us-ascii?Q?Ehcp2sSEJ1Zt0Bf91CV+lDTdJX2o/AVV1X8I1RjS2p1qTvbDyNAsv0qwCWdL?= =?us-ascii?Q?MaCe9X25eTyuv/YeVBFVWSmnItMEigsFCuwLQ7moFt67BvR1wgMTLT+74kcR?= =?us-ascii?Q?mtWV/6DjOEkbSgkxbryBb3fJKiJTOE55s2E4gZ/v6VvV0n20kTTzAazBYgML?= =?us-ascii?Q?x19njZNWAr5ujIxAWzfVK9zf1mAaUUQ4uwOn5KmhROZWfY7swmwKZz3re9uZ?= =?us-ascii?Q?l+blCKf81u1jJaqy24BPXBSxV/mla2vI+R+TU+kbKEkbe4GyU1SEUx+8SCKM?= =?us-ascii?Q?JH9eAFQDQkta5Ifkl3wdsthVIDszpY1Jaqengh0aLDWEGFEjP7LMIc38In1h?= =?us-ascii?Q?ua7p2GmuuCCPEt1GHs9aQQfQtCgxgX2U5ZrrmNcTADSR/u+tFf7cKhBlmNdV?= =?us-ascii?Q?HE+QhXMLqey3I9XjpDweEOQu+wsZZdD0m3qMpIKBs5DJ/enoUPoiTijQ+Lt1?= =?us-ascii?Q?BQcIFKeuglgglIdbGWXO8tRccY4XKyfvwMv1Vwdu5c+14HOn9+L/coFgslzl?= =?us-ascii?Q?2QtiP/33gXvA8B19sFCAP+uCfg7EflNtueLKALSHODGGVS6jxYCR5lWUvVt6?= =?us-ascii?Q?S4e2WLiq3uT9cMwYLWHr3b9LeH3gEJUghbV80ph67n8yiOFGiRRG8PvxHjps?= X-MS-Exchange-AntiSpam-MessageData-1: zzmXMijIUZqQLklSkGFg/PPCCgRBCW8njzg= X-Exchange-RoutingPolicyChecked: 
P46FQ2wAtRtvrzrmObutXlew1AvCrRnJ2/5nACnZTL5SxmwVWHzfpxO6VmcUU83tS2cTELm8mL38g1FKSSC4XGZZc6THh1wPI+jmzTDc2UCu9qDK92ppXavjrA9axiwP6WniTXg3pf5o2UyMIrhzuEToe8IVSJQXJmzQeR7/FoH6/Sz30YJTv8kDMfonHb6nJtToHv453CKuCJDdGGrsiLvOq9sVugzlJXIWY1fCmvQ9axFv2uFwXu9le9diBs5Wtw2J+9xNd02ykdsr9Yt12YcUUyMuzxhNjx3PCIQaljSq/+qmItBJ0kxkUA5nLjLu3XASKautcZAOgOct+t63Zw== X-OriginatorOrg: windriver.com X-MS-Exchange-CrossTenant-Network-Message-Id: a5adb8af-664e-4fd4-3d29-08de800fb0dd X-MS-Exchange-CrossTenant-AuthSource: SJ2PR11MB7546.namprd11.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 12 Mar 2026 08:16:41.5617 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 8ddb2873-a1ad-4a18-ae4e-4644631433be X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: uikq3rNKnx8xpU5jNm0ToP9FJrf21rVcvYTmpmcp7fOSMiHOyuOCx5xO/S8BUAgGQ5kFqyTcHS4/Vbqgn1KJVjwoH60CXzNkn5Az3joG1h8= X-MS-Exchange-Transport-CrossTenantHeadersStamped: SA0PR11MB4720 X-Authority-Analysis: v=2.4 cv=ALvEU0hV c=1 sm=1 tr=0 ts=69b2766b cx=c_pps a=PUtblwwgLuFy1ufMWAoo+A==:117 a=6eWqkTHjU83fiwn7nKZWdM+Sl24=:19 a=z/mQ4Ysz8XfWz/Q5cLBRGdckG28=:19 a=lCpzRmAYbLLaTzLvsPZ7Mbvzbb8=:19 a=xqWC_Br6kY4A:10 a=Yq5XynenixoA:10 a=VkNPw1HP01LnGYTKEx00:22 a=bi6dqmuHe4P4UrxVR6um:22 a=fTW__CHxibyLmBMfj2wP:22 a=t7CeM3EgAAAA:8 a=SaH7p1EYcUSNWIx4HiAA:9 a=FdTzh2GWekK77mhwV6Dw:22 X-Proofpoint-Spam-Details-Enc: AW1haW4tMjYwMzEyMDA2NSBTYWx0ZWRfX/OccGkXC7rgl y1egFhefj6qV06SJOepelVwfdN3OSZgP8qpldMGkWu4g/IqCK4dZxAKZVqMRs8K/P/PgvU76PnE BHYvTLWc0/NE1rI0A+5DnVBVNJzJ6vxVOog9+PJpB0ri6nq9Ab4HMGPe6logSVfcF+to0nCL1g0 LqR9JFEWkdfapu7WXUJsmxE3Cx8GpcC47KE7L3DbfBlHN7dxqmQxYLRJ2+9TIGgprj0se11iV8F ehVjduptYhjTwUmfRfT8z559XVCoTxIusldOEqC+2WHWIPW1H7WAjPiEoFsJsPCfQqPo0gW4axR 6Bz5RjaRUSAeIbhDASBa3nsoAROugDyGIJ0vuWCH/3yfdEHJNEkNVl+km2ZknPDsWzn+d1Rnuqr qzFRsc5QpMQPhCec7uD9PoJY//EdqsPl6+cwB2qiBulcwWNdgE7LiPc7WSUu0ZMEeBLHNkkbggt Vv7zpeG15XZ6tWLstvA== 
X-Proofpoint-ORIG-GUID: Zha4t5rdWtfhnFsLaNSfq4-vlG0F9B-J
X-Proofpoint-GUID: Zha4t5rdWtfhnFsLaNSfq4-vlG0F9B-J
X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1143,Hydra:6.1.51,FMLib:17.12.100.49 definitions=2026-03-11_02,2026-03-09_02,2025-10-01_01
X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 bulkscore=0 phishscore=0 clxscore=1015 adultscore=0 suspectscore=0 priorityscore=1501 malwarescore=0 impostorscore=0 lowpriorityscore=0 spamscore=0 classifier=typeunknown authscore=0 authtc= authcc= route=outbound adjust=0 reason=mlx scancount=1 engine=8.22.0-2603050001 definitions=main-2603120065
Content-Type: text/plain; charset="utf-8"

From: Ionut Nechita

When Ceph MDS becomes unreachable (e.g., due to IPv6 EADDRNOTAVAIL during
DAD or network transitions), the sync syscall can block indefinitely in
ceph_mdsc_sync(). The hung_task detector fires repeatedly (122s, 245s,
368s... up to 983+ seconds) with traces like:

  INFO: task sync:12345 blocked for more than 122 seconds.
  Call Trace:
   ceph_mdsc_sync+0x4d6/0x5a0 [ceph]
   ceph_sync_fs+0x31/0x130 [ceph]
   iterate_supers+0x97/0x100
   ksys_sync+0x32/0xb0

Three functions in the MDS sync path use indefinite waits:

1. wait_caps_flush() uses wait_event() with no timeout
2. flush_mdlog_and_wait_mdsc_unsafe_requests() uses
   wait_for_completion() with no timeout
3. ceph_mdsc_sync() returns void, cannot propagate errors

This is particularly problematic in containerized environments with
PREEMPT_RT kernels where Ceph storage pods undergo rolling updates and
IPv6 network reconfigurations cause temporary MDS unavailability.
Fix this by adding mount_timeout-based timeouts (default 60s) to the blocking waits, following the existing pattern used by wait_requests() and ceph_mdsc_close_sessions() in the same file: - wait_caps_flush(): use wait_event_timeout() with mount_timeout - flush_mdlog_and_wait_mdsc_unsafe_requests(): use wait_for_completion_timeout() with mount_timeout - ceph_mdsc_sync(): change return type to int, propagate -ETIMEDOUT - ceph_sync_fs(): propagate error from ceph_mdsc_sync() to VFS On timeout, dirty caps and pending requests are NOT discarded - they remain in memory and are re-synced when MDS reconnects. The timeout simply unblocks the calling task. If mount_timeout is set to 0, ceph_timeout_jiffies() returns MAX_SCHEDULE_TIMEOUT, preserving the original infinite-wait behavior. Real-world impact: In production logs showing 'task sync blocked for more than 983 seconds', this patch limits the block to mount_timeout (60s default), returning -ETIMEDOUT to the VFS layer instead of hanging indefinitely. 
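
The control flow described above can be modeled in user space. What
follows is a minimal sketch under stated assumptions, not the kernel
code: timeout_or_forever() only mirrors the behavior this message
describes for ceph_timeout_jiffies() (0 means wait forever), while
MODEL_MAX_TIMEOUT, stage_wait(), model_mdsc_sync() and the tick-based
"wait" are hypothetical stand-ins for MAX_SCHEDULE_TIMEOUT and the
kernel wait primitives.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

/* Hypothetical stand-in for the kernel's MAX_SCHEDULE_TIMEOUT. */
#define MODEL_MAX_TIMEOUT ((unsigned long)LONG_MAX)

/* A timeout of 0 means "wait forever", as described for
 * ceph_timeout_jiffies() in the commit message. */
static unsigned long timeout_or_forever(unsigned long timeout)
{
	return timeout ? timeout : MODEL_MAX_TIMEOUT;
}

/*
 * Hypothetical model of one bounded wait: the awaited event needs
 * ticks_needed ticks. Returns 0 if it completes within the budget,
 * -ETIMEDOUT otherwise.
 */
static int stage_wait(unsigned long ticks_needed, unsigned long timeout)
{
	unsigned long budget = timeout_or_forever(timeout);

	assert(budget > 0); /* 0 was already mapped to "forever" */
	return ticks_needed <= budget ? 0 : -ETIMEDOUT;
}

/*
 * Model of the patched ceph_mdsc_sync(): wait for unsafe requests
 * first, then for the cap flush, and stop at the first timeout.
 */
static int model_mdsc_sync(unsigned long unsafe_ticks,
			   unsigned long caps_ticks,
			   unsigned long mount_timeout)
{
	int ret = stage_wait(unsafe_ticks, mount_timeout);

	if (ret)
		return ret;

	return stage_wait(caps_ticks, mount_timeout);
}
```

In this model each wait gets its own mount_timeout budget, which is
also how the diff below passes ceph_timeout_jiffies(opts->mount_timeout)
to every individual wait. The total time spent in a sync can therefore
exceed a single mount_timeout when several waits each finish just
inside their budget.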

Fixes: 1b2ba3c5616e ("ceph: flush the mdlog for filesystem sync")
Signed-off-by: Ionut Nechita
---
 fs/ceph/mds_client.c | 50 ++++++++++++++++++++++++++++++++++----------
 fs/ceph/mds_client.h |  2 +-
 fs/ceph/super.c      |  5 +++--
 3 files changed, 43 insertions(+), 14 deletions(-)

diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index df89d45f33a1f..37899464101f7 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -2296,17 +2296,26 @@ static int check_caps_flush(struct ceph_mds_client *mdsc,
  *
  * returns true if we've flushed through want_flush_tid
  */
-static void wait_caps_flush(struct ceph_mds_client *mdsc,
-			    u64 want_flush_tid)
+static int wait_caps_flush(struct ceph_mds_client *mdsc,
+			   u64 want_flush_tid)
 {
 	struct ceph_client *cl = mdsc->fsc->client;
+	struct ceph_options *opts = mdsc->fsc->client->options;
+	long ret;
 
 	doutc(cl, "want %llu\n", want_flush_tid);
 
-	wait_event(mdsc->cap_flushing_wq,
-		   check_caps_flush(mdsc, want_flush_tid));
+	ret = wait_event_timeout(mdsc->cap_flushing_wq,
+				 check_caps_flush(mdsc, want_flush_tid),
+				 ceph_timeout_jiffies(opts->mount_timeout));
+	if (!ret) {
+		pr_warn_client(cl, "cap flush timeout waiting for tid %llu\n",
+			       want_flush_tid);
+		return -ETIMEDOUT;
+	}
 
 	doutc(cl, "ok, flushed thru %llu\n", want_flush_tid);
+	return 0;
 }
 
 /*
@@ -5838,13 +5847,15 @@ void ceph_mdsc_pre_umount(struct ceph_mds_client *mdsc)
 /*
  * flush the mdlog and wait for all write mds requests to flush.
  */
-static void flush_mdlog_and_wait_mdsc_unsafe_requests(struct ceph_mds_client *mdsc,
-						 u64 want_tid)
+static int flush_mdlog_and_wait_mdsc_unsafe_requests(struct ceph_mds_client *mdsc,
+						u64 want_tid)
 {
 	struct ceph_client *cl = mdsc->fsc->client;
+	struct ceph_options *opts = mdsc->fsc->client->options;
 	struct ceph_mds_request *req = NULL, *nextreq;
 	struct ceph_mds_session *last_session = NULL;
 	struct rb_node *n;
+	unsigned long left;
 
 	mutex_lock(&mdsc->mutex);
 	doutc(cl, "want %lld\n", want_tid);
@@ -5883,7 +5894,19 @@ static void flush_mdlog_and_wait_mdsc_unsafe_requests(struct ceph_mds_client *md
 			}
 			doutc(cl, "wait on %llu (want %llu)\n",
 			      req->r_tid, want_tid);
-			wait_for_completion(&req->r_safe_completion);
+			left = wait_for_completion_timeout(
+					&req->r_safe_completion,
+					ceph_timeout_jiffies(opts->mount_timeout));
+			if (!left) {
+				pr_warn_client(cl,
+					       "flush mdlog request tid %llu timed out\n",
+					       req->r_tid);
+				ceph_mdsc_put_request(req);
+				if (nextreq)
+					ceph_mdsc_put_request(nextreq);
+				ceph_put_mds_session(last_session);
+				return -ETIMEDOUT;
+			}
 
 			mutex_lock(&mdsc->mutex);
 			ceph_mdsc_put_request(req);
@@ -5901,15 +5924,17 @@ static void flush_mdlog_and_wait_mdsc_unsafe_requests(struct ceph_mds_client *md
 	mutex_unlock(&mdsc->mutex);
 	ceph_put_mds_session(last_session);
 	doutc(cl, "done\n");
+	return 0;
 }
 
-void ceph_mdsc_sync(struct ceph_mds_client *mdsc)
+int ceph_mdsc_sync(struct ceph_mds_client *mdsc)
 {
 	struct ceph_client *cl = mdsc->fsc->client;
 	u64 want_tid, want_flush;
+	int ret;
 
 	if (READ_ONCE(mdsc->fsc->mount_state) >= CEPH_MOUNT_SHUTDOWN)
-		return;
+		return -EIO;
 
 	doutc(cl, "sync\n");
 	mutex_lock(&mdsc->mutex);
@@ -5930,8 +5955,11 @@ void ceph_mdsc_sync(struct ceph_mds_client *mdsc)
 
 	doutc(cl, "sync want tid %lld flush_seq %lld\n", want_tid, want_flush);
 
-	flush_mdlog_and_wait_mdsc_unsafe_requests(mdsc, want_tid);
-	wait_caps_flush(mdsc, want_flush);
+	ret = flush_mdlog_and_wait_mdsc_unsafe_requests(mdsc, want_tid);
+	if (ret)
+		return ret;
+
+	return wait_caps_flush(mdsc, want_flush);
 }
 
 /*
diff --git a/fs/ceph/mds_client.h b/fs/ceph/mds_client.h
index 0a602080d8ef6..695c5a9c94026 100644
--- a/fs/ceph/mds_client.h
+++ b/fs/ceph/mds_client.h
@@ -564,7 +564,7 @@ extern void ceph_mdsc_close_sessions(struct ceph_mds_client *mdsc);
 extern void ceph_mdsc_force_umount(struct ceph_mds_client *mdsc);
 extern void ceph_mdsc_destroy(struct ceph_fs_client *fsc);
 
-extern void ceph_mdsc_sync(struct ceph_mds_client *mdsc);
+extern int ceph_mdsc_sync(struct ceph_mds_client *mdsc);
 
 extern void ceph_invalidate_dir_request(struct ceph_mds_request *req);
 extern int ceph_alloc_readdir_reply_buffer(struct ceph_mds_request *req,
diff --git a/fs/ceph/super.c b/fs/ceph/super.c
index b61074b377ac5..b52960402d68e 100644
--- a/fs/ceph/super.c
+++ b/fs/ceph/super.c
@@ -122,6 +122,7 @@ static int ceph_sync_fs(struct super_block *sb, int wait)
 {
 	struct ceph_fs_client *fsc = ceph_sb_to_fs_client(sb);
 	struct ceph_client *cl = fsc->client;
+	int ret;
 
 	if (!wait) {
 		doutc(cl, "(non-blocking)\n");
@@ -133,9 +134,9 @@ static int ceph_sync_fs(struct super_block *sb, int wait)
 
 	doutc(cl, "(blocking)\n");
 	ceph_osdc_sync(&fsc->client->osdc);
-	ceph_mdsc_sync(fsc->mdsc);
+	ret = ceph_mdsc_sync(fsc->mdsc);
 	doutc(cl, "(blocking) done\n");
-	return 0;
+	return ret;
 }
 
 /*
-- 
2.53.0

From nobody Tue Apr 7 18:03:28 2026
From: "Ionut Nechita (Wind River)"
To: ceph-devel@vger.kernel.org
Cc: idryomov@gmail.com, xiubli@redhat.com, linux-kernel@vger.kernel.org,
 ionut_n2001@yahoo.com, Ionut Nechita
Subject: [PATCH v1 03/13] ceph: add timeout protection to ceph_osdc_sync() path
Date: Thu, 12 Mar 2026 10:16:09 +0200
Message-ID: <20260312081619.40854-4-ionut.nechita@windriver.com>
X-Mailer: git-send-email 2.53.0
In-Reply-To: <20260312081619.40854-1-ionut.nechita@windriver.com>
References: <20260312081619.40854-1-ionut.nechita@windriver.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Ionut Nechita

When a Ceph OSD becomes unreachable (e.g., due to IPv6 EADDRNOTAVAIL
during DAD or network transitions), the sync syscall can block
indefinitely in ceph_osdc_sync(). This function iterates over all
in-flight write requests and calls wait_for_completion() with no
timeout on each one.

The hung_task detector fires repeatedly with stack traces showing:

  ceph_osdc_sync [libceph]
  ceph_sync_fs [ceph]
  iterate_supers
  ksys_sync

Since ceph_osdc_sync() is called before ceph_mdsc_sync() in
ceph_sync_fs(), an OSD hang prevents the MDS timeout protection from
commit e789e5252fda ("ceph: add timeout protection to ceph_mdsc_sync()
path") from ever being reached.

This is particularly problematic in containerized environments with
PREEMPT_RT kernels where Ceph storage pods undergo rolling updates and
IPv6 network reconfigurations cause temporary OSD unavailability.

Fix this by adding a mount_timeout-based timeout to the blocking wait,
following the existing pattern used by wait_request_timeout() in the
same file:

- ceph_osdc_sync(): use wait_for_completion_timeout() with
  mount_timeout instead of indefinite wait_for_completion()
- Change return type from void to int, return -ETIMEDOUT on timeout
- ceph_sync_fs(): propagate OSD sync error, short-circuit before MDS
  sync on failure

On timeout, pending OSD requests are NOT cancelled - they remain
in-flight and complete when the OSD reconnects. The timeout simply
unblocks the calling task. If mount_timeout is set to 0,
ceph_timeout_jiffies() returns MAX_SCHEDULE_TIMEOUT, preserving the
original infinite-wait behavior.
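
The per-request wait loop can be illustrated with a small user-space
model. This is a sketch under stated assumptions rather than the
libceph implementation: struct model_req, model_osdc_sync() and the
tick counts are hypothetical, the array is assumed to be sorted by
tid, and skipping requests newer than the last_tid snapshot is an
assumption about the surrounding loop that the diff context does not
show.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical model of one in-flight OSD request. */
struct model_req {
	unsigned long long tid;           /* transaction id */
	int is_write;                     /* only writes are waited on */
	unsigned long ticks_to_complete;  /* time until completion fires */
	int refs;                         /* references held by the sync path */
};

/*
 * Model of the patched ceph_osdc_sync(): wait for every write request
 * up to last_tid, giving each wait its own timeout budget. Returns 0
 * when all waits complete, -ETIMEDOUT at the first one that does not.
 * The reference taken for the wait is dropped on both paths.
 */
static int model_osdc_sync(struct model_req *reqs, size_t n,
			   unsigned long long last_tid,
			   unsigned long mount_timeout)
{
	/* 0 means "wait forever", as described for ceph_timeout_jiffies() */
	unsigned long timeout =
		mount_timeout ? mount_timeout : (unsigned long)LONG_MAX;
	size_t i;

	assert(reqs != NULL || n == 0);

	for (i = 0; i < n; i++) {
		struct model_req *req = &reqs[i];

		if (req->tid > last_tid)
			break;          /* issued after the snapshot */
		if (!req->is_write)
			continue;

		req->refs++;            /* hold the request while waiting */
		if (req->ticks_to_complete > timeout) {
			req->refs--;    /* drop the reference before bailing out */
			return -ETIMEDOUT;
		}
		req->refs--;
	}
	return 0;
}
```

The timed-out request itself is left untouched in the model, matching
the statement above that pending OSD requests are not cancelled; only
the caller's reference is released.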

Signed-off-by: Ionut Nechita
---
 fs/ceph/super.c                 |  4 +++-
 include/linux/ceph/osd_client.h |  2 +-
 net/ceph/osd_client.c           | 15 +++++++++++++--
 3 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/fs/ceph/super.c b/fs/ceph/super.c
index b52960402d68e..6f4ee457c1b52 100644
--- a/fs/ceph/super.c
+++ b/fs/ceph/super.c
@@ -133,7 +133,9 @@ static int ceph_sync_fs(struct super_block *sb, int wait)
 	}
 
 	doutc(cl, "(blocking)\n");
-	ceph_osdc_sync(&fsc->client->osdc);
+	ret = ceph_osdc_sync(&fsc->client->osdc);
+	if (ret)
+		return ret;
 	ret = ceph_mdsc_sync(fsc->mdsc);
 	doutc(cl, "(blocking) done\n");
 	return ret;
diff --git a/include/linux/ceph/osd_client.h b/include/linux/ceph/osd_client.h
index d7941478158cd..871827e2dd983 100644
--- a/include/linux/ceph/osd_client.h
+++ b/include/linux/ceph/osd_client.h
@@ -587,7 +587,7 @@ void ceph_osdc_start_request(struct ceph_osd_client *osdc,
 extern void ceph_osdc_cancel_request(struct ceph_osd_request *req);
 extern int ceph_osdc_wait_request(struct ceph_osd_client *osdc,
 				  struct ceph_osd_request *req);
-extern void ceph_osdc_sync(struct ceph_osd_client *osdc);
+extern int ceph_osdc_sync(struct ceph_osd_client *osdc);
 
 extern void ceph_osdc_flush_notifies(struct ceph_osd_client *osdc);
 void ceph_osdc_maybe_request_map(struct ceph_osd_client *osdc);
diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index abac770bc0b4c..7d5e4a078fb10 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -4734,10 +4734,13 @@ EXPORT_SYMBOL(ceph_osdc_wait_request);
 /*
  * sync - wait for all in-flight requests to flush.  avoid starvation.
  */
-void ceph_osdc_sync(struct ceph_osd_client *osdc)
+int ceph_osdc_sync(struct ceph_osd_client *osdc)
 {
+	struct ceph_options *opts = osdc->client->options;
+	unsigned long timeout = ceph_timeout_jiffies(opts->mount_timeout);
 	struct rb_node *n, *p;
 	u64 last_tid = atomic64_read(&osdc->last_tid);
+	unsigned long left;
 
 again:
 	down_read(&osdc->lock);
@@ -4760,7 +4763,14 @@ void ceph_osdc_sync(struct ceph_osd_client *osdc)
 			up_read(&osdc->lock);
 			dout("%s waiting on req %p tid %llu last_tid %llu\n",
 			     __func__, req, req->r_tid, last_tid);
-			wait_for_completion(&req->r_completion);
+			left = wait_for_completion_timeout(&req->r_completion,
+							   timeout);
+			if (!left) {
+				pr_warn("ceph: osd sync request tid %llu timed out\n",
+					req->r_tid);
+				ceph_osdc_put_request(req);
+				return -ETIMEDOUT;
+			}
 			ceph_osdc_put_request(req);
 			goto again;
 		}
@@ -4770,6 +4780,7 @@ void ceph_osdc_sync(struct ceph_osd_client *osdc)
 
 	up_read(&osdc->lock);
 	dout("%s done last_tid %llu\n", __func__, last_tid);
+	return 0;
 }
 EXPORT_SYMBOL(ceph_osdc_sync);
 
-- 
2.53.0

From nobody Tue Apr 7 18:03:28 2026
From: "Ionut Nechita (Wind River)"
To: ceph-devel@vger.kernel.org
Cc: idryomov@gmail.com, xiubli@redhat.com, linux-kernel@vger.kernel.org,
 ionut_n2001@yahoo.com, Ionut Nechita
Subject: [PATCH v1 04/13] ceph: fix race condition in cleanup_session_requests()
Date: Thu, 12 Mar 2026 10:16:10 +0200
Message-ID: <20260312081619.40854-5-ionut.nechita@windriver.com>
X-Mailer: git-send-email 2.53.0
In-Reply-To: <20260312081619.40854-1-ionut.nechita@windriver.com>
References: <20260312081619.40854-1-ionut.nechita@windriver.com>
MIME-Version: 1.0
L+maqZ0XLFAc2fTWDNGyLF/DeJF3qX74+o25FwQ/hvWZUSag2bokHXyocuISR6Qpa57RZnXVPH0 rQRkA0NAHQ3f5QyR5btxpdUD9u7jrRW0hQF7DAjk83EW9idMf1XYVc4sNI2CvSD9vgqCFn3GEBK REuQmQvFFt64fdGjKIrS9lQzTQO47D8zfSUBsHM1QN9PSO25JYkw2irdiT7iLPPN0woyJPR6EIK 4lhQf0mq/C81PNv/DyzxlgEa422fPC7XD7cOOfLnjFZb87CvkJ1FKoRM8IL75DZzjaIJCMpinSe pzilq+LwidX7ntfkCc2RD+UCBjeYBvA99sRT6psjAr25xyujbRyASaa8RfRt6qNyb30ehRKDbpZ uWR1zmuqi2+pfCeI9tO6wXrvpw4ni3bC+eqWrb4Qz+rYMYlC2L3J4+Why0QTvTWTOwaTt+m9iMG yHYok1eSaHCRsWJlK3w== X-Proofpoint-ORIG-GUID: 1dkxApkTUtzwIYB_lRc2oQZ8sKznIxX3 X-Proofpoint-GUID: 1dkxApkTUtzwIYB_lRc2oQZ8sKznIxX3 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1143,Hydra:6.1.51,FMLib:17.12.100.49 definitions=2026-03-11_02,2026-03-09_02,2025-10-01_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 bulkscore=0 phishscore=0 clxscore=1015 adultscore=0 suspectscore=0 priorityscore=1501 malwarescore=0 impostorscore=0 lowpriorityscore=0 spamscore=0 classifier=typeunknown authscore=0 authtc= authcc= route=outbound adjust=0 reason=mlx scancount=1 engine=8.22.0-2603050001 definitions=main-2603120065 Content-Type: text/plain; charset="utf-8" From: Ionut Nechita When an MDS session is closed or reset, cleanup_session_requests() only unregisters requests that are on the session's s_unsafe list. However, requests are only added to s_unsafe after receiving an "unsafe" reply from the MDS. This creates a race condition: if a write request has been sent but the MDS becomes unavailable before sending the unsafe reply, the request will: - Have r_session set (points to the failed session) - Be in the request_tree - NOT be on s_unsafe list - Never have r_safe_completion signaled Meanwhile, flush_mdlog_and_wait_mdsc_unsafe_requests() iterates the request_tree looking for write requests with r_session set, and waits on r_safe_completion for each one. 
Since the request is not on s_unsafe, cleanup_session_requests() won't
unregister it, and the completion is never signaled - causing an
indefinite hang.

This was observed in production when running xfstests generic/013 in a
loop, with stack traces showing:

  INFO: task fsstress:14466 blocked for more than 122 seconds.
  Call Trace:
   wait_for_completion+0x14a/0x340
   ceph_mdsc_sync+0x4b4/0xe80
   ceph_sync_fs+0xa0/0x4c0
   sync_filesystem+0x182/0x240

Fix this by extending cleanup_session_requests() to also unregister
requests that:
- Belong to the closing session (r_session->s_mds matches)
- Have NOT received an unsafe reply (CEPH_MDS_R_GOT_UNSAFE not set)
- Have NOT received a safe reply (CEPH_MDS_R_GOT_SAFE not set)

These are requests that were in-flight when the session failed and will
never complete. Unregistering them signals r_safe_completion, unblocking
any waiters.

Requests that received an unsafe reply but not yet a safe reply are
already on s_unsafe and handled by the existing code. For these, we
preserve the original behavior of resetting r_attempts to allow
re-sending when the session reconnects.
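The selection rule described above can be modeled outside the kernel.
What follows is a minimal userspace sketch and not part of the patch:
struct model_request, the REQ_* flags, and model_cleanup_session() are
hypothetical stand-ins for ceph_mds_request, the CEPH_MDS_R_GOT_* bits,
and the request_tree walk. The separate pass over s_unsafe is omitted,
and "unregistering" is reduced to two booleans.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for CEPH_MDS_R_GOT_UNSAFE / CEPH_MDS_R_GOT_SAFE. */
enum {
	REQ_GOT_UNSAFE = 1u << 0,
	REQ_GOT_SAFE   = 1u << 1,
};

/* Simplified model of an MDS request; field names are illustrative. */
struct model_request {
	int session_mds;      /* rank of the MDS the request was sent to */
	unsigned int flags;   /* REQ_GOT_* bits */
	int attempts;         /* models r_attempts */
	bool registered;      /* still present in the "request tree" */
	bool safe_completed;  /* models r_safe_completion being signaled */
};

/* Model of unregistering: leave the tree and wake any waiters. */
static void model_unregister(struct model_request *req)
{
	req->registered = false;
	req->safe_completed = true;
}

/*
 * For every registered request bound to session 'mds':
 *  - no unsafe and no safe reply yet: drop it, so waiters are released;
 *  - otherwise: reset the attempt counter so it can be re-sent.
 * Returns the number of requests dropped.
 */
static size_t model_cleanup_session(struct model_request *reqs, size_t n,
				    int mds)
{
	size_t dropped = 0;

	assert(reqs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		struct model_request *req = &reqs[i];

		if (!req->registered || req->session_mds != mds)
			continue;
		if (!(req->flags & (REQ_GOT_UNSAFE | REQ_GOT_SAFE))) {
			model_unregister(req);
			dropped++;
		} else {
			req->attempts = 0;
		}
	}
	return dropped;
}
```

Given one request with no reply, one with only an unsafe reply, and one
bound to a different MDS, calling model_cleanup_session() for the first
MDS drops only the first request and marks it complete; the second just
has its attempt counter reset, and the third is left untouched.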
Fixes: e3ec8d689cf4 ("ceph: clean up unsafe requests when reconnecting is denied")
Signed-off-by: Ionut Nechita
Reviewed-by: Viacheslav Dubeyko
---
 fs/ceph/mds_client.c | 24 +++++++++++++++++++++---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 37899464101f7..45abddd7f317e 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -1792,6 +1792,8 @@ static void cleanup_session_requests(struct ceph_mds_client *mdsc,
 
 	doutc(cl, "mds%d\n", session->s_mds);
 	mutex_lock(&mdsc->mutex);
+
+	/* First, handle requests on the unsafe list */
 	while (!list_empty(&session->s_unsafe)) {
 		req = list_first_entry(&session->s_unsafe,
 				       struct ceph_mds_request, r_unsafe_item);
@@ -1803,14 +1805,30 @@ static void cleanup_session_requests(struct ceph_mds_client *mdsc,
 			mapping_set_error(req->r_unsafe_dir->i_mapping, -EIO);
 		__unregister_request(mdsc, req);
 	}
-	/* zero r_attempts, so kick_requests() will re-send requests */
+
+	/*
+	 * Iterate through all pending requests for this session.
+	 * Requests that haven't received an unsafe reply yet will never
+	 * complete on this session - unregister them to signal waiters.
+	 * Requests that got unsafe but not safe are handled above via
+	 * s_unsafe list; for any remaining, reset r_attempts to allow
+	 * re-sending when session reconnects.
+	 */
 	p = rb_first(&mdsc->request_tree);
 	while (p) {
 		req = rb_entry(p, struct ceph_mds_request, r_node);
 		p = rb_next(p);
 		if (req->r_session &&
-		    req->r_session->s_mds == session->s_mds)
-			req->r_attempts = 0;
+		    req->r_session->s_mds == session->s_mds) {
+			if (!test_bit(CEPH_MDS_R_GOT_UNSAFE, &req->r_req_flags) &&
+			    !test_bit(CEPH_MDS_R_GOT_SAFE, &req->r_req_flags)) {
+				doutc(cl, " dropping pending request %llu\n",
+				      req->r_tid);
+				__unregister_request(mdsc, req);
+			} else {
+				req->r_attempts = 0;
+			}
+		}
 	}
 	mutex_unlock(&mdsc->mutex);
 }
-- 
2.53.0

From nobody Tue Apr 7 18:03:28 2026
From: "Ionut Nechita (Wind River)"
To: ceph-devel@vger.kernel.org
Cc: idryomov@gmail.com, xiubli@redhat.com, linux-kernel@vger.kernel.org, ionut_n2001@yahoo.com, Ionut Nechita
Subject: [PATCH v1 05/13] ceph: add timeout protection to ceph_lock_wait_for_completion()
Date: Thu, 12 Mar 2026 10:16:11 +0200
Message-ID: <20260312081619.40854-6-ionut.nechita@windriver.com>
In-Reply-To: <20260312081619.40854-1-ionut.nechita@windriver.com>
References: <20260312081619.40854-1-ionut.nechita@windriver.com>
Content-Type: text/plain; charset="utf-8"

From: Ionut Nechita

When a file lock operation is interrupted and an unlock request is sent
to cancel it, ceph_lock_wait_for_completion() waits indefinitely for
r_safe_completion using wait_for_completion_killable().

If the MDS becomes unreachable after the unlock request is sent, this
wait will block indefinitely, causing hung task warnings:

  INFO: task flock:12345 blocked for more than 122 seconds.
  Call Trace:
   wait_for_completion_killable+0x...
   ceph_lock_wait_for_completion+0x...
   ceph_flock+0x...

This is similar to the issue fixed in ceph_mdsc_sync() where indefinite
waits on r_safe_completion can hang when MDS is unavailable.

Fix this by using wait_for_completion_killable_timeout() with
mount_timeout instead of the indefinite wait. On timeout, return
-ETIMEDOUT to the caller.

The lock state remains consistent because:
1. If the unlock succeeded on MDS, the lock is released
2. If the unlock didn't reach MDS, the original lock request was
   already aborted (CEPH_MDS_R_ABORTED set), so MDS will clean it up
   on reconnect

This follows the same timeout pattern used throughout the ceph client
for MDS operations.
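The return-value handling described above can be written down as a small
mapping. wait_for_completion_killable_timeout() returns -ERESTARTSYS
when interrupted by a fatal signal, 0 on timeout, and a positive number
of remaining jiffies when the completion fires. The sketch below is a
userspace illustration only and not part of the patch:
model_lock_wait_result() and MODEL_ERESTARTSYS are hypothetical names,
and ERESTARTSYS is defined locally with the kernel's value (512) because
userspace errno.h does not export it.

```c
#include <assert.h>
#include <errno.h>

/* Kernel-internal errno, redefined here for the userspace model. */
#define MODEL_ERESTARTSYS 512

/*
 * Map the result of a killable, timed wait onto the value the caller
 * should see:
 *   interrupted -> propagate -ERESTARTSYS
 *   timed out   -> -ETIMEDOUT
 *   completed   -> 0
 */
static int model_lock_wait_result(long wait_ret)
{
	if (wait_ret == -MODEL_ERESTARTSYS)
		return -MODEL_ERESTARTSYS;
	if (wait_ret == 0)
		return -ETIMEDOUT;
	/* Any other outcome is "completed with time to spare". */
	assert(wait_ret > 0);
	return 0;
}
```

The point of the mapping is that a zero return from the timed wait is a
failure (timeout), not success, so it has to be translated before being
handed back to the lock caller.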
Signed-off-by: Ionut Nechita
---
 fs/ceph/locks.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/fs/ceph/locks.c b/fs/ceph/locks.c
index ebf4ac0055ddc..55dd99460b81a 100644
--- a/fs/ceph/locks.c
+++ b/fs/ceph/locks.c
@@ -160,6 +160,8 @@ static int ceph_lock_wait_for_completion(struct ceph_mds_client *mdsc,
 					 struct ceph_mds_request *req)
 {
 	struct ceph_client *cl = mdsc->fsc->client;
+	struct ceph_options *opts = mdsc->fsc->client->options;
+	unsigned long timeout = ceph_timeout_jiffies(opts->mount_timeout);
 	struct ceph_mds_request *intr_req;
 	struct inode *inode = req->r_inode;
 	int err, lock_type;
@@ -221,7 +223,17 @@ static int ceph_lock_wait_for_completion(struct ceph_mds_client *mdsc,
 	if (err && err != -ERESTARTSYS)
 		return err;
 
-	wait_for_completion_killable(&req->r_safe_completion);
+	err = wait_for_completion_killable_timeout(&req->r_safe_completion,
+						   timeout);
+	if (err == -ERESTARTSYS) {
+		/* Interrupted again, just return the error */
+		return err;
+	}
+	if (err == 0) {
+		pr_warn_client(cl, "lock request tid %llu safe completion timed out\n",
+			       req->r_tid);
+		return -ETIMEDOUT;
+	}
 	return 0;
 }
 
-- 
2.53.0

From nobody Tue Apr 7 18:03:28 2026
From: "Ionut Nechita (Wind River)"
To: ceph-devel@vger.kernel.org
Cc: idryomov@gmail.com, xiubli@redhat.com, linux-kernel@vger.kernel.org, ionut_n2001@yahoo.com, Ionut Nechita
Subject: [PATCH v1 06/13] ceph: set default timeout for MDS requests
Date: Thu, 12 Mar 2026 10:16:12 +0200
Message-ID: <20260312081619.40854-7-ionut.nechita@windriver.com>
In-Reply-To: <20260312081619.40854-1-ionut.nechita@windriver.com>
References: <20260312081619.40854-1-ionut.nechita@windriver.com>
Content-Type: text/plain; charset="utf-8"

From: Ionut Nechita

MDS requests created via ceph_mdsc_create_request() have r_timeout
initialized to 0 (from kmem_cache_zalloc). When r_timeout is 0,
ceph_timeout_jiffies() returns MAX_SCHEDULE_TIMEOUT, causing
ceph_mdsc_wait_request() to wait indefinitely.

This causes hung task warnings when MDS becomes unavailable during
operations like setattr or truncate:

  INFO: task dd:12345 blocked for more than 122 seconds.
  Call Trace:
   ceph_mdsc_wait_request+0x...
   ceph_mdsc_do_request+0x...
   __ceph_setattr+0x...

Only the mount path in super.c explicitly sets r_timeout to mount_timeout.
All other MDS requests (setattr, lookup, mkdir, etc.) use the default 0 value, making them wait forever. Fix this by initializing r_timeout to mount_timeout in ceph_mdsc_create_request(). This ensures all MDS requests have a reasonable timeout and will fail with -ETIMEDOUT rather than hanging indefinitely. Signed-off-by: Ionut Nechita Reviewed-by: Viacheslav Dubeyko --- fs/ceph/mds_client.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index 45abddd7f317e..ac86225595b5f 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -2613,6 +2613,7 @@ ceph_mdsc_create_request(struct ceph_mds_client *mdsc= , int op, int mode) mutex_init(&req->r_fill_mutex); req->r_mdsc =3D mdsc; req->r_started =3D jiffies; + req->r_timeout =3D mdsc->fsc->client->options->mount_timeout; req->r_start_latency =3D ktime_get(); req->r_resend_mds =3D -1; INIT_LIST_HEAD(&req->r_unsafe_dir_item); --=20 2.53.0 From nobody Tue Apr 7 18:03:28 2026 Received: from mx0a-0064b401.pphosted.com (mx0a-0064b401.pphosted.com [205.220.166.238]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7873C34EEED; Thu, 12 Mar 2026 08:17:02 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=205.220.166.238 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773303425; cv=fail; b=qZrnor4U1dwEDOBDTsVnXlByczTOYO/exRN9RN3k9vqM8/CCsF8wcU/0jVrJOGMaGaI3YNshjX9TQzzMxVrEDZ4YXbDPXefwQuu6pQ+ltkQZ+F7aF6lmz6hxqW7eOsjqogY2XsZQRchaKxFBlNWWbsDiANxohUxbz/UtrnN9uVE= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773303425; c=relaxed/simple; bh=u41OY1UtoRLity68mOrvjZkCDwzYcWDHSW8ANiTUWDQ=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: Content-Type:MIME-Version; 
From: "Ionut Nechita (Wind River)"
To: ceph-devel@vger.kernel.org
Cc: idryomov@gmail.com, xiubli@redhat.com, linux-kernel@vger.kernel.org, ionut_n2001@yahoo.com, Ionut Nechita
Subject: [PATCH v1 07/13] ceph: add timeout to caps wait in __ceph_get_caps()
Date: Thu, 12 Mar 2026 10:16:13 +0200
Message-ID: <20260312081619.40854-8-ionut.nechita@windriver.com>
X-Mailer: git-send-email 2.53.0
In-Reply-To: <20260312081619.40854-1-ionut.nechita@windriver.com>
References: <20260312081619.40854-1-ionut.nechita@windriver.com>

From: Ionut Nechita

When waiting for caps in __ceph_get_caps(), the code uses wait_woken()
with MAX_SCHEDULE_TIMEOUT, which can block indefinitely if the MDS is
unavailable or slow to grant caps during reconnection.

This causes hung task warnings when MDS fails over:

  INFO: task dd:12345 blocked for more than 122 seconds.
  Call Trace:
    __ceph_get_caps+0x...
    ceph_write_iter+0x...

During MDS failover, caps may be revoked or delayed while the client
reconnects. Processes waiting for caps block indefinitely, also holding
i_rwsem which blocks other I/O operations on the same inode, causing a
cascade of blocked processes.

Fix this by using wait_woken() with mount_timeout instead of
MAX_SCHEDULE_TIMEOUT. On timeout, return -ETIMEDOUT to allow the caller
to handle the situation appropriately.
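
For illustration only (this is not part of the submitted patch), the
timeout convention described above can be modelled in plain userspace C.
The sketch assumes that a configured timeout of 0 means "wait forever" and
that a bounded wait returns 0 when it times out; every sketch_* name and
SKETCH_MAX_TIMEOUT is a hypothetical stand-in, not a kernel symbol.

```c
#include <errno.h>
#include <limits.h>

/* Hypothetical stand-in for the kernel's "no timeout" sentinel. */
#define SKETCH_MAX_TIMEOUT ((unsigned long)LONG_MAX)

/* Map a configured timeout to a wait budget: 0 means wait forever. */
static unsigned long sketch_timeout_jiffies(unsigned long timeout)
{
	return timeout ? timeout : SKETCH_MAX_TIMEOUT;
}

/*
 * Model one bounded wait. The wakeup arrives after wake_after ticks.
 * Returns the remaining budget if the wakeup came in time, 0 on timeout.
 */
static long sketch_wait(unsigned long budget, unsigned long wake_after)
{
	return wake_after < budget ? (long)(budget - wake_after) : 0;
}

/* Return 0 if the wait was satisfied, -ETIMEDOUT if the budget ran out. */
static int sketch_get_caps(unsigned long mount_timeout,
			   unsigned long wake_after)
{
	unsigned long timeout = sketch_timeout_jiffies(mount_timeout);

	if (!sketch_wait(timeout, wake_after))
		return -ETIMEDOUT;
	return 0;
}
```

With a budget of 100 ticks, a wakeup after 10 ticks succeeds and one after
200 ticks yields -ETIMEDOUT. With a configured timeout of 0, the same late
wakeup still succeeds, because the budget is effectively unbounded.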

Signed-off-by: Ionut Nechita
---
 fs/ceph/caps.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index bed34fc11c919..c88e10a634e5c 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -3055,7 +3055,10 @@ int __ceph_get_caps(struct inode *inode, struct ceph_file_info *fi, int need,
 {
 	struct ceph_inode_info *ci = ceph_inode(inode);
 	struct ceph_fs_client *fsc = ceph_inode_to_fs_client(inode);
+	struct ceph_client *cl = fsc->client;
+	unsigned long timeout = ceph_timeout_jiffies(cl->options->mount_timeout);
 	int ret, _got, flags;
+	bool warned = false;
 
 	ret = ceph_pool_perm_check(inode, need);
 	if (ret < 0)
@@ -3104,7 +3107,18 @@ int __ceph_get_caps(struct inode *inode, struct ceph_file_info *fi, int need,
 					ret = -ERESTARTSYS;
 					break;
 				}
-				wait_woken(&wait, TASK_INTERRUPTIBLE, MAX_SCHEDULE_TIMEOUT);
+				if (!wait_woken(&wait, TASK_INTERRUPTIBLE, timeout)) {
+					if (!warned) {
+						pr_warn_ratelimited_client(cl,
+							"%p %llx.%llx caps wait timed out (need %s want %s)\n",
+							inode, ceph_vinop(inode),
+							ceph_cap_string(need),
+							ceph_cap_string(want));
+						warned = true;
+					}
+					ret = -ETIMEDOUT;
+					break;
+				}
 			}
 
 			remove_wait_queue(&ci->i_cap_wq, &wait);
-- 
2.53.0

From nobody Tue Apr 7 18:03:28 2026
From: "Ionut Nechita (Wind River)"
To: ceph-devel@vger.kernel.org
Cc: idryomov@gmail.com, xiubli@redhat.com, linux-kernel@vger.kernel.org, ionut_n2001@yahoo.com, Ionut Nechita
Subject: [PATCH v1 08/13] ceph: make ceph_start_io_write() killable
Date: Thu, 12 Mar 2026 10:16:14 +0200
Message-ID: <20260312081619.40854-9-ionut.nechita@windriver.com>
X-Mailer: git-send-email 2.53.0
In-Reply-To: <20260312081619.40854-1-ionut.nechita@windriver.com>
References: <20260312081619.40854-1-ionut.nechita@windriver.com>

From: Ionut Nechita

When multiple processes write to the same file and one of them is
blocked waiting for MDS/OSD response (e.g., during MDS failover), other
processes block indefinitely on down_write(&inode->i_rwsem) in
ceph_start_io_write().

This causes hung task warnings:

  INFO: task dd:12345 blocked for more than 122 seconds.
  Call Trace:
    ceph_start_io_write+0x...
    ceph_write_iter+0x...

The i_rwsem is held by a process doing fsync/writeback that is waiting
for MDS or OSD response. Other writers queue up on the rwsem and block
indefinitely.

Fix this by using down_write_killable() instead of down_write(). This
allows blocked processes to be killed with SIGKILL, preventing
indefinite hangs. The function now returns an error code that callers
must check.

Update ceph_write_iter() to handle the new error return from
ceph_start_io_write().

Signed-off-by: Ionut Nechita
---
 fs/ceph/file.c | 9 +++++++--
 fs/ceph/io.c   | 9 +++++++--
 fs/ceph/io.h   | 2 +-
 3 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index 6587c2d5af1e0..01e4f31b1f2f3 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -2359,8 +2359,13 @@ static ssize_t ceph_write_iter(struct kiocb *iocb, struct iov_iter *from)
 retry_snap:
 	if (direct_lock)
 		ceph_start_io_direct(inode);
-	else
-		ceph_start_io_write(inode);
+	else {
+		err = ceph_start_io_write(inode);
+		if (err) {
+			ceph_free_cap_flush(prealloc_cf);
+			return err;
+		}
+	}
 
 	if (iocb->ki_flags & IOCB_APPEND) {
 		err = ceph_do_getattr(inode, CEPH_STAT_CAP_SIZE, false);
diff --git a/fs/ceph/io.c b/fs/ceph/io.c
index c456509b31c3f..f9ac89ec1d6a1 100644
--- a/fs/ceph/io.c
+++ b/fs/ceph/io.c
@@ -83,11 +83,16 @@ ceph_end_io_read(struct inode *inode)
  * Declare that a buffered write operation is about to start, and ensure
  * that we block all direct I/O.
  */
-void
+int
 ceph_start_io_write(struct inode *inode)
 {
-	down_write(&inode->i_rwsem);
+	int ret;
+
+	ret = down_write_killable(&inode->i_rwsem);
+	if (ret)
+		return ret;
 	ceph_block_o_direct(ceph_inode(inode), inode);
+	return 0;
 }
 
 /**
diff --git a/fs/ceph/io.h b/fs/ceph/io.h
index fa594cd77348a..94ce176df9997 100644
--- a/fs/ceph/io.h
+++ b/fs/ceph/io.h
@@ -4,7 +4,7 @@
 
 void ceph_start_io_read(struct inode *inode);
 void ceph_end_io_read(struct inode *inode);
-void ceph_start_io_write(struct inode *inode);
+int ceph_start_io_write(struct inode *inode);
 void ceph_end_io_write(struct inode *inode);
 void ceph_start_io_direct(struct inode *inode);
 void ceph_end_io_direct(struct inode *inode);
-- 
2.53.0

From nobody Tue Apr 7 18:03:28 2026
From: "Ionut Nechita (Wind River)"
To: ceph-devel@vger.kernel.org
Cc: idryomov@gmail.com, xiubli@redhat.com, linux-kernel@vger.kernel.org, ionut_n2001@yahoo.com, Ionut Nechita
Subject: [PATCH v1 09/13] ceph: make remaining I/O lock functions killable
Date: Thu, 12 Mar 2026 10:16:15 +0200
Message-ID: <20260312081619.40854-10-ionut.nechita@windriver.com>
X-Mailer: git-send-email 2.53.0
In-Reply-To: <20260312081619.40854-1-ionut.nechita@windriver.com>
References: <20260312081619.40854-1-ionut.nechita@windriver.com>
SJ2PR11MB7546.namprd11.prod.outlook.com (2603:10b6:a03:4cc::8) Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: SJ2PR11MB7546:EE_|SA0PR11MB4720:EE_ X-MS-Office365-Filtering-Correlation-Id: 2023103a-cc60-41f8-be41-08de800fb991 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|1800799024|52116014|376014|10070799003|366016|18002099003|56012099003|22082099003; X-Microsoft-Antispam-Message-Info: vGY3OlV73vGWq6w4yZiQxlwMOlrpykUUXVABWJCSZ0Tx3B1Ck4ecYI3Ojrq3I6MCGfDLT4xs+m9wwxEJzvpZ91M+xUiWOHwdcQwGu3n48KgopjDgS0vHMNg0p4EsYS4/FhoqWRQuYLlM1PgA6VX8swKj7yU7zZ8lWAS/vQhim5qczqRXr8xDhdXsU696wonZXdvHnU0Wn2JQBmmACom6LJD8vaTeGyLA+gVbxQmnlukNnwM8KqdYbfTMU0SSuwbqAz7Yq9kJpysm9gVUo/8Njsrbt5haN45EnytMi6DHxomP9xPv5r8DA8fp7XX23xob+pdSrdZ7nolRHj1qDR0XzgpicIJr3MjlDvtLKqvLbFw2i4Gqydw+UvR1sW6pHv6Ft0ls17pitohoJkkhLWzWX173VDkhMYwP6T+BkPptQYnLU9ajujhaZpnAjLcApm7OSYJjaBRYlc3W0cgOhY1TwtLnpak7deeyaH9+20ciMFm1Sq8PnicVMuvbC6bT2bpqALWXZ4v0b0eebvyveZyV9XYPzgUvAuk/uZnLNv5mQHtIX7z8a2QhG9XcuQYFtQUdI0A+2CvJJOkSvdCU9f24EXEw15OwVYr92NgzT6btvZcyp5dCje1ON1YhwmA7OQFiWmktwEjUQq3A12yk4e+wn5vuDnGM1RoQzbNK2MGJ2C06CYfFo4TnSiQV1mieuBosZKESL61e2K/uYUXPK2qMTQ0D2w21Q6bXODPp4syJ/NQ= X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:SJ2PR11MB7546.namprd11.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230040)(1800799024)(52116014)(376014)(10070799003)(366016)(18002099003)(56012099003)(22082099003);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 2 X-MS-Exchange-AntiSpam-MessageData-0: =?us-ascii?Q?Id5k8cyvHIzaX221q3vUs7MXUnjVOzRtyyqgfrq2OVXzfPWDi1e17FH+BKwj?= =?us-ascii?Q?XYS3K4oCRgGeXnflPfa5Cpya4Kzptem2EJtOjuPIjNa3e73/19ur3a38cHhK?= =?us-ascii?Q?BCp3ltqr4Y7bCgKE4bUvklHNrBBgR+GWEM7xM9Z/BFqKSM+E3HPwkZtxFirQ?= =?us-ascii?Q?9r3uKe5LLVTZAqyedAdsPKiu5lSGvwGXSA4u07BwcxD8iiKG0XlKrXU6iAQo?= 
=?us-ascii?Q?uv2Tpi4NokaTQwXhJt3HvDCJeraKaOqAJjhPuNkKDz/YjQ9Pc+LC79ZP2zhh?= =?us-ascii?Q?S0gsONmwwih88qfz8JFFMbIKN2pDDoGFFMBWR3GT+WbVxxT9tDkjFIKIYGW7?= =?us-ascii?Q?5svRyD6GQCXsRo7ffMMN3z/Me7X9xf+QkG+MAvlhvxpVRcCZv6Y/Y+VYNE8g?= =?us-ascii?Q?OVDHxkm7KyV8osX1NJk8j3cgHUagzF3x3DORqLMUpEjBnu3phUlHHEysw029?= =?us-ascii?Q?fSibu7kLcAmcGhSI3A2CyUnN2sZscW4M8bfi0yxi+tMN+B3gphDtTcYfjVS0?= =?us-ascii?Q?Cq2/lvGXtCQmChHBDrQNL8DV9BnCis9Ql9QqZBm6Q9P0ng7DdsP3X1kXilXo?= =?us-ascii?Q?5bej9AhfGXzNJ1zscxUMnk4szBquPnCbtuM2vlAE9meeQ1dj73it8nOUTQoG?= =?us-ascii?Q?ZxsgeWJy5aFSIy6RldHMsLcvfqUvNwmPoy2L8V8TFZDaqTtbgUGQt909VnUH?= =?us-ascii?Q?G6pUEKbWLALQtBQSt8tzguOxoV9ascLV8baCD46PiiNMWA1PsDurp/NBvb5p?= =?us-ascii?Q?6umMSc7qP2LmCeisDZz62Bqbjb8AJqtSs9gZIJaOLBd34KZgouu2wPDLHduB?= =?us-ascii?Q?2WslQLyLUVVok1Utlcn3ZDqnae2fFRA2MfQt6bbnSAT2lovKt6ro5j2i/aQL?= =?us-ascii?Q?3I0fLlWhOJFql7Lc1Ug8FLmBURGSWVlZZhtLLc+FurrEWNTyvi8cRfa+dsWg?= =?us-ascii?Q?Bsf3lZt3J69CsNeNqaPFprISqRT5ELsmHHscZVyF0AwGMjaIrFYuCHRW81n5?= =?us-ascii?Q?iJYPvJ0ySDjOTwwo86LecBDHusTEUzos93glBnyCoQCZMoa5v8tI3H5HPB0g?= =?us-ascii?Q?4QR9fkMz5cDvnT3bor6oxZpbjCu1E67XjJR1zumSiK9VQDzZG4iXzKAbOQM6?= =?us-ascii?Q?A/Uoo1brjgUVdncB4tHEbLxK9hwsn2XDYtRNmUHvaOmMI9Uddc3BPfGvHzzZ?= =?us-ascii?Q?cyfj1zw9ftuOq9pJytPSyQHwPFtRoseiZWCCRtoLRlduUM3b+x9eabgolqdw?= =?us-ascii?Q?LQ8D8ZuoMOITZkdBkaCMQm24xe/WLMJMvc3nSUL7f9a9NncZJ1RFOSyPC5zG?= =?us-ascii?Q?IyJ8O8TA15aqMg+Ll8t/JjMGh+s6YRfGZij+sfAeMKjeYW/UCs1nojFAP1kg?= =?us-ascii?Q?q5bO8PO+wiSNSIW2xo2n1698D2N6ukFnMelPZa2n0amdPRI0/vtEzJqObkeU?= =?us-ascii?Q?3A/Av533TIlYNp/HrXoH8tYoiA8cPPFhv4e/AwgAR2ShGo2QhDalkG061KfS?= =?us-ascii?Q?hYuNT8y6Me84AszoSmEUkGIICEfNtHJ+8ZViw/pjp+QoPh15UIgzTOHNToxp?= =?us-ascii?Q?5ggOGT3jYaOMozDQiMdBSR9YOXWDID2rKA3tkGO7VBb8Xcx17AwSc3+ArJbc?= =?us-ascii?Q?9yogJuTcaHzxVNc5xijzyOmGW7Lc2VaQKV9HV/rzSKgHnrxQHdTXsCxhkCw4?= =?us-ascii?Q?7TfLme7y2+PIcnATfGaXAiZrKGHje13wGv3xi3JUX3ClVNhM4efZ7L3Iqvn/?= =?us-ascii?Q?DUpcrYQmMGS1hnTCaVq5gB0r5wZY9VWaDVMaGkpH+2F2M3Lg+PXfyUXrpSxp?= 
X-MS-Exchange-AntiSpam-MessageData-1: 7kPc3Gz++7CcT8huR1gjxdlISTdk96u1h7Q= X-Exchange-RoutingPolicyChecked: QFng6/sNlp8HsLIm2cmBnxN1t/sQCBYRTqwHdvnwX2CuYKp0Zggb+/BY1rRqt2DrYrcA4QGGvgv/5xfz8MiRWOrGa+nQ6DcK9YN2nSp4+XRVUSFo7knaaSB61tG51EbITd598A8//yMj1QJ/MD55rfOqHZeH/mKEZ0SyXsJUSyNaufUjJrWpKof2VaOQSZe0mW9I0qTbyDpFDHwZBKOhh4HRa/hxFNAAw7J/9buVh5I61qaMOyVvPMW1PH7qgoWEEmlNuc+Jvrv4JIRSqSYYJVIo5eNlXXxsXN6+5cmL3L8r1CcE8WDm32+4P6KXtMfvXie4QDkCl3A4hgQpfUTJSg== X-OriginatorOrg: windriver.com X-MS-Exchange-CrossTenant-Network-Message-Id: 2023103a-cc60-41f8-be41-08de800fb991 X-MS-Exchange-CrossTenant-AuthSource: SJ2PR11MB7546.namprd11.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 12 Mar 2026 08:16:56.0871 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 8ddb2873-a1ad-4a18-ae4e-4644631433be X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: tL9/axBafLvIwEX3DYbSRSt1migfuVqhOxmo1JR2jv9ljG406gJB6gK2GAFdtPrsGpAzjJN0J0rGvhGcvCGRKDrmX836Yi8IUVn7m3tR5Jk= X-MS-Exchange-Transport-CrossTenantHeadersStamped: SA0PR11MB4720 X-Authority-Analysis: v=2.4 cv=ePAeTXp1 c=1 sm=1 tr=0 ts=69b2767a cx=c_pps a=w9jTARCbdo3hRO6w7Rsibg==:117 a=6eWqkTHjU83fiwn7nKZWdM+Sl24=:19 a=z/mQ4Ysz8XfWz/Q5cLBRGdckG28=:19 a=lCpzRmAYbLLaTzLvsPZ7Mbvzbb8=:19 a=xqWC_Br6kY4A:10 a=Yq5XynenixoA:10 a=VkNPw1HP01LnGYTKEx00:22 a=bi6dqmuHe4P4UrxVR6um:22 a=HK-ge7EqtdluswH-FwHe:22 a=t7CeM3EgAAAA:8 a=lyTWyWsNyHauK_dbpokA:9 a=FdTzh2GWekK77mhwV6Dw:22 X-Proofpoint-ORIG-GUID: H_VhTBafzkqZvb9n2g1-eL3mGZ2lZRaB X-Proofpoint-GUID: H_VhTBafzkqZvb9n2g1-eL3mGZ2lZRaB X-Proofpoint-Spam-Details-Enc: AW1haW4tMjYwMzEyMDA2NSBTYWx0ZWRfX3gO+ifuUjDNl osQxjrr+UsLzHdqSoJvsks10R281I06fv3YXLAM0YBmJu1tfjkajM5ShOGVrJhDykFSOFhoGrT1 gsuc9ZoyOFEXf1OxvcTNohAU2Vq/Zqp1CnsHnBJuZOVOGB0nqVd2WxgmiOVrn0IBo0+SiZzhmMZ RyUawSuzFppei9QXvcCXbQG6e6hBVh+lrojadp9KNvVzrexUISDWm9fqAy39X5d3AwDToJXx0Y/ 
EFaSsDJ0SPHP6xIHsdd7N5nAObhJWcue6Qzyu6FgmErNROZqe3ekAZ2H97t5qknhEetF+R3jl2s FQXr79rTtj3a/bTibfzThQCZnt+ZFFDAg1GEJzqOaMDO3qRnLRs5LypwGcEhK7gEmmyNaYmJW9K D0B0bNxrGZIK8Iuhq5N4BKKghu1ULEckfmZ09C0E4OKdl6FQNKpsLRAOp47nm47GKSwl4I2ZR8S njp2LnmB3Ha8AJNBlZw== X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1143,Hydra:6.1.51,FMLib:17.12.100.49 definitions=2026-03-11_02,2026-03-09_02,2025-10-01_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 adultscore=0 suspectscore=0 impostorscore=0 spamscore=0 lowpriorityscore=0 clxscore=1015 priorityscore=1501 bulkscore=0 malwarescore=0 phishscore=0 classifier=typeunknown authscore=0 authtc= authcc= route=outbound adjust=0 reason=mlx scancount=1 engine=8.22.0-2603050001 definitions=main-2603120065 Content-Type: text/plain; charset="utf-8" From: Ionut Nechita Following the same pattern as ceph_start_io_write(), make ceph_start_io_read() and ceph_start_io_direct() killable to prevent indefinite hangs when waiting for i_rwsem during MDS/OSD unavailability. This completes the killable lock conversion for all ceph I/O start functions, allowing blocked processes to be terminated with SIGKILL instead of hanging indefinitely. 
Signed-off-by: Ionut Nechita --- fs/ceph/file.c | 27 +++++++++++++++++++-------- fs/ceph/io.c | 28 ++++++++++++++++++++-------- fs/ceph/io.h | 4 ++-- 3 files changed, 41 insertions(+), 18 deletions(-) diff --git a/fs/ceph/file.c b/fs/ceph/file.c index 01e4f31b1f2f3..c828552d51920 100644 --- a/fs/ceph/file.c +++ b/fs/ceph/file.c @@ -2122,10 +2122,15 @@ static ssize_t ceph_read_iter(struct kiocb *iocb, s= truct iov_iter *to) if (ceph_inode_is_shutdown(inode)) return -ESTALE; =20 - if (direct_lock) - ceph_start_io_direct(inode); - else - ceph_start_io_read(inode); + if (direct_lock) { + ret =3D ceph_start_io_direct(inode); + if (ret) + return ret; + } else { + ret =3D ceph_start_io_read(inode); + if (ret) + return ret; + } =20 if (!(fi->flags & CEPH_F_SYNC) && !direct_lock) want |=3D CEPH_CAP_FILE_CACHE; @@ -2278,7 +2283,9 @@ static ssize_t ceph_splice_read(struct file *in, loff= _t *ppos, (fi->flags & CEPH_F_SYNC)) return copy_splice_read(in, ppos, pipe, len, flags); =20 - ceph_start_io_read(inode); + ret =3D ceph_start_io_read(inode); + if (ret) + return ret; =20 want =3D CEPH_CAP_FILE_CACHE; if (fi->fmode & CEPH_FILE_MODE_LAZY) @@ -2357,9 +2364,13 @@ static ssize_t ceph_write_iter(struct kiocb *iocb, s= truct iov_iter *from) direct_lock =3D true; =20 retry_snap: - if (direct_lock) - ceph_start_io_direct(inode); - else { + if (direct_lock) { + err =3D ceph_start_io_direct(inode); + if (err) { + ceph_free_cap_flush(prealloc_cf); + return err; + } + } else { err =3D ceph_start_io_write(inode); if (err) { ceph_free_cap_flush(prealloc_cf); diff --git a/fs/ceph/io.c b/fs/ceph/io.c index f9ac89ec1d6a1..7bd57de2d9681 100644 --- a/fs/ceph/io.c +++ b/fs/ceph/io.c @@ -47,20 +47,26 @@ static void ceph_block_o_direct(struct ceph_inode_info = *ci, struct inode *inode) * Note that buffered writes and truncates both take a write lock on * inode->i_rwsem, meaning that those are serialised w.r.t. the reads. 
*/ -void +int ceph_start_io_read(struct inode *inode) { struct ceph_inode_info *ci =3D ceph_inode(inode); + int ret; =20 /* Be an optimist! */ - down_read(&inode->i_rwsem); + ret =3D down_read_killable(&inode->i_rwsem); + if (ret) + return ret; if (!(READ_ONCE(ci->i_ceph_flags) & CEPH_I_ODIRECT)) - return; + return 0; up_read(&inode->i_rwsem); /* Slow path.... */ - down_write(&inode->i_rwsem); + ret =3D down_write_killable(&inode->i_rwsem); + if (ret) + return ret; ceph_block_o_direct(ci, inode); downgrade_write(&inode->i_rwsem); + return 0; } =20 /** @@ -138,20 +144,26 @@ static void ceph_block_buffered(struct ceph_inode_inf= o *ci, struct inode *inode) * Note that buffered writes and truncates both take a write lock on * inode->i_rwsem, meaning that those are serialised w.r.t. O_DIRECT. */ -void +int ceph_start_io_direct(struct inode *inode) { struct ceph_inode_info *ci =3D ceph_inode(inode); + int ret; =20 /* Be an optimist! */ - down_read(&inode->i_rwsem); + ret =3D down_read_killable(&inode->i_rwsem); + if (ret) + return ret; if (READ_ONCE(ci->i_ceph_flags) & CEPH_I_ODIRECT) - return; + return 0; up_read(&inode->i_rwsem); /* Slow path.... 
*/ - down_write(&inode->i_rwsem); + ret =3D down_write_killable(&inode->i_rwsem); + if (ret) + return ret; ceph_block_buffered(ci, inode); downgrade_write(&inode->i_rwsem); + return 0; } =20 /** diff --git a/fs/ceph/io.h b/fs/ceph/io.h index 94ce176df9997..9432b8b607650 100644 --- a/fs/ceph/io.h +++ b/fs/ceph/io.h @@ -2,11 +2,11 @@ #ifndef _FS_CEPH_IO_H #define _FS_CEPH_IO_H =20 -void ceph_start_io_read(struct inode *inode); +int ceph_start_io_read(struct inode *inode); void ceph_end_io_read(struct inode *inode); int ceph_start_io_write(struct inode *inode); void ceph_end_io_write(struct inode *inode); -void ceph_start_io_direct(struct inode *inode); +int ceph_start_io_direct(struct inode *inode); void ceph_end_io_direct(struct inode *inode); =20 #endif /* FS_CEPH_IO_H */ --=20 2.53.0 From nobody Tue Apr 7 18:03:28 2026 Received: from mx0b-0064b401.pphosted.com (mx0b-0064b401.pphosted.com [205.220.178.238]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AD0C538A72A; Thu, 12 Mar 2026 08:17:03 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=205.220.178.238 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773303426; cv=fail; b=JrnXk5Q1EkqBs0bZ9fwmN1rNatKMhhdMs55wy7e2ActHeyPbU3f0QBHlBwXmbmbRq3tI5q6K7gQrNSXzWxabXBRcN3GBd6P4ySMpvLBzsx/0TcC/oHlMcsVzaFQgrN3xMj8Njd5gmTXlTe1cLa4zRR8NsDFHrIZIU3R7fAQDyI0= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773303426; c=relaxed/simple; bh=787lSHR8EbukgEKrHNqjqkMatgxFgftfqqwDu9n+PXA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: Content-Type:MIME-Version; b=tLiJW1qLVB8WjOaa4SxpAZAs43k7SycEYFPhsfirWCG9Ywckj7PWGQ7wT1mIvhs42knYz2SnhtUlpGOqpTx7YA1nVN3b/RO1yqe87BFd9nSBjcps5GwwZ7qm8HXAQLTENuo9hxbeIoXq/6f+cEHOPVMKcyHdC+czLz7eKp3l2sA= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject 
dis=none) header.from=windriver.com; spf=pass smtp.mailfrom=windriver.com; dkim=pass (2048-bit key) header.d=windriver.com header.i=@windriver.com header.b=fmsHw1bJ; arc=fail smtp.client-ip=205.220.178.238 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=windriver.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=windriver.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=windriver.com header.i=@windriver.com header.b="fmsHw1bJ" Received: from pps.filterd (m0250812.ppops.net [127.0.0.1]) by mx0a-0064b401.pphosted.com (8.18.1.11/8.18.1.11) with ESMTP id 62C7Xgaf3253345; Thu, 12 Mar 2026 08:17:00 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=windriver.com; h=cc:content-transfer-encoding:content-type:date:from :in-reply-to:message-id:mime-version:references:subject:to; s= PPS06212021; bh=3nk4Fo6pDwtGSRwcunzvqVLCCZcjKEuii/3fjPnCl1s=; b= fmsHw1bJAa7wo4QGzUkK8AqXE9BrVGcWDELlisyGcC2/7E4OHfOoI+Kzia53imMe tSHSz7YFpeO37EKt4dmt/gG0kRUTBBl9E2tZHQP/SFcpaLVVRSLsrhmOrXEkR1Yf WPtiMBHfhrmW/mxJnjnaJVKJ+2j7tgxROukRMDStQm3Tdt8XrzUWtWKSIW6Jup5A guNQEyPZCE7kCZH/ZrYjRQ01pqvIRbkuyyGYAlncEuvszgQ/7pjv0F0zubENfJLw GcrjbDKlRm4JdMFUwOoGWaHGRIQcaOb1k9PNQJtYEEIs/jBMUFtrwU4zSe/yi2++ 3aDzM/42MCcxirM1YtUHyQ== Received: from mw6pr02cu001.outbound.protection.outlook.com (mail-westus2azon11012045.outbound.protection.outlook.com [52.101.48.45]) by mx0a-0064b401.pphosted.com (PPS) with ESMTPS id 4cuh78gdw4-1 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT); Thu, 12 Mar 2026 08:17:00 +0000 (GMT) ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; 
b=SJT8MPAOabqnjb+cawwswfm7vQnlMeifH7H08MTXXV0dfaW8bcmdVi1GBDW6vNrPfEfv6jE3AgYm43aXHMF8400DY/BHg2tfsAayvChVjg0vbb6m9QJNl9OyCqY2sutja4XR49mvMEVmoT46e9xDge+V7lyJ6BYkrn6kAnWlya6pc52fBwDRYAq8uxvzL/D9ZwfKvRNNGYBaE9O+7jTcyEDAvsJoDlXGhGNOuedxMpwSpaMS0Np0hZL+RfmcxU80cd3zgnL0yUdXcpiY92eljx6Lf4DtRLuqKzmX5qh6dkVw1UIUJJBqLHWoJYQFCBZoF4ZZD1wZrNOUEHpIhYq0eA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=3nk4Fo6pDwtGSRwcunzvqVLCCZcjKEuii/3fjPnCl1s=; b=coqi7Xp5CHebYt4lkf/TYrIiWSL11bOaUo6aRJtHBEAz59L50fikr9X4VBRw2Gt1Asff+dAac73pQwvTsxm9EFWxpXzsz7kXZO1yLup/uUqZifXBcJPGeo/kTKFRYIhgyKL1JfIyIjaE4ZBrxHvYDOlmDvd3ouXR5R+YOUijvUn4ei7CMt1u7lTiWCJ0WyznIl4XGF647eY5+y9Iw+JFEzmnQ7xY+DQX+kv5cr+BZMpSbkDrVlc/1QeJI9qnGvO0+m44Rkg6mDNPgS+/nORV1JtuRgqDeBSdR2CsIdInavG67dSjsmC2J7088FftXRx6sBUGe7nNm+4PibQj8RimTA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=windriver.com; dmarc=pass action=none header.from=windriver.com; dkim=pass header.d=windriver.com; arc=none Received: from SJ2PR11MB7546.namprd11.prod.outlook.com (2603:10b6:a03:4cc::8) by SA0PR11MB4720.namprd11.prod.outlook.com (2603:10b6:806:72::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.9723.4; Thu, 12 Mar 2026 08:16:58 +0000 Received: from SJ2PR11MB7546.namprd11.prod.outlook.com ([fe80::ca9b:dcf:8881:bced]) by SJ2PR11MB7546.namprd11.prod.outlook.com ([fe80::ca9b:dcf:8881:bced%5]) with mapi id 15.20.9700.010; Thu, 12 Mar 2026 08:16:58 +0000 From: "Ionut Nechita (Wind River)" To: ceph-devel@vger.kernel.org Cc: idryomov@gmail.com, xiubli@redhat.com, linux-kernel@vger.kernel.org, ionut_n2001@yahoo.com, Ionut Nechita Subject: [PATCH v1 10/13] ceph: force mdsmap refresh on persistent MDS connection failures Date: Thu, 12 Mar 2026 
10:16:16 +0200 Message-ID: <20260312081619.40854-11-ionut.nechita@windriver.com> X-Mailer: git-send-email 2.53.0 In-Reply-To: <20260312081619.40854-1-ionut.nechita@windriver.com> References: <20260312081619.40854-1-ionut.nechita@windriver.com> Content-Transfer-Encoding: quoted-printable X-ClientProxiedBy: FR2P281CA0085.DEUP281.PROD.OUTLOOK.COM (2603:10a6:d10:9b::12) To SJ2PR11MB7546.namprd11.prod.outlook.com (2603:10b6:a03:4cc::8) Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: SJ2PR11MB7546:EE_|SA0PR11MB4720:EE_ X-MS-Office365-Filtering-Correlation-Id: 8386b13b-1436-4774-db47-08de800fbad3 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|1800799024|52116014|376014|10070799003|366016|18002099003|56012099003|22082099003; X-Microsoft-Antispam-Message-Info: X6x8+CCYAsUafGzmpxNyqQPwdmWx+aKczvklW0xqtKGCK3OyBhrMnL+S++r8TPGbJJ4Wv0uhJAgga6l4J2kLFV/wpCYxUC7+y7Fq559yY6J7hEOOTEOFLilrKNST8ubfgj7kAQuLXZHzWmkxTUqQkLklxEgBghKf0n1/uUK4kBdiCnPrYafvPmnznwE5NkBqg3uq3cnRpGdlZsLdW7Lxz2MN8pYKLpTGBXyqUQhcEyuF+0+PJPgkfdMNPVVM6JIAHl42AXZYSu9OLJG+4+nU547z7safHtTI+kYvi6tsS8dPv8YIbPcMKQPjOI8AAdfiEDnoJOSJn/U1WJG/n9fHNPaUk+DlT4Uk15kgm//c1Y30L2+Zk4Y9pMN7QhFCfhX1H82HDFSf7XyC0R7ExCc8OuRVkHcQG0c8OjTSvNfkehdrOEXD3XDvlMSGv2eLv0Kq1YMyPQG+F4vH1xixwWR3VQAK8DbNN3OiJ+YIAZx0NuCLhzpSmfYHl3p+Bt7OPf4H6/+8BsX+omn9SBlLrRBQdMBavpxo4x9jV6fB7zf5WNMDslODColsnu98h7iylMUObChmCEBSAvrIZrewbqXq8nWi555SwMiRyeRp6+1h39GDhrtgF87/kWSgWyWq58v/DJJkdPgWw+k5GLC/zUeEAbErLQTJrhQz46go3rcWoWucZzcvMi65QzcFlNiGjcDUDienqM/ewTG9Rer0dgntqb8XWNZSvm1TzpHwCfYybBo= X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:SJ2PR11MB7546.namprd11.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230040)(1800799024)(52116014)(376014)(10070799003)(366016)(18002099003)(56012099003)(22082099003);DIR:OUT;SFP:1101; 
X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 2 X-MS-Exchange-AntiSpam-MessageData-0: =?us-ascii?Q?GdSd1GNSWUBcnIXsTWTOlVc33F+ewbVk1G5qOxKgInKdlsiEQzMdon/tIDEn?= =?us-ascii?Q?Fogu0KTU6O3G/zg602fsPkRpcYpVksh7ZeFtItKXCQ40JlFYgGDq0DnMSmnI?= =?us-ascii?Q?Hxnbz/xG+fnmBU+f1fEwJ85ZlgTlPXvPUVKK5wjk1MDHPZfAm0EjvX2URsJh?= =?us-ascii?Q?0VTe9hm4B3/ycsFWlBZyjxjQF9Yr9z1hepg2kkf7L5X7PIxVW8oXw5qUDUGL?= =?us-ascii?Q?kJ0sY5/5SXghhmA9EDUK2QLW4XCVpcSqBXG6+qp/Heo+ZMTxQcQkIqAz+b17?= =?us-ascii?Q?0cB1kWmG6+c7bofzf6xfolfm0peL66mO0SE1GoQbbxaKju/J608zvVKBLUoP?= =?us-ascii?Q?ZBuOqnAmD6KBBKZcBZfdprI5srMiod9L2ohtYmRZxFQXAQVhFo82Z9Pgh+6k?= =?us-ascii?Q?htvGZ3ZrLaMP+CcTKRMHzoH+gQPGzzPsUIW7suD7oQPbkhXz0UXzjCg0s2NY?= =?us-ascii?Q?Za3VPVfjN3vptdBRfvoeex1vg4SPQSRSbwbo5DY5AxMC5si7LG8GnfJvNWpq?= =?us-ascii?Q?q++V3RZH+M5v4XR8I8nCcKxki5aHcrs3FI8e5qeAAbqG2OnM/afFBmZeId6t?= =?us-ascii?Q?dsOejO136wyBfp6C0L3y6loV1U87MRJMQ5StWg4jykpY5GSi7I8hkYFJJyLM?= =?us-ascii?Q?Qrp7Lo5ZduC31V0n6wF1Ah3GOOjetxf6aOTUTr9vIeLxbjZyy/DcKutOmGw4?= =?us-ascii?Q?mPHOk32y2SGZ58kltNAG/iz9gTQLw+yM6DmQ25gpDA6R3Y9o6xNp1IhBvITO?= =?us-ascii?Q?orR9m4ju6M68uRMTcoYWfPWTGyO2i24yEA7kS0o05/CCf5Gve2u7ulIhE/tg?= =?us-ascii?Q?2CHerL0Vs3AQXf0lnFqA+kR4c2G4/IQW5ohNzZjr/yefa6Q1GVuTL0iCA5rF?= =?us-ascii?Q?uzYClwyBZpdeF/OXDFjEUae/KsHYSJotPfrTos7YaQ03cB/gSvP6iLZjRQag?= =?us-ascii?Q?2Q2AVBqy82rZ1+GsOyzHDS8OG7daRMsKgC5wBrbDDPB3Q4hh8i82bX6fln20?= =?us-ascii?Q?P/7WtXHazgPsrpIoNDnRLdbFSAgZrWTnrr/rL0Sjhc+3kvfrBI35L8FnTybV?= =?us-ascii?Q?F2P2ChQwVLwKTouraitzNyxbodKI8C6LSd/FLmJblQ/GpBKANz4rgQ+wZ3aB?= =?us-ascii?Q?nrwcTCoMcUgnMphPpZ5htMz1zvH//UxUvXJYyi+H8yT1GJDR9U0m4tgAtInm?= =?us-ascii?Q?9z/zHHnA2a5WFXWdUvGlL8Hcf25HgCftdBVXsKSwpyenM2Ay9y17URkac8cy?= =?us-ascii?Q?nPEU9xAR/4r0lB+TrR72gnbAhO/HP9NPbBWzPYDhMPdGXZnchi/P6CD4rtnL?= =?us-ascii?Q?VXFvl2AXurFODjFBbRhroJSJkFT4ZRBS0Tw2V/TDQQ0MQPrfqp6ZLkFg6zJN?= =?us-ascii?Q?aHUr7wZhiYy5dQ0AaSJa91154pZNUqZxQJ2tuH7zY7ckMI5iX3NG5CJH/tXG?= =?us-ascii?Q?2X9lH5vrj4bmBz5W9DnboPiHK00JCo1I0IpA1kfgs7q+1MnuleTVndF7tGgF?= 
=?us-ascii?Q?gw0eAD1JDnoPWqSR4dhAsmZ6vglBR/ilmzRS+qAID1O94N0s1DFYN626uCqz?= =?us-ascii?Q?B63ghdEcUGZdGtqziJfnMstdOb0sTy+jytkxe4OnOQQnrpdGwz9Oy/L26FXi?= =?us-ascii?Q?yyPOd4BUNB13ce6Eh9XfvE+Ki1DiAG0sGEutOR2Pw2uPZca4pO9K/ZXyy18c?= =?us-ascii?Q?RUpdOOk/8Zr05MrHn1fUVagQZr+Lbt1kHosDtT24bBtVHEIEAi/r80+1fUWk?= =?us-ascii?Q?H3AAgN2Tr1K9b4335smys/9mpa0PEPKtOYlwV0n7QUXXOkhDxxeub9e4VqyY?= X-MS-Exchange-AntiSpam-MessageData-1: kXSILegXzOfSGE7xTJgbCaMTJRom44O7YnY= X-Exchange-RoutingPolicyChecked: RqVwmleNdkFZH6STJyy9Ov9qRbZhbD9UyePR8iC9KksBfnJeoWqGru7ird5DIKT3m4jfLafdBqpJyuA3SPBNrGpaPmrFqQ4QglvAdQiqJaWtdxx+2BI7zgBcV7l13s1K/qIOUe18xCyfdrHi/Rvixs94ILTgI8g5lvMiOR6UTBiB+dMXwI0KwGYGG9HHzRbtLC1tNLKxQGscgoSMdACCg1t22YyIa4VDcBalEp0JXg6TYva65i8gTt6HxVcbI3lxHla3g/emkZQpLe2mKhUYvG4xUq5Hy1Jh2qkNqGf2uPIzVddVN8xJTs2OTAjI39g3Yn2BplcfjKafMpL3eoN5NQ== X-OriginatorOrg: windriver.com X-MS-Exchange-CrossTenant-Network-Message-Id: 8386b13b-1436-4774-db47-08de800fbad3 X-MS-Exchange-CrossTenant-AuthSource: SJ2PR11MB7546.namprd11.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 12 Mar 2026 08:16:58.1952 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 8ddb2873-a1ad-4a18-ae4e-4644631433be X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: 3i0HtNAfuiysBMCsittUbNwb5xkZoWyryMIGt4QUqz636kjMq+kkgQtzV42dXNpAGUR/n4sVe9ZTWApcstmpUUU9Si9r+CkaFIbVtvxYsak= X-MS-Exchange-Transport-CrossTenantHeadersStamped: SA0PR11MB4720 X-Authority-Analysis: v=2.4 cv=ALvEU0hV c=1 sm=1 tr=0 ts=69b2767c cx=c_pps a=klJJBKWT8dzIbr5yswcz1A==:117 a=6eWqkTHjU83fiwn7nKZWdM+Sl24=:19 a=z/mQ4Ysz8XfWz/Q5cLBRGdckG28=:19 a=lCpzRmAYbLLaTzLvsPZ7Mbvzbb8=:19 a=xqWC_Br6kY4A:10 a=Yq5XynenixoA:10 a=VkNPw1HP01LnGYTKEx00:22 a=bi6dqmuHe4P4UrxVR6um:22 a=fTW__CHxibyLmBMfj2wP:22 a=t7CeM3EgAAAA:8 a=Ed52Kt3qmj_-iDMSIBoA:9 a=FdTzh2GWekK77mhwV6Dw:22 X-Proofpoint-Spam-Details-Enc: AW1haW4tMjYwMzEyMDA2NSBTYWx0ZWRfXwFM+5YsSEpvb 
Z7irZr6wdvkJicy6Hs3mTylmBPMYEEVsW5HtnEmGSsaAphBsFqdfW1osZSKLnVhgMqTGZGNvH03 usZVjxA65/U89+RxjVmLuGd46H+la8WRZh66BrGYXv9eAy/Yr4BnmnS9JNcndLen23H3l1Bq3GD Msh0YrOL1ye5aAebccd7xpVR2l1a1BxI6yhp0VmP2gST6N6MSB5t8AwToeh/X/IEXU8JyEM5BOy VWd2MK53ArAFRryrr9YucpzkSLLb8B3pCRjXIjT72+wixwLZs1ON1zljPlk9AC6SpagIbTkpCBz Fk9QLzRXBoX+mTheQc3rJVMPR3twz/IlqKtxigmpUiYv/PnqQBdFQ1zKYBDUoM2kDq6RBD1/2ZR IEQEe+lfbLAb06DaajuN7shuZ+yWxategUp4PNXVh5qrG6vXV3R+UQhwOc/9rf4L+kdi0F/vsiS n6YoJ3kCyWfoG2/xZdg== X-Proofpoint-ORIG-GUID: w_xd9cUddHe_Rttcb7P2GCLXs47JUVjk X-Proofpoint-GUID: w_xd9cUddHe_Rttcb7P2GCLXs47JUVjk X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1143,Hydra:6.1.51,FMLib:17.12.100.49 definitions=2026-03-11_02,2026-03-09_02,2025-10-01_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 bulkscore=0 phishscore=0 clxscore=1015 adultscore=0 suspectscore=0 priorityscore=1501 malwarescore=0 impostorscore=0 lowpriorityscore=0 spamscore=0 classifier=typeunknown authscore=0 authtc= authcc= route=outbound adjust=0 reason=mlx scancount=1 engine=8.22.0-2603050001 definitions=main-2603120065 Content-Type: text/plain; charset="utf-8" From: Ionut Nechita During rolling upgrades in containerized environments (e.g., rook-ceph), MDS daemons are restarted and may receive new IP addresses from the CNI plugin. The kernel CephFS client (libceph) maintains a cached mdsmap with the old MDS address and attempts to reconnect indefinitely. The monitor client subscribes to mdsmap updates with start=3Dcurrent_epoch+1, expecting the monitor to push new maps. However, if the monitor connection was also disrupted during the upgrade (e.g., due to EADDRNOTAVAIL from IPv6 DAD), the subscription may not be properly re-established, leaving the client with a stale mdsmap. 
This results in a deadlock: - The kernel client retries connecting to the old MDS address forever - The MDS connection has no .fault callback, so the MDS client is never notified of persistent connection failures - The stale mdsmap is never refreshed because the client believes its subscription is active - New pod mounts via CSI hang in ContainerCreating state - The rook-ceph upgrade cannot complete Observed in production (kernel 6.12.0-1-rt-amd64, Ceph Reef 18.2.2->18.2.5 upgrade): - mdsmap stuck at epoch 53 while cluster was at epoch 68 - MDS session state: hung - monc showed: have mdsmap 53 want 54+ - MDS address changed from dead:beef::...eb75 to dead:beef::...bc76 - Client kept retrying on old address for 30+ minutes Fix this by: 1. Adding a .fault callback to the MDS connection operations (mds_con_ops) so the MDS client is notified when connections fail 2. Tracking consecutive connection failures per MDS session via a new s_con_failures counter 3. When failures exceed MDS_CON_FAIL_REFRESH_MDSMAP (10 consecutive failures, ~2.5-15 seconds depending on backoff), forcing a fresh mdsmap subscription with start=3D0 to get the complete current map 4. Resetting the failure counter when a session message is successfully received (in handle_session) Signed-off-by: Ionut Nechita --- fs/ceph/mds_client.c | 73 ++++++++++++++++++++++++++++++++++++++++++++ fs/ceph/mds_client.h | 2 ++ 2 files changed, 75 insertions(+) diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index ac86225595b5f..0e766880056c0 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -66,6 +66,12 @@ static void ceph_cap_release_work(struct work_struct *wo= rk); static void ceph_cap_reclaim_work(struct work_struct *work); =20 static const struct ceph_connection_operations mds_con_ops; +/* + * Number of consecutive MDS connection failures before forcing + * a fresh mdsmap subscription. This handles stale mdsmap scenarios + * during rolling upgrades where MDS addresses change. 
+ */ +#define MDS_CON_FAIL_REFRESH_MDSMAP 10 =20 =20 /* @@ -997,6 +1003,7 @@ static struct ceph_mds_session *register_session(struc= t ceph_mds_client *mdsc, s->s_mdsc =3D mdsc; s->s_mds =3D mds; s->s_state =3D CEPH_MDS_SESSION_NEW; + s->s_con_failures =3D 0; mutex_init(&s->s_mutex); =20 ceph_con_init(&s->s_con, s, &mds_con_ops, &mdsc->fsc->client->msgr); @@ -4341,6 +4348,9 @@ static void handle_session(struct ceph_mds_session *s= ession, ceph_session_op_name(op), session, ceph_session_state_name(session->s_state), seq); =20 + /* Reset connection failure counter on successful session message */ + session->s_con_failures =3D 0; + if (session->s_state =3D=3D CEPH_MDS_SESSION_HUNG) { session->s_state =3D CEPH_MDS_SESSION_OPEN; pr_info_client(cl, "mds%d came back\n", session->s_mds); @@ -5427,6 +5437,22 @@ bool check_session_state(struct ceph_mds_session *s) if (s->s_ttl && time_after(jiffies, s->s_ttl)) { s->s_state =3D CEPH_MDS_SESSION_HUNG; pr_info_client(cl, "mds%d hung\n", s->s_mds); + + /* + * Force a fresh mdsmap subscription when a session + * becomes hung. The MDS may have restarted with a + * new address during a rolling upgrade, and the + * connection may have entered STANDBY state (no + * .fault callback) rather than generating connect + * errors. Requesting mdsmap from epoch 0 ensures + * we get the current map with updated addresses. + */ + pr_warn_client(cl, + "mds%d hung, requesting fresh mdsmap\n", + s->s_mds); + if (ceph_monc_want_map(&s->s_mdsc->fsc->client->monc, + CEPH_SUB_MDSMAP, 0, true)) + ceph_monc_renew_subs(&s->s_mdsc->fsc->client->monc); } break; case CEPH_MDS_SESSION_CLOSING: @@ -6528,12 +6554,59 @@ static int mds_check_message_signature(struct ceph_= msg *msg) return ceph_auth_check_message_signature(auth, msg); } =20 +/* + * Handle MDS connection fault. + * + * Track consecutive connection failures and force a fresh mdsmap + * subscription when failures exceed the threshold. 
This handles the + * case where the MDS address has changed (e.g., during a rolling + * upgrade) but the client has a stale mdsmap and keeps retrying + * on the old address. + */ +static void mds_fault(struct ceph_connection *con) +{ + struct ceph_mds_session *s =3D con->private; + struct ceph_mds_client *mdsc =3D s->s_mdsc; + struct ceph_client *cl =3D mdsc->fsc->client; + int failures; + + failures =3D ++s->s_con_failures; + + if (failures =3D=3D MDS_CON_FAIL_REFRESH_MDSMAP) { + pr_warn_client(cl, + "mds%d connection failed %d times, requesting fresh mdsmap\n", + s->s_mds, failures); + + /* + * Force a fresh mdsmap subscription by requesting from + * epoch 0. This ensures we get the complete current map + * with up-to-date MDS addresses, rather than waiting for + * an incremental update that may never arrive if our + * subscription was lost during a monitor reconnection. + */ + if (ceph_monc_want_map(&mdsc->fsc->client->monc, + CEPH_SUB_MDSMAP, 0, true)) + ceph_monc_renew_subs(&mdsc->fsc->client->monc); + } else if (failures > MDS_CON_FAIL_REFRESH_MDSMAP && + failures % MDS_CON_FAIL_REFRESH_MDSMAP =3D=3D 0) { + /* + * Periodically retry the fresh mdsmap request in case + * the previous one was lost or the monitor was also + * temporarily unavailable. 
+ */ + if (ceph_monc_want_map(&mdsc->fsc->client->monc, + CEPH_SUB_MDSMAP, 0, true)) + ceph_monc_renew_subs(&mdsc->fsc->client->monc); + } +} + static const struct ceph_connection_operations mds_con_ops =3D { .get =3D mds_get_con, .put =3D mds_put_con, .alloc_msg =3D mds_alloc_msg, .dispatch =3D mds_dispatch, .peer_reset =3D mds_peer_reset, + .fault =3D mds_fault, .get_authorizer =3D mds_get_authorizer, .add_authorizer_challenge =3D mds_add_authorizer_challenge, .verify_authorizer_reply =3D mds_verify_authorizer_reply, diff --git a/fs/ceph/mds_client.h b/fs/ceph/mds_client.h index 695c5a9c94026..44585b1cb4485 100644 --- a/fs/ceph/mds_client.h +++ b/fs/ceph/mds_client.h @@ -251,6 +251,8 @@ struct ceph_mds_session { struct list_head s_waiting; /* waiting requests */ struct list_head s_unsafe; /* unsafe requests */ struct xarray s_delegated_inos; + + int s_con_failures; /* consecutive connection failures */ }; =20 /* --=20 2.53.0 From nobody Tue Apr 7 18:03:28 2026 Received: from mx0a-0064b401.pphosted.com (mx0a-0064b401.pphosted.com [205.220.166.238]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 438BA3B7B93; Thu, 12 Mar 2026 08:17:07 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=205.220.166.238 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773303429; cv=fail; b=ASeHuroKr4QtpEfepQoEm0R8mdC0DNC/tJOLEum0fAhWPclawPWaO0a87sD/lXBJJYisLILu4IWCrQKf6HnIZ9MUUjbBjgshHMPuupsTyR4//PT5M1PmQoXC+ubAbsN8NPAiRkjskxCW3Ou8nneXv2SYcU1B6JPb5c9OotpmIko= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773303429; c=relaxed/simple; bh=QTMTsJqFPnhhbtzWDQwjjtAeLJRZV2x4J9JBjKtq5A4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: Content-Type:MIME-Version; 
b=cjZjP4BQUy5OP/E7sXAsUr7LQ2+uN3zmoY0b167VwgIIvHfG1HPlDkXxANnF5xdGNYX52zhnEPN+juSIG+C1oOenrpmHF0k/Dsi7Ad3pWktjR4ZJCoRcfRIu8oG2bCeqVPftj3fuO7+2oxWt4ywCxBCMTQXazxNDuaE39rgDITo= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=windriver.com; spf=pass smtp.mailfrom=windriver.com; dkim=pass (2048-bit key) header.d=windriver.com header.i=@windriver.com header.b=kFycJ79l; arc=fail smtp.client-ip=205.220.166.238 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=windriver.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=windriver.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=windriver.com header.i=@windriver.com header.b="kFycJ79l" Received: from pps.filterd (m0250809.ppops.net [127.0.0.1]) by mx0a-0064b401.pphosted.com (8.18.1.11/8.18.1.11) with ESMTP id 62C58umV1757210; Thu, 12 Mar 2026 01:17:04 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=windriver.com; h=cc:content-transfer-encoding:content-type:date:from :in-reply-to:message-id:mime-version:references:subject:to; s= PPS06212021; bh=MULjjVSasZ4jlYHjr3zkmwYyqj1Hxm/izbSmYUhtfNA=; b= kFycJ79lslzKYL1nVOelJx5bh3qiLramoGAJhord7zkb0naUHNs7l8XIoYqLFLmo 5F6v2GPLbmuj5npffvdyq+sFiwMIZSC7/eX8EC5YXLhF7gbfJsD+/f+IgluxwRIr 4TyTTknaxRe5TB9KkmHAMxVBA3NQ8CPC0g1nR1bg3rqZswz9aEusjwVn366UXY1S yiW5UP5AWhQD/gh6MaDeW7fxf1WsJjcjLqYCTMLe8aEQbGz2Gg0RAi/nlxAoCrUv OHYbRuhjZnF3fU/XO1j1UXSoDDs7r1XE6CT1f4vjr+kWMkBxW+OyzHd0b3cFoGZe 5rJMcCFoB4H12WjALzpWoQ== Received: from ch5pr02cu005.outbound.protection.outlook.com (mail-northcentralusazon11012046.outbound.protection.outlook.com [40.107.200.46]) by mx0a-0064b401.pphosted.com (PPS) with ESMTPS id 4cuh6t8enf-1 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT); Thu, 12 Mar 2026 01:17:04 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; 
b=hkNQflsS7b5Mr27OriSmP9Ts0CmrIijqzPECbAVmFAn7YMnZiIblxRwA5m5+Nj2WrlCcN7hbipdXQr7oLyS47d/4tgXh5oC2fYnUOpYCi63GO5Jc/uoyLxK6vEOvGEJSXWid7PZBwZt5Lc/QUaseOWp6VXlr4xhQLZklP+cyCmPhSAfdRrjcQRPfLnCOmL98eL20S/LUPavlqZhlua+rsZDBGYJLrxYPz7xk/hQDOhBsqWrLc8YaxT2e+t1lNYtNJcrdgAjl3bupZVG4aJqyqQ6BVpPppk9TZM0cy10+N5ilgMUNzMi/keo8n0jMmlvjb5927zYYKtTfRXI0VjrLTA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=MULjjVSasZ4jlYHjr3zkmwYyqj1Hxm/izbSmYUhtfNA=; b=cwRUYLSs/QYGPMm7FhPGGAhUNlTep11ffyyfwZxOPlyKEwAaaqWVr4rYqyd1Fil+mibc6j9tHLXhHR463HZFgNb30bNLNhD45Tyaa56/n0DepNzurBgetyra7teCI/3r+LmvDBpYCCUDdYDHXRFmFdU79FMUVY1o3yXY1WQQjjP6a1kfY7Jn/hxsrW2C4yh2zgtA9iYW5iq/gwsZg5c5jmWqrkpnURnL9Ka2ar1zVR3E+mmBu8yrVPLxSFvQZ4uoehaEosPFzg9SIcM5LKK89gMy74EdeIqWW1JIOjtMyzUGxsINtFMjQI9sHWmdI5X+6x+fuErQwRhYbR0WWeUkPQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=windriver.com; dmarc=pass action=none header.from=windriver.com; dkim=pass header.d=windriver.com; arc=none Received: from SJ2PR11MB7546.namprd11.prod.outlook.com (2603:10b6:a03:4cc::8) by DM4PR11MB6168.namprd11.prod.outlook.com (2603:10b6:8:ab::22) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.9723.6; Thu, 12 Mar 2026 08:17:00 +0000 Received: from SJ2PR11MB7546.namprd11.prod.outlook.com ([fe80::ca9b:dcf:8881:bced]) by SJ2PR11MB7546.namprd11.prod.outlook.com ([fe80::ca9b:dcf:8881:bced%5]) with mapi id 15.20.9700.010; Thu, 12 Mar 2026 08:17:00 +0000 From: "Ionut Nechita (Wind River)" To: ceph-devel@vger.kernel.org Cc: idryomov@gmail.com, xiubli@redhat.com, linux-kernel@vger.kernel.org, ionut_n2001@yahoo.com, Ionut Nechita Subject: [PATCH v1 11/13] libceph: reset source address on persistent EADDRNOTAVAIL Date: Thu, 12 Mar 2026 10:16:17 
+0200 Message-ID: <20260312081619.40854-12-ionut.nechita@windriver.com> X-Mailer: git-send-email 2.53.0 In-Reply-To: <20260312081619.40854-1-ionut.nechita@windriver.com> References: <20260312081619.40854-1-ionut.nechita@windriver.com> Content-Transfer-Encoding: quoted-printable X-ClientProxiedBy: FR2P281CA0085.DEUP281.PROD.OUTLOOK.COM (2603:10a6:d10:9b::12) To SJ2PR11MB7546.namprd11.prod.outlook.com (2603:10b6:a03:4cc::8) Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: SJ2PR11MB7546:EE_|DM4PR11MB6168:EE_ X-MS-Office365-Filtering-Correlation-Id: ab623bf7-95a1-4dd8-64ce-08de800fbc16 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|376014|52116014|10070799003|1800799024|366016|18002099003|56012099003|22082099003; X-Microsoft-Antispam-Message-Info: J00273C3a9FSsCliUFjfjXx2kb8++eDi+ZQZqe4x//U+3XP1J88CMKpCWfcGxPIOCH3cLaXc4gH3I7B/b2v2mYNWb6GNylb+Y3w6y5CdRUjV76iajxPJvBtfFp1Bwft91a/+lGTDmhEJIrJsj7zjfkGwSVmPXBEf6CgLZrOQEZiNqDPiGY8tj9mlOiWEcJgtdbCJgf45aqA6LhJaFJQyVnOCfLFlVlIOI0VgK7rXKurU4j5HJC0mlobQDSvgzTNMrmW5TU+Y1XLNKo7KeKgJOf6WqOkDDWyxovZuR7ScwZJ+laqCCBN/TFYjCjpD6D2jgSqqEN5esRCVUwNC0QYVM7H+YOkMateX4bq4QVoGNnTCffgUxxyZt4Z7azgXhSfmjOEj13cIfMVAou4gzbX6CqkD8HZAqe2L+ctp5NxcweyhP/0UpA6fqIk+vFZwcj1eD3QCTWDxfjn2xceDTLuE5g34AlAPybGNPW6Vcs02PocFjc5N3K5phLXdZVcMRrUFB680EPIetdaK0j4ImCdatrO1pMggdwE92GnyPwudR1oYizz8W/VclDKvtJwLszBr5lYz0/NV2B+DbetR7nyI2dRm1dqhFHWLnmUaw1//cysNXKH0OsrfGCpI3yT9GBkh3oN5KgEHmtOuyiSCO2yCFUZAPCu3Z7YJcucStyy3IOtW3kKRZNWEBA1hKDVssFv+pS1ffU9/e/HOxvapOj2lkWMlijC4XxZrSxTE8EVjJm0= X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:SJ2PR11MB7546.namprd11.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230040)(376014)(52116014)(10070799003)(1800799024)(366016)(18002099003)(56012099003)(22082099003);DIR:OUT;SFP:1101; 
X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 2 X-MS-Exchange-AntiSpam-MessageData-0: =?us-ascii?Q?ACg+gj+DG4KyMjhjABhD8o6aws5/6dl17KVQGPxJ5BXnLuv1GuVzeVQP/3Sw?= =?us-ascii?Q?pjQdKJnRra3SMTuG6TpWKtZmUQ+pWmV8HCJMV43av+LrA5732VwNZO79uDPb?= =?us-ascii?Q?QElmt7iglplV6cmtX/Sp8nYhW00tEMDSERC+8uZeNE73HBqCJaruNicbcAb5?= =?us-ascii?Q?nSA5FVVmRa1DD0eU6GhbCI0LaC02vCxNJUIYwBswkQJ5C9Zgk74gTLHcoPnL?= =?us-ascii?Q?Vr5llM/Lp/a2NCScgqAUVnAL8bN071YVIqAFDGVRCU0KoQkQZIatLrAtIAen?= =?us-ascii?Q?Qn4xaIIoqpg76WDkmnCie2KsnP8MLxTHgxiSrLASjX4sOOBhGlKrg4pT3+Xk?= =?us-ascii?Q?OiFYYWZxCDL4zsDRhAUqasZha7O4Qxl1GeD95IBDrSL83oOohFBERcSrqCQm?= =?us-ascii?Q?XxaxsconvBI5Q8nF4LdnWyIzV6eekGJpPH62jFPhEWfovUIeIgyMrkm4FIkc?= =?us-ascii?Q?iAOLj7t5mGKxO9kVc67BMhKx0ya8C9V2lRcTx3knzwizvpuMgevUFLFlk64I?= =?us-ascii?Q?mGD2/eP66+53kZgF+6QfTiFlnmq145cKhl82yHUByjT3ggrEHZF8dcQqEprq?= =?us-ascii?Q?iObgEpPDmGZ354l6MSIZ5HU2C62VZWcFbFXQeKmAznE4p0eJ4+DdHOkQkCMv?= =?us-ascii?Q?/DqfD0mpuU471vkKVuQoNmn4oSL0/MHgbXPk907kN8DePq8Kx0Yo8WPDjSs7?= =?us-ascii?Q?SrHrZCtpAxxT6P/MMLDTpGnuxSD93RDaiDQpaJNZEyxQtFmUZq/ulTurKj5s?= =?us-ascii?Q?AwkaHYOiDsy8g6gzTUHJgZG21+vmoJxTWmJtOmDXz+fTt7iyk4sXWoFqZlkV?= =?us-ascii?Q?kTH/MImgTuiotcZATjQIeKKrC07w4W1hFoUBQVObhFHsWQcmD33J3s7LH8ay?= =?us-ascii?Q?i2W0SmIHugCUpDfr5Yb7Cx/AraWRTcKWKDKkXLtNT5CQbfNE8SGn61CKm2NA?= =?us-ascii?Q?DL8xpMpQpVeDGDWt06qhszlWYKNLhIfcJlidesqG/yn4GlSMePsysxxJ+A32?= =?us-ascii?Q?9cqxezImCpXsaTYzibTUK2jHLv4goBPH5cmxhtghDzLUUYLnYWoAMhFgSUHg?= =?us-ascii?Q?Grgv2nQl2ZhBOryzVwuexppItX6fo/6HIJgHKRNza3VpQjrvD3xcGeXqDxdn?= =?us-ascii?Q?HLxkCfr4GZnj69OrS+J9mRTIAW5ymZvgyRsnI7OnL0NXUxBR452HOvWc1fkX?= =?us-ascii?Q?Dga1QQsM0jJVd7VrY7jB9qKRjKxHy2n8HYVYKHUf5UlAvI+9/h9nN2eXyaTU?= =?us-ascii?Q?87VYKUxK5g/mRz6GBKMFMBI9rsFE8ilJRVUGPoMklGJl4vS8bEUl8d5ZEXhP?= =?us-ascii?Q?PW+3sDhOTl12Xy9KW3OED/lfSNcWtIkg522LptOftou2hd1cbjgWB886CS1v?= =?us-ascii?Q?2fiWjo4wsJGqg2LxzhjAFY0x2jP9E0HIsplrtFXGsD/t4+n0saCdPAy2VXaO?= =?us-ascii?Q?31nwzOPkZ1p2pgn+0qCinr+hJ/yGawl/HChlG4TUH4ES/F4zXVgmsI7D+ofI?= 
=?us-ascii?Q?qhQlJSl9F9d1tbBknrdm+i3gL0pxpZ5VbtRupz4zIa1buC+tKWPBUA7Wj7MS?= =?us-ascii?Q?elSzK0dTlHQmvfPjdUbvsiwotZHOp9YLY0cS+qkTJ6bE1amWNpPLm96gUKI7?= =?us-ascii?Q?qjRmXoImnsDvtISl3YqegrJeQa+mItuyjyXNTNlQiy915FDF4W6W+BidU0ql?= =?us-ascii?Q?de7XBOLLKnuyPb+keTkBFRzGJvnAcHUtbetiFCBuKrNDyVHmIs+BzpE3GIq5?= =?us-ascii?Q?syred2H5D+gOmTgIHoMpLAHOS3UEdyoNSjs5sBXv9mUrtWJOgnU69lFPOIKJ?= X-MS-Exchange-AntiSpam-MessageData-1: OKh42+x12j7m9aurblne/UEK71JnMbklwdA= X-Exchange-RoutingPolicyChecked: uix5AwDvWiP1IKKOH5d7EmuAEetgms4e8QKaPHuoR0xWvIYNZstd0niMDvc5cdaDCBLYRQPFkh2hS6SJhrokTUMEDGgipy5qhL4APLjoFug3UC+FWigXAIRTbUjvVIh7wGsmbs6yJwFqaKV53FWPHbQTlZtlhQqk+6Jbo40lTrmZtb3mMxig7t1kTjm+Z49Z4E/WZX6DlU20jogI9gWqKlr7SCAlTEjUzVPhQsrG0uQfvE+XSQxRovbgW7bSisr/oeXtogCYunMQr9HqWBnGMfzBshg9iJ6mLPkU956/uS7ro3rZjQlJFdjbbaRJmmHYdNfbX6ABehP1FIaN4MfuoA== X-OriginatorOrg: windriver.com X-MS-Exchange-CrossTenant-Network-Message-Id: ab623bf7-95a1-4dd8-64ce-08de800fbc16 X-MS-Exchange-CrossTenant-AuthSource: SJ2PR11MB7546.namprd11.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 12 Mar 2026 08:17:00.4131 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 8ddb2873-a1ad-4a18-ae4e-4644631433be X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: taGq56niwXRVctXExtsWM5xS9NvdRXCSQnaj7XbJdy7LAhaVVZy9dVdsYvC36ulZuAxKVF/2IH9q5XHJa6xgVieKKQ8mH5YvZ2gfaOeAw/0= X-MS-Exchange-Transport-CrossTenantHeadersStamped: DM4PR11MB6168 X-Proofpoint-GUID: 7M-VP9_ijJTetRnKUQ9uP-IDkYVW0kl_ X-Authority-Analysis: v=2.4 cv=Cf8FJbrl c=1 sm=1 tr=0 ts=69b27680 cx=c_pps a=KNS8ES/6Vao0xGfhhZwSfQ==:117 a=6eWqkTHjU83fiwn7nKZWdM+Sl24=:19 a=z/mQ4Ysz8XfWz/Q5cLBRGdckG28=:19 a=lCpzRmAYbLLaTzLvsPZ7Mbvzbb8=:19 a=xqWC_Br6kY4A:10 a=Yq5XynenixoA:10 a=VkNPw1HP01LnGYTKEx00:22 a=bi6dqmuHe4P4UrxVR6um:22 a=iKiJcTA2PjBS6x5JeXcw:22 a=t7CeM3EgAAAA:8 a=EdDtN2VJIkaq4EvSNGIA:9 a=LONr8ofnQXjLKK9x:21 a=FdTzh2GWekK77mhwV6Dw:22 
X-Proofpoint-ORIG-GUID: 7M-VP9_ijJTetRnKUQ9uP-IDkYVW0kl_ X-Proofpoint-Spam-Details-Enc: AW1haW4tMjYwMzEyMDA2NSBTYWx0ZWRfXyIliYIrfP5ct Zrrt4cuyQcQVxCq0V5djkHOj+aqJyuIizRccp5bKFeNX36V8Lxn8Q2XMlgSppVNvUg20wQN9YLk 8i9u5F29taT+f+YoYdCQnPI6h0/GJA8KJn302otyUkfBqsXxIGOJE21/zD1aHi3J0uJ1OIJVapv R6jfxu4jKvo7VpQyfBEDOGYRjfFEKozXKlUx9is/a1Z5SOyahMuMP5iSdUdl86MqtT/ogM2YR1l kAKNS2BlzBcDhDlOP78v1HZOBRVZ0/1SUv6wK9ZA5OeGvcov9NsnxIFoAmu1pVc6E+U6Zh8A1WD WR4jgAbI1ymfUOXLzUXRKLhYHEu8fSzjS5fWj7ft244pamASEIYahehXkqx7Q4BOeMp1EY9NicH AjwXCNm1gP6Cj8MGP//BXZ/nlTedkA0gYVAEDnsEZgzOTeosm6GS0sMzzMVyu1CTFynZhe1PZ6U 0jolErnidQFLakPKbsg== X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1143,Hydra:6.1.51,FMLib:17.12.100.49 definitions=2026-03-11_02,2026-03-09_02,2025-10-01_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 clxscore=1015 adultscore=0 priorityscore=1501 lowpriorityscore=0 bulkscore=0 phishscore=0 impostorscore=0 malwarescore=0 suspectscore=0 spamscore=0 classifier=typeunknown authscore=0 authtc= authcc= route=outbound adjust=0 reason=mlx scancount=1 engine=8.22.0-2603050001 definitions=main-2603120065 Content-Type: text/plain; charset="utf-8" From: Ionut Nechita In containerized environments (e.g., Rook-Ceph with Calico CNI), the kernel CephFS client's source address (msgr->inst.addr) is learned from the first successful monitor connection via process_hello(). If the initial connection was made through a transient CNI pod address (e.g., a Calico-assigned dead:beef::... address from a CSI plugin pod), that address is stored permanently in inst.addr. When the pod is later rescheduled or the CNI reconfigures networking, the original pod address is removed and Calico installs a blackhole route for the old address range. All subsequent kernel socket connections fail with EADDRNOTAVAIL at ip6_dst_lookup_flow() before even sending a TCP SYN, because the IPv6 source address selection finds the blackhole route for the old address range. 
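The learn-once behaviour described above can be modelled in userspace. The sketch below is not kernel code: struct model_msgr, model_addr_is_blank() and model_learn_addr() are hypothetical names invented for illustration, and the address is kept as a string rather than a struct ceph_entity_addr. It only shows why a later, correct address report is ignored once a first address has been stored.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical userspace stand-in for the messenger's stored source address. */
struct model_msgr {
	char addr[64];	/* textual address; an empty string means "blank" */
};

/* True while no source address has been learned yet. */
static bool model_addr_is_blank(const struct model_msgr *m)
{
	assert(m != NULL);
	return m->addr[0] == '\0';
}

/*
 * Models the learn-once rule: the peer-reported address is stored only
 * while the local address is still blank. Returns true if it was stored.
 */
static bool model_learn_addr(struct model_msgr *m, const char *addr_for_me)
{
	assert(m != NULL && addr_for_me != NULL);

	if (!model_addr_is_blank(m))
		return false;	/* already set: later reports are ignored */

	strncpy(m->addr, addr_for_me, sizeof(m->addr) - 1);
	m->addr[sizeof(m->addr) - 1] = '\0';
	return true;
}
```

Under this model, once a transient pod address has been stored, a later report of the stable host address is dropped, so the stale address stays in place until something blanks it.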
This creates a permanent deadlock:
  - All connections (mon, mds, osd) fail with EADDRNOTAVAIL
  - The client cannot reach any monitor to re-learn its address
  - inst.addr is never blank again (set once, never cleared)
  - The only recovery is force-unmounting and remounting

Fix this by tracking consecutive EADDRNOTAVAIL failures across all
connections using an atomic counter in struct ceph_messenger. After
ADDRNOTAVAIL_RESET_THRESHOLD (30) consecutive failures (~3 seconds at
100ms retry interval), reset inst.addr.in_addr to zero (blank) while
preserving the nonce and type. This allows process_hello() (msgr2) or
process_banner() (msgr1) to re-learn the source address from the next
successful monitor connection, which will use the current stable host
address instead of the defunct pod address.

The counter is reset to zero when:
  - A TCP connection succeeds (in ceph_tcp_connect)
  - The address is successfully re-learned (in process_hello/
    process_banner)

Observed in production (kernel 6.12.0-1-rt-amd64, Ceph Reef
18.2.2->18.2.5 upgrade, IPv6-only cluster):
  - Client instance: client.55136 [dead:beef::a2bf:c94c:345d:bc66]:0
  - Address dead:beef::a2bf:c94c:345d:bc66 was a Calico pod address
  - After pod reschedule: blackhole dead:beef::a2bf:c94c:345d:bc40/122
  - All connections stuck in EADDRNOTAVAIL loop for 16+ hours
  - After force-unmount + remount: new client got stable host address
    [aefd::2b93:d245:fd09:127e]:0 and worked immediately

Signed-off-by: Ionut Nechita
---
 include/linux/ceph/messenger.h | 20 +++++++++++++
 net/ceph/messenger.c           | 51 ++++++++++++++++++++++++++++++++++
 net/ceph/messenger_v1.c        |  7 +++++
 net/ceph/messenger_v2.c        | 12 ++++++++
 4 files changed, 90 insertions(+)

diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h
index 730a754353aed..d8f7946d85a68 100644
--- a/include/linux/ceph/messenger.h
+++ b/include/linux/ceph/messenger.h
@@ -113,6 +113,17 @@ struct ceph_messenger {
 	 */
 	u32 global_seq;
 	spinlock_t global_seq_lock;
+
+	/*
+	 * Track consecutive EADDRNOTAVAIL failures across all
+	 * connections. When this exceeds a threshold, the client's
+	 * inst.addr is reset to blank so that process_hello() will
+	 * re-learn the source address from the next successful
+	 * monitor connection. This handles the case where the
+	 * original source address was a transient CNI pod address
+	 * that no longer exists.
+	 */
+	atomic_t addr_notavail_count;
 };
 
 enum ceph_msg_data_type {
@@ -328,6 +339,15 @@ struct ceph_msg {
  */
 #define ADDRNOTAVAIL_DELAY		(HZ / 10)
 
+/*
+ * Number of consecutive EADDRNOTAVAIL failures (across all connections)
+ * before resetting the messenger's source address. At ~100ms per retry,
+ * 30 failures means ~3 seconds of persistent EADDRNOTAVAIL before we
+ * conclude the source address is permanently gone (e.g., a CNI pod
+ * address that was removed) and needs to be re-learned.
+ */
+#define ADDRNOTAVAIL_RESET_THRESHOLD	30
+
 struct ceph_connection_v1_info {
 	struct kvec out_kvec[8],         /* sending header/footer data */
 		*out_kvec_cur;
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index c40c7c332e7f4..8165e6a8fe092 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -497,6 +497,10 @@ int ceph_tcp_connect(struct ceph_connection *con)
 	else
 		con->v1.addr_notavail = false;
 
+	/* Reset the persistent EADDRNOTAVAIL counter on success */
+	if (atomic_read(&con->msgr->addr_notavail_count) > 0)
+		atomic_set(&con->msgr->addr_notavail_count, 0);
+
 	return 0;
 }
 
@@ -1663,6 +1667,52 @@ static void con_fault(struct ceph_connection *con)
 		}
 	}
 
+	/*
+	 * Track persistent EADDRNOTAVAIL across all connections.
+	 * If the source address stored in msgr->inst.addr is no longer
+	 * valid (e.g., it was a transient CNI pod address that has been
+	 * removed), all connections will fail with EADDRNOTAVAIL at
+	 * ip6_dst_lookup_flow() before even sending a SYN.
+	 *
+	 * After ADDRNOTAVAIL_RESET_THRESHOLD consecutive failures,
+	 * reset inst.addr to blank so that process_hello() will
+	 * re-learn the source address from the next successful
+	 * monitor connection. The nonce is preserved.
+	 */
+	if (addr_issue) {
+		int count = atomic_inc_return(&con->msgr->addr_notavail_count);
+
+		if (count == ADDRNOTAVAIL_RESET_THRESHOLD) {
+			struct ceph_entity_addr *my_addr =
+				&con->msgr->inst.addr;
+
+			pr_warn("libceph: %d consecutive EADDRNOTAVAIL errors, resetting source address %s (will re-learn from monitor)\n",
+				count, ceph_pr_addr(my_addr));
+
+			/*
+			 * Zero out the address portion of in_addr but
+			 * preserve ss_family, nonce, and type so the
+			 * client identity is maintained and debug output
+			 * remains readable. process_hello() checks
+			 * ceph_addr_is_blank() and will fill in the new
+			 * address from the monitor's addr_for_me response.
+			 *
+			 * We preserve ss_family so that ceph_pr_addr()
+			 * shows e.g. "[::]:0" instead of
+			 * "(unknown sockaddr family 0)".
+			 */
+			{
+				sa_family_t family =
+					get_unaligned(&my_addr->in_addr.ss_family);
+				memset(&my_addr->in_addr, 0,
+				       sizeof(my_addr->in_addr));
+				put_unaligned(family,
+					      &my_addr->in_addr.ss_family);
+			}
+			ceph_encode_my_addr(con->msgr);
+		}
+	}
+
 	WARN_ON(con->state == CEPH_CON_S_STANDBY ||
 		con->state == CEPH_CON_S_CLOSED);
 
@@ -1740,6 +1790,7 @@ void ceph_messenger_init(struct ceph_messenger *msgr,
 	ceph_encode_my_addr(msgr);
 
 	atomic_set(&msgr->stopping, 0);
+	atomic_set(&msgr->addr_notavail_count, 0);
 	write_pnet(&msgr->net, get_net(current->nsproxy->net_ns));
 
 	dout("%s %p\n", __func__, msgr);
diff --git a/net/ceph/messenger_v1.c b/net/ceph/messenger_v1.c
index 0cb61c76b9b87..4f3868f296c06 100644
--- a/net/ceph/messenger_v1.c
+++ b/net/ceph/messenger_v1.c
@@ -736,6 +736,13 @@ static int process_banner(struct ceph_connection *con)
 		ceph_encode_my_addr(con->msgr);
 		dout("process_banner learned my addr is %s\n",
 		     ceph_pr_addr(my_addr));
+
+		if (atomic_read(&con->msgr->addr_notavail_count) > 0) {
+			pr_info("libceph: re-learned source address %s from peer %s\n",
+				ceph_pr_addr(my_addr),
+				ceph_pr_addr(&con->peer_addr));
+			atomic_set(&con->msgr->addr_notavail_count, 0);
+		}
 	}
 
 	return 0;
diff --git a/net/ceph/messenger_v2.c b/net/ceph/messenger_v2.c
index bd608ffa06279..12ad9f571dcca 100644
--- a/net/ceph/messenger_v2.c
+++ b/net/ceph/messenger_v2.c
@@ -2260,6 +2260,18 @@ static int process_hello(struct ceph_connection *con, void *p, void *end)
 		dout("%s con %p set my addr %s, as seen by peer %s\n",
 		     __func__, con, ceph_pr_addr(my_addr),
 		     ceph_pr_addr(&con->peer_addr));
+
+		/*
+		 * If we re-learned the address after a reset due to
+		 * persistent EADDRNOTAVAIL, log it and clear the
+		 * failure counter.
+ */ + if (atomic_read(&con->msgr->addr_notavail_count) > 0) { + pr_info("libceph: re-learned source address %s from monitor %s\n", + ceph_pr_addr(my_addr), + ceph_pr_addr(&con->peer_addr)); + atomic_set(&con->msgr->addr_notavail_count, 0); + } } else { dout("%s con %p my addr already set %s\n", __func__, con, ceph_pr_addr(my_addr)); --=20 2.53.0 From nobody Tue Apr 7 18:03:28 2026 Received: from mx0a-0064b401.pphosted.com (mx0a-0064b401.pphosted.com [205.220.166.238]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D6B7C358D3D; Thu, 12 Mar 2026 08:17:07 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=205.220.166.238 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773303429; cv=fail; b=qqKJrGNdmVOJgJray5MtZQqv3KyEciTya0ztXUNrNDV9Sa7waB78AnTvPFxEiTJ5bACuFbcv5cI2+aqES90ad4+02nf5toaI8iWU57L4rcoe37wh5EKJ5QJRoG0pOREcLBbndXj/BUfg3Ra8zlQu5BMZqfM0r2QFAsrmuZbvpfs= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773303429; c=relaxed/simple; bh=/X/rJ426oDns/1aY2eCPS7Znev/S+pGXnn6ekwEwACk=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: Content-Type:MIME-Version; b=Sl7o6epXiSrAM0KNnXuIUohO6lZ08f/jg/GagPTcnOyIdU0+nFRT0h9hm1LzJeajiomDTw7vq+bZDAa1TTkNYvkPuch3duuRKMTF4jCn7p7MwJ7lrvIAExh6ap68w9B7/PSuiUx+kBpGI53wp4zo4bey2MBI7QPYJc39H/VF25U= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=windriver.com; spf=pass smtp.mailfrom=windriver.com; dkim=pass (2048-bit key) header.d=windriver.com header.i=@windriver.com header.b=Gg3qiAXe; arc=fail smtp.client-ip=205.220.166.238 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=windriver.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=windriver.com Authentication-Results: smtp.subspace.kernel.org; 
dkim=pass (2048-bit key) header.d=windriver.com header.i=@windriver.com header.b="Gg3qiAXe" Received: from pps.filterd (m0250809.ppops.net [127.0.0.1]) by mx0a-0064b401.pphosted.com (8.18.1.11/8.18.1.11) with ESMTP id 62C58umW1757210; Thu, 12 Mar 2026 01:17:04 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=windriver.com; h=cc:content-transfer-encoding:content-type:date:from :in-reply-to:message-id:mime-version:references:subject:to; s= PPS06212021; bh=gWJLPplyKhQueZ/hNkK1BR3VQIvJxWNDSgrFyNlUUcY=; b= Gg3qiAXenH0/A8WQ3t11L/P6x9gat4YFBWbh4iIw9wbnhQpul6jC2NQyI0zplJDL OQz5g6/8ORBczp7BlnphMLAJqllKXpVJJ4AxHFiDshUDQ3VCPCqoQZtzcxkf2ioH GXhk8AMPuoI5Gj+us6DPKkwV/+BB01W4GqMfLS5Q9jkImIKYK50waZbNRXGX03/T XmZozGGW9hTELbSY+lqajEfvMCCngfr2zvpwBKxZbvLX4t9sLoh7CQlzm5evV+To vu47V/1booVNwk/XRuJpfKVk2p+EBLzndHktHnBvURBnhk/b6fweLsa99zGOkKzt mxdBks7JFlBOpzm+sweiXA== Received: from ch5pr02cu005.outbound.protection.outlook.com (mail-northcentralusazon11012046.outbound.protection.outlook.com [40.107.200.46]) by mx0a-0064b401.pphosted.com (PPS) with ESMTPS id 4cuh6t8enf-2 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT); Thu, 12 Mar 2026 01:17:04 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=M8CQ5HNiZhB642ZSteWPAdBXs+X2i/eUm9uULaQi9lO8k/vlEsrS42VB454S8XUhJ+qqlCxCzCwPEEYuH/6Nij00Hg4smlajE/tIZ4+Qu+DZ98MvJAIMY/BbRhnLiQJmSRwnTMe8XERKbWSiBX4nJSu+adlaBL3DofaWGiexLe4tsYqZNpEhmToBXzbT9lz5EFIUiQlYjP6oHKNVk3giZlM7McAlqeujj6EpT+IJdoopIx7MahYYf/t7FGekES/+6imBAOc8SEa0aqqz/Aq5eSZYYLL1iFdWi0BYFYqotjfUQA19ug4n9hJRt15U6feqxAgmjmzEFqqMfFmT6eEjNg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=gWJLPplyKhQueZ/hNkK1BR3VQIvJxWNDSgrFyNlUUcY=; 
b=GxMyuLJ1wzeNel0jD6F3olxXnsWAEeD1Np/SbOqLAvhINZTIs3R8oxxQfhec4E7UhBv7o/lIsg0et3D8hx3eP/BbE+Osm0FNa7uKCh6St2IYWhANGpz91IojQedlnD2yeUwtXTs2uUqsJwRskTT/xo4JJ1cQw6Z1RSaOvR2wTpoD9peF2Z0qNuPbSb/J7m9wVx/iQgLUUc00hlRF5zbFKiQlAiK9t0n/pF94gd9lUTPwQBKxt/NYt1GDqNaj2A+/oDP2/n65sDAmosdFHmC3hftQOOaZWPQnoqdFV8uUWn11X5F+OwnwELjV9TYyZrjon8c7T9qiKbONFoHvxGCuLw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=windriver.com; dmarc=pass action=none header.from=windriver.com; dkim=pass header.d=windriver.com; arc=none Received: from SJ2PR11MB7546.namprd11.prod.outlook.com (2603:10b6:a03:4cc::8) by DM4PR11MB6168.namprd11.prod.outlook.com (2603:10b6:8:ab::22) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.9723.6; Thu, 12 Mar 2026 08:17:02 +0000 Received: from SJ2PR11MB7546.namprd11.prod.outlook.com ([fe80::ca9b:dcf:8881:bced]) by SJ2PR11MB7546.namprd11.prod.outlook.com ([fe80::ca9b:dcf:8881:bced%5]) with mapi id 15.20.9700.010; Thu, 12 Mar 2026 08:17:02 +0000 From: "Ionut Nechita (Wind River)" To: ceph-devel@vger.kernel.org Cc: idryomov@gmail.com, xiubli@redhat.com, linux-kernel@vger.kernel.org, ionut_n2001@yahoo.com, Ionut Nechita Subject: [PATCH v1 12/13] libceph: force monitor reconnect on persistent EADDRNOTAVAIL Date: Thu, 12 Mar 2026 10:16:18 +0200 Message-ID: <20260312081619.40854-13-ionut.nechita@windriver.com> X-Mailer: git-send-email 2.53.0 In-Reply-To: <20260312081619.40854-1-ionut.nechita@windriver.com> References: <20260312081619.40854-1-ionut.nechita@windriver.com> Content-Transfer-Encoding: quoted-printable X-ClientProxiedBy: FR2P281CA0085.DEUP281.PROD.OUTLOOK.COM (2603:10a6:d10:9b::12) To SJ2PR11MB7546.namprd11.prod.outlook.com (2603:10b6:a03:4cc::8) Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: SJ2PR11MB7546:EE_|DM4PR11MB6168:EE_ 
X-MS-Office365-Filtering-Correlation-Id: c915df10-9cfd-40b6-f282-08de800fbd5f X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|376014|52116014|10070799003|1800799024|366016|18002099003|56012099003|22082099003; X-Microsoft-Antispam-Message-Info: Cwx5U0zrv5vsbYM7y1TpVyd94S3gm/uywYKXYwMg/QjUdGofiJNvr24/6E6ZZQEwNINc0y5wvhQHXl6RoT7PFlX0xuNdBFeL/dqCNJQ3gW59zMwFzVmaa8vO7S9EjsHjw0xT6+NxvQpMMBNe7xk5lf2AdslD2xoBnLajHZQvHt0WWWizrmAkl2wnkgVfSt1d6Rju5UhQ/SS3OV/CXBa+Wii+YgvVv4TzH6cKDSaGS1ToAI7rmBA9jszyFL2Y+ncf0QqRN1irUZ72IMyBTB3Vb8rFSnqrCSnY1yvjrtCQKSK5ELP8tHd5QhTeb4t07f2u3UArWG0moEliWd3fvh85yEWXMebmgS1eWr+5A4/grSGfOTRtx0EcRkMSBLDZ2Mkv1xWunfAYLcYvd17SbjVVi7MnRJYRpD9mt4onWhNGGoc0TDBT/SZZ90HOmXDWpc83c01srBplcZ14LKzMmxyiVeOKgkhEzUTayOJ3UpwjmdxR1YgVPAVoTfq42+XIP6l2WUGG92WqU3aFdqeeoL8EFDZmLeN+FZAVJ2zTkMR2aTh2K7AZlEKQJlO3SAPlk1NWPh1AjtaszSRwqy6RpCuokE0+Nc/0D0RS0B5qrxMGXb79Ql/0v0ViFoNC9X0n283ECQMmq/ykZeJpZ8NHKnhlMh/zYX4WDBNdz2ESch8PDU9MdyL1hz5DPGRe0ZbTbwooi8iwRZ8cLbOCcHGCKkhzpKOvjO9KtY16lpnIjM5VL58= X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:SJ2PR11MB7546.namprd11.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230040)(376014)(52116014)(10070799003)(1800799024)(366016)(18002099003)(56012099003)(22082099003);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 2 X-MS-Exchange-AntiSpam-MessageData-0: =?us-ascii?Q?L5hkf4mZEgm5nuYft+nWbPrMmzMCRKn3rK3WlfRvMEpsMx3rSJOEVga1O6Hk?= =?us-ascii?Q?s0qly/Sr/wj2wc1xp7qonzGFhO5WLE/r0rVcX+44cv+RRz3boFoI0nHMTiRv?= =?us-ascii?Q?9vCFBn4oRot1kckx76NVHBNR2ad6B5HuOYcf4Cmp2xD33l8xpR1g633vsZ9j?= =?us-ascii?Q?vUkM3N70FHO3iejXvrGFLvPaLBOwglt348IitdkkV5sLDWaD/aVAvYu+6M/g?= =?us-ascii?Q?F+03NyA8Uy5fFBQSZwBJcFamih4Gemy8sNvfwyJifF9W5pC6/fcaHhwlraXw?= =?us-ascii?Q?5rCM+GdKvhiDZGIAqRG2Ba8WY77Vdq8zFVblI9vL79JrRRcLd3wE/KiG5QHm?= =?us-ascii?Q?VbNhV8CGAWjZXwl4iBQIIP48CpFSAdNoeGmw6pT+9vk2icHr7xUv2wrvyn3x?= 
=?us-ascii?Q?RKjHqt2UkScwPDZBGEJzag4+CQAv7OXnX77ywtBLC2uQO030HDfyuWGQuOyc?= =?us-ascii?Q?z5fjDDBrpXJj3FXVaJJ3SxGOf+fNhzU2FhJkb9txl1hp93gHLzudYT2wQEiH?= =?us-ascii?Q?4qMlTUClM5LI5iaGugW8f9vd5KI0L9ZSxSNngSmpFQ4yLqtDFUVEXpuhkEgp?= =?us-ascii?Q?NjPnAK4s54cnYnb2+vMb03YqZPrW3AHhQy9dTRBT8XuR5DP5LB3ZLnQNAe4D?= =?us-ascii?Q?DlREi03pV5oNV6y+IzH5GnlogAFbA9aHe23iHd/vIn7ly2YUJVauxstQC4Lf?= =?us-ascii?Q?bxdNgqhEB/vAsF89A8TUULeB5Q81R/lKByLTF/pl8rVPB7LDrF/Ha79OatA/?= =?us-ascii?Q?ERQC8Bz7FgNP5U1VqiAZRk1+7kZn4HE+ThfsfLhnyJqwT1VQ6qhb4TKB+qtq?= =?us-ascii?Q?Da6PJKQOJHkhTHkOL2HWjKmS84qf7Ayq9RXkoaRbEk0SPrWOAYdHuJ3DiFk/?= =?us-ascii?Q?eHKMS29YS4ZP25+axbH3dOou0Fh+L5Nc0okYZeLQea8pEF3hLjH4UM8ZLptd?= =?us-ascii?Q?SAri8Dx3OK+6wzTA7cQBslSl0SgLDeR0r53uoJCALZe7JuZrcM63lUH2NktH?= =?us-ascii?Q?k6FGbEddW+c2w0H8qxKNPo+lzaJXwyNQBR9fQ4y0hn8iRx7gftAupq4jjHRj?= =?us-ascii?Q?yWnm7zNTdksHvtECk8STMMwM6nkfCVC0ddjRVGHlg1mlE7PZ96/Hqr3VYNjH?= =?us-ascii?Q?oT4exlZKFK22JTagP7NtMO17jJ5qD41CZ0ftsQM4oKJkbbE+SSIN4+C9q725?= =?us-ascii?Q?0ATbnfCO2pU1ouY9tz8Nqa4agxdDbyvlHZhNjz3wrmqk22k7eV3UiZ48gPE9?= =?us-ascii?Q?0w0+6Lhn7gHT6s1XbmjYFFAIisDYdy6rEVVtxnqGpE65TIcdrOUzhnu+EHIE?= =?us-ascii?Q?1XE1SqzKQb74MRBVsmxJzNXJsOXbZWq4UzPhrFBeW8iNz/KO30IGW8SVDDDd?= =?us-ascii?Q?Bo6R+V8I6v228M4l8DKfmye+MZxvDf2Zynp8aHwZ/wh2808gS+HPOv3iLsVv?= =?us-ascii?Q?z+cD+ToGkuUpxcQDFX0n32RwGU7+KQCy9HquFFo/CTKbU966bvEl8fj0a6WX?= =?us-ascii?Q?kJ06IvUbRck6DmvnUN18c934Emc3rT6vvHelJh+XyC2/4fYFswUzuF2iJU7K?= =?us-ascii?Q?cM3DiJchUZfoRGLMVfB7EOdFKvCwsrPSrxHVv5ZY9A7mXI9u9tNS72ScT4zf?= =?us-ascii?Q?yuT4+KjsRFN5PkXJqPC72KamLgADqP1IzPMPblTtdO/dLDW2Uiiss3cqrvfY?= =?us-ascii?Q?lSilHJBqp4i6HOHRCP4C3cCgcbqDhc3nQGcNEfxm5Juuj70RQFyXghamjZMU?= =?us-ascii?Q?jqQrPBxOXpR9s5JwIgK4SGRkYb2pEpviOE97kMCOiuJYMo65HyndAspixE3S?= X-MS-Exchange-AntiSpam-MessageData-1: sUdeFQxh/t8nNwyO1Tl7/3kuJB9x6QrCXFI= X-Exchange-RoutingPolicyChecked: 
riYZQlcduz9CA1ObTkvEGJao4QhCNJMKcnQ60pynU4OX7KjIzmyG0sP8OPFH9QnrTfrkzlQys2VfS0MA4xbpIIhLjK5zoStok+BzQOkvYCzp8VvQadtDwCEKEJHPsFjGpoDSDCN9blb1prTefvDjXju6PCHZSGT+bJz5z1hUhQei6Lfpry3JGd4OgHZDWqlpiw6r+dmWuIWin3KryNbYWEIEVmWPUfY3Rh6l3rukjSYJW+jeeTYeryhiqphzBBXKK0iT5GxxwwL6iNBjKUFn1AKQoQUdnjQED+e1uuJ81zsxfbGznk1VdFhgxnhXQwS6R+J3JOdjHn1Im0trNuUZMQ== X-OriginatorOrg: windriver.com X-MS-Exchange-CrossTenant-Network-Message-Id: c915df10-9cfd-40b6-f282-08de800fbd5f X-MS-Exchange-CrossTenant-AuthSource: SJ2PR11MB7546.namprd11.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 12 Mar 2026 08:17:02.5074 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 8ddb2873-a1ad-4a18-ae4e-4644631433be X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: BaMufS6Emv5e4tEk7DjmMm+6kd6YH4xc5MYNG61AdDbWFrUEiod2k8m18A5DB9D+4jpmiy9mdOByCfpo407zCsNCuXFUBhSLc8G2eeHkKmk= X-MS-Exchange-Transport-CrossTenantHeadersStamped: DM4PR11MB6168 X-Proofpoint-GUID: qzwuNZeljt7xilzdM3ORxsCWsbHQK459 X-Authority-Analysis: v=2.4 cv=Cf8FJbrl c=1 sm=1 tr=0 ts=69b27680 cx=c_pps a=KNS8ES/6Vao0xGfhhZwSfQ==:117 a=6eWqkTHjU83fiwn7nKZWdM+Sl24=:19 a=z/mQ4Ysz8XfWz/Q5cLBRGdckG28=:19 a=lCpzRmAYbLLaTzLvsPZ7Mbvzbb8=:19 a=xqWC_Br6kY4A:10 a=Yq5XynenixoA:10 a=VkNPw1HP01LnGYTKEx00:22 a=bi6dqmuHe4P4UrxVR6um:22 a=iKiJcTA2PjBS6x5JeXcw:22 a=t7CeM3EgAAAA:8 a=bG8hRs5GvZOM3gRM8dcA:9 a=FdTzh2GWekK77mhwV6Dw:22 X-Proofpoint-ORIG-GUID: qzwuNZeljt7xilzdM3ORxsCWsbHQK459 X-Proofpoint-Spam-Details-Enc: AW1haW4tMjYwMzEyMDA2NSBTYWx0ZWRfX60pURCQ5AiNN 7S4syLIGtVArNvaxmF/dCNOjCyyzwzRKOkWJ7izu/IghVrXZ7UNCOS+vHQ6KSesmJrODiEs1ZGm mi+bTFobPq2weuQRL8zCev9rKsiEmhrJGPpu7ULnRW9wc1loB9NVDtyWAY0hiUDyILiuZJO3Vcd UocbEHtgGRYf6uZiypW4et+TYoJ0U+xYqreDAxs7VBcUVsaOt/L3d/I6AHQW69vAY4Cv8der6Dt oGkrP8GKj/7a3TWxQcW3SKcc8CdHS1zIB8sPZnbuKkXyYZaB8Gm9fuHO7A2wSuRPj8ALuhayLAD 3IWDZwQaOozQ0CpIyAJBm5Cl+BvYtBS+k556nEeVLvbr/OVg+SYJ86vAvJKwFYy3T8kCGHt365w 
kLExp28I1K15z4dHrH/uc1jaG+rD9LsJE6oDxQop8xWM2q0mf7LTS+NVOS/eb42tXD2J7FQtVlR XkF+nTLMfe1ei52nf5g== X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1143,Hydra:6.1.51,FMLib:17.12.100.49 definitions=2026-03-11_02,2026-03-09_02,2025-10-01_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 clxscore=1015 adultscore=0 priorityscore=1501 lowpriorityscore=0 bulkscore=0 phishscore=0 impostorscore=0 malwarescore=0 suspectscore=0 spamscore=0 classifier=typeunknown authscore=0 authtc= authcc= route=outbound adjust=0 reason=mlx scancount=1 engine=8.22.0-2603050001 definitions=main-2603120065 Content-Type: text/plain; charset="utf-8" From: Ionut Nechita When the kernel CephFS client experiences persistent EADDRNOTAVAIL errors (e.g., because the original source address was a transient CNI pod address that no longer exists), the monitor client may get stuck retrying the same monitor indefinitely while in hunting mode. The mon_fault() handler currently ignores faults when already hunting, assuming delayed_work() will handle the retry. However, delayed_work() simply calls reopen_session() which may pick the same monitor again, creating an infinite loop of failed connection attempts to the same target. Additionally, when EADDRNOTAVAIL is persistent across all monitors, the hunt_mult backoff grows exponentially, causing increasingly long delays between reconnection attempts. Once the network issue resolves (e.g., route cache expires, new address becomes available), the client may take minutes to recover due to the accumulated backoff. Fix this by modifying mon_fault() to force a reopen_session() even when already hunting, if the messenger's addr_notavail_count indicates persistent address failures. This ensures the client tries a different monitor on each fault rather than waiting for the delayed_work timer. 
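The fault-handling decision described above can be sketched as a small userspace model. This is an illustration only, not the kernel code: struct model_monc, reopen_calls and model_mon_fault() are hypothetical names, and the call to reopen_session() is reduced to a counter. The model follows the diff in this patch, which forces a reopen while hunting whenever addr_notavail_count is non-zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the monitor-client state relevant to a fault. */
struct model_monc {
	bool hunting;		/* already searching for a monitor? */
	int reopen_calls;	/* counts simulated reopen_session() calls */
};

/*
 * Models the decision taken on a monitor connection fault.
 * Returns true when a session reopen is triggered immediately,
 * false when the retry is left to the delayed-work timer.
 */
static bool model_mon_fault(struct model_monc *monc, int addr_notavail_count)
{
	assert(monc != NULL);
	assert(addr_notavail_count >= 0);

	/* Already hunting and no address failures seen: wait for the timer. */
	if (monc->hunting && addr_notavail_count == 0)
		return false;

	/* Either the first fault, or EADDRNOTAVAIL has been observed. */
	monc->hunting = true;
	monc->reopen_calls++;
	return true;
}
```

In this model a fault while hunting is ignored when the counter is zero and triggers a reopen as soon as it is positive, which is the behavioural change this patch makes to mon_fault().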
Also reset hunt_mult to 1 when forcing a reconnect due to EADDRNOTAVAIL,
so that once the network issue resolves, the client recovers quickly
without accumulated backoff delays.

Also add a safety check in delayed_work(): if addr_notavail_count
exceeds the reset threshold and we're hunting, reset hunt_mult to
prevent accumulated backoff from delaying recovery.

Signed-off-by: Ionut Nechita
---
 net/ceph/mon_client.c | 39 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 38 insertions(+), 1 deletion(-)

diff --git a/net/ceph/mon_client.c b/net/ceph/mon_client.c
index ab66b599ac479..6e3d314fbf2b2 100644
--- a/net/ceph/mon_client.c
+++ b/net/ceph/mon_client.c
@@ -1084,6 +1084,7 @@ static void delayed_work(struct work_struct *work)
 {
 	struct ceph_mon_client *monc =
 		container_of(work, struct ceph_mon_client, delayed_work.work);
+	int notavail_count;
 
 	mutex_lock(&monc->mutex);
 	dout("%s mon%d\n", __func__, monc->cur_mon);
@@ -1094,6 +1095,22 @@ static void delayed_work(struct work_struct *work)
 	if (monc->hunting) {
 		dout("%s continuing hunt\n", __func__);
 		reopen_session(monc);
+
+		/*
+		 * If we're hunting and EADDRNOTAVAIL has been persistent,
+		 * reset the backoff multiplier so we recover quickly once
+		 * the network issue resolves. Without this, hunt_mult can
+		 * grow large during extended EADDRNOTAVAIL periods, causing
+		 * the client to take minutes to reconnect even after the
+		 * underlying issue is fixed.
+		 */
+		notavail_count =
+			atomic_read(&monc->client->msgr.addr_notavail_count);
+		if (notavail_count >= ADDRNOTAVAIL_RESET_THRESHOLD) {
+			dout("%s addr_notavail_count %d, resetting hunt_mult\n",
+			     __func__, notavail_count);
+			monc->hunt_mult = 1;
+		}
 	} else {
 		int is_auth = ceph_auth_is_authenticated(monc->auth);
 
@@ -1554,6 +1571,7 @@ static struct ceph_msg *mon_alloc_msg(struct ceph_connection *con,
 static void mon_fault(struct ceph_connection *con)
 {
 	struct ceph_mon_client *monc = con->private;
+	int notavail_count;
 
 	mutex_lock(&monc->mutex);
 	dout("%s mon%d\n", __func__, monc->cur_mon);
@@ -1563,7 +1581,26 @@ static void mon_fault(struct ceph_connection *con)
 			reopen_session(monc);
 			__schedule_delayed(monc);
 		} else {
-			dout("%s already hunting\n", __func__);
+			/*
+			 * Already hunting. Normally we just wait for
+			 * delayed_work() to retry. But if EADDRNOTAVAIL
+			 * is persistent, force an immediate reconnect to
+			 * a different monitor. This avoids getting stuck
+			 * retrying the same monitor that keeps failing.
+			 * Also reset hunt_mult so we don't accumulate
+			 * excessive backoff during the outage.
+			 */
+			notavail_count =
+				atomic_read(&con->msgr->addr_notavail_count);
+			if (notavail_count > 0) {
+				dout("%s addr_notavail %d, forcing reopen\n",
+				     __func__, notavail_count);
+				monc->hunt_mult = 1;
+				reopen_session(monc);
+				__schedule_delayed(monc);
+			} else {
+				dout("%s already hunting\n", __func__);
+			}
 		}
 	}
 	mutex_unlock(&monc->mutex);
-- 
2.53.0

From nobody Tue Apr 7 18:03:28 2026
From: "Ionut Nechita (Wind River)"
To: ceph-devel@vger.kernel.org
Cc: idryomov@gmail.com, xiubli@redhat.com, linux-kernel@vger.kernel.org, ionut_n2001@yahoo.com, Ionut Nechita
Subject: [PATCH v1 13/13] libceph: force host network namespace for kernel CephFS mounts
Date: Thu, 12 Mar 2026 10:16:19 +0200
Message-ID: <20260312081619.40854-14-ionut.nechita@windriver.com>
X-Mailer: git-send-email 2.53.0
In-Reply-To: <20260312081619.40854-1-ionut.nechita@windriver.com>
References: <20260312081619.40854-1-ionut.nechita@windriver.com>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Ionut Nechita

In containerized environments (e.g., Rook-Ceph CSI with
forcecephkernelclient=true), the mount() syscall for kernel CephFS may
be invoked from a pod's network namespace instead of the host namespace.
This happens despite the CSI node plugin (csi-cephfsplugin) running with
hostNetwork: true, due to race conditions during kubelet restart or pod
scheduling.

ceph_messenger_init() captures current->nsproxy->net_ns at mount time
and uses it for all subsequent socket operations. When a pod NS is
captured, all kernel ceph sockets (mon, mds, osd) are created in that
namespace, which typically lacks routes to the Ceph monitors (e.g.,
fd04:: ClusterIP addresses). This causes permanent EADDRNOTAVAIL (-99)
on every connection attempt at ip6_dst_lookup_flow(), with no
possibility of recovery short of force-unmount and remount from the
correct namespace.

Root cause confirmed via kprobe tracing on ip6_dst_lookup_flow: the net
pointer passed to the routing lookup was the pod's net_ns
(0xff367a0125dd5780) instead of init_net (0xffffffffbda76940). The pod
NS had no route for fd04::/64 (monitor ClusterIP range), while userspace
python connect() from the same host succeeded because it ran in host NS.

Fix this by always using init_net (the host network namespace) in
ceph_messenger_init().
The kernel CephFS client inherently requires host-level network
access to reach Ceph monitors, OSDs, and MDS daemons. Using the caller's
namespace was inherited from generic socket patterns but is incorrect
for a kernel filesystem client that must survive beyond the lifetime of
the mounting process and its network namespace.

A warning is logged when a mount from a non-init namespace is detected,
to aid debugging.

Observed in production (kernel 6.12.0-1-rt-amd64, Ceph Reef 18.2.5,
IPv6-only cluster, ceph-csi v3.13.1):
- Fresh boot of compute-0, ceph-csi mounts CephFS via kernel
- All monitor connections fail with EADDRNOTAVAIL immediately
- kprobe confirms wrong net_ns in ip6_dst_lookup_flow
- Workaround: umount -l + systemctl restart kubelet
- After restart: mount captures host NS, works immediately

Signed-off-by: Ionut Nechita
---
 net/ceph/messenger.c | 27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 8165e6a8fe092..a2e8ea6d339c9 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -1791,7 +1791,32 @@ void ceph_messenger_init(struct ceph_messenger *msgr,
 
 	atomic_set(&msgr->stopping, 0);
 	atomic_set(&msgr->addr_notavail_count, 0);
-	write_pnet(&msgr->net, get_net(current->nsproxy->net_ns));
+
+	/*
+	 * Use the initial (host) network namespace instead of the
+	 * caller's current namespace. In containerized environments
+	 * (e.g., Rook-Ceph CSI with forcecephkernelclient=true), the
+	 * mount() syscall may be invoked from a pod's network namespace
+	 * even when the CSI plugin runs with hostNetwork: true (race
+	 * conditions during kubelet restart, pod scheduling, etc.).
+	 *
+	 * If the pod NS is captured here, all kernel ceph sockets will
+	 * be created in that NS, which typically lacks routes to the
+	 * Ceph monitors (e.g., fd04:: ClusterIP addresses). This causes
+	 * permanent EADDRNOTAVAIL on every connection attempt with no
+	 * possibility of recovery short of force-unmount + remount.
+	 *
+	 * The kernel CephFS client always needs host-level network
+	 * access to reach Ceph monitors, OSDs, and MDS daemons, so
+	 * using init_net is the correct choice. The previous behavior
+	 * of capturing current->nsproxy->net_ns was inherited from
+	 * generic socket code but is wrong for a kernel filesystem
+	 * client that must survive beyond the lifetime of the mounting
+	 * process's network namespace.
+	 */
+	if (current->nsproxy->net_ns != &init_net)
+		pr_warn("libceph: mount from non-init network namespace detected, using host namespace instead\n");
+	write_pnet(&msgr->net, get_net(&init_net));
 
 	dout("%s %p\n", __func__, msgr);
 }
-- 
2.53.0
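
For readers following the namespace change in 13/13 outside a kernel tree, the decision it makes can be modelled in a few lines of user-space C. This is only a sketch under stated assumptions, not the kernel code: struct net_model, host_ns_model, struct messenger_model and messenger_init_model() are hypothetical stand-ins for struct net, init_net, struct ceph_messenger and ceph_messenger_init(), and reference counting (get_net) is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct net; only identity matters. */
struct net_model {
	const char *label;
};

/* Hypothetical stand-in for init_net, the host network namespace. */
static struct net_model host_ns_model = { "host" };

/* Hypothetical stand-in for the one messenger field the patch changes. */
struct messenger_model {
	const struct net_model *net;
};

/*
 * Model of the patched decision: the caller's namespace is ignored and the
 * host namespace is always recorded.  Returns true when the caller was not
 * in the host namespace, which is the case where the patch logs a warning.
 */
static bool messenger_init_model(struct messenger_model *msgr,
				 const struct net_model *caller_ns)
{
	assert(msgr != NULL);

	msgr->net = &host_ns_model;
	return caller_ns != &host_ns_model;
}
```

Calling messenger_init_model() with a pod namespace leaves msgr->net pointing at the host namespace and returns true; calling it with the host namespace returns false. The pre-patch behavior would correspond to storing caller_ns instead, which is how a pod namespace ended up pinned for the lifetime of the mount.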